Calculus Textbook: Spivak, Third Edition

CALCULUS Michael Sþimh CALCULUS Thini Edition Copynght © 1967, 1980, 1994 byMichaelSpivak All nghts reserved Libraryof Congnus CatalogCardNumber80-82517 Publish or Perish, Inc. 1572 West Gray, #377 Houston, Texas 77019 in the UnitedStatesof America Manufactured ISBN 0-914098-89-6 Dedicated to the Memory of Y. P. PREFACE FRANCIS BA00N PREFACE TO THE FIRST EDITION Every aspect of this book was influenced by the desire to present calculus not merely as a prelude to but as the first real encounter with mathematics. Since the foundations of analysis provided the arena in which modern modes of mathematical thinking developed, calculus ought to be the place in which to expect, rather than avoid, the strengthening of insight with logic. In addition to developing the students' intuition about the beautiful concepts of analysis, it is surely to persuade them that precision and rigor are neither deterrents to intuition, nor ends in themselves, but the natural medium in which to formulate and think about mathematical questions. equally important which, in a sense, the entire book This goal implies a view of mathematics well matter particular topics may be developed, the how attempts to defend. No of will realized only succeeds goals this book be if it as a whole. For this reason, it would be of little value merely to list the topics covered, or to mention pedagogical practices and other innovations. Even the cursory glance customarily bestowed on new calculus texts will probably tell more than any such extended advertisement, and teachers with strong feelings about particular aspects of calculus will know just where to look to see if this book fulfills their requirements. A few features do require explicit comment, however. Of the twenty-nine chapchapters are optional, and the three chapters comters in the book, two (starred) prising Part V have been included only for the benefit of those students who might of the real numbers. Moreover, the want to examine on their own a construction and contain optional material. appendices to Chapters 3 11 also The order of the remaining chapters is intentionally quite inflexible, since the purpose of the book is to present calculus as the evolution of one idea, not as a collection of Since the most exciting concepts of calculus do not appear until Part III, it should be pointed out that Parts I and II will probably require the entire book covers a one-year less time than their length suggests-although chapters covered the be meant at any uniform rate. A rather not to course, are natural dividing point does occur between Parts II and III, so it is possible to reach differentiation and integration even more quickly by treating Part II very briefly, perhaps returning later for a more detailed treatment. This arrangement of most calculus courses, but I feel corresponds to the traditional organization who have seen a that it will only diminish the value of the book for students small amount of calculus previously, and for bright students with a reasonable background. The problems have been designed with this particular audience in mind. They but not overly simple, exercises which develop basic range from straightforward, techniques and test understanding of concepts, to problems of considerable diffiand, comparable culty interest. There are about 625 problems in all. I hope, of Those which emphasize manipulations usually contain many example, numbered "topics." vii viii Preface with small Roman numerals, while small letters are used to label interrelated parts other in problems. Some indication of relative difficulty is provided by a system of starring and double starring, but there are so many criteria for judgingdifficulty, and so many hints have been provided, especially for harder problems, that this guide is not completely reliable. Many problems are so difficult, especially if the hints are not consulted, that the best of students will probably have to attempt only those which especially interest them; from the less difficult problems it should be easy to select a portion which will keep a good class busy, but not frustrated. The answer section contains solutions to about half the examples from an assortment of problems that should provided a good test of technical competence. A separate answer book contains the solutions of the other parts of these problems, and of all the other problems as well. Finally, there is a Suggested Reading list, to which the problems often refer, and a glossary of symbols. I am grateful for the opportunity to mention the many people to whom I owe my thanks. JaneBjorkgren performed prodigious feats of typing that compensated for Richard Serkey helped collect the material my fitful production of the manuscript. which provides historical sidelights in the problems, and Richard Weiss supplied the answers appearing in the back of the book. I am especially grateful to my friends Michael Freeman, JayGoldman, Anthony Phillips, and Robert Wells for the care with which they read, and the relentlessness with which they criticized, a preliminary version of the book. Needles to say, they are not responsible for the deficiencies which remain, especially since I sometimes rejected suggestions which would have made the book appear suitable for a larger group of students. I must express my admiration for the editors and staff of W. A. Benjamin, Inc., who were always eager to increase the appeal of the book, while recognizing the audience for which it was intended. The inadequacies which preliminary editions always involve were gallantly endured by a rugged group of freshmen in the honors mathematics course at Brandeis University during the academic year 1965-1966. About half of this course was devoted to algebra and topology, while the other half covered calculus, with the preliminary edition as the text. It is almost obligatory in such circumstances to report that the preliminary version was a gratifying success. This is always safeafter all, the class is unlikely to rise up in a body and protest publicly-but the students themselves, it seems to me, deserve the right to assign credit for the thoroughness with which they absorbed an impressive amount of mathematics. I am content to hope that some other students will be able to use the book to such good purpose, and with such enthusiasm. Waltham,Massachusetts February1967 MIC HAE L SPIV AK PREFACE TO THE SECOND EDITION I have often been told that the title of this book should really be something like "An Introduction to Analysis," because the book is usually used in courses where the students have already learned the mechanical aspects of calculus-such courses are standard in Europe, and they are becoming inore common in the United States. After thirteen years it seems too late to change the title, but other changes, in addition to the correction of numerous misprints and mistakes, seemed called for. There are now separate Appendices for many topics that were previously slighted: polar coordinates, uniform continuity, parameterized curves, Riemann sums, and the use of integrals for evaluating lengths, volumes and surface areas. A few topics, like manipulations with power series, have been discussedmore thoroughly in the text, and there are also more problems on these topics, while other topics, like Newton's method and the trapezoid rule and Simpson's rule, have been developed in the problems. There are in all about 160 new problems, many of which are intermediate in difficulty between the few routine problems at the beginning of each chapter and the more difficult ones that occur later. Most of the new problems are the work of Ted Shiflin. Frederick Gordon pointed out several serious mistakes in the original problems, and supplied some non-trivial corrections, as well as the neat proof of Theorem 12-2, which took two Lemmas and two pages in the first edition. JosephLipman also told me of this proof, together with the similar trick for the proof of the last theorem in the Appendix to Chapter 11, which went unproved in the first edition. Roy O. Davies told me the trick for Problem 11-66, which previously was proved only in Problem 20-8 [21-8in the third edition], and Marina Ratner suggested several interesting problems, especially ones on uniform continuity and infinite series. To all these people go my thanks, and the hope that in the process of fashioning the contributions weren't too badly botched. edition their new MICHAEL ix SPIVAK PREFACE TO THE THIRD EDITION The most significant change in this third edition is the inclusion of a new (starred) Chapter 17 on planetary motion, in which calculus is employed for a substantial physics problem. In preparation for this, the old Appendix to Chapter 4 has been replaced by three Appendices: the first two cover vectors and conic sections, while polar coordinates are now deferred until the third Appendix, which also discusses the polar coordinate equations of the conic sections. Moreover, the Appendix to Chapter 12 has been extended to treat vector operations on vector-valued curves. of old material. Another large change is merely a rearrangement "The Cosmopolitan Integral," previously a second Appendix to Chapter 13, is now an ChapAppendix to the chapter on "Integration in Elementary Terms" (previously which chapter used those that problems from ter 18, now Chapter 19); moreover, the material from that Appendix now appear as problems in the newly placed Appendix. A few other changes and renumbering of Problems result from corrections, and elimination of incorrect problems. I was both startled and somewhat dismayed when I realized that after allowing 13 years to elapse between the first and second editions of the book, I have allowed another 14 years to elapse before this third edition. During this time I list of corrections, but no longer have seem to have accumulated a not-so-short the original communications, and therefore cannot properly thank the various individuals involved (whoby now have probably lost interest anyway). I have had time to make only a few changes to the Suggested Reading, which after all these years probably requires a complete revision; this will have to wait until the next edition, which I hope to make in a more timely fashion. MICHAEL x SPIVAK CONTENTS PREFACE PART I vi Prologne l Basic Properties of Numbers 3 2 Numbers of Various Sorts 21 PART II Foundations 3 Functions 39 Appendix.OrderedPairs 54 4 Graphs 56 Aþþendix1. Vectors75 Appendix2. The Cone Sections 80 Appendix3. Polar Coordinates 84 5 Limits 90 6 Continuous Functions 113 7 Three Hard Theorems 120 8 Least Upper Bounds 131 Appendix.UnformContinuity 142 PART III Derivatives and Integrals 9 Derivatives 147 10 Differentiation 166 11 Significance of the Derivative 185 Appendix.Convexipand Concavig 216 12 Inverse Functions 227 of Curves 241 Appendir.ParametricRepresentation Integrals 250 13 Appendix. RiemannSums 279 14 The Fundamental Theorem of Calculus 282 xi xii Contents 15 The Trigonometric Functions 300 x is Irrational 321 *17 Planetary Motion 327 18 The Logarithm and Exponential Functions 19 Integration in Elementary Terms 359 Integral 397 Appenda. The Cosmopolitan *16 PART IV Infinite Sequences 336 and Infinite Series 20 Approximation by Polynomial Functions 405 e is Transcendental 435 22 Infinite Sequences 445 23 Infinite Series 464 24 Uniform Convergence and Power Series 491 25 Complex Numbers 517 26 Complex Functions 532 27 Complex Power Series $46 *21 PART V Epilogue 28 Fields 571 29 Construction of the Real Numbers 578 30 Uniqueness of the Real Numbers 591 Suggested Reading 599 Answers(toselected problems)609 Glossaryof Symbols 655 Index 659 CALCULUS PART PROLOGUE To be consdaus ihat ym are ignorant is a grea¢ step to knowledge. HNþMB DISRHO BASIC PROPERTIES OF NUMBERS CHAPTER The title of this chapter expresses in a few words the mathematical knowledge to read this book. In fact, this short chapter is simply an explanation of what is meant by the and properties of numbers," all of which-addition multiplication, subtraction and division, solutions of equations and inequalities, already familiar to us. Neverfactoring and other algebraic manipulations-are required "basic Despite the familiarity of the subject, the theless, this chapter is not a review. survey we are about to undertake will probably seem quite novel; it does not aim to present an extended review of old material, but to condense this knowledge into a few simple and obvious properties of numbers. Some may even seem too obvious to mention, but a surprising number of diverse and important facts turn out to be consequences of the ones we shall emphasize. Of the twelve properties which we shall study in this chapter, the first nine are concerned with the fundamental operations of addition and multiplication. For this operation is performed on a pair the moment we consider only addition: of numbers-the sum a + b exists for any two given numbers a and b (which may possibly be the same number, of course). It might seem reasonable to regard addition as an operation which can be performed on several numbers at once, and consider the sum a1 + , an as a basic concept. It is + an of n numbers a1, of convenient, only, and consider addition of pairs numbers however, to more to · · ... define other sums in terms of sums of this type. For the sum of three numbers a, b, and c, this may be done in two different ways. One can first add b and c, obtaining b + c, and then add a to this number, obtaining a + (b + c); or one can first add a and b, and then add the sum a + b to c, obtaining (a + b) + c. Of course, the two compound sums obtained are equal, and this fact is the very first property we shall list: (Pl) If a, b, and c are any numbers, a+(b+c)= then (a+b)+c. of this property clearly renders a separate concept of the sum of three numbers superfluous; we simply agree that a + b + c denotes the number a + (b+ c) (a + b) + c. Addition of four numbers requires similar, though slightly The symbol a + b + c + d is defined to mean more involved, considerations. The statement = or or or or 3 (1) ((a + b) + c) + d, (2) (a+(b+c))+d, (3) a + ((b+ c) + d), (4) a + (b + (c + d)), (5) (a + b) + (c + d). 4 Prologue since these numbers are all equal. Fortunately, this This definition is unambiguous need not be listed separately, since it follows from the property P1 already fact listed. For example, we know from P1 that (a+ b) + and it follows immediately that c a + = c), (b + (1)and (2)are equal. The equality of (2)and (3) this P1, is a direct consequence may not be apparent at first sight role of b in P1, and d the role of c). The equalities (onemust let b + c play the (3) (4) (5) are also simple to prove. It is probably obvious that an appeal to P1 will also suffice to prove the equality of the 14 possible ways of summing five numbers, but it may not be so clear how we can reasonably arrange a proof that this is so without actually listing these 14 sums. although of = = Such a procedure is feasible, but would soon cease to be if we considered collections of six, seven, or more numbers; it would be totally inadequate to prove the equality of all possible sums of an arbitrary fmite collection of numbers ai, a.. This about would who the like to worry fact may be taken for granted, but for those approach is outlined in proof (andit is worth worrying about once) a reasonable Problem 24. Henceforth, we shall usually make a tacit appeal to the results of this problem and write sums ai + + as with a blithe disregard for the arrangement . - of parentheses. The number (P2) - . . , - 0 has one property so important that we list it next: If a is any number, then a+0=0+a=a. An important role is also played by 0 in the third property of our list: (P3) For every number a, there a+ is a number (-a) (-a) = such that -a +a = 0. Property P2 ought to represent a distinguishing characteristic of the number 0, it is comforting to note that we are already in a position to prove this. Indeed, if a number x satisfies and a+x=a for any one number a, then x = 0 (andconsequently this equation also holds for all numbers a). The proof of this assertion involves nothing more than subtracting a from both sides of the equation, in other words, adding to both sides; as the shows, all proof three properties P1-P3 must be used to justify following detailed this operation. ---a If then hence hence hence a+ x (-a) + (a + x) ((-a) + a) + x 0+ x x = a, = (-a) = 0; = 0; = 0. + a = 0; 1. BasicPropertiesof Numbers5 As we have just hinted, it is convenient to regard subtraction as an operation derived from addition: we consider a b to be an abbreviation for a + (-b). It is then possible to find the solution of certain simple equations by a series of steps justifiedby Pl, P2, or P3) similar to the ones just presented for the equation (each a. For example: a + x - = If then hence hence hence x+3=5, (x + 3) + (-3) x + (3 + (-3)) 5 + (-3); = 5 3 2; 2. = x+ 0 x - = = = 2; Naturally, such elaborate solutions are of interest only until you become convinced that they can always be supplied. In practice, it is usually just a waste of time to solve an equation by indicating so explicitly the reliance on properties Pl, P2, and P3 (orany of the further pmperties we shall list). Only one other property of addition remains to be listed. When considering the sums of three numbers a, b, and c, only two sums were mentioned: (a + b) + c and a + (b + c). Actually, several other arrangements are obtained if the order of a, b, and c is changed. That these sums are all equal depends on then (P4) If a and b are any numbers, a+b=b+a. The statement of P4 is meant to emphasize that although the operation of addition of pairs of numbers might conceivably depend on the order of the two numbers, in fact it does not. It is helpful to remember that not all operations are well example, this property: usually behaved. For subtraction have not does so might when passing ask b ¢ b a. In b does equal b a, and it we a a just is amusing to discoverhow powerless we are if we rely only on properties P1-P4 Algebra of the most elementary variety shows that to justifyour manipulations. when = = b b a only b. Nevertheless, it is impossible to derive this fact a a from properties P1-P4; it is instructive to examine the elementary algebra carefully and determine which step(s) cannot be justifiedby P1-P4. We will indeed be able to justifyall steps in detail when a few more properties are listed. Oddly enough, however, the crucial property involves multiplication. The basic properties of multiplication are fortunately so similar to those for adwill needed; dition that little comment be both the meaning and the consequences should be clear. (Asin elementary algebra, the product of a and b will be denoted by a b, or simply ab.) - - - - - - (P5) If a, b, and c are any numbers, then a (P6) - (b - c) = If a is any number, then a.1=1-a=a. (a - b) c. · - 6 Prologue Moreover, 1 ¢ 0. (The assertion that 1 ¢ 0 may seem a strange fact to list, but we have to list it, because there is no way it could possibly be proved on the basis of the other properties listed-these properties would all hold if there were only one number, namely, 0.) (P7) For every number a¯1 such that number ¢ 0, there is a a a·a-1=a -a=1. (PS) If a and b are any numbers, then a-b=b-a. One detail which deserves emphasis is the appearance of the condition a ¢ 0 in P7. This condition is quite necessary; since 0-b 0 for all numbers b, there is no number 0¯* satisfying 0 -0¯I 1. This restriction has an important consequence for division. Justas subtraction was defined in terms of addition, so division is defined in terms of multiplication: The symbol a/b means a b¯1. Since 0¯I is meaningless, a/0 is also meaninglessaivision by 0 is always undefined. Property P7 has two important consequences. If a b a c, it does not and necessarily follow that b = c; for if a then 0, both a b a c are 0, no matter what b and c are. However, if a ¢ 0, then b c; this can be deduced from P7 as follows: = = - = - = - - - = If a¯* then a b=a-canda¢0, (a b) a-1 (a 4 - = - - - -1 (a¯1 4 b hence hence hence - = - 1 b b (It may happen that a b = (a b) = · a then a hence hence hence (a¯I · 0, then either a = - -1 - · 1 c; c. - = It is also a consequence of P7 that if a b if . = · a) b - = 0 or b = 0. In fact, 0 and a ¢ 0, 0; 0; 0; = 1 b b = 0. = - 0 and b 0 are both true; this possibility is not excluded 0 or b 0"; in mathematics is always used in the a sense of or the other, or both.") This latter consequence of P7 is constantly used in the solution of equations. Suppose, for example, that a number x is known to satisfy when we say "either = = = "or" = "one (x Then it follows that either x - 1 = - 1) (x 2) - 0 or x - 2 = = 0. 0; hence x = 1 or x = 2. 1. Basic 14opertiesof Numbers 7 On the basis of the eight properties listed so far it is still possible to prove very little. Listing the next property, which combines the operations of addition and multiplication, will alter this situation drastically. (P9) If a, b, and c are any numbers, then a-(b+c)=a-b+a-c. (Notice that the equation (b + c) a b a + c a is also true, by P8.) = - - - As an example of the usefulness of P9 we will now determine just when a b - - b = a: If then (a hence hence Consequently a-b=b-a, b) + b -- a a + a a - (1 + 1) and therefore a a) + b b + (b = (b = b + b a; (b + b a) + a b (1 + 1), = = = - = a); - -- = - b + b. - b. 0 which we have A second use of P9 is the justificationof the assertion a 0 made, already and even used in a proof on page 6 (canyou find where?). This fact was not listed as one of the basic properties, even though no proof was offered when it was first mentioned. With P1-P8 alone a proof was not possible, since the number 0 appears only in P2 and P3, which concern addition, while the assertion With P9 the proof is simple, though perhaps in question involves multiplication. obvious: We have not - = a-0+a-0=a-(0+0) =a.0; -(a as we have already noted, sides) that a 0 0. - this immediately implies (byadding - 0) to both = A series of further consequences of P9 may help explain the somewhat mysterious rule that the product of two negative numbers is positive. To begin with, b). To we will establish the more easily acceptable assertion that (-a) b -(a = - - prove this, note that .b+a.b= -b [(-a)+a] (-a) = It follows immediately Now note that (byadding (-a) (-b) - + [-(a 0. -(a --(a - b)] b) to both sides) that = (-a) (-b) - (-a) (-a) 0 - = = + (-a) [(-b) + b] - 0. (-a) b - - b = - b). 8 Prologue Consequently, adding (a b) to both sides, we obtain - (-a) - (-b) = a b. - The fact that the product of two negative numbers is positive is thus a consequence of two of P1-P9. In other words, if we want P1 to P9 to be true, the rule fortheproduct negativenumbers isforced upon us. The various consequences of P9 examined so far, although interesting and important, do not really indicate the significance of P9; after all, we could have listed each of these properties separately. Actually, P9 is the justificationfor almost all algebraic manipulations. For example, although we have shown how to solve the equation (x 1)(x 2) - = - 0, we can hardly expect to be presented with an equation in this form. We are more likely to be confronted with the equation x2-3x+2=0. The x2 "factorization" - 3x + 2 (x 1) (x 2) -- · - (x = 1)(x - - 2) is really a triple use of P9: (x 2) + (-1) (x 2) = x = x x + x (-2) + (-1) x + x2 + x [(---2) + (-1)] + 2 = - - - - - - . (-1) (-2) - =x2-3x+2. A final illustration of the importance of P9 is the fact that this property is actually used every time one multiplies arabic numerals. For example, the calculation 13 ×24 26 is a concise arrangement for the following equations: 13-24= 13-(2-10+4) =13-2·10+13·4 =26-10+52. (Note that moving 26 to the left in the above calculation is the same as writing 26 10.) The multiplication 13 4 52 uses P9 also: - = - 13-4=(1-10+3)-4 =1-10-4+3-4 =4-10+12 =4-10+1-10+2 = (4 + 1) 10 + 2 - =5.10+2 = 52. 1. BasicPropertiesofXumbers9 The properties P1-P9 have descriptive names which are not essential to remember, but which are often convenient for reference. We will take this opportunity to list properties P1-P9 together and indicate the names by which they are commonly designated. (P1) (Associativelaw for addition) a (P2) (Existence of an additive identity} a + 0 (Existence of additive inverses) a + (b + + c) (a + b) + = 0+ a = = c. a. . (P3) (P4) (Commutative law for addition) (PS) (Associativelaw for multiplica- (-a) a + b a - a - (b = a 0. b + a. = · (-a) + = c) (a = b) c. - - tion) (P6) (Existence of a multiplicative identity) (P7) (Existence of multiplicative 1 1 a = a a¯1 · = a¯I - · 1 ¢ 0. a; - a = 1, for a ¢ 0. inverses) (PS) (Commutative law for multiplication) a - (P9) (Distributive law) a - b = (b + b a. - c) = a - b + a c. - The three basic properties of numbers which remain to be listed are concerned inequalities. Although inequalities occur rarely in elementary mathematics, they play a prominent role in calculus. The two notions of inequality, a b (a is greater than b), are intimately related: a a (thus1 < 3 and 3 > 1 are merely two ways of writing the same assertion). The numbers a satisfying a > 0 are called positive, while those numbers a satisfying a < 0 are called negative. While positivity can thus of <, possible the it is procedure: a < b can be be defined in terms to reverse defined to mean that b a is positive. In fact, it is convenient to consider the collection of all positive numbers, denoted by P, as the basic concept, and state all properties in terms of P: with - (P10) (Trichotomy law) For every number following holds: (i) (ii) (iii) (P11) (P12) a, one and only one of the 0, a is in the collection P, is in the collection P. a = -a (Closure under addition) If a and b are in P, then a + b is in P. (Closure under multiplication) If a and b are in P, then a b is in P. - 10 Prologue These three properties should be complemented a>b asb following definitions: a-bisinP; b>a; if if if if a>b abora=b; a<bora=b.* Note, in particular, that a > 0 if and only if a is in P. All the familiar facts about inequalities, however elementary they may seem, are of P1¾P12. For example, precisely one of the following holds: consequences (i) if a and b are any two numbers, then a-b=0, (ii) a (iii) - -(a b is in the collection P, b) b a is in the collection = - - P. Using the definitions just made, it follows that precisely one of the followingholds: (i) a = (ii) a > (iii)b > b, b, a. A slightly more interesting fact results from the following manipulations. b, so that b a is in P, then surely (b + c) (a + c) is in P; thus, if a a < - then a + c < - b + c. Similarly, suppose a b - and c - so c - < b and b < < If b, c. Then is in P, is b in P, a a = (c - b) + (b - a) is in P. This shows that if a < b and b < c, then a < c. (The two inequalities a 0. The only difficulty presented by the proof is the unraveling of definitions. is in P. The symbol a < 0 means, by definition, O > a, which means 0 a ab is in P. Thus is in P, and consequently, by P12, (-a)(-b) Similarly ab > 0. The fact that ab > 0 if a > 0, b > 0 and also if a < 0, b < 0 has one special consequence: a2 > 0 if a ¢ 0. Thus squares of nonzero numbers are always positive, and in particular we have proved a result which might have seemed sufficiently elementary to be included in our list of properties: 1 > 0 (since1 12). -a - = -b = = * There is one slightly perplexing feature of the symbols 1+153 1+152 > and 5. The statements by < in the first, and by = in the are both true, even though we know that 5 could be replaced second. This sort of thing is bound to occur when 5 is used with specific numbers; the usefulness of the symbol is revealed by a statement values of like Theorem 1-here equality holds for some and b, while inequality holds for other values. a 1. BasicPropertiesofNumbers11 which will play an The fact that > 0 if a < 0 is the basis of a concept extremely important role in this book. For any number a, we define the absolute value |al of a as follows: -a 0 a 5 0. a, a -a, Note that |a| is always positive, except when a 0. For example, we have | = - 3| = 3,|7|=7,]l+Ã-h|=1+Ã-Ë,and|1+Ã-EÕ|=EÕ-Ä-1. straightforward In general, the most approach to any problem involving absolute several treating cases separately, since absolute values are defined by cases to begin with. This approach may be used to prove the following very important fact about absolute values. values requires THEOREM 1 For all numbers a and b, we have |a+bl PROOF la|+ |b|. 5 We will consider 4 cases: a 2 0, a 2 0, a 5 0, a 5 0, (1) (2) (3) (4) In case (1)we 0; b i 0; b à 0; b 5 0. b also have a + by 0, and the theorem is obvious; |a+b|=a+b= |a|+|b|, so that in this case equality holds. In case (4)we have a + b 5 0, and again equality -(a+b)= (2),when a > holds: -a+(-b)= |a+b|= In case in fact, |a 0 and b < + |bl. 0, we must prove that |a+b|5a-b. This case may therefore be divided into two subcases. prove that If a + b 2 0, then we must a+bsa-b, -b, i.e., b5 -b which is certainly true since b 5 0 and hence a + b 5 0, we must prove that à 0. On the other hand, if -a-bsa-b, i.e., which -a y a, is certainly true since a 0 and hence 5 0. with that of Finally, note case (3)may be disposed no additional work, by applying case -a (2)with a and b interchanged. 12 Prologue consideration ofvarAlthough this method of treating absolute values (separate sometimes often the only approach available, there are ious cases) is simpler methods which may be used. In fact, it is possible to give a much shorter proof of Theorem 1; this proof is motivated by the observation that la| Ñ. = (Here, and throughout the book, denotes the positive square root of x; this is defined only when x > 0.) We may now observe that 2 2 + 2ab + b2 (|a + b|)2 a2 + 2|a| |b| + b2 5 |a|2 + 2|a| b + 1b|2 symbol _ _ - = . (|a[ + |b|)2. = From this we can conclude that |a + b| 5 |a| + |b| because x2 < y2 implies x < y, provided that x and y are both nonnegative; a proof of this fact is left to the reader (Problem 5). One final observation may be made about the theorem we have just proved: a close examination of either proof ofTered shows that la+b|= |a|+|b| if a and b have the same sign the two is 0, while (i.e.,are both positive or both negative), or if one of la+bl<lal+|bi if a and b are of opposite signs. We will conclude this chapter with a subtle point, neglected until now, whose inclusion is required in a conscientious survey of the properties of numbers. After stating property P9, we proved that a b b a implies a b. The proof began by establishing that - = - = a-(1+1)=b-(1+1), b. This result is obtained from the equation from which we concluded that a b (1 + 1) by dividing both sides by 1 + 1. Division by 0 should a (1 + 1) be avoided scrupulously, and it must therefore be admitted that the validity of the argument depends on knowing that 1+1 ¢ 0. Problem 25 is designed to convince you that this fact cannot possibly be proved from properties P1-P9 alone! Once Pl0, Pll, and P12 are available, however, the proof is very simple: We have already seen that 1 > 0; it follows that 1 + 1 > 0, and in particular 1 + 1 ¢ 0. This last demonstration has perhaps only strengthened your feeling that it is absurd to bother proving such obvious facts, but an honest assessment of our present situation will help justifyserious consideration of such details. In this chapter we have assumed that numbers are familiar objects, and that P1-Pl2 are merely explicit statements of obvious, well-known properties of numbers. It would be difficult, however, to justifythis assumption. Although one learns how to with" numbers in school, just what numbers are, remains rather vague. A great deal of this book is devoted to elucidating the concept of numbers, and by the end = - = - "work l. BasicPropertiesofNumbers 13 of the book we will have become quite well acquainted with them. But it will be necessary to work with numbers throughout the book. It is therefore reasonable numbers; we may still to admit frankly that we do not yet thoroughly understand whatever numbers that, they should certainly have in defined, finally way are say properties P1-P12. Most of this chapter has been an attempt to present convincing evidence that P1-Pl2 are indeed basic properties which we should assume in order to deduce other familiar properties of numbers. Some of the problems (which indicate the derivation of other facts about numbers from P1-P12) are offered as further evidence. It is still a crucial question whether P1-P12 actuallÿ account for all properties of numbers. As a matter of fact, we shall soon see that they do not. In the next chapter the deficiencies of properties Pl-P12 will become quite clear, but the proper means for correcting these deficiencies is not so easily discovered. The crucial additional basic property of numbers which we are seeking is profound and subtle, quite unlike P1-P12. The discovery of this crucial property will require all the work of Part II of this book. In the remainder of Part I we will begin to see why some additional property is required; in order to investigate this we will have to consider a little more carefully what we mean by "numbers." PROBLEMS I. Prove the following: (i) If ax x2 = a 2 for some number ¢ 0, a then x 1. = y)(X ‡ y). y2, then x y or x y)(x2 + xy + y2L x2 y* (x (iv) y)(In-1 n-2y +Xyn-2 n-I x" y" * (x (v) X + y2). (There is a particularly easy way to (vi) x3 + y3 (X # y)(X do this, using (iv),and it will show you how to find a factorization for x" + y" whenever n is odd.) (ii) (iii) If x2 _ -y. - = - = - - - = 2. = = - What is wrong with the following "proof"? X2 x2 (x + y)(x _ (ii) a - ac = -, b a - b bc y) = y(x x+y=y, 2y= y, + -- c d = if b, c ¢ 0. ad + bc bd , if b, d ¢ 0. 2 _ - Prove the following: (i) Xy, y2 2 3. = Let x = 1. -- y), = y. Then 14 Prologue a¯Ib¯1, if a, b ¢ 0. (To do this you must remember defining property of (ab)¯14 (iii) (ab) a b (iv) -- 4. c d ac if b, d db = a (v) (vi) - - = -- - --, ad c = bc If b, d ¢ 0, then a b b a Find all numbers ¢ 0. if b, e, d ¢ 0. -, d b if and only if ad = = bc. Also determine when for which x 4-x<3-2x. 2 x < 8 (ii) 5 x2 < (iii) 5 1) (x 3) (iv) (x (i) -- . -2. -- (v) > - - 0. (When is a product of two numbers positive?) x2-2x+2>0, (vi) x2+x+1>2. x2 x + 10 > 16. x2 + x + 1 > 0. (viii) (ix) (x ar) (x + 5) (x 3) (vii) -- -- - (x) (x 6)(x -- (xi) 22 < Ã) - > > 0. 0. 8. (xii)x + 32 < 4. 1 (xiii)x 1 + - 1 x 1 x (xiv)x + 1 > 0. > 0. - - 5. Prove the following: (i) If a If a d, then a c 0, then ac < bc. (v) If a bc. (vi) If a > 1, then > a. (vii) If0<a<1,thena2<a. (viii)If 0 < a 0 and a2 < b2, then a < b. (Use (ix),backwards.) (a) Prove that if 0 5 x < y, then x" < y", n 1, 2, 3, (b) Prove that if x < y and n is odd, then x" < y". (c) Prove that if x" y" and n is odd, then x y. < - = = (d) Prove that if x" the = .... = y" and n is even, then x --y. = y or x = 1. BasicPropertiesof Numbers 15 7. Prove that if 0 < a < b, then a<Ñ<a+b<b. 2 < (a + b)/2 holds for all a, b Notice that the inequality 5 of alization this fact occurs in Problem 2-22. *8. > 0. A gener- Although the basic properties of inequalities were stated in terms of the collection P of all positive numbers, and < was defined in terms of P, this procedure can be reversed. Suppose that P1&P12 are replaced by For any numbers a and b one, and only one, of the following holds: (i) a = b, (P'10) (ii) a b (iii) < b, < a. (P'11) For any numbers (P'12) For any numbers < a if a < b and b b, and c, if a < b, then < b and 0 a, b, and c, < c, then < c, then c. a, a+c<b+c. For any numbers ac < bc. (P'13) a, b, and c, if a Show that P1AP12 can then be deduced as theorems. 9. Express each of the following with at least one less pair of absolute value signs. (ii) |(|a+b| |a|- |b|)|. (iii) |(|a+b|+|c|-la+b+c|)|. (iv) |x2 24 + y2 - - (V) 10. I(lÃ Äl IE + - - Ë\)\. Express each of the following without when necessary. cases separately (i) |a+b|--|b|. (ii) (|x| 1)|. - (iii) |x! |x2 (iv) a |(a |a|)|. - -- 11. - Find all numbers x for which (i) |x (ii) |x (iii) - - 3| 3| = < 8. 8. x+4|<2. (iv) |x (v) |x - - 1| + |x 2| 1| + |x + 1| - > < l. 2. absolute value signs, treating various 16 Prologue (vi) lx-li+lx+li<1. (vii) [x 1 |x + (viii)fx-1|·|x+2 12. 1 - - 0. = =3. Prove the following: (i) |xy|= [x| |y|. - 1 (ii) 1 = - ixi x |x[¯I (iii) -= what is.) ]x| |y (iv) x (v) |x if x ¢ 0. (The best way to do this is to remember -, x -,ify¢0. y y| 5 - - |x|+ |y|. (Give a very short proof.) y [x y|. (A very short proof is possible, if you write things in -- y the right way.) (vi) [(|x| |yl)| 5 - (vii)|x + y + z| 5 y[. (Why does this follow immediately from (v)?) x xi + ly!+ ]zi. Indicate when equality holds, and prove - your statement. 13. of two numbers x and y is denoted by max(x, = max(3, 3) = 3 and max(-1, max(-4, of x and y is denoted by min(x, y). Prove that The maximum -4) max(-1, The 3) = minimum Thus y). -1) -1. = x+y+!y-x| max(x, y) = min(x, y) = , 2 x+y-|y-x| 2 Derive a formula for max(x, y, z) and min(x, y, z), using, for example max(x, 14. (a) Prove that |a = y, z) = max(x, max(y, z)). to become confused by too many for a > 0. Why is it then obvious for |-al. (The trick is not First prove the'statement 0?) a 5 5 a 5 b if and only if la 5 b. In particular, it follows (b) Prove that cases. -b -|al that 5 a 5 |a|. (c) Use this fact to give a new proof that |a + b| 5 |a + bl. *15. Prove that if x and y are not both 0, then x24y X4 X3 +X2 p 2>0, 2 3 4 Hint: Use Problem 1. *16. (a) Show that (x + y)2 (x + y)3 _ 2 2 X3 3 only when x only when x = = 0 or y 0 or y = = 0, 0 or x --y. = 1. BasicPropertiesof Numbers 17 (b) Using the fact that x2 show that + 2xy + y2 4x2 + 6xy + 4y2 > y)2 (x + = à 0, 0 unless x and y are both 0. (x + y)4 X4 4 y4 Hint: From the assumption (x + y)S derive the equation x3+2x2y+2xy2+y3 you 2 2 0, if xy ¢ 0. This implies that (x + y)3 (c) Use part (b)to find out when5 (d) Find out when (x + y)5 x5+ya should be able to = 5. You should now be able to make a good guess as to when (x + y)" 2x2 2(x the smallest possible value of square," i.e., write 2x2 3x + 4 = (b) Find the smallest possible value of (c) Find the smallest possible value of (a) Find 3x + 4. Hint: "Complete the 3/4)2 + ? x2 3x + 2y2 + 4y + 2. x2 + 4xy + 5y2 6y + 7. _ 18. b2 (a) Suppose that x" + y"; in Problem 11-57. the proof is contained 17. = = - -b + - - - -4x - 4c à 0. Show that the numbers /b2 - -b 4c 2 - ' /b2 - 4c 2 both satisfy the equation x2 + bx + c 0. Suppose that b2 4c < 0. Show that there are no numbers x satisfying (b) x2 0; in fact, x2 + bx + c > 0 for all x. Hint: Complete the + bx + c = - = square. Use (c) x2 this fact to give another 2 > (d) For which proof that if x and y are not both 0, then 0. numbers x2 + a is it true that axy + y2 > 0 whenever x and y are not both 0? x2 (e) Find the smallest possible value of + bx + c and of ax2 + bx + c, for a 19. > 0. The fact that a2 à 0 for all numbers a, elementary as it may seem, is nevertheless the fundamental idea upon which most important inequaliof all inequalities is the ties are ultimately based. The great-granddaddy Schwarzinequalig: x171 + X272 Il2 + X22 2 + 722 (A more general form occurs in Problem 2-21.) The three proofs of the Schwarz inequality outlined below have only one thing in common-their reliance on the fact that a2 à 0 for all a. that if xi Ayi and x2 ly2 for some number 1, then equality holds in the Schwarz inequality. Prove the same thing if yr 0. y2 Now suppose that yi and y2 are not both 0, and that there is no number (a) Prove = = = = 18 Prologue I such that xi 0 Ayr and x2 = xi)2 # 12(yi2 + y22) (lyt < = = (Ày2 - - Ay2. Then X2) X272) 2A(x1yi + - (112E X22L + Using Problem 18, complete the proof of the Schwarz inequality. (b) Prove the Schwarz inequality by using 2xy 5 x2+y2 (howis this derived?) with x x, = y , y, = ° »2 71 + first for i 1 and then for i 2. (c) Prove the Schwarz inequality by first proving that = = (x12 )(71 X2 72 ) (XIfl = +X272)2 + (X172 -X271)2 from each of these three proofs, that equality holds only when ly1 and 0 or when there is a number A such that xi y2 (d) Deduce, yi x2 = = = Ay2. - In our later work, three facts about inequalities will be crucial. Although proofs be supplied at the appropriate point in the text, a personal assault on these problems is infinitely more enlightening than a perusal of a completely worked-out proof. The statements of these propositions involve some weird numbers, but their basic message is very simple: if x is close enough to xo, and y is close enough to yo, then x+ y will be close to xo+ yo, and xy will be close to xoyo, and 1/y will be close which appears in these propositions is the fifth letter of the to 1/yo. The symbol alphabet and could just as well be replaced by a less intimidating Greek ("epsilon"), Roman letter; however, tradition has made the use of e almost sacrosanct in the will "s" contexts 20. to which these theorems apply. Prove that if |x xo| - < and |y (xo+ yo) - yo| < , then I(x + y) - I< e, I(x-y)-(zo-yo)l<E. *21. Prove that if |x-xo]<min then |xy -- xoyo| (The notation € ( ,1 2(|yo|+ 1) < and € ly-yo[< 2(\xo was defined in Problem 13, but the formula provided by the first inequality in the hypothesis that problem is irrelevant at the moment; just means that - , s. "min" |x + 1) xol 8 < 2(|yo|+ 1) and |x - xo| < 1; 1. Basic Propertiesof Numbers 19 at one point in the argument you will need the first inequality, and at another point you will need the second. One more word of advice: since the hypotheses only provide information about x xo and y yo, it is almost a foregone conclusion that the proof will depend upon writing xy xoyo in a yo.) xo and y way that involves x - - - - *22. - Prove that if yo ¢ 0 and . ly then y *23. yol - mm < ¢ 0 and -, [yo| elyo 2 1 1 y yo 2 , 2 Replace the question marks in the following statement ing e, xo, and yo so that the conclusion will be true: by expressions involv- If yo ¢ 0 and |y then y -- yol < ? and x xo y yo |x - xal < ? ¢ 0 and This problem is trivial in the sense that its solution followsfrom Problems 21 and 22 with almost no work at all (noticethat x/y = x 1/y). The crucial point is not to become confused; decide which of the two problems should be used first, and don't panic if your answer looks unlikely. - *24. This problem shows that the actual placement of parentheses in a sum is induction"; if you are not fairrelevant. The proofs involve miliar with such proofs, but still want to tackle this problem, it can be saved until after Chapter 2, where proofs by induction are explained. Let us agree, for definiteness, that ai + + a, will denote "mathematical ai + (a2 + (as + - - - + (an-2 ‡ (an-1 ‡ an))) - ). Thus ai + a2 + a3 denotes al + (a2 + as), and ai + a2 + as + a4 denotes ai + (a2+ (a3+ a4)), etc. (a) Prove that (a) ‡···#ak) ·-*ak+l #ak+l = al ‡ Hint: Use induction on k. (b) Prove that if n > k, then (ai+---+ag)+(ak+1 Hint: Use part (a)to "' n) 1 "" give a proof by induction on k. n· 20 Prolo.gue (c) Let s(ai, ..., ak) be some sum formed from s(ai,...,ak)=Ul ' ai, ak. Show that k• Hint: There must be two sums s'(ai,...,ai) that and s"(aisi,...,ak) such l)#S"(G¡gi,...,ak)- s(ai,...,ak)=S(Ul,---, 25. ..., "number" Suppose that we interpret to mean either 0 or 1, and + and be the operations defined by the following two tables. -01 +01 000 001 1 10 101 Check that properties P1-P9 all hold, even though I + 1 = 0. - to NUMBERS OF VARIOUS SORTS CHAPTER "number" In Chapter 1 we used the word very loosely, despite our concern with of numbers. will the basic properties It now be necessary to distinguish carefully various kinds of numbers. numbers" The simplest numbers are the "counting 1, 2, 3, .... The fundamental significance of this collection of numbers is emphasized by its numbers). A brief glance at P1-P12 will show that our N (fornatural of do not apply to N-for example, P2 and P3 do not basic properties make sense for N. From this point of view the system N has many deficiencies. Nevertheless, N is sufficiently important to deserve several comments before we consider larger collections of numbers. induction." The most basic property of N is the principle of Suppose P(x) means that the property P holds for the number x. Then the principle of mathematical induction states that P(x) is true for all natural numbers x symbol "numbers" "mathematical provided that (1)P(1) is true. (2)Whenever P(k) is true, P(k + 1) is true. asserts the truth of P(k+1) under the assumption that P(k) is true; this sufBces to ensure the truth of P(x) for all x, if condition (1)also holds. In fact, if P(1) is true, then it follows that P(2) is true (byusing (2)in the special case k 1). Now, since P(2) is true it follows that P(3) is true (using(2)in the special case k 2). It is clear that each number will eventually be reached by a series of steps of this sort, so that P(k) is true for all numbers k. induction envisions A favorite illustration of the reasoning behind mathematical of people, an infinite line Note that condition (2)merely = = person number 1, person number 2, person number 3, ... . If each person has been instructed to tell any secret he hears to the person behind (theone with the next largest number) and a secret is told to person number 1, then clearly every person will eventually learn the secret. If P(x) is the assertion that person number x will learn the secret, then the instructions given (totell all secrets learned to the next person) assures that condition (2)is true, and telling the secret to person number 1 makes (1)true. The following example is a less him induction. There is a useful and striking formula which expresses the sum of the first n numbers in a simple way: facetious use of mathematical 21 22 Prologue n(n+1) 1+---+n= . 2 To prove this formula, note first that it is clearly true for n for some natural number k we have 1+---+k k(k + 1) 2 = = 1. Now assume that . Then k(k + 1) +k+1 2 1+---+k+(k+1)= k(k+1)+2k+2 2 k2+3k+2 ¯ 2 = (k + 1)(k+ 2) ' 2 so the formula is also true for k + 1. By the principle of induction this proves the formula for all natural numbers n. This particular example illustrates a phethat frequently occurs, especially in connection with formulas like the nomenon one just proved. Although the proof by induction is often quite straightforward, the method by which the formula was discovered remains a mystery. Problems 5 and 6 indicate how some formulas of this type may be derived. The principle of mathematical induction may be formulated in an equivalent of a number, way without speaking of a term which is sufficiently discussion. A more precise formulation vague to be eschewed in a mathematical mathematical collection term) of states that if A is any synonymous (or "properties" "set"-a natural numbers and (1) (2) 1 is in A, k + 1 is in A whenever then A is the set of all natural numbers. adequately replaces the less formal one set A of natural numbers x which satisfy of natural numbers n for which it is true 1+---+n= k is in A, It should be clear that this formulation given previously-we just consider the P (x). For example, suppose A is the set that n(n + 1) . 2 Our previous proof of this formula showed that A contains 1, and that k + 1 is in A, if k is. It follows that A is the set of all natural numbers, i.e., that the formula holds for all natural numbers n. There is yet another rigorous formulation of the principle of mathematical induction, which looks quite different. If A is any collection of natural numbers, it 2. Nurnbers of'VariousSorts 23 is tempting to say that A must have a smallest member. Actually, this statement can fail to be true in a rather subtle way. A particularly important set of natural numbers is the collection A that contains no natural numbers at all, the collection" or set,"* denoted by Ø. The null set 0 is a collection of natural numbers that has no smallest member-in fact, it has no members at all. This is the only possible exception, however; if A is a nonnull set of natural numbers, then A has a least member. This obvious" statement, known as the principle," can be proved from the principle of induction as follows. Suppose that the set A has no least member. Let B be the set of natural numbers if 1 were in A, then n such that 1, n are all not in A. Clearly 1 is in B (because A would have 1 as smallest member). Moreover, if 1, k are not in A, surely would smallest member not of A), so 1, k + 1 is k+ 1 in A (otherwise be the k + 1 are all not in A. This shows that if k is in B, then k + 1 is in B. It follows that every number n is in B, i.e., the numbers 1, n are not in A for any natural number n. Thus A which completes the Ø, proof. It is also possible to prove the principle of induction from the well-ordering principle (Problem 10). Either principle may be considered as a basic assumption "empty "null "intuitively "well-ordering . . . , ..., . . . . . . , , = about the natural numbers. There is still another form of induction which should be mentioned. It sometimes happens that in order to prove P(k + 1) we must assume not only P(k), but also P(l) for all natural numbers I < k. In this case we rely on the "principle of complete induction": If A is a set of natural numbers and (1) (2) 1 is in A, k + 1 is in A if 1, . . . , k are in A, then A is the set of all natural numbers. Although the principle of complete induction may appear much stronger than the ordinary principle of induction, it is actually a consequence of that principle. reader, with of the this The proof fact is left to a hint (Problem 11). Applications will be found in Problems 7, 17, 20 and 22. definitions." For example, Closely related to proofs by induction are number the n! (read factorial") is defined as the product of all the natural numbers less than or equal to n: "recursive "n n!=1-2-...-(n-1)-n. This can be expressed more precisely as follows: (1) 1! (2) n!=n-(n-1)!. = 1 This form of the definition exhibits the relationship between n! and (n - 1)! in an *Although it may not strike you as a collection, in the ordinary sense of the word, the null set arises quite naturally in many contexts. We frequently consider the set A, consisting of all x satisfying some property P; often we have no guarantee that P is satisfied by may number, so that A might be 6-in fact often one proves that P is always false by showing that A = 0. 24 Prologue explicit way that is ideally suited for proofs by induction. Problem 23 reviews a which definition familiar to you, may be expressed more succinctly as a recursive definition;as this problem shows, the recursive definitionis really necessary for a rigorous proof of some of the basic properties of the definition. already One definition which may not which we will constantly be using. be familiar involves some convenient notation Instead of writing al + we will usually +as, · employ the Greek letter E (capitalsigma, for "sum") and write a¿. i=1 In other words, a¿ denotes the sum of the numbers obtained by letting i=1 i = 1, 2, n. ..., Thus " i = 1+2+---+n n(n + = ' 2 i=1 Notice that the letter i really has nothing 1) to do with the number denoted by i, i=l and can be replaced by any convenient symbol " = ' = n = i (i + 1) 2 j=1 ' j(j+1) ° 2 n=1 To define ' 2 Í of course!): n(n + 1) . 21 j=1 (exceptn, a¿ precisely really requires a recursive definition: i=l 1 (1) i a¿ = a¿ = ai, i=1 n-1 n (2) i=1 L a¿ + a.. i=l austerity would insist too strongly on such But only purveyors of mathematical precision. In practice, all sorts of modifications of this symbolism are used, and no one ever considers it necessary to add any words of explanation. The symbol of VmiousSorts 25 2. Numbers i=1 i¢4 for example, is an obvious way of writing ai + a2 + as + as + a6 · n. or more precisely, 3 n La¿+ i=I The deficiencies of the natural numbers of this chapter may be partially remedied integers ---2, a . i=5 which we discovered at the beginning by extending this system to the set of -1, 0, 1, 2, ..., .... This set is denoted by Z (from German "Zahl," number). Of properties Pl-Pl2, P7 fails for Z. A still larger system of numbers is obtained by taking quotients m/n of integers and the set of all (withn 0). These numbers are called rational numbers, rational numbers is denoted by Q (for In this system of numbers all of P1-Pl2 are true. It is tempting to conclude that the of numbers," which we studied in some detail in Chapter 1, refer to just one set of numbers, namely, Q. There is, however, a still larger collection of numbers to which properties P1-Pl2 apply-the denoted by R. The real numbers set of all real numbers, only rational the numbers, but other numbers as well (theirrational include not numbers) which can be represented by infinite decimals; x and å are both examples of irrational numbers. The proof that x is irrational is not easy-we shall devote all of Chapter 16 of Part III to a proof of this fact. The irrationality of Å, on the other hand, is quite simple, and was known to the Greeks. (Since the Pythagorean theorem shows that an isosceles right triangle, with sides of length 1, has a hypotenuse of length Å, it is not surprising that the Greeks should have investigated this question.) The proof depends on a few observations about the natural numbers. Every natural number n can be written either in the form 2k for some integer k, or else in the form 2k + 1 for some integer k (this fact has a simple proof by induction (Problem 8)). Those natural numbers of the form 2k are called even; those of the form 2k + 1 are called odd. Note that even numbers have even squares, and odd numbers have odd squares: only -y "quotients"). "properties "obvious" (2k)2 4k2 = (2k + 1)2 = 2 (2kl), 4k2 + 4k + 1 2 (2k + 2k) + 1. = - = - In particular it follows that the converse must also hold: if n2 is even, then n is even; if n2 is odd, then n is odd. The proof that Ã is irrational is now quite simple. Suppose that Ã were rational; that is, suppose there were natural numbers p 26 Prologue and q such that all common divisors We can assume that p and q have no common divisor(since could be divided out to begin with). Now we have p2 2q2. = This shows that p2 is even, and consequently some natural number k. Then p2 = 4k2 __ p must be even; that is, p = 2k for 2 so 2k2 = q2. This shows that q2 is even, and consequently that q is even. Thus both p and q the fact that p and q have no common divisor. This are even, contradicting contradiction completes the proof. It is important to understand precisely what this proof shows. We have demonnumber x such that x2 strated that there is no rational 2. This assertion is often expressed more briefly by saying that Ã is irrational. Note, however, that the irrational) use of the symbol å implies the existence of some number (necessarily whose square is 2. We have not proved that such a number exists and we can asfor us. Any proof at this stage sert confidently that, at present, a proof is impossible would have to be based on P1-P12 (theonly properties of R we have mentioned); since P1-P12 are also true for Q the exact same argument would show that there rational number whose square is 2, and this we know is false. (Note that the is a proof that there is no rational number whose reverse argument will not workeur used show that there is no real number whose square is 2, square is 2 cannot be to because our proof used not only P1-P12 but also a special property of Q, the fact that every number in Q can be written p/q for integers p and q.) This particular deficiencyin our list of properties of the real numbers could, of course, be corrected by adding a new property which asserts the existence of square roots of positive numbers. Resorting to such a measure is, however, neither aesthetically pleasing nor mathematically satisfactory; we would still not know that every number has an nth root if n is odd, and that every positive number has an nth root if n is even. Even if we assumed this, we could not prove the existence of x5 + + 1 0 x a number x satisfying (eventhough there does happen to be one), of nth since we do not know how to write the solution of the equation in ter solution written that the this in form). And, roots (infact, it is known cannot be of course, we certainly do not wish to assume that all equations have solutions, since this is false (noreal number x satisfies x2 + 1 0, for example). In fact, this direction of investigation is not a fruitful one. The most useful hints about the property distinguishing R from Q, the most compelling evidence for the necessity of elucidating this property, do not come from the study of numbers alone. In order to study the properties of the real numbers in a more profound way, we = = = 2. Numbersof VmiousSorts 27 must study more than the real numbers. foundations of calculus, in particular the At this point we must begin with the fundamental concept on which calculus is based-fimetions. PROBLEMS 1. Prove the following formulas by induction. 2. n(n+1)(2n+1) 12+--·+n (i) (ii) = 6 (1+---+n)2. 13+--·+n3= Find a formula for (2i-1)=1+3+5+---+(2n-1). (i) i-1 (2i-1)2--12+32+52+---+(2n-1)2 (ii) i=1 Hint: What do these expressions have to do with 1 + 2 + 3 + 12 + 22 + 32 + + (2n)2? - 3. If 0 sk )( - 5 n, the k!(n = - - + 2n and - "binomial coefficient" is defined by n(n-1)---(n-k+l) n! -4 , k! k)! -- - if k 0, n 1. (This becomes a special case of the first formula if we define 0! 1.) = = (a) Prove that n+1 kl (The proof does not require an induction argument.) This relation gives rise to the following configuration, known as "Pascal's triangle"-a number not on one of the sides is the sum of the two numbers above is the (k + 1)st number it; the binomial coefficient in the (n + 1)st row. 1 1 1 1 2 1 1 1 3 3 14641 1 10 10 5 5 1 28 Prologue (b) Notice that all the numbers in Pascal's triangle are natural is always a natural part (a)to prove by induction that by induction will, in a sense, be summed triangle.) Pascal's entire proof another (c) Give , (Your up in a glance number by showing by that n. (d) Prove the natural "binomial If a and b are any numbers theorem": a"¯Ib+ a"¯2b2+···+ ab"¯1+b" 1 n (e) Prove that (i) + = (-1) (ii) + - + + (iv) + + + l even / / 2". = - (iii) - - - - - - - ± = 0. 2"¯1 = - - = 2"¯ / (a) Prove that CX, n+m k Hint: Apply the binomial theorem to (1 + x)"(1 + x)m (b) Prove and n number, then (a+b)"=a"+ 4. number. Use of sets of exactly k integers each chosen from 1, is the number ... is a natural proof that numbers. that is a 2. Numbersof VanousSorts 29 5. on n that (a) Prove by induction 1-r"+1 l+r+r2+---+r"= 1 r --- 1, evaluating the sum certainly presents no problem). if r ¢ 1 (ifr multiplying this equation result by setting S 1+r +-· (b) Derive this by r, and solving the two equations for S. = -+r", = 6. The formula for 12 + formula - - + n2 may be derived as follows. We begin with the - (k+1)3-k3=3k2+3k+1. Writing this formula for k = 1, ... , n and adding, we obtain 23-13=3-12+3-1+1 33-23=3.22+3-2+1 (n+1)3-n3=3-n2+3-n+1 =3[12+-·-+n2 -1 (n+1) k2 if we already know Thus we can find (whichcould k have been k=1 k=1 found in a similar way). Use this method to fmd (i) 13 4 14 4 (ii) 1 (iii) 1.2 -- (iv) *7. . . . . . 3 . 4 1 1 + + n(n+1) 2-3 2n+1 5 3 + + n2(n + 12 22 22 32 + Î)2 Use the + - -- - · . · method · · - - - of Problem kP 6 to show that can always be written i=i the form n p+1 + An? + Bn*¯' + Cn*¯2 + p + 1 (The first 10 such expressions are - - · . in 30 Prologue k= + n n k=I +n2+n k2=n k=1 k3 _ 4 3 2 5 4 3 k=1 k4 k-I ks = n6 + k6 = n' + n6 # k' = n° + ni + k" = n° + n° + k = 5 n4 + - n2 k=1 n5 3 _ n k=1 n6 - 2 n4 k=1 n' - n° + n 3 - n k-1 nl°+ n° + (n" ni + þ© k-1 klo _ 11 + k=1 - - 4 n6 In' + NS - - n2 n' -I- gn. and that after Notice that the coefficients in the second column are always of with coefficients column the third the powers decrease by 2 until n nonzero n2 or n is reached. The coefEcients in all but the first two columns seem to rather haphazard, but there actually is some sort of pattern; finding it may be be regarded as a super-perspicacity test. See Problem 27-17 for the complete (, story) 8. 9. Prove that every natural number is either even or odd. Prove that if a set A of natural numbers contains no and contains it contains k, then A contains all natural numbers > no. k+ 1 whenever 10. Prove the principle of mathematical induction from the well-ordering principle. 11. Prove the principle of complete induction from the ordinary principle of induction. Hint: If A contains 1 and A contains n + 1 whenever it contains k are all in A. 1, n, consider the set B of all k such that 1, ..., 12. ..., is rational and b is irrational, is a + b necessarily irrational? What if a and b are both irrational? (b) If a is rational and b is irrational, is ab necessarily a4irrational? (Careful!) (c) Is there a number a such that a2 is irrational, but is rational? (a) If a 2. Numbersof VanousSorts 31 whose sum and product there two irrational numbers tional? (d) Are 13. (a) Prove that d, Ã, and Ã are irrational. Hint: To treat ß, for example, use the fact that every integer is of the form 3n or 3n + 1 or 3n + 2. Why doesn't this proof work for Ã? (b) Prove that 14. W are and + Ä Ä is irrational. is irrational. (b) Ä + (a) Prove that if x p + g = number, (b) Prove 16. irrationai. Prove that (a) Ä 15. are both ra- then xm = a + also that (p - where p and q are rational, and m b for some rational a and b. )m = a - bg. that if m and n are natural numbers 2n)2/(m + n)2 > 2; show, moreover, that (m+ (a) Prove (m+ 2n)2 (m+n)2 is a natural and m2/n2 < 2, then m2 -2 < 2---. n2 same results with all inequality signs reversed. then there is another rational that if m/n < (b) Prove the (c) Prove with m/n *17. m'/n' < < n, n. number m'/n' the natural number n is not is irrational whenever It seems likely that the square of another natural number. Although the method of Problem 13 may actually be used to treat any particular case, it is not clear in advance that it will always work, and a proof for the general case requires some extra information. A natural number p is called a prime number if it is impossible to write p ab for natural numbers a and b unless one of these is p, and the other 1; for convenience we also agree that 1 is not a prime number. The first few prime numbers are 2, 3, 5, 7, 11, 13, 17, 19. If n > 1 is not a with and ab, then b both < n; if either a or b is not a prime it prime, a n similarly; continuing in this way proves that we can write n can be factored product of primes. example, 28 4 7 2 27. For as a = = = - = - this argument into a rigorous proof by complete induction. (To would accept the informal argube sure, any reasonable mathematician ment, but this is partly because it would be obvious to her how to state (a) Turn it rigorously.) A fundamental theorem about integers, which we will not prove here, states that this factorization is unique, except for the order of the factors. Thus, for example, 28 can never be written as a product of primes one of which is 3, nor can it be written in a way that involves 2 only once (now you should appreciate why 1 is not allowed as a prime). 32 Prolqgue is irrational unless n Using this fact, prove that (b) number = m2 for som€ naturaÏ m. (c) Prove more generally that 6 (d) No discussion of prime numbers is irrational unless n = m*. should fail to allude to Euclid's beautiful proof that there are infmitely many of them. Prove that there cannot be only finitely many prime numbers pt, P2, P3, Pn by considering ---, pl ·P2·---·Pn+1. *18. (a) Prove that if x satisfies a,_ix"¯I x" + + - - + ao - 0, = for some integers a._i, , ao, then x is irrationalunless x is an integer. (Why is this a generalization of Problem 17?) (b) Prove that Ã E Ã is irrational ... - (c) Prove that Ã powers - W is irrational. + Hint: Start by working out the first 6 of this number. -1, 19. Prove Bernoulli's inequality: If h (1 + h)" Why is this trivial if h 20. > then > 1 + nh. > 0? The Fibonacci sequence al, a29 63, · - · is defined as follows: ai=1, az=1, a, This sequence, = a._i for n + a._2 which 3. > begins 1, 1, 2, 3, 5, 8, , was discovered by Fibonacci connection with a problem about rabbits. Fibonacci (circa1175-1250), in assumed that an initial pair of rabbits gave birth to one new pair of rabbits month, and that after two months each new pair behaved similarly. The per number a, of pairs born in the nth month is an-1 + an-2, because a pair of rabbits is born for each pair born the previous month, and moreover each pair born two months ago now gives birth to another pair. The number of is even a Fiinteresting results about this sequence is truly amazing-there Prove bonacci Association which publishes a journal, TheFibonacciQuarterly. .. that " (l+ß ¯ 2 . " 1-Ã 2 One way of deriving this astonishing formula is presented in Problem 24-15. 21. The Schwarz inequality (Problem 1-19) actually has a more general form: x, y¿ 5 i=l x, \ i=1 2 2 \ i=1 of VariousSorts 33 2. Numbers Give three proofs of this, analogous 22. to the three proofs in Problem 1-19. The result in Problem 1-7 has an important generalization: If ai, then the "arithmetic . .. , a. > 0, mean" A,=ai+---+a. n and "geometric mean" G, = Jai ...a. satisfy G, 5 A.. ai < A.. Then some at satisfies a¿ > A,; for convenience, A.. Let ãi A, and let ã2 ã1. Show that ai + a2 (a) Suppose that say a2 > = = äiã2 > -- aia2. this process enough times eventually prove that G, 5 place where it is a good exercise to provide a formal proof by induction, as well as an informal reason.) When does equality hold in the formula G, 5 An? Why do- s repeating A,? (This is another The reasoning in this proof is related to another interesting proof. fact that G, 5 A, when n 2, prove, by induction on k, that 2k G, 5 A. for n 2m > n. Apply part (b)to the 2m numbers general let For n, (c) a (b) Using the = = A.,..., al,...,a., A, to prove that G. 5 a.. 23. The following is a recursive definition of a": a1=a, an*1 = a" - a. Prove, by induction, that (Don't try to be fancy: an+m = a" am (a")"' = an. use either · induction on n or induction on m, not both at once.) 24. know properties P1 and P4 for the natural numbers, but that mentioned. the Then following can be used has never been multiplication: as a recursive definition of Suppose we multiplication 1·b=b, (a+1)-b=a-b+b. 34 Prologue Pove the following (inthe order suggested!): a · (b + c) = a-1=a, b a a b · 25. = - a - b+ a c - (useinduction (youjust finished on a), proving the case b = 1). In this chapter we began with the natural numbers and gradually built up to the real numbers. A completely rigorous discussion of this process requires a little book in itself (seePart V). No one has ever figured out how to get to the real numbers without going through this process, but if we do accept the real numbers as given, then the natural numbers can be dejinedas the real numbers of the form 1, 1+ 1, 1+ 1+ 1, etc. The whole point of this problem is to show that there is a rigorous mathematical way of saying "etc." (a) A set A of real numbers is called inductive (1) 1 is in A, (2) k + 1 is in A whenever if k is in A. Prove that (i) (ii) The (iii) The set of positive real numbers set of positive real numbers set of positive real numbers is inductive. j to is inductive. to 5 is not inductive. (iv) The real numbers which of then the and A B set inductive, C are (v) If and also inductive. B is are in both A will be called a natural number real number if n is in every inductive A n (b) 2 FIGURE R is inductive. unequal unequal Set. 1 (i) (ii) 26. Prove that 1 is a natural number. Prove that k + 1 is a natural number if k is a natural number. There is a puzzle consisting of three spindles, with n concentric rings of decreasing diameter stacked on the first (Figure 1). A ring at the top of a stack may be moved from one spindle to another spindle, provided that it is not placed on top of a smaller ring. For example, if the smallest ring is ring is moved to spindle 3, then moved to spindle 2 and the next-smallest the smallest ring may be moved to spindle 3 also, on top of the next-smallest. Prove that the entire stack of n rings can be moved onto spindle 3 in 2" 1 1 moves. moves, and that this cannot be done in fewer than 2" -- - *27. TradiUniversity B. Once boasted 17 tenured professors of mathematics. tion prescribed that at their weekly luncheon meeting, faithfully attended by all 17, any members who had discovered an error in their published work should make an announcement of this fact, and promptly resign. Such an anhad never actually been made, because no professor was aware nouncement of any errors in her or his work. This is not to say that no errors existed, however. In fact, over the years, in the work of every member of the department at least one error had been found, by some other member of the 2. Numbersof VariousSorts 35 This error had been mentioned to all other members of the department, but the actual author of the error had been kept ignorant of the fact, to forestall any resignations. One fateful year, the department was augmented by a visitor from another university, one Prof. X, who had come with hopes of being offered a permanent position at the end of the academic year. Naturally, he was apprised, by various members of the department, of the published errors which had been discovered. When the hoped-for appointment failed to materialize, Prof. X obtained his revenge at the last luncheon of the year. "I have enjoyed my visit here very much," he said, "but I feel that there is one thing that I have to tell you. At least one of you has published an incorrect result, which has been discovered by others in the department." What happened the next year? department. **28. After figuring out, or looking up, the answer to Problem 27, consider the following: Each member of the department already knew what Prof. X asserted, so how could his saying it change anything? PART FOUNDATIONS mado The statement is se frequently that the djerential calculus deals with continuens magnidade, and yet explanation of this antinuity is nowhere given; an even the mori rigorens expositions of the differmtialcalculus do not base their proofsuþen continuity but, with more er less consciousnessof the fact they either appeal to geometric notwns _geometry, those IN,g,gtsted by or deþenduþan theoremswhich are never es¢ablished in a purelyarithmetic manner. Among these,forexample, beleggsthe abow-mentioned theorem, or and a more careful invest(gation corwinced me that this theorem,or regarded any one equivalent to it, can be in some Rey as a sigicient basis forinfinitesimalanalysis. It then only ranained to discoverits true or(gin in the elementsof arithmetic and that at the same time to secure a real de(inition of the essence of continuity. I succeededNov. 2-( 1858, and a fewdays afinward I commamrated the results of my meditations to my dear riend f Durà.ge with whom I had a long and livelydiscassion. RICHARD DEDEK[ND FUNCTIONS CHAPTER Undoubtedly the most important concept in all of mathematics is that of a modern mathematics of function-in almost every branch functions turn out to be the central objects of investigation. It will therefore probably not surprise you to learn that the concept of a function is one of great generality. Perhaps it will be a relief to learn that, for the present, we will be able to restrict our attention to functions of a very special kind; even this small class of functions will exhibit sufficient variety to engage our attention for quite some time. We will not even begin with a proper definition. For the moment a provisional definition will enable us to discuss functions at length, and will illustrate the intuitive notion of functions, as understood by mathematicians. Later, we will consider and discuss the advantages of the modern mathematical definition. Let us therefore begin with the following: PILOVISIONAL DEFINITION A function is a rule which assigns, to each of certain real numbers, some other real number. The following examples of functions are meant to illustrate and amplify this definition, which, admittedly, Example1 The Example 2 requires some such clarification. rule which assigns to each number The rule which assigns to each number the square of that number. y the number y3+3y+5 y2 + 1 -1 Example3 The rule which assigns to each number c ¢ 1, the number c3+3c+5 c2 - 1 -17 Example4 the number The rule which assigns to each number x satisfying Example 5 The rule which assigns to each number irrational, and the number 1 if a is rational. Example 6 The rule which assigns to 2 the number to 17 the number 39 Ex 5 x/3 x2. 5, 36 --, a the number 0 if a is 40 Foundations 2 to to and to any y ¢ 2, 17, x2/17, or - 17 36 -- the number 28, the number 28, 36/x, the number 16 if y is of the form a + bÃ fora,binQ. Example7 The rule which assigns to each number t the number t3 + x. (This rule depends, of course, on what the number x is, so we are really describing infinitely many different functions, one for each number x.) Example8 The rule which assigns to each number z the number of 7's in the decimal expansion of z, if this number is finite, and if there are infinitely many expansion of 7's in the decimal z. -x One thing should be abundantly clear from these examples-a function is any rule that assigns numbers to certain other numbers, not just a rule which can be expressed by an algebraic formula, or even by one uniform condition which applies to every number; nor is it necessarily a rule which you, or anybody else, can actually apply in practice (noone knows, for example, what rule 8 associates to x). Moreover, the rule may neglect some numbers and it may not even be clear to which numbers the function applies to determine, for example, whether the function in Example 6 applies to x). The set of numbers to which a function does (try apply is called the domainof the function. Before saying anything else about functions we badly need some notation. Since throughout this book we shall frequently be talking about functions (indeed we shall anything need of about naming talk else) convenient hardly ever funcwe a way tions, and of referring to functions in general. The standard practice is to denote a function by a letter. For obvious reasons the letter f" is a favorite, thereby making and other obvious candidates, but any letter (orany reasonable symbol, for that matter) will do, not excluding although these letters and reserved numbers. usually then the number for indicating If is function, are f a " "h" "g" "x" which x" and "y", " associates to a number x is often called the value of is denoted by f (x)-this symbol is read f of f at x. Naturally, if we denote a function by x, other chosen must letter be to denote the number (a perfectly legitimate, some would though perverse, choice leading to the symbol x(f)). Note that the be symbol f (x) makes sense only for x in the domain of f; for other x the symbol f (x) is not defined. If the functions defined in Examples 1-8 are denoted by f, g, h, r, s, 0, ex, and y, then we can rewrite their definitions as follows: f "f," (1) f (x) = x2 for all x. y3+3y+5 (2) g(y) = (3) h(c) = y2 1 c3 + 3c + 5 1 c - for all y. -1. -j for all c 1, 3. Functions 41 (4) r(x) = (5) s(x) = x2 --17 5 x 5 x/3. for all x such that 0, x irrational 1, x rational. x=2 5, 36 17 x = 28, x = 28, x = 16, x¢2 ---, 2 (6) 0 (x) (7) ax (8) y(x) = (t) = = 36 t3 + x - 17,"-,or 17 for all numbers x ,andx=a+bÃfora,binQ. t. 7's appear in the decimal expansion of x infinitely many 7's appear in the decimal expansion of x. exactly n, I -x, n These definitions illustrate the common procedure adopted for defining a function f-indicating what f (x) is for every number x in the domain of f. (Notice that this is exactly the same as indicating f(a) for every number a, or f(b) for evb, etc.) In practice, certain abbreviations ery number are tolerated. Definition (1) could be written simple (1) f (x) x2 = "for all x" being understood. the qualifying phrase the only possible abbreviation is (4) It is usually understood r(x) be shortened -Î7 5 X $ (4) 3. X that a definition such as k(x)=-+ can x2, = Of course, for definition 1 1 x x-1 x¢0,1 , to k(x)=-+ 1 x 1 x-1 ; in other words, unless thedomainis explicitly restricted further, it is understood to consist of which makes all. all numbers thedefinition any sense at for should checking the following assertions about the funclittle have difficulty You tions defined above: f (x + 1) = f (x) + 2x + 1; g(x)=h(x)ifx3+3x+5=0; -17 r (x + 1) = r (x) + 2x + 1 if 5 xs - - 3 1; 42 Foundations s(x + y) = if y is rational; s(x) 36 (x2 0-=0-• 17 ' x ax(x)=x·[f(x)+1]; If the expression f(s(a)) looks unreasonable to you, then you are forgetting that s(a) is a number like any other number, so that f(s(a)) makes sense. As a matter of fact, f(s(a)) s(a) for all a. Why? Even more complicated expressions than after The expression a first exposure, no more difficult to unravet f(s(a)) are, = f (r(s(9(a3(7(g)))))), formidable as it appears, be evaluated quite easily with a little patience: may f (r(s(0(as(y(y)))))) = f(r(s(Ð(as(0))))) = f(r(s(0(3)))) = f(r(s(16))) f(r(l)) f(1) = = = 1. The first few problems at the end of this chapter give further practice manipulating this symbolism. is a rather special example of an extremely imporof functions, the polynomial functions. A function f is a polynomial tant class such that function if there are real numbers ao, , as The fimetion defined in (1) ... f(x)=anx"+a. ixn-i+···+a2x2‡aix+a0, foralÌx (whenf (x) is written in this form it is usually tacitly assumed that a, ¢ 0). The coefficient is called the degree of f; for highest power a nonzero 5x6 + 137x4 x has example, the polynomial function f defined by f(x) degree 6. The functions defined in (2)and (3)belong to a somewhat larger class of functions, the rational functions; these are the functions of the form p/q where p and q are polynomial functions (andq is not the function which is always 0). The rational functions are themselves quite special examples of an even larger class of functions, very thoroughly studied in calculus, which are simpler than many of the functions first mentioned in this chapter. The following are examples of this kind of function: of x with = x+x2+xsin2x (9) f (x) = (10) f (x) (11) f (x) = = x sin x + x sin(x2). sin(sin(x2g sin2 x - 3. Functions 43 (12) f (x) = sin2(sin(sin2(x + sin(x sin x) sin2 x2 (x x+smx By what criterion, you may feel impelled to ask, can such functions, especially a like (12),be considered simple? The answer is that they can be built simple functions using a few simple means of combining functions. from few up a In order to construct the functions (9)-(12) we need to start with the "sine function" I, for which I(x) function" sin, whose value sin(x) at x, and the often written simple sin x is x. The following are some of the important ways in which functions may be combined to produce new functions. If f and g are any two functions, we can define a new function f + g, called the sum of f and g, by the equation monstrosity "identity = (f + g)(x) g(x). f (x) + = Note that according to the conventions we have adopted, the domain of f + g for which f(x) + g(x)" makes sense, i.e., the set of all x in both domain f and domain g. If A and B are any two sets, then A nB (read"A intersect B" or intersection of A and B") denotes the set of x in both A and B; this notation allows us to write domain( + g) domain fn domain g. f " consists of all x "the = In a similar vein, we define the product f and g f - g and the quotient - -g)(x)= (orf/g) of g by (f f g(x) f(x) and f ( - g (x) = f (x) Moreover, if g is a function and c is a number, (c - g)(x) = . g(x) we define a new function c g by · c g(x). - This becomes a special case of the notation f g if we agree that the symbol c should also represent the function f defined by f(x) c; such a function, which value all numbers x, is called a constant has the same for function. - = The domain of f g is domain fn domain g, and the domain of c g is simply the domain of g. On the other hand, the domain of f/g is rather complicated-it may be written domain fn domaing n {x: g(x) ¢ 0}, the symbol {x: g(x) ¢ 0} denoting the set of numbers x such that g(x) ¢ 0. In general, {x : denotes is true. Thus {x: x3 + 3 < 11} denotes the set of the set of all x such that all numbers x such that x3 < 8, and consequently {x: x3 + 3 < 11) (x : x < 2). of symbols could well Either these have been written using y everywhere just as instead of x. Variations of this notation are common, but hardly require any discussion. Any one can guess that {x > 0 : x3 < 8} denotes the set of positive numbers whose cube is less than 8; it could be expressed more formally as {x : x3 < 8}. Incidentally, this set is equal to the set x > 0 and {x : 0 < x < 2}. One - - ...) " " ... = 44 Foundations is slightly less transparent, but very standard. contains just the four numbers 1, 2, 3, and 4; variation example, it The set {1,3, 2, 4}, for can also be denoted by {x:x=lorx=3orx=2orx=4}. Certain facts about the sum, product, and quotient of functions are obvious conFor example, sequences of facts about sums, products, and quotients of numbers. that it is very easy to prove f+(g+h). (f+g)+h= The proof is characteristic of almost every proof which demonstrates that two functions are equal-the two functions must be shown to have the same domain, and the same value at any number in the domain. For example, to prove that (f + g) + h f + (g + h), note that unraveling the definition of the two sides gives = [(f+g)+h](x)=(f+g)(x)+h(x) [f(x)+g(x)]+h(x) = and [f+(g+h)](x)= f(x)+(g+h)(x) f(x) + [g(x)+h(x)], = and the equality of + h(x)] is a fact about f(x) + g(x)] + h(x) and f (x)+ numbers. In this proof the equality of the two domains was not explicitly mentioned because this is obvious, as soon as we begin to write down these equations; [ the domain of [g(x) g) + h and of f + (g + h) is clearly domain h. We naturally write f + g + h for (f + g) + h as we did for numbers. (f + domain fn domain gn f + (g + h), precisely = It is just as easy to prove that (f g) h f (g h), and this function is denoted by f g h. The equations f + g g + f and f g g f should also present - - = - - - = - = - - difficulty. no Using the operations +, -, / we can now express the function f defined in (9) by f [+I-I+I-sin·sin = . I-sin+I-sm-sm . . It should be clear, however, that we cannot express function (10)this way. We require yet another way of combining functiqns. This combination, the composition of two functions, is by far the most important. If f and g are any two functions, we define a new function fo g, the cornposition of f and g, by (f the domain of " fo g" og)(x) = f(g(x)); fog is {x: x is in domain g and g(x) is in domain f}. The symbol composition of f f circle g." Compared to the phrase is often read " "the and g" this has the advantage of brevity, of course, but there is another advantage of far greater import: there is much less chance of confusing fog with go f, and 3. Functions 45 these must not be confused, since they are not usually equal; in fact, almost any f and g chosen at random will illustrate this point (tryf sin, for I I and g apprehensive about operation of the composition, example). Lest you become too let us hasten to point out that composition is associative: = og)oh= (f (andthe (10) f (11) f (12) f fo(goh) proof is a triviality); this function is denoted by fimetions (10),(11),(12)as write the sin o (I I), sin o sin o (I = = · fogo h. We can now - = (sin = - sin) o - sin I), o (sin - sin) o (I - [(sinsin) (I o - - I)]) - I + sin o (I sin) I + sm - . SIR O . One fact has probably already become clear. Although this method functions shortest "structure" reveals . of writing The very clearly, it is hardly short or convenient. sin(x2) for all = unfortunately such that x f f(x) sin(x2) for all x." The need for = such that their name for the function be seems to "the function f f (x) abbreviating this clumsy description has been clear for two hundred years, but no reasonable abbreviation has received universal acclaim. At present the strongest contender for this honor is something like sin(x2) -> x "x goes to sin(x2 m just "X Sin(X2)"), but it is hardly popular among of calculus textbooks. In this book we will tolerate a certain amount of "the ellipsis, and speak of function f (x) = sin(x2)." Even more popular is the "the quite drastic abbreviation: function sin(x2)." For the sake of precision we which, strictly speaking, confuses a number and will never use this description, (read Or arrOW writers a function, but it is so convenient that you will probably end up adopting it for personal use. As with any convention, utility is the motivating factor, and this criterion is reasonable so long as the slight logical deficiencies cause no confusion. On occasion, "the example, confusion function will arise unless a more x + t3» is an ambiguous x + t3, i.e., the -+ x precise description is used. For phrase; it could mean either function f such that f(x) = t3 x + = t3 for all t. x+ for all x or t 4 t3, i.e., the function x + f such that f (t) As we shall see, however, for many important concepts associated with functions, 4" which contains the calculus has a notation built in. By now we have made a sufficiently extensive investigation of functions to warbut it is rant reconsidering our definition. We have defined a function as a what rule?" this ask this clear hardly "What happens if you break it means. If we is not easy to say whether this question is merely facetious or actually profound. "x "rule," 46 Foundations A more substantial objection to the use of the word "rule" is that f(x)=x2 and f(x)=x2+3x+3-3(x+1) are certainly dafwentrules, if by a rule we mean the actual instructions given for determining f(x); nevertheless, f (x) we want x2 = and f(x)=x2+3x+3-c(x+1) to define the same function. For this reason, a function is sometimes defined as an between numbers; unfortunately the word escapes the objections raised against only because it is even more vague. "association" "association" "rule" There is, of course, a satisfactory way of defining functions, or we should never have gone to the trouble of criticizing our original definition. But a satisfactory definition can never be constructed by finding synonyms for English words which are troublesome. The definition which mathematicians have finally accepted for is a beautiful example of the means by which intuitive ideas have been incorporated into rigorous mathematics. The correct question to ask about a rule?" function is not "What is a or "What is an association?" but "What does one have to know about a function in order to know all about it?" The answer to the last question is easy-for each number x one needs to know the number f (x); we can imagine a table which would display all the information one could desire x2· about the function f(x) "function" = x f (x) 1 1 1 4 4 -1 2 -2 would actually It is not even necessary to arrange the numbers in a table (which wanted all of of column them). Instead be impossible if we array we to list a two of numbers consider various pairs can . (1, 1), (-1, 1), (2, 4), (-2, 4), (x, x2), (Ä, 2), .. . 3. Functions 47 simply collected together into a set.* To find f (1) we simply take the second number of the pair whose first member is 1; to find f (x) we take the second number of the pair whose first member is x. We seem to be saying that a function might as well be defined as a collection of pairs of numbers. For example, if we contains just 5 pairs): were given the following collection (which f = {(1,7), (3, 7), (5,3), (4, 8), (8,4)}, then f (1) 7, f (3) 7, f (5) only numbers in the domain of = = f = 3, f (4) = f. = 8, f (8) 4 and 1, 3, 4, 5, 8 are the = If we consider the collection {(1, 7), (3,7), (2,5), (1,8), (8, 4) }, 7, f (2) 5, f (8) 4; but it is impossible to decide whether f (1) 7 or f(1) 8. In other words, a function cannot be defined to be any old collection of pairs of numbers; we must rule out the possibility which arose in this case. We therefore led the following definition. to are then f (3) = = = = = DEFINITION A function is a collection of pairs of numbers with the following property: if c; in other words, the (a, b) and (a, c) are both in the collection, then b collection must not contain two different pairs with the same first element. = This is our first full-fledged definition, and illustrates the format we shall always use to define significant new concepts. These definitions are so important (at least as important as theorems) that it is essential to know when one is actually remarks, and casual at hand, and to distinguish them from comments, motivating explanations. They will be preceded by the word DEFINITION, contain the term being defined in boldface letters, and constitute a paragraph unto themselves. There is one more definition (actually defining two things at once) which can now be made rigorously: DEFINITION If f is a function, the dornain of f is the set of all a for which there is some b such that (a, b) is in f. If a is in the domain of f, it followsfrom the definition of a function that there is, in fact, a unique number b such that (a, b) is in f. This unique b is denoted by f(a). With this definition we have reached our goal: the important thing about a function f is that a number f (x) is determined for each number x in its domain. You may feel that we have also reached the point where an intuitive definition has been replaced by an abstraction with which the mind can hardly grapple. Two consolations may be ofTered. First, although a function has been defined as a "ordered pairs," to emphasize that, for example, (2, 4) is here are often caRed the pair not same as (4, 2). It is only fair to warn that we are going to define functions in terms of ordered pairs, another undefined term. Ordered pairs can be defined, however, and an appendix to this chapter has been provided for skeptics. *The pairs occurring 48 Foundations collection of pairs, there is nothing to stop you from thinking of a function as a rule. Second, neither the intuitive nor the formal definition indicates the best way of thinking about functions. The best way is to draw pictures; but this requires a chapter all by itself PROBLEMS 1. Let f(x) 1/(1+ x). What is = (i) f (f (x)) (forwhich x does this make sense?). (ii) f (iii) f (cx). (iv) f (x + y). . (v) f (x) + f (y). (vi) For which numbers c is there a number x such that f(cx) f(x). Hint: There are a lot more than you might think at first glance. (vii)For which numbers c is it true that f (cx) f (x) for two different numbers x? = = 2. Let g(x) x2, and let = h(x) 0, x rational 1, x irrational. = For which y is h(y) < y? (ii) For which y is h(y) 5 g(y)? h(z)? is g(h(z)) (iii) Whatwhich g(w) 5 w? w is (iv) For which e is g(g(s)) g(e)? For (v) Find the domain of the functions defined by the following formulas. (i) - = 3. (i) f(x)= (ii) f (x) (iii) f (x) = = 1-x2. 1 1 (v) f(x)= 4. V1 -- x2. - + x-1 (iv) f (x) VI = - 1 . x-2 X2 x2 # _ ¡ 1-x+fx-2. x2, let P(x) sin x. Find each of the following. Let S(x) 22, and let s(x) In each case you answer should be a number. = (So P)(y). (So s)(y). (SoPo s)(t) + (iii) s(t3). = = (i) (ii) (so P)(t). (iv) 5. Express each of the following functions in terms of S, P, s, using only and o (forexample, the answer to +, (i)is Po s). In each case your ., 3. Functions 49 answer should (i) f (x) (ii) f (x) (iii) f (x) (iv) f (x) (v) f(t) be a function. 2sinx sin 22. = = sin x2. sin2 = = x sin2 that (remember ab for (sinx)2). is an abbreviation x 22. (Note: always means aW; this convention abe written because (a")ccan be more simply as = (vi) f(u)=sin(2"+2"). sin(sin(sin(222 (vii) f(y) (viii)f(a) 2662 + sag2) = = is adopted 'Bþ a2÷ + 2sin mg Polynomial functions, because they are simple, yet flexible, occupy a favored in most investigations of functions. The following two problems illustrate their flexibility,and guide you through a derivation of their most important elementary properties. role 6. distinct numbers, find a polynomial function f¿ of is 1 at x¿ and 0 at x; for j ¢ i. Hint: the product of degree n all (x x;) for j ¢ i, is O at x; if j ¢ i. (This product is usually denoted by (a) If xi, ... x, are 1 which , - - (x xj), - j=1 the symbol n (capitalpi) playing for products that E plays for sums.) (b) Now find a polynomial function f of degree n 1 such that f (xt) ai, numbers. where ai, (You should use the functions , an are given will obtain is called the "Lagrange from The formula part (a). you f¿ interpolation formula.") same role t / = - ... 7. for any polynomial function f, and any number a, there is a polynomial function g, and a number b, such that f (x) (x for all x. (The idea is simply to divide (x a) into f (x) by long division, until a constant remainder is left. For example, the calculation (a) Prove that -a)g(x)+b = - x2 x - _; 1)x3 -3x + 1 -x2 x3 x2 x2 -3 -2x + 1 -2x + 2 - 1 50 Foundations shows that x-3 3x + 1 (x 1)(x2 + x possible by induction on the degree of f.) (b) Prove that if f(a) 0, then f(x) (x = - - 2) - - 1. A formal proof is -a)g(x) = for some polynomial = function g. (The converse is obvious.) (c) Prove that if f is a polynomial function of degree n, then f has at most n roots, i.e., there are at most n numbers a with f (a) 0. (d) Show that for each n there is a polynomial function of degree n with = n roots. If n roots, and if n 8. For which numbers is even find a polynomial function of degree n with no is odd find one with only one root. a, b, c, and d will the function f (x) 9. ax + b cx + d = satisfy f (f (x)) (a) If A is any set of real numbers, x for all x? = Ca(x) 1 = define a function Ca as follows: ' 0, xinA x not in A. Find expressions for Cana and Ca a and Ca._a, in terms of Ca and Ca. (The symbol A nB was defined in this chapter, but the other two may be new to you. They can be defined as follows: AUB={x:xisinAorxisinB}, A {x: x is in R but x is not in A}.) R (b) Suppose f = - is a function such that there is a set A such that f Ca. only that and f2 Show if if f f (c) f (x) 0 or 1 for each x. Prove that = = = 10. Ca for some set A. = is there a function g such that f g2? Hint: You is replaced by can certainly answer this question if such that f which there 1/g? functions f is a function g (b) For *(c) For which functionsb and c can we find a function x such that (a) For which functions f = "function" "number." = (x(t))2+ b(t)x(t) + *(d) c(t) = 0 for all numbers t? What conditions must the functions a and b satisfy if there is to be a function x such that a(t)x(t)+b(t)=0 for all numbers 11. (a) Suppose that t? How many such functions x will there be? H is a function and y is a number What is H(H(H(-·-(H(y)-·-)? 80 times such that H(H(y)) = y. 3. Functions 51 (b) Same question if 80 is replaced by 81. (c) Same question if H(H(y)) = H(y). *(d) Find a function H such that H(H(x)) H(x) for all numbers x, and = such that H(l) H(2) x|3, H(13) 36, 47, H(36) 36, H(x|3) n/3, H(47) 47. (Don't try to for H(x); there are many functions H with H(H(x)) H(x). The extra conditions on H are supposed of finding a suitable H.) to suggest a way *(e) Find a function H such that H(H(x)) = H(x) for all x, and such that H(1) 18. 7, H(17) = = = = = "solve" = = = = 12. A function f is even if f (x) f (-x) and odd if f (x) f (-x). For x2 example, f is even if f(x) or f(x) = |x| or f(x) cosx, while f is odd if f(x) sinx. x or f(x) = = = - = = = + g is even, odd, or not necessarily either, in the four f even or odd, and g even or odd. (Your answers can most conveniently be displayed in a 2 x 2 table.) Do (b) the same for f g. (c) Do the same for fo g. (d) Prove that every even function f can be written f(x) g(|x|), for inwhether obtained cases (a) Determine f by choosing . = fmitely many functions g. *13. function f with domain R can be written f E + O, and odd. E is even 0 is that this of writing f is unique. (If you try to do part (b)first, Prove way (b) "solving" by for E and O you will pobably find the solution to part (a).) (a) Prove that any = where 14. If f is any function, define a new function |f| by |f|(x) |f(x)|. If f min(f, functions, define two new functions, max(f, g) and g), by = and g are max(f,g)(x)=max(f(x),g(x)), min(f,g)(x)=min(f(x),g(x)). Find an expression 15. for max(f, g) and min(f, g) in terms of | |. max( f, 0) + min( f, 0). This particular way of writing useful; the functions max(f,0) and min(f,0) is fairly are called the f negative and positive parts of f. called nonnegative A function is if f(x) :> 0 for all x. Prove that any f (b) function f can be written f = g h, where g and h are nonnegative, way" is g max( f, 0) and h in infinitely many ways. (The min( f, 0).) Hint: Any number can certainly be written as the difference of two nonnegative numbers in infmitely many ways. (a) Show that f = - "standard = = - *16. Suppose f satisfies f (x + y) = (a) Prove that f(xi+--·+x.) f (x) + f (y) for all x = and y. f(xt)+---+ f(x.). (b) Prove that there is some number c such that f (x) cx for all rational numbers x (atthis point we're not trying to say anything about f (x) for irrational x). Hint: First figure out what c must be. Now prove that = 52 Foundations cx, first when x is a natural then when x is the reciprocal of an f (x) *17. = number, then when x is an integer, integer and, finally,for all rational x. 0 for all x, then f satisfies f (x + y) f (x) + f (y) for all x and y, all and y) for x y. Now suppose that f satisfies f(x f(x) f(y) always these two properties, but that f (x) is not 0. Prove that f (x) x for all x, as follows: If f (x) = = and also = - - = (a) Prove that f (1) 1. that Prove f (x) x if x is rational. (b) that (c) Prove f (x) > 0 if x > 0. (This part is tricky, but if you have the been paying attention to the philosophical remarks accompanying problems in the last two chapters, you will know what to do.) (d) Prove that f (x) > f (y) if x > y. (e) Prove that f (x) x for all x. Hint: Use the fact that between any two numbers there is a rational number. = = = *18. Precisely what conditions must h(x)k(y) for all x and y? *19. (a) Prove that g, h, and k satisfy f, in order that f(x)g(y) = there do not exist functions f and g with either of the following properties: (i) f(x) + g(y) (ii) f(x) g(y) - xy for all x and y. x + y for all x and y. = = Hint: Try to get some information about f or g by choosing particular values of x and y. (b) Find functions f and g such that f(x + y) g(xy) for all x and y. = *20. (a) Find a function f, other than a constant f(x)|_<|y-x|. function, such that |f (y) - -x)2 for all x and y. (Why does this that f is a constant function. ?) Prove |f (y) f (x)| (y the from Hint: Divide interval x to y into n equal pieces. (b) Suppose that f(y) imply that 21. f(x) -- 5 (y -x)2 5 - Prove or give a counterexample for each of the following assertions: (a) fo(g+h)=fog+foh. (b) (g+h)of=gof+hof. I (c) fog =-og, f 1 (d) fog 22. 1 = fo 1 - . g then g(x) g(y). g(y) that f and g are two functions such that g(x) whenever f (x) f (y). Prove that g hof for some function h. Hint: h(z) when z is of the form z f(x) (these define try to are the only z Just that matter) and use the hypotheses to show that your definition will not run into trouble. (a) Suppose g ho f. (b) Conversely, suppose = = Prove that if f(x) = f(y), = = = = 3. Functions 53 23. *24. Suppose that fog = I, where I(x) x. Prove that = (a) if x ¢ y, then g(x) ¢ g(y); (b) every number b can be written b (a) Suppose g is a function with the = f (a) for some number a. property that g(x) ¢ g(y) if x ¢ y. Prove that there is a function b can be written (b) Suppose that f is a function number that there is a function g such that b f (a) for some a. Prove f such that fog = I. such that every number = fog=I. *25. *26. Find a function function h with f such foh that gof I. = I and hof Suppose fog associative. is = = = I for some g, but such that there is no I. Prove that g = h. Hint: Use the fact that composition 27. (a) Suppose f (x) x + 1. Are there any functions g such that fog go f? (b) Suppose f is a constant function. For which functions g does fog go f? (c) Suppose that f og gof for all functions g. Show that f is the identity = = = = function, f (x) = x. 28. (a) Let F be the set of all functions whose domain is R. Prove that, using + as defined in this chapter, all of properties P1-P9 except P7 hold for F, provided 0 and 1 are interpreted as constant functions. (b) Show that P7 does not hold. *(c) Show that P1 Pl2 cannot hold. In other words, show that there is no collection P of functions in F, such that P10-Pl2 hold for P. (It is sufficient, and will simplify things, to consider only functions which are 0 and - except at two points xo and xi.) (d) Suppose we define f < g to mean that f(x) < P'10-P'13 (inProblem 1-8) now hold? Is fah<gah? (e) If f<g,ishof<hog? g(x) for all x. Which of 54 Foundations APPENDIX. ORDERED PAIRS Not only in the definition of a function, but in other parts of the book as well, it is necessary to use the notion of an ordered pair of objects. A definition has not yet been given, and we have never even stated explicitly what properties an ordered pair is supposed to have. The one property which we will require states formally that the ordered pair (a, b) should be determined by a and b, and the order in which they are given: if (a, b) = (c,d), then a = c and b = d. Ordered pairs may be treated most conveniently by simply introducing (a, b) this term and adopting the basic property as an axiom-since as an undefined ordered much the only significant about there point is fact pairs, is not property worrying about what an ordered pair is. Those who find this treatment satisfactory need read no further. The rest of this short appendix is for the benefit of those readers who will feel uncomfortable unless ordered pairs are somehow defined so that this basic property becomes a theorem. There is no point in restricting our attention to ordered pairs of numbers; it is just as reasonable, and just as important, to have available the "really" objects. This means that our of an ordered pair of any two mathematical definition ought to involve only concepts common to all branches of mathematics. The one common concept which pervades all areas of mathematics is that of a set, and ordered pairs (likeeverything else in mathematics) can be defined in this context; an ordered pair will turn out to be a set of a rather special sort. The set {a,b}, containing the two elements a and b, is an obvious first choice, but will not do as a definition for (a, b), because there is no way of determining notion from {a,b} which of a or b is meant to be the first element. candidate is the rather startling set: A more promising {{a},{a,b} }. members, both of which are themselvessets; one member is the set {a),containing the single member a, the other is the set {a,b}. Shocking as it may seem, we are going to define (a, b) to be this set. The justificationfor this choice is given by the theorem immediately following the definition-the definition works, and there really isn't anything else worth saying. This set has two (a, b) DEFINITION THEOREM 1 PROOF If (a, b) = (c,d), then a c and = b = = {{a),{a,b} }. d. The hypothesis means that {{a},{a,b)} {{c},{c,d}}. = {a) and {a,b}; and a is the only {{a},{a,b} }. Similarly, c is the unique Now {{a), {a,b} } contains just two members, common element of these two members of 3, Appendix.OrderedPairs 55 common member of both members of {{c), {c,d) ). Therefore a = c. We therefore have {{a},{a,b}} {{a), {a,d)), = and only the proof that b = d remains. It is convenient to distinguish 2 cases. Case1. b = a. In this case, {a,b} = {a},so the set {{a},{a,b) }really member, namely, {a}. The same must be true of ({a), {a,d) ), so which implies that d = a = b. has only one {a,d} (a), = Case2. b ¢ a. In this case, b is in one member of {(a}, {a,b} } but not in the other. It must therefore be true that b is in one member of {{a},{a,d) } but not in the other. This can happen only if b is in {a,d), but b is not in {a};thus b a = orb=d,butb¢a;sob=d. CHAPTER -1 0 FIGURE i 1 2 3 1 GRAPHS Mention the real numbers to a mathematician and the image of a straight line will probably form in her mind, quite involuntarily. And most likely she will neither banish nor too eagerly embrace this mental picture of the real numbers. "Geometric intuition" will allow her to interpret statements about numbers in terms of this picture, and may even suggest methods of proving them. Although the properties of the real numbers which were studied in Part I are not greatly illuminated by a geometric picture, such an interpretation will be a great aid in Part IL You are probably already familiar with the conventional method of considering the straight line as a picture of the real numbers, i.e., of associating to each real number a point on a line. To do this (Figure 1) we pick, arbitrarily, a point which which we label 1. The point twice as far to We label0, and a point to the right, the right is labeled 2, the point the same distance from 0 to 1, but to the left of 0, if a < b, then the point corresponding is labeled etc. With this arrangement, of corresponding the point to b. We can also draw rational to a lies to the left numbers, such as in the obvious way. It is usually taken for granted that the numbers also somehow fit into this scheme, so that every real number irrational can be drawn as a point on the line. We will not make too much fuss about numbers is intended justifyingthis assumption, since this method of solely as a method of picturing certain abstract ideas, and our proofs will never rely on these pictures we will frequently use a picture to suggest or help (although explain a proof). Because this geometric picture plays such a prominent, albeit inessential role, geometric terminology is frequently employed when speaking of numbers-thus is sometimes called a point,and R is often called the a number real kne. The number |a-b| has a simple interpretation in terms of this geometric picture: it is the distance between a and b, the length of the line segment which has a as one end point and b as the other. This means, to choose an example whose frequent -1, (, "drawing" that the set of numbers x which satisfy of |x a| < e may be pictured as the collection points whose distance from a is less than e. This set of points is the from a e to a + e, which may also numbers be described as the points corresponding to x with a e < x < a + e (Figure 2). occurrence justifiesspecial consideration, - "interval" - - r a FIGURE 2 e o , a + Sets of numbers which correspond to intervals arise so frequently that it is desirable to have special names for them. The set {x: a < x < b} is denoted by (a, b) and called the open interval from a to b. This notation naturally creates some ambiguity, since (a, b) is also used to denote a pair of numbers, but in context it is always clear (orcan easily be made clear) whether one is talking about a pair or Ø, the set with no elements; in pracan interval. Note that if a > b, then (a, b) = 56 4. Graphs 57 the open interval . the closed interval (a, b) [a,b] a the interval (- o, a) the interval the interval (- o, oo) a] The set {x: a Exs b} is denoted by [a,b] and is called the closed interval from a to b. This symbol is usually reserved for the case a < b, but it is sometimes used for a b, also. The usual pictures for the intervals (a, b) and [a,b] are shown in Figure 3; since no reasonably accurate picture could ever indicate the difference between the two intervals, various conventions have been adopted. Figure 3 also shows certain intervals. The set {x : x > a} is denoted by (a, oo), while the set {x: x > a} is denoted by [a,oo); the sets (-oo, a) and (-oo, a] are similarly. standard warning must be issued: the symbols ao point this defined At a = "infinite" [a, m) the interval FIGURE (a, tice, however, it is almost always assumed (explicitly if one has been careful, and implicitly otherwise), that whenever an interval (a, b) is mentioned, the number a is less than b. 3 "infinity" -oo, though usually read and "minus suggestive; infinity," are purely which satisfies oo > a for all numbers a. While the there is no number symbols oo and will appear in many contexts, it is always necessary to define these uses in ways that refer only to numbers. The set R of all real numbers is also considered to be an and is sometimes denoted by (-oo, oo). and "oo" -oo (0, b) _ _ (a, b) "interval," i I (1, 1) 1) (-1, Of even greater interest to us than the method of drawing numbers is a method drawing pairs of numbers. This procedure, probably also familiar to you, resystem," quires a two straight lines intersecting at right angles. To distinguish these straight lines, we call one the horizontalaxis, and one the vertical axis. (More prosaic terminology, such as the and axes, is probably of view, but most people hold their books, or at preferable from a logical point and least their blackboards, in the same way, so that are with real numbers, could of the descriptive.) be labeled Each but two axes more we can also label points on the horizontal axis with pairs (a, 0) and points on the vertical axis with pairs (0, b), so that the intersection of the two axes, the of the coordinate system, is labeled (0, 0). Any pair (a, b) can now be drawn as in Figure 4, lying at the vertex of the rectangle whose other three vertices are labeled (0,0), (a, 0), and (0,b). The numbers a and b are called the firstand second coordinates, respectively, of the point determined in this way. of "coordinate (0, 0) (a,o) "first" -1) . . (-1, -1) (1, "second" "horizontal" FIGURE 4 "vertical" "origin" f(x) FIGURE f(x) - i Our real concern, let us recall, is a method of drawing functions. Since a function is just a collection of pairs of numbers, we can draw a function by drawing each of the pairs in the function. The drawing obtained in this way is called the graph of the function. In other words, the graph of f contains all the points corresponding to pairs (x,f (x)). Since most functions contain infinitely many pairs, drawing the graph promises to be a laborious undertaking, but, in fact, many functions have graphs which are quite easy to draw. 5 --x = f(x) y = Not surprisingly, the simplest functions of all, the constant functions f(x) c, have the simplest graphs. It is easy to see that the graph of the function f (x) c is a straight line parallel to the horizontal axis, at distance c from it (Figure 5). = = The functions f (x) lines cx also have particularly simple graphsstraight of Figure this 6. A proof fact is indicated in Figure 7: (0, 0), as in = FIGURE 6 through 58 Foundations Let x be some number not equal to 0, and let L be the straight line which passes through the origin 0, corresponding to (0, 0), and through the point A, correwith sponding to (x, cx). A point A', first coordinate y, will lie on L when the triangle A'B'O is similar to the triangle ABO, thus when L (x,ex) A' --=-=c; OB' -(0, FIGURE o) D' 7 E (a, b) ' length FIGURE this is precisely the condition that A' corresponds to the pair (y, cy), i.e., that A' lies on the graph of f. The argument has implicitly assumed that c > 0, but the other cases are treated easily enough. The number c, which measures the ratio of the sides of the triangles appearing in the proof, is called the slope of the straight line, and a line parallel to this line is also said to have slope c. B length d - a a - b This demonstration has neither been labeled nor treated as a formal pmof. Indeed, a rigorous demonstration would necessitate a digression which we are not at all prepared to follow. The rigorous proof of any statement connecting geometric and algebraic concepts would first require a real proof (ora precisely stated assumption) that the points on a straight line correspond in an exact way Aside from this, it would be necessary to develop plane to the real numbers. geometry as precisely as we intend to develop the properties of real numbers. Now the detailed development of plane geometry is a beautiful subject, but it is by no means a prerequisite for the study of calculus. We shall use geometric pictures only as an aid to intuition; for our purposes (andfor most of mathematics) it is perfectly satisfactory to defmethe plane to be the set of all pairs of real numbers, and to dafmestraight lines as certain collections of pairs, including, among others, the collections {(x,cx) : x a real number}. To provide this artificially constructed geometry with all the structure of geometry studied in high school, one more defmition is required. If (a, b) and (c,d) are two points in the plane, i.e., pairs of real numbers, we dfme the distance between (a, b) and (c, d) to be (a f(x) - cx ‡ d / (o, OB - c)2 + (b - d)2. If the motivation for this definition is not clear, Figure 8 should serve as adequate explanation-with this definition the Pythagorean theorem has been built into our geometry.* Reverting once more to our informal geometric picture, it is not hard to see (Figure 9) that the graph of the function f (x) cx + d is a straight line with slope c, passing through the point (0, d). For this reason, the functions f (x) cx + d are called linear functions. Simple as they are, linear functions occur frequently, and you should feel comfortable working with them. The following is a typical problem whose solution should not cause any trouble. Given two distinct points (a, b) and (c,d), find the linear function f whose graph goes through (a, b) and (c, d). This amounts to saying that f(a) b and f(c) d. If = = FIGURE 9 = * = numbers The fastidious reader might object to this definition on the grounds that nonnegative objection really unanswerable the moment-the have This is are not yet known to square roots. at definition will just have to be accepted with reservations, until this little point is settled. 4. Graphs 59 e f is to be of the form f (x) = ax + ß, then we must have ac+ß=d; a) therefore a = (d - b)/(c f(x)= - d-b c-a a) and x+b- ß = b - [(d d-b c-a - b)/(c d-b a= c-a "point-slope (b) / / FIGURE e (c) lo (d) - a)]a, so (x-a)+b, form" (seeProblem 6). a formula most easily remembered by using the Of course, this solution is possible only if a ¢ c; the graphs of linear functions account only for the straight lines which are not parallel to the vertical axis. The vertical straight lines are not the graph of any function at all; in fact, the graph of a function can never contain even two distinct points on the same vertical line. This Conclusion is immediate from the definition of a function-two points on the same vertical line correspond to pairs of the form (a, b) and (a, c) and, by defmition, a function cannot contain (a, b) and (a, c) if b ¢ c. Conversely, if a set of points in the plane has the property that no two points lie on the same vertical line, then it is surely the graph of a fimction. Thus, the first two sets in Figure 10 are not graphs of functions and the last two are; notice that the fourth is the graph of a function whose domain is not all of R, since some vertical lines have no points on them at all. After the linear functions the simplest is perhaps the function f (x) = x2 gyy draw some of the pairs in f, i.e., some of the pairs of the form (x,x2), we obtain a picture like Figure 11. e (2, 4) FIGURE 11 60 Foundations M "' It is not hard to convince yourself that all the pairs (x, x2) lie along a curve like the one shown in Figure 12; this curve is known as a parabola. Since a graph is just a drawing on paper, made (inthis case) with printer's ink, the question "Is this what the graph really looks like?" is hard to phrase in any sensible manner. No drawing is ever really correct since the line has thickness. there Nevertheless, are some questions which one can ask: for example, how can you be sure that the graph does not look like one of the drawings in Figure 13? It is easy to see, and even to prove, that the graph cannot look like (a);for if 0 < x < y, then x2 < y2, so the graph should be higher at y than at x, which is not the case in (a) It is also easy to see, simply by drawing a very accurate graph, first plotting many pairs (x, x2), that the graph cannot have a large as in (b) order these assertions, however, we first need to prove or a as in (c).In to say, in a mathematical way, what it means for a function not to have a "corner"; already these ideas involve some of the fundamental concepts of or calculus. Eventually we will be able to define them rigorously, but meanwhile you to define these concepts, and then examining may amuse yourself by attempting your definitions critically Later these definitions may be compared with the ones mathematicians have agreed upon. If they compare favorably, you are certainly to be congratulated! called The functions f (x) x", for various natural numbers n, are sometimes easily compared as in Figure 14, by power functions. Their graphs are most drawing several at once. The power functions are only special cases of polynomial functions, introduced in the previous chapter. Two particular polynomial functions are graphed in . "jump" "corner" "jump" FIGURE 12 = (a) (c) FIGURE 13 FIGURE 14 4. Graphs 61 f(x) - Figure 15, while Figure 16 is polynomial function x + x f (x) -1 -1 = to give a general idea of the graph of the meant anx" + a,_ix"¯I + -··+ ao, in the case a, > 0. In general, the graph of f will have at most n 1 or (a is a point like (x,f (x)) in Figure 16, while a is a point like (y, f (y)). The number of peaks and valleys may actually be much smaller (thepower functions, for example, have at most one valley). Although these assertions are easy to make, giving proofs until Part III (oncethe powerful methwe will not even contemplate ods of Part III are available, the proofs will be very easy). Figure 17 illustrates the graphs of several rational functions. The rational functions exhibit even greater variety than the polynomial functions, but their behavior will also be easy to analyze once we can use the derivative, the basic tool of Part III. together" the graphs of Many interesting graphs can be constructed by functions already studied. The graph in Figure 18 is made up entirely of straight lines. The function f with this graph satisfies "peaks" "valleys" "peak" - "valley" (a) -2 - "piecing a __ 3 f 1 = (-1)n+1 = (-1).÷1 = 1, -1 f I -2-- - E f(x) (b) |x| > 1, -l/(n+ FIGURE 15 and is a linear function on each interval [l/(n+ 1), 1/n] and [-1/n, 1)]. (The number 0 is not in the domain of f.) Of course, one can write out an explicit formula for f (x), when x is in [l/(n+ 1), 1/n]; this is a good exercise in the use of linear functions, and will also convince you that a picture is worth a thousand words. 1 I I I I I I I I i I FIGURE 16 I n even n odd (a) (b) 02 Poundahons f(x) f(x) f(x) = 1 1 f(x) p - i (b) (c) (d) FIGURE 17 i FIGURE 18 - 4. Graphs 63 1- - f(x) = sin x -360 720 360 ---1 FIGURE 19 It is actually possible to define, in a much simpler way, a function which exhibits this same property of oscillating infinitely often near 0, by using the sine function. In Chapter 15 we will discuss this function in detail, and radian measure in particular; for the time being it will be easiest to use degree measurements for angles. The graph of the sine function is shown in Figure 19 (thescale on the horizontal axis has been altered so that the graph will be clearer; radian measure has, besides important mathematical properties, the additional advantage that such changes are unnecessary). sinl/x. The graph of Now consider the function f(x) observe that ure 20. To draw this graph it helps to first f(x)=0 f(x)=1 f(x)=--1 1 1 1 180 360 540 1 1 1 forx=90 90 + 360 90 + 720 1 1 1 forx=270 270 + 360 270 + 720 forx=- - ' -,..., ' ' ' ' is shown in Fig- f = ' ' ... ' .... ' Notice that when x is large, so that 1/x is small, f (x) is also small; when x is f (x) FIGURE 20 - sin 1 - 64 Foundations f(x) FIGURE "large = x sin 1 - 21 negative," although f (x) that is, when < |x| is large for negative x, again f (x) is close to 0, 0. An interesting modification of this fimetion is f (x) x sin 1/x. The graph of this function is sketched in Figure 21. Since sin 1/x oscillates infinitely often near 0 the function f (x) x sin 1/x oscillates infinitely often between between 1 and and The behavior of the graph for x large or large negative is harder to x = -1, = -x. \ FIGURE 22 / 4. Graphs 65 j 2·- Since sin 1/x is getting close to 0, while x is getting larger and larger, there seems to be no telling what the product will do. It is possible to decide, but this is x2 sin 1/x another question that is best deferred to Part III. The graph of f(x) has also been illustrated (Figure 22). For these infinitely oscillating functions, it is clear that the graph cannot hope to The best we can do is to show part of it, and leave out the be really the is interesting part). Actually, it is easy to find much simpler part near 0 (which whose functions graphs cannot be drawn. The graphs of analyze. = (a) "accurate." g "accurately" x2 f (x) f 2, ' 741 2 and g(x) 1 x> 23 = x > 1 FIGURE (x,y) (a, b) irrational 0, x 1, x rational. 1, x rational 0, x irrational ¯ 25 2, 41 ' CIOsed f (x) FIGURE = be distinguished by some convention similar to that used for open and intervals (Figure 23). Out last example is a function whose graph is spectacularly nondrawable: can only (b) FIGURE = 24 The graph of f must contain infinitely many points on the horizontal axis and also infinitely many points on a line parallel to the horizontal axis, but it must not contain either of these lines entirely. Figure 24 shows the usual textbook picture of the graph. To distinguish the two parts of the graph, the dots are placed closer together on the line corresponding to irrational x. (There is actually a mathematical reason behind this convention, but it depends on some sophisticated ideas, introduced in Problems 21-5 and 21-6.) The peculiarities exhibited by some functions are so engrossing that it is easy to forget some of the simplest, and most important, subsets of the plane, which are not the graphs of functions. The most important example of all is the circle. A circle with center (a, b) and radius r > 0 contains, by definition, all the points (X, y) Whose distance from (a, b) is equal to r. The circle thus consists (Figure 25) of all points (x, y) with (x-a)2+Q-b2=r or (x - a)2 + Q - 2 2 66 Foundations and radius 1, often regarded as a sort of standard copy, The circle with center (0,0) is called the unit cirde. A close relative of the circle is the ellipse. This is defined as the set of points, the sum of whose distances from two points is a constant. (When the two foci are the same, we obtain a circle.) If, for convenience, the focus points are taken to be (-c, 0) and (c, 0), and the sum of the distances is taken to be 2a (the factor 2 simplifies some algebra), then (x, y) is on the ellipse if and only if "focus" 2 (x (-c))2 - or (x + 2 c)2 - = 2a (x+c)2+y2=2a-d(x-c)2+y2 or x2+2cx+c2+y2=4a2-4ad(x-c)2+y2+x2-2cx+c2+y2 or 4(cx or c2x2 - - a2) 2cxa2 # a4 or (c2_ or 2 (x 2 _ 2__a2 2 _ b = a2 - c2 2 - C2 x2 y2 a2 b2 0, so that = 2 =1, x=±a, (a,0) (-a, 0) (0, 26 2 ' must clearly of an ellipse is shown . FIGURE 2 4 y2 1. = (sincewe 0). A picture horizontal axis when y > sects the 2 _ simply -+-=1 42 + y2 72 a2 + a2 - where a2 - 2_ x2 This is usually written c)2 -4a = - b) choose it follows that in Figure 26. The ellipse intera > c, 4. Graphs 67 and it intersects the vertical axis when x = 0, so that 2 1, = y = ±b. The hyperbola is defined analogously, except that we require the dagence of the two distances to be constant. Choosing the points (-c, 0) and (c,0) once again, and the constant difTerence as 2a, we obtain, as the condition that (x, y) be on the hyperbola, (-a,0) (a,o) which may FIGURE 27 2- (x+c)2 (X-C) +y2=±2a, be simplified to In this case, however, we must clearly choose c > a, so that a2 a2, then (x, y) is on the hyperbola if and only if b = c2 - < 0. If Ác2 - x 2 ---=1 a2 2 y b2 The picture is shown in Figure 27. It contains two pieces, because the difference between the distances of (x, y) from (-c, 0) and (c,0) may be taken in two different orders. The hyperbola intersects the horizontal axis when y 0, so that vertical the axis. ±a, but it never intersects x b It is interesting to compare (Figure 28) the hyperbola with a Ã and the graph of the function f (x) 1/x. The drawings look quite similar, and the two sets are actually identical, except for a rotation through an angle of 45° (Problem 23). Clearly no rotation of the plane will change circles or ellipses into the graphs of functions. Nevertheless, the study of these important geometric figures can often be reduced to the study of functions. Ellipses, for example, are made up of the = = = = (a) FIGURE 28 (b) = 68 Foundations graphs of two functions, f (x) = g(x) = Ál b (x2/a2), - -a 5 x 5 a and (x2g2), -b 1 - -a 5 x 5 a. Of course, there are many other pairs of functions with this same property For example, we can take f (x) b 1 (x2/a2), - O < xs = -bÅl-(x2/a2), a -a5x50 and Ál -b g(x) = 1 b - (x2/a2), - (x2 a2, O< x 5 a -a 5 x 5 0. We could also choose f(x) b /1 = (x2/a2), - -b 1 -- X rational, (x2/a ), x irrational, (x2/a2), x rational, - -- a X a a 5 x 5 a and -b g(x) 1 - -- = b Ál - (x2 2), x irradonal, a 5 x 5 a _< -- G X G. But all these other pairs necessarily involve unreasonable functions which jump of this fact, is too difficult at present. around. A proof, or even a precise statement Although you have probably already begun to make a distinction between those functions with reasonable graphs, and those with unreasonable graphs, you may reasonable reasonable of definition find it very difficult to state a functions. A mathematical definition of this concept is by no means easy, and a great deal of this book may be viewed as successive attempts to impose more and more conditions that a function must satisfy, As we define some of these conditions, we will take time out to ask if we have really succeeded in isolating the functions which deserve to be called reasonable. The answer, unfortunately, will always be "yes." or at best, a qualified "reasonable" "no," PROBLEMS 1. Indicate on a straight line the set of all x satisfying the following conditions. Also name each set, using the notation for intervals (insome cases you will also need the U sign). (i) (ii) (iii) (iv) |x--3|<1. |x 3| 5 1. |x a| < e. |x2 1| < - - - . 4. Graphs 69 +1 1 1 5 a (give an answer in terms of a, distinguishing various cases). 1 + x2 (vii)x2 + 1 > 2. (viii)(x + 1)(x 1) (x 2) > 0. (vi) - 2. - There is a very useful way of describing the points of the closed interval (wherewe as usual, that a assume, < b). [a,b] the interval [0,b], for b > 0. Prove that if x is in [0,b], tb 1. What is the significance of the then x for some t with 0 sts mid-point of the interval [0,b]? number t? What is the b], then that if is in (1 t)a + tb for some t with x x [a, (b) Now prove also expression a). be written as a + t(b 0 5 t 5 1. Hint: This can What is the midpoint of the interval [a, b]? What is the point 1/3 of the (a) First consider = = - - from a to b? (c) Prove, conversely, that if 0 sts 1, then (1 t)a + tb is in [a,b]. (d) The points of the open interval (a, b) are those of the form (1 t)a + tb for0<t<1. way - - 3. Draw the set of all points (x, y) satisfying the following conditions. (In most cases your picture will be a sizable portion of a plane, not just a line or curve.) (i) > y. + (ii) x a 2> y + b. (iii) y < x 2. (iv) y 5 x x (v) |x-y|<1. (vi) |x+y|<1. (vii)x + y is an integer. (viii)x + y 1 (ix) (x2 (x) 4. - is an integer. 1)2 + (y - 2)2 4 y 4 x Draw the set of all points (x, y) satisfying the following conditions: (i) |x|+|y|=1. (ii) |xi |yl - (iii) |x 1| - = |1 x| x2+y2=0. (iv) (v) (vi) xy x2 (vii)x2 (viii) = - = - -- = 1. |y |y 0. 2x + y2 y2. 1|. 1|. - - = 4. 70 Foundations 5. Draw the set of all points (i) x (iv) x the following conditions: y2. = 2 (ii) (iii) x satisfying (x, y) 2 = - = = |y 1. . sin y. Hint: You already know the answers when x and y are interchanged. 6. (a) Show that the straight function f (x) m(x = line through (a, b) with slope m is the graph of the a) + b. This formula, known as the "point-slope - form" is far more convenient than the equivalent expression f (x) it is immediately clear from the point-slope form that the mx + (b slope is m, and that the value of f at a is b. (b) For a y c, show that the straight line through (a, b) and (c, d) is the graph of the function = -ma); d-b(x-a)+b. f(x)= -a c are the graphs of straight lines? (c) When 7. f(x) = mx + b and g(x) = m'x + b' parallel nurnbers A, B, and C, with A and B not both 0, show that the 0 is a straight line (possibly set of all (x, y) satisfying Ax + By + C a vertical one). Hint: First decide when a vertical straight line is described. (b) Show conversely that every straight line, including vertical ones, can be described as the set of all (x, y) satisfying Ax + By + C 0. (a) For any = (1, m) = 8. (a) Prove that the graphs of the functions f(x)=mx+b, g(x)=nx+c, -1, by computing the squares of the lengths of the sides of the triangle in Figure 29. (Why is this special case, where the lines intersect at the origin, as good as the general case?) (b) Prove that the two straight lines consisting of all (x, y) satisfying the conare gi, FIGURE 29 perpendicular if mn = ditions Ax+By+C=0, A'x+B'y+C'=0, are perpendicular 9. (a) Prove, using if and only if AA'+ BB' = 0. Problem 1-19, that (x1+71)2+(X2+72) Il2+X22 2+122 4. Graphs 71 (b) Prove that (xs - xi)2 (ya yi)2 + - < V(x2 Il)2 - Ÿ Q2 + - B)2 Á(xa x2) - Interpret this inequality geometrically (itis called the ity"). When does strict inequality hold? 10. "triangle 13 - 72)2 inequal- Sketch the graphs of the following functions, plotting enough points to get (Part of the problem is to make a good idea of the general appearance. the queries posed below are decision how many is a reasonable meant to show that a little thought will often be more valuable than hundreds of individual points.) "enough"; (i) f (x) = 1 x+ (What happens for x -. x near 0, and for large x? Where does the graph lie in relation to the graph of the identify function? Why does it suffice to consider only positive x at first?) 11. (ii) f (x) = (iii) f (x) = (iv) f (x) = 1 x - -. x x2 + 1 4. x x 2 1 - yx . Describe the general features of the graph of f if is even. is (ii) f odd. (iii) f is nonnegative. (i) f (iv) f (x) odic, 12. = f (x + a) for all x with period (afunction with this property is called peri- a. Graph the fimetions f(x) 6 for m = 1, 2, 3, 4. (There is an easy way to do this, using Figure 14. Be sure to remember, however, that 6 means the mth root of x when m is even; you should also note that there will be positive an important difference between the graphs when m is even and when m is = odd.) 13. (a) Graph f(x) (b) Graph f(x) = |x| and f(x) x2. |sinx| and f(x) = sin2x. (There is an important differwhich the graphs, we cannot yet even describe rigorously. ence between what See if you can discover it is; part (a)is meant to be a clue.) 14. = = Describe the graph of g in terms of the graph of f if 72 Foundations g(x) (ii) g(x) (iii) g(x) (iv) g(x) (v) g(x)= (vi) g(x) (i) (vii)g(x) (viii)g(x) (ix) g(x) (x) g(x) = = = = = = f (x) + c. f (x + c). (It is easy cf (x) (Distinguish the f (cx). cases c = 0, c > 0, c < 0.) f(1/x). f (|x|). If(x)l. = max(f, = min( = to make a mistake here.) 0). f, 0). max(f, 1). ax2 15. Draw the graph of f (x) lem 1-18. 16. Suppose that A and C are not both 0. Show that the set of all (x, y) satisfying = + bx + c. Hint: Use the methods Ax2 + Bx + Cy2 + Dy + E = of Prob- 0 is either a parabola, an ellipse, or an hyperbola (orpossibly 0). Hint: The 0 is essentially Problem 15, and the case A 0 is just a minor variant. Now consider separately the cases where A and B are both positive or negative, and where one is positive while the other is negative. case C 17. = = The symbol [x]denotes the largest integer which is s x. Thus, [2.1] [2] 2 and [-0.9] Draw the graph of the following functions [-1] (theyare all quite interesting, and several will reappear frequently in other problems). = = --1. = = (i) f (x) [x]. (ii) f (x) x [x]. = = - (iii) f(x) fx [x]. (iv) f (x) [x]+ )x [x]. = - = , 18. (v) f (x) = (vi) f (x) = - 1 - . x 1 - . Graph the following functions. (i) f (x) = {x),where {x}is defined to be the distance from x integer. (ii) f (x) {2x}. = (iii) f (x) {x}+ {2x). (iv) f (x) {4x). + i {4x}. (v) f(x) {x}+ ({2x) = = = to the nearest 4. Graphs 73 Many functions may be described in terms of the decimal expansion of a number. Although we will not be in a position to describe infinite decimals rigorously until Chapter 23, your intuitive notion of infmite decimals should suffice to carry you through the following problem, and others which occur before Chapter 23. There is one ambiguity about infinite decimals which must be eliminated: Every decimal ending in a string of 9's is equal to another ending in a string of O's (e.g., 1.23999... 1.24000...). We will always use the one ending in 9's. = *19. Describe as best you can the graphs of the following functions picture is usually out of the question). (a complete the 1st number in the decimal expansion of x. (ii) f (x) the 2nd number in the decimal expansion of x. (iii) f (x) the number of 7's in the decimal expansion of x if this number is finite, and 0 otherwise. (iv) f (x) 0 if the number of 7's in the decimal expansion of x is finite, (i) f (x) = = = = and 1 otherwise. by replacing all digits in the decimal expansion come after the first 7 (ifany) by 0. 1 if never appears in the decimal expansion of x, and n if 1 f (x) 0 first appears in the nth place. (v) f (x) (vi) *20. = the number obtained of x which = Let f (x) 0, 1 = x irrational p rational x -, q = - q in lowest terms. (A number p/q is in lowest terms if p and q are integers with no common factor, and q > 0). Draw the graph of f as well as you can (don'tsprinkle 2, points randomly on the paper; consider first the rational numbers with q then those with q 3, etc.). = = 21. (x,x1 points on the graph of f(x) = x2 are the ones of the form (x.x2L Prove that each such point is equidistant from the point (0, and the (See Figure 30.) graph of g(x) line L, the graph of g(x) y, (b) Given a point P (a, ß) and a horizontal show that the set of all points (x, y) equidistant from P and L is the graph of a function of the form f (x) ax2 + bx + c. (a) The ¼) --i. = = = = FIGURE 3o *22. (a) Show that the square of the distance from x2(m2 + 1) + x(-2md - (c, d) to (x, mx) is 2c) + d2 ‡ C . Using Problem 1-18 to find the minimum of these numbers, the distance from (c, d) to the graph of f (x) mx is show that = |cm-d||Jm2+1. the distance from this case to part (a).) (b) Find (c,d) to the graph of f (x) = mx + b. (Reduce N Foundations *23. Problem 22, show that the numbers ure 31 are given by ', ' x' and y' (a) Using 1 indicated in Fig- 1 x'=-x+-y, ngth x' length y' r 45 (b) Show that FIGURE 31 the set of all as the set of all (x, y) (x, y) with xy with = 1. (x'/h)2 - O'/h)2 = 1 is the same 4, Appendix1. Vectors75 APPENDIX 1. VECTORS Suppose that v is a point in the plane; in other words, o is a pair of numbers v 1, (vl = , U2). For convenience, we will use this convention that subscripts indicate the first and second pairs of a point that has been described by a single letter. Thus, if we that w is the pair (wi, w2) mention the points w and z, it will be understood while z is the pair (zi, z2). Instead of the actual pair of numbers (vi, v2), we often picture v as an arrow from the origin O to this point (Figure 1), and we refer to these arrows as vectors in the plane. Of course, we've haven't really said anything new yet, we've simply introduced an alternate term for a point of the plane, and another mental picture. The real point of the new terminology is to emphasize that we are going to do some new things with points in the plane. For example, suppose that we have two vectors (i.e.,points) in the plane, U2) O v = (vi, v2). Then we can define a new vector FI GU RE 1 (1) E = (El, ©2)· point of the plane) v + w by the equation (anew v+w=(v1+wi,v2+w2). Notice that all the letters on the right side of this equation On the other + sign is just our usual addition of numbers. side the left is new: previously, the sum of two points in the and we've simply used equation (1)as a defmition. mathematician might want to use some new A very fussy defined operation, like v + w, or perhaps ve are numbers, and the hand, the + sign on plane wasn't defined, symbol for this newly w, but there's really no need to insist on this; since v + w hasn't been defined before, there's no possibility of confusion, so we might as well keep the notation simple. Of course, any one can make new notation; for example, since it's our defmition, 1012), we could just as well have defined o + w as (vi + wi w2, v2 + or by some construction weird real equally question formula. The other is, does our new have any particular significance? Figure 2 shows two vectors o and w, as well as the point - L (vi + wi, v2 + w2) (vi + (v2+ w2) w (vi, v2) ¯ v "' - v2 wi, v2 + for the moment, we have simply indicated in the usual way, without drawing an arrow. Note that it is easy to compute the slope of the line L between v and our new point: as indicated in Figure 2, this slope is just which, (v2 2) - (vi+wi)-vi FIGURE 2 ©2), U2 2 wl' and this, of course, is the slope of our vector w, from the origin O to Other words, the line L is parallel to w. (wi, ©2). In 76 Foundations Similarly, the slope of the line M between (wi, w2) and our new point is w2) (v2+ ' v2 (v1+wi)-v2 i ' v+w w2 - vi is the slope of the vector v; so M is parallel to v. In short, the new point o + w lies on the parallelogram having v and w as sides. When we draw v + w as an arrow (Figure 3), it points along the diagonal of this parallelogram. In physics, vectors are used to symbolize forces, and the sum of two vectors represents the resultant force when two different forces are applied simultaneously to the same which w i ' o object. FIGURE "w" Figure 4 shows another way ofvisualizing the sum v+w. If we use to denote of at the starting and having length, instead at v but same an arrow parallel to w, origin, the then o + w is the vector from O to the final endpoint; thus we get to by first following v, and then following w. + v w of Many the properties of + for ordinary numbers also hold for this new + for vectors. For example, the law" 3 "commutative v+w=w+v, is obvious from the geometric picture, since the parallelogram spanned by v and w is the same as the parallelogram spanned by w and v. It is also easily checked analytically, since it states that v + w "w" (vi + and thus simply Ÿ U2 wl, " vi + wi + 72 Similarly, unraveling 4 (El = Ÿ Ul, U2 W2 wi + + = defmitions, we find the 72"associative Figure 5 indicates a method of finding v + w + z. identity," The origin O (0,0) is an "additive = O+v=v+O=v, "z" and if we define -v2), -v = (-vi, then we also have "w" (-v) o+ w -v + v = 0. = Naturally we can also define z w - o w + = (-v), I exactly FIGURE 5 92), i, W2 = [v+w]+z=v+[w+z]. o + w + z + depends on the commutative law for numbers: i FIGURE ©2) as with numbers; equivalently, W ¯ U = (WI - 71. W2 ¯ U2). law" 4, Appendix1. Vectors77 with Justas numbers, our definition of w - v simply means that it satisfies "w v" that is parallel to w o but that starts Figure 6(a) shows v and an arrow at the endpoint of v. As we established with Figure 4, the vector from the origin to the endpoint of this arrow is just v + (w v) w (Figure 6(b)).In other words, that we can picture w v geometrically as the arrow that goes from v to w (except it must then be moved back to the origin). There is also a way of multiplying a number by a vector: For a number a and a vector o = (vi, v2), we define v - - = - - (a) "w (avt, av2) = a v -v" w (We sometimes simply write av instead of a v; of course, it is then especially important to remember that v denotes a vector, rather than a number.) The vector a v points in the same direction as v when a > 0 and in the opposite direction when a < 0 (Figure 7). You can easily check the following formulas: - - FIGURE 6 a - (b v) · (ab) = · v, 1-v=v, 0-v=0, -1-o=-v. 2v Notice that we have only defined a product of a number and a vector, we have two vectors to get another vector.* However, not defined a way of there are various ways of vectors to get numbers, which are explored in the following problems. 'multiplying' 'multiplying' PROBLEMS 1 2 -v - FIGURE 1. 7 Given a point v of the plane, let Re(v) be the result of rotating v around the origin through an angle of 8 (Figure 8). The aim of this problem is to obtain a formula for Ro, with minimal calculation. (a) Show that Ro (1,0) Re (0, 1) (b) Explain why we (cos0, sin 0), (- sin 9, cos 0). = = have Re(v+w) Re (a w) - Re(v) (c) Now show 8 Re(v)+ Re(w), a that for any point Re(x, y) FIGURE = = = (xcos0 - . Re(w). (x, y) we have y sin0, x sin0 + ycos0). *If you jump to Chapter 25, you'll find that there is an important way of defining a product, but this is something very special for the plane-it doesn't work for vectors in 3-space, for example, even though the other constructions do. 78 Foundations (d) Use this 2. result to give another solution to Problem 4-23. Given v and w, we define the number this is often called the being a rather 'dot product' or old-fashioned (a) Given v, find such vectors vimi + = v•w U2 2, 'scalar product' of v and w as opposed word for a number, w such that v . w a vector = ('scalar' to a vector). 0. Now describe the set of all w. (b) Show that v.w=w·v v.(w+z)=v•w+v·z and that a-(v•w)=(a-v).w=v.(a-w). Notice that the last of these equations involves three products: the dot product of two vectors; the product of a number and a vector; and the ordinary product of two numbers. (c) Show that v•v > 0, and that v•v 0 only when v 0. Hence we can define the norm ||v|| as - - - = = livll E, = will which be 0 only for v = 0. What is the geometric interpretation of the norm? that (d) Prove w Il5 Ilv+ Ilvll+ llell, and that equality holds if and only if v some number a > 0. (e) Show that = v•w 3. (a) Let ||v + w ||2 - ||v - = 0 or w = 0 or w = a - v for w ||2 4 Ro be rotation by an angle of 9 (Problem 1). Show that Re(v) • Re(w) = v •w. length 1 pointing along the first axis, and let w (cos0, sin 0); this is a vector of length 1 that makes an angle of 0 with the first axis (compareProblem 1). Calculate that (b) Let e = (1, 0) be the of vector = e.w = cos0. Conclude that in general v.w= where 0 ||v||· ||w||-cos0, is the angle between v and w. 4, Appendix1. Vectors79 4. L v+ w w Given two vectors v and w, we'd expect to have a simple formula, involving the coordinates vi, v2, wi, ©2, for the area of the parallelogram they span. Figure 9 indicates a strategy for finding such a formula: since the triangle with vertices w, A, v + w is congruent to the triangle OBv, we can reduce the problem to an easier one where one side of the parallelogram lies along the horizontal axis: line L passes through v and is parallel to w, so has slope w2/wlConclude that the point B has coordinate (a) The A UlW2 and that the parallelogram O FIGURE g Wl72 - therefore has area det(v, w) = v1m2 Elv2- - This formula, which defines the determinantdet, certainly seems to be simple enough, but it can't really be true that det(v, w) always gives the area. After all, we clearly have 9 det(w, v) so sometimes = det(v, w), - "deriva- det will be negative! Indeed, it is easy to see that our tion" made all sorts of assumptions (thatw2 was positive, that B had a positive coordinate, etc.) Nevertheless, it seems likely that det(v, w) is ± the area; the next problem gives an independent proof. 5. v points along the positive horizontal axis, show that det(v, w) is the area of the parallelogram spanned by v and w for w above the horizontal axis of the area for w below this axis. > 0), and the negative If Ro is rotation by an angle of 0 (Problem 1), show that (a) If (w2 (b) det(Rev, Row) det(v, w). = Conclude that det(v, w) is the area of the parallelogram spanned by and the v and w when the rotation from v to w is counterclockwise, negative of the area when it is clockwise. 6. Show that det(v, w + z) det(v + w, z) = det(v, w) + det(v, z) = det(v, z) + det(w, z) and that a w llw|| sin0 - 9 FIGURE 7. = det(a v, w) - det(v, a w). = - Using the method of Problem 3, show that det(v, w) v 10 det(v, w) Which = ||v|| ||wll sin0, - - is also obvious from the geometric interpretation (Figure 10). 80 Foundations APPENDIX 2. THE CONIC SECTIONS Although we will be concerned almost exclusively with figures in the plane, defined formally as the set of all pairs of real numbers, in this Appendix we want to consider three-dimensional space, which we can describe in terms of triples of real numbers, using a (0,0, 1) "three-dimensional coordinate system," consisting of three lines intersecting at right angles (Figure 1). Our hongontaland vedical axes now mutate to two axes in a horizontal plane, with the third axis perpendicular to both. One of the simplest subsets of this three-dimensional space is the (infinite) cone illustrated in Figure 2; this cone may be produced by rotating a line," of slope C say, around the third axis. straight 0, 1, 0) (1, 0, 0) "generating FIGURE l slope C (x, y, z) .(x,y,0) FIGURE 2 For any given first two coordinates plane has distance x and y, the point x2 + y2 from the origin, and thus (x,y, z) is in the (1) cone if and only if z = (x, y, 0) in the horizontal + y2. ±CÁx2 We can descend from these three-dimensional vistas to the more familiar twodimensional one by asking what happens when we intersect this cone with some plane P (Figure 3). C P FIGURE 3 4, Appendix2. The ConicSections 81 C C L If the plane is parallel to the horizontal plane, there's certainly no mystery-the intersection is just a circle. Otherwise, the plane P intersects the horizontal plane in a straight line. We can make things a lot simpler for ourselves if we rotate everything so that this intersection line points straight out from the plane of the paper, while the first axis is in the usual position that we are familiar with. The on," so that all we see (Figure 4) is its intersection L plane P is thus viewed with the plane of the first and third axes; from this view-point the cone itself simply appears as two straight lines. In the plane of the first and third axes, the line L can be described as the COÏÎCCtiOn of all points of the form "straight FIGURE 4 (x,Mx+B), where M is the slope of L. For an arbitrary point (x, y, z) it follows that P if and only if : (x, y, z) is in the plane (2) Combining (1)and (2),we see that = Mx + B. (x, y, z) is in the intersection of the cone and the plane if and only if Mx + B (*) L > , = ±C Now we have to choose coordinate axes in the plane P. We can choose L as the first axis, measuring distances from the intersection Q with the horizontal plane (Figure 5); for the second axis we just choose the line through Q parallel to our original second axis. If the first coordinate of a point in P with respect to these axes is x, then the first coordinate of this point with respect to the original axes can be written in the form ax + ax + FIGURE 5 ß Jx2+ y2. ß for some a and ß. On the other hand, if the second coordinate of the point with respect to these axes is y, then y is also the second coordinate with respect to the Original RXCS. Consequently, (*)says that the point lies on the intersection of the plane and the cone if and only if M(ax + B ß) + Although this looks fairly complicated, Œ2 2 = ±C J(ax + ß) after squaring + y2. we can write this as 2+a2(M2-A2)x2+Ex+F=0 «2 simplifies this for some E and F that we won't bother writing out. Dividing by to C2y2 + K2 - M2)x2 + Gx + H = 0. Now Problem 4-16 indicates that this is either a parabola, an ellipse, or an hyperbola. In fact, looking a little more closely at the solution (andinterchanging 82 Foundations the roles of x and y), we see that the values of G and H are irrelevant: (1)If M (2)If C2 = > (3)If C2 < ±C we obtain a parabola; M2 we obtain an ellipse; M2 we obtain an hyperbola. These analytic conditions are easy to interpret geometrically (Figure 6): plane is parallel to one a parabola; (2)If our plane slopes less than intersection omits one half of (3)If our plane slopes more than hyperbola. (1)If our FIGURE of the generating lines of the cone we obtain the generating line of the cone (sothat our the cone) we obtain an ellipse; the generating line of the cone we obtain an 6 "conic sections" are related to this description. In fact, the very names of these the same root The word parabolacomes from a Greek root meaning that appears in parable, not to mention paradigm, paradox, paragon, paragraph, paralegal, parallax, parallel, even parachute. Elhpse comes from a Greek root meaning or omission, as in ellipsis (anomission, or the dots that inmeaning beyond,' or from Greek dicate it). And hyperbolacomes root a and of words like hyperactive, hypersensitive, hypervenexcess. With the currency tilate, not to mention hype, one can probably say, without risk of hyperbole, that this root is familiar to almost everyone.* 'alongside,' 'defect,' . . . 'throwing ¯ ¯ PROBLEMS 1. FIGURE 7 Consider a cylinder with a generator perpendicular to the horizontal plane (Figure 7); the only requirement for a point (x, y, z) to lie on this cylinder is *Although the correspondence between these roots and the geornetric picture correspond so beautifully, for the sake of dull accuracy it has to be reported that the Greeks originally applied the words to describe features of certain equations involving the conic sections. 4, Appendix2. The ConicSections 83 that (x, y) lies on a circle: x2 4 2 C2 = Show that the intersection of a plane with this cylinder can be described by an equation of the form (ax + ß)2 P _ 2 What possibilities are there? 2. F2 In Figure 8, the sphere Si has the same diameter as the cylinder, so that its equator C1 lies along the cylinder; it is also tangent to the plane P at F1. Similarly,the equator C2 of S2 lies along the cylinder, and S2 is tangent to P z be any point on the intersection of P and the cylinder. Explain why the length of the line from to Flis equal to the length of the vertical z line L from z to Ct. (b) By proving a similar fact for the length of the line from z to F2, show that the distance from z to F1 plus the distance from z to F2 is a constant, so that the intersection is an ellipse, with foci Fi and F2- (a) Let C2 FIGURE 2 8 3. Similarly, use Figure 9 to prove geometrically that the intersection of a plane and a cone is an ellipse. FIGURE 9 84 Foundations APPENDIX 3. length r POLAR COORDINATES In this chapter we've been acting all along as if there's only one way to label points in the plane with pairs of numbers. Actually, there are many different system." The usual coordinates ways, each giving rise to a difFerent of a point are called its cartesian coordinates, after the French mathematician who first introduced the idea of and philosopher René Descartes (1596-1650), coordinate systems. In many situations it is more convenient to introduce polar which are illustrated in Figure 1. To the point P we assign the polar coordinates, coordinates (r, 0), where r is the distance from the origin O to P, and 0 is the angle between the horizontal axis and the line from O to P. This angle can be measured either in degrees or in radians (Chapter 15), but in either case 0 is not determined unambiguously. For example, with degree measurement, points on the right side of the horizontal axis could have either 8 0 or 9 360; moreover, 8 is completely ambiguous at the origin O. So it is necessary to exclude some ray through the origin if we want to assign a unique pair (r, Ð) to each point under "coordinate e FIGURE i = = consideration. On the other hand, there is no problem associating a unique point to any pair (r, 0). In fact, we can even associate a point to (r, 0) when r < 0, according to the scheme indicated in Figure 2. Thus, it always makes sense to talk about point with polar coordinates (r, 9)," even though there is some ambiguity when polar coordinates" of a given point. we talk about "the "the P is the point with polar coordinates (r, 01) and also the point with polar coordinates FIGURE 2 It is clear from Figure 1 (andFigure 2) that the point with polar coordinates (r, 0) has cartesian coordinates (x, y) given by x = r cos0, y = r sin0. 4, Appendix3. Polar Coordinates85 coordinates Conversely, if a point has cartesian (r, 0) satisfy (x, y), ordinates r tan0 ±Áx2 = 2 = x a then (anyof) its polar co- 2 if x ¢ 0. f is a function. Then by the graph of f in polar cothe collection of all points P with polar coordinates (r, 0) we mean = r f (6). In other words, the graph of f in polar coordinates is the collection of all points with polar coordinates (f (0), 8). No special significance should be attached to the fact that we are considering pairs ( (0),0), with f (0) f ÍÌTSt, as opposed to pairs (x,f(x)) in the usual graph of f; it is purely a matter of convention that r is considered the first polar coordinate and 0 is considered the Now suppose that ordinates satisfying FIGURE 3 second. The graph of f in polar coordinates is often described as "the graph of the equation r = f (0)." For example, suppose that f is a constant function, f (0) = a for all 9. The graph of the equation r = a is simply a circle with center O and radius a (Figure 3). This example illustrates, in a rather blatant way, that polar coordinates are likely to make things simpler in situations that involve symmetry with respect to the origin O. The graph of the equation r = Ð is shown in Figure 4. The solid line corresponds to all values of 0 > 0, while the dashed line corresponds to values of 0 5 0. F I GURE Spiral of Archimedes 4 As another example involving both positive and negative r, consider the graph of the equation r cos0. Figure 5(a)shows the part that corresponds to 0 5 0 5 90 [with0 in degrees]. Figure 5(b) shows the part corresponding to 90 5 0 5 180; here r < 0. You can check that no new points are added for 9 > 180 or 0 < 0. It is easy to describe this same graph in terms of the cartesian coordinates of its points. Since the polar coordinates of any point on the graph satisfy = FIGURE s r = cos0, 86 Foundations and hence r2 = r cosÛ, its cartesian coordinates satisfy the equation 2 x2 which describes a circle (Problem 3-16). [Conversely, it is clear that if the cartesian coordinates of a point satisfy x2 + y2 x, then it lies on the graph of the equation cos0.] r Although we've now gotten a circle in two different ways, we might well be hesitant about trying to find the equation of an ellipse in polar coordinates. But it turns out that we can get a very nice equation if we choose one of the focias the origin. Figure 6 shows an ellipse with one focus at 0, with the sum of the distances of all points from O and the other focus f being 2a. We've chosen f to the left of O, with coordinates written as = = (-2ea, 0). (We have 0 _< e < 1, since we must have 2a > distance from f to 0). (x, y) 2a f FIGURE = r - r O (-2ea, 0) 6 The distance r from (x, y) to O is given by r2 (1) By assumption, the distance from (2a - r)2 = x2 + y2 to f is 2a (x, y) - r, (x [-2ea])2+ = - hence y2, or (2) Subtracting 4a2 - 4ar + r2 (1)from (2),and = x2 + 4eax + 4e2a2 + y2 dividing by 4a, we get a-r=ex+e2a, 4, Appendir3. Polar Coordinates87 or r=a-ex-eia (1 = e2)a - ex, - which we can write as (3) r = A - for A ex, (1 = e2¾ - Substituting r cos 0 for x, we have A = r - r(1+ecos9) er cos0, A, = and thus A (4) r = . 1+ ecos0 In Chapter 4 we found that 2 2 (5) + = 1 is the equation in cartesian coordinates for an ellipse with 2a as the sum of the distances to the foci, but with the foci at (-c, 0) and (c,0), where Ja2-c2, b= Since the distance between the foci is 2c, this corresponds to the ellipse (4)when (With equation we take c ea or 8 = CD (3)determining A). Conversely, given the ellipse described by (4),for the corresponding equation (5)the value of a is determined by (3), = A a= and again using c b= = 1-e2' sa, we get a2-C2= a2_g2a2=a Î-€2- Thus, we can obtain a and b, the lengths of the major and minor axes, immediately from e and A. The number e c Ja2 a a - - b2 1- - b 2 , a "shape" of the ellipse the eccentncip of the ellipse, determines the (theratio of the major and minor axes), while the number A determines its as shown by (4). "size," 88 Foundations PROBLEMS 1. If two points have polar coordinates (ri, Bi) and (72,4), show that the distance d between them is given by d2 rt2 = + r2 2rir2 Cos(Û] Û2)- - What does this say geometrically? 2. Describe the general features of the graph of f in polar coordinates if is even. f is odd. (i) f (ii) (iii) f (9) f (0 + = 3. 180) [when0 is measured in degrees]. Sketch the graphs of the following equations. r = a sin0. r = a sec0. (iii) r = (i) (ii) (iv) (v) (vi) r = r = r = Hint: It is a very simple graph! cos 20. Good luck on this onel cos 30. | cos 20|. | cos 30 . 4. Find equations for the cartesian and (iii) in Problem 3. coordinates of points on the 5. Consider a hyperbola, where the difference of the distance between the two foci is the constant 2a, and choose one focus at O and the other at (-2ea, 0). (In this case, we must have e > 1). Show that we obtain the exact same equation in polar coordinates graphs (i),(ii) A r as we obtained 6. = l+scos0 for an ellipse. Consider the set of points (x, y) such that the distance (x, y) to O is equal to the distance from (x, y) to the line y a (Figure 7). Show that the distance conclude and that the equation can be written to the line is a r cos 9, = - (x,y) a=r(l+cos0). y=a Notice that this equation for a parabola is again of the same from as 0 7. Now, for any A and e, consider the graph in polar coordinates of the equation (4),which implies (3).Show that the points satisfying this equation satisfy (1 e2)x2 + - FIGURE7 (4). y2 = A2 - 2Aex. Using Problem 4-16, show that this is an ellipse for e 1, and a hyperbola for e > 1. e = < 1, a parabola for 4, Appendix3. Polar Coordinates89 8. graph of the cardioid r 1 sin 0. sin0. also of the graph that Show it is r (b) that the equation it can be described by (c) Show (a) Sketch the = - -1 = 12 and conclude that 2 2 2_L it can be described by the equation 2 (x2 + y2 + y)2 9. 2 Sketch the graphs of the following equations. (i) P - r 1 = - ( sin 0. (ii) r 1-2sin0. (iii) r 2 + cos 0. (a) Sketch the graph = = 10. of the lemniscate r2 (-a, (a, 0) 0) (b) Find an equation it is (c) Show that a2. did2 FIGURE 8 = 2a2 cos 20. for its cartesian coordinates. the collection of all points P in Figure 8 satisfying = the shape of the curves formed by the set of all P b, when b > a2 and when b < a2. Make a guess about (d) SatiSfying did2 = CHAPTER LIMITS The concept of a limit is surely the most important, and probably the most difficult one in all of calculus. The goal of this chapter is the definition of limits, but we are, once more, going to begin with a provisional definition; what we shall define is not the word but the notion of a function approaching a limit. "limit" PROVISIONAL DEFINITION The function f approaches the limit l near a, if we can make like to l by requiring that x be sufficiently close to, but unequal f(x) as close as we to, a. Of the six functions graphed in Figure 1, only the first three approach I at a. Notice that although g(a) is not defined, and h(a) is defined wrong way," it is still true that g and h approach I near a. This is because we explicitly ruled the value of the function out, in our definition, the necessity of ever considering at a-it is only necessary that f(x) should be close to l for x close to a, but unequal to a. We are simply not interested in the value of f(a), or even in the question of whether f (a) is defined. "the l- t- i- / FIGURE i 1 One convenient way of picturing the assertion that f approaches l near a is provided by a method of drawing functions that was not mentioned in Chapter 4. In this method, we draw two straight lines, each representing R, and arrows from a point x in one, to f (x) in the other. Figure 2 illustrates such a picture for two different functions. 90 5. Limits 91 Now consider a function f whose drawing looks like Figure 3. Suppose we ask that f(x) be close to I, say within the open interval B which has been drawn in Figure 3. This can be guaranteed if we consider only the numbers x in the interval A of Figure 3. (In this diagram we have chosen the largest interval which (a) f(x) -2 = will work; any smaller c -1 1 -2 -1 2 0 (b) f(x) FIGURE 2 2 - x. interval containing a could have been chosen instead.) If we choose a smaller interval B' (Figure 4) we will, usually, have to choose a smaller A', but no matter how small we choose the open interval B, there is always supposed to be some open interval A which works. A similar pictorial interpretation is possible in terms of the graph of f, but in this case the interval B must be drawn on the vertical axis, and the set A on the horizontal axis. The fact that f (x) is in B when x is in A means that the part of the graph lying over A is contained in the region which is bounded by the horizontal lines through the end points of B; compare Figure 5(a), where a valid interval A has been chosen, with Figure 5(b),where A is too large. In order to apply our definition to a particular function, let us consider f(x) x sin 1/x (Figure 6). Despite the erratic behavior of this function near 0 it is clear, at least intuitively, that f approaches 0 near 0, and it is certainly to be hoped that our definition will allow us to reach the same conclusion. In the case we are considering, both a and I of the defmition are 0, so we must ask if we can get sin f (x) x 1/x as close to O as desired if we require that x be sufficiently close of0. This to 0, but ¢ 0. To be specific, suppose we wish to get x sin 1/x within = = FIGURE 3 means we want - or, more succinctly, |x sin 1/x| I sin FIGURE -- 1 10 < 1 - x . < 1 . < x sm - x < --, 1 10 Now this is easy. Since 1, for all x ¢ 0, 4 (a) FIGURE 5 (b) 92 Fdundations we , have x sin / r . f(x)=xsm- ,' ', 1 2 1 x for all x ¢ 0. |x|, 5 - ( and x ¢ 0, then |x sin 1/x| < This means that if |x| < in other words, of 0 provided that x is within of 0, but ¢ 0. There is x sin 1/x is within nothing special about the number ; it isjust as easy to guarantee that |f (x)-0| < require that |x| < , but x ¢ 0. In fact, if we take any positive number e we can make |f (x) 0| < e simply by requiring that |x! < e, and x ¢ 0. x2 sin 1/x (Figure 7) it For the function f(x) seems even clearer that f approaches 0 near 0. If, for example, we want i þ; -simply ' s ' ' - FIGURE 6 = x2 1 . sm then we certainly need only require and consequently that |x2| < < - -, x that |x| 1 10 and x < ¢ 0, since this implies g x2sin- 1 1 100 <\x2|<-<-. ¯¯ x 1 10 / f(x) / FIGURE x" sin = - 7 ' , f( ) = v' = (We could do even better, and allow |x| < 1/K and x ¢ 0, but there is no particular virtue in being as economical as possible.) In general, if e > 0, to ensure that 1 x2 sin FIGURE 8 We need only require - < e, that |x| < e and x ¢ 0, 5. Limits 93 f(x) = sin - A FIGURE + 9 provided that e 5 1. If we are given an s which is greater than 1 (itmight be, even thought it is "small" s's which are of interest), then it does not sufRce to require that |x| < e, but it certainly sufEces to require that lx| < 1 and x ¢ 0. sin 1/x (Figure 8). In As a third example, consider the function f (x) 1 = f(x) = sin order to make |Ssin 1/x| < s we can require e2 |x| < (thealgebra and that x M ¢0 is left to you). Finally, let us consider the function f (x) sin 1/x (Figure 9). For this function it is falsethat f approaches O near 0. This amounts to saying that it is not true for every number e > 0 that we can get |f (x) 0 < e by choosing x sufficiently small, and ¢ 0. To show this we simply have to find onee > 0 for which the condition < 0| cannot be guaranteed, no matter how small we require e |f (x) will do: it is impossible to fx| to be. In fact, e ensure that |f(x)| < no matter how small we require [x| to be; for if A is any interval containing 0, there is some number x = 1/(90+360n) which is in this interval, and for this x we have f (x) = 1. This same argument can be used (Figure 10) to show that f does not approach any number near 0. To show this we must again find, for any particular number l, some number e > 0 so that |f (x) li < e is not true, no matter how small x is required to be. The choice e = works for any number l; that is, no niatter how small we require x| to be, we cannot ensure that | (x) l| < The reason f is, that for any interval A containing 0 there is some xi 1/(90 + 360n) in this interval, so that f (xi) 1, = - - FIGURE 10 = { ¼ -- ( - = = and also some x2 = 1/(270 + 360m) in this interval, so that -1. f (x2) = J. 94 Foundations -1 ( ) But the interval from l cannot contain both to I + length is only 1; so we cannot have - |1 • * . I . x, x rational _ ¯ o, x irrational . . - l| < ( and also l| - < 1, since its total ¼, what l is. no matter The phenomenon exhibited by If we consider the function f(x) • . f (x) , , sin = 1/x near 0 can occur in many ways. 0, x irrational 1, x rational, = then, no matter what a is, f does not approach any number i near a. In fact, we cannot make f (x) l| < no matter how close we bring x to a, because in any interval around a there are numbers x with f(x) 0, and also numbers x with need would and also that |0 l| < |1 l| < we f (x) 1, so An amusing variation on this behavior is presented by the function shown in i - FIGURE |-1 and Il = = ¼ - ¼. - Figure 11: f (x) = x, x rational 0, x irrational. sin 1/x; it apThe behavior of this function is "opposite" to that of g(x) proaches 0 at 0, but does not approach any number at a, if a ¢ 0. By now you should have no difficulty convincing yourself that this is true. As a contrast to the functions considered so far, which have been quite pathological, we will now examine some of the simplest functions. If f (x) c, then f approaches c near a, for every number a. In fact, to ensure that |f (x) c| < e one does not need to restrict x to be near a at all; the condition is automatically satisfied (Figure 12). As a slight variation, let f be the function shown in Figure 13: = ----------'-i-! ----f(x) - c-----¯^¯¯¯¯ -¯¯¯¯¯¯ -¯ ~^¯ -T-; = FIGURE 12 - -1, f (x) = 1, x x < > 0 0. If a > 0, then f approaches 1 near a: indeed, to ensure that certainly suffices to require that |x al < a, since this implies |f (x) - 1| < e it - 1 i f(x) 1, x = > o -a<x-a 0<x or b a -1, /W - «< oi -i -1 1. Similarly, if b < 0, then f approaches near b: to ensure Finally, as you may < that |f(x) (-1)| < e it suffices to require that x easily check, f does not approach any number near 0. The function f (x) = x is easily dealt with. Clearly f approaches a near a: to so that f (x) = -b| - -a| FIGURE 13 -b. -a| < e. s we just have to require that |x requires x2 work. show that The function f (x) little To more a near a, we must decide how to ensure that ensure that |f(x) < = a2 X2 __ 2 f approaches 5. Limits 95 Factoring looks like the most promising pmcedure: [x al - - |x+al < we want e. Obviously the factor |x + al is the one that will cause trouble. On the other hand, there is no need to make |x + a l particularly small; as long as we know some bound on the values of |x + a| we will be in good shape. For example, if x + a| < 1,000,000, then we will just need to require that |x a| < e/1,000,000. Therefore, to begin with, let us require that [x < 1 (anypositive number other than 1 would do just as well); presumably this will ensure that x is not too large, and consequently that lx + a l is not too large. As a matter of fact, Problem 1-12 - -al shows that jx -Jals|x-al<1, SO lx[ < l + a|, and consequently +1. |x+al5lx[+|a|<2|a -al Now we need only the additional requirement that |x < s/(2|a|+ 1). In other words, if (x a| -- < min 1, 2|a + 1 then , lx2 - a2| < s. + 1)) will just be e/(2|al + 1) for small e. Precisely the same sort of trick will show that if f (x) x3, then a3 near a. In fact, Naturally, min(1, e/(2|a = if |x - a| < min 1, ' (1 + |a )2 + ja (1 + |a|) + |a|2 , then f |x3 - approaches a3| < e. The proof of this assertion will show where the weird denominator comes from: < 1, then |x| < |a|+ 1, and consequently If x --al X2+ax‡a2 2‡ < aj·|x|+|a2 (1 + |a|)2 + ja (1 + |a|) + a|2. Therefore |x3-a3|= |x--al |x2+ax+a2 - < - (1 + \aj)2+ \a\(1+ laj)+ \a\2 [(1+ |a|)2 + |a|(1 + la|) + [a|2] The time has now come to point out that of the many demonstrations about limits which we have given, not one has been a real proof. The fault lies not with our reasoning, but with our definition. If our provisional definition of a function was open to criticism, our provisional definition of approaching a limit This definition is simply not sufficiently precise to be is even more vulnerable. used in proofs. It is hardly clear how one f (x) close to I (whatever means) by close "suffix to be sufficiently close to a (however "makes" "close" "requiring" 96 Foundations ciently" is supposed to be). Despite the criticisms of our definition you may feel hope you do) that our arguments were nevertheless quite convincorder ing. In to present any sort of argument at all, we have been practically forced real definition. It is possible to arrive at this definition in several steps, to invent the each one clarifying some obscure phrase which still remains. Let us begin, once again, with the provisional definition: close (I certainly The function f approaches the limit l near a, if we can make f(x) as close as we like to I by requiring that x be sufficiently close to, but unequal to, a. The very first change which we made in this defmition was to note that making f (x) close to l meant making |f (x) l| small, and similarly for x and a: - The function f approaches the limit I near a, if we can make |f (x) l| as small as we like by requiring that |x a| be sufficiently small, and x ¢ a. - - "as small as The second, more crucial, change was to note that making [f (x) l| we like" means making |f (x) l| < e for any e > 0 that happens to be given us: - - The function can make ¢ x approaches f If (x) - l| the limit l near a, if for every number e > 0 we a| be sufficiently small, and e by requiring that x < - a. There is a common pattern to all the demonstrations about limits which we have given. For each number e > 0 we found some other positive number, 8 say, with the property that if x ¢ a and |x a| < ô, then f(x) l| < e. For the function 0), the number 8 was just the number e; 0, I f (x) x sin 1/x (witha x2 it l/x, it was e2; for f(x) for f(x) was the minimum of 1 and e/(2|a l + 1). In general, it may not be at all clear how to find the number ô, < 8 which expresses how small given E, but it is the condition |x small must be: - = - = = - Ñsin = -a| "sufficiently" The function f approaches the limit l near a, if for every e & > 0 such that, for all x, if |x a| < 8 and x ¢ a, then |f - 0 there is some (x) l| < e. > - This is practically the defmition we will adopt. We will make only one trivial change, noting that a| < ô and x ¢ a" can just as well be expressed "O < "|x - |x-a|<ô." DEFINITION The function f approaches the lirnit I near a means: for every e > 0 there such that, all > 8 0 for is some x, if 0 < |x a| < 8, then |f(x) l| < e. - - 5. Limits 97 This definition is so important (everything we do from now on depends on it) that proceeding any further without knowing it is hopeless. If necessary memorize it, like a poem! That, at least, is better than stating it incorrectly; if you do this you are doomed to give incorrect proofs. A good exercise in giving correct proofs is to review every fact already demonstrated about functions approaching limits, giving real proofs of each. This requires writing down the correct definition of what you the algebraic work has been done already. are proving, but not much morwall When proving that f does not approach l at a, be sure to negate the defmition correctly: If it is not true that for every e then |f (x) 0 there is some 8 > l - < 0 such that, for all x, if 0 > < [x al - < 8, e, then 0 such that for every & > 0 there is some x which satisfies 8 but not |f (x) l| < e. there is some e 0 |x < - a| < > -- Thus, to show that the function f (x) sin 1/x does not approach 0 near 0, we and note that for every ô > 0 there is some x with 0 < |x 0 < 8 consider e = but not Isin 1/x 0 | < j-namely, an x of the form 1/(90 + 360n), where n is that < 8. 1/(90 360n) + so large As an illustration of the use of the definition of a function approaching a limit, we have reserved the function shown in Figure 14, a standard example, but one = ( -- - of the most complicated: f(x) 0, = x 1/q, x irrational, O < x < 1 p/q in lowest terms, 0 = < x 1. < (Recall that p/q is in lowest terms if p and q are integers with no common and q > 0.) 1 e - x irrational f(x) I- e 1- - e e 10, -, 1 x = p -- . m lowest terms e I-eeee Ip-eees e ee II FIGURE 14 I | I i I il e i factor 98 Foundations 0 at a. To prove a, with 0 < a < 1, the function f approaches natural number number this, consider any so large that 1/n e > 0. Let n be a e. Notice that the only numbers x for which |f (x) 0| < s could be false are: For any number _< - -; 1 12 13 -; -, 1234 -; -, -, -, -; 1 ...; -, n-1 -,..., n (If a is rational, then a might be one of these numbers.) However many of these numbers there may be, there are, at any rate, only finitely many. Therefore, of all these numbers, one is closest to a; that is, |p/q a| is smallest for one p/q among these numbers. (If a happens to be one of these numbers, then consider only the values |p/q al for p/q ¢ a.) This closest distance may be chosen as the 8. For if 0 < |x a| < 8, then x is not one of - - - n-1 1 -,.... n This completes the proof. Note that our description of the 8 which works for a given e is completely adequate-there is no why of give 8 in formula for terms we must reason a s. Armed with our definition, we are now prepared to prove our first theorem; you have probably assumed the result all along, which is a very reasonable thing to do. This theorem is really a test case for our definition: if the theorem could not be proved, our definition would be useless. and therefore THEOREM l e is true. < - A function cannot approach two different limits near a. In other words, if approaches PROOF |f (x) 0| i near a, and approaches f m near a, then I = f m. Since this is our first theorem about limits it will certainly be necessary to translate the hypotheses according to the definition. Since f approaches l near a, we know that for any e > 0 there is some number 1 > 0 such that, for all x, if 0 < |x - a We also know, since f approaches for all x, l< 81, then |f (x) - l| m near a, that there < e. is some â2 > 0 such that, if0<|x-al<ô2,then|f(x)--m|<e. We have had to use two numbers, 81 and 82, since there is no guarantee that the 8 which works in one definition will work in the other. But, in fact, it is now easy to conclude that for any e > 0 there is some ô > 0 such that, for all x, if0<|x-a|<8,then|f(x)-l|<eand we simply choose 8 = min(8i, 82). |f(x)-m|<e; 5. Limits 99 To complete the proof we just have to pick a particular e > 0 for which the two conditions and |f(x)-lj<s if(x)-ml<s cannot both hold, if i ¢ m. The proper choice is suggested by Figure 15. If l ¢ m, so that |l m) > 0, we can choose |l ml/2 as our e. It follows that there is a 8 > 0 such that, for all x, -- -- if 0 length FIGURE 15 t-m lx < al - 8, then < and - length This implies that for 0 |l - ml < |x a| - < If (x) li < m| < - |f (x) - . 2 8 we have _< |l f (x) + f (x) = - - m| ll f (x)| + |f (x) - II < mi - 2 + il - m! mi -- 2 =|l--m, a contradiction. | The number I which approaches near a is denoted by lim f (x) (read: the limit of f(x) as x approaches a). This definition is possible only because of Theorem 1, which ensures that lim f(x) never has to stand for two different numbers. The f equation lim f (x) has exactly the same meaning as the f = l phrase approaches l near a. The possibility still remains that f does not approach l near a, for any l, so that lim f (x) l is false for every number l. This is usually expressed by saying that = "lim X->G f (x) does not exist." Notice that our new notation introduces an extra, utterly irrelevant letter x, which could be replaced by t, y, or any other letter which does not alæady appear-the symbols lim x-va f(x), lim t a f(t), lim y-wa f(y), all denote precisely the same number, which depends on f and a, and has nothing to do with x, t, or y (theseletters, in fact, do not denote anything at all). A more logical symbol would be something like lim f, but this notation, despite its brevity, is so infuriatingly rigid that almost no one has seriously tried to use it. The notation Iim f (x) is much more useful because a function f often has no simple name, even 100 Foundations though it might be possible to express the short symbol simple f (x) by a formula involving x. Thus, lim (x2 -I- sin x) could be paraphrased only by the awkward expression lim f, where f(x) Another advantage of the standard =x2+sinx. symbolism is illustrated by the expressions limx+t3, lim x + t3 The first means the number which f (x) the second means the number = f t3, x+ which f (t) = approaches f near a when for all x; approaches near a when x + t*, for all t. You should have little difficulty (especially if you consult Theorem 2) proving that limx+t3=a+t3, x->a limx+t3=x+a3 These examples illustrate the main advantage of our notation, which is its flexibility. In fact, the notation lim f (x) is so flexible that there is some danger of forgetting what it really means. Here is a simple exercise in the use of this notation, which will be important later: first interpret precisely, and then prove the equality of the expressions lim x-va f (x) and lim f (a + h). h->O An important part of this chapter is the proof of a theorem which will make it easy to find many limits. The proof depends upon certain properties ofinequalities and absolute values, hardly surprising when one considers the definition of limit. Although these facts have already been stated in Problems 1-20, 1-21, and 1-22, because of their importance they will be presented once again, in the form of a lemma (alemma is an auxiliary theorem, a result that justifiesits existence only by virtue of its prominent role in the proof of another theorem). The lemma says, roughly, that if x is close to xo, and y is close to yo, then x + y will be close to xo + yo, and xy will be close to xoyo, and 1/y will be close to 1/yo. This intuitive than the precise estimates of the lemma, statement is much easier to remember of Theorem 2 first, in order to see just read and it is not unreasonable the proof to estimates used. how these are 5. Limits 101 LEMMA (Î)If jx xo| - and < |y yo| -- < , then (x + y) yo) | (xo+ - < e. (2)If |x - xo| < S ( min 1, then 2(lyol + 1) |xy (3)If yo ¢ 0 xoyol - < yo| - . < mm ¢ 0 and -, (1) |(x + y) 10)| (10 - - € < 2(ixol + 1)' e. 1 1 y yo e|yo|2 2 2 xo| < , <e. I(X -X0) = f (2)Since |x yo| - |yo --- PROOF Iy and ly then y and + (7 70) - e e lx--xcl+Iy-yol<-+-=e. 2 2 l we have x| xo| 5 - |x xo| - 1, < so that lxl < 1+ |xo|. Thus xoyal lxy - |x(y yo) + yo(x xo) | 5|xl ly-yol+lyol·lx-xol = - - 8 < 8 + (1+ |xoD 2(|xo + 1) lyol 2( yo| + 1) - e e 2 2 <-+-=e. - (3)We have yo| so |y| > |yo||2. In particular, y |y| 5 |y - ¢ yo| - < 10 --, 1 2 lyI lyol Thus 1 1 y yo |yo y |y|· lyol - 2 0, and 2 1 lyol lyo| s|yo[2 2 102 Foundations THEOREM 2 If lim f (x) = l and lim g (x) m, then = (1) lim( f + (2) lim( f · g)(x) l + m; = g)(x) l m. = - Moreover, if m ¢ 0, then lim (3) x-va PROOF 1 (1 (x) - g The hypothesis means that for every e = -. m 0 there are Si, 82 > 0 such that, for > all x, if0<|x-a|<ô1,then|f(x)-l|<e, -al and if 0 -m| lx < ô2, then < |g(x) < after all, e/2 is also a positive number) This means (since, such that, for all x, and Now let ô 0 < |x a| = < - if 0 < |x if 0 < |x - - are true. < 81, then |f (x) l| a| < 82, then |g(x) 62). If 0 < |x 82 are both true, so both & |f(x)-l|<But by part (1)of the - and 2 a| < and Again let & = < |x if 0 < \x min(81, - - < ô, then 0 < a| - 81 and < 2 - a\ < ô2, then \g(x) m\ - l| - |x < -- part min < < (2)of 1, 2(|m[ + 1) and 2(|m|+ 1) , 2(\l\ + 1) € |g(x)-m|< · . if0<|x-a|<ô,then|g(x)-m|<min --, . 2(|l + 1) (2). |m| s |m| . 2 2 < s. the lemma. If a l < 8, then S - |x 8 l m| < e, and this proves So, by the lemma, |(f g)(x) Finally, if e > 0 there is a 8 > 0 such that, for all x, - . lemma this implies that |(f + g) (x) (l + m)| ô1, then |f (x) ( - |g(x)-m|<- < 1, |f(x)-l|<min , m| al ô2). If 0 < - This proves (1). To prove (2)we proceed similarly, after consulting e > 0 there are Bi, ô2 > 0 such that, for all x, if 0 that there are Bi, 82 > 0 a| min(ål, e. 5. Limits 103 But according to part (3)of the lemma this means, first, that g(x) ¢ 0, so (1/g)(x) makes sense, and second that (1 (x)-- - g This proves 1 <e. m (3).| Using Theorem 2 we can prove, trivially, such facts as a3+7aß x3+7x6 . hm x2 x-a + 1 without begin = a2 , + 1 going through the laborious process of finding a ô, given an e. We must with lim 7 = 7, lim 1 = 1, lim x = x->a X-wa a, but these are easy to prove directly. If we want to find the 8, however, the proof of Theorem 2 amounts to a prescription for doing this. Suppose, to take a simpler example, that we want to find a o such that, for all x, if 0 < |x - a| 8, then < lx2+ x (a2 + - a)| < e. Consulting the proof of Theorem 2(1), we see that we must first find such that, for all x, and if 0 < |x if 0 < |x - - |x2 a < ôi, then a < 02, then x Since we have already given proofs that lim x2 to do this: - -- a2| a| < < 8 1 min = 2 ( 1, 2 2|a l + 1 Thus we can take & = min(Bi, ô2) = min min l' 2|a + 1 and 82 > 0 2 . a2 and lim x ·- i ' = a, we know how 104 Foundations If a ¢ 0, the same method if 0 < can be used to find a 8 |x a| - 1 8, then < - 0 such that, for all x, > 1 < - x e. a The proof of Theorem 2(3) shows that the second condition a 8 > 0 such that, for all x, will follow if we find ,ela|4 if0<|x-al<8,then|x2-a |<min Thus we can take . mm ô min = 1, --, |a|2 e|a|4 -- 2 2 2|a| + 1 Naturally, these complicated expressions for 8 can be simplified considerably, after they have been derived. One technical detail in the proof of Theorem 2 deserves some discussion. In order for lim f (x) to be defined it is, as we know, not necessary for f to be defined at a, nor is it necessary for f to be defined at all points x ¢ a. However, there must be some ô > 0 such that f (x) is defmed for x satisfying 0 < |x a l < 8; otherwise the clause - (a) "if would 0 < |x - a| 8, then < |f (x) l| - e" < make no sense at all, since the symbol f (x) would make no sense for some x's. If f and g are two functions for which the definition makes sense, it is easy to see that the same is true for f + g and f g. But this is not so clear for 1/g, since 1/g is undefined for x with g(x) = 0. However, this fact gets established in the proof of Theorem 2(3). i - (b) FIGURE l6 There are times when we would like to speak of the limit which f approaches at a, even though there is no 8 > 0 such that f(x) is defined for x satisfying 0 < |x a| < 8. For example, we want to distinguish the behavior of the two functions shown in Figure 16, even though they are not defined for numbers less than a. For the function of Figure 16(a) we write - lim f (x) l = x->a+ or lim f (x) l. = x‡a (The symbols on the left are read: the limit of f (x) as x approaches a from above.) These from above" are obviously closely related to ordinary limits, and the definition is very similar: lim f (x) l means that for every e > 0 there is a 8 > 0 "limits = I- such that, for all x, if 0 < x - a < ô, then |f (x) - l| < e. "0 < x (The condition a < 8" is equivalent to "O < |x a| < ô and x "Limits from below" (Figure 17) are defined similarly: lim f(x) X->G - FIGURE 17 -- > = a.") l (or 5. Limits 105 lim f (x) x†a = l) means that for every e > 0 there is a ô > 0 such that, for all x, if 0 < - a x < 8, then |f(x) l -- < e. It is quite possible to consider limits from above and below even if f is defined for numbers both greater and less than a. Thus, for the function f of Figure 13, we have -1. lim f (x) 1 and = lim f (x) = It is an easy exercise (Problem 29) to show that lim f (x) exists if and only if lim f (x) and lim f (x) both exist and are equal. x->a+ x->a- Like the definitions of limits from above and below, which have been smuggled into the text informally, there are other modifications of the limit concept which will be found useful. In Chapter 4 it was claimed that if x is large, then sin 1/x is close to 0. This assertion is usually written lim sin 1/x 0. = x-voo FIGURE 18 "the limit of f (x) as x approaches oo," or "as The symbol lim f (x) is read x becomes infinite," and a limit of the form lim f (x) is often called a limit at X-¥OO infinity. Figure 18 illustrates a general situation where lim f (x) l. Formally, = lim X->00 f (x) = l means that for every e if x > > 0 there is a number N, then |f(x) - l| < N such that, for all x, e. The analogy with the definition of ordinary limits should be clear: whereas the condition "O < |x al < 8" expresses the fact that x is close to a, the condition > N" expresses the fact that x is large. We have spent so little time on limits from above and below, and at infinity, because the general philosophy behind the definitionsshould be clear if you understand the definition of ordinary limits (which are by far the most important). Many exercises on these definitionsare provided in the Problems, which also contain several other types of limits which are occasionally useful. - "x 106 Foundations PROBLEMS 1. Find the following limits. (These limits all follow, after some algebraic mafrom the various parts of Theorem 2; be sure you know which used in each case, but don't bother listing them.) ones are nipulations, x2_) lim (i) x->l (ii) lim . x+1 xa-8 x->2 lim (iii) xs3 . (iv) lun (v) lim . x x" 2 . - y" - . x" y" - . y - x Ja+h- . h Find the following limits. Inn (i) x->1 (ii) lim 240 lim (iii) x->0 3. 2 - y->x hm (vi) h->o 2. . x x3_g . 1 x - l-x2 1- . x 1-J1-x2 . X In each of the following cases, find a satisfying 0 < |x a| < ô. such that |f (x) - I < e for all x - (i) f (x) = (ii) f (x) = (iii) f (x) = (iv) f (x) = x4. ; -; g4 _ 1 a x x 1, l = 1 4 _ . 1. = _ y y _ g x x ;a 1 + sin2 = 0, l = 0. (v) f(x)=S;a=0,l=0. (vi) f (x) S; a 1, l 1. = 4. *5. = = For each of the fimetions in Problem 4-17, decide for which numbers limit lim f (x) exists. a the the same for each of the functions in Problem 4-19. Same problem, if we use infinite decimals ending in a string of O's (b) instead of those ending in a string of 9's. (a) Do 5. Limits 107 6. Suppose the functions f and g have the following property: for all e 0 > and all x, if 0 < if 0 < For each e < sin [x 2| < e2, then -- (ii) (iii) if 0 < |x if0< 2| - < 8, then < 8, then x-2|<ô,then < - < e, s. and limf(x) f(x)g(x) 1 --- 1 4 1 2 - - g(x) 8| - < < s. e. f (x) <s. g (x) function f for which Give an example of a If f (x) l| < e when 0 0< |x-a|<ô/2. (a) If -- ----- - 8. |g(x) 4| |f (x) 2| <b,then|f(x)+g(x)-6|<e. - (iv) + e, then 0 fmd a 8 > 0 such that, for all x, > if0< x-2 if 0 < x 2| (i) 7. 2| - x < limg(x) x a| -- the following assertion is false: 8, then |f (x) ll < e/2 when < - do not exist, can lim[f(x)+g(x)] or lim f(x)g(x)exist? lim f (x) exists and X-¥G lim [f (x) + g(x)] exists, must X-Þg lim g(x) exist? (c) If lim f(x) exists and lim g(x) does not exist, can lim[ f(x)+g(x)]exist? (b) If (d) If X-¥Ø lim f (x) exists and lim f (x)g(x) exists, does it follow that lim g(x) x-a x->a x->a exists? 9. Prove that lim f (x) x->a lim = h->0 f (a + h). (This is mainly in under- an exercise standing what the terms mean.) 10. (a) Prove that l if and only if lim [f (x) l] 0. (First see why the assertion is obvious; then provide a rigorous proof. In this chapter most problems which ask for proofs should be treated in the same way.) (b) Prove that lim f (x) lim f (x a). lim f (x) = = x->0 (c) Prove that (d) Give an 11. = - - x-+a lim f(x) = x->0 example where lim f(x31 x->0 lim f(x2) exists, Suppose there is a & > 0 such that f(x) but lim f(x) does not. g(x) when 0 -a| 8. Prove lim depends f(x) f(x) x->a x-a x-wa values of f (x) for x near a-this fact is often expressed by saying that limits property." (It will clearly help to use ô', or some other letter, are a of 8, in the definition of limits.) instead that lim = limg(x). In other = < words, |x < only on the "local 12. 5 g(x) for all x. Prove that lim provided that these limits exist. (a) Suppose that f(x) f(x) 5 limg(x), 108 Foundations (b) How can the hypotheses by weakened? (c) If f(x) < g(x) for all x, does it necessarily follow that lim f(x) < lim g(x)? X-¥ß 13. lim f(x) lim h(x). Prove that Suppose that f(x) 5 g(x) 5 h(x) and that X-kB X-¥ß and that lim g(x) lim g(x) exists, lim f(x) lim h(x). (Draw a picture!) = = *14. = -/ (a) Prove that if lim f(x)/x = l and b 0, then lim f(bx)/x bl. Hint: x->0 x->0 Write f (bx)/x b [f (bx)/bx]. (b) What happens if b 0? (c) Part (a)enables us to find lim(sin2x)/x in terms of lim(sinx)/x. Find = = = x->0 x->0 this limit in another way. 15. Evaluate the following limits in terms of the number lim (i) sin lim (iii) x->0 2x X sinax . . sin2 2x . X sin2 h lim (iv) x-vo lim . x Î COS X - . x->0 . hm (vi) x-so lim (vii)x-vo x tan2 x + 2x x + x2 . 1 - sin(x2 lim (ix) x->i . lun . x sin x cosx sin(x + h) lim (viii)h->0 h 16. lim(sinx)/x. x->0 x-vo smbx lim (v) = . x->0 (ii) a sin x - . O - . x - 1 x2(3+sinx) (x) xso (xi) lim (x2 2-1 (x + sin x)2 - (a) Prove that (b) Prove that . 3 1 1)3sin x 1 - l, then lim |f|(x) = |l|. x-va l and lim g(x) if lim f(x) m, then lim max( f, g)(x) max(l, m) and similarly for min. if lim x->a f(x) = = = 5. Limits 109 17. (a) Prove that lim 1/x does not exist, i.e., show that lim 1/x every number I. (b) Prove that lim 1/(x l is false for 1) does not exist. - x->1 18. = & > 0 and a number M such that |f(x)| < M if 0 < |x al < 8. (What does this mean pictorially?) Hint: Why does it sufBceto prove that l--1 < f (x) < l+1 for 0 < |x-a| < ô? Prove that if lim f(x) x-a = l, then there is a number - 19. *20. 21. 0 for irrational x and Prove that if f(x) then lim f(x) does not exist for any a. = Prove that if f(x) x for rational exist if a ¢ 0. lim f (x) does not = lim g(x) (a) Prove that if xw0 (b) Generalize this fact as = x, and f(x) f(x) x->0 1 for rational x, for irrational x, then -x = 0, then lim g(x) sin 1/x = = 0. 0 and |h(x)| < M for all x, then lim g(x)h(x) 0. (Naturally it is unnecessary to do part (a)if you x->0 succeed in doing part (b);actually the statement of part (b)may make it follows: If lim g(x) = = easier than 22. (a)-that'sone of the values of generalization.) Consider a function f with the following property: if g is any function for which limg(x) does not exist, then lim[f(x) + g(x)] also does not exist. x->0 x->0 Prove that this happens if and only if lim f (x) does exist. Hint: This is lim f (x) does not exist leads to an x->0 actually very easy: the assumption immediate contradiction **23. that if you consider the right g. This problem is the analogue of Problem 22 when f + g is replaced by f g. In this case the situation is considerably more complex, and the analysis requires several steps (thosein search of an especially challenging problem - can attempt an independent solution). (a) Suppose that lim x->O f(x) exists and is ¢ f(x)g(x) also does not same result if lim |f (x)| x->0 exist, then lim lim g(x) does not 0. Prove that if x->0 exist. x->0 (b) Prove the = oo. (The precise definition of this sort of limit is given in Problem 37.) that Prove if neither of these two conditions holds, then there is a function (c) such that lim g(x) does not exist, but lim f(x)g(x) does exist. g x->O x->O Hint: Consider separately the following two cases: (1)for some e > 0 we have |f (x)| > e for all sufficiently small x. (2)For every e > 0, there with are arbitrarily small x |f (x) < e. In the second case, begin by with choosing points x, |x,\ < 1/n and |f(x.)| < 1/n. *24. Suppose that An is, for each natural number n, some fmue set of numbers in members and and that A, Am have no in common if m ¢ n. Define [0, 1], 110 Foundations f as follows: f(x) Prove that lim f (x) 25. = x in A. in A. for any n. x not 0 for all a in [0,1]. Explain why the following definitions of lim f (x) l are all correct: x-+a For every 8 > 0 there is an e > 0 such that, for all x, = (i) if 0 < |x (ii) if 0 |x (iii) if 0 < |x < (iv) *26. 1/n, 0, = if 0 < |x - - - - a| < a| < a| < a| < e, then |f (x) l| e, then |f (x) l| e, then |f (x) l| s/10, then |f(x) < - < - 8. 8. 5ô. < - - l| < 8. Give examples to show that the following definitions of lim f(x) x->a = I are not correct. (a) For all & > |f (x) l | < (b) For all e > 0 there is an e |x 27. -- a 0 such that if 0 |x < 0 there is a 8 0 such that if |f (x) > - li a| - < 8, then < e, then 0 < 8. For each of the functions in Problem 4-17 indicate for which one-sided *28. |< > s. - limits lim f (x) and lim numbers a the f (x) exist. (a) Do the same for each of the functions in Problem 4-19. (b) Also consider what happens if decimals ending in O's are used instead of decimals ending in 9's. 29. Prove that lim 30. Prove that f (x) exists if lim f (x) (i) x->0+ lim f (|x|) (ii) x->0 lim f (x2) (iii) x->O lim f (x) = lim f (x). lim f (-x). lim f (x). = xa0= x->0+ lim = x->0+ f (x). (These equations, and others like them, are open to several interpretations. They might mean only that the two limits are equal if they both exist; or that if a certain one of the limits exists, the other also exists and is equal to it; or that if either limit exists, then the other exists and is equal to it. Decide for yourself which interpretations are suitable.) *31. Suppose that lim (Draw a picture to illustrate this assertion.) Prove that there is some 8 > 0 such that f(x) < f(y) whenever x < a < y and |x al < ô and |y a| < 8. Is the converse true? f(x) X-¥G¯ - < lim X-+G f(x). - 5. Limits 111 *32. 33. -+ao)/(bmx"'+-· -+bo) exists if and only if m > n. Prove that lim (anx"+ x-voo = What is the limit when m n? When m > n? Hint: the one easy limit is lim 1/x* = 0; do some algebra so that this is the only information you need. X-+0C - - Find the following limits. (i) (ii) x + sin3 lim . 5x + 6 x-oo lim x->oo x sin x . x2 + 5 lim Vx2 + x (iii) xooo x2 -- x. 1+k2x) lim (iv) x-+oo (x+sinx)2 . 34. Prove that lim f (1/x) 35. Find the following limits in terms of the number 36. lim œ = lim(sin x)/x. sinx (i) x-oo (ii) lim x sin x-Foo Define lim f (x). = -. x " 1 --. X lim f (x) = l." x->-oo (a) Find lim (anx"+ + a°)/(bmx" + (b) Prove that lim f (x) lim f (--x). lim f (1/x) lim f (x). (c) Prove that x-vox->-oo · - - - - - bd = X-+0C X-¥-DO = 37. We define lim f (x) = oo to mean that for all N there is a 8 > 0 such that, for all x, if 0 < x a < 8, then f (x) > N. (Draw an appropriate picture!) - (a) Show that (b) Prove that lim 1/(x 3)2 -- x-+3 if f(x) > e > lim X->G 38. -· o. 0 for all x, and lim g(x) f(x)/|g(x) = = 0, then oo. lim f (x) oo. (Or at least oo, and x->a convince yourself that you could write down the definitions if you had the energy. How many other such symbols can you define?) lim 1/x oo. (b) Prove that x->0+ lim f (1/x) oo. (c) Prove that lim f (x) oo if and only if X->oo lim f (x) (a) Define x-a+ = lim oo, x->a- f (x) = = = = x->0+ 39. Find the following limits, when they exist. x3+4x-7 (i) lim x-voo 7x2 - x + 1 = 112 Foundations (ii) (iii) lim x(1+ sin2x). X-¥OO lim x sin x. (iv) x(v) (vi) oc ; _ , 41. x2 ‡ lim 1400 . x 2X lim x( x + 2 lim (vii)x-oo 40. x sin -- X . ). - --. x (a) Find the perimeter of a regular n-gon inscribed in a circle of radius r; use radian measure for any trigonometric functions involved. [Answer: 2rn sin(x|n).] (b) What value does this perimeter approach as n becomes very large? After sending the manuscript for the first edition of this book off to the printer, a2 and lim x3 I thought of a much simpler way to prove that lim x2 -- a - a / . a3, without example, Va= FIGURE 19 - e a = going through all the factoring tricks on page 95. Suppose, for a2, where a > 0. Given that we want to prove that lim x2 = «2 of Ja2 + e g 0, we simply let 8 be the minimum a and a a| < 8 implies that Ja2 e < x < Ja2+ e, so (seeFigureX2 19); then |x |X2 a2] < < a2 + E, Or 8 < s. It is fortunate that these pages had a already been set, so that I couldn't make these changes, because this is completely fallacious. Wherein lies the fallacy? e > - - - - _ - -- "proof" FUNCTIONS CONTINUOUS CHAPTER If f is an arbitrary function, it is not necessarily true that lim this4m f (x) f (a). = fail to be true. For example, f might not In fact, there are many ways even be defined at a, in which case the equation makes no sense (Figure 1). Again, lim f (x) might not exist (Figure 2). Finally, as illustrated in Figure 3, FIGURE even if 1 f is defined at a and lim f(x) exists, the limit might not equal FIGURE (c) (b) (a) f(a). 2 We would like to regard all behavior of this type as abnormal and honor, with some complimentary designation, functions which do not exhibit such peculiarities. Intuitively, a function f is The term which has been adopted is wild oscillations. Although continuous contains the if graph no breaks, jumps,or whether usually enable you to decide this description will a function is continuous simply by looking at its graph (a skill well worth cultivating) it is easy to be fooled, and the precise definition is very important. "continuous." DEFINITION The function f is continnons at a if lim f (x) x-a / FIGURE a 3 = f (a). We will have no difficulty finding many examples of functions which are, or are example involving limits provides an not, continuous at some number a-every example about continuity, and Chapter 5 certainly provides enough of these. The function f (x) sin 1/x is not continuous at 0, because it is not even defined at 0, and the same is true of the function g(x) x sin 1/x. On the other hand, if of willing these extend second that is, if we wish to define functions, the to we are = = 113 114 Foundations a new function G by G(x) x sin a, = 1/x, ¢0 x 0, = x then the choice of a G(0) can be made in such a way that G will be continuous this Sto do 0 (Figure 4). This sort of at we can (iffact, we must) define G(0) extension is not possible for f; if we define = = F(x) sin I = 1/x, then F will not be continuous at 0, no matter (a) ¢0 x x a, 0, = what lim f a is, because x->0 (x) does not exist. The function f (x) is not continuous = x, x rational 0, x irrational if a ¢ 0, since lim f(x) does not exist. However, lim f(x) x->0 x-va 0 f (0), so f is continuous at precisely one point, 0. x2 are continuous at all numThe functions f(x) c, g(x) x, and h(x) bers a, since at a, = = = = lim f (x) = lim c = lim g(x) = lim h(x) = lim x X-ra FIGURE = c = f (a), = a = g(a), x-va 4 lim x2 - a2 = h(a). Finally, consider the function f (x) = we showed 10, 1/q, that lim x x irrational p/q in lowest terms. = In Chapter 5 f (x) continuous a is irrational, this fimction is = 0 for all a. Since O f (a) only when at a if a is irrational, but not if a is = rational. It is even easier to give examples THEOREM 1 If f of continuity if we prove two simple theorems. and g are continuous at a, then (1) f (2) f + g is continuous at a, g is continuous at a. Moreover, if g(a) ¢ 0, then (3) 1/g is continuous at a. 6. ContinuousFunctions 115 PROOF Since f and g are continuous at a, lim f (x) f (a) = Xad and lim g (x) = X->G g (a). By Theorem 2(l) of Chapter 5 this implies that lim(f +g)(x)= which is just the assertion that and (3)are left to you. I f +g)(a), f(a)+g(a)= (f + g is continuous at a. The proofs of parts Starting with the functions f(x) c and f(x) x, which are continuous for every a, we can use Theorem 1 to conclude that a function = f (x) = b,x" + bn-1x"¯1 + = - - - (2) at a, + bo c.xm+c._ixm-1+·-·+co is continuous at every point in its domain. But it is harder to get much further than that. When we discuss the sine function in detail it will be easy to prove that sin is continuous at a for all a; let us assume this fact meanwhile. A function like sin2x+x2+x'sinx f (x) = sin27 Sin x + 4x2 X proved continuous at every point in its domain. But we are still sin(x2); we obviously prove the continuity of a function like f(x) need composition about the of continuous theorem functions. Before stating this a of continuity is worth noting. If theorem, the following point about the definition we translate the equation lim f (x) f (a) according to the definition of limits, can now be unable to = = we obtain for every e > 0 there is & > 0 such that, for all x, if0< |x--a|<ô,then|f(x)f(a) <e. But in this case, where the limit is f (a), the phrase 0<|x-a|<8 may be changed to the simpler condition x-a since THEOREM 2 PROOF if x = a <ô, it is certainly true that |f (x) f (a)| - < e. If g is continuous at a, and f is continuous at g(a), then fog is continuous at a. (Notice that f is required to be continuous at g(a), not at a.) Let e > 0. We wish to find a 8 > 0 such that for all x, og)(a)|<e, if |x-a|<8,then |(f og)(x)-(f i.e., |f (g(x)) f (g(a))| < e. - 116 Foundations We first use continuity of f to estimate how close g(x) must be to g(a) in order for this inequality to hold. Since f is continuous at g(a), there is a 8' > 0 such that for all y, if |y (1) - g(a) |< ô', then |f (y) f (g(a))| < e. g(a) |< ô', then |f (g(x)) f (g(a))| < - In particular, this means that if (2) |g(x) - - e. We now use continuity of g to estimate how close x must be to a in order for the inequality |g(x) g(a)| < 8' to hold. The number ô' is a positive number just like any other positive number; we can therefore take ô' as the e (!)in the definition of continuity of g at a. We conclude that there is a ô > 0 such that, for all x, - (3) Combining if |x (2)and (3)we if |x - We can now reconsider a| - see that al < < 8, then |g(x) - g(a) < 8'. for all x, 8, then |f (g(x)) f (g(a))| < - e. | the function f (x) x sin 1/x, = 0, x x ¢0 = 0. We have already noted that f is continuous at 0. A few applications of Theorems 1 and 2, together with the continuity of sin, show that f is also continuous at a, for sin2(x3))) sin(x2 should be equally easy sin(x Functions like f(x) + + a ¢ 0. for you to analyze. The few theorems of this chapter have all been related to continuity offunctions at a single point, but the concept of continuity doesn't begin to be really interesting until we focus our attention on functions which are continuous at all points ofsome interval. If f is continuous at x for all x in (a, b), then f is called continuous on (a, b). Continuity on a closed interval must be defined a little differently; a function f is called continuous on [a,b] if = (1) f is continuous lim f (x) (2) x-a+ = f at x for all x in (a, b), (a) and lim f (x) = f (b). zeb- Functions which are continuous on an interval are usually regarded as especially behaved; indeed continuity might be specified as the first condition which a function ought to satisfy. A continuous function is sometimes described, intuitively, as one whose graph can be drawn without lifting your pencil of the the function from paper. Consideration well "reasonable" f (x) shows that this = x sin 0, 1/x, x x ¢0 = 0 description is a little too optimistic, but it is nevertheless true that there are many important results involving functions which are continuous on an interval. There theorems are generally much harder than the ones in this chapter, 6. ContinuousFunctions 117 but there is a simple theorem which forms a bridge between the two kinds of results. The hypothesis of this theorem requires continuity at only a single point, but the conclusion describes the behavior of the function on some interval containing the point. Although this theorem is really a lemma for later arguments, it is included here as a preview of things to come. THEOREM S is continuous at a, and f (a) > 0. Then there is a number 8 > 0 such that f (x) > 0 for all x satisfying |x a| < 8. Similarly, if f (a) < 0, then there is < ô. a number ô > 0 such that f(x) < 0 for all x satisfying |x Suppose f - -al PROOF Consider the case f (a) > 0. Since for all x, f such that, if |x Since f(a) > a| - is continuous at a, if e ô, then < |f (x) f (a) | < - > 0 there is a 8 > 0 e. 0 we can take f(a) as the e. Thus there is ô > 0 so that for all x, if |x a - |< 8, then |f (x) f (a)| < f (a), - and this last inequality implies f (x) > 0. A similar proof can be given in the case apply the first case to the function f. f (a) < 0; take e = -- f (a). Or one can - PROBLEMS 1. For which of the following functions f is there a continuous function F with domain R such that F(x) f(x) for all x in the domain of f? = x2 (i) f (x) - = x - 4 2 . (ii) f (x) x (iii) f (x) 0, x irrational. (iv) f (x) 1/q, x p/q rational = -. = = = which in lowest terms. points are the functions of Problems 4-17 and 4-19 continuous? 2. At 3. is a function satisfying |f (x) 5 |x for all x. Show that at 0. (Notice that f (0) must equal 0.) example of such a function f which is not continuous Give at any an (b) a ¢ 0. (c) Suppose that g is continuous at 0 and g(0) 0, and | f(x)| 5 |g(x)|. Prove that f is continuous at 0. (a) Suppose that f f is continuous = 4. Give an example of a function is continuous everywhere. 5. For each number other points. a, f such that f is continuous nowhere, fmd a function which is continuous at a, but |f| but not at any 118 Foundations 6. ¼,¼,¼, but continuous (b) Find a function f which is discontinuous at 1, ¼,),¼, and at 0, but continuous at all other points. a function f which all other points. at (a) Find is discontinuous at 1, . . . ... , 7. Suppose that f satisfies f(x + y) f(x) + f(y), and that f is continuous at 0. Prove that f is continuous at a for all a. 8. Suppose that f is continuous at a and f(a) 0. Prove that if a f + a is nonzero in some open interval containing a. 9. is not continuous at a. Prove that for some number e > 0 there numbers are x arbitrarily close to a with |f(x) f(a)| > e. Illustrate graphically. (b) Conclude that for some number e > 0 either there are numbers x arbitrarily close to a with f (x) < f (a) e or there are numbers x arbitrarily close to a with f (x) > f (a) + e. = = ¢ 0, then (a) Suppose f - - 10. at a, then so is if|. continuous every f can be written f = E + 0, where E is even and continuous and 0 is odd and continuous. then so are max( f, g) and Prove that if f and g are continuous, min( g). (a) Prove that (b) Prove that (c) if f is continuous f, (d) Prove that every continuous f can be written f are nonnegative 11. Prove Theorem f (x) *12. = = g h, where g and h - and continuous. 1(3) by using Theorem 2 and continuity of the if f is continuous at l and limg(x) l, then lim f(g(x)) f (l). (You can go right back to the definitions, but it is easier to consider the function G with G(x) g(x) for x ¢ a, and G(a) l.) continuity assumed, that of then l if it is not generally f at is not (b) Show true that lim f (g(x)) f (limg(x)). Hint: Try f (x) 0 for x ¢ l, and (a) Prove that = = = = = f (l) 13. function 1/x. = 1. = if f is continuous on [a,b], then there is a function g which is continuous on R, and which satisfies g(x) f(x) for all x in [a,b]. Hint: Since you obviously have a great deal of choice, try making g (a) Prove that = constant on (-oo, a] and [b,oo). (b) Give an example to show that this assertion is false if by (a, b). 14. g and h are continuous g(x) if x > a and h(x) (a) Suppose that f(x) to be if [a,b] is replaced at a, and that g(a) xs a. Prove that h(a). Define is f continuous = at a. g is continuous on [a,b] and h is continuous on [b,c] and h(b). Let f(x) be g(x) for x in [a,b] and h(x) for x in [b,c]. (b) Suppose g(b) = 6. ContinuousFunctions 119 Show that f is continuous "pasted 15. on together".) (Thus, continuous [a,c]. functions can be continuity": following version of Theorem 3 for "right-hand and that lim there > is a number Suppose f (x) f (a), f (a) 0. Then (a) Prove the = x->a+ 0 such that f (x) > 0 for all x satisfying 0 Ex a < 8. Similarly, such number that > then there < 8 0 if f (a) 0, is a f (x) < 0 for all x satisfying 0 5 x a < 8. version of Theorem 3 when lim f (x) f (b). Prove a (b) x->b8 > - - = 16. If lim f (x) exists, but is ¢ f (a), then xua continuity is said to have a rernovable dis- at a. sin 1/x for x ¢ 0 and discontinuity at 0? What if f (x) (a) If f (x) f f (0) = x sin = 1, does f have a removable 1/x for x ¢ 0, and f (0) l? f(x) for (b) Suppose f has a removable discontinuity at a. Let g(x) continuous that and g(a) at lim Prove is let a. (Don't g x ¢ a, f(x). = = = = x->a hard; this is quite easy.) = Let (c) f (x) 0 if x is irrational, and let f (p/q) terms. What is the function g defined by g(x) work very *(d) = = 1/q if p/q is in lowest lim f(y)? y->x f be a function with the property that every point of discontinuity removable discontinuity. This means that lim f (y) exists for all x, is a y->x infinitely many) numbers x. but f may be discontinuous at some (even lim f(y). Prove that g is continuous. (This is not quite Define g(x) Let = y->x **(e) (b).) so easy as part Is there a function f which is discontinuous at every point, and which has only removable discontinuities? (It is worth thinking about this problem now, but mainly as a test of intuition; even if you suspect the correct will almost certainly be unable to prove it at the present answer, you time. See Problem 22-33.) CHAPTER THREE HARD THEOREMS This chapter is devoted to three theorems about continuous functions, and some of their consequences. The proofs of the three theorems themselves will not be given until the next chapter, for reasons which are explained at the end of this chapter. THEOREM I If f is continuous such that f (x) = on 0. [a,b] and f (a) < 0 < f (b), then there is some x in [a,b] (Geometrically, this means that the graph of a continuous function which starts below the horizontal axis and ends above it must cross this axis at some point, as in Figure 1.) THEOREM 2 is continuous on [a,b], then f is bounded above on some number N such that f (x) 5 N for all x in [a,b]. If f (Geometrically, this theorem means that the graph of allel to the horizontal axis, as in Figure 2.) THEOREM s f that is, there is lies below some line par- is continuous on [a,b], then there is some number f (y) f (x) for all x in [a,b] (Figure 3). If [a,b], f y in [a,b] such that > from the theorems of Chapter 6. The These three theorems differ markedly of always those theorems hypotheses involved continuity at a single point, while I I I I I a b a b I I I FIGURE l FIGURE 120 2 FIGURE 3 I I a y I b 7. ThreeHard Theorems 121 the hypotheses of the present theorems require continuity on a whole interval [a,b]--if continuity fails to hold at a single point, the conclusions may fail. For example, let f be the function shown in Figure 4, e----e -1, i 05x<Ã f(x)= i 2 Ásxs2. 1, Then f is continuous at every point of [0,2] except Ã, and f (0) < 0 < f (2), but there is no point x in [0,2] such that f (x) 0; the discontinuity at the single point Ã is sufficient to destroy the conclusion of Theorem 1. Similarly, suppose that f is the function shown in Figure 5, = FIGURE 4 -y f (x) N x 0, x = f is continuous at every point of [0, 1] except 0, but f [0, 1]. In fact, for any number N > 0 we have f (1/2N) Then ---------- 0 0. i1/x, = on is not bounded above = 2N > N. example also shows that the closed interval [a,b] in Theorem 2 cannot be by the open interval (a, b), for the function f is continuous on (0, 1), but is not bounded there. Finally, consider the fimction shown in Figure 6, This replaced X2 f (x) FIGURE = f 0, x > l. On the interval [0,1] the function f is bounded above, so f does satisfy the conclusion of Theorem 2, even though f is not continuous on [0,1]. But f of satisfy the conclusion Theorem 3-there is no y in [0,1] such that does not f(y) > f (x) for all x in [0, 1]; in fact, it is certainly not true that f (1) > f (x) for all x in [0,1] so we cannot choose y 1, nor can we choose 0 y < l because number with if < < x < 1. y f (y) f (x) x is any This example shows that Theorem 3 is considerably stronger than Theorem 2. Theorem 3 is often paraphrased by saying that a continuous function on a closed interval on its maximum value" on that interval. As a compensation for the stringency of the hypotheses of our three theorems, the conclusions are of a totally different order than those of previous theorems. They describe the behavior of a function, not just near a point, but on a whole interval; such properties of a function are always significantly more difficult than properties, and are correspondingly of much greater power. to prove usefulness of Theorems 1, 2, and 3, we will soon deduce some imillustrate To the portant consequences, but it will help to first mention some simple generalizations Of fl1€SC Sleorems. 5 _< = "takes "global" FIGURE | ¡ . y x 1 "local" 6 THEOREM « If f is continuous on such that f (x) = c. [a,b] and f (a) < c < f (b), then there is some x in [a,b] 122 Foundations PROOF Let g is some THEOREM s f = c. Then g is continuous, b] such that g(x) = - in x [a, If f is continuous such that f (x) The function PROOF on = and = 0 < = f (a) is continuous orem 4 there is some x in f (x) g(b). By Theorem 1, there 0. But this means that f(x) c. < > c > f (b), then there is some x in [a,b] c. f - [a,b] and g(a) on [a,b] and [a,b] such that - - f (a) f (x) --c < < -c, = f (b). By Thewhich means that - c. Theorems 4 and 5 together show that f takes on any value between f (a) and f (b). We can do even better than this: if c and d are in [a,b], then f takes on any value between f (c) and f (d). The proof is simple: if for example, c < d, then just apply Theorems 4 and 5 to the interval [c,d]. Summarizing, if a continuous function on an interval takes on two values, it takes on every value in between; this slight generalization of Theorem 1 is often called the Intermediate Value Theorem. THEOREM 6 If f is continuous on some number PROOF The function such that in - [a,b], so [a,b], then f is bounded below on [a,b], f(x) > N for all x in [a,b]. that is, there is N such that f is continuous - for all x in -M. let N f(x) SM we can [a,b], so by Theorem 2 there is a number M [a,b]. But this means that f(x) > -M for all x on - 2 and 6 together show that a continuous function f on [a,b] is bounded on [a,b], that is, there is a number N such that |f (x) 5 N for all x in [a,b]. In fact, since Theorem 2 ensures the existence of a number Ni such that f (x) 5 N1 for all x in [a,b], and Theorem 6 ensures the existence of a number max(|N1|, |N2|). N2 such that f (x) > N2 for all x in [a,b], we can take N Theorems = THEOREM 7 If f is continuous on [a, b], then there is some y in [a,b] such that for all x in [a,b]. (A continuous function on a closed interval takes on its minimum interval.) PROOF The function - f is continuous f (x) for such that f (y) > all x in [a,b]. - - on all x [a,b]; by Theorem in f (y) 5 f(x) value on that 3 there is some y in [a,b], which means that f(y) 5 [a,b] f (x) for Now that we have derived the trivial consequences of Theorems 1, 2, and 3, we can begin proving a few interesting things. THEOREM a Every positive number has a square root. In other words, if a some number x2 x such that - a > 0, then there is 7. ThreeHard Theorems 123 PROOF x2, which is certainly continuous. = Notice that the "the of number a has a the theorem can be expressed in terms of f: statement root" means that of value takes proof the The this fact about f square a. on f will be an easy consequence of Theorem 4. illustrated in Figure 7); There is obviously a number b > 0 such that f (b) > a Consider the function f (x) (as 1 we can take b a, while if a < 1 we can take b = 1. Since applied < to [0,b] implies that for some x (in[0,b]), f (0) a < f (b), Theorem 4 x2 = œ, have i.e., a. we f (x) in fact, if a > = = f(x) . - can be used to prove that a positive number has nth natural number for root, n. If n happens to be odd, one can do an any better: every number has an nth root. To prove this we just note that if the positive number a has the nth root x, i.e., if x" = a, then (-x)" = n is odd), so assertion, odd number nth The that root for has the n any a has an nth root, is equivalent to the statement that the equation Precisely the same argument x= _ -a -a (since -x. x"-a=0 has a root if n is odd. FIGURE Expressed in this way the result is susceptible of great gentrâÏÎZatÎOD. 7 THEOREM 9 If n is odd, then any equation x"+an-ix"¯1+---+ao=0 has a root. PROOF We obviously want to consider the function f(x)=x"+an-ix"¯'+· ·+¾ like to prove that f is sometimes positive and sometimes negative. The x" and, intuitive idea is that for large |x|, the function is very much like g(x) negative and since n is odd, this function is positive for large positive x for large negative x. A little algebra is all we need to make this intuitive idea work. The proper analysis of the function f depends on writing we would = f (x) = x" + an-1x"¯ I + - · - + ao = 1 + a, x" x" x Note that as_i - x an-2 + x2 + --- · · · + GO -- x" 00| Un-1 5 |x| + + Consequently, if we choose x satisfying (*) then |x|> 1, 2nla,_i|,...,2n|ao , |xk| > |x| and Xk| |x| 2n|a._g| E' ----. |x"| 124 Foundations so an_2 lan-1 -+-+ ao ··+- x2 x 1 2n 1 2n <-+···+--=-. ¯ yn 1 2 n terms In other words, ---<-+---+--<-, 1 2¯ a..I 1 ao x"¯¯2 x which implies that 1 2 a. -51+-+-·-+-. Therefore, if we choose an xi (xi)" 2 so that f(xi) > _ ao x= i x > 0 which satisfies a._i (xi)" 1+ (*),then ao -+ 5 + xi 0. On the other hand, if x2 (x2)n> (x2)" 1 + an-1 ---- 2 x2 + - < - - = (xi)" 0 satisfies + GO (*),then = (x2)" f(x1), (x2)" < 0 and f (x2), so that f (x2)< 0. Now applying Theorem 1 to the interval in [x2,xi] such that f(x) 0. [x2,x1] we conclude that there is an x = Theorem 9 disposes of the problem of odd degree equations so happily that it would be frustrating to leave the problem of even degree equations completely undiscussed. At first sight, however, the problem seems insuperable. Some equations, like x2 1 = 0, have a solution, and some, like x2 + 1 0, do not-what more is there to say? If we are willing to consider a more general question, however, something interesting can be said. Instead of trying to solve the equation = - x" + I I a..tx"¯I + - - - + ao 0, = let us ask about the possibility of solving the equations x"+a._ix"¯1+---+ao=c FIGURE 8 for all possible numbers c. This amount to allowing the constant term ao to vary. The information which can be given concerning the solution of these equations depends on a fact which is illustrated in Figure 8. with n even, contains, The graph of the function f (x) x"+a._ix"¯I+words, other the point. there is a least have drawn it, lowest In at way we a number y such that f (y) < f (x) for all numbers x-the function f takes on a minimum value, not just on each closed interval, but on the whole line. (Notice that this is false if n is odd.) The proof depends on Theorem 7, but a tricky application will be required. We can apply Theorem 7 to any interval [a,b], and obtain a point yo such that f (yo)is the minimum value of f on [a,b]; but if [a,b] happens to be the interval shown in Figure 8, for example, then the point ye will value for the whole line. In the next not be the place where f has its minimum -+ao, = - 7. ThreeHard Theorerns 125 theorem the entire point of the proof is to choose an interval that this cannot happen. THEOREM 10 If n is even and that PROOF f (x) x" + a,_ixn-1 = f (y) 5 f (x) for all + . - - [a,b] in such + ao, then there is a number a way y such x. As in the proof of Theorem 9, if M then for all x with max(1, = 2n an-il, 2n aal), ..., à M, we have x I 2¯ a._i -<1+--+ a0 ··+--. x" x Since n is even, x" 2 0 for all x, so -- ---------- ---- - x" -Ex" an-1 ao x x" 1+-+·-·+- 2 =f(x), provided that |x à M. Now consider the number f (0). Let b > 0 be a number b" à 2 f (0) and also b > M. Then, if x à b, we have (Figure 9) such that x" f (x) > > -- ¯2¯2- b" -- f (0). > --b, then Similarly, if x 5 FIGURE 9 x" f(X) > (--b)" > -- ¯2¯ b" = > - 2 2¯ f(Û). Summarizing: -b, if x then b or x 5 > Now apply Theorem 7 to the function that there is a number y such that f f(x) 2 f(0). on the interval [-b, b]. -b (1) if 5 xs b, then f(y) 5 f(x). In particular, f (y) 5 f (0). Thus -b if xs Combining (2) (1)and (2)we or x see that à b, then f (x) 2 f (0) 2 f (y). f (y) 5 f (x) for all x. Theorem 10 now allows us to prove the following result. THEOREM 11 Consider the equation (*) x" + a,_ix"¯1 + - - - + ao = c, We conclude 126 Foundations and suppose c à m and has no solution for c PROOF m such that n is even. Then there is a number (*)has a solution for < m. Let f (x) x" + an-ix"¯' + + ao (Figure 10). there According to Theorem 10 is a number y such that f(y) 5 f(x) for all x. Let m f(y). If c < m, then the equation (*)obviously has no solution, since the left side always has a value à m. If c m, then (*)has y as a solution. number such that b > y and f (b) > c. Then > Finally, suppose c m. Let b be a f(y) m < c < f(b). Consequently, by Theorem 4, there is some number x in [y,b] such that f (x) c, so x is a solution of (*). = · · · = = = = b y FIGURE These consequences of Theorems 1, 2, and 3 are the only ones we will derive theorems will play a fundamental role in everything we do later, hownow (these ever). Only one task remains-to prove Theorems 1, 2, and 3. Unfortunately, we cannot hope to do this-on the basis of our present knowledge about the real numbers P1-P12) a proof is impossible.There are several ways of con(namely, vincing ourselves that this gloomy conclusion is actually the case. For example, the proof of Theorem 8 relies only on the proof of Theorem 1; if we could prove Theorem 1, then the proof of Theorem 8 would be complete, and we would have a proof that every positive number has a square root. As pointed out in Part I, it is impossible to prove this on the basis of Pl-P12. Again, suppose we consider the funCtiOn 10 f(x)= 1 X2 - 2 If there were no number x with x2 2, then f would be continuous, since the denominator would never 0. But f is not bounded on [0,2]. So Theorem 2 essentially the existence of numbers other than rational numbers, and depends on therefore on some property of the real numbers other than P1-P12. Despite our inability to prove Theorems 1, 2, and 3, they are certainly results which we want to be true. If the pictures we have been drawing have any connection with the mathematics we are doing, if our notion of continuous function corresponds to any degree with our intuitive notion, Theorems 1, 2, and 3 have got to be true. Since a proof of any of these theorems must require some new property of R which has so far been overlooked, our present difficulties suggest a way to discover that property: let us try to construct a proof of Theorem 1, for = = A x a a x' b example, and see what goes wrong. One idea which seems promising is to locate the first point where f(x) 0, that is, the smallest x in [a,b]such that f(x) = 0. To find this point, first consider the set A which contains all numbers x in [a,b] such that f is negative on [a,x]. In Figure 11, x is such a point, while x' is not. The set A itself is indicated by a heavy line. Since f is negative at a, and positive at b, the set A contains some points greater than a, while all points sufficiently close to b are not in A. (We are here using the continuity of f on [a,b], as well as Problem 6-15.) = FIGURE 11 7. ThreeHard Theorems 127 f(x) < 0 for all x in this interval is the smallest number which is greater than all members of A; clearly a < a < b. We claim that f (œ) 0, and to prove this we only have to eliminate the possibilities f (a) < 0 and f (a) > 0. Suppose first that f (a) < 0. Then, by Theorem 6-3, f (x) would be less than 0 for all x in a small interval containing a, in particular for some numbers bigger than a (Figure 12);but this contradicts the fact that a is bigger than every member of A, since the larger numbers would also be in A. Consequently, f (a) < 0 is false. On the other hand, suppose f(a) > 0. Again applying Theorem 6-3, we see that f(x) would be positive for all x in a small interval containing a, in particular for some numbers smaller than « (Figure 13). This means that these smaller numbers are all not in A. Consequently, one could have chosen an even smaller a which would be greater than all members of A. Once again we have a contradiction; f(a) > 0 is also false. Hence f(a) 0 and, we are tempted to say, QE.D. We know, however, that something must be wrong, since no new properties of R Were ever used, and it does not require much scrutiny to find the dubious point. It is clear that we can choose a number a which is greater than all members of A (forexample, we can choose a = b), but it is not so clear that we can choose a smallest one. In fact, suppose A consists of all numbers x > 0 such that x2 < 2. If the number Ã did not exist, there would not be a least number greater than all the members of A; for any y > Á we chose, we could always choose a still Now suppose œ = o ontain FIGU RE 12 = A could really only this bi be f(x) > o for au in this interval FIGURE 13 2 smaller one. Now that we have discovered the fallacy, it is almost obvious what additional property of the real numbers we need. All we must do is say it properly and use it. That is the business of the next chapter. PROBLEMS 1. For each of the following functions, decide which are bounded above or below on the indicated interval, and which take on their maximum or minimum value. (Notice that f might have these properties even if f is not continuous, and even if the interval is not a closed interval.) (i) f (x) (ii) f(x) (iii) f (x) (iv) f (x) (v) f (x) = = x2 on (-1, 1). x3 on (-1, 1). = x2 = x2 on R. on x2, [0,oo). = [ a+2, x 5 a x >a on (-a - 1, a + 1). (It will be necessary consider several possibilities for a.) x2, (vi) f (x) = (vii)f (x) = f[ a+2, 0, 1/q x < x>a a on [-a - 1, a + 1]. x irrational p|q in lowest terms x = on [0,1]. to 128 Foundations (viii)f (x) = (ix) f (x) = (x) f (x) = x irrational p/q in lowest terms x (1, (1, I 1/q -1/q x x irrational p/q in lowest terms x, x rational 0 x irrational on sin2(cosx a2) [0,1]. [0,1]. on = + Ja + (xi) f(x) (xii)f (x) [x]on [0,a]. = on = [0,a]. on [0,a3p = 2. For each of the following polynomial functions f (x) 0 for some x between n and n + 1. f, fmd an integer n such that = (i) f (x) (ii) f(x) (iii) f (x) (iv) f (x) 3. x3 x + 3. x5 + 5x'+ h + 1 x + x + 1. = - = = 4x2 = 4x + 1. -- Prove that there is some number x such that 163 Il79 gig (i) l+x2+sin2 (ii) sin x x 1. __ = 4. - This problem is a continuation (a) IËn of Problem 3-7. k is even, and > 0, find a polynomial function of degree n with exactly k roots. A m (b) root a of the polynomial fimction f is said to have multiplicity if f(x) (x a)mg(x), where g is a polynomial function that does not - = -- have a as a root. Let f be a polynomial function of degree n. Suppose that f has k roots, counting multiplicities, i.e., suppose that k is the sum of the multiplicities of all the roots. Show that n k is even. -- [a,b] and f (x) is always that rational. 5. Suppose that f is continuous on can be said about f? 6. Suppose that f is a continuous function on [--1,1] such that x2+(f(x))2 y for all x. (This means that (x,f (x)) always lies on the unit circle.) Show that x2 for all X. either f(x) /1 x2 for all x, or else f (x) What _ -)1 = 7. = - -- How many continuous functions f are there which satisfy (f (x))2 2 for all x? 8. Suppose that f and g are continuous, that f2 g2, and that all x. Prove that either f (x) g(x) for all x, or else f (x) f(x) ¢ 0 for - -g(x) = 9. for all x. = (a) Suppose that f is continuous, that f (x) 0 only for x f (x) > 0 for some x > a as well as for some x < a. What = about f(x) for all x ¢ a? = a, and that can be said 7. ThreeHard Theorems l 29 (b) Again assume but suppose, some x *(c) 10. < a. f is continuous and that f (x) instead, that f (x) > 0 for some x > Now what can be said about f (x) for that 0 only for x = a and ¢ x f (x) < = a, 0 for a? Discuss the sign of x3 + x2y + xy2 + y3 when x and y are not both 0. Suppose f and g are continuous on [a,b] and that f(a) < g(a), but f(b) > g(x) for some x in [a,b]. (If your proof isn't very g(b). Prove that f(x) short, it's not the right one.) = 11. (1,i) 's [0,1] = ', FIGURE Suppose that f is a continuous function on [0,1] and that f (x) is in for each x (drawa picture). Prove that f (x) x for some number x. 12. 14 11 shows that f intersects the diagonal of the square in Figline). Show that f must also intersect the other (dashed) ure 14 (solid diagonal. (b) Prove the following more general fact: If g is continuous on [0,1] and g(0) 0, g(l) 1 or g(0) 1, g(1) 0, then f(x) g(x) for some x. (a) Problem = 13. = (a) Let f (x) *(b) *(c) sin = = = = 1/x for x ¢ 0 and let f (0) = 0. Is f continuous on maximum |f | [-1, 1]? Show that f satisfies the conclusion of the Intermediate Value Theorem on [-1, 1]; in other words, if f takes on two values somewhere on [-1, 1], it also takes on every value in between. Suppose that f satisfies the conclusion of the Intermediate Value Theorem, and that f takes on each value only once. Prove that f is continuous. Generalize to the case where f takes on each value only finitely many times. 14. f is a continuous on [0,1]. If function on [0,1], let ||f || be the for any number c we have ||cf|| |c| ||f||. Prove that | f + g|| 5 ||f|| + ||g||. Give an example where (a) Prove that *(b) Ilfll + = llg||. - *15. - |f + g|| ¢ -g||+||g (c) Prove that |\h f|| y value of 5 ||h Suppose that ¢ is continuous - and f||. lim ¢(x)/x" X->OD = 0 = lim ¢(x)/x". X->-DO if n is odd, then there is a number x such that x" + ¢(x) 0. (b) Prove that if n is even, then there is a number y such that y" + ¢(y) 5 x" + ¢(x) for all x. (a) Prove that 's, " = Hint: Of which proofs does this problem test your understanding? FIGURE *16. Let f be any polynomial function. Prove that there is some number that |f(y)| 5 |f(x)| for all x. *17. Suppose that 15 is a continuous function with y such 0 for all x, and lim f (x) 0 lim f (x). (Draw a picture.) Prove that there is some X->OD X4-00 number y such that f (y) > f (x) for all x. = f = f (x) > l 30 Foundations *18. is continuous on [a,b], and let x by any number. Prove that there is a point on the graph of f which is closest to (x, 0); in other words there is some y in [a,b] such that the distance from (x, 0) to (y, f (y)) is 5 distance from (x, 0) to (z,f (z)) for all z in [a,b]. (See Figure 15.) that this same assertion is not necessarily true if [a,b] is replaced Show (b) by (a, b) throughout. (c) Show that the assertion as true if [a,b] is replaced by R throughout. (d) In cases (a)and (c),let g(x) be the minimum distance from (x, 0) to a yl, and conclude point on the graph of f. Prove that g(y) 5 g(x) +|x that g is continuous. that there are numbers xo and xi in [a,b] such that the distance Prove (e) from (xo,0) to (x1,f(xi)) is 5 the distance from (xo',0) to (xt', f(xi')) for any xo', Il' in [a,b]. (a) Suppose that f - **19. (a) Suppose that f is continuous on [0, 1] and f(0) number. Prove that there is some number shown in Figure 16 for n = 4. Hint: (x+1/n), as natural f g(x) = f(x) f(x (b) Suppose - 0 < a < FIGURE16 + n. Find a function f which is continuous satisfies f(0) = f(1), but which does not satisfy 1 f(l). Let n be any x such that f (x) = Consider the function + 1/n); what would be true if g(x) ¢ 0 for all x? 1, but that a is not equal to 1/n for any natural number x = [0, 1] and on f(x) = f(x which + a) for any x. **20. there does not exist a continuous function f defined on R which takes on every value exactly twice. Hint: If f (a) f (b) for either all then > < < b, for in b) a x (a, or f (x) f (a) for f (x) f (a) values close to f(a), but slightly the all x in (a, b). Why? In first case all larger than f (a), are taken on somewhere in (a, b); this implies that f (x) < f (a) for x < a and x > b. (b) Refine part (a)by proving that there is no continuous function f which takes on each value either 0 times or 2 times, i.e., which takes on exactly twice each value that it does take on. Hint: The previous hint implies that f has either a maximum or a minimum value (which must be taken said about values close maximum value? twice). What the be to on can which continuous takes value exactly function Find 3 times. on every a f (c) which value exactly takes on every More generãlly, find one n times, if odd. is n (d) Prove that if n is even, then there is no continuous f which takes on 4, for example, every value exactly n times. Hint: To treat the case n Í(X4). > let f(x1) Then either f(x2) f(13) f(x) 0 for all x in of the three x3), x4), intervals (xi, x2), (x2, two (x3, or else f (x) < 0 for all x in two of these three intervals. (a) Prove that = = = = = LEAST UPPER BOUNDS CHAPTER NeverThis chapter reveals the most important property of the real numbers. merely sequel which theless, it is must be followed has to Chapter 7; the path a already been indicated, and further discussion would be useless delay. DEFDHTION A set A of real numbers is bounded x Such a number x > a above if there is a number x such that for every a in A. is called an upper bound for A. Obviously A is bounded above if and only if there is a number x which is an upper bound for A (andin this case there will be lots of upper bounds for A); we often say, as a concession to idiomatic English, that "A has an upper bound" when which is an upper bound for A. we mean that there is a number above" has now been used in two ways--first, in Notice that the term Chapter 7, in reference to functions, and now in reference to sets. This dual usage should cause no confusion, since it will always be clear whether we are talking about a set of numbers or a function. Moreover, the two definitions are closely connected: if A is the set {f (x) : a Ex 5 b}, then the function f is bounded above on [a,b] if and only if the set A is bounded above. The entire collection R of real numbers, and the natural numbers N, are both examples of sets which are not bounded above. An example of a set which is bounded above is A=(x:05x<1}. "bounded To show that A is bounded above we need only name some upper bound for A, is easy enough; for example, 138 is an upper bound for A, and so are 2, l¼, 1¼, and 1. Clearly, 1 is the least upper bound of A; although the phrase just introduced is self explanatory, in order to avoid any possible confusion (in means), we particular, to ensure that we all know what the superlative of explicitly. define this which "less" DEFINITION A number x is a least upper and 131 bound of A if (1) x is an upper bound of A, (2) if y is an upper bound of A, then x 5 y. 132 Foundations "a" in this definition was merely a concession The use of the indefinite article that ignorance. Now have made a precise definition, it is easily we to temporary and that if x seen y. Indeed, in this y are both least upper bounds of A, then x = case xs y, and ys x, since y since x is an upper bound, and x is a least upper bound, is an upper bound, and y is a least upper bound; it follows that x y. For this reason we speak of the least upper bound of A. of A is synonymous and has one advantage. It abbreviates The term supremum nicely quite to = sup A and saves us "soup (pronounced A") from the abbreviation lub A (whichis nevertheless used by some authors). There is a series of important definitions, analogous to those just given, which can now be treated more briefly. A set A of real numbers is bounded below if there is a number x such that x 5 a for every a in A. Such a number x is called a lower bound for A. A number x is the greatest lower bound of A if is a lower bound of A, if (2) y is a lower bound of A, then x (1) and x > y. The greatest lower bound of A is also called the infimum of A, abbreviated inf A; some authors use the abbreviation glb A. One detail has been omitted from our discussion so far-the question of which sets have at least one, and hence exactly one, least upper bound or greatest lower bound. We will consider only least upper bounds, since the question for greatest lower bounds can then be answered easily (Problem 2). If A is not bounded above, then A has no upper bound at all, so A certainly cannot be expected to have a least upper bound. It is tempting to say that A does have a least upper bound if it has some upper bound, but, like the principle of mathematical induction, this assertion can fail to be true in a rather special way. then If A 0, A is bounded above. Indeed, any number x is an upper bound for 0: x > y for every y in 0 = simply because there is no y in Ø. Since every number is an upper bound for 0, there is surely no least upper bound for Ø. With this trivial exception however, 8. Irast UpperBounds 133 is true-and very important, definitely important enough to warrant of details. We are finally ready to state the last property of the real our assertion consideration which we need. numbers (The least upper bound property) If A is a set of real numbers, A ¢ Ø, and A is bounded above, then A has a least upper bound. (Pl3) b a a but that is actually one of its Property P13 may strike you as anticlimactic, virtues. To complete our list of basic properties for the real numbers we require no particularly abstruse proposition, but only a property so simple that we might feel foolish for having overlooked it. Of course, the least upper bound property is not really so innocent as all that; after all, it does not hold for the rational numbers Q. For example, if A is the set of all rational numbers x satisfying x2 < 2, then there is no rahonal number y which is an upper bound for A and which is less than or equal to every other rational number which is an upper bond for A. It will become clear only gradually how significant Pl3 is, but we are already in a position to demonstrate its power, by supplying the proofs which were omitted in Chapter 7. 1 FIGURE THEOluuW 7.1 If in PROOF f is continuous [a,b] on such that [a,b] f (x) = 0. and f(a) < 0 < f(b), then there is some number x Our proof is merely a rigorous version of the outline developed at the end of Chapter 7-we will locate the smallest number x in [a,b] with f(x) 0. Defme the set A, shown in Figure 1, as follows: = f(x) o for all x in this interval < A - {x a 5 x : 5 b, and on the f is negative interval ·ý a x, I U Clearly A 0, since a is in A; in fact, there is some 8 > 0 such that A contains all points x satisfying a 5 x < a + 8; this follows from Problem 6-15, since f is continuous on [a,b] and f (a) < 0. Similarly, b is an upper bound for A and, in fact, there is a 8 > 0 such that all points x satisfying b ô < x 5 b are upper bounds for A; this also follows from Problem 6-15, since f (b) > 0. From these remarks it follows that A has a least upper bound a and that 0, by eliminating the possibila < œ < b. We now wish to show that f(a) <0and ities f(a) f(a)>0. Suppose first that f (a) < 0. By Theorem 6-3, there is a 8 > 0 such that f (x) < 0 for « ô < x < a + & (Figure 2). Now there is some number xo in A which satisfies a otherwise a would not be the least upper 8 < xo < a (because negative that This of A). is on the whole interval [a,xo]. But if bound means f and number between a a + ô, then f is also negative on the whole interval xi is a f is negative on the interval [a,xi], so x1 is in A. But this [xo,xt]. Therefoæ that contradicts that the fact a is an upper bound for A; our original assumption f(a) < 0 must be false. Suppose, on the other hand, that f (a) > 0. Then there is a number ô > 0 such that f (x) > 0 for a 8 < x < a + 8 (Figure 3). Once again we know that there is SRtiSfying ô < xo < a; but this means that f is negative on [a,xo], a an XO in A - FIGURE2 = - - $ta i a i o Ex) > c for an x in this interval - FIGURE 3 [a,x]}. - 134 Foundations which is impossible, since leaving f a contradiction, f (xo) > 0. Thus the assumption f (a) (a) 0 as the only possible alternative. > 0 also leads to = The proofs of Theorems 2 and 3 of Chapter 7 require a simple preliminary play much the same role as Theorem 6-3 played in the previous proof result, which will THEOREM 1 is continuous at a, then there is a number ô above on the interval (a 8, a + 8) (seeFigure 4). If f > 0 such that f is bounded - PROOF Since lim f (x) = f (a), there if |x - is, for every e a 8, then |< > 0, a 8 > 0 such that, for all x, |f (x) f (a)| < -- e. It is only necessary to apply this statement to some particular e (anyone will do), for example, e 1. We conclude that there is a 8 > 0 such that, for all x, = if|x-a|<8,then|f(x)-f(a)1<l. It follows, in particular, that if |x a| < 8, then f (x) f (a) < l. This completes the proof: on the interval (a 8, a + 8) the function f is bounded above by 1. + f (a) -- - - It should hardly be necessary to add that we can now also prove that f is bounded below on some interval (a ô, a + 8), and, finally, that f is bounded on some open interval containing a. A more significant point is the observation that if lim f (x) f (a), then there - = x-a+ I I I I I I I I I I I I I i aFIGURE 4 | ô a a‡ 8. Least UpperBounds 135 0 such that f is bounded on the set {x : a 5 x < a + ô}, and a similar observation holds if lim f (x) f (b). Having made these observations (and is a ô > = assuming THEOREM 7-2 PROOF that you will supply the proofs), we tackle our second major theorem. If f is continuous [a,b], then f on is bounded above on [a,b]. Let A and {x a Exsb : = above on f is bounded [a,x]}. Clearly A ¢ Ø (since a is in A), and A is bounded above (byb), so A has a least above" both Notice that we are here applying the term bound upper a. visualized to the set A, which can be as lying on the horizontal axis, and to f, i.e., which the : x}, to sets {f(y) a sy 5 can be visualized as lying on the vertical axis (Figure 5). b. Suppose, instead, that Our first step is to prove that we actually have a such that there > < 8 0 is By Theorem 1 is bounded b. on («-8, a+ô). Since a f of satisfying a-ô < xo < a. This A there is some xo in A a is the least upper bound means that f is bounded on [a,xo]. But if xi is any number with a < xi < a + 8, then f is also bounded on [xo,xi]. Therefore f is bounded on [a,x1], so xi is the fact that a is an upper bound for A. This contradiction in A, contradicting shows that a b. One detail should be mentioned: this demonstration implicitly assumed that a < a [sothat f would be defined on some interval (a 8, a + 8)]; the possibility a a can be ruled out similarly, using the existence of a ô > 0 such that f is bounded on {x: a Ex < a + 8}. The proof is not quite complete-we only know that f is bounded on [a,x] for every x < b, not necessarily that f is bounded on [a,b]. However, only one small argument needs to be added. There is a ô > 0 such that f is bounded on {x : b 8 < x 5 b}. There is xo in A such that b 8 < xo < b. Thus f is bounded on [a,xo] and also on [xo,b], so f is bounded on [a,b]. "bounded ' = -------- U(y):a sys x) = - = a FIGURE A 5 - -- To prove the third important theorem we resort to a trick. THEOREM 7-3 PROOF is continuous f (x) for all x in If f on [a,b], then there is a number y in [a,b]. [a,b], which {f (x) : x in [a,b] } We already know that f is bounded on [a,b] means such that f(y) > that the set is bounded. This set is obviously not 0, so it has a least upper bound «. Since f (y) for some y in [a,b]. a > f (x) for x in [a,b) it suffices to show that a Suppose instead that a ¢ f (y) for all y in [a,b]. Then the function g defined by 1 = g(x) = , a - f (x) x in [a,b] 136 Foundations is continuous on [a,b], since the denominator of the right side is never 0. On the other hand, a is the least upper bound of f(x) : x in [a,b]}; this means that { for every e 0 there is x in > [a,b] with a - f (x) < e. This, in turn, means that for every e 0 there is x in > But this means that g is not bounded on [a,b] with g(x) > 1/e. [a,b], contradicting the previous theorem. At the beginning of this chapter the set of natural numbers N was given as an example of an unbounded set. We are now going to provethat N is unbounded. After the difficulttheorems proved in this chapter you may be startled to find such an theorem winding up our proceedings. If so, you are, perhaps, allowing the geometrical picture of R to influence you too strongly. "Look," you real numbers look like may say, "obvious" "the Illl O ·--III 1 2 3 nx n+1 x is itself an integer)." so every number x is between two integers n, n + 1 (unless Basing the argument on a geometric picture is not a proof, however, and even the that if you place unit segments end-togeometric picture contains an assumption: end you will eventually get a segment larger than any given segment. This axiom, often omitted from a first introduction to geometry, is usually attributed (notquite corresponding and numbers, the that N is not property for justly) to Archimedes, real numbers. of the bounded, is called the Archimedian This property is not proper_; reference of of the Suggested Reading), although P1-P12 (see a consequence [17] of it does hold for Q, course. Once we have P13 however, there are no longer problems. any THEOREM 2 N is not bounded above. -/ PROOF Suppose N were bounded above. bound a for N. Then Since N a à n Ø, there would be a least upper for all n in N. Consequently, aën+1 since n + 1 is in N if n is in N. But this means that « and forallninN, - 1à n for all n in N, this means that a 1 is also an upper bound for N, contradicting the fact that the least upper bound. a is - 8. Least UpperBounds 137 of Theorem 2 often assumed implicitly. we have very There is a consequence THEOREM s PROOF For any e 0 there is a natural number > an (actually n with 1/n equivalent formulation) which < e. Suppose not; then 1/n > e for all n in N. Thus n < 1/s for all n in N. But this Theorem 2. means that 1/s is an upper bound for N, contradicting A brief glance through Chapter 6 will show you that the result of Theorem 3 was used in the discussion of many examples. Of course, Theorem 3 was not available at the time, but the examples were so important that in order to give them some cheating was tolerated. As partial justificationfor this dishonesty we can claim that this result was never used in the proof of a theoran,but if your faith has been shaken, a review of all the proofs given so far is in order. Fortunately, such deception will not be necessary again. We have now stated every property of the real numbers that we will ever need. Henceforth, no more lies. PROBLEMS 1. Find the least upper bound and the greatest lower bound (ifthey exist) of the following sets. Also decide which sets have greatest and least elements (i.e.,decide when the least upper bound and greatest lower bound happens to belong to the set). (i) 1 - : n (ii) n in N . 1 -:ninZandn¢0. . n (iii) (x : x = (iv) (x : 0 5 0 or x xs Ã = 1/n for some n in N}. and x is rational}. (v) {x:x2+x+1>0}. (vi) {x:x2+x--1<0}. (vii){x:x<0andx2+x-1<0}. 1 (viii) n --- 2. + (-1)" : n in N . (a) Suppose A ¢ Ø is bounded below. Let -A denote the set of all A. for x in Prove that -A ¢ Ø, that -A is bounded above, and that sup(-A) is the greatest lower bound of A. (b) If A ¢ Ø is bounded below, let B be the set of all lower bounds of A. Show that B ¢ 0, that B is bounded above, and that sup B is the greatest lower bound of A. -x - 3. Let f be a continuous function on (a) The with [a,b] with f (a) < 0 < f (b). proof of Theorem 1 showed that there is a smallest x in [a,b] f (x) 0. Is there necessarily a second smallest x in [a,b] with = l 38 Foundations 0? Show that there is a largest x in [a,b] with f (x) 0. (Try to give an easy proof by considering a new function closely related to f.) : a 5 upon consideration of A (b) The proof of Theorem 1 depended x]}. Give another proof of Theorem 1, x 5 b and f is negative on [a, which depends upon consideration of B : a 5 x 5 b and f (x) < 0}. Which point x in [a,b] with f (x) 0 will this proof locate? Give f (x) = = = = {x (x = an example where the sets A and B are not the same. *4. (a) Suppose Suppose numbers that is continuous on [a,b] and that f (a) f (b) 0. 0 for some xo in [a,b]. Prove that there are sc < xo < d 5 b such that f (c) f (d) 0, 0 for all x in (c,d). Hint: The previous problem can be used f = also that f (xo) > c and d with a but f (x) > to good advantage. (b) Suppose that f is continuous on there are numbers and 5. f (d) f = = = = that f (a) < f (b). Prove that d 5 b such that f(c) f(a) (d) for all x in (c,d). [a,b] and c and d with a 5 c and (b) f (a) < f (x) < f < = that y x > l. Prove that there is an integer k such that x < k < y. Hint: Let I by the largest integer satisfying I 5 x, and consider l + 1. (b) Suppose x < y. Prove that there is a rational number r such that x < then ny Why have parts > 1. r < y. Hint: If 1/n < y (Query: and until set?) this postponed problem (a) (b)been (c) Suppose that r < s are rational numbers. Prove that there is an irrational number between r and s. Hint: As a start, you know that there is an irrational number between 0 and 1. (d) Suppose that x < y. Prove that there is an irrational number between x and y. Hint: It is unnecessary to do any more work; this follows from (a) Suppose - -x, -nx (b)and (c). *6. A set A of real numbers is said to be dense if every open interval contains a point of A. For example, Problem 5 shows that the set of rational numbers and the set of irrational numbers are each dense. that if f is continuous and f (x) 0 for all numbers x in a dense 0 for all x. set A, then f (x) and g are continuous and f (x) g(x) for all x in a dense (b) Prove that if f g(x) for all x. set A, then f(x) that f (x) à g(x) for all x in A, show that f (x) à instead If (c) we assume all g(x) for x. Can à be replaced by > throughout? (a) Prove = = = = 7. Prove that if f is continuous and f (x + y) = f (x) + f (y) for all x and y, then there is c such that f (x) = cx for all x. (This conclusion can be demonstrated simply by combining the results of two previous problems.) Point of information: There do exist noncondnuousfunctions f satisfying this now; in (x + y) = (x) + (y) for all x and y, but we cannot a number f f f prove fact, this simple question involves ideas that are usually never mentioned any undergraduate course. The Suggested Reading contains references. in 8. Least UpperBounds 139 *8. Suppose that ure 6). (a) Prove that f is a function such that f (a) lim f (x) and lim x->a+ x->a- f (x) both f (b) whenever _< exist. a < b (Fig- Hint: Why is this problem in this chapter? (b) Prove that f never has a removable discontinuity (thisterminology comes from Problem 6-16). (c) Prove that if f satisfies the conclusions of the Intermediate Value Theorem, then f is continuous. *9. If f is a bounded function on [0,1], let |||fl|| sup{|f(x)| Prove analogues of the properties of || || in Problem 7-14. 10. Suppose a > 0. Prove that every number x can be written uniquely in the form x ku + x', where k is an integer, and 0 5 x' < a. = : x in [0,1]}. = FIGURE 11. 6 one side of P ¯¯¯¯~2;T¯Á¯ FIGURE 7 that ai, a2, a3, is a sequence of positive numbers with assi y an/2. Prove that for any e > 0 there is some n with a, < s. (b) Suppose P is a regular polygon inscribed inside a circle. If P' is the inscribed regular polygon with twice as many sides, show that the difference between the area of the circle and the area of P' is less than half the difference between the area of the circle and the area of P (useFigure 7). (c) Prove that there is a regular polygon P inscribed in a circle with area as close as desired to the area of the circle. In order to do part (c)you will need part (a).This was clear to the Greeks, who used part (a)as the basis for their entire treatment of proportion and area. By calculating the areas of polygons, this method ("themethod of exhaustion") allows computations of x to any desired accuracy; Archimedes used it to show < x < But it has far greater theoretical importance: that *(d) Using the fact that the areas of two regular polygons with the same number of sides have the same ratio as the square of their sides, prove that the areas oftwo circles have the same ratios as the square oftheir radii. Hint: Deduce a contradiction from the assumption that the ratio of the areas is greater, or less, than the ratio of the square of the radii by inscribing appropriate polygons. (a) Suppose ... ). / 12. Suppose that A and B are two nonempty for all x in A and all y in B. sets of numbers such that x 5 y for all y in B. (b) Prove that sup A 5 inf B. (a) Prove that 13. sup A sy Let A and B be two nonempty sets of numbers which are bounded above, and let A+B denote the set of all numbers x+y with x in A and y in B. Prove that sup(A+B) sup A+sup B. Hint: The inequality sup(A+B) 5 sup A+sup B is easy. Why? To prove that sup A + sup B 5 sup(A + B) it sufEces to prove that sup A + sup Bs sup(A + B) + e for all e > 0; begin by choosing x in A = 140 Foundations and y in B with sup A FIGURE I as 8 - I as x < e/2 and sup B I as of closed - y < s/2. I I I b, b: 61 intervals Ii [ai,bi], 12 Ë2, b2], Suppose that a, y a.+1 and bn+1 y b,, for all n (Figure 8). Prove that there is a point x which is in every I.. (b) Show that this conclusion is false if we consider open intervals instead of (a) Consider 14. a sequence = = . . . . closed intervals. The simple result of Problem 14(a)is called the "Nested Interval Theorem." It may be used to give alternative proofs of Theorems 1 and 2. The appropriate reasoning, outlined in the next two problems, illustrates a general method, called "bisection a *15. argument." Suppose f is continuous on [a,b] and f(a) < 0 < f(b). Then either f ((a + b)/2) 0, or f has different signs at the end points of the interval [a, (a + b)/2], or f has different signs at the end points of [(a + b)/2, b]. Why? If f((a + b)/2) ¢ 0, let 11 be one of the two intervals on which f changes sign. Now bisect II. Either f is 0 at the midpoint, or f changes sign on one of the two intervals. Let I2 be such an interval. Continue in this way, to define I, for each n (unless f is 0 at some midpoint). Use the Nested Interval Theorem to find a point x where f (x) 0. = = *16. Suppose f were continuous on [a,b], but not bounded on [a,b]. Then f would be unbounded on either [a, (a + b)/2] or [(a+b)/2, b]. Why? Let 11 of these Proceed as in Problem 15 intervals be one on which f is unbounded. to obtain a contradiction. 17. (a) Let A = {x: x < «}. Prove the following (theyare all easy): IfxisinAandy<x,thenyisinA. (ii) A ¢ 0. (iii) A ¢ R. (iv) If x is in A, then there is some number (i) (b) Suppose, conversely, that A satisfies sup A}. *18. (i) iv). x' in A such that x Prove that A = {x : < x x'. < A number x is called an almost upper bound for A if there are only with numbers A in fmitely many y y > x. An almost lower bound is defined similarly. upper bounds and almost lower bounds of the sets in Problem 1. (b) Suppose that A is a bounded infinite set. Prove that the set B of all almost upper bounds of A is nonempty, and bounded below. (a) Find all almost 8. Least UpperBounds 141 inf B exists; this number is called the limit superior of A, and denoted by lim A or lim sup A. Find lim A for each set A in Problem 1. (c) It follows from part (b)that (d) Define lim A, and *19. find it for all A in Problem 1. If A is a bounded infinite set prove (a) lim A 5 E (b) E A. A 5 sup A. (c) If E A < sup A, then A contains a largest element. of parts (b)and (c)for lim. The analogues (d) I I shadow points FIGURE *20. 9 Let f be a continuous function on R. A point x is called a shadow point of f if there is a number y > x with f(y) > f(x). The rationale for this terminology is indicated in Figure 9; the parallel lines are the rays of the sun rising in the east (youare facing north). Suppose that all points of (a, b) are shadow points, but that a and b are not shadow points. (a) For x in (a, b), prove that f (x) sf (b). Hint: Let A {y and f(x) 5 f(y)}. If supA were less than b, then supA = : x 5 y 5 b would be a shadow point. Use this fact to obtain a contradiction to the fact that b is not a shadow point. Now (b) prove that f (a) 5 f (b). (This is a simple consequence of continuity.) (c) Finally, using the fact that a is not a shadow point, prove that f(a) = f (b). This is known as the Rising Sun Lemma. Aside from serving as a good illustration of the use of least upper bounds, it is instrumental in proving several beautiful theorems that do not appear in this book; see page 443. result l 42 Foundations APPENDIX. UNIFORM CONTINUITY "foundations," Now that we've come to the end of the it might be appropriate to slip in one further fundamental concept. This notion is not used crucially in the rest of the book, but it can help clarify many points later on. x2 is continuous at We know that the function f (x) a for all a. In other = words, y a= if a is any number, then such that, for all x, if |x _ for every e - al < 0 there is some 8 > 8, then ]x2 > 0 2 Of course, 8 depends on e. But 8 also dependson a-the 8 that works at a might not work at b (Figure 1). Indeed, it's clear that given e > 0 there is no one 8 > 0 that works for all a, or even for all positive a. In fact, the number a + 8/2 will certainly satisfy |x a| < ô, but if a > 0, then + e - a for FIGURE a and this won't be < e once a tational way of saying that f > e/8. (This is just an admittedly confusing compu- is growing faster and faster!) On the other hand, for any e > 0 there will be one 8 > 0 that works for all a in any interval [-N, N]. In fact, the 8 which works at N or -N will also work everywhere else in the interval. As a final example, consider the function f(x) sin 1/x, or the function whose graph appears in Figure 18 on page 62. It is easy to see that, so long as e < 1, there will not be one 8 > 0 that works for these functions at all points a in the open interval (0, 1). These examples illustrate important distinctions between the behavior ofvarious continuous functions on certain intervals, and there is a special term to signal this distinction. 1 = DEFINITION continuous The function f is uniforrnly on an interval A if for every e all and such there is some ô > 0 that, for x y in A, > 0 if |x-y|<8,then|f(x)-f(y)|<e. We've seen that a function can be continuous on the whole line, or on an open interval, without being uniformly continuous there. On the other hand, the function f (x) x2 did turn out to be uniformly continuous on any closed interval. the same sort of thing that occurs when we This shouldn't be too surprising-it's ask whether a function is bounded on an interval-and we would be led to suspect that any continuous function on a closed interval is also uniformly continuous on that interval. In order to prove this, we'll need to deal first with one subtle point. Suppose that we have two intervals [a,b] and [b,c] with the common endpoint b, and a function f that is continuous on [a,c]. Let e > 0 and suppose that = 8, Appendix.UnformContinuig 143 the following two statements hold: if x and y are in if x and y are in (i) (ii) [a,b] and |x [b,c] and |x y I < 81, then - yl -- < 82, then |f (x) f (y)| < |f (x) f (y)| < -- - e, e. We'd like to know if there is some & > 0 such that |f (x) f (y)| < e whenever x and y are points in [a,c] with x y < 8. Our first inclination might be to choose ô as the minimum of ô1 and ô2. But it is easy to see what goes wrong (Figure 2): we might have x in [a,b] and y in [b,c], and then neither (i)nor (ii) tells us anything about |f (x) f (y)|. So we have to be a little more cagey, and - < a x & a - > c - FIGURE also use continuity 2 LEMMA of f at b. Let a 0, and (ii)hold. Then there is a 8 > 0 such that, (i)and [a,c] and at b, there |x y| - is a ôs < 8, then jf (x) f (y)| < - e. 0 such that, > -b| if |x 83, then < f(x) -- f(b) < . It follows that (iii) if |x - b| 83 and < |y b| < - 83, then |f (x) - f (y)| < e. Choose 8 to be the minimum of ôi, 82, and 83. We claim that this ô works. In fact, suppose that x and y are any two points in [a,c] with |x y| < 8. If x and y are both in [a,b], then |f (x) f (y)| < e by (i);and if x and y are both in [b,c], then |f (x) f (y)| < e by (ii).The only other possibility is that - - - x<b<y In either case, since [x yl f (x) f (y)| < e by (iii). - y<b<x. or ô, we also have < |x - b| < 8 and |y - b < 8. So - THEOREM 1 PROOF If f is continuous on [a,b], then f is uniformly continuous on [a,b]. It's the usual trick, but we've got to be a little bit careful about the mechanism of the proof. For e > 0 let's say that f is e-good on [a,b] if there is some ô > 0 such that, for all y and z in [a,b], if |y - z| < ô, then |f (y) f (z)| < e. f is e-good on [a,b] for all e -- Then we're trying to prove that Consider any particular e > 0. Let A = {x: a 5 x 5 b and f > 0. is e-good on [a,x]}. above (byb), so A has a least Then A ¢ 0 (sincea is in A), and A is bounded really should write since and might A œ œ. depend on e. But We upper bound as, b, no matter what e is. we won't since we intend to prove that œ = 144 Foundations Suppose that we had a < b. Since f is continuous at a, there is some ho > 0 if |y a l < So, then |f (y) f (a)| < e/2. Consequently, if |y a| < Bo and |z a| < So, then |f (y) f(z)| < s. So f is surely e-good on the interval [a ôo, a + ôo]. On the other hand, since a is the least upper bound of A, it is also clear that f is e-good on [a,a Bo]. Then the Lemma implies that f is e-good on [a,a + bo], so a + Bo is in A, contradicting the fact that a is an upper bound. b is actually in A. The To complete the proof we just have to show that œ this practically the continuous for is Since is at b, there is some same: argument f E/2. then < < > 0 such that, if |b 80, y| So f is e-good on |f(y) f(b)| ao [b ôo, b]. But f is also e-good on [a,b bo], so the Lemma implies that f is e-good on [a,b]. such that, - - - - - -- -- = - - - - PROBLEMS 1. following values of a is the function f (x) x" uniformly continuous on [0,oo): a = 1/3, 1/2, 2, 3? (b) Find a function f that is continuous and bounded on (0, 1], but not uniformly continuous on (0, 1]. Find a function f that is continuous and bounded on [0,oo) but which (c) is not uniformly continuous on [0,oo). 2. (a) Prove that (a) For which of the if (b) Prove that if f = and g are uniformly continuous on A, then so is f + g. f and g are uniformly continuous and bounded on A, then uniformly continuous on A. is fg that this conclusion does not hold if one of them isn't bounded. Show (c) (d) Suppose that f is uniformly continuous on A, that g is uniformly continuous on B, and that f (x) is in B for all x in A. Prove that gof is uniformly "bisection continuous argument" on A. (page140) to give another 3. Use a 4. Derive Theorem 7-2 as a consequence of Theorem 1. proof of Theorem 1. PART DERIVATIVES AND INTEGRALS In 1604, al the heightof his scienttßc career, Galileoargued that fora rectilinear motion in rchich sped increasesproportionally Ie distana caeered, the law of motion should be just thal (x = d2) which he had discowred in theinvestigationof faRing bodies. Between 1695 and 1700 not a single ene of the monthly issues of Leiþng's Acta Eruditaram was without artides of Leibniz, þublished the ßernoulli brothers or the Marquis de (W&pitaltreating, with natalfon ody slightly dyerent from that which uw use leday, the most varied poblemsof calculus, dŒtrential ivitegralcalculus and the calculus of variations. Thus in the space of almost precisely one century infmitesimalcakulus er, as we new call it in English, The Cakulus, the calculating toolpar exceRenœ, had beenforged; and nearly three cenlaries of constant use ham not completdy dulled this incomparable instrument. N[CROLAS BOEmHAKI DERIVATIVES CHAPTER The derivative of a function is the first of the two major concepts of this section. Together with the integral, it constitutes the source from which calculus derives its particular flavor. While it is true that the concept of a function is fundamental, that you cannot do anything without limits or continuity, and that least upper bounds are essential, everything we have done until now has been preparation-if adequate, this section will be easier than the preceding ones-for the really exciting ideas to come, the powerflu concepts that are truly characteristic of calculus. would say the interest of the ideas to be introduced Perhaps (some in this section stems from the intimate connection between the mathematical concepts and certain physical ideas. Many definitions, and even some theorems, may be described in terms of physical problems, often in a revealing way. In fact, the demands of physics were the original inspiration for these fundamental ideas of calculus, and we shall frequently mention the physical interpretations. But we shall always first define the ideas in precise mathematical form, and discusstheir significance in terms of mathematical problems. The collection of all functions exhibits such diversity that there is almost no hope of discovering any interesting general properties pertaining to all. Because continuous functions form such a restricted class, we might expect to fmd some nontrivial theorems pertaining to them, and the sudden abundance of theorems after Chapter 6 shows that this expectation is justified.But the most interesting will results and most powerful about functions be obtained only when we restrict which have even greater claim to be called our attention even further, to functions which are even better behaved than most continuous functions. "certainly") "reasonable," j(x) = Ex|,x > o f(x)=x',x<0 f(x) lx\ = F I GUREt (a) (c) (b) which continuous Figure 1 illustrates certain types of misbehavior functions can unlike the graph of 0), display. The graphs of these fimetions are at (0, each where point. The quotation line" at Figure 2, it is possible to draw a marks have been used to avoid the suggestion that we have defined or "bent" "tangent "bent" FIGURE 2 147 148 Derivativesand Integrals "bent" "tangent line," although we are suggesting that the graph might be at a where already noticed probably line" cannot be drawn. You have point a that a tangent line cannot be defined as a line which intersects the graph only uch a definition would be both too restrictive and too permissive. With onc such a definition, the straight line shown in Figure 3 would not be a tangent line to the graph in that picture, while the parabola would have two tangent lines at each point (Figure 4), and the three functions in Figure 5 would have more than one tangent line at the points where they are "tangent "bent." FIGURE f(x) 3 = x (a) FIGURE FIGURE 4 (b) (c) 5 A more promising approach to the definition of a tangent line might start with lines," and use the notion of limits. If h 0, then the two distinct points (a, f (a)) and (a + h, f (a + h)) determine, as in Figure 6, a straight line whose slope is f (a + h) f (a) ·ý "secant - h :a k h, f(4 + h)) I I I f(a + h) - f(a) I FIGURE 6 As Figure 7 illustrates, the some sense, of these talked about a "limit" "tangent line" at (a,f (a)) seems to be the limit, in lines," as h approaches 0. We have never before of lines, but we can talk about the limit of their slopes: the "secant 9. Derivatives 149 slope of the tangent line though (a, f (a)) should be (a + h) f (a) lim f - h hoo We are ready for a definition, and some comments. DEFINITION The function f is differentiable f (a + h) . hm if at a f (a) - h hoo . exists. In this case the limit is denoted byf'(a) and is called the derivative of f at a. (We also say that f is differentiable if f is differentiable at a for every a in the domain of f.) on our definition is really an addendum; we define the of f at (a, f(a)) to be the line through (a,f(a)) line the graph tangent to with slope f'(a). This means that the tangent line at (a, f(a)) is defined only if f is differentiable at a. The second comment refers to notation. The symbol f'(a) is certainly reminiscent of functional notation. In fact, for any function f, we denote by f'the function whose domain is the set of all numbers a such that f is differentiable The first comment I FIGURE 7 (a,f(a)) at a, and whose value at such a number a h) f (a + is the collection of all pairs f' + h) lim f (a ( a, for which lim [f (a + h) h->0 of f (a) - h A-+o (To be very precise: is -- f (a) h hoo f (a)]|h -- exists.) The function f' is called the derivative f. Our third comment, somewhat longer than the previous two, refers to the physical interpretation of the derivative. Consider a particle which is moving along a straight line (Figure 8(a)) on which we have chosen an point O, and a direction,in which distances from O shall be written as positive numbers, the distance from 0 of points in the other direction being written as negative numbers. Let s (t) denote the distance of the particle from O, at time t. The suggestive notation s(t) has been chosen purposely; since a distance s(t) is determined for each "origin" motion t=5 i of the particle t=4 8(a) t=1 i-2 t=3 a e0 FIGURE t=0 a I I I line along which particle is moving 150 Derivativesand Integrals t, the physical situation automatically supplies us with a certain function s. The graph of s indicates the distance of the particle from O, on the vertical axis, in terms of the time, indicated on the horizontal axis (Figure 8(b)). number "distance" graph of s The quotient s(a+ I I "time" FIGURE 8(b) h) s(a) - h I "average velocity" of the particle has a natural physical interpretation. It is the time during the interval from a to a + h. For any particular a, this average speed depends on h, of course. On the other hand, the limit s(a+h) - s(a) h hoo depends only on a (aswell as the particular function s) and there are important physical reasons for considering this limit. We would like to speak of the of the particle at time a," but the usual defmition of velocity is really a definition of average velocity; the only reasonable at time a" (so-called definition of velocity") is the limit "velocity "velocity "instantaneous . lun hoo s(a + h) h - s(a) of the particle at a to be s'(a). Thus we defmethe (instantaneous)velocity s'(a) could easily be negative; the absolute value |s'(a) is sometimes Notice that speed. called the (instantaneous) realize that instantaneous velocity is a theoretical concept, It is important to which does not correspond precisely to any observable quantity. an abstraction While it would not be fair to say that instantaneous velocity has nothing to do with average velocity, remember that s'(t) is not s(t + h) -- s(t) h for any particular h, but merely the limit of these average velocities as h approaches 0. Thus, when velocities are measured in physics, what a physicist really measures is an average velocity over some (verysmall) time interval; such a procedure cannot be expected to give an exact answer, but this is really no defect, because physical measurements can never be exact anyway. velocity of a particle is often called the of change of its position." This The notion of the derivative, as a rate of change, applies to any other physical situation of change of in which some quantity varies with time. For example, the where of mass" of a growing object means the derivative m(t) is the function m, "rate "rate the mass at time t. In order to become familiar with the basic definitions of this chapter, we will spend quite some time examining the derivatives of particular functions. Before proving the important theoretical results of Chapter 11, we want to have a good idea of what the derivative of a function looks like. The next chapter is devoted exclusively to one aspect of this problemaalculating the derivative of complicated functions. In this chapter we will emphasize the concepts, rather than the 9. Denvatives 151 calculations, by considering a few simple examples. function, f (x) c. In this case Simplest of all is a constant = f (a + h) f (a) - = h hw0 lim c c - = h hoo 0. 0. This means that Thus f is difTerentiable at a for every number a, and f'(a) the tangent line to the graph of f always has slope 0, so the tangent line always = coincides with the graph. Constant functions are not the only ones whose graphs coincide with their tangent lines-this happens for any linear function f (x) cx + d. Indeed = f'(a) lim f(a+h)h = f(a) h->0 c(a + h) +d h hoo ch lim = [ca+d] - = - h h-vo c; the slope of the tangent line is c, the same as the slope of the graph of A refreshing difference occurs for f (x) x2. Here f. = f'(a) = + h) lim f (a h (a + h)2 h-0 slope 2. f(x) - = x2 (a, al ) lim a2 + - a2 h h->0 = a2 - h 2ah + h2 ho = f (a) - lim 2a + h h->0 = FIGURE 9 2a. Some of the tangent lines to the graph of f are shown in Figure 9. In this picture each tangent line appears to intersect the graph only once, and this fact can be checked fairly easily: Since the tangent line through (a, a2) has slope 2a, it is the graph of the function g(x)=2a(x-a)+a2 2ax = Now, if the graphs of f and g intersect at a point (x, f(x)) x2 or so or a2. - = x2 - 2ax a2 2ax + a2 - = 0; (x-a)2=0 x = a. In other words, (a, a2) is the only point of intersection. = (x, g(x)), then 152 Denvativesand Integrals The function f (x) x2 happens to be quite special in this regard; usually a tangent line will intersect the graph more than once. Consider, for example, the function f (x) x3. In this case = = f'(a) f(a+h)- f(a) lim = h h)3 (a + h->o lim = G3 - h h->0 a3+3a2h+3ahi+h3-a3 =lim h 3a2h + 3ah2 + h3 h->0 h lim 3a2 + 3ah + h2 h-vo =hm = . hw0 3a2. = Thus the tangent line to the graph of f at (a, a3) has slope 3a2. This means that the tangent line is the graph of g(x) 3a2(x a) + a3 3a2x 2a3. = - = The graphs of f - and g intersect at the point (x, f(x)) = (x,g(x)) when -2a3 x3 x3 or y2x 3a2x + 2a3 _ - 0. = This equation is easily solved if we remember that one solution of the equation has got to be x a, so that (x a) is a factor of the left side; the other factor can then be found by dividing. We obtain = - (x f(x) - slope 3a' -2a (a,a') x= It so happens that x2 + ax - (x - a) (x2+ ax 2a2 also has x - a)(x -- = 0. a as a factor; we obtain finally = 0. Thus, as illustrated in Figure 10, the tangent line through (a, a3) also intersects the graph at the point (-2a, These two points are always distinct,except when a 0. We have already found the derivative of sufficiently many functions to illustrate the classical, and still very popular, notation for derivatives. For a given function f, the derivative f' is often denoted by -8a3). df(x) dx 10 -- a)(x +2a) = FIGURE 2a2) - For example, the symbol dx2 9. Derivatives 153 denotes the derivative of the function f (x) parts of the expression df (x) dx = x2. Needless to say, the separate are not supposed to have any sort of independent existence--the d's are not numbers, they cannot be canceled, and the entire expression is not the quotient of two and other numbers This notation is due to Leibniz (generally considered an independent co-discoverer of calculus, along with Newton), and is affectionately referred to as Leibnizian notation.* Although the notation df (x)/dx seems very complicated, in concrete cases it may be shorter; after all, the symbol dx2/dx is actually more concise than the phrase derivative of the function "df(x)" "dx." "the f(x) = x2." The following formulas state in standard that we have found so far: Leibnizian de -- all the information 0, = dx d(ax + b) notation = a, dx dx2 -=2x, dx dx3 = - 3x2 dx Although the meaning of these formulas is clear enough, attempts at literal interpretation are hindered by the reasonable stricture that an equation should not contain a function on one side and a number on the other. For example, if the third equation is to be true, then either df(x)/dx must denote f'(x), rather than f', or else 2x must denote, not a number, but the function whose value at x is 2x. It is really impossible to assert that one or the other of these alternatives is intended; in practice df(x)/dx sometimes means f' and sometimes means f'(x), while 2x may denote either a number or a function. Because of this ambiguity, most authors are reluctant to denote f'(a) by df (x) dx (a); instead f'(a) is usually denoted by the barbaric, but unambiguous, df symbol (x) dx *Leibniz was led to this symbol by his intuitive notion of the derivative, which he considered to be, of this quotient when h is an quotients [f (x + h) f(x)]/ h, but the small" quantity was denoted by dx and the corresponding "infinitely This "infinitely small" difference (x+dx)- (x) by df (x). Although this point of view is impossible to reconcile with f f properties (Pl) Pl3) of the real numbers, some people find this notion of the derivative congenial. not the limit of small" number. "value" -- "infinitely 154 Derivadvesand Integrals is associated with one more to these difficulties, Leibnizian notation ambiguity. Although the notation dx2/dx is absolutely standard, the notation df(x)/dx is often replaced by df/dx. This, of course, is in conformity with the of confusing practice a function with its value at x. So strong is this tendency that the fimction functions are often indicated by a phrase like the following: x2." We will sometimes follow classical practice to the extent of using y y carefully distinguish between as the name of a function, but we will nevertheless the function and its values-thus the we will always say something like x2." = y(x) function by) In addition "consider = "consider (defined Despite the many ambiguities of Leibnizian notation, it is used almost exclusively in older mathematical writing, and is still used very frequently today. The staunchest opponents of Leibnizian notation admit that it will be around for quite some time, while its most ardent admirers would say that it will be around forever, and a good thing tool In any case, Leibnizian notation cannot be ignored completely. The policy adopted in this book is to disallow Leibnizian notation within the text, but to include it in the Problems; several chapters contain a few (immediately recognizable) problems which are expressly designed to illustrate the vagaries of Leibnizian notation. Trusting that these problems will provide ample practice in this notation, we return to our basic task of examining some simple examples of derivatives. The few functions examined so far have all been differentiable. To fully appreciate the significance of the derivative it is equally important to know some examples of functions which are not differentiable. The obvious candidates are the three functions first discussed in this chapter, and illustrated in Figure 1; if they turn out to be differentiable at 0 something has clearly gone wmng. Consider first f (x) = |x|. In this case f(0+h) f(0) -- [h| h h -1 Now |h|/h = 1 for h > 0, and |h||h for h = 0. This shows that < f (h) f (0) does not - lim hoo exist. h In fact, (h) lim f hoo+ - f (0) h f (h) f (0) - and . lun hoo- -1. = h (These two limits are sometimes called the right-hand hand derivative, respectively, of f at 0.) derivative and the left- 9. Denvatives 155 If a ¢ 0, then f'(a) does exist. I In fact, f'(x) f'(x) 1 = if x if x -1 = 0, 0. > < The proof of this fact is left to you (itis easy if you remember the derivative of a linear function). The graphs of f and of f' are shown in Figure 11. For the function x2, f(x)= f <0 x x>0, x, a similar difficulty arises in connection with f'(0). We have h2 FIGURE 11 h, = - f (h) f (0) - h h h h h < 0 -=1, h>0. Therefore, hm f(h) f(0) - . (h) lun f f (0) - . but Thus f'(0) does not exists for x † Rit h->o+ =0, h h-+0- = h 1. exist; f is not differentiableat 0. Once again, however, f'(x) is easy to see that f'(x) 2x, 1, x x f = < > 0 0. The graphs of f and f' are shown in Figure 12. For this function Even worse things happen for f(x) = f. & 1 h d' --=- FIGURE 12 f(h) f(0) - _ ¯ h B h In this case the right-hand - -1/¾ what's FIGURE 13 more, (Figure 13). 1 h - N' < 0. limit f (h) f (0) does not exist; instead h>0 1/J 1 becomes arbitrarily large as h approaches 0. And, becomes arbitrarily large in absolute value, but negative 156 Derivativesand Integrals The function f (x) W, although better behaved than this. The quotient f (h) f (0) -----....... - diEerentiable at 0, is at least a little not = hij3 W 1 1 simply becomes arbitrarily large as h goes to 0. Sometimes one says that f has derivative at 0. Geometrically this means that the graph of f has a an line" which is parallel to the vertical axis (Figure 14). Of course, f (x) "infinite" --..... "tangent has the same geometric property, but one would say that f has a derivative infinity" at 0. Remember that differentiability is supposed to be an improvement over mere continuity. This idea is supported by the many examples of functions which are continuous, but not differentiable; however, one important point remains to be - of FIGURE = 14 N "negative noted: THEOREM 1 If f is differentiable at a, then f is continuous at a. PROOF lim f (a + h) hw0 - f (a) = lim f (a + h (a + h) h hoo = h) lim f hoo = f'(a) 0 = 0. - - f (a) - h f (a) lim h - ha0 - As we pointed out in Chapter 5, the equation lim f (a+h)- f (a) = 0 is equivalent h->0 to lim f (x) = f (a); thus f is continuous at a. It is very important to remember Theorem 1, and just as important to remember that the converse is not true. A differentiable function is continuous, but a continuous function need not be differentiable (keep in mind the function f (x) |x|, which and you will never forget statement is true and which false). The continuous functions examined so far have been differentiable at all points with at most one exception, but it is easy to give examples of continuous functions which are not differentiable at several points, even an infinite number (Figure 15). Actually, one can do much worse than this. There is a function which is contmuous = FIGURE 15 9. Derivatives 157 FIGURE (a) (b) (c) (d) 16 dgiventiablenowhee! Unfortunately, the definition of this function will be inaccessible to us until Chapter 24, and I have been unable to persuade the artist to draw it (consider carefully what the graph should look like and you will with sympathize her point of view). It is possible to draw some rough approximations to the graph, however; several successively better approximations are shown in Figure 16. must be postponed, Although such spectacular examples of nondifferentiability with which continuous ingenuity, find function is not differentiable a we can, a little all such ofwhich are in [0, 1]. One function is illustrated in at infmitely many points, Figure 17. The reader is given the problem of defining it precisely; it is a straight line version of the function everywhere and ,r ' ,7' , ,/ -1 -I / s # 1 xsin-, 0, FIGURE 17 1 x¢0 x = 0. This particular function f is itself quite sensitive to the question of differentiability. Indeed, for h ¢ 0 we have 158 Derivativesand Integrals h sin 1 - --- 0 I h h h Long ago we proved that lim sin 1/h does not exist, so f is not differentiable at 0. h->0 Geometrically, one can see that a tangent line cannot exist, by noting that the secant line through (0, 0) and (h, f (h)) in Figure 18 can have any slope between and 1, no matter how small we require h to be. f (h) f (0) - . sin -. -1 x=0 0, FIGURE 18 This finding represents something of a triumph; although continuous, the funcand we can now enunciate one mathtion f seems somehow quite unreasonable, ematically undesirable feature of this function-it is not differentiable at 0. Nevabout the criterion of differenertheless, one should not become too enthusiastic example, tiability. For the function x2sin-, 8(x) = 1 0, is differentiable at 0; in fact g'(0) lim h-wo g(h) X fÛ x = x = 0: = h lim h->o = = - h h lim h sin ha0 The tangent line to the graph of g at 1 h2 sin g(0) - 0 1 - h 0. (0,0) is therefore the horizontal axis (Fig- 19). This example suggests that we should seek even more restrictive conditions on a function than mere differentiability. We can actually use the derivative to formulate such conditions if we introduce another set of defmitions, the last of this chapter. ure 9. Derivatives 159 / / FIGURE f(x) = x x=0 0 19 For any function f, we obtain, by taking the derivative, a new function f' (whose domain may be considerably smaller than that of f). The notion of differentiability can be applied to the function f', of course, yielding another function (f')', whose domain consists of all points a such that f' is differentiable at a. The function (f')' is usually written simply f" and is called the second derivative of f. If f"(a) exists, then f is said to be 2-times differentiable at a, and the number f"(a) is called the second derivative of f at a. In physics the second derivative is particularly important. If s(t) is the position at time t of a particle moving along a straight line, then s"(t) is called the acceleration at time t. Acceleration plays a special role in physics, because, as stated in Newton's laws of motion, the force on a particle is the product of its mass and its acceleration. Consequently you can feel the second derivativewhen you sit in an accelerating car. There is no reason to stop at the second derivative-we can define f"' = (f"y, f"" = (f'")', etc. This notation rapidly becomes unwieldy, so the following abbreviation is usually adopted is really a recursive definition): (it f 0) f f(***) U(k) r ' = , -- Thus JU) fí2) f* f (4) - - = _ J' f" f')' f"'= (f") ur peu - _ , i etc. The various functions f(k), for k ;> 2, are sometimes called higher-order derivatives of f. Usually, we resort to the notation f (k) only for k > 4, but it is convenient to have f f*) defined for smaller k also. In fact, a reasonable definition can be made for f(°),namely, f(0) f - 160 Denvativesand Integrals Leibnizian notation for higher-order derivatives should Leibnizian symbol for f"(x), namely, also be mentioned. The natural (df(x) d is abbreviated ' to d2f(x) -x, f(x) dx dx ¯ (dx)2 d2f(x) dx2 frequently to or more , Similar notation is used for f(*) xL The following example illustrates the notation f(k),and also shows, in one very simple case, how various higher-order derivatives are related to the original funcx2. Then, tion. Let f(x) as we have already checked, = (a) f'(x) f"(x) f'(x) = f 2x X 2x, 2, Û, = = - f*(x) if k 0, = 3. > Figure 20 shows the function f, together with its various derivatives. A rather more illuminating example is presented by the following function, whose graph is shown in Figure 21(a): / (b) x2, f (x) f"(x) - 2 x > x < if a if a > -x , 0 0. It is easy to see that f'(a) 2a = --2a f'(a) = 0, 0. < Moreover, (h) hm f f (0) - . f'(0) (e) = h-+o h lim f(h) h = -. h->o Now f (x) - 0, k > 3 (h) lim f hoo+ h h2 . hm = --- 0 = - h how -h2 and (d) FIGURE 20 f (h) . hm . hm = h h-+0- = - h h--o- SO lim f (h) 0. h all summarized information be This as follows: can J'(0) = = - h->o f'(x) = 2 |x|. 0, 9. Derivatives 161 f(x) It follows that f"(0) does not exist! Existence of the second derivative is thus a looking" function strong criterion for a function to satisfy. Even a when reveals examined with second the like f derivative. This some irregularity suggests that the irregular behavior of the function "smooth rather = x2 sin 1 -, g(x) = x 0, x x ¢0 = 0 be revealed by the second derivative. At the moment we know that g'(0) = 0, but we do not know g'(a) for any a ¢ 0, so it is hopeless to begin computing g"(0). We will return to this question at the end of the next chapter, after we have perfected the technique of finding derivatives. might (a) also PROBLEMS 1. J'(x) = 2|x| directly from the definition, that if f (x) = 1/x, then Î |a2, for a ¢ 0. j'(a) that the tangent line to the graph of f at (a, 1/a) does not intersect (b) Prove the graph of f, except at (a, 1/a). (a) Prove, working - - -2/a3 2. 1/x2, then f'(a) for a ¢ 0. 1/a2) that the Prove at line intersects tangent to f at one other f (a, (b) vertical axis. which the opposite side of the lies point, on 3. Pove that if f (x) J, then f'(a) 1/(24), for a > 0. (The expression f (a)]/h will require some algebraic face lifting, you obtain for [f (a + h) but the answer should suggest the right trick.) (b) f"(x) 2, x > o = (a) Prove that if f(x) = = = = - 4. 1, For each natural number n, let S.(x) = x". Remembering that Si'(x) 3X2, conjecture formula S.'(x). S2'(x) for Prove 2x, and S3'(x) = a your conjecture. (The expression (x + h)" may be expanded by the binomial = = --2, I"(x) theorem.) x < 0 - (c) FIGURE 21 5. Find 6. Prove, starting from the definition f' if f (x) [x]. = (anddrawing a picture to illustrate): f (x) + c, then g'(x) f'(x); cf (x), then g'(x) cf'(x). 7. Suppose that f (x) x3 (a) What is f'(9), f'(25), f'(36)? f'(629 (b) What is f'(32), f'(52), r 2 f'(a2 What is (c) (a) if g(x) (b) if g(x) = = = = = If you do not find this problem silly, you are missing a very important point: f'(x2) means the derivative of f at the number which we happen to be calling x2; it is not the derivative at x of the function g(x) f(x2). Justto drive the point home: = (d) For f(x) = x3, compare f'(x2) and g'(x) where g(x) = f(x2L 162 Derivativesand Integrals 8. definition) that g'(x) c). picture this. this problem you must illustrate To do Draw + to a f'(x write out the defmitions of g'(x) and c) correctly. The purpose f'(x + of Problem 7 was to convince you that although this problem is easy, it is not an utter triviality, and there is something to prove: you cannot simply put prime marks into the equation g(x) f(x + c). To emphasize this (a) Suppose g(x) from the f(x+c). Prove (starting == = = point: (b) Prove that if g(x) f(cx), then g'(x) c. J'(cx). Try to see pictorially why this should be true, also. (c) Suppose that f is differentiable and periodic, with period a (i.e., f (x + a) f (x) for all x). Prove that f' is also periodic. = = = 9. Find or and also f'(x) you will surely f'(x + 3) in the following cases. Be very methodical, Consult the answers (afteryou do the slip up somewhere. problem, naturally). (i) f (x) = (ii) f (x + (iii) f (x + (x + 3)*. 3) 3) = x5. = (x + 5)'. 10. Find f'(x) if f(x) be the same. 11. Galileo was wrong: if a body falls a distance s(t) in t seconds, and s' is proportional to s, then s cannot be a function of the form s(t) = ct2. (b) Prove that the following facts are true about s if s(t) (a/2)t2 (thefirst fact will show why we switched from c to a/2): g(t + x), and if = f(t) = g(t + x). The answers will not (a) Provethat = (i) s"(t) = a (ii) [s'(t)]2 = (theacceleration is constant). 2as(t). (c) If s is measured in feet, the value of a is 32. How many seconds do you have to get out of the way of a chandelier which falls from a 400-foot ceiling? If you don't make it, how fast will the chandelier be going when it hits you? Where was the chandelier when it was moving with half that speed? 12. Imagine a road on which the speed limit is specified at every single point. In other words, there is a certain function L such that the speed limit x miles from the beginning of the road is L(x). Two cars, A and B, are driving along this road; car A's position at time t is a(t), and car B's is b(t). equation expresses the fact that car A always travels at the speed limit? (The answer is not a'(t) L(t).) always that speed limit, and that B's position at A the Suppose at (b) goes time t is A's position at time t 1. Show that B is also going at the speed limit at all times. (c) Suppose B always stays a constant distance behind A. Under what conditions will B still always travel at the speed limit? (a) What = - 9. Derivatives 16 3 13. g(a) and that the left-hand derivative of f at a equals Suppose that f(a) the right-hand derivative of g at a. Define h(x) f(x) for x 5 a, and h(x) g(x) for x > a. Prove that h is differentiable at a. = = = 14. *15. Let f (x) x2 if x is rational, and f (x) 0 if x is irrational. Prove that f is differentiable at 0. (Don't be scared by this function. Justwrite out the definition of /'(0).) = = be a function such at |f (x)| 5 x2 for all x. Prove that f is differentiable at 0. (If you have done Problem 14 you should be able to do this.) result can be generalized if x2 is replaced by Ig(x), where g has This (b) what property? (a) Let f 1. If f satisfies f(x)| 5 x|a, prove that f is differentiable at 0. 16. Let 17. Let 0 < ß < 1. Prove that if is not differentiable at 0. *18. œ > f satisfies f (x) > |x and f (0) 0, then = f Let f (x) 0 for irrational x, and 1/q for x p/q in lowest terms. Prove that f is not differentiable at a for any a. Hint: It obviously suffices to prove this for irrational a. Why? If a is the decimal expansion m.aia2a3... of a, consider [f (a + h) f (a)]/h for h rational, and also for = = = - -0.00...0a h 19. ay2.... h(a), that f(x) 5 g(x) 5 f(a) g(a) that g is differentiable h'(a). Prove f'(a) that h(x) for all x, at a, and that with of the definition g'(a).) f'(a) g'(a) h'(a). (Begin the conclusion omit the hypothesis that not does follow if Show we (b) f (a) g(a) h(a). (a) Suppose and that = = = = = = 20. = = Let f be any polynomial function; we will see in the next chapter that f is differentiable. The tangent line to f at (a, f(a)) is the graph of g(x) f'(a)(x a) + f(a). Thus f(x) g(x) is the polynomial function d(x) if f(x) x2, then f(x) f'(a)(x a) f(a). We have already seen that d(x) (x a)2, and if f(x) x3, then d(x) (x a)2(x + 2a). = - - - = = - = - - = = - d(x) when f (x) x4, and show that it is divisible by (x a)2. (b) There certainly seems to be some evidence that d(x) is always divisibleby Figure 22 provides an intuitive argument: usually, lines parallel (x to the tangent line will intersect the graph at two points; the tangent line interSects the graph only once near the point, so the intersection should be a intersection." To give a rigorous proof, first note that (a) Find = - -a)2. FIGURE 22 "double d(x) f (x) f (a) - = x-a - f'(a). x-a Now answer the following questions. Why is f (x) f (a) divisible by (x a)? Why is there a polynomial function h such that h(x) - - 164 Derivativesand Integrals a) for x ¢ a? Why is lim h(x) d(x)/(x does this solve the problem? - 0? Why is h(a) = 0? Why = --a). 21. (a) Show that f'(a) lim[f(x) (b) Show that derivatives are a = "local some open interval containing in computing f'(a), you can course you can't ignore *22. that (a) Suppose f(x) (Nothing deep here.) f(a)]/(x - property": if f(x) g'(a). a, then f'(a) g(x) for all x in (This means that ignore f (x) for any particular x ¢ a. Of for all such x at once!) = = f is differentiable at x. Prove that (x + h) f (x h) J'(x) hm f -- -- . = Hint: Remember an old algebraic trick-a number is not changed if the same quantity is added to and then subtracted from it. Prove, more generally, that **(b) f'(x) *23. . 2h hoo lim f(x+h)- f(x-k) h+ k = . h,k->o+ Prove that if f is even, then f'(x) = f'(--x). (In order to minimize cong'(x) and then remember what other thing g fusion, let g(x) f(-x); find is.) Draw a picture! - = *24. Prove that if is odd, then f f'(x) = 25. Problems 23 and 24 say that f' is even if What can therefore be said about f(k)p 26. Find = x3. = x5. (iii) f'(x) (iv) f (x + If S.(x) 3) = f is even. x3. x", and = 0 sk 5 n, prove that Xn-k = (n = *29. is odd, and odd if x4 = S,(k)(X) *28. f f"(x) if (i) f (x) (ii) f (x) 27. Once again, draw a picture. f'(-x). (a) Find f'(x) if f(x) (b) Analyze f similarly = k)! - xn-k kl |x|3. Find f"(x). Does f"'(x) exist for all x? if f (x) x* for x > 0 and f (x) for xs 0. -x* = = for x > 0 and let f(x) 0 for xs 0. Prove that f("¯U exists (andfind a formula for it), but that f(")(0)does not exist. Let f(x) = x" = 9. Denvatives 165 30. Interpret the following specimens of Leibnizian notation; ment of some fact occurring in a previous problem. (i) - nxn-1 = dx dz (ii) = -- - - dy (iv) 1 if z y d[f(x)+c] dx d[cf(x)] (iii) (v) dx" = = c dx dz dy dx dx -=- dx (vii) . y df(x) dx df(x) . . dx x-a2 df(x+a) df(x) = dx df(cx) dx x=b+a df(x) = df(cx) c . dX x=cb df(y) = dx dx x=b (viii) (ix) 1 3a4 = - -. if z=y+c. dx3 (vi) = c . dy y=cz each is a restate- DIFFERENTIATION CHAPTER The process of finding the derivative of a function is called dyerentiation.From the previous chapter you may have the impression that this process is usually laborious, requires recourse to the definition of the derivative, and depends upon successfully recognizing some limit. It is true that such a procedure is often the only possible approach-if you forget the definition of the derivative you are likely to be lost. Nevertheless, in this chapter we will learn to differentiate a large number of functions, without the necessity of even recalling the defmition. A few theorems will provide a mechanical process for differentiating a large class of functions, which are formed from a few simple functions by the process of addition, multiplication, division, and composition. This description should suggest what theorems will be proved. We will first find the derivative of a few simple functions, and then prove theorems about the sum, products, quotients, and compositions of differentiable of a computation functions. The first theorem is merely a formal recognition carried out in the previous chapter. THEOREM 1 If f is a constant function, f (x) f'(a) PROOF f'(a) = c, then = = for all numbers 0 lim f(a+h)h f(a) a. lim c-c h = hoo hoo = 0. The second theorem is also a special case of a computation in the last chapter. THEOREM 2 If f is the identity function, f (x) f'(a) PROOF j'(a) = - x, then = 1 for all numbers f (a + li f (a) a+h-a . h hoo = - h hoo =hm h) a. h lim Aso h - = 1. The derivative of the sum of two functions is just what one would hopathe sum of the 166 derivatives. 10. Dg'erentiation 167 THEOREM 3 If f and g are differentiable at a, then f + g is also differentiable at a, and g)'(a) (f + PROOF (f +g)(a+h) (f . # g)'(a) 11m = - h)- f(a+h)+g(ak lim f(a+h) lim f(a) - f (a + lim hoo h) h f (a) + - g(a+h) + h hoo g(a) - h g(a + h) lim g(a) --- h hoo + g'(a). J'(a) = [f(a)+g(a)] h hw0 = +g)(a) (f h hoO = g'(a). f'(a) + = The formula for the derivative of a product is not as simple as one might wish, but it is nevertheless pleasantly symmetric, and the proof requires only a simple algebraic trick, which we have found useful before-a number is not changed if the same quantity is added to and subtracted from it. THEOREM 4 If f and g are difTerentiable at a, then (f - g)'(a) f'(a) = (f · g)'(a) = (f Em hoo = lim f f (a) - g'(a). -g)(a) -g)(a+h) PROOF g(a) + - (f - h (a + h)g(a + h) f (a)g(a) - h hoo -g(a)] = f(a+h)[g(a+h) lim hw0 = = lim f hw0 h g(a + h) lim (a + h) hoo h + g(a) --- - [f(a+h)- f(a)]g(a) h + lim f (a + h) -- f (a) h hoo · lim g(a) hw0 f(a).g'(a)+ f'(a)-g(a). (Notice that we have used Theorem 9-1 to conclude that lim hw0 f(a + h) = f (a).) In one special case Theorem 4 simplifies considerably: THEOREM s If g(x) = cf(x) and f is differentiable at g'(a) PROOF If h(x) = c, so that g = h - c - f'(a). then by Theorem 4, f, g'(a) = a, then g is differentiable at a, and = = = = (h f)'(a) - f'(a) + h'(a) f (a) f'(a) + 0 f (a) h(a) c c - - - f'(a). 168 Denvatives and Integrals Notice, in particular, that (- f)'(a) (f + [-g])'(a) f'(a) = = and consequently f'(a), -- (f - g)'(a) = g'(a). - To demonstrate what we have already achieved, we will compute the derivative of some more special functions. THEOREM 6 If f(x) In = for some natural number f'(a) PROOF = n, then na"¯' for all a. The proof will be by induction on n. For n = 1 this is simply Theorem 2. Now assume that the theorem is true for n, so that if f (x) x", then = f'(a) Let g(x) = x"**. If I(x) = = f · I. It follows g'(a) = na"¯1 for all a. x"+1 x, the equation g(x)= thus g = = x" - x can be written forallx; f(x)-I(x) from Theorem 4 that (f I)'(a) - f'(a) = - I(a) + =na"¯'-a+a"-1 f (a) - I'(a) na" + a" = (n + 1)a", for all a. This is precisely the case n + 1 which we wished to prove. Putting together the theorems proved so far we can now fmd f' for f of the form f(x)=anx"+an-1x"¯'+---+a2x2&GII 0- We obtain f'(x)=nanx"¯1+(n--1)an-ix"¯2+···+2a2X#at. We can also find f"(x) = f": n(n - 1)anx"¯2 + (n 1)(n - -- 2)an-1x"¯3 + - - - +2a2. This process can be continued easily. Each differentiation reduces the highest power of x by 1, and eliminates one more at. It is a good idea to work out the derivatives f"', f(4), and perhaps f , until the pattern becomes quite clear. The last interesting derivative is f ">(x) n!an; = fork>nwehave ff*)(x) 0. = Clearly, the next step in our program is to find the derivative of a quotient f/g. It is quite a bit simpler, and, because of Theorem 4, obviously sufficient to find the derivative of 1/g. 10. Djerentiation 169 THEOREM 7 ¢ 0, then 1/g is differentiable at If g is differentiable at a, and g(a) ' -g'(a) (1 (a) - g PROOF a, and = . [g(a)]2 Before we even write h) (a + (a) - h is necessary to check that be sure that this expression makes sense-it sufficiently requires only two observations. small h. This defined for (1/g)(a+h) is Since g is, by hypothesis, differentiable at a, it follows from Theorem 9-1 that g is continuous at a. Since g(a) ¢ 0, it follows from Theorem 6-3 that there is some 8 > 0 such that g(a + h) ¢ 0 for |h| < 8. Therefore (1/g)(a + h) doesmake sense for small enough h, and we can write we must (1 - lim hw0 g (a + h) 1 1 (a) - - g 1 - = h lim g(a + h) . hm hw0 g(a) h a->o = -- g(a) h[g(a) g(a + h) g(a + h)] -- - -[g(a+h)-g(a)] l . =lun g(a)g(a h h-so + h) -[g(a+h)-g(a)] 1 . = lim lun - h h->O g(a) g(a + h) 1 -g'(a) = h->o - · . [g(a)]2 (Notice that we have used continuity of g at a once again.) The general formula for the derivative of a quotient is now easy to derive. Though not particularly appealing, it is important, and must simply be memorized (I always use the incantation: times derivative of top, minus top times derivative of bottom, over bottom squared.") "bottom THEOREM a If f and g are differentiable at a and g(a) ¢ 0, then is differentiable at a, f/g and f ( - g ' (a) g(a) = - J'(a) f (a) [g(a)]2 - - g'(a) . 170 Derivativesand Integrals PROOF Since f/g = f (1/g) we have - (a) f = = (a) - f'(a) (1 g f'(a) = (a) + - - f'(a) - (a) g f (a)(-g'(a)) + g(a) f (a) ' 1 - · [g(a)]2 g(a) f(a)·g'(a) -- [g(a)]2 We can now differentiate a few more functions. For example, x2 if f(x) if f (x) = = if f (x) = 1 then f'(x) x2 + 1 (x2+ 1)(2x) (x2 1)(2x) (x2+ 1)2 - , x then x2 4 1 f'(x) , 1 then -, f'(x) x - = (x2 + 1) = -- - x(2x) 1 1 (x2+ 1)2 (-1)x¯2. - xy x2 ; (x2 })2 = (x2 ()2 - -- 4x - Notice that the last example can be generalized: if f (x) = x¯" = 1 -, for some natural number n, x" then -nx"¯1 f'(x) - - (-n)x¯"¯1. x thus Theorem 6 actually holds both for positive and negative integers. If we inter0, then pret f (x) x to mean f (x) 1, and f'(x) 0 x¯I to mean f'(x) is necessary because it is Theorem 6 is true for n 0 also. (The word 0¯I is meaningless.) not clear how 0° should be defined and, in any case, O Further progress in differentiation requires the knowledge of the derivatives of certain special functions to be studied later. One of these is the sine function. For the moment we shall divulge, and use, the following information, without proof: = = = = - "interpret" = - sin'(a) = cos a cos'(a)=-sina for all a, foralla, This information allows us to differentiatemany other functions. For example, if f (x) = x sinx, then f'(x) f"(x) = xcosx +sinx, sin x + cos x + cos x sin x + 2 cos x; -x = -x = 10. DgFerentiation171 if g(x) sin*x = sinx = sinx, - then g'(x) sin x cos x + cos x sin x 2 sin x cos x, 2[(sinx)(-sinx) + cosx cosx] 2[cos2x sin2x]; = = g"(x) = = -- if h(x) cos2 x = = cos x x, - then h'(x) sin x) + (cosx)(- = sin x) cos x (- -2sinxcosx, = -2[cos2 h"(x) = x - sin2 Notice that g'(x) + h'(x) 0, = 1. As we would expect, we hardly surprising, since (g + h)(x) = sin2 x + COS2 x also have g"(x) + h"(x) 0. The examples above involved only products of two functions. A function involving triple products can be handled by Theorem 4 also; in fact it can be handled in two ways. Remember that f g h is an abbreviation for = = - (f - - g) h - or f (g - - h). Choosing the first of these, for example, we have (f g h)'(x) - - (f g)'(x) h(x) + (f g)(x)h'(x) [f'(x)g(x) + f (x)g'(x)]h(x) + f (x)g(x)h'(x) f'(x)g(x)h(x) + f(x)g'(x)h(x) + f (x)g(x)h'(x). = - = = - - The choice of f (g h) would, of course, have given the same result, with a different intermediate step. The final answer is completely symmetric and easily - - remembered: sum of the three terms obtained and multiplying by the other two. (f g h)' is the · - g, and h by differentiating each of f, For example, if f(x) x3 sinx = cosx, then f'(x) 3x2 sinx cosx + x3 COsX = COsI -|-13(SinX)(- Products of more than 3 functions can be handled similarly. should have little difficulty deriving the formula sinX). For example, you .g·h-k)'(x) (f = f'(x)g(x)h(x)k(x)+f(x)g'(x)h(x)k(x) + f (x)g(x)h'(x)k(x) + f (x)g(x)h(x)k'(x). 172 Derivativesand Integrals You might even try to prove (fi ..... f.)'(x) (byinduction) the general formula: f;_1(x)fi'(x)fi+1(x)··--·fn(x)fi(x) = -...- i=l Differentiating the most interesting functions obviously requires a formula for (fo g)'(x) in terms of f' and g'. To ensure that fog be differentiable at a, one reasonable hypothesis would seem to be that g be differentiable at a. Since the behavior of fog near a depends on the behavior of f near g(a) (notnear a), it also seems reasonable to assume that f is differentiable at g(a). Indeed we shall prove that if g is differentiable at a and f is differentiable at g(a), then fog is differentiable at a, and (fo g)'(a) f'(g(a)) = · g'(a). This extremely important formula is called the Chain Ruk, presumable because of functions. Notice that a composition of functions might be called a and g)' the of g', is not practically product but quite: (fo f' must be evaluated f' attempting and g' at a. Before at g(a) to prove this theorem we will try a few applications. Suppose "chain" f (x) = Let us, temporarily, use S to denote the f sin x2. function ("squaring") sin = o S(x) = x2. Then S. Therefore we have f'(x) sin'(S(x)) = = Quitea different result is obtained cos x2 - S'(x) 2x. - if f(x) = sin2x. In this case f = So sin, so f'(x) = = S'(sin x) sin'(x) 2 sin x cos x. · - sin sin Notice that this agrees (asit should) with the result obtained by writing f and using the product formula. function, Although we have invented a special symbol, S, to name the it does not take much practice to do problems like this without bothering to write down special symbols for functions, and without even bothering to write down the particular composition which f is-one soon becomes accustomed to taking f apart in one's head. The following differentiations may be used as practice for such mental gymnastics-if you find it necessary to work a few out on paper, by all means do so, but try to develop the knack of writing f' immediately after seeing = "squaring" · 10. Dyerentiation 173 the definition of f; problems of this sort are so simple that, if you the Chain Rule, there is no thought necessary. = sinx' f(x) = sin3x f (x) = sin f (x) = f(x) f(x) if f(x) then f'(x) = cosx3-3x2 f'(x) = 3sin2x.cosx just remember -1 1 1 J'(x) = x sin(sin x) f'(x) = = sin(x3 + 3x2) J'(x) = (x3+ 3x2)53 r(x) - cos - · - x2 x cos(sin x) cos x . = cos(x2 + 3x2) = 53(x3 + 3x2)52 - (3x2+ 6x) 2 A function like f(x) sin2X2 = [SinX2 = 2 which is the composition of three functions, f=SosinoS, can also be differentiated by the Chain Rule. It is only necessary to remember that a triple composition fogoh means (fo g) oh or fo (go h). Thus if f (x) sin x = we can write f = (So sin) f = So o (sin o S, S). The derivative of either expression can be found by applying the Chain Rule twice; the only doubtful point is whether the two expressions lead to equally simple calculations. As a matter of fact, as any experienced differentiator knows, it is much better to use the second: f = So (sin o S). We can now write down f'(x) in one fell swoop. To begin with, note that the first function to be differentiated is S, so the formula for f'(x) begins Inside the parentheses we must put sin x2, the value at x of the second function, sin o S. Thus we begin by writing - f'(x) = 2 sin x2 necessary, after all). We must now multiply this much of the answer by the derivative of sin oS at x; this part is easy-it involves a composition of two functions, which we already know how to handle. We obtain, (theparentheses weren't really for the final answer, f'(x)=2sinx2-cosxt2x. 174 Daivativesand Integrals The following example is handled similarly. Suppose f(x) sin(sinx2L = Without even bothering to write down f as a composition gohok of three functions, we can see that the left-most one will be sin, so our expression for f'(x) begins f'(x) cos( = ) · . Inside the parentheses we must put the value of ho k(x); this is simply sin x2 (what x2) by deleting the first sin). So our expression for f'(x) begins you get from sin(sin f'(x) cos(sinx2 = We can now forget about the first sin in sin(sinx2); we have to multiply what we have so far by the derivative of the function whose value at x is sin x2-which is again a problem we already know how to solve: f'(x) cos(sin x2) = - cos x2 - 2x. Finally, here are the derivatives of some other functions which are the composition You can probably just S, as well as some other triple compositions. of sin and "see" that the answers are correct-if if f(x) = not, try writing f'(x) then sin((sinx) ) x)]2 f'(x) f'(x) f'(x) f (x) [sin(sin f (x) sin(sin(sin x)) f (x) sin2(x sin x) = = = = as a composition: f out cos((sinx)2).2sinx - 2 sin(sin x) cos(sin x) cos x = cos(sin(sin x)) = 2 sin(x sin x) cos(x sin x) · - - cos(sin x) = sin(sin(x sin x)) f'(x) = · cos x - - f (x) cosx = cos(sin(x sin x)) - [sinx + cos(x2 sin x) [2xsin x + - x cos x] x2 cosx]. The rule for treating compositions of four (oreven more) functions is easyput in parentheses starting from the right, (mentally) always fo(go(hok)), and start reducing the calculation number of functions: For example, to the derivative of a composition if f(x) sin2(sin2(x)) = [f = So sin oSo sin = So (sino (So sin))] then f'(x) = 2 sin(sin2 x) cos(sin2 x) 2 sin x cos x; - - - of a smaller 10. Dz§erentiation175 if f (x) 2) sin((sin x2 = (f sin sin = = oSo sin oS (So o (sin o S))] then f'(x) cos((sin x2)2) = - 2 sin x2 cos x2 2x; - - if f (x) sin2(sin(sin = [fillin yourself, x)) if necessary] then -cos(sin(sinx)) f'(x) 2sin(sin(sinx)) = - cos(sinx) - cosx. With these examples as reference, you require only one thing to become a master differentiator-practice. You can be safely turned loose on the exercises at the end of the chapter, and it is now high time that we proved the Chain Rule. The following argument, while not a proof, indicates some of the tricks one might try; as well as some of the difficulties encountered. We begin, of course, with the definition(fo g)(a + h) (fo g)(a) (fo g)'(a) = lim - h h->0 f(g(a+h)) lo = f(g(a)) - h hoo Somewhere in here we would like the expression put it in by fiat: (g(a + h)) hm f f (g(a)) - . (g(a + h)) lun f - . = h h->o for g'(a). g(a + h) hoo -- f (g(a)) g(a) One approach is to g(a + h) h g(a) - . This does not look bad, and it looks even better if we write . hm og)(a+h) (f - (f og)(a) h hoo = f (g(a) + [g(a+ lim h) g(a + h) h->0 g(a)]) - - f (g(a)) hm h->0 . g(a + h) - g(a) g(a) . h The second limit is the factor g'(a) which we want. If we let g(a + h) (tobe precise we should write k(h)), then the first limit is f(g(a)+k)h->o - - -- g(a) = k f(g(a)) k It looks as if this limit be f'(g(a)), since continuity of g at a implies that k goes to 0 as h does. In fact, one can, and we soon will, make this sort of reasoning precise. There is already a problem, however, which you will have noticed if you are the kind of person who does not divide blindly. Even for h ¢ 0 we might have g(a + h) g(a) by g(a + h) g(a) 0, making the division and multiplication small meaningless. only about True, we h, but g(a + h) g(a) could be 0 care for arbitrarily small h. The easiest way this can happen is for g to be a constant should - = -- - 176 Derivativesand Integrals g(a) function, g(x) 0 for all h. In this case, fog is also c. Then g(a + h) the Chain Rule does indeed hold: a constant function, (fo g)(x) f (c), so = = - = (fo g)'(a) 0 = f'(g(a)) = - g'(a). However, there are also nonconstant functions g for which g(a + h) g(a) for arbitrarily small h. For example, if a = 0, the function g might be - f (x) 1 x2 sin x 0, 0 ¢0 x -, = = 0. = x In this case, g'(0) 0, as we showed in Chapter 9. If the Chain Rule is correct, we 0 for any differentiable f, and this is not exactly obvious. must have (fo g)'(0) A proof of the Chain Rule can be found by considering such recalcitrant functions separately, but it is easier simply to abandon this approach, and use a trick. = = THEOREM 9 (THE CHAM RULE) If g is differentiable at a, and f is differentiable at g(a), then at a, and g)'(a) (fo PROOF f'(g(a)) = f og is differentiable g(a) ¢0 g(a) = g'(a). - Define a function ¢ as follows: h)) f (g(a + ¢(h) g(a + h) = f (g(a)) - . if g(a + h) , g(a) - if g(a+ J'(g(a)), - h) - 0. It should be intuitively clear that ¢ is continuous at 0: When h is small, g(a) is also small, so if g(a + h) g(a) is not zero, then ¢(h) will g(a + h) be close to f'(g(a)); and if it is zero, then ¢(h) actually equals f'(g(a)), which is even better. Since the continuity of ¢ is the crux of the whole proof we will provide a careful translation of this intuitive argument. We know that f is differentiable at g(a). This means that - - f(g(a)+k)- f(g(a)) . hm Thus, if s 0 there is some number > . (1) if0 < |k| < f'(g(a)). = k koo 8' > f(g(a) ô', then 0 such that, for all k, +k) f(g(a)) - k - f'(g(a)) Now g is differentiable at a, hence continuous at a, so there is a 8 for all h, (2) if |hl < ô, then |g(a + h) g(a)| < ô'. ¢(h) it followsfrom = f(g(a+h))- (2)that g(a + h) k| < = f(g(a)) - ¢ 0, then f(g(a)+k)- f(g(a)) g(a + h) - g(a) g(a) 8', and hence from |¢(h) f'(g(a))| - k (1)that <e. e. 0 such that, > - Consider now any h with Ih| < ô. If k < ' 10. Djerentiation 177 On the other hand, if g(a + h) g(a) --- 0, then ¢(h) = f'(g(a)), so = it is surely true that |¢(h) f'(g(a))| < - s. We have therefore proved that lim¢(h) f'(g(a)), = h-0 so ¢ is continuous at 0. The rest of the proof is easy. If h ¢ 0, then we have f (g(a + h) - f (g(a)) ¢(h) = h even if g(a + h) (fo g)'(a) = - in that (because f(g(a+h))- f(g(a)) g(a) lim 0 = = f'(g(a)) g(a) - both sides are 0). Therefore case -g(a) . lim ¢ (h) lun = g(a+h) - h hw0 g(a + h) h - h-o hwO h g'(a). - Now that we can differentiate so many functions so easily we can take another look at the function x2 sin 1 -, f (x) = ¢0 x x 0, 0. = x In Chapter 9 we showed that f'(0) 0, working straight from the definition (the only possible way). For x ¢ 0 we can use the methods of this chapter. We have = f'(x) = 2x sin l - x + x2 æs 1 1 - · - · - x2 x Thus f'(x) 1 2x sin = 1 - - x cos -, ¢0 x x 0, 0. = x As this formula reveals, the first derivative f' is indeed badly behaved at 0-it not even continuous there. If we consider instead f (x) = x3sin-, 1 x x¢0 0, x - is 0, then f'(x) = 3x2 sin 1 - x 0, 1 - x cos -, x x x ¢0 = 0. the expresIn this case f' is continuous at 0, but f"(0) does not exist (because sion 3x2 sin l/x defines a function which is difTerentiable at 0 but the expression cos 1/x does not). -x 178 Derivativesand Integrals As you may suspect, improvement. If increasing the power of x yet again produces another 1 f (x) = 0, 0, = x then f'(x) 4x3 sin = - 1 x2 - -, cos x 1 x x 0, x It is easy to compute, right from the definition, that easy to find for x ¢ 0: f"(x) = 12x2 sin 1 - - x 4x cos 1 - x = 0. (f')'(0) 1 2x cos - ¢0 - - x sin = 0, and f"(x) is 1 -, x 0, x x ¢0 = 0. In this case, the second derivative f" is not continuous at 0. By now you may have guessed the pattern, which two of the problems ask you to establish: if f (x) sin x = 1 -, x x 0, then f'(0), ... , x ff">(0)exist, but ff") is not 0, = continuous at 0; if x2.+1 sin 1 x x 0, x -, f (x) = ¢0 ¢0 = 0, then /'(0),..., ff">(0)exist, and ff") is continuous at 0, but ff") is not differentiable at 0. These examples may suggest that functions can be characterized by the possession of higher-order derivatives-no matter how hard we try to mask the infinite oscillation of f (x) sin 1/x, a derivative of sufliciently high order seems able to reveal the underlying irregularity. Unfortunately, we will see later that much worse things can happen. After all these involved calculations, we will bring this chapter to a close with minor remark. It is often tempting, and seems more elegant, to write some of a the theorems in this chapter as equations about functions, rather than about their "reasonable" = - values. Thus Theorem 3 might be written (f+g)'= f'+g', Theorem 4 might be written as (f·g)'= and f-g'+f'·g, Theorem 9 often appears in the form (fog)'=(f'og)-g'. 10. Deferentiation 179 Strictly speaking, these equations may be false, because the functions on the lefthand side might have a larger domain than those on the right. Nevertheless, this is hardly worth worrying about. If f and g are differentiable everywhere in their domains, then these equations, and others like them, are true, and this is the only case any one cares about. PROBLEMS 1. As a warm up exercise, find f'(x) for each of the following f. (Don't worry about the domain of f or f'; just get a formula for f'(x) that gives the right answer when it makes sense.) (i) f(x)=sin(x+x2L (ii) f(x)=sinx+sinx2. (iii) f(x) sin(cosx). (iv) f (x) sin(sin x). (v) f (x) sin x = = (cosx = . sin(cos x) (vi) f (x) (vii)f (x) (viii)f (x) 2. = . X = sin(x + sin x). = sin(cos(sin x)). of the following functions f. (It took the author 20 minthe derivatives for the answer section, and it should not take utes to compute much longer. Although rapid calculation is not the goal of mathematics, you if you hope to treat theoretical applications of the Chain Rule with aplomb, these concrete applications should be child's play-mathematicians like to pretend that they can't even add, but most of them can when they have to.) Find f'(x) for each (i) f (x) (ii) f (x) (iii) f (x) = = = (iv) f (x) = (v) f (x) = sin((x + 1)2(x + 2)). sin3 2 + sinX). sin2 ((x + sin x)2L sin x ( 3 . cos x3 sin(x sin x) + sin(sin x2). (vi) f(x) (cosx)312. sin2 x sin x2 sin2 x2. (vii)f (x) sin3 (sin2(sin x)). (viii)f (x) = = = (ix) f (x) (x + = (x) f(x) (xi) f(x) (xii) f (x) (xiii)f (x) (xiv)f (x) = = = sin5 6 sin(sin(sin(sin(sinx)))). sin((sin'x2 + BIL (((x2+ x)3 4 = sin(x2 + sin(x2 = sin(6 cos(6 sin(6 cos 5 2 6x))). 180 Derivativesand Integrals (xv) f (x) = (xvi)f (x) = sin x2 sin2 x . . 1+smx 1 2 x+smx x3 f (x) (xvii) sin = x3 . sin . -- smx x f (x) (xviii) 3. sin = ( x . x-sm . . x - smx Find the derivatives of the functions tan, cotan, sec, cosec. (You don't have to memorize these formulas, although they will be needed once in a while; if you express your answers in the right way, they will be simple and somewhat symmetrical.) 4. For each of the following functions (i) (ii) (iii) (iv) 5. 6. f (x) f, find f'( f (x)) (not(fo f)'(x)). 1 = . 1+ x f (x) = f(x) f (x) = sin x. x2. 17. = For each of the following functions f, fmd f (f'(x)). 1 (i) f (x) = (ii) f(x) = (iii) f (x) (iv) f(x) = Find f' -. x x2. 17. 17x. = in terms of g' if (i) f (x) g(x + g(a)). (ii) f (x) g(x g(a)). (iii) f(x)=g(x+g(x)). = = (iv) f (x) (v) f (x) (vi) f (x + = = 7. - a). g(x)(x g(a)(x a). 3) g(x2). - - = (a) A circular object is increasing in size in some unspecified manner, but it is known that when the radius is 6, the rate of change of the radius is 4. Find the rate of change of the area when the radius is 6. (If r(t) and A(t) represent the radius and the area at time t, then the functions r and A str2; satisfy A use of the Chain Rule is called for.) a straightforward = 10. Deferentiation 181 (b) Suppose that informed that the circular object we have been watching is really the cross section of a spherical object. Find the rate of change of the volume when the radius is 6. (You will clearly need to know a formula for the volume of a sphere; in case you have forgotten, the volume is times the cube of the radius.) (c) Now suppose that the rate of change of the area of the circular cross section is 5 when the radius is 3. Find the rate of change of the volume when the radius is 3. You should be able to do this problem in two ways: first, by using the formulas for the area and volume in terms of the radius; and then by expressing the volume in terms of the area (to use this method you will need Problem 9-3). we are now gr 8. The area between two varying concentric circles is at all times 9x in2. The rate of change of the area of the larger circle is 10x in2/sec. How fast is the circumference of the smaller circle changing when it has area 16x in29 9. Particle A moves along the positive horizontal axis, and particle B along the graph of f(x) -J, x < 0. At a certain time, A is at the point (5,0) and moving with speed 3 units/sec; and B is at a distance of 3 units from the origin and moving with speed 4 units/sec. At what rate is the distance changing? and B between A = 10. Let f(x) x2 sin 1/x for x ¢ 0, and let f (0) are two functions such that = h'(x) = h(0) = sin2(sin(x + 1)) 3 0. Suppose also that h and k = k'(x) = k(0) = f(x + 1) 0. Find (i) (fo h)'(0). (ko f)'(0). (ii) a'(x2), where (iii) 11. Find a(x) = h(x2). Exercise great care. f'(0) if f (x) g(x) sin - 1 -, x 0, x x ¢0 = 0, and g(0) = g'(0) = 0. 12. Using the derivative of f (x) by the Chain Rule. 13. (a) Using Problem 9-3, find f'(x) for < x < 1, if f (x) Ál x2. (b) Prove that the tangent line to the graph of f at (a, V1 a2 ) intersects the graph only at that point (andthus show that the elementary geometry = 1/x, as found in Problem 9-1, find (1/g)'(x) -1 = - definition of the tangent line coincides with ours). - 182 Derivativesand Integrals 14. Prove similarly that the tangent lines to an ellipse or hyperbola intersect these sets only once. 15. If f + g is differentiable at a, are f and g necessarily differentiable at a? If f g and f are differentiable at a, what conditions on f imply that g is differentiable at a? - 16. (a) Prove that if provided that f is differentiable f (a) ¢ 0. (b) Give a counterexample (c) Prove that if f and if at a, then is also differentiable at a, |f| 0. g are differentiable at a, then the functions and min(f,g) are differentiable at a, provided that f(a) ¢ max(f,g) f (a) = g(a). (d) Give a 17. counterexample if f(a) = times differentiable and at x is defined to be If f is three f'"(x) Sf(x)= (a) Show that --- f'(x) g(a). f'(x) ¢ 0, the Schwarzianderivativeof f 3 2 2 f"(x) . f'(x) (fog)=[Sfog]-g'2+By ax + b with ad if f (x) , cx + d quently, B(f a g) Og. (b) Show that = -- ¢ 0, bc then 9 f = 0. Conse- = 18. Suppose that ff")(a)and gt">(a) exist. Prove Leibnit'sformula: (f - g)(">(a) n k = k=0 *19. n-k (k) Prove that if f("3(g(a))and gt">(a) both exist, then (fo g)("3(a) exists. A little experimentation should convince you that it is unwise to seek a formula for (fo g)(") (a). In order to prove that (fo g) (") (a) exists you will therefore have to devise a reasonable assertion about (f og)(">(a) which can be proved by induction. Try something like: "( fo g)(") (a) exists and is a sum of terms each of which is a product of terms of the form ." ... 20. (a) If f(x) = anx"+a,_ix.-1+- Find another. (b) If find a fimction g such that g' --+ao, b2 b3 b, f(x)= p+p+---+--, find a function g with g' f. = (c) Is there a function f (x) such that f'(x) = = anx" + 1/x? - · - + ao + bi - x + - - - + b. -- xm = f. 10. Deferentiation 183 21. Show that there is a polynomial function f of degree n such that (a) f'(x) (b) f'(x) = = for precisely n 1 numbers x. for no x, if n is odd. for exactly one x, if n is even. for exactly k numbers x, if n k is odd. - (c) f'(x) (d) f'(x) (a) The number = = 22. 0 0 0 0 f(x) = (x - - a is called a double root double root of f (b) When does f(x) the condition 23. *24. of the polynomial function f if for some polynomial function g. Prove that a is a if and only if a is a root of both f and f'. ax2 + bx + c (a ¢ 0) have a double root? What does a)2g(x) say = geometrically? If f is differentiable at a, let d(x) f (x) f'(a)(x a) f (a). Find d'(a). connection with another solution for Problem 9-20. this gives Problem 22, In = This problem is a companion be given numbers. - - - to Problem 3-6. Let ai, a, and bi, ..., bn ..., x, are distinct numbers, prove that there is a polynomial function f of degree 2n 1, such that f (x¡) f'(x¡) 0 for j ¢ i, and and b¿. Remember Problem 22. Hint: f'(x¿) f (x¿) at (b) Prove that there is a polynomial function f of degree 2n 1 with f (xi) a¿ and f'(x;) b¿ for all i. (a) If xt, ..., = - = = = = - = *25. Suppose that a and b are two consecutive roots of a polynomial fimction but that a and b are not double roots, so that we can write f(x) (x a)(x b)g(x) where g(a) ¢ 0 and g(b) ¢ 0. - f, = - (a) Prove that g(a) and g(b) have the same sign. (Remember that a and b roots.) consecutive are that there is some number x with a < x < b and f'(x) 0. (Also (b) Prove draw a picture to illustrate this fact.) Hint: Compare the sign of f'(a) and f'(b). (c) Now prove the same fact, even if a and b are multiple roots. Hint: If b)"g(x) where g(a) ¢ 0 and g(b) ¢ 0, consider the f(x) = (x b)" polynomial function h(x) f'(x)/(x a)m-1(x = -a)"(x - = - - . This theorem was proved by the French mathematician Rolle, in connection of approximating of with the problem roots polynomials, but the result was not originally stated in terms of derivatives. In fact, Rolle was one of the mathematicians who never accepted the new notions of calculus. This was such not a pigheaded attitude, in view of the fact that for one hundred years no one could define limits in terms that did not verge on the mystic, but on the whole history has been particularly kind to Rolle; his name has become attached to a much more general result, to appear in the next chapter, which forms the basis for the most important theoretical results of calculus. 26. xg(x) for some function g which is continuous Suppose that f(x) that Prove f is differentiable at 0, and find f'(0) in terms of g. = at 0. 184 Derivancesand Integrals *27. Suppose f is differentiable at 0, and that f (0) 0. Prove that f (x) xg(x) for some function g which is continuous at 0. Hint: What happens if you try = to write g(x) 28. If f(x) = = f (x)/x? x¯" for n in N, prove that = ¡(*)(x) (-1)k (n+k-1)! X-n-k (n 1)! + k l (-1)kk! n = - - -n-k, for x ¢ 0. = *29. Prove that it is impossible to wiite x f(x)g(x) where f and g are differentiable and f (0) g(0) 0. Hint: Differentiate. = = 30. What is f(k)(X) (a) f (x) *(b) *31. f(x) = f 1/(x a)"? p 1/(x2 - = , x sin l /x if x ¢ 0, and let f (0) = 0. Prove that /'(0), f">(0) exist, and that ff") is not continuous at 0. (You will the encounter . . . basic difficulty as that in Problem 19.) Let f(x) = x2n+1sin1/x f("3(0)exist, that f("3is if x ¢ 0, and continuous at let f(0) 0. Prove that f'(0),..., 0, and that f(") is not differentiable = 0. at 33. if Let f (x) same *32. = = In Leibnizian notation the Chain Rule ought to read: df(g(x)) df(y) dg(x) dy dx ¯ dx finds the following statement: Instead, z = one usually f(y). Then dz dz dy dx dy dx "Let y = g(x) and " Notice that the z in dz/dx denotes the composite function fo g, while the z in dz/dy denotes the fimction f; it is also understood that dz/dy will be expression involving y," and that in the final answer g(x) must be substituted for y. In each of the following cases, find dz/dx by using this formula; then "an with compare (i) z=siny, (ii) z sin y, (iii) z cos u, (iv) z sino, = Problem 1. y=x+x2. y = u = v = = = cos x. sin x. cosu, u = sinx. SIGNIFICANCE OF THE DERIVATIVE CHAPTER One aim in this chapter is to justifythe time we have spent learning to find the derivative of a function. As we shall see, knowing just a little about f' tells us a lot about f. Extracting information about f from information about f' requires some difficult work, however, and we shall begin with the one theorem which is really easy. This theorem is concerned with the maximum value of a function on an interval. Although we have used this term informally in Chapter 7, it is worthwhile to be precise, and also more general. Let f be a function and A a set of numbers contained in the domain of f. point for f on A if A point x in A is a maximum DEFINITION f(x) > f(y) The number say that i x2 FIGURE i x. f "has for every y in A. value the maximum its maximum value on A at x"). of f (x) itself is called f on A (andwe also Notice that the maximum value of f on A could be f(x) for several different x (Figure 1);in other words, a function f can have several differentmaximum points value. Usually we shall be on A, although it can have at most one maximum interested in the case where A is a closed interval [a,b]; if f is continuous, then Theorem 7-3 guarantees that f does indeed have a maximum value on [a,b]. The definition of a minimum of f on A will be left to you. (One possible definition is the following: f has a minimum on A at x, if f has a maximum I x, i - on A at x.) We are now ready for a theorem which does not even depend upon the existence of THEOREM 1 least upper bounds. Let f be any function defined on (a, b). If x is a maximum (ora minimum) point for f on (a, b), and f is differentiable at x, then f'(x) 0. (Notice that we do not assume differentiability, or even continuity, of f at other = points.) PROOF Consider the case where f has a maximum at x. Figure 2 illustrates the simple idea drawn through points to the left of (x, f (x)) behind the whole argument--secants > slopes through points to the right of (x, f (x)) have and secants drawn 0, have slopes 5 0. Analytically, this argument proceeds as follows. 185 l 86 Derivativesand Integrals If h is any number such that x + h is in f(x) since f (a, b), then f(x+h), > has a maximum on (a, b) at x. This means that f (x + Thus, if h > h) f (x) 5 0. - 0 we have f(x+h)- f(x) FIGURE I i a x I x+h 50. h I b and consequently 2 + h) lim f (x < f (x) < 0. ¯ h h-0+ On the other hand, if h - 0, we have f(x+h)- f(x) >0, ¯ h so f (x + h) - f (x) > 0. ¯ h hoo- By hypothesis, i a equal to i & f'(x). f is differentiableat x, so these two limits must be equal, in fact This means that f'(x) FIGURE 5 0 and f'(x) > 0, from which it follows that f'(x) 0. The case where f has a minimum at x is left to you 3 = (givea one-line proof). Notice (Figure 3) that we cannot replace (a, b) by [a,b] in the statement of the we add to the hypothesis the condition that x is in (a, b).) (unless Since f'(x) depends only on the values of f near x, it is almost obvious how to get a stronger version of Theorem 1. We begin with a definition which is illustrated in Figure 4. theorem Let f be a function, and A a set of numbers contained in the domain of f. A point x in A is a local maximum [minimum]point for f on A if point for f on there is some & > 0 such that x is a maximum [minimum] DEFINITION An THEOREM 2 (x - 8, x +ô). If f is defined on (a, b) and has a local maximum differentiable at x, then f'(x) 0. (orminimum) = PROOF You s}iould see why this is an easy application of Theorem 1. at x, and f is 11. Segn$icance of the Derivative 187 The converse of Theorem 2 is definitely not true-it is possible for f'(x) to be 0 even if x is not a local maximum or minimum point for f. The simplest example is provided by the function f (x) x3; in this case f'(0) 0, but f has no local = or minimum anywhere. Probably the most widespread with the behavior of a function = maximum misconceptions about calculus are concerned point made in f near x when f'(x) = 0. The the previous paragraph is so quickly forgotten by those who want the world to be simpler than it is, that we will repeat it: the converse of Theorem 2 is not true-the condition f'(x) 0 does not imply that x is a local maximum or minimum point of f. Precisely for this reason, special terminology has been adopted to describe numbers x which satisfy the condition f'(x) = 0. = DEFINITION A critical point of a function f is a number f'(x) The number f(x) itselfis = x such that 0. called a criticalvalue of f. The critical values of f, together with a few other numbers, turn out to be the ones which must be considered in order to find the maximum and minimum of a given function f. To the uninitiated, finding the maximum and minimum value of a function represents one of the most intriguing aspects of calculus, and there is no denying that problems of this sort are fun (untilyou have done your first hundred or so). Et us consider first the problem of finding the maximum or minimum of f on a closed interval [a,b]. (Then, if f is continuous, we can at least be sure and minimum value exist.) In order to locate the maximum and that a maximum minimum of f three kinds of points must be considered: critical points of f in end The points a and b. (2) (3)Points x in [a,b] such that (1)The I [a,b]. f is not differentiable at x. 1 point or a minimum point for f on [a,b], then x must be in one of the three classes listed above: for if x is not in the second or third group, then x is in (a, b) and f is differentiable at x; consequently f'(x) 0, by Theorem 1, and this means that x is in the first group. If there are many points in these three categories, finding the maximum and minimum of f may still be a hopeless proposition, but when there are only a few critical points, and only a few points where f is not differentiable, the procedure is 0, and fairly straightforward: one simply finds f(x) for each x satisfying f'(x) f (x) for each x such that f is not differentiable at x and, finally, f (a) and f (b). The biggest of these will be the maximum value of f, and the smallest will be the minimum. A simple example follows. If x is a maximum = . y local rummum FIGURE 4 pomt a local maximum point = 188 Derivativesand Integrals Suppose we wish to find the maximum and minimum value of the function f(x)=x3-x on the interval [-1, 2]. To begin with, f'(x) so f'(x) = 0 when 3x2 1 - have 3x2 = 1 - 0, that is, when = = x Ñ we Ñ or - Ñ. -Ñ and The numbers both lie in [-1, 2], so the first group of candidates for the location of the maximum and the minimum is the end points of the interval, The second group contains (2) 1, 2. - The third group is empty, since f is differentiable everywhere. The final step is to compute f (-1) 0, f (2) 6. = = -jÑ, occurring at maximum Clearly the minimum value is , and the occurring is 6, at 2. This sort of procedure, if feasible, will always locate the maximum and minimum value of a continuous function on a closed interval. If the function we are dealing with is not continuous, however, or if we are seeking the maximum or minimum on an open interval or the whole line, then we cannot even be sure beforehand that the maximum and minimum values exist, so all the information obtained by this procedure may say nothing. Nevertheless, a little ingenuity will often reveal the nature of things. In Chapter 7 we solved just such a problem when we showed that if n is even, then the function value 11 b a FIGURE 1 5 f(x)=x"+an_ixn-14...4 has a minimum value on the whole line. This proves that the minimum occur at some number 0 = value must x satisfying f'(x) = nx"¯1 + (n - 1)a._ix"¯2 +··- + ai. If we can solve this equation, and compare the values of f (x) for such x, we can actually find the minimum of f. One more example may be helpful. Suppose we wish to find the maximum and minimum, if they exist, of the function f(x)= 1 1-x2 11. Sénþicanceof theDerivative 189 on the open interval (-1, 1). We have f'(x) 2x = (1 - x2) 2 -1 the 0 only for x 0. We can see immediately that for x close to 1 or values of f (x) become arbitrarily large, so f certainly does not have a maximum. This observation also makes it easy to show that f has a minimum at 0. We just note (Figure 5) that there will be numbers a and b, with so f'(x) = = -1 < such that a < 0 and 0 a and byx<1. b < 1, < f (x) > f (0) for -1<x This means that the minimum of f on [a,b] is the minimum of f on all of (-1, 1). Now on [a,b] the minimum occurs either at 0 (theonly place where f' 0), or at a or b, and a and b have already been ruled out, so the minimum value is f (0) 1. solving these problems we purposely did not draw the graphs of f (x) x3-x In and f (x) 1/(1 x2), but it is not cheating to draw the graph (Figure 6) as long as you do not rely solely on your picture to prove anything. As a matter of fact, we are now going to discuss a method of sketching the graph of a function that really fact gives enough information to be used in discussing maxima and minima-in will maxima minima. method able and be This involves to locate even local we of the sign of f'(x), and relies on some deep theorems. consideration The theorems about derivatives which have been proved so far, always yield information about f' in terms of information about f. This is true even of Theorem 1, although this theorem can sometimes be used to determine certain information about f, namely, the location of maxima and minima. When the derivative was first introduced, we emphasized that f'(x) is not [f (x + h) f (x)]/h for any particular h, but only a limit of these numbers as h approaches 0; this fact becomes painfully relevant when one tries to extract information about f from information about f'. The simplest and most frustrating illustration of the difficulties encountered is afTorded by the following question: If f'(x) 0 for all x, must f be a constant function? It is impossible to imagine how f could be anything else, and this conviction is strengthened by considering the physical interpretation-if the of velocity a particle is always 0, surely the particle must be standing still! Nevertheless it is difficult even to begin a proof that only the constant functions satisfy f'(x) 0 for all x. The hypothesis f'(x) = 0 only means that = = = = f(x) fi -1 N¯3 = x' -- x -- -- = I i i i -1 1 = + h) hm f (x . hoo FIGURE 6 h - f (x) = 0, and it is not at all obvious how one can use the information about the limit to derive information about the fimction. 190 Derwativesand Integrals The fact that f is a constant function if f'(x) = 0 for all x, and many other facts of the same sort, can all be derived from a fundamental theorem, called the Mean Value Theorem, which states much stronger results. Figure 7 makes it plausible that if f is differentiable on b], then there is some x in (a, b) such that / ,7 ,7 [a, , ./*,7' y' ta FIGURE = b-a Geometrically this means that some tangent line is parallel to the line between (a, f (a)) and (b, f (b)). The Mean Value Theorem asserts that this is true-there is some x in (a, b) such that f'(x), the instantaneous rate of change of f at x, is change of f on [a,b], this average change exactly equal to the average or being [f (b) f (a)]/[b a]. (For example, if you travel 60 miles in one hour, then at some time you must have been traveling exactly 60 miles per hour.) This the theorem is one of the most important theoretical tools of calculus-probably might result about conclude that the derivatives. Fmm this statement you deepest hard theorems in this book proof is difficult, but there you would be wrong-the have occurred long ago, in Chapter 7. It is true that if you try to prove the Mean Value Theorem yourself you will probably fail, but this is neither evidence that the theorem is hard, nor something to be ashamed of. The first proof of the theorem was an achievement, but today we can supply a proof which is quite simple. It helps to begin with a very special case. x "mean" 7 - FIGURE f (b) f (a) - f'(x) 8 THOMM 3 (ROLLFS TŒOMM) - is continuous on [a,b] and differentiable on (a, b), and there is a number x in (a, b) such that f'(x) 0. If f f (a) = f (b), then = PROOF If followsfrom the continuity of f on [a,b] that f has a maximum and a minimum [a,b]. Suppose first that the maximum value occurs at a point x in (a, b). Then f'(x) 0 by Theorem 1, and we are done (Figure 8). value of f occurs at some point x in (a, b). Suppose next that the minimum again, Then, f'(x) 0 by Theorem 1 (Figure 9). Finally, suppose the maximum and minimum values both occur at the end points. Since f (a) f (b), the maximum and minimum values of f are equal, |IS SO 3 COnstant function (Figure 10), and for a constant function we can choose any x in (a, b). value on = = = FIGURE 9 e FIGU RE lo i Notice that we really needed the hypothesis that f is differentiable everywhere the theorem is on (a, b) in order to apply Theorem 1. Without this assumption false (Figure 11). You may wonder why a special name should be attached to a theorem as easily proved as Rolle's Theorem. The reason is, that although Rolle's Theorem is a special case of the Mean Value Theorem, it also yields a simple proof of the Mean Value Theorem. In order to prove the Mean Value Theorem we will apply Rolle's Theorem to the function which gives the length of the vertical segment shown in 11. Syngicanceof ike Derivative 191 Figure 12; this is the difference between f (x), and the height at x of the line L between (a,f (a)) and (b,f (b)). Since L is the graph of f (b) f (a)¯ - g(x) i a FIGURE [ = i b we want to look at b a - (x -- a) + f (a), a) f (a). 11 f (x) b As it turns out, the constant THEOREM 4 (THE MEAN VALUE THEOREM) If f is continuous on in (a, b) such that (x -- -- a - - f (a) is irrelevant. [a,b] and differentiable on (a, b), then there is a number f (b) f (a) x - f'(x) PROOF = . b--a IÆt f (b) f (a) - h(x) = h(a)= (6, f(b)) = = L - b [a,b] and Clearly, h is continuous on h(b) f (x) f(a), f (b) - a (x a). - differentiable on (a, b), and f (b) f (a) - [ - f (a). b -- a (x a) - Consequently, we may apply Rolle's Theorem to h and conclude some x in (a, b) such that that there is f (b) f (a) - O = h'(x) FIGURE So 12 f'(x) - - b , - a that f (b) f (a) - J'(x) = b - . a Notice that the Mean Value Theorem still fits into the pattern exhibited by previous theorems-information about f yields information about f'. This information is so strong, however, that we can now go in the other direction. COROLLARY 1 PROOF If f is defined on an interval and f'(x) constant on the interval. = 0 for all x in the interval, then f is Let a and b be any two points in the interval with a ¢ b. Then there is some x in 192 Derivativesand Integrals (a, b) such that f (b) f (a) - f'(x) But f'(x) = = b . a - 0 for all x in the interval, so f (b) f (a) - FIGURE 0 13 and consequently f(a) interval is the same, i.e., - ' b-a Thus the value of f at any two points in the f is constant on the interval. = f(b). Naturally, Corollary 1 does not hold for functions defined on two or more intervals (Figure 13). COROLLARY 2 and g are g'(x) for all x in the defined on the same interval, and f'(x) number such that f interval, then there is some g + c. c If f = = PROOF For all x in the intervalwe have (f g)'(x) there is a number c such that f c. g = - f'(x) - g'(x) = 0 so, by Corollary1, = - The statement of the next corollary requires some terminology, which is illustrated in Figure 14. A function is increasing on an interval if f (a) < f (b) whenever a and b are two numbers in the interval with a < b. The function f is decreasing on an interval if f (a) > f (b) for all a and b in the interval with a < b. (We often say simply that f is increasing or decreasing, in which case the interval is understood to be the domain of f.) DEFINITION COROLLARY 3 PROOF Íf f'(x) > Ûfor all x in an interval, then f is increasing on the interval; if f'(x) for all x in the interval, then f is decreasing on the interval. < 0 Consider the case where f'(x) > 0. Let a and b be two points in the interval with a < b. Then there is some x in (a, b) with f (b) f (a) - f'(x) But f'(x) > = b - a 0 for all x in (a, b), so f (b) f (a) - b - > 0. a Since b a > 0 it follows that f (b) > f (a). The proof when f'(x) < 0 for all x is left to you. - of theDenvative 193 11. Srgngicance the converses of Corollary 1 and Corollary 2 are true (and obvious), the converse of Corollary 3 is not true. If f is increasing, it is easy to see that f'(x) > 0 for all x, but the equality sign might hold for some x (consider (x) x3 Notice that although f = Corollary 3 provides enough information to get a good idea of the graph of a function with a minimal amount of point plotting. Consider, once more, the function f (x) x3 x. We have = - f'(x) = 3x2 1. - Ñ -Ñ, and x and it is 0 for x We have already noted that f'(x) also possible to determine the sign of f'(x) for all other x. Note that 3x2 1 > 0 = = = - when precisely 3x2 x I x>Ñ I thus 3x2 (a) - 2 or > 1 , x<- ; 1 < 0 precisely when an increasing function Thus -Ñ, decreasing between -N and Ñ, Ñ. Combining this information with the is increasing for x < and once again increasing for x following facts f > -1, (2)f (x) 0 for x (3)f (x) gets large = (b) a decreasing FIGURE 14 function = 0, 1, as x gets large, and large negative as x gets large negative, approximation it is possible to sketch a pretty respectable to the graph (Figure 15). By the way, notice that the intervals on which f increases and decreases could have been found without even bothering to examine the sign of f'. For example, and since f' is continuous, and vanishes only at we know that f' 5, -Ñ (-Ñ, Ñ). always f(--Ñ) > Since has the same sign on the interval it follows that f decreases on this interval. Similarly, f' always has the oo) and f (x) is large for large x, so f must be increasing on same sign on oo). Another point worth noting: If f' is continuous, then the sign of f' on the interval between two adjacent critical points can be determined simply by finding the sign of f'(x) for any one x in this interval. x3 Our sketch of the graph of f(x) x contains sufficient information is a local maximum point, and to allow us to say with confidence that a local minimum point. In fact, we can give a general scheme for decid- f(Ñ), (Ñ, (Ñ, = - -Ñ 194 Derivativesand Integrals f(x) -1 I I 0 I / I sil 1 I f decreasing sit I I J increasing I i I I I I FIGURE xa I I I f increasing = 15 ing whether a critical point is a local maximum neither (Figure 16): point, a local minimum point, or > 0 in some interval to the left of x and f' < 0 in some interval to the right of x, then x is a local maximum point. (2)if f' < 0 in some interval to the left of x and f' > 0 in some interval to the right of x, then x is a local minimum point. if (3) f' has the same sign in some interval to the left of x as it has in some interval to the right, then x is neither a local maximum nor a local minimum point. (1)if f' (There is no point in memorizing these rules-you can always draw the pictures yourself.) The polynomial functions can all be analyzed in this way, and it is even possible to describe the general form of the graph of such functions. To begin, we need a I f>of<o (a) FIGURE 16 I f<of>o (b) I f>o I f>o (c) f<o f<o (d) 11. Srp¢ance of theDerivance 195 result already in Problem 3-7: If mentioned f (x) = anx" +a,_ixn-1 ...4 4 "roots," has at most n i.e., there are at most n numbers x such that f (x) 0. Although this is really an algebraic theorem, calculus can be used to give an easy proof. Notice that if x1 and x2 are roots of f (Figure 17), so that 0, then by Rolle's Theorem there is a number x between xi f (xi) f (X2) and x2 such that f'(x) 0. This means that if f has k different roots xi < x2 < < xk, then f' has at least k 1 different roots: one between xi and x2, one between x2 and xa, etc. It is now easy to prove by induction that a polynomial then f = = = = - ... functiOD 17 FIGURE f(x)=anx"+a._ix"¯1+··-+ao has at most n roots: The statement is surely true for n it is true for n, then the polynomial g(x) could not roots. = bn+1x"**+ b.x" + · = 1, and if we assume that + be - have more than n + 1 roots, since if it did, g' would have more than n With this information it is not hard to describe the graph of f (x) = anx" + a._ix"¯1 + - - - + ao. 1, has at most The derivative, being a polynomial function of degree n critical 1 points. Of 1 Therefore has at most n roots. course, a critin f cal point is not necessarily a local maximum minimum point, but at any rate, or if a and b are adjacent critical points of f, then f' will remain either positive or negative on (a, b), since f' is continuous; consequently, f will be either increasing or decreasing on (a, b). Thus f has at most n regions of decrease or increase. As a specific example, consider the function - - -- 2 f(x)=x4_ Since f'(x) = 4x3 - 4x 4x (x = - 1)(x + 1), -1, the critical points of f are 0, and 1, and -1, * f (-1) °) = f (0) 0, f (1) = -1. / \ = -/1) -1) (-1 FIGURE ( 18 The behavior of f on the intervals between the critical points can be determined by one of the methods mentioned before. In particular, we could determine the Sign Of j' OR Íl1€S€ ÍnÍ€rVRÎS simply be examining the formula for f'(x). On the other hand, from the three critical values alone we can see (Figure 18) that f increases on (-1, 0) and decreases on (0, 1). To determine the sign of f' on 196 Derivativesand Integrals -1) and (-oo, (1, oo) we can compute f'(-2) f'(2) = = 4 (-2)3 - 4 23 - - -24, 4 (-2) 4 2 24, - - = = - -1) and conclude that f is decreasing on (-oo, conclusions also follow from the fact that f negative and increasing on (x) is large for large (1, oo). These x and for large x. We can already produce a good sketch of the graph; two other pieces of information provide the finishing touches (Figure 19). First, it is easy to determine that f (x) 0 for x 0, ±Ã; second, it is clear that f is even, f (x) f (--x), so the graph is symmetric with respect to the vertical axis. The fimction f(x) x3 already sketched in Figure 15, is odd, f (x) f (-x), and is consequently symmetric with respect to the origin. Half the work of graph sketching may be saved by noticing these things in the beginning. Several problems in this and succeeding chapters ask you to sketch the graphs of functions. In each case you should determine = = = -x, = = (1)the critical - points of f, (2)the value of f at the critical points, the sign of f' in the regions between critical points this clear), the numbers x such that f (x) = 0 possible), the behavior of f (x) as x becomes large or large negative (3) (if (if (4) (5) is not already (ifpossible). Finally, bear in mind that a quick check, to see whether the function is odd or even, may save a lot of work. This sort of analysis, if performed with care, will usually reveal the basic shape of the graph, but sometimes there are special features which require a little more thought. It is impossible to anticipate all of these, but one piece of information is often very important. If f is not defmed at certain points (forexample, if f is a rational function whose denominator vanishes at some points), then the behavior of f near these points should be determined. For example, consider the function x2-2x+2 f (x) = x |(x) (-Vž, 0) (0,0) -1) -1) (-1, FIGURE 19 (1, - x' 1 - - 2x* , 11. Sén¢cance of theDerivative 197 which is not defined at 1. We have (x 1)(2x = 2) - - f'(x) (x x(x-2) (x2 2x + 2) - - 1) - ¯ (x 1)2 - Thus (1) the critical points of f are 0, 2. Moreover, -2, (2) f (0) f (2) 2. = = Because f is not defined on the whole interval (0,2), the sign of f' must be determined separately on the intervals (0, 1) and (1,2), as well as on the intervals (-oo, 0) and (2, oo). We can do this by picking particular points in each of these intervals, or simply by staring hard at the formula for f'. Either way we find that (3) f'(x) > 0 if x < 0, f'(x)<0 if 0<x<1, f'(x) < f'(x) 0 0 > if if 1 < x < 2, 2 < x. Finally, we must determine the behavior of f (x) as x becomes large or large negative, as well as when x approaches 1 (this information will also give us another regions which the increases and decreases). To examine on way to determine f write the behavior as x becomes large we x2-2x+2 1 =x-l+ x-1 x---l ; 1 (andslightly larger) when x is large, and f (x) is close 1 (butslightly smaller) when x is large negative. The behavior of f near 1 to x also is easy to determine; since clearly f (x) is close to x - - lim (x2 2x + 2) - xel -/- = 1 0, the fraction x2-2x+2 x - 1 becomes large as x approaches 1 from above and large negative as x approaches 1 from below. All this information may seem a bit overwhelming, but there is only one way that it can be pieced together (Figure 20); be sure that you can account for each feature of the graph. When this sketch has been completed, we might note that it looks like the graph 198 Derivativesand Integrals I I I 2-- / / I 2 71 / i I \ / I I \ / / FIGURE / / 0 / / 20 of an odd function shoved over 1 unit, and the expression x2-2x+2 ¯ x-1 (x-1)2+1 x-1 shows that this is indeed the case. However, this is one of those special features which should be investigated only after you have used the other information to get of the graph. a good idea of the appearance Although the location of local maxima and minima of a function is always revealed by a detailed sketch of its graph, it is usually unnecessary to do so much There is a popular test for local maxima and minima which depends on the behavior of the function only at its critical points. work. THEOREM 5 Suppose f'(a) 0. If f"(a) > 0, then f has a local minimum at a; if f"(a) f has a local maximum at a. = then PROOF By clefmition, . f"(a) Since f'(a) = = Inn J'(a h-wo + h) h - f'(a) 0, this can be written f"(a) = lim h->0 f'(a+ h h) . < 0, of theDenvative 199 11. Srgngicance Suppose now that f"(a) small h. Therefore: and f'(a f'(a 0. Then f'(a + h)/h must be positive for sufficiently > + h) must be positive for sufficiently small h > 0 + h) must be negative for sufficiently small h < 0. This means (Corollary 3) that f is increasing in some interval to the right of a and f is decreasing in some interval to the left of a. Consequently, f has a local minimum f(x) - at a. The proof for the case f"(a) -x < 0 is similar. Theorem 5 may be applied to the function been considered. We have (a) f'(x) f(x) = f"(x) xa At the critical points, = = 3x2 6x. - f (x) = x3 x, which -- has already 1 --N and Ñ, we have f"(-Ñ) f"(Ñ) 6Ñ -6 = > = (b) -Ñ ) = 0, 0. < Ñ is a local maximum point and is a local minimum Consequendy, point. Although Theorem 5 will be found quite useful for polynomial functions, for many functions the second derivativeis so complicated that it is easier to consider the sign of the first derivative. Moreover, if a is a critical point of f it may happen that f"(a) 0. In this case, Theorem 5 provides no information: it is possible that a is a local maximum point, a local minimum point, or neither, as shown (Figure 21) by the functions x. = (c) -x4, FIGURE f(x) 21 f(x) = = X4 5. in each case f'(0) f"(0) 0, but 0 is a local maximum point for the first, a local minimum point for the second, and neither a local maximum nor minimum point for the third. This point will be pursued further in Part IV It is interesting to note that Theorem 5 automatically proves a partial converse of itself. - THEOREM 6 PROOF = Suppose f"(a) exists. If f has a local minimum local maximum at a, then f"(a) 5 0. at a, then f"(a) > 0; if f has a has local minimum at a. If f"(a) < 0, then f would also have a local maximum at a, by Theorem 5. Thus f would be constant in some interval containing a, so that f"(a) 0, a contradiction. Thus we must have f"(a) > 0. The case of a local maximum is handled similarly. Suppose f = 200 Denvativesand Integrals (This partial converse to Theorem 5 is the best we can hope for: the signs cannot be replaced by > and <, as shown by the functions f (x) > = and 5 x4 and -x4.) f (x) = The remainder of this chapter deals, not with graph sketching, or maxima and but with three consequences of the Mean Value Theorem. The first is a simple, but very beautiful, theorem which plays an important role in Chapter 15, and which also sheds light on many examples which have occurred in previous minima, chapters. THEOREM 7 Suppose that f is continuous at a, and that f'(x) exists for all x in some interval containing a, except perhaps for x a. Suppose, moreover, that lim f'(x) exists. = Then f'(a) also exists, and f'(a) PROOF = lim f'(x). By definition, f'(a) hm f(akh)- f(a) . = . h hoo For sufficiently small h > 0 the function f will be continuous on [a,a + h] and differentiable on (a, a + h) (a similar assertion holds for sufliciently small h < 0). By the Mean Value Theorem there is a number Œh in (a, a + h) such that f(a) f(a+h)- = h NOW lim approaches a as h approaches exists, it follows that f'(a) e = + h) lim f (a hoo h - f (a) h)· 0, because Œh f'(x) f'( lim = f'(Gh) hoO (It is a good idea to supply a rigorous have treated somewhat informally.) Œh = is in lim (a, a + h); since f'(X). xsa e-ô argument for this final step, which we Even if f is an everywhere differentiable function, it is still possible for f' to be discontinuous. This happens, for example, if FIGURE 22 f(X) x2 sin = 1 -, X 0, -/ 0 x x = 0. According to Theorem 7, however, the graph of f' can never exhibit a discontinuity of the type shown in Figure 22. Problem 55 outlines the proof of another beautiful theorem which gives further information about the function f', and Problem 56 uses this result to strengthen Theorem 7. The next theorem, a generalization of the Mean Value Theorem, is of interest mainly because of its applications. of theDerivative 201 11. Signgicance THEOREM 8 (THE CAUCHY MEAN VALUE THEOREM) If f and g are continuous on number x in (a, b) such that [a,b] and [f (b) f (a)]g'(x) differentiable on (a, b), then there is a [g(b) = - - g(a)] f'(x). (If g(b) ¢ g(a), and g'(x) ¢ 0, this equation can be written f (b) f (a) f'(x) - g(b) _ ¯ g(a) - g'(x) 1, and we obtain the Mean Value Notice that if g(x) x for all x, then g'(x) applying the Mean Value Theorem to f and g Theorem. On the other hand, separately, we find that there are x and y in (a, b) with - = f (b) f (a) f'(x) g(b) g'(y)' - g(a) -- but there is no guarantee that the x and y found in this way will be equal. These remarks may suggest that the Cauchy Mean Value Theorem will be quite difEcult to prove, but actually the simplest of tricks suffices.) PROOF Let -g(a)] h(x) -g(x)[f(b) f(x)[g(b) - Then h is continuous on f(a)]. - [a,b], differentiable on (a, b), and h(a)= f(a)g(b)-g(a)f(b)= It followsfrom Rolle's Theorem that h'(x) = h(b). 0 for some x in (a, b), which means that 0 = f'(x) [g(b) - g(a)] g'(x) - [f (b) f (a)]. - The Cauchy Mean Value Theorem is the basic tool needed to prove a theorem limits of the form which facilitates evaluation of lim --, f (x) when lim f(x) 0 = and lim g(x) = 0. In this case, Theorem 5-2 is of no use. Every derivative is a limit of this form, and computing derivatives frequently requires a great deal of work. If some derivatives are known, however, many limits of this form can now be evaluated easily. THEOREM 9 (I?HÔPITAI?S RULE) Suppose that lim f(x) and suppose also that 0 = lim g(x) and = 0, lim f'(x)/g'(x) exists. Then lim f(x)/g(x) exists, and x-va x-va lim x->. -- f (x) . = g(x) (Notice that Theorem 7 is a special case.) hm x-ma f'(x) gr(x) . 202 Derivativesand Integrals PROOF The hypothesis that lim f'(x)/g'(x) exists contains two implicit assumptions: 8, a + ô) such that f'(x) and g'(x) exist for all x in perhaps, 8) ô, for x a, except, (a a+ with, once again, the possible exception (2)in this interval g'(x) ¢ 0 (1)there is an interval (a - = - ofx=a. On the other hand, f and g are not even assumed to be defined at a. If we define the previous values of f(a) and g(a), if necessary), 0 (changing f(a) g(a) and then f g are continuous at a. If a < x < a + 3, then the Mean Value Theorem and the Cauchy Mean Value Theorem apply to f and g on the interval [a,x] (anda similar statement holds for a ô < x < a). First applying the Mean Value Theorem to g, we see that g(x) ¢ 0, for if g(x) 0 there would be some xi 0, contradicting (2).Now applying the Cauchy Mean Value in (a, x) with g'(x1) Theorem to f and g, we see that there is a number ex in (a, x) such that = = - = = [f (x) - 0]g'(ax) [g(x) = - 0] f'(ax) or f (x) f'(ax) _ ¯ g(x) g'(ax) Now œ, approaches a as x approaches lim f'(y)/g'(y) exists, it follows that because as is in (a, x); since a, yea (x) hm f , - g(x) x->a (Once again, the reader lim = xsa J'(ax) lim = g'(ax) y a J'(y) . g'(y) is invited to supply the details of this part of the argument.) PROBLEMS 1. For each of the following functions, find the maximum and minimum values on the indicated intervals, by finding the points in the interval where the derivative is 0, and comparing the values at these points with the values at the end points. (i) f(x)=x3_-x2-8x+1 x6 + x + 1 on on [-2,2]. [---1,1]. (ii) f (x) (iii) f(x)=3x4--8x3+6x2 on[-Lg (iv) f (x) = x5+x+1 on [--¼,1]. (v) f(x) = x + 1 x2 4 1 on [-1, on [0,5]. = (vi) f (x) 1 x = x2 - 1 . ofthe Daivative 203 11. Srgmyicance 2. Now sketch the graph of each of the functions in Problem 1, and find all local maximum and minimum points. 3. Sketch the graphs of the following functions. 1 (i) f(x)=x+-. x (ii) f (x) (iii) f (x) = (iv) f (x) = 3 x+ = --. x x 2 . X2 1 - 1 1 + x2 -ag)2. 4. (a) If at *(b) < ··· < a., find the minimum value of f(x) (x = i=1 Now find the minimum value of f (x) |x = - a¿|. This is a problem i=l where *(c) calculus help at all: on the intervals between the a¿'s the won't function f is linear, so that the minimum clearly occurs at one of the a¿, and these are precisely the points where f is not differentiable. However, the answer is easy to find if you consider how f (x) changes as you pass from one such interval to another. Let a > 0. Show that the maximum value of f (x) 1 1+|x| = + 1 1+|x-a| is (2 + a)/(1 + a). (The derivative can be found on each of the intervals (-oo, 0), (0,a), and (a, oo) separately.) 5. For each of the following functions, fmd all local maximum points. and minimum y=3, 5, 7, 9 x, x 5, x=3 9, 7, x=7 -3, (i) f(x) (ii) f (x) = = x = 5 9. x irrational p/q in lowest terms. x x 0, 1/q, = = (iii) f (x) x, x rational 0, x (iv) f (x) = 1, 0, (v) f (x) = 1, 0, I x irrational. = 1/n for some n in N otherwise. if the decimal expansion of x contains otherwise. a 5 204 Denvativesand Integrals 6. of the plane, and let L be the graph of the fimction f (x) = mx + b. Find the point ž such that the distance from (xo,yo) to (ž, f (ž)) is smallest. [Notice that minimizing this distance is the same as minimizing its square. This may simplify the computations somewhat.] (a) Let (xo,yo) be a point e, y a p ,"-' (b) Also find ž (x, o) ,," dicular to L. the distance from (xo,yo) to L, i.e., the distance from (xo,yo) to Find (c) (ž, f (ž)). [ftwill make the computations easier if you first assume that b 0; then apply the result to the graph of f(x) mx and the point (xo,yo b).] Compare with Problem 4-22. Consider a straight line described by the equation Ax + By + C (d) 0 (Problem 4-7). Show that the distance from (xo,yo) to this line is -a)w' (o, FIGURE by noting that the line from (xo,yo) to (ž, f (ž)) is perpen- 23 = = - = r Surface area is the sum of these areas (ÂX0 7. ETO / 2 g2 The previous Problem suggests the following question: What is the relationbetween the critical points of f and those of f22 ship FIGURE 8. A straight line is drawn from the point (0,a) to the horizontal axis, and then back to (1,b), as in Figure 23. Pove that the total length is shortest When the angles a and ß are equal. (Naturally you must bring a function into the picture: express the length in terms of x, where (x, 0) is the point on the horizontal axis. The dashed line in Figure 23 suggests an alternative geometric proof; in either case the problem can be solved without actually finding the point (x, 0).) 9. Prove that of all rectangles with given perimeter, the square has the greatest 24 b ,- area. / Find, among all right circular cylinders of fixed volume V, the one with smallest surface area (counting the areas of the faces at top and bottom, as in Figure 24). 11. A right triangle with hypotenuse of length a is rotated about one of its legs 10 generate a right circular cone. Find the greatest possible volume of such ' C FIGURE 10. 25 a cone. 12. Two hallways, of widths a and b, meet at right angles (Figure 25). What is the greatest possible length of a ladder which can be carried horizontally around the corner? FIGURE 26 13. A garden is to be designed in the shape of a circular sector (Figure 26), with radius R and central angle 0. The garden is to have a fixed area A. For what value of R and 0 (inradians) will the length of the fencing around the minimized? perimeter be 14. Show that the sum of a positive number and its reciprocal is at least 2. 15. Find the trapezoid of largest area that can be inscribed in a semicircle base lying along the diameter. radius a, with one of 11. Sénjicanceof theDerivative 205 a 16. A right angle is moved along the diameter of a circle of radius a, as shown in Figure 27. What is the greatest possible length (A + B) intercepted on it by the circle? 17. Ecological Ed must cross a circular lake of radius 1 mile. He can row across at 2 mph or walk around at 4 mph, or he can row part way and walk the rest (Figure 28). What route should he take so as to (i) (ii) 27 FIGURE FIGURE see as much scenery as possible? CEOss as guickly as possible? 18. The lower right-hand corner of a page is folded over so that it just touches the left edge of the paper, as in Figure 29. If the width of the paper is a and the page is very long, show that the minimum length of the crease is 3da/4. 19. Figure 30 shows the graph of the dmvativeof f. Find all local maximum minimum points of f. 28 3 1 FIGURE *20. *21. 7" is a polynomial function, f (x) x" + an-1xn-1 with critical points critical values 6, 1, 2, 1, 2, 3, 4, and corresponding the of 4, 3. Sketch the graph f, distinguishing cases n even and n odd. Suppose that f / 29 the critical points of the polynomial function (a) Suppose that - -1, -+ - and f (x) x" + = = - / FIGURE = -1, 1, 2, 3, f"(-1) 0, f"(1) > 0, f"(2) < ao are 0, f"(3) 0. Sketch the graph of f as accurately as possible on the basis of this information. (b) Does there exist a polynomial function with the above properties, except that 3 is not a critical point? / 'v/ 4 30 an-1xn-1+ \ and 22. Describe the graph of a rational function (invery general terms, similar to the text's description of the graph of a polynomial function). 23. that two polynomial functions of degree m and n, respectively, intersect in at most max(m, n) points. For (b) each m and n exhibit two polynomial functions of degree m and n which intersect max(m, n) times. *24. (a) Prove the polynomial function has exactly k critical points and f"(x) that n k is odd. (a) Suppose that - + ao f (x) x" + a..ixn-1 + all critical points Show 0 for x. ¢ = - - - 206 Derivativesand Integrals n, show that there is a polynomial function f of degree n with k critical points if n k is odd. a,_ix"¯I + + ao (c) Suppose that the polynomial function f(x) = x" + has ki local maximum points and k2 local minimum points. Show that k2 ki + 1 if n is even, and k2 ki if n is odd. (d) Let n, ki, k2 be three integers with k2 ki + 1 if n is even, and k2 ki if n is odd, and ki + k2 < n. Show that there is a polynomial function f of degree n, with k1 local maximum points and k2 local minimum points. (b) For each - -·· = = = = kl+k2 -(1+x2 I -a.) Hint: Pick ai < < a2 < · Øki+k2 and try f'(x) (x = i=1 for an appropriate 25. number l. if f'(x) > M for all x in [a,b], then f (b) > f (a) + M(b a). (b) Prove that if f'(x) EM for all x in [a,b], then f (b) f (a) + M(b a). (c) Formulate a similar theorem when |f'(x)| EM for all x in [a,b]. Suppose that f'(x) > M > 0 for all x in [0,1]. Show that there is an interval of length on which |f| > M/4. (a) Prove that -- _< --- *26. ¾ 27. (a) Suppose that f'(x) > g'(x) for all x, and that f(x)>g(x)forx>aand f(x)<g(x)forx (b) Show by an example that these conclusions hypothesis f(a) g(a). f(a) = g(a). Show that <a. do not follow without the = 28. Find all functions f such that (a) f'(x) (b) f"(x) (c) f"'(x) 29. sin x. = = = x3. x +x2 16t2 Although it is true that a weight dropped from rest will fall s(t) mention the behavior of feet after t seconds, this experimental fact does not weights which are thrown upwards or downwards. On the other hand, the law s"(t) 32 is always true and has just enough ambiguity to account for the behavior of a weight released from any height, with any initial velocity. For simplicity let us agree to measure heights upwards from ground level; in this case velocities are positive for rising bodies and negative for falling bodies, and all bodies fall according to the law s"(t) = = --32. = -16t2 + at + ß. (a) Show that s is of the form s(t) and then in the formula for s', setting in the formula for By 0 t s, (b) show that s(t) where + vet + so, so is the height from which the body is released at time 0, and vo is the velocity with which it is released. (c) A weight is thrown upwards with velocity v feet per second, at ground is the maximum level. How high will it go? ("How high" means height for all times".) What is its velocity at the moment it achieves its greatest height? What is its acceleration at that moment? When will it hit the ground again? What will its velocity be when it hits the ground = = --16t2 = "what again? 11. Sénýicanceof theDerwative 207 30. A cannon ball is shot from the ground with velocity v at an angle a (Figof velocity v sing and a horiure 31) so that it has a vertical component zontal component v cos a. Its distance s (t) above the ground obeys the law sina)t, while velocity remains constantly s(t) its horizontal + (v --16t2 = U COS Œ. (a) Show that FIGURE the path of the cannon ball is a parabola (find the position at each time t, and show that these points lie on a parabola). (b) Find the angle a which will maximize the horizontal distance traveled by the cannon ball before striking the ground. 31 31. (a) Give lim x- oo an example f'(x) does not (b) Prove that (c) Prove that of a function f for which lim f (x) exists, but exist. if 1400 lim f (x) and X->OO lim f'(x) both exist, then X->OO lim f'(x) if lim f (x) exists and lim f"(x) exists, then lim f"(x) (See also Problem 20-15.) 32. 0. = = 0. and g are two difTerentiable functions which satisfy that if a and b are adjacent zeros of f, and g(a) fg' f'g 0. Prove and g(b) are not both 0, then g(x) 0 for some x between a and b. (Naturally the same result holds with f and g interchanged; thus, the zeros of f and g separate each other.) Hint: Derive a contradiction from the assumption that g(x) ¢ 0 for all x between a and b: if a number is not 0, there is a natural thing to do with it. Suppose that - f = = 33. Suppose that |f (x) f (y) [ < |x y|n for n > 1. Prove that f is constant by considering f'. Compare with Problem 3-20. 34. A function - - f is Lepschiteoforder (*) a at x if there is a constant f(x)- f(y)| C such that 5 C|x-y|" for all y in an interval around x. The function f is Lapschitz of onier a on an intervalif (*)holds for all x and y in the interval. is Lipschitz of order a > 0 at x, then f is continuous at x. (b) If f is Lipschitz of order a > 0 on an interval, then f is uniformly continuous on this interval (seeChapter 8, Appendix). (c) If f is differentiable at x, then f is Lipschitz of order 1 at x. Is the converse true? (d) If f is differentiable on [a,b], is f Lipschitz of order 1 on [a,b]? (e) If f is Lipschitz of order a > 1 on [a,b], then f is constant on [a,b]. (a) If f 208 Duivancesand Integrals 35. Prove that if ao ai an 1 2 n+1 -+-+--·+ =0, then ao+aix+··-+anx"=0 for some x in [0,1]. 36. x3 3x +m never has two roots Pove that the polynomial function f.(x) in [0, 1], no matter what m may be. (This is an easy consequence of Rolle's Theorem. It is instructive, after giving an analytic proof, to graph fo and f2, and consider where the graph of f. lies in relation to them.) 37. Suppose that f is continuous and differentiable on [0,1], that f(x) is in [0,1] for each x, and that f'(x) ¢ 1 for all x in [0, 1]. Show that there is exactly one number x in [0, 1] such that f (x) x. (Half of this problem has been done already, in Problem 7-11.) = - = 38. the function f (x) x2 cos x satisfies f (x) 0 for precisely two numbers x. (b) Prove the same for the function f (x) 2x2 x sin x cos2 x. (Some preliminary estimates will be useful to restrict the possible location of the (a) Prove that = = - = zeros of *39. - - f.) 0 and a twice differentiable function with f(0) f'(0) J'(1) 0, then |f"(x)| > 4 for some x in [0,1]. In more picturesque terms: A particle which travels a unit distance in a unit time, and starts and ends with velocity 0, has at some time an acceleration > 4. Hint: Prove that either f"(x) > 4 for some x in for some x in [j, 1]. [0, or else f"(x) < that in fact have must Show we |f"(x)| > 4 for some x in [0,1]. (b) (a) Pove f(1) = that if 1 and f is = = = -4 j], 40. Suppose that f is a function such that f'(x) 1/x for all x > 0 and f (1) = 0. Prove that f(xy) f(x) + f(y) for all x, y > 0. Hint: Find g'(x) when = g(x) *41. = = f (xy). Suppose that f satisfies f"(x) + f'(x)g(x) f (x) 0 g. Prove that if f is 0 at two points, then f = - for some function interval between them. Hint: Use Theorem 6. 42. Suppose that f is n-times differentiable and that f (x) ent x. Prove that ff">(x)= 0 for some x. 43. Let ai, ..., an+1 be arbitrary points in [a,b], n+1 Q(x) (x = i=1 - x¿). and let = is 0 on the 0 for n + 1 differ- 11. Signigcance of theDerivative 209 Suppose that f is (n + 1)-times differentiable and that P is a polynomial function of degree sn such that P(x¿) f(x¿)for i 1,...,n + 1 (see 49). that each for Show x in [a,b] there is a number c in (a, b) such page = = that f(x) P(x) - Q(x) = f (n+13(c) - . (n + 1)! Hint: Consider the function F(t)=Q(x)[f(t)-P(t)]-Q(t)[f(x)-P(x)]. Show that F is zero at n + 2 different points in 44. use Problem 42. Prove that computing 5 (without 45. [a,b], and 2 decimal places!). to Prove the following slight generalization of the Mean Value Theorem: If f is continuous and differentiable on (a, b) and lim f (y) and lim f (y) exist, y->a+ then there is some x in yab- (a, b) such that lim f (y) f'(x) lim f (y) - y-+b- y->a+ = b - a (Yourproof should begin: "This is a trivial consequence Theorem because of the Mean Value ".) ... 46. Prove that the conclusion in the form of the Cauchy Mean Value Theorem can be written f (b) f (a) f'(x) - g(b) - _ ¯ g'(x)' g(a) that g(b) are never simultaneously O on (a, b). under *47. the additional assumptions ¢ g(a) and that and g'(x) Prove that if f and g are continuous on [a,b] and differentiableon (a, b), ¢ 0 for x in (a, b), then there is some x in (a, b) with and g'(x) f'(x) f (x) f (a) g'(x) g(b) - g(x) - Hint: Multiply out first, to see what this really says. 48. f'(x) What is wrong with the following use of THôpital's Rule: . hm x->lX2 x3+x-2 = -3x +2 -4.) (The limit is actually lim x-1 3x2+1 -3 2x = lim x-i - 6x 2 = 3. 210 Derivativesand Integrals 49. Find the following limits: x tan lim (i) x->0 . I cos2 50. (ii) lim Find f'(0) if x 1 - . g(x) ¯¯¯ f (x) 0, and g(0) 51. = g'(0) 0 and g"(0) = ¢0 x , = 0, = x 17. = (nonerequiring Prove the following forms of l'Hôpital's Rule new reasoning). (a) If lim x->a+ f (x) lim g(x) = lim f(x)/g(x) lim f(x) lim f(x)/g(x) -> l, then = for limits from below). 0, and = = -> -oo, = a* or x by x (c) If lim f(x) a¯). -+ 0, and lim f'(x)/g'(x) Hint: l (and similarly for lim g(x) = = x-voo x->oo lim lim f'(x)/g'(x) x-va+ lim f'(x)/g'(x) oo, then oo (andsimilarly for or if x a is replaced lim g(x) = f(x)/g(x) X->OO (andsimilarly l = x-+a+ (b) If 0, and = x-va+ any essentially = l, then = x->oo -oo). Consider lim f(1/x)/g(1/x). lim g(x) lim f(x) X400 lim f(x)/g(x) = oo. x-voo (d) If 52. 0, and = = X->OO There is another manipulations: then lim x-voo (a) For = oo, then form ofTHôpital's Rule which requires more than algebraic lim f'(x)/g'(x) lim f (x) lim g(x) l, If X->QC oo, and X-Þ00 1400 = f (x)/g(x) every e lim f'(x)/g'(x) X-+00 > = = = l. Prove this as follows. 0 there is a number f'(x) a such that -l forx>a. <s g'(x) Apply the Cauchy Mean Value Theorem to f and g on > a. that f (x) f (a) - g(x) - g(a) (Why can we assume g(x) - - l g(a) < e for x ¢ 0?) [a,x] to show 11. Syn$icanceof theDerivative 211 (b) Now write f (x) f (a) f (x) - g(x) (whycan - g(a) that we assume g(x) g(a) g(x) f (x) f (x) f (a) - - f (x) f (a) ¢ 0 for large - x?) and conclude that f (x) g(x) 53. l < 2e for sufficiently large x. To complete the orgy of variations on l'Hôpital's Rule, use Problem 52 to are so many prove a few more cases of the following general statement (there possibilities that you should select just a few, if any, that interest you): f(x)/g(x) = can be a or a+ or a¯ or oo or and ( ) can be l or oo or ( ). Here [] can be 0 or oo or *54. - -oo, and -oo, { } -oo. is differentiable on [a,b]. Prove that if the minimum of f on [a,b] is at a, then f'(a) > 0, and if it is at b, then f'(b) < 0. (One half of the proof of Theorem 1 will go through.) (b) Suppose that f'(a) < 0 and f'(b) > 0. Show that f'(x) 0 for some x in (a, b). Hint: Consider the minimum of f on [a,b]; why must it be somewhere in (a, b)? (c) Prove that if f'(a) < c < f'(b), then f'(x) c for some x in (a, b). (This (a) Suppose that f = = is known as Darboux's Theorem.) Hint: Cook up an appropriate function to which part (b)may be applied. result 55. Suppose that f is differentiable in discontinuous at a. one-sided interval containing a, but that f' is and lim f'(x) cannot both exist. (This is just a minor variation on Theorem 7.) (b) Neither of these one-sided limits can exist even in the sense of being +oo Hint: Use Darboux's Theorem (Problem 54). or (a) The limits lim some f'(x) -oo. *56. It is easy to find a function f such that |f | is differentiable but f is not. for 1 for x rational and f (x) For example, we can choose f (x) continuous, nor is this a mere x irrational. In this example f is not even coincidence: Prove that if |f| is differentiable at a, and f is continuous at a, then f is also differentiable at a. Hint: It suffices to consider only a with f(a) 0. Why? In this case, what must |f|'(a) be? -1 = = = *57. y ¢ 0 and let n be even. Prove that x" + y" = (x + y)" only when x = 0. Hint: If xo" + y" = (xo + y)", apply Rolle's Theorem to f (x) = x" + y" (x + y)" on [0,xo]- (a) Let - 212 Derivativesand Integrals (b) Prove that if y ¢ 0 and n is odd, then x" + y" = (x + y)" only if x = 0 orx=--y. **58. Use the method of Problem 57 to prove that if every tangent line to f intersects f only once. **59. Prove even more generally that if intersects f only once. *60. f' n is even and f (x) = x", then is increasing, then every tangent line Suppose that f (0) 0 and f' is increasing. Prove that the function g(x) f(x)/x is increasing on (0, oo). Hint: Obviously you should look at g'(x). Prove that it is positive by applying the Mean Value Theorem to f on the right interval (itwill help to remember that the hypothesisf(0) 0 is essenshown tial, as by the function f (x) 1 + x2L = = = = *61. Use derivatives to prove that if n 1, then > -1 (1 + 62. x)" > for 1 + nx that (notice equality holds for x Let f (x) x4 - 2 = < x < = < x 0). 1/x for x ¢ 0, and let f (0) (a) Pove that 0 is a local minimum (b) Prove that f'(0) f"(0) 0. 0 and 0 = 0 (Figure 32). point for f. = This function thus provides another example to show that Theorem 6 cannot be improved. It also illustrates a subtlety about maxima and minima that often goes unnoticed: a function may not be increasing in any interval to the right of a local minimum point, nor decreasing in any interval to the left. FIGURE *63. 32 if f'(a) > 0 and f' is continuous some interval containing a. (a) Prove that at a, then f is increasing in The next two parts of this problem show that continuity of f' is essential. x2 sin 1/x, show that there g(x) are numbers with g'(x) with g'(x) 1 and also (b) If = -1. = = x arbitrarily close to 0 of theDerivative 213 11. Srgngicance 0 < a < 1. Let f (x) ax + x2 sin 1/x for x ¢ 0, and let f (0) 0 (seeFigure 33). Show that f is not increasing in any open interval containing 0, by showing that in any interval there are points x with f'(x) > 0 and also points x with f'(x) < 0. (c) Suppose = = / FIGURE 0, x-0 33 The behavior of f for a à 1, which is much more difficult to analyze, is discussed in the next problem. **64. ax + x2 sin 1/x for x ¢ 0, and let f (0) 0. In order to find the sign of f'(x) when œ à 1 it is necessary to decide if 2x sin 1/x cos 1/x is < for any numbers x close to 0. It is a little more convenient to consider the function g(y) 2(sin y)/y cos y for y ¢ 0; we want to know if g(y) < for large y. This question is quite delicate; the most significant but this happens only part of g(y) is cos y, which does reach the value when sin y = 0, and it is not at all clear whether g itself can have values < The obvious approach to this problem is to find the local minimum values of g. Unfortunately, it is impossible to solve the equation g'(y) 0 explicitly, so more ingenuity is required. Let f (x) = = - -1 = - -1 -1, - -1. = (a) Show that if g'(y) = 0, then cos y = (siny) and conclude that g(y) = (siny) (2-y2 (2 , 2y +y 2y 2 . 214 Derivativesand Integrals (b) Now show that if g'(y) = 0, then 4y2 sin2 4 + y4 and conclude that 2+y2 |g(y) = . 4+ y* fact that (2 + y2)/J4+ y4 > 1, show that if a not increasing in any interval around 0. (c) Using the (d) Using the f **65. 4 + y4 fact that lim (2 + y2)| y->oo = 1, then 1, show that if a = > is f 1, then is increasing in some interval around 0. A function f is increasing at a if there is some number 8 f (x) > f (a) if a < x < > 0 such that a + ô and f (x) < f (a) if a Notice that this does not mean that &< x - < a. is increasing in the interval (a 8, shown example, the function for 8); in Figure 33 is increasing at 0, but + a is not an increasing function in any open interval containing 0. f - (a) Suppose that f is continuous on [0,1] and that f is increasing at a for every a in [0,1]. Prove that f is increasing on [0,1]. (First convince yourself that there is something to be proved.) Hint: For 0 f (b) for all y in [b,x]}. of the is not necessary for the other parts.) Hint: problem (This part Prove that Sb {x: b 5 x 5 1) by considering sup Sb· (c) If f is increasing at a and f is differentiable at a, prove that f'(a) > 0 (thisis easy). (d) If f'(a) > 0, prove that f is increasing at a (goright back to the definition {x = of f'(a)). (e) Use parts (a)and (d)to show, without using the Mean Value Theorem, that if f is continuous on [0, 1] and f'(a) > 0 for all a in [0,1], then f is increasing on [0,1]. (f) Suppose that f is continuous on [0, 1] and f'(a) 0 for all a in (0, 1). Apply part (e)to the function g(x) f(x) + ex to show that f(l) show that f(1) f(0) < e by considering h(x) Similarly, f(0) > ex f (x). Conclude that f (0) f (1). = = - -e. - - = = of theDerivative 215 11. Sagnÿicance This particular proof that a function with zero derivative must be constant has many points in common with a proof of H. A. Schwarz, which may be the first rigorous proof ever given. Its discoverer, at least, seemed to think it was. See his exuberant letter in reference [40]of the Suggested Reading. **66. is a constant function, then every point is a local maximum point for f. It is quite possible for this to happen even if f is not a constant 1 for x > 0 for x < 0 and f(x) function: for example, if f(x) 0. But prove, using Problem 8-4, that if f is continuous on [a,b] and every point of [a,b] is a local maximum point, then f is a constant function. The same result holds, of course, if every point of [a,b] is a local minimum point. (b) Suppose now that every point is either a local maximum or a local minimum point for f (butwe don't preclude the possibility that some points Prove that f is conare local maxima while others are local minima). that < Suppose We stant, as follows. can assume that f (ao) f (bo). f(ao) < f(x) < f(bo) for ao < x < bo. (Why?) Using Theorem1 of the Appendix to Chapter 8, partition [ao,bo] into intervals on which also choose the lengths of these inter< sup f f (f (bo) f (ao))/2; ao)/2. vals to be less than (bo Then there is one such interval [ai,bij with ao < ai < b1 < bo and f(ai) < f(bi). (Why?) Continue inductively and use the Nested Interval Theorem (Problem 8-14) to find a point x that cannot be a local maximum or minimum. (a) If f = = -inf - - **67. point for f on A if f (x) > f (y) a strict maximum with the definition of an ordinary for all y in A with y ¢ x (compare maximum point is defined in the point). A local strict maximum maximum obvious way. Find all local strict points of the function (a) A point x is called 0, f (x) 1 - --, x irrational p x m lowest terms. q . = - q that a function can have a local strict maximum It seems quite the above example might give one pause for at every point (although thought). Prove this as follows. (b) Suppose that every point is a local strict maximum point for f. Let ai < l such xi be any number and choose a1 < xi < b1 with bi Xl > all in Let be any point in that f(xt) x2 Ý f(x) for x [ai, bi]. (ai, bi) and choose al a2 < x2 < b2 5 bi with b2 a2 < such that f(x2) > f(x) for all x in [a2,b2]. Continue in this way, and use the Nested Intenval Theorem (Problem 8-14) to obtain a contradiction. unlikely - _< - 216 Derivativesand Integrals CONVEXITY AND CONCAVITY APPENDIX. Although the graph of a function can be sketched quite accurately on the basis of the information provided by the derivative, some subtle aspects of the graph are revealed only by examining the second derivative. These details weœ purposely enough without woromitted previously because graph sketching is complicated rying about them, and the additional information obtained is often not worth the effort. Also, correct proofs of the relevant facts are sufficiently difficult to be placed in an appendix. Despite these discouraging remarks, the information presented here is well worth assimilating, because the notions of convexity and concavity are far more important than as mere aids to graph sketching. Moreover, the proofs have a pleasantly geometric flavor not often found in calculus theorems. Indeed, the basic definition is geometric in nature (seeFigure 1). DEFINITION 1 A function f is convex segment joining(a, f on an interval, if for all a and b in the interval, the line (a)) and (b, f (b)) lies above the graph of f. The geometric condition appearing in this definition can be expressed in an way that is sometimes more useful in proofs. The straight line between (a, f (a)) and (b. f (b)) is the graph of the function g defmedby analytic f (b) f (a) - g(x) = fbatax This line lies above the graph of (b,f(b)) (x a) + --- if g(x) f (b) f (a) (x > f (a). f(x), that is, if - b (a,f (a)) - a) + f (a) > f (x) a) f (x) f (a) a - or f (b) f (a) (x - I i - > - or FIGURE 1 f (b) f (a) - b-a 2 A function a<x<bwehave f is convex - x-a definition of convexity. We therefore have an equivalent DEFINITION f (x) f (a) on an interval if for a, x, and b in the interval with f (x) f (a) -- x--a f (b) f (a) - b--a 11, Appendix.Convexipand Concaviþ 217 "over" in Definition 1 is replaced by inequality in Definition 2 is replaced by If the word (a,f(a)) f (x) (b,f(b)) f (a) - "under" or, equivalently, if the f (b) f (a) - the definition of a concave function (Figure 2). It is not hard to see that the concave functions are precisely the ones of the form f, where f is convex. For this reason, the next three theorems about convex functions have immediate corollaries about concave functions, so simple that we will not even bother to State them. Figure 3 shows some tangent lines of a convex function. Two things seem to be we obtain - FIGURE 2 true: (1)The graph of f lies above the tangent line (a, f (a)) itself (thispoint is called the point at (a, f (a)) except at the point of contact of the tangent line). of then the slope the < tangent line at (a, f(a)) is less than the slope (2)If a b, of the tangent line at (b, f (b));that is, f' is increasing. As a matter of fact these observations FIGURE THEOREM 1 PROOF are true, and the proofs are not difficult. 3 be convex. If f is differentiable at a, then the graph of the tangent line through (a, f (a)), except at (a, f (a)) itself If a differentiableat a and b, then f'(a) < f'(b). Let If 0 f < hi A nonpictorial a a + hi < lies above b and f is hz, then as Figure 4 indicates, < (1) < f < f(a+hi)- f(a) hi < f(a+h2)- f(a) h2 proof can be derived immediately from Definition 2 applied a + h2. Inequality (1)shows that the values of f(a+h)- f(a) h to 218 Derivativesand Integrals I FIGURE I I a+haa‡h, a 4 decrease as h -> 0+. Consequently, f'(a) < f(a+h)- f(a) h for h > 0 (infact f'(a) is the greatest lower bound of all these numbers). But this means that for h > 0 the secant line through (a, f (a)) and (a + h, f (a + h)) has larger slope than the tangent line, which implies that (a + h, f (a + h)) lies above the tangent line (ananalytic translation of this argument is easily supplied). For negative h there is a similar situation (Figure 5): if h2 < hi < 0, then f(a+hi)- f(a) f(a+h2)- hi f(a) h2 This shows that the slope of the tangent line is greater than f(a+h)h (infact f'(a) is the I FIGURE 5 forh<0 least upper bound of all these numbers), so that f (a + h) lies < 0. This proves the first part of the theorem. above the tangent line if h a‡h: f(a) 1 1 a+h2 a and Concavig 219 11, Appendix.Convexity I FIGURE I 6 Now suppose that a < f'(a) < b. Then, as we have already seen (Figure 6), f (a + (b a)) - b f (a) - a - . smce b -- a > 0 b < 0 f (b) f (a) - b - a and b)) a b (b + (a f'(b) > f ¯ f (a) - - f (b) - - f (b) I FIGURE I - f (b) f (a) - ¯ a-b I . smce a b-a Combining these inequalities, we obtain f'(a) < f'(b). Theorem 1 has two converses. Here the proofs will be a little more difficult. We begin with a lemma that plays the same role in the next theorem that Rolle's Theorem plays in the pmof of the Mean Value Theorem. It states that if f' is increasing, then the graph of f lies below any secant line which happensto be horizontal. 7 ' LEl IA f (x) PROOF is differentiable and f'is increasing. If a f (a) f (b) for a < x < b. Suppose < f f(a) = f(b) for some x in (a, b). Then the maximum f on [a,b] occurs at some point xo in (a, b) with f (xo)> f (a) and, of course, f'(xo) 0 (Figure 7). On the other hand, applying the Mean Value Theorem to the interval [a,xo], we find that there is xi with a < xi < xo and of = f (xo) f (a) > - f'(xi ) = xo - a 0, 220 Derivativesand Integrals contradicting the fact that f' is increasing. This proves that b, and it only remains to prove that f (x) f _< f (x) f (a) = f (b) for a < x < (a) is also impossible for x in (a, b). Suppose f(x) f(a) for some x in (a, b). We know that f is not constant on x] it [a, (if were, f' would not be increasing on [a,x]) so there is (Figure 8) some xi with a < xi < x and f (xi) < f (a). Applying the Mean Value Theorem to [x1,x] we conclude that there is x2 with xi < x2 < x and = = a x, x. & t f(X) FIGURE j'(X2) 8 f(Xi) - > = x -- 0. x1 On the other hand, f'(x) 0, since a local maximum contradicts the hypothesis that f' is increasing. = at x. occurs Again this We now attack the general case by the same sort of algebraic machinations that we used in the proof of the Mean Value Theorem. THEOREM 2 If f is differentiable and f' is increasing, then f is convex. Let a PRooF < b. Define g by f (b) f (a) - g(x) , f (x) = - b-a (x - It is easy to see that g' is also increasing; moreover, g(a) the lemma to g we conclude that g(x) (a, f(a)) < f (a) if a < x < a). = g(b) = f(a). Applying b. ' In other words, if a < x < b, then f (b) f (a) (x-a)<f(a) - f(x)- . b -- a or f (x) f (a) - FIGURE 9 f (b) f (a) - b---a x-a Hence f is convex. THEOREM s PROOF is differentiable and the graph of point of contact, then f is convex. If f f lies above each tangent line except at the b. It is clear from Figure 9 that if (b, f (b)) lies above the tangent line at (a, f (a)), and (a, f (a)) lies above the tangent line at (b,f (b)),then the slope of the tangent line at (b,f (b)) must be larger than the slope of the tangent line at (a, f (a)). The following argument just says this with equations. Since the tangent line at (a, f (a)) is the graph of the function Let a < g(x)= f'(a)(x-a)+ f(a), 11, Appendix.Convexigand Concavig 221 and since (b, f (b)) lies above the tangent line, we have (1) f(b) > f'(a)(b-a)+ f(a). Similarly, since the tangent line at (b,f (b)) is the graph of h(x) and (a, f (a)) lies above = f'(b)(x b) + f (b), - the tangent line at (b, f (b)),we have (2) f (a) > f'(b)(a - b) + f (b). It follows from (1)and (2)that f'(a) < f'(b). It now follows from Theorem 2 that f is convex. If a function f has a reasonable second derivative, the information given in these in which f is convex or concave. theorems can be used to discover the regions Consider, for example, the function f(x) 1 = 1+ x2 . For this function, -2x f'(x) Thus f'(x) = 0 only for x = 0, and - (1 + f (0) f'(x) > f'(x) < 0 0 = x2)2 . 1, while if x if x < > 0, 0. Moreover, 0 for all x, f (x) 0 as xe oo or f is even. f (x) > -+ f (x) FIGURE = -oo, 1 1 ‡ x* 10 The graph of f therefore looks something like Figure 10. We now compute 222 Derivativesand Integrals f"(x) (1 + = x2)2(-2) + 2x - [2(1+ x2) - 24 (1 + x2)4 2(3x2 - 1) (1 + x2)3 It is not hard to determine the sign of f"(x). Note first that f"(x) 0 only when Since f" is clearly continuous, it must keep the same sign or x of each the sets on = = N -N. Since we easily compute, for example, that f"(-1) f"(0) f"(1) = -2 ( > } > = = < 0, 0, 0, we conclude that Since f" > > f" < 0 means (·--oo,-Ñ) and f is FIGURE f" f' --Ñ) and (Ñ, (-N, Ñ). 0 on (-oo, 0 on is increasing, it follows from Theorem 2 that f is convex on oo), while on (-5, f is concave (Figure 11). (Ñ, Ñ) f is concave convex oo), f is convex 11 (G, Notice that at j) the tangent line lies below the part of the graph to the oo), and above the part of the graph to the left, f is convex on since f is concave on thus the tangent line crosses the graph. In general, a number a is called an inflection point of f if the tangent line to the and graph of f at (a, f (a)) crosses the graph; thus are inflection x2). of points Note that the condition f"(a) 0 does not ensure f (x) 1/(1 + that a is an inflection point of f; for example, if f (x) x4, then f"(0) 0, but f is convex, so the tangent line at (0, 0) certainly doesn't cross the graph of f. In order for a to be an inflectionpoint of a function f, it is necessary that f" should have diEerent signs to the left and right of a. right, since (Ñ, (-Ñ, 5); N -Ñ = = = = l l, Appendix.Convexipand Concavig 223 This example illustrates the procedure which may be used to analyze any function f. After the graph has been sketched, using the information provided by f', the zeros of f" are computed and the sign of f" is determined on the intervals between consecutive zeros. On intervals where f" > 0 the function is convex; on intervals where f" < 0 the function is concave. Knowledge of the regions of of other convexity and concavity of f can often prevent absurd misinterpretation which data about f. Several functions, can be analyzed in this way, are given in the problems, which also contain some theoretical questions. To round out our discussion of convexity and concavity, we will prove one further result that you may already have begun to suspect. We have seen that convex and the that every tangent line intersects the graph property concave functions have will convince probably you that no other functions have just once; a few drawings this property. The proof of this assertion is rather tricky; it is closely related to the proof of Theorem 2 of the next chapter, and is probably best deferred until after that proof has been read. THEOREM 4 PROOF is differentiable on an interval and intersects each of its tangent lines just once, then f is either convex or concave on that interval. If f There are two parts to the proof. line can intersect the graph of f in threedifferent points. Suppose, on the contrary, that some straight line did intersect the graph of f at (a, f (a)), (b, f (b)) and (c, f (c)), with a < b < c (Figure 12). Then we (1)First would M we claim that no straight have f (b) f (a) · ¡ ¡ a b - b-a c-a Consider the function c ) ý(g) _ x FIGURE f (c) f (a) - (1) - for x in a [b,c]. g(c). So by Rolle's Theorem, there is some number that g(b) in (b,c) where 0 g'(x), and thus Equation 12 (1)says = x = O = (x a) f'(x) - or j f'(x) = f (x) f (a) x But this says (Figure 13) that the tangent contradicting the hypotheses. I I I a b x 13 - . a line at (x,f (x)) passes through (a, f (a)), I e (2)Suppose that ao < bo x, y, FIGURE - - . , [f (x) f (a)] - Zt < = = = co and ai < bl (1 t)ao+tal (1 t)bo + tbi (Î -1)CO #fCI < c1 are points in the interval. Let - - 0 yt 5 1. 224 Derivativesand Integrals (Problem 4-2) the points x, all lie between ao for y, and z,. Moreover, Then xo = ao and xi = at and and ai, with analogous statements < x, y, < for z, Osts 1. Now consider the function f (yr) f (x,) - f (z,) f (x,) - for 0 < ts 1. ze x, By step (1),g(t) ¢ 0 for all t in [0,1]. So either g(t) > 0 for all t in [0,1] or g(t) < 0 for all t in [0, 1]. Thus, either f is convex or f is concave (compare pages 231-232). g(t) = y, - - - x, PROBLEMS 1. Sketch, indicating regions of convexity and concavity and points of inflection, the functions in Problem 11-1 (consider (iv)as double starred). 2. Figure 30 in Chapter 11 shows the graph of 3. Find two convex functions f and g such that f(x) an integer. 4. is convex on an interval if and only if for all x and y in the interval we have Show that f'. Sketch the graph = of f. g(x) if and only if x is f f(tx+(1-t)y)<tf(x)+(1-t)/(y), for0<t<1. (This is just a restatement of the definition, but a useful one.) 5. and g are convex and f is increasing, then f og is convex. (It will be easiest to use Problem 4.) (b) Give an example where gof is not convex. (c) Suppose that f and g are twice differentiable. Give another proof of the œsult of part (a)by considering second derivatives. 6. and convex on an interval. Show that either f is increasing, or else f is decreasing, or else there is a number c such that f is decreasing to the left of c and increasing to the right of c. (b) Use this fact to give another proof of the result in Problem 5(a) when f (a) Prove that if f (a) Suppose that f is differentiable (You will have to be a little careful f'(g(y)) for x < y.) f'(g(x)) assuming f differentiable. You will without in part (a) (c) several of keep track have to different cases, but no particularly clever ideas are needed. Begin by showing that if a f (b), then f is decreasing to the left of a. and g are differentiable. (one-time) when comparing Prove the result *7. Let f(x) f > and be a twice-differentiable function with the following properties: 0. Prove that 0 for x :> 0, and f is decreasing, and f'(0) = 11, Appendir. Convexigand Concavig 225 0 for some x > 0 (sothat in reasonable cases f will have an inflection point at x-an example is given by f (x) 1/(1+x2)). Every hypothesis in this theorem is essential, as shown by f (x) 1 x2, which is not positive for all x; by f (x) x2, which is not decreasing; and by f (x) = 1/(x + 1), which does not satisfy f'(0) 0. Hint: Choose xo > 0 with f'(xo) < 0. We for have f'(y) 5 f'(xo) all y > xo. Why not? So f'(xi) > f'(xo)for cannot > xo. Consider f' on [0,xt]. some xi f"(x) = = = - = = *8. that if f is convex, then f ([x + y]/2) < [f (x) + f (y)]/2, that f satisfies this condition. Suppose Show that f(kx + (1 k)y) < (b) rational whenever number, between 0 and 1, kf (x) + (1 k) f (y) k is a of the form m/2". Hint: Part (a)is the special case n 1. Use induction, employing part (a)at each step. (c) Suppose that f satisfies the condition in part (a)and f is continuous. Show that f is convex. (a) Prove - - = *9. Let pl. - - - , numbers with Pn by positive p¿ = 1. i=1 (a) For any numbers xi, x, show that ..., p¿x¿ lies between the smallest i=1 and the largest x¿. n-1 n-i p;x;, where t (b) Show the same for (lft) = i=1 mequalay: If f (c) Prove Jensen's f p¿. i=1 is convex, then Hint: Use Problem 4, noting that p, = f p¿x; i=l 1 t. (Part (b)is needed - n-1 that (1/t) i p¿x; p¿f 5 i=1 is in the domain of f if x1, ..., (x;). to show x, are.) i=1 *10. function f, the right-hand derivative, lim [f(a + h) f (a)]/h, h->0+ is denoted by f '(a), and the left-hand derivative is denoted by f.'(a). The proof of Theorem 1 actually shows that f ' and f_' always exist if f is convex. Check this assertion, and also show that fe' and f_' are increasing, and that f_'(a) 5 f,'(a). Show that if f is convex, then fo'(a) f_'(a) if and only if f+' is continuous at a. (Thus f is differentiable precisely when f,' is continuous.) Hint: [f(b) f(a)]/(b a) is close to f_'(a) for b < a close to a, and f,'(b) is less than this quotient. (a) For any **(b) - = - *11. (a) Prove that - a convex function on R, or on any open interval, must be continuous. an example of a convex function on a closed interval that is not continuous, and explain exactly what kinds of discontinuities are possible. (b) Give 226 Derivativesand Integrals 12. Call a function f weakly convexon an interval if for a we have f (x) f (a) - x-a < b < c in this interval f (b) f (a) - 5 b-a a wealdy convex function is convex if and only if its graph no straight line segments. (Sometimes a wealdy convex function while convex functions in our sense are called is simply called (a) Show that contains "convex," a) a subsci t onves of die pluir "strictly convex".) (b) Reformulate 13. b) a nonconvex FIGURE [4 sul>set of the plane the theorems of this section for weakly convex functions. A set A of points in the plane is called convexif A contains the line segment joiningany two points in it (Figure 14). For a function f, let Ay be the set of points (x, y) with y > f (x), so that A is the set of points on or above y the graph of f. Show that A is convex if and only if f is weakly convex, in the terminology of the previous problem. Further information on convex sets will be found in reference [19]of the Suggested Reading. INVERSE FUNCTIONS CHAPTER We now have at our disposal quite powerful methods for investigating functions; what we lack is an adequate supply of functions to which these methods may be applied. We have studied various ways of forming new functions from oldusing these alone, we can addition, multiplication, division, and composition-but rational sine the the produce only functions function, although frequently (even used for examples, has never been defined). In the next few chapters we will begin to construct new functions in quite sophisticated ways, but there is one important method which will practically double the usefulness of any other method we discover. If we recall that a function is a collection of pairs of numbers, we might hit upon the bright idea of simply reversing all the pairs. Thus from the function f = {(1,2), (3,4), (5,9), (13,8) }, g = {(2. 1), (4, 3), (9, 5), (8, 13) }. we obtain 1 and g(4) While f(1) 2 and f(3) 4, we have g(2) Unfortunately, this bright idea does not always work. If = = f = = = 3. {(1, 2), (3, 4), (5, 9), (13, 4) }, then the collection {(2, 1), (4, 3), (9, 5), (4, 13) } is not a function at all, since it contains both (4, 3) and (4, 13). It is clear where the trouble lies: f (3) f (13), even though 3 ¢ 13. This is the only sort of thing that can go wrong, and it is worthwhile giving a name to the functions for which this does not happen. = DEFINITION A function f is one-one "one-to-one") if f (a) ¢ (read The identity function I is obviously and so one-one, f (b) whenever a ¢ b. is the following modification: g(x)= The function f (x) = x2 x¢3,5 3, 5, x=5 x is not one-one, since g(x)=x2, 227 x, = 3. f (-1) x>0 = f (1), but if we define 228 Derivativesand Integrals (andleave g'(x) = number 2x g undefined > 0, for x 0), then g is one-one, because g is increasing (since 0). This observation is easily generalized: If n is a natural for x > < and x>0, f(x)=x", then f is one-one. If n is odd, one can do better: the function f (x) = x" for all x is one-one (sincef'(x) = nx"¯' > 0, for all x ¢ 0). It is particularly easy to decide from the graph of f whether f is one-one: the condition f (a) ¢ f (b) for a ¢ b means that no horizontalline intersects the graph of f twice (Figure 1). a one-one function a function which is not one-one (b) FIGURE 1 If we reverse all the pairs in necessarily one-one function) f we obtain, in It is popular to abstain from this procedure unless f is one-one, but there is no particular reason to do so-instead of a definition with restrictive conditions we obtain a definition and a theorem. any case, some collection For any function f, the inverse of f, denoted by f¯I, (a, b) for which the pair (b,a) is in f. DEFINITION THEOREM (anot of pairs. 1 PROOF f-1 is a function if and only if is the set of all pairs is one-one. f Suppose first that f is one-one. Let (a, b) and (a, c) be two pairs in f-1. Then (b,a) and (c,a) are in f, so a f(c); since f is one-one this f(b) and a implies that b = c. Thus f¯l is a function. Conversely, suppose that f is a function. If f (b) f (c), then f contains the pairs (b, f (b))and (c, f (c)) (c, f (b)), so (f (b),b) and (f (b),c) are in f¯I. Since f¯1 is a function this implies that b c. Thus f is one-one. = = ¯1 = = = 12. InverseFunctions 229 The graphs of f and f¯1 are so closely related that it is possible to use the graph of f to visualize the graph of f-1. Since the graph of f¯1 consists of all pairs (a, b) with (b,a) in the graph of f, one obtains the graph of f¯I from the graph of f by interchanging the horizontal and vertical axes. If f has the graph shown in Figure 2(a), FIGURE I 1 I 2 I I 2(a) I This procedure is awkward with books and impossible with blackboards, so it is fortunate that there is another way of constructing the graph of f-1. The points 230 Derivativesand Integrals and (b,a) are reflections of each other through the graph of I(x) = x, which is called the diagonal (Figure 4). To obtain the graph of f¯I we merely reflect the graph of f through this line (Figure 5). (a, b) FIGURE FIGURE4 5 Reflecting through the diagonal twice will clearly leave us right back where we started; this means that (f-1)-1 f, which is also clear from the definition. In this equation conjunction with Theoæm 1, has a significant consequence: if f is a is a one-one function, then the function f¯* is also one-one (since(f¯1 function). There are a few other simple manipulations with inverse functions of which you should be aware. Since (a, b) is in f precisely when (b, a) is in f¯', it follows that = -1 b Thus f x3, then ¯I = (b) is the f¯*(b)is the definition, 6. The fact that the same as means f(a) number (unique) f¯I(x) much more compact f(f¯I(x)) a such that y such that is the number form: x, f (a) a' a such that unique number = a = = f(y) L f¯ = b; for example, if f (x) b, and this number is, by = = x can be restated in a for all x in the domain of f¯'. Moreover, f¯I(f(x)) = x, for all x in the domain of this follows from the previous equation important equations can be written upon replacing f; f by f¯I. These two fof¯*=1, f¯1 (exceptthat y the right side will have a bigger domain if the domain of not all of R). f or f-I is 12. InverseFunctions 231 Since many standard functions will be defined as the inverses of other functions, it is quite important that we be able to tell which functions are one-one. We have and already hinted which class of functions are most easily dealt with-increasing obviously decreasing functions are one-one. Moreover, if f is increasing, then f is also increasing, and if f is decreasing, then f¯I is decreasing (theproof is left to you). In addition, f is increasing if and only if f is decreasing, a very useful fact to remember. It is certainly not true that every one-one function is either increasing or decreasing. One example has already been mentioned, and is now graphed in Figure 6: e - -/ FIGURE 6 g(x)= 3, 5 x, x 3, 5, x=5 x 3. = Figure 7 shows that there are even continuous one-one functions which are neither increasing nor decreasing. But ifyou try drawing a few pictures you will soon agree that every one-one continuous function defined on an interval is either increasing or decreasing. It's possible to give a straightforward, but cumbersome, proof of this much like Problem 6(c) in fact that involves keeping track of a lot of cases (very the previous Appendix). The following proof dispenses with all these unpleasant details, although it is rather tricky. THEOREM 2 If f is continuous and one-one on that interval. of an interval, then f is either increasing or decreasing PROOF I et ao bo be two numbers < in the interval. Since f is one-one, either (i) f (bo) f (ao) > 0 or (ii) f(bo) f(ao) < we know that -- - 0. We will assume that (i)is true, and show that the same inequality holds for any ai < bi in the interval, so that f in increasing. (A similar argument shows that if (ii)is true, then f is decreasing.) Let x,=(1-t)ao+tal (1 t)bo + tb1 y, = 7 for0stsl. ao and xi = ai and the points xr all lie between ao and at An analogous statement holds for y,. So x, and y, are all in the Moreover, since ao < be and ai < bi, we also have Then xo FIGURE - = x,<y, (Problem 4-2). domain of f. for0stsl. Now consider the function g(t)= f(yr) -f(Ir) forO5 t 5 1. Using Theorem 6-2, it is easy to see that g is continuous on [0, 1]. Moreover, g(t) is never 0, since x, < y, and f is one-one. Consequently, g(t) is either positive for all t in [0, 1] or negative for all t in by the Intermediate Value [0,1] (otherwise, 232 Derivativesand Integrals Theorem it would also by 0 somewhere in [0,1]). But g(0) > 0, which means that (i)also holds for ai, bl. I g(1) I I I c a FIGURE > 0 by (i).So also Henceforth we shall be concerned almost exclusively with continuous increasing or decreasing functions which are defined on an interval. If f is such a function, it is possible to say quite precisely what the domain of f¯1 will be like. Suppose first that f is a continuous increasing function on the closed interval f takes on every value between [a,b]. Then, by the Intermediate Value Theorem, f-I of and the closed interval [f(a), f(b)]. the is domain f(a) f(b). Therefore, and decreasing on [a,b], then the domain of f¯I is Similarly,if f is continuous b a [f(b),f(a)]. a If f is a continuous increasing function on an open interval (a, b) the analysis becomes a bit more difficult. To begin with, let us choose some point c in (a, b). We will first decide which values > f (c) are taken on by f. One possibility is that f takes on arbitrarily large values (Figure 8). In this case f takes on all values > f (c), by the Intermediate Value Theorem. If, on the other hand, f does not take on arbitrarily large values, then A {f (x) : c < x y (because actually of takes on A). By the Intermediate Value Theorem, f upper bound the value y. Notice that f cannot take on the value a itself; for if a = f (x) for a < x a, which is impossible. Precisely the same arguments work for values less than f (c): either f takes on all values less than f(c) or there is a number ß < f(c) such that f takes on all values between ß and f (c), but not ß itself. This entire argument can be repeated if f is decreasing, and if the domain of f is R or (a, oo) or (-oo, a). Summarizing: if f is a continuous increasing, or decreasing, function whose domain is an interval having one of the forms - = FIGURE 9 (a, b), (-oo, b), (a, oo), or R, then the domain of f ¯I is also an interval which has one of these four forms. Now that we have completed this preliminary analysis of continuous one-one functions, it is possible to begin asking which important properties of a one-one function are inherited by its inverse. For continuity there is no problem. THEOREM s PROOF If f is continuous and one-one on an interval, then f¯I is also continuous. We know by Theorem 2 that f is either increasing or decreasing. We might as well assume that f is increasing, since we can then take care of the other case by applying the usual trick of considering We must show that --- f. -1(b) lim f-1 12. InverseFunctions 233 for each b in the domain of f-1. Such a number b is of the form f (a) for some in the domain of f. For any e if f(a)-ô<x< > 0, we to find a ô want Figure 10 suggests the way of finding 8 see the graph of f-1): since a 0 such that, for all x, f¯I(x)<a+e. thena-e< f(a)+ô, > that (remember by looking sideways you a-s<a<a+s, it follows that f(a - f(a) 6 - f(a) e - i I I ------- I I ---- i I I a-a FIGURE i i I f(a-s)5 f(a+ e); - - f(a)-ôand f(a)+85 f(a+e). Consequently, if a+s a < let 8 be the smaller of f (a + s) f (a) and f (a) f (a e). Figure 10 contains the entire proof that this & works, and what follows is simply a verbal account of the information contained in this picture. Our choice of ô ensures that I i I f(a) < - we f(a) + a e) f(a)-B<x< f(a)+8, f(a-s)<x< f(a+e). then lo Since f is increasing, f¯I is also increasing, and we obtain f¯I(f(a-e)) < *(x)< f f¯I(f(a+s)), i.e., f¯I(x)<a+e, a-e< which (f(a), a) - , / L, is precisely what we want. Having successfully investigated continuity of f¯*, it is only reasonable to tackle differentiability. Again, a picture indicates just what result ought to be true. Figure 11 shows the graph of a one-one function f with a tangent line L through (a, f (a)). If this entire picture is reflected through the diagonal, it shows the graph and the tangent line L' through (f(a), a). The slope of L' is the reciprocal of the slope of L. In other words, it appears that -1 (a,f(a)) / L / FIGURE / / 11 (f¯')'(f(a))= . f'(a) This formula can equally well be written in a way which expresses (f¯1y(b) directlÿ, (Or each b in the domain of f-I (f I)'(b) continuity, Unlike the argument for involved when formulated analytically. = 1 f'( f-1(b)) this pictorial "proof" There is another becomes somewhat approach which might 234 Derivativesand Integrals be tried. Since we know that f (f¯1 it is tempting to prove the desired formula by applying the Chain Rule: -1)'(x) f'(f¯1 1, = so 1 f'( f¯l(x)) f¯I f(x) - is differentiable, since the Chain Rule Unfortunately, this is not a proof that f¯I already cannot be applied unless is known to be differentiable. But this arguis differentiable, and it can ment does show what (f-1)'(x) will have to be if f also be used to obtain some important preliminary information. x THEOREM 4 PROOF If f is a continuous one-one function defined on an interval and then f I is not differentiable at a. f'(f-1(a)) = 0, We have f(f-1(x))=x. 1 If f¯i were differentiable at a, the Chain Rule would imply that (a) f'( f¯ (a)) (f¯')'(a) - 1, = hence 0 (f-1)'(a) = - which is absurd. 1, I A simple example to which Theorem 4 applies is the function f (x) x3. Since f'(0) 0 and 0 f¯I(0), the function f¯I is not differentiable at 0 (Figure 12). Having decided where an inverse function cannot be differentiable, we are now ready for the rigorous proof that in all other cases the derivative is given by the formula which we have already "derived" in two different ways. Notice that the following argument uses continuig of f-1, which we have already proved. = = (b) FIGURE 12 THEOREM 5 = Let f be a continuous one-one function defined on an interval, and suppose that I(b), with derivative f'(f¯1(b)) ¢ 0. Then f I is differf is differentiable at f entiable at b, and 1 f¯I)'(b) ( PROOF Let b = f(a). = . f'( f-1(b)) Then -1(b) lim f¯* (b + h) - h h->o = lim h->0 f -1(b + h) h - a 12. InverseFunctions 235 Now every number b+h= for a unique k Then really (weshould . hm I ¯ b + h in the domain of f can be written in the form f(a+k) write k(h), but we will stick with k for simplicity). f~1(b+h)-a h hw0 f¯I(f(a+k)) -a f(a+k)-b A-o k =lim . f Aso (a + k) f (a) - We are clearly on the right track! It is not hard to get an explicit expression for k; since b+h= we f(a+k) have f"(b+h)=a+k or -1(b k -1(b). + h) f = f -- Now by Theorem 3, the function f¯1 is continuous 0 as h approaches 0. Since at b. This means that k approaches lim f(a+k)- koo f(a) = k f'(a) f'( f¯1(b)) ¢ 0, = this implies that I (f¯ )'(b) 1 = f'( . f¯I (b)) The work we have done on inverse functions will be amply repaid later, but here is an immediate dividend. For n odd, let f.(x) for all x; xn = for n even, let f,(x) Then f, is a continuous one-one x", = x > 0. function, whose inverse function is gn(x) = = x1/n 236 Derivativesand Integrals By Theorem 5 we have, for x ¢ 0, g,'(x) 1 = f,'( f.-1 1 n( f.- l (x))"¯I 1 R(Xl|ngn-1 1 1 1-(I/n) 1 (1/n)-1 n xa, and a is an integer or the reciprocal of a natural number, then It is now easy to check that this formula is true if a is any rational = m/n, where m is an integer, and n is a natural number; if a Thus, if f (x) f'(x) = number: axa-1. Let = f (x) x"'7" = = (x1/np= then, by the Chain Rule, j'(x) = I m l/n m-1 Î 1 (1/n)-1 n [(m/n)-(1/n)]+[(1¡n)-l] n (m/n)-1 n xa and a is rational, x. for irrational a will have to be saved the treatment of the function f(x) the for later-at moment we do not even know the meaning of a symbol like x Actually, inverse functions will be involved crucially in the definition of xa for irrational a. Indeed, in the next few chapters several important functions will be Although we now have a formula for f'(x) when f (x) = = . defined in terms of their inverse functions. PROBLEMS 1. Find f¯I for each of the following f. (i) f (x) = 3 x + (ii) f (x) (x = (iii) f (x) = - 1. 1)3. x rational irrational. x x, . -x, . -x2 (iv) f (x) = 1--x (v) f(x)= (vi) f(x) = x 3 , > 0 x<0. x, x¢a1,...,a, ai+1 ai, x=ag, x=a.. x + [x]. i=1,...,n-1 12. InverseFunctions 237 (vii)f (0.aia2a3 ) x (viii)f (x) 1 x . . . = Û.a2Gl = < , Describe the graph of 3. f-I is increasing and is increasing and is decreasing and is decreasing and (i) f (ii) f (iii) f (iv) f x . . . (Decimal representation is being used.) < 1. when always positive. always negative. always positive. always negative. f is increasing, Prove that if . -1 - 2. ü3 then so is f¯I, and similarly for decreasing functions. 4. If f and g are increasing, is f + g? Or 5. (a) Proveg)¯Ithat (fo (b) Find 6. 7. g? Or g? fo Find i = f (x) = ¯1 f - if f and g are one-one, then fog is also one-one. in terms of f¯1 and g¯1. Hint: The answer is not f¯1 in terms of f¯1 if g(x) 1 + f (x). g¯I Show that f in this case. On which intervals ax + b is one-one if and only if ad cx + d [a,b] will the - bc ¢ 0, and find following functions be one-one? (i) f(x)-x3-3x2. x5 4 (ii) f(x) = 8. (iii) f(x) = (1+x2)-1. (iv) f (x) = + 1 x + 1 . Suppose that f is differentiable with derivative f'(x) that g 9. x = f¯I satisfies g"(x) = = (1+ x3)¯I/2. Show (g(x)2 Suppose that f is a one-one function and that f¯1 has a derivative which is 0. Prove that f is difFerentiable. Hint: There is a one-step proof. nowhere 10. The Schwarzian derivative 2 f was defined in Problem 10-17. if 9 f(x) exists for all x, then 2 f¯1(x)also exists for all x in the domain of f f-I(x). Find a formula for o (b) (a) Prove that . *11. there is a differentiablefunction f such that [f (x)]5 4 all that expressed 0 for Hint: Show be inverse x x. as an f can function. The easiest way to do this is to find f-1. And the easiest way f¯I(y). to do this is to set x = (b) Find f' in terms of f, using an appropriate theorem of this chapter. (c) Find f' in another way, by simply differentiating the equation defining f. (a) Prove that = 238 Derivativesand Integrals The fimction in Problem 11 is often said to be denned irnplicitly by the yS+y+x 0. The situation for this equation is quite special, however. As the next problem shows, an equation does not usually define a function implicitly on the whole line, and in some regions more than one function may be defined implicitly. equation 12. = are the two diEerentiable functions f which are defined implicitly 1, i.e., which satisfy x2+[ f (x)]2 1 on (-1, 1) by the equation x2+y2 for all x in (-1, 1)? Notice that there are no solutions defined outside (a) What = = [-1, 1]. satisfy x2 + [f(x)]2 -P Which differentiable functions f satisfy [f (x)]3 3 f (x) x3 will help to first draw the graph of the function g(x) (b) Which functions f *(c) - - = x? = - Hint: It 3x. In general, determining on what intervals a differentiable function is defined implicitly by a particular equation may be a delicate afTair, and is best discussed in the context of advanced calculus. If we assume that f is such a differentiable solution, however, then a formula for f'(x) can be derived, exactly as in Problem 11(c),by differentiating both sides of the equation defining f (aprocess known as "implicit differentiation"): 13. 22 1. Notice that your to the equation [f (x)]2 4 expected, since there is more will this only involve to be answer f (x); is 2 than one function defined implicitly by the equation y2 g check that your answer works for both of the functions f found in But (b) Problem 12(a). (c) Apply this same method to [f(x)]3 3 f (x) x. (a) Apply this method = _ = - 14. to find f'(x) and f"(x) for the functions f defined implicitly by the equation x3 + y3 7. (b) One of these functions f satisfies f (-1) 2. Find f'(-1) and f"(-1) for this f. (a) Use implicit differentiation = = 15. 4 The collection of all points (x, y) such that 3x3 + 4x2y xy2 + 2y3 forms a certain curve in the plane. Find the equation of the tangent line to this curve at the point (-1, 1). 16. Leibnizian notation is particularly convenient for implicit differentiation. Because y is so consistently used as an abbreviation for f (x), the equation in x and y which defmes f implicitly will automatically stand for the equation which f is supposed to satisfy. How would the following computation be - written in our notation? +y3+g-1 y 4y'$+3y2Ë+y+xË=0, dx dx dx dy --y ¯ Ï 4y3+3y2+x = 12. InverseFunctions 239 17. As long as Leibnizian notation has entered the picture, the Leibnizian notation for derivativesof inverse functions should be mentioned. If dy/dx denotes the derivative of f, then the derivative of f¯I is denoted by dx/dy. Write out Theorem 5 in this notation. The resulting equation will show you another reason why Leibnizian notation has such a large following. It will also explain at which point (f-1), is to be calculated when using the dx/dy notation. What is the significance of the following computation? x=y", y X1/n = dxil" dy 1 1 ¯ Ì dx nyn-1 dy 18. Suppose that f is a differentiable one-one function with a nowhere zero F(f¯I(x)). xf¯1(x) F'. Let G(x) Prove that derivative and that f G'(x) this problem tells details, (Disregarding interesting f¯1(x). us a very fact: if we know a function whose derivative is f, then we also know one whose derivative is f-i. But how could anyone ever guess the function G? Two different ways are outlined in Problems 14-17 and 19-15.) = = - = 19. Suppose h is a function such that h'(x) Find (i) (ii) 20. sin2(sin(x = + 1)) and h(0) 3. = (h¯I)'(3). (ß¯I)'(3), where Find a formula for ß(x) = h(x + 1). (f¯I)"(x). -l(x)) *21. 22. Prove that if f(k) exists, and then is nonzero, (f¯l)(k)(X) eXists. that an increasing and a decreasing function intersect at most once. (b) Find two continuous increasing functions f and g such that f (x) g(x) precisely when x is an integer. (c) Find a continuous increasing function f and a continuous decreasing function g, defined on R, which do not intersect at all. (a) Prove = *23. f¯I, is a continuous function on R and f prove that there is at least one x such that f (x) x. (What does the condition f f mean geometrically?) (b) Give several examples of continuous f such that f f¯1 and f (x) x for exactly one x. Hint: Try decreasing f, and remember the geometric interpretation. One possibility is f(x) f¯I, then (c) Prove that if f is an increasing function such that f = all will Although the geometric interpretation f (x) x for x. Hint: 2 lines) is to rule be immediately convincing, the simplest proof (about out the possibilities f (x) < x and f (x) > x. (a) If f = ¯1 = = = = -x. = = *24. Which functions have the property that the graph is still the graph of a function when reflected through the graph of --I "antidiagonal")? (the 240 Derivativesand Integrals 25. A function f is nondecreasing if f (x) sf (y) whenever more precise we should stipulate that the domain of f nonincreasing function is defined similarly. Caution: "increasing" "nondecreasing," instead of and "strictly x < y. (To be be an interval.) A Some writers use increasing" for our "increasing." (a) Prove that if f is nondecreasing, but not increasing, then f is constant increasing" is not on some interval. (Beware of unintentional puns: "not the same as "nonincreasing.") (b) Prove that if all x. (c) Prove that *26. f is differentiable and nondecreasing, if f'(x) > then f'(x) 0 for > 0 for all x, then f is nondecreasing. > 0 for all x, and that f is decreasing. Prove that there is a continuous decreasing function g such that 0 < g(x) 5 f(x) for (a) Suppose that f (x) ' all x. (b) Show that we can even arrange that g will satisfy lim g(x)/f(x) X400 = 0. of Curves 241 12, Appendix.ParametricRepresentation APPENDIX. PARAMETRIC REPRESENTATION OF CURVES The material in this chapter serves to emphasize something that we noticed a long time ago-a perfectly nice looking curve need not be the graph of a function (Figure 1). In other words, we may not be able to describe it as the set of all points (x, f (x)). Of course, we might be able to describe the curve as the set of all points (f (x),x); for example, the curve in Figure 1 is the set of all points (x2,x). But even this trick doesn't work in most cases. It won't allow us to describe the circle, consisting of all points (x, y) with x2 + y2 1, or an ellipse, and it can't be used to describe a curve like the one in Figure 2. The simplest way of describing curves in the plane in general harks back to the physical conception of a curve as the path of a particle moving in the plane. At each time t, the particle is at a certain point, which has two coordinates; to indicate the dependence of these coordinates on the time t, we can call them u(t) and v(t). Thus, we end up with two functions. Conversely, given two functions u and v, we can consider the curve consisting of all points (u(t), v(t)). This curve is said to by u and v, and the pair of functions u, v is called a be represented parametrically parametric representation of the curve. The curve represented parametrically by = FIGURE 1 v thus consists of all pairs with x and y It is often v(t)." Notice that the graph of a u(t), y described briefly as curve x function f can always be described parametrically, as the curve x t, y f (t). Instead of considering a curve in the plane as defined by two functions, we can obtain a conceptually simpler picture if we broaden our original defmition of function somewhat. Instead of considering a rule which associates a number with another number, we can consider a c from real numbers to the plane," i.e., a rule c that associates, to each number t, a pointin theplane,which we can denote by c(t). With this notion, a curve is just a function from some interval of real numbers to the plane. Of course, these two different descriptions of a curve are essentially the same: functions u and v determines a single function c from the real A pair of (ordinary) numbers to the plane by the rule u and (x, y) "the = u(t) = v(t). = = = = "function FIGURE 2 c(t) = (u(t), v(t)), given a function c from the real numbers to the plane, each c(t) the plane, is a point in so it is a pair of numbers, which we can call u(t) and v(t), unique functions u and v satisfying this equation. so that we have In Appendix 1 to Chapter 4, we used the term to describe a point in the plane. In conformity with this usage, a curve in the plane may also be called function." The conventions of that Appendix would lead us to a write c(t) (c1(t),c2(t)), but in this Appendix we'll continue to use notation like c(t) (u(t), v(t)) to minimize the use of subscripts. simple example of a vector-valued function that is quite useful is A and, conversely, "vector" e(r) "vector-valued = = e(t) FIGURE 3 Which = (cost, sin t), goes round and round the unit circle (Figure 3). 242 Derivativesand Integrals functions f and g, we defined new functions f + g and f g For two (ordinary) by the rules - (1) (2) (f + g)(x) = g)(x) = (f - f (x) + g(x), f (x) g(x). - Since we have defined a way of adding vectors, we can imitate the first of these definitions for vector-valued functions c and d: we define the vector-valued function c+d by (c+d)(t) where the + on the right-hand to saying that if = c(t) +d(t), is now the sum ofvectors. This simply amounts side c(t) = (u(t), v(t)), d(t) = (w(t), z(t)), then (c +d)(t) = v(t)) + z(t)) (u(t) + w(t), v(t) + z(t)). (w(t), (u(t), = Recall that we have also defined a v for a number a and a vector v. To extend this to vector-valued functions, we want to consider an onlinary function a and a vector-valued function c, so that for each t we have a number a(t) and a function a c by vector c(t). Then we can define a new vector-valued - - (a - c)(t) where the on the right-hand side simply amounts to saying that - (« - c)(t) = a(t) - = a(t) - c(t), is the product of a number (u(t), v(t)) = (a(t) u(t), - a(t) and a vector. This v(t)). - Notice that the curve a e, - (a (a ," . e)(t) e)(t)= 4 a(t)sint), -e)(t) "graph - of a in polar coordinates." functions generally, given any vector-valued r and 8 function c, we can define new by c(t) FIGURE (a(t)cost, is already quite general (Figure 4). In the notation of Appendix 3 to Chapter 4, the point (a e)(t) has polar coordinates a(t) and t, so that (a is the Even more ' - = r(t) - e(8(t)), where r(t) is just the distance from the origin to c(t), and 0(t) is some choice of the angle of c(t) (asusual, the function 9 isn't defined unambiguously, so one has careful when using of writing arbitrary c). this be to way curve an We aren't in a position to extend (2)to vector-valued functions in general, since However, Problems 2 and 4 of we haven't defined the product of two vectors. real-valued products Appendix 1 to Chapter 4 define two o w and det(v, w). It • of Curves 243 12, Appendix.ParametricRepresentation functions c and d, how we would define two be clear, given vector-valued ordinary (real-valued) functions should and c•d det(c, d). Beyond imitating simple arithmetic operations on functions, we can consider more interesting problems, like limits. For c(t) (u(t), v(t)), we can define = limc(t) (*) = lim(u(t), v(t)) lim v(t) (limu(t), to be . Rules like lim c + d lim a c - = lim c + lim d, = lim a (t) lim c - follow immediately. Problem 10 shows how to give an equivalent definition that imitates the basic definition of limits directly. Limits lead us of course to derivatives. For c(t) (u(t), v(t)) = define c' by the straightforward we can c'(a) definition v'(a)). (u'(a), = We could also try to imitate the basic definition: . c'(a) lun = c(a + h) - c(a) , h amo understood where the fraction on the right-hand side is 1 - · [c(a+ h) - to mean c(a)]. As a matter of fact, these two defmitions are equivalent, because c(a+h)-c(a) u(a+h)-u(a) v(a+h)-v(a) = lim hm ho0 h-o h h h c(a+h) . , c = lim u(a+h)-u(a) h hw0 . , hw0 by our definition = c(a) Inn v(a+h)-v(a) h (*)of limits v'(a)). (u'(a), Figure 5 shows c(a + h) and c(a), as well as the arrow from c(a) to c(a + h); c(a), except as we showed in Appendix 1 to Chapter 4, this arrow is c(a + h) would moved over so that it starts at c(a). As h 0, this arrow appear to move closer and closer to the tangent of our curve, so it seems reasonable to defmethe when c'(a), straight along c'(a) of c(a) the line is moved to be tangent line c at other words, of c(a). In we define the tangent line c at c(a) over so that it starts at - -> FIGURE 5 as the set of all points c(a)+s·c'(a); 244 Derivativesand Integrals 1 we get c(a) + c'(a), etc. (Note, 0 we get the point c(a) itself, for s for s make however, that this definition does not 0.) Problem 1 sense when c'(a) shows that this definition agrees with the old one when our curve c is defined by = = = c(t) so that we simply have the graph of Once again, various old formulas (c + d)'(a) (a c)'(a) - or, as equations (t, f (t)), = f. have analogues. For example, = c'(a) + d'(a), = a'(a) .c'(a), .c(a) +a(a) involving functions, (c+d)'=c'+d', (a-c)'=a'-c+a-c'. These formulas can be derived immediately from the definition in terms of the component functions. They can also be derived from the definition as a limit, by imitating previous proofs; for the second, we would of course use the standard trick of writing «(a + h)c(a + h) a(a)c(a) - = a(a + h) We can also consider - [c(a+ h) c(a)] + - [a(a + h) - a(a)] · c(a). the function d(t) c(p(t)) = = (co p)(t), where p is now an ordinary function, from numbers to numbers. The new curve d through the same points as c, except at different times; thus p corresponds passes of c. For to a "reparameterization" c (u, v), = d=(uop,vop), we obtain d'(a) = = ((uop)'(a), (vo p)'(a)) p'(a)v'(p(a))) (p'(a)u'(p(a)), = p'(a)•(u'(p(a)),v'(p(a))) = p'(a) - c'(p(a)), or simply d'= p'-(c'o p). Notice that if p(a) a, so that d and c actually pass through the same point at time a, then d'(a) = p'(a) c'(a), so that the tangent vector d'(a) is just a multiple of c'(a). This means that the tangent line to c at c(a) is the same as the tangent line to the reparameterized c(a). The one exception occurs curve d at d(a) when p'(a) 0, since the tangent line for d is then undefined, even though the = · = = of Curves 245 12, Appendix.ParamekicRepresentatwn c(t3) won't have a tangent tangent line for c may be defined. For example, d(t) line defined at t 0, even though it's merely a reparameterization of c. Finally, since we can define real-valued functions = = (c.d)(t) det(c, d)(t) = c(t) .d(t), det(c(t) d (t)), = , we ought to have formulas for the derivatives of these new functions. As you might guess, the proper formulas are (c•d)'(a) [det(c,d)]'(a) = c(a)•d'(a) = det(c', d)(a) + det(c, d')(a). +c'(a)•d(a), calculations from the definitions in terms These can be derived by straightforward functions. But it is more elegant to imitate the proof of the ordinary product rule, using the simple formulas in Problems 2 and 4 of Appendix 1 trick" referred to above. to Chapter 4, and, of course, the of the component "standard PROBLEMS 1. "point-slope form" (Problem 4-6) of the tangent function f, the written line at (a, f(a)) can be as y f'(a), so that the f(a) (x tangent line consists of all points of the form (a) For a --a) - = (x,f(a)+(x-a)f'(a)). Conclude that the tangent line consists of all points of the form (a+ s, f (a) + s f'(a)). (b) If c is the curve c(t) (a, f (a)) at 2. (t, f (t)), conclude that the tangent line of c at the [usingour new defmition] is same as the tangent line of f = (a, f (a)). Let c(t) (f (t), t2), where f is the function shown in Figure 21 of Chapfunction ter 9. Show that c lies along the graph of the non-differentiable reparameterization c'(0) other words, that h(x) but 0. In a can |x|, this usually only For interested in reason, we are a corner. curves c with c' never 0. = = = "hide" 3. v(t) is a parametric u(t), y Suppose that x and that u' 0 on some interval. = = representation of a curve, -/ that on this interval the curve lies along the graph of (b) Show that at the point x u(t) we have (a) Show = J'(x) v'(t) = -. u'(t) f = vo u--1 246 Derivativesand Integrals In Leibnizian notation this is often written suggestively as dy Ã dy dx (c) We also have f"(x) 4. 5. u'(t)v"(t) . (u'(t))3 By implicit differentiation. By considering the parametric representation 6 = u(t), y We've seen that the = cos3 t, y 2/3 = sin3 _ g t = = "graph of -e)(t)= (f f in polar coordinates" metric is the curve (f(t)cost, f(t)sint); in other words, the graph of f in polar coordinates is the curve with the para- representation x= 6. x x2/3 v(t) be the parametric representation of a curve, with differentiable, and let P (xo,70) be a point in the plane. Prove that if the point Q = (u(i), v(i)) on the curve is closest to (xo,yo), and u'(i) and v'(i) are not both 0, then the line from P to Q is perpendicular to the tangent line of the curve at Q (Figure 6). The same result holds if Q is furthest from (xo,yo). Let x u and v FIGURE v'(t)u"(t) - = Consider a function f defined implicitly by the equation Compute f'(x) in two ways: (i) (ii) P dx f(0)cos9, y= f(0)sin9. for the graph of f in polar coordinates the slope ofthe tangent line at the point with polar coordinates (f (0),0) is (a) Show that --f(0)sin0 f(8)cos0 + f'(6)sin0 + f'(0)cos0 if f (0) 0 and f is differentiable at 0, then the line through the origin making an angle of 0 with the positive horizontal axis is a tangent line of the graph of f in polar coordinates. Use this result to add some details to the graph of the Archimedian spiral in Appendix 3 of Chapter 4, and to the graphs in Problems 3 and 10 of that Appendix (b) Show that = as well. that the point with polar coordinates (f (Ð),0) is further from the origin O than any other point on the graph of f. What can you say about the tangent line to the graph at this point? Compare with (c) Suppose Problem 5. of Curves 247 12, Appendix.ParametncRepresentation that the tangent line to the graph of f at the point with polar (f(0),0) makes an angle of a with the horizontal axis 0 is the angle between the tangent line and the (Figure 7), so that a the point. Show that from O to ray (d) Suppose coordinates / - _ tan(a 7. = f (0) . J'(0) Problem 5 of Appendix 1 to Chapter 4 we found that the cardioid 2 2 1 sin 0 is also described by the equation (x2+ y2 + y)2 r slope of the tangent line at a point on the cardioid in two ways: Find the - (i) (ii) By implicit differentiation. By using the previous problem. (b) Check that 7 0) (a) In = FIGURE - . at the origin the tangent lines are vertical, as they appear to be in Figure 8. FIGURE 8 The next problem uses the material from Chapter 15, in particular, radian measure, and the inverse trigonometric functions and their properties. 8. A gdoid is defined as the path traced out by a point on the rim of a rolling wheel of radius a. You can see a beautiful cycloid by pasting a reflector on the edge of a bicycle wheel and having a friend ride slowly in front of the headlights of your car at night. Lacking a car, bicycle, or trusting friend, you can settle instead for Figure 9. FIGURE 9 248 Derivativesand Integrals of the point on the rim after the u(t) and v(t) be the coordinates wheel has rotated through an angle of t (radians). This means that the wheel rim of the from P to Q in Figure 10 has length at. Since the arc wheel is rolling, at is also the distance from O to Q. Show that we have of the cycloid the parametric representation (a) Let = v(t) = a(t a(1 sint) - - cost). i 0 x Q length FIGURE u(t) lo at Figure 11 shows the curves we obtain if the distance from the point to the center of the wheel is (a)less than the radius or (b)greater than the radius. In the latter case, the curve is not the graph of a function; at certain times the point is moving backwards, even though the wheel is moving forwards! (b) FIGURE 11 In Figure 9 we drew the cycloid as the graph of a function, but we really need to check that this is the case: u'(t) and conclude that u is increasing. Problem 3 then shows that the cycloid is the graph of f vo u¯1, and allows us to compute (b) Compute = f'(t). It isn't possible to get an explicit formula for f, but we can come close. of Curves 249 12, Appendix.ParametricRepresentation (c) Show that a - v(t) |[2a v(t)]v(t). ± a Hint: first solve for t in terms of v(t). (d) The first half of the first arch of the cycloid is the graph of g u(t) , = a arccos g(y) = aarccos a y - f(2a --- , where y)y. - y 9. - Let u and v be continuous on [a,b] and differentiable on (a, b); then u representation of a curve from P give parametric v a (u(a),u(b)) to Q (v(a),v(b)). Geometrically, it seems clear (Figure 12) that at some point on the curve the tangent line is parallel to the line segment from P to Q. Prove this analytically. Hint: This problem will give a geometric interpretation for one of the theorems in Chapter 11. and = = FIGURE 12 10. The following definition of a limit for a vector-valued analogue of the definition for ordinary functions: lim c(t) l means that for every e = all t, if 0 |t < a| - 8, then < ||c(t) - > l function is the direct 0 there is some 8 < 0 such that, for > e. Here || | is the norm, defined in Problem 2 of Appendix 1 to Chapter 4. If I = then (li, l2), | c(t) (a) Conclude - -12\ = lu(t) lil l || and - + Iv(t) . that |u(t) ll ] 5 | c(t) - and show that also I ||2 - if lim c(t) = |v(t) l2| 5 ||c(t) l ||, - l according - to the above definition, then we have limu(t) = li and limv(t) t-+a = l2, ted 1 according to our definition(*)in terms of component functions, on page 243. (b) Conversely, show that if lim c(t) 1 according to the definition in terms of component functions, then also lim c(t) l according to the above so that lim c(t) = = = definition. INTEGRALS CHAPTER "integral," the The derivative does not display its full strength until allied with the complete second main concept of Part III. At first this topic may seem to be a digression-in this chapter derivatives do not appear even once! The study of integrals does require a long preparation, but once this preliminary work has been completed, integrals will be an invaluable tool for creating new functions, and the derivative will reappear in Chapter 14, more powerful than ever. Although ultimately to be defined in a quite complicated way, the integral formalizes a simple, intuitive concept-that of area. By now it should come as no surprise to learn that the definition of an intuitive concept can present great difficulties-"area" is certainly no exception. In elementary geometry, formulas are derived for the areas of many plane figures, but a little reflection shows that an acceptable definition of area is seldom given. The area of a region is sometimes defined as the number of squares, with sides of length 1, which fit in the region. But this defmition is hopelessly inadequate for any but the simplest regions. For example, a circle of radius I supposedly has "x squares" means. as area the irrational number x, but it is not at all clear what which supposedly has area 1, it is hard Even if we consider a circle of radius 1/ , this circle, since it does not seem possible unit what fits in way a to say in square which unit to divide the can be arranged to form a circle. square into pieces In this chapter we will only try to define the area of some very special regions (Figure 1)-those which are bounded by the horizontal axis, the vertical lines through (a, 0) and (b, 0), and the graph of a function f such that f(x) > 0 for all x in [a,b]. It is convenient to indicate this region by R( f, a, b). Notice that these regions include rectangles and triangles, as well as many other important geometric figures. The number which we will eventually assign as the area of R( f, a, b) will be called the integralof f on [a, b]. Actually, the integral will be defined even for functions f which do not satisfy the condition f (x) :> 0 for all x in [a,b]. If f is the function graphed in Figure 2, the integral will represent the difference of the area of the lightly shaded region and the area of the heavily shaded region (the / R(f, b) ., (a,0) (b, 0) FIGURE 1 (4 0) FIGURE (b,0) 2 "algebraic area" of R( f, a, b)). The idea behind the prospective definition is indicated in Figure 3. The interval [a,b] has been divided into four subintervals [to,tt] [tt,12] [12,t3][13,14 by means of numbers m, to, ti, 12, $3, $4 with a=to<t1<t2<t3<t4-b a FI GURE to = a t a ta 4 = b (thenumbering of the subscripts begins with 0 so that the largest subscript equal the number of subintervals). 250 will 13. Integrals 251 the function f has the minimum value mi and the t¿] let the minimum value maximum value Mi; similarly, on the ith interval [t¿_i, of f be mi and let the maximum value be M¿. The sum [to,ti] On the first interval s represents = mi(ti - to) + m2(t2 ti) + m3(t3 - the total area of rectangles t2) + m4(t4 - f3) - lying inside the region R( f, a, b), while the sum S = M1(t1 - to) + M2(t2 t1) + M3(ts - t2) + M4(t4 - -- f3) the total area of rectangles containing the region R(f, a, b). The guiding principle of our attempt to define the area A of R(f, a, b) is the observation that A should satisfy and s<A A<S, represents and that this should be true, no matter how the interval [a,b] is subdivided. It is to be hoped that these requirements will determine A. The following definitions begin to formalize, and eliminate some of the implicit assumptions in, this discussion. DEFDiITION of the interval [a,b] is a finite collection b. A partition [a,b], one of which is a, and one of which is b. Let a < The points in a partition can be numbered Suppose f is bounded on The lower sum of f [a,b] and mg = Mi = . . , to so that has been assigned. that such a numbering we shall always assume . <in-1<t.=b; a=to<11< DEFINITION to, of points in P = {to, ..., t.) is a partition of [a, b]. Let inf {f(x) : t¿_i 5 x 5 t¿}, sup{ f(x) : ti-1 5 x 5 ti}. for P, denoted by L(f, P), is defined as L(f, P) mi(t; = - ti_1). i=1 The upper sum of f for P, denoted by U(f, P), is defined as U(f,P)= M¿(ti-t¿_i). i=1 The lower and upper sums correspond to the sums s and S in the previous example; they are supposed to represent the total areas of rectangles lying below and above the graph of f. Notice, however, that despite the geometric motivation, these sums have been defined precisely without any appeal to a concept of "area." 252 Derivativesand Integrals The requirement that f be Two details of the definition deserve comment. order all that the essential in bounded on [a,b] is me and M¿ be defined. Note, numbers also, that it was necessary to defme the mi and Mi as inf's and sup's, than as minima and maxima, since f was not assumed continuous. One thing is clear about lower and upper sums: If P is any partition, then rather L(f, P) 5 U(f, P), because L( f, P) m¿(t; = U(f,P)= - ti-1), M¿(t;-t¿,), i-l and for each i we have FIGURE On the other hand, something less obvious oughtto be true: If Pi and P2 are 4 any two partitions of [a,b], then it should be the case that L( f, Pi) 5 U(f, P2), because L(f, Pi) should be 5 area R(f, a, b), and U(f, P2) should be > area of R(f, a, b)" has not even the R( f, a, b). This remark proves nothing (since been definedyet), but it does indicate that if there is to be any hope of defining the area of R(f, a, b), a proof that L(f, PI) 5 U(f, P2) should corne first. The proof which we are about to give depends upon a lemma which concerns the behaviorof lower and upper sums when more points are included in a partition. In Figure 4 the partition P contains the points in black, and Q contains both the points in black and the points in grey. The picture indicates that the rectangles drawn for the partition Q are a better approximation to the region R(f, a, b) than those for the original partition P. To be precise: "arca j -- l,, b-1 FIGURE & 5 LEMMA If Q contains P (i.e.,if all points of P are also in L( f, P) 5 L( f, U(f, P) > U(f, PROOF Consider first the special Q),then Q), Q). case (Figure 5) in which Q contains just one more point P = Û = (to, (10, than P: ts), ..., -- -, ik--1,N,Ík,•••9 Ín), where a=to<tt<---<tk-1<¾<Ik<'''<Ín-b. 13. Integrals 253 Let m' = m" = inf {f (x) : tk-1 X M} inf {f (x) : u 5 x 5 tkÌ• , Then m;(t; L(f, P)= ti-1), - i-1 k-I L( f, Q) n m¿(t¿ = ti-1) + - m'(u tk-1) i - (Ik E M) - RifÍi i=1 ¯ Íi-1)· i-k+1 To prove that L(f, P) 5 L(f, Q) it therefore suffices to show that mkffk N'fM Ik-1) ¯ Ík-1) Ÿ - Ik N N)• - ik} contains all the numbers in {f (x) : tk-1 X smaller and u}, possibly some ones, so the greatest lower bound of the first set x 5 is lessthan or equal to the greatest lower bound of the second; thus Now the set {f (x) : tk-1 mgim'. Similarly, m mk . Therefore, mkfÍk ¯ Ík-1) = mk(N - Ík-l) EkfÍk m'(N N) - - ik-1) + m"(tk - M)- This proves, in this special case, that L( f, P) 5 L( f, Q). The proof that U( f, P) à U( f, Q) is similar, and is left to you as an easy, but valuable, exercise. The general case can now be deduced quite easily. The partition Q can be obtained from P by adding one point at a time; in other words, there is a sequence of partitions P such that Pj+1 contains = just one PI, P2, more P, ..., = Q point than Pj. Then L(f,P)=L(f,Pi)$L(f,P2)$--·5L(f,PŒ)=L(f,Q), and U( f, P) = U( f, P1) à U( f, P2) à ··· à U( f, Pa) = U( f, Q). The theorem we wish to prove is a simple consequence of this lemma. THEOREM l Let P1 and P2 be partitions of on [a,b]. Then [a,b], L(f, PI) and let f be 5 U(f, P2)- a function which is bounded 254 Derivativesand Integrals PROOF There is a partition P which contains both Pi and P2 in both P1 and P2). According to the lemma, (letP consist of all points L(f,Pi)5L(f,P)5U(f,P)5U(f,P2). It follows from Theorem 1 that any upper sum U( f, P') is an upper bound for the set of all lower sums L( f, P). Consequently, any upper sum U( f, P') is greater than or equal to the least upper bound of all lower sums: sup{L( f, P) : P a partition of for every P'. This, in turn, means that of all upper sums of f. Consequently, [a, b]} sup(L( f, P)} sup{L(f, P)) 5 inf{U(f, It is clear that both of these numbers f for all partitions: are 5 U( f, P'), is a lower bound for the set P)}. between the lower sum and upper sum of L(f, P') 5 sup(L(f, P)} 5 U(f, P'), L(f, P') 5 inf {U(f, P)} 5 U(f, P'), for all partitions P'. It may well happen that sup{L(f, P)) inf{U(f, = P}; in this case, this is the only number between the lower sum and upper sum of f for all partitions, and this number is consequently an ideal candidate for the area of R(f, a, b). On the other hand, if sup{L(f, P)} f(x) then every number -, x inf{U(f, < P)}, between sup{L(f, P)) and inf{U(f, P)} will satisfy L( f, P') 5 x 5 U (f, P') a-a FIGURE 6 a a t. 1 t. 6 for all partitions P'. It is not at all clear justwhen such an embarrassment of riches will occur. The following two examples, although not as interesting as many which will soon appear, show that both phenomena are possible. Suppose first that f(x) c for all x in [a,b] (Figure 6). If P {to, t,} is partition of then any [a,b], = = m¿=M¿=c, so c(t¿-t¿_i)=c(b-a), L(f,P)= i=1 U( f, P) c(t; = i=1 -- ti-1) = c(b - a). ..., 13. Integrals 255 In this case, all lower sums and upper sums are equal, and sup{L(f,P))=inf{U(f,P))=c(b-a). Now consider (Figure 7) the function 0, 1, a=to FIGURE • ••• si If P la in-1 f.=b x rational. t,} is any partition, then {to, = defined by x irrational I f(x)= • f ..., mg = 0, since there is an irrational number in M¿ = 1, since there is a rational number and in [ti-1,til- ' Therefore, 7 [ti-1.til 0-(t;-t,_i)=0, L(f,P)= i=1 U(f,P)= 1-(t¿-ti-1)=b-a. i=1 inf{U(f, P)}. The Thus, in this case it is certainly not true that sup{L(f, P)) which the of definition principle upon area was to be based pmvides insufficient information to determine a specific area for R(f, a, b)-any number between 0 and b a seems equally good. On the other hand, the region R(f, a, b) is so weird that we might with justicerefuse to assign it any area at all. In fact, we can maintain, more generally, that whenever = - sup{L(f, P)) the region R(f,a,b) peal to the word terminology. A function f sup{L(f, P) which : P)}, is too unreasonable "unreasonable" DEFINITION ý=inf{U(f, to deserve having an area. As our apsuggests, we are about to cloak our ignorance in is bounded on P a partition of [a,b] is integrable [a,b]} In this case, this common number denoted by = inf{U(f, P) : on [a,b] if P a partition of is called the integral of f on [a,b]}. [a,b] and is b (The symbol Ï f. is called an integralsrgn and was originally an elongated s, for the numbers a and b are called the lower and uger limitsof integration.) The integral f is also called the area of R( f, a, b) when f (x) > 0 for all x b]. in [a, "sum;" f f 256 Derivativesand Integrals If f is integrable, then according to this definition, b L(f, P) 5 I for all partitions P of 5 U(f, P) f [a,b]. Moreover, f is the unique number with this property. This defmition merely pinpoints, and does not solve, the problem discussed before: we do not know which fimctions are integrable (nordo we know how to find the integral of f on [a,b] when f is integrable). At present we know only two examples: b (1) if f (x) = c, then f is integrable (Notice that this integral assigns the expected (2) if f (x) = 0 x irrational 1, x rational, ' and [a,b] on f = c · (b - a). area to a rectangle.) then f is not integrable on [a,b]. Several more examples will be given before discussing these problems further. Even for these examples, however, it helps to have the following simple criterion for integrability stated explicitly. THEOREM 2 If e f > is bounded on [a,b], then f is integrable on 0 there is a partition P of [a,b] such that U( f, P) PROOF Suppose first that for every e > L( f, P) - < [a,b] if and only if for every e. 0 there is a partition P with U( f, P) L( f, P) - < e. Since inf {U(f, P')} 5 U(f, P), sup{L(f, P')} ;> L( f, P), it follows that inf (U (f, P') } sup {L( f, P') } < e. - Since this is true for all e > 0, it follows that sup{L(f, P')} = inf{U( f, P')}; by definition, then, f is integrable. The proof of the converse assertion is similar: If f is integrable, then sup{L(f, P)) This means that for each e > = inf{U(f, P)}. 0 there are partitions P', P" with U(f, P") - L(f, P') < E. 13. Integrals 257 Let P be a partition which contains both P' and P". Then, according lemma, to the U(f, P) 5 U(f, P"), L( f, P) > L( f, P'); consequently, U( f, P) L( f, P) 5 U( f, P") - L( f, P') - < e. Although the mechanics of the proof take up a little space, it should be clear of the definition that Theorem 2 amounts to nothing more than a restatement of integrability. Nevertheless, it is a very convenient restatement because there is and which work with. The next mention sup's of often inf's, difficult to no are example illustrates this point, and also serves as a good introduction to the type of reasoning which the complicated definition of the integral necessitates, even in very simple situations. Let f be defined on [0,2] by x ¢ 1 1, x 1. t,} is a partition of [0,2] with f (x) Suppose P = (to, ..., 10, = = tj-1<1<tj i, 21 t, i (seeFigure 8). Then mg FIGUREs Mr = = 0 if i ¢ j, but and 0 = mj Mj 1. = Since j-l n m¿(t¿-ti-1)+mj(tj-tj_t)+ L(f,P)= m¡(t¿--ti-1), i=l i=j+1 j-1 U( f, P) n M¿(t¿ = ti-1) + Mj(tj -- tj_i) + - i=1 we Mi(t; - ti-1) i=j+l have U( f, P) This certainly shows that f L( f, P) - to choose tj-1 Moreover, it - tj-1. is integrable: to obtain a partition P with U(f, P) it is only necessary tj = - L(f, P) < e, a partition with < l < ty and tj - ty_i < e. is clear that L(f, P) 5 0 5 U(f, P) for all partitions P. 258 Derivativesand Integrals Since f namely, between all lower and upper sums, is integrable, there is only one number the integral of f, so 2 f 0. = Although the discontinuity of f was responsible for the difficultiesin this example, even worse problems arise for very simple continuous functions. For example, let f (x) x, and for simplicity consider an interval [0,b], where b > 0. If t,} is a partition of [0,b], then (Figure 9) P = {to, = ..., mi = and ti-1 M, = ti and therefore M L( f, P) U( f, P) = ti-1(ti = t¿(t¿ - ti-1) - ti-1) i=l '¯' FIGURE 9 ' ' =ti(ti-to)+t2($2¯$1)Ÿ" In(In"¯Ín-1)- Neither of these formulas is particularly appealing, but both simplify considerably for partitions P. = {to, t.} into n equal subintervals. In this case, the length ty ti-1 of each subinterval is b/n, so . .. , - to 0, b = t1= -, n 2b t2 = -, n etc; in general ib t,=-. n Then L( f, P.) t¿ = (t¿ t¿ i ) - i i=1 " (i - 1)b :=1 " -- 1) i=l b2 = b2 [f(i p gj (n-1 - . j=0 b 13. Integrals 259 Remembering the formula k(k + 1) 2 1+···+k= this can , be written (n 1)(n) 62 - L( f, Pa) = 7 2 b2 n-1 2 n Similarly, U( f, P,,) t¿(ti = ti-1) - i=1 n ib b i=1 n n n(n + 1) b2 b2 n+1 = ¯ n If n is very large, both L(f, P,) and U(f, Pa) are close to b2/2, and this remark makes it easy to show that f is integrable. Notice first that 2 b2 2 n U(f,Pn)-L(f,Pn)=- -- This shows that there are partitions P, with U( f, P.) -L( By Theorem 2 the fimction f is integrable. Moreover, with only a little work. It is clear, first of all, that f, P.) as small as desired. feb| may HOW be found b2 L( f, Ps) 5 5 U( f, P.) - 2 for all n. This inequality shows only that b2/2 lies between certain special upper and lower L(f, P.) can be made as small as sums, but we have just seen that U(f, Pn) desired, so there is only one number with this property. Since the integral certainly has this pmperty we can conclude that - f(x) = b 10 2 o b FIGURE b2 I gi Notice that this equation assigns area b2/2 to a right triangle with base and altitude b (Figure 10). Using more involved calculations, or appealing to Theorem 4, it can be shown that b2 a2 Ibf=---. a 2 2 260 lhivadves and Integrals The function f (x) = x2 presents even greater difficulties. In this case (Figure 11), if P {to, t,} is a partition of [0,b], then = ..., m¿= f(ti-1) (ti-1)2 = Choosing, once again, a partition P. and M¿= {to, = · - - , f(t,) In} = t . into n equal parts, so that i b - t, = ---- n the lower and upper sums become L( f, P,,) |(x) - x' (t¿_i)2 (t; = - ti-1) i=1 I f(i = 11 1)2 - n-1 bs FIGURE -- n -(t;-ti_i) U(f,P.)= t j=1 Recalling the formula 12 + - - + k2 = · )k(k+ 1)(2k+ 1) from Poblem 2-1, these sums can be written as b3 L( f, P,,) U(f,P.)= = -(n 1 1)(n)(2n - - -- 1), 1 b --(n+1)(2n+1). It is not too hard to show that L( f, P.) 5 b3 --- 3 5 U(f, P,,), and that U( f, P.) L( f, P.) can be made sufficiently large. The same sort of reasoning as small as desired, by choosing n as before then shows that -- la o f b3 = ---. 3 This calculation already represents a nontrivial result-the area of the region usually elementary derived in bounded by a parabola is not geometry. Nevertheless, the result was known to Archimedes, who derived it in essentially the same 13. Integrals 261 way. The only superiority we can claim is that in the next chapter a much simpler way to arrive at this result. Some of our investigations can be summarized as follows: we will discover b Lf = b f = a (b c a) - f (x) = if f (x) = c for all x, a2 b2 - if - - x for all x, 3 b3 if f(x)=x2forallx. f=--- fb This list already reveals that the notation f suffers from the lack of a convenient for naming functions defmed by formulas. For this reason an alternative analogous to the notation lim f (x), is also useful: notation notation,* b b f (x)dx means precisely the same as f. Thus b Ï Ï Lb cdx=c-(b-a), b b2 a2 2 b3 x2dx=---. 3 a3 xdx=--- a 2' 3 Notice that, as in the notation lim f(x), the symbol x can be replaced x-a other letter (except f, a, or b, of course): b b f (x)dx = a b f (t) dt = b f (a) da b f (y)dy = by any = a f (c)dc. The symbol dx has no meaning in isolation, any more than the symbol x has any meaning, except in the context lim f(x). In the equation ---> *The notation fb f(x)dx is actually the older, and was for many years the only, symbol for the integral. Leibniz used this symbol because he considered the integral to be the sum (denoted by f) of infinitely many rectangles with height f(x) and small" width dx. Later writers used xo, x. to denote the points of a partition, and abbreviated x; xi-1 by Axi. The integral was "infinitely - ..., defined as the limit as Ax; approaches 0 of the sums i=l sums). The fact that the limit is obtained by changing many people. i to lower and f(xt) Ax¿ (analogous to f, f(x¿) to f(x), and upper Ax¿ to dx, delights 262 Derivativesand Integrals the entire symbol x2 dx may be regarded the function f as an abbreviation such that f(x) = for: x2 for all x. This notation for the integral is as flexible as the notation lim f(x). Several examples may aid in the interpretation of various types of formulas which frequently appear; we have made use of Theorems 5 and 6.* a(x+y)dx= xdx+ (1) (2) lx ydx= 2 x x ydy+ (y+t)dy= +y(b-a). - 2 tdy=---+t(x-a). a b x(1+t)dz dx (3) (1+t)(x-a)dx = lb =(1+t) (x-a)dx a b2 a2 2 2 ----a(b--a) =(1+t) . b d(x+y)dy x(d-c)+ dx= (4) = dx - xdx (b-a)+(d-c) - 2 = (d2 - - 2 The computations of fx dx and f is generally difficult or impossible. As -- 2 (b --- a) + (d -- c) a2 b2 -- --- -- . 2 2 x2 dx may suggest that evaluating integrals a matter of fact, the integrals of most functions are impossible to determine exactly (although theymay be computedto any degree of accuracy desiredby calculating lower and upper sums). Nevertheless, as we shall see in the next chapter, the integral of many functions can be computed very easily. Even though most integrals cannot be computed exactly, it is important at least to know when a fimetion f is integrable on [a,b]. Although it is possible to say precisely which functions are integrable, the criterion for integrability is a little too difficult to be stated here, and we will have to settle for partial results. The next Theorem gives the most useful result, but the proof given here uses material from the Appendix to Chapter 8. If you prefer, you can wait until the end of the next chapter, when a totally different proof will be given. *Lest other books, equation (1)requires an important This equation interprets j y dx to mean the integral of the function f such that each (x) is the number y. But classical notation often uses y for y(x), so f y dx might rnean the chaos overtake the reader when consulting qualification. value f integral of some arbitrary y. function 13. Integrals 263 THEOlŒM 3 PROOF If f is continuous on [a,b], then f is integrable [a,b]. on Notice, first, that f is bounded on [a,b], because it is continuous on [a,b]. To prove that f is integrable on [a,b], we want to use Theorem 2, and show that for every 2 > 0 there is a partition P of [a, b] such that U( f, P) - L( f, P) < e. Now we know, by Theorem 1 of the Appendix to Chapter 8, that f is uniformly continuous on [a,b]. So there is some 8 > 0 such that for all x and y in [a,b], if x y|<8, - and - |f(x)- f(y)|< 2(b a partition P {to, t,} such - The trick is simply to choose ô. Then for each i we have If(x) then f(y)l = < 2(b - . . . , for all x, y in a) . a) that each |t¿ - ti-i l< [ti-1,ts], it follows easily that Mi - m¿ € < ¯ 2(b-a) € < . b-a Since this is true for all i, we then have U( f, P) - L( f, P) (M¿ = -- m¿)(t¿ - ti-1) i=1 t¿ < b-a -- ti-1 i=1 -b-a = b-a which is what we wanted. Although this theorem will provide all the information necessary for the use of integrals in this book, it is more satisfying to have a somewhat larger supply of integrable functions. Several problems treat this question in detail. It will help to know the following three theorems, which show that f is integrable on [a,b], if it is integrable on [a,c] and [c,b]; that f + g is integrable if f and g are; and that c f is integrable if f is integrable and c is any number. As a simple application of these theorems, recall that if f is 0 except at one point, where its value is 1, then f is integrable. Multiplying this function by c, it follows that the same is true if the value of f at the exceptional point is c. Adding such a function to an integrable function, we see that the value of an integrable function may be changed arbitrarily at one point without destroying integrability. By breaking up the interval into many subintervals, we see that the value can be changed at finitely many points. The proofs of these theorems usually use the alternative criterion for integrability in Theorem 2; as some of our previous demonstrations illustrate, the details of the - 264 Derivativesand Integrals often conspire to obscure the point of the proof. It is a good idea to proofs of your own, consulting those given here as a last resort, or as a check. This will probably clarify the proofs, and will certainly give good practice in the techniques used in some of the problems. argument attempt THEOREM 4 Let a < c < b. If f is integrable on [a,b], then f is integrable on [a,c] and on [c,b]. Conversely, if f is integrable on [a,c] and on [c,b], then f is integrable on [a,b]. Finally, if f is integrable on [a,b], then lbf= PROOF Suppose f is integrable on [a,b] such that [a,b]. f+ If e U( f, P) We might as well assume which contains to,...,t. U( f, P) - Now P' of L( f, P) = {to, ..., < that c and b c > f. 0, there is a partition P L( f, P) - < tj for some j. (Otherwise, let c; then Q contains P, so U(f, [a,c] and [c,b] (Figure 12). Since - - - , In} Of s. = e.) t¡} is a partition of {to. = P" = (tj, Q be the partition Q) - L(f, Q) 5 t,} is a partition ..., L( f, P) L( f, P') + L( f, P"), U(f,P)-U(f,P')+U(f,P"), = we ¯¯ FIGURE ¯¯ 12 have [U( f, P') - L( f, P')] + [U( f, P") L( f, P")] - = U( f, P) - L( f, P) < e. Since each of the terms in brackets is nonnegative, each is less than e. This shows that f is integrable on [a,c] and [c,b]. Note also that C L(f, P') 5 5 U(f, P'), f Ïb L(f,P")$ fšU(f,P"), C so that b c L( f, P) 5 I f G + C f 5 U( f, P). Since this is true for any P, this proves that b b c f+ C f= U f. Now suppose that f is integrable on [a,c] and on [c,b]. If e partition P' of [a,c] and a partition P" of [c,b] such that U (f, P') U( f, P") - -- L( f, P') L(f, P") < s/2, < e/2. > 0, there is a 13. Integrals 265 If P is the partition of [a,b] containing L( f, P) U( f, P) all the points of P' and P", then L( f, P') + L( f, P"), U( f, P') + U( f, P"); = = consequently, U( f, P) J L( f, P) - [U( f, P') = L( f, P')] + [U( f, P") - - L( f, P")] Theorem 4 is the basis for some minor notational conventions. f was defined only for a < b. We now add the definitions b la and f=0 < e. The integral a f=- f ifa>b. With these definitions, the equation f f + fbf = f f holds for all a, c, b even if a < c < b is not true (theproof of this assertion is a rather tedious case-by-case check). THEOREM s If f and g are integrable on [a,b], then f Ib (f+g)= Let P = {to, ..., t,} be any partition of [a,b] and b f+ G PROOF + g is integrable on b [a, b]. g. d Let m;=inf{(f+g)(x):ti-15x5ti), m¿' m;" and = = inf {f (x) : ti-1 5 x 5 ti } inf {g(x): ti-1 5 x 5 tiÌ. , define M¿, M¿', M¿" similarly. It is not necessarily true that m¿ = m¿' + m¿", mg > m¿' + m¿". but it is true (Problem 10) that Similarly, M¿ 5 M¿' + M¿". Therefore, L( f, P) + L(g, P) 5 L( f + g, P) and U(f +g, P) 5 U(f, P) +U(g, P). Thus, L( f, P) + L(g, P) 5 L(f + g, P) 5 U( f + g, P) 5 U( f, P) + U(g, P). Since f and g are integrable, there are partitions P', P" with U( f, P') U(g, P") - - L( f, P') L(g, P") < < e/2, e|2. 266 Derivativesand Integrals If P contains both P' and P", then U( f, P) + U(g, P) [L(f, P) + L(g, P)] - < e, and consequently U(f+g,P)-L(f+g,P)<s. This proves that f + g is integrable on [a,b]. Moreover, L( f, P) + L(g, P) 5 L( f + g, P) (1) 5 lb (f+g) SU(f+g,P)5U(f,P)+U(g,P); and also (2) b Ïbf+ L(f,P)+L(g,P)$ L(f, P) and U(g, p) Since U(f, P) desired, it follows that - U( f, P) + U(g, P) - -- gšU(f,P)+U(g,P). L(g, P) can both be made as small as [L( f, P) + L(g, P)] can also be made as small as desired; it therefore follows from lb (f THEOREM 6 If f is integrable on on [a,b] and [a,b], b + g) = Lb PROOF b f + g. then for any number cf=c- (1)and (2)that c, the function cf is integrable b f. The proof (which is much easier than that of Theorem 5) is left to you. It is a good separately the cases c > 0 and c < 0. Why? idea to treat (Theorem 6 is just a special case of the more general theorem that f g is integrable on [a,b], if f and g are, but this result is quite hard to prove (see Problem 38).) - In this chapter we have acquired only one complicated definition, a few simply theorems with intricate proofs, and one theorem which required material from the Appendix to Chapter 8. This is not because integrals constitute a more difficult topic than derivatives, but because powerful tools developed in previous chapters have been allowed to remain dormant. The most significant discovery of calculus is the fact that the integral and the derivative are intimately related-once we learn the connection, the integral will become as useful as the derivative, and as easy to use. The connection between derivatives and integrals deserves a separate chapter, but the preparations which we will make in this chapter may serve as a hint. We first state a simple inequality concerning integrals, which plays a role in many important theorems. 13. Integrals 267 THEOREM 7 Suppose f [a,b] and is integrable on m 5 that for all x in M f(x) Then b m(b PROOF a) 5 - Ï m(b-a)<L(f,P) I and for every partition P. Since ff inequality followsimmediately. U(f,P)<M(b-a) - [a,b]. X F(x) cc FIGURE a). - sup{L(f, P)} = Suppose now that f is integrable on [a, b] by area F(c+h)-F(c) 5 M(b f It is clear that M // [a,b]. inf{U(f, = P)), the desired We can define a new function f on X f = f(t)dt. = a + h (This depends on Theorem 4.) We have seen that f may be integrable even if it not continuous, and the Problems give examples ofintegrable functions which are quite pathological. The behavior of F is therefore a very pleasant surprise. iS 13 THEOREM 8 If f is integrable on F is defined on [a, b] and [a,b] by X F(x) I = f, a then F is continuous PROOF on [a,b]. Suppose c is in [a,b]. Since f is integrable on on [a,b]; let M be a number such that |f (x)| 5 If h > [a,b] it is, by definition, bounded for all x in M [a,b]. 0, then (Figure 13) c+h F(c + h) - F(c) f f (x) 5 M f - a Since -M i c+h c I = f. = a e for all x, it followsfrom Theorem 7 that c+h -M - h5 I f 5 Mh; c in other words, (1) Fh < - M hy F(c + h) - F(c) 5 M h. - - 0, a similar inequality can be derived: Note that c F(c + h) - F(c) = Lc+h f. f = - c+h 268 Denvativesand Integrals -h, Applying Theorem 7 to the interval h, c], of length [c+ we obtain C Mh i Ï f c+h -1, multiplying which reverses all the inequalities, we have by Mh 5 F(c + h) (2) Inequalities (1)and (2)can > F(c) - |5 M - < s/M. F(c) - | < e, This proves that lim F(c+h) = h->0 in other words F is continuous F(c); at c. Figure 14 compares f and F(x) that F is always better behaved than this is. = f. | f for various functions f; it appears In the next chapter we will see how true I I FIGURE 14 h|. 0, we have |F(c + h) provided that |h| F(c) 5 -Mh. - be combined: |F(c + h) Therefore, if e 5 -Mh; I - I - F 13. Integrals 269 PROBLEMS 1. $ x3 dx Prove that b4/4, by considering = partitions into n equal subin- i3 which was found in Problem 2-6. This tervals, using the formula for i=1 problem requires only a straightforward imitation of calculations in the text, but you should write it up as a formal proof to make certain that all the fine points of the argument are clear. 2. *3. Prove, similarly, that febx*dxb5/5. = k?/np+1 Problem 2-7, show that the sum (a) Using can be made as close k=1 to 1/(p + 1) as desired, by choosing n large enough. bP**/(p + 1) (b) Prove that xP dx feb *4. = . Ib xPdx This problem outlines a clever way to fmd for 0 < a < b. (The 0 will then follow by continuity.) The trick is to use partitions tn) for which all ratios r ty/ti-1 are equal, instead of using {to, which all differences t. t;_i are equal. partitions for result P for a = . = .. = , - (a) Show that for such a partition P t¿ (b) If f(x) = xP, U( f, P) = - a c'7" show, using the = ap+1 y _ have we for c = . a formula in Problem 2-5, that (p+1yn c-1/n i i=1 ¯1¡n 1 = (a*** bf** )c(p+1yn - 1 = (bp+1 - ap+1)c?!" -- - c c(p+1)/n - 1+cl/"+---+cF/= fmd a similar formula for L(f, P). (c) Conclude that and b Ï x? dx a 5. bp+1 - . p + 1 Evaluate without doing any computations: IlxJl (ii) ll (i) -1 -1 -- x2 dx. (x5+ 3))1 - x2 dx. ap+1 = 270 Derivativesand Integrals 6. Prove that 2 sin t dt t+ 1 for all x 7. > 0 0. > Decide which of the followingfunctions are integrable on the integral when you can. (i) f(x)=fx, [x-2, fx lx ' (ii) f (x) (iii) f (x) = x-2, = (iv) f (x) = (vii)f x rational 0, x irrational. x of the form a + x not of this form. = (vi) f (x) 05x51 1<xs2. + 1, (v) f (x) l 0, calculate 05x<1 15xs2. [x]. [x], x + [0,2], and bÃ for rational a and b 1 - = x=0orx>1. 0, is the function shown in Figure 15. 1 - --- ---- ---- I # FIGURE 8. 2 1 15 Find the areas of the regions bounded by 2 the graphs of f(x) (i) (ii) the graphs of f (x) (-1, 0) and (1,0). (iii) the graphs of f (x) = (v) the graphs of the graphs of f(x) f(x) +2. = -x2 = (iv) x2 and g(x) = = = x2 and g(x) x2 and g(x) x2 and g(x) x2 and g(x) and the vertical lines through = = = = 1 1 x2. - - x2 -2x x2 and h(x) = 2. +4 and the vertical axis. 13. Integrals 271 (vi) the graph of through f(x) = , the horizontal axis, and the vertical /2Sdx; (2, 0). (Don't try to find should you 0 line see a way of guessing the answer, using only integrals that you already know how to evaluate. The questions that this example should suggest are considered in Problem 21.) 9. Find b d dx f(x)g(y)dy in terms of vengeance; 10. f f and f g. (This problem is an exercise in notation, with a it is crucial that you recognize a constant when it appears.) Prove, using the notation m¿'+m¿"= 11. (a) Which upper of Theorem 5, that inf {f(xt)+g(x2) 5 t;} 5 m¿. ti_i 5 xt,x2 : functions have the property that every lower sum equals every sum? (b) Which functions have the property that some upper sum equals some sum? lower (other) continuous Which (c) functions have the property that all lower sums are equal? *(d) Which integrable functions have the property that all lower sums are equal? (Bear in mind that one such function is f (x) 0 for x irrational, p/q in lowest terms.) Hint: You will need the f(x) 1/q for x notion of a dense set, introduced in Problem 8-6, as well as the results of Problem 30. = = = b < c < d and f is integrable on [b,c]. (Don't work hard.) 12. If a 13. (a) Prove that b < then (b) Prove in Ï f if f is integrable on and then and g are integrable on f f (x) à lb f a f is integrable 0 for all x in a on [a,b], f(x) à g(x) for all x (b)you are to warn wasting time.) b+c b Ï and g. (By now it should be unnecessary à Prove that f a [a, b] b that if you work hard on part 14. [a,b] that à 0. that if [a,b], [a,d], prove (x) dx = +c f (x - c) dx. (The geometric interpretation should make this very plausible.) Hint: Every partition P {to, In} of [a,b] gives rise to a partition P' {to+ c, t, + c) of [a + c, b + c], and conversely. = ..., - - -, = 272 Derivativesand Integrals *15. Prove that ab dt dt + dt. = ab a Ï written Hint: This can be 1/t dt 1/t dt. Every partition P = I b of [l.a] gives rise to a partition P' and conversely. {to,...,t,} *16. Prove that Lcb f (t) dt = {bto,...,bt.) of = [b,ab], b = f (ct)dt. c (Notice that Problem 15 is a special case.) 17. Given that the area enclosed by the unit circle, described by the equation 2 1, is ar, use Problem 16 to show that the area enclosed by the ellipse described by the equation x2/a2 + y2/b 1 is rab, x2 = = b 18. This problem outlines yet another way to compute working Cavalieri, one of the mathematicians x" dx; just before it was used by the invention of calculus. a (a) Izt c, (b) Problem x" dx. Use Problem 16 to show that = = c.a"**. 14 shows that 2a Ï a x" dx = o Use this formula to prove that 2n+1caa"*I = -a (x + a)" dx. 2a"** Ck· k even (c) Now use x" dx Problem 2-3 to prove that cn = 7 1/(n + 1). 19. Suppose that f is bounded on [a,b] and that f is continuous at each point in [a,b] with the exception of xo in (a, b). Prove that f is integrable on [a,b]. Hint: Imitate one of the examples in the text. 20. Suppose that f is nondecreasing on [a,b]. Notice that f is automatically bounded on [a,b], because f (a) sf (x) 5 f (b) for x in [a,b]. (a) If P {to, t.) is a partition of [a,b], what is L(f, P) and U( f, P)? (b) Suppose that ti ti-1 8 for each i. Prove that U(f, P) L(f, P) 8[ f(b) f(a)]. that f is integrable. Prove (c) (d) Give an example of a nondecreasing function on [0,1] which is discon= ..., - = - tinuous at infinitely many points. - = 13. Integrals 273 It might be of interest to compare this problem with the followingextract from Newton's hincapia.* LEMMA II a K --L ' ' , A total area b · f )) av c n *21. z If in anyfigure AacE, terminatedby the ryht linesAa, AE, and the curve acE, therebe inscribedany number of parallelograms Ab, Ec, Cd, &c., comprehended under equal basesAB, BC, CD, &c., and theside.s Bb, Cc, Dd, &c.,parallel and theparallelograms to one side Aa of thefigure; aKbl, bLcm, cMdo, &c., are completed: then¶tkebreadthof thoseparallelograms be supposed to be diminished, and theirnumber to be augmented in infinitum, I sa3 that the ultimate ratios which the mscribedfigure AKhLcMdD, the circumscribed}gureAalbmendoE, and curvilinear AabcdE, will haveto one another; are ratios ofequaßg. figure For the difference of the inscribed and circumscribed figures is the sum of the parallelograms Kl, Lm, Mn, Do, that is (fromthe equality of all their bases), the rectangle under of their bases Kb and the one altitudes of their that Aa, is, the rectangle ABla. But this rectangle, sum because its breadth AB is supposed diminished in i¢nitum, becomesless than any given space. And therefore (byLem. 1) the figures inscribed and circumscribed become ultimately equal one to the other; and much more will the intermediate curvilinear figure be ultimately equal to either. QE.D. Suppose that f is increasing. Figure 16 suggests that Lb f¯I (a) If P = f"(t.)}. [*(a) FIGURE b (c) Find 22. f-46) -af¯I(a) - f. ..., Ï {f¯I(to), -- -, formula stated above. dx for 0 5 a < b. Suppose that f is a continuous increasing function with that for a, b > 0 we have Toung'sinequaliy, Ï O f (0) = 0. Prove b a ab 5 and that equality Figure 16! = 1,P)+U(f,P')=bf¯1(b)-af¯!(a). the (b) Now prove b 16 bf-i(b) (to, t,} is a partition of [a,b), let P' Prove that, as suggested in Figure 17, L(f a = f (x) dx f¯ (x) dx, + 0 holds if and only if b = f(a). Hint: Draw a picture like *Newton's Principia,A Revisionol'Mott's Translation, by Florian Cajori. University of California Press, Berkeley, California, 1946. 274 Duivatives and Integrals 23. (a) Prove that [a,b] if f is integrable on and m 5 f (x) 5 M for all x in [a,b], then b f (x) dx = (b a)µ - for some number µ with m 5 µ 5 M. a=to FIGURE ti /2 (b) Prove that is i,=b if f is continuous on [a,b], then b 17 f (x)dx for some ( in [a, b]; and show = (b - a) f (g) by an example that continuity generally, suppose that f is continuous on integrable and nonnegative on [a,b]. Prove that (c) More b (d) Deduce 18 24. [a,b]. = f (g) that g is g(x)dx This result is called the Mean Value Theorem for the same result if g is integrable and nonpositive on (e) Show that FIGURE and b f (x)g(x)dx for some ( in Integrals. [a,b] is essential. [a,b]. one of these two hypotheses for g is essential. the graph of a function in polar coordinates (Chapter 4, Appendix 3). Figure 18 shows a sector of a circle, with central angle 0. When 0 is measured in radians (Chapter 15), the area of this sector In this problem we consider is r2 9 Now consider the region A shown in Figure 19, where the curve is the graph in polar coordinates of the continuous fimction f. Show that - -. area A *25. FIGURE 19 Let f be a continuous [a,b], define p) = function on - i=1 f(0)2d0. [a,b]. If P = {to, ..., t.] is a partition of 13. Integrals 275 The number £( f, P) represents the length of a polygonal curve inscribed in of the graph f (seeFigure 20). We define the length of f on [a,b] to be the least upper bound of all £( f, P) for all partitions P (provided that the set of all such £(f, P) is bounded above). is a linear function on [a,b], prove that the length of f is the distance from (a, f(a)) to (b, f(b)). (b) If f is not linear, prove that there is a partition P = {a,t, b) of [a,b] such that £(f, P) is greater than the distance from (a, f(a)) to (b, f(b)). (You will need Problem 4-9.) (c) Conclude that of all functions f on [a,b] with f(a) c and f(b) d, the length of the linear function is less than the length of any other. (Or, in conventional but hopelessly muddled terminology: "A straight line is the shortest distance between two points".) (d) Suppose that f' is bounded on [a,b]. If P is any partition of [a,b] (a) If f I a FIGURE = t, I t, I e, t t, t. I - 6 20 = = show that y L(J1+(f')2,P)5£(f,P)5U(Ál+(f')2 Hint: Use the Mean Value Theorem. (e) Why is sup{L( 1 + (f')2, P)} 5 sup{£( P)} 5 inf (U( show that sup(£(f, (f) Now that the length of f [a,b] on f, P)}? (This is easy.) 1 + (f')2, P)}, thereby proving r 2 1 + (f')2 is is inte- Hint: It suffices to show that if P' and P" are any two partitions, then £( f, P') 5 U( 1 + (f')2, P"). If P contains the points of both P' and P", how does £(f, P') compare to f(f, P)? (g) Let 2(x) be the length of the graph of f on [a,x], and let d(x) be the length of the straight line segment from (a, f (a)) to (x, f (x)). Show that grable on [a,b]. lun ma 1. = d(x) Hint: It will help to use a couple of Mean Value Theorems. 26. A function s defmed on [a,b] is called a step function if there is a partition of b] t,,} P {to, [a, such that s is a constant on each (ti-1, t¿) (thevalues of s at t, may be arbitrary). = ..., (a) Prove that if function si 5 f is integrable on lb with f a b with Lb Lb s2 (b) Suppose that such that f - < - for any e > 0 there is a step - si < e, and also a step function s2 f e. for all e s2 f [a,b], then b b > 0 there are step functions si 5 f and s2 2 f 31 < e. Prove that f is integrable. 276 Derivativesand Integrals b I (c) Find a function f which is not a step function, but which satisfies L(f, P) for some partition P of [a,b]. *27. Prove that if f is integrable on b functions g 5 f 5 h with Ï h g - a (a) Show that if si and s2 are step (b) Prove, without using Theorem (c) Use part (b)(andProblem 29. fx that f to choose x to be in *30. b Ï + s2 is also. [a,b], then b si (si + a s2) = b + si 52a proof of Theorem 5. Prove that there is a number f. Show by example = 0 there are continuous A picture will help immensely. functions on [a,b]. = e. Hint: First get step functions ones. 5, that > 26) to give an alternative Suppose that f is integrable on b [a,b] such < a with this property, and then continuous 28. for any e [a,b], then b f x in that it is not always possible (a, b). The purpose of this problem is to show that if f is integrable on f must be continuous at many points in [a,b]. [a,b], then L(f, P) < t.} be a partition of [a,b] with U(f, P) b a. Prove that for some i we have Mr mg < l. (b) Prove that there are numbers ai and b1 with a < ai < bi < b and sup{ f (x) : ai 5 x 5 bl} inf {f (x) : al 5 x 5 b1} < 1. (You can choose = [ai,b1] [ti-1,t¿] from part (a)unless i 1 or n; and in these two cases a very simple device solves the problem.) (c) Prove that there are numbers a2 and b2 with ai < a2 < b2 < bi and sup{ f(x) : d2 5 x 5 b2} inf{ f(x) : a2 5 x 5 b2) < Continue in this way to find a sequence of intervals I. (d) [a.,b,] such < that sup{ f (x) : x in I.} inf (f (x) : x in I.} 1/n. Apply the Nested Intervals Theorem (Problem 8-14) to find a point x at which f is continuous. (e) Prove that f is continuous at infinitely many points in [a,b]. (a) Let P {to, = - ..., - - - = (. --- = - b *31. Recall, from Problem 13, that (a) Give an example where b [a,b], and Ï f (b) Suppose f (x) > = Ï f à 0 if f (x) à f (x) > 0 for all x, and 0 for all x in f (x) > 0 for some Lb x in 0. 0 for all x in b [a,b] and f is continuous f (xo) > 0. Prove that f > 0. Hint: It suffices sum L( f, P) which is positive. (c) Suppose f is integrable on [a,b] and f (x) > 0 for all and [a,b]. at xo in [a,b] to find one lower x in [a,b]. Prove 0. Hint: You will need Problem 30; indeed that was one reason for including Problem 30. that f > 13. Integrals 277 *32. lb is continuous on [a,b] and fg 0 for all continuous functions g on [a,b]. Prove that f 0. (This is easy; there is an obvious (a) Suppose that f = = g to choose.) (b) Suppose f is continuous and that [a,b] on Lb fg -= 0 for those con- functions g on [a,b] which satisfy the extra conditions g(a) 0. Prove that f 0. (This innocent looking fact is an important in the calculus of variations; see the Suggested Reading for referHint: Derive a contradiction from the assumption f (xo)> 0 or < f(xo) 0; the g you pick will depend on the behavior of f near xo. tinuous g(b) = lemma ences.) 33. Let f (x) = = = for x rational and f (x) x 0 for x irrational. = (a) Compute L(f, P) for all partitions P of [0,1]. (b) Find inf {U( f, P) : P a partition of [0,1]}. *34. Let f (x) that = 0 for irrational x, and 1/q if x f is integrable on [0,1] and that f p/q = = in lowest terms. Show 0. (Every lower sum is clearly 0; you must figure out how to make upper sums small.) *35. Find two functions f and g which are integrable, but whose composition is not. Hint: Problem 34 is relevant. gof *36. Let f be a bounded function on [a, b] and let P be a partition of [a,b]. Let Mi and mi have their usual meanings, and let Ma' and m¿' have the corresponding meanings for the function |f I. M¿' m¿' s M¿ m¿. integrable that if is on [a,b], then so is |f |. f (b) Prove (c) Prove that if f and g are integrable on [a,b], then so are max( min(f, g). (d) Prove that f is integrable on [a,b] if and only if its part" min( f, 0) are integrable on max( f, 0) and its (a) Prove that - - "positive "negative 37. Prove that if f is integrable on f, g) and part" [a,b]. [a,b], then f(t)dt Lb b 5 |f(t)|dt. Hint: This follows easily from a certain string of inequalities; Problem 1-14 is relevant. *38. Suppose f and g are integrable on [a,b) and f(x), g(x) à 0 for all x in [a,b]. Let P be a partition of [a,b]. Let Mg' and mi' denote the appropriate sup's and inf's for f, define M¿" and m;" similarly for g, and define M¿ and mi similarly for fg. (a) Prove that M¿ 5 M¿'M¿" and m¿ à mi'm;". 278 Derivancesand Integrals (b) Show that U(fg, P) - [M¿'M¿"-mi'm;"](t¿-t¿ L(fg, P) 5 i). i=1 fact that f and g are bounded, so that show that b], x in [a, (c) Using the U(fg, P) M for L(fg, P) - [M¿' sM - m¿'](t¿ [M¿" t¿_t) + - i=1 (d) Prove that fg (e) Now eliminate 39. |f(x)|, |g(x)| 5 Suppose that f - m,"](t; - t¿_i) . i=l is integrable. the restriction and g are that à 0 for x in f(x), g(x) integrable on states that [a,b]. b [a,b]. The Cauchy-Schwartinequalig b fg (a) Show that the Schwarz inequality is a special case of the Cauchy-Schwarz inequality. (b) Give three proofs of the Cauchy-Schwarz inequality by imitating the proofs of the Schwarz inequality in Problem 2-21. (The last one will take some imagination.) (c) If equality holds, is it necessarily true that f Ag for some I? What if = and g are coti uou 2 (d) . Is this result true if 0 and 1 are replaced by a and b? *40. Suppose that f is continuous lim x->oo Hint: The condition and x e lim f (x) x->oo f (x) = a. Prove that * 1 - t à some N. This means that comparison lim f (t) dt = = a. a implies that f (t) is close to a for N+M f (t) dt is close to N, then Ma/(N + M) is close to a. to Ma. If M is large in 13, Appenda. RiemannSums 279 APPENDIX. RIEMANN SUMS Suppose that P choose {to, = . some point x, in . . , [t, i, t,,} is a partition of [a,b], and that for each i we ty]. Then we clearly have L(f, P) 5 Any sum a = FIGURE t to i = f(x;)(t; f(xi)(t¿- ti-1) 5 U(f, P). ti_i) is called a Riemannsum of - for P. Figure 1 shows f the geometric interpretation of a Riemann sum; it is the total area of n rectangles that lie pardy below the graph of f and partly above it. Because of the arbitrary have been picked, we can't say for way in which the heights of the rectangles whether particular Riemann sum is less than or greater than the integral sure a b b But it does seem that the overlaps shouldn't matter too much; if the bases of all the rectangles are narrow enough, then the Riemann sum ought to be close to the integral. The following theorem states this precisely. f(x)dx. l THEORE1W 1 Suppose that f is integrable on [a,b]. Then for every e > 0 there is some 8 > 0 such that, if P < 8, {to, In) is any partition of [a,b] with all lengths ti then -t;-i = ..., b a f(xt)(t¿-t¿_i)- f(x)dx <e, i=1 for any Riemann sum formed by choosing x, in PROOF [t¿_i,t¿]. As in the proof that a First we will prove the theorem when f is continuous. continuous function is integrable (Theorem 13-3), we will use Theorem 1 fmm the Appendix to Chapter 8, so you might want to skip it. But if you've already read the proof of Theorem 13-3, this part of the proof will be a snap-in fact, it's practically the same. Given e > 0, choose ô > 0 so that for all x and y in [a,b] if |x - y| < 8, then |f (x) f (y)| < 2(b - . - a) any partition P {to, t,} with each ta ti_i < ô, and any x; in [t¿_i,t¿]. Then, as we saw in the proof of Theorem 13-3, we have Now consider (1) = - ..., U( f, P) - L( f, P) < s. But we also have (2) L(f, P) 5 f(xi)(t¿-t¿_i) 5 U(f, P) i=1 and (3) L(f, P) 5 Ibf(x)dx U(f, P). 280 Derivativesand Integrals The desired inequality, for our continuous and (1),(2) function f, follows immediately from (3). The argument in the general case is simple (though perhaps a bit messy), using Problem 13-27, which says that there are continuous functions g 5 f 5 h satisfying b Ib (4) b g 5 f b b h, 5 with h- g<e. We have -ti-1) -ti-1) g(xi)(ti f(xi)(ti 5 i=l h(xi)(t¿-ti_i), 5 i=l i=1 and since the theorem holds for continuous functions, we know that for te < ô, and right-hand and right-hand sides of close the leftthis inequality are to the leftsides of middle that the This implies two terms, (4). -ti-1 b I must be close to skeptical reader. fh - n and f fbf, Which i ti-1), f (x¿)(t; - is small. Detailed inequalities are left to the The moral of this tale is that anything which looks like a good approximation to an integral really is, provided that all the lengths t¿ t¿ ¡ of the intervals in the partition are small enough. Some of the following problems should bring home this message with even greater force. - _ PROBLEMS 1. Suppose that f and g are continuous functions on In} of P {to, [a,b] choose a set of points x¿ in of points ut in [tt-1,ti]. Consider the sum = - - - , [a,b]. For a partition [t¿ i, t¿] and another set _ f(xi)g(ug)(ti ti-1)- I i-I Notice that this is not a Riemann sum of fg for P. Nevertheless, show that b all such sums will be within e of Ï fg provided that the partition P has all ti-1 small enough. Hint: Estimate the difference between such a sum and a Riemann sum; you will need to use uniform continuity. lengths t¿ - 13, Appendix.RiemannSums 281 2. This problem is similar to, but somewhat harder than, the previous one. Suppose that f and g are continuous nonnegative functions on [a,b]. For a partition P, consider sums f (x¿)+ g(u¿) (t¿ - ti-1). i=1 Lb Show that these sums will be within e of / f + g if all te ti_i are small enough. Hint: Use the fact that the square-root function is uniformly continuous on a closed interval [0,M]. (u(et),,(tt)) 3. Mrò, or - Finally, we're ready to taclde something big! (Compare Problem 13-25.) Consider a curve c given parametrically by two functions u and v on [a,b]. t.} of [a, b] we define For a partition P = (to, ..., £(c, P) [u(t¿) - u(ti-1)]2 -l- [v(t¿) - U(ti-1) 2 i=l FIGURE this represents the length of an inscribed polygonal curve (Figure 2). We define the length of c to be the least upper bound of all £( f, P), if it exists. Prove that if u' and v' are continuous on [a,b], then the length of c is 2 b 4. Ï interval [00,Gi]. Show that the graph of f in polar coordinates on this interval has the length Let f' be continuous on the 101 5. Using Theorem 1, show that the Cauchy-Schwarz inequality (Problem 13-39) is a consequence of the Schwarz inequality. THE FUNDAMENTAL THEOREM OF CALCULUS CHAPTER From the hints given in the previous chapter you may have already guessed the first theorem of this chapter. We know that if f is integrable, then F(x) f f is only ask what when original continuous; it is the fitting that we happens function f is continuous. It turns out that F is differentiable (andits derivative is especially = simple). THEOREM 1 (THE FIRST Let f be integrable on [a,b], and [a,b] define F on FUNDAMENTAL THEOREM by x F(x) OF CALCULUS) If f is continuous at c in then F is differentiable at c, and [a,b], F'(c) (If c = f. = f (c). = to mean the right- or left-hand derivative a or b, then F'(c) is understood of F.) PROOF We will assume that c is in (a, b); the easy modifications supplied by the reader. By definition, F'(c) = F(c + h) h lim h->o Suppose first that h > - for c = a or b may be F(c) 0. Then Ic+h F(c+h)-F(c)= f. c Define mh and Mh as follows (Figure 1): mg = inf {f (x) : c 5 x 5 c + h} , Mh=SUp(f(X):Csx5c+h}. It follows from Theorem 13-7 that mh - hs f lc+h š Mh h. - c Therefore F(c + h) F(c) 5 Ma. h 0, only a few details of the argument have to be changed. - mh 5 If h < mh=inf{f(x):c+hšx5c}, Mh=sup{f(x):c+hšxíc}. 282 Let 14. TheFundamentalTheoremof Cakulus 283 FIGURE 1 Then C mh - Ï (-h) 5 Since 5 Mh f c+h (-h). · ckh F(c + h) F(c) -- c I = f = - +h c this yields ma h 2 F(c + h) · Since h < 0, dividing by h as before: F(c) 2 Mh h. - - the inequality again, yielding the same result reverses ma 5 f F(c + h) h F(c) - 5 Mh· This inequality is true for any integrable function, continuous continuous at c, however, lim mh h->O = lim Mh = or not. Since f is f(C), h->0 and this proves that F'(c) = lim F(c + h) F(c) - = h h-vo f (c). Although Theorem 1 deals only with the function obtained by varying the upper limit of integration, a simple trick shows what happens when the lower limit is varied. If G is defined by b G(x) = I f, 284 Derivativesand Integrals then G(x) Consequently, if f x lbf = f. - is continuous at c, then G'(c) = f (c). - here is very fortunate, and allows us to extend Theorem 1 to the situation where the function The minus sign appearing X F(x) is defined even for x < a. In this case we can write F(x) so if c < Ïf laf, = = - a we have -(-f(c))= F'(c)= f(c), exactly as before. Notice that in either case, differentiability of F at c is ensured by continuity of f at c alone. Nevertheless, Theorem 1 is most interesting when f is continuous at all points in [a,b]. In this case F is differentiable at all points in [a,b] and F' f. = In general, it is extremely difRcult to decide whether a given function f is the derivative of some other function; for this reason Theorem 11-7 and Problems 11-54 and 11-55 are particularly interesting, since they reveal certain properties which f must have. If f is continuous, however, there is no problem at all-according to Theorem 1, f is the derivative of sorne function, namely the function F(x) which Theorem 1 has a simple corollary integrals to a triviality. COROMARY If f is continuous on [a,b] and f f. = g' = frequently reduces computations for some function g, then b I PROOF f = g(b) Let g(a). ÏX F(x) Then F' - = f = g' on [a,b]. = f. Consequently, there is a number F=g+c. c such that of 14. The FundamentalTheoremof Calculus 285 The number c can be evaluated easily: note that 0 = F(a) = g(a) + c, = g(x) -g(a); so c thus = F(x) This is true, in particular, for x b. Thus = Lb f g(a). - = F(b) g(b) = - g(a). The proof of this corollary tends, at first sight, to make the corollary seem useless: after all, what good is it to know that b Ï f g(b) = b(a) - if g is, for example, g(x) f ? The point, of course, is that one might happen to know a quite different function g with this property. For example, if f = g(x) then g'(x) = f (x) so and = 3 f (x) = x2, we obtain, without ever computing lb X2dx=---. , similarly; b3 a3 3 3 lower and upper sums: if n is a natural number and g(x) One can treat other powers xn+1/(n + 1), then g'(x) x", so = = an+1 bn+1 /b x" dx = - . n+1 , n+1 For any natural number n, the function f (x) x¯" is not bounded on any interval containing 0, but if a and b are both positive or both negative, then = b I X-ndx a¯"*' b¯"** = -n+1 -n+1 . a -1. Wedo not knowa simple expressionfor Naturally this formula is only true for n ¢ g Lb - dx. x The problem of computing this integral is discussed later, but it provides a good to warn against a serious error. The conclusion of Corollary 1 is often confused with the definition of integrals-many students think that f is defmed f whose where g(a), derivative is f." This is as: g is a function is useless. One reason is that a function f may be integrable not only wrong-it without being the derivative of another function. For example, if f (x) 0 for not?). = x ¢ l and f (1) 1, then f is integrable, but f cannot be a derivative(why There is also another reason that is much more important: If f is continuous, opportunity "defmition" "g(b) - = 286 Denvativesand Integrals g' for some function g; but we know this only becauseof then we know that f Theorem1. The function f (x) 1/x provides an excellent illustration: if x > 0, then f (x) g'(x), where = = = lxl -dt, g(x)= t I know of no simpler function g with this property. The to Theorem 1 is so useful that it is frequently called the Second Fundamental Theorem of Calculus. In this book, that name is reserved for a somewhat stronger result in practice, however, is not much more useful). (which As we have just mentioned, a function f might be of the form g' even if f is not continuous. If f is integrable, then it is still true that and we corollary Lb f = g(b) g(a). - The proof, however, must be entirely different-we must return to the defmition of integrals. THEOREM 2 (THE SECOND FUNDA- If f is integrable on [a,b] and f = cannot use Theorem 1, so we g' for some function g, then MENTAL THEOREM OF CALCULUS) PROOF Let P {to, is a point x; in = ..., t,} be any partition of [t¿.i,ti] such that g(t¿) [a,b]. g(t¿_i) -- By the Mean Value Theorem there g'(x¿)(t; = f(x;)(t; = - - ti-1) ti-1)- If m¿ inf {f (x) : tt-1 5 x 5 ti } = , M¿=sup{f(x):ti-15x5tt}, then clearly -ti_i) m¿(t¿--t,_i) 5 5 Mg(t¿-t¿_i), f(x;)(t; that is, m¿(t; t¿_i) 5 g(ti) - Adding these equations for i m¿(t¿ - = 1, g(ti-1) 5 M¿(t; - - n we obtain ..., ti-1) 5 g(b) - g(a) 5 Mg(t¿ i-1 i=1 so that L( f, P) 5 g(b) foreverypartitionP. - g(a) 5 U( f, P) But this means that g(b) - ti_i). g(a) = Ibf. - ti_i) 14. TheFundamental7heoran of Calculus 287 We have already used the corollary to Theorem 1 (or,equivalently, to find the integrals of a few elementary functions: b Ï b"*I x" dx a f(x) = x. a"** = n+ 1 n + (aand b both positive or both negative if n > 0). -1. - l ' Theorem 2) nj As we pointed out in Chapter 13, this integral does not always represent the area bounded by the graph of the function, the horizontal axis, and the vertical lines through (a, 0) and (b, 0). For example, if a < 0 < b, then b Ï does not represent mstead by xadx the area of the region shown in Figure 2, which is given -x3dx+x3dx=--+- a4 b4 4 4 =-+-. FIGURE 2 Similar care must be exercised in finding the areas of regions which are bounded by the graphs ofmore than one function-a problem which may frequently involve considerable ingenuity in any case. Suppose, to take a simple example first, that we wish to find the area of the region, shown in Figure 3, between the graphs of the functions f(x)=x and g(x)=x3 x3 i x2, on the interval [0, 1]. If 0 sxs 1, then 0 5 so that the graph below that of f. The area of the region of interest to us is therefore area 1 R(f,0,1)- g lies 1), which is This area could have been expressed FIGURE3 area R(g,0, of as If g(x) 5 f(x) for all x in [a,b], then this integral always gives the area bounded by f and g, even if f and g are sometsmes negative. The easiest way to see this is shown in Figure 4. If c is a number such that f + c and g + c are nonnegative on [a,b], then the region Ri, bounded by f and g, has the same area as the region R2, 288 Derivativesand Integrals bounded by f + c and g + c. Consequently, b area Ri area R2 = Lb [(f+c)-(g+c)] Lb + c) (f = = (g + - c) b (f = - g). This observation is useful in the following problem: Find the area of the region bounded by the graphs of f(x)=x2-x b a The first necessity and g intersect is to determine this region more precisely. The graphs of x3-x2 x(x2 or or i f when or a g(x)=x2. and x = - O' x 1) - = 0, 1-B 1+B ' ' 2 2 22 and on the interval , , On the interval ([1 ß]/2, 0) we have x3 x2 > x3 assertions from the x. These are apparent (0, [1 + Ë]/2) we have graphs (Figure 5),but they can also be checked easily, as follows. Since f(x) g(x) only if x 0, [1+ Ã]/2, or [1- ß]/2, the function f does not change sign on the intervals ([1- ß]/2, 0) and (0, (1+Ã]/2); it is therefore only necessary tO Observe, for example, that - _ - = b) FIGURE 4 -g = 13-1-12--1<0 to conclude that f f 0 on ([1 ß]/2, 0), g 5 0 on (0, [1+ Ã]/2). g - - > - The area of the region in question is thus (x3 - x - x2)dx + [x2 - (x3 - x)]dx. As this example reveals, one of the major problems involved in finding the areas of a region may be the exact determination of the region. There are, however, have thus far defined the areas more substantial problems of a logical nature-we of some very special regions only, which do not even include some of the regions whose areas have just been computed! We have simply assumed that area made 14. TheFundamentalTheoremof Calculus 289 f(x) FIGURE - x x 5 sense for these regions, and that certain reasonable properties of do hold. These remarks are not meant to suggest that you should regard exercising ingenuity to compute areas as beneath you, but are meant to indicate that better approach a to the definition of area is available, although its proper place is somewhere in advanced calculus. The desire to define area was the motivation, both in this book and historically, for the definition of the integral, but the integral does not really provide the best method of defining areas, although it is frequently the proper tool for computing them. It may be discouraging to learn that integrals are not suitable for the very purpose for which they were invented, but we will soon see how essential they are for other purposes. The most important use of integrals has already been emphasized: if f is continuous, the integral provides a function y such that "area" y'(x) = f (x). This equation is the simplest example of a equation" (anequation for a function y which involves derivatives of y). The Fundamental Theorem of Calculus says that this differential equation has solution, a if f is continuous. In succeeding chapters, and in various problems, we will solve more complicated equations, but the solution almost always depends somehow on the integral; in order to solve a differential equation it is necessary to construct a new function, and the integral is one of the best ways of doing this. Since the differentiable functions provided by the Fundamental Theorem of Calculus will play such a prominent role in later work, it is very important to realize that these functions may be combined, like less esoteric functions, to yield still more functions, whose derivatives can be found by the Chain Rule. Suppose, for example, that "differential f(x) = a 1 + sin2 t dt. 290 Derivativesand Integrals Although the notation of the functions tends to disguise the fact somewhat, C(x) In fact, f (x) Rule, = x x3 = and F(x) dt. 1 + sin2 t F'(x') = Fo C. Therefore, by the Chain = F'(C(x)) = C'(x) · 3x2 - 1 = 3x2. sin273 1+ If f is defined, instead, as f(x) dt, sin2 1+ 3 then f'(x) 1 La = 1 = - - 1+ sin2x3 3x2. If f is defined as the reverse composition, f (x) = then f'(x) = 1 (lx dt 1 + sin2 t a C'(F(x)) · , F'(x) 2 1 (x Similarly, if aux f (x) = g(x) = h(x) = £6 (l' dt, sin2 1 dt 1+sin2t sin a then dt, t 1 1+ . 1+sin2X 1 sin2 1+ x f'(x) 3 \ y dt1+sin2t =3 , 1 = - 1 + sin2(sin x) cosx, -1 g'(x) = h'(x) = · cosx, 1 + sin2 (sinx) 2 1 cos ( 1+ sin2 composition 1 = F(C(x)); in other words, f f'(x) f is the dt t 1 - 1+ sin2 . X 14. TheFundamentalTheoremof Calculus 291 The formidable appearing function f (x) = is also a composition; in fact, f f'(x) = 1 + sin2 t Fo F. Therefore Ja = F'(F(x)) - dt F'(x) 1 1 + sin2 \Ja 1 1+ sin2 1+ dt sin2 / t As these examples reveal, the expression occurring above (orbelow) the integral sign indicates the function which will appear on the rykt when f is written as a composition. As a final example, consider the triple compositions f (x) = sin2 1+ a dt, g(x) " = 1+=*"'' - dt, 1 + sin2 - t which can be written and f=FoFoC Omitting the intermediate (whichyou steps g=FoFoF. may supply, if you still feel insecure), we obtain 1 f'(x) = 1 - 1 1 + sin a dt 1 + sin2 t - 1+ sin2,3 1 g'(x) - 1 + sm 1 1 2 a 3x2, 1 + sin2 t dt 1+ n2 \Ja 1+ sin2 dt t / 1 1 + sin2 x should beLike the simpler differentiations of Chapter 10, these manipulations and, like much easier after the of the practice problems, provided by come some the problems of Chapter 10, these differentiations are simply a test of your understanding of the Chain Rule, in the somewhat unfamiliar context provided by the Fundamental Theorem of Calculus. The powerful uses to which the integral will be put in the following chapters all depend on the Fundamental Theorem of Calculus, yet the proof of that theorem was quite easy-it seems that all the real work went into the definitionof the integral. Actually, this is not quite true. In order to apply Theorem 1 to a continuous function we need to know that if f is continuous on [a,b], then f is integrable on [a,b]. Although we've already offered one pmof of this result, there 292 Derivativesand Integrals "elementary" is a more elementary argument that you might prefer. Like most it's quite tricky, but it has the virtue that it will force a review of the arguments, proof of Theorem 1. If f is any bounded function on [a,b], then and sup{L(f, P)} inf{U(f, P)} will both exist, even if f is not integrable. These numbers integral of f on [a,b] and the upper integral of f on will be denoted by L are called the lower [a,b], respectively, and b lb and f U f. The lower and upper integrals both have several properties which the integral possesses. In particular, if a < c < b, then bcb Lbcb Lf=Lf+LfandUf=Uf+Uf, G and if m 5 f (x) 5 C M for all x in a) 5 L - C [a,b], then b m(b d G I b 5 U f f 5 M(b -- a). The proofs of these facts are left as an exercise, since they are quite similar to the corresponding proofs for integrals. The results for integrals are actually a corollary of the results for upper and lower integrals, because f is integrable precisely when b b I Lf=Uf. We will prove that a continuous function f is integrable by showing that this equality always holds for continuous functions. It is actually easier to show that L Ïx x f=U f for all x in [a,b]; the trick is to note that most of the proof of Theorem even depend on the fact that f was integrable! THEOREM 13-3 PROOF If f is continuous on then [a,b], Define functions L and U on f is integrable on [a,b]. [a,b] by IX L(x) Let x be in (a, b). If h > = L X f and 0 and ma=inf{f(t):xstyx+h}, Mh=sup{f(t):xstšx+h}, U(x) = U f. 1 didn't 14. The FundamentalTheoremof Calculus 293 then x+h x+h Ï mh-hšL fšU fšMa-h, x SO - ma h 5 L(x + h) or L(x + h) h mh< ¯ If h < x L(x) 5 U(x + h) -- L(x) - < U(x) 5 Mh h - U(x + h) · U(x) - ¯ <Mh- ¯ h 0 and ma=inf{f(t):x+hst áx}, Mh=sup{f(t):x+hstsx}, one obtains the same inequality, precisely as in the proof of Theorem 1. Since f is continuous at x, we have lim mh lim Mh = h->0 f(X), = h->0 and this proves that L'(x) U'(x) = This means that there is a number U(x) f (x) for x in (a, b). = c such that for all x in L(x) + c = [a,b]. Since U(a) the number c must equal L(a) = = 0, 0, so U(x) for all x in L(x) = [a,b]. In particular, U lbf b = U(b) = L(b) = L a and this means that a f is integrable on f, [a,b]. PROBLEMS 1. Find the derivatives of each of the following functions. x3 (i) F(x) (ii) F(x) sin3 = t dt. Ja = Ï (f an3rdt 1+sin6t#£2 3 2 (iii) F(x) (iv) F(x) > 1 = 5 = 8 Lb 1+t2+sin2t dt. 1 + t2 + sin2 dt dt dy. 294 Derivativesand Integrals (v) F(x) = (vi) F(x) = x L sin3 sin sin Ï = ' - t 1 where F(x) FIGURE . (Find (F' - - - -- of dt. = - )'(x) in terms F¯I(x).) 1 1 1 dy dt. x (viii)F-1, t dt xy (vii) F¯1 where F(x) 2. dt. 1+t2+sin2t t2 ---- ------ - 6 For each of the following f, if F(x) = jg f, at which points x is F'(x) = f (x)? (Caution: it might happen that F'(x) f (x), even if f is not contin= uous at x.) (i) f(x)=0ifxs1, (ii) f (x) 0 if x < = f(x)=1ifx>l. 1, f (x) = 1 if x > 1. (iii) f(x)=0ifxý=1, f(x)=lifx=l. (iv) f(x) 0 if x is irrational, f(x) 1/q if x (v) f (x) 0 if x 5 0, f (x) x if x > 0. (vi) f (x) 0 if x 5 0 or x > 1, f (x) 1/[1/x] = = = = 3. p/q in lowest terms. if 0 = (vii)f is the function shown in Figure 6. (viii)f(x) 1 if x 1/n for some n in N, f(x) = = = = Let f be integrable on [a,b], F(x)= = < x 5 1. 0 otherwise. let c be in (a, b), and let lx f, a5x5b. For each of the following statements, give either a proof or a counterexample. (a) If f is differentiable at c, then F is differentiable at c. 14. TheFundamentalTheoremof Calculus 295 (b) If f is differentiable at c, then F' is continuous (c) If f' is continuous at c, then F' is continuous 4. at c. at c. Show that the values of the following expressions do not depend on x: 1/x1 x1 (i) dt + 1+t2 1 (ii) Isinx Find (f 1 cosx dt, t2 -- dt. 14g2 0 [0,x/2]. XE ¯1)'(0) 5. if X (i) f (x) = (ii) f(x) = Ï 1 + sin(sin t) dt. O cos(cost)dt. 1 (Don't try to evaluate 6. f explicitly.) Find a function g such that (i) lx tg(t)dt=x+x2. 0 tg(t) dt (ii) = x +x2 (Notice that g is not assumed continuous at 0.) 7. Find all continuous functions f satisfying X f=(f(x))2+C. Ï O for some constant *8. C. Suppose that f is a differentiable function with f (0) Prove that for all x > 0 we have < f' s 1. 3 Let f (x) Is the function F(x) 10. 0 and 0 2 x *9. = = cos = fe'f (i) (ii) 1 76 1 < ¯ 7Ã 3 8 - Ál+x2 1/2 < ¯ o dx 5 1 x dx 5 1+x - 1 x ·ý x 0 differentiable at 0? Hint: Stare at page 177. Use Problem 13-23 to prove that - -, n4 ---. 1 I -. 296 Derivativesand Integrals 11. Find F'(x) if F(x) f( xf (t)dt. (The answer is not xf (x); you should perform an obvious manipulation on the integral before trying to find F'.) 12. Prove that if f is continuous, = then xf (u)(x u) du - f (t) dt = du. Hint: Differentiate both sides, making use of Problem 11. *13. Use Problem 12 to prove that x f(u)(x-u)2du=2 14. *15. Find a function f such that f"'(x) supposed to be easy; don't misinterpret (a) If f is periodic I O periodic g for only 16. if f (This problem is x. "find.") a, if f (x + a) f (x) for all x. = [0,a], show that f - b for all b. f is not periodic, but f' is. Hint: Choose it can be guaranteed that f (x) fag is not such that which periodic. (c) Suppose that f' sin2 du2- b+a a a | Ál+ the word 1 period a and integrable on with (b) Find a function f = with period A function f is periodic, dui f(t)dt = is periodic with period a. Prove that f is periodic if and f (a) f (0). = feb Find dx, by simply guessing a function f with f'(x) 6, and using the Second Fundamental Theorem of Calculus. Then check with Prob= lem 13-21. Cs *17. c Use the Fundamental Theorem of Calculus and Problem 13-21 to derive the in Problem 12-18. result stated ca 18. Let C1, C and C2 be curves passing through the origin, as shown in Figure 7. Each point on C can be joinedto a point of Ci with a vertical line segment and to a point of C2 with a horizontal line segment. We will say that C bisects Ci and C2 if the regions A and B have equal areas for every point on C. x2, x > 0 and C is the graph of (x) 2x2, f > that and C Ci C2. C2 bisects find 0, so x (b) More generally, find C2 if Ci is the graph of f(x) x", and C is the graph of f (x) cx" for some c > 1. (a) If Ci is the graph of f (x) = = = FIGURE 7 = 19. *20. derivatives of F(x) fi' 1/t dt and G(x) (b) Now give a new proof for Problem 13-15. (a) Find the = = 1/t dt. fbbx Use the Fundamental Theorem of Calculus and Darboux's Theorem (Problem 11-54) to give another proof of the Intermediate Value Theorem. 14. TheFundamentalTheoremof Calculus 297 -21. Prove that if h is continuous, and g are differentiable, and f F(x) h(t) dt, = h(f(x)) then F'(x) h(g(x)) g'(x) f'(x). Hint: Try to reduce this to the two cases you can already handle, with a constant either as the lower or the upper limit of integration. = 22. · · - Show also that the hypothesis *23. [0, 1] and f (0) Suppose that f' is integrable on [0, 1] we have (a) Suppose G' differential (*) = g and F' f (0) g(y(x)) - y'(x) Hint: Problem 13-39. for all x in some interval, f(x) = 0. Prove that for all x in Prove that if the function y satisfies the f. = equation 0 is needed. = = then there is a number c such that G(y(x)) (**) for all x in this interval. F(x) + c that if y satisfies (b) Show, conversely, (c) Find what = condition y must satisfy y'(x) (**),then if 1 + x2 1+y(x) = y is a solution of (*). . the resulting 1 + t and f (t) 1 + t2.) Then (In this case g(t) equations to find all possible solutions y (nosolution will have R as its domain). (d) Find what condition y must satisfy if = "solve" = -1 y'(x) = . 1 + 5[y(x)]4 (An appeal to Problem 12-11 will show that there are functions satisfying the resulting equation.) (e) Find all functions y satisfying y(x)y'(x) -x. = -1. Find the solution y satisfying 24. y(0) = In Problem 10-17 we found that the Schwarzian derivative f'"(x) f'(x) was 0 whose ¯ 3 2 f"(x) f'(x) for f (x) (ax + b)/(cx + d). Now suppose that f is any function Schwarzian derivative is 0. = 298 Derivativesand Integrals (a) f"2g'3 is a constant fimction. the (b) f is form f(x) (ax + b)/(cx + d). Hint: Consider u apply the previous problem. = *25. JN The limit lim f, N-woo "improper called an if it exists, is denoted by and f' = f°°f (orf°°f (x)dx), and integral." -1. f°°xrdx, (a) Determine if r < 13-15 to show that (b) Use Problem fi°°1/x dx does not exist. Hint: What 1/x dx? can you say about > f exists. Prove that if (c) Suppose that f (x) 0 for x > 0 and that 0 5 g(x) 5 f(x) for all x > 0, and g is integrable on each interval g also exists. [0,N], then 1/(1+ x2) dx exists. Hint: Split this integral up at 1. (d) Explain why f fe" fe" fe 26. Decide whether or not the following improper integrals exist. 1 Ï°°fl+x3 (ii) I" (iii) l=° (i) dx. 0 x 1 + x3/2 i o o xjl *27. dx. dx. +x The improper integral is defined in the obvious way, as fi f But another kind of improper integral it is fg°° f + | (a) Explain why (b) Explain why f, provided f°° f lim JN · N->-oo is defined in a nonobvious way: these improper integrals both exist. f" 1/(1 + x2) dx exists. f°° x dx does not exist. (But notice that lim /_NN N->oo X dx does exist.) (c) Prove that if Show moreover, f°° f exists, then lim f_NN Nandœ that lim |_NN lim N-woo Naco f exists and equals |_NN2 f both f°° f. exist and equal Can you state a reasonable generalization of these facts? (If you will have a miserable time trying to do these special cases!) you f°° f. can't, *28. "improper There is another kind of bounded, but the function is unbounded: (a) If a > 0, find lim e->O+ f 1/Jdx. even though the function f (x) matter how we defme f (0). integral" in which the interval is This limit is denoted by = 1/ ||1//dx, is not bounded on [0,a], no -1 (b) Find fe"x' < r < 0. dx if Problem 13-15 to show that (c) Use a limit. / x¯I dx does not make sense, even as of Calculus 299 14. TheFundamentalTheorem (d) Invent a reasonable definition of -l<r<0. fa°|x|'dx for a 0 and compute it for < (1 x2)¯1¡2 dx, as a sum of two a reasonable definition of limits, and show that the limits exist. Hint: Why does 1(1 + x)¯I/2 dx (1-x2)¯I/2 < exist? How does (1+x)¯I/2 for compare with x < 0? f (e) Invent - | 29. *30. (a) If f is continuous (b) If f is integrable on on [0, 1], compute [0, 1] and It is possible, finally, to combine the integral. (a) If ocf(x) 1/T = for 0 Ex lim f x->0 /1 lim x --- x->0+ x t --1 dt. lim x (x) exists, compute x->0+ x t dt. the two possible extensions of the notion of 5 1 and f (x) dx (afterdeciding what f(x) this should = 1/x2 for x > 1, find mean). oo -1 xr (b) Show that dx never makes sense. (Distinguish the cases < --1. 0 and r at oo; for r r < = < --1 In one case things go wrong at 0, in the other case things go wrong at both places.) CHAPTER THE TRIGONOMETRIC FUNCTIONS The definitions of the functions sin and cos are considerably more subtle than one might suspect. For this reason, this chapter begins with some informal and intuitive definitions, which should not be scrutinized too carefiilly, as they shall soon be replaced by the formal definitions which we really intend to use. In elementary geometry an angle is simply the union of two half-lines with a common initial point (Figure 1). FIGURE 1 "directed angles," which may be regarded as More useful for trigonometry are pairs (li, l2) of half-lines with the same initial point, visualized as in Figure 2. FIGURE i. FIGURE 3 2 If for 11we always choose the positive half of the horizontal axis, a directed angle is described completely by the second half-line (Figure 3). Since each half line intersects the unit circle precisely once, a directed angle is described, even more simply, by a point on the unit circle (Figure 4), that is, by a pOint (X, y) With x2 1. + y2 = 300 15. The Trgonometric Functions 301 The sine and cosine of a directed angle can now be defined as follows (Figure 5): a directed angle is determined by a point (x, y) with x2 + y2 = 1; the sine of the angle is defined as y, and the cosine as x. Despite the aura of precision surrounding the previous paragraph, we are not yet finished with the definitions of sin and cos. Indeed, we have barely begun. What we have defined is the sine and cosine of a directed angle; what we want to define is sin x and cos x for each number x. The usual procedure for doing this depends on associating an angle to every number. The oldest method is to the way around" is associated to 360, angles in degrees." An angle associated angle around" is 180, to an quarter way around" an angle to 90, etc. (Figure 6). The angle associated, in this manner, to the number x, is called angle of x degrees." The angle of 0 degrees is the same as the angle of 360 degrees, and this ambiguity is purposely extended further, so that an angle of 90 degrees is also an angle of 360 + 90 degrees, etc. One can now define a function, which we will denote by sin°, as follows: "measure "all "half-way "a "the FIGURE 4 sin°(x) > 7¯¯ si A = sine of the angle of x degrees. There are two difficulties with this approach. Although it may be clear what we mean by an angle of 90 or 45 degrees, it is not quite clear what an angle of Ã degrees is, for example. Even if this difficulty could be circumvented, it is unlikely that this system, depending as it does on the arbitrary choice of 360, will lead would be sheer luck if the function sin° had mathematically to elegant results-it pleasing properties. "Radian measure" appears to offer a remedy for both these defects. Given any number x, choose a point P on the unit circle such that x is the length of the are of the circle beginning at (1,0) and running counterclockwise to P (Figure 7). angle of x radians." Since the The directed angle determined by P is called of whole the length circle is 2x, the angle of x radians and the angle of 2x + x cos A "the FIGURE radians 5 are identical. A fimction sin, can now be defined as follows: sin'(x) = sine of the angle of x radians. This same method can easily be adopted sin' 2x, we can define sin° 360 to define sin°; since we want to have = 180 sin° x so so = sin" 2xx 360 ---- = sin' xx -. 180 We shall soon drop the superscript r in sinr, since sin' (andnot sin°) is the only function which will interest us; before we do, a few words of warning are advisable. The expressions sin° x and sinr x are sometimes written sin x° sin x radians, FIGURE 6 but this notation is quite misleading; a number x is simply a number-it does not radians." If the meaning degrees" or carry a banner indicating that it is "in "in 302 Derivativesand Integrals of the notation "sin x" is in doubt one usually asks: P "Is x in degrees or radians?" kngth x but what one means is: (t, 0) FIGURE 'sinr, 'sin°' "Do you mean » or Even for mathematicians, addicted to precision, these remarks might be dispensable, were it not for the fact that failure to take them into account will lead to incorrect answers to certain problems (anexample is given in Problem 19). Although the function sin' is the function which we wish to denote simply by sin involved even in the definition use exclusively henceforth),there is a difficulty (and Of SÎûr. Our proposed definition depends on the concept of the length of a curve. several problems, it is also easy Although the length of a curve has been defined in of length is to reformulate the definition in terms of areas. (A treatment in terms outlined in Problem 28.) 7 Suppose that x is the length of the arc of the unit circle from (1,0) to P; this are of the unit cirle. thus contains x/2x of the total length 2x of the circumference shown in Figure 8; S is bounded by the unit circle, the Let S denote the should be horizontal axis, and the half-line through (0, 0) and P. The area of S x/2x times the area inside the unit circle, which we expect to be x; thus S should have area X X P length x "sector" area - 2 which We can therefore define cos x and sin x as the coordinates of the point P determines a sector of area x/2. sin With these remarks as background, the rigorous definition of the functions and cos now begins. The first definition identifies x as the area of the unit circlesemicircle (Figure 9). more precisely, as twice the area of a 2x F I GURE 8 11 DEFINITION f(x) g = 2 Ál . - x2 dx. (This definition is not offered simply as an embellishment; to define the trigonometric functions it will be necessary to first define sinx and cosx only for = -1 area = | Ex 5 1, the area A(x) of The second definition is meant to describe, for the sector bounded by the unit circle, the horizontal axis, and the half-line through expressed (Figure 10) as the sum of (x, Ál x2). If 0 Exy 1, this area can be - the area of a triangle and the area of a region under the unit circle: x FIGURE9 1-x2 1 + 1-t2dt. 15. The Trigonometric Functions 303 -1 This same formula happens to work for 9 0 also. In this case (Figure 11), 5 x 5 the term 2 is negative, the term FIGURE and the area of the triangle which must be subtracted from represents 1 1-t2dt. 10 -1 DEFINITION If < x < 1, then x/1 A(x) 1 x2 - |1 + = 2 , 12 - dt. -1 < x < 1, then A is differentiable at x and Notice that if mental Theorem of Calculus), (usingthe Funda- -2x A'(x) = - 1 x 2 2) l + - x2 - Ál - x2 /1 - - x2 -x2 - FIGURE ll 1 2 Î (1 + X - - 2Ál l-2x2-2(Î-X2 --- x2) -Vl-x2 1-x2 - ---- - x2 Ál x2 - 2/l-x2 -1 2)1- x2 Notice also (Figure 12) that on the interval from i i FIGURE the function A decreases $ Ál-t2dt=I 2 1-1 A(-1)=0+ -1 [---1,1] to A(1) = 0. This followsdirectly from the definition of A, and also from the fact that its derivative is negative on (-1, 1). For 0 5 x 5 x we wish to define cos x and sin x as the coordinates of a point P (cosx, sin x) on the unit circle which determines a sector whose area is x/2 (Figure 13). In other words: 12 = DEFINMON If 0 5 x 5 x, then cos x is the unique number A(cosx) in x -; = 2 and sin x = 1 - (cosx)2. [-1, 1] such that and Integrals 304 Derivatives This definition actually requires a few words of justification.In order to know x/2, we use the fact that A is continuous, that there is a number y satisfying A(y) and w/2. This tacit appeal to the Intermediate and that A takes on the values 0 make Value Theorem is crucial, if we want to our preliminary definition precise. and made, definition, Having we can now proceed quite rapidly. justified,our = THEOREM 1 If 0 < x < x, then -sinx, PROOF If Ñ = cos'(x) = sin'(x) = 2A, then the definition A(cosx) cos x. be written x/2 can = B(cos x) = x; in other words, cos is just the inverse of B. We have already computed that 1 A'(x) 2dl = - , -x2 P = sin x) (cosx, from which we conclude that 1 area - gl-x2 2 Consequently, cos'(x) FIGURE (B¯1), = 13 Î B'(B-1 ) 1 1 1 = - = - 1 [B¯*(x)]2 - (cosx)2 - sin x. Since sin x V1 = - (cosx)2, we also obtain -2 sin'(x) = cos x cos'(x) (cosx) 1 2 - Ál sinx - cosx sinx = cosx. The information contained in Theorem 1 can be used to sketch the graphs of 15. The TryonomenicFunctions 305 sin and cos on the 14 [0,x]. interval COS cos'(x) = - Since sin x < 0, O< x < x, -1 the function cos decreases from cos O 1 to cos x (Figure 14). Consequently, unique x]. that the definition of cos, To fmd 0 for in cos y a y y, we note [0, = -1- = = FIGURE À(COSX) 14 means = , that A(0) SO 2, = 2 1 y = -t2dt. Ál 2 It is easy to see that 0 2 -1 FIGURE 15 l Ål-t2dt= Ál-t2dt JO so we can also write 1-t2dt= y= . 1 Now we have sin'(x) = cos x >0, 0<x<r/2 <0, x/2<x<x, increases on [0,x/2] from sin 0 = 0 to sin x/2 = 1, and then decreases on x] [n/2, to sin x 0 (Figure 15). The values of sin x and cos x for x not in [0,x] are most easily defined by a two-step piecing together process: so sin = 2 (1) If x 5 x 5 2x, then -sin(2x sinx (a) cosx --x), = = cos(2x Figure 16 shows the graphs of sin and cos on (2) 2= .- 2 If x = FIGURE 16 [0,2x]. 2xk + x' for some integer k, and some x' in sin x cos x (b) x). - = sin x', = cos x'. [0,2x], then Figure 17 shows the graphs of sin and cos, now defined on all of R. Having extended the functions sin and cos to R, we must now check that the basic properties of these functions continue to hold. In most cases this is easy. For example, it is clear that the equation sin2 x + COS2 306 Derivativesand Integrals (a) (b) FIGURE 17 holds for all x. It is also not hard to prove that if x is not a multiple of x. sin'(x) = cos'(x) = cos x, -sinx, For example, if x < -sin(2x sinx < x 2x, then -x), = so sin'(x) = = = - sin'(2x cos(2x -- - x) - (--1) x) cosx. If x is a multiple of x we resort to a trick; it is only necessary to apply Theorem 11-7 to conclude that the same formulas are true in this case also. FIGURE 18 15. The TyonometricFunctions 307 i- The other standard define trigonometric 1 -1 1 secx = tan x = csex = cotx = --- cosx sm x x ¢ kr +x/2, x ¢ kr. cos x 1 i tan - smx -1- cosx sinx The graphs are sketched in Figure 18. It is a good idea to convince yourself that the general features of these graphs can be predicted from the derivatives of these the functions, which are listed in the next theorem (thereis no need to memorize results of rederived whenever needed.) the theorem, since the statement can be THEOREM FIGURE functions present no difficulty at all. We 2 If x ¢ kr + x/2, then BCC'(X) 19 tan'(x) = SCCX tanX, = sec2 x. If x ¢ kx, then csc'(x) cot'(x) PROOF 2· Left to you (astraightforward = - csc2 computation). - / The inverses of the trigonometric functions are also easily differentiated. The trigonometric functions are not one-one, so it is first necessary to restrict them to suitable intervals; the largest possible length obtainable is x, and the intervals usually chosen are (Figure 19) [-x/2, x/2] [0,x] (a) (-x/2, for sin, for cos, for tan. x/2) (The inverses of the other trigonometric functions are so rarely used that they will not even be discussed here.) The inverse of the function -1 1 -x/2 f (x) (b) 20 = sin x, 5 x 5 x/2 is denoted by arcsin (Figure 20); the domain of arcsin is [-1, 1]. The notation has been avoided because arcsin is not the inverse of sin (which is not oneone), but of the restricted function f; sometimes this function f is denoted by Sin, and arcsin by Sin¯1 sin¯I FIGURE -cscxcotx, = 308 Derivativesand Integrals The inverse of the function 05x5x g(x)=cosx, is denoted by arccos (Figure 21); the domain of arccos is is denoted by Cos, and arccos by Cos-1 The inverse of the function [-1, 1]. Sometimes g --x/2 h(x) rccos tanx, < < x x/2 is denoted by aretan (Figure 22); arctan is one of the simplest examples of a differentiable function which is bounded even though it is one-one on all of R. Sometimes the function h is denoted by Tan, and arctan by Tan¯'. The derivatives of the inverse trigonometric functions are surprisingly simple, and do not involvetrigonometric functions at all. Finding the derivatives is a simple matter, but to express them in a suitable form we will have to simplify expressions like -1 1 -1 FIGURE = 21 cos(arcsin sec(arctanx). x), 2 22 FIGURE A little picture is the best way to remember the correct simplifications. For example, Figure 23 shows a directed angle whose sine is x-the angle shown is thus an angle of (arcsinx) radians; consequently cos(arcsin x) is the length of the other side, namely, Ál x2. However, in the proof of the next theorem we will not - resort to such pictures. -1 FIGURE 23 THEOREM 3 If < x < 1, then arcsin'(x) 1 = , 1-x2 -1 arecos'(x) = . -x2 1 Moreover, for all x we have arctan'(x) = 1 1 + x2 . 15. The Trgonometric Functions 309 arcsin'(x) PROOF (f¯ )'(x) = 1 f(x)) f'(f 1 sin'(arcsin x) 1 cos(arcsin x) Now x)]2 [sin(arcsin + x)]2 [cos(arcsin _., y that is, x2 + x)]2 [cos(arcsin O -- therefore, cos(arcsin x) Ál = x2. - (The positive square root is to be taken because arcsinx is in (-n/2, x/2), so cos(arcsinx) > 0.) This proves the first formula. The second formula has already been established (inthe proof of Theorem 1). It is also possible to imitate the proof for the first formula, a valuable exercise if that proof presented any difficulties. The third formula is proved as follows. arctan'(x) = (h¯I)'(x) 1 h'(h-1( 1 tan'(arctan x) 1 ¯ sec2(arctaux) Dividing both sides of the identity sin2 a # COS2 = a Î by cos2 a yields tan2 a + 1 sec2 = a. It follows that + 1 [tan(arctanx)]2 = sec2(arctanx), or x2 which + 1 = sec2(arctanx), proves the third formula. The traditional proof of the formula sin'(x) given here) is outlined different from the one cos x (quite in Problem 27. This proof depends upon first establishing = 310 Derivativesand Integrals the limit lim hw0 and the "addition sin h 1, = -- h formula" sin(x + y) sin x cos y + cos x sin y. = Both of these formulas can be derived easily now that the derivative of sin and cos are known. The first is just the special case sin'(0) = cos0. The second depends of the functions sin and cos. In order to derive this on a beautiful characterization result we need a lemma whose proof involves a clever trick; a more straightforward proof will be supplied in Part IV LEMMA Suppose f has a second derivative everywhere and that =0, f"+ f f (0) 0, f'(0) 0. = = Then f PROOF = 0. Multiplying both sides of the first equation by yields f' f'f"+ ff'= 0. Thus [(f')2+ f2]'=2(f'f"+ so (f')2 the constant ff')=0, 2 is a constant function. From f (0) is 0; thus [f'(x)] + [f (x)] = 0 = 0 and f'(0) = 0 it follows that for all x. This implies that f (x) 0 for all x. I = THEOREM 4 If f has a second derivative everywhere and f"+ f f (0) f'(0) =0, = = a, b, then f (In particular, if cos.) then f = f(0) = = 0 and f'(0) b sin + a cos. - = - 1, then f = sin; if f (0) = 1 and f'(0) = 0, 15. The TgonometricFunctions 311 PROOF Let -bsinx g(x)= -acosx. f(x) Then g'(x)= g"(x) f'(x)-bcosx+asinx, sinx + a cosx. f"(x) +b = Consequently, g"+g=0, g(0) g'(0) 0, 0, = = which shows that 0=g(x)= THEOREM s If x and y are any two numbers, sin(x then + y) cos(x + y) PROOF forallx. f(x)-bsinx-acosx, sin x cos y + cos x sin y, cos x cos y sin x sin y. = = - For any pardcular number y we can define a function f (x) f by sin(x + y). = Then f'(x) f"(x) cos(x + y) sin(x + y). = = - Consequently, f" + f f(0) f'(0) = = = 0, sin y, cos y. It follows from Theorem 4 that f = (cosy) - sin +(sin y) · cos; that is, sin(x + y) = cos y sin x + sin y cos x, for all x. Since any number y could have been chosen to begin with, this proves the first formula for all x and y. The second formula is proved similarly. As a conclusion an alternative to this chapter, and as a prelude to Chapter 18, we will mention to the definition of the function sin. Since approach arcsin'(x) l = 1 - x2 for - 1< x < 1, 312 Derivativesand Integrals it followsfrom the Second Fundamental Theorem of Calculus that x 1 arcsinx arcsinx arcsin0 dt. 1 t2 = = - - This equation could have been taken as the definitionof arcsin. It would follow immediately that I arcsin'(x) = 1 - the function sin could then be defined as (arcsin)¯I tive of an inverse function would show that sin'(x) Ál = - ; x2 and the formula for the deriva- sin x, which could be defined as cos x. Eventually, one could show that A(cos x) x/2, with which recovering of started. the end the the development definition at we very While much of this presentation would proceed more rapidly, the definition would of the definitions would be known to the reasonableness be utterly unmotivated; the author, but not to the student, for whom it was intended! Nevertheless, as we shall see in Chapter 18, an approach of this sort is sometimes very reasonable = indeed. PROBLEMS 1. Diffërentiate each of the following functions. (i) f (x) = arctan(arctan(arctan (ii) f (x) (iii) f (x) = arcsin(arctan(arccos = arctan = arcsin (iv) f (x) x)). x)). x). (tanx arctan 1 . 1 + x2 2. Find the following limits by l'Hôpital's Rule. . sinx-x+x3/6 hm (i) x-O (ii) hm . X, sinx-x+x3/6 . x->0 . cosx---1+x2/2 lun (iii) x->o lim (iv) x40 . (v) 4 x . x cos x 1 + x2/2 - ' 4 x arctanx-x+x3/3 hm x-vo lim (vi) x->o x (1 - - x 1 -- smx . 15. The Trgonometric Functions 313 sin x ¯¯' 3. Let f (x) x = 1, x ¢0 = 0. (a) Find f'(0). (b) Find f"(0). At this point, you will almost certainly have to use l'Hôpital's Rule, but in Chapter 24 we will be able to find f(k)(0)for all k, with almost no work at all. 4. Graph the following functions. (a) f (x) (b) f (x) = sin 2x. sin(x2). (A pretty respectable sketch of this graph can be obtained using only a picture of the graph of sin. Indeed, pure thought is your only hope in this problem, because determining the sign of the cos(x2)-2x is no easier than determining the behavior = derivative = f'(x) of f directly. The formula for f'(x) does indicate 0, which must be true since one important fact, is even, and which however-f'(0) f be clear in your graph.) (c) f(x) sinx + sin2x. (It will probably be instructive to first draw the sin2x carefully on the same set of sinx and h(x) graphs of g(x) and what the sum will look like. You can from 0 2x, to axes, guess easily find out how many critical points f has on [0,2x] by considering the derivative of f. You can then determine the nature of these critical points by finding out the sign of f at each point; your sketch will probably = should = = = suggest the answer.) (First determine the behavior of f in (-x/2, x/2); in the intervals (kr n/2, kr + x/2) the graph of f will look exactly the same, except moved up a certain amount. Why?) (e) f (x) sin x x. (The material in the Appendix to Chapter 11 will be particularly helpful for this function.) (d) f (x) = tan x -- x. - = - sin x (f) f (x) ¢0 ¯¯' x 1, x=0. = where the zeros enable you to determine approximately of f' are located. Notice that f is even and continuous at 0; also consider the size of f for large x.) (g) f (x) = x sin x. (Part *5. 6. should (d) a/0 in polar coordispiral is the graph of the function f(0) The hyperbolic this paying particular attention nates (Chapter 4, Appendix 3). Sketch curve, to its behavior for 0 close to 0. = Prove the addition formula for cos. 314 lkrivatives and Integrals 7. (a) From the addition formula for sin and cos derive formulas for sin 2x, 2x, sin 3x, and cos 3x. (b) Use these formulas to find the following values of the trigonometric fimctions (usually deduced by geometric arguments in elementary trigonomcos etry): sm tan cos 8. cos 4 4 - -, 4 2 1 x = -, 2 6 = - = - 1, = - . sm = - -. 6 2 A sin(x + B) can be written as a sin x + b cos x for suitable a and b. (One of the theorems in this chapter provides a one-line proof. You should also be able to figure out what a and b are.) (b) Conversely, given a and b, find numbers A and B such that a sin x + A sin(x + B) for all x. b cos x (c) Use part (b)to graph f (x) E sin x + cos x. (a) Show that = = 9. (a) Prove that tan(x + y) tan x + tan y 1 tanx tan y = - provided that x, y, and x + y are not of the form kr + x/2. addition (Use the sin and cos.) formulas for (b) Prove that arctan x + arctan y x + y ( arctan = 1 , - xy indicating any necessary restrictions on x and y. Hint: Replace x by arctan x and y by arctan y in part (a). 10. Prove that arcsina + arcsin ß indicating any restrictions = on arcsin(ad1 œ and p2 - _ ß. 11. Prove that if m and n are any numbers, sinmx sin nx sin mx cos nx cos mx cos nx = = = then cos(m + n)x], ([cos(mn)x sin(m n)x], + n)x + ([sin(m - - --- + n)x + cos(m ([cos(m - n)x]. 2 15. The T onometricFunctions 315 12. Prove that if numbers, m and n are natural x Ï I Ï then 0, sinmx sinnxdx m ¢ x, m = n, 0, m ¢ n m = n, = l cos mx cos nx dx = l x, -x n x sinmx cos nx dx 0. = These relations are particularly important in the theory of Fourier series. Although this topic will receive serious attention only in the Suggested Reading, the next problem provides a hint as to their importance. 13. [-x, x], is integrable on (a) If f show that the minimum lx dx (f(x)-acosnx) occurs when 1 a and the minimum = value of " f(x)cosnxdx, - value of /K (f (x) sinnx)2 dx f (x)sinnx dx. a - when a = 1 " - (In each case, bring a outside the integral sign, obtaining a quadratic expression in a.) (b) Define = a, b,=- 1 " 1 " - f(x)cosnxdx, n= 0,1,2,..., f(x)sinnxdx, n=1,2,3,.... Show that if c, and d, are any numbers, ( f (x) = 2 N [f (x)]2dx c, cos nx + d, sinnx + - then dx n=1 - 2x + + x a.c. - -- + b.d. + + x (c. c,2 + d + - a.)2 + (d. b.)2 - 2 316 Derivanvesand Integrals thus showing that the first integral is smallest when at c¿ and b¿ d¿. combinations" of the functions s.(x) In other words, among all sinnx and c.(x) N, the particular function for 1 yns cosnx = = "linear = = g(x)= a,cosnx+bssinnx + n=1 has the 14. "closest fit" to f on [-n, x]. (a) Find a formula for sin x + sin y. (Notice that this also gives a formula for Hint: First find a formula for sin(a + b) + sin(a b). What good does that do? Also find a formula for cos x + cos y and cos x cos y. (b) sin x sin y.) - - - 15. (a) Starting from the formula for cos 2x, derive formulas for terms of cos in (b) Prove that cos x 1+cosx 2 2 for05xix/2. (c) Use part (a)to find sin2 (d) Graph f (x) = 16. c'os2 x and 2x. = - sin2 f and sin2 x . sm dx and f x - 1-cosx 2 = 2 cos2 x dx. Find sin(arctan x) and cos(arctan x) as expressions not involving trigonosin y/ cos y functions. Hint: y arctan x means that x tan y sin2 sin y/ 1 metric = = = = - 17. 18. answers should express sinu and cosu in terms of x. (Use Problem 16; the be very simple expressions.) (a) Prove that sin(x + x/2) If x = tanu/2, cosx. (All along we have been drawing the graphs of sin and cos as if this were the case.) (b) What 19. is arcsin(cosx) (a) Find t2 1 °° (b) Find 1 1 ‡ t2 dt. 1 Find lim x sin 21. (a) Define functions sin° -. x and cos° Find (sin°)'and cos(xx|180). sin° x (b) Find 22. and arccos(sinx)? dt. Hint: The answer is not 45. 20. x-oo = lim x-o x and lim x sin° x oo by sin°(x) sin(xx/180) and cos°(x) (cos°)'in terms of these same functions. = 1 -. x Prove that every point on the unit circle is of the form (cos0,sin0) for at least one (andhence for infinitely many) numbers 0. 15. The Trigonomehic Functions 317 23. (a) Prove that x is the maximum possible length of an interval on is one-one, and that such an interval must be of the form x/2, 2kx + x/2] or [2kn+ x/2, 2(k + 1)x x/2]. [2kn (b) Suppose we let g(x) sin x for x in (2kr x/2, 2kx + x/2). What is which sin - - = - (g-1p 24. Let f(x) = for 0 Exs secx Find the domain of x. f¯I and sketch its graph. 25. Prove that | sin x -¢ y| for all numbers x y. Hint: The same with replaced < is straightforward by statement, of a 5 consequence a very well-known theorem; simple supplementary considerations then allow $ to be improved to < sin y| - |x < - , . *26. It is an excellent test of intuition to predict the value of lim Ib f(x)sin1xdx. Continuous functions should be most accessible to intuition, but once you get the right idea for a proof the limit can easily be established for any integrable f. (a) Show that (b) Show that lim f sin1xdx 0, by computing = if s is a step function on [a,b] lem 13-26), then lim j s(x) sin1x dx = 0. the integral explicitly. from (terminology (c) Finally, use Prob- Problem 13-26 to show that lim f f(x)sin1xdx 0 for A-koo which result, is integrable on [a,b]. This like Problem 12, any function f plays an important role in the theory of Fourier series; it is known as the Riemann-Lebesgue Lemma. = C 27. A This problem outlines the classical approach to the trigonometric functions. The shaded sector in Figure 24 has area x/2. (a) By considering then O B - the triangles OAB and OCB prove that if 0 sin x (1, 0) - 2 FIGURE 24 (b) Conclude x 4 _ 2 sin x « 2cosx that smx cosx<--<1, x and prove that lim imx = - x-vo 1. x (c) Use this limit to fmd . hm x->0 1 - cosx . x < x < x/4, 318 Denvativesand Integrals (d) Using parts (b)and (c),and starting *28. from the definition the addition formula for sin, find sin'(x), of the derivative. This problem gives a treatment of the trigonometric length, and uses Problem 13-25. Let f(x) fl Define Œ(x) to be the length of f on [x, 1]. functions in terms of X Î. for -x2 -1 = (a) Show that i 2(x) 1 = 1 t2 - dt. (This is actually an improper integral, as defined in Problem 14-28.) (b) Show that E'(x) -1 for - < l. < x For 0 Ex 5 x, define cosx by 2(cosx) = x, and and sin'(x) V1-cos2x.Prove that cos'(x) as 2(-1). (c) Define x -sinx define sinx = = cosxfor0<x<x. *29. 1 = = Yet another development of the trigonometric functions was briefly menwith inverse functions defined by integrals. It tioned in the textstarting is convenient to begin with arctan, since this function is defmed for all x. To do this problem, pretend that you have never heard of the trigonometric functions. (a) Let a(x) = lim a(x) we t2)¯I dt. /‡(1+ and a(x) Prove that a is odd and increasing, and that both exist, and are negatives of each other. If lim define x = 2 X->00 lim a(x), then a¯1 is defmed on (-x/2, x/2). (b) Show that (a¯1)'(x) = 1+ [a¯1(x)2 «¯I(x'). kr + x' with x' ¢ x/2 or Then define tanx 1//1 tan2 define cos x + x, for x not of the form kx+x/2 or kx and cos(kx ±x/2) 0. Prove first that cos'(x) tan x cos x, and then all that cos"(x) cos x for x. (c) For x -x/2, = = -x/2, = = = *30. = - - If we are willing to assume that certain differential equations have solutions, another approach to the trigonometric functions is possible. Suppose, in particular, that there is some function yo which is not always 0 and which satisfies yo" + yo 0. = (a) Prove that /)2 yo2 is Constant, and conclude or yo'(0) ¢ 0. that there is a function s satisfying s" + s Prove (b) s'(0) 1. Hint: Try s of the form ayo + byo'. that either yo(0) = 0 and s(0) = ¢0 0 and = s', then almost all facts about trigonoIf we define sin s and cos metric functions become trivial. There is one point which requires work, = = 15. The TryonometncFunctions 319 however-producing the number x. This is most easily done using an exercise from the Appendix to Chapter 11: (c) Use Problem 7 of the Appendix to Chapter 11 to prove that cos x cannot be positive for all x > 0. It follows that there is a smallest xo > 0 with cos xo 0, and we can define x = 2xo. ±1; 1, we have sin x/2 1. (Since sini+ cos2 (d) Prove that sin x/2 the problem is to decide why sin x/2 is positive.) (e) Find cos x, sin x, cos 2x, and sin 2x. (Naturally you may use any addition formulas, since these can be derived once we know that sin' cos = = = = = and cos' 31. = sin.) - (f) Prove that cos and (a) After all the work sin are periodic with period 22. involved in the definition of sin, it would be disconcerting to find that sin is actually a rational function. Prove that it isn't. (There is a simple property of sin which a rational function cannot possibly have.) (b) Prove that sin isn't even defined implicitly by an algebraic equation; that is, there do not exist rational fimctions fo, fn-1 such that . (sinx)" + f._ t (x)(sinx)"¯1 + - - - + . . , fo(x) 0 for all = x. Hint: Prove that fo 0, so that sin x can be factored out. The remaining factor is 0 except perhaps at multiples of 2x. But this implies that it is 0 for all x. (Why?) You are now set up for a proof by induction. = *32. Suppose that ¢i and ¢2 satisfy ¢i" + ¢2 and that g2 > 0, 82Ò2 0, g1¢1 = = g1. (a) Show that ¢i"¢2 ¢2"¢i (x2 - - (b) Show that - gi)¢i¢2 0. = if ¢1(x)> 0 and ¢2(x) > 0 for all x in (a, b), then b [¢1"‡2 ¢2 ÈI] - > 0, and conclude that [‡1'(b)¢2(b)¢i'(a)¢2(a)]+ [¢i(b)¢2'(b)‡1(a)¢2'(a)]> - - (c) Show thatsignin this case we cannot have of ¢i'(a) and ‡1'(b). sider the (d) Show that ¢2 < 0 able to the equations ¢¡(a) ¢i(a) ¢i ¢i(b) = 0. Hint: Con- ‡1(b) 0 are also impossible if ¢i > 0, ¢i < 0, ¢2 < 0 on (a, b). (You should be = = < 0, ¢2 > 0, or with almost no extra work.) this do or = 0. 320 Ihrivativesand Integrals The net result of this problem may be stated as follows: if a and b are zeros of ¢1, then ¢2 must have a zero somewhere between This result, in a slightly more general form, is known as the Sturm Comparison Theorem. As a particular example, any solution of the differential equation consecutive a and b. y"+(x+1)y=0 must have zeros on the each other. 33. (a) Using the formula for sin x sin(k + (b) Conclude -+ 1 2 positive horizontal axis which are within x of sin y - derived in Problem 14, show that ()x sin(k ()x - = - 2 sin cos kx. that cosx + cos2x + + cosnx --- ¼)x sin(n + - . I . 2sm - 2 Like two other results in this problem set, this equation is very important in the study of Fourier series, and we also make use of it in Problems 19-42 and 23-19. (c) Similarly, derive the formula . sin sinx +sin2x --.+sinnx + - \ (n+1 2 . In . x sm x X sm- 2 (A more natural derivation of these formulas will be given in Problem 27-14.) and (d) Use parts (b) (c)to fmd the defmition of the integral. lb 0 sin x b dx and 0 cos x dx directly from JT IS IRRATIONAL *CHAPTER This short chapter, diverging from the main stream of the book, is included to demonstrate that we are already in a position to do some sophisticated mathematics. This entire chapter is devoted to an elementary proof that x is irrational. Like proofs of deep theorems, the motivation for many steps in our many proof cannot be supplied; nevertheless, it is still quite possible to follow the proof "elementary" step-by-step. Two observations must be made before the proof. The first concerns the function f,(x) x"(1 - x)" = n! , which clearly satisfies 1 0< for0<x<1. f,(x)<--n! An important property of the function f, is revealed by considering the expression obtained by actually multiplying out x"(1-x)". The lowest power of x appearing will be n and the highest will be 2n. Thus f, can be written in the form power f,(x) where = - 1 2" E c;x', the numbers c¿ are integers. It is clear from this expression that f,t*)(0)=0 ifk<nork>2n. Moreover, f.f">(x) -[n!c, 1 = f,(n+1) n! + 1)!cn+1 + terms involving x] n! [(n f,<2">(x) = 321 + terms involving x] n! [(2n)!c2.]. 322 Derivativesand Integrals This means that fn(n)(0) c., f.f"*)(0) (n +1)cn+1 = = f.f¾(0) (2n)(2n = where the numbers 1) - - ... on the right are all integers. (n + 1)c2n, Thus f.(k)(0)is an integerfor all k. The relation f,(x) = f,(1 x) - implies that f.f*)(x) (-1)k (k)(Î X), = - therefore, f.f*)(1) is also an integerfor all k. The proof that x is irrational requires one further observation: number, and e > 0, then for sufficiently large n we will have if a is any n! To prove this, notice that if n > 2a, then a"+1 (n+1)! Now let no be any natural may have, the succeeding number values a(no+2) I If k is so large that k)! ano < n+1 n! > 1 a" 2 n! 2a. Then, whatever satisfy 2 (no+ a" with no (no+ 1)! (no + 2)! a (no)! a " > (no + 1)! 2k (40)! 2k, then 1 1 ano 2 2 (no)! value 16. x is Irrational 323 a(no+k) (no+ ' k)! which is the desired result. Having made these observations, one theorem in this chapter. mEOREM 1 PROOF we are ready for the The number n is irrational; in fact, x2 is irrational. (Notice that the irrationality of x2 implies the irrationality of x, for if x were rational, then x2 certainly would be.) Suppose x2 were rational, so that a b 2 for some positive integers a and b. Let G(x) (1) b" [x = f,(x) x2=-2¿"(x) - y2n-4 y (4) + (-1)" f.fk(x) Notice that each of the factors b"z2n-2k is an integer. Since n-k b"W2)"¯* b" - - f.(*)(0)and f,(k)(1)are n-kb* _ integers, this shows that G(0) and G(1) are integers. Differentiating G twice yields G"(x) (2) The last term, 2n(4) x2" f is zero. Thus (-1)" f,(2n+2)(x), = b"[x2" "(x) - n adding G"(x)+x2G(x)=b"r2n+2 (3) (2n+2) (1)and (2)gives 2 n _ Now let H(x) G'(x) sinxx = xG(x) cosxx. - Then H'(x) = = xG'(x) cosxx + G"(x) sinxx [G"(x) +x2G(x)] sinxx - xG'(x) cosxx +x2G(x) sinxx =x2a"f,(x)sinxx, by(3). By the Second Fundamental Theorem of Calculus, x2 Il a"f,(x)sinxxdx=H(1)-H(0) O G'(1)sinx xG(1)cosx x [G(1) + G(0)]. = - = Thus - G'(0)sinO+ xG(0)cos0 l x a" f,(x) sinxx dx is an integer. . 324 Derivativesand Integrals On the other hand, O < 1/n! for 0 f,(x) < O < na" < x ra" f,(x) sinux < < 1, so for 0 -- n! < x < 1. Consequently, 0<x li a"f,(x)sinxxdx<--. ra" n o This reasoning was completely independent of the value of n. Now if n is large enough, then 0<x a"f.(x)sinxxdx<-<1. na" n! But this is absurd, because the integral is an integer, and there is no integer between 0 and 1. Thus our original assumption must have been incorrect: x2 is irrational. I This proof is admittedly mysterious; perhaps most mysterious of all is the why that x enters the proof-it almost looks as if we have poved n irrational without of the pmof will show ever mentioning a definition of x. A close reexamination that precisely one property of x is essentialsin(x) 0. - The pmof really depends on the properties of the function sin, and proves the irrationality of the smallest positive number x with sin x = 0. In fact, very few properties of sin are required, namely, · sm cos' sin(0) cos(0) r = = = = cos, sin, - 0, 1. Even this list could be shortened; as far as the proof is concerned, cos might just as well be defined as sin'. The properties of sin required in the proof may then be written sin"+sin sin(0) sin'(0) = = = 0, 0, 1. Of course, this is not really very surprising at all, since, as we have seen in the previous chapter, these properties characterize the function sin completely. PROBLEMS 1. that the areas of triangles OAB and OAC in Figure 1 are related by the equation (a) Prove 16. x is Irrational 325 area OAC = - 1-Vl-16(area0AB)2 1 2 . 2 Hint: Solve the equations xy = 2(area OAB), x2+ y2 = 1, for y. polygon of m sides inscribed in the unit circle. If (b) Ist Pm be the regular of P. show that Am is the area (x,y) 0 FIGURE i (x,0) - B = (1,0) A = A2. 2 = 2 2J1 -- (2A./m)2. -- and more complicated) expressions This result allows one to obtain (more with = A4 2, and thus to compute x as accurately for A2., starting Problem 8-11). Although better methods will desired to as (according slight variant of this approach yields a very appear in Chapter 20, a interesting expression for x: c 2. (a) Using the fact that area(OAB) OB, = area(OAC) show that if œm is the distance from O to one side of Pm, then --- A. «m• = A2m (b) Show that 2 -- «4 = A2e (c) Using the 8 · · · · . Œ2k-1 . fact that a. and the formula cosx/2 «4 etc. = = = cos -, m 1 + cos x (Problem 15-15),prove that 2 326 Denvadves and Integrals Together with part product" -= 2 x 1 1 -+-- -· \2 11 2V2 "infinite that 2/x can be written (b),this shows 1 -+- 11 -+ as an 11 -· \2 - \2 2 2 IVI ' to be precise, this equation means that the product of the first n factors can be made as close to 2/x as desired, by choosing n sufficiently large. This product was discovered by François Viète in 1579, and is only one of later. many fascinating expressions for x, some of which are mentioned *CHAPTER PLANETARY MOTION Nature and Nature's Laws lay hid in night God said "Let Newton be," and all was light. AlexanderPope Unlike Chapter 16, a short chapter diverging from the main stream of the book, diverges from the main stream of the book to demonstrate that we are already in a position to do some real physics. In 1609 Kepler published his first two laws of planetary motion. The first law describes the shape of planetary orbits: this long chapter Theplanets movein ellipses,with thesun at onefocus. The second law involves the area swept out by the segment from the sun to the planet (the vector from the sun to the planet') in various time intervals (Figure 1): 'radius Equal areas are swept out bythe radius vector in equal tünes.(Equivalently, thearea swept out in time t is proportional to t.) FIGURE 1 Kepler's third law, published in 1619, relates the motions of different planets. If a is the major axis of a planet's elliptical orbit and T is its period, the time it takes the planet to return to a given position, then: Theratio a3/T2 is thesameforall planets. Newton's great accomplishment was to show (usinghis general law that the force on a body is its mass times its acceleration) that Kepler's laws followfrom the assumption that the planets are attracted to the sun by a force (thegravitational force of the sun) always directed toward the sun, proportional to the mass of the planet, and satisfying an inverse square law; that is, by a force directed toward the sun whose magnitude varies inversely with the square of the distance from the sun to the planet and directly with the mass of the planet. Since force is mass this is equivalent simply to saying that the magnitude of the times acceleration, acceleration is a constant divided by the square of the distance from the sun. Newton's analysis actually established three results that correlate with Kepler's individual laws. The first of Newton's results concerns Kepler's second law (which was actually discovered first, nicely preserving the symmetry of the situation): 327 328 Duivativesand Integrals between i.e.,if and only theforce Kepler'ssecond lawis trueprecisely for'central forces', the sun and theplanetalways lies along the linebetweenthe sun and theplanet. p, P2 S (a) Ps R P2 p, S FIGURE 2 Although Newton is revered as the discoverer of calculus, and indeed invented calculus precisely in order to treat such problems, his derivation hardly seems to use calculus at all. Instead of considering a force that varies continuously as the planet moves, Newton first considers short equal time intervals and assumes that a momentary force is exerted at the ends of each of these intervals. To be specific, let us imagine that during the first time interval the planet moves along the line P1P2, with uniform velocity (Figure 2a). If, during the next equal time interval, the planet continued to move along this line, it would end up at P3, where the length of PI P2 equals the length of P2P3. This would imply that they have equal the triangle SPiP2 has the same are as the triangle SP2P3 (since bases, and the same height)-this just says that Kepler's law holds in the special case where the force is 0. Now suppose (Figure 2b) that at the moment the planet arrives at P2 it experiS to P2, which by itself would cause the planet ences a force exerted along thelinefrom with the motion that the planet already has, to move to the point Q. Combined this causes the planet to move to R, the vertex opposite P2 in the parallelogram whose sides are P2P3 and P2Q. Thus, the area swept out in the second time interval is actually the triangle SP2R. But the area of triangle SP2R is equal to the area of triangle SPs P2, since they have the same base SP2, and the same heights (sinceRP3 is parallel to SP2). Hence, finally, the area of triangle SP2R is the same as the area of the original triangle SPi P2! Conversely, if the triangle SRP2 has the same area as SP1P2, and hence the same area as SPsP2, then RPs must be parallel to SP2, and this implies that Q must lie along SP2· Of course, this isn't quite the sort of argument one would expect to find in a modern book, but in its own charming way it shows physically just why the result should be true. To analyze planetary motion we will be using the material in the Appendix to Chapter 12, and the det defined in Problem 4 of Appendix 1 to Chapter 4. We describe the motion of the planet by the parameterized curve "determinant" c(t) = r(t)(cos 0(t), sinO(t)), gives the length of the line from the sun to the planet, while 0 gives the angle. It will be convenient to write this also as so that r always (1) c(t) r(t) = - e(0(t)), where e(t) = (cost, sin t) is just the parameterized curve that runs along the unit circle. Note that e'(t) = (- sin t, cos t) 17. Planetay Motion 329 is also a vector of unit length, but perpendicular to e(t), and that we also have det(e(t), e'(t)) (2) Differentiating (3) (1),using the c'(t) r'(t) and combining with to Chapter 4, we get det(c(t), c'(t)) = = c(t) = = 1. formulas on page 244, we obtain ·e(8(t)) + r(t)0'(t) with the (1),together e'(0(t)), - formulas in Problem 6 of Appendix 1 det(e(0(t)), e(G(t))) + r(t)20'(t) det(e(0(t)), e'(0(t))), r(t)2Ð'(t) r(t)r'(t) since det(v, v) is always 0. Using (2)we then get det(c, c') (4) det(e(0(t)), e'(0(t))) r 9'. = As we will see, r20' turns out to have another important interpretation. c(0) Suppose that A(t) is the area swept out from time 0 to t (Figure 3). We want to get a formula for A'(t), and, in the spirit of Newton, we'll begin by making A(t), together with a straight line an educated guess. Figure 4 shows A(t + h) segment between c(t) and c(t + h). It is easy to write down a formula for the area of the triangle A(h) with vertices O, c(t), and c(t + h): according to Problems 4 and 5 of Appendix 1 to Chapter 4, the area is - FI CURE 3 area(A(h)) ¼det(c(t), = c(t + h) -- c(t)). Since the triangle A(h) has practically the same area as the region A(t+h) this shows (orpractically shows) that c(r + h) A'(t) A(t + h) hoo h A(h) hm area han h - . = hm - A(t), A(t) . = c(t) = = O FIGURE 4 ( | }det(c(t), c'(t)). det c(t), lim hoo c(t + h) h - c(t) A rigorous derivation, establishing more in the process, can be made using Problem 13-24, which gives a formula for the area of a region determined by the graph of a function in polar coordinates. According to this Problem, we can write (*) A(t) = - p(¢)2 d¢ if our parameterized curve c(t) = r(t) e(Ð(t))is the graph of the function p in polar coordinates (herewe've used ¢ for the angular polar coordinate, to avoid confusion with the function 0 used to describe the curve c). 330 Derivativesand Integrals Now the function p is just p=roß-1 [forany particular angle ¢, Ð¯I(¢) r(0¯I(t)) is the time at which the curve c has angular polar coordinate ¢, so is the radius coordinate corresponding to ¢]. Although the presence of the inverse function might look a bit forbidding, it's actually quite innocent: Applying the First Fundamental Theorem of Calculus and the Chain Rule to we immediately get (4:) A'(t) O'O (p(Ð(t))2 (r(t)20'(t),since = - = p = ro 0¯1 Briefly, A' Combining with (4),we Jr20'. = thus have (5) A' = }det(c, c') (r20'. = Now we're ready to consider Kepler's second law. Notice that Kepler'ssecond law is equivalent to saying that A' is constant, and thus it is equivalent to A" 0. But = -1 A" = c')]' ([det(c, i det(c, c") det(c', c') + = -) det(c, c"). = (seepage 245) So Kepler's second law is equivalent to det(c, c") = 0. Putting this all together we have: THEOREM 1 Kepler's second law is true if and only if the force is central, and in this case each planetary path c(t) (K2) PROOF = r(t) · e(0(t)) r20' = satisfies the equation det(c, c') = constant. Saying that the force is central just means that it always points along c(t). Since c"(t) is in the direction of the force, that is equivalent to saying that c"(t) always points along c(t). And this is equivalent to saying that we always have det(c, c") = 0. We've just seen that this is equivalent to Kepler's second law. c')]' 0, which by Moreover, this equation implies that [det(c, = (5)gives (K2). 17. PlanetmyMotion 331 Newton next showed that if the gravitational force of the sun is a central force and also satisfies an inverse square law, then the path of any object in it will be a conic section having the sun at one focus. Planets, of course, correspond to the case where the conic section is an ellipse, and this is also true for comets that visit the sun periodically; parabolas and hyperbolas represent objects that come from outside the solar system, and eventually continue on their merry way back outside the system. THEOREM 2 PROOF c(t) If the gravitational force of the sun is a central force that satisfies an inverse square law, then the path of any body in it will be a conic section having the sun at one focus. Notice that out conclusion specifies the shape of the path, not a particular parameterization. But this parameterization is essentially determined by Theorem 1: the hypothesis of a central force implies that the area A(t) (Figure 5) is proportional to t, so determining c(t) is essentially equivalent to determining A for arbitrary points on the ellipse. Unfortunately, the areas of such segments cannot be deterexplicitly.I mined This means that we have to determine the shape of the path without = r(t) finding its parameterization! Since it is the function c e(Ð(t)) 0¯1 of which shape the actually the describes path in polar coordinates, we ro Shouldn't 0¯l surprised entering the into proof. be to find By Theorem 1, the hypothesis of a central force implies that - FlGURE 5 r20' (K2) det(c, c') = = M for some constant M. The hypothesis of an inverse square law can be written c"(t) (*) H = - - r(t) for some constant e(8(t)) H. Using (K2), this can be written c"(t) H = - - - O'(t) M e(0(t)). Notice that the left-hand side of this equation is [c'o Ð¯']'(8(t)). So if we let D (thisis the be written main More precisely, we can't arcsin, etc. c' o Ð¯1 c' as a function of Ð"), then the equation can as D'(0(t)) 1 consider trick-"we = H = write - - M e(0(t)) H = - sinO(t)), -(cosÐ(t), M down a solution in terms of familiar "standard functions," like sin, 332 Derivativesand Integrals and we can write this simply as D'(u) H M - H H -(cos = u, sin u) = - - cos u, M - - M sin u [forall u of the form 9(t) for some t, which happens to be all u], completely eliminating 9. equation that we have just obtained is simply a pair of equations, for the The components of D, each of which we can easily solve individually; we thus find that D(u) (H - = sin u - -M for two constants A and B. Letting u for c': (H c'= 0(t) again we thus have an explicit formula = sin 0 - -M H cos u + B M + A, H +A, cos & +B M - . [Here sin0 really stands for sinoß, etc., abbreviations that we will use throughout.] Although we can't get an explicit formula for c itself, if we substitute this equation, together with c r(cos0, sin0), into the equation = det(c, c') (K2 ), (equation M = we get ¯ - r H cos2 H sin 0 0 + B cos 0 + M M A sin 0 - - = M, which simplifies to r H B + cos 0 M2 M -- - A -- . sm -- M 0 1. = Problem 15-8 shows that this can be written in the form r(t) H - M + C cos(0(t) + D) = 1, for some constants C and D. We can let D 0, since this simply amounts to rotating our polar coordinate system (choosing which ray corresponds to 0 0), = = so we can write, finally, M2 r[1+ ecos0] = = -- H A. But this is the formula for a conic section derived in Appendix 3 of Chapter 4. In terms of the constant M in the equation r20' and the constant A in the equation r[1+ - M of the orbit ecos0] = A 17. PlanetaryMotzon 333 the last equation in our proof shows that we can rewrite c"(t) (**) Recall (page87) that = M2 -A - 1 (*)as e(0(t)). the major axis a of the ellipse is given by A (a) a= while the minor axis 1-e2' b is given by b (b) A = . 1 e2 - Consequently, b2 (c) = - a. A Remember that equation (5)gives A'(t) = A(t) = Šr20'(M, - and thus ¼Mt. We can therefore interpret M in terms of the period T of the orbit. This period T is, by definition,the value of t for which we have 0(t) 2x, so that we obtain the complete ellipse. Hence = area of the ellipse or M 2(area of the ellipse) T Hence the constant M2/A in = A(T) }MT, = 2xab by Problem 13-17. T (**)is 4x2a2b2 M2 ¯ T2A A 4x2a3 = . usmg , (c). This completes the final step of Newton's analysis: THEOREM s Kepler's third law is true if and only if the acceleration on an ellipse, satisfies 1 c"(t) = for a constant -G · r2 e(0(t)) G that does not depend on the planet. c"(t) of any planet, moving 334 Derivativesand Integrals It should be mentioned that the converse of Theorem 2 is also true. To prove this, we first want to establish one further consequence of Kepler's second law. Recall that for (cost, sin e(t) = t) e'(t) = (- sin t, cos t). e"(t) = (- cos t, we have Consequently, Now differentiating c"(t) = r"(t) · -e(t). sin t) - = (3)gives e(0(t)) + r'(t)0'(t) + r'(t)0'(t) - e'(0(t)) - e'(0(t)) + r(t)0"(t) - e'(9(t)) + r(t)Ð'(t)0'(t) e"(0(t)). - -e(t) Using e"(t) c"(t) we get = = [r"(t) r(t)O'(t)2] - - e(0(t)) + + r(t)0"(t)] [2r'(t)0'(t) - e'(0(t)). Since Kepler's second law implies central forces, hence that c"(t) is always a multiple of c(t), and thus always a multiple of e(0(t)), the coefficient of e'(0(t)) must be 0 [asa matter of fact, we can see this directly by taking the derivative of formula (K2)]. Thus Kepler's second law implies that (6) THEOREM 4 PROOF c"(t) = [r"(t) r(t)0'(t)2 - If the path of a planet moving under a central gravitational force is an ellipse with the sun as focus, then the force must satisfies an inverse square law. As in Theorem 2, notice that the hypothesis on the shape of the path, together with the hypothesis of a central force, which is equivalent to Kepler's second law, essentially determines the parameterization. But we can't write down an explicit solution, so we have to obtain information about the acceleration without actually knowing what it is. Once again, the hypothesis of a central force implies that Y2pr (K2) for some constant M, and the hypothesis that the path is an ellipse with the sun as focus implies that it satisfies the equation r[1+ecos0] (A) = A, for some e and A. For our (notespecially illuminating) proof, we will keep differentiating and substituting from these two equations. First we differentiate (A) to obtain r'[1 + e cos Ð] - er0' sin Ð Multiplying by r this becomes rr'[1+scos0]-er20'sin0=0. = 0. 17. PlanetaryMotion 335 Using both (A) and (K2), this becomes Ar' eM sin 8 - 0. = Differentiating again, we get Ar" eMG'cos0 --- Using (K2) we get 0. = sM2 Ar"- cos0 -- 0, = r and then using (A) we get Ar" M2 - - A - r2 1 - r = 0. Substituting from (K2) yet again, we get MS A[r"-r(0')2]+ =0, or r" Comparing with (6),we - M2 = - --. Ar2 obtain c"(t) which r(0')2 wanted M2 = - e(0(t)), to show: the force is inversely proportional to the square of the distance from the sun to the planet. is precisely what we THE LOGARITHM AND EXPONENTIAL FUNCTIONS CHAPTER In Chapter 15 the integral provided a rigorous formulation for a preliminary definition of the functions sin and cos. In this chapter the integral plays a more essential mle. For certain functions even a preliminary defmition presents difficulties. For example, consider the function f (x) 102. = This function is assumed to be defined for all x and to have an inverse function, defined for positive x, which is the to the base 10," "logarithm f¯1(x) 108102= In algebra, 102 is usually defined only for rational x, while the defmition for irx is quiedy ignored. A brief review of the definition for rational x will not only explain this omission, but also recall an important principle behind the definition of 102. The symbol 10" is first defined for natural numbers n. This notation turns out to be extremely convenient, especially for multiplying very large numbers, because rational 10" 10"' - 10"*'". = The extension of the definition of 10= to rational x is motivated by the desire actually forces upon us the customary to preserve this equation; this requirement definition. Since we want the equation 10 to be true, we must define 10° 10" - = *" 10" = 1; since we want the equation = 10¯" 10" - to be true, we must define 10¯" 10 = 10° = 1 1/10"; since we want the equation = -....10'/" 101¡" 101/4+-.+1/" = 101 = 10 = n times n times to be true, we must define 101/" 101/" - . ... = 6; 101/" = and since we want the equation 10 17"*" **/" = 10'"/" m times m times (6) to be true, we must define 10'"/" Unfortunately, at this point the program comes to a dead halt. We have been guided by the principle that 10" should be defmed so as to ensure that 10'+> 102107; but this principle does not suggest any simple algebraic way of defining - . = 336 18. The Logarithmand Exponential Functions 337 10= for irrational x. For this reason we will try some more sophisticated finding a function f such that (*) f(x + y) for all x and y. f(x) f(y) = ways of - Of course, we are interested in a function which is not always zero, so we might add the condition f (1) 0. If we add the more specific condition f (1) 10, then (*)will imply that f (x) 10=for rational x, and 102 could be d¢ned as f (x) for other x; in general f (x) will equal [f (1)]*for rational x. One way to find such a function is suggested if we try to solve an apparently function f such that more difficult problem: find a dà¶erentiable -¢ = = f(x). f(y) f(x+y)= f (1) = foralIxand 10. y, Assuming that such a function exists, we can try to find f'-knowing the derivative of f might pmvide a clue to the definition of f itself. Now f(x+h)- . f'(x) = hm f(x) h h->0 f (x) f (h) - . =lun f (x) - h h->0 hm f(h)-1 f (x) h-vo h . = - . The answer thus depends on f'(0) lim = f(h) - 1 ; h h-wo for the moment assume this limit exists, and denote it by a. Then f'(x) = - a for all x. f (x) Even if a could be computed, this approach seems self-defeating. The derivative of f has been expressed in terms of f again. If we examine the inverse function f¯I log1o, the whole situation appears in a new light: = logio 1 'fx) = frg-1 1 1 ¯ ¯ œ- f(f-1(x)) ax° The derivative of f-1 is about as simple as one could ask! And, what is even more interesting, of all the integrals b Ï x-I Lb x" dx examined previously, the integral dx is the only one which we cannot evaluate. Since logio 1 have 1 -- * 1 -dt=logiox-logto1=logtox. = 0 we should 338 Derivativesand Integrals t-i dt. This suggests that we define logiox as (1/a) The difficulty is that a is J1 One way of evading this difficulty is to define unknown. log x dt, = hope that this integral will be the logarithm to some base, which might be determined later. In any case, the function defined in this way is surely more reasonable, from a mathematical point of view, than logie. The usefulness of role of the number 10 in arabic notation (andthus the important logie depends on ultimately on the fact that we have ten fingers), while the function log provides a notation for an extremely simple integral which cannot be evaluated in terms of any functions already known to us. and If x DEFINITION 0, then > log x = 1 lx - dt. The graph of log is shown in Figure 1. Notice that if x if 0 < x < 1, then logx < 0, since, by our conventions, > 1, then log x > 0, and dt<0. dt=- For x 5 0, a number logx cannot not bounded on [x, 1]. be defined in this way, because f(t) = 1/t is 1 1 1 x (b) (a) FIGURE l The justificationfor the notation THEOREM 1 If x, y > "log" comes from the following theorem. 0, then log(x y) = log x + log y. 18. The Logarithmand ExponentialFunctions 339 PROOF Notice first that log'(x) 1/x, by the Fundamental Theorem choose a number y > 0 and let = f (x) log(xy). = Then f'(x) Thus f' log'(xy) y = - 1 = 1 - f (x) xy -. x c such that for all x log x + c = = y - log'. This means that there is a number = of Calculus. Now 0, > that is, log(xy) The number c can for all x logx + c = > be evaluated by noting that when x 0. = 1 we obtain log(1-y)=logl+c Thus log(xy) Since this is true for all y COROLLARY 1 If n is a natural number > log x + log y = 0, the theorem is proved. and x 0, then > log(x") PROOF COROLLARY 2 Let to you If x, y > for all x. = n logx. (useinduction). 0, then log (x -logy. logx = - y PROOF This followsfrom the equations log x = log - = y log + log y. Theorem 1 provides some important information about the graph of log. The fimction log is clearly increasing, but since log'(x) 1/x, the derivative becomes and consequently small large, log very as x becomes grows more and more slowly. clear whether log is bounded or unbounded It is not immediately on R. Observe, however, that for a natural number n, = log(2") n log 2 = (and log 2 > 0); it follows that log is, in fact, not bounded above. Similarly, log = log 1 -- log 2" -n = log 2; 340 Derivativesand Integrals therefore log is not bounded below on (0, 1). Since log is continuous, it actually takes on all values. Therefore R is the domain of the function log¯1. This important function has a special name, whose appropriateness will soon become clear. The DEFINITION "exponential function," exp, is defined as log¯1 The graph of exp is shown in Figure 2. Since log x is defined only for x > 0, we always have exp(x) > 0. The derivative of the function exp is easy to determine. THEOREM 2 For all numbers x, exp'(x) PROOF eXp'(X) )'(X) (ÏOg¯ = exp(x). = 1 = Iog'(log¯I (x)) 1 1 log-1 = log¯'(x) = exp(x). A second important property of exp is an easy consequence THEOREM 3 If x and y are any two numbers, exp Let x' = exp(x) and y' = Theorem 1. then exp(x + y) PROOF of = exp(x) - exp(y). so that exp(y), x = y = log x', log y'. Then x + y = log x' + log y' = log(x'y'). This means that exp(x + y) FIGURE = x'y' = exp(x) - exp(y). This theorem, and the discussion at the beginning of this chapter, suggest that is particularly important. There is, in fact, a special symbol for this number. exp(1) 2 DEFINITION e = exp(1). 18. TheLogarithmand ExponentialFunctions 341 This definition is equivalent to the equation 1 log e = y le = - I 1 As illustrated in Figure 3, 1 /2 - ¯ ' I r dt < 1, since 1 - dt. (2 1) is an upper sum for f (t) 1/t on [1,2], - = and 4 y - 1 FIGURE 2 3 dt > 1, since 4 (sum(2 for1) + ) - - f (t) . = (4 2) - 1/t on = 1 is a lower [1,4]. Thus 3 21 41 'l - t dt < - 11 dt < - 11 dt, which shows that 2<e<4. In Chapter 20 we will find much better approximations for e, and also prove that e is irrational (thepmof is much easier than the proof that x is irrational!). As we remarked at the beginning of the chapter, the equation exp(x + y) exp(x) = · exp(y) implies that exp(x) = = [exp(1)]* ex, for all rational x. Since exp is defined for all x and exp(x) our earlier use of the exponential notation DEFINITION For any number = for rational x, it is consistent with to defineex as exp(x) for all x. e2 x, e" = exp(x). "exponential function" should now be clear. We have sucThe terminology ceeded in defining e2 for an arbitrary (even irrational) exponent x. We have not yet defmed ax, if a ¢ e, but there is a reasonable principle to guide us in the attempt. If x is rational, then a2 = (el°5")2 = exioga But the last expression is defined for all x, so we can use it to define a*. 342 Derivativesand Integrals If a DEFINITION 0, then, for any real number > a' (If a f(x) " - e21°E". = e this definition clearly agrees with the previous one.) = The requirement unduly x, a > since, restrictive 0 is necessary, in order that log a be defined. This is not for example, we would not even expect to be defmed. (Of course, for certain rational to the old definition; for example, x, the symbol a2 will make sense, according Our definition of a2 was designed to ensure that (e2)T FIGURE ei = for all x and y. As we would hope, this equation turns out to be true when e is replaced by any number a > 0. The proof is a moderately involved unraveling of terminology. At the same time we will prove the other important properties of ax. 4 THEOREM 4 If a > 0, then bc (1) (abpc _ (Notice that ab - (2) al (Notice that rational x.) = Will be positive, so (ab c will be defined); automatically a and a2+> = (2)implies that (Î) (ab c PROOF for all b, c. a' - a> for all x, y. this definition of ax agrees with the old one for all clog(e**) cloga (Each of the steps in this string of equalities log¯14 the fact that exp c(bloga) chloga depends upon = a . last definition, or our = (2) al = a*** etioga = - e(2+T)'°ga eloga - exioga+,ioga ex loga - .,yioga _ a' - a?. Figure 4 shows the graphs of f (x) a2 for several difTerent a. The behavior of the function depends on whether a < 1, a 1, then 1, or a > 1. If a = = = 18. TheLogmithmand Exponentù:lFunctions 343 f (x) = 12 = 1. Suppose a 1. In this case log a > if < x then xloga < y, yloga, yloga exloga SO a' i.e., 0. Thus, > < a?. Thus the function f (x) a2 is increasing. On the other hand, if 0 < a < 1, a2 so that log a < 0, the same sort of reasoning shows that the function f (x) = is decreasing. In either case, if a > 0 and a ¢ 1, then f (x) a* is one-one. Since exp takes on every positive value it is also easy to see that a* takes on every positive value. Thus the inverse function is defined for all positive numbers, and a2, then f¯1 is the function usually denoted by loga takes on all values. If f(x) (Figure 5). Justas a* can be expressed in terms of exp, so loga can be expressed in terms of log. Indeed, = = = f(x) f(x) = /(x) = losa x -= lag x logio x (a < 1) if y = then x = so logx or y loßa X, ey a' yloga, = = = logx log a . logx log a · oga In other words, loga I FIGURE 5 The derivatives of f(x) = ax and g(x) f(x)=exioga, g(x) = A more complicated = logx loga --, = loga x are both easy to find: f'(x)=loga-exioga so =loga-ax, 1 so g'(x) = . xloga function like f (x) = g(x)*(2) is also easy to differentiate, if you remember that, b_;definition, it follows from the Chain Rule that f'(x) = = eh(x)log g( g(x)h(x) - h'(x)1088(x) + h(x) h'(x) logg(x) + h(x) g(x) . g(x) There is no point in remembering this formula--simply apply the principle behind it in any specific case that arises; it does help, however, to remember that the first factor in the derivative will be g(x)h(x) 344 Derivativesand Integrals There is one special case of the above formula which is worth remembering. The function f (x) xa was previously defined only for rational a. We can now define and find the derivative of the function f (x) x" for any number a; the result is just what we would expect: = = f (x) = xa = $ = ealogx SO f'(x) e"1°52 - S - - xa axa-1 = x x Algebraic manipulations with the exponential functions will become second naremember that all the rules which ought to work ture after a little practicejust actually do. The basic properties of exp are still those stated in Theorems 2 and 3: exp'(x) exp(x + y) exp(x), exp(x) exp(y). = = - In fact, each of these properties comes close to characterizing the function exp. Naturally, exp is not the only function f satisfying f' f, for if f ce", then with this property; however. f'(x) = ce2 f (x); these functions are the only ones = = = THEOREM 5 If f is differentiable and f'(x) then there is a number = c such that f (x) PROOF for all x, f (x) = ce' for all x. Let f(x) g(x)=- e2 (This is permissible, since e2 ¢ 0 for all x.) Then exf'(x) f (x)e2 - g'(x) Therefore there is a number - - 0. c such that g(x) f (x) c ex for all x. The second basic property of exp requires a more involved discussion. The function exp is clearly not the only function f which satisfies f(x+y) = f(x)· f(y). a2 also satisfies this equation. In fact, f (x) 0 or any function of the form f(x) But the true story is much more complex than this-there are infinitely many other functions which satisfy this property, but it is impossible, without appealing to more advanced mathematics, to prove that there is even one function other than those = = 18. TheLogarithmand ExponentialFunctions 345 It is for this reason that the definition of 10= is so difficult: there are infinitely many functions f which satisfy already mentioned! f(x+y)= f(x) f(y), - f (1) but which are not the function f (x) continuousfunction f satisfying f (x + 10, = y) 102! One thing is true however-any = f (x) f (y) = - be of the form f (x) a2 or f (x) 0. (Problem 38 indicates the way to prove this, and also has a few words to say about discontinuousfunctions with this property.) In addition to the two basic properties stated in Theorems 2 and 3, the function faster than any exp has one further property which is very importantaxp polynomial." In other words, must = = "grows THEOREM 6 For any natural number n, €* lim x- PROOF = - Xn oo. The proof consists of several steps. Step1. ex x for all x, and consequently > 0). to be the case n this statement To prove lim e2 = X->OC oo (thismay be considered = (whichis clear x log x > for x 0) it suffices to show that < for all x 1 this is clearly true, since log x < 0. If x 1/t on [1,x], so logx < x upper sum for f(t) If x 0. > = Skp 2. lim 1, then (Figure 6) x 1 < x. > < - - 1 is an ex - = oo. To prove this, note that 12 ex/2 - 1 FIGURE 6 x x x - 2 By Step 1, the expression shows that lim e2/x = x->oo Step3. lim x->oo - x = - x/2 2 2 x - 2 in parentheses is greater than 1, and lim ex/2 oo. oo. Note that ex (e2/")" 1 e*/" -nn - n n " = oo; this 346 Derivativesand Integrals The expression in parentheses becomes arbitrarily large, by Step 2, so the nth power certainly becomes arbitrarily large. It is now possible to examine carefully the following very interesting function: f (x) = e¯I/2', x ¢ 0. We have f'(x) e¯1/x2 = x Therefore, is decreasing for negative large, then x2 is large, so so f -1/x2 f'(x) < f'(x) > 0 0 for x for x 0, 0, < > x and increasing for positive x. Moreover, if |x| is is close to 0, so e¯1/x2 is close to 1 (Figure 7). 1 FIGURE 7 The behavior of f near 0 is more interesting. If x is small, then 1/x2 is large, ' is large, so e¯1/x2 1 glix2) is small. This argument, suitably stated with so ell - s's and o's,shows that lim e-1¡x2 = x->0 0. Therefore, if we defme f (x) then the function FIGURE 8 f is continuous e¯1/x2, = x ¢ 0 0, 0, l x (Figure 8). In fact, f is actually differentiable = 18. TheLogarithmand ExponentialFunctions 347 at 0: Indeed €-l/h' f'(0) lim = 1/h lim = h hoO , g(1/h)2 hw0 and lim h->0+ 1/h lim = ¢(1/h)2 --, x while g(x2) x-voo 1/h lim = g(1/h)2 hw0 lim - x-oo -. x g(x2) We already know that e2 lim = - x-voo oo; x it is all the more true that lim and this means = --- oo, that lim x - 0. = g(x2) x->oo Thus e¯1¡x2 f'(x) . 2 0, We can ¢0 x x3' = 0. = x now compute that f'(h) f'(0) - . f"(0) hm = h a-vo -l/h2 e =lim -- h3 h h-o 2 e¯l/h2 lim h4 hw0 1 ' 2x4 · = lim = h->0 an argument similar to the one above shows that e¯1¡x2 f"(x) , x4 = + e¯1/x2 lim = g1/h2 f"(0) , 0. Thus = i X6' 0, x ¢0 x = 0. be continued. In fact, using induction it can be shown (Probf (0) 0 for every k. The function f is extremelyflat at 0, and 0 so quickly that it can mask many irregularities of other functions. This argument lem 40) that approaches ' xsoo can (k) - For example (Figure 9), suppose that e¯1/x2 sin 1, x 0, x - f (x) = ¢0 = 0. 348 Derivativesand Integrals It can be shown (Problem 41) that for this function it is also true that f (k)(0) 0 for all k. This example shows, perhaps more strikingly than any other, just how bad a function can be, and still be infinitely differentiable. In Part IV we will investigate even more restrictive conditions on a function, which will finally rule out behavior of this sort. = FIGURE 9 PROBLEMS 1. Differentiate each of the following functions ). notes a (i) f (x) e (ii) f (x) log(1 + log(1 + (iii) f (x) (sinx)"i"(si"2). dt) = that (remember a* always de- . = log(1 + e14e = (iv) f (x) (v) f (x) (vi) f(x) = = = e¯ e sin x"i"2 log,psinx. . (vii)f(x) (viii)f(x) 2. = = (ix) f (x) = (x) f (x) = (a) The arcsm = x log(sin «Q smx (log(3+ e4 4x # . (arcsinx)log3 (logx)1°52. xx. derivative of logo f is f'/f. This expression to compute than is called the logmithmic doivativeof f. It is often easier since and products f', powers in the expression for f and expression products the in become sums for log o f. The derivative f' can then be recovered simply by multiplying by f; this process is called logarithmicd entiation. (b) Use logarithmic differentiation to find f'(x) for each of the following. (i) f (x) = (1 +x)(1+e ). 18. TheLogarithmand ExponenúalFuncions 349 3. (ii) /(x) = (iii) f(x) = (iv) f (x) = (3 x)U3x2 - . (1 x)(3 + x)24 + (cosx)"i"2. (sinx)cos2 - e2 €2x e¯' - . 73) (1 4 Find i lb /(t) a (forf 4. > [a,b]). 0 on dt Graph each of the following functions. (a) f (x) e2*. (b) f (x) e (c) f(x) e'+ (d) f(x)=ex = "2. = (Compare the graph with the graphs of exp and 1 1/exp.) e¯x4 = -e-x 2 1 1 eh e2 + e-2 + 1 e + 1 Find the following limits by l'Hòpital's Rule. e2 Us, y): e + y = (e) f(x) i 5. - e - --- --- . ex-1-x-x24 cos x, sm x) sea e¯2 - - (i) lim . 2 (ii) X2 x-»O lim e2 - 1 - x2/2 - x x3/6 - . x-o x ex-1-x-x2/2 lim (iii) x->o y - i (cmhx, (v) lim , log(1 + x) -- . log(1 + x) - x (vi) x->0 6. x2/2 x + . xa0 , x) x + x2/2 X log(1 + x) 11m i sinh (iv) lim 140 x - x2/2 x + X3 The functions sinh x = cosh x = tanh x the hyperbolic respectively (b) , 2 ex - e-2 1 ex+e-2 2 - e22+1 , and hyperbolic There and tangent, their ordinary trigonometric analogies and between these functions are many One analogy is illustrated in Figure 10; a proof that the region counterparts. are called F1 GUR Elo x3/3 - sine, (butusually hyperbolic read 'sinch,' cosine, 'cosh,' 'tanch'). 350 Derivativesand Integrals shown in Figure 10(b)really has area x/2 is best deferred until the next chapdevelop methods of computing integrals. Other analogies the following three problems, but the deepest analogies must are discussed in wait until Chapter 27. If you have not already done Problem 4, graph the functions sinh, cosh, and tanh. ter, when we will 7. Prove that sinh2 (a) cosh2 (b) tanh2 +1/ cosh2 (c) sinh(x + y) sinh x cosh x + cosh x sinh y. (d) cosh(x + y) cosh x cosh y + sinh x sinh y. sinh' cosh. (e) cosh' - = = = (f) = (g) tanh' 8. sinh. 1 = . cosh2 The functions sinh and tanh are one-one; their inverses sinh¯' and tanh¯I, are defined on R and (-1, 1), respectively. These inverse functions are sometimes denoted by arg sinh and arg tanh (the of the hyperbolic sine and tangent). If cosh is restricted to oo) [0, it has an inverse, denoted by arg cosh, or simply cosh-1, which is defined on [1,oo). Prove, using the information in Problem 7, that "argument" sinh(cosh-1 (a) cosh(sinh¯I (b) 2 x) 1 + x2. 1 = (c) (sinh¯1), 1+x2 1 (d) (cosh¯1)'(x) = (e) 9. (tanh¯I)'(x) 1. - x2 - for x 1 = 1 - > 1. 1 for x| < 1. x sinh¯1, cosh¯', and tanh¯I an explicit formula for sinh¯' equation y = x for x in terms of y, etc.). (a) Find (bysolving the (b) Find dx, 1+x2 b Ï x2 dx - 1 for a, b > 1 or a, b < 1, b g dx for |a|, |b| < l. 1 x2 Compare your answer for the third integral with that obtained by writing - 1 1-x2 =- 1 1 2 1-x + 1¯ 1+x . 18. The Loganthmand ExponentialFunctions 351 10. Show that x F(x) is not bounded on 11. Let f = 1 dt log t -- [2,oo). be a nondecreasing function on and define [1,oo), f(t) lx -dt, F(x)= Prove that f is bounded on 12. x>l. i i [1,oo) if and only if F/ log is bounded on [1,oo). Find (a) lim ax for 0 < X-¥oo a < 1. (Remember the definition!) x lim (b) x-voo . (logx) (logx)" lim (c) x-oo x (-1)" lim (d) x->0+ x (logx)". lim (e) x->0+ x2. 13. Graph 14. (a) Find the f (x) Hint: x (logx)" log = . 1 x xx for x = minimum > 0. (Use Problem 12(e).) value of f(x) = e2/x" for x > 0, and conclude that f (x) > e"/n" for x n. n)/xn+1, e2(x Using prove that f'(x) > (b) e"**/(n the expression f'(x) + 1)n+1 for x > n + 1, and thus obtain another proof that lim f(x) oo. X->OC > = - = 15. Graph f (x) 16. lim log(1 + (a) Find y->0 = ex /x". y)/y. (You can use PHôpital's Rule, but that would be silly.) (b) Find lim x log(1 + 1/x). lim (1 + 1/x)". (c) Prove that e X->00 = lim (1+ a/x)". (It is possible to derive this from part fiddling.) just a little that Prove log b X-¥OO lim x (b1/x 1). (d) Prove that with *(e) en = (c) algebraic = (1 + 1/x)2 for x - 0. (Use Problem 16(c).) 17. Graph f (x) 18. If a bank gives a percent interest per annum, then an initial investment I after 1 year. If the bank compounds the interest (counts yields I(1+a/100) the accrued interest as part of the capital for computing interest the next = > 352 Denvativesand Integrals year), then the initial investment grows to I(1 + a/100)" after n years. Now suppose that interest is given twice a year. The final amount after n years is, alas, not I(1+a/100)2", but merely I(l +a/200) interest is awarded twice as often, the interest must be halved in each calculation, since the interest is a/2 per half year. This amount is larger than I(1+ a/100)", but not that much larger. Suppose that the bank now compounds the interest continuously, i.e., the bank considers what the investment would yield when compounding k times a year, and then takes the least upper bound of all these numbers. How much will an initial investment of 1 dollar yield after --although 1 year? 19. log |x| for x ¢ 0. Prove that f'(x) 1/x for x ¢ 0. (b) If f(x) ¢ 0 for all x, prove that (log|f )' f'/f. (a) Let f (x) = = = 20. Suppose that on some interval the function f satisfies number f' = cf for some c. is never 0, use Problem 19(b)to prove that |f(x)| lecx number for some l (> 0). It follows that f (x) ke'2 for some k. (b) Show that this result holds without the added assumption that f is never 0. Hint: Show that f can't be 0 at the endpoint of an open interval on which it is nowhere 0. (c) Give a simpler proof that f (x) = kecx for some k by considering the function g(x) f(x)/ecx that Suppose f' fg' for some g. Show that f(x) kes(x) for some k. (d) (a) Assuming that f = = = = *21. = A radioactive substance diminishes at a rate proportional to the amount present (sinceall atoms have equal probability of disintegrating, the total disintegration is proportional to the number of atoms remaining). If A(t) cA(t) for some c (which is the amount at time t, this means that A'(t) represents the probability that an atom will disintegrate). = in terms of the amount Ao A(0) present at time 0. element) there is a number r (the life" of the radioactive A(t)/2. with the property that A(t + r) (a) Find A(t) (b) Show that = "half = *22. Newton'slaw of conhng states that an object cools at a rate proportional to the difTerence of its temperature and the temperature of the surrounding medium. Find the temperature T(t) of the object at time t, in terms of its temperature To at time 0, assuming that the temperature of the surrounding medium is kept at a constant, M. Hint: To solve the differential equation expressing Newton's law, remember that T' M)'. (T = X *23. 24. Prove that if f(x) = f(t) dt, Find all continuous functions /I (i) 0 f = ex . f then f = satisfying 0. -- 18. TheLogarithmand ExponentialFunctions 353 (ii) 25. f 1 = eh - . all continuous Given a differentiable function f, find functions g satisfying f(x) fg *26. 27. Find all functions satisfying f Find all continuous f'(t) > - f (t) + = 1. Ilf 0 (t) dt. f(01+t2 dt. = and g be continuous 0. Suppose that (a) Let f C (x)) functions f which satisfy the equation (f(x))2 28. g(f = nonnegative functions on [a,b], and let IX f(x)5C+ a5x5b. fg Prove Gronwall'sinequalig: f(x) 5 CeÍ.*. Hint: Consider the derivative of the function h (x) C + /* fg. (b) Use a limiting argument to show that this result holds even for C 0. (c) Suppose that f'(x) g(x) f(x) for some continuous function g, and that f (0) 0. Then f 0. (Compare Problem 20.) = = = = = 29. (a) Prove that 3 x2 n 1+x+-+-+---+-<e2 forx>0. n! 21 3! and Hint: Use induction on n, compare derivatives. that ex/x" proof lim Give new co. a (b) = 30. Give yet another proof of this fact, using the appropriate form of l'Hôpital's Rule. (See Problem 11-53.) 31. (a) Evaluate lim e¯** xooo e2 JO dt. (You should be able to make an educated guess before doing any calculations.) (b) Evaluate the following limits. (i) lim e¯2° xwoo lx+(1/x) Ï lx+(logx/2x) e" dt. x+(logx/x) (ii) lim e¯" xooo lim (iii) xwoo e¯" e" dt. e" dt. 354 Derivativesand Integrals 32. This problem outlines the classical approach to logarithms and exponentials. To begin with, we will simply assume that the function f (x) a=, defined in an elementary way for rational x, can somehow be extended to a continuous one-one function, obeying the same algebraic rules, on the whole line. (See Problem 22-29 for a direct proof of this.) The inverse of f will then be denoted by loga• = the defmition, that (a) Show, directly from loga '(x) ( lim loga 1 + = h->0 1 1/h h - x lim(1+k)1/k =--loga k->0 X Thus, the whole problem has been reduced to the determination of lim(1 + h)l/h. If we can show that this has a limit e, then log,'(x) = h->0 1 - -- x log, e 1 = -, exp(x). (b) Let a. and consequently log¯* has derivative exp'(x) = = for natural numbers n. Using the binomial theorem, 1+ = exp x show that a,=2+ 1- Conclude that < a, 1-k-1 1- -...- a.wi. fact that 1/k! 5 1/2k-1 for k > 2, show that all an < 3. Thus, the set of numbers {ai,a2, a3, is bounded, and therefore has a least that bound Show for a, < e for large upper e. any e > 0 we have e (c) Using the ...) - enough n. If (d) nsxsn ( 1+ + 1, then " 1 Conclude that lim x->oo and conclude 1+-- 5 n+1 2 1 1 ( 1+ - 1+ 5 x 1 n+1 . n+1 2 = x that lim(1 + h)l/h e. Also show that _ lim -oo x 1+ 1 - x * = e, g h->0 *33. .4 >> a A point P is moving along a line segment A B of length 107 while another point Q moves along an infinite ray (Figure l1). The velocity of P is always equal to the distancefrom P to B (inother words, if P(t) is the position of P 107 P(t)), while Q moves with constant velocity at time t, then P'(t) 107. The distance traveled by Q after time t is defined to be the Q'(t) Napierianlogmithmof the distance from P to B at time t. Thus = - = FIGURE 11 10't = Nap log[102 -- P(t)]. 18. The Logarithmand ExponentialFunctions 355 in his This was the definition of logarithms given by Napier (1550-1617) canonis descnþtion(A Description of publication of 1614, Mirgicilogarithmonum the Wonderful Law of Logarithms); work which was done beforethe use of exponents was invented! The number 107 was chosen because Napier's tafor astronomical and navigational calculations), listed the logables (intended rithms of sines of angles, for which the best possible available tables extended to seven decimal places, and Napier wanted to avoid fractions. Prove that Nap log x 107 10'log = -. x Hint: Use the same trick as in Problem 22 to solve the equation *34. for P. particular attention to the graph of f(x) (logx)/x (paying behavior near 0 and oo). (b) Which is larger, ex or x"? (c) Prove that if 0 < xs 1, or x e, then the only number y satisfying xi y* is y x; but if x > 1, x ¢ e, then there is precisely one number y ¢ x satisfying x> = y2; moreover, if x < e, then y > e, and if x > e, then y < e. (Interpret these statements in terms of the graph in part (a)!) that if x and y are natural numbers and x> yx, then x Prove y or (d) (a) Sketch the = = = = = = x=2,y=4,orx=4,y=2. y2 consists of a curve and the set of all pairs (x, y) with x> which find the intersect; intersection and draw a rough straight line a (e) Show that = sketch. **(f) g(x)". For 1 < x < e let g(x) be the unique number > e with x8 Prove that g is differentiable. (It is a good idea to consider separate functions, = logx f1(x)=-, f2(x)= and write g in terms of fi g'(x) 0<x<e x logx, x and e<x f2. You should be [g(x)]2 = 1 - able to show that log x x2 1-logg(x) if you do this part properly.) *35. This problem uses the material from the Appendix to Chapter 11. (a) Prove that exp is convex and log is concave. (b) Prove that if p¿ = 1 and all p¿ > 0, then i=1 ...·zn'" zi' < Pizi +-··+ p.z.. (Use Problem 9 from the Appendix to Chapter 11.) (c) Deduce another proof that G, < A. (Problem 2-22). 356 Derivativesand Integrals 36. [a,b], and let P, be the partition of equal into n intervals. Use Problem 2-22 to show that be a positive function on (a) Let f 1 L(log f, P.) 5 log b-a (b) Use the Appendix to Chapter . 13 to conclude that for allintegrable f 0 > have we b 1 b-a b log f 5 log a f b-a . is illustrated in the next part: A more direct approach (c) In 1 L(f, Ps) b-a [a,b] Problem 35, Problem 2-22 was deduced as a special case of the inequality g p¿x; for p¿ 0, > p¿ pgg (xi) 5 i=1 i=l 1 and g convex. For g concave we have the reverse = i=l inequality Pig(xi) 5 & Fixi i=1 - i-l Apply this with g log to prove the result of part (b)directly for any integrable f. (d) State a general theorem of which part (b)is just a special case. = 37. Suppose f satisfies f' that *38. f = exp or f = = f and f (x + y) f (x)f (y) for all = 0. x and y. Prove Prove that if f is continuous and f(x + y) f(x)f(y) for all x and y, then all either f 0 or f (x) [f (1)]2 for x. Hint: Show that f (x) [f (1)]2 for rational x, and then use Problem 8-6. This problem is closely related to Problem 8-7, and the information mentioned at the end of Problem 8-7 can be used to show that there are discontinuousfunctions f satisfying f (x+y) = = = = = f (x)f (y). *39. Prove that if f is a continuous function defined on the positive real numbers, and f(xy) f(x)+ f(y) for all positive x and y, then f 0 or f(x) all for x > 0. Hint: Consider g(x) f(e)]ogx f(e2). = = = = *40. Prove that if f (x) = e¯1/x2 for x ¢ 0, and f (0) = 0, then f (k)(0) = 0 for all k. *41. Prove that if for all k. f (x) = e¯1/x2 sin 1/x for x ¢ 0, and f (0) = 0, then f (k) (0) = 0 18. The Logarithmand ExponentialFunctions 357 42. (a) Prove that if a is a root of the equation an-1x"¯' + + a1x (*) anx" + - then the function y(x) (**) - - + ao 0, = e"2 satisfies the differential equation = anytn)+a.-iyf"~)+--.+aiy'+aoy=0. xe"2 also satisfies (**). Prove that if a is a double root of (*),then y(x) Hint: Remember that if a is a double root of a polynomial equation f(x) 0, then f'(a) 0. xk ŒX iS & SOÏUÍiOR Prove that if a is a root of (*)of order r, then y(x) for05kyr-1. *(b) = = *(c) = = If (*)has n n solutions real numbers as roots of (**)· yn yl, - (d) Prove that - multiplicities), (counting part (c)gives · in this case the function ci yi + - - - + c.y. also satisfies (**). It is a theorem that in this case these are the only solutions of (**).Problem 20 and the next two problems prove special cases of this theorem, and the general case is considered in Problem 20-19. In Chapter 27 we will see what to do when (*)does not have n real numbers as roots. *43. Suppose that as follows. f satisfies f"- f 0 and f(0) = J'(0) = = 0. Prove that f = 0 (a) Show that f2 J')2 0. (b) Suppose that f (x) ¢ 0 for all x in some = - interval (a, b). Show that either for in (a, b), for some constant c. f (x) ce" or else f (x) If f(xo) ¢ 0 for xo > 0, say, then there would be a number a such that 0 < a < xo and f (a) 0, while f (x) ¢ 0 for a < x < xo. Why? Use this fact and part (b)to deduce a contradiction. = = **(c) all x ce¯2 = *44. ae2 + be¯' for that if f satisfies f" 0, then f (x) f should what and and out b be in (First figure b. terms of f (0) a some a and f'(0), and then use Problem 43.) a and b. (b) Show also that f a sinh +b cosh for some (other) (a) Show - = = = 45. Find all functions f satisfying (a) f (") f ("¯1) (b) ff") f(n-2) = = *46. This problem, a companion to Problem 15-30, outlines a treatment of the exponential function starting from the assumption that the differential equation f' f has a nonzero solution. = 0 with f' each x by considering the function g(x) f (xo) ¢ 0. (a) Suppose there is a function f ¢ = = f. Prove that f (x) ¢ 0 for f(xo + x) f(x - xo), where 358 Derivativesand Integrals (b) Show that there is a functidn f satisfying f' f and f (0) 1. (c) For this f show that f (x + y) f (x) f (y) by considering the function = = = g(x)= f(x+y)/f(x). and that (f¯I)'(x) (d) Prove that f is one-one 47. - 1/x. = Let f and g be continuous functions such that lim f(x) than g (f » g) if We say that f growsfaster . hm and we say that f f (x) = -- f (x) exists = oo. oo, and g grow at the same rate lim lim g(x) = and (f g) if ~ is ¢ 0, oo. For example, exp » P for any polynomial function P, and P » log" for any positive integer n. and g, with lim f(x) that one of the three conditions (a) Given f (b) If f » g, then f + g ~ (c) If oo, is it necessarily true f » g or g » f or f g holds? lim g(x) = ~ f. log f logg > c > 1 for sufficiently large x, then f » g. LX (d) If f » g and F(x) = f, = G(x) X = g, does it necessarily follow that F » G? (e) Arrange each of the following sets of functions in increasing order of growth (forconvenience, we indicate each function simply by giving its value at x): (i) (ii) (iii) x3 x, X3 3), log4x, (logx)2, x2, x + e¯52, x3 e**, x2, xiogx x log2 x, e52, log(xx) e2', ex/2, e2, x", x2, 22, (logx)2x x 48. Suppose that gi, g2, gs, are continuous functions. Show that there is a continuous function f which grows faster than each g,. 49. Prove that logio 2 is irrational. ... INTEGRATION CHAPTER IN ELEMENTARY TERMS Every computation of a derivative yields, according to the Second Fundamental Theorem of Calculus, a formula about integrals. For example, if F(x) = x(logx) then F'(x) x - = logx; consequently, b log x dx = F(b) -- F(a) = b(log b) b - - [a(loga) - a], O < a, b. Formulas of this sort are simplified considerably if we adopt the notation F(x) We may then write Ib F(b) = F(a). - b Ib logxdx=x(logx)-x . £b This evaluation of log x dx depended on the lucky guess that log is the derivative of the function F(x) In general, a function F satisfying F' x (logx) f function f always has a is called a primitive of f. Of course, a continuous -x. = primitive, = namely, F(x) = lx f, but in this chapter we will try to find a primitive which can be written in terms of familiar functions like sin, log, etc. A function which can be written in this way is called an elementary function. To be precise,* an elementary function is obtained which multiplication, and composition addition, division, be by can one from the rational functions, the trigonometric functions and their inverses, and the functions log and exp. It should be stated at the very outset that elementary primitives usually cannot function F such that be found. For example, there is no ekky F'(x) = e¯2° for all x merely ignorance; it is a report on the present state of mathematical such what theorem that exists). function And, is even worse, you no a (difficult) (thisis not *The definition which we will give is precise, but not really accurate, or at least not quite standard. functions, that is, functions g Usually the elementary functions are defined to include "algebraic" satisfying an equation (a(x))"+Ín-1(x)(f(x))"¯1+.-+ fo(x) where the 359 f¿ are =0, rational functions. But for our purposes these fimetions can be ignored. 360 Denvativesand Integrals will have no way of knowing whether or not an elementary primitive can be found (youwill just have to hope that the problems for this chapter contain no misprints). Because the search for elementary primitives is so uncertain, finding one is often peculiarly satisfying. If we observe that the function F(x) - x arctan x log(1 + x2) 2 - satisfies F'(x) (justhow we = arctanx would ever be led to such an observation is quite another matter), so that Lb arctan x dx = x arctan x "really" - ÏOg(Î+ 2 X2) b , evaluated then we may feel that we have arctan x dx. This chapter consists of little more than methods for finding elementary primitives of given elementary functions (a process known simply as and conventions designed to facilitate together with some notation, abbreviations, this procedure. This preoccupation with elementary functions can be justifiedby three considerations: f "integration"), (1)Integration about is a standard topic in calculus, and everyone should know it. in a while you might actually need to evaluate an integral, under conditions which do not allow you to consult any of the standard integral tables (forexample, you might take a (physics) course in which you are able expected to be to integrate). of integration are actually very important theuseful most The (3) apply all not just elementary ones). functions, orems (that to (2)Every once "methods" Naturally, the last reason is the crucial one. Even if you intend to forget how to integrate (andyou probably will forget some details the first time through), you must never forget the basic methods. These basic methods are theorems which allow us to express primitives of one function in terms of primitives of other functions. To begin integrating we will therefore need a list of primitives for some functions; such a list can be obtained simply by differentiating various well-known functions. The list given below makes use of a standard symbol which requires some explanation. The symbol f "a means The primitive of symbol ff or f (x)dx "the or, more precisely, will often by used in stating f" useful in formulas like the following: collection of all primitives of f." theorems, while (x) dx is most ff 19. Integrationin ElementaryTerms 361 x4/4 satisfies F'(x) x3. It that the function F(x) cannot be interpreted literally because the right side is a number, not a function, but in this one context we will allow such discrepancies; our aim is to make the integration process as mechanical as possible, and we will resort to any possible device. Another feature of the equation deserves mention. Most people write This "equation" = means x3dx = + C = to emphasize that the primitives of f (x) x3 are precisely the functions of the x4/4 + C for some number C. Although it is possible (Problem 13) form F(x) to obtain contradictions if this point is disregarded, in practice such difficulties do not arise, and concern for this constant is merely an annoyance. There is one important convention accompanying this notation: the letter appearing on the right side of the equation should match with the letter appearing after the on the left side-thus = = "d" u3du I / = , tx2 txdx = txdt = -, 2 xt2 -. 2 "indefinite A function in ff (x)dx, i.e., a primitive of f, is often called an integral" of f, while f f(x) dx is called, by way of contrast, a "definite integral." This suggestive notation works out quite well in practice, but it is important not to be led astray. At the risk of boring you, the following fact is emphasized once again: the integral f f (x) dx is not defined as "F(b) F(a), where F is an indefinite of repetitious, integral it is time to reread f" (ifyou do not find this statement Chapter 13). We can verify the formulas in the following short table of indefinite integrals simply by differentiating the functions indicated on the right side. - adx / = ax X" x"dx= dx n+1 = n-/-1 , dx is often written log x abbreviations table.) e'dx sin x = dx e' = - cos x are used for convenience; similar in the last two examples of this 362 Derivativesand Integrals cosxdx sinx = sec2xdx = tanx secx tanx dx Ï / dx 1+x2 dx = secx arctanx = . = -x2 arcsmx 1 Two general formulas of the same nature are consequences of theorems about difFerentiation: [f(x)+g(x)]dx= c- g(x)dx, f(x)dx+ f(x)dx=c- f(x)dx. These equations should be interpreted as meaning that a primitive of f + g can be obtained by adding a primitive of f to a primitive of g, while a primitive of a primitive of f by c. c f can be obtained by multiplying Notice the consequences of these formulas for definite integrals: If f and g are - continuous, then b[f(x)+g(x)]dx= g(x)dx, f(x)dx+ Ib c· b f(x)dx = f(x)dx. c· a a These follow fmm the previous formulas, since each definite integral may be written as the difference of the values at a and b of a corresponding primitive. Continuity is required in order to know that these primitives exist. (Of course, the formulas are also true when f and g are merely integrable, but recall how much more difficult the proofs are in this case.) The product formula for the derivative yields a more interesting theorem, which will be written in several different ways. THEOREM l (INTEGRATION BY PARTS) If f' and g' are continuous, then fg'= fg f(x)g'(x)dx f(x)g(x) = lb f (x)g'(x)dx f'g, - f'(x)g(x)dx, - b = (Notice that in the second equation f (x)g(x) b - f'(x)g(x)dx. f g.) f (x)g(x)denotes the functwn - 19. Integrationin Elementar.yTerms 363 PROOF The formula (fg)'= f'g fg' + can be written fg'= (fg)'-- f'g. Thus fg'= (fg)'- f'g, and fg can be chosen as one of the functions denoted by fg)'. This proves the f( first formula. The second formula is merely a restatement of the first, and the third formula follows immediately from either of the first two. As the following examples illustrate, integration by parts is useful when the function to be integrated can be considered as a product of a function f, whose derivative is simpler than f, and another function which is obviously of the form g'. xe'dx = xex 1·e2dx - fg'fgf'g xex - x sin x f dx = x g' - (- f e2 - cos x) 1 (- cos x) dx - · f g ' g cos x + sin x -x - There are two special tricks which often work with integration by parts. The first is to consider the fimction g' to be the factor 1, which can always be written in. log x dx 1 log x dx = . g' x log x = - gf x(log x) f = x - f' g x. - (1/x) dx The second trick is to use integration by parts to find in terms of and then solve for h. A simple example is the calculation fh f (1/x) log x dx - g' which = f log x log x · g f - (1/x) log x dx, - f' implies that ll -logxdx 2 x = (logx)2 g fh again, 364 Derivativesand Integrals or A more complicated Ï = . 2 x calculation is often required: ex sinx dx f (logx)2 l -logxdx e' = g' - (- f cos x) ex --- (- - f' g =-excosx+ cosx) dx g e2cosxdx v' u -e2 [e2 (sinx) cosx + = - u e>(sinx) dx]; - u' v v therefore, e2 sin x dx 2 or Ï e' sinx dx e2 (sinx = = e*(sin x - cos x) cos x) - . 2 Since integration by parts depends upon recognizing that a function is of the form g', the more functions you can already integrate, the greater your chances for success. It is frequently reasonable to do a preliminary integration before tackling the main problem. For example, we can use parts to integrate (logx)2dx (logx)(logx)dx = g' f if we recall that f log x dx gration by parts); we have (logx)(logx) dx f = g' = x(log x) - x (thisformula (logx)[x(logx) - f x] -- itself derived by inte- was (1/x)[x(logx) - f' g x]dx g -x] = (logx)[x(logx) = (logx) [x(logx) - - x] - -x] = (logx)[x(logx) =x(logx)2-2x(logx)+2x. [logx - 1]dx log x dx + 1dx -x]+x - [x(logx) The most important method of integration is a consequence of the Chain Rule. The use of this method requires considerably more ingenuity than integrating by parts, and even the explanation of the method is more difficult. We will therefore 19. Integrationin Elementar_;Terms 365 in stages, stating the theorem for definite integrals first, and of indefinite integrals for later. develop this method saving the treatment THEOREM (THE SUBSTITUTION 2 If f and g' are continuous, then FORMULA) g(b) b (fog)-g' f= g(b) b Ï PROOF If F is a primitive of f(g(x))-g'(x)dx. f(u)du= f, then the left side is F(g(b)) F(g(a)). - On the other hand, (Fog)'=(F'og)-g'=(fog)-g', so Fog is a primitive of (fo g) g' and the right side is · (Fo g)(b) - (Fo g)(a) F(g(b)) = F(g(a)). - The simplest uses of the substitution formula depend upon recognizing given function is of the form (fo g) g'. For example, the integration of that a - sin*xcosxdx (sinx)"cosxdx = is facilitated by the appearance of the factor cos x, which will be the factor g'(x) sinx; the remaining expression, for g(x) (sinx) , can be written as (g(x))5 us. Thus f(g(x)), for f(u) = = g(X) Lb sin" x cos x dx f (u) = u 5 g(b) Ïf Ïsino (g(x))g'(x)dx u5du f (u)du = sin6b = à"a - . 6 sina f = SinX b = The integration of = 6 tan x dx can be treated similarly if we write b Lb tanx dx -sinx = Ï - a -sinx dx. cosx is g'(x), where g(x) cosx; the remaining f (cosx) for f (u) 1/u. Hence In this case the factor 1/ cos x can then be written = = g(x) Ïa tan x dx = f(u) cosx 1 - f(g(x))g'(x)dx=- =- f(u)du cmbg = - - Osa N du = log(cos a) - log(cos b). factor 366 Derivadvesand Integrals Finally, to find b Ï y dx, x logx a notice that 1/x f (u) = g'(x) where = g(x) logx, and that 1/logx = f(g(x)) for = 1|u. Thus g (X ) b Ï ÏOg X = 1 dx b g(b) I Ïlogb f(g(x))g'(x)dx= = y = - E loga du = f(u)du g(a) a log(log b) log(log a). - Fortunately, these uses of the substitution formula can be shortened The intermediate steps, which involve writing considerably. g(b) lb f(g(x))g'(x)dx= f(u)du, g(a) a can easily be eliminated by noticing the following: To go from the left side to the right side, u for g (x) du for g'(x)dx (andchange the limits of integration); substitute the substitutions can be performed directly on the original for the name of this theorem). For example, function sinb - Ïb sin" N fOr . x cos x dx substitute (accounting SinX u6 du, = du for cosxdx. sina and similarly - lb a SiBX this method = -sinxdx du. - du for cosx Usually we abbreviate cosb M fOT COSX . dx substitute cosa u even more, and say simply: "Let u = g(x) du = g'(x) dx." Thus ÌCt Lb xlogx dx u = IOg X Iog by -du. 1 du=-dx x = ga In this chapter we are usually interested in primitives rather than definite integrals, but if we can find f f (x)dx for all a and b, then we can certainly find 19. Integrationin Elemen y Terms 367 ff (x) dx. For example, since 6b Lb Ï sin" it f'ollows that x cos x dx = 6 6 , sin6x sin x cosxdx Similarly, sin6a - = . 6 -logcosx, tanxdx I 1 = dx xlogx Iog(log x). = It is quite uneconomical to obtain primitives from the substitution formula by first finding definite integrals. Instead, the two steps can be combined, to yield the following procedure: (1)Let u du (afterthis manipulation = g(x), = g'(x)dx; only the letter u should appear, not the letter x). (2)Find a primitive (asan expression (3)Substitute g(x) back for u. involving u). Thus, to find sin x cos x dx, (1)let u du = sinx, = cosxdx so that we obtain u3 du; (2)evaluate u6 du (3)remember to substitute I = ; sin x back for u, so that sin" sin6 x cos x dx = . 6 368 Derivativesand Integrals Similarly, if logx, = u du 1 = - x dx, then 1 dx x logx I so that I To evaluate 1 becomes - du = u 1 dx x log x log u, log(log x). = x dx, 1 ‡ x2 let u=1+x2, du 2x dx; = the factor 2 which has just popped up causes no problem-the l 2 -du 1 integral becomes 1 2 -logu, = - u so dx 1+ x, log(1+ x2L = (This result may be combined with integration by parts to yield 1 arctan x dx - = = x arctan x x arctan x dx - - ) 1 + x2 log(1 + x2 a formula that has already been mentioned.) These applications of the substitution formula* illustrate the most straightthe suitable factor g'(x) is recognized, forward and least interesting types-once the whole problem may even become simple enough to do mentally. The following three problems require only the information provided by the short table of indefinite integrals at the beginning of the chapter and, of course, the right substitution *The substitution formula is often written in the form f(x)du= f(g(x))g'(x)dx, This formula cannot be taken literally f f(u)du of (afterall, mean a primitive u=g(x). should mean a primitive of f and the g'; these are certainly not equal). However, it may be regarded as a symbolic summary of the procedure which we have developed. If and a little fudging, the formula reads particularly well: we use Leibniz's notation, symbol f f(g(x))g'(x)dx should f(u)du = (fo f(u) g) dx. - 19. Integrañonin Elemener.yTerms 369 (thethird problem has been disguised a little by some algebraic sec2 chicanery). x tan" x dx, (cosx)e'"2 dx, I e2 1 dx. en - If you have not succeeded in finding the right substitutions, you should be able to guess them from the answers, which are (tan6x)/6, e"i"=, and arcsin ex. At first you may find these problems too hard to do in your head, but at least when g is of the very simple form g(x) ax + b you should not have to waste time writing out the substitution. The following integrations should all be clear. (The only worrisome the answer to the second be detail is the proper positioning of the constant-should e32/3 or 3e32? I always take care of these problems as follows. Clearly e32 dx e32· e3x, I get F'(x) 3e32, so the Now if I differentiate F(x) (something). must be to cancel the 3.) = J = "something" = = ¾, Ï dx e3'dx Ï I log(x + 3), = 3 x + = , sin4x cos = , 4 -cos(2x sin(2x + / 4x dx 1)dx dx 1 + 4x2 + 1) = 2 , arctan2x ¯ 2 More interesting uses of the substitution formula occur when the factor g'(x) does not appear. There are two main types of substitutions where this happens. Consider first ll 1 +ex - dx. e2 of the expression ex suggests The prominent appearance the simplifying tion u=e2, du e'dx. = Although the expression e2 dx does not appear, it can always be put in: 1+e2 1 dx1 e2 ex 1 ex We therefore obtain 1 du, 1--u /1+e2 Ï1+u -- - ¯ ---e'dx. substitu- 370 Derivativesand Integrals which can be evaluated by the algebraic trick 1 Il+u Il+e2 ¯ 1-u so that 1 - ex dx du 2 1-u = 1 + - u -21og(1 du = -- -21og(1 u) + log u, -21og(1 = - e") + 1oge2 = e2) + x. - There is an alternative and preferable way of handling this problem, which does not require multiplying and dividing by ex. If we write u=e*, x=logu, dx 1 = - da u then 1+u 1-u l1+ex 1-e= dx immediately becomes 1 ¯ du. Most substitution problems are much easier if one resorts to this trick of expressing x in terms of u, and dx in terms of du, instead of vice versa. It is not hard to see why this trick always works (aslong as the function expressing u in terms of x is one-one for all x under consideration): If we apply the substitution x=g¯I(u) u=g(x), dx = (g¯1)'(u)du to the integral f (g(x))dx, we obtain f(u)(g¯I)'(u)du. (1) On the other hand, if we apply the straightforward u = g(x) du = g'(x)dx substitution to the same integral, -g'(x)dx, f(g(x))dx= f(g(x))- we obtain (2) / f (u) 1 · g'(g-1) du. The integrals (1)and (2)are identical, since (g¯ )'(u) As another concrete example, consider Ï e ex +1 dx. = 1/g'(g¯1 19. Integrationin Elemenky Terms 371 In this case we will go the whole hog and replace the entire expression by one letter. Thus we choose the substitution 1 Vex+1, = u Je*+ u2_741 u2 1 = ex, - x = dx = log(u2 - 2u u2 du. 1 - The integral then becomes Ï Thus (u2 12 - 2u u Ï du u2-1 e dx = 2 = u 2 2u3 1du - = -- 2u. - 3 (e2+ 1)3/2 2(e2 + 1)1/2 - 3 e2+1 Another example, which illustrates the second occur, is the integral main type ofsubstitution that can 1-x2dx. In this case, instead of replacing a complicated expression by a simpler one, we sin2 will replace x by sin u, because u cos u. This really means that we arcsin x, but it is the expression for x in terms of u are using the substitution u which helps us find the expression to be substituted for dx. Thus, Ál = - = let x sin u, dx=cosudu; [u = arcsin x] = then the integral becomes 1 The evaluation of this - sin2 u cos u du cos2 = u du. integral depends on the equation 1 + cos 2u 2 (seethe discussion of trigonometric functions below) so that COS2 Ï and Ï cos2 u du = Ï 1+cos2u du 2 arcsin x 1-x2¿y = - u 2 sin2u + - - -x)1 = , sin(2 arcsin x) 2 4 arcsin x 1 sui(arcsin x) cos(arcsin x) + 2 2 arcsinx 1 x2 + 2 2 4 . = 4 - 372 Derivativesand Integrals Substitution and integration by parts are the only fundamental methods which you have to learn; with their aid primitives can be found for a large number of functions. Nevertheless, as some of our examples reveal, success often depends upon some additional tricks. The most important are listed below. Using these you should be able to integrate all the functions in Problems 1 to 9 (a few other interesting tricks are explained in some of the remaining problems). 1. TRIGONOMETRIC FUNCTIONS Since sin2 cos2 x ‡ and cos 2x cos2 = x sin2 - we obtain cos2x = cos 2x = cos2x (1 1- co - sin x) - 2x) sin2 - 2cos2X = 1 = x - - 2 sin2 Î, x, or sin' 1 -- x - cos 2x , 2 1 + cos 2x 2 These formulas may be used to integrate cos2 = x . sin" x dx, cos" x dx, if n is even. Substituting (1 cos 2x) - or 2 (1 + cos 2x) 2 for sin2 x or cos2x yields a sum of terms involving lower powers of cos. For example, sin4 x dx 1 = - cos 2x dx dx = cos 2x dx + - and If n is odd, n / = cos2 2x dx 2k + 1, then sin" x dx = = /1+cos4x 2 sin x (1 - cos dx. x)k dx; cos2 2x dx 19. Integrationin Elementap Terms 373 the latter expression, multiplied out, involves terms of the form sin x cos' x, all of which can be integrated easily. The integral for cos" x is treated similarly. An integral sin" x cos" x dx is handled the same way if n or m is odd. If n and m are both even, use the formulas for sin2 x and cos2 A final important trigonometric integral is ! 1 cos x dx sec x dx = = log(sec x + tan x). "deriving" this result, by means of the methAlthough there are several ways of ods already at our disposal (Problem 12), it is simplest to check this formula by differentiating the right side, and to memorize it. 2. REDUCTION FORMULAS Integration by parts yields (Problem 20) / Ï sin" x dx / cos" x dx = - - 1 sinn-1 x cos x + n = -- 1 cos"¯1 n x sin x + 1 - n sinn-2 n 1 - n n 1 1 2n-3 x dx= + 2n 2 (x2 + 1)=¯I 2n 2 (x24 1). -- - cos"¯2 x dx, x dx, 1 dx (x2y 1)n-1 and many similar formulas. The first two, used repeatedly, give a different method for evaluating primitives of sin" or cos". The third is very important for integrating a large general class of functions, which will complete our discussion. 3. RATIONAL FUNCTIONS Consider a rational function p/q where p(x) = anx" +an-1x"¯I + q(x)=bmx"+bm-ix"¯l+--·+bo. -··+ao, 1. Moreover, we can assume that n < m, b, We might as well assume that a, for otherwise we may express p/q as a polynomial function plus a rational function which is of this form by dividing (thecalculation = u2 u-1 = =u+1+ 1 u-1 is a simple example). The integration of an arbitrary rational fimction depends on two facts; the first follows from the "Fundamental Theorem of Algebra" (see Chapter 26, Theorem 2 and Problem 26-3), but the second will not be proved in this book. 374 Derivativesand Integrals THEOREM Every polynomial function q(x)=x"'+bm-ix"¯'+-··+bo can be written as a product X+¾)s1·...·(X2+þX+ q(x)=(x-a1)"-...·(x-ŒkŠr*(x24 " 51Ÿ···ŸSI)=M). (whereri+-··+rk (In this expression, identical factors have been collected together, so that all x2 + x «¿ and ßgx + y, may be assumed distinct. Moreover, we assume that each quadratic factor cannot be factored further. This means that - ß;2 since otherwise we can -4y¿ < 0, factor -ßt + X2 Áß¿24y, Áß? -ß, - - - 2 4y; 2 into linear factors.) THEOREM If n < m and p(x) x" + an-1x"¯' + q(x)=xm+bm-ixm-1+·--+bo = - + ao, ·· (x2+ßix+yi)" =(x-a1)"·---·(X-akf(x2+Ax+n)"'--can be written then p(x)/q(x) p(x) al,1 = q(x) (x - at) in the form a1,ri +···+ (x +--- al) - Øk,l + + Œk,ra + (x ag) bl.1x+ci,i · - [ + (x - +-··+ (x2 + any b1,six+ct 2 bi,1x + ci,1 (x2 + ßix + yi) +--- si b +··-+ ,,,x + ci,, . (x2 + ßix + yi)"I "partial This expression, known as the fraction decomposition" of p(x)/q(x), is so complicated that it is simpler to examine the following example, which illustrates such an expression and shows how to find it. According to the theorem, it is possible to write 2x2 + 8x6 + 13x5 + 20x4 + 15x3 + 16x2 + 7x + 10 (x2 + x + 1)2(x2 + 2x + 2)(x 1)2 - = a x-1 + b (x--1)2 cx+d + x2+2x+2 ex+ + x2+x+1 f + gx+h . (x2+x+1)2 19. Integrationin Elementary72rms 375 To find the numbers a, b, c, d, e, f, g, and h, write the right side as a polynomial over the common denominator (x2+ x + 1)2(x2 + 2x + 3)(x 1)2; the numerator becomes - 2+b(x2+2X+2)(X2 a(x-1)(x2+2x+2)(x2 --1)2(x2+x+1)2+(ex+ 2 -1)2(x2+2x+2)(x2+x+0 +(cx+d)(x f)(x +(gx+h)(x-1)2(x2+2x+2). Actually multiplying this out (!)we obtain a polynomial of degree 8, whose coefficients are combinations of a, h. Equating these coefficients with the coefficients of 2x2+8x6+13x5+20x4+15x3+16x2+7x+ 10 (thecoefficient of x8 is 0) obtain unknowns equations the eight 8 in h. After heroic calculations we a, solved these can be to give ..., . 1, a e=0, b = 2, = . 1, c g=0, , d = 3, h=1. = f=0, . Thus = 3 2x2 + 5x6 + 13x3 + 20x4 + 16x2 + 7x + 7 dx (x24, y 1)2(x2 + 2x + 2)(x 1)2 1 2 1 dx + dx + dx + Ï / - (x-1)2 (x-1) (x2+x+1)2 x + 3 x2+2x+2 dx. (In simpler cases the requisite calculations may actually be feasible. I obtained this particular example by stardng with the partial fraction decomposition and converting it into one fraction.) We are already in a position to fmd each of the integrals appearing in the above expression; the calculations will illustrate all the difficulties which arise in integrating rational functions. The first two integrals are simple: Ï Ï 1 x 1 - dx = dx = 1), - -2 2 (x 1)2 -- The third integration depends on x "completing x2 - l' the square": 2 = | 3 4 - x + + 1 . instead of we could not take the square root, but in this quadratic factor could have been factored into linear factors.) We (If we had obtained case our original log(x - 376 Daivancesand Integrals can now write / 1 dx (x2+x+1)2 = 16 9 - 1 dx. -2 - ; x+, +1 4 - - The substitution x + y 1 changes this integral to Ë4 16 9 du (u2 + 1)2 which can be computed using the third reduction Finally, to evaluate x+ 3 Ï we write / x+3 dx x2+2x+2 = - 1 2 (x2+2x+2) 2x+2 ' formula given above. dx 2 dx + x2+2x+2 (x+1)2+1 dx. The first integral on the right side has been purposely constructed so that we can evaluate it by using the substitution u=x2+2x+2, du (2x +2)dx = The second integral on the right, which is just the simply 2 arctan(x + 1). If the original integral were Ï x+3 (x2+2x+2)" dx 1 = 2x+2 ¯ (x2+2x+2)= difference of the other two, is 2 dx + [(x+1)2+1] dx ' the first integral on the right would still be evaluated by the same substitution. The second integral would be evaluated by means of a reduction formula. This example has probably convinced you that integration of rational functions is a theoretical curiosity only, especially since it is necessary to find the factorization of q (x) before you can even begin. This is only partly true. We have already seen that simple rational functions sometimes arise, as in the integration another Ï l +ex dx; 1 e2 - important example is the integral Ï 1 x 2 -1 1 dx = - x--1 12 x+1 dx = 1 log(x 2 - - 1) - 1 log(x + 1). 2 - 19. Integrationin ElemeneryTerms 377 Moreover, if a problem has been reduced to the integration of a rational function, it is then certain that an elementary primitive exists, even when the difficulty or impossibility of finding the factors of the denominator may preclude writing this primitive explicitly. PROBLEMS 1. This problem contains some integrals which require little more than algeand consequently braic manipulation, test your ability to discover algebraic of the integration processes. Nevertricks, rather than your understanding theless, any one of these tricks might be an important preliminary step in an honest integration problem. Moreover, you want to have some feel for which integrals are easy, so that you can see when the end of an integration process is in sight. The answer section, if you resort to it, will only reveal what algebra you should have used. Ï (ii) I (iii) Ï (i) Ñ+Vi dx . x-1+)x+1 +e32 e=+e dx e4x . dx. (iv) (v) dx. tanixdx. (Trigonometric integrals are always very touchy, because identities that an easy there are so many trigonometric problem can easily look hard.) dx Ï / (viii) / (ix) l8x2+6x+4 (x) / 2 y x2 dx dx 1+sinx x + 1 -x2 dx. 1 dx. 2x 2. The following integrations involve simple substitutions, should be able to do in your head. (i) e2 sine2 dx. most of which you 378 Denvativesand Integrals xe¯" (ii) (iii) dx. Ilogx Ï (In the text this was done by parts.) dx. x ex dx + 2ex + 1 e e e'dx. (v) Ï (vii) Ï xdx (vi) . 1--x4 - e dx. 1 x2 dx. (viii) x (ix) log(cos x) tan x dx. - log(logx) J 3. dx. xlogx Integration by parts. (i) x2ex dx. (ii) x3e" (iii) eax sinbxdx. (iv) x2 sin x (v) (logx)3 dx. (vi) dx. dx. Ilog(logx) (vii) dx. sec dx. (This is a tricky and important integral that often comes up. If you do not succeed in evaluating it, be sure to consult the answers.) (viii) (ix) (x) cos(logx)dx. log x dx. x(log x)2 dx. 19. Integrationin Elemenky Terms 379 4. The following integrations can all be done with substitutions of the form sin u, x x cos u, etc. To do some of these you will need to remember = = that secxdx=log(secx+tanx) followingformula, which can also be checked by differentiation: as well as the -log(cscx esex dx + cot x). = In addition, at this point the derivatives of all the trigonometric should be kept handy. (i) dx I . 1 . 1 + x2 I (iv) / I (vi) Ï tution X = tan = sec2 u, you want to use the substi- M.) . x2 1 dx - (The answer will be a certain inverse function that was 1 given short shrift in the text.) . x/x2 - dx x)1 x2 - dx . x/1 + x2 x3 I (Since tan2 u + 1 dx (iii) (vii) (You already know this integral, but use the substitution sin u anyway, just to see how it works out.) x = dx (ii) 5. x2 - functions 1 72 _ dx. x2 dx. (viii) 1 (ix) 1+x2dx. (x) x2 - You will need to remember the methods for integrating powers of sin and cos. 1dx. - The following integrations involve substitutions of various types. There is no substitute for cleverness, but there is a general rule to follow: substitute for an expression which appears frequently or prominently; if two different troublesome expressions appear, try to express them both in terms of some new expression. And don't forget that it usually helps to express x directly in terms of u, to find out the proper expression to substitute for dx. I (ii) / (i) dx 1+/x+1 dx 1 + ex . . 380 Derivativesand Integrals dx I (iv) I (v) Ï (vi) / (vii) /42+1 (iii) . dx (The substitution u e2 leads to an integral requiring yet another substitution; this is all right, but both . 1 + e' = substitutions can be done at once.) dx 2+tanx dx (Another place where one substitution + 1 do the work of two.) . dx. 2 +1 (viii) (ix) *(x) 6. Ï can be made to dx. e Vl-x 1 dx. (In this case two successive substitutions work out best; there are two obvious candidates for the first substitu- E - tion, and either will work.) x-1 1 dx. 1 x + x / · - The previous problem provided gratis a haphazard selection of rational functions to be integrated. Here is a more systematic selection. 2x2 + 7x - / x34x2--x-Î (ii) Ï (iii) Ï (iv) Ï (v) / (vi) Ï (vii) / (viii) X4‡Î / (ix) l (x) Ï (i) x3 1 2x + 1 3x2 + 3x - dx. - 1 dx . x3+7x2-5x+5 1)2(x + 1)3 2x2 ‡ X + Î dx. (x + 3)(x 1)2 (x dx. - - x+4 dx. x24) x3+x+2 x4+2x2+1 dx. 3x2+3x+1 dx. x3+2x2+2x+1 dx . 2x (x2+ x + 1)2 3x 2 (x 1)3 dx. dx. 19. Integrationin ElementaryTerms 381 *7. (No holds barred.) The following integrations involve all the Potpourri. of the previous problems methods I (ii) Ï (i) arctanx 1 + x2 dx. x arctan x (1 + x2)3 dx. (iii) log fl + x2 dx. (iv) x (v) Ï log fl + x2 dx. x2 ¡ _ x2+1 f dx. x Ï (viii) / e'i"> (ix) Jtan x dx. (x) g6 (vii) dx. 41+x4 arcsin (vi) g · dx. . 1 + smx xcos3x-sinx dx. . COS X . i (To factor X6 + Î , Erst factor y3 + 1, using Problem 1-1.) The following two problems provide still more practice at integration, ifyou need it (andcan bear it). Problem 8 involves algebraic and trigonometric manipulations and integration by parts, while Problem 9 involves substitutions. (Of course, in resulting will require still manipulations.) the integrals further many cases 8. Find the following integrals. 2) (i) log(a2 I (iii) / l + cos x (ii) dx. sin2 x x + 1 4 x2 - dx. (iv) x arctan x (v) sin3 dx. x (vi) Ï sin3x COS2 X dx. dx. dx. 382 Denvatives and Integrals x2 arctan x (vii) xdx (viii) 9. dx. . x2-2x+2 (ix) sec3xtanxdx. (x) x tan2 x dx. Find the following integrals. I dx (a2 + x2)2 (ii) 1 sin x (iii) arctan (iv) sin - dx. dx. fx + 1 dx. -2 (v) / Vx3 dx. (vi) log(x + Vx2 1) dx. (vii) log(x + 5) (viii) x -- dx. dx - x3/ . (ix) (arcsinx)2 dx. (x) 10. x5 arctan(x2) dx. If you have done Problem 18-9, the integrals (ii)and (iii)in Problem 4 will look cosh u often works for integrals very familiar. In general, the substitution x = involving fx2 1, while x sinh u is the thing to try for integrals involving x2 + 1. Try these substitutions on the other integrals in Problem 4. (The method is not really recommended; it is easier to stick with trigonometric = - substitutions.) *11. The world's sneakiest substitution t = tan x -, 2 is undoubtedly x = dx = 2 arctan t, 2 dt. 1 ‡ t2 19. Integrationin ElementaryTerms 383 As we found in Problem 15-17, this substitution 2t 1+ t2, . = smx leads to the expressions 1 t2 1+t2 - cosx = thus transforms any integral which involves only sin and combined and division, into the integral of by addition, multiplication, cos, a rational function. Find This substitution I (ii) / (iii) Ï (i) *12. . dx 1 sin2 Ï (In this case it is better to let t . tan x. Why?) = - dx , a sin x + sini (iv) (v) dx (Compare your answer with Problem 1(viii).) 1 + sin x b cos x (There is also another way to do this, using Problem 15-8.) dx. (An exercise to convince you that this substitution should be used only as a last resort.) dx (A last resort.) 3+5sinx x . Derive the formula for fsecxdxin the following two ways: (a) By writing 1 cosx cos2 x cos x ¯ cos x 1 sin2 - X ¯ = - 1 2 cos x _1+sinx + cos x 1-sinx ' an expression obviously inspired by partial fraction decompositions. Be log(1 sin x); the minus sign sure to note that cos x/ (1 sin x) dx remember that log a = log is very important. And From there on, keep doing algebra, and trust to luck. By using the substitution t = tan x/2. One again, quite a bit of manip- f - - - - j . (b) ulation is required to put the answer in the desired form; the expression tan x/2 can be attacked by using Problem 15-9, or both answers can be expressed in terms of t. There is another expression for jsecxdx, which is less cumbersome than log(sec x + tan x); using Problem 15-9, we obtain 1+tan x =log secxdx=log 1 - tan tan + . - 2 was actually the one first discovered, and was due, mathematician's cleverness, but to a curious historical accinot to any This last expression 384 Derivancesand Integrals dent: In 1599 Wright computed nautical tables that amounted to defmite integrals of sec. When the first tables for the logarithms of tangents were produced, the correspondence between the two tables was immediately noticed remained unexplained until the invention of calculus). (but 13. The derivation of f e2 sin x dx given in the text seems to prove that the only eX primitive of f (x) e' sin x is F(x) (sinx cos x)/2, whereas F(x) ex (sinx cos x)/2 + C is also a primitive for any number C. Where does C come from? (What is the meaning of the equation = = = - - e2 sin x dx 14. Suppose that f" e2 sin x = ex cos x - ex sin x - dx?) is continuous and that /K O Given that 15. f (x) = [f (x) + f"(x)] sinx 1, compute dx = 2. f (0). x dx, using the same trick that worked Generalizethis trick: Find (x)dx in terms of with Problems 12-18 and 14-17. (a) Find f arcsin *(b) ¯l ff for log and arctan. ff (x)dx. Compare sin4 16. x dx in two different ways: first using the reduction formula, and then using the formula for sin2 x. (b) Combine your answers to obtain an impressive trigonometric identity. 17. Express (a) Find f f log(log x) dx terms of elementary x2e¯" functions.) dx in terms of 18. Express 19. Prove that the function (Do not try to find it!) 20. Prove the reduction / and work x 21. = f (x) = f(logx)¯I dx. (Neither is expressible (e52+e2 ex + 1) has an elementary I tan u.) formula for x"e" dx Prove that primitive. ¯ (b) (logx)"dx. *22. coshx t2 - 1dt in f e¯2° dx. formulas in the text. For the third one write x2 dx dx dx (1 + x2)" J (1 + x2)"¯1 J (1 + x2)" on the last integral. (Another possibility is to use the Find a reduction (a) in terms of = (See Problem 18-6 for the significance cosh x sinh x 2 - x -. 2 of this computation.) substitution 19. Integrationin ElementaryTerms 385 23. Prove that b lbf(x)dx= f(a+b-x)dx. (A geometric interpretation makes this clear, but it is also a good exercise in the handling of limits of integration during a substitution.) 24. Prove that the area of a circle of radius r is wr2. (Naturally you must remember that x is defined as the area of the unit circle.) 25. Let ¢ be a nonnegative li and such that ¢ -1 integrable function such that ¢(x) 1. For h = = 0 for |x| 1 > 0, let > 1 -¢(x|h). ¢h(x) h = h I ¢h 1• (a) Show that ¢h(x) 0 for |x| > h and that (b) Let f be integrable on [-1, 1] and continuous at 0. Show that = = -h 1 lim h-W J-1 h ¢af = lim haC+ A-h (c) Show that lim h-+0+ /1 h h2 + x2 -1 ¢ëf dx = = f (0). x. The final part of this problem might appear, at first sight, to be an exact analogue of part (b),but it actually requires more careful argument. (d) Let f be integrable on lim [-1, 1] and continuous at 0. Show that h L1 dx xf (0). 2 f (x) i h2 + x Hint: If h is small, then h/(h2 + x2) will be small on most of hoe = [-1, 1]. The next two problems use the formula f (0)2d0, derived in Problem 13-24, for the area of a region bounded by the graph of polar coordinates. 26. f in For each of the following functions, find the area bounded by the graphs in polar coordinates. (Be careful about the proper range for 0, or you will get results!) nonsensical (i) f (0) a sin 0. = (ii) f(0)=2+cos9. 2COS2Ô. (iii) f(0)2 (iv) f (0) a cos 20. _ = 386 lkrivances and Integrals 27. r ·· /(0) Figure 1 shows the graph of f in polar coordinates; the region OAB thus 1 1 f(0)2d0. Now suppose that this graph also happens to be has area - A the ordinary graph of some function g. Then the region OAB also has area area AOxiB o FIGURE + - g area AOxoA. l I I x2 x. Prove analytically that these two numbers are indeed the same. Hint: The function g is determined by the equations X 1 g(X) f(Û)COSÛ, = f(Û)SißÛ. = The next four problems use the formulas, derived in Problems 3 and 4 of the Appendix to Chapter 13, for the length of a curve represented parametrically (and, in particular, as the graph of a function in polar coordinates). 28. Et c be a curve represented parametrically by u and v on [a,b], and let h be an increasing function with h(ä) b. Then on [ä,b] the a and h(b) representation of another voh give parametric h, 6 functions ü uo a the traversed at a different rate. same curve c curve ë; intuitively, ä is just = = = = (a) Show, directlyfrom the definition of length, that the length of c on [a,b] b]. length [ä, (b) Assuming differentiability of any functions required, show that the lengths are equal by using the integral formula for length, and the apequals the of i on propriate substitution. 29. Find the length of the following curves, all described as the graphs of funcwhich is represented tions, except for (iii), parametrically. (i) f (x) = (ii) f (x) = (x2 + 2)3/2, x 3 + O 5 xs 1 5 x 5 2. , (iii) x a3 cos3 t, (iv) f(x) log(cosx), (v) f (x) log x, = y a3 = (vi) f(x) 30. = 3 O 5 t 5 2x. t, x/6. O 5 xs = = 1. 1 Ex 5 e. arcsine2, -log2 5 x 5 0. For the following functions, find the length of the graph in polar coordinates. (i) f(0) (ii) f(0) (iii) f(0) (iv) f (0) (v) f (0) = = = acos0. a(1 cos0). asin20/2. - = 0 = 3 sec 0 0 5 Ð s 2x. 0 s 0 y x/3. 19. Integrationin ElementaryTerms 387 31. In Problem 8 of the Appendix to Chapter 12 we described the cycloid, which has the parametric representation y=v(t)=a(1-cost). x=u(t)=a(t-sint), length of one arch of the cycloid. [Answer: 8a.] (b) Recall that the cycloid is the graph of vo u-1. Find the area under substitution in ff and one arch of the cycloid by using the appropriate 3xa2.] resultant evaluating the integral. [Answer: (a) Find the 32. Use induction and integration by parts to generalize Problem 14-13: n du 33. = If f' is continuous on Lebesgue Lemma for [a,b], f: du f (t)dt . du.. ... integration by parts to prove the Riemann- use b lim Amoo Ï f (t) sin(At)dt = 0. a This result is just a special case of Problem 15-26, but it can be used to prove the same way that the Riemann-Lebesgue Lemma was derived in Problem 15-26 from the special case in which f is a step the general case (inmuch function). 34. The Mean Value Theorem for Integrals was introduced in Problem 13-23. The "Second Mean Value Theorem for Integrals" states the following. Supor pose that f is integrable on [a,b] and that ¢ is either nondecreasing nonincreasing number such that there in b] is b]. Then on [a, a & [a, Ïbf(x)¢(x)dx=¢(a) a b & f(x)dx+¢(b) a ( f(x)dx. In this problem, we will assume that f is continuous and that ¢ is differentiable, with a continuous derivative ¢'. (a) Prove that if the result is true for nonincreasing ¢, then it is also true for nondecreasing ¢. that if the result is true for nonincreasing Prove (b) then it is true for all nonincreasing ¢. Thus, we can assume that ¢ is nonincreasing we have to prove that (c) Prove this (d) Show that is needed. satisfying ¢(b) = ¢(b) = f (x)¢(x) ¢(a) = f (x)dx. by using integration by parts. the hypothesis that ¢ is either nondecreasing 0, 0. In this case, ( b Ï and ¢ or nonincreasing 388 Derivativesand Integrals From this special case of the Second Mean Value Theorem for Integrals, the general case could be derived by some approximation arguments, just as in the case of the Riemann-Lebesgue Lemma. But there is a more instructive way, outlined in the next problem. 35. (a) Given at, (*) albi as and ..., bi, b., let ..., Sk al ‡ = · · ‡ ak. Show that -b3) s1(bi +---+a,b.= b2)+s2(b2 - -+ +·- s.-1(bn-1 - b.) + s.b. This disarmingly simple formula is sometimes called "Abel's formula for summation by parts." It may be regarded as an analogue for sums of the integration by parts formula f'(x)g(x)dx = f(b)g(b) f(a)g(a) - f(x)g'(x)dx, - a especially if we P = (to, Riemann sums (Chapter 13, Appendix). In fact, for a partition [a,b], the left side is approximately use ta) of ..., J'(ty)g(ta_i)(tk (1) - Ik-1). k=1 while the right side is approximately f (b)g(b) f (a)g(a) - f(tk)&'fik)(Ik - Ik-1) ¯ k=1 which is approximately n f (b)g(b) f (a)g(a) - 8(ik) f (ik) - Ik k=1 = g(ik-1) - ik-1 - (ik Ik-1) - f(tk)[a(ik-1)-å(tk) f(b)g(b) f(a)g(a)+ - kl -f(ik) = f(b)g(b) f(a)g(a)+ [f(tk) f(a)]· [g(tk-1) - - k=1 8(Ik)· g(tk-1) k=1 g(b), this works out to be + Since the right-most sum is just g(a) (2) [f(b) f(a)]g(b)+ - - f(a) - [f(tk) f(a)] [g(tk-1)-8(tk)]· - k=1 . If we choose ak = l'fik)(fk ¯ Ik-l), bk = 8(ik-1) then (1) is akbk, k=1 which is the left side of (*),while k sk k l'(fi)(fi = - fi-1) is appmximately i=1 f (t¿) f (t; ) - _i i=1 so n (2) is approximately which is the right side of (*). snbn + L sk(bk k=1 - bk-1). = f (tk) f (a), - 19. Integrationin ElementaryTerms 389 This discussion is not meant to suggest that Abel's formula can actually be derived from the formula for integration by parts, or vice versa. But, as we shall see, Abel's formula can often be used as a substitute for integration by parts in situations where the functions in question aren't differentiable. (b) Suppose {b,}is nonincreasing, that with b, ;> 0 for each n, and that msai+---+a,sM for all n. Prove Abel's Lemma: bim s aibi +- +a,bn s blM. (And, moreover, bkW Økbk nb, ' 5 bgM, formula which only looks more general, but really isn't.) (c) Let f be integrable on [a,b] and let ¢ be nonincreasing on [a, b] with ¢(b) 0. Let P {to, t,} be a partition of [a,b]. Show that the a = = ..., sum f(t¿_i)¢(ti-1)(ti - ti-1) i=1 lies between the smallest and the largest of the sums k f(t¿_i)(t¿--ti-1)- ¢(a) i=1 Conclude that Lb Ï f (x)¢(x) dx lies between the minimum and the maximum of x ¢(a) and that it therefore equals 36. (a) Show that f (t) dt, ¢(a) f (t) dt for some ( in [a, b]. the following improper integrals both converge. i sin x + (i) dx. 1 (ii) sin2 x + (b) Decide which of the (i) sin (ii) sin dx. following improper integrals converge. dx. dx. 390 Derivativesand Integrals 37. the (a) Compute (b) Show that (c) Use the integral (improper) log x dx. the improper integral substitution 2u to show that = x log(sin x) dx converges. x/2 la log(sin x) dx = x/2 2 log(cos x) dx + x log 2. log(sia x) dx + 2 0 0 x/2 (d) Compute log(cos x) dx. relation (e) Using the 38. cos x sin(x/2 = x), compute - log(sin x) dx. Prove the following version of integration by parts for improper integrals: ao ao loo u'(x)v(x)dx=u(x)v(x) u(x)v'(x)dx. - The first symbol on the right side means, of course, lim u(x)v(x) u(a)v(a). -- X->OC *39. One of the most important functions in analysis is the gamma function, f (x) e¯'t*¯1dt. = JO (a) Prove that the improper integral P(x) is defined if x > 0. (b) Use integration by parts (moreprecisely, the improper integral version in the previous problem) to prove that F (x + 1) (c) Show that numbers F(1) = xf = 1, and conclude (x). that F(n) (n = - 1)! for all natural n. The gamma function thus provides a simple example of a continuous function the values of n! for natural numbers n. Of course there are infinitely many continuous functions f with f(n) (n 1)!; there are with continuous infinitely functions even many f(x + 1) xf(x) for all f the x > 0. However, gamma function has the important additional property of that log is convex, a condition which expresses the extreme smoothness of this function. A beautiful theorem due to Harold Bohr and Johannes Mollerup states that F is the only function f with log of convex, f (1) 1 and f(x + 1) xf(x). See the Suggested Reading for a reference. which "interpolates" = - = = = *40. (a) Use the reduction lx/2 O f sin" x dx to show formula for sin" x dx = n "/2 1 - n that sin"*x JO dx. 19. Integrationin ElemenaryTerms 391 (b) Now show that 2 4 6 3 5 7 x 1 3 5 xdx=--------...2246 2n 2n+1 2n 1 2n l"/ /2/2 sin 2.+1 xdx=-·--·--... o sin o - , and conclude that r/2 x 2 224466 i § sin 2n 2n 2n 1 2n + 1 ' 7 dx x =/2 - sin2"** x dx JO the quotient of the two integrals in this expression is between and 1 1 + 1/2n, starting with the inequalities (c) Show that 0 < sin2n+1 x sin < x < sin x for 0 < < x n/2. This result, which shows that the products 224466 1 3 3 5 5 7 2n 2n-1 2n 2n+1 can be made as close to x/2 as desired, is usually written product, known as Wallis' product: as an infinite 224466 2¯133557 x (d) Show also that the products 1 2-4-6-...-2n 1-3-5.....(2n-1) as desired. (This fact is used in the next can be made as close to and 27-19.) problem in Problem **41. It is an astonishing fact that improper integrals f (x) dx can often be b computed in cases where ordinary elementary formula for manay e¯22 integrals f (x)dx cannot. There is no dx, but we can find the value of e¯" dx precisely! There are ways of evaluating this integral, but most require some advanced techniques; the following method involves a fair amount of work, but no facts that you do not already know. 392 Derivativesand Integrals (a) Show that ÏI 2 4 3 5 2n 2n+1 (1-x2)ndx=-·---...o 1 / (1 + x2)" o x 1 3 dx=------... 2 , 2n 2n 2 4 3 2 - - . (This can be done using reduction formulas, or by appropriate tions, combined with the previous problem.) (b) Prove, using the derivative, that 1 -- x2 2 e¯' (c) Integrate < < ¯ e¯" for 0 1 1+x2 < 1. < x substitu- for 0 5 x. the nth powers of these inequalities from 0 to 1 and from 0 to Then use the substitution y x to show that oo, respectively. = 2 4 3 5 2n 2n+1 e¯" < dy e¯T' < dy 0 1 3 2 4 x · 2 (d) Now use **42. 2n 2n - - 3 2 Problem 40(d) to show that Ï I e 2 dy = --. 2 o (a) Use integration by parts to show that b sinx dx -- x b cosb cosa X2 b a COSX dx, a exists. (Use the left side to investigate and conclude that side and right oo.) the limit as a the 0+ for the limit as b (b) Use Problem 15-33 to show that -> fg°°(sinx)/xdx ()tdt sin(n + L" . S1R - t = -> x 2 for any natural number n. (c) Prove that lim xxo Ï" sin(1 + 2 ()t 1 - - t , sln- t dt = 0. 2Hint: The term in brackets is bounded by Problem Riemann-Lebesgue Lemma then applies. - 15-2(vi); the 19. Integrationin ElementaryTerms 393 substitution (d) Use the ¼)tand (A + = u °° sin x x x 2 -dx=-. 43. Given the value of (sinx)/x show that (b)to part dx from Problem 42, compute sin x Ï°° dx x o by using integration by parts. (As in Problem 37, the formula for sin 2x will play an important role.) --- *44. (a) Use the substitution u t2 to show = e¯" f(x)= (b) Find *45. that du. x Jo F(j). (a) Suppose and that that - f (x) is integrable x lim f (x) A and = x->0 on every interval lim f (x) X- [a,b] for 0 < B. Prove that for all a, = a ß < > b, 0 we have " f(ax) f(ßx) dx - (A = - B) log x a N f(ŒX) Hint: To lim x->0 f (x) f( - estimate X) dx use two different substitutions. x (b) Now suppose = instead that X Ja dx converges f (ax) f (ßx) dx - = (ii) > 0 and that Alog ß x the following integrals: (c) Compute e¯"" l°° o°° for all a A. Prove that " (i) ß -. e¯** - X cos(ax) Jo a dx. cos(ßx) - -. dx. X In Chapter 13 we said, rather blithely, that integrals may be computed to any degme of accuracy desired by lower and upper sums. But an applied mathematician, who really has to do the calculation, rather than just talking about the overjoyed of computing doing it, may not be lower sums to evaluate at prospect of an integral to three decimal places, say (adegree accuracy that might easily be needed in certain circumstances). The next three problems show how more refined calculating methods can make the calculations much more efficient. 394 Derivativesand Integrals We ought to mention at the outset that computing upper and lower sums might not even be practical, since it might not be possible to compute the quantities m, and Mi for each interval [ti-1,tt]. It is far more reasonable simply to pick points x; in [t¿_i,t¿] and consider f(x¿) (t¡ - - t¿_i). This represents the sum of the areas i=1 which partially overlap the graph of f-see Figure 1 in the Appendix to Chapter 13. But we will get a much better result if we instead choose the trapezoids shown in Figure 2. of certain rectangles FIGURE 2 Suppose, in particular, that divide we the points [a, b] into n equal intervals, by means of (b-a =a+ih. t¿=a+i n Then the trapezoid with base [t; i, t;] has area f(ti_i)+ f(ti) 2 - (ti t¿_t) - and the sum of all these areas is simply f (11)+ f (a) E,=h h = This method [ 2 + f (t2)+ f (ti) +---+ f(a)+2Lf(a+ih)+ f(b) , h= b - a n an integral is called the trapezoid ru/s. Notice that to the old f(ti); their contribution necessary to recompute of approximating E2, from E, it isn't E2, is just (E.. So in practice it is best to to 46. 2 "¯' obtain approximations f (b) + f (ts-1) 2 to lb f. In compute E2, E4, 28, the next problem we will estimate - - · b f - ÉO $6É E.. (a) Suppose that f" is continuous. Let Pi be the linear function which with f at tt-i and ti. Using Problem 11-43, show that if ni and agrees Ni are 19. Integrationin ElementaryTerms 395 and maximum the minimum I (x = of - f" on ti-1)(x [ti_i,ti] and ti) dx - -1 then A n¿ I -2 2 (b) Evaluate n¿h3 . 12 -l that there is ome c inbl term" (b a)S f"(c)/12n2 varies as 1/n2 obtained using ordinary sums varies as 1/n). Notice that the the eror 3 (f-P¿)< 12 (c) Conclude FIGURE Ni h3 4 ¯ / = 2 I to get -< = N¿I (f-Pi)2--. _, "error - (while We can obtain still more accurate results if we approximate f by quadratic functions rather than by linear functions. We first consider what happens when the interval [a,b] is divided into two equal intervals (Figure 3). 47. 0 and b 2. Let P be the second degree polynomial function which agrees with f at 0, 1, and 2 (Problem 3-6). Show (a) Suppose first that that = a = 2 P (b) Conclude [f (0) + 4 f (1) + f (2)]. = that in the general case Lb lb P= b-a a+b f(a)+4f 6 2 + f(b) . b is a quadratic polynomial. But, remarkably enough, this same relation holds when f is a cubic polynomial! Prove this, using Problem 11-43; note that f'" is a constant. (c) Naturally P = G d f when f The previous problem shows that we do not have to do any new calculations b to compute a + 2 b : we still Q when Q is a cubic polynomial which agrees with f at a, b, and have lb Q= . b-a 6 f(a)+4f But there is much more lee-way in choosing a+b 2 Q, which + f(b) we can use . to our advantage: 396 Derivativesand Integrals 48. there is a cubic polynomial function (a) Show that Q(b)= f(b), Q(a)= f(a), Q' Q (a+b 2 Hint: Clearly Q(x) P(x) + A(x (b) Prove that for every x we have = f (x) Q(x) (x = - - ( a) x - a+b (a+b 2 2 a+b f' = Q satisfying . 2 a)(x - a + b - ( b) x 2 (x 2 - b) a+b for some A. - 2 p(4) 4! for some g in [a,b]. Hint: Imitate the proof of Problem 11-43. (c) Conclude that if f (4) is continuous, then /* a f a+b b-a 6 = f(a)+4f + 2 f(b) - (b-a)5 f(4)(C) 2880 for some c in [a,b]. (d) Now divide [a,b] into 2n intervals by means of the points h= ti=a+ih, b-a . 2n Prove Simþson'smle: f = b - a f (a) + - for some ö in [a,b]. 4 (b f (t2¿_i) + - a)5 2880n* f (4) 2 f (t2;) + f (b) Integral 397 19, Appendix.The Cosmopolitan THE COSMOPOLITAN INTEGRAL APPENDIX. We originally introduced integrals in order to find the area under the graph of considerably more versatile than that. For example, a function, but the integral is Problem 13-24 used the integral to express the area of a region of quite another used to exsort. Moreover, Problem 13-25 showed that the integral can also be as we've seen in Appendix to Chapter 13, a press the lengths of curves-though, consider the general case! This result was problot of work may be necessary to ably a little more surprising, since the integral seems, at first blush, to be a very in quite a two-dimensional creature. Actually, the integral makes its appearance will present in this Appendix. To derive these few geometric formulas,which we formulas we will assume some results fom elementary geometry (andallow a little fudging). objects, we'll begin by tackling some Instead of going down to one-dimensional three-dimensional ones. There are some very special solids whose volumes can of revolution," be expressed by integrals. The simplest such solid V is a > of region under 0 on [a, b] around the graph obtained by revolving the f the horizontal axis, when we regard the plane as situated in space (Figure 1). (to,--.,In} is any partition of [a,b], and m¿ and My have their usual If P j "volume FIGURE i = meanings, then xm,2 x I ,i a / i t .. \i It i is the volume of a disc that lies inside the solid V (Figure 2). Similarly, xMi2(t; ti-1) is the volume of a disc that contains the part of V between t;_; and ti. Consequently, - & s n x FIGURE n m¿2(ti - ti-1) M¿2(t¿ 5 volume V 5 x ti i). i=1 i=·l 2 - But the sums on the ends of this inequality are just the lower and upper sums for f2 on [a,b]: x Vt . L( f2, P) 5 volume V 5 x - U( f , P). Consequently, the volume of V must be given by volumn This FIGURE 3 method V = x f (x)2dx. of finding volumes is affectionately referred to as the "disc method." ægion Figure 3 shows a more complicated solid V obtained by revolving the under the graph of f around the verticalaxis (V is the solid left over when we start with the big cylinder of radius b and take away both the small cylinder of radius a and the solid Vi sitting right on top ofit). In this case we assume a > 0 as well 398 Derivativesand Integrals as f 0. Figures 4 and 5 indicate some other possible shapes for V. > FIGURE 4 FIGURE For a partition P rectangle with base = 5 "shells" obtained by rotating the t.} we consider the [t;_i,t¿] and height mi or My (Figure 6). Adding the volumes {to, .. . , of these shells we obtain m;(t¿2 x ..- ti-12) - volume < V < ' ' which we can write ti-1) - < volume V < M¿(t¿+t¿_i)(t¿ x i=l FIGURE ti-12 as m;(t¿+ti-i)(t¿ x - i-l i=1 . M¿(tg2 x - ti-1). i=I Now these sums are not lower or upper sums of anything. But Problem 1 of the Appendix to Chapter 13 shows that each sum 6 -ti_i) -t¿_i) and mit¿(t; mit¿_i(t; i=1 i=l b can Ï be made as close as desired to small enough. xf(x)dx by choosing the lengths t¿ - t;_i The same is true of the sums on the right, so we find that b volume V 2x = xf (x)dx; i "shell finding volumes. The surface area ofcertain curved regions can also be expressed in terms ofintegrals. Before we taclde complicated regions, a little review of elementary geometric formulas may be appreciated here. Figure 7 shows a right pyramid made up of triangles with bases of length l and altitude s. The total surface area of the sides of the pyramid is thus this is the so-called FIGURE method" of 1 2 7 -p5, where p is the perimeter of the base. By choosing the base to be a regular polygon 19, Appendix.The Cosmopolitan Integral 399 with a large number must be of sides we see that the area of a right circular 1 2 cone (Figure 8) -(2xr)s = xrs, "slant height." Finally, consider the frustum of a cone with slant s is the height s and radii ri and r2 shown in Figure 9(a). Completing this to a cone, as in Figure 9(b), we have where - ,-- N r FIGURE si 8 si + s -- - ri r2 SO ris si= s r25 si+s= , r2 . r2 ri - ri -- Consequently, the surface area is * r. Kr2(Si + S) 2 Es, Krisi - 2 __ = = r2 Es(rl ‡ r2)- ri -- (a) Now consider the surface formed by revolving the graph of f around the horizontal axis. For a partition P (to, t,,) we can inscribe a series of frusta of cones, as in Figure 10. The total surface area of these frusta is = ..., £1 x [f (ti-1)+ f (ti)] (ti -- ti-1)2 + [f (ti) f (ti-1)]2 - r: = x 1+ [f(t¿_i)+ f(t¿)] f(ti) ¿-1) - 2 r, By the Mean Value Theorem, this is (b) FIGURE 9 r(Xi) - É Q i=l for some x; in (ti_i,ti). Appealing to Problem 1 of the Appendix to Chapter 13, we conclude that the surface area is f(x)Jl + 2x f'(x)2 dx. PROBLEMS i y 1. the volume of the solid obtained by revolving the region bounded the graphs of f (x) x and f (x) x2 around the horizontal axis. by (b) Find the volume of the solid obtained by revolving this same region around the vertical axis. (a) Find = / FIGURE lo 2. Find the volume of a sphere of = radius r. 400 Derivativesand Integrals 3. When the ellipse consisting of all points (x, y) with x2þ2 rotated around the horizontal axis we obtain an (Figure 11). Find the volume of the enclosed solid. "ellipsoid FIGURE 4. (x - 1 is of revolution" 11 Find the volume of the - + y2 b2 a)2 + y2 - b2 (a > "torus" (Figure 12), obtained by b) around the vertical axis. rotating the circle b / a FIGURE 5. 12 A cylindrical hole of radius a is bored through the center of a sphere of radius 2a (Figure 13). Find the volume of the remaining solid. FIGURE 13 19, Appendix.The Cosmopolitan Integral 401 6. (a) For the solid shown in Figure 14, find the volume by the shell method. COS FIGURE 14 Write down can also be evaluated by the disc method. the integral which must be evaluated in this case; notice that it is more complicated. The next problem takes up a question which this might suggest. (b) This 7. volume Figure 15 shows a cylinder of height b and radius f (b), divided into three solids, one of which, V1, is a cylinder of height a and radius f (a). If f is one-one, then a comparison of the disk method and the shell method of computing volumes leads us to believe that b abf(b)2-und2-x Ï f(x)2dx= volume V2 yf¯1(y)dy. =2x (a) Prove this analytically, using the formula for f¯1 from Problem 19-15. f(b) FIGURE B A 8. FIGURE 15 16 16 shows a solid with a circular base of radius a. Each plane perpendicular to the diameter AB intersects the solid in a square. Using similar to those already used in this Appendix, express the arguments VOlurie of the solid as an integral, and evaluate it. (b) Same problem if each plane intersects the solid in an equilateral triangle. (a) Figure 402 Derivativesand Integrals 9. Find the volume of a pyramid (Figure 17) in terms of its height h and the area A of its base. F1GURE 10. FIGURE 18 17 volume of the solid which is the intersection of the two cylinders the in Figure 18. Hint: Find intersection of this solid with each horizontal Find the plane. FIGURE I9 11. that the surfàce area of a sphere of radius r is 47tr2. (b) Prove, more generally, that the area of the portion of the sphere shown in Figure 19 is 2xrrh. (Notice that this depends only on h, not on the position of the planes!) 12. (a) Find (b) Find 13. The graph of (Figure 20). (a) Prove the surface area of the ellipsoid of revolution in Problem 19-3. the surface area of the torus in Problem 19-4. (a) Find the f (x) volume (b) Show that = 1/x, x :> l is revolved amund the horizontal axis of the enclosed "infinite trumpet." the surface area is infinite. that we fill up the trumpet with the finite amount of paint found (c) Suppose in part (a).It would seem that we have thereby coated the infinite inside surface area with only a finite amount of paint! How is this possible? FIGURE 20 PART INFINITE SEQUENCES AND INFINITE SERIES One of the most remarkable serits of algebraic analysis is the following: m(m 1+-x+m 2 1 2 - m (m ‡ m(m + - - 1) (m 1·2-3 1) · - - · ) x2 2) - x= ‡ [m (n - - 1) - x 1-2--...·-····» ‡ - • When m is a positivewhole number th¢ sum of the series, which is thenfinite, can he expressed, as is known, by (1 + x) When m is not an integer, the series goes on lo infmity,and it will conwrge or divergeamordin.g er the quantities m and x havethis or that value. In this care, one writer the same equality . m 1 (1‡x)=-1+-x m(m ‡ . . . - 1) -2 1 x2‡·-·«¾ It is assumed that numerical equality will always ocur the whenever the series is cetrgent, but this has neveryet beenproved. NŒLS HENRIK ABEL · APPROXIMATION BY POLYNOMIAL FUNCTIONS CHAPTER There is one sense in which the If p is a polynomial function, "elementary functions" are not elementary at all. p(x)=ao+alx+··-+anxn then p(x) can be computed easily for any number x. This is not at all true for 1/t dt approximately, functions like sin, log, or exp. At present, to find log x we must compute some upper or lower sums, and make certain that the error involved in accepting such a sum for log x is not too great. Computing ex = f,' - log¯'(x) would be even more difficult: we would have to compute log a for many x-then values of a until we found a number a such that log a is approximately a would be approximately e2. In this chapter we will obtain important theoretical results which reduce the computation of f (x), for many functions f, to the evaluation of polynomial functions. The method depends on fmding polynomial functions which are close approximations to f. In order to guess a polynomial which is appropriate, it is useful to first examine polynomial functions themselves more thoroughly. Suppose that p(x)=ao+aix+-·-+anx". It is interesting, and for our purposes very important, to note that the coefficients a¿ can be expressed in terms of the value of p and its various derivativesat 0. To begin with, note that p(0) = ao- Differentiating the original expression for p(x) yields p'(x)=ai+2a2x‡·--‡nanXn-1 Therefore, p'(0) = pfil(0) = at. Differentiating again we obtain Yn(n-1)-anxn-2 p"(x)=2a2+3-2-43x+-- Therefore, p"(0) = p*(0) 2a2. = In general, we will have p(k) (0) 405 (k = k! ak ¯ Or Gk 406 and IrginiteSenes InfiniteSequences If we agree to define 0! 1, and recall the notation p(0) p, then this formula holds for k 0 also. in (x If we had begun with a function p that was written as a "polynomial = = = -a)," p(x) = + at (x ao a) + - + an(x -·- a)n - then a similar argument would show that p(k) Øk = ° kl Suppose now that f is a function (notnecessarily a ff)(a),..., ff">(a) all exist. Let f(k) ak= and such that Oskšn, , k! polynomial) define Pn,a(x) ao + at = (x a) + - - - - + an(x - a)". The polynomial Pn.. is called the Taylor polynomial of degree n for f at a. (Strictly speaking, we should use an even more complicated expression, like Pn,a,f, to indicate the dependence on f; at times this more precise notation will be useful.) The Taylor polynomial has been defined so that Pn,a(k)(a) f(k)(a) for 0 sks = n; in fact, it is clearly the only polynomial ofdegree sn with this property. Although the coefficients of P.,a,y seem to depend upon f in a fairly complicated way, the most important elementary functions have extremely simple Taylor polynomials. Consider first the function sin. We have sin(0) sin'(0) 0, = = cos O 1, 0, = -sin0 sin"(0) sin'"(0) sin(4) = = -1, cos 0 = sin 0 = 0. = (0) - = From this point on, the derivatives repeat in a cycle of 4. The numbers at are sin >(0) = k! 1 1 0, 91 7! Therefore the Taylor polynomial P2n+1,o of degree 2n + 1 for sin at 0 is 0, 1, 0, ---, 1 1 0, 0, 3! 5! -, 3 P2n+i,o(x)=x--+ (Of course, P2n+I,0 = xxx 3! 22n+2,0)· 5 5! - -, .... 7 7! -, +···+(-1)" x 2n+1 - (2n + 1)! 20. Approximation byPolynomialFunctions 407 The Taylor polynomial P2.,o of degree 2n for cos at 0 is (thecomputations aœ left to you) 4 x2 P2n.o(x) = 1 - + - 2! gh 6 - - + - 4! 6! - - (-1)" + - . (2n)! The Taylor polynomial for exp is especially easy to compute. Since expf*)(0) exp(0) 1 for all k, the Taylor polynomial of degree n at 0 is = = x2 x Pn,o(x)=1+-+-+-+---+---+-. l! 2! x3 4 n 3! 4! n! The Taylor polynomial for log must be computed at some point a log is not even defined at 0. The standard choice is a = 1. Then log'(x) 1 = -, log'(l) = log"(1) = ¢ 0, since 1; x log"(x) 1 = - -1; -,, x log'"(x) 2 = log"'(l) --, 2; = x in general (-Î)k-lk(k 1)!, - log(k) __ k-1(k log(k)() - 1)!. Therefore the Taylor polynomial of degree n for log at 1 is (x 1)2 + (x 1)3 +---+ (-1)"¯1 , yn Pn,1(x)=(x-1)2 3 n It is often more convenient to consider the function f (x) log(1 + x). In this 0. We have case we can choose a - _ - = = ff >(x) = log(k) f (k)(0) = log(k) SO k-1 (k 1)! - . Therefore the Taylor polynomial of degree n for f at 0 is x2 P.,o(x) = x - - X4 x3 + - - - n-1Xn _¡ +···+ 3 4 2 n There is one other elementary function whose Taylor polynomial is importantarctan. The computations of the derivatives begin arctan'(x) = 1 1 + x2 arctan'(0) = 1; arctan"(0) = 0; -2x arctan"(x) arctan'"(x) = = (1 + (1 + x2)2 x2)2 , + 2x 2(1 + x2) 2 (1 + x2)4 (-2) . . , arctan"'(Q) and InfmiteSeries 408 InfiniteSequences It is clear that this brute force computation will never do. However, the Taylor polynomials of arctan will be easy to find after we have examined the properties of Taylor polynomials more closely-although the Taylor polynomial Pn,a,y was simply defined so as to have the same first n derivatives at a as f, the connection between f and P.,4, y w°a actua%y turn out tohe much Aceper. One line of evidence for a closer connection between f and the Taylor polynomials for f may be uncovered by examining the Taylor polynomial of degree 1, which is P1..(x) = f (a) + f'(a)(x a). - Notice that f(x) - PI,a(x) f(x) = x-a - f(a) f'(a). - x-a Now, by the definition of f'(a) we have . hm x a f (x) =0. x-a P2.o(x) FIGURE P1,a(X) - = 1+ x + x" I In other words, as x approaches a the difference f(x) small, but actually becomes small even compared to x graph of f(x) - - Pi,a(x) not only becomes a. Figure 1 illustrates the ex and of = PI,o(x) = f (0) + J'(0)x 1 +x, = which is the Taylor polynomial of degree 1 for f at 0. The diagram also shows the graph of P2,o(x) which = f (0) + f'(0) + f"(0) x2 21 x2 = 1+ x + , is the Taylor polynomial of degree 2 for f at 0. As x approaches 0, the difference f(x) P2,0(x) seems to be getting small even faster than the difference - 20. Approximation byPolynomialFunctions 409 Pi,o(x). As it stands, this assertion is not very precise, but we are now prepared to give it a definite meaning. We have just noted that in general f (x) - f (x) lim x-a For f (x) = e2 and a PI,a - - x (x) 0. = a 0 this means that = . hm eX-1-x f(x)-Pl.o(x) . hm = x->o 0. = x-o x x On the other hand, an easy double application ofTHôpital's Rule shows that e2-1-x lim 1 = ¢ 0. - x2 x-o 2 Thus, although f (x) Pi,o(x) becomes small compared to x, as x approaches 0, it does not become small compared to x2. For P2.o(x) the situation is quite difTerent; the extra term x2/2 provides just the right compensation: - x e2-1-x-- lim 2 2 x2 x->0 ex lim = x-so and f (x) (x x-va in fact, the analogous THEOREM l f"(a) exist, P2,a(X) - lim = a)2 -- 0; assertion for Pn,a is also true. Suppose that f is a function for which f'(a),..., ff">(a) all exist. Let ag= and f(k) , k! 05ksn, define Pn,a(x)=ao+al(x-a)+---+a.(x-a)". Then . Inn x-a f (x) Pn,a(x) -- (x = - a)" x - = 2 x-vo f'(a) l 2x ex-1 lim = This result holds in general-if' --- 0. 0. then and InfiniteSeries 410 InfiniteSequences PROOF Writing out Pn,a(x) explicitly, we obtain f(x)-" f(x) f (a)(x-a)' Pn,a(x) - il i=0 f(")(a) ¯ ¯ (x a)" - (x n! a)" - It will help to introduce the new functions n-1 Q(x) f (a) (x 2 = a)' - and g(x) (x = ' i=0 - a)"; now we must prove that f(x) f(">(a) Q(x) - n! g(x) x->a Notice that Q"W)--ff')(a), g(k)(X) = NÎ(X k_<n-1, a)n-k/(n - k)! - . Thus Q(x)]= f(a)- Q(a)=0, Q'(x)]= f'(a) Q'(a)=0, lim[f(x)- X->d lim[f'(x) lim[ f - - "¯©(x) Qi"¯©(x)] f(n-2) = - (n-2)(a) _ = 0. and lim g(x) lim g'(x) = x- a = - - (x) hm f Q(x) - . (x x-a Since Q is a polynomial of degree n (.-1) fact, Q("¯1; (a). Thus x- a f (x) Q(x) and this (x - a)n - 0. 1 times to obtain f("¯1)(x) g(n-1) n! (x a) 1, its (n - . = . - - lim = _ lim = a)" -- lim g(n-2)(x) x-va We may therefore apply l'Hôpital's Rule n x->a = - x->a lun x- a - ff"¯1) n! 1)st derivative is a constant; (n-1)(a) _ (x in - a) last limit is ff")(a)/n! by definition of ff">(a). One simple consequence of Theorem 1 allows us to perfect the test for local and minima which was developed in Chapter 11. If a is a critical point of f, then, according to Theorem 11-5, the function f has a local minimum 0 no at a if f"(a) > 0, and a local maximum at a if f"(a) < 0. If f"(a) sign of might conclusion was possible, but it is conceivable give that the f"'(a) maxima = by PolynomialFunctions 411 20. Approximation 0, then the sign of f(4)(a) further information; and if f'"(a) significant. Even more generally, we can ask what happens when = f'(a) (*) f f"(a) = f">(a) = - - f = - ("¯1)(a) = 0 might be 0, = ¢ 0. The situation in this case can be guessed by examining the functions f(x) g(x) (x = a)", - --(x = a)", - which satisfy (*). Notice (Figure 2) that if n is odd, then a is neither a local minimum maximum the other point On local for hand, if n is nor a f or g. with nth minimum at a, while derivative, has a local a positive even, then f, g, with a negative nth derivative, has a local maximum at a. Of all functions satisfying these are about the simplest available; nevertheless they indicate the situation exactly. In fact, the whole point of the next proof is that any general looks very much like one of these functions, in a sense that fimction satisfying (*), (*) is made precise by Theorem 1. THEOREM 2 Suppose that f'(a)=···= ff"¯1)(a)=0, f(")(a) ¢ 0. is even and f("3(a)> 0, then f has a local minimum at a. (2)If n is even and f("3(a)< 0, then f has a local maximum at a. (3)If n is odd, then f has neither a local maximum nor a local minimum (1)If PROOF n at a. There is clearly no loss of generality in assuming that f (a) 0, since neither the hypotheses nor the conclusion are affected if f is replaced by f f (a). Then, since the first n 1 derivatives of f at a are 0, the Taylor polynomial Pn,a of f is = - - P..a(x) (a) f (a) + = n odd (n) f'(a) 1! (x (x = n! - - a) + - - - + f (") (a) n! (x - a)" a)". Thus, Theorem 1 states that 0 I = lim f(x) (x x-a 6 Pn,a(x) - = - a)= f(x) lim (x x-a - a)" - f("'(a) . n! Consequently, if x is sufficiently close to a, then f(x) ¯ (b) FIGURE a even 2 U , has the same sign as f(">(a) . Suppose now that n is even. In this case (x a)" > 0 for all x ¢ a. Since has the same sign as ff"3(a)/n!for x sufficiently close to a, it follows f(x)/(x - -a)" and hrfiniteSeries 412 InßniteSequences f"(a)/n! for itself has the same sign as this means that > 0, ff">(a) that f(x) f(x) fOr X close to a. Consequently, the case f(n)(a) < 0. -Le JM= 0, f 0 > = If has a local minimum at a. A similar proof works that n is odd. The same argument sufficiently close to a, then (a) close to a. f(a) for Now suppose x=0 x sufficiently f (x) (x a)" as before shows that if x is has the same sign. always --- But (x a)" > 0 for x > a and (x a)" < 0 for x < a. Therefore f (x) has diferent signs for x > a and x < a. This proves that f has neither a local maximum nor a local minimum at a. - - /W - 0, x=0 Although Theorem 2 will settle the question of local maxima and minima for just about any function which arises in practice, it does have some theoretical limitations, because ff*)(a)may be 0 for all k. This happens (Figure 3(a))for the function e-1/x2, 0 x f (x) 0, 0, x (b) -/ = = which has a minimum at 0, and also for the negative of this which has a maximum at 0. Moreover (Figure 3(c)), if function (Figure 3(b)), 1¡x1¡P f (x) = O (x) = e f(k)(0) = 0 for all k, but f has neither a local minimum The conclusion FIGURE cept of a at a "order nor a local maximum in terms of an important conand g are equal up to order n of Theorem 1 is often expressed of equality." Two fimetions if f (x) x-a (x f g(x) -- = a)" - 0. In the language of this definition, Theorem 1 says that the Taylor polynomial Pn,a.y equals f up to order n at a. The Taylor polynomial might very well have been designed to make this fact true, because there is at most one polynomial of degree 5 n with this property. This assertion is a consequence of the following elementary THEOREM s theorem. Let P and Q be two polynomials in (x a), of degree 5 n, and suppose that P and Q are equal up to order n at a. Then P Q. - = PROOF Let R = P - Q. Since R is a polynomial of degree 5 n, it is only necessary to 20. Approximation by PolynomialFunctions 413 prove that if R (x) bo + = · - + b,,(x - a)" - satisfies R(x) lim = (x x->a then R x->a = R (x) . (x - 0 = a) for 0 0 this condition reads simply lim R(x) lim R(x) lim [bo+ bi (x = = < n. 0; on the other hand, = a) + - i < - - - + b.(x - a)"] bo. = Thus bo 0, 0. Now the hypotheses on R surely imply that = lim For i a)" - 0 and R(x)=bi(x-a)+-·-+b.(x-a)". Therefore, R(x) x - =bi+b2(x-a)+--·+b.(x--a)"'¯I a and lim 24. Thus bi = R(x) = x - bi. a 0 and R(x) = b2(x - a) + -·- + b,(x -- a)". Continuing in this way, we find that bo=---=b.=0. conollAny be n-times differentiable at a, and suppose that P is a polynomial in (x a) of degree 5 n, which equals f up to order n at a. Then P P.,,,7. Let f - = PROOF Since P and Pn,a,I both equal f up to order n at a, it is easy to see that P equals Pra,a,ÿ by the Theorem. P.,a, ÿ up to order n at a. Consequently, P = At first sight this corollary appears to have unnecessarily complicated hypotheses; it might seem that the existence of the polynomial P would automatically imply that f is sufficiently differentiable for Pn,a,y to exist. But in fact this is not so. For example (Figure 4), suppose that xn+1, f (x) = 0, x irrational x rational. 0, then P is certainly a polynomial of degree 5 n which equals f up to order n at 0. On the other hand, f'(a) does not exist for any a ¢ 0, so f"(0) is If P(x) = undefined. and InfiniteSeries 414 InfiniteSequences O ,* •, ..• •., FIGURE x", x irrational o, x rational 4 When f does have n derivatives at a, however, the corollary may provide a useful method for finding the Taylor polynomial of f. In particular, remember that our first attempt to find the Taylor polynomial for arctan ended in failure. 2 The equation 1 arctan x dt 1 + t2 = a promising method of fmding a polynomial t2, by 1 + to obtain a polynomial plus a remainder: suggests 1 1+t2 = 1 - 4 t2 6 _ n 2n _ close to arctan-divide (-1)"*E t 1 *2 1+t2 both sides by 1 + t2, This formula, which can be checked easily by multiplying shows that arctanx = t2 + t4 Lx 1 - x - - 3 + - x dt + (-1)"** - - - - 5 + (-1)" 2n+1 + th+2 1+t2 72n+1 5 x3 = n 2n _ dt x 12n+2 (-1).+1 l+t2 dt. According to our corollary, the polynomial which appears here will be the Taylor polynomial of degree 2n + 1 for arctan at 0, provided that x t2n+2 1 + t2 . lun dt 0. = X2n+l x->O Since x t2n+2 1+t2 x dt 5 2n+3 t2"*2 dt = 2n+3 , this is clearly true. Thus we have found that the Taylor polynomial of degree 2n + 1 for arctan at 0 is = x - -- 3 x2"** x5 x3 P2n+1,o(x) + - - 5 - -. + (-1)" 2n+1 . 20. Approximation by PolynomialFunctions 415 By the way, now that we have discovered the Taylor polynomials of arctan, possible to work backwards and find arctant*)(0) for all k: Since P2n+1,o(x) and since this arctant = x x - 3 + - 3 x 5 - - - - · 5 x 2n+1 (-1)" 2n+1 + it is , polynomial is, by definition, arctan(2)(0) 3(0) + arctant)(0) we can find arctan(*) + + 21 simply (0) by arctan***)(0) x2 - + -. x (2n + 1)! of xk equating the coefficients in **, tÏ1eSe LWO polynomials: arctan(k) (0) = kl arctan(2t+l)(0) (-1)' _ or 21 + 1 (21+ 1)! 0 if k is even, arctan(2'*)(0) = (-1)' (2l)!. - A much more interesting fact emerges if we go back to the original equation x3 arctan x = and remember x - - 3 x5 + - x2n+1 - - - - 5 + (-1)" 2n+1 + (-1)"** x 72.42 1+t2 dt, the estimate t2n+2 x 1+t2 |X dts |2n+3 . 2n+3 When |x| 5 1, this expression is at most 1/(2n + 3), and we can make this as small as we like simply by choosing n large enough. In other words, for |x| 5 1 the Taylorpolynomials we can use for arctan to compute arctanx as accurately as we like. The most important theorems about Taylor polynomials extend this isolated result to other functions, and the Taylor polynomials will soon play quite a new role. The theorems proved so far have always examined the behavior of the Taylor polynomial Pn,aforfixed n, as x approaches a. Henceforth we will compare Taylor polynomials P..a for fixedx, and different n. In anticipation of the coming theorem we introduce some new notation. If f is a function for which P.,, R..a(x) by f (x) = (x) exists, define the remainder term P.,a(x) + R.,a(x) f (">(a) -a)+··-+ = we f(a) + f'(a)(x n! -a)"+ Rn,a(x)- (x We would like to have an expression for Rn,a (x) whose size is easy to estimate. There is such an expression, involving an integral, just as in the case for arctan. One way to guess this expression is to begin with the case n 0: = f(x) = f (a) + Ro,a(x)- 416 and infmite Senes InfmiteSequences The Fundamental Theorem of Calculus enables us to write X f (x) f (a) + = Ï f'(t) dt, a so that Ro,a(x) lx f'(t) dt. = A similar expression for RI,a(x) can be derived from this formula tion by parts in a rather tricky way: Let u(t) that (notice x represents f'(t) = some and v(t) = t using integra- x - fixed number in the expression for v(t), so v'(t) 1); = then f'(t) dt f'(t) = 1 dt - v'(t) u(t) x x =u(t)v(t) Since v(x) f"(t)(t-x)dt. - 0, we obtain = X f (x) f = (a) + Ï f'(t) dt X Ï f"(t)(x-t)dt Ï f"(t)(x f(a)-u(a)v(a)+ = x f (a) + f'(a)(x = a) + - Thus R1,a(x) f"(t)(x = Ja t) dt. - t) dt. - It is hard to give any motivation for choosing v(t) t x, rather than v(t) t. It just happens to be the choice which works out, the sort of thing one might discover after sufRciently many similar but futile manipulations. However, it is now easy to guess the formula for R2,a(x). If = t)2 -(x - u(t) then v'(t) = (x - = f"(t) and = , 2 t), so x x Ï v(t) = - f"(t)(x - t)dt = = u(t)v(t) f"(a)(x - This shows that x R2,a(X) = _ f"'(t) a)2 x (3 (x You should now have little difficulty giving a - dt - 2 r"(t) + 2 g)2 _(, x - 2 (x - t)'dt. t)2dt. rigorous proof, by induction, that byPolynomialFunctions 417 20. Approximation if f (n+13 is continuous on then [a,x], Rn,a(x) lx = (n+1)(t) (x - t)" dt. From this formula, which is called the integral form of the remainder, it is possible (Problem 15) to derive two other important expressions for Rn,a(x): the Cauchy form of the remainder, Rn.a(x) f(n+1)(t) = n! (x t)"(x - for some t in (a, x), a) - Lagrange form of the remainder, and the Rn,a(x) = f (n+1) (x (n + 1)! - a)"** for some t in (a, x). In the proofof the next theorem (Taylor's Theorem) we will derive all three forms of the remainder from in an entirely different way. One virtue of this proof (aside of and remainder cleverness) the that the the forms is fact Cauchy Lagrange its will be proved without assuming the extra hypothesis that f("**)is continuous. In this way Taylor's Theorem appears as a direct generalization of the Mean Value Theorem, to which it reduces for n 0, and which is the crucial tool used in the proof. These remarks may suggest a strategy for proving Taylor's Theorem. Since Rn.a(a) 0, we might try to apply the Mean Value Theorem to the expression = = Rn,a(x) Rn,a(x) x-a - Rn,a(a) x-a On second thought, however, this idea does not look very promising, since it is not at all clear how f ("**) (t) is ever going to be involved in the answer. Indeed, if we take the most straightforward route, and differentiate both sides of the equation which defines R.,a, we obtain f'(x) = f'(a) + f"(a)(x a) +·- - - + f (">(a) (x (n 1) - a)n-1 + R.,a'(x), - is useless. The proper application of the Mean Value Theorem has a lot in common with the integration by parts proof outlined above. This proof involved the derivative of a function in which x denoteda number which was fixed.This is just how x will be treated in the following proof. which THEOREM 4 (TAYIDR'S THEOREM) Suppose that f', . . . , f(n+1)are defined -a)+-·-+ f(x)= f(a)+ f'(a)(x on [a,x], and that R.,,(x) f (">(a) n! (x -a)"+R.,a(x). is defmed by 418 and InfmiteSenes Infinite Sequences Then (=+1) Rn,a(x) = (2) Rn,a(x) = (1) (x n! (x Moreover, if f(n+1)is integrable on Rn,a(x) Ï = -- a)"+1 - for some t in (a,x). a) for t in some (a, x). then [a,x], (n+1)(t) x (3) t)"(x - (x n! a t)" dt. - a, then the hypothesis should state that f is (n + 1)-times differentiable on a]; number t in (1)and (2)will then be in (x, a), while (3)will remain true the [x, stated, provided that f(n+1)is integrable on [x,a].) as (If x PROOF < For each number t in [a,x] we have f (x) f (t) + f'(t)(x = Let us denote the number we have t) + - - - - + f (">(t) (x n! - t)" + R.,,(x). R.,,(x) by S(t); the function S is defined on [a,x], and f(">(t) -t)+---+ (*) f(x)= f(t)+ f'(t)(x n! (x -t)"+S(t) for all t in [a,x]. We will now differentiate both sides of this equation, which asserts the equality of two functions: the one whose value at t is f (x), and the one whose value at t is f (">(t) f(t)+.--+ n! (x-t)"+S(t). "as parlance we are considering both sides of (*) a function of t".) make notice that if sure that the letter x causes no confusion, Justto (In common g(t) f (x) = for all t, then g'(t) and 0 = for all t; if g(t) f(k)(t) = k! k ' then gr(7) Í(k) = k(x k! = - f (k) (k - - t)k-1 (k+1) (X - kl (k+l) 1)! (x -- t)k (X k! - f)k 1)& 20. Approximation by PolynomialFunctions 419 Applying these formulas to each term of 0 = f'(t) + - f'(t) f"(t) + obtain (*),we f(3)(t) "¯' f"(t) - 1! 1! 2 2! -f">(t) + - - - + (n 1). (x - In this beautiful formula practically everything S'(t) = (x n! f"*0(t) + (x n! sight in f (n+1) - i t)"" - - t)" cancels out, and we obtain t)". - Now we can apply the Mean Value Theorem to the function S on is some t in (a, x) such that S(x) S(a) - S'(t) = x = f (n+1) - (x n! a - + S'(t). t) - [a,x]: there . Remember that S(t) = R..,(x); this means in particular that S(x) S(a) R.,x(x) Rn,a(x). = = 0, = Thus 0 f'" Rn.a(x) -- =- "(t) (x-t) n! x-a . or R.,a(x) ff"**)(t) = (x n! t)"(x - a); - this is the Cauchy form of the remainder. To derive the Lagrange form we apply the Cauchy Mean Value Theorem the functions S and g(t) = (x t).+1: there is some t in (a, x) such that - f (n+1) - S(x) - S(a) n! S'(t) (x Thus _ f(n+1)(t) ¯ (n + 1)! - or Rn,a(x) which is the Lagrange form. = ff"**)(t) (x (n + 1)! t)" ° -(n+1)(x-t)= g(x)-g(a) Rn,a(x) (x a).41 - - a)"**, to and InßniteSeries 420 InßniteSequences Finally, if f (n+1) is integrable on [a,x], then x S(x) - S(a) Ï = (n+1)(t) x S'(t) = ni a Of R..a(x) f("+0(t) L' = (x - n! (x t)"dt - t)" dt. - Although the Lagrange and Cauchy forms of the remainder are more than theoretical curiosities (see,e.g., Problem 23-18), the integral form of the remainder will usually be quite adequate. If this form is applied to the functions sin, cos, and 0, Taylor's Theorem yields the following formulas: exp, with a = x 3 sinx=x--+--...+(-1)" x 3! 5 x 5! 2n+1 (2n + 1)! x X2n x2 x4 cosx=1--+-----+(-1)" 4! 2! x2 t)2.+1dt, + dt, (x-t) (2n)] o xt n -(x-t)"dt. n! 2! - x COS(2n+1) (2n)! e2=1+x+-+---+-+ sin(2n+2) (x (2n + 1)! + n! To evaluate any of these integrals explicitly would be supreme foolishness-the answer of course will be exactly the difTerenceof the left side and all the other terms on the right side! To estimate these integrals, however, is both easy and worthwhile. The first two integrals are especially easy. Since " | sin we (t)| 5 1 for all t, have x (2n+2)(t) (2n + 1)! (x dt t) - 2(x 1 < - ¯ (2n + 1)! t) Since _ fx (x - t) 2n+1 dt _ 2n+2 t=x = 2n + 2 t=0 x2n+2 ¯2n+2 we conclude that x gg (2n+2)(t) (x (2n + 1)! - t) gy dt 5 |x|2n+2 . (2n + 2)! ** dt . 20. ApproximationbyPolynomialFunctions 421 Similarly, we can show that x 2.+1 cos(2n+1) (x (2n)! - dt t) < . ¯¯ (2n+ 1)! These estimates are particularly interesting, because (asproved in Chapter 16) for any e > 0 we can make Xn n! by choosing n large enough (howlarge n must be will depend on x). This enables sin x to any degree of accuracy desired simply by evaluating the pmper Taylor polynomial Pa.o(x). For example, suppose we wish to compute sin 2 with an error of less than 10-4. Since us to compute sin 2 where P2n+1,0(2) + R, = we can use P2n+l,o(2) as our answer, |R| 5 22n+2 (2n + 2)! , provided that 22n+2 (2n + 2)! 10¯*. < search-it obA number n with this property can be found by a straightforward viously helps to have a table of values for n! and 2" 428). this In case it (seepage works, that that happens 5 n so = sin 2 = Pi l.o (2) + R 23 25 27 29 211 31 5! 7! 9! 10-4 11! =2---+---+----+R where |R < ' It is even easier to calculate sin 1 approximately, since sin 1 = P2n+1,o(1) + R, where |R| < 1 . (2n + 2)! To obtain an error less than s we need only find an n such that 1 (2n + 2)! ' and this requires only a brief glance at a table of factorials. (Moreover, the inditerms of P2n+1,o(1) will be easier to handle.) For very small x the estimates will be even easier. For example, vidual sin To obtain with n = - 1 10 = P2,+1,o IRI < 10¯1o we - 1 + R, 10 where |R| < 1 102n+2(2n+ 2)! . 4 (andwe could even get away can clearly take n used actually 3). These are to compute tables of sin and cos. A high-speed computer can compute P2, i,o(x) for many different x in almost no time at all. methods = 422 and InfiniteSenes InßniteSequences Estimating the remainder for e2 is only slightly harder. For simplicity assume > that x 0 (theestimates for x < 0 are obtained in Problem 10). On the interval [0,x] the maximum value of e' is e2, since exp is increasing, so xt n! (x t)" dt - Since we already know that e < xXn+1 x x -- (x --- ¯ n! t)" dt - = . (n + 1)! 4, we have < xx"*I 42x"*I (n + 1)! ' (n + 1)! ' which can be made as small as desired by choosing n sufficiently large. How large the factor 42 will make things more difficult). n must be will depend on x 1, then Once again, the estimates are easier for small x. If 0 Exs (and x2 e2=1+x+-+--.+-+R, x" n! 2! 4 where0<R< . (n + 1)! (The inequality 0 < R follows immediately from the integral form for R.) In particular, if n = 4, then 4 1 0<R< <-, 5! 10 so e=e1=1+1+-+-+-+R, 65 24 1 3! 1 2! 1 1 10 where0<R<- 4! =-+R 17 24 2+--+R, which shows that 2<e<3. (This then shows that 0<R< 32x"** (n + 1)! , allowing us to improve our estimate of R slightly.) compute that the first 3 decimals for e are e = 2.718 By taking n = 7 you can ... (youshould check that n = 7 does give this degme of accuracy, but it would be cruel to insist that you actually do the computations). The fimction arctan is also important but, as you may recall, an expression for arctanf*)(x) is hopelessly complicated, so that the integral form of the remainder useless. the other hand, our derivation of the Taylor polynomial for arctan is On automatically provided a formula for the remainder: x3 arctanx=x¯¯ (¯·1)"22n+1 2 ' 3 2n+1 Jo (-1). 1,2n+2 1+t2 dt. 20. Approximation byPolynomialFunctions 423 As we have already estimated, n+lg2n+2 _¡ x x dt 1+t2 5 t 2n+3 2n+2 dt = we will consider only numbers x with term can clearly be made as small as desired For the moment remainder . 2n+3 |x| 5 1. In this case, the by choosing n sufficiently large. In particular, 1 5 1 3 arctanl=1--+--...+ (-1)" 2n+1 1 2n+3 where|R|< +R, . With this estimate it is easy to find an n which will make the remainder less than any preassigned number; on the other hand, n will usually have to be so large as to make computations hopelessly long. To obtain a remainder < 10 , for example, take > (104 3)/2. This is really a shame, because arctan 1 = x/4, must n we polynomial for arctan should allow us to compute x. Fortunately, so the Taylor there are some clever tricks which enable us to surmount these difficulties.Since - |R2n+1,c(x)\ < , 2n + 3 n's will work for only somewhat smaller x's. The trick for computing is to express arctan 1 in terms of arctan x for smaller x; Problem 6 shows how x this can be done in a convenient way. much smaller 1 is best The Taylor polynomial for the function f (x) log(x + 1) at a handled in the same manner as the Taylor polynomial for arctan. Although the integral form of the remainder for f is not hard to write down, it is difficultto estimate. On the other hand, we obtain a simple formula if we begin with the = = equation 1 l+t 1 = t+ t - - - - - (-1)"¯1,n-i q (-1)"t" + . l+t this implies that x log(1 + x) 1 - 1+ t x2 dt = x - -- 2 x3 + - - - - 3 - + (-1)"¯1 Xn n + (-1)" 1+ t dt, -1. for all x > If x > 0, then x t+1 Xn+1 x n dt 5 t" dt = , n+1 -1 < x < 0 (Problem 11). is a slightly more complicated estimate when For this function the remainder term can be made as small as desired by choosing < x 5 1. n sufficiently large, provided that and there -1 424 and InfiniteSeries InßniteSequences The behavior of the remainder terms for arctan and f(x) another when matter |x| > = log(x + 1) is quite 1. In this case, the estimates |R2n+1,o(x)| |R..o(x)| |x|2n+3 < < for arctan, 2n + 3 (x > 0) for f, n+ 1 1 the bounds x"'/m become large as m beare of no use, because when |x| > unavoidable, and is not just a deficiency of our predicament is comes large. This estimates. It is easy to get estimates in the other direction which show that the remainders actually do remain large. To obtain such an estimate for arctan, that if t is in [0,x] (orin [x,0] if x < 0), then 1+t251+x252x2, so t2n+2 x 1+t2 Similarly, if x > 0, then for dt à g 2x2 if|x note 1, h+1 x t2"*2dt - = . 4n+6 o t in [0,x] we have 1+tsl+xs2x, 1, ifx SO t+1 dt à t" dt - 2x = . 2n+2 These estimates show that if |x| > 1, then the remainder terms become large as n becomes large. In other words, for lx| > 1, the Taylor polynomials for arctan and f are ofno use whatsoeverin computing arctanx and log(x + 1). This is no tragedy, because the values of these functions can be found for any x once they are known for all x with |x < 1. This same situation occurs in a spectacular way for the function e¯1/x2, f (x) = 0, x x We have already seen that ff*)(0) 0 for every that the Taylor polynomial Pn,0 for f is = = = f (0) + f'(0)x + 0. natural number f (0) f (") (0) 2! n! " Pn,o(x) ¢0 = k. This means 0. In other words, the remainder term R..o(x) always equals f(x), and the Taylor polynomial is useless for computing f (x), except for x = 0. Eventually we will be able to ofFer some explanation for the behavior of this fimction, which is such a disconcerting illustration of the limitations of Taylor's Theorem. has been used so often in connection with our estimates The word for the remainder term, that the significance of Taylor's Theorem might be misconstrued. It is true that Taylor's Theorem is an almost ideal computational aid "compute" 20. ApproximationbyPolynomialFunctions 425 failure in the previous example), but it has equally important theoretical consequences. Most of these will be developed in succeeding chapters, but two pmofs will illustrate some ways in which Taylor's Theorem may be used. The first illustration will be particularly impressive to those who have waded through the proof, in Chapter 16, that x is irrational. (despiteits ignominious THEOREM 5 PROOF e is irrational. We know that, for any n, 1 1 e=el=1+--+-+··-+--+R., 1 n! 2! 11 Suppose that e were rational, say e Choose n > b and also n > 3. Then where0<R.< . a/b, where a and b are positive integers. = 1 a 3 (n + 1)! 1 -=l+1+-+·--+-+R., b 2! n! so n!a n! n! --=nl+n!+-+---+-+n!R.. b 2! n! Every term in this equation other than n!R, is an integer (theleft side is an integer because n > b). Consequently, nlR, must be an integer also. But 3 (n + 1)! 0<R,< , so 0<n!R.< which is 3 <-<1, n+1 3 4 impossible for an integer. The second illustration is merely a straightforward proved in Chapter 15: If f" demonstration of a fact f 0, f (0) 0, f'(0) 0, + = = = then f = 0. To prove this, observe first that f(k) exists for every k; in fact f* = f(4)_ f (5 etc. - (f")' = 3r_ U(4) f', - tr__ _ r _ pr n_L 426 Infinite Sequences and InßniteSeries This shows, not only that all ff*) exist, but also that there are at most 4 different Since f(0) f'(0) 0, all ff*)(0)are 0. Now Taylor's ones: f, f', Theorem states, for any n, that -f'. -f, = f (x) = n! |f("*)(t)|<M add the phrase "and t)" dt. - (sinceff"*2)exists), Each function f(n+13 is continuous there is a number M such that (wecan (x = so for any particular x for0<t<x,andalln all n" because there are only four different f(k) Thus t)" 5 M |f(x)| ni dt = M |x|"** . (n + 1)! Since this is true for every n, and since |x|"/n! can be made as small as desired by choosing n sufficiently large, this shows that f (x)| Se for any e > 0; consequently, 0. f (x) = The other uses to which Taylor's Theorem will be put in succeeding chapters considerations which have concerned us are closely related to the computational much chapter. remainder of this the for If term Rn.a(x) can be made as small as sufficiently desired by choosing n large, then f (x) can be computed to any degree of accuracy desired by using the polynomials Pn,a(x). As we require greater and greater accuracy we must add on more and more terms. If we are willing to add up infinitely many terms (intheory at least!), then we ought to be able to ignore the remainder completely. There should be sums" like "infmite X3X5X' . smx=x---+----+- -, 3! 5! x2468 cosx=l---+-----+----·-, 2! 7! 4! 6! x2 e2=1+x+-+-+-+--., 2! arctan x = x - x 3 - 3 + x2 log(1+x)=x---+----+... 2 x 3! 5 -- -- 8! 74 x3 x 4! 7 -- 5 7 73 74 + · · - if |x| < 1, ¯ -l<xs1. 3 4 if We are almost completely prepared for this step. Only one obstacle remainswe have never even defined an infinite sum. Chapters 22 and 23 contain the necessary definitions. 20. ApproximationbyPolynomialFunctions 427 PROBLEMS 1. Find the Taylor polynomials (ofthe indicated degree, and at the indicated point) for the following functions. (i) f (x) = e*; degree 3, at 0. (ii) f (x) = esi"' degree 3, at 0. (iii) sin; degree 2n, at . (iv) cos; degree 2n, at z. (v) exp; degree n, at 1. (vi) log; degree n, (vii)f(x) (viii)f (x) 2. = x5 4,3 = x5 + x3 + x; degree (ix) f (x) = (x) f(x) = 1 1+ x + x; degree 4, at 0. 2; 4, at 1. degree 2n + 1, at 0. 1 degree n, at 0. 1+ x ; Write each of the following polynomials in x as a polynomial in (x 3). (It is only necessary to compute the Taylor polynomial at 3, of the same degree as the original polynomial. Why?) - (i) 2 x x4 (ii) 5 (iii) x - 4x - 9. 12x3 + 44x2 + 2x + 1. - . (iv) ax2 + 3. at 2. bx + c. Write down a sum numbers to within notation) (using the specified tion, consult the tables for 2" (i) sin 1; error < 10¯17 (ii) sin 2; error < 10¯12 (iii) sin (;error < 10¯ (iv) e; error < 10-4 (v) e2; error < 10¯6. . which equals each of the following minimize To accuracy. and n! on the next page. needless computa- and InfiniteSeries 428 InßniteSequences 2" n n! l 2 3 2 4 8 16 4 5 32 6 7 8 9 10 11 12 64 128 256 512 1,024 2,048 4,096 8,192 16,384 32,768 65,536 131,072 262,144 524,888 1,048,576 13 14 15 16 17 18 19 20 *4. 1 2 6 24 120 720 5,040 40,430 362,880 3,628,800 39,916,800 479,001,600 6,227,020,800 87,178,291,200 1,307,674,368,000 20,922,789,888,000 355,687,428,096,000 6,402,373,705,728,000 121,645,100,408,832,000 2,432,902,008,176,640,000 This problem is similar to the previous one, except that the errors demanded are so small that the tables cannot be used. You will have to do a little thinking, and in some cases it may be necessary to consult the pmof, in Chapter 16, that x"/n! can be made small by choosing n large-the pmof actually provides a method for finding the appropriate n. In the previous rather short find it problem sums; in fact, it was possible was possible to to find the smallest n which makes the estimate of the remainder given by Taylor's Theorem less than the desired error. But in this problem finding any specific sum is a moral victory you can demonstrate that the sum (provided works). (i) (ii) sin 1; error e; error < < 10-flolo) 10¯ . < 10¯2°. 10¯3 (iv) e ; error < arctan ; error < 10¯(101° (iii) sin 10; error . (v) 5. Problem 11-38 you showed that the equation x2 cos x has precisely two solutions. Use the Taylor polynomial of cos to show that the solutions are approximately and find bounds on the error. (b) Similarly, estimate the solutions of the equation 2x2 x sin x -E cos2 x. (a) In = ±Œ, = 20. Approximation ly PolynomialFunctions 429 6. using Problem (a) Prove, 15-9, that 1 1 + aretan 2 3 1 1 x 4 arctan arctan 4 5 239 (b) Show that x 3.14159.... (Every budding mathematician should verify a few decimalsof x, but the purpose of this exercise is not to set you off on an immense calculation. If the second expression in part (a)is used, the first 5 decimals for x can be computed with remarkably little - x = 4 arctan - = - -, - - --. = work.) 7. For every number a, and every natural number n, we define the "binomial coefficient" ) and we define (1 + = 1, as usual. If a is not an integer, then = in sign for n and alternates for f(x) a(a-1)-...-(œ-n+1) x)" at > is never 0, a. Show that the Taylor polynomial of degree n 0 is P.,o(x) k, = and that the Cauchy and k-0 Lagrange forms of the remainder are the following: Cauchy form: a(a-1)-...·(a-n) R..o(x) x(x = n! a(a-1)-...-(«-n) x(1+t) = n! (n + 1) = « n+1 t)n(1 + t)a-n- - ,_1 I-I " 1+t x-t x(1 + t)"¯1 1 1+t , t in [0,x] or [x,0]. Lagrange form: (n + 1)! " I, xn+1(1 + t)"¯" t in [0,x] or [x,0]. n+ 1 Estimates for these remainder terms are rather difficult to handle, and are postponed to Problem 23-18. = 8. Suppose that a¿ and b¿ are the coefficients in the Taylor polynomials at a of f and g, respectively. In other words, a¿ f (a)/i! and bi g (a)/i!. Find the coefficients c¿ of the Taylor polynomials at a of the following functions, in terms of the ag's and bi's. = (i) | + g (ii) fg . . = and InfiniteSeries 430 InfiniteSequences (iii) J'. (iv) h(x) (v) 9. k(x) = lxf (t) dt. f (t) dt. = (a) Prove that the Taylor polynomial of f (x) sin(x2) of = degree 4n + 2 at 0 is 6 10 4n+2 --+---··+(-1)" x . 3! 5! (2n + 1)! Hint: If P is the Taylor polynomial of degree 2n + 1 for sin at 0, then P(x) + R(x), where lim R(x)/x2.+1 0. What does this imply sinx = = xso 4n+2; about lim R(x2 x->0 (b) Find f (k)(0) for all (c) In general, if f(x) at 10. k. g(x"), find = ff*)(0)in terms of the derivatives of g 0. Pmve that if xs 0, then x e' - n! (x t)" dt - |x|n+1 (n+1)! < ¯ . -1 11. Prove that if < 0, then xs x 1+t *12. In+1 s (1+x)(n+1) n X dt (a) Show that a|" for |x if |g'(x)| 5 M|x M |x a |n+1/(n+ 1) for |x a | < ô. (b) Use part (a)to show that if lim g'(x)/(x - - a| . 8, then < |g(x) - g(a)| 5 -- -- lim x->a g(x) = (x - a)"+1 - a)n = 0, then 0. if g(x) f (x) Pn,a.f (x), then g'(x) f'(x) P.-1,a.r(x). (d) Give an inductive proof of Theorem 1, without using PHôpital's Rule. (c) Show that = - = - 13. Deduce Theorem 1 as a corollary of Taylor's Theorem, with any form of the remainder. (The catch is that it will be necessary to assume one more derivative than in the hypotheses for Theorem 1.) 14. Deduce the Cauchy and Lagrange forms of the remainder from the integral form, using Problem 13-23. There will be the same catch as in Problem 13. 20. Approximation byPolynomialFunctions 431 15. (a) Suppose for all x we have that > f is twice differentiable on (0, oo) and that |f (x)| 5 Mo 0, while |f"(x)| 5 M2 for all x > 0. Prove that for all x > 0 2 |f'(x)| 5 -Mo h (b) Show that for all x h + -M2 for all h 2 > 0. 0 we have > |f'(x)| 5 2/MoM2. is twice differentiable on (0,oo), f" is bounded, and f(x) approaches 0 as x 0 as x oo, then also f'(x) approaches oo. and lim f"(x) exists, then lim exists lim If lim f"(x) f'(x) f(x) (d) (c) If f -+ -> = x-o x-o x-oo = x-oo 0. (Compare Problem 11-31.) 16. (a) Prove that if f"(a) exists, then f"(a) = + h) + f (a lim f (a hw0 h2 - h) - 2f (a) . The limit on the right is called the Schwarzseconddenvativeof f at a. Hint: Use the Taylor polynomial P2,a (x) with x = a + h and with x h. a x2 and that for x :> 0, for xs 0. Show (b) Let f(x) = - -x2 = . hm f(0+h)+f(0-h)-2f(0) h2 hoo exists, even though f"(0) does not. that if f has a local maximum at a, and the Schwarz second derivative of f at a exists, then it is 5 0. (d) Prove that if f"'(a) exists, then (c) Prove f'"(a) 3 17. f(a+h)- f(a-h)-2hf'(x) =lim a->o h3 . Use the Taylor polynomial Pi,aj, together with the remainder, to prove a weak form of Theorem 2 of the Appendix to Chapter 11: If f" > 0, then the graph of f always lies above the tangent line of f, except at the point of contact. *18. Problem 1843 presented a rather complicated proof that f 0 if f"- f = 0 and f (0) f'(0) 0. Give another proof, using Taylor's Theorem. (This problem is really a preliminary skirmish before doing battle with the general case in Problem 19, and is meant to convince you that Taylor's Theorem is a good tool for tackling such problems, even though tricks work out more = = = neatly for special cases.) and InfiniteSeries 432 InfiniteSequences **19. Consider a function f which satisfies the differential equation f(">= Lajf II, j=0 for certain numbers ao, an-1. Several special cases have already received either detailed treatment, in the text or in other problems; in particular, we have found all functions satisfying f' f, or f"+ f 0, or f"- f 0. The trick in Problem 18-42 enables us to find many solutions for such equations, but doesn't say whether these are the only solutions. This requires a uniqueness result, which will be supplied by this problem. At the end you will find some sketchy) remarks about the general solution. (necessarily . . . , (a) Derive the following formula for f("*0 (letus f ("*) (b) Deduce f j=0 = + an-1aj) (aj.I = = = "a-i" agree that will be 0): f fa. formula for f("*23. a The formula in part (b)is not going to be used; it was inserted only to convince you that a general formula for f(n+k)is out of the question. On the other hand, as part (c)shows, it is not very hard to obtain estimates on the size of f("**)(x). (c) Let N max(1, = |a.-1|). Then |aj_i + |aol, ..., an-laj| 5 2N2; this means that n-1 bjl f(n+1) = ffi3, where |b; | 5 2N2. where |bj \ 5 j=0 Show that n-1 f("*2) 42ffil, -- 4N3 j=0 and, more generally, n-i f "**) gk -- (j), |bj*|5 where 2Nk+1 j-0 (d) Conclude number (c)that, from part M such that |f(n+k(x)|5 that (e) Now suppose |f(x)|< M ¯ and conclude that for any particular number M -2kNk+t x, there for all k. f (0) f'(0) f ("¯U(0) 0. Show that M |2Nx |n+k+1 ·2A+1N**2Mn+k+1 < = = - · = = - · ¯ (n+k+1)! f = 0. (n+k+1)! ' is a 20. Approximation byPolynomialFunctions 433 (f) Show that if fi and f2 are both solutions of the differential equation n-1 j=0 and for 0 5 jsn fi (0) f2(13(0) - 1, then - fi = f2. In other words, the solutions of this differential equation are determined conditions" (thevalues fU)(0) for 0 5 jyn by the 1). This means that we can find all solutions once we can find enough solutions to obtain any given set of initial conditions. If the equation "initial - x" an-1xn-1 - has n distinct roots al, f (x) is a solution, and - - -- - ao = 0 an, then any function of the form ..., = -- ci e"1" + ·· - + c.e.,x · f(0) f'(0) =c1+···+c., =aic1+···+a.c., f(n-1)(0)=ain-Ic1+···+a."¯1c, As a matter of fact, every solution is of this form, because we can obtain any set of numbers on the left side by choosing the c's properly, but we will not try to prove this last assertion. (It is a purely algebraic fact, which 2 or 3.) These remarks are also true if some you can easily check for n of the roots are multiple roots, and even in the more general situation considered in Chapter 27. = **20. function on [a,b] with f (a) f(b) and that for all x in (a, b) the Schwarz second derivative of f at x is 0 (Problem 16). Show that f is constant on [a,b]. Hint: Suppose that f (x) > f (a) for some x in (a, b). Consider the function (a) Suppose that f is a continuous g(x)= with g(a) = f(x)-s(x-a)(b-x) sufficiently small e > 0 we will have point y in (a, b). Now use g(x) > have Problem 16(c)(theSchwarz second derivative of (x a)(b x) is simply its ordinary second derivative). (b) If f is a continuous function on [a,b] whose Schwarz second derivative is O at all points of (a, b), then f is linear. = g(a), g(b) = f(a). For so g will a maximum - - and InßniteSenes 434 InßniteSequences *21. x4 sa Ux2(-orx ¢ 0, and f (0) 0. Show that order 2 at 0, even though f"(0) does not exist. (a) Let f (x) = = f 0 up to = This example is slightly more complex, but also slightly more impressive, than the example in the text, because both f'(a) and f"(a) exist for a ¢ 0. Thus, for each number a there is another number m(a) such that (*) f (x) f (a) + f'(a)(x = where . hm x-a namely, m(a) a) + - R.(x) (x - m(a) 2 - a)2 0, and m(0) (x - a)2 + Ra(x), 0; 0. Notice that the fimetion continuous. this defined in is not way m (b) Suppose that f is a differentiable fimction such that (*)holds for all a, with m(a) m(a) 0 for 0. Use' Problem 20 to show that f"(a) = = f"(a) for a ¢ = = = all a. suppose that (*)holds for all a, and that m is continuous. that for all a the second derivative f"(a) exists and equals m(a). (c) Now Prove *CHAPTER e IS TRANSCENDENTAL The irrationality of e was so easy to pmve that in this optional chapter we will attempt a more difficult feat, and prove that the number e is not merely irrational, but actually much worse. Justhow a number might be even worse than irrational is suggested by a slight rewording of definitions. A number x is irrational if it is a/b for any integers a and b, with b ¢ 0. This is the not possible to write x saying that x does not satisfy any equation same as = bx-a=0 for integers a and b, except for a 0, b 0. Viewed in this light, the irrationality of Ã does not seem to be such a terrible deficiency; rather, it appears that Ã just barely manages to be irrational-although Ã is not the solution of an equation = = a1x + ao = 0, it is the solution of the equation x2-2=0, of one higher degree. Problem 2-18 shows how to produce many irrational numbers x which satisfy higher-degree equations anx"+a,_ixn-1+---+ao=0, the at are integers and ao ¢ 0 (thiscondition rules out the possibility that equation of this sort is called all at 0). A number which satisfies an and practically every number we have ever encountered an algebraic number, of solutions of algebraic equations (x and e are the great exis defined in terms mathematical experience). ceptions in our limited All roots, such as where "algebraic" = are clearly algebraic numbers, and even complicated 3+ñ+ are algebraic we (although will not try to 1+ñ+ combinations, like 6 prove this). Numbers which cannot be obtained by the process of solving algebraic equations are called transcendental; the main result of this chapter states that e is a number of this anomalous sort. The proof that e is transcendental is well within our grasp, and was theoretically possible even before Chapter 20. Nevertheless, with the inclusion of this proof, we can justifiablyclassify ourselves as something more than novices in the study of while many irrationality proofs depend only on elementary higher mathematics; properties of numbers, the proof that a number is transcendental usually involves 435 and InßniteSeries 436 InßniteSequences Even the dates connected with the transome really high-powered mathematics. scendence of e are impressively recent-the first proof that e is transcendental, 1873. due to Hermite, dates from The proof that we will give is a simplification, due to Hilbert. Before tackling the proof itself, it is a good idea to map out the strategy, which depends on an idea used even in the proof that e is irrational. Two features of the expression 1 1! 1 1 nl 2! that the proof were important for e is irrational: On the one hand, the number e=1+-+-+...+-+Rn 1 1+-+---+11 written p/q with q 1 n! as a fraction can be (sothat n! (p/q) is an integer); on the other hand, O < R. < 3/(n + 1)! (son!R, is not an integer). These two facts show that e can be approximated particularly well by rational numbers. Of course, every number x can be approximated arbitrarily closely by rational numbers-if e> 0 there is a rational number r with |x r| < s; the catch, however, is that it may be n! < - necessary to allow a very large denominator for r, as large as 1/s perhaps. For e we are assured that this is not the case: there is a fraction p/q within 3/(n + 1)! of e, whose denominator q is at most n! If you look carefully at the proof that e will see that only this fact about e is ever used. The number e is is irrational, . you by no means unique in this respect: generally speaking, the bettera number can be approximated evidence for this assertion by rational numbers, the worse it is (some presented that transcendental 3). is in Problem The proof depends on a natural e is extension of this idea: not only e, but any finite number of powers e, e2, , e", rational numbers. simultaneously approximated well In especially be by can our proof we will begin by assuming that e is algebraic, so that ... (*) for some integers ao, certain integers M, Mi, ... a,,e"+---+aie+ao=0, , ... an. , ao-j0 In order to reach a contradiction Mn and certain 2 numbers et, we will than find ... , en such that Mi+ci 1 e= e "small" M , M2+E2 = M , Ma+En e"= M Justhow small the c's must be will appear when these expressions are substituted into the assumed equation (*).After multiplying through by M we obtain [aoM+atM1+···+a.M.]+[61ai+-..+e.a.]=0. 21. e is Tanscendental 437 The first term in brackets is an integer, and we will choose the M's so that it will necessarily be a nonero integer. We will also manage to find e's so small that |riai þ +-·-+ena.\< this will lead to the desired contradiction--the sum of a nonzero integer and a zero! value of absolute less than cannot be I number As a basic strategy this is all very reasonable and quite straightforward. The remarkable part of the proof will be the way that the M's and c's are defined. In order to read the proof you will need to know about the gamma function! (This function was introduced in Problem 19-39.) THEOREM 1 e is transcendental. -/ PROOF Suppose there were integers ao, (*) a.e" Define numbers M, M1, ... = I , a., with ao + an-1e"¯* + Mn and ci, , xP-1 M ... 1) [(x - = Uk ¯ Øk + ao - (x ... = 0. n)]Pe¯2 - dx, 1)! - [(x-1)·....(x--n)]Pe-2 "xP (p €k - e, as follows: , - (p o MK ... - 0, such that k XE¯ (X Î) - · dx, 1)! - (X ...• - H)] €¯ dx. (p-1)! The unspecified number p represents a prime number* which we will choose later. Despite the forbidding aspect of these three expressions, with a little work they will appear much more reasonable. We concentrate on M first. If the expression in brackets, 1) [(x - is actually multiplied out, we obtain - ...· (x - n)], a polynomial x"+---±n! number" The term "prime was defined in Problem 2-17. An important fact about prime numbers be used in the proof, although it is not proved in this book: If p is a prime number which does not divide the integer a, and which does not divide the integer b, then p also does not divide ab. is crucial in proving that the The Suggested Reading mentions references for this theorem (which factorization of an integer into primes is unique). We will also use the result of Problem 2-17(d), that there are infinitely many primes-the reader is asked to determine at pocisely which points this information is required. * will and InfiniteSeries 438 InfiniteSequences with integer coefficients. When raised to the pth power this becomes an even more complicated polynomial x"P + ·- (n!)P. ± - Thus M can be written in the form "" M where °° 1 = (p 1)! - Ca X P¯1"e¯2 dx, O the C, are certain integers, and Co ±(n!)P. = But oo xk Thus -xdx=kl. "" (p-1+a)! (p-1)! Cœ M= a=o Now, for a = 0 we obtain - the term ±(n!)* (p (p 1)! 1)! - - ±(n!)* = . We will now consider only primes p > n; then this term is an integer which is not divisible by p. On the other hand, if a > 0, then 1 + a)! =C,(p+a--1)(p+a-2)-...-p, (p 1)! which is divisible by p. Therefore M itself is an integer which is not divisible by p. Now consider Mk. We have Ca (p - - "xp-1[(x-1)·...-(x-n)]Pe Mk = X Uk (p °° = x?¯I [(x - 1) · (p This can be transformed - ... - dx 1)! - (x - n)]Pe-(x-k) 1)! dx. into an expression looking very much like M by the substitution u=x-k du dx. = The limits of integration are changed to 0 and oo, and l°°(u+k)?¯1[(u+k-1)·...·u-...·(u+k-n)]Pe¯" Mk = o (p - 1)! du. There is one very significant difference between this expression and that for M. The term in brackets contains the factor u in the kth place. Thus the pth power contains the factor uP. This means that the entire expression (u+k)p-1[(u+k--1)-...·(u+k-n)]F 21. with is a polynomial Thus Mk e is Wanscendental439 integer coefficients, every term ofwhich has degree at least p. o Î = p-1"e¯" du D, = ==1 a=i the D, are certain integers. Notice that the summation begins with a 1; in this case every term in the sum is divisible by p. Thus each Mk is an integer which is divisible by p. Now it is clear that where = e Substituting into (*)and & Mk Ÿ M = Ek k=1,...,n. , multiplying by M we obtain [aoM+aiM+···+a.M.]+[aici+.--+a,«.]=0. In addition to requiring that p > n let us also stipulate that p > |ao[.This means that both M and ao are not divisible by p, so aoM is also not divisible by p. Since each Mk is divisible by p, it follows that aoM+a1M1+---+a.M. is not divisible by p. In particular it is a nongero integer. In order to obtain a contradiction to the assumed equation prove that e is transcendental, it is only necessary to show that la1Ei + (*),and thereby + a.«.| · can be made as small as desired, by choosing p large enough; it is clearly sufficient to show that each |eg|can be made as small as desired. This requires nothing more than some simple estimates; for the remainder of the argument remember that n is a certain fixed number (thedegree of the assumed polynomial equation (*)).To begin with, if 1 5 k i n, then k ek Ïk Ï" o Se" (p np-1|[(x the maximum of |(x 1) - |< - (x ...- (p-1)! 1) (x - - €"Rp-1 |Ek dx -1)! o Now let A be €¯* ...•(X-N) Xp-1((X-Î) ... p - (p oc e¯* dx < ¯ (p-1)! e"n?¯I AP (p 1)! ¯ - e"n?AP ¯ (p - e"(nA)F 1)! (p - |e¯* dx. n)| for x in e¯* dx 1)! p-1¿p gn n)]P n ¯ - - - 1)! [0,n]. Then and InfiniteSeries 440 Ir¢nite Sequences But n and A are fixed; thus making (nA)F/(p - 1)! can be made as small as desired by p sufficiently large. This proof, like the proof that x is irrational, deserves some philosophic afall, we terthoughts. At first sight, the argument seems quite mathematiuse integrals, and integrals from 0 to oo at that. Actually, as many cians have observed, integrals can be eliminated from the argument completely; the only integrals essential to the proof are of the form "advanced"-after ao Xk -x dx for integral k, and these integrals can be replaced by k! whenever Thus M, for example, could have been defined initially as where they occur. C, are the coefficients of the polynomial [(x - 1)··..·(x - n)]P. "completely If this idea is developed consistently, one obtains a that e is transcendental, depending only on the fact that 1 1 2! 3! this proof Unfortunately, is harder to understand than the original onethe whole structure of the proof must be hidden just to eliminate a few integral signs! This situation is by no means peculiar to this specific theoremarguments are frequently more difficult than ones. Our proof that x is irrational is a case in point. You probably remember nothing about this proof except that it involves quite a few complicated functions. There is actually a more advanced, but much more conceptual proof, which shows that x is transcendental, a fact which is of great historical, as well as intrinsic, interest. One of the classical problems of Greek mathematics was to construct, with compass and straightedge alone, a square whose area is that of a circle of radius 1. This which can be requires the construction of a line segment whose length is , accomplished if a line segment of length x is constructible. The Greeks were totally unable to decide whether such a line segment could be constructed, and even the full resources of modern mathematics were unable to settle this question until 1882. In that year Lindemann proved that x is transcendental; since the length of any segment that can be constructed with straightedge and compass can and is therefore algebraic, this ÷, and f, be written in terms of +, proves that a line segment of length x cannot be constructed. The proof that x is transcendental requires a sizable amount of mathematics which is too advanced to be reached in this book. Nevertheless, the pmof is not much more difficult than the proof that e is transcendental. In fact, the proof for x is practically the same as the pmof for e. This last statement should certainly 1+ ¢= 1 1! elementary" proof -+-+ -+---. "elementary" "elementary" "advanced" ·, -, 21. e is Transcendental441 surprise you. The proof that e is transcendental seems to depend so thoroughly particular properties of e that it is almost inconceivable how any modifications on could ever be used for x; after all, what does e have to do with x? Justwait and see! PROBLEMS L is algebraic. if a > 0 is algebraic, then algebraic and rational, then a + r and er are that Prove if is is a r (b) (a) Prove that algebraic. considerably: the sum, product, is algebraic. This fact is too difficult for us to prove here, but some special cases can be examined: Part (b)can actually be strengthened and quotient of algebraic 2. *3. numbers Prove that + Ä and Ã(1 + Ã) are algebraic, by actually finding algebraic equations which they satisfy. (You will need equations of degree 4.) n (a) Let a be an algebraic number which is not rational. satisfies the Suppose that a polynomial equation f(x)=anx"+a..ix"¯'+---+ao=0, polynomial function of lower degree has this property. Show 0 for any rational number p/q. Hint: Use Probf (p/q) lem 3-7(b). (b) Now show that |f(p/q)| ;> 1/q" for all rational numbers p/q with q > 0. Hint: Write f (p/q) as a fraction over the common denominator q". < 1}. Use the Mean Value Theorem (c) Let M sup{|f'(x)|: |x p/q| < 1, then to prove that if p/q is a rational number with |a max(1, > 1/M) we have 1/Mq". (It follows that p/q| for c |a all rational p/q.) |a p/q l > c/q" for and that no -/ that -a = - = - - *4. Let a = 0.110001000000000000000001000..., l's occur in the n! place, for each n. Use Problem 3 to prove that œ is transcendental. (For each n, show that a is not the root of an equation of degree n.) where the Although Problem 4 mentions only one specific transcendental number, it should be clear that one can easily construct infinitely many other numbers a which do not satisfy |a p/q| > c/q" for any c and n. Such numbers were first considered and the inequality in Problem 3 is often called Liouville's by Liouville (1809-1882), inequality. None ofthe transcendental numbers constructed in this way happens to be particularly interesting, but for a long time Liouville's transcendental numbers were the only ones known. This situation was changed quite radically by the work who showed, without exhibiting a single transcendental of Cantor (1845-1918), number, that most numbers are transcendental. The next two problems provide an - 442 and InfiniteSeries Infmite Sequences introduction to the ideas that allow us to make sense of such statements. The basic definition with which we must work is the following: A set A is called countable if its elements can be arranged in a sequence at, a2, a3, a4. The obvious example is N, the set of natural (infact, more ••• - or less the Platonic ideal of) a countable set the set of even natural numbers is also clearly numbers; countable: 2,4,6,8,.... It is a little more surprising to learn that Z, the set of all integers 0) is also countable, but seeing is believing: negative (positive, and -1, 0, 1, -2, -3, 2, 3, .... The next two problems, which outline the basic features of countable sets, are really a series of examples to show that (1)a lot more sets are countable than one might *5. think and (2)nevertheless, some sets are not countable. if A and B are countable, then so is A UB {x : x is in A or worked the trick that for Z. same x is in B }. Hint: Use rational numbers of the countable. that (This is really set positive is (b) Show startling, quite but the figure below indicates the path to enlightenment.) (a) Show that = I 12 l 345 22 12 222 345 33 333 345 T2 1 that the set of all pairs (m,n) of integers is countable. practically the same as part (b).) (d) If Ai, A2, As, are each countable, prove that (c) Show (This is ... A1UA2UAsU... is also countable. (Again use the same trick as in part (b).) (e) Prove that the set of all triples (l, m, n) of integers is countable. (A triple (l, m, n) can be described by a pair (l, m) and a number n.) an) is countable. (If you have (f) Prove that the set of all n-tuples (ai, a2, using this, induction.) done part (e),you can do of that the of Prove all set roots polynomial functions of degree n with (g) integer coefficients is countable. (Part (f)shows that the set of all these . . . , 21. polynomial functions can be arranged most n roots.) Now use parts (h) is countable. *6. (d)and (g)to e is Wanscendental443 in a sequence, and each prove that the set of all algebraic has at numbers Since so many sets turn out to be countable, it is important to note that the In other words, set of all real numbers between 0 and 1 is not countable. there is no way of listing all these real numbers in a sequence «I = «2 = Œ3 = 0.aiia12a13al4 0.a2]a22a23a24 0.a31a32G33G34 - - - · - - - · · is being used on the right). To prove that this is so, suppose such a list were possible and consider the decimal notation (decimal 0.ã1iã22533Õ44 - where ä., cannot = 5 if a.. ¢ 5 and ä., - - , 6 if a.. = possibly be in the list, thus obtaining = 5. Show that this number a contradiction. Problems 5 and 6 can be summed up as follows. The set of algebraic numbers is countable. If the set of transcendental numbers were also countable, then the set of all real numbers would be countable, by Problem $(a), and consequently the set of real numbers between 0 and 1 would be countable. But this is false. Thus, the set of algebraic numbers is countable and the set of transcendental numbers is not ("thereare more transcendental numbers than algebraic numbers"). The remaining two problems illustrate further how important it can be to distinguish between sets which are countable and sets which are not. *7. Let f be a nondecreasing function on lim f (x) and lim f (x) both exist. x->a+ x->a- [0,1]. Recall (Problem 8-8) that 0 prove that there are only finitely many numbers a in [0,1] with lim f (x) lim f (x) > s. Hint: There are, in fact, at most (a) For any e > - [f (1) f (0)]/e of them. - (b) Prove f is discontinuous is countable. 0, then it is > 1/n for some natural that the set of points at which Hint: If lim x->a+ f (x) - lim x->a- f (x) > number n. This problem shows that a nondecreasing function is automatically continuous at most points. For differentiability the situation is more difficult function can fail to analyze and also more interesting. A nondecreasing which countable, of points is but it is differentiable be set not to at a nondecreasing still true that functions are differentiable at most points "most"). word of the Reference [32]of the Sugdifferent sense (ina using proof, gives the Rising Sun Lemma of gested Reading a beautiful and InfiniteSeries 444 InfiniteSequences Problem 8-20. For those who have done Pmblem 10 of the Appendix to Chapter 11, it is possible to provide at least one application to dif ferentiability of the ideas already developed in this problem set: If f is convex, then f is differentiable except at those points whem its righthand derivative fe' is discontinuous; but the function f,' is increasing, differentiable except at a countable so a convex function is automatically of points. set *8. 11-66 showed that if every point is a local maximum point for a continuousfunction f, then f is a constant function. Suppose now that the hypothesis of continuity is dropped. Prove that f takes on only a countable set of values. Hint: For each x choose rattonal numbers ax and by such that a, < x < b, and x is a maximum point for f on value value of f on some maximum the bx). is Then every (ax, f(x) such there? How interval (ax,bx). intervals are many corollary. Deduce Problem 11-66(a) as a (b) (c) Prove the result of Problem 11-66(b) similarly. (a) Problem INFINITE SEQUENCES CHAPTER The idea of an infinite sequence is so natural a concept that it is tempting to dispense with a definition altogether. One frequently writes simply infinite sequence "an ai, a2, a3, a4, G5, ••• , "forever." the three dots indicating that the numbers at continue to the right A rigorous definition of an infinite sequence is not hard to formulate, however; the important point about an infinite sequence is that for each natural number, n, there is a real number a.. This sort of correspondence is precisely what functions are meant to formalize. DEFINITION An infinite sequence of real numbers is a function whose domain is N. From the point of view of this definition, a sequence should be designated by a letter like a, and particular values by single a(1), a(2), a(3), ..., but the subscript notation ai, a2, as, ... is almost always used instead, and the sequence itselfis usually denoted by a symbol like {an).Thus {n},{(-1)"},and (1/n} denote the sequences œ, ß, and y defined by a,=n, ß, = (-1)", 1 n like any function, can be graphed (Figure 1)but the graph is usually rather unrevealing, since most of the function cannot be fit on the page. A sequence, (a) FIGURE 445 1 (b) (c) 446 InfiniteSequences and InfiniteSeries I FIGURE 2 A more convenient representation of a sequence is obtained by simply labeling the points at, a2, as, on a line (Figure 2). This sort of picture shows when the sequence out to infinity," the sequence going." The sequence (an) and and and the sequence {y.) back forth between 1," (ß,} to 0." Of the three phrases in quotation marks, the last is the crucial concept associated with sequences, and will be defined precisely (thedefinition is illustrated in Figure 3). ... "goes "is --l "jumps F1GURE DEFINITION "converges 3 A sequence (a.) converges to I (insymbols HNOO lim a. l) if for every e is a natural number N such that, for all natural numbers n, = if n > N, then |a, - l| < > 0 there s. In addition to the terminology introduced in this definition, we sometimes say that the sequence (a,} approaches I or has the limit 1. A sequence {a.}is said to converge if it converges to I for some l, and to diverge if it does not converge. To show that the sequence {y,,} converges to 0, it suffices to observe the following. If e > 0, there is a natural number N such that 1/N < e. Then, if n > N we have 1 1 n N y,=-<-<E, so|y,-0|<e. The limit lim n+1- =0 HNOO probably seem reasonable after a little reflection (itjust says that jn + 1 is practically the same as for large n), but a mathematical proof might not be so will 22. InfiniteSequences447 obvious. S we can use an algebraic trick: Jn+1-S= (Jn+1-O)(Jn+1+Á) n+1+S n+1--n To estimate Jn + 1 - n+1+S I n+1+S 1 S by applying It is also possible to estimate n + to the function f (x) = on the interval the Mean Value Theorem [n,n + 1]. We obtain - n + 1- f'(x) = 1 = 1 --, for some x in (n,n + 1) 2 1 2f Either of these estimates may be used to prove the above limit; the detailed proof is left to you, as a simple but valuable exercise. The limit 3n3+7n2+1 3 =lim 4n3 naoo Sn + 63 4 - should also seem reasonable, because the terms involving n3 are the most remember the proof of Theorem 7-9 will tant when n is large. If you to guess the trick that translates this idea into a proof-dividing by n3 yields 3n3 4n3 - Sn + 63 8 4-- n top and bottom 1 7 3+-+-- 2 you impor- be able 3 63 +-n Using this expression, the proof of the above limit is not difficult, especially if one uses the following facts: If H->OO lim a, and N->OO lim b, both exist, then lim (a, + b.) = lim (a, b.) = 8400 - R->OO moreover, lim a. + B-¥QO lim b., N->OD Iim a. B->OO - lim b.; B-kcQ if lim b, ¢ 0, then b, ¢ 0 for all n greater than some N, and H-¥DO lim an/b, R->OD = lim a./ N->OO lim b.. N->OO and InßniteSeries 448 InfiniteSequences to be utterly precise, the third statement would have to be even complicated. As it stands, we are considering the limit of the sequence more where the numbers c, might not even be defined for certain n < N. (c,} {as/b.), could define c, any way we liked for such nThis doesn't really matter-we of the limit because a sequence is not changed if we change the sequence at a (If we wanted = finite number of points.) Although these facts are very useful, we will not bother stating them as a theorem-you should have no difficulty proving these results for yourself because the definition of lim a, l is so similar to previous definitions of limits, especially N->CO lim f (x) l. = = l and lim f (x) I is The similarity between the definitions of lim a. x-oo naoo actually closer than mere analogy; it is possible to define the first in terms of the second. If f is the function whose graph (Figure 4) consists of line segments joining = FIGURE = 4 the points in the graph of the sequence f (x) (a.si = a.)(x - that {a.), so n 5 x 5 n + 1, n) + a, - then lim a, Conversely, if f if and only if l = satisfies lim lim f (x) = l. lim a. l, and we set a, l. f (n), then XMOD is frequently very useful. For example, suppose that = = = X->OO This second observation 0 < a < 1. Then lim a" = 0. n->oo To prove this we note that lim a' lim e"° = x->oo 0, so that x log a is a Notice that we actually have since log a < negative lim a" R-¥CO for if a < " 0 = x-moo = 0 and if |a| large in absolute value for large x. < 1; 0 we can write lim R->OO aN = lim (-1)" a|" B->DO = 0. 22. InfiniteSequences449 The behavior of the logarithm function also shows that if a > 1, then a" becomes arbitrarily large as n becomes large. This assertion is often written Iim a" = N->OO and it is sometimes even said that oo, a > 1, {a"}approaches oo. We also write equations like -a" lim -m - R-FOO -1, then lim a" and say that (-a"} approaches Notice, however, that if a < does not exist, even in this extended sense. Despite this connection with a familiar concept, it is more important to visualize convergence in terms of the picture of a sequence as points on a line (Figure 3). There is another connection between limits of functions and limits of sequences which is related to thispicture. This connection is somewhat less obvious, but considerably more interesting, than the one previously mentioned-instead of defming limits of sequences in terms of limits of functions, it is possible to reverse the pm-oo. cedure. THEOREM I Let f be a function defined in an open interval containing c, except perhaps at c itself with lim f (x) l. X->C = Suppose that (a.) is a sequence such that (1) each is in the domain of f, as (2) each a, (3) lim a, ¢ - c, c. naoo Then the sequence {f (as)) satisfies lim R->OO f (a.) l. = Conversely, if this is true for every sequence {a,}satisfying the above conditions, then lim x->c PROOF f (x) = l. Suppose first that lim f (x) l. Then for every e > 0 there is a 8 for all x, then|f(x)-ll<e. if0<\x-c|<8, = > 0 such that, If the sequence {a.}satisfies N-FOR lim a. c, then (Figure 3) there is a natural ber N such that, if n > N, then |a, c| < 8. = - By our choice of 8, this means that |f(a.)-l|<e, num- and InfiniteSerks 450 InfiniteSequences showing that lim f (a.) = l. lim f (a.) l for every sequence {a.}with lim a. Suppose, conversely, that B->OD B-¥00 lim f (x) l were not true, there would be some e > 0 such that for every c. If X->C ô > 0 there is an x with = = = 0<|x-c|<ô but |f(x)-l|>e. In particular, for each n there would be a number 0 |x, < cl - 1 < but -- n x, such that |f (x.) l| - > e. Now the sequence (x.) clearly converges to c but, since If (x.) l| > e for all n, the sequence {f(x.)) does not converge to l. This contradicts the hypothesis, so lim f (x) l must be true. - = Theorem 1 provides many examples of convergent sequences. sequences {a,}and (b,} defined by = a, bn a, FIGURE a. a. a* = sin 13 + cos sin 1 + (-1)" For example, the - , clearly converge to sin(13) and cos(sin(1)), respectively. It is important, however, to have some criteria guaranteeing convergence of sequences which are not obviously of this sort. There is one important criterion which is very easy to prove, but which is the basis for all other results. This criterion is stated in terms of concepts defined for functions, which therefore apply also to sequences: a sequence {an}is ÎncreBSÂng if am I > an for all n, nondecreasing if an+1 > as for all n, and M for all n; there are simbounded above if there is a number M such that a. = 5 _< which ilar definitions for sequences below. THEOREM 2 PROOF are decreasing, nonincreasing, and bounded If {a,}is nondecreasing and bounded above, then {an)converges (asimilar statement is true if {a.}is nonincreasing and bounded below). The set A consisting of all numbers an is, by assumption, bounded above, so A has «. We claim that lim a, = a (Figure 5). In fact, if e > 0, least upper bound there is some DN SatiSfying Then if n > N we have a HOOD Œ a,>aN, This proves that n->OO lim a. = - GN SO œ. < e, since a is the least upper bound of A. Œ-an<Œ-GN<E· 22. InfiniteSequences451 The hypothesis that {a.}is bounded above is clearly essential in Theorem 2: if or not {a.}is nondecreasing) {a,}is not bounded above, then (whether (a,} clearly might that consideration, there should it diverges. Upon first be little appear trouble deciding whether or not a given nondecreasing sequence {a,}is bounded above, and consequently whether or not {an)converges. In the next chapter such sequences will arise very naturally and, as we shall see, deciding whether or not they converge is hardly a trivial matter. For the present, you might try to decide whether or not the following increasing) sequence is bounded above: (obviously 1+(+(-, 1, 1+¾, 1+¼+¼+¾, .... Although Theorem 2 treats only a very special class of sequences, it is more useful than might appear at first, because it is always possible to extract from an arbitrary or else sequence {a,} another sequence which is either nonincreasing nondecreasing. of the sequence {as] To be precise, let us define a subsequence of the be form to a sequence raph of a a.,,an,,as the nj are natural numbers where 2 and 6 are peak points 2 FIGURE 3 4 5 6 7 8 9 ni to 11 6 LFNMA ,..., with < n2 < 83 · • Then every sequence contains a subsequence which is either nondecreasing or It is possible to become quite befuddled trying to prove this asthe proof is very short if you think of the right idea; it is worth ftCOrding as a lemma. nonincreasing. sertion, although Any sequence {a,}contains a subsequence which is either nondecreasing or nonincreasing. PROOF Call a natural number n a m > n (Figure 6). "peak point" of the sequence {a,}if a, < as for all Case 1. The sequence has mfmitelymany peakpoints. In this case, if n1 < n2 < are the peak points, then as, > a,2 > an, > so {a.,) is the desired na < ··· · --, subsequence. (nonincreasing) Case2. The sequence has only finitely many peakpoints.In this case, let ni be greater than all peak points. Since ni is not a peak point, there is some n2 > ni such that an, > any. Since n2 is not a peak point (itis greater than ni, and hence greater than all peak points) there is some n3 > n2 such that an, > a.,. Continuing in this subsequence. way we obtain the desired (nondecreasing) If we assume that our original sequence {a,} is bounded, we can pick up an extra corollary along the way. COROLLARY (THE BOLZANO-WEIERSTRASSTHEOREM) Every bounded sequence has a convergent subsequence. and InfiniteSeries 452 InfiniteSequences Without some additional assumptions this is as far as we can go: it is easy to construct sequences having many, evenly infinitely many, subsequences converging to different numbers (seeProblem 3). There is a reasonable assumption to add, which yields a necessary and sufficient condition for convergence of any secondition this will work, it does simplify Although not be crucial for our quence. many proofs. Moreover, this condition plays a fundamental role in more advanced investigations, and for this reason alone it is worth stating now. If a sequence converges, so that the individual terms are eventually all close to the same number, then the difference of any two such individual terms should be l for some l, then for any e > 0 there is very small. To be precise, if lim a, l| < e/2 for n > N; now if both n > N and m > N, then an N such that |a, = - e e 2 |a,-am|s|a.-l|+|l-aml<-+-=e- 2 This final inequality, |a, a.| be used to formulate a condition for convergence of a sequence. - < e, which eliminates mention of the limit l, can (theCauchy condition) which is clearly necessary A sequence {an}is a Cauchy sequence number N such that, for all m and n, DEFINITION if m, n (This condition > is usually written if for every e N, then |a, lim |am - -- a,| a,| < > 0 there is a natural e. 0.) = The beauty of the Cauchy condition is that it is also sufficient to ensure convergence of a sequence. After all our preliminary work, there is very little left to do in order to prove this. THEOREM 3 PROOF A sequence {an}converges if and only if it is a Cauchy sequence. We have already shown that {a.}is a Cauchy sequence if it converges. The proof of the converse assertion contains only one tricky feature: showing that every Cauchy 1 in the definition of a Cauchy sequence sequence {a,}is bounded. If we take e we find that there is some N such that = |a. - a, |< 1 for m, n > N. In particular, this means that lam - Thus {am: m whole sequence > aN+1| < 1 for all m > N. N} is bounded; since there are only finitely many other ag's the is bounded. 22. InfiniteSequences453 The corollary to the Lemma thus implies that some subsequence of {a.}converges. Only one point remains, whose proof will be left to you: if a subsequence Cauchy sequence converges, then the Cauchy sequence itself converges. of a PROBLEMS Verifyeach of the following limits, 1. (i) (ii) n lim (v) (vi) n! = g" < n/2. lim å k lim = 1, = 1. (vii) lim Šn2+ lim Ja" + (viii)n->oo n Ñ - = 0. n(n = - 1) max(a, = a, b b), - hm 2. $ k=1 = . nr+1 p + 1 Find the following limits. . (i) hmoon+1 = n+1 n - . n (ii) limn-Jn+adn+b. - R-¥DO lim (iii) n->oo lim (iv) naco lim (v) naoo (vi) R->00 2" + (-1)" 2=+1+ (-1)"+1 . Y sin(n") (-1)" . n+ 1 b" an + b" a" lim nc", > < n, in particular, for 0. 0, where a(n) is the number = n kP --oo k! for k 1. = Hint: The fact that each prime is how small a(n) must be. *(x) ... 0. > a b" a(n) -- 1 1 = 0. Hint: You should at least be able to prove 0. Hint: n! R->OO (ix) nlimoo fn + 1- Sn2+ lim NNOO - 0. = lim fn2 + (iii) th4a lim (iv) n->oo 1. = 1 n+ 3 lim n->oo n3 + 4 n + n->oo - . |c| < l. > of primes which divide n. 2 gives a very simple estimate of 454 and InfiniteSeries InfiniteSequences lim (vii)naoo 3. 2" -. n! can be said about the sequence {a,}if it converges and each as is integer? an 1, 1, (b) Find all convergent subsequences of the sequence 1, (There are infinitely many, although there are only two limits which such subsequences can have.) (c) Find all convergent subsequences of the sequence 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, (There are infinitely many limits which such subsequences can have.) the sequence Consider (d) (a) What -1, . . . -1, -1, . ... 1 1 2 . 1 2 3 2 1 3 4 1 For which numbers a is them a subsequence converging to a? 4. if a subsequence of a Cauchy sequence converges, then so does the original Cauchy sequence. (b) Prove that any subsequence of a convergent sequence converges. 5. (a) Prove that (a) Prove that a < 2, then a (b) Prove that the sequence if 0 < < 2. < converges. lim a, (c) Find the limit. Hint: Notice that if 8400 = l, then lim R->OD N = J, by Theorem 1. 6. Let 0 < ai < b1 and define as i = g, be = i a, + b. the sequences {a,}and {b.}each converge. (b) Prove that they have the same limit. (a) Prove that 7. In Problem 2-16 we saw that any rational approximation m/n to Ã can be by a better approximation (m + 2n)/(m + n). In particular, starting with m 1, we obtain n œplaced = = 1, (a) Prove that 3 -, 7 -, ... . this sequence is given recursively a1=1, a. i=l+ 1 1+a, by . 22. InfiniteSequences455 (b) Prove that lim a. n. = B->OD expansion Ã This gives the so-called continued.finction 1 1+ = 1 2+ 2+ --- Hint: Consider separately the subsequences {a2,}and {a2, i}. (c) Prove that for any natural numbers a and b, b a2+b=a+ . b 2a + 8. Identify the function many 9. f(x) 2a+..- lim (lim (cosnbrx)2k).(It has been mentioned = k->oo n->oo times in this book.) Many impressive looking limits can be evaluated easily (especially by the who makes them up), because they are really lower or upper sums in person disguise. With this remark as hint, evaluate each of the following. (Warning: the list contains one red herring which can be evaluated by elementary considerations.) Ñ+-·-+ Ñ naoo . (ii) lun lim (iv) n-oo . (v) lun W+ Ñ+·-·+ Ñ¯ . n ( ( 1 n+ 1 1 - n2 + + - · + · 1 1)2 (, n + + (n 1)2 n->oo lim (vi) nuo . n n->oo lim (iii) n->oo 10. + lim (i) ( n n2+1 + - 1 2n + . - - - n (n + 2)2 n + . (2n)2 - · · á + n + (n n)2 . n +---+ n2+22 Although limits like lim 1 + . n2+n2 and lim a" can be evaluated using facts about the behavior of the logarithm and exponential functions, this approach is vaguely dissatisfying,because integral roots and powers can be defined without using the exponential function. Some of the standard arguments for such limits are outlined here; the basic tools are inequalities derived from the binomial theorem, notably H->OD N->OC "elementary" (1 + and, for part h)" for h 1 + nh, > > 0; (e), (1+h)" à 1+nh+ n(n -- 2 1) h2 n(n 2 - 2 1) h2 , forh>0. 456 and InfiniteSenes InfiniteSequences 11. (a) Prove that n-woo lim a" (b) Prove that HNOO (c) Prove that n->oo (d) Prove that H->OD (e) Prove that H->OD lim a" = oo if a = 0 if 0 < 1 if a > 6 lim 6 lim & lim = 1, by setting a a < l. > = 1 if 0 = 1. 1 + h, where h = 1, by setting < a = > 0. 1+h and estimating h. 1. < (a) Prove that a convergent sequence is always bounded. (b) Suppose that lim a, 0, and that each a, > 0. Prove that the set of all = numbers 12. a, actually has a maximum member. (a) Prove that 1 (b) If an log(n + 1) < n+1 1+ = 1 2 1 3 -+ lim = n-koo ( 1+ < 1 --. n 1 --logn, -+-··+ n show that the sequence {a.) is follows that there is a number y log n - - · decreasing, and that each a, - + l - log n -- y > 0. It . This number, known as Euler's number, has proved to be quite refractory; it is not even known whether y is rational. 13. (a) Suppose that f is increasing on Show that [1,oo). n f(1)+---+ f(n-1) (b) Now choose f = < Ï I f(x)dx < f(2)+---+ f(n). log and show that -<n!< (n + 1)" n" ; it follows that lim =-woo -- & n = 1 -. e n/e, in the sense that the This result shows that Ñ is approximately ratio of these two quantities is close to 1 for large n. But we cannot conclude that n! is close to (n/e)" in this sense; in fact, this is false. An estimate for n! is very desirable, even for concrete computations, because n! cannot be calculated easily even with logarithm tables. The standard (anddifficult) theorem which provides the right information will be found in Problem 27-19. 22. InfiniteSequences457 14. I (a) Show that the tangent line to the graph of f at (xo,f (xo))intersects the horizontal axis at (xi, 0), where f (xo) xt=xo- This intersection point may be regarded as a rough approximation to the point where the graph of f intersects the horizontal axis. If we now start at xi and repeat the process to get x2, then use x2 to get xs, etc., we have a sequence {x.}defined inductively by I | « FIGURE x, Xn+1 , x. xi = Xn - 8k for some (k in i 9 x. xx a t xa x' "' (C, Xk). ! f (xk) = i #' f (xk) f'($k) = Show that Ok+1 /(Xk) ¯ • J'(Xk) Í'($k) Conclude that Ok+1 = FIGURE . = = e - Figure 7 suggests that {x.}will converge to a number c with f (c) 0; this is called Newton'smethod for finding a zero of f. In the remainder of this problem we will establish some conditions under which Newton's method works (Figures 8 and 9 show two cases where it doesn't). A few facts about convexity may be found useful; see Chapter 11, Appendix. (b) Suppose that f,' f" > 0, and that we choose xo with f(xo) > 0. Show thatxoëx12x22---àc. 8k Xk c. Then Let (c) 7 , f (x.) I I . 8 for some Ok in f (xt) J'($k)j'(Xk) (C, Xk), · f"(g)(Xk ¯ $k) and then that (*) ôk+1 f"(Uk) f'(x ) 2 Ok · sup|f"| on inf f' on [c,xi]and let M m Newton's method works if xo c < m/M. (e) What is the formula for x.si when f(x) x2 A? 1.4 we get If we take A = 2 and xo 1.4 xo 1.4142857 xi 1.4142136 x2 1.4142136, x3 (d) Let = = [c,x1].Show that - = - = = = = = which is already correct to 7 decimals! Notice that the number of correct decimalsat least doubled each time. This is essentially guaranteed by the FIGURE 9 inequality (*)when M/m < l. 458 and InßniteSeries InfiniteSequences 15. Use Newton's method (i) f(x) (ii) f (x) (iii) f (x) (iv) f (x) *16. = tanx = cos x - - x3 + x = x3 = to estimate the zeros of the following functions. cos2x near 0. x2 Û. on [0,1]. on [0,1]. near 1 - 3x2 + 1 -- Prove that if R-¥OC lim a. = l, then . hm (a1+---+a.) naoo n Hint: This problem is very lem 13-40. 17. Suppose that lim f(x)/x similar to (infact it is a is continuous and lim f (x + 1) 0. Hint: See the previous problem. f = X->OD *18. =l. - special f (x) case of) Prob- 0. Prove that = l. Prove that lim a. 1/a. Suppose that a, > 0 for each n and that H->OO lim g l. Hint: This requires the same sort of argument that works in naoo 1, for a > 0. Problem 16, together with the fact that n->OO lim = = = 19. (a) Suppose that {an}is a convergent that lim an is also in n-woo (b) Find a convergent sequence of points all in [0,1]. sequence {a.) of points all in (0, 1) such [0,1]. Prove that lim an is not in (0, 1). 20. Suppose that f is continuous x, and that the sequence f(x), f(f(x)), f(f(f(x))), "fixed converges to l. Prove that I is a special cases have occurred already. 21. ... point" for f, i.e., f (l) = l. Hint: Two (a) Suppose that f is continuous on [0, 1] and that 0 5 f(x) 5 1 for all x in 7-11 shows that f has a fixed point (inthe terminology 1]. Problem [0, of Problem 20). If f is increasing, a much stronger statement can be made: For any x in [0,1], the sequence x, a fixed point, by Problem 20). Prove this the behavior of the sequence for f (x) > x and by f (x) < x, or by looking at Figure 10. A diagram of this sort is used in Littlewood's Mathematician'sMiscel&my to preach the value of drawing only pictures: "For the professional the proof needed is [thisFigure]." Suppose that f and g are two continuous functions on [0,1], with 0 5 f(x) < 1 and 0 < g(x) 5 1 for all x in [0, 1], which satisfy fog go f. Suppose, moreover, that f is increasing. Show that f and g have has a limit assertion, FI GU RE lo *(b) f(x), f(f(x)),... is necessarily (which examining = 22. InfiniteSequences459 a common f (l) = l = fixed point; in other words, there is a number l such that g(l). Hint: Begin by choosing a fixed point for g. amused themselves by asking whether the For a long time mathematicians conclusion of part (b)holds without the assumption that f is increasing, but in the Noticesof the American Mathematitwo independent announcements cal Society, Volume 14, Number 2 give counterexamples, so it was probably a pretty silly problem all along. The trick in Problem 20 is really much more valuable than Problem 20 might point theorems" depend upon suggest, and some of the most important looking at sequences of the form x, f (x), f (f (x)), A special, but representaof such treated in Problem23 (forwhich the next problem theorem is tive, case one is preparation). "fixed . .. . -¢ 22. (a) Use Problem 2-5 to show c" Mm+1 + (b) Suppose that |c| < that if c 1, then Cm . . . + c" Cn+1 - = . 1 - c 1. Prove that lim c'"+---+c"=0. that {x.) is a sequence with Prove that {x.}is a Cauchy sequence. (c) Suppose *23. |x. --- xn+1| 5 c", where c < l. Suppose that f is a function on R such that (*) where c < |f (x) f (y)| 5 - c|x - for all x and y, y|, 1. (Such a function is called a contraction.) (a) Prove that f is continuous. (b) Prove that f has at most one (c) By considering the sequence x, fixed point. f(x), f(f(x)), ..., for any x, prove that f does have a fixed point. (This result, in a more lemma.") general setting, is known as the "contraction 24. (a) Prove that if f is differentiable and |f'[ < 1, then f has at most one fixed point. (b) Prove that if |f'(x)| 5 c < 1 for all x, then f has a fixed point. (c) Give an example to show that the hypothesis |f'(x)| 5 1 is not suflicient to insure that f has a fixed point. 25. This problem is a sort of converse to the previous problem. Let be be a lim b, exists sequence defined by bi a, b.4i f (b.). Prove that if b HNOO and f' is continuous at b, then [J'(b)| 5 1. Hint: If |f'(b)| > 1, then = = = and InßniteSaies 460 InfiniteSequences 1 for all x in an interval around b, and b, will be in this interval for large enough n. Now consider f on the interval [b,bn]. |f'(x)| 26. > This problem investigates for which a > 0 the symbol aa makes b = In other words, if we define bi sense. a, basi = = ab , when does lim b, exist? n->OO b. (The situation is similar to that in if b exists, then ab Problem 5.) According to part (a),if b exists, then a can be written in the form (b) y1/2 ylj> and conclude that for some y. Describe the graph of g(y) 0<ase1/e (c) Suppose that 1 5 a 5 e 16. Show that {b.}is increasing and also b. E e. This proves that b exists (andalso that b 5 e). (a) Prove that = = The analysis for a < 1 is more difBcult. Problem 25, show that if b exists, then e¯1 elle that e¯' sa5 (d) Using From now on we will suppose that e¯" (e) Show that < a < sbs e. Then show 1 the function a2 f(x)= logx is decreasing on the interval (0, 1). Let b be the unique number such that ab b. Show that a OD (g) Using part (e)again, show that I b. lim b. b. lim a2.42 b, so that ¤-700 (h) Finally, show that R->OG = = = = = 27. Let {x.}be a sequence y. (a) Prove that x, or R->OD = is bounded, and let sup{x., xn+1, Xn+2, the sequence - - - Ì- {y,) converges. The limit lim lim sup x., and called the limit superior, of the sequence (b) Find 5 which {x.}. x, for each of the following: (i) x, = (ii) x, = 1 -. n 1 (--1)"-. n HNDO y. is denoted by or upper limit, 22. InßniteSequences461 (iii) x, = (iv) x, = (-1)" 1 + - 1 . n J. (orliminf (c) Define lim x, x.) and prove that that lim x, exists if and only if B-koQ (d) Prove lim x, case N-FOO = 4 E 00 x, i R-koQ x. = g x, and that in this lim x.. = the definition, in Problem 8-18, of E A for a bounded set A. that if the numbers x, are distinct, then lim x, lim A, where Prove (e) Recall = n->oo A={x.:ninN). 28. In the Appendix to Chapter 8 we defined uniform continuity of a function on an interval. If f (x) is defined only for rational x, this concept still makes sense: we say that f is uniformly continuous on an interval if for every e > 0 there is some ô > 0 such that, if x and y are rational numbers in the interval and |x - y| (a) Let x < 8, then be any |f (x) f (y) | < e. - or irrational) point in the (rational of rational interval, and let (x.) be lim xn x. Show such that points in the interval a sequence naoo that the sequence {f (x.)) converges. (b) Prove that the limit of the sequence {f (x.)] doesn't depend on the choice of the sequence (x.}. We whole will denote this limit by interval. (c) Prove that the extended f(x),so jis function j that = is an extension uniformly of continuous to the f on the interval. 29. 0, and for rational x let f (x) a2, as defined in the usual elementary algebraic way. This problem shows directly that f can be extended to a continuous function / on the whole line. Problem 28 provides the necessary Let a > = machinery. for rational x < y. 10, show that for any e numbers x close enough to 0. (a) Show that a' < (b) Using Problem rational (c) Using the ai equation a' - af = aT(a**I interval f is uniformly continuous, (d) Show that the extended function isfies *30. f(x+y)= / > 0 we have |ax - 1| < e for 1), prove that on any closed in the sense of Problem 28. of Problem 28 is increasing and sat- f(x)f(y). The Bolzano-Weierstrass Theorem is usually stated, and also proved, quite differently than in the text-the classical statement uses the notion of limit and InßniteSeries 462 InfiniteSequences points. A point x is a limit point of the set A if for every e point a in A with |x a| < e but x ¢ a. > 0 theœ is a - (a) Find all limit (i) points of the following sets. I1 -:ninN . n 1 1 -+--:nandminN (ii) . m n 1+ (iii) (-1)" n in N : . (iv) Z. (v) Q. x is a limit point of A if and only if for every e > 0 there are infinitely many points a of A satisfying x a| < e. Prove that ÏÏmA is the largest limit point of A, and lim A the smallest. (b) Prove that - (c) The usual form of the Bolzano-Weierstrass Theorem states that if A is an infinite set of numbers contained in a closed interval [a,b], then some point of [a,b] is a limit point of A. Prove this in two ways: form already proved in the text. Hint: Since A is infinite, there in A. are distinct numbers xi, x2, x3, (e) Using the Nested Intervals Theorem. Hint: If [a,b] is divided into two intervals, at least one must contain infinitely many points of A. (d) Using the ... 31. (a) Use the Bolzano-Weierstrass Theorem to prove that if f is continuous on [a,b], then f is bounded above on [a,b]. Hint: If f is not bounded above, then there are points x, in [a,b] with f(x.) > n. the Rolzano-Weierstrass Theorem to prove that if f is continuous on [a,b], then f is uniformly continuous on [a,b] (seeChapter 8, (b) Also use Appendix). **32. (a) Let {a,}be the 1 2' sequence 1 2 3= 3' 1 4' 2 4· 3 4' 1 5' 2 5' 3 5' 4 5' 1 6' 2 6' "" Suppose that 0 5 a oo = -b-a. n {a,}of numbers [0,1] if , in [0,1] is called N(n; a, b) n =b--a uniformly distributed 22. InfiniteSequences463 for all a and b with 0 5 a oo n (c) Prove that if {a.}is uniformly on [0,1], then i . f=hm n--oo **33. (a) Let f distributed in f(ai)+---+ be a function defined on [0, 1] and f f(a.) is integrable . n [0, 1] such that lim y->a f (y) exists for all a e > 0 prove that there are only finitely many points a with | lim f (y) f (a)| > e. Hint: Show that the set of such 1] [0, y->a points cannot have a limit point x, by showing that Iim f (y) could not in in [0,1]. For any - y-ex exist. in the terminology of Problem 21-5, the set of points where f is discontinuous is countable. This finally answers the question of Problem 6-16: If f has only removable discontinuities, then f is continuous except at a countable set of points, and in particular, f cannot be discontinuous everywhere. (b) Prove that, CHAPTER INFINITE SERIES Infinite sequences were introduced in the previous chapter with the specific intention of considering their "sums" ai + a2 ‡ a3 ¯I¯°· · in this chapter. This is not an entirely straightforward matter, for the sum of infinitely many numbers is as yet completely undefined. What can be defined are the "partial sums" ss=a1+·--+as, and the infinite sum must presumably be defined in terms of these partial sums. Fortunately, the mechanism for formulating this definition has already been developed in the previous chapter. If there is to be any hope of computing the infinite closer and closer apsum ai + a2 + a3 + sums s, should represent , the partial proximations as n is chosen larger and larger. This last assertion amounts to little sum" ai + a2 Ÿ G3 + ought more than a sloppy definition of limits: the "sum" necessarily approach will of leave the to be lim s,. This many sequences · · · "infinite - -· B-FOO undefined, since the sequence {s,}may easily fail to have a limit. For example, the sequence -1, 1, with a, = (-1)n+1yields -1, 1, ... the new sequence s1=at=1, s2 = ai + a2 ss=ai+a2 = 0, 3=Î, s4=al+a2+a3+G4=Û, for which lim s, does not exist. Although there happen to be some clever extensions of the definition suggested here (seeProblems 9 and 24-20) it seems unavoidable that some sequences will have no sum. For this reason, an acceptable definition of the sum of a sequence should contain, as an essential component, terminology which distinguishes sequences for which sums can be defined from less fortunate sequences. nooo 464 23. InfiniteSeries 465 The sequence DEFINITION {a,}is summable if the sequence s, ai + = · - - (s.} converges, where + an. lim s. is denoted by In this case, HROO (or,less formally, al a. + a2 + a3 + - - n=1 and is called the sum of the sequence {a,}. The terminology introduced in this definition is usually replaced by less precise expressions; indeed the title of this chapter is derived from such everyday language. An infinite sum an is usually called an infmitesenes, the word "series" emphasiz- n=I ing the connection with the infinite sequence is not, summable is conventionally replaced {a,}.The statement by the statement does, or does not, converge. This terminology is somewhat best the symbol as denotes a number n=1 (soit can't "converge"), that that the {a,}is, or series a. peculiar, because at and it doesn't denote anything at all unless {a.}is summable. Nevertheless, this informal language is convenient, standard, and unlikely to yield to attacks on logical grounds. Certain elementary arithmetical operations on infinite series are direct consequences of the definition. It is a simple exercise to show that if {a.}and {b.}are summable, then (an+ b.) n=1 n=1 n=1 c·a.=c n-1 b., an + = a.. n=1 As yet these equations are not very interesting, since we have no examples of summable sequences (except for the trivial examples in which the terms are eventually all 0). Before we actually exhibit a summable sequence, some general conditions for summability will be recorded. There is one necessary and sufficient condition for summability which can be stated immediately. The sequence (a,}is summable if and only if the sequence {s,} lim sm converges, which happens, according to Theorem 22-3, if and only if FN,H-¥DO - s. = 0; this condition can be rephrased in terms of the original sequence as follows. and InfiniteSeries 466 InfiniteSequences {a.) is summable if and The sequence THE CAUCHY CRITERION only lim a,4i +···+am if 0. = NE.R->OC Although the Cauchy criterion is of theoretical importance, it is not very useful for deciding the summability of any particular sequence. However, one simple consequence of the Cauchy criterion provides a necessary condition for summability which is too important not to be mentioned explicitly. If {an} is summable, THE VANISHING CONDITION then lim a, 0. = n->oo This condition follows from the Cauchy criterion by taking m l, then lim s. be proved directly as follows. If B->OD = n + 1; it can also - lim a, lim (s. = - s, HNCO 8400 i) _ = lim s. lim s. - R-FOO N->00 _ i =l-l=0. lim 1/n Unfortunately, this condition is far from sufficient. For example, B-¥OC 0, but the sequence {l/n} is not summable; in fact, the following grouping of the numbers 1/n shows that the sequence {s,}is not bounded: = 1+(+ (+¾+¾+g+ -2 I (+(+(+-··+¾+-·· . -2 (2terms, each à¼) 1 (4terms, each 1 -2 () (8terms, each g) of proof used in this example, a clever trick which one might never reveals the need for some more standard methods for attacking these problems. see, These methods shall be developed soon (oneof them will give an alternate pmof The method 1/n does not converge) that but it will be necessary to first procure a few n=1 examples of convergent series. The most important of all infinite series are the "geometric r"=1+r+r2+r3+ series" -. n=o Only the cases (r| < 1 are interesting, since the individual terms do not approach 0 if |r| ;> 1. These series can be managed because the partial sums s,=1+r+..-+r" can be evaluated in simple terms. The two equations s.=1+r+r2+-..+r" rsn= r+r2+---+r"+r"*l 23. InfiniteSenes 467 lead to s.(1 r) - 1 = or 1 s.= by (division lim r" = n->oc 1 - r is rn+1 - r.+1 - 1 - r valid since we are not considering the case r 1). Now = 0, since Ir| < 1. It follows that °° r" = 1 lim 1 = 1 n->= n-0 r"+1 - --- 1 r |r , - r < l. In particular, "/1" " - n=l 2 =L 1" 2 1 I_1-1=1, -1= - 2 that is, (+)+¼+¼+...=1, an infinite FIGURE sum which can always be remembered from the picture in Figure 1. 1 Special as they are, geometric series are standard tests for summability will be derived. For a while we shall consider only sequences examples {a,}with from which important each 0; such nonnegative. called nonnegative then the seIf is sequence, {a.} a sequences are remark, with combined clearly nondecreasing. This Theorem 22-2, quence {s.}is provides a simple-minded test for summability: a, > A nonnegative sequence {a,} is summable if and only if the set of partial sums s, is bounded. THE BOUNDEDNESS CRITERION By itself this criterion is not very helpful-deciding whether or not the set of is bounded is just what we are unable to do. On the other hand, if some convergent series are already available for comparison, this criterion can be used to obtain a result whose simplicity belies its importance (itis the basis for almost all s, all other tests). THEOREM l (THE COMPARISON TEST) Suppose that O < a. 5 b. for all n. and InßniteSeries 468 InßniteSequences Then if bn converges, so does n=1 PROOF a.. n=1 If +---+as, +---+b., s,=ai tn=bi then Oss.st, Now {t.) is bounded, since foralln. b, converges. quently, by the boundedness criterion Therefore {s.}is bounded; conse- a, converges. n=1 the comparison test can be used to analyze very complicated series which looking in most of the complication is irrelevant. For example, Quitefrequently 2 + sin3(n + 1) 2= + n2 n-1 converges because 05 2 + sin3(n + 1) 2"+n2 < 3 and "3 "1 n=1 n=1 series. is a convergent (geometric) Similarly, we would expect the series °° 1 2" - 1 + sin2 n3 to converge, since the nth term of the series is practically 1/2" for large n, and we would expect the series °° n+1 n=1 to diverge, since n2+1 (n + 1)/(n2 + 1) is practically 1/n for large n. These facts can the be derived immediately from following theorem, another kind of test." "comparison -/ THEOREM 2 If an, b, converges. > 0 and HNCO lim a./b. = c a, converges if and only if 0, then n=1 b. n=1 23. InfiniteSeries 469 00 PROOF Suppose Since R-FDD lim a,|b, b, converges. n=1 - a, 5 2cb, But the sequence 2c c, there is some N such that for n > b, certainly converges. N. Then Theorem 1 shows that n=N as converges, and this implies convergence of the whole series an, which n= N has only finitely many additional terms. The converse follows immediately, since we also have lim bs/a. n->oo n= I = 1/c ¢ 0. The comparison test yields other important tests when we use previously analyzed series as catalysts. Choosing the geometric series r", the convergent n-0 series THEOREM 3 (THE RATIO TEST) Let a. par excellence,we obtain the > important of all tests for summability. most 0 for all n, and suppose that Un+1 . hm =soo Then a, converges if r < = -- a. r. l. On the other hand, if r 1, then the terms a, do > n=1 not approach 0, so a, diverges. (Notice that it is therefore essential to compute n=1 lim an+1/a. and not lim a./a,4i!) n->oo PROOF Suppose first that r n-woo < 1. Choose any number s with r =r<1 lim nooo as implies that there is some N such that n+1 forn>N. <s This can be written for n an+1 5 sa, Thus aN+1 SGNs aN+2 SUN+1 aN+k S GN, SkaN - _> N. < s < 1. The hypothesis 470 and InfiniteSeries InfiniteSequences aNSk Since Sk UN = k=0 the comparison test shows that converges, k=0 = a. aN+k n=N converges. k=0 This implies the convergence The case r > 1 is even easier. If 1 < s a.si which means of the whole series > r, then there < for n s > N, = 0, 1, an. is a number N such that that NSk GN+k k > GN .... This shows that the individual terms of {a.} do not approach 0, so {a.) is not summable. of the ratio As a simple application test, consider the series 1/n! . Letting n=1 a, 1/n! we obtain = 1 n+1 (n+1)! 1 an 1 n! (n + 1)! Thus . lun an+1 naoo which 0, = -- as 1/n! converges. If we consider instead the series shows that the series n=1 r"/n!, where r then is some fixed positive number, n=1 rn+1 (n + 1)! n r"/n! so oc r r" oo = n+ 1 converges. It follows that n=1 lim naoo r" = - n! 0, = 0, 23. InfiniteSeries 471 a result already proved in Chapter 16 (theproof given there was based on the same ideas as those used in the ratio test). Finally, if we consider the series have lim (n+ 1)rn÷1 lim r = nr" naoo - 1 n + naoo = nrn we r, n oc since lim (n + 1)/n = 1. This proves that if 0 5 r < nr" converges, 1, then and consequently lim nr" 0. = N->OO -1 < r (This result clearly holds for 0, also.) It is a useful exercise to provide a without using the ratio test as an intermediary. direct proof of this limit, Although the ratio test will be of the utmost theoretical importance, as a practical tool it will frequently be found disappointing. One drawback of the ratio test is the fact that lim anoi/a. may be quite difficult to determine, and may not even _< exist. A more serious deficiency, which appears with maddening regularity, is the might a.4i/a. equal precisely that the the fact limit 1. The case B-900 lim 1 is one = which is inconclusive: {an}might not be summable (forexample, if a, = 1/n), (1/n)2 but then again it might be. In fact, our very next test will show that n=1 converges, even though 1 n + lim 1. = This test provides a quite different method for determining convergence or divergence of infinite series-like the ratio test, it is an immediate consequence of the comparison test, but the series chosen for comparison is quite novel. THEOREM 4 (THE INTEGRAL TEST) Suppose that Then f is positive and decreasing on a, converges [1,oo), and that f (n) = if and only if the limit n=1 A oo Ï 1 exists f = lim A->oo 1 f lA PROOF The existence of lim A-voo i f is equivalent to convergence 2 3 f+ 2 4 f+ f+.-.. of the series as for all n. and InfiniteSeries 472 InfmiteSequences Now, since f is decreasing we have (Figure 2) n+1 f (n + 1) < f f (n). < The first half of this double inequality shows that the series a, i may be com- n=1 n+1 oc pared to the series i " n=1 1 2 3 4 n oo i (andhence an+1 n=1 n + 1 ian, pared to the series L f A 00 2 n+1 00 The second half of the inequality shows that the series FIGURE an) converges n=1 exists. if hm | oo f, proving that proving that lim Amoo n=1 be com- oc f i may ia, must exist if converges. n=1 Only one example using the integral test will be given here, but it setdes the question of convergence for infinitely many series at once. If p > 0, the convergence of by the integral test, to the existence of 1/nf is equivalent, n=1 °° - l dx. xP Now 1 ÏA ¯ - dx (p = - 1 1) Ap-1 1 p - 1 log A, p = 1. A This shows that lim Amoo Ï converges precisely for p oo 1/xP dx exists if p 1, but not if p 5 1. Thus > 1/n? n=1 > l. In particular, 1/n diverges. n=1 The tests considered so far apply only to nonnegative sequences, but nonpositive sequences may be handled in precisely the same way. In fact, since -a, as = - , .=i .=i all considerations about nonpositive sequences can be reduced to questions involving nonnegative sequences. Sequences which contain both positive and negative terms are quite another story. If an is a sequence with both positive and negative terms, one can con- n=i |a,|, all sider instead the sequence n=1 of whose terms are nonnegative. Cheerfully 23. InfiniteSeries 473 ignoring the possibility that we may have thrown away all the interesting information about the original sequence, we proceed to culogize those sequences which are converted by this procedure into convergent sequences. The series DEFINITION as is absolutely if the series convergent la,| is convergent. n=1 n=1 (In more formal language, the sequence {a.) is absolutely summable if the {|a.|) is summable.) sequence Although we have no right to expect this definition to be of any interest, it turns out to be exceedingly important. The following theorem shows that the definition is at least not entirely useless. THEOREM Every absolutely convergent series is convergent. Moreover, a series is absolutely convergent if and only if the series formed from its positive terms and the series formed from its negative terms both converge. 5 If PROOF |a.| converges, then, by the Cauchy criterion, n=I lim |a.wil+---+|a.|=0. . m,n-oo - Since a,4t|+---+|a.|, |a,41+---+a.|5 it follows that lim an+1+···+a.=0, m,n->oo which shows that as converges. n=1 To prove the second part of the theorem, let a.* so that a = a, = Ï a., Í 0, a., 0, if a. 2 0 if a. 5 0, if as 5 0 if a. à 0, is the series formed from the positive terms of n=1 an, and n=i is the series formed from the negative terms. a,* If n=1 and both converge, a,¯ then n=1 a, n=1 | [an* (an¯)] = - n=1 = a. n=1 an¯ -- n=1 a, n=1 474 and Infinite Series InfiniteSequences also converges, so a, converges On the other hand, if |a,| absolutely. converges, then, as we have just shown, also converges. as n=i n=i Therefore n=1 n=1 n=1 °° °° and = 1 both converge. It followsfrom Theorem 5 that every convergent series with positive terms can be used to obtain infinitely many other convergent series, simply by putting in minus signs at random. Not every convergent series can be obtained in this way, however-there are series which are convergent but not absolutely convergent convergent). In order to prove this state(suchseries are called conditionally ment we need a test for convergence which applies specifically to series with positive and negative terms. THEOREM 6 (LEIBNIZ'S THEOREM) Suppose that ai>a22a32---20, and that lim a, = 0. n->oo Then the series (-1).+1a, = ai - a2 3 - a4 * US ·¯ •• n=1 converges. PROOF Figure 3 illustrates relationships (1)s2 5 s4 36 (2)siksstss2···, ' (3)sk FIGURE se 3 · > if k is even and I is odd. al flllll ss · between the partial sums which se Ja 5:s IIIIII sa In fo 17 Is is 5: we will establish: 23. InfiniteSeries 475 To prove the first two inequalities, observe that (1) s2, + a2.wi = s2, 2 s2., (2) = s2,43 - a2,42 since a2.gi s2n+1 s2n+1, à a2,42 Ÿ 02n+3 U2n+2 - since a2n+2 2 a2n+3· To prove the third inequality, notice first that s2, = s2., 5 52n-l - a2, since a2n 0. This proves only a special case of (3),but in conjunction with case is easy: if k is even and I is odd, choose n such that 2n 2 k and 2n - (1)and (2)the general 1 à l; then Sk 52n 52,_) Si, which proves (3). Converges, Now, the sequence {s2n} above (bys¿ for any odd l). Let a because it is nondecreasing sup(s2,} = and is bounded lim s24. = n->oo Similarly, let ß It follows from (3)that a 5 s2.wi it is actually the case that The standard example lim s2n+1· = B->OO ß; since s2, - « inf(s2.wi} = = = ß. a2n+1 and lim a, 0 = R->OO This proves that a = ß = lim s.. HNOO derived from Theorem 6 is the series 1-(+¾-¼+(----, which is convergent, but not absolutely convergent (sincen=1 1/n does not If the sum of this series is denoted by x, the following manipulations to quite a paradoxical œsult: verge). x=1-j+¾-¼+¾-(+--- (thepattern here is one positive term followed by two negative ones) =(1-))-¼+(j-4)-¼+(J-Å)-i+(j-¾)·-i+- =l-¼+g-¼+A-h+Å-·rš+ con- lead and InßniteSeries 476 InfiniteSequences -/- x/2, 0. On the other hand, it is easy to see that x 0: the partial sum s2 equals and the proof of Leibniz's Theorem shows that x ;> s2. This contradiction depends on a step which takes for granted that operations valid for finite sums necessarily have analogues for infinite sums. It is true that the so x = implying that x = ), sequence contains all the numbers in the sequence 11 1(} 11 11 11 1 1 l of {an}in the following precise sense: each In fact, {b,}is a rearrangernent b, the natural numbers, agn) where f is a certain function which that is, every natural number m is f (n) for precisely one n. In our example "permutes" = f (2m+ 1) = 3m + 1 (theterms 1, y, y places), = 3m (theterms , . . . go into the 1st, 4th, 7th, . .. -¾, f(4m) ... f(4m + 2) = 3m + 2 -g, places), (theterms ... go into the 3ri, 6th, 9th, -y,... -(, -y, go into the 2nd, 5th, 8th, -y, ... places). b, should equal Nevertheless, there is no reason to assume that n=1 sums are, by definition, lim bi + · · naoo order · + b, and lim ai + usoo of the terms can quite conceivably matter. a.: these n=1 . . . + a., so the particular (-1)"**/n is not The series n=1 special in this regard; solutely convergent-the than a theorem) shows THEOREM 7 indeed, its behavior is typical of series which are not abfollowing result (really more of a grand counterexample how bad conditionally convergent series are. a, converges, but does not converge absolutely, then for any number a there If n=1 is a PROOF rearrangement Let p, n=1 {b,}of {a,}such that b, = a. denote the series formed from the positive terms of {an}and let q. n=1 denote the series of negative terms. It followsfrom Theorem 5 that at least one of these series does not converge. As a matter of fact, both must fail to converge, for if one had bounded partial sums, and the other had unbounded partial sums, then 23. InfiniteSeries 477 the original series as would also have unbounded partial sums, contradicting the assumption that it converges. Now let a be any number. Assume, for simplicity, that < a 0 will be a simple modification). there is a number Since the series « > 0 (theproof for pn is not convergent, N such that N n=1 We will choose Ni to be the smallest N with this property. This means that NI-1 (1) p. 5 a, n=1 Ni (2) but p, > a. n=1 Then if N1 Si = L Pn, n=1 we have Si IIII I p,+ o FIGURE · +ps_, I a a 5 - pN1* I p,+ - +p,._,+ps, - 4 This relation, which is clear from Figure 4, followsimmediately from equation (1): N1-1 S1-a5S1- Pn=PNin=1 To the sum S1 we now add on just enough negative terms to obtain a new sum T1 which is less than a. In other words, we choose the smallest integer Mi for which M1 Ti=Si+ q.<a. n=1 As before, we have « - Ti 5 -qui . We now continue this procedure indefinitely, obtaining and smaller than a, each time choosing the smallest sums alternately Nk or Mk possible. larger The 478 and Infinite Series Inßnite Sequences sequence pt, pNI,91,•••, ..., 9Mi, PN1+1= ··· s PN2'''' of {a,}.The partial sums of this rearrangement increase to Si, is a rearrangement then decrease to Ti, then increase to S2, then decrease to T2, etc. To complete the and |Tk-«| are less than or equal to pN, or -qMa, proof we simply note that |Sk respectively, and that these terms, being members of the original sequence {a.}, -a| decrease to 0, since must a, converges. n=i Together with Theorem 7, the next theorem establishes conclusively the distinction between conditionally convergent and absolutely convergent series. THEOREM 8 a. converges absolutely, If and {b.) is any of rearrangement {a.},then b. n=1 n=I also converges and (absolutely), an bn- = n=1 PRooF Let us n=1 denote the partial sums of {an]by s., and the partial sums of {b,}by t.. Suppose that e > a, converges, there is some N such that 0. Since n=1 E- a.-sN n=1 Moreover, since |a,| converges, we can also choose N so that n=1 |a.| - ·+ (lail + ) aN < E, n=i i.e., so that GN+1 # |aN+2 Now choose M so large that each of a1, whenever m > M, the difference t. sN exduded. Consequently, are definitely - |t. - SN N+1 N+3 aN ..., < ''' E- appear among b1, is the sum of certain N+2 N+3 ..., bM· a¿, rehere ai, · - Then ..., aN 23. InfiniteSeries 479 Thus, if m > M, then a.-t. a.-sN"(Ím-SN) = n=1 n=1 5 an - IN Ÿ Im SN ¯ n=1 <s+e. Since this is true for every e > 0, the series bn converges to an. ==1 n=1 b, converges absolutely, note that {|b,| } is a rearrangement To show that n=1 of {|a,\ };since |a,| converges absolutely, |b,| converges by the first part of n=1 n=1 the theorem. is also important when we want to multiply two infinite series. Unlike the situation for addition, where we have the simple formula Absolute convergence an n=1 b, + n=i (a, + b.) = , n=1 there isn't quite so obvious a candidate for the product a. n=1 bN - ')· =(a1+a2+---)-(bi+b2 n=1 It would seem that we ought to sum all the products a,bj. The trouble is that these form a two-dimensional array, rather than a sequence: aib2 alb3 a2b1 a2b2 U2b3 a3bi asb2 asb3 albi --- · • - -• - Nevertheless, all the elements of this array can be arranged in a sequence. The picture below shows one way of doing this, and of course, there are (infinitely) many other ways. arbi arba albs a2bi a2b; a2b3 aabi aaba a3bs ... ... . and InfiniteSenes 480 InfmiteSequences Suppose that (c,} is some sequence of this sort, containing each product a¿bj just once. Then we might naively expect to have = c, be. · a, n=1 n=I n=1 But this im't true (seeProblem 8), nor is this really so surprising, since we've said of the terms. The next theorem shows nothing about the specific arrangement that the result does hold when the arrangement of terms is irrelevant. THEOREM 9 b, converge absolutely, as and If n=1 and {c.) is any sequence containing the n=1 products arbj for each pair (i, j), then = c, a. n=1 PROOF b.. - nl n=l Notice first that the sequence L L i|. pL= |b;| i=1 j=1 converges, since {a,}and {b.}are absolutely convergent, and since the limit of a product is the product of the limits. So (PL} is a Cauchy sequence, which means that for any e > 0, if L and L' are large enough, then i|· i=1 |bj|- |bj| |a¿|i=1 j=1 < . j=1 It follows that L (1) |a;| |bj| 5 < . e. i or j>L Now suppose that N is any number so large that the terms c, for n 5 N include every term arbj for i, j 5 L. Then the difference N L L (c.- La¿-Ebj n=1 consists of terms a¿b; with i > i=1 L or N (2) n-l j > j=1 L, so L L c.-Lag·ibj i=l i 5 |ar|-|bj| i or j>L j=l < e by (1). But since the limit of a product is the product of the limits, we also have (3) by a¿· i=1 j=1 - bj a¿· i=1 j=l < e 23. InfiniteSeries 481 for large enough L. Consequently, if we choose L, and then N, large enough, we will have og N oo oo La¿-Ebj Ec, i=1 L - i=1 j=1 L oo Lag-Ebj La¿-Eb; 5 - i=1 i=1 j=1 L La¿-ibj ic, j=1 + - n=1 i=l 2e < j=1 N L by (2)and (3), which proves the theorem. Unlike our previous theorems, which were merely concerned with summability, this result says something about the actual sums. Generally speaking, there is in any simpler no reason to presume that a given infinite sum can be expressions simple equated infinite be However, to many sums by using terms. can Taylor's Theorem. Chapter 20 provides many examples of functions for which "evaluated" "f(a) f (x) (x = a)' + Rn,a(x), - i=0 where lim R..a(x) 0. This is precisely equivalent to = BROC " f (x) f (a) i lim = . (x a)', - i=0 which means, in turn, that °° f (x) f (a) = (x a)'. - i=0 As particular examples we have x . 3 3! x 2 x 2! X = x - , 7! 4 x 6 6! 4! X2 X3 X4 2! 3! 4! ·, 1! arctan x 7 x 5! cosx=1---+---+-·-, «'=1+-+--+-+-+ 5 x smx=x--+-----+ x 3 -- 3 2 + x 5 - 7 -- 5 3 x x log(1+x)=x--+-----+-+---, 2 3 - x + 7 4 x 4 · · - , x 5 5 |x| 5 1, -1<xs1. and InfiniteSeries 482 InßniteSequences (Notice that the series for arctan x and log(1+x) do not even converge for |x| > 1; the series for log(1 + x) becomes in addition, when x -1, = does not converge.) Some pretty impressive which results with are obtained 3 5 particular values of x: 7 0=×--+---+-·-, 3! 5! 7! 1 l 1 e= l+-+---+--+---, 1! 2! 3! 1 1 1 x 4 357 1 1 1 log2=1--+---+.... 2 3 4 -=1--+---+--- ' More significant developments may be anticipated if we compare the series for sin x and cos x a little more carefully. The series for cos x is just the one we would have obtained if we had enthusiastically differentiated both sides of the equation . smx=x---+--... x3 75 3! 5! term-by-term, ignoring the fact that we have never proved anything about the derivatives of infmite sums. Likewise, if we differentiate both sides of the formula for we obtain the formula cos'(x) cos x formally (i.e.,without justification) sin x, and if we differentiate the formula for e= we obtain exp'(x) exp(x). shall such of the chapter that term-by-term next In differentiation infinite we see sums is indeed valid in certain important cases. = = - PROBLEMS 1. Decide whether each of the following infmite series is convergent or divergent. The tools which you will need are Leibniz's Theorem and the comparison, ratio, and integral tests. A few examples have been picked with malice aforethought; two series which look quite similar may require different tests (andthen again, they may not). The hint below indicates which tests may be used. °° sin nÐ n2 (i) L (ii) 1-¼+¾-(+·-·. (iii) 1-(+(-)+(-j+(-(+--·. (iv) (-1), n=1 • log n 23. InfiniteSerms 483 © 1 (v) (The summation begins with n 2 simply to avoid I the meaningless term obtained for n 1). = . n2 n=2 - = 1 (vi) . n2‡l (vii) . n=1 logn n=1 "1 (ix) --. n=2 °° (x) 1 L (IOg 4) k n=2 ' (log1n)" ° (xii)n=2 (--1)"(ÎOg 1 8)n n3n+ (xiii) 1 n=1 (xiv) sin . " 1 (xv) . nlo n=2 n * (xvi)L n=2 1 . n(logn) = (xvii) 1 n2(logn) (xviii) . n=1 °° (xix) 2"n! - . . 484 and InßniteSeries InßniteSequences °° (xx) 3"n! --. n=1 Hint: Use the comparison test for (i),(ii),(v),(vi),(ix),(x),(xi),(xiii), (xiv), (xvii);the ratio test for (vii),(xviii), (xix),(xx);the integral test for (viii), (xv),(xvi). The next two problems examine, with hints, some infmite series that require more delicate analysis than those in Problem 1. *2. solved examples (a) If you have successfully a"nl/n" it should be clear that (xix)and (xx)from for a converges < Problem 1, e and diverges for n=1 a > e. For a = e the ratio test fails; show that e"n!/n" actually n=1 diverges, by using Problem 22-13. (b) Decide when *3. nn/a"n! when converges, again resorting to Problem 22-13 n=1 the ratio test fails. (logn)-k Problem 1 presented the two series which lies between these two, is analyzed e7/y? (a) Show that fg°° of which n=2 the first diverges while the second converges. (logn)1°5" (logn)~", and n=2 The series ' in parts dy exists, by considering (a)and (b). the series (e/n)". n=1 (b) Show that (logn)l°8" converges, by using the integral test. Hint: Use an appopriate tion and part (a). substitu- (c) Show that (logn)°8½") diverges, by using the integral test. Hint: Use the same substitution as in part (b),and show directly that the resulting integral diverges. 23. InfiniteSeries 485 nl+1,,, 4. Decide whether 5. (a) Let {an}be a sequence of integers with 0 5 a. 5 9. Prove that n=1 exists (andlies between 0 and 1). (This, of course, is the number we usually or not converges. a,10¯" which ·) denote by 0.ara2a3a4·· (b) Suppose that 0 5 x 5 1. Pove that there is a sequence of integers with a.10¯" 0 5 as 5 9 and = Hint: For example, x. ai = n=1 (where[y]denotes the greatest integer which (c) Show a1, a2, that if (a.) ak, 1, a2, ..., is repeating, - · a.10¯" then ., i.e., {a,} [10x] is y y). is of the form ai,a2,...,aks is a rational number (andfind n=I it). The same result naturally holds if {a,}is eventually the sequence {aN+k) is repeating for some N. (d) Prove that if x ing. (Justlook at a.10-" = is rational, then repeating, {a.}is eventually i.e., if repeat- n=i the process of finding the decimal expansion of p/qdividing q into p by long division.) 6. Suppose that (an}satisfies the hypothesis of Leibniz's Theorem. Use the proof of Leibniz's Theorem to obtain the following estimate: (-1).+1an 7. - Prove that if a, :> 0 and lim n-Foo [a1-a2+-··±aN & = N- r, then as converges if r < 1, and n=i diverges if r > 1. (The proof is very similar to that of the ratio test.) This result is known as the "mot test." It is easy to construct series for which the ratio test fails, while the root test works. For example, the root test shows that the series i+i+(i)'+(02+(03+(03+ · converges, even though the ratios of successive terms do not approach a limit. Most examples are of this rather artificial nature, but the root test is nevertheless quite an important theoretical tool, and if the ratio test works the root test will also (byProblem 22-18). It is possible to eliminate limits from the root test; a simple modification of the proof shows that a, converges n=1 if there is some s < 1 such that all but fmitely many a, diverges if infinitely many n-1 W are > 6 are < s, and that 1. This result is known as the and InfiniteSeries 486 InfiniteSequences "delicate test" root the notation (thereis a similar delicate ratio test). It follows, using 00 Problem 22-27, that of i a, converges if Eg is possible if lim g < 1 and n=l diverges if lim & n->oo 8. 1; no conclusion > {a,}and {b.),let c, For two sequences n-woo akbn+1-k· = = 1. (Then c. is the sum k=1 diagonal in the picture on page 479.) The series of the terms on the nth c. is called the Cauchyproductof as and b.. n=1 n=I (-1)"|{, show that |c,| If a, b, = = n=1 2 1, so that the Cauchy product does not converge. 9. {a.}is called A sequence . hm with Cesaro sum l, summable, Cesaro si+---+s, n->oo if =l n Uk). Problem 22-16 shows that a summable sequence is automatically Cesaro summable, with sum equal to its Cesaro sum. Find a sequence which is not summable, but which is Cesaro summable. Sk (where 10. - 61 ' Suppose that a, sequence s. ' 0 and {a,}is Cesaro summable. > {na,}is bounded. Prove that a, and a, = ' = s,, i=1 11. prove that s. (a) Suppose that a, à n+ = - (b) Show that - (c) Show that n=1 (d) Now replace - 1as is bounded. - - . b, exists. b, if n=I a. Hint: If proof of Theorem 8 which does not rely - an 5 n=1 a, converges. of {a,}, 0 for each n. Let {b,}be a rearrangement + b.. Show that for each n + a. and t. = bi + let s, at + there is some m with s. E t, absolutely, - i=1 This problem outlines an alternative on the Cauchy criterion. and the series Suppose also that the n=I b.. = n=1 the condition a. à 0 by the hypothesis that using the second a, converges n=1 part of Theorem 5. 23. InfiniteSenes 487 00 12. as converges absolutely, if (a) Prove that and {b,}is any subsequence of n=1 {an},then b, converges (absolutely). n=1 an does not converge absolutely. this is false if (b) Show that 00 I . *(c) Prove that if absolutely, a, converges then -)+(a2+a4+as+---). an=(ai+a3+a5+- 13. Prove that if a. is absolutely convergent, then a. *14. ff FIGURE ---- ----- _ _ (sinx)/x dx converges. f 0 for all x such that Prove that f°°f (x) dx *15. Find a continuous function f with exists, but lim f(x) does not exist. *16. 0. Recall the definition 1, and let f(0) of £(f, P) from Problem 13-25. Show that the set of all £(f, P) for P.a length"). Hint: Try partition of [0, 1] is not bounded (thusf has partitions of the form -1 - 19-42 shows that |(sinx)/x| dx diverges. Problem |an|. 5 n=i n=1 n=1 _ s Let f(x) x sin 1/x for 0 = < f (x) > xs = "infinite P- 17. *18. 0 2 ' ' (2n+1)x' 2 2 2 2 1 7x' 5x' 3x' x' be the function shown in Figure 5. Find shaded region in Figure 5. Let f In this problem we will establish the a (l+x)"= "binomial f f, and also the area of the series" x*, |x|<1, A-0 lim Ro,o(x) = 0. The proof is in several steps, for any a, by showing that 8400 and uses the Cauchy and Lagrange forms as found in Problem 20-7. (a) Use the for Ir| < a ratio test to show that the series r* does indeed converge k=0 1 (thisis not to say that it necessarily lim follows in particular that HNDO r" = converges to 0 for |r| < 1 (1 + r)"). It and InfmiteSeries 488 InßniteSequences (b) Suppose first that 0 Ex lim R..o(x) 1. Show that B->OO < = 0, by using Lagrange's form of the remainder, noticing that (1 + t).-n-1 5 1 for n + 1 > œ. < x < 0; the number t in Cauchy's form of the (c) Now suppose that remainder satisfies < x < t 5 0. Show that -1 -1 |x(1+ t)"¯'1| where 5 |x|M, M max(1, = x)Œ-Î (1 + and x-t = 1+t 1-t|x 1+t ix| < ¯ Using Cauchy's form of the remainder, a +1 (n+1) show that lim R.,o(x) (a) Suppose that the partial and the fact that a-1 =a , n 0. = RNOO 19. ixi. sums of the sequence {b,}is a sequence with b. > {a,}are bounded bn+1 and lim b, and that DO a,b, 0. Prove that = naoo n=1 This is known as Diricklet's test. Hint: Use Abel's Lemma (Problem 19-35) to check the Cauchy criterion. (b) Derive Leibniz's Theorem from this result. converges. (c) Prove, using Problem 15-33, that the series converges if x (cosnx)/n n=1 is not of the form 2kr for any integer k (d) Prove Abel'stest: If a, converges and (inwhich {b,}is a case it clearly diverges). sequence which is either n=1 nondecreasing or nonincreasing and which is bounded, then a,b, n=1 converges. Hint: Consider b, - b, where b = lim b.. naoo 00 *20. Suppose {a,}is decreasing and ANOO lim a, 2"a2. also converges then = 0. Prove that if (the"Cauchy as converges, n=1 Condensation Theorem"). No- n=1 1/n is a special case, for if tice that the divergence of n=1 1/n converged, n=1 2"(1/2") would also converge; this remark may serve as a hint. then n=1 23. InfiniteSeries 489 *21. (a) Pove b, converge, then n=1 (b) Prove that *22. a,2 and that if n=1 a.* if a.b. converges. n=i then converges, n=1 aN N COHV€Tg€S fOr any a > . n=1 Suppose {a.}is decreasing and each a, à 0. Prove that if then lim na, = 0. Hint: Write down the R-¥OC use the fact that (as) is decreasing. Cauchy criterion a, converges, and be sure to oo *23. partial sums s. are bounded, and B->OO lim a, a, converges, then the If n=i = 0. It is tempting to conjecture that boundedness of the partial sums, together with the condition 0, implies convergence of 00 lim a, = n-+oo a.. This is not true, n=1 but finding a counterexample requires a little ingenuity. As a hint, notice that some subsequenceof the partial sums will have to converge; you must somehow allow this to happen, without letting the sequence itself converge. 24. Pove that if a, à 0 and an diverges, then 1 an Compare the partial sums. Does the converse hold? n=i 25. Let b, ¢ 0. We say that the infinite product p, = i=1 also diverges. Hint: n=1 b, converge if the sequence b; converges, and also Iim p, ¢ 0. n->oo (a) Prove that (b) Prove that (c) For a, > if (1 + an) converges, then a, approaches 0. n=1 (1+ a.) converges if and only if n=1 log(1 + a.) converges. n=1 0, pmve that (1 + an) converges n=1 if and only if a, conn=1 Hint: Use Problem 24 for one implication, and a simple estimate for log(1 + a) for the reverse implication. verges. 26. (a) Compute (b) Compute 1 - . (1 + x2") for x| < Î. and InßniteSeries 490 InfiniteSequences 27. The divergence of 1/n is related to the following remarkable fact: Any n=l positive rational number x can be written as a finitesum of distinctnumbers of the form 1/n. The idea of the proof is shown by the following calculation : Since for _23 2Î _ 31 1 62 _ ¯ 2 62 3 186 1 4' 7 156 ''' l 26 ¯ 186 we 27 1674 have. Notice that the numerators 23, 7, 1 of the differences are decreasing. if 1/(n + 1) < x 1/n for some n, then the numerator in this decrease; conclude that x can be written as a finite sum of distinct numbers 1/k. (a) Prove that < sort of calculation must always (b) Now prove the result for all x by using the divergence of 1/n. n=1 UNIFORM CONVERGENCE POWER SERIES CHAPTER AND The considerations at the end of the previous chapter suggest an entirely new way of looking at infinite series. Our attention will shift from particular infinite sums to equations like ex 1+ = -- x 1! x + 2 + ---- 2! - - - which concern sums of quantities that depend on x. defined by equations of the form interested in functions In other f(x)= f1(x)+f2(x)+f3(x)+· x"¯1/(n words, we are · 1)!). In such a situation (f,} will be each obtain of functions; for x we a sequence of numbers {f.(x)}, some sequence of order and f (x) is the sum this sequence. In to analyze such functions it will certainly be necessary to remember that each sum (inthe previous example fn(x) = - f1(x) + f2(x)+ f3(x)+ · is, by definition, the limit of the sequence ft(x), f1(x)+ f2(x),fi(x) + If we defme a new sequence of f2(X) Ÿ f3(X). -·•• functions {sn}by = s. f1 + + fn. 1 then we can express this fact more succinctly by writing A la 1 f (x) rather I lim s.(x). For some time we shall therefore concentrate on functions defined as limits, f (x) FIGURE = such than on functions defined as = lim H-+OD f.(x), infinite sums. The total body summed easily: nothing of results about would functions can be hope to be up very one true actually is-instead we have a splendid collection of counterexamples. The first of these shows that even if each f, is continuous, the function f may not be! Contrary to what you may expect, the fimetions f. will be very simple. Figuæ 1 shows the graphs of the functions f,(x) 491 x", = 1, 05xs1 x > 1. 492 and InfiniteSenes ÍnßniteSequences h -I I 1 II A II I These functions are all continuous, continuous; in fact, A but the function f (x) Hm f,(x)= I naoo lim f.(x) is not 05x<1 1. 10, Î, x Another example of this same phenomenon is illustrated in Figure 2; the functions f. are defined by A I A I -1, x 5 -1.. FIGURE = 2 f,(x) 1- -- n 1 = nx, - n 1 1, - - ) 5 x 5 - - n 5 x. for large enough n) equal to In this case, if x < 0, then f.(x) is eventually (i.e., and if x > 0, then f, (x) is eventually 1, while f.(0) 0 for all n. Thus -1, = lim fn(x) = naoo A I I so, once again, the function f(x) 0, 1, x x = > 0 0; lim f.(x) is not continuous. 00 = N -1- FIGURE 3 By rounding off the corners in the previous examples it is even possible to produce a sequence of dgerentiablefunctions (f,} for which the function f(x) lim f, (x) is not continuous. One such sequence is easy to defme explicitly: = n fn(x)= 3 sin --Ex5(mrx11 2 1 - , n 1, - n 2 n 5 x. These functions are differentiable (Figure 3), but we still have h -1, 1 lim 8400 i I ' i i 4 0, 1, x < x = x > 0 0 0. Continuity and diKerentiability are, moreover, not the only properties for which problems arise. Another difficulty is illustrated by the sequence {f,} shown in Figure 4; on the interval [0,1/n] the graph of f. forms an isosceles triangle of altitude n, while f, (x) 0 for x à 1/n. These functions may be defined explicitly = FIGURE f.(x) = RS fOlÏOWS: and PowerSeries 493 24. UniforrnConvergence 2n2 h 2n fn(x) = - 05x5- 2n2x, - 1 Exs 2n 1 2n 1 -- n 1 -sxs1. 0, n varies so erratically Because this sequence near 0, our primitive mathematical instincts might suggest that lim fn(x) does not always exist. Nevertheless, this lim f.(x) is even continuous. limit does exist for all x, and the function f(x) lim f,(x) 0; moreover, f.(0) 0 In fact, if x > 0, then f,(x) is eventually 0, so HNOO 0. In other words, f (x) for all n, so that we certainly have lim f.(0) A = = = = = 0 for all x. On the other hand, the integral quickly reveals the lim f.(x) aœnge behavior of this sequence; we have = f,(x)dx FIGURE 5 (, = but 1 f (x)dx 0. = Thus, 1 lim HNDO 1 fn(x)dx =/= lim f,(x)dx. NACO This particular sequence of functions behaves in a way that we really never imagined when we first considered functions defined by limits. Although it is true that for each x in [0,1], f(x) lim f,(x) = 8400 "approach" 1- - I A / A the graph of f in the sense of the graphs of the functions f, do not lying close to it-if, as in Figure 5, we draw a strip around f of total width 2e (allowing a width of e above and below), then the graphs of f. do not lie completely within this strip, no matter how large an n we choose. Of course, for each x there is some N such that the point (x, f,(x)) lies in this strip for n > N; this assertion just amounts to the fact that lim f.(x) f (x). But it is necessary to choose larger = 1 n->ao and larger N's as x is chosen closer and closer to 0, and no one N will work for all x at once. FIGURE 6 The same situation actually occurs, though less blatantly, for each of the other examples given previously. Figure 6 illustrates this point for the sequence f,(x) x" = 1, ' Osxs1 x>l. A strip of total width 2e has been drawn around the graph of If e < i, f(x) = lim naoo f.(x). this strip consists of two pieces, which contain no points with second 494 InfmiteSequences and InßniteSenes since each function f, takes on the value coordinate equal to the graph of each f, fails to lie within this strip. Once again, for each point x there is some N such that (x, f,(x)) lies in the strip for n > N; but it is not possible to pick one N which works ibr all x at once. It is easy to check that precisely the same situation occurs for each of the other examples. In each case we have a function f, and a sequence of functions (fa), all defined on some set A, such that ); fa j, . (, f(x) . = lim n->oo for all x in A. f,(x) This means that FIGURE 7 for all e > 0, and for all x in A, there is some N such that if n If (x) f.(x)l - < > N, then s. But in each case different N's must be chosen for difTerent x's, and in each case it is not true that for all e If(x) - > 0 there is some N such that for all x in A, if n f,,(x)l < > N, then e. Although this condition differs from the first only by a minor displacement of the phrase "for all x in A," it has a totally different significance. If a sequence (f,} satisfies this second condition, then the graphs of f, eventually lie close to the graph of f, as illustrated in Figure 7. This condition turns out to be justthe one which makes the study of limit functions feasible. DEFINITION Let {f,} be a sequence of functions defined on A, and let f be a function which limit of {f.) on A if for is also defined on A. Then f is called the uniform every e > 0 there is some N such that for all x in A, if n > N, then |f (x) f.(x) | < s. We also say that (f,} convergesuniformly f uniformly on A. - to f onA, or that f, appmaches As a contrast to this definition, if we know only that f(x) = lim f,(x) NNOO for each x in A, then we say that {f.} converges pointwiseto f onA. Clearly, uniform convergence implies pointwise convergence (butnot conversely!). Evidence for the usefulness of uniform convergence is not at all difficult to amass. Integrals represent a particularly easy topic; Figure 7 makes it almost obvious that if {f.) converges uniformly to f, then the integral of f, can be made as close to the integral of f as desired. Expressed more precisely, we have the following theorem. and PowerSeries 495 24. Un¶ormConvergence THEOREM 1 of Suppose that {f.} is a sequence that (f.) converges uniformly on [a,b]. Then Let e > b b Ï PROOF functions which are integrable on [a,b], and [a,b] to a function f which is integrable on f lim = f.. 0. There is some N such that for all n |f(x) Thus, if n > f,(x)|< - N we have > for allx in e [a,b]. N we have b Ilb f (x) dx b f.(x) dx - [f (x) f.(x)] dx = - a Lb |f(x) 5 - f,(x)|dx b edx 5 e(b = Since this is true for any e > a). - 0, it follows that b b Ï f lim = f.. of continuity is only a little more difficult, involving an a three-step estimate of |f (x) f (x + h)|. If {f,} is a sequence of continuous functions which converges uniformly to f, then there is some n such The treatment "e/3-argument," - that (1) |f (x) f.(x) | < (2) |f (x + - h) -- fn(x + , h) < . Moreover, since f, is continuous, for sufficiently small h we have (3) |f,(x) f.(x - + h) | < . It will follow from (1),(2),and (3)that |f (x) f (x +h) < e. In order to obtain (3), however, we must restrict the size of |h| in a way that cannot be predicted until n has already been chosen; it is therefore quite essential that there be some fixed n which makes (2)true, no matter how small [h| may be-it is pæcisely at this point uniform that convergence enters the proof. - THEOREM 2 Suppose that {f.) is a sequence that {f.} converges uniformly PROOF of functions which are continuous on on [a,b] to f. Then f is also continuous [a,b], and on [a,b]. For each x in [a,b] we must prove that f is continuous at x. We will deal only with x in (a, b); the cases x b require the usual simple modifications. a and x = = and InfmiteSeries 496 InfmiteSequences Let e > 0. Since (f,} converges uniformly to f on [a,b], there is some n such that |f(y) f,(y)| - for all y in < In particular, for all h such that x + h is in Now f, is continuous, (1) |f (x) f,(x) | < (2) |f (x + h) If(x+h)- < so there is some ô f.(x - > , + h) | < . 0 such that for |h| If,(x)- f,(x+h)|< < 8 we have . ô, then f(x)| = |f(x+h) < If(x+h) - - S <-+-+- 8 8 3 3 3 This proves that f f, FIGURE ' have - (3) Thus, if |h| [a,b], we [a,b]. f,(x+h)+ f,(x+h) fn(x+h)|+If,(x+h)- - f,(x)+ f,(x) f(x)| - fn(x)\+lln(x)- f(x)\ is continuous at x. f(x) = |x| 8 After the two noteworthy successes provided by Theorem 1 and Theorem 2, the situation for differentiability turns out to be very disappointing. If each f, is differentiable, and if {f.} converges uniformly to f, it is still not necessarily true that f is differentiable. For example, Figure 8 shows that there is a sequence of differentiable functions {f.) which converges uniformly to the function f(x) |x|. = and PowerSeries 497 24. UmformConvergence Even if f is differentiable, it may not be true that f'(x) lim f,'(x); = HNOO this is not at all surprising if we reflect that a smooth function can be approximated by very rapidly oscillating functions. For example (Figure 9), if f.(x) then (f.) converges uniformly - 1 sin(n2x), n the function to f.'(x) and = = n lim n cos(n2x) does not always exist H->OD FIGURE f (x) 0, but = cos(n2x), (forexample, it does not exist if x = 0). 9 practicalÏy the Fundamental Theorem of Calculus guarof Theantees that some sort of theorem about derivatives will be a consequence orem 1; the crucial hypothesis is that {f.') converge uniformly (tosome continuous Despite such examples, function). THEOREM s Suppose that {f.) is a sequence of functions which are differentiable on [a,b], integrable derivatives f.', and that {f.} converges (pointwise) to f. Suppose, with moreover, that Then f {f,'} converges uniformly f'(x) PROOF on [a, b] to some continuous function g. is differentiableand Applying Theorem 1 to the interval lx g = = = f,'(x). we see that [a,x], x lim lim H-¥OO = lim n->OC f(x) f.' [f.(x) f.(a)] - - f(a). for each x we have and InßniteSeries 498 InfiniteSequences Since g is continuous, it follows that f'(x) interval [a, b]. g(x) = lim f,'(x) for all x in the = B->OO -it Now that the basic facts about uniform limits have been established, how to treat functions defined as infinite sums, is clear f1(x)+f2(X)#f3(X)#·•·. f(x)= This equation means that f (x) lim f1(x)+ = + R-¥DO fn(x) our previous theorems apply when the new sequence fi, fi +h. fi +h+fs. ... converges uniformly to f. Since this is the only case we shall ever be interested in, we single it out with a definition. DEFINITION The series f. uniformly converges (moreformally: n=1 uniformly summable) converges uniformly to to f the sequence on A, if the sequence It+h+h, fi, fi+h, f on A. --- We can now apply each of Theorems 1, 2, and 3 to uniformly the results may be stated in one common corollary. COROLLAnr Let f. uniformly converge {f,} is to f on convergent series; [a,b]. n=1 is continuous on [a,b], then f is continuous on f, is integrable on [a,b], then (1)If each f. (2)If f and each b I Moreover, if f. converges f.' derivative f.' and oc f b f.. = to f (pointwise) converges uniformly on [a,b], each f, on [a,b] n=1 tion, then (3)f'(x) f,'(x) = n=1 for all x in [a,b]. [a,b]. has an integrable to some continuous func- and PowerSenes 499 24. UnformConvergence PROOF (1)If each f, is continuous, then so is each -+ f,,, and f is the uniform limit continuous is by Theorem 2. so f fi, fi f2 Í3, (2)Since fi, fi + f2, fi + f2 + f3, converges uniformly to f, it followsfrom Theorem 1 that of the sequence f2, ft + + fi + · · . - · , ... b Ï b f a lim = (fi + · - - + f.) a b lim = nuco oc b (Ï a fi+...+ a f, b n=1 (3) Each function fi + + f, and fi', fi'+ f2', fi'+ f2'+ fs', · - - ... is differentiable, with derivative fi' + converges uniformly to a continuous · · + fn', function, by hypothesis. It follows from Theorem 3 that f'(x) = lim [fi'(x) + 5400 + f.'(x)] f.'(x). = n=i At the moment this corollary is not very useful, since it seems quite difficult to will converge uniformly. The predict when the sequence fi, f1+f2, ft+f2+ f3, most important condition which ensures such uniform convergence is provided by the following theorem; the proof is almost a triviality because of the cleverness with which the very simple hypotheses have been chosen. ... THEOREM (THE WEIERSTIMSS 4 M-TEST) Let {f.} be a sequence of functions defined on A, and suppose sequence of numbers such that forallxinA. |f,(x)|5M. Suppose moreover that M. converges. Then for each x in A the series n=1 converges (infact, it converges that {M,} is a f,(x) n=1 absolutely), and f, n=1 to the function f (x) = f,,(x). converges uniformly on A 500 and InfiniteSeries InfiniteSequences PROOF For each x in A the series quently f,(x) converges n=1 f(x) - |f, (x)| converges, by the comparison Moreover, for (absolutely). [fi(x)+···+fn(x)] all x in A we test; conse- have , fn(x) = n=N+1 Ifn(x)l 5 n=N+1 M.. 5 n=N+1 Since M. converges, the number M. can be made as small as desired, n=1 n=N+1 by choosing -3 N sufficiently large. -2 -·-1 2 1 FIGURE 3 10 The following sequence {f.} illustrates a simple application of the Weierstrass M-test. Let {x}denote the distance from x to the nearest integer (thegraph of f (x) (x) is illustrated in Figure 10). Now define (a) = 1 10" -{10"x}. f,(x)= 1 The functions fi and f2 are shown in Figure 11 (butto make the drawingssimpler, 10" has been replaced by 2"). This sequence of functions has been defined so that the Weierstrass M-test automatically applies: clearly (b) n(X) FIGURE 11 - 1 10" fOr aÏÌ X, and PowerSenes 501 24. UnformConvergence 1/10" converges. Thus and f. n=1 uniformly; converges since each f, is con- n=1 tinuous, the corollary implies that the function f (x) f,(x) = {10"x) = n=1 n=1 is also continuous. Figure 12 shows the graph of the first few partial sums fi + + f.. As n increases, the graphs become harder and harder to draw, and - - · the infinitesum f, is quite undrawable, as shown by the following theorem n=1 mainly (included rough). THEOREM 5 as an interesting sidelight, to be skipped if you find the going too The function f (x) {10"x) = n=1 is continuous everywhere and differentiable nowhere! PROOF is continuous; this is the only part of the proof which convergence. We will prove that f is not differentiable at a, for any a, by the straightforward method of exhibiting a particular sequence {hm} approaching 0 for which We have just shown that uses uniform f lun f(a+hm)hm f(a) . mooo does not exist. It obviously suffices to consider only those numbers 0<as1. Suppose that the decimal expansion of a is a = 0.aia2a3a4 · a satisfying -•- -10¯= Let hm 10-. if am ¢ 4 or 9, but let h. these two exceptions will appear soon). Then = if a. = °° f(a+ hm) f(a) - h. (10"(a+ h.)} 1 - (thereason for {l0"a} ±10-m n=1 ±10'"¯"[{10"(a = 4 or 9 = + hm)} {10"a}]. -- n=1 This infmite series is really a fmite sum, because if n m, then > 10"hmis an integer, SO {10"(a+ On the other hand, for n 10"a = 10"(a + hm) = < hm)) - {10"a} 0. = m we can write integer + 0.an+1a, 2an+3 integer + 0.an+1anç2an+3 . . . . . . am · - · (ami I) · · • 502 and InfiniteSeries Infinite Sequences (inorder for the second equation to be true it is essential that we choose h. when a, 9). Now suppose that -10-= = = 0.an+1anç2a.43...a.-·· ...(a.±1) 0.a.sta,42a.43 (inthe -10¯" hm A , Then we also have A+h f·¯ 5 y. = A special case m when am = = - and exactly the same equation for n < m we have can 1 , is true because we chose ±10,-. = - (a) 5 - n + 1 the second equation 4). This means that {10"(a+ hm)} {l0"a} i - be derived when 0.an+1a,42a,49 10"¯"[{10"(a + h.)} {10"a}] = - > ... (. Thus, ±1. In other words, hm) f (a + +A+b f (a) - h. -1 1 numbers, each of which is ±1. Now adding +1 or to a number changes it from odd to even, and vice versa. The sum of m 1 numbers each ±1 is therefore an even integerif m is odd, and an odd integerif m is even. Consequently the sequence of ratios is the sum of m - - A+ h h(x) = Max! hm) f (a + f (a) - hm (b) cannot possibly converge, since it is a sequence of integers which are alternately odd and even. A+ b + h + A 1- - In addition to its role in the previous theorem, the Weierstrass M-test is an ideal tool for analyzing functions which are very well behaved. We will give special attention to functions of the form f (x) a.(x = - a)", A+A+b A(x) M6x! = which can also be described by the equation f (x) ==0 (c) FIGURE 12 f,(x), = fOT n(X) = Un(X - G)n. Such an infinite sum, of functions which depend only centered on powers of (x a), is called a power series at a. For the sake of simplicity, we will usually concentrate on power series centered at 0, - f (x) anx". = n=0 and PowerSenes 503 24. UmformConvergence One especially important group of power series are those of the form °° f(">(a) where f called the (x n! n=o a)", - is some function which has derivatives of all orders at a; this series is Taylor series for f at a. Of course, it is not necessarily true that n(a)(x-a)"; f(x)= · n=o this equation holds only when the remainder terms satisfy lim R..a(x) = naoo a,x" We already know that a power series does not necessarily 0. converge for n=o all x. the power series For example, 77 5 x3 x--+-----+--- 5 3 converges only 7 x| 5 1, while the power series for X2345 x--+---+-+··- 2 3 4 5 -1 converges only for converges only for x < = xs 1. It is even possible to produce a power series which 0. For example, the power series n!x" n=o does not converge for x ¢ 0; indeed, the ratios (n + 1)! (x"+1) (n+1)x n!x" = are unbounded anx" for any x ¢ 0. If a power series does converge for n=o some xo ¢ 0 however, then a great deal can be said about the series anx" for n=o lxl < lxol. THEOREM 6 Suppose that the series f (xo) anxo" = n=o converges, and let a be any number with 0 f (x) < a < anx" = n=o |xo|. Then on [-a, a] the series and InfiniteSenes 504 InfiniteSequences (andabsolutely). converges uniformly Moreover, the same is true for the series g(x) nauxn-1 - n=1 Finally, f is differentiable and f'(x) naux"¯' = n=1 for all x with |x| PROOF < |xo|. anxo" converges, the terms anxo" approach Since 0. Hence they are surely n=o bounded: there is some number M such that |anxo" |a,| |xo"|5 is in [--a,a], then |x| 5 |a|, so anx" | |a, | |x"| 5 Iasi la"i = Now if x = for all n. M - · - = |a,| |xo|" - - - n a (thisis the xo clever step) xo But la/xo| < 1, so the series (geometric) "-M -a M n-0 Choosing M converges. a XO n=0 a/xo|" as the number · anx" converges uniformly on follows that X0 M. in the Weierstrass M-test, it [--a,a]. n=0 To prove the same assertion for g(x) na,x"-I = notice that n=I |na,x"¯1| n |a. | 5 n|an\ = = < -- I |x"¯ | |an-1 - - |a, | |xo|" n |a| - " M -- ¯ |a| n - " - a xo a xo Since |a/xo| < 1, the series "M n=1 converges (thisfact was |a| a" M" a" xo |a| n=1 xo proved in Chapter 23 as an application of the ratio test). and PowerSeries 505 24. UniformConvergence naux"¯' converges Another appeal to the Weierstrass M-test proves that uni- n=1 formly on [-a, a]. Finally, our corollary proves, first that g is continuous, and then that f'(x) = g(x) na.xn-1 = for x in [-a, a]. 0 |xo|,this n=1 Since we could have chosen any number all x with a with < a < result holds for ixi ixol.I < We are now in a position to manipulate power series with ease. Most algebraic manipulations are fairly straightforward consequences of general theorems about infinite series. For example, that suppose f(x) a,x" = and g(x) b,x", = n=0 where the two power series both converge n=0 for some xo. Then for |x| |xo| we < have anx" + n=0 So the series h(x) b,x" (anx"+ b,x") = n=0 (a, + b.)x" also = (an+ b.)x". = n=0 n=0 for |x| converges < and h |xo , - f + g n-O for these x. The treatment of products is just a little more involved. If |x| know that the series anx" and n=0 b,x" converge < |xo|, then we absolutely. So it follows from n=O Theorem 23-9 that the product a,x" b.x" is given by - n=0 n=0 a¿xibjxi, i=0 j=0 the elements a¿xibjxi choose the arrangement where aobo in any order. are arranged + (aobi+aibo)x + (aob2+albi In particular, we can +a2bo)x2 + - - - which can be written as c.x" n=0 for c, aab,_a. = k=0 This is the "Cauchy product" that was introduced in Problem 23-8. Thus, the Cauchy product h(x) c,x" also converges for = n=0 these x. |x| < |xo|and h = fg for 506 and Infmite Series InfmiteSequences Finally, suppose that f (x) anx", where ao = ¢ 0, so that f (0) = ao ¢ 0. n=o 1/f. This means' b,x" which represents Then we can try to find a power series n=0 that we want to have anx" b,x" · n=o 1 = l + 0 x + 0 x2 + = - · - - . n=o Since the left side of this equation will be given by the Cauchy product, we want to have aobo 1 = aobi + a1bo = 0 aob2 Ÿ Ul bi + a2bo 0 = Since ao ¢ 0, we can solve the first of these equations for bo. Then we can solve the second for bi, etc. Of course, we still have to prove that the new series b.x" n=0 does converge for some x ¢ 0. This is left as an exercise (Problem 17). For derivatives, Theorem 6 gives us all the informationwe need. In particular, when we apply Theorem 6 to the infinite series 7 x5 x3 . smx=x--+---+--···, X 3! 5! 7! 9! x2 x4 x6 X cosx=1--+----+-----, 2! 4! 6! 8! x2 x ex=1+-+-+-+--+--·, x3 x4 3! 4! l! 2! get precisely the results which are expected. Each of these converges for any xo, hence the conclusions of Theorem 6 apply for any x: we sin'(x) 1 = cos'(x) = exp'(x) = 3x2 - - 31 - -- - + - f (x) + -- - = - = x - cos x, - - - . - -· - 75 3 + - - smx, situation 7 - 5 = exp(x). = log(1+x) the = x3 arctan x - 5! 2x 4x3 6x" + + 2! 4! 6! 2x 3x2 1+ For the functions arctan and complicated. Since the series 5x4 + -- - 7 + · - - is only slightly more and PowerSeries 507 24. Unform Convergenœ converges for xo = arctan'(x) 1, it also converges for |x| = 1 x2 + x4 - 1 6 _ 1, and < . . for |x| . 1 + x2 < 1. --l In this case, the series happens to converge for x for the derivative is not correct for x 1 or x = also. However, the formula indeed the series = -1; = 6 1-x2+x4_ -1. Notice that this does not contradict Theorem 6, 1 and x diverges for x which that the derivative is given by the expected formula only for |x| < |xo|. proves series Since the = = log(1 + x) converges for xo 1 1+x = = x x -- 2 - 2 1, it also converges + x 3 -- x 3 for |x| 4 --- - 4 x 5 --- - - - - 5 1, and < ,(1+x)=1-x+x2_ + 3 -1; In this case, the original series does not converge for x moreover, the differentiated series does not converge for x = 1. All the considerations which apply to a power series will automatically apply to its derivative, at the points where the derivative is represented by a power series. If = f (x) anx" = n-o converges for all x in some interval (-R, R), then Theorem 6 implies that f'(x) naux"¯I = n=1 for all x in (-R, R). Applying Theorem 6 once again we find that f"(x) n(n = - 1)anx"¯2, n=2 and proceeding by induction we fmd that f(k)(x)= n(n-1).....(n-k+1)anx"¯ . n=k Thus, a function defined by a power series which converges in some interval (-R, R) is automatically infinitely differentiable in that interval. Moreover, the previous equation implies that f(k)(0) = so that ak = kl ag, f (k)(0) k! 508 and InfiniteSeries InfiniteSequences In other words, a convergent powersenes centered at 0 is always theTaylorseries at 0 of the whkh it defmes. function On this happy note we could easily end our study of power series and Taylor ' series. A careful assessment of our situation will reveal some unexplained facts, however. The Taylor series of sin, cos, and exp are as satisfactory as we could desire; they converge for all x, and can be differentiated term-by-term for all x. The Taylor series of the function f (x) = log(1 + x) is slightly less pleasing, because it < x 5 1, but this deficiency is a necessary converges only for consequence of the basic nature of power series. If the Taylor series for f converged for any xo with |xo| > 1, then it would converge on the interval (-|xo|, |xo|), and on this interval the function which it defines would be differentiable, and thus continuous. But this is impossible, since it is unbounded on the interval (-1, 1), where it equals log(1 + x). The Taylor series for arctan is more difficult to comprehend-there seems to be no possible excuse for the refusal of this series to converge when |x| > 1. This mysterious behavior is exemplified even more strikingly by the function f (x) 1/(1 + x2), an infinitely differentiable function which is the next best thing to a polynomial function. The Taylor series of f is given by -1 = f(x) 1 = = 1+ x2 1-x2+x4 _ 6 8 If |x| > 1 the Taylor series does not converge at all. Why? What unseen obstacle Asking this sort of prevents the Taylor series from extending past 1 and question is always dangerous, since we may have to setde for an unsympathetic the way things are! In this case answer: it happens because it happens-that's there does happen to be an explanation, but this explanation is impossible to give at the present time; although the question is about real numbers, it can be answered intelligently only when placed in a broader context. It will therefore be necessary to devote two chapters to quite new material before completing our discussion of Taylor series in Chapter 27. -1? PROBLEMS 1. For each of the following sequences {f.), determine the pointwise limit of {f.} (ifit exists) on the indicated interval, and decide whether {f,} converges uniformly to this function. (i) f. (x) = (ii) f,(x) =- (iii) f.(x) = (iv) f.(x) = (v) f.(x) = J, [0,1]. on O n, x 5 n x > n, on (1, oo). ' - x ex -, e¯"", , on on [-1, 1]. R. on [a,b], and on R. and PowerSenes 509 24. UniformConvergence 2. This problem asks for the same information as in Problem 1, but the functions are not so easy to analyze. Some hints are given at the end. (i) fn(x) = (ii) f.(x) = x" --- [0,1]. on x nx 1+n+x on [0,oo). (iii) f,,(x) = x2 + on [a,oo), a (iv) f, (x) = x2 + on R. (v) f,(x) = x + (vi) f.(x) = (vii)f.(x) = (viii)f,,(x) = x + - 1 - on n 1 --- [a, oo), a > 0. on R. - xn+ n on [a,oo), a on [0,oo). - x + n 0. > - > 0. find the maximum of |f f,| on [0, 1]. (ii)For each n, consider |f(x) f.(x)| for x large. (iii)Express f(x) f,(x) as a fraction and estimate f(x) f.(x)| for x > a. (iv)Give a separate estimate of If(x) f,(x)| for small lx|. (vii)Use (v). Hints: (i)For each n, - - - - - 3. 4. Find the Taylor series at 0 for each of the following functions. (i) f (x) = (ii) f (x) = (iii) f(x) = (iv) f (x) = (v) f (x) = 1 a log(x 1 1 ¢ 0. a , x - a), - a (1 = --- ¢ 0. x)-l/2. - (Use Problem 20-7.) x 1 . 1 x2 - arcsin x. Find each of the following infinite sums. x2 (i) (ii) X2 (iii) x3 x4 1-x+-------+-----·. 2! 31 1-x3+x6-x9+--.. -- 3 - 2 ---- 3-2 4! Hint: Whatis l-x+x2-x3+---? 4 + --- 4-3 x5 - ---- 5-4 + - - - for lx| < l. Hint: Differentiate. 510 and InfiniteSenes InfiniteSequences 5. Evaluate the following infinite sums. (In most cases they are f(a) where a is some obvious number and f (x) is given by some power series. To evaluate the various them until some well-known power series, manipulate power series emerge.) (-1)"22n & (2n)! (21n)!° (ii) n=o 1" i 2n+1 (iii) (iv) - 2 . n=o 1 3"(n + 1) n=o 2n+1 2"n! (vi) n=o 6. If f(x) power 7. · = (sinx)/x for for f. x ¢ 0 and f(0) 1, find = f(k)(0).Hint: a In this problem we deduce the binomial series (1+x). n=o without all the work of in part (a)of that lx < Problem 23-18, although series problem-the = c(1+ x)" for some constant series. Hint: Consider g(x) a = x" |x| < 1 fact established does converge for 1. part (a)is of the form f (x) c, and use this fact to establish the binomial = for |x| < satisfying f (x)/(1 + = x)". Prove that the series n=1 converges 9. f(x) we will use a x", n l. (a) Prove that (1+ x)J'(x) af(x) (b) Now show that any function f 8. Find the series uniformly (a) Prove that on n(1 +nx2) R. the series 2" sin n=o converges uniformly on [a,oo) for a > 0. Hint: lim(sinh)/h hwO = l. and PowerSenes 511 24. UmformConvergence (b) By considering the sum from N to oo for x series does not converge uniformly on (0, oo). 10. (a) Prove that = 2/(x3N), show that tlie the series " f (x) nx = 1 + n4x2 n=o converges uniformly on [a, oo) for a of nx/(1 + n4X2) OR [Û,00). (b) Show that > 0. Hint: First find the and by using an integral to estimate the sum, show that Conclude that the series does not converge uniformly (c) What about the series oo f maximum > 1/4. on R. nx 1 +nSx2 11. Problem 15-33 and the method of proof used for Dirichlet's test (Problem 23-19) to obtain a uniform Cauchy condition for the series (a) Use sin nx n n=1 uniformly uniformly (b) For x = [e,2x on s], e -- > 0, and conclude that the series converges there. x/N, with N large, show that 2N N sin kx sin kx = k=N > N -. k=0 Conclude that 2N 1 2x sin kx ¯¯ k=N and that the series 12. (a) Suppose that k ' does not converge uniformly f (x) anx" = on [0,2x]. converges for all x in some interval n=o 0. (-R, R) and that f (x) 0 for all x in (-R, R). Prove that each a, this remember the for is easy.) formula (If you as only that 0 know Suppose for some sequence {x.) with (x.) we f (b) each = again that Prove 0. lim x, an = 0. Hint: First show that = = = 8400 f (0) = ao = 0; then that f'(0) = ai = 0, etc. and InfiniteSeries 512 InfiniteSequences This result shows that if f(x) e¯1A' sin 1/x for x ¢ 0, then f cannot possibly be written as a power series. It also shows that a function defined by a power series cannot be 0 for x 5 0 but nonzero for x > Sthus a power series cannot describe the motion of a particle which has remained at rest until time 0, and then begins to move! = (c) Suppose that f(x) anx" and g(x) = b.x" converge for all x = n=0 n=0 g(t.) for some sequence in some interval containing 0 and that f(t.) converging that each n. In particular, b, 0. Show for to a, {t,,} npmsentation only senes centered at 0. as a power a function can have one = = 13. Prove that if f (x) a.x" is an even function, then a, = = 0 for n odd, n-0 and if f is an odd function, then a, 0 for n even. = -1 14. Show that the power series for f (x) log(1 x) converges only for s x)] converges log[(1 + x)/(1 x < 1, and that the power series for g(x) only for x in (-1, 1). = - = *15. - Recall that the Fibonacci sequence {a.) is defined by a1 a, + a._t = a2 1, an+1 = = . (a) Show that ja, 5 2. a (b) Let =1+x+2x2+3x"+··· anx"-I f(x)= . n=1 Use the ratio test to prove that f(x) converges if |x| < 1/2. (c) Prove that if |x| < 1/2, then -1 f(x)= ;2+X- Î Î Hint: This equation can be written f (x) xf (x) x2 f (X) 1/(x2 +x and the decomposition Use the partial fraction for 1), power (d) series for 1/(x a), to obtain another power series for f. (e) It followsfrom Problem 12 that the two power series obtained for f must be the same. Use this fact to show that - = - . - - " 1-Ã (1+Ê ¯ 2 a.= 16. Let f (x) anx" and g(x) = n=o = n=o 2 . Suppose we merely knew that n=o c,x" f(x)g(x) b,x". = " for some c., but we didn't know how to multiply series and PowerSeries 513 24. UmformConvergence in general. Use Leibniz's formula (Pmblem 10-18) to show directly that this series for fg must indeed be the Cauchy product of the series for f and g. 17. Suppose that f(x) anx" = converges for some xo, and that ao ¢ 0; n=O for simplicity, we'll assume that ao recursively by bo = b, = 1. Let {b.}be the sequence = defined 1 n-1 bkün-k· - k=0 The aim of this problem is to show that b,x" also converges for some n=0 x 1/f for small enough |x ¢ 0, so that it represents (a) If all |a,xo"| 5 . M, show that |b,x"| 5 M |bkXk i k=0 (b) Choose M > Ä with all |b,xo"| 5 (c) Conclude 18. M b,x" converges that M. Show that lanxo"|5 for |x| sufficiently small. n=o Show that the series oo X2n+1 2n+1 converges uniformly to it converges to log 2! *19. . Suppose that j log(x a. converges. + 1) on X" 2n+2 [-a, a] for 0 < < a We know that the series n=o 1, but that at 1 f(x) anx" = n=o must converge uniformly on [-a, a] for 0 < a < 1, but it may not converge uniformly on [-1, 1]; in fact, it may not even converge at the point example, if (x) log(1 + x)). However, a beautiful theorem of Abel f (for shows that the series doesconverge uniformly on [0,1]. Consequently, f is -1 = continuous on [0,1] and, in particular, a, n=0 Theorem by noticing that if |a,.+·· +a.| by Abel's Lemma (Problem 19-35). = anx". lim Prove Abel's ad < e, then |amx"'+- -+anx"| < e, 514 and InfiniteSeries InfiniteSequences 20. A sequence {a.} is called Abel summable anx" exists; if lim x->l Prob- n-O lem 19 shows that a summable sequence is necessarily Abel summable. Find a sequence which is Abel summable, but which is not summable. Hint: Look over the list of Taylor series until you find one which does not converge at 1, is continuous at 1. even though the function it represents 21. (a) Using Problem (i) (ii) (b) Let - 19, find the following infinite sums. 1 1 1 + 2.1 4-3 3.2 1-¼+¼+---. c, 1 + 5·4 - - · . be the Cauchy product of two convergent power series b., and suppose merely that c, converges. n=o Prove that, in fact, n=o it converges to the product a. n=0 (a) Suppose that {f,} is a sequence b.. - n=0 of bounded functions on [a, b] which converge uniformly f is bounded on (notnecessarily continuous) to f on [a,b]. Prove that [a,b]. (b) Find a sequence of continuous functions on wise to an unbounded function on [a,b]. *23. a. n=0 n 0 and 22. - [a,b] which converge point- Suppose that f is differentiable. Prove that the function f' is the pointwise limit of a sequence of continuous functions. (Since we already know examples of discontinuous derivatives, this provides another example where the pointwise limit of continuous functions is not continuous.) 24. Find a sequence of integrable functions {f,} which converges to the (nonintegrable) function f that is 1 on the rationals and 0 on the irrationals. Hint: Each f. will be 0 except at a few points. 25. (a) Prove that if f is the uniform limit of {f,} on [a,b] and each f, is integrable on [a,b], then so is f. (So one of the hypotheses in Theorem 1 was unnecessary.) (b) In Theorem 3 we assumed only that {f,} converges pointwise to f. Show that the remaining hypotheses ensure that {f,} actually converges uniformly to f. (c) Suppose that in Theorem 3 we do not assume {f.} converges to a function f, but instead assume only that f.(xo) converges for some xo in g). to some f (withf' [a,b]. Show that f. does converge (uniformly) (d) Prove that the series = °° n=i (-1)" x+n and PowerSeries 515 24. UmformConvergence converges uniformly 26. on [0,oo). Suppose that f. are continuous functions on to f. Prove that 1-1/n lim R-¥QQ Is this true if the convergence 27. converge uniformly (0, 1] that 1 f. f. = isn't uniform? that {f,} is a sequence of continuous functions on [a,b] which approach 0 pointwise. Suppose moreover that we have f,(x) 2 f,wi(x) > 0 for all n and all x in [a,b]. Prove that {f.) actually approaches 0 uniformly on [a,b]. Hint: Suppose not, choose an appropriate sequence of points x. in [a,b], and apply the Bolzano-Weierstrass theorem. (a) Suppose (b) Prove Dini's Theorem: If (f,} is a nonincreasing sequence of continuous functions on [a,b] which approach the continuous function f pointwise, then {f.) also approaches f uniformly on [a,b]. (The same result holds sequence.) if {f,} is a nondecreasing (c) Does Dini's Theorem hold if f isn't continuous? How about if replaced by the open interval (a, b)? 28. [a,b] is that {f.} is a sequence of continuous functions on [a,b] that converges uniformly to f. Prove that if x, approaches x, then f,(x.) (a) Suppose approaches f (x). (b) Is this statement true without assuming that the f, are continuous? (c) Prove the converse of part (a):If f is continuous on [a,b] and {f,} is that f,(x.) approaches converges uniformly to f on with the property a sequence approaches x, then f. there is an e > 0 and a sequence x. with the Bolzano-Weierstrass theorem. 29. |f.(x.) f(x) whenever x, [a,b]. Hint: If not, f (x.)| > e. Then use - This problem outlines a completely different approach to the integral; consequently, it is unfair to use any facts about integrals learned previously. (a) Let some s be a step function on partition {to, ..., t.} of [a,b], [a,b]. is constant b Define I on n -(ti-ti_i) s as gs¿ (ti-1,ti) for where value of s on (ti_i, ti). Show that this definition does si is the (constant) not depend on the partition (to, , t.). function on [a,b] if it is the uniform A function f is called a regulated . (b) so that s .. limit of a sequence of step functions {s,}on [a,b]. Show that in this case there is, for every e > 0, some N such that for m, n > N we have |s.(x) sm(x) | < e for all x in [a,b]. - (c) Show that the sequence of numbers (d) Suppose that (t,}is another sequence s, of step will be a Cauchy sequence. functions on [a,b] which to f. Show that for every e > 0 there is an N such N we have |s.(x) t.(x)| < s for x in [a,b]. converges uniformly that for n > - and InfiniteSeries 516 InßniteSequences b b lim (e) Conclude that n-Foo Ïb f Ï s, lim = n->oo t.. This means that we can dfme lim s, for any sequence of step functions to be RNOC {s.) converging to f. The only remaining question is: Which functions are Here is a partial answer. Prove that a continuous function is regulated. Hint: To find a step funcs(x)| < e for all x in tion s on [a,b] with |f(x) [a,b], consider all y for which there is such a step function on [a,y]. uniformly regulated? *(f) - *30. Find a sequence {f,} approaching f uniformly on [0,1] for which lim (lengthof f, on [0,1]) ¢ length of f on [0,1]. (Length is defined naoo in Problem 13-25, but the simplest example will involve functions the length of whose graphs will be obvious.) COMPLEX NUMBERS CHAPTER With the exception of the last few paragraphs of the previous chapter, this book has presented unremitting propaganda for the real numbers. Nevertheless, the real numbers do have a great deficiency-not every polynomial function has a root. The simplest and most notable example is the fact that no number x can satisfy x2 + 1 = 0. This deficiency is so severe that long ago mathematicians felt the need to 0. For a a number i with the property that i2 + 1 since there is no long time the status of the i was quite mysterious: number x satisfying x2 + 1 0, it is nonsensical to say i be the number i2 admission number i to satisfying of the 0." Nevertheless, + 1 the family of numbers seemed to simplify greatly many algebraic computations, numbers" especially when a + bi (fora and b in R) were allowed, and all the laws of arithmetical computation enumerated in Chapter 1 were assumed valid. example, equation be quadratic For to every "invent" = "number" "let = "imaginary" = "complex ax + bx + c 0 = (a ¢ 0) can be solved formally to give -b x If b2 + = Åb24ac 2a Áb2 4ac -b - - or x - = 2a -4ac 0, these formulas give correct solutions; when complex numbers are allowed the formulas seem to make sense in all cases. For example, the equation > x2+x+1=0 has no real root, since x2+x+1=(x+()2+(>0, forallx. But the formula for the roots of a quadratic equation x if we understand would be B and = 2 V3 (-1) to mean - - 1 Ã + i 2 2 - -- and = x suggest 2 - 1 2 - = -- Ë 2 di, then these numbers i. It is not hard to check that these, as yet purely formal, numbers the equation x2+x+1=0. 517 "solutions" = d.|¯1 - the do indeed satisfy and InßniteSeries 518 InfiniteSequences "solve" whose coefficients are themselves quadratic equations It is even possible to complex numbers. For example, the equation x2+x+1+i=0 ought to have the solutions --1 -1 1 ± x where -3 - 4(1 + i) - 3 ± 2 J-3 4i -- ' 2 a complex the symbol 4i means 4i. In order to have - (a+ßi)2 number 2_ 2+2aßi=--3-4i 2 2 a + ßi whose square is we need _ -4. 2aß can easily These two equations possible solutions: = be solved for real a and ß; in fact, there are two «=-1 a=1 and ß=-2 ß=2. -3 "square -1 2i and + 2i. There is no reasonable way to decide which one of these should be called V-3 4i, and which makes 4i; the conventional usage of sense only for real x :> 0, in which case á denotes the (real) nonnegative root. For this reason, the solution Thus the two roots" of 4i are 1 - - - --- /-3 - -1± -3---4i x= must be understood 2 for: as an abbreviation -1+ x r = 2 -3 where , With this understanding r is one of the square roots of - 4i. we arrive at the solutions -l+1-2i . x ¯" 2 -l-l+2i -1+i; x 2 as you can easily check, these numbers do provide formal solutions for the equation x2+x+l+i=0. For cubic equations complex numbers ax3 roots (a¢0) b, c, and d, has, as we know, a real root a, and if we divide we obtain a second-degree polynomial whose roots are of ax3+bx2+cx+d 0; the roots of this second-degree polynomial with real coefficients a, ax3+bx2+cx+d by x the other 2+cx+d=0 are equally useful. Every cubic equation -a = 25. ComplexNumbers 519 may be complex numbers. Thus a cubic equation will have either three real roots or one real root and 2 complex roots. The existence of the real root is guaranteed by our theorem that every odd degree equation has a real root, but it is not really is of no use at all if the coefficients necessary to appeal to this theorem (which of complex); in the cubic equation are we can, with sufficient cleverness, case a actually find a formula for all the roots. The following derivation is presented not only as an interesting illustration of the ingenuity of early mathematicians, but as of complex evidence numbers they the importance further for (whatever may be). To solve the most general cubic equation, it obviously suffices to consider only equations of the form x3 + bx2 + cx + d 0. = the term involving x2, by a fairly straight-forward It is even possible to eliminate manipulation. If we let b x=y---, 3 then x3=y3-by2+ x 2 2 =y _ - 3 b2 g +-, 3 , 27 9 so 0=x3+bx2+cx+d = = ( y3 y* + - b2y by2 + - b3 - 3 - 27 + 2b2 (b2 -- - --- 3 3 + c y + b 2 b3 2b2 + - 3 b3 b3 bc 9 27 3 + -- 9 + d cy -- b 3 - + d . The right-hand side now contains no term with y2. If we can solve the equadon for y we can find x; this shows that it suffices to consider in the first place only equations of the form x3+px+q=0. We shall see later on 0 we obtain the equation x3 complex number three, so that this has cube that every root, in fact it does have a equation has three solutions. The case p ¢ 0, on the other hand, requires quite an ingenious step. Let In the special case p -q. = = (*) x=w-- p 3w 520 and InfiniteSeries InfiniteSequences Then 0=x3+px+q= 3 =w 3w2p 3mp2 + -- 3w = This equation P 3_ w +p w-- +q wp2 p3 +pw--+q - 9w2 27w3 3w 3 +q. 27w3 can be written 27(w3 2 + 27q(w3 in w3 which is a quadratic equation Thus (!!). -27q (27)2 2 + 4 27p3 ± w3 3 _ - 2.27 = q -- - 2 q2 ± -- 4 p3 + --. 27 Remember that this really means: w3 9 _ _ 2 2 where r + r, is a square root of 4 + ' 3 . 27 We can therefore write -q -± w= q2 p3 4 27 -+-; s y 2 -q/2+r, this equation means that w is some cube root of where r is some square p3/27. six q2/4 of allows possibilities This for w, but when these are root + substituted into (*),yielding -q -± q2 p3 4 27 -+- x= a y 2 p -- , -q -± 3- q2 p3 4 27 --+-- s \ 2 it turns out that only 3 different values for x will be obtained! An even moæ feature of this solution arises when we consider a cubic equation all of whose roots are real; the formula derived above may still involve complex numbers in an essential way. For example, the roots of surprising x3-15x-4=0 25. ComplexNumbers 521 -2 are 4, + -15, (withp Ã, = q -2 and Ã. - -4) On the other hand, the formula derived above gives as one solution = -15 2+44-125 x= - 3 2+lli = 44 2+ . - 125 15 + 3. V2+lli Now, _g2 (2+i)3=23+3-22 ¿3 =8+12i-6-i =2+lli, so one of the cube roots of 2+ l li is 2 + i. Thus, for one solution of the equation we obtain 15 6 + 3i x=2+i+ 15 =2+i+ 6 + 3i 90 45i 36 + 9 6 6 - - 3i 3i - =2+i+ = 4 (!) . The other roots can also be found if the other cube roots of 2+ lli are known. The fact that even one of these real roots is obtained from an expression which depends on complex numbers is impressive enough to suggest that the use of complex numbers cannot be entirely nonsense. As a matter of fact, the formulas for the solutions of the quadratic and cubic equations can be interpreted entirely in terms of real numbers. Suppose we agree, for the moment, to write all complex numbers as a + bi, writing the real number a as a + Oi and the number i as O + li. The laws of show that ordinary arithmetic and the relation i2 = -1 (a+bi)+(c+di)=(a+c)+(b+d)i (a + bi) (c + di) (ac bd) + (ad + = - - bc)i. Thus, an equation like (1 + 2i) (3 + li) - may be regarded simply as an abbreviation 1-3-2-1=1, 1.1+2-3=7. = 1 + 7i for the two equations 522 Infinite Sequences and InfiniteSeries ax2 + The solution of the quadratic equation could be paraphrased as follows: u2 77 uv (i.e.,if y2 _ = -- b2 - bx + c = 0 with real coefficients 4ac, 0, (u + vi)2 -- b2 44 - -b+u (-b+uÍ-3 a 2a then a v - +b \2a / 2 +c=0, 2a +bœ=0, 2 -b+u+W ( (-b+u+vi i.e., then a 2a + b + c 2a = 0 . It is not very hard to check this assertion about real numbers without writing down a single "i," but the complications of the statement itself should convince you that equations about complex numbers are worthwhile as abbreviations for pairs of equations about real numbers. (If you are still not convinced, try paraphrasing the solution of the cubic equation.) If we really intend to use complex numbers consistently, however, it is going to be necessary to present some reasonable definition. One possibility has been implicit in this whole discussion. All mathematical properties of a complex number a + bi are determined completely by the real numbers a and b; any mathematical object with this same property may reasonably used number. be The obvious candidate is the ordered pair to define a complex (a, b) of real numbers; we shall accordingly d¢ne a complex number to be a pair of real numbers, and likewise d¢ne what addition and multiplication of complex numbers is to mean. DEFINITION is an ordered pair of real numbers; if z (a, b) is a complex number, then a is called the real part of z, and b is called the imaginary part of z. The set of all complex numbers is denoted by C. If (a, b) and (c,d) are two complex numbers we define A complex number = (a, b) + (c, d) = (a + c, b + d) (a,b)-(c,d)=(a·c--b-d,a·d+b·c). (The + and appearing on the left side are new symbols being defined, while the + and appearing on the right side are the familiar addition and multiplication for real numbers.) - · When complex numbers were first introduced, it was understood that real numbers were, in particular, complex numbers; if our definition is taken seriously this after all. This difficulty is not true-a real number is not a pair of real numbers, 25. ComplexNumbers 523 is only a minor annoyance, however. Notice that (a, 0) + (b,0) (a + b, O + 0) (a + b, 0), (a,0)-(b,0)=(a-b-0-0,a-0+0-b)=(a-b,0); = = this shows that the complex numbers of the form (a, 0) behave precisely the same with respect to addition and multiplication of complex numbers as real numbers addition and multiplication. with their own For this reason we will adopt the do convention that (a, 0) will be denoted simply by a. The for complex numbers can now be recovered if one more i DEFINITION Notice that i2 = familiar a + bi notation definition is made. (Û,1). -1 = on our convention). (0, 1) (0, 1) - (-1, 0) = (thelast = Moreover (a, b) = = equality sign depends (a, 0) + (0,b) (a, 0) + (b, 0) (0, 1) - =a+bi. You may feel that our definition was merely an elaborate device for defining complex numbers as of the form a + bi." That is essentially correct; it is a firmly established prejudice of modern mathematics that new objects must be defined as something specific, not as Nevertheless, it is inter"expressions "expressions." esting to note that mathematicians were sincerely worried about using complex until the modern definition was proposed. Moreover, the precise definition emphasizes one important point. Our aim in introducing complex numbers was to avoid the necessity of paraphrasing statements about complex numbers in terms of their real and imaginary parts. This means that we wish to work with complex numbers in the same way that we worked with rational or real numbers. p/3w, For example, the solution of the cubic equation required writing x w w2 solving makes that 1/w sense. Moreover, so we want to know was found by a numbers = - In numerous other algebraic manipulations. short, we are likely to use, at some time or other, any manipulations performed on real numbers. We certainly do not want to stop each time and justifyevery step. Fortunately this is not necessary. Since all algebraic manipulations performed on real numbers can be justifiedby the properties listed in Chapter 1, it is only necessary to check that these properties are also true for complex numbers. In most quadratic equation, which requires cases this is quite easy, and these facts will not be listed as formal theorems. example, the proof of Pl, [(a,b) + (c, d)] + (e, f) requires only the application of the = (a, b) + [(c,d) definition of addition The left side becomes ([a+c]+e, [b+d] + f), + For (e, f)] for complex numbers. 524 and InfiniteSeries InfiniteSequences and the right side becomes (a+[c+e],b+[d+ f]); these two are equal because P1 is true for real numbers. It is a good idea to check P2-P6 and PS and P9. Notice that the complex numbers playing the role of 0 and 1 in P2 and P6 are (0, 0) and (1, 0), respectively. It is not hard to figure b) is, but the multiplicative inverse for (a, b) required in P7 is a little out what trickier: if (a, b) ¢ (0, 0), then a2 + b2 ¢ 0 and -(a, -b (a, b) ( - a (1,0). = , a2+b2 a2+b2 This fact could have been guessed in two ways. To find (x, y) with (a, b) (x, y) = - it is only necessary to solve (1, 0) the equations ax-by=1, bx + ay = 0. (a2 + b2). It is also possible to reason The solutions are x a/(a2 + b2Ly that if 1/(a + bi) means anything, then it should be true that -b - = 1 1 a-bi a-bi ¯ a + bi ¯ a + bi a - a2 + b2° bi guessing the inverse Once the existence of inverses has actually been proved (after by some method), it follows that this manipulation is really valid; it is the easiest one to remember when the inverse of a complex number is actually being sought-it was precisely this trick which we used to evaluate 15 6+3i 15 6 3i 6+3i 6-3i 90 45i 36+9 - - ° Unlike Pl-P9, the rules P10-P12 do not have analogues: it is easy to prove that there is no set P of complexnumbers such that P10-P12 are satisfied for all complex numbers. In fact, if there were, then P would have to contain 1 (since1 = 12)and i2), and this would contradict P10. The absence of P10-P12 also (since will not have disastrous consequences, but it does mean that we cannot define -1 -1 = z < w for complex z and w. Also, you may remember that for the real numbers, P10-P12 were used to prove that 1 + 1 ¢ 0. Fortunately, the corresponding fact for complex numbers can be reduced to this one: clearly (1,0) + (1, 0) ¢ (0,0). Although we will usually write complex numbers in the form a + bi, it is worth remembering that the set of all complex numbers C is just the collection of all real numbers. of pairs Long ago this collection was identified with the plane, and plane." The horizontal axis, for this reason the plane is often called the which consists of all points (a, 0) for a in R, is often called the real axis, and the "complex 25. ComplexNumbers 525 vertical axis is called the imaginaryaxis. Two important definitions are also related this geometric picture. to If z DEFINITION of z (withx and x + iy is a complex number = is defined as = and the absolute value or modulus length |z| (Notice that x2+ y2 the nonnegative e' FIGURE " " ¯ " > x 0, so that y real), then the conjugate ž iy, - |z| of z is defined as x2 = Vx2+ 2 y2 is defined unambiguously; it denotes root of x2 + y2) real square Geometrically, ž is simply the reflection of z in the real axis, while |zi is the distance from z to (0,0) (Figure 1). Notice that the absolute value notation for complex numbers is consistent with that for real numbers. The distance between two complex numbers z and w can be defined quite easily as |z- w|. The following theorem lists all the important properties of conjugates and absolute values. 1 THEOREM 1 Let z and w be complex numbers. (1)Ë = (2)ž Then z. z if and only if z is real = number (i.e.,is of the form a + Oi, for some real a). (3)z+w=¯+t¯. (4) (5)z w=z-w. ¯¯¯ž --ž. = (ž)¯I, if z ¢ 0. (6) (7) |z z ž. (8) |z wl |z| |wl. (9) |z+wl s Izl+lwl. = = = PROOF Assertions (1)and (2)are obvious. Equations (3)and (5)may be checked by straightforward calculations and (4)and (6)may then be proved by a trick: 0 = -ž, Õ z + (-z) ž+ = = -z, --z = so -)-1 1= 1 = z (z-1) - - _ . z-1, so g-1 calculation. Equations (7)and (8)may also be proved by a straightforward The only difficult part of the theorem is (9).This inequality has, in fact, already occurred (Problem 4-9), but the proof will be repeated here, using slightly different terminology. It is clear that equality holds in (9)if z 0 or w 0. It is also easy to see that (9) separately the cases A > 0 and is true if z = Aw for any real number 1 (consider A < 0). Suppose, on the other hand, that z ¢ Aw for any real number A, and that = = and Ir¢niteSeries 526 InfiniteSequences w ¢ 0. Then, for all real numbers 1, Notice that wž + ze is real, since wž+zã=eÏ+žë=ez+2w=wž+zã. Thus the right side of (*)is a quadratic equation in I with real coefficients and no real solutions; its discriminant must therefore be negative. Thus 2<0; (wž+zã)2-4|w|2. it follows, since wž + žw and |w| |z| are - real numbers, (wž+zã)<2|w - and |w| |z| > 0, that - |z|. From this inequality it follows that |z+w|2=(z+w) (ž+ 2 Izl2 < |zl2+ Iwl2 ) = (lzl + |wl)2, = which implies that Iz+wl<lzl+lwl-i The operations of addition and multiplication of complex numbers both have important geometric interpretations. The picture for addition is very simple (Figure 2). Two complex numbers z (a, b) and w (c,d) determine a parallelogram having for two of its sides the line segment from (0, 0) to z, and the line segment from (0, 0) to w; the vertex opposite (0, 0) is z + w (aproof of this Appendix 1 to Chapter 4]). geometric fact is left to you [compare The interpretation of multiplication is more involved. If z 0, 0 or w then z w 0 (a one-line computational proof can be given, but even this is assertion has already been shown to follow from P1-P9), so we unnecessary-the attention restrict to nonzero complex numbers. We begin by putting every may our DORZCTO complex number into a special form (compare Appendix 3 to Chapter 4). For any complex number z ¢ 0 we can write = = = c FIGURE 2 a a + c - = z in this expression, |z| is a positive ItI = --; z Izi real number, - z Izl = Izi Izl while =1 ' = 25. ComplexNumbers 527 so that z/\z| is a complex number of absolute value 1. Now any complex number a x + iy with 1 |a| x2 + y2 can be written in the form = = = a length for some number (cos0, sin 0) = 0. Thus every nonzero angle of 0 radians x FIGURE = r(cos0 = cos 0 + i sin 0 complex number z can be written + i sin0) for some r > 0 and some number 0. The number r is unique (itequals Iz|),but 8 is not unique; if Gois one possibility, then the others are 90 + 2kx for k in Z-any of z. Figure 3 shows z in terms of r one of these numbers is called an argument and 8. (To find an argument 8 for z iy + we may note that the equation x 3 = x+iy= means z = |z|(cos0+i = |z| cos0, sin0) that x y = Iz| sin0. aretan y/x; if x So, for example, if x > 0 we can take 0 x/2 when y > 0 and & 3x/2 when y < 0.) 8 Now the product of two nonzero complex numbers = = 0, we can take = = z = w = + i sin9), s(cos¢+ i sin¢), r(cos0 is - z w = = rs(cos0 rs[(cos0 =rs[cos(0+¢)+isin(0+¢)]. + i sin8)(cos¢ + i sin¢) sin0 sin cos ¢ ¢) + i(sin0 cos ¢ + cos0 sin -- ¢)] Thus, the absolute value of a product is the product of the absolute values of the factors, while the sum of any argument for each of the factors will be an argument for the product. For a nonzero complex number z = r(cos0 + i sin0) it is now an easy matter to prove by induction the following very important formula known as De Moivre's Theorem): (sometimes z" = |z|"(cosn0+ i sinn0), for any argument 0 of z. This formula describes z" so explicitly that it is easy to decide just when z" THEOREM 2 Every nonzero complex number has exactly n complex nth roots. More precisely, for any complex number w ¢ 0, and any natural there are precisely n different complex numbers z satisfying z" w. = PROOF Let w = s(cos¢ + i sin¢) = number w: n, and InfiniteSems 528 InfiniteSequences for s Iw| and = some number 7 satisfies z" = w a complex number ¢. Then V (COS = Û § Í Sin Û) if and only if r"(cosnO + i sinnO) which + isin¢), s(cos¢ - happens if and only if r" = cos n0 + i sin n9 = s, cos ¢ + i sin ¢. From the first equation it follows that denotes the positive real nth root of s. From the second equation it that follows for some integer k we have where 6 2kn ¢ 0=0k= ¯¯"· n and 6 6 n Ok for some k, then the number z sin satisfy will the number of nth roots of w, 0 i To determine 0) z" + r (cos w. which only such z are distinct. Now any it is therefore necessary to determine integer k can be written Conversely, if we choose r = = = = k=nq+k' for some integer q, and some integer k' between 0 and n cos04 +isin0 i This shows that every z satisfying z" cos0ks #isinÛy. = w can be written = Moreover, it is easy to see that these numbers k 0, 1 differ by less than 2x. n ¯' 1. Then 6(cosOk#iS1nÛk)k=0,...,n-1. z= = - are all different, since any two OkfOr - ..., In the course of proving Theorem 2, we have actually developed a method for fmding the nth mots of a complex number. For example, to find the cube roots of i (Figure 4) note that li| 1 and that x/2 is an argument for i. The cube roots = FIGURE 4 of i are therefore 1- cos-+ism6 6 , ¯ 1 cos - x - 6 + 2x 3 -- + i sin x - 6 + - 2x 3 cos - . -, ° ¯ 1- cos 5x 5x + r sm 6 6 . = x -+- 6 4x 3 . . +rsm x ---+- 6 4x 3 =cos-+xsm--. 3x 2 . . 3x 2 25. ComplexNumbers 529 Since x( cos 5x 6 3x cos 2 -- cos n2 = - = -- --, x . --, '6 6 5x sm 6 3x Ä 1 2 1 2 = -- sm -, . 2 -- = -- = -, -1, . 0, = sm 2 the cube roots of i are -n+i Ã+i -i ' 2 ' 2 In general, we cannot expect to obtain such simple results. For example, to find and that the cube roots of 2+ 1li, note that |2+ lli| /22+112 is an argument for 2 + l li. One of the cube roots of 2 + l li is therefore arctan = = ( ' VÏžŠ cos arctan ll 2 arctan + i sin 3 - 11 2 3 Á = cos g arctan (arctan 3 + i sin . 3 Previously we noted that 2 + i is also a cube root of 2 + lli. Since |2 + i| = 22 12 W, and since arctan is an argument of 2 + i, we can write this ) = cube root as 2+ i These two cube roots = Ã(cosarctan j + are actually i sin arctan the same number, arctan = 3 arctan (). because 1 2 - can check this by using the formula in Problem 15-9), but this is hardly the sort of thing one might notice! The fact that every complex number has an nth root for all n is just a special i was originally introduced in case of a very important theorem. The number (you order to provide a solution for the equation x2 + 1 0. The FundamentalTheorem ofAlgebrastates the remarkable fact that this one addition automatically provides solutions for all other polynomial equations: every equation = z"+a,-iz""+--·+ao=0 ao,...,a,_iinC has a complex root! In the next chapter we shall give an almost complete proof of the Fundamental Theorem of Algebra; the slight gap left in the text can be filled in as an exercise (Problem 26-5). The proof of the theorem will rely on several new concepts which come up quite naturally in a more thorough investigation of complex numbers. and Ir¢nite Series 530 InfiniteSequences PROBLEMS 1. value Find the absolute and argument of each of the following. (i) 3 + 4i. (ii) (3 + 4i)¯*. i)5 (iii) (1 + (iv) V3+ 4i. (v) 13+4il. 2. Solve the following equations. x2+ix+1=0. (i) (ii) (iii) 2+1=0. x4 x2 + 2ix - 1 0. = (1 + i)y 3, (2+i)x+iy=4° ix (v) 3. = - x3_,2-x-2=0. Describe the set of all complex numbers (i) ž -z. = (iii) Iz al Iz bl. (iv) |z-a|+[z-b|=c. (v) |z| < 1 real part = - - of - 4. z such that z. Prove that \zi |ž|, and that the real part of z is (z+ž)/2, while the imaginary part is (z ž)/2i. = - w|2 2 5. Prove that |z + w|2 geometrically. 6. What is the pictorial relation between z and Á· zR by NP goes into the real axis under multiplication 7. real and a + bi (fora and b real) satisfies if ao, , an-1 are a.-iz"¯I the equation z" + 0, then a bi also satisfies this + ao + equation. (Thus the nonreal roots of such an equation always occur in pairs, and the number of such roots is even.) z"+a,_iz"¯I is divisible by z2 +···+ao (b) Conclude that (a) Prove that -i- |z - -- 20z + |w|2),and interpret this statement P Hint: Which line ... · - - = - ---2az+(a2+b2 coefficients (whose *8. are real). integer which is not the square of another integer. If a and b of a + bá, denoted by a + b are integers we define the conjugate , well showing bá. that the that conjugate Show defined is by as a a number can be written a + bá, for integers a and b, in only one way. (b) Show that for all a and ß of the form a + bá, we have â a; ä a if (a) Let c be an - = = 25. ComplexNumbers531 -a; and only if a is an integer; a + ß that if ao, equation z" anz"¯1 this equation. (c) Prove 9. *10. ... , a,_i + · - - a + ß; 3 = are integersand z 0, then ž + ao = = = = a - ß ä = - ß; and satisfies the a + b b a also satisfies - Find all the 4th roots of i; express the one having smallest argument form that does not involve any trigonometric functions. in a if o is an nth root of 1, then so is ok n-1 nth root of 1 if {1,o, m2 o is called a primitive is the set of all nth roots of 1. How many primitive nth roots of 1 are there for n 3, 4, 5, 9? (a) Prove that (b) A number = n-1 (c) Let o *11. be an nth root of 1, with w ¢ l. Prove that i w* = 0. k-0 Zk lie on one side of some straight line through 0, 0. Hint: This is obvious from the geometric interthen z1 + + Zk ¢ of pretation addition, but an analytic proof is also easy: the assertion is clear if the line is the mal axis, and a trick will reduce the general case if zi, (a) Prove that · · ... , · to this one. that z1-1, through 0, so that zi-1 (b) Show further *12. ... , · Zk¯ k all lie on one side of a straight line Û· Pove that if \til Iz2\ Iz3|and zi + 22 + Z3 0, then zi, z2, and z3 are the vertices of an equilateral triangle. Hint: It will help to assume that zi is real, and this can be done with no loss of generality. Why? = = = COMPLEX CHAPTER FUNCTIONS You will probably not be surprised to learn that a deeper investigation of complex numbers depends on the notion of functions. Until now a function was (intuitively) a rule which assigned real numbers to certain other real numbers. But there is no reason why this concept should not be extended; we might just as well consider a rule which assigns complex numbers to certain other complex numbers. Arigorous definition presents no problems (wewill not even accord it the full honors of a formal definition): a function is a collection of pairs of complex numbers which does not contain two distinct pairs with the same first element. Sincewe consider real numbers to be certain complex numbers, the old definition is really a special case of the new one. Nevertheless, we will sometimes resort to special terminology in order to clarify the context in which a function is being considered. A function f is called real-valued if f (z) is a real number for all z in the domain of f, and complex-valued Similarly, to emphasize that it is not necessarily real-valued. subset of] R in will usually explicitly that function is defined state we a on [a f those cases where the domain of f is [asubset of] R; in other cases we sometimes mention that f is defined on [asubset of] C to emphasize that f (z) is defined for complex z as well as real z. Among the multitude of functions defined on C, certain ones are particularly important. Foremost among these are the functions of the form f (z) = anz" + an-1z.-1 + - - - + ao, where ao, numbers. These functions are called, as in the , as are complex "identity real case, polynomial functions; they include the function f (z) = ... z (the function") and functions of the form f (z) a for some complex number a ("constant functions"). Another important generalization of a familiar function is the value function" f (z) Iz| for all z in C. of Two functions particular importance for complex numbers are Re (the part function") and Im (the part function"), defined by = "absolute = "real "imaginary Re(x+fy)=x, Im(x + iy) = y, The "conjugate for x and y real. function" is defined by f (z) ž = = Re(z) - i Im(z). Familiar real-valued functions defined on R may be combined in many ways to produce new complex-valued functions defined on C-an example is the function f (x + 532 iy) = ei sin(x ·- y) + ix3 cos y. 26. Complex Functions 533 The formula for this particular function illustrates a decomposition which is always possible. Any complex-valued function f can be written in the form f=u+iv for some real-valued functions u and v-simply define u(z) as the real part of f(z), and v(z) as the imaginary part. This decomposition is often very useful, but not always; for example, it would be inconvenient to describe a polynomial function in this way. One other function will play an important role in this chapter. Recall that an number 0 such that argument of a nonzero complex number z is a (real) z |zl(cosÐ + = i sin Ð). There are infinitely many arguments for z, but just one which satisfies O < Ð < function (the 2x. If we call this unique argument 0(z), then Ð is a (real-valued) "argument function") on (z in C : z ¢ 0}. "Graphs" of complex-valued functions defined on C, since they lie in 4-dimensional space, are presumably not very useful for visualization. The alternative used instead: we draw two picture of a function mentioned in Chapter 4 can be copies of C, and arrows from z in one copy, to f (z) in the other (Figure 1). F1GURE I of a complex-valued function is proThe most common pictorial repæsentation duced by labeling a point in the plane with the value f (z),instead of with z (which can be estimated from the position of the point in the picture). Figure 2 shows this sort of picture for several different functions. Certain features of the function are For example, the absolute value function illustrated very clearly by such a around concentric circles is constant on 0, the functions Re and Im are constant vertical and respectively, and the function f (z) z2 wraps horizontal lines, on the the circle of radius r twice around the circle of radius r2. fimetions in genDespite the problems involved in visualizing complex-valued eral, it is still possible to defineanalogues of important properties previously defmed for real-valued functions on R, and in some cases these properties may be easier to visualize in the complex case. For example, the notion of limit can be defmed as follows: "graph." = lim f (z) = I means that for every ß > 0 such that, for all z, if 0 < (real)number |z - a| < e 8, then > 0 there is a | f (z) Il - < number (real) s. 534 and Infinite Series InßniteSequences 2 2 2 2 2 a 2 i 1 2 2 2 -4 -# i- 2 3 -1 -¾ -1 -¾ -¾ 2 2 a 2. 2 1 1 1 i 1 2 I -1 -J 0 -1 2 -# 1 - -1 0 --¾ -f 2 i : 1 i i T 1 $. 2 1 0- Ia 1 1 -# 1 1 2 a2 2 Y 2 2 i - -i 2 1 - ¾ 0 0 0 -¾ -1 -# -1 O (b)f(( 2 2 1. j- 1 ¼ i ¼ i 1 -1--- -¾ -1 -J -J 2 Y ¾ ¼i ¼ -i- , 2 2 -j- 0 2 i 1 a 2 -i -1 0 2 -1 2 1 a i 2 I I 1 -¾ -f- --- Y 1 1 0 ¾ } } } ( i¾ ¾ } } 1 1 -i- - --1 2 ¾ # ¾ ¼ -} g 1 4 1 1 O 2 2 1 i -# -1 2 T 1 2 -¾ 3 0 0 0 2 a 1 3 ' 2 1 1 -# -1 2 x' ·= # ¼ ¾ 1 1 1 Re(t) 2 -4 (a)f(t)-|c| -·-4¿ 4i -i i 111111 111111 ¼ i¼¾ì 0 000000000000 fi¼¼ì¾ 1 4 1 4 -1 -1 -1-1-1--1-1-1-1-1---1-1-1-1. -4: 4: -f -¾-¾-1-1-#- -4 -#-¾-#-#-J-J (c)/(x)-Im(x) i (d)f(x) i i T (e)/( )- FIGURE 2 0( ) =z 26. ComplexFunctions 535 Although the defmitionreads precisely as before, the interpretation is slightly dif ferent. Since |z w| is the distance between the complex numbers z and w, the equation lim f (z) l means that the values of f (z) can be made to lie inside - = any given circle around I, provided that z is restricted to lie inside a sufficiently small circle around a. This assertion is particularly easy to visualize using the "two copy" picture of a function (Figure 3). FIGURE 3 Certain facts about limits can be proved exactly as in the real case. In particular, lim e lim z = c, = a, z->a lim[ f (z) + g(x)] lim f(x)-g(z) lim z->a 1 -- lim f(z)-limg(x), = 1 = g(z) lim f (z) + lim g(z), = lim g(z) if lim g (z)¢ 0. z-va , The essential property of absolute values upon which these results aœ based is the inequality Iz+ wl 5 Izl+ |w|, and this inequality holds for complex numbers as well as for real numbers. These facts already provide quite a few limits, but many more can be obtained from the following theorem. THEOREM I u(z) +io(z) for real-valued functions u and v, and let l real numbers a and ß. Then lim f (z) I if and only if Let f(z) = = a + iß for = limu(z) = a, lim v(z) = ß. z->a PROOF Suppose first that lim f (z) = l. If s if0<|z-a|<8, > 0, there is & > 0 such that, for all z, then|f(z)-ll<e. The second inequality can be written [u(z)-a]+i[v(z)-ß] <e, and InfiniteSeries 536 InfiniteSequences or a] [u(z) - [v(z) ß] + < - e2. -ß and v(z) Since u(z) are both real numbers, inequality therefore implies that their squares are positive; this -a «]2 [u(z) - which 2 and [v(z) ß]2 < e and |v(z) ß| < e2 - implies that lu(z) œ - Since this is true for all e |< - e. 0, it follows that > lim u (z) = and a lim v (z) Now suppose that these two equations hold. If a for all z, if 0 < |z a| < 8, then > ß. = 0, there is a ô > 0 such that, - e |u(z)-a|<- 2 which e and |v(z)-œ|<-, 2 implies that -ll -a]+i[v(z) |f(z) = |[u(z) 5 |u(z) a | + - € <--+-=e. 2 This proves that lim f(z) = ß] |i| |v(z) ß| - - - 8 2 l. In order to apply Theorem 1 fruitfully, notice that since we already know the limit lim z a, we can conclude that = lim Re(z) = Re(a), lim Im(z) = Im(a). A limit like lim sin(Re(z)) sin(Re(a)) = follows easily, using continuity of sin. Many applications such limits as the following: lim ž = lim Izl = lim (x+iy)ea+bi of these principles prove ä, lai, ei sin x + ix' cos y = eb SÍnG ‡ $8 COS b. Now that the notion of limit has been extended to complex functions, the notion of continuity can also be extended: f is continuous at a if lim f (z) = f (a), and 26. ComplexFunctions 537 f is continuous is continuous at a for all a in the domain of f. The previous work on limits shows that all the following functions are continuous: if f f(z)=anz"+a,-iz"¯!+---+ao, f(z) ž, f(z)= |z|, = f(x+iy)=e'sinx+ix3cosy. Examples of discontinuous functions are easy to produce, and certain ones come One particularly frustrating example is the funcup very naturally. real numbers (seethe tion" Ð, which is discontinuous at all nonnegative 0 it is possible to change the discontinuin Figure 2). By suitably redefining example ities; for (Figure 4), if O'(x) denotes the unique argument of z with x/2 5 O'(z) < 5x/2, then 0' is discontinuousat ai for every nonnegative real number a. But, no matter how 0 is redefined, some discontinuities will always "argument "graph" occur. 3r FIGURE 4 f(z) - 0'(:) The discontinuity of 0 has an important bearing on the problem of defining a function," that is, a function f such that (f (z))2 z for all z. For real real numbers. numbers the function f had as domain only the nonnegative If complex numbers are allowed, then every number has two square roots (except 0, which has only one). Although this situation may seem better, it is in some ways worse; since the square roots of z are complex numbers, there is no clear criterion for selecting one root to be f (z), in preference to the other. "square-root = and InßniteSeries hýiniteSequences 538 One way to define f is the following. We set f(0) f(z) +isin cos = 0, and for e ¢0 we set = . Clearly (f (z))2 z, but the function f is discontinuous, since 0 is discontinuous. As a matter of fact, it is impossible to find a continuous f such that (f (z))2 z for all z. In fact, it is even impossible for f(z) to be defined for all z with |zl 1. To we could always prove this by contradiction, we can assume that f(1) 1 (since replace f by f). Then we claim that for all 9 with 0 0 < 2x we have = = = = _< - f (cos0 + i sin O) The argument for this is left to you argument). But (*)impliesthat (it is a (*) lim 0->2x f(cos0 + isin0) cos = = 0 + i sin standard cosx 0 -. type of least upper bound + isinn -1 = ¢ f (1), 2x. Thus, we have our contradic1 as 8 even though cos0 + isin0 tion. A similar argument shows that it is impossible to define continuous -+ d- -> "nth-root ¯ [a,b] × [c,d] c-- functions" for any n > 2. For continuous complex functions there are important analogues of certain theorems which describe the behavior of real-valued functionson closed intervals. A natural analogue of the interval [a,b] is the set of all complex numbers z x + ly with a < x < b and c < y d (Figure 5). This set is called a closed rectangle, and is denoted by [a,b] x [c,d]. If f is a continuous complex-valued function whose domain is [a,b] × [c,d], then it seems reasonable, and is indeed true, that f is bounded on [a,b] × [c,d]. That is, there is some real number M such that = _< a FI GURE 5 b |f (z)| < It does not make M sense to say that for all z in f has [a,b] × [c,d]. a maximum and a minimum value on [c,d], since there is no notion of order for complex numbers. If f is a real-valued function, however, then this assertion does make sense, and is true. In continuous function on [a,b] x [c,d], then particular, if f is any complex-valued is also continuous, there is some zo in [a,b] × [c,dj such that so |f| [a,b] x ||(zo)| < |f(z)| for all z in [a,b] × [c,dj; a similar statement is true with the inequality reversed. It is sometimes d]." f attains its maximum and minimum modulus on [a,b] × " said that [c, The various facts listed in the previous paragraph will not be proved here, although proofs are outlined in Problem 5. Assuming these facts, however, we can 26. ComplexFunctions 539 give a proof of the Fundamental Theorem of Algebra, which is really quite surprising, since we have not yet said much to distinguishpolynomial functions now from other continuous THEOREM 2(THEFUNDAMENTAL THEOREM OF ALGEBRA) functions. Then there is a complex such that z" + PROOF numbers. be any complex Let ao,...,an-1 z"¯ I + a,_2z"¯2 + a,_i - - + ao - number z 0. = Let f (x) z" + = a,_i z"¯ I + - - - + g Then f is continuous, and so is the function |fl defined by |fl(z) If(z)|= Iz"+an-iz"¯'+-··+ao|. = Our proof is based on the observation that a point zo with f (zo) 0 would clearly be a minimum point for |f |. To prove the theorem we will first show that |f | does indeed have a smallest value on the whole complexplane. The proof will be almost identical to the proof, in Chapter 7, that a polynomial function of even degree (withreal coefficients) has a smallest value on all of R; both proofs depend on the fact that if |zi is large, then |f(z)| is large. We begin by writing, for z ¢ 0, = ( f (z) z" 1 + = so that an_i + ---- · - . + - a0 , an-1 If(x)|= |z|"- 1+--+---+- a0 . Let M max(1, = 2nla._i|, 2n|aol). ..., Then for all z with |z à M, we have |zk|à |z| and |a._g| Iz| |a._al zk| ¯ 1 2n' |an_a| ¯ 2nla._al SO an_I + z which Un-1 a0 +--<-+ z" GO +-<-, z= ¯¯ z ¯ Î 2 implies that an-1 ao 1+---+---+z 21- z" an-1 ao z z" -+---+- This means that for |z| 2 M. |f (z)| 2 2 -- In particular, if [z|2 M and also |z| 2 |f(z)| V2|f (0)|, then à |f(0)|. 1 2 E-. and InßniteSeries 540 InßniteSequences [a,b] x [c,d] be a Now let closed rectangle (z Iz!5 (Figure 6) which contains max(M, 2|f(0)|)), and suppose that the minimum attained at zo, so that of |f| on : [a,b] x [c,d] is forzin[a,b]x[c,d]. (1) |f(zo)|5|f(z)| It follows,in particular, that |f (zo)| 5 |f (0)|. Thus (2) Combining (1)and (2)we J2|f(0)|),then|f(x)|E see that lf(0)|2|f(zo)|. |f(zo)| 5 If(z)| for all z, so that |f| value on the whole complex minimum FIGURE if|z|2max(M, attains its plane at zo. 6 To complete the proof of the theorem we now show that f (zo) to introduce the function g defined by = 0. It is convenient g(z) = f(z + zo). Then g is a polynomial function of degree n, whose minimum absolute value 0. occurs at 0. We want to show that g(0) Suppose instead that g(0) a ¢ 0. If m is the smallest positive power of z which occurs in the expression for g, we can write = = g(z)=a+ßz"+cm+1z"** where ß ¢ 0. Now, according - Ra", 25-2 there is a complex to Theorem such that ß number y 26. ComplexFunctions 541 Then, setting dk = CkŸk,We have |g(yz)|= |a+ßy"zm+dm+1z" |a +duz"I + -arm+dmwizm+1 = = a 1-z"+d.wizm tortuously arrived at, will enable us to reach a quick contradiction. Notice first that if |z| is chosen small enough, we will have This expression, so dm+1 <1. -z+--- a If we choose, from among all z for and positive, then Consequently, if 0 < z < which this inequality holds, some z which is real 1 we have l = - . z dm+1 z+ a · j <l-zm 1. = This is the desired contradiction: for such a number z we have |g(yz)| < |a|, contradicting the fact that |a| is the minimum of |gl on the whole plane. Hence, the original assumption must be incorrect, and g(0) 0. This implies, finally, that - Even taking into account our omission of the proofs for the basic facts about continuous complex functions, this proof verified a deep fact with surprisingly little work. It is only natural to hope that other interesting developments will arise if we pursue further the analogues of properties of real fimctions. The next obvious step is to define derivatives: a function f is differentiable at a if lim f(a+z) - f(a) . exists, 542 and InfiniteSeries InfiniteSequences in which case the limit is denoted by f'(a). It is easy to prove that f'(a) = f'(a) (f+g)'(a)= (f g)'(a) = = - 0 1 f (z) c, if f (z) z, f'(a)+g'(a), f'(a)g(a) + f (a)g'(a), ' = = -g'(a) (1 -/- (a) - g (fo if g)'(a) if g(a) = [g(a)] = f'(g(a)) 0, g'(a); - the proofs of all these formulas are exactly the same as before. It follows, in nz"¯1. These formulas only particular, that if f (z) z", then f'(z) prove the of candidates rational obvious differentiability functions however. Many other are not differentiable. Suppose, for example, that = = f (x + iy) = - x iy (i.e.,f (z) = ž). If f is to be differentiable at 0, then the limit . lun f(x + iy) - f (0) . lun = - iy (x+iy)-0 x + iy x + iy (x+iy)-o x must exist. Notice however, that if y = if x = 0, then x - x 0, then - iy = x + iy 1, and iy -1; = x + iy therefore this limit cannot possibly exist, since the quotient has both the values 1 and for x + iy arbitrarily close to 0. of this example, it is not at all clear where other differentiable functions view In are to come from. If you recall the definitions of sin and exp, you will see that there is no hope at all of generalizing these definitions to complex numbers. At the moment the outlook is bleak, but all our problems will soon be solved. -1 PROBLEMS 1. x + iy (sothat a is a complexvalued function defined on R). Show that œ is continuous. (This follows immediately from a theorem in this chapter.) Show similarly that ß(y) x + iy is continuous. (b) Let f be a continuous function defined on C. For fixed y, let g(x) = on R). Show f (x + iy). Show that g is a continuous function (defined similarly that h(y) f(x + iy) is continuous. Hint: Use part (a). (a) For any real number y, define a(x) = = = 2. is a continuous real-valued function defined on a closed [a,b]×[c, d]. Prove that if f takes on the values f (z) and f(w) (a) Suppose that f rectangle 26'. ComplexFunctions 543 for z and w in [a,b] × [c,d], then f also takes all values between f(z) and f(w). Hint: Consider g(t) f(tz + (1 t)w) for t in [0,1]. complex-valued function defined on [a,b) × [c,d], If f is a continuous the assertion in part (a)no longer makes any sense, since we cannot talk of complex numbers between f (z) and f (w). We might conjectum that f takes on all values on the line segment between f (z) and f (w), but even this is false. Find an example which shows this. = *(b) 3. that if ao, complex numbers zi, (a) Prove ... a,_; , ... , - then there are are any complex numbers, necessarily distinct) such that z. z" + an-i za-1 + (not · - · + ao (z z¿). = - i=1 an-1z"¯I that if ao, + + ao can be a._ i are real, then z" + written as a product of linear factors z+a and quadratic factors z2+c+b all of whose coefficients are real. (Use Problem 25-7.) (b) Prove 4. - . . . - · , In this poblem we will consider only polynomials with real coefficients. ifit can be written as hi2+. a polynomial is called a sum of squares for polynomials h, with real coefficients. -+h,2 Such - if f is a sum of squares, then f (x) :> 0 for all x. if f and g are sums of squares, then so is f g. that f (x) > 0 for all x. Show that f is a sum of squares. Hint: (c) Suppose write First f(x) xkg(x), where g(x) ¢ 0 for all x. Then k must be and g(x) > 0 for all x. Now use Problem 3(b). even (why?), (a) Prove that (b) Prove that - = (a) 5. A be a set of complex numbers. real case, a limit point of the set A A number z is called, as in the if for every (real)e > 0, there is < with a| point in but A a a |z e z ¢ a. Prove the two-dimensional version of the Bolzano-Weierstrass Theorem: If A is an infinite subset of [a,b] x [c,d], then A has a limit point in [a,b] x [c,J]. Hint: First divide [a,b] × [c,d] in half by a vertical line as in Figure 7(a). Since A is infinite, at least one half contains infinitely many points of A. Divide this in half by a horizontal line, as in Figure 7(b). Continue in this way, alternately dividing by vertical and horizontal lines. (a) Let - (b) FIGURE 7 (The two-dimensional bisection argument outlined in this hint is so standard that the title "Bolzano-Weierstrass" often serves to describe the method of proof, in addition to the theorem itself See, for example, H. Petard, "A Contribution to the Mathematical Theory of Big Game 446--447.) Hunting," Amer.Math. Monthly,45 (1938), continuous that function on [a,b] × [c,d] is a (complex-valued) (b) Prove bounded on [a, b] x [c,d]. (Imitate Problem 22-31.) (c) Prove that if f is a real-valued continuous function on [a,b] x [c,d], then f takes on a maximum and minimum value on [a,b] x [c,d]. (You can use the same trick that works for Theorem 7-3.) and InfiniteSeries 544 InfiniteSequences *6. The proof of Theorem 2 cannot be considered to be completely elementary depends on Theobecause the possibility of choosing y with y= rem 25-2, and thus on the trigonometric functions. It is therefore of some interest to provide an elementary proof that there is a solution for the equation z" c 0. -a/ß - = - to show that solutions of z2 c = 0 can an explicit computation be found for any complex number c. (b) Explain why the solution of z" c = 0 can be reduced to the case where (a) Make (a) a convex subset of the planc - - is odd. z" c has its minimum Let (c) zo be the point where the function f(z) absolute value. If zo 0, show that the integer m in the proof of Theorem 2 is equal to 1; since we can certainly find y with yl -gÂ the remainder of the proof works for f. It therefore suffices to show that the minimum absolute value of f does not occur at 0. (d) Suppose instead that f has its minimum absolute value at 0. Since n is odd, the points ±8, ±ai go under f into Show that for value small 8 at least one of these points has smaller absolute than thereby obtaining a contradiction. n = - -¢ - -c±a", -c±ô"i. -c, (b) a nonconvex FIGURE8 subset of the plane ¯ZI)ml.---'( • -Zk)m k (a) Show that f'(z) (b) Let g(z) = = (x zi)"t ma(z - - zŒ)¯I. - - ... z - kym* mg(Z - Za)¯ . «=1 Show that if g(z) cannot all lie on the same side of a straight - = 0, then zi, ..., k line through z. Hint: Use Problem 25-11. A (c) subset K of the plane is convex if K contains the line segment joining any two points in it (Figure 8). For any set A, there is a smallest convex set containing it, which is called the convex hull of A (Figure 9); if a FIGURE 9 26. Complex Functions 545 point P is not in the convex hull of A, then all of A is contained on one side of some straight line through P. Using this information, prove that the roots of f'(z) 0 lie within the convex hull of the set {Zl9 ZkÌ· Further information on convex sets will be found in reference [19]of the Suggested Reading. = 8. *9. --· 9 Prove that if f is differentiable at z, then f is continuous at z. Suppose that f u + iv where = u and v are real-valued functions. u(x + iyo) and h(x) v(x + iyo). Show that if yo let g(x) f'(xo + iyo) a + iß for real a and ß, then g'(xo) a and h'(xo) ß. (b) On the other hand, suppose that k(y) u(xo+iy) and l(y) v(xo+iy). Show that l'(yo) a and k'(yo) (c) Suppose that f'(z) 0 for all z. Show that f is a constant function. (a) For fixed = = = = = -ß. = = - 10. (a) Using the expression f (x) = = - 1 1+x2¯2i 1 1 x-i find f (k)(x) for all k. (b) Use this result to find arctan(k) (0) for all k. 1 . x+r , CHAPTER If you have not already guessed where differentiable complex functions are going to come from, the title of this chapter should give the secret away: we intend to define fimctions by means of infmite series. This will necessitate a discussion of infinite sequences of complex numbers, and sums of such sequences, but (aswas the case with limits and continuity) the basic definitions are almost exactly the same as for real sequences and series. An infinite sequence of complex numbers is, formally, a complex-valued function whose domain is N; the convenient subscript notation for sequences of real numbers will also be used for sequences of complex numbers. A sequence (a.} of complex numbers is most conveniently pictured by labeling the points as in the plane (Figure 1). of complex The sequence shown in Figure 1 converges to 0, real sequences being defined precisely as for sequences: the sequence (a,} converges to l, in symbols lim a, l, as' a,.ag a,• a, . a• . a. • a' •"" ,a a•,*a= FI GURE COMPLEX POWER SERIES 1 "convergence" .a' = • .a. arf a.. if fOr every e GN > 0 there is a natural • . . . if n . number N, then > N such that, for all n, |a. - l| < e. condition FIGURE This means that any circle drawn around I will contain a. for all sufficiently large n (Figure 2); expressed more colloquially, the sequence is eventually inside any circle drawn around l. Convergence of complex sequences is not only defined precisely as for real S€qu€nC€S, but can even be reduced to this familiar case. 2 THEOREM 1 Let a, = b, + ic,, for real b,, and c,,, and let l lim a, Then U-¥OO = = ß+ for real ß and y. l if and only if lim b, R-FDD PROOF iy = ß and lim c, RADO = y. The proof is left as an easy exercise. If there is any doubt as to how to proceed, consult the similar Theorem 1 of Chapter 26. The sum of a sequence {a.}is defined, once again, as lim s., where n->oo s.=ai 546 +··-+a.. 27. ComplexPowerSeries 547 Sequences for which this limit exists are summable; the infinite series alternatively, we may say that if this limit exists, and diverges otherwise. as converges It n=I is unnecessary to develop any new tests for convergence of infinite series, because of the following theorem. THEOREM 2 Let a, for real b, and c.. b, + ic, = a, converges if and only if Then n=I b, and c, both converge, and in this n=1 n=I case a. b. + i = n=1 PROOF c. n=1 . n=i This is an immediate consequence of Theorem 1 applied to the sequence ofpartial sums of {a.). There is also a notion of absolute a, converges convergence if the series absolutely for complex series: the series |a, \ converges (thisis a series of real n=I n=t numbers, and consequently one to which our earlier tests may be applied). following theorem is not quite so easy as the preceding two. THEOREM 3 The Let a, for real b, and c.. b, + ic, = as converges absolutely if and only if Then n=1 b, and n=1 c, both converge n=1 absolutely. PROOF b, and Suppose first that n=1 |c,| both c, both converge absolutely, i.e., that n=t converge. It follows that |b, |c.| converges. Now, + .=1 n=i |a, | = |b, + ic. | 5 |b.| + |c,|. It follows from the comparison test that la.| converges n-1 |b,| + |c,\ are |b,\ and n=1 real and nonnegative). Thus a, converges n=1 (thenumbers |a,| absolutely. and 548 and InfiniteSeries InfiniteSequences Now suppose that |a,| converges. Since n=1 |a. | /b.2 + = c.2, it is clear that and |b.| 5 |a,| Once again, the comparison |c,| 5 |a.|. test shows that |bs| and n=1 of Two consequences |c,| converge. n=1 Theorem 3 are particularly noteworthy. If an conn=i verges absolutely, b, and then n=1 b, and n=1 c, converge, c, also converge absolutely; consequently n=1 by Theorem 23-5, so n=i as converges by Theorem 2. n=1 In other words, absolute convergence implies convergence. Similar reasoning series has the same shows that any rearrangement of an absolutely convergent sum. These facts can also be proved directly, without using the corresponding theorems for real numbers, by first establishing an analogue of the Cauchy criterion (seeProblem 13). With these preliminaries safely disposed of, power series, that is, functions of the form we can now consider complex a,(z-a)"=ao+a1(z-a)+a2(z-a)2+--·. f(z)= n=o Here the numbers a and an are allowed to be complex, and we are naturally interested in the behavior of f for complex z. As in the real case, we shall usually consider power series centered at 0, f(z) anz"; = n=0 in this case, if f (zo) converges, then f (z) will also converge for |z| < |zo|. The proof of this fact will be similar to the proof of Theorem 24-6, but, for reasons that will soon become clear, we will not use all the paraphernalia of uniform convergence and the Weierstrass M-test, even though they have complex analogues. Our next theorem consequently generalizes only a small part of Theorem 24-6. THEOREM 4 Suppose that anzo" n=o = ao + al zo + a2zo2 27. ComplexPowerSenes 549 for some zo ¢ 0. Then if |zl < |zo|, the two series converges anz" ao + ai = a2x2 + z+ - - - n-0 nanzn-1 2a2z + 3a3z" + ai + = - · n=l both converge absolutely. PROOF As in the proof of Theorem 24-6, we will need only the fact that the set of numbers a,zo" is bounded: there is a number M such that |anzo"|5 for all n. M We then have z la.z"I lanzo"|= - zo In O and, for z ¢ 0, -1 Inanz"¯I | nianzo" - Izi <-n - Izl Iz/zo|" and n nanz"¯* converge zo z zo . this shows that both Iz/zo|"converge, n=1 n=0 and - " M ¯ Since the series | absolutely n=1 anz" n=0 (theargument z ¢ 0, but this series certainly converges for z = nanz"" for assumed that n=1 0 also). Theorem 4 evidently restricts greatly the possibilities for the set z : anz" converges . For example, the shaded set A in Figure 3 cannot be the set of all z where anz" n=o FIGURE 3 since it contains z, but not the number w satisfying w| < |z|. It seems quite unlikely that the set of points where a power series converges of could be anything except the set of points inside a circle. If we allow and radius the power series converges only at 0) of radius 0" (when oo" series converges at all points), then this assertion is true (withone the (when power complication which we will soon mention); the proof requires only Theorem 4 and a knack for good organization. converges, "circles "circles 550 and InfiniteSeries InfiniteSequences THEOREM 5 For any power series anz" a2:2 + ao + aiz + = a3Z3 n=0 one of the following three possibilities must be true: (1)n=o anz" converges only (2) anz" converges for z 0. = absolutely for all z in C. n-0 is a number (3)There > 0 such that anz" converges absolutely if diverges if Iz| > R. (Notice that we do not mention and R |z| < n=o when PROOF R |z| what happens R.) = IÆt S = x in R anw" converges for some w : with |w = x . n=o Suppose first that S is unbounded. x in S such that a number |z| < Then for any complex number z, there is By definition of S, this means that x. an w" n=o converges for some w with |wl = x |z|. It followsfrom Theorem 4 that > anz" n=o absolutely. Thus, in this case possibility (2)is true. Now suppose that S is bounded, and let R be the least upper bound of S. If converges R anz" converges 0, then = only for z = 0, so possibility (1)is true. Suppose, n=o on the other hand, that R is a number x > 0. Then if z is a complex number with |z| < R, there in S with |z| < x. Once again, this a.w" that means converges n=o for some w with Iz| < |w|, so anz" converges that Moreover, if absolutely. n=0 [z| > anz" does not converge, since R, then Iz| is not in S. n=0 The number anz". n=o R which occurs in case In cases (3)is called (1)and (2)it is customary is 0 and oo, respectively. When 0 the circle of convergence < R < the radius of convergence of to say that the radius of convergence oo, the circle (z : |z| = R} is called anz". If z is outside the circle, then, of course, of n=o 27. ComplexPowerSeries 551 does not converge, but actually a much stronger statement can be made: anz" n=o bounded. To prove this, let w be any number with R; if the terms anz" were bounded, then the proofof Theorem4 would the terms anz" are not even 1z|> |w| > anw" converges, which is show that false. Thus (Figure 4), inside the circle of n=0 the terms anz- are not bounded oo . anz" converges m the the series convergence best possible way and (absolutely) n=0 outside the circle the series diverges in the worst possible way .,,= What happens on the circle of convergence is a much more difficult question. We will not consider that question at all, except to mention that there are power series which converge everywhere on the circle of convergence, power series which converge nowhere on the circle of convergence, and power series that do just about anything in between. (See Problem 5.) abmluœly circle of convergence FIGURE 4 anz" are bounded). not Converges (theterms Algebraic manipulations Thus, if TCâÌCRSe. anz" and g(z) = baz" both have radius of = n=0 R, then h(z) > convergence f(z) on complex power series can be justifiedjust as in the n-0 (a, = + b.)z" also has radius of convergence n=0 R and h h(z) = f caz", = + g inside the circle of radius R. Similarly, the Cauchy product for c, akb,_g, has radius of convergence = inside the circle of radius R. And if f (z) fg = a,z" has radius of convergence = > 0 n-o -/ and ao à R and h k=0 n=0 b,z" 0, then we can fmd a power series with radius of convergence n=o > 0 which æpresents 1/f inside its circle of convergence. But our real goal in this chapter is to poduce differentiable functions. We therefore want to generalize the result proved for real power series in Chapter 24, that a function defined by a power series can be differentiated term-by-term inside the circle of convergence. At this point we can no longer imitate the proof of Chapter 24, even if we were willing to intoduce uniform convergence, because no analogue of Theorem 24-3 seems available. Instead we will use a direct argument (whichcould also have been used in Chapter 24). Before beginning the proof, we notice that at least there is no problem about produced by term-by-term the convergence difTerentiation. If the series anz" of the series has radius of con- n=0 vergence R, then Theorem 4 immediately implies that the series converges for Iz| < R. Moreover, if |z| > R, nanz"¯1 also so that the terms anz" are unbounded, 552 and InfiniteSeries InfiniteSequences then the terms nanz"¯1 are surely unbounded, This shows that the radius of convergence nanz"¯1 does not converge. so nanz"¯I of is also exactly R. n=1 THOREM e If the power series f (z) anz" = n=0 has radius of convergence and R > 0, then f'(z) f is differentiable at z for all z with |z! < R, nanza-1 = n=l PROOF "e/3 argument." The fact that the theorem is clearly true for We will use another polynomial functions suggests writing (*) f (x + h) f (z) -- h °° - °° E nanz n-1 = a" n=1 °° y a" ((z + h)" h n=0 ((z + h)" z") - h n=0 N - a" ((z + h)" °° E nanz - ,_i n=1 ((z + h)" z") - h n=0 N z") - N z") - 1 N og nanz"¯ + n=1 - nanza-1 n=l We will show that for any e > 0, each absolute value on the right side can be made < e/3 by choosing N sufficiently large and h sufficiently small. This will clearly prove the theorem. Only the first term in the right side of (*)will present any difficulties. To begin with, choose some zo with |z! < Izo| < R; henceforth we will consider only h with |z + h| 5 |20|. The expression ((z + h)" z")/h can be written in a more convenient way if we remember that - xn_yn n-14 n-2 n-3 n-1 24... -y x Applying this to (z + h)" h -- z" ¯ (z + h)" - z" ' (z+h)-z we obtain (z+h)"-z" h = (z+h)n-1+z(z+h)" *+ ··+z"¯'. 27. ComplexPowerSeries 553 Since |(z + h)n-1 + z(z + h)"¯ we have a. But the series n|a.| ((z + h)" + z") - h ·· z"¯I| + · 5 nlanl n|zo|"¯1 5 , Izol · y - |zo|"¯1converges, so if N is sufficiently large, then · n=l nia.I Izol"¯'< · . n=N+1 This means that °° i a" ((z + h)" z") - N i - h n=0 ((z + h)" a" h n=0 °° = ((z + h)" a" z") - h n=N+1 z") -- °° g 5 a" ((z + h)" - z") h n=N+1 n=N+1 In short, if N is sufficiently large, then °° (1) L a" ((z + h)" N z") - a" - h n=0 for all h with ((z + h)" h| 5 3' |zo|. It is easy to deal with the third term on the right side of converges, e ' h n=0 |z + z") -- (*):Since nanz"¯' n=1 it follows that if N is sufficiently large, then nanz" (2) I nanz"¯1 - n=1 n=1 Finally, choosing an N such that N (1)and (2)are ((z + h)" La" lin - z") h n=0 true, we note that N = 2nanz" , n=1 N since the polynomial function g(z) anz" is certainly differentiable. Therefore = n=o N (3) n=0 Un((z + h)" - z") h N - i nanz"¯ < . n=1 for sufficiently small h. As we have already indicated, (1),(2),and (3)prove the theorem. and InßniteSeries 554 InfiniteSequences Theorem 6 has an obvious corollary: a function represented by a power series is infinitely differentiable inside the circle of convergence, and the power series is its Taylor series at 0. It follows,in particular, that f is continuous inside the circle of since a function differentiable at z is continuous at z (Problem 26-8). convergence, The continuity of a power series inside its circle of convergence helps explain the behavior of certain Taylor series obtained for real functions, and gives the promised answers to the questions raised at the end of Chapter 24. We have already seen that the Taylor series for the function f (z) 1/(1 + z2), namely, = 1-Z2 4 6 for real z only when izi < 1, and consequently has radius of convergence 1. It is no accident that the circle of convergence contains the two points i and If this power series converged in a circle of at which f is undefined. radius greater than 1, then (Figure 5) it would represent a fimction which was continuous in that circle, in particular at i and But this is impossible, since it z2) z2) unit equals 1/(1 + circle, and 1/(1 + does not approach a limit inside the from inside the unit circle. i or as z approaches The use of complex numbers also sheds some light on the strange behavior of the Taylor series for the function converges ---i -i. -i FIGURE 5 e¯1¡x2, x ¢ 0 0. 0, l x Although we have not yet defined et for complex z, it will presumably be true that if y is real and unequal to 0, then f (x) = = f (iy) e¯1/cy? = = l' e . The interesting fact about this expression is that it becomes large as y becomes Thus f will not even be continuous at 0 when defined for complex numbers, 0. so it is hardly surprising that it is equal to its Taylor series only for z The method by which we will actually define et (aswell as sin z and cos z) for complex z should by now be clear. For real x we know that small. = x sinx=x--+ 3 x 5 ¯ ' 31 x 2 x cosx=1--+------, 2! 4 41 2 x x 1! 2! e2=1+-+-+--- For complex z we therefore define 5 z3 . smz=z---+------, 31 5! 2 4 cosz=1---+--+---, 2! . 4! 2 exp(z) = er = 1+ + + - - - 27. ComplexPowerSeries 555 sin z, and exp'(z) exp(z) by Theorem 6. Then sin'(z) cos z, cos'(z) Moreover, if we replace z by iz in the series for e2, and make a rearrangement of the terms (justified by absolute convergence), something particularly interesting happens: = = e 2! 3! 2 =1+sz------+--+-+ 3 it 4 3! 4! it z . 2! = 4! it 5 5! 4 2 It is clear from the definitions 5 z z z----+--+··· 3! 5! +i , cos x + i sin z. = e +··- 5! 3 z z 1---+----·· 2! 4! ( 5 (iz)2 (iz)* (iz)4 =1+iz+-+-+-+ so = -- (i.e.,the power series) that sin(-z) = cos(-z) = sin - z, cos z, so we also have e¯" = cos z i sin z. - From the equations for en and e¯u we can derive the formulas eu . e¯¤ - = smz , . 2: cos z eu + = e¯" 2 The development of complex power series thus places the exponential function at the very core of the development of the elementary functions-it reveals a connection between the trigonometric and exponential functions which was never imagined when these functions were first defined, and which could never have As a by-product of this been discovered without the use of complex numbers. relationship, obtain we a hitherto unsuspected connection between the numbers e and x: if in the formula eu we take z = = cosx + isinz x, we obtain the remarkable e'" result --1. = (More generally, e2"U" is an nth root of 1.) With these remarks we will bring to a close our investigation of complex functions. And yet there are still several basic facts about power series which have not been mentioned. Thus far, we have seldom considered power series centered at a, f (z) a.(z = n=o - a)", 556 and InfiniteSeries InfiniteSequences 0. This omission was adopted partly to simplify the exposition. except for a For power series centered at a there are obvious versions of all the theorems in this chapter (theproofs require only trivial modifications): there is a number R = 0 or (possibly "oo") with |z all with R, and has unbounded a| < R the function z - al < Iz - such that the series f(z) terms for z with = a - a| > R; moreover, for G)n an has derivative R 6 |z n=0 b-a| FIGURE a)" converges absolutely for z - n=0 R a a.(z n-1 r It is less straightforward to investigate the possibility of representing a function series centered at b, if it is already written as a power series centered as a power at G. If a.(z-a)" f(z)= n=o has radius of convergence R, and b is a point with |b a| < R (Figure 6), then it is true that f (z) can also be written as a power series centered at b, - b,(x-b)" f(z)= .=o b, are necessarily ff">(b)/n!);moreover, this series has radius of least R |b a| (itmay be larger). convergence We will not prove the facts mentioned in the previous paragraph, and there are several other important facts we shall not prove. For example, if (thenumbers at -- f(z) - an(z = - a)" n=o and g(b) and g(z) b.(z = - b)", n=o a, then we would expect that fog can be written as a power series centered at b. All such facts could be proved now without introducing any basic new ideas, but the proofs would not be as easy as the proofs about sums, products and reciprocals of power series. The possibility of changing a power series centered at a into one centered at b is quite a bit more involved, and the treatment of fog requires still more skill. Rather than end this section with a tour deforce analysis," one of of computations, we will instead give a preview of where all these facts are derived as the most beautiful branches of mathematics, = "complex straightforward of some fundamental results. Power series were introduced in this chapter in order to provide complex functions which are differentiable. Since these functions are actually infinitely differentiable, it is natural to suppose that we have therefore selected only a very special consequences 27. ComþlexPowerSeries 557 collection functions. The basic theorems of differentiable complex of complex show that this is not at all true: analysis If a cornplexfunction is d¢ned in some region A of the planeand is dj¶erentiablein A, foreachpointa in A the thenit i,sautomatically infmitelydfferentiablein A. Moreover, Taylorseriesforf at a will convergeto f in any cirde containedin A (Ftgure7). These factsare among the first to be proved in complex analysis. It is impossible methods used are quite different to give any idea of the proofs themselves-the elementary If these calculus. facts are granted, however, then from anything in the facts mentioned before can be proved very easily. Suppose, for example, that f and g are functions which can be written as power series. Then, as we have shown, f and g are differentiable-it then follows from also differentiable. easy general theorems that f + g, f g, 1/g and fog are Appealing to the results from complex analysis, it follows that they can be written as power sefies. and 1/g from We already know how to compute the power series for f + g, f would also compute how and those for f an expression one easy to guess g. It is expansions series when given series the we are in (z for f og as a power power · FIGURE 7 .g -b) an(z-a)" f(z)= n=0 g(x) bk(z = b)k - k=0 with a = g(b) be, so that = g(z) - bt(z = a b) - . k=1 the power series First of all, we know how to compute I (g(z) - a)' = (oo bk(z - b)* , k=1 series will and this power begin with (z - b)'. Consequently, the coefficient of z" in a;(g(z)-a)' f(g(z))= t=0 can be calculated powers of g(z) as a - fmite sum, involving only coefficients arising from the first n a. Similarly, if f (z) a.(z = - a)" n=0 has radius of convergence R, then f is differentiable in the region A {z: Iz-a| < R}. Thus, if b is in A, it is possible to write f as a power series centered at b, = 558 and Inßnite Senes InfiniteSequences which will converge will be ff">(b)/nl. a.(z a)" may - al. The coefficient of z" b in the circle of radius R This series may actually converge in a larger circle, because - - be the series for a function differentiable in a larger region n=o than A. For example, suppose that f (z) 1/(1 + zi). Then f is differentiable, where it is not defined. Thus f (z) can be written as a power except at i and = -i, i + 4), anzn with radius series of convergence 1 (asa n=o a2, (-1)" = and as = 0 if k is odd). It is also possible to write f (z) bn(z = - n=0 where FIGURE 8 b, = matter of fact, we know that ff"3(i)/n!. We can ()", easily predict the radius of convergence of this Î ( )2, the distance from to i or (Figure 8). added complex analysis incentive to investigate further, one more result As an will be mentioned, which lies quite near the surface, and which will be found in serÎCS: it is -l- ¼ -i any treatment of the subject. and 1, but for complex For real z the values of sin z always lie between real, all then this is not at true. In fact, if z iy, for y edd e¯Y e¯¤© ei -1 z = - siniy - . 2i 2i If y is large, then sin iy is also large in absolute value. This behavior of sin is typical of functions which are defined and differentiable on the whole complex plane (such functions are called entire). A result which comes quite early in complex analysis is the following: Liouville'sTheoren: The only boundedentirefunctions are the constantfunctions. As a simple application of Liouville's Theorem, consider a polynomial function f(z)=z"+a._iz"¯I+---+ao, 1, so that f is not a constant. We already know that f (z) is large for large z, so Liouville's Theorem tells us nothing interesting about f. But consider the function 1 where n > g(x) = -. f (z) If f (z) were never 0, then g would be entire; since f (z) becomes large for large z, the function g would also be bounded, contradicting Liouville's Theorem. Thus f (z) 0 for some z, and we have proved the Fundamental Theorem of Algebra. - PROBLEMS 1. Decide whether each of the following series converges, and whether verges absolutely. it con- 27. ComplexPowerSenes 559 °° (i) (1+i)n ' nl =1 n °° (ii) L n=1 (iii) n=1 (iv) n=i (v) 2. l + 2i 2" . (( + (i)". logn n=2 + i" logn --. n n Use the ratio test to show that the radius of convergence of each of the following power series is 1. (In each case the ratios of successive terms will approach a limit < 1 if Iz! < 1, but for |z| > 1 the ratios will tend to oo or to a limit > 1.) (i) . n=1 (ii) . n=l (iii) z". n=1 (n + 2¯")z". (iv) n=1 (v) 3. 2"z"'. n=1 Use the root test (Problem 23-7) to find the radius of convergence the following power series. 23456 zzzzzz 3 2 -+--+--+-+-+-+---. (i) (ii) 22 . n=1 z". (iii) n=1 (iv) (v) z". n=1 2"¿"'. n=1 32 23 33 of each of 560 and Inßnite Series InfiniteSequences 4. The root test can always be used, in theory at least, to find the radius of of a convergence power series; in fact, a close analysis of the situation leads to a formula for the radius of convergence, known as the "Cauchy-Hadamard formula." Suppose first that the set of numbers is bounded. (a) Use Problem 23-7 to show that if Ïi¯ WROO Ñ Iz| < 1, then anz" conn=o verges. (b) Also show that if Ïi |z n->oo fin "oo"). anz" is of To complete the formula, deis unbounded. diverges for z ¢ 0, so that the anz" this case, means oo if the set of all = terms. n=o (where 00 N that the radius of convergence "1/0" Ïld anz" has unbounded 1, then n=o (c) Parts (a)and (b)show 1/ > radius Prove that in of convergence n=o is 0 5. (whichmay "1/00"). be considered as Consider the following three series from Problem 2: n=1 n=1 n=1 Prove that the first series converges everywhere on the unit circle; that the third series converges nowhere on the unit circle; and that the second series converges for at least one point on the unit circle and diverges for at least one point on the unit circle. 6. e=+= for all complex numbers e2 e= z and w by showing e=** that the infinite series for is the Cauchy product of the series for e2 and e". and cos(z + w) sinzcosw + coszsinw (b) Show that sin(z + w) sin and sin all complex for w. w cos z cos w z z (a) Prove that - = = = - 7. every complex number of absolute value 1 can be written eu for some real number y. (b) Prove that |e**U| ex for real x and y. (a) Prove that = 8. (a) Prove that exp takes on every complex value except 0. (b) Prove that sin takes on every complex value. 9. For each of the following functions, compute the first three nonzero the Taylor series centered at 0 by manipulating power series. (i) f (z) = (ii) f(z) = tan z. -:)¯1/2 z(1 terms of 27. ComplexPowerSenes 561 e" (iii) f (z) = (iv) f(z) = . log(1 sin2 (v) f (z) = (vi) f (z) = (vii)f (z) = (viii)f(z) = 1 - z2L - z z2 sin(z2 10. . z cos2 z 1 4 2x2 + 3 ° - [e(E¯U - 1]. that we write a differentiable complex function f as f u + Let ü and ü denote the restrictions iv, where u and v are real-valued. of u and v to the real numbers. In other words, ü(x) u(x) for real numbers x x). Using other Problem 26-9, show (butü is not defined for that for real x we have (a) Suppose = = f'(x) ü'(x) +iif = (x), denotes the complex derivative, while ü' and D' denote the derivatives of these real-valued functions on R. (b) Show, more generally, that where f' ordinary satisfies the equation (c) Suppose that f f f") + (*) a,-i f in-1) + - - - + ao f = 0, the a; are real numbers, and where the f (k) denote higher-order complex derivatives. Show that ü and 6 satisfy the same equation, where üf*)and ü(k)HOW denote higher-order derivatives of real-valued functions where on R. (d) Show that if a b+ ci is a complex root of the equation z" + an-1xn-1 4 ebx sin cx and f (x) = ebx cos cx are both 0, then f (x) = + ao solutions of - 11. - - = = (*). on C. log |w| w if and only if z x +iy with x (b) Given w ¢ 0, real logarithm function), and of to. the argument log denotes an y (here *(c) exist continuous there that function log defined for Show does not a such that exp(log(z)) nonzero complex numbers, z for all z ¢ 0. (Show that log cannot even be defined continuously for Iz| 1.) (a) Show that exp is not one-one show that e2 = = = = = Since there is no way to define a continuous logarithm fimction we canlogarithm not speak of thelogarithm of a complex number, but only of with meaning of e' the numbers for w," tv. And infmitely many one z "a = 562 and InfiniteSeries InfiniteSequences for complex numbers a and b we define ab to be a set of complex num€blop bers, namely the set of all numberS or, more precisely, the set of all numbers ebt where z is a logarithm for a. (d) If m is an integer, then am consists of only one number, the one given by the usual elementary definition of am (e) If m and n are integers, then the set a"/" coincides with the set of values given by the usual elementary definition, namely the set of all bmwhere b is an nth root of a. If (f) a and b are real and b is irrational, then ab contains infinitely many members, even for a > 0. (g) Find all logarithms of i, and find all values of i'. (h) By (ab)cweab.mean the set of all numbers of the form ze for some number z in the set Show that (16)' has infinitely many values, while 1" has only one. 12. all values of ab-c are (i) Show that (a) For real x show that we can choose log(x + i) log(x + i) and log(x log(1 + x2) + i = 1 1+x2 arctan x - log(x-i)=log(1+x2)--i (It will help to note that x/2 (b) The expression b-c VaÏuesOf (ßbge âlSO (ab c - - c by i) to be , -arctanx . arctan x - 1 2i arctan = 1 1 x--i x+i 1/x for x ¢ 0.) yields, formally, Use part 13. I dx -[log(x = 1 -log(x -i) +i)]. 1+x2 (a)to check that this answer agrees with the usual one. (a) A sequence (a,} of complex is called a Cauchy sequence numbers b, + ic,, where if b, and c, are 0. Suppose that a, lim |a. a,| real. Prove that {a,}is a Cauchy sequence if and only if {b,}and {c.} are Cauchy sequences. (b) Prove that every Cauchy sequence of complex numbers converges. using theorems about real series, that an (c) Give direct proofs, without absolutely convergent series is convergent and that any rearrangement has the same sum. (It is permitted, and in fact advisable, to use the proofs of the corresponding theorems for real series.) = = - m,n-oo 14. (a) Prove that e k=l 2 = e'2 1 _ x sin ei(n+1)x/2 sm - 2 27. ComplexPowerSeries 563 the formulas for (b) Deduce and coskx sinkx k=1 that are given in k=1 Problem 15-33. 15. Let (a,} be the Fibonacci sequence, (a) If (b) Show that r (1 + = a2 1, a. = 2 show that rn+1 = 1+ 1/r.. 1 + 1/r. lim r, exists, and r H->OO anyt/a., = r, al = = = a, + a,41. Conclude that r = ß)/2. anz" that (c) Show of convergence radius has 2/(1 + ß). (Using n=1 the unproved in this chapter theorems and the fact that anz" = n=1 --1/(Z2 + z 1) from Problem 24-15 we could have predicted that the radius of convergence is the smallest absolute value of the roots of z2 + radius of convergence z 1 0; since the roots are (-1 ± Ã)/2, the should be (-1 + Notice that this number is indeed equal to - = - d)/2. Ã).) 2/(1+ 16. Since (et 1)/z can be written as the power series 1 + z/21 + z2/3! + which is nonzero at 0, it follows that there is a power series - °° er 1 - n=o - - - b, n! Using the unproved theorems in this chapter, we can even predict the radius of convergence; it is 2x, since this is the smallest absolute value of the numbers z 2kzi for which er 1 = 0. appearing called numbers the b, here are Bernoulli numbers.* The radius with nonzero of convergence. = (a) Clearly bo - 1. Now show that = z =--+ ex-1 «¯2+1 z 2 z e= + 1 e=-l' e=+1 er--l' e¯z-1 and deduce that bt=- (b) By finding the , b,=0 ifnisoddandn>1. coefRcient of z" in the right side of the equation z= X=+i+¾+ *Sometimes the numbers B, = (-1)n-1b2 are called the Bernoulli numbers, because b, = 0 if n is odd and > 1 (seepart (a))and because the numbers b2n alternate in sign, although we will not of this nomenclature are also in use. prove this. Other modifications and InfiniteSerks 564 Infinite Sequences show that n-i b,=0 . forn>1. i=0 This formula allows us to compute any bk in terms of previous ones, and shows that each is rational. Calculate two or three of the following: b2 *(c) Part = b4 g, (a)shows = b6 -y, = be g, = --g. that -z/2 °° n=o b2n z/2 2;I = (2n)!z · ez et/2 1 - e¯=/2° - Replace z by 2iz and show that z cotz (-1)"2 z = . n=o *(d) Show that tan z *(e) = cot z 2 cot 2z. - Show that tan z (-1)"¯12 (2 = - 1)z . n=1 (This series converges for |z| < x/2.) 17. The Bernoulli numbers play an important role in a theorem which is best introduced by some notational nonsense. Let us use D to denote the entiation operator," so that Df denotes f'. Then Dk f wiÏÏ mean f(k) and "differ- e" f will mean f(n)/n!(ofcourse this series makes no sense in general, but it will make sense if f is a polynomial function, for example). Finally, operator" let A denote the for which Af (x) f(x + 1) f (x). Now Taylor's Theorem implies, disregarding questions of convergence, that "difference = f (x + 1) - = no or (*) f(x+1)- f(x)= n=1 we may write this symbolically as Af = (e* 1) f, where I stands for the "identity operator." Even more symbolically this can be written A = e 1, which suggests that - - D-€D_) D ' 27. ComplexPower Series 565 Thus we obviously ought to have Dk D= k=0 i.e., (**) J'(x) [f(k)(X = # Î) f(k)(X) - . k=0 The beautiful thing about all this nonsense is that it works! if f is a polynomial function (inwhich case the infinite sum is really a finite sum). Hint: By applying (*)to f(k),find (k)(X # Î) f(k) a formula for f (x);then use the formula in Problem 16(b) to find the coefEcient of fU)(x) in the right side of (**). (a) Prove that (**)is literally true - (b) Deduce from (**)that f'(0)+-·-+ f'(n) [f(k) = (k)(0)]. _ k=0 (c) Show that for any polynomial function g we have n+1 oc bk +D-g(k-1)(0)]. -[g g(0)+---+g(n)= g(t)dt+ k=1 (d) Apply this to g(x) kP = xF to show that np+11 ¯***. = k=1 Using the fact that bi kP = - , 1 k k=1 n show that np+11 np-k+1 = k k=2 k=1 1 The first ten instances of this formula were written out in Problem 2-7, which offered as a challenge the discovery of the general pattern. This may now seem to be a preposterous suggestion, but the Bernoulli numbers were actually discovered in precisely this way! After writing out these 10 formulas, Bernoulli claims (inhis posthumously printed work Ars Comectandi, 1713): "Whoeverwill examine the series as to their regularity may be able to continue the table." He then writes down the above formula, offering no proof at all, merely noting that the coefficients by (whichhe denoted simply by A, B, C, ) satisfy the equation in Problem 16(b). The relation between these numbers and the coefEcients in the power series for z/(e2 1) was discovered by Euler. ... - and InfiniteSenes 566 InfiniteSequences *18. The formula in Problem 17(c)can be generalized to the case where g is not a polynomial function; the infinite sum must be replaced by a finite sum plus a remainder term. In order to find an expression for the remainder, it is useful to introduce some new functions. (a) The Bernoullipolynomials y, are defined by p,(x) n = bn-kXk k=0 The first three are 1 91(x)=x-p2(x) x2 = 3x 2 x 2 Show that p.(0) y, = (1) y,'(x) p,(x) b., b. if n ny,_i(x), = = (--1)"p,(1 = > 1, x) - for n > 1. Hint: Prove the last equation by induction on n, starting with n 2. RNk (x) be the remainder term in Taylor's Theorem for f (k), on the (b) Let interval [x,x + 1], so that = N f(k)(X (*) # Î) - (k+n)(X f(k)(x) = )+ RNk(X). n=1 Prove that f'(x) = k=0 [f f*) + 1 - f (*)(x)] RN--kk(X). - k=0 Hint: Imitate Problem 17(a). Notice the subscript N (c) Use the integral form of the remainder to show that RN-kk(X) (d) Deduce fn(x ‡ 1 = - t) (N+1)(t) -- k on R. dt. the "Euler-Maclaurin Summation Formula": g(x)+g(x+1)+---+g(x+n) N be g lx+n+1 -[gf*¯>(x = g(t)dt x + k=l ° +n+1) - g(k-1)(x)] + S,(x, n), 27. Complex PowerSaies 567 where x+j+1 SN(X, N) = ŸN(X Î - f) (N)(t) - dt. *Ì j=0 be the periodic function, with period 1, which satisfies ‡.(t) p,(t) for 0 st < 1. (Part (a)implies that if n > 1, then ‡. is continuous, since y,(1) p,(0), and also that ‡. is even if n is even and odd if n is odd.) Show that (e) Let ‡, = = ÈN (X SN(X, 8) = f ) g(N)(t) Lx+n+1 - - (-1)N+1 (N) dt if x is an integer (t) dt . Unlike the remainder in Taylor's Theorem, the remainder S.(x, n) usually does 0, because the Bernoulli numbers and functions not satisfy lim SN(X, n) = N->œ the first few examples do not suggest this). become large very rapidly (although Nevertheless, important information can often be obtained from the summation formula. The general situation is best discussed within the context of a specialized series"), but the next problem shows one particularly important study ("asymptotic example. **19. (a) Use the Euler-Maclaurin Formula, with N = 2, to show that logl+-··+log(n-1) = (b) Show that log Ï" log t dt 1 - - n! ( log (d) Problem " ¯ an"+1/2e¯.41¡12. Use part (c)to show Ý2(Í)dt. 2t2 (n!)22 . (2n)!S that -2.2* 2n2n+1 lim = no and conclude that a = R. 2t, 2t2 19-40(d) shows that lim [" ‡2(t) dt. ‡2(t) dt JI ¯ = 1 + ‡2(t)/2t2dt exists, and = n! ( -- 1 i exp(ß + 11/12), then = 1 - + Ï2 nn+1/2g-n+1/l2n - " 11 ¯ (c) Explain why the improper integral ß that if a 1 log n + a(2n)2n+1/2g-2n ' show 568 and InfinitzSeries InfiniteSequences (e) Show that 1 Ï1/2 92(t) dt p2(t) dt = = 0. O (You can do the computations explicitly, but the result also follows imfrom Problem 18(a).)Now what can be said about the graphs of #(x) ‡2(t)dt and #(x) fe'#(t)dt? Use this information and integration by parts to show that mediately = £ = © Ï ‡2(t) dt (f) Show that the maximum that value of © 2t2 , (g) Finally, conclude and |p2(x)|for x in [0, 1] is ¼, ‡2(t)dt Ï 0. > 2t2 n < --. conclude 1 12n that nn+1/2e¯" n! < n.41¡2e¯"**/2". < The final result of Problem 19, a strong form of Stirling's Formula, shows that nn+1/2,-n, in the n! is approximately sense that this expression differs from n! by an amount which is small compared to n when n is large. For example, for 10 we obtain 3598696 instead of 3628800, with an error < 1%. n nature of A more general form of Stirling's Formula illustrates the the summation formula. The same argument which was used in Problem 19 can now be used to show that for N > 2 we have = "asymptotic" -b1)nk-1 N log ¯ n Since ‡N 1/2e¯¤ k(k dt. is bounded, we can obtain estimates of the form ‡N(t) l°° NtN MN dt _< gN-1 n If N is large, the constant MN Will also be large; but for very large n the factor will make the product very small. Thus, the expression nl-N -bl E nn+1/2e¯n - exp k (k )nk-1 large may be a very bad approximation for n! when n is small, but for large n (how good depends on N). depends on N) it will be an extremely good one (how PART EPILOGUE There was a most ingenious Architect who had contrived a new Method forbuildin.gHouses, by beginningat the Roof, and working downw«rdsto the Foundation. JONATHAN SWWF FIELDS CHAPTER Throughout this book a conscientious attempt has been made to define all imfor which an intuitive definition is portant concepts, even terms like and R, the two main protagonists of this story, often considered sufficient. But Q have only been named, never defined. What has never been defined can never be analyzed thoroughly, and P1-Pl3 must be considered assumptions, numbers. theorems, about Nevertheless, the term has been purposely not avoided, and in this chapter the logical status of Pl-P13 will be scrutinized more "function," "properties" "axiom" carefully. R, the sets N and Z have also remained undefined. True, some talk about all four was inserted in Chapter 2, but those rough descriptions are far from a definition. To say, for example, that N consists of 1, 2, 3, etc., merely is useless). names some elements of N without identifying them (andthe The natural numbers can be defined, but the procedure is involved and not quite pertinent to the rest of the book. The Suggested Reading list contains references to this problem, as well as to the other steps that are required if one wishes to develop calculus from its basic logical starting point. The further development of this program would proceed with the definition of Z, in terms of N, and the definition of Q in terms of Z. This program results in a certain well-defined and properties P1-P12 as set Q, certain explicitly defined operations + and of R, in terms of theorems. The final step in this program is the construction Q. which construction It is this last concerns us. Assuming that Q has been defined, all and that P1-P12 have been proved for Q, we shall ultimately d¢ne R and prove of P1-P13 for R. Our intention of proving Pl-Pl3 means that we must define not only real numbers, but also addition and multiplication of real numbers. Indeed, the real numbers are of interest only as a set together with these operations: how the real numbers behave with respect to addition and multiplication is crucial; what the real numbers may actually be is quite irrelevant. This assertion can be expressed in which includes a meaningful mathematical way, by using the concept of a as special cases the three important number systems of this book. This extraordinarily important abstraction of modern mathematics incorporates the properties P1-P9 common to Q, R, and C. A field is a set F (ofobjects of any sort whatoperations" soever), together with two + and defined on F (thatis, two rules which associate to elements a and b in F, other elements a+b and a.b in F) for which the following conditions are satisfied: Like Q and "etc." -, "field," "binary · b) + c a + (b + c) for all a, b, and c in F. (2)There is some element 0 in F such that (1)(a + = for all a, in F, + 0 a (ii)for every a in F, there is some element b in F such that a + b (i)a 571 = = 0. 572 Epilqgue forallaandbinF. (3)a+b=b+a for alla, b, and c in F. element I in F such that 1 ¢ 0 and (i)a 1 a for all a in F, (ii)For every a in F with a ¢ 0, there is some element b in F such that a.(b•c) (4)(a•b).c= (5)There is some = - a.b=1. for all a and b in F. a b + a c for all a, b, and c in F. (6)a b b a (7)a (b + c) = - - • = • • The familiar examples of fields are, as already indicated, Q, R, and C, with It is probably unnecessary to + and being the familiar operations of + and explain why these are fields, but the explanation is, at any rate, quite brief. When - - . the rules (1),(3),(4),(6),(7) + and are understood to mean the ordinary + and are simply restatements of Pl, P4, P5, P8, P9; the elements which play the role of 0 and 1 are the numbers O and 1 (which accounts for the choice of the symbols 0, 1); number and the b in (2)or (5)is or a¯', respectively. (For this reason, in an arbitrary field F we denote by the element such that a + (-a) 0, and by a-I the element such that a-1 for 0.) 1, a a · - , -a -a = ¢ = · In addition to Q, R, and C, there are several other fields which can be described easily. One example is the collection F1 of all numbers a + bá for a, b in Q. The operations + and will, once again, be the usual + and for real numbers. It is necessary to point out that these operations really do produce new elements • - of Fl: (a+bÃ) + (c +dÃ) (a+c) +(b +d)Ã, which is in F1; (a+bÃ) (c+dd)=(ac+2bd)+(bc+ad)Ã, whichisinF1= field are obvious for F1: since these hold for all real numbers, they certainly hold for all real numbers of the form a + bÃ. a+bn in Fi and, for a Condition (2)holds because the number 0 0+0ñis in F1 the number ß = (-a) + (-b)Ä in F1 satisfies a + ß 0. Similarly, On 1 1+ is in F1, so (5i)is satisfied. The verification of (Sii)is the only slightly difficult point. If a + bd ¢ 0, then Conditions (1),(3),(4),(6),(7)for a = = = = a+bÃ· it is therefore necessary 1 to show that 1/(a + a a+bn 1 - =1; a+bá bÃ (a-bÃ)(a+bÃ) bÃ) is in Fi. This is true because (-b) a a2-2b2 + a2-2b2 Ë ° (The division by a bÃ is valid because the relation a bÃ 0 could be true only if a b 0 (sinceÃ is irrational) which is ruled out by the hypothesis a + bÃ ¢ 0.) The next example of a field, F2, is considerably simpler in one respect: it contains only two elements, which we might as well denote by 0 and 1. The operations - = = - = 28. Relds 573 + and - are described by the following tables. 000 110 101 1 0 • 001 The verification of conditions straightforward, case-by-case checks. For proved by checking the 8 equations obtained by 0 or 1. Notice that in this field I + 1 0; this equation may also example, condition setting a, be written 1 0 + (lp(7)are (1)may be b, c 1 Our final example of a field is rather silly: F3 consists of all pairs (a, a) for a in R, and + and are defined by = = -1. = - (a,a)+(b,b)=(a+b,a+b), (a,a)-(b,b)=(a-b,a-b). (The + and appearing on the right side are ordinary addition and multiplication for R.) The verification that F3 is a field is left to you as a simple exercise. A detailed investigation of the properties of fields is a study in itself but for our purposes, fields provide an ideal framework in which to discuss the properties of numbers in the most economical way. For example, the consequences of Pl-P9 which were derived for in Chapter 1 actually hold for any field; in particular, they are true for the fields Q, R, and C. Notice that certain common properties of Q, R, and C do not hold for all fields. For example, it is possible for the equation 1 + 1 0 to hold in some fields, and consequently a b b a does not necessarily imply that a b. For the field C the assertion 1+1 ¢ 0 was derived from the explicit description of C; for the fields Q and R, however, this assertion was derived from further properties which do not have analogues in the conditions for a field. There is a related concept which does use these properties. An ordered field is a field F (withoperations + and together with a certain subset P of F (the elements) with the following properties: - "numbers" = = - = - .) "positive" (8)For all a (i)a (ii)a (iii) -a (9)If a (10)If a in F, one and only one of the following is true: 0, is in P, is in P. = and b are in P, then a + b is in P. and b are in P, then a b is in P. · We have already seen that the field C cannot be made into an ordered field. The field F2, with only two elements, likewise cannot be made into an ordered shows that 1 must be in P; then field: in fact, condition (8),applied to 1 (9) implies that 1 + 1 0 is in P, contradicting (8).On the other hand, the field Fi, -1, = = 574 Epilogue consisting of all numbers an ordered bá with a, the set of all a + a + certainly can be made into which are positive real numbers F3 can also be made into an ordered field; the field: let P be the ordinary sense). The field (in description of P is left to you. It is natural to introduce notation b in bÃ Q, for an arbitrary sponds to that used for Q and R: we define a>b ab if if if if ordered field which corre- a-bisinP, b>a, a<bora=b, a>bora=b. Using these definitions we can reproduce, for an arbitrary ordered field F, the definitions of Chapter 7: A set A of elements of F is bounded above if there is some x in F such that xt a for all a in A. Any such x is called an upper bound for A. An element x of F is a least upper bound for A if x is an upper bound for A and x 5 y for every y in F which is an upper bound for A. Finally, it is possible to state an analogue last abstraction of this chapter: of property P13 for R; this leads to the ordered field is an ordered A complete which is bounded above has a least upper field in which every nonempty set bound. of fields may seem to have taken us far from the goal of constructing the real numbers. However, we are now provided with an intelligible means of formulating this goal. There are two questions which will be answered in the remaining two chapters: The consideration 1. Is there a complete ordered field? 2. Is there only one complete ordered field? Our starting point for these considerations will be Q, assumed to be an ordered field, containing N and Z as certain subsets. At one crucial point it will be necessary to assume another fact about Q: Let x be an element of in N such that nx > y. This assumption, Q with x > 0. Then for any y in which asserts that the rational of the real numbers, does not follow numbers Q there is some n have the Archimedian other the properties of an from property conclusively ordered field see reference [17] (forthe example that demonstrates this of the Suggested Reading). The important point for us is that when Q is explicitly constructed, properties P1-P12 appear as theorems, and so does this additional 28. Relds 575 assumption; if we really began from the beginning, no assumptions be necessary. about Q would PROBLEMS Let F be the set {0,1, 2} and define operations + and on F by the following tables. (The rule for constructing these tables is as follows: add or multiply in the usual way, and then subtract the highest possible multiple of 3; thus 2-2=4=3+1,so2-2=l.) 1. • + 0 1 2 • 0 1 2 0 0 1 2 0 0 0 0 1 1 2 0 1 0 1 2 2 2 0 1 2 0 2 1 be made into an ordered Show that F is a field, and prove that it cannot field. Suppose now that we try to construct a field F having elements 0, 1, 2, 3 with operations + and defined as in the previous example, by adding or multiplying in the usual way, and then subtracting the highest possible multiple of 4. Show that F will not be a field. 2. · Let F tables. 3. = {0,1, a, ß} and define operations 001aß by the following on F 00000 1 0 ß 1 a auß01 p - ·01aß +01aß 1 + and 0 1 p a a0aßl ß0ßla ßa10 Show that F is a field. 4. (a) Let F be a field in which 1 + 1 can also (b) Suppose be written that a + a b+ b consequently -a). a = = = 0. Show that a + a = 0 for all a (this = 0 for some a ¢ 0. Show that I + 1 0 for all b). = 0 (and 576 Epilogue 5. (a) Show that in any field we have (1+-- +1)·(1+---+1) m times 1+--.+1 = mn times a times for all natural numbers m and n. (b) Suppose that in the field F we have 1+---+1=0 m times for some natural number must be a prime number of F). 6. Show that the smallest m with this property (thisprime number is called the characteristic m. Let F be any field with only finitely many elements. (a) Show that there must be distinct natural m and n with -+1. 1+---+1=1+m times (b) Conclude numbers n times that there is some natural number k with 1+---+1=0. k times 7. Let a, b, c, and d be elements of a field F with a d the equations - a.x+b•y=œ, c•x +d•y can 8. = b c • ¢ 0. Show that ß, be solved for x and y. Let a be an element of a field F. A with b2 b•b a. = "square root" of a is an element b of F = (a) How many square roots does 0 have? (b) Suppose a ¢ 0. Show that if a has a square mots, unless 1 + 9. - 1 0, in = which case a x2 + it has two square b x + c = 0, where b and c are elements of b2 that F. field 4•c has a square root r in F. Show that Suppose a r)/2 solution of this equation. is a (-b + (b) In the field F2 of the text, both elements clearly have a squam mot. On the other hand, it is easy to check that neither element satisfies the equation x2 + x + 1 0. Thus some detail in part (a)must be incorrect. What is it? (a) Consider an equation has root, then only one. . - = 10. Let F be a field and a an element of F which does not have a square root. This problem shows how to construct a bigger field F', containing F, in which a does have a square root. (This construction has already been carried 28. Fields 577 -1; this special case should through in a special case, namely, F R and a guide you through this example.) Let F' consist of all pairs (x, y) with x and y in F. If the operations on F and are + and , define operations on F' as follows: = = e · e (x,y)O(z,w)=(x+z,y*w), (x, y) O (z, w) (a) Prove that (x•z +a·y•w,y-z+x·w). = F', with the operations (b) Prove that (x, 0) e (y, 0) (x, 0) e (y, 0) and e is a field. y, 0), y, 0), = (x + = (x . e, so that we may agree to abbreviate (x, 0) by x. (c) Find a squam root of a (a, 0) in F'. = 11. Let F be the set of all four-tuples (w, x, y, z) of real numbers. by Define + and - (s,t,u,v)+(w,x,y,z)=(s+w,t+x,u+y,v+z), (s,t,u,v)•(w,x,y,z)=(sw-tx-uy-vz,sx+tw+uz-vy, sy+uw+vx-tz,sz+vw+ty-ux). F satisfies all conditions for a field, except (6).At times the become quite ornate, but the existence of multiplicative inverses is the only point requiring any thought. (b) It is customary to denote (a) Show that algebra will (0, 1, 0, 0) by i, (0, 0, 1, 0) by j, (0,0, 0, 1) by k. Find all 9 products of pairs i, j, and k. The results will show in particular that condition (6)is definitely false. This field" F is known as the quaternions. "skew CONSTRUCTION REAL NUMBERS CHAPTER OF THE The mass of drudgery which this chapter necessarily contains is relieved by one truly first-rate idea. In order to prove that a complete ordered field exists we will for an ordered have to explicitly describe one in detail; verifying conditions (1)--(10) field will be a straightforward ordeal, but the description of the field itself of the elements in it, is ingenious indeed. At our disposal is the set of rational numbers, and from this raw material it is necessary to produce the field which will ultimately be called the real numbers. To the uninitiated this must seem utterly hopeless-if only the rational numbers are known, where are the others to come from? By now we have had enough experience to realize that the situation may not be quite so hopeless as that casual consideration suggests. The strategy to be adopted in our construction has already been used effectively for defining functions and complex numbers. Instead of nature" of these concepts, we settled for a definition trying to determine the that described enough about them to determine their mathematical properties "real completely. A similar proposal for defining real numbers requires a description of real numbers in terms of rational numbers. The observation, that a real number ought to be determined completely by the set of rational numbers less than it, suggests a strikingly simple and quite attractive possibility: a real number might (andin fact eventually will) be described as a collection of rational numbers. In order to make this proposal effective, however, some means must be found for describing set of rational numbers less than a real number" without mentioning real numbers, which are still nothing more than heuristic figments of our mathematical imagination. If A is to be regarded as the set of rational numbers which are less than the real number a, then A ought to have the following property: If x is in A and y is a rational number satisfying y < x, then y is in A. In addition to this property the set A should have a few others. Since there should be some rational number x < a, the set A should not be empty. Likewise, since there should be some rational number x > a, the set A should not be all of Q. Finally, if x < a, then there should be another rational number y with x < y < a, so A should not "the contain a greatest member. If we temporarily regard the real numbers as known, then it is not hard to check (Problem 8-17) that a set A with these properties is indeed the set of rational numbers less than some real number a. Since the real numbers are presently in limbo, your proof, if you supply one, must be regarded only as an unofficial comment on these proceedings. It will serve to convince you, however, that we have not failed to notice any crucial property of the set A. There appears to be no reason for hesitating any longer. 578 29. Constructionof theReal Numbers 579 DEFINITION A real nurnber is a set a, of rational numbers, with the following four properties: is in a and y is a rational number with y (2)a ¢ 0. (1)If x (3)a ¢ Q. (4)There is no < x, then y is also in a. greatest element in a; in other words, if x is in a, then there > x. is some y in a with y The set of all real numbers is denoted by R. Justto remind you of the philosophy behind our definition, here is an explicit example of a real number: a=(xinQ:x<0orx2<2}. It should be clear that a is the real number which will eventually be known as Å, but it is not an entirely trivial exercise to show that a actually is a real number. The whole point of such an exercise is to prove this using only facts about Q; the hard part will be checking condition (4),but this has already appeared as a problem in a previous chapter (finding out which one is up to you). Notice that condition (4),although quite bothersome here, is really essential in order to avoid ambiguity; without it both (xinQ:x<l} and {xinQ:x<1} "real number 1." be candidates for the The shift from A to a in our definition indicates both a conceptual and a notational concern. Henceforth, a real number is, by definition, a set of rational numbers. This means, in particular, that a rational number (a member of Q) is not a real number; instead every rational number x has a natural counterpart which is a real number, namely, {yin Q : y < x}. After completing the construcwould tion of the real numbers, we can mentally throw away the elements of Q and agree that Q will henceforth denote these special sets. For the moment, however, it will be necessary to work at the same time with rational numbers, real numbers (setsof rational numbers) and even sets of real numbers (setsof sets of rational numbers). Some confusion is perhaps inevitable, but proper notation should keep this to a minimum. Rational numbers will be denoted by lower case Roman letters (x, y, z, a, b, c) and real numbers by lower case Greek letters (a, ß, y); capital Roman letters (A, B, C) will be used to denote sets of real numbers. The remainder of this chapter is devoted to the definition of +, and P for R, and a proof that with these structures R is indeed a complete ordered field. We shall actually begin with the definition of P, and even here we shall work and 0 are available, we shall backwards. We first define a < ß; later, when +, define P as the set of all a with 0 < a, and prove the necessary properties for P. •, -, 580 Epikgue The for beginning with the definition of reason < is the simplicity of this concept in our present setup: Definition.If a and ß are real numbers, then « < ß means that a is contained in ß (thatis, every element of a is also an element of ß), but a ¢ ß. defmitions of 5, >, > would be stultifying, but it is interesting to note that 5 can now be expressed more simply than <; if a and ß are real numbers, then œ $ ß if and only if a is contained in ß. If A is a bounded collection of real numbers, it is almost obvious that A should have a least upper bound. Each a in A is a collection of rational numbers; if these rational numbers are all put in one collection ß, then ß is presumably sup A. In the proof of the following theorem we check all the little details which have not been mentioned, not least of which is the assertion that ß is a real number. (We will not bother numbering theorems in this chapter, since they all add up to one complete ordered field.) big Theorem: There is a A repetition TEOREM PROOF of the If A is a set of real numbers least upper bound. Let ß = numbers; and A ¢ 0 and A is bounded above, then A has a x is in some a in A}. Then ß is certainly a collection of rational the proof that ß is a real number requires checking four facts. (x : is in ß and y < x. The first condition means that x is in a for some a in A. Since a is a real number, the assumption y < x implies that y is in œ. Therefore it is certainly true that y is in ß. (2)Since A ¢ 0, there is some a in A. Since a is a real number, there is some x in a. This means that x is in ß, so ß ¢ 0. (3)Since A is bounded above, there is some real number y such that « < y for every a in A. Since y is a real number, there is some rational number x which is not in y. Now a < y means that a is contained in y, so it is also true that x is not in a for any a in A. This means that x is not in ß; so (1)Suppose that x that x is in ß. Then x is in a for some a in A. Since a does not have a greatest member, there is some rational number y with x < y and y in a. But this means that y is in ß; thus ß does not have a greatest member. (4)Suppose These four observations prove that ß is a real number. The proof that ß is the least upper bound of A is easier. If a is in A, then clearly œ is contained in ß; this that œ $ ß, so ß is an upper bound for A. On the other hand, if y is an means upper bound for A, then a 5 y for every a in A; this means that a is contained in y, for every a in A, and this surely implies that ß is contained in y. This, in turn, means that ß $ y; thus ß is the least upper bound of A. The definition of + is both obvious and easy, but is must be complemented with defmition makes any sense at all. a proof that this "obvious" of theReal Numbers 581 29. Construction real Dejînition.If a and ß are a + THEOREM PROOF ß {x: x = = If a and ß are real numbers, then numbers, y + z for some y in a and some z in then « + ß is a ß}. real number. Once again four facts must be verified. (1)Suppose for some x in a + ß. Then x y + z for some y in a and some z in ß, which means that w < y + z, and consequently, w y < z. real number). and is in This shows that w y is in ß (since is Since ß, ß a z w y + (w y), it follows that w is in « + ß. clear that a + ß ¢ 0, since « ¢ Ø and ß ¢ 0. It is (2) Since a ¢ Q and ß ¢ Q, there are rational numbers a and b with a (3) not in a and b not in ß. Any x in a satisfies x < a (forif a < x, then condition (1)for a real number would imply that a is in a); similarly any y in ß satisfies y < b. Thus x + y < a + b for any x in œ and y in ß. This shows that a + b is not in a + ß, so a + ß ¢ Q. (4)If x is in « + ß, then x y + z for y in a and z in p. There are y' in a and z' in ß with y < y' and z < z'; then x < y' + z' and y' + z' is in a + ß. member. Thus a + p has no greatest w < x = - - = - = By now you can see how tiresome this whole procedure is going to be. Every time we mention a new real number, we must prove that it is a real number; this requires checking four conditions, and even when trivial they require concentration. There that it will be less boring if you check the four is really no help for this (except conditions for yourself). Fortunately, however, a few points of interest will arise now and then, and some of our theorems will be easy. In particular, two properties of + present no problems. THEOREM PROOF then If a, ß, and y are real numbers, (a + ß) + y = a + (ß + y). Since (x + y) + z x + (y + z) for all rational numbers x, y, and z, every member (œ+ ß) + y is also a member of a + (ß + y), and vice versa. = of THEOREM PROOF If a and ß are real numbers, then a + Left to you ß = ß+ a. (eveneasier). To prove the other properties of + we first define 0. Defmilion.0 = {xin Q : x < 0} . It is, thank goodness, obvious that 0 is a real number, and the following theorem is also simple. 582 Epilogue THEOREM PROOF If a is a real number, then a + 0 = a. If I is in Œ and y is in 0, then y < 0, so x + y < x. This implies that x + y is in a. Thus every member of a + 0 is also a member of a. On the other hand, if x is in a, then there is a rational number y in a such that Sincex=y+(x-y),whereyisina,andx-y<0(sothatx-yis y>x. in 0), this shows that x is in « + 0. Thus every member of a is also a member of a + 0. The reasonable candidate for would seem to -a {xin Q : be the set is not in a} -x -a). > a, so that x < But in certain in œ means, intuitively, that cases this set will not even be a real number. Although a real number a does not have a greatest member, the set (since --x -x not Q a -- = {xin Q : x is not in a) xo; when a is a real number of this kind, the set is not in have a greatest element It is therefore necessary to {x : which comes equipped introduce a slight modification into the definition of with a theorem. may have a least element a} will -x -xo. -a, D¢înition.If a is a real number, then -a THEOREM PROOF = {xin Q : -x If a is a real number, (1)Suppose that is not in a, but then is not the least element of and y y is in element a}. - is not in a, clear that is not in a. Moreover, it is is not the of since smaller shows element. This that is a Q a, -a < x. Then -y Since -x. > -x -y smallest Q is a real number. -a x is in also it is true that x - -y -x - -a. is some rational (2)Since a ¢ Q, there number y which number in rational that y is not the smallest always be replaced by any y' > y). Then (3)Since « ¢ 0, there is some x in a. Then assume -y -x is in -a. cannot is not in a. We can Q - Thus (sincey can a ¢ 0. -a possibly be in -a, so then is not in a, and there is a rational number y < is in which is also not in a. Let be a rational number with y < Then z z< clearly smallest œ. also and is So the element of is in not not a, z Q z > x, this shows that is in Since does not have a greatest (4)If x -a, -x -x -x. - -z -a. -a --z element. The proof that a + (-a) are not = caused, as you might The difficulties 0 is not entirely straightforward. the finicky details in the definition presume, by 29. Constructionof theReal Numbers 583 Rather, at this point we require the Archimedian property of Q stated on page 574, which does not follow from Pl-P12. This property is needed to prove the following lemma, which plays a crucial role in the next theorem. of LEMMA -œ. Let a be a real number, and z a positive rational number. Then there are (Figure 1) rational numbers x in a, and y not in a, such that y x z. Moreover, we may smallest of the that element is y not assume Q a. - = - PROOF Suppose first that z is in a. If the numbers z, 2z, 3z, ... were all in a, then everyrational number would be in a, since every rational number w satisfies w < nz for some n, by the additional assumption on page 574. This contradicts the fact that a is a real number, so there is some k such that x = kz is in a and y = (k+ 1)z is not in a. Clearly y x = z. Moreover, if y happens to be the smallest element of a, let x' > x be an element of a, and replace x by x', and y by y + (x' x). -- Q - - If z is not in a, there is a similar proof, based on the fact that the numbers cannot all fail to be in a. FIGURE THEOREM If Œ 1 number, is a real then a + PROOF (-n): (-a) 0. = > x. Then is not in a, so Hence Suppose x is in a and y is in member of a + (-a) is in 0. x + y < 0, so x + y is in 0. Thus every other > 0. the direction. If z is in 0, then in is little difficult It a to go more with According to the lemma, there is some x in a, and some y not in a, y not the such smallest element of that y x This equation can be written Q a, and this is in x + (-y) proves that z is in «+ (-a). z. Since x is in a, -y -y -a. -z -z. = - - -y -a, = Before proceeding with multiplication, prove a basic property: Definition.P = {a in R : a > we define the 0} . Notice that a + ß is clearly in P if a and ß are. "positive elements" and 584 Epilogue THEOREM If a is a (i)a (ii)a (iii) -a PROOF real then one and only one of the following conditions holds: number, 0, is in P, is in P. = any positive rational number, then a certainly contains all negative rational numbers, so a contains 0 and a ¢ 0, i.e., a is in P. If a contains no positive rational numbers, then one of two possibilities must hold: If a contains contains all negative rational numbers; then « 0. which negative rational there number not in «; it can be asis is some x (2) sumed that x is not the least element of Q a (sincex could be replaced x); then contains the positive rational number by x/2 > so, as we (1)a = - --x, -a have just proved, is in P. -a 0, it is clearly impossible This shows that at least one of (i)-(iii) must hold. If a > 0 for condition (ii)or (iii)to hold. Moreover, it is impossible that a > 0 and both hold, since this would imply that 0 a + (-a) > 0. = -a = Recall that « > ß was defined to mean that a contains ß, but is unequal to ß. This definition was fine for proving completeness, but now we have to show that it is equivalent to the definition which would be made in terms of P. Thus, we must show that « ß > 0 is equivalent to a > ß. This is clearly a consequence of - the next theorem. THEOREM PROOF If a, ß, and y are real numbers and œ > ß, then + y « > ß+ y. The hypothesis a > ß implies that ß is contained in a; it followsimmediately from the definition of + that µ+y is contained in a+y. This shows that a+y >ß+y. We can easily rule out the possibility of equality, for if ß+y, a+y= then a which is false. Thus = « ( +Y) + (-Y) + y > ß‡ = (ß +y) +(-Y) Definition.If a and ß are real numbers THEOREM •ß = {z: z 5 0 or z If a and ß are real numbers = 0, y. Multiplication presents difficulties of its own. defmed as follows. a = and œ, If a, ß > • ß can ß > 0, then x y for some x in a and y in · with a, 0, then a ß > 0, then a · ß is a ß with x, y real number. > 0}. be of theReal.Numbers585 29. Construction PROOF As usual, we must check four conditions. z, where z is in « ß. If wy 0, then w is automatically in a•ß. Suppose that w > 0. Then z > 0, so z x y for some positive x in « and positive y in ß. Now (1)Suppose w < • = wz w------ wxy z w --x z · -y. z 1, so (w/z) x is in a. Thus w is in a ß. Since O < w < z, we have w/z Clearly Ø. a ß¢ (2) (3)If x is not in a, and y is not in ß, then x > x' for all x' in a, and y > y' for all y' in ß. Hence xy > x'y' for all such positive x' and y'. So xy is not in a ß; thus a ß ¢ Q. (4)Suppose w is in a•ß, and w 5 0. There is some x in a with x > 0 and and xy is in «•ß some y in ß with y > 0. Then z z > w. Now suppose and positive > 0. Then for in some positive y in ß. x some a w xy w x' x'y, > then contains Moreover, a some w, and z is x; if z z > xy Thus a ß does not have a greatest element. in a < . - • • · = = = = •ß. . Notice that a•ß of all properties of is clearly in P if a and ß are. This completes the verification P. To complete the definition of we first define ja|. · then Defmition.If a is a real number, II -a, Dginition. If a and ß are real numbers, then ifa=0orß=0 ifa>0,ß>0ora<0,ß<0 ifa>0,ß<0ora<0,ß>0. 0, a.ß= if a 0 if a 5 0. œ, = ja|•|ßl, -(|al•|ßl), As one might suspect, the proofs of the properties of multiplication usually involve THEOREM PROOF THEOREM reduction to the case of positive numbers. If a, ß, and y are real numbers, This is clear if a, ß, y > then a • (ß • y) = (a ß) . . y. 0. The proof for the general case requires separate cases (andis simplified If a and ß are real numbers, considering slightly if one uses the following theorem). then a.ß = ß•a. 586 Epilogue PROOF This is clear if a, ß 0, and the other cases are easily checked. > D¢inition. 1 = {xin Q : x < 1} (It is clear that 1 is a real number.) . THEOREM PROOF If a is a real number, then a · 1 = a. It is easy to see that every member of «•1 is also a member of a. On the other hand, suppose x is in a. If xs 0, then x is automatically in «-1. If x > 0, then there is some rational number y in a such that x < y. Then x y (x/y), and x/y is in 1, so x is in a 1. This proves that « 1 a if a > 0. If a < 0, then, applying the result just proved, we have Let «>0. = -(|al•|1|) «•1 = - - - -(la!) = = = Finally, the theorem is obvious when « = a. 0. Dginition.If a is a real number and a > 0, then {xin Q : xs 0, or x > 0 and 1/x is not in a, but 1/x is not the smallest a-1 = member if a THEOREM PROOF < of Q - a}; -1 0, then a-1 _ If a is a real number _ unequal Clearly it suffices to consider to 0, then aonly œ > is a real number. 0. Four conditions must be checked. is in «-1. If y 5 0, then y is in «-1. If y > 0, then x > 0, so 1/x is not in a. Since 1/y > 1/x, it follows that 1/y is not in a, and 1/y is clearly not the smallest element of Q a, so y is in a-1 (2)Clearly a-1 (3)Since a >0, there is some positive rational number x in a. Then 1/x is not in a-1, so a-1 (4)Suppose x is in a-1. If xs 0, there is clearly some y in a-1 with y > x because a-1 contains some positive rationals. If x > 0, then 1/x is not in a. Since 1/x is not the smallest member of Q a, there is a rational number y not in a, with y < 1/x. Choose a rational number z with y < z < 1/x. Then 1/z is in a, and 1/z > x. Thus œ-1 does not contain a largest (1)Suppose y < x, and x - - member. In order to prove that a-1 is really the multiplicative inverse of a, it helps to have another lemma, which is the multiplicative analogue of our first lemma. LEMMA with a > 0, and z a rational number with z > 1. Then there are rational numbers x in a, and y not in a, such that y/x z. Moreover, that the element of is least we can assume y not Q a. Let a be a real number = - of theReal Numbers 587 29. Construction PROOF Suppose first that z is in a. Since z z" (1 + (z = 1 > 0 and - 1))" - > 1 + n(z 1), - it follows that the numbers Z, Z2 3 z* is in a, and y zk+1iS HOt cannot all be in a. So there is some k such that x in a. Clearly y/x z. Moreover, if y happens to be the least element of Q a, let x' > x be an element of a, and replace x by x' and y by yx'/x. If z is not in a, there is a similar proof, based on the fact that the numbers 1/zk cannot all fail to be in a. = = = THEOREM PROOF If a is a real number - and a ¢ 0, then · a a-1 y suffices to consider only a > 0, in which case a-1 > 0. Suppose that x is a positive rational number in a, and y is a positive rational number in a-1 Then 1/y is not in a, so 1/y > x; consequently xy < 1, which means that xy is in 1. Since all rational numbers xs 0 are also in 1, this shows that every member of a is in 1. It obviously •«-1 To prove the converse assertion, let z be in 1. If zy 0, then clearly z is in a-1. Suppose O < « z < 1. According to the lemma, there are positive rational numbers x in a, and y not in a, such that y/x = 1/z; and we can assume that y · is not the smallest element of Q a. But this means that z is in a, and 1/y is in a-1. Consequently, z is in «•«-1 - = x - (1/y), where x We are almost done! Only the proof of the distributive law remains. Once again we must consider many cases, but do not despair. The case when all numbers are positive contains an interesting point, and the other cases can all be taken care of very THEOREM PROOF neatly. If a, ß, and y are real numbers, then a • (ß + y) = a • ß+ a • y. ASSume first that a, ß, y > 0. Then both numbers in the equation contain all 5 0. A positive rational number in a (ß + y) is of the form x (y + z) for positive x in a, y in ß, and z in y. Since x (y + z) = x y + x z, where x y is a positive element of a and x z is a positive element of a y, this number is also in a ß + a y. Thus, every element of a (ß + y) is also in rational numbers · - · · · •ß, - - • - • . a•ß+g•y. On the other hand, a positive rational number in a ß + a y is of the form x1 y + X2 z for positive x1. x2 in a, y in ß, and z in y. If xi 5 x2, then (x1/x2) y 5 y, so (xi/x2) y is in ß. Thus . • - · · x1 y+x2 z=x2[(xt/x2)y+z] is in «•(ß + y). Of course, the same trick works if x2 5 xiTo complete the proof it is necessary to consider the cases when a, ß, and y are not all > 0. If any one of the three equals 0, the proof is easy and the cases 588 Epilogue involving a < 0 can be derived immediately once all the possibilities for ß and y have been accounted for. Thus we assume «>0 and consider three cases: ß, y <0, and ß < 0, y > 0, and ß > 0, y < 0. The first follows immediately from the case already proved, and the third follows from the second by interchanging ß and y. Therefore we concentrate on the case ß<0, y>0. There are then two possibilities: (1)ß + y 0. Then a•y = a•([ß+y] + |ßl) = a•(ß +y) +a· |ßl, so «•(ß+y)=-(a•|ß|)+a.y = (2)ß + y < a•ß+a•y. 0. Then «·|ßl=a-(|ß+y|+y)=a-|ß+yl+a•y, so «•(ß+y)=-(«•|ß+yl)=-(«•|ßl)+a.y=a.ß+a.y. This proof completes the work of the chapter. Although long and frequently tedious, this chapter contains results sufficiently important to be read in detail at least once (andpreferably not more than once!). For the first time we know that we is indeed a complete ordered field, the have not been operating in a vacuum-there theorems of this book are not based on assumptions which can never be realized. One interesting and horrid possibility remains: there may be several complete rich ordered fields. If this is true, then the theorems of calculus are unexpectedly in content, but the properties P1-P13 are disappointingly incomplete. The last chapter disposes of this possibility; properties P1-Pl3 completely characterize the real numbers-anything that can be proved about real numbers can be proved on the basis of these properties alone. PROBLEMS There are only two problems in this set, but each asks for an entirely different construction of the real numbers! The detailed examination of another construction is recommended only for masochists, but the main idea behind these other constructions is worth knowing. The real numbers constructed in this chapter might be called algebraist's real numbers," since they were purposely defined the least upper bound property, which involves the ordering <, so as to guarantee an algebraic notion. The real number system constructed in the next problem might be called analyst's real numbers," since they are devised so that Cauchy will always sequences converge. "the "the 1. Since every real number of rational sequence numbers, of rational ought to be the limit of some Cauchy sequence we might try to definea real number to be a Cauchy numbers. Since two Cauchy sequences might converge to the same real number, however, this proposal requires some modifications. of'the RealNumbers 589 29. Construction (a) Define two Cauchy sequences of rational numbers {a.}and {b,}to be equivalent (denoted by {a,} {b,})if lim (a, b.) 0. Prove that ~ = -- E-¥ØQ {a.} (a,}, that {b.) {a,}if {a.} {a.} {b.}and (b.)~ {c.}. ~ ~ {b,},and ~ that {a.} {c.}if ~ ~ a is the set of all sequences equivalent to {a.), and ß is the equivalent 0 or to {b.). Prove that either a n ß then there is some (c,}in both a and ß. Show that a ß. (If anß ¢ 0, in this case a and ß both consist precisely of those sequences equivalent (b) Suppose that set of all sequences = = to {c.).) Part (b)shows that the collection of all Cauchy sequences can be split up into disjoint sets, each set consisting of all sequences equivalent to some fixed sequence. We define a real number to be such a collection, and denote the set of all real numbers by R. (c) If a and ß are real numbers, let {a.) be a sequence in a, and {b,}a sequence in ß. Define «+ ß to be the collection of all sequences equivalent to the sequence {a.+b.}. Show that {a.+b.) is a Cauchy sequence and also show that this definition does not depend on the particular sequences {a,}and (b,}chosen for a and ß. Check also that the analogous definition of multiplication is well defined. (d) Show that R is a field with these operations; existence of a multiplicative inverse is the only interesting point to check. (e) Define the positive real numbers P so that R will be an ordered field. (f) Prove that every Cauchy sequence of real numbers converges. Remember that if {an)is a sequence of real numbers, then each as is itself a collection of Cauchy sequences of rational numbers. 2. "the high-school student's real numThis problem outlines a construction of bers." We define a real number to be a pair (a, {b.)),where a is an integer and {b,}is a sequence of natural numbers from 0 to 9, with the proviso that the sequence is not eventually 9; intuitively, this pair represents b,,10¯". With this definition, a real number a + is a very concrete ob- the difficulties involved in defining addition and multiplication are .=1 ject, but formidable (howdo you add infinite decimals without worrying about carrying digits infinitely far out?). A reasonable approach is outlined below; the trick is to use least upper bounds right from the start. (a) Define (a,{b.))< (c,{d,})if a d, but b¡ d¡ for 1 upper bound property. b, < = < j < < if a c and for some n we have n. Using this definition, prove the least = c, or k (b) Given rational œ = (a, (b.)), define Gk = b.10 a + "; intuitively, Œk is the n=1 number obtained by changing all decimal places after the kth 590 Epilogue to 0. Conversely, given a rational number r of the form a + L b.10¯", n=1 let r' denote the real number (a, {bs')),where b,' and b,' 0 for n > k. Now for a (a, {b,})and ß = = = a + ß = sup{(ak Ÿ Åk)':k a natural = b. for 1 5 n 5 k (c,(ds})define number} If multiplication is defined verification of all conditions for a field is a straightthen the Once more, however, existence forward task, not highly recommended. of multiplicative inverses will be the hardest. (theleast similarly, upper bound exists by part (a)). UNIQUENESS OF THE REAL NUMBERS CHAPTER We shall now revert to the usual notation for real numbers, reserving boldface symbols for other fields which may turn up. Moreover, we will regard integers and rational numbers as special kinds of real numbers, and forget about the specific chapter which real numbers this were defined. In we are interested in only way in one question: are there any complete ordered fields other than R? The answer For example, the field F3 introduced in to this question, if taken literally, is ordered and complete field, it is certainly not R. This field is a Chapter 28 is a example because the pair (a, a) can be regarded as just another name for "yes." "silly" the real a; the operations number (a, a) + (b,b) (a, a) (b, b) - = = (a + b, a (a b, · a - + b), b), This sort of example shows that any intelligent are consistent with this renaming. consideration of the question requires some mathematical means of discussing such renaming procedures. If the elements of a field F are going to be used to rename elements of R, then for each a in R there should correspond a f(a) in F. The notation f(a) suggests that renaming can be formulated in terms of functions. In order to do this we will need a concept of function much more general than any which has occurred until now; in fact, we will require the most general notion of used in mathematics. A function, in this general sense, is simply a rule which assigns to some things, other things. To be formal, a function is a collection of ordered pairs (ofobjects of any sort) which does not contain two distinct pairs with the same first element. The domain of a function f is the set A of all objects a such that (a, b) is in f for some b; this b is denoted by f(a). If f(a) is (unique) in the set B for all a in A, then f is called a function from A to B. For example, "name" "function" if f (x) sia x for all x in R (andf is defined only for x in R), then f is a function from R to R; it is also a function from R to [-1, 1]; = if f (z) sin = z for all z in C, then f is a function from C to C; et for all z in C, then f is a function from C to function from C to {zin C : z ¢ 0}; if f (z) = 9 is a function fmm {zin C : z ¢ 0} to {xin R : 0 yx < C; it is also a 2x}; if f is the collection of all pairs (a, (a, a)) for a in R, then f is a function from R to F3· 591 592 Epilo.gue Suppose that Fi and F2 are two fields; we will denote the operations in Fi by and the operations in F2 by +, etc. If F2 is going to be considered , O, etc., collection of elements of F1, then there should be a function for new names as a with the properties: following from F1 to F2 •, (1)The function f should be one-one, that is, if x ¢ y, then we should have f (x) ¢ f (y); this means that no two elements of F1 have the same name. that is, for every element z in F2 there (2)The function f should be should be some x in Fi such that z f (x); this means that every element "onto," = is used to name some element of Ft. (3)For all x and y in Fi we should have of F2 f(x O y)= f(x)+ f(y), f(xey)= f(x)· f(y); this means that the renaming the field. procedure is consistent with the operations If we are also considering F1 and F2 as ordered quirement: (4)If x © y, then of fields, we add one more re- f (x) < f (y). A function with these properties is called an isomorphismfrom Fi to F2. This definition is so important that we restate it formally. DEFINITION If Fi and F2 are two fields, an isomorphism from F1 to F2 with the following properties: from F1 to F2 is a function f y, then f (x) ¢ f (y). (2)If z is in F2, then z f (x) for some x in F1(3)If x and y are in F1, then (1)If x ¢ = f(x e y)= f(x)+ f(y), f(xey)= f(x)•f(y). If F1 and F2 are ordered fields we also require: (4)If x@y, then f(x)< f(y). The fields F1 and F2 are called isomorphic ifthere is an isomorphism between them. Isomorphic fields may be regarded as essentially the same-any important of will other. automatically hold for the Therefore, we can, and property one should, reformulate the question asked at the beginning of the chapter; if F is a complete ordered field it is silly to expect F to equal R-rather, we would like to know if F is isomorphic to R. In the following theorem, F will be a field, with operations and elements" P; we write a < b to mean that b a + and and is in P, so forth. "positive •, - 30. Uniquenessof theReal Numbers593 THEOREla PROOF If F is a complete ordered field, then F is isomorphic to R. Since two fields are defined to be isomorphic if there is an isomorphism between them, we must actually construct a function f from R to F which is an isomorphism. We begin by defining f on the integers as follows: f (0) 0, = forn>0, f(n)=1+...+1 n times f (n) = (1 + - . . . for n + 1) < 0. |n| times It is easy to check that f(m+n)= f(m)+ f(n), f(m-n)= f(m)•f(n), for all integers m and n, and it is convenient define f on the rational numbers by f (m/n) = m|n = to denote f (n) by We then n. m•n-1 + 1 ¢ 0 if n > 0, since F is an ordered field). nk, so m•l k•n, so k/l, then ml This definition makes sense because if m/n m•n-1 k•l-1. It is easy to check that that (notice the n-fold sum 1 + - - - = = = = f (ri + f(ri r2) = -r2)= f (r1) + f (r2) , f(ri). f(r2), for all rational numbers ri and r2, and that f (ri) < f (r2) if ri < r2. The definition of f (x) for arbitrary x is based on the now familiar idea that any real number is determined by the rational numbers less than it. For any x in R, let Ax be the subset of F consisting of all f(r), for all rational numbers r < x. The set A, is certainly not empty, and it is also bounded above, for if ro is a rational number with ro > x, then f (ro) > f (r) for all f (r) in A,. Since F is a complete ordered field, the set Ax has a least upper bound; we define f(x) as sup Ax. We now have f (x) defined in two different ways, first for rational x, and then for any x. Before proceeding further, it is necessary to show that these two definitions words, if x is a rational number, we want to show agree for rational x. In other that sup A, = f (x), for x m/n. This is not automatic, required. on the completeness of F; a slight digression is thus Since F is complete, the elements where f (x) here denotes m|n, 1+...+1 n times = fornaturalnumbersn but depends 594 Epilqgue form a set which is not bounded above; the proof is exactly the same as the proof for R (Theorem 8-2). The consequences of this fact for R have exact analogues in F: in particular, if a and b are elements of F with a <b, then there is a rational number r such that f(r)<b. a< Having made this observation, we return to the proof that the two definitions f (x) agree for rational x. If y is a rational number with y < x, then we have already seen that f (y)< f (x). Thus every element of Ax is < f (x). Consequently, of supAx 5 On the other f(x). hand, suppose that we had sup A, < Then there would be a rational number sup A, < f (x). r such that f (r) < f (x). But the condition f(r) < f(x) means that r < x, which means that f(r) is in the set A,; this clearly contradicts the condition sup A, < f (r). This shows that the original assumption is false, so sup A, = f(x). We thus have a certain well-defined function f from R to F. In order to show of the definition. We f is an isomorphism we must verify conditions (1)--(4) will begin with (4). If x and y are real numbers with x < y, then clearly A, is contained in A,. Thus $supA, f(x)=supA, f(y). that = To rule out the possibility of equality, notice that there are rational numbers r and s with x<r<s<y. We know that f (r) < f (s). It follows that f (x) 5 f (r) < f (s) $ f (y). This poves (4). Condition (1)follows immediately from (4):If x ¢ y, then either x < y or < y x; in the first case f (x) < f (y), and in the second case f (y) < f (x); in either case f(x) ¢ f(y). be an element of F, and let B be the set of all rational f (r) < a. The set B is not empty, and it is also bounded above, because there is a rational number s with f (s) > a, so that f (s)> f (r) for r in B, which implies that s > r. Let x be the least upper bound of B; we claim that f (x) = a. In order to prove this it suffices to eliminate the alternatives To prove numbers (2),let a r with f(x) <a, a < f (x). 30. Uniquenessof theReal Nurnbers595 In the first case there would be a rational number r with f(x)< f(r)<a. But this means that x < r and that r is in B, which contradicts the fact that number r with x sup B. In the second case there would be a rational = f(r)< f(x). a< This implies that Hence x. Since x r < sup B, this means = that r < s for some s in B. f(r)< f(s)<a, again a contradiction. Thus a, proving and real numbers let be x y (3), either Then f (x) + f (y). f(x) = To check f (x + y) f (x) + f (y) < y) f (x + < and suppose that f (x) + f (y) < f (x + or In the first case there would be a rational (2). number f(x + y) ¢ y). r such that f (r) < f (x) + f (y). But this would mean that x+y<r. Therefore r could be written as the sum of two rational = r where x r1 + r2, Then, using the facts checked about f (r) = f (r1+ r2) = < numbers ri and y < r2. f for rational numbers, it would f (r1)+ f (r2)> f (X) + f (7), follow that The other case is handled similarly. Finally, if x and y are positive real numbers, the same sort of reasoning a contradiction. shows that f(x-y)= f(x). f(y); the general case is then a simple consequence. This theorem brings to an end our investigation of the real numbers, and resolves any doubts about them: There is a complete ordered field and, up to isomorphism, educaonly one complete ordered field. It is an important part of a mathematical real numbers construction of tion to follow a the in detail, but it is not necessary refer construction. again this particular is utterly irrelevant that a real It to to ever numbers, number happens to be a collection of rational and such a fact should never enter the proof of any important theorem about the real numbers. Reasonable proofs should use only the fact that the real numbers are a complete ordered field, because this property of the real numbers characterizes them up to isomorphism, and any significant mathematical property of the real numbers will be true for all isomorphic fields. To be candid I should admit that this last assertion is just a prejudice of the author, but it is one shared by almost all other mathematicians. 596 Epilogue PROBLEMS 1. Let f be an isomorphism from Fi to F2- I. (Here 0 and I on the left denote 0 and f(1) elements in Fi, while 0 and 1 on the right denote elements of F2.) f(a) and f(a-1) f(a)-1, for a ¢ 0. (b) Show that f(-a) (a) Show that f(0) = = 2. - - = Here is an opportunity to convince yourself that any significant property of shared field is by a any field isomorphic to it. The point of this problem is write to out very formal proofs until you are certain that all statements of this sort are obvious. F1 and F2 will be two fields which are isomorphic; for simplicity we will denote the operations in both by + and Show that: .. (a) If the equation x2 + 1 = 0 has a solution in Fi, then it has a solution in F2. 0 with + ao (b) If every polynomial equation x" + a._i x"¯I + then equation F1, in F1, has a root in every polynomial ao,...,a,_i - = --- x"+b._i•x"¯I+--·+bo=0withbo,...,b,_iinF2hasarootinF2. (c) If + 1 (summed m times) 0 in Fi, then the same is true in F2. fields (andthe isomorphism f satisfies f (x) < (d) If f (y) for x < y) and F1 is complete, then F2 is complete. 3. 1+ - - = - F1 and F2 are ordered f be an isomorphism from Fi to F2 and g an isomorphism from F2 g(f(x)). from F1 to F3 by (go f)(x) F3. Define the function gof to that Show gof is an isomorphism. Let = 4. Suppose that F is a complete ordered field, so that there is an isomorphism f from R to F. Show that there is actually only one isomorphism from R R, this is Problem 3-17. Now if f and g are two to F. Hint: In case F isomorphisms from R to F consider g¯1 = 5. Find an isomorphism from C to C other than the identity function. SUGGESTED READING A aught a read þrt as inttination leads Em; for what he reads as a task will do him little good. inan SAMITEL JOHNSON One purpose of this bibliography is to guide the reader to other sources, but the most important function it can serve is to indicate the variety of mathematical reading available. Consequently, there is an attempt to achieve diversity, but no pretense of being complete. The present plethora of mathematics books would make such an undertaking almost hopeless in any case, and since I have tried to independent reading, the more standard a text, the less likely it is to encourage here. In some cases, this philosophy may seem to have been carried to appear extremes, as some entries in the list cannot be read by a student just finishing a first course of calculus until several years have elapsed. Nevertheless, there are which can be read now, and I can't believe that it hurts to have many selections of what lies ahead. some idea Many of these books have gone through numerous editions and printings, which will be reflected in more recent publication dates. Many of the books with older publications dates are out of print, though that generally doesn't apply to books from the redoubtable Dover Publications, or from the Mathematical Association of America. Those that are no longer in print can still often be found in well-stocked academic libraries. One of the most elementary unproved theorems mentioned in this book is the fact that every natural number can be written as a product of primes in only one way. A proof of this basic theorem will be found near the beginning of almost number theory. Few books have won so enthusiastic an any book on elementary audience as [1] An Introductionto the Theoryof Numbers(fifth edition), by G. H. Hardy and E. M. Wright; Oxford University Press, 1980. The Pergamon Press published a series, Popular Lectures in Mathematics, with several titles worth investigating, among them [2] A Selectronof Problemsin the Theoryof Numbers,by W. Sierpinski; Macmillan (Pergamon), 1964. Finally, I will mention [3] an intriguing little book, now out of print I fear, ThreePearlsofNumberTheory,by A. Khinchin; Graylock Press, 1952. The subject of irrational numbers straddles the fields of number analysis. An excellent introduction will be found in [4] theory and IrratzonalNumbers, by I. M. Niven; Mathematical Association of America, 1956. Together with many historical notes, there are references to some fairly elementary articles in journals. There is also a proof that yr is transcendental (seealso [51]) and, finally, a proofof the "Gelfond-Schneider theorem": If a and b are algebraic, with a ¢ 0 or 1, and b is irrational, then ab is transcendental. All the books listed so far begin with natural numbers, but whenever necessary take for granted the irrational numbers, not to mention the integers and rational 599 Reading 600 Suggested numbers. Several books present a construction of the rational numbers from the natural numbers, but one of the most lucid treatments is still to be found in [5] edition), Foundations ofAnalysis(second by E. Landau; Chelsea, 1960. Incidentally, the original German edition, [6] edition), by E. Landau; Chelsea, 1965. GrundlagenderAnalysis(fourth has been printed in paper back, together with a complete German-English dictionary (ofabout 300 words) for the whole book-an excellent way to begin reading mathematical German. The basic idea for constructing the real numbers is derived from Dedekind, whose contributions can be found in [7] Essayson the Theoy ofNumbers,by R. Dedekind; Dover, 1963. While many mathematicians are content to accept the natural numbers as a natural starting point, numbers can be defined in terms of sets, the most basic starting point of all. A charming exposition of set theory can be found in a sophisticated little book called [8] NaiveSet Theoy, by P. R. Halmos; Springer-Verlag, 1991. Another very good introduction is [9] Theoy of Sets,by E. Kamke; Dover, 1950. "new math" that set theory Perhaps it is necessary to assure some victims of the mathematical content (infact, some very deep theorems). Using does have some these deep results, Kamke proves that there is a discontinuous function f such that f (x + y) f (x) + f (y) for all x and y. For those who enjoy reading the classics, the most important notions of set theory were first introduced by Cantor, = whose work is reproduced in [10]Contributionsto theFoundingof the Theoy of TransßniteNumbers,by G. Cantor; Dover, 1952. Inequalities, which were treated as an elementary topic in Chapters 1 and 2, actually form a specialized field. A good elementary introduction is provided by [11]AnalyticInequalities,by N. Kazarinoff; Mathematical Association of Amer- ica, 1961. Twelve different proofs that the geometric mean is less than or equal to the aritheach based on a different principle, can be found in the beginning of the more advanced book metic mean, [12]AnIntroductiontoInequalitùs,by E. Beckenbach and ical Association of R. Bellman; Mathemat- America, 1961. The classic work on inequalities is edition), [13]Inequalities(second by G. H. Hardy, Cambridge University Press, 1988. J. E. Littlewood, and G. Polya; Suggested Reading 601 Each of the authors of this triple collaboration has provided his own contribution of mathematical about the thinking, written from a nature to the sparse literature mathematician's point of view. My favorite is [14]A Mathematician'sApology,by G. H. Hardy; Cambridge University Press, 1992. Littlewood's anecdotal selections are entitled [15]A Mathematician'sMiscellany,by J. E. Littlewood; Polya's contribution [16] Methuen, 1953. is pedagogy at the highest level: Mathematicsand PlausibleReasoning(Vol.I: Inductionand AnalogyinMathematics; Vol. II: PatternsofPlausible Inference),by G. Polya; Princeton University Press, 1990. Geometry is the other main field which can be considered as background for work, but should Euclid's Elements is still a masterful mathematical perhaps be postponed until some preparation has been made, with a modern work geometry," like on calculus. "classical [17]ElementaryGeometry froman AdvancedStandpoint(secondedition), by E. Moise; Addison-Wesley, 1974. This beautiful book provides excellent historical perspectives and contains a thorough discussionof the role of the "Archimedean axiom" in geometry; in addition, Chapter 28 describes an ordered field in which the Archimedean axiom does not hold. Speaking of beautiful geometry books, all sorts of fascinating things can be found in edition), [18]Introductionto Geometry(second by H. S. Coxeter; Wiley, 1989. Almost all treatments of geometry at least mention convexity, which forms another specialized topic. I cannot imagine a better introduction to convexity, or a experience in general, than reading and working through better mathematical [19]ConvexFaures, by I. M. Yaglom and W. G. Boltyanskii; Holt, Rinehart and Winston, 1961. This book contains a carefully arranged sequence of definitions and statements of theorems, whose proofs are to be supplied by the reader (worked-out proofs are supplied in the back of the book). Another geometry book has been modeled on the same principle: [20] CombinatorialGeometryin the Plane, by H. Hadwiger and H. Debrunner; Holt, Rinehart and Winston, 1964. Along with these two out-of-the-ordinary books, I might mention valuable little book, also of a specialized sort, in Analysis,by B. [21] Counterexamples 1964. Gelbaum and J. Olmsted; an extremely Holden-Day, 602 Suggested Reading Many of the example in this book come from more advanced topics in analysis, but quite a few can be appreciated by someone who knows calculus. Of calculus books I will mention only two, each something of a classic: [22]A Courseof PureMathematics(tenthedition), by G. H. Hardy; Cambridge University Press, 1952. [23]Digerentialand IntegralCalculus(twovolumes), by R. Courant; Wiley (Interscience), 1988. Courant is especially strong on applications to physics. Speaking of such applications, an elegant exposition of the material in Chapter 17, together with much further discussion, can be found in the article [24] On the Geometryof the Kepler19oblem,by JohnMilnor; matical Monthly,Volume 90 in TheAmericanMathe- pp. 353-365. (1983), (In this paper the curve c' of Chapter 17 is denoted by v, and the derivative of the important composition vo 0¯I (page331) is introduced quite off handedly as dv/dB.) A derivation of Kepler's laws, together with numerous references, can be found in another article in this same journal, "straight-forward" [25] The MathematicalRelationshipBetweenKepler'sLaws and Newton'sLaws, by Andrew T. Hyman; in The AmericanMathematical Monthly, Volume 100 (1993),pp. 932-936. The latter parts of Volume I of Courant contain material usually found in adcalculus, including differential equations and Fourier series. An introduction to Fourier series (requiring a little advanced calculus) will also be found in vanced [26]An Introductionto FourierSenesand Integrals,by R. Seeley; W. A. Renjamin, 1966. calculus in earnest) contains additional The second volume of Courant (advanced equations, well as as an introduction to the calculus ofvarion difFerential ations. A widely admired, though somewhat more advanced, book on differential equations is material [27]Lectureson OrdinaryI@iirendalEquadons, by W. Hurewicz; Dover, 1990. I will bypass the more or less standard advanced calculus books (which can easily be found by the reader) since nowadays there is a movement to revise the whole presentation of advanced calculus, basing it upon linear algebra. One of the first, and still one of the nicest, treatments of advanced calculus using linear algebra is [28] Calculusof VectorFunctions,by R. H. Crowell and R. E. Williamson; Prentice- Hall, 1962. Several recent books on advanced calculus attempt to acquaint undergraduates with very large areas of modern mathematics. My favorite, of course, is [29] Calculuson Man¶olds,by M. Spivak; W. A. Benjamin, 1965. Suggested Reading 603 There are three other topics which are somewhat out of place in this bibliography because they are rapidly becoming established as part of a standard underThe purposeful study of fields and related systems is the graduate curriculum. domain of "algebra." One of the favorite texts is edition), [30]TopicsinAlgebra(second by I. N. Herstein; Wiley, 1975. A more advanced book is the great classic: [31]Algebra,by B. L. van der Waerden; Springer-Verlag, 1990. By the way, this book contains a proof of the partial fraction decomposition of a function. There are now several introductions to complex analysis, as well as many elementary books on topology. Although the latter subject has not been mentioned before, it has really been in the background of many discussions, since it is the natural generalization of the ideas about limits and continuity which play such a prominent role in Part II of this book. The next few topics, ranging from elementary to very difficult, are included in this bibliography because they have been alluded to in the text. The proof that a nondecreasing function is differentiable at almost all points (andan explanation of just what this means) receives a beautiful exposition in rational [32]FunctionalAnalysis,by E Riesz and B. Sz.-Nagy; Ungar, 1955. (After this elementary beginning, the book moves on to quite advanced material.) The gamma function has an elegant little book devoted entirely to its properties, most of them proved by using the theorem of Bohr and Mollerup which was mentioned in Problem 19-39: [33]The GammaFunction, by E. Artin; Holt, Rinehart and Winston, 1964. The gamma function is only one of several important improper integrals in mathe¯22 ematics. In particular, the calculation of dx (seeProblem 19-41) is imporfe theory, where the distribution in probability function" tant "normal (x) 1 = 2 e¯ dy plays a fundamental role. A classic book on pobability [34]An Introductionto 14obabilig Theory and W. Feller; Wiley, 1968. theory is lšs Applications(thirdedition), by The impossibility of integrating certain functions in elementary terms (among them f (x) = e¯=') is one of the most esoteric subjects in mathematics. An interesting discussion of the possibilities of integrating in elementary terms, with an 604 Suggested Reading outline of the impossibility proofs, and references ville, will be found in [35] The to the original papers of Liou- Integration of Functions of a Single Variable (secondedition), by G. H. Hardy; Cambridge University Press, 1958. A complete presentation of the impossibility proofs will be found in [36]Integrationin Finite Terms,by J. Ritt; Columbia University Press, 1948. Oddly enough, a related but seemingly more difficult problem has a much neater There are simple differentialequations (y"+ xy 0 is a specific examwhose solutions of ple) indefinite integrals of cannot be expressed even in terms elementary functions. This fact is proved on page 43 of the (60-page) book: solution. = [37]An Introductionto Ihþrential Algebra,by L Kaplansky; Hermann, 1957. To read this book you will need to know quite a bit of algebra, however. A few words should also be said in defense of the process of integrating in elementary terms, which many mathematicians look upon as an art (unlike differentiation, which is merely a skill). You are probably already aware that the process of integration can be expedited by tables of indefinite integrals. For those who enjoy pursuing tables there is a really beautiful collection, that includes indefinite integrals, definiteimproper integrals, and a great deal more besides(ifyou should ever happen to need the value of the thirty-fourth Bernoulli number, this is the place to look): [38]Tablesof Integrals,Senes,and Pwducts,by I. S. Gradschteyn et al.; Academic Press, 1980. For the thrifty, there is a paperback table of integrals: [39] Tablesof IndefiniteIntegrals,by G. Petit Bois; Dover, 1961. The remaining references are of a somewhat different sort. They fall into three of which the first is historical. The letter of H. A. Schwarz referred to in Problem 11-65 will be found in categories, [40] Waysof Thoughtof GreatMathematicians, by H. Meschkowski; Holden-Day, 1964. Some historical remarks, and an attempt will be found in to incorporate them into the teaching of calculus, [41] The Calculus:A GeneticApproach,by O. Toeplitz; University of Chicago Press, 1981. An admirable textbook on the history of mathematics is [42]An Introductionto the HistoryofMathematics(sixthedition), ders College Publishing, 1990. by H. Eves; Saun- Suggested Reading 605 Three good scholarly works are [43]History ofAnalyticGeometry,by C. Boyer; Scholar's Bookshelf, 1988. [44]A Historyof the Calculus,and Rs ConceptualDevelopment,by C. Boyer; Dover, 1959. [45] TheMathematicsof GreatAmateurs,by J. Coolidge; Oxford University Press, 1990 and extracts from original sources will be found in [46]A SourceBookin Mathematics(2vols.), by D. Smith; Dover, 1959. Despite the impression that might be given by the large number of books listed here, it is often hard to find specific concrete information about the origins of calculus. For example, it is almost impossible to find out who first proved the Mean Wissenschaften, VolValue Theorem (according to the EncyklopädùderMathematischen whose students of differential geomname is familiar to ume II, it was O. Bonnet, etry from the "Gauss-Bonnet Theorem"). Similarly, though many history books "complicated method of interpolatell us that Wallis proved Wallis' formula by a mention what though tion," most never bother to it inspired Euler's it was, even investigations of the gamma function (a description is given in the answer book, along with the solution to Problem 19-40). The second category in this final group of books might be described as ularizations." There are a surprisingly large number of first-rate ones by real "pop- mathematicians: [47] What is Mathematics? (fourthedition), by R. Courant and H. Robbins; Oxford University Press, 1979. [48] Geometryand the Imagination,by D. Hilbert and S. Cohn-Vossen; Chelsea, 1952. [49] The EnjoymentofMathematics,by H. Rademacher and O. Toeplitz; Dover, 1990. edition), by H. Tietze; Graylock [50]FamousProblemsof Mathematics(second Press, 1965. One of the most renowned "popularizations" is especially concerned with the teaching of mathematics: [51]ElementaryMathematicsfroman AdvancedStan point,by E Klein (vol.1: Arithmetic, Algebra,Analysis;vol. 2: Geometry);Dover, 1948. Volume 1 contains a proof of the transcendence of x which, although not so elementary as the one in [4],is a direct analogue of the proof that e is transcendental, replacing integrals with complex line integrals. It can be read as soon as the basic facts about complex analysis are known. riginal The third category is the very opposite extrem papers. The difficulties encountered here are formidable, and I have only had the courage to list such the one source of the quotation for Part IV. It is not even in English, paper, 606 Suggested Reading although you do have a choice of foreign languages. The article in the original French is in [52] OeuvresComplètesd'Abel;Christiania. JohnsonReprint Corporation, New York, 1965. It first appeared in a German translation in the journalfiir die reine und angewandte Mathematik,Volume 1, 1826. To compound the difficulties, these references will usually be available only in university libraries. Yet the study of this paper will probably be as valuable as any other reading mentioned here. The reason is suggested by a remark of Abel himself, who attributed his profound knowledge of mathematics to the fact that he read the masters, rather than the pupils. ANSWERS TO SELECTED PROBLEMS CHAFTER 1 1. la)x a¯I(ax) 1 x x. 1 a-la (a x2 y2, then 0 x2 y2 If (x y)(x + y), so either x (iii) 0, that is, either x or x x+ y y. in Replace by y y (vi) (iv). One step requires dividing by x y 0. (i) = = = = = = = · = _ - y - 0 or = -y = = = --- 2. 3. (i) a/b (ii) (ad + ab¯I (iii) bc)/(bd) (a = (a/b)/(c/d) (v) ad(b¯I - ad(bc)¯I = a¯I - (a-b¯I)(c-d = b¯I I) (ab)¯'. = 1 = (a-b¯I)(c-1-d)= = (ad)/(bc). = -1. (i) (iii) x>Áorx<-Ã. (v) All x, since x2 2X + 2 (X Î) + Î. x2 since 3 and are the roots of (vii)x > 3 or x < < x < 3. (ix) x > sr or < 3. (xi) x (xiii)x>1or0<x<1. (i) b a and d c are in P, so (b a) + (d c) (b + < x = - - -2, -2 x - --5 5. (by(iii)) = 1, so = - (a/b)(c/d)¯I = c¯1) - = bc)(bd)¯I (ad + = (ac)(bc)-i (by(iii)) ac/bc. bc)(b¯Id¯') (ad + = a/b + c/d. a-1)(b b¯1) = ab(a¯Ib-1) c¯1) (ac)(b¯I = ab-1 + cd-1 4. = - = - - - in P. Thus, b + d Using (ii), < - a + c. then (i)implies that a + a) are in P, so ac 0 and a < 1, so a2 < a. d) = 6 - - 0. = c) (a + is > --d; (iii) -c -a) and (v) (b -c -bc --c(b (vii)Using (iv),a > (ix) Substitute a for c and b for d in 9. = - (i) Ã+E-E+n. (iii) |a+b|+|c|-|a+b+c|. (v) Ä+Ä+E-W. and b à (i) a if a (-c) b+ < (--d). is in P, that is, ac > bc. (viii). -b 10. -b 0; 0; and b 5 if a 5 -a a+2bifat-bandb_<0; -b -a - (iii) x - --x--x2ifxs0. 2b if a 5 x2 if x à 0; and b > 0. -5. 11. (i) x (iii) (v) No = 11, -6 --2. < x < x (thedistance --1 least 2). (vii)x 1, from x to 1 plus the distance from x to is at -1. = 12. (i) (|xyl)2 (xy)2 x2y2 |x|2|y|2 (\x| |y|)2;since |xy| and |x|· |y| are both > 0, this proves that |xy| |x|· |y|. y--1| by (ii)) |xy il (by(i)) |x/y|. (iii) 1x|||y| |x (v) It followsfrom (iv)that |x| |y (y x)| 5 |y| + ly x|, so lxI lyI 5 lx yl(vii) |x + y + z| 5 |x + y| + |z 5 |x| + ly| + |z|. If equality holds, then |x + y| |xl + |y|, so x and y have the same sign. Moreover, z must = = = = - = -|y|¯-1 = = . = - = 609 - - = - - 610 Answers to SelectedProblems have the same sign as x + y, so x, y, and z must all have the same sign one is 0). (unless CHAPTER 2 1. (i) Since 12 1 (2) (2 1 + 1)/6, the formula is true for n 1. Suppose that the formula is true for k. Then k(k + 1)(2k + 1) 2 12+---+k'+(k+1) +(k+1) 2 6 (k + 1) + 1) + 6(k + 1)] 6 [k(2k (k + 1) + 2)(2k + 3)] 6 [(k (k + 1)(k + 2) (2[k+ 1] + 1) 6 = - - = - = = ' so the 2. formula is true for k + 1. (i) (2i-1)=l+3+5+---+(2n-1) i=1 = = = 5. 1+2+3+---+2n-2(1+---+n) (2n)(2n + 1) -n(n+1) 2 n2. (a) Since 1-r2 1+r= the formula is true for n = , 1-r 1. Suppose that 1 1+r+··-+r"= - rn+1 . 1-r Then 1+r+---+r"+rn+1 _ 1 1 1 rn+1 - - - +rn+1 r rn+1 + rn+1(1 1-r 1-r"*2 ' 1-r (b) S=1+r+ rS= +r" r+---+r"+rnel Thus S(1-r)=S---rS=1-r"**, - r) Answersto SelectedProblems 611 SO l rn+1 - ¯ 1-r 6. (i) From (k+1)'-k'=4k3+6k2+4k+1, k=1,...,n we obtain k3+6 (n+1)4-1=4 k2+4 k=l k+n, k=1 k=1 so n k3_ 1)(2n + Î) 64 n(n + (n+1) 4 R4 3 2 4 2 4 -1-6 n(n ‡ 1) -4 -n 2 k=1 =-+-+-. (iii) From 1 1 k 1 = -- k+1 k=1,...,n , k(k+1) we obtain 18. 1 1 n+1 k(k+1) 1 is either even or odd, in fact it is odd. Suppose n is either even or odd; then n can be written either as 2k or 2k + 1. In the first case n + 1 2k + 1 is odd; in the second case n + 1 2k + 1 + 1 2(k + 1) is even. In either case, n + 1 is either even or odd. (Admittedly, this looks fishy,but it is really = = = 9. 12. correct.) Let B be the set of all natural numbers I such that no 1 + l is in A. Then 1 is in B, and l + 1 is in B if l is in B, so B contains all natural numbers, which means that A contains all natural numbers > no. (a) Yes, for if a + b were rational, then b (a + b) a would be rational. If a and b are irrational, then a + b could be rational, for b could be r a for some rational number a. rational. But if a ¢ 0, then ab could not be rational, If (b) a 0, then ab isa¯1 = would be rational. for then b (ab) example, M. (c) Yes; for example, and for Yes; E (d) -- = - -- = · -d. 13. (a) Since (3n + 1)2 (3n + 2)2 = = 9n2 + 6n + 1 3(3n2+ 2n) + 1, 9n* + 12n + 4 3(3n2 + 4n + 1) + 1, = = it follows that if k2 is divisible by 3, then k must also be divisible by 3. p/q where p and Now suppose that Ë were rational, and let Ë = 612 Answersto SelectedProblems p2 3q2, so p2 is divisible by 3, so q have no common factor. Then 3p' for some natural number p', and consep must be. Thus, p 42. Thus, quently (3p')2 3q2, or 3(p')2 q is also divisible by 3, a contradiction. The same proofs work for Ã and Ã, because the equations = = = _ (Sn + 1)2 = 25n2 + 10n + 1 (Sn + 2)2 25n2 + 20n + 4 (Sn + 3)2 25n2 + 30n + 9 = = = = = 5(5n2 $(5n2 + 4n) + 4, 5(5n2 + 6n + 1) + 4, (5n+4)2=25n2+40n+16=5(5n2+¾+B+L equations for numbers of the form 6n + m, show and the corresponding that if k2 is divisible by 5 or 6, then k must be. The proof fails for Ã, because (4n + 2)2 is divisible by 4. (For precisely this reason this proof cannot be used to show that in general á is irrational if a is not a have no guarantee that (an + m)2 might not be a perfect square-we multiple of a for some m < a. Actually, this assertion is true, but the proof requires the information in Problem 17.) (b) Since (2n+1)3 = Sn3+12n2+6n+1 2(4n3+6n2+3n)+1, = = p/q where p and it follows that if k3 is even, then k is even. If p3 p3 2q3, is divisible by 2, so p so q have no common factors, then 2p' for some natural number p', and consequently must be. Thus, p (2p')S 2q3, or 4(p')3 q3. Thus, q is also even, a contradiction. = = = = The proof for å is similar, using the equations (3n+ 1)3 27n3 + 27n2 + 9n + 1 (3n + 2)3 19. If n = = = 27n* + 54n2 + 36n + 8 1, then (1 + h)" = (1 + h)"* (1 + = 3(9n3+ 9n2 + 3n) + 1, = 3(9n3 + 18n2 + 12n + 2) + 2. 1 + nh. Suppose that (1 + h)" = h)(1 + h)" =1+(n+1)h+nh2>1+(n+1)h. > (1 + h) (1 + since nh), > 1 + nh. Then 1+ h > 0 For h > 0, the inequality follows directly fmm the binomial theorem, since all the other terms appearing in the expansion of (1 + h)" are positive. -1 CHAPTER 3 1. (i) (x + 1)/(x + 2); the expression f (f (x)) makes -2. and x ¢ -1/c if c ¢ 0). 1/(1 + cx) (forx ¢ (v) (x + y + 2)/(x + 1)(y + 1) (forx, y ¢ (vii)Only c 1, since f (x) f (cx) implies that x true for at least one x ¢ 0. (i) y > 0 and rational, or y > 1. (iii) 0. 0, 1. (v) (iii) -1). = 2. sense only when xy ---1, = = cx, and this must be Answersto SelectedProblems 613 3. (i) {x:-1Exs1}. (iii) {x: x ¢ 1 and x ¢ (v) 0 (i) 22y (iii) 22=r + sin(2'). (i) Po s. (iii) so S. (v) Po P. (vii)sososoPoPoPos. (a) y. (b) H(y). 2). . 4. 5. 11. (c) H(y). 12. (a) even odd even neither neither odd even odd (b) even odd even even odd odd odd even (c) f f even odd g even even even g odd even odd (d) Let g(x) f (x) for x > 0 and define g arbitrarily for x < 0. (i) Let g(x) h(x) 1 and let f be a function for which f(2) ¢ f(1) + = 21. = = f (1). Then fo (g + h) ¢ fog + fo h. (ii) [(g+h)o f](x)= (g+h)(f(x))= g(f(x))+h(f(x)) (ho f)(x) [(gof) + (ho f)](x). = (go f)(x)+ = 1 1 1 = = 1/(f ¢f 1 -(g(x)) (iii) fog (x) f (g(x)) f 2 and let f be a (iv) Let g(x) og) o(1/g). = = - f og (x). function for which f(j) ¢ 1/f(2). Then 614 Answersto SelectedProblems CHAFTER a 1. (i) (2, 4). (iii) [2,4]. (v) (-2, 2). (vii) (-oo, 1] U [1,oo). 3. (i) All points below the (iii) All points below the graph of f (x) graph of f (x) = = x. x2. All points between the graphs of f(x) x + 1 and f(x) x 1. inter(vii)A collection of straight lines parallel to the graph of f (x) = axis secting the horizontal at the points (n, 0) for integers n. radius 1 and around (1, 2). of the inside circle All points (ix) with vertices (1,0), (0, 1), (-1, 0), and (0, (i) A square (iii) The union of the graph of f (x) x and of f (x) 2 x. (v) The point (0, 0). (vii)The circle of radius á around (1, 0), since x2 2x + y2 (x 1)2 + (v) = = - -x, -1). 4. = = - = - - y2-1. 6. a) + b that the graph of f (x) m(x mx + (b ma) is a straight line with slope m, which goes through the point (a, b). (The important point about this exercise is simply to remember the point slope form.) (b) The straight line through (a, b) and (c,d) has slope (d b)/(c a), so the equation follows from part (a). (c) When m m' and b ¢ b'. In that case, there is clearly no number x with f (x) g(x), while such a number x always exists if m ¢ m', namely, m'). x = (b' b)/(m (a) If B = 0 and A ¢ 0, then the set is the vertical straight line formed If B ¢ 0, the set is the graph of -C/A. by all points (x, y) with x f (x) (-A/B)x + (-C/A). points (x, y) on the vertical line with x The a are precisely the ones (b) which satisfy 1 x + 0 y + (-a) 0. The points (x, y) on the graph of f (x) mx+b are precisely the ones which satisfy (-m)x+ 1-y+(-b) (a) Simply observe = = - - - - = = - 7. - = = = = - = = 11. 0. (i) The graph of f is symmetric with respect to the vertical axis. (ii) The graph of f is symmetric with respect to the origin. Equivalently, the part of the graph to the left of the vertical axis is obtained by reflecting first through the vertical axis, and then through the horizontal RXiS. 21. (iii) The graph of f lies above or on the horizontal axis. 0 and a over (iv) The graph of f repeats the part between x2) of the distance from is The to (0, ¼) (x, square (a) -x2+x'- x2+ x2- \ 4) 2 x2 =x4 2 = (x2+ 16 1)2 + 1 16 and over. Answersto SelectedProblems 615 is the square of the distance from (x, x2) to the graph of g. (b) The point (x, y) satisfies this condition if and only if which (x-a)2+Q- 2-Q-02 or x2-2ax+a2+y2-2ßy+ß2-y2-2yy+y2, or 1 2ß-2y ( y= «24p2 a y-ß x2+ x+ 2 . 2ß-2y for ß ¢ y, which is just the condition that P is not on L. If P is on L, then the solution is the vertical line through P.) (This CHAFTER 5 1. solution works only (ii) x3 8 2 - lim x->2 - x lim(x2 + 2x + 4) = = xw2 12. (iv) yn lim yn _ limXn-1 = xey X - n-2 n-2 n-1 X->T y yn-i = n-1 pn-i = ny"¯1 (vi) Ja+h- lim lim = h h->o (Ja+h- lim = ha0 )(Ja+h+Á) h( Ja + h + h->o da + h+ ) & l 24 3. It is possible to fmd ô by beginning with the equation (i) x4 If |x - a| < - (X - G 1, then |x| < 3 |x3+ax2+a2 a)(X - 1+ 3+|a| < # GX2 + a2x + a3L |a|, so X2 2 |X|+|a|3 (1 + |a|)" + |a|(1 + |a|)2 + M2(1+ |a|) + |a|3· therefore we can choose 8 = min ( l' e [a|)3+ |a|(1 + |a|)2 + |a|2(1+ |a|) + |a|3 (1 + It is instructive, and probably easier, to use part shows that |x4 - |x2 a _ | < e when 2 2(|a|2 + 1) (2)of the lemma. This 616 Answersto SelectedProblems which when is true min x (ii) < min 1, = min 1, a| - (3)of the By part lemma, |l/x 1| - e when < e (1 - - =ô. 2'2 the lemma, (x*+ 1/x) 1| < e/2. According to parts - 2| - 6. (v) (i) 1| - e (1 min < 2'Ä' e2, since Û < Let 8 We need |f(x) = 0 < x 2| - 2| - 82 - |1/x 1 e 2'32 = - Ñ - sin2 + (iii) We need . |g(x) 4| - < mm -, |4| 1| - < e/2 (ii)of this problem, this < e. implies that e/2 and |g(x) 4| < s/2, so we min < min = '8-2-2 |x| < < e 1 e when < (i)and happens when |x 8. = (iii) By part (1)of |x4 2(|a 2 + 1) 2(|a| + 1) 4(|a|2 + 1)(|a[ + 1) |x-l]<min and 1, - ô. need 1 e|4|2 - 2 2 so we need 0 9. |x 2| < - < 8e)]2 [min(2, __ lim f (x) and define g(h) f (a + h). Then for every e > 0 there is a 8 > 0 such that, for all x, if 0 < |x a| < 8, then |f (x) l < e|. Now, if < e. This inequality 0 < lh|<8, then0 < (h+a)-al < ô, so|f(a+h)-l| written which < g(h) l, e. Thus, lim can be can also be written |g(h) Let l = = - - -l| = ho0 lim f (a + h) l. The same sort of argument shows that if lim f (a+ h) m, h->0 then lim f (x) m. So either limit exists if the other does, and in this case = = h->0 = they are equal. 10. can get f(x) as close to I as we like if and only if we can get f (x) as close to 0 as we like. The formal proof is so trivial that it takes a bit of work to make it look like a pmof at all. To be very precise, l and let g(x) suppose lim f (x) f (x) l. Then for all e > 0 there < 8, then |f(x) < e. is a ô > 0 such that, for all x, if 0 < |x This last inequality can be written |g(x) 0| < e, so lim g(x) 0. The (a) Intuitively, we -l = = - -al - argument in the other direction is similarly uninteresting. -l| = Answersto SelectedProblems 617 (b) Intuitively, making x close to a is the same as making x a close to 0. Formally: Suppose that lim f(x) l, and let g(x) f(x a). Then for all e > 0 there is a 8 > 0 such that, for all x, if 0 < |x a < ô, then |f (x) l| < e. Now, if 0 < |yl < 8, then 0 < |(y + a) a < 8, so < E. But this last inequality < can be written |g(y) f(y +a) e. l. The argument in the reverse direction is similar. So lim g(y) yw0 - = = - - -- - -l| -l| = is close to 0 if and only if x3 is. Formally: Let lim f (x) x->O l. For every e > 0 there is a 8 > 0 such that if 0 < lx| < 8, then |f (x) l| < e. Then if 0 < |x| < min(1, 8), we have O < |x3| < ô, so lim f (x) = l. On the other hand, if we assume |f(x3) l| < e. Thus, x->O (c) Intuitively, x = - - f(x3) exists, that lim x->0 such that if 0 |x| < say < lim f(x3) 8, then we have 0 < | W] < ô, so lim f (x) m. m, then = x->0 if (x3) [f ([ W]3) - - for all e > 0 there is a & m| < e. Then if 0 < |x| ml < e, or |f (x) m| < e. -- < 83, Thus = -1 (d) Let f (x) but lim x->0 17. = 1 for x 0, and f (x) = > f (x) does not function f (x) for x < 0. Then lim f (x2) x->0 = 1, exist. 1/x cannot approach a limit at 0, since it bewhat 8 > 0 may near 0. In fact, no matter comes satisfying 0 < |x| < 8, but 1/x > |l| + s, namely, be, there is some x min(8, 1/(|l| + s)). This x does not satisfy |l/(x l)| < e. x (b) No matter what 8 > 0 may be, there is some x satisfying 0 < |x 1| < 8, but 1/(x-1) > |l|+e, namely, x min(1+8, 1+1/(|l|+e)). This x does < e. (It is also possible to apply Problem 10(b): 1) not satisfy |l/(x lim 1/x lim 1/(x 1) if the latter exists, so this limit does not exist, 290 x->l because of part (a).) (i) This is the usual definition, simply calling the numbers ô and e, instead (a) The arbitrarily = large = - - = -l| - = 25. (ii) - of e and 8. This is a minor modification of (i):if the condition is true for all 8 then it applies to 8/2, so there is an e > 0 such that if 0 < |x a| then |f (x) l| < ô/2 < 8. - > 0, < e, - is a similar modification: apply it to 8/5 to obtain (i). (iv) This is also a modification: it says the same thing as (i),since e/10 > 0, and it is only the existence of some e > 0 that is in question. If lim f(x) lim f(x) l, then for every e > 0 there are ôi, 82 > 0 such (iii) This 29. = x->.+ = x->a- that, for all x, if a < x < a + ô1, then |f (x) li then|f(x)-I|<e. ifa-82<x<a, -- LetB=min(8i,82).If0<|x-a|<&,theneithera--82<a-B<x<a or else a < x < a + 8 < a + Bi, so jf (x) l| < e. - < e, 618 Answersto SelectedProblems 30. -l| < lim f (x), then for all e > 0 there is a ô > 0 such that |f (x) xwD+ < < x < 0, then 0 < < ô, so |f(-x) e for 0 < x < 8. If e. l. Similarly, if lim f (x) exists, then lim f (x) Thus lim f (-x) x->0xsOx->D* exists and has the same value. (Intuitively, x is close to 0 and positive if and only if is close to 0 and negative.) lim f (x), then for all e > 0 there is a 8 > 0 such that |f (x)--l| < If I If l (i) = -8 -l| -x

Calculus Textbook: Spivak, Third Edition

Related documents

Study collections

Products

Support

Calculus Textbook: Spivak, Third Edition

Related documents

Study collections

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib