Introduction to the Cauchy integral

Bro. David E. Brown, Dept. of Mathematics, BYU–Idaho
Version 1.2, of March 1, 2013. All rights reserved.

Augustin Louis Cauchy’s Résumé des leçons données à l’École Royale Polytechnique sur le calcul infinitésimal of 1823 was the first document to give a rigorous notion of integration. He took the familiar approach of approximating the area under a curve by sums of areas of appropriate rectangles and letting the widths of the rectangles go to 0.¹ He was more careful about it than any of his predecessors.

This document aims to replicate Cauchy’s achievement by defining the Cauchy integral, showing that continuous functions are Cauchy integrable, and proving a fundamental theorem of calculus for Cauchy integrals. You, the student, will help. We will modernize Cauchy’s approach and use greater rigor. (Sorry, but you’ll have to draw your own pictures and create your own examples.)

The nature of integration theory is that you have to create quite a lot of machinery (in the form of lemmas) to be able to prove anything interesting about integrals. I regret to inform you that the present effort is no exception. However, the machinery is easier to construct for the Cauchy integral than for the Riemann integral, especially if we use only continuous functions. We’ll have a little more generality than this, but we’ll keep things as simple as we can.

1 Definition of the Cauchy integral

We begin with some necessary vocabulary. Throughout this document, assume $a < b$.

Definition 1.1. Suppose we are given an interval $[a, b]$ of real numbers.
• A partition $P$ of $[a, b]$ is a set of points $x_0, x_1, x_2, \dots, x_N$ having the property that $a = x_0 < x_1 < x_2 < \cdots < x_N = b$. We will write $P = \{a = x_0 < x_1 < x_2 < \cdots < x_N = b\}$.
• The intervals $[x_{n-1}, x_n]$ are the subintervals of the partition, or subintervals, for short.

Exercise 1.2. Show that
\[ \sum_{n=1}^{N} (x_n - x_{n-1}) = b - a. \]
(In other words, the sum of the lengths of the subintervals of $[a, b]$ is the length of $[a, b]$.) Also show that, for any function $f : [a, b] \to \mathbb{R}$,
\[ \sum_{n=1}^{N} \bigl( f(x_n) - f(x_{n-1}) \bigr) = f(x_N) - f(x_0). \]

Let $f : [a, b] \to \mathbb{R}$ be a bounded function, and suppose $P = \{a = x_0 < x_1 < x_2 < \cdots < x_N = b\}$ is a partition of $[a, b]$. Construct on each subinterval $[x_{n-1}, x_n]$ the rectangle that has signed height $f(x_{n-1})$. (The height is signed because $f(x_{n-1})$ may be negative.) The signed area of this rectangle is $f(x_{n-1})(x_n - x_{n-1})$. These are the areas that make up a Cauchy sum for $f$.

Definition 1.3. The Cauchy sum for $f$ corresponding to the partition $P$ of $[a, b]$ is
\[ C(f, P) = \sum_{n=1}^{N} f(x_{n-1})(x_n - x_{n-1}). \tag{1} \]

¹ Note that this method starts with an unwarranted assumption, namely that the area under the curve is already defined and exists before you start drawing rectangles. This is not so. The integral actually defines the area under the curve.

If we want to define an integral by using narrower and narrower rectangles in some limiting process, we’ll need a way to make the lengths of the subintervals of partitions go to 0.²

Definition 1.4. The norm of the partition $P$ of $[a, b]$ is $\|P\| = \max\{x_n - x_{n-1} \mid n = 1, 2, \dots, N\}$.

Exercise 1.5. Show that there is a sequence $\{P_j\}$ of partitions of $[a, b]$ whose norms go to 0 as $j \to \infty$.

We will also need vocabulary for a partition made by partitioning the subintervals of $P$.

Definition 1.6. $Q$ is a refinement of the partition $P$ iff $Q$ is a partition of $[a, b]$ and $P \subseteq Q$. We also say that $Q$ refines $P$.

When simultaneously discussing two partitions $P$ and $Q$, we will write $P = \{a = x_0 < x_1 < x_2 < \cdots < x_N = b\}$ and $Q = \{a = y_0 < y_1 < y_2 < \cdots < y_M = b\}$. Note that if $Q$ refines $P$, then each $x_n$ is some $y_m$.

Exercise 1.7. Show that if $Q$ is a refinement of $P$ then $M \ge N$ and $\|Q\| \le \|P\|$.

Exercise 1.8. Let $P$ and $Q$ be any two partitions of $[a, b]$. 1. Show that $P \cup Q$ is a refinement of $P$ and a refinement of $Q$. (Indeed, $P \cup Q$ is an example of a common refinement of $P$ and $Q$.) 2.
Show that $\|P \cup Q\| \le \min\{\|P\|, \|Q\|\}$.

We are ready to try to define the Cauchy integral as the limit of Cauchy sums as the norms of our partitions go to 0. Naturally, making sense out of this requires a $\delta$ and an $\varepsilon$.

Definition 1.9. Let $a < b$. The bounded function $f : [a, b] \to \mathbb{R}$ is Cauchy integrable on $[a, b]$ iff there is a real number $A$ such that for every $\varepsilon > 0$ there is a $\delta > 0$ having the property that if $P$ is any partition of $[a, b]$ for which $\|P\| < \delta$ then $|C(f, P) - A| < \varepsilon$. When such an $A$ exists, we call it the Cauchy integral of $f$ on $[a, b]$ and write
\[ A = \lim_{\|P\| \to 0} C(f, P) = (C)\int_a^b f(x)\,dx. \]
The $(C)$ in front of the integral symbol indicates that the integral is a Cauchy integral.³,⁴

At this point, we do not know whether we have defined anything, because we do not know whether there is such a thing as a function that is Cauchy integrable on some interval. Nevertheless, we can prove a theorem about the uniqueness of our hypothetical Cauchy integral. (This is important!)

Exercise 1.10. Prove that the Cauchy integral of a Cauchy integrable function on a given compact interval is unique, if it exists.

So the Cauchy integral is unique, IF it exists. Our next question should be whether there actually is such a thing as a Cauchy integrable function. As the following example shows, the answer is affirmative.

² In fact, we will need the lengths of subintervals to go to 0 uniformly.
³ If we ever need a Riemann integral in this document, we’ll just use an unadorned integral symbol for it.
⁴ Warning: Some people like to write $\lim_{n \to \infty} C(f, P) = A$, but this makes no sense, because $n$ only runs up to $N$, not $\infty$. Some people like to write $\lim_{N \to \infty} C(f, P) = A$, but this is not the same thing as $A = \lim_{\|P\| \to 0} C(f, P)$. It is true that $\|P\| \to 0$ forces $N \to \infty$, but the converse fails (a partition can have many points and still contain a long subinterval), and in any case (1) $N$ is not a function of $\|P\|$ and (2) $\|P\|$ is not a function of $N$. So if you’re in the habit of thinking that the integral is what you get when you let the number of rectangles go to $\infty$, I exhort you to change your ways. The integral (if it exists) is what you get by using rectangles all of whose widths are arbitrarily small.

Example 1.11. Suppose $f(x) = c \in \mathbb{R}$ on $[a, b]$ and let $\varepsilon > 0$ be given. Let $\delta$ be any positive number and let $P$ be a partition of $[a, b]$ with $\|P\| < \delta$. If $A = c(b - a)$ then
\[
\begin{aligned}
|C(f, P) - A| &= \left| \sum_{n=1}^{N} f(x_{n-1})(x_n - x_{n-1}) - A \right| \\
&= \left| \sum_{n=1}^{N} c\,(x_n - x_{n-1}) - c(b - a) \right| \\
&= |c(b - a) - c(b - a)| \\
&= 0 < \varepsilon.
\end{aligned}
\]

So: We actually did define something, because there is such a thing as a Cauchy integrable function. Unfortunately, using the definition directly (as in the example above) requires knowing the value of the integral in advance. You might think we could find out by calculating the Riemann integral, by way of the Fundamental Theorem of Calculus that you were taught in Calc I. But we don’t know at this point whether the Riemann integral of any given function will be the same as the Cauchy integral. In fact, we don’t even know which non-constant functions have Cauchy integrals.

It would be nice to have ways to determine whether a function is Cauchy integrable without having to be able to integrate it first. We will prove two such criteria. One is Cauchy’s result on the Cauchy integrability (on $[a, b]$) of any continuous function. The other uses certain sequences of Cauchy sums.

2 Continuous functions are Cauchy integrable

Even if you know the value of the Cauchy integral in advance, using the definition directly is highly nontrivial, even for simple functions like $f(x) = x$. (Try it.) It’s easier to prove that continuous functions are Cauchy integrable than to prove that $f(x) = x$ is Cauchy integrable!

Why? Well, recall that the problem is that we can’t prove that a given (non-constant) function is Cauchy integrable without knowing the value of the integral. How did we handle this sort of problem for sequences? Why, by using the Cauchy criterion for convergence, which relies only on the distance between terms in the sequence, not on some putative value of the limit.
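As a numerical illustration of that idea (comparing Cauchy sums with each other rather than with a known limit), here is a small Python sketch. It is not part of the argument and proves nothing. It assumes $f(x) = x$ on $[0, 1]$ and the partitions with $2^j$ equal subintervals, both chosen purely for illustration, and the helper names `cauchy_sum` and `uniform_partition` are invented for this sketch.

```python
def cauchy_sum(f, partition):
    """Cauchy sum C(f, P): left-endpoint heights times subinterval widths.

    `partition` is an increasing list of points x_0 < x_1 < ... < x_N.
    """
    return sum(f(left) * (right - left)
               for left, right in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length (illustrative choice)."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# Assumed example: f(x) = x on [0, 1], partitions with 2, 4, ..., 1024 subintervals.
a, b = 0.0, 1.0
sums = [cauchy_sum(lambda x: x, uniform_partition(a, b, 2 ** j))
        for j in range(1, 11)]

# Distances between consecutive Cauchy sums: no candidate limit is needed.
gaps = [abs(later - earlier) for earlier, later in zip(sums, sums[1:])]

for j, (s, g) in enumerate(zip(sums[1:], gaps), start=2):
    print(f"N = {2 ** j:5d}   C(f, P) = {s:.6f}   gap to previous sum = {g:.6f}")
```

For this particular $f$ and these partitions the Cauchy sum with $N$ subintervals works out to $\tfrac12 - \tfrac{1}{2N}$, so the printed gaps halve at each step. The code never uses the value $\tfrac12$; it only watches the sums cluster, which is the spirit of the Cauchy criterion.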
So suppose we prove that if $\{P_j\}$ is a sequence of partitions of $[a, b]$ for which $\|P_j\| \to 0$ as $j \to \infty$, then the corresponding sequence $\{C(f, P_j)\}$ of Cauchy sums is a Cauchy sequence of real numbers. The Cauchy criterion for convergent sequences of real numbers will yield a real number $A$ for which $\lim_{j \to \infty} C(f, P_j) = A$.

But now we have a new problem: The definition of the Cauchy integral requires that $C(f, P)$ be within some given $\varepsilon$ of $A$ no matter which partition $P$ (having norm less than some $\delta$) we use. Well, there are uncountably many such partitions (for a given $\delta > 0$), so no sequence can include them all. That’s OK. We’ll just be very careful to make sure that the fact that $\lim_{j \to \infty} C(f, P_j) = A$ implies that $|C(f, P) - A| < \varepsilon$ for any partition $P$ whose norm is sufficiently small. When that’s done, the proof will also be done. Sounds like a plan.

There will be a couple of extra steps I haven’t mentioned yet, including a lemma that is sufficiently interesting and useful to be named.

Lemma 2.1 (The Refinement Lemma for Cauchy sums for continuous functions). Let $f : [a, b] \to \mathbb{R}$ be continuous. For every $\varepsilon > 0$ there is a $\delta > 0$ such that for every partition $P$ of $[a, b]$ having norm less than $\delta$ and for every refinement $Q$ of $P$, $|C(f, P) - C(f, Q)| < \varepsilon$.

Proof (after [1]). Let $\varepsilon > 0$ be given and recall that the fact that $f$ is continuous on $[a, b]$ implies that it is uniformly continuous on $[a, b]$. Specifically, there is a $\delta > 0$ so that if $x$ and $y$ are in $[a, b]$ and $|x - y| < \delta$, then
\[ |f(x) - f(y)| < \frac{\varepsilon}{b - a}. \]
Let
\[ P = \{a = x_0 < x_1 < x_2 < \cdots < x_N = b\} \quad\text{and}\quad Q = \{a = y_0 < y_1 < y_2 < \cdots < y_M = b\}, \]
and suppose $\|P\| < \delta$. Because $Q$ refines $P$, we can draw two conclusions. First, $\|Q\| \le \|P\|$. Second, the points of $Q$ provide partitions of each of the subintervals of $P$. For example, re-indexing the $y_m$’s allows us to write
\[ a = x_0 = y_0 = y_{1,0} < y_{1,1} < y_{1,2} < \cdots < y_{1,m_1} = x_1, \]
for some $m_1$.⁵ Now let $y_{2,0} = y_{1,m_1}$ and write
\[ x_1 = y_{2,0} < y_{2,1} < y_{2,2} < \cdots < y_{2,m_2} = x_2, \]
for some $m_2$.
Continuing in the same manner yields
\[
\begin{aligned}
x_0 &= y_{1,0} < y_{1,1} < y_{1,2} < \cdots < y_{1,m_1} = x_1, \\
x_1 &= y_{2,0} < y_{2,1} < y_{2,2} < \cdots < y_{2,m_2} = x_2, \\
x_2 &= y_{3,0} < y_{3,1} < y_{3,2} < \cdots < y_{3,m_3} = x_3, \\
&\;\;\vdots \\
x_{N-1} &= y_{N,0} < y_{N,1} < y_{N,2} < \cdots < y_{N,m_N} = x_N,
\end{aligned}
\]
and $x_N = b$.⁶

Using Exercise 1.2, we can now write $x_n - x_{n-1} = y_{n,m_n} - y_{n,0} = \sum_{m=1}^{m_n} (y_{n,m} - y_{n,m-1})$, so that
\[ f(x_{n-1})(x_n - x_{n-1}) = \sum_{m=1}^{m_n} f(x_{n-1})(y_{n,m} - y_{n,m-1}). \]
Moreover, each $y_{n,m-1}$ lies in $[x_{n-1}, x_n)$, so $|x_{n-1} - y_{n,m-1}| < x_n - x_{n-1} \le \|P\| < \delta$, and therefore $|f(x_{n-1}) - f(y_{n,m-1})| < \varepsilon/(b - a)$. Then
\[
\begin{aligned}
|C(f, P) - C(f, Q)|
&= \left| \sum_{n=1}^{N} f(x_{n-1})(x_n - x_{n-1}) - \sum_{n=1}^{N} \sum_{m=1}^{m_n} f(y_{n,m-1})(y_{n,m} - y_{n,m-1}) \right| \\
&= \left| \sum_{n=1}^{N} \sum_{m=1}^{m_n} f(x_{n-1})(y_{n,m} - y_{n,m-1}) - \sum_{n=1}^{N} \sum_{m=1}^{m_n} f(y_{n,m-1})(y_{n,m} - y_{n,m-1}) \right| \\
&= \left| \sum_{n=1}^{N} \sum_{m=1}^{m_n} \bigl( f(x_{n-1}) - f(y_{n,m-1}) \bigr)(y_{n,m} - y_{n,m-1}) \right| \\
&\le \sum_{n=1}^{N} \sum_{m=1}^{m_n} \bigl| f(x_{n-1}) - f(y_{n,m-1}) \bigr| (y_{n,m} - y_{n,m-1}) \\
&< \frac{\varepsilon}{b - a} \sum_{n=1}^{N} \sum_{m=1}^{m_n} (y_{n,m} - y_{n,m-1}) \\
&= \varepsilon,
\end{aligned}
\]
as was to be shown.

Lemma 2.2. Let $f : [a, b] \to \mathbb{R}$ be continuous, and let $\{P_j\}$ be a sequence of partitions of $[a, b]$ for which $\|P_j\| \to 0$ as $j \to \infty$. The sequence $\{C(f, P_j)\}$ is a Cauchy sequence of real numbers.

Corollary 2.3. Let $f : [a, b] \to \mathbb{R}$ be continuous, and let $\{P_j\}$ be a sequence of partitions of $[a, b]$ for which $\|P_j\| \to 0$ as $j \to \infty$. There exists a real number $A$ for which
\[ \lim_{j \to \infty} C(f, P_j) = A. \]

Recall that we said that to prove that a continuous function is Cauchy integrable, we’d need a number $A$ to use in the definition of “Cauchy integrable.” In other words, we’d need a number that could serve as a candidate for the value of $(C)\int_a^b f(x)\,dx$. Well, Corollary 2.3 will provide such an $A$.

⁵ Note that $y_{1,0} = y_0$, $y_{2,0} = y_{1,m_1} = y_{m_1}$, $y_{3,0} = y_{2,m_2} = y_{m_1 + m_2}$, and so on.
⁶ You may be wondering how this works if, say, $x_1 = y_1$ and $x_2 = y_2$. Quite simply, we have $x_1 = y_1 < y_2 = x_2$. There are no $y_m$’s between $y_1$ and $y_2$, so we don’t have to write anything like $y_{2,1}, y_{2,2}, \dots, y_{2,m_2}$.

Exercise 2.4. Prove Lemma 2.2 and its corollary.

The next lemma looks a lot like the Refinement Lemma, but it’s just a little different.
It will allow us to conclude that if $P$ is any partition of $[a, b]$ with norm sufficiently small, then $C(f, P)$ will be within $\varepsilon$ of the limit mentioned in Corollary 2.3.

Lemma 2.5. Let $f : [a, b] \to \mathbb{R}$ be continuous. For every $\varepsilon > 0$ there is a $\delta > 0$ such that if $P$ and $Q$ are partitions of $[a, b]$ with $\|P\| < \delta$ and $\|Q\| < \delta$, then $|C(f, P) - C(f, Q)| < \varepsilon$.

Exercise 2.6. Prove this lemma.

We now have the tools we need for proving that continuous functions are Cauchy integrable.

Theorem 2.7. If $f : [a, b] \to \mathbb{R}$ is continuous then $f$ is Cauchy integrable on $[a, b]$.

Exercise 2.8. Prove this theorem. (Hint: Follow the plan given at the beginning of this section.)

The proof of Theorem 2.7 actually proves a little more:

Corollary 2.9. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f$ is Cauchy integrable on $[a, b]$, and if $\{P_j\}$ is any sequence of partitions of $[a, b]$ whose norms go to 0 as $j \to \infty$ then $(C)\int_a^b f(x)\,dx = \lim_{j \to \infty} C(f, P_j)$.

This corollary gives us a way to calculate Cauchy integrals (albeit an inefficient way). It also justifies mathematically the calculation of Cauchy integrals by using partitions whose subintervals all have the same length.⁷ As a bonus, it allows us to prove a lemma that we will need in the next section.

Lemma 2.10. Let $f : [a, b] \to \mathbb{R}$ be continuous, and suppose $c \in (a, b)$. Then $f$ is Cauchy integrable on both $[a, c]$ and $[c, b]$ and
\[ (C)\int_a^b f(x)\,dx = (C)\int_a^c f(x)\,dx + (C)\int_c^b f(x)\,dx. \]

Proof. First note that $f$ is Cauchy integrable on any interval contained in $[a, b]$, because $f$ is continuous on any such interval. Therefore, each of $(C)\int_a^b f(x)\,dx$, $(C)\int_a^c f(x)\,dx$, and $(C)\int_c^b f(x)\,dx$ exists.
Next, let $\{P_j\}$ be any sequence of partitions of $[a, c]$ whose norms go to 0 as $j \to \infty$. By Corollary 2.9, $\lim_{j \to \infty} C(f, P_j) = (C)\int_a^c f(x)\,dx$.
Likewise, let $\{Q_k\}$ be any sequence of partitions of $[c, b]$ whose norms go to 0 as $k \to \infty$. Then $\lim_{k \to \infty} C(f, Q_k) = (C)\int_c^b f(x)\,dx$.
Next, observe that for each $\ell \in \mathbb{N}$, $P_\ell \cup Q_\ell$ is a partition of $[a, b]$, and the norms $\|P_\ell \cup Q_\ell\|$ go to 0 as $\ell \to \infty$, so that by Corollary 2.9, $\lim_{\ell \to \infty} C(f, P_\ell \cup Q_\ell) = (C)\int_a^b f(x)\,dx$.
Finally, note that $C(f, P_\ell \cup Q_\ell) = C(f, P_\ell) + C(f, Q_\ell)$ and let $\ell \to \infty$.

3 The Fundamental Theorem of Calculus for Cauchy integrals

So, now we know that continuous functions are Cauchy integrable, but we still don’t know how to integrate them without using the tedious and inefficient method of taking limits of sequences of Cauchy sums. As noted before, we might be tempted to antidifferentiate and subtract, just like we did in Calc I. Unfortunately, all this would be good for at this point would be finding possible values of Cauchy integrals. We do not yet know that this process gives the true values of Cauchy integrals, so we’d have to test the possible values of Cauchy integrals by seeing whether they satisfied the definition of Cauchy integration. Ugh!

And what about the related practice of differentiating an integral? Does it really give you the function you started with? If so, under what conditions? You may have seen theorems about these things, for Riemann integrals. Well, we don’t know at this point whether such proofs apply to Cauchy integrals or not.

Fortunately, there are theorems that address these issues, the so-called Fundamental Theorems of Calculus. We’re going to prove one such theorem for Cauchy integrals.

⁷ However, using such partitions exclusively does not help the student develop correct intuition concerning the nature of Cauchy integration.

3.1 Barrow’s theorem for Cauchy integrals

One half of our fundamental theorem of calculus for Cauchy integrals can be stated thus:

Theorem 3.1 (Barrow’s theorem for Cauchy integrals). Let $f : [a, b] \to \mathbb{R}$ be continuous, and for all $x \in (a, b]$ define $F$ by
\[ F(x) = (C)\int_a^x f(t)\,dt. \]
Then $F$ is a differentiable function on $(a, b]$ and $F'(x) = f(x)$ for each $x \in (a, b]$.
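Before proving Barrow’s theorem, one can sanity-check its claim numerically. The Python sketch below is an illustration only and proves nothing. It assumes $f = \cos$, $a = 0$, and $x = 1$, and it stands in for $F$ with a Cauchy sum over 20000 equal subintervals; all of these are arbitrary choices, and the names `cauchy_sum` and `F` are invented for this sketch. It then compares the difference quotient of $F$ at $x$, for one small positive and one small negative $h$, with $f(x)$.

```python
import math


def cauchy_sum(f, a, x, n=20000):
    """Cauchy (left-endpoint) sum of f on [a, x] using n equal subintervals."""
    width = (x - a) / n
    return sum(f(a + k * width) for k in range(n)) * width


def F(x, f=math.cos, a=0.0):
    """Numerical stand-in for (C) integral of f from a to x; assumes x > a."""
    return cauchy_sum(f, a, x)


x = 1.0
for h in (1e-3, -1e-3):
    # Both x and x + h stay inside (a, b], so F is defined at each point used.
    quotient = (F(x + h) - F(x)) / h
    print(f"h = {h:+.0e}   difference quotient = {quotient:.6f}   f(x) = {math.cos(x):.6f}")
```

The two quotients should land close to $\cos(1)$, one slightly below and one slightly above, which is what the theorem predicts as $h \to 0$ from either side. The agreement is only as good as the step sizes chosen here allow.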
Let’s use the Mean Value Theorem for Integrals (Theorem 5.2.9 in the text, which I proved in class for Cauchy integrals of continuous functions) to prove this half of the Fundamental Theorem. There are proofs that don’t use the MVT for Integrals, but they’re more work.

We need two more details before starting the proof. First, to prove that $F$ is differentiable will likely require using the difference quotient
\[ \frac{F(x + h) - F(x)}{h}. \]
The problem is that $h$ needs to be allowed to be negative.⁸ We’re going to use the Mean Value Theorem for Integrals, which will require the symbol “$(C)\int_x^{x+h} f(t)\,dt$” to be defined, even when $h < 0$. If $h < 0$ then $x + h < x$, but at the beginning of Section 1, we said that $a < b$.

The standard solution to this problem is to define $(C)\int_a^b f(x)\,dx$ to be $-(C)\int_b^a f(x)\,dx$ when $a > b$. With this definition in hand, a little thought reveals that the equation $(C)\int_a^b f(x)\,dx = -(C)\int_b^a f(x)\,dx$ makes sense whether $a > b$ or $a < b$.⁹ This definition is not arbitrary. It’s consistent with Lemma 2.10.¹⁰

The second thing is that if $h < 0$, then $x + h \le a$ for $a \le x \le a - h$, and $(C)\int_a^{x+h} f(t)\,dt$ is not defined. That is, $F(x + h)$ is not defined. Likewise, if $h > 0$, then $F(x + h)$ is not defined for $b - h < x \le b$. Throughout the following, then, let’s assume that, no matter what else may be going on, once we’ve chosen an $x \in (a, b]$, $h$ is small enough that $x + h \in (a, b]$.

OK: We’re ready for the proof.

Proof (of Barrow’s theorem). Note that $(C)\int_a^x f(t)\,dt$ exists for all $x \in (a, b]$ because $f$ is continuous on $[a, x]$ for all such $x$. The fact that the Cauchy integral of any Cauchy integrable function on $[a, x]$ is unique now implies that $F$ is actually a function. The rest of the proof is the following exercise:

Exercise 3.2. Finish the proof of Barrow’s Theorem for Cauchy integrals of continuous functions. (Lengthy hint: You need to show that
\[ \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = f(x) \]
for every $x \in (a, b]$. So let $x \in (a, b]$ and use the Mean Value Theorem for Integrals. Then (and only then!) let $h \to 0$. With luck and a tail wind, you won’t need to refer to any $\delta$’s or $\varepsilon$’s, though you are welcome to do so if you feel the need.)

⁸ $h$ does not, however, need to be allowed to be 0. If you don’t see why, go to the beginning of Chapter 4 in the text and read the definition of “limit of a function” very carefully.
⁹ I’m going to remain stubborn about not defining $(C)\int_a^a f(x)\,dx$ to be 0. You can make this definition if you want, but we don’t need it in this document, and making this definition means we’d have to go back and extend the proofs of all our lemmas, theorems, and corollaries to include this special case, or else reword them all so that this special case is excluded. I’ve spent ENOUGH time on this document already, thank you!
¹⁰ The definitions of “partition” and “Cauchy sum” can be extended to make this all make sense, as well: If $a > b$, then your interval is really $[b, a]$. Then a partition $P$ of $[b, a]$ is a sequence of points $x_0, x_1, \dots, x_N$ having the property that $b = x_0 < x_1 < \cdots < x_N = a$. But define $C(f, P)$ to be $\sum_{n=1}^{N} f(x_{n-1})(x_{n-1} - x_n)$. Alternatively, define a partition $P$ of $[b, a]$ to be a sequence of points $x_0, x_1, \dots, x_N$ having the property that $b = x_N < x_{N-1} < \cdots < x_1 < x_0 = a$, and let $C(f, P) = \sum_{n=0}^{N-1} f(x_n)(x_{n+1} - x_n)$. Either way, you get an extra negative sign in your Cauchy sum (and therefore in your Cauchy integral), and the proofs of the lemmas, theorems, and corollaries we have should suffice (give or take paying attention to some new details, such as extra negative signs).

3.2 Newton’s theorem for Cauchy integrals

The other half of our fundamental theorem of calculus is:

Theorem 3.3 (Newton’s theorem for Cauchy integrals). Let $F : [a, b] \to \mathbb{R}$ be differentiable and suppose $F'$ is continuous.
Then $F'$ is Cauchy integrable on $[a, b]$ and
\[ (C)\int_a^x F'(t)\,dt = F(x) - F(a) \]
for every $x \in (a, b]$.

Proof (after [1]). Step 1 (Setup): Let $F : [a, b] \to \mathbb{R}$ be differentiable, and suppose $F'$ is continuous. $F'$ is Cauchy integrable on $[a, b]$, by Theorem 2.7. Happily, $(C)\int_a^x F'(t)\,dt$ is a function of $x$,¹¹ by Exercise 1.10.

The plan: All we have left to show is that $(C)\int_a^x F'(t)\,dt = F(x) - F(a)$, for each $x \in (a, b]$. So let $x \in (a, b]$. It will actually be enough to show that for every $\varepsilon > 0$,
\[ \left| F(x) - F(a) - (C)\int_a^x F'(t)\,dt \right| < \varepsilon. \tag{2} \]
(Any non-negative quantity that is less than every positive quantity has to be 0.) The use of this approach is suggested by the fact that things like (uniform) continuity and Cauchy integrability are defined in terms of $\varepsilon$.

So: Let $\varepsilon > 0$ be given. In Step 2, we will find a $\delta > 0$ suitable for proving Inequality (2). Step 3 will be the actual proof of the inequality. This step will involve introducing Cauchy sums for $F'$ into the expression
\[ F(x) - F(a) - (C)\int_a^x F'(t)\,dt \]
by the careful addition of 0. Doing so will allow us to use the Triangle Inequality in such a way as to take advantage of the continuity and Cauchy integrability of $F'$. So, in Step 2, we should use these two properties to select our $\delta$.

Step 2 (Finding an appropriate $\delta$): Because $F'$ is continuous on $[a, x]$, it is also uniformly continuous there. Therefore there is a $\delta_1 > 0$ such that if $u, v \in [a, x]$ and $|u - v| < \delta_1$ then
\[ |F'(u) - F'(v)| < \frac{\varepsilon}{2(x - a)}. \]
Because $F'$ is Cauchy integrable on $[a, x]$, there is a $\delta_2 > 0$ such that if $P$ is a partition of $[a, x]$ with $\|P\| < \delta_2$, then
\[ \left| C(F', P) - (C)\int_a^x F'(t)\,dt \right| < \frac{\varepsilon}{2}. \]
Let $\delta = \min\{\delta_1, \delta_2\}$.

Step 3 (Proof of Inequality (2)): Let $P = \{a = x_0 < x_1 < \cdots < x_N = x\}$ be a partition of $[a, x]$ with norm less than the $\delta$ chosen in Step 2. Note that by Exercise 1.2, $F(x) - F(a) = \sum_{n=1}^{N} \bigl( F(x_n) - F(x_{n-1}) \bigr)$.
Therefore,
\[
\begin{aligned}
\left| F(x) - F(a) - (C)\int_a^x F'(t)\,dt \right|
&= \left| \sum_{n=1}^{N} \bigl( F(x_n) - F(x_{n-1}) \bigr) - (C)\int_a^x F'(t)\,dt \right| \\
&= \left| \sum_{n=1}^{N} \bigl( F(x_n) - F(x_{n-1}) \bigr) - C(F', P) + C(F', P) - (C)\int_a^x F'(t)\,dt \right| \\
&= \left| \sum_{n=1}^{N} \bigl( F(x_n) - F(x_{n-1}) \bigr) - \sum_{n=1}^{N} F'(x_{n-1})(x_n - x_{n-1}) \right. \\
&\qquad \left. {} + \sum_{n=1}^{N} F'(x_{n-1})(x_n - x_{n-1}) - (C)\int_a^x F'(t)\,dt \right|.
\end{aligned}
\tag{3}
\]

¹¹ This is important, because if $(C)\int_a^x F'(t)\,dt$ is not a function, there’s no point in trying to decide whether it’s equal to $F(x) - F(a)$.

To be able to make any progress at this point requires somehow replacing the differences $F(x_n) - F(x_{n-1})$ with information about $F'$. The Mean Value Theorem will accomplish this:¹² Because $F$ is continuous on each subinterval $[x_{n-1}, x_n]$ and differentiable on each $(x_{n-1}, x_n)$, there is (for $n = 1, 2, \dots, N$) a $c_n \in (x_{n-1}, x_n)$ for which
\[ \frac{F(x_n) - F(x_{n-1})}{x_n - x_{n-1}} = F'(c_n). \]
Since $x_{n-1} < c_n < x_n$, we have $|c_n - x_{n-1}| < \|P\| < \delta \le \delta_1$, so the uniform continuity estimate of Step 2 applies to each $|F'(c_n) - F'(x_{n-1})|$; and since $\|P\| < \delta \le \delta_2$, the integrability estimate of Step 2 applies to $C(F', P)$. Equation (3) now becomes
\[
\begin{aligned}
\left| F(x) - F(a) - (C)\int_a^x F'(t)\,dt \right|
&= \left| \sum_{n=1}^{N} F'(c_n)(x_n - x_{n-1}) - \sum_{n=1}^{N} F'(x_{n-1})(x_n - x_{n-1}) \right. \\
&\qquad \left. {} + \sum_{n=1}^{N} F'(x_{n-1})(x_n - x_{n-1}) - (C)\int_a^x F'(t)\,dt \right| \\
&\le \sum_{n=1}^{N} \bigl| F'(c_n) - F'(x_{n-1}) \bigr| (x_n - x_{n-1}) \\
&\qquad {} + \left| \sum_{n=1}^{N} F'(x_{n-1})(x_n - x_{n-1}) - (C)\int_a^x F'(t)\,dt \right| \\
&< \frac{\varepsilon}{2(x - a)} \sum_{n=1}^{N} (x_n - x_{n-1}) + \frac{\varepsilon}{2} \\
&= \varepsilon,
\end{aligned}
\]
and the proof is complete.

The point of Newton’s theorem (at least, for us) is that if a function is continuous on a compact interval, we can calculate its Cauchy integral by the usual method of antidifferentiating and subtracting.

Newton had a different reason for proving his theorem: Doing so allowed him to solve differential equations, which would give him insight into how the Creator runs the universe. The modern (and non-religious) point of view is to note that the very most basic non-trivial ODE of all is $F' = f$, for some given $f$, on some given (compact) interval; Newton’s theorem guarantees that if $f$ is continuous, this ODE has a solution.
What’s more, Newton’s theorem implies that if you have an initial condition for $F$ (say, if you know the value of $F(a)$) then the resulting initial value problem has a unique solution. Not that bad for a college sophomore, eh?

¹² Every proof of Newton’s theorem (except Newton’s own “proof”!) makes use of the Mean Value Theorem for derivatives or something equivalent thereto.

References

[1] Frank E. Burk, A garden of integrals, The Mathematical Association of America, 2007.
[2] N. L. Carothers, Real analysis, Cambridge University Press, 2000.
[3] Patrick M. Fitzpatrick, Real analysis, PWS Publishing Co., 1996.
[4] Watson Fulks, Advanced calculus: An introduction to analysis, 3rd ed., John Wiley and Sons, 1978.
[5] Ivor Grattan-Guinness, Landmark writings in western mathematics 1640-1940, Elsevier, 2005.
[6] Joseph L. Taylor, Foundations of real analysis, version 2.3, 2005.