
0. Revision

MAM2083F
2021
Vector Calculus for Engineers
0.1. Functions
Mathematics is really not about numbers; it’s about functions.
In your first-year courses, you learnt that a function is a rule that assigns to each
element π‘₯ of a set 𝐴 an element 𝑦 of a set 𝐡. Often we write something like 𝑦 = 𝑓 (π‘₯),
which then gives the function a name, 𝑓 . The basic idea is that 𝑓 somehow transforms
π‘₯ into 𝑦 : we call 𝑦 the value of the function 𝑓 at the point π‘₯. This transformation might
work in different ways; the rule that assigns 𝑦 to π‘₯ might be given by a formula, or
might be found by doing an experiment or consulting a table or looking at a graph. In
fact, this rule might depend on π‘₯ : for example, the absolute value function is defined
by
|π‘₯| = −π‘₯ if π‘₯ < 0, and |π‘₯| = π‘₯ if π‘₯ ≥ 0,
for all π‘₯ ∈ R. In more extreme examples, you might encounter functions that are given
by a formula for certain values of π‘₯, but by a completely different method for others.
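If you like to see such a piecewise rule in computational form, here is a small sketch in Python (the function name `absolute_value` is mine; Python's built-in `abs` does the same job):

```python
def absolute_value(x):
    """The piecewise rule above: -x when x < 0, and x when x >= 0."""
    if x < 0:
        return -x
    return x

# The two branches of the rule agree with Python's built-in abs.
for x in [-3.5, -1, 0, 2, 7.25]:
    assert absolute_value(x) == abs(x)
```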
Whenever possible, I try to introduce a new function by writing something like 𝑓 : 𝐴 →
𝐡. This has the advantage of specifying the name 𝑓 , the domain 𝐴 and the codomain 𝐡
of the function. However, sometimes I choose to be less formal, and write something
cos π‘₯
. Here I haven’t given the function a name and it isn’t immediately
like 𝑦 =
1 + sin π‘₯
clear what the domain and the codomain are: you somehow have to work this out.
In your first-year courses, you worked mainly with functions whose domains and
codomains were both subsets of R. For such functions, it is often convenient to study
the behaviour of the function by looking at its graph, which is a subset of the cartesian
plane R2 . In MAM2083F, we are more interested in functions whose domains and
codomains are subsets of Rπ‘š or R𝑛 , where either π‘š > 1 or 𝑛 > 1. The graph of such a
function is a subset of Rπ‘š+𝑛 , so it’s not as easy to visualise.
0.2. Limits and continuity
When I started talking about "graphs", you probably thought of a curve in R2 . However,
not all functions are continuous, so even if 𝐴 and 𝐡 are subsets of R, the graph of a
function 𝑓 : 𝐴 → 𝐡 need not be a nice smooth curve in R2 .
The concept of continuity is very closely related to the idea of a limit. In a sense, the
limit of a function 𝑓 : 𝐴 → 𝐡 at a point π‘Ž is the value which the function would have at
the point π‘Ž if it were continuous at π‘Ž. It is useful to think of the function 𝑓 as being some
kind of a "black box", that takes a number π‘₯ as an input and somehow transforms this
into an output 𝑓 (π‘₯). Rather than treat this totally abstractly, let’s imagine that 𝑓 is a
machine, or a piece of electronic circuitry. For example, 𝑓 might be a power supply, that
converts alternating current into direct current. In this analogy, π‘₯ is the input voltage,
while 𝑓 (π‘₯) is the output voltage. Suppose that this power supply is designed to convert
220V of alternating current into 5V of direct current, which you can use to charge your
smartphone. The input is unlikely to be exactly 220V, but fortunately your smartphone
is probably able to tolerate a voltage that is slightly under or over 5V. You would not
be able to use this power supply if its output were outside this range. Even if an input
voltage of exactly 220V produces an output voltage of precisely 5V, you wouldn’t be
happy to use this power supply if you knew that a tiny deviation in the input voltage
could result in an output voltage that is outside the range that your phone can tolerate.
This is the whole idea behind continuity: if the function that relates output voltages to
input voltages "jumps" in some way near 220V, then the power supply can’t be trusted.
Let’s put π‘Ž = 220V and β„“ = 5V. Suppose your phone can tolerate an output voltage
between β„“ − πœ€ and β„“ + πœ€, where πœ€ is a relatively small positive amount, while you know
that the input voltage π‘₯ is certain to lie between π‘Ž − 𝛿 and π‘Ž + 𝛿, where 𝛿 is another
small amount. Thus the question is whether an input voltage that is between π‘Ž − 𝛿 and
π‘Ž + 𝛿 always results in an output voltage between β„“ − πœ€ and β„“ + πœ€.
[Figure: the graph of output voltage against input voltage, with the band from β„“ − πœ€ to β„“ + πœ€ marked on the output axis and the band from π‘Ž − 𝛿 to π‘Ž + 𝛿 marked on the input axis.]
If this isn’t the case, then you can’t trust the power supply. In a sense, πœ€ measures how
much variation your phone can tolerate in the output voltage, while 𝛿 measures how
much variation is acceptable in the input voltage.
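If you want to play with this idea numerically, here is a rough sketch in Python. The transfer function `f` below is invented purely for illustration (a real power supply is certainly not linear like this), and the values of πœ€ and 𝛿 are arbitrary:

```python
def f(x):
    """A made-up, continuous 'power supply' transfer function:
    an input voltage near 220 V is mapped to an output near 5 V."""
    return 5.0 * x / 220.0

a, ell = 220.0, 5.0
eps = 0.1      # how much variation the phone tolerates in the output
delta = 2.0    # how much the input voltage is assumed to vary

# Sample inputs across [a - delta, a + delta] and check whether every
# corresponding output lies within (ell - eps, ell + eps).
inputs = [a - delta + k * (2 * delta) / 1000 for k in range(1001)]
ok = all(abs(f(x) - ell) < eps for x in inputs)
print(ok)
```

For this particular 𝑓, shrinking 𝛿 can always force the outputs into the πœ€-band, which is exactly what continuity at π‘Ž demands.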
Of course, the power supply isn’t actually a "black box": if you could look inside, you
would see that it’s made up of electronic components, such as resistors, capacitors and
so on. A clever engineer should be able to work out how all these components interact
with one another. By consulting the datasheets for these components and applying the
appropriate formulas, this engineer should be able to calculate the output voltage that
the power supply will produce for any given input. With this information, you will be
able to see whether the power supply can be trusted. In a similar way, a function 𝑓 can
often be broken down into simpler parts. A clever mathematician is able to figure out
whether 𝑓 is continuous at π‘₯ = π‘Ž by looking at these parts and how they fit together.
Most of the important results about limits say things like "If 𝑓 and 𝑔 both have limits
at π‘₯ = π‘Ž, then so does 𝑓 + 𝑔, with lim_{π‘₯→π‘Ž} ( 𝑓 (π‘₯) + 𝑔(π‘₯)) = lim_{π‘₯→π‘Ž} 𝑓 (π‘₯) + lim_{π‘₯→π‘Ž} 𝑔(π‘₯)". These
results are all proved using πœ€’s and 𝛿’s, as we have just been discussing. Unfortunately,
you probably weren’t taught these proofs in first year, as some people mistakenly
believe that engineers don’t have to know them. However, I hope that the example
above makes you realise that this isn’t true: on the contrary, engineers are constantly
working with errors and uncertainties and need to know whether a slight variation in
one quantity could result in an unacceptably large variation in another.
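A numerical experiment (which, of course, is not a proof) can make a result like this concrete. In the Python sketch below, 𝑓 (π‘₯) = sin π‘₯/π‘₯ is not even defined at π‘₯ = 0, yet the values of 𝑓 + 𝑔 settle down to lim 𝑓 + lim 𝑔 = 1 + 1 = 2 as π‘₯ → 0:

```python
import math

def f(x):
    return math.sin(x) / x   # undefined at x = 0; the limit there is 1

def g(x):
    return math.cos(x)       # continuous at 0, with g(0) = 1

# Evaluate f + g at inputs approaching a = 0 (never at 0 itself);
# the printed values approach 2.
for h in [1e-1, 1e-3, 1e-6]:
    print(h, f(h) + g(h))
```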
If you know anything at all about alternating current, then you will know that the story
above is simplistic: for example, the input might vary not only in voltage, but also in
frequency. The output might change as certain components heat up, or it might suffer
from "ripple". This means that the input might not depend on a single number π‘₯ and
its output 𝑓 (π‘₯) might have more than one component. This fits in with what we are
going to talk about in MAM2083F: functions which have more than one argument or
whose values are not just a single number.
0.3. Differentiability
You should know from first year that one of the most important applications of limits
is to define the idea of a derivative. I’m not going to repeat most of the story here; in
particular, you should be familiar with the notion of a tangent line and you should
know that the derivative can be interpreted as an "instantaneous rate of change". You
should also know all the rules of differentiation, including the derivatives of the trig
functions sin, cos, tan and sec and the inverse trig functions arcsin, arccos and arctan.
As before, let 𝐴 and 𝐡 be subsets of R.
Definition 0.1. We say that a function 𝑓 : 𝐴 → 𝐡 is differentiable at π‘Ž ∈ 𝐴 if
𝑓 (π‘₯) − 𝑓 (π‘Ž)
π‘₯→π‘Ž
π‘₯−π‘Ž
𝑓 ′(π‘Ž) = lim
exists. If this is the case, then 𝑓 ′(π‘Ž) is known as the derivative of 𝑓 at the point π‘Ž.
3
(1)
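Numerically, you can explore the limit in Definition 0.1 by evaluating the difference quotient for smaller and smaller β„Ž. A sketch in Python (the helper name is mine):

```python
import math

def difference_quotient(f, a, h):
    """(f(a + h) - f(a)) / h, the quantity whose limit is f'(a)."""
    return (f(a + h) - f(a)) / h

# Estimate the derivative of sin at a = 0; the exact value is cos(0) = 1.
for h in [0.1, 0.01, 0.001]:
    print(h, difference_quotient(math.sin, 0.0, h))
```

Making β„Ž too small eventually ruins the estimate through rounding error, which is one reason this is only an illustration of the definition.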
If 𝑓 is differentiable at π‘Ž, then the graph of 𝑦 = 𝑓 (π‘₯) has a tangent line at the point
(π‘Ž, 𝑓 (π‘Ž)). The equation of this tangent line is
𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) .
Note that this line has slope π‘š = 𝑓 ′(π‘Ž).
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) and its tangent line 𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)(π‘₯ − π‘Ž) at the point (π‘Ž, 𝑓 (π‘Ž)).]
An important observation is that near the point (π‘Ž, 𝑓 (π‘Ž)), the graph of 𝑦 = 𝑓 (π‘₯) looks
very much like its tangent line. This is the idea behind the linear approximation
𝑓 (π‘₯) ≈ 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) ,
(2)
which in turn motivates important results such as Taylor’s Theorem. We can make this
"approximation" a little more precise. Suppose for each π‘₯ ∈ 𝐴, we put β„Ž = π‘₯ − π‘Ž, so
that π‘₯ = π‘Ž + β„Ž. As you well know, we can write equation (1) in the form
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
.
β„Ž
β„Ž→0
𝑓 ′(π‘Ž) = lim
For each β„Ž ≠ 0, put
𝑠(β„Ž) = ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)) / β„Ž − 𝑓 ′(π‘Ž).   (3)
Then
lim_{β„Ž→0} 𝑠(β„Ž) = lim_{β„Ž→0} [ ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)) / β„Ž − 𝑓 ′(π‘Ž) ] = 𝑓 ′(π‘Ž) − 𝑓 ′(π‘Ž) = 0.
In other words, 𝑠(β„Ž) → 0 as β„Ž → 0. Remarkably enough, although we’ve used the fact that
𝑓 is differentiable at the point π‘Ž to prove that 𝑠(β„Ž) → 0 as β„Ž → 0, the converse is also
true: we’ll show that if 𝑠(β„Ž) → 0 as β„Ž → 0, then the function 𝑓 is differentiable at π‘Ž.
Before we prove this converse, let’s complete our explanation of how 𝑠(β„Ž) can be used
to make the linear approximation (2) more precise. It’s not difficult to rewrite the
equation
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
𝑠(β„Ž) =
− 𝑓 ′(π‘Ž)
β„Ž
in the form
𝑓 (π‘Ž + β„Ž) = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)β„Ž + 𝑠(β„Ž)β„Ž,
which in turn can be written as
𝑓 (π‘₯) = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž.
(4)
Thus 𝑠(β„Ž)β„Ž is the difference between the actual value of 𝑓 (π‘₯) and the approximate value
𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) predicted by (2).
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) and the tangent line 𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)(π‘₯ − π‘Ž); at π‘₯ = π‘Ž + β„Ž, the vertical gap between the graph and the tangent line is 𝑠(β„Ž)β„Ž.]
Since 𝑓 (π‘₯) is the 𝑦-coordinate of a point on the graph of 𝑦 = 𝑓 (π‘₯) and 𝑓 (π‘Ž)+ 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž)
is the 𝑦-coordinate of the corresponding point on the tangent line, the fact that 𝑠(β„Ž)β„Ž →
0 as π‘₯ → π‘Ž means that the graph and the tangent line get closer and closer together as
you approach the point (π‘Ž, 𝑓 (π‘Ž)). Notice that
𝑓 (π‘₯) − 𝑓 (π‘Ž)
− 𝑓 ′(π‘Ž)
π‘₯−π‘Ž
is the difference between the slope of the secant line joining the points (π‘Ž, 𝑓 (π‘Ž)) and
(π‘₯, 𝑓 (π‘₯)) and the slope of the tangent line through (π‘Ž, 𝑓 (π‘Ž)). The fact that this difference
tends to 0 explains why the secant line tends towards the tangent line as π‘₯ → π‘Ž.
𝑠(β„Ž) =
[Figure: the graph of 𝑦 = 𝑓 (π‘₯), showing the secant line through (π‘Ž, 𝑓 (π‘Ž)) and (π‘₯, 𝑓 (π‘₯)), whose slope is ( 𝑓 (π‘₯) − 𝑓 (π‘Ž))/(π‘₯ − π‘Ž), and the tangent line through (π‘Ž, 𝑓 (π‘Ž)), whose slope is 𝑓 ′(π‘Ž); the horizontal run between the two points is β„Ž = π‘₯ − π‘Ž, and the vertical gap between the two lines at π‘₯ is 𝑠(β„Ž)β„Ž.]
You’ve probably noticed that 𝑠(β„Ž)β„Ž is actually the product of two factors, both of which
tend to 0 as π‘₯ → π‘Ž. This tells us that the difference between 𝑓 (π‘₯) and 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) gets
small very quickly as π‘₯ → π‘Ž. As we shall see, this is actually what makes the function
𝑓 differentiable. It’s time for us to prove the converse result I mentioned earlier. Since
𝑓 ′(π‘Ž) appears in the formula
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
− 𝑓 ′(π‘Ž),
β„Ž
when we use this formula to define 𝑠(β„Ž), we are obviously assuming that 𝑓 is differentiable. Suppose however that we have somehow found a quantity 𝑠(β„Ž) that tends to 0
as β„Ž → 0, and which is such that
𝑠(β„Ž) =
𝑓 (π‘₯) = 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž
(5)
for all π‘₯ ∈ 𝐴 for some constant π‘š. (Remember that β„Ž = π‘₯ − π‘Ž.) You will recognise that
𝑦 = 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž)
is the equation of a line that passes through (π‘Ž, 𝑓 (π‘Ž)) and has slope π‘š. Equation (5)
tells us that the graph of 𝑦 = 𝑓 (π‘₯) gets closer and closer to this line as π‘₯ → π‘Ž. To show
that 𝑓 is differentiable at π‘Ž, we look at
𝑓 (π‘₯) − 𝑓 (π‘Ž)
( 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž) − 𝑓 (π‘Ž)
= lim
π‘₯→π‘Ž
π‘₯→π‘Ž
π‘₯−π‘Ž
π‘₯−π‘Ž
π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž) (π‘₯ − π‘Ž)
= lim
π‘₯→π‘Ž
π‘₯−π‘Ž
= π‘š + lim 𝑠(β„Ž)
lim
π‘₯→π‘Ž
= π‘š.
This calculation shows not only that 𝑓 is differentiable at π‘Ž, but also that 𝑓 ′(π‘Ž) = π‘š.
Thus the line we’ve just been talking about is actually the tangent line. The point
is that we were able to introduce this line, and show that its slope is 𝑓 ′(π‘Ž), without
assuming that the function 𝑓 is differentiable. The fact that 𝑓 is differentiable follows
from equation (5) and the fact that 𝑠(β„Ž) → 0 as β„Ž → 0; notice that we don’t need an explicit
formula for 𝑠(β„Ž).
So we’ve proved the converse. This means that 𝑓 being differentiable at π‘Ž is equivalent
to equation (5) holding for all π‘₯ ∈ 𝐴, where π‘š is a constant and 𝑠(β„Ž) → 0 as β„Ž → 0.
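This characterisation also suggests a numerical test for a candidate slope π‘š: compute 𝑠(β„Ž) = ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž) − π‘šβ„Ž)/β„Ž and watch whether it tends to 0. A sketch in Python (the correct slope for sin at π‘Ž = 0.5 is cos 0.5; the slope 1 is deliberately wrong):

```python
import math

def s(f, a, m, h):
    """s(h) from equation (5): (f(a + h) - f(a) - m * h) / h.
    It tends to 0 as h -> 0 exactly when f'(a) exists and equals m."""
    return (f(a + h) - f(a) - m * h) / h

a = 0.5
for h in [0.1, 0.001, 1e-6]:
    print(h,
          s(math.sin, a, math.cos(a), h),   # correct slope: s(h) -> 0
          s(math.sin, a, 1.0, h))           # wrong slope: s(h) stays away from 0
```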
You might think that this is all very abstract. However, the linear approximation (2)
is one of the most important ideas in Calculus. As I have already mentioned, it is
the motivation for such ideas as Taylor’s Theorem. In the form of equation (4), it can
be used to prove important results such as the Chain Rule. In MAM2083F, we are
going to use these ideas to define the notion of differentiability for functions of several
variables. We will see that for such functions, the idea that "a function is differentiable
if its derivative exists" doesn’t work very well. Instead, we will define differentiability
using something like equation (5).
0.4. The Riemann integral
As you know very well, we use integrals to calculate areas, volumes, arclengths,
averages and so on. Most of the important applications of integrals, such as finding
areas and volumes, involve breaking something complicated into the sum of a
large number of tiny parts and then estimating this sum in some way. In your first-year
courses, one of the most important results you learn is the Fundamental Theorem of
Calculus, which establishes the link between integrals and antiderivatives. In those
courses, you spend a lot of time learning various "techniques of integration", which
are mainly clever tricks for finding antiderivatives. All this is important, but I want
to spend a little time revising how we define the notion of an integral, using Riemann
sums. In MAM2083F we will extend this to functions which are defined on various
subsets of R𝑛 , or which take on values which are vectors in Rπ‘š , but it helps to remind
ourselves how it is done.
In first year, you usually start with the story of calculating the area under the graph of
a function 𝑦 = 𝑓 (π‘₯) on an interval [π‘Ž, 𝑏].
[Figure: the region under the graph of 𝑦 = 𝑓 (π‘₯) between π‘₯ = π‘Ž and π‘₯ = 𝑏.]
So that we can actually get somewhere, let’s assume that 𝑓 : [π‘Ž, 𝑏] → R is a reasonably
nice function, with 𝑓 (π‘₯) ≥ 0 for all π‘₯ ∈ [π‘Ž, 𝑏]. At this stage, I’m not going to explain
what I mean by "reasonably nice": we’ll decide what we actually mean by this later.
The reason why we want 𝑓 (π‘₯) ≥ 0 is so that we can talk about the region bounded by
𝑦 = 𝑓 (π‘₯) and the lines 𝑦 = 0, π‘₯ = π‘Ž and π‘₯ = 𝑏. We’ll drop this requirement quietly as
soon as we no longer need it.
The first step is to divide the interval [π‘Ž, 𝑏] into subintervals [π‘₯0 , π‘₯ 1 ], [π‘₯ 1 , π‘₯ 2 ], [π‘₯2 , π‘₯ 3 ],
. . . , [π‘₯ 𝑛−1 , π‘₯ 𝑛 ]. We can do this by choosing π‘₯ 0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 so that
π‘Ž = π‘₯0 < π‘₯ 1 < π‘₯2 < π‘₯3 < · · · < π‘₯ 𝑛 = 𝑏.
[Figure: the interval [π‘Ž, 𝑏] drawn on the number line, with the partition points π‘Ž = π‘₯ 0 , π‘₯ 1 , . . . , π‘₯ π‘˜−1 , π‘₯ π‘˜ , . . . , π‘₯ 𝑛−1 , π‘₯ 𝑛 = 𝑏 marked on it.]
We call
𝑃 = {π‘₯0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }
a partition of the interval [π‘Ž, 𝑏]. Select a "sample point" π‘₯ ∗π‘˜ from each subinterval
[π‘₯ π‘˜−1 , π‘₯ π‘˜ ] and form the Riemann sum
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 )   (6)
As you know, we can interpret this Riemann sum in terms of areas of rectangles:
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) with the rectangles of a Riemann sum: over each subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ] stands a rectangle of height 𝑓 (π‘₯ ∗π‘˜ ).]
This suggests that the Riemann sum (6) is an approximation to the "area under the
curve".
There are many ways in which to choose a sample point π‘₯ ∗π‘˜ from each subinterval
[π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. In theory, we could choose these sample points at random, but it is natural
to make these choices systematically. For example, we could choose each π‘₯ ∗π‘˜ to be the
right-hand endpoint π‘₯ π‘˜ of the subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. This means that π‘₯ ∗π‘˜ = π‘₯ π‘˜ for all π‘˜ ;
the corresponding Riemann sum is the "right-hand sum"
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) .
Another possibility is to choose each π‘₯ ∗π‘˜ to be the left-hand endpoint π‘₯ π‘˜−1 of the subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. This gives us the "left-hand sum"
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ π‘˜−1 ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) .
You should be familiar with this from first year. Stewart discusses left-hand and right-hand sums in detail and goes on to talk about midpoint sums, etc.
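These sums are easy to compute. Here is a sketch in Python (the helper names are mine); it evaluates the left-hand and right-hand sums of 𝑓 (π‘₯) = π‘₯ on [0, 1], whose exact "area under the curve" is 1/2:

```python
def riemann_sum(f, partition, samples):
    """The Riemann sum (6): the total of f(x_k*) * (x_k - x_{k-1})."""
    return sum(f(s) * (x1 - x0)
               for x0, x1, s in zip(partition, partition[1:], samples))

def left_sum(f, partition):
    return riemann_sum(f, partition, partition[:-1])   # x_k* = x_{k-1}

def right_sum(f, partition):
    return riemann_sum(f, partition, partition[1:])    # x_k* = x_k

# f(x) = x on [0, 1] with a regular partition into 100 subintervals.
P = [k / 100 for k in range(101)]
print(left_sum(lambda x: x, P), right_sum(lambda x: x, P))
```

The two sums straddle the exact value 1/2, one from below and one from above.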
When we started this story, I asked you to assume that the function 𝑓 : [π‘Ž, 𝑏] → R is
"reasonably nice", without explaining exactly what I meant by this. I will need to add
some more conditions later, but for now, let’s now agree that part of being "reasonably
nice" means that 𝑓 (π‘₯) is bounded on the interval [π‘Ž, 𝑏]. This means that we can find
numbers π‘š, 𝑀 such that
π‘š ≤ 𝑓 (π‘₯) ≤ 𝑀
for all π‘₯ ∈ [π‘Ž, 𝑏]. We call these numbers π‘š, 𝑀 bounds on 𝑓 (π‘₯). If 𝑓 is bounded on [π‘Ž, 𝑏],
then it is also bounded on each of the subintervals [π‘₯ π‘˜−1 , π‘₯ π‘˜ ], so that for each π‘˜, we can
find numbers π‘š π‘˜ , 𝑀 π‘˜ such that
π‘š π‘˜ ≤ 𝑓 (π‘₯) ≤ 𝑀 π‘˜
(7)
for all π‘₯ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. I’m going to assume that for each π‘˜, π‘š π‘˜ and 𝑀 π‘˜ are respectively
the biggest and the smallest numbers with these properties, so that if π‘š ′π‘˜ , 𝑀 ′π‘˜ are any
numbers satisfying
π‘š ′π‘˜ ≤ 𝑓 (π‘₯) ≤ 𝑀 ′π‘˜
for all π‘₯ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ], then π‘š ′π‘˜ ≤ π‘š π‘˜ and 𝑀 π‘˜ ≤ 𝑀 ′π‘˜ . For a given function 𝑓 and a given
partition 𝑃, these numbers π‘š π‘˜ and 𝑀 π‘˜ are unique.
Definition 0.2. The lower sum of 𝑓 associated with the partition 𝑃 = {π‘₯0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }
is
𝐿( 𝑓 , 𝑃) = βˆ‘_{π‘˜=1}^{𝑛} π‘š π‘˜ (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ,
while the upper sum is
π‘ˆ( 𝑓 , 𝑃) =
𝑛
Õ
𝑀 π‘˜ (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ,
π‘˜=1
where π‘š π‘˜ and 𝑀 π‘˜ are as defined above.
Suppose now we have chosen a sample point π‘₯ ∗π‘˜ from each subinterval. From (7), it
follows that
π‘š π‘˜ ≤ 𝑓 (π‘₯ ∗π‘˜ ) ≤ 𝑀 π‘˜
for each π‘˜. This means that if
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) is any Riemann sum of 𝑓 associated
with the partition 𝑃 = {π‘₯ 0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }, then
𝐿( 𝑓 , 𝑃) ≤ βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ≤ π‘ˆ( 𝑓 , 𝑃).   (8)
If for each π‘˜, we can find points 𝑒 π‘˜ , 𝑣 π‘˜ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ] with 𝑓 (𝑒 π‘˜ ) = π‘š π‘˜ and 𝑓 (𝑣 π‘˜ ) = 𝑀 π‘˜ ,
then 𝐿( 𝑓 , 𝑃) is the Riemann sum obtained by choosing π‘₯ ∗π‘˜ = 𝑒 π‘˜ for each π‘˜, while π‘ˆ( 𝑓 , 𝑃)
is the Riemann sum obtained by choosing π‘₯ ∗π‘˜ = 𝑣 π‘˜ for each π‘˜. In this case, 𝐿( 𝑓 , 𝑃) is the
smallest Riemann sum of the function 𝑓 associated with the partition 𝑃, while π‘ˆ( 𝑓 , 𝑃)
is the largest. However, in what we are going to do, it’s not necessary to assume that
the points 𝑒 π‘˜ and 𝑣 π‘˜ exist, since we just need the numbers π‘š π‘˜ and 𝑀 π‘˜ .
You might have noticed that I haven’t assumed that the partition 𝑃 is regular, in the
sense that all the subintervals generated by 𝑃 are of the same length. This means
that there need not be any particular relationship between the lengths of the different
subintervals or between the lengths of the subintervals and the number of points in
𝑃. Thus there is no point in simply taking limits as 𝑛 → ∞; we have to be a bit more
clever.
Let’s think about what happens when we use different partitions of the interval [π‘Ž, 𝑏]. Obviously, the number of subintervals might change, and so might their endpoints. Since π‘š π‘˜
and 𝑀 π‘˜ depend on these subintervals, the lower and upper sums will also vary.
Definition 0.3. Let 𝑃 and 𝑄 be two partitions of the same interval [π‘Ž, 𝑏]. We say that
𝑄 is a refinement of 𝑃 if
𝑃 ⊆ 𝑄.
(The symbol "⊆" means "is a subset of".) Sometimes, we prefer to say that the partition
𝑄 is finer than the partition 𝑃 or that 𝑃 is coarser than 𝑄.
Thus 𝑄 is a refinement of 𝑃 if every endpoint of a subinterval in 𝑃 is also an endpoint
of an interval in 𝑄. Each of the subintervals in 𝑃 has either been left unchanged, or
has been cut up into shorter intervals in 𝑄.
[Figure: a partition 𝑃 of an interval, and below it a refinement 𝑄, which contains every point of 𝑃 together with some extra points.]
Notice that any interval [π‘Ž, 𝑏] has a coarsest partition, the partition {π‘Ž, 𝑏}. In a sense,
this isn’t actually a "partition", since it involves only one subinterval, the interval [π‘Ž, 𝑏]
itself. For any partition 𝑃 of [π‘Ž, 𝑏], we have that {π‘Ž, 𝑏} ⊆ 𝑃. Notice that
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , {π‘Ž, 𝑏}) ≤ π‘ˆ( 𝑓 , {π‘Ž, 𝑏}) ≤ 𝑀 (𝑏 − π‘Ž) ,
(9)
where π‘š, 𝑀 are bounds on 𝑓 (π‘₯).
If you draw some pictures, you will see that if 𝑄 is a refinement of the partition 𝑃, then
𝐿( 𝑓 , 𝑃) ≤ 𝐿( 𝑓 , 𝑄) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ π‘ˆ( 𝑓 , 𝑃).   (10)
In other words: if we replace a partition of the interval [π‘Ž, 𝑏] by one of its refinements,
the lower sum can only get bigger and the upper sum can only get smaller. Inequality (8) tells us that any
Riemann sum associated with 𝑄 is "squeezed" between 𝐿( 𝑓 , 𝑄) and π‘ˆ( 𝑓 , 𝑄).
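Here is a small Python check of inequality (10), again using the endpoint shortcut for an increasing function (for such an 𝑓, the numbers π‘š π‘˜ and 𝑀 π‘˜ are attained at the endpoints of each subinterval):

```python
def lower_sum(f, partition):
    # For an increasing f, the smallest value on each subinterval
    # is taken at the left-hand endpoint.
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def upper_sum(f, partition):
    # For an increasing f, the largest value on each subinterval
    # is taken at the right-hand endpoint.
    return sum(f(x1) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def f(x):
    return x * x               # increasing on [0, 2]

P = [0.0, 1.0, 2.0]
Q = sorted(set(P) | {0.5, 1.5})   # a refinement of P, since P is a subset of Q

# Refining the partition raises the lower sum and lowers the upper sum.
assert lower_sum(f, P) <= lower_sum(f, Q) <= upper_sum(f, Q) <= upper_sum(f, P)
print(lower_sum(f, P), lower_sum(f, Q), upper_sum(f, Q), upper_sum(f, P))
```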
Now consider two partitions 𝑃, 𝑄 of [π‘Ž, 𝑏] that are not necessarily refinements of one
another. Notice that 𝑃 ∪ 𝑄 is also a partition of [π‘Ž, 𝑏] and it is finer than both 𝑃 and 𝑄.
Together, (9) and (10) tell us that
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ 𝐿( 𝑓 , 𝑃 ∪ 𝑄) ≤ π‘ˆ( 𝑓 , 𝑃 ∪ 𝑄) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
In particular, this means for any pair of partitions 𝑃, 𝑄,
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
In summary: if we look at finer and finer partitions of the interval [π‘Ž, 𝑏], the associated
lower sums will get bigger and bigger, but they will always be smaller than any one of
the upper sums. In the same way, the upper sums will get smaller and smaller, but they
will always be bigger than any one of the lower sums. Let S̲( 𝑓 ) be the supremum (the
least upper bound) of all the lower sums of 𝑓 on the interval [π‘Ž, 𝑏] and let S̄( 𝑓 ) be the
infimum (the greatest lower bound) of all the upper sums; these always exist, even when
no single lower sum is the largest and no single upper sum is the smallest. You can think
of S̲( 𝑓 ) as being the "limit" of the lower sums and S̄( 𝑓 ) as being the "limit" of the upper
sums, in the sense that 𝐿( 𝑓 , 𝑃) → S̲( 𝑓 ) and π‘ˆ( 𝑓 , 𝑃) → S̄( 𝑓 ) as
we consider finer and finer partitions of [π‘Ž, 𝑏]. For any pair of partitions 𝑃, 𝑄,
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ S̲( 𝑓 ) ≤ S̄( 𝑓 ) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
After all this, we can say what we mean by a function 𝑓 being "reasonably nice" and
define the integral of this function on the interval [π‘Ž, 𝑏] :
Definition 0.4. We say that a function 𝑓 is Riemann integrable on [π‘Ž, 𝑏] if it is bounded
on [π‘Ž, 𝑏] and S̲( 𝑓 ) = S̄( 𝑓 ). If this is the case, then we define the Riemann integral of 𝑓 on
[π‘Ž, 𝑏] to be
∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ = S̲( 𝑓 ) = S̄( 𝑓 ).
It is useful to have an example of a bounded function which is not Riemann integrable:
Example 0.5. Dirichlet’s function on [0, 1] is defined by
π‘Ÿ(π‘₯) = 1 if π‘₯ is rational, and π‘Ÿ(π‘₯) = 0 if π‘₯ is irrational.
This is a "highly discontinuous" function: any interval of the real line contains both
rational and irrational numbers, and thus points where π‘Ÿ(π‘₯) = 1 and points where
π‘Ÿ(π‘₯) = 0. It follows that 𝐿(π‘Ÿ, 𝑃) = 0 and π‘ˆ(π‘Ÿ, 𝑃) = 1 for any partition 𝑃 of [0, 1]. (Can
you see why?) Thus S̲(π‘Ÿ) = 0 < 1 = S̄(π‘Ÿ), and so π‘Ÿ is not Riemann integrable on [0, 1].
One of the basic rules of integration says something like "If 𝑓 and 𝑔 are both Riemann
integrable on [π‘Ž, 𝑏], then so is 𝑓 + 𝑔, with
∫_π‘Ž^𝑏 ( 𝑓 (π‘₯) + 𝑔(π‘₯)) dπ‘₯ = ∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ + ∫_π‘Ž^𝑏 𝑔(π‘₯) dπ‘₯".
You can prove this result by showing that 𝐿( 𝑓 , 𝑃) + 𝐿(𝑔, 𝑃) ≤ 𝐿( 𝑓 + 𝑔, 𝑃) and
π‘ˆ( 𝑓 + 𝑔, 𝑃) ≤ π‘ˆ( 𝑓 , 𝑃) + π‘ˆ(𝑔, 𝑃) for any partition 𝑃. (Be careful: these are inequalities
rather than equalities, since the smallest value of 𝑓 + 𝑔 on a subinterval can be bigger
than the sum of the smallest values of 𝑓 and 𝑔 separately.) From this it follows that
S̲( 𝑓 ) + S̲(𝑔) ≤ S̲( 𝑓 + 𝑔) ≤ S̄( 𝑓 + 𝑔) ≤ S̄( 𝑓 ) + S̄(𝑔); since S̲( 𝑓 ) = S̄( 𝑓 ) and S̲(𝑔) = S̄(𝑔),
all of these quantities must be equal, which is sufficient to prove the result. Most of
the basic rules of integration can be proved in the same way.
If we know that a function 𝑓 is Riemann integrable on [π‘Ž, 𝑏], then we can find the
value of ∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ by calculating either S̲( 𝑓 ) or S̄( 𝑓 ). If necessary, we can find these
by considering a suitable sequence of partitions, for example, the regular partitions of
[π‘Ž, 𝑏].
It can be shown that if 𝑓 is continuous on [π‘Ž, 𝑏], then 𝑓 is Riemann integrable on [π‘Ž, 𝑏] ;
I’m not going to prove this. Neither am I going to give a proof of the Fundamental
Theorem of Calculus, even though this follows very easily from the ideas which we
have just been discussing. You will see that the Fundamental Theorem of Calculus
plays a very important rôle in the later part of this course.