0.1 First-Order Approximations

When we are faced with a function that is too difficult to work with directly, we can sometimes work instead with a simpler function that approximates it. Even though the resulting solutions will only be approximations, approximate solutions can often provide a lot of insight into a problem. In fact, in many situations exact solutions may be impossible to find.

The problem of approximating functions is closely linked to the problem of sampling. For a function defined on the real numbers, we can find the value of the function at any point. However, when we measure a physical quantity, say temperature, it doesn’t come packaged as an explicitly written mathematical function. To construct a function of time we must measure, or sample, the given quantity at a number of instants in time. This leads immediately to the following question: how often should one measure the quantity? Every hour, minute, or second? Even if we measure the temperature every second, there will be infinitely many instants between consecutive measurements at which we do not know the temperature; our function of temperature with respect to time looks like a number of points with gaps in between. Such a function is called a discrete-time function (or just discrete function), usually defined at regular intervals, in contrast to a continuous-time function, which is defined for all possible values of time.

Since no matter how often we sample a physical quantity there will still be gaps between our measurements, anytime we record data from a physical situation for mathematical analysis we are forced to work with discrete functions. Anytime we want to know the value of the function at a point other than a sampling point, we are forced to guess what the value would have been, had we actually measured it at that time. Trying to determine the value of the function at unknown points using known, surrounding points is called interpolation.
Through interpolation we can construct a continuous-time function that approximates our discrete-time function. One of the simplest ways of interpolating between data points is to draw a line segment connecting each pair of successive points. Such a segment is called a chord, and its extension to a line is called a secant line.

Definition 0.1.1 (Secant Line). A line that intersects the graph of a function $f$ at the points $(x_0, y_0)$ and $(x_1, y_1)$ is called the secant line intersecting $f$ at $(x_0, y_0)$ and $(x_1, y_1)$.

Once we’ve found a secant line we can find the chord between two points simply by restricting the domain of the line to the interval between the two points. Recall that for any two distinct points in the Cartesian plane, there is a unique line that passes through both of them. Thus, just knowing two points we can find the secant line passing through them. We generally define such a line using the point-slope form.

Definition 0.1.2 (Point-Slope Form). The equation for the unique line passing through the point $(x_0, y_0)$ with slope $m$, written in point-slope form, is
$$y - y_0 = m(x - x_0).$$

Now let’s suppose that we know the two points $(x_0, y_0)$ and $(x_1, y_1)$ of our discrete function, with $x_0 \neq x_1$. To find the slope of the line passing through them we substitute the point $(x_1, y_1)$ into the point-slope equation, finding that $y_1 - y_0 = m(x_1 - x_0)$, or
$$m = \frac{y_1 - y_0}{x_1 - x_0},$$
which is simply the ratio of the change in output to the change in input. Having found the slope, we find that the equation for the secant line is
$$y - y_0 = \frac{y_1 - y_0}{x_1 - x_0}\,(x - x_0).$$

Having found the secant line intersecting $(x_0, y_0)$ and $(x_1, y_1)$, we can now define our approximating function so that it coincides with the secant line on the interval $[x_0, x_1]$. Continuing in the same fashion for each remaining pair of successive points, we can replace our discrete-time function with a continuous-time approximation, defined piecewise by a number of first-order polynomials.
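The construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `secant_slope` and `interpolate` and the temperature readings are assumptions made up for the example, not part of the text. Each chord is evaluated using the point-slope form with the slope formula derived above.

```python
from bisect import bisect_right


def secant_slope(x0, y0, x1, y1):
    """Slope of the secant line through (x0, y0) and (x1, y1), with x0 != x1."""
    return (y1 - y0) / (x1 - x0)


def interpolate(xs, ys, x):
    """Evaluate the piecewise-linear interpolant of the samples at x.

    xs must be strictly increasing and x must lie in [xs[0], xs[-1]].
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled interval")
    # Index of the left endpoint of the chord containing x
    # (clamped so that x == xs[-1] uses the last chord).
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    m = secant_slope(xs[i], ys[i], xs[i + 1], ys[i + 1])
    # Point-slope form: y - y0 = m (x - x0).
    return ys[i] + m * (x - xs[i])


# Hypothetical hourly temperature readings (time in hours, temperature in degrees).
hours = [0.0, 1.0, 2.0, 3.0]
temps = [10.0, 12.0, 11.0, 15.0]

print(interpolate(hours, temps, 0.5))   # 11.0, halfway along the first chord
print(interpolate(hours, temps, 2.25))  # 12.0, a quarter of the way along the last chord
```

At the sample points themselves the interpolant returns the measured values, so the continuous-time approximation agrees with the discrete-time function wherever the latter is defined.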
Let us return to the problem of approximating a complicated function, such as an exponential function (of any base). As we’ve defined it, such a function is evaluated at a given point by finding the solution to a certain equation (for instance, $a^{1/n}$ is a root of the equation $x^n - a = 0$). We’ve previously seen that we can find such solutions using the bisection method, but doing so takes some work. Thus, exponential functions are reasonable candidates for approximation, as they are difficult to evaluate except at a very limited number of points.

For the sake of illustration, let’s consider a base 2 exponential. We know that $2^0 = 1$ and $2^1 = 2$, so we can use the same method as before to approximate the values of the function between 0 and 1. The slope of the secant between these two points is
$$m = \frac{2 - 1}{1 - 0} = 1,$$
so we find that
$$2^x \approx x + 1, \quad \text{for } 0 \le x \le 1.$$

Figure 1: Graph of the function $2^x$ and the secant line through $(0, 1)$ and $(1, 2)$ approximating it.

This provides us, for instance, with the estimate that $2^{0.5} \approx 1.5$. For such a quick and simple method this is a reasonable approximation to the actual value of $2^{0.5} = \sqrt{2} = 1.414\ldots$. Nevertheless, in most situations we require a higher degree of accuracy. Before looking at higher-order approximations (which we won’t do for some time), let’s further investigate first-order approximations.

Above we decided to use the secant line to make an approximation because it is very simple to calculate and seems to be an obvious first choice. However, that by no means implies it is the best choice. To determine which line is the best choice, we’ll have to further investigate the properties of functions. Let us take a look at the smooth, continuous function $f(x) = x^2$ at the point $x = 1$ (these choices are quite arbitrary - it just happens that this function has the property of interest).
If we restrict our view of this function to very small intervals surrounding the point $x = 1$, we begin to notice something in the behavior of $x^2$ (the easiest way to do this is by looking at a small window on a graphing calculator or computer - see figure 2). The more we zoom in, the more it appears that the function we are looking at is a line. If we look at other points we’ll notice that the function has the same behavior. To classify this behavior we say that, locally, $x^2$ behaves like a first-order polynomial. This means that over a small enough interval, the behavior of the function closely resembles that of a line. A special name is given to the line that best resembles the function in a small neighborhood of the point of interest - the tangent line. It is best to disclaim at the outset two common notions: that a tangent line touches a function at only a single point, and that a tangent line cannot cross a function. These misunderstandings are often used to intuitively motivate the tangent line, but both are untrue.

Figure 2: In the near vicinity of $x = 1$ the behavior of $x^2$ resembles that of a line. The dotted line represents the line that $x^2$ begins to behave like.

We are immediately brought to the question of how to find a tangent line. In order to find such a line we return to the secant lines that we dealt with earlier. Over a large interval, our secant lines give a pretty poor approximation, because there is a lot of distance between the two points of the function connected by the secant line - in such a large space the function can change a lot. However, as we decrease the distance between the points of intersection of a secant line, the secant line becomes a better approximation, at least in the limited region between the two points of intersection and at the points near them.
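The shrinking-interval argument can be checked numerically. The following is a minimal Python sketch rather than anything prescribed by the text: the helper names `square` and `secant_slope` and the particular values of $h$ are assumptions chosen for illustration. It computes the slope of the secant line of $x^2$ through $(1, 1)$ and $(1 + h, (1 + h)^2)$ for successively smaller $h$.

```python
def square(x):
    """The example function f(x) = x^2."""
    return x ** 2


def secant_slope(func, x0, h):
    """Slope of the secant line through (x0, func(x0)) and (x0 + h, func(x0 + h))."""
    return (func(x0 + h) - func(x0)) / h


x0 = 1.0
# Illustrative step sizes; any sequence shrinking toward 0 would do.
for h in [2.0, 1.0, 0.5, 0.1, 0.01, 0.001]:
    print(f"h = {h:<6} secant slope = {secant_slope(square, x0, h):.6f}")
```

For this function the slope works out algebraically to $((1 + h)^2 - 1)/h = 2 + h$, so the printed values (up to floating-point rounding) settle toward 2 as $h$ shrinks.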
The more we decrease the distance between the two points a secant line intersects, the better the approximation becomes in the nearby region (see figure 3). If we look in the limit as the distance between the two points of intersection approaches 0, then we approach the line that provides the best first-order approximation in that region - the tangent line.

Figure 3: Panels (a)-(d) show $x^2$, its tangent line, and the secant line for $h = 2$, $h = 1$, $h = 0.5$, and $h = 0.1$, respectively. As the distance $h$ between the two points the secant line intersects becomes smaller, the secant line more closely represents the tangent line.

Let’s carry out the above argument mathematically in order to find a representation for the tangent line. Consider a function $f$ and a point $x_0$. Since we want the behavior of $f$ near $x_0$ to resemble that of the tangent line, we begin by requiring that the tangent line pass through the point $(x_0, f(x_0))$. This is called the point of tangency, where the tangent line has the same value as the function it is tangent to. Now in order to find the point-slope form for the equation of the tangent line, we just need to find the tangent line’s slope. As stated above, we find the slope of the tangent line through a limiting process governed by secant lines. Consider the secant line passing through the points $(x_0, f(x_0))$ and $(x_0 + h, f(x_0 + h))$, where $h \in \mathbb{R}$, $h \neq 0$. Its slope is given by
$$\frac{f(x_0 + h) - f(x_0)}{h}.$$
In order to find the slope of the tangent line, we simply look at the limit as $h \to 0$. In other words, the slope of the tangent line is given by
$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}.$$
This quantity plays such an important role in calculus that it is given its own name.

Definition 0.1.3 (Derivative at a Point).
The derivative of a function $f$ at the point $x_0$ is denoted by $f'(x_0)$, where
$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}.$$
If the above limit exists, $f$ is said to be differentiable at $x_0$.

Through the above discussion we see that in order to find the line tangent to a function at a given point, we need to evaluate a certain limit. Even though we used the notion of a tangent line to motivate the above definition, we will formally define the tangent line in terms of the derivative.

Definition 0.1.4 (Tangent Line). Let $f$ be differentiable at $x_0$. The line tangent to $f$ at the point $x_0$ is the unique line passing through the point $(x_0, f(x_0))$ with slope $f'(x_0)$.

A function only has a tangent line at a point if it is differentiable at that point. It follows that a function does not behave like a line in the vicinity of points at which it is not differentiable - it instead exhibits some more complicated behavior. There are a number of ways in which a function can fail to be differentiable at a given point. A function is not differentiable at any place it has:

1. A corner. Consider $f(x) = |x|$, which has a corner at $x = 0$. If we look at the secant lines in the limit as $h \to 0$, we see that as $h \to 0^-$ the slope of the secant lines approaches $-1$, and that as $h \to 0^+$ the slope of the secant lines approaches $1$. Since these one-sided limits do not agree, the derivative, and thus the tangent line, does not exist at $x = 0$ (a similar analysis holds for a corner of any function). More intuitively, in the vicinity of $x = 0$, the function $|x|$ does not behave like a line.

2. A cusp. Consider the function $f(x) = \sqrt{|x|}$ at the point $x = 0$ (where it has a cusp). If we look at the secant lines as $h \to 0^+$, their slopes approach $\infty$, and as $h \to 0^-$ their slopes approach $-\infty$. Thus, the limit that defines the slope of the tangent line doesn’t exist, so the function is not differentiable at $x = 0$ (and so it has no tangent line there).

3. A vertical tangent.
A vertical tangent line exists when a function locally behaves like a line, but the line it behaves like is vertical. Under the definition above, a vertical tangent line is not really considered a tangent line, because it occurs at a point where the function is not differentiable. Similar to the above example, a vertical tangent line occurs if the slopes of the secant lines approach $\infty$ from both sides, or $-\infty$ from both sides, as $h \to 0$. An example of a function with a vertical tangent is $(2 - x)^{1/5}$, at the point $x = 2$.

4. A discontinuity. If we have a point discontinuity, the secant lines behave as they do at a cusp. A function with a jump discontinuity will behave differently on the two sides of the point of interest. For instance, the function defined by $f(x) = |x|/x$ for $x \neq 0$ and $f(0) = 1$ has a jump discontinuity at $x = 0$. From the right the secant lines are horizontal (slope 0), while from the left their slopes grow without bound, so the secant lines approach a vertical line. Thus, the derivative does not exist at this point (nor does a tangent line).

There is one subtlety in the above statements that we will repeat for emphasis. Above it is said that a function is not differentiable at any place it has a discontinuity. The contrapositive of this statement is that at any point where a function is differentiable, it must be continuous. Thus, if we know that a function is differentiable then we immediately know it is continuous; that is, differentiability implies continuity. This subtle fact can come in very handy at times.

Although we did not explicitly say so at the time, the argument we made above with secant and tangent lines is essentially the same argument we made in motivating the concept of the limit, using average and instantaneous velocity. In fact, the slope of a secant line is the same as the average rate of change of a function over the interval defined by the two points of intersection. Similarly, the slope of the line tangent to a function at a point is the same as the instantaneous rate of change of the function at that point. Thus, the derivative of a function at a point has many meanings.
The derivative of a function $f$ at a point $x_0$ can be interpreted as:

1. The instantaneous rate of change of $f$ at the point $x_0$.
2. The slope of the line obtained as the limit, as $h \to 0$, of the secant lines passing through $(x_0, f(x_0))$ and $(x_0 + h, f(x_0 + h))$.
3. The slope of the line that best approximates the function $f$ in the vicinity of $x_0$, the tangent line at $x_0$.
4. The slope of the function $f$ at the point $x_0$.

From here on we will begin the study of differential calculus, where our primary object of study is the derivative.
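To make the definition of the derivative and its failure cases concrete, here is a small numerical sketch in Python. It is only an approximation of the limiting process: the names `difference_quotient`, `one_sided_slopes`, and `tangent_line`, and the step size `h = 1e-6`, are assumptions introduced for this illustration. A small fixed $h$ stands in for the limit $h \to 0$, so the results suggest, but do not prove, whether a derivative exists.

```python
import math


def difference_quotient(func, x0, h):
    """Slope of the secant line through (x0, func(x0)) and (x0 + h, func(x0 + h))."""
    return (func(x0 + h) - func(x0)) / h


def one_sided_slopes(func, x0, h=1e-6):
    """Secant slopes just to the left (step -h) and just to the right (step +h) of x0."""
    return difference_quotient(func, x0, -h), difference_quotient(func, x0, h)


def tangent_line(func, x0, h=1e-6):
    """Approximate tangent line at x0 in point-slope form, y = f(x0) + m (x - x0).

    The slope m is a secant slope with small h, an assumption standing in
    for the true limit f'(x0).
    """
    m = difference_quotient(func, x0, h)
    y0 = func(x0)
    return lambda x: y0 + m * (x - x0)


# Smooth case: the one-sided slopes of x^2 at x = 1 nearly agree (both close to 2).
print(one_sided_slopes(lambda x: x ** 2, 1.0))

# Corner: the one-sided slopes of |x| at 0 disagree (-1 from the left, 1 from the right).
print(one_sided_slopes(abs, 0.0))

# Cusp: the one-sided slopes of sqrt(|x|) at 0 are large with opposite signs.
print(one_sided_slopes(lambda x: math.sqrt(abs(x)), 0.0))

# Near x = 1 the approximate tangent line stays close to x^2.
line = tangent_line(lambda x: x ** 2, 1.0)
print(line(1.1), 1.1 ** 2)
```

When the two one-sided slopes agree to within rounding, the function plausibly has a tangent line at that point; when they disagree or blow up, the output mirrors the corner and cusp cases described earlier.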