Visualizing and Understanding the Components of Lagrange and Newton Interpolation

One of the standard topics in numerical analysis courses is the subject of interpolation, with particular emphasis on the Lagrange and Newton interpolating formulas. In both cases, the usual approach is highly computational as one works to construct the polynomial of degree $n$ that passes through $n+1$ interpolating points. However, as Richard Hamming, one of the giants of modern numerical analysis, put it, “The purpose of computing is insight, not numbers” [3]. In this article, we will look graphically at the functional components of each of these two interpolating formulas to see the kinds of deeper insights that can be achieved.

The Lagrange Interpolating Formula

Suppose that we have the $n+1$ points $(x_0, y_0)$, $(x_1, y_1)$, …, $(x_n, y_n)$, where all of the $x_i$ are different, though not necessarily uniformly spaced. These interpolating points then determine a unique polynomial of degree $n$ (or possibly lower, if the points happen to lie on such a curve). One way to express the equation of this polynomial is with the Lagrange Interpolating Formula:

$$L_n(x) = \sum_{i=0}^{n} y_i \prod_{k=0,\,k\ne i}^{n} \frac{x - x_k}{x_i - x_k} = \sum_{i=0}^{n} y_i\, \frac{(x-x_0)\cdots(x-x_{i-1})(x-x_{i+1})\cdots(x-x_n)}{(x_i-x_0)\cdots(x_i-x_{i-1})(x_i-x_{i+1})\cdots(x_i-x_n)}.$$

Equivalently, if we write this formula without the summation notation, it becomes

$$L_n(x) = y_0\,\frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)} + y_1\,\frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)} + \cdots + y_n\,\frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}.$$

Notice that this polynomial is composed of $n+1$ distinct polynomial terms, each of degree $n$ (provided $y_i \ne 0$). Either way, both of these are rather daunting expressions for students, and consequently it is not surprising that many tend to miss some of the key underlying concepts.

As an example, consider the three interpolating points $(1, 2)$, $(3, 8)$, and $(5, 4)$ that determine the quadratic function whose graph is shown in Figure 1.
Clearly, this function passes through each of the three interpolating points.

[Figure 1: The quadratic interpolating polynomial $L_2(x)$ through the points $(1, 2)$, $(3, 8)$, and $(5, 4)$]

The corresponding expression for this interpolating polynomial is

$$\begin{aligned} L_2(x) &= 2\,\frac{(x-3)(x-5)}{(1-3)(1-5)} + 8\,\frac{(x-1)(x-5)}{(3-1)(3-5)} + 4\,\frac{(x-1)(x-3)}{(5-1)(5-3)} &\qquad (1)\\ &= \frac{1}{4}(x-3)(x-5) - 2(x-1)(x-5) + \frac{1}{2}(x-1)(x-3). &\qquad (2) \end{aligned}$$

More to the point here, this function is a linear combination of three distinct quadratic functions, shown in the first expression. Let’s see how each of these functions behaves, as shown in Figure 2. (The heavier fourth curve shown is the Lagrange interpolating polynomial.) Although the shapes of two of the curves totally miss the shape of the interpolating polynomial, each of the three quadratics passes through just one of the interpolating points. Moreover, not only does each one completely miss the other two interpolating points, but also each quadratic has its real zeros precisely in line vertically with the other two interpolating points. In addition, observe that each pair of quadratic components shares a common zero. To see why, notice that the component quadratic functions are constructed in such a way that each of them has a pair of factors that correspond to two of the three interpolating points. It is no coincidence that there is such a correspondence between the zeros and the interpolating points.

[Figure 2: The three component quadratic functions $y_0(x) = 2\,\frac{(x-3)(x-5)}{(1-3)(1-5)}$, $y_1(x) = 8\,\frac{(x-1)(x-5)}{(3-1)(3-5)}$, and $y_2(x) = 4\,\frac{(x-1)(x-3)}{(5-1)(5-3)}$, plotted together with $L_2(x)$]

Furthermore, look at the first term in the first expression (1) for $L_2(x)$. Corresponding to the first interpolating point $(1, 2)$, the coefficient of $\frac{(x-3)(x-5)}{(1-3)(1-5)}$ is precisely equal to the value, 2, of $y$ at this point.
Also, at this point where $x = 1$, the factors in the numerator precisely cancel the factors in the denominator, so the total contribution of this first term at $x = 1$ is $y = 2$. Similarly, the second term in the expression (1) for $L_2(x)$ is constructed in such a way that when $x = 3$, we have $y = 8$, and so on for the third term. In general, for any $n$, the terms in the Lagrange formula are constructed in such a way that each one has zero contribution at all but one of the interpolating points and contributes precisely the given value of $y$ at the remaining interpolating point.

The Newton Interpolating Formula

Again, suppose that we have the $n+1$ points $(x_0, y_0)$, $(x_1, y_1)$, …, $(x_n, y_n)$, where all of the $x_i$ are different. For simplicity, we consider the case where these $x_i$’s are uniformly spaced with $\Delta x = x_{k+1} - x_k$ for each $k$. These interpolating points determine a unique polynomial of degree $n$ (or possibly lower, if the points happen to lie on such a curve). Another way to express the equation of this polynomial is with the Newton Forward Interpolating Formula:

$$P_n(x) = y_0 + \frac{\Delta y_0}{\Delta x}(x-x_0) + \frac{\Delta^2 y_0}{2!(\Delta x)^2}(x-x_0)(x-x_1) + \frac{\Delta^3 y_0}{3!(\Delta x)^3}(x-x_0)(x-x_1)(x-x_2) + \cdots + \frac{\Delta^n y_0}{n!(\Delta x)^n}(x-x_0)(x-x_1)\cdots(x-x_{n-1}), \qquad (3)$$

where

$$\begin{aligned} \Delta y_0 &= y_1 - y_0,\\ \Delta^2 y_0 &= \Delta y_1 - \Delta y_0 = y_2 - 2y_1 + y_0,\\ \Delta^3 y_0 &= \Delta^2 y_1 - \Delta^2 y_0 = y_3 - 3y_2 + 3y_1 - y_0,\\ &\;\;\vdots\\ \Delta^n y_0 &= \Delta^{n-1} y_1 - \Delta^{n-1} y_0 = y_n - n\,y_{n-1} + \frac{n(n-1)}{2!}\,y_{n-2} - \frac{n(n-1)(n-2)}{3!}\,y_{n-3} + \cdots + (-1)^n y_0. \end{aligned}$$

Notice that this polynomial $P_n(x)$ is also composed of $n+1$ distinct polynomial terms, but each of degree $i$, $i = 0, 1, 2, \ldots, n$. Term by term, each polynomial is of one degree higher than the previous one.

At a quick glance, the above formula is obviously very similar to the formula for the $n$th degree Taylor polynomial approximation for a function $y = f(x)$ at $x = x_0$:

$$T_n(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(x_0)}{2!}(x-x_0)^2 + \frac{f'''(x_0)}{3!}(x-x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x-x_0)^n. \qquad (4)$$

Let’s see just how close the two are. Consider what happens to Newton’s interpolating formula (3) as the step size $\Delta x \to 0$.
Clearly, the quantities $\frac{\Delta^i y_0}{(\Delta x)^i}$ in the polynomial expression (3) converge toward the successive derivatives $f^{(i)}(x_0)$ of the function at $x = x_0$, provided the function is sufficiently differentiable. Moreover, as $\Delta x \to 0$, all of the interpolating points approach $x = x_0$, though they do retain the uniform spacing. As all the interpolating points coalesce at $x = x_0$, each of the factors $(x - x_k)$ converges toward $(x - x_0)$, and so the products of these factors approach the successive powers of $(x - x_0)$. Thus, the Taylor polynomial of degree $n$ for a function is the limit of the Newton interpolating polynomials as $\Delta x \to 0$.

We next consider how Newton’s interpolating formula comes about (which might reflect Newton’s own thought process in originally developing the formula). Suppose we start with the first two points $(x_0, y_0)$ and $(x_1, y_1)$, which determine a line whose algebraic representation can be written

$$y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}(x - x_0) = y_0 + \frac{\Delta y_0}{\Delta x}(x - x_0).$$

Now suppose we also have the third point $(x_2, y_2)$. Unless the three points happen to be collinear, which is highly unlikely, the line determined by the first two points will miss the third point. In particular, if we extend that line until it reaches $x = x_2 = x_0 + 2\Delta x$, then the height along the line is

$$y = y_0 + \frac{\Delta y_0}{\Delta x}(x_2 - x_0) = y_0 + \frac{y_1 - y_0}{\Delta x}(x_0 + 2\Delta x - x_0) = y_0 + \frac{y_1 - y_0}{\Delta x}(2\Delta x) = 2y_1 - y_0.$$

The question we pose is: How can we adapt the above linear function to create a quadratic function that also passes through the third point? To do so, we want to introduce a quadratic term that forces the entire quadratic function to pass through the third point while maintaining the same two values $y_0$ and $y_1$ at $x = x_0$ and $x = x_1$, respectively. To do this, we write the quadratic polynomial in the form

$$y = y_0 + \frac{y_1 - y_0}{\Delta x}(x - x_0) + A(x - x_0)(x - x_1),$$

where $A$ is some constant to be determined. Notice that the presence of the factors $(x - x_0)$ and $(x - x_1)$ guarantees that the quadratic term has zero contribution at the first two interpolating points.
To determine $A$, we impose the condition that the value of the quadratic function when $x = x_2$ must be $y = y_2$. We therefore obtain

$$y = y_0 + \frac{y_1 - y_0}{\Delta x}(x_0 + 2\Delta x - x_0) + A(x_0 + 2\Delta x - x_0)(x_0 + 2\Delta x - x_1) = y_0 + \frac{y_1 - y_0}{\Delta x}(2\Delta x) + A(2\Delta x)(\Delta x) = y_2.$$

Consequently,

$$2A(\Delta x)^2 = y_2 - 2(y_1 - y_0) - y_0 = y_2 - 2y_1 + y_0,$$

and so

$$A = \frac{y_2 - 2y_1 + y_0}{2(\Delta x)^2}.$$

Notice that the numerator $y_2 - 2y_1 + y_0$ is the second difference $\Delta^2 y_0 = \Delta y_1 - \Delta y_0$. Therefore, we write

$$A = \frac{\Delta^2 y_0}{2(\Delta x)^2}$$

and

$$y = y_0 + \frac{\Delta y_0}{\Delta x}(x - x_0) + \frac{\Delta^2 y_0}{2(\Delta x)^2}(x - x_0)(x - x_1).$$

When doing this in class, we suggest asking the students to extend this argument to derive the cubic Newton interpolating formula.

Let’s find the interpolating quadratic using the Newton interpolating formula for the above example where the three interpolating points are $(1, 2)$, $(3, 8)$, and $(5, 4)$. Here $\Delta x = 2$, $\Delta y_0 = 6$, and $\Delta^2 y_0 = -10$, so we have

$$P_2(x) = 2 + \frac{6}{2}(x-1) + \frac{-10}{2!\,2^2}(x-1)(x-3) = 2 + 3(x-1) - \frac{5}{4}(x-1)(x-3).$$

Notice that the sum of the first two terms, $P_1(x) = 2 + 3(x-1)$, determines the linear function that passes through the first two points $(1, 2)$ and $(3, 8)$. Figure 3 shows the graphs of $P_1(x)$ and $P_2(x)$. Let’s include the graph of the quadratic term, denoted by $C_2(x) = -\frac{5}{4}(x-1)(x-3)$, to see how the quadratic term alone changes the interpolation polynomial $P_1(x)$ based on the first two points into the interpolation polynomial $P_2(x)$ based on all three points. Figure 4 shows that the quadratic component $C_2(x)$ has two real zeros precisely in line vertically with the first two interpolating points. Therefore, $C_2(x)$ automatically contributes zero at $x = 1$ and $x = 3$, so that $P_2(x) = P_1(x) + C_2(x) = P_1(x)$ at $x = 1$ and $x = 3$, which implies that the presence of $C_2(x)$ in $P_2(x)$ does not alter the perfect fit of the first two points by $P_1(x)$. In addition, observe that at $x = 5$, the value of $C_2(5)$ is the amount that “bends” the linear function $P_1(x)$ at $x = 5$ in order for $P_2(x)$ to pass through the third interpolating point $(5, 4)$.
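The construction above can be checked numerically. The following is a minimal Python sketch, assuming uniformly spaced $x$-values as in formula (3); the function names `forward_differences` and `newton_forward` are illustrative choices, not part of the article. It builds the forward differences for the three example points and evaluates $P_1$, $P_2$, and the quadratic correction $C_2 = P_2 - P_1$ at each interpolating point.

```python
from math import factorial


def forward_differences(ys):
    """Return [y0, Δy0, Δ²y0, ...], the top entry of each column
    of the forward difference table."""
    diffs = [ys[0]]
    row = list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def newton_forward(xs, ys, x, degree=None):
    """Evaluate the Newton forward polynomial of the given degree at x.

    Assumes the xs are uniformly spaced (hypothetical helper for this sketch).
    """
    dx = xs[1] - xs[0]
    diffs = forward_differences(ys)
    if degree is None:
        degree = len(xs) - 1
    total = 0.0
    product = 1.0  # running product (x - x0)(x - x1)...(x - x_{i-1})
    for i in range(degree + 1):
        total += diffs[i] / (factorial(i) * dx**i) * product
        product *= x - xs[i]
    return total


xs, ys = [1, 3, 5], [2, 8, 4]
print(forward_differences(ys))  # [2, 6, -10]
for x in xs:
    p1 = newton_forward(xs, ys, x, degree=1)
    p2 = newton_forward(xs, ys, x, degree=2)
    # The last column is C2(x): zero at x = 1 and x = 3, nonzero at x = 5.
    print(x, p1, p2, p2 - p1)
```

Keeping a running product of the factors means each added term reuses the work done for the previous one, which mirrors the way each Newton polynomial extends the one before it.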
[Figure 3: Two successive Newton interpolating polynomials $P_1(x)$ and $P_2(x)$]

[Figure 4: How the quadratic term $C_2(x)$ affects the linear interpolation $P_1(x)$. The plot marks two vertical segments of equal length and shows $C_2(x) = -\frac{5}{4}(x-1)(x-3)$ together with $P_2(x) = P_1(x) + C_2(x)$.]

To better understand the process of finding the interpolating polynomial by using the Newton formula, we expand the data set of $(1, 2)$, $(3, 8)$, and $(5, 4)$ to include two additional points (7, 2) , (9, 22) . These five points determine a quartic polynomial. We show these successive Newton interpolating polynomials in Figure 5. Once again, notice that the linear function passes through the first two points, though it completely misses all the remaining points. The quadratic function passes through the first three points, but then misses all the subsequent points. The cubic function passes through the first four points, but comes nowhere near the final point. It is only the quartic function that passes through all five points.

[Figure 5: Successive Newton interpolating polynomials $P_1(x)$ through $P_4(x)$, up to degree 4]

The approach used to introduce Newton’s formula in many numerical analysis texts tends to be rather abstract and, as such, conveys little in the way of understanding to many students. Moreover, some texts give a somewhat misleading image of what is actually happening; they tend to say something to the effect that, if you have $n+1$ interpolating points, you usually need a polynomial of degree $n$ to fit them, and any lower degree polynomial misses the points. The “derivation” and examples used above actually suggest that this is not exactly the truth.
It is misleading to suggest that the lower degree polynomials miss the points altogether; each of the successive Newton polynomials misses only the points further to the right.

Interpolation and Approximation of Functions

Interpolation provides an important tool for approximating a function. When we work with a function that involves more than the basic operations, we may want to replace it with a polynomial for inexpensive and quick computations. In this case, what often comes to mind is the Taylor approximation of a function, provided that the function is sufficiently differentiable. Because of the similarity between the Taylor approximation and the Newton formula, we now consider the interpolating polynomials in the Newton form for the approximation of functions.

As an example, we approximate the sine function $y = \sin x$ on the interval $[0, 2\pi]$. Let’s interpolate the sine function at the five uniformly spaced points $(0, 0)$, $(\pi/2, 1)$, $(\pi, 0)$, $(3\pi/2, -1)$, and $(2\pi, 0)$. Just as we did earlier, we construct the successive Newton interpolating polynomials, shown in Figure 6. Notice that the quadratic interpolating polynomial $P_2(x)$ fits the sine function reasonably well on the interval $[0, \pi]$. As expected, the quadratic polynomial $P_2(x)$ misses all the points on the right. The cubic polynomial $P_3(x)$ and quartic polynomial $P_4(x)$ are identical because the fourth difference $\Delta^4 y_0$ is zero. This is an advantage of the Newton formula: it reveals the degree of the interpolating polynomial as we construct it. Had we used the Lagrange formula, we would only learn the degree of the interpolating polynomial after simplifying the expression for $L_4(x)$. Overall, $P_3(x)$ approximates the sine function reasonably well on the entire interval $[0, 2\pi]$.

[Figure 6: Successive Newton interpolating polynomials $P_1(x)$, $P_2(x)$, and $P_3(x)$ for the sine function $f(x)$]
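The vanishing fourth difference can be confirmed directly from the sample values. The sketch below is a hedged illustration in Python: it takes the exact sample values of the sine function at the five points (using $\sin 2\pi = 0$ for the last one), builds the forward difference table, and estimates the largest error of each successive Newton polynomial on a grid over $[0, 2\pi]$. The helper `p` and the 401-point grid are assumptions made for this sketch only.

```python
import math

# Exact sine values at 0, π/2, π, 3π/2, 2π (assumed exact rather than computed,
# so that the differences are free of rounding noise).
ys = [0.0, 1.0, 0.0, -1.0, 0.0]
dx = math.pi / 2
xs = [k * dx for k in range(5)]

# Build the full forward difference table, one list per order.
table = [ys]
while len(table[-1]) > 1:
    prev = table[-1]
    table.append([b - a for a, b in zip(prev, prev[1:])])

top = [row[0] for row in table]  # y0, Δy0, Δ²y0, Δ³y0, Δ⁴y0
print(top)  # [0.0, 1.0, -2.0, 2.0, 0.0]


def p(x, degree):
    """Newton forward polynomial of the given degree (hypothetical helper)."""
    total, product = 0.0, 1.0
    for i in range(degree + 1):
        total += top[i] / (math.factorial(i) * dx**i) * product
        product *= x - xs[i]
    return total


# Largest absolute error of each polynomial on a uniform grid over [0, 2π].
grid = [2 * math.pi * k / 400 for k in range(401)]
for degree in (1, 2, 3, 4):
    worst = max(abs(p(x, degree) - math.sin(x)) for x in grid)
    print(degree, worst)
```

Because the last entry of `top` is zero, the degree 4 evaluation adds nothing to the degree 3 one, so the last two rows of the printed errors coincide.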
On the other hand, the cubic Taylor polynomial for $y = \sin x$ at $x = 0$, $T_3(x) = x - \frac{x^3}{3!}$, is shown in Figure 7 along with the sine function and the cubic interpolating polynomial. We see that the Taylor approximation achieves high accuracy between $x = 0$ and roughly $x = 2$; the difference between the cubic Taylor polynomial and the sine function then grows significantly as $x$ moves to the right beyond $x = 2$. In this example, the interpolating polynomial gives a better approximation of the function over a larger interval than the Taylor polynomial of the same degree.

[Figure 7: Cubic Taylor polynomial $T_3(x)$ and cubic interpolating polynomial $P_3(x)$ of the sine function $f(x)$]

Comparisons Between Lagrange and Newton Interpolation

The Lagrange and Newton interpolating formulas provide two different forms for an interpolating polynomial, even though the interpolating polynomial is unique. When we want a quick symbolic expression of the interpolating polynomial, the Lagrange formula seems to be the way to go. For this reason, the Lagrange form is most often used for deriving formulas for approximating derivatives and integrals. For example, many numerical analysis textbooks (for example, [1] and [2]) establish the trapezoidal rule and Simpson’s rule by using the Lagrange formula for linear and quadratic interpolating polynomials, respectively, to approximate the integrand.

However, the Newton formula is much better for computation than the Lagrange formula. When using interpolating polynomials to work with functions that are stored in tabular form, we often choose the Newton formula. As we will show below, the forward differences $\Delta^n y_0$ that determine the coefficients of the Newton formula can be easily constructed using a tabular form. More importantly, the Newton formula provides a generally accurate idea of when the degree $n$ is sufficiently large by observing the size of the terms with higher-order forward differences.
This is a useful technique in deciding what degree polynomial to use.

Suppose we are given five points $(2.0, 1.414214)$, $(2.1, 1.449138)$, $(2.2, 1.483240)$, $(2.3, 1.516575)$, and $(2.4, 1.549193)$, which are based on an example in [1]. We construct the forward difference table for these five points, shown in Table 1.

Table 1 Forward difference table for the five points

i    x_i    y_i         Δy_i        Δ²y_i        Δ³y_i       Δ⁴y_i
0    2.0    1.414214    0.034924    -0.000822    0.000055    -0.000005
1    2.1    1.449138    0.034102    -0.000767    0.000050
2    2.2    1.483240    0.033335    -0.000717
3    2.3    1.516575    0.032618
4    2.4    1.549193

The last five entries in the first row are used to determine the coefficients of the Newton interpolating polynomial

$$P_4(x) = 1.414214 + 0.34924(x-2.0) - 0.0411(x-2.0)(x-2.1) + 0.009167(x-2.0)(x-2.1)(x-2.2) - 0.002083(x-2.0)(x-2.1)(x-2.2)(x-2.3).$$

By applying the usual optimization approach from Calculus I to the fourth degree polynomial term $-0.002083(x-2.0)(x-2.1)(x-2.2)(x-2.3)$ on the interval $[2.0, 2.4]$, we find that

$$\max_{2.0 \le x \le 2.4} \left| (x-2.0)(x-2.1)(x-2.2)(x-2.3) \right| = 0.0024.$$

Then the largest possible contribution of the last term of $P_4(x)$ to the interpolating polynomial at any point in the interval $[2.0, 2.4]$ is roughly $0.002083 \times 0.0024 \approx 5 \times 10^{-6}$.

This result may be improved on by following a common practice used in approximating functions with the Newton formula. When we want to approximate the function at a point $x$ that is inside the first half of the interval, we use the above Newton forward formula. Otherwise we use the Newton backward formula or, equivalently, we apply the Newton forward formula to the same table where the entries are listed in reverse order. Keeping the original notation for the interpolating points $(x_i, y_i)$ for $i = 0, 1, 2, \ldots, n$, we define the backward differences as follows. Let $\nabla y_n = y_n - y_{n-1}$, $\nabla^2 y_n = \nabla(\nabla y_n) = \nabla y_n - \nabla y_{n-1} = y_n - 2y_{n-1} + y_{n-2}$, and in general, $\nabla^i y_n = \nabla(\nabla^{i-1} y_n) = \nabla^{i-1} y_n - \nabla^{i-1} y_{n-1}$, for $i > 1$.
Then the Newton Backward Interpolating Formula can be expressed as

$$P_n(x) = y_n + \frac{\nabla y_n}{\Delta x}(x-x_n) + \frac{\nabla^2 y_n}{2!(\Delta x)^2}(x-x_n)(x-x_{n-1}) + \frac{\nabla^3 y_n}{3!(\Delta x)^3}(x-x_n)(x-x_{n-1})(x-x_{n-2}) + \cdots + \frac{\nabla^n y_n}{n!(\Delta x)^n}(x-x_n)(x-x_{n-1})\cdots(x-x_1).$$

Assume that we want to approximate the function between 2.0 and 2.2, the first half of the interval $[2.0, 2.4]$. Now the maximum absolute value of the last term of $P_4(x)$ on $[2.0, 2.2]$ is only about $2 \times 10^{-7}$. Since our data points are given to six decimal places, the Newton forward formula of order greater than three won’t increase the accuracy of the approximation for $x$ in the interval $[2.0, 2.2]$. Therefore, we have achieved the desired level of accuracy by using polynomials of degree three (or even lower). Table 2 gives the approximations of $f(2.03)$, $f(2.09)$, and $f(2.15)$ by using $P_1(x)$, $P_2(x)$, $P_3(x)$, and $P_4(x)$. The identical values of $P_3(x)$ and $P_4(x)$ at $x = 2.03$, $x = 2.09$, and $x = 2.15$ confirm the above observation.

A similar analysis can be given for the approximation of the function between 2.2 and 2.4 using the Newton backward formula. We present the results of the approximations of $f(2.24)$, $f(2.31)$, and $f(2.37)$ using the Newton backward formula in Table 3.

Table 2 Example of use of Newton forward formula

x_i     P_1(x_i)    P_2(x_i)    P_3(x_i)    P_4(x_i)    f(x_i)
2.03    1.424691    1.424777    1.424780    1.424780    1.424780
2.09    1.445646    1.445683    1.445684    1.445684    1.445684
2.15    1.466600    1.466292    1.466289    1.466289    1.466289

Table 3 Example of use of Newton backward formula

x_i     P_1(x_i)    P_2(x_i)    P_3(x_i)    P_4(x_i)    f(x_i)
2.24    1.497004    1.496660    1.496663    1.496663    1.496663
2.31    1.519837    1.519869    1.519868    1.519868    1.519868
2.37    1.539408    1.539483    1.539480    1.539480    1.539480

In contrast, the Lagrange interpolation approach requires far more computation: each time you increase the number of interpolating points by one, you have to recalculate everything.
This makes Lagrange interpolation less convenient for seeking the lowest degree interpolating polynomial that fits the data with a given error tolerance.

Interpolation and Regression

From time to time, we may have a large set of data. If we have $n+1$ points (where $n$ is large), the interpolating polynomial is of degree $n$, presuming that the points do not fall onto a polynomial of lower degree. This high degree polynomial is an exact match to the data points, but can be a very poor match between those points. This can happen because the polynomial may change direction up to $n-1$ times. In the process, the interpolating polynomial may shoot way up or down after passing through each interpolating point in order to reach the next turning point and come back down or up to hit the next interpolating point. We illustrate such a case in Figure 8, where the interpolating points are (0, 5) , (1, 9) , (2, 10) , (3, 40) , (4, 5) , (5, 50) and (6, 29) . The problematic portions of the interpolating polynomial are between the first two interpolating points and between the last two interpolating points.

Such oscillatory behavior may dramatically affect the accuracy of the approximation between interpolating points and make the approximation very sensitive to any changes in the interpolating points. Moreover, there is likely a high computational cost in using the interpolating function. If we opt for a lower degree polynomial by using the method we discussed above, we will have to go back to the data repeatedly to select a subset of interpolating points, according to the value of $x$ of interest, in order to get a good approximation.

[Figure 8: Oscillatory behavior of interpolating polynomial]

If the exact fit is not the only concern, we may overcome these difficulties by finding a lower degree polynomial that will give reasonable accuracy. One way to find such a lower degree polynomial is to use regression.
A regression polynomial only attempts to capture the overall trend in a set of data and, as such, can potentially give much better approximations between the data points, even though it doesn’t necessarily pass through any of them.

Concluding Remarks

We began this article by quoting Richard Hamming’s famous statement about the purpose of computing being insight. While interpolation methods are too often considered simply as computational procedures, we hope that our emphasis on the components of those formulas provides much in the way of insight into where those formulas came from, why they work, and how they should be used.

References

[1] Atkinson, K. 1988. An Introduction to Numerical Analysis, 2nd Ed. New York: John Wiley & Sons.
[2] Burden, R. and Faires, J. 2010. Numerical Analysis, 9th Ed. Boston: Brooks/Cole.
[3] Hamming, R. 1987. Numerical Methods for Scientists and Engineers, 2nd Ed. New York: Dover Publications.

Abstract

This article takes a close look at Lagrange and Newton interpolation by examining graphically the components of each of these formulas. While interpolation methods are too often considered simply as computational procedures, we hope that our emphasis on the components of the polynomial terms in these formulas provides much in the way of insight into where these formulas came from, why they work, and how they should be used.

Keywords: interpolating polynomial, Lagrange formula, Newton formula, Taylor polynomial

Suggested Running Head: Components of Lagrange and Newton Interpolation