Finding Polynomial Patterns and Newton Interpolation Introduction Just as a linear function has a distinct numerical pattern based on the points it passes through (the successive difference quotients are all constant, or the successive differences are all constant if all the x-values are equally spaced), so also does a polynomial function have its own numerical pattern determined by a set of data. But unlike the linear case, it is usually not as easy to find the polynomial that fits a given pattern. The process of finding such a polynomial is called interpolation and one of the most important approaches used is the Newton interpolating formula. In this article, we first show how the polynomial pattern can be identified and the degree determined by a set of points. Furthermore, we show how the Newton interpolating polynomial can be introduced and developed at the precalculus level. This brings one of the most powerful and useful tools of numerical analysis to the attention of lower division students while simultaneously building on and reinforcing some of the fundamental ideas in precalculus mathematics. Linear Patterns and Interpolation Let’s consider the set of data presented in Table 1, where the x-values xi are equally spaced. We examine the data by computing the differences between successive y-values yi . We extend Table 1 to include the differences yi yi 1 yi in the new column y in Table 2 and observe that the differences are all equal. Since there is a constant difference between successive x-values, the lines joining any two consecutive points share the slope. We therefore conclude that the data fall into a linear pattern. Using the first two points from either table, we obtain the corresponding linear function as y 1 7( x 3) or y 7 x 20 . Looking closely at the equation of the linear function y 1 7( x 3) , we can see that the numbers 1, 7, and 3 exactly match the entries for y , y and x , respectively, in the first row of Table 2. This is not coincidental, as we discuss later. This observation indicates that the parameters that define the linear function are closely associated with the information given in the first row of the table of differences. In this example, the slope is equal to y because x 1; otherwise, it would be y x . A natural extension of this observation will allow us to construct the polynomial of higher degree based on a set of data values once its pattern is determined. 1 Table 1 Determine the pattern x x0 = 3 x1 = 4 x2 = 5 x3 = 6 y y0 = 1 y1 = 8 y2 = 15 y3 = 22 Table 2 Difference table showing the linear pattern for Table 1 x x0 = 3 x1 = 4 x2 = 5 x3 = 6 y y0 = 1 y1 = 8 y2 = 15 y3 = 22 Δy y1 y0 = 7 y2 y1 = 7 y3 y2 = 7 Polynomial Patterns and Interpolation How do we determine whether a set of data follows a quadratic pattern or, in general, a polynomial pattern? We begin to answer this question by considering the set of points shown in Table 3 that lie on the parabola y x 2 , where equally spaced x-values are used. Just as with Table 2, we include the differences between successive yvalues. Obviously, the y values are not constant. But they clearly follow a linear pattern because the differences between successive y values, called the second differences of the yvalues and written 2 yi (yi ) yi 1 yi yi 2 2 yi 1 yi , are all equal. It is not a coincidence that a quadratic function has constant second differences of the y-values. In fact, we can show algebraically that f ( x 1) f ( x) is a linear function if f ( x) is a quadratic function. (For that matter, if a set of data follows the pattern of a polynomial of degree n, then all the n th order differences will be constant.) Table 3 Quadratic pattern x 3 4 5 6 7 y 9 16 25 36 49 Δy 7 9 11 13 Δ2 y 2 2 2 2 In general, suppose that we have a set of n 1 points ( x0 , y0 ) , ( x1 , y1 ) , ( x2 , y2 ) , …, ( xn , yn ) . These points follow a quadratic pattern if the second differences of the y-values are all equal when the x-values are uniformly spaced [2]. Let’s apply this criterion to determine the pattern in the data in Table 4. Table 5 extends Table 4 to include both the successive differences y and the second differences 2 y of the y-values. We observe a fixed value for all the second differences and hence can conclude that the points in Table 4 follow a quadratic pattern. Table 4 Determine the pattern x 1 2 3 4 5 y 12 10 14 24 40 Table 5 Difference table of the quadratic pattern for Table 4 x 1 2 3 4 5 y 12 10 14 24 40 Δy –2 4 10 16 Δ2 y 6 6 6 Often knowing the pattern of the data is just the beginning. We would like to find the equation of the quadratic function that fits the quadratic pattern. One way to proceed is to select three points, say (1,12) , (2,10) and (3,14) , from Table 4 and associate each of these points with an equation, that is, f (1) 12 , f (2) 10 and f (3) 14 . These three equations are sufficient for solving for the three unknown coefficients a , b and c of a quadratic function f ( x) ax 2 bx c . The evaluation of the function at x 1 , x 2 , and x 3 yields a system of three linear equations in three unknowns: a b c 12 4a 2b c 10 9a 3b c 14 3 Using either the substitution or elimination method or technology to find the solution, we obtain that a 3 , b 11 and c 20 ; and so the quadratic function is f ( x) 3x 2 11x 20 . We could also use the polynomial regression feature of a graphing calculator or Excel to find the polynomial directly, but this is limited to fourth degree polynomials with graphing calculators or sixth degree polynomials with Excel. The graph of the function is shown in Figure 1 along with the data points. 60 40 20 0 -1 0 1 2 3 4 5 6 Figure 1: A parabola passes through the set of points in Table 4 The above approach sounds like a very good strategy; it is simple and straightforward. But this method has its own limitation if we want to extend it to a set of data that follows a higher degree polynomial pattern. When we try to fit a polynomial to a set of data that falls into a degree m polynomial pattern, we would have to solve a system of m 1 linear equations in m 1 unknowns. We seek an alternative approach to find such an interpolating polynomial that doesn’t involve the heavy, but essentially mindless, computations. The alternative we show below is a more clever way that enhances students’ mathematical thinking. For the linear pattern case, we found that each entry in the first row of Table 2 was used in the equation of the linear function. Is something similar also the case for the quadratic function? What does the value of 2 y inform us about the desired function? To answer these questions, we need to examine the first row of Table 5 carefully. Just like the linear case, it turns out that the first row is all we need to create the entire table, 4 assuming that x 1. Furthermore, it is still true that the first three numbers in this row x 1 , y 12 , and y 2 are associated with the linear function y f1 ( x) 12 2( x 1) . Notice that f1 ( x) passes through (1,12) and (2,10) but not (3,14) . How, then, can we modify the linear function f1 ( x) to obtain the desired quadratic interpolating polynomial f ( x) that passes through (1,12) , (2,10) , and (3,14) , as well as the remaining points in the table? One way to do it is by adding an appropriately chosen quadratic function f 2 ( x) to the linear function f1 ( x) to find the desired quadratic function f ( x) f1 ( x) f 2 ( x) . This likely seems an unusual approach because our problem is to find a quadratic function f ( x) and we suggest finding a different quadratic function f 2 ( x) and then adding it to the linear function f1 ( x) to solve the problem. However, it would be worth taking this zigzag approach if it is very easy to construct f 2 ( x) . By the above analysis, we want a quadratic function f 2 ( x) that contributes zero automatically at x 1 and x 2 , so that f 2 (1) f 2 (2) 0 . This guarantees that f (1) f1(1) 0 12 and f (2) f1(2) 0 10 . Consequently, f 2 ( x) must have two zeros, one at x 1 and the other at x 2 . Because each real zero r of a polynomial corresponds to a linear factor of x r , then f 2 ( x) must be of the form f 2 ( x) a2 ( x 1)( x 2) , where a2 is a constant that can be determined by the third point (3,14) from Table 4. We select the point (3,14) to maintain evenly spaced x-values of the points used. Would the value of a2 be determined by the last entry 2 y in row one of Table 5? If this is indeed the case, this approach greatly reduces the computational work to the point where it could be described as trivial. We know that f (3) f1(3) f 2 (3) 14 . Since f1(3) 12 2(3 1) 8 and f 2 (3) a2 (3 1)(3 2) 2a2 , we obtain f (3) f1 (3) f 2 (3) 8 2a2 14 and so a2 3 . Comparing this result with the constant second difference 2 y 6 , we see that 2 y0 a2 is half of the constant second difference y . In fact, a2 holds in general, provided 2! 2 that x 1. We now prove this result. 5 If we have the three points ( x0 , y0 ) , ( x1 , y1 ) , and ( x2 , y2 ) and want to construct the quadratic interpolating polynomial, we express the desired function as f ( x) y0 y0 ( x x0 ) a2 ( x x0 )( x x1 ) , assuming that the x-values are uniformly spaced with x 1. It is apparent that f ( x0 ) y0 , and f ( x1) y0 (y0 )( x1 x0 ) 0 y0 ( y1 y0 ) y1 . To have f ( x2 ) y2 , we solve y0 y0 ( x2 x0 ) a2 ( x2 x0 )( x2 x1 ) y2 for a2 . Because x 1, then x2 x0 2 . Knowing y1 y0 y0 ( x1 x0 ) , we have y1 y0 a2 2 1 y2 , which yields the result y2 y1 y0 y1 y0 (y0 ) 2 y0 a2 . 2 1 2! 2! 2! This proof provides the specific insight needed to extend the process to model higher degree polynomial patterns. The interpolating polynomial for the data in Table 4 is therefore f ( x) 12 2( x 1) 3( x 1)( x 2) or f ( x) 3x 2 11x 20 , when we multiply it out. Figure 2 shows f ( x) along with its two component functions f1 ( x) and f 2 ( x) . Let’s focus on the three dots that represent the first three points in the example. We see that only the graphs of f ( x) and f1 ( x) pass through the points (1,12) and (2,10) , while the quadratic component f 2 ( x) goes through the x-intercepts at x 1 and x 2 . At the point x 3 , the value of f 2 (3) 6 is the amount being added to f1 (3) so that the point (3,14) is on the graph of f ( x) . As expected, the first row of Table 5 contains the sufficient information to give us all the coefficients 12, –2, and 3 of the quadratic function f ( x) 12 2( x 1) 3( x 1)( x 2) . Moreover, each entry contributes to the expression for the function in a particular way, that is, the constant term 12 is y0 , the coefficient –2 of the linear term ( x 1) is y0 , and the coefficient 3 of ( x 1)( x 2) is determined by 2 y0 . In general, the formula for the quadratic 2! 6 30 f ( x) 20 (3, 14) (1, 12) f 2 ( x) Two vertical segments with equal length 6. 10 f1( x) (2, 10) 0 -1 0 1 2 3 4 5 -10 Figure 2: How the quadratic term f 2 ( x) affects the linear interpolation f1( x) f1 ( x) 12 2( x 1) f2 ( x) 3( x 1)( x 2) f ( x) f1 ( x ) f 2 ( x ) interpolating polynomial through the three points ( x0 , y0 ) , ( x1 , y1 ) , and ( x2 , y2 ) with ∆x = 1 is P2 ( x) y0 y0 ( x x0 ) 2 y0 ( x x0 )( x x1 ) . 2! This expression is called the Newton forward interpolating formula of degree 2 for the interpolating polynomial [1]. The ideas discussed here can be extended beyond interpolating a quadratic pattern. For a set of n 1 points where the x-values are uniformly spaced, when a higher degree polynomial pattern is identified by calculating m th-order differences of y and finding them all constant at some level m n , a polynomial of degree m is required to fit the pattern. The Newton formula provides a simple way to construct the polynomial function by making use of every entry in the first row, x0 , y0 , y0 , 2 y0 , 3 y0 , …, of the difference table. For instance, after determining the cubic pattern of a set of points, we select the four points ( x0 , y0 ) , ( x1 , y1 ) , ( x2 , y2 ) , and ( x3 , y3 ) (assuming that x 1 ). The third degree Newton interpolating polynomial, P3 ( x) , will then be composed of a linear component f1 ( x) y0 y0 ( x x0 ) , a quadratic component 7 f 2 ( x) 3 y0 2 y0 ( x x0 )( x x1 )( x x2 ) , each ( x x0 )( x x1 ) , and a cubic component f3 ( x) 3! 2! constructed in the comparable way. The derivation of the coefficient the derivation of a2 3 y0 in f3 ( x) is similar to 3! 2 y0 . We leave it to the interested readers or their students. The result is 2! 2 y0 3 y0 P3 ( x) y0 y0 ( x x0 ) ( x x0 )( x x1 ) ( x x0 )( x x1 )( x x2 ) . 2! 3! (1) Notice that the linear function f1 ( x) y0 y0 ( x x0 ) passes through only the first two points ( x0 , y0 ) and ( x1 , y1 ) ; the quadratic function f1 ( x) f 2 ( x) satisfies the first three points ( x0 , y0 ) , ( x1 , y1 ) , and ( x2 , y2 ) ; and finally the cubic function P3 ( x) f1 ( x) f 2 ( x) f3 ( x) passes through not only all four points, but also any remaining given points in the data set. From this point of view, the process of constructing the interpolating polynomial using the Newton formula is a sequential method. We initially consider the first two points and obtain the linear function that fits them. Then we include the third point in the data set, find the appropriate quadratic component, and add it to the linear function so that the sum of these two functions fits the first three points. In a way, the quadratic component acts as a correction term. To move forward to find the interpolating polynomial of higher degree, we add the next point from the data to the first three points. We then find the cubic polynomial component that is the new correction term and so obtain P3 ( x) y0 y0 ( x x0 ) 2 y0 3 y0 ( x x0 )( x x1 ) ( x x0 )( x x1 )( x x2 ) . 2! 3! Even though our discussion assumes that x 1 , the way of constructing the Newton formula can be easily extended to the case where the x-values are uniformly spaced with x h . However, we take a different approach to show the Newton formula where x h by using the technique of horizontal shifting and horizontal stretching/shrinking. Again, using the cubic pattern of a set of points as an example, we select the four points ( x0 , y0 ) , ( x1 , y1 ) , ( x2 , y2 ) , and ( x3 , y3 ) where x h . If x0 0 , we first shift these points horizontally to the right by | x0 | units if x0 0 or to the left by x0 units if x0 0 , so that the new location of the first point 8 ( x0 , y0 ) is on the y-axis. If x0 0 , then the horizontal shift is not necessary. Next, we stretch the resulting points horizontally (if h 1) or shrink them (if h 1 ) by a factor of 1 h so that the points now are equally spaced horizontally with an increment of u 1 . u ( x) The formula x x0 summarizes the above horizontal transformations and the ui values are shown in h Table 6. Table 6 Change of xi into ui x0 x x0 h u0 0 x1 u1 1 x2 u2 2 x3 u3 3 u x By applying (1) to the set of the corresponding points (u0 , y0 ) , (u1, y1 ) , (u2 , y2 ) , and (u3 , y3 ) , we have P3 (u ) y0 y0 (u u0 ) After substituting P3 ( 2 y0 3 y0 (u u0 )(u u1 ) (u u0 )(u u1 )(u u2 ) . 2! 3! x x0 for u in P3 (u ) , we arrive at h x x0 x x0 2 y0 x x0 x x0 3 y0 x x0 x x0 x x0 ) y0 y0 ( ) ( )( 1) ( )( 1)( 2) h h 2! h h 3! h h h y0 2 y0 3 y0 y0 ( x x0 ) ( x x0 )( x x1 ) ( x x0 )( x x1 )( x x2 ). h 2!h2 3!h3 The general cubic Newton interpolating polynomial, still denoted by P3 ( x) , is P3 ( x) y0 y0 2 y0 3 y0 ( x x0 ) ( x x )( x x ) ( x x0 )( x x1 )( x x2 ) . 0 1 h 2!h2 3!h3 (2) Suppose that we are only given every other points in Table 4 with an additional point (7,90) , shown in Table 7 along with the differences. We will apply a general Newton interpolating formula similar to (2) to find the quadratic interpolating polynomial 9 2 24 P2 ( x) 12 ( x 1) ( x 1)( x 3) . 2 2! 4 When we multiply it out, we obtain P2 ( x) 3x2 11x 20 , as we expected. Table 7 Difference table of the quadratic pattern where x 2 x 1 3 5 7 y 12 14 40 90 Δy 2 26 50 Δ2 y 24 24 Interpolation with Divided Differences In the case where the points xi are not uniformly spaced, for example, the point (4,8) is missing from Table 1, or the point (2,10) is missing from Table 4, then the above technique fails to identify the polynomial pattern. We present a new set of data in Table 8, which is based on Table 1 with the point (4,8) being removed. To remove the effect of the unevenness of the xi , we will calculate the difference quotients instead of the differences. Table 8 Non-uniformly Spaced xi x 3 5 6 y 1 15 22 Table 9 Divided difference table showing the linear pattern for Table 8 x y 3 1 5 6 15 22 y x 14 7 2 7 7 1 The difference quotients are called the divided differences in numerical analysis. Following the notation yi yi 1 yi , we denote xi xi 1 xi , then the divided differences 10 are yi yi 1 yi , denoted by f [ xi , xi 1] (if y f ( x) ). The constant divided differences in xi xi 1 xi Table 9 show that the pattern of the data in Table 8 is linear. Again, using the information from the first row of Table 9, just as we did earlier, we obtain the linear interpolating function P1 ( x) 1 7( x 3) or P1 ( x) 7 x 20 . Suppose we have another set of data, shown in Table 10, that is based on Table 4 minus the point (2,10) . We follow the idea of using divided differences to determine the polynomial pattern. When it comes to the second divided differences based on f [ xi , xi 1] and f [ xi 1, xi 2 ] , we compute f [ xi 1, xi 2 ] f [ xi , xi 1 ] , denoted by f [ xi , xi 1, xi 2 ] . Notice that when x h , xi 2 xi f [ xi , xi 1 ] yi 1 yi yi xi 1 xi h and f [ xi 1, xi 2 ] f [ xi , xi 1 ] f [ xi , xi 1, xi 2 ] xi 2 xi When i 0 , f [ x0 , x1 ] yi 1 yi 2 h h yi . 2h 2!h 2 2 y0 y0 and f [ x0 , x1, x2 ] appear as the coefficients of the linear and h 2!h 2 quadratic components of the polynomial expression P3 ( x) of (2), respectively. This observation leads to the generalization of the Newton interpolating formula. We have that the degree m of the polynomial pattern will be determined by the equal m th-order divided differences. And the Newton formula for the third degree polynomial, for example, can be generalized to P3 ( x) y0 f [ x0 , x1]( x x0 ) f [ x0 , x1, x2 ]( x x0 )( x x1) f [ x0 , x1, x2 , x3 ]( x x0 )( x x1)( x x2 ) . Therefore, the constant second divided differences in Table 11 show that the pattern of the data in Table 10 is quadratic. Using the information from the first row of Table 11, we obtain the quadratic interpolating function P2 ( x) 12 1 ( x 1) 3( x 1)( x 3) or P2 ( x) 3x2 11x 20 . Comparison with the Lagrange Interpolation Formula The computationally convenient form of the Newton interpolating formula makes it one of two very widely used interpolation formulas, the other being the Lagrange interpolating formula [1]. The Newton interpolation process is very different from the Lagrange interpolating formula. For example, the Lagrange 11 Table 10 Determine the pattern x 1 3 4 5 y 12 14 24 40 Table 11 Divided difference table of the quadratic pattern for Table 10 x y 1 12 3 14 4 5 24 40 f [ xi , xi 1] 2 1 2 10 10 1 16 16 1 f [ xi , xi 1, xi 2 ] 9 3 6 2 3 3 formula for interpolating the three points ( x0 , y0 ) , ( x1 , y1 ) , and ( x2 , y2 ) is f ( x) y0 ( x x0 )( x x2 ) ( x x0 )( x x1 ) ( x x1 )( x x2 ) y1 y2 . ( x0 x1 )( x0 x2 ) ( x1 x0 )( x1 x2 ) ( x2 x0 )( x2 x1 ) Every term in the Lagrange formula uses the information of all the points. Therefore, it constructs the interpolating polynomial by considering all points simultaneously. Although the two approaches actually produce the identical polynomial despite the totally different appearances of the two expressions, there are marked differences between how calculations with each are performed; sometimes one approach is far preferable and sometimes the other is. For example, Newton’s interpolation is better for the polynomial curve fitting for a given set of data. In general, a set of n 1 points determines a polynomial of degree at most n. Each polynomial term in the Lagrange interpolating formula is of degree n . Due to a possible cancellation in the Lagrange formula, the interpolating polynomial may be lower in degree. We will not know the exact degree of the interpolating polynomial until we simplify the Lagrange expression. On the other hand, the Newton interpolation approach first constructs the divided difference table (or difference table if the x-values are uniformly spaced) that can be used to determine the exact degree of the polynomial. Moreover, the first row of the divided difference table provides the coefficient of each term in the Newton interpolating formula. When the data 12 fall into a polynomial pattern with degree m n , then the efforts in constructing the interpolating polynomial can be reduced with the Newton interpolation. While Newton interpolation provides a foundation for the development of methods for the numerical solution of ordinary and partial differential equations, as well as for detecting errors in data, Lagrange interpolation has its own advantage of incorporating the original data into its formula. Because of it, many numerical integration formulas such as the trapezoidal and Simpson’s rules use the Lagrange interpolating polynomials for the derivation of these formulas. Relationship to Taylor Polynomials What seems less trivial is the connection between the Newton interpolating polynomial and Taylor polynomials if we only consider where we typically cover these two polynomials in mathematics courses. But at a quick glance, the formula (2) for P3 ( x) y0 y0 2 y0 3 y0 ( x x0 ) ( x x )( x x ) ( x x0 )( x x1 )( x x2 ) 0 1 h 2!h2 3!h3 is obviously very similar to the formula for the third degree Taylor polynomial for a smooth function y f ( x) near x x0 : T3 ( x) f ( x0 ) f ( x0 )( x x0 ) f ( x0 ) f ( x0 ) ( x x0 ) 2 ( x x0 )3 . 2! 3! Let’s see just how close the two are. Consider what happens to the Newton interpolating i y0 formula P3 ( x) as h 0 . Clearly, the expressions in the polynomial expression converge hi toward f (i ) ( x0 ) , the successive derivatives of the function at x x0 . Moreover, as h 0 , all of the xi approach x x0 , though they do retain the uniform spacing. As all the points xi coalesce at x x0 , we see that the various factors ( x xi ) all converge toward ( x x0 ) and so their products approach the successive powers of ( x x0 ) . Thus, the Taylor polynomial expansion of a function is the limit of the Newton interpolating polynomials as h 0 . The resemblance of the Newton formula P3 ( x) to the formula for the third degree Taylor polynomial T3 ( x) may also be explained by the fact that the Taylor polynomial T3 ( x) is determined by a sequential technique as well, where we consider the derivative of f of order i up to 3, one at a time. That is why the remarkable similarity of these two formulas is not a coincidence. 13 Conclusion The idea of the Newton interpolating formula is often hidden in the more standard derivation of the formula found in most numerical analysis textbooks. Many students learn the Newton formula in an upper level math course where the formula is usually presented to them and then it is shown that it satisfies all the given points. But the question that the students often ask (and more often don’t ask, even though it is something they don’t see) is where did the formula come from in the first place. At least, precalculus is a place where the Newton formula can be investigated in the context of looking at the simple problem of finding polynomial patterns [2]. It is also a place to foster deep learning of mathematics by using a number of topics together during the investigation and so reinforce those ideas in students’ minds. Through our discussion, we touch upon many important concepts at the introductory level of mathematics, for instance, the connection between the real zeros of a polynomial and its linear factors, combining functions, graph transformations, and systems of linear equations. It is a wonderful opportunity for students to see how seemingly unrelated mathematics topics are used collectively to solve a simple question such as fitting a quadratic function to three points. References [1] K.E. Atkinson, An Introduction to Numerical Analysis, 2nd Ed., John Wiley & Sons, New York, NY, 1988. [2] S.P. Gordon, F.S. Gordon, A.C. Tucker, and M.J. Siegel, Functioning in the Real World – A Precalculus Experience, 2nd Ed., Pearson, Boston, MA, 2004. 14