Chapter 3. Polynomial Interpolation / Function Approximation

3.1. Lagrange Interpolation Formula

For a set of data points (x_i, y_i), i = 1, 2, ..., n, with n > 1, the elementary Lagrange interpolation formula is

    l_{in}(x) = \prod_{j=1,\; j \neq i}^{n} \frac{x - x_j}{x_i - x_j}, \qquad i = 1, 2, 3, \ldots, n        (3.1.1)

l_{in}(x) is a polynomial of degree no greater than n - 1. Its value at any data point x_k within the data set is either 1 or 0:

    l_{in}(x_k) = \prod_{j=1,\; j \neq i}^{n} \frac{x_k - x_j}{x_i - x_j} = 1, \qquad \text{for } i = k        (3.1.2)

    l_{in}(x_k) = \prod_{j=1,\; j \neq i}^{n} \frac{x_k - x_j}{x_i - x_j} = 0, \qquad \text{for } i \neq k        (3.1.3)

which is equivalent to

    l_{in}(x_k) = \delta_{ik} = \begin{cases} 1, & i = k \\ 0, & i \neq k \end{cases}        (3.1.4)

The Lagrange interpolation polynomial of degree n - 1 is

    p_{n-1}(x) = \sum_{i=1}^{n} y_i \, l_{in}(x)        (3.1.5)

and its values at the data points are

    p_{n-1}(x_k) = \sum_{i=1}^{n} y_i \, l_{in}(x_k) = \sum_{i=1}^{n} y_i \, \delta_{ik} = y_k, \qquad k = 1, 2, \ldots, n        (3.1.6)

which means p_{n-1}(x) passes through all the data points exactly.

For the simplest case, n = 2, there are only two data points and p_1(x) is a linear function that passes through them; it is just a straight line whose two end points are the two data points. From (3.1.5) we have

    p_1(x) = y_1 \frac{x - x_2}{x_1 - x_2} + y_2 \frac{x - x_1}{x_2 - x_1} = y_1 + \frac{y_2 - y_1}{x_2 - x_1} (x - x_1)        (3.1.7)

For n = 3, p_2(x) is the quadratic polynomial that passes through the three data points:

    p_2(x) = y_1 \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + y_2 \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + y_3 \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}        (3.1.8)

An advantageous property of the Lagrange interpolation polynomial is that the data points need not be arranged in any particular order, as long as they are mutually distinct. As an application, if we know y_1 and y_2, or y_1, y_2 and y_3, we can estimate the function y(x) anywhere in [x_1, x_2] linearly, or anywhere in [x_1, x_3] quadratically. This is what we do in finite element analysis.
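Equations (3.1.1) and (3.1.5) translate directly into a few lines of code. The following is a minimal MATLAB sketch; the function name lagrange_interp and the calling example are illustrative only and are not part of the codes discussed later in this chapter.

    function yq = lagrange_interp(x, y, xq)
    % Evaluate the Lagrange interpolation polynomial (3.1.5) built on the
    % data (x(i), y(i)), i = 1..n, at the query points xq.
    n  = length(x);
    yq = zeros(size(xq));
    for i = 1:n
        li = ones(size(xq));              % elementary polynomial l_in(x), Eq. (3.1.1)
        for j = [1:i-1, i+1:n]
            li = li .* (xq - x(j)) / (x(i) - x(j));
        end
        yq = yq + y(i) * li;              % accumulate y_i * l_in(x), Eq. (3.1.5)
    end
    end

With two data points, lagrange_interp([x1 x2], [y1 y2], xq) reproduces the straight line (3.1.7); with three points it reproduces the quadratic (3.1.8).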
3.2 Newton Interpolating Polynomial

Suppose there is a known polynomial p_{n-1}(x) that interpolates the data set (x_i, y_i), i = 1, 2, ..., n. When one more data point (x_{n+1}, y_{n+1}), distinct from all the old data points, is added to the data set, we can construct a new polynomial that interpolates the new data set. Keep in mind that the new data point need not be at either end of the old data set. Consider the following polynomial of degree n:

    p_n(x) = p_{n-1}(x) + c_n \prod_{i=1}^{n} (x - x_i)        (3.2.1)

where c_n is an unknown constant. In the case of n = 1, we specify p_0(x) = y_1, where data point 1 need not be at the beginning of the data set.

At the points of the old data set, the values of p_n(x) are the same as those of p_{n-1}(x), because the second term in Equation (3.2.1) is zero there. Since we assume p_{n-1}(x) interpolates the old data set, p_n(x) does so too. At the new data point we want p_n(x_{n+1}) = y_{n+1}. This can be accomplished by setting the coefficient c_n to

    c_n = \frac{p_n(x_{n+1}) - p_{n-1}(x_{n+1})}{\prod_{i=1}^{n} (x_{n+1} - x_i)} = \frac{y_{n+1} - p_{n-1}(x_{n+1})}{\prod_{i=1}^{n} (x_{n+1} - x_i)}        (3.2.2)

which always exists since x_{n+1} is distinct from x_i (i = 1, 2, ..., n). Now p_n(x) is a polynomial that interpolates the new data set.

For any given data set (x_i, y_i), i = 1, 2, ..., n, we can obtain the interpolating polynomial by a recursive process that starts from p_0(x) and uses the above construction to get p_1(x), p_2(x), ..., p_{n-1}(x). We will demonstrate this process through the following example.

Example 3.2.1. Construct an interpolating polynomial for the following data set using the formulas in Equations (3.2.1) and (3.2.2).

    i     1    2    3    4    5
    x     0    5    7    8   10
    y     0    2   -1   -2   20

Step 1: for i = 1,

    p_0(x) = c_0 = y_1 = 0

Step 2: adding point #2,

    p_1(x) = p_0(x) + c_1 (x - x_1) = c_1 x

Applying p_1(x_2) = y_2, we get c_1 = 0.4. So

    p_1(x) = p_0(x) + c_1 (x - x_1) = 0.4 x

Step 3: adding point #3,

    p_2(x) = p_1(x) + c_2 (x - x_1)(x - x_2) = 0.4 x + c_2 x (x - 5)

Applying p_2(x_3) = y_3, we get c_2 = -0.2714. So

    p_2(x) = 0.4 x - 0.2714 x (x - 5)

Step 4: adding point #4,

    p_3(x) = p_2(x) + c_3 (x - x_1)(x - x_2)(x - x_3) = p_2(x) + c_3 x (x - 5)(x - 7)

Applying p_3(x_4) = y_4, we get c_3 = 0.0548. So

    p_3(x) = p_2(x) + 0.0548 x (x - 5)(x - 7)

Step 5: adding point #5,

    p_4(x) = p_3(x) + c_4 (x - x_1)(x - x_2)(x - x_3)(x - x_4) = p_3(x) + c_4 x (x - 5)(x - 7)(x - 8)

Applying p_4(x_5) = y_5, we get c_4 = 0.0712. So

    p_4(x) = p_3(x) + 0.0712 x (x - 5)(x - 7)(x - 8)

which is the final answer.

If we expand the recursive form on the right-hand side of Equation (3.2.1), we obtain the more familiar form of a polynomial,

    p_{n-1}(x) = c_0 + c_1 (x - x_1) + c_2 (x - x_1)(x - x_2) + \cdots + c_{n-1} (x - x_1)(x - x_2) \cdots (x - x_{n-1})        (3.2.3)

which is called Newton's interpolation polynomial. Its constants can be determined from the data set (x_i, y_i), i = 1, 2, ..., n:

    p_0(x_1) = y_1 = c_0
    p_1(x_2) = y_2 = c_0 + c_1 (x_2 - x_1)        (3.2.4)
    p_2(x_3) = y_3 = c_0 + c_1 (x_3 - x_1) + c_2 (x_3 - x_1)(x_3 - x_2)
    ...

which gives

    c_0 = y_1
    c_1 = \frac{y_2 - c_0}{x_2 - x_1} = \frac{y_2 - y_1}{x_2 - x_1}        (3.2.5)
    c_2 = \frac{y_3 - c_0 - c_1 (x_3 - x_1)}{(x_3 - x_1)(x_3 - x_2)}
    ...

We should note that forcing the polynomial through the data with no regard for the rates of change in the data (i.e., the derivatives) results in a C^0 continuous interpolating polynomial. Accordingly, each data condition p(x_i) = y_i is called a C^0 constraint.

Let us use the following notation for these constants:

    c_0 = [x_1]y = y_1
    c_1 = [x_1, x_2]y = \frac{[x_2]y - [x_1]y}{x_2 - x_1}        (3.2.6)
    ...
    c_{i-1} = [x_1, x_2, \ldots, x_i]y, \qquad i = 1, 2, \ldots

These constants have the following property:

    [x_1, x_2, \ldots, x_i]y = \frac{[x_2, x_3, \ldots, x_i]y - [x_1, x_2, \ldots, x_{i-1}]y}{x_i - x_1}        (3.2.7)

which is why they are called divided differences. For a proof of this formula, please refer to Gautschi (1997); for example,

    [x_1, x_2]y = \frac{y_2 - y_1}{x_2 - x_1}, \qquad [x_1, x_2, x_3]y = \frac{[x_2, x_3]y - [x_1, x_2]y}{x_3 - x_1}, \quad \text{etc.}

Using this recursion property, a table of divided differences can be generated as follows:

    x_1   y_1
    x_2   y_2   [x_1 x_2]y
    x_3   y_3   [x_2 x_3]y   [x_1 x_2 x_3]y        (3.2.8)
    x_4   y_4   [x_3 x_4]y   [x_2 x_3 x_4]y   [x_1 x_2 x_3 x_4]y

This table can be viewed as part of an n x (n + 1) matrix for a data set that has n points. The first column holds the x values of the data set and the second column holds the y (function) values. For the rest of the (n - 1) x (n - 1) lower triangle, the rule for constructing element(i, j) is as follows.

(1) It takes the form of a division (j >= 3, i >= j - 1):

    element(i, j) = [element(i, j-1) - element(i-1, j-1)] / [element(i, 1) - element(i-j+2, 1)]        (3.2.9)

where element(i, j-1) is the element immediately to the left of element(i, j); element(i-1, j-1) is the element above and immediately to the left; element(i, 1) is the x value on the same row; and element(i-j+2, 1) is found by moving from element(i, j) diagonally upward and to the left until the second column is reached, and then taking the x value immediately to its left.

(2) The denominator is easier to see in this form:

    [x_k, x_l, \ldots, x_q]y = \frac{\text{element on left} - \text{element above left}}{x_q - x_k}
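The construction rule (3.2.9), together with nested evaluation of (3.2.3), is straightforward to program. The following is a minimal MATLAB sketch; the function name newton_divdiff is illustrative only and is not the code newton_example.m described after Example 3.2.2.

    function yq = newton_divdiff(x, y, xq)
    % Build the divided difference table of Equation (3.2.9) and evaluate
    % the Newton polynomial (3.2.3) in nested form at the query points xq.
    n = length(x);
    c = zeros(n, n + 1);           % table: column 1 = x, column 2 = y
    c(:, 1) = x(:);
    c(:, 2) = y(:);
    for j = 3:n + 1                % remaining columns of the lower triangle
        for i = j - 1:n
            c(i, j) = (c(i, j - 1) - c(i - 1, j - 1)) / (c(i, 1) - c(i - j + 2, 1));
        end
    end
    coef = diag(c(:, 2:n + 1));    % c_0, c_1, ..., c_{n-1} from the diagonal
    yq = coef(n) * ones(size(xq)); % nested evaluation, innermost coefficient first
    for k = n - 1:-1:1
        yq = coef(k) + (xq - x(k)) .* yq;
    end
    end

With the data of Example 3.2.1, this sketch should reproduce the polynomial p_4(x) obtained in Example 3.2.2 below.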
Example 3.2.2. Using the table of divided differences, construct the Newton interpolating polynomial for the data set in Example 3.2.1.

    x     y
    0     0
    5     2    (2-0)/(5-0) = 0.4
    7    -1    (-1-2)/(7-5) = -1.5    (-1.5-0.4)/(7-0) = -0.271
    8    -2    (-2+1)/(8-7) = -1      (-1+1.5)/(8-5) = 0.167      (0.167+0.271)/(8-0) = 0.055
    10   20    (20+2)/(10-8) = 11     (11+1)/(10-7) = 4           (4-0.167)/(10-5) = 0.767     (0.767-0.055)/(10-0) = 0.071

where the number of decimal digits is reduced so that the table fits the page. The diagonal entries, 0, 0.4, -0.271, 0.055 and 0.071, are the coefficients of the Newton polynomial. Then

    p_4(x) = 0 + 0.4 x - 0.2714 x(x - 5) + 0.0548 x(x - 5)(x - 7) + 0.0712 x(x - 5)(x - 7)(x - 8)
           = 0 + x(0.4 + (x - 5)(-0.2714 + (x - 7)(0.0548 + (x - 8)(0.0712))))

The second line above is called the nested form; it saves computational operations and produces a numerically more accurate result. Note in passing how similar this table is to the one in Romberg quadrature (or Richardson's extrapolation); the only difference is that here the formula for the rightmost terms is a simple ratio of differences.

A MATLAB code, newton_example.m, is written to incorporate the divided difference scheme and to evaluate the Newton interpolation polynomial in nested form. The algorithm is as follows:
(1) Input and plot the data points;
(2) Declare c(n, n + 1) as the matrix for the table (c corresponds to "element" in Equation (3.2.9));
(3) Set c(i, 1) = x_i and c(i, 2) = y_i;
(4) Calculate the rest of the table from Equation (3.2.9);
(5) Evaluate the polynomial using c(i, i + 1) as its coefficients.
Figure 3.2.1 shows the resulting polynomial for the above example.

    [Figure 3.2.1 Newton interpolation polynomial: the data points and the Newton polynomial, y versus x.]

Through the above example we can see the advantages of the divided difference table over the algebraic approach of Example 3.2.1. First, it requires fewer computational operations: we do not need to write out the polynomial and then use the C^0 conditions to solve for the constants. Second, it is much easier to incorporate into a computer code.

It is important to realize that both the Lagrange and Newton polynomials are C^0 continuous and that each generates the same result. One effect of C^0 continuity can be seen in the large dip that occurs between the first two data points in the figure. Should it really be there?

3.3 Hermite Interpolation Polynomial

The Hermite interpolation accounts for the derivatives of a given function. For example, in the case of a beam finite element, suppose we need to obtain the cubic polynomials that satisfy the following cases:

(1) Consider y = a x^3 + b x^2 + c x + d in [0, 1].
(2) Apply the conditions

                   at x = 0            at x = 1
    Case 1:    y = 1,  y' = 0      y = 0,  y' = 0
    Case 2:    y = 0,  y' = 1      y = 0,  y' = 0
    Case 3:    y = 0,  y' = 0      y = 1,  y' = 0
    Case 4:    y = 0,  y' = 0      y = 0,  y' = 1

(3) Solve each case for a, b, c, d.

We recall the standard Newton form for a cubic polynomial:

    y(x) = c_0 + c_1 (x - x_1) + c_2 (x - x_1)(x - x_2) + c_3 (x - x_1)(x - x_2)(x - x_3)        (3.3.1)

This is clearly not what the Newton interpolation was meant for, but we can employ it as follows: approximate the derivative y' by the divided difference between two points, then let one point approach the other in the limit. For example, if y'(x_i) is to be used, consider adding another point x_m. Letting the two points merge, we obtain

    [x_i, x_m]y = \frac{y_m - y_i}{x_m - x_i} \to y'(x_i), \qquad \text{as } x_m \to x_i

From this we see that the divided difference is an approximation of a derivative. Then, in the divided difference table, we put two entries for x_i in the data set and do not compute [x_i, x_i]y in its original form (which would be a division by zero numerically), but simply put the value y'(x_i) there.

For the two-point case above, with the function value and the first derivative given at each of x_1 and x_2, the table takes the form

    x_1   y_1
    x_1   y_1   y'_1
    x_2   y_2   [x_1 x_2]y   [x_1 x_1 x_2]y
    x_2   y_2   y'_2         [x_1 x_2 x_2]y   [x_1 x_1 x_2 x_2]y
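A short MATLAB sketch of this modified table construction is given below; it is illustrative only and separate from the code hermite_example.m mentioned later. The data correspond to Case 1 of the beam conditions worked out next (y(0) = 1, y'(0) = 0, y(1) = 0, y'(1) = 0).

    % Divided difference table with repeated abscissae: the supplied
    % derivative values replace the first divided differences there.
    x  = [0 0 1 1];               % each node listed twice
    f  = [1 1 0 0];               % function values, repeated with the node
    fp = [0 0 0 0];               % derivative values at the (repeated) nodes
    n  = length(x);
    c  = zeros(n, n + 1);
    c(:, 1) = x(:);  c(:, 2) = f(:);
    for j = 3:n + 1
        for i = j - 1:n
            dx = c(i, 1) - c(i - j + 2, 1);
            if j == 3 && dx == 0
                c(i, j) = fp(i);  % repeated node: use y'(x_i) instead of 0/0
            else
                c(i, j) = (c(i, j - 1) - c(i - 1, j - 1)) / dx;
            end
        end
    end
    coef = diag(c(:, 2:n + 1)).'  % diagonal entries = Newton coefficients

Running the script should give the diagonal coefficients 1, 0, -1 and 2, i.e., the Case 1 polynomial 2x^3 - 3x^2 + 1 obtained below.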
The tables for the four cases are best determined by hand. Substitution of the diagonal values for the c_i's, and of the node values for the x_i's, into Equation (3.3.1) then yields the polynomials. The results are:

Case 1:

    0    1
    0    1    0
    1    0   -1   -1
    1    0    0    1    2

    y(x) = 1 + 0 x - 1 x^2 + 2 x^2 (x - 1) = 2x^3 - 3x^2 + 1

Case 2:

    0    0
    0    0    1
    1    0    0   -1
    1    0    0    0    1

    y(x) = 0 + 1 x - 1 x^2 + 1 x^2 (x - 1) = x^3 - 2x^2 + x

Case 3:

    0    0
    0    0    0
    1    1    1    1
    1    1    0   -1   -2

    y(x) = 0 + 0 x + 1 x^2 - 2 x^2 (x - 1) = -2x^3 + 3x^2

Case 4:

    0    0
    0    0    0
    1    0    0    0
    1    0    1    1    1

    y(x) = 0 + 0 x + 0 x^2 + 1 x^2 (x - 1) = x^3 - x^2

It is not hard to modify the code newton_example.m to incorporate these changes. The resulting code is called hermite_example.m; it can compute the above tables. The polynomials are plotted in Figure 3.3.1.

For cases involving higher-order derivatives, the principle is the same (see Gautschi, 1997). One thing worth noting is that when y^(n)(x_i) is used as a constraint, all lower-order derivatives and y(x_i) itself must be included in the constraints as well. For example, you cannot have y'(x_i) as a constraint without y(x_i), nor y^(2)(x_i) without y(x_i) and y'(x_i).

    [Figure 3.3.1 Hermite interpolation: the four cubic polynomials of cases 1-4 plotted over 0 <= x <= 1.]

Example 3.3.1. Constructing displacements in a beam element from Hermite polynomials.

Consider the beam of length L shown in Figure 3.3.2. The Hermite polynomials are

    N_1(x) = 2 (x/L)^3 - 3 (x/L)^2 + 1        (3.3.2)

    N_2(x) = \frac{x^3}{L^2} - \frac{2 x^2}{L} + x        (3.3.3)

    N_3(x) = -2 (x/L)^3 + 3 (x/L)^2        (3.3.4)

    N_4(x) = \frac{x^3}{L^2} - \frac{x^2}{L}        (3.3.5)

These interpolation functions may be thought of as the fundamental modes of deflection, two of which are shown in Figure 3.3.2. The deflection w(x) of any statically loaded beam can be written in terms of these modes as

    w(x) = N_1 W_1 + N_2 \theta_1 + N_3 W_2 + N_4 \theta_2        (3.3.6)

where the subscripts associate quantities with positions (or nodes) 1 and 2 on the beam, and W_i and \theta_i, i = 1, 2, are the deflection and slope, respectively, at each node.

    [Figure 3.3.2 Static bending modes N_1 and N_3 in a beam, plotted against x/L.]

Can you physically explain why N_1 + N_3 = 1 for all x in Figure 3.3.2? And why N_2 + N_4 does not equal 1?

References

Gautschi, Walter, Numerical Analysis, Birkhäuser, Boston, 1997.
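As a closing illustration of Equation (3.3.6), here is a minimal MATLAB sketch that assembles the deflection of a beam element from the four Hermite modes. The nodal values W1, th1, W2 and th2 are arbitrary numbers chosen only for illustration, and the script is separate from the codes discussed above; it also checks numerically that N_1 + N_3 = 1 everywhere along the beam.

    % Deflection of a beam element from the Hermite modes, Equation (3.3.6).
    L  = 1.0;                                    % beam length
    x  = linspace(0, L, 101);                    % points along the beam
    N1 =  2*(x/L).^3 - 3*(x/L).^2 + 1;           % Equation (3.3.2)
    N2 =  x.^3/L^2 - 2*x.^2/L + x;               % Equation (3.3.3)
    N3 = -2*(x/L).^3 + 3*(x/L).^2;               % Equation (3.3.4)
    N4 =  x.^3/L^2 - x.^2/L;                     % Equation (3.3.5)
    W1 = 0.0;  th1 = 0.1;  W2 = 0.5;  th2 = 0.0; % assumed nodal deflections and slopes
    w  = N1*W1 + N2*th1 + N3*W2 + N4*th2;        % Equation (3.3.6)
    plot(x, w), xlabel('x'), ylabel('w(x)')
    disp(max(abs(N1 + N3 - 1)))                  % should be zero (to round-off): N1 + N3 = 1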