Visualizing and Understanding the Components of Lagrange and Newton Interpolation
One of the standard topics in numerical analysis courses is the subject of interpolation, with particular emphasis on the Lagrange and Newton interpolating formulas. In both cases, the usual approach is highly computational as one works to construct the polynomial of degree $n$ that passes through $n+1$ interpolating points. However, as Richard Hamming, one of the giants of modern numerical analysis, put it, “The purpose of computing is insight, not numbers” [3]. In this article, we will look graphically at the functional components of each of these two interpolating formulas to see the kinds of deeper insights that can be achieved.
The Lagrange Interpolating Formula Suppose that we have the $n+1$ points $(x_0, y_0)$, $(x_1, y_1)$, …, $(x_n, y_n)$, where all of the $x_i$ are different, though not necessarily uniformly spaced. These interpolating points then determine a unique polynomial of degree $n$ (or possibly lower, if the points happen to lie on such a curve). One way to express the equation of this polynomial is with the Lagrange Interpolating Formula:
$$L_n(x) \;=\; \sum_{i=0}^{n} y_i \prod_{\substack{k=0 \\ k \ne i}}^{n} \frac{x - x_k}{x_i - x_k} \;=\; \sum_{i=0}^{n} y_i\, \frac{(x-x_0)\cdots(x-x_{i-1})(x-x_{i+1})\cdots(x-x_n)}{(x_i-x_0)\cdots(x_i-x_{i-1})(x_i-x_{i+1})\cdots(x_i-x_n)}.$$
Equivalently, if we write this formula without the summation notation, it becomes
Ln ( x)  y0
( x  x1 )( x  x2 ) ( x  xn )
( x  x0 )( x  x2 ) ( x  xn )
 y1

( x0  x1 )( x0  x2 ) ( x0  xn )
( x1  x0 )( x1  x2 ) ( x1  xn )
 yn
( x  x0 )( x  x1 ) ( x  xn 1 )
.
( xn  x0 )( xn  x1 ) ( xn  xn 1 )
Notice that this polynomial is composed of $n+1$ distinct polynomial terms, each of degree $n$ (provided $y_i \ne 0$). Either way, both of these are rather daunting expressions for students, and consequently it is not surprising that many tend to miss some of the key underlying concepts.
As an example, consider the three interpolating points $(1, 2)$, $(3, 8)$, and $(5, 4)$ that determine the quadratic function whose graph is shown in Figure 1. Clearly, this function passes through each of the three interpolating points.

[Figure 1: The quadratic interpolating polynomial $L_2(x)$]

The corresponding expression for this interpolating polynomial is
( x  3)( x  5)
( x  1)( x  5)
( x  1)( x  3)
8
4
(1  3)(1  5)
(3  1)(3  5)
(5  1)(5  3)
(1)
1
1
 ( x  3)( x  5)  2( x  1)( x  5)  ( x  1)( x  3) .
4
2
(2)
L2 ( x)  2
More to the point here, this function is a linear combination of three distinct quadratic functions, shown in the first expression. Let’s see how each of these functions behaves, as shown in Figure 2. (The heavier fourth curve shown is the Lagrange interpolating polynomial.) Although the shapes of two of the curves totally miss the shape of the interpolating polynomial, each of the three quadratics passes through just one of the interpolating points. Moreover, not only does each one completely miss the other two interpolating points, but each quadratic also has its real zeros precisely in line vertically with the other two interpolating points. In addition, observe that each pair of quadratic components shares a common zero. To see why, notice that the component quadratic functions are constructed in such a way that each of them has a pair of factors that correspond to two of the three interpolating points. It is no coincidence that there is such a correspondence between the zeros and the interpolating points.
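These observations are easy to check numerically. The following sketch (hypothetical helper names, not from the article) evaluates the three component quadratics of the example and confirms that each one matches exactly one interpolating point and vanishes below the other two:

```python
# Interpolating points from the example
pts = [(1, 2), (3, 8), (5, 4)]

def component(i, x):
    """i-th component: y_i times the i-th Lagrange basis polynomial."""
    xi, yi = pts[i]
    num = den = 1.0
    for k, (xk, _) in enumerate(pts):
        if k != i:
            num *= (x - xk)
            den *= (xi - xk)
    return yi * num / den

def L2(x):
    """The full interpolating polynomial: the sum of the components."""
    return sum(component(i, x) for i in range(len(pts)))

# Each component matches its own point and is zero at the other two
for i, (xi, yi) in enumerate(pts):
    assert abs(component(i, xi) - yi) < 1e-12
    for j, (xj, _) in enumerate(pts):
        if j != i:
            assert abs(component(i, xj)) < 1e-12

# The sum therefore passes through all three points
for xi, yi in pts:
    assert abs(L2(xi) - yi) < 1e-12
```

The same loop structure works unchanged for any number of points, which is exactly the content of the general statement below.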
[Figure 2: The three component quadratic functions $y_0(x) = 2\,\frac{(x-3)(x-5)}{(1-3)(1-5)}$, $y_1(x) = 8\,\frac{(x-1)(x-5)}{(3-1)(3-5)}$, and $y_2(x) = 4\,\frac{(x-1)(x-3)}{(5-1)(5-3)}$, together with $L_2(x)$]
Furthermore, look at the first term in the first expression (1) for $L_2(x)$. Corresponding to the first interpolating point $(1, 2)$, the coefficient of $\frac{(x-3)(x-5)}{(1-3)(1-5)}$ is precisely equal to the value, 2, of $y$ at this point. Also, at this point where $x = 1$, the factors in the numerator precisely cancel the factors in the denominator, so the total contribution of this first term at $x = 1$ is $y = 2$. Similarly, the second term in the expression (1) for $L_2(x)$ is constructed in such a way that when $x = 3$, we have $y = 8$, and so on for the third term. In general, for any $n$, the terms in the Lagrange formula are constructed in such a way that each one has zero contribution at all but one of the interpolating points and contributes precisely the given value of $y$ at the remaining interpolating point.
The Newton Interpolating Formula Again, suppose that we have the $n+1$ points $(x_0, y_0)$, $(x_1, y_1)$, …, $(x_n, y_n)$, where all of the $x_i$ are different. For simplicity, we consider the case where these $x_i$’s are uniformly spaced with $\Delta x = x_{k+1} - x_k$ for each $k$. These interpolating points determine a unique polynomial of degree $n$ (or possibly lower, if the points happen to lie on such a curve). Another way to express the equation of this polynomial is with the Newton Forward Interpolating Formula:
$$P_n(x) = y_0 + \frac{\Delta y_0}{\Delta x}(x-x_0) + \frac{\Delta^2 y_0}{2!(\Delta x)^2}(x-x_0)(x-x_1) + \frac{\Delta^3 y_0}{3!(\Delta x)^3}(x-x_0)(x-x_1)(x-x_2) + \cdots + \frac{\Delta^n y_0}{n!(\Delta x)^n}(x-x_0)(x-x_1)\cdots(x-x_{n-1}), \qquad (3)$$
where
$$\Delta y_0 = y_1 - y_0,$$
$$\Delta^2 y_0 = \Delta y_1 - \Delta y_0 = y_2 - 2y_1 + y_0,$$
$$\Delta^3 y_0 = \Delta^2 y_1 - \Delta^2 y_0 = y_3 - 3y_2 + 3y_1 - y_0,$$
$$\vdots$$
$$\Delta^n y_0 = \Delta^{n-1} y_1 - \Delta^{n-1} y_0 = y_n - n y_{n-1} + \frac{n(n-1)}{2!}\,y_{n-2} - \frac{n(n-1)(n-2)}{3!}\,y_{n-3} + \cdots + (-1)^n y_0.$$
Notice that this polynomial $P_n(x)$ is also composed of $n+1$ distinct polynomial terms, but each of degree $i$, $i = 0, 1, 2, \ldots, n$. Term by term, each polynomial is of one degree higher than the previous one.
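The leading forward differences $\Delta^k y_0$ that appear in these formulas can be generated by repeatedly differencing the list of $y$-values; a minimal sketch (hypothetical helper name):

```python
def forward_differences(ys):
    """Return [y0, Δy0, Δ²y0, ..., Δⁿy0] for equally spaced data."""
    coeffs = [ys[0]]
    row = list(ys)
    while len(row) > 1:
        # Difference adjacent entries; the first entry of each new
        # row is the next leading forward difference.
        row = [b - a for a, b in zip(row, row[1:])]
        coeffs.append(row[0])
    return coeffs

# The running example (1,2), (3,8), (5,4): Δy0 = 6, Δ²y0 = -10
print(forward_differences([2, 8, 4]))  # → [2, 6, -10]
```

Each pass over the list produces one more row of the difference table, which is exactly the tabular scheme used later in the article.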
At a quick glance, the above formula is obviously very similar to the formula for the $n$th degree Taylor polynomial approximation for a function $y = f(x)$ at $x = x_0$:
$$T_n(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(x_0)}{2!}(x-x_0)^2 + \frac{f'''(x_0)}{3!}(x-x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x-x_0)^n. \qquad (4)$$
Let’s see just how close the two are. Consider what happens to Newton’s interpolating formula (3) as the stepsize $\Delta x \to 0$. Clearly, the quantities $\frac{\Delta^i y_0}{(\Delta x)^i}$ in the polynomial expression (3) converge toward the successive derivatives $f^{(i)}(x_0)$ of the function at $x = x_0$. Moreover, as $\Delta x \to 0$, all of the interpolating points approach $x = x_0$, though they do retain the uniform spacing. As all the interpolating points coalesce at $x = x_0$, we see that the products of the various factors all converge toward $(x - x_0)$ and so approach the successive powers of $(x - x_0)$. Thus, the Taylor polynomial of degree $n$ for a function is the limit of the Newton interpolating polynomials as $\Delta x \to 0$.
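This convergence can be watched numerically. For $f(x) = \sin x$ at $x_0 = 0$, the scaled differences $\Delta^i y_0 / (\Delta x)^i$ should approach the derivatives $1, 0, -1$ for orders $1, 2, 3$; a sketch (hypothetical helper name):

```python
import math

def scaled_differences(f, x0, dx, n):
    """Return Δ^i y0 / (dx)^i for i = 0..n at equally spaced points."""
    ys = [f(x0 + k * dx) for k in range(n + 1)]
    out = []
    row = ys
    for i in range(n + 1):
        out.append(row[0] / dx**i)
        row = [b - a for a, b in zip(row, row[1:])]
    return out

# For f = sin at x0 = 0 the derivatives of orders 1..3 are 1, 0, -1
for dx in (0.1, 0.01, 0.001):
    d = scaled_differences(math.sin, 0.0, dx, 3)
    print([round(v, 4) for v in d[1:]])
# Each printed row gets closer to [1.0, 0.0, -1.0] as dx shrinks
```

(In exact arithmetic the convergence continues indefinitely; in floating point, very small $\Delta x$ eventually causes cancellation in the higher differences.)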
We next consider how Newton’s interpolating formula comes about (which might reflect Newton’s own thought process in originally developing the formula). Suppose we start with the first two points $(x_0, y_0)$ and $(x_1, y_1)$, which determine a line whose algebraic representation can be written
$$y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}(x - x_0) = y_0 + \frac{\Delta y_0}{\Delta x}(x - x_0).$$
Now suppose we also have the third point $(x_2, y_2)$. Unless the three points happen to be collinear, which is highly unlikely, the line determined by the first two points will miss the third point. In particular, if we extend that line until it reaches $x = x_2 = x_0 + 2\Delta x$, then the height along the line is
$$y = y_0 + \frac{\Delta y_0}{\Delta x}(x_2 - x_0) = y_0 + \frac{y_1 - y_0}{\Delta x}(x_0 + 2\Delta x - x_0) = y_0 + \frac{y_1 - y_0}{\Delta x}(2\Delta x) = 2y_1 - y_0.$$
The question we pose is: How can we adapt the above linear function to create a quadratic function that also passes through the third point? To do so, we want to introduce a quadratic term that forces the entire quadratic function to pass through the third point while maintaining the same two values $y_0$ and $y_1$ at $x = x_0$ and $x = x_1$, respectively. To do this, we write the quadratic polynomial in the form
$$y = y_0 + \frac{y_1 - y_0}{\Delta x}(x - x_0) + A(x - x_0)(x - x_1),$$
where $A$ is some constant to be determined. Notice that the presence of the factors $(x - x_0)$ and $(x - x_1)$ guarantees that the quadratic term has zero contribution at the first two interpolating points.
To determine $A$, we impose the condition that the value of the quadratic function when $x = x_2$ must be $y = y_2$. We therefore obtain
$$y = y_0 + \frac{y_1 - y_0}{\Delta x}(x_0 + 2\Delta x - x_0) + A(x_0 + 2\Delta x - x_0)(x_0 + 2\Delta x - x_1) = y_0 + \frac{y_1 - y_0}{\Delta x}(2\Delta x) + A(2\Delta x)(\Delta x) = y_2.$$
Consequently,
$$2A(\Delta x)^2 = y_2 - 2(y_1 - y_0) - y_0 = y_2 - 2y_1 + y_0,$$
and so
$$A = \frac{y_2 - 2y_1 + y_0}{2(\Delta x)^2}.$$
Notice that the numerator $y_2 - 2y_1 + y_0$ is the second difference $\Delta^2 y_0 = \Delta y_1 - \Delta y_0$. Therefore, we write
$$A = \frac{\Delta^2 y_0}{2(\Delta x)^2}$$
and
$$y = y_0 + \frac{\Delta y_0}{\Delta x}(x - x_0) + \frac{\Delta^2 y_0}{2(\Delta x)^2}(x - x_0)(x - x_1).$$
When doing this in class, we suggest asking the students to extend this argument to derive the cubic Newton interpolating formula.
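The same argument can be phrased computationally: given a polynomial fitting the first $k$ points, add a term $A(x-x_0)\cdots(x-x_{k-1})$ and choose $A$ so the next point is matched. A sketch of this incremental construction (hypothetical helper names; this works for any spacing, not just uniform):

```python
def newton_incremental(pts):
    """Build the interpolating polynomial one point at a time.

    Returns the coefficients A_k of
        P(x) = A0 + A1(x-x0) + A2(x-x0)(x-x1) + ...
    Each A_k is chosen to match the next point; the new term vanishes
    at x0..x_{k-1}, so the earlier fits are left untouched.
    """
    xs = [p[0] for p in pts]
    coeffs = []

    def eval_p(x):
        # Evaluate the partial polynomial built so far
        total, prod = 0.0, 1.0
        for k, a in enumerate(coeffs):
            total += a * prod
            prod *= (x - xs[k])
        return total

    for k, (xk, yk) in enumerate(pts):
        prod = 1.0
        for j in range(k):
            prod *= (xk - xs[j])
        coeffs.append((yk - eval_p(xk)) / prod)
    return coeffs

# The running example: coefficients 2, 3, -5/4 as derived in the text
print(newton_incremental([(1, 2), (3, 8), (5, 4)]))  # → [2.0, 3.0, -1.25]
```

Adding a fourth point simply appends one more coefficient without recomputing the first three, which is the key practical advantage of the Newton form discussed later.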
Let’s find the interpolating quadratic using the Newton interpolating formula for the above example where the three interpolating points are $(1, 2)$, $(3, 8)$, and $(5, 4)$. We have
$$P_2(x) = 2 + \frac{6}{2}(x-1) + \frac{-10}{2!\,2^2}(x-1)(x-3) = 2 + 3(x-1) - \frac{5}{4}(x-1)(x-3).$$
Notice that the sum of the first two terms, $P_1(x) = 2 + 3(x-1)$, determines the linear function that passes through the first two points $(1, 2)$ and $(3, 8)$. Figure 3 shows the graphs of $P_1(x)$ and $P_2(x)$. Let’s include the graph of the quadratic term, denoted by $C_2(x) = -\frac{5}{4}(x-1)(x-3)$, to see how the quadratic term alone changes the interpolation polynomial $P_1(x)$ based on the first two points into the interpolation polynomial $P_2(x)$ based on all three points. Figure 4 shows that the quadratic component $C_2(x)$ has two real zeros precisely in line vertically with the first two interpolating points. Therefore, $C_2(x)$ automatically contributes zero at $x = 1$ and $x = 3$, so that $P_2(x) = P_1(x) + C_2(x) = P_1(x)$ at $x = 1$ and $x = 3$, which implies that the presence of $C_2(x)$ in $P_2(x)$ does not alter the perfect fit of the first two points by $P_1(x)$. In addition, observe that at $x = 5$, the value of $C_2(5)$ is the amount that “bends” the linear function $P_1(x)$ at $x = 5$ in order for $P_2(x)$ to pass through the third interpolating point $(5, 4)$.
[Figure 3: Two successive Newton interpolating polynomials $P_1(x)$ and $P_2(x)$]
[Figure 4: How the quadratic term $C_2(x)$ affects the linear interpolation $P_1(x)$; the figure marks two vertical segments with equal length]
To better understand the process of finding the interpolating polynomial by using the Newton formula, we expand the data set of $(1, 2)$, $(3, 8)$, and $(5, 4)$ to include two additional points $(7, 2)$ and $(9, 22)$. These five points determine a quartic polynomial. We show the successive Newton interpolating polynomials in Figure 5. Once again, notice
[Figure 5: Successive Newton interpolating polynomials up to degree 4]
that the linear function passes through the first two points, though it completely misses all
the remaining points. The quadratic function passes through the first three points, but
then misses all the subsequent points. The cubic function passes through the first four
points, but comes nowhere near the final point. It is only the quartic function that passes
through all five points.
The approach used to introduce Newton’s formula in many numerical analysis texts tends to be rather abstract and, as such, conveys little in the way of understanding to many students. Moreover, some texts give a somewhat misleading image of what is actually happening; they tend to say something to the effect that, if you have $n+1$ interpolating points, you usually need a polynomial of degree $n$ to fit them, and any lower degree polynomial misses the points. The derivation and examples used above show that this is not exactly the truth: it is misleading to suggest that the lower degree polynomials miss all the points; they only miss the points further to the right.
Interpolation and Approximation of Functions Interpolation provides an important tool for approximating a function. When we work with a function that involves more than the basic arithmetic operations, we may want to replace it with a polynomial for quick and inexpensive computations. In this case, what often comes to mind is the Taylor approximation of a function, provided that the function is sufficiently differentiable. Because of the similarity between the Taylor approximation and the Newton formula, we now consider interpolating polynomials in the Newton form for the approximation of functions.
As an example, we approximate the sine function $y = \sin x$ on the interval $[0, 2\pi]$. Let’s interpolate the sine function at the five uniformly spaced points $(0, 0)$, $(\pi/2, 1)$, $(\pi, 0)$, $(3\pi/2, -1)$, and $(2\pi, 0)$. Just as we did earlier, we construct the successive Newton interpolating polynomials, shown in Figure 6.
Notice that the quadratic interpolating polynomial $P_2(x)$ fits the sine function reasonably well on the interval $[0, \pi]$. As expected, the quadratic polynomial $P_2(x)$ misses all the points on the right. The cubic polynomial $P_3(x)$ and quartic polynomial $P_4(x)$ are identical because the fourth difference $\Delta^4 y_0$ is zero. This is an advantage of using the Newton formula: it determines the degree of the interpolating polynomial as we construct it.

[Figure 6: Successive Newton interpolating polynomials for the sine function]

Had we
used the Lagrange formula, we would only learn the degree of the interpolating polynomial after simplifying the expression for $L_4(x)$. Overall, $P_3(x)$ approximates the sine function reasonably well on the entire interval $[0, 2\pi]$.
On the other hand, the cubic Taylor polynomial for $y = \sin x$ at $x = 0$, $T_3(x) = x - \frac{x^3}{3!}$, is shown in Figure 7 along with the sine function and the cubic interpolating polynomial. We see that the Taylor approximation achieves high accuracy between $x = 0$ and roughly $x = \pi/2$; then the difference between the cubic Taylor polynomial and the sine function grows significantly as $x$ moves to the right beyond $x = \pi/2$. Clearly, the interpolating polynomial gives us a better approximation of a function on a larger interval compared with the Taylor approximation.
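A quick numerical comparison bears this out. The sketch below (hypothetical helper names) builds the cubic interpolant of $\sin x$ through the first four nodes using divided differences, then measures the worst-case error of it and of $T_3$ over a fine grid on $[0, 2\pi]$:

```python
import math

def divided_coeffs(xs, ys):
    """Newton-form coefficients via divided differences (any spacing)."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def eval_newton(coef, xs, x):
    """Horner-style evaluation of the Newton form."""
    total = coef[-1]
    for c, xk in zip(reversed(coef[:-1]), reversed(xs[:-1])):
        total = total * (x - xk) + c
    return total

xs = [0, math.pi / 2, math.pi, 3 * math.pi / 2]   # interpolation nodes
coef = divided_coeffs(xs, [math.sin(x) for x in xs])

grid = [2 * math.pi * i / 1000 for i in range(1001)]
err_interp = max(abs(eval_newton(coef, xs, x) - math.sin(x)) for x in grid)
err_taylor = max(abs((x - x**3 / 6) - math.sin(x)) for x in grid)

print(f"max |P3 - sin| on [0, 2pi]: {err_interp:.3f}")
print(f"max |T3 - sin| on [0, 2pi]: {err_taylor:.3f}")
# The interpolant stays within a modest error band across the whole
# interval; the Taylor polynomial drifts far from sin x on the right.
```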
[Figure 7: Cubic Taylor polynomial and cubic interpolating polynomial of the sine function]
Comparisons Between Lagrange and Newton Interpolation
The Lagrange and
Newton interpolating formulas provide two different forms for an interpolating
polynomial, even though the interpolating polynomial is unique. When we want a quick
symbolic expression of the interpolating polynomial, the Lagrange formula seems to be
the way to go. For this reason, the Lagrange form is most often used for deriving
formulas for approximating derivatives and integrals. For example, many numerical
analysis textbooks (for example, [1] and [2]) establish the trapezoidal rule and Simpson’s
rule by using the Lagrange formula for linear and quadratic interpolating polynomials to
approximate the integrand, respectively. However, the Newton formula is much better
for computation than the Lagrange formula.
When using the interpolating polynomials for working with functions that are
stored in tabular form, we often choose the Newton formula. As we will show below, the
forward differences $\Delta^n y_0$ that determine the coefficients of the Newton formula can be
easily constructed using a tabular form. More importantly, the Newton formula provides
a generally accurate idea of when the degree n is sufficiently large by observing the size
of the terms with higher-order forward differences. This is a useful technique in deciding
what degree polynomial to use.
Suppose we are given five points $(2.0, 1.414214)$, $(2.1, 1.449138)$, $(2.2, 1.483240)$, $(2.3, 1.516575)$, and $(2.4, 1.549193)$, which is based on an example in [1]. We construct the forward difference table for these five points, shown in Table 1.
Table 1: Forward difference table for the five points

 i    x_i    y_i         Δy_i        Δ²y_i        Δ³y_i       Δ⁴y_i
 0    2.0    1.414214    0.034924    -0.000822    0.000055    -0.000005
 1    2.1    1.449138    0.034102    -0.000767    0.000050
 2    2.2    1.483240    0.033335    -0.000717
 3    2.3    1.516575    0.032618
 4    2.4    1.549193
The last five entries in the first row are used to determine the coefficients of the Newton interpolating polynomial
$$P_4(x) = 1.414214 + 0.34924(x-2.0) - 0.0411(x-2.0)(x-2.1) + 0.009167(x-2.0)(x-2.1)(x-2.2) - 0.002083(x-2.0)(x-2.1)(x-2.2)(x-2.3).$$
By applying the usual optimization approach from Calculus I to the fourth degree polynomial term $-0.002083(x-2.0)(x-2.1)(x-2.2)(x-2.3)$ on the interval $[2.0, 2.4]$, we find that
$$\max_{2.0 \le x \le 2.4} \left| (x-2.0)(x-2.1)(x-2.2)(x-2.3) \right| \approx 0.0024.$$
Then the largest possible value that the last term of $P_4(x)$ will contribute to the interpolating polynomial at any point in the interval $[2.0, 2.4]$ is roughly $0.002083 \times 0.0024 \approx 5 \times 10^{-6}$.
This result may be improved on by following a common practice used in approximating functions with the Newton formula. When we want to approximate the function at a point $x$ that is inside the first half of the interval, we use the above Newton forward formula. Otherwise, we use the Newton backward formula, or equivalently, we apply the Newton forward formula to the same table with the entries listed in reverse order. Staying with the original notation for the interpolating points $(x_i, y_i)$ for $i = 0, 1, 2, \ldots, n$, we define the backward differences as follows. Let $\nabla y_n = y_n - y_{n-1}$, $\nabla^2 y_n = \nabla(\nabla y_n) = \nabla y_n - \nabla y_{n-1} = y_n - 2y_{n-1} + y_{n-2}$, and in general, $\nabla^i y_n = \nabla(\nabla^{i-1} y_n) = \nabla^{i-1} y_n - \nabla^{i-1} y_{n-1}$ for $i \ge 1$. Then the Newton Backward Interpolating Formula can be expressed as
$$P_n(x) = y_n + \frac{\nabla y_n}{\Delta x}(x-x_n) + \frac{\nabla^2 y_n}{2!(\Delta x)^2}(x-x_n)(x-x_{n-1}) + \frac{\nabla^3 y_n}{3!(\Delta x)^3}(x-x_n)(x-x_{n-1})(x-x_{n-2}) + \cdots + \frac{\nabla^n y_n}{n!(\Delta x)^n}(x-x_n)(x-x_{n-1})\cdots(x-x_1).$$
Assume that we want to approximate the function between 2.0 and 2.2, the first half of the interval $[2.0, 2.4]$. Now the maximum value of the last term of $P_4(x)$ on $[2.0, 2.2]$ is only about $2 \times 10^{-7}$. Since our data points are given to six decimal places, a Newton forward formula of order greater than three won’t increase the accuracy of the approximation for $x$ in the interval $[2.0, 2.2]$. Therefore, we have achieved the desired level of accuracy by using polynomials of degree three (or even lower). Table 2 gives the approximations of $f(2.03)$, $f(2.09)$, and $f(2.15)$ using $P_1(x)$, $P_2(x)$, $P_3(x)$, and $P_4(x)$. The identical values of $P_3(x)$ and $P_4(x)$ at $x = 2.03$, $x = 2.09$, and $x = 2.15$ confirm the above observation. A similar analysis can be given for the approximation of the function between 2.2 and 2.4 using the Newton backward formula. We present the results of the approximations of $f(2.24)$, $f(2.31)$, and $f(2.37)$ using the Newton backward formula in Table 3.
Table 2: Example of use of the Newton forward formula

 x_i        2.03       2.09       2.15
 P_1(x_i)   1.424691   1.445646   1.466600
 P_2(x_i)   1.424777   1.445683   1.466292
 P_3(x_i)   1.424780   1.445684   1.466289
 P_4(x_i)   1.424780   1.445684   1.466289
 f(x_i) ≈   1.424780   1.445684   1.466289
Table 3: Example of use of the Newton backward formula

 x_i        2.24       2.31       2.37
 P_1(x_i)   1.497004   1.519837   1.539408
 P_2(x_i)   1.496660   1.519869   1.539483
 P_3(x_i)   1.496663   1.519868   1.539480
 P_4(x_i)   1.496663   1.519868   1.539480
 f(x_i) ≈   1.496663   1.519868   1.539480
In contrast, the Lagrange interpolation approach requires far more computation: each time you increase the number of interpolating points by one, you have to recalculate everything. This makes Lagrange interpolation less convenient for seeking the lowest degree interpolating polynomial that fits the data within a given error tolerance.
Interpolation and Regression From time to time, we may have a large set of data. If we have $n+1$ points (where $n$ is large), the interpolating polynomial is of degree $n$, presuming that the points do not fall on a polynomial of lower degree. This high degree polynomial is an exact match at the data points, but can be a very poor match between those points. This can happen because the polynomial may change direction up to $n-1$ times. In the process, the interpolating polynomial may shoot way up or down after passing through each interpolating point in order to reach the next turning point and come back down or up to hit the next interpolating point. We illustrate such a case in Figure 8, where the interpolating points are $(0, 5)$, $(1, 9)$, $(2, -10)$, $(3, 40)$, $(4, 5)$, $(5, 50)$, and $(6, 29)$. The problematic portions of the interpolating polynomial are between the first two interpolating points and between the last two interpolating points. Such oscillatory behavior may dramatically affect the accuracy of approximation between interpolating points and make the approximation very sensitive to any changes in the interpolating points. Moreover, there is likely a high computational cost to using the interpolating function. If we opt for a lower degree polynomial by using the method we discussed above, we will have to go back to the data repeatedly to select a subset of interpolating points according to the value of $x$ of interest in order to get a good approximation.
[Figure 8: Oscillatory behavior of the interpolating polynomial]
If the exact fit is not the only concern, we may overcome these difficulties by
finding a lower degree polynomial that will give reasonable accuracy. One way to find
such a lower degree polynomial is to use regression. A regression polynomial only
attempts to capture the overall trend in a set of data and, as such, can potentially give
much better approximations between the interpolating points, even though it doesn’t
necessarily pass through any of them.
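As an illustration, a least-squares quadratic for the seven points above can be found by solving the $3 \times 3$ normal equations; a minimal sketch in pure Python (hypothetical helper name, and no claim that degree 2 is the best choice for this data):

```python
def lstsq_poly(pts, deg):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c[0] + c[1] x + ... + c[deg] x^deg.
    Fine for small degrees; the normal equations become
    ill-conditioned as the degree grows.
    """
    n = deg + 1
    # Build A^T A and A^T y for the Vandermonde matrix A
    ata = [[sum(x ** (i + j) for x, _ in pts) for j in range(n)]
           for i in range(n)]
    aty = [sum(y * x ** i for x, y in pts) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        aty[col], aty[piv] = aty[piv], aty[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(ata[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (aty[i] - s) / ata[i][i]
    return coef

pts = [(0, 5), (1, 9), (2, -10), (3, 40), (4, 5), (5, 50), (6, 29)]
print([round(c, 3) for c in lstsq_poly(pts, 2)])
```

The fitted quadratic passes through none of the seven points, but it stays within the range of the data instead of overshooting between points the way the degree-6 interpolant does.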
Concluding Remarks We began this article by quoting Richard Hamming’s famous
statement about the purpose of computing being insight. While interpolation methods are
too often considered simply as computational procedures, we hope that our emphasis on
the components of those formulas provides much in the way of insight into where those
formulas came from, why they work, and how they should be used.
References:
[1] Atkinson, K. 1988. An Introduction to Numerical Analysis, 2nd Ed. New York: John
Wiley & Sons.
[2] Burden, R. and Faires, J. 2010. Numerical Analysis, 9th Ed. Boston: Brooks/Cole.
[3] Hamming, R. 1987. Numerical Methods for Scientists and Engineers, 2nd Ed. New
York: Dover Publications.
Abstract This article takes a close look at Lagrange and Newton interpolation by examining graphically the components of each of these formulas. While interpolation methods are too often considered simply as computational procedures, we hope that our emphasis on the components of the polynomial terms in these formulas provides much in the way of insight into where these formulas came from, why they work, and how they should be used.
Keywords interpolating polynomial, Lagrange formula, Newton formula, Taylor polynomial
Suggested Running Head
Components of Lagrange and Newton Interpolation