18.330 Lecture Notes:
Numerical Differentiation
Homer Reid
February 25, 2014
Suppose we have a black-box function $f(x)$. We can query this function for its value at any $x$, and we will get back a number, but we don't have an analytical formula for $f(x)$. How do we estimate values of $f'(x)$?
Contents

1 Finite-difference approximations of the first derivative
  1.1 Forward differencing
  1.2 Backward differencing
  1.3 Centered differencing
  1.4 Higher-order finite-difference formulas

2 Finite-difference approximations of higher derivatives

3 Finite-differencing of multivariable functions

4 Finite-differencing as matrix-vector multiplication
1 Finite-difference approximations of the first derivative

1.1 Forward differencing
The definition of the derivative of a function $f(x)$ at a point $x$ is
$$ f'(x) \equiv \left.\frac{df}{dx}\right|_{x} = \lim_{h\to 0} \frac{f(x+h)-f(x)}{h}. \qquad (1) $$
The simplest approach to numerical differentiation is simply to arrest the limiting process here and evaluate the RHS of (1) at a finite value of h. This defines
what is known as the forward-finite-difference (FFD) (or just forward-difference)
approximation to the derivative:
$$ f'_{\rm FFD}(h;x) \equiv \frac{f(x+h)-f(x)}{h}. \qquad (2) $$
It's easy to assess the error incurred by the forward-difference procedure. Recall that the Taylor-series expression for the quantity $f(x+h)$ is
$$ f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + O(h^3). $$
Inserting this into (2), we find
$$ f'_{\rm FFD}(h;x) = f'(x) + \frac{h}{2} f''(x) + O(h^2). \qquad (3) $$
The first term on the RHS here is the quantity we are trying to compute, and
everything else is an error term. Thus we have
$$ f'_{\rm FFD}(h;x) - f'(x) = \frac{h}{2} f''(x) + O(h^2). \qquad (4) $$
As usual in error analysis, this equation is not useful for giving us an actual number for the error, because we don't know how to evaluate $f''(x)$. The only important thing is the $h$ dependence: the error is linear in $h$, i.e. we have a first-order method. To obtain one more digit of accuracy (i.e. 10× smaller error) we must use a 10× smaller value of $h$.
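To make this concrete, here is a minimal sketch (in Python with NumPy; the test function $\sin x$ and the evaluation point are arbitrary choices made for illustration, not part of the notes) of the forward-difference formula (2), together with a quick check that the error does indeed shrink linearly with $h$:

```python
import numpy as np

def forward_difference(f, x, h):
    """Forward-difference approximation (2) to f'(x) with step h."""
    return (f(x + h) - f(x)) / h

# Test on f(x) = sin(x), whose exact derivative is cos(x).
x0 = 1.0
for h in [1e-1, 1e-2, 1e-3]:
    err = abs(forward_difference(np.sin, x0, h) - np.cos(x0))
    print(f"h = {h:.0e}   error = {err:.2e}")   # error falls roughly like h
```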
1.2 Backward differencing
It may happen that values of $f(x+h)$ are not available for positive $h$. This may happen, for example, if the point $x$ lies at the right endpoint of the interval over which our function is computable or measurable. (I mean measurable in the experimental sense, not the sense of Lebesgue integration. Think of $f(x)$ as a quantity reported by an experimental apparatus on which we can't turn the dial any further than some $x_{\max}$.) Of course values of $f(x+h)$ must exist for at least some nonzero range of positive $h$, since otherwise the derivative at $x$ would not be defined, but those values may not be accessible to us for one reason or another.
In this case, we can use backward differencing:
$$ f'_{\rm BD}(h;x) \equiv \frac{f(x)-f(x-h)}{h}. \qquad (5) $$
It's easy to show that backward differencing, like forward differencing, is a first-order method.
1.3 Centered differencing
Consider the Taylor-series expansions of $f(x-h)$ and $f(x+h)$:
$$ f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + O(h^4) \qquad (6a) $$
$$ f(x-h) = f(x) - h f'(x) + \frac{h^2}{2} f''(x) - \frac{h^3}{6} f'''(x) + O(h^4) \qquad (6b) $$
Careful scrutiny of these equations reveals that by subtracting them and dividing by 2 we can cancel the second-derivative term (and in fact all even-derivative terms) that appears in the error of (3):
$$ \frac{f(x+h) - f(x-h)}{2} = h f'(x) + \frac{h^3}{6} f'''(x) + O(h^5). $$
Now just divide by $h$ to obtain the centered-difference approximation to the derivative:
$$ f'_{\rm CD}(h;x) \equiv \frac{f(x+h) - f(x-h)}{2h}. \qquad (7) $$
The above analysis shows that
$$ f'_{\rm CD}(h;x) - f'(x) = \frac{h^2}{6} f'''(x) + O(h^4). \qquad (8) $$
Thus centered-differencing is a method of order 2.
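A quick numerical comparison makes the difference in convergence order visible. The sketch below (again Python/NumPy, with $\sin x$ as an arbitrary test function) evaluates the backward-difference stencil (5) and the centered-difference stencil (7) side by side; the centered error should drop by roughly a factor of 100 each time $h$ shrinks by 10, versus a factor of 10 for the one-sided stencil:

```python
import numpy as np

def fd_backward(f, x, h):
    """Backward-difference approximation (5), first order."""
    return (f(x) - f(x - h)) / h

def fd_centered(f, x, h):
    """Centered-difference approximation (7), second order."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0, exact = 1.0, np.cos(1.0)   # test function f(x) = sin(x)
for h in [1e-1, 1e-2, 1e-3]:
    err_bd = abs(fd_backward(np.sin, x0, h) - exact)
    err_cd = abs(fd_centered(np.sin, x0, h) - exact)
    print(f"h = {h:.0e}   BD error = {err_bd:.2e}   CD error = {err_cd:.2e}")
```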
1.4 Higher-order finite-difference formulas
Formulas like (2), (5), and (7) are known as finite-difference stencils: they
are linear combinations of n function samples that approximate the derivative
with pth-order convergence. The forward-difference, backward-difference, and
centered-difference stencils have (n, p) = (2, 1), (2, 1), (2, 2) respectively.
By increasing the number of function samples n that we are willing to compute, it is easy to construct finite-difference stencils that achieve any desired
convergence order p. All you have to do is write down the Taylor expansions of
the quantities
$$ \ldots,\; f(x-2h),\; f(x-h),\; f(x),\; f(x+h),\; f(x+2h),\; \ldots $$
and construct clever weighted combinations of these that pick off successively
higher-order terms in the error estimates of equations (3) and (8).
However, we generally don't carry out finite-differencing beyond the centered-difference case. The reason is that in constructing formulas of this type we are
essentially constructing and differentiating a polynomial interpolant through
data samples at uniformly-spaced intervals. As we have noted many times, this
procedure is badly-behaved due to the Runge phenomenon: the more you try
to bend and squeeze a high-order polynomial to fit through evenly-spaced data
points, the more it will bulge out in between the points. If you need a numerical
differentiation stencil that achieves a rapid convergence rate, a better idea is to
use non-uniformly spaced points to construct and differentiate a polynomial interpolant. We will revisit this topic when we consider Chebyshev interpolation
later in the course.
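For reference, here is what one such higher-order stencil looks like in practice. The sketch below implements the standard five-point centered stencil for the first derivative, which has $(n, p) = (5, 4)$; it is obtained by combining the Taylor expansions of $f(x \pm h)$ and $f(x \pm 2h)$ so that the $h^2$ and $h^3$ error terms cancel. This is offered purely as an illustration of the construction described above (the Runge caveat still stands), and the test function is an arbitrary choice:

```python
import numpy as np

def fd_centered_4th(f, x, h):
    """Five-point centered stencil for f'(x) with fourth-order convergence."""
    return (-f(x + 2*h) + 8*f(x + h) - 8*f(x - h) + f(x - 2*h)) / (12 * h)

# Halving h should cut the error by roughly a factor of 2^4 = 16.
x0 = 1.0
for h in [1e-1, 5e-2, 2.5e-2]:
    err = abs(fd_centered_4th(np.exp, x0, h) - np.exp(x0))
    print(f"h = {h:.2e}   error = {err:.2e}")
```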
2 Finite-difference approximations of higher derivatives
We can play similar games to write down approximate formulas for higher
derivatives. For example, go back to equations (6) and suppose that we add
the two equations together instead of subtracting them:
$$ f(x+h) + f(x-h) = 2f(x) + h^2 f''(x) + \frac{h^4}{12} f''''(x) + \cdots $$
Clearly all we have to do is subtract off $2f(x)$ and divide by $h^2$ to obtain an approximation to the second derivative:
$$ f''_{\rm CD}(h;x) \equiv \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} = f''(x) + \frac{h^2}{12} f''''(x) + \cdots \qquad (9) $$
We call this the “centered-difference” approximation to the second derivative;
evidently it achieves second-order convergence.
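Here is the same stencil as a short sketch (Python/NumPy, arbitrary test function), together with a check of the advertised second-order convergence:

```python
import numpy as np

def fd_second_derivative(f, x, h):
    """Centered-difference approximation (9) to f''(x)."""
    return (f(x + h) - 2*f(x) + f(x - h)) / h**2

# Test on f(x) = sin(x), whose exact second derivative is -sin(x).
x0 = 1.0
for h in [1e-1, 1e-2, 1e-3]:
    err = abs(fd_second_derivative(np.sin, x0, h) + np.sin(x0))
    print(f"h = {h:.0e}   error = {err:.2e}")   # error falls roughly like h^2
```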
3 Finite-differencing of multivariable functions
Next suppose we want to differentiate a function of more than one variable, say
f (x, y).
If we are only interested in partial derivatives with respect to a single variable, we can just apply the formulas for the one-dimensional case with the other
variables held fixed. For example:
$$ \left.\frac{\partial f}{\partial x}\right|_{(x,y)} \approx \frac{f(x+h,\,y) - f(x,\,y)}{h} \qquad \text{(first-order convergence)} $$
$$ \left.\frac{\partial f}{\partial y}\right|_{(x,y)} \approx \frac{f(x,\,y+h) - f(x,\,y-h)}{2h} \qquad \text{(second-order convergence)} $$
$$ \left.\frac{\partial^2 f}{\partial x^2}\right|_{(x,y)} \approx \frac{f(x-h,\,y) - 2f(x,\,y) + f(x+h,\,y)}{h^2} \qquad \text{(second-order convergence)} $$
However, things get a little more interesting when we go to compute mixed partial derivatives. Consider, for example, the simultaneous double Taylor expansion of $f(x,y)$:
$$ f(x+\Delta_x,\, y+\Delta_y) = f(x,y) + \Delta_x f_x(x,y) + \Delta_y f_y(x,y) + \frac{\Delta_x^2}{2} f_{xx}(x,y) + \Delta_x \Delta_y f_{xy}(x,y) + \frac{\Delta_y^2}{2} f_{yy}(x,y) + O(\Delta^3). $$
By writing out this equation for various possible choices of $\Delta_x$ and $\Delta_y$ and taking linear combinations of the results, it is possible to kill off various terms on the RHS to obtain stencils for various partial derivatives. You will explore this possibility in your problem set this week.
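As one example of the kind of stencil this procedure produces (a sketch for illustration only; the stencils you derive in the problem set may differ), the four corner samples $f(x \pm h, y \pm h)$ can be combined so that everything except the $\Delta_x \Delta_y$ term cancels through second order, yielding a centered approximation to $f_{xy}$:

```python
import numpy as np

def fd_mixed_partial(f, x, y, h):
    """Four-point centered stencil for the mixed partial f_xy (second order)."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)

# Check on f(x, y) = sin(x) cos(y), for which f_xy = -cos(x) sin(y).
f = lambda x, y: np.sin(x) * np.cos(y)
x0, y0 = 0.7, 0.3
exact = -np.cos(x0) * np.sin(y0)
for h in [1e-1, 1e-2]:
    err = abs(fd_mixed_partial(f, x0, y0, h) - exact)
    print(f"h = {h:.0e}   error = {err:.2e}")
```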
Figure 1: A set of N = 5 equally-spaced points in the interior of an interval
[a, b]. The spacing between the points is h = (b − a)/(N + 1).
4 Finite-differencing as matrix-vector multiplication
Consider an interval $[x_a, x_b]$ and a function $f(x)$ that vanishes at the endpoints, i.e. $f(x_a) = f(x_b) = 0$. Suppose we have samples of $f$ at a set of $N$ evenly-spaced points between $x_a$ and $x_b$. More specifically, break up the interval into $N+1$ segments of width
$$ h = \frac{x_b - x_a}{N+1} $$
and denote the endpoints of these intervals and the values of $f$ at those points by
$$ x_n = x_a + nh, \qquad f_n = f(x_n), \qquad n = 1, 2, \cdots, N. $$
(For convenience we will also use the notation $x_0 = x_a$ and $x_{N+1} = x_b$.)
Now suppose we try to compute the second derivative of $f$ at the points $x_n$ using the second-order finite-difference stencil (9). We find
$$ f''(x_1) \approx \frac{1}{h^2}\Big[ f(x_0) - 2f(x_1) + f(x_2) \Big] = \frac{1}{h^2}\Big[ -2f(x_1) + f(x_2) \Big] \qquad (10a) $$
(where we used the boundary condition $f(x_0) = 0$)
$$ f''(x_2) \approx \frac{1}{h^2}\Big[ f(x_1) - 2f(x_2) + f(x_3) \Big] \qquad (10b) $$
$$ f''(x_3) \approx \frac{1}{h^2}\Big[ f(x_2) - 2f(x_3) + f(x_4) \Big] \qquad (10c) $$
$$ \vdots $$
$$ f''(x_{N-1}) \approx \frac{1}{h^2}\Big[ f(x_{N-2}) - 2f(x_{N-1}) + f(x_N) \Big] \qquad (10d) $$
$$ f''(x_N) \approx \frac{1}{h^2}\Big[ f(x_{N-1}) - 2f(x_N) \Big] \qquad (10e) $$
where in the last line we used the boundary condition $f(x_{N+1}) = 0$.
It's convenient to write equations (10) in the form of a matrix-vector product:
$$
\begin{pmatrix} f''_1 \\ f''_2 \\ f''_3 \\ \vdots \\ f''_{N-1} \\ f''_N \end{pmatrix}
\approx \frac{1}{h^2}
\begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 & 0 \\
1 & -2 & 1 & \cdots & 0 & 0 \\
0 & 1 & -2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -2 & 1 \\
0 & 0 & 0 & \cdots & 1 & -2
\end{pmatrix}
\begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_{N-1} \\ f_N \end{pmatrix}
\qquad (11)
$$
which we may write using matrix-vector notation in the form
$$ \mathbf{f}'' \approx \frac{1}{h^2}\, A\, \mathbf{f} \qquad (12) $$
where $\mathbf{f}$ and $\mathbf{f}''$ are the vectors of samples of $f$ and of its second derivative, and $A$ is the tridiagonal matrix displayed in (11).
The point of equations (11) and (12) is that the operation that takes $\mathbf{f}$ into $\mathbf{f}''$ may be thought of as matrix multiplication. Among the important consequences of this observation is that it makes it easy to invert that operation, i.e. to recover $\mathbf{f}$ from $\mathbf{f}''$:
$$ \mathbf{f} = h^2 A^{-1}\, \mathbf{f}''. \qquad (13) $$
The primary use of formulas like (13) is in the application of finite-difference
differentiation to the solution of boundary-value problems and higher-order
PDEs; the technique is known in the PDE world as the finite-difference method.
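The following sketch (Python/NumPy; the test function $\sin(\pi x)$ on $[0,1]$ and the value of $N$ are arbitrary choices) builds the tridiagonal matrix of (11), applies (12) to differentiate a vector of samples, and then applies (13) to invert the operation:

```python
import numpy as np

def second_difference_matrix(N):
    """The N x N tridiagonal matrix in (11): -2 on the diagonal, 1 off it."""
    return -2*np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)

# Sample f(x) = sin(pi x) on [0, 1]; it vanishes at both endpoints.
xa, xb, N = 0.0, 1.0, 50
h = (xb - xa) / (N + 1)
x = xa + h * np.arange(1, N + 1)
f = np.sin(np.pi * x)

A = second_difference_matrix(N)
f2 = A @ f / h**2                             # eq. (12): f'' ~ (1/h^2) A f
print(np.max(np.abs(f2 + np.pi**2 * np.sin(np.pi * x))))   # O(h^2) error

f_recovered = h**2 * np.linalg.solve(A, f2)   # eq. (13): recover f from f''
print(np.max(np.abs(f_recovered - f)))        # ~ machine precision
```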
Extension to nontrivial boundary conditions
In the example above, we used the boundary conditions $f(x_a) = f(x_b) = 0$. This simplified equations (10a) and (10e). What if instead we have nontrivial boundary conditions
$$ f(x_a) = f_a, \qquad f(x_b) = f_b $$
where $f_a, f_b$ are some given numbers? In this case, equations (10a) and (10e) are respectively modified to read
$$ f''(x_1) \approx \frac{1}{h^2}\Big[ f_a - 2f(x_1) + f(x_2) \Big] \qquad (14a) $$
$$ f''(x_N) \approx \frac{1}{h^2}\Big[ f(x_{N-1}) - 2f(x_N) + f_b \Big] \qquad (14b) $$
and equation (11) is modified to look like this:
$$
\underbrace{\begin{pmatrix} f''_1 \\ f''_2 \\ f''_3 \\ \vdots \\ f''_{N-1} \\ f''_N \end{pmatrix}}_{\mathbf{f}''}
\;-\;
\frac{1}{h^2}
\underbrace{\begin{pmatrix} f_a \\ 0 \\ 0 \\ \vdots \\ 0 \\ f_b \end{pmatrix}}_{\boldsymbol{\Delta}}
\;\approx\;
\frac{1}{h^2}
\underbrace{\begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 & 0 \\
1 & -2 & 1 & \cdots & 0 & 0 \\
0 & 1 & -2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -2 & 1 \\
0 & 0 & 0 & \cdots & 1 & -2
\end{pmatrix}}_{A}
\underbrace{\begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_{N-1} \\ f_N \end{pmatrix}}_{\mathbf{f}}
\qquad (15)
$$
What we have done here is to swing the terms involving $f_a$ and $f_b$ in (14) over to the other side of the equation in (15), that is, away from the side containing the unknowns and onto the side on which the known quantities reside. Note that the matrix $A$ and the vectors $\mathbf{f}$, $\mathbf{f}''$ in this equation are unchanged from equation (11). All that happens is that the RHS of equation (13) is now augmented by an additional term:
$$ \mathbf{f} = h^2 A^{-1}\Big[ \mathbf{f}'' - \frac{1}{h^2}\,\boldsymbol{\Delta} \Big] \qquad (16) $$
where $\boldsymbol{\Delta}$ is the sparse vector in (15) containing the boundary values of $f$.
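As a final sketch (Python/NumPy; the test problem $f(x) = e^x$ on $[0,1]$, for which $f'' = e^x$, $f_a = 1$, $f_b = e$, is an arbitrary choice), here is equation (16) used the way the finite-difference method uses it: given samples of $f''$ and the boundary values, solve for the samples of $f$:

```python
import numpy as np

xa, xb, fa, fb, N = 0.0, 1.0, 1.0, np.e, 100
h = (xb - xa) / (N + 1)
x = xa + h * np.arange(1, N + 1)

A = -2*np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)   # tridiagonal matrix from (11)/(15)
Delta = np.zeros(N)
Delta[0], Delta[-1] = fa, fb                          # sparse vector from (15)

f2 = np.exp(x)                                        # given samples of f''
f = h**2 * np.linalg.solve(A, f2 - Delta / h**2)      # eq. (16)
print(np.max(np.abs(f - np.exp(x))))                  # O(h^2) error
```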