Ch2-4

advertisement
Chapter 2
Iterative Methods for Solving Sets of Equations
2.4 Gradient Methods
2.4.1 Gradients and Hessian
Gradient methods use derivative information of a function to locate optima. At the location
where the first derivative is equal to zero, the function will have a maximum if the second
derivative is negative and will have a minimum if the second derivative is positive. The
concepts are illustrated in Figure 2.4-1 for a function with a single variable.
f(x)
f"(x1) < 0
f"(x2) > 0
x2
x1
x
Figure 2.4-1 The optimums of a one-dimensional function.
To understand how the first and second derivatives are expressed in a multidimensional
system we begin by reviewing the concept of directional derivative of a function.

Let u(x,y) is a function of two variables and v = (v1, v2) is a unit vector with arbitrary
direction  shown in Figure 2.4-2 where i and j are unit vectors in the x and y direction,
respectively.
y
du
v2j
v


v1i
x

Figure 2.4-2 Unit vector v = (v1, v2) with arbitrary direction 
2-13

v






= v1 i + v2 j = v cos  i + v sin  j
(2.4-1)
The directional derivative g’ of u measures the rate of change of u at the point (xo, yo) as we

move in the direction of v .


 u  u 

i +
g’ = u v = (
j )( v1 i + v2 j )
x
y
g’ = v1
u
u
(xo, yo) + v2
(xo, yo)
x
y
g’ = cos 
u
u
(xo, yo) + sin  (xo, yo)
x
y
(2.4-2a)
(2.4-2b)
(2.4-2c)

Consequently, if g’ = 0, then u is not changing in the direction of v . g’ is a maximum if we

move in the direction of u since g’ = |u|| v |cos(). Therefore the gradient of u, u, gives
the direction of steepest ascent.
Example 2.4-13
Evaluate the steepest ascent direction for the function u(x,y) = xy2 at the point (2, 2)
Solution
If u(x,y) is temperature then the curves xy2 = constant are called isotherms. Six of these
isotherms are plotted in Figure 2.4-3 with the point A(2, 2) for which the direction of steepest
ascent is line AB. The function at A(2, 2) can be determined as
u(2, 2) = 2(2)2 = 8
Next, the gradient of u, u, can be evaluated
u =




 u  u 
i +
j = y2 i + 2xy j = (2)2 i + 2(2)(2) j
x
y


u = 4 i + 8 j
The angle  with respect to the x axis is then
8
 = tan-1  4  = 1.107 radians (= 63.4o)
3
Numerical Methods for Engineers by Chapra and Canale
2-14
4
32
3.5
40
24
16
8
3
B
4
y
2.5
2
A
1.5
1
0.5
0
0
0.5
1
1.5
2
x
2.5
3
3.5
4
Figure 2.4-3 Isotherms for u(x,y) = xy2.
The magnitude of u is evaluated as
|u| = (42 + 82)1/2 = 8.944
Therefore line AB will initially gain 8.944 units for a unit distance advanced along this
steepest path. The value of u at B is not 8 + 8.944 = 16.944 since as we move in this
direction the value of the gradient changes. The value of u at B is
u(2.4472, 2.8944) = 2.4472(2.8944)2 = 20.502
The directional derivative g’ of u along this path is just |u|
g’ = cos 
u
u
(xo, yo) + sin  (xo, yo) = 4 cos(1.107) + 4 sin(1.107) = 8.944
x
y
The direction of steepest ascent is normal to the isotherm at the coordinate (2, 2). The Matlab
program listed in Table 2.4-1 plots the isotherms shown in Figure 2.4-3.
2-15
Table 2.4-1 Matlab program to plot isotherms for u(x,y) = xy2 ------------%
x1=2;y1=2;
dfdx=4;dfdy=8;
r=dfdy/dfdx;
dx=sqrt(1/(1+r*r));dy=r*dx;
x2=x1+dx;y2=y1+dy;
xx=[x1 x2];yy=[y1 y2];
fxy=[4 8 16 24 32 40];
x=.5:.02:4;x=x';
n=length(fxy);nx=length(x);
ym=zeros(nx,n);
for i=1:n
ym(:,i)=sqrt(fxy(i)./x);
end
plot(x,ym,xx,yy);axis equal
axis([0 4 0 4]);grid
xlabel('x');ylabel('y')
y2 =
2.8944
>> x2
x2 =
2.4472
For a function with two independent variables f(x,y), a maximum or a minimum depends not
only on the partials with respect to x and y but also on the second partial with respect to x and
y. The Hessian H of f is a matrix consists of the second derivatives defined as
 2 f
 x 2
H=  2
 f
 yx
2 f 
xy 

2 f 
y 2 
A maximum or a minimum of a multidimensional function depends on the determinant of the
Hessian matrix.
2
2 f 2 f  2 f 


|H| =
x 2 y 2  xy 
2 f
If |H| > 0 and
> 0 then f(x,y) has a local minimum.
x 2
2 f
If |H| > 0 and
< 0 then f(x,y) has a local maximum.
x 2
If |H| < 0 then f(x,y) has a saddle point.
2-16
Download