Chapter 2
Iterative Methods for Solving Sets of Equations

2.4 Gradient Methods

2.4.1 Gradients and Hessian

Gradient methods use derivative information of a function to locate optima. At a location where the first derivative is equal to zero, the function has a maximum if the second derivative is negative and a minimum if the second derivative is positive. These concepts are illustrated in Figure 2.4-1 for a function of a single variable.

Figure 2.4-1 The optima of a one-dimensional function: a maximum at $x_1$, where $f''(x_1) < 0$, and a minimum at $x_2$, where $f''(x_2) > 0$.

To understand how the first and second derivatives are expressed in a multidimensional system, we begin by reviewing the concept of the directional derivative of a function. Let $u(x,y)$ be a function of two variables and let $\mathbf{v} = (v_1, v_2)$ be a unit vector with arbitrary direction, as shown in Figure 2.4-2, where $\mathbf{i}$ and $\mathbf{j}$ are unit vectors in the $x$ and $y$ directions, respectively, and $\theta$ is the angle between $\mathbf{v}$ and the $x$ axis.

Figure 2.4-2 Unit vector $\mathbf{v} = (v_1, v_2)$ with arbitrary direction.

$$\mathbf{v} = v_1\,\mathbf{i} + v_2\,\mathbf{j} = |\mathbf{v}|\cos\theta\,\mathbf{i} + |\mathbf{v}|\sin\theta\,\mathbf{j} \tag{2.4-1}$$

The directional derivative $g'$ of $u$ measures the rate of change of $u$ at the point $(x_o, y_o)$ as we move in the direction of $\mathbf{v}$.

$$g' = \nabla u \cdot \mathbf{v} = \left(\frac{\partial u}{\partial x}\,\mathbf{i} + \frac{\partial u}{\partial y}\,\mathbf{j}\right)\cdot\left(v_1\,\mathbf{i} + v_2\,\mathbf{j}\right) \tag{2.4-2a}$$

$$g' = v_1\,\frac{\partial u}{\partial x}(x_o, y_o) + v_2\,\frac{\partial u}{\partial y}(x_o, y_o) \tag{2.4-2b}$$

$$g' = \cos\theta\,\frac{\partial u}{\partial x}(x_o, y_o) + \sin\theta\,\frac{\partial u}{\partial y}(x_o, y_o) \tag{2.4-2c}$$

Consequently, if $g' = 0$, then $u$ is not changing in the direction of $\mathbf{v}$. The directional derivative $g'$ is a maximum if we move in the direction of $\nabla u$, since $g' = |\nabla u|\,|\mathbf{v}|\cos\phi$, where $\phi$ is the angle between $\nabla u$ and $\mathbf{v}$, and $\cos\phi$ is largest when $\phi = 0$. Therefore the gradient of $u$, $\nabla u$, gives the direction of steepest ascent.

Example 2.4-1³
Evaluate the steepest ascent direction for the function $u(x,y) = xy^2$ at the point (2, 2).

Solution
If $u(x,y)$ is temperature, then the curves $xy^2 = \text{constant}$ are called isotherms. Six of these isotherms are plotted in Figure 2.4-3, together with the point A(2, 2), for which the direction of steepest ascent is line AB.
The function at A(2, 2) can be determined as

$$u(2, 2) = 2(2)^2 = 8$$

Next, the gradient of $u$, $\nabla u$, can be evaluated:

$$\nabla u = \frac{\partial u}{\partial x}\,\mathbf{i} + \frac{\partial u}{\partial y}\,\mathbf{j} = y^2\,\mathbf{i} + 2xy\,\mathbf{j} = (2)^2\,\mathbf{i} + 2(2)(2)\,\mathbf{j}$$

$$\nabla u = 4\,\mathbf{i} + 8\,\mathbf{j}$$

The angle with respect to the $x$ axis is then

$$\theta = \tan^{-1}\left(\frac{8}{4}\right) = 1.107 \text{ radians } (= 63.4^\circ)$$

³ Numerical Methods for Engineers by Chapra and Canale

Figure 2.4-3 Isotherms for $u(x,y) = xy^2$ (axes $x$ and $y$ from 0 to 4; isotherms at $u$ = 4, 8, 16, 24, 32, and 40; line AB starts at A(2, 2)).

The magnitude of $\nabla u$ is evaluated as

$$|\nabla u| = (4^2 + 8^2)^{1/2} = 8.944$$

Therefore, along line AB, $u$ initially increases at a rate of 8.944 units per unit distance advanced along this steepest path. The value of $u$ at B is not 8 + 8.944 = 16.944, since the value of the gradient changes as we move in this direction. The value of $u$ at B is

$$u(2.4472, 2.8944) = 2.4472(2.8944)^2 = 20.502$$

The directional derivative $g'$ of $u$ along this path is just $|\nabla u|$:

$$g' = \cos\theta\,\frac{\partial u}{\partial x}(x_o, y_o) + \sin\theta\,\frac{\partial u}{\partial y}(x_o, y_o) = 4\cos(1.107) + 8\sin(1.107) = 8.944$$

The direction of steepest ascent is normal to the isotherm at the coordinate (2, 2). The Matlab program listed in Table 2.4-1 plots the isotherms shown in Figure 2.4-3.

Table 2.4-1 Matlab program to plot isotherms for $u(x,y) = xy^2$
------------
    % Plot isotherms of u = x*y^2 and the steepest-ascent segment AB
    x1=2;y1=2;                    % point A
    dfdx=4;dfdy=8;                % gradient components at A
    r=dfdy/dfdx;                  % slope of the steepest-ascent direction
    dx=sqrt(1/(1+r*r));dy=r*dx;   % unit-length step along the gradient
    x2=x1+dx;y2=y1+dy;            % point B
    xx=[x1 x2];yy=[y1 y2];
    fxy=[4 8 16 24 32 40];        % isotherm values
    x=.5:.02:4;x=x';
    n=length(fxy);nx=length(x);
    ym=zeros(nx,n);
    for i=1:n
        ym(:,i)=sqrt(fxy(i)./x);  % y on the isotherm x*y^2 = fxy(i)
    end
    plot(x,ym,xx,yy);axis equal
    axis([0 4 0 4]);grid
    xlabel('x');ylabel('y')

    >> y2
    y2 =
        2.8944
    >> x2
    x2 =
        2.4472
------------

For a function of two independent variables $f(x,y)$, whether a stationary point is a maximum or a minimum depends not only on the second partials with respect to $x$ and with respect to $y$ but also on the mixed second partial with respect to $x$ and $y$. The Hessian $H$ of $f$ is the matrix of second derivatives, defined as

$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$

A maximum or a minimum of a multidimensional function depends on the determinant of the Hessian matrix.
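$$|H| = \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2$$

At a point where the first partial derivatives of $f$ are zero:

If $|H| > 0$ and $\dfrac{\partial^2 f}{\partial x^2} > 0$, then $f(x,y)$ has a local minimum.

If $|H| > 0$ and $\dfrac{\partial^2 f}{\partial x^2} < 0$, then $f(x,y)$ has a local maximum.

If $|H| < 0$, then $f(x,y)$ has a saddle point.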
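The determinant test can be tried numerically. The following Matlab sketch is only an illustration under stated assumptions: the function $f(x,y) = 2xy + 2x - x^2 - 2y^2$, its stationary point (2, 1), the step size h, and the variable names are all chosen here for demonstration and are not part of the text above. The second partials are approximated with central differences instead of being derived analytically.

```matlab
% Illustrative function (an assumption, not taken from the text above)
f = @(x,y) 2*x.*y + 2*x - x.^2 - 2*y.^2;
x0 = 2; y0 = 1;   % stationary point: df/dx = 2y+2-2x = 0, df/dy = 2x-4y = 0
h = 1e-3;         % finite-difference step (assumed value)

% Central-difference approximations of the second partial derivatives
fxx = (f(x0+h,y0) - 2*f(x0,y0) + f(x0-h,y0))/h^2;
fyy = (f(x0,y0+h) - 2*f(x0,y0) + f(x0,y0-h))/h^2;
fxy = (f(x0+h,y0+h) - f(x0+h,y0-h) ...
     - f(x0-h,y0+h) + f(x0-h,y0-h))/(4*h^2);

H = [fxx fxy; fxy fyy];   % Hessian matrix
detH = det(H);            % |H| = fxx*fyy - fxy^2

% Apply the determinant test at the stationary point
if detH > 0 && fxx > 0
    kind = 'local minimum';
elseif detH > 0 && fxx < 0
    kind = 'local maximum';
elseif detH < 0
    kind = 'saddle point';
else
    kind = 'inconclusive (|H| = 0)';
end
fprintf('|H| = %.4f, fxx = %.4f: %s\n', detH, fxx, kind);
```

For this assumed function the analytic second partials are $\partial^2 f/\partial x^2 = -2$, $\partial^2 f/\partial y^2 = -4$, and $\partial^2 f/\partial x\,\partial y = 2$, so $|H| = (-2)(-4) - 2^2 = 4 > 0$ and the point should be classified as a local maximum. The test is meaningful only at a point where the first partials are zero, so that point must be located first.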