Multidimensional Gradient Methods in Optimization
Major: All Engineering Majors
Authors: Autar Kaw, Ali Yalcin
http://numericalmethods.eng.usf.edu
Transforming Numerical Methods Education for STEM Undergraduates
4/13/2015

Multidimensional Gradient Methods - Overview
Gradient methods use information from the derivatives of the objective function to guide the search.
They typically find solutions in fewer iterations than direct search methods.
A good initial estimate of the solution is required.
The objective function needs to be differentiable.

Gradients
The gradient is a vector operator denoted by $\nabla$ (referred to as "del"). When applied to a function, it is related to the function's directional derivatives: the directional derivative is largest in the direction of the gradient, which is therefore the direction of steepest ascent (and its negative is the direction of steepest descent). For a function of two variables the gradient is calculated as
$\nabla f = \dfrac{\partial f}{\partial x}\,\mathbf{i} + \dfrac{\partial f}{\partial y}\,\mathbf{j}$

Gradients - Example
Calculate the gradient to determine the direction of the steepest slope at the point (2, 1) for the function $f(x, y) = x^2 y^2$.
Solution: To calculate the gradient we need the partial derivatives
$\dfrac{\partial f}{\partial x} = 2xy^2 = 2(2)(1)^2 = 4$
$\dfrac{\partial f}{\partial y} = 2x^2 y = 2(2)^2(1) = 8$
which give the gradient at the point (2, 1) as
$\nabla f = 4\,\mathbf{i} + 8\,\mathbf{j}$

Hessians
The Hessian matrix, or simply the Hessian, is the Jacobian matrix of the second-order partial derivatives of a function. The determinant of the Hessian matrix is also referred to as the Hessian. For a two-dimensional function the Hessian matrix is simply
$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\ \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$

Hessians cont.
The determinant of the Hessian matrix, denoted by $|H|$ and evaluated at a point where the gradient is zero, distinguishes three cases:
1. If $|H| > 0$ and $\partial^2 f / \partial x^2 > 0$, then $f(x, y)$ has a local minimum.
2. If $|H| > 0$ and $\partial^2 f / \partial x^2 < 0$, then $f(x, y)$ has a local maximum.
3. If $|H| < 0$, then $f(x, y)$ has a saddle point.

Hessians - Example
Calculate the Hessian matrix at the point (2, 1) for the function $f(x, y) = x^2 y^2$.
Solution: To calculate the Hessian matrix, the second partial derivatives are evaluated as
$\dfrac{\partial^2 f}{\partial x^2} = 2y^2 = 2(1)^2 = 2$
$\dfrac{\partial^2 f}{\partial y^2} = 2x^2 = 2(2)^2 = 8$
$\dfrac{\partial^2 f}{\partial x \partial y} = \dfrac{\partial^2 f}{\partial y \partial x} = 4xy = 4(2)(1) = 8$
resulting in the Hessian matrix
$H = \begin{bmatrix} 2 & 8 \\ 8 & 8 \end{bmatrix}$
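As a quick numerical check of the two examples above, the sketch below approximates the gradient and Hessian of $f(x, y) = x^2 y^2$ at the point (2, 1) with central differences. The helper names gradient and hessian and the step sizes are choices made for this illustration only; they are not part of the original slides.

```python
import numpy as np

def f(x, y):
    # Function used in the gradient and Hessian examples: f(x, y) = x^2 * y^2
    return x**2 * y**2

def gradient(f, x, y, h=1e-5):
    # Central-difference approximation of [df/dx, df/dy]
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return np.array([dfdx, dfdy])

def hessian(f, x, y, h=1e-4):
    # Central-difference approximation of the 2x2 matrix of second partial derivatives
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return np.array([[fxx, fxy], [fxy, fyy]])

g = gradient(f, 2.0, 1.0)   # approximately [4, 8], matching the hand calculation
H = hessian(f, 2.0, 1.0)    # approximately [[2, 8], [8, 8]]
print("gradient:", g)
print("Hessian:\n", H)
# The determinant test of the previous slide applies only at points where the
# gradient is zero; here det(H) is printed just to show the calculation.
print("det(H):", np.linalg.det(H))
```

Running the sketch reproduces the hand calculations: the gradient is approximately $4\,\mathbf{i} + 8\,\mathbf{j}$ and the Hessian is approximately $\begin{bmatrix} 2 & 8 \\ 8 & 8 \end{bmatrix}$.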
Steepest Ascent/Descent Method
The method starts from an initial point and looks for a local optimum along the gradient direction. The gradient at the initial solution is calculated, and a new solution is found at the local optimum along that gradient. Subsequent iterations repeat the process, using the local optimum found along each new gradient as the next starting point, until the gradient becomes zero. (A short Python sketch of this procedure is given at the end of these notes.)

Example
Determine the minimum of the function
$f(x, y) = x^2 + y^2 + 2x + 4$
Use the point (2, 1) as the initial estimate of the optimal solution.

Solution
Iteration 1: To calculate the gradient, the partial derivatives are evaluated as
$\dfrac{\partial f}{\partial x} = 2x + 2 = 2(2) + 2 = 6$
$\dfrac{\partial f}{\partial y} = 2y = 2(1) = 2$
$\nabla f = 6\,\mathbf{i} + 2\,\mathbf{j}$
Because we are seeking a minimum, we move in the direction of the negative gradient. Along this direction the function $f(x, y)$ can be expressed in terms of the step size $h$ as
$g(h) = f\!\left(x_0 - \dfrac{\partial f}{\partial x} h,\; y_0 - \dfrac{\partial f}{\partial y} h\right) = f(2 - 6h,\, 1 - 2h) = (2 - 6h)^2 + (1 - 2h)^2 + 2(2 - 6h) + 4 = 40h^2 - 40h + 13$
This is a simple one-dimensional function, and it is easy to determine $h^* = 0.5$ by taking the first derivative, $g'(h) = 80h - 40$, and solving for its root. This means that traveling a step of size $h = 0.5$ along the negative gradient reaches the minimum of the function in this direction. Substituting this value back gives the new values of x and y:
$x = 2 - 6(0.5) = -1$
$y = 1 - 2(0.5) = 0$
Note that
$f(2, 1) = 13$
$f(-1, 0) = 3$

Iteration 2: The new initial point is (-1, 0). We calculate the gradient at this point as
$\dfrac{\partial f}{\partial x} = 2x + 2 = 2(-1) + 2 = 0$
$\dfrac{\partial f}{\partial y} = 2y = 2(0) = 0$
$\nabla f = 0\,\mathbf{i} + 0\,\mathbf{j}$
This indicates that the current location is a local optimum and no improvement can be gained by moving in any direction. The minimum of the function is at the point (-1, 0).

Additional Resources
For all resources on this topic such as digital audiovisual lectures, primers, textbook chapters, multiple-choice tests, worksheets in MATLAB, MATHEMATICA, MathCad and MAPLE, blogs, and related physical problems, please visit
http://nm.mathforcollege.com/topics/opt_multidimensional_gradient.html
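To close, here is a minimal Python sketch of the steepest descent iteration applied to the worked example above. The function names steepest_descent and grad_f, the stopping tolerance, and the use of scipy.optimize.minimize_scalar for the one-dimensional line search are choices made for this illustration; the slides solve the line search analytically instead.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def f(p):
    # Objective from the worked example: f(x, y) = x^2 + y^2 + 2x + 4
    x, y = p
    return x**2 + y**2 + 2 * x + 4

def grad_f(p):
    # Analytical gradient: [2x + 2, 2y]
    x, y = p
    return np.array([2 * x + 2, 2 * y])

def steepest_descent(f, grad_f, p0, tol=1e-6, max_iter=50):
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(p)
        if np.linalg.norm(g) < tol:          # zero gradient: local optimum reached
            break
        # One-dimensional line search for the best step size along the negative gradient
        res = minimize_scalar(lambda h: f(p - h * g))
        p = p - res.x * g
    return p

p_min = steepest_descent(f, grad_f, [2.0, 1.0])
print(p_min, f(p_min))   # approximately (-1, 0) with f = 3
```

Starting from (2, 1), the line search returns a step of about $h = 0.5$ along the negative gradient $6\,\mathbf{i} + 2\,\mathbf{j}$, reaching approximately (-1, 0); the next gradient evaluation is zero, so the iteration stops, in agreement with the hand calculation.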