Extrema for Functions of Several Variables

Text Reference: Section 7.2, p. 461

The purpose of this set of exercises is to show how quadratic forms may be used to investigate maximum and minimum values of functions of several variables.

Finding the extreme values, or extrema, of a function is one of the major uses of calculus. Often there is some physical or economic interpretation of the function, so maximizing or minimizing the function is of great practical value. Recall first how to find the extreme values of a function of one variable, $y = f(x)$. The notions of relative maximum, relative minimum, and relative extremum are of primary importance.

Definition: The function $f(x)$ has a relative maximum at $x = a$ if $f(x) < f(a)$ for all $x \neq a$ close to $a$; that is, if $f(x) - f(a) < 0$ for all such $x$. The function $f(x)$ has a relative minimum at $x = a$ if $f(x) > f(a)$ for all $x \neq a$ close to $a$; that is, if $f(x) - f(a) > 0$ for all such $x$. If $f(x)$ has either a relative maximum or a relative minimum at $a$, then $f(x)$ has a relative extremum at $a$.

To find these points (if they exist), find the critical points of $f(x)$: points where $f'(x) = 0$ or points where $f'(x)$ does not exist. For future reference, note that the points where $f'(x) = 0$ are exactly those points where the tangent line to the curve $y = f(x)$ is horizontal. If $f'(a) = 0$, a second derivative test can be applied to determine whether the critical point $a$ yields a relative maximum or a relative minimum, although this test sometimes fails. Consult Reference 1 (Section 3.3) for more details.

The situation is quite similar when functions of more than one variable are studied. Let $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and consider $f(\mathbf{x})$. Relative extrema for $f(\mathbf{x})$ are defined in a manner analogous to that for a function of one variable.
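Definition: The function $f(\mathbf{x})$ has a relative maximum at $\mathbf{a}$ if $f(\mathbf{x}) < f(\mathbf{a})$ for all $\mathbf{x} \neq \mathbf{a}$ close to $\mathbf{a}$; that is, if $f(\mathbf{x}) - f(\mathbf{a}) < 0$ for all such $\mathbf{x}$. The function $f(\mathbf{x})$ has a relative minimum at $\mathbf{a}$ if $f(\mathbf{x}) > f(\mathbf{a})$ for all $\mathbf{x} \neq \mathbf{a}$ close to $\mathbf{a}$; that is, if $f(\mathbf{x}) - f(\mathbf{a}) > 0$ for all such $\mathbf{x}$. If $f(\mathbf{x})$ has either a relative maximum or a relative minimum at $\mathbf{a}$, then $f(\mathbf{x})$ has a relative extremum at $\mathbf{a}$.

To find these points (if they exist), examine the tangent plane to the surface $z = f(\mathbf{x})$ at $\mathbf{x} = \mathbf{a}$. The gradient of $f$ at $\mathbf{a}$ is the vector

$$\nabla f(\mathbf{a}) = (f_1(\mathbf{a}), f_2(\mathbf{a}))$$

where $f_1$ and $f_2$ are the partial derivatives of $f$ taken with respect to $x_1$ and $x_2$. The equation for the tangent plane at $\mathbf{x} = \mathbf{a}$ may be written in a convenient form:

$$z = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})$$

Analogously with the $y = f(x)$ case, critical points are found where the tangent plane is horizontal. As can be seen from the equation of the plane, this will happen exactly when $\nabla f(\mathbf{a}) = \mathbf{0}$. In this case $\mathbf{a}$ is called a critical point of $f(\mathbf{x})$.

The behavior of a function at a critical point $\mathbf{a}$ is now important; linear algebra will assume a prominent role in developing a strategy to determine this behavior. Assume that $f$ has first and second partial derivatives and that these functions are continuous. A multivariable version of Taylor's Theorem says that

$$f(\mathbf{x}) = p_2(\mathbf{x}) + R_2(\mathbf{x}; \mathbf{a})$$

where

$$p_2(\mathbf{x}) = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^T \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix} (\mathbf{x} - \mathbf{a})$$

and $R_2(\mathbf{x}; \mathbf{a})$ is a term with

$$\frac{|R_2(\mathbf{x}; \mathbf{a})|}{\|\mathbf{x} - \mathbf{a}\|^2} \to 0 \quad \text{as } \mathbf{x} \to \mathbf{a}$$

The second partial derivatives of $f$ are denoted $f_{11}$, $f_{12}$, $f_{21}$, and $f_{22}$. Because they are assumed continuous, $f_{12} = f_{21}$, so the matrix in the expression for $p_2$ is symmetric. Since this matrix will figure prominently in the analysis, it is given a name.

Definition: The Hessian of a function $f : \mathbb{R}^2 \to \mathbb{R}$ evaluated at $\mathbf{a}$ is the matrix $H$ whose $(i, j)$ entry is $f_{ij}(\mathbf{a})$. That is,

$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix}$$

The Hessian determines the behavior of $f$ at a critical point $\mathbf{a}$.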
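The gradient and Hessian defined above can also be estimated numerically, which is a convenient way to check hand computations. The following is a minimal sketch that uses central differences; the function names `gradient` and `hessian`, the step sizes, and the test function `g` are illustrative assumptions and are not part of the case study.

```python
def gradient(f, a, h=1e-5):
    """Central-difference estimate of the gradient of f at the point a."""
    grad = []
    for i in range(len(a)):
        fwd = list(a)
        bwd = list(a)
        fwd[i] += h
        bwd[i] -= h
        grad.append((f(fwd) - f(bwd)) / (2 * h))
    return grad


def hessian(f, a, h=1e-4):
    """Central-difference estimate of the Hessian; entry (i, j) estimates f_ij(a)."""
    n = len(a)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f after moving coordinate i by si*h and coordinate j by sj*h.
                p = list(a)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def g(p):
    """Hypothetical test function g(x1, x2) = x1^2 * x2 + x2^3 (not from the text)."""
    return p[0] ** 2 * p[1] + p[1] ** 3


if __name__ == "__main__":
    # Exact values at (1, 2): gradient (4, 13), Hessian [[4, 2], [2, 12]].
    print(gradient(g, [1.0, 2.0]))
    print(hessian(g, [1.0, 2.0]))
```

The estimates agree with the exact partial derivatives only up to rounding and truncation error, so they are a check on a calculation rather than a replacement for it.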
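Since $\mathbf{a}$ is a critical point, $\nabla f(\mathbf{a}) = \mathbf{0}$, and

$$f(\mathbf{x}) = p_2(\mathbf{x}) + R_2(\mathbf{x}; \mathbf{a}) = f(\mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) + R_2(\mathbf{x}; \mathbf{a})$$

so

$$f(\mathbf{x}) - f(\mathbf{a}) = \frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) + R_2(\mathbf{x}; \mathbf{a})$$

But the quantity $f(\mathbf{x}) - f(\mathbf{a})$ is what must be examined to discover the behavior of $f$:

If $f(\mathbf{x}) - f(\mathbf{a})$ is negative for all choices of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then $f$ will have a relative maximum at $\mathbf{a}$.

If $f(\mathbf{x}) - f(\mathbf{a})$ is positive for all choices of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then $f$ will have a relative minimum at $\mathbf{a}$.

If $f(\mathbf{x}) - f(\mathbf{a})$ is negative for some choices of $\mathbf{x}$ near $\mathbf{a}$ and positive for some choices of $\mathbf{x}$ near $\mathbf{a}$, then $f$ does not have a relative extremum at $\mathbf{a}$. In this situation it is said that $f$ has a saddle point at $\mathbf{a}$.

As $\mathbf{x}$ approaches $\mathbf{a}$, notice that $R_2(\mathbf{x}; \mathbf{a})$ approaches 0 faster than $\|\mathbf{x} - \mathbf{a}\|^2$ does, so if $\frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is not equal to 0 as $\mathbf{x}$ gets closer and closer to $\mathbf{a}$, then $R_2(\mathbf{x}; \mathbf{a})$ will not affect whether $f(\mathbf{x}) - f(\mathbf{a})$ is positive or negative when $\mathbf{x}$ is near $\mathbf{a}$. However, if $\frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = 0$, then the answer would depend on $R_2$, which is unknown. The test would fail in that case. The sign of $f(\mathbf{x}) - f(\mathbf{a})$ is important and not its magnitude, so the constant $\frac{1}{2}$ may be removed from the analysis. The result may be summarized as follows:

If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is negative for every choice of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then there is a relative maximum at $\mathbf{a}$.

If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is positive for every choice of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then there is a relative minimum at $\mathbf{a}$.

If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is positive for some choices of $\mathbf{x}$ and negative for other choices of $\mathbf{x}$ arbitrarily close to $\mathbf{a}$, then there is a saddle point at $\mathbf{a}$, but not a relative extremum.

If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = 0$ for a choice of $\mathbf{x} \neq \mathbf{a}$ arbitrarily close to $\mathbf{a}$ and the expression does not take both signs, then the analysis fails.

Example: Consider the function

$$f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 + 2x_1 + 2x_2 - 4$$

Then $\nabla f(\mathbf{x}) = (2x_1 - x_2 + 2,\; 2x_2 - x_1 + 2)$, and $\nabla f(\mathbf{x}) = \mathbf{0}$ is solved to find that $\mathbf{a} = (-2, -2)$ is the only critical point. Differentiate again to find $f_{11}(\mathbf{a}) = 2$, $f_{12}(\mathbf{a}) = f_{21}(\mathbf{a}) = -1$, and $f_{22}(\mathbf{a}) = 2$.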
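Thus the Hessian of $f$ at $\mathbf{a}$ is

$$H = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

and

$$f(\mathbf{x}) - f(\mathbf{a}) = \frac{1}{2} (\mathbf{x} - \mathbf{a})^T \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} (\mathbf{x} - \mathbf{a})$$

Analyzing this situation is difficult because of all the possible choices for $\mathbf{x}$ near $\mathbf{a}$; however, the notion of a quadratic form cleans up matters considerably. If $\mathbf{z} = \mathbf{x} - \mathbf{a}$, then the term $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = \mathbf{z}^T H \mathbf{z}$, so there is a quadratic form $Q(\mathbf{z}) = \mathbf{z}^T H \mathbf{z}$ and the Hessian $H$ is the matrix of that quadratic form. Notice then that the above observation about relative maxima, relative minima, and saddle points may be summarized quite nicely.

If $Q(\mathbf{z}) < 0$ for all $\mathbf{z} \neq \mathbf{0}$, then there is a relative maximum at $\mathbf{a}$.

If $Q(\mathbf{z}) > 0$ for all $\mathbf{z} \neq \mathbf{0}$, then there is a relative minimum at $\mathbf{a}$.

If $Q(\mathbf{z})$ is positive for some choices of $\mathbf{z}$ and negative for other choices of $\mathbf{z}$, then there is a saddle point at $\mathbf{a}$, but not a relative extremum.

If $Q(\mathbf{z}) = 0$ for some $\mathbf{z} \neq \mathbf{0}$ and $Q$ does not take both signs, then the analysis fails.

Finally, note that the first three conditions given above are the definitions for a negative definite quadratic form, a positive definite quadratic form, and an indefinite quadratic form.

Example (cont.): The standard matrix for this quadratic form $Q$ is

$$H = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

The Principal Axes Theorem says that there is an orthogonal change of variable $\mathbf{z} = P\mathbf{y}$ that transforms the quadratic form $\mathbf{z}^T H \mathbf{z}$ into a quadratic form $\mathbf{y}^T D \mathbf{y}$, where $D$ is a diagonal matrix with the eigenvalues of $H$ (with multiplicities) as its diagonal entries.

Example (cont.): The eigenvalues of $H$ are 1 and 3, and the standard matrix $H$ may be diagonalized to find that $H = PDP^{-1}$, where

$$P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad D = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$$

If $\mathbf{y} = (y_1, y_2)$, then the quadratic form has been converted into

$$\mathbf{y}^T D \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = y_1^2 + 3y_2^2$$

which is positive for all choices of $y_1$ and $y_2$ that are not both zero, thus for all $\mathbf{y} \neq \mathbf{0}$. And so $Q(\mathbf{z}) > 0$ for all $\mathbf{z} \neq \mathbf{0}$ (i.e., $Q$ is a positive definite quadratic form), and $f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 + 2x_1 + 2x_2 - 4$ has a relative minimum at the point $\mathbf{a} = (-2, -2)$.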
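The worked example can be checked numerically. The sketch below recomputes the critical point, the eigenvalues of $H$, and the identity $\mathbf{z}^T H \mathbf{z} = y_1^2 + 3y_2^2$ under the change of variable $\mathbf{y} = P^T \mathbf{z}$. It assumes the normalized eigenvector matrix $P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$, which is one valid choice among several, and all function names are illustrative. The helper `classify` relies on the fact that after the change of variable the sign of $Q$ is decided by the signs of the diagonal entries of $D$.

```python
import math

# Hessian of the worked example f(x) = x1^2 - x1*x2 + x2^2 + 2*x1 + 2*x2 - 4.
H = [[2.0, -1.0], [-1.0, 2.0]]


def critical_point():
    """Solve grad f = 0, i.e. 2*x1 - x2 = -2 and -x1 + 2*x2 = -2, by Cramer's rule."""
    (a11, a12), (a21, a22) = H  # for this quadratic f the coefficient matrix equals H
    b1, b2 = -2.0, -2.0
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


def symmetric_eigenvalues(m):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.hypot((m[0][0] - m[1][1]) / 2, m[0][1])
    return mean - radius, mean + radius


def quadratic_form(z):
    """Q(z) = z^T H z."""
    z1, z2 = z
    return H[0][0] * z1 * z1 + (H[0][1] + H[1][0]) * z1 * z2 + H[1][1] * z2 * z2


def diagonal_form(z):
    """Evaluate lam1*y1^2 + lam2*y2^2 with y = P^T z for the assumed matrix P."""
    s = 1 / math.sqrt(2)
    y1 = s * (z[0] + z[1])    # first column of P is (1, 1) / sqrt(2)
    y2 = s * (-z[0] + z[1])   # second column of P is (-1, 1) / sqrt(2)
    lam1, lam2 = symmetric_eigenvalues(H)
    return lam1 * y1 ** 2 + lam2 * y2 ** 2


def classify(eigenvalues, tol=1e-12):
    """Classify a critical point from the signs of the Hessian eigenvalues."""
    pos = any(ev > tol for ev in eigenvalues)
    neg = any(ev < -tol for ev in eigenvalues)
    zero = any(abs(ev) <= tol for ev in eigenvalues)
    if pos and neg:
        return "saddle point"
    if zero:
        return "test fails"
    return "relative minimum" if pos else "relative maximum"


if __name__ == "__main__":
    print(critical_point())
    print(symmetric_eigenvalues(H))
    print(classify(symmetric_eigenvalues(H)))
```

Comparing `quadratic_form(z)` with `diagonal_form(z)` at a few sample vectors `z` is a quick way to confirm that the change of variable has been set up correctly.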
Finally, note that, by Theorem 5 in Section 7.2, the behavior of $f(\mathbf{x})$ at a critical point $\mathbf{a}$ may be summarized by determining the eigenvalues of the Hessian

$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix}$$

If all eigenvalues of $H$ are positive, $f(\mathbf{x})$ will have a relative minimum at $\mathbf{a}$.

If all eigenvalues of $H$ are negative, $f(\mathbf{x})$ will have a relative maximum at $\mathbf{a}$.

If the eigenvalues of $H$ are of mixed signs, then $f(\mathbf{x})$ has a saddle point at $\mathbf{a}$.

If $H$ has a zero eigenvalue and its nonzero eigenvalues all have the same sign, then the analysis fails.

The same analysis applies to functions $f$ of three variables. If the Hessian of $f$ at a point $\mathbf{a}$ is defined to be

$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) & f_{13}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) & f_{23}(\mathbf{a}) \\ f_{31}(\mathbf{a}) & f_{32}(\mathbf{a}) & f_{33}(\mathbf{a}) \end{bmatrix}$$

then the eigenvalues of $H$ determine the behavior of $f(\mathbf{x})$ at a critical point (where $\nabla f(\mathbf{a}) = \mathbf{0}$) exactly as they do in the two variable case.

Questions: Locate all relative extrema and saddle points for the following functions.

1. $f(\mathbf{x}) = x_1^2 + x_1 x_2 + x_2^2 + 3x_1 - 3x_2 + 4$

2. $f(\mathbf{x}) = x_1^2 + x_1 x_2 + 3x_1 + 2x_2 + 5$

3. $f(\mathbf{x}) = x_1^2 - 4x_1 x_2 + x_2^2 + 6x_2 + 2$

4. $f(\mathbf{x}) = 2x_1 + 2x_2 - 2x_1^2 - 2x_1 x_2 - x_2^2 + 3$

5. $f(\mathbf{x}) = x_1^3 - x_2^3 - 2x_1 x_2 + 6$

6. $f(\mathbf{x}) = 6x_1^2 - 2x_1^3 + 3x_2^2 + 6x_1 x_2$

7. $f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 - x_1 x_3 + x_3^2$

8. $f(\mathbf{x}) = x_1^3 + x_2^3 + x_3^3 - 3x_1^2 x_2 - 3x_2^2 x_3 - 3x_3^2 x_1 + 6x_1 + 6x_2 + 6x_3$

Hint: the critical points for this function are the following 8 points:

(1, 1, 1)
(1.53868, -0.0838115, 2.15069)
(-0.0838115, 2.15069, 1.53868)
(2.15069, 1.53868, -0.0838115)
(-1, -1, -1)
(-1.53868, 0.0838115, -2.15069)
(0.0838115, -2.15069, -1.53868)
(-2.15069, -1.53868, 0.0838115)

Reference:

1. Finney, Ross L., Weir, Maurice D., and Giordano, Frank R. Thomas' Calculus. Tenth Edition. Boston: Addison-Wesley, 2001.