Extrema for Functions of Several Variables

Text Reference: Section 7.2, p. 461
The purpose of this set of exercises is to show how quadratic forms may be used to investigate
maximum and minimum values of functions of several variables.
Finding the extreme values, or extrema, of a function is one of the major uses of calculus. Often
there is some physical or economic interpretation of the function, so maximizing or minimizing the
function is of great practical value. Recall first how to find the extreme values of a function of one
variable: y = f (x). The notions of relative maximum, relative minimum, and relative extremum
are of primary importance.
Definition: The function $f(x)$ has a relative maximum at $x = a$ if $f(x) < f(a)$ for all $x$ close to
$a$; that is, if $f(x) - f(a) < 0$ for all $x$ close to $a$. The function $f(x)$ has a relative minimum at
$x = a$ if $f(x) > f(a)$ for all $x$ close to $a$; that is, if $f(x) - f(a) > 0$ for all $x$ close to $a$. If $f(x)$ has
either a relative maximum or a relative minimum at $a$, then $f(x)$ has a relative extremum at $a$.
To find these points (if they exist), find the critical points of $f(x)$: points where $f'(x) = 0$ or
points where $f'(x)$ does not exist. For future reference, note that the points where $f'(x) = 0$ are
exactly those points where the tangent line to the curve $y = f(x)$ is horizontal. If $f'(x) = 0$, a second
derivative test can be applied to determine whether the critical point yields a relative maximum or
a relative minimum, although this test sometimes fails. Consult Reference 1 (Section 3.3) for more
details.
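The second derivative test can be sketched numerically. The following is a minimal illustration, not part of the original case study: the functions `g` and `cube`, the step size `h`, and the tolerance are all assumptions chosen for the example, and the second derivative is only estimated by a central difference.

```python
def second_derivative_test(f, a, h=1e-4, tol=1e-6):
    """Classify a critical point a of f from a central-difference estimate of f''(a)."""
    # Central-difference approximation of the second derivative at a.
    d2 = (f(a + h) - 2.0 * f(a) + f(a - h)) / (h * h)
    if d2 > tol:
        return "relative minimum"
    if d2 < -tol:
        return "relative maximum"
    # f''(a) is (numerically) zero, so the test gives no information.
    return "test fails"


# Illustrative function (an assumption): g(x) = x^3 - 3x,
# whose derivative 3x^2 - 3 vanishes at x = -1 and x = 1.
def g(x):
    return x**3 - 3.0 * x


# Illustrative function (an assumption): x^3 has a critical point at 0
# where the second derivative is also 0.
def cube(x):
    return x**3


results = {a: second_derivative_test(g, a) for a in (-1.0, 1.0)}
print(results)
print(second_derivative_test(cube, 0.0))
```

The last call shows the failure case mentioned above: at $x = 0$ the function $x^3$ has $f'(0) = 0$ and $f''(0) = 0$, so the test cannot decide.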
The situation is quite similar when functions of more than one variable are studied. Let
$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
and consider $f(\mathbf{x})$. Relative extrema for $f(\mathbf{x})$ are defined in a manner analogous to that for a function
of one variable.
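Definition: The function $f(\mathbf{x})$ has a relative maximum at $\mathbf{a}$ if $f(\mathbf{x}) < f(\mathbf{a})$ for all $\mathbf{x}$ close to $\mathbf{a}$;
that is, if $f(\mathbf{x}) - f(\mathbf{a}) < 0$ for all $\mathbf{x}$ close to $\mathbf{a}$. The function $f(\mathbf{x})$ has a relative minimum at $\mathbf{a}$ if
$f(\mathbf{x}) > f(\mathbf{a})$ for all $\mathbf{x}$ close to $\mathbf{a}$; that is, if $f(\mathbf{x}) - f(\mathbf{a}) > 0$ for all $\mathbf{x}$ close to $\mathbf{a}$. If $f(\mathbf{x})$ has either
a relative maximum or a relative minimum at $\mathbf{a}$, then $f(\mathbf{x})$ has a relative extremum at $\mathbf{a}$.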
To find these points (if they exist), examine the tangent plane to the surface $z = f(\mathbf{x})$ at $\mathbf{x} = \mathbf{a}$.
The gradient of $f$ at $\mathbf{a}$ is the vector
$$\nabla f(\mathbf{a}) = (f_1(\mathbf{a}), f_2(\mathbf{a}))$$
where $f_1$ and $f_2$ are the partial derivatives of $f$ taken with respect to $x_1$ and $x_2$. The equation for
the tangent plane at $\mathbf{x} = \mathbf{a}$ may be written in a convenient form:
$$z = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})$$
Analogously with the $y = f(x)$ case, critical points are found where the tangent plane is horizontal.
As can be seen from the equation of the plane, this will happen exactly when $\nabla f(\mathbf{a}) = \mathbf{0}$. In this
case $\mathbf{a}$ is called a critical point of $f(\mathbf{x})$. The behavior of a function at a critical point $\mathbf{a}$ is now
important; linear algebra will assume a prominent role in developing a strategy to determine this
behavior.
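The gradient and tangent plane can be explored numerically. The sketch below is only an illustration: the surface `bowl`, the helper names `gradient` and `tangent_plane`, and the step size are assumptions, and the partial derivatives are estimated by central differences rather than computed exactly.

```python
def gradient(f, a, h=1e-6):
    """Central-difference estimate of (f_1(a), f_2(a)) for a function of two variables."""
    a1, a2 = a
    f1 = (f(a1 + h, a2) - f(a1 - h, a2)) / (2.0 * h)
    f2 = (f(a1, a2 + h) - f(a1, a2 - h)) / (2.0 * h)
    return (f1, f2)


def tangent_plane(f, a):
    """Return the function z(x) = f(a) + grad f(a) . (x - a)."""
    g1, g2 = gradient(f, a)
    fa = f(*a)
    return lambda x1, x2: fa + g1 * (x1 - a[0]) + g2 * (x2 - a[1])


# Illustrative surface (an assumption): a paraboloid with its vertex at (1, -2).
def bowl(x1, x2):
    return (x1 - 1.0) ** 2 + (x2 + 2.0) ** 2 + 5.0


# At the vertex the gradient vanishes, so the tangent plane is horizontal:
# it returns (approximately) the same height everywhere.
grad_at_vertex = gradient(bowl, (1.0, -2.0))
plane = tangent_plane(bowl, (1.0, -2.0))
print(grad_at_vertex, plane(3.0, 0.0), plane(-4.0, 7.0))
```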
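Assume that $f$ has first and second partial derivatives and that these functions are continuous. A
multivariable version of Taylor's Theorem says that
$$f(\mathbf{x}) = p_2(\mathbf{x}) + R_2(\mathbf{x}; \mathbf{a})$$
where
$$p_2(\mathbf{x}) = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^T \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix} (\mathbf{x} - \mathbf{a})$$
and $R_2(\mathbf{x}; \mathbf{a})$ is a term with
$$\frac{|R_2(\mathbf{x}; \mathbf{a})|}{\|\mathbf{x} - \mathbf{a}\|^2} \to 0 \quad \text{as } \mathbf{x} \to \mathbf{a}$$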
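It may help to see the quadratic term of $p_2$ written out in coordinates. The following expansion is a sketch in which the shorthand $h_1 = x_1 - a_1$ and $h_2 = x_2 - a_2$ is introduced only for this illustration; it uses the fact that continuous second partial derivatives satisfy $f_{12} = f_{21}$.

```latex
\begin{aligned}
p_2(\mathbf{x}) &= f(\mathbf{a}) + f_1(\mathbf{a})\,h_1 + f_2(\mathbf{a})\,h_2
  + \tfrac{1}{2}\Bigl[ f_{11}(\mathbf{a})\,h_1^2
  + \bigl(f_{12}(\mathbf{a}) + f_{21}(\mathbf{a})\bigr)\,h_1 h_2
  + f_{22}(\mathbf{a})\,h_2^2 \Bigr] \\
 &= f(\mathbf{a}) + f_1(\mathbf{a})\,h_1 + f_2(\mathbf{a})\,h_2
  + \tfrac{1}{2}\Bigl[ f_{11}(\mathbf{a})\,h_1^2
  + 2 f_{12}(\mathbf{a})\,h_1 h_2
  + f_{22}(\mathbf{a})\,h_2^2 \Bigr]
\end{aligned}
```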
The second partial derivatives of $f$ are denoted $f_{11}$, $f_{12}$, $f_{21}$, and $f_{22}$. Since the matrix in the
expression for $p_2$ will figure prominently in the analysis, it is given a name.
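Definition: The Hessian of a function $f : \mathbb{R}^2 \to \mathbb{R}$ evaluated at $\mathbf{a}$ is the matrix whose $(i, j)$ entry
is $f_{ij}(\mathbf{a})$. That is,
$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix}$$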
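As a rough numerical companion to this definition, the Hessian can be estimated with central differences. This is a sketch under stated assumptions: the function `s`, the helper name `hessian`, and the step size `h` are illustrative choices, not anything prescribed by the text.

```python
def hessian(f, a, h=1e-4):
    """Central-difference estimate of the 2x2 Hessian [[f11, f12], [f21, f22]] at a."""
    a1, a2 = a
    f0 = f(a1, a2)
    f11 = (f(a1 + h, a2) - 2.0 * f0 + f(a1 - h, a2)) / (h * h)
    f22 = (f(a1, a2 + h) - 2.0 * f0 + f(a1, a2 - h)) / (h * h)
    f12 = (f(a1 + h, a2 + h) - f(a1 + h, a2 - h)
           - f(a1 - h, a2 + h) + f(a1 - h, a2 - h)) / (4.0 * h * h)
    # Continuous second partials give f21 = f12, so the estimate is symmetric.
    return [[f11, f12], [f12, f22]]


# Illustrative function (an assumption): s(x1, x2) = x1^2 * x2 + x2^3,
# with exact second partials f11 = 2*x2, f12 = f21 = 2*x1, f22 = 6*x2.
def s(x1, x2):
    return x1**2 * x2 + x2**3


H = hessian(s, (1.0, 2.0))
print(H)
```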
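The Hessian determines the behavior of $f$ at a critical point $\mathbf{a}$. Since $\mathbf{a}$ is a critical point, $\nabla f(\mathbf{a}) = \mathbf{0}$,
and
$$f(\mathbf{x}) = p_2(\mathbf{x}) + R_2(\mathbf{x}; \mathbf{a}) = f(\mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) + R_2(\mathbf{x}; \mathbf{a})$$
so
$$f(\mathbf{x}) - f(\mathbf{a}) = \frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) + R_2(\mathbf{x}; \mathbf{a})$$
But the quantity $f(\mathbf{x}) - f(\mathbf{a})$ is what must be examined to discover the behavior of $f$:
If $f(\mathbf{x}) - f(\mathbf{a})$ is negative for all choices of $\mathbf{x}$ near $\mathbf{a}$, then $f$ will have a relative maximum
at $\mathbf{a}$.
If $f(\mathbf{x}) - f(\mathbf{a})$ is positive for all choices of $\mathbf{x}$ near $\mathbf{a}$, then $f$ will have a relative minimum
at $\mathbf{a}$.
If $f(\mathbf{x}) - f(\mathbf{a})$ is negative for some choices of $\mathbf{x}$ near $\mathbf{a}$ and positive for some choices of $\mathbf{x}$
near $\mathbf{a}$, then $f$ does not have a relative extremum at $\mathbf{a}$. In this situation it is said that $f$ has a
saddle point at $\mathbf{a}$.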
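As $\mathbf{x}$ approaches $\mathbf{a}$, notice that $R_2(\mathbf{x}; \mathbf{a})$ approaches 0 faster than $\|\mathbf{x} - \mathbf{a}\|^2$ does, so if $\frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is not equal to
0 as $\mathbf{x}$ gets closer and closer to $\mathbf{a}$, then $R_2(\mathbf{x}; \mathbf{a})$ will not affect whether $f(\mathbf{x}) - f(\mathbf{a})$ is positive or
negative when $\mathbf{x}$ is near $\mathbf{a}$. However, if $\frac{1}{2} (\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = 0$, then the answer would depend
on $R_2$, which is unknown. The test would fail in that case. The sign of $f(\mathbf{x}) - f(\mathbf{a})$ is important
and not its magnitude, so the constant $\frac{1}{2}$ may be removed from the analysis. The result may be
summarized as follows: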
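If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is negative for every choice of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then there is a relative maximum
at $\mathbf{a}$.
If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is positive for every choice of $\mathbf{x} \neq \mathbf{a}$ near $\mathbf{a}$, then there is a relative minimum
at $\mathbf{a}$.
If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a})$ is positive for some choices of $\mathbf{x}$ and negative for other choices of $\mathbf{x}$
arbitrarily close to $\mathbf{a}$, then there is a saddle point at $\mathbf{a}$, but not a relative extremum.
If $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = 0$ for some choice of $\mathbf{x} \neq \mathbf{a}$ arbitrarily close to $\mathbf{a}$, and the expression does not take both signs, then the analysis fails.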
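One way to get a feel for these four cases is to sample the sign of $\mathbf{z}^T H \mathbf{z}$ over many directions $\mathbf{z} = \mathbf{x} - \mathbf{a}$. Because scaling $\mathbf{z}$ by $t$ scales the expression by $t^2$, unit vectors suffice. The sketch below is a heuristic only: the function name `sign_pattern`, the sample count, and the test matrices are assumptions, and sampling finitely many directions cannot prove that a form is definite.

```python
import math


def sign_pattern(H, samples=360, tol=1e-12):
    """Sample z^T H z on unit vectors z and report which case the signs suggest."""
    saw_pos = saw_neg = saw_zero = False
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        z1, z2 = math.cos(t), math.sin(t)
        q = H[0][0] * z1 * z1 + (H[0][1] + H[1][0]) * z1 * z2 + H[1][1] * z2 * z2
        if q > tol:
            saw_pos = True
        elif q < -tol:
            saw_neg = True
        else:
            saw_zero = True
    if saw_pos and saw_neg:
        return "saddle point"
    if saw_zero:
        return "test fails"
    return "relative minimum" if saw_pos else "relative maximum"


# Illustrative matrices (assumptions), one for each case.
print(sign_pattern([[2.0, -1.0], [-1.0, 2.0]]))
print(sign_pattern([[-2.0, 0.0], [0.0, -3.0]]))
print(sign_pattern([[1.0, 0.0], [0.0, -1.0]]))
print(sign_pattern([[1.0, 0.0], [0.0, 0.0]]))
```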
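Example: Consider the function
$$f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 + 2x_1 + 2x_2 - 4$$
Then $\nabla f(\mathbf{x}) = (2x_1 - x_2 + 2, \; 2x_2 - x_1 + 2)$, and $\nabla f(\mathbf{x}) = \mathbf{0}$ is solved to find that $\mathbf{a} = (-2, -2)$
is the only critical point. Differentiate again to find $f_{11}(\mathbf{a}) = 2$, $f_{12}(\mathbf{a}) = f_{21}(\mathbf{a}) = -1$, and
$f_{22}(\mathbf{a}) = 2$. Thus the Hessian of $f$ at $\mathbf{a}$ is
$$H = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$
and
$$f(\mathbf{x}) - f(\mathbf{a}) = \frac{1}{2} (\mathbf{x} - \mathbf{a})^T \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} (\mathbf{x} - \mathbf{a})$$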
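The arithmetic in this example can be checked with a few lines of code. This is a minimal sketch: solving the gradient equations by Cramer's rule and the particular sample points are choices made for the illustration, not steps taken in the text.

```python
def f(x1, x2):
    return x1**2 - x1 * x2 + x2**2 + 2.0 * x1 + 2.0 * x2 - 4.0


# grad f = 0 is the linear system  2*x1 - x2 = -2,  -x1 + 2*x2 = -2.
# Solve it with Cramer's rule.
det = 2.0 * 2.0 - (-1.0) * (-1.0)  # determinant of [[2, -1], [-1, 2]]
a1 = ((-2.0) * 2.0 - (-1.0) * (-2.0)) / det
a2 = (2.0 * (-2.0) - (-2.0) * (-1.0)) / det

H = [[2.0, -1.0], [-1.0, 2.0]]


def half_quadratic(z1, z2):
    """Evaluate (1/2) z^T H z for the Hessian H of the example."""
    return 0.5 * (H[0][0] * z1 * z1 + 2.0 * H[0][1] * z1 * z2 + H[1][1] * z2 * z2)


# f is itself quadratic, so f(x) - f(a) should match (1/2)(x - a)^T H (x - a)
# at every point, not only near a. Check a few arbitrary sample points.
gaps = []
for x1, x2 in [(-1.5, -2.0), (0.0, 0.0), (3.0, -7.0)]:
    lhs = f(x1, x2) - f(a1, a2)
    rhs = half_quadratic(x1 - a1, x2 - a2)
    gaps.append(abs(lhs - rhs))

print((a1, a2), f(a1, a2), max(gaps))
```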
Analyzing this situation is difficult because of all the possible choices for $\mathbf{x}$ near $\mathbf{a}$; however, the
notion of a quadratic form cleans up matters considerably.
If $\mathbf{z} = \mathbf{x} - \mathbf{a}$, then the term $(\mathbf{x} - \mathbf{a})^T H (\mathbf{x} - \mathbf{a}) = \mathbf{z}^T H \mathbf{z}$, so there is a quadratic form $Q(\mathbf{z}) = \mathbf{z}^T H \mathbf{z}$
and the Hessian $H$ is the matrix of that quadratic form. Notice then that the above observation
about relative maxima, relative minima, and saddle points may be summarized quite nicely.
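If $Q(\mathbf{z}) < 0$ for all $\mathbf{z} \neq \mathbf{0}$, then there is a relative maximum at $\mathbf{a}$.
If $Q(\mathbf{z}) > 0$ for all $\mathbf{z} \neq \mathbf{0}$, then there is a relative minimum at $\mathbf{a}$.
If $Q(\mathbf{z})$ is positive for some choices of $\mathbf{z}$ and negative for other choices of $\mathbf{z}$, then there is a
saddle point at $\mathbf{a}$, but not a relative extremum.
If $Q(\mathbf{z}) = 0$ for some $\mathbf{z} \neq \mathbf{0}$ and $Q$ does not take both signs, then the analysis fails.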
Finally, note that the first three conditions given above are the definitions for a negative definite
quadratic form, a positive definite quadratic form, and an indefinite quadratic form.
Example (cont.): The standard matrix for this quadratic form $Q$ is
$$H = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$
The Principal Axes Theorem says that there is an orthogonal change of variable $\mathbf{z} = P\mathbf{y}$ that
transforms the quadratic form $\mathbf{z}^T H \mathbf{z}$ into a quadratic form $\mathbf{y}^T D \mathbf{y}$, where $D$ is a diagonal matrix
with the eigenvalues of $H$ (with multiplicities) as its diagonal entries.
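Example (cont.): The eigenvalues of $H$ are 1 and 3, and the standard matrix $H$ may be diagonalized
to find that $H = PDP^{-1}$, where
$$P = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad D = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$$
Dividing each column of $P$ by $\sqrt{2}$ makes $P$ an orthogonal matrix without changing $D$, so that $\mathbf{z} = P\mathbf{y}$ is an orthogonal change of variable.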
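If $\mathbf{y} = (y_1, y_2)$, then the quadratic form has been converted into
$$\mathbf{y}^T D \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = y_1^2 + 3y_2^2$$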
which is positive for all choices of $y_1$ and $y_2$ other than $y_1 = y_2 = 0$, thus for all $\mathbf{y} \neq \mathbf{0}$. And so $Q(\mathbf{z}) > 0$ for all
$\mathbf{z} \neq \mathbf{0}$ (i.e., $Q$ is a positive definite quadratic form), and $f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 + 2x_1 + 2x_2 - 4$
has a relative minimum at the point $\mathbf{a} = (-2, -2)$.
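The diagonalization in this example can be verified numerically. The sketch below assumes unit eigenvectors $(1, 1)/\sqrt{2}$ and $(-1, 1)/\sqrt{2}$ as the columns of $P$ (a normalization chosen here so that the change of variable is orthogonal), and uses the trace and determinant formula for the eigenvalues of a $2 \times 2$ matrix; the sample values of $\mathbf{y}$ are arbitrary.

```python
import math

H = [[2.0, -1.0], [-1.0, 2.0]]

# Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
trace = H[0][0] + H[1][1]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
disc = math.sqrt(trace * trace - 4.0 * det)
eigenvalues = ((trace - disc) / 2.0, (trace + disc) / 2.0)

# Unit eigenvectors as the columns of P (the normalization is an assumption).
r = 1.0 / math.sqrt(2.0)
P = [[r, -r], [r, r]]


def Q(z1, z2):
    """The quadratic form z^T H z."""
    return H[0][0] * z1 * z1 + 2.0 * H[0][1] * z1 * z2 + H[1][1] * z2 * z2


def diagonal_form(y1, y2):
    """The diagonalized form y^T D y."""
    return eigenvalues[0] * y1 * y1 + eigenvalues[1] * y2 * y2


# With z = P y, the two forms should agree for every y.
errors = []
for y1, y2 in [(1.0, 0.0), (0.0, 1.0), (0.3, -1.7)]:
    z1 = P[0][0] * y1 + P[0][1] * y2
    z2 = P[1][0] * y1 + P[1][1] * y2
    errors.append(abs(Q(z1, z2) - diagonal_form(y1, y2)))

print(eigenvalues, max(errors))
```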
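Finally note that, by Theorem 5 in Section 7.2, the behavior of $f(\mathbf{x})$ at a critical point $\mathbf{a}$ may be
summarized by determining the eigenvalues of the Hessian
$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) \end{bmatrix}$$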
If all eigenvalues of H are positive, f (x) will have a relative minimum at a.
If all eigenvalues of H are negative, f (x) will have a relative maximum at a.
If the eigenvalues of H are of mixed signs, then f (x) has a saddle point at a.
If any of the eigenvalues of H is zero, then the analysis fails.
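The four rules above translate directly into a small classifier for the two-variable case. This is a sketch under stated assumptions: the function name `classify_critical_point`, the tolerance used to decide that an eigenvalue is zero, and the test matrices are all illustrative.

```python
import math


def classify_critical_point(H, tol=1e-12):
    """Classify a critical point from a symmetric 2x2 Hessian H via its eigenvalues."""
    trace = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # For a symmetric matrix the discriminant equals (f11 - f22)^2 + 4*f12^2,
    # which is never negative; max() only guards against rounding error.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam1, lam2 = (trace - disc) / 2.0, (trace + disc) / 2.0  # lam1 <= lam2
    if abs(lam1) <= tol or abs(lam2) <= tol:
        return "test fails"
    if lam1 > 0:
        return "relative minimum"
    if lam2 < 0:
        return "relative maximum"
    return "saddle point"


# The Hessian from the worked example, followed by illustrative matrices (assumptions).
print(classify_critical_point([[2.0, -1.0], [-1.0, 2.0]]))
print(classify_critical_point([[-2.0, 0.0], [0.0, -3.0]]))
print(classify_critical_point([[0.0, 1.0], [1.0, 0.0]]))
print(classify_critical_point([[1.0, 1.0], [1.0, 1.0]]))
```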
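The same analysis applies to functions $f$ of three variables. If the Hessian of $f$ at a point $\mathbf{a}$ is
defined to be
$$H = \begin{bmatrix} f_{11}(\mathbf{a}) & f_{12}(\mathbf{a}) & f_{13}(\mathbf{a}) \\ f_{21}(\mathbf{a}) & f_{22}(\mathbf{a}) & f_{23}(\mathbf{a}) \\ f_{31}(\mathbf{a}) & f_{32}(\mathbf{a}) & f_{33}(\mathbf{a}) \end{bmatrix}$$
then the eigenvalues of $H$ determine the behavior of $f(\mathbf{x})$ at a critical point (where $\nabla f(\mathbf{a}) = \mathbf{0}$)
exactly as they do in the two variable case.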
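For a $3 \times 3$ Hessian there is no short closed formula for the eigenvalues, so a numerical method is convenient. The sketch below uses cyclic Jacobi rotations, a standard eigenvalue method for symmetric matrices that is not mentioned in the text; the function names, tolerances, and test matrices are assumptions made for the illustration. The classification follows the four rules stated for the two-variable case.

```python
import math


def jacobi_eigenvalues(A, max_sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(A)
    M = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(M[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                b = M[p][q]
                if b == 0.0:
                    continue
                # Rotation angle in [-pi/4, pi/4] that zeroes the (p, q) entry.
                diff = M[q][q] - M[p][p]
                theta = math.pi / 4.0 if diff == 0.0 else 0.5 * math.atan(2.0 * b / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Apply M <- J^T M J, where J rotates the (p, q) coordinate plane.
                for k in range(n):
                    mkp, mkq = M[k][p], M[k][q]
                    M[k][p] = c * mkp - s * mkq
                    M[k][q] = s * mkp + c * mkq
                for k in range(n):
                    mpk, mqk = M[p][k], M[q][k]
                    M[p][k] = c * mpk - s * mqk
                    M[q][k] = s * mpk + c * mqk
    return sorted(M[i][i] for i in range(n))


def classify(H, tol=1e-9):
    """Apply the eigenvalue rules to a symmetric Hessian of any size."""
    eig = jacobi_eigenvalues(H)
    if any(abs(lam) <= tol for lam in eig):
        return "test fails"
    if eig[0] > 0:
        return "relative minimum"
    if eig[-1] < 0:
        return "relative maximum"
    return "saddle point"


# Illustrative symmetric matrices (assumptions, not taken from the exercises).
tridiagonal = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
mixed = [[1.0, 2.0, 0.0], [2.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(jacobi_eigenvalues(tridiagonal), classify(tridiagonal))
print(jacobi_eigenvalues(mixed), classify(mixed))
```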
Questions: Locate all relative extrema and saddle points for the following functions.
1. $f(\mathbf{x}) = x_1^2 + x_1 x_2 + x_2^2 + 3x_1 - 3x_2 + 4$
2. $f(\mathbf{x}) = x_1^2 + x_1 x_2 + 3x_1 + 2x_2 + 5$
3. $f(\mathbf{x}) = x_1^2 - 4x_1 x_2 + x_2^2 + 6x_2 + 2$
4. $f(\mathbf{x}) = 2x_1 + 2x_2 - 2x_1^2 - 2x_1 x_2 - x_2^2 + 3$
5. $f(\mathbf{x}) = x_1^3 - x_2^3 - 2x_1 x_2 + 6$
6. $f(\mathbf{x}) = 6x_1^2 - 2x_1^3 + 3x_2^2 + 6x_1 x_2$
7. $f(\mathbf{x}) = x_1^2 - x_1 x_2 + x_2^2 - x_1 x_3 + x_3^2$
8. $f(\mathbf{x}) = x_1^3 + x_2^3 + x_3^3 - 3x_1^2 x_2 - 3x_2^2 x_3 - 3x_3^2 x_1 + 6x_1 + 6x_2 + 6x_3$
Hint: the critical points for this function are the following 8 points:
(1, 1, 1)
(-1, -1, -1)
(1.53868, -0.0838115, 2.15069)
(-1.53868, 0.0838115, -2.15069)
(-0.0838115, 2.15069, 1.53868)
(0.0838115, -2.15069, -1.53868)
(2.15069, 1.53868, -0.0838115)
(-2.15069, -1.53868, 0.0838115)
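The hint can be sanity-checked by evaluating the gradient of the function in Question 8 at each listed point. This is a sketch only: the function name `grad_f` and the tolerance are assumptions, and since the decimal coordinates are rounded to six significant digits the gradient is only approximately zero at those points.

```python
def grad_f(x1, x2, x3):
    """Gradient of x1^3 + x2^3 + x3^3 - 3*x1^2*x2 - 3*x2^2*x3 - 3*x3^2*x1 + 6*(x1 + x2 + x3)."""
    return (
        3 * x1**2 - 6 * x1 * x2 - 3 * x3**2 + 6,
        3 * x2**2 - 6 * x2 * x3 - 3 * x1**2 + 6,
        3 * x3**2 - 6 * x3 * x1 - 3 * x2**2 + 6,
    )


points = [
    (1.0, 1.0, 1.0),
    (-1.0, -1.0, -1.0),
    (1.53868, -0.0838115, 2.15069),
    (-1.53868, 0.0838115, -2.15069),
    (-0.0838115, 2.15069, 1.53868),
    (0.0838115, -2.15069, -1.53868),
    (2.15069, 1.53868, -0.0838115),
    (-2.15069, -1.53868, 0.0838115),
]

# Largest gradient component (in absolute value) at each candidate point.
residuals = [max(abs(g) for g in grad_f(*p)) for p in points]
for p, r in zip(points, residuals):
    print(p, r)
```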
Reference:
1. Finney, Ross L., Weir, Maurice D., and Giordano, Frank R. Thomas’ Calculus. Tenth Edition.
Boston: Addison-Wesley, 2001.