1.1 Differentiability

A TEXT FOR NONLINEAR PROGRAMMING
Thomas W. Reiland
Department of Statistics
North Carolina State University
Raleigh, NC 27695-8203
Email: reiland@ncsu.edu
Table of Contents
Chapter I: Optimality Conditions
§ 1.1 Differentiability
§ 1.2 Unconstrained Optimization
§ 1.3 Equality Constrained Optimization
§ 1.3.1 Interpretation of Lagrange Multipliers
§ 1.3.2 Second-order Conditions – Equality Constraints
§ 1.3.3 The General Case
§ 1.4 Inequality Constrained Optimization
§ 1.5 Constraint Qualifications (CQ)
§ 1.6 Second-order Optimality Conditions
§ 1.7 Constraint Qualifications and Relationships Among Constraint Qualifications
Chapter II: Convexity
§ 2.1 Convex Sets
§ 2.2 Convex Functions
§ 2.3 Subgradients and Differentiable Convex Functions
Chapter I: Optimality Conditions
§1.1 Differentiability
Definition 1.1.1 Let $S$ be a nonempty set in $E^n$, $\bar{x} \in \operatorname{int}(S)$, and let $f : S \to E^1$. Then $f$ is said to be differentiable at $\bar{x}$ if there is a vector $\nabla f(\bar{x})$ in $E^n$, called the gradient of $f$ at $\bar{x}$, and a function $\beta$ satisfying $\beta(\bar{x}; x) \to 0$ as $x \to \bar{x}$ such that
$$f(x) = f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \|x - \bar{x}\| \, \beta(\bar{x}; x) \quad \text{for each } x \in S.$$
Remark: The gradient vector consists of the partial derivatives of $f$:
$$\nabla f(\bar{x}) = \begin{pmatrix} \dfrac{\partial f(\bar{x})}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(\bar{x})}{\partial x_n} \end{pmatrix}$$
Implications:
If $f$ is differentiable at $\bar{x}$, then $f$ is continuous at $\bar{x}$ and $\nabla f(\bar{x})$ exists.
If the partial derivatives of $f$ exist in a neighborhood of $\bar{x}$ and are continuous at $\bar{x}$, then $f$ is differentiable at $\bar{x}$.
Examples 1-1:
1) A discontinuous (hence not differentiable) function of two variables with first partial derivatives everywhere.
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2} & \text{if } x_1^2 + x_2^2 \neq 0 \\ 0 & \text{if } x_1 = x_2 = 0 \end{cases}$$
$f$ is discontinuous at $(0,0)$, since arbitrarily near $(0,0)$ there exist points of the form $(a,a)$, $a \neq 0$, at which $f = 1/2$, whereas $f(0,0) = 0$.
2) A differentiable function of two variables whose gradient is not continuous.
$$f(x_1, x_2) = \begin{cases} x_1^2 \sin\left(\dfrac{1}{x_1}\right) + x_2^2 \sin\left(\dfrac{1}{x_2}\right) & \text{if } x_1 x_2 \neq 0 \\ x_1^2 \sin\left(\dfrac{1}{x_1}\right) & \text{if } x_1 \neq 0,\ x_2 = 0 \\ x_2^2 \sin\left(\dfrac{1}{x_2}\right) & \text{if } x_1 = 0,\ x_2 \neq 0 \\ 0 & \text{if } x_1 = x_2 = 0 \end{cases}$$
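Taking the partial derivatives, it is apparent that both are discontinuous at the origin, because the term $\cos(1/x_i)$ oscillates between $-1$ and $1$ as $x_i \to 0$:
$$\frac{\partial f}{\partial x_1} = \begin{cases} 2 x_1 \sin\left(\dfrac{1}{x_1}\right) - \cos\left(\dfrac{1}{x_1}\right) & \text{if } x_1 \neq 0 \\ 0 & \text{if } x_1 = 0 \end{cases}$$
$$\frac{\partial f}{\partial x_2} = \begin{cases} 2 x_2 \sin\left(\dfrac{1}{x_2}\right) - \cos\left(\dfrac{1}{x_2}\right) & \text{if } x_2 \neq 0 \\ 0 & \text{if } x_2 = 0 \end{cases}$$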
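The two examples above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (f1, f2, g, dg) and the sample points are assumptions chosen for illustration, and a numerical check only supports the claims, it does not prove them.

```python
import math

def f1(x1, x2):
    """Example 1: x1*x2 / (x1^2 + x2^2), defined as 0 at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 ** 2 + x2 ** 2)

def g(s):
    """One-variable building block of Example 2: s^2 sin(1/s), with g(0) = 0."""
    return 0.0 if s == 0.0 else s * s * math.sin(1.0 / s)

def f2(x1, x2):
    """Example 2: the four cases of the definition collapse to g(x1) + g(x2)."""
    return g(x1) + g(x2)

def dg(s):
    """Partial derivative formula: 2 s sin(1/s) - cos(1/s) for s != 0, else 0."""
    return 0.0 if s == 0.0 else 2.0 * s * math.sin(1.0 / s) - math.cos(1.0 / s)

# Example 1: f1 equals 1/2 on the diagonal, however close to the origin.
diagonal_values = [f1(a, a) for a in (1e-1, 1e-4, 1e-8)]

# Example 1: both partials at the origin are 0, because f1 vanishes on the axes.
h = 1e-6
partial_x1_at_origin = (f1(h, 0.0) - f1(0.0, 0.0)) / h
partial_x2_at_origin = (f1(0.0, h) - f1(0.0, 0.0)) / h

# Example 2: with candidate gradient 0 at the origin, the remainder ratio
# |f2(x) - f2(0)| / ||x|| should shrink as x approaches the origin.
sample_steps = (1e-1, 1e-3, 1e-5)
ratios = [abs(f2(t, t)) / math.hypot(t, t) for t in sample_steps]

# Example 2: the partial derivative is close to -1 at s = 1/(2 k pi) and
# close to +1 at s = 1/((2k+1) pi), so it has no limit as s -> 0.
ks = (10, 1000, 100000)
near_minus_one = [dg(1.0 / (2 * k * math.pi)) for k in ks]
near_plus_one = [dg(1.0 / ((2 * k + 1) * math.pi)) for k in ks]
```

The ratio in Example 2 is bounded by $\sqrt{2}\,t$ along the diagonal, which is what Definition 1.1.1 requires of $\beta$ at the origin, while the two sequences of derivative values show why the gradient cannot be continuous there.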

Definition 1.1.2 The set of points for which the function f(x) has a constant value is called a
contour of f.
Example 1-2: Let $S = E^2$, $f : S \to E^1$, $f(x_1, x_2) = x_1^2 + x_2^2$.
Contours are circles with center at the origin.
$\{(x_1, x_2) \mid x_1^2 + x_2^2 = 1\}$: circle with radius of 1.
Remark: At any point $x$ the gradient $\nabla f(x)$ points in the direction of the fastest instantaneous rate of increase of the function (in the direction of the steepest ascent) and is orthogonal to the contours of $f$ that pass through $x$. If the partial derivatives are continuous, $-\nabla f(x)$ is the direction of steepest descent.
Example 1-2 Continued:
$f(x) = x_1^2 + x_2^2$, so $\nabla f(x) = \begin{pmatrix} 2x_1 \\ 2x_2 \end{pmatrix}$.
Let $x = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then $\nabla f(x) = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$.
The gradient lies in the domain space.
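As a rough numerical illustration of Example 1-2, the sketch below approximates the gradient of $f(x_1, x_2) = x_1^2 + x_2^2$ at $(0, 1)$ by central differences, checks that it is orthogonal to the tangent of the unit circle at that point, and searches a sample of unit directions for the one with the largest slope. The helper name grad_fd, the step size, and the one-degree angle grid are assumptions made for this illustration only.

```python
import math

def f(x):
    """Function of Example 1-2."""
    return x[0] ** 2 + x[1] ** 2

def grad_fd(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad

x = [0.0, 1.0]
g = grad_fd(f, x)  # close to the exact gradient (0, 2)

# The contour through x is the unit circle; its tangent at (0, 1) is (-1, 0).
tangent = [-1.0, 0.0]
dot_with_tangent = g[0] * tangent[0] + g[1] * tangent[1]

# Among unit directions (cos a, sin a) on a one-degree grid, find the one
# that maximizes the slope g^T y. It should be the gradient direction.
angles = [2.0 * math.pi * k / 360 for k in range(360)]
best_angle = max(angles, key=lambda a: g[0] * math.cos(a) + g[1] * math.sin(a))
```

Here best_angle comes out at $\pi/2$, that is, the direction $(0, 1)$, which is the normalized gradient.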
Remark: Given $f : E^n \to E^1$ (not necessarily differentiable), the directional derivative of $f$ at $x^0$ in the direction $y$ is
$$Df(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + ty) - f(x^0)}{t}$$
if the limit exists.
If $y = [1, 0, \ldots, 0]^T$, then $Df(x^0; y) = \dfrac{\partial f(x^0)}{\partial x_1}$, provided this partial derivative exists. The partials with respect to $x_2, \ldots, x_n$ can be obtained similarly.
If $f$ is differentiable at $x^0$, then $Df(x^0; y)$ is finite for all directions $y$ and
$$Df(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + ty) - f(x^0)}{t} = \nabla f(x^0)^T y.$$
When $\|y\| = 1$, $\nabla f(x^0)^T y$ describes the instantaneous rate of change of $f$ from $x^0$ in the direction $y$.
Proof: Since $f$ is differentiable at $x^0$, for $t > 0$
$$f(x^0 + ty) = f(x^0) + t \nabla f(x^0)^T y + t \|y\| \, \beta(x^0; x^0 + ty)$$
$$\frac{f(x^0 + ty) - f(x^0)}{t} = \nabla f(x^0)^T y + \|y\| \, \beta(x^0; x^0 + ty)$$
$$\lim_{t \to 0^+} \frac{f(x^0 + ty) - f(x^0)}{t} = \nabla f(x^0)^T y + \lim_{t \to 0^+} \|y\| \, \beta(x^0; x^0 + ty) = \nabla f(x^0)^T y$$
QED.
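The identity $Df(x^0; y) = \nabla f(x^0)^T y$ can be illustrated numerically. The sketch below reuses $f(x) = x_1^2 + x_2^2$ from Example 1-2; the point $x^0 = (1, 2)$, the direction $y = (3, -1)$, and the step sizes are assumptions picked for the illustration. For this particular $f$ the difference quotient works out by hand to $2 + 10t$, so the remainder term $\|y\| \beta$ in the proof is exactly $10t$.

```python
import math

def f(x):
    """Function of Example 1-2."""
    return x[0] ** 2 + x[1] ** 2

def grad_f(x):
    """Exact gradient of f."""
    return [2.0 * x[0], 2.0 * x[1]]

def difference_quotient(x0, y, t):
    """One-sided quotient (f(x0 + t y) - f(x0)) / t for t > 0."""
    shifted = [xi + t * yi for xi, yi in zip(x0, y)]
    return (f(shifted) - f(x0)) / t

x0 = [1.0, 2.0]
y = [3.0, -1.0]
steps = [1e-1, 1e-2, 1e-3]

# grad f(x0)^T y = 2*3 + 4*(-1) = 2
slope = sum(gi * yi for gi, yi in zip(grad_f(x0), y))

# Quotients approach the slope as t decreases toward 0.
quotients = [difference_quotient(x0, y, t) for t in steps]

# Recover beta(x0; x0 + t y) from the second line of the proof.
norm_y = math.hypot(*y)
betas = [(q - slope) / norm_y for q in quotients]
```

The recovered values of $\beta$ shrink in proportion to $t$, in line with the requirement that $\beta(x^0; x) \to 0$ as $x \to x^0$.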
Definition 1.1.3 Let $S \subseteq E^n$ be nonempty, $\bar{x} \in \operatorname{int}(S)$, $f : S \to E^1$; $f$ is called twice differentiable at $\bar{x}$ if, in addition to $\nabla f(\bar{x})$, there exists an $n \times n$ symmetric matrix $H(\bar{x})$ (or $\nabla^2 f(\bar{x})$), called the Hessian matrix of $f$ at $\bar{x}$, and a function $\beta$ satisfying $\beta(\bar{x}; x) \to 0$ as $x \to \bar{x}$ such that
$$f(x) = f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \frac{1}{2} (x - \bar{x})^T H(\bar{x}) (x - \bar{x}) + \|x - \bar{x}\|^2 \, \beta(\bar{x}; x) \quad \text{for all } x \in S.$$
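Remark: $H(\bar{x})$ consists of the second-order partial derivatives of $f$ evaluated at $\bar{x}$.
$$\nabla^2 f(\bar{x}) = H(\bar{x}) = \begin{pmatrix} \dfrac{\partial^2 f(\bar{x})}{\partial x_1^2} & \dfrac{\partial^2 f(\bar{x})}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f(\bar{x})}{\partial x_1 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 f(\bar{x})}{\partial x_n \partial x_1} & \dfrac{\partial^2 f(\bar{x})}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f(\bar{x})}{\partial x_n^2} \end{pmatrix}$$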
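To see the second-order expansion of Definition 1.1.3 at work, the sketch below evaluates the remainder term for a test function that is not from the notes: $f(x_1, x_2) = x_1^2 x_2 + x_2^3$ is a hypothetical choice, with gradient and Hessian derived by hand. The point $\bar{x} = (1, 2)$ and the displacements $d = (t, t)$ are likewise assumptions for illustration.

```python
def f(x):
    """Hypothetical test function (an assumption, not from the notes)."""
    x1, x2 = x
    return x1 ** 2 * x2 + x2 ** 3

def grad(x):
    """Hand-derived gradient of f."""
    x1, x2 = x
    return [2.0 * x1 * x2, x1 ** 2 + 3.0 * x2 ** 2]

def hess(x):
    """Hand-derived Hessian of f; note that it is symmetric."""
    x1, x2 = x
    return [[2.0 * x2, 2.0 * x1],
            [2.0 * x1, 6.0 * x2]]

def remainder_ratio(xbar, d):
    """(f(xbar + d) - second-order model) / ||d||^2, i.e. beta(xbar; xbar + d)."""
    g = grad(xbar)
    H = hess(xbar)
    linear = sum(gi * di for gi, di in zip(g, d))
    quadratic = 0.5 * sum(d[i] * H[i][j] * d[j]
                          for i in range(2) for j in range(2))
    model = f(xbar) + linear + quadratic
    shifted = [xi + di for xi, di in zip(xbar, d)]
    norm_sq = sum(di * di for di in d)
    return (f(shifted) - model) / norm_sq

xbar = [1.0, 2.0]
steps = [1e-1, 1e-2, 1e-3]
ratios = [remainder_ratio(xbar, [t, t]) for t in steps]
```

For this cubic, expanding by hand gives a remainder of $2t^3$ and $\|d\|^2 = 2t^2$, so each ratio should be close to $t$ and tends to 0 with the displacement, as the definition requires of $\beta$.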
Definition 1.1.4 The $n \times n$ symmetric matrix $D$ is positive semidefinite (definite) if $x^T D x \geq 0$ ($> 0$) $\forall x \in E^n$ ($\forall x \neq 0 \in E^n$). If $D$ is positive definite, then it is also positive semidefinite.
Similarly, reversing the inequality signs provides the definitions of negative semidefinite and negative definite.
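A symmetric matrix is positive semidefinite exactly when all of its eigenvalues are nonnegative, and positive definite exactly when they are all positive. This is a standard fact that the notes do not state here. The sketch below uses it to classify symmetric $2 \times 2$ matrices via the closed-form eigenvalues; the restriction to $2 \times 2$, the function names, and the tolerance are assumptions made to keep the example self-contained.

```python
import math

def eigenvalues_2x2(D):
    """Eigenvalues (smallest, largest) of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = D[0][0], D[0][1], D[1][1]
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius

def classify(D, tol=1e-12):
    """Classify a symmetric 2x2 matrix by the signs of its eigenvalues."""
    smallest, largest = eigenvalues_2x2(D)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "positive semidefinite"
    if largest < -tol:
        return "negative definite"
    if largest <= tol:
        return "negative semidefinite"
    return "indefinite"

def quadratic_form(D, x):
    """Evaluate x^T D x directly, as in Definition 1.1.4."""
    return sum(x[i] * D[i][j] * x[j] for i in range(2) for j in range(2))

hessian_of_example = [[2.0, 0.0], [0.0, 2.0]]   # Hessian of x1^2 + x2^2
rank_one = [[1.0, 1.0], [1.0, 1.0]]             # x^T D x = (x1 + x2)^2
saddle = [[0.0, 1.0], [1.0, 0.0]]               # x^T D x = 2 x1 x2
```

For instance, classify(rank_one) reports "positive semidefinite" because quadratic_form(rank_one, [1.0, -1.0]) is 0 for a nonzero vector, while saddle takes both signs and is indefinite. The zero matrix is reported as positive semidefinite by this ordering of the tests, although it is also negative semidefinite.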
Definition 1.1.5 A point $\bar{x} \in S \subseteq E^n$ is said to be a local maximum point of $f$ over $S$ if there exists an $\varepsilon > 0$ such that $f(x) \leq f(\bar{x})$ $\forall x \in S$ within a distance $\varepsilon$ of $\bar{x}$.
If $f(x) < f(\bar{x})$ $\forall x \in S$, $x \neq \bar{x}$, within a distance $\varepsilon$ of $\bar{x}$, then $\bar{x}$ is said to be a strict local maximum point of $f$ over $S$.
Definition 1.1.6 A point $\bar{x} \in S \subseteq E^n$ is a global maximum of $f$ over $S$ if $f(x) \leq f(\bar{x})$ $\forall x \in S$.
If $f(x) < f(\bar{x})$ $\forall x \in S$, $x \neq \bar{x}$, then $\bar{x}$ is a strict global maximum point of $f$ over $S$.
Reversing the inequalities in Definitions 1.1.5 and 1.1.6 provides the corresponding definitions for minimum points.
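The difference between local and global maximum points can be seen on a one-variable example. The sketch below is only a grid check under stated assumptions: the function $f(x) = x^3 - 3x$, the set $S = [-3, 3]$, the grid spacing, and the helper names are all chosen for illustration, and testing finitely many points does not prove the inequalities of Definitions 1.1.5 and 1.1.6.

```python
def f(x):
    """Hypothetical test function (an assumption, not from the notes)."""
    return x ** 3 - 3.0 * x

def grid(lo, hi, n):
    """n + 1 equally spaced points covering [lo, hi]."""
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

# Sampled stand-in for S = [-3, 3], with spacing 0.001.
S = grid(-3.0, 3.0, 6000)

def is_local_max(xbar, eps):
    """Check f(x) <= f(xbar) for sampled x in S within distance eps of xbar."""
    return all(f(x) <= f(xbar) for x in S if abs(x - xbar) <= eps)

def is_global_max(xbar):
    """Check f(x) <= f(xbar) for every sampled x in S."""
    return all(f(x) <= f(xbar) for x in S)

local_at_minus_one = is_local_max(-1.0, 0.5)     # some eps works
too_wide_a_radius = is_local_max(-1.0, 3.5)      # this eps does not
global_at_minus_one = is_global_max(-1.0)
global_at_three = is_global_max(3.0)
```

On this sample, $\bar{x} = -1$ passes the local test with $\varepsilon = 0.5$ but fails the global one, since $f(3) = 18 > f(-1) = 2$; the boundary point $\bar{x} = 3$ passes the global test.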