A TEXT FOR NONLINEAR PROGRAMMING

Thomas W. Reiland
Department of Statistics
North Carolina State University
Raleigh, NC 27695-8203
Email: reiland@ncsu.edu

Table of Contents

Chapter I: Optimality Conditions
§ 1.1 Differentiability
§ 1.2 Unconstrained Optimization
§ 1.3 Equality Constrained Optimization
§ 1.3.1 Interpretation of Lagrange Multipliers
§ 1.3.2 Second order Conditions – Equality Constraints
§ 1.3.3 The General Case
§ 1.4 Inequality Constrained Optimization
§ 1.5 Constraint Qualifications (CQ)
§ 1.6 Second-order Optimality Conditions
§ 1.7 Constraint Qualifications and Relationships Among Constraint Qualifications

Chapter II: Convexity
§ 2.1 Convex Sets
§ 2.2 Convex Functions
§ 2.3 Subgradients and differentiable convex functions
Chapter I: Optimality Conditions

§1.1 Differentiability

Definition 1.1.1 Let $S$ be a nonempty set in $E^n$, $\bar{x} \in \operatorname{int}(S)$, and let $f : S \to E^1$. Then $f$ is said to be differentiable at $\bar{x}$ if there is a vector $\nabla f(\bar{x})$ in $E^n$, called the gradient of $f$ at $\bar{x}$, and a function $\beta$ satisfying $\beta(\bar{x}; x) \to 0$ as $x \to \bar{x}$ such that

\[ f(x) = f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \|x - \bar{x}\| \, \beta(\bar{x}; x) \quad \text{for each } x \in S. \]

Remark: The gradient vector consists of the partial derivatives of $f$:

\[ \nabla f(\bar{x}) = \begin{pmatrix} \dfrac{\partial f(\bar{x})}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(\bar{x})}{\partial x_n} \end{pmatrix}. \]

Implications: If $f$ is differentiable at $\bar{x}$, then $f$ is continuous at $\bar{x}$ and $\nabla f(\bar{x})$ exists. If the partial derivatives exist and are continuous at $\bar{x}$, then $f$ is differentiable at $\bar{x}$.

Examples 1-1:

1) A discontinuous (hence not differentiable) function of two variables with first partial derivatives everywhere.

\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2} & \text{if } x_1^2 + x_2^2 \neq 0 \\[1ex] 0 & \text{if } x_1 = x_2 = 0 \end{cases} \]

$f$ is discontinuous at $(0,0)$, since arbitrarily near $(0,0)$ there exist points of the form $(a,a)$, $a \neq 0$, at which $f = 1/2$.

2) A differentiable function of two variables whose gradient is not continuous.

\[ f(x_1, x_2) = \begin{cases} x_1^2 \sin\dfrac{1}{x_1} + x_2^2 \sin\dfrac{1}{x_2} & \text{if } x_1 x_2 \neq 0 \\[1ex] x_1^2 \sin\dfrac{1}{x_1} & \text{if } x_1 \neq 0,\ x_2 = 0 \\[1ex] x_2^2 \sin\dfrac{1}{x_2} & \text{if } x_1 = 0,\ x_2 \neq 0 \\[1ex] 0 & \text{if } x_1 = x_2 = 0 \end{cases} \]

Taking the partial derivatives shows that both are discontinuous at the origin:

\[ \frac{\partial f}{\partial x_1} = \begin{cases} 2 x_1 \sin\dfrac{1}{x_1} - \cos\dfrac{1}{x_1} & \text{if } x_1 \neq 0 \\[1ex] 0 & \text{if } x_1 = 0 \end{cases} \qquad \frac{\partial f}{\partial x_2} = \begin{cases} 2 x_2 \sin\dfrac{1}{x_2} - \cos\dfrac{1}{x_2} & \text{if } x_2 \neq 0 \\[1ex] 0 & \text{if } x_2 = 0 \end{cases} \]

Definition 1.1.2 The set of points for which the function $f(x)$ has a constant value is called a contour of $f$.

Example 1-2: Let $S = E^2$, $f : S \to E^1$, $f(x_1, x_2) = x_1^2 + x_2^2$. The contours are circles centered at the origin. For example, $\{(x_1, x_2) \mid x_1^2 + x_2^2 = 1\}$ is the circle of radius 1.

Remark: At any point $x$ the gradient $\nabla f(x)$ points in the direction of the fastest instantaneous rate of increase of the function (the direction of steepest ascent) and is orthogonal to the contour of $f$ that passes through $x$. If the partial derivatives are continuous, $-\nabla f(x)$ is the direction of steepest descent.
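The remark above can be checked numerically on Example 1-2. The following is a minimal sketch, not part of the original notes: the helper `numerical_gradient`, the step size `h`, the angle `theta`, and the grid of 3600 candidate directions are all assumptions chosen for illustration. It approximates the gradient by central differences at a point on the unit-circle contour, then checks that the gradient is (approximately) orthogonal to the tangent of the contour and that the unit direction with the largest rate of increase lines up with the gradient.

```python
import math

def f(x1, x2):
    """Example 1-2: f(x1, x2) = x1^2 + x2^2."""
    return x1 ** 2 + x2 ** 2

def numerical_gradient(func, x1, x2, h=1e-6):
    """Central-difference approximation of the gradient (hypothetical helper)."""
    g1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2 * h)
    g2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2 * h)
    return g1, g2

# A point on the contour f = 1 (the unit circle); theta is an arbitrary choice.
theta = 0.7
p = (math.cos(theta), math.sin(theta))
g = numerical_gradient(f, *p)

# The tangent to the circle at p is p rotated by 90 degrees.
tangent = (-p[1], p[0])
dot_with_tangent = g[0] * tangent[0] + g[1] * tangent[1]

# Rate of change g^T y over a grid of unit directions y = (cos a, sin a);
# the maximizing angle should be close to the angle of the gradient itself.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=lambda a: g[0] * math.cos(a) + g[1] * math.sin(a))
grad_angle = math.atan2(g[1], g[0])

print("gradient:", g)
print("gradient . tangent:", dot_with_tangent)
print("steepest-ascent angle on grid:", best_angle, "gradient angle:", grad_angle)
```

Because the direction search is over a finite grid, the two angles agree only up to the grid spacing; the sketch illustrates the remark rather than proving it.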
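Example 1-2 Continued: $f(x) = x_1^2 + x_2^2$, so

\[ \nabla f(x) = \begin{pmatrix} 2 x_1 \\ 2 x_2 \end{pmatrix}. \]

Let $\bar{x} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then $\nabla f(\bar{x}) = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$. The gradient lies in the domain space.

Remark: Given $f : E^n \to E^1$ (not necessarily differentiable), the directional derivative of $f$ at $x^0$ in the direction $y$ is

\[ Df(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + t y) - f(x^0)}{t} \]

if the limit exists.

If $y = [1, 0, \ldots, 0]^T$, then $Df(x^0; y) = \dfrac{\partial f(x^0)}{\partial x_1}$. The partial derivatives with respect to $x_2, \ldots, x_n$ are obtained similarly.

If $f$ is differentiable at $x^0$, then $Df(x^0; y)$ is finite for all directions $y$ and

\[ Df(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + t y) - f(x^0)}{t} = \nabla f(x^0)^T y. \]

When $\|y\| = 1$, $\nabla f(x^0)^T y$ describes the instantaneous rate of change of $f$ from $x^0$ in the direction $y$.

Proof: Since $f$ is differentiable at $x^0$, for $t > 0$

\[ f(x^0 + t y) = f(x^0) + t \, \nabla f(x^0)^T y + t \|y\| \, \beta(x^0; x^0 + t y), \]

\[ \frac{f(x^0 + t y) - f(x^0)}{t} = \nabla f(x^0)^T y + \|y\| \, \beta(x^0; x^0 + t y), \]

\[ \lim_{t \to 0^+} \frac{f(x^0 + t y) - f(x^0)}{t} = \nabla f(x^0)^T y + \lim_{t \to 0^+} \|y\| \, \beta(x^0; x^0 + t y) = \nabla f(x^0)^T y. \]

QED.

Definition 1.1.3 Let $S \subseteq E^n$ be nonempty, $\bar{x} \in \operatorname{int}(S)$, $f : S \to E^1$; $f$ is called twice differentiable at $\bar{x}$ if, in addition to $\nabla f(\bar{x})$, there exists an $n \times n$ symmetric matrix $H(\bar{x})$ (or $\nabla^2 f(\bar{x})$), called the Hessian matrix of $f$ at $\bar{x}$, and a function $\beta$ satisfying $\beta(\bar{x}; x) \to 0$ as $x \to \bar{x}$ such that

\[ f(x) = f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \tfrac{1}{2} (x - \bar{x})^T H(\bar{x}) (x - \bar{x}) + \|x - \bar{x}\|^2 \, \beta(\bar{x}; x) \quad \text{for all } x \in S. \]

Remark: $H(\bar{x})$ consists of the second-order partial derivatives of $f$ evaluated at $\bar{x}$:

\[ H(\bar{x}) = \begin{pmatrix} \dfrac{\partial^2 f(\bar{x})}{\partial x_1^2} & \dfrac{\partial^2 f(\bar{x})}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f(\bar{x})}{\partial x_1 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f(\bar{x})}{\partial x_n \partial x_1} & \dfrac{\partial^2 f(\bar{x})}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f(\bar{x})}{\partial x_n^2} \end{pmatrix} \]

Definition 1.1.4 The $n \times n$ symmetric matrix $D$ is positive semidefinite (definite) if $x^T D x \geq 0$ ($> 0$) for all $x \in E^n$ ($x \neq 0 \in E^n$). If $D$ is positive definite, then it is also positive semidefinite. Similarly, reversing the inequality signs provides the definitions of negative semidefinite and negative definite.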
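Two of the statements above lend themselves to a quick numerical check. The sketch below is illustrative only and is not taken from the notes. First it compares the difference quotient from the directional-derivative remark with $\nabla f(x^0)^T y$ for Example 1-2. Then it tests Definition 1.1.4 using a Cholesky factorization, a standard technique swapped in here because a symmetric matrix is positive definite exactly when that factorization succeeds with strictly positive pivots. The helper names, the direction `y`, the step `t`, and the matrix `indefinite` are all assumptions made for the example.

```python
import math

def f(x):
    """Example 1-2: f(x) = x1^2 + x2^2."""
    return x[0] ** 2 + x[1] ** 2

def grad_f(x):
    """Analytic gradient of Example 1-2."""
    return [2 * x[0], 2 * x[1]]

def difference_quotient(func, x0, y, t):
    """[f(x0 + t*y) - f(x0)] / t for a small t > 0 (hypothetical helper)."""
    shifted = [xi + t * yi for xi, yi in zip(x0, y)]
    return (func(shifted) - func(x0)) / t

def is_positive_definite(D):
    """Attempt a Cholesky factorization D = L L^T of a symmetric matrix D.

    Returns False as soon as a pivot is not strictly positive. In floating
    point, nearly singular matrices may be misclassified.
    """
    n = len(D)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = D[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

x0 = [0.0, 1.0]
y = [0.6, 0.8]  # a unit direction, chosen arbitrarily
exact = sum(g * yi for g, yi in zip(grad_f(x0), y))  # gradient^T y
approx = difference_quotient(f, x0, y, 1e-6)

H = [[2.0, 0.0], [0.0, 2.0]]           # Hessian of Example 1-2
indefinite = [[1.0, 2.0], [2.0, 1.0]]  # eigenvalues 3 and -1

print("gradient^T y:", exact, "difference quotient:", approx)
print("H positive definite:", is_positive_definite(H))
print("indefinite matrix positive definite:", is_positive_definite(indefinite))
```

The difference quotient differs from $\nabla f(x^0)^T y$ by roughly $t$ here, which is the $\|y\|\,\beta$ term from the proof shrinking as $t \to 0^+$.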
Definition 1.1.5 A point $\bar{x} \in S \subseteq E^n$ is said to be a local maximum point of $f$ over $S$ if there exists an $\varepsilon > 0$ such that $f(x) \leq f(\bar{x})$ for all $x \in S$ within a distance $\varepsilon$ of $\bar{x}$. If $f(x) < f(\bar{x})$ for all $x \in S$, $x \neq \bar{x}$, within a distance $\varepsilon$ of $\bar{x}$, then $\bar{x}$ is said to be a strict local maximum point of $f$ over $S$.

Definition 1.1.6 A point $\bar{x} \in S \subseteq E^n$ is a global maximum point of $f$ over $S$ if $f(x) \leq f(\bar{x})$ for all $x \in S$. If $f(x) < f(\bar{x})$ for all $x \in S$, $x \neq \bar{x}$, then $\bar{x}$ is a strict global maximum point of $f$ over $S$.

Reversing the inequalities in Definitions 1.1.5 and 1.1.6 provides the corresponding definitions for minimum points.
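The difference between a local and a global maximum point can be seen on a small one-variable example. The sketch below is an illustration under stated assumptions, not an example from the notes: the function $f(x) = x^3 - 3x$, the set $S = [-3, 3]$, the grid spacing, and the helper names are all hypothetical. It checks the inequalities of Definitions 1.1.5 and 1.1.6 only at sampled points, so it suggests rather than proves the conclusions.

```python
def f(x):
    """Hypothetical one-variable example: f(x) = x^3 - 3x."""
    return x ** 3 - 3 * x

# Sample S = [-3, 3] on a uniform grid with spacing 0.001.
S = [-3 + k / 1000 for k in range(6001)]

def is_local_max_on_grid(func, x_bar, points, eps):
    """True if func(x) <= func(x_bar) at every sampled x within eps of x_bar."""
    return all(func(x) <= func(x_bar) for x in points if abs(x - x_bar) <= eps)

def is_global_max_on_grid(func, x_bar, points):
    """True if func(x) <= func(x_bar) at every sampled x in S."""
    return all(func(x) <= func(x_bar) for x in points)

# x = -1 satisfies the local condition (with eps = 0.5) but not the global one;
# the boundary point x = 3 satisfies the global condition on the grid.
local_at_minus_one = is_local_max_on_grid(f, -1.0, S, eps=0.5)
global_at_minus_one = is_global_max_on_grid(f, -1.0, S)
global_at_three = is_global_max_on_grid(f, 3.0, S)

print("x = -1 local max on grid:", local_at_minus_one)
print("x = -1 global max on grid:", global_at_minus_one)
print("x = 3 global max on grid:", global_at_three)
```

Swapping `<=` for `>=` in the two helpers gives the corresponding grid checks for minimum points.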