OR-ST-MA 706
Spring 2016 Version

A TEXT FOR NONLINEAR PROGRAMMING

Thomas W. Reiland
Statistics Department
North Carolina State University
Raleigh, NC 27695-8203
Office: (919) 515-1939
Email: reiland@ncsu.edu

Table of Contents

Chapter I: Optimality Conditions
§ 1.1 Differentiability
§ 1.2 Unconstrained Optimization
§ 1.3 Equality Constrained Optimization
§ 1.3.1 Interpretation of Lagrange Multipliers
§ 1.3.2 Second order Conditions – Equality Constraints
§ 1.3.3 The General Case
§ 1.4 Inequality Constrained Optimization
§ 1.5 Constraint Qualifications (CQ)
§ 1.6 Second-order Optimality Conditions
§ 1.7 Constraint Qualifications and Relationships Among Constraint Qualifications

Chapter II: Convexity
§ 2.1 Convex Sets
§ 2.2 Convex Functions
§ 2.3 Subgradients and differentiable convex functions
Chapter II: Convexity

§ 2.1 Convex Sets

Definition 2.1.1 A set $C \subset E^n$ is a convex set if $\lambda x_1 + (1-\lambda) x_2 \in C$ for all $x_1, x_2 \in C$ and all $\lambda \in [0,1]$.

Remark: Geometrically, this means that the straight line segment joining any two points in a convex set lies entirely in the set.

Definition 2.1.2 A point $x \in E^n$ is a convex combination of the points $x_1, x_2, \dots, x_k$ if $x$ can be put in the form
$$x = \sum_{i=1}^{k} \lambda_i x_i, \quad \text{where } \lambda_i \ge 0,\ i = 1, \dots, k, \text{ and } \sum_{i=1}^{k} \lambda_i = 1.$$

Theorem 2.1 For $C \subset E^n$ to be a convex set it is necessary and sufficient that every convex combination of points of $C$ belongs to $C$.

Proof: Necessity follows by induction on the number of points. Sufficiency follows directly from the definitions.

Theorem 2.2 If $\{A_i\}_{i \in I}$ is an arbitrary family of convex sets, then $\bigcap_{i \in I} A_i$ is convex.

Proof: Let $x_1, x_2 \in \bigcap_{i \in I} A_i$ and let $0 \le \lambda \le 1$. Then for each $i \in I$, $x_1, x_2 \in A_i$, and since $A_i$ is convex, $\lambda x_1 + (1-\lambda) x_2 \in A_i$. Hence $\lambda x_1 + (1-\lambda) x_2 \in \bigcap_{i \in I} A_i$. QED

Definition 2.1.3 Let $S$ be an arbitrary set in $E^n$. The convex hull of $S$, denoted $H(S)$, is the collection of all convex combinations of points in $S$; i.e. $x \in H(S)$ iff
$$x = \sum_{j=1}^{k} \lambda_j x_j, \quad \text{where } \sum_{j=1}^{k} \lambda_j = 1,\ \lambda_j \ge 0,\ x_1, x_2, \dots, x_k \in S,$$
and $k$ is a positive integer.

Example 2-1: Note the geometric interpretation of the convex hull, $H(S)$, for the three sets below. [Figure not reproduced.]

Theorem 2.3 Let $S$ be an arbitrary set in $E^n$. Then $H(S)$ is the smallest convex set containing $S$. Indeed, $H(S)$ is the intersection of all convex sets containing $S$.

Definition 2.1.4 A point $x$ in a convex set $C$ is called an extreme point if $x$ cannot be expressed as a convex combination of two other distinct points in $C$.

Example 2-2: Below are three convex sets with their respective extreme points identified. [Figure not reproduced.]

Definition 2.1.5 A hyperplane $H \subset E^n$ is a collection of points of the form $\{x : p^T x = \alpha\}$, where $p$ is a (fixed) nonzero vector in $E^n$ and $\alpha$ is a (fixed) scalar. A hyperplane $H$ defines two closed half-spaces, $H^+ = \{x : p^T x \ge \alpha\}$ and $H^- = \{x : p^T x \le \alpha\}$, and the two open half-spaces $\{x : p^T x > \alpha\}$ and $\{x : p^T x < \alpha\}$.

Note: In $E^1$, $H$ is a point. In $E^2$, $H$ is a line. In $E^3$, $H$ is a plane.
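Definitions 2.1.2 and 2.1.5 can be tried out numerically. The following is a minimal sketch, not part of the original notes: the helper names `convex_combination` and `hyperplane_side` and the sample data are hypothetical, chosen only for illustration. It forms a convex combination of three points lying on a hyperplane in $E^2$ and reports on which side of the hyperplane a given point lies.

```python
from typing import List, Sequence


def convex_combination(points: Sequence[Sequence[float]],
                       weights: Sequence[float],
                       tol: float = 1e-12) -> List[float]:
    """Return sum_i weights[i] * points[i], as in Definition 2.1.2."""
    if not points or len(points) != len(weights):
        raise ValueError("need one weight per point")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be nonnegative and sum to 1")
    dim = len(points[0])
    return [sum(w * pt[j] for w, pt in zip(weights, points)) for j in range(dim)]


def hyperplane_side(p: Sequence[float], alpha: float,
                    x: Sequence[float], tol: float = 1e-12) -> int:
    """Return 0 if p^T x = alpha, +1 if p^T x > alpha, -1 if p^T x < alpha."""
    value = sum(pi * xi for pi, xi in zip(p, x)) - alpha
    if abs(value) <= tol:
        return 0
    return 1 if value > 0 else -1


# Hypothetical data: the line x_1 + 2 x_2 = 4 in E^2 and three points on it.
p, alpha = [1.0, 2.0], 4.0
on_h = [[4.0, 0.0], [0.0, 2.0], [2.0, 1.0]]
x_mix = convex_combination(on_h, [0.5, 0.25, 0.25])

print(x_mix)                                   # [2.5, 0.75]
print(hyperplane_side(p, alpha, x_mix))        # 0: the combination is still on H
print(hyperplane_side(p, alpha, [0.0, 0.0]))   # -1: the origin lies in the open half-space p^T x < alpha
```

The weights are validated before use, so a call with weights that do not sum to 1 raises `ValueError` instead of silently returning a point outside the convex hull.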
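Example 2-3: [Figure not reproduced.]

Theorem 2.4 A hyperplane is a convex set.

Proof: Let $H = \{x : p^T x = \alpha\}$ and let $x_1, x_2 \in H$. For any point $\hat{x} = \lambda x_1 + (1-\lambda) x_2$, where $0 \le \lambda \le 1$, we have
$$p^T \hat{x} = p^T \lambda x_1 + p^T (1-\lambda) x_2 = \lambda p^T x_1 + (1-\lambda) p^T x_2 = \lambda \alpha + (1-\lambda) \alpha = \alpha.$$
So $\hat{x} \in H$. QED

Theorem 2.5 The closed half-spaces $H^+$ and $H^-$ are convex sets.

Remark: A hyperplane $H$ and the corresponding half-spaces can be written in reference to a fixed point, say $\bar{x} \in H$. Then $p^T \bar{x} = \alpha$, and hence any $x \in H$ must satisfy
$$p^T x - p^T \bar{x} = \alpha - \alpha = 0, \quad \text{i.e. } p^T (x - \bar{x}) = 0.$$
Accordingly, $H^+ = \{x : p^T (x - \bar{x}) \ge 0\}$ and $H^- = \{x : p^T (x - \bar{x}) \le 0\}$.

Definition 2.1.6 Boundary of a Set: Let $S \subset E^n$ be arbitrary; $\bar{x}$ is said to be in the boundary of $S$, denoted $\partial S$, if for every $\varepsilon > 0$, $N_\varepsilon(\bar{x})$ contains at least one point in $S$ and one point not in $S$, where
$$N_\varepsilon(\bar{x}) = \{y \in E^n : \|\bar{x} - y\| < \varepsilon\}.$$

Example 2-4: Boundary Points [Figure not reproduced.]

Theorem 2.6 Let $S \subset E^n$ be convex and nonempty and let $\bar{x} \in \partial S$. Then there exists a supporting hyperplane of $S$ at $\bar{x}$; i.e. there exists $p \ne 0$ such that $p^T (x - \bar{x}) \le 0$ for each $x \in \operatorname{cl} S$.

§ 2.2 Convex Functions

Definition 2.2.1 Epigraph/Hypograph: Let $f$ be a function defined on a subset $D \subset E^n$ with values in the extended reals. The epigraph of $f$, denoted epi $f$, is
$$\operatorname{epi} f = \{(x, \mu) : x \in D \subset E^n,\ \mu \in E^1,\ \mu \ge f(x)\}.$$
The hypograph of $f$, denoted hyp $f$, is
$$\operatorname{hyp} f = \{(x, \mu) : x \in D \subset E^n,\ \mu \in E^1,\ \mu \le f(x)\}.$$

Remark: For $f : D \subset E^n \to [-\infty, +\infty]$, epi $f$ and hyp $f$ are subsets of $E^{n+1}$.

Example 2-5: Graphical Depiction of Epigraph/Hypograph [Figure not reproduced.]

Definition 2.2.2 Convex Function: Let $f$ be defined on $D \subset E^n$ with values in the extended reals; $f$ is a convex function if epi $f$ is a convex set.

Example 2-6:
(1) $f \equiv +\infty$ in $E^n$: $\operatorname{epi} f = \emptyset$, and the empty set is convex.
(2) $f \equiv -\infty$ in $E^n$: $\operatorname{epi} f = E^{n+1}$, which is also convex.
(3) $f : D \to E^1$, finite on $D$, where $D$ is convex and epi $f$ is convex. Define
$$f_1(x) = \begin{cases} f(x), & x \in D \\ +\infty, & x \notin D. \end{cases}$$
Suppose a convex function $f$ is defined on a proper convex subset $D \subset E^n$. Then epi $f$ on $D$ equals epi $f_1$ defined on $E^n$. Since epi $f_1$ is convex, $f_1$ is convex. We can therefore always extend $f$ to be defined on all of $E^n$ and still be convex.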
Remark: As a result of the above example, we shall assume that, unless mentioned explicitly, a convex function is defined on all of $E^n$.

Definition 2.2.3 Effective Domain: Let $f : E^n \to$ extended reals; the effective domain of $f$, denoted $ED(f)$, is the set
$$ED(f) = \{x \in E^n : f(x) < +\infty\}.$$

Remark: The effective domain of $f$ is the projection of epi $f$ onto $E^n$.

Theorem 2.7 If $f : E^n \to$ extended reals is convex, then $ED(f)$ is a convex set in $E^n$.

Proof: Let $x_1, x_2 \in ED(f)$ and suppose $\lambda \in [0,1]$; we must show $\lambda x_1 + (1-\lambda) x_2 \in ED(f)$. By definition, $f$ is convex if and only if epi $f$ is convex; i.e. if $(x_1, \mu_1)$ and $(x_2, \mu_2) \in \operatorname{epi} f$, then
$$\lambda (x_1, \mu_1) + (1-\lambda)(x_2, \mu_2) = \big(\lambda x_1 + (1-\lambda) x_2,\ \lambda \mu_1 + (1-\lambda) \mu_2\big)$$
belongs to epi $f$. Now $x_1, x_2 \in ED(f)$ implies $f(x_1), f(x_2) < +\infty$; in particular, there exist finite $\mu_1$ and $\mu_2$ such that $f(x_1) \le \mu_1$ and $f(x_2) \le \mu_2$. Therefore $(x_1, \mu_1)$ and $(x_2, \mu_2)$ belong to epi $f$. Since epi $f$ is a convex set in $E^{n+1}$, for $\lambda \in [0,1]$ we have
$$\big(\lambda x_1 + (1-\lambda) x_2,\ \lambda \mu_1 + (1-\lambda) \mu_2\big) \in \operatorname{epi} f, \quad \text{i.e. } f(\lambda x_1 + (1-\lambda) x_2) \le \lambda \mu_1 + (1-\lambda) \mu_2.$$
Since $\mu_1$ and $\mu_2$ are finite, $\lambda \mu_1 + (1-\lambda) \mu_2 < +\infty$ and hence $f(\lambda x_1 + (1-\lambda) x_2) < +\infty$, i.e. $\lambda x_1 + (1-\lambda) x_2 \in ED(f)$. QED

Remark: The converse statement does not hold in general; i.e. if $ED(f)$ is convex, $f$ is not necessarily a convex function.

Example 2-7: $ED(f)$ is convex but $f$ is not convex; epi $f$ is not a convex set. [Figure not reproduced.]

Definition 2.2.4 Proper Convex Function: The function $f$ is a proper convex function if $f$ is convex, $f(x) > -\infty$ for all $x \in E^n$, and $ED(f) \ne \emptyset$ (i.e. $f$ is less than $+\infty$ somewhere). Convex functions that are not proper are called improper.

Remark: Hence proper convex functions are convex functions that nowhere have the value $-\infty$ and are not identically $+\infty$. Equivalently, a convex function is proper if its epigraph is nonempty and contains no vertical lines (infinitely long from $-\infty$ to $+\infty$).

Example 2-8: Improper Convex Function [Figure not reproduced.]

Theorem 2.8 Convex Function: Let $f$ be a function from $C$ to $(-\infty, +\infty]$, where $C$ is a convex set (e.g. $C = E^n$).
Then $f$ is convex if and only if
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y), \quad 0 \le \lambda \le 1, \tag{1}$$
for every $x$ and $y$ in $C$.

Remarks:

1. The inequality in (1) is the "classical" definition of a convex function where $f$ nowhere has the value $-\infty$. The classical definition expresses the usual "chord above the curve" property.

Two technical approaches are possible when discussing convex functions. One can limit attention to functions $f : D \to E^1$ which are nowhere $+\infty$, so that $D$ would always coincide with $ED(f)$ (but would vary with $f$). Alternatively, one can limit attention to functions given on all of $E^n$, since a convex function $f$ on $D$ can always be extended to a convex function on all of $E^n$ by setting $f(x) = +\infty$ for $x \notin D$ (see Example 2-6).

We will take the second approach. Thus, by a "convex function" we always mean a "convex function with possible infinite values defined throughout $E^n$," unless otherwise specified. This approach has the advantage that technical nuisances about effective domains can be suppressed almost entirely.

This approach leads to calculations involving $+\infty$ and $-\infty$. We adopt the following obvious rules:
$$\begin{aligned}
&\alpha + \infty = \infty + \alpha = \infty && \text{for } -\infty < \alpha \le \infty, \\
&\alpha - \infty = -\infty + \alpha = -\infty && \text{for } -\infty \le \alpha < \infty, \\
&\alpha \infty = \infty \alpha = \infty, \quad \alpha(-\infty) = (-\infty)\alpha = -\infty && \text{for } 0 < \alpha \le \infty, \\
&\alpha \infty = \infty \alpha = -\infty, \quad \alpha(-\infty) = (-\infty)\alpha = \infty && \text{for } -\infty \le \alpha < 0, \\
&0 \cdot \infty = \infty \cdot 0 = 0 = 0 \cdot (-\infty) = (-\infty) \cdot 0, \quad -(-\infty) = \infty, \\
&\sup \emptyset = -\infty, \quad \inf \emptyset = +\infty.
\end{aligned}$$
The combinations $\infty - \infty$ and $-\infty + \infty$ are undefined and avoided.

2. Proper convex functions (see Definition 2.2.4) are the real object of study (though improper functions do arise from proper ones in many natural situations, and it is more convenient to include them than to exclude them with added restrictions). Note that a proper convex function on $E^n$ is a function obtained by taking a finite convex function $f$ on a nonempty convex set $C$ and then extending it to all of $E^n$ by setting $f(x) = +\infty$ for $x \notin C$.

Definition 2.2.5 An extended real-valued function $f$ defined on $D \subset E^n$ is a concave function if $-f$ is convex.

Remark: A function $f$ is concave if hyp $f$ is a convex set in $E^{n+1}$.
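Inequality (1) of Theorem 2.8 and the convention $0 \cdot \infty = 0$ can be checked on sample points. The sketch below is only a numerical spot check under stated assumptions, not a proof of convexity: the helper names `ext_mul`, `satisfies_inequality`, and `looks_convex`, as well as the two test functions, are hypothetical. Note that ordinary floating-point arithmetic returns NaN for `0 * inf`, so the convention has to be coded explicitly.

```python
import math


def ext_mul(lam: float, value: float) -> float:
    """lam * value using the convention 0 * (+/-inf) = 0 adopted in the notes."""
    return 0.0 if lam == 0 else lam * value


def satisfies_inequality(f, x: float, y: float, lam: float, tol: float = 1e-12) -> bool:
    """Check f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y) for scalars x, y."""
    lhs = f(lam * x + (1 - lam) * y)
    rhs = ext_mul(lam, f(x)) + ext_mul(1 - lam, f(y))
    return lhs <= rhs + tol


def looks_convex(f, points, steps: int = 8) -> bool:
    """Test inequality (1) for all sample pairs and lam = 0, 1/steps, ..., 1."""
    lams = [k / steps for k in range(steps + 1)]
    return all(satisfies_inequality(f, x, y, lam)
               for x in points for y in points for lam in lams)


def f_ext(x: float) -> float:
    """x^2 on [-1, 1], extended by +inf elsewhere (values in (-inf, +inf])."""
    return x * x if -1.0 <= x <= 1.0 else math.inf


def g(x: float) -> float:
    """-x^2, a concave function, used as a negative example."""
    return -x * x


points = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

print(math.isnan(0 * math.inf))   # True: plain float arithmetic does not follow the convention
print(ext_mul(0, math.inf))       # 0.0
print(looks_convex(f_ext, points))
print(looks_convex(g, points))
```

The sample points and step count are multiples of powers of two, so every intermediate value is exact in binary floating point and the boundary of the effective domain is not affected by rounding.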
Definition 2.2.6 A real-valued function $f$ defined on a convex subset $C \subset E^n$ is strictly convex if for any $x^1 \in C$, $x^2 \in C$, $x^1 \ne x^2$, and $\lambda \in (0,1)$ we have
$$f(\lambda x^1 + (1-\lambda) x^2) < \lambda f(x^1) + (1-\lambda) f(x^2).$$

Theorem 2.9 Let $f$ be a function from $E^n$ to $(-\infty, +\infty]$. Then $f$ is convex if and only if
$$f(\lambda_1 x^1 + \cdots + \lambda_m x^m) \le \lambda_1 f(x^1) + \cdots + \lambda_m f(x^m)$$
whenever $\lambda_i \ge 0$, $i = 1, \dots, m$, and $\sum_{i=1}^{m} \lambda_i = 1$.

Operations on convex functions to form other convex functions:

Theorem 2.10 Let $f$ be a convex function and let $\lambda$ be a nonnegative number. Then $\lambda f$ is also a convex function. Let $f$ and $g$ be convex functions. Then $f + g$ is also convex, provided that the undefined operation $(\infty - \infty)$ is avoided.

Corollary 2.11 Under the hypotheses of Theorem 2.10, every linear combination $\lambda_1 f_1 + \cdots + \lambda_k f_k$ of convex functions $f_1, \dots, f_k$ with $\lambda_i \ge 0$, $i = 1, \dots, k$, is also a convex function.

Definition 2.2.7 A function $\varphi : E^1 \to (-\infty, +\infty]$ is nondecreasing if for every $x^1 \le x^2$ we have $\varphi(x^1) \le \varphi(x^2)$.

Theorem 2.12 Let $f$ be a convex function from $E^n$ to $(-\infty, +\infty]$, and let $\varphi$ be a nondecreasing convex function from $E^1$ to $(-\infty, +\infty]$ with $\varphi(+\infty) = +\infty$. Then $g(x) = \varphi(f(x))$ is convex on $E^n$.

Proof: $f$ is convex if and only if $f(\lambda x^1 + (1-\lambda) x^2) \le \lambda f(x^1) + (1-\lambda) f(x^2)$. Since $\varphi$ is nondecreasing,
$$\varphi\big(f(\lambda x^1 + (1-\lambda) x^2)\big) \le \varphi\big(\lambda f(x^1) + (1-\lambda) f(x^2)\big),$$
and since $\varphi$ is convex,
$$\varphi\big(\lambda f(x^1) + (1-\lambda) f(x^2)\big) \le \lambda \varphi(f(x^1)) + (1-\lambda) \varphi(f(x^2)).$$
QED

Example 2-9: Let $f(x) = x^2$ and $\varphi(x) = -x$. Then $\varphi(f(x)) = -x^2$, which is not convex even though $f$ and $\varphi$ are both convex. The assumption that $\varphi$ is nondecreasing is needed.

(1) $g(x) = e^{f(x)}$ is a proper convex function on $E^n$ if $f(x)$ is proper convex on $E^n$, since $e^x$ is nondecreasing and convex.

(2) $g(x) = f(x)^p$ is convex for $p \ge 1$ when $f$ is convex and nonnegative. Take
$$\varphi(\xi) = \begin{cases} \xi^p, & \xi \ge 0 \\ 0, & \xi < 0. \end{cases}$$
In particular, $g(x) = \|x\|^p$ is convex on $E^n$ for $p \ge 1$. When $p = 1$, you get Euclidean distance.

Theorem 2.13 Let $f_i$, $i \in I$, be a finite or infinite collection of convex functions on $E^n$. For every $x \in E^n$ define the pointwise supremum of this collection as $f(x) = \sup_{i \in I} f_i(x)$. Then $f$ is a convex function.

Example 2-10: (1) Let $x = (\xi_1, \xi_2, \dots, \xi_n) \in E^n$; then $f(x) = \max_{1 \le j \le n} \xi_j$ is convex.
Let $f_i(x) = x^T e_i$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0)^T$ has a 1 in the $i$th position. Then $f(x) = \sup_{1 \le i \le n} f_i(x)$ is a convex function.

(2) Let $K(x) = \max_{1 \le j \le n} |\xi_j|$ (Chebyshev norm). Let $f_i(x) = |\xi_i|$, which is convex. Then $K(x) = \sup_{1 \le i \le n} f_i(x)$ is convex.

Theorem 2.14 The function $f$ is convex on $E^n$ if and only if for every $x^1, x^2 \in E^n$, the function $\varphi$ defined by $\varphi(\lambda) = f(\lambda x^1 + (1-\lambda) x^2)$ is convex for $\lambda \in [0,1]$.

Remark: This theorem states that a function is convex on a convex set $C$ if and only if the restriction of $f$ to each line segment in $C$ is a convex function. Note that $f$ is defined on $E^n$ while $\varphi$ is defined on $E^1$.

Example 2-11: Continuity of Convex Functions
$$f(x) = \begin{cases} x^2 - 1, & x < 1 \\ 2, & x = 1 \\ +\infty, & x > 1 \end{cases}$$
This is a convex function since epi $f$ is convex. The effective domain is $ED(f) = (-\infty, 1]$.

Remark: Roughly speaking, discontinuities in convex functions can occur only at some boundary points of their effective domain. Some of these discontinuities can be removed, as will be shown below.

Theorem 2.15 A convex function $f$ on $E^n$ is continuous on the interior of its effective domain.

Remark: This generalizes the fact that a real-valued convex function $f : C \to E^1$ on an open convex set $C$ is continuous; in particular, a convex function $f : E^n \to E^1$ is continuous.

Definition 2.2.8 Let $f$ be a convex function defined on $E^n$. Define the support set, denoted $L(f)$, as follows:
$$L(f) = \{(a, b) : a \in E^n,\ b \in E^1,\ a^T x - b \le f(x)\ \ \forall x \in E^n\}.$$

Example 2-12: [Figure not reproduced.]

Definition 2.2.9 Let $f$ be a convex function defined on $E^n$. Then the closure of $f$, denoted cl $f$, is
$$\operatorname{cl} f(x) = \sup_{(a,b) \in L(f)} \{a^T x - b\}.$$
A convex function $f$ is said to be closed if $f = \operatorname{cl} f$.

Example 2-13: [Figure not reproduced.]

Recall the definition of a proper convex function (Definition 2.2.4).

Theorem 2.16 A proper convex function $f$ on $E^n$ is closed if and only if the convex level set $\{x : f(x) \le \alpha\}$ is closed for every real number $\alpha$.

Remarks:

1. The closure operation for proper convex functions is directly related to the closure operation for sets.
Specifically, the epigraph of cl $f$ is the closure of the epigraph of $f$, i.e. $\operatorname{epi}(\operatorname{cl} f) = \overline{\operatorname{epi} f}$ ($\overline{C}$ denotes the closure of $C$). This last relation can also be used as the definition of the closure operation for proper convex functions.

2. For improper convex functions, the closure operation has the following meaning:
(i) if $f(x) = +\infty$ for every $x \in E^n$, then $\operatorname{cl} f = f$;
(ii) if $f(x) = -\infty$ for some $x \in E^n$, then by definition $L(f) = \emptyset$; since by definition the supremum over the empty set is $-\infty$ (see remarks for Theorem 2.8), we get $\operatorname{cl} f(x) = -\infty$ for every $x \in E^n$.
Therefore, the only closed improper convex functions are those that are identically $+\infty$ or $-\infty$.

Example 2-14:
$$f(x) = \begin{cases} x^2 - 1, & x < 1 \\ 2, & x = 1 \\ +\infty, & x > 1 \end{cases} \qquad \operatorname{cl} f(x) = \begin{cases} x^2 - 1, & x \le 1 \\ +\infty, & x > 1 \end{cases}$$
The closure of epi $f$ equals epi(cl $f$).

Example 2-15: 0 x 0 f ( x) x 1 0 x 0 cl f x 1

Remark: The closure operation is a normalization which makes convex functions "more regular" by redefining their values at certain points where there are unnatural discontinuities.

§ 2.3 Subgradients and differentiable convex functions

Definition 2.3.1 Let $f$ be defined on $E^n$ with values in $[-\infty, +\infty]$. Let $x^0$ be a point where $f$ is finite and let $y \in E^n$. The right-sided derivative of $f$ at $x^0$ in the direction $y$ is
$$D^+ f(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + t y) - f(x^0)}{t},$$
if the limit (which can be $\pm\infty$) exists. The left-sided derivative of $f$ at $x^0$ in the direction $y$ is
$$D^- f(x^0; y) = \lim_{t \to 0^-} \frac{f(x^0 + t y) - f(x^0)}{t}.$$
For $y = 0$, $D^+ f(x^0; y) = D^- f(x^0; y) = 0$. The limits in Definition 2.3.1 exist for convex and concave functions.

Example 2-16: Let $f(x) = |x|$, $x^0 = 0$, $y = 1$. Then
$$D^+ f(x^0; y) = \lim_{t \to 0^+} \frac{|t| - 0}{t} = 1, \qquad D^- f(x^0; y) = \lim_{t \to 0^-} \frac{|t| - 0}{t} = -1.$$

Theorem 2.17 Let $f$ be a convex function on $E^n$ and let $x$ be such that $f(x)$ is finite. Then the right- and left-sided derivatives of $f$ at $x$ exist in every direction $y$.
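The one-sided derivatives of Definition 2.3.1 can be approximated by a difference quotient with a small step. The following is a minimal sketch under stated assumptions: the function name `one_sided_derivative`, the step size `t = 1e-6`, and the smooth test function are hypothetical choices for illustration, and a finite step only approximates the limit.

```python
def one_sided_derivative(f, x0, y, side=+1, t=1e-6):
    """Difference quotient (f(x0 + s*y) - f(x0)) / s with s = side * t.

    side=+1 approximates the right-sided derivative D^+ f(x0; y),
    side=-1 approximates the left-sided derivative D^- f(x0; y).
    """
    s = side * t
    shifted = [a + s * b for a, b in zip(x0, y)]
    return (f(shifted) - f(x0)) / s


def abs_1d(x):
    """f(x) = |x| on E^1, the function of Example 2-16."""
    return abs(x[0])


def sq_norm(x):
    """f(x) = x_1^2 + x_2^2, a differentiable convex function on E^2."""
    return x[0] ** 2 + x[1] ** 2


# Example 2-16: at x0 = 0 in direction y = 1 the two one-sided derivatives differ.
right = one_sided_derivative(abs_1d, [0.0], [1.0], side=+1)
left = one_sided_derivative(abs_1d, [0.0], [1.0], side=-1)
print(right, left)   # 1.0 -1.0

# For a differentiable function both sides agree (up to the step size)
# with y^T grad f(x0); here grad f(1, 2) = (2, 4) and y = (1, 0).
r2 = one_sided_derivative(sq_norm, [1.0, 2.0], [1.0, 0.0], side=+1)
l2 = one_sided_derivative(sq_norm, [1.0, 2.0], [1.0, 0.0], side=-1)
print(r2, l2)
```

For $f(x) = |x|$ the quotient $|t|/t$ equals $\pm 1$ exactly for every nonzero $t$, so the step size does not matter in that case; for the smooth function the two quotients differ from 2 by roughly the step size.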
Remarks: If for some $x^0$ and $y$, $D^+ f(x^0; y) = D^- f(x^0; y)$, this common value is called the directional derivative of $f$ at $x^0$ in the direction $y$ and is denoted $Df(x^0; y)$:
$$Df(x^0; y) = \lim_{t \to 0} \frac{f(x^0 + t y) - f(x^0)}{t}.$$
If $y = (1, 0, \dots, 0)^T$, then $Df(x^0; y) = \dfrac{\partial f(x^0)}{\partial x_1}$. These partial derivatives may exist even if $f$ is not differentiable. If $f$ is differentiable, then $Df(x^0; y) = y^T \nabla f(x^0)$.

Since epi $f$ (hyp $f$) of a convex (concave) function $f$ is a convex set, it has a supporting hyperplane at its boundary points (see Theorem 2.6). This fact motivates the following definition.

Definition 2.3.2 A subgradient of a convex function $f$ at a point $\bar{x} \in E^n$ is a vector $\xi \in E^n$ satisfying
$$f(x) \ge f(\bar{x}) + \xi^T (x - \bar{x}) \quad \forall x \in E^n.$$
Similarly, a subgradient of a concave function $f$ at a point $\bar{x} \in E^n$ is a vector $\xi \in E^n$ satisfying
$$f(x) \le f(\bar{x}) + \xi^T (x - \bar{x}) \quad \forall x \in E^n.$$

Example 2-17: Geometric Interpretation of a Subgradient <FIX: Insert graphic p. 19 of handout>

Remarks:

1. The graph of the function $f(\bar{x}) + \xi^T (x - \bar{x})$ [a function of $x$, since $\bar{x}$ and $\xi$ are fixed] is a non-vertical supporting hyperplane of epi $f$ or hyp $f$ at the point $(\bar{x}, f(\bar{x}))$ when $f$ is finite at $\bar{x}$; $\xi$ corresponds to the slope of the hyperplane.

2. In general, at a point $\bar{x} \in E^n$, there are 3 possibilities:
(i) no vector $\xi$ satisfying the subgradient inequality exists;
(ii) there exists a unique $\xi$ satisfying it;
(iii) there exists more than one such $\xi$ (an infinite number).

Definition 2.3.3 The set of subgradients of the convex (concave) function $f$ at $\bar{x}$ is called the subdifferential of $f$ at $\bar{x}$ and is denoted $\partial f(\bar{x})$. If $\partial f(\bar{x})$ is nonempty, then $f$ is subdifferentiable at $\bar{x}$.

Example 2-18:

(i) Consider the point depicted. The subdifferential at this point is the empty set. [Figure not reproduced.]

(ii) $f(x) = x^2$. The subgradient inequality $f(x) \ge f(\bar{x}) + \xi^T (x - \bar{x})$ reads
$$x^2 \ge \bar{x}^2 + \xi (x - \bar{x}), \quad \text{i.e. } x^2 - \bar{x}^2 \ge \xi (x - \bar{x}) \quad \forall x \in E^1.$$
Dividing by $x - \bar{x}$ gives
$$x + \bar{x} \ge \xi \ \text{ for } x > \bar{x}, \qquad x + \bar{x} \le \xi \ \text{ for } x < \bar{x}.$$
Letting $x \to \bar{x}$ from each side, we get $\xi = 2\bar{x}$. Therefore $\partial f(\bar{x}) = \{2\bar{x}\}$.

(iii) $f(x) = \|x\|$, the Euclidean norm in $E^n$.
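At $\bar{x} = 0$, the subgradient inequality $f(x) \ge f(\bar{x}) + \xi^T (x - \bar{x})$ becomes
$$\|x\| \ge \xi^T x \quad \forall x \in E^n,$$
so
$$\partial f(\bar{x}) = \partial f(0) = \text{unit ball in } E^n = \{\xi \in E^n : \|\xi\| \le 1\}.$$
<FIX: insert graphic p. 20 of handout>

Example 2-19: $f(x) = \min\{f_1(x), f_2(x)\}$, where $f_1(x) = 4 - x$ and $f_2(x) = 4 - (x - 2)^2$, $x \in E^1$. Then
$$f(x) = \begin{cases} 4 - x, & 1 \le x \le 4 \\ 4 - (x - 2)^2, & \text{otherwise.} \end{cases}$$
This function is concave; its hypograph is a convex set.

For $x \in (1, 4)$, $\xi = -1$ is a subgradient. For $x < 1$ or $x > 4$, $\xi = -2(x - 2)$ is a subgradient. Now examine the corner points $x = 1$ and $x = 4$, where $\xi$ is not unique.

At $x = 1$, the subgradients are characterized by $\xi = \lambda \nabla f_1(1) + (1 - \lambda) \nabla f_2(1)$ for $\lambda \in [0,1]$:
$$\xi = \lambda(-1) + (1 - \lambda)(2) = 2 - 3\lambda, \quad \lambda \in [0,1], \quad \text{so } \xi \in [-1, 2].$$
At $x = 4$, the subgradients are characterized by $\xi = \lambda \nabla f_1(4) + (1 - \lambda) \nabla f_2(4)$ for $\lambda \in [0,1]$:
$$\xi = \lambda(-1) + (1 - \lambda)(-4) = 3\lambda - 4, \quad \lambda \in [0,1], \quad \text{so } \xi \in [-4, -1].$$
<FIX: insert graphic from p. 20 of handout>

Theorem 2.18 Let $f$ be a convex function on $E^n$; then for any $\bar{x} \in E^n$, $\partial f(\bar{x})$ is a closed convex set.

Subgradients can be characterized in terms of directional derivatives.

Theorem 2.19 A vector $\xi \in E^n$ is a subgradient of a convex function $f$ at a point $\bar{x} \in E^n$ where $f$ is finite if and only if $D^+ f(\bar{x}; y) \ge \xi^T y$ for every direction $y \in E^n$.

Example 2-20:

(1) $f(x) = x^2$. Previously we verified that $\partial f(\bar{x}) = \{2\bar{x}\}$. The above theorem says (trivially) that
$$D^+ f(\bar{x}; y) = \nabla f(\bar{x})^T y \ge 2\bar{x} y \quad \forall y \in E^1, \quad \text{i.e. } 2\bar{x} y \ge 2\bar{x} y \quad \forall y \in E^1.$$

(2) $f(x) = \|x\|$, $x \in E^n$. A vector $\xi \in E^n$ is a subgradient of $f$ at $\bar{x} = 0$ if and only if
$$D^+ f(0; y) \ge \xi^T y \quad \forall y \in E^n,$$
i.e. iff
$$\lim_{t \to 0^+} \frac{\|t y\|}{t} \ge \xi^T y \quad \forall y \in E^n,$$
i.e. iff
$$\|y\| \ge \xi^T y \quad \forall y \in E^n,$$
which is true if and only if $\xi$ belongs to the unit ball in $E^n$; i.e. $\partial f(0)$ is the unit ball in $E^n$, which we already established.

Theorem 2.20 Let $f$ be a convex function on $E^n$ and suppose $f(\bar{x})$ is finite. Then
$$f(x) - f(\bar{x}) \ge D^+ f(\bar{x}; x - \bar{x}) \quad \forall x \in E^n.$$
In particular, if $f$ is differentiable at $\bar{x}$, then
$$f(x) \ge f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) \quad \forall x \in E^n.$$
In other words, the linear approximation always underestimates the function.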
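Proof: For $0 < t \le 1$ and $x \in E^n$, convexity gives
$$f((1 - t)\bar{x} + t x) \le (1 - t) f(\bar{x}) + t f(x),$$
that is,
$$f(\bar{x} + t(x - \bar{x})) \le f(\bar{x}) - t f(\bar{x}) + t f(x),$$
$$f(\bar{x} + t(x - \bar{x})) - f(\bar{x}) \le t\,[f(x) - f(\bar{x})],$$
$$\frac{f(\bar{x} + t(x - \bar{x})) - f(\bar{x})}{t} \le f(x) - f(\bar{x}).$$
Letting $t \to 0^+$,
$$D^+ f(\bar{x}; x - \bar{x}) = \lim_{t \to 0^+} \frac{f(\bar{x} + t(x - \bar{x})) - f(\bar{x})}{t} \le f(x) - f(\bar{x}),$$
so $f(x) - f(\bar{x}) \ge D^+ f(\bar{x}; x - \bar{x})$. QED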
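The differentiable case of Theorem 2.20 can be spot-checked on a grid of sample points. The sketch below is an illustration under stated assumptions, not part of the original notes: the quadratic $f(x) = x_1^2 + 2x_2^2$, its hand-coded gradient, the base point, the grid, and the helper names `linearization` and `underestimates_everywhere` are all hypothetical. It also shows that the inequality fails for the concave function $-f$.

```python
from itertools import product


def f(x):
    """A convex quadratic on E^2: f(x) = x_1^2 + 2 x_2^2."""
    return x[0] ** 2 + 2.0 * x[1] ** 2


def grad_f(x):
    """Gradient of f, computed by hand: (2 x_1, 4 x_2)."""
    return [2.0 * x[0], 4.0 * x[1]]


def linearization(func, grad, xbar, x):
    """First-order approximation func(xbar) + grad(xbar)^T (x - xbar)."""
    g = grad(xbar)
    return func(xbar) + sum(gi * (xi - bi) for gi, xi, bi in zip(g, x, xbar))


def underestimates_everywhere(func, grad, xbar, samples, tol=1e-12):
    """True if func(x) >= linearization at xbar for every sample point x."""
    return all(func(x) >= linearization(func, grad, xbar, x) - tol for x in samples)


def h(x):
    """-f, a concave function used as a negative example."""
    return -f(x)


def grad_h(x):
    return [-gi for gi in grad_f(x)]


grid = [list(pt) for pt in product([-2.0, -1.0, 0.0, 0.5, 1.0, 3.0], repeat=2)]
xbar = [1.0, -1.0]

print(underestimates_everywhere(f, grad_f, xbar, grid))   # True
print(underestimates_everywhere(h, grad_h, xbar, grid))   # False
```

For this quadratic the gap $f(x) - f(\bar{x}) - \nabla f(\bar{x})^T (x - \bar{x})$ equals $(x_1 - \bar{x}_1)^2 + 2(x_2 - \bar{x}_2)^2$, which is nonnegative and vanishes only at $x = \bar{x}$, consistent with the printed results.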