1.5 Constraint Qualifications

Spring 2016 Version
A TEXT FOR NONLINEAR PROGRAMMING
Thomas W. Reiland
Statistics Department, North Carolina State University, Raleigh, NC 27695-8203
Office: (919) 515-1939
Email: reiland@ncsu.edu

Table of Contents
Chapter I: Optimality Conditions
§ 1.1 Differentiability
§ 1.2 Unconstrained Optimization
§ 1.3 Equality Constrained Optimization
§ 1.3.1 Interpretation of Lagrange Multipliers
§ 1.3.2 Second order Conditions – Equality Constraints
§ 1.3.3 The General Case
§ 1.4 Inequality Constrained Optimization
§ 1.5 Constraint Qualifications (CQ)
§ 1.6 Second-order Optimality Conditions
§ 1.7 Constraint Qualifications and Relationships Among Constraint Qualifications
Chapter II: Convexity
§ 2.1 Convex Sets
§ 2.2 Convex Functions
§ 2.3 Subgradients and differentiable convex functions

§ 1.5 Constraint Qualifications (CQ)

Consider $x^0 \in A \subset E^n$.
Intuitively, $h$ is considered a tangent vector of $A$ at $x^0$ if $h$ leads us from $x^0$ into $A$.

Definition 1.5.1 Let $x^0 \in A \subset E^n$ and $z \in E^n$; if there exists $\delta > 0$ such that $x^0 + \theta z \in A$ for all $0 \le \theta \le \delta$, then $z$ is a linear tangent vector of $A$ at $x^0$.

This definition is too restrictive for our purposes. For example, a circle has no nonzero linear tangent vectors. However, a circle does have "curvilinear tangent vectors."

Definition 1.5.2 A curvilinear tangent vector of $A$ at $x^0$ is a vector $h$ that is the derivative, $\frac{dx(0)}{dt} = h$, at $t = 0$ of a continuous arc $x(t)$, $0 \le t \le \delta$, in $A$ having $x(0) = x^0$.

Remark: (1) Every linear tangent vector is a curvilinear tangent vector. (2) $\frac{dx(0)}{dt} = h$ implies more than that $h$ is tangent to the arc at $x(0) = x^0$.

Generalized Curvilinear Tangent Vector

Example 1-31: Let $A$ be defined by $g_1(x) = x_2^2 - x_1^3 \ge 0$, $g_2(x) = -x_2 - x_3 \ge 0$.

(i) $A = \{(x_1, x_2) \in E^2 : x_2 \ge x_1^3\}$. The curvilinear tangent vectors at $x^0 = (0, 0)^T$ are $\{(x_1, x_2) : x_2 \ge 0\}$. For example, $x(t) = (t, t^3)^T$ gives $\left.\frac{dx(t)}{dt}\right|_{t=0} = (1, 0)^T$.

(ii) $A = \{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge x_1^2\} \cup \{(x_1, x_2) : x_1 \le 0,\ x_2 \ge -x_1\}$. The curvilinear tangent vectors at $x^0 = (0, 0)^T$ are $\{(x_1, x_2) : x_2 \ge \max(0, -x_1)\}$.
(1) e.g. $x(t) = (t, t^2)^T$ is on the curve $x_2 = x_1^2$; $\left.\frac{dx(t)}{dt}\right|_{t=0} = (1, 0)^T$.
(2) e.g. $x(t) = (-t, t)^T$ is on the curve $x_2 = -x_1$; $\left.\frac{dx(t)}{dt}\right|_{t=0} = (-1, 1)^T$.

(iii) $A = \{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge x_1^2\} \cup \{(x_1, x_2) : x_1 \le 0,\ -\tfrac{1}{3}x_1 \le x_2 \le -\tfrac{2}{3}x_1\}$. The curvilinear tangent vectors at $x^0 = (0, 0)^T$ are $\{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge 0\} \cup \{(x_1, x_2) : x_1 \le 0,\ -\tfrac{1}{3}x_1 \le x_2 \le -\tfrac{2}{3}x_1\}$.

To illustrate Remark (2), let $x^0 = (0, 0)^T$ and let the arc $x(t)$ be the curve $x_1^3 = x_2$ extending from $(0, 0)^T$ into the third quadrant. The curve is tangent to $h = (1, 0)^T$ at $x^0$, but $\frac{dx(0)}{dt} = \lim_{y \to 0^+} \frac{x(y) - x(0)}{y} \ne h$; i.e.
$x(t) = (-t, -t^3)^T$, $0 \le t \le \delta$, so

$$\lim_{y \to 0^+} \frac{1}{y}\begin{pmatrix} -y \\ -y^3 \end{pmatrix} = \lim_{y \to 0^+} \begin{pmatrix} -1 \\ -y^2 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \ne h.$$

Thus $h$ is not a curvilinear tangent vector, since it does not point back into $A$ (see figure below).

This concept is also too restrictive; we can generalize the idea by replacing a curve with a directionally convergent sequence of points.

Definition 1.5.3 Let $w \in E^n$ be a unit vector and $x^0 \in A \subset E^n$; $w$ is a sequential tangent vector of $A$ at $x^0$ if there exists a sequence $\{x^k\} \subset A$ such that $x^k \ne x^0$, $x^k \to x^0$, and

$$\lim_{k \to \infty} \frac{x^k - x^0}{\|x^k - x^0\|} = w.$$

In other words, the sequence $\{x^k\}$ converges to $x^0$ in the direction $w$.

Example 1-32: $A \subset E^2$ contains the unit disk; choose $x^0 = (0, 0)^T$. Let $w = (1, 0)^T$; $w$ is a sequential tangent vector of $A$ at $x^0$ since $x^k = (1/k, 0)^T$ satisfies the definition: $\{x^k\} \to (0, 0)^T = x^0$, $x^k \ne x^0$, and

$$\lim_{k \to \infty} \frac{x^k}{\|x^k\|} = \lim_{k \to \infty} \frac{(1/k,\ 0)^T}{1/k} = (1, 0)^T = w.$$

Now let $w = (1/2, \sqrt{3}/2)^T$. Then $w$ is a sequential tangent vector of $A$ at $x^0 = (0, 0)^T$, since $\{x^k\} \subset A$ with $x^k = \left(\frac{1}{2k}, \frac{\sqrt{3}}{2k}\right)^T$ works:

$$\lim_{k \to \infty} \frac{x^k}{\|x^k\|} = \lim_{k \to \infty} \frac{\left(\frac{1}{2k},\ \frac{\sqrt{3}}{2k}\right)^T}{1/k} = \left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right)^T = w.$$

Definition 1.5.4 Let $x \in A \subset E^n$; the closed cone of tangents of $A$ at $x$, denoted $S(A, x)$, is the set of $z \in E^n$ such that there exist a sequence $\{x^k\} \subset A$ converging to $x$ and a sequence of nonnegative numbers (scalars) $\{\lambda^k\}$ such that the sequence $\{\lambda^k (x^k - x)\}$ converges to $z$.

Remark: $S(A, x)$ contains all the nonnegative multiples of the sequential tangent vectors $w$ in Definition 1.5.3. One can always let $\lambda^k = \frac{1}{\|x^k - x\|}$.

Remark: Every curvilinear tangent $h$ of $A$ at $x^0$ is in $S(A, x^0)$.

Proof: If $x(t)$, $0 \le t \le \delta$, is a continuous arc in $A$ having $x(0) = x^0$ and $\frac{dx(0)}{dt} = h$, then choose $\lambda^k = \frac{k}{\delta}$ and $x^k = x\!\left(\frac{\delta}{k}\right)$. We have $\{x^k\} \subset A$ since $\frac{\delta}{k} \le \delta$, $\lim_{k \to \infty} x^k = \lim_{k \to \infty} x\!\left(\frac{\delta}{k}\right) = x(0) = x^0$, and

$$\lim_{k \to \infty} \lambda^k (x^k - x^0) = \lim_{k \to \infty} \frac{x(\delta/k) - x(0)}{\delta/k} = \frac{dx(0)}{dt} = h.$$

QED.
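The limits in Definitions 1.5.3 and 1.5.4 can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the normalized difference for the second sequence of Example 1-32 and the scaled difference $\lambda^k (x^k - x^0)$ for an arc, at one large $k$. The helper names `normalized_direction` and `scaled_difference`, the choice $k = 1000$, $\delta = 1$, and the arc $x(t) = (t, t^3)^T$ are assumptions made for illustration.

```python
import math

def normalized_direction(xk, x0):
    """Return (x^k - x^0) / ||x^k - x^0||, the quotient in Definition 1.5.3."""
    diff = [a - b for a, b in zip(xk, x0)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def scaled_difference(lam, xk, x0):
    """Return lambda^k (x^k - x^0), the quantity in Definition 1.5.4."""
    return [lam * (a - b) for a, b in zip(xk, x0)]

x0 = [0.0, 0.0]
k = 1000  # illustrative index; larger k gets closer to the limit

# Example 1-32: x^k = (1/(2k), sqrt(3)/(2k)) approaches x^0 along w = (1/2, sqrt(3)/2).
xk = [1.0 / (2 * k), math.sqrt(3) / (2 * k)]
w = normalized_direction(xk, x0)

# Assumed arc x(t) = (t, t^3) with lambda^k = k/delta and x^k = x(delta/k),
# the construction used in the proof that curvilinear tangents lie in S(A, x^0).
delta = 1.0

def arc(t):
    return [t, t ** 3]

h_approx = scaled_difference(k / delta, arc(delta / k), x0)

print("direction of x^k:", w)
print("lambda^k (x^k - x^0):", h_approx)
```

For this arc the second component of `h_approx` shrinks like $1/k^2$, so the scaled differences approach $h = (1, 0)^T$, the derivative of the arc at $t = 0$.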
Geometric Interpretation of the closed cone of tangents $S(A, x)$

Translate $A$ by subtracting $x$ from each element of $A$: $A^* = A - x$. Let $\{x^k\}$ be a sequence in the translated set, $x^k \ne 0$, which converges to the origin. Construct a sequence of half-lines from the origin passing through $x^k$. These half-lines converge to a half-line that will be a member of $S(A, x)$. The union of all half-lines formed by taking all such sequences is $S(A, x)$.

Example 1-33: Let $A = \{(x_1, x_2) : (x_1 - 4)^2 + (x_2 - 2)^2 \le 1\}$ and $x = \left(4 - \tfrac{\sqrt{3}}{2}, \tfrac{3}{2}\right)^T$, which lies on the boundary of $A$. Translate $A$ by subtracting $x$ from each element in $A$:

$$A^* = \left\{(x_1, x_2) : \left(x_1 - \tfrac{\sqrt{3}}{2}\right)^2 + \left(x_2 - \tfrac{1}{2}\right)^2 \le 1\right\}.$$

By taking $\{x^k\}$'s on the boundary of $A^*$ converging to $(0, 0)^T$, we generate sequences of half-lines converging to a line that is the ordinary tangent line to the curve defined by the boundary of $A^*$ at the origin. The tangent line satisfies $\tfrac{\sqrt{3}}{2} x_1 + \tfrac{1}{2} x_2 = 0$. Repeating this process for all sequences in the interior of $A^*$ converging to $(0, 0)^T$, we get the following:

$$S(A, x) = \left\{(x_1, x_2) : \tfrac{\sqrt{3}}{2} x_1 + \tfrac{1}{2} x_2 \ge 0\right\}.$$

Remark: Let $g(x_1, x_2) = -(x_1 - 4)^2 - (x_2 - 2)^2 + 1 \ge 0$. Then

$$Z^1(x) = \{z \in E^2 : z^T \nabla g(x) \ge 0\} = \{z \in E^2 : \sqrt{3}\, z_1 + z_2 \ge 0\} = \left\{z \in E^2 : \tfrac{\sqrt{3}}{2} z_1 + \tfrac{1}{2} z_2 \ge 0\right\}.$$

Definition 1.5.5 Let $A \subset E^n$; the dual cone of $A$, denoted $A'$, is $A' = \{x \in E^n : x^T y \ge 0\ \ \forall y \in A\}$. From the definition, this is obviously a cone and can never be empty since $0 \in A'$.

Example 1-34: See examples (1) – (5) below with associated figures.
(1) Let $A \subset E^n$ be a subspace; then $A' = A^{\perp}$, i.e. the dual equals the orthocomplement of $A$. The dual cone is a generalization of orthogonality between subspaces. If $C$ is a nonempty convex cone, then $C \oplus (-C') = E^n$.
(2) $A = \{x \in E^2 : 0 \le x_2 \le x_1\}$; $A' = \{x \in E^2 : x_1 \ge 0,\ x_1 + x_2 \ge 0\}$.
(3) $A = \{x \in E^2 : x_2 \ge 0\}$; $A' = \{x \in E^2 : x_1 = 0,\ x_2 \ge 0\}$.
(4) $A = \{x \in E^2 : x_2 \ne -x_1\}$; $A' = \{0\}$.
(5) $A = E^n$; $A' = \{0\}$.

Lemma 1.21 Suppose that $x^0 \in X$. The set $Z^1(x^0) \cap Z^2(x^0)$ is empty if and only if $\nabla f(x^0) \in (Z^1(x^0))'$.

Proof: $Z^1(x^0) \cap Z^2(x^0) = \emptyset$ if and only if $z^T \nabla f(x^0) \ge 0$ for all $z \in Z^1(x^0)$. Thus, by definition, $\nabla f(x^0) \in (Z^1(x^0))'$.

Lemma 1.22 Suppose that $x^0$ is a local minimum point for Problem (P). Then $\nabla f(x^0) \in (S(X, x^0))'$.

Remark: Suppose $X = E^n$ (i.e. unconstrained problem) or $x^0 \in \operatorname{int}(X)$ (i.e. problem essentially unconstrained); then $S(X, x^0) = E^n$ and $(S(X, x^0))' = \{0\}$, so we are left with the usual unconstrained result $\nabla f(x^0) = 0$.

Proof: Let $z \in S(X, x^0)$. Then there exist a sequence $\{x^k\} \subset X$ converging to $x^0$ and a sequence of nonnegative numbers $\{\lambda^k\}$ such that $\lambda^k (x^k - x^0) \to z$. Since $f$ is differentiable at $x^0$,

$$f(x^k) = f(x^0) + (x^k - x^0)^T \nabla f(x^0) + \|x^k - x^0\|\, \beta(x^k; x^0)$$

(where $\beta$ is a function that converges to zero as $k$ approaches infinity), so

$$\lambda^k \left[f(x^k) - f(x^0)\right] = \left[\lambda^k (x^k - x^0)\right]^T \nabla f(x^0) + \|\lambda^k (x^k - x^0)\|\, \beta(x^k; x^0).$$

As $k \to \infty$, $\|\lambda^k (x^k - x^0)\|\, \beta(x^k; x^0) \to 0$, and since $x^k \in X$ and $x^0$ is a local minimum, $\lambda^k [f(x^k) - f(x^0)]$ converges to a nonnegative limit:

$$0 \le \lim_{k \to \infty} \lambda^k \left[f(x^k) - f(x^0)\right] = \lim_{k \to \infty} \left[\lambda^k (x^k - x^0)\right]^T \nabla f(x^0) = z^T \nabla f(x^0).$$

Therefore, $\nabla f(x^0) \in (S(X, x^0))'$. QED

Theorem 1.23 Generalized Karush-Kuhn-Tucker (K-K-T) Conditions
Let $x^*$ be a local minimum to Problem (P) and suppose that $(Z^1(x^*))' = (S(X, x^*))'$. Then there exist $\lambda^* \in E^m$ and $\mu^* \in E^p$ such that the following are true:
(i) $\nabla f(x^*) - \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) - \sum_{j=1}^{p} \mu_j^* \nabla h_j(x^*) = 0$
(ii) $\lambda_i^* g_i(x^*) = 0$, $i = 1, \dots, m$
(iii) $\lambda^* \ge 0$

Proof: Suppose $x^*$ is a local minimum to Problem (P). By Lemma 1.22, $\nabla f(x^*) \in (S(X, x^*))'$.
If $(Z^1(x^*))' = (S(X, x^*))'$, then $\nabla f(x^*) \in (Z^1(x^*))'$. By Lemma 1.21, $Z^1(x^*) \cap Z^2(x^*) = \emptyset$. By Theorem 1.18, (i) – (iii) hold. QED

Consider Problem (P'): Min $f(x)$, $x \in E^n$, s.t. $g_i(x) \ge 0$, $i = 1, \dots, m$; $h_j(x) = 0$, $j = 1, \dots, p$; $x \ge 0$.

Corollary 1.24 Let $x^*$ be a solution of (P') and suppose $(Z^1(x^*))' = (S(X, x^*))'$. Then there exist $\lambda^* \in E^m$ and $\mu^* \in E^p$ such that the following are true:

$$\nabla f(x^*) - \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) - \sum_{j=1}^{p} \mu_j^* \nabla h_j(x^*) \ge 0,$$
$$\lambda_i^* g_i(x^*) = 0,\ i = 1, \dots, m, \qquad \lambda^* \ge 0,$$

and

$$(x^*)^T \left[\nabla f(x^*) - \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) - \sum_{j=1}^{p} \mu_j^* \nabla h_j(x^*)\right] = 0.$$

Remark: It is possible to find points satisfying Karush-Kuhn-Tucker conditions (Theorem 1.23) that are not feasible.

Example 1-35: Min $f(x) = x_1$, $x \in E^2$, s.t.
$g_1(x) = 16 - (x_1 - 4)^2 - x_2^2 \ge 0$
$h_1(x) = (x_1 - 3)^2 + (x_2 - 2)^2 - 13 = 0$

Using the K-K-T conditions we locate three candidates:
$x^1 = (0, 0)^T$, $\lambda_1 = \tfrac{1}{8}$, $\mu_1 = 0$
$x^2 = (6.4, 3.2)^T$, $\lambda_1 = \tfrac{3}{40}$, $\mu_1 = \tfrac{1}{5}$
$x^3 = (3 + \sqrt{13}, 2)^T$, $\lambda_1 = 0$, $\mu_1 = \tfrac{\sqrt{13}}{26}$

Since the contours of $f$ are just vertical lines, we can see that $x^1$, $x^2$ are local minimum points and $x^3$ is a local maximum point (for Max $f(x) = x_1$).

$I(x^1) = I(x^2) = \{1\}$

$$Z^1(x^1) = \{z \in E^2 : \nabla g_1(x^1)^T z \ge 0,\ \nabla h_1(x^1)^T z = 0\} = \{z \in E^2 : (8\ \ 0)\, z \ge 0,\ (-6\ \ {-4})\, z = 0\} = \{z \in E^2 : z_1 \ge 0,\ z_2 = -\tfrac{3}{2} z_1\}$$

$$S(X, x^1) = \{z \in E^2 : z_1 \ge 0,\ z_2 = -\tfrac{3}{2} z_1\} = Z^1(x^1), \quad \text{so} \quad (Z^1(x^1))' = (S(X, x^1))'.$$

$$Z^2(x^1) = \{z \in E^2 : \nabla f(x^1)^T z < 0\} = \{z \in E^2 : (1\ \ 0)\, z < 0\} = \{z \in E^2 : z_1 < 0\}$$

So $Z^1(x^1) \cap Z^2(x^1) = \emptyset$.

At $x^2$, $Z^1(x^2) = \{z \in E^2 : z_1 \ge 0,\ z_2 = -\tfrac{17}{6} z_1\}$ and $Z^2(x^2) = \{z \in E^2 : z_1 < 0\}$.

Comments concerning Lemma 1.22: Suppose that $x^0$ is a solution to Problem (P). Then $\nabla f(x^0) \in (S(X, x^0))'$.

Many results concerning optimality conditions for nonlinear programming problems (e.g.
Karush-Kuhn-Tucker conditions, Lagrange multipliers) are special cases of Lemma 1.22. For special problems, the form of $(S(X, x^0))'$ must be determined. This will be developed below.

Lemma 1.25 Suppose $g : E^n \to E^m$ is differentiable at $x^0$, suppose there exists $z \in E^n$ such that $\nabla g(x^0) z > 0$, and let $B = \{x : g(x) \ge g(x^0)\}$. Then $S(B; x^0) = \{x \in E^n : \nabla g(x^0) x \ge 0\}$, and thus $(S(B; x^0))' = \{\nabla g(x^0)^T \lambda,\ \lambda \in E^m,\ \lambda \ge 0\}$.

Proof:
Part 1. Let $z \in E^n$ be such that $\nabla g(x^0) z > 0$; then for any scalar $\alpha > 0$, $\nabla g(x^0)(\alpha z) > 0$ and $g(x^0 + \theta(\alpha z)) \ge g(x^0)$ for all $\theta$ such that $0 \le \theta \le \delta$, for some $\delta > 0$. Define $x^k = x^0 + \frac{\delta}{k}(\alpha z)$; then $\{x^k\} \subset B$ and $x^k \to x^0$ as $k \to \infty$. Let $\lambda^k = \frac{k}{\delta}$; then $\lambda^k (x^k - x^0) = \alpha z$, thus $\lambda^k (x^k - x^0) \to \alpha z$. Then $\alpha z \in S(B, x^0)$ and $\{y : \nabla g(x^0) y > 0\} \subset S(B, x^0)$.

Now let $x$ be such that $\nabla g(x^0) x \ge 0$. Then

$$\nabla g(x^0)\left[\theta(\alpha z) + (1 - \theta) x\right] > 0 \quad \forall \theta \in (0, 1) \ \Rightarrow\ \theta(\alpha z) + (1 - \theta) x \in S(B; x^0) \quad \forall \theta \in (0, 1).$$

Letting $\theta \to 0$ and using the fact that $S(B; x^0)$ is closed, we have $x \in S(B; x^0)$. Thus far we have shown $\{x : \nabla g(x^0) x \ge 0\} \subset S(B; x^0)$.

Part 2. Let $x \in S(B, x^0)$; then there exist $\{x^k\} \subset B$ such that $x^k \to x^0$ and $\lambda^k \ge 0$ such that $\lambda^k (x^k - x^0) \to x$. Since $g$ is differentiable at $x^0$,

$$g(x^k) = g(x^0) + \nabla g(x^0)(x^k - x^0) + \|x^k - x^0\|\, \beta(x^k; x^0),$$

where $\beta(x^k; x^0) \to 0$ as $k \to \infty$. Since $x^k \in B$, $g(x^k) \ge g(x^0)$, and

$$0 \le \lim_{k \to \infty} \lambda^k \left[g(x^k) - g(x^0)\right] = \lim_{k \to \infty} \nabla g(x^0)\, \lambda^k (x^k - x^0) = \nabla g(x^0) x.$$

Hence $S(B; x^0) \subset \{x : \nabla g(x^0) x \ge 0\}$. QED

Remark: $\nabla g(x^0) z > 0$ says that the gradients $\nabla g_1(x^0), \dots, \nabla g_m(x^0)$ are in an open halfspace. In particular, this condition holds if the gradient vectors are linearly independent (i.e. if $\nabla g(x^0)$ has full rank). This follows from Lemma 1.19.

Consider Problem (P''): Min $f(x)$, $x \in E^n$, s.t.
$g_i(x) \ge 0$, $i = 1, \dots, m$

Notation:
$g = (g_1, \dots, g_m)^T$
$X = \{x : g(x) \ge 0\}$
$I = \{i : g_i(x^0) = 0\}$
$\nabla g_I(x^0)$ is $k \times n$: its rows are the gradients of the active constraints.

Lemma 1.22: If $x^0$ is a solution to Problem (P''), then $\nabla f(x^0) \in (S(X, x^0))'$.

Now $S(X, x^0) = S(\{x : g_I(x) \ge g_I(x^0) = 0\},\ x^0)$. By Lemma 1.25, if there exists $y$ such that $\nabla g_I(x^0) y > 0$, then

$$S(X, x^0) = \{x : \nabla g_I(x^0) x \ge 0\} \quad \text{and} \quad (S(X, x^0))' = \{\nabla g_I(x^0)^T \lambda,\ \lambda \ge 0\}.$$

So $\nabla f(x^0) \in (S(X, x^0))'$ implies $\nabla f(x^0) = \nabla g_I(x^0)^T \hat{\lambda}$ for some $\hat{\lambda} \ge 0$, $\hat{\lambda} \in E^k$. Define $\lambda_i^* = 0$, $i \notin I$. Then $\nabla f(x^0) - \nabla g(x^0)^T \lambda^* = 0$, $\lambda_i^* g_i(x^0) = 0$, and $\lambda^* \ge 0$ (which are just the K-K-T conditions).

Remark: The multiplier corresponding to $\nabla f(x^0)$ is positive since we are assuming there exists $y$ such that $\nabla g_I(x^0) y > 0$. If no such $y$ exists, then we have only the following:

$$S(X, x^0) \subset \{x : \nabla g_I(x^0) x \ge 0\}$$

(see Part 2 of the proof for Lemma 1.25). This implies

$$\{x : \nabla g_I(x^0) x \ge 0\}' \subset (S(X, x^0))', \quad \text{i.e.} \quad \{\nabla g_I(x^0)^T \lambda,\ \lambda \ge 0\} \subset (S(X, x^0))'.$$

If no such $y$ exists, then by Lemma 1.19 there exists a nonzero $\lambda^* \ge 0$ such that $\nabla g_I(x^0)^T \lambda^* = 0$, and one obtains the Fritz-John Conditions (Theorem 1.20): there exist $\lambda_0 \ge 0$, $\lambda^* \ge 0$, $(\lambda_0, \lambda^*) \ne 0$, such that $\lambda_0 \nabla f(x^0) - \nabla g_I(x^0)^T \lambda^* = 0$.

Consider the equality constrained problem: Min $f(x)$, $x \in E^n$, s.t. $h_j(x) = 0$, $j = 1, \dots, p$, where $f$ and the $h_j$'s are continuously differentiable.

$h = (h_1, \dots, h_p)^T$
$X = \{x \in E^n : h(x) = 0\}$

Lemma 1.26 Let $h : E^n \to E^p$ be continuously differentiable at $x^0$, and let $C = \{x \in E^n : h(x) = h(x^0)\}$. If $\nabla h(x^0)$ has full rank, then $S(C; x^0) = \{x : \nabla h(x^0) x = 0\}$ and hence $(S(C; x^0))' = \{\nabla h(x^0)^T \mu,\ \mu \in E^p\}$.

Thus, Lemma 1.22 says $\nabla f(x^0) \in (S(X; x^0))'$. This means that $\nabla f(x^0) = \nabla h(x^0)^T \mu^*$. In other words,

$$\nabla f(x^0) - \sum_{j=1}^{p} \mu_j^* \nabla h_j(x^0) = 0$$

(which is just Lagrange multipliers).
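The multiplier relation $\nabla f(x^0) = \nabla h(x^0)^T \mu^*$ can be checked numerically for a single equality constraint. The sketch below is an illustration only: the problem (Min $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$, with candidate $x^0 = (-1, -1)^T$) and the helper name `lagrange_multiplier` are assumptions, not taken from the notes. It computes the least-squares $\mu$ for the case $p = 1$ and reports the residual $\nabla f(x^0) - \mu \nabla h(x^0)$, which vanishes exactly when the relation holds.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def lagrange_multiplier(grad_f, grad_h):
    """Least-squares mu with grad_f ~ mu * grad_h, for one constraint (p = 1).

    Returns (mu, residual) where residual = grad_f - mu * grad_h.
    """
    denom = dot(grad_h, grad_h)
    if denom == 0.0:
        # grad h(x^0) = 0 means the full-rank assumption of Lemma 1.26 fails.
        raise ValueError("grad h(x0) is zero: full-rank assumption fails")
    mu = dot(grad_h, grad_f) / denom
    residual = [a - mu * b for a, b in zip(grad_f, grad_h)]
    return mu, residual

# Hypothetical problem: Min x1 + x2  s.t.  h(x) = x1^2 + x2^2 - 2 = 0.
x0 = (-1.0, -1.0)                      # assumed candidate point on the circle
grad_f = (1.0, 1.0)                    # gradient of f(x) = x1 + x2
grad_h = (2.0 * x0[0], 2.0 * x0[1])    # gradient of h at x0

mu, residual = lagrange_multiplier(grad_f, grad_h)
print("mu =", mu, "residual =", residual)
```

A nonzero residual would indicate that the candidate point does not satisfy the Lagrange condition; for $p > 1$ the same idea applies with a linear least-squares solve in place of the scalar quotient.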
Consider the general nonlinear programming problem: Min $f(x)$, $x \in E^n$, s.t. $g_i(x) \ge 0$, $i = 1, \dots, m$; $h_j(x) = 0$, $j = 1, \dots, p$, where $f$, the $g_i$'s, and the $h_j$'s are continuously differentiable.

$g = (g_1, \dots, g_m)^T$
$h = (h_1, \dots, h_p)^T$
$I = \{i : g_i(x^0) = 0\}$
$X = \{x \in E^n : g_i(x) \ge 0,\ i = 1, \dots, m;\ h_j(x) = 0,\ j = 1, \dots, p\}$

Lemma 1.27 Let $v_1 : E^n \to E^m$ and $v_2 : E^n \to E^p$ be differentiable and continuously differentiable, respectively, at $x^0$. If $\nabla v_2(x^0)$ is of full rank, if there exists $y$ such that $\nabla v_1(x^0) y > 0$ and $\nabla v_2(x^0) y = 0$, and if $C = \{x : v_1(x) \ge v_1(x^0),\ v_2(x) = v_2(x^0)\}$, then

$$S(C; x^0) = \{x \in E^n : \nabla v_1(x^0) x \ge 0,\ \nabla v_2(x^0) x = 0\}$$

and thus

$$(S(C; x^0))' = \{\nabla v_1(x^0)^T \lambda + \nabla v_2(x^0)^T \mu,\ \lambda \ge 0\}.$$

Now Lemma 1.22 says that if $x^0$ solves Problem (P), then $\nabla f(x^0) \in (S(X; x^0))'$. Note that $S(X, x^0) = S(\{x \in E^n : g_I(x) \ge g_I(x^0) = 0,\ h(x) = h(x^0) = 0\},\ x^0)$. By Lemma 1.27, if there exists $y \in E^n$ such that $\nabla g_I(x^0) y > 0$ and $\nabla h(x^0) y = 0$, and if $\nabla h(x^0)$ is of full rank, then

$$S(X, x^0) = \{x : \nabla g_I(x^0) x \ge 0,\ \nabla h(x^0) x = 0\} \quad \text{and} \quad (S(X, x^0))' = \{\nabla g_I(x^0)^T \lambda + \nabla h(x^0)^T \mu,\ \lambda \ge 0\}.$$

So $\nabla f(x^0) \in (S(X, x^0))'$ implies that there exist $\hat{\lambda} \in E^k$, $\hat{\lambda} \ge 0$, and $\mu^* \in E^p$ such that $\nabla f(x^0) = \nabla g_I(x^0)^T \hat{\lambda} + \nabla h(x^0)^T \mu^*$. Define $\lambda_i^* = 0$, $i \notin I$; then you get the following conditions (K-K-T):

$$\nabla f(x^0) - \nabla g(x^0)^T \lambda^* - \nabla h(x^0)^T \mu^* = 0$$
$$\lambda_i^* g_i(x^0) = 0,\ i = 1, \dots, m$$
$$\lambda^* \ge 0$$

Remark: The assumptions that there exists $y \in E^n$ such that $\nabla g_I(x^0) y > 0$ and $\nabla h(x^0) y = 0$, and that $\nabla h(x^0)$ is of full rank, together comprise a constraint qualification. It holds if $\nabla g_i(x^0)$, $i \in I$, and $\nabla h_j(x^0)$ are linearly independent.
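The K-K-T conditions above can be verified numerically on Example 1-35. The sketch below is an illustration, not part of the original notes: it plugs the three candidates and multipliers stated in that example into the stationarity residual $\nabla f - \lambda_1 \nabla g_1 - \mu_1 \nabla h_1$, the complementarity product $\lambda_1 g_1$, and the constraint values. The function name `kkt_report` and the dictionary layout are assumptions made for illustration.

```python
import math

def grad_f(x):
    # f(x) = x1, so the gradient is constant.
    return (1.0, 0.0)

def g1(x):
    return 16.0 - (x[0] - 4.0) ** 2 - x[1] ** 2

def grad_g1(x):
    return (-2.0 * (x[0] - 4.0), -2.0 * x[1])

def h1(x):
    return (x[0] - 3.0) ** 2 + (x[1] - 2.0) ** 2 - 13.0

def grad_h1(x):
    return (2.0 * (x[0] - 3.0), 2.0 * (x[1] - 2.0))

def kkt_report(x, lam, mu):
    """Evaluate each K-K-T quantity at x for multipliers (lam, mu)."""
    gf, gg, gh = grad_f(x), grad_g1(x), grad_h1(x)
    stationarity = tuple(gf[i] - lam * gg[i] - mu * gh[i] for i in range(2))
    return {
        "stationarity": stationarity,      # should be (0, 0)
        "complementarity": lam * g1(x),    # should be 0
        "g1": g1(x),                       # should be >= 0
        "h1": h1(x),                       # should be 0
        "lam_nonneg": lam >= 0.0,
    }

s13 = math.sqrt(13.0)
# Candidates and multipliers as stated in Example 1-35.
candidates = [
    ((0.0, 0.0), 1.0 / 8.0, 0.0),
    ((6.4, 3.2), 3.0 / 40.0, 1.0 / 5.0),
    ((3.0 + s13, 2.0), 0.0, s13 / 26.0),
]

reports = [kkt_report(x, lam, mu) for x, lam, mu in candidates]
for (x, lam, mu), rep in zip(candidates, reports):
    print(x, rep)
```

All three candidates pass these checks up to rounding, including $x^3$, which illustrates that the K-K-T conditions are necessary conditions and do not by themselves single out the minimizers.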
