Convexity

OR-ST-MA 706
Page |1
Spring 2016 Version
A TEXT FOR NONLINEAR PROGRAMMING
Thomas W. Reiland
Statistics Department
North Carolina State University
Raleigh, NC 27695-8203
Office: (919) 515-1939
Email: reiland@ncsu.edu
Table of Contents
Chapter I: Optimality Conditions
§ 1.1 Differentiability
§ 1.2 Unconstrained Optimization
§ 1.3 Equality Constrained Optimization
§ 1.3.1 Interpretation of Lagrange Multipliers
§ 1.3.2 Second order Conditions – Equality Constraints
§ 1.3.3 The General Case
§ 1.4 Inequality Constrained Optimization
§ 1.5 Constraint Qualifications (CQ)
§ 1.6 Second-order Optimality Conditions
§ 1.7 Constraint Qualifications and Relationships Among Constraint Qualifications
Chapter II: Convexity
§ 2.1 Convex Sets
§ 2.2 Convex Functions
§ 2.3 Subgradients and differentiable convex functions
Chapter II: Convexity
§ 2.1 Convex Sets
Definition 2.1.1 A set $C \subseteq E^n$ is a convex set if $\lambda x_1 + (1-\lambda) x_2 \in C$ for all $x_1, x_2 \in C$ and all $\lambda \in [0,1]$.
Remark: Geometrically, this means that a straight line segment joining any two points in
a convex set lies entirely in the set.
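Definition 2.1.1 can be probed numerically. The sketch below is only an illustration under stated assumptions: the two test sets (a closed unit disk and a ring) and the helper names `in_unit_disk`, `in_annulus`, `combo` and `find_violation` are hypothetical choices, not part of the text. Random sampling can exhibit a violation of convexity, but a search that finds none proves nothing.

```python
import random

def in_unit_disk(p, tol=0.0):
    """Closed unit disk in E^2: a convex set."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0 + tol

def in_annulus(p, tol=0.0):
    """Ring 0.5 <= ||p|| <= 1 in E^2: not convex, because of the hole."""
    r2 = p[0] ** 2 + p[1] ** 2
    return 0.25 - tol <= r2 <= 1.0 + tol

def combo(lam, x1, x2):
    """Return lam*x1 + (1 - lam)*x2 componentwise."""
    return tuple(lam * a + (1.0 - lam) * b for a, b in zip(x1, x2))

def find_violation(member, trials=20000, seed=0):
    """Search for x1, x2 in the set and lam in [0, 1] whose combination leaves the set."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1 = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        x2 = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        if not (member(x1) and member(x2)):
            continue  # both endpoints must belong to the set
        lam = rng.random()
        # small tolerance so that rounding error is not reported as a violation
        if not member(combo(lam, x1, x2), tol=1e-9):
            return x1, x2, lam
    return None
```

For the ring, the midpoint of (1, 0) and (-1, 0) is the origin, which lies in the hole, so the segment joining two points of the set leaves the set.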
Definition 2.1.2 A point $x \in E^n$ is a convex combination of the points $x_1, x_2, \ldots, x_k$ if $x$ can be put in the form $x = \sum_{i=1}^{k} \lambda_i x_i$, where $\lambda_i \ge 0$ and $\sum_{i=1}^{k} \lambda_i = 1$.
Theorem 2.1 For $C \subseteq E^n$ to be a convex set it is necessary and sufficient that every convex
combination of points of $C$ belongs to $C$.
Proof: Necessity: follows by induction on the number k of points in the convex combination.
Sufficiency: follows directly from the definitions (take k = 2).
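Theorem 2.2 If $\{A_i\}_{i \in I}$ is an arbitrary family of convex sets, then $\bigcap_{i \in I} A_i$ is convex.
Proof: Let $x_1, x_2 \in \bigcap_{i \in I} A_i$ and let $0 \le \lambda \le 1$. Then for each $i \in I$, $x_1, x_2 \in A_i$, and since $A_i$ is
convex, $\lambda x_1 + (1-\lambda) x_2 \in A_i$. Hence $\lambda x_1 + (1-\lambda) x_2 \in \bigcap_{i \in I} A_i$. QED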
Definition 2.1.3 Let $S$ be an arbitrary set in $E^n$. The convex hull of $S$, denoted $H(S)$, is the
collection of all convex combinations of points in $S$; i.e. $x \in H(S)$ iff $x = \sum_{j=1}^{k} \lambda_j x_j$, where
$\sum_{j=1}^{k} \lambda_j = 1$, $\lambda_j \ge 0$, $x_1, x_2, \ldots, x_k \in S$, and $k$ is a positive integer.
Example 2-1: Note the geometric interpretation of the convex hull, H(S), for the three sets
below.
Theorem 2.3 Let $S$ be an arbitrary set in $E^n$. Then $H(S)$ is the smallest convex set containing $S$.
Indeed, $H(S)$ is the intersection of all convex sets containing $S$.
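Definition 2.1.4 A point $x$ in a convex set $C$ is called an extreme point if $x$ cannot be expressed
as a convex combination of two other distinct points in $C$; i.e. $x = \lambda x_1 + (1-\lambda) x_2$ with $x_1, x_2 \in C$ and $\lambda \in (0,1)$ implies $x = x_1 = x_2$.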
Example 2-2: Below are three convex sets with their respective extreme points identified.
Definition 2.1.5 A hyperplane $H \subset E^n$ is a collection of points of the form $\{x : p^T x = \alpha\}$, where
$p$ is a (fixed) nonzero vector in $E^n$ and $\alpha$ is a (fixed) scalar. A hyperplane $H$ defines two closed
half-spaces, $H^+ = \{x : p^T x \ge \alpha\}$ and $H^- = \{x : p^T x \le \alpha\}$, and two open half-spaces,
$\{x : p^T x > \alpha\}$ and $\{x : p^T x < \alpha\}$.
Note: In $E^1$, $H$ is a point. In $E^2$, $H$ is a line. In $E^3$, $H$ is a plane.
Example 2-3:
Theorem 2.4 A hyperplane is a convex set.
Proof: Let $H = \{x : p^T x = \alpha\}$ and let $x_1, x_2 \in H$. For any point $\hat{x} = \lambda x_1 + (1-\lambda) x_2$,
where $0 \le \lambda \le 1$, we have:
$$p^T \hat{x} = p^T \lambda x_1 + p^T (1-\lambda) x_2 = \lambda p^T x_1 + (1-\lambda) p^T x_2 = \lambda \alpha + (1-\lambda) \alpha = \alpha$$
So $\hat{x} \in H$. QED
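The computation in the proof of Theorem 2.4 can be replayed on concrete numbers. This is a minimal sketch: the vector `p`, the scalar `alpha`, the points `x1`, `x2` and the helper names are hypothetical values chosen for illustration.

```python
def dot(p, x):
    """Inner product p^T x."""
    return sum(pi * xi for pi, xi in zip(p, x))

def segment_on_hyperplane(p, alpha, x1, x2, steps=10, tol=1e-9):
    """Check p^T x_hat = alpha for x_hat = lam*x1 + (1-lam)*x2 on a grid of lam in [0, 1]."""
    for k in range(steps + 1):
        lam = k / steps
        x_hat = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
        if abs(dot(p, x_hat) - alpha) > tol:
            return False
    return True

# Hypothetical data: H = {x : x_1 + 2*x_2 - x_3 = 3} in E^3
p, alpha = (1.0, 2.0, -1.0), 3.0
x1 = (3.0, 0.0, 0.0)  # p^T x1 = 3
x2 = (0.0, 2.0, 1.0)  # p^T x2 = 0 + 4 - 1 = 3
```

Calling `segment_on_hyperplane(p, alpha, x1, x2)` checks every grid point of the segment. Replacing `x2` by a point off the hyperplane, such as the origin, makes the check fail at the endpoint itself.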
Theorem 2.5 The closed half-spaces $H^+$ and $H^-$ are convex sets.
Remark: A hyperplane $H$ and the corresponding half-spaces can be written in reference
to a fixed point, say $\bar{x} \in H$. Then $p^T \bar{x} = \alpha$, and hence any $x \in H$ must satisfy
$$p^T x - p^T \bar{x} = \alpha - \alpha = 0, \quad \text{i.e. } p^T (x - \bar{x}) = 0.$$
Accordingly, $H^+ = \{x : p^T (x - \bar{x}) \ge 0\}$ and $H^- = \{x : p^T (x - \bar{x}) \le 0\}$.
Definition 2.1.6 Boundary of a Set: Let $S \subseteq E^n$ be arbitrary; $x$ is said to be in the boundary
of $S$, denoted $\partial S$, if for every $\varepsilon > 0$, $N_\varepsilon(x)$ contains at least one point in $S$ and one point not in $S$, where
$$N_\varepsilon(x) = \{y \in E^n : \|x - y\| < \varepsilon\}.$$
Example 2-4: Boundary Points
Definition 2.1.7 Let $S \subseteq E^n$ be convex and nonempty and let $\bar{x} \in \partial S$; then there exists a
supporting hyperplane of $S$ at $\bar{x}$; i.e. there exists $p \ne 0$ such that $p^T (x - \bar{x}) \le 0$ for each
$x \in \text{cl}\, S$.
§ 2.2 Convex Functions
Definition 2.2.1 Epigraph/Hypograph: Let $f$ be a function defined on a subset $D \subseteq E^n$ with
values in the extended reals, $f : D \subseteq E^n \to [-\infty, +\infty]$; the epigraph of $f$, denoted epi $f$, is
$$\text{epi}\, f = \{(x, \mu) : x \in D \subseteq E^n,\ \mu \in E^1,\ f(x) \le \mu\}$$
The hypograph of $f$, denoted hyp $f$, is
$$\text{hyp}\, f = \{(x, \mu) : x \in D \subseteq E^n,\ \mu \in E^1,\ f(x) \ge \mu\}$$
Remark: Epi $f$ and hyp $f$ are subsets of $E^{n+1}$.
Example 2-5: Graphical Depiction of Epigraph/Hypograph
Definition 2.2.2 Convex Function: Let $f$ be defined on $D \subseteq E^n$ with values in the
extended reals; $f$ is a convex function if epi $f$ is a convex set.
Example 2-6:
(1) $f \equiv +\infty$ in $E^n$: epi $f = \emptyset$; the empty set is convex.
(2) $f \equiv -\infty$ in $E^n$: epi $f = E^{n+1}$, which is also convex.
(3) Suppose $f : D \to E^1$ is a convex function, finite on $D$, where $D$ is a proper convex subset of $E^n$. Define
$$f_1(x) = \begin{cases} f(x) & x \in D \\ +\infty & x \notin D \end{cases}$$
Then epi $f$ on $D$ = epi $f_1$ defined on $E^n$. Since epi $f$ is convex, epi $f_1$ is convex, and hence $f_1$ is convex.
We can always extend $f$ to be defined on all of $E^n$ and still be convex.
Remark: As a result of the above example, we shall assume that, unless mentioned
explicitly, a convex function is defined on all of $E^n$.
Definition 2.2.3 Effective Domain: Let $f : E^n \to$ extended reals; the effective domain of $f$,
denoted $ED(f)$, is the set $ED(f) = \{x \in E^n : f(x) < +\infty\}$.
Remark: The effective domain of $f$ is the projection of epi $f$ onto $E^n$.
Theorem 2.7 If $f : E^n \to$ extended reals is convex, then $ED(f)$ is a convex set in $E^n$.
Proof: Let $x_1, x_2 \in ED(f)$ and suppose $\lambda \in [0,1]$; we must show $\lambda x_1 + (1-\lambda) x_2 \in ED(f)$.
By definition, $f$ is convex if and only if epi $f$ is convex;
i.e. if $(x_1, \mu_1)$ and $(x_2, \mu_2) \in$ epi $f$,
then $\lambda (x_1, \mu_1) + (1-\lambda)(x_2, \mu_2) = (\lambda x_1 + (1-\lambda) x_2,\ \lambda \mu_1 + (1-\lambda) \mu_2)$ belongs to epi $f$.
Now $x_1, x_2 \in ED(f) \Rightarrow f(x_1), f(x_2) < +\infty$; in particular, there exist finite
$\mu_1$ and $\mu_2$ such that $f(x_1) \le \mu_1$ and $f(x_2) \le \mu_2$; therefore, $(x_1, \mu_1)$ and $(x_2, \mu_2)$
belong to epi $f$. Since epi $f$ is a convex set in $E^{n+1}$, for $\lambda \in [0,1]$ we have
$(\lambda x_1 + (1-\lambda) x_2,\ \lambda \mu_1 + (1-\lambda) \mu_2) \in$ epi $f$,
i.e. $f(\lambda x_1 + (1-\lambda) x_2) \le \lambda \mu_1 + (1-\lambda) \mu_2$. Since $\mu_1$ and $\mu_2$ are finite,
$\lambda \mu_1 + (1-\lambda) \mu_2 < +\infty$ and hence $f(\lambda x_1 + (1-\lambda) x_2) < +\infty$,
i.e. $\lambda x_1 + (1-\lambda) x_2 \in ED(f)$. QED
Remark: The converse statement does not hold in general; i.e. if ED( f ) is convex, f is
not necessarily a convex function.
Example 2-7:
ED(f ) is convex but f is not convex; epi f is not a convex set.
Definition 2.2.4 Proper Convex Function: The function $f$ is a proper convex function if $f$ is
convex, $f(x) > -\infty$ for all $x \in E^n$, and $ED(f) \ne \emptyset$ (less than $+\infty$ somewhere). Convex functions that
are not proper are called improper.
Remark: Hence proper convex functions are convex functions that nowhere have the
value $-\infty$ and are not identically $+\infty$. Equivalently, a convex function is proper if its epigraph
is nonempty and contains no vertical lines (infinitely long from $-\infty$ to $+\infty$).
Example 2-8: Improper Convex Function
Theorem 2.8 Convex Function: Let $f$ be a function from $C$ to $(-\infty, +\infty]$, where $C$ is a convex
set (e.g. $C = E^n$). Then $f$ is convex if and only if
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y), \quad 0 < \lambda < 1 \qquad (1)$$
for every $x$ and $y$ in $C$.
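Inequality (1) lends itself to a numerical spot check, including for functions that take the value +∞ outside a convex set. The sketch below assumes one-dimensional test functions; the names `chord_violation` and `f_ext` are hypothetical, and Python's `math.inf` stands in for +∞. Sampling can only refute convexity, never prove it.

```python
import math
import random

def chord_violation(f, lo, hi, trials=10000, seed=0, tol=1e-9):
    """Search for x, y in [lo, hi] and 0 < lam < 1 that violate inequality (1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.uniform(0.0, 1.0)
        if lam <= 0.0 or lam >= 1.0:
            continue  # keep 0 < lam < 1 so that lam * inf is well defined
        lhs = f(lam * x + (1.0 - lam) * y)
        rhs = lam * f(x) + (1.0 - lam) * f(y)
        if lhs > rhs + tol:
            return x, y, lam
    return None

def f_ext(x):
    """x^2 on C = [-1, 1], extended by +inf outside C."""
    return x * x if -1.0 <= x <= 1.0 else math.inf
```

With these definitions, `chord_violation(f_ext, -2, 2)` is expected to find nothing, because the right-hand side of (1) is +∞ whenever either endpoint lies outside [-1, 1], while `chord_violation(lambda x: -x * x, -2, 2)` returns a violating triple.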
Remarks:
1. The inequality in (1) is the “classical” definition of a convex function where $f$ nowhere
has the value $-\infty$. The classical definition expresses the usual “chord above the curve” property.
Two technical approaches are possible when discussing convex functions. One can limit
attention to functions $f : D \to E^1 \cup \{-\infty\}$ which are nowhere $+\infty$, so that $D$ would always coincide with $ED(f)$
(but would vary with $f$). Alternatively, one can limit attention to functions given on all of $E^n$,
since a convex function $f$ on $D$ can always be extended to a convex function on all of $E^n$ by
setting $f(x) = +\infty$ for $x \notin D$ (see Example 2-6).
We will take the second approach. Thus, by a “convex function” we always mean a
“convex function with possible infinite values defined throughout $E^n$,” unless otherwise
specified. This approach has the advantage that technical nuisances about effective domains can
be suppressed almost entirely.
This approach leads to calculations involving $+\infty$ and $-\infty$. We adopt the following
obvious rules:
$\alpha + \infty = \infty + \alpha = \infty$ for $-\infty < \alpha \le \infty$
$\alpha - \infty = -\infty + \alpha = -\infty$ for $-\infty \le \alpha < \infty$
$\alpha \infty = \infty \alpha = \infty$, $\alpha(-\infty) = (-\infty)\alpha = -\infty$ for $0 < \alpha \le \infty$
$\alpha \infty = \infty \alpha = -\infty$, $\alpha(-\infty) = (-\infty)\alpha = \infty$ for $-\infty \le \alpha < 0$
$0 \cdot \infty = \infty \cdot 0 = 0 = 0(-\infty) = (-\infty)0$, $-(-\infty) = \infty$
$\sup \emptyset = -\infty$
$\inf \emptyset = +\infty$
$\infty - \infty$ and $-\infty + \infty$ are undefined and avoided
2. Proper convex functions (see Definition 2.2.4) are the real object of study (though
improper functions do arise from proper ones in many natural situations and it is more
convenient to include them than to exclude them with added restrictions).
Note that a proper convex function on $E^n$ is a function obtained by taking a finite convex
function $f$ on a nonempty convex set $C$ and then extending it to all of $E^n$ by setting $f(x) = +\infty$
for $x \notin C$.
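The arithmetic rules for ±∞ adopted in Remark 1 differ in two places from IEEE floating-point arithmetic, which Python follows: there, `0 * inf` and `inf - inf` both evaluate to NaN. A minimal sketch of helpers that enforce the conventions of the text is given below; the names `ext_mul`, `ext_add`, `ext_sup` and `ext_inf` are hypothetical.

```python
import math

INF = math.inf

def ext_mul(a, b):
    """Product with the convention 0 * (+/-inf) = (+/-inf) * 0 = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b

def ext_add(a, b):
    """Sum; inf + (-inf) is undefined in the text and is rejected here."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("inf - inf is undefined and must be avoided")
    return a + b

def ext_sup(values):
    """Supremum of a finite collection; sup of the empty set is -inf."""
    return max(values, default=-INF)

def ext_inf(values):
    """Infimum of a finite collection; inf of the empty set is +inf."""
    return min(values, default=INF)
```

The sign rules for nonzero multipliers already agree with floating-point behaviour, so `ext_mul(-2, INF)` simply returns `-INF`.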
Definition 2.2.5 An extended real-valued function $f$ defined on $D \subseteq E^n$ is a concave function if
$-f$ is convex.
Remark: A function $f$ is concave if and only if hyp $f$ is a convex set in $E^{n+1}$.
Definition 2.2.6 A real-valued function $f$ defined on a convex subset $C \subseteq E^n$ is strictly convex
if for any $x^1 \in C$, $x^2 \in C$, $x^1 \ne x^2$, and $\lambda \in (0,1)$ we have
$$f(\lambda x^1 + (1-\lambda) x^2) < \lambda f(x^1) + (1-\lambda) f(x^2).$$
Theorem 2.9 Let $f$ be a function from $E^n$ to $(-\infty, +\infty]$. Then $f$ is convex if and only if
$$f(\lambda_1 x^1 + \cdots + \lambda_m x^m) \le \lambda_1 f(x^1) + \cdots + \lambda_m f(x^m)$$
whenever $\lambda_i \ge 0$, $i = 1, \ldots, m$, and $\sum_{i=1}^{m} \lambda_i = 1$.
Operations on convex functions to form other convex functions…
Theorem 2.10 Let $f$ be a convex function and let $\lambda$ be a nonnegative number. Then $\lambda f$ is also a
convex function. Let $f$ and $g$ be convex functions. Then $f + g$ is also convex, provided that the
undefined operation $\infty + (-\infty)$ is avoided.
Corollary 2.11 Under the hypotheses of Theorem 2.10, every linear combination
$\lambda_1 f_1 + \cdots + \lambda_k f_k$ of convex functions $f_1, \ldots, f_k$ with $\lambda_i \ge 0$, $i = 1, \ldots, k$, is also a convex function.
Definition 2.2.7 A function $\varphi : E^1 \to (-\infty, +\infty]$ is nondecreasing if for every $x^1 \le x^2$, we have
$\varphi(x^1) \le \varphi(x^2)$.
Theorem 2.12 Let $f$ be a convex function from $E^n \to (-\infty, +\infty]$ and let $\varphi$ be a nondecreasing
convex function from $E^1 \to (-\infty, +\infty]$ with $\varphi(+\infty) = +\infty$. Then $g(x) = \varphi(f(x))$ is convex on
$E^n$.
Proof: $f$ is convex if and only if $f(\lambda x^1 + (1-\lambda) x^2) \le \lambda f(x^1) + (1-\lambda) f(x^2)$; since $\varphi$ is
nondecreasing, $\varphi(f(\lambda x^1 + (1-\lambda) x^2)) \le \varphi(\lambda f(x^1) + (1-\lambda) f(x^2))$. And since $\varphi$ is convex,
$$\varphi(\lambda f(x^1) + (1-\lambda) f(x^2)) \le \lambda \varphi(f(x^1)) + (1-\lambda) \varphi(f(x^2)).$$
Combining the two inequalities gives $g(\lambda x^1 + (1-\lambda) x^2) \le \lambda g(x^1) + (1-\lambda) g(x^2)$. QED
Example 2-9: Let $f(x) = x^2$ and $\varphi(x) = -x$. Then $\varphi(f(x)) = -x^2$, which is not convex, even though $\varphi$ is convex.
The assumption that $\varphi(x)$ is nondecreasing is needed.
(1) $g(x) = e^{f(x)}$ is a proper convex function on $E^n$ if $f(x)$ is proper convex on $E^n$,
since $\varphi(x) = e^x$ is nondecreasing and convex.
(2) $g(x) = (f(x))^p$ is convex for $p \ge 1$ when $f$ is convex and nonnegative. Here
$$\varphi(\xi) = \begin{cases} \xi^p & \xi \ge 0 \\ 0 & \xi < 0 \end{cases}$$
In particular, $g(x) = \|x\|^p$ is convex on $E^n$ for $p \ge 1$. When $p = 1$, you get
Euclidean distance.
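The role of the monotonicity assumption in Example 2-9 can be seen on a grid. This is a rough sketch: the grid, the tolerance and the names `convex_on_grid`, `exp_of_f` and `neg_of_f` are hypothetical, and a grid test can only detect a failure of the chord inequality, not certify convexity.

```python
import math

def convex_on_grid(g, points, lams, tol=1e-9):
    """Test the chord inequality for g on all pairs of grid points and all given lam."""
    for x in points:
        for y in points:
            for lam in lams:
                z = lam * x + (1.0 - lam) * y
                if g(z) > lam * g(x) + (1.0 - lam) * g(y) + tol:
                    return False
    return True

def f(x):
    return x * x

points = [-1.5 + 0.1 * k for k in range(31)]  # grid on [-1.5, 1.5]
lams = [0.1 * k for k in range(1, 10)]        # lam = 0.1, ..., 0.9

exp_of_f = lambda x: math.exp(f(x))  # phi(t) = e^t: nondecreasing and convex
neg_of_f = lambda x: -f(x)           # phi(t) = -t: convex but decreasing
```

On this grid `convex_on_grid(exp_of_f, points, lams)` holds, while `convex_on_grid(neg_of_f, points, lams)` fails, in line with the example.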
Theorem 2.13 Let $\{f_i, i \in I\}$ be a finite or infinite collection of convex functions on $E^n$. For
every $x \in E^n$ define the pointwise supremum of this collection as $f(x) = \sup_{i \in I} f_i(x)$. Then $f$ is a
convex function.
Example 2-10:
(1) Let $x = (\xi_1, \xi_2, \ldots, \xi_n) \in E^n$; then $f(x) = \max_{1 \le j \le n} \xi_j$ is convex.
Let $f_i(x) = x^T e_i$, where $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ with the 1 in the $i$th position. Each $f_i$ is linear, hence convex, so
$f(x) = \sup_{1 \le i \le n} f_i(x)$ is a convex function.
(2) Let $K(x) = \max_{1 \le j \le n} |\xi_j|$ (Chebyshev norm); then $K$ is convex.
Let $f_i(x) = |\xi_i|$, which is convex. Then $K(x) = \sup_{1 \le i \le n} f_i(x)$.
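The two constructions in Example 2-10 can be written out directly as pointwise suprema of finitely many functions. The following is a sketch under assumptions: the dimension n = 3, the sampling box and the names `coordinate_functions`, `pointwise_sup`, `chebyshev` and `chord_ok` are hypothetical.

```python
import random

def coordinate_functions(n):
    """f_i(x) = x^T e_i, the i-th coordinate of x; each f_i is linear, hence convex."""
    return [lambda x, i=i: x[i] for i in range(n)]

def pointwise_sup(funcs):
    """Pointwise supremum of a finite collection of functions."""
    return lambda x: max(g(x) for g in funcs)

def chebyshev(x):
    """K(x) = max_j |xi_j|, the supremum of the convex functions |xi_i|."""
    return max(abs(v) for v in x)

def chord_ok(f, n, trials=5000, seed=1, tol=1e-9):
    """Spot-check the chord inequality for f on random points of [-5, 5]^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        y = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        lam = rng.random()
        z = [lam * a + (1.0 - lam) * b for a, b in zip(x, y)]
        if f(z) > lam * f(x) + (1.0 - lam) * f(y) + tol:
            return False
    return True

f_max = pointwise_sup(coordinate_functions(3))
```

The same spot check rejects the pointwise minimum `lambda x: min(x)`, which is concave rather than convex; Theorem 2.13 is about suprema only.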
Theorem 2.14 The function $f$ is convex on $E^n$ if and only if for every $x^1, x^2 \in E^n$, the function
$\varphi$ defined by $\varphi(\lambda) = f(\lambda x^1 + (1-\lambda) x^2)$ is convex for $\lambda \in [0,1]$.
Remark: This theorem states that a function is convex on a convex set $C$ if and only if
the restriction of $f$ to each line segment in the set $C$ is a convex function. Note that $f$ is defined
on $E^n$ while $\varphi$ is defined on $E^1$.
Example 2-11: Continuity of Convex Functions
$$f(x) = \begin{cases} x^2 - 1 & x < 1 \\ 2 & x = 1 \\ +\infty & x > 1 \end{cases}$$
This is a convex function since epi $f$ is convex.
The effective domain is: $ED(f) = (-\infty, 1]$.
Remark: Roughly speaking, discontinuities in convex functions can occur only at some
boundary points of their effective domain. Some of these discontinuities can be removed as will
be shown below.
Theorem 2.15 A convex function $f$ on $E^n$ is continuous on the interior of its effective domain.
Remark: This generalizes the fact that a real-valued convex function $f : C \to E^1$ on an open convex
set $C$ is continuous (in particular, a convex function $f : E^n \to E^1$ is continuous).
Definition 2.2.8 Let $f$ be a convex function defined on $E^n$. Define the support set, denoted
$L(f)$, as follows:
$$L(f) = \{(a, b) : a \in E^n,\ b \in E^1,\ a^T x - b \le f(x)\ \forall x \in E^n\}.$$
Example 2-12:
Definition 2.2.9 Let $f$ be a convex function defined on $E^n$. Then the closure of $f$, denoted cl $f$,
is:
$$\text{cl}\, f(x) = \sup_{(a,b) \in L(f)} (a^T x - b).$$
A convex function $f$ is said to be closed if $f = \text{cl}\, f$.
Example 2-13:
Recall the definition of a proper convex function (Definition 2.2.4).
Theorem 2.16 A proper convex function $f$ on $E^n$ is closed if and only if the convex level set
$\{x : f(x) \le \alpha\}$ is closed for every real number $\alpha$.
Remarks:
1. The closure operation for proper convex functions is directly related to the
closure operation for sets. Specifically, the epigraph of cl $f$ is the closure of the epigraph of $f$,
i.e. $\text{epi}(\text{cl}\, f) = \overline{\text{epi}\, f}$ ($\overline{C}$ denotes the closure of $C$). This last relation can also be used as the
definition of the closure operation for proper convex functions.
2. For improper convex functions, the closure operation has the following
meaning:
(i) if $f(x) = +\infty$ for every $x \in E^n$, then $\text{cl}\, f = f$
(ii) if $f(x) = -\infty$ for some $x \in E^n$, then by definition $L(f) = \emptyset$; since by
definition, the supremum over the empty set is $-\infty$ (see remarks for Theorem 2.8), we get
$\text{cl}\, f(x) = -\infty$ for every $x \in E^n$. Therefore, the only closed improper convex functions are those
that are identically $+\infty$ or $-\infty$.
Example 2-14:
$$f(x) = \begin{cases} x^2 - 1 & x < 1 \\ 2 & x = 1 \\ +\infty & x > 1 \end{cases} \qquad \text{cl}\, f(x) = \begin{cases} x^2 - 1 & x \le 1 \\ +\infty & x > 1 \end{cases}$$
The closure of epi $f$ = epi (cl $f$).
Example 2-15:
0 x  0
f ( x)  
 x  1
0 x  0
cl f  
 x  1
Remark: The closure operation is a normalization which makes convex functions “more regular”
by redefining their values at certain points where there are unnatural discontinuities.
§ 2.3 Subgradients and differentiable convex functions
Definition 2.3.1 Let $f$ be defined on $E^n$ with values in $[-\infty, +\infty]$. Let $x^0$ be a point where $f$ is
finite and let $y \in E^n$. The right-sided derivative of $f$ at $x^0$ in the direction $y$ is
$$D^+ f(x^0; y) = \lim_{t \to 0^+} \frac{f(x^0 + ty) - f(x^0)}{t},$$
if the limit (which can be $\pm\infty$) exists.
The left-sided derivative of $f$ at $x^0$ in the direction $y$ is
$$D^- f(x^0; y) = \lim_{t \to 0^-} \frac{f(x^0 + ty) - f(x^0)}{t}.$$
For $y = 0$, $D^+ f(x^0; y) = D^- f(x^0; y) = 0$. The limits in Definition 2.3.1 exist for convex
and concave functions.
Example 2-16: Let $f(x) = |x|$; $x^0 = 0$, $y = 1$.
$$D^+ f(x^0; y) = \lim_{t \to 0^+} \frac{|t| - 0}{t} = 1$$
$$D^- f(x^0; y) = \lim_{t \to 0^-} \frac{|t| - 0}{t} = -1$$
Theorem 2.17 Let $f$ be a convex function on $E^n$ and let $\bar{x}$ be such that $f(\bar{x})$ is finite. Then the
right- and left-sided derivatives of $f$ at $\bar{x}$ exist in every direction $y$.
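The one-sided limits of Definition 2.3.1 can be approximated by difference quotients with shrinking step sizes. The sketch below revisits Example 2-16 for the absolute value function; the helper name `one_sided_quotients` and the step sizes `ts` are hypothetical choices.

```python
def one_sided_quotients(f, x0, y, ts):
    """Difference quotients of f at x0 in direction y, for steps t > 0 and -t < 0."""
    right = [(f(x0 + t * y) - f(x0)) / t for t in ts]
    left = [(f(x0 - t * y) - f(x0)) / (-t) for t in ts]
    return right, left

# Positive steps t = 10^-1, ..., 10^-7
ts = [10.0 ** (-k) for k in range(1, 8)]

# f(x) = |x| at x0 = 0 in the direction y = 1
right, left = one_sided_quotients(abs, 0.0, 1.0, ts)
```

Every right quotient equals 1 and every left quotient equals -1, so the two one-sided derivatives exist but differ, and the absolute value has no directional derivative at 0 in the direction y = 1.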
Remarks: If for some $x^0$ and $y$, $D^+ f(x^0; y) = D^- f(x^0; y)$, this common value is called
the directional derivative of $f$ at $x^0$ in the direction $y$ and denoted $Df(x^0; y)$:
$$Df(x^0; y) = \lim_{t \to 0} \frac{f(x^0 + ty) - f(x^0)}{t}$$
If $y = (1, 0, \ldots, 0)^T$, then $Df(x^0; y) = \dfrac{\partial f(x^0)}{\partial x_1}$. These partial derivatives may exist
even if $f$ is not differentiable.
If $f$ is differentiable, then $Df(x^0; y) = y^T \nabla f(x^0)$.
Since epi f (hyp f) of a convex (concave) function f is a convex set, epi f has a supporting
hyperplane at its boundary points (see Theorem 2.6). This fact motivates the following
definition.
Definition 2.3.2 A subgradient of a convex function $f$ at a point $\bar{x} \in E^n$ is a vector $\xi \in E^n$
satisfying:
$$f(x) \ge f(\bar{x}) + \xi^T (x - \bar{x}), \quad \forall x \in E^n. \qquad (*)$$
Similarly, a subgradient of a concave function $f$ at a point $\bar{x} \in E^n$ is a vector $\xi \in E^n$
satisfying:
$$f(x) \le f(\bar{x}) + \xi^T (x - \bar{x}), \quad \forall x \in E^n.$$
Example 2-17: Geometric Interpretation of a Subgradient
<FIX: Insert graphic p. 19 of handout>
Remarks:
1. The graph of the affine function $f(\bar{x}) + \xi^T (x - \bar{x})$ [a function of $x$ since $\bar{x}$ and $\xi$ are fixed] is a
non-vertical supporting hyperplane of epi $f$ or hyp $f$ at the point $(\bar{x}, f(\bar{x}))$ when $f$ is finite at
$\bar{x}$; $\xi$ corresponds to the slope of the hyperplane.
2. In general, at a point $\bar{x} \in E^n$, there are 3 possibilities:
(i) no vector $\xi$ satisfying the subgradient inequality exists
(ii) there exists a unique $\xi$ satisfying the subgradient inequality
(iii) there exists more than one such $\xi$ (an infinite number)
Definition 2.3.3 The set of subgradients of the convex (concave) function $f$ at $\bar{x}$ is called the
subdifferential of $f$ at $\bar{x}$ and denoted $\partial f(\bar{x})$. If $\partial f(\bar{x})$ is nonempty, then $f$ is subdifferentiable at
$\bar{x}$.
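Example 2-18:
(i) Consider the point depicted. The subdifferential at this point is the empty set.
(ii) $f(x) = x^2$, $x \in E^1$. The subgradient inequality reads
$$f(x) \ge f(\bar{x}) + \xi (x - \bar{x}) \iff x^2 - \bar{x}^2 \ge \xi (x - \bar{x}).$$
For $x > \bar{x}$: $\xi \le \dfrac{x^2 - \bar{x}^2}{x - \bar{x}} = x + \bar{x}$ for all $x > \bar{x}$, so letting $x \to \bar{x}$ gives $\xi \le 2\bar{x}$.
For $x < \bar{x}$: $\xi \ge \dfrac{x^2 - \bar{x}^2}{x - \bar{x}} = x + \bar{x}$ for all $x < \bar{x}$, so letting $x \to \bar{x}$ gives $\xi \ge 2\bar{x}$.
Therefore, $\xi = 2\bar{x}$.
(iii) $f(x) = \|x\|$, the Euclidean norm in $E^n$; $\bar{x} = 0$.
$$f(x) \ge f(\bar{x}) + \xi^T (x - \bar{x}) \iff \|x\| \ge \xi^T x \quad \forall x \in E^n$$
$$\partial f(\bar{x}) = \partial f(0) = \text{unit ball in } E^n = \{\xi \in E^n : \|\xi\| \le 1\}$$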
<FIX: insert graphic p. 20 of handout>
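The subgradient inequality in parts (ii) and (iii) of Example 2-18 can be tested on sample points. The sketch below makes several assumptions: the base point 1.5, the grids and the names `is_subgradient_1d` and `is_subgradient_of_norm_at_zero` are hypothetical, and passing on a finite grid is only evidence, not a proof.

```python
import math

def is_subgradient_1d(f, xbar, xi, xs, tol=1e-9):
    """Test f(x) >= f(xbar) + xi*(x - xbar) at every sample point x."""
    return all(f(x) >= f(xbar) + xi * (x - xbar) - tol for x in xs)

def is_subgradient_of_norm_at_zero(xi, points, tol=1e-9):
    """Test ||x|| >= xi^T x at every sample point x of E^2."""
    return all(
        math.hypot(*x) >= sum(a * b for a, b in zip(xi, x)) - tol
        for x in points
    )

square = lambda x: x * x
xs = [-5.0 + 0.01 * k for k in range(1001)]  # grid on [-5, 5]
grid = [(0.5 * i, 0.5 * j) for i in range(-4, 5) for j in range(-4, 5)]
```

At the base point 1.5 only the slope 2(1.5) = 3 passes for the square function; slopes 2.9 and 3.1 fail at grid points close to 1.5. For the norm, the vector (0.6, 0.8) has length 1 and passes, while (0.8, 0.8) has length greater than 1 and fails at the point (1, 1).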
Example 2-19: $f(x) = \min\{f_1(x), f_2(x)\}$, $x \in E^1$, where
$f_1(x) = 4 - x$
$f_2(x) = 4 - (x - 2)^2$
$$f(x) = \begin{cases} 4 - x & 1 \le x \le 4 \\ 4 - (x - 2)^2 & \text{otherwise} \end{cases}$$
This is concave: the hypograph is a convex set.
For $x \in (1, 4)$, $\xi = -1$ is a subgradient. For $x < 1$ or $x > 4$, $\xi = -2(x - 2)$ is a subgradient.
Now examine the corner points, $x = 1$ and $x = 4$, where $\xi$ is not unique.
At $x = 1$, the subgradients are characterized by $\xi = \lambda \nabla f_1(1) + (1-\lambda) \nabla f_2(1)$ for $\lambda \in [0,1]$:
$$\xi = \lambda(-1) + (1-\lambda)(2) = 2 - 3\lambda, \quad \lambda \in [0,1], \quad \xi \in [-1, 2]$$
At $x = 4$, the subgradients are characterized by $\xi = \lambda \nabla f_1(4) + (1-\lambda) \nabla f_2(4)$ for $\lambda \in [0,1]$:
$$\xi = \lambda(-1) + (1-\lambda)(-4) = 3\lambda - 4, \quad \lambda \in [0,1], \quad \xi \in [-4, -1]$$
<FIX: insert graphic from p. 20 of handout>
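The subgradient intervals at the two corner points of Example 2-19 can be checked against the concave form of the subgradient inequality, f(x) ≤ f(x̄) + ξ(x − x̄). This is a minimal sketch; the names `f_min` and `is_concave_subgradient` and the sample grid are hypothetical.

```python
def f_min(x):
    """f(x) = min{4 - x, 4 - (x - 2)^2}, the concave function of Example 2-19."""
    return min(4.0 - x, 4.0 - (x - 2.0) ** 2)

def is_concave_subgradient(xbar, xi, xs, tol=1e-9):
    """Test f(x) <= f(xbar) + xi*(x - xbar) at every sample point x."""
    return all(f_min(x) <= f_min(xbar) + xi * (x - xbar) + tol for x in xs)

xs = [-3.0 + 0.1 * k for k in range(101)]  # grid on [-3, 7]

# Slopes inside and just outside the intervals [-1, 2] (x = 1) and [-4, -1] (x = 4)
inside_at_1 = [is_concave_subgradient(1.0, xi, xs) for xi in (-1.0, 0.5, 2.0)]
outside_at_1 = [is_concave_subgradient(1.0, xi, xs) for xi in (-1.5, 2.5)]
inside_at_4 = [is_concave_subgradient(4.0, xi, xs) for xi in (-4.0, -2.5, -1.0)]
outside_at_4 = [is_concave_subgradient(4.0, xi, xs) for xi in (-4.5, -0.5)]
```

The endpoint slopes are the gradients of the two pieces at the corner, and slopes just outside each interval fail at grid points next to the corner.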
Theorem 2.18 Let $f$ be a convex function on $E^n$; then for any $\bar{x} \in E^n$, $\partial f(\bar{x})$ is a closed
convex set.
Subgradients can be characterized in terms of directional derivatives.
Theorem 2.19 A vector $\xi \in E^n$ is a subgradient of a convex function $f$ at a point $\bar{x} \in E^n$ where
$f$ is finite if and only if $D^+ f(\bar{x}; y) \ge \xi^T y$ for every direction $y \in E^n$.
Example 2-20:
(1) $f(x) = x^2$. Previously we verified that the only subgradient at $\bar{x}$ is $\nabla f(\bar{x}) = 2\bar{x}$. The above theorem says
(trivially) that
$$D^+ f(\bar{x}; y) \ge \nabla f(\bar{x})^T y = 2\bar{x} y \quad \forall y \in E^1,$$
i.e. $2\bar{x} y \ge 2\bar{x} y$ for all $y \in E^1$.
(2) $f(x) = \|x\|$, $x \in E^n$.
$\xi \in E^n$ is a subgradient of $f$ at $\bar{x} = 0$ if and only if $D^+ f(0; y) \ge \xi^T y$ for all $y \in E^n$,
i.e. iff $\lim_{t \to 0^+} \dfrac{\|ty\|}{t} \ge \xi^T y$ for all $y \in E^n$,
i.e. iff $\|y\| \ge \xi^T y$ for all $y \in E^n$, which is true iff $\xi$ is in the unit ball in $E^n$,
i.e. $\partial f(0)$ is the unit ball in $E^n$, which we already established.
Theorem 2.20 Let $f$ be a convex function on $E^n$ and suppose $f(\bar{x})$ is finite.
Then $f(x) \ge f(\bar{x}) + D^+ f(\bar{x}; x - \bar{x})$ for all $x \in E^n$.
In particular, if $f$ is differentiable at $\bar{x}$, then $f(x) \ge f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x})$ for all $x \in E^n$. In other
words, the linear approximation always underestimates the function.
Proof: For $0 < t < 1$ and $x \in E^n$, convexity gives
$$f((1-t)\bar{x} + tx) \le (1-t) f(\bar{x}) + t f(x)$$
$$f(\bar{x} + t(x - \bar{x})) \le f(\bar{x}) - t f(\bar{x}) + t f(x)$$
$$t f(x) - t f(\bar{x}) \ge f(\bar{x} + t(x - \bar{x})) - f(\bar{x})$$
Dividing by $t > 0$:
$$f(x) - f(\bar{x}) \ge \frac{f(\bar{x} + t(x - \bar{x})) - f(\bar{x})}{t}$$
Letting $t \to 0^+$:
$$f(x) - f(\bar{x}) \ge \lim_{t \to 0^+} \frac{f(\bar{x} + t(x - \bar{x})) - f(\bar{x})}{t} = D^+ f(\bar{x}; x - \bar{x})$$
$\Rightarrow f(x) \ge f(\bar{x}) + D^+ f(\bar{x}; x - \bar{x})$. QED.
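The two inequalities used in the proof of Theorem 2.20 can be observed on a concrete differentiable convex function. This sketch rests on assumptions: the quadratic `f`, its gradient `grad_f`, the base point `xbar`, the grid and the helper names are hypothetical examples, not taken from the text.

```python
def f(x):
    """A hypothetical smooth convex function on E^2: x_1^2 + 2*x_2^2."""
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad_f(x):
    """Gradient of f."""
    return (2.0 * x[0], 4.0 * x[1])

def linearization(xbar, x):
    """f(xbar) + grad f(xbar)^T (x - xbar)."""
    return f(xbar) + sum(g * (a - b) for g, a, b in zip(grad_f(xbar), x, xbar))

def quotient(xbar, x, t):
    """[f(xbar + t(x - xbar)) - f(xbar)] / t, the quotient bounded in the proof."""
    z = tuple(b + t * (a - b) for a, b in zip(x, xbar))
    return (f(z) - f(xbar)) / t

def check(xbar, points, ts=(0.5, 0.1, 0.01), tol=1e-9):
    """Check the quotient bound and the linear underestimate at every sample point."""
    for x in points:
        gap = f(x) - f(xbar)
        if any(quotient(xbar, x, t) > gap + tol for t in ts):
            return False
        if f(x) < linearization(xbar, x) - tol:
            return False
    return True

xbar = (1.0, -0.5)
points = [(0.5 * i, 0.5 * j) for i in range(-6, 7) for j in range(-6, 7)]
```

For this quadratic, `check(xbar, points)` holds: each quotient stays below f(x) − f(x̄) for 0 < t < 1, and the linearization at x̄ never exceeds f.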