Lecture 4: Two-stage stochastic linear programs
January 24, 2018
Uday V. Shanbhag

Introduction

Consider the two-stage stochastic linear program defined as

SLP:   minimize_x   c^T x + Q(x)
       subject to   Ax = b,  x ≥ 0,

where Q(x) is defined as Q(x) ≜ E[Q(x, ξ)]. If ξ(ω) ≜ (q(ω), W(ω), h(ω), T(ω)), then the integrand of this expectation, denoted by Q(x, ξ), is the optimal value of the following second-stage problem:

SecLP(ξ):   minimize_{y(ω)}   q(ω)^T y(ω)
            subject to        T(ω)x + W(ω)y(ω) = h(ω),  y(ω) ≥ 0.

• T(ω) is referred to as the technology matrix (the quantity T(ω)x is often called the tender).
• W(ω) is called the recourse matrix.

Polyhedral functions

Definition 1 (Polyhedron). A polyhedron (or polyhedral convex set) is the intersection of a finite number of halfspaces, i.e., the solution set of a finite system of linear inequalities.

Definition 2 (Polyhedral function). A function f : R^n → (−∞, +∞] is said to be polyhedral if its epigraph epi(f) is a polyhedral set in R^{n+1}, where epi(f) ≜ {(x, w) : x ∈ R^n, w ∈ R, f(x) ≤ w}.

A polyhedral function f is closed, convex, and proper, since epi(f) is a closed, convex, and nonempty set.

Discrete random variables

• Suppose ξ is a discrete random variable taking values in Ξ.
• Suppose K1 and K2(ξ) (the elementary feasibility set) are defined as
    K1 ≜ {x : Ax = b, x ≥ 0}  and  K2(ξ) ≜ {x : there exists y ≥ 0 such that W(ω)y = h(ω) − T(ω)x}.
• Moreover, K2 is defined as K2 ≜ ∩_{ξ∈Ξ} K2(ξ).

Next, we prove some useful properties of K2(ξ), for which we need the positive hull.

Definition 3 (Positive hull). The positive hull of W is defined as pos W ≜ {t : W y = t, y ≥ 0}.

In fact, pos W is a finitely generated cone: the set of nonnegative linear combinations of finitely many vectors (the columns of W). Note that it is convex and polyhedral.

Proposition 1 (Polyhedrality of K2(ξ) and K2). The following hold:
1. For a given ξ, K2(ξ) is a convex polyhedron;
2. When ξ is a finite discrete random variable, K2 is a convex polyhedron.

Proof:
(i) Consider an x ∉ K2(ξ), where ξ = (q(ω), W(ω), h(ω), T(ω)). Then there exists no nonnegative y such that W(ω)y = h(ω) − T(ω)x; equivalently, h(ω) − T(ω)x ∉ pos W(ω), where pos W(ω) is a finitely generated convex cone. Therefore, by the separating hyperplane theorem, there exists a hyperplane H = {v : u^T v = 0} such that u^T (h(ω) − T(ω)x) < 0 and u^T v ≥ 0 for all v ∈ pos W(ω). Since pos W(ω) is a finitely generated cone, only finitely many such hyperplanes are needed to describe it. It follows that K2(ξ) can be viewed as the intersection of a finite number of halfspaces and is therefore a convex polyhedron.
(ii) Since K2 is the intersection of a finite number of polyhedral sets, K2 is a convex polyhedron.
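The membership x ∈ K2(ξ) is an LP feasibility question: does W y = h − T x admit a solution y ≥ 0? A minimal numerical sketch follows; the matrices W, T and the vector h below are illustrative placeholders (not data from the lecture), chosen so that pos W = R^2_+ and the test is nontrivial.

import numpy as np
from scipy.optimize import linprog

W = np.eye(2)                                  # recourse matrix; pos W = R^2_+ here
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])                     # technology matrix
h = np.array([2.0, 1.0])

def in_K2(x):
    # Feasibility LP: min 0 subject to W y = h - T x, y >= 0 (linprog's default bounds).
    res = linprog(c=np.zeros(W.shape[1]), A_eq=W, b_eq=h - T @ x, method="highs")
    return res.status == 0                     # status 0: solved; status 2: infeasible

print(in_K2(np.array([0.5, 0.5])))             # True:  h - T x = (1.0, 0.5) lies in pos W
print(in_K2(np.array([3.0, 0.0])))             # False: h - T x = (-1.0, 1.0) does not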
Next, we examine the properties of Q(x, ξ).

Proposition 2 (Properties of Q(x, ξ)). For a given ξ, Q(x, ξ) satisfies the following:
(i) Q(x, ξ) is a piecewise linear convex function in (h, T);
(ii) Q(x, ξ) is a piecewise linear concave function in q;
(iii) Q(x, ξ) is a piecewise linear convex function in x for x ∈ K2.

Proof:
(i) Suppose f(b) is defined as f(b) = min {q^T y : W y = b, y ≥ 0}. We proceed to show that f(b) is convex in b. Consider b1, b2 and let bλ = λ b1 + (1 − λ) b2, where λ ∈ (0, 1). Let y1 and y2 denote solutions of min {q^T y : W y = b, y ≥ 0} for b = b1 and b = b2, respectively, and suppose yλ denotes a solution of min {q^T y : W y = bλ, y ≥ 0}. The point ȳ = λ y1 + (1 − λ) y2 is feasible for this last problem, since
    W(λ y1 + (1 − λ) y2) = λ W y1 + (1 − λ) W y2 = bλ  and  λ y1 + (1 − λ) y2 ≥ 0.
Consequently,
    f(bλ) = q^T yλ ≤ q^T (λ y1 + (1 − λ) y2) = q^T ȳ      (by optimality of yλ)
          = λ q^T y1 + (1 − λ) q^T y2
          = λ f(b1) + (1 − λ) f(b2),
so f is convex in b. Since b = h − T x is affine in (h, T) for fixed x, the convexity of Q(x, ξ) = f(h − T x) in (h, T) follows.

(ii) To show the concavity of Q(x, ξ) in q, we may write f(q) as f(q) = max {π^T b : W^T π ≤ q}. Proceeding as earlier (a convex combination of points feasible for q1 and q2 is feasible for qλ), it is relatively easy to show that f(q) is concave in q.

(iii) The convexity of Q(x, ξ) in x follows from (i), since h − T x is affine in x. Next, we show the piecewise linearity of Q(x, ξ). Suppose y is a solution of min {q^T y : W y = h − T x, y ≥ 0}. If yB and yN (≡ 0) denote the basic and nonbasic variables, B(ω) denotes the basis, and N(ω) represents the remaining columns, we have
    B(ω) yB + N(ω) yN = h(ω) − T(ω)x,  so  B(ω) yB = h(ω) − T(ω)x  and  yB = (B(ω))^{−1}(h(ω) − T(ω)x).
Recall that when a feasible basis is optimal, every feasible ỹ satisfies
    q(ω)^T ỹ ≥ qB(ω)^T yB = qB(ω)^T (B(ω))^{−1}(h(ω) − T(ω)x) = Q(x, ξ),
implying that Q(x, ξ) is linear in (h, T, q, x) on the region prescribed by the feasibility and optimality conditions of that basis. Piecewise linearity follows from the existence of only finitely many distinct optimal bases for the second-stage program.

Support functions

The support function sC(·) of a nonempty closed and convex set C is defined as
    sC(h) ≜ sup_{z∈C} z^T h.
Given a convex set C, its support function sC(·) is convex, positively homogeneous (a function f is positively homogeneous if f(λx) = λ f(x) for all λ ≥ 0 and x ∈ R^n), and lower semicontinuous (f is lower semicontinuous at x0 if lim inf_{x→x0} f(x) ≥ f(x0)). If s1(·) and s2(·) are the support functions of closed convex sets C1 and C2, respectively, then
    s1(·) ≤ s2(·) ⇔ C1 ⊂ C2,  and furthermore  s1(·) = s2(·) ⇔ C1 = C2.

Lemma 1 (Support function of the unit ball). Consider the Euclidean unit ball B ≜ {π : ‖π‖2 ≤ 1}. Then sB(·) = ‖·‖2.

Proof: By definition, sB(χ) = sup_{π∈B} π^T χ. Since B is compact and the objective is continuous, the supremum is achieved. For χ ≠ 0, the maximizer is π* = χ/‖χ‖, and consequently
    (π*)^T χ = χ^T χ / ‖χ‖ = ‖χ‖² / ‖χ‖ = ‖χ‖.
(For χ = 0, sB(0) = 0 = ‖0‖.)

Lemma 2 (Support function of a Minkowski sum). Consider a set X = X1 + X2 + ... + Xn, where X1, X2, ..., Xn are nonempty closed and convex sets. Then for any y ∈ R^n, the support function sX(y) is given by
    sX(y) = sX1(y) + sX2(y) + ... + sXn(y).

Proof: By the definition of a support function, we have the following sequence of equalities:
    sX(y) = sup_{z∈X} z^T y
          = sup_{z∈X1+...+Xn} z^T y
          = sup_{z1∈X1,...,zn∈Xn} (z1 + ... + zn)^T y
          = sup_{z1∈X1} z1^T y + ... + sup_{zn∈Xn} zn^T y
          = sX1(y) + ... + sXn(y).
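As a quick sanity check of Lemmas 1 and 2, the short sketch below verifies the unit-ball formula and the Minkowski-sum additivity; the two polytopes are toy choices (for a polytope the supremum is attained at a vertex, and the vertices of a Minkowski sum lie among the pairwise vertex sums).

import numpy as np

def support_polytope(vertices, y):
    # s_C(y) = max over vertices v of v^T y, for C = conv(vertices)
    return max(float(v @ y) for v in vertices)

# Two toy polytopes in R^2, given by their vertices.
X1 = [np.array(v, dtype=float) for v in [(0, 0), (1, 0), (0, 1)]]
X2 = [np.array(v, dtype=float) for v in [(-1, -1), (1, -1), (1, 1), (-1, 1)]]
X_sum = [v1 + v2 for v1 in X1 for v2 in X2]    # candidate vertices of X1 + X2

y = np.array([0.3, -0.7])
lhs = support_polytope(X_sum, y)
rhs = support_polytope(X1, y) + support_polytope(X2, y)
print(np.isclose(lhs, rhs))                    # True: s_{X1+X2}(y) = s_{X1}(y) + s_{X2}(y)

# Unit ball: the maximizer of pi^T chi over ||pi|| <= 1 is chi/||chi|| (for chi != 0),
# so s_B(chi) = ||chi||_2.
chi = np.array([2.0, -1.0])
print(np.isclose((chi / np.linalg.norm(chi)) @ chi, np.linalg.norm(chi)))   # True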
Lemma 3 (Positive homogeneity of the support function). Consider the support function of a closed and convex set X. Then for any y ∈ R^n and λ ≥ 0, sX(λy) = λ sX(y).

Proof: For λ ≥ 0,
    sX(λy) = sup_{z∈X} λ y^T z = λ sup_{z∈X} y^T z = λ sX(y).

Another example of a support function:
• For the simplex C = {x : Σ_{i=1}^n xi = 1, xi ≥ 0}, we have sC(χ) = max_{j∈{1,...,n}} χj.

Subgradients of Q(x, ξ)

Consider the second-stage linear program

SecLP(ξ):   minimize_{y(ω)}   q(ω)^T y(ω)
            subject to        T(ω)x + W(ω)y(ω) = h(ω),  y(ω) ≥ 0.

Then the associated dual problem is given by

D-SecLP(ξ):   maximize_{π(ω)}   (h(ω) − T(ω)x)^T π(ω)
              subject to        W(ω)^T π(ω) ≤ q(ω).

The recourse function Q(x, ξ) is defined as follows (suppressing ω):
    Q(x, ξ) ≜ inf {q^T y : W y = h − T x, y ≥ 0} ≡ inf {q^T y : W y = χ, y ≥ 0} ≜ sq(χ),
where χ = h − T x. Suppose Π(q) is defined as
    Π(q) ≜ {π : W^T π ≤ q}.
From LP duality, we have
    inf {q^T y : W y = χ, y ≥ 0} ≡ sup {χ^T π : W^T π ≤ q}.
As a consequence,
    sq(χ) = sup_{π∈Π(q)} π^T χ.
This implies that sq(χ) is the support function of Π(q), a closed and convex polyhedral set.

Next, we examine the subdifferential of Q(x, ξ); this requires some definitions.

Definition 4 (Directional, Gâteaux, and Fréchet differentiability). Given a mapping g : R^n → R^m, g is said to be directionally differentiable at a point x0 ∈ R^n in a direction h ∈ R^n if the following limit exists:
    g′(x0; h) := lim_{t↓0} (g(x0 + t h) − g(x0)) / t.
If this limit exists for every h ∈ R^n, then g is said to be directionally differentiable at x0. Whenever it exists, g′(x0; ·) is positively homogeneous: g′(x0; t h) = t g′(x0; h) for any t ≥ 0. If g is directionally differentiable at x0 and g′(x0; h) is linear in h, then g is said to be Gâteaux differentiable at x0. The directional derivative can also be expressed through the expansion
    g(x0 + h) = g(x0) + g′(x0; h) + r(h),
where the remainder r satisfies r(t h)/t → 0 as t ↓ 0 for any fixed h ∈ R^n. If, in addition, g′(x0; h) is linear in h and r(h)/‖h‖ → 0 as h → 0 (that is, r(h) = o(‖h‖)), then g is said to be Fréchet differentiable at x0, or merely differentiable at x0. Note that Fréchet differentiability implies Gâteaux differentiability, and when the function is locally Lipschitz, the two notions coincide. Recall that a mapping g is said to be locally Lipschitz on a set X ⊂ R^n if it is Lipschitz continuous on a neighborhood of every point of X (with possibly different constants).

Summary of differentiability concepts

• Directionally differentiable at x0: the directional derivative g′(x0; h) exists for every direction h.
• g is Gâteaux differentiable at x0 ⟹ g is directionally differentiable at x0 and g′(x0; h) is linear in h.
• g is Fréchet differentiable at x0 ⟹ g′(x0; h) is linear in h and r(h)/‖h‖ → 0 as h → 0.
• In fact, Fréchet differentiability ⟹ Gâteaux differentiability; if g is locally Lipschitz continuous, the two notions coincide.

Subgradients and subdifferentials

A vector z ∈ R^n is said to be a subgradient of f at x0 if
    f(x) − f(x0) ≥ z^T (x − x0)  for all x ∈ R^n.
The set of all subgradients of f at x0 is referred to as the subdifferential and is denoted by ∂f(x0). The subdifferential ∂f(x0) is a closed and convex subset of R^n, and f is subdifferentiable at x0 if ∂f(x0) ≠ ∅.

• Consider the function f(x) = |x|. Then
    ∂f(x) = {−1} if x < 0,  {1} if x > 0,  [−1, 1] if x = 0.
• Consider the function f(x) = max{x, 0}. Then
    ∂f(x) = {0} if x < 0,  {1} if x > 0,  [0, 1] if x = 0.
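A trivial numerical check of the first example (the grid and tolerance below are illustrative choices): every z ∈ [−1, 1] satisfies the subgradient inequality for f(x) = |x| at x0 = 0, while a slope outside [−1, 1] fails it somewhere.

import numpy as np

f, x0 = np.abs, 0.0
xs = np.linspace(-2.0, 2.0, 401)
for z in (-1.0, -0.3, 0.0, 0.7, 1.0):          # candidate subgradients in [-1, 1]
    assert np.all(f(xs) - f(x0) >= z * (xs - x0) - 1e-12)
assert np.any(f(xs) - f(x0) < 1.5 * (xs - x0)) # z = 1.5 violates the inequality somewhere
print("subgradient inequality holds for every z in [-1, 1] at x0 = 0")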
Convexity and polyhedrality of Q(x, ξ)

Proposition 3 (Convexity and polyhedrality of Q(x, ξ)). For any given ξ, the function Q(·, ξ) is convex. Moreover, if the set Π(q) is nonempty and the second-stage problem SecLP(ξ) is feasible for at least one x, then Q(·, ξ) is polyhedral.

Proof: Recall that the set Π(q) is closed, convex, and polyhedral. If Π(q) is nonempty, then sq(χ), defined as
    sq(χ) = sup_{π∈Π(q)} χ^T π,
is a polyhedral function, where χ = h − T x. (Recall that an extended real-valued function is called polyhedral if it is proper, convex, and lower semicontinuous, its domain is a closed convex polyhedron, and it is piecewise linear on its domain. A function f is proper if f(x) > −∞ for all x ∈ R^n and its domain dom(f) ≜ {x : f(x) < +∞} is nonempty.) Finally, since Q(x, ξ) = sq(χ), the result follows.

Next, we characterize the subdifferential of Q(x, ξ); this requires the Fenchel–Moreau theorem.

Theorem 4 (Fenchel–Moreau). Suppose that f is a proper, convex, and lower semicontinuous function. Then f** = f, where f*, the conjugate function of f, is defined as
    f*(z) ≜ sup_{x∈R^n} {z^T x − f(x)}.
Note that the conjugate function f* : R^n → R̄ is always convex and lower semicontinuous. Furthermore, we have that
    ∂f*(x) = arg max_{z∈R^n} {z^T x − f(z)}.

Proposition 5 (Subdifferential of Q(x, ξ)). Suppose that for a given x = x0 and for ξ ∈ Ξ, the value Q(x0, ξ) is finite. Then Q(·, ξ) is subdifferentiable at x0 and
    ∂Q(x0, ξ) = −T^T D(x0, ξ),
where D(x, ξ), the set of dual solutions, is defined as
    D(x, ξ) := arg max_{π∈Π(q)} π^T (h − T x).

Proof: Since Q(x0, ξ) is finite, Π(q) is nonempty and sq(χ) is its support function, sq(χ) ≜ sup_{π∈Π(q)} χ^T π. By the definition of a conjugate function, sq(χ) is the conjugate of the indicator function 1l_q(π), defined as
    1l_q(π) := 0 if π ∈ Π(q),  +∞ otherwise.
Since Π(q) is closed and convex, the function 1l_q(·) is convex and lower semicontinuous. Indeed,
    sq(z) = sup_{π∈Π(q)} z^T π = sup_{π} {z^T π − 1l_q(π)},
implying that sq(·) is the conjugate of 1l_q(·). Since 1l_q(·) is convex and lower semicontinuous, by the Fenchel–Moreau theorem 1l_q(·) = 1l_q**(·) (the biconjugate); since sq(·) = 1l_q*(·), it follows that 1l_q(·) = (sq(·))*. Then, for χ0 = h − T x0,
    ∂sq(χ0) = arg max_{π} {π^T χ0 − 1l_q(π)} = arg max_{π∈Π(q)} π^T χ0.
Since Π(q) is polyhedral and sq(χ0) is finite, it follows that ∂sq(χ0) is nonempty. Moreover, sq(·) is piecewise linear (shown earlier in this lecture), and by the chain rule,
    ∂_x Q(x0, ξ) = (∇_x χ0)^T ∂_{χ0} sq(χ0) = −T^T D(x0, ξ).
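A minimal sketch of Proposition 5 on toy data (the matrices and vectors below are assumptions chosen so that the recourse is complete and Π(q) is bounded): a subgradient of Q(·, ξ) at x0 is computed as −T^T π*, where π* solves D-SecLP(ξ), and the subgradient inequality is spot-checked at a few points.

import numpy as np
from scipy.optimize import linprog

q = np.array([1.0, 1.5, 1.0, 2.0])
W = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])          # W = [I, -I]: complete recourse
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])
h = np.array([2.0, 1.0])

def Q(x):
    # primal second stage: min q^T y subject to W y = h - T x, y >= 0
    return linprog(q, A_eq=W, b_eq=h - T @ x, method="highs").fun

def dual_subgradient(x):
    # dual: max pi^T (h - T x) subject to W^T pi <= q, pi free
    chi = h - T @ x
    res = linprog(-chi, A_ub=W.T, b_ub=q, bounds=(None, None), method="highs")
    return -T.T @ res.x                         # an element of -T^T D(x, xi)

x0 = np.array([0.5, 0.2])
g = dual_subgradient(x0)
for x in (np.array([0.0, 0.0]), np.array([1.0, -0.5]), np.array([-0.3, 0.8])):
    assert Q(x) >= Q(x0) + g @ (x - x0) - 1e-8  # subgradient inequality
print("subgradient obtained from the dual:", g)

Solving D-SecLP(ξ) directly, rather than reading multipliers off the primal solve, sidesteps solver-specific sign conventions for equality-constraint duals.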
Expected recourse for discrete distributions

In this section, we consider the expected cost of recourse, denoted by Q(x), where Q(x) ≜ E[Q(x, ξ)]. Suppose the distribution of ξ has finite support, so that ξ takes on a finite number of realizations ξ1, ..., ξK with associated probabilities p1, ..., pK. For a given x, yj is a solution to

SecLP(ξj):   minimize_{yj}   qj^T yj
             subject to      Tj x + Wj yj = hj,  yj ≥ 0,

where (T(ξj), W(ξj), q(ξj), h(ξj)) = (Tj, Wj, qj, hj). Since the problems in y are separable across scenarios, we may specify the expected cost of recourse as the following problem in (y1, ..., yK):

SecLP(ξ1, ..., ξK):   minimize_{y1,...,yK}   Σ_{j=1}^K pj qj^T yj
                      subject to             Tj x + Wj yj = hj,  yj ≥ 0,  j = 1, ..., K.

Note that the entire two-stage problem can then be specified as follows (the extensive form; a numerical sketch of assembling this LP appears at the end of this section):

SLP:   minimize_{x, y1,...,yK}   c^T x + Σ_{j=1}^K pj qj^T yj
       subject to                Ax = b,
                                 Tj x + Wj yj = hj,   j = 1, ..., K,
                                 x ≥ 0,  yj ≥ 0,      j = 1, ..., K.

Theorem 6 (Moreau–Rockafellar). Let fi : R^n → R̄ be proper convex functions for i = 1, ..., N, let f(·) = Σ_{i=1}^N fi(·), and let x0 be a point at which each fi(x0) is finite, i.e., x0 ∈ ∩_{i=1}^N dom fi. Then
    ∂f1(x0) + ... + ∂fN(x0) ⊂ ∂f(x0).
Furthermore, equality holds in this relationship, i.e.,
    ∂f1(x0) + ... + ∂fN(x0) = ∂f(x0),
if one of the following holds:
1. the set ∩_{i=1}^N ri(dom fi) is nonempty;
2. the functions f1, ..., fk (for some k ≤ N) are polyhedral and the intersection of ∩_{i=1}^k dom fi and ∩_{i=k+1}^N ri(dom fi) is nonempty;
3. there exists a point x̄ ∈ dom(fN) such that x̄ ∈ int(dom fi) for i = 1, ..., N − 1.

The subdifferential of Q(x) is specified by the following proposition.

Proposition 7 (Subdifferential of Q(x)). Suppose that the probability distribution of ξ has finite support, i.e., Ξ ≜ {ξ1, ..., ξK}, and the expected recourse cost is finite for at least one x̃ ∈ R^n. Then Q(x) is polyhedral and, at any x0 at which Q(x0) is finite,
    ∂Q(x0) = Σ_{j=1}^K pj ∂Q(x0, ξj).

Proof: This follows as a consequence of the Moreau–Rockafellar theorem applied to Q(·) = Σ_{j=1}^K pj Q(·, ξj), using the polyhedrality of each Q(·, ξj).
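The sketch below assembles the extensive form SLP above as one block-structured LP and solves it with scipy; all data (c, A, b, the scenarios, and the probabilities) are illustrative toy values, not from the lecture.

import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0]); A = np.array([[1.0, 1.0]]); b = np.array([1.0])
q = np.array([1.0, 1.5, 1.0, 2.0])
W = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])
# K = 3 scenarios: (T_j, h_j) vary; W and q are kept fixed here for simplicity.
Ts = [np.array([[1.0, 0.0], [0.0, 1.0]]) * s for s in (0.5, 1.0, 1.5)]
hs = [np.array([1.0, 0.5]) * s for s in (1.0, 2.0, 3.0)]
ps = [0.3, 0.4, 0.3]

n, m, K = c.size, q.size, len(ps)
# objective: c^T x + sum_j p_j q^T y_j over the stacked variables (x, y_1, ..., y_K)
obj = np.concatenate([c] + [p * q for p in ps])
# equality constraints: A x = b and T_j x + W y_j = h_j for each scenario j
rows = [np.hstack([A, np.zeros((A.shape[0], K * m))])]
for j in range(K):
    blocks = [Ts[j]] + [W if i == j else np.zeros_like(W) for i in range(K)]
    rows.append(np.hstack(blocks))
A_eq, b_eq = np.vstack(rows), np.concatenate([b] + hs)

res = linprog(obj, A_eq=A_eq, b_eq=b_eq, method="highs")   # all variables >= 0 by default
print("optimal first-stage decision:", res.x[:n], " optimal value:", res.fun)

For a large number of scenarios this dense assembly is purely illustrative; in practice one would exploit the block structure of the constraint matrix.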
Expected recourse for general distributions

Consider Q(x, ξ), where ξ : Ω → R^d. The cost of taking recourse given ξ is the minimum value of q(ω)^T y(ω).

Lemma 4. Q(x, ξ) is a random lower semicontinuous function.

Proof: This follows from noting that Q(x, ·) is measurable in ξ (given a probability space (Ω, F, P), a function ξ : Ω → [−∞, +∞] is said to be measurable if the set ξ^{−1}((a, ∞]) = {ω ∈ Ω : ξ(ω) > a} is a measurable set, i.e., an element of F). Furthermore, Q(·, ξ) is lower semicontinuous. It follows that Q(x, ξ) is a random lower semicontinuous function.

Proposition 8. Suppose either E[Q(x, ξ)+] or E[Q(x, ξ)−] is finite. Then Q(x) is well defined.

Proof: This requires verifying that Q(x, ·) is measurable with respect to the Borel sigma algebra on R^d, which follows directly from Theorem 7.37 of [SDR09].

Theorem 9. Let F : R^n × Ω → R̄ be a random lower semicontinuous function. Then the optimal value function ϑ(ω) and the optimal solution multifunction X*(ω) are both measurable, where
    ϑ(ω) ≜ inf_{y∈R^n} F(y, ω)  and  X*(ω) ≜ arg min_{x∈R^n} F(x, ω).

Next, we consider several settings in which either E[Q(x, ξ)+] or E[Q(x, ξ)−] is finite. This requires a detour through various notions of recourse.

Notions of recourse

• The two-stage problem is said to have fixed recourse if W(ω) = W for every ω ∈ Ω.
• The problem is said to have complete recourse if for every χ there exists a y ≥ 0 satisfying W y = χ. In effect, for any first-stage decision, the primal second-stage problem is feasible. More formally, pos W = R^m, where h − T x ∈ R^m.
• By employing duality, a fixed recourse is complete if and only if, for every q, the feasible set Π(q) of the second-stage dual problem is bounded (possibly empty). In short: complete recourse ⇔ the primal second-stage problem is always feasible ⇔ the dual second-stage feasible set is bounded (possibly empty).
• Given a set X, X∞ refers to the recession cone of X, defined as
    X∞ ≜ {y : x + λy ∈ X for all x ∈ X and all λ ≥ 0}.
• Boundedness of a nonempty Π(q) implies that its recession cone, denoted by Π0 = Π(0), contains only the zero element: Π0 = {0}.
• Claim: if Π(q) = {π : W^T π ≤ q} is nonempty and bounded, then Π0 contains only the zero element.
  Proof sketch: Suppose the claim is false. Then Π0 ≠ {0}, so there exists u ∈ Π0 with u ≠ 0, i.e., W^T u ≤ 0. Since Π(q) is nonempty, it contains at least one element, say π̂. It follows that
      W^T(π̂ + λu) = W^T π̂ + λ W^T u ≤ W^T π̂ ≤ q  for all λ ≥ 0.
  This implies that Π(q) is unbounded, since it contains the ray {π̂ + λu : λ ≥ 0}, a contradiction.
• A subclass of problems with fixed and complete recourse are simple recourse problems, in which the recourse decisions are either surplus or shortage decisions determined by the first-stage decisions; moreover, T and q are deterministic and q > 0. Specifically, y(ω) is partitioned into surplus and shortage parts y+(ω) and y−(ω). If [h − T x]_i ≥ 0, then y_i^+(ω) = [h − T x]_i and y_i^−(ω) = 0; similarly, if [h − T x]_i ≤ 0, then y_i^−(ω) = −[h − T x]_i and y_i^+(ω) = 0. This is compactly captured by defining W = [I  −I], leading to the system
      y+(ω) − y−(ω) = h(ω) − T x,   y+(ω), y−(ω) ≥ 0.
• The recourse is said to be relatively complete if for every x in the set X ≜ {x : Ax = b, x ≥ 0}, the feasible set of the primal second-stage problem is nonempty for a.e. ω ∈ Ω. Note that there may be events occurring with zero probability (measure) for which Q(x, ξ) = +∞; these do not affect the overall expectation.
• A sufficient condition for relatively complete recourse is the following: for every x ∈ X, Q(x, ξ) < +∞ for all ξ ∈ Ξ.
• This condition becomes necessary and sufficient in two cases: (i) the vector ξ has finite support; (ii) the recourse is fixed.
• Next, we consider an instance of a problem with random recourse and discuss some associated concerns.

Example 1 (Random recourse). Suppose Q(x, ξ) := inf{y : ξy = x, y ≥ 0}, with x ∈ [0, 1] and ξ having the density p(z) = 2z for 0 ≤ z ≤ 1. Then for ξ > 0 and x ∈ [0, 1], Q(x, ξ) = x/ξ and E[Q(x, ξ)] = 2x. For ξ = 0 and x > 0, Q(x, ξ) = +∞. Since Q(x, ξ) < +∞ for almost every ξ ∈ [0, 1], the recourse is relatively complete. Furthermore, P(ξ = 0) = 0 and E[Q(x, ξ)] < +∞ for all x ∈ [0, 1]. However, a small perturbation of the distribution may change that: for instance, a discretization of the distribution with its first point at ξ = 0 immediately leads to E[Q(x, ξ)] = +∞ for x > 0.
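A quick numerical look at Example 1 (the sampling scheme and the particular discretization below are illustrative choices): under the density p(z) = 2z, the expectation evaluates to 2x, while any discrete approximation placing mass at ξ = 0 makes it infinite.

import numpy as np

rng = np.random.default_rng(0)
x = 0.7
# Sample xi with density 2z on [0, 1] via the inverse CDF: F(z) = z^2, so xi = sqrt(U).
xi = np.sqrt(rng.uniform(size=200_000))
print(np.mean(x / xi))              # approximately 1.4 = 2 * x

# A discretization whose first point is xi = 0 (even with tiny probability):
support = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
probs = np.array([1e-6, 0.25, 0.25, 0.25, 0.25 - 1e-6])
Q_vals = np.where(support > 0, x / np.maximum(support, 1e-300), np.inf)
print(probs @ Q_vals)               # inf: E[Q(x, xi)] = +infinity for x > 0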
In effect, this problem is unstable; this issue cannot occur if the recourse is fixed.

• We now consider the support function sq associated with Π(q). Recall that
      sq(χ) = sup_{π∈Π(q)} π^T χ.
  Our goal is to find sufficient conditions for the finiteness of the expectation E[sq(χ)]. We will employ Hoffman's lemma [SDR09, Theorem 7.11]:

Lemma 5 (Hoffman's lemma). Consider the multifunction M(b) := {x : Ax ≤ b}, where A ∈ R^{m×n}. Then there exists a positive constant κ, depending on A, such that for any given x ∈ R^n and any b ∈ dom M,
    dist(x, M(b)) ≤ κ‖(Ax − b)+‖.

Lemma 6. Suppose there exists a constant κ, depending on W, such that if for some q0 the set Π(q0) is nonempty, then for every q the following inclusion holds:
    Π(q) ⊂ Π(q0) + κ‖q − q0‖ B,
where B := {π : ‖π‖ ≤ 1} and ‖·‖ denotes the Euclidean norm. (Such a κ exists by Hoffman's lemma applied to the system W^T π ≤ q.) Then the following holds:
    sq(·) ≤ sq0(·) + κ‖q − q0‖ ‖·‖.
If q0 = 0, then
    |sq(χ1) − sq(χ2)| ≤ κ‖q‖ ‖χ1 − χ2‖   for all χ1, χ2 ∈ dom(sq).

Proof: Recall that the support function of the unit ball is the Euclidean norm and that s1(·) ≤ s2(·) ⇔ C1 ⊂ C2. Taking C1 = Π(q) and C2 = Π(q0) + κ‖q − q0‖B, we have
    sq(·) ≤ s_{Π(q0)+κ‖q−q0‖B}(·)
          = s_{Π(q0)}(·) + s_{κ‖q−q0‖B}(·)      (support function of a Minkowski sum)
          = s_{Π(q0)}(·) + κ‖q − q0‖ sB(·)      (positive homogeneity of the support function)
          = sq0(·) + κ‖q − q0‖ ‖·‖.             (the support function of the unit ball is the Euclidean norm)   (1)

• Consider q0 = 0 and the set Π(0) ≜ {π : W^T π ≤ 0}. This set is a cone; denote its support function by s0 = sq0. This support function is given by (proof omitted)
      s0(χ) = 0 if χ ∈ pos W,   +∞ otherwise.
• Therefore, setting q0 = 0 in (1) allows one to claim that if Π(q) is nonempty, then sq(χ) ≤ κ‖q‖‖χ‖ for all χ ∈ pos W, and sq(χ) = +∞ if χ ∉ pos W.
• By the polyhedrality of Π(q), if Π(q) is nonempty, then sq(·) is piecewise linear on its domain dom(sq(·)) = pos W. In fact, sq(·) is Lipschitz continuous on its domain:
      |sq(χ1) − sq(χ2)| ≤ κ‖q‖‖χ1 − χ2‖   for all χ1, χ2 ∈ dom(sq).
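A minimal sketch of this dichotomy (the W and q below are toy assumptions; here pos W = R^2_+, and Π(q) is nonempty but unbounded): sq(χ) is finite for χ ∈ pos W, while the LP is unbounded, i.e., sq(χ) = +∞, for χ ∉ pos W.

import numpy as np
from scipy.optimize import linprog

W = np.eye(2)                        # pos W = R^2_+, so a chi with a negative entry lies outside
q = np.array([1.0, 2.0])             # Pi(q) = {pi : pi <= q} is nonempty (but unbounded)

def s_q(chi):
    # s_q(chi) = sup { pi^T chi : W^T pi <= q }, computed by minimizing -chi^T pi
    res = linprog(-chi, A_ub=W.T, b_ub=q, bounds=(None, None), method="highs")
    if res.status != 0:              # Pi(q) is nonempty here, so failure to solve means the LP is unbounded
        return np.inf
    return -res.fun

print(s_q(np.array([1.0, 1.0])))     # 3.0: finite since chi lies in pos W (attained at pi = q)
print(s_q(np.array([-1.0, 1.0])))    # inf: chi does not lie in pos W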
• We now provide a necessary and sufficient condition for a fixed-recourse problem to have E[Q(x, ξ)+] finite.

Proposition 10 (Necessary and sufficient condition for finiteness of E[Q(x, ξ)+]). Suppose that the recourse is fixed and that E[‖q‖‖h‖] < +∞ and E[‖q‖‖T‖] < +∞. Consider a point x ∈ R^n. Then E[Q(x, ξ)+] is finite if and only if h(ξ) − T(ξ)x ∈ pos W for almost every ξ.

Proof: (⇒) Suppose that h(ξ) − T(ξ)x ∉ pos W with positive probability; that is, there exists U ⊆ Ξ with µ(U) = P[ξ ∈ U] > 0 such that h(ξ) − T(ξ)x ∉ pos W for every ξ ∈ U. For any such ξ, Q(x, ξ) = +∞, and since Q(x, ξ)+ ≜ max(Q(x, ξ), 0) ≥ Q(x, ξ), we have Q(x, ξ)+ = +∞ with positive probability. Consequently, E[Q(x, ξ)+] = +∞.
(⇐) Suppose h(ξ) − T(ξ)x ∈ pos W with probability one. Then Q(x, ξ) = sq(h − T x). By (1), sq(χ) ≤ s0(χ) + κ‖q‖‖χ‖, and s0(χ) = 0 when χ ∈ pos W, implying sq(χ) ≤ κ‖q‖‖χ‖. By the triangle inequality, with probability one,
    Q(x, ξ) = sq(h(ξ) − T(ξ)x) ≤ κ‖q(ξ)‖(‖h(ξ)‖ + ‖T(ξ)‖‖x‖).
Since the right-hand side is nonnegative, it also bounds max(Q(x, ξ), 0) with probability one. Taking expectations,
    E[Q(x, ξ)+] ≤ κ E[‖q(ξ)‖‖h(ξ)‖] + κ E[‖q(ξ)‖‖T(ξ)‖]‖x‖.
By assumption, E[‖q‖‖h‖] < +∞ and E[‖q‖‖T‖] < +∞; as a result, E[Q(x, ξ)+] < +∞.

• Next, instead of requiring that h(ξ) − T(ξ)x ∈ pos W w.p.1, we assume that Π(q(ξ)) is nonempty w.p.1.

Proposition 11. Suppose that (i) the recourse is fixed, (ii) for a.e. q the set Π(q) is nonempty, and (iii) E[‖q‖‖h‖] < +∞ and E[‖q‖‖T‖] < +∞. Then the following hold: (a) the expectation function Q(x) is well defined and Q(x) > −∞ for all x ∈ R^n; (b) Q(x) is convex, lower semicontinuous, and Lipschitz continuous on dom Q, and its domain is a closed convex subset of R^n given by
    dom Q = {x : h − T x ∈ pos W w.p.1}.

Proof: (a): By (ii), Π(q) is nonempty with probability one. Consequently, Q(x, ξ) = sq(h(ξ) − T(ξ)x) for every x and for almost every ξ. Suppose π(q) denotes the element of Π(q) closest to zero; it exists by the closedness of Π(q), and by Hoffman's lemma there exists a constant κ such that ‖π(q)‖ ≤ κ‖q‖. By the definition of the support function,
    sq(h − T x) = sup_{π∈Π(q)} π^T (h − T x) ≥ π(q)^T (h − T x),
and for every x, the following holds with probability one:
    −π(q)^T (h − T x) ≤ ‖π(q)‖ (‖h‖ + ‖T‖‖x‖)      (Cauchy–Schwarz and the triangle inequality)
                      ≤ κ‖q‖ (‖h‖ + ‖T‖‖x‖),
so that π(q)^T (h − T x) ≥ −κ‖q‖ (‖h‖ + ‖T‖‖x‖). By (iii), E[‖q‖‖h‖] < +∞ and E[‖q‖‖T‖] < +∞, and it can be concluded that Q(·) is well defined and Q(x) > −∞ for all x ∈ R^n.

(b): Since sq(·) is lower semicontinuous (it is a support function), it follows by Fatou's lemma that Q(·) is also lower semicontinuous.

Lemma 7 (Fatou's lemma). Suppose there exists a P-integrable function g(ω) such that fn(·) ≥ g(·) for all n. Then
    lim inf_{n→∞} E[fn] ≥ E[lim inf_{n→∞} fn].
(Fatou's lemma allows the interchange of integrals and limits under assumptions on the integrand. A random variable Z : Ω → R^d is P-integrable if E[Z] is well defined and finite; recall that E[Z] is well defined if E[Z+] and E[(−Z)+] are not both +∞, in which case E[Z] = E[Z+] − E[(−Z)+].)

• Convexity of Q(x) can be argued as follows. From the convexity of Q(x, ξ) in x, we have for any x, y that
      Q(x, ξ) ≥ Q(y, ξ) + u(ξ)^T (x − y),  where u(ξ) ∈ ∂Q(y, ξ).   (2)
  Taking expectations on both sides and using the well-definedness of Q(x) for all x, we obtain
      E[Q(x, ξ)] ≥ E[Q(y, ξ)] + E[u(ξ)]^T (x − y).   (3)
  Under suitable assumptions, we may interchange subdifferential and expectation, implying that u = E[u(ξ)] ∈ E[∂Q(y, ξ)] = ∂E[Q(y, ξ)] = ∂Q(y). It follows that for any x, y,
      Q(x) ≥ Q(y) + u^T (x − y),  where u ∈ ∂Q(y).   (4)
  In other words, Q(x) is a convex function in x.

• Convexity and closedness of dom Q are consequences of the convexity and lower semicontinuity of Q. By Proposition 10, Q(x) < +∞ if and only if h(ξ) − T(ξ)x ∈ pos W with probability one, which implies that
      dom Q = {x ∈ R^n : Q(x) < +∞} = {x ∈ R^n : h(ξ) − T(ξ)x ∈ pos W with probability one}.

(c): In the remainder of the proof, we show the Lipschitz continuity of Q(x). Consider x1, x2 ∈ dom Q. Then h(ξ) − T(ξ)x1 ∈ pos W and h(ξ) − T(ξ)x2 ∈ pos W with probability one. Since Π(q) is nonempty w.p.1, the Lipschitz property of sq(·) on its domain gives, with probability one,
    |sq(h − T x1) − sq(h − T x2)| ≤ κ‖q‖‖T‖‖x1 − x2‖.
Taking expectations on both sides, we have that
    |Q(x1) − Q(x2)| ≤ E[|sq(h − T x1) − sq(h − T x2)|] ≤ κ E[‖q‖‖T‖] ‖x1 − x2‖,
where the first inequality follows from Jensen's inequality (f(E[X]) ≤ E[f(X)] for a convex function f and a random variable X) applied with the convexity of the norm. The Lipschitz continuity of Q(x) on its domain follows.
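For a general distribution, Q(x) and the expected dual quantity E[−T^T π(ξ)] can be estimated by sampling. The sketch below uses toy fixed-recourse data and a Gaussian model for h(ξ), all of which are assumptions of this note rather than the lecture's; it averages second-stage dual solutions, which, under the interchange of subdifferential and expectation discussed above, yields an estimate of a subgradient of Q at x.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
q = np.array([1.0, 1.5, 1.0, 2.0])
W = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])          # fixed recourse, pos W = R^2 (complete recourse)
T = np.array([[1.0, 1.0], [0.0, 1.0]])         # deterministic T; only h is random here

def sample_estimates(x, N=2000):
    Q_vals, subgrads = [], []
    for _ in range(N):
        h = rng.normal([2.0, 1.0], 0.5)        # h(xi) ~ Gaussian around (2, 1)
        chi = h - T @ x
        # dual second stage: max pi^T chi subject to W^T pi <= q
        res = linprog(-chi, A_ub=W.T, b_ub=q, bounds=(None, None), method="highs")
        Q_vals.append(-res.fun)                # = Q(x, xi) by LP duality
        subgrads.append(-T.T @ res.x)          # an element of the subdifferential of Q(., xi) at x
    return np.mean(Q_vals), np.mean(subgrads, axis=0)

x0 = np.array([0.5, 0.2])
Q_hat, g_hat = sample_estimates(x0)
print("estimated Q(x0):", Q_hat, " estimated subgradient of Q at x0:", g_hat)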
References

[SDR09] A. Shapiro, D. Dentcheva, and A. Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory, volume 9 of MPS/SIAM Series on Optimization. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2009.