Shabbir Ahmed Convexity and decomposition of mean-risk stochastic programs April 20, 2004. Revised February 8, 2005. Abstract. Traditional stochastic programming is risk neutral in the sense that it is concerned with the optimization of an expectation criterion. A common approach to addressing risk in decision making problems is to consider a weighted mean-risk objective, where some dispersion statistic is used as a measure of risk. We investigate the computational suitability of various mean-risk objective functions in addressing risk in stochastic programming models. We prove that the classical mean-variance criterion leads to computational intractability even in the simplest stochastic programs. On the other hand, a number of alternative mean-risk functions are shown to be computationally tractable using slight variants of existing stochastic programming decomposition algorithms. We propose decomposition-based parametric cutting plane algorithms to generate mean-risk efficient frontiers for two particular classes of mean-risk objectives. Key words. Stochastic programming, mean-risk objectives, computational complexity, decomposition, cutting plane algorithms. 1. Introduction This paper is concerned with stochastic programming problems of the form min{ E[f (x, ω)] : x ∈ X}, (1) where x ∈ Rn is a vector of decision variables; X ⊂ Rn is a non-empty set of feasible decisions; (Ω, F, P ) is a probability space with elements ω; and f : Rn × Ω 7→ R is a cost function such that f (·, ω) is convex for almost every (a.e.) ω ∈ Ω, and f (x, ·) is F-measurable and P -integrable for all x ∈ Rn . Of particular interest are instances of (1) corresponding to two-stage stochastic linear programs (cf. [2,13]), where X is a polyhedron and f (x, ω) = cT x + Q(x, ξ(ω)) (2) Q(x, ξ) = min{q T y : W y = h + T x, y ≥ 0}. (3) with Here ξ = (q, W, h, T ) represents a particular realization of the random data ξ(ω) for the linear program in (3). It is assumed that problem (3) has a finite optimal value for all x ∈ X and for a.e. ω ∈ Ω, i.e., relatively complete and sufficiently expensive recourse. School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive, Atlanta, GA 30332. E-mail: sahmed@isye.gatech.edu 2 Shabbir Ahmed Formulation (1) is risk-neutral in the sense that it is concerned with the optimization of an expectation objective. A common approach to addressing risk is to consider a weighted mean-risk criterion, where some dispersion statistic is used as a proxy for risk, i.e., min{ E[f (x, ω)] + λD[f (x, ω)] : x ∈ X}, (4) where D is a dispersion statistic, and λ is a non-negative weight to trade-off expected cost with risk. The classical Markowitz [8] mean-variance portfolio optimization model is an example of (4) where variance is used as the dispersion statistic. In this paper we investigate various mean-risk objective functions and their computational suitability in addressing risk in stochastic programming models. We prove that the mean-variance criterion leads to computational intractability even in the simplest stochastic programs. On the other hand, a number of alternative mean-risk functions are shown to be computationally tractable using slight variants of existing stochastic programming decomposition algorithms. We propose parametric algorithms to generate mean-risk efficient frontiers for two specific mean-risk objectives in the context of two-stage stochastic linear programs. 2. Complexity of mean-variance stochastic programming It is well-known that using the mean-variance criterion in two-stage stochastic linear programming, in general, leads to non-convex formulations [13,18]. In this section, we prove the computational intractability of such non-convex formulations by showing that, even for very simple two-stage stochastic linear programs, the mean-variance criterion leads to NP-hard optimization problems. Consider the class of two-stage stochastic linear programs with simple recourse (cf. [2]), i.e., problem (1) with f (x, ω) = n X j=1 cj xj + n X Qj (xj , ξj (ω)) (5) j=1 and Qj (xj , ξj ) = qj+ (xj − ξj )+ + qj− (ξj − xj )+ , (6) where (·)+ = max{·, 0}, and qj+ +qj− ≥ 0 for all j = 1, . . . , n. The following lemma provides a closed-form formula for the variance of the simple recourse function (6) when ξj (ω) has a discrete distribution. The proof follows from algebra and is omitted. Lemma 1. Consider the function Q(x, ξ) = q + (x − ξ)+ + q − (ξ − x)+ , where x, ξ ∈ R. Let ξ(ω) be a random variable with the discrete distribution ξ(ω) = ξk Convexity and decomposition of mean-risk stochastic programs 3 w.p. pk for k = 1, . . . , K, and ξ1 ≤ · · · ≤ ξK . The variance of Q(x, ξ(ω)) is then given by if x ≤ ξ1 , a V[Q(x, ξ(ω))] = bk x2 + ck x + dk if ξk−1 ≤ x ≤ ξk k = 2, . . . , K, e if x ≥ ξK , where K X a = (q − )2 pi ξi2 − K X i=1 i=1 + bk = q + q " k−1 X − 2 ck = −2 (q + )2 pi i=1 k−1 X !2 pi ξi , K X pi i=k pi ξi + (q − )2 i=1 q + dk = (q + )2 k−1 X k−1 X pi − q K X pi ξi + − K X pi i=k pi ξi2 + (q − )2 i=1 K X e = (q + )2 pi ξi2 − i=1 , i=k i=1 ! K X ! pi ξi2 − i=k !2 K X pi ξi . q − K X pi ξi − q + i=1 i=k q+ k−1 X k−1 X pi ξi − q − i=1 K X i=k pi ξi !# , !2 pi ξi , i=1 From the above lemma, it can be seen that the function V[Q(x, ξ(ω))] is piecewise convex quadratic (note that bk ≥ 0 for all k), but non-convex in general. Consequently, we encounter computational complications in using the meanvariance criterion in stochastic programs with simple recourse. Theorem 1. The mean-variance stochastic programming problem min{ E[f (x, ξ)] + λV[f (x, ξ)] : x ∈ X} (7) corresponding to the simple recourse function (5)-(6) is NP-hard for any λ > 0. Proof. Consider the Binary Integer Feasibility problem: Given an integer matrix A ∈ Zm×n , and integer vector b ∈ Zm , is there a vector x ∈ {−1, 1}n such that Ax ≤ b? (8) The binary integer feasibility problem (8) is known to be NP-complete [3]. We shall show that given any instance of (8) with n variables, we can construct a polynomial (in n) sized instance of the mean-variance stochastic program (7) for any λ > 0 such that (8) has an answer “yes” if and only if (7) has an optimal 3 objective value of (3 + 4λ + 3λ 2 )n. 4 Shabbir Ahmed An instance of (8) is given by the data pair (A, b). Given any such instance, we can construct an instance of (7) for any λ > 0 as follows. Let X = {x ∈ Rn : Ax ≤ b, − e ≤ x ≤ e}, e be a n-vector of ones, and qj+ = 1, qj− = 1, cj = 0 for all j = 1, . . . , n. Let ξj (ω) for j = 1, . . . , n be i.i.d random variables, with the distribution −3 − λ1 w.p. 41 , 0 w.p. 12 , ξj (ω) = 1 3 + λ w.p. 41 . Owing to the independence of ξj (ω), the mean-variance stochastic program (7) corresponding to the above data reduces to n X min{ (E[Qj (xj , ξj (ω))] + λV[Qj (xj , ξj (ω))]) : x ∈ X}. (9) j=1 It follows from Lemma 1 that for all j = 1, . . . , n, E[Qj (xj , ξj (ω))] + λV[Qj (xj , ξj (ω))] = 3 4λ (4λ 3 4λ (4λ + 1 + λ2 x2j + 2λ2 xj + 3λ2 ) if + 1 + λ2 x2j − 2λ2 xj + 3λ2 ) if −3 − λ1 ≤ xj ≤ 0 0 ≤ xj ≤ 3 + λ1 . 3 + 3λ Note that for all j = 1, . . . , n, E[Qj (xj , ξj (ω))]+λV[Qj (xj , ξj (ω))] ≥ (3+ 4λ 2 ) for any xj ∈ [−1, 1], with equality holding if and only if xj ∈ {−1, 1}. Thus (9) 3 has an optimal objective value of (3 + 4λ + 3λ 2 )n if and only if there exists x ∈ X n such that x ∈ {−1, 1} , i.e., problem (8) has an affirmative answer. u t In the classical setting of portfolio optimization, the function f (x, ω) = −r(ω)T x where r(ω) ∈ Rn is a random vector of returns, and X ⊂ Rn is a polyhedral set of feasible weights for the n assets in the portfolio. In this case, V[f (x, ω)] = xT Cx where C is the covariance matrix of the random vector r(ω). Consequently, (7) reduces to a deterministic (convex) quadratic program suitable for very efficient computation. For typical stochastic programs f (x, ω) is nonlinear (although convex) in x. Furthermore, the variance operator, although convex, is non-monotone. Consequently E[f (x, ω)] + λV[f (x, ω)] is not guaranteed to be convex in x, leading to the computational complication proven in Theorem 1. In the following section, we investigate mean-risk objectives that preserve convexity, hence computational tractability. 3. Tractable mean-risk objectives Given a random variable Y : Ω 7→ R, representing cost, belonging to the linear space Xp = Lp (Ω, F, P ) for p ≥ 1, a scalar λ ≥ 0, and an appropriate function D : Xp 7→ R to measure the risk associated with Y , we define a mean-risk function gλ,D : Xp 7→ R as gλ,D [Y ] = E[Y ] + λD[Y ]. (10) Convexity and decomposition of mean-risk stochastic programs 5 Using (10) to address risk in the context of the stochastic program (1), we arrive at the formulation min{φ(x) = gλ,D [f (x, ω)] : x ∈ X}. (11) From a computational viewpoint, it is desirable that the objective function φ(·) in (11) be convex. As discussed in Section 2, even though f (·, ω) is convex for a.e. ω ∈ Ω, the convexity of the composite function φ(x) = gλ,D [f (x, ω)] may not be preserved, for example, if variance is used as the measure of risk in gλ,D . 3.1. Sufficient conditions for preserving convexity We shall say that a function g : Xp 7→ R is convexity-preserving, if the composite function φ(x) = g[f (x, ω)] is convex for any function f (x, ω) such that f (·, ω) is convex for a.e. ω ∈ Ω and f (x, ·) ∈ Xp for all x ∈ Rn . Recall that a function g : Xp 7→ R is convex if g[λY1 + (1 − λ)Y2 ] ≤ λg[Y1 ] + (1 − λ)g[Y2 ], for all Y1 , Y2 ∈ Xp and λ ∈ [0, 1]; is non-decreasing if g[Y1 ] ≥ g[Y2 ] for all Y1 , Y2 ∈ Xp such that Y1 ≥ Y2 ; and is positively homogenous if g[λY ] = λg[Y ] for all Y ∈ Xp and λ ≥ 0. The following result is well-known (cf. [14]). Proposition 1. If g : Xp 7→ R is convex and non-decreasing, then g is convexitypreserving. Lemma 2. A convex and positively homogenous function g : Xp 7→ R is nondecreasing if and only if it satisfies g[Y ] ≤ 0 for all Y ≤ 0. Proof. Suppose g satisfies g[Y ] ≤ 0 for all Y ≤ 0. Let Y1 ≥ Y2 , i.e., Y2 = Y1 + ∆ for some ∆ ≤ 0. Then 1 1 2 g[Y2 ] = 2 g[Y1 + ∆] = g[ 12 Y1 + 21 ∆] ≤ 12 g[Y1 ] + 21 g[∆] ≤ 21 g[Y1 ], where the second line follows from positive-homogeneity, the third line follows from convexity, and the fourth line follows from the fact that g[∆] ≤ 0 (since ∆ ≤ 0). Thus g[·] is non-decreasing. Conversely, if g[·] is non-decreasing, then g[Y ] ≤ g[0] = 0 for all Y ≤ 0. u t The following result immediately follows from Lemma 2 and Proposition 1. Proposition 2. If g : Xp 7→ R is convex, positively homogenous, and satisfies g[Y ] ≤ 0 for all Y ≤ 0, then g is convexity-preserving. 3.2. Examples Here we show that a number of common mean-risk objectives are convexity preserving, and hence suitable for optimization. 6 Shabbir Ahmed Semideviation from a target. For Y ∈ Xp and a fixed target T ∈ R, the p-th Semideviation from T [11] is defined as ST,p [Y ] = (E[(Y − T )p+ ])1/p . Proposition 3. The mean-risk objective gλ,uT ,p [Y ] = E[Y ] + λST,p [Y ] is convexity preserving for all p ≥ 1 and λ ≥ 0. Proof. Since ST,p [Y ] is convex and non-decreasing in Y for all p ≥ 1, the result follows from Proposition 1. u t Central semideviation. For Y ∈ Xp , the p-th central semideviation [9] is defined as 1 δp [Y ] = E[(Y − EY )p+ ] p . Proposition 4. The mean-semideviation objective gλ,δp [Y ] = E[Y ] + λδp [Y ] is convexity-preserving for all p ≥ 1 and λ ∈ [0, 1]. Proof. Ogryczak and Ruszczyński [9] have shown that δp is convex for all p ≥ 1. Furthermore, it can be verified that δp is positively-homogenous. Thus gλ,δp [Y ] is convex and positively homogenous for all λ ≥ 0. Moreover, if Y ≤ 0, then (Y − EY )+ ≤ −EY . Thus gλ,δp [Y ] ≤ (1 − λ)EY ≤ 0 for all λ ≤ 1. The result then follows from Proposition 2. u t Quantile-deviation and CVaR. Given α ∈ (0, 1), the Quantile-deviation [10] for Y ∈ X1 is defined as hα [Y ] = E [(1 − α)(κα [Y ] − Y )+ + α(Y − κα [Y ])+ ] , where κα is α-quantile of the distribution of Y , i.e., Pr(Y ≤ κα [Y ]) ≥ α and Pr(Y ≥ κα [Y ]) ≥ 1 − α. In [10], it is shown that hα is equivalent to the following risk measure (defined for ε1 > 0 and ε2 > 0), hε1 ,ε2 [Y ] = min{E[ε1 (z − Y )+ + ε2 (Y − z)+ ]}, z∈R when α = ε2 /(ε1 +ε2 ). The α-Conditional value at risk [12] for Y ∈ X1 is defined as 1 E[(Y − z)+ ]}. CVaRα [Y ] = min{z + z∈R 1−α By simple algebra, CVaRα can be expressed in terms of hε1 ,ε2 as follows [14]: CVaRα [Y ] = E[Y ] + where α = ε2 /(ε1 + ε2 ). 1 hε ,ε [Y ], ε1 1 2 Convexity and decomposition of mean-risk stochastic programs 7 Proposition 5. Consider Y ∈ X1 . (i) Given ε1 > 0, ε2 > 0, the mean-risk objective gλ,hε 1 ,ε2 [Y ] = E[Y ] + λhε1 ,ε2 [Y ] is convexity preserving for all λ ∈ [0, 1/ε1 ]. (ii) Given α ∈ (0, 1), the mean-quantile deviation objective gλ,hα [Y ] = E[Y ] + λhα [Y ] is convexity preserving for all λ ∈ [0, 1/(1 − α)]. (iii) Given α ∈ (0, 1), the mean-CVaR objective gλ,CVaRα [Y ] = E[Y ] + λCVaRα [Y ] is convexity preserving for all λ ≥ 0. Proof. Ogryczak and Ruszczyński [10] have shown that hε1 ,ε2 [Y ] is convex and positively homogenous. Thus gλ,hε ,ε [Y ], gλ,hα [Y ] and gλ,CVaRα [Y ] are convex 1 2 and positively homogenous for all λ ≥ 0. Note that hε1 ,ε2 [Y ] ≤ ε1 E(−Y )+ + ε2 E(Y )+ , and if Y ≤ 0, then hε1 ,ε2 [Y ] ≤ −ε1 E[Y ]. Thus gλ,hε ,ε [Y ] ≤ (1 − 1 2 λε1 )E[Y ] ≤ 0 since λ ≤ 1/ε1 . Statement (i) then follows from Proposition 2. Statement (ii) follows from the equivalence of hα [Y ] and hε1 ,ε2 [Y ] when α = ε2 /(ε1 + ε2 ). Note that gλ,CVaRα [Y ] = (1 + λ)E[Y ] + λ/ε1 hε1 ,ε2 [Y ]. If Y ≤ 0, gλ,CVaRα [Y ] ≤ (1 + λ)E[Y ] − λE[Y ] = E[Y ] ≤ 0 for all λ. Statement (iii) then follows. u t Gini mean difference. For Y ∈ X1 with distribution function FY , the Gini mean difference [19] is defined as Z Γ [Y ] = E[(ξ − Y )+ ]dFY (ξ). Proposition 6. The mean-Gini mean difference objective gλ,Γ [Y ] = E[Y ] + λΓ [Y ] is convexity preserving for all λ ∈ [0, 1]. Proof. Ogryczak and Ruszczyński [10] have shown that Γ [·] is convex and positivelyhomogenous. Thus gλ,Γ [·] is convex and positively-homogenous for all λ ≥ 0. Moreover, if Y ≤ 0 then Z Z E[(ξ − Y )+ ]dFY (ξ) ≤ −E[Y ]dFY (ξ) = −E[Y ]. Thus gλ,Γ [Y ] ≤ (1 − λ)E[Y ] ≤ 0 for all λ ≤ 1. The result then follows from Proposition 2. u t 8 Shabbir Ahmed The non-decreasing, hence convexity preserving, property of gλ,δp , gλ,hα (or, equivalently, gλ,hε ,ε ) , and gλ,Γ may not hold for λ > 1, λ > 1/(1 − α) (or, 1 2 λ > 1/ε1 ), and λ > 1, respectively. However, as shown in [9,10], these meanrisk objectives are guaranteed to be consistent with standard stochastic ordering rules only if λ ∈ (0, 1), λ ∈ (0, 1/(1 − α)) (or, λ ∈ (0, 1/ε1 )) and λ ∈ (0, 1), respectively1 . 4. Solving mean-risk stochastic programs In this section, we discuss methods for solving stochastic programs with convexity preserving mean-risk objectives. We provide specific details for two particular classes of mean-risk stochastic programs, those that involve the semideviation risk measure δp with p = 1, i.e., δ1 [Y ] = E[(Y − EY )+ ] which is referred to as the absolute semideviation (ASD), and those that involve the quantile deviation (QDEV) risk measure hε1 ,ε2 [Y ] = min{E[ε1 (z − Y )+ + ε2 (Y − z)+ ]}, z∈R for ε1 > 0 and ε2 > 0. 4.1. Deterministic equivalent formulations Consider the stochastic program (1) where f (x, ω) is given as the value function of a second-stage optimization problem f (x, ω) = min{F0 (x, y, ω) : Fi (x, y, ω) ≤ 0 i = 1, . . . , m, y ∈ Y }. Here we make the necessary assumptions on Fi for i = 0, . . . , m and Y for f (x, ω) to be finite-valued, F-measurable and P -integrable (cf. [13]). The twostage stochastic linear programming objective function (2)-(3) is a special case. If the mean-risk objective is non-decreasing, then the corresponding mean-risk stochastic program (11) can be written as: min gλ,D [F0 (x, y(ω), ω)] x,y(ω) s.t. x ∈ X Fi (x, y(ω), ω) ≤ 0 i = 1, . . . , m for a.e. ω ∈ Ω. y(ω) ∈ Y (12) When Ω is finite, problem (12) is a large-scale deterministic optimization problem. Furthermore, if Fi for i = 0, 1, . . . , m are convex functions of the decision variables x and y(ω) for a.e. ω ∈ Ω, and the sets X and Y are convex, then (12) is a convex program. 1 Note that in our minimization setting smaller values of the random variables are preferable to larger, whereas in the maximization setting of [10], larger values are preferable. Consequently, for the quantile deviation risk function, [10] shows consistency with stochastic ordering for λ ∈ (0, 1/α) Convexity and decomposition of mean-risk stochastic programs 4.1.1. The Mean-ASD objective. 9 It can be verified that gλ,δ1 [Y ] = (1 − λ)µ[Y ] + λν[Y ], (13) where µ[Y ] = E[Y ] and ν[Y ] = E[max{Y, EY }]. The deterministic equivalent (12) of the mean-ASD stochastic program is then (cf. [16]): R R min (1 − λ) F0 (x, y(ω), ω)dP (ω) + λ ν(ω)dP (ω) x,y(ω),ν(ω) s.t. x∈X Fi (x, y(ω), ω) ≤ 0 i = 1, . . . , m ν(ω) ≥ F R 0 (x, y(ω), ω) for a.e. ω ∈ Ω. ν(ω) ≥ F0 (x, y(ξ), ξ)dP (ξ) y(ω) ∈ Y (14) Note that, if λ ∈ [0, 1], Fi for i = 0, 1, . . . , m are convex functions of the decision variables x and y(ω) for a.e. ω ∈ Ω, and the sets X and Y are convex, then the objective function and constraints of (14) are all convex. In case of two-stage stochastic linear programs, the functions Fi for i = 0, . . . , m are linear and the sets X and Y are polyhedral. Then, in case of a finite set Ω of scenarios, (14) is a large-scale linear program. Unfortunately, this linear program does not posses the dual-block angular structure that is natural in the deterministic equivalents of standard two-stage stochastic linear programs [16]. Consequently, decomposition methods (such as the Benders or L-shaped algorithm) (cf. Chapter 3 of [13]) used for solving standard two-stage stochastic linear programs cannot be directly applied. In the next section, we show that problem decomposition can be achieved by using slight variations of these methods. 4.1.2. The Mean-QDEV objective. written as gλ,hε 1 ,ε2 The mean-QDEV objective function can be [Y ] = E[Y ] + λ min{E[ε1 (z − Y )+ + ε2 (Y − z)+ ]} z∈R = min{E[Y + λε1 (z − Y )+ + λε2 (Y − z)+ ]} z∈R = min{E[Y + λε1 (Y − z)+ − λε1 (Y − z) + λε2 (Y − z)+ ]} z∈R = min{λε1 z + (1 − λε1 )E[Y ] + λ(ε1 + ε2 )E[(Y − z)+ ]}. z∈R (15) From the fact that E[Y ] and E[(Y − z)+ ] are non-decreasing in Y and λ ∈ [0, 1/ε1 ], it follows that deterministic equivalent (12) of the mean-QDEV stochastic program is R R min λε1 z + (1 − λε1 ) F0 (x, y(ω), ω)dP (ω) + λ(ε1 + ε2 ) ν(ω)dP (ω) z,x,y(ω),ν(ω) s.t. z ∈ R, x ∈ X Fi (x, y(ω), ω) ≤ 0 i = 1, . . . , m ν(ω) ≥ F0 (x, y(ω), ω) − z for a.e. ω ∈ Ω. ν(ω) ≥ 0, y(ω) ∈ Y (16) 10 Shabbir Ahmed If Fi for i = 0, 1, . . . , m are convex functions of the decision variables x and y(ω) for a.e. ω ∈ Ω, and the sets X and Y are convex, then the objective function and constraints of (16) are all convex. Moreover, in case of two-stage stochastic linear programs, with a finite set Ω of scenarios, (16) is a large-scale linear program with dual-block angular structure and is directly amenable to standard decomposition methods [17]. In the following two sections, we propose such a decomposition algorithm and its parametric extension to efficiently generate the entire mean-QDEV efficient frontier. 4.2. Cutting plane methods When f (·, ω) is convex for a.e. ω ∈ Ω and the mean-risk function gλ,D [·] is convexity-preserving, the mean-risk stochastic program (11) involves minimizing a convex (often non-smooth) objective function φ(x) = gλ,D [f (x, ω)]. Effective solution schemes for such problems are subgradient-based methods, such as various cutting plane algorithms, bundle methods, or level methods [4,5]. Given a candidate solution x ∈ Rn , these methods require the calculation of a subgradient s of the composite function φ(·) = gλ,D [f (·, ω)] at x, i.e., s ∈ ∂φ(x). Note that a subgradient of gλ,D always exists if gλ,D is real-valued, convex, and continuous. Consequently, given subgradients of f and gλ,D , a subgradient of φ(·) can be calculated using the chain rule of subdifferentiation. Subgradient formulas for various mean-risk objectives, including mean-ASD and mean-QDEV, are derived in detail in [14]. These subgradient formulas can be used for the optimization of the mean-risk stochastic program (11) using, for example, a generic cutting plane algorithm, such as Algorithm 1, or some variant of it. Algorithm 1 A cutting plane algorithm for the mean-risk stochastic program (11). 1: let ≥ 0 be a pre-specified tolerance; set i = 1, LB = −∞ and U B = +∞. 2: while U B − LB ≥ do 3: solve the following master problem LB = min{θ : x ∈ X, θ ≥ φ(xj ) + (sj )T (x − xj ), j = 1, . . . , i − 1}; x,θ 4: 5: 6: 7: 8: 9: and let xi be its optimal solution (or any feasible solution if an optimal does not exist). for ω ∈ Ω do compute f (xi , ω) and π i (ω) ∈ ∂f (xi , ω). end for compute φ(xi ) and si ∈ ∂φ(xi ), from f (xi , ω) and π i (ω). set U B = min{U B, φ(xi )}. end while In case of two-stage stochastic linear programs of the form (2)-(3), a subgradient π(ω) ∈ ∂f (x, ω) is given by c + T T ϑ(ω) where ϑ(ω) is a dual optimal solution to the second-stage linear program (3) for given x and realization ω. When Ω is finite, the second-stage subproblems corresponding to a given x can be solved independently for each realization ω ∈ Ω allowing for a computationally convenient decomposition. The optimal objective values and the dual solutions Convexity and decomposition of mean-risk stochastic programs 11 for the subproblems can then be used to compute the function value and its subgradient. This scheme is a slight variation of the well-known L-shaped (or Benders) decomposition method for solving standard two-stage stochastic linear programs involving an expected value objective. If the mean-risk function is polyhedral (as in case of the mean-ASD and mean-QDEV objectives), then Algorithm 1 is guaranteed to terminate in a finite number of iterations with an -optimal solution for any ≥ 0. 4.3. Parametric cutting plane algorithms for mean-ASD and mean-QDEV stochastic linear programs Often, it is necessary to solve the mean-risk stochastic program (11) for many different values of the mean-risk tradeoff parameter λ so as to trace out the mean-risk efficient frontier. This can, of course, be accomplished by repeatedly solving the problem for different values of λ. However, more efficient parametric optimization schemes may be possible. For the case when f (x, ω) is linear in x, as in the portfolio optimization setting, a parametric simplex algorithm for generating mean-ASD and mean-QDEV frontiers has been proposed in [15]. In this section, we propose parametric cutting plane algorithms to generate mean-ASD and mean-QDEV frontiers when f (x, ω) corresponds to two-stage stochastic linear programs. Note that in Algorithm 1, the tradeoff parameter λ is embedded within the subgradient si calculated in step 7, and hence in the cut coefficients. As such, this algorithm is not suitable for an efficient parametric analysis of the mean-risk model with respect to λ. Recall from (13), that the mean-ASD stochastic program is o n (17) min gλ,δ1 [f (x, ω)] = (1 − λ)µ[f (x, ω)] + λν[f (x, ω)] : x ∈ X , where µ[Y ] = EY and ν[Y ] = E[max{Y, EY }]. Note that both µ and ν are convex non-decreasing functions, and so the composite functions µ[f (x, ω)] and ν[f (x, ω)] are convex in x that can be approximated by subgradient inequalities (cuts) within a cutting plane scheme. If we use a cutting plane scheme for (17) where separate cuts are used to approximate the functions µ and ν, then the cut coefficients are independent of the tradeoff weight λ which only appears in the objective function of the master problem. Thus the cuts are valid for any λ ∈ [0, 1]. If X is polyhedral, then the master problem is a linear program for which a parametric analysis with respect to the objective coefficient λ can be easily carried out to detect the range of λ for which the current master problem basis remains optimal. We can then choose a λ outside this range and reoptimize. In this way we can construct the efficient frontier for the entire range of λ ∈ [0, 1]. This parametric modification of the cutting plane Algorithm 1 is summarized in Algorithm 2. From (15), we have that the mean-QDEV stochastic program is n min gλ,hε ,ε [f (x, ω)] = min{λε1 z + (1 − λε1 )µ[f (x, ω)] 1 2 z∈R o (18) +λ(ε1 + ε2 )ν[f (x, ω), z]} : x ∈ X , 12 Shabbir Ahmed Algorithm 2 A parametric cutting plane algorithm for the mean-ASD stochastic program (17) (the set X is polyhedral). 1: let ≥ 0 be a pre-specified tolerance; set λ = 0, i = 1, LB = −∞ and U B = +∞. 2: while U B − LB ≥ do 3: solve the following master linear program LB = min {(1 − λ)θ + λη : x ∈ X, θ ≥ µ(xj ) + (uj )T (x − xj ) j = 1, . . . , i − 1, x,θ,η η ≥ ν(xj ) + (v j )T (x − xj ) j = 1, . . . , i − 1}; 4: 5: 6: 7: 8: 9: 10: 11: 12: and let xi be its optimal solution (or any feasible solution if an optimal does not exist). for ω ∈ Ω do compute f (xi , ω) and π i (ω) ∈ ∂f (xi , ω). end for compute µ(xi ) = E[f (xi , ω)] and a subgradient ui = E[π i (ω)]. compute ν(xi ) = E[max{f (xi , ω), µ(xi )}] and a subgradient v i = E[ι(ω)π i (ω)] + ui E[1 − ι(ω)] where ι(ω) = 1 if f (xi , ω) > µ(xi ) and 0 otherwise. set U B = min{U B, (1 − λ)µ(xi ) + λν(xi )}. end while use parametric analysis on the master problem to find the range [λ, λ∗ ] for which the current master problem basis remains optimal. if λ∗ < 1, set λ = min{1, λ∗ + δ} (where δ > 0 is very small), LB = −∞ and U B = +∞ and return to step 2. Algorithm 3 A parametric cutting plane algorithm for the mean-QDEV stochastic program (18) (the set X is polyhedral). 1: let ≥ 0 be a pre-specified tolerance; set λ = 0, i = 1, LB = −∞ and U B = +∞. 2: while U B − LB ≥ do 3: solve the following master linear program LB = min {λε1 z + (1 − λε1 )θ + λ(ε1 + ε2 )η : z ∈ R, x ∈ X, z,x,θ,η θ ≥ µ(xj ) + (uj )T (x − xj ) j = 1, . . . , i − 1, η ≥ ν(z j , xj ) + (vzj )T (z − z j ) + (vxj )T (x − xj ) j = 1, . . . , i − 1}; 4: 5: 6: 7: 8: 9: 10: 11: 12: and let (z i , xi ) be its optimal solution (or any feasible solution if an optimal does not exist). for ω ∈ Ω do compute f (xi , ω) and π i (ω) ∈ ∂f (xi , ω). end for compute µ(xi ) = E[f (xi , ω)] and a subgradient ui = E[π i (ω)]. compute ν(xi , z i ) = E[(f (xi , ω) − z i )+ ], a subgradient with respect to x: vxi = E[ι(ω)π i (ω)], and a subgradient with respect to z: vzi = −E[ι(w)], where ι(ω) = 1 if f (xi , ω) > z i and 0 otherwise. set U B = min{U B, λε1 z i + (1 − λε1 )µ(xi ) + (λε1 + ε2 )ν(xi , z i )}. end while use parametric analysis on the master problem to find the range [λ, λ∗ ] for which the current master problem basis remains optimal. if λ∗ < 1/ε1 , set λ = min{1/ε1 , λ∗ + δ} (where δ > 0 is very small), LB = −∞ and U B = +∞ and return to step 2. where µ[Y ] = EY and ν[Y, z] = E[(Y − z)+ ]. Note that both µ and ν are convex non-decreasing in Y , and so the composite functions µ[f (x, ω)] and ν[f (x, ω), z] are convex in x. Moreover ν is also convex in z. Thus, µ and ν can be approx- Convexity and decomposition of mean-risk stochastic programs 13 imated separately by cutting planes whose coefficients are independent of λ. Consequently, we can develop a parametric cutting plane algorithm similar to Algorithm 2 for the mean-QDEV objective. A key difference is that the optimization is with respect to x and z. The details are provided in Algorithm 3. 4.4. Computational illustration We implemented enhanced versions of Algorithms 2 and 3 to generate the meanASD frontier and the mean-QDEV frontier, respectively. Our implementations extend the basic schemes of Algorithm 2 and Algorithm 3 with `∞ –trust-region based regularizations as described in [6]. We used the GNU Linear Programming Kit (GLPK) [7] library routines to solve linear programming subproblems. All computations were carried out on a Linux workstation with dual 2.4 GHz Intel Xeon processors and 2 GB RAM. We considered a 100 scenario instance of the standard stochastic programming test problem gbd. Data for this instance is available from the website: http://www.cs.wisc.edu/∼swright/stochastic/sampling Figures 1 and 2 show the mean-ASD efficient frontier obtained by Algorithm 2 and the mean-QDEV efficient frontier (corresponding to ε1 = 1 and ε2 = 9, i.e., α = 0.9) obtained by Algorithm 3, respectively. In both cases, the parametric strategy was significantly more efficient than resolving the problem from scratch for different values of λ. For the mean-ASD problem, 5 unique solutions were identified for λ ∈ [0, 1]. For the mean-QDEV, 18 unique solutions were identified for λ ∈ [0, 10]. It is evident that the mean-QDEV model offers more flexibility in the mean-risk trade-off than the mean-ASD model. Computational results using Algorithm 2 on additional stochastic programming test problems are provided in [1]. Acknowledgements This research has been supported by the National Science Foundation under CAREER Award number DMII-0133943. The author is grateful to two anonymous reviewers and Alexander Shapiro for constructive comments. 14 Shabbir Ahmed 1755.6 1755.4 1755.2 1755 E[f(x,w)] 1754.8 1754.6 1754.4 1754.2 1754 1753.8 1753.6 300 300.5 301 301.5 302 302.5 303 303.5 304 304.5 180 200 ASD Fig. 1. Mean-ASD frontier for gbd 3200 3000 2800 E[f(x,w)] 2600 2400 2200 2000 1800 1600 0 20 40 60 80 100 QDEV 120 140 160 Fig. 2. Mean-QDEV frontier for gbd (α = 0.9) Convexity and decomposition of mean-risk stochastic programs 15 References 1. S. Ahmed. Mean-risk objectives in stochastic programming. Technical report, Georgia Institute of Technology, 2004. E-print available at www.optimization-online.org. 2. J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, New York, NY, 1997. 3. M.R. Garey and D.S. Johnson. Computers and intractability: A guide to the theory of NP-completeness. W. H. Freeman and Co., 1979. 4. J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms II. Springer-Verlag, Berlin Heidelberg, 1996. 5. C. Lemaréchal, A. Nemirovski, and Yu. Nesterov. New variants of bundle methods. Mathematical Programming, 69:111–148, 1995. 6. J. Linderoth and S. Wright. Decomposition algorithms for stochastic programming on a computational grid. Computational Optimization and Applications, 24:207–250, 2003. 7. A. Makhorin. GNU Linear Progamming Kit, Reference Manual, Version 3.2.3. http://www.gnu.org/software/glpk/glpk.html, 2002. 8. H. M. Markowitz. Portfolio Selection: Efficient Diversification of Investments. John Wiley & Sons, 1959. 9. W. Ogryczak and A. Ruszczyński. On consistency of stochastic dominance and meansemideviation models. Mathematical Programming, 89:217–232, 2001. 10. W. Ogryczak and A. Ruszczyński. Dual stochastic dominance and related mean-risk models. SIAM Journal on Optimization, 13:60–78, 2002. 11. R. B. Porter. Semivariance and stochastic dominance. American Economic Review, 64:200–204, 1974. 12. R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. The Journal of Risk, 2:21–41, 2000. 13. A. Ruszczyński and A. Shapiro, editors. Stochastic Programming, volume 10 of Handbooks in Operations Research and Management Science. North-Holland, 2003. 14. A. Ruszczyński and A. Shapiro. Optimization of convex risk functions. Submitted for publication. E-print available at www.optimization-online.org, 2004. 15. A. Ruszczyński and R. Vanderbei. Frontiers of stochastically nondominated portfolios. Econometrica, 71:1287–1297, 2004. 16. R. Schultz. Stochastic programming with integer variables. Mathematical Programming, 97:285–309, 2003. 17. A. Shapiro and S. Ahmed. On a class of minimax stochastic programs. SIAM Journal on Optimization, 14:1237–1249, 2004. 18. S. Takriti and S. Ahmed. On robust optimization of two-stage systems. Mathematical Programming, 99:109–126, 2004. 19. S. Yitzhaki. Stochastic dominance, mean variance, and Gini’s mean difference. American Economic Review, 72(1):178–185, 1982.