Convexity and decomposition of mean-risk stochastic programs

Shabbir Ahmed
April 20, 2004. Revised February 8, 2005.
Abstract. Traditional stochastic programming is risk neutral in the sense that it is concerned with the optimization of an expectation criterion. A common approach to addressing
risk in decision making problems is to consider a weighted mean-risk objective, where some
dispersion statistic is used as a measure of risk. We investigate the computational suitability
of various mean-risk objective functions in addressing risk in stochastic programming models.
We prove that the classical mean-variance criterion leads to computational intractability even
in the simplest stochastic programs. On the other hand, a number of alternative mean-risk
functions are shown to be computationally tractable using slight variants of existing stochastic
programming decomposition algorithms. We propose decomposition-based parametric cutting
plane algorithms to generate mean-risk efficient frontiers for two particular classes of mean-risk
objectives.
Key words. Stochastic programming, mean-risk objectives, computational complexity, decomposition, cutting plane algorithms.
1. Introduction
This paper is concerned with stochastic programming problems of the form
min{ E[f (x, ω)] : x ∈ X},
(1)
where x ∈ Rn is a vector of decision variables; X ⊂ Rn is a non-empty set
of feasible decisions; (Ω, F, P ) is a probability space with elements ω; and f :
Rn × Ω → R is a cost function such that f(·, ω) is convex for almost every
(a.e.) ω ∈ Ω, and f(x, ·) is F-measurable and P-integrable for all x ∈ Rn.
Of particular interest are instances of (1) corresponding to two-stage stochastic
linear programs (cf. [2,13]), where X is a polyhedron and
f(x, ω) = c^T x + Q(x, ξ(ω))    (2)

with

Q(x, ξ) = min{ q^T y : W y = h + T x, y ≥ 0 }.    (3)
Here ξ = (q, W, h, T ) represents a particular realization of the random data ξ(ω)
for the linear program in (3). It is assumed that problem (3) has a finite optimal
value for all x ∈ X and for a.e. ω ∈ Ω, i.e., the recourse is relatively complete and sufficiently expensive.
School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive,
Atlanta, GA 30332. E-mail: sahmed@isye.gatech.edu
Formulation (1) is risk-neutral in the sense that it is concerned with the
optimization of an expectation objective. A common approach to addressing risk
is to consider a weighted mean-risk criterion, where some dispersion statistic is
used as a proxy for risk, i.e.,
min{ E[f (x, ω)] + λD[f (x, ω)] : x ∈ X},
(4)
where D is a dispersion statistic, and λ is a non-negative weight to trade-off
expected cost with risk. The classical Markowitz [8] mean-variance portfolio
optimization model is an example of (4) where variance is used as the dispersion
statistic.
In this paper we investigate various mean-risk objective functions and their
computational suitability in addressing risk in stochastic programming models.
We prove that the mean-variance criterion leads to computational intractability even in the simplest stochastic programs. On the other hand, a number of
alternative mean-risk functions are shown to be computationally tractable using slight variants of existing stochastic programming decomposition algorithms.
We propose parametric algorithms to generate mean-risk efficient frontiers for
two specific mean-risk objectives in the context of two-stage stochastic linear
programs.
2. Complexity of mean-variance stochastic programming
It is well-known that using the mean-variance criterion in two-stage stochastic
linear programming, in general, leads to non-convex formulations [13,18]. In this
section, we prove the computational intractability of such non-convex formulations by showing that, even for very simple two-stage stochastic linear programs,
the mean-variance criterion leads to NP-hard optimization problems.
Consider the class of two-stage stochastic linear programs with simple recourse (cf. [2]), i.e., problem (1) with
$$
f(x,\omega) = \sum_{j=1}^{n} c_j x_j + \sum_{j=1}^{n} Q_j(x_j, \xi_j(\omega))
\tag{5}
$$

and

$$
Q_j(x_j, \xi_j) = q_j^+ (x_j - \xi_j)_+ + q_j^- (\xi_j - x_j)_+,
\tag{6}
$$
where (·)_+ = max{·, 0} and q_j^+ + q_j^- ≥ 0 for all j = 1, . . . , n. The following lemma
provides a closed-form formula for the variance of the simple recourse function (6)
when ξj (ω) has a discrete distribution. The proof follows from algebra and is
omitted.
Lemma 1. Consider the function Q(x, ξ) = q^+ (x − ξ)_+ + q^- (ξ − x)_+, where x, ξ ∈ R. Let ξ(ω) be a random variable with the discrete distribution ξ(ω) = ξ_k w.p. p_k for k = 1, . . . , K, and ξ_1 ≤ · · · ≤ ξ_K. The variance of Q(x, ξ(ω)) is then given by

$$
\mathbb{V}[Q(x,\xi(\omega))] =
\begin{cases}
a & \text{if } x \le \xi_1,\\
b_k x^2 + c_k x + d_k & \text{if } \xi_{k-1} \le x \le \xi_k,\ k = 2,\dots,K,\\
e & \text{if } x \ge \xi_K,
\end{cases}
$$

where

$$
a = (q^-)^2\left[\sum_{i=1}^{K} p_i \xi_i^2 - \Big(\sum_{i=1}^{K} p_i \xi_i\Big)^2\right],
$$

$$
b_k = (q^+ + q^-)^2 \Big(\sum_{i=1}^{k-1} p_i\Big)\Big(\sum_{i=k}^{K} p_i\Big),
$$

$$
c_k = -2\left[(q^+)^2 \sum_{i=1}^{k-1} p_i \xi_i + (q^-)^2 \sum_{i=k}^{K} p_i \xi_i
- \Big(q^+ \sum_{i=1}^{k-1} p_i - q^- \sum_{i=k}^{K} p_i\Big)\Big(q^+ \sum_{i=1}^{k-1} p_i \xi_i - q^- \sum_{i=k}^{K} p_i \xi_i\Big)\right],
$$

$$
d_k = (q^+)^2 \sum_{i=1}^{k-1} p_i \xi_i^2 + (q^-)^2 \sum_{i=k}^{K} p_i \xi_i^2
- \Big(q^+ \sum_{i=1}^{k-1} p_i \xi_i - q^- \sum_{i=k}^{K} p_i \xi_i\Big)^2,
$$

$$
e = (q^+)^2\left[\sum_{i=1}^{K} p_i \xi_i^2 - \Big(\sum_{i=1}^{K} p_i \xi_i\Big)^2\right].
$$
From the above lemma, it can be seen that the function V[Q(x, ξ(ω))] is piecewise convex quadratic (note that b_k ≥ 0 for all k), but non-convex in general. Consequently, we encounter computational complications in using the mean-variance criterion in stochastic programs with simple recourse.
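To make the non-convexity concrete, the following short Python sketch (the three-point distribution and the unit penalties below are illustrative choices, not data from the paper) evaluates V[Q(x, ξ(ω))] by direct enumeration and exhibits a violation of the midpoint inequality that convexity would require.

```python
import numpy as np

def Q(x, xi, q_plus=1.0, q_minus=1.0):
    """Simple recourse function (6): q+ (x - xi)_+ + q- (xi - x)_+."""
    return q_plus * np.maximum(x - xi, 0.0) + q_minus * np.maximum(xi - x, 0.0)

# Illustrative three-point distribution for xi(w) (not from the paper).
xi = np.array([-2.0, 0.0, 2.0])
p = np.array([0.25, 0.50, 0.25])

def variance_of_Q(x):
    vals = Q(x, xi)                    # scenario values Q(x, xi_k)
    mean = p @ vals
    return p @ (vals - mean) ** 2      # V[Q(x, xi(w))] by enumeration

# Convexity would require V(midpoint) <= average of the endpoint values.
x1, x2 = -1.0, 1.0
v1, v2, vmid = variance_of_Q(x1), variance_of_Q(x2), variance_of_Q(0.5 * (x1 + x2))
print(f"V at x = {x1}: {v1:.3f}, at x = {x2}: {v2:.3f}, at midpoint: {vmid:.3f}")
print("midpoint inequality violated (non-convex):", vmid > 0.5 * (v1 + v2))
```

Running this sketch gives variances of 0.75 at the endpoints and 1.0 at the midpoint, so the midpoint value strictly exceeds the average of the endpoint values.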
Theorem 1. The mean-variance stochastic programming problem
min{ E[f (x, ξ)] + λV[f (x, ξ)] : x ∈ X}
(7)
corresponding to the simple recourse function (5)-(6) is NP-hard for any λ > 0.
Proof. Consider the Binary Integer Feasibility problem:
Given an integer matrix A ∈ Zm×n , and integer vector b ∈ Zm ,
is there a vector x ∈ {−1, 1}n such that Ax ≤ b?
(8)
The binary integer feasibility problem (8) is known to be NP-complete [3]. We
shall show that given any instance of (8) with n variables, we can construct a
polynomial (in n) sized instance of the mean-variance stochastic program (7) for
any λ > 0 such that (8) has an answer “yes” if and only if (7) has an optimal objective value of (3 + 3/(4λ) + 3λ/2)n.
An instance of (8) is given by the data pair (A, b). Given any such instance,
we can construct an instance of (7) for any λ > 0 as follows. Let X = {x ∈ Rn : Ax ≤ b, −e ≤ x ≤ e}, where e is the n-vector of ones, and let q_j^+ = 1, q_j^- = 1, c_j = 0 for all j = 1, . . . , n. Let ξ_j(ω) for j = 1, . . . , n be i.i.d. random variables with the distribution

$$
\xi_j(\omega) =
\begin{cases}
-3 - \tfrac{1}{\lambda} & \text{w.p. } \tfrac14,\\[0.5ex]
0 & \text{w.p. } \tfrac12,\\[0.5ex]
3 + \tfrac{1}{\lambda} & \text{w.p. } \tfrac14.
\end{cases}
$$
Owing to the independence of ξj (ω), the mean-variance stochastic program (7)
corresponding to the above data reduces to

$$
\min\Big\{ \sum_{j=1}^{n} \big( \mathbb{E}[Q_j(x_j,\xi_j(\omega))] + \lambda \mathbb{V}[Q_j(x_j,\xi_j(\omega))] \big) : x \in X \Big\}.
\tag{9}
$$
It follows from Lemma 1 that for all j = 1, . . . , n,
$$
\mathbb{E}[Q_j(x_j,\xi_j(\omega))] + \lambda \mathbb{V}[Q_j(x_j,\xi_j(\omega))] =
\begin{cases}
\dfrac{3}{4\lambda}\big(4\lambda + 1 + \lambda^2 x_j^2 + 2\lambda^2 x_j + 3\lambda^2\big) & \text{if } -3 - \tfrac{1}{\lambda} \le x_j \le 0,\\[1.5ex]
\dfrac{3}{4\lambda}\big(4\lambda + 1 + \lambda^2 x_j^2 - 2\lambda^2 x_j + 3\lambda^2\big) & \text{if } 0 \le x_j \le 3 + \tfrac{1}{\lambda}.
\end{cases}
$$

Note that for all j = 1, . . . , n, E[Q_j(x_j, ξ_j(ω))] + λV[Q_j(x_j, ξ_j(ω))] ≥ 3 + 3/(4λ) + 3λ/2 for any x_j ∈ [−1, 1], with equality holding if and only if x_j ∈ {−1, 1}. Thus (9) has an optimal objective value of (3 + 3/(4λ) + 3λ/2)n if and only if there exists x ∈ X such that x ∈ {−1, 1}^n, i.e., problem (8) has an affirmative answer. ⊓⊔
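As a numerical sanity check on this reduction (not part of the proof), the sketch below evaluates E[Q_j] + λV[Q_j] by enumeration for the three-point distribution above and compares the minimum over [−1, 1] against the claimed threshold 3 + 3/(4λ) + 3λ/2; the value of λ is an arbitrary illustrative choice.

```python
import numpy as np

lam = 0.7                                         # arbitrary illustrative lambda > 0
xi = np.array([-3.0 - 1.0 / lam, 0.0, 3.0 + 1.0 / lam])
p = np.array([0.25, 0.50, 0.25])

def mean_plus_lam_var(x):
    q = np.abs(x - xi)                            # q+ = q- = 1 and c = 0, so Q = |x - xi|
    m = p @ q
    return m + lam * (p @ (q - m) ** 2)

threshold = 3.0 + 3.0 / (4.0 * lam) + 3.0 * lam / 2.0
grid = np.linspace(-1.0, 1.0, 201)
vals = np.array([mean_plus_lam_var(x) for x in grid])
print("min over [-1, 1] :", round(vals.min(), 6), "at x =", grid[vals.argmin()])
print("claimed threshold:", round(threshold, 6))
print("value at x = 0   :", round(mean_plus_lam_var(0.0), 6))
```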
In the classical setting of portfolio optimization, the function f (x, ω) =
−r(ω)T x where r(ω) ∈ Rn is a random vector of returns, and X ⊂ Rn is a
polyhedral set of feasible weights for the n assets in the portfolio. In this case,
V[f (x, ω)] = xT Cx where C is the covariance matrix of the random vector r(ω).
Consequently, (7) reduces to a deterministic (convex) quadratic program suitable for very efficient computation. For typical stochastic programs f (x, ω) is
nonlinear (although convex) in x. Furthermore, the variance operator, although
convex, is non-monotone. Consequently E[f (x, ω)] + λV[f (x, ω)] is not guaranteed to be convex in x, leading to the computational complication proven in
Theorem 1. In the following section, we investigate mean-risk objectives that
preserve convexity, hence computational tractability.
3. Tractable mean-risk objectives
Given a random variable Y : Ω → R, representing cost, belonging to the linear space Xp = Lp(Ω, F, P) for p ≥ 1, a scalar λ ≥ 0, and an appropriate function D : Xp → R to measure the risk associated with Y, we define a mean-risk function gλ,D : Xp → R as
gλ,D [Y ] = E[Y ] + λD[Y ].
(10)
Using (10) to address risk in the context of the stochastic program (1), we arrive
at the formulation
min{φ(x) = gλ,D [f (x, ω)] : x ∈ X}.
(11)
From a computational viewpoint, it is desirable that the objective function φ(·)
in (11) be convex. As discussed in Section 2, even though f (·, ω) is convex for
a.e. ω ∈ Ω, the convexity of the composite function φ(x) = gλ,D [f (x, ω)] may not
be preserved, for example, if variance is used as the measure of risk in gλ,D .
3.1. Sufficient conditions for preserving convexity
We shall say that a function g : Xp → R is convexity-preserving if the composite function φ(x) = g[f(x, ω)] is convex for any function f(x, ω) such that f(·, ω) is convex for a.e. ω ∈ Ω and f(x, ·) ∈ Xp for all x ∈ Rn. Recall that a function g : Xp → R is convex if g[λY1 + (1 − λ)Y2] ≤ λg[Y1] + (1 − λ)g[Y2] for all Y1, Y2 ∈ Xp and λ ∈ [0, 1]; is non-decreasing if g[Y1] ≥ g[Y2] for all Y1, Y2 ∈ Xp such that Y1 ≥ Y2; and is positively homogenous if g[λY] = λg[Y] for all Y ∈ Xp and λ ≥ 0.
The following result is well-known (cf. [14]).
Proposition 1. If g : Xp → R is convex and non-decreasing, then g is convexity-preserving.
Lemma 2. A convex and positively homogenous function g : Xp → R is non-decreasing if and only if it satisfies g[Y] ≤ 0 for all Y ≤ 0.
Proof. Suppose g satisfies g[Y ] ≤ 0 for all Y ≤ 0. Let Y1 ≥ Y2 , i.e., Y2 = Y1 + ∆
for some ∆ ≤ 0. Then
$$
\begin{aligned}
\tfrac12 g[Y_2] &= \tfrac12 g[Y_1 + \Delta] \\
&= g[\tfrac12 Y_1 + \tfrac12 \Delta] \\
&\le \tfrac12 g[Y_1] + \tfrac12 g[\Delta] \\
&\le \tfrac12 g[Y_1],
\end{aligned}
$$

where the second line follows from positive-homogeneity, the third line follows from convexity, and the fourth line follows from the fact that g[∆] ≤ 0 (since ∆ ≤ 0). Thus g[·] is non-decreasing. Conversely, if g[·] is non-decreasing, then g[Y] ≤ g[0] = 0 for all Y ≤ 0. ⊓⊔
The following result immediately follows from Lemma 2 and Proposition 1.
Proposition 2. If g : Xp 7→ R is convex, positively homogenous, and satisfies
g[Y ] ≤ 0 for all Y ≤ 0, then g is convexity-preserving.
3.2. Examples
Here we show that a number of common mean-risk objectives are convexity
preserving, and hence suitable for optimization.
Semideviation from a target. For Y ∈ Xp and a fixed target T ∈ R, the p-th semideviation from T [11] is defined as

S_{T,p}[Y] = (E[(Y − T)_+^p])^{1/p}.

Proposition 3. The mean-risk objective

g_{λ,S_{T,p}}[Y] = E[Y] + λ S_{T,p}[Y]

is convexity preserving for all p ≥ 1 and λ ≥ 0.
Proof. Since ST,p [Y ] is convex and non-decreasing in Y for all p ≥ 1, the result
follows from Proposition 1.
⊓⊔
Central semideviation. For Y ∈ Xp, the p-th central semideviation [9] is defined as

δ_p[Y] = (E[(Y − EY)_+^p])^{1/p}.
Proposition 4. The mean-semideviation objective
gλ,δp [Y ] = E[Y ] + λδp [Y ]
is convexity-preserving for all p ≥ 1 and λ ∈ [0, 1].
Proof. Ogryczak and Ruszczyński [9] have shown that δp is convex for all p ≥ 1.
Furthermore, it can be verified that δp is positively-homogenous. Thus gλ,δp [Y ]
is convex and positively homogenous for all λ ≥ 0. Moreover, if Y ≤ 0, then
(Y − EY )+ ≤ −EY . Thus gλ,δp [Y ] ≤ (1 − λ)EY ≤ 0 for all λ ≤ 1. The result
then follows from Proposition 2.
⊓⊔
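For a finitely supported cost Y, both of the above measures reduce to simple averages and can be evaluated directly; the sketch below (with a made-up three-point distribution and target chosen only for illustration) computes the corresponding mean-risk values.

```python
import numpy as np

def target_semideviation(y, prob, target, p=1):
    """S_{T,p}[Y] = (E[(Y - T)_+^p])^(1/p) for a finitely supported Y."""
    return float(prob @ np.maximum(y - target, 0.0) ** p) ** (1.0 / p)

def central_semideviation(y, prob, p=1):
    """delta_p[Y] = (E[(Y - EY)_+^p])^(1/p) for a finitely supported Y."""
    mean = prob @ y
    return float(prob @ np.maximum(y - mean, 0.0) ** p) ** (1.0 / p)

# Made-up cost distribution and target, for illustration only.
y = np.array([1.0, 3.0, 8.0])
prob = np.array([0.5, 0.3, 0.2])
lam, T = 0.5, 4.0

mean = prob @ y
print("E[Y]                    =", mean)
print("E[Y] + lam * S_{T,1}[Y] =", mean + lam * target_semideviation(y, prob, T))
print("E[Y] + lam * delta_1[Y] =", mean + lam * central_semideviation(y, prob))
```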
Quantile-deviation and CVaR. Given α ∈ (0, 1), the Quantile-deviation [10] for
Y ∈ X1 is defined as
$$
h_\alpha[Y] = \mathbb{E}\big[(1-\alpha)(\kappa_\alpha[Y] - Y)_+ + \alpha(Y - \kappa_\alpha[Y])_+\big],
$$

where κ_α[Y] is an α-quantile of the distribution of Y, i.e., Pr(Y ≤ κ_α[Y]) ≥ α and Pr(Y ≥ κ_α[Y]) ≥ 1 − α. In [10], it is shown that h_α is equivalent to the following risk measure (defined for ε1 > 0 and ε2 > 0),

$$
h_{\varepsilon_1,\varepsilon_2}[Y] = \min_{z \in \mathbb{R}} \, \mathbb{E}\big[\varepsilon_1 (z - Y)_+ + \varepsilon_2 (Y - z)_+\big],
$$
when α = ε2/(ε1 + ε2). The α-Conditional value at risk [12] for Y ∈ X1 is defined as

$$
\mathrm{CVaR}_\alpha[Y] = \min_{z \in \mathbb{R}} \Big\{ z + \frac{1}{1-\alpha}\, \mathbb{E}[(Y - z)_+] \Big\}.
$$

By simple algebra, CVaR_α can be expressed in terms of h_{ε1,ε2} as follows [14]:

$$
\mathrm{CVaR}_\alpha[Y] = \mathbb{E}[Y] + \frac{1}{\varepsilon_1}\, h_{\varepsilon_1,\varepsilon_2}[Y],
$$

where α = ε2/(ε1 + ε2).
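This identity is easy to check numerically for a discrete Y, since both inner minimizations over z are piecewise linear and attain their minima at support points of Y. The sketch below uses an arbitrary four-point distribution chosen purely for illustration.

```python
import numpy as np

# Arbitrary discrete cost distribution, for illustration only.
y = np.array([0.0, 1.0, 2.0, 10.0])
p = np.array([0.4, 0.3, 0.2, 0.1])

eps1, eps2 = 1.0, 9.0
alpha = eps2 / (eps1 + eps2)        # alpha = 0.9

def h(eps1, eps2):
    # min_z E[eps1 (z - Y)_+ + eps2 (Y - z)_+]; the minimum of this piecewise
    # linear function of z is attained at a support point of Y.
    return min(p @ (eps1 * np.maximum(z - y, 0.0) + eps2 * np.maximum(y - z, 0.0))
               for z in y)

def cvar(alpha):
    # min_z { z + E[(Y - z)_+] / (1 - alpha) }, again attained on the support.
    return min(z + p @ np.maximum(y - z, 0.0) / (1.0 - alpha) for z in y)

lhs = cvar(alpha)
rhs = p @ y + h(eps1, eps2) / eps1
print(f"CVaR_{alpha:.1f}[Y]    = {lhs:.4f}")
print(f"E[Y] + h/eps1 = {rhs:.4f}")
```

Both printed values coincide (10.0 for this data), as the identity requires.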
Proposition 5. Consider Y ∈ X1 .
(i) Given ε1 > 0, ε2 > 0, the mean-risk objective

g_{λ,h_{ε1,ε2}}[Y] = E[Y] + λ h_{ε1,ε2}[Y]
is convexity preserving for all λ ∈ [0, 1/ε1 ].
(ii) Given α ∈ (0, 1), the mean-quantile deviation objective
gλ,hα [Y ] = E[Y ] + λhα [Y ]
is convexity preserving for all λ ∈ [0, 1/(1 − α)].
(iii) Given α ∈ (0, 1), the mean-CVaR objective
gλ,CVaRα [Y ] = E[Y ] + λCVaRα [Y ]
is convexity preserving for all λ ≥ 0.
Proof. Ogryczak and Ruszczyński [10] have shown that h_{ε1,ε2}[Y] is convex and positively homogenous. Thus g_{λ,h_{ε1,ε2}}[Y], g_{λ,h_α}[Y] and g_{λ,CVaR_α}[Y] are convex and positively homogenous for all λ ≥ 0. Note that h_{ε1,ε2}[Y] ≤ ε1 E(−Y)_+ + ε2 E(Y)_+, and if Y ≤ 0, then h_{ε1,ε2}[Y] ≤ −ε1 E[Y]. Thus g_{λ,h_{ε1,ε2}}[Y] ≤ (1 − λε1)E[Y] ≤ 0 since λ ≤ 1/ε1. Statement (i) then follows from Proposition 2. Statement (ii) follows from the equivalence of h_α[Y] and h_{ε1,ε2}[Y] when α = ε2/(ε1 + ε2). Note that g_{λ,CVaR_α}[Y] = (1 + λ)E[Y] + (λ/ε1) h_{ε1,ε2}[Y]. If Y ≤ 0, g_{λ,CVaR_α}[Y] ≤ (1 + λ)E[Y] − λE[Y] = E[Y] ≤ 0 for all λ. Statement (iii) then follows. ⊓⊔
Gini mean difference. For Y ∈ X1 with distribution function FY , the Gini mean
difference [19] is defined as
$$
\Gamma[Y] = \int \mathbb{E}[(\xi - Y)_+]\, dF_Y(\xi).
$$
Proposition 6. The mean-Gini mean difference objective
gλ,Γ [Y ] = E[Y ] + λΓ [Y ]
is convexity preserving for all λ ∈ [0, 1].
Proof. Ogryczak and Ruszczyński [10] have shown that Γ[·] is convex and positively homogenous. Thus g_{λ,Γ}[·] is convex and positively homogenous for all λ ≥ 0. Moreover, if Y ≤ 0 then

$$
\int \mathbb{E}[(\xi - Y)_+]\, dF_Y(\xi) \le \int -\mathbb{E}[Y]\, dF_Y(\xi) = -\mathbb{E}[Y].
$$

Thus g_{λ,Γ}[Y] ≤ (1 − λ)E[Y] ≤ 0 for all λ ≤ 1. The result then follows from Proposition 2. ⊓⊔
The non-decreasing, hence convexity-preserving, property of g_{λ,δ_p}, g_{λ,h_α} (or, equivalently, g_{λ,h_{ε1,ε2}}), and g_{λ,Γ} may not hold for λ > 1, λ > 1/(1 − α) (or, λ > 1/ε1), and λ > 1, respectively. However, as shown in [9,10], these mean-risk objectives are guaranteed to be consistent with standard stochastic ordering rules only if λ ∈ (0, 1), λ ∈ (0, 1/(1 − α)) (or, λ ∈ (0, 1/ε1)), and λ ∈ (0, 1), respectively.¹
4. Solving mean-risk stochastic programs
In this section, we discuss methods for solving stochastic programs with convexity preserving mean-risk objectives. We provide specific details for two particular
classes of mean-risk stochastic programs, those that involve the semideviation
risk measure δp with p = 1, i.e.,
δ1 [Y ] = E[(Y − EY )+ ]
which is referred to as the absolute semideviation (ASD), and those that involve
the quantile deviation (QDEV) risk measure
$$
h_{\varepsilon_1,\varepsilon_2}[Y] = \min_{z \in \mathbb{R}} \, \mathbb{E}\big[\varepsilon_1 (z - Y)_+ + \varepsilon_2 (Y - z)_+\big],
$$

for ε1 > 0 and ε2 > 0.
4.1. Deterministic equivalent formulations
Consider the stochastic program (1) where f (x, ω) is given as the value function
of a second-stage optimization problem
f (x, ω) = min{F0 (x, y, ω) : Fi (x, y, ω) ≤ 0 i = 1, . . . , m, y ∈ Y }.
Here we make the necessary assumptions on Fi for i = 0, . . . , m and Y for
f(x, ω) to be finite-valued, F-measurable and P-integrable (cf. [13]). The two-stage stochastic linear programming objective function (2)-(3) is a special case.
If the mean-risk objective is non-decreasing, then the corresponding mean-risk
stochastic program (11) can be written as:
$$
\begin{array}{lll}
\displaystyle \min_{x,\, y(\omega)} & g_{\lambda,D}\big[F_0(x, y(\omega), \omega)\big] & \\[1ex]
\text{s.t.} & x \in X, & \\
& F_i(x, y(\omega), \omega) \le 0, \ i = 1,\dots,m, & \text{for a.e. } \omega \in \Omega, \\
& y(\omega) \in Y & \text{for a.e. } \omega \in \Omega.
\end{array}
\tag{12}
$$
When Ω is finite, problem (12) is a large-scale deterministic optimization problem. Furthermore, if Fi for i = 0, 1, . . . , m are convex functions of the decision
variables x and y(ω) for a.e. ω ∈ Ω, and the sets X and Y are convex, then (12)
is a convex program.
¹ Note that in our minimization setting smaller values of the random variables are preferable to larger, whereas in the maximization setting of [10], larger values are preferable. Consequently, for the quantile deviation risk function, [10] shows consistency with stochastic ordering for λ ∈ (0, 1/α).
4.1.1. The Mean-ASD objective.
It can be verified that
gλ,δ1 [Y ] = (1 − λ)µ[Y ] + λν[Y ],
(13)
where µ[Y ] = E[Y ] and ν[Y ] = E[max{Y, EY }]. The deterministic equivalent
(12) of the mean-ASD stochastic program is then (cf. [16]):
$$
\begin{array}{ll}
\displaystyle \min_{x,\, y(\omega),\, \nu(\omega)} & \displaystyle (1-\lambda) \int F_0(x, y(\omega), \omega)\, dP(\omega) + \lambda \int \nu(\omega)\, dP(\omega) \\[1.5ex]
\text{s.t.} & x \in X, \\[0.5ex]
& \left.\begin{array}{l}
F_i(x, y(\omega), \omega) \le 0, \quad i = 1,\dots,m, \\
\nu(\omega) \ge F_0(x, y(\omega), \omega), \\
\nu(\omega) \ge \displaystyle\int F_0(x, y(\xi), \xi)\, dP(\xi), \\
y(\omega) \in Y
\end{array}\right\} \ \text{for a.e. } \omega \in \Omega.
\end{array}
\tag{14}
$$
Note that, if λ ∈ [0, 1], Fi for i = 0, 1, . . . , m are convex functions of the decision
variables x and y(ω) for a.e. ω ∈ Ω, and the sets X and Y are convex, then the
objective function and constraints of (14) are all convex. In case of two-stage
stochastic linear programs, the functions Fi for i = 0, . . . , m are linear and the
sets X and Y are polyhedral. Then, in case of a finite set Ω of scenarios, (14) is a
large-scale linear program. Unfortunately, this linear program does not possess the
dual-block angular structure that is natural in the deterministic equivalents of
standard two-stage stochastic linear programs [16]. Consequently, decomposition
methods (such as the Benders or L-shaped algorithm) (cf. Chapter 3 of [13]) used
for solving standard two-stage stochastic linear programs cannot be directly applied. In the next section, we show that problem decomposition can be achieved
by using slight variations of these methods.
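To make the structure of (14) concrete in the two-stage linear case, the sketch below assembles the mean-ASD deterministic equivalent for a tiny simple-recourse instance (all data, the box X = [0, 10], and the variable layout are illustrative assumptions, not taken from the paper) and solves it as a single large LP with scipy.optimize.linprog.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny simple-recourse instance (all data illustrative, not from the paper):
# f(x, w_k) = c0*x + q_plus*(x - xi_k)_+ + q_minus*(xi_k - x)_+ under scenario k.
c0, q_plus, q_minus = 1.0, 2.0, 4.0
xi = np.array([2.0, 5.0, 9.0])
p = np.array([0.3, 0.5, 0.2])
K = len(xi)
lam = 0.5                                  # mean-risk weight, lambda in [0, 1]

# Variable layout: [x, yp_1..yp_K, ym_1..ym_K, nu_1..nu_K]
n = 1 + 3 * K
def iyp(k): return 1 + k                   # index of y+_k
def iym(k): return 1 + K + k               # index of y-_k
def inu(k): return 1 + 2 * K + k           # index of nu_k

# Objective of (14): (1 - lam) * sum_k p_k f_k + lam * sum_k p_k nu_k.
obj = np.zeros(n)
obj[0] = (1 - lam) * c0
for k in range(K):
    obj[iyp(k)] = (1 - lam) * p[k] * q_plus
    obj[iym(k)] = (1 - lam) * p[k] * q_minus
    obj[inu(k)] = lam * p[k]

# Second-stage equations: y+_k - y-_k = x - xi_k, written as x - y+_k + y-_k = xi_k.
A_eq = np.zeros((K, n)); b_eq = xi.copy()
for k in range(K):
    A_eq[k, 0], A_eq[k, iyp(k)], A_eq[k, iym(k)] = 1.0, -1.0, 1.0

# Coupling constraints of (14): nu_k >= f_k and nu_k >= sum_j p_j f_j.
A_ub = np.zeros((2 * K, n)); b_ub = np.zeros(2 * K)
for k in range(K):
    A_ub[k, 0] = c0
    A_ub[k, iyp(k)], A_ub[k, iym(k)] = q_plus, q_minus
    A_ub[k, inu(k)] = -1.0
    A_ub[K + k, 0] = c0
    for j in range(K):
        A_ub[K + k, iyp(j)] = p[j] * q_plus
        A_ub[K + k, iym(j)] = p[j] * q_minus
    A_ub[K + k, inu(k)] = -1.0

bounds = [(0.0, 10.0)] + [(0.0, None)] * (2 * K) + [(None, None)] * K   # X = [0, 10]
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
print("optimal x =", res.x[0], " mean-ASD objective =", res.fun)
```

The constraints nu_k >= sum_j p_j f_j couple every scenario to every other, which is exactly why the dual-block angular structure of the standard deterministic equivalent is lost.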
4.1.2. The Mean-QDEV objective.
The mean-QDEV objective function can be written as

$$
\begin{aligned}
g_{\lambda, h_{\varepsilon_1,\varepsilon_2}}[Y]
&= \mathbb{E}[Y] + \lambda \min_{z \in \mathbb{R}} \, \mathbb{E}\big[\varepsilon_1 (z - Y)_+ + \varepsilon_2 (Y - z)_+\big] \\
&= \min_{z \in \mathbb{R}} \, \mathbb{E}\big[Y + \lambda\varepsilon_1 (z - Y)_+ + \lambda\varepsilon_2 (Y - z)_+\big] \\
&= \min_{z \in \mathbb{R}} \, \mathbb{E}\big[Y + \lambda\varepsilon_1 (Y - z)_+ - \lambda\varepsilon_1 (Y - z) + \lambda\varepsilon_2 (Y - z)_+\big] \\
&= \min_{z \in \mathbb{R}} \, \big\{ \lambda\varepsilon_1 z + (1 - \lambda\varepsilon_1)\mathbb{E}[Y] + \lambda(\varepsilon_1 + \varepsilon_2)\mathbb{E}[(Y - z)_+] \big\}.
\end{aligned}
\tag{15}
$$
From the fact that E[Y] and E[(Y − z)_+] are non-decreasing in Y and λ ∈ [0, 1/ε1], it follows that the deterministic equivalent (12) of the mean-QDEV stochastic program is

$$
\begin{array}{ll}
\displaystyle \min_{z,\, x,\, y(\omega),\, \nu(\omega)} & \displaystyle \lambda\varepsilon_1 z + (1 - \lambda\varepsilon_1) \int F_0(x, y(\omega), \omega)\, dP(\omega) + \lambda(\varepsilon_1 + \varepsilon_2) \int \nu(\omega)\, dP(\omega) \\[1.5ex]
\text{s.t.} & z \in \mathbb{R}, \ x \in X, \\[0.5ex]
& \left.\begin{array}{l}
F_i(x, y(\omega), \omega) \le 0, \quad i = 1,\dots,m, \\
\nu(\omega) \ge F_0(x, y(\omega), \omega) - z, \\
\nu(\omega) \ge 0, \ y(\omega) \in Y
\end{array}\right\} \ \text{for a.e. } \omega \in \Omega.
\end{array}
\tag{16}
$$
If Fi for i = 0, 1, . . . , m are convex functions of the decision variables x and y(ω)
for a.e. ω ∈ Ω, and the sets X and Y are convex, then the objective function
and constraints of (16) are all convex. Moreover, in case of two-stage stochastic
linear programs, with a finite set Ω of scenarios, (16) is a large-scale linear
program with dual-block angular structure and is directly amenable to standard
decomposition methods [17]. In the following two sections, we propose such a
decomposition algorithm and its parametric extension to efficiently generate the
entire mean-QDEV efficient frontier.
4.2. Cutting plane methods
When f (·, ω) is convex for a.e. ω ∈ Ω and the mean-risk function gλ,D [·] is
convexity-preserving, the mean-risk stochastic program (11) involves minimizing
a convex (often non-smooth) objective function φ(x) = gλ,D [f (x, ω)]. Effective
solution schemes for such problems are subgradient-based methods, such as various cutting plane algorithms, bundle methods, or level methods [4,5]. Given a
candidate solution x ∈ Rn , these methods require the calculation of a subgradient s of the composite function φ(·) = gλ,D [f (·, ω)] at x, i.e., s ∈ ∂φ(x). Note
that a subgradient of gλ,D always exists if gλ,D is real-valued, convex, and continuous. Consequently, given subgradients of f and gλ,D , a subgradient of φ(·) can
be calculated using the chain rule of subdifferentiation. Subgradient formulas for
various mean-risk objectives, including mean-ASD and mean-QDEV, are derived
in detail in [14]. These subgradient formulas can be used for the optimization
of the mean-risk stochastic program (11) using, for example, a generic cutting
plane algorithm, such as Algorithm 1, or some variant of it.
Algorithm 1 A cutting plane algorithm for the mean-risk stochastic program (11).
1: let ε ≥ 0 be a pre-specified tolerance; set i = 1, LB = −∞ and UB = +∞.
2: while UB − LB ≥ ε do
3:    solve the following master problem
         LB = min_{x,θ} { θ : x ∈ X, θ ≥ φ(x^j) + (s^j)ᵀ(x − x^j), j = 1, . . . , i − 1 };
      and let x^i be its optimal solution (or any feasible solution if an optimal one does not exist).
4:    for ω ∈ Ω do
5:       compute f(x^i, ω) and π^i(ω) ∈ ∂f(x^i, ω).
6:    end for
7:    compute φ(x^i) and s^i ∈ ∂φ(x^i) from f(x^i, ω) and π^i(ω).
8:    set UB = min{UB, φ(x^i)} and i = i + 1.
9: end while
In case of two-stage stochastic linear programs of the form (2)-(3), a subgradient π(ω) ∈ ∂f(x, ω) is given by c + Tᵀϑ(ω), where ϑ(ω) is a dual optimal
solution to the second-stage linear program (3) for given x and realization ω.
When Ω is finite, the second-stage subproblems corresponding to a given x can be
solved independently for each realization ω ∈ Ω allowing for a computationally
convenient decomposition. The optimal objective values and the dual solutions
for the subproblems can then be used to compute the function value and its
subgradient. This scheme is a slight variation of the well-known L-shaped (or
Benders) decomposition method for solving standard two-stage stochastic linear programs involving an expected value objective. If the mean-risk function
is polyhedral (as in case of the mean-ASD and mean-QDEV objectives), then
Algorithm 1 is guaranteed to terminate in a finite number of iterations with an
ε-optimal solution for any ε ≥ 0.
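As a concrete rendering of this scheme, the sketch below implements the generic loop of Algorithm 1 for the mean-ASD objective on the same illustrative simple-recourse instance used earlier, with X a box. The scenario subproblems have a closed-form simple-recourse solution here, so their values and subgradients are computed directly rather than from dual multipliers of (3); the subgradient of φ is then assembled by the chain rule. All data and tolerances are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative simple-recourse instance (same made-up data as before) and X = [0, 10].
c0, q_plus, q_minus = 1.0, 2.0, 4.0
xi = np.array([2.0, 5.0, 9.0])
p = np.array([0.3, 0.5, 0.2])
lam, x_lo, x_hi, tol = 0.5, 0.0, 10.0, 1e-6

def f_and_subgrad(x):
    """Scenario values f(x, w_k) and subgradients pi_k of the simple recourse function."""
    vals = c0 * x + q_plus * np.maximum(x - xi, 0.0) + q_minus * np.maximum(xi - x, 0.0)
    subs = c0 + q_plus * (x > xi) - q_minus * (x < xi)
    return vals, subs

def phi_and_subgrad(x):
    """phi(x) = (1 - lam) mu(x) + lam nu(x), with nu(x) = E[max{f(x, w), mu(x)}]."""
    vals, subs = f_and_subgrad(x)
    mu, u = p @ vals, p @ subs
    above = vals > mu                     # iota(w): scenarios with f(x, w) > mu(x)
    nu = p @ np.maximum(vals, mu)
    v = p @ (above * subs) + u * (p @ (~above))
    return (1 - lam) * mu + lam * nu, (1 - lam) * u + lam * v

cuts, LB, UB, x_cur = [], -np.inf, np.inf, 0.5 * (x_lo + x_hi)
while UB - LB >= tol:
    phi, s = phi_and_subgrad(x_cur)       # evaluate the candidate (scenario pass)
    UB = min(UB, phi)
    cuts.append((x_cur, phi, s))
    # Master LP in (x, theta): min theta s.t. theta >= phi_j + s_j (x - x_j), x in X.
    A = np.array([[s_j, -1.0] for (x_j, phi_j, s_j) in cuts])
    b = np.array([s_j * x_j - phi_j for (x_j, phi_j, s_j) in cuts])
    res = linprog(c=[0.0, 1.0], A_ub=A, b_ub=b,
                  bounds=[(x_lo, x_hi), (None, None)], method="highs")
    x_cur, LB = res.x[0], res.fun

print(f"x* = {x_cur:.4f}, mean-ASD objective = {UB:.4f}")
```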
4.3. Parametric cutting plane algorithms for mean-ASD and mean-QDEV
stochastic linear programs
Often, it is necessary to solve the mean-risk stochastic program (11) for many
different values of the mean-risk tradeoff parameter λ so as to trace out the
mean-risk efficient frontier. This can, of course, be accomplished by repeatedly
solving the problem for different values of λ. However, more efficient parametric
optimization schemes may be possible. For the case when f (x, ω) is linear in x,
as in the portfolio optimization setting, a parametric simplex algorithm for generating mean-ASD and mean-QDEV frontiers has been proposed in [15]. In this
section, we propose parametric cutting plane algorithms to generate mean-ASD
and mean-QDEV frontiers when f (x, ω) corresponds to two-stage stochastic linear programs. Note that in Algorithm 1, the tradeoff parameter λ is embedded
within the subgradient si calculated in step 7, and hence in the cut coefficients.
As such, this algorithm is not suitable for an efficient parametric analysis of the
mean-risk model with respect to λ.
Recall from (13) that the mean-ASD stochastic program is

$$
\min\big\{\, g_{\lambda,\delta_1}[f(x,\omega)] = (1-\lambda)\,\mu[f(x,\omega)] + \lambda\,\nu[f(x,\omega)] \;:\; x \in X \,\big\},
\tag{17}
$$

where µ[Y] = EY and ν[Y] = E[max{Y, EY}]. Note that both µ and ν are convex non-decreasing functions, and so the composite functions µ[f(x, ω)] and ν[f(x, ω)] are convex in x and can be approximated by subgradient inequalities (cuts) within a cutting plane scheme. If we use a cutting plane scheme for (17)
where separate cuts are used to approximate the functions µ and ν, then the
cut coefficients are independent of the tradeoff weight λ which only appears in
the objective function of the master problem. Thus the cuts are valid for any
λ ∈ [0, 1]. If X is polyhedral, then the master problem is a linear program for
which a parametric analysis with respect to the objective coefficient λ can be
easily carried out to detect the range of λ for which the current master problem
basis remains optimal. We can then choose a λ outside this range and reoptimize.
In this way we can construct the efficient frontier for the entire range of λ ∈ [0, 1].
This parametric modification of the cutting plane Algorithm 1 is summarized in
Algorithm 2.
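The essential point, that the cut coefficients do not depend on λ, can be illustrated without implementing the basis-range parametric analysis of Algorithm 2: in the sketch below, cuts for µ and ν accumulate in a single pool while the master LP is re-solved over a simple grid of λ values, so cuts generated at one λ are reused at every other. This grid is a simplification of steps 11-12 of Algorithm 2, not the exact parametric scheme, and the instance data are again illustrative.

```python
import numpy as np
from scipy.optimize import linprog

# Same illustrative instance; cuts for mu and nu are kept in one pool shared by all lambda.
c0, q_plus, q_minus = 1.0, 2.0, 4.0
xi = np.array([2.0, 5.0, 9.0])
p = np.array([0.3, 0.5, 0.2])
x_lo, x_hi, tol = 0.0, 10.0, 1e-6

def mu_nu_and_subgrads(x):
    vals = c0 * x + q_plus * np.maximum(x - xi, 0.0) + q_minus * np.maximum(xi - x, 0.0)
    subs = c0 + q_plus * (x > xi) - q_minus * (x < xi)
    mu, u = p @ vals, p @ subs
    above = vals > mu
    nu = p @ np.maximum(vals, mu)
    v = p @ (above * subs) + u * (p @ (~above))
    return mu, u, nu, v

cuts, frontier = [], []                       # cuts: (x_j, mu_j, u_j, nu_j, v_j)
for lam in np.linspace(0.0, 1.0, 11):         # grid over the tradeoff weight lambda
    LB, UB, x_cur = -np.inf, np.inf, 0.5 * (x_lo + x_hi)
    while UB - LB >= tol:
        mu, u, nu, v = mu_nu_and_subgrads(x_cur)
        UB = min(UB, (1 - lam) * mu + lam * nu)
        cuts.append((x_cur, mu, u, nu, v))
        # Master in (x, theta, eta): only the objective depends on lambda; the cuts do not.
        A, b = [], []
        for (x_j, mu_j, u_j, nu_j, v_j) in cuts:
            A.append([u_j, -1.0, 0.0]); b.append(u_j * x_j - mu_j)   # theta >= mu-cut
            A.append([v_j, 0.0, -1.0]); b.append(v_j * x_j - nu_j)   # eta  >= nu-cut
        res = linprog(c=[0.0, 1.0 - lam, lam], A_ub=np.array(A), b_ub=np.array(b),
                      bounds=[(x_lo, x_hi), (None, None), (None, None)], method="highs")
        x_cur, LB = res.x[0], res.fun
    frontier.append((lam, x_cur, UB))

for lam, x_star, val in frontier:
    print(f"lambda = {lam:.1f}: x = {x_star:.3f}, mean-ASD objective = {val:.4f}")
```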
From (15), we have that the mean-QDEV stochastic program is

$$
\min\Big\{\, g_{\lambda, h_{\varepsilon_1,\varepsilon_2}}[f(x,\omega)]
= \min_{z \in \mathbb{R}}\big\{ \lambda\varepsilon_1 z + (1-\lambda\varepsilon_1)\,\mu[f(x,\omega)]
+ \lambda(\varepsilon_1+\varepsilon_2)\,\nu[f(x,\omega), z] \big\} \;:\; x \in X \,\Big\}.
\tag{18}
$$
Algorithm 2 A parametric cutting plane algorithm for the mean-ASD stochastic program (17) (the set X is polyhedral).
1: let ε ≥ 0 be a pre-specified tolerance; set λ = 0, i = 1, LB = −∞ and UB = +∞.
2: while UB − LB ≥ ε do
3:    solve the following master linear program
         LB = min_{x,θ,η} { (1 − λ)θ + λη : x ∈ X,
                            θ ≥ µ(x^j) + (u^j)ᵀ(x − x^j), j = 1, . . . , i − 1,
                            η ≥ ν(x^j) + (v^j)ᵀ(x − x^j), j = 1, . . . , i − 1 };
      and let x^i be its optimal solution (or any feasible solution if an optimal one does not exist).
4:    for ω ∈ Ω do
5:       compute f(x^i, ω) and π^i(ω) ∈ ∂f(x^i, ω).
6:    end for
7:    compute µ(x^i) = E[f(x^i, ω)] and a subgradient u^i = E[π^i(ω)].
8:    compute ν(x^i) = E[max{f(x^i, ω), µ(x^i)}] and a subgradient v^i = E[ι(ω)π^i(ω)] + u^i E[1 − ι(ω)], where ι(ω) = 1 if f(x^i, ω) > µ(x^i) and 0 otherwise.
9:    set UB = min{UB, (1 − λ)µ(x^i) + λν(x^i)} and i = i + 1.
10: end while
11: use parametric analysis on the master problem to find the range [λ, λ*] for which the current master problem basis remains optimal.
12: if λ* < 1, set λ = min{1, λ* + δ} (where δ > 0 is very small), LB = −∞ and UB = +∞, and return to step 2.
Algorithm 3 A parametric cutting plane algorithm for the mean-QDEV stochastic program (18) (the set X is polyhedral).
1: let ε ≥ 0 be a pre-specified tolerance; set λ = 0, i = 1, LB = −∞ and UB = +∞.
2: while UB − LB ≥ ε do
3:    solve the following master linear program
         LB = min_{z,x,θ,η} { λε1 z + (1 − λε1)θ + λ(ε1 + ε2)η : z ∈ R, x ∈ X,
                              θ ≥ µ(x^j) + (u^j)ᵀ(x − x^j), j = 1, . . . , i − 1,
                              η ≥ ν(z^j, x^j) + (v_z^j)ᵀ(z − z^j) + (v_x^j)ᵀ(x − x^j), j = 1, . . . , i − 1 };
      and let (z^i, x^i) be its optimal solution (or any feasible solution if an optimal one does not exist).
4:    for ω ∈ Ω do
5:       compute f(x^i, ω) and π^i(ω) ∈ ∂f(x^i, ω).
6:    end for
7:    compute µ(x^i) = E[f(x^i, ω)] and a subgradient u^i = E[π^i(ω)].
8:    compute ν(x^i, z^i) = E[(f(x^i, ω) − z^i)_+], a subgradient with respect to x: v_x^i = E[ι(ω)π^i(ω)], and a subgradient with respect to z: v_z^i = −E[ι(ω)], where ι(ω) = 1 if f(x^i, ω) > z^i and 0 otherwise.
9:    set UB = min{UB, λε1 z^i + (1 − λε1)µ(x^i) + λ(ε1 + ε2)ν(x^i, z^i)} and i = i + 1.
10: end while
11: use parametric analysis on the master problem to find the range [λ, λ*] for which the current master problem basis remains optimal.
12: if λ* < 1/ε1, set λ = min{1/ε1, λ* + δ} (where δ > 0 is very small), LB = −∞ and UB = +∞, and return to step 2.
In (18), µ[Y] = EY as before and ν[Y, z] = E[(Y − z)_+]. Note that both µ and ν are convex and non-decreasing in Y, and so the composite functions µ[f(x, ω)] and ν[f(x, ω), z] are convex in x. Moreover, ν is also convex in z. Thus, µ and ν can be approximated separately by cutting planes whose coefficients are independent of λ.
Consequently, we can develop a parametric cutting plane algorithm similar to
Algorithm 2 for the mean-QDEV objective. A key difference is that the optimization is with respect to x and z. The details are provided in Algorithm 3.
4.4. Computational illustration
We implemented enhanced versions of Algorithms 2 and 3 to generate the mean-ASD frontier and the mean-QDEV frontier, respectively. Our implementations extend the basic schemes of Algorithm 2 and Algorithm 3 with ℓ∞ trust-region based regularizations as described in [6]. We used the GNU Linear Programming
Kit (GLPK) [7] library routines to solve linear programming subproblems. All
computations were carried out on a Linux workstation with dual 2.4 GHz Intel
Xeon processors and 2 GB RAM. We considered a 100 scenario instance of the
standard stochastic programming test problem gbd. Data for this instance is
available from the website:
http://www.cs.wisc.edu/~swright/stochastic/sampling
Figures 1 and 2 show the mean-ASD efficient frontier obtained by Algorithm 2
and the mean-QDEV efficient frontier (corresponding to ε1 = 1 and ε2 = 9, i.e.,
α = 0.9) obtained by Algorithm 3, respectively. In both cases, the parametric
strategy was significantly more efficient than resolving the problem from scratch
for different values of λ. For the mean-ASD problem, 5 unique solutions were
identified for λ ∈ [0, 1]. For the mean-QDEV problem, 18 unique solutions were identified
for λ ∈ [0, 10]. It is evident that the mean-QDEV model offers more flexibility in
the mean-risk trade-off than the mean-ASD model. Computational results using
Algorithm 2 on additional stochastic programming test problems are provided
in [1].
Acknowledgements
This research has been supported by the National Science Foundation under CAREER Award number DMII-0133943. The author is grateful to two anonymous
reviewers and Alexander Shapiro for constructive comments.
Fig. 1. Mean-ASD frontier for gbd (E[f(x,ω)] versus ASD).
Fig. 2. Mean-QDEV frontier for gbd, α = 0.9 (E[f(x,ω)] versus QDEV).
References
1. S. Ahmed. Mean-risk objectives in stochastic programming. Technical report, Georgia
Institute of Technology, 2004. E-print available at www.optimization-online.org.
2. J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, New
York, NY, 1997.
3. M.R. Garey and D.S. Johnson. Computers and intractability: A guide to the theory of
NP-completeness. W. H. Freeman and Co., 1979.
4. J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms
II. Springer-Verlag, Berlin Heidelberg, 1996.
5. C. Lemaréchal, A. Nemirovski, and Yu. Nesterov. New variants of bundle methods. Mathematical Programming, 69:111–148, 1995.
6. J. Linderoth and S. Wright. Decomposition algorithms for stochastic programming on a
computational grid. Computational Optimization and Applications, 24:207–250, 2003.
7. A. Makhorin. GNU Linear Programming Kit, Reference Manual, Version 3.2.3. http://www.gnu.org/software/glpk/glpk.html, 2002.
8. H. M. Markowitz. Portfolio Selection: Efficient Diversification of Investments. John Wiley
& Sons, 1959.
9. W. Ogryczak and A. Ruszczyński. On consistency of stochastic dominance and mean-semideviation models. Mathematical Programming, 89:217–232, 2001.
10. W. Ogryczak and A. Ruszczyński. Dual stochastic dominance and related mean-risk
models. SIAM Journal on Optimization, 13:60–78, 2002.
11. R. B. Porter. Semivariance and stochastic dominance. American Economic Review,
64:200–204, 1974.
12. R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. The Journal
of Risk, 2:21–41, 2000.
13. A. Ruszczyński and A. Shapiro, editors. Stochastic Programming, volume 10 of Handbooks
in Operations Research and Management Science. North-Holland, 2003.
14. A. Ruszczyński and A. Shapiro. Optimization of convex risk functions. Submitted for
publication. E-print available at www.optimization-online.org, 2004.
15. A. Ruszczyński and R. Vanderbei. Frontiers of stochastically nondominated portfolios.
Econometrica, 71:1287–1297, 2004.
16. R. Schultz. Stochastic programming with integer variables. Mathematical Programming,
97:285–309, 2003.
17. A. Shapiro and S. Ahmed. On a class of minimax stochastic programs. SIAM Journal on
Optimization, 14:1237–1249, 2004.
18. S. Takriti and S. Ahmed. On robust optimization of two-stage systems. Mathematical
Programming, 99:109–126, 2004.
19. S. Yitzhaki. Stochastic dominance, mean variance, and Gini’s mean difference. American
Economic Review, 72(1):178–185, 1982.