A SUMMARY AND ILLUSTRATION OF DISJUNCTIVE DECOMPOSITION WITH SET CONVEXIFICATION

Suvrajeet Sen (sen@sie.arizona.edu), Julia L. Higle (julie@sie.arizona.edu), Lewis Ntaimo (nlewis@sie.arizona.edu)
Dept. of Systems and Industrial Engineering, The University of Arizona, Tucson, Arizona 85721

Abstract: In this paper we review the Disjunctive Decomposition (D2) algorithm for two-stage Stochastic Mixed Integer Programs (SMIP). This novel method uses the principles of disjunctive programming to develop cutting-plane-based approximations of the feasible set of the second stage problem. At the core of this approach is the Common Cut Coefficients Theorem, which provides a mechanism for transforming cuts derived for one outcome of the second stage problem into cuts that are valid for other outcomes. An application of the D2 method to the solution of a small SMIP example is provided.

Keywords: Set convexification, Disjunctive Decomposition

Introduction

Stochastic Mixed Integer Programs (SMIP) comprise one of the more difficult classes of mathematical programming problems. Indeed, this class of problems combines the extremely large scale nature of stochastic programs with the inherent computational difficulties of combinatorial optimization. The main difficulty in solving two-stage stochastic mixed-integer programs is that the recourse costs are represented as the expected value of a mixed-integer program whose value function is far more complicated than that of a linear program. In general, the expected recourse function is non-convex and possibly discontinuous. In this paper we illustrate the Disjunctive Decomposition (D2) algorithm with set convexification for two-stage SMIP, proposed by Sen and Higle [2000]. The method uses the principles of disjunctive programming to develop a cutting-plane-based approximation of the feasible set of the second stage problem.
This task is streamlined via the Common Cut Coefficients (C3) Theorem (Sen and Higle [2000]), which provides a simple mechanism for transforming cuts derived for one instance of the second stage problem into cuts that are valid for another instance. This significantly reduces the effort required to approximate the convexification of the feasible set, a task that must be undertaken for each possible outcome of the random variables involved. In this paper, we illustrate the D2 algorithm and the manner in which the C3 Theorem is used to reduce the computational effort. Because the methodology is related to, but distinctly different from, the work of Carøe [1998], we also use this forum to highlight the relationship between the two approaches.

This paper is organized as follows. In §1 we summarize the results of Sen and Higle [2000], and identify connections between their work and that of Carøe [1998]. In §2 we illustrate the application of the D2 algorithm with a simple numerical example with both first- and second-stage binary variables. Finally, a discussion and our conclusions are found in §3.

1. Background

In this section we summarize the main results from Sen and Higle [2000] that are critical to our illustration of the D2 algorithm. In particular, we review the C3 Theorem and discuss the details of its application. For a more thorough explanation of disjunctive decomposition concepts, proofs, and the derivation of the D2 algorithm, we refer the reader to Sen and Higle [2000]. Throughout this paper we consider the following two-stage stochastic mixed integer program (SMIP):

    Min c^T x + E[f(x, ω̃)]  s.t.  x ∈ X ∩ B,    (1)

where X ⊆ R^{n1} is a set of feasible first stage decisions, B ⊂ R^{n1} is the set of binary vectors, ω̃ is a random variable defined on a probability space (Ω, A, P), and for any ω ∈ Ω,

    f(x, ω) = Min g_u^T u + g_z^T z    (2a)
          s.t. W_u u + W_z z ≥ r(ω) − T(ω)x,    (2b)
               u ∈ R_+^{nu}, z ∈ B^{nz}.    (2c)
It is assumed that X is a convex polyhedron, Ω is a finite set, and that f(x, ω) < ∞ for all (x, ω) ∈ X × Ω. Moreover, we assume that by using appropriately penalized continuous variables, the subproblem (2) remains feasible for any restriction of the integer variables z. Note that the inclusion of integer variables in the second stage problem, (2), is the primary source of the computational and algorithmic challenges associated with (1). In particular, in order to evaluate the SMIP objective c^T x + E[f(x, ω̃)], it is necessary to solve (implicitly or approximately) the MIP (2) for each ω ∈ Ω. Moreover, the structural difficulties associated with MIP objective functions are well documented (see, e.g., Blair and Jeroslow [1982] and Blair [1995]). These difficulties are compounded by the fact that the expected value operation in the SMIP objective function amounts to a convex combination of the complicated individual MIP objective functions. The C3 Theorem exploits the specific structure of (2), thereby permitting a computationally streamlined development of SMIP objective function approximations.

1.1. Common Cut Coefficients

In an effort to develop approximations of the SMIP objective, we begin with an approximation of the convex hull of feasible integer points for (2). This set can be expressed as a disjunction,

    S = ∪_{h∈H} S_h,

where H is a finite index set, and the sets {S_h}_{h∈H} are polyhedral sets represented as

    S_h = {y | W_h y ≥ r_h, y ≥ 0}.

Within our setting, we have y = (u, z) as in (2), and r_h includes r(ω) − T(ω)x. More formally, we note that the constraints in (2),

    W_u u + W_z z ≥ r(ω) − T(ω)x,

vary with the first stage decision, x, and the scenario ω. Consequently, the disjunctive representation of the set depends on (x, ω) ∈ X × Ω,

    S(x, ω) = ∪_{h∈H} S_h(x, ω),    (3)

where S_h(x, ω) = {y | W_{hu} u + W_{hz} z ≥ r_h(x, ω), u, z ≥ 0}.
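A disjunctive set such as (3) is simply a union of polyhedra, so membership can be checked piece by piece. The following sketch illustrates the representation on a hypothetical one-dimensional, two-piece example (the pieces are invented for illustration; they are not taken from the paper):

```python
# A disjunctive set S = S_1 union S_2, each piece S_h = {y : W_h y >= r_h, y >= 0}.
# Hypothetical 1-D pieces: S_1 = {y : -y >= -1} = [0, 1], S_2 = {y : y >= 2} = [2, inf).

def in_piece(W, r, y):
    """Check W y >= r and y >= 0 for one polyhedral piece (dense lists)."""
    if any(yj < 0 for yj in y):
        return False
    return all(sum(Wij * yj for Wij, yj in zip(Wi, y)) >= ri
               for Wi, ri in zip(W, r))

def in_disjunction(pieces, y):
    """y belongs to the union iff it belongs to at least one piece."""
    return any(in_piece(W, r, y) for W, r in pieces)

pieces = [([[-1.0]], [-1.0]),   # S_1 = [0, 1]
          ([[1.0]], [2.0])]     # S_2 = [2, inf)

print(in_disjunction(pieces, [0.5]))   # True  (in S_1)
print(in_disjunction(pieces, [1.5]))   # False (in the "hole" between the pieces)
print(in_disjunction(pieces, [2.5]))   # True  (in S_2)
```

The nonconvexity of S is visible in the "hole" between the two pieces; the valid inequalities discussed next relax this union to a convex set.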
A convex relaxation of the nonconvex set (3) can be represented by a collection of valid inequalities of the form

    π_u^T u + π_z^T z ≥ π_0(x, ω).

While the disjunctive representation depends on (x, ω), the C3 Theorem, which we state below, ensures that as the argument changes, cut validity can be maintained by a shift in the right-hand-side element without altering the gradient of the cut. In the following we use n2 = nu + nz.

Theorem 1 (The C3 Theorem). Consider the stochastic program with fixed recourse as stated in (1), (2). For (x, ω) ∈ X × Ω, let

    Y(x, ω) = {y = (u, z) | W y ≥ r(ω) − T(ω)x, u ∈ R_+^{nu}, z ∈ B^{nz}},

the set of mixed-integer feasible solutions for the second stage mixed-integer linear program. Suppose that {C_h, d_h}_{h∈H} is a finite collection of appropriately dimensioned matrices and vectors such that for all (x, ω) ∈ X × Ω,

    Y(x, ω) ⊆ ∪_{h∈H} {y ∈ R_+^{n2} | C_h y ≥ d_h}.

Let S_h(x, ω) = {y ∈ R_+^{n2} | W y ≥ r(ω) − T(ω)x, C_h y ≥ d_h}, and let

    S(x, ω) = ∪_{h∈H} S_h(x, ω).

Let (x̄, ω̄) be given, and suppose that S_h(x̄, ω̄) is nonempty for all h ∈ H and that π^T y ≥ π_0(x̄, ω̄) is a valid inequality for S(x̄, ω̄). Then there exists a function π_0 : X × Ω → R such that for all (x, ω) ∈ X × Ω, π^T y ≥ π_0(x, ω) is a valid inequality for S(x, ω).

Proof. See Sen and Higle [2000].

The C3 Theorem ensures that a valid inequality for the set S(x̄, ω̄) of the form π^T y ≥ π_0(x̄, ω̄) can be translated to an inequality π^T y ≥ π_0(x, ω) that is valid for the set S(x, ω). The cut coefficients, π, are common to both sets. Thus, one may derive the left-hand-side coefficients, π, once and apply them to all scenario problems. The right-hand side π_0(x, ω) is derived as necessary for each pair (x, ω) using a strategy from reverse convex programming in which disjunctive programming is used to provide facets of the convex hull of reverse convex sets (Sen and Sherali [1987]).
Given the valid inequalities π^T y ≥ π_0(x, ω), a lower bound approximation for the scenario subproblem objective function is given by:

    f(x, ω) ≥ f_0(x, ω) ≡ Min { g^T y | W y ≥ r(ω) − T(ω)x, π^T y ≥ π_0(x, ω), y ≥ 0 }.    (4)

We note that a version of Theorem 1 appears in Carøe [1998], although there are some critical distinctions between Sen and Higle [2000] and Carøe [1998]. Specifically, while Sen and Higle [2000] work within the context of the temporal decomposition indicated in (1), (2), Carøe [1998] works within the context of the "deterministic equivalent problem",

    Min c^T x + Σ_{ω∈Ω} p_ω (g_u^T u_ω + g_z^T z_ω)
    s.t. T(ω)x + W_u u_ω + W_z z_ω = r(ω)  ∀ω ∈ Ω
         x ∈ R_+^{n1}, u_ω ∈ R_+^{nu}, z_ω ∈ B^{nz}  ∀ω ∈ Ω.

Accordingly, Carøe's cuts may be translated from one scenario to another, but they are expressed in the higher-dimensional space of (x, u_ω, z_ω), as compared to the (u_ω, z_ω) space of the Sen and Higle cuts. It follows that the Sen and Higle cuts permit both a temporal and a scenario decomposition (i.e., with respect to x and ω), while Carøe's are restricted to scenario decomposition only (i.e., with respect to ω). Another recent paper, Sherali and Fraticelli [2002], also uses cuts in this higher-dimensional space. Their approach uses the Reformulation-Linearization Technique (Sherali and Adams [1990]) to construct the approximation.

1.2. Convexification of the Right-Hand-Side Function

As discussed in Sen and Higle [2000], the function π_0(x, ω) is piecewise linear and concave in the first argument. That is,

    π_0(x, ω) = Min_{h∈H} {ν̄_h(ω) − γ̄_h(ω)^T x}

for a specified collection {ν̄_h(ω), γ̄_h(ω)}_{h∈H}. Consequently, it is necessary to develop a convexification of this function in order to facilitate the solution of the lower bounding approximation (4).
This is accomplished using reverse convex programming techniques, in which disjunctive programming concepts are used to obtain the convex hull of reverse convex sets (Sen and Sherali [1987]). To begin, let the epigraph of π_0(·, ω), restricted to x ∈ X, be defined as

    Π_X(ω) = {(θ, x) | x ∈ X, θ ≥ π_0(x, ω)},

where X is a polyhedral set, X = {x ∈ R_+^{n1} | Ax ≥ b}, with A ∈ R^{m1×n1} and b ∈ R^{m1}. Also let

    E_h(ω) = {(θ, x) | θ ≥ ν̄_h(ω) − γ̄_h(ω)^T x, Ax ≥ b, x ≥ 0}.    (5)

Then Π_X(ω) can be expressed in disjunctive normal form as

    Π_X(ω) = ∪_{h∈H} E_h(ω).

Thus the epigraph of the function π_0 can be represented as a union of polyhedra, which is a disjunctive set. In order to convexify this set, we apply the notion of reverse polars from the theory of disjunctive programming (Balas [1979]). These reverse polars characterize the set of all valid inequalities of a disjunctive set, with the extreme points providing facets of the (closure of the) convex hull of the disjunctive set. The specific construction that we adopt is provided below, and will be referred to as the epi-reverse polar because it represents the reverse polar of the epigraph of π_0. In the following, we assume that θ ≥ 0 in (5) for all x ∈ X. As long as X is bounded, there is no loss of generality with this assumption, because the epigraph can be translated to ensure that θ ≥ 0. The epi-reverse polar of this set, Π†_X(ω), is defined as

    Π†_X(ω) = { (σ_0(ω), σ(ω), δ(ω)) ∈ R × R^{n1} × R | ∀h ∈ H ∃ τ_h ∈ R^{m1}, τ_{0h} ∈ R:
                  σ_0(ω) ≥ τ_{0h}  ∀h ∈ H,
                  Σ_{h} τ_{0h} = 1,
                  σ_j(ω) ≥ τ_h^T A_j + τ_{0h} γ̄_{hj}(ω)  ∀h ∈ H, j = 1, ..., n1,
                  δ(ω) ≤ τ_h^T b + τ_{0h} ν̄_h(ω)  ∀h ∈ H,
                  τ_h ≥ 0, τ_{0h} ≥ 0  ∀h ∈ H }.    (6)

Note that the epi-reverse polar only allows those facets of the convex hull of Π_X(ω) that have a positive coefficient for the variable θ. If {ν̄, γ̄} are given, then

    conv Π_X(ω) = { (θ, x) | x ∈ X, θ ≥ δ(ω)/σ_0(ω) − (σ(ω)/σ_0(ω))^T x  ∀(σ_0(ω), σ(ω), δ(ω)) ∈ Π†_X(ω) }.
Let {(σ_0^i(ω), σ^i(ω), δ^i(ω))}_{i∈I} denote the set of extreme points of the epi-reverse polar, and let ν^i(ω) = δ^i(ω)/σ_0^i(ω) and γ^i(ω) = σ^i(ω)/σ_0^i(ω). For each (x, ω) ∈ X × Ω, let

    π_c(x, ω) = Max_{i∈I} {ν^i(ω) − γ^i(ω)^T x}.

Then for each ω ∈ Ω, π_c(·, ω) is a convex function. Moreover, the epigraph of π_c(·, ω) restricted to X is the closure of the convex hull of Π_X(ω). We refer to π_c(·, ω) as the convex hull approximation of π_0(·, ω), and note that π_0(x, ω) = π_c(x, ω) whenever x is an extreme point of X.

1.3. An Algorithmic Context for the C3 Theorem

As a preview of the D2 algorithm, let us consider the scenario subproblems in a temporal decomposition of the SMIP, (1). If we let x^k denote the first stage solution associated with the k-th algorithmic iteration, the subproblems are of the form:

    f^k(x, ω) = Min { g^T y | W^k y ≥ r^k(ω) − T^k(ω)x, y ∈ R_+^{n2} }.    (7)

Of course, in the first iteration we have

    f^1(x, ω) = Min { g^T y | W y ≥ r(ω) − T(ω)x, y ∈ R_+^{n2} },

the LP relaxation of (2). Thus, the problem is initialized with W^1 = W, r^1(ω) = r(ω), and T^1(ω) = T(ω) as in (2). As iterations progress, cutting planes of the form

    (π^k)^T y ≥ π_c(x^k, ω) = Max_{i∈I} {ν^i(ω) − γ^i(ω)^T x^k}

are added to the subproblem, thereby refining the approximation of the convex hull of integer solutions. As such, the vector π^k is appended to the matrix W^k, and the identified element (ν^i(ω), γ^i(ω)) is appended to (r^k(ω), T^k(ω)). Let

    y^k(ω) ∈ argmin {g^T y | W^k y ≥ r^k(ω) − T^k(ω)x^k, y ∈ R_+^{n2}}.

If z^k(ω), the value assigned to the integer variables in y^k(ω), is integer for all ω, then no update is necessary, and W^{k+1} = W^k, r^{k+1}(ω) = r^k(ω), and T^{k+1}(ω) = T^k(ω). On the other hand, suppose that the subproblems do not yield integer optimal solutions. Let j(k) denote an index j for which z_j^k(ω) is non-integer for some ω ∈ Ω, and let z̄_{j(k)} denote one of the non-integer values {z_j^k(ω)}_{ω∈Ω}.
To eliminate this non-integer solution, a disjunction of the following form may be used:

    S_k(x^k, ω) = S_{0,j(k)}(x^k, ω) ∪ S_{1,j(k)}(x^k, ω),

where

    S_{0,j(k)}(x^k, ω) = {y ∈ R_+^{n2} | W^k y ≥ r^k(ω) − T^k(ω)x^k,    (8a)
                          −z_{j(k)} ≥ −⌊z̄_{j(k)}⌋}    (8b)

and

    S_{1,j(k)}(x^k, ω) = {y ∈ R_+^{n2} | W^k y ≥ r^k(ω) − T^k(ω)x^k,    (9a)
                          z_{j(k)} ≥ ⌈z̄_{j(k)}⌉}.    (9b)

The index j(k) is referred to as the "disjunction variable" for iteration k. Our assumptions ensure that the subproblems remain feasible for any restriction of the integer variables, and thus both (8) and (9) are nonempty. Also, since the disjunction is based on an either-or condition, H = {0, 1} is used. It should be noted that when the integer restrictions are binary, the right-hand side of (8b) is zero, and the right-hand side of (9b) is one. This is precisely the disjunction used in the lift-and-project cuts of Balas, Ceria and Cornuéjols [1993].

Let λ_{0,1} denote the vector of multipliers associated with (8a), and λ_{0,2} the scalar multiplier associated with (8b). Let λ_{1,1} and λ_{1,2} be similarly defined for (9a) and (9b), respectively. Assuming that the sets defined in (8) and (9) are nonempty for all ω ∈ Ω, the following problem may be used to generate the common cut coefficients, π^k, in iteration k:

    Max E[π_0(ω̃)] − E[y^k(ω̃)]^T π
    s.t. π_j ≥ λ_{0,1}^T W_j^k − I_j^k λ_{0,2}  ∀j
         π_j ≥ λ_{1,1}^T W_j^k + I_j^k λ_{1,2}  ∀j
         π_0(ω) ≤ λ_{0,1}^T (r^k(ω) − T^k(ω)x^k) − λ_{0,2} ⌊z̄_{j(k)}⌋  ∀ω ∈ Ω    (10)
         π_0(ω) ≤ λ_{1,1}^T (r^k(ω) − T^k(ω)x^k) + λ_{1,2} ⌈z̄_{j(k)}⌉  ∀ω ∈ Ω
         −1 ≤ π_j ≤ 1  ∀j,  −1 ≤ π_0(ω) ≤ 1  ∀ω ∈ Ω
         λ_{0,1}, λ_{0,2}, λ_{1,1}, λ_{1,2} ≥ 0,

where W_j^k denotes the j-th column of W^k, and I_j^k = 0 if j ≠ j(k), and 1 otherwise. The validity of the cut coefficients generated above follows from the disjunctive cut principle (Balas [1979]), which requires the multipliers (λ) to be chosen in such a way that the cut coefficients dominate the aggregated columns as specified above. Since the coefficients π are independent of ω, the above LP generates common cut coefficients.
This LP/SLP is formulated following the standard approach to generating valid inequalities in disjunctive programming (Sherali and Shetty [1980]), and it optimizes a measure of the distance of the current solution y^k(ω) from the cut. It is interesting to note that this problem is a simple recourse problem, and may be interpreted as a stochastic version of the linear program used to generate lift-and-project cuts. Since the disjunction used for cut formation has H = {0, 1}, the epigraph of π_0(·, ω) is a union of two polyhedral sets. Therefore, for each ω ∈ Ω, the following parameters derived from an optimal solution of (10),

    ν̄_0^k(ω) = λ_{0,1}^T r^k(ω) − λ_{0,2} ⌊z̄_{j(k)}⌋,
    ν̄_1^k(ω) = λ_{1,1}^T r^k(ω) + λ_{1,2} ⌈z̄_{j(k)}⌉, and
    [γ̄_h^k(ω)]^T = λ_{h,1}^T T^k(ω)  ∀h ∈ H,

are used to update the approximation of the polyhedron defined via (6), which we denote as (Π†_X(ω))^k. This polyhedron represents the epi-reverse polar, which provides access to the convexification of π_0. Correspondingly, for each ω ∈ Ω, the following LP is used to approximate π_0(x, ω):

    Max δ(ω) − σ_0(ω) − (x^k)^T σ(ω)
    s.t. (σ_0(ω), σ(ω), δ(ω)) ∈ (Π†_X(ω))^k.    (11)

With an optimal solution to (11), (σ_0^k(ω), σ^k(ω), δ^k(ω)), we obtain ν^k(ω) = δ^k(ω)/σ_0^k(ω) and γ^k(ω) = σ^k(ω)/σ_0^k(ω). For each ω ∈ Ω, these coefficients are used to update the right-hand-side functions, r^{k+1}(ω) = [r^k(ω); ν^k(ω)] and T^{k+1}(ω) = [T^k(ω); γ^k(ω)^T]. Similarly, the solution to (10) is used to update the constraint matrix, W^{k+1} = [W^k; (π^k)^T]. The master program is defined as:

    Min c^T x + F^k(x)
    s.t. Ax ≥ b, x ∈ X ∩ B,    (12)

where F^k(·) is a piecewise linear approximation of the subproblem objective function E[f(x, ω̃)] in the k-th iteration.

1.4. Disjunctive Decomposition with Set Convexification

The Basic D2 Algorithm (Sen and Higle [2000]) can be stated as follows:

Basic D2 Algorithm

0. Initialize. V_1 ← ∞. ε > 0 and x^1 ∈ X are given.
k ← 1, W^1 ← W, T^1(ω) ← T(ω), and r^1(ω) ← r(ω).

1. Solve one LP subproblem for each ω ∈ Ω. For each ω ∈ Ω, use the matrix W^k and the right-hand side r^k(ω) − T^k(ω)x^k to solve (7). If {y^k(ω)}_{ω∈Ω} satisfy the integer restrictions, V_{k+1} ← Min{c^T x^k + E[f(x^k, ω̃)], V_k}, and go to step 4.

2. Solve the multiplier/cut generation LP/SLP and perform updates. Choose a disjunction variable j(k). (i) Formulate and solve (10) to obtain π^k, and define W^{k+1} = [W^k; (π^k)^T]. (ii) Using the multipliers λ_0^k, λ_1^k and the value z̄_{j(k)} obtained in (i), solve (11) for each outcome ω. The solution defines ν^k(ω) and γ^k(ω), which are used to update the right-hand-side functions: r^{k+1}(ω) = [r^k(ω); ν^k(ω)] and T^{k+1}(ω) = [T^k(ω); γ^k(ω)^T].

3. Update and solve one LP subproblem for each ω ∈ Ω. For each ω ∈ Ω, solve (7) using W^{k+1} and r^{k+1}(ω) − T^{k+1}(ω)x^k. If y^k(ω) satisfies the integer restrictions for all ω ∈ Ω, V_{k+1} ← Min{c^T x^k + E[f(x^k, ω̃)], V_k}. Otherwise, V_{k+1} ← V_k.

4. Update and solve the master problem. Using the dual multipliers from the most recently solved subproblem for each ω ∈ Ω (either step 1 or step 3), update the approximation F^k by adopting a standard decomposition method (e.g., Benders [1962]). Let x^{k+1} ∈ argmin{c^T x + F^k(x) | x ∈ X}, and let v_k denote the optimal value of the master problem. If V_k − v_k ≤ ε, stop. Otherwise, k ← k + 1 and repeat from step 1.

2. An Illustration of the D2 Algorithm

Consider the following two-stage SMIP example with two scenarios:

    Min −1.5x1 − 4x2 + E[f(x1, x2, ω̃)]
    s.t. x1, x2 ∈ {0, 1},

where

    f(x1, x2, ω) = Min −16y1 − 19y2 − 23y3 − 28y4
                   s.t. −2y1 − 3y2 − 4y3 − 5y4 ≥ −ω^1 + x1
                        −6y1 − 1y2 − 3y3 − 2y4 ≥ −ω^2 + x2
                        y1, y2, y3, y4 ∈ {0, 1}.

The first stage variables are x = [x1, x2]^T, while the second stage variables are y = [y1, y2, y3, y4]^T. The two scenarios are ω1 = [5, 2]^T and ω2 = [10, 3]^T, each occurring with probability 0.5.
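This instance is small enough (two binary first-stage variables, four binary second-stage variables, two scenarios) that its true optimal value can be confirmed by exhaustive enumeration. The following sketch is purely a check on the example, not part of the D2 method:

```python
from itertools import product

# Scenario data: omega = (omega^1, omega^2), each with probability 0.5.
SCENARIOS = [(5, 2), (10, 3)]

def f(x1, x2, omega):
    """Second-stage value: enumerate the binary y feasible for this (x, omega)."""
    w1, w2 = omega
    best = float("inf")
    for y1, y2, y3, y4 in product((0, 1), repeat=4):
        if (-2*y1 - 3*y2 - 4*y3 - 5*y4 >= -w1 + x1 and
                -6*y1 - 1*y2 - 3*y3 - 2*y4 >= -w2 + x2):
            best = min(best, -16*y1 - 19*y2 - 23*y3 - 28*y4)
    return best  # y = 0 is always feasible here, so best < inf

def total(x1, x2):
    return -1.5*x1 - 4*x2 + 0.5*sum(f(x1, x2, w) for w in SCENARIOS)

best_x = min(product((0, 1), repeat=2), key=lambda x: total(*x))
print(best_x, total(*best_x))   # (0, 0) -37.5
```

The enumeration confirms the optimal first-stage decision x = (0, 0) with objective value −37.5, which is the solution the D2 algorithm reaches below.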
This instance is motivated by the example in Schultz, Stougie, and van der Vlerk [1998], where the second stage involves general integer variables. To ensure that the subproblems remain feasible for any restriction on the integer variables, we include an artificial variable, denoted R, which is penalized in the objective at a rate of 100. Thus, we recast the problem as

    Min −1.5x1 − 4x2 + 0.5 f1(x1, x2, ω1) + 0.5 f2(x1, x2, ω2)
    s.t. x1, x2 ∈ {0, 1},

where

    f1(x1, x2, ω1) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
                     s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −5 + x1
                          −6y1 − 1y2 − 3y3 − 2y4 + R ≥ −2 + x2
                          y1, y2, y3, y4 ∈ {0, 1}, R ≥ 0

and

    f2(x1, x2, ω2) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
                     s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −10 + x1
                          −6y1 − 1y2 − 3y3 − 2y4 + R ≥ −3 + x2
                          y1, y2, y3, y4 ∈ {0, 1}, R ≥ 0.

In this problem we have the following input data:

    A = [−1  0      b = [−1
          0 −1],         −1],

    W = [−2 −3 −4 −5  1        T(ω) = [−1  0
         −6 −1 −3 −2  1                 0 −1
         −1  0  0  0  0                 0  0
          0 −1  0  0  0                 0  0
          0  0 −1  0  0                 0  0
          0  0  0 −1  0],               0  0]  (for both scenarios),

    r(ω1) = [−5, −2, −1, −1, −1, −1]^T and r(ω2) = [−10, −3, −1, −1, −1, −1]^T.

Note that with A and b as defined above, binary solutions are extreme points of X = {x | Ax ≥ b, x ≥ 0}, as required. It is easily seen that for binary values of the second stage variables, a lower bound on the objective −16y1 − 19y2 − 23y3 − 28y4 is −86. In order to be consistent with the requirement that the lower bound on the second stage objective value must be zero, we translate the second stage objective function by adding 86, thereby ensuring nonnegativity after the translation. We can now start the D2 algorithm.

Iteration 1 (k = 1)

Step 0. The D2 algorithm is initialized with the following master program:

    Min −1.5x1 − 4x2 + θ
    s.t. −x1 ≥ −1
         −x2 ≥ −1
         x1, x2 ∈ {0, 1}, θ ≥ 0.

The initial master program yields x^1 = [1, 1]^T and θ = 0. The upper and lower bounds are initialized as V_0 = ∞ and v_0 = −5.5, respectively.
For the first iteration of the algorithm we set V_1 = V_0, W^1 = W, T^1(ω) = T(ω), and r^1(ω) = r(ω).

Step 1. For step 1 of the algorithm we use x1 = 1, x2 = 1 and solve the linear relaxation of the second stage subproblem for ω1 and ω2, which we call LP1 and LP2, respectively:

    LP1: f1(1, 1, ω1) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −4
              −6y1 − 1y2 − 3y3 − 2y4 + R ≥ −1
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0

and

    LP2: f2(1, 1, ω2) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −9
              −6y1 − 1y2 − 3y3 − 2y4 + R ≥ −2
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0.

The optimal solution for LP1 is y(ω1) = [0, 1, 0, 0], R(ω1) = 0, and for LP2 it is y(ω2) = [0, 1, 0, 0.5], R(ω2) = 0.

Step 2. Since y(ω2) does not satisfy the integer restrictions, we choose y4 as the "disjunction variable" and create the disjunction y4 ≤ 0 or y4 ≥ 1 for LP2. We formulate (10), which yields the vector π^1 for updating W^1, and the data for (11), whose optimal solution is used to update the right-hand side of the second-stage constraints. An optimal solution for (10) is π^1 = [1, −1, 1, −1, 1], λ_{0,1} = [0, 0, 0, 1, 0, 0], λ_{0,2} = 1, λ_{1,1} = [0, 1, 0, 0, 0, 0], and λ_{1,2} = 1. We obtain W^2 by appending π^1 to W^1:

    W^2 = [W^1; 1 −1 1 −1 1].

Using the solution from (10), we formulate and solve (11) for both ω1 and ω2. The optimal solution for ω1 is δ(ω1) = −0.5, σ_0(ω1) = 0.5, and σ(ω1) = [0, 0], so that ν(ω1) = δ(ω1)/σ_0(ω1) = −1 and γ(ω1) = [0, 0]. Based on this solution we update r^1(ω1) and T^1(ω1) as follows:

    r^2(ω1) = [r^1(ω1); −1],  T^2(ω1) = [T^1(ω1); 0 0].

For ω2 the optimal solution of LP (11) is δ(ω2) = −1, σ_0(ω2) = 0.5, and σ(ω2) = [0, −0.5], so that ν(ω2) = −2 and γ(ω2) = [0, −1]. Similarly, we update r^1(ω2) and T^1(ω2) as follows:

    r^2(ω2) = [r^1(ω2); −2],  T^2(ω2) = [T^1(ω2); 0 −1].

This completes Step 2 of the algorithm.
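The scenario-specific right-hand sides just computed show the C3 Theorem at work: the coefficients π^1 = [1, −1, 1, −1, 1] are shared across scenarios, while the right-hand side ν(ω) − γ(ω)^T x shifts with (x, ω). The sketch below only checks the two LP2 points discussed in the text (the fractional solution and one integer-feasible point); it is not a proof of validity:

```python
# Common cut pi^T y >= nu(omega) - gamma(omega)^T x, with pi shared across scenarios.
pi = [1, -1, 1, -1, 1]                 # coefficients on (y1, y2, y3, y4, R)
cut_rhs = {"omega1": (-1, [0, 0]),     # (nu, gamma) pairs from LP (11)
           "omega2": (-2, [0, -1])}

def rhs(scenario, x):
    nu, gamma = cut_rhs[scenario]
    return nu - sum(g * xj for g, xj in zip(gamma, x))

def lhs(y):
    return sum(p * yj for p, yj in zip(pi, y))

x = [1, 1]
frac = [0, 1, 0, 0.5, 0]   # fractional LP2 point (y1..y4, R) from Step 1
integ = [0, 1, 0, 1, 1]    # an integer-feasible LP2 point (needs R = 1)

print(lhs(frac), rhs("omega2", x))    # -1.5 < -1: the fractional point is cut off
print(lhs(integ), rhs("omega2", x))   # -1.0 >= -1: the integer point survives
```

The same `pi` vector would be reused for ω1 with its own right-hand side, which is exactly the computational saving the C3 Theorem delivers.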
Step 3. Solving (7), we obtain y(ω1) = [0, 1, 0, 0], R(ω1) = 0, and y(ω2) = [0, 1, 0.2, 0.2], R(ω2) = 0. The dual solutions are d(ω1) = [0, 14, 0, 0, 5, 0, 0] and d(ω2) = [0, 10.2, 7.6, 0, 1.2, 0, 0]. V_2 ← V_1, because the integer restrictions are not satisfied.

Step 4. Using the dual solution for each subproblem from Step 3, we formulate the "optimality cuts" as in Benders' decomposition. The resulting cuts are 0x1 − 14x2 + η1 ≥ −33 for ω1 and 0x1 − 17.8x2 + η2 ≥ −47 for ω2. Since the two scenarios are equally likely, taking expectations of the cut coefficients yields 0x1 − 15.9x2 + η ≥ −40. Applying the translation θ = η + 86, we get 0x1 − 15.9x2 + θ ≥ 46 as the optimality cut to add to the master program:

    Min −1.5x1 − 4x2 + θ
    s.t. −x1 ≥ −1
         −x2 ≥ −1
         0x1 − 15.9x2 + θ ≥ 46
         x1, x2 ∈ {0, 1}, θ ≥ 0.

Solving the master program we get x^2 = [1, 0]^T, θ = 46, and an objective value of 44.5. Therefore, the lower bound becomes v_2 = 44.5. The upper bound remains the same, V_2 = V_1 = ∞. This completes the first iteration of the algorithm. Since V_2 > v_2, k ← 2, and we begin the next iteration.

Iteration 2

Step 1. We start the second iteration by solving the following updated subproblems with x^2 = [1, 0]^T:

    LP1: f1(1, 0, ω1) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −4
              −6y1 − y2 − 3y3 − 2y4 + R ≥ −2
              y1 − y2 + y3 − y4 + R ≥ −1
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0

and

    LP2: f2(1, 0, ω2) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −9
              −6y1 − y2 − 3y3 − 2y4 + R ≥ −3
              y1 − y2 + y3 − y4 + R ≥ −2
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0.

The optimal solution for LP1 is y(ω1) = [0.108108, 1, 0.027027, 0.135135] and for LP2 it is y(ω2) = [0, 1, 0, 1]. (Note that the cut row carries the coefficients π^1 = [1, −1, 1, −1, 1], so the coefficient on y3 is +1; the reported LP1 solution satisfies this row with equality.)
Step 2. Since y(ω1) does not satisfy the integer restrictions, we choose y4 as the "disjunction variable" and create the disjunction y4 ≤ 0 or y4 ≥ 1 for LP1. We formulate and solve (10), which yields the data used to update W^2 and to formulate (11), whose optimal solution is used to update the right-hand side of the second-stage constraints. Solving LP (10) we obtain π^2 = [0, −0.5, 0, −0.5, 1], λ_{0,1} = [0, 0, 0, 0, 0.5, 0, 0], λ_{0,2} = 0.5, λ_{1,1} = [0, 0.125, 0, 0.375, 0, 0, 0], and λ_{1,2} = 0.125. We obtain W^3 by appending π^2 to W^2:

    W^3 = [W^2; 0 −0.5 0 −0.5 1].

Using the solution of (10) we formulate and solve (11) for both ω1 and ω2. The optimal solution for ω1 is δ(ω1) = −0.25, σ_0(ω1) = 0.5, and σ(ω1) = [0, 0]. Based on this solution we update r^2(ω1) and T^2(ω1) as follows:

    r^3(ω1) = [r^2(ω1); −0.5],  T^3(ω1) = [T^2(ω1); 0 0].

For ω2 the optimal solution of LP (11) is δ(ω2) = −0.5, σ_0(ω2) = 0.5, and σ(ω2) = [0, 0]. Similarly, we update r^2(ω2) and T^2(ω2) as follows:

    r^3(ω2) = [r^2(ω2); −1],  T^3(ω2) = [T^2(ω2); 0 0].

This completes Step 2 of the algorithm.

Step 3. Solving (7) we obtain y(ω1) = [0.055556, 1, 0.22222, 0], R = 0, and y(ω2) = [0, 1, 0, 1], R = 0. The dual solutions are d(ω1) = [5, 1, 0, 2, 0, 2, 0, 0] and d(ω2) = [0, 7.667, 0, 0, 0, 11.333, 0, 12.667]. V_3 ← V_2 because the integer restrictions have not been met.

Step 4. Using the dual solution for each subproblem from Step 3, we formulate the "optimality cuts" as in Benders' decomposition. The resulting cuts are −5x1 − 1x2 + η1 ≥ −30 for ω1 and 0x1 − 7.667x2 + η2 ≥ −47 for ω2. Taking expectations of the cut coefficients yields −2.5x1 − 4.334x2 + η ≥ −38.5. Applying the translation θ = η + 86, we get −2.5x1 − 4.334x2 + θ ≥ 47.5 as the optimality cut to add to the master program:

    Min −1.5x1 − 4x2 + θ
    s.t. −x1 ≥ −1
         −x2 ≥ −1
         0x1 − 15.9x2 + θ ≥ 46
         −2.5x1 − 4.334x2 + θ ≥ 47.5
         x1, x2 ∈ {0, 1}, θ ≥ 0.
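The Step 4 aggregation is plain expectation over the equally likely scenarios followed by the θ = η + 86 translation, which shifts only the right-hand side. A minimal sketch with the iteration-2 numbers above:

```python
# Scenario optimality cuts a1*x1 + a2*x2 + eta >= rhs, each with probability 0.5.
cuts = {"omega1": ([-5.0, -1.0], -30.0),
        "omega2": ([0.0, -7.667], -47.0)}
p = {"omega1": 0.5, "omega2": 0.5}

# Expected cut: probability-weighted coefficients and right-hand side.
coeff = [sum(p[w] * cuts[w][0][j] for w in cuts) for j in range(2)]
rhs = sum(p[w] * cuts[w][1] for w in cuts)

# Translation theta = eta + 86 shifts only the right-hand side.
rhs_theta = rhs + 86

print(coeff[0], round(coeff[1], 4), rhs_theta)   # -2.5 -4.3335 47.5
```

The paper rounds the second coefficient to −4.334; the recovered right-hand side 47.5 matches the cut added to the master program.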
Solving the master program we get x^3 = [0, 0]^T, θ = 47.5, and an objective value of 47.5. Therefore, the lower bound becomes v_3 = 47.5. k ← 3, and we begin the next iteration.

Iteration 3

Step 1. We start the third iteration by solving the following updated subproblems with x^3 = [0, 0]^T:

    LP1: f1(0, 0, ω1) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −5
              −6y1 − y2 − 3y3 − 2y4 + R ≥ −2
              y1 − y2 + y3 − y4 + R ≥ −1
              0y1 − 0.5y2 − 0y3 − 0.5y4 + R ≥ −0.5
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0

and

    LP2: f2(0, 0, ω2) = Min −16y1 − 19y2 − 23y3 − 28y4 + 100R
         s.t. −2y1 − 3y2 − 4y3 − 5y4 + R ≥ −10
              −6y1 − y2 − 3y3 − 2y4 + R ≥ −3
              y1 − y2 + y3 − y4 + R ≥ −2
              0y1 − 0.5y2 − 0y3 − 0.5y4 + R ≥ −1
              −y1 ≥ −1, −y2 ≥ −1, −y3 ≥ −1, −y4 ≥ −1
              y1, y2, y3, y4, R ≥ 0.

The optimal solution for LP1 is y(ω1) = [0, 0, 0, 1], R(ω1) = 0, and for LP2 it is y(ω2) = [0, 1, 0, 1], R(ω2) = 0. The dual solutions are d(ω1) = [0, 7.6667, 0, 25.333, 0, 0, 0, 0] and d(ω2) = [0, 7.667, 0, 0, 0, 11.333, 0, 12.667]. We now have an incumbent integer solution x = [0, 0]^T, y(ω1) = [0, 0, 0, 1], y(ω2) = [0, 1, 0, 1], with θ = 86 + 0.5(−28) + 0.5(−47) = 48.5. V_4 ← Min{48.5, V_3} = 48.5. We go to step 4 of the algorithm.

Step 4. Using the dual solution for each subproblem from Step 1 of this iteration, we formulate the "optimality cuts" as in Benders' decomposition. The resulting cuts are 0x1 − 7.667x2 + η1 ≥ −28 for ω1 and 0x1 − 7.667x2 + η2 ≥ −47 for ω2, and taking expectations yields 0x1 − 7.667x2 + η ≥ −37.5. Applying the translation θ = η + 86, we get 0x1 − 7.667x2 + θ ≥ 48.5 as the optimality cut to add to the master program:

    Min −1.5x1 − 4x2 + θ
    s.t. −x1 ≥ −1
         −x2 ≥ −1
         0x1 − 15.9x2 + θ ≥ 46
         −2.5x1 − 4.334x2 + θ ≥ 47.5
         0x1 − 7.667x2 + θ ≥ 48.5
         x1, x2 ∈ {0, 1}, θ ≥ 0.

Solving the master program we get x^4 = [1, 0]^T, θ = 50, and an objective value of 48.5. Therefore, the lower bound becomes v_4 = 48.5.
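The iteration-3 claims are easy to check numerically: both reported solutions are feasible for their augmented subproblems (all eight rows), and the incumbent value 86 + 0.5(−28) + 0.5(−47) = 48.5 matches the bound at termination. A sketch of that check, using the data assembled above:

```python
# Augmented subproblem rows at x = (0, 0): W^3 y >= r^3(omega).
# Since x = 0, the T-columns contribute nothing to the right-hand side.
W3 = [[-2, -3, -4, -5, 1],
      [-6, -1, -3, -2, 1],
      [-1, 0, 0, 0, 0],
      [0, -1, 0, 0, 0],
      [0, 0, -1, 0, 0],
      [0, 0, 0, -1, 0],
      [1, -1, 1, -1, 1],          # cut pi^1 from iteration 1
      [0, -0.5, 0, -0.5, 1]]      # cut pi^2 from iteration 2
r3 = {"omega1": [-5, -2, -1, -1, -1, -1, -1, -0.5],
      "omega2": [-10, -3, -1, -1, -1, -1, -2, -1]}
y = {"omega1": [0, 0, 0, 1, 0],   # (y1..y4, R) reported in iteration 3
     "omega2": [0, 1, 0, 1, 0]}

def feasible(scenario):
    return all(sum(wij * yj for wij, yj in zip(row, y[scenario])) >= ri
               for row, ri in zip(W3, r3[scenario]))

obj = lambda s: -16*y[s][0] - 19*y[s][1] - 23*y[s][2] - 28*y[s][3] + 100*y[s][4]
print(feasible("omega1"), feasible("omega2"))        # True True
print(86 + 0.5*obj("omega1") + 0.5*obj("omega2"))    # 48.5
```

Both integer solutions in fact satisfy every cut row with equality, which is consistent with the cuts having carved out the convex hull near these points.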
Since the upper bound (V_4 = 48.5) and the lower bound (v_4 = 48.5) are now equal, the algorithm terminates and we have an optimal solution: x = [0, 0]^T, y(ω1) = [0, 0, 0, 1], y(ω2) = [0, 1, 0, 1], with objective value −37.5. It so happens that both [0, 0]^T and [1, 0]^T are optimal for the master problem, but optimality can only be concluded for the point [0, 0]^T, since that is the incumbent. It is interesting to note that while the cuts used in LP1 and LP2 in iteration 3 were obtained at x^1 = [1, 1]^T and x^2 = [1, 0]^T, integer solutions (y(ω1) = [0, 0, 0, 1] and y(ω2) = [0, 1, 0, 1]) were obtained with the first relaxations solved at x^3 = [0, 0]^T. The credit for this should go to the C3 Theorem.

3. Conclusions

This paper has presented the main results on set convexification for large scale Stochastic Integer Programming and has given an illustration of the new decomposition method called the D2 algorithm. At the heart of this novel method is the C3 Theorem, which allows both a temporal and a scenario decomposition of the SMIP. We have used a simple example to illustrate the application of the D2 algorithm. In this example the D2 algorithm converges to an optimal solution in three iterations. The example clearly illustrates how the second-stage convexifications are sequentially carried out and how they impact the first stage objective function. Our primary focus in this paper is the generation of cutting planes within a temporal decomposition of two-stage SMIPs. We note, however, that cutting planes alone are typically inadequate for solving large mixed-integer programs. Thus, our ultimate goal is to use cuts such as those discussed in this paper within a Branch-and-Cut (BAC) setting, where careful generation of cuts is necessary to further enhance the success of BAC-type algorithms for solving SMIP problems. Therefore, our future work is to incorporate the D2 algorithm in a branch-and-cut setting.
Moreover, the computational demands of this class of problems call for the use of high performance computing platforms.

Acknowledgements: This research was supported by a grant from the National Science Foundation.

References

Balas, E. [1979], "Disjunctive Programming," Annals of Discrete Mathematics, 5, pp. 3-51.
Balas, E., S. Ceria, and G. Cornuéjols [1993], "A lift-and-project cutting plane algorithm for mixed 0-1 integer programs," Math. Programming, 58, pp. 295-324.
Benders, J.F. [1962], "Partitioning procedures for solving mixed-variable programming problems," Numerische Mathematik, 4, pp. 238-252.
Blair, C. [1995], "A closed-form representation of mixed-integer program value functions," Math. Programming, 71, pp. 127-136.
Blair, C., and R. Jeroslow [1982], "The value function of an integer program," Math. Programming, 23, pp. 237-273.
Carøe, C.C. [1998], Decomposition in Stochastic Integer Programming, Ph.D. thesis, Institute of Mathematical Sciences, Dept. of Operations Research, University of Copenhagen, Denmark.
Schultz, R., L. Stougie, and M.H. van der Vlerk [1998], "Solving stochastic programs with integer recourse by enumeration: a framework using Gröbner basis reduction," Math. Programming, 83, pp. 71-94.
Sen, S. and J.L. Higle [2000], The C3 Theorem and a D2 Algorithm for Large Scale Stochastic Integer Programming: Set Convexification, Working Paper, Dept. of Systems and Industrial Engineering, The University of Arizona, Tucson, AZ 85721 (submitted to Math. Programming).
Sen, S. and H.D. Sherali [1987], "Nondifferentiable reverse convex programs and facetial cuts via a disjunctive characterization," Math. Programming, 37, pp. 169-183.
Sherali, H.D. and W.P. Adams [1990], "A hierarchy of relaxations between the continuous and convex hull representations for 0-1 programming problems," SIAM J. on Discrete Mathematics, 3, pp. 411-430.
Sherali, H.D. and B.M.P.
Fraticelli [2002], “A modification of Benders’ decomposition algorithm for discrete subproblems: an approach for stochastic programs with integer recourse,” Journal of Global Optimization, 22, pp. 319-342. Sherali, H.D. and C.M. Shetty [1980], Optimization with Disjunctive Constraints, Lecture Notes in Economics and Math. Systems, Vol. 181, Springer-Verlag, Berlin.