Mixed Integer Nonlinear Programming (MINLP) Sven Leyffer Jeff Linderoth MCS Division Argonne National Lab leyffer@mcs.anl.gov ISE Department Lehigh University jtl3@lehigh.edu 1 / 160 Overview 1 Introduction, Applications, and Formulations 2 Classical Solution Methods 3 Modern Developments in MINLP 4 Implementation and Software 2 / 160 Part I Introduction, Applications, and Formulations 3 / 160 The Problem of the Day Mixed Integer Nonlinear Program (MINLP) f (x, y ) minimize x,y subject to c(x, y ) ≤ 0 x ∈ X , y ∈ Y integer 4 / 160 The Problem of the Day Mixed Integer Nonlinear Program (MINLP) f (x, y ) minimize x,y subject to c(x, y ) ≤ 0 x ∈ X , y ∈ Y integer f , c smooth (convex) functions X , Y polyhedral sets, e.g. Y = {y ∈ [0, 1]p | Ay ≤ b} 5 / 160 The Problem of the Day Mixed Integer Nonlinear Program (MINLP) f (x, y ) minimize x,y subject to c(x, y ) ≤ 0 x ∈ X , y ∈ Y integer f , c smooth (convex) functions X , Y polyhedral sets, e.g. Y = {y ∈ [0, 1]p | Ay ≤ b} y ∈ Y integer ⇒ hard problem 6 / 160 The Problem of the Day Mixed Integer Nonlinear Program (MINLP) f (x, y ) minimize x,y subject to c(x, y ) ≤ 0 x ∈ X , y ∈ Y integer f , c smooth (convex) functions X , Y polyhedral sets, e.g. Y = {y ∈ [0, 1]p | Ay ≤ b} y ∈ Y integer ⇒ hard problem f , c not convex ⇒ very hard problem 7 / 160 Why the MI? We can use 0-1 (binary) variables for a variety of purposes Modeling yes/no decisions Enforcing disjunctions Enforcing logical conditions Modeling fixed costs Modeling piecewise linear functions 8 / 160 Why the MI? We can use 0-1 (binary) variables for a variety of purposes Modeling yes/no decisions Enforcing disjunctions Enforcing logical conditions Modeling fixed costs Modeling piecewise linear functions If the variable is associated with a physical entity that is indivisible, then it must be integer Number of aircraft carriers to to produce. Gomory’s Initial Motivation 9 / 160 A Popular MINLP Method Dantzig’s Two-Phase Method for MINLP Adapted by Leyffer and Linderoth 10 / 160 A Popular MINLP Method Dantzig’s Two-Phase Method for MINLP 1 Adapted by Leyffer and Linderoth Convince the user that he or she does not wish to solve a mixed integer nonlinear programming problem at all! 11 / 160 A Popular MINLP Method Dantzig’s Two-Phase Method for MINLP Adapted by Leyffer and Linderoth 1 Convince the user that he or she does not wish to solve a mixed integer nonlinear programming problem at all! 2 Otherwise, solve the continuous relaxation (NLP) and round off the minimizer to the nearest integer. 12 / 160 A Popular MINLP Method Dantzig’s Two-Phase Method for MINLP Adapted by Leyffer and Linderoth 1 Convince the user that he or she does not wish to solve a mixed integer nonlinear programming problem at all! 2 Otherwise, solve the continuous relaxation (NLP) and round off the minimizer to the nearest integer. For 0 − 1 problems, or those in which the |y | is “small”, the continuous approximation to the discrete decision is not accurate enough for practical purposes. Conclusion: MINLP methods must be studied! 13 / 160 Example: Core Reload Operation (Quist, A.J., 2000) max. reactor efficiency after reload subject to diffusion PDE & safety diffusion PDE ' nonlinear equation ⇒ integer & nonlinear model avoid reactor becoming sub-critical 14 / 160 Example: Core Reload Operation (Quist, A.J., 2000) max. reactor efficiency after reload subject to diffusion PDE & safety diffusion PDE ' nonlinear equation ⇒ integer & nonlinear model avoid reactor becoming overheated 15 / 160 Example: Core Reload Operation (Quist, A.J., 2000) look for cycles for moving bundles: e.g. 4 → 6 → 8 → 10 i.e. bundle moved from 4 to 6 ... model with binary xilm ∈ {0, 1} xilm = 1 ⇔ node i has bundle l of cycle m 16 / 160 AMPL Model of Core Reload Operation Exactly one bundle per node: L X M X xilm = 1 ∀i ∈ I l=1 m=1 AMPL model: var x {I,L,M} binary ; Bundle {i in I}: sum{l in L, m in M} x[i,l,m] = 1 ; Multiple Choice: One of the most common uses of IP Full AMPL model c-reload.mod at www.mcs.anl.gov/~leyffer/MacMINLP/ 17 / 160 Gas Transmission Problem (De Wolf and Smeers, 2000) Belgium has no gas! 18 / 160 Gas Transmission Problem (De Wolf and Smeers, 2000) Belgium has no gas! All natural gas is imported from Norway, Holland, or Algeria. Supply gas to all demand points in a network in a minimum cost fashion. 19 / 160 Gas Transmission Problem (De Wolf and Smeers, 2000) Belgium has no gas! All natural gas is imported from Norway, Holland, or Algeria. Supply gas to all demand points in a network in a minimum cost fashion. Gas is pumped through the network with a series of compressors There are constraints on the pressure of the gas within the pipe 20 / 160 Pressure Loss is Nonlinear Assume horizontal pipes and steady state flows Pressure loss p across a pipe is related to the flow rate f as 2 2 = pin − pout 1 sign(f )f 2 Ψ Ψ: “Friction Factor” 21 / 160 Gas Transmission: Problem Input Network (N, A). A = Ap ∪ Aa Aa : active arcs have compressor. Flow rate can increase on arc Ap : passive arcs simply conserve flow rate Ns ⊆ N: set of supply nodes ci , i ∈ Ns : Purchase cost of gas s i , s i : Lower and upper bounds on gas “supply” at node i p i , p i : Lower and upper bounds on gas pressure at node i 22 / 160 Gas Transmission: Problem Input Network (N, A). A = Ap ∪ Aa Aa : active arcs have compressor. Flow rate can increase on arc Ap : passive arcs simply conserve flow rate Ns ⊆ N: set of supply nodes ci , i ∈ Ns : Purchase cost of gas s i , s i : Lower and upper bounds on gas “supply” at node i p i , p i : Lower and upper bounds on gas pressure at node i si , i ∈ N: supply at node i. si > 0 ⇒ gas added to the network at node i si < 0 ⇒ gas removed from the network at node i to meet demand fij , (i, j) ∈ A: flow along arc (i, j) f (i, j) > 0 ⇒ gas flows i → j f (i, j) < 0 ⇒ gas flows j → i 23 / 160 Gas Transmission Model min X cj sj j∈Ns subject to X sign(fij )fij2 − sign(fij )fij2 − fij j|(i,j)∈A Ψij (pi2 − pj2 ) Ψij (pi2 − pj2 ) si pi fij = si = ≥ ∈ ∈ ≥ ∀i ∈ N 0 ∀(i, j) ∈ Ap 0 ∀(i, j) ∈ Aa ∀i ∈ N [s i , s i ] [p i , p i ] ∀i ∈ N 0 ∀(i, j) ∈ Aa 24 / 160 Your First Modeling Trick Don’t include nonlinearities or nonconvexities unless necessary! Replace pi2 ← ρi 25 / 160 Your First Modeling Trick Don’t include nonlinearities or nonconvexities unless necessary! Replace pi2 ← ρi sign(fij )fij2 − Ψij (ρi − ρj ) = 0 ∀(i, j) ∈ Ap 2 fij − Ψij (ρi − ρj ) ≥ 0 ∀(i, j) ∈ Aa p p ∀i ∈ N ρi ∈ [ p i , p i ] 26 / 160 Your First Modeling Trick Don’t include nonlinearities or nonconvexities unless necessary! Replace pi2 ← ρi sign(fij )fij2 − Ψij (ρi − ρj ) = 0 ∀(i, j) ∈ Ap 2 fij − Ψij (ρi − ρj ) ≥ 0 ∀(i, j) ∈ Aa p p ∀i ∈ N ρi ∈ [ p i , p i ] This trick only works because 1 2 pi2 terms appear only in the bound constraints Also fij ≥ 0 ∀(i, j) ∈ Aa This model is nonconvex: sign(fij )fij2 is a nonconvex function Some solvers do not like sign 27 / 160 Dealing with sign(·): The NLP Way Use auxiliary binary variables to indicate direction of flow Let |fij | ≤ F ∀(i, j) ∈ Ap 1 fij ≥ 0 zij = 0 fij ≤ 0 fij ≥ −F (1 − zij ) fij ≤ Fzij Note that sign(fij ) = 2zij − 1 Write constraint as (2zij − 1)fij2 − Ψij (ρi − ρj ) = 0. 28 / 160 Special Ordered Sets Sven thinks this ’NLP trick’ is pretty cool 29 / 160 Special Ordered Sets Sven thinks this ’NLP trick’ is pretty cool It is not how it is done in De Wolf and Smeers (2000). 30 / 160 Special Ordered Sets Sven thinks this ’NLP trick’ is pretty cool It is not how it is done in De Wolf and Smeers (2000). Heuristic for finding a good starting solution, then a local optimization approach based on a piecewise-linear simplex method Another (similar) approach involves approximating the nonlinear function by piecewise linear segments, but searching for the globally optimal solution: Special Ordered Sets of Type 2 If the “multidimensional” nonlinearity cannot be removed, resort to Special Ordered Sets of Type 3 31 / 160 Portfolio Management N: Universe of asset to purchase xi : Amount of asset i to hold B: Budget ( min |N| x∈R+ u(x) | ) X xi = B i∈N 32 / 160 Portfolio Management N: Universe of asset to purchase xi : Amount of asset i to hold B: Budget ( min |N| x∈R+ u(x) | ) X xi = B i∈N def Markowitz: u(x) = −αT x + λx T Qx α: Expected returns Q: Variance-covariance matrix of expected returns λ: Risk aversion parameter 33 / 160 More Realistic Models b ∈ R|N| of “benchmark” holdings def Benchmark Tracking: u(x) = (x − b)T Q(x − b) Constraint on E[Return]: αT x ≥ r 34 / 160 More Realistic Models b ∈ R|N| of “benchmark” holdings def Benchmark Tracking: u(x) = (x − b)T Q(x − b) Constraint on E[Return]: αT x ≥ r Limit Names: |i ∈ N : xi > 0| ≤ K Use binary indicator variables to model the implication xi > 0 ⇒ yi = 1 Implication modeled with variable upper bounds: xi ≤ Byi P i∈N ∀i ∈ N yi ≤ K 35 / 160 Even More Models Min Holdings: (xi = 0) ∨ (xi ≥ m) Model implication: xi > 0 ⇒ xi ≥ m xi > 0 ⇒ yi = 1 ⇒ xi ≥ m xi ≤ Byi , xi ≥ myi ∀i ∈ N 36 / 160 Even More Models Min Holdings: (xi = 0) ∨ (xi ≥ m) Model implication: xi > 0 ⇒ xi ≥ m xi > 0 ⇒ yi = 1 ⇒ xi ≥ m xi ≤ Byi , xi ≥ myi ∀i ∈ N Round Lots: xi ∈ {kLi , k = 1, 2, . . .} xi − zi Li = 0, zi ∈ Z+ ∀i ∈ N 37 / 160 Even More Models Min Holdings: (xi = 0) ∨ (xi ≥ m) Model implication: xi > 0 ⇒ xi ≥ m xi > 0 ⇒ yi = 1 ⇒ xi ≥ m xi ≤ Byi , xi ≥ myi ∀i ∈ N Round Lots: xi ∈ {kLi , k = 1, 2, . . .} xi − zi Li = 0, zi ∈ Z+ ∀i ∈ N Vector h of initial holdings Transactions: ti = |xi − hi | P Turnover: i∈N ti ≤ ∆ P Transaction Costs: i∈N ci ti in objective P Market Impact: i∈N γi ti2 in objective 38 / 160 Multiproduct Batch Plants (Kocis and Gross- mann, 1988) M: Batch Processing Stages N: Different Products H: Horizon Time Qi : Required quantity of product i tij : Processing time product i stage j Sij : “Size Factor” product i stage j 39 / 160 Multiproduct Batch Plants (Kocis and Gross- mann, 1988) M: Batch Processing Stages N: Different Products H: Horizon Time Qi : Required quantity of product i tij : Processing time product i stage j Sij : “Size Factor” product i stage j Bi : Vj : Nj : Ci : Batch size of product i ∈ N Stage j size: Vj ≥ Sij Bi ∀i, j Number of machines at stage j Longest stage time for product i: Ci ≥ tij /Nj ∀i, j 40 / 160 Multiproduct Batch Plants min X βj αj Nj Vj j∈M s.t. Vj − Sij Bi ≥ 0 ∀i ∈ N, ∀j ∈ M Ci Nj ≥ tij ∀i ∈ N, ∀j ∈ M X Qi Ci ≤ H Bi i∈N Bound Constraints on Vj , Ci , Bi , Nj Nj ∈ Z ∀j ∈ M 41 / 160 Modeling Trick #2 Horizon Time and Objective Function Nonconvex. :-( 42 / 160 Modeling Trick #2 Horizon Time and Objective Function Nonconvex. :-( Sometimes variable transformations work! vj = ln(Vj ), nj = ln(Nj ), bi = ln(Bi ), ci = ln Ci 43 / 160 Modeling Trick #2 Horizon Time and Objective Function Nonconvex. :-( Sometimes variable transformations work! vj = ln(Vj ), nj = ln(Nj ), bi = ln(Bi ), ci = ln Ci X min αj e Nj +βj Vj j∈M s.t. vj − ln(Sij )bi X ci + nj Qi e Ci −Bi ≥ ≥ ≤ 0 ∀i ∈ N, ∀j ∈ M ln(τij ) ∀i ∈ N, ∀j ∈ M H i∈N (Transformed) Bound Constraints on Vj , Ci , Bi 44 / 160 How to Handle the Integrality? But what to do about the integrality? 1 ≤ Nj ≤ N j ∀j ∈ M, Nj ∈ Z ∀j ∈ M nj ∈ {0, ln(2), ln(3), . . . ...} 1 nj takes value ln(k) Ykj = 0 Otherwise nj − K X ln(k)Ykj = 0 ∀j ∈ M = 1 ∀j ∈ M k=1 K X Ykj k=1 This model is available at http://www-unix.mcs.anl.gov/ ~leyffer/macminlp/problems/batch.mod 45 / 160 A Small Smattering of Other Applications Chemical Engineering Applications: process synthesis (Kocis and Grossmann, 1988) batch plant design (Grossmann and Sargent, 1979) cyclic scheduling (Jain, V. and Grossmann, I.E., 1998) design of distillation columns (Viswanathan and Grossmann, 1993) pump configuration optimization (Westerlund, T., Pettersson, F. and Grossmann, I.E., 1994) Forestry/Paper production (Westerlund, T., Isaksson, J. and Harjunkoski, I., 1995) trimloss minimization (Harjunkoski, I., Westerlund, T., Pörn, R. and Skrifvars, H., 1998) Topology Optimization (Sigmund, 2001) 46 / 160 Part II Classical Solution Methods 47 / 160 Classical Solution Methods for MINLP 1 Classical Branch-and-Bound 2 Outer Approximation & Benders Decomposition Hybrid Methods 3 LP/NLP Based Branch-and-Bound Integrating SQP with Branch-and-Bound 48 / 160 Branch-and-Bound Solve relaxed NLP (0 ≤ y ≤ 1 continuous relaxation) . . . solution value provides lower bound integer feasible etc. Branch on yi non-integral Solve NLPs & branch until 1 2 3 • Node infeasible ... Node integer feasible ... ⇒ get upper bound (U) N Lower bound ≥ U ... etc. yi = 1 dominated by upper bound yi = 0 infeasible Search until no unexplored nodes on tree 49 / 160 Variable Selection for Branch-and-Bound Assume yi ∈ {0, 1} for simplicity ... (x̂, ŷ ) fractional solution to parent node; fˆ = f (x̂, ŷ ) 1 maximal fractional branching: choose ŷi closest to 1 2 max {min(1 − ŷi , ŷi )} i 50 / 160 Variable Selection for Branch-and-Bound Assume yi ∈ {0, 1} for simplicity ... (x̂, ŷ ) fractional solution to parent node; fˆ = f (x̂, ŷ ) 1 maximal fractional branching: choose ŷi closest to 1 2 max {min(1 − ŷi , ŷi )} i 2 strong branching: (approx) solve all NLP children: f (x, y ) minimize x,y +/− fi ← subject to c(x, y ) ≤ 0 x ∈ X , y ∈ Y , yi = 1/0 branching variable yi that changes objective the most: max min(fi + , fi − ) i 51 / 160 Node Selection for Branch-and-Bound Which node n on tree T should be solved next? 1 depth-first search: select deepest node in tree minimizes number of NLP nodes stored exploit warm-starts (MILP/MIQP only) 52 / 160 Node Selection for Branch-and-Bound Which node n on tree T should be solved next? 1 depth-first search: select deepest node in tree minimizes number of NLP nodes stored exploit warm-starts (MILP/MIQP only) 2 best estimate: choose node with best expected integer soln X + min fp(n) + min ei (1 − yi ), ei− yi n∈T i:yi fractional +/− where fp(n) = value of parent node, ei = pseudo-costs summing pseudo-cost estimates for all integers in subtree 53 / 160 Outer Approximation (Duran and Grossmann, 1986) Motivation: avoid solving huge number of NLPs • Exploit MILP/NLP solvers: decompose integer/nonlinear part Key idea: reformulate MINLP as MILP (implicit) • Solve alternating sequence of MILP & NLP NLP subproblem yj fixed: f (x, yj ) minimize x NLP(yj ) subject to c(x, yj ) ≤ 0 x ∈X MILP NLP(y j) Main Assumption: f , c are convex 54 / 160 Outer Approximation (Duran and Grossmann, 1986) η • let (xj , yj ) solve NLP(yj ) • linearize f , c about (xj , yj ) =: zj • new objective variable η ≥ f (x, y ) • MINLP (P) ≡ MILP (M) f(x) minimize η z=(x,y ),η subject to η ≥ fj + ∇fjT (z − zj ) ∀yj ∈ Y (M) 0 ≥ cj + ∇cjT (z − zj ) ∀yj ∈ Y x ∈ X , y ∈ Y integer SNAG: need all yj ∈ Y linearizations 55 / 160 Outer Approximation (Duran and Grossmann, 1986) (Mk ): lower bound (underestimate convex f , c) NLP(yj ): upper bound U (fixed yj ) NLP gives linearization NLP( y ) subproblem MILP finds new y MILP master program MILP infeasible? No Yes STOP ⇒ stop, if lower bound ≥ upper bound 56 / 160 Outer Approximation & Benders Decomposition Take OA cuts for zj := (xj , yj ) ... wlog X = Rn η ≥ fj + ∇fjT (z − zj ) & 0 ≥ cj + ∇cjT (z − zj ) sum with (1, λj ) ... λj multipliers of NLP(yj ) T η ≥ fj + λT j cj + (∇fj + ∇cj λj ) (z − zj ) KKT conditions of NLP(yj ) ⇒ ∇x fj + ∇x cj λj = 0 ... eliminate x components from valid inequality in y ⇒ η ≥ fj + (∇y fj + ∇y cj λj )T (y − yj ) NB: µj = ∇y fj + ∇y cj λj multiplier of y = yj in NLP(yj ) References: (Geoffrion, 1972) 57 / 160 LP/NLP Based Branch-and-Bound AIM: avoid re-solving MILP master (M) Consider MILP branch-and-bound 58 / 160 LP/NLP Based Branch-and-Bound AIM: avoid re-solving MILP master (M) Consider MILP branch-and-bound integer feasible interrupt MILP, when yj found ⇒ solve NLP(yj ) get xj 59 / 160 LP/NLP Based Branch-and-Bound AIM: avoid re-solving MILP master (M) Consider MILP branch-and-bound interrupt MILP, when yj found ⇒ solve NLP(yj ) get xj linearize f , c about (xj , yj ) ⇒ add linearization to tree 60 / 160 LP/NLP Based Branch-and-Bound AIM: avoid re-solving MILP master (M) Consider MILP branch-and-bound interrupt MILP, when yj found ⇒ solve NLP(yj ) get xj linearize f , c about (xj , yj ) ⇒ add linearization to tree continue MILP tree-search ... until lower bound ≥ upper bound 61 / 160 LP/NLP Based Branch-and-Bound need access to MILP solver ... call back ◦ exploit good MILP (branch-cut-price) solver ◦ (Akrotirianakis et al., 2001) use Gomory cuts in tree-search preliminary results: order of magnitude faster than OA ◦ same number of NLPs, but only one MILP similar ideas for Benders & Extended Cutting Plane methods recent implementation by CMU/IBM group References: (Quesada and Grossmann, 1992) 62 / 160 Integrating SQP & Branch-and-Bound AIM: Avoid solving NLP node to convergence. Sequential Quadratic Programming (SQP) → solve sequence (QPk ) at every node fk + ∇fkT d + 21 d T Hk d minimize d subject to ck + ∇ckT d ≤ 0 (QPk ) xk + dx ∈ X yk + dy ∈ Ŷ . Early branching: After QP step choose non-integral yik+1 , branch & continue SQP References: (Borchers and Mitchell, 1994; Leyffer, 2001) 63 / 160 Integrating SQP & Branch-and-Bound SNAG: (QPk ) not lower bound ⇒ no fathoming from upper bound minimize d fk + ∇fkT d + 12 d T Hk d subject to ck + ∇ckT d ≤ 0 xk + dx ∈ X yk + dy ∈ Ŷ . NB: (QPk ) inconsistent and trust-region active ⇒ do not fathom 64 / 160 Integrating SQP & Branch-and-Bound SNAG: (QPk ) not lower bound ⇒ no fathoming from upper bound minimize d fk + ∇fkT d + 12 d T Hk d subject to ck + ∇ckT d ≤ 0 xk + dx ∈ X yk + dy ∈ Ŷ . Remedy: Exploit OA underestimating property (Leyffer, 2001): add objective cut fk + ∇fkT d ≤ U − to (QPk ) fathom node, if (QPk ) inconsistent NB: (QPk ) inconsistent and trust-region active ⇒ do not fathom 65 / 160 Comparison of Classical MINLP Techniques Summary of numerical experience 1 Quadratic OA master: usually fewer iteration MIQP harder to solve 2 NLP branch-and-bound faster than OA ... depends on MIP solver 3 LP/NLP-based-BB order of magnitude faster than OA . . . also faster than B&B 4 Integrated SQP-B&B up to 3× faster than B&B ' number of QPs per node 5 ECP works well, if function/gradient evals expensive 66 / 160 Part III Modern Developments in MINLP 67 / 160 Modern Methods for MINLP 1 Formulations Relaxations Good formulations: big M 0 s and disaggregation 2 Cutting Planes Cuts from relaxations and special structures Cuts from integrality 3 Handling Nonconvexity Envelopes Methods 68 / 160 Relaxations def z(S) = minx∈S f (x) def z(T ) = minx∈T f (x) Independent of f , S, T : z(T ) ≤ z(S) S T 69 / 160 Relaxations def z(S) = minx∈S f (x) def z(T ) = minx∈T f (x) Independent of f , S, T : z(T ) ≤ z(S) If xT∗ = arg minx∈T f (x) S And xT∗ ∈ S, then T 70 / 160 Relaxations def z(S) = minx∈S f (x) def z(T ) = minx∈T f (x) Independent of f , S, T : z(T ) ≤ z(S) If xT∗ = arg minx∈T f (x) S And xT∗ ∈ S, then xT∗ = arg minx∈S f (x) T 71 / 160 UFL: Uncapacitated Facility Location Facilities: J Customers: I min X fj x j + j∈J X XX fij yij i∈I j∈J ∀i ∈ I yij = 1 yij ≤ |I |xj j∈J X ∀j ∈ J (1) i∈I OR yij ≤ xj ∀i ∈ I , j ∈ J (2) 72 / 160 UFL: Uncapacitated Facility Location Facilities: J Customers: I min X fj x j + j∈J X XX fij yij i∈I j∈J ∀i ∈ I yij = 1 yij ≤ |I |xj j∈J X ∀j ∈ J (1) i∈I OR yij ≤ xj ∀i ∈ I , j ∈ J (2) Which formulation is to be preferred? 73 / 160 UFL: Uncapacitated Facility Location Facilities: J Customers: I min X fj x j + j∈J X XX fij yij i∈I j∈J ∀i ∈ I yij = 1 yij ≤ |I |xj j∈J X ∀j ∈ J (1) i∈I OR yij ≤ xj ∀i ∈ I , j ∈ J (2) Which formulation is to be preferred? I = J = 40. Costs random. 74 / 160 UFL: Uncapacitated Facility Location Facilities: J Customers: I min X fj x j + j∈J X XX fij yij i∈I j∈J ∀i ∈ I yij = 1 yij ≤ |I |xj j∈J X ∀j ∈ J (1) i∈I OR yij ≤ xj ∀i ∈ I , j ∈ J (2) Which formulation is to be preferred? I = J = 40. Costs random. Formulation 1. 53,121 seconds, optimal solution. Formulation 2. 2 seconds, optimal solution. 75 / 160 Valid Inequalities Sometimes we can get a better formulation by dynamically improving it. An inequality π T x ≤ π0 is a valid inequality for S if π T x ≤ π0 ∀x ∈ S Alternatively: maxx∈S {π T x} ≤ π0 76 / 160 Valid Inequalities Sometimes we can get a better formulation by dynamically improving it. An inequality π T x ≤ π0 is a valid inequality for S if π T x ≤ π0 ∀x ∈ S Alternatively: maxx∈S {π T x} ≤ π0 Thm: (Hahn-Banach). Let S ⊂ Rn be a closed, convex set, and let x̂ 6∈ S. π Tx = π0 Then there exists π ∈ Rn such that π T x̂ > max{π T x} x̂ S x∈S 77 / 160 Nonlinear Branch-and-Cut Consider MINLP fxT x + fyT y minimize x,y subject to c(x, y ) ≤ 0 y ∈ {0, 1}p , 0 ≤ x ≤ U Note the Linear objective This is WLOG: min f (x, y ) ⇔ min η s.t. η ≥ f (x, y ) 78 / 160 It’s Actually Important! We want to approximate the convex hull of integer solutions, but without a linear objective function, the solution to the relaxation might occur in the interior. No Separating Hyperplane! :-( min(y1 − 1/2)2 + (y2 − 1/2)2 y2 s.t. y1 ∈ {0, 1}, y2 ∈ {0, 1} (ŷ1 , ŷ2 ) y1 79 / 160 It’s Actually Important! We want to approximate the convex hull of integer solutions, but without a linear objective function, the solution to the relaxation might occur in the interior. No Separating Hyperplane! :-( min(y1 − 1/2)2 + (y2 − 1/2)2 y2 s.t. y1 ∈ {0, 1}, y2 ∈ {0, 1} (ŷ1 , ŷ2 ) η ≥ (y1 − 1/2)2 + (y2 − 1/2)2 η y1 80 / 160 Valid Inequalities From Relaxations Idea: Inequalities valid for a relaxation are valid for original Generating valid inequalities for a relaxation is often easier. x̂ πT x = π 0 S T Separation Problem over T: Given x̂, T find (π, π0 ) such that π T x̂ > π0 , π T x ≤ π0 ∀x ∈ T 81 / 160 Simple Relaxations Idea: Consider one row relaxations 82 / 160 Simple Relaxations Idea: Consider one row relaxations If P = {x ∈ {0, 1}n | Ax ≤ b}, then for any row i, Pi = {x ∈ {0, 1}n | aiT x ≤ bi } is a relaxation of P. 83 / 160 Simple Relaxations Idea: Consider one row relaxations If P = {x ∈ {0, 1}n | Ax ≤ b}, then for any row i, Pi = {x ∈ {0, 1}n | aiT x ≤ bi } is a relaxation of P. If the intersection of the relaxations is a good approximation to the true problem, then the inequalities will be quite useful. 84 / 160 Simple Relaxations Idea: Consider one row relaxations If P = {x ∈ {0, 1}n | Ax ≤ b}, then for any row i, Pi = {x ∈ {0, 1}n | aiT x ≤ bi } is a relaxation of P. If the intersection of the relaxations is a good approximation to the true problem, then the inequalities will be quite useful. Crowder et al. (1983) is the seminal paper that shows this to be true for IP. 85 / 160 Simple Relaxations Idea: Consider one row relaxations If P = {x ∈ {0, 1}n | Ax ≤ b}, then for any row i, Pi = {x ∈ {0, 1}n | aiT x ≤ bi } is a relaxation of P. If the intersection of the relaxations is a good approximation to the true problem, then the inequalities will be quite useful. Crowder et al. (1983) is the seminal paper that shows this to be true for IP. MINLP: Single (linear) row relaxations are also valid ⇒ same inequalities can also be used 86 / 160 Knapsack Covers K = {x ∈ {0, 1}n | aT x ≤ b} A set C ⊆ N is a cover if P j∈C aj > b A cover C is a minimal cover if C \ j is not a cover ∀j ∈ C 87 / 160 Knapsack Covers K = {x ∈ {0, 1}n | aT x ≤ b} A set C ⊆ N is a cover if P j∈C aj > b A cover C is a minimal cover if C \ j is not a cover ∀j ∈ C If C ⊆ N is a cover, then the cover inequality X xj ≤ |C | − 1 j∈C is a valid inequality for S Sometimes (minimal) cover inequalities are facets of conv(K ) 88 / 160 Other Substructures Single node flow: (Padberg et al., 1985) X |N| xj ≤ b, xj ≤ uj yj ∀ j ∈ N S = x ∈ R+ , y ∈ {0, 1}|N| | j∈N Knapsack with single continuous variable: (Marchand and Wolsey, 1999) X S = x ∈ R+ , y ∈ {0, 1}|N| | aj y j ≤ b + x j∈N Set Packing: (Borndörfer and Weismantel, 2000) n o |N| S = y ∈ {0, 1} | Ay ≤ e A ∈ {0, 1}|M|×|N| , e = (1, 1, . . . , 1)T 89 / 160 The Chvátal-Gomory Procedure A general procedure for generating valid inequalities for integer programs 90 / 160 The Chvátal-Gomory Procedure A general procedure for generating valid inequalities for integer programs Let the columns of A ∈ Rm×n be denoted by {a1 , a2 , . . . an } S = {y ∈ Zn+ | Ay ≤ b}. 1 2 3 4 Choose nonnegative multipliers u ∈P Rm + T T u Ay ≤ u b is a valid inequality ( j∈N u T aj yj ≤ u T b). P bu T a cy ≤ u T b (Since y ≥ 0). Pj∈N T j j T T j∈N bu aj cyj ≤ bu bc is valid for S since bu aj cyj is an integer 91 / 160 The Chvátal-Gomory Procedure A general procedure for generating valid inequalities for integer programs Let the columns of A ∈ Rm×n be denoted by {a1 , a2 , . . . an } S = {y ∈ Zn+ | Ay ≤ b}. 1 2 3 4 Choose nonnegative multipliers u ∈P Rm + T T u Ay ≤ u b is a valid inequality ( j∈N u T aj yj ≤ u T b). P bu T a cy ≤ u T b (Since y ≥ 0). Pj∈N T j j T T j∈N bu aj cyj ≤ bu bc is valid for S since bu aj cyj is an integer Simply Amazing: This simple procedure suffices to generate every valid inequality for an integer program 92 / 160 Extension to MINLP (Çezik and Iyengar, 2005) This simple idea also extends to mixed 0-1 conic programming minimize f T z def z = (x,y ) subject to Az K b y ∈ {0, 1}p , 0 ≤ x ≤ U K: Homogeneous, self-dual, proper, convex cone x K y ⇔ (x − y ) ∈ K 93 / 160 Gomory On Cones (Çezik and Iyengar, 2005) LP: Kl = Rn+ SOCP: Kq = {(x0 , x̄) | x0 ≥ kx̄k} SDP: Ks = {x = vec(X ) | X = X T , X p.s.d} 94 / 160 Gomory On Cones (Çezik and Iyengar, 2005) LP: Kl = Rn+ SOCP: Kq = {(x0 , x̄) | x0 ≥ kx̄k} SDP: Ks = {x = vec(X ) | X = X T , X p.s.d} def Dual Cone: K∗ = {u | u T z ≥ 0 ∀z ∈ K} Extension is clear from the following equivalence: Az K b ⇔ u T Az ≥ u T b ∀u K∗ 0 95 / 160 Gomory On Cones (Çezik and Iyengar, 2005) LP: Kl = Rn+ SOCP: Kq = {(x0 , x̄) | x0 ≥ kx̄k} SDP: Ks = {x = vec(X ) | X = X T , X p.s.d} def Dual Cone: K∗ = {u | u T z ≥ 0 ∀z ∈ K} Extension is clear from the following equivalence: Az K b ⇔ u T Az ≥ u T b ∀u K∗ 0 Many classes of nonlinear inequalities can be represented as Ax Kq b or Ax Ks b 96 / 160 Using Gomory Cuts in MINLP (Akrotirianakis et al., 2001) LP/NLP Based Branch-and-Bound solves MILP instances: minimize η def z = (x,y ),η subject to η ≥ fj + ∇fjT (z − zj ) ∀yj ∈ Y k 0 ≥ cj + ∇cjT (z − zj ) ∀yj ∈ Y k x ∈ X , y ∈ Y integer 97 / 160 Using Gomory Cuts in MINLP (Akrotirianakis et al., 2001) LP/NLP Based Branch-and-Bound solves MILP instances: minimize η def z = (x,y ),η subject to η ≥ fj + ∇fjT (z − zj ) ∀yj ∈ Y k 0 ≥ cj + ∇cjT (z − zj ) ∀yj ∈ Y k x ∈ X , y ∈ Y integer Create Gomory mixed integer cuts from η ≥ fj + ∇fjT (z − zj ) 0 ≥ cj + ∇cjT (z − zj ) Akrotirianakis et al. (2001) shows modest improvements 98 / 160 Using Gomory Cuts in MINLP (Akrotirianakis et al., 2001) LP/NLP Based Branch-and-Bound solves MILP instances: minimize η def z = (x,y ),η subject to η ≥ fj + ∇fjT (z − zj ) ∀yj ∈ Y k 0 ≥ cj + ∇cjT (z − zj ) ∀yj ∈ Y k x ∈ X , y ∈ Y integer Create Gomory mixed integer cuts from η ≥ fj + ∇fjT (z − zj ) 0 ≥ cj + ∇cjT (z − zj ) Akrotirianakis et al. (2001) shows modest improvements Research Question: Other cut classes? Research Question: Exploit “outer approximation” property? 99 / 160 Disjunctive Cuts for MINLP (Stubbs and Mehrotra, 1999) Extension of Disjunctive Cuts for MILP: (Balas, 1979; Balas et al., 1993) def Continuous relaxation (z = (x, y )) x def C = {z|c(z) ≤ 0, 0 ≤ y ≤ 1, 0 ≤ x ≤ U} continuous relaxation y 100 / 160 Disjunctive Cuts for MINLP (Stubbs and Mehrotra, 1999) Extension of Disjunctive Cuts for MILP: (Balas, 1979; Balas et al., 1993) def Continuous relaxation (z = (x, y )) x def C = {z|c(z) ≤ 0, 0 ≤ y ≤ 1, 0 ≤ x ≤ U} def C = conv({x ∈ C | y ∈ {0, 1}p }) convex hull y 101 / 160 Disjunctive Cuts for MINLP (Stubbs and Mehrotra, 1999) Extension of Disjunctive Cuts for MILP: (Balas, 1979; Balas et al., 1993) def Continuous relaxation (z = (x, y )) x def C = {z|c(z) ≤ 0, 0 ≤ y ≤ 1, 0 ≤ x ≤ U} def C = conv({x ∈ C | y ∈ {0, 1}p }) 0/1 def Cj = {z ∈ C |yj = 0/1} z = λ0 u0 + λ1 u1 def λ0 + λ1 = 1, λ0 , λ1 ≥ 0 let Mj (C ) = u0 ∈ Cj0 , u1 ∈ Cj1 integer feasible set y ⇒ Pj (C ) := projection of Mj (C ) onto z ⇒ Pj (C ) = conv (C ∩ yj ∈ {0, 1}) and P1...p (C ) = C 102 / 160 Disjunctive Cuts: Example minimize x | (x − 1/2)2 + (y − 3/4)2 ≤ 1, −2 ≤ x ≤ 2, y ∈ {0, 1} x,y x Cj0 Given ẑ with ŷj 6∈ {0, 1} find separating hyperplane ( minimize kz − ẑk z ⇒ subject to z ∈ Pj (C ) Cj1 y ẑ = (x̂, ŷ ) 103 / 160 Disjunctive Cuts Example Cj1 Cj0 def z ∗ = arg min kz − ẑk z∗ s.t. λ0 u0 + λ1 u1 λ0 + λ1 −0.16 ≤ u0 0 −0.47 ≤ u1 0 λ0 , λ1 = z = 1 0.66 ≤ 1 1.47 ≤ 1 ≥ 0 ẑ = (x̂, ŷ ) 104 / 160 Disjunctive Cuts Example Cj1 Cj0 def z ∗ = arg min kz − ẑk22 z∗ s.t. λ0 u0 + λ1 u1 λ0 + λ1 −0.16 ≤ u0 0 −0.47 ≤ u1 0 λ0 , λ1 = z = 1 0.66 ≤ 1 1.47 ≤ 1 ≥ 0 ẑ = (x̂, ŷ ) 105 / 160 Disjunctive Cuts Example Cj1 Cj0 def z ∗ = arg min kz − ẑk∞ z∗ s.t. λ0 u0 + λ1 u1 λ0 + λ1 −0.16 ≤ u0 0 −0.47 ≤ u1 0 λ0 , λ1 = z = 1 0.66 ≤ 1 1.47 ≤ 1 ≥ 0 ẑ = (x̂, ŷ ) 106 / 160 Disjunctive Cuts Example Cj1 Cj0 def z ∗ = arg min kz − ẑk z∗ s.t. λ0 u0 + λ1 u1 λ0 + λ1 −0.16 ≤ u0 0 −0.47 ≤ u1 0 λ0 , λ1 = z = 1 0.66 ≤ 1 1.47 ≤ 1 ≥ 0 ẑ = (x̂, ŷ ) NONCONVEX 107 / 160 What to do? (Stubbs and Mehrotra, 1999) Look at the perspective of c(z) P(c(z̃), µ) = µc(z̃/µ) Think of z̃ = µz 108 / 160 What to do? (Stubbs and Mehrotra, 1999) Look at the perspective of c(z) P(c(z̃), µ) = µc(z̃/µ) Think of z̃ = µz Perspective gives a convex reformulation of Mj (C ): Mj (C̃ ), where µci (z/µ) ≤ 0 C̃ := (z, µ) 0 ≤ µ ≤ 1 0 ≤ x ≤ µU, 0 ≤ y ≤ µ c(0/0) = 0 ⇒ convex representation 109 / 160 Disjunctive Cuts Example µ (x/µ − 1/2)2 + (y /µ − 3/4)2 − 1 ≤ 0 x −2µ ≤ x ≤ 2µ C̃ = y 0≤y ≤µ µ 0≤µ≤1 µ x x Cj0 µ Cj0 Cj1 y Cj0 Cj1 C1 yµ j y x 110 / 160 Example, cont. C̃j0 = {(z, µ) | yj = 0} C̃j1 = {(z, µ) | yj = µ} 111 / 160 Example, cont. C̃j0 = {(z, µ) | yj = 0} C̃j1 = {(z, µ) | yj = µ} Take v0 ← µ0 u0 v1 ← µ1 u1 min kz − ẑk s.t. v0 + v1 µ0 + µ1 (v0 , µ0 ) (v1 , µ1 ) µ0 , µ1 = = ∈ ∈ ≥ z 1 C̃j0 C̃j1 0 Solution to example: ∗ x −0.401 = y∗ 0.780 separating hyperplane: ψ T (z − ẑ), where ψ ∈ ∂kz − ẑk 112 / 160 0 CExample, j .198 x Cont. Cj1 ψ= + 0. 061y =− 0.03 2 z∗ 2x ∗ + 0.5 2y ∗ − 0.75 0.198x + 0.061y ≥ −0.032 ẑ = (x̂, ŷ ) 113 / 160 Nonlinear Branch-and-Cut (Stubbs and Mehrotra, 1999) Can do this at all nodes of the branch-and-bound tree Generalize disjunctive approach from MILP solve one convex NLP per cut Generalizes Sherali and Adams (1990) and Lovász and Schrijver (1991) tighten cuts by adding semi-definite constraint Stubbs and Mehrohtra (2002) also show how to generate convex quadratic inequalities, but computational results are not that promising 114 / 160 Generalized Disjunctive Programming (Raman and Grossmann, 1994; Lee and Grossmann, 2000) Consider disjunctive minimize x,Y subject to NLP P fi + f (x) Yi ¬Y i W ci (x) ≤ 0 Bi x = 0 ∀i ∈ I fi = γ i fi = 0 0 ≤ x ≤ U, Ω(Y ) = true, Y ∈ {true, false}p Application: process synthesis • Yi represents presence/absence of units • Bi x = 0 eliminates variables if unit absent Exploit disjunctive structure • special branching ... OA/GBD algorithms 115 / 160 Generalized Disjunctive Programming (Raman and Grossmann, 1994; Lee and Grossmann, 2000) Consider disjunctive minimize x,Y subject to NLP P fi + f (x) Yi ¬Y i W ci (x) ≤ 0 Bi x = 0 ∀i ∈ I fi = γ i fi = 0 0 ≤ x ≤ U, Ω(Y ) = true, Y ∈ {true, false}p Big-M formulation (notoriously bad), M > 0: ci (x) ≤ M(1 − yi ) −Myi ≤ Bi x ≤ Myi fi = yi γi Ω(Y ) converted to linear inequalities 116 / 160 Generalized Disjunctive Programming (Raman and Grossmann, 1994; Lee and Grossmann, 2000) Consider disjunctive minimize x,Y subject to NLP P fi + f (x) Yi ¬Y i W ci (x) ≤ 0 Bi x = 0 ∀i ∈ I fi = γ i fi = 0 0 ≤ x ≤ U, Ω(Y ) = true, Y ∈ {true, false}p convex hull representation ... x = vi1 + vi0 , λi1 + λi0 = 1 λi1 ci (vi1 /λi1 ) ≤ 0, Bi vi0 = 0 0 ≤ vij ≤ λij U, 0 ≤ λij ≤ 1, fi = λi1 γi 117 / 160 Dealing with Nonconvexities Functional nonconvexity causes serious problems. Branch and bound must have true lower bound (global solution) 118 / 160 Dealing with Nonconvexities Functional nonconvexity causes serious problems. Branch and bound must have true lower bound (global solution) Underestimate nonconvex functions. Solve relaxation. Provides lower bound. 119 / 160 Dealing with Nonconvexities Functional nonconvexity causes serious problems. Branch and bound must have true lower bound (global solution) Underestimate nonconvex functions. Solve relaxation. Provides lower bound. If relaxation is not exact, then branch 120 / 160 Dealing with Nonconvex Constraints If nonconvexity in constraints, may need to overestimate and underestimate the function to get a convex region 121 / 160 Envelopes f :Ω→R Convex Envelope (vexΩ (f )): Pointwise supremum of convex underestimators of f over Ω. 122 / 160 Envelopes f :Ω→R Convex Envelope (vexΩ (f )): Pointwise supremum of convex underestimators of f over Ω. Concave Envelope (cavΩ (f )): Pointwise infimum of concave overestimators of f over Ω. 123 / 160 Branch-and-Bound Global Optimization Methods Under/Overestimate “simple” parts of (Factorable) Functions individually Bilinear Terms Trilinear Terms Fractional Terms Univariate convex/concave terms 124 / 160 Branch-and-Bound Global Optimization Methods Under/Overestimate “simple” parts of (Factorable) Functions individually Bilinear Terms Trilinear Terms Fractional Terms Univariate convex/concave terms General nonconvex functions f (x) can be underestimated over a region [l, u] “overpowering” the function with a quadratic function that is ≤ 0 on the region of interest L(x) = f (x) + n X αi (li − xi )(ui − xi ) i=1 Refs: (McCormick, 1976; Adjiman et al., 1998; Tawarmalani and Sahinidis, 2002) 125 / 160 Bilinear Terms The convex and concave envelopes of the bilinear function xy over a rectangular region def R = {(x, y ) ∈ R2 | lx ≤ x ≤ ux , ly ≤ y ≤ uy } are given by the expressions vexxyR (x, y ) = max{ly x + lx y − lx ly , uy x + ux y − ux uy } cavxyR (x, y ) = min{uy x + lx y − lx uy , ly x + ux y − ux ly } 126 / 160 Worth 1000 Words? xy 127 / 160 Worth 1000 Words? vexR (xy ) 128 / 160 Worth 1000 Words? cavR (xy ) 129 / 160 Summary MINLP: Good relaxations are important Relaxations can be improved Statically: Better formulation/preprocessing Dynamically: Cutting planes Nonconvex MINLP: Methods exist, again based on relaxations Tight relaxations is an active area of research Lots of empirical questions remain 130 / 160 Part IV Implementation and Software 131 / 160 Implementation and Software for MINLP 1 Special Ordered Sets 2 Implementation & Software Issues 132 / 160 Special Ordered Sets of Type 1 SOS1: P λi = 1 & at most one λi is nonzero Example 1: d ∈ {d1 , . . . , dp } discrete diameters P ⇔ d = λi di and {λ1 , . . . , λp } is SOS1 P P ⇔ d = λi di and λi = 1 and λi ∈ {0, 1} . . . d is convex combination with coefficients λi Example 2: nonlinear function c(y ) of single integer P P ⇔y= iλi and c = c(i)λi and {λ1 , . . . , λp } is SOS1 References: (Beale, 1979; Nemhauser, G.L. and Wolsey, L.A., 1988; Williams, 1993) . . . 133 / 160 Special Ordered Sets of Type 1 SOS1: P λi = 1 & at most one λi is nonzero Branching on SOS1 1 2 reference row a1 < . . . < ap e.g. diameters P fractionality: a := ai λi a < at a > a t+1 134 / 160 Special Ordered Sets of Type 1 SOS1: P λi = 1 & at most one λi is nonzero Branching on SOS1 1 2 reference row a1 < . . . < ap e.g. diameters P fractionality: a := ai λi 3 find t : at < a ≤ at+1 4 branch: {λt+1 , . . . , λp } = 0 or {λ1 , . . . , λt } = 0 a < at a > a t+1 135 / 160 Special Ordered Sets of Type 2 SOS2: P λi = 1 & at most two adjacent λi nonzero Example: Approximation of nonlinear function z = z(x) z(x) breakpoints x1 < . . . < xp function values zi = z(xi ) piece-wise linear x 136 / 160 Special Ordered Sets of Type 2 SOS2: P λi = 1 & at most two adjacent λi nonzero Example: Approximation of nonlinear function z = z(x) z(x) breakpoints x1 < . . . < xp function values zi = z(xi ) piece-wise linear P x= λi xi P z= λi zi x {λ1 , . . . , λp } is SOS2 . . . convex combination of two breakpoints . . . 137 / 160 Special Ordered Sets of Type 2 SOS2: P λi = 1 & at most two adjacent λi nonzero Branching on SOS2 1 2 reference row a1 < . . . < ap e.g. ai = xi P fractionality: a := ai λi x < at x > at 138 / 160 Special Ordered Sets of Type 2 SOS2: P λi = 1 & at most two adjacent λi nonzero Branching on SOS2 1 2 reference row a1 < . . . < ap e.g. ai = xi P fractionality: a := ai λi 3 find t : at < a ≤ at+1 4 branch: {λt+1 , . . . , λp } = 0 or {λ1 , . . . , λt−1 } x < at x > at 139 / 160 Special Ordered Sets of Type 3 Example: Approximation of 2D function u = g (v , w ) Triangularization of [vL , vU ] × [wL , wU ] domain 1 vL = v1 < . . . < vk = vU 2 wL = w1 < . . . < wl = wU 3 function uij := g (vi , wj ) 4 λij weight of vertex (i, j) 140 / 160 Special Ordered Sets of Type 3 Example: Approximation of 2D function u = g (v , w ) Triangularization of [vL , vU ] × [wL , wU ] domain 1 vL = v1 < . . . < vk = vU 2 wL = w1 < . . . < wl = wU 3 function uij := g (vi , wj ) 4 λij weight of vertex (i, j) v= w P λij vi P w= λij wj P u = λij uij v 141 / 160 Special Ordered Sets of Type 3 Example: Approximation of 2D function u = g (v , w ) Triangularization of [vL , vU ] × [wL , wU ] domain 1 vL = v1 < . . . < vk = vU 2 wL = w1 < . . . < wl = wU 3 function uij := g (vi , wj ) 4 λij weight of vertex (i, j) v= w P λij vi P w= λij wj P u = λij uij P 1 = λij is SOS3 . . . v 142 / 160 Special Ordered Sets of Type 3 SOS3: P λij = 1 & set condition holds w 1 2 3 v= P λij vi ... convex combinations P w= λij wj P u = λij uij {λ11 , . . . , λkl } satisfies set condition v ⇔ ∃ trangle ∆ : {(i, j) : λij > 0} ⊂ ∆ violates set condn i.e. nonzeros in single triangle ∆ 143 / 160 Branching on SOS3 λ violates set condition w compute P centers: v̂ = Pλij vi & ŵ = λij wi find s, t such that vs ≤ v̂ < vs+1 & ws ≤ ŵ < ws+1 branch on v or w v violates set condn 144 / 160 Branching on SOS3 λ violates set condition compute P centers: v̂ = Pλij vi & ŵ = λij wi find s, t such that vs ≤ v̂ < vs+1 & ws ≤ ŵ < ws+1 branch on v or w vertical branching: X L λij = 1 X λij = 1 R 145 / 160 Branching on SOS3 λ violates set condition compute P centers: v̂ = Pλij vi & ŵ = λij wi find s, t such that vs ≤ v̂ < vs+1 & ws ≤ ŵ < ws+1 branch on v or w horizontal branching: X T λij = 1 X λij = 1 B 146 / 160 Extension to SOS-k Example: electricity transmission network: c(x) = 4x1 − x22 − 0.2 · x2 x4 sin(x3 ) (Martin et al., 2005) extend SOS3 to SOSk models for any k ⇒ function with p variables on N grid needs N p λ’s 147 / 160 Extension to SOS-k Example: electricity transmission network: c(x) = 4x1 − x22 − 0.2 · x2 x4 sin(x3 ) (Martin et al., 2005) extend SOS3 to SOSk models for any k ⇒ function with p variables on N grid needs N p λ’s Alternative (Gatzke, 2005): exploit computational graph ' automatic differentiation 148 / 160 Extension to SOS-k Example: electricity transmission network: c(x) = 4x1 − x22 − 0.2 · x2 x4 sin(x3 ) (Martin et al., 2005) extend SOS3 to SOSk models for any k ⇒ function with p variables on N grid needs N p λ’s Alternative (Gatzke, 2005): exploit computational graph ' automatic differentiation only need SOS2 & SOS3 ... replace nonconvex parts 149 / 160 Extension to SOS-k Example: electricity transmission network: c(x) = 4x1 − x22 − 0.2 · x2 x4 sin(x3 ) (Martin et al., 2005) extend SOS3 to SOSk models for any k ⇒ function with p variables on N grid needs N p λ’s Alternative (Gatzke, 2005): exploit computational graph ' automatic differentiation only need SOS2 & SOS3 ... replace nonconvex parts piece-wise polyhedral approx. 150 / 160 Software for MINLP Outer Approximation: DICOPT++ (& AIMMS) NLP solvers: CONOPT, MINOS, SNOPT MILP solvers: CPLEX, OSL2 Branch-and-Bound Solvers: SBB & MINLP NLP solvers: CONOPT, MINOS, SNOPT & FilterSQP variable & node selection; SOS1 & SOS2 support Global MINLP: BARON & MINOPT underestimators & branching CPLEX, MINOS, SNOPT, OSL Online Tools: MINLP World, MacMINLP & NEOS MINLP World www.gamsworld.org/minlp/ NEOS server www-neos.mcs.anl.gov/ 151 / 160 COIN-OR http://www.coin-or.org COmputational INfrastructure for Operations Research A library of (interoperable) software tools for optimization A development platform for open source projects in the OR community Possibly Relevant Modules: OSI: Open Solver Interface CGL: Cut Generation Library CLP: Coin Linear Programming Toolkit CBC: Coin Branch and Cut IPOPT: Interior Point OPTimizer for NLP NLPAPI: NonLinear Programming API 152 / 160 MINLP with COIN-OR New implementation of LP/NLP based BB MIP branch-and-cut: CBC & CGL NLPs: IPOPT interior point ... OK for NLP(yi ) New hybrid method: solve more NLPs at non-integer yi ⇒ better outer approximation allow complete MIP at some nodes ⇒ generate new integer assignment ... faster than DICOPT++, SBB simplifies to OA and BB at extremes ... less efficient ... see Bonami et al. (2005) ... coming in 2006. 153 / 160 Conclusions MINLP rich modeling paradigm ◦ most popular solver on NEOS Algorithms for MINLP: ◦ Branch-and-bound (branch-and-cut) ◦ Outer approximation et al. 154 / 160 Conclusions MINLP rich modeling paradigm ◦ most popular solver on NEOS Algorithms for MINLP: ◦ Branch-and-bound (branch-and-cut) ◦ Outer approximation et al. “MINLP solvers lag 15 years behind MIP solvers” ⇒ many research opportunities!!! 155 / 160 Part V References 156 / 160 C. Adjiman, S. Dallwig, C. A. Floudas, and A. Neumaier. A global optimization method, aBB, for general twice-differentiable constrained NLPs - I. Theoretical advances. Computers and Chemical Engineering, 22:1137–1158, 1998. I. Akrotirianakis, I. Maros, and B. Rustem. An outer approximation based branch-and-cut algorithm for convex 0-1 MINLP problems. Optimization Methods and Software, 16: 21–47, 2001. E. Balas. Disjunctive programming. In Annals of Discrete Mathematics 5: Discrete Optimization, pages 3–51. North Holland, 1979. E. Balas, S. Ceria, and G. Corneujols. A lift-and-project cutting plane algorithm for mixed 0-1 programs. Mathematical Programming, 58:295–324, 1993. E. M. L. Beale. Branch-and-bound methods for mathematical programming systems. Annals of Discrete Mathematics, 5:201–219, 1979. P. Bonami, L. Biegler, A. Conn, G. Cornuéjols, I. Grossmann, C. Laird, J. Lee, A. Lodi, F. Margot, N. Saaya, and A. Wächter. An algorithmic framework for convex mixed integer nonlinear programs. Technical report, IBM Research Division, Thomas J. Watson Research Center, 2005. B. Borchers and J. E. Mitchell. An improved branch and bound algorithm for Mixed Integer Nonlinear Programming. Computers and Operations Research, 21(4):359–367, 1994. R. Borndörfer and R. Weismantel. Set packing relaxations of some integer programs. Mathematical Programming, 88:425 – 450, 2000. M. T. Çezik and G. Iyengar. Cuts for mixed 0-1 conic programming. Mathematical Programming, 2005. to appear. H. Crowder, E. L. Johnson, and M. W. Padberg. Solving large scale zero-one linear programming problems. Operations Research, 31:803–834, 1983. 157 / 160 D. De Wolf and Y. Smeers. The gas transmission problem solved by an extension of the simplex algorithm. Management Science, 46:1454–1465, 2000. M. Duran and I. E. Grossmann. An outer-approximation algorithm for a class of mixed–integer nonlinear programs. Mathematical Programming, 36:307–339, 1986. A. M. Geoffrion. Generalized Benders decomposition. Journal of Optimization Theory and Applications, 10:237–260, 1972. I. E. Grossmann and R. W. H. Sargent. Optimal design of multipurpose batch plants. Ind. Engng. Chem. Process Des. Dev., 18:343–348, 1979. Harjunkoski, I., Westerlund, T., Pörn, R. and Skrifvars, H. Different transformations for solving non-convex trim-loss problems by MINLP. European Journal of Opertational Research, 105:594–603, 1998. Jain, V. and Grossmann, I.E. Cyclic scheduling of continuous parallel-process units with decaying performance. AIChE Journal, 44:1623–1636, 1998. G. R. Kocis and I. E. Grossmann. Global optimization of nonconvex mixed–integer nonlinear programming (MINLP) problems in process synthesis. Industrial Engineering Chemistry Research, 27:1407–1421, 1988. S. Lee and I. Grossmann. New algorithms for nonlinear disjunctive programming. Computers and Chemical Engineering, 24:2125–2141, 2000. S. Leyffer. Integrating SQP and branch-and-bound for mixed integer nonlinear programming. Computational Optimization & Applications, 18:295–309, 2001. L. Lovász and A. Schrijver. Cones of matrices and setfunctions, and 0-1 optimization. SIAM Journal on Optimization, 1, 1991. H. Marchand and L. Wolsey. The 0-1 knapsack problem with a single continuous variable. Mathematical Programming, 85:15–33, 1999. 158 / 160 A. Martin, M. Möller, and S. Moritz. Mixed integer models for the stationary case of gas network optimization. Technical report, Darmstadt University of Technology, 2005. G. P. McCormick. Computability of global solutions to factorable nonconvex programs: Part I—Convex underestimating problems. Mathematical Programming, 10:147–175, 1976. Nemhauser, G.L. and Wolsey, L.A. Integer and Combinatorial Optimization. John Wiley, New York, 1988. M. Padberg, T. J. Van Roy, and L. Wolsey. Valid linear inequalities for fixed charge problems. Operations Research, 33:842–861, 1985. I. Quesada and I. E. Grossmann. An LP/NLP based branch–and–bound algorithm for convex MINLP optimization problems. Computers and Chemical Engineering, 16:937–947, 1992. Quist, A.J. Application of Mathematical Optimization Techniques to Nuclear Reactor Reload Pattern Design. PhD thesis, Technische Universiteit Delft, Thomas Stieltjes Institute for Mathematics, The Netherlands, 2000. R. Raman and I. E. Grossmann. Modeling and computational techniques for logic based integer programming. Computers and Chemical Engineering, 18:563–578, 1994. H. D. Sherali and W. P. Adams. A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM Journal on Discrete Mathematics, 3:411–430, 1990. O. Sigmund. A 99 line topology optimization code written in matlab. Structural Multidisciplinary Optimization, 21:120–127, 2001. R. Stubbs and S. Mehrohtra. Generating convex polynomial inequalities for mixed 0-1 programs. Journal of Global Optimization, 24:311–332, 2002. R. A. Stubbs and S. Mehrotra. A branch–and–cut method for 0–1 mixed convex programming. Mathematical Programming, 86:515–532, 1999. 159 / 160 M. Tawarmalani and N. V. Sahinidis. Convexification and Global Optimization in Continuous and Mixed-Integer Nonlinear Programming: Theory, Algorithms, Software, and Applications. Kluwer Academic Publishers, Boston MA, 2002. J. Viswanathan and I. E. Grossmann. Optimal feed location and number of trays for distillation columns with multiple feeds. I&EC Research, 32:2942–2949, 1993. Westerlund, T., Isaksson, J. and Harjunkoski, I. Solving a production optimization problem in the paper industry. Report 95–146–A, Department of Chemical Engineering, Abo Akademi, Abo, Finland, 1995. Westerlund, T., Pettersson, F. and Grossmann, I.E. Optimization of pump configurations as MINLP problem. Computers & Chemical Engineering, 18(9):845–858, 1994. H. P. Williams. Model Solving in Mathematical Programming. John Wiley & Sons Ltd., Chichester, 1993. 160 / 160