MIPing the Probabilistic Integer Programming Problem

Anureet Saxena
ACO PhD Student, Tepper School of Business, Carnegie Mellon University
(Joint work with Vineet Goyal and Miguel Lejeune)

Why Probabilistic Programming?
[Facility location model, annotated: Transportation Cost, Fixed Cost, Demand Constraints, Capacity Constraints, Set of Customers, Set of Facilities]

Uncertain Future
• Population Shift
• Evolution of Market Trends
• Ford opens a manufacturing unit
• Google closes its R&D center

Why Probabilistic Programming?
A random 0/1 vector which incorporates the uncertain future into the optimization model

Why Probabilistic Programming?
[Probabilistic MIP Model, annotated: Reliability Level, Probabilistic Constraint, Random 0/1 Vector (Joint Distribution), Deterministic, Probabilistic]

Why Probabilistic Programming?
• Facility Location
  – Strategic Planning
  – Population shift
  – Evolution of market trends
  – Demographic Changes
  Must Read! Strategic facility location by Owen and Daskin
• Contingency Service
  – Minimum Reliability Principle
• Production Design and Manufacturing
  – Uncertain Demand
  – Lot Sizing and Inventory Problems

A Simple Algorithm
[Annotations: Random 0/1 Vector (Joint Distribution), Reliability Level]
1. Enumerate all possible 0/1 realizations of the random vector.
2. For each 0/1 realization whose cdf is greater than or equal to p, solve the deterministic problem.

Prekopa, Beraldi, Ruszczynski Approach
[Figure: lattice of 0/1 vectors 111, 110, 101, 011, 100, 010, 001, 000, with the p-efficient frontier marked]

2-Phase Algorithm
• Enumeration of p-efficient points
• Solving a Deterministic Problem for each p-efficient point
• The two phases are independent

Beraldi & Ruszczynski Approach
Explosive growth in computation time
[Figure: computation time on instances scp41 and scp42]

2-Phase Algorithm: Pitfall
• Enumeration of p-efficient points
• Solving a Deterministic Problem for each p-efficient point

Our Approach
Integrate the 2 phases
• Enumeration of p-efficient points
• Solving a Deterministic Problem for each p-efficient point

Our Model
[Model formulation, annotated: Log of cumulative probability of block t, Non-Linear, MIPing]

Beraldi & Ruszczynski Approach: Comparison
• All instances solved in less than 1 sec by CPLEX 9.0.
• CPLEX enumerated fewer than 50 nodes, solving most instances at the root node.
[Figure: comparison on instances scp41 and scp42]

Key Observations
• Models any arbitrary distribution
• Exponential number of constraints for each block
• Linear in the input size for generic distribution
• Encodes the enumeration phase as a Mixed Integer Program
• Allows us to exploit state-of-the-art MIP solvers to perform intelligent enumeration.
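The enumeration phase of the 2-phase algorithm above can be illustrated with a small brute-force sketch. It is not the MIP model from the talk: it assumes a single block whose components are independent, with hypothetical marginal probabilities `q[i] = P(component i equals 1)`, and the function names `cdf` and `p_efficient_points` are made up for this illustration. A point z is kept if its cdf is at least p and no other such point lies componentwise below it.

```python
from itertools import product
from math import prod


def cdf(z, q):
    """F(z) = P(xi <= z) for independent 0/1 components (assumption).

    Component i restricts xi only when z[i] == 0, where it forces xi_i = 0,
    an event of probability 1 - q[i].
    """
    return prod(1.0 - qi for zi, qi in zip(z, q) if zi == 0)


def p_efficient_points(q, p):
    """Enumerate {0,1}^n and return the minimal points z with F(z) >= p."""
    feasible = [z for z in product((0, 1), repeat=len(q)) if cdf(z, q) >= p]

    def dominated(z):
        # z is dominated if another feasible point lies componentwise below it.
        return any(
            w != z and all(wi <= zi for wi, zi in zip(w, z)) for w in feasible
        )

    return sorted(z for z in feasible if not dominated(z))


# Hypothetical data: three components, reliability level p = 0.75.
points = p_efficient_points([0.1, 0.2, 0.3], 0.75)
print(points)  # [(0, 1, 1), (1, 0, 1)]
```

The enumeration visits all 2^n lattice points, which is the exponential cost that motivates encoding this phase inside a MIP instead.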
Research Question
The model has an exponential number of constraints for each block. Is there a way to reduce the number of constraints?
The Answer is Yes

p-Inefficient Frontier

Refined Formulation
• Add t constraints only for lattice points above the frontier
• Set-Covering Constraint for maximally p-inefficient points

[Figure: Refined Formulation, Block Size 10]

A Tough Instance - p31
• SSCFLP instance from the Holmberg test-bed
• 30 facilities and 150 customers
• Deterministic instance can be solved in 80 sec.
• Probabilistic instance has 15 blocks of size 10 each
• CPLEX was unable to solve the probabilistic instance within 2 hours!!

Research Question
Why is this instance so difficult to solve?
Answer: Big-M Constraints

Polarity Cuts
• Big-M Constraints model P
• Facets of P can strengthen the model

Polarity Cuts
• We know all the extreme points and extreme rays of P
• Compact description of polar
• Facets of P can be found by solving the linear program derived from the polar
• The linear program has a lot more rows than columns – dual simplex algorithm.

A Tough Instance - p31: Tough Instance Solved
• % Gap closed at root node: 67.84%
• Time spent in strengthening: 0.83 sec
• Time spent in solving separation LP: 0.30 sec
• Time taken by CPLEX 9.0 after strengthening: 51.65 sec
• No. of branch-and-bound nodes enumerated by CPLEX 9.0: 2300
• Total time taken to solve the instance to optimality: 53.04 sec

Computational Results
• Implementation
  – COIN-OR Modules
  – CPLEX 9.0
• Selection Criterion
  – ORLIB & Holmberg Instances
  – Instances which can be solved in 1 hr
• Computational Power
  – P4 Processor
  – 2GB RAM
• Library of Instances
  – PCPLIB

Test Bed
Problem Set                        Number of Instances   # Rows    # Columns
OrLib Set Covering                 60                    50-500    500-5000
OrLib Warehouse Location (Cap)     37                    66-100    816-2550
OrLib p-Median (Cap)               20                    101-201   2550-10100
Holmberg Facility Location (Cap)   70                    60-230    510-6030

• 2 Distributions – as in BR [2002]
• 4 Reliability levels – 0.80, 0.85, 0.90, 0.95
• 2 Block Sizes – 5, 10
• Total Number of Instances per Deterministic Instance = 16

Computational Results
Deterministic Problem   Number of Probabilistic Instances   Number of Unsolved Instances   % Relative Gap (Unsolved Instances)
Set Covering            1440                                37                             11.69
CWLP                    888                                 0                              -
Cap k-Median            480                                 0                              -
SSCFLP                  1680                                22                             0.45

Computational Results
Deterministic Problem   Solution Time (sec)   Number of Branch-and-Bound Nodes
Set Covering            160.81                7440
CWLP                    0.31                  30
Cap k-Median            43.79                 1464
SSCFLP                  31.27                 2248

Impact of Polarity Cuts
Deterministic Problem   % Duality Gap Closed   % Time Spent (Polarity Cuts' Strengthening)
Set Covering            23.74                  0.22
CWLP                    11.44                  9.43
Cap k-Median            0.00                   0.21
SSCFLP                  18.45                  0.29

Value of Information
Deterministic Problem   Value of Information (%)
Set Covering            5.75
CWLP                    15.05
Cap k-Median            9.54
SSCFLP                  4.60

Empirical Observation
Probabilistic versions of simple and moderately difficult mixed integer programs can themselves be formulated as MIPs which can be solved in a reasonable amount of time.

Structured Distributions
Research Question
Is it possible to exploit the structure of distributions to design models which are polynomial in the input size?
Stationary Distributions
Definition
A distribution function F is said to be stationary if F(z) depends only on the number of ones in z.
Principle of Indistinguishability.

Stationary Distributions
[Figure: lattice of 0/1 vectors 111, 110, 101, 011, 100, 010, 001, 000]

Stationary Distributions
Can be converted to a MIP with linear number of additional variables and constraints!!

Stationary Distributions
A model with linear number of variables and constraints!!

Stationary Distributions
Deterministic Problem   Number of Probabilistic Instances   Number of Unsolved Instances   % Relative Gap (Unsolved Instances)   Solution Time (sec)   Value of Information
Set Covering            1920                                127                            21.34                                 112.98                10.42
CWLP                    1184                                0                              -                                     0.09                  25.15
Cap k-Median            640                                 0                              -                                     2.90                  15.25
SSCFLP                  2240                                17                             0.45                                  9.36                  8.51

• 8 Block Sizes: 5, 10, 20, 50, m/4, m/3, m/2, m
• 4 Threshold Probabilities: 0.80, 0.85, 0.90, 0.95
• Number of Instances per deterministic instance = 32

Stationary Distributions
Research Question
What is that unique property of stationary distributions which allowed us to design a linear sized model?

Disjunctive Shattering Property
The lattice of a stationary distribution can be partitioned into polynomial number of pieces each of which has a polynomial sized description.

Stationary Distributions
[Figure: lattice of 0/1 vectors 111, 110, 101, 011, 100, 010, 001, 000]

Summary
[Diagram: BR Algorithm, Stationary Distributions, MIP Model, Super Linear Speedup, p-Inefficiency Refinement, Polarity Cuts Strengthening, Computational Results, Our Contribution]

Thank you for your attention
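As a supplementary illustration of the stationary-distribution definition given earlier, the sketch below treats a single block and is not the linear-sized model from the talk. Because a cdf is monotone, a stationary F can be stored as a nondecreasing list `g`, where `g[k]` is the value of F at any z with k ones, and the condition F(z) >= p reduces to a lower bound on the number of ones in z. The example distribution (i.i.d. Bernoulli components with a hypothetical parameter `q`) and the function name `stationary_threshold` are assumptions made for this illustration.

```python
from itertools import product


def stationary_threshold(g, p):
    """Smallest number of ones k with g[k] >= p.

    g[k] is the cdf value F(z) shared by every 0/1 vector z with k ones.
    """
    return next(k for k, value in enumerate(g) if value >= p)


# Hypothetical example: n i.i.d. Bernoulli(q) components, for which
# F(z) = (1 - q) ** (number of zeros in z), so F is stationary.
n, q, p = 3, 0.1, 0.85
g = [(1.0 - q) ** (n - k) for k in range(n + 1)]
k_star = stationary_threshold(g, p)

# Brute-force check over the lattice: F(z) >= p exactly when z has at
# least k_star ones.
for z in product((0, 1), repeat=n):
    assert (g[sum(z)] >= p) == (sum(z) >= k_star)

print(k_star)  # 2
```

Grouping lattice points by their number of ones gives n + 1 pieces for a block of size n, each described by a single cardinality condition, which is the kind of partition the Disjunctive Shattering Property refers to.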