Big Data Optimization at SAS

Imre Pólik et al.
SAS Institute, Cary, NC, USA
Edinburgh, 2013

Copyright © 2012, SAS Institute Inc. All rights reserved.

Outline

1 Optimization at SAS
2 Big Data Optimization at SAS
    The SAS HPA architecture
    Support vector machines
    Quantile regression
    Marketing Optimization
    Local search optimization
3 Distributed/parallel optimization
    Decomposition
    Miscellaneous tools
4 Future plans

About SAS

The company
  - Leader in business analytics software and services
  - About $3 billion worldwide revenue
  - Largest private software company in the world
  - World's Best Multinational Workplace in 2012
  - More than 11,000 employees, 400 offices and 600 alliances
  - SAS customers or their affiliates represent over 90% of the top 100 FORTUNE 500 companies
The software
  - Originally created for basic statistics by professors at NCSU
  - Extended tremendously over the decades
  - Covers all aspects of analytics and business intelligence

SAS/OR Offerings

Optimization modelling and solvers
  - Algebraic Modelling Language with all the usual solvers (LP, QP, MILP, NLP, CP, Scheduling, Decomposition, ...)
Other tools
  - Graph and network algorithms
  - Discrete event simulation
  - + the rest of SAS
Solutions
  - Marketing Optimization, Service Parts Optimization, Revenue Optimization, Size Optimization, ...
Services
  - Technical Support, Training, Professional Services, Consulting
Platforms
  - Windows, Linux, Solaris x64/SPARC, HPUX, AIX, z/OS

What is Big Data?

Nathan Brixius (in a recent blog post)
  A big data analytics application is simply an analytics application where the required data does not fit on a single machine and needs to be considered in full to produce a result.
SAS
  Big data is relative; it applies whenever an organization's need to handle, store and analyze data exceeds its current capacity.
Related concepts/tools
  - large-scale optimization
  - distributed optimization
  - parallel optimization

High Performance Analytics at SAS

[Figure]

Parallel Implementations and Determinism

Non-determinism: a dirty word in the commercial world

Sources of non-determinism
  - adding columns in a different order
  - aggregating results in a different order
  - arbitrary random number seeds
  - different machines in the pool
  - time limits
Workarounds
  - operations in a fixed order
  - deterministic criteria (nodes, iterations)
  - deterministic ticks (see Xpress/Cplex)

Determinism comes with a performance penalty.

Big Data Optimization at SAS

  - Quadratic programming: Support vector machine
  - Linear programming: Quantile regression
  - Mixed-integer linear programming: Marketing optimization
  - Derivative-free optimization: Local search optimization

Support Vector Machines

http://www.cac.science.ru.nl/people/ustun/SVM.JPG

Linear SVM Problem

Primal formulation
\[
\begin{aligned}
\min_{w,\,z,\,\beta} \quad & \tfrac{1}{2}\|w\|_2^2 + \tau e^T z \\
\text{subject to} \quad & Y w - \beta d \ge e - z, \\
& z \ge 0.
\end{aligned}
\]

Dual formulation
\[
\begin{aligned}
\min_{v} \quad & -e^T v + \tfrac{1}{2} v^T Y Y^T v \\
\text{subject to} \quad & d^T v = 0, \\
& 0 \le v \le \tau e.
\end{aligned}
\]

Using Primal-Dual Interior-Point Approach

Dominant cost per iteration is forming/solving the Newton system.

Many more observations than columns/features
\[
\left(I + Y^T \Omega_k^{-1} Y - v_k v_k^T\right) \Delta w = -r_w
\]
  - $Y^T \Omega_k^{-1} Y$ must be formed every iteration

Many more columns/features than observations
\[
\begin{pmatrix} Y Y^T + \Omega_k & d \\ d^T & 0 \end{pmatrix}
\begin{pmatrix} \Delta v \\ -\Delta\beta \end{pmatrix}
= -\begin{pmatrix} \rho_\beta \\ r_\Omega \end{pmatrix}
\]
  - $Y Y^T$ is constant for all iterations

SVM: Parallel Performance

[Figure]
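For the tall-data case on the interior-point slide (many more observations than features), the matrix $Y^T \Omega_k^{-1} Y$ splits into a sum over observations, so each compute node can form a small feature-by-feature matrix from the rows it holds and only those small matrices need to be communicated. The sketch below illustrates that idea in plain Python. It assumes $\Omega_k$ is diagonal and that the rows of $Y$ are partitioned across workers; the slides do not state how SAS distributes the computation, and the names `local_gram` and `aggregate` are hypothetical. Partial results are added in a fixed worker order, one of the workarounds listed on the determinism slide.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def local_gram(rows: Sequence[Sequence[float]], omega: Sequence[float]) -> Matrix:
    """Partial sum of y_i y_i^T / omega_i over the rows held by one worker.

    Assumes Omega_k is diagonal with entries omega_i (an assumption, not
    something the slides state).
    """
    p = len(rows[0])
    gram = [[0.0] * p for _ in range(p)]
    for y, w in zip(rows, omega):
        for a in range(p):
            scaled = y[a] / w
            for b in range(p):
                gram[a][b] += scaled * y[b]
    return gram


def aggregate(partials: Sequence[Matrix]) -> Matrix:
    """Add the per-worker p x p matrices in a fixed (worker-index) order."""
    p = len(partials[0])
    total = [[0.0] * p for _ in range(p)]
    for gram in partials:  # fixed order keeps the floating-point sum reproducible
        for a in range(p):
            for b in range(p):
                total[a][b] += gram[a][b]
    return total


# Toy data: 4 observations, 2 features, split across 2 hypothetical workers.
Y = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
omega = [1.0, 2.0, 4.0, 0.5]
partitions = [(Y[:2], omega[:2]), (Y[2:], omega[2:])]

distributed = aggregate([local_gram(rows, w) for rows, w in partitions])
single_node = local_gram(Y, omega)
```

Each worker touches only its own rows, and the amount of data exchanged depends on the number of features, not on the number of observations.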
SVM: Features and Plans

Features
  - frequency/weight term
  - iterative (PCG) or direct (Cholesky, threaded) method to solve the Newton system
  - balance threads to avoid cache misses
  - balance number of compute nodes to limit communication
In progress
  - nonlinear SVM
  - build a distributed QP solver

Available soon in SAS/OR

Quantile Regression

Goal
  Approximate the median or some other quantile of the response variable of a number of observations
\[
\begin{aligned}
\min \quad & \tau\, e^T u^+ + (1 - \tau)\, e^T u^- \\
\text{subject to} \quad & A(\beta^+ - \beta^-) + u^+ - u^- = b, \\
& \beta^+,\ \beta^-,\ u^+,\ u^- \ge 0
\end{aligned}
\]
  - $A$: observations in rows, fully dense
  - $b$: response variable
  - $\tau$: quantile level
Problem size
  Up to $10^8$ observations, each of dimension $10^4$

Quantile Regression

Features
  - distributed IPM, similar to the SVM case
  - Newton system solved directly or iteratively
  - different preconditioner
Plans
  - categorical variables
  - sparse observations
  - nonlinear quantile regression
  - build a distributed dense LP/IPM solver

Available soon in SAS/OR

Marketing Optimization

Problem
  Assign ads/offers to customers based on budget, policy, user preferences, history and other kinds of constraints.
Formulation
  - Typical sizes: millions of customers, hundreds of offers
  - Formulated as a MILP with millions of binary variables
Solution
  - Special decomposition
  - Subproblems are solved distributed on the grid
  - Subgradient algorithm for the master

Available as a SAS solution.
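The solution scheme on the Marketing Optimization slide (independent subproblems plus a subgradient method for the master) can be illustrated on a toy instance. The sketch below is not the SAS decomposition, which the slides do not describe in detail. It assumes a single aggregate budget constraint, at most one offer per customer, and a Lagrangian relaxation of the budget with a diminishing step size; the data and the names `solve_customer` and `subgradient` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def solve_customer(values: Sequence[float], costs: Sequence[float],
                   lam: float) -> Tuple[Optional[int], float]:
    """One customer's subproblem: best offer by value - lam * cost, or no offer."""
    best_offer, best_score = None, 0.0
    for j, (v, c) in enumerate(zip(values, costs)):
        score = v - lam * c
        if score > best_score:
            best_offer, best_score = j, score
    return best_offer, best_score


def subgradient(value: List[List[float]], cost: List[List[float]],
                budget: float, iterations: int = 50):
    """Subgradient method on the multiplier of the relaxed budget constraint."""
    lam = 0.0
    best_bound = float("inf")    # smallest upper bound seen (dual side)
    best_value = float("-inf")   # best budget-feasible assignment seen
    best_assignment = None
    for k in range(iterations):
        # The per-customer subproblems are independent; on a grid they
        # could be solved on different workers.
        picks = [solve_customer(v, c, lam) for v, c in zip(value, cost)]
        spent = sum(cost[i][j] for i, (j, _) in enumerate(picks) if j is not None)
        total = sum(value[i][j] for i, (j, _) in enumerate(picks) if j is not None)
        bound = lam * budget + sum(score for _, score in picks)
        best_bound = min(best_bound, bound)
        if spent <= budget and total > best_value:
            best_value = total
            best_assignment = [j for j, _ in picks]
        # Subgradient of the dual function is (budget - spent); step 1/(k+1).
        lam = max(0.0, lam - (budget - spent) / (k + 1))
    return best_bound, best_value, best_assignment


# Toy instance: 3 customers, 2 offers each, one budget constraint.
value = [[5.0, 8.0], [4.0, 6.0], [3.0, 9.0]]
cost = [[2.0, 5.0], [1.0, 4.0], [2.0, 6.0]]
budget = 8.0

bound, best, assignment = subgradient(value, cost, budget)
```

By weak duality the returned `bound` can never fall below the value of any budget-feasible assignment, so the gap between `bound` and `best` gives a quality estimate for the assignment found.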
Marketing Optimization

Typical data (telecommunications)
  - 15 million customers
  - 910 communications
  - 14 aggregate constraints
  - 19 rolling contact policies (per day, per week, per month)
  - 90 million offers in the contact history
Performance
  - Used to take 10 hours on a single machine with regular MO
  - Solved in 2 minutes on an EMC Greenplum appliance (32 nodes, 24 threads, 48 GB RAM)
  - Allows for scenario analysis

Local Search Optimization

Algorithm
  - GA-guided pattern search
  - Continuous, discrete and categorical variables
  - Up to about 100 variables
Implementation
  - Classical: each worker evaluates the function at a given point
  - Big data: each worker computes part of the function value from its own data, then these are aggregated
  - Function value cache
  - Leading to simulation-based optimization

Available as part of SAS/OR

Parallel Optimization

Not Big Data, but uses the same infrastructure
  - Decomposition
  - Multistart NLP
  - Option tuner for MILP

Decomposition Outline

Algorithm
  - Dantzig-Wolfe Decomposition embedded in B&B, a specific variant of column generation
  - Relax A and subproblem B becomes tractable (even separable)
  - Find convex combinations of extreme points of subproblems that satisfy the continuous relaxation of the master constraints
  - Iterate between master A (reformulated space) and B
Blocks
  - From user, network or auto

Available in SAS/OR

\[
\begin{pmatrix}
A_1 & A_2 & \cdots & A_{|K|} \\
B_1 &     &        &         \\
    & B_2 &        &         \\
    &     & \ddots &         \\
    &     &        & B_{|K|}
\end{pmatrix}
\]
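The "Big data" evaluation mode on the Local Search Optimization slide (each worker computes part of the function value from its own data, the parts are aggregated, and values are cached) can be sketched as follows. This is an illustration under stated assumptions, not the SAS implementation: a plain compass-style pattern search stands in for the GA-guided pattern search, the objective is a hypothetical least-squares fit of a line, and the names `partial_loss`, `CachedObjective` and `pattern_search` are made up for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Shard = List[Tuple[float, float]]  # (t, y) observations held by one worker


def partial_loss(shard: Shard, x: Point) -> float:
    """One worker's share: squared error of the line x[0] + x[1]*t on its data."""
    return sum((x[0] + x[1] * t - y) ** 2 for t, y in shard)


class CachedObjective:
    """Aggregates per-shard partial values and caches each evaluated point."""

    def __init__(self, shards: List[Shard]) -> None:
        self.shards = shards
        self.cache: Dict[Point, float] = {}
        self.calls = 0        # how often the search asked for a value
        self.evaluations = 0  # how often the shards actually had to compute

    def __call__(self, x: Point) -> float:
        self.calls += 1
        if x not in self.cache:
            self.evaluations += 1
            # Sum the partial values in a fixed shard order.
            self.cache[x] = sum(partial_loss(s, x) for s in self.shards)
        return self.cache[x]


def pattern_search(f, start: Point, step: float = 1.0,
                   min_step: float = 1e-3) -> Tuple[Point, float]:
    """Compass search: try +/- step along each coordinate, halve step on failure."""
    x, fx = start, f(start)
    while step >= min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:i] + (x[i] + delta,) + x[i + 1:]
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step /= 2
    return x, fx


# Two hypothetical workers, each holding part of the data for y = 1 + 2t.
shards = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
objective = CachedObjective(shards)
best_point, best_value = pattern_search(objective, (0.0, 0.0))
```

Pattern searches revisit points frequently, so the cache avoids repeating a full pass over the distributed data; `objective.calls` and `objective.evaluations` show how many passes were saved.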
Decomposition Parallel Implementation

Implementation
  - Shared (Threaded) and/or Distributed Memory (Gridded)
  - Subproblems use a standard queue
Areas of parallelism [the original slide marks these items with X X X X; the item-to-mark alignment is lost in the extracted text]
  - Branch & Bound
  - Heuristics (non-blocking price-and-branch)
  - Subproblem solves (across subproblems)
  - Master solve (IPM or concurrent)
  - Subproblem solves (for each subproblem)
Factors affecting parallel performance
  - percentage of time in subproblem vs master (modeling)
  - load balance
  - aggregate subproblems
  - enforce balance with time limits? (non-deterministic)
  - MPI overhead
  - jobs must be significant

Other parallel optimization applications

Network tools
  - Graph centrality, community detection
  - Social network analysis for fraud detection
  - Marketing analysis for telecommunications
Multistart NLP (Global optimization)
  - Standard NLP solvers started from different points
  - Function evaluations and solvers are distributed
Option tuner for MILP
  - Find the best option setting for a set of MILP problems
  - Continuous, discrete and categorical options are all included

Future Plans

Extend the list of HP-enabled procedures, driven by customers' needs
  - distributed LP/QP
  - distributed graph algorithms
  - parallel MILP
  - parallel solves in OPTMODEL
  - simulation-based optimization

Thank you for your attention.
imre@polik.net