Big Data Optimization at SAS

Imre Pólik et al.
SAS Institute
Cary, NC, USA
Edinburgh, 2013
Copyright © 2012, SAS Institute Inc. All rights reserved.
Outline
1 Optimization at SAS
2 Big Data Optimization at SAS
The SAS HPA architecture
Support vector machines
Quantile regression
Marketing Optimization
Local search optimization
3 Distributed/parallel optimization
Decomposition
Miscellaneous tools
4 Future plans
About SAS
The company
Leader in business analytics software and services
About $3 billion worldwide revenue
Largest private software company in the world
World's Best Multinational Workplace in 2012
More than 11,000 employees, 400 offices and 600 alliances
SAS customers or their affiliates represent over 90% of the top
100 FORTUNE 500 companies
The software
Originally created for basic statistics by professors at NCSU
Extended tremendously over the decades
Covers all aspects of analytics and business intelligence
SAS/OR Offerings
Optimization modelling and solvers
Algebraic Modelling Language with all the usual solvers
(LP, QP, MILP, NLP, CP, Scheduling, Decomposition, . . . )
Other tools
Graph and network algorithms
Discrete event simulation + the rest of SAS
Solutions
Marketing Optimization, Service Parts Optimization,
Revenue Optimization, Size Optimization, . . .
Services
Technical Support, Training, Professional Services, Consulting
Platforms
Windows, Linux, Solaris x64/SPARC, HP-UX, AIX, z/OS
What is Big Data?
Nathan Brixius (in a recent blog post)
A big data analytics application is simply an analytics application where
the required data does not fit on a single machine and
needs to be considered in full to produce a result.
SAS
Big data is relative; it applies whenever an organization's need to
handle, store and analyze data exceeds its current capacity.
Related concepts/tools
large-scale optimization
distributed optimization
parallel optimization
High Performance Analytics at SAS
Parallel Implementations and Determinism
Non-determinism
a dirty word in the commercial world
Sources of non-determinism
adding columns in a different order
aggregating results in a different order
arbitrary random number seeds
different machines in the pool
time limits
Workarounds
operations in a fixed order
deterministic criteria (nodes, iterations)
deterministic ticks (see Xpress/CPLEX)
Determinism comes with a performance penalty.
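The effect of aggregation order is easy to reproduce, because floating-point addition is not associative. The following is a minimal Python sketch, not SAS code: the function names and the (worker id, value) pairs are assumptions made for illustration. It contrasts adding partial results in arrival order with adding them in a fixed worker-id order.

```python
def sum_in_arrival_order(arrivals):
    """Add partial results in whatever order the workers happened to finish."""
    total = 0.0
    for _worker_id, value in arrivals:
        total += value
    return total


def sum_in_fixed_order(arrivals):
    """Add partial results sorted by worker id, independent of arrival order."""
    total = 0.0
    for _worker_id, value in sorted(arrivals):
        total += value
    return total


# Two runs deliver the same four partial results in different orders.
run_a = [(0, 1e16), (1, 1.0), (2, -1e16), (3, 1.0)]
run_b = [(0, 1e16), (2, -1e16), (1, 1.0), (3, 1.0)]

print(sum_in_arrival_order(run_a), sum_in_arrival_order(run_b))  # the two runs differ
print(sum_in_fixed_order(run_a), sum_in_fixed_order(run_b))      # the two runs agree
```

In this sketch the fixed order requires all partial results to be collected and sorted before they are added, which is one plausible source of the performance penalty.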
Big Data Optimization at SAS
Quadratic programming
Support vector machine
Linear programming
Quantile regression
Mixed-integer linear programming
Marketing optimization
Derivative-free optimization
Local search optimization
Support Vector Machines
http://www.cac.science.ru.nl/people/ustun/SVM.JPG
Linear SVM Problem
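Primal formulation
\[
\begin{aligned}
\min_{w,z,\beta}\quad & \tfrac{1}{2}\|w\|_2^2 + \tau e^T z \\
\text{subject to}\quad & Y w - \beta d \ge e - z, \\
& z \ge 0.
\end{aligned}
\]
Dual formulation
\[
\begin{aligned}
\min_{v}\quad & -e^T v + \tfrac{1}{2} v^T Y Y^T v \\
\text{subject to}\quad & d^T v = 0, \\
& 0 \le v \le \tau e.
\end{aligned}
\]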
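To make the primal formulation concrete, here is a small Python sketch that evaluates its objective for a given (w, β). The slide does not define the symbols, so the sketch assumes the usual convention: d holds the ±1 class labels, row i of Y is d_i times the feature vector x_i, and e is the all-ones vector. The function name and the toy data are hypothetical.

```python
def primal_objective(X, d, w, beta, tau):
    """Value of 0.5*||w||^2 + tau*sum(z), using the smallest feasible slack z."""
    slack_total = 0.0
    for x_i, d_i in zip(X, d):
        # Row i of (Y w - beta d) is d_i * (x_i . w - beta) under the assumed convention.
        margin = d_i * (sum(xj * wj for xj, wj in zip(x_i, w)) - beta)
        # Smallest z_i >= 0 that satisfies margin >= 1 - z_i.
        slack_total += max(0.0, 1.0 - margin)
    return 0.5 * sum(wj * wj for wj in w) + tau * slack_total


# Toy data: two points per class, separable along the first feature.
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
d = [1, 1, -1, -1]
print(primal_objective(X, d, w=[0.5, 0.0], beta=0.0, tau=10.0))
print(primal_objective(X, d, w=[0.0, 0.0], beta=0.0, tau=10.0))
```

With w = 0 every observation needs a full unit of slack, so the penalty term dominates; a separating w removes the slack and leaves only the norm term.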
Using Primal-Dual Interior-Point Approach
Dominant cost per iteration is forming/solving Newton system
Many more observations than columns/features
\[
(I + Y^T \Omega_k^{-1} Y - v_k v_k^T)\,\Delta w = -r_w
\]
$Y^T \Omega_k^{-1} Y$ must be formed every iteration
Many more columns/features than observations
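\[
\begin{pmatrix} Y Y^T + \Omega_k & d \\ d^T & 0 \end{pmatrix}
\begin{pmatrix} \Delta v \\ -\Delta\beta \end{pmatrix}
= -\begin{pmatrix} \rho_\beta \\ r_\Omega \end{pmatrix}
\]
$Y Y^T$ is constant for all iterations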
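In the case with many more observations than features, the matrix Y^T Ω^{-1} Y is small, and it is a sum of one outer product per observation. The sketch below illustrates why this distributes well: each worker accumulates the contribution of its own rows, and only small square matrices are combined. It assumes that Ω_k is diagonal (the slide does not say so); the function names and the two-shard split are hypothetical, and this is not the SAS implementation.

```python
def local_gram(rows, omegas):
    """Partial sum of y_i y_i^T / omega_i over the rows held by one worker."""
    n = len(rows[0])
    gram = [[0.0] * n for _ in range(n)]
    for y, omega in zip(rows, omegas):
        for p in range(n):
            for q in range(n):
                gram[p][q] += y[p] * y[q] / omega
    return gram


def add_matrices(a, b):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def distributed_gram(shards):
    """Combine the per-worker partial matrices in a fixed shard order."""
    total = None
    for rows, omegas in shards:
        part = local_gram(rows, omegas)
        total = part if total is None else add_matrices(total, part)
    return total


# Four observations with two features, split across two simulated workers.
Y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
omega = [1.0, 2.0, 4.0, 8.0]
shards = [(Y[:2], omega[:2]), (Y[2:], omega[2:])]
print(distributed_gram(shards))
```

Because the diagonal scaling changes at every interior-point iteration, this accumulation has to be repeated each time, which matches the remark that the matrix must be formed every iteration.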
SVM: Parallel Performance
SVM: Features and Plans
Features
frequency/weight term
iterative (PCG) or direct (Cholesky, threaded) method to solve
the Newton system
balance threads to avoid cache misses
balance number of compute nodes to limit communication
In progress
nonlinear SVM
build a distributed QP solver
Available soon in SAS/OR
Quantile Regression
Goal
Approximate the median or some other quantile of the response
variable of a number of observations
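\[
\begin{aligned}
\min\quad & \tau e^T u^+ + (1-\tau) e^T u^- \\
\text{subject to}\quad & A(\beta^+ - \beta^-) + u^+ - u^- = b, \\
& \beta^+, \beta^-, u^+, u^- \ge 0
\end{aligned}
\]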
A: observations in rows, fully dense
b: response variable
τ : quantile level
Problem size
Up to $10^8$ observations, each of dimension $10^4$
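The objective of the formulation above is the pinball (check) loss: for a residual r, the variable u+ carries its positive part and u− its negative part. The following Python sketch illustrates this on an intercept-only model, where the minimizer is a sample quantile of the response. The function names and the data are hypothetical, and the search over observed values is a shortcut that only works for this one-parameter case; it is not how an LP or interior-point solver proceeds.

```python
def pinball_loss(residuals, tau):
    """tau * (sum of positive parts) + (1 - tau) * (sum of negative parts)."""
    u_plus = sum(max(r, 0.0) for r in residuals)
    u_minus = sum(max(-r, 0.0) for r in residuals)
    return tau * u_plus + (1.0 - tau) * u_minus


def best_constant(b, tau):
    """Intercept-only fit: try each observed value as the constant prediction."""
    return min(b, key=lambda c: pinball_loss([b_i - c for b_i in b], tau))


b = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(b, 0.5))   # median-type fit, insensitive to the outlier
print(best_constant(b, 0.25))  # lower-quartile-type fit
```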
Quantile Regression
Features
distributed IPM, similar to the SVM case
Newton system solved directly or iteratively
different preconditioner
Plans
categorical variables → sparse observations
nonlinear quantile regression
build a distributed dense LP/IPM solver
Available soon in SAS/OR
Marketing Optimization
Problem
Assign ads/offers to customers based on budget, policy, user
preferences, history and other kinds of constraints.
Formulation
Typical sizes: millions of customers, hundreds of offers
Formulated as a MILP with millions of binary variables
Solution
Special decomposition
Subproblems are solved distributed on the grid
Subgradient algorithm for the master
Available as a SAS solution.
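The combination of per-customer subproblems and a subgradient master can be illustrated with a generic Lagrangian relaxation. The sketch below is not the SAS Marketing Optimization algorithm: it assumes a single aggregate budget constraint, at most one offer per customer, a diminishing step size, and hypothetical function names and data. Relaxing the budget with a multiplier makes the problem separate by customer, so the inner loop is the part that could run distributed.

```python
def solve_customer(values, costs, lam):
    """Best offer for one customer when each unit of cost is priced at lam."""
    best_j, best_score = None, 0.0  # sending no offer scores 0
    for j, (v, c) in enumerate(zip(values, costs)):
        score = v - lam * c
        if score > best_score:
            best_j, best_score = j, score
    return best_j, best_score


def subgradient_master(values, costs, budget, iterations=50, step0=1.0):
    """Return the best Lagrangian upper bound and the best feasible value seen."""
    lam = 0.0
    best_bound = float("inf")
    best_value = 0.0  # the empty assignment is always feasible
    for k in range(1, iterations + 1):
        # Independent subproblems: one per customer.
        choices = [solve_customer(v, c, lam) for v, c in zip(values, costs)]
        spent = sum(costs[i][j] for i, (j, _s) in enumerate(choices) if j is not None)
        bound = lam * budget + sum(score for _j, score in choices)
        best_bound = min(best_bound, bound)
        if spent <= budget:
            value = sum(values[i][j] for i, (j, _s) in enumerate(choices) if j is not None)
            best_value = max(best_value, value)
        # Subgradient step on the multiplier of the relaxed budget constraint.
        lam = max(0.0, lam + (step0 / k) * (spent - budget))
    return best_bound, best_value


values = [[5.0, 3.0], [4.0, 6.0], [2.0, 2.0]]  # value of each offer per customer
costs = [[2.0, 1.0], [2.0, 3.0], [1.0, 1.0]]   # cost of each offer per customer
bound, value = subgradient_master(values, costs, budget=4.0)
print(bound, value)
```

The bound is valid for every multiplier by weak duality, while the feasible value depends on which iterates happen to respect the budget; a production method would add a repair heuristic or branching on top.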
Marketing Optimization
Typical data (telecommunications)
15 million customers
910 communications
14 aggregate constraints
19 rolling contact policies (per day, per week, per month)
90 million offers in the contact history
Performance
Used to take 10 hours on a single machine with regular MO
Solved in 2 minutes on an EMC Greenplum appliance
(32 nodes, 24 threads, 48 GB RAM)
Allows for scenario analysis
Local Search Optimization
Algorithm
GA-guided pattern search
Continuous, discrete and categorical variables
Up to about 100 variables
Implementation
Classical: each worker evaluates the function at a given point
Big data: each worker computes part of the function value
from its own data, then these are aggregated
Function value cache
Leading to simulation-based optimization
Available as part of SAS/OR
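The "big data" evaluation mode and the function value cache can be sketched as follows. This is an illustration under stated assumptions, not the SAS/OR solver: a plain compass (pattern) search stands in for the GA-guided pattern search, the shards are in-memory lists standing in for data held by workers, and all names are hypothetical.

```python
def make_sharded_objective(shards):
    """Objective whose value is the sum of partial values, one per data shard."""
    cache = {}

    def objective(point):
        key = tuple(point)
        if key not in cache:
            # Each partial sum would be computed by the worker that owns the shard.
            partials = [sum((x - point[0]) ** 2 for x in shard) for shard in shards]
            cache[key] = sum(partials)  # aggregation step
        return cache[key]

    objective.cache = cache
    return objective


def compass_search(f, start, step=1.0, tol=1e-6):
    """Derivative-free search: poll +/- step along each coordinate, halve on failure."""
    point, best = list(start), f(start)
    while step > tol:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = f(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best


shards = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
f = make_sharded_objective(shards)
point, best = compass_search(f, [0.0])
print(point, best, len(f.cache))
```

The cache pays off because a pattern search revisits points it has already polled; with a distributed objective, each cache hit saves a full round of worker evaluations and aggregation.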
Parallel Optimization
Not Big Data, but uses the same infrastructure
Decomposition
Multistart NLP
Option tuner for MILP
Decomposition Outline
Algorithm
Dantzig-Wolfe Decomposition embedded in B&B
a specific variant of column generation
Relax A and subproblem B becomes tractable (even separable)
Find convex combinations of extreme points of subproblems
that satisfy the continuous relaxation of the master constraints
Iterate between master A (reformulated space) and B
Blocks
From user, network or auto
Available in SAS/OR
\[
\begin{pmatrix}
A_1 & A_2 & \cdots & A_{|K|} \\
B_1 & & & \\
& B_2 & & \\
& & \ddots & \\
& & & B_{|K|}
\end{pmatrix}
\]
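For reference, the reformulation behind the outline above can be written in its textbook form. The notation is an assumption, since the slide gives only the block structure: a minimization problem with block costs $c_k$, linking right-hand side $b$, bounded block polyhedra $P_k = \{x_k : B_k x_k \ge b_k\}$ with extreme points $x_k^s$, and master duals $\pi \ge 0$ and $\mu_k$. Substituting $x_k = \sum_s \lambda_{k,s} x_k^s$ gives the master problem

```latex
\[
\begin{aligned}
\min_{\lambda \ge 0}\quad & \sum_{k \in K} \sum_{s} (c_k^T x_k^s)\, \lambda_{k,s} \\
\text{subject to}\quad & \sum_{k \in K} \sum_{s} (A_k x_k^s)\, \lambda_{k,s} \ge b, \\
& \sum_{s} \lambda_{k,s} = 1, \qquad k \in K,
\end{aligned}
\]
% Pricing subproblem for block k, given master duals (\pi, \mu_k):
\[
\min_{x_k \in P_k}\; (c_k - A_k^T \pi)^T x_k - \mu_k .
\]
```

A block whose pricing problem has a negative optimal value contributes a new column to the master; when no block does, the restricted master is optimal for the relaxation at that node.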

Decomposition Parallel Implementation
Implementation
Shared (Threaded) and/or Distributed Memory (Gridded)
Subproblems use a standard queue
Areas of parallelism
Branch & Bound
Heuristics (non-blocking price-and-branch)
Subproblem solves (across subproblems)
Master solve (IPM or concurrent)
Subproblem solves (for each subproblem)
Factors affecting parallel performance
percentage of time in subproblems vs. master (modeling)
load balance → aggregate subproblems
enforce balance with time limits? (non-deterministic)
MPI overhead → jobs must be significant
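The queue-based handling of subproblems can be sketched in shared memory with Python's standard thread pool. This is an illustration only: the subproblem "solve" is a trivial stand-in, and the function names and data are hypothetical. Tasks are taken from the executor's internal queue by whichever thread is free, while results are returned in submission order.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_subproblem(block):
    """Stand-in for a pricing solve: pick the cheapest column of one block."""
    block_id, column_costs = block
    return block_id, min(column_costs)


def solve_all(blocks, workers=4):
    """Solve every block on a pool of threads and return results in block order."""
    # Executor.map queues all blocks; idle threads pick up the next one, and the
    # results are yielded in submission order regardless of completion order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve_subproblem, blocks))


blocks = [(k, [3.0 * k - c for c in range(5)]) for k in range(8)]
print(solve_all(blocks))
```

Returning results in a fixed order keeps the master update reproducible, but the pool still waits for the slowest block, which is where aggregating small subproblems into comparable jobs helps.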
Other parallel optimization applications
Network tools
Graph centrality, community detection
Social network analysis for fraud detection
Marketing analysis for telecommunications
Multistart NLP (Global optimization)
Standard NLP solvers started from different points
Function evaluations and solvers are distributed
Option tuner for MILP
Find the best option setting for a set of MILP problems
Continuous, discrete and categorical options are all included
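The multistart idea listed above can be sketched in a few lines. The example is hypothetical and deliberately simple: fixed-step gradient descent stands in for a standard NLP solver, the one-dimensional test function and the start points are made up, and the local solves run sequentially here although they are independent and could be distributed.

```python
def local_descent(grad, x0, step=0.01, iterations=2000):
    """Stand-in local solver: fixed-step gradient descent from x0."""
    x = x0
    for _ in range(iterations):
        x -= step * grad(x)
    return x


def multistart(f, grad, starts):
    """Run the local solver from every start point and keep the best result."""
    candidates = [local_descent(grad, x0) for x0 in starts]  # independent solves
    return min(candidates, key=f)


def f(x):
    """Test function with one local and one global minimum."""
    return x ** 4 - 3 * x ** 2 + x


def grad(x):
    """Derivative of f."""
    return 4 * x ** 3 - 6 * x + 1


starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
best = multistart(f, grad, starts)
print(best, f(best))
```

A single solve started at 1.0 or 2.0 stops at the local minimum on the right; only the comparison across start points recovers the better minimum on the left.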
Future Plans
Extend the list of HP enabled procedures
driven by customers' needs
distributed LP/QP
distributed graph algorithms
parallel MILP
parallel solves in OPTMODEL
simulation-based optimization
Thank you for your attention.
imre@polik.net