Improving Consolidation of Virtual Machines
with Risk-aware Bandwidth Oversubscription
in Compute Clouds
Amir Epstein
Joint work with
David Breitgand
© 2009 IBM Corporation
Motivation
• Network bandwidth is a critical data center resource
• Network bandwidth may become a bottleneck for consolidation
• Accurate and efficient estimation of network bandwidth demand is difficult
• Common practice: fully provision for peak loads
• Consequence: wasted resources
Full Provisioning vs. Multiplexing
• The aggregate demand of the VMs may be much smaller than the sum of the maximum demands of the individual VMs: $\sum_i \max_t d_i(t) \gg \max_t \sum_i d_i(t)$
[Figure: two panels plotting bandwidth demand against time (x-axis: Time, y-axis: Capacity), one showing VM1 with its maximum (VM1-Max) and one showing VM2 with its maximum (VM2-Max). Max(VM1)+Max(VM2)=110]
Full Provisioning vs. Multiplexing
[Figure: demand of VM1, VM2 and their sum VM1+VM2 plotted against time (x-axis: Time, y-axis: Capacity), with the maximum of the aggregate marked (Max: VM1+VM2)]
Max(VM1+VM2)=71 < Max(VM1)+Max(VM2)=110
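The comparison between the two provisioning rules can be sketched in a few lines. The function name `provisioning_gap` and the two short demand traces are hypothetical illustrations, not the data behind the charts; the traces were merely chosen so that the per-VM peaks are 60 and 50.

```python
def provisioning_gap(demands):
    """Return (sum of per-VM peaks, peak of the aggregate demand).

    demands: list of equal-length demand series, one per VM.
    """
    sum_of_max = sum(max(series) for series in demands)
    max_of_sum = max(sum(step) for step in zip(*demands))
    return sum_of_max, max_of_sum


# Hypothetical traces: the two VMs peak at different times.
vm1 = [10, 60, 20, 15]
vm2 = [50, 5, 30, 10]
full, multiplexed = provisioning_gap([vm1, vm2])
print("full provisioning:", full, "multiplexed:", multiplexed)
```

Full provisioning reserves the first number, while a shared link only ever has to carry the second.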
Statistical Multiplexing
• Model each VM's dynamic bandwidth demand as a random variable
• The aggregate bandwidth demand is then the sum of the random variables representing the VMs' bandwidth demands
• As the number of VMs increases:
– The ratio between the standard deviation of the aggregate bandwidth demand and its mean decreases
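A minimal numeric sketch of this effect, assuming the demands are independent so that their variances add. The helper name `aggregate_cv` and the per-VM mean of 10 and standard deviation of 5 are hypothetical.

```python
from math import sqrt


def aggregate_cv(mus, sigmas):
    """Std/mean ratio of a sum of independent demands.

    Independence is an assumption of this sketch: it lets the variances add.
    """
    return sqrt(sum(s * s for s in sigmas)) / sum(mus)


# Hypothetical VMs, each with mean 10 and standard deviation 5.
for n in (1, 4, 16, 64):
    print(n, aggregate_cv([10.0] * n, [5.0] * n))
```

For n identical independent VMs the ratio is σ/(μ√n), so it shrinks as VMs are added.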
Overcommit
• The cloud provider aims to improve cost-efficiency
• Overcommit resources using statistical multiplexing
• Our focus is bandwidth
Stochastic Bin Packing Problem (SBP)
• $S=\{X_1,\dots,X_n\}$: set of items
• $X_i$: random variable representing the size (bandwidth demand) of item $i$
• $p$: overflow probability
• Goal: partition the set $S$ into the smallest number of subsets (bins) $S_1,\dots,S_k$ such that
$$\Pr\Big[\sum_{i:\,X_i\in S_j} X_i > 1\Big] \le p \quad \text{for } 1\le j\le k$$
$p$ represents a probabilistic SLA / policy
SBP with Normal Distribution
• We assume that each item $i$ independently follows a normal distribution $N(\mu_i,\sigma_i^2)$.
• When $\sigma_i=0$ for all $i$, then $X_i=\mu_i$ and the problem reduces to the classical bin packing problem
• The focus of this work is SBP with normal variables
Related Work – Bin Packing
• The problem is NP-hard
• Bin packing is hard to approximate within a factor better than 3/2 unless P=NP.
• First Fit Decreasing (FFD) has an asymptotic approximation ratio of 11/9 and an (absolute) approximation ratio of 3/2.
• The MFFD algorithm has an asymptotic approximation ratio of 71/60.
• An AFPTAS exists.
• Online bin packing
– First Fit (FF) has a competitive ratio of 17/10.
– The best known upper and lower bounds are 1.58889 and 1.54014, respectively.
Related Work – Stochastic Bin Packing
• $O\!\left(\frac{\log p^{-1}}{\log\log p^{-1}}\right)$-approximation for SBP with Bernoulli variables [Kleinberg et al. 1997]
• SBP with Poisson, exponential and Bernoulli variables [Goel and Indyk 1999]
– A PTAS exists for Poisson and exponential distributions.
– A quasi-PTAS exists for Bernoulli variables.
– These results relax the bin capacity and overflow probability constraints by a factor of 1+ε.
• $(1+\sqrt{2})(1+\varepsilon)$-competitive algorithm for SBP with normal variables [Wang et al. 2011]
Our Results
• A 2-approximation algorithm for SBP with normal variables
• A (2+ε)-competitive algorithm for online SBP with normal variables
• We observe the existence of a dual PTAS for SBP with normal variables.
Definitions
• Definition: The effective load of bin $j$ is
$$l_j = \sum_{i:\,X_i\in S_j}\mu_i + \beta\sqrt{\sum_{i:\,X_i\in S_j}\sigma_i^2}$$
where $\beta=\Phi^{-1}(1-p)$ and the quantile function $\Phi^{-1}$ is the inverse of the CDF $\Phi$ of $N(0,1)$.
• Observation: A packing is feasible for a given overflow probability $p$ iff for every bin $j$,
$$l_j = \sum_{i:\,X_i\in S_j}\mu_i + \beta\sqrt{\sum_{i:\,X_i\in S_j}\sigma_i^2} \le 1$$
The load of bin $j$ is normally distributed with mean $\sum_{i:\,X_i\in S_j}\mu_i$ and variance $\sum_{i:\,X_i\in S_j}\sigma_i^2$.
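The effective load and the feasibility test translate directly into code. The sketch below assumes items are given as hypothetical `(mu, sigma)` pairs in units of the bin capacity, and uses `statistics.NormalDist` from the Python standard library for the quantile function; the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def effective_load(items, p):
    """Effective load of one bin holding independent normal items.

    items: list of (mu, sigma) pairs; p: allowed overflow probability.
    """
    beta = NormalDist().inv_cdf(1 - p)  # beta = Phi^{-1}(1 - p)
    total_mean = sum(mu for mu, _ in items)
    total_var = sum(sigma ** 2 for _, sigma in items)
    return total_mean + beta * sqrt(total_var)


def is_feasible(items, p):
    """A bin is feasible iff its effective load is at most the capacity 1."""
    return effective_load(items, p) <= 1.0


# Hypothetical bin contents.
print(effective_load([(0.3, 0.1), (0.2, 0.05)], p=0.05))
print(is_feasible([(0.3, 0.1), (0.2, 0.05)], p=0.05))
```

By construction, the effective load is the (1-p)-quantile of the bin's total demand, which is why comparing it with the capacity is equivalent to the overflow constraint.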
Simple solution approach
• Reduce the problem to the classical bin packing problem with item sizes $\mu_i+\beta\sigma_i$, so that $P(X_i \le \mu_i+\beta\sigma_i) = 1-p$
• A feasible solution to the classical bin packing problem is a feasible solution to SBP, since
$$\sum_{i:\,X_i\in S_j}\mu_i + \beta\sqrt{\sum_{i:\,X_i\in S_j}\sigma_i^2} \le \sum_{i:\,X_i\in S_j}(\mu_i+\beta\sigma_i) \le 1$$
• The optimum for the classical bin packing instance with the new sizes may be significantly larger than the optimum for SBP.
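To get a feel for how large the gap can be, the sketch below counts how many identical items fit into one bin under each rule. The helper `max_identical_items` and the parameters (mean 0.01, standard deviation 0.05, p = 0.01) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def max_identical_items(mu, sigma, p, stochastic):
    """Largest number of identical (mu, sigma) items that fit in one unit bin.

    stochastic=False uses deterministic sizes mu + beta*sigma per item;
    stochastic=True uses the effective load n*mu + beta*sigma*sqrt(n).
    Requires mu > 0 so that the loop terminates.
    """
    beta = NormalDist().inv_cdf(1 - p)
    n = 0
    while True:
        k = n + 1
        spread = sqrt(k) if stochastic else k
        if k * mu + beta * sigma * spread > 1.0:
            return n
        n = k


# Hypothetical bursty items: small mean, comparatively large deviation.
print(max_identical_items(0.01, 0.05, 0.01, stochastic=False))
print(max_identical_items(0.01, 0.05, 0.01, stochastic=True))
```

The deterministic sizes charge every item its own safety margin, whereas the effective load lets the margin grow only with the square root of the number of items.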
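Effective Size
• $$l_j = \sum_{i\in S_j}\mu_i + \beta\sqrt{\sum_{i\in S_j}\sigma_i^2} = \sum_{i\in S_j}\mu_i + \sum_{i\in S_j}\frac{(\beta\sigma_i)^2}{\beta\sqrt{\sum_{k\in S_j}\sigma_k^2}}$$
• Thus, the effective size of item $i$ on bin $j$ can be viewed as
$$\mu_i + \frac{(\beta\sigma_i)^2}{\beta\sqrt{\sum_{k\in S_j}\sigma_k^2}}$$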
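A short sketch of this decomposition, assuming hypothetical `(mu, sigma)` items that share one bin with at least one positive sigma, and 0 < p < 0.5 so that β is positive. It shows that the same item has a smaller effective size when its bin already has a large total variance.

```python
from math import sqrt
from statistics import NormalDist


def effective_sizes(items, p):
    """Per-item effective sizes for the items sharing one bin.

    items: list of (mu, sigma) pairs, at least one sigma > 0; 0 < p < 0.5.
    The sizes sum to the effective load of the bin.
    """
    beta = NormalDist().inv_cdf(1 - p)
    bin_std = sqrt(sum(sigma ** 2 for _, sigma in items))
    return [mu + (beta * sigma) ** 2 / (beta * bin_std) for mu, sigma in items]


# The same hypothetical item (0.1, 0.05) next to a calm and a bursty neighbour.
calm = effective_sizes([(0.1, 0.05), (0.1, 0.01)], p=0.05)
bursty = effective_sizes([(0.1, 0.05), (0.1, 0.2)], p=0.05)
print(calm[0], bursty[0])
```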
Approximation Algorithm
Algorithm 1: First Fit VMR Decreasing
• Order the items in non-increasing order of VMR
• Place the next item in the first bin into which it can be feasibly packed
• If no such bin exists, open a new bin to pack this item
Variance to Mean Ratio (VMR) is $d_i = \sigma_i^2/\mu_i$
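The three steps of Algorithm 1 can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the function name, the `(mu, sigma)` input format and the assumption that every item fits into an empty bin on its own are choices of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def first_fit_vmr_decreasing(items, p):
    """Pack items (list of (mu, sigma), mu > 0) into unit bins.

    Returns a list of bins, each a list of item indices.
    Assumes every single item is feasible in an empty bin.
    """
    beta = NormalDist().inv_cdf(1 - p)
    # Step 1: sort by variance-to-mean ratio, largest first.
    order = sorted(range(len(items)),
                   key=lambda i: items[i][1] ** 2 / items[i][0],
                   reverse=True)
    bins = []  # each entry: [sum of means, sum of variances, item indices]
    for i in order:
        mu, sigma = items[i]
        for b in bins:
            # Step 2: first bin whose effective load stays within capacity.
            if b[0] + mu + beta * sqrt(b[1] + sigma ** 2) <= 1.0:
                b[0] += mu
                b[1] += sigma ** 2
                b[2].append(i)
                break
        else:
            # Step 3: no bin fits, open a new one.
            bins.append([mu, sigma ** 2, [i]])
    return [b[2] for b in bins]


# Hypothetical instance.
demo = [(0.2, 0.1), (0.3, 0.05), (0.1, 0.2), (0.25, 0.1), (0.15, 0.15)]
print(first_fit_vmr_decreasing(demo, p=0.05))
```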
Approximation Algorithm
Theorem 1: Algorithm 1 is a 2-approximation algorithm for
SBP with normal variables.
Integer Program for SBP
$$\sum_{i=1}^{n} x_{ij}\mu_i + \beta\sqrt{\sum_{i=1}^{n} x_{ij}\sigma_i^2} \le 1, \quad 1\le j\le m,$$
$$\sum_{j=1}^{m} x_{ij} = 1, \quad 1\le i\le n,$$
$$x_{ij}\in\{0,1\}, \quad 1\le i\le n,\ 1\le j\le m$$
Mathematical Program Relaxation
$$\sum_{i=1}^{n} x_{ij}\mu_i + \beta\sqrt{\sum_{i=1}^{n} x_{ij}\sigma_i^2} \le 1, \quad 1\le j\le m,$$
$$\sum_{j=1}^{m} x_{ij} = 1, \quad 1\le i\le n,$$
$$x_{ij}\ge 0, \quad 1\le i\le n,\ 1\le j\le m$$
Fractional Algorithm (Algorithm 2)
• Order the items in non-increasing order of VMR
• Place the next item in the bin with remaining capacity (the most recently opened one). If the item would cause the bin to overflow, assign the maximum feasible fraction of the item to the bin, then open a new bin to pack the remaining part of the item.
Variance to Mean Ratio (VMR) is $d_i = \sigma_i^2/\mu_i$
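A possible sketch of the fractional algorithm, under assumptions that are this example's own: a fraction x of item i contributes x·μ_i to the mean and x·σ_i² to the variance of its bin (as in the relaxation), 0 < p ≤ 0.5 so that the load grows with the fraction, and the maximum fraction is found by bisection instead of a closed form. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fractional_vmr_packing(items, p):
    """items: list of (mu, sigma) with mu > 0. Returns a list of bins,
    each a dict mapping item index -> fraction assigned to that bin."""
    beta = NormalDist().inv_cdf(1 - p)
    order = sorted(range(len(items)),
                   key=lambda i: items[i][1] ** 2 / items[i][0],
                   reverse=True)

    def fits(mean, var):
        return mean + beta * sqrt(var) <= 1.0

    bins, mean, var = [{}], 0.0, 0.0
    for i in order:
        mu, sigma = items[i]
        remaining = 1.0
        while remaining > 1e-12:
            if fits(mean + remaining * mu, var + remaining * sigma ** 2):
                frac = remaining
            else:
                # Bisection for the largest fraction that keeps the bin feasible.
                lo, hi = 0.0, remaining
                for _ in range(60):
                    mid = (lo + hi) / 2
                    if fits(mean + mid * mu, var + mid * sigma ** 2):
                        lo = mid
                    else:
                        hi = mid
                frac = lo
            if frac > 0.0:
                bins[-1][i] = bins[-1].get(i, 0.0) + frac
                mean += frac * mu
                var += frac * sigma ** 2
                remaining -= frac
            if remaining > 1e-12:
                # Current bin is full: open a new one for the rest of the item.
                bins.append({})
                mean, var = 0.0, 0.0
    return bins


# Hypothetical instance.
frac_demo = [(0.2, 0.1), (0.3, 0.05), (0.1, 0.2), (0.25, 0.1), (0.15, 0.15), (0.3, 0.1)]
for b in fractional_vmr_packing(frac_demo, p=0.05):
    print(b)
```

Every bin except possibly the last is filled up to capacity, so each item is split across at most two consecutive bins as long as it fits into an empty bin on its own.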
Analysis
Lemma: There exists a feasible solution to the MP with the following property. For any pair of items $k,l$ and a pair of bins $i<j$, if $x_{kj}>0$ and $x_{li}>0$, then $d_l \ge d_k$.
Observation: The fractional algorithm produces a feasible fractional solution to the MP.
• This implies that collocating items with high VMR (bursty items) minimizes the total effective size of the items
Variance to Mean Ratio (VMR) is $d_i = \sigma_i^2/\mu_i$
Proof Outline
• Consider a feasible solution to the MP with a lexicographically maximal standard deviation (STD) vector of the bins $S=(S_1,\dots,S_m)$, where $S_j=\sqrt{\sum_{i:\,X_i\in S_j}\sigma_i^2}$
• Assume by contradiction that the items are not packed into the bins according to non-increasing order of VMR
• Thus, there exists at least one pair of items that are not placed in this order (i.e., the item with the smaller VMR is packed into a bin with a smaller index than the other item).
• We show that we can exchange fractions of these items between the bins, such that
– the new solution is feasible
– the STD vector of the bins in the new solution is lexicographically greater than the one in the original solution
• Contradiction
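Online Algorithm
• VMR (here scaled by $\beta^2$): $d_i = \beta^2\sigma_i^2/\mu_i$
• Let $C = \left\lceil \frac{8}{\varepsilon}\ln\frac{1}{\varepsilon} \right\rceil \ge \log_{1+\varepsilon}\frac{1}{\varepsilon^4}$
• Class 0: $d_i < \varepsilon^2$
• Class $1\le k\le C$: $\varepsilon^2(1+\varepsilon)^{k-1} \le d_i < \varepsilon^2(1+\varepsilon)^k$
• Class $C+1$: $1/\varepsilon^2 \le \varepsilon^2(1+\varepsilon)^C \le d_i$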
Online Algorithm
Algorithm 3:
• Classify the next item according to the VMR classes
• Place the item in the first bin of its class into which it can be feasibly packed
• If no such bin exists, open a new bin of that class to pack this item
Theorem 2: Algorithm 3 is a (2+O(ε))-approximation algorithm for SBP with normal variables.
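A rough sketch of the class-based online packing follows. The class boundaries are an assumption of this example: geometric ranges ε²(1+ε)^k applied to the scaled ratio β²σ²/μ, with C = ⌈(8/ε) ln(1/ε)⌉ classes in the middle; these constants should be checked against the paper before relying on them. The class name `OnlineVmrPacker` and its methods are hypothetical, and each item is assumed to fit into an empty bin on its own.

```python
from math import ceil, floor, log, sqrt
from statistics import NormalDist


class OnlineVmrPacker:
    """First Fit within VMR classes: a bin only holds items of one class."""

    def __init__(self, p, eps):
        self.beta = NormalDist().inv_cdf(1 - p)
        self.eps = eps
        self.num_classes = ceil(8 / eps * log(1 / eps))  # assumed value of C
        self.bins = {}  # class index -> list of [sum of means, sum of variances]

    def vmr_class(self, mu, sigma):
        d = (self.beta * sigma) ** 2 / mu  # assumed scaled VMR, mu > 0
        if d < self.eps ** 2:
            return 0
        # Class k covers eps^2 (1+eps)^(k-1) <= d < eps^2 (1+eps)^k;
        # floating-point rounding may shift items sitting exactly on a boundary.
        k = floor(log(d / self.eps ** 2, 1 + self.eps)) + 1
        return min(k, self.num_classes + 1)

    def place(self, mu, sigma):
        """Pack one arriving item; return (class index, bin index in class)."""
        cls = self.vmr_class(mu, sigma)
        class_bins = self.bins.setdefault(cls, [])
        for idx, b in enumerate(class_bins):
            if b[0] + mu + self.beta * sqrt(b[1] + sigma ** 2) <= 1.0:
                b[0] += mu
                b[1] += sigma ** 2
                return cls, idx
        class_bins.append([mu, sigma ** 2])
        return cls, len(class_bins) - 1


# Hypothetical arrival sequence.
packer = OnlineVmrPacker(p=0.05, eps=0.1)
for mu, sigma in [(0.2, 0.05), (0.3, 0.1), (0.1, 0.02), (0.25, 0.05), (0.2, 0.05)]:
    print(packer.place(mu, sigma))
```

Keeping classes in separate bins means items packed together have VMRs within a factor of 1+ε of each other, at the price of possibly one partly filled bin per class.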
Simulation Study
• Compare our proposed algorithms to previously reported ones
• Data set
– Real trace from a production data center, used to compute the mean and standard deviation of the bandwidth consumption of 6000 VMs over a period of a few hours.
– Synthetic traces with statistical properties similar to those of the real traces
Algorithms
• Algorithms 1-3
• First Fit (FF) with deterministic item sizes $\mu_i+\beta\sigma_i$
• First Fit Decreasing (FFD) with deterministic item sizes $\mu_i+\beta\sigma_i$
• Group Packing (GP) [Wang et al. 2011]
For the online algorithms (Algorithm 3 and Group Packing), we set ε=0.1.
Real Instance
[Figures (three slides): results on the real instance. Legend annotations: (Online), (Approx.), (L.B).]
Online Algorithms
• Large synthetic instances
[Figure: results of the online algorithms on large synthetic instances; chart annotations: 9%, 8%]
Summary
• We studied SBP under the assumption that the bandwidth demand of virtual machines follows a normal distribution
• We showed a 2-approximation algorithm
• We showed a (2+ε)-competitive online algorithm
• We observed the existence of a dual PTAS for SBP
• We studied the performance and applicability of our algorithms using synthetic and real data
• The performance evaluation showed that our proposed algorithms considerably reduce the number of bins compared to the best known algorithms for the problem