Geographically Distributed Datacenters with Load Reallocation

Indra Widjaja, Sem Borst, Iraj Saniee
Bell Labs
DIMACS Workshop on Cloud Computing, December 8-9, 2011
COPYRIGHT © 2011 ALCATEL-LUCENT. ALL RIGHTS RESERVED.
ALCATEL-LUCENT — CONFIDENTIAL — SOLELY FOR AUTHORIZED PERSONS HAVING A NEED TO KNOW — PROPRIETARY — USE PURSUANT TO COMPANY INSTRUCTION
Datacenter Alternatives
[Figure: side-by-side diagrams of a geographically centralized and a geographically distributed datacenter over five potential DC sites (1-5). Legend: servers; potential DC site.]
Challenge
• Centralized datacenters cannot uniformly offer low-latency services to all end-users
• Distributed datacenters may not achieve elasticity
Toy Example of Distributed DC with Reallocation
[Figure: five-site network (sites 1-5) shown without reallocation and with reallocation; labels: λ_1, μ_1, q_{1,1}, q_{1,3}.]
• λ_i = job arrival rate at site i, μ_i = processing capacity at site i
• q_{i,j} = fraction of load reallocated from site i to site j
Formal Model of Load (Re)Allocation
in Geographically Distributed Datacenter
Let λ_i^k be the arrival rate of type-k jobs at site i, β_k the service time of a type-k job per server, and τ_{i,j} the round-trip delay between sites i and j. The optimization problem is to minimize the weighted average delay over the fractions q_{i,j} of load at site i sent to site j, subject to (s.t.) constraints on q_{i,j}, where the formulation is written in terms of:
• the normalized exogenous arrival rate at site i
• the total exogenous arrival rate at all sites
• the total arrival rate at site j
• the utilization at site j with K_j servers
• the average processing delay, using a multiple-server approximation
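As a rough illustration of two of the quantities named in the model, the sketch below computes the total arrival rate and the utilization at each site. It assumes type-independent reallocation, that the total arrival rate at site j is the q-weighted sum of exogenous arrivals, and that utilization is offered work divided by the number of servers; these forms and the name `site_loads` are assumptions for illustration, not expressions taken from the slides.

```python
def site_loads(lam, beta, q, K):
    """Return (total_rate, utilization) per site.

    lam[i][k]: arrival rate of type-k jobs at site i
    beta[k]:   service time of a type-k job per server
    q[i][j]:   fraction of site i's load sent to site j (type-independent)
    K[j]:      number of servers at site j
    """
    n = len(K)
    total_rate = [0.0] * n
    utilization = [0.0] * n
    for j in range(n):
        work = 0.0
        for i in range(n):
            for k, rate in enumerate(lam[i]):
                total_rate[j] += rate * q[i][j]       # jobs per unit time
                work += rate * q[i][j] * beta[k]      # server-time per unit time
        # Assumed form: utilization = offered work / number of servers.
        utilization[j] = work / K[j]
    return total_rate, utilization


# Two sites, two job types; site 1 sends half of its load to site 0.
rates, rho = site_loads(
    lam=[[1.0, 0.5], [0.5, 0.5]],
    beta=[1.0, 2.0],
    q=[[1.0, 0.0], [0.5, 0.5]],
    K=[4, 4],
)
```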
Toy Example of Distributed DC with Reallocation
[Figure: five-site network, sites 1-5, with link labels of 1.]
Exogenous arrival rates: λ_1 = 2, λ_2 = 1.5, λ_3 = 1, λ_4 = 1.5, λ_5 = 2

Node    λ       μ    τ    Delay
1       1.814   3    0    0.8432
1->3    0.186   -    1    1.6143
2       1.5     3    0    0.6667
3       1       3    0    0.6143
4       1.5     3    0    0.6667
5       1.814   3    0    0.8432
5->3    0.186   -    1    1.6143

Q =
0.907   0   0.093   0   0
0       1   0       0   0
0       0   1       0   0
0       0   0       1   0
0       0   0.093   0   0.907
Weighted Delay = 0.7842
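The weighted delay can be checked numerically with a short sketch. It assumes a per-site processing delay of 1/(μ_j - Λ_j), where Λ_j is the total arrival rate at site j; this single-server form is an inference from the toy example's table values rather than something the slides state. The function name `weighted_delay` is hypothetical, and only the delays for 1->3 and 5->3 come from the table (other off-diagonal entries are placeholders that carry no load).

```python
def weighted_delay(lam, mu, tau, Q):
    """Arrival-weighted mean of (transfer delay + processing delay)."""
    n = len(lam)
    # Total arrival rate at each site after reallocation.
    load = [sum(lam[i] * Q[i][j] for i in range(n)) for j in range(n)]
    # Assumed processing-delay model: 1 / (mu_j - load_j).
    proc = [1.0 / (mu[j] - load[j]) for j in range(n)]
    total = sum(lam)
    return sum(
        lam[i] * Q[i][j] * (tau[i][j] + proc[j])
        for i in range(n) for j in range(n) if Q[i][j] > 0
    ) / total


lam = [2.0, 1.5, 1.0, 1.5, 2.0]
mu = [3.0] * 5
# Delay 0 within a site; 1 between sites (placeholder except 1->3 and 5->3).
tau = [[0.0 if i == j else 1.0 for j in range(5)] for i in range(5)]
Q = [
    [0.907, 0.0, 0.093, 0.0, 0.0],
    [0.0,   1.0, 0.0,   0.0, 0.0],
    [0.0,   0.0, 1.0,   0.0, 0.0],
    [0.0,   0.0, 0.0,   1.0, 0.0],
    [0.0,   0.0, 0.093, 0.0, 0.907],
]
print(round(weighted_delay(lam, mu, tau, Q), 4))  # 0.7842
```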
Large-Scale Topology
32-node, 44-link network used in the experiment:
[Figure: map of the network; nodes are labeled SEA, SAI, SAL, SFO, DEN, LOS, PHO, PIT, RAL, ATL, NOR, HOU, JAC, TAM, MIA, PHI, BAL, WAS, ALB, CIN, NAS, ELP, CLE, CHI, KAN, DET, SPR, LAS, BUF, MIL, NYC, BOS, and each link carries a numeric delay label.]
• Each link is associated with a delay τ_{i,j}.
• The centralized datacenter is located in CHI.
Comparison of Delays
Nearly-uniform job arrival rates: λ_i = 1.1λ if i is odd, 0.9λ if i is even
Non-uniform job arrival rates: λ_i = 1.5λ if i is odd, 0.5λ if i is even
μ_i = 1 for all i
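A minimal sketch of how the two arrival patterns could be generated, assuming sites are numbered 1 to n as on the slides; the function name `alternating_rates` is hypothetical.

```python
def alternating_rates(n, lam, odd_factor, even_factor):
    """Arrival rate per site, with sites numbered 1..n."""
    return [
        lam * (odd_factor if i % 2 == 1 else even_factor)
        for i in range(1, n + 1)
    ]


nearly_uniform = alternating_rates(32, 1.0, 1.1, 0.9)
non_uniform = alternating_rates(32, 1.0, 1.5, 0.5)
```

For an even number of sites both patterns carry the same total load (nλ), so any difference in delay between them reflects how unevenly the load is spread rather than how much load there is.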
Comparison of Elasticities
In each trial:
• Moderate load variation: λ_i ~ Uniform(0.25, 1) for each i
• High load variation: λ_i ~ Uniform(0, 1.5) for each i
Then rescale the λ_i so that the system-wide utilization is fixed at 0.5.
μ_i = 1 for each i
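The sampling and rescaling step for one trial might look like the sketch below. It assumes that system-wide utilization means the sum of arrival rates divided by the sum of capacities; that definition and the name `sample_rates` are assumptions for illustration.

```python
import random


def sample_rates(n, low, high, target_util=0.5, mu=1.0, rng=None):
    """Draw n arrival rates from Uniform(low, high), then rescale them so
    that sum(rates) / (n * mu) equals target_util."""
    rng = rng or random.Random()
    raw = [rng.uniform(low, high) for _ in range(n)]
    total = sum(raw)
    if total == 0:
        raise ValueError("all sampled rates are zero; cannot rescale")
    scale = target_util * n * mu / total
    return [scale * x for x in raw]


rng = random.Random(0)  # fixed seed so a trial can be repeated
moderate = sample_rates(32, 0.25, 1.0, rng=rng)
high = sample_rates(32, 0.0, 1.5, rng=rng)
```

After rescaling, an individual site's rate in the high-variation case can exceed its capacity of 1 even though the system as a whole is only half utilized.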
Multiple Job Types
• Type-independent: jobs are reallocated from i to j with fraction q_{i,j}, regardless of their type
• Type-dependent: type-k jobs are reallocated from i to j with fraction q^k_{i,j}
Example with 2 job types:
Distributed Algorithms for Load Reallocation
• Basic idea:
- Each site i computes the impact on the global objective function of sending an additional small fraction of its jobs to each site j, i.e., the derivative a_{i,j}.
- Min-rule: site i determines the site jmin(i) for which a_{i,jmin(i)} is the minimum derivative. It then reallocates load from the other sites to site jmin(i).
- Max-rule: site i determines the site jmax(i) for which a_{i,jmax(i)} is the maximum derivative. It then reallocates load from site jmax(i) to the other sites.
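One way to picture the derivative is as a finite difference: bump one fraction by a small amount and see how the global objective changes. The sketch below does this on the five-site toy example. It assumes a processing delay of 1/(μ_j - Λ_j), which is consistent with the toy example's numbers but is not stated on the slides; a real site would presumably work from measurements instead. The names `weighted_delay` and `derivative` are hypothetical.

```python
def weighted_delay(lam, mu, tau, Q):
    n = len(lam)
    load = [sum(lam[i] * Q[i][j] for i in range(n)) for j in range(n)]
    proc = [1.0 / (mu[j] - load[j]) for j in range(n)]  # assumed delay model
    return sum(
        lam[i] * Q[i][j] * (tau[i][j] + proc[j])
        for i in range(n) for j in range(n) if Q[i][j] > 0
    ) / sum(lam)


def derivative(lam, mu, tau, Q, i, j, eps=1e-6):
    """Finite-difference estimate of d(objective)/d(q[i][j])."""
    bumped = [row[:] for row in Q]
    bumped[i][j] += eps
    base = weighted_delay(lam, mu, tau, Q)
    return (weighted_delay(lam, mu, tau, bumped) - base) / eps


lam = [2.0, 1.5, 1.0, 1.5, 2.0]
mu = [3.0] * 5
# Inter-site delay 1 is taken from the table for 1->3 and 5->3 only.
tau = [[0.0 if r == c else 1.0 for c in range(5)] for r in range(5)]
Q = [
    [0.907, 0.0, 0.093, 0.0, 0.0],
    [0.0,   1.0, 0.0,   0.0, 0.0],
    [0.0,   0.0, 1.0,   0.0, 0.0],
    [0.0,   0.0, 0.0,   1.0, 0.0],
    [0.0,   0.0, 0.093, 0.0, 0.907],
]
a_11 = derivative(lam, mu, tau, Q, 0, 0)  # site 1 keeps more load
a_13 = derivative(lam, mu, tau, Q, 0, 2)  # site 1 sends more to site 3
```

At the Q reported for the toy example the two estimates come out nearly equal, which is what one would expect when site 1 is already splitting its load between sites 1 and 3 in the best way.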
Distributed Algorithm with “min-rule”
At site i: Compute g_{i,j} = a_{i,j} - a_{i,jmin(i)} for all j ∈ N_i,
compute g_i = ∑_{j ∈ N_i, j ≠ jmin(i)} g_{i,j}, and
d = min{k, (1 - ρ_{jmin(i)}) K_{jmin(i)} / (λ_i β g_i)},
where jmin(i) = argmin_{j ∈ N_i} a_{i,j}
At site i: Evaluate h_{i,j} = min{q_{i,j}, d g_{i,j}} for all j ≠ jmin(i), j ∈ N_i,
and h_{i,jmin(i)} = - ∑_{j ≠ jmin(i), j ∈ N_i} h_{i,j}
At site i: Update q_{i,j} = q_{i,j} - h_{i,j} for all j ∈ N_i; q_{i,j} = 0 for j ∉ N_i
Converged?
- No: collect a new measurement and go to the next site (e.g., i = i+1 mod N)
- Yes: detect changes in delay and utilization
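A single min-rule update at one site can be sketched as follows, assuming the derivatives a_{i,j} and utilizations ρ_j are already available as inputs. The function name `min_rule_step`, the dictionary layout, and the choice to leave q unchanged when all derivatives are equal are assumptions for illustration.

```python
def min_rule_step(q_i, a_i, neighbors, rho, K, lam_i, beta, k):
    """One min-rule update of site i's fractions.

    q_i, a_i: dicts keyed by site j in N_i (fractions and derivatives)
    rho, K:   utilization and server count per site
    k:        cap on the step size d
    """
    jmin = min(neighbors, key=lambda j: a_i[j])
    g = {j: a_i[j] - a_i[jmin] for j in neighbors if j != jmin}
    g_total = sum(g.values())
    if g_total == 0:
        return dict(q_i)  # all derivatives equal: nothing to move (assumed)
    # Step size, capped by k and by the spare capacity at jmin.
    d = min(k, (1 - rho[jmin]) * K[jmin] / (lam_i * beta * g_total))
    # Never remove more than a site currently holds.
    h = {j: min(q_i[j], d * g[j]) for j in g}
    new_q = {j: q_i[j] - h[j] for j in g}
    new_q[jmin] = q_i[jmin] + sum(h.values())  # h_{i,jmin} = -sum of the rest
    return new_q


new_q = min_rule_step(
    q_i={0: 0.5, 1: 0.3, 2: 0.2},
    a_i={0: 2.0, 1: 1.0, 2: 1.5},
    neighbors=[0, 1, 2],
    rho={0: 0.9, 1: 0.5, 2: 0.5},
    K={0: 1, 1: 1, 2: 1},
    lam_i=1.0, beta=1.0, k=0.1,
)
```

In this example load moves toward site 1, which has the smallest derivative, and the fractions still sum to 1 after the update.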
Distributed Algorithm with “max-rule”
At site i: Compute g_{i,j} = max{a_{i,jmax(i)} - a_{i,j}, 0} for all j ∈ N_i
and compute n_{i,j} = (1 - ρ_j) K_j / (λ_i β) for all j ≠ jmax(i), j ∈ N_i,
where jmax(i) = argmax_{j: q_{i,j} > 0} a_{i,j}
At site i: Compute d = min{k, q_{i,jmax(i)} / ∑_{j ∈ N_i} g_{i,j}}
Evaluate h_{i,j} = min{n_{i,j}, d g_{i,j}} for all j ≠ jmax(i), j ∈ N_i,
and h_{i,jmax(i)} = - ∑_{j ≠ jmax(i), j ∈ N_i} h_{i,j}
At site i: Update q_{i,j} = q_{i,j} + h_{i,j} for all j ∈ N_i; q_{i,j} = 0 for j ∉ N_i
Converged?
- No: collect a new measurement and go to the next site (e.g., i = i+1 mod N)
- Yes: detect changes in delay and utilization
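A single max-rule update at one site can be sketched in the same spirit, again assuming the derivatives a_{i,j} and utilizations ρ_j are given. The function name `max_rule_step`, the dictionary layout, and the choice to leave q unchanged when no site has a smaller derivative are assumptions for illustration.

```python
def max_rule_step(q_i, a_i, neighbors, rho, K, lam_i, beta, k):
    """One max-rule update of site i's fractions.

    q_i, a_i: dicts keyed by site j in N_i (fractions and derivatives)
    rho, K:   utilization and server count per site
    k:        cap on the step size d
    """
    # jmax is chosen only among sites that currently receive load from i.
    active = [j for j in neighbors if q_i[j] > 0]
    jmax = max(active, key=lambda j: a_i[j])
    g = {j: max(a_i[jmax] - a_i[j], 0.0) for j in neighbors if j != jmax}
    g_total = sum(g.values())  # g for jmax itself is zero
    if g_total == 0:
        return dict(q_i)  # no better destination: nothing to move (assumed)
    # Spare capacity at each destination, in units of site i's load fraction.
    n = {j: (1 - rho[j]) * K[j] / (lam_i * beta) for j in g}
    # Step size, capped by k and by the fraction currently held at jmax.
    d = min(k, q_i[jmax] / g_total)
    h = {j: min(n[j], d * g[j]) for j in g}
    new_q = {j: q_i[j] + h[j] for j in g}
    new_q[jmax] = q_i[jmax] - sum(h.values())  # h_{i,jmax} = -sum of the rest
    return new_q


new_q = max_rule_step(
    q_i={0: 0.5, 1: 0.3, 2: 0.2},
    a_i={0: 2.0, 1: 1.0, 2: 1.5},
    neighbors=[0, 1, 2],
    rho={0: 0.9, 1: 0.5, 2: 0.5},
    K={0: 1, 1: 1, 2: 1},
    lam_i=1.0, beta=1.0, k=0.1,
)
```

Here load moves away from site 0, which has the largest derivative, toward the other sites in proportion to how much smaller their derivatives are.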
Scenario 1: Load Increases by 50% at One Site
Scenario 2: Load Increases by 100% at One Site
Scenario 3: Load Increases by 200% at One Site
Scenario 4: Two Back-to-Back Overloaded Sites
Scenario 5: Noisy versus Perfect Measurements
Conclusions and Further Work
• Load reallocation provides a key instrument for achieving elasticity and reducing latency simultaneously
• Only processing-intensive applications have been considered so far; other applications will be considered in further work