Slide - MESL

Energy Efficient Geographical Load Balancing
via Dynamic Deferral of Workload
Muhammad Abdullah Adnan
Ryo Sugihara (Amazon.com)
Rajesh K. Gupta
Department of CSE
University of California San Diego (UCSD)
Adnan . IEEE CLOUD 2012
Data Centers: Energy Consumption
• Energy expenses are becoming increasingly important
– Data centers: 61 million MWh per year, costing about 4.5 billion dollars; growing very fast
– Millions of dollars for companies every year
• Increasing energy prices
and rise of cloud computing
– Energy efficient Cloud
• Significant research on
improving energy efficiency
Geographical Load Balancing
• Cloud Computing can be utilized for energy efficient
computing.
– Increasing energy prices.
– Ability to dynamically track these price variations.
• Geographical Load Balancing techniques have been
suggested for data centers hosting cloud computation
– exploit the electricity price differences across regions.
Qureshi et al.
[ACM SIGCOMM 2009]
• Geographical Load Balancing
– reducing the electricity cost in a wholesale market environment.
– Lower electricity bill by adapting the load balancing with dynamic
electricity price variation.
• Electricity Markets
– Day-ahead markets (futures)
• Hourly price predicted for the following day
– Real-time markets (spot): the focus of our work
• Prices are calculated every five minutes, based on actual conditions rather than expectations.
• More volatile, which provides opportunities for savings.
Buchbinder et al.’s Approach
[IFIP Networking 2011]
• Online algorithms for migrating jobs between data centers
– fundamental tradeoff between energy and bandwidth costs.
• Sophisticated methods to reduce the computational
complexity of the proposed heuristics.
• Drawbacks
– Elementary cost vectors
• Large number of iterations
– Discretization of continuous update rule
• Computationally costly
– Bounded Competitive Ratio
• constant/fixed workload - √
• varying workload – x
– No deadline requirement.
Liu et al.’s Algorithm
[ACM SIGMETRICS 2011]
• Distributed algorithms for Geographical Load Balancing
– Multiple sources for workload.
– Incorporated capacity provisioning inside data centers
• Only homogeneous servers
• Investigated how renewable energy can be used to lower
the electricity price of brown energy.
• Drawbacks
– No bound on the maximum delay.
– No workload migration.
Dynamic Deferral
• Cloud Computing and Mobile Computing
– More and more computation has been outsourced to the cloud.
– Different types of workload
• Delay sensitive, response time/throughput guarantee, Completion
time/deadline requirement.
• Service level agreement (SLA)
– Latency requirement
– Often has some flexibility
• We use the flexibility from different SLAs for geographical
load balancing to reduce energy consumption.
– Defer some of the workload to execute later when electricity
price is low
– Utilize the slackness in the execution of jobs for energy savings.
Assumptions
• Temporal and Geographical variation of electricity
prices.
– Variation is unpredictable.
– Migrate jobs between data centers
• Cloud service providers keep many replicas of their data.
• We consider data centers as computation units.
– Homogeneous/heterogeneous
• Workloads arrive at a central dispatcher.
– Dispatcher cannot store workload
– Makes load balancing decision
[Figure: cloud and dispatcher]
Geographical Load Balancing
[Figure: assignment x_{i,d,t} to data center i; migration z_{i,j,d,t} from data center i to data center j]
Model Formulation
• Workload Model
– Workload $L_t$ released at time t has deadline $D_t$
• Cost Model
– Energy cost
• Proportional to the workload: $C_{i,t}(y_{i,t}) = \alpha_i + \beta_{i,t}\, y_{i,t}$
• Piecewise linear function
– Bandwidth cost
• Cost of migration: $B_{i,t}(z_{i,j,t}) = b_{i,j}\, z_{i,j,t}$
[Figure: workload L_t over slots t, t+1, ……, t+D, with assignment x_{i,d,t} to data center i and migration z_{i,j,d,t} from i to j]
Model Formulation
• Assumption: uniform deadline
– Deadline is same for all the jobs
• The net amount of workload executed at data center i at time t is assigned + migrated in - migrated out:
$$y_{i,t} = x_{i,t} + \sum_{j=1}^{n}\sum_{d=1}^{D} z_{j,i,d,t-d} - \sum_{j=1}^{n}\sum_{d=1}^{D} z_{i,j,d,t-d}$$
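The net-workload expression can be checked on a toy instance. The sketch below is only an illustration: the array layout `z[i][j][d][s]` (source, destination, delay, slot) and the helper `empty_migration` are assumptions made for this example, not part of the slides.

```python
def net_workload(x, z, D):
    """Return y[i][t] = x[i][t] + migrated-in - migrated-out.

    x[i][t]       : workload assigned to data center i for slot t
    z[i][j][d][s] : workload migrated from i to j with delay d, indexed at
                    slot s (assumed layout), so it affects slot s + d
    """
    n = len(x)
    T = len(x[0])
    y = [[0.0] * T for _ in range(n)]
    for i in range(n):
        for t in range(T):
            total = x[i][t]
            for j in range(n):
                for d in range(1, D + 1):
                    s = t - d
                    if s < 0:
                        continue  # nothing was decided before slot 0
                    total += z[j][i][d][s]  # migrated in from j
                    total -= z[i][j][d][s]  # migrated out to j
            y[i][t] = total
    return y


def empty_migration(n, D, T):
    """Hypothetical helper: an all-zero z with shape n x n x (D+1) x T."""
    return [[[[0.0] * T for _ in range(D + 1)] for _ in range(n)]
            for _ in range(n)]


# Two data centers, deadline of one slot, three slots.
n, D, T = 2, 1, 3
x = [[0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
z = empty_migration(n, D, T)
z[0][1][1][0] = 2.0  # 2 units move from DC 0 to DC 1 and land in slot 1
y = net_workload(x, z, D)
```

In this instance the 5 units assigned to data center 0 in slot 1 are split into 3 executed locally and 2 executed at data center 1, so the total executed workload is unchanged.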
Offline Formulation
• Future prices known => there exists an optimal solution without migration.
– Dispatcher can always make the correct assignment.
[Formulation (equations not recovered). Objective terms: execution cost, migration cost. Constraints: total assignment equals total released workload; total migration cannot exceed total assignment.]
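Why migration is unnecessary offline can be seen in a small sketch. It assumes a purely linear energy cost (price times workload, ignoring the fixed term) and no capacity limits; those simplifications and the function name `offline_schedule` are assumptions for illustration only. With all prices known, each released workload goes straight to the cheapest (data center, slot) pair inside its deadline window.

```python
def offline_schedule(prices, loads, D):
    """Assign each released workload to the cheapest (data center, slot)
    within its deadline window, using fully known prices.

    prices[i][t] : electricity price at data center i in slot t
    loads[t]     : workload released at slot t
    D            : uniform deadline in slots
    """
    n = len(prices)
    T = len(loads)
    plan = []          # entries: (release slot, data center, exec slot, load)
    total_cost = 0.0
    for t, load in enumerate(loads):
        if load == 0:
            continue
        last = min(t + D, T - 1)  # window is t .. t+D, clipped to the horizon
        best_price, dc, slot = min(
            (prices[i][s], i, s)
            for i in range(n)
            for s in range(t, last + 1)
        )
        plan.append((t, dc, slot, load))
        total_cost += best_price * load
    return plan, total_cost


prices = [[5.0, 3.0, 4.0],
          [6.0, 2.0, 7.0]]
loads = [10.0, 0.0, 4.0]
plan, cost = offline_schedule(prices, loads, D=1)
```

Because the assignment is already the cheapest feasible choice, any later migration would only add bandwidth cost.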
Online Challenges
[Figure: timeline from 0 to the current time t; x_t and z_t are decided online]
• Unpredictable future electricity cost.
– How much to execute at current time?
– How much to defer to execute later?
– How much to migrate and where?
• Future workload is also unknown
– Online algorithm
Our Approach
• Decouple migration from assignment.
• @ Dispatcher – Assignment
– based on the current electricity prices and
future price predictions.
• @ DC - Migration Decision
– The predicted electricity prices by the
dispatcher may contain prediction errors.
– Data centers correct that error by migrating
jobs between each other at later time slots.
@ Dispatcher – Assignment
• The dispatcher distributes the workload among n data centers.
[Figure: workload L_t released at slot t is split across DC1, DC2, ..., DCn, each with slots t, t+1, ……, t+D]
@ DC
• Adjust assignment with dynamic electricity price
variation.
– Moving workload to earlier time slots.
– Migrating workload between data centers.
Formulation w/o Migration
• Workload assigned at later time slots can only
be moved to previous time slots.
[Formulation (equations not recovered). Constraint annotations: total execution should be equal; execution cannot be less than unexecuted workload.]
[Figure: Data Center 1, Data Center 2, ..., Data Center n, each with slots t, t+1, ……, t+D; unexecuted workload marked]
Formulation with Migration
• Workload can migrate between data centers
[Formulation (equations not recovered). Constraint annotation: every data center does some work.]
[Figure: Data Center 1, Data Center 2, ..., Data Center n, each with slots t, t+1, ……, t+D; unexecuted workload marked]
@ DC - Migration Decision
[Figure: per-slot workload (t, t+1, ……, t+D) at a data center, combining assigned workload, unexecuted workload, migrated-in workload and migrated-out workload]
How good is the algorithm?
Lemma
No online algorithm has constant competitive ratio with
respect to the offline formulation.
Proof
Adversary Method
$\beta_{t+i} = K'\beta_t$, $\beta_t = K\beta_{t+D}$, $K' > K$
CASE 1: $x_{D,t} \neq 0$
$L_t = L_{t+1} = M$
Competitive Ratio = Online Cost / Offline Cost = $K'/K$, arbitrary
[Figure: schedules over slots t, t+1, ……, t+D for the offline solution and for any online algorithm]
How good is the algorithm?
Lemma
No online algorithm has constant competitive ratio with
respect to the offline formulation.
Proof
Adversary Method
$\beta_{t+i} = K'\beta_t$, $\beta_t = K\beta_{t+D}$, $K' > K$
CASE 2: $x_{D,t} = 0$
$L_t = M$
Competitive Ratio = Online Cost / Offline Cost = $K$, arbitrary
[Figure: schedules over slots t, t+1, ……, t+D for the offline solution and for any online algorithm]
How good is the algorithm?
• Since the competitive ratio cannot be bounded, we
compare the online algorithm with much simpler online
algorithms.
• Suppose

Online Algorithm   Prediction Error   Migration
AEM                √                  √
AE                 √                  x
A                  x                  x
How good is the algorithm?
Lemma
Cost(AEM) ≤ Cost(AE)
Proof
• Let Δy = amount of migrated workload and y = amount of non-migrated workload. Then
  Cost_AEM(y) = Cost_AE(y)
• Migration happens only when
  cost of execution of Δy at earlier time slot + cost of migration of Δy ≤ cost of execution of Δy at later time slot,
  so Cost_AEM(Δy) ≤ Cost_AE(Δy)
• Cost(AEM) = Cost_AEM(y) + Cost_AEM(Δy)
  ≤ Cost_AE(y) + Cost_AE(Δy)
  ≤ Cost(AE)
[Figure: Data Center 1, Data Center 2, ..., Data Center n, each with slots t, t+1, ……, t+D; unexecuted workload marked]
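The migration condition used in the proof can be written as a one-line test. This is a minimal sketch assuming linear per-unit prices and a per-unit bandwidth cost; the function and parameter names are hypothetical and do not come from the slides.

```python
def should_migrate(dy, price_earlier, bandwidth_cost, price_later):
    """True when executing dy at the earlier slot, plus moving it,
    costs no more than executing dy at the later slot."""
    cost_with_migration = price_earlier * dy + bandwidth_cost * dy
    cost_without_migration = price_later * dy
    return cost_with_migration <= cost_without_migration


def cost_after_decision(dy, price_earlier, bandwidth_cost, price_later):
    """Cost actually paid for dy once the migration rule is applied."""
    if should_migrate(dy, price_earlier, bandwidth_cost, price_later):
        return (price_earlier + bandwidth_cost) * dy
    return price_later * dy


# Cheap earlier slot: migration pays off (2*4 + 1*4 = 12 versus 5*4 = 20).
cheap = cost_after_decision(4.0, 2.0, 1.0, 5.0)
# Expensive earlier slot: the workload stays where it is (cost 20).
stay = cost_after_decision(4.0, 4.5, 1.0, 5.0)
```

Either way the cost paid for Δy never exceeds the cost of leaving it at the later slot, which is the per-workload inequality the lemma sums up.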
How good is the algorithm?
Lemma
Cost(AEM) ≤ Cost(AE)
+
Lemma
Cost(AE) ≤ (1+ε) Cost(A)
Proof
Prediction error, ε
Predicted price, β’
Actual price, β
β’ – ε ≤ β ≤ β’ + ε
$$\frac{\mathrm{Cost}(AE)}{\mathrm{Cost}(A)} = \frac{\alpha + \beta' y}{\alpha + \beta y} \le 1 + \frac{\varepsilon}{\beta} \le 1 + \varepsilon$$
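The chain of inequalities can be expanded one step at a time. The derivation below is a sketch under assumptions that the slide does not state: $\alpha \ge 0$, $y > 0$, and, for the last step only, prices normalized so that $\beta \ge 1$.

```latex
\begin{align*}
\frac{\mathrm{Cost}(AE)}{\mathrm{Cost}(A)}
  &= \frac{\alpha + \beta' y}{\alpha + \beta y} \\
  &\le \frac{\alpha + (\beta + \varepsilon)\, y}{\alpha + \beta y}
     && \text{since } \beta' - \varepsilon \le \beta \\
  &= 1 + \frac{\varepsilon y}{\alpha + \beta y} \\
  &\le 1 + \frac{\varepsilon}{\beta}
     && \text{since } \alpha \ge 0 \text{ (assumed)} \\
  &\le 1 + \varepsilon
     && \text{if } \beta \ge 1 \text{ (assumed)}
\end{align*}
```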
How good is the algorithm?
Lemma
Cost(AEM) ≤ Cost(AE)
+
Lemma
Cost(AE) ≤ (1+ε) Cost(A)
=
Theorem
Cost(AEM) ≤ (1+ε) Cost(A)
Electricity Price Prediction
• We model future prices within a 24-hour time frame as Gaussian random variables with
– Means: prices predicted by a moving average over current-day prices.
– Variance: estimated from history by the weighted average price prediction filter.
• By using two different methods for mean and variance,
we exploit both temporal and historical correlation of
electricity prices.
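A rough sketch of this two-part predictor follows. The slides do not give the window length, the weights, or the exact form of the filter, so the moving-average window, the weighted-variance formula and all function names here are assumptions chosen only to illustrate the idea.

```python
import random
from statistics import mean


def moving_average_mean(today_prices, window=3):
    """Predicted mean: average of the last `window` prices seen today."""
    return mean(today_prices[-window:])


def weighted_variance(history, weights):
    """Variance estimate: weighted average of squared deviations of
    historical prices from their weighted mean (assumed filter form)."""
    total = sum(weights)
    mu = sum(w * p for w, p in zip(weights, history)) / total
    return sum(w * (p - mu) ** 2 for w, p in zip(weights, history)) / total


def sample_future_price(today_prices, history, weights, rng, window=3):
    """Draw one Gaussian price using the two estimates above."""
    mu = moving_average_mean(today_prices, window)
    sigma = weighted_variance(history, weights) ** 0.5
    return rng.gauss(mu, sigma)


today = [30.0, 32.0, 34.0, 36.0]   # prices observed so far today
history = [10.0, 20.0]             # same-slot prices on earlier days
weights = [1.0, 3.0]               # more weight on the more recent day
mu = moving_average_mean(today, window=2)
var = weighted_variance(history, weights)
draw = sample_future_price(today, history, weights, random.Random(0), window=2)
```

Here the mean tracks the current day (temporal correlation) while the variance comes from earlier days (historical correlation), mirroring the split described above.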
Evaluation - Electricity Price
• Four data centers located in four different regions.
[Figure: five-minute locational marginal electricity prices in the real-time market on 15 February 2012 for the four regions]
Evaluation - Workload
• Two MapReduce Traces from Facebook
– Cluster of 600 machines over 24 hours.
– Time slot length of 5 minutes because electricity
prices vary with an interval of 5 minutes.
[Figure panels: Workload A, Workload B]
Evaluation - Deadline
• We vary the deadline from 1 to 12 slots and compare the cost reduction against the greedy algorithm without deferral by Qureshi et al.
• Dynamic deferral provides around 30% cost savings for deadlines of 12 slots (1 hour), and 5% cost savings even for a deadline of one slot.
[Figure panels: Workload A, Workload B]
Evaluation - Deadline
• We compare the total cost of the algorithms AEM and AE.
• The total cost of AEM is always lower than that of AE, consistent with the lemma Cost(AEM) ≤ Cost(AE).
• As the deadline increases, prediction error increases (AE), but cost decreases (AEM) thanks to the flexibility of migration.
[Figure panels: Workload A, Workload B]
Non-uniform Deadline
• Workload is decomposed according to its associated deadline: $L_{d,t}$, $0 \le d \le D$
• Then we replace the release constraints in the formulations by
$$\sum_{d=0}^{D} L_{d,t} = L_t$$
• Deadline assignment by k-means clustering based on sizes (map, shuffle and reduce bytes)
– 15.64% cost reduction for Workload A
– 9.23% cost reduction for Workload B
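The decomposition can be illustrated with a small sketch. The job list format `(size, deadline)` and the function name are assumptions for this example; how deadlines are obtained (k-means on job sizes in the slides) is outside its scope.

```python
def decompose_by_deadline(jobs, D):
    """Group the jobs released in one slot t by deadline class.

    jobs : list of (size, deadline) pairs, with 0 <= deadline <= D
    Returns L where L[d] is the total workload with deadline d.
    """
    L = [0.0] * (D + 1)
    for size, d in jobs:
        if not 0 <= d <= D:
            raise ValueError(f"deadline {d} outside 0..{D}")
        L[d] += size
    return L


jobs = [(4.0, 0), (1.5, 2), (2.5, 2)]   # released in the same slot t
L_dt = decompose_by_deadline(jobs, D=3)
L_t = sum(size for size, _ in jobs)
```

Summing the per-deadline components recovers the total released workload, which is the replaced release constraint.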
Summary of Findings
• Formulation for geographical load balancing with
deferral
– Uniform deadline
– Non-uniform deadline
• Characterization of optimal offline solution
• Online Algorithm
– Formulation with migration
– Formulation without migration
• Future work
– Heterogeneity in data centers/cloud.
– Availability of renewable energy.
Thank You
?