Power Cost Reduction in Distributed Data Centers

Yuan Yao
University of Southern California
Joint work with Longbo Huang, Abhishek Sharma, Leana Golubchik and Michael Neely
IBM Student Workshop for Frontiers of Cloud Computing 2011
Paper to appear in Infocom 2012
Background and motivation
• Data centers are growing in number and size…
– Number of servers: Google (~1M)
– Data centers built in multiple locations
• IBM owns and operates hundreds of data centers worldwide
• …and in power cost!
– Google spends ~$100M/year on power
– Goal: reduce power cost while maintaining QoS
Existing Approaches
• Power efficient hardware design
• System design/Resource management
– Use existing infrastructure
– Exploit options in routing and resource management of data centers
Existing Approaches
• Power cost reduction through algorithm design
– Server level: power-speed scaling [Wierman09]
– Data center level: rightsizing [Gandhi10, Lin11]
– Inter data center level: Geographical load balancing [Qureshi09,
Liu11]
[Figure: with geographical load balancing, a job can be routed to a location where power costs $2/kWh instead of $5/kWh]
Our Approach: SAVE
• We provide a framework that exploits options at all of these levels
• Server level + data center level + inter data center level options, combined with the temporal volatility of power prices, give the StochAstic power redUction schEme (SAVE)
[Figure: jobs arrive, are routed and scheduled by SAVE, and are served]
Our Model: data center and workload
• M geographically distributed data centers
• Each data center contains a front end server and a back end cluster
• Workloads Ai(t) (i.i.d.) arrive at front end servers and are routed to one of the back end clusters
[Figure: front end i routes jobs to back end cluster j at rate μij(t)]
Our Model: server operation and cost
• Back end cluster of data center i contains Ni servers
– Ni(t) of them are active at time t
• Service rate of active servers: bi(t) ∈ [0, bmax]
• Power price at data center i: pi(t) (i.i.d.)
• Power usage at data center i: Pi(t), determined by Ni(t) and bi(t)
• Power cost at data center i: fi(t) = pi(t)·Pi(t)
Our Model: two time scale
• The system we model operates on two time scales
– At t = kT, change the number of active servers Nj(t)
– In every time slot, change the service rate bj(t)
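The two time scale operation above can be sketched as a control loop; the frame length T and the decision callbacks below are illustrative placeholders, not the paper's interface.

```python
# Two time scale control loop (skeleton): the active-server count is updated
# only at frame boundaries t = kT, while the service rate is updated every slot.

T = 5  # frame length, illustrative


def run(num_slots, pick_servers, pick_rate):
    """pick_servers/pick_rate are placeholder decision callbacks."""
    history = []
    n_active = 0
    for t in range(num_slots):
        if t % T == 0:                 # t = kT: re-decide N_j(t)
            n_active = pick_servers(t)
        rate = pick_rate(t)            # every slot: re-decide b_j(t)
        history.append((t, n_active, rate))
    return history


h = run(7, pick_servers=lambda t: 3, pick_rate=lambda t: 1.0)
```

Note that between frame boundaries the server count stays frozen even if the rate changes every slot; that is the point of the two time scale design.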
Our Model: summary
• Input: power prices pi(t), job arrivals Ai(t)
• Two time scale control actions: Nj(t) chosen every T slots; μij(t) and bj(t) chosen every slot
• Queue evolution: Qj(t+1) = max[Qj(t) − Nj(t)·bj(t), 0] + Σi μij(t)
• Objective: minimize the time average power cost, subject to all constraints on Π and queue stability
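The queue dynamics can be sketched in a few lines; the max[Q − N·b, 0] plus routed-arrivals form below is an assumed standard form consistent with the model's definitions, and all names are illustrative.

```python
# Sketch of the assumed backlog dynamics at back end j:
#   Q_j(t+1) = max[Q_j(t) - N_j(t) * b_j(t), 0] + sum_i mu_ij(t)
# N_j(t) active servers each serve at rate b_j(t); arrivals are the routed mu_ij(t).

def queue_update(q, n_active, rate, arrivals):
    """One-slot backlog update for a single back end cluster."""
    served = n_active * rate                     # total service this slot
    return max(q - served, 0.0) + sum(arrivals)


# Example: backlog 10, 3 servers at rate 2, arrivals 1 + 2 from two front ends.
q_next = queue_update(10.0, n_active=3, rate=2.0, arrivals=[1.0, 2.0])  # 7.0
```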
SAVE: intuitions
• SAVE operates at both front end and back end
• Front end routing:
– When the back end backlog Qj(t) is small, choose μij(t) > 0
• Back end server management:
– Choose small Nj(t) and bj(t) to reduce the power cost fj(t)
– When the backlog Qj(t) is large, choose large Nj(t) and bj(t) to stabilize the queue
SAVE: how it works
• Front end routing:
– In every time slot t, choose μij(t) to maximize a queue-differential weight, i.e., route jobs toward back ends with small backlogs Qj(t)
• Back end server management: choose a control parameter V > 0
– At time slots t = kT, choose Nj(t) to minimize V × (expected power cost over the frame) − Qj(t) × (expected service offered)
– In all time slots τ, choose bj(τ) to minimize V·fj(τ) − Qj(τ)·Nj(τ)·bj(τ)
• Serve jobs and update queue sizes
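A minimal sketch of SAVE-style decisions, assuming drift-plus-penalty forms typical of Lyapunov optimization (the exact expressions are in the Infocom 2012 paper); the quadratic power curve and all parameter names are placeholders.

```python
# Illustrative SAVE-style decisions (assumed forms, not the paper's exact ones).

def route(queues):
    """Front end routing: send the arrival toward the back end with the
    smallest backlog (back ends with small Q_j receive mu_ij > 0)."""
    return min(range(len(queues)), key=lambda j: queues[j])


def choose_rate(q, n_active, price, v, candidate_rates):
    """Back end rate choice: pick b minimizing V*f(b) - Q*N*b, where
    f(b) = price * N * b**2 is a placeholder convex power-cost curve."""
    def objective(b):
        return v * price * n_active * b ** 2 - q * n_active * b
    return min(candidate_rates, key=objective)


best_b = choose_rate(q=10.0, n_active=2, price=1.0, v=1, candidate_rates=[0, 1, 2, 3])
```

Raising V makes the power-cost term dominate, so the chosen rate drops; that is exactly the cost/delay knob the performance theorem describes.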
SAVE: performance
• Theorem on performance of our approach:
– Delay of SAVE ≤ O(V)
– Power cost of SAVE ≤ Power cost of OPTIMAL + O(1/V)
– OPTIMAL can be any scheme that stabilizes the queues
• V controls the trade-off between average queue size
(delay) and average power cost.
• SAVE is well suited for delay tolerant workloads
Experimental Setup
• We simulate data centers at 7 locations
– Real world power prices
– Poisson arrivals
• We use synthetic workloads that mimic MapReduce jobs
• Power cost = power price × (power consumption of active servers + power consumption of servers in sleep) × power usage effectiveness (PUE)
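The cost expression above translates directly into code; all the numeric values below (price, per-server wattages, PUE) are made up for illustration.

```python
# Power cost = price * PUE * (active-server consumption + sleeping-server
# consumption), following the slide's decomposition. Values are illustrative.

def power_cost(price, n_active, p_active, n_sleep, p_sleep, pue):
    consumption = n_active * p_active + n_sleep * p_sleep  # total watts drawn
    return price * pue * consumption


cost = power_cost(price=0.05, n_active=100, p_active=200.0,
                  n_sleep=50, p_sleep=10.0, pue=1.3)  # 0.05 * 1.3 * 20500
```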
Experimental Setup: Heuristics for comparison
• Local Computation
– Send jobs to the local back end
• Load Balancing (all servers are activated)
– Evenly split jobs across all back ends
• Low Price (similar to [Qureshi09])
– Send more jobs to locations with low power prices
• Instant On/Off (unrealistic)
– Routing is the same as Load Balancing
– Data center i tunes Ni(t) and bi(t) every time slot to minimize its power cost
– No additional cost for activating servers or putting them to sleep
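The comparison heuristics are simple routing rules; the sketches below are one plausible reading of each, with invented signatures, and the Low Price version is deliberately crude (send everything to the single cheapest site).

```python
# Routing rules for the comparison heuristics described on the slide.

def local_computation(origin, num_centers, job):
    """Local Computation: all work stays at the originating data center."""
    shares = [0.0] * num_centers
    shares[origin] = job
    return shares


def load_balancing(num_centers, job):
    """Load Balancing: split the job evenly across all back ends."""
    return [job / num_centers] * num_centers


def low_price(prices, job):
    """Low Price (crude version): send everything to the cheapest location."""
    cheapest = min(range(len(prices)), key=lambda i: prices[i])
    shares = [0.0] * len(prices)
    shares[cheapest] = job
    return shares
```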
Experimental Results
[Figure: relative power cost reduction compared to Local Computation]
• As V increases, the power cost reduction grows from ~0.1% to ~18%
• SAVE is more effective for delay tolerant workloads.
Experimental Results: Power Usage
• We record the actual power usage (not cost) of all
schemes in our experiments
• Our approach reduces power usage as well as power cost
Summary
• We propose a two time scale, non work conserving control algorithm aimed at reducing power cost in distributed data centers.
• Our work facilitates an explicit power cost vs. delay trade-off.
• We derive analytical bounds on the time average power cost and service delay achieved by our algorithm.
• Through simulations we show that our approach can reduce power cost by as much as 18%, while also reducing power usage.
Future work
• Other problems on power reduction in data centers
– Scheduling algorithms to save power
– Delay sensitive workloads
– Virtualized environments, where migration is available
Questions?
• Please check out our paper:
– "Data Centers Power Reduction: A Two Time Scale Approach for Delay Tolerant Workloads", to appear in Infocom 2012
• Contact info:
– yuanyao@usc.edu
– http://www-scf.usc.edu/~yuanyao/