ElasticTree: Saving Energy in Data Center Networks
許倫愷
2013/5/28

About the paper
  • Authors: Brandon Heller, Srini Seetharaman, Priya Mahadevan, Yiannis Yiakoumis, Puneet Sharma, Sujata Banerjee, Nick McKeown
  • Venue: NSDI '10 (USENIX Conference on Networked Systems Design and Implementation)
  • Citations: 174
  • Length: 16 pages

Outline
  • The big picture
  • Introduction
  • ElasticTree system
  • Analysis
  • Conclusion

The big picture

The motivation
  [Figure labels: "Very inefficient!!" and "Desired"]

Why power is wasted
  • Provisioning for peak
  • Time-varying traffic demands
  • Low efficiency at low loads

The goal of ElasticTree

The approach
  • Turn off unneeded links and switches

The challenge
  • Performance
  • Fault tolerance
  • Scalability

Introduction
  • What is ElasticTree: a system for dynamically adapting the energy consumption of a data center network
  • What does it do: finds minimum-power network subsets across a range of traffic patterns
  • Trade-off: energy efficiency, performance, and robustness

Data center network (traditional)
  • 2N tree: one failure can cut the effective bisection bandwidth in half; two failures can disconnect servers

Data center network
  • Fat tree: SIGCOMM 2008, "A Scalable, Commodity Data Center Network Architecture"

Data center networks are provisioned for peak workload
  • Traffic varies daily, weekly, monthly, and yearly
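Fat tree sizing (sketch)

To give a feel for the fat tree topology named above, the following minimal sketch counts the elements of a k-ary fat tree. It assumes the standard construction (k pods, each with k/2 edge and k/2 aggregation switches, (k/2)^2 core switches, and k/2 hosts per edge switch). The function name `fat_tree_size` and the returned dictionary keys are illustrative choices, not taken from the slides or the paper.

```python
def fat_tree_size(k):
    """Count hosts, switches, and links in a k-ary fat tree (k even).

    Assumes the standard construction: k pods, k/2 edge and k/2
    aggregation switches per pod, (k/2)^2 core switches, and
    k/2 hosts attached to every edge switch.
    """
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    edge = k * half            # edge switches over all pods
    agg = k * half             # aggregation switches over all pods
    core = half * half         # core switches
    hosts = edge * half        # k/2 hosts per edge switch
    # Links: host-edge, edge-aggregation, aggregation-core
    links = hosts + edge * half + agg * half
    return {
        "hosts": hosts,
        "edge": edge,
        "aggregation": agg,
        "core": core,
        "switches": edge + agg + core,
        "links": links,
    }


if __name__ == "__main__":
    print(fat_tree_size(4))
```

Under these assumptions, k = 4 gives 16 hosts, 20 switches, and 48 links (host links included).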
Energy proportionality
  • The strategy: turn off the links and switches that we don't need

ElasticTree system

ElasticTree
  • ElasticTree is a system for dynamically adapting the energy consumption of a data center network
  • Example: with 0.2 Gbps of traffic per host and 1 Gbps links, 13/20 switches and 28/48 links stay active
  • In this example, ElasticTree reduces network power by 38%
  [Figure values: 0.8, 0.4, 0.2]

The optimizer
  • Finds the minimum-power network subset that satisfies current traffic conditions
  • As traffic conditions change, the optimizer continuously recomputes the optimal network subset
  • 3 approaches: formal model, greedy bin-packing, topology-aware heuristic

Optimizer comparison

Formal model
  • Finding the optimal flow assignment alone is an NP-complete problem for integer flows
  • Derived from the standard multi-commodity flow (MCF) problem
  • Outputs a subset of the original topology, plus the routes taken by each flow to satisfy the traffic matrix
  • Scales as O(n^3.5+)

Greedy bin-packing
  • Strategy: for each flow, choose the leftmost path with sufficient capacity
  • Scales as O(n^2+)
  [Figure label: 1G link]

Topology-aware heuristic
  1. Does not compute the set of flow routes
  2. Assumes perfectly divisible flows
     => taken literally, this would pack every link to full utilization and reduce TCP bandwidth
     => its output is therefore used as a starter subset
  • Decouples power optimization from routing
     => can be applied alongside any fat tree routing algorithm
  • An edge switch doesn't care which aggregation switches are active, only how many are active

Optimizer comparison

Analysis

How to test
  • k = 6 fat tree
  • OpenFlow

Analysis: traffic patterns
  • Near: servers communicate only with other servers through their edge switch
  • Far: servers communicate only with servers in other pods

Analysis: random demand
  • Individual aggregation/core switches turn on and off

Analysis: different traffic loads
  • 70% of traffic to outside, 30% inside the DCN

Analysis: redundancy
  • If only the MST is on => no redundancy => no fault tolerance
  • +MST: additive cost, multiplicative benefit

Analysis: latency
  • Ethernet overheads (preamble, inter-frame spacing, and the CRC) cause the egress buffer to fill up
  • Packets either get dropped or significantly delayed
  [Figure values: 0.25, 0.33, 0.5]
  • Need a safety margin!!
  • Safety margin: the amount of capacity reserved at every link by the optimizer
  • Traffic overload: the amount each host sends and receives beyond the original traffic matrix
  • Trade-off between energy and performance

Conclusion

Summary

Reference
  • The paper
  • The slides (by the author)
  • A YouTube video (also by the author): http://www.youtube.com/watch?v=G2_D-CH4tQk

Questions

Thank you!
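Backup: greedy bin-packing sketch

The greedy bin-packing strategy from the optimizer slides (choose the leftmost path with sufficient capacity) can be illustrated with the minimal sketch below. It is a simplified illustration under stated assumptions, not the authors' implementation: paths are given as tuples of link names already ordered left to right, every link has the same capacity, and the names `greedy_leftmost` and `active_links` are hypothetical.

```python
def greedy_leftmost(flows, candidate_paths, capacity=1.0):
    """Place each flow on the leftmost candidate path that still fits.

    flows: list of (flow_id, demand) pairs, demand in Gbps.
    candidate_paths: dict mapping flow_id to a list of paths ordered
        left to right; each path is a tuple of link names (assumed input).
    capacity: usable capacity of every link in Gbps (assumed uniform).
    Returns (routes, load): chosen path per flow (None if nothing fits)
    and the traffic carried on each used link.
    """
    eps = 1e-9  # tolerance for floating-point sums
    load = {}
    routes = {}
    for flow_id, demand in flows:
        routes[flow_id] = None
        for path in candidate_paths[flow_id]:
            fits = all(load.get(link, 0.0) + demand <= capacity + eps
                       for link in path)
            if fits:
                for link in path:
                    load[link] = load.get(link, 0.0) + demand
                routes[flow_id] = path
                break  # leftmost fitting path wins
    return routes, load


def active_links(load):
    """Links that carry traffic; all other links could be turned off."""
    return {link for link, used in load.items() if used > 0}


if __name__ == "__main__":
    # Hypothetical example: one edge switch with two 1 Gbps uplinks.
    paths = {f: [("up0",), ("up1",)] for f in ("f1", "f2", "f3")}
    flows = [("f1", 0.4), ("f2", 0.4), ("f3", 0.4)]
    routes, load = greedy_leftmost(flows, paths)
    print(routes)
    print(sorted(active_links(load)))
```

In this toy example the first two 0.4 Gbps flows share the left uplink and the third spills onto the right one. With three 0.2 Gbps flows, everything fits on the left uplink and the right one stays idle. A safety margin could be modeled, as an assumption, by passing a `capacity` below the link rate.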