1 Energy Efficiency in Data Centers
Diljot Singh Grewal
"What matters most to the computer designers at Google is not speed, but power - low power, because data centers can consume as much electricity as a city" – Eric Schmidt, CEO of Google

2 Some Facts
• Data centers consumed 235 billion kWh of energy worldwide in 2010 [2].
• Data centers consumed 1.3% of the total electricity consumption of the world (as of August 2011).
• In 2000 data centers used 0.53%; this almost doubled to 0.97% in 2005, and by 2010 it had risen only to 1.3%.
• A rack drawing 20 kW, at 10 cents per kWh, uses more than $17,000 of electricity per year.

3 Energy Efficiency
• Run a datacenter-wide workload and measure the energy consumed.
• Datacenter Efficiency = Computational Work Performed / Energy Used
• Efficiency = (1 / PUE) × (1 / SPUE) × (Computation / Total Energy to Electronic Components)

4 Power Usage Effectiveness (PUE)
• PUE = Total Building Power / IT Power
• In 2006, 85% of datacenters had a PUE greater than 3.0 [5].
• Another study estimated the average at 2.0 [6].
• In a state-of-the-art facility, a PUE of 1.1 is achievable [7].

5 Reasons:
• Staged deployment
• Fragmentation
• Following nameplate ratings
• Variable load
• Excessive/inefficient cooling
• Excessive/inefficient humidity controls

6–7 (Figure slides [8])

8 Typical power distribution losses [8]:
• 115 kV to 13.2 kV transformation: loss ~0.5%
• 6–12% loss further down the power conditioning chain
• Loss in wires: ~1–3%
• Chillers consume 30–50% of the IT load; CRAC units consume 10–30% of the IT load.

9–10 (Figure slides [8])

11 Improving Infrastructure
• Increasing the cold-aisle temperature from 20 °C to 27 °C.
• Isolating hot exhaust air from the intake air.
• Using high-efficiency UPSs and other gear.
• Google achieved a PUE of 1.1 by [9]:
  ▪ Better airflow and exhaust handling.
  ▪ Keeping the cold aisle at 27 °C.
  ▪ A cooling tower that uses water evaporation.
  ▪ A per-server UPS with 99.99% efficiency instead of a facility-wide UPS.

12 Google's PUE over the years (figure)

13 Humidity Control
• Condensation on the cooling coils can reduce humidity.
• Low humidity levels (<40% RH) can lead to static buildup (sparks that can damage chips).
• Steam humidifiers are energy expensive.
• Energy savings:
  ▪ Using evaporative cooling on the incoming air.
  ▪ Using evaporative cooling to humidify the hot output air and cool it (which is then used to cool the incoming air).

14 SPUE
• SPUE = Total Server Input Power / Power Consumed by the Components
• Accounts for losses due to power supplies, fans, and voltage regulators.

Component | Maximum Efficiency
Power supplies | 80%
Motherboard VRMs | 70%

• Total PUE = PUE × SPUE
• If both stand at 1.2, then only about 70% of the energy is actually used for computation.

15 (Figure [18])

16 Efficiency of Computing
• Computation / Total Energy to Electronic Components
• Hardest term to measure. How do we benchmark it?
• New benchmarks: JouleSort and SPECpower.
• No benchmarks for memory or switches.

17 Breakdown (figure)

18 CPU
• Uses up to 50% of server power at peak but drops to about 30% at low activity.
• Dynamic ranges:
  ▪ CPU: 3.5x
  ▪ Memory: 2x
  ▪ Disks: 1.3x
  ▪ Switches: 1.2x

19–21 (Figure slides [10])

22 Energy Proportional Computing
• Low idle power, and power proportional to load afterwards.
• Energy spent would be halved by energy proportionality alone if the system idles at 10% [11].
• Can still be worthwhile even if peak efficiency is not quite as good.

23 (Figure [10]: load level, % of peak)
24 Savings by energy-proportional computing (green line) (figure) [11]

25 Dynamic Voltage and Frequency Scaling
• Power = C·V²·f + Leakage (see the sketch below)
• The time to wake up from a low-voltage state depends on the voltage differential.
• Not useful on multicore architectures?
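To make the P = C·V²·f relation on the DVFS slide concrete, here is a minimal Python sketch with purely illustrative numbers (the effective capacitance, voltages, and frequencies below are assumptions, not figures from the slides; leakage is ignored):

```python
# Minimal DVFS sketch: dynamic power scales as C_eff * V^2 * f (leakage ignored).
# All numeric values are illustrative assumptions, not figures from the slides.

def dynamic_power(c_eff, voltage, freq_hz):
    """Dynamic (switching) power in watts: P = C_eff * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz

C_EFF = 1.0e-9                                  # effective switched capacitance (F), assumed
nominal = dynamic_power(C_EFF, 1.2, 3.0e9)      # 1.2 V at 3.0 GHz
scaled = dynamic_power(C_EFF, 1.0, 2.0e9)       # 1.0 V at 2.0 GHz

print(f"nominal: {nominal:.2f} W, scaled: {scaled:.2f} W, "
      f"saving: {100 * (1 - scaled / nominal):.0f}%")
```

Because voltage enters quadratically, lowering V and f together cuts dynamic power roughly in half in this example, which is why DVFS saves far more than frequency scaling alone; real savings also depend on the leakage term in the formula above.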
26 The CPU States
• ACPI states:
  ▪ The power management component of the kernel sends a signal to the processor driver to switch to a state.
• States:
  ▪ C0: normal operation
  ▪ C1, C2: stop clocks
  ▪ C3: C2 + reduced voltage
  ▪ C4: C3 + turns off the memory cache

27 CPU power states
Mode | Name | What it does
C0 | Operating State | CPU fully turned on
C1 | Halt | Stops the main internal clock via software; bus and APIC keep running
C1E | Enhanced Halt | C1 + reduced voltage
C2 | Stop Grant / Stop Clock | Stops the clock via hardware; bus and APIC keep running
C2E | Extended Stop Clock | C2 + reduced voltage
C3 | Sleep | Stops clocks (internal or both)
C4 | Deeper Sleep | Reduces CPU voltage
C4E/C5 | Enhanced Deeper Sleep | Reduces CPU voltage even more and turns off the cache
C6 | Deep Power Down | Reduces voltage even further (~0 V)

28–29 (Figure slides [12])
30 Energy Savings (figure) [10]
31–32 Results of scaling at the datacenter level (figures) [11]

33 The Multicore Problem
• Clock gating
  ▪ Core-level clock gating
• Voltage gating?
  ▪ Voltage depends on the core with the highest utilization.
• Lower wake-up penalty by using the cache
  ▪ New architectures have penalties of 60 µs, down from 250 µs.
• Power gating (Power Control Unit)
• Separate power planes for the core and un-core parts

34 The Leakage Power (figure)

35 Software's Role
• Well-tuned code can reduce consumption.
• Code that generates excessive interrupts or snoop requests is wasteful.
• The OS power manager speculates about future processing requirements to make decisions according to the settings selected by the user.

36 CPU isn't the only culprit (figure) [10]
37 (Figure)

38 Let's Talk Storage
• Storage consumes about 27% of the power.
• High-performance disks are used to match microprocessor speed.
• According to an IDC report in 2008, the total power to run and cool a drive is 48 watts [13]:
  ▪ 12 watts to run the HDD
  ▪ 12 watts for the storage shelf (HBAs, fans, power supply)
  ▪ 24 watts to cool the HDDs and the storage shelf

39 Power Consumption of a 2.5" Drive (figure)

40 Electronics & Software
• Adaptive voltage
• Frequency reduction in low-power modes
• Queuing algorithms to minimize rotational delays
• Algorithms to manage transitions between low- and high-power modes

41 Mechanical
• Lighter materials
• Better motor design
• Using helium in a sealed case to reduce air drag
  ▪ WD claims energy savings of 23% with 40% higher capacity.
• Load/unload

42 Tiered System [14]
• Manage workloads efficiently among multiple RPMs in a storage system.
• Tiered storage:
  ▪ Tier 0 with solid-state drives (5%)
  ▪ Tier 1 with high-performance HDDs (15%)
  ▪ Tier 2 with low-power HDDs (80%)

43 Tiered Storage (figure)

44 Mixed Approach
• Mirror a high-performance disk on a low-power disk and use the low-power disk under light load [14].
• The low-performance disks use significantly less energy than high-performance disks.
• Other approaches:
  ▪ Multispeed disks: the ability to change spin speed [14].
  ▪ Lower rotational speed but multiple heads.

45 Solid State Disks
• Require up to 90% less power [15].
• Offer up to 100 times higher performance [15].
• The lifespan of an SSD depends on the I/O operations, and it is not yet good enough for servers.
• MLC vs. SLC

46 File System Problems?
• Google File System:
  ▪ Distributes data chunks across a large number of systems (the entire cluster) for resiliency.
  ▪ But that means all machines run at low activity and do not go idle.

47 Memory
• SRAM requires constant voltage.
• DRAM: since the capacitors leak charge, every row must be refreshed every 64 ms (JEDEC).
• Suppose we have 2^13 rows; then we need to refresh a row every 7.8 µs (worked out in the sketch below).
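The 7.8 µs figure on the memory slide is just the 64 ms retention window divided across the rows; a quick sketch of that arithmetic, using the slide's 2^13-row example:

```python
# Worked example for the DRAM refresh arithmetic on the memory slide:
# every row must be refreshed within the 64 ms JEDEC retention window,
# so with 2**13 rows a row refresh is needed every 64 ms / 8192.

RETENTION_MS = 64        # JEDEC retention window (ms)
ROWS = 2 ** 13           # 8192 rows, as in the slide's example

interval_us = RETENTION_MS * 1000 / ROWS
print(f"refresh one row every {interval_us:.2f} us")  # ~7.81 us
```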
48 Alternatives
• Low-voltage RAM (LoVo)
  ▪ Runs at 1.25 V (DDR2: 1.8 V, DDR3: 1.5 V)
  ▪ 2–3 W per 2 GB module
• SSD as RAM [17]
• Future:
  ▪ Ferroelectric RAM (FeRAM)
  ▪ Magnetoresistive RAM (MRAM)

49 Are a Few 'Bulls' Better Than a Flock of 'Chickens'?
• Is performance per watt all we need?
• If it is, then we should buy ARM servers.
• Smaller RAM and laptop HDDs.
• 20 times lower power but at 5 times lower performance: high response times.
• According to Google's study, users prefer 10 results in 0.4 s over 25 results in 0.9 s.

50 Power Provisioning Costs
• Building a datacenter that can deliver power to the servers can cost more than the electricity itself.
• $10–22 per deployed IT watt (provisioning cost).
• Cost of 1 watt of IT power per year = (8766 / 1000) × $0.07 × 2.0 = $1.227
  (hours per year ÷ 1000 W/kW, at 7 cents per kWh, with a PUE of 2.0)
• Cost savings from efficiency can therefore save even more in provisioning than in electricity.

51 (Figure)

52 Safety Mechanisms and Oversubscription
• Since the CDF intercepts the top at a flat slope, there are few intervals when the facility is close to full load.
• Removing these intervals allows even more machines to be deployed, via:
  ▪ De-scheduling tasks
  ▪ DVFS (also power capping)

53 Virtualization
• The energy cost can be minimized by launching multiple virtual machines.
• Virtualized servers have an associated overhead.
• Different types have different behaviors:
  ▪ Paravirtualization (Xen)
  ▪ Full virtualization (VMware Server)

54 Paravirtualization (Xen) (figure)

55 Virtualization Overheads (figure) [16]
L: native Linux, X: Xen, V: VMware Workstation 3.2, U: User-Mode Linux

56 Performance on a Java Server Benchmark [16]
Virtualization overhead on SPECjbb 2005:
Number of VMs | CPUs per VM | SUSE SLES 10 Xen 3.0.3 | VMware ESX 3.0.2
1 | 1 | 1% | 3%
1 | 4 | 3% | 7%
4 | 2 | 5% | 15%
4 | 4 | 7% | 19%

57 Power Management in Virtualized Systems (figure)

58 Concluding
• Power efficiency in datacenters is constrained by the performance requirements imposed on them.
• High-efficiency gear, smart design, and proper consolidation can lead to huge gains.
• Efficiency in server components is an ongoing research problem.
• Data centers have many components that affect the overall consumption, and coordination across them is needed to ensure both performance and efficiency.

59 References
1. T. P. Morgan, "The server market begins to cool in Q4," The Linux Beacon, February 28, 2006.
2. EPA report, 2006.
3. A. Hiller, "A quantitative and analytical approach to server consolidation," CiRBA White Paper, January 2006, p. 4.
4. Personal correspondence with Dale Sartor of LBNL, August 9, 2006.
5. M. Kalyanakrishnam, Z. Kalbarczyk, and R. Iyer, "Failure data analysis of a LAN of Windows NT based computers," 18th IEEE Symposium on Reliable Distributed Systems, 1999.
6. Green Grid, "Seven strategies to improve data center cooling efficiency."
7. X. Fan, W. Weber, and L. A. Barroso, "Power provisioning for a warehouse-sized computer," Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA '07), San Diego, CA, June 2007.
8. S. Greenberg, E. Mills, and B. Tschudi, "Best practices for data centers: lessons learned from benchmarking 22 data centers," 2006 ACEEE Summer Study on Energy Efficiency in Buildings.
9. Google Inc., "Efficient Data Center Summit," April 2009.
10. L. A. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool Publishers, 2009.
11. X. Fan, W. Weber, and L. A. Barroso, "Power provisioning for a warehouse-sized computer," Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA '07), San Diego, CA, June 2007.
60 References
12. Technology brief on power capping in HP systems. Available at http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01549455/c01549455.pdf
13. International Data Corporation, Annual Report, 2008.
14. E. V. Carrera, E. Pinheiro, and R. Bianchini, "Conserving disk energy in network servers," International Conference on Supercomputing, 2003.
15. "Solid State Drives for Enterprise Data Center Environments," HGST whitepaper.
16. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the Art of Virtualization," University of Cambridge Computer Laboratory.
17. A. Badam and V. S. Pai, "SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy," 8th USENIX Conference on Networked Systems Design and Implementation (NSDI), 2011, p. 16.
18. M. Ton and B. Fortenbury, "High performance buildings: data centers - server power supplies," Lawrence Berkeley National Laboratory and EPRI, December 2005.

61 ARM Server (Calxeda)
• More of a cluster in the size of a server.
• Currently holds 12 EnergyCards in one server.
• Each EnergyCard has 4 EnergyCore processors (1.1–1.4 GHz).
• Larger L2 cache.
• Runs Linux (Ubuntu Server 12.10 or Fedora 17).
• No need to virtualize: give each application its own node (quad-core, 4 MB L2, 4 GB RAM).

62 ECX-1000 is the ARM server; the others are Intel (figure).
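As a back-of-the-envelope companion to the 'bulls vs. chickens' slide and the Calxeda appendix above, the sketch below turns the quoted ratios (20 times lower power at 5 times lower performance) into a performance-per-watt comparison; the baseline numbers are arbitrary placeholders, only the ratios matter:

```python
# Back-of-the-envelope check of the "bulls vs. chickens" ratios from the talk:
# a wimpy ARM node at 1/20th the power and 1/5th the performance of a brawny server.
# Baseline values are arbitrary placeholders; only the ratios matter.

brawny_perf, brawny_power = 100.0, 200.0               # arbitrary units
wimpy_perf, wimpy_power = brawny_perf / 5, brawny_power / 20

ratio = (wimpy_perf / wimpy_power) / (brawny_perf / brawny_power)
print(f"wimpy node delivers {ratio:.0f}x the performance per watt")   # 4x
# ...but each request runs ~5x slower, which is why response time still favors brawny servers.
```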