Ensemble-level Power Management for Dense Blade Servers

Partha Ranganathan, Phil Leech (Hewlett-Packard)
David Irwin, Jeff Chase (Duke University)

© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

The problem
• Power density is a key challenge in enterprise environments
− Blades increase power density; data centers push back on cooling
• Increased thermal-related failures if not addressed
• Problems exacerbated by data center consolidation
HP Confidential 2

Challenges with Traditional Solutions
• Pure infrastructure solutions are reaching their limits
− Move from forced-air cooling to liquid cooling?
− More than 60 amps per rack?
• Large costs for power and cooling
− Capital costs: for a 10MW data center, $2-$4 million for cooling equipment
− Recurring costs: at the data center, 1W of cooling for every 1W of power; for a 10MW data center, $4-$8 million for cooling power
Can we address this problem at the system design level?

This Talk: Contributions
• Address power density at the system level
• Ensemble-level architecture for power management
− Manage the power budget across collections of systems
− Recognize trends across multiple systems
− Address compounded overprovisioning inefficiencies
• Power trends from 130+ servers in real deployments
− Extract power efficiencies at larger scale
• Architecture and implementation
− Simple hardware/software support; preemptive and reactive policies
• Prototype and simulation at the blade-enclosure level
− Significant power savings; no performance loss

Workload Behavior Trends
[Figure: power trace; data from hp.com]
Nominal usage differs from peak (and from nameplate)

Workload Behavior Trends
[Figure: hp.com traces; sum of peaks ~300 vs. peak of sums ~150]
Sum-of-peaks >>> peak-of-sums (system of systems)
Non-synchronized burstiness across systems

Workload Behavior Trends
Similar trends on 132 servers in 9 different sites
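The takeaway above, sum-of-peaks far exceeding peak-of-sums, can be computed directly from per-server power traces. A minimal sketch with two synthetic traces (the numbers are illustrative, not the measured hp.com data):

```python
# Sum-of-peaks: provision each server for its own peak, then add.
# Peak-of-sums: provision the ensemble for the peak of the aggregate trace.
# Non-synchronized bursts make the second far smaller than the first.
def sum_of_peaks(traces):
    return sum(max(trace) for trace in traces)

def peak_of_sums(traces):
    return max(sum(sample) for sample in zip(*traces))

# Two servers (watts over time) whose bursts do not overlap:
a = [20, 90, 20, 20]
b = [20, 20, 90, 20]
print(sum_of_peaks([a, b]))   # 180 -- per-server provisioning
print(peak_of_sums([a, b]))   # 110 -- ensemble provisioning
```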
Site  Workload and trace length                                Servers  Avg   90th%  Max   Sum-peaks  Worst  Savings
1     Backend of pharmaceutical company                        26       87    138    307   1128       2600   88%
2     Web hosting infrastructure for WorldCup98 web site [2]   25       256   481    1166  1366       2500   53%
3     SAP-based business process application in large company  27       585   691    919   1654       2700   66%
4     E-commerce web site of a large retail company            15       83    166    234   591        1500   84%
5     Backend for thin enterprise clients, company 1           10       138   184    298   729        1000   70%
6     Backend for thin enterprise clients, company 2           14       102   159    287   1253       1400   80%
7     Front-end customer-facing web site for large company     8        119   187    255   467        800    68%
8     Business processing workload in small company            3        78    132    225   278        300    25%
9     E-commerce web site of small company                     4        90    136    197   228        400    51%
All   All sites                                                132      1540  1872   2682  7694       13200  80%

What does this mean? Compounded inefficiencies
− Managing the power budget for individual peaks: 20W blades, 500W enclosures, 10KW racks, …
− Managing the power budget for the ensemble typical case: 20W blades, 250W enclosures, 4KW racks, …

Functional Architecture
• Hardware-software coordination for power control
[Figure: a management agent takes a power budget and application SLA requirements as inputs; each blade exposes monitoring hooks and power-control hooks for measure/monitor/predict and policy-driven control]
• Provision the system for a lower power budget
• Intelligent software agent
− Monitors the power of individual blades
− Ensures that the total power of the enclosure does not exceed the threshold
• Use power-throttling hooks in the system in the rare case of violations

Enclosure-level Implementation
[Figure: block diagram of the enclosure and blades; each blade (CPU, RAM, ROM, graphics, PCI, ATA/IDE southbridge controller, NICs, hard drive, USB) carries thermal diodes, a thermal monitor, a power monitor, and a hot-swap controller, reporting over SMBus to a blade management controller; the enclosure I2C bus connects the blades to the enclosure controller (IAM), power supplies, cooling, and Gigabit Ethernet switches]
Enclosure firmware:
− Resource monitoring and prediction
− Policy-driven throttling directives
− Initialization & heart-beat check
Blade firmware:
− Data gathering and reporting
− Power (request) control
− Initialization & heart-beat check
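The agent's core job above, keeping total enclosure power at or below a threshold while blades request power before using more, can be sketched as a request/grant controller. The class, method names, and wattage values here are hypothetical illustrations, not the IAM firmware interface:

```python
# Sketch of preemptive budget enforcement: a blade must ask the enclosure
# controller before raising its power level; the controller grants the
# request only if the enclosure budget still holds afterward.
class EnclosureController:
    def __init__(self, budget, blade_watts):
        self.budget = budget
        self.watts = dict(blade_watts)   # current allocation per blade

    def request(self, blade, new_watts):
        # Headroom = budget minus everyone else's current allocation.
        headroom = self.budget - sum(self.watts.values()) + self.watts[blade]
        if new_watts <= headroom:
            self.watts[blade] = new_watts
            return True                  # granted: blade may raise its P-state
        return False                     # denied: blade stays where it is

ctl = EnclosureController(budget=60, blade_watts={'b1': 15, 'b2': 15, 'b3': 15})
print(ctl.request('b1', 27))   # True: 27 + 15 + 15 = 57 <= 60
print(ctl.request('b2', 22))   # False: 27 + 22 + 15 = 64 > 60
```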
Operation phases:
* Initialization and setup
* Data gathering / heartbeat checking
* Event response

Implementation Choices
• Selection of the system power budget
− What value? How strict is enforcement?
− Thermal provisioning: relaxed; power provisioning: strict
• Power monitoring and control
− Power or temperature? Polling or interrupts? Which components? P-states?

Implementation Choices (2)
• Policies for power throttling
− Assigning power budgets
• Preemptive: "ask before you can use more power"
• Reactive: "use as much as you want until told you can't"
− Choice of servers to (un)throttle: round-robin, lowest-performance, highest-power, fair-share, …
− Power level to (un)throttle: incremental, deep, …
− Resource estimation and polling heuristics

Outline
• Introduction
• Characterizing real-world power trends
• Architecture & implementation
• Evaluation
• Conclusions

Prototype Experiments
• Experimental test bed with 8 prototype blades
− 1 GHz TM8000, 256MB, 40GB, Windows
− P-states: 533MHz/0.8V, 600MHz/0.925V, 700MHz/1.0V, 833MHz/1.1V, 1000MHz/1.25V
− Prior blade design plus power-monitoring support
− Firmware changes to the BIOS and to blade/enclosure controllers
• Benchmarks: VNCplay and batch simulations
• Measured power and performance
• Tradeoffs
+ Validates the implementation
+ Actual performance and power results
-- Hard to model real enterprise traces
-- Hard to do detailed design-space exploration

Simulator Experiments
• High-level model of a blade enclosure
− Input resource-utilization traces
− Power/performance models
− Configurable architecture parameters
− Results validated on the prototype
• Benchmarks
− 9 real enterprise site traces for 132 servers
− Synthetic utilization traces of varying concurrency, load, …
• Metrics
− Total workload performance, per-server performance
− Changes in utilization, frequency, MIPS, at peak/idle
− Usage of different P-states, impact of delays
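One of the (un)throttling choices listed above, throttling the highest-power servers first and to the deepest level, might look like the sketch below. The P-state wattage is an illustrative assumption, not a measured TM8000 value:

```python
# Pick servers to throttle, highest power first, until the enclosure
# total falls back under its power budget. Returns per-server directives
# mapping server id to its new (throttled) power level.
LOWEST_PSTATE_WATTS = 15   # assumed deepest-throttle power level

def throttle_to_budget(powers, budget):
    directives = {}
    total = sum(powers.values())
    for server in sorted(powers, key=powers.get, reverse=True):
        if total <= budget:
            break                        # back under budget: stop throttling
        directives[server] = LOWEST_PSTATE_WATTS
        total -= powers[server] - LOWEST_PSTATE_WATTS
    return directives

# Three blades at 33 W, 33 W, and 15 W against a 60 W enclosure budget:
print(throttle_to_budget({'b1': 33, 'b2': 33, 'b3': 15}, budget=60))
# {'b1': 15, 'b2': 15} -- two deep throttles bring 81 W down to 45 W
```

Other policies from the slide (round-robin, lowest-performance-first, incremental rather than deep throttling) would change only the sort key or the target level.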
Results
• Significant enclosure power-budget reductions
− 10-20% at the enclosure level, 25-50% at the processor level
− Higher savings possible with other P-state controls
• Marginal impact on performance (less than 5%)
• Preemptive policy competitive with reactive

Interactive Applications
[Figure: VNCplay interactive latency CDFs]
Minimal impact on latency; CDFs within measurement error

Sensitivity Experiments
• Other policy choices
− No impact on real workload traces
− Throttling a few servers at high P-states preferable to throttling many servers at low P-states
• Sensitivity to workload characteristics

Other Benefits
• Beyond the enclosure
− Cascading benefits at the rack, data center, etc.
• "Soft" component power budgets for lower cost
− e.g., high-volume high-power vs. high-cost low-power CPUs
• Adaptive power-budget control
− Heterogeneous power supplies for low-cost redundancy
• Average power reduction
− e.g., one 90th-percentile budget at the enclosure vs. multiple 90th-percentile budgets at the blades

Summary
• Critical power-density problem in enterprises
• Ensemble-level architecture for power management
− Manage the power budget across collections of systems
− Recognize trends across multiple systems
− Address compounded overprovisioning inefficiencies
• Real-world power analysis (130+ servers in 9 sites)
− Dramatic differences between sum of peaks and peak of sums
• Architecture and implementation
− Simple hardware/software support; preemptive and reactive policies
• Prototype and simulation at the blade-enclosure level
− Significant power savings; no performance loss
• Other benefits in component flexibility, resiliency, …

Questions?
Speaker contact: Partha.Ranganathan@hp.com

Backup Slides

[Figure: per-workload power traces for hp.com, desktop1, sap1, ecomm2, desktop2, pharma, ecomm1, worldcup, sap2]

Backup on Simulation

Pre-emptive and Reactive Policies

Reactive policy:
Start with all servers unthrottled.
At each control period, or on an interrupt:
  Compute total power consumption.
  Check whether power is above the threshold.
    If yes: prioritize which servers to throttle; throttle each server to the decided level; stop when the power budget is back below threshold.
    If no: prioritize which servers to unthrottle; unthrottle each server to the decided level; stop if the power budget would likely be exceeded.

Pre-emptive policy:
Start with all servers throttled.
At each control period, or on an interrupt:
  Compute total power consumption.
  Identify servers with "low" utilization; prioritize which to throttle; throttle each to the decided level.
  Check whether there is room in the power budget.
    If yes: identify servers with "high" utilization; prioritize which to unthrottle; unthrottle each to the decided level; stop if the power budget would likely be exceeded.
    If no: stop.

Related Work
• Single-server power capping
− Brooks et al.: capping at the processor level
− Felter et al.: power shifting
• Cluster-level power budgets
− Femal et al.: throughput per budget, local control
− IBM, Duke, and Rutgers work on average power
• Resource provisioning
− Urgaonkar et al.: overbooking resources
− Yuan et al.: OS-level CPU scheduling for batteries
• Cooling
− Moore et al.: temperature-aware workload placement
− Patel et al.: smart cooling
− Uptime recommendations, …

Future Work
• More exploration
− e.g., geographically distributed servers, more policies, high-performance workloads
• Adaptive power-budget variation
• Interfacing with other local and global control loops

The problem

A growing problem
Server power densities are up 10x in the last 10 years
Source: Datacom Equipment Power Trends and Cooling Applications, ASHRAE, 2005, http://www.ashrae.org
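One iteration of the reactive loop sketched above can be written out in code. This is a simplified model: the two wattage levels and the highest-power-first priority are assumptions standing in for the firmware's P-state directives:

```python
# One reactive control period: throttle highest-power servers while the
# enclosure is over budget; otherwise unthrottle throttled servers for
# as long as the budget leaves headroom.
HIGH, LOW = 33, 15   # illustrative unthrottled / throttled wattages

def reactive_step(watts, budget):
    total = sum(watts.values())
    if total > budget:
        for s in sorted(watts, key=watts.get, reverse=True):
            if total <= budget:
                break
            total -= watts[s] - LOW
            watts[s] = LOW               # throttle to the lowest level
    else:
        for s in sorted(watts, key=watts.get):
            if watts[s] == LOW and total + (HIGH - LOW) <= budget:
                total += HIGH - LOW
                watts[s] = HIGH          # unthrottle while budget allows
    return watts

servers = {'b1': HIGH, 'b2': HIGH, 'b3': HIGH}
print(reactive_step(servers, budget=70))  # over budget, so blades get throttled
```

The pre-emptive loop differs only in its starting state (all throttled) and in using utilization, rather than an observed budget violation, to decide who asks for more power.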
[Figure: scatter plot of maximum utilization (y-axis, 0-100%) vs. 90th-percentile utilization (x-axis, 0-100%) for the surveyed servers]

Enterprise power challenges: compute equipment consumes power…
• Electricity costs
− For a large data center, recurring costs of $4-$8 million/yr
− "… energy costs for [data center] building $1.7 million last year…", Cincinnati Bell, 2003
− "… electricity costs large fraction of data center operations…", Google, 2003
• Environmental friendliness
− Compute equipment energy use: 22M GJ and 3.9M tons of CO2
− EnergyStar (US), TopRunner (Japan), FOE (Switzerland), …
− "…goal to increase computer energy efficiency by 85% by 2005.", Japan's TopRunner energy program, 2002

Scratch slides

The problem
• Power density is a key challenge in enterprise environments
− Blades increase power density; data centers push back on cooling
• Increased thermal-related failures if not addressed
− 50% server reliability degradation for 10°C over 20°C
− 50% decrease in hard-disk lifetime for a 15°C increase
• Problems exacerbated by data center consolidation

Costs of Addressing Power Density
• Cooling costs are a large fraction of TCO
− Capital costs: for a 10MW data center, $2-$4 million for cooling equipment
− Recurring costs: at the data center, 1W of cooling for every 1W of power; for a 10MW data center, $4-$8 million for cooling power
[Figure: heat generated and the energy needed to remove it at each level, from chip (~0.005 kW) through server (~0.050-0.100 kW) and rack (10-15 kW) to data center (1000+ kW)]
• Similar issues with power delivery
− Challenges with routing more than 60 amps per rack
• Problems exacerbated by consolidation and blade growth
Need to go beyond traditional facilities-level solutions

Our Approach
"Ensemble-level" architecture for power management
• Insight: systems are designed for the peak usage of an individual box, but end users care about the long-term usage of the entire solution
• Solution: manage the power budget across collections of systems
− Recognize trends across multiple systems
− Extract power efficiencies at larger scale
• Significant power-budget savings
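The scatter plot in the backup slides compares each server's maximum utilization against its 90th percentile. Computing the 90th percentile of a trace is straightforward; a nearest-rank sketch on a synthetic trace (the trace values are invented for illustration):

```python
# 90th-percentile utilization: the level the server stays at or below for
# 90% of its samples -- typically far below its maximum, which is why
# provisioning for the percentile rather than the peak saves power.
def percentile(samples, p):
    ordered = sorted(samples)
    rank = round(p / 100 * (len(ordered) - 1))   # nearest rank in sorted data
    return ordered[rank]

trace = [5, 10, 10, 15, 10, 5, 95, 10, 15, 10]   # one brief burst to 95%
print(max(trace))             # 95
print(percentile(trace, 90))  # 15 -- provisioning target far below the peak
```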
Significant Power Savings
[Figure: processor power over time, with the original power budget at 100W and new power budgets at 22.5W and 15W]
• Processor power budget down from 100W to 15W (more than 6x)
• System power budget down from 350W to 280W (20%)
− Additional benefits if corresponding hooks exist for memory, etc.
• What about performance?

Simulator Demo of Operation
• Rich simulation infrastructure
− Facilitates more extensive design-space exploration

Questions?