PowerPerfCenter: A Power and Performance Prediction Tool for Multi-Tier Applications
Varsha Apte, Bhavin Doshi
Department of Computer Science and Engineering, IIT Bombay, Mumbai, INDIA
CMG Meeting, April 4th, 2014, IIT Bombay

The application performance prediction problem
A multi-tier application (web, application, and database tiers) is ready to be "rolled out".
• How many users can it support?
• With some N simultaneous users, what will be the response time?
• What sort of hardware and deployment architecture is required to support a target of N users?

Application performance prediction: how?
• Carry out "load testing" on a testbed:
  - clients running load generators, which generate requests and measure performance
  - servers running the system under test
• Extrapolate to the "production" environment using models, or more specifically, modeling tools.

PerfCenter: Multi-tier Application Performance Modeling Tool
Takes as input the server, host, network, deployment, request types, message flow, and resource usage details.
[Figure: a request arrives over a WAN at web servers and an auth server deployed across three machines connected by a LAN with a 10 ms link.]

PerfCenter Model Specification Language

    variable nusr 1 end
    ...
    device
        intel_xeon_cpu
        raid_disk
    end

    server web
        thread count 150
        thread buffer 0
        thread schedP fcfs
        task verify_credentials
        task prepare_calendar_page
        task verify_session
        ...
    end

    host server_host[2]
        intel_xeon_cpu count 1
        intel_xeon_cpu buffer 99999
        intel_xeon_cpu schedP fcfs
        raid_disk count 1
        raid_disk buffer 99999
        raid_disk schedP fcfs
    end

    task verify_session
        intel_xeon_cpu servt 0.010
    end

    deploy web server_host[1]
    deploy db server_host[2]

    lan
        lan1
    end

    deploy server_host[1] lan1
    deploy server_host[2] lan1

    scenario Login prob 0.05
        user verify_credentials 100 SYNC
        branch prob 0.05
            verify_credentials display_error_message 100 SYNC
    end
    ...
    end

    loadparams
        noofusers nusr
        thinktime exp(2)
    end

    modelparams
        method simulation
        type closed
        noofrequests 40000
    end

    for nusr = 5 to 100 incr 5
        print nusr + "," + respt() + "," + tput() + "," +
              util(server_host[1]:intel_xeon_cpu) + "," +
              util(server_host[2]:intel_xeon_cpu)
        nusr = nusr + 5
    end

Power: an important resource
• Power costs are now a significant part of overall datacenter expenses.
• In particular, many devices in servers run in a self-tuned mode to save power.
• Typically, low speed means low power, so such devices run at low speed when load is low.
• But low power also means low performance.
Two consequences:
1. Application performance modeling tools must predict performance in the presence of power-managed devices.
2. Power consumption should also be modeled and predicted.

PowerPerfCenter
A multi-tier application performance modeling tool that can represent "power-managed" devices and predict power consumption.
An extension of "PerfCenter"*, an existing FOSS tool developed at IIT Bombay (www.cse.iitb.ac.in/panda/tools/perfcenter).
* "PerfCenter: a performance modeling tool for application hosting centers", Akhila Deshpande, Varsha Apte and Supriya Marathe, Proceedings of the 7th International Workshop on Software and Performance, June 2008.
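The sweep above (method simulation, type closed) is solved by simulation, but the shape of the throughput and response-time curves such a closed-model sweep produces can be illustrated with exact Mean Value Analysis (MVA) for a simple closed queueing network. The sketch below is independent of PerfCenter, not its actual solver; the two per-station service demands are made-up illustration values, and the think time Z = 2 s echoes the spec's thinktime exp(2).

```python
def mva(demands, think_time, n_users):
    """Exact Mean Value Analysis for a single-class closed queueing
    network of FCFS stations plus a user think-time (delay) station.

    demands: per-station total service demand, in seconds per request
    Returns (throughput, response_time) at population n_users.
    """
    q = [0.0] * len(demands)  # mean queue lengths at population 0
    x = 0.0
    for n in range(1, n_users + 1):
        # Arrival theorem: an arriving request sees the steady-state
        # queues of the same network with n - 1 requests.
        r = [d * (1.0 + qk) for d, qk in zip(demands, q)]
        x = n / (think_time + sum(r))   # throughput, by Little's law
        q = [x * rk for rk in r]        # updated mean queue lengths
    # Little's law over users: N = X * (R + Z)  =>  R = N/X - Z
    return x, n_users / x - think_time

# Sweep the number of users, echoing the "for nusr = 5 to 100" loop above
for nusr in range(5, 101, 5):
    tput, respt = mva([0.010, 0.030], think_time=2.0, n_users=nusr)
    print(f"{nusr},{respt:.4f},{tput:.2f}")
```

Throughput saturates near 1/0.030 ≈ 33.3 requests/s, the capacity of the bottleneck station, while response time grows roughly linearly past saturation, which is the characteristic shape such a sweep produces.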
PowerPerfCenter
• Can specify power-managed devices. Abstraction: a set of speed levels, and the power (watts) consumed at each speed level.
• The current model is based on the Linux governors for frequency-scaled CPUs:
  - Two static governors:
    powersave: frequency fixed at the lowest speed
    performance: frequency fixed at the highest speed
  - Two dynamic governors:
    ondemand: speed jumps to the highest level if utilization exceeds the up threshold
    conservative: speed moves by one step when the low/high utilization thresholds are crossed
• Power usage model within PowerPerfCenter:
  power used by device(speed, utilization) = idlepower(speed) + utilization × max_dynamic(speed)

PowerPerfCenter Specification

    host arpa[1]
        corei5 count 1
        corei5 buffer 99999
        corei5 schedP fcfs
        corei5 power_managed governor conservative
        disk count 1
        disk buffer 99999
        disk schedP fcfs
        disk speedup 1
    end

    powermanagement corei5
        speed_levels 1.2 2.26 2.39 2.53 2.66 2.79 2.93 3.06 3.19 3.19 end
        power_consumed 10.36 69.259 81.911 97.166 112.926 130.306 150.922 171.916 194.771 194.771 end
        idlepower 65 65 65 65 65 65 65 65 65 65 end
        probe_interval 0.08
        governor_up_threshold 80
        governor_down_threshold 20
    end

New output functions:

    power(arpa[1]:corei5)   // average power consumed by corei5
    eperr(arpa[1]:corei5)   // energy per request
    respt(), tput()         // avg response time, and throughput

PowerPerfCenter Validation: Application
• Example: "WebCalendar", a two-tier application.
• Request "scenarios": ViewEvent, ViewWeek, ViewMonth, …

PowerPerfCenter Validation: Testbed
• Web server: Intel Core i5 650 processor
  - One active core
  - Frequency scaled: 1.2 GHz to 3.19 GHz in steps of 133 MHz
  - Idle power consumption: 65 W at all frequency levels
  - Dynamic power consumption: varies with the frequency level
• Database server: Intel Core2 Duo E4500 2.2 GHz processor (not frequency scaled); database populated with 5000 users, 100 events each
• Client for load generation: AutoPerf
  - AutoPerf generates the usual performance metrics as well as per-scenario, per-server resource demands, e.g.
the CPU execution time of a ViewEvent request at the web server.

PowerPerfCenter Validation: Workload
• Dynamic workload (cyclic): the number of simultaneous users changes over time.
• A dynamic workload can highlight the impact of using power-managed devices.
• Each user logs in, then navigates randomly through a set of various requests.
PowerPerfCenter specification:

    workload cyclic
        noofusers 25 30 35 40 45 40 35 30 25 25 20 15 20 25 end
        interval 300 300 300 300 300 300 300 300 300 300 300 300 300 300 end
        thinktime exp(6)

Experiments 1: Validation of Performance Metrics
• WebCalendar performance:
  - measured on the testbed
  - estimated using the PowerPerfCenter model
• Metrics compared: throughput, response time, and resource utilizations.
• Governors compared:
  - Powersave (PO): CPU frequency fixed at the lowest level
  - Conservative (C): speed increases/decreases when the high/low utilization thresholds are crossed
  - Performance (PE): CPU frequency fixed at the highest level

Model vs. Measured: Conservative Governor
[Figure: modeled vs. measured response time, web CPU utilization, and throughput over time, alongside the number of users over time.]

Governors Comparison
• The performance of the Conservative (C) governor is very close to that of the Performance (PE) governor.
• A CPU with the Powersave (PO) governor does not have enough capacity.

Experiments 2: Power-Performance Trade-off Validation
• Only by trend checking; currently not validated by measurement:
  - power consumed by PE > power consumed by C > power consumed by PO
  - response time of PE < response time of C < response time of PO
• Demonstrate the insights made possible by using PowerPerfCenter:
  - Compare a CPU with the same idle power consumption at all frequency settings (corei5) against a CPU with different idle power consumption at different frequency settings (corei7).

Power-Performance Results (corei5)
• Power and performance trends are as expected.
• However, the power saving from the Conservative governor is insignificant.
• Workload 2 (w2): 75 users for 60 seconds, and 3 users for 600 seconds.
• Explanation: CPU utilization is not high enough, so "idlepower" consumption dominates,
which is the same at all speeds, and hence power management is not very useful.

Power-Performance Results (corei7)
• corei7: idle power consumption differs across frequency levels:

    powermanagement corei7
        speed_levels 2.8 3.4 4.0 end
        power_consumed 120 120 195 end
        idlepower 51 66 90 end
    end

• 40% power saving by the Conservative governor compared with the Performance governor, with a much smaller performance penalty than PO.

Summary & Future Work
• PowerPerfCenter: a powerful tool for performance prediction in the presence of power-managed devices.
• Performance metrics validated.
• Power modeling shown to lead to interesting insights into the power-performance trade-off.
• This is just a start (work in progress); lots of future work remains:
  - further validation of performance metrics
  - validation of power metrics
  - a generalized power consumption model
  - analytical models

Thank You / Questions
Acknowledgements:
• Yogesh Bagul and Rakesh Mallick for their contributions to PowerPerfCenter development
• Swetha P. T. Srinivasan for providing us with power usage data
• Senthil Nathan and Swetha P. T. Srinivasan for valuable feedback
This work was supported by Tata Consultancy Services, Mumbai, IBM, Intel IT, and MHRD.
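As a closing illustration, the power usage model and the conservative governor's stepping rule described on the earlier slides can be sketched in Python. This is an illustrative sketch, not PowerPerfCenter's implementation; the numeric tables are the corei5 values from the specification slide, and treating power_consumed as the max_dynamic term of the power formula is an assumption about how the tool interprets that table.

```python
# corei5 tables from the PowerPerfCenter specification slide
SPEED_LEVELS = [1.2, 2.26, 2.39, 2.53, 2.66, 2.79, 2.93, 3.06, 3.19, 3.19]
MAX_DYNAMIC = [10.36, 69.259, 81.911, 97.166, 112.926, 130.306,
               150.922, 171.916, 194.771, 194.771]   # power_consumed (assumed = max_dynamic)
IDLE_POWER = [65.0] * 10                             # idlepower, same at every level
UP_THRESHOLD = 80.0                                  # governor_up_threshold (percent)
DOWN_THRESHOLD = 20.0                                # governor_down_threshold (percent)

def device_power(level: int, utilization: float) -> float:
    """Slide formula: power(speed, util) = idlepower(speed) + util * max_dynamic(speed)."""
    return IDLE_POWER[level] + utilization * MAX_DYNAMIC[level]

def conservative_step(level: int, utilization_pct: float) -> int:
    """Conservative governor: at each probe interval, move one speed level
    up (down) when utilization crosses the high (low) threshold."""
    if utilization_pct > UP_THRESHOLD:
        return min(level + 1, len(SPEED_LEVELS) - 1)
    if utilization_pct < DOWN_THRESHOLD:
        return max(level - 1, 0)
    return level
```

For example, at the lowest level with 50% utilization the model gives 65 + 0.5 × 10.36 ≈ 70.2 W, while at the highest level fully busy it gives 65 + 194.771 ≈ 259.8 W; since the 65 W idle term is identical at every level, this also shows why power management saves little on the corei5 at low utilization.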