Introduction to Control Theory and Its Application to Computing Systems

Tarek Abdelzaher, University of Illinois
Yixin Diao, IBM Research
Joseph L. Hellerstein, Microsoft Developer Division
Chenyang Lu, Washington University
Xiaoyun Zhu, Hewlett Packard Laboratories

June 2, 2008

SIGMETRICS 2008: Introduction to Control Theory. Abdelzaher, Diao, Hellerstein, Lu, and Zhu.

Tutorial Agenda
• Control theory fundamentals
  • Control architecture and taxonomy
  • Simple analytics
  • Application summaries
    • Regulating load for IBM’s Lotus Domino email server
    • Throttling administrative work for IBM’s DB2
    • Optimizing throughput for Microsoft’s .NET thread pool
• Self-tuning memory management in IBM’s DB2
• Control of real-time systems using model-predictive control
• Automated workload management in virtualized data centers
• Managing power and performance in data centers
• Research challenges

Elements of a Control System

[Block diagram: Reference Input, summing junction (+, -), Control Error, Controller, Control Input, Disturbance Input, Target System, Measured Output, Transducer, Transduced Output]

Given the target system and transducer, control theory finds a controller that adjusts the control input to achieve a given measured output in the presence of disturbances.

Components
• Target system: what is controlled
• Controller: exercises control
• Transducer: translates measured outputs

Data
• Reference input: objective
• Control error: reference input minus measured output
• Control input: manipulated to affect output
• Disturbance input: other factors that affect the target system
• Transduced output: result of manipulation
Closed Loop vs. Open Loop

Closed Loop System
[Block diagram: Reference RIS, summing junction (+, -), Closed Loop Controller, MaxUsers, Target System (Server, Sensor), Administrative Tasks, Actual RIS, Measured RIS]
• Adapts
• Simple system model

Open Loop System
[Block diagram: Reference RIS, Open Loop Controller, MaxUsers, Target System (Server, Sensor), Administrative Tasks, Actual RIS, Measured RIS]
• Stable
• Fast settling

Types of Control

[Three block diagrams, each with Controller, MaxUsers, Target System (Server, Sensor), Administrative Tasks, and Measured RIS; the first two also have a Reference RIS and a summing junction (+, -)]

• Regulatory Control: manage to a reference value. Ex: service differentiation, resource management, constrained optimization
• Disturbance Rejection: eliminate the effect of a disturbance. Ex: service level management, resource management, constrained optimization
• Optimization: achieve the “best” value of outputs. Ex: minimize Apache response times

The SASO Properties of Control Systems
• Stability
• Accuracy
• Short Settling
• Small Overshoot
[Figure annotation: Unstable System]

Control Theory By Example – IBM Domino Server

Architecture
[Diagram: Admin, Controller, RPCs, MaxUsers, Server, Desired RIS (RPCs in System), Actual RIS; annotations: Good, Bad]

Block Diagram
[Diagram: r(k) = Desired RIS, summing junction (+, -), e(k), Controller, u(k) = MaxUsers, Server, y(k) = Actual RIS]

Dynamical Analysis of Discrete Time Systems

Server with input $u(k)$ and output $y(k)$: $y(k+1) = a\,y(k) + b\,u(k)$

Z-Transform: $zY(z) = aY(z) + bU(z)$

Transfer Function (TF): $F(z) = \frac{Y(z)}{U(z)} = \frac{b}{z - a}$, with impulse response $(ba^0, ba^1, ba^2, \ldots)$

Pole: Output at time $k$ is proportional to $a^k$, for pole $a$.
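
As a minimal sketch of the first-order model above, the following Python snippet simulates $y(k+1) = a\,y(k) + b\,u(k)$ for a unit impulse and prints the response for several poles. The function name `simulate`, the impulse length, and the choice $b = 1$ are illustrative assumptions, not part of the tutorial.

```python
def simulate(a, b, u, y0=0.0):
    """Return [y(0), ..., y(len(u))] for y(k+1) = a*y(k) + b*u(k)."""
    y = [y0]
    for u_k in u:
        y.append(a * y[-1] + b * u_k)
    return y


# Unit impulse: u(0) = 1, u(k) = 0 afterwards (length chosen for illustration).
impulse = [1.0] + [0.0] * 9

# For an impulse, y(k+1) = b * a**k, so the pole a alone sets the shape:
# |a| < 1 decays, |a| > 1 grows, and a < 0 alternates in sign.
for a in (0.4, 0.9, 1.2, -0.4, -0.9, -1.2):
    y = simulate(a, b=1.0, u=impulse)
    print(f"a={a:+.1f}: " + ", ".join(f"{v:.3f}" for v in y[1:6]))
```

Comparing the printed rows for poles of different magnitude and sign shows how quickly each response decays or grows, and whether it oscillates.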
[Figure: six response plots, one per pole: a = 0.4, a = 1.2, a = 0.9, a = -1.2, a = -0.4, a = -0.9]
• Fast systems have small poles
• Oscillations result if poles are negative or imaginary
• Gain: ratio of steady-state output to steady-state input

Control Design

[Block diagram: r(k), summing junction (+), e(k), Controller, u(k), Server, y(k); transfer-function labels G(z), N(z)]

Closed Loop Transfer Function: $F_R(z) = \frac{Y(z)}{R(z)}$

Example control law: $u(k) = u(k-1) + K_I\,e(k)$

[Figure: y(k) and u(k) versus k, and the poles of $F_R(z)$, for $K_I = 0.1$, $K_I = 1$, and $K_I = 5$]

Key Results From Linear Systems
• Adding signals: $\{c(k) = a(k) + b(k)\}$ has Z-Transform $A(z) + B(z)$.
• Transfer functions in series: $U(z) \to G(z) \to W(z) \to H(z) \to Y(z)$ is equivalent to $U(z) \to G(z)H(z) \to Y(z)$.
• Stable if $|a| < 1$, where $a$ is the largest pole of $G(z)$.
• Settling time: $k_S \approx \frac{-4}{\ln |a|}$, where $|a|$ is the largest pole of $G(z)$.
• Steady state gain of $G(z)$: $\frac{y(\infty)}{u(\infty)} = G(1)$.
• Transfer function of a feedback loop with controller $K(z)$, target system $G(z)$, and transducer $H(z)$: $F_R(z) = \frac{T(z)}{R(z)} = \frac{K(z)G(z)}{1 + H(z)K(z)G(z)}$.

Application 2: Throttling Administrative Work in IBM’s DB2

[Figure: DB2 throughput (tx/sec) versus time (sec), without and with a utility running; annotations: utility started, throughput drops by >70%. Labels: BACKUP, RUNSTATS, REBALANCE, DBA]

Utilities have a big impact on production performance.

Administrative policy: There should be no more than an x% performance degradation of production work as a result of executing administrative utilities.
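
To make the policy and the earlier control-design results concrete, here is a small Python sketch that regulates degradation to a limit with the integral control law $u(k) = u(k-1) + K_I\,e(k)$. Everything about the plant is an assumption made for illustration: the control input `u` is a hypothetical fraction of full speed granted to the utility, degradation is assumed to equal `d_max * u`, and the values `d_max = 0.7`, `limit = 0.1`, and `k_i = 0.5` are arbitrary. Under that assumed plant the closed-loop pole is `1 - k_i * d_max`, so the stability and settling-time results can be checked directly.

```python
import math


def regulate_degradation(limit, d_max, k_i, steps=40):
    """Integral control of a toy plant where degradation = d_max * u.

    u is a hypothetical control input in [0, 1]: the fraction of full
    speed the utility is allowed (1.0 means unthrottled).
    """
    u = 1.0
    degradations = []
    for _ in range(steps):
        m = d_max * u                          # measured degradation (assumed plant)
        degradations.append(m)
        e = limit - m                          # control error: reference minus measurement
        u = min(1.0, max(0.0, u + k_i * e))    # integral law, clamped to [0, 1]
    return degradations


def closed_loop_pole(d_max, k_i):
    """Pole of u(k+1) = (1 - k_i*d_max)*u(k) + k_i*limit for the toy plant."""
    return 1.0 - k_i * d_max


def settling_time(pole):
    """Settling-time estimate k_S = -4 / ln|a| for a stable pole a."""
    return -4.0 / math.log(abs(pole))


trace = regulate_degradation(limit=0.1, d_max=0.7, k_i=0.5)
pole = closed_loop_pole(d_max=0.7, k_i=0.5)
print("final degradation:", round(trace[-1], 4))
print("closed-loop pole:", pole, "stable:", abs(pole) < 1)
print("estimated settling time (steps):", round(settling_time(pole), 1))
```

A larger `k_i` moves the pole toward zero and then past it; once `|1 - k_i * d_max|` reaches 1 the loop is no longer stable, which is the trade-off the pole analysis is meant to expose.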
Choosing an Actuator

[Figure: production throughput and normalized effector value versus time (sec) for two candidate actuators: CPU priority and the sleep mechanism]

Control Architecture

[Block diagram: DBA supplies R; summing junction (+, -) produces E; Controller outputs U to DB2, which also serves the workload (WL); DB2 output Y in pages/sec; Model Estimation and Baseline Estimation produce Y*; Compute Degradation produces M, the % impact fed back to the summing junction]

• R: Impact limit
• E: Error
• U: Sleep %
• Y: Pageometer (pages/sec)
• Y*: Baseline performance
• M: % impact
• (a, b): Model parameters

Assume a linear effect of throttling on Y:
• Model: $Y = aU + b$
• Baseline: $Y^* = a + b$ (the model evaluated at $U = 1$)
• Degradation: $M = \frac{Y^* - Y}{Y^*}$
• Error: $E = R - M$

Online modeling provides a transducer that translates from pages/sec (Y) to % impact (M).

Optimizing Throughput in the Microsoft .NET ThreadPool

[Diagram: QueueUserWorkItem() feeds the ThreadPool; the Controller sets the Concurrency Level and observes the Completion Rate (throughput)]

Current ThreadPool
• Objective: maximize CPU utilization and thread completion rates
• Inputs: ThreadPool events, CPU utilization
• Techniques: thresholds on inter-dequeue times, rate of increasing workers, change in rate of increasing workers
• States: Starvation, RateIncrease, RateDecrease, LowCPU, PauseInjection

New approach
• Objective: maximize thread completion rate
• Inputs: ThreadPool events
• Technique: hill climbing

Hill Climbing Controller

[Figure annotations: Discrete Derivative; Large ControlGain]

Want a large gain so as to move quickly, but not overshoot.
Making good moves depends on
• throughput variance
• shape of curve

[Figure: Throughput versus #Threads. Annotations: Small ControlGain; CurrentHistory; LastHistory; LastConcurrencyLevel; CurrentConcurrencyLevel; NewConcurrencyLevel for large ControlGain; NewConcurrencyLevel for small ControlGain; marker = History mean. Workload: 50 work items, each 100 ms with 10% CPU and 90% wait; 2.2 GHz dual-core x86.]

Hybrid Control State Diagram

States
• State 1 – Initializing LastHistory: LastHistory.Add(data)
• State 1a – InTransition
• State 2 – Looking for move: CurrentHistory.Add(data)
• State 2a – InTransition: CurrentHistory.Add(data)

Transitions (name, condition, actions)
• CompletedInitializing. Condition: IsStableHistory(LastHistory). Actions: LastControlSetting = CurrentControlSetting; CurrentControlSetting = ExploreMove()
• ChangePointWhileInitializing. Condition: IsChangePoint(LastHistory). Actions: LastHistory = data; CurrentControlSetting = ExploreMove()
• ChangePointWhileLookingForMove. Same as ChangePointWhileInitializing.
• WaitForSteadyState. Condition: IsInTransition()
• DirectedMove. Condition: IsSignificantDifference(CurrentHistory, LastHistory). Actions: LastControlSetting = CurrentControlSetting; CurrentControlSetting = DirectedMove(); LastHistory = CurrentHistory; CurrentHistory = null
• StuckInState. Condition: IsStableHistory(CurrentHistory) & CurrentHistory.Count > SufficientlyLargeHistory. Actions: LastControlSetting = CurrentControlSetting; CurrentControlSetting = ExploreMove(); LastHistory = CurrentHistory; CurrentHistory = null
• ReverseBadMove. Condition: CurrentHistory.Count > MinimumHistory & LastHistory.Mean() > CurrentHistory.Mean(). Action: Swap(CurrentControlSetting, LastControlSetting)
• ChangePointInQueueWaiting. Condition: IsChangePoint(QueueOfWaiting)
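
The slides name the ingredients of the hill climbing controller (a discrete derivative, a ControlGain, and the means of the last and current histories) but do not give the update rule. The Python sketch below is one plausible reading, offered as an assumption rather than the ThreadPool's actual algorithm: the next concurrency level is the current level plus ControlGain times the discrete derivative of throughput. The throughput curve, the starting level, the gain, and the function names are all hypothetical, and the measurement is noise-free so that a single sample stands in for a history mean.

```python
def throughput(level):
    """Hypothetical noise-free throughput curve peaking at 30 threads."""
    return 160.0 - 0.1 * (level - 30) ** 2


def hill_climb(measure, start_level=5, control_gain=4.0, max_moves=20,
               min_level=1, max_level=50):
    """Climb toward higher throughput using a discrete derivative.

    measure(level) stands in for the mean of a measurement history taken
    at that concurrency level. Returns the best level observed.
    """
    last_level, last_mean = start_level, measure(start_level)
    current_level = start_level + 1            # initial exploratory move
    best_level, best_mean = last_level, last_mean
    for _ in range(max_moves):
        current_mean = measure(current_level)
        if current_mean > best_mean:
            best_level, best_mean = current_level, current_mean
        # Discrete derivative of throughput with respect to concurrency level.
        slope = (current_mean - last_mean) / (current_level - last_level)
        move = round(control_gain * slope)
        new_level = min(max_level, max(min_level, current_level + move))
        if new_level == current_level:         # no further move: stop climbing
            break
        last_level, last_mean = current_level, current_mean
        current_level = new_level
    return best_level


print("best concurrency level found:", hill_climb(throughput))
```

With real, noisy measurements the derivative would be computed from history means, and the checks named in the state diagram (IsStableHistory, IsSignificantDifference, ReverseBadMove) would guard each move; this sketch omits them for brevity.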