Feedback Control of QoS
Tarek Abdelzaher
Department of Computer Science
University of Virginia

The Web QoS Group
- Group formed in 1999
- Projects:
  - Web performance
  - Deeply embedded sensor networks
  - Real-time systems
- Students: Ying Lu, Chenyang Lu, Sagnik Bhattacharya, Seejo Sebastine

Performance Control in Server End Systems
- How to design adaptive services that meet pre-specified performance requirements?
- How to model the effects of feedback (adaptation) in software architectures for QoS guarantees?
- How to use software feedback to achieve performance requirements?

Observation
- Physical and engineering sciences have a well-developed analytic foundation for performance control in physical systems
- No such unified foundation exists for performance control of software services
- The objective of this research effort is to establish such a foundation based on control theory and scheduling theory

Why Control Theory?
- Successful track record in physical process control
- Performance guarantees in the face of uncertainty, non-linearities, time-variations, etc.
- Does not require accurate system models
- Utilizes feedback to improve performance
- Performance of software services is governed by queuing dynamics which may be expressed by differential equations akin to those of physical systems

Feedback Control Versus Queuing Theory
- Queuing theory
  - Off-line predictive analysis
  - Assumptions about the arrival process
  - Difficult to analyze some distributions
- Control theory
  - On-line input/output difference equations
  - No assumptions about the arrival process
  - Utilizes run-time feedback for error correction

Feedback Control Versus Optimization
- Optimization: works better if the performance problem is formulated as one of maximizing or minimizing some metric
- Control theory: works well if the performance problem is one of maintaining an invariant, or is a tradeoff between two conflicting metrics

Software Performance Control
- Control theory: robust guarantees on aggregate state and global performance metrics (e.g., average delay, total utilization, etc.)
- Scheduling theory: guarantees on “microscopic” performance metrics (e.g., individual response times)
- Conditions on aggregate state

Theoretical Elements of a QoS Control Methodology
[Diagram labels: Computing Tasks, Difference Equation Models, Modeling, Resource Queues, Desired Performance, Resource Scheduling, Feedback Control, Scheduling Theory, Fine-grained Performance Guarantees]

Potential Applications
- Performance-assured services (e-commerce, online trading)
- Service differentiation
- Contractual satisfaction guarantees
- Overload control

Example: Illustrating the Methodology
- Consider the problem of delay differentiation between two classes of traffic in a multi-class web server
- It is desired to control server resource allocation to the two traffic classes such that a desired average delay ratio is observed

Run-time Server Modeling
- A server can be modeled as a dynamic system
- Queues give rise to difference equations
- Current performance (output) depends on a finite history of resource allocation decisions (inputs)
- Server model:

$$V(m) = \sum_{j=1}^{n} a_j V(m-j) + \sum_{j=1}^{n} b_j U(m-j)$$

  - V(m): measured relative delay in the mth sampling window
  - U(m): resource allocation in the mth sampling window

A Model of Delay Differentiation

$$V(m) = \sum_{j=1}^{n} a_j V(m-j) + \sum_{j=1}^{n} b_j U(m-j)$$

- Delay differentiation:
  - Input: assigned process ratio
  - Output: delay ratio
- Model parameters: $\{a_j, b_j \mid 1 \le j \le n\}$
[Diagram labels: white-noise generator, least squares estimator, {C0, C1}, {B0, B1}, TCP connection requests, TCP listen queue, HTTP requests, HTTP response, Connection Scheduler, Web Server, Server Process, monitor]

Model Estimation Results
- A second-order difference equation fits the Apache server well

Controller Design
- PI control:

$$U_k(k) = K_P \Bigl( E_k(k) + K_I \sum_{j=0}^{k-1} E_k(j) \Bigr)$$

Root-Locus Method
- Relative delay controller
  - Settling time: $T_S = 4.5$ min
  - Steady-state error: $E_S = 0$
[Figure: root locus with closed-loop poles; axes: Real Axis, Imaginary Axis]

Server Feedback Control
[Diagram labels: $\{W_k \mid 0 \le k < N\}$, $\{C_k \mid 0 \le k < N\}$, $\{B_k \mid 0 \le k < N\}$, Controllers, monitor, TCP connection requests, TCP listen queue, HTTP requests, HTTP response, Connection Scheduler, Web Server, Server Process, Delay Ratio Reference, Process Ratio]

Experimental Data (relative delay)
[Figure, panel (a) Adaptive Server: delay ratio vs. time (sec) as #premium-users goes from 100 to 200, with the designed settling time marked]
[Figure, second panel: delay ratio vs. time as #premium-users goes from 100 to 200. Annotation: basic users get shorter delays than premium users!]
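The modeling and controller-design slides above (a second-order difference-equation server model driven by a PI controller) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the controller from the talk: the coefficients `a` and `b`, the gains `kp` and `ki`, and the name `simulate_pi_loop` are all hypothetical, chosen only so that the closed loop is stable. The integral term sums the errors of earlier sampling windows, which is also an assumption about the slide's summation limits.

```python
def simulate_pi_loop(reference, steps, kp=0.5, ki=0.5,
                     a=(0.5, 0.1), b=(0.3, 0.1)):
    """Simulate a PI controller around a second-order difference-equation model.

    Model (hypothetical coefficients, not fitted to any real server):
        V(m) = a1*V(m-1) + a2*V(m-2) + b1*U(m-1) + b2*U(m-2)
    Controller:
        U(m) = kp * (E(m) + ki * sum of E(j) for j < m),  E(m) = reference - V(m)
    Returns the list of outputs V(0), ..., V(steps-1).
    """
    v_prev, v_prev2 = 0.0, 0.0   # V(m-1), V(m-2)
    u_prev, u_prev2 = 0.0, 0.0   # U(m-1), U(m-2)
    error_sum = 0.0              # accumulated error from earlier windows
    outputs = []
    for _ in range(steps):
        # The "server" responds to the finite history of past allocations.
        v = a[0] * v_prev + a[1] * v_prev2 + b[0] * u_prev + b[1] * u_prev2
        # The monitor reports the output; the controller computes the error.
        error = reference - v
        u = kp * (error + ki * error_sum)
        error_sum += error
        # Shift the histories by one sampling window.
        v_prev, v_prev2 = v, v_prev
        u_prev, u_prev2 = u, u_prev
        outputs.append(v)
    return outputs


if __name__ == "__main__":
    trace = simulate_pi_loop(reference=2.0, steps=200)
    print("final output:", trace[-1])
```

With these illustrative values the output settles at the reference, which mirrors the zero steady-state error that integral action is meant to provide. Real gains would come from a root-locus design against the estimated model.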
[Figure, panel (b) Non-adaptive Server; x-axis: Time (sec)]

Middleware for QoS Control
- API for plug-in performance sensors and actuators
- Common sensor/actuator library
- Engine for mapping QoS specifications to control loops
- Run-time enforcement of QoS guarantees
[Diagram labels: Plug-in Sensors, QoS API, Sensor API, Loop Configuration, Controlled System (Server, Proxy), Plug-in Actuators, Control API, Performance Control Middleware, Common Platforms]

The Middleware Suite
- Run-time modeling tools
- Capacity planning and resource assignment:
  - Overload/throughput control (CDC ’00, IWQoS ’99)
  - Performance isolation (IEEE TPDS ’01)
- Service differentiation tools:
  - Automated profiling (RTAS ’00)
  - Server delay differentiation (RTAS ’01)
  - Cache hit ratio differentiation (ICDCS ’01)
  - Router delay differentiation (sub. to Infocom ’02)
  - Prioritization (IWQoS ’99)
- Absolute delay guarantee tools (RTAS ’01)

Middleware Example: Service Differentiation Tools
- Proportional Differentiated Web Services

Architecture

Differentiated Services
- Problem statement:
  - N classes of users/traffic
  - Average delay of class j is $D_j$
  - It is required that $D_1 : D_2 : \dots : D_N = K_1 : K_2 : \dots : K_N$
  - $K_1, K_2, \dots, K_N$ are specified weights
- Control-theoretical formulation?

Control-Theoretical Formulation
- The differentiation objective: $D_1 : D_2 : \dots : D_N = K_1 : K_2 : \dots : K_N$
- One feedback loop per class
- The feedback control objective:

$$\frac{D_i}{D_1 + D_2 + \dots + D_N} = \frac{K_i}{K_1 + K_2 + \dots + K_N}$$

- Error $e_i$:

$$e_i = \frac{K_i}{K_1 + K_2 + \dots + K_N} - \frac{D_i}{D_1 + D_2 + \dots + D_N}$$

Control Loop Output
- Adjust the resource allocation of each class j by $\Delta R_j$
- $\Delta R_j = f(e_j)$, where f is linear and $f(0) = 0$
- The resource conservation property: $\sum_j \Delta R_j = 0$
- Proof: the normalized weights and the normalized delays each sum to 1, so $\sum_j e_j = 0$, and therefore

$$\sum_j \Delta R_j = \sum_j f(e_j) = f\Bigl(\sum_j e_j\Bigr) = f(0) = 0$$

Application: Differentiated Web Caching
- Goal: different content classes receive different hit ratios

Experimental Setup 1
- Web clients: Surge, a tool that generates references matching empirical measurement
- Servers: Apache
- Cache: Squid; cache size to file population is roughly 1 to 30

Performance
[Figure: relative hit ratio (RelativeHR) vs. time (sec), 0 to 1000 s; series class0, class1, class2 with targets goal0, goal1, goal2]

Experimental Setup 2
- Clients: replay NLANR sanitized access logs
  - class0: html files
  - class1: non-html files
- Servers: real servers on the internet

Latency Reduction

$$\text{Backbone latency reduction} = \frac{\sum_{j} \text{latency}_j}{\sum_{i} \text{latency}_i}$$

- j includes all the pages that hit in the cache
- i includes all the requested pages

Software Performance Control
- Control theory: middleware solution for robust guarantees on aggregate performance metrics (e.g., average delay, total utilization, etc.)
- Scheduling theory: guarantees on “microscopic” performance metrics (e.g., individual response times)
- Conditions on aggregate state

Role of Scheduling Theory: Absolute Delay Guarantees
- A constant-time admission test based on current server utilization
- All admitted tasks are guaranteed to meet their deadlines
- Arbitrary number of traffic classes
- No assumptions about the task arrival process

Main Results
- All arrivals will meet their deadlines under an optimal fixed-priority scheduling policy if:

$$U(t) \le \frac{1}{1 + \sqrt{1/2}}$$

- Deadline monotonic scheduling is the optimal fixed-priority scheduling policy

Main Idea of Derivation
- Minimize, over all arrival patterns $\zeta$, the maximum $U_\zeta(t)$ that precedes a missed deadline
[Figure: $U_\zeta(t)$ vs. t, marking the maximum $U_\zeta(t)$ and the missed deadline]

Evaluation
- Deadline miss ratio depends on CPU utilization
- Aperiodic (nonstationary) service requests meet their deadlines when utilization is below the bound
- The utilization bound can serve as a control set point

The Future Vision
- An analytic foundation for performance control: putting it all together

Putting it all Together: Step 1 - Feasibility Bounds
- Efficient QoS feasibility tests based on aggregate measurements
[Figure: three utilization plots (0% to 100%) dated 1973, 2001, and 2003, progressing from Periodic Load (schedulable bound, schedulable region) via Relaxed Periodicity to Random Load (generalized schedulable bound, generalized schedulable region) and via Connectivity to Random Load in a Distributed System]

Putting it all Together: Step 2 - Aggregate Models
- System models without load knowledge
[Diagram labels: QoS Guarantees on Aggregate Behavior, Assumptions about Load Arrival Process, Aggregate Queuing Models, Closed Loop Feedback Control Dynamics, Server Difference Equation Models, Individual Requests (Microscopic Models), Aggregate Service Profiles]

Putting it all Together: Step 3 - Feedback Control
- Distributed control to maintain global sufficient conditions for desired behavior
[Diagram: aggregate state variables Var1, Var2, Var3; labels: State Control Loops, Desired Aggregate Behavior, Feasible Region, Aggregate Performance Guarantees]

Conclusions
- A first step towards an underlying analytic foundation and design methodology for performance control in software systems
- A middleware library that embodies the control loop prototypes
- Theory to relate aggregate state to fine-grained performance guarantees

Future Work
- Study the characteristic features of software feedback control systems
- Establish a better understanding of the limitations of control theory
- Integrate control theory with real-time scheduling theory for robust fine-grained guarantees on temporal behavior and QoS
- Implement successful performance control mechanisms in the QoS control middleware

Acknowledgements
I would like to acknowledge:
- Chenyang Lu, for his work on delay differentiation in web servers and for contributing slides to this talk
- Ying Lu and Avneesh Saxena, for their work on differentiated caching services
- Jack Stankovic, Sang Son, Gang Tao, Nina Bhatti, Kang Shin, Kevin Skadron, and Jorg Liebeherr, for their collaboration and help
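
Supplementary sketch: the constant-time admission test from the "Absolute Delay Guarantees" and "Main Results" slides can be illustrated as follows. This is a sketch under stated assumptions rather than the talk's implementation: it assumes each admitted request contributes its execution time divided by its relative deadline to the current utilization and that the contribution is removed when the caller reports the request as finished or expired. The names `AdmissionController`, `try_admit`, and `release` are hypothetical.

```python
import math

# Utilization bound from the "Main Results" slide: 1 / (1 + sqrt(1/2)).
UTILIZATION_BOUND = 1.0 / (1.0 + math.sqrt(0.5))


class AdmissionController:
    """Constant-time, utilization-based admission test (illustrative only)."""

    def __init__(self, bound=UTILIZATION_BOUND):
        self.bound = bound
        self.utilization = 0.0  # current aggregate utilization

    def try_admit(self, exec_time, deadline):
        """Admit the request only if utilization stays within the bound."""
        demand = exec_time / deadline  # assumed per-request contribution
        if self.utilization + demand <= self.bound:
            self.utilization += demand
            return True
        return False

    def release(self, exec_time, deadline):
        """Remove a request's contribution once it no longer counts."""
        self.utilization = max(0.0, self.utilization - exec_time / deadline)


if __name__ == "__main__":
    ctrl = AdmissionController()
    for i in range(3):
        # Each request needs 1 time unit and must finish within 4.
        print("request", i, "admitted:", ctrl.try_admit(1.0, 4.0))
```

The test costs one addition and one comparison per arrival, independent of the number of traffic classes, which is what makes it usable as a set point for a feedback loop.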