An Optimal Service Ordering for a World Wide Web Server

A Presentation for the Fifth INFORMS Telecommunications Conference, March 6, 2000

Amy Csizmar Dalal
Northwestern University
Department of Electrical and Computer Engineering
email: amy@ece.nwu.edu

Scott Jordan
University of California, Irvine
Department of Electrical and Computer Engineering
email: sjordan@uci.edu

Overview

· Motivation/problem statement
· Derivation of the optimal policy
· Two extensions to the original model
· Simulation results
· Conclusions

Problems with current web server operation

· Currently, web servers share processor resources among several requests simultaneously.
· Web users are impatient: they tend to abort page requests that do not receive a response within several seconds.
· If users time out, the server wastes resources on requests that never complete, a problem especially for heavily loaded web servers.
· Objective: decrease user-perceived latency by minimizing the number of user file requests that "time out" before completion.

A queueing model approach to measuring web server performance

· Performance = P{user will remain at the server until its service completes}, a decreasing function of response time
· Service times ~ exp(μ), i.i.d., proportional to the sizes of the requested files
· Request arrivals ~ Poisson(λ)
· The server is unaware of whether and when requests are aborted
· The server alternates between busy cycles, when there is at least one request in the system, and idle cycles (state 0), when there are no requests waiting or in service. These cycles are i.i.d.
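The model just described (Poisson(λ) arrivals, exp(μ) service times, and a reward that decays exponentially in a request's time in system) can be sketched in a few lines of code. This is a minimal illustration under a plain FIFO discipline, not the authors' simulator; `simulate_fifo` and all parameter values are assumptions chosen for the example.

```python
import math
import random

def simulate_fifo(lam, mu, c, n_requests, seed=1):
    """Single-server FIFO queue with Poisson(lam) arrivals and exp(mu)
    service times.  Each completed request i earns revenue e^{-c * T_i},
    where T_i is its time in system, mirroring the decaying reward."""
    rng = random.Random(seed)
    t = 0.0            # arrival clock
    free_at = 0.0      # time at which the server next becomes free
    total = 0.0
    for _ in range(n_requests):
        t += rng.expovariate(lam)               # next Poisson arrival
        start = max(t, free_at)                 # wait if the server is busy
        free_at = start + rng.expovariate(mu)   # exponential service time
        total += math.exp(-c * (free_at - t))   # exponentially discounted reward
    return total / n_requests

# A lightly loaded server keeps responses short, so per-request revenue
# is higher than under heavy load.
light = simulate_fifo(lam=1.0, mu=10.0, c=0.5, n_requests=20000)
heavy = simulate_fifo(lam=9.0, mu=10.0, c=0.5, n_requests=20000)
```

The revenue gap between the two load levels previews the later simulation results: losses from impatient users are concentrated at high offered load.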
· Switching time between requests is negligible

Web server performance model

· Revenue earned by the server under policy π from serving request i to completion at time t_i^π:
  g_i^π(t_i^π) = e^{−c(t_i^π − x_i)}, where x_i is the arrival time of request i
· Potential revenue of request i at time t: c_i^π = g_i^π(t)
· By the Renewal Reward Theorem, the average reward earned by the server per unit time under policy π is
  V^π(0) = E[ Σ_{i=0}^{N} g_i^π(t_i^π) ] / E[Z_1]
  where N is the number of requests served in a busy cycle and Z_1 is the cycle length

Web server performance: objective

· Find a service policy that maximizes the revenue earned per unit time:
  V* = max_π V^π(0), with π* = arg max_π V^π(0) the optimal policy

Requirements for an optimal service ordering policy

Lemma 1: An optimal policy is non-idling.
· Difference in expected revenues between the non-idling and idling alternatives, over a small interval of length δ:
  g_{j*}(a_2) μδ (1 − e^{−cδ}) > 0

Lemma 2: An optimal service ordering policy switches between jobs in service only upon an arrival to or a departure from the system.
· If g_{(j+1)*}(a_1) ≥ g_{j*}(a_1): use the first alternate policy; revenue difference = g_{(j+1)*}(a_1) μδ (1 − e^{−cδ})
· If g_{(j+1)*}(a_1) < g_{j*}(a_1): use the second alternate policy; revenue difference = μδ e^{−cδ} (e^{−ca_1} − e^{−ca_2})(e^{c x_{j*}} − e^{c x_{(j+1)*}})

Lemma 3: An optimal policy is non-processor-sharing.
Proof: Consider a portion of the sample path over which the server processes two jobs to completion. Three possibilities:
· Serve 1 then 2: expected revenue is ER(1) = c_1 μ/(c+μ) + c_2 (μ/(c+μ))²
· Serve 2 then 1: expected revenue is ER(2) = c_2 μ/(c+μ) + c_1 (μ/(c+μ))²
· Serve both until one completes, then serve the other: expected revenue is q·ER(1) + (1−q)·ER(2) for some q ∈ [0,1], a convex combination that can never exceed max{ER(1), ER(2)}

Lemma 4: An optimal policy is Markov (independent of both past and future arrivals).
· Follows because service times and interarrival times are exponential, hence memoryless

Definition of an optimal policy

· State vector at time t under policy π: Γ_t^π = (c_1^π, c_2^π, …, c_N^π)
· Best job: i* = arg max_i c_i^π
· The optimal policy chooses the best job to serve at any time t, regardless of the order of service chosen for the rest of the jobs in the system (greedy)
· Since all potential revenues decay at the same rate c, the most recent arrival always has the highest c_i^π, so the optimal policy is LIFO-PR (last-in, first-out, preemptive-resume)

Extension 1: Initial reward

· Initial
reward C_i indicates the user's perceived probability of completion of service upon arrival at the web server
· Application: QoS levels
· Potential revenue: g_i^π(t) = C_i e^{−c(t − x_i)}
· A natural extension of the original system model
· Optimal policy: serve the request with the highest potential revenue (a function of the time in the system so far and the initial reward)

Extension 2: Varying service time distributions

· Service time of request i ~ exp(μ_i), independent
· Application: different content types are best described by different distributions
· Additional initial assumptions:
  › switch between jobs in service only upon an arrival to or a departure from the system
  › non-processor-sharing
· We show that the optimal policy is
  › non-idling
  › Markov

Extension 2 (continued)

· State vector at time t under policy π: Γ_t^π = ((c_1^π, μ_1), (c_2^π, μ_2), …, (c_N^π, μ_N))
· Best job: i* = arg max_i c_i^π μ_i
· Optimal policy: serve the request with the highest c_i μ_i product

Simulation results

· Revenue (normalized by the number of requests served during a simulation run) as a function of offered load (λ/μ)
· Mean file size = 2 kB
· Extension 1: C_i ~ U[0,1)
· Extension 2: mean file sizes of 2 kB, 14 kB, and 100 kB
· Implemented in BONeS™

[Figure: Normalized revenue vs. offered load, 50 requests/sec, Extension 1. Policies compared: PB, LIFO-PR, FIFO, PS.]

Conclusions

· Using a greedy service policy rather than a fair policy at a web server can improve server performance by up to several orders of magnitude
· Performance improvements are most pronounced at high server loads
· Under certain conditions, a processor-sharing policy may generate a higher average revenue than comparable non-processor-sharing policies. These conditions remain to be quantified.
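The greedy rule from Extension 2 is easy to state in code: at each decision epoch, serve the request whose product of potential revenue and service rate, c_i^π μ_i, is largest, with the initial reward C_i of Extension 1 folded into the potential revenue. A minimal sketch, assuming a simple tuple representation for jobs; the names `potential_revenue` and `best_job` are mine, not from the slides.

```python
import math

def potential_revenue(C, c, age):
    """Potential revenue of a request: its initial reward C (Extension 1)
    discounted exponentially in its time in the system so far."""
    return C * math.exp(-c * age)

def best_job(jobs, c, now):
    """Greedy rule from Extension 2: among the requests in the system,
    pick the one maximizing (potential revenue) x (service rate).
    Each job is a tuple (arrival_time, initial_reward, mu_i)."""
    return max(jobs, key=lambda j: potential_revenue(j[1], c, now - j[0]) * j[2])

# Two jobs in the system at time t = 2: an older large file (slow service
# rate) versus a fresh small file (fast service rate).
jobs = [(0.0, 1.0, 0.1),   # arrived at t = 0.0, mu = 0.1
        (1.5, 1.0, 0.5)]   # arrived at t = 1.5, mu = 0.5
pick = best_job(jobs, c=1.0, now=2.0)
```

Here the fresher, faster-to-serve request wins on both factors, which is exactly the behavior that makes the optimal policy LIFO-PR-like rather than fair.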