IPAM Workshop, 18-19 March 2002
Insensitivity in IP network performance
Jim Roberts (et al.) (james.roberts@francetelecom.com)
France Télécom R&D

Traffic theory: the fundamental three-way relationship
- traffic engineering relies on understanding the relationship between demand, capacity and performance
  - demand: volume, characteristics
  - capacity: bandwidth, how it is shared
  - performance: response time, loss, delay

Outline
- characterizing demand
- statistical bandwidth sharing
- performance of elastic traffic
- integrating streaming traffic
- extensions for an integrated network

Characterizing traffic demand
- systematic traffic variations: hour of the day, day of the week, ...
- subject to statistical variations and growth trends
- stochastic variations: model traffic as a stationary stochastic process at the intensity of a representative demand
- prefer a characterization at flow level (not packets or aggregates)
  - "appliflow": a set of packets from one instance of an application
  - two kinds of flow: streaming and elastic
[figure: traffic traces over one week and one day, 0 to 50 Mb/s]

Streaming and elastic traffic
- elastic traffic is closed-loop controlled
  - digital documents (Web pages, files, ...)
  - rate and duration are measures of performance
  - QoS: adequate throughput (response time)
- streaming traffic is open-loop controlled
  - audio and video, real time and playback
  - rate and duration are intrinsic characteristics
  - QoS: negligible loss, delay, jitter

A generic traffic model
- a session consists of a succession of flows separated by "think times"
  - flows are streaming or elastic
  - think times begin at the end of each flow
  - sessions are mutually independent
- assume each flow is transmitted as an entity
  - e.g., the multiple TCP connections of one Web page constitute one flow
- sessions occur as a homogeneous Poisson process
  - an Internet "invariant" [Floyd and Paxson, 2001]
[figure: start of session, flow arrivals, think times, end of session]
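As an illustration (a sketch, not from the talk), the following Python fragment generates flow arrival epochs under this generic model. The geometric flow count and exponential durations are illustrative assumptions; the model itself fixes only the session structure and the Poisson session process.

    import random

    # A sketch, not from the talk: flow arrivals under the generic traffic
    # model (Poisson sessions, flows separated by think times). The geometric
    # flow count and exponential durations are illustrative assumptions.
    def generate_flows(session_rate, mean_flows, mean_flow_dur,
                       mean_think, horizon):
        """Return a time-sorted list of (flow_start, session_id)."""
        arrivals, t, sid = [], 0.0, 0
        while True:
            t += random.expovariate(session_rate)     # Poisson session arrivals
            if t > horizon:
                break
            sid += 1
            n = 1                                      # geometric number of
            while random.random() > 1.0 / mean_flows:  # flows, mean mean_flows
                n += 1
            start = t
            for _ in range(n):
                arrivals.append((start, sid))
                # the next flow starts after this flow ends plus a think time
                start += (random.expovariate(1.0 / mean_flow_dur)
                          + random.expovariate(1.0 / mean_think))
        return sorted(arrivals)

    flows = generate_flows(session_rate=1.0, mean_flows=5,
                           mean_flow_dur=2.0, mean_think=8.0, horizon=100.0)
    print(len(flows), flows[:3])

Note that in reality a flow's duration is determined by the bandwidth sharing itself; a nominal duration is drawn here only to place the think times.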
Outline
- characterizing demand
- statistical bandwidth sharing
- performance of elastic traffic
- integrating streaming traffic
- extensions for an integrated network

Statistical bandwidth sharing
- reactive control (TCP, scheduling) shares bottleneck bandwidth...
- ...unequally, depending on RTT, protocol implementation, etc.
- there are no persistent flows in the Internet
  - utility is not a function of instantaneous rate
  - performance is measured by the response time of finite flows
- response time depends more on the traffic process than on the static sharing algorithm

Performance of an isolated bottleneck
- consider an isolated bottleneck of capacity C bit/s shared by many users independently accessing many servers
- performance is measured by flow response time...
- ...or realized throughput (= size / response time)
- assume (for now) Poisson flow arrivals at rate λ
- consider a fluid model, assuming packet level dynamics realize the bandwidth sharing objectives
[figure: users, link of capacity C bit/s, servers]

A fluid model of statistical bandwidth sharing
- the number of flows in progress is a stochastic process
  - N(t) = number of flows in progress at time t
  - rate_i(t) = rate of flow i
  - rate_i(t) ≤ max_rate_i(t) (due to other throughput limits)
- assume instant max-min fair sharing:
  - rate of flow i: rate_i(t) = min {max_rate_i(t), fairshare}
  - where fairshare is such that Σ_i rate_i(t) = min {Σ_i max_rate_i(t), C}
- realized approximately by TCP...
- ...or more accurately by fair queuing
[figure: users with external rate limits max_rate_i(t), link of capacity C bit/s, servers]
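The fair share can be computed by the standard water-filling procedure. The following sketch (an implementation assumption, not part of the talk) realizes the definition above:

    # Water-filling computation of the max-min fair allocation defined above.
    # A sketch: inputs are the link capacity C (bit/s) and the external rate
    # limits max_rate_i of the flows currently in progress.
    def max_min_shares(C, max_rates):
        """rate_i = min(max_rate_i, fairshare), with fairshare chosen so
        that sum(rate_i) = min(sum(max_rates), C)."""
        if sum(max_rates) <= C:            # link unsaturated: limits bind
            return list(max_rates)
        rates = [0.0] * len(max_rates)
        remaining = C
        active = sorted(range(len(max_rates)), key=lambda i: max_rates[i])
        while active:
            share = remaining / len(active)
            i = active[0]
            if max_rates[i] <= share:      # flow i saturates its own limit
                rates[i] = max_rates[i]
                remaining -= max_rates[i]
                active.pop(0)
            else:                          # the rest split the residue equally
                for j in active:
                    rates[j] = share
                break
        return rates

    # Example: C = 10 Mbit/s, limits of 1, 4 and 8 Mbit/s -> [1.0, 4.0, 5.0]
    print(max_min_shares(10.0, [1.0, 4.0, 8.0]))

When every limit exceeds C this reduces to the equal share C/N(t) used in case 1 of the next slide.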
Three link bandwidth sharing models
1. a fair sharing bottleneck
   - external limits are ineffective: max_rate_i(t) ≥ C, so rate_i(t) = C/N(t)
   - e.g., an access link
2. a fair sharing bottleneck with a common constant rate limit
   - flow bandwidth cannot exceed r: max_rate_i(t) = r, so rate_i(t) = min {r, C/N(t)}
   - e.g., r represents the modem rate
3. a transparent backbone link
   - max_rate_i(t) << C and the link is unsaturated: Σ_i max_rate_i(t) < C, so rate_i(t) = max_rate_i(t)
[figure: users with external rate limits max_rate_i(t), link, servers]

Example: a fair sharing bottleneck (case 1)
- an M/G/1 processor sharing queue
  - Poisson arrival rate λ, mean flow size s
  - offered traffic λs, load ρ = λs/C (assume ρ < 1)
  - max_rate_i(t) ≥ C, so rate_i(t) = C/N(t)
- stationary distribution of the number of flows in progress:
  - p(n) = Pr [N(t) = n] = ρ^n (1 − ρ)
- expected response time of a flow of size s:
  - R(s) = s / (C (1 − ρ))
  - mean throughput = s / R(s) = C (1 − ρ)

ns simulation of TCP bandwidth sharing
- Poisson flow arrivals, 1 Mbit/s link, load 0.5, Pareto size distribution
[figure: throughput (kbit/s) vs flow size (kbytes); "mice" attain throughput C/n with probability p(n), "elephants" attain the mean throughput C(1 − ρ) = 500 kbit/s]

Impact of max_rate: mean throughput as a function of load
- fair sharing: E[throughput] = C(1 − ρ)
- fair sharing with rate limit: E[throughput] ≈ min {max_rate, C(1 − ρ)}
[figure: mean throughput vs load ρ from 0 to 1, for both cases]
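A quick numerical companion to these formulas (a sketch, using the parameters of the ns simulation slide; the 2 Mbit flow size below is an arbitrary example):

    # Numerical check of the M/G/1 processor-sharing formulas above, with
    # the parameters of the ns simulation slide: C = 1 Mbit/s, load 0.5.
    C = 1e6        # link capacity, bit/s
    rho = 0.5      # load = lambda * s / C, assumed < 1

    def p(n):
        """Stationary Pr[N(t) = n]: geometric with parameter rho."""
        return (1 - rho) * rho ** n

    def R(size):
        """Expected response time of a flow of 'size' bits."""
        return size / (C * (1 - rho))

    print([round(p(n), 3) for n in range(4)])   # [0.5, 0.25, 0.125, 0.062]
    print(R(2e6))                               # 4.0 s for a 2 Mbit flow
    print(C * (1 - rho))                        # mean throughput: 500 kbit/s

    # Case 2 (common rate limit): mean throughput is capped by max_rate
    max_rate = 2e5                              # e.g., 200 kbit/s access limit
    print(min(max_rate, C * (1 - rho)))         # 200 kbit/s: the limit binds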
Accounting for non-Poisson flow arrivals
- a stochastic network representation (cf. BCMP 1975, Kelly 1979)
  - sessions are customers arriving from the outside
  - the link is a processor sharing station
  - think-time is an infinite server station
- a network of "symmetric queues" (e.g., PS, infinite server):
  - customers successively visit the stations a certain number of times before leaving the network
  - customers change "class" after each visit; class determines the service time distribution, the next station and the class in that station
  - customers arrive as a Poisson process
- performance of the stochastic network: a product form occupancy distribution, insensitive to service time distributions
- bandwidth sharing performance is as if flow arrivals were Poisson
  - same distribution of flows in progress, same expected response time
- class: a versatile tool to account for correlation in flow arrivals
  - different kinds of session, distribution of the number of flows per session, ...
- a general performance result for any mix of elastic applications
[figure: Poisson session arrivals, flows through the link and think-time stations, end of session]

Accounting for unfair sharing
- slight impact of unequal per-flow shares
  - e.g., green flows get 10 times as much bandwidth as red flows
- approximate insensitivity
[figure: per-class throughput vs offered traffic, premium class and best effort class]

Extension for statistical bandwidth sharing in networks
- insensitivity in PS stochastic networks with state dependent service rates
- theory of Whittle networks (cf. Serfozo, 1999)...
- ...applied to statistical bandwidth sharing (Bonald and Massoulié, 2001; Bonald and Proutière, 2002)
- one station per route
[figure: session arrivals, think time, end of session]

Underload and overload: the meaning of congestion
- in normal load (i.e., ρ < 1):
  - performance is insensitive to all details of session structure
  - response time is mainly determined by external bottlenecks (i.e., max_rate << C)
- but overloads can (and do) occur (i.e., ρ > 1):
  - due to bad planning, traffic surges, outages, ...
  - the queuing models are then unstable, i.e., E[response time] → ∞
- to preserve stability: flow based admission control!
[figure: throughput and number of flows as load crosses 1]
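The insensitivity claim can be checked in the simplest setting with a small simulation (a sketch, not from the talk): an M/G/1-PS link with two flow size distributions of equal mean yields the same time-average number of flows, ρ/(1 − ρ).

    import random

    # A sketch: event-driven simulation of the fair sharing bottleneck
    # (M/G/1-PS). With load held at rho = 0.5, the time-average number of
    # flows stays near rho/(1 - rho) = 1 whichever size distribution is
    # used: insensitivity in the simplest setting.
    def mean_flows_in_progress(C, lam, draw_size, horizon, seed):
        rng = random.Random(seed)
        t, residual, area = 0.0, [], 0.0
        next_arrival = rng.expovariate(lam)
        while t < horizon:
            n = len(residual)
            dt_arr = next_arrival - t
            dt_dep = min(residual) * n / C if n else float('inf')
            dt = min(dt_arr, dt_dep, horizon - t)
            area += n * dt                       # accumulate integral of N(t)
            if n:
                residual = [x - dt * C / n for x in residual]  # rate C/n each
            t += dt
            if t >= horizon:
                break
            if dt == dt_arr:                     # arrival of a new flow
                residual.append(draw_size(rng))
                next_arrival = t + rng.expovariate(lam)
            else:                                # departure of smallest flow
                residual.remove(min(residual))
        return area / horizon

    C, s, rho = 1.0, 1.0, 0.5
    lam = rho * C / s
    exponential = lambda rng: rng.expovariate(1.0 / s)
    # Hyperexponential with the same mean s but far higher variance
    hyper = lambda rng: (rng.expovariate(10.0 / s) if rng.random() < 10 / 11
                         else rng.expovariate(1.0 / (10.0 * s)))

    print(mean_flows_in_progress(C, lam, exponential, 1e5, 1))  # ~ 1.0
    print(mean_flows_in_progress(C, lam, hyper, 1e5, 2))        # ~ 1.0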
Outline
- characterizing demand
- statistical bandwidth sharing
- performance of elastic traffic
- integrating streaming traffic
- extensions for an integrated network

Fluid queueing models for streaming traffic
- assume flows are "on" at rate p, or "off", by shaping to rate p
  - e.g., an isolated packet of size L becomes a burst of length L/p
- bufferless or buffered multiplexing?
  - bufferless: ensure Pr [arrival rate > service rate] < ε
  - buffered: ensure Pr [buffer overflow] < ε
[figure: packets shaped into bursts; arrival rate vs service rate]

Sensitivity of buffered multiplexing performance
- for a superposition of N independent on/off sources, performance depends on:
  - mean burst lengths (longer or shorter)
  - burst length distributions (more or less variable)
  - correlation (long range vs short range dependence)
- this sensitivity precludes precise control
- prefer bufferless multiplexing
[figure: log Pr [overflow] vs buffer size, for different burst lengths, distributions and correlation structures]

Bufferless multiplexing (alias rate envelope multiplexing)
- ensure a negligible probability of rate overload: Pr [input rate > service rate] < ε
- by provisioning, accounting for the session/flow process and the rate fluctuations within each flow
- and using admission control to preserve transparency
  - measurement-based admission control (e.g., Gibbens et al., 1995)
[figure: input rate fluctuating below the service rate]

Insensitive performance of bufferless multiplexing
- assuming Poisson session arrivals and peak rate p:
  - offered traffic = a bit/s, capacity = C bit/s
  - Pr [overflow] = Σ_{n > C/p} ((a/p)^n / n!) e^(−a/p)
- i.e., loss depends only on the load (a/C) and the link/peak rate ratio (C/p), for any mix of streaming applications
[figure: Poisson session arrivals, bursts through the link, silence/think-time, end of session]

Efficiency of bufferless multiplexing
- small amplitude of rate variations: peak rate << link rate (e.g., 1%)
  - e.g., C/p = 100: Pr [overflow] < 10^-6 for a/C < 0.6
- ... or low utilisation: overall mean rate << link rate
- we may have both in an integrated network
  - priority to streaming traffic
  - residue shared by elastic flows
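The overflow formula and the efficiency figures quoted above are easy to check numerically; a sketch:

    import math

    # Numerical check of the bufferless overflow formula above:
    # Pr[overflow] = sum_{n > C/p} (a/p)^n e^(-a/p) / n!, a Poisson tail
    # depending only on the load a/C and the ratio C/p.
    def overflow_prob(load, C_over_p):
        """Poisson tail Pr[N > C/p] with mean a/p = load * C/p."""
        mean = load * C_over_p
        # sum the complementary head in log space for numerical stability
        head = sum(math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))
                   for n in range(int(C_over_p) + 1))
        return 1.0 - head

    print(overflow_prob(0.6, 100))   # about 9e-7: < 1e-6, as the slide states
    print(overflow_prob(0.6, 10))    # ~4e-2: far worse for a small C/p ratio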
Integrating streaming and elastic flows
- scheduling required to meet QoS requirements
  - priority to streaming flows
  - fair sharing of residual capacity by elastic flows (TCP or FQ?)
- realizing a performance synergy
  - very low loss for streaming data
  - efficient use of residual bandwidth by elastic flows
- complex performance analysis
  - a PS queue with varying capacity (cf. Delcoigne, Proutière, Régnié, 2002)
  - exhibits local instability and sensitivity (in heavy traffic with high stream peak rate)...
  - ... but a quasi-stationary analysis is accurate when admission control is used

Conclusion: we need to account for the statistical nature of traffic
- a Poisson session arrival process
- fair sharing for insensitive elastic flow performance
  - slight impact of unfairness (cf. Bonald and Massoulié, 2001)
  - fair sharing in a network (cf. Bonald and Proutière, 2002)
- bufferless multiplexing for insensitive streaming performance
  - choosing a common maximum peak rate (C/p > 100 for efficiency)
  - packet scale performance (jitter accumulation)?
- synergistic integration of streaming and elastic traffic
  - quasi-insensitivity with admission control (cf. Delcoigne et al., 2002; Benameur et al., 2001)
- required congestion control mechanisms
  - implicit measurement-based admission control
  - shaping and priority queuing
  - and reliance on TCP... or per-flow fair queuing for all?