Internet Congestion Control with Active Queue Management (AQM)
September 4, 2001
Seungwan Ryu (sryu@eng.buffalo.edu)
PhD Student of IE Department, University at Buffalo

Contents
  Internet Congestion Control
  Mathematical Modeling and Analysis
  Adaptive AQM and User Response
  Further Studies

I. Internet Congestion Control
  Internet Traffic Engineering
  What is Congestion?
  Congestion Control and Avoidance
  Implicit vs. Explicit feedback
  TCP Congestion Control
  Active Queue Management (AQM)
  Explicit Congestion Notification (ECN)

Internet Traffic Engineering
  Measurement: for reality check
  Experiment: for implementation issues
  Analysis:
    • Brings fundamental understanding of systems
    • May lose important facts because of simplification
  Simulation:
    • Complementary to analysis: correctness, exploring complicated models
    • May share a similar model with analysis

What is congestion?
  What is congestion?
    The aggregate demand for bandwidth exceeds the available capacity of a link.
  What will occur?
    Performance degradation
    • Multiple packet losses
    • Low link utilization (low throughput)
    • High queueing delay
    • Congestion collapse

What is congestion? - 2
  Congestion Control
  Open-loop control
    • Mainly used in circuit-switched networks (GMPLS)
  Closed-loop control
    • Mainly used in packet-switched networks
    • Uses feedback information: global & local
  Implicit feedback control
    • End-to-end congestion control
    • Examples: TCP Tahoe, TCP Reno, TCP Vegas, etc.
  Explicit feedback control
    • Network-assisted congestion control
    • Examples: IBM SNA, DECbit, ATM ABR, ICMP source quench, RED, ECN

Congestion Control and Avoidance
  Two approaches to handling congestion
  Congestion Control (reactive)
    • Acts after the network is overloaded
  Congestion Avoidance (proactive)
    • Acts before the network becomes overloaded

Implicit vs. Explicit feedback
  Implicit feedback congestion control
  Network drops packets when congestion occurs
  Source infers congestion implicitly
    • time-out, duplicate ACKs, etc.
  Example: end-to-end TCP congestion control
  Simple to implement but inaccurate
    • implemented only at the transport layer (e.g., TCP)

Implicit vs. Explicit feedback - 2
  Explicit feedback congestion control
  Network component (e.g., router) provides congestion indication explicitly to sources
    • uses packet marking, or RM cells (in ATM ABR control)
  Examples: DECbit, ECN, ATM ABR CC, etc.
  Provides more accurate information to sources
  But is more complicated to implement
    • Need to change both source and network algorithms
    • Need cooperation between sources and network components

TCP Congestion Control
  Uses end-to-end congestion control
  Uses implicit feedback
    • e.g., time-out, triple duplicate ACKs, etc.
  Uses window-based flow control
    • send window = min(cwnd, rwnd), with cwnd tracking the pipe size
    • self-clocking
    • slow-start and congestion avoidance
  Examples:
    • TCP Tahoe, TCP Reno, TCP Vegas, etc.

TCP Congestion Control - 2
  Slow-start and Congestion Avoidance
  [Figure: cwnd vs. time (in RTTs). Slow-start phase: 1, 2, 4, ... up to W*/2; congestion-avoidance phase: W to W+1 per RTT, up to W*.]

TCP Congestion Control - 3
  TCP Tahoe
    • Uses slow start/congestion avoidance
    • Fast retransmit (an enhancement): detects packet (segment) drops by three duplicate ACKs
    • ssthresh = W/2, W = 1, and restart slow start
  TCP Reno (fast recovery)
    • Upon receiving three duplicate ACKs: ssthresh = W/2, and retransmit the missing packet
    • W = ssthresh + 3
    • Upon receiving the next new ACK: W = ssthresh
    • Allows the window to grow quickly to keep the pipeline full

TCP Congestion Control - 4
  TCP SACK (Selective Acknowledgement)
  TCP (Tahoe) sender can learn about only a single loss per RTT
    • The sender can retransmit all lost packets
    • But those packets may have already been received
  SACK option provides better recovery from multiple losses
  Operation
    • Add the SACK option to the TCP header
    • The receiver sends SACK back to the sender to report which packets were received
    • Then the sender can retransmit only the missing packets

Active Queue Management (AQM) - 1
  Performance degradation in current TCP congestion control
    • Multiple packet loss
    • Low link utilization
    • Congestion collapse
  The role of the router becomes important
    • Control congestion effectively in networks
    • Allocate bandwidth fairly

AQM - 2
  Problems with current router algorithm
    • Uses FIFO-based tail-drop (TD) queue management
    • Two drawbacks with TD: lock-out, full-queue
    • Lock-out: a small number of flows monopolize the buffer capacity
    • Full-queue: the buffer is always full (high queueing delay)
  Possible solution: AQM
    • Definition: a group of FIFO-based queue management mechanisms to support end-to-end congestion control in the Internet

AQM - 3
  Goals of AQM
    • Reducing the average queue length: decreasing end-to-end delay
    • Reducing packet losses: more efficient resource allocation
  Methods:
    • Drop packets before the buffer becomes full
    • Use the (exponentially weighted) average queue length as a congestion indicator
  Examples: RED, BLUE, ARED, SRED, FRED, ...

AQM - 4
  Random Early Detection (RED)
    • uses a network algorithm to detect incipient congestion
  Design goals:
    • minimize packet loss and queueing delay
    • avoid global synchronization
    • maintain high link utilization
    • remove bias against bursty sources
  Achieves goals by
    • randomized packet drop
    • queue length averaging

RED
  $$avgQ \leftarrow (1 - W_Q)\, avgQ + W_Q\, Q$$
  $$P_d = \begin{cases} 0, & avgQ < min_{th} \\ p_{max}\, \dfrac{avgQ - min_{th}}{max_{th} - min_{th}}, & min_{th} \le avgQ < max_{th} \\ 1, & max_{th} \le avgQ \end{cases}$$
  [Figure: drop probability vs. average queue length, with levels maxp and 1 on the vertical axis and minth, maxth, K on the horizontal axis.]

AQM - 5: BLUE
  Concept
    • To avoid drawbacks of RED: parameter tuning problem, actual queue length fluctuation
    • Decouple congestion control from queue length
    • Use only loss and idle events as indicators
    • Maintains a single drop probability, pm
  Algorithm
    Upon packet loss:
      if (now - last_update > freeze_t)
        pm = pm + d1
        last_update = now
    Upon link idle:
      if (now - last_update > freeze_t)
        pm = pm - d2
        last_update = now
  Drawback
    • Cannot avoid some degree of multiple packet loss and/or low utilization

AQM - 6: SRED
  Algorithm
    • The i-th arriving packet is compared with a randomly selected one from the Zombie list
    • Hit(i) = 1 if they are from the same flow, 0 if not
    • $P(i)$ = hit frequency: $P(i) = (1 - \alpha)\, P(i-1) + \alpha\, Hit(i)$
    • $P(i)^{-1}$: estimator of the number of active flows
  Packet drop probability
    $$p_{zap} = p_{sred}(q) \cdot \min\!\left(1, \frac{1}{(256\, P(i))^2}\right)$$
    $$p_{sred}(q) = \begin{cases} p_{max}, & \tfrac{1}{3}B \le q < B \\ \tfrac{1}{4}\, p_{max}, & \tfrac{1}{6}B \le q < \tfrac{1}{3}B \\ 0, & 0 \le q < \tfrac{1}{6}B \end{cases}$$
  Concept
    • stabilize queue occupancy
    • use actual queue length
    • penalize misbehaving flows
  Drawbacks
    • $P(i)^{-1}$ is not a good estimator for heterogeneous traffic
    • Parameter tuning problem: Psred, Pzap, etc.
    • Stabilizes queue occupancy when traffic load is high (what about when load is low?)

AQM - 7: ARED
  Adapt the aggressiveness of RED according to traffic load changes
    • adapt maxp based on queue behavior
  Operation
    • Increase maxp when avgQ crosses above maxth
    • Decrease maxp when avgQ crosses below minth
    • Freeze maxp after changing to prevent oscillation

AQM - 8
  Problems with existing AQM proposals
    • Mismatch between macroscopic and microscopic behavior of queue length
    • Insensitivity to changes in input traffic load
    • Configuration (parameter setting) problem
  Reasons:
    • Queue length averaging
    • Use of an inappropriate congestion indicator
    • Use of an inappropriate control function

Explicit Congestion Notification (ECN)
  Current congestion indication
    • Uses packet drop to indicate congestion
    • Source infers congestion implicitly
  ECN
    • gives less packet drop and better performance
    • uses packet marking rather than drop
    • needs cooperation between sources and network
    • needs two bits in the IP header: ECT bit, CE bit

ECN - 2
  [Figure: ECN signaling among source, router, and destination. (1) Source sends a packet with ECT = 1, CE = 0 in the IP header; (2) router sets CE = 1; (3) destination returns an ACK with ECN-Echo set in the TCP header; (4) source sets CWR in the TCP header.]

II.
Mathematical Modeling and Analysis
  An Overview
  Mathematical Modeling of AQM
    • Window based packet switching and the Internet
    • Mathematical modeling and analysis of AQM
  Problems with existing AQMs
  Adaptive congestion indicator and control function

Overview - 1
  Goal of mathematical modeling
    • See steady-state system dynamics
    • Capture the main factors that influence performance
    • Provide recommendations for design and operation
  Two approaches for TCP congestion control
    Modeling steady-state TCP behaviors
    • the square root law*, PFTK [Padhye et al., 1998]
    • assume TD queue management at the router
    Mathematical modeling and analysis of AQM (RED)
  *: $T = \dfrac{c}{RTT \sqrt{p}}$, T: throughput, p: constant drop rate

Overview - 2
  AQM modeling and analysis
    • Analytic modeling and analysis
    • Control-theoretic analysis
    • Window-based modeling and analysis
  Assumptions
    • Poisson assumption for input traffic
    • Fixed number of persistent TCP flows
    • Steady-state window size saturation

Mathematical Modeling of AQM - 1
  Window based packet switching model (Yang 99)
  Determine the steady-state window size, $W_s$, of each flow $s \in S$
  ($\lambda_s$: rate of flow $s$; $n_{sj}$: packets of flow $s$ queued at link $j$; $Q_j$, $C_j$: queue length and capacity of link $j$)
  If link $j$ is not congested:
    $$n_{sj} = 0,\quad Q_j = 0 \quad \forall s \in S(j), \qquad \sum_{s \in S(j)} \lambda_s < C_j$$
  If link $j$ is congested:
    $$n_{sj} > 0,\quad Q_j > 0 \quad \forall s \in S(j), \qquad \sum_{s \in S(j)} \lambda_s = C_j$$

Mathematical Modeling of AQM - 2
  Window equation for an individual flow
  Since $n_{sj} = \lambda_s \dfrac{Q_j}{C_j}$ for $Q_j \ge 0$:
    $$W_s = \lambda_s R_s + \sum_{j \in J(s)} n_{sj} = \lambda_s \left( R_s + \sum_{j \in J(s)} \frac{Q_j}{C_j} \right) \qquad (1)$$
  Limitation of this model
    Assumes infinite buffer size
    • No buffer overflow
    • No packet drop
    • No queue management algorithm at routers

Mathematical Modeling of AQM - 3
  A simple AQM model
  [Figure: sources S1, S2, ..., SS send through an AQM router (buffer size K, threshold Min_th) over a bottleneck link of capacity C to the destination.]

Mathematical Modeling of AQM - 4
  Extend Yang's model to an AQM model
    • Finite buffer capacity K
    • The router uses AQM to control congestion
  When congested
    • Yang's model: $\sum_s \lambda_s = C$
    • Our model: $\sum_s \lambda_s \ge C$, $\ \sum_s \lambda_s (1 - p_d) = C$

Mathematical Modeling of AQM - 5
  Case 1: Tail drop
  Packet drop probability $p_d$ (with $\lambda = \sum_s \lambda_s$):
    $$p_d = \begin{cases} 1, & \text{if } \lambda \ge C \text{ and } Q = K \\ 0, & \text{otherwise} \end{cases}$$

Mathematical Modeling of AQM - 6
  Case 2: AQM
  Let $Q = \sum_s n_s$ with $Q \ge min_{th}$. Then, since
    $$W = \lambda (1 - p_d) \left( R + \frac{Q}{C} \right),$$
  the packet drop probability $p_d$ is
    $$p_d = \begin{cases} 1 - \dfrac{W}{\lambda \left( R + \frac{Q}{C} \right)}, & \text{if } \lambda \ge C,\ Q \ge min_{th} \\ 0, & \text{otherwise} \end{cases}$$

Mathematical Modeling of AQM - 7
  Congestion indicator
  Input traffic load $\lambda$ should be the congestion indicator
  Current AQMs
    • Use queue length Q as an alternative
    • Assume that the input traffic load is fixed in equilibrium
  Reason
    • $\lambda$ cannot be measured (or estimated) exactly for on-line implementation of the packet drop function

Mathematical Modeling of AQM - 8
  Packet drop function: $p_d = f(\lambda)$
  Reason
    • The traffic load fluctuates; it does NOT stay in equilibrium
    • Queue length is a function of input traffic
  Alternatively: $p_d = f(\lambda, Q)$

Problems with existing AQMs
  Mismatch between macroscopic and microscopic behavior of queue length
  Insensitivity to input traffic load variation
  Parameter configuration problem

Problems with existing AQMs - 2
  Mismatch problem
  [Figure: Internet traffic generation: window size vs. time.]

Problems with existing AQMs - 3
  Mismatch between macroscopic and microscopic behavior of queue length
  [Figure: traffic intensity (rho) and queue length vs. time, showing the actual queue length and the averaged queue length for Wq = 0.02 and Wq = 0.1.]

Problems with existing AQMs - 4
  Insensitivity to input traffic load variation
  [Figure: packet drop rate vs. traffic intensity (load) for u = 0.7, 0.45, 0.25. Schemes: I: RED, II: GRED, III: $p_d = f(\lambda, Q)$.]

Problems with existing AQMs - 5
  Parameter configuration problem
    • Has been a main design issue since 1993
    • Many modified AQMs have been proposed
    • They were verified only with simple simulations or simple experiments
    • They work well for particular traffic conditions, but real traffic differs substantially from those conditions
  Need adaptive congestion indicator and control function
    • Adaptive to input traffic load variation
    • Avoid congestion proactively, NOT based on the current state (i.e., Q)

III. Adaptive AQM and User Response
  Input traffic load prediction
  Adaptive AQM algorithms
  Adaptive parameter configuration
  Adaptive user response algorithm

Input traffic load prediction
  Consider a time-slotted model
    • Time is divided into unit time slots of length $\Delta t$, t = 0, 1, ...
    • Calculate parameters at the end of each slot
    • Estimate $Q_{t+1}$ to detect congestion proactively
    $$Q_{t+1} = (\lambda_{t+1} - C)\, \Delta t + Q_t$$
    • Predict $\hat{\lambda}_{t+1}$ from the measured input traffic $\lambda_{t-1}$, $\lambda_t$ of the past two time slots
    • Then predict $\hat{Q}_{t+1}$ for the next time slot

Adaptive AQM algorithms
  Algorithm I: E-RED and E-GRED
  Enhanced-RED:
    $$p = \begin{cases} 0, & \hat{Q}_{t+1} < min_{th} \\ max_p\, \dfrac{\hat{Q}_{t+1} - min_{th}}{max_{th} - min_{th}}, & min_{th} \le \hat{Q}_{t+1} < max_{th} \\ 1, & max_{th} \le \hat{Q}_{t+1} \end{cases}$$
  E-GRED: similar to E-RED

Adaptive AQM algorithms - 2
  Algorithm II: use both the predicted traffic intensity $\hat{\rho}_{t+1}$ and the current buffer utilization $u_t = Q_t / K$
    • $\hat{\rho}_{t+1}$ represents imminent traffic changes in the near future
    • $u_t$ represents the current status of traffic
  Possible algorithms: (1) $\hat{\rho}_{t+1}\, u_t$, (2) $\hat{\rho}_{t+1}^{2}\, u_t$, (3) $\hat{\rho}_{t+1}\, u_t^{2}$

Adaptive AQM algorithms - 3
  Example: maintain $Q_{index}$ to impose an appropriate drop rate adaptively as the traffic load changes
    $$P_d = \begin{cases} p_d \cdot (1 - Q_{index}), & \text{arriving packets} \\ p_d \cdot Q_{index}, & \text{existing packets} \end{cases} \qquad \text{where } Q_{index} = \frac{Q_t}{Q_t + \hat{Q}_{t+1}}$$
    • If $u_t$ is low and $\hat{\rho}_{t+1}$ is high: more penalty on incoming packets
    • If $u_t$ is high and $\hat{\rho}_{t+1}$ is low: more penalty on existing packets
    • High penalty for both only when $u_t$ and $\hat{\rho}_{t+1}$ are both high

Adaptive AQM algorithms - 4
  Algorithm III: E-BLUE
  BLUE algorithm
    • uses packet drops and link idle events to adjust the packet drop probability
    • cannot avoid some degree of performance degradation
  Enhancement
    • Use virtual lower/upper bounds (VL, VU)
    • Combine the predicted queue length $\hat{Q}_{t+1}$ with BLUE

Adaptive parameter configuration
  Adaptive queue length sampling interval $\Delta t$
  Previous recommendations
    • In [Firoiu et al.], the minimum RTT was recommended
    • In [Hollot et al.], a static, link-speed-independent value was recommended
    • However, these recommendations were derived under the assumption of a fixed number N of persistent TCP flows
  Our recommendation
    • The amount of incoming traffic fluctuates with time
    • Adjust $\Delta t$ according to the varying traffic situation (i.e., according to the amount of input traffic)

Adaptive parameter configuration - 2
  [Figure: queue length Q vs. time over sampling intervals (i-1), i, (i+1), (i+2).]

Adaptive parameter configuration - 3
  Adaptive filtering weight wq
  In RED, wq = 0.002 was recommended for a long-term (macroscopic) performance goal
  A fixed small value of wq causes problems
    • Parameter setting problem
    • Insensitivity of the control function to changes in traffic
    • Fairness problem: imposes a penalty on innocent packets
  Need wq to adapt to changes in traffic load
  One possible method:
    • Set wq as a function of current queue utilization, e.g., $w_q = \alpha\, Q_t / C$, $0 < \alpha < 1$

Adaptive user response algorithm
  AQM needs to work with an intelligent source response for better performance
  Enhanced-ECN
    If ECN feedback was received in (t-1)
      • If no ECN feedback in t
          If received ACK > 0, W = W + M/W + M
          Else, W = W + M/W
      • Else, continue the usual response to ECN feedback
    Else, continue TCP congestion avoidance

IV.
Further Studies
  Mathematical Modeling and Analysis
    • Stability and control dynamics
    • Alternative modeling
    • Control-theoretic consideration
  Simulation studies
    • Traffic
    • Performance metrics
  Other approaches to congestion control
  More about AQM

Mathematical Modeling and Analysis
  Since $p = f(\lambda, q)$ and $\lambda = g(p)$, find the equilibrium point $(\lambda^*, p^*)$
  [Figure: throughput relation $T(\lambda, q)$ in terms of $\lambda(1 - p)$, $C(1 - p)$, $R$, and $p$; curves $\lambda = g(p)$ and $p = f(\lambda)$ intersecting at $(\lambda^*, p^*)$.]

Mathematical Modeling and Analysis - 2
  Alternative modeling: state-dependent service M/M/1/K queueing model
  [Figure: state-transition diagram for states 0, 1, ..., L-1, L, ..., K-1, K with arrival rate $\lambda$ and service rates $C$ (up to state L), $C + \lambda p_1$, ..., $C + \lambda p_{K'-1}$ (above L), and $C + \lambda$ (at K); L = minth, K' = K - minth.]

Mathematical Modeling and Analysis - 3
  Service rates
    $$S_i = \begin{cases} C, & Q_i < min_{th} \\ C + \lambda p_i, & min_{th} \le Q_i < K \\ C + \lambda, & K \le Q_i \end{cases}$$
  Steady-state probabilities
    $$\pi_i = \begin{cases} \left( \dfrac{\lambda}{C} \right)^{i} \pi_0, & i \le min_{th} \\ \left( \dfrac{\lambda}{C} \right)^{min_{th}} \displaystyle\prod_{j=1}^{i - min_{th}} \dfrac{\lambda}{C + \lambda p_j}\; \pi_0, & min_{th} < i \le K \end{cases}$$
    $$\pi_0 = \left[ \sum_{i=0}^{min_{th}} \left( \frac{\lambda}{C} \right)^{i} + \left( \frac{\lambda}{C} \right)^{min_{th}} \sum_{i = min_{th}+1}^{K} \prod_{j=1}^{i - min_{th}} \frac{\lambda}{C + \lambda p_j} \right]^{-1}$$

Mathematical Modeling and Analysis - 4
  Control-theoretic consideration
  [Figure: feedback control loop. Source S sends traffic $\lambda_t$ to the router buffer (control function, queue dynamics); $\lambda_t (1 - p)$ is forwarded to destination D; ACK (or NACK) returns to the source.]

Simulation study
  Goal of simulation study
    • See the dynamics and performance of our AQM
    • Compare results with other AQMs such as RED
  Use realistic traffic
    • Previous studies were done with simple, unrealistic traffic (a fixed number of persistent TCP flows)
  Generate realistic Internet traffic
    • Long-lived (FTP) and short-lived (web-like) TCP traffic
    • UDP traffic: CBR and/or ON/OFF

Performance Metrics
  TCP traffic
    Network-centric: for aggregate traffic
    • Throughput (or goodput)
    • Packet dropping (marking) probability
    • Link utilization (or queueing delay)
    User-centric: for individual traffic
    • Goodput (or throughput)
    • Mean response time (RTT)
  UDP traffic
    • Individual packet drop probability and its distribution

Other approaches of CC - 1: Pricing
  Smart-market [Mackie-Mason 1995]
    • A price is set for each packet depending on the level of demand for bandwidth
    • Admit packets with bid prices that exceed the cut-off value
    • The cut-off is determined by the marginal cost
  Paris metro pricing (PMP) [Odlyzko]
    • To provide differentiated services
    • The network is partitioned into several logically separate channels with different prices
    • With less traffic in the higher-priced channels, better QoS would be provided

Other approaches - 2: Optimization
  Concept
    Network resource allocation problem: user problem and network problem
    • User problem: each user sends a bandwidth request with a price
    • Network problem: allocate bandwidth to each user by solving an NLP
  User problem
    • Users can be distinguished by a utility function
    • A user wants to maximize its benefit (utility - cost)
  Network problem
    • Maximize aggregate utility subject to the link capacity constraints
    • It can then be formulated as a non-linear programming (NLP) problem

Other approaches - 3: Fairness
  Two fairness issues
    • Fair bandwidth sharing: network-centric
    • Fair packet drop (mark): user-centric
  Fair bandwidth sharing
    Max-min fair [Bertsekas, 1992]
    • No rate can be increased without simultaneously decreasing another rate that is already smaller
    • Provides equal treatment to all flows
    Proportional fair [Kelly 1998]
    • A feasible set of rates is non-negative with an aggregate rate not greater than the link capacity; it is proportionally fair if, for any other feasible set, the aggregate of proportional changes is zero or negative
    • Provides different treatment of each flow according to its rate

More about AQM
  Responsive (TCP) vs. unresponsive (UDP) flows
    • RED fails to regulate unresponsive flows
    • UDP does not adjust its sending rate upon receiving a congestion signal
    • UDP flows consume more bandwidth than their fair share
  FRED [Lin & Morris, 1997]
    • Tracks the number of packets in the queue from each flow
    • The fair share for a flow is calculated dynamically
    • Unresponsive flows are identified and penalized
    • Maintains logical queues for each active flow in a FIFO queue
    • Drops packets in proportion to bandwidth usage
  See TCP-friendly website (http://www.psc.edu/networking/tcp_friendly.html)

More about AQM - 2
  Providing QoS and DiffServ with AQM
    • Try to support a multitude of transport protocols (TCP, UDP, etc.)
    • Classify several types of services rather than one best-effort service
    • Then apply different AQM control to each service class
  Examples:
    • RIO (RED In and Out) [Clark98]
    • CBT (Class-Based Thresholds) [Floyd1995]

More about AQM - 3
  RIO (RED In and Out) [Clark 1998]
    • Separates flows into two classes: IN and OUT of the service profile
    • The router maintains two different statistics, one for each service profile
    • Different parameters and average queue lengths: avgIN for IN packets, avgTOTAL for OUT packets
    • When congested, applies different control to each class
  [Figure: drop probability p vs. avg for RIO, with levels Pmax_IN, Pmax_OUT, and 1, and thresholds Minth_OUT, Maxth_OUT = Minth_IN, Maxth_IN.]

More about AQM - 4
  CBT [Floyd 1995]
    • Packets are classified into several classes
    • Maintains a single queue but allocates a fraction of capacity to each class
    • Applies AQM (RED)-based control to each class
    • Once a class occupies its capacity, all its arriving packets are discarded
  Drawbacks
    • Fairness problem when the traffic mix changes
    • Static threshold setting
    • Total utilization can fluctuate
  Dynamic-CBT [Chung2000]
    • Tracks the number of active flows of each class
    • Dynamically adjusts the threshold values of each class

More about AQM - 5
  Other issues
    • AQM vs. Tail Drop (TD)
    • Congestion indicator: average queue length vs. instantaneous queue length
    • Parameter tuning problem: wq, maxp, static or dynamic sampling
    • Alternative ways: virtual queue approach, e.g., [Gibbens 1998], [Kunniyur 2000]
    • Performance with/without ECN mechanism
    • Control objective: router-centric vs. user-centric

References
  S. Floyd et al. "Random early detection gateways for congestion avoidance." IEEE/ACM TON, 1993.
  RED web page, http://www.aciri.org/floyd/red.html
  RED for dummies, http://www.magma.ca/~terrim/RedLit.htm
  S. Ryu et al. "Advances in Internet congestion control." Submitted to IEEE Communications Surveys & Tutorials, 2001.
  B. Braden et al. "Recommendations on queue management and congestion avoidance in the Internet." IETF RFC 2309, 1998.
  K. Ramakrishnan et al. "A proposal to add explicit congestion notification (ECN) to IP." IETF RFC 2481, 1999.
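
Supplementary sketch: RED and Enhanced-RED
  The RED averaging and drop rules, and the E-RED idea of applying the same drop curve to a predicted queue length, can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the author's implementation. The slides say only that the next-slot input rate is predicted from the two most recent measurements, so the linear extrapolation used here is an assumption, as are the clamping of the predicted queue to [0, K] and all function and parameter names.

```python
def ewma_avg_queue(avg_q, q, w_q=0.002):
    """RED queue averaging: avgQ <- (1 - w_q) * avgQ + w_q * Q."""
    return (1.0 - w_q) * avg_q + w_q * q


def red_drop_prob(avg_q, min_th, max_th, p_max):
    """Piecewise-linear RED drop probability as a function of the queue estimate."""
    if avg_q < min_th:
        return 0.0
    if avg_q < max_th:
        return p_max * (avg_q - min_th) / (max_th - min_th)
    return 1.0


def predict_queue(q_t, lam_prev, lam_t, capacity, dt=1.0, k=None):
    """Predicted queue: Q_hat(t+1) = (lam_hat(t+1) - C) * dt + Q(t).

    ASSUMPTION: lam_hat(t+1) is a linear extrapolation of the last two
    measured rates; the slides do not specify the predictor.
    ASSUMPTION: the result is clamped to [0, k] when a buffer size k is given.
    """
    lam_hat = max(0.0, 2.0 * lam_t - lam_prev)
    q_hat = max(0.0, q_t + (lam_hat - capacity) * dt)
    if k is not None:
        q_hat = min(q_hat, k)
    return q_hat


def e_red_drop_prob(q_t, lam_prev, lam_t, capacity, min_th, max_th, p_max,
                    dt=1.0, k=None):
    """E-RED sketch: apply the RED drop curve to the predicted queue length."""
    q_hat = predict_queue(q_t, lam_prev, lam_t, capacity, dt, k)
    return red_drop_prob(q_hat, min_th, max_th, p_max)


if __name__ == "__main__":
    # Queue sits at min_th while the input rate is rising past capacity.
    print(red_drop_prob(10.0, 10.0, 30.0, 0.1))
    print(e_red_drop_prob(10.0, 80.0, 100.0, 100.0, 10.0, 30.0, 0.1))
```

  In the usage lines, the current queue equals minth, so the RED curve applied to the current value gives no drops, while the predicted queue already reaches maxth because the extrapolated input rate exceeds the capacity. This is the proactive behavior the E-RED slide aims for; how well it works depends entirely on the quality of the rate predictor.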