Delay-Based TCP Congestion Control

David Hayes
dahayes@swin.edu.au
Centre for Advanced Internet Architectures (CAIA)
Swinburne University of Technology

CISCO http://www.caia.swin.edu.au dahayes@swin.edu.au 15 October, 2010

Outline
- CAIA
- Background
  - TCP congestion control
  - Delay based congestion signals
  - Some key delay-based algorithms
- LCN paper: Improved coexistence and loss tolerance for delay based TCP congestion control
  - Hamilton Institute’s Delay-based algorithm (HD)
  - Shortcomings of HD, improved by CAIA HD algorithm (CHD)
    - back-off decision frequency and scaling
    - tolerance to non-congestion related loss
    - improvements when coexisting with NewReno type flows
- Other TCP work
  - Delay-gradient based congestion control
  - Stateless TCP

CAIA – Centre for Advanced Internet Architectures
- We are at Swinburne University of Technology, about 5 km east of the Melbourne CBD
- CAIA is the research arm of the Telecommunications Engineering Academic Group (or Department)
- Research spans:
  - Congestion control (TCP, etc.)
  - Traffic classification
  - Wireless networks
  - Covert channels and lawful interception
  - Network monitoring and visualisation
  - BGP
  - Address space exploration
  - Game traffic analysis

CISCO supported work
- Exploring the efficacy of distributed statistical traffic classification using modified open source packet filters (2010)
- Implementing and testing delay-based and rate-based transport protocols in FreeBSD (2008)
- Heuristics to reduce BGP Update Noise (2007)
- FreeBSD Implementation of an SCTP friendly NAT (2007)
- Anomalous Traffic Detection and Collaborative Network Configuration Using 3D Multiplayer Game Engines (2006)
- Public Implementation and Interoperability Testing of Next Generation TCP Stack Under FreeBSD (2005)
- Dynamic Self-Learning Traffic Classification Based on Flow Characteristics (2004)
Introduction to delay and rate based TCP
- Promise of low latency, zero loss¹ transmission
  - the congestion signal can be decoupled from packet loss
  - potential for efficient transmission on lossy paths
- Delay based intuition: delay↑ ≡ queue↑ ⟹ indicates congestion
- Rate based intuition: send rate > receive rate ⟹ indicates congestion
- Basic questions:
  - How is congestion determined?
  - If congested, how should cwnd be adjusted?
- Issues:
  - Noise of measurements
  - Correlation of measurements with congestion
  - Compatibility with existing TCP algorithms
¹ congestion related

Background – TCP’s Congestion Window (w)
In general loss-based TCP:
- increases w by the maximum segment size every RTT
- or 1/w for every ACK
- and halves w when a packet has been lost.
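As a rough illustration of the loss-based rule above, the following Python sketch applies the per-ACK increase and the halving on loss. The function names `newreno_update` and `simulate`, the window being measured in segments, and the floor of one segment are assumptions made for illustration; slow start, fast recovery and timeouts are deliberately left out.

```python
def newreno_update(w: float, lost: bool) -> float:
    """One congestion-avoidance step; w is the window in segments (assumed unit)."""
    if lost:
        # Multiplicative decrease: halve the window, but keep at least
        # one segment (the floor is an assumption of this sketch).
        return max(w / 2.0, 1.0)
    # Additive increase: 1/w per ACK, roughly one segment per RTT in total.
    return w + 1.0 / w


def simulate(w: float, acks: int) -> float:
    """Apply `acks` loss-free ACKs in sequence (hypothetical helper)."""
    for _ in range(acks):
        w = newreno_update(w, lost=False)
    return w
```

Starting from a window of 10 segments, ten loss-free ACKs (about one RTT) grow the window by a little under one segment, while a single loss halves it.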
\[
w_{i+1} =
\begin{cases}
\dfrac{w_i}{2} & \text{lost packet} \\[1ex]
w_i + \dfrac{1}{w_i} & \text{otherwise}
\end{cases}
\]

[Figure: NewReno cwnd (packets) versus time (s)]

Background: Base timing measurements
[Figure: timing diagram of segments S1–S9 and ACKs A1–A5, labelling dsw, dS1, rttmin, rtt1, rttmax, drw, and daw ≈ drw]
Background: Base timing measurements
[Figure: timing diagram of segments (S) and ACKs (A) showing successive ACK spacings daw, daw′ and daw″]
Note: Queueing at FIFO network nodes can increase or decrease the interpacket times

Background: Base rate measurements
[Figure: timing diagram of segments S1–S9 and ACKs A1–A5, labelling rttmin, rtt1 and daw. Send rates are measured as the data sent in a window per RTT, Tmax = (Σ_w S)/rttmin and T1 = (Σ_w S)/rtt1; the ACK rate Ra is measured from the ACKs Ai received over the interval daw.]

Quick early work overview
- [Clark et al., 1985] & [Clark et al., 1987]: NETBLT, RFCs 969 & 998
- [Jacobson, 1988] (a): footnote on connectionless rate based AIMD.
- [Jain, 1989] (b): normalised delay gradient.
- [Wang and Crowcroft, 1992] (c): DUAL algorithm.
- [Brakmo and Peterson, 1995] (d): TCP Vegas.

(a) V. Jacobson, “Congestion avoidance and control,” in SIGCOMM ’88: Symposium proceedings on Communications architectures and protocols. New York, NY, USA: ACM, 1988, pp. 314–329.
(b) R. Jain, “A delay-based approach for congestion avoidance in interconnected heterogeneous computer networks,” SIGCOMM Comput. Commun. Rev., vol. 19, no. 5, pp. 56–71, 1989.
(c) Z. Wang and J. Crowcroft, “Eliminating periodic packet losses in the 4.3-Tahoe BSD TCP congestion control algorithm,” SIGCOMM Comput. Commun. Rev., vol. 22, no. 2, pp. 9–16, Apr. 1992.
(d) L. S. Brakmo and L. L. Peterson, “TCP Vegas: end to end congestion avoidance on a global internet,” IEEE J. Sel. Areas Commun., vol. 13, no. 8, pp. 1465–1480, Oct. 1995.

A quick look at some key/interesting algorithms in the literature
Algorithms: Pkt pair flow control [Keshav, 1994]
- All data is sent as back-to-back pairs
- Available send rate is:
\[
T = \frac{\text{size}(p_2)}{\text{pair dispersion}}
\]
- Presumes routers use round robin scheduling
[Figure: packet pair p1, p2 travelling from SOURCE through a network node and the BOTTLENECK to the SINK over time t, showing the RTT, the pair dispersion, and the pair dispersion estimate]
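The rate estimate above is a single division, sketched here in Python. The function name `packet_pair_rate`, the use of bytes and seconds, and taking the dispersion as the difference between two measured arrival times are all assumptions made for illustration.

```python
def packet_pair_rate(size_p2_bytes: int, t_arrival_p1: float, t_arrival_p2: float) -> float:
    """Estimate the available send rate in bytes/s from one packet pair.

    The pair dispersion is taken as the gap between the measured arrival
    times of p1 and p2 (assumed to be in seconds).
    """
    dispersion = t_arrival_p2 - t_arrival_p1
    if dispersion <= 0:
        # A non-positive gap means reordering or a timestamp problem.
        raise ValueError("pair dispersion must be positive")
    return size_p2_bytes / dispersion
```

For example, a 1500 byte second packet arriving 1 ms after the first suggests roughly 1.5 MB/s. A real estimator would usually filter many such samples, because a single pair is noisy.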
Algorithms: Vegas [Brakmo and Peterson, 1995]
- Iconic rate based TCP
- Defines two rates:
\[
\text{actual} = \frac{\sum S}{rtt}, \qquad
\text{expected} = \frac{w}{rtt_{min}}, \qquad
\text{diff} = \text{expected} - \text{actual}
\]
- window adjustment:
\[
w \leftarrow
\begin{cases}
w - 1 & \text{diff} > \beta \\
w + 1 & \text{diff} < \alpha \\
w & \text{otherwise}
\end{cases}
\]
[Figure: timing diagram of segments S1–S5 and ACK A1, labelling rttmin and rtt1]
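The Vegas rule above can be sketched in a few lines of Python. The function name `vegas_update`, the choice of segments and seconds as units, and the threshold values used in the usage note are assumptions for illustration, not values taken from the slides.

```python
def vegas_update(w: float, sent_in_rtt: float, rtt: float, rtt_min: float,
                 alpha: float, beta: float) -> float:
    """Return the next window given one RTT's worth of measurements.

    w and sent_in_rtt are in segments, rtt and rtt_min in seconds, and
    alpha < beta are rate thresholds in segments/s (assumed units).
    """
    expected = w / rtt_min        # rate if nothing were queued
    actual = sent_in_rtt / rtt    # rate actually achieved over this RTT
    diff = expected - actual
    if diff > beta:
        return w - 1              # too much data queued: back off
    if diff < alpha:
        return w + 1              # spare capacity: probe for more
    return w                      # inside the target band: hold
```

With w = 10 segments, rtt_min = 0.1 s, alpha = 10 and beta = 30, an RTT of 0.1 s gives diff = 0 and the window grows to 11, an RTT of 0.125 s gives diff = 20 and the window is held, and an RTT of 0.2 s gives diff = 50 and the window shrinks to 9.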
Algorithms: FAST [Wei et al., 2006]
- Enhanced Vegas type algorithm
- MIMD: AIMD is too slow for high BDP networks
- Uses delay as a rich (non binary) congestion indicator
- Cwnd is updated at regular time intervals (∆t):
\[
w_{t+\Delta t} = \min\left\{ 2w_t,\;
\gamma\left(\frac{rtt_{min,i}}{rtt_i}\, w_t + \alpha\right) + (1-\gamma)\, w_t \right\}
\]
  - Smoothed window increase
  - When congested, decreases in proportion to queueing delay
  - Increase is limited to 2w per ∆t
  - For MIMD, α(w_t, q_i): increase is proportional to the size of cwnd and the network queueing delay.
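A minimal Python sketch of the periodic FAST update is given below. The function name `fast_update` and the use of a constant `alpha` are assumptions for illustration; as noted above, an MIMD variant would make α a function of the window and the queueing delay.

```python
def fast_update(w: float, rtt_min: float, rtt: float,
                alpha: float, gamma: float) -> float:
    """One FAST-style window update, applied once per interval.

    w is in segments, rtt_min and rtt in seconds, alpha in segments,
    and gamma in (0, 1] is the smoothing weight (all assumed units).
    """
    # Window that would leave about alpha segments queued on the path.
    target = (rtt_min / rtt) * w + alpha
    # Move a fraction gamma towards the target, never more than doubling.
    return min(2.0 * w, gamma * target + (1.0 - gamma) * w)
```

With w = 100, alpha = 20 and gamma = 0.5, an empty queue (rtt equal to rtt_min) moves the window to 110, while rtt = 2 × rtt_min moves it down to 85. The window stops changing when (1 − rtt_min/rtt) · w = alpha.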
Algorithms: Compound TCP [Tan et al., 2006]

- In high speed, high BDP networks it aims to increase:
  - efficiency
  - RTT fairness and TCP fairness
- In Microsoft Windows Vista and 7
- Uses Vegas' rates: $diff = (expected - actual)\,rtt_{\min}$
- Provides NewReno+ throughput performance
- The send window, $win_j$, is calculated as:

$$win_j = \min(w_j + dwnd_j,\ awnd_j)$$

  where $w_j$ is NewReno's cwnd, $dwnd_j$ is the delay-based window, and $awnd_j$ is the receiver's advertised window.
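The two Compound TCP quantities above can be computed as follows. This is a minimal sketch: the function names are hypothetical, and the expected and actual rates are taken as `w / rtt_min` and `w / rtt`, the usual Vegas definitions, which the slide does not spell out. How Compound TCP adjusts `dwnd` from `diff` is not shown on the slide and is not modelled here.

```python
def vegas_diff(w, rtt, rtt_min):
    """Vegas-style estimate of the number of packets queued in the network."""
    expected = w / rtt_min   # rate if there were no queueing (assumed definition)
    actual = w / rtt         # rate actually achieved (assumed definition)
    return (expected - actual) * rtt_min


def send_window(w, dwnd, awnd):
    """Compound send window: loss-based plus delay-based, capped by the receiver."""
    return min(w + dwnd, awnd)


if __name__ == "__main__":
    print(vegas_diff(w=100.0, rtt=0.125, rtt_min=0.1))
    print(send_window(w=50, dwnd=10, awnd=64))
```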
Algorithms: DUAL [Wang and Crowcroft, 1992]

- Designed to supplement loss-based congestion control
- Delay-based measurements provide "slow tuning" of cwnd every 2nd RTT:

$$w \leftarrow \begin{cases} \beta w & rtt > \dfrac{rtt_{\min} + rtt_{\max}}{2} \\ w & \text{otherwise} \end{cases} \qquad \text{where } \beta = \tfrac{7}{8}$$

- Attempts to keep network buffers half full
- Smaller multiplicative decrease
- Relies on accurate estimates of $rtt_{\min}$ and $rtt_{\max}$
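The DUAL "slow tuning" rule can be written as a one-line decision. The sketch below assumes the caller invokes it once every second RTT and supplies its own `rtt_min` and `rtt_max` estimates; the function name is hypothetical.

```python
BETA = 7.0 / 8.0  # multiplicative decrease factor from the slide


def dual_tune(w, rtt, rtt_min, rtt_max, beta=BETA):
    """Reduce w by beta when the RTT exceeds the midpoint of rtt_min and rtt_max."""
    threshold = (rtt_min + rtt_max) / 2.0
    return beta * w if rtt > threshold else w


if __name__ == "__main__":
    print(dual_tune(w=16.0, rtt=0.15, rtt_min=0.10, rtt_max=0.18))  # above midpoint
    print(dual_tune(w=16.0, rtt=0.12, rtt_min=0.10, rtt_max=0.18))  # below midpoint
```

Because the threshold is the midpoint between the two RTT estimates, an error in either estimate shifts the back-off point directly, which is why the slide stresses accurate estimates.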
Algorithms: Others of Interest

- [King et al., 2005]: TCP-Africa
  - Two modes: fast delay-based, and slow NewReno-based.
  - Compound TCP is based on some of Africa's ideas
- [Baiocchi et al., 2007]: YeAH-TCP
  - Yet Another Highspeed TCP
  - Two modes like Africa
  - Provides performance improvements on lossy paths.
- A number of schemes propose traffic shaping TCP's send rate
  - [Karandikar et al., 2000]: ABR-like
  - [Wu et al., 2002]: leaky bucket
  - [Abendroth et al., 2002]: improved leaky bucket for network burstiness.
Improved coexistence and loss tolerance for delay based TCP congestion control
Best Paper Award LCN 2010
David Hayes and Grenville Armitage
{dahayes,garmitage}@swin.edu.au
Centre for Advanced Internet Architectures (CAIA)
Swinburne University of Technology

Introduction

- Delay-based congestion control can potentially provide:
  - low latency transmission
  - no congestion induced packet loss
  - efficient TCP over lossy paths (wireless links)
- Issues:
  - Measuring delay to infer congestion
  - Coexistence with current loss-based TCP
- We propose a delay-based algorithm which:
  - improves TCP efficiency over lossy paths
  - improves coexistence with loss-based TCP
- We implement the algorithms in the FreeBSD kernel

CISCO http://www.caia.swin.edu.au {dahayes,garmitage}@swin.edu.au 15 October, 2010
Hamilton Delay (HD) based window updates [Budzisz et al., 2009]

- Probabilistic delay-based backoff (coexistence)

[Figure: per-packet backoff probability $g(q)$ against queuing delay. The probability axis is marked $p_{max}$, the delay axis is marked $q_{min}$, $q_{th}$ and $q_{max}$, and regions A and B are labelled.]

$$w_{i+1} = \begin{cases} \dfrac{w_i}{2} & X < g(q_i) \\ w_i + \dfrac{1}{w_i} & \text{otherwise} \end{cases}$$

- Region A: stable when queueing delay is low
- Region B: only stable when queueing delay is high

RTT: loss-based & delay-based congestion control

[Figure: two panels, NewReno and Hamilton, each plotting RTT against Time (s).]
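The HD window update can be sketched as below. Two things are assumptions rather than content from the slides: the backoff probability is taken to rise linearly from 0 at `q_min` to `p_max` at `q_th` and then fall linearly back to 0 at `q_max` (the shape suggested by the axis labels), and `X` is taken to be a uniform random draw in [0, 1). All names are hypothetical.

```python
import random


def backoff_probability(q, q_min, q_th, q_max, p_max):
    """Assumed piecewise-linear g(q): up to p_max at q_th, back to 0 at q_max."""
    if q <= q_min or q >= q_max:
        return 0.0
    if q <= q_th:
        return p_max * (q - q_min) / (q_th - q_min)
    return p_max * (q_max - q) / (q_max - q_th)


def hd_update(w, q, q_min, q_th, q_max, p_max, x=None):
    """Per-packet HD update: halve w with probability g(q), else grow by 1/w."""
    if x is None:
        x = random.random()  # assumed distribution of X
    if x < backoff_probability(q, q_min, q_th, q_max, p_max):
        return w / 2.0
    return w + 1.0 / w


if __name__ == "__main__":
    for q in (2, 7.5, 10, 15, 25):
        print(q, backoff_probability(q, q_min=5, q_th=10, q_max=20, p_max=0.5))
```

Passing `x` explicitly makes the update deterministic, which is convenient for checking the two branches.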
Delay-based back-off decision frequency

[Figure: per-RTT backoff probability $g(h_r)$ against queuing delay. The probability axis is marked $p_{max}$, the delay axis is marked $q_{min}$, $q_{th}$ and $q_{max}$, and regions A and B are labelled.]

HD decision per packet:
- Doesn't scale well: P[backoff] increases with $w$
- CPU cost

CHD decision once per RTT:
- Uses $h_r = \max_r(q_i)$
- Scales with $w$
- less CPU
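A short calculation illustrates why a per-packet decision does not scale with the window. Assuming each of the `w` packets in an RTT makes an independent draw with the same probability `g` (a simplification made here, not a statement from the slides), the chance of at least one back-off in that RTT is `1 - (1 - g)**w`, whereas a single per-RTT decision keeps it at `g`. The function names are hypothetical.

```python
def per_packet_backoff_chance(g, w):
    """P[at least one back-off in an RTT] with w independent per-packet draws."""
    return 1.0 - (1.0 - g) ** w


def per_rtt_signal(queue_delays):
    """h_r: the maximum queueing delay sample seen during the RTT."""
    return max(queue_delays)


if __name__ == "__main__":
    g = 0.01
    for w in (1, 10, 100, 1000):
        print(w, round(per_packet_backoff_chance(g, w), 4), "vs per-RTT", g)
    print(per_rtt_signal([3.0, 9.0, 4.0]))
```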
Tolerance to non-congestion related packet loss

- HD (and NewReno) do not tolerate packet loss
  - $w = w/2$ when a packet is lost
- CHD tolerates low level packet loss well by:
  - Ignoring packet loss when queueing delays are small (region A)

[Figure: per-RTT backoff probability $g(h_r)$ against queuing delay, with regions A and B labelled.]

Tolerance to non-congestion related losses

[Figure: Goodput (bps, $\times 10^6$) against probability of non-congestion related loss (0 to 0.05) for NewReno, HD and CHD, with a $1/\sqrt{p}$ reference curve.]
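The difference in loss handling can be sketched as a single function. The slides do not give the exact boundary of region A, so the sketch assumes it is the part of the curve at or below `q_th`; that threshold choice and all names are hypothetical.

```python
def window_after_loss(w, h_r, q_th, ignore_loss_in_region_a):
    """Window after a packet loss.

    ignore_loss_in_region_a=False models HD and NewReno (always halve).
    ignore_loss_in_region_a=True models CHD-style tolerance: a loss seen while
    the per-RTT queueing delay h_r is small is treated as non-congestion loss.
    """
    in_region_a = h_r <= q_th  # assumed boundary between regions A and B
    if ignore_loss_in_region_a and in_region_a:
        return w
    return w / 2.0


if __name__ == "__main__":
    print(window_after_loss(20.0, h_r=2.0, q_th=10.0, ignore_loss_in_region_a=False))
    print(window_after_loss(20.0, h_r=2.0, q_th=10.0, ignore_loss_in_region_a=True))
    print(window_after_loss(20.0, h_r=15.0, q_th=10.0, ignore_loss_in_region_a=True))
```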
Improving coexistence with loss-based TCP

- To improve CHD's coexistence ability:
  - React only to packet loss in region B
  - We use a shadow window ($s$) (shadows NewReno)
  - On packet loss, $w_{i+1} = \dfrac{\max(w_i, s_i)}{2}$
- Best explained with an example

[Figure: per-RTT backoff probability $g(h_r)$ against queuing delay, with regions A and B labelled.]
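The shadow window rule can be sketched with a small state object. Only the loss rule, `w = max(w, s) / 2`, comes from the slide. The rest is an assumption made for illustration: `s` starts at 0, is synchronised to `w` when a delay-based back-off happens, then grows by one packet per RTT like NewReno, and is reset to 0 after a loss. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ShadowWindow:
    w: float          # congestion window, in packets
    s: float = 0.0    # shadow window; 0 means "not tracking" (assumption)

    def grow_one_rtt(self):
        """Congestion-avoidance growth of one packet per RTT."""
        self.w += 1.0
        if self.s > 0.0:
            self.s += 1.0

    def delay_backoff(self):
        """Delay-based back-off: remember what NewReno would have kept."""
        self.s = max(self.w, self.s)   # assumed "s sync" step
        self.w /= 2.0

    def loss_in_region_b(self):
        """Loss rule from the slide: back off from the larger of w and s."""
        self.w = max(self.w, self.s) / 2.0
        self.s = 0.0                   # assumed reset after a loss


if __name__ == "__main__":
    st = ShadowWindow(w=20.0)
    st.delay_backoff()                 # w = 10, s = 20
    for _ in range(4):
        st.grow_one_rtt()              # w = 14, s = 24
    st.loss_in_region_b()
    print(st.w)                        # compare with 14 / 2 without the shadow window
```

In this trace the flow resumes from a larger window than it would by halving `w` alone, which is the recovery effect the shadow window is meant to provide.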
Shadow window example

[Figure: packets against number of round trip times, showing $w$ and the shadow window $s$. Labels on the plot: "s=0", "s sync", "delay based congestion", "lost packet", "Region B: $\max(w_i, s_i)/2$", "w recovery", "without w recovery", "lost transmission opportunity" and "gained transmission opportunity".]

Testbed for coexistence tests

[Figure: Delay CC Sources (FreeBSD) and NewReno Sources (FreeBSD) connect through a Dummynet Router (FreeBSD) to a Delay CC Sink (FreeBSD) and a NewReno Sink (FreeBSD); the diagram carries two 20ms labels.]

- We will look at HD and CHD coexisting with NewReno
- For 0 % and 1 % non-congestion related losses
HD coexisting with NewReno (1s Av, 0 % loss)

[Figure: Goodput (bps, $\times 10^6$) against Time (s) for HD and NewReno flows.]

- HD does not compete well with NewReno

CHD coexisting with NewReno (1s Av, 0 % loss)

[Figure: Goodput (bps, $\times 10^6$) against Time (s) for CHD and NewReno flows.]

- CHD reclaims some of the lost capacity from NewReno

Coexisting with NewReno on a lossy path

- 1 % loss
- 5 s averages

HD coexisting with NewReno (5s Av, 1 % loss)

[Figure: Goodput (bps, $\times 10^6$) against Time (s) for HD and NewReno flows.]

- HD and NewReno cannot efficiently utilise the available bandwidth

CHD coexisting with NewReno (5s Av, 1 % loss)

[Figure: Goodput (bps, $\times 10^6$) against Time (s) for CHD and NewReno flows.]

- CHD is able to effectively use the available bandwidth (including what NewReno is unable to use)
LCN Conclusions

- CHD significantly enhances HD:
  - improves scalability with per-RTT decisions
  - improves tolerance to non-congestion related packet losses
  - improves coexistence with loss-based TCP (NewReno)
    - Shadow window
    - Lightly multiplexed environments
- CHD and HD have been implemented in the FreeBSD kernel (caia.swin.edu.au/urp/newtcp/tools.html)

This work was made possible in part by a grant from the Cisco University Research Program Fund at Community Foundation Silicon Valley

A brief look at some other TCP work at CAIA
David Hayes and Grenville Armitage
{dahayes,garmitage}@swin.edu.au
Centre for Advanced Internet Architectures (CAIA)
Swinburne University of Technology

Delay-gradient based TCP congestion control

- We investigated a delay-gradient congestion signal because:
  - it does not require an accurate estimate of base RTT
  - delay thresholds are hard to set (they require knowing the path's delay characteristics)
- We have implemented it in FreeBSD, to be released soon.
  (waiting on a paper submission)

Comparing NewReno and our Delay-Gradient TCP

RTT dynamics as 3 sources start

[Figure: two panels, NewReno and Delay-Gradient, each plotting RTT against Time (s) for flow 1, flow 2 and flow 3.]
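The reason a delay-gradient signal needs no base RTT estimate can be shown with a toy calculation. This is not the CAIA algorithm, which these slides do not describe. It only computes differences between successive RTT samples, with a hypothetical function name, and shows that adding a constant path delay leaves the signal unchanged.

```python
def delay_gradients(rtt_samples):
    """Differences between successive RTT samples (same units as the input)."""
    return [later - earlier for earlier, later in zip(rtt_samples, rtt_samples[1:])]


if __name__ == "__main__":
    samples_ms = [100, 100, 105, 110, 108]
    print(delay_gradients(samples_ms))
    # The same queue behaviour on a path with 40 ms more propagation delay.
    print(delay_gradients([s + 40 for s in samples_ms]))
```

A positive gradient suggests the queue is growing and a negative one that it is draining, regardless of the path's fixed delay.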
Stateless TCP

- Proposed by Geoff Huston to mitigate a DNS server issue
  http://www.potaroo.net/ispcol/2009-11/stateless.pdf
- Funded by APNIC and Nominet
- DNSSEC may cause the answers to DNS queries to exceed a single UDP packet
- Clients using TCP may overload servers
- Geoff's idea → stateless TCP
- Implemented in FreeBSD, will be released on the CAIA web site in a few weeks.

Basic stateless TCP idea

[Figure: protocol stack diagram with a DNS server (Listen, Data) above UDP and statelessTCP, which sit above TCP and IP. Labels on the diagram: "send data up through UDP" and "if matches hash".]

CPU time versus DNS query arrival rate

[Figure: CPU usage in 10s interval against average requests/second for udp, tcp and stateless.]
Thoughts and conclusions

- Delay-based TCP coexistence with loss-based TCP
  - current schemes coexist by behaving like NewReno
- Low latency with no congestion related loss
  - only when there are no loss-based flows sharing the path
- If switches and routers could differentiate between loss and delay based TCP, benefits would be realised sooner.
- Delay-gradient as a congestion indication
  - works well
  - a composite delay-based congestion indication may be better

Thank you!

Questions?