Analysis of DCTCP: Stability, Convergence, and Fairness
Mohammad Alizadeh, Adel Javanmard, and Balaji Prabhakar
Stanford University

Data Center Packet Transport
• Transport inside the DC
  – TCP rules (99.9% of traffic in some DCs)
• But, TCP:
  – Needs large buffers for high throughput
  – Induces large queuing delays
  – Does not handle bursty traffic well (Incast)
• DCTCP was proposed to address these shortcomings (SIGCOMM’10).

TCP Buffer Requirement
• Bandwidth-delay product rule of thumb:
  – A single flow needs C×RTT buffers for 100% throughput.
[Figure: buffer occupancy for three buffer sizes: B < C×RTT (throughput loss), B = C×RTT, B > C×RTT (more latency)]
• To lower the buffering requirements, we must reduce sending rate variations.

DCTCP: Main Ideas
1. React in proportion to the extent of congestion. Reduce window size based on fraction of marked packets.

   ECN Marks     TCP                  DCTCP
   1011110111    Cut window by 50%    Cut window by 40%
   0000000001    Cut window by 50%    Cut window by 5%

2. Mark based on instantaneous queue length. Fast feedback to better deal with bursts. Simplifies hardware.

DCTCP: Algorithm
Switch side:
  – Mark packets when Queue Length > K.
[Figure: switch buffer of size B with threshold K; mark above K, don't mark below K]
Sender side:
  – Maintain running average of fraction of packets marked (α).
    Each RTT: $F = \dfrac{\text{\# of marked ACKs}}{\text{Total \# of ACKs}}$, $\alpha \leftarrow (1 - g)\alpha + gF$
  – Adaptive window decreases: $W \leftarrow (1 - \alpha/2)\,W$
  – Note: decrease factor between 1 and 2.

DCTCP vs TCP
Setup: Win 7, Broadcom 1Gbps Switch
Scenario: 2 long-lived flows, K = 30KB
[Figure: DCTCP vs TCP comparison plot, axis in Kbytes]

Analysis of DCTCP

Steady State Analysis
• What is the effect of the various network and algorithm parameters on system throughput and latency?
  – Network: Capacity, Round-trip Time, Number of flows
  – Algorithm: Marking threshold (K), Averaging parameter (g)
• The standard approach is to study control loop behavior via fluid models.
  – Kelly et al., Low et al., Misra et al., Srikant et al., …

DCTCP Fluid Model
[Figure: block diagram of the DCTCP control loop. Blocks: Source (LPF producing α(t), AIMD producing W(t)), Delay, Switch (marking step from 0 to 1 at queue length K). Signals: p(t), p(t – R*), W(t), N/RTT(t), C, q(t).]

Fluid Model vs ns2 simulations
[Figure: three panels, N = 2, N = 10, N = 100]
• Parameters: N = {2, 10, 100}, C = 10Gbps, d = 100μs, K = 65 pkts, g = 1/16.

Normalization of Fluid Model
• We make the following change of variables: [equation not preserved]
• The normalized system: [equation not preserved]
• The normalized system depends on only two parameters: [equation not preserved]

Equilibrium Characterization
Case 1:
• Very large N: system (globally) converges to a unique fixed point:
  $(\tilde{W}, \tilde{\alpha}, \tilde{q}) = (2,\ 1,\ 2/w - 1)$
  $(W, \alpha, q) = (2,\ 1,\ 2N - Cd)$
Example: w = 1, g = 1/16.

Equilibrium Characterization
Case 2:
• System has a periodic limit cycle solution.
Example: w = 10, g = 1/16.

Stability of Limit Cycles
• Let X* = set of points on the limit cycle.
• A limit cycle is locally asymptotically stable if δ > 0 exists s.t.: [condition not preserved]

Poincaré Map
[Figure: a trajectory crosses the section S at x1 and then at x2, with x2 = P(x1); the limit cycle crosses S at the fixed point $x^*_\alpha = P(x^*_\alpha)$]
• Stability of Poincaré Map ↔ Stability of limit cycle

Stability Criterion
• Theorem: The limit cycle of the DCTCP system: [equation not preserved] is locally asymptotically stable if and only if $\rho(Z_1 Z_2) < 1$.
  – $J_F$ is the Jacobian matrix with respect to x.
  – $T = (1 + h_\alpha) + (1 + h_\beta)$ is the period of the limit cycle.
• We have numerically checked this condition for: [range not preserved]
• Proof: Show that $P(x^*_\alpha + \delta) = x^*_\alpha + Z_1 Z_2 \delta + O(|\delta|^2)$.

Parameter Guidelines
• How big does the marking threshold K need to be to avoid queue underflow?
[Figure: buffer of size B with marking threshold K]

Throughput-Latency Tradeoff
• Throughput > 94% as K → 0
• For TCP: Throughput → 75%
• Parameters: C = 10Gbps, d = 480μs, g = 0.05.

Convergence Analysis
• How long does it take for DCTCP sources to converge to their “fair share” rate (C/N)?
  – DCTCP is slower to converge than TCP since it cuts its window by smaller factors.
• The fluid model is not suitable for transient analyses.
• We use a hybrid (continuous- and discrete-time) model.
  – The model is inspired by the AIMD models of Baccelli et al. and Shorten et al.

The Hybrid Model
[Figure: two stacked plots against time, window sizes and p(t) (marking probability), with an interval of 1 RTT indicated]

Rate of Convergence (Theorem)
Assume N DCTCP flows with arbitrary $W_i(0)$ and $\alpha_i(0)$, evolving according to the Hybrid Model, with: [condition not preserved]
Define function [equation not preserved], and let 0 < α* ≤ 1 be the unique positive solution to [equation not preserved].
Then: [equation not preserved]
Also: [equation not preserved]
where: [equation not preserved]

Consequences
• DCTCP converges at most 40% slower than TCP: [equation not preserved]
• The parameter g should not be too small: [equation not preserved]

Convergence: ns2 Simulations
[Figure: three panels, g = 0.07, g = 0.025, g = 0.005]

Conclusion
• Our analysis shows DCTCP:
  – requires 17% of C×RTT for full throughput
  – achieves 94% throughput as K → 0.
  – converges at most 1.4 times slower than TCP.
• We provide guidelines for setting the DCTCP parameters.
• The analysis suggests a simple modification that improves the RTT-fairness of DCTCP.
  – Achieves linear-RTT fairness (Throughput ∝ RTT⁻¹), like TCP-RED
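
Sketch: DCTCP sender-side update
The sender-side rule from the "DCTCP: Algorithm" slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the usual reading of the update, α ← (1 − g)α + gF once per RTT and W ← (1 − α/2)W when marks are seen. The function names `update_alpha` and `cut_window` are hypothetical, and the default g = 1/16 is simply the value used in the simulation parameters above.

```python
def update_alpha(alpha, marked_acks, total_acks, g=1 / 16):
    """One per-RTT update of the running average of the marked fraction."""
    if total_acks <= 0:
        return alpha  # no feedback this RTT; keep the old estimate (assumption)
    frac = marked_acks / total_acks      # F = marked ACKs / total ACKs
    return (1 - g) * alpha + g * frac    # alpha <- (1 - g) * alpha + g * F


def cut_window(window, alpha):
    """Window decrease: W <- (1 - alpha / 2) * W."""
    return (1 - alpha / 2) * window


# With alpha set equal to the marked fraction, the cuts match the
# "Main Ideas" slide: 8 of 10 marked gives 40%, 1 of 10 marked gives 5%.
for marks in ("1011110111", "0000000001"):
    alpha = marks.count("1") / len(marks)
    cut = 1 - cut_window(1.0, alpha)
    print(f"{marks}: cut window by {cut:.0%}")
```

With α = 1 the cut is 50%, which is the TCP behavior, so the window is divided by a factor between 1 and 2 as the slide notes.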