Modeling TCP Throughput: A Simple Model and its Empirical Validation
Jitendra Padhye, Victor Firoiu, Don Towsley, Jim Kurose
Presented by Jaebok Kim

Introduction
• Simple analytic characterization of the steady-state throughput
  – A stochastic model of TCP congestion control
• Deriving mathematical formulas
  – Taking into account not only fast retransmit but also time-outs

Contents
• TCP Congestion Avoidance
• Simplifying assumptions
• Loss indications & triple-duplicate ACKs
• Loss indications & triple-duplicate ACKs, time-outs
• Impact of window limitation & a full model
• Empirical validation
• Conclusion

TCP Congestion Avoidance
• How do we resolve this problem?
• TCP Reno – a newer version
• Slow Start
  – W' = W + 1 (on each ACK arrival)
  – In effect, W doubles every RTT
• Additive Increase
  – W' = W + 1/W (on each ACK arrival)
  – W'' = W + 1/b (when the next round begins)
    • b = number of packets acknowledged by one ACK (typically 2)
    • W/b ACKs arrive per round, and each increases W by 1/W
• Multiplicative Decrease (3 duplicate ACKs)
  – W' = W * Md
  – In effect, W' = W/2
  – Does not return to Slow Start; continues with Additive Increase
• Time-out
  – Returns to Slow Start
  – W = 1

Simplifying assumptions
• Time spent in Fast Recovery is ignored
• Time spent in Slow Start is ignored
• Packet losses within a round are correlated
  – Drop-tail policy
    • Once the buffer is full, all packets arriving later in the round are dropped
  – But losses are independent between rounds
    • Rounds are separated by one RTT
• Same implementation of TCP Reno
[Figure: packets P1, P2, P3, P4, P5, P6]

Loss indications & triple-duplicate ACKs
• B – long-term steady-state TCP throughput
  – Window increases by 1/b per round
  – Window decreases by a factor of 2
• p – loss probability
• Get B(p) by using a Markov regenerative process (MRGP)
  – B = E[Y] / E[A]
    • Y = number of packets sent in TDPi (the i-th triple-duplicate period)
    • A = duration of the period
    • E[ ] = expected value in the MRGP
• Why do we need MRGP?
  – A cycle repeats (TDP1, TDP2, TDP3, and so on)
    • Like a sequence of outputs
  – The new window size depends only on the previous one
    • Markov chain
  – Losses in different rounds are separated by an RTT and are independent
    • In statistics, a sequence of random variables is independent and identically distributed (i.i.d.) if each has the same probability distribution as the others and all are mutually independent
  – Represents a steady-state model
• Markov model
  – Predicts the future from the past
  – Based on conditional probability
  – The future state depends only on the current state, not on earlier states
• P(Rain, Sunny, Cloudy) = ?
  – = p(Rain) * p(Sunny|Rain) * p(Cloudy|Sunny)
• How do we predict the weather?
• MRGP
  – I.i.d. random variables
• To get B(p) = E[Y]/E[A]
  – αi : number of packets sent in TDPi, up to and including the first lost packet
  – Xi : the round in which the loss occurs
  – Yi = αi + Wi - 1
    • A total of Yi packets is sent in Xi + 1 rounds
  – E[Y] = E[α] + E[W] - 1   (2)
• To derive E[α]
  – E[α] : expected value of the random process {αi}
  – Based on the assumption
    • Packet losses in a round are independent of packets in other rounds
    • Independent & identically distributed random variables
  – P[α = k] equals the probability that k-1 packets are acknowledged before a loss
    • P[α = k] = (1-p)^(k-1) * p, so E[α] = 1/p   (4)
  – Using (2) and (4), we can derive (5): E[Y] = (1-p)/p + E[W]
• The window increase is linear with slope 1/b
• Yi can be expressed by (10)
• βi : number of packets sent in the last round
  – On average βi = Wi/2, i.e. E[β] = E[W]/2
• To derive E[W]
  – {Wi} and {Xi} are mutually independent sequences of i.i.d. random variables
  – So (12) follows from (7), (10) and (5)
  – Quadratic equation in E[W] from (11) & (12):
    (1-p)/p + E[W] = (b*E[W]/4) * ((3/2)*E[W] - 1) + E[W]/2
• Once we have E[W], we can get E[X] & E[A]
• Finally, B(p) is derived from
E[Y]/E[A]

Loss indications & triple-duplicate ACKs, Time-outs
• The major reason for window decreases
  – Time-out rather than fast retransmit
  – Occurs when packets (or ACKs) are lost
  – After a time-out, W' = 1
  – The time-out period doubles with each consecutive time-out
• Using MRGP again
  – ZTO : duration of a sequence of time-outs
  – ZTD : time interval between two consecutive time-out sequences
  – Si = ZiTO + ZiTD
  – M : number of packets sent during Si
  – B = E[M] / E[S]
• How to get B(p)?
  – We already know E[Y] and E[A], so let's use them
  – Ri = number of packets sent during the time-out sequence ZiTO
• Similar process to the one used to get B(p) for TDP
  – Yields a full model & an approximate model

Impact of window limitation & a full model
• Keep in mind the limitation on window size
• The window can't grow beyond Wmax
• Follow a process similar to the previous models
  – Unconstrained window size : Wu
  – If E[Wu] < Wmax, E[W] is approximately equal to E[Wu]
  – Otherwise, E[W] is approximately equal to Wmax
• A full model
• An approximate model

Empirical validation
• Validating the formulae derived so far by measurement
  – 24 data sets, each from a 1-hour-long TCP connection
  – Infinite source
[Figure legend]
  X-axis = frequency of loss indications
  Y-axis = number of packets sent
  TD = only TD intervals
  T0 = single TO intervals
  T1 = double TO intervals
  T2 = triple TO intervals
  TD Only = prediction of the TD Only model
  Full = prediction of the full model
• Analysis of measurement tables
• The TD Only model overestimates throughput
• The full model is close to the measurements
• Connections suffer more from time-outs than from triple-duplicate ACKs

Conclusion
• A simple model of TCP Reno
  – Captures the essence of TCP's congestion avoidance behavior
    • TDP & time-out
  – Expresses throughput as a function of loss rate
• Most connections suffered from a considerable number of time-outs

Q&A
• Thank you for listening to my presentation
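
Appendix (illustrative sketch)
The TD-only derivation from the "Loss indications & triple-duplicate ACKs" slides can be sketched numerically. This is a minimal Python sketch, not the presenter's or the authors' code. It takes the positive root of the quadratic in E[W] quoted on the slides, then forms B(p) = E[Y]/E[A]. The function names `expected_window` and `td_only_throughput` and the parameter `rtt` are hypothetical. The sketch assumes E[α] = 1/p (geometric loss model), E[X] = b*E[W]/2, a constant round-trip time, and E[A] = RTT * (E[X] + 1), since Yi packets are sent in Xi + 1 rounds.

```python
import math


def expected_window(p: float, b: int = 2) -> float:
    """Positive root E[W] of the slide's quadratic:

        (1-p)/p + W = (b*W/4) * (1.5*W - 1) + W/2

    rearranged as (3b/8) W^2 - (b/4 + 1/2) W - (1-p)/p = 0.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("loss probability p must be in (0, 1)")
    a2 = 3.0 * b / 8.0
    a1 = -(b / 4.0 + 0.5)
    a0 = -(1.0 - p) / p
    # a0 < 0 and a2 > 0, so the discriminant is positive and
    # exactly one root is positive.
    disc = a1 * a1 - 4.0 * a2 * a0
    return (-a1 + math.sqrt(disc)) / (2.0 * a2)


def td_only_throughput(p: float, rtt: float, b: int = 2) -> float:
    """TD-only estimate B(p) = E[Y] / E[A], in packets per second.

    Assumptions (labeled, not taken from any code in the slides):
      E[Y] = (1-p)/p + E[W]      from E[alpha] = 1/p
      E[X] = b * E[W] / 2        window grows by 1/b per round
      E[A] = rtt * (E[X] + 1)    Xi + 1 rounds of constant length rtt
    Time-outs and the Wmax limit are ignored here.
    """
    w = expected_window(p, b)
    x = b * w / 2.0
    y = (1.0 - p) / p + w
    a = rtt * (x + 1.0)
    return y / a


if __name__ == "__main__":
    for loss in (0.001, 0.01, 0.05):
        rate = td_only_throughput(loss, rtt=0.1)
        print(f"p={loss}: about {rate:.1f} packets/s")
```

Because this sketch covers only triple-duplicate loss indications, it should overestimate throughput whenever time-outs are frequent, which is the behavior the "Empirical validation" slides report for the TD Only model.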