Progress on Practical Shared Bottleneck Detection for Coupled Congestion Control
David Hayes (UiO), with Simone Ferlin (SRL) and Michael Welzl (UiO)
RMCAT, 4 March 2014

Background

Problem
- Flows traversing different paths through a network may share a common congested link (a bottleneck).
- Detecting which flows share a bottleneck and coupling their congestion control can provide performance advantages.

SBD design objectives
- Reliable
- Practical (neither CPU nor network intensive)
- Small numbers of bottlenecks (< 10)
- Timely, stable bottleneck detection (< 10 s)

Shared Bottleneck Detection

What does it rely on?
- Flows that share a bottleneck are similar in a measurable way.

Why is it hard?
- Delay and loss measurements include “noise” from the rest of the path.
- Delay and loss at the bottleneck are noisy: each packet sees a different queueing delay.
- Different path delays cause time correlations to be lost or degraded at the measurement point.

Classic cross correlation techniques

Pairwise flow cross correlation of delay samples
- The delay signal is noisy
  - filter
- The delay distribution is often skewed
  - sophisticated filter
- Different path delays
  - incrementally shift and cross correlate to find the lag of maximum correlation
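As an illustration of the classic shift-and-correlate approach, the sketch below slides one delay series against another and reports the lag with the highest Pearson correlation. It is a minimal sketch, not the method proposed in these slides: the function names `pearson` and `best_lag` and the `max_lag` parameter are hypothetical, the two series are assumed to be equal-length and sampled at the same rate, and no filtering is applied.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(va * vb)


def best_lag(x, y, max_lag):
    """Return (lag, r) maximising the correlation of x[t] with y[t + lag].

    Hypothetical helper: assumes len(x) == len(y) and searches every
    integer lag in [-max_lag, max_lag].
    """
    best = (0, float("-inf"))
    n = len(x)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        if len(a) < 2:
            continue  # too little overlap to correlate
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Example: flow y sees the same delay pattern as flow x, three samples later.
x = [0, 1, 0, 3, 7, 2, 0, 0, 5, 1, 0, 4, 9, 3, 0, 1, 0, 2, 6, 1]
y = [0, 0, 0] + x[:-3]
lag, r = best_lag(x, y, max_lag=5)
```

The search costs one correlation per candidate lag per flow pair, which is part of why this approach scales poorly as the number of flows and the range of path delay differences grow.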

The delay signal

[Figure: delay signal plotted against time t.]

Dealing with the signal noise
- Remove half the noise from other links by using one-way delay (OWD) instead of RTT
- Use only difference statistics
  - removes queueing delay estimate errors caused by an inaccurate estimate of OWDmin
- Mitigate lag and sample noise by:
  - using relatively large statistic gathering periods
  - relaxing thresholds (no need to distinguish between 1000 bottlenecks)
  - using multiple measures

Summary statistics
- Skewness in OWD
  - an estimate using 2 counters
- Variance in OWD
  - estimated using PDV (RFC 5481)
- Key frequency of OWD at the bottleneck link
  - estimated based on significant mean crossings

Grouping overview

[Figure: grouping tree. Flows are first split by skewness (branch labels: +’ve skew, -’ve skew; No congestion, Congestion), then subdivided successively by key frequency, variance, and skewness into groups Grp1 ... Grpn.]

Simulation tests

Objectives
- Test with a known “ground truth”
- Simulations allow us to look at worse-than-real scenarios.

Real network experiments are in progress, however, and will be discussed before the end.

Example NS2 Simulation

[Figure: NS2 topology with test sources 1 to 4, a test sink, Tmix background traffic, and bottlenecks labelled by the flows that share them: 1&2, 1&3, 1&4, 2&3, 2&4, 3&4, and 1,2,3&4.]

Notes on results
- Decisions are made every 300 ms, but are based on 6 to 15 s of statistics.
- Decision “points” are drawn large for legibility, which tends to magnify errors.
- The results illustrate what can and can’t be done.

Simulation results

Different queue sizes, link OWD std 2.5.
Note: bottleneck queue sizes have been subsampled (1:350) and OWDs (1:20).

[Figure: panels showing bottleneck (BN) queue size (10^3 B), OWD (ms), and the resulting grouping against simulation time (0 to 1200 s); panel labels 3, 4, 12, 1234, 14, 24, 34, 23, 13.]

Real network experiments
- Bottleneck “ground truth” cannot be known with 100 % certainty.
  - Find the thinnest link using STAB
  - Load the thinnest link with distant Internet sources to create a known bottleneck

What are we testing?
- Robustness in unpredictable “real” environments

Real network experiments (in progress)

Work with Simone Ferlin

[Figure: testbed map with sites (0) Oslo, (1) China, (2) Spitsbergen, (3) Kristiansand, (4) Gjøvik, (5) Germany, and (6) U.K.; other labels include Tbox, NNE, Operator A, Operator B, and Internet.]

Working with Coupled Congestion Control
- Summary statistics are gathered at the receivers
- Shared bottlenecks to a receiver:
  - The receiver does the grouping and sends information to the senders
- Shared bottlenecks from a sender:
  - Receivers send summary statistics for grouping at the sender.
  - Can provide the necessary information for a future multi-sender multi-receiver coupled congestion control.

Conclusions and further work

Finalising this stage
- Finish real network experiments
- Paper submission soon (LCN)
- Draft (referring to paper)
- Quantitative results of % correct grouping
  - simulation based, where “ground truth” is known
  - bottleneck definition based on queue empty rate or average
    queue size
  - extended version in journal

Next steps
- Protocol for sender/receiver information exchange
- Integration with coupled congestion control
  - time scales of detection
  - dealing with SBD errors
  - oscillating bottlenecks

Extra slides

Time domain summary statistics

[Figure: normalised delay moment versus load (0 to 2); curves T, m2, m3.]
Mean, variance (m2), skewness (m3)

Practical estimation of skewness

[Figure: normalised delay moment versus load (0 to 2); curves m3 and γ̂.]

Practical estimation of variance

[Figure: normalised delay moment versus load (0 to 2); curves m2, PDV1, PDV2, PDV3.]

Practical estimation of key frequency

[Figure: OWD trace annotated with f̂ and with offsets labelled "+ ∝ PDV" and "− ∝ PDV".]
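The slides name the three summary statistics but not their exact formulas, so the following is only a sketch under stated assumptions. Skewness is approximated with two counters (samples below and above the interval mean). Variability uses the mean absolute deviation from the interval mean as a stand-in for the PDV-based estimate, since the slides do not say which PDV variant is used. Key frequency counts "significant" mean crossings, taken here to mean crossings of a band around the mean whose width is proportional to the variability estimate. All function names and the `band_factor` parameter are hypothetical.

```python
from statistics import mean


def skew_estimate(owd):
    """Two-counter skewness proxy in [-1, 1].

    Counts samples below and above the mean; a long right tail
    (most samples below the mean) gives a positive value.
    """
    m = mean(owd)
    below = sum(1 for d in owd if d < m)
    above = sum(1 for d in owd if d > m)
    return (below - above) / len(owd)


def variability_estimate(owd):
    """Mean absolute deviation from the mean (assumed stand-in for PDV)."""
    m = mean(owd)
    return sum(abs(d - m) for d in owd) / len(owd)


def key_frequency_estimate(owd, band_factor=0.2):
    """Significant mean crossings per sample.

    A crossing is counted only when the signal moves from beyond one
    edge of the band [mean - w, mean + w] to beyond the other, where
    w = band_factor * variability_estimate(owd) (assumed definition).
    """
    m = mean(owd)
    w = band_factor * variability_estimate(owd)
    crossings = 0
    side = 0  # +1: last seen above the band, -1: below, 0: not yet known
    for d in owd:
        if d > m + w:
            new_side = 1
        elif d < m - w:
            new_side = -1
        else:
            continue  # inside the band: treated as noise, not a crossing
        if side != 0 and new_side != side:
            crossings += 1
        side = new_side
    return crossings / len(owd)


# Example: mostly idle queue with one delayed packet, and a square wave.
spiky = [10, 10, 10, 10, 10, 10, 10, 30]
square = [0, 0, 10, 10, 0, 0, 10, 10]
stats = (skew_estimate(spiky), variability_estimate(spiky),
         key_frequency_estimate(square))
```

None of these estimators needs OWDmin, which matches the slides' preference for difference statistics; only the interval mean and simple counters are kept per flow.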