RMCAT - Shared Bottleneck Detection and Coupled
Congestion Control vs. QoS for WebRTC
Michael Welzl
CAIA, Swinburne University, Melbourne
4 December 2014
Contributors
•  Main people behind this work:
–  Shared bottleneck detection:
Algorithm development & evaluation: David Hayes
Stuff available from: http://heim.ifi.uio.no/davihay/
–  http://www.ietf.org/proceedings/91/slides/slides-91-rmcat-2.pdf
–  NorNet testing: Simone Ferlin (Simula Research Labs)
–  Coupled congestion control:
Safiqul Islam
Stuff available from:
http://heim.ifi.uio.no/safiquli/coupled-cc/index.html
–  Other contributors: Stein Gjessing, Naeem Khademi
WebRTC in a nutshell
•  Multimedia services in browsers, available everywhere,
without plugins
–  P2P (browser-to-browser); major focus on traversing
middleboxes (NATs etc.) → all traffic will be one UDP 5-tuple
–  Includes SCTP data channel (e.g. control traffic in a game)
–  Also a lot of IETF effort to agree on audio/video codecs
•  Control by web developer, via Javascript API,
standardized in W3C
•  Protocols standardized in IETF WGs: RTCWEB (main) +
RMCAT (congestion control) + DART (QoS)
–  Note, RMCAT charter is about media, not only WebRTC
WebRTC possibilities: games, toys... and
talking to tech support in the browser
Prioritization
•  WebRTC needs priorities
–  e.g., draft-ietf-rtcweb-use-cases-and-requirements includes use
case “Simple Video Communication Service, QoS”
–  Priorities to be exposed via JS API
•  Just use QoS?
–  Not available everywhere; some people just want to set DSCP and
“hope for the best”
–  Needs special rules: different DSCP values for one UDP-5-tuple?!
draft-ietf-dart-dscp-rtp-10
–  When a common shared bottleneck is known, priorities between flows
from a single sender can be realized by coupled congestion control
(QoS can still protect you from other users)
Coupled congestion control: background
•  Having multiple congestion controlled flows from the same
sender compete on the same bottleneck is detrimental
–  Combining congestion controllers in the sender → better control over
fairness (with priorities), and less delay and loss
•  Two elements: 1) shared bottleneck detection (sbd),
2) coupled congestion control
–  In WebRTC, 1) can sometimes be very easy: same 6-tuple. But
measurement-based sbd enables broader application of 2)
(same sender, different receivers)
–  apparently, “same 6-tuple” turned this into “QoS vs. coupled-CC”
in the IETF
Shared bottleneck detection: scenarios
[Figure: shared-bottleneck scenarios (Case 1, Case 2, Case 3) between H1 (sender), H2 (sender + receiver) and H3 (receiver)]
•  Only case 1 can be supported by 6-tuple
•  Case 2 can be supported by measurement-based SBD
•  Case 3 requires coordinating senders H1 and H2
–  Can be supported by measurement-based SBD, but needs
more (H1 ↔ H2 coordination); not currently considered
SBD for RMCAT
[ David Hayes, Simone Ferlin-Oliveira, Michael Welzl: "Practical Passive
Shared Bottleneck Detection Using Shape Summary Statistics", IEEE LCN
2014, 8-11 September 2014. ]
[ draft-hayes-rmcat-sbd-01.txt ]
•  Requirements:
–  Little signalling (also true for RMCAT CC algos)
–  Passive: requires no changes to traffic
–  Reliable for long-term data/multimedia streams (e.g. video)
•  False positive worse than false negative
–  Realistic to implement as a real-time function in a browser
(no expensive offline computation)
SBD overview
•  Goal: determine correlation from common queue
•  Receiver calculates summary statistics from OWD (one-way delay)
–  To deal with noise and lag, and to limit feedback
•  Variance: Packet Delay Variation [RFC 5481]
•  Skewness: skew_est
•  Oscillation: freq_est (avg. # of significant mean crossings)
•  Sender: groups flows experiencing congestion (skew_est);
then divides the groups based on freq_est, PDV, and skew_est
Why skewness?
•  Change from positive to negative near 100% load makes it a good congestion indicator
•  Not good to use alone: ambiguity and susceptible to extreme sample values
[Figure: skewness in an M/M/1/K queue]
Practical estimation of skewness
•  Count proportion of samples above and below the mean
•  Removes ambiguity
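As a concrete illustration, below is a minimal sketch of a proportion-based skewness estimate and a mean-crossing frequency estimate over a window of OWD samples. The function names mirror skew_est and freq_est, but the exact definitions and normalization are assumptions for illustration, not the precise formulas of draft-hayes-rmcat-sbd.

# Sketch only: proportion-based skewness and oscillation estimates over an OWD
# window; the exact definitions/normalization in draft-hayes-rmcat-sbd differ.
def skew_est(owd_samples, mean_owd):
    """Fraction of samples below the mean minus fraction above it: positive for
    a lightly loaded queue, turning negative as the load approaches 100%."""
    below = sum(1 for d in owd_samples if d < mean_owd)
    above = sum(1 for d in owd_samples if d > mean_owd)
    return (below - above) / len(owd_samples)

def freq_est(owd_samples, mean_owd, significance):
    """Count significant crossings of the mean (queue oscillation),
    normalized by the number of samples."""
    crossings, last_side = 0, 0
    for d in owd_samples:
        if abs(d - mean_owd) < significance:
            continue                      # ignore excursions that are too small
        side = 1 if d > mean_owd else -1
        if last_side and side != last_side:
            crossings += 1
        last_side = side
    return crossings / len(owd_samples)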
Simulation setup
•  Background traffic based on real traffic traces
–  >90%
•  Flows 1&2 send at twice the rate of 3&4
•  Various combinations of bottlenecks activated
Real network tests with NorNet
Coupled congestion control
•  When possible, best done by scheduling packet transmission
from different sources with a single congestion controller
–  Congestion Manager (CM)
•  Disadvantages of this approach:
–  Hard to combine multiple applications (RMCAT is about RTP-based
applications in general, not only WebRTC)
–  Difficult to switch on/off
•  Hence, goal of [draft-welzl-rmcat-coupled-cc-04]: achieve benefits
with minimal changes to existing congestion controllers
[ Safiqul Islam, Michael Welzl, Stein Gjessing, Naeem Khademi: "Coupled
Congestion Control for RTP Media", ACM SIGCOMM Capacity Sharing Workshop
(CSWS 2014), 18 August 2014, Chicago, USA. ]
Best paper award → to be published in ACM SIGCOMM CCR too.
“Flow State Exchange” (FSE)
•  The result of searching for minimum-necessary standardization:
only define what goes in / out, how data are maintained
–  Could reside in a single app (e.g. browser) and/or in the OS
[Figure: Traditional CM, FSE-based CM, and another possible implementation of flow coordination (streams 1 and 2 coordinated via a CM and/or an FSE)]
First version: only passive
•  Goal: Minimal change to existing CC
–  each time it updates its sending rate (New_CR), the flow calls
update (New_CR, New_DR), and gets the new rate
–  This complicates the FSE algorithm and the resulting dynamics
(e.g., dampening is needed to avoid overshoot from slowly-reacting flows)
[Figure: passive FSE: each flow calls Update_rate(); the FSE stores information, calculates rates, and returns New_Rate to the calling flow]
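As a rough sketch (not the exact algorithm of draft-welzl-rmcat-coupled-cc), the passive coupling can be thought of as an update() call that refreshes the aggregate rate and hands a priority-weighted share back only to the calling flow; all other flows keep their previous rate until they call update() themselves:

# Passive FSE sketch (illustrative only). Per-flow state: priority P and the
# last assigned rate FSE_R; S_CR is the sum of the calculated rates of all
# flows in the flow group.
flows = {
    1: {"P": 1.0, "FSE_R": 1.0e6},   # flow id -> priority, rate in bit/s
    2: {"P": 1.0, "FSE_R": 1.0e6},
}
S_CR = sum(f["FSE_R"] for f in flows.values())

def update(flow_id, new_cr, new_dr):
    """Called by a flow with its new congestion-controller rate New_CR; returns
    the new rate for that flow only (New_DR, the desired rate, is ignored here)."""
    global S_CR
    f = flows[flow_id]
    S_CR += new_cr - f["FSE_R"]                    # refresh the aggregate rate
    S_P = sum(g["P"] for g in flows.values())      # sum of priorities
    f["FSE_R"] = f["P"] * S_CR / S_P               # this flow's weighted share
    return f["FSE_R"]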
Now: FSE - Active
•  Actively initiates communication with all the flows
–  Perhaps harder to use, but simpler algorithm and “nicer” resulting dynamics
[Figure: active FSE: a flow calls Update_rate(); the FSE stores information, calculates rates, and pushes New_Rate to all flows]
Active algorithm: 1st version
•  Every time the congestion controller of a flow
determines a new sending rate CC_R, the flow
calls UPDATE
–  Updates the sum of all rates, calculates the sending
rates for all the flows and distributes them to all
registered flows
for all flows i in FG do
FSE_R(i) = (P(i)*S_CR)/S_P
send FSE_R(i) to the flow i
end for
•  Designed to be as simple as possible
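Read as Python, the loop above plus the surrounding bookkeeping might look like the sketch below (the class structure and the way S_CR is maintained are assumptions for illustration, not the normative text):

# Active FSE, first version (sketch; S_CR bookkeeping and callbacks assumed).
class ActiveFSE:
    def __init__(self):
        self.flows = {}   # flow id -> {"P": priority, "FSE_R": rate, "deliver": cb}
        self.s_cr = 0.0   # S_CR: sum of the calculated rates of all flows in FG

    def register(self, flow_id, priority, initial_rate, deliver_rate):
        self.flows[flow_id] = {"P": priority, "FSE_R": initial_rate,
                               "deliver": deliver_rate}  # callback into the flow
        self.s_cr += initial_rate

    def update(self, flow_id, cc_r):
        """Called whenever flow_id's congestion controller determines a new rate CC_R."""
        f = self.flows[flow_id]
        self.s_cr += cc_r - f["FSE_R"]                  # update the sum of all rates
        s_p = sum(g["P"] for g in self.flows.values())  # S_P: sum of priorities
        for i, g in self.flows.items():                 # for all flows i in FG
            g["FSE_R"] = g["P"] * self.s_cr / s_p       # FSE_R(i) = (P(i)*S_CR)/S_P
            g["deliver"](g["FSE_R"])                    # send FSE_R(i) to the flow i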
Evaluation
•  Done: simulations, rate-based
–  AIMD: RAP
–  Multimedia-oriented: TFRC
•  Ongoing: simulations, window-based:
–  Delay-based: LEDBAT
–  TCP (SCTP-like, for data channel)
•  Planned: currently proposed RMCAT CC’s
–  Simulations
–  Real-life tests (put code in browser)
Simulation setup
[Figure: dumbbell topology: flows F1…Fn traverse the bottleneck between routers R1 and R2]
•  Bottleneck – 10 Mbps; Queue length – 62 packets (1/2 BDP; see the arithmetic below);
Packet size – 1000 Bytes; RTT – 100 ms
•  All tests (except when x-axis = time) ran for 300 seconds, carried
out 10 times with random start times picked from first second;
stddev consistently very small ( <= 0.2% )
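For reference, a quick arithmetic check of the first bullet: the 62-packet queue is half the bandwidth-delay product of this setup.
\[
  \tfrac{1}{2}\,\mathrm{BDP}
  = \frac{10\,\mathrm{Mbit/s} \times 100\,\mathrm{ms}}{2 \times 8000\,\mathrm{bit/packet}}
  = \frac{10^{6}\,\mathrm{bit}}{16\,000\,\mathrm{bit/packet}}
  = 62.5 \approx 62\ \text{packets}
\]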
Dynamic behavior: Rate Adaptation Protocol RAP (= rate-based AIMD)
[Figure: sending rate (Mbps) of Flow 1 and Flow 2 over time (20–25 s), with FSE and without FSE]
Dynamic behavior: TFRC
[Figure: sending rate (Mbps) of Flow 1 and Flow 2 over time (20–25 s), with FSE and without FSE]
FSE goals
•  Charter:
“Develop a mechanism for identifying shared bottlenecks
between groups of flows, and means to flexibly allocate their
rates within the aggregate hitting the shared bottleneck.”
(requirement F34 in draft-ietf-rtcweb-use-cases-and-requirements-12)
–  This works perfectly
[Figure: sending rate (Mbps) of Flow 1 and Flow 2 over 300 s; the priority of flow 1 is increased over time]
•  But: because this avoids competition between flows, we
expected reduced queuing delay and loss as a side effect
Average queue length (RAP)
[Figure: average queue length vs. number of flows, with FSE and without FSE]
Packet loss ratio (RAP)
[Figure: packet loss ratio (%) vs. number of flows, with FSE and without FSE]
What's going on?
[Figure: queue size (pkts) over time (15–19 s), with FSE and without FSE]
•  Queue drains more often without FSE
–  E.g.: 2 flows with rate X each; flow 1 halves its rate: 2X → 1½X
–  When flows synchronize, both halve their rate on congestion, which
really halves the aggregate rate: 2X → 1X
Fix: Conservative Active FSE algorithm
•  No congestion (increase): do as before
•  Congestion (decrease): proportionally reduce total rate
(like one flow)
–  e.g. flow 1 goes from 1 to ½ => total goes from X to X/2
•  Need to prevent flows from ignoring congestion or overreacting
–  timer prevents rate changes immediately after the common rate
reduction that follows a congestion event
–  Timer is set to 2 RTTs of the flow that experienced congestion
–  Reasoning: assume that a congestion event can persist for up to
one RTT of that flow, with another RTT added to compensate for
fluctuations in the measured RTT value
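A sketch of how this conservative rule could sit on top of the active FSE sketch from earlier: a reduction scales the aggregate down by the same factor as the congested flow and arms a 2-RTT timer that suppresses further reductions, while increases are handled as before (details such as the timer handling are assumptions for illustration, not the draft's text):

import time

# Conservative active FSE (sketch, building on the ActiveFSE sketch above).
class ConservativeActiveFSE(ActiveFSE):
    def __init__(self):
        super().__init__()
        self.no_reduction_until = 0.0   # timer: ignore reductions before this time

    def update(self, flow_id, cc_r, rtt):
        f = self.flows[flow_id]
        now = time.monotonic()
        if cc_r < f["FSE_R"]:                          # congestion: flow reduces
            if now < self.no_reduction_until:
                return                                 # common reduction just applied
            self.s_cr *= cc_r / f["FSE_R"]             # reduce total proportionally
            self.no_reduction_until = now + 2 * rtt    # 2 RTTs of the congested flow
        else:                                          # no congestion: as before
            self.s_cr += cc_r - f["FSE_R"]
        s_p = sum(g["P"] for g in self.flows.values())
        for i, g in self.flows.items():                # redistribute to all flows
            g["FSE_R"] = g["P"] * self.s_cr / s_p
            g["deliver"](g["FSE_R"])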
[Figure: RAP and TFRC results: average queue length, packet loss ratio (%) and link utilization (%) vs. number of flows; curves: FSE, Without FSE, Throughput - 1 flow]
•  TFRC: receiver makes assumptions about sending rate (expected length of loss interval) → loss event ratio p calculation wrong → sender too aggressive
Multiple LEDBAT flows
How to evaluate app-limited flows?
•  Not easy: who is in control?
•  RMCAT codec model not available yet
•  From a transport point of view, the send buffer can
either run empty or not, with variations in how
quickly changes between these two states occur
–  We used a non-reacting video trace of a person talking
in a video conference with a well-known H264 encoder
(X264) to steer the app sending rate
•  I-frame in the beginning, rest was mostly P-frames
1 app-limited flow, 1 greedy flow (RAP)
[Figure: sending rate (Mbps) of Flow 1 and Flow 2 over time (0–25 s), with FSE and without FSE]
FSE-controlled flows proportionally reduce the rate in case of
congestion; without FSE, synchronization causes the app-limited
flow to over-react
Using priorities to “protect” the app-limited from the greedy flow (RAP)
[Figure: throughput (Mbps) of Flow #1 and Flow #2, and link utilization, as the capacity varies]
High-priority (1) application-limited flow #1 is hardly affected by a low-priority (0.2) flow #2 as long as there is enough capacity for flow 1
2 FSE-controlled flows competing with synthetic traffic (TFRC)
•  TMIX synthetic traffic, taken from a 60 minute trace of campus traffic at the University of North Carolina [TCP Evaluation Suite]
•  We used the pre-processed version of this traffic, adapted to provide an approximate load of 50%
[Figure: goodput (Mbps) of Flow #1 and Flow #2 vs. priority of Flow #2, from 1.0 down to 0.1]
Throughput ratios very close to theoretical values → FSE operation largely unaffected
Thank you!
Questions?