Reducing Latency in Internet Access Links with Mechanisms in Endpoints

Reducing Latency in Internet Access
Links with Mechanisms in Endpoints
and within the Network
Naeem Khademi
Networks and Distributed Systems Group
Department of Informatics
University of Oslo
PhD Defense – June 17, 2015
naeemk@ifi.uio.no
Internet’s Latency Problem
Thesis Motivation
Internet’s Latency Problem
• A common phrase we have all
heard/said: “Internet is too slow today!”
• A fast Internet, as experienced by the user,
translates into a feeling of control and
interactivity
• In the network “fast” means: higher
medium speed (bandwidth), shorter time
(delay) and higher transfer rate (transport
protocol)
[Figure: a water-pipe example (flow in m3/s over time t)]
Source: http://fc04.deviantart.net/fs71/f/2011/330/d/b/slow_internet_by_syas-d4hdtpl.jpg
Internet’s Latency Problem (#2)
• A common phrase we have all heard/said: “Internet is too slow today!”
• A common misconception: “Faster” only means higher bandwidth (a wider
pipe)!
• The Internet is still slow on bandwidth over-provisioned networks. In fact, delay
(latency) is a major source of problems in access links
Stuart Cheshire (Apple Inc.), “It's the Latency, Stupid”, 1996
http://rescomp.stanford.edu/~cheshire/rants/Latency.html
• Upstream delays of 600 ms on DSL and
1 s on cable in a study of ~2K
broadband hosts across 11 major
commercial providers in
Europe/North America
M. Dischinger et al., “Characterizing Residential
Broadband Networks”, ACM IMC 2007.
Source: http://theartpocalypse.blogspot.no/2011/04/blog-post_19.html
Internet’s Latency Problem (#3)
• Even more problematic on Wi-Fi and 3G/4G networks:
– Multi-rate operation (multiple MCSs)
– Shared wireless channel (contention)
– Frame retransmission due to bit errors (caused by noise or collision)
– …
• A personal experience: late-May using Wi-Fi in a hotel room in Antalya
(Turkey), pinging Facebook.com: ~7000ms latency, 30%~40% packet loss!
Internet’s Latency Problem (#4)
Some Background…
Sources of Latency on the Transmission Path
• Signal propagation delay
• Medium acquisition delay
• Serialization delay
• Link error recovery delay
• Switching/forwarding delay
• Queuing delay
Why do we need queues?
To accommodate short-term mismatches
between the arrival rate (λ) and the departure
rate (a.k.a. service rate) (µ)
Goal => to keep the link always fully utilized!
Possible drawback => standing queues!
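The role of the queue can be illustrated with a small discrete-time sketch. The function name, the per-step packet counts and the queue limit are assumptions chosen for illustration; they are not taken from the thesis.

```python
def simulate_queue(arrivals, service_rate, limit):
    """Return the backlog (in packets) left in the queue after each time step.

    arrivals     -- packets arriving in each step (the arrival rate, lambda)
    service_rate -- packets the link can send per step (the service rate, mu)
    limit        -- maximum backlog kept in the queue; the excess is dropped
    """
    backlog = 0
    history = []
    for arrived in arrivals:
        backlog += arrived
        backlog -= min(backlog, service_rate)  # the link sends what it can
        backlog = min(backlog, limit)          # tail-drop anything over the limit
        history.append(backlog)
    return history


# A short burst is absorbed and the queue drains again.
burst = simulate_queue([15, 5, 5], service_rate=10, limit=100)

# A burst followed by arrivals equal to the service rate leaves a
# standing queue: the link stays fully utilized, but every later
# packet waits behind the same backlog.
standing = simulate_queue([20, 10, 10, 10], service_rate=10, limit=100)
```

In the second case the backlog never drains, which is the standing-queue drawback: utilization is full, yet each packet pays one extra step of delay.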
Bufferbloat Problem
Problem Statement
Bufferbloat Problem
• Bufferbloat: the Internet’s latency problem caused by unmanaged and
excessively large buffers deployed over decades (especially in the access
links); the term was coined by J. Gettys in 2010 => common in DSL, Cable, Wi-Fi, 3G/4G
with delays on the order of several hundred milliseconds to a few seconds
• Common rule of thumb for buffer sizing: one bandwidth-delay product, BDP = C × RTT
• Why is it a problem? loss-based TCP
fills any buffer in the bottleneck
cyclically => always-full buffers with a
handful of TCP flows
• Also other transport protocols such as
LEDBAT (implemented in µTP that is
used by BitTorrent) are shown to
create standing queues
D. Ros et al., “Assessing LEDBAT’s Delay Impact”,
IEEE Communications Letters, May 2013
[Figure: loss-based TCP, induced RTT vs. bottleneck buffer size]
G. Armitage et al. “Using Delay-Gradient TCP for
Multimedia-Friendly 'Background' Transport in Home
Networks”, IEEE LCN 2013
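As a worked example of the BDP rule of thumb and of why an always-full buffer hurts, the sketch below computes the BDP and the queuing delay added by a full buffer. The function names and the example link parameters are assumptions for illustration only.

```python
def bdp_bytes(capacity_bps, rtt_s):
    """Bandwidth-delay product in bytes: C x RTT, converted from bits."""
    return capacity_bps * rtt_s / 8.0


def full_buffer_delay_s(buffer_bytes, capacity_bps):
    """Queuing delay (seconds) added when the buffer is completely full."""
    return buffer_bytes * 8.0 / capacity_bps


# Hypothetical link: 10 Mbps capacity, 100 ms RTT.
one_bdp = bdp_bytes(10e6, 0.100)

# A full one-BDP buffer adds one extra RTT of queuing delay.
extra_delay = full_buffer_delay_s(one_bdp, 10e6)

# Hypothetical bloated case: a 256 KiB buffer in front of a 1 Mbps uplink.
bloated_delay = full_buffer_delay_s(256 * 1024, 1e6)
```

With these assumed numbers the BDP is 125,000 bytes, a full one-BDP buffer doubles the RTT, and the 256 KiB buffer on the 1 Mbps uplink adds roughly 2.1 seconds when full, which is in the range of delays quoted above.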
Bufferbloat Problem (#2)
• All traffic (including latency-sensitive) coexisting with TCP will suffer from
high delay and jitter
[Figure: queue occupancy over time, with the queue limit marked]
• It also affects the Page Load Time (PLT) for web applications (longer RTTs
for DNS lookups and inter-dependent web objects such as .html .jpg .js
.css files, etc.)
What can be done about the Bufferbloat Problem?
Research Questions
• RQ1: How to size and manage the buffers on the access links
to reduce the end-to-end latency?
• RQ2: Which component(s) in the network to modify?
• RQ3: How deployable is the solution in the current Internet?
Solution Space
CAT #1: sender-based
CAT #2: network-based
CAT #3: signaling between the network and the sender
Thesis Outline
• Sender-based Solutions
• Link Layer Considerations in 802.11 Networks
• Network-based Solutions
• Solutions based on Signaling between the network and the sender
Sender-based Solutions
• Sender can try to infer the onset of network congestion either at the link
layer or the transport layer
Link Layer
• Signal propagation delay
• Medium acquisition delay
• Serialization delay
• Frame retransmission delays
Transport Layer
• Round-Trip Time (RTT)
[Figure: 802.11 example showing the Modulation and Coding Scheme (MCS) (bit-rate) and the Rate Adaptation (RA) mechanism]
Sender-based Solutions
(At the Transport Layer)
• Commonly deployed loss-based TCP CCs (NewReno and CUBIC) do not
infer the onset of congestion from delay
• Delay-based TCP CCs observe the RTT (or OWD) trend
– As old as Jain’s CARD in 1989
– TCP Vegas, FAST TCP, TCP-Africa, CTCP, etc.
– They differ in their measurement method, setting the thresholds, and cwnd
adjustment
Main problems with Delay-based CCs
• Need to predict the base-RTT on the path
• Unfairness when coexisting with loss-based TCP
Sender-based Solutions
(At the Transport Layer)
• CAIA Delay-Gradient (CDG) v0.1 uses the relative variations in RTT (delay-gradient)
– Backs off with a probability when the gradient is positive
– No knowledge of the base-RTT is needed
– Tries to compete with loss-based TCP using an “ineffectual backoff” mechanism (e.g.
b=5, b’=5)
– Reacts to packet loss only if Q = full, with βloss = 0.5, using a shadow window
(mimics NewReno’s)
– Available since FreeBSD 9.2 and also recently in Linux
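The probabilistic backoff on a positive gradient can be sketched as follows. The exponential backoff function, the scale G = 3 and the delay backoff factor 0.7 are assumptions here, meant to be in the spirit of the CDG proposal; the function names are hypothetical and the ineffectual-backoff and shadow-window logic are left out.

```python
import math
import random


def backoff_probability(gradient_ms, scale_g=3.0):
    """Probability of backing off for a given RTT gradient (assumed form).

    A non-positive gradient (RTT flat or falling) never triggers a backoff;
    a larger positive gradient makes a backoff more likely.
    """
    if gradient_ms <= 0:
        return 0.0
    return 1.0 - math.exp(-gradient_ms / scale_g)


def on_rtt_sample(cwnd, gradient_ms, beta_delay=0.7, rng=random.random):
    """Return the new cwnd (in packets) after one gradient measurement."""
    if rng() < backoff_probability(gradient_ms):
        return cwnd * beta_delay   # delay-based multiplicative decrease
    return cwnd + 1                # otherwise grow as in congestion avoidance
```

Because only the sign and size of the gradient matter, the sender never needs an estimate of the base-RTT.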
Sender-based Solutions
(At the Transport Layer)
• CDG v0.1’s performance in wired networks
D. A. Hayes et al., “Revisiting TCP Congestion Control Using Delay Gradients”, IFIP/TC6 NETWORKING
2011
• CDG v0.1’s performance in Wi-Fi home networks
Paper III: G. Armitage et al., “Using Delay-Gradient TCP for Multimedia-Friendly ‘Background’
Transport in Home Networks”, IEEE LCN 2013
– Low latency/jitter and good utilization with CDG-only traffic
– Noticeably loses throughput when competing with loss-based traffic
Why?
• Noisy gradient signal in Wi-Fi (a mixture of medium
access delay and the AP’s buffer delay)
• Different MCSs (bit-rates)
A relevant use case?
• CDG v0.1 as multimedia-friendly “Background” transport
Sender-based Solutions
(At the Link Layer)
• Major challenge in Wi-Fi: distinguishing queuing delay from link layer
mechanisms’ delay => highly varying RTTs
– Varying Modulation and Coding Schemes (MCSs) (bit-rates)
– Varying channel acquisition delay (contention delay)
– Frame retransmission delay <= noise and interference
– …
• Rate Adaptation (RA) mechanism: Minimize the delay and optimize the
frame transmission
– Many RA mechanisms proposed in literature
– Few are implemented in Wi-Fi devices
– Common Problem: Distinguishing noise/interference from frame collision
MADWIFI RA Suite => ath5k/9k
• AMRR
• SampleRate
• Minstrel
Sender-based Solutions
(At the Link Layer)
• RA mechanisms evaluation:
Paper I: N. Khademi et al., “On the Uplink Performance of TCP in Multi-rate 802.11 WLANs”,
IFIP/TC6 NETWORKING 2011
Paper II: N. Khademi et al., “Experimental Evaluation of TCP Performance in Multi-rate 802.11
WLANs”, IEEE WoWMoM 2012
– Choice of RA is not a big deal for “pure” downlink
– Uplink TCP drastically deteriorates with a simplistic RA mech. (AARF, AMRR,
SampleRate)
– Minstrel RA keeps the uplink at roughly the same level as downlink
Network-based Solutions
• Optimal CC needs feedback from the network about the onset of congestion
• Active Queue Management (AQM): marking (w/ ECN) or dropping packets
on the onset of congestion
– RED: hard to tune under different network conditions; so many knobs!
– CoDel, PIE and ARED: knob-free promise with dynamic adaptation
AQMs for Bufferbloat
• (FQ_)CoDel (2012)
• PIE (2013)
• Adaptive RED (ARED) (2001)
Tuning RED: too many knobs?
Source: http://obiaudio.com
Network-based Solutions
(Short Summary on AQMs)
ARED: dynamically adjusts Pmax using an AIMD function, aiming for a desired
target average queue length Q̄ = (th_min + th_max)/2
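A minimal sketch of the AIMD adaptation of Pmax follows. The single midpoint target is the one stated above (the original ARED proposal uses a band around the midpoint); the increment min(0.01, Pmax/4), the decrease factor 0.9 and the bounds 0.01 and 0.5 are assumed here to match the ARED proposal's defaults, and the function name is hypothetical.

```python
def adapt_max_p(max_p, avg_queue, th_min, th_max, decrease=0.9):
    """One periodic ARED-style adaptation step for the marking ceiling Pmax.

    avg_queue, th_min and th_max share one unit (packets or bytes).
    """
    target = (th_min + th_max) / 2.0
    if avg_queue > target and max_p <= 0.5:
        # Queue too long: additive increase, so RED marks more aggressively.
        max_p += min(0.01, max_p / 4.0)
    elif avg_queue < target and max_p >= 0.01:
        # Queue too short: multiplicative decrease, so RED marks less.
        max_p *= decrease
    return max_p
```

Calling this at a fixed interval (an assumption: on the order of half a second) nudges the average queue toward the target without manual tuning of Pmax.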
CoDel: drops (or marks) a packet when the delay (packet sojourn time) has exceeded
a threshold (5ms) for longer than a time interval (100ms). While the delay stays
above the threshold, the gap to the next drop is the initial “dropping interval”
(100ms) divided by the square root of the number of drops so far (e.g. 100, 100/√2,
100/√3, 100/√4, …), until the delay goes below the threshold
- Also FQ_CoDel
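The shrinking gap between CoDel drops can be sketched as below. This covers only the drop schedule while the sojourn time stays above the threshold; the function name is hypothetical and the state machine for entering and leaving the dropping state is omitted.

```python
import math


def codel_drop_times(interval_ms=100.0, drops=5):
    """Times (ms after entering the dropping state) of successive drops.

    The gap after the n-th drop is interval / sqrt(n), so the gaps are
    100, 100/sqrt(2), 100/sqrt(3), ... for the default 100 ms interval.
    """
    times = []
    t = 0.0
    for count in range(1, drops + 1):
        times.append(t)
        t += interval_ms / math.sqrt(count)
    return times


times = codel_drop_times()
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
```

The steadily shorter gaps raise the drop rate until the senders back off enough for the delay to fall below the threshold.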
PIE: uses estimated latency and its trend
(increasing/decreasing) over time
- Lightweight: no need for timestamping
- Drops/marks on enque()
- Also DOCSIS 3.1 PIE
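How PIE combines the estimated latency with its trend can be sketched as a proportional-integral style update of the drop/mark probability. The gains alpha and beta, the 20 ms target and the function name are assumptions for illustration; the auto-scaling of the gains and the burst allowance of the full algorithm are omitted.

```python
def pie_update(p, queue_bytes, depart_rate_bps, prev_delay_s,
               target_s=0.020, alpha=0.125, beta=1.25):
    """One periodic PIE-style update; returns (new_probability, delay_estimate).

    The delay is estimated from the queue length and the measured departure
    rate, so packets do not need to carry timestamps.
    """
    delay_s = queue_bytes * 8.0 / depart_rate_bps
    p += alpha * (delay_s - target_s)      # how far the delay is from the target
    p += beta * (delay_s - prev_delay_s)   # whether the delay is rising or falling
    p = min(max(p, 0.0), 1.0)              # keep it a valid probability
    return p, delay_s
```

The returned probability would then be applied to each arriving packet at enqueue time, and the returned delay estimate is passed back in as prev_delay_s on the next update.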
RED’s dropping/marking probability
Network-based Solutions
• AQM parameter sensitivity:
Paper IV: N. Khademi et al., “The New AQM Kids on the Block: An Experimental Evaluation of CoDel
and PIE”, IEEE GI 2014
– Both CoDel and PIE maintain a set of parameters (not knob-free!)
– Goodput vs. Latency tradeoff:
o ARED performs better with moderately-to-highly multiplexed traffic
o CoDel and PIE perform better with lightly-multiplexed traffic
– Poor performance with default parameters over large RTT paths!
o Underutilization with low default thresholds (5ms~20ms)
[Figures: results for CoDel and PIE]
Solutions based on Signaling between the network and the sender
• Low marking threshold => ~ tiny average buffer
• Full utilization with larger β
• Throughput model:
[Figure: simulation vs. model for one NewReno flow @ 10 Mbps, RTT 100ms; labels: capacity, buffer, BDP]
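The link between the backoff factor β and the buffer needed for full utilization can be checked with the standard single-flow sawtooth argument. This is a simplified sketch and not the throughput model of the thesis: the window peaks at BDP + B when the buffer B is full, falls to β(BDP + B) on backoff, and the link stays busy only if that value is at least one BDP. The function name is hypothetical.

```python
def min_buffer_for_full_utilization(bdp_packets, beta):
    """Smallest buffer B (in packets) with beta * (BDP + B) >= BDP."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must be strictly between 0 and 1")
    return bdp_packets * (1.0 - beta) / beta


# NewReno-style halving needs a full BDP of buffering.
halving = min_buffer_for_full_utilization(100, 0.5)

# A gentler backoff (0.8 is an assumed value) needs a quarter of that.
gentler = min_buffer_for_full_utilization(100, 0.8)
```

Under this argument a low marking threshold, which behaves like a tiny buffer, costs throughput with β = 0.5 but much less with a larger β.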
Solutions based on Signaling between the network and the sender
• Packet loss: not a good indicator of the onset of congestion (AQM vs. tail-loss)
• Network Explicit signaling (AQM) to the sender => ECN (15 years old!)
ECN Deployment Issues
• Problem: Middleboxes modifying the ECN-related header bits
• Increasing support at the clients/servers and OS
– Web servers: ~1% (2000) to ~30% (2012)
– Support in all major OS (2007)
• CE-mark: clear indication of AQM with low marking threshold
- Only one CE-marking router observed by B. Trammell et al., “Enabling Internet-wide
Deployment of Explicit Congestion Notification”, PAM 2015
- Future deployment: most likely with (FQ_)CoDel or PIE
- => Larger MD factor in response to a CE-mark (βECN)
Solutions based on Signaling between the network and the sender
• Alternative Backoff with ECN (ABE):
Paper V: N. Khademi et al., “Alternative Backoff: Achieving Low Latency and High Throughput with
ECN and AQM”, 2015
– A minor sender-side modification (changes βECN)
– Complies with RFC 3168
– Incremental deployment with no flag-day!
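The sender-side change can be sketched as a congestion response that uses a different multiplicative-decrease factor for an ECN mark than for a loss. The function name and the value βECN = 0.8 are assumptions for illustration; the recommended values are not stated here.

```python
def on_congestion_signal(cwnd, signal, beta_loss=0.5, beta_ecn=0.8):
    """Return the new cwnd (in packets) after a congestion signal.

    signal is "loss" for a detected packet loss or "ecn" for an
    ECN-Echo triggered by a CE-mark.
    """
    if signal == "loss":
        factor = beta_loss   # standard response, unchanged
    elif signal == "ecn":
        factor = beta_ecn    # gentler response to an explicit mark
    else:
        raise ValueError("signal must be 'loss' or 'ecn'")
    return max(cwnd * factor, 1.0)   # never go below one packet
```

Only the sender changes, and a sender that never sees a CE-mark behaves exactly as before, which is why no flag-day is needed.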
ABE Performance
• Significant throughput gain with lightly-multiplexed traffic
• Low latency (using CoDel or PIE)
• Reasonable convergence and fairness with recommended βs
[Figure: RTT and throughput with CoDel]
Source: http://frenchyme.blogspot.no
Solutions based on Signaling between the network and the sender
• ABE and slow-start:
- Benefits short flows terminating right after SS
- E.g. reduction in PLT of large web-pages
The effect of overshoot at the end of slow-start
@ 20Mbps (experiment)
Solutions based on Signaling between the network and the sender
• Improving ABE’s fairness:
Paper VI: N. Khademi et al., “Improving the Fairness of Alternative Backoff with ECN (ABE)”, 2015
– Today’s CoDel and PIE give all flows low latency at the cost of utilization for loss-based
flows with high RTTs; can we construct an AQM that gives ABE flows low
latency without sacrificing utilization for loss-based flows?
• Inherent fairness problem with different marking/drop thresholds
– ABE + one threshold: relatively good fairness level, no starvation with standard TCP
– ABE + two thresholds: ABE tends to starve
– => Fair ABE: A dual-threshold AQM mechanism; requires a modification to the AQM
Solutions based on Signaling between the network and the sender
Fair ABE AQM algorithm: a showcase with RED enque()
Jain’s Fairness Index
1 ABE/CUBIC flow vs. 1 standard TCP (simulation)
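For reference, Jain’s Fairness Index for n flows with throughputs x_i is (Σ x_i)² / (n · Σ x_i²); it equals 1 when all flows get the same share and 1/n when one flow takes everything. A minimal sketch, with a hypothetical function name:

```python
def jain_index(throughputs):
    """Jain's Fairness Index of a list of per-flow throughputs."""
    n = len(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("need at least one flow with non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_of_squares)


# Two flows with equal shares are perfectly fair.
equal_share = jain_index([5.0, 5.0])

# One flow starving the other gives the minimum for two flows.
starved = jain_index([10.0, 0.0])
```

For the two-flow case shown on this slide the index therefore ranges from 0.5 (one flow starved) to 1 (equal shares).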
Answers to the Research Questions
Conclusions
Conclusions
Research Questions
• RQ1: How to size and manage the buffers on the access links to
reduce the end-to-end latency?
• Answer: AQM should be used. Performance is highly dependent on
the endpoint’s reaction (Paper IV)
• RQ2: Which component(s) in the network to modify?
• Answer: Sender (Papers I, II, III), middlebox (Paper IV) and both
(Papers V, VI) (best)
• RQ3: How deployable is the solution in the current Internet?
• Answer:
- Having the right RA at the link layer (LL) is essential
- CDG is easy to deploy but performs poorly in Wi-Fi
- ISPs can deploy AQM (it’s happening)
- ABE is easy for the user to deploy
Publications
Paper: Publication
I: Naeem Khademi, Michael Welzl and Renato Lo Cigno, “On the Uplink Performance of TCP in Multi-rate 802.11 WLANs”, IFIP NETWORKING 2011, Valencia, Spain
II: Naeem Khademi, Michael Welzl and Stein Gjessing, “Experimental Evaluation of TCP Performance in Multi-rate 802.11 WLANs”, IEEE WoWMoM 2012, San Francisco, California, USA
III: Grenville Armitage and Naeem Khademi, “Using Delay-Gradient TCP for Multimedia-Friendly ‘Background’ Transport in Home Networks”, IEEE LCN 2013, Sydney, New South Wales, Australia
IV: Naeem Khademi, David Ros and Michael Welzl, “The New AQM Kids on the Block: An Experimental Evaluation of CoDel and PIE”, GI 2014 (INFOCOM Workshop), Toronto, Ontario, Canada
V*: Naeem Khademi, Michael Welzl, Grenville Armitage, Chamil Kulatunga, David Ros, Gorry Fairhurst, Stein Gjessing and Sebastian Zander, “Alternative Backoff: Achieving Low Latency and High Throughput with ECN and AQM”, TBD, 2015
VI*: Naeem Khademi, Michael Welzl and Stein Gjessing, “Improving the Fairness of Alternative Backoff with ECN (ABE)”, TBD, 2015
IETF-ID: Nicolas Kuhn, Naeem Khademi, Preethi Natarajan and David Ros, “AQM Characterization Guidelines”, Active WG item, draft-ietf-aqm-eval-guidelines, 21 May 2015 (-03)
* Pending for submission/under review
Q&A