TCP Nice: A Mechanism for Background Transfers

Arun Venkataramani, Ravi Kokku, Mike Dahlin
Laboratory of Advanced Systems Research (LASR)
Computer Sciences, University of Texas at Austin
What are background transfers?
 Data that humans are not waiting for
 Non-deadline-critical
 Unlimited demand
 Examples
- Prefetched traffic on the Web
- File system backup
- Large-scale data distribution services
- Background software updates
- Media file sharing
Desired Properties
 Utilization of spare network capacity
 No interference with regular transfers
- Self-interference: applications hurt their own performance
- Cross-interference: applications hurt other applications’ performance
Goal: Self-tuning protocol
 Many systems use magic numbers
- Prefetch threshold [DuChamp99, Mogul96, …]
- Rate limiting [Tivoli01, Crovella98, …]
- Off-peak hours [Dykes01, many backup systems, …]
 Limitations of manual tuning
- Difficult to balance concerns
- Proper balance varies over time
 Burdens application developers
- Complicates application design
- Increases risk of deployment
- Reduces benefits of deployment
TCP Nice
 Goal: abstraction of free infinite bandwidth
 Applications say what they want
- OS manages resources and scheduling
 Self-tuning transport layer
- Reduces risk of interference with FG traffic
- Significant utilization of spare capacity by BG
- Simplifies application design
Outline
 Motivation
 How Nice works
 Evaluation
 Case studies
 Conclusions
Why change TCP?
 TCP does network resource management
- Need flow prioritization
 Alternative: router prioritization
+ More responsive; simple one-bit priority
- Hard to deploy
 Question:
- Can end-to-end congestion control achieve non-interference and utilization?
Traditional TCP Reno
 Adjusts congestion window (cwnd)
- Competes for “fair share” of BW (=cwnd/RTT)
 Packet losses signal congestion (rules sketched below)
- Additive Increase:
- no losses → increase cwnd by one packet per RTT
- Multiplicative Decrease:
- one packet loss (triple duplicate ack) → halve cwnd
- multi-packet loss → set cwnd = 1
 Problem: signal comes after damage done
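For concreteness, a minimal runnable sketch of the Reno rules above (illustrative names; cwnd measured in packets, not a kernel implementation):

```python
# Minimal sketch of Reno's AIMD window update (illustrative names;
# cwnd measured in whole packets).

def reno_update(cwnd: float, event: str) -> float:
    if event == "ack_round":        # one RTT with no losses
        return cwnd + 1             # additive increase
    if event == "triple_dupack":    # single packet loss
        return max(cwnd / 2, 1)     # multiplicative decrease
    if event == "timeout":          # multi-packet loss
        return 1                    # collapse to one packet
    return cwnd

# Example: a loss halves a 16-packet window to 8.
assert reno_update(16, "triple_dupack") == 8
```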
TCP Nice
 Proactively detects congestion
 Uses increasing RTT as congestion signal
- Congestion → increased queue lengths → increased RTT
 Aggressive responsiveness to congestion
 Only modifies sender-side congestion control
- Receiver and network unchanged
- TCP friendly
TCP Vegas
 Early congestion detection using RTTs
 Additive decrease on early congestion
 Restricts the number of packets each flow keeps queued at the bottleneck router
 Small queues, yet good utilization
TCP Nice
 Basic algorithm
1. Early detection → threshold queue-length increase, inferred from RTT
2. Multiplicative decrease on early congestion
3. Maintain small queues like Vegas
4. Allow cwnd < 1.0
 per-ack operation:
- if (curRTT > minRTT + threshold*(maxRTT - minRTT))
      numCong++;
 per-round operation:
- if (numCong > f*W)
      W ← W/2
  else { … Vegas congestion control }
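A runnable sketch of the per-ack and per-round checks above, assuming threshold is the RTT-threshold fraction and f the window fraction from the slide; the class and method names are illustrative, and the Vegas branch is left as a stub:

```python
# Sketch of TCP Nice's congestion checks (illustrative, sender-side only).
# threshold: fraction of the min-max RTT range treated as "congested".
# f: more than f*W congested acks in one round triggers a backoff.

class NiceWindow:
    def __init__(self, threshold=0.1, f=0.5):
        self.threshold, self.f = threshold, f
        self.cwnd = 1.0          # may drop below 1.0 (see scalability slide)
        self.num_cong = 0

    def on_ack(self, cur_rtt, min_rtt, max_rtt):
        # per-ack: count acks whose RTT indicates a queue past the threshold
        if cur_rtt > min_rtt + self.threshold * (max_rtt - min_rtt):
            self.num_cong += 1

    def on_round_end(self):
        # per-round: multiplicative backoff on early congestion,
        # with no floor at 1.0 (unlike Reno)
        if self.num_cong > self.f * self.cwnd:
            self.cwnd = self.cwnd / 2
        else:
            pass  # fall through to Vegas-style additive adjustment
        self.num_cong = 0
```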
Nice: the works
[Diagram: bottleneck queue of capacity B packets draining at rate m, so minRTT = t and maxRTT = t + B/m. Reno, Vegas, and Nice mark their additive/multiplicative responses at increasing queue lengths; Nice's multiplicative backoff triggers once roughly t·B packets are queued.]
 Non-interference → getting out of the way in time
 Utilization → maintaining a small queue
Evaluation of Nice
 Analysis
 Simulation micro-benchmarks
 WAN measurements
 Case studies
Theoretical Analysis
 Prove small bound on interference
 Main result: interference decreases exponentially with bottleneck queue capacity, independent of the number of Nice flows
 Guided algorithmic design
- All 4 aspects of protocol essential
 Unrealistic model
- Fluid approximation
- Synchronous packet drops
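For reference, the bound behind this result (it appears again on the "Freedom from thresholds" slide) has the shape below, where B is the bottleneck queue capacity, m its drain rate per the deck's queue diagram, and t the Nice threshold; exact constants are in the paper, so read this as the shape of the result, not its full statement:

```latex
% Shape of the interference bound as written in this deck:
% interference I falls off exponentially in queue capacity B,
% independent of the number of Nice flows.
\[ I \;\approx\; O\!\left( e^{-\frac{B}{m}\,(1 - t)} \right) \]
```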
TCP Nice Evaluation
 Micro-benchmarks (ns simulation)
- test interference under more realistic assumptions
- Near optimal interference
- 10-100X less than standard TCP
- determine if Nice can get useful throughput
- 50-80% of spare capacity
- test under stressful network conditions
- test under restricted workloads, topologies
TCP Nice Evaluation
 Measure prototype on WAN
+ test interference in real world
+ determine if throughput is available to Nice
- limited ability to control system
 Results
- Nice causes minimal interference
- standard TCP – significant interference
- Significant spare BW throughout the day
NS Simulations
 Background traffic:
- Long-lived flows
- Nice, rate limiting, Reno, Vegas, Vegas-0
- Ideal: router prioritization
 Foreground traffic: Squid proxy trace
- Reno, Vegas, RED
- On/Off UDP
 Single bottleneck topology
 Parameters
- Spare capacity, number of Nice flows, threshold
 Metric
- Average document transfer latency
Network Conditions
[Plot: document latency (sec, log scale) vs. spare capacity for V0, Reno, Vegas, Nice, and Router Prio.]
 Nice causes low interference even when there isn’t
much spare capacity.
Scalability
[Plot: document latency (sec, log scale) vs. number of BG flows for Vegas, Reno, V0, Nice, and Router Prio.]
 W < 1 allows Nice to scale to any number of
background flows
Utilization
[Plot: BG throughput (KB) vs. number of BG flows for Vegas, Reno, Router Prio, V0, and Nice.]
 Nice utilizes 50-80% of spare capacity w/o stealing
any bandwidth from FG
Case study 1: Data Distribution
 Tivoli Data Exchange system
 Complexity
- 1000s of clients
- Different kinds of networks: phone line, cable modem, T1
 Problem
- Tuning the data sending rate to reduce interference and maximize utilization
 Current solution
- Manual setting for each set of clients
Case study 1: Data Distribution
[Plot: ping latency (ms) vs. completion time (sec), Austin to London, comparing manually tuned send-rate settings against the Nice operating point.]
Case study 2: Prefetch-Nice
 Novel non-intrusive prefetching system [USITS’03]
 Eliminates network interference using TCP Nice
 Eliminates server interference using simple monitor
 Readily deployable
- No modifications to HTTP or the network
- JavaScript based
 Self-tuning architecture, end-to-end interference
elimination
Case study 2: Prefetch-Nice
[Sketch: response time vs. aggressiveness for a hand-tuned threshold, an aggressive threshold, and the self-tuning protocol.]
 A self-tuning architecture gives optimal performance
for any setting
Prefetching Case Study
Cable modem
[Bar chart: response time (sec), Reno vs. Nice, for prefetching modes None, Cons(ervative), and Aggr(essive).]
Prefetching Case Study
Phone line
[Bar chart: response time (sec), Reno vs. Nice, for prefetching modes None, Cons(ervative), and Aggr(essive).]
Conclusions
 End-to-end strategy can implement simple
priorities
 Enough usable spare bandwidth out there
that can be Nicely harnessed
 Nice makes application design easy
TCP Nice Evaluation Key Results
 Analytical
- Interference bounded by a factor that falls exponentially with bottleneck queue size, independent of the number of Nice flows
- All 3 aspects of the algorithm fundamentally important
 Microbenchmarks (ns simulation)
- Near-optimal interference
- Interference up to 10x less than standard TCP
- TCP Nice reaps a substantial fraction of spare BW
 Measure prototype on WAN
- TCP Nice: minimal interference
- Standard TCP: significant interference
- Significant spare BW available throughout the day
Freedom from thresholds
[Plot: document latency (sec) vs. threshold t in [0, 1]; latency varies little across the range. Annotated bound: I ≈ O(e^(-(B/m)·(1-t))).]
 Dependence on threshold is weak
- Small threshold → low interference
Manual tuning
 Prefetching – attractive, restraint necessary
- prefetch bandwidth limit [DuChamp99, Mogul96]
- threshold probability of access [Venk’01]
- pace packets by guessing user’s idle time [Cro’98]
- prefetching during off-peak hours [Dykes’01]
 Data distribution / Mirroring
- tune data sending rates on a per-user basis
- Windows XP Background Intelligent Transfer Service (BITS)
- Rate-throttling approach
- Different rates → different priority levels
Nice: the works
[Diagram: bottleneck queue of capacity B packets draining at rate m (minRTT = t, maxRTT = t + B/m), marking where each protocol reacts: Reno's multiplicative backoff only near a full queue, Vegas' additive adjustments at a small queue, Nice's multiplicative backoff at t·B packets.]
 Nested congestion triggers:
- Nice < Vegas < Reno
TCP Vegas
[Plot: expected throughput (slope 1/minRTT) and actual throughput (saturating at bottleneck rate m) vs. window size; Vegas holds Diff between α and β around the operating window W*.]
E = W / minRTT          (expected throughput)
A = W / observedRTT     (actual throughput)
Diff = E - A
if (Diff < α / minRTT)
    W ← W + 1
else if (Diff > β / minRTT)
    W ← W - 1
 Additive decrease when packets queue
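A minimal runnable sketch of the Vegas adjustment above; alpha and beta stand in for the slide's α/β thresholds, and the default values are illustrative assumptions:

```python
# Sketch of the Vegas window adjustment above (illustrative;
# alpha and beta are the slide's thresholds, in packets).

def vegas_update(W: float, min_rtt: float, observed_rtt: float,
                 alpha: float = 1.0, beta: float = 3.0) -> float:
    expected = W / min_rtt          # throughput with empty queues
    actual = W / observed_rtt       # measured throughput
    diff = expected - actual        # shortfall due to queuing
    if diff < alpha / min_rtt:      # too few packets queued
        return W + 1                # additive increase
    if diff > beta / min_rtt:       # too many packets queued
        return W - 1                # additive decrease
    return W

# Example: RTT inflated well above minRTT -> window decreases.
assert vegas_update(20, 0.1, 0.2) == 19
```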
Nice features
 Small α, β not good enough. Not even zero!
 Need to detect absolute queue length
 Aggressive backoff necessary
 Protocol must scale to a large number of flows
 Solution
- Detection → ‘threshold’ queue build-up
- Avoidance → multiplicative backoff
- Scalability → window allowed to drop below 1 (see sketch below)
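A tiny sketch of the sub-unit-window idea above: with cwnd < 1, the sender emits one packet and then idles for several RTTs; the ceil-based schedule here is an illustrative assumption, not the paper's exact timer logic:

```python
import math

# Emulating cwnd < 1: send a single packet, then stay idle for
# ceil(1/cwnd) RTTs before sending the next (illustrative scheduling).

def rtts_between_sends(cwnd: float) -> int:
    if cwnd >= 1.0:
        return 1                    # normal operation: send every RTT
    return math.ceil(1.0 / cwnd)    # e.g. cwnd = 0.25 -> 1 pkt per 4 RTTs

assert rtts_between_sends(0.25) == 4
assert rtts_between_sends(2.0) == 1
```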
Internet measurements
 Long Nice and Reno flows
 Interference metric: flow completion time
[Bar chart: transfer times (sec) for foreground and background flows under different FG/BG mixes; Nice vs Reno over a cable modem.]
Internet measurements
 Long Nice and Reno flows
 Interference metric: flow completion time
 Extensive measurements over different
kinds of networks
- Phone line
- Cable modem
- Transcontinental link (Austin to London)
- University network (Abilene: UT Austin to Delaware)
Internet measurements
 Long Nice and Reno flows
 Interference metric: flow completion time
[Bar chart: transfer times (sec) for foreground and background flows under different FG/BG mixes; Nice vs Reno, Austin to London.]
Internet measurements
 Long Nice and Reno flows
 Interference metric: flow completion time
[Bar chart: transfer times (sec) for FG and BG flows under mixes FG, BG, 2FG, and FG/BG; Nice vs Reno over a phone line.]
Internet measurements
Austin to London
[Time series: response time (sec), Reno vs. Nice, May 10 through May 15.]
 Useful spare capacity throughout the day
Internet measurements
Cable modem
[Time series: response time (sec), Reno vs. Nice, May 13 through May 16.]
 Useful spare capacity throughout the day
NS Simulations - RED
[Plot: BG throughput (KB) vs. number of BG flows with RED routers, for Router Prio, Nice-0.03, Nice, Vegas, and Reno.]
NS Simulations - RED
[Plot: document latency (sec) vs. spare capacity with RED routers, for Reno, Vegas, Nice, Nice-0.03, and Router Prio.]
Analytic Evaluation
 All three aspects of protocol essential
 Main result: interference decreases exponentially with bottleneck queue capacity, irrespective of the number of Nice flows
- Fluid approximation model
- Long-lived Nice and Reno flows
- Queue capacity >> number of foreground flows