Jiayue He
May 9 th , 2008
Approach: Theory Meets Practice
Using optimization theory:
Analyze system properties
Derive protocols and architectures
Practical solutions:
Understand limitations of today’s protocols and architectures
Propose new protocols and architectures implementable using existing technology
2
Traffic Management
Determines traffic rate along each path
Supports multiple Internet applications
Traffic
Management
Application
Transport
Network
Link
Physical
3
Traffic Management Today
Operator (hours):
Traffic Engineering
Evolved organically without conscious design
User (RTTs):
Congestion Control
Routers (seconds):
Routing Protocols
4
Goal: Redesign Traffic Management
Resource Allocation between
Multiple Traffic Classes (Part III: 18 min)
Throughput-Sensitive Traffic
Analysis (Part I: 10 min)
Design (Part II: 22 min)
Other Traffic Classes
Scope of This Talk
Single I nternet S ervice P rovider backbone
Control and visibility of network
Traffic management of aggregate flows
No inter-network economics
Multipath with flexible splitting
6
Can Congestion Control and Traffic
Engineering Be at Odds?
Motivation
Congestion Control: maximize user utility
Given routing R li
, how to adapt end rate x i
?
Traffic Engineering: minimize network congestion
Given traffic x routing R li
?
i
, how to perform
8
Goal: Understand Interaction
Congestion Control x i
R li
Traffic Engineering
Understand system properties:
Convergence to a stable value?
What is a reasonable overall objective?
9
Congestion Control Implicitly
Maximizes Aggregate User Utility
Source-destination pair indexed by i
User
Utility
U i
(x i
) aggregate utility max.∑ s.t. ∑ i
U i
(x i
)
R li x i
≤ c l
Source Rate x i source rate routing matrix
Utility function represents user satisfaction and elasticity
10
Kelly98, Low03, Srikant04…
Traffic Engineering Explicitly
Minimizes Network Congestion
Links are indexed by l aggregate congestion cost
Cost f(u l
) u l
=1 min. ∑ s.t.
u var. R l f(u l
)
=∑ i
R li x i
/c l
Link Utilization u l link utilization
Cost function represents penalty for approaching capacity and approximates average queuing delay
11
FortzThorup04
Model of Interaction
Assume the TCP session is between two customers of same ISP
Congestion Control (RTTs) : max ∑ s.t. ∑ i i
R
U li x i i
(x i
),
≤ c l x i
R li
Traffic Engineering (hours) : min ∑ s.t. u l
=∑ l i f(u l
R li
), x i
/c l
f is controlled by the operators and can be modified
12
Numerical Experiments
MATLAB experiments:
Different topologies and capacity distributions
Benchmark: max. ∑ i
U i
(x s.t. Rx ≤ c
Var. x, R
Observations: i
)
System converges
Utility gap exists between the joint system and benchmark
13
Backward Compatible Design
Simulation of the joint system suggests that it is stable , but suboptimal
Gap reduced if we change f to red curve
Cost f
0 f(u l
) u l
=1
Link utilization u l
14
Theoretical Results
Master Problem: min. g(x,R) = - ∑ i
U i
(x i
) + γ∑ l f(u l
)
Gauss-Siedel
Congestion Control: argmin x g(x,R)
Traffic Engineering: argmin
R g(x,R)
Theorem: the joint system model converges if
Replace the capacity constraint in congest control with a penalty function
U i
’’(x i
) ≤ -U i
’(x i
) /x i holds for all TCP variants
15
Pros and Cons of Changing f
Pros:
Backwards compatible
Can maximize aggregate user utility
Cons:
Creates bottleneck links
Fragile to high volume traffic bursts
Motivation for redesign in Part II
16
Contributions and Related Work
Related Work:
Separate analysis of CC and TE
Use congestion price as link weights
(WangLiLowDoyle05, HeChiangRexford06)
Contributions:
Modeled the interaction between CC/TE
Studied the interaction
Proposed backward compatible design
17
TRUMP: TR affic-management
U sing M ultipath P rotocol
Joint work with Ma’ayan Bresler and Martin Suchara
Motivation for Redesign
Shortcomings of today’s traffic management:
Congestion control assumes routing is fixed; traffic engineering assumes traffic is inelastic
Traffic engineering occurs at the timescale of hours, slower than traffic shifts
Not taking full advantage of path diversity
Goal: redesign traffic management from scratch using optimization tools
19
Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
20
A Balanced Objective max. ∑ i
U i
(x i
) - w∑ l f(u l
)
Penalty weight
Congestion Control:
Maximize throughput
Generate bottlenecks
Traffic Engineering:
Minimize congestion
Avoid bottlenecks
21
Topologies with
Different Pattern of Bottleneck Links
Access-Core
Abilene Internet2
Multihome
22
Effect of Penalty Weight w
Depends on # of flows on each bottleneck link
User utility w
Operator penalty
Can achieve high aggregate utility for a range of w 23
Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
24
Multipath Formulation
Path rate z captures source rate and routing z
1
1 max. ∑ i
U i
(∑ j z j i ) – w∑ l s.t.
link load ≤ c var. path rates z l f(u l
) i source-destination pair , j path number z
2
1 z
3
1
25
Overview of Distributed Solutions
Operator: Tune w, U, f
Parameters tuned very rarely s s s
Edge node:
Update path rates z
Rate limit incoming traffic
Routers:
Set up multiple paths
Measure link load
Update link prices s
Distributed algorithm runs on the timescale of RTTs 26
Evaluating Four Decompositions
Four decompositions differ in number of tunable parameters
Theoretical results and limitations:
All proven to converge to global optimum for well-chosen parameters
Little guidance for choosing parameters
Only loose bounds for rate of convergence
Sweep large parameter space in MATLAB
Compare rate of convergence
Compare sensitivity of tunable parameters
27
Convergence Properties o average value x actual values
Parameter sensitivity
Best rate
Tunable parameter
Tunable parameters impact convergence time 28
Convergence Properties (MATLAB)
For all algorithms:
Parameter sensitivity correlated to rate of convergence
Trade-off between convergence and utility
Comparing between algorithms:
Extra parameters do not improve convergence
Allowing packet loss improves convergence
Direct update converges faster than iterative update (with constant tunable parameter)
29
Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
30
Construct TRUMP with different parts of previous algorithms
TRUMP Algorithm
Link l: p l
(t+1) = [p l q l
(t) – (β p
(t+1) = wf’(u l
)
)(c l
– link load)] +
Price for path j = ∑
l on path j
(p l
+q l
)
Source i:
Path rate z j i (t+1) = max. U i
(∑ k z k i ) – (z j i )(path price)
31
TRUMP Properties
Theorem: TRUMP converges if:
w is sufficiently large such that p=0
n l
< αf '(u l
) (1/ α + 1) /f ''(u l
) , n l number of flows
Proof technique: contraction mapping
TRUMP trumps previous distributed algorithms (MATLAB):
Observe convergence to optimum
Faster convergence
Converges in many scenarios if β p
= 0.05/c l
2
32
Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
So far, assume fluid model and constant feedback delay
33
TRUMP: Packet-Based Version
Link l: link load = (bytes in period T) / T
Update link prices every T
Arrival and departure of flows are implicitly conveyed through price changes
Source i: Update path rates at max j
{RTT j i }
34
Packet-level Experiments (NS-2)
Set-up:
Topologies and delays of large ISPs
(Rocketfuel data)
Selected flows and paths
Link failures and recoveries
ON-OFF traffic model
Questions:
Does TRUMP react quickly to dynamics ?
How many paths does TRUMP need?
35
TRUMP Link Dynamics (NS-2)
Link failure or recovery
TRUMP reacts quickly to link dynamics
Same observation for ON-OFF flows
Time (s)
36
TRUMP: A Few Paths Suffice
Time (s)
Sources benefit the most with a few alternative paths
Summary of TRUMP Properties
Property
Tuning
Parameters
Robustness to link dynamics
Robustness to flow dynamics
General
TRUMP
Universal parameter setting
Only need to be tuned for small w
Reacts quickly to link failures and recoveries
Independent of variance of file sizes, more efficient for larger files
Trumps other algorithms
Two or three paths suffice
38
Related Work
Multiple decompositions (PalomarChiang06)
Design traffic-management protocols:
Congestion control (FAST TCP)
Dynamic traffic engineering (REPLEX, TeXCP)
Traffic management (KeyMassoulieTowsley07,
LinShroff06, Shakkottai et al 06, Voice07)
39
Contributions
Design process
Formulated new objective for traffic management
Compared four distributed algorithms
(from decomposition)
Constructed TRUMP based on insights
TRUMP
Universal parameter setting
Packet-level protocol and simulations
40
DaVinci: D ynamically A daptive Vi rtual
N etworks for a C ustomized I nternet
Joint work with Rui Zhang-Shen, Ying Li,
Mike Lee, Martin Suchara, and Umar Javed
Internet Has Many Applications
Different application requirements
Throughput-sensitive: file transfer, web
Delay-sensitive: VoIP, IPTV, online gaming
42
Support Multiple Traffic Classes
Key research areas:
QoS: provides separate resources to support multiple traffic classes in parallel
Overlays: provide customized protocols for each traffic class
Network virtualization is emerging
Current applications: router consolidation, experimental test beds, VPNs
Router virtualization: separate resources
Programmable routers: customized protocols
43
Virtual Networks
Each virtual node/link has isolated resources
44
Motivation for Virtualization
Two traffic classes:
Delay-sensitive traffic (DST): fixed demand
Throughput-sensitive traffic (TST): elastic
5ms, 100 Mbps Single queue
1
2
TST can fill up both links
DST may not be satisfied
10ms, 1000 Mbps
Shared routing
DST chooses shorter path
Capacity wasted
45
Adaptive Network Virtualization
How to partition resources?
Static partitioning
Simple, but can be inefficient
One virtual network could be congested while another is idle
Dynamically allocate bandwidth shares!
46
D ynamically A daptive Vi rtual
N etworks for a C ustomized I nternet
DaVinci is an architecture to realize adaptive network virtualization
Virtual networks indexed by (k)
One per traffic class
Run customized traffic-management protocols
Substrate network
Provides separate queues
Computes per link bandwidth shares
Enforce bandwidth shares with traffic shapers
47
DaVinci: Substrate Link
Congestion price computation s l
(1)
Bandwidth shares computation link load y l
(1) y l
(2) y l
(N)
Use optimization to determine the computations
48
ISP: Maximize Aggregate Performance weighted aggregate performance objective max. ∑ k s.t. ∑ k w
H
(k)
(k) var. z (k) , y (k) z
U (k)
(k)
(z
≤ c
(k) , y (k) ) path rates bandwidth shares
users + efficiently using resources = $$$
49
Primal Decomposition
ISP problem decomposes into multiple subproblems (per traffic class): max. U (k) (z (k) , y (k) ) s.t. H (k) z (k) ≤ y (k) var. z (k)
Master problem update y (k) using
Indication of congestion s (k)
Indication of performance d/dy (k) U (k) (z (k) , y (k) )
50
Bandwidth Allocation for Link l
Adjust bandwidth in two steps:
λ (k) l
= s (k) l
+d/dy (k) U (k) (z (k) , y (k) ) v (k) l
(t+1) = [y (k) l
(t) + (β y
)(w (k) λ (k) l
)] +
Projection onto feasible region: v
∑ k y (k) l
≤ c l
51
Theorem
Theorem: the bandwidth share computation together with per traffic class problem maximizes aggregate performance if
The objective function and constraints are convex
The stepsize β y is diminishing
The bandwidth shares are updated when the congestion prices have converged
Proof technique: primal decomposition
System Properties from Theorem
Resources are efficiently utilized to maximize aggregate performance
Bandwidth shares converge to a stable value and the computation is
Based only on local link information
Each virtual network runs its own protocols independently
Bandwidth shares updated more slowly than congestion prices
53
DST on High Capacity, High Delay Link
5ms, 100 Mbps
DST: 50Mbps
1 2
10ms, 1000 Mbps
DST: 500Mbps
Number of iterations
DST does not use all the allocated bandwidth 54
Related Work and Contributions
Related Work:
QoS, overlays, and network virtualization
Primal decomposition
Contributions:
Introduced adaptive network virtualization
Introduced DaVinci
Proved stability and optimality of DaVinci
55
Conclusions
Traffic management today is
An organic evolution
Complex for operators
Redesign of traffic management to support multiple traffic classes:
TRUMP: design of an individual traffic class
DaVinci: design of resource allocation between traffic classes
56
Future Research Directions
Extending DaVinci:
Tailoring to application-specific requirements, e.g. R-factors for voice traffic
Running sub-optimal but simpler protocols
Interdomain traffic management requires
Economic incentives
Protection against malicious users
57
Publications Related to Thesis
Part One: Globecom, JSAC
Part Two: CoNext, submitted to ToN
Part Three: under preparation
Related publications:
Multipath survey: IEEE Network Magazine
Design Optimizable Protocols: CCR Editorial, invited book chapter
58
Abilene Topology: f = e (yl/cl)
Gap exists
Standard deviation of capacity
60
Abilene Continued: f = n(y l
/c l
) n n
Gap shrinks with larger n
61
Optimization Decomposition
Deriving prices and path rates
Prices: penalties for violating a constraint
Path rates: updates driven by penalties
Example: TCP congestion control
Link prices: packet loss or delay
Source rates: AIMD based on prices
Our problem is more complicated
More complex objective, multiple paths
62
Effective Capacity (Links)
Rewrite capacity constraint: link load ≤ c l link load = y l effective capacity y l
≤ c l
Subgradient feedback price update: s l
(t+1) = [s l
(t) – stepsize*( y l
– link load)] +
Stepsize controls the granularity of reaction
Stepsize is a tunable parameter
Effective capacity keeps system robust
63
Key Architectural Principles
Effective capacity
Advance warning of impending congestion
Simulates the link running at lower capacity and give feedback on that
Dynamically updated
Consistency price
Allowing some packet loss
Allowing some overshooting in exchange for faster convergence
64
Four Decompositions - Differences
Differ in how link & source variables are updated
Algorithms Features Parameters
Partial-dual
Primal-dual
Full-dual
Primal-driven
Effective capacity
Effective capacity
Effective capacity,
Allow packet loss
Direct s update
1
3
2
1
Iterative updates contain stepsizes :
They affect the dynamics of the distributed algorithms
65
TRUMP versus File Size
TRUMP’s is better for large files
Average File Size (Mbps)
TRUMP’s performance is independent of variance
Delay-sensitive Traffic Minimizes Delay
Links are indexed by l
Propagation delay
Cost f(u l
)
Link Utilization u l u l
=1 min. ∑ l s.t. u l
∑ var. z i
H lj i
=∑ z j i z j i
R
(p i li
≥ x D x i l
+f(u l
)) i
/c l
Traffic demand
Cost function represents penalty for long queues
67
Voice Traffic: R-factor
End-to-end delay
R = R a
-α
1
δ – α
2
( δ – α
3
)H – β
1
Packet loss
– β
2 log(1+β
3
φ ) constants
R-factor: 50-60, 60-70, 70-80, 80-90, 90-100
Voice quality: poor, low, medium, high, best
68