SWAN: Software-driven wide area network Ratul Mahajan

advertisement
SWAN: Software-driven wide area network
Ratul Mahajan
Partners in crime
Chi-Yao Hong
Vijay Gill
Srikanth Kandula
Rohan Gandhi
Ratul Mahajan
Xin Jin
Mohan Nanduri
Harry Liu
Ming Zhang
Roger Wattenhofer
Inter-DC WAN: A critical, expensive resource
Dublin
Seattle
New York
Seoul
Barcelona
Hong Kong
Los Angeles
Miami
But it is highly inefficient
One cause of inefficiency: Lack of coordination
Another cause of inefficiency:
Local, greedy resource allocation
B
C
D
A
B
E
H
G
Local, greedy allocation
D
A
E
H
F
C
G
F
Globally optimal allocation
[Latency inflation with MPLS-based traffic engineering, IMC 2011]
SWAN: Software-driven WAN
Goals
Key design elements
Highly efficient WAN
Flexible sharing policies
Coordinate across services
Centralize resource allocation
[Achieving high utilization with software-driven WAN, SIGCOMM 2013]
Software-defined networking: Primer
SDNs
• Streamlined switches
• Control plane: centralized, off-board
• Data plane: direct configuration
SWAN overview
SWAN controller
Traffic demand
BW
allocation
Service broker
Rate limiting
Service hosts
Topology, traffic
Network
config.
Network agent
WAN
Key design challenges
Scalably computing BW allocations
Avoiding congestion during network updates
Working with limited switch memory
Scalably computing allocation
Goal: Prefer higher-priority traffic and max-min fair within a class
Challenge: Network-wide fairness requires many MCFs
Approach: Bounded max-min fairness (fixed number of MCFs)
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
demand 𝑑2
Geometrically partition the demand space with parameters α and 𝑈
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
demand 𝑑2
Maximize throughput while limiting all allocations below α𝑈
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
demand 𝑑2
Maximize throughput while limiting all allocations below α𝑈
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
demand 𝑑2
Fix the allocation for smaller flows
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
demand 𝑑2
Continue until all flows fixed
Bounded max-min fairness
α3𝑈
demand 𝑑5
demand 𝑑4
α2𝑈
demand 𝑑3
α𝑈
demand 𝑑1
Fairness
demand 𝑑2
1
bound: Each flow is within , 𝛼 of
𝛼
max(𝑑𝑖)
Number of MCFs: log 𝛼
𝑈
its fair rate
Relative Relative
deviation deviation
SWAN computes fair allocations
MPLS TE
SWAN (α = 2)
Flows sorted by demand
In practice, only 4% of the flows deviate more than 5%
Key design challenges
Scalably computing BW allocations
Avoiding congestion during network updates
Working with limited switch memory
Congestion during network updates
Congestion-free network updates
Computing congestion-free update plan
Leave scratch capacity 𝑠 on each link
• Ensures a plan with at most
1
𝑠
− 1 steps
Find a plan with minimal number of steps using an LP
• Search for a feasible plan with 1, 2, …. max steps
Use scratch capacity for background traffic
Complementary
CDF
SWAN provides congestion-free updates
Oversubscription ratio
Extra traffic (MB)
Key design challenges
Scalably computing BW allocations
Avoiding congestion during network updates
Working with limited switch memory
Working with limited switch memory
Working with limited switch memory
Install only the “working set” of paths
Use scratch capacity to enable disruption-free updates to the set
Throughput
(relative to optimal)
SWAN comes close to optimal
SWAN
SWAN
w/o rate
control
MPLS
TE
Deploying SWAN in production
WAN
Data
center
Lab prototype
Partial deployment
WAN
Data
center
Full deployment
Key lesson from using SDNs
Loops in SDNs
Dependencies are worse than you might think
Ongoing work: Robust, fast network updates
Understand fundamental limits
Develop practical solutions
What are the minimal dependencies for a
desired consistency property?
[On consistent updates in software-defined networks, HotNets 2013]
Network update pipeline
Routing
policy
Consistency
property
Network
characteristics
Robust rule
generation
Dependency
graph generation
Update
execution
Robust rule generation: Example
Capacity
Capacity
Capacity
of Links:
of Links:
of 10
Links:
10 10
S4 S4 S4
S4 S4 S4
S4 S4 S4
10 10 10
10 10 10
5 5 5
10 10 10
10 10 10
5 5 5
S3 S
S3 S3 S3 S1 S1 S1
S3 S3 S3 S1 S1 S1
S1 S1
5 5 5
Demands:
Demands:
Demands:
Demands:
Demands:
Demands:
ands:
Demands:
Demands:5 5 5
S4->S3:10
S4->S3:10
S4->S3:10
S4->S3:10
S4->S3:10
S4->S3:10
S3:10
S4->S3:10
S4->S3:10
5 5 5
10 10 10
5 5 5
S2->S3:10
S2->S3:10
S2->S3:10
S2->S3:10
S2->S3:10
S2->S3:10
S3:10
S2->S3:10
S2->S3:10
S2 S2 S2
S2
S2
S2
S2
S2
S2
S1->S3:10
S1->S3:10
S1->S3:10
S1->S3:10
S1->S3:10
S1->S3:10
S3:0
S1->S3:0
S1->S3:0
Capacity
of
Links:
10
(c) (c)
Congestion
Congestion
when
when
S2
when
fails
S2 fa
S2
(b) Unreliable
(b) (b)
Unreliable
Unreliable
new
TEnew
(TE-1)
TE TE
(TE-1)
(TE-1)(c) Congestion
(a) Initial
(a) (a)
Initial
TE
Initial
TE TE
S4
S4new
to update
to update
to update
to TE-1
to TE-1
to TE-1
10
10
5
5
S3
S1
S1
5
Demands:
Demands:
S4->S3:10
S4->S3:10
5
10
S2->S3:10
S2->S3:10
S2
S2
S1->S3:10
S1->S3:10
Robust rule generation
Goal: No congestion if any k or fewer switches fail to update
Challenge: Too many failure combinations
𝑛
1
+ …+
𝑛
𝑘
Approach: Use a sorting network to identify worst k failures
– O(kn) constraints
Robust rule generation: Preliminary results
Normalized
throughput
1
0.8
0.6
0.4
0.2
0
Network update pipeline
Routing
policy
Consistency
property
Network
characteristics
Robust rule
generation
Dependency
graph generation
Update
execution
Dependency graph generation
Add @𝑢
∗𝑤
Del @𝑢
∗𝑣
Add @𝑣
∗𝑢
Del @𝑤
∗𝑢
Add @𝑤
∗𝑣
Del @𝑣
∗𝑤
Update execution
Updates without parents can be applied right away
Break any cycles and shorten long chains
Add @𝑤
𝑣 𝑣
Add @𝑢
∗𝑤
Del @𝑢
∗ 𝑣
Add @𝑣
∗𝑢
Del @𝑤
∗𝑢
Add @𝑤
∗𝑣
Del @𝑣
∗ 𝑤
Update execution: Preliminary results
Summary
SWAN yields a highly efficient and flexible WAN
– Coordinated service transmissions and centralized resource allocation
– Bounded fairness for scalable computation
– Scratch capacities for “safe” transitions
Change is hard for SDNs
– Need to understand fundamental limits, develop practical solutions
Download