zUpdate: Updating Data Center Networks with Zero Loss

Hongqiang Harry Liu (Yale University)
Xin Wu (Duke University)
Ming Zhang, Lihua Yuan, Roger Wattenhofer, Dave Maltz (Microsoft)
DCN is constantly in flux

[Figure: switches in the DCN are upgraded and rebooted, and a new switch is added, while traffic flows keep running across them.]
DCN is constantly in flux

[Figure: virtual machines migrate across servers, changing the traffic flows the switches carry.]
Network updates are painful for operators

Bob, an operator, faces a switch upgrade.

Complex planning: two weeks before the update, Bob has to
• Coordinate with application owners
• Prepare a detailed update plan
• Review and revise the plan with colleagues

Unexpected faults: on the night of the update, Bob executes the plan by hand, but
• Application performance alerts are triggered unexpectedly
• Switch failures force him to backpedal several times

Laborious process: eight hours later, Bob is still stuck with the update
• No sleep overnight
• Numerous application complaints
• No quick fix in sight
Congestion-free DCN update is the key

• Applications want network updates to be seamless
  • Reachability
  • Low network latency (propagation, queuing)
  • No packet drops (congestion)
• Congestion-free updates are hard
  • Many switches are involved
  • Multi-step plan
  • Different scenarios have distinct requirements
  • Interactions between network and traffic demand changes
A Clos network with ECMP

[Figure: Clos topology with CORE, AGG, and ToR layers. All switches use Equal-Cost Multi-Path (ECMP) and every link has capacity 1000. Each ToR's 600 units of traffic are split 300/300 across the AGG layer and 150/150 again toward the cores; the most loaded link carries 620 + 150 + 150 = 920, within capacity.]
Switch upgrade: a naïve solution triggers congestion

[Figure: same topology, link capacity 1000. Draining AGG1 pushes ToR1's entire 600 units through AGG2 under ECMP, so the most loaded link now carries 620 + 300 + 150 = 1070, exceeding capacity.]
Switch upgrade: a smarter solution seems to be working

[Figure: same scenario, but ToR5 uses weighted ECMP to split its traffic 500/100 instead of 300/300. The most loaded link drops from 1070 to 620 + 300 + 50 = 970, back within the 1000 capacity.]
Traffic distribution transition

[Figure: the initial traffic distribution and the final traffic distribution (with AGG1 drained) are each congestion-free; the question is how to transition from one to the other.]

Simple? No: asynchronous switch updates.
Asynchronous changes can cause transient congestion

[Figure: link capacity 1000, AGG1 being drained. When ToR1 has already switched to the new weights but ToR5 has not yet, a CORE link transiently carries 620 + 300 + 150 = 1070, exceeding its capacity.]
Solution: introducing an intermediate step

[Figure: rather than moving directly from the initial traffic distribution to the final one, zUpdate first moves to an intermediate traffic distribution. Both transitions, initial to intermediate and intermediate to final, are congestion-free regardless of the asynchrony of switch updates.]
How zUpdate performs congestion-free update

[Figure: the operator gives zUpdate the update scenario and its requirements. Starting from the current traffic distribution, zUpdate computes a sequence of intermediate traffic distributions ending in the target traffic distribution, and applies each step to the data center network.]
Key technical issues
• Describing traffic distribution
• Representing update requirements
• Defining conditions for congestion-free transition
• Computing an update plan
• Implementing an update plan
Describing traffic distribution

$l^f_{v,u}$: flow $f$'s load on the link $e_{v,u}$ from switch $v$ to switch $u$

[Figure: a 600-unit flow $f$ enters at ToR $s_1$ and is split by ECMP, so $l^f_{s_1,s_2} = 300$ on the ToR-to-AGG link and $l^f_{s_2,s_4} = 150$ on the AGG-to-CORE link.]

Traffic distribution: $D = \{\, l^f_{v,u} \mid \forall f, \forall e_{v,u} \,\}$
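To make the notation concrete, here is a minimal Python sketch of a traffic distribution as a mapping from (flow, link) pairs to loads; the dictionary layout and the helper below are illustrative assumptions, not the paper's implementation, and the example values mirror the figure.

```python
# Minimal sketch: a traffic distribution D as a map from (flow, link) to load.
# Values mirror the figure: flow f enters at ToR s1 with 600 units and is
# split 300/300 by ECMP at s1, then 150/150 again at each AGG switch.
D = {
    ("f", ("s1", "s2")): 300.0,
    ("f", ("s1", "s3")): 300.0,
    ("f", ("s2", "s4")): 150.0,
    ("f", ("s2", "s5")): 150.0,
    ("f", ("s3", "s4")): 150.0,
    ("f", ("s3", "s5")): 150.0,
}

def link_load(D, link):
    """Total load on one link: the sum of every flow's load on that link."""
    return sum(load for (_flow, lnk), load in D.items() if lnk == link)

print(link_load(D, ("s1", "s2")))  # 300.0
```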
Representing update requirements

[Figure: flow $f$ enters at ToR $s_1$, which connects to AGG switches $s_2$ and $s_3$; the AGGs connect to CORE switches $s_4$ and $s_5$.]

To upgrade (drain) switch $s_2$: $\forall f, \forall v:\; l^f_{v,s_2} = 0$ (in the figure, $l^f_{s_1,s_2} = 0$)

To restore ECMP when $s_2$ recovers: $\forall f, \forall v:\; l^f_{v,s_2} = l^f_{v,s_3}$ (in the figure, $l^f_{s_1,s_2} = l^f_{s_1,s_3}$)
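As a rough illustration of how such requirements could be encoded, the following sketch turns each requirement into equality constraints over the target load variables; the function names, the pair-based encoding, and the `load` dictionary are hypothetical, not the paper's code.

```python
# Sketch only: each requirement becomes a list of (lhs, rhs) pairs to be read
# as equality constraints lhs == rhs over the target load variables.
def drain_requirement(load, flows, neighbors, drained="s2"):
    """Drain a switch: every flow places zero load on every link into it."""
    return [(load[(f, (v, drained))], 0.0) for f in flows for v in neighbors]

def restore_ecmp_requirement(load, flows, neighbors, a="s2", b="s3"):
    """Restore ECMP: each upstream switch sends equal shares toward a and b."""
    return [(load[(f, (v, a))], load[(f, (v, b))]) for f in flows for v in neighbors]
```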
Switch asynchronization exponentially inflates the possible load values

Transition from the old traffic distribution to the new traffic distribution:

[Figure: flow $f$ enters at an ingress switch, crosses switches 1 through 7 over multiple paths, and exits through egress switch 8; the quantity of interest is $l^f_{7,8}$, its load on link $e_{7,8}$.]

Asynchronous updates can result in $2^5$ possible load values on link $e_{7,8}$ during the transition. In a large network, enumerating and checking every possible load value against link capacity is intractable.
Two-phase commit reduces the possible load values to two

Transition from the old traffic distribution to the new traffic distribution:

[Figure: the same topology, with the ingress switch performing the version flip for flow $f$.]

• With two-phase commit, $f$'s load on link $e_{v,u}$ has only two possible values throughout a transition: $l^{f,\text{old}}_{v,u}$ or $l^{f,\text{new}}_{v,u}$.
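A small enumeration sketch makes the contrast concrete; the callback-based model of which switches have installed the new rules is an assumption for illustration only.

```python
from itertools import product

def loads_without_versioning(path_switches, load_given_states):
    """Every switch on the path is independently 'old' or 'new', so a link
    downstream can see up to 2**len(path_switches) distinct load values."""
    loads = set()
    for states in product(("old", "new"), repeat=len(path_switches)):
        loads.add(load_given_states(dict(zip(path_switches, states))))
    return loads

def loads_with_two_phase_commit(old_load, new_load):
    """With two-phase commit the ingress stamps a version on each packet and
    every switch honors that stamp, so only two per-flow loads are possible."""
    return {old_load, new_load}
```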
Flow asynchronization exponentially inflates the possible load values

[Figure: two flows $f_1$ (entering at switch 1) and $f_2$ (entering at switch 0) share link $e_{7,8}$, so $l_{7,8} = l^{f_1}_{7,8} + l^{f_2}_{7,8}$. Because each flow flips from old to new independently, the link can see any of the four combinations $l^{f_1,\text{old}}_{7,8} + l^{f_2,\text{old}}_{7,8}$, $l^{f_1,\text{old}}_{7,8} + l^{f_2,\text{new}}_{7,8}$, $l^{f_1,\text{new}}_{7,8} + l^{f_2,\text{old}}_{7,8}$, and $l^{f_1,\text{new}}_{7,8} + l^{f_2,\text{new}}_{7,8}$.]

Asynchronous updates to N independent flows can result in $2^N$ possible load values on link $e_{7,8}$.
Handling flow asynchronization

[Figure: the same two flows $f_1$ and $f_2$ sharing link $e_{7,8}$; all four old/new combinations of their loads can occur during the transition.]

Basic idea: bound every combination at once,
$l^{f_1}_{7,8} + l^{f_2}_{7,8} \le \max\{l^{f_1,\text{old}}_{7,8}, l^{f_1,\text{new}}_{7,8}\} + \max\{l^{f_2,\text{old}}_{7,8}, l^{f_2,\text{new}}_{7,8}\}$

[Congestion-free transition constraint] There is no congestion throughout a transition if and only if:
$\forall e_{v,u}:\; \sum_{\forall f} \max\{l^{f,\text{old}}_{v,u}, l^{f,\text{new}}_{v,u}\} \le c_{v,u}$
where $c_{v,u}$ is the capacity of link $e_{v,u}$.
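A direct translation of this condition into a checker could look like the sketch below; the (flow, link)-keyed dictionaries follow the earlier illustration and are an assumed layout, not the paper's code.

```python
def transition_is_congestion_free(D_old, D_new, capacity):
    """For every link, sum each flow's worst-case (old vs. new) load and
    compare against the link capacity, per the slide's condition."""
    links = {link for (_f, link) in set(D_old) | set(D_new)}
    for link in links:
        flows = {f for (f, lnk) in set(D_old) | set(D_new) if lnk == link}
        worst_case = sum(
            max(D_old.get((f, link), 0.0), D_new.get((f, link), 0.0))
            for f in flows
        )
        if worst_case > capacity[link]:
            return False
    return True
```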
Computing a congestion-free transition plan

The plan is computed with linear programming:
• Constant: the current traffic distribution
• Variables: each intermediate traffic distribution and the target traffic distribution
• Constraints:
  • Congestion-free transition between every pair of consecutive distributions
  • Update requirements on the target distribution
  • Deliver all traffic
  • Flow conservation
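One plausible way to set this up, sketched with the PuLP library for a single intermediate step. The variable layout, the auxiliary variables that linearize the max from the congestion-free condition, and the minimize-worst-case objective are my assumptions; the paper's actual formulation (multiple intermediate steps, update requirements, weighted-ECMP constraints) is richer.

```python
import pulp

def plan_one_intermediate_step(flows, links, switches, capacity,
                               D_old, demand, ingress, egress):
    """Compute one intermediate traffic distribution that can be reached from
    D_old without congestion (sketch; requirements/target step omitted)."""
    prob = pulp.LpProblem("zupdate_sketch", pulp.LpMinimize)
    keys = [(f, l) for f in flows for l in links]
    mid = pulp.LpVariable.dicts("mid", keys, lowBound=0)    # intermediate loads
    m = pulp.LpVariable.dicts("worst", keys, lowBound=0)    # max(old, mid) per flow/link

    # Objective: keep worst-case link loads small (any feasible plan would do).
    prob += pulp.lpSum(m[k] for k in keys)

    # Deliver all traffic and conserve flow at every non-ingress/egress switch.
    for f in flows:
        for s in switches:
            out = pulp.lpSum(mid[(f, l)] for l in links if l[0] == s)
            inc = pulp.lpSum(mid[(f, l)] for l in links if l[1] == s)
            if s == ingress[f]:
                prob += out - inc == demand[f]
            elif s != egress[f]:
                prob += out == inc

    # Congestion-free transition: sum_f max(old, mid) <= capacity on every link.
    for (f, l) in keys:
        prob += m[(f, l)] >= D_old.get((f, l), 0.0)
        prob += m[(f, l)] >= mid[(f, l)]
    for l in links:
        prob += pulp.lpSum(m[(f, l)] for f in flows) <= capacity[l]

    prob.solve()
    return {k: mid[k].value() for k in keys}
```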
Implementing an update plan

Practical issues:
• Computation time
• Switch table size limits
• Update overhead
• Failures during the transition
• Traffic demand variation

[Figure: to bound computation time and table usage, only critical flows, i.e. flows traversing bottleneck links, get weighted-ECMP entries; all other flows keep default ECMP. A sketch of this split follows.]
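A simplified sketch of the critical-flow split shown above; the utilization threshold and the selection rule are illustrative assumptions rather than the paper's exact method.

```python
def split_critical_flows(D_target, capacity, threshold=0.9):
    """Flows traversing bottleneck links become 'critical' and would receive
    weighted-ECMP entries; all remaining flows keep default ECMP."""
    link_load = {}
    for (f, link), load in D_target.items():
        link_load[link] = link_load.get(link, 0.0) + load
    bottlenecks = {l for l, load in link_load.items()
                   if load >= threshold * capacity[l]}
    critical = {f for (f, link) in D_target if link in bottlenecks}
    others = {f for (f, _link) in D_target} - critical
    return critical, others
```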
Evaluations
• Testbed experiments
• Large-scale trace-driven simulations
Testbed setup

• Switches: OpenFlow 1.0
• Links: 10 Gbps

[Figure: a CORE/AGG/ToR testbed driven by a traffic generator; the update scenario drains AGG1. The generator's demands as labeled in the figure: 6.2 Gbps toward ToR6,7 (four such demands), 6 Gbps toward ToR5, and 6 Gbps toward ToR8.]
zUpdate achieves congestion-free switch upgrade

[Figure: the initial, intermediate, and final traffic distributions used on the testbed, with per-link rates between 1 Gbps and 6 Gbps, alongside the real-time utilization of links CORE1-AGG3 and CORE3-AGG4 over roughly 25 seconds. Utilization stays at or below 1.0 throughout the transition.]
One-step update causes transient congestion

[Figure: the same initial and final traffic distributions applied in a single step, with the real-time utilization of links CORE1-AGG3 and CORE3-AGG4. During the transition, link utilization transiently rises above 1.0, i.e. the link is congested.]
Large-scale trace-driven simulations

[Figure: a production DCN topology with CORE, AGG, and ToR layers and a new switch being on-boarded; simulated flows are drawn from traces, with 1% designated as test flows.]
zUpdate beats alternative solutions

[Figure: transition loss rate and post-transition loss rate (0-15%) for each scheme.]

Scheme             Number of steps
zUpdate            2
zUpdate-OneStep    1
ECMP-OneStep       1
ECMP-Planned       300+
Conclusion

• Switch and flow asynchronization can cause severe congestion during DCN updates
• We present zUpdate for congestion-free DCN updates
  • Novel algorithms to compute update plans
  • Practical implementation on commodity switches
  • Evaluation on a real DCN topology and update scenarios
Thanks & Questions?
Updating DCN is a painful process

[Figure: Bob, the operator, faces a switch upgrade while owners of interactive applications ask: Any performance disruption? How bad will the latency be? How long will the disruption last? What servers will be affected? Bob: "Uh?..."]
Network update: a tussle between applications and operators

• Applications want network updates to be fast and seamless
  • Updates can happen on demand
  • No performance disruption during an update
• Network updates are time consuming
  • Today, an update is planned and executed by hand
  • Rolling back when things do not go as planned
• Network updates are risky
  • Human errors
  • Accidents
Challenges in congestion-free DCN update

• Many switches are involved
• Multi-step plan
• Different scenarios have distinct requirements
  • Switch upgrade / failure recovery
  • New switch on-boarding
  • Load balancer reconfiguration
  • VM migration
• Coordination between changes in routing (network) and traffic demand (application)
Related work

• SWAN [SIGCOMM'13]
  • Maximizes network utilization
  • Tunnel-based traffic engineering
• Reitblatt et al. [SIGCOMM'12]
  • Control-plane consistency during network updates
  • Per-packet and per-flow consistency cannot guarantee "no congestion"
• Raza et al. [ToN'11], Ghorbani et al. [HotSDN'12]
  • Each targets a specific scenario (IGP update, VM migration)
  • One link-weight change or one VM migration at a time