Measuring Control Plane Latency in SDN-enabled Switches

Keqiang He, Junaid Khalid, Aaron Gember-Jacobson,
Sourav Das, Chaithan Prakash,
Aditya Akella, Li Erran Li and Marina Thottan
Latency in SDN
[Figure: SDN apps run on a centralized controller (control plane), which exchanges OpenFlow msgs with switches (data plane)]

Time taken to install 100 rules? Some XXX msecs?? It can be as large as 10 secs!
Do SDN apps care about latency?
Intra-DC Traffic Engineering
• MicroTE [CoNEXT’11] routes predictable traffic on short
time scales of 1-2 sec
Fast Failover
• Reroute the affected flows quickly in the face of failures
• Longer update time increases congestion and drops

Latency can significantly undermine the effectiveness of many SDN apps
Mobility
• SoftCell [CoNEXT’13] advocates SDN in cellular networks
• Route setup must complete within ~30-40 ms
Factors contributing to latency?
Control Software Design and Distributed Controllers
Speed of Control Programs and Network Latency
Latency in Network Switches (has not received much attention)
Our Work
Two contributions, focused on latency in network switches:
• Systematic experiments to quantify control plane latencies in production switches
• Factors that affect latency and low-level explanations
Elements of Latency – inbound latency

[Figure: a packet arrives at the switch PHY; with no hardware table match, it crosses the switch fabric and goes via DMA over PCI to the CPU board, where it passes through the ASIC SDK and the OF Agent before being sent to the controller]

Inbound Latency
I1: Send to ASIC SDK
I2: Send to OF Agent
I3: Send to Controller
Elements of Latency – outbound latency

[Figure: a rule update travels from the controller to the OF Agent on the switch CPU board, through the ASIC SDK, and over PCI/DMA into the forwarding engine's hardware tables]

Outbound Latency
O1: Parse OF msg
O2: Software schedules the rule
O3: Reordering of rules in table
O4: Rule is updated in table
Measurement Scope
• Inbound Latency
• Increases with flow arrival rate
• Increases with interference from outbound msgs
• Higher CPU usage for higher arrival rates
• Outbound Latency
• Insertion
• Modification
• Deletion
Please see our paper for details
Measurement Methodology

Switch        CPU    RAM   OF Version  Flow Table Size  Data Plane
Vendor A      2 GHz  2 GB  1.0         4096             40*10G + 4*40G
Vendor B-1.0  1 GHz  1 GB  1.0         896              14*10G + 4*40G
Vendor B-1.3  1 GHz  1 GB  1.3         1792 (ACL)       14*10G + 4*40G
Vendor C      ?      ?     1.0         750              48*10G + 4*40G
Insertion Latency Measurement

Setup: the SDN controller inserts B rules in a burst (back-to-back) over the control channel (eth0); t0 is when a rule is sent. Pktgen injects flows IN on eth1 at 1 Gbps of 64B Ethernet packets; Libpcap captures flows OUT on eth2; t1 is when the first packet matching the new rule is captured.

Per rule insertion latency = t1 – t0

Pre-install 1 default "drop all" rule with low priority.
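The t1 – t0 computation above can be sketched as a simple post-processing step, assuming we have logged, per rule, the time its insertion was sent (t0) and the capture time of the first matching packet (t1). The rule IDs and log format here are hypothetical, not part of the measurement tooling:

```python
# Sketch: per-rule insertion latency from two (hypothetical) timestamp logs.
# Inputs are dicts mapping rule id -> timestamp in seconds.
def per_rule_latency(t0_send, t1_first_match):
    """Return {rule_id: latency_ms} for rules present in both logs."""
    latencies = {}
    for rule, t0 in t0_send.items():
        t1 = t1_first_match.get(rule)
        if t1 is not None and t1 >= t0:
            # Convert seconds to milliseconds; round away float noise.
            latencies[rule] = round((t1 - t0) * 1e3, 6)
    return latencies

# Toy example: rule 2's first matching packet arrives 3 ms after its flow_mod.
t0 = {1: 0.000, 2: 0.001}
t1 = {1: 0.002, 2: 0.004}
print(per_rule_latency(t0, t1))  # {1: 2.0, 2: 3.0}
```

Rules whose first matching packet was never captured are simply omitted, which matters in practice since a burst may outpace the capture window.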
Insertion Latency – Priority Effects

• Affected by priority patterns on all the switches we measured
• Per rule insertion always takes on the order of msec

[Figure: per rule insertion delay (ms) vs. rule #, for bursts of 100 rules (Vendor B-1.0 switch) and 800 rules (Vendor A switch): (a) same priority, (b) increasing/decreasing priority]
Insertion Latency – Table Occupancy Effects

• Affected by the types of rules already in the table
• Root causes: TCAM organization, rule priority & table occupancy; switch software overhead

[Figure: avg completion time (ms) vs. burst size, with 100 and 400 rules pre-installed in the table: (a) inserting low priority rules into a table with high priority rules, (b) inserting high priority rules into a table with low priority rules; Vendor B-1.0 switch]
Modification Latency Measurement

Setup: the SDN controller modifies B rules in a burst (back-to-back) over the control channel (eth0); t0 is when a modification is sent. Pktgen injects flows IN on eth1 at 1 Gbps of 64B Ethernet packets; Libpcap captures flows OUT on eth2; t1 is when the first packet reflecting the modified rule is captured.

Per rule modification latency = t1 – t0

Pre-install B dropping rules.
Modification Latency

• Same as insertion latency on vendor A
• 2X the insertion latency on vendor B-1.3
• Much higher on vendor B-1.0 and vendor C
• Not affected by rule priority, but affected by table occupancy
• Cause: poorly optimized switch software

[Figure: per rule modification delay (ms) vs. rule #: (a) 100 rules in the table, (b) 200 rules in the table; Vendor B-1.0 switch]
Deletion Latency Measurement

Setup: the SDN controller deletes B rules in a burst (back-to-back) over the control channel (eth0); t0 is when a deletion is sent. Pktgen injects flows IN on eth1 at 1 Gbps of 64B Ethernet packets; Libpcap captures flows OUT on eth2; t1 is when the first packet passed through after its dropping rule is deleted is captured.

Per rule deletion latency = t1 – t0

Pre-install B dropping rules & 1 "pass all" rule with low priority.
Deletion Latency

• Higher than insertion latency for all the switches we measured
• Not affected by rule priority, but affected by table occupancy
• Cause: deletion incurs TCAM reorganization

[Figure: per rule deletion delay (ms) vs. rule #: (a) 100 rules in table, (b) 200 rules in table; Vendor A switch]
Recommendations for SDN app designers
• Insert rules in a hardware-friendly order
• Consider parallel rule updates if possible
• Consider switch heterogeneity
• Avoid explicit rule deletion if timeout can be
applied
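The first and last recommendations can be sketched in a few lines, assuming rules are plain dicts (the field names and helpers are illustrative, not any real controller API): insert a burst in a single priority order so the TCAM avoids reshuffling already-installed entries, and let rules expire via an idle timeout rather than issuing explicit deletes.

```python
# Sketch: prepare a burst of rule insertions in a hardware-friendly order.
# Rule dicts and field names are illustrative, not a real controller API.
def order_for_insertion(rules, descending=True):
    """Sort rules by priority so each new entry lands on one side of all
    previously inserted ones, avoiding TCAM entry reshuffling."""
    return sorted(rules, key=lambda r: r["priority"], reverse=descending)

def with_idle_timeout(rule, seconds=30):
    """Let the switch expire the rule itself instead of sending an explicit
    delete, which measured slower than insertion on every tested switch."""
    return {**rule, "idle_timeout": seconds}

rules = [{"match": "10.0.0.1", "priority": 10},
         {"match": "10.0.0.2", "priority": 300},
         {"match": "10.0.0.3", "priority": 20}]
burst = [with_idle_timeout(r) for r in order_for_insertion(rules)]
print([r["priority"] for r in burst])  # [300, 20, 10]
```

Which direction counts as "friendly" is vendor-specific (the priority-effects results differ between Vendor A and Vendor B-1.0), so the descending default is only an assumption; the idle-timeout field mirrors OpenFlow's idle timeout semantics.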
Summary

• Latency in SDN is critical to many applications; long latency can greatly undermine many SDN apps' effectiveness
• Common assumption: latency is small or constant. In reality, latency is high and variable
• It varies with platforms, type of operations, rule priorities, table occupancy, and concurrent operations
• Key factors: TCAM organization, switch CPU, and inefficient software implementation

Need careful design of future switch silicon and software in order to fully utilize the power of SDN!
Inbound Latency

• Increases with flow arrival rate
• CPU usage is higher for higher flow arrival rates

Flow Arrival Rate (packets/sec)  Mean Delay per packet_in (msec)  CPU Usage
100                              3.32                             15.7 %
200                              8.33                             26.5 %

Vendor A switch
Inbound Latency

• Increases with interference from outbound msgs
• Causes: low-power CPU and software inefficiency

[Figure: inbound delay (ms) vs. flow #: (a) with flow_mod/pkt_out, (b) w/o flow_mod/pkt_out; vendor A switch, flow arrival rate = 200/s]
How accurate?
• 500 flows
• 1Gbps
• 64B Ethernet packet
• Inter-packet gap of a flow = 256 us
• Solution: either increase the packet rate, reduce the number of flows, or measure the accumulated latency over many flows