Software Defined Networking
COMS 6998-10, Fall 2014
Instructor: Li Erran Li
(lierranli@cs.columbia.edu)
http://www.cs.columbia.edu/~lierranli/coms6998-10SDNFall2014/
10/20/2014: SDN Forwarding Abstraction
Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
– Click software router
– SwitchBlade NetFPGA programmable router
– OpenFlow++ and programming protocol-independent packet processors
10/20/14
Software Defined Networking (COMS 6998-10)
Review of Previous Lecture
• What update abstractions did we learn?
– Per-packet consistent update: each packet is
processed either by the old configuration or the
new one
– Per-flow consistent update: all packets of a flow
are processed by the same configuration (either
old or new)
– Congestion-free update: updates are
congestion-free under asynchronous switch and
traffic matrix changes
Source: Andreas Voellmy, Yale
Review of Previous Lecture (Cont’d)
• How to achieve consistent update?
– Install new rules on internal switches, leave old
configuration in place
– Install edge rules that stamp with the new
version number
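The two steps above can be sketched in a few lines of Python. This is a toy model, not the mechanism of any particular controller: the `Switch` class, the `(version, destination)` rule key, and the `edge_state` dictionary are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy internal switch whose rules are keyed by (version, destination)."""
    rules: dict = field(default_factory=dict)

    def install(self, version, dst, out_port):
        self.rules[(version, dst)] = out_port

    def forward(self, version, dst):
        return self.rules[(version, dst)]


def two_phase_update(internal, edge_state, new_version, new_rules):
    # Phase 1: add new-version rules on internal switches; old rules stay.
    for sw in internal:
        for dst, port in new_rules.items():
            sw.install(new_version, dst, port)
    # Phase 2: only now do edge switches stamp packets with the new version.
    edge_state["stamp"] = new_version


internal = [Switch(), Switch()]
for sw in internal:
    sw.install(1, "h2", "port_a")
edge_state = {"stamp": 1}

old_path = [sw.forward(edge_state["stamp"], "h2") for sw in internal]
two_phase_update(internal, edge_state, 2, {"h2": "port_b"})
new_path = [sw.forward(edge_state["stamp"], "h2") for sw in internal]
# A packet stamped before the flip still finds the old rules on every switch.
in_flight = [sw.forward(1, "h2") for sw in internal]
```

Because a packet carries a single version stamp, every switch on its path uses either the old rules or the new ones, never a mix.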
Source: Andreas Voellmy, Yale
Review of Previous Lecture (Cont’d)
[Figure: 2-Phase Update in Action. Labels: Traffic, I, F1, F2, F3]
Source: M. Reitblatt, Cornell
Review of Previous Lecture (Cont’d)
• How to perform congestion free update?
[Figure: zUpdate. The operator supplies an update scenario and update requirements; zUpdate moves the data center network from the current traffic distribution through intermediate traffic distributions to the target traffic distribution.]
Source: J. Liu, Yale
Review of Previous Lecture (Cont’d)
All switches: Equal-Cost Multi-Path (ECMP)
Link capacity: 1000
[Figure: A Clos network with ECMP. Tiers: CORE, AGG, ToR. Flow sizes shown: 300 and 600. Load on the highlighted link: 620 + 150 + 150 = 920]
Source: J. Liu, Yale
Review of Previous Lecture (Cont’d)
• Asynchronous changes can cause transient congestion
Link capacity: 1000
[Figure: Draining AGG1 in the Clos network (CORE, AGG, ToR tiers). When ToR1 is changed but ToR5 is not yet, the highlighted link carries 620 + 300 + 150 = 1070, which exceeds the link capacity.]
Source: J. Liu, Yale
Review of Previous Lecture (Cont’d)
• Solution: introducing an intermediate step
[Figure: the same Clos network (CORE, AGG, ToR tiers) shown in its Initial, Intermediate, and Final traffic distributions. Labels: Initial, Transition, ?, Intermediate, Final. Both steps through the Intermediate distribution are marked congestion-free regardless of asynchrony.]
Source: J. Liu, Yale
Review of Previous Lecture (Cont’d)
• What happens when the control plane network
partitions?
• Assumptions:
– Out-of-band control network
– Routing and forwarding based on addresses
– Policy specification using end-host names
– Controller only aware of local name-address
bindings
Source: Andreas Voellmy, Yale
Review of Previous Lecture (Cont’d)
• Consider a policy isolating A from B. A control
network partition occurs. The only possible choices:
– Let all packets through (including from A to B)
(sacrifices correctness)
– Drop all packets (including from A to D) (sacrifices availability)
Review of Previous Lecture (Cont’d)
• Solutions:
– Network can label packets with sender’s identity
• Route based on identity instead of address
– Inband control
Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
– Click software router
– SwitchBlade NetFPGA programmable router
– OpenFlow++
Modular software forwarding plane: Click modular router
[Figure: control plane (user-level routing daemons) above the forwarding plane (Click, in the Linux kernel)]
• Elements
– Small building blocks, performing simple
operations
– Instances of C++ classes
• Packets traverse a directed graph of
elements
FromDevice(eth0) -> CheckIPHeader(14) -> IPPrint -> Discard;
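The element chain above can be mimicked with a minimal Python sketch. It is not Click's C++ API: the class bodies are simplified stand-ins (the check here only tests the IP version nibble at the given offset), and the `connect`/`push` method names are assumptions for illustration.

```python
class Element:
    """Base element with one output, wired up with connect()."""
    def __init__(self):
        self.next = None

    def connect(self, downstream):
        self.next = downstream
        return downstream  # allows a.connect(b).connect(c)

    def push(self, packet):
        if self.next is not None:
            self.next.push(packet)


class CheckIPHeader(Element):
    def __init__(self, offset):
        super().__init__()
        self.offset = offset  # byte offset of the IP header in the frame
        self.dropped = 0

    def push(self, packet):
        # Simplified check: forward only if the version nibble is 4.
        if len(packet) > self.offset and packet[self.offset] >> 4 == 4:
            super().push(packet)
        else:
            self.dropped += 1


class IPPrint(Element):
    def __init__(self):
        super().__init__()
        self.lines = []

    def push(self, packet):
        self.lines.append(f"{len(packet)} bytes")
        super().push(packet)


class Discard(Element):
    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, packet):
        self.count += 1  # the packet goes no further


from_device = Element()  # stand-in for FromDevice(eth0)
check, printer, sink = CheckIPHeader(14), IPPrint(), Discard()
from_device.connect(check).connect(printer).connect(sink)

good = bytes(14) + bytes([0x45]) + bytes(19)  # Ethernet header, then IPv4
bad = bytes(14) + bytes([0x60]) + bytes(19)   # version nibble 6, not IPv4
from_device.push(good)
from_device.push(bad)
```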
Elements
[Figure: the element Tee(2), annotated with its element class, configuration string, input port, and output ports]
Push and pull
[Figure: FromDevice -> Null -> queue -> Null -> ToDevice. Push side: FromDevice receives packet p and calls push(p) through Null; the queue enqueues p and the calls return. Pull side: ToDevice is ready to transmit and calls pull() through Null; the queue dequeues p and returns it; ToDevice sends p.]
• Push connection
– Source pushes packets downstream
– Triggered by event, such as packet arrival
– Denoted by filled square or triangle
• Pull connection
– Destination pulls packets from
upstream
– Packet transmission or scheduling
– Denoted by empty square or triangle
• Agnostic connection
– Becomes push or pull depending on peer
– Denoted by double outline
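A small Python sketch of the push and pull sides meeting at an explicit queue is given here. The class and method names (`receive`, `ready_to_transmit`) and the drop-when-full policy are assumptions for illustration, not Click's actual interfaces.

```python
from collections import deque


class Queue:
    """Explicit queue: push input on one side, pull output on the other."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.drops = 0

    def push(self, packet):  # called by the upstream (push) side
        if len(self.items) < self.capacity:
            self.items.append(packet)
        else:
            self.drops += 1

    def pull(self):  # called by the downstream (pull) side
        return self.items.popleft() if self.items else None


class FromDevice:
    def __init__(self, downstream):
        self.downstream = downstream

    def receive(self, packet):  # event: a packet arrived
        self.downstream.push(packet)


class ToDevice:
    def __init__(self, upstream):
        self.upstream = upstream
        self.sent = []

    def ready_to_transmit(self):  # event: the device can send
        packet = self.upstream.pull()
        if packet is not None:
            self.sent.append(packet)


queue = Queue(capacity=2)
rx, tx = FromDevice(queue), ToDevice(queue)
for p in ("p1", "p2", "p3"):
    rx.receive(p)  # p3 is dropped because the queue is full
for _ in range(3):
    tx.ready_to_transmit()  # the third call finds nothing to pull
```

The queue is the only place where a push path may meet a pull path, which is what makes storage and drop decisions explicit in the configuration.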
Push and pull violations
[Figure: push and pull violations. Elements shown: FromDevice, Counter, ToDevice]
Implicit queue v. explicit queue
Implicit queue
•Used by STREAM, Scout, etc.
•Hard to control
Explicit queue
•Led to push and pull, Click’s main idea
•Contributes to high performance
IP router configuration
[Figure: Click element graph of the IP router configuration]
Click performance, circa 2000
Maximum loss-free forwarding rate with 64-byte packets:
• Click: 333k
• Linux w/ polling driver: 284k
• Plain Linux: 84k
Improving software router performance:
exploiting parallelism
• Can you build a Tbps router out of PCs running
Click?
– Not quite, but you can get close
• RouteBricks: high-end software router
– Parallelism across servers and cores
– High-end servers: NUMA, multi-queue NICs
– RB4 prototype
• 4 servers in full mesh acting as 4-port (10Gbps/port) router
• 4 × 8.75 = 35 Gbps
– Linearly scalable by adding servers (in theory)
Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
– Click software router
– SwitchBlade NetFPGA programmable router
– OpenFlow++ and programming protocol-independent packet processors
Motivation
• Many new protocols require data-plane changes.
– Examples: OpenFlow, Path Splicing, AIP, …
• These protocols must forward packets at acceptable
speeds
• May need to run in parallel with existing or
alternative protocols
• Goal: Platform for rapidly developing new network
protocols that
– Forwards packets at high speed
– Runs multiple data-plane protocols in parallel
Source: B. Anwer, Gatech
Existing Approaches
• Develop custom software
– Advantage: Flexible, easy to program
– Disadvantage: Slow forwarding speeds
• Develop modules in custom hardware
– Advantage: Excellent performance
– Disadvantage: Long development cycles, rigid
• Develop in programmable hardware
– Advantage: Flexible and fast
– Disadvantage: Programming is difficult
Source: B. Anwer, Gatech
SwitchBlade: Main Idea
• Identify modular hardware building blocks that
implement a variety of data-plane functions
• Allow a developer to enable and connect various
building blocks in a hardware pipeline from
software
• Allow multiple custom data planes to operate in
parallel on the same hardware
Flexible, fast, and easy to program.
Advantages of hardware and software with minimal overhead.
Source: B. Anwer, Gatech
SwitchBlade: Push Custom Forwarding Planes into Hardware
[Figure: in software, virtual environments VE1-VE4 each run a Click software router on the host (CPU, memory, hard disk); over PCI they connect to virtual data planes VDP1-VDP4 in SwitchBlade on the NetFPGA hardware]
VDP = Virtual Data Plane
Click = Click Software Router
VE = Virtual Environment
Source: B. Anwer, Gatech
SwitchBlade Features
• Parallel custom data planes
– Ability to demultiplex into existing data planes and
maintain isolation on common hardware platform
• Rapid development and deployment
– Pluggable preprocessor modules enable a range of
customizable functions at hardware rates
• Customizability and programmability
– Dynamic selection of modules, and ability to operate
in several different forwarding modes.
Source: B. Anwer, Gatech
Virtual Data Planes (VDPs)
[Figure: pipeline stages: Virtual Data Plane Selection -> Shaping -> Preprocessing -> Forwarding]
• Separate packet processing pipeline, lookup
tables, and forwarding modules per VDP
• Stored table maps MAC address to VDP identifier
• VDP Selection step
– Identifies VDP based on MAC address
– Attaches 64-bit platform header that controls
functions in later stages
– Register interface controls this header per VDP
Source: B. Anwer, Gatech
Platform Header
[Figure: platform header fields: Hash Value | Module Bitmap | Mode | VDP ID]
• Hash value computed based on custom bits in
header (allows for custom forwarding, if desired)
• Bitmap indicates which preprocessor modules
should execute on this packet
• Mode indicates the forwarding mode (LPM or
otherwise)
• VDP-ID indicates the VDP of the packet
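The platform header and the MAC-based VDP selection can be illustrated with a short Python sketch. The slides give only the 64-bit total and the four field names; the 32/16/8/8 bit split, the field order, and the `MAC_TO_VDP` table contents are assumptions made here for illustration.

```python
# Assumed split of the 64-bit platform header (not stated on the slides).
HASH_BITS, BITMAP_BITS, MODE_BITS, VDP_BITS = 32, 16, 8, 8


def pack_platform_header(hash_value, module_bitmap, mode, vdp_id):
    """Pack the four fields into one 64-bit integer, hash value first."""
    fields = ((hash_value, HASH_BITS), (module_bitmap, BITMAP_BITS),
              (mode, MODE_BITS), (vdp_id, VDP_BITS))
    header = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        header = (header << width) | value
    return header


def unpack_platform_header(header):
    """Inverse of pack_platform_header under the same assumed layout."""
    vdp_id = header & 0xFF
    mode = (header >> 8) & 0xFF
    module_bitmap = (header >> 16) & 0xFFFF
    hash_value = (header >> 32) & 0xFFFFFFFF
    return hash_value, module_bitmap, mode, vdp_id


# Hypothetical stored table mapping MAC address to VDP identifier.
MAC_TO_VDP = {"00:11:22:33:44:55": 1, "66:77:88:99:aa:bb": 2}

vdp = MAC_TO_VDP["66:77:88:99:aa:bb"]           # VDP selection step
header = pack_platform_header(0xDEADBEEF, 0b101, 1, vdp)
```

Here bitmap `0b101` would enable the first and third preprocessor modules for this packet, and later stages read the mode and VDP ID from the same word.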
Source: B. Anwer, Gatech
Virtual Data Plane Isolation
• Each Virtual Data Plane (VDP) has preprocessing,
lookup, and post processing stages
– Fixed set of forwarding tables
– Lookup, ARP, and exception tables
• One rate limiter per virtual-data plane
• Forwarding tables, rate limiters operate in
isolation
Source: B. Anwer, Gatech
SwitchBlade Features
• Parallel custom data planes
– Ability to demultiplex into existing data planes and
maintain isolation on common hardware platform
• Rapid development and deployment
– Pluggable preprocessor modules to enable a range of
customizable functions at hardware rates
• Customizability and programmability
– Dynamic selection of modules, and ability to operate
in several different forwarding modes
Source: B. Anwer, Gatech
Preprocessing
[Figure: pipeline (Virtual Data Plane Selection -> Shaping -> Preprocessing -> Forwarding). The Preprocessing stage contains a Preprocessing Selector, a Custom Preprocessor, and a Hasher, driven by a per-VDP module selection bit-field register.]
• Select processing functions from library of reusable modules
– Selection function through bitmap
Enables fast customization without resynthesis
– Example implementations: Path Splicing, IPv6, OpenFlow
• Hash custom bits in packet header and insert value in hash field in
platform header
– Enables custom forwarding
Source: B. Anwer, Gatech
Hashing
[Figure: user-selected 8-, 16-, and 32-bit fields of the Ethernet/IP packet are collected as data and reduced to a 32-bit hash]
• Hash custom bits in packet header
– Insert hash value in field in platform header
• Module accepts up to 256 bits from the preprocessor
according to user selection
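As a rough illustration of hashing user-selected header bits, the sketch below concatenates chosen bit ranges and reduces them to 32 bits. CRC32 from Python's `zlib` is only a stand-in for whatever hash the hardware uses, and the `(bit_offset, width)` selection format is an assumption.

```python
import zlib


def extract_bits(packet, bit_offset, width):
    """Return `width` bits of `packet` starting at `bit_offset`, as an int."""
    as_int = int.from_bytes(packet, "big")
    shift = len(packet) * 8 - bit_offset - width
    return (as_int >> shift) & ((1 << width) - 1)


def custom_hash(packet, selections):
    """Concatenate the selected bit ranges (at most 256 bits), hash to 32 bits."""
    total = sum(width for _, width in selections)
    if total > 256:
        raise ValueError("hasher accepts at most 256 bits")
    key = 0
    for offset, width in selections:
        key = (key << width) | extract_bits(packet, offset, width)
    key_bytes = key.to_bytes((total + 7) // 8, "big")
    return zlib.crc32(key_bytes)  # stand-in for the hardware hash function


packet = bytes(range(64))
# Example selection: destination MAC (48 bits), ethertype (16 bits),
# and 32 bits at byte offset 30.
hash_value = custom_hash(packet, [(0, 48), (96, 16), (240, 32)])
```

The resulting value would be written into the hash field of the platform header, where the output port lookup can use it for custom forwarding.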
Example: OpenFlow
• Limited implementation (no VLANs or
wildcards)
• Preprocessing Steps
– Parses packet and extracts relevant tuples
– 240-bit OpenFlow “bitstream” passed to hasher
module in the preprocessor
– Hasher outputs 32-bit hash value on which
custom forwarding could take place
– Mode field set to perform exact match
• Most post-processing functions disabled (e.g.,
TTL decrement)
Source: B. Anwer, Gatech
Adding New Modules
• Adding a new module at any stage requires
Verilog programming
• User writes preprocessing (and postprocessing)
modules to extract the bits used for lookup
• Resynthesize hardware
• Enable module from register interface in software
Source: B. Anwer, Gatech
SwitchBlade Features
• Parallel custom data planes
– Ability to demultiplex into existing data planes and
maintain isolation on common hardware platform.
• Rapid development and deployment
– Pluggable preprocessor modules to enable a range of
customizable functions at hardware rates.
• Customizability and programmability
– Dynamic selection of modules, and ability to operate
in several different forwarding modes.
Source: B. Anwer, Gatech
Forwarding
[Figure: pipeline (Virtual Data Plane Selection -> Shaping -> Preprocessing -> Forwarding). The Forwarding stage contains Output Port Lookup (with per-VDP lookup, software exception, and ARP tables, plus per-VDP counters and stats), Postprocessor Wrappers, and a Custom Postprocessor.]
• Output port lookup performs custom forwarding
depending on the mode bits in the platform header
• Wrapper modules allow matching on custom bit offsets
• Custom post processors allow other functions to be
enabled/disabled on the fly (e.g., checksum)
Software Exceptions
• Ability to redirect some packets to CPU
• Packets are passed with VDP (and platform
header), to allow for VDP-based software
exceptions
• One possible application: Virtual routers in
software
Source: B. Anwer, Gatech
Custom Postprocessing Paths
[Figure: postprocessing paths for IPv6, OpenFlow, and Path Splicing, from the forwarding logic through selectable modules (TTL, checksum, destination MAC, source MAC, user-defined) to the output queues]
Source: B. Anwer, Gatech
Implementation
• NetFPGA-based implementation
– Based on NetFPGA reference router
implementation
– Xilinx Virtex 2 Pro 50
• SRAM for packet forwarding
• BRAM for storing forwarding information
• PCI for communication with CPU
Source: B. Anwer, Gatech
Evaluation
• Resource utilization: How many hardware
resources does running SwitchBlade require?
– Answer: Minimal additional overhead, compared
to running any custom protocol directly
• Packet forwarding overhead: How fast can
SwitchBlade forward packets?
– Answer: No additional overhead with respect to
base NetFPGA implementation
Source: B. Anwer, Gatech
Evaluation Setup
[Figure: NetFPGA packet generator (source) -> SwitchBlade on NetFPGA with VDP1-VDP4, attached over PCI to a host (CPU, memory, hard disk) -> NetFPGA packet receiver (sink)]
• Three-node topology
– NetFPGA traffic generator and sink
• Multiple parallel data planes running on SwitchBlade
Source: B. Anwer, Gatech
Little Additional Resource Overhead
Implementation   Avail. Data-planes   Gate Count
IPv4             One                  8M
Splicing         One                  12M
OpenFlow         One                  12M
SwitchBlade      Four                 13M
• Four virtualized data planes in parallel at one time
• Larger FPGAs will ultimately support more data planes
Source: B. Anwer, Gatech
SwitchBlade Incurs No Additional Forwarding Overhead
[Figure: plot of forwarding rate (kpps)]
Source: B. Anwer, Gatech
Conclusion
• SwitchBlade: A programmable hardware
platform with customizable parallel data planes
– Rapid deployment using library of hardware modules
– Provides isolation using rate limiters and fixed
forwarding tables
• Rapid prototyping in programmable hardware
and software
• Multiple data planes in parallel
– Resource sharing minimizes hardware cost
http://gtnoise.net/switchblade
Source: B. Anwer, Gatech
Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
– Click software router
– SwitchBlade NetFPGA programmable router
– OpenFlow++ and programming protocol-independent packet processors
OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The Reconfigurable Match Table
(RMT) switch model
• Flexibility costs less than 15%
Fixed function switch
[Figure: fixed-function pipeline with In, Parser, L2 Stage (Stage 1), L3 Stage (Stage 2), ACL Stage (Stage 3), Deparser, Queues, and Out, with packet data carried alongside.
L2 Table: 128k x 48, exact match; action: set L2D.
L3 Table: 16k x 32, longest prefix match; action: set L2D, dec TTL.
ACL Table: 4k, ternary match; action: permit/deny.
Also shown: a PBB Stage with unknown table parameters (?????????), marked X.]
Source: P. Bosshart, TI
What if you need flexibility?
• Flexibility to:
– Trade one memory size for another
– Add a new table
– Add a new header field
– Add a different action
• SDN accentuates the need for flexibility
– Gives programmatic control to control plane,
expects to be able to use flexibility
Source: P. Bosshart, TI
What does SDN want?
• Multiple stages of match-action
– Flexible allocation
• Flexible actions
• Flexible header fields
• No coincidence OpenFlow built this way…
Source: P. Bosshart, TI
What about Alternatives?
Aren’t there other ways to get flexibility?
• Software? 100x too slow, expensive
• NPUs? 10x too slow, expensive
• FPGAs? 10x too slow, expensive
Source: P. Bosshart, TI
What We Set Out To Learn
• How do I design a flexible switch chip?
• What does the flexibility cost?
Source: P. Bosshart, TI
What’s Hard about a Flexible Switch Chip?
• Big chip
• High frequency
• Wiring intensive
• Many crossbars
• Lots of TCAM
• Interaction between physical design and
architecture
• Good news? No need to read 7000 IETF RFCs!
Source: P. Bosshart, TI
OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The RMT switch model
• Flexibility costs less than 15%
Source: P. Bosshart, TI
The RMT Abstract Model
• Parse graph
• Table graph
Source: P. Bosshart, TI
Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, IPV6, TCP, UDP. Example packet: Ethernet | IPV4 | TCP]
Source: P. Bosshart, TI
Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, TCP, UDP. Example packet: Ethernet | IPV4 | TCP]
Source: P. Bosshart, TI
Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, RCP, TCP, UDP. Example packet: Ethernet | IPV4 | RCP | TCP]
Source: P. Bosshart, TI
Reconfigurable Match Tables:
The Table Graph
[Figure: table graph with tables VLAN, ETHERTYPE, MAC FORWARD, IPV4-DA, IPV6-DA, RCP, ACL]
Source: P. Bosshart, TI
Changes to Parse Graph and Table Graph
[Figure: Parse Graph (Ethernet, VLAN, IPV4, IPV6, RCP, TCP, UDP) next to Table Graph (ETHERTYPE, VLAN, L2S, L2D, IPV4-DA, IPV6-DA, RCP, ACL, MY-TABLE, Done)]
Source: P. Bosshart, TI
But the Parse Graph and Table Graph
don’t show you how to build a switch
Source: P. Bosshart, TI
Match/Action Forwarding Model
[Figure: In -> Programmable Parser -> Stage 1, Stage 2, …, Stage N (each a Match Table followed by an Action stage), with Queues and a Deparser at the output side before Out; packet data is carried alongside]
Source: P. Bosshart, TI
Performance vs Flexibility
• Multiprocessor: memory bottleneck
• Change to pipeline
• Fixed function chips specialize processors
• Flexible switch needs general purpose CPUs
[Figure: pipeline of CPUs, each with its own memory, for L2, L3, and ACL]
Source: P. Bosshart, TI
How We Did It
• Memory to CPU bottleneck
• Replicate CPUs
• More stages for finer granularity
• Higher CPU cost ok
[Figure: several CPUs attached to one memory]
Source: P. Bosshart, TI
RMT Logical to Physical Table Mapping
[Figure: the table graph (ETH, VLAN, IPV4, IPV6, L2S, L2D, TCP, UDP, ACL) mapped onto Physical Stage 1, Physical Stage 2, …, Physical Stage n. Each physical stage holds a 640b SRAM hash match table and a 640b TCAM match table, each followed by an action unit. Numbered logical tables include Logical Table 1 (Ethertype) and Logical Table 6 (L2D).]
Action Processing Model
[Figure: Header In and the Match result feed an ALU, which takes fields, data, and an instruction and produces Header Out]
Source: P. Bosshart, TI
Modeled as Multiple VLIW CPUs per Stage
[Figure: an array of ALUs driven by the match result and VLIW instructions]
Source: P. Bosshart, TI
RMT Switch Design
• 64 x 10Gb ports
– 960M packets/second
– 1GHz pipeline
• Programmable parser
• 32 Match/action stages
• Huge TCAM: 10x current chips
– 64K TCAM words x 640b
• SRAM hash tables for exact
matches
– 128K words x 640b
• 224 action processors per stage
• All OpenFlow statistics counters
Source: P. Bosshart, TI
OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The RMT switch model
• Flexibility costs less than 15%
Source: P. Bosshart, TI
Cost of Configurability:
Comparison with Conventional Switch
• Many functions identical: I/O, data buffer, queueing…
• Make extra functions optional: statistics
• Memory dominates area
– Compare memory area/bit and bit count
• RMT must use memory bits efficiently to compete on cost
• Techniques for flexibility
– Match stage unit RAM configurability
– Ingress/egress resource sharing
– Table predication allows multiple tables per stage
– Match memory overhead reduction
– Match memory multi-word packing
Source: P. Bosshart, TI
Chip Comparison with Fixed Function Switches
Area
Section                        Area % of chip   Extra Cost
IO, buffer, queue, CPU, etc    37%              0.0%
Match memory & logic           54.3%            8.0%
VLIW action engine             7.4%             5.5%
Parser + deparser              1.3%             0.7%
Total extra area cost                           14.2%
Power
Section                        Power % of chip  Extra Cost
I/O                            26.0%            0.0%
Memory leakage                 43.7%            4.0%
Logic leakage                  7.3%             2.5%
RAM active                     2.7%             0.4%
TCAM active                    3.5%             0.0%
Logic active                   16.8%            5.5%
Total extra power cost                          12.4%
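The totals in the two tables can be checked by summing the per-section extra costs. The snippet below only re-adds the numbers printed above; the dictionary names are chosen here for illustration.

```python
# Extra cost per section, in percent of chip, copied from the tables above.
area_extra = {
    "IO, buffer, queue, CPU, etc": 0.0,
    "Match memory & logic": 8.0,
    "VLIW action engine": 5.5,
    "Parser + deparser": 0.7,
}
power_extra = {
    "I/O": 0.0,
    "Memory leakage": 4.0,
    "Logic leakage": 2.5,
    "RAM active": 0.4,
    "TCAM active": 0.0,
    "Logic active": 5.5,
}

total_area_extra = round(sum(area_extra.values()), 1)    # 14.2
total_power_extra = round(sum(power_extra.values()), 1)  # 12.4
```

Both totals stay under the 15% figure quoted in the outline.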
Conclusion
• How do we design a flexible chip?
– The RMT switch model
– Bring processing close to the memories:
• pipeline of many stages
– Bring the processing to the wires:
• 224 action CPUs per stage
• How much does it cost?
– Less than 15%
• Many details of the 28nm CMOS design are in the
paper
Source: P. Bosshart, TI
Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
– Click software router
– SwitchBlade NetFPGA programmable router
– OpenFlow++ and programming protocol-independent packet processors
In the Beginning…
• OpenFlow was simple
• A single rule table
– Priority, pattern, actions, counters, timeouts
• Matching on any of 12 fields, e.g.,
– MAC addresses
– IP addresses
– Transport protocol
– Transport port numbers
Over the Past Five Years…
Proliferation of header fields
Version   Date       # Headers
OF 1.0    Dec 2009   12
OF 1.1    Feb 2011   15
OF 1.2    Dec 2011   36
OF 1.3    Jun 2012   40
OF 1.4    Oct 2013   41
Multiple stages of heterogeneous tables
Still not enough (e.g., VXLAN, NVGRE, STT, …)
Where does it stop?!?
Future SDN Switches
• Configurable packet parser
– Not tied to a specific header format
• Flexible match+action tables
– Multiple tables (in series and/or parallel)
– Able to match on all defined fields
• General packet-processing primitives
– Copy, add, remove, and modify
– For both header fields and meta-data
We Can Do This!
• New generation of switch ASICs
– Intel FlexPipe: programmable parser,
– RMT [SIGCOMM’13]
– Cisco Doppler
• But, programming these chips is hard
– Custom, vendor-specific interfaces
– Low-level, akin to microcode programming
We need a higher-level interface
To tell the switch how we want it to behave
Three Goals
• Protocol independence
– Configure a packet parser
– Define a set of typed match+action tables
• Target independence
– Program without knowledge of switch details
– Rely on compiler to configure the target switch
• Reconfigurability
– Change parsing and processing in the field
“Classic” OpenFlow (1.x)
[Figure: the SDN Control Plane talks directly to the Target Switch, installing and querying rules]
“OpenFlow 2.0”
[Figure: the SDN Control Plane reaches the Target Switch along two paths. Configuring: the parser, tables, and control flow go through a Compiler, which produces the parser and table configuration. Populating: installing and querying rules goes through a Rule Translator.]
P4 Language
Programming Protocol-Independent
Packet Processing
Simple Motivating Example
• Data-center routing
– Top-of-rack switches
– Two tiers of core switches
– Source routing by ToR
• Hierarchical tag (mTag)
– Pushed by the ToR
– Four one-byte fields
– Two hops up, two down
[Figure: path between two ToR switches through two tiers of core switches, with hops labeled up1, up2, down1, down2]
Header Formats
• Header
– Ordered list of fields
– A field has a name and width
header ethernet {
    fields {
        dst_addr : 48;
        src_addr : 48;
        ethertype : 16;
    }
}
header vlan {
    fields {
        pcp : 3;
        cfi : 1;
        vid : 12;
        ethertype : 16;
    }
}
header mTag {
    fields {
        up1 : 8;
        up2 : 8;
        down1 : 8;
        down2 : 8;
        ethertype : 16;
    }
}
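To make the "ordered list of named, fixed-width fields" idea concrete, here is a Python sketch that slices raw bytes according to the three header definitions above. It is an illustration only, not part of P4; the `HEADERS` table and the `extract` function are names invented for this sketch.

```python
# Field names and bit widths copied from the header definitions above.
HEADERS = {
    "ethernet": [("dst_addr", 48), ("src_addr", 48), ("ethertype", 16)],
    "vlan": [("pcp", 3), ("cfi", 1), ("vid", 12), ("ethertype", 16)],
    "mTag": [("up1", 8), ("up2", 8), ("down1", 8), ("down2", 8),
             ("ethertype", 16)],
}


def header_bits(name):
    """Total width of a header in bits."""
    return sum(width for _, width in HEADERS[name])


def extract(name, data):
    """Split the leading bytes of `data` into the fields of header `name`.

    Returns (fields, remaining_bytes). All three headers are byte-aligned.
    """
    nbits = header_bits(name)
    nbytes = nbits // 8
    value = int.from_bytes(data[:nbytes], "big")
    fields, remaining = {}, nbits
    for field_name, width in HEADERS[name]:
        remaining -= width
        fields[field_name] = (value >> remaining) & ((1 << width) - 1)
    return fields, data[nbytes:]


# pcp=5, cfi=0, vid=100, ethertype=0x0800, followed by other bytes.
vlan_fields, rest = extract("vlan", bytes.fromhex("a0640800") + b"rest")
```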
Parser
• State machine traversing the packet
– Extracting field values as it goes
parser start {
    ethernet;
}
parser ethernet {
    switch(ethertype) {
        case 0x8100 : vlan;
        case 0x9100 : vlan;
        case 0x800 : ipv4;
        . . .
    }
}
parser vlan {
    switch(ethertype) {
        case 0xaaaa : mTag;
        case 0x800 : ipv4;
        . . .
    }
}
parser mTag {
    switch(ethertype) {
        case 0x800 : ipv4;
        . . .
    }
}
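The parser above is a state machine keyed on ethertype. A minimal Python rendering follows; it covers only the cases listed above (the elided cases stay elided), and the `TRANSITIONS` and `HEADER_BYTES` tables and the `parse` function are names invented for this sketch.

```python
# Header lengths in bytes, derived from the field widths of the three headers.
HEADER_BYTES = {"ethernet": 14, "vlan": 4, "mTag": 6}

# Parse graph: state -> {ethertype: next state}, from the cases listed above.
TRANSITIONS = {
    "ethernet": {0x8100: "vlan", 0x9100: "vlan", 0x0800: "ipv4"},
    "vlan": {0xAAAA: "mTag", 0x0800: "ipv4"},
    "mTag": {0x0800: "ipv4"},
}


def parse(packet):
    """Return the sequence of headers recognized, starting from ethernet."""
    state, offset, seen = "ethernet", 0, []
    while state in TRANSITIONS:
        length = HEADER_BYTES[state]
        header = packet[offset:offset + length]
        if len(header) < length:
            raise ValueError(f"truncated {state} header")
        seen.append(state)
        # In all three headers the ethertype is the last 16 bits.
        ethertype = int.from_bytes(header[-2:], "big")
        offset += length
        state = TRANSITIONS[state].get(ethertype)
    if state is not None:
        seen.append(state)  # e.g. ipv4, whose own parsing is not sketched
    return seen


eth = bytes(12) + b"\x81\x00"
vlan = b"\x00\x64" + b"\xaa\xaa"
mtag = bytes([1, 2, 3, 4]) + b"\x08\x00"
tagged = parse(eth + vlan + mtag + b"payload")
plain = parse(bytes(12) + b"\x08\x00" + b"payload")
```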
Typed Tables
• Describe each packet-processing stage
– What fields are matched, and in what way
– What action functions are performed
– (Optionally) a hint about max number of rules
table mTag_table {
    reads {
        ethernet.dst_addr : exact;
        vlan.vid : exact;
    }
    actions {
        add_mTag;
    }
    max_size : 20000;
}
Action Functions
• Custom actions built from primitives
– Add, remove, copy, set, increment, checksum
action add_mTag(up1, up2, down1, down2, outport) {
    add_header(mTag);
    copy_field(mTag.ethertype, vlan.ethertype);
    set_field(vlan.ethertype, 0xaaaa);
    set_field(mTag.up1, up1);
    set_field(mTag.up2, up2);
    set_field(mTag.down1, down1);
    set_field(mTag.down2, down2);
    set_field(metadata.outport, outport);
}
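How a typed table and an action function fit together can be sketched in Python. The packet is modeled as a dictionary of headers, and the `ExactMatchTable` class, its `add_rule`/`apply` methods, and the example rule values are assumptions for illustration rather than P4 semantics.

```python
def add_mTag(packet, up1, up2, down1, down2, outport):
    """Python rendering of the add_mTag action on a dict-of-headers packet."""
    packet["mTag"] = {}                                        # add_header
    packet["mTag"]["ethertype"] = packet["vlan"]["ethertype"]  # copy_field
    packet["vlan"]["ethertype"] = 0xAAAA                       # set_field
    packet["mTag"].update(up1=up1, up2=up2, down1=down1, down2=down2)
    packet["metadata"]["outport"] = outport


class ExactMatchTable:
    def __init__(self, reads, max_size):
        self.reads = reads        # list of (header, field) pairs to match on
        self.max_size = max_size  # hint about the maximum number of rules
        self.entries = {}

    def add_rule(self, key, action, *args):
        if key not in self.entries and len(self.entries) >= self.max_size:
            raise OverflowError("table full")
        self.entries[key] = (action, args)

    def apply(self, packet):
        key = tuple(packet[h][f] for h, f in self.reads)
        hit = self.entries.get(key)
        if hit is None:
            return False          # miss: no action is run
        action, args = hit
        action(packet, *args)
        return True


mTag_table = ExactMatchTable([("ethernet", "dst_addr"), ("vlan", "vid")],
                             max_size=20000)
# Hypothetical rule installed by the control plane.
mTag_table.add_rule((0x0000AABBCCDD, 100), add_mTag, 1, 2, 3, 4, 7)

packet = {
    "ethernet": {"dst_addr": 0x0000AABBCCDD},
    "vlan": {"vid": 100, "ethertype": 0x0800},
    "metadata": {},
}
hit = mTag_table.apply(packet)
```

The rule supplies the action arguments (the four hop fields and the output port), while the table definition fixes which fields are matched and which actions are allowed.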
Control Flow
• Flow of control from one table to the next
– Collection of functions, conditionals, and tables
• For a ToR switch:
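[Figure: ToR control flow. Packets from the core (with mTag) and from local hosts (with no mTag) enter the Source Check Table, then the Local Switching Table; on a miss (not local) they go to the mTag Table; packets then pass the Egress Check.]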
Control Flow
• Flow of control from one table to the next
– Collection of functions, conditionals, and tables
• Simple imperative representation
control main() {
    table(source_check);
    if (!defined(metadata.ingress_error)) {
        table(local_switching);
        if (!defined(metadata.outport)) {
            table(mTag_table);
        }
        table(egress_check);
    }
}
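The control program can be traced with a short Python sketch that follows the same branching. The table behaviors below (what counts as an ingress error, which destinations are local, the uplink port number) are hypothetical placeholders; only the order of table applications mirrors the control block above.

```python
def run_main(packet, tables):
    """Apply tables in the order of control main(); return the tables visited."""
    visited = []

    def table(name):
        visited.append(name)
        tables[name](packet)

    table("source_check")
    if "ingress_error" not in packet["metadata"]:
        table("local_switching")
        if "outport" not in packet["metadata"]:
            table("mTag_table")
        table("egress_check")
    return visited


LOCAL_PORTS = {"hostA": 1}  # hypothetical local forwarding state


def source_check(pkt):
    # Hypothetical rule: tagged packets must not arrive from a local host.
    if pkt["from_host"] and pkt["has_mTag"]:
        pkt["metadata"]["ingress_error"] = True


def local_switching(pkt):
    if pkt["dst"] in LOCAL_PORTS:
        pkt["metadata"]["outport"] = LOCAL_PORTS[pkt["dst"]]


def mtag_table(pkt):
    pkt["metadata"]["outport"] = 48  # hypothetical uplink port


def egress_check(pkt):
    pass  # placeholder: no egress policy in this sketch


TABLES = {"source_check": source_check, "local_switching": local_switching,
          "mTag_table": mtag_table, "egress_check": egress_check}


def make_packet(dst, from_host=True, has_mTag=False):
    return {"dst": dst, "from_host": from_host, "has_mTag": has_mTag,
            "metadata": {}}


local_trace = run_main(make_packet("hostA"), TABLES)
remote_trace = run_main(make_packet("hostZ"), TABLES)
error_trace = run_main(make_packet("hostA", has_mTag=True), TABLES)
```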
P4 Compilation
P4 Compiler
• Parser
– Programmable parser: translate to state machine
– Fixed parser: verify the description is consistent
• Control program
– Target-independent: table graph of dependencies
– Target-dependent: mapping to switch resources
• Rule translation
– Verify that rules agree with the (logical) table types
– Translate the rules to the physical tables
Compiling to Target Switches
• Software switches
– Directly map the table graph to switch tables
– Use data structure for exact/prefix/ternary match
• Hardware switches with RAM and TCAM
– RAM: hash table for tables with exact match
– TCAM: for tables with wildcards in the match
• Switches with parallel tables
– Analyze table graph for possible concurrency
Compiling to Target Switches
• Applying actions at the end of pipeline
– Instantiate tables that generate meta-data
– Use meta-data to perform actions at the end
• Switches with a few physical tables
– Map multiple logical tables to one physical table
– “Compose” rules from the multiple logical tables
– … into “cross product” of rules in physical table
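The "cross product" composition for switches with few physical tables can be sketched as follows. This is a simplified model: both logical tables are exact-match, misses are ignored, and the first table's actions are assumed not to change fields the second table matches on. The table contents and action strings are made up for illustration.

```python
from itertools import product


def compose(table_a, table_b):
    """Merge two logical exact-match tables applied in sequence into one.

    Each table maps a match key to a list of actions. The merged table is
    keyed by (key_a, key_b) and carries the concatenated action lists.
    """
    return {
        (key_a, key_b): actions_a + actions_b
        for (key_a, actions_a), (key_b, actions_b)
        in product(table_a.items(), table_b.items())
    }


# Hypothetical logical tables: one keyed on VLAN id, one on destination MAC.
vlan_table = {10: ["set_vrf(1)"], 20: ["set_vrf(2)"]}
mac_table = {"aa": ["out(1)"], "bb": ["out(2)"], "cc": ["out(3)"]}

physical = compose(vlan_table, mac_table)
```

Two tables with 2 and 3 rules produce 6 physical rules, which shows why this mapping can be costly in table space.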
Related Work
• Abstract forwarding model for OpenFlow
• Kangaroo programmable parser
• Protocol-oblivious forwarding
• Table Type Patterns in ONF FAWG
• NOSIX portability layer for OpenFlow
Conclusion
• OpenFlow 1.x
– Vendor-agnostic API
– But, only for fixed-function switches
• An alternate future
– Protocol independence
– Target independence
– Reconfigurability in the field
• P4 language: a straw-man proposal
– To trigger discussion and debate
– Much, much more work to do!
Questions?