Software Defined Networking
COMS 6998-10, Fall 2014
Instructor: Li Erran Li (lierranli@cs.columbia.edu)
http://www.cs.columbia.edu/~lierranli/coms6998-10SDNFall2014/
10/20/2014: SDN Forwarding Abstraction

Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
  – Click software router
  – SwitchBlade NetFPGA programmable router
  – OpenFlow++ and programming protocol-independent packet processors
10/20/14 Software Defined Networking (COMS 6998-10)

Review of Previous Lecture
• What update abstractions did we learn?
  – Per-packet consistent update: each packet is processed either by the old configuration or by the new one, never by a mix of both
  – Per-flow consistent update: all packets of a flow are processed by the same configuration (either old or new)
  – Congestion-free update: updates remain congestion-free under asynchronous switch and traffic-matrix changes
Source: Andreas Voellmy, Yale

Review of Previous Lecture (Cont’d)
• How to achieve consistent update?
  – Install new rules on internal switches, leave old configuration in place
  – Install edge rules that stamp packets with the new version number
Source: Andreas Voellmy, Yale

Review of Previous Lecture (Cont’d)
[Figure: 2-Phase Update in Action. Labels: Traffic, I, F1, F2, F3]
Source: M. Reitblatt, Cornell

Review of Previous Lecture (Cont’d)
• How to perform congestion-free update?
[Figure: zUpdate workflow. Labels: Operator; Update Scenario; Update requirements; zUpdate; Current Traffic Distribution; Intermediate Traffic Distribution (twice); Target Traffic Distribution; Data Center Network]
Source: J.
Liu, Yale

Review of Previous Lecture (Cont’d)
[Figure: a Clos network with ECMP. All switches: Equal-Cost Multi-Path (ECMP). Link capacity: 1000. Layers: CORE 1-4, AGG 1-6, ToR 1-5. Flows of 300 and 600 leave the ToRs. Load on the highlighted link: 620 + 150 + 150 = 920]
Source: J. Liu, Yale

Review of Previous Lecture (Cont’d)
• Asynchronous changes can cause transient congestion
[Figure: draining AGG1 on the same network. Link capacity: 1000. When ToR1 is changed but ToR5 is not yet, the highlighted link carries 620 + 300 + 150 = 1070]
Source: J. Liu, Yale

Review of Previous Lecture (Cont’d)
• Solution: introducing an intermediate step
[Figure: Initial, Intermediate, and Final traffic distributions on the same network (CORE 1-4, AGG 1-6, ToR 1-5). Each transition (Initial to Intermediate, Intermediate to Final) is congestion-free regardless of the asynchronizations]
Source: J. Liu, Yale

Review of Previous Lecture (Cont’d)
• What happens when the control-plane network partitions?
• Assumptions:
  – Out-of-band control network
  – Routing and forwarding based on addresses
  – Policy specification using end-host names
  – Controller only aware of local name-address bindings
Source: Andreas Voellmy, Yale

Review of Previous Lecture (Cont’d)
• Consider policy isolating A from B. A control network partition occurs.
Only possible choices:
  – Let all packets through (including from A to B): gives up correctness
  – Drop all packets (including from A to D): gives up availability

Review of Previous Lecture (Cont’d)
• Solutions:
  – Network can label packets with sender’s identity
    • Route based on identity instead of address
  – Inband control

Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
  – Click software router
  – SwitchBlade NetFPGA programmable router
  – OpenFlow++

Modular software forwarding plane: Click modular router
[Figure labels: Control plane; User-level routing daemons; Linux kernel; Click; Forwarding plane]
• Elements
  – Small building blocks, performing simple operations
  – Instances of C++ classes
• Packets traverse a directed graph of elements

    FromDevice(eth0)->CheckIPHeader(14)
      ->IPPrint->Discard;

Elements
[Figure: the element Tee(2), annotated with its element class, configuration string, input port, and output ports]

Push and pull
[Figure: FromDevice receives packet p and calls push(p) through a Null element into a queue, which enqueues p. When ToDevice is ready to transmit, it calls pull() through a Null element; the queue dequeues p and returns it, and ToDevice sends p]
• Push connection
  – Source pushes packets downstream
  – Triggered by event, such as packet arrival
  – Denoted by filled square or triangle
• Pull connection
  – Destination pulls packets from upstream
  – Packet transmission or scheduling
  – Denoted by empty square or triangle
• Agnostic connection
  – Becomes push or pull depending on peer
  – Denoted by double outline

Push and pull violations
[Figure: example configurations connecting FromDevice, Counter, and ToDevice]

Implicit queue v.
explicit queue
Implicit queue
• Used by STREAM, Scout, etc.
• Hard to control
Explicit queue
• Led to push and pull, Click’s main idea
• Contributes to high performance

IP router configuration
[Figure: Click IP router configuration]

Click performance, circa 2000
• Maximum loss-free forwarding rate with 64-byte packets, in packets per second: 333k for Click, 284k for Linux with a polling driver, 84k for plain Linux

Improving software router performance: exploiting parallelism
• Can you build a Tbps router out of PCs running Click?
  – Not quite, but you can get close
• RouteBricks: high-end software router
  – Parallelism across servers and cores
  – High-end servers: NUMA, multi-queue NICs
  – RB4 prototype
    • 4 servers in full mesh acting as 4-port (10Gbps/port) router
    • 4 × 8.75 = 35Gbps
  – Linearly scalable by adding servers (in theory)

Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
  – Click software router
  – SwitchBlade NetFPGA programmable router
  – OpenFlow++ and programming protocol-independent packet processors

Motivation
• Many new protocols require data-plane changes.
  – Examples: OpenFlow, Path Splicing, AIP, …
• These protocols must forward packets at acceptable speeds
• May need to run in parallel with existing or alternative protocols
• Goal: Platform for rapidly developing new network protocols that
  – Forwards packets at high speed
  – Runs multiple data-plane protocols in parallel
Source: B.
Anwer, Gatech

Existing Approaches
• Develop custom software
  – Advantage: Flexible, easy to program
  – Disadvantage: Slow forwarding speeds
• Develop modules in custom hardware
  – Advantage: Excellent performance
  – Disadvantage: Long development cycles, rigid
• Develop in programmable hardware
  – Advantage: Flexible and fast
  – Disadvantage: Programming is difficult
Source: B. Anwer, Gatech

SwitchBlade: Main Idea
• Identify modular hardware building blocks that implement a variety of data-plane functions
• Allow a developer to enable and connect various building blocks in a hardware pipeline from software
• Allow multiple custom data planes to operate in parallel on the same hardware
Flexible, fast, and easy to program. Advantages of hardware and software with minimal overhead.
Source: B. Anwer, Gatech

SwitchBlade: Push Custom Forwarding Planes into Hardware
[Figure: in software, virtual environments VE1-VE4 run Click on a host with CPU, memory, and hard disk; across PCI, in hardware, the SwitchBlade NetFPGA hosts VDP1-VDP4]
VDP = Virtual Data Plane
Click = Click Software Router
VE = Virtual Environment
Source: B. Anwer, Gatech

SwitchBlade Features
• Parallel custom data planes
  – Ability to demultiplex into existing data planes and maintain isolation on common hardware platform
• Rapid development and deployment
  – Pluggable preprocessor modules enable a range of customizable functions at hardware rates
• Customizability and programmability
  – Dynamic selection of modules, and ability to operate in several different forwarding modes.
Source: B.
Anwer, Gatech

Virtual Data Planes (VDPs)
[Figure: pipeline stages: Virtual Data Plane Selection, Shaping, Preprocessing, Forwarding]
• Separate packet processing pipeline, lookup tables, and forwarding modules per VDP
• Stored table maps MAC address to VDP identifier
• VDP Selection step
  – Identifies VDP based on MAC address
  – Attaches 64-bit platform header that controls functions in later stages
  – Register interface controls this header per VDP
Source: B. Anwer, Gatech

Platform Header
[Figure: platform header fields: Hash Value, Module Bitmap, Mode, VDP ID]
• Hash value computed based on custom bits in header (allows for custom forwarding, if desired)
• Bitmap indicates which preprocessor modules should execute on this packet
• Mode indicates the forwarding mode (LPM or otherwise)
• VDP-ID indicates the VDP of the packet
Source: B. Anwer, Gatech

Virtual Data Plane Isolation
• Each Virtual Data Plane (VDP) has preprocessing, lookup, and post-processing stages
  – Fixed set of forwarding tables
  – Lookup, ARP, and exception tables
• One rate limiter per virtual data plane
• Forwarding tables, rate limiters operate in isolation
Source: B. Anwer, Gatech

SwitchBlade Features
• Parallel custom data planes
  – Ability to demultiplex into existing data planes and maintain isolation on common hardware platform.
• Rapid development and deployment
  – Pluggable preprocessor modules to enable a range of customizable functions at hardware rates
• Customizability and programmability
  – Dynamic selection of modules, and ability to operate in several different forwarding modes
Source: B.
Anwer, Gatech

Preprocessing
[Figure: pipeline stages: Virtual Data Plane Selection, Shaping, Preprocessing, Forwarding. The Preprocessing stage contains a Preprocessing Selector, Custom Preprocessor, and Hasher, controlled by a per-VDP module selection bit field register]
• Select processing functions from library of reusable modules
  – Selection function through bitmap: enables fast customization without resynthesis
  – Example implementations: Path Splicing, IPv6, OpenFlow
• Hash custom bits in packet header and insert value in hash field in platform header
  – Enables custom forwarding
Source: B. Anwer, Gatech

Hashing
[Figure: user-selected 8-bit, 16-bit, and 32-bit data fields from the Ethernet and IP headers of a packet are fed to the hasher, which outputs a 32-bit hash]
• Hash custom bits in packet header
  – Insert hash value in field in platform header
• Module accepts up to 256 bits from the preprocessor according to user selection

Example: OpenFlow
• Limited implementation (no VLANs or wildcards)
• Preprocessing Steps
  – Parse packet and extract relevant tuples
  – 240-bit OpenFlow “bitstream” passed to hasher module in the preprocessor
  – Hasher outputs 32-bit hash value on which custom forwarding could take place
  – Mode field set to perform exact match
• Most post-processing functions disabled (e.g., TTL decrement)
Source: B. Anwer, Gatech

Adding New Modules
• Adding a new module at any stage requires Verilog programming
• User writes preprocessing (and postprocessing) modules to extract the bits used for lookup
• Resynthesize hardware
• Enable module from register interface in software
Source: B. Anwer, Gatech

SwitchBlade Features
• Parallel custom data planes
  – Ability to demultiplex into existing data planes and maintain isolation on common hardware platform.
• Rapid development and deployment
  – Pluggable preprocessor modules to enable a range of customizable functions at hardware rates.
• Customizability and programmability
  – Dynamic selection of modules, and ability to operate in several different forwarding modes.
Source: B. Anwer, Gatech

Forwarding
[Figure: pipeline stages: Virtual Data Plane Selection, Shaping, Preprocessing, Forwarding. The Forwarding stage contains Output Port Lookup (with per-VDP lookup, software exception, and ARP tables, and per-VDP counters and stats), Postprocessor Wrappers, and Custom Postprocessor]
• Output port lookup performs custom forwarding depending on the mode bits in the platform header
• Wrapper modules allow matching on custom bit offsets
• Custom post processors allow other functions to be enabled/disabled on the fly (e.g., checksum)

Software Exceptions
• Ability to redirect some packets to CPU
• Packets are passed with VDP (and platform header), to allow for VDP-based software exceptions
• One possible application: Virtual routers in software
Source: B. Anwer, Gatech

Custom Postprocessing Paths
[Figure: forwarding paths for IPv6, OpenFlow, and Path Splicing pass through forwarding logic and selectable postprocessing blocks (TTL, Checksum, Dest. MAC Logic, Source MAC, User Defined) before the output queues]
Source: B. Anwer, Gatech

Implementation
• NetFPGA-based implementation
  – Based on NetFPGA reference router implementation
  – Xilinx Virtex 2 Pro 50
• SRAM for packet forwarding
• BRAM for storing forwarding information
• PCI for communication with CPU
Source: B. Anwer, Gatech

Evaluation
• Resource utilization: How many hardware resources does running SwitchBlade require?
  – Answer: Minimal additional overhead, compared to running any custom protocol directly
• Packet forwarding overhead: How fast can SwitchBlade forward packets?
  – Answer: No additional overhead with respect to base NetFPGA implementation
Source: B. Anwer, Gatech

Evaluation Setup
[Figure: a source host (CPU, memory, hard disk, PCI) with a NetFPGA packet generator, the SwitchBlade NetFPGA running VDP1-VDP4, and a sink host with a NetFPGA packet receiver]
• Three-node topology
  – NetFPGA traffic generator and sink
• Multiple parallel data planes running on SwitchBlade
Source: B. Anwer, Gatech

Little Additional Resource Overhead
Implementation   Avail. Data-planes   Gate Count
IPv4             One                  8M
Splicing         One                  12M
OpenFlow         One                  12M
SwitchBlade      Four                 13M
• Four virtualized data planes in parallel at one time
• Larger FPGAs will ultimately support more data planes
Source: B. Anwer, Gatech

SwitchBlade Incurs No Additional Forwarding Overhead
[Figure: forwarding rate (kpps)]
Source: B. Anwer, Gatech

Conclusion
• SwitchBlade: A programmable hardware platform with customizable parallel data planes
  – Rapid deployment using library of hardware modules
  – Provides isolation using rate limiters and fixed forwarding tables
• Rapid prototyping in programmable hardware and software
• Multiple data planes in parallel
  – Resource sharing minimizes hardware cost
http://gtnoise.net/switchblade
Source: B.
Anwer, Gatech

Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
  – Click software router
  – SwitchBlade NetFPGA programmable router
  – OpenFlow++ and programming protocol-independent packet processors

OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The Reconfigurable Match Table (RMT) switch model
• Flexibility costs less than 15%

Fixed function switch
[Figure: In -> Parser -> L2 Stage (Stage 1) -> L3 Stage (Stage 2) -> ACL Stage (Stage 3) -> Queues -> Deparser -> Out, with a data path alongside.
  L2 Table: 128k x 48, exact match. Action: set L2D
  L3 Table: 16k x 32, longest prefix match. Action: set L2D, dec TTL
  ACL Table: 4k, ternary match. Action: permit/deny
  Also shown, crossed out (X): a PBB Stage with action ?????????]
Source: P. Bosshart, TI

What if you need flexibility?
• Flexibility to:
  – Trade one memory size for another
  – Add a new table
  – Add a new header field
  – Add a different action
• SDN accentuates the need for flexibility
  – Gives programmatic control to control plane, expects to be able to use flexibility
Source: P. Bosshart, TI

What does SDN want?
• Multiple stages of match-action
  – Flexible allocation
• Flexible actions
• Flexible header fields
• No coincidence OpenFlow built this way…
Source: P. Bosshart, TI

What about Alternatives?
Aren’t there other ways to get flexibility?
• Software? 100x too slow, expensive
• NPUs? 10x too slow, expensive
• FPGAs? 10x too slow, expensive
Source: P. Bosshart, TI

What We Set Out To Learn
• How do I design a flexible switch chip?
• What does the flexibility cost?
Source: P. Bosshart, TI

What’s Hard about a Flexible Switch Chip?
• Big chip
• High frequency
• Wiring intensive
• Many crossbars
• Lots of TCAM
• Interaction between physical design and architecture
• Good news? No need to read 7000 IETF RFCs!
Source: P. Bosshart, TI

OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The RMT switch model
• Flexibility costs less than 15%
Source: P. Bosshart, TI

The RMT Abstract Model
• Parse graph
• Table graph
Source: P. Bosshart, TI

Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, IPV6, TCP, UDP. Example packet: Ethernet, IPV4, TCP]
Source: P. Bosshart, TI

Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, TCP, UDP. Example packet: Ethernet, IPV4, TCP]
Source: P. Bosshart, TI

Arbitrary Fields: The Parse Graph
[Figure: parse graph with nodes Ethernet, IPV4, RCP, TCP, UDP. Example packet: Ethernet, IPV4, RCP, TCP]
Source: P. Bosshart, TI

Reconfigurable Match Tables: The Table Graph
[Figure: table graph with nodes VLAN, ETHERTYPE, MAC FORWARD, IPV4-DA, IPV6-DA, ACL, RCP]
Source: P. Bosshart, TI

Changes to Parse Graph and Table Graph
[Figure: Parse Graph with nodes Ethernet, VLAN, IPV4, IPV6, RCP, TCP, UDP. Table Graph with nodes ETHERTYPE, VLAN, IPV4-DA, IPV6-DA, L2S, L2D, RCP, ACL, MY-TABLE, Done]
Source: P. Bosshart, TI

But the Parse Graph and Table Graph don’t show you how to build a switch
Source: P. Bosshart, TI

Match/Action Forwarding Model
[Figure: In -> Programmable Parser -> Match/Action Stage 1 -> Stage 2 -> … -> Stage N -> Queues -> Deparser -> Out, with a data path alongside. Each stage contains a match table and an action unit]
Source: P.
Bosshart, TI

Performance vs Flexibility
• Multiprocessor: memory bottleneck
• Change to pipeline
• Fixed function chips specialize processors
• Flexible switch needs general purpose CPUs
[Figure: three CPU and Memory pairs, labeled L2, L3, and ACL]
Source: P. Bosshart, TI

How We Did It
• Memory to CPU bottleneck
• Replicate CPUs
• More stages for finer granularity
• Higher CPU cost ok
[Figure: several CPUs and a Memory block]
Source: P. Bosshart, TI

RMT Logical to Physical Table Mapping
[Figure: the logical tables of the table graph (1 Ethertype, 2 VLAN, 3 IPV4, 4 L2S, 5 IPV6, 6 L2D, 7 TCP, 8 UDP, 9 ACL) are placed into Physical Stage 1, Physical Stage 2, …, Physical Stage n. Each physical stage has a 640b TCAM match table and a 640b SRAM hash match table, each followed by an action unit]

Action Processing Model
[Figure labels: Header In; Field; ALU; Field; Header Out; Match result; Data; Instruction]
Source: P. Bosshart, TI

Modeled as Multiple VLIW CPUs per Stage
[Figure: an array of ALUs driven by VLIW instructions and the match result]
Source: P. Bosshart, TI

RMT Switch Design
• 64 x 10Gb ports
  – 960M packets/second
  – 1GHz pipeline
• Programmable parser
• 32 Match/action stages
• Huge TCAM: 10x current chips
  – 64K TCAM words x 640b
• SRAM hash tables for exact matches
  – 128K words x 640b
• 224 action processors per stage
• All OpenFlow statistics counters
Source: P. Bosshart, TI

OpenFlow++: RMT Outline
• Conventional switch chips are inflexible
• SDN demands flexibility…sounds expensive…
• How do we do it: The RMT switch model
• Flexibility costs less than 15%
Source: P.
Bosshart, TI

Cost of Configurability: Comparison with Conventional Switch
• Many functions identical: I/O, data buffer, queueing…
• Make extra functions optional: statistics
• Memory dominates area
  – Compare memory area/bit and bit count
• RMT must use memory bits efficiently to compete on cost
• Techniques for flexibility
  – Match stage unit RAM configurability
  – Ingress/egress resource sharing
  – Table predication allows multiple tables per stage
  – Match memory overhead reduction
  – Match memory multi-word packing
Source: P. Bosshart, TI

Chip Comparison with Fixed Function Switches
Area
Section                        Area % of chip    Extra Cost
IO, buffer, queue, CPU, etc    37%               0.0%
Match memory & logic           54.3%             8.0%
VLIW action engine             7.4%              5.5%
Parser + deparser              1.3%              0.7%
Total extra area cost                            14.2%

Power
Section                        Power % of chip   Extra Cost
I/O                            26.0%             0.0%
Memory leakage                 43.7%             4.0%
Logic leakage                  7.3%              2.5%
RAM active                     2.7%              0.4%
TCAM active                    3.5%              0.0%
Logic active                   16.8%             5.5%
Total extra power cost                           12.4%

Conclusion
• How do we design a flexible chip?
  – The RMT switch model
  – Bring processing close to the memories:
    • pipeline of many stages
  – Bring the processing to the wires:
    • 224 action CPUs per stage
• How much does it cost?
  – Less than 15% (14.2% extra area, 12.4% extra power)
• Many details of how this is designed in 28nm CMOS are in the paper
Source: P.
Bosshart, TI

Outline
• Review of Previous Lecture: SDN Updates
• SDN Forwarding Abstractions
  – Click software router
  – SwitchBlade NetFPGA programmable router
  – OpenFlow++ and programming protocol-independent packet processors

In the Beginning…
• OpenFlow was simple
• A single rule table
  – Priority, pattern, actions, counters, timeouts
• Matching on any of 12 fields, e.g.,
  – MAC addresses
  – IP addresses
  – Transport protocol
  – Transport port numbers

Over the Past Five Years…
Proliferation of header fields
Version   Date       # Headers
OF 1.0    Dec 2009   12
OF 1.1    Feb 2011   15
OF 1.2    Dec 2011   36
OF 1.3    Jun 2012   40
OF 1.4    Oct 2013   41
Multiple stages of heterogeneous tables
Still not enough (e.g., VXLAN, NVGRE, STT, …)

Where does it stop?!?

Future SDN Switches
• Configurable packet parser
  – Not tied to a specific header format
• Flexible match+action tables
  – Multiple tables (in series and/or parallel)
  – Able to match on all defined fields
• General packet-processing primitives
  – Copy, add, remove, and modify
  – For both header fields and meta-data

We Can Do This!
• New generation of switch ASICs
  – Intel FlexPipe: programmable parser,
  – RMT [SIGCOMM’13]
  – Cisco Doppler
• But, programming these chips is hard
  – Custom, vendor-specific interfaces
  – Low-level, akin to microcode programming

We need a higher-level interface
To tell the switch how we want it to behave

Three Goals
• Protocol independence
  – Configure a packet parser
  – Define a set of typed match+action tables
• Target independence
  – Program without knowledge of switch details
  – Rely on compiler to configure the target switch
• Reconfigurability
  – Change parsing and processing in the field

“Classic” OpenFlow (1.x)
[Figure: the SDN control plane installs and queries rules on the target switch]

“OpenFlow 2.0”
[Figure: Configuring: the SDN control plane passes the parser, tables, and control flow to a compiler, which produces the parser and table configuration for the target switch. Populating: the SDN control plane installs and queries rules through a rule translator to the target switch]

P4 Language
Programming Protocol-Independent Packet Processing

Simple Motivating Example
• Data-center routing
  – Top-of-rack switches
  – Two tiers of core switches
  – Source routing by ToR
• Hierarchical tag (mTag)
  – Pushed by the ToR
  – Four one-byte fields
  – Two hops up, two down
[Figure: path between two ToR switches with hops labeled up1, up2, down1, down2]

Header Formats
• Header
  – Ordered list of fields
  – A field has a name and width

    header ethernet {
      fields {
        dst_addr : 48;
        src_addr : 48;
        ethertype : 16;
      }
    }

    header vlan {
      fields {
        pcp : 3;
        cfi : 1;
        vid : 12;
        ethertype : 16;
      }
    }

    header mTag {
      fields {
        up1 : 8;
        up2 : 8;
        down1 : 8;
        down2 : 8;
        ethertype : 16;
      }
    }

Parser
• State machine traversing the packet
  – Extracting field values as it goes

    parser start {
      ethernet;
    }

    parser ethernet {
      switch(ethertype) {
        case 0x8100 : vlan;
        case 0x9100 : vlan;
        case 0x800 : ipv4;
        . . .
      }
    }

    parser vlan {
      switch(ethertype) {
        case 0xaaaa : mTag;
        case 0x800 : ipv4;
        . . .
      }
    }

    parser mTag {
      switch(ethertype) {
        case 0x800 : ipv4;
        . . .
      }
    }

Typed Tables
• Describe each packet-processing stage
  – What fields are matched, and in what way
  – What action functions are performed
  – (Optionally) a hint about max number of rules

    table mTag_table {
      reads {
        ethernet.dst_addr : exact;
        vlan.vid : exact;
      }
      actions {
        add_mTag;
      }
      max_size : 20000;
    }

Action Functions
• Custom actions built from primitives
  – Add, remove, copy, set, increment, checksum

    action add_mTag(up1, up2, down1, down2, outport) {
      add_header(mTag);
      copy_field(mTag.ethertype, vlan.ethertype);
      set_field(vlan.ethertype, 0xaaaa);
      set_field(mTag.up1, up1);
      set_field(mTag.up2, up2);
      set_field(mTag.down1, down1);
      set_field(mTag.down2, down2);
      set_field(metadata.outport, outport);
    }

Control Flow
• Flow of control from one table to the next
  – Collection of functions, conditionals, and tables
• For a ToR switch:
[Figure: packets from the core (with mTag) and from local hosts (with no mTag) enter the ToR at the Source Check Table and continue to the Local Switching Table; on a miss (not local) they go to the mTag Table; the path ends at Egress Check]

Control Flow
• Flow of control from one table to the next
  – Collection of functions, conditionals, and tables
• Simple imperative representation

    control main() {
      table(source_check);
      if (!defined(metadata.ingress_error)) {
        table(local_switching);
        if (!defined(metadata.outport)) {
          table(mTag_table);
        }
        table(egress_check);
      }
    }

P4 Compilation

P4 Compiler
• Parser
  – Programmable parser: translate to state machine
  – Fixed parser: verify the description is consistent
• Control program
  – Target-independent: table graph of dependencies
  – Target-dependent: mapping to switch resources
• Rule translation
  – Verify that rules agree with the (logical) table types
  – Translate the rules to the physical tables

Compiling to Target Switches
• Software switches
  – Directly map the table graph to switch tables
  – Use data structure for exact/prefix/ternary match
• Hardware switches with RAM and TCAM
  – RAM: hash table for tables with exact match
  – TCAM: for tables with wildcards in the match
• Switches with parallel tables
  – Analyze table graph for possible concurrency

Compiling to Target Switches
• Applying actions at the end of pipeline
  – Instantiate tables that generate meta-data
  – Use meta-data to perform actions at the end
• Switches with a few physical tables
  – Map multiple logical tables to one physical table
  – “Compose” rules from the multiple logical tables
  – … into “cross product” of rules in physical table

Related Work
• Abstract forwarding model for OpenFlow
• Kangaroo programmable parser
• Protocol-oblivious forwarding
• Table Type Patterns in ONF FAWG
• NOSIX portability layer for OpenFlow

Conclusion
• OpenFlow 1.x
  – Vendor-agnostic API
  – But, only for fixed-function switches
• An alternate future
  – Protocol independence
  – Target independence
  – Reconfigurability in the field
• P4 language: a straw-man proposal
  – To trigger discussion and debate
  – Much, much more work to do!

Questions?
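
Sketch: the mTag parser and action in Python
The Header Formats, Parser, and Action Functions slides can be exercised with a small simulation. This is only a minimal sketch under stated assumptions: it is Python rather than P4, it says nothing about how a P4 compiler or target switch works, and the names HEADERS, NEXT, extract, parse, and add_mtag are hypothetical. Ethertype 0x800 (ipv4 in the slides) simply ends parsing here, because the slides do not define an ipv4 header.

```python
# Header layouts copied from the slides: (field name, width in bits).
HEADERS = {
    "ethernet": [("dst_addr", 48), ("src_addr", 48), ("ethertype", 16)],
    "vlan": [("pcp", 3), ("cfi", 1), ("vid", 12), ("ethertype", 16)],
    "mTag": [("up1", 8), ("up2", 8), ("down1", 8), ("down2", 8),
             ("ethertype", 16)],
}

# Parse graph from the slides: header -> {ethertype: next header}.
# Assumption: any ethertype not listed (including 0x800) stops the parser.
NEXT = {
    "ethernet": {0x8100: "vlan", 0x9100: "vlan"},
    "vlan": {0xAAAA: "mTag"},
    "mTag": {},
}


def extract(name, data, offset):
    """Read header `name` at byte `offset`; return (fields, next offset)."""
    layout = HEADERS[name]
    nbits = sum(width for _, width in layout)
    nbytes = nbits // 8
    value = int.from_bytes(data[offset:offset + nbytes], "big")
    fields = {}
    shift = nbits
    for field, width in layout:
        shift -= width  # fields are laid out most-significant first
        fields[field] = (value >> shift) & ((1 << width) - 1)
    return fields, offset + nbytes


def parse(data):
    """Walk the parse graph from ethernet, extracting fields as it goes."""
    parsed = {}
    state, offset = "ethernet", 0
    while state is not None:
        parsed[state], offset = extract(state, data, offset)
        state = NEXT[state].get(parsed[state]["ethertype"])
    return parsed


def add_mtag(parsed, metadata, up1, up2, down1, down2, outport):
    """Mirror the slide's add_mTag action on the parsed representation."""
    parsed["mTag"] = {                       # add_header(mTag)
        "up1": up1, "up2": up2, "down1": down1, "down2": down2,
        # copy_field(mTag.ethertype, vlan.ethertype)
        "ethertype": parsed["vlan"]["ethertype"],
    }
    parsed["vlan"]["ethertype"] = 0xAAAA     # set_field(vlan.ethertype, ...)
    metadata["outport"] = outport            # set_field(metadata.outport, ...)
```

Feeding parse a VLAN-tagged frame and then calling add_mtag leaves the original inner ethertype in the mTag header and 0xaaaa in the VLAN header, which is what lets the vlan parser state reach mTag on the next switch.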
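
Sketch: composing logical tables into one physical table
The Compiling to Target Switches slide mentions mapping multiple logical tables to one physical table by taking the "cross product" of their rules. The following minimal sketch shows that idea and its cost in rule count. It assumes exact-match tables only, no table misses or default actions, and no dependency between the two tables (the first table's actions do not change fields the second one matches). The function compose and the sample rules are hypothetical.

```python
from itertools import product


def compose(table1, table2):
    """Cross product of two exact-match logical tables.

    Each table maps a match key (a tuple of field values) to a list of
    action names. A composed rule matches the concatenated key and
    applies the actions of both source rules, first table first.
    """
    return {
        key1 + key2: actions1 + actions2
        for (key1, actions1), (key2, actions2)
        in product(table1.items(), table2.items())
    }


# Hypothetical logical tables: one keyed on source IP, one on destination MAC.
acl = {
    ("10.0.0.1",): ["permit"],
    ("10.0.0.2",): ["deny"],
}
l2 = {
    ("aa:aa",): ["set_outport_1"],
    ("bb:bb",): ["set_outport_2"],
    ("cc:cc",): ["set_outport_3"],
}

physical = compose(acl, l2)
```

Two logical tables with 2 and 3 rules become one physical table with 2 × 3 = 6 rules, which is why this mapping trades table count for rule space.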