Programmable Routers
Jae Woo Lee

Fundamental router design
• Control plane: routing protocols, RIB
• Forwarding plane (aka data plane): FIB, packet forwarding

Software router
• User-level daemons: routed, OSPFd, GNU Zebra, Quagga, XORP
• OS kernel & network devices: Linux, BSD, Click, NetFPGA, IXP

Extensible software control plane: XORP
• Compete with Cisco & Juniper, and be extensible!
  – All standard protocols
  – Event-driven, not scanner-based
  – Multi-process architecture
  – Modern software engineering
• Main contributions:
  – Staged design for BGP, RIB
  – Scriptable inter-process communication mechanism
  – Dynamically extensible CLI and management software
  – Extensible policy framework
• Handley, M., Kohler, E., Ghosh, A., Hodson, O., and Radoslavov, P., Designing extensible IP router software, NSDI 2005

Conventional router implementation
Slide borrowed from http://www.xorp.org/papers.html

BGP

BGP Staged Architecture

Messages
• Stage interface: add_route, delete_route, lookup_route
• PeerIn holds a tree of routes and feeds the Filter Bank
• Unmodified routes stored at ingress
• Changes in downstream modules (filters, nexthop state, etc.) handled by PeerIn pushing the routes again

RIB: Routing Information Base

RIB Structure
• Routing protocols can register interest in tracking changes to specific routes.
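The idea of registering interest in specific routes can be illustrated with a small sketch. This is not XORP code: the class ToyRib, its method register_interest, and the callback signature are hypothetical names chosen for illustration, and a linear scan stands in for whatever lookup structure a real RIB would use. The method names add_route, delete_route, and lookup_route are borrowed from the stage interface on the slides. The sketch assumes a client (say, BGP tracking a nexthop) wants a callback only when the best-matching route for its address changes.

```python
import ipaddress
from typing import Callable, Dict, List, Optional, Tuple

Network = ipaddress.IPv4Network
Address = ipaddress.IPv4Address
# Callback arguments: watched address, new best prefix (or None), new next hop (or None).
Callback = Callable[[Address, Optional[Network], Optional[str]], None]
Match = Tuple[Optional[Network], Optional[str]]


class ToyRib:
    """Hypothetical RIB: longest-prefix match plus interest registration."""

    def __init__(self) -> None:
        self._routes: Dict[Network, str] = {}
        self._watchers: List[Tuple[Address, Callback]] = []

    def lookup_route(self, addr: Address) -> Match:
        # Linear scan for the longest matching prefix (fine for a sketch).
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None, None
        best = max(matches, key=lambda net: net.prefixlen)
        return best, self._routes[best]

    def register_interest(self, addr: Address, callback: Callback) -> Match:
        # Remember the watcher and answer with the current best route.
        self._watchers.append((addr, callback))
        return self.lookup_route(addr)

    def add_route(self, net: Network, nexthop: str) -> None:
        self._change(net, lambda: self._routes.__setitem__(net, nexthop))

    def delete_route(self, net: Network) -> None:
        self._change(net, lambda: self._routes.pop(net, None))

    def _change(self, net: Network, mutate: Callable[[], object]) -> None:
        # Only watchers whose address falls inside the changed prefix can be affected.
        affected = [(a, cb, self.lookup_route(a)) for a, cb in self._watchers if a in net]
        mutate()
        for addr, callback, before in affected:
            after = self.lookup_route(addr)
            if after != before:
                callback(addr, *after)


events = []
rib = ToyRib()
rib.add_route(Network("10.0.0.0/8"), "192.0.2.1")
initial = rib.register_interest(
    Address("10.1.2.3"),
    lambda a, net, nh: events.append((str(a), str(net) if net else None, nh)),
)
rib.add_route(Network("10.1.0.0/16"), "192.0.2.2")    # more specific: watcher notified
rib.add_route(Network("172.16.0.0/12"), "192.0.2.3")  # unrelated: no notification
rib.delete_route(Network("10.1.0.0/16"))              # falls back to /8: watcher notified
```

Notifying only on a change of the best match keeps the client event-driven: it never has to rescan the table to learn whether its nexthop resolution is still valid.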
XRLs: Interprocess communication

XRL: XORP Resource Locator
– URL-like unified structure for inter-process communication
– Example: finder://bgp/bgp/1.0/set_bgp_as?as:u32=1777
  • transport: e.g., x-tcp, x-udp, kill, finder
  • module name: e.g., bgp, rip, ospf, fea
  • interface name: e.g., bgp, vif manager
  • method name: set_bgp_as, delete_route, etc.
  • typed parameters to method
– Finder resolves to a concrete method instance, instantiates transport, and performs access control:
  xtcp://192.1.2.3:8765/bgp/1.0/set_bgp_as?as:u32=1777

Commercializing XORP: Vyatta
• Standard x86 hardware
• Flexible deployment
  – Standard server hardware platforms
  – Blades
  – Virtualization
• Open-source software
• Why Vyatta is Better than Cisco, http://www.vyatta.com/downloads/whitepapers/Vyatta_Better_than_Cisco.pdf
• Will an open source router replace your Cisco router? http://articles.techrepublic.com.com/5100-10878_11-6163569.html

Software forwarding plane: OS kernels
[Figure: control plane (user-level routing daemons) above forwarding plane (Linux kernel), connected by /proc, ioctl(), netlink, routing socket]
Interface between control and forwarding planes:
• Linux (old)
  – /proc, sysctl, ioctl
• Linux (new)
  – Netlink socket
• BSD
  – Routing socket
• J. Salim, H. Khosravi, A. Kleen, A. Kuznetsov, Linux Netlink as an IP Services Protocol, RFC 3549, July 2003
• Bolla, R.
and Bruschi, R., Linux Software Router: Data Plane Optimization and Performance Evaluation, Journal of Networks (JNW) 2, 3 (June 2007)
• Qing Li, Kip Macy, Optimizing the BSD Routing System for Parallel Processing, PRESTO 2009

Modular software forwarding plane: Click modular router
[Figure: control plane (user-level routing daemons) above forwarding plane (Linux kernel, Click)]
• Elements
  – Small building blocks, performing simple operations
  – Instances of C++ classes
• Packets traverse a directed graph of elements
  FromDevice(eth0)->CheckIPHeader(14)->IPPrint->Discard;
• Kohler, E., Morris, R., Chen, B., Jannotti, J., Kaashoek, M. F., The Click modular router, ACM Trans. Comput. Syst. 18, 3 (Aug. 2000)
• Andrea Bianco, Robert Birke, Davide Bolognesi, Jorge M. Finochietto, Giulio Galante, Marco Mellia, Click vs. Linux: Two Efficient Open-Source IP Network Stacks for Software Routers, HPSR 2005

Elements
15-7-2016 PATS Research Group

Push and pull
• Push connection
  – Source pushes packets downstream
  – Triggered by event, such as packet arrival
  – Denoted by filled square or triangle
• Pull connection
  – Destination pulls packets from upstream
  – Packet transmission or scheduling
  – Denoted by empty square or triangle
• Agnostic connection
  – Becomes push or pull depending on peer
  – Denoted by double outline

Push and pull violations

Implicit queue vs. explicit queue
• Implicit queue
  – Used by STREAMS, Scout, etc.
  – Hard to control
• Explicit queue
  – Led to push and pull, Click's main idea
  – Contributes to high performance

IP router configuration

Click performance, circa 2000
• MLFFR (maximum loss-free forwarding rate) with 64-byte packets: 333k packets/s for Click, 284k for Linux w/ polling driver, 84k for plain Linux

Improving software router performance: exploiting parallelism
• Can you build a Tbps router out of PCs running Click?
  – Not quite, but you can get close
• RouteBricks: high-end software router
  – Parallelism across servers and cores
  – High-end servers: NUMA, multi-queue NICs
  – RB4 prototype
    • 4 servers in full mesh acting as 4-port (10 Gbps/port) router
    • 4 × 8.75 = 35 Gbps
  – Linearly scalable by adding servers (in theory)
• Dobrescu, M., Egi, N., Argyraki, K., Chun, B., Fall, K., Iannaccone, G., Knies, A., Manesh, M., and Ratnasamy, S., RouteBricks: exploiting parallelism to scale software routers, SOSP 2009
• Bolla, R. and Bruschi, R., PC-based software routers: high performance and application service support, PRESTO 2008

Improving software router performance: specialized hardware
• NetFPGA
• Network processor
• Jad Naous, Glen Gibb, Sara Bolouki, Nick McKeown, NetFPGA: Reusable Router Architecture for Experimental Research, PRESTO 2008
• Spalink, T., Karlin, S., Peterson, L., and Gottlieb, Y., Building a robust software-based router using network processors, SOSP 2001
• J. Turner, P. Crowley, J. Dehart, A. Freestone, B. Heller, F. Kuhms, S. Kumar, J. Lockwood, J. Lu, M. Wilson, C. Wiseman, D. Zar, Supercharging PlanetLab – A High Performance, Multi-Application, Overlay Network Platform, SIGCOMM 2007
• Tilman Wolf, Challenges and applications for network-processor-based programmable routers, IEEE Sarnoff Symposium, Princeton, NJ, Mar.
2006

Commercial hardware router: Juniper
[Figure: control plane with Routing Engine (RE); Switch Control Board (SCB); forwarding plane with Packet Forwarding Engine (PFE) and two Multi-Services Modules (MS-PIC)]
• RE
  – x86 PC running JUNOS
• PFE
  – ASIC hardware and microcode
• MS-PIC
  – MIPS64-based XLR network processor
  – Each runs separate JUNOS
• JUNOS
  – FreeBSD-based OS for all Juniper routers

Extending commercial router: JUNOS SDK
• RE SDK
  – Servers and management daemons running on RE
• Services SDK
  – Data path apps running on MS-PIC
  – Packet processing with zero-copy API at line rate
  – 32 (virtual) CPUs
    • 8 cores × 4 hardware threads
    • Data threads bound to dedicated CPUs to eliminate context switches
• James Kelly, Wladimir Araujo, Kallol Banerjee, Rapid Service Creation using the JUNOS SDK, PRESTO 2009

Standardizing backplane: IETF ForCES WG

    -------------------------------------------------
    |       |       |       |       |       |       |
    |OSPF   |RIP    |BGP    |RSVP   |LDP    |. . .  |
    |       |       |       |       |       |       |
    -------------------------------------------------
    |               ForCES Interface                |
    -------------------------------------------------
                    ^   ^
            ForCES  |   |data
            control |   |packets
            messages|   |(e.g., routing packets)
                    v   v
    -------------------------------------------------
    |               ForCES Interface                |
    -------------------------------------------------
    |       |       |       |       |       |       |
    |LPM Fwd|Meter  |Shaper |NAT    |Classi-|. . .  |
    |       |       |       |       |fier   |       |
    -------------------------------------------------
    |               FE resources                    |
    -------------------------------------------------
    Examples of CE and FE functions.

• Forwarding and Control Element Separation (ForCES)
• Protocols for (multiple) control elements (CE) and forwarding elements (FE)
• Separation can be switch fabric or LAN
• Interoperability between router components
• Would Cisco & Juniper care?
• J. Salim, H. Khosravi, A. Kleen, A. Kuznetsov, Linux Netlink as an IP Services Protocol, RFC 3549, July 2003
• H. Khosravi, Ed., T.
Anderson, Ed., Requirements for Separation of IP Control and Forwarding, RFC 3654, November 2003
• L. Yang, R. Dantu, T. Anderson, R. Gopal, Forwarding and Control Element Separation (ForCES) Framework, RFC 3746, April 2004
• Ran Giladi, Niv Yemini, A programmable, generic forwarding element (GFE) approach for dynamic network functionality, PRESTO 2009

Control plane detached: OpenFlow
[Figure: OpenFlow Controller connected over the OpenFlow Protocol (SSL) to the flow table of an OpenFlow-enabled Layer-2 switch]
• Physical separation of control and forwarding
• Forwarding plane in L2
  – Flow table instead of FIB
  – More general than IP
• Switch exposes flow table through simple OpenFlow protocol
  – Keep it simple
  – Vendor can keep platform closed
  – Use outboard device for packet processing
• Flow table matches subsets of packet header fields:
  Switch Port, MAC src, MAC dst, Eth type, VLAN ID, IP Src, IP Dst, IP Prot, TCP sport, TCP dport
• McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G., Peterson, L., Rexford, J., Shenker, S., and Turner, J., OpenFlow: enabling innovation in campus networks, SIGCOMM Comput. Commun. Rev. 38, 2 (Mar. 2008)

Slicing network: virtualization
[Figure: three virtual routers]
• NIC virtualization
  – Solaris Crossbow
• Router virtualization
  – Cisco & Juniper logical routers
  – Virtual Routers on the Move (VROOM)
• Tripathi, S., Droux, N., Srinivasan, T., and Belgaied, K., Crossbow: from hardware virtualized NICs to virtualized networks, VISA 2009
• Eric Keller, Evan Green, Virtualizing the Data Plane through Source Code Merging, PRESTO 2008
• Yi Wang, Eric Keller, Brian Biskeborn, Jacobus van der Merwe, Jennifer Rexford, Virtual routers on the move: Live router migration as a network-management primitive, SIGCOMM 2008

Extreme programmability: Active networks
• Discrete approach: code installed out-of-band
• Integrated approach: packet carries code (capsule)
• Heated debate in the 90s
• Far-reaching vision, still relevant today
• Calvert, K., Reflections on network architecture: an active networking perspective, SIGCOMM Comput. Commun. Rev. 36, 2 (Apr. 2006)
• David L. Tennenhouse, Jonathan M. Smith, W. David Sincoskie, David J. Wetherall, and Gary J. Minden, A Survey of Active Network Research, IEEE Communications Magazine, Vol. 35, No. 1, January 1997
• David L. Tennenhouse, David J. Wetherall, Towards an active network architecture, SIGCOMM Comput. Commun. Rev. 26, 2 (Apr. 1996)

Hosting tomorrow's in-network services: NetServ
• Reviving active network vision
  – Signaling-based code installation
  – Latest isolation and virtualization technology
  – Ubiquitous common API, from cable modem to Cisco router
• Suman Srinivasan, Jae Woo Lee, Eric Liu, Mike Kester, Henning Schulzrinne, Volker Hilt, Srini Seetharaman, Ashiq Khan, NetServ: Dynamically Deploying In-network Services, ReArch 2009

NetServ: prototype
Prototype
• Java OSGi on top of Click
• Click: Modular router platform
• OSGi: dynamic loading and unloading of modules
Measurement
1) Bare Linux vs. Plain Click
  – Penalty for kernel-user transition
2) Plain Click vs. NetServ
  – Java overhead
2) is small compared to 1)

Thank you