Network-wide Decision Making: Toward a Wafer-thin Control Plane

Jennifer Rexford, Albert Greenberg, Gisli Hjalmtysson
AT&T Labs Research

David A. Maltz, Andy Myers, Geoffrey Xie, Jibin Zhan, Hui Zhang
Carnegie Mellon University


A Well-Studied Architecture Question

Smart hosts, dumb network
• Network moves IP packets between hosts
• Services implemented on hosts
• Keep state at the edges

Edge → IP → Network → IP → Edge

How to partition function vertically?


Inside a Single Network

Management Plane
• Figure out what is happening in network
• Decide how to change it

Control Plane
• Multiple routing processes on each router
• Each router with different configuration program
• Huge number of control knobs: metrics, ACLs, policy

Data Plane
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIB or labels

[Figure labels: Shell scripts, Planning tools, Databases, Configs, SNMP, netflow, modems, OSPF, BGP, Link metrics, Routing policies, Traffic Eng, FIB, Packet filters]


Example – Traffic Engineering - 1

Load sensitive routing
• OSPF distributes load info
• Paths computed to avoid hotspots

Problems seen across the planes
• Routers make uncoordinated changes to their routes
• Poor stability, traffic thrashing

Network-wide view needed for a network-wide goal

[Figure labels: Management Plane, Control Plane, Data Plane]


Example – Traffic Engineering - 2

Route planning (in the Management Plane)
• Learn topology
• Estimate traffic matrix
• Compute OSPF weights
• Reconfigure routers

Problems seen across the planes
• Must predict & undo effects of control plane
• Must translate solution into settings of control plane knobs

Need ability to express desired solution

[Figure labels: Management Plane, Control Plane, Data Plane, OSPF, Load info]


An Architecture Question to Study

How should the functionality that controls a network be divided up?
• Important: everyone hates net outages
• Practical: solutions can be implemented without changing IP or end-hosts
• Relevant: trends toward separating decision-making from forwarding
• Unsolved: problem is not solved by running BGP/OSPF on faster servers


Our Proposal: Dissemination and Decision Planes

What functions require a view of entire network and network objectives?
• Path selection and traffic engineering
• Reachability control and VPNs
→ Decision plane

What functions must be on every router to support creation of a network-wide view?
• Topology discovery
• Report measurements, status, resources
• Install state (e.g., FIBs, ACLs) into data-plane
→ Dissemination plane


Good Abstractions Reduce Complexity

All decision making logic lifted out of control plane
• Eliminates duplicate logic in management plane
• Dissemination plane provides a control channel to/from data plane

[Figure labels: Management Plane, Control Plane, Data Plane, Configs, FIBs, ACLs; Decision Plane, Dissemination, Data Plane, FIBs, ACLs]


Many Implementations Possible

Decision Plane
• Centralized, or
• Distributed

Dissemination Plane
• In-band, or
• Out-of-band

Choice based on reliability requirements

Data plane evolution should be driven by needs of decision and dissemination planes


Example – Traffic Engineering Reprise

• Network-wide view provided by Dissemination Plane
• All policy, goals, decision logic located in Decision Plane
• Consistent network-wide solution constructed
• Decision plane can directly express desired solution

[Figure labels: Decision Plane, Path Computation, Traffic Matrix, Topology, Load info, FIBs, Dissemination Plane, Data Plane]


Example – Traffic Isolation

Goal: prevent some hosts/apps from communicating with others

Reachability control (in the Management Plane)
• Create routing design
• Configure routing protocols
• Add packet filters to patch holes where needed

Problems
• Routing policy is very coarse grained
• Packet filters are very expensive in the data plane
• Missing filters can allow packets to leak, violating isolation

[Figure labels: Management Plane, Control Plane, Data Plane, Route attrs]


Example – Traffic Isolation Reprise

• Reachability matrix directly expresses intended goal
• Path computation can jointly optimize traffic load and obey reachability constraints
• Packet filters installed only where needed, and changed whenever routing changes

[Figure labels: Reachability matrix, Decision Plane, Path Computation, Traffic Matrix, Topology, Load info, FIBs, ACLs, Dissemination Plane, Data Plane]


Challenges

Scalability for a single network
• Back-of-the-envelope calculations → no show-stoppers

Responsiveness
• Reacting to unplanned failure takes an extra 40-100 ms; OSPF/iBGP reconvergence today is measured in seconds
• Preplanning for failure is easier with network-wide view

Coordination
• Minimize by having single active decision engine
• Leverage distributed computing work


Challenges (continued)

Dissemination plane robustness
• Must survive failures of links, but be less complicated than the routing protocols it tries to replace

Hierarchy
• How do two Decision Planes inter-network?
• How is the boundary of the Dissemination Plane defined?


Related Work

• Separation of forwarding elements and control elements
  – IETF: FORCES, GSMP, GMPLS
  – SoftRouter [Lakshman]
• Driving network operation from network-wide views
  – Traffic Engineering, Traffic Matrix computation
• Centralization of decision making logic
  – RCP [Feamster], PCE [Farrel]
  – SS7 [Ma Bell]


Summary

How to partition functionality inside the systems that control a network?
Dissemination and Decision Planes

Power of solution comes from:
• Locating all decision making in one plane
• Providing that plane with network-wide views
• Directly writing forwarding state to the data plane

Benefits
• Network-wide views
• Focus on network issues, less on distributed protocols
• Coordinated state updates → better reliability
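
Illustrative sketch (not from the talk): the idea that a decision plane holds the network-wide view and writes FIBs and ACLs directly can be made concrete with a few lines of Python. Everything named here is a hypothetical stand-in: `compute_fib`, `decide`, the dictionary-based topology, the host `attachments` map, and the rule of placing each deny filter on the source host's attachment router are assumptions chosen for brevity. Plain shortest-path routing (Dijkstra) stands in for whatever path computation a real decision plane would run.

```python
import heapq


def compute_fib(topology, source):
    """Dijkstra from `source`; returns {destination: next_hop}.

    `topology` maps each router to {neighbor: link_weight}. This plays the
    role of the network-wide view handed up by the dissemination plane.
    """
    dist = {source: 0}
    next_hop = {}
    visited = set()
    heap = [(0, source, None)]  # (distance, router, first hop on the path)
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in visited:
            continue  # stale entry left over from an earlier, longer path
        visited.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, weight in topology[node].items():
            nd = d + weight
            if nbr not in visited and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Neighbors of the source are their own first hop.
                heapq.heappush(heap, (nd, nbr, nbr if first is None else first))
    return next_hop


def decide(topology, attachments, denied_pairs):
    """Hypothetical decision-plane step.

    Returns {router: {"fib": {...}, "acl": [...]}}, the state that a
    dissemination plane would then install on each router.
    ASSUMPTION: a denied (src_host, dst_host) pair is enforced by a single
    filter on the router that src_host attaches to.
    """
    state = {r: {"fib": compute_fib(topology, r), "acl": []} for r in topology}
    for src, dst in sorted(denied_pairs):
        state[attachments[src]]["acl"].append(("deny", src, dst))
    return state


# Three routers; the direct A-C link is expensive, so A reaches C via B.
topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
attachments = {"h1": "A", "h2": "C"}  # hypothetical host-to-router map
state = decide(topology, attachments, {("h1", "h2")})

# After link A-B fails, the same logic is rerun on the updated view.
after_failure = {
    "A": {"C": 5},
    "B": {"C": 1},
    "C": {"A": 5, "B": 1},
}
state_after = decide(after_failure, attachments, {("h1", "h2")})
```

In this toy run the filter lands only on router A, and a link failure is handled by recomputing every FIB from the new view rather than by per-router reconvergence. A real design would also have to address the responsiveness and robustness challenges listed earlier.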