Rethinking the Design of the Internet: From Control Plane Architecture to Data Plane Bug Detection
T. S. Eugene Ng, Assistant Professor

The 2004 A. M. Turing Award Goes to...
• Bob Kahn and Vint Cerf
• "For pioneering work on internetworking, including the design and implementation of the Internet's basic communications protocols, TCP/IP, and for inspired leadership in networking."
• The only Turing Award given to date to recognize work in computer networking

But at the Same Time...
• This Monday: 110 attacks from China; 26 attacks from the USA
• 2/24/2008: YouTube traffic mis-routed to Pakistan
• Q2/2008: Thousands of Netherlands DSL customers lost service due to a network configuration error
• 11/10/2008: CTBC (Brazil) black-holed all Internet traffic in some parts of Brazil
• 2/16/2009: Supro (Czech) routing messages triggered a Cisco router bug world-wide
• 5/2009: AfNOG (Africa) routing messages triggered a buffer overflow in the open-source routing software Quagga world-wide
Source: Akamai Technologies, Inc.

Major Internet Milestones
• 1960-1964: The basic concept of "packet switching" was independently developed by Baran (RAND) and Kleinrock (MIT)
  – AT&T insisted that packet switching would never work!
• 1965: First time two computers talked to each other using packets (Roberts, MIT; Marill, SDC), over a dial-up line between the MIT TX-2 and the SDC Q-32

Major Internet Milestones
• 1968: The BBN group proposed to use Honeywell 516 mini-computers for the Interface Message Processors (i.e., packet switches)
• 1969: The first ARPANET message transmitted between UCLA (Kleinrock) and SRI (Engelbart)
  – We sent an "L", did you get the "L"? Yep!
  – We sent an "O", did you get the "O"? Yep!
  – We sent a "G", did you get the "G"? Crash!

Major Internet Milestones
• 1970: First packet radio network, ALOHANET (Abramson, U. Hawaii)
• 1973: Ethernet invented (Metcalfe, Xerox PARC)
• 1974: "A Protocol for Packet Network Intercommunication" published by Cerf and Kahn
  – First internetworking protocol, TCP
  – This paper was cited for their Turing Award
• 1977: First TCP operation over ARPANET, Packet Radio Net, and SATNET
• 1985: NSF commissions the NSFNET backbone
• 1991: NSF opens the Internet to commercial use

Design Philosophy of the DARPA Internet Protocols, by David D. Clark (1988)
1. Internet communication must continue despite loss of networks or gateways
2. The Internet must support multiple types of communications service
3. The Internet architecture must accommodate a variety of networks
4. The Internet architecture must permit distributed management of its resources
5. The Internet architecture must be cost effective
6. The Internet architecture must permit host attachment with a low level of effort
7. The resources used in the Internet architecture must be accountable

"Security and network management represent architectural gaps in today's Internet"
-- NSF FIND Observer Panel Report, 4/2009 (Cerf, Davie, Greenberg, Landau, Sincoskie)

The Internet's Enormous, Unwieldy Complexity Makes Network Management Hard
• Traffic processing behavioral complexity → Control Plane Architecture
  – Routers no longer just forward packets with "best effort"
  – New features are being shoehorned into the network
    • e.g., detecting and blocking unwanted traffic, balancing network and server load, enabling virtual private networking
  – Accidental inconsistencies and interactions can be catastrophic
• Failure mode complexity → Data Plane Bug Detection
  – Routers, like everything else, have bugs
  – They don't always stop upon failure
    • e.g., they mis-forward or mis-filter some packets
  – Such complex failure modes are hard to discover or diagnose
Internet Control & Management Today

State of the Art
• Management Plane: shell scripts, traffic engineering and planning tools, databases
  – Figure out what is happening in the network
  – Decide how to change it
  – Talks to the network through configs, SNMP, netflow, and modems
• Control Plane: OSPF, BGP, link metrics, routing policies
  – Multiple routing processes on each router
  – Each router configured with a different configuration program
  – Huge number of control knobs: OSPF metrics, ACLs, policy
• Data Plane: FIBs, packet filters
  – Distributed routers
  – Forwarding, filtering, queueing
  – Based on FIBs or labels

[The following slide repeats the State of the Art diagram with overlaid critiques; the overlay text is only partly recoverable: the management plane works from a stale view of the network while dynamics are everywhere, each router is configured by programs written in isolation, and policies can conflict.]

Circular Dependency: The Management Plane Depends on the Data Plane!
• The management plane reaches routers through ssh, SNMP, and netflow, which run over the IP service; that service in turn depends on the control plane, the forwarding tables, and the data plane (the I/O cards), the very things being managed

Better Way: An "Operating System" for a Network
• Idea 1: A BIOS for a network
  – Enables control and management communications
  – "Meta-Management System (MMS)"
• Idea 2: An Operating Platform for network controls
  – Network controls as applications
  – The Operating Platform provides services and protections
  – "Maestro"
• Logically centralized does not imply physically centralized

MMS: BIOS for a Network

Internet Hourglass Architecture
• Today, management apps sit above the same narrow waist as every other application: applications over TCP/UDP and IP, over the link-layer interfaces

A Better Architecture!
• Management-plane applications run on the MMS, which sits directly on the link-layer interfaces alongside IP; other applications continue to use TCP/UDP over IP

Want Plug-and-Play Ease of Use
• The MMS must provide a secure and robust management communication channel between MAs and network elements (NEs) as long as there is physical network connectivity
• Service is provided on power-up, with no complex manual configuration
• Independent of IP services
• Same familiar socket programming interface (see the sketch below)
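The talk promises a "familiar socket programming interface" without showing code, so the following Python sketch is only a guess at what a management application over MMS might look like; MMSSocket, mms_connect, and the node identifiers are hypothetical names, not the real MMS API.

```python
# Hypothetical illustration only: the MMS API is not shown in the talk, so
# MMSSocket, mms_connect, and the node IDs below are invented names. The
# point is that an MA can open a channel to an NE with socket-like calls,
# addressing it by an MA-chosen source route instead of an IP address.

class MMSSocket:
    """Socket-like handle for a secure MMS channel to one network element."""

    def __init__(self, source_route):
        self.source_route = source_route  # list of NE identifiers chosen by the MA

    def send(self, payload: bytes) -> None:
        print(f"sending {len(payload)} bytes along {self.source_route}")

    def recv(self) -> bytes:
        return b"interface counters: ..."


def mms_connect(source_route):
    """Open a management channel along an MA-chosen source route."""
    return MMSSocket(source_route)


if __name__ == "__main__":
    # Poll router R7 for status; no IP configuration is needed beforehand.
    channel = mms_connect(["R1", "R4", "R7"])
    channel.send(b"GET interface-counters")
    print(channel.recv().decode())
```

The only departure from ordinary sockets in this sketch is the addressing: the channel is named by a source route of NE identifiers rather than by an IP endpoint, matching the claim that MMS works even when IP is misconfigured or unavailable.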
Design Choices
• Design for "few MAs, many NEs"
• Design for MA <-> NE communications only
• Asymmetric crypto for authentication; symmetric crypto for securing data
  – Simple configuration, good data performance (studies showed that 60-70% of outages are caused by configuration errors)
  – An NE only needs the network certificate and its own public/private key pair
  – The MA needs its own key signed by the network certificate, plus the NE keys
• Onion-encrypted source routes
  – The MA is in control of routes
  – Misbehaving NEs only learn local connectivity
  – Misbehaving NEs can be bypassed easily with a different source route
• Malicious end hosts cannot talk to the MA
  – The MA doesn't have an IP address!
  – No DDoS

MMS Authentication & Secure Source Routing
• Step 0: The MA discovers directly connected NEs and issues an authentication challenge
• Step 1: First-hop nodes authenticate to the MA; a secure MMS zone is established
• Step 2: Secure default source routes are set up to the MA
• Step 3: The secure MMS zone discovers new neighbors
• Step 4: New neighbors authenticate to the MA via the secure zone
• Step 5: Secure default source routes are set up between the MA and the new neighbors
• Step 6: Alternate source routes are set up to the MA
• Step 7: Subsequent NEs are discovered and authenticated recursively
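The slides call for onion-encrypted source routes but do not spell out a message format. The Python sketch below shows only the layering idea, using a toy SHA-256-based XOR stream as a stand-in for the real symmetric cipher; the per-NE keys, the JSON header, and the "|" delimiter are illustrative assumptions, not the MMS wire format.

```python
# Minimal sketch of an onion-encrypted source route, assuming a toy keyed
# stream cipher (hashlib-based XOR) in place of a real symmetric cipher.
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


def build_onion(route, keys, payload: bytes) -> bytes:
    """Wrap payload so each NE on `route` learns only its own next hop.

    The MA encrypts from the last hop outward: every layer tells one NE
    where to forward the remainder, so a misbehaving NE sees nothing but
    its local neighbor.
    """
    blob = payload
    for hop, next_hop in reversed(list(zip(route, route[1:] + ["DELIVER"]))):
        layer = json.dumps({"next": next_hop}).encode() + b"|" + blob
        blob = keystream_xor(keys[hop], layer)
    return blob


def peel_layer(hop, keys, blob: bytes):
    """What NE `hop` does: decrypt its layer, read the next hop, forward the rest."""
    header, rest = keystream_xor(keys[hop], blob).split(b"|", 1)
    return json.loads(header)["next"], rest


if __name__ == "__main__":
    keys = {ne: ne.encode() * 4 for ne in ["R1", "R4", "R7"]}  # toy per-NE keys
    onion = build_onion(["R1", "R4", "R7"], keys, b"GET config")
    for ne in ["R1", "R4", "R7"]:
        nxt, onion = peel_layer(ne, keys, onion)
        print(ne, "forwards to", nxt)
    print("payload at destination:", onion)
```

Because each NE can peel only its own layer, a compromised NE learns nothing beyond its next hop, and the MA can route around it simply by building a fresh onion over a different source route.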
Performance: Channel Setup Time
[Figure: measured MMS channel setup times; the numbers are not recoverable from the slide text.]

MMS Latency and Bandwidth
[Figure: measured MMS latency.]
• Bandwidth: achieving 800 Mbps over 1 Gbps links

Maestro: Operating Platform for Network Controls

Fundamental Need for Control Component Collaboration (SLA Compliance Example)
• Routing
• Load balancing
• DDoS filtering

Control Component Collaboration is Tricky
• Pair-wise collaboration among components (routing protocol, IGP link weight optimization, packet filter configuration, MPLS VPN optimization) does not scale
• Lack of state consistency

Maestro Unified Network State Management
• Control logics (Logic 1, Logic 2, ..., Logic N) operate on virtual network states
• The virtual states are built from the underlying network states: environmental state, performance state, and computed state

Requirements
• Synchronized access to state
  – Granularity of locking
• Consistency of the input state of collaborating controls
  – Even when the underlying network state changes
• Maintaining a history of state
  – For trend analysis and incremental computations
• Extensible network state
  – Support new state associated with new network functions
• Extensible control logic
  – Programmatic, reusable, reconfigurable logic

Maestro Architecture Overview
• Control logics are organized into CLGs; each logic works on a local environment snapshot
• Transactional updates are applied to the global environment
• BSG drivers handle state dissemination to and from the physical network

Application to SLA Compliance
• DPC Coordination Protocol (logics running on Maestro, with a DPC driver toward the routers)
  – Regulates forwarding table changes
  – Ensures routers adopt consistent forwarding tables

CLG 1: Evaluates Acceptability of the Routing State on a Newly Observed Topology
• Activation: Connectivity (from the local environment)
• OSPF Routing Prediction
  – Inputs: Connectivity, TrafficDemandMatrix, ApprovedIntraDomainRoutingTable (from the local environment)
  – Output: PredictedIntraDomainRoutingTable (to the temporary environment)
• SLA Compliance Analysis
  – Input: PredictedIntraDomainRoutingTable (from the temporary environment)
  – Output: Null (to the temporary environment)
• Predict Access Control Configuration
  – Inputs: Connectivity, AccessControlPolicy, ApprovedAccessControlConfiguration (from the local environment); PredictedIntraDomainRoutingTable (from the temporary environment)
  – Output: PredictedAccessControlConfiguration (to the temporary environment)

CLG 2: Computes IGP Link Weights for Load Balance
• Activation: Connectivity (from the local environment)
• Compute or Select Precomputed OSPF Link Weights for Improved SLA Compliance
  – Inputs: Connectivity, TrafficDemandMatrix (from the local environment)
  – Output: OSPFLinkWeights (to the temporary environment)
• Terminal
  – Input: OSPFLinkWeights (from the temporary environment)
  – Output: OSPFLinkWeights (to the global environment)

Reduction in SLA Violations
[Figure: fraction of flows with SLA violations versus the SLA delay guarantee (1x to 4x the minimum propagation delay), comparing CONTRACT, Maestro, and no coordination.]

Even if Your Control Plane is Perfect, Bugs Can Still Haunt You...

Survey of Router Bugs
• Publicly reported router bugs: 500+ for Cisco (since 2000), 200+ for Quagga (since 2006), 200+ for XORP (since 2003)
• Cisco bug examples
  – Bugs cause packets to be sent to the wrong recipients
  – Bugs cause packet filters to stop working
  – Bugs cause legal packets to be dropped
• Source: http://www.cisco.com/en/US/products/products_security_advisories_listing.html

Router Bugs Cause Forwarding Loops
• Example: an enterprise network connecting HR, IT, Payroll, CS, ECE, and History departments to the Internet, where a buggy router creates a forwarding loop
• Bugs cause network applications to fail!

Router Bugs Stop Filters from Working
• Policy: "Block all traffic to Payroll coming from outside"
• A buggy router lets an outside packet through to Payroll anyway
• Bugs may create security loopholes!

What Is Needed for Error Detection?
• Control states (the FIBs of RA and RB, the filters on RA and RB), gathered by a control state collector, give a flow's intended trajectory (e.g., flow1: RA => RB, then dropped by a filter)
• A trajectory collector records the actual trajectory (e.g., flow1: RA => RB => RC, because the filter was bypassed)
• A trajectory error detector compares the two; an inconsistency means an error is detected (see the sketch below)
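To make the comparison on this slide concrete, here is a small Python sketch of the detection step, assuming toy FIB and filter tables keyed by flow name (the encoding is illustrative, not the system's actual data model): walk the intended trajectory from the collected control state, then check it against what the trajectory collector observed.

```python
# Sketch of trajectory error detection: intended trajectory from control
# state (FIBs and filters) versus the trajectory the collector observed.

def intended_trajectory(flow, fibs, filters, ingress):
    """Walk the FIBs from the ingress router; stop at a matching filter or at egress."""
    path, router = [], ingress
    while router is not None and router not in path:
        path.append(router)
        if flow in filters.get(router, set()):
            return path, "dropped"                 # the filter should discard the flow here
        router = fibs.get(router, {}).get(flow)    # next hop; None means egress
    return path, "delivered"


def check(flow, fibs, filters, ingress, observed):
    expected, _ = intended_trajectory(flow, fibs, filters, ingress)
    if observed != expected:
        print(f"{flow}: error detected, expected {expected}, observed {observed}")
    else:
        print(f"{flow}: trajectories consistent")


if __name__ == "__main__":
    fibs = {"RA": {"flow1": "RB"}, "RB": {"flow1": "RC"}}  # toy FIB entries
    filters = {"RB": {"flow1"}}        # flow1 must be filtered at RB
    # The collector saw the packet leave RB toward RC: a filter-bypass bug.
    check("flow1", fibs, filters, "RA", observed=["RA", "RB", "RC"])
```

The example reproduces the filter-bypass case on the slide: the intended trajectory ends at RB where the filter should drop flow1, but the observed trajectory continues to RC, so the detector flags an inconsistency.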
Naive Trajectory Monitoring
• Turn on the monitoring function on all routers and report every flow's trajectory (e.g., Flow1, Flow2) to the trajectory collector

Interesting Observation
• One trajectory error can be redundantly observed by multiple routers along the wrong trajectory
• How can this redundant monitoring be reduced?

Idea of Router Group Monitoring
• A router group is a set of routers forming a connected subgraph
• Only the periphery interfaces of a router group need to be monitored
• Pros: monitor more routers using fewer interfaces
• Cons: some errors may hide inside the group, and larger groups are not always better

Detection Rate with 1 Forwarding Misbehavior
[Figure: detection rate versus the size of the router group relative to the size of the network.]
• Router groups containing 50% of the network's nodes achieve a detection rate above 85%

Observations
• Detection rates of groups vary from roughly 30% to 100%
  – Random selection will not pick out the good ones
• Brute-force simulation to calculate detection rates is expensive
  – 1000 groups from the largest Rocketfuel topology: 9 hours
• We need a fast method to predict detection rates, so we develop a predictive model

It Depends on the Network Structure (Some Intuition)
• In a group whose internal topology contains loops, a mis-forwarded packet may or may not reach a monitored periphery interface, so the error may or may not be detected
• In a group with a tree topology, a mis-forwarded packet must exit through a periphery interface other than the intended one, so detection is guaranteed

Modeling Detection Rates of Groups
• A "divide and conquer" strategy: approximate the probabilities of the different misbehavior scenarios and the detection rate in each scenario
• Detection rate = Σ_k (p_k × d_k), where p_k is the probability of scenario k and d_k is its estimated detection rate

Example Scenarios
• The buggy router is an exiting router of the group
• The buggy router is not an exiting router, but it mis-forwards packets to a periphery interface in the first hop
• The buggy router is not an exiting router, and it mis-forwards packets to a router inside the group in the first hop, after which the packets are stuck in a loop

Average Prediction Error of the Model
• Prediction error = |predicted detection rate - real detection rate|
• Average prediction error across all topologies < 0.05
• Computation speedup varies from 20x to 160x
  – Computing detection rates for 1000 groups from RF-6: 6 minutes versus 9 hours

Router Group Selection Problem
• Resource constraint: monitor at most MaxIF interfaces
• Question: how do we select a set of router groups to monitor so that
  1) all trajectory errors are detected (correctness), and
  2) errors are detected as quickly as possible (performance)?

Sufficient Condition for Complete Coverage
• A set of router groups has complete coverage if every interface is a periphery interface of at least one selected router group

Router Group Selection Algorithm
• Input: (1) MaxIF and (2) a list of good groups predicted by the model (e.g., G1 ... G9)
• Output: a subset of groups that satisfies the sufficient condition, organized into m sets of router groups T1, T2, ..., Tm for m monitoring periods, where each Ti is a subset of the group candidates (e.g., T1: G2; T2: G4; T3: G3, G5, G7)
• Finding the smallest m is an NP-hard optimization problem (one possible greedy heuristic is sketched below)
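The talk states only that minimizing the number of monitoring periods m is NP-hard; the Python sketch below shows one plausible greedy heuristic, not the authors' algorithm. It picks groups set-cover style until every interface is the periphery interface of some selected group, then packs the chosen groups into periods of at most MaxIF interfaces each. Group names and interface sets are illustrative.

```python
# Sketch of a greedy group selection heuristic (an assumption, not the
# method from the talk): cover all interfaces, then pack into periods.

def select_periods(groups, all_interfaces, max_if):
    """Pick groups until every interface is a periphery interface of some
    selected group (the complete-coverage condition), then pack the chosen
    groups into monitoring periods of at most `max_if` interfaces each.

    `groups` maps a group name to its set of periphery interfaces; the
    sketch assumes every single group fits within MaxIF on its own.
    """
    uncovered, chosen = set(all_interfaces), []
    while uncovered:
        # Greedy set cover: take the group covering the most new interfaces.
        best = max(groups, key=lambda g: len(groups[g] & uncovered))
        if not groups[best] & uncovered:
            raise ValueError("remaining interfaces cannot be covered")
        chosen.append(best)
        uncovered -= groups[best]

    periods, current, load = [], [], 0
    for g in chosen:                      # first-fit packing under MaxIF
        need = len(groups[g])
        if current and load + need > max_if:
            periods.append(current)
            current, load = [], 0
        current.append(g)
        load += need
    if current:
        periods.append(current)
    return periods


if __name__ == "__main__":
    groups = {"G2": {"if1", "if2"}, "G3": {"if3"}, "G4": {"if2", "if4"}}
    print(select_periods(groups, {"if1", "if2", "if3", "if4"}, max_if=3))
```

Any schedule produced this way satisfies the sufficient condition above, since every interface ends up on the periphery of at least one monitored group; what the greedy choice cannot promise is the smallest possible number of periods, which is exactly the NP-hard part.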
Monitoring All Interfaces vs. Router Group Monitoring
• Monitoring all interfaces: every interface monitored simultaneously at a 1% sampling rate, over 100 monitoring periods
• Router group monitoring: vary the percentage of monitored interfaces (e.g., x%) and the sampling rate (e.g., (100/x)%)

Detection Speedup: Router Group Monitoring vs. Monitoring All Interfaces
• All errors are detected using router group monitoring
• Router group monitoring achieves a 2-4x detection speedup

Ongoing Work
• Control Plane
  – Maestro for OpenFlow switches
  – Inferring network-wide congestion from data packets
  – Better architectures for data center networks
    • Measuring Amazon EC2 performance
    • Integrating optical circuit switching into the data center
    • Optimizing parallel virtual machine migration
• Data Plane
  – Fast trajectory verification with an efficiently shared data structure for router state
  – Identifying false detections
  – Localizing the faulty router
  – A system prototype compatible with Juniper's JUNOS
• Many thanks to my graduate students Zheng Cai, Florin Dinu, Guohui Wang, Bo Zhang, and Jie Zheng
  – Talk to them at the Poster Session!