Internet Routing (COS 598A)
Today: Router Software
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm

Outline
• Continuing discussion from last class
  – Proposals for removing routing from routers
  – Feasibility, collecting data, computing paths, etc.
• BGP implementation
  – Storage overhead
  – CPU overhead
• Recent proposals
  – Graceful restart to limit effects of resets
  – Tunneling to limit hot-potato changes
  – Computing routes for groups of routers

Proposal #1: Routing As a Service
• Goal: third parties pick end-to-end paths for clients to satisfy diverse user objectives
• Forwarding infrastructure
  – Basic routing (e.g., default routing)
  – Primitives for inserting routes
• Route selector
  – Aggregates network information
  – Selects routes on behalf of clients
  – Competes with other selectors for customers
• End host
  – Queries route selector to set up paths

Proposal #2: Routing Control Platform
• Goal: Move beyond today’s artifacts, while remaining compatible with the legacy routers
• Incentive compatibility: phased evolution
  – Intelligent route reflector in a single AS
  – Learning eBGP routes directly from neighbor ASes
  – Interdomain routing between RCPs
• Backwards compatibility: internal BGP
  – Using iBGP to “push” answers to the routers
  – No need at all to change the legacy routers
  – Keep message format and change decision rules
[Figure: RCPs in AS 1, AS 2, and AS 3; labels: eBGP, Inter-AS Protocol, iBGP, physical peering]

Proposal #3: Wafer-Thin Control Plane
• Goal: Refactor the data, control, and management planes from scratch
• Management plane → Decision plane
  – Operates on network-wide view and objectives
  – Directly controls the data plane in real time
• Control plane → Discovery plane
  – Responsible for providing the network-wide view
  – Topology discovery, traffic measurement, etc.
• Data plane
  – Queues, filters, and forwards data packets
  – Accepts direct instructions from the decision plane
Simple routers that have no control-plane configuration

How Do These Differ From Overlays?
• Overlays: circumventing the underlay
  – Host nodes throughout the network
  – Logical links between the host nodes
  – Active probes to observe the performance
  – Direct packets through good intermediate nodes
• Routing services: controlling the underlay
  – Servers collect data directly from the routers
  – Servers compute forwarding tables for the routers
  – Data packets do not go through the servers
  – Like an overlay for managing the underlay
Maybe some combination of the two makes sense?

Practical Issues: Feasibility
• Fast reaction to failures
  – Routers are closer to the failures
  – Can a service react quickly enough?
• Scalability with network size
  – State and computation grow with the topology
  – Can a service manage a large network?
• Reliability?
  – Service is now a point of failure
  – Is simple replication enough?
• Security?
  – Service is now a natural point of attack
  – Easier (or harder) to protect than the routers?

Practical Issues: Collecting Measurement Data
• All three proposals make measurement a first-order part of running the network
• Routers have only two jobs
  – Forward packets
  – Collect measurement data
• What measurements?
  – Topology discovery
  – Traffic demands
  – Performance statistics
  – …?

Practical Issues: Path-Computation Algorithms
• Selecting routes should be easier
  – Complete view of network topology and traffic
  – Possibility of using centralized algorithms
  – Direct control over forwarding tables
• … but what algorithms to use?
  – Still need a separation of timescales, but how?
    • Fast reaction to topological changes
    • Semi-offline optimization of routing
• … and how to compute end-to-end paths?
  – Policy-based path vector protocol?
  – Publish/subscribe system?
  – Something else?

Practical Issues: Solving Real Problems?
• Customer load-balancing
  – Trading off load, performance, and cost
  – Controlling inbound and outbound traffic
  – Avoiding small subnets and BGP tweaks
• Preventing overload of router resources
  – Minimum-sized forwarding table per router
  – Minimum stretch while obeying memory limits
• Flexible end-to-end path selection
  – Satisfy the goals of end users and providers
  – Handle pricing/economics in the right way
Other Thoughts?

Router Software

Basic BGP Implementation
[Figure: RIB-in-1 … RIB-in-n, each followed by an Import step, feed the RIB and its decision process; Export steps then fill RIB-out-1 … RIB-out-n]

Storage Overhead: RIB-In
• Storing routes learned from each neighbor
  – Before applying the import policy
• Advantages of keeping a RIB-In
  – Verify receipt of routes that have been filtered
  – Use as input to simulate import-policy changes
  – Apply new policies directly on local RIB-In
• Alternatives to keeping a RIB-In
  – Reset the session after any policy change
    • Undesirable, unless policy changes are infrequent
  – Route-refresh option to signal neighbor to resend
    • Relatively new feature, so not universally supported

Storage Overhead: Main RIB
• Storing all candidate routes
  – All routes after import processing
  – Keep track of the best route for each prefix
• Advantages
  – Necessary to store at least one copy of each route
  – … since BGP is an incremental protocol
• Alternatives
  – Store only the RIB-In for each neighbor
    • Requires rerunning import policies for each decision

Storage Overhead: RIB-Out
• Storing routes sent to each neighbor
  – After applying the export policy
• Advantages of keeping a RIB-Out
  – Verify sending of route to the neighbor
  – Compare routes to suppress unnecessary updates
    • No update message if all attributes are the same
    • No withdrawal message if there was no advertisement
• Alternatives to keeping a RIB-Out
  – Reapply export policy to recompute the route
    • … or send some unnecessary update messages
  – Single RIB-Out per export policy (peer groups)

BGP Peer Groups
• Group of
BGP neighbors with same policies
  – Avoid repetitive configuration
  – Avoid reapplying the same policy
  – Avoid duplicating the storage
• Example iBGP peer groups
  – Route-reflector clients
  – Route-reflector peers
• Example eBGP peer groups
  – Customers
  – Peers

CPU Overhead: New BGP Update Message
• When receiving a new BGP update
  – Apply import policy and update the RIB
  – Re-run the BGP decision process for this prefix
  – If best route changes, apply export policies and send update message to affected neighbors
• Running decision process
  – Ideally, just compare with the best route
    • Withdraw non-best route: no change
    • Update non-best route: compare to current best
  – But, BGP does not always form a total ordering
    • MED attribute compared only for same next-hop AS
    • Re-run decision process for deterministic outcome

CPU Overhead: Events that Amplify Work
• BGP session failure
  – Must discard all routes learned from this neighbor
  – … and run decision process for affected prefixes
• Policy change
  – Must apply the new routing policy to all routes learned from (or sent to) this neighbor
  – … and run decision process for affected prefixes
• Intradomain change
  – Must revisit BGP decision for affected prefixes
    • Exclude routes with unreachable next-hop
    • Prefer the route with the closest egress point

CPU Overhead: Deferring Heavy Jobs
• Event-driven approach
  – Process most events as they occur
  – Defer heavy-load items to background task
  – Make sure these tasks can run soon
  – Example: XORP handling session failures
• Timer-driven approach
  – Periodic timer driving the operation
  – Scan the data structures when the timer expires
  – … and identify and perform any needed work
  – Example: Cisco scan timer for IGP changes

Reducing Overhead: Operational Practices
• Avoiding RIB-In storage
  – Configuring router not to store RIB-In
  – Convincing neighbors to support route-refresh
• Configuring peer groups
  – Limiting the number of unique export policies
    • … or limiting the number of these per
router
  – Putting all possible sessions in same peer group
• Selecting good timer settings
  – Allow grouping of update messages
  – Avoid false detection of session failures

Reducing the Effects of Session Failures
• Separating control from data
  – Suppose a router’s BGP process fails
  – … but the data plane is just fine
[Figure: a RIB and a FIB on each of two routers; data path between the FIBs]
• When the neighbor’s BGP process fails
  – Do not delete routes learned from neighbor
  – Continue to forward data packets
• When the neighbor’s process restarts
  – Refresh the neighbor by re-sending BGP routes
  – Neighbor re-builds its RIB and goes back to normal
• BGP “Graceful Restart” mechanism
  – New BGP capability for neighbors to negotiate
  – Mark routes from the neighbor as “stale”
  – Refresh by resending RIB-Out with End-of-RIB marker

Reducing the Effects of IGP Changes
• Circumvent hot-potato routing
  – Avoid small IGP changes leading to BGP changes
  – … and avoid the software overhead on BGP
• Tunneling between edge routers
  – Create tunnel from ingress to egress router
  – Assign a weight to the tunnel (e.g., air miles)
  – Tunnel weight does not depend on IGP path
[Figure: example topology with routers A through G, weighted links, and destination dst]

Reducing Overhead for Groups of Routers
• Additional overhead in RCP-like approaches
  – Computing routes on behalf of many routers
  – Could lead to a linear increase in overhead
• Store a single copy of each BGP route
  – One big global RIB for the network
  – Plus, avoid repeating some of the decision process
• Compute for groups of routers (e.g., PoP)
  – One shared RIB-out for each group of routers
  – Plus, avoid repeating the decision process
• Reduce the overhead of IGP changes
  – E.g., by using tunnels, as on the previous slide

Conclusion
• Router software
  – Very challenging systems problem
  – New open-source software (Quagga, OpenBGPD)
• Improving scalability
  – Scaling with # of routers, sessions, and prefixes
  – Trading off memory and CPU resources
  – Avoiding events that create excessive work
• Newly active research area
  – Importance of control plane
in network performance, reliability, and security
  – Creation of new platforms for router software

Next Time: BGP Security
• Two papers
  – “Beware of BGP Attacks”
  – “Secure Border Gateway Protocol (Secure-BGP)”
• Review only the second paper
  – Summary
  – Why accept
  – Why reject
  – Future work
• Optional NANOG video
  – See the Web site later today…
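
Sketch: Update Handling with RIB-In, RIB, and RIB-Out
The steps on the "CPU Overhead: New BGP Update Message" and "Storage Overhead: RIB-Out" slides can be sketched roughly as follows. This is a minimal illustration, not how any particular router implements BGP. The names `Route`, `Speaker`, `best_route`, and `export`, the AS number 65000, and the three-step decision rule are all assumptions made for the example. It shows the raw route stored in the RIB-In, the import policy applied, the decision re-run for one prefix, and the RIB-Out consulted so that identical updates and needless withdrawals are not sent.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    neighbor: str                     # session the route was learned from
    local_pref: int = 100
    as_path: Tuple[int, ...] = ()


def best_route(candidates: List[Route]) -> Optional[Route]:
    # Simplified decision (assumption): highest local-pref, then shortest
    # AS path, then lowest neighbor name as a tie-breaker.
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path), r.neighbor))


class Speaker:
    def __init__(self, neighbors: List[str],
                 import_policy: Callable[[Route], Optional[Route]],
                 export_policy: Callable[[str, Route], Optional[Route]]):
        self.neighbors = list(neighbors)
        self.import_policy = import_policy
        self.export_policy = export_policy
        self.rib_in: Dict[str, Dict[str, Route]] = {n: {} for n in neighbors}
        self.rib: Dict[str, Dict[str, Route]] = {}     # prefix -> neighbor -> route
        self.best: Dict[str, Route] = {}
        self.rib_out: Dict[str, Dict[str, Route]] = {n: {} for n in neighbors}
        self.sent: List[Tuple[str, str, str]] = []     # (neighbor, kind, prefix)

    def receive(self, neighbor: str, prefix: str, route: Optional[Route]) -> None:
        """Handle one update; route=None models a withdrawal."""
        if route is None:
            self.rib_in[neighbor].pop(prefix, None)
            imported = None
        else:
            self.rib_in[neighbor][prefix] = route      # stored before import policy
            imported = self.import_policy(route)       # None means filtered
        candidates = self.rib.setdefault(prefix, {})
        if imported is None:
            candidates.pop(neighbor, None)
        else:
            candidates[neighbor] = imported
        new_best = best_route(list(candidates.values()))
        if new_best == self.best.get(prefix):
            return                                     # best route unchanged: no export work
        if new_best is None:
            self.best.pop(prefix, None)
        else:
            self.best[prefix] = new_best
        self._export(prefix, new_best)

    def _export(self, prefix: str, best: Optional[Route]) -> None:
        for n in self.neighbors:
            out = None
            # Simplification (assumption): never advertise a route back to
            # the session it was learned from.
            if best is not None and best.neighbor != n:
                out = self.export_policy(n, best)
            previous = self.rib_out[n].get(prefix)
            if out == previous:
                continue   # same attributes, or nothing was ever advertised
            if out is None:
                del self.rib_out[n][prefix]
                self.sent.append((n, "withdraw", prefix))
            else:
                self.rib_out[n][prefix] = out
                self.sent.append((n, "update", prefix))


def export(neighbor: str, route: Route) -> Optional[Route]:
    # Hypothetical export policy: send nothing to C; otherwise hide
    # local-pref and prepend our own (made-up) AS number 65000.
    if neighbor == "C":
        return None
    return replace(route, neighbor="local", local_pref=100,
                   as_path=(65000,) + route.as_path)


s = Speaker(["A", "B", "C"], import_policy=lambda r: r, export_policy=export)
p = "10.0.0.0/8"
s.receive("A", p, Route(p, "A", 100, (1, 2)))   # B gets an update
s.receive("A", p, Route(p, "A", 200, (1, 2)))   # best changes, exported route does not
s.receive("A", p, None)                         # B gets a withdrawal; C was never told
print(s.sent)
```

The second update changes the best route, but only in an attribute the export policy hides, so the comparison against the RIB-Out suppresses the message. Without a RIB-Out, the router would have to either recompute what it sent earlier or send the redundant update.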
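
Sketch: Why MED Breaks the Total Ordering
The "CPU Overhead: New BGP Update Message" slide notes that MED is compared only between routes from the same next-hop AS, so comparing a new route against the current best alone is not always safe. The sketch below uses a made-up three-route example to show this. The names `Route`, `prefer`, `incremental_best`, and `full_decision` and all attribute values are assumptions for illustration, and the decision is reduced to two steps (MED within an AS, then IGP cost). Pairwise preference forms a cycle, so the incremental answer depends on arrival order, while re-running the whole decision gives one answer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    name: str
    neighbor_as: int
    med: int
    igp_cost: int    # distance to the egress point


def prefer(a: Route, b: Route) -> Route:
    # Pairwise comparison: MED only counts when both routes come from the
    # same neighbor AS; otherwise fall through to IGP cost.
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def incremental_best(arrivals: List[Route]) -> Route:
    # "Just compare with the best route" as each update arrives.
    best = arrivals[0]
    for route in arrivals[1:]:
        best = prefer(best, route)
    return best


def full_decision(routes: List[Route]) -> Route:
    # Re-run the whole decision: keep the lowest-MED routes within each
    # neighbor AS, then pick the closest egress among the survivors.
    lowest: Dict[int, int] = {}
    for r in routes:
        lowest[r.neighbor_as] = min(lowest.get(r.neighbor_as, r.med), r.med)
    survivors = [r for r in routes if r.med == lowest[r.neighbor_as]]
    return min(survivors, key=lambda r: (r.igp_cost, r.name))


a = Route("A", neighbor_as=1, med=20, igp_cost=1)
b = Route("B", neighbor_as=2, med=0, igp_cost=2)
c = Route("C", neighbor_as=1, med=10, igp_cost=3)

# A beats B (IGP), B beats C (IGP), C beats A (MED): a preference cycle.
for order in ([a, b, c], [b, c, a], [c, a, b]):
    names = "".join(r.name for r in order)
    print(names, "incremental:", incremental_best(order).name,
          "full:", full_decision(order).name)
```

In this example the incremental comparison picks a different route for each arrival order, and the full decision always picks B. That difference is what re-running the decision process buys, at the cost of extra CPU per update.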