Lecture 14: Inter-domain Routing Stability
CS 268 class, March 8th, 2004
(slides from Timothy Griffin’s tutorial and Craig Labovitz’s NANOG talk)

Outline of Today’s Class
• An Introduction to BGP
• BGP and the Stable Paths Problem
• Convergence of BGP in the real world
• Conclusions and Open Issues

Inter-domain Routing Basics
• The Internet is composed of over 16,000 autonomous systems (ASes)
• BGP = Border Gateway Protocol
  – Is a policy-based routing protocol
  – Is the de facto inter-domain routing protocol of today’s global Internet
• Relatively simple protocol, but configuration is complex, and the entire world can see, and be impacted by, your mistakes.

BGP Operations (Simplified)
[Figure: AS1 and AS2 connected by a BGP session]
• Establish session on TCP port 179
• Exchange all active routes
• Exchange incremental updates
• While the connection is ALIVE, exchange route UPDATE messages

Four Types of BGP Messages
• Open: establishes a peering session.
• Keep Alive: handshake at regular intervals.
• Notification: shuts down a peering session.
• Update: announces new routes or withdraws previously announced routes.
announcement = prefix + attribute values

Two Types of BGP Neighbor Relationships
• External neighbor (eBGP): in a different autonomous system
• Internal neighbor (iBGP): in the same autonomous system
• iBGP is routed (using an IGP!)
[Figure: AS1 and AS2 joined by an eBGP session, with iBGP sessions inside an AS]

iBGP Peers Must be Fully Meshed
[Figure: an eBGP update arriving at a border router and being passed on as iBGP updates]
• iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors.
• iBGP is needed to avoid routing loops within an AS
• Injecting external routes into the IGP does not scale and causes BGP policy information to be lost
• BGP does not provide “shortest path” routing
• Is iBGP an IGP? NO!

Important BGP Attributes
• LocalPREF
  – Local preference policy to choose the “most” preferred route
• Multi-exit Discriminator (MED)
  – Which peering point to choose?
• Import Rules
  – Which route advertisements do I accept?
• Export Rules
  – Which routes do I forward to whom?
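The full-mesh requirement follows from the rule that a route received over iBGP is not re-announced to other iBGP neighbors. The sketch below illustrates that rule only; the router names `r1` to `r4` and the helper functions are hypothetical and are not part of the lecture.

```python
from itertools import combinations


def full_mesh(routers):
    """All iBGP sessions needed for a full mesh: n*(n-1)/2 of them."""
    return {frozenset(pair) for pair in combinations(routers, 2)}


def routers_that_learn(sessions, border):
    """Routers that learn a route which `border` received over eBGP.

    `border` announces the route on each of its iBGP sessions. Receivers do
    NOT pass it on over iBGP, so only direct iBGP neighbors hear about it.
    """
    learned = {border}
    for session in sessions:
        if border in session:
            learned |= session
    return learned


routers = ["r1", "r2", "r3", "r4"]  # hypothetical routers inside one AS
mesh = full_mesh(routers)
chain = {frozenset(p) for p in zip(routers, routers[1:])}  # r1-r2-r3-r4

print(len(mesh))                                # 6 sessions for 4 routers
print(sorted(routers_that_learn(mesh, "r1")))   # every router learns the route
print(sorted(routers_that_learn(chain, "r1")))  # only r1 and r2 learn it
```

With the chain of sessions, r3 and r4 never hear the route, which is why the sessions have to form a full mesh.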
Route Selection Summary
Candidate routes are compared in the following order:
1. Highest Local Preference (enforce relationships)
2. Shortest ASPATH
3. Lowest MED
4. i-BGP < e-BGP
5. Lowest IGP cost to BGP egress
6. Lowest router ID (throw up hands and break ties)
Steps 2 to 5 are the ones used for traffic engineering.

Implementing Customer/Provider and Peer/Peer Relationships
Two parts:
• Enforce transit relationships
  – Outbound route filtering
• Enforce order of route preference
  – provider < peer < customer

Import Routes
[Figure: routes received from a provider, from a peer, or from a customer are classified as provider routes, peer routes, or customer routes, alongside the ISP’s own routes]

Export Routes
[Figure: which of the provider, peer, customer, and ISP routes are announced to providers, to peers, and to customers; filters block some of these exports]

What Problem is BGP Solving?
Underlying problem    Distributed means of computing a solution
Shortest Paths        RIP, OSPF, IS-IS
X?                    BGP

Having an X can
• aid in the design of policy analysis algorithms and heuristics,
• aid in the analysis and design of BGP and extensions,
• help explain some BGP routing anomalies,
• provide a fun way of thinking about the protocol (this talk)

Q: How simple can X get?
A: The Stable Paths Problem (SPP)

An instance of the SPP:
• A graph of nodes and edges,
• Node 0, called the origin,
• For each non-zero node, a set of permitted paths to the origin. This set always contains the “null path”.
• A ranking of permitted paths at each node. The null path is least preferred. (Not always shown in the diagram.)
[Figure: example SPP instance. Permitted paths, listed from most preferred to least preferred (null path not shown): node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 0); node 4: (4 2 0), (4 3 0); node 5: (5 2 1 0)]
When modeling BGP: nodes represent BGP-speaking border routers, and 0 represents a node originating some address block.
Yes, the translation gets messy!
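An SPP instance as defined above can be written down as a small data structure. The sketch below encodes the lecture's example, with paths as tuples of node numbers. It rests on assumptions that are not in the slides: the names (`PERMITTED`, `rank`, and so on) are illustrative, and the edge set is taken to be just the hops that occur in some permitted path, since the figure itself is not available.

```python
ORIGIN = 0
NULL_PATH = ()  # the "null path": no route to the origin

# Permitted paths per node, most preferred first, as read from the example
# figure. The null path is implicit and always ranked last.
PERMITTED = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
    5: [(5, 2, 1, 0)],
}


def edges_used(permitted):
    """Edges implied by the permitted paths (an assumed edge set)."""
    return {
        frozenset(hop)
        for paths in permitted.values()
        for path in paths
        for hop in zip(path, path[1:])
    }


def is_well_formed(permitted, origin=ORIGIN):
    """Every permitted path starts at its node, ends at the origin, and
    visits no node twice."""
    return all(
        path[0] == node and path[-1] == origin and len(set(path)) == len(path)
        for node, paths in permitted.items()
        for path in paths
    )


def rank(permitted, node, path):
    """0 is most preferred; the null path ranks below every permitted path."""
    if path == NULL_PATH:
        return len(permitted[node])
    return permitted[node].index(path)


print(is_well_formed(PERMITTED))
print(rank(PERMITTED, 4, (4, 3, 0)))  # second choice at node 4, so rank 1
```

Keeping each node's paths in a ranked list means the ranking needs no separate table: a path's position in the list is its rank.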
A Solution to a Stable Paths Problem
A solution is an assignment of permitted paths to each node such that
• node u’s assigned path is either the null path or is a path uwP, where wP is assigned to node w and {u,w} is an edge in the graph,
• each node is assigned the highest ranked path among those consistent with the paths assigned to its neighbors.
A solution need not represent a shortest path tree, or a spanning tree.
[Figure: the example instance (node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 0); node 4: (4 2 0), (4 3 0); node 5: (5 2 1 0)) with a solution highlighted]

A Stable Paths Problem May Have Multiple Solutions
[Figure: DISAGREE. Node 1 ranks (1 2 0) above (1 0); node 2 ranks (2 1 0) above (2 0). Two solutions exist: node 1 takes (1 2 0) while node 2 takes (2 0), or node 1 takes (1 0) while node 2 takes (2 1 0).]

Multiple sets of BGP routing policies can map down to the same Stable Paths Problem:

DISAGREE in RPSL (Version I)
AS1:
  import: from AS2 action pref = 0; accept ANY;
          from AS0 action pref = 10; accept ANY;
  export: to AS2 announce ANY;
AS0:
  export: to AS1, AS2 announce AS0;
AS2:
  import: from AS1 action pref = 0; accept ANY;
          from AS0 action pref = 10; accept ANY;
  export: to AS1 announce ANY;

DISAGREE in RPSL (Version II)
AS1:
  import: from AS-ANY action pref = 0; accept community.contains(1:1);
          from AS-ANY action pref = 10; accept ANY;
  export: to AS2 announce ANY;
AS0:
  export: to AS1 set community.append(2:1); announce AS0;
          to AS2 set community.append(1:1); announce AS0
AS2:
  import: from AS-ANY action pref = 0; accept community.contains(2:1);
          from AS-ANY action pref = 10; accept ANY;
  export: to AS1 announce ANY;
Assume AS1 and AS2 use the “neighbor send-community” command ….
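The definition of a solution can be checked mechanically on small instances. The sketch below is a brute-force enumerator, not a method from the lecture: it tries every assignment of a permitted path (or the null path) to each node and keeps those in which every node holds its best consistent choice. The function names are illustrative, and a path uwP counts as consistent when node w is currently assigned wP.

```python
from itertools import product

NULL_PATH = ()  # the null path


def best_choice(node, permitted, assignment):
    """Highest ranked permitted path of `node` whose tail equals the path
    currently assigned to its next hop; the null path if there is none."""
    for path in permitted[node]:  # paths are listed most preferred first
        if assignment[path[1]] == path[1:]:
            return path
    return NULL_PATH


def solutions(permitted, origin=0):
    """All assignments in which every node holds its best consistent path."""
    nodes = sorted(permitted)
    options = [list(permitted[u]) + [NULL_PATH] for u in nodes]
    found = []
    for combo in product(*options):
        assignment = dict(zip(nodes, combo))
        assignment[origin] = (origin,)  # the origin's path is just itself
        if all(best_choice(u, permitted, assignment) == assignment[u]
               for u in nodes):
            found.append({u: assignment[u] for u in nodes})
    return found


# DISAGREE: each node prefers to go through the other one.
DISAGREE = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}

# The five-node example instance, with paths as read from the figure.
EXAMPLE = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
    5: [(5, 2, 1, 0)],
}

for solution in solutions(DISAGREE):
    print(solution)
print(solutions(EXAMPLE))
```

On DISAGREE the enumerator finds exactly two solutions. On the five-node example it finds one, in which node 5 is left with the null path and node 1 does not use its direct link, so the solution is neither a spanning tree nor a shortest path tree. The search is exponential in the number of nodes and is only meant for toy instances.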
DISAGREE in RPSL (Version III)
AS1:
  import: from AS-ANY accept ANY;
  export: to AS2 announce ANY;
AS0:
  export: to AS1 action aspath.prepend(AS0, AS0, AS0); announce AS0;
          to AS2 announce AS0
AS2:
  import: from AS1 action pref = 0; accept ANY;
          from AS0 action pref = 10; accept ANY;
  export: to AS1 announce ANY;
The interaction of all BGP policies is directly represented in SPP.

Multiple Solutions Can Result in “Route Triggering”
[Figure: three panels showing nodes 0 to 3 joined by a primary link and a backup link: the initial routing, the routing after the primary link is removed, and the routing after the primary link is restored]

SPP Helps Explain Possibility of BGP Divergence
• BGP is not guaranteed to converge to a stable routing. Policy inconsistencies can lead to “livelock” protocol oscillations.
• See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan, R. Govindan, and D. Estrin. ISI report, 1996

The SPP View
[Figure: SPP instances classified as “must converge”, “can diverge”, or “must diverge”. Every instance that must converge is solvable, and some solvable instances can still diverge.]

BAD GADGET: No Solution
With a BGP-like protocol, each node will do the best it can, so at least one node will always have the opportunity to improve its path. Result: persistent oscillation.
[Figure: BAD GADGET. Permitted paths, most preferred first: node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 4 2 0), (3 0); node 4: (4 2 0), (4 3 0)]

SURPRISE: Beware of Backup Policies
BGP is not robust: it is not guaranteed to recover from network failures.
[Figure: SURPRISE. Permitted paths, most preferred first: node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 4 2 0), (3 0); node 4: (4 0), (4 2 0), (4 3 0)]
Becomes BAD GADGET if link (4, 0) goes down.

PRECARIOUS: Has a Solution, but Can Get “Trapped”
[Figure: PRECARIOUS. Permitted paths, most preferred first: node 1: (1 2 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 1 0), (3 1 2 0); node 4: (4 3 1 0), (4 5 3 1 2 0), (4 3 1 2 0); node 5: (5 3 1 0), (5 6 3 1 2 0), (5 3 1 2 0); node 6: (6 3 1 0), (6 4 3 1 2 0), (6 3 1 2 0)]
• The part formed by nodes 3, 4, 5, and 6 has a solution only when node 1 is assigned the direct path (1 0).
• As with DISAGREE, the part formed by nodes 0, 1, and 2 has two distinct solutions.

What Is to Be Done?
Static approach:
• Automated analysis of routing policies (this is very hard)
• Inter-AS coordination
Dynamic approach:
• Extend BGP with a dynamic means of detecting and suppressing policy-based oscillations?
These approaches are complementary.

Research Papers on SPP
• “An Analysis of BGP Convergence Properties”, Timothy G. Griffin, Gordon Wilfong, SIGCOMM ’99. Models BGP and shows that static analysis is hard.
• “Policy Disputes in Path Vector Protocols”, Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong, ICNP ’99. Defines the Stable Paths Problem and develops a sufficient condition for “sanity”.
• “A Safe Path Vector Protocol”, Timothy G. Griffin, Gordon Wilfong, INFOCOM ’00. Dynamic solution based on histories.
• “Stable Internet Routing without Global Coordination”, Lixin Gao, Jennifer Rexford, SIGMETRICS ’00. Shows that if certain guidelines are followed, then all is well. Rule: do not forward route advertisements from peers or providers to other peers or providers.

Convergence in the Real World?
• [Labovitz99] Experimental results from a two-year study which measured 150,000 BGP faults injected into peering sessions at several IXPs
• Found
  – The Internet averages 3 minutes to converge after failover
  – Some multihomed failovers (short to long ASPath) require 15 minutes

Problems with Distance Vector
• Distance vector protocols (e.g. RIP) suffer from routing table loops
  – Counting-to-infinity
  – Routing table loops
  – Bouncing problem
• BGP uses path vector to “solve” problems seen with RIP and other Bellman-Ford derived protocols

Counting to Infinity
[Figure: routers A and B and destination R, with link costs 2 and 1; successive panels show the metric climbing: 1+2=3, 2+3=5, 5+2=7, 7+2=9]

Taming Infinity
• The Routing Information Protocol (RIP) solved the counting-to-infinity problem by re-defining infinity.
  – Added speedups: poison reverse, split horizon, triggered updates.
  – Strictly increasing, O(N)
• ASPath limits “infinity” to the width of the Internet (an ASPath through all your neighbors)
  – Monotonically increasing
  – Upper bound?
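Counting to infinity, and the effect of re-defining infinity, can be reproduced with a few lines of simulation. This is a minimal sketch under assumptions that differ from the lecture's figure: two routers A and B joined by a unit-cost link, B has just lost its own route to R, neither router uses split horizon or poison reverse, and infinity is 16 as in RIP.

```python
RIP_INFINITY = 16  # in RIP a metric of 16 means "unreachable"


def count_to_infinity(a_metric=2, link_cost=1, infinity=RIP_INFINITY):
    """Trace the metrics A and B hold for R after B loses its route to R.

    A still advertises its stale metric (learned through B), so B adopts
    it, A then re-learns the worse metric from B, and so on until both
    routers reach `infinity`.
    """
    a, b = a_metric, infinity
    trace = []
    while a < infinity or b < infinity:
        b = min(a + link_cost, infinity)  # B believes A's stale advertisement
        trace.append(("B", b))
        a = min(b + link_cost, infinity)  # A's next hop is B, so A follows B
        trace.append(("A", a))
    return trace


for router, metric in count_to_infinity():
    print(router, metric)
```

The metric only ever grows, and the number of exchanges is bounded by the value chosen for infinity. That is the sense in which RIP's counting is “strictly increasing, O(N)”.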
BGP Convergence Example
[Figure: routing tables for destination R at AS0, AS1, and AS2, each of which reaches R through AS3. For example, AS0 holds
  *B R via AS3
   B R via AS1,AS3
   B R via AS2,AS3
and AS1 holds the corresponding entries via AS3, via AS0,AS3, and via AS2,AS3.]

N > 4?
[Figure: a larger topology, with each AS labeled with its ASPath to AS237:
  AS237:  237
  AS2914: 2914 237
  AS5000: 5000 237
  AS5696: 5696 237
  AS6113: 6113 2914 237
  AS1:    1 5696 237
  AS1239: 1239 5696 237
  AS1673: 1673 5696 237
  AS2497: 2497 5696 237
  AS6461: 6461 5696 237
  AS701:  701 6461 5696 237
  AS6453: 6453 1239 5696 237]

The Problem with BGP
• If we assume
  1. unbounded delay on BGP processing and propagation,
  2. a full mesh of BGP peers,
  3. a constrained shortest-path-first selection algorithm,
  then there exists an ordering of messages such that BGP will explore all possible ASPaths of all possible lengths
• BGP is O(N!), where N is the number of default-free BGP speakers

BGP and RIP
• RIP is strictly monotonically increasing. It can explore metrics (1…N)
• BGP is monotonically increasing. There are multiple (N!) ways to represent a path metric of N.
[Figure: example ASPaths of increasing length:
  2117
  5696 2129 2117
  1 5696 2129 2117
  2041 3508 3508 4540 7037 1239 5696 2129 2117
  1 2041 3508 3508 4540 7037 1239 5696 2129 2117
  2041 3508 3508 4540 7037 1239 6113 5696 2129 2117
  1 2041 3508 3508 4540 7037 1239 6113 5696 2129]
• BGP “solved” the RIP routing table loop problem by making it exponentially worse…

BGP Best Case
What is the best we can expect from BGP?
• Implementation of the MinRouteAdver timer leads to 30-second rounds
• Time complexity is O(n-3) * 30 seconds
• State/computational complexity is O(n)
• At its best, BGP performs as well as RIP2 (but uses exponentially more memory in the process)

MinRouteAdver
• Minimum interval between successive updates sent to a peer for a given prefix
  – Allows for greater efficiency/packing of updates
  – Rate throttle
• Applied only to announcements (at least according to the BGP RFC)
• Applied on a (prefix destination, peer) basis, but implemented on a (peer) basis

MinRouteAdver
• 30*(N-3) delay due to the creation of mutual dependencies.
  Provide proof that N-3 rounds are necessarily created during bounded BGP MinRouteAdver convergence
• Rounds due to
  – Ambiguity in the BGP RFC and lack of receiver loop detection
  – Inclusion of BGP withdrawals with MinRouteAdver (in violation of the RFC)

Conclusions
• Internet routing has serious convergence problems
  – Result 1 [Griffin et al.]: BGP is not guaranteed to converge. The policies in use can define a Stable Paths Problem that has no solution, or several.
  – Result 2 [Rexford et al.]: If every AS follows a set of guidelines, then Internet routing should not have convergence problems.
  – Result 3 [Labovitz et al.]: An extensive measurement study shows that Internet convergence can take on the order of several minutes.

Open Issues?
• The (lower, upper) bounds from convergence analysis are very weak; they are “worst-worst” case scenarios.
• Can we design a cleaner protocol that has provably good convergence properties?
  – What about link-state routing?
• Should we really care about convergence?
  – Routes to popular prefixes are stable [IMC03]