CS 268: Lecture 9
Inter-domain Routing Protocol
Scott Shenker and Ion Stoica
Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley
Berkeley, CA 94720-1776
(*slides from Timothy Griffin and Craig Labovitz)

Overview
• An Introduction to BGP
• BGP and the Stable Paths problem
• Convergence of BGP in the real world
• The End-to-End Effects of Internet Path Selection

Internet Routing
• Internet organized as a two-level hierarchy
• First level: autonomous systems (AS’s)
  - AS: region of network under a single administrative domain
• AS’s run intra-domain routing protocols
  - Distance Vector, e.g., RIP
  - Link State, e.g., OSPF
• Between AS’s run inter-domain routing protocols, e.g., Border Gateway Protocol (BGP)
  - De facto standard today: BGP-4

Example
[Figure: three autonomous systems (AS-1, AS-2, AS-3); legend: interior router, BGP router]

Inter-domain Routing basics
• Internet is composed of over 16000 autonomous systems
• BGP = Border Gateway Protocol
  - Is a policy-based routing protocol
  - Is the de facto inter-domain routing protocol of today’s global Internet
• Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.

Inter-domain Routing
• Uses TCP
• Border Gateway Protocol (BGP), based on Bellman-Ford path vector
• AS’s exchange reachability information through their BGP routers, only when routes change
• BGP routing information: a sequence of AS’s indicating the path traversed by a route; next hop
• General operations of a BGP router:
  - Learns multiple paths
  - Picks best path according to its AS policies
  - Installs best pick in IP forwarding tables

BGP Operations (Simplified)
[Figure: BGP session between AS1 and AS2]
• Establish session on TCP port 179
• Exchange all active routes
• Exchange incremental updates
• While connection is ALIVE, exchange route UPDATE messages

Customers and Providers
[Figure: providers and customers, with IP traffic flowing between them]
• Customer pays provider for access to the Internet

The “Peering” Relationship
[Figure: peers, provider and customer; legend: traffic allowed, traffic NOT allowed]
• Peers provide transit between their respective customers
• Peers do not provide transit between peers
• Peers (often) do not exchange $$$

Peering Provides Shortcuts
[Figure: peer, provider, customer]
• Peering also allows connectivity between the customers of “Tier 1” providers.

Peering Wars
Peer
• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer
• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world!
Peering agreements are often confidential.
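
The customer/provider and peering rules above amount to a simple export filter: an AS carries traffic only when one end of the path is a paying customer. The following is a minimal sketch of that rule, not a router configuration. The relationship labels and the function name `may_export` are hypothetical names chosen for illustration.

```python
# Hypothetical relationship labels; these are illustrative names, not BGP syntax.
CUSTOMER, PEER, PROVIDER, SELF = "customer", "peer", "provider", "self"


def may_export(learned_from: str, export_to: str) -> bool:
    """Return True if a route learned from one kind of neighbor may be
    announced to another kind of neighbor."""
    # Routes originated locally or learned from a customer go to everyone:
    # the customer pays for reachability from the whole Internet.
    if learned_from in (SELF, CUSTOMER):
        return True
    # Routes learned from a peer or a provider are announced only to
    # customers, so the AS never provides transit between peers/providers.
    return export_to == CUSTOMER


if __name__ == "__main__":
    for src in (CUSTOMER, PEER, PROVIDER):
        for dst in (CUSTOMER, PEER, PROVIDER):
            print(f"{src:>8} -> {dst:<8} {may_export(src, dst)}")
```

With this rule a peer route reaches customers (peers provide transit between their respective customers) but is never passed to another peer or to a provider.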

Architecture of Dynamic Routing
[Figure: AS 1 and AS 2 connected by BGP; interior protocol labels: OSPF, EIGRP]
• IGP = Interior Gateway Protocol
  - Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
• EGP = Exterior Gateway Protocol
  - Policy based: BGP
• The Routing Domain of BGP is the entire Internet

AS-Path
• Sequence of AS’s a route traverses
• Used for loop detection and to apply policy
[Figure: AS-1 through AS-5 with prefixes 110.10.0.0/16, 120.10.0.0/16 and 130.10.0.0/16. Routes listed next to AS-1:]
  120.10.0.0/16   AS-2 AS-3 AS-4
  130.10.0.0/16   AS-2 AS-3
  110.10.0.0/16   AS-2 AS-5

Four Types of BGP Messages
• Open: Establish a peering session.
• Keep Alive: Handshake at regular intervals.
• Notification: Shuts down a peering session.
• Update: Announcing new routes or withdrawing previously announced routes.
announcement = prefix + attribute values

Two Types of BGP Neighbor Relationships
[Figure: eBGP session between AS1 and AS2, iBGP session inside an AS]
• External Neighbor (eBGP) in a different Autonomous System
• Internal Neighbor (iBGP) in the same Autonomous System
iBGP is routed using Interior Gateway Protocol (IGP)!

iBGP Peers Must be Fully Meshed
[Figure: an eBGP update redistributed as iBGP updates]
• iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors.
• iBGP is needed to avoid routing loops within an AS
• Injecting external routes into IGP does not scale and causes BGP policy information to be lost
• BGP does not provide “shortest path” routing

Important BGP attributes
• LocalPREF
  - Local preference policy to choose “most” preferred route
• Multi-exit Discriminator (MED)
  - Which peering point to choose?
• Import Rules
  - What route advertisements do I accept?
• Export Rules
  - Which routes do I forward to whom?
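
The AS-path and attribute slides above can be made concrete with a small sketch. It shows loop detection (reject a route whose AS path already contains the local AS), AS prepending on announcement, and a simplified preference order over local preference, AS-path length and MED. The `Route` class, the function names and the default local preference of 100 are assumptions for illustration; real BGP compares MED only under additional conditions that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple          # nearest AS first, origin AS last
    local_pref: int = 100   # assumed default; meaningful only inside one AS
    med: int = 0


def accept(route: Route, my_as: int) -> bool:
    """Loop detection: drop any route whose AS path already contains us."""
    return my_as not in route.as_path


def announce(route: Route, my_as: int) -> Route:
    """Prepend our AS number before passing the route to an eBGP neighbor.
    Local attributes are not carried over; the copy starts from defaults."""
    return Route(route.prefix, (my_as,) + route.as_path)


def preference_key(route: Route):
    # Simplified order: highest local_pref, then shortest AS path, then lowest MED.
    return (-route.local_pref, len(route.as_path), route.med)


def best(routes):
    return min(routes, key=preference_key)


if __name__ == "__main__":
    r = Route("120.10.0.0/16", (2, 3, 4))    # path from the AS-Path slide
    print(accept(r, 1), accept(r, 3))         # AS-3 is already on the path
    print(announce(r, 1).as_path)
    short = Route("120.10.0.0/16", (5, 4))    # hypothetical alternative
    print(best([r, short]).as_path)
```

Raising `local_pref` on the longer route makes it win despite its longer AS path, which is how policy overrides path length in this simplified ordering.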

Route Selection Summary
• Highest Local Preference (enforce relationships)
• Shortest ASPATH
• Lowest MED
• i-BGP < e-BGP
• Lowest IGP cost to BGP egress
  (the four rules above: traffic engineering)
• Lowest router ID (throw up hands and break ties)

Implementing Customer/Provider and Peer/Peer relationships
Two parts:
• Enforce transit relationships
  - Outbound route filtering
• Enforce order of route preference
  - provider < peer < customer

Import Routes
[Figure: an ISP importing routes from provider, from peer and from customer; legend: provider route, peer route, customer route, ISP route]

Export Routes
[Figure: an ISP exporting routes to provider, to peer and to customer, with filters blocking some routes; legend: provider route, peer route, customer route, ISP route]

Overview
• An Introduction to BGP
• BGP and the Stable Paths problem
• Convergence of BGP in the real world
• The End-to-End Effects of Internet Path Selection

What Problem is BGP solving?
  Underlying problem    Distributed means of computing a solution
  Shortest Paths        RIP, OSPF, IS-IS
  X?                    BGP
Having an X can
• aid in the design of policy analysis algorithms and heuristics,
• aid in the analysis and design of BGP and extensions,
• help explain some BGP routing anomalies,
• provide a fun way of thinking about the protocol

Q: How simple can X get?
A: The Stable Paths Problem (SPP)
An instance of the SPP:
• A graph of nodes and edges,
• Node 0, called the origin,
• For each non-zero node, a set of permitted paths to the origin. This set always contains the “null path”.
• A ranking of permitted paths at each node. The null path is always least preferred.
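
An SPP instance as defined above is small enough to explore by brute force. The sketch below encodes an instance as a ranking of permitted paths per node (most preferred first, null path implied last) and enumerates every assignment that is stable, taking "stable" to mean that each node holds its highest-ranked permitted path consistent with what its neighbors hold. The function names and the two sample instances are illustrative; their rankings are transcribed from path labels in the lecture's figures, which is an assumption given that the figures did not survive.

```python
from itertools import product

NULL = ()  # the null path


def best_available(node, ranked, assignment):
    """Highest-ranked permitted path of `node` whose tail is exactly the
    path currently assigned to its next hop; NULL if none qualifies."""
    for path in ranked[node]:          # path = (node, next_hop, ..., 0)
        if assignment.get(path[1]) == path[1:]:
            return path
    return NULL


def solutions(ranked, origin=0):
    """Enumerate all stable assignments by exhaustive search."""
    nodes = sorted(ranked)
    choices = [list(ranked[n]) + [NULL] for n in nodes]
    found = []
    for combo in product(*choices):
        assignment = dict(zip(nodes, combo))
        assignment[origin] = (origin,)  # the origin always holds the trivial path
        if all(best_available(n, ranked, assignment) == assignment[n] for n in nodes):
            found.append({n: assignment[n] for n in nodes})
    return found


# Two nodes that each prefer to route through the other.
two_node = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}

# Four-node instance with cyclic preferences (BAD GADGET-style rankings).
bad_gadget = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 4, 2, 0), (3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
}

if __name__ == "__main__":
    print(solutions(two_node))    # two stable assignments
    print(solutions(bad_gadget))  # no stable assignment
```

Exhaustive search is exponential in the number of nodes, so this is only useful for gadget-sized instances.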
[Figure: SPP instance with origin 0; the null path is not shown in the diagram. Permitted paths per node, listed from most preferred to least preferred (not null): node 1: 130, 10; node 2: 210, 20; node 3: 30; node 4: 420, 430; node 5: 5210]
When modeling BGP: nodes represent BGP-speaking border routers, and 0 represents a node originating some address block.

A Solution to a Stable Paths Problem
A solution is an assignment of permitted paths to each node such that
• node u’s assigned path is either the null path or is a path uwP, where wP is assigned to node w and {u,w} is an edge in the graph,
• each node is assigned the highest ranked path among those consistent with the paths assigned to its neighbors.
A solution need not represent a shortest path tree, or a spanning tree.
[Figure: the SPP instance from the previous slide]

Example: SHORTEST1
[Figure: node 1: 10, 130; node 2: 20, 210; node 3: 30; node 4: 430, 420]

Example: SHORTEST1 (Solution)
[Figure: same instance, solution shown]

Example: SHORTEST2
[Figure: node 1: 10, 130; node 2: 20, 210; node 3: 30; node 4: 420, 430]

Example: SHORTEST2 (Solution)
[Figure: same instance, solution shown]

Example: GOOD GADGET
[Figure: node 1: 130, 10; node 2: 210, 20; node 3: 30; node 4: 430, 420]

Example: GOOD GADGET (Solution)
[Figure: same instance, solution shown]

A Stable Paths Problem may have multiple solutions
[Figure: two-node instance, node 1: 120, 10; node 2: 210, 20; panels: the instance, First solution, Second solution]

Example: NAUGHTY GADGET
[Figure: node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 430, 420]

Example: NAUGHTY GADGET (Solution 1)
[Figure: same instance, first solution shown]

Example: NAUGHTY GADGET (Solution 2)
[Figure: same instance, second solution shown]

SPP helps explain possibility of BGP divergence
• BGP is not guaranteed to converge to a stable routing. Policy inconsistencies can lead to “livelock” protocol oscillations.
• See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan, R. Govindan, and D. Estrin.
  ISI report, 1996.
• The SPP view:
[Figure: classification of SPP instances; labels: solvable, can diverge, must converge, must diverge]

Example: NAUGHTY GADGET
[Figure: node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 430, 420]

Example: BAD GADGET
[Figure: node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 420, 430]
Step by step (one slide per step in the original deck):
• 2 chooses (2 0)
• 4 chooses (4 2 0)
• 3 chooses (3 4 2 0)
• 1 chooses (1 0)
• … 2 chooses (2 1 0) …
• … 3 chooses (3 0) …
• … 1 chooses (1 3 0) …
• … 2 chooses (2 0) …
• … 3 chooses (3 4 2 0) … LOOP!

BAD GADGET: No Solution
With a BGP-like protocol, each node will do the best it can, so at least one node will always have the opportunity to improve its path.
Result: persistent oscillation.
[Figure: BAD GADGET instance, node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 420, 430]

SURPRISE: Beware of Backup Policies
• BGP is not robust: it is not guaranteed to recover from network failures.
• Becomes BAD GADGET if link (4, 0) goes down.
[Figure: node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 40, 420, 430]

PRECARIOUS
• Has a solution, but can get “trapped”
[Figure: nodes 0 through 6. Path labels: node 1: 120, 10; node 2: 210, 20; node 3: 310, 3120; node 4: 4310, 453120, 43120; node 5: 5310, 563120, 53120; node 6: 6310, 643120, 63120]
• One part of the graph has a solution only when node 1 is assigned the direct path (1 0).
• As with DISAGREE, the other part has two distinct solutions

What is to be done?
Dynamic Approach
• Extend BGP with a dynamic means of detecting and suppressing policy-based oscillations?
Static Approach
• Automated Analysis of Routing Policies (This is very hard).
• Inter-AS coordination
These approaches are complementary

Theoretical Results
• The problem of determining whether an instance of the stable paths problem is solvable is NP-complete
• Shortest path route selection is provably safe

Overview
• An Introduction to BGP
• BGP and the Stable Paths problem
• Convergence of BGP in the real world
• The End-to-End Effects of Internet Path Selection

Convergence in the real world? [Labovitz99]
• Experimental results from a two-year study which measured 150,000 BGP faults injected into peering sessions at several IXPs
• Found
  - Internet averages 3 minutes to converge after failover
  - Some multihomed failovers (short to long ASPath) require 15 minutes

BGP Convergence Example
[Figure: AS0, AS1, AS2 and AS3, with route R. BGP tables list R via AS3 and via the other AS’s, e.g. at AS0: *B R via AS3; B R via AS1,AS3; B R via AS2,AS3. Animation overlays: 203, 013, 103]

Convergence Result
If we assume
1. unbounded delay on BGP processing and propagation
2. full mesh of BGP peers
3. constrained shortest path first selection algorithm
then
• There exists a possible ordering of messages such that BGP will explore all possible ASPaths of all possible lengths
• BGP is O(N!), where N is the number of default-free BGP speakers

Outline of Today’s Class
• An Introduction to BGP
• BGP and the Stable Paths problem
• Convergence of BGP in the real world
• The End-to-End Effects of Internet Path Selection

End-to-end effects of Path Selection
• Goal of study: Quantify and understand the impact of path selection on end-to-end performance
• Basic metric
  - Let X = performance of default path
  - Let Y = performance of best path
  - Y-X = cost of using default path
• Technical issues
  - How to find the best path?
  - How to measure the best path?

Approximating the best path
• Key Idea
  - Use end-to-end measurements to extrapolate potential alternate paths
• Rough Approach
  - Measure paths between pairs of hosts
  - Generate synthetic topology: full NxN mesh
  - Conservative approximation of best path
• Question: Given a selection of N hosts, how crude is this approximation?

Methodology
• For each pair of end-hosts, calculate:
  - Average round-trip time
  - Average loss rate
  - Average bandwidth
• Generate synthetic alternate paths (based on long-term averages)
• For each pair of hosts, graph difference between default path and alternate path

[Four figure slides. Courtesy: Stefan Savage]

Why is Path Selection imperfect?
• Technical Reasons
  - Single path routing
  - Non-topological route aggregation
  - Coarse routing metrics (AS_PATH)
  - Local policy decisions
• Economic Reasons
  - Disincentive to offer transit
  - Minimal incentive to optimize transit traffic
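
The synthetic alternate-path step of the methodology can be sketched as follows. The slides do not say how alternate paths are composed, so this sketch assumes the simplest case: an alternate path goes through exactly one intermediate host, and its round-trip time is the sum of the two measured averages. The function name, the host names and the RTT values are all made up for illustration.

```python
def best_alternate(rtt, src, dst):
    """Lowest-RTT synthetic path src -> k -> dst over one intermediate host k.
    `rtt` maps (host_a, host_b) to a measured long-term average RTT in ms.
    Returns (alternate_rtt, k), or None if no intermediate host is usable."""
    hosts = {h for pair in rtt for h in pair}
    candidates = [
        (rtt[(src, k)] + rtt[(k, dst)], k)
        for k in hosts - {src, dst}
        if (src, k) in rtt and (k, dst) in rtt
    ]
    return min(candidates) if candidates else None


if __name__ == "__main__":
    # Made-up measurements between three hypothetical hosts.
    rtt = {
        ("A", "B"): 100.0,
        ("A", "C"): 30.0,
        ("C", "B"): 40.0,
    }
    alt_rtt, via = best_alternate(rtt, "A", "B")
    saved = rtt[("A", "B")] - alt_rtt   # RTT saved by leaving the default path
    print(via, alt_rtt, saved)
```

Because only measured host pairs can serve as detours, the result is a conservative estimate: better paths through unmeasured hosts would not be found.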