Internet Routing (COS 598A)
Today: Hot-Potato Routing
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm

Outline
• Hot-potato routing
  – Selecting the closest egress from a set
  – Hot-potato routing changes
• Measuring hot-potato routing
  – BGP and IGP monitoring
  – Inferring causality
• Characterizing hot potatoes
  – Frequency and number of destinations
  – Convergence delays and forwarding loops
• Avoiding hot potatoes
  – Operational practices
  – New egress-selection techniques

Multiple Links Between Domains
[Figure: an end-to-end path from a client to a web server crosses several domains, with multiple links between the domains in the middle of the path]

Hot-Potato Routing
[Figure: an ISP network with multiple egress points toward dest, in San Francisco, Dallas, and New York; San Francisco reaches the two candidate egresses at IGP distances 9 and 10]
Hot-potato routing = route to the closest egress point when there is more than one route to the destination
– All traffic from customers to peers
– All traffic to customer prefixes with multiple connections

BGP Decision Process
• Highest local preference
• Lowest AS-path length
• Lowest origin type
• Lowest MED (with same next-hop AS)
(routes that tie on all of the above are "equally good")
• Lowest IGP cost to the next hop
• Lowest router ID of the BGP speaker

Motivations for Hot-Potato Routing
• Simple computation for the routers
  – IGP path costs are already computed
  – Easy to make a direct comparison
• Ensures consistent forwarding paths
  – The next router in the path picks the same egress point
• Reduces resource consumption
  – Get traffic out as early as possible
  – (But, what does IGP distance really mean???)

Hot-Potato Routing Change
[Figure: the San Francisco router's IGP distance to the Dallas egress increases from 9 to 11 (due to failure, planned maintenance, or traffic engineering), so the New York egress at distance 10 becomes the closest exit]
Consequences:
– Transient forwarding instability
– Traffic shift
– Interdomain routing changes
Routes to thousands of destinations switch egress points!!!
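To make the preceding two slides concrete, here is a minimal Python sketch of hot-potato egress selection; the helper name, the tie-break, and the 3,000-prefix count are illustrative assumptions, while the IGP distances (9, 10, 11) come from the figure.

    # Hot-potato routing: among "equally good" BGP routes, pick the
    # egress point with the lowest IGP path cost (illustrative sketch;
    # the final router-ID tie-break is approximated by the egress name).
    def best_egress(igp_cost, egresses):
        return min(egresses, key=lambda e: (igp_cost[e], e))

    # IGP path costs from the San Francisco router (per the figure).
    igp_cost = {"Dallas": 9, "New York": 10}

    # Thousands of destination prefixes can share one egress set.
    prefixes = {f"prefix-{i}": ["Dallas", "New York"] for i in range(3000)}
    before = {p: best_egress(igp_cost, es) for p, es in prefixes.items()}

    # A failure or maintenance event raises the cost to Dallas to 11...
    igp_cost["Dallas"] = 11
    after = {p: best_egress(igp_cost, es) for p, es in prefixes.items()}

    moved = sum(1 for p in prefixes if before[p] != after[p])
    print(f"{moved} of {len(prefixes)} prefixes switched egress points")
    # -> all 3000 shift from Dallas to New York in one hot-potato change

A single IGP cost change flips the comparison for every prefix that shares the same egress set, which is why one internal event can move routes to thousands of destinations at once.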
Why Care about Hot Potatoes?
• Understanding of Internet routing
  – Frequency of hot-potato routing changes
  – Influence on end-to-end performance
• Operational practices
  – Knowing when hot-potato changes happen
  – Avoiding unnecessary hot-potato changes
  – Analyzing externally caused BGP updates
• Distributed root-cause analysis
  – Each AS can tell which BGP updates it caused
  – Someone should know why each change happens

Measuring Hot Potatoes

Measuring Hot Potatoes is Hard
• Cannot collect data from all routers
  – OSPF: flooding gives a complete view of the topology
  – BGP: multi-hop sessions to several vantage points
• A single event may cause multiple messages
  – Group related routing messages in time
• Router implementation affects message timing
  – Analyze timing in the measurement data
  – Controlled experiments with a router in the lab
• Many BGP updates are caused by external events
  – Classify BGP routing changes by possible causes

Measurement Infrastructure
• Measure both protocols
  – BGP and OSPF monitors
[Figure: monitor M collects OSPF messages and BGP updates from routers X, Y, and Z in the ISP backbone]
• Correlate the two streams
  – Match BGP updates with OSPF events
• Analyze the interaction

Algorithm for Matching
• Transform the stream of OSPF messages into routing changes
  – e.g., discard refreshes; turn link failures and weight changes into cost changes (chg cost) and deletions (del)
• Match BGP updates with OSPF events that happen close in time
• Classify BGP updates by possible OSPF causes
[Figure: the OSPF message stream (refresh, link failure, weight change) and the BGP update stream aligned on a common timeline]

Computing Cost Vectors
• Transform OSPF messages into path-cost changes from a router's perspective
[Figure: from monitor M's viewpoint, an LSA weight change yields the routing change "CHG Y, 7", an LSA delete yields "DEL X", and a later LSA weight change yields "ADD X, 5"]

Classifying BGP Updates
• Cannot have been caused by a cost change
  – The destination just became (un)available in BGP
  – New BGP route through the same egress point
  – New route better/worse than the old one (e.g., shorter)
• Can have been caused by a cost change
  – New route is equally good as the old route (perhaps X got closer, or Y got further away)
[Figure: router M choosing between egress points X and Y toward dst]

The Role of Time
• OSPF link-state advertisements: 10-second window
  – Multiple LSAs from a single physical event
  – Group them into a single cost-vector change
• BGP update messages: 70-second window
  – Multiple BGP updates during convergence
  – Group them into a single BGP routing change
• Matching IGP to BGP: 180-second window
  – Avoid matching unrelated IGP and BGP changes
  – Match related changes that are close in time
Characterize the measurement data to determine the right windows

Characterizing Hot Potatoes

Frequency of Hot-Potato Changes
[Plot: cumulative number of hot-potato changes over time for router A and router B]

Variation Across Routers
[Figures: router A sits at IGP distances 9 and 10 from the two egresses toward dest, while router B sits at distances 1 and 1000]
Small changes will make router A switch exit points to dst; router B is more robust to intradomain routing changes
Important factors:
– Location: relative distance to the egresses
– Day: which events happen

Impact of an OSPF Change: BGP Reaction Time
[Plot: for routers A and B, the delay from the LSA to the first and to all resulting BGP updates, reflecting the BGP scan timer and the transfer delay]

Transferring Multiple Prefixes
[Plot: cumulative BGP updates vs. time between the BGP update and the LSA (seconds), showing an 81-second transfer delay]

Data Plane Convergence
1 – The BGP decision process runs in R2
2 – R2 starts using E1 to reach dst
3 – R1's BGP decision can take up to 60 seconds to run
[Figure: routers R1 and R2 choosing between egress points E1 and E2 toward dst, with IGP link weights 10, 10, 100, and 111]
Packets to dst may be caught in a loop for 60 seconds!
Disastrous for interactive applications (VoIP, gaming, web)
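A minimal sketch of how this transient loop arises, assuming a toy per-router FIB; the forward() helper and the dictionary encoding are illustrative, while the R1/R2/E1/E2 roles and the 60-second scan timer come from the slide.

    # Toy forwarding state during BGP convergence (illustrative sketch).
    def forward(fib, src, dst):
        """Follow next hops from src toward dst, detecting loops."""
        path, node = [src], src
        while node not in ("E1", "E2"):          # stop at an egress point
            node = fib[node][dst]
            if node in path:
                return path + [node], True       # forwarding loop!
            path.append(node)
        return path, False

    # R2's BGP decision process has already run: it now prefers egress
    # E1, reached via R1. R1's timer-driven BGP scan has not fired yet,
    # so R1 still forwards toward the old egress E2, reached via R2.
    fib = {"R1": {"dst": "R2"}, "R2": {"dst": "R1"}}
    print(forward(fib, "R2", "dst"))   # (['R2', 'R1', 'R2'], True)

    # Up to 60 seconds later, R1's BGP scan runs and repairs its FIB.
    fib["R1"]["dst"] = "E1"
    print(forward(fib, "R2", "dst"))   # (['R2', 'R1', 'E1'], False)

Until both routers have rerun the decision process, their egress choices disagree, and packets bounce between them instead of exiting the network.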
BGP Updates Over Prefixes
[Plot: cumulative % of BGP updates vs. % of prefixes; prefixes with only one exit point see no OSPF-triggered updates, while OSPF-triggered BGP updates affect ~50% of the prefixes roughly uniformly]

Avoiding Hot Potatoes

Reducing the Impact of Hot Potatoes
• Vendors: better router implementation
  – Avoid timer-driven reaction to IGP changes
  – Move toward an event-driven BGP implementation
• Operators: avoid nearly equidistant exits
[Figures: router Z at nearly equal distances (around 10) to egresses X and Y versus very unequal distances (1 and 1000)]
Small changes will make Z switch exit points to dst; with very unequal distances it is more robust to intradomain routing changes

Reducing the Impact (Continued)
• Operators: new maintenance practices
  – Careful cost-in/cost-out of links
  – (But, is this problem over-constrained???)
[Figure: routers X, Y, and Z toward dst with link weights 4, 5, and 10; a link is costed out by raising its weight from 5 to 100]

Is Hot-Potato Routing the Wrong Design?
• Too restrictive
  – The egress-selection mechanism dictates a policy
• Too disruptive
  – Small changes inside can lead to big disruptions
• Too convoluted
  – Intradomain metrics shouldn't be so tightly coupled with BGP egress selection

Strawman Solution: Fixed Ranking
• Goal: no disruptions from internal changes
  – Each router has a fixed ranking of egresses
  – Select the highest-ranked egress for each destination
  – Use tunnels from ingress to egress
[Figure: routers A through G with IGP link weights (3, 4, 8, 10, 3, 8, 5, 9, 4) and tunnels toward dst]
• Disadvantage
  – Sometimes changing egresses would be useful
  – Harm from disruptions depends on the application

Egress-Selection Mechanisms
For each ingress i, destination dst, and egress e, a metric m(i,dst,e) ranks the egresses:
– Hot-potato routing: m(i,dst,e) = d(i,e), where d is the intradomain distance
– Fixed ranking: m(i,dst,e) = rank(i,e), a static ranking
[Figure: a spectrum between the two mechanisms, with robustness to internal changes increasing toward fixed ranking]

TIE: Tunable Interdomain Egress Selection
m(i,dst,e) = α(i,dst,e) · d(i,e) + β(i,dst,e)
• Flexible policies
  – Tuning α and β covers a wide range of egress-selection policies
• Simple computation
  – One multiplication and one addition
  – Information already available in the routers
• Easy to optimize
  – Expressive enough for a management system to optimize

Using TIE
• Decouples egress selection from IGP paths
  – Egress selection is done by tuning α and β
• Requirements
  – Small change in the router decision logic
  – Use of tunnels
• Configuring TIE
  – Network designers define a high-level policy
  – A network-management system translates the policy into the α and β parameters

Example Policy: Minimizing Sensitivity
• Problem definition
  – Minimize sensitivity to equipment failures
  – No delay more than twice the design-time delay
• Simple change to routers
  – If the distance grows to more than twice the original distance, change to the closest egress
  – Else, keep using the old egress point
• But the routers cannot be changed for every possible goal

Minimizing Sensitivity with TIE
[Figure: ingress C at distance 9 from egress A, which failures can raise to 11 or 20, and at distance 10 from egress B, toward dst]
At design time: m(C,dst,A) < m(C,dst,B)
Output of the simulation phase:
9·α(C,dst,A) + β(C,dst,A) < 10·α(C,dst,B) + β(C,dst,B)
11·α(C,dst,A) + β(C,dst,A) < 10·α(C,dst,B) + β(C,dst,B)
20·α(C,dst,A) + β(C,dst,A) > 10·α(C,dst,B) + β(C,dst,B)
Optimization phase: solve an integer program to find α and β (see the sketch below)
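A minimal sketch of TIE's metric and of one α, β assignment satisfying the three constraints above; the concrete values α(C,dst,A)=1, β(C,dst,A)=0, α(C,dst,B)=0, β(C,dst,B)=12 are one feasible solution chosen for illustration, not the output of the talk's optimization.

    # TIE metric: m(i,dst,e) = alpha(i,dst,e) * d(i,e) + beta(i,dst,e).
    def tie_best_egress(d, alpha, beta, egresses):
        return min(egresses, key=lambda e: alpha[e] * d[e] + beta[e])

    egresses = ["A", "B"]

    # Hot-potato routing is the special case alpha = 1, beta = 0.
    alpha, beta = {"A": 1, "B": 1}, {"A": 0, "B": 0}
    print(tie_best_egress({"A": 9, "B": 10}, alpha, beta, egresses))   # A

    # A feasible (hypothetical) solution to the slide's constraints:
    # pin egress B's metric to the constant 12, keep A's at its distance.
    alpha, beta = {"A": 1, "B": 0}, {"A": 0, "B": 12}
    print(tie_best_egress({"A": 9, "B": 10}, alpha, beta, egresses))   # A: 9 < 12
    print(tie_best_egress({"A": 11, "B": 10}, alpha, beta, egresses))  # A: 11 < 12
    print(tie_best_egress({"A": 20, "B": 10}, alpha, beta, egresses))  # B: 20 > 12

Because β(C,dst,B) caps egress B's metric at a constant, C tolerates a moderate increase in its distance to A (11 < 12) but switches away once the distance more than doubles past the design value (20 > 12), which is exactly the sensitivity policy above.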
Evaluation of TIE on Real Networks
• Topology and egress sets
  – Abilene network (U.S. research network)
  – Link weights set according to geographic distance
• Configuration of TIE
  – Considers single-link failures
  – Threshold on the delay ratio: 2
  – α ∈ [1, 4], with 93% of α(i,dst,e) = 1
  – β ∈ {0, 1, 3251}, with 90% of β(i,dst,e) = 0
• Evaluation
  – Simulate single-node failures
  – Measure routing sensitivity and delay

Effectiveness of TIE
• Delay
  – Within the 2x target whenever possible (i.e., whenever hot-potato routing could achieve it)
  – Lower delay than the fixed-ranking scheme
• Sensitivity
  – Only slightly more sensitive than the fixed-ranking scheme
  – Much less sensitive than hot-potato routing

Conclusion
• Hot-potato routing
  – Simple, intuitive, distributed mechanism
  – But, a large reaction to small changes
• Studying hot-potato routing
  – Measurement of hot-potato routing changes
  – Characterization of hot potatoes in the wild
  – Guidelines for vendors and operators
• Improving the routing architecture
  – Identify egress selection as a problem in its own right
  – Decouple it from the intradomain link weights

Next Time: Root-Cause Analysis
• Two papers
  – "Locating Internet Routing Instabilities"
  – "A Measurement Framework for Pin-Pointing Routing Changes"
• NANOG video
  – "Root Cause Analysis of Internet Routing Dynamics"
• Review just the first paper
  – Summary, why accept, why reject, future work
• Think about your course project
  – One-page written proposal due Thursday, March 24
  – Final written report due Tuesday, May 10