Routing Measurements: Three Case Studies
Jennifer Rexford

Motivations for Measuring the Routing System
• Characterizing the Internet
  – Internet path properties
  – Demands on Internet routers
  – Routing convergence
• Improving Internet health
  – Protocol design problems
  – Protocol implementation problems
  – Configuration errors or attacks
• Operating a network
  – Detecting and diagnosing routing problems
  – Traffic shifts, routing attacks, flaky equipment, …

Techniques for Measuring Internet Routing
• Active probing
  – Inject probes along a path through the data plane
  – E.g., using traceroute
• Passive route monitoring
  – Capture control-plane messages between routers
  – E.g., using tcpdump or a software router
  – E.g., dumping the routing table on a router
• Injecting network events
  – Cause a failure/recovery at a planned time and place
  – E.g., a BGP route beacon, or planned maintenance

Challenges in Measuring Routing
• Data vs. control plane
  – Understand the relationship between routing-protocol messages and the impact on data traffic
• Cause vs. effect
  – Identify the root cause of a change in the forwarding path or in the control-plane messages
• Visibility and representativeness
  – Collect routing data from many vantage points
  – Across many Autonomous Systems, or within one
• Large volume of data
  – Many end-to-end paths
  – Many prefixes and update messages

Measurement Tools: Traceroute
• Traceroute exploits TTL-limited probes
  – Send packets with TTL=1, 2, 3, … and record the source of each “time exceeded” message
  – Observes the forwarding path (a minimal sketch follows below)
• Useful, but introduces many challenges
  – Path changes
  – Non-participating nodes
  – Inaccurate, two-way measurements
  – Hard to map interfaces to routers and ASes
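To make the probing mechanics concrete, here is a minimal traceroute sketch in Python. It is an illustrative toy under stated assumptions, not the real tool: it sends empty UDP probes to an arbitrary high port, needs root privileges for the raw ICMP socket, omits per-hop timing and retransmissions, and the destination name is a placeholder.

```python
import socket

def traceroute(dest_name, max_hops=30, port=33434, timeout=2.0):
    """Send UDP probes with TTL=1, 2, 3, ... and record the source
    address of each ICMP "time exceeded" reply, as on the slide."""
    dest_addr = socket.gethostbyname(dest_name)
    for ttl in range(1, max_hops + 1):
        # Raw socket to receive ICMP replies (requires root privileges).
        recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                                  socket.getprotobyname("icmp"))
        recv_sock.settimeout(timeout)
        recv_sock.bind(("", port))
        # Ordinary UDP socket to send the TTL-limited probe.
        send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        send_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        send_sock.sendto(b"", (dest_addr, port))
        try:
            _, (hop, _) = recv_sock.recvfrom(512)
        except socket.timeout:
            hop = None  # non-participating node: no reply within timeout
        finally:
            send_sock.close()
            recv_sock.close()
        print(f"{ttl:2d}  {hop or '*'}")
        if hop == dest_addr:
            break  # the destination itself answered; path is complete

if __name__ == "__main__":
    traceroute("example.com")  # placeholder destination
```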
Measurement: Intradomain Route Monitoring
• OSPF is a flooding protocol
  – Every link-state advertisement is sent on every link
  – Very helpful for simplifying the monitor
• Can participate in the protocol
  – Shared media (e.g., Ethernet): join the multicast group and listen to LSAs
  – Point-to-point links: establish an adjacency with a router
• … or passively monitor packets on a link
  – Tap a link and capture the OSPF packets

Measurement: Interdomain Route Monitoring
• Option 1: talk to operational routers using SNMP or telnet at the command line
  (+) Table dumps show all alternate routes
  (-) BGP table dumps are expensive
  (-) Update dynamics are lost
  (-) Restricted to the interfaces provided by vendors
• Option 2: establish a “passive” BGP session (over TCP) from a workstation running BGP software
  (+) Table dumps do not burden operational routers
  (+) Update dynamics are captured
  (+) Not restricted to the interfaces provided by vendors
  (-) Receives only the best routes from the BGP neighbor

Collect BGP Data From Many Routers
• BGP is not a flooding protocol, so one vantage point is not enough
  [Figure: a route monitor peering with backbone routers in cities across the U.S.]

Two Kinds of BGP Monitoring Data
• Wide-area, from many ASes
  – RouteViews or RIPE-NCC data
  – Pro: available from many vantage points
  – Con: often just one or two views per AS
• Single AS, from many routers
  – Abilene and GEANT public repositories
  – Proprietary data at individual ISPs
  – Pro: comprehensive view of a single AS
  – Con: limited public examples, mostly research nets

Measurement: Injecting Events
• Equipment failure/recovery
  – Unplug/reconnect the equipment
  – Packet filters that block all packets
  – Knowing when a planned event will take place
  – Shutting down a routing-protocol adjacency
• Injecting route announcements
  – Acquire some blocks of IP addresses
  – Acquire a routing-protocol adjacency to a router
  – Announce/withdraw routes on a schedule
  – Beacons: http://psg.com/~zmao/BGPBeacon.html

Two Papers for Today
• Both early measurement studies
  – Initially appeared at SIGCOMM ’96 and ’97
  – Both won the “best student paper” award
  – Early glimpses into the health of Internet routing
  – Part of the early wave of papers on Internet measurement
• Differences in emphasis
  – Paxson96: end-to-end active probing to measure the characteristics of the data plane
  – Labovitz97: passive monitoring of BGP update messages from several ISPs to characterize the (in)stability of the interdomain routing system

Paxson Study: Forwarding Loops
• Forwarding loop
  – A packet returns to the same router multiple times
• May cause traceroute to show a loop
  – If the loop lasts long enough
  – … so that many probe packets traverse the loopy path
• Traceroute may reveal false loops
  – A path change that leads to a longer path
  – … causing later probe packets to hit the same nodes
• Heuristic solution
  – Require traceroute to return the same path three times

Paxson Study: Causes of Loops
• Transient vs. persistent
  – Transient: routing-protocol convergence
  – Persistent: likely a configuration problem
• Challenges
  – What is the appropriate time boundary between the two?
  – What about flaky equipment going up and down?
  – How to determine the cause of persistent loops?
• Anecdote from a recent study of persistent loops
  – The provider has a static route for the customer prefix
  – The customer has a default route to the provider

Paxson Study: Path Fluttering
• Rapid changes between paths
  – Multiple paths between a pair of hosts
  – Load-balancing policies inside the network
• Packet-based load balancing
  – Round-robin or random
  – Multiple paths for packets in a single flow
• Flow-based load balancing (sketched below)
  – Hash of some fields in the packet header
  – E.g., IP addresses, port numbers, etc.
  – Keeps the packets of a flow on one path
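A minimal sketch of the flow-based scheme: hash the five-tuple to a path index so that every packet of a flow takes the same path and is not reordered. The choice of CRC32 and the field separator are illustrative assumptions; real routers use vendor-specific hash functions over the same header fields.

```python
import zlib

def pick_path(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    """Map a flow's five-tuple to one of num_paths outgoing paths.
    Deterministic per flow, so packets within a flow stay in order."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_paths

# Every packet of one TCP flow lands on the same path:
assert pick_path("10.0.0.1", "192.0.2.9", 4321, 80, 6, 4) == \
       pick_path("10.0.0.1", "192.0.2.9", 4321, 80, 6, 4)
```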
Paxson Study: Routing Stability
• Route prevalence
  – The likelihood of observing a particular route
  – Relatively easy to measure with sound sampling
  – Poisson arrivals see time averages (PASTA)
  – Most host pairs have a dominant route
• Route persistence
  – How long a route endures before a change
  – Much harder to measure with active probes
  – Look for cases with multiple observations of the same route
  – The typical host pair has path persistence of about a week

Paxson Study: Route Asymmetry
• Hot-potato (early-exit) routing
  – Each provider hands traffic to the other at the peering point closest to the sender, so the forward and reverse paths can differ
  [Figure: two providers with multiple peering points, each using early-exit routing between Customer A and Customer B]
• Other causes
  – Asymmetric link weights in intradomain routing
  – Cold-potato routing, where an AS requests that traffic enter at a particular place
• Consequences
  – Lots of asymmetry
  – One-way delay is not necessarily half of the round-trip time

Labovitz Study: Interdomain Routing
• AS-level topology
  – Destinations are IP prefixes (e.g., 12.0.0.0/8)
  – Nodes are Autonomous Systems (ASes)
  – Links are connections & business relationships
  [Figure: a client reaching a web server across a graph of seven ASes]

Labovitz Study: BGP Background
• Extension of distance-vector routing
  – Supports flexible routing policies
  – Avoids the count-to-infinity problem
• Key idea: advertise the entire path
  – Distance vector: send a distance metric per destination d
  – Path vector: send the entire path for each destination d
  – E.g., AS 1 announces “d: path (1)”, AS 2 passes on “d: path (2,1)”, and data traffic flows in the reverse direction

Labovitz Study: BGP Background
• BGP is an incremental protocol
  – In theory, no update messages in steady state
• Two kinds of update messages
  – Announcement: advertising a new route
  – Withdrawal: withdrawing an old route
• The study saw an alarming number of updates
  – At the time, the Internet had around 45,000 prefixes
  – Routers were exchanging 3–6 million updates per day
  – Sometimes as many as 30 million in a day
  – Placing a very high load on the routers

Labovitz Study: Classifying Update Messages
• Analyze update messages
  – For each (prefix, peer) tuple
  – Classify the kinds of routing changes (see the sketch below)
• Forwarding instability
  – WADiff: explicit withdrawal, replaced by an alternate route
  – AADiff: implicit withdrawal, replaced by an alternate route
• Pathological
  – WADup: explicit withdrawal, and then reannounced
  – AADup: duplicate announcement
  – WWDup: duplicate withdrawal
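A sketch of how this taxonomy could be computed over one (prefix, peer) update stream, under the simplifying assumption that a route compares as a single opaque value (the study compares full BGP attribute sets); the function and variable names are mine, not the paper's.

```python
def classify_updates(updates):
    """Label each consecutive update pair for one (prefix, peer) with
    the Labovitz classes. An update is ("A", route) or ("W", None)."""
    labels = []
    last_route = None       # route currently announced, if any
    withdrawn_route = None  # route removed by the last withdrawal
    prev_kind = None
    for kind, route in updates:
        if prev_kind == "W" and kind == "W":
            labels.append("WWDup")    # duplicate withdrawal (pathological)
        elif prev_kind == "W" and kind == "A":
            # Reannouncing the withdrawn route vs. an alternate route
            labels.append("WADup" if route == withdrawn_route else "WADiff")
        elif prev_kind == "A" and kind == "A":
            # Implicit withdrawal: a new announcement replaces the old one
            labels.append("AADup" if route == last_route else "AADiff")
        if kind == "W":
            if last_route is not None:
                withdrawn_route = last_route
            last_route = None
        else:
            last_route = route
        prev_kind = kind
    return labels

stream = [("A", "(2,1)"), ("A", "(3,1)"),   # AADiff
          ("W", None), ("A", "(3,1)"),      # WADup
          ("W", None), ("W", None)]         # WWDup
print(classify_updates(stream))  # ['AADiff', 'WADup', 'WWDup']
```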
Labovitz Study: Duplicate Withdrawals
• Time-space trade-off in router implementation
  – A common system-building technique
  – Trade one resource for another
  – Can have surprising side effects
• The gory details
  – Ideally, a router should not send a withdrawal to a neighbor it never sent a corresponding announcement
  – But that requires remembering what update message was sent to each neighbor
  – It is easier to just send everyone a withdrawal when the route goes away

Labovitz Study: Practical Impact
• “Stateless BGP” is compliant with the standard
  – But it forces other routers to handle more load
  – … so that you don’t have to maintain state
  – Arguably very unfair, and bad for the global Internet
• One router vendor was largely at fault
  – The vendor modified its implementation
  – ISPs then deployed the updated software

Labovitz Study: Still Hard to Diagnose Problems
• Despite a very detailed view into BGP
  – Some pathologies were very hard to diagnose
• Possible causes
  – Flaky equipment
  – Synchronization of BGP timers
  – Interaction between BGP and intradomain routing
  – Policy oscillation
• These topics were examined in follow-up studies
  – Example: a study of BGP data within a large ISP
  – http://www.cs.princeton.edu/~jrex/papers/nsdi05-jian.pdf

ISP Study: Detecting Important Routing Changes
• Large volume of BGP update messages
  – Around 2 million per day, and very bursty
  – Too much for an operator to manage
• Identify important anomalies
  – Lost reachability
  – Persistent flapping
  – Large traffic shifts
• Not the same as root-cause analysis
  – Identify changes and their effects
  – Focus on mitigation, rather than diagnosis
  – Diagnose causes only if they occur in or near the AS

Challenge #1: Excess Update Messages
• A single routing change
  – Leads to multiple update messages
  – Affects the routing decision at multiple routers
• BGP update grouping (sketched below)
  – Group updates for a prefix with inter-arrival < 70 seconds into one event
  – Flag prefixes with changes lasting > 10 minutes as persistently flapping
• Determining the event timeout
  [Figure: cumulative distribution of BGP update inter-arrival times, validated with a BGP beacon; 98% of inter-arrivals fall within 70 seconds]
• Event duration and persistent flapping
  [Figure: complementary cumulative distribution of event duration; only 0.1% of events last longer than 600 seconds, and these long events are the persistent flapping]
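A sketch of the grouping rule under the stated thresholds (70-second inter-arrival timeout, 10-minute flapping cutoff). The names are illustrative and the batch processing is a simplification: the deployed system worked online, and a real monitor would likely key on the peer as well as the prefix.

```python
from collections import defaultdict

EVENT_TIMEOUT = 70    # seconds: 98% of inter-arrival times fall below this
FLAP_THRESHOLD = 600  # seconds: events lasting > 10 minutes keep flapping

def group_into_events(updates):
    """Group a stream of (timestamp, prefix) updates into per-prefix
    events: successive updates less than EVENT_TIMEOUT apart extend the
    current event; an event longer than FLAP_THRESHOLD marks its prefix
    as persistently flapping."""
    events = defaultdict(list)  # prefix -> list of (start, end) times
    flapping = set()
    last_seen = {}
    for ts, prefix in sorted(updates):
        if prefix in last_seen and ts - last_seen[prefix] < EVENT_TIMEOUT:
            start, _ = events[prefix][-1]
            events[prefix][-1] = (start, ts)   # extend the current event
        else:
            events[prefix].append((ts, ts))    # start a new event
        start, end = events[prefix][-1]
        if end - start > FLAP_THRESHOLD:
            flapping.add(prefix)               # persistent flapping
        last_seen[prefix] = ts
    return events, flapping

updates = [(0, "p"), (30, "p"), (65, "p"), (900, "p")]
events, flapping = group_into_events(updates)
# events["p"] == [(0, 65), (900, 900)]; flapping is empty
```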
Detecting Persistent Flapping
• Significant persistent flapping
  – 15.2% of all BGP update messages
  – … though from a small number of destination prefixes
  – Surprising, especially since flap dampening is used
• Types of persistent flapping
  – Conservative flap-damping parameters (78.6%)
  – Policy oscillations, e.g., MED oscillation (18.3%)
  – Unstable interface or BGP session (3.0%)
• Example: an unstable eBGP session
  [Figure: an unstable eBGP session between an AT&T border router and a customer causes the route for prefix p to flap]

Challenge #2: Identify Important Events
• Major concerns of network operators
  – Changes in reachability
  – Heavy load of routing messages on the routers
  – Flow of the traffic through the network
• Event classification: classify each event by the type of impact it has on the network
  – “No disruption”: none of the border routers sees a traffic shift
  – “Internal disruption”: all of the traffic shifts are internal to the AS
  – “Single external disruption”: traffic at one exit point shifts to other exit points
  – “Multiple external disruption”
  – “Loss/gain of reachability”

Statistics on Event Classification
Category                       Events   Updates
No Disruption                  50.3%    48.6%
Internal Disruption            15.6%     3.4%
Single External Disruption     20.7%     7.9%
Multiple External Disruption    7.4%    18.2%
Loss/Gain of Reachability       6.0%    21.9%

Challenge #3: Multiple Destinations
• A single routing change
  – Affects multiple destination prefixes
• Event correlation
  – Group events of the same type that occur close in time into clusters

Main Causes of Large Clusters: BGP Resets
• External BGP session resets
  – Failure/recovery of an external BGP session
  – E.g., a session to another large tier-1 ISP
  – Caused “single external disruption” events
  – Validated by looking at syslog reports on the routers

Main Causes of Large Clusters: Hot Potatoes
• Hot-potato routing changes
  – “Hot-potato routing” = route to the closest egress point
  – Failure/recovery of an intradomain link leads to changes in IGP path costs
  – Caused “internal disruption” events
  – Validated by looking at OSPF measurements

Challenge #4: Popularity of Destinations
• Impact of an event on traffic
  – Depends on the popularity of the destinations
  – Weight each group of destinations by its traffic volume, measured from Netflow data

ISP Study: Traffic Impact Prediction
• Traffic weight
  – Per-prefix measurements from Netflow
  – 10% of prefixes account for 90% of the traffic
• Traffic weight of a cluster
  – The sum of the traffic weights of its prefixes
• Flag clusters with heavy traffic
  – A few large clusters have a large traffic weight
  – Mostly session resets and hot-potato changes

ISP Study: Summary
• Pipeline: BGP updates (~10^6/day) → update grouping → events (~10^5) → event classification → “typed” events → event correlation → clusters (~10^3) → traffic impact prediction, using Netflow data → large disruptions (~10^1)
• Along the way, grouping flags persistent flapping prefixes (~10^1) and correlation flags frequent flapping prefixes (~10^1)

Three Studies, Three Approaches
• End-to-end active probes
  – Measure and characterize the forwarding path
  – Identify the effects on data traffic
• Wide-area passive route monitoring
  – Measure and classify BGP routing churn
  – Identify pathologies and improve Internet health
• Intra-AS passive route monitoring
  – Detailed measurements of BGP within an AS
  – Aggregate data into a small set of major events