A Server-to-Server View of the Internet
Bala, Georgios, Arthur, Matthew, KC

[Diagram labels: CDN Servers, Origin Servers, End Users, Internet’s Core]
Content moved closer to end users to reduce latency.
Connections from end users are terminated at CDN servers close to the end users.

Viewing the Internet’s core from the distributed measurement platform of a CDN.

Back-office Web traffic accounts for a significant fraction of core Internet traffic (Pujol et al., IMC, Nov. 2014).
End-user experience is at the mercy of the unreliable Internet and its middle-mile bottlenecks (T. Leighton, CACM, Vol. 52, No. 2, Feb. 2009).

[Figure: RTT (in ms), 0 to 400, for IPv4 and IPv6, Jan through Jul]
A six-month timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.
Level shifts in RTTs over both IPv4 and IPv6.

To what extent do changes in the AS path affect round-trip times?

[Figure: RTT (in ms), 50 to 350, for IPv4 and IPv6, 03/26 through 04/02, with night and day marked]
A portion of the timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.
Daily oscillations in RTT between the servers.

How common are periods of daily oscillation in RTT, and where do they occur?

What affects end-to-end RTTs more: routing or congestion?
How do IPv4 and IPv6 compare with respect to routing and performance?

1. To what extent do changes in the AS path affect round-trip times?
2. How common are periods of daily oscillation in RTT, and where do they occur?

Effect of routing changes on end-to-end RTTs

Data Set: Long Term
• ≈600 dual-stacked servers in 70 different countries.
‣ US, AU, DE, IN, JP, …

Traceroutes conducted between servers in both directions over both protocols.

Every 3 hours, traceroutes are done over the full mesh.
All traceroutes in a given 3-hour time frame have the same timestamp.

A-B Trace Timeline
Traceroutes over the full mesh every 3 hours for 16 months, from Jan. 2014 through Apr. 2015.
≈700M IPv4 and ≈600M IPv6 traceroutes.
Trace timeline Sa ➝ Sb is different from Sb ➝ Sa.

• Extract two pieces of information from each traceroute
‣ AS path inferred from interfaces in the traceroute output
‣ end-to-end RTT between the two servers
Example: a traceroute from A to B traversing AS1-AS2-AS3-AS4 with an end-to-end RTT of 22.3 ms.

A–B trace timeline: (AS path, end-to-end RTT) tuples spanning the study period.
Example tuples, in time order:
(AS1-AS2-AS3-AS4, 22.3 ms)
(AS1-AS2-AS3-AS4, 29.7 ms)
(AS1-AS5-AS9-AS4, 18.2 ms)
(AS1-AS2-AS3-AS4, 23.1 ms)
(AS1-AS5-AS9-AS4, 17.9 ms)

Popular AS path observed in the A–B trace timeline: AS1-AS2-AS3-AS4, with a prevalence of 60%.

[Figure: ECDF of the prevalence of popular AS paths, IPv4 and IPv6]
AS path prevalence (Vern Paxson, IEEE/ACM Transactions on Networking, 1997).
Most paths had one dominant route, with 80% dominant for at least half the period.
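As a rough illustration of the prevalence figure quoted for the example timeline, the sketch below counts how often each AS path appears among the (AS path, RTT) tuples of one trace timeline and reports the share of the most popular one. The tuple values are read off the example slide; the function name `path_prevalence` is hypothetical and not taken from the study.

```python
from collections import Counter

# Example A-B trace timeline from the slides: (AS path, end-to-end RTT in ms).
timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 17.9),
]

def path_prevalence(timeline):
    """Hypothetical helper: fraction of traceroutes that used each AS path."""
    counts = Counter(path for path, _rtt in timeline)
    total = len(timeline)
    return {path: n / total for path, n in counts.items()}

prevalence = path_prevalence(timeline)
# The "popular" path is taken here to be the one with the highest prevalence.
popular_path = max(prevalence, key=prevalence.get)
print(popular_path, prevalence[popular_path])
```

For the five example tuples, AS1-AS2-AS3-AS4 appears three times, which gives the 60% prevalence shown on the slide.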
Number of AS-path changes observed in the A–B trace timeline.

[Figure: ECDF of the number of changes per trace timeline (log scale, 1 to 1000), IPv4 and IPv6]
80% of the trace timelines experienced 20 or fewer changes over the course of 16 months.

How do the AS-path changes affect the baseline RTT of server-to-server paths?

Group RTTs by AS path.
Baseline: 10th percentile of the RTTs of each AS path (bucket).

Optimal path: the path with the lowest baseline.
Optimal: AS1-AS5-AS9-AS4
Sub-optimal: AS1-AS2-AS3-AS4

In the example A–B trace timeline, the sub-optimal path has a prevalence of 60%, and its baseline is ~4.5 ms higher than that of the optimal path.

[Figure: fraction of trace timelines (0.6 to 1) vs. prevalence of sub-optimal AS paths (0 to 1); series for v4 and v6 with RTT increase >= 20 ms, >= 50 ms, and >= 100 ms]
Typically a routing change causes only a small change in RTT.
But for a minority of cases, the change can be significant.
For 10% of trace timelines over IPv4, the (sub-optimal) AS paths that led to at least a 20 ms increase in RTTs had a prevalence of at least 30%.

Effect of periods of daily oscillation on end-to-end RTTs

Data Set: Short Term
• ≈3,500 server clusters in 1,000 locations in 100 different countries.
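One way to flag a daily oscillation in an RTT series is to measure how much of its variance sits at one cycle per day. The sketch below is illustrative only and is not the study's procedure: it assumes evenly spaced samples (96 per day, i.e., one every 15 minutes), evaluates a single DFT bin directly instead of running a full FFT so that it needs no third-party library, and uses a hypothetical function name, `diurnal_energy_fraction`. Any threshold applied to its output would also be an assumption.

```python
import cmath
import math
import random

def diurnal_energy_fraction(rtts, samples_per_day=96):
    """Hypothetical helper: share of variance at one cycle per day (0 to 1)."""
    n = len(rtts)
    mean = sum(rtts) / n
    x = [v - mean for v in rtts]            # drop the constant (DC) level
    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0
    # Assumption: the series spans a whole number of days, so bin k is
    # exactly one cycle per day; otherwise it is only the nearest bin.
    k = round(n / samples_per_day)
    coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
    # Parseval: bin k and its mirror n-k together hold 2*|X_k|^2 out of n*energy.
    return 2 * abs(coeff) ** 2 / (n * energy)

samples = 7 * 96  # one week of samples at 15-minute spacing
# Synthetic series: a clean daily swing, and a steady series with small noise.
oscillating = [50 + 10 * math.sin(2 * math.pi * i / 96) for i in range(samples)]
rng = random.Random(0)
steady = [50 + rng.gauss(0, 1) for _ in range(samples)]

frac_oscillating = diurnal_energy_fraction(oscillating)
frac_steady = diurnal_energy_fraction(steady)
print(frac_oscillating, frac_steady)
```

A series dominated by a day/night swing scores close to 1, while a steady series with only random noise scores close to 0.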
Method:
1. ping measurements over the full mesh
2. Use FFT to select congestion candidates
3. Perform traceroute campaigns
4. Infer location of congestion
ping measurements every 15 minutes for one week, from Feb. 22, 2015 through Feb. 28, 2015.
≈2.9M IPv4 and ≈1M IPv6 server pairs.
Based on Time Sequence Latency Probes by Luckie et al., IMC 2014.

[Diagram: end-to-end RTT between A and B, with path segments 1, 2, 3]
Identify the first segment whose RTT is highly correlated with the end-to-end RTT.

Highlights
3155 links were congested in our study of IPv4 traceroutes.
1768 internal & 1121 interconnection links.
When links are weighted by the number of server-to-server paths that cross them, interconnection links are more popular.
A large majority of the interconnection links with congestion were private interconnects.

[Figure: density (0.00 to 0.35) of congestion overhead in msec (0 to 100) for all interconnection, all internal, US−US interconnection, and US−US internal links]
Typical overhead due to congestion is 20-30 ms.
Values between 20 and 30 ms account for 90% of the density in the US, and for 30% in Europe & Asia.
Transcontinental links in Europe & Asia.

Routing changes typically do not affect end-to-end RTTs.
Congestion is not the norm.

What about non-typical cases?

Routing: For 10% of server pairs, the (sub-optimal) AS paths that led to a 20 ms increase in RTTs prevailed for at least 30% of the study period for IPv4 & 50% for IPv6.
Congestion: Only 2% of the server pairs over IPv4, and just 0.6% over IPv6, experience a strong diurnal pattern with an increase in RTT of at least 10 ms.

Routing / Congestion
For 10% of trace timelines, the (sub-optimal) AS paths that led to at least a 20 ms increase in RTTs prevailed for at least 30% of the study period for IPv4 & 50% for IPv6.
Only 2% of the server pairs over IPv4, and just 0.6% over IPv6, experience a strong diurnal pattern with an increase in RTT of at least 10 ms.

• Focus on bandwidth
• No packet loss measurements; platform limitations
• Explore IPv4 & IPv6 infrastructure sharing

[Diagram labels: CDN Servers, Origin Servers, End Users, Internet’s Core]
Use measurements over paths between CDN servers to understand the state of the Internet core.

Number of unique AS paths observed in the A–B trace timeline: AS1-AS2-AS3-AS4 and AS1-AS5-AS9-AS4.

[Figure: ECDF of the number of AS paths per trace timeline (log scale, 1 to 100), IPv4 and IPv6]
80% of trace timelines have 5 or fewer AS paths in IPv4, and 6 or fewer in IPv6.

[Figure: forward-direction (AS path, RTT) tuples shown alongside reverse-direction tuples]
Combine AS paths observed in the forward direction with those in the reverse direction.

[Figure: ECDF of the number of AS-path pairs per server pair (log scale, 1 to 100), IPv4 and IPv6]
Pairing AS paths in the forward & reverse directions still reveals 80% of server pairs to have 8 or fewer path pairs in IPv4, and 9 or fewer in IPv6.
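The per-timeline counts reported in these slides (unique AS paths, and AS-path changes) can be sketched on the example A–B timeline as follows. The helper names are hypothetical, and counting a "change" whenever two consecutive traceroutes differ in AS path is an assumption about how the study tallies changes.

```python
# Example A-B trace timeline from the slides, reduced to AS paths in time order.
paths = [
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
]

def count_unique_paths(paths):
    """Hypothetical helper: number of distinct AS paths in the timeline."""
    return len(set(paths))

def count_path_changes(paths):
    """Hypothetical helper: consecutive traceroutes whose AS path differs.

    Assumption: every difference between neighbours counts as one change.
    """
    return sum(1 for prev, cur in zip(paths, paths[1:]) if prev != cur)

unique_paths = count_unique_paths(paths)
path_changes = count_path_changes(paths)
print(unique_paths, path_changes)
```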
[Figure: ECDF of the difference in RTT (in ms), RTTv4 - RTTv6, from -100 to 100; series: All, Same AS-paths]

Comparing magnitudes of increase in the (baseline) 10th percentile of RTTs of AS paths (each relative to the best AS path of the corresponding trace timeline) with the lifetime of AS paths …

X-axis: deciles of the distribution of AS-path lifetimes.
• Half-open intervals.
• [0.0, 3.0h) has no data points.
• Same value for the 0th and 10th percentiles of the AS-path lifetime distribution.

Y-axis: deciles of the distribution of magnitudes of increase in the 10th percentile of RTTs of AS paths (each relative to the best AS path of the corresponding trace timeline).

Baseline RTTs of AS paths with longer lifetimes are close in value to that of the best AS path of the corresponding trace timelines.

Paths with poor performance are often those with relatively short lifetimes.

Similar observations from IPv6 traceroutes.
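The baseline comparison used in these slides (10th percentile of RTTs per AS path, relative to the best AS path of the trace timeline) can be sketched as below. The tuples are the example A–B timeline; the linear-interpolation percentile and the helper names `percentile` and `baseline_increase` are assumptions made for illustration, since the slides do not say how the percentile is computed.

```python
from collections import defaultdict

# Example A-B trace timeline from the slides: (AS path, end-to-end RTT in ms).
timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 17.9),
]

def percentile(values, q):
    """q-th percentile (0-100); assumption: linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def baseline_increase(timeline, q=10):
    """Hypothetical helper: per-path baseline minus the best baseline, in ms."""
    buckets = defaultdict(list)
    for path, rtt in timeline:
        buckets[path].append(rtt)          # group RTTs by AS path
    baselines = {p: percentile(r, q) for p, r in buckets.items()}
    best = min(baselines.values())         # baseline of the optimal path
    return {p: b - best for p, b in baselines.items()}

increase = baseline_increase(timeline)
for path, delta in sorted(increase.items(), key=lambda kv: kv[1]):
    print(f"{path}: +{delta:.2f} ms")
```

Under these assumptions, the example timeline gives an increase of roughly 4.5 ms for AS1-AS2-AS3-AS4 over AS1-AS5-AS9-AS4, in line with the value quoted on the slide.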