A Server-to-Server View of the Internet
Bala, Georgios, Arthur, Matthew, KC
[Figure: end users, CDN servers, origin servers, and the Internet's core]
Content moved closer to end users to reduce latency.
Connections from end users are terminated at CDN
servers close to the end users.
Viewing the Internet’s core from the distributed
measurement platform of a CDN.
[Figure: end users, CDN servers, origin servers, and the Internet's core]
Back-office Web traffic accounts for a significant fraction of core Internet
traffic (Pujol et al., IMC, Nov. 2014).
End-user experience is at the mercy of the unreliable Internet and its
middle-mile bottlenecks (T. Leighton, CACM, Vol. 52, No. 2, Feb. 2009).
[Figure: six-month timeline (Jan to Jul) of RTT in ms, over IPv4 and IPv6, between servers in Hong Kong, HK and Tokyo, JP]
A six-month timeline of RTTs between servers in
Hong Kong, HK and Tokyo, JP.
Level shifts in RTTs over both IPv4 and IPv6.
To what extent do changes in the AS path affect
round-trip times?
[Figure: RTT in ms, over IPv4 and IPv6, from 03/26 to 04/02, with day and night periods marked]
A portion of the timeline of RTTs between servers in
Hong Kong, HK and Tokyo, JP.
Daily oscillations in RTT between the servers.
How common are periods of daily oscillation in
RTT, and where do they occur?
What affects end-to-end RTTs more: routing or
congestion?
How do IPv4 and IPv6 compare with respect to
routing and performance?
1. To what extent do changes in the AS path affect
round-trip times?
2. How common are periods of daily oscillation in
RTT, and where do they occur?
Effect of routing changes on end-to-end RTTs
Data Set: Long Term
• ≈600 dual-stacked servers in 70 different countries.
  ‣ US, AU, DE, IN, JP, …
[Figure: servers A and B on a time axis]
Traceroutes conducted between servers in both
directions over both protocols.
[Figure: traceroutes between A and B repeated every 3 hours along a time axis]
Traceroutes are run over the full mesh every 3 hours.
All traceroutes in a given 3-hour time frame have
the same timestamp.
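The shared timestamp can be pictured as rounding each measurement time down to the start of its 3-hour window. The sketch below is only an illustration: the function name `window_start` and the choice to align windows to midnight UTC are assumptions, not details given in the slides.

```python
from datetime import datetime, timezone

WINDOW_SECONDS = 3 * 3600  # one full-mesh round every 3 hours


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 3-hour window (UTC-aligned, assumed)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


# Two traceroutes launched at different times inside the same window
# end up with the same timestamp.
a = window_start(datetime(2014, 1, 1, 4, 17, tzinfo=timezone.utc))
b = window_start(datetime(2014, 1, 1, 5, 59, tzinfo=timezone.utc))
```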
A-B Trace Timeline
[Figure: repeated traceroutes between A and B along a time axis]
Traceroutes over the full mesh every 3 hours for 16
months, from Jan. 2014 through Apr. 2015.
≈700M IPv4 and ≈600M IPv6 traceroutes.
Trace timeline Sa ➝ Sb is different from Sb ➝ Sa.
[Figure: one traceroute from A to B summarized as AS1-AS2-AS3-AS4, 22.3 ms]
• Extract two pieces of information from each traceroute:
  ‣ AS path inferred from interfaces in the traceroute output
  ‣ end-to-end RTT between the two servers
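A minimal sketch of the extraction step, under stated assumptions: the slides do not say how interfaces are mapped to ASes, so `ip_to_as` is a hypothetical, pre-built lookup table, and the addresses come from documentation ranges rather than real measurements. The sketch simply drops hops that cannot be mapped and collapses consecutive hops in the same AS.

```python
from typing import Dict, List, Optional


def as_path_from_hops(hop_ips: List[Optional[str]], ip_to_as: Dict[str, str]) -> List[str]:
    """Collapse a traceroute's hop interfaces into an AS path.

    hop_ips holds one interface address per hop (None for an unresponsive hop).
    ip_to_as is a hypothetical lookup from interface address to AS.
    """
    path: List[str] = []
    for ip in hop_ips:
        asn = ip_to_as.get(ip) if ip is not None else None
        if asn is None:
            continue  # skip hops that did not answer or cannot be mapped
        if not path or path[-1] != asn:
            path.append(asn)  # keep each AS once per consecutive run of hops
    return path


# Illustrative data only.
ip_to_as = {
    "192.0.2.1": "AS1", "192.0.2.2": "AS1",
    "198.51.100.1": "AS2", "203.0.113.1": "AS3", "203.0.113.9": "AS4",
}
hops = ["192.0.2.1", "192.0.2.2", None, "198.51.100.1", "203.0.113.1", "203.0.113.9"]
trace = (as_path_from_hops(hops, ip_to_as), 22.3)  # (AS path, end-to-end RTT in ms)
```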
[Figure: A–B trace timeline]
(AS1-AS2-AS3-AS4, 22.3)
(AS1-AS2-AS3-AS4, 29.7)
(AS1-AS5-AS9-AS4, 18.2)
(AS1-AS2-AS3-AS4, 23.1)
(AS1-AS5-AS9-AS4, 17.9)
A–B trace timeline:
(AS path, end-to-end RTT) tuples spanning the study period.
[Figure: A–B trace timeline]
Popular AS path observed in the A–B trace timeline:
AS1-AS2-AS3-AS4, with prevalence 60%.
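As a rough illustration, prevalence can be computed as the share of traceroutes in a trace timeline that followed a given AS path. The sketch below uses the five (AS path, RTT) tuples from the example timeline; the function name `popular_path` is an assumption made for this example.

```python
from collections import Counter

# The example A–B trace timeline: (AS path, end-to-end RTT in ms).
timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 17.9),
]


def popular_path(timeline):
    """Return the most frequently observed AS path and its prevalence."""
    counts = Counter(path for path, _rtt in timeline)
    path, hits = counts.most_common(1)[0]
    return path, hits / len(timeline)


path, prevalence = popular_path(timeline)  # 3 of 5 observations
```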
[Figure: ECDF of the prevalence of popular AS paths, IPv4 and IPv6]
AS path prevalence (Vern Paxson, IEEE/ACM
Transactions on Networking, 1997).
Most trace timelines had one dominant AS path; for 80% of them,
that path was dominant for at least half of the study period.
[Figure: A–B trace timeline]
Number of AS-path changes observed in the A–B
trace timeline.
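One plausible way to count changes, assuming a change is any pair of consecutive traceroutes in the timeline that report different AS paths (the slides do not spell out the definition):

```python
def count_path_changes(paths):
    """Count transitions between differing AS paths in consecutive observations."""
    return sum(1 for prev, cur in zip(paths, paths[1:]) if prev != cur)


# AS paths of the example A–B trace timeline, in time order.
paths = [
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
]
changes = count_path_changes(paths)
```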
[Figure: ECDF of the number of changes per trace timeline (log scale, 1 to 1000), IPv4 and IPv6]
80% of the trace timelines experienced 20 or fewer
changes over the course of 16 months.
How do the AS-path changes affect the baseline
RTT of server-to-server paths?
[Figure: A–B trace timeline]
Group RTTs by AS path.
Baseline: 10th percentile of the RTTs of each AS path (bucket).
Optimal path: the path with the lowest baseline.
Optimal: AS1-AS5-AS9-AS4
Sub-optimal: AS1-AS2-AS3-AS4
The sub-optimal path, with a prevalence of 60%, has a baseline
about 4.5 ms higher than that of the optimal path.
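The baseline comparison can be sketched as follows. The percentile rule (linear interpolation between ranks) is an assumption, since the slides only say "10th percentile"; with that rule, the example timeline yields an increase of about 4.5 ms for the sub-optimal path.

```python
from collections import defaultdict

timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 17.9),
]


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between ranks (assumed rule)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def baselines(timeline, q=10):
    """Group RTTs by AS path and return each path's baseline RTT."""
    buckets = defaultdict(list)
    for path, rtt in timeline:
        buckets[path].append(rtt)
    return {path: percentile(rtts, q) for path, rtts in buckets.items()}


base = baselines(timeline)
optimal = min(base, key=base.get)  # path with the lowest baseline
increase = {path: b - base[optimal] for path, b in base.items()}
```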
[Figure: fraction of trace timelines vs. prevalence of sub-optimal AS paths, for RTT increases of at least 20, 50, and 100 ms, over IPv4 and IPv6]
Typically a routing change causes only a small
change in RTT.
But for a minority of cases, the change can be significant.
For 10% of trace timelines over IPv4, the (sub-optimal) AS paths that
led to at least a 20 ms increase in RTTs had a prevalence of at
least 30%.
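A hedged sketch of the per-timeline quantity behind this plot: the combined prevalence of AS paths whose baseline exceeds the best baseline by at least a given margin. Summing prevalence over all such paths, and the interpolated percentile, are assumptions; the function name is hypothetical.

```python
from collections import defaultdict


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between ranks (assumed rule)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def suboptimal_prevalence(timeline, min_increase_ms):
    """Share of observations on paths at least min_increase_ms above the best baseline."""
    buckets = defaultdict(list)
    for path, rtt in timeline:
        buckets[path].append(rtt)
    base = {path: percentile(rtts, 10) for path, rtts in buckets.items()}
    best = min(base.values())
    slow = sum(len(buckets[p]) for p in base if base[p] - best >= min_increase_ms)
    return slow / len(timeline)


timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 17.9),
]
small = suboptimal_prevalence(timeline, 4.0)   # the ~4.5 ms path qualifies
large = suboptimal_prevalence(timeline, 20.0)  # no path is 20 ms worse
```

A trace timeline would count toward the 10% figure when this value, computed with a 20 ms margin, is at least 0.3.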
Effect of periods of daily oscillation
on end-to-end RTTs
Data Set: Short Term
• ≈3,500 server clusters in 1,000 locations in 100
different countries.
1. Ping measurements over the full mesh
2. Use FFT to select congestion candidates
3. Perform traceroute campaigns
4. Infer location of congestion
Ping measurements every 15 minutes for one week, from Feb.
22, 2015 through Feb. 28, 2015.
≈2.9M IPv4 and ≈1M IPv6 server pairs.
Based on Time Sequence Latency Probes by Luckie et al., IMC 2014.
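The candidate-selection step can be illustrated with a frequency-domain check for a one-cycle-per-day component in a week of 15-minute pings. This is a sketch under assumptions, not the authors' procedure: it evaluates a single DFT bin directly (the value an FFT would return at that frequency), assumes a complete and evenly spaced series, and leaves the decision threshold to the reader.

```python
import cmath
import math

SAMPLES_PER_DAY = 96  # one ping every 15 minutes
DAYS = 7


def daily_power_fraction(rtts, days=DAYS):
    """Fraction of the series' variance carried by the one-cycle-per-day frequency."""
    n = len(rtts)
    mean = sum(rtts) / n
    centered = [r - mean for r in rtts]
    total = sum(c * c for c in centered)
    if total == 0:
        return 0.0  # flat series: no oscillation at all
    k = days  # over a `days`-long series, bin k = days is one cycle per day
    coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n) for i, c in enumerate(centered))
    # Factor 2 accounts for the mirrored bin n - k of a real-valued signal.
    return 2 * abs(coeff) ** 2 / (n * total)


n = SAMPLES_PER_DAY * DAYS
diurnal = [50 + 10 * math.sin(2 * math.pi * i / SAMPLES_PER_DAY) for i in range(n)]
weekly = [50 + 10 * math.sin(2 * math.pi * i / n) for i in range(n)]
flat = [50.0] * n
```

A server pair would be flagged as a congestion candidate when this fraction exceeds some chosen cutoff; the slides do not state one.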
[Figure: end-to-end RTT between A and B, with the path divided into segments 1, 2, and 3]
Identify the first segment whose RTT is highly correlated with the
end-to-end RTT.
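A minimal sketch of this localization step, assuming each segment has an RTT time series sampled at the same instants as the end-to-end RTT. Pearson correlation and the 0.8 threshold are illustrative choices, not values taken from the slides, and the sample numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def first_correlated_segment(segment_rtts, end_to_end, threshold=0.8):
    """Return the 1-based index of the first segment tracking the end-to-end RTT."""
    for index, series in enumerate(segment_rtts, start=1):
        if pearson(series, end_to_end) >= threshold:
            return index
    return None


end_to_end = [40, 60, 41, 62, 39, 61]
segments = [
    [5, 5, 5, 5, 5, 5],        # segment 1: flat
    [12, 13, 12, 12, 13, 12],  # segment 2: small unrelated noise
    [30, 50, 31, 52, 29, 51],  # segment 3: follows the end-to-end RTT
]
located = first_correlated_segment(segments, end_to_end)
```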
Highlights
3155 links were congested in our study of IPv4 traceroutes.
1768 internal & 1121 interconnection links.

Weighting links by the number of server-to-server paths that
cross them … interconnection links are more popular!

A large majority of the interconnection links with congestion
were private interconnects.
[Figure: density of congestion overhead in msec (0 to 100) for all interconnection, all internal, US−US interconnection, and US−US internal links]
Typical overhead due to congestion is 20-30 ms.
Values between 20 and 30 ms account for 90% of the density in the US,
and for 30% of the density in Europe & Asia.
Transcontinental links in Europe & Asia.
Routing changes typically do not affect end-to-end RTTs.
Congestion is not the norm.
What about non-typical cases?
Routing: For 10% of trace timelines, the (sub-optimal) AS paths that led
to at least a 20 ms increase in RTTs were in use for at least 30% of the
study period for IPv4 & 50% for IPv6.
Congestion: Only 2% of the server pairs over IPv4, and just 0.6% over IPv6,
experience a strong diurnal pattern with an increase in RTT of at least
10 ms.
• Focus on bandwidth
• No packet loss measurements; platform limitations
• Explore IPv4 & IPv6 infrastructure sharing
[Figure: end users, CDN servers, origin servers, and the Internet's core]
Use measurements over paths between CDN
servers to understand the state of the Internet
core.
[Figure: A–B trace timeline]
Number of unique AS paths observed in the A–B
trace timeline: two, AS1-AS2-AS3-AS4 and AS1-AS5-AS9-AS4.
[Figure: ECDF of the number of AS paths per trace timeline (log scale, 1 to 100), IPv4 and IPv6]
80% of trace timelines have 5 or fewer AS paths in
IPv4, and 6 or fewer in IPv6.
[Figure: A–B trace timeline with each forward AS path shown alongside a reverse-direction AS path and its RTT]
Combine AS paths observed in the forward
direction with those in the reverse direction.
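One way to picture the pairing, assuming forward and reverse traceroutes are matched on their shared measurement timestamp. The dictionaries, keys (hours since the start), and path strings below are illustrative, not data from the study.

```python
def path_pairs(forward, reverse):
    """Pair forward and reverse AS paths observed at the same timestamp."""
    shared = set(forward) & set(reverse)
    return {(forward[t], reverse[t]) for t in shared}


# Hypothetical timelines keyed by measurement timestamp.
forward = {0: "AS1-AS2-AS3-AS4", 3: "AS1-AS2-AS3-AS4", 6: "AS1-AS5-AS9-AS4"}
reverse = {0: "AS4-AS3-AS8-AS1", 3: "AS4-AS3-AS8-AS1", 6: "AS4-AS3-AS8-AS1"}
pairs = path_pairs(forward, reverse)  # unique (forward, reverse) combinations
```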
[Figure: ECDF of the number of AS-path pairs per server pair (log scale, 1 to 100), IPv4 and IPv6]
Pairing AS paths in the forward & reverse directions
still reveals 80% of server pairs to have 8 or fewer
path pairs in IPv4, and 9 or fewer in IPv6.
[Figure: ECDF of the difference in RTT (in ms), RTTv4 - RTTv6, from -100 to 100; series: All, Same AS-paths]
Comparing magnitudes of increase in (baseline) 10th
percentile of RTTs of AS paths (each relative to the best
AS path of the corresponding trace timeline) with the
lifetime of AS paths …
X-axis: deciles of the distribution of AS-path lifetimes.
The bins are half-open intervals.
[0.0, 3.0h) has no data points.
Same value for the 0th and 10th percentiles of the AS-path lifetime distribution.
Y-axis: deciles of the distribution of magnitudes of
increase in 10th percentile of RTTs of AS paths (each
relative to the best AS path of the corresponding trace
timeline).
Baseline RTTs of AS paths with longer lifetimes are
close in value to that of the best AS path of
corresponding trace timelines.
Paths with poor performance are often those with
relatively short lifetimes.
Similar observations from IPv6 traceroutes.