E2E Delay variation

End-to-End Issues
Route Diversity
 Load balancing
o Per packet splitting
o Per flow splitting
 Spill over
 Route change
o Failure
o Policy
 Route flapping
Diversity Effects
 Packet reorder
o TCP performance
o Larger playback buffer
 Jitter
However, jitter can also occur due to cross traffic.
Which contributes more to jitter: route diversity or cross traffic?
What contributes more to jitter?
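 Understand the origins of e2e delay variations
o Resulting from the existence of multiple routes
• Designed load balancing or transient failures
[Figure: multiple routes between source S and destination D]
o Resulting from variations within each route
• Intra-route issues (congestion)
[Figure: a single route between source S and destination D]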
Related Work
 [Wang et al., Pucha et al.] studied the impact that specific routing events have on the overall delay
o Routing changes result in a significant RTT increase
o However, the variability is small
 [Augustin et al.] examined the delay difference between parallel routes within short time epochs
o Only 12% have a delay difference larger than 1 ms
 [Pathak et al.] studied delay asymmetry
o There is a strong correlation between one-way delay changes and route changes
Key Differences
 We study the RTT delay over longer time periods
 Examine the difference in delay distribution between parallel routes
 Focus on the origin of delay variability
o Within each route (e.g. congestion)
o Due to multiple routes (e.g. load balancing)
How do we measure?
• Use DIMES for conducting two experiments
– 2006 and 2009
– Over 100 agents measure to each other
– Broad set of ASes and geographic locations
– Active traceroute (ICMP and UDP)
– Each agent probes all target IPs every 1-2 hours
– 4 days of probing
– Collect the route IPs and e2e delay
Vantage Point Statistics (1)
 2006
o 102 VPs
o Million traceroutes
o 6861 e2e pairs
o VPs in North America (70), Western Europe (14), Australia (10), Russia (6), Israel (2)
 2009
o 105 VPs
o Million traceroutes
o 10950 e2e pairs
o VPs in Western Europe (41), North America (38), Russia (14), Australia (4), South America (2), Israel (2), Asia (4)
Vantage Point Statistics (2)
 2006
o 18% tier-1
o 78% tier-2
o 3% small companies
o 1% educational
 2009
o 14% tier-1
o 58% tier-2
o 28% educational
Only 7 agents participated in both experiments
Identifying Routes and Pairs
 Using community-based infrastructure
o Routes can start and end in non-routable IPs
o Users can measure from different locations
 Only the routable section of each path is considered
o The source (S) is the first routable IP
o The destination (D) is the last routable IP
[Figure: a measured path from the VP through S and D to the Target; only the S-to-D section is considered]
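As an illustration of trimming a path to its routable section, here is a minimal Python sketch. It assumes each measured path is a list of hop IP strings and approximates "routable" with the standard library's `is_global` flag; the function name `routable_section` and the sample addresses are hypothetical and are not taken from the DIMES tooling.

```python
import ipaddress

def routable_section(hops):
    """Return the sub-path from the first to the last routable hop.

    `hops` is a list of IP address strings in traceroute order.
    Assumption: "routable" is approximated by ipaddress's is_global flag.
    """
    routable = [i for i, hop in enumerate(hops)
                if ipaddress.ip_address(hop).is_global]
    if not routable:
        return []  # no routable hop: the path yields no (S, D) pair
    return hops[routable[0]:routable[-1] + 1]

# Hypothetical path: private hops at both ends, public hops in between.
path = ["192.168.1.1", "10.0.0.1", "8.8.8.8", "1.1.1.1", "10.1.2.3"]
section = routable_section(path)
source, destination = section[0], section[-1]
```

Hops between S and D are kept even if they are not routable themselves, since only the two ends define the pair.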
Some Accounting
• The e2e pair P_i = (S, D) contains all the measured paths between S and D
• For a pair P_i, the set E_ij contains the measured paths that follow route j
• For a pair P_i, the dominant route E_ir is the route that was seen the most times
– There can be several dominant routes with equal prevalence
– For brevity we assume there is one, at index r
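The accounting above can be sketched in a few lines of Python. This is only an illustration: the tuple-of-hops representation and the names `group_by_route` and `dominant_route` are assumptions, and ties between equally prevalent routes are broken by first appearance, in line with the "assume there is one" simplification.

```python
from collections import Counter

def group_by_route(paths):
    """Count how many measured paths follow each distinct route.

    `paths` holds the measured paths of one pair, each given as a tuple
    of hop IPs; the count for route j is the size of the set for route j.
    """
    return Counter(paths)

def dominant_route(route_counts):
    """Return the route seen the most times (ties: first one encountered)."""
    route, _ = route_counts.most_common(1)[0]
    return route

# Hypothetical measurements for one pair (hop names stand in for IPs).
measured = [
    ("S", "a", "b", "D"),
    ("S", "a", "c", "D"),
    ("S", "a", "b", "D"),
]
counts = group_by_route(measured)
dominant = dominant_route(counts)
```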
Delay Stability (1)
 Each route E_ij has a set of RTT delays, one for each measured path
 Each delay is a sample; we consider the length of the 95% confidence interval around the mean value, CI(E_ij)
 High-variance samples result in a long CI
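A minimal sketch of the confidence interval computation follows. The slides do not say how the interval is computed, so this assumes the common normal approximation (mean plus or minus 1.96 standard errors); the function name `ci_95` and the sample RTT values are hypothetical.

```python
import math
import statistics

def ci_95(rtts):
    """Return (low, high) of a 95% confidence interval around the mean RTT.

    Assumption: normal approximation, mean +/- 1.96 * s / sqrt(n).
    Needs at least two samples, since the sample standard deviation is used.
    """
    mean = statistics.mean(rtts)
    half_width = 1.96 * statistics.stdev(rtts) / math.sqrt(len(rtts))
    return (mean - half_width, mean + half_width)

# Hypothetical RTT samples in milliseconds for two routes with the same mean.
stable = [100.0, 102.0, 98.0, 101.0, 99.0]
noisy = [60.0, 140.0, 80.0, 120.0, 100.0]
stable_ci = ci_95(stable)
noisy_ci = ci_95(noisy)
```

Both sample sets have the same mean, but the noisy route produces a much longer interval, which is the effect the last bullet describes.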
Delay Stability (2)
 The RTT stability of two routes is the intersection (overlap) between their CIs
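The overlap between two confidence intervals might be computed as below. The results report overlap on a 0 to 1 scale, but the slides do not state the normalization, so dividing by the length of the shorter interval is an assumption of this sketch, as is the function name `ci_overlap`.

```python
def ci_overlap(a, b):
    """Overlap of two intervals a=(low, high) and b=(low, high), in [0, 1].

    Assumption: the intersection length is normalized by the length of the
    shorter interval, so a CI fully contained in the other scores 1.0.
    """
    intersection = min(a[1], b[1]) - max(a[0], b[0])
    if intersection <= 0:
        return 0.0  # disjoint (or merely touching) intervals
    shorter = min(a[1] - a[0], b[1] - b[0])
    return intersection / shorter

partial = ci_overlap((0.0, 10.0), (5.0, 15.0))
disjoint = ci_overlap((0.0, 10.0), (20.0, 30.0))
nested = ci_overlap((0.0, 10.0), (2.0, 4.0))
```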
Key Concept
 Overlapping CIs (left): delay variance is intra-route
 Non-overlapping CIs (right): delay variance is inter-route
Route Stability (1)
 Prevalence is the overall appearance ratio of route j among the measured paths of pair P_i
 As a stability measure, we use the prevalence of the dominant route r
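Prevalence as an appearance ratio can be sketched as follows, assuming the per-route counts of one pair are already available as a mapping from route label to number of measured paths. The function name `prevalence` and the sample counts are hypothetical.

```python
def prevalence(route_counts):
    """Map each route to the fraction of the pair's measured paths that follow it."""
    total = sum(route_counts.values())
    return {route: count / total for route, count in route_counts.items()}

# Hypothetical counts for one pair: route label -> number of measured paths.
counts = {"route_a": 6, "route_b": 3, "route_c": 1}
ratios = prevalence(counts)

# The stability measure is the prevalence of the dominant (most seen) route.
dominant_prevalence = max(ratios.values())
```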
Route Stability (2)
 Use Edit Distance (ED) as a measure of the difference between two routes
o Counts insert, delete and substitute operations
 Normalize ED by the length of the longer of the two compared routes
o Allows comparing ED between routes of different lengths
o The normalized ED is defined per pair i, between route j and the dominant route r
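A small sketch of the normalized edit distance between two routes is given below. It uses the standard Levenshtein dynamic program with unit cost per insert, delete and substitute, which is an assumption about the exact costs used; the function names and the sample hop labels are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two hop sequences (unit costs assumed)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]

def normalized_ed(a, b):
    """Edit distance divided by the length of the longer route."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0

# Hypothetical routes that differ in a single hop.
route_j = ["A", "B", "C", "D"]
route_r = ["A", "X", "C", "D"]
distance = normalized_ed(route_j, route_r)
```

Dividing by the longer route keeps the value between 0 (identical routes) and 1 (nothing in common), whatever the route lengths.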
Route Stability (3)
 The stability is the weighted average of the normalized ED from each non-dominant route to the dominant route of nearest length
[Equation not preserved in the extracted slide text]
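The weighted average described above could look like the sketch below. The slide's own equation is not available here, so the weights are assumed to be the number of measured paths on each non-dominant route, and the normalized ED values are assumed to be computed beforehand. Both function names are hypothetical.

```python
def nearest_length_dominant(route, dominant_routes):
    """Pick the dominant route whose hop count is closest to that of `route`."""
    return min(dominant_routes, key=lambda d: abs(len(d) - len(route)))

def weighted_route_distance(non_dominant):
    """Weighted average of normalized ED over the non-dominant routes.

    `non_dominant` is a list of (normalized_ed, weight) pairs.
    Assumption: the weight is the number of measured paths on that route.
    """
    total_weight = sum(weight for _, weight in non_dominant)
    if total_weight == 0:
        return 0.0  # a pair with a single route has no instability
    return sum(ed * weight for ed, weight in non_dominant) / total_weight

# Hypothetical pair: two non-dominant routes, seen 3 times and once.
score = weighted_route_distance([(0.25, 3), (0.5, 1)])

# Hypothetical tie between two dominant routes of different lengths.
chosen = nearest_length_dominant(["A", "B", "C"],
                                 [["A", "X", "Y", "Z", "C"], ["A", "X", "C"]])
```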
Things to Note
 We measure RTT values
o Capture forward and reverse path delay
o Route stability is measured on the forward path
• However, 90% of our routes have very similar forward and reverse paths
• This indicates that one-way stability is a good estimate
 Using UDP and ICMP
o Capture all possible routes, not flows
o Upper bound for instability
Results – Route Statistics
• Both data sets have roughly the same route length and median delay
• Shorter routes than Paxson’s (15-16 hops)
• Median delay is around 100 ms
Results – Route Stability
• Overall stable e2e routing (25% of pairs with a single route)
• Academic pairs have higher stability (55% single route)
• USA pairs have slightly higher stability (35% single route)
• RouteISM < 0.2 for over 90% of the pairs
[Figure annotation: load balancers, not visible in academic pairs]
Results – Origin of Delay Instability
• In 70% of the cases, the change in delay is within routes (high overlap) and not due to multipath routing
• In 15% of the cases (20% in the 2006 data set), the change in delay is mainly due to route changes
Results – Route and Delay Instability
[Figure annotation: high percentage of zero overlap]
• When the difference between routes is large, their delay distributions are more likely to differ
• Prevalence is not a significant indicator of the level of overlap!
Results – Additional Findings
 Over 95% of the academic pairs have an overlap of over 0.7
o Academic networks make little use of load balancing and “spill-over” backup routes
 Only 5% of the USA pairs have an overlap of 0
Conclusions
 Delay variations are mostly within the routes
o 70% of the pairs
o 95% of the academic pairs
 Internet e2e routes are mostly stable
 The academic Internet is significantly more stable than commercial networks