A Measurement Framework for Pin-Pointing Routing Changes
Renata Teixeira
(UC San Diego)
http://www-cse.ucsd.edu/~teixeira
with
Jennifer Rexford (AT&T)
NetTs’04 – Portland, OR
Why understand routing changes?
• Routing changes cause service disruptions
  • Convergence delay
  • Traffic shift
  • Change in path properties
    • RTT, available bandwidth, or lost connectivity
• Operators need to know
  • Why: for diagnosing and fixing problems
  • Where: for accountability
    • Need to guarantee service-level agreements
What can be done with active measurements?
• Active measurements: traceroute-like tools
  • Can’t probe in the past
  • Shows the effect, not the cause
[Figure: user (s) and web server (d) connected through AS 1, AS 2, AS 3, and AS 4]
Can we use passive measurements?
• Passive measurements: public BGP data
[Figure: BGP update feeds, data collection (RouteViews, RIPE), data correlation, root cause]
Why Is Public BGP Data Not Enough?
Myth: The BGP updates from a single router accurately represent the AS.
[Figure: AS 1 and AS 2 on the way to destination dst; routers A, B, C, D with numeric labels 6, 7, 10, 12; the BGP data collection point sees no change]
The measurement system needs to capture the BGP routing changes from all border routers.
Why Is Public BGP Data Not Enough?
Myth: Routing changes visible in eBGP have greater end-to-end impact than changes with local scope.
[Figure: AS 1 and AS 2 on the way to destination dst; routers A, B, C, D with numeric labels 5, 6, 7, 10, 12; BGP data collection point]
The measurement system needs to capture internal changes inside an AS.
Why Is Public BGP Data Not Enough?
Myth: BGP data from a router accurately represents changes on that router.
[Figure: router A with routes for 12.1.1.0/24 and 12.1.0.0/16; BGP data collection point]
The measurement system needs to know all routes the router knows.
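One possible reading of this slide, offered as an assumption because the slide does not spell out the mechanism: forwarding follows longest-prefix match, so a change to the route for 12.1.1.0/24 can move traffic even when the route for the covering 12.1.0.0/16 never changes. The sketch below uses Python's standard `ipaddress` module. The two prefixes come from the slide; the function name and the next-hop labels are hypothetical.

```python
import ipaddress


def longest_prefix_match(table, address):
    """Return the (prefix, next_hop) pair with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)


# Prefixes are taken from the slide; next-hop labels are made up for illustration.
table = {
    ipaddress.ip_network("12.1.0.0/16"): "neighbor-X",
    ipaddress.ip_network("12.1.1.0/24"): "neighbor-Y",
}

before = longest_prefix_match(table, "12.1.1.7")

# Withdraw the more-specific route; the /16 entry itself is untouched.
del table[ipaddress.ip_network("12.1.1.0/24")]
after = longest_prefix_match(table, "12.1.1.7")

print("before:", before)
print("after: ", after)
```

A monitor that only tracked the /16 would see no change here, although traffic to 12.1.1.7 now follows a different route. Knowing every route the router holds is what makes the shift visible.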
Misleading BGP Changes
Myth: The AS responsible for the change appears in the old or the new AS path.
[Figure: topology of ASes 1 through 11 with a BGP data collection point; old path: 1,2,8,9,10; new path: 1,4,5,6,7,10]
Accurate troubleshooting may require measurement data from each AS.
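To make the myth concrete, the short sketch below lists which ASes from the figure appear on neither the old nor the new path. The slide does not say which AS caused the change. The sketch only shows that a cause located in an off-path AS would be invisible to an analysis restricted to the two paths. Treating labels 1 through 11 as the complete AS set is an assumption read off the figure, and the function name is hypothetical.

```python
# AS paths as shown on the slide.
old_path = [1, 2, 8, 9, 10]
new_path = [1, 4, 5, 6, 7, 10]

# Assumption: the figure's labels 1..11 are all the ASes in the topology.
all_ases = set(range(1, 12))


def ases_on_either_path(old, new):
    """Return every AS that appears in the old or the new AS path."""
    return set(old) | set(new)


visible = ases_on_either_path(old_path, new_path)
off_path = all_ases - visible

print("on old or new path:", sorted(visible))
print("on neither path:   ", sorted(off_path))
```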
Misleading BGP Changes
Myth: Looking at routing changes across prefixes resolves
[Figure: AS 1, AS 2, and AS 3 with routers A, B, C and numeric labels 7, 10, 12; destinations d1, d2, d3; the BGP data collection point sees changes for d2, but not for d1 and d3]
ASes involved in the change need to cooperate to pin-point the reason for the change.
Strawman Proposal: Omni Server
• Creating an AS-level view
  • BGP feeds from all border routers
    • Inject all routes known in each router
  • Internal routing data
  • Archive log of routing changes
• Responding to queries
  • Local cause: responds directly
  • No local change: query neighbor AS
  • Local change from downstream cause: query old and/or new neighbor AS
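The three query-handling cases above can be sketched as a small recursive lookup. This is a minimal illustration under stated assumptions, not the authors' design. The `Change` record, the `Omni` class and its fields, and the exact-match lookup on `(dest, time)` are all hypothetical. A real system would presumably match a time window and authenticate neighbors.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    """One archived routing change (hypothetical record format)."""
    cause: Optional[str]          # set only when the cause is local to this AS
    old_next_as: Optional[int]    # neighbor AS used before the change
    new_next_as: Optional[int]    # neighbor AS used after the change


@dataclass
class Omni:
    asn: int
    next_as: dict = field(default_factory=dict)  # dest -> current neighbor AS
    log: dict = field(default_factory=dict)      # (dest, time) -> Change
    peers: dict = field(default_factory=dict)    # neighbor ASN -> Omni

    def query(self, dest, time, visited=None):
        """Return (asn, cause) for the change affecting dest, or None if unknown."""
        visited = set() if visited is None else visited
        if self.asn in visited:       # avoid asking the same AS twice
            return None
        visited.add(self.asn)

        change = self.log.get((dest, time))
        if change is None:
            # No local change: ask the neighbor AS currently on the path.
            candidates = [self.next_as.get(dest)]
        elif change.cause is not None:
            # Local cause: respond directly.
            return (self.asn, change.cause)
        else:
            # Local change from a downstream cause: ask old and/or new neighbor.
            candidates = [change.old_next_as, change.new_next_as]

        for asn in candidates:
            peer = self.peers.get(asn)
            if peer is not None:
                answer = peer.query(dest, time, visited)
                if answer is not None:
                    return answer
        return None


# Hypothetical scenario: AS 1 moved destination "d" from neighbor AS 3 to AS 2
# because AS 3 lost its link to AS 4.
omni2 = Omni(asn=2)
omni3 = Omni(asn=3, log={("d", 100): Change("failure of link (3,4)", 4, None)})
omni1 = Omni(asn=1, next_as={"d": 2},
             log={("d", 100): Change(None, 3, 2)},
             peers={2: omni2, 3: omni3})

print(omni1.query("d", 100))
```

In this made-up scenario, AS 1 sees a local change without a local cause, so it asks its old neighbor first, and AS 3 answers directly from its own log.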
Diagnosis with Omnis
[Figure: user (s) and web server (d) connected through AS 1 to AS 4, each with its own Omni server (Omni 1 to Omni 4); the queries (i, s, d, t) and (j, s, d, t’) both lead to the answer "failure link (3,4)"]
Conclusion
• Passive data
• AS-level view
• History (answers in the past)
• Distributed
• Active querying
• Servers, not routers
• See cause, not effect
Future Directions
• How often are the myths violated?
  • Measurement studies of ISP networks
• Doesn’t Omni require lots of data?
  • ISPs already collect this kind of data
  • Routing protocol extensions to reveal the reasons for routing changes
• Will ASes really cooperate?
  • Pressure to provide service-level agreements
  • Small group of ASes that choose to cooperate
• Won’t ASes cheat?
  • Need techniques to detect persistent lying