Measuring ISP topologies with Rocketfuel
Ratul Mahajan
Neil Spring
David Wetherall
University of Washington
ACM SIGCOMM 2002
Motivation
• To understand Internet structure and design.
– How ISP router-level topologies are designed.
• Can’t get the real maps.
– Backbone maps often available in marketing form.
– Severely lacking in router-level detail.
ISP topologies for research
• Could extract from a whole-Internet map, e.g., Skitter, Mercator, Lumeta.
• Paper’s Philosophy:
– By focusing on an ISP, can get better precision.
– ISPs publish enough information to reconstruct maps.
– End goal is more accurate maps for research.
Terminology
• Each POP is a physical location where the ISP houses a collection of routers.
• The ISP backbone connects these POPs, and the routers attached to inter-POP links are called backbone or core routers.
• Within every POP, access routers provide an intermediate layer between the ISP backbone and routers in neighboring networks.
Points of Presence and Backbone
Rocketfuel’s Backbone Map
They aren’t telling us everything…
Rocketfuel Methodology
• ISPs release “helpful” information:
– BGP - which prefixes are served
– Traceroute - what the paths are
– DNS - where routers are and what they do
• Build detailed maps:
– Backbone
– POPs
– Peering links
Traceroutes
• Publicly available traceroute servers
• Challenge: To build accurate ISP maps using few measurements
• Brute Force Method
– 784 vantage points to 120,000 allocated prefixes in the BGP table
– Each vantage point queried every 1.5 minutes: 125 days to complete a map.
• Capitalize on routing information
• Identify traceroutes which transit the ISP network
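The brute-force figures above follow from simple arithmetic. A quick check, assuming each vantage point works through the prefix list sequentially at one traceroute every 1.5 minutes (the constant names are illustrative):

```python
# Figures taken from the brute-force bullet above.
VANTAGE_POINTS = 784
PREFIXES = 120_000
MINUTES_PER_QUERY = 1.5

# Every vantage point traces to every prefix.
total_traceroutes = VANTAGE_POINTS * PREFIXES

# Vantage points run in parallel, so the wall-clock time is set by one
# vantage point covering all prefixes.
days_per_map = PREFIXES * MINUTES_PER_QUERY / (60 * 24)

print(total_traceroutes)  # 94080000
print(days_per_map)       # 125.0
```

The total of about 94 million traceroutes sits inside the 90-150 million range quoted for brute force.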
Example: AS 7
Dependent prefixes: 4.5.0.0/16
Insiders: 4.5.0.0/16
Up/down traces: AS 11 to 1.2.3.0/24
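A minimal sketch of how the three classes of directed probes could be derived from BGP AS paths. The table contents, the function names, and the exact rules are assumptions made for illustration: a prefix is treated as dependent when every known AS path to it contains the ISP, an insider is a vantage point whose address falls inside a dependent prefix, and an up/down trace is any other (reporting AS, prefix) pair whose AS path contains the ISP.

```python
import ipaddress

# Hypothetical BGP table: prefix -> AS paths, each starting at the AS
# that reported the path and ending at the origin AS.
BGP_TABLE = {
    "1.2.3.0/24": [[13, 4, 2, 5], [9, 10, 5], [11, 7, 5]],
    "4.5.0.0/16": [[3, 7, 8], [5, 7, 8]],
}

def dependent_prefixes(bgp_table, isp):
    """Prefixes for which every known AS path contains the ISP."""
    return sorted(
        prefix for prefix, paths in bgp_table.items()
        if paths and all(isp in path for path in paths)
    )

def is_insider(vantage_ip, dependent):
    """True if the vantage point's address lies inside a dependent prefix."""
    addr = ipaddress.ip_address(vantage_ip)
    return any(addr in ipaddress.ip_network(p) for p in dependent)

def up_down_traces(bgp_table, isp):
    """(reporting AS, prefix) pairs whose AS path crosses the ISP."""
    dependent = set(dependent_prefixes(bgp_table, isp))
    traces = []
    for prefix, paths in bgp_table.items():
        if prefix in dependent:
            continue  # already covered by dependent-prefix probing
        for path in paths:
            if isp in path and path[0] != isp:
                traces.append((path[0], prefix))
    return traces

dependent = dependent_prefixes(BGP_TABLE, isp=7)
print(dependent)                          # ['4.5.0.0/16']
print(is_insider("4.5.6.7", dependent))   # True
print(up_down_traces(BGP_TABLE, isp=7))   # [(11, '1.2.3.0/24')]
```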
Path Reductions
• Ingress reduction
• Egress reduction
• Next-hop AS reduction
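The idea behind these reductions can be sketched as deduplication: if two candidate traceroutes are expected to cross the ISP between the same ingress and egress, only one of them needs to run. This is a deliberate simplification, not the paper's procedure; the tuple layout, the router names, and the assumption that ingress and egress are already predicted from earlier traces are all hypothetical.

```python
def reduce_traces(candidates):
    """Keep one candidate per predicted (ingress, egress) pair.

    candidates: iterable of (vantage, prefix, ingress, egress) tuples,
    where ingress and egress are predictions from earlier measurements.
    """
    chosen = {}
    for vantage, prefix, ingress, egress in candidates:
        # The first candidate seen for a pair stands in for all others.
        chosen.setdefault((ingress, egress), (vantage, prefix))
    return list(chosen.values())

# Hypothetical candidates: the first two share an ingress and an egress.
candidates = [
    ("vp1", "1.2.3.0/24", "in-sea", "out-nyc"),
    ("vp2", "1.2.3.0/24", "in-sea", "out-nyc"),
    ("vp1", "6.7.0.0/16", "in-sea", "out-chi"),
]
print(reduce_traces(candidates))
```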
Reduction Effectiveness
• Brute force: 90-150 million traceroutes required
• BGP-directed probes: 0.2-15 million traceroutes required
• After path reductions: 8-300 thousand traceroutes required
Location and Role Discovery
• Where is this router located?
– Use DNS names: S1-bb11-nyc-3-0.sprintlink.net is a Sprint router in New York City.
– Use connectivity information: if a router connects only to routers in Seattle, it is in Seattle.
• What role does this router play in the topology?
– Only backbone routers connect to other cities.
– Use DNS names: s1-gw2-sea-3-1.sprintlink.net is a Sprint gateway router.
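A minimal sketch of DNS-based location and role inference for names shaped like the two Sprint examples above. The regular expression and the lookup tables are assumptions fitted to those two names only; a real naming convention would need per-ISP rules.

```python
import re

# Assumed layout: <tag>-<role><n>-<city>-<interface numbers>.sprintlink.net
NAME_RE = re.compile(
    r"^[a-z0-9]+-(?P<role>[a-z]+)\d+-(?P<city>[a-z]+)-[\d-]+\.sprintlink\.net$",
    re.IGNORECASE,
)

# Illustrative lookup tables, not a complete list.
ROLES = {"bb": "backbone", "gw": "gateway"}
CITIES = {"nyc": "New York City", "sea": "Seattle"}

def parse_router_name(name):
    """Return (role, city) for a matching name, or None if it does not match."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    role = match.group("role").lower()
    city = match.group("city").lower()
    # Fall back to the raw code when it is not in the lookup table.
    return ROLES.get(role, role), CITIES.get(city, city)

print(parse_router_name("S1-bb11-nyc-3-0.sprintlink.net"))
print(parse_router_name("s1-gw2-sea-3-1.sprintlink.net"))
```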
Alias resolution problem
Alias resolution solution
• Send a packet to each interface to solicit responses.
• Previous work: responses from aliases have the same source address.
– Routers often set the source address to the outgoing interface.
• New approach: responses from aliases have nearby IP identifiers.
– The IP ID is commonly set from a counter.
• Alias resolution optimizations
– Sort by DNS name: find aliases quickly
– Cluster by return TTL: rule out many addresses
• ALLY found 2.8 times as many aliases as the previous approach.
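The TTL optimization can be sketched as follows: only addresses whose responses arrive with the same TTL are paired for alias testing, which rules out most pairs up front. Requiring an exact TTL match is an assumption made to keep the sketch simple, and the addresses and TTL values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(return_ttl):
    """Pair up only those addresses that share a return TTL.

    return_ttl: dict mapping interface address -> TTL seen in its response.
    """
    groups = defaultdict(list)
    for addr, ttl in return_ttl.items():
        groups[ttl].append(addr)
    pairs = []
    for addrs in groups.values():
        # Addresses in different TTL groups are never tested against each other.
        pairs.extend(combinations(sorted(addrs), 2))
    return pairs

# Hypothetical measurements using documentation addresses.
ttls = {"192.0.2.1": 250, "192.0.2.2": 250, "192.0.2.3": 247}
print(candidate_pairs(ttls))  # [('192.0.2.1', '192.0.2.2')]
```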
IP ID method
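• Probe the two addresses and record the IP IDs x and y, then probe the first address again to get z.
• If x<y<z and z-x is small, the addresses are likely aliases.
• If |x-y|>200, the addresses are disqualified as aliases and the third packet is not sent.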
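The IP ID test above can be sketched as a small function. The cutoff of 200 for |x-y| comes from the slide; the threshold used for "z-x small" is an assumption, and wraparound of the 16-bit IP ID counter is ignored for brevity.

```python
def likely_aliases(x, y, z=None, disqualify_gap=200, small_gap=200):
    """Apply the IP ID test to responses x, y and (optionally) z.

    x, y: IP IDs from the first probes to the two addresses.
    z: IP ID from the third probe, needed only when x and y are close.
    small_gap is an assumed threshold for "z - x small".
    """
    if abs(x - y) > disqualify_gap:
        # Far-apart identifiers: not aliases, no third packet needed.
        return False
    if z is None:
        raise ValueError("x and y are close; a third probe is required")
    return x < y < z and (z - x) <= small_gap

print(likely_aliases(100, 101, 103))  # True
print(likely_aliases(100, 5000))      # False
print(likely_aliases(100, 101, 90))   # False
```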
AT&T
Sprint
Level 3
Telstra
POP Structure
Completeness
• Validation with ISPs
– Good to excellent
– Hesitant to reveal customer data
• Scanning IP addresses
• Comparison with RouteViews
– Number of BGP adjacencies
– Worst case: 70%
• Comparison with Skitter
– Seven times as many links and routers as Skitter
Impact of reductions
• Ingress and egress reductions
• Next-hop AS reduction
• Especially beneficial for insiders
Analysis
• POP Sizes
– Skewed for all ISPs
– Most routers are in the ten largest POPs
– Sprint: 60% of POPs hold less than 20% of Sprint’s routers
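A skew statistic of this kind can be computed as the share of routers held by the smallest fraction of POPs. The POP sizes below are hypothetical and chosen only to illustrate the calculation; they are not measured data.

```python
def share_in_smallest(pop_sizes, pop_fraction):
    """Fraction of all routers found in the smallest `pop_fraction` of POPs."""
    sizes = sorted(pop_sizes)
    k = round(len(sizes) * pop_fraction)  # number of smallest POPs to count
    return sum(sizes[:k]) / sum(sizes)

# Hypothetical router counts per POP.
pop_sizes = [2, 3, 3, 4, 5, 6, 20, 30, 40, 50]
share = share_in_smallest(pop_sizes, 0.6)
print(f"{share:.1%} of routers are in the smallest 60% of POPs")
```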
Router Degree Distribution
• Small range in data
– Layer 2 switches unaccounted
Peering Structure
• Advantage here: the maps show where, and in how many places, two ISPs connect
• Highly skewed for all ISPs: