Characterizing the Internet Hierarchy from Multiple Vantage Points Jennifer Rexford

advertisement
Characterizing the Internet Hierarchy
from Multiple Vantage Points
Jennifer Rexford
Internet and Networking Systems
AT&T Labs - Research; Florham Park, NJ
http://www.research.att.com/~jrex
Work with L. Subramanian, S. Agarwal, and R. Katz
http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Outline
 Internet
architecture
– ASes, IP addressing, BGP routing, and AS relationships
 Type-of-relationship
problem
– Motivation, formulation, and practical challenges
 Analyzing
partial views of the AS graph
– Assigning a rank to each AS from a single vantage point
– Comparing ranks of ASes across multiple vantage points
 Analysis
results
– BGP routing data and inferred AS relationships
– AS paths that are inconsistent with the inferences
– Five-level classification of the Internet hierarchy
 Conclusions
Internet Architecture
 Divided
into Autonomous Systems
– Distinct regions of administrative control (~11,000)
– Set of routers and links managed by a single institution
– Service provider, company, university, …
 Hierarchy
of Autonomous Systems
– Large, tier-1 provider with a nationwide backbone
– Medium-sized regional provider with smaller backbone
– Small stub network run by a company or university
 Interaction
between Autonomous Systems
– Internal topology is not shared between ASes
– … but, neighboring ASes interact to coordinate routing
Autonomous Systems (ASes)
Path: 6, 5, 4, 3, 2, 1
4
3
5
2
7
1
6
Web server
Client
IP Addressing and Prefixes
 32
bits in dotted-quad notation (12.34.158.5)
 Divided
into network and host portions
 12.34.158.0/23
12
is a 23-bit prefix with 29 addresses
34
158
5
00001100 00100010 10011110 00000101
Network (23 bits)
Host (9 bits)
Interdomain Routing with BGP (Between ASes)
 ASes
announce info about prefixes they can reach
 Local policies for path selection (which to use?)
 Local policies for route propagation (who to tell?)
 Policies configured by the AS’s network operator
“I can reach 12.34.158.0/23
via AS 1”
“I can reach 12.34.158.0/23”
1
12.34.158.5
2
3
Customer-Provider Relationship
 Customer
pays provider for access to the Internet
 AS exports customer’s routes to all neighbors
 AS exports provider’s routes only to its customers
Traffic to the customer
Traffic from the customer
d
provider
advertisements
provider
traffic
customer
d
customer
Peer-Peer Relationship
 Peers
exchange traffic between their customers
 Free of charge (assumption of even traffic load)
 AS exports a peer’s routes only to its customers
Traffic to/from the peer and its customers
advertisements
peer
d
traffic
peer
AS Relationships Matter
 Motivating
problems
– Placement of servers for content distribution network
– Selection of new peers or providers for an AS
– Analyzing the convergence properties of the BGP protocol
– Installing route filters to protect against misconfiguration
– Understanding of the basic structure of the Internet
 Knowing
the AS graph is not enough
– Interdomain routing is not shortest-path routing
– Some paths not allowed (e.g., transit through a peer)
– Local preference of paths (e.g., prefer customer path)
– Node degree does not define the Internet hierarchy
 Need
to know the relationship between AS pairs
Inferring Relationships from Routing Data
 Practical
realities of the Internet
– AS graph is not known
– AS relationships are proprietary
– … at least some routing data is publicly available!
 Exploiting
routing data
– Available via traceroute experiments or BGP tables
– Provides a set of AS paths, such as “701 7018 46”
– Implies existence of edges (701, 7018) and (7018, 46)
– Implies that 7018 (AT&T) allows AS 701 (UUNet) to
transit to AS 46 (Rutgers)
Valid and Invalid Paths
 AS
relationships limit the kinds of valid paths
– Uphill portion: customer-provider relationships
– Plateau: zero or one peer-peer edge
– Downhill portion: provider-customer relationships
Valid
Invalid
Invalid
Lixin Gao, “On inferring Autonomous System relationships in the
Internet,” IEEE/ACM Transactions on Networking, December 2001.
Type-of-Relationship Problem
 Given
the inputs
– AS graph G(V,E) with vertices V and edges E
– Set of paths P on the graph G
 Find
a solution that
– Labels each edge with an AS relationship
– Minimizes the number of invalid paths in P
 Properties
of the problem
– NP complete (?)
– May have multiple solutions
– We propose a heuristic algorithm
Practical Challenges
 Peer-peer
relationships are hard to infer
– Mislabeling a peer-peer edge as provider-customer does not
change a valid path into an invalid path
– We use heuristics to detect the peer-peer edges
 Some AS
pairs have unusual relationships
– Sibling ASes that provide transit service for each other
– Backup relationship for connectivity under failure
– Misconfiguration of a conventional AS relationship
– We detect these cases by analyzing the “invalid” paths
 Getting
access to a large path set P is hard
– We exploit BGP routing tables from multiple vantage points
Validation Approaches
 Quantify
the number of invalid paths
– Small number suggests better results
– …still, this doesn’t mean that inferences are correct
 Compare
results with other inference algorithms
– Higher confidence if inferences are the same
– … still, both algorithms could give wrong answers
 Compare
results with Routing Arbiter Database
– Higher confidence if consistent with RADB routing policies
– … still, RADB information is incomplete and out-of-date
 Compare
results with proprietary ISP data
– Higher confidence if answers are correct for this AS
– … still, answers may be wrong for other ASes
Partial View of the AS Graph
 Routing
data from a single source AS
– Collection of paths starting from the source
– Directed graph from union of all edges in these paths
E
Actual graph
C
B
A
D
E
F
D
B
F
C
C
D
B
E
A
F
A
Assigning Rank to AS in a Partial View
 Reverse
pruning algorithm to assign rank
– Rank 1 to the leaves, then remove leaves
– Rank 2 to the leaves, then remove leaves…
– Single (largest) rank to nodes in connected component, if any
5 E
5 B
4
D
3
F 1
4
C
C
3
D
2
B
2
E
1
A
1
F
A 1
Combining Information From Multiple Views
 Vector
of ranks for each AS
– A single element for each of the n views
 Dominance:
provider-customer relationship
– Provider has higher ranks than customer in most views
– For example, B has (2,5) and A has (1,1)
 Equivalence:
peer-peer relationship
– Peers have equal ranks in or inconsistent ranks
– For example, C has (3,4) and D has (4,3)
 Probabilistic
inference
– Thresholds to tolerate some variations across the views
– E.g., an AS dominates in n-1 views and dominated in 1
Applying Our Algorithm
 Applying
the algorithm to ten public BGP tables
– RouteViews table and nine Looking Glass servers
– Extracted set of unique paths P for each view
– Applied reverse pruning algorithm to each view
– Applied inference rules to the vectors of ranks
 Results
of the analysis on data from April 2001
– AS graph with 10,698 ASes and 23,935 edges
– Inferences were made for 99.2% of the edges
– 94.5% provider-customer and 4.7% peer-peer edges
– Most inferences do not require the probabilistic rules
Advantage of Multiple Vantage Points
 A single
vantage point is not enough
– 15% of the edges appear in exactly one BGP table
– Only 25% of the edges appear in all ten BGP tables
Analyzing Invalid Paths
 Checking
the validity of inferences
– Assume the relationship inferences are correct
– Identify paths that are invalid under these inferences
– Compute the number of invalid paths
– Investigate common anomaly triples (A, B, C)
 Results
of our analysis
– Applied to paths in 2 of the original 10 BGP tables
– Applied to paths in 4 other BGP tables
– 0.5-3% of paths are invalid for five of the six tables
– 8.7% of paths are invalid for the KDDI table
Common Anomaly Patterns
 Misconfiguration
– (1, 65112, 6461): 65112 is a private AS that should not
appear between Genuity and AboveNet
 Sibling
relationships
– (7018, 6841, 3300): Infonet Europe merged with AUCS
– (1239, 1740, 7018): Cerfnet was acquired by AT&T
– (1239, 8043, 6395): IXC Communications acquired
SmartNAP and renamed Broadwing
 Heuristic
for identifying sibling relationships
– AS pair that appears in a large number of “invalid” paths
– Our analysis identified 22 possible sibling relationships
Digression: Really Weird “Invalid” Paths…
1 701 703 9304 7018
Genuity
 Properties
UUNet
Hutchinson AT&T
of the path
– Two tier-1 U.S. providers (Genuity and UUNet)
– One service provider in Hong Kong (Hutchinson)
– Another tier-1 U.S. provider (AT&T) at the end of the path
 Looking
at internal AT&T configuration data…
– AT&T does not have a BGP session with AS 9304
– AT&T does not originate the prefixes (e.g., 152.141.116.0/24)
 Explanation
– Another AS was using the AT&T AS number (for over a year!)
– We sent them an e-mail and asked them to stop, and they did
Digression: How Could This Happen, and Persist?
 BGP configuration
is done locally by neighbors
– Customer configures its router with AS number 7018
– Provider configures its router with neighbor of 7018
 The
misconfiguration didn’t necessarily cause a problem
– Hop-by-hop routing took the traffic to the right place
– Most BGP policies don’t look at the identity of the ASes
 Could
have caused a problem: route filtering
– Large providers might applying filtering to customer routers
– Discard routes with other large providers in the path
 Could
have caused a problem: loop detection
– The bogus routes did not appear in AT&T’s routing tables
– AT&T router saw 7018 in the path and discarded the route
– AT&T router did have a route for the supernet (152.141.0.0/16)
AS Classification
 Directed AS
graph
– Directed edge from provider to customer
– Bidirectional edge between two peers
 Lowest
level: Stubs
– Leaf nodes: no peers or downstream customers
– 8898 of the 10915 ASes (82.5% of ASes)
– Ex: UC Berkeley (25), AT&T Labs (6431), and INRIA (1300)
 Next
lowest level: Regional ISPs
– Leaf nodes after successive pruning of leaf nodes
– 971 ASes of the 10915 ASes (8.9% of ASes)
– Ex: PacBell (5676), US West (6223), and UUNET Canada (815)
 Remaining
1046 ASes: Core
Dense Core
 Ways
to classify so-called “tier-1” ASes
– Any AS with no upstream provider (98 such nodes)
– AS set that forms the largest clique of peer edges (13 nodes)
 Relaxing
the definition
– Tolerate some missing or misclassified edges
– Tolerate some ASes with sibling relationships
 “Almost
a clique”
– Subgraph of m nodes with in and out degree at least m/2
– Greedy algorithm for locating the largest near-clique
 20 ASes
in the near-clique
– 15 of the ASes form a subgraph just 3 edges short of a clique
– Genuity, Sprint, UUNET, AT&T, Verio, Level3, C&W,…
Transit and Outer Core
 Transit
core
– ASes that peer with the dense core and each other
– Notion of a “weak in-way cut” to isolate these ASes
– Algorithm for identifying the ASes in transit core
– 129 ASes, including top providers in Europe and Asia
– Ex: UUNET Europe, KDDI, and Singapore Telecom
 Outer
core
– All of the remaining ASes in the core
– 897 ASes, including large regional and national ISPs
– Ex: Turkish Telecom and Minnesota Regional Network
Node Degree is Not Enough
 Node
degree ignores relationships
– A stub AS may have many upstream providers
– A core AS may have a small number of peers
– Some ASes have customers that don’t have AS numbers
Related Work
 AS
graph characterization
– Constructing graph from BGP tables or traceroute experiments
– Characterizing the topological properties of the graph
 Inferring AS
relationships (Lixin Gao)
– Identifies the key properties of paths (uphill, downhill, etc.)
– Heuristic using node degree to infer boundary point between the
uphill and downhill portions of the path
– Application of the algorithm using RouteViews routing table
 Characterization
of the hierarchy of ASes
– Early work by Govindan/Reddy based on node degree
– Recent work by Ge et al based on AS relationships
Conclusions
 Inferring AS
relationships
– Reverse pruning to assign rank to each AS
– Comparison of ranks from different vantage points
 Performance
evaluation
– Application of algorithm to collection of ten BGP tables
– Exploration of the anomalies that cause invalid paths
 Characterization
of Internet hierarchy
– Stub, regional ISP, outer core, transit core, & dense core
– Algorithms for identifying the three parts of the core
– Application to AS graph inferred from the BGP tables
Ongoing Work
 Classification
of siblings
– Use anomalous triples (A, B, C) to identify siblings
– Group siblings into a single node (with union of edges)
– Repeat classification of the AS hierarchy on new graph
 Longitudinal
study
– Repeat the study over a period of time with new data
– Study how AS relationships and hierarchy changes
 Validation
of our inference results
– Compare to RADB, Lixin’s results, AT&T data, etc.

http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Download