Inferring Autonomous System Relationships in the Internet
Lixin Gao
Presented by Santhosh R Thampuran

Contents
• Motivation
• Background
• AS Relationships
• Heuristic Algorithms
• Experimental Results

Motivation
• Interdomain routing in the Internet is coordinated by BGP
• BGP allows each AS to choose its own policy in selecting routes and propagating reachability information to others

Motivation (contd.)
• These routing policies are constrained by the contractual commercial agreements between administrative domains
• For example, an AS sets its policy so that it does not provide transit services between its providers

Motivation (contd.)
• Since routing between ASes is controlled by BGP, a policy-based routing protocol, connectivity does not imply reachability
• Also, connectivity alone cannot fully characterize the structural properties of the Internet

Motivation (contd.)
[Figure: National ISP A, National ISP B, and Regional ISP C]

Motivation (contd.)
• Hence it is necessary to classify the types of routes that can appear in BGP routing tables based on the relationships between the ASes in the path

Background
• Connectivity between ASes can be modeled using an AS graph $G = (V, E)$
  – node set $V$ consists of ASes
  – edge set $E$ consists of AS pairs that exchange traffic with each other

Background (contd.)
[Figure: logical relationship among AS1, AS2, AS3, AS4, and AS5]

Background (contd.)
• The degree of an AS is the number of ASes that are its neighbors
• An AS uses import policies to transform incoming route updates

Background (contd.)
• We consider a BGP session $(u,v) \in E$ between two ASes, $u$ and $v$
• $v$ receives a set of route updates $R$ from $u$
• $\mathrm{import}(u,v)[R]$ represents $v$'s update set after applying the import policy

Background (contd.)
• Loop avoidance rule: if $v \in r.\mathrm{as\_path}$, then $\mathrm{import}(u,v)[\{r\}] = \{\}$
• $B(u,d)$ denotes the best route selected by $u$ for prefix $d$
• AS $u$ applies export policies $\mathrm{export}(v,u)$ to its best route set, $R$, for sending to a neighboring AS $v$

Background (contd.)
• The routing table entry in AS $u$ for destination $d$ is a route with an empty AS path, denoted $e(u,d)$, if $u$ originates prefix $d$
• Otherwise, it depends on the best route of its neighboring AS $v$, $B(v,d)$, as well as the import policies of $u$ from $v$ and the export policies of $v$ to $u$

Background (contd.)
$$\mathrm{Routing\_Table}(u,d) = \begin{cases} e(u,d) & \text{if } d \in O(u) \\ \bigcup_{(u,v) \in E} \mathrm{import}(v,u)[\mathrm{export}(u,v)[B(v,d)]] & \text{otherwise} \end{cases}$$
where $O(u)$ is the set of prefixes originated by $u$.
• For simplicity, we assume that the AS path in each BGP routing table entry is preprocessed so that no AS appears more than once; repeated appearances carry no additional information

AS Relationships
• The commercial agreements between pairs of administrative domains can be classified into:
  – customer-provider relationship
  – peering relationship
  – mutual-transit relationship

AS Relationships
• We classify the relationship between a pair of Autonomous Systems into:
  – customer-to-provider relationship
  – provider-to-customer relationship
  – peer-to-peer relationship
  – sibling-to-sibling relationship

AS Relationships
• An annotated AS graph is a partially directed graph whose nodes represent ASes and whose edges are classified into:
  – provider-to-customer
  – customer-to-provider
  – peer-to-peer
  – sibling-to-sibling

AS Relationships
[Figure: annotated AS graph over AS1 to AS7; legend: provider-to-customer edge, peer-to-peer edge, sibling-to-sibling edge]

Rules governing BGP export policy
                    Exporting to   Exporting to   Exporting to   Exporting to
                    a Provider     a Customer     a Peer         a Sibling
Own Routes          ×              ×              ×              ×
Customer's Routes   ×              ×              ×              ×
Sibling's Routes    ×              ×              ×              ×
Provider's Routes                  ×                             ×
Peer's Routes                      ×                             ×
(× marks a route type that is exported)

Selective Export Rule
• An AS does not provide transit services between any two of its providers and peers
• The selective export rule implies that the AS path of a BGP routing table entry must follow a certain pattern

Network        Next hop         AS Path
i4.2.24.0/21   134.24.127.3     1740 1 i
               194.68.130.254   5459 5413 1 i
               158.43.133.48    1849 704 702 701 1 i
               193.0.0.242      3333 286 1 i
               144.228.240.93   1239 1 i
[Slide callouts: 704, 1849, 702, 701]

Lemma
If $u_0$'s BGP routing table contains an entry with AS path $(u_1,u_2,\dots,u_n)$ for destination prefix $d$, then
(a) any node $u_i$ selects a route with as_path $(u_{i+1},\dots,u_n)$ as the best route to prefix $d$, and
(b) $u_i$ exports its best route to $u_{i-1}$

Valley-free property
[Figure: no V-shape possible]

Valley-free property
[Figures (two slides): no step possible]

Valley-free property
[Figure: annotated AS graph over AS1 to AS6 (legend: provider-to-customer edge, peer-to-peer edge, sibling-to-sibling edge), repeated on four slides with these example paths]
  – AS path (1,2,3) is valley-free
  – AS path (1,2,6,3) is valley-free
  – AS path (1,4,3) is not valley-free
  – AS path (2,1,3,6) is not valley-free

Valley-free property
• After traversing a provider-to-customer or peer-to-peer edge, the AS path cannot traverse a customer-to-provider or peer-to-peer edge.
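The rule in the preceding bullet can be checked in a single pass over the edge types of a path. The following is a minimal sketch, assuming every edge on the path has already been annotated with its relationship; the `Rel` enum and the `is_valley_free` function are hypothetical names introduced here for illustration and do not come from the slides.

```python
from enum import Enum


class Rel(Enum):
    """Edge annotations used in the slides (names are illustrative)."""
    C2P = "customer-to-provider"
    P2C = "provider-to-customer"
    P2P = "peer-to-peer"
    S2S = "sibling-to-sibling"


def is_valley_free(edge_types):
    """Return True if a sequence of edge annotations is valley-free.

    edge_types lists the relationship of each consecutive AS pair,
    in path order.
    """
    # Becomes True once a provider-to-customer or peer-to-peer edge
    # has been traversed; from then on the path may only go downhill.
    past_top = False
    for rel in edge_types:
        if past_top and rel in (Rel.C2P, Rel.P2P):
            return False
        if rel in (Rel.P2C, Rel.P2P):
            past_top = True
    return True


# Hypothetical annotated paths (not taken from the slide figures):
uphill_then_downhill = [Rel.C2P, Rel.S2S, Rel.P2P, Rel.P2C]
v_shape = [Rel.P2C, Rel.C2P]
```

Sibling-to-sibling edges never change the state, which is why they may appear anywhere in a valid path.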
• Formally, an AS path $(u_1,u_2,\dots,u_n)$ is valley-free iff the following conditions hold:
  – A provider-to-customer edge can be followed only by provider-to-customer or sibling-to-sibling edges
  – A peer-to-peer edge can be followed only by provider-to-customer or sibling-to-sibling edges

Theorem
• If all ASes set their export policies according to the selective export rule, then the AS path in any BGP routing table entry is valley-free
• That is, the selective export rule and the lemma together ensure that the AS path of a BGP routing table entry has the valley-free property

Case (a): a provider-to-customer edge is followed by a customer-to-provider or peer-to-peer edge
[Figure: AS path $u_1, u_2, \dots, u_{n-1}, u_n$ with $u_i, u_{i+1}$ and $u_k, u_{k+1}$ marked]
• $(u_i,u_{i+1})$ is provider-to-customer
• $(u_j,u_{j+1})$ is the first customer-to-provider or peer-to-peer edge that follows it
• $(u_{j-1},u_j)$ is either provider-to-customer or sibling-to-sibling
• From the lemma, the best route to destination $d$ selected by $u_j$ is $(u_{j+1},\dots,u_n)$, and $u_j$ exports this route to $u_{j-1}$
• This contradicts the selective export rule, since $u_{j-1}$ and $u_{j+1}$ are providers or peers of $u_j$

Case (b): a peer-to-peer edge is followed by a customer-to-provider or peer-to-peer edge
• A similar argument to case (a) applies
• The valley-free property enables us to identify patterns for BGP routing table entries

Routing Table Entry Patterns
• Downhill path: a sequence of edges that are either provider-to-customer or sibling-to-sibling
• Uphill path: a sequence of edges that are either customer-to-provider or sibling-to-sibling

Routing Table Entry Patterns
• An AS path of a BGP routing table entry has one of the following patterns:
  – an uphill path
  – a downhill path
  – an uphill path followed by a downhill path
  – an uphill path followed by a peer-to-peer edge
  – a peer-to-peer edge followed by a downhill path
  – an uphill path followed by a peer-to-peer edge followed by a downhill path

Routing Table Entry Patterns
• These patterns reduce to two forms:
  – a maximal uphill path, a peer-to-peer edge, and a maximal downhill path, in that order, or
  – a maximal uphill path and a maximal downhill path, in that order

Routing Table Entry Patterns
[Figure: AS path $u_1, u_2, \dots, u_i, u_{i+1}, \dots, u_{n-1}, u_n$ with the uphill portion, the downhill portion, and the top provider(s) labeled]

Heuristic Algorithms
• The algorithm for inferring AS relationships relies on the fact that ASes set their export policies according to their relationships, and on the resulting patterns in BGP routing table entries
• It also relies on the intuition that a provider is typically larger than its customer, and that the size of an AS is typically proportional to its degree in the AS graph

Heuristic Algorithms
• The top provider of an AS path is the AS that has the highest degree among all ASes in the path
• We can infer that consecutive AS pairs to the left of the top provider are customer-to-provider or sibling-to-sibling edges, and those to the right are provider-to-customer or sibling-to-sibling edges

Algorithms for Inferring Provider-Customer and Sibling-to-Sibling Relationships

Basic Algorithm
• Input: BGP routing table RT
• Output: Annotated AS graph G
• Phase 1: Compute the degree for each AS
• Phase 2: Parse AS path to initialize consecutive AS pair relationship
• Phase 3: Assign relationship to AS pairs

Phase 1 (Compute the degree for each AS)
[Figure: AS path $u_1, u_2, \dots, u_j, u_{j+1}, \dots, u_{n-1}, u_n$; for each consecutive pair, starting with $i = 1$:]
  neighbor[u_i] = neighbor[u_i] ∪ {u_{i+1}}
  neighbor[u_{i+1}] = neighbor[u_{i+1}] ∪ {u_i}

Phase 1 (Compute the degree for each AS)
[Figure: same path; shown for $u_1$:]
  degree[u_1] = |neighbor[u_1]|

Phase 2 (Parse AS path to initialize consecutive AS pair relationship)
[Figure: same path; the top provider is $u_j$ for the smallest $j$ such that $\mathrm{degree}[u_j] = \max_{1 \le i \le n} \mathrm{degree}[u_i]$]

Phase 2 (Parse AS path to initialize consecutive AS pair relationship)
[Figure: same path; transient[u_1,u_2] = 1 to the left of the top provider; transient[u_{j+1},u_j] = 1 and transient[u_n,u_{n-1}] = 1 to its right]
[Figure: a second AS path with nodes u_a, u_b, u_c, u_d is overlaid on the first; transient[u_a,u_b] = 1]
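Phases 1 and 2 can be sketched in a few lines of Python. This is a hedged sketch rather than the paper's code: the function names `compute_degrees` and `mark_transient`, and the representation of the routing table as a list of AS paths (each a list of AS numbers with repeated ASes already removed), are assumptions made for illustration. The `transient` set mirrors the slides' `transient[u,v] = 1` entries as ordered pairs.

```python
from collections import defaultdict


def compute_degrees(as_paths):
    """Phase 1: degree of each AS = number of distinct neighbors seen."""
    neighbor = defaultdict(set)
    for path in as_paths:
        # Every consecutive pair on a path is an edge of the AS graph.
        for a, b in zip(path, path[1:]):
            neighbor[a].add(b)
            neighbor[b].add(a)
    return {asn: len(nbrs) for asn, nbrs in neighbor.items()}


def mark_transient(as_paths, degree):
    """Phase 2: record transient[u, v] = 1 as the ordered pair (u, v)."""
    transient = set()
    for path in as_paths:
        if len(path) < 2:
            continue  # a single-AS path contributes no edges
        degs = [degree.get(asn, 0) for asn in path]
        j = degs.index(max(degs))  # smallest index with the highest degree
        for i in range(len(path) - 1):
            if i < j:
                # Left of the top provider: path[i+1] carries path[i] uphill.
                transient.add((path[i], path[i + 1]))
            else:
                # Right of the top provider: path[i] carries path[i+1].
                transient.add((path[i + 1], path[i]))
    return transient


# Hypothetical routing table with AS 2 as the highest-degree AS.
paths = [[1, 2, 3], [4, 2, 5], [6, 2, 3]]
degree = compute_degrees(paths)
transient = mark_transient(paths, degree)
```

A pair that ends up in the set in both directions, as happens when two paths disagree about which side of the top provider an edge lies on, is the case Phase 3 labels sibling-to-sibling.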
[Figure: the second path (u_a, u_b, u_c, u_d, u_2, u_1) also sets transient[u_d,u_2] = 1 and transient[u_2,u_1] = 1, so the pair (u_1,u_2) now has both transient[u_1,u_2] = 1 and transient[u_2,u_1] = 1]

Phase 3 (Assign relationship to AS pairs)
  if transient[u_i,u_{i+1}] = 1 and transient[u_{i+1},u_i] = 1
      relationship[u_i,u_{i+1}] = sibling-to-sibling
  else if transient[u_{i+1},u_i] = 1
      relationship[u_i,u_{i+1}] = provider-to-customer
  else if transient[u_i,u_{i+1}] = 1
      relationship[u_i,u_{i+1}] = customer-to-provider
[Figure: the two example paths with their transient entries, annotated with the resulting edge types; legend: provider-to-customer edge, peer-to-peer edge, sibling-to-sibling edge]

Refined Algorithm
• The top provider may not have the highest degree, so relationships may be inferred incorrectly
• Let each routing table entry vote on the relationship of an AS pair
• If a sibling-to-sibling relationship is concluded from only one entry, we ignore it

Refined Algorithm
• If all routing table entries agree that an AS pair has a provider-to-customer (or customer-to-provider) relationship, then the AS pair has that relationship
• If only one routing table entry infers that an AS pair has a provider-to-customer (or customer-to-provider) relationship and more than one entry infers that the pair has a customer-to-provider (or provider-to-customer) relationship, then the AS pair has a customer-to-provider (or provider-to-customer) relationship

Refined Algorithm
• In all other cases, the AS pair has a sibling-to-sibling relationship
• Unlike the basic algorithm, the refined algorithm ignores some routing table entries

Refined Algorithm
• Input: BGP routing table RT
• Output: Annotated AS graph G
• Phase 1: Compute the degree for each AS
• Phase 2: Count the number of routes that infer an AS pair as having a provider-to-customer or customer-to-provider relationship
• Phase 3: Assign relationship to AS pairs

Algorithm for Inferring Peer-to-Peer Relationships

Final Algorithm
• A peer-to-peer edge can lie only between the top provider and one of its neighbors
• If the top provider has a sibling-to-sibling relationship with one of its neighbors, then it has a peer-to-peer relationship with the other neighbor
• We use the heuristic that the peer-to-peer edge is between the top provider and its neighboring AS with the higher degree, because such edges are between ASes of comparable size
• We also use the heuristic that the degrees of two peers do not differ significantly: the degrees of two ASes in a peer-to-peer relationship differ by a factor of at most $R$

Final Algorithm
• Input: BGP routing table RT
• Output: Annotated AS graph
• Phase 1: Use either the Basic or the Refined algorithm to coarsely classify AS pairs as having provider-to-customer or sibling-to-sibling relationships
• Phase 2: Identify AS pairs that cannot have a peer-to-peer relationship
• Phase 3: Assign peer-to-peer relationships to the remaining connected AS pairs, as long as the degrees of the pair differ by a factor of at most $R$

Phase 2
[Figure: AS path $u_1, u_2, u_3, \dots, u_{j-1}, u_j, u_{j+1}, \dots, u_{n-2}, u_{n-1}, u_n$ with top provider $u_j$, showing the case degree[u_{j-1}] < degree[u_{j+1}]]

Phase 3
[Figure: same path; the condition shown is degree[u_j] / degree[u_{j+1}] < R and degree[u_j] / degree[u_{j+1}] > 1/R]

Experimental Results

Inference Results
                                                              1999/9/27     2000/1/2      2000/3/9
Total routing entries                                         968674        936058        1227596
Total edges                                                   11288         12571         13800
Sibling-to-sibling edges inferred by Basic (percentage)       149 (1.3%)    186 (1.47%)   203 (1.47%)
Sibling-to-sibling edges inferred by Refined (ignored entries) 124 (25)     135 (51)      157 (46)
Peer-to-peer edges inferred by Final [R=∞] (percentage)       884 (7.8%)    838 (6.7%)    857 (6.2%)
Peer-to-peer edges inferred by Final [R=60] (percentage)      733 (6.5%)    668 (5.3%)    713 (5.7%)

Verification of Inferred Relationships by AT&T
Our inference   AT&T information   Percentage of AS
Customer        Customer           99.8%
                Peer               0.2%
Peer            Peer               76.5%
                Customer           23.5%
Sibling         Sibling            20%
                Peer               60%
                Customer           20%
Nonexistent     Customer           95.6%
                Peer               4.4%
Comparing inference results from Basic and Final (R=∞) with AT&T internal information

Verification of Inferred Relationships by AT&T
Our inference   AT&T information   Percentage of AS
Customer        Customer           99.5%
                Peer               0.5%
Peer            Peer               76.5%
                Customer           23.5%
Sibling         Sibling            25%
                Peer               50%
                Customer           25%
Nonexistent     Customer           95.6%
                Peer               4.4%
Comparing inference results from Refined and Final (R=∞) with AT&T internal information

Verification of Inferred Relationships by AT&T
Our inference   AT&T information   Percentage of AS
Customer        Customer           99.8%
                Peer               0.2%
Peer            Peer               100%
Sibling         Sibling            20%
                Peer               60%
                Customer           20%
Nonexistent     Customer           95.6%
                Peer               4.4%
Comparing inference results from Basic and Final (R=60) with AT&T internal information

WHOIS lookup Service
• Supplies the name and address of the company that owns an AS
• We can confirm that an AS pair has a sibling-to-sibling relationship if the two ASes belong to the same company or to two merging companies
• We also confirm that two ASes have a sibling-to-sibling relationship if they belong to two small companies located in the same city

WHOIS lookup Service
• 101 of the 186 inferred sibling-to-sibling relationships were confirmed (more than 50%)
• Unconfirmed sibling-to-sibling relationships may be attributable to the WHOIS service not being up to date

Applications of AS Relationships
• Can help in the construction of distance maps and the placement of proxy or mirror site servers
• Can help ISPs or domain administrators achieve load balancing and congestion avoidance
• Can help ISPs or companies plan future contractual agreements
• Can help ISPs reduce the effect of misconfiguration and debug router configuration files
• Can potentially avoid the route divergence problem
• Can verify the consistency of information in the Internet Routing Registry (IRR)

Thank You