Internet Topology Mapping

Hakan Kardes
University of Nevada, Reno

Modified version of Dr. Gunes’s Presentation on Internet Topology Discovery

Outline

Introduction
Router Level Internet Topology Maps
• Topology Collection
• Topology Sampling
• Resolving Anonymous Routers
• Resolving Alias IP Addresses
• Resolving Genuine Subnets
Conclusion

Internet Topology Discovery

Internet Measurements

Understand topological and functional characteristics of the Internet
• Essential to design, implement, protect, and operate underlying network technologies, protocols, services, and applications

Need for Internet measurements arises due to commercial, social, and technical issues
• Realistic simulation environment for developed products
• Improve network management
• Robustness with respect to failures/attacks
• Comprehend spreading of worms/viruses
• Know social trends in Internet use
• Scientific discovery
  • Scale-free (power-law), Small-world, Rich-club, Disassortativity, …

Internet Topology Measurement

Types of Internet topology maps
• Autonomous System (AS) level maps
• Router level maps

A router level Internet map consists of
• Nodes: End-hosts and routers
• Links: Point-to-point or multi-access links

Router level Internet topology discovery
• A process of identifying nodes and links among them

[Figures: Internet maps from Lumenta and CAIDA (Jan 06, Jan 08)]

Router-Level Internet Topology Maps
Background

Internet topology measurement studies
• Involve topology collection / construction / analysis

Current state of the research activities
• Distributed topology data collection studies/platforms
  • iPlane, Skitter, Dimes, DipZoom, …
  • 20M path traces with over 20M nodes (daily)

Main Issues
1. Sampling
2. Anonymous routers
3. Alias IP addresses
4. Subnet inference

Topology Collection (traceroute)

Probe packets are carefully constructed to elicit intended response from a probe destination

[Figure: vantage point S sends probes with TTL=1, TTL=2, TTL=3, TTL=4 toward destination D through routers A, B, C; the responses carry the addresses IPA, IPB, IPC, IPD]

traceroute probes all nodes on a path towards a given destination
• TTL-scoped probes obtain ICMP error messages from routers on the path
• Each ICMP message carries the IP address of an intermediate router as its source

Merging end-to-end path traces yields the network map

Topology Collection

[Figure: Internet2 backbone with routers S, N, C, U, W, K, L, A, H and end hosts d, e, f]

Traces
• d - H - L - S - e
• d - H - A - W - N - f
• e - S - L - H - d
• e - S - U - K - C - N - f
• f - N - C - K - H - d
• f - N - C - K - U - S - e

Topology Sampling

Sampling to discover networks
• Infer characteristics of the topology

Different studies considered
• Effect of sample size [Barford 01]
• Sampling bias [Lakhina 03]
• Path accuracy [Augustin 06]
• Sampling approach [Gunes 07]
• Utilized protocol [Gunes 08]
  • ICMP echo request
  • TCP syn
  • UDP port unreachable
  • ~10% of routers are anonymous

Protocol    Responsiveness
ICMP        81.9 %
TCP         67.3 %
UDP         59.9 %

Anonymous Router Resolution
Problem

Anonymous routers do not respond to traceroute probes and appear as * in traceroute output
• Same router may appear as * in multiple traces.
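
To make the problem concrete, the sketch below merges path traces into a map and treats every * as a separate node, since a bare trace gives no way to tell two anonymous hops apart. It is a minimal illustration, not the method of any cited study: `merge_traces` is a hypothetical helper, the traces are the six Internet2 backbone traces listed earlier, and the choice of H, K and N as anonymous routers is an assumption made only for this example.

```python
from itertools import count

# The six end-to-end traces from the Internet2 backbone example.
TRACES = [
    "d-H-L-S-e",
    "d-H-A-W-N-f",
    "e-S-L-H-d",
    "e-S-U-K-C-N-f",
    "f-N-C-K-H-d",
    "f-N-C-K-U-S-e",
]
ANONYMOUS = {"H", "K", "N"}  # assumption for illustration only


def merge_traces(traces):
    """Merge hop lists into (nodes, links); each '*' becomes a new node."""
    nodes, links = set(), set()
    star_ids = count(1)
    for hops in traces:
        # An unresponsive hop carries no identity, so every occurrence
        # gets its own label: *1, *2, ...
        labeled = [f"*{next(star_ids)}" if h == "*" else h for h in hops]
        nodes.update(labeled)
        # Consecutive hops on a trace share an undirected link.
        links.update(frozenset(pair) for pair in zip(labeled, labeled[1:]))
    return nodes, links


full = [t.split("-") for t in TRACES]
masked = [["*" if h in ANONYMOUS else h for h in hops] for hops in full]

true_nodes, true_links = merge_traces(full)
seen_nodes, seen_links = merge_traces(masked)

print(len(true_nodes), len(true_links))  # every router responding
print(len(seen_nodes), len(seen_links))  # H, K and N anonymous
```

With these inputs the three anonymous routers show up as eleven separate * nodes, so the merged map has 20 nodes and 25 links where the sampled network has 12 nodes and 13 links.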

[Figure: traces between x and y. With router L responding: y: S – L – H – x and x: H – L – S – y. With router L anonymous: y: S – * – H – x and x: H – * – S – y.]

Current daily raw topology data sets include
• ~20 million path traces with
• ~20 million occurrences of *s along with
• ~500K public IP addresses

The raw topology data is far from representing the underlying sampled network topology

Anonymous Router Resolution
Problem

Traces
• d - * - L - S - e
• d - * - A - W - * - f
• e - S - L - * - d
• e - S - U - * - C - * - f
• f - * - C - * - * - d
• f - * - C - * - U - S - e

[Figures: sampled network (routers S, U, K, C, N, L, H, A, W; end hosts d, e, f) and the resulting network built from these traces]

Anonymous Router Resolution
Previous Approaches

Basic heuristics
• IP: Combine anonymous nodes between same known nodes [Bilir 05]
  • Limited resolution
• NM: Combine all anonymous neighbors of a known node [Xin 06]
  • High false positives

More theoretic approaches
• Graph minimization approach [Yao 03]
  • Combine *s as long as they do not violate two accuracy conditions: (1) trace preservation condition and (2) distance preservation condition
  • High complexity O(n^5), where n is the number of *s
• ISOMAP based dimensionality reduction approach [Xin 06]
  • Build an n×n distance matrix, then use ISOMAP to reduce it to an n×5 matrix
  • Distance: (1) hop count or (2) link delay
  • High complexity O(n^3), where n is the number of nodes

[Figures: sampled network and resulting network, before and after resolution]

Anonymous Router Resolution
Graph Based Induction

[Figures: four anonymous-node structures, each shown before and after resolution: Parallel nodes, Clique, Star, Complete Bipartite]

IP Alias Resolution
Problem

[Figure: a sub-graph with routers a, b, c, d, e and end hosts w, x, y, z; router interfaces are numbered]

A set of collected traces
• w, …, b1, a1, c1, …, x
• z, …, d1, a2, e1, …, y
• x, …, c2, a3, b2, …, w
• y, …, e2, a4, d2, …, z

[Figure: sample map from the collected path traces]
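
The effect of aliases on the merged map can be seen by building it twice from the four traces above, once from raw addresses and once after grouping addresses by router. The sketch below is only an illustration under stated assumptions: the elided "…" segments are dropped, `build_map` and `router_of` are hypothetical names, and `router_of` stands in for an alias oracle by assuming that an address such as a3 names interface 3 of router a.

```python
# Hop sequences from the four collected traces; the elided "…" segments
# are dropped, which is a simplification for illustration.
TRACES = [
    ["w", "b1", "a1", "c1", "x"],
    ["z", "d1", "a2", "e1", "y"],
    ["x", "c2", "a3", "b2", "w"],
    ["y", "e2", "a4", "d2", "z"],
]


def router_of(address):
    """Hypothetical alias oracle: 'a3' is interface 3 of router 'a'."""
    return address.rstrip("0123456789")


def build_map(traces, resolve=lambda address: address):
    """Merge traces into (nodes, links) after mapping each hop with resolve."""
    nodes, links = set(), set()
    for trace in traces:
        hops = [resolve(h) for h in trace]
        nodes.update(hops)
        for a, b in zip(hops, hops[1:]):
            if a != b:  # two aliases of one router do not form a link
                links.add(frozenset((a, b)))
    return nodes, links


raw_nodes, raw_links = build_map(TRACES)
res_nodes, res_links = build_map(TRACES, router_of)

print(len(raw_nodes), len(raw_links))  # map built directly from addresses
print(len(res_nodes), len(res_links))  # map after grouping aliases
```

Under these assumptions the unresolved map has 16 nodes and 16 links, while the resolved map has 9 nodes and 8 links. In collected data the alias sets are not known in advance and have to be inferred.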

A router may appear with different IP addresses in different path traces
• Need to resolve IP addresses belonging to the same router

[Figure: with no alias resolution, the sample map contains separate nodes a1, a2, a3, a4, b1, b2, c1, c2, d1, d2, e1, e2 alongside w, x, y, z]

IP Alias Resolution
Problem

[Figure: sampled network with routers S, U, K, C, N, L, H, A, W and end hosts d, e, f]

Traces
• d - h.4 - l.3 - s.2 - e
• d - h.4 - a.3 - w.3 - n.3 - f
• e - s.1 - l.1 - h.1 - d
• e - s.1 - u.1 - k.1 - c.1 - n.1 - f
• f - n.2 - c.2 - k.2 - h.2 - d
• f - n.2 - c.2 - k.2 - u.2 - s.3 - e

[Figure: sample map without alias resolution, with separate nodes s.1, s.2, s.3, u.1, u.2, k.1, k.2, c.1, c.2, n.1, n.2, n.3, l.1, l.3, h.1, h.2, h.4, a.3, w.3 and end hosts d, e, f]

IP Alias Resolution
Problem

[Figures: the sub-graph; partial alias resolution (only router a is not resolved); partial alias resolution (only router a is resolved)]

IP Alias Resolution
Several Approaches

Source IP Address Based Method [Pansiot 98]
• Relies on a particular implementation of ICMP error generation.

IP Identification Based Method (ally) [Spring 03]
• Relies on a particular implementation of the IP identifier field.
• Many routers ignore direct probes.

DNS Based Method [Spring 04]
• Relies on similarities in the host name structures
  • sl-bb21-lon-14-0.sprintlink.net
  • sl-bb21-lon-8-0.sprintlink.net
• Works when a systematic naming is used.

Record Route Based Method [Sherwood 06]
• Depends on router support for IP record route processing

[Figures: probes sent to Dest = A and Dest = B; example responses A, ID=100; B, ID=99; B, ID=103]

Genuine Subnet Resolution
Problem

Subnet resolution
• Identify IP addresses that are connected over the same medium

Improve the quality of resulting topology map

[Figures: routers A, B, C, D with addresses IP1, IP2, IP3 in the underlying topology, the observed topology, and the inferred topology]

Conclusion

The Internet is man-made, so why do we need to measure it?
• Because we still don’t really understand it
• Sometimes things go wrong
• Measurement for network operations
  • Detecting and diagnosing problems
  • What-if analysis of future changes
• Measurement for scientific discovery
  • Creating accurate models that represent reality
  • Identifying new features and phenomena

Researchers have been sampling and analyzing Internet topology
• Building the network graph from raw data was not handled carefully
• Many researchers pointed out issues due to sampling and developed algorithms to handle each of them
  • Resolving anonymous routers, IP aliases, and genuine subnets
• Huge computational and probing overhead due to very large data size

References

1. M.H. Gunes, S. Bilir, K. Sarac and T. Korkmaz, “A Measurement Study on Overhead Distribution of Value-Added Internet Services”, Computer Networks 2007.
2. M.H. Gunes and K. Sarac, “Resolving IP aliases in Building Traceroute-Based Internet Maps”, IEEE Transactions on Networking (to appear).
3. M.H. Gunes, M. Baysan and K. Sarac, “Resolving Anonymous Routers in Building Traceroute-Based Internet Maps”, IEEE Transactions on Networking (in preparation).
4. M.H. Gunes and K. Sarac, “Analytical IP Alias Resolution”, IEEE ICC 2006.
5. M.H. Gunes, N.S. Nielsen and K. Sarac, “Impact of IP alias resolution on Traceroute-Based Sample Network Topologies”, PAM 2007.
6. M.H. Gunes and K. Sarac, “Importance of IP alias resolution in Sampling Internet Topologies”, IEEE GI 2007.
7. M.H. Gunes and K. Sarac, “Inferring Subnets in Router-level Topology Collection Studies”, ACM SIGCOMM IMC 2007.
8. M.H. Gunes and K. Sarac, “Resolving Anonymous Routers in Internet Topology Measurement Studies”, IEEE INFOCOM 2008.

Questions?