Jellyfish, and other Interesting creatures Of the Internet Scott Kirkpatrick, Hebrew University with Avishalom Shalit, Sorin Solomon, Shai Carmi, Shlomo Havlin, Eran Shir, and Yuval Shavitt TAU seminar, 6 November, 2005 Copyright, Pixar, Inc. 2003 DIMES – Internet topology map • Previous efforts to measure the Internet have used: – One machine + Traceroute to many destinations – Many* machines, specially deployed to traceroute to many destinations • * Many <= 50 because of management headaches • Sites restricted to academic or gov’t labs, on network backbone • General perception was that Law of Diminishing Returns has set in • http://www.netdimes.org seems to have made a breakthrough – Don’t manage machines, offer a very lightweight, limited purpose client, and collect its measurements centrally • 100 – 1000 clients via word-of-mouth (Sep04 to April05) • 1000 – 5000 clients achieved via press, slashdot, still in the geek community – (May05 -- ?) • 5000 – 50000 clients in the general public by offering services in return – Somewhere in the 5000-10000 client range, we have the network monitoring itself, and the possibility that it can also manage itself in real time. AS map for July 2005 BGP • 20585 nodes • 45720 edges • <k> = 4.44 DIMES • 14332 nodes • 60134 edges • <k> = 8.39 33,862 edges + 11,858 edges 81,672 edges <k> > 7.80 24,182 in both maps 35,952 new edges Exploring the DIMES AS-graph • • We consider the Internet at the level of its autonomous systems (ASes) Previous studies have used degree as indicator to decompose networks – In particular, the Faloutsos’ “jellyfish model” • Identify core of network as maximal clique (not a robust criterion) • Shells around network labeled by hop count from core (a small world) • Find that sites with few links often connect to those with high degree • • We consider longer-range connectivity, using k-pruning. K-core, K-shell, and K-crusts result – K-shell is “derivative” of K-core, K-crust is union of K-shells – Near power-law structure of a new “inflow” region is observed – K-shells are not connected, but K-crusts have a giant cluster • For Erdos-Renyi graphs, K-core is w.h.p. K-connected. For scale free? • Result – focus attention on the capabilities of the inflow region, in support of P2P, chat, local traffic. Next steps – reachability is much harder than percolation. • How does original degree map into k-shell? AS K-shell decomposition IP K-shell decomposition K-shell decomposition K-crusts show percolation threshold These are the hanging tentacles of our (Red Sea) Jellyfish Largest cluster in each shell Data from 01.04.2005 Distribution of flows in the nucleus Now offering: monthly public stats K-shell for network visualization Using LaNet-Vi http://xavier.informatics.indiana.edu/lanet-vi Michalis Faloutsos’ Jellyfish Shells 3 2 1 Core • Highly connected nodes form the core • Each Shell: adjacent nodes of previous shell, except 1degree nodes • Importance decreases as we move away from core • 1-degree nodes hanging • The denser the 1-degree node population the longer the stem Meduza ) (מדוזהmodel In January, the inner core was at K = 30, but this picture persists to the present day, when core is >40. The precise definition of the tendrils: those sites isolated from the largest cluster in all the crusts – they connect only to the core. Links per site of k-shells to k-core (above) and to k-crust (below) Where do the links go in Medusa? Early shells (1-10) link to intermediate shells as well as to the core. Average distance between sites in a crust Percolation “attacks” Random scale-free graphs produce the same structure • Seen in both Barabasi-style and Molloy-Reed models of scale free networks Next steps • New data permits reexamining the clustering behavior – Much data not seen in previous BGP-based studies – This is the major deviation from simple random models – Analyze as a function of k-shell, instead of simply degree • Reachability is not percolation, but can be evaluated – Decision to transmit a message depends on sender and destination, not simply on the existence of a link – Cost of evaluating uphill-downhill reachability is comparable to shortest path Preliminary reachability data (std data set) Now add sideways steps at top of path Now restrict to the 20-crust Sideways step less effective inside crust