Modeling Internet Topology Ellen W. Zegura College of Computing Georgia Tech




• Part I - Modeling topology

– Background

– Survey of models + what is known about topology

– Example: mathematical foundations of degree-based generation

– Evaluation of topologies

• Part II - Reality check

– Beyond simple topology

– Visualization

• Open questions/Bold statements/Random thoughts

• Reading list

Networking background

[Figure: network diagram labeled with domains/autonomous systems, transit domains, stub domains, access networks, exchange point, border routers, routers, peering, hosts/end systems]

Topology modeling

• Graph representation

• Router-level modeling

– vertices are routers

– edges are one-hop IP connectivity

• Domain- (AS-) level modeling

– vertices are domains (ASes)

– edges are peering relationships

Survey of models

• Waxman (Waxman 1988)

– router level model capturing locality

• Transit-stub (Zegura 1997), Tiers (Doar 1997)

– router level model capturing hierarchy

• Inet (Jin 2000)

– AS level model based on degree sequence

• BRITE (Medina 2000)

– AS level model based on evolution

Waxman model

(Waxman 1988)

• Router level model

• Nodes placed at random in 2-d space with dimension L

• Probability of edge (u,v):

– a e^{-d/(bL)}, where d is the Euclidean distance between u and v, and a and b are constants

• Models locality

[Figure: nodes u and v separated by distance d(u,v)]
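A minimal sketch of the edge rule on this slide, assuming that "2-d space with dimension L" means an L x L square. The function name `waxman_graph` and the default values of a and b are illustrative choices, not taken from (Waxman 1988).

```python
import math
import random

def waxman_graph(n, L=1.0, a=0.4, b=0.2, seed=None):
    """Return (positions, edges) for an n-node Waxman-style random graph."""
    rng = random.Random(seed)
    # Place nodes uniformly at random in an L x L square (assumed layout).
    pos = [(rng.uniform(0, L), rng.uniform(0, L)) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(pos[u], pos[v])  # Euclidean distance d(u,v)
            # Edge probability a * e^{-d/(bL)} decays with distance (locality).
            if rng.random() < a * math.exp(-d / (b * L)):
                edges.append((u, v))
    return pos, edges
```

Larger a raises the overall edge density, while larger b makes long edges relatively more likely.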

Transit-stub model

(Zegura 1997)

• Router level model

• Transit domains

– placed in 2-d space

– populated with routers

– connected to each other

• Stub domains

– placed in 2-d space

– populated with routers

– connected to transit domains

• Models hierarchy

Real data: AS topology

• Oregon Route Views server; peers with routers to collect BGP routing tables

• Data publicly available from Nov 97 to present

• Faloutsos 1999

– degree sequence approximated by power law

– i.e., let f(d) be the fraction of nodes with degree d; then f(d) ∝ d^{α} for a constant exponent α

• Chen 2002

– Oregon data incomplete (but so is theirs!)

– degree sequence highly variable but not strict power law
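One common way to estimate the exponent of such a power law is a straight-line fit on log-log axes. The sketch below assumes that approach; the function name `fit_power_law_exponent` is hypothetical, and this simple regression is a crude estimator that cannot by itself tell a strict power law from a merely highly variable distribution.

```python
import math
from collections import Counter

def fit_power_law_exponent(degrees):
    """Least-squares slope of log f(d) against log d.

    Needs at least two distinct positive degrees.
    """
    counts = Counter(d for d in degrees if d > 0)
    n = sum(counts.values())
    xs = [math.log(d) for d in counts]
    ys = [math.log(c / n) for c in counts.values()]  # f(d) = fraction of nodes
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```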

Inet

(Jin 2000)

• Generate degree sequence

• Build spanning tree over nodes with degree larger than 1, using preferential connectivity

– randomly select node u not in tree

– join u to existing node v with probability d(v)/Σ_w d(w), summing over nodes w already in the tree

• Connect degree 1 nodes using preferential connectivity

• Add remaining edges using preferential connectivity
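The spanning-tree step can be sketched as follows. This is an illustration only: it assumes d(v) is the target degree from the generated sequence, it does not track how much of each node's degree has been used, and the name `preferential_spanning_tree` is hypothetical.

```python
import random

def preferential_spanning_tree(degree, seed=None):
    """degree: dict node -> target degree.

    Returns tree edges over the nodes whose degree is larger than 1.
    """
    rng = random.Random(seed)
    core = [v for v, d in degree.items() if d > 1]
    rng.shuffle(core)            # random order in which nodes join the tree
    tree_nodes = [core[0]]
    edges = []
    for u in core[1:]:
        # Choose existing node v with probability d(v) / sum of d(w) in tree.
        weights = [degree[v] for v in tree_nodes]
        v = rng.choices(tree_nodes, weights=weights, k=1)[0]
        edges.append((u, v))
        tree_nodes.append(u)
    return edges
```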

BRITE

(Medina 2000)

• Generate small backbone, with nodes placed:

– randomly or

– concentrated (skewed)

• Add nodes one at a time (incremental growth)

• New node has constant # of edges connected using:

– preferential connectivity and/or

– locality
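A rough sketch of incremental growth with preferential connectivity only (locality is left out). The fully connected starting backbone, the parameter names, and the function name `incremental_growth` are assumptions made for illustration; the slide does not say how the backbone is wired.

```python
import random

def incremental_growth(n, m=2, backbone=3, seed=None):
    """Grow an n-node graph; each new node attaches with m edges."""
    rng = random.Random(seed)
    # Assumed starting point: a small fully connected backbone (backbone >= 2).
    edges = [(u, v) for u in range(backbone) for v in range(u + 1, backbone)]
    degree = {u: backbone - 1 for u in range(backbone)}
    for new in range(backbone, n):
        candidates = list(degree)
        targets = set()
        # Draw distinct targets with probability proportional to current degree.
        while len(targets) < min(m, len(candidates)):
            weights = [degree[v] for v in candidates]
            targets.add(rng.choices(candidates, weights=weights, k=1)[0])
        degree[new] = 0
        for v in targets:
            edges.append((new, v))
            degree[v] += 1
            degree[new] += 1
    return edges
```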

Router-level measurement

• General technique: traceroute, returns list of IP addresses on a path from source to destination

• Collection challenges:


– obtaining sufficient traceroute origin points

– deciding set of destination IP addresses (for coverage)

– limiting traceroute load

• Postprocessing challenges:

– resolving aliases (which IP addresses belong to the same router)

[Figure: path from source to destination]

Zegura - Mar 2002 IPAM Workshop Tutorial



• Lucent (Burch 1999)

– single source (Lucent), ~100k destinations

– emphasis: longitudinal study, visualization

• Skitter (Broido 2001)

– 20 sources (“monitors”), ~400k destinations

– emphasis: measurement repository, analysis

• Mercator (Govindan 2000)

– single source (but uses source routing), 150k interfaces

– emphasis: heuristics for map construction

What is known? (hard to say)

• Caveat: router-level mapping clearly incomplete, so conclusions are weak

• Observations:

– qualitatively similar to AS graph on a number of measures

– Weibull distributions are a good fit for a number of quantities (including degree distribution)

Foundations of degree-based generation

(Mihail 2002)

• Given degree sequence d(1) >= d(2) >= … >= d(n)

• A degree sequence is realizable if there is a simple graph (no self-loops or multiple links) with this sequence

• Necessary and sufficient condition for degree sequence to be realizable:

– for each subset of k highest degree nodes, degrees can be “absorbed” within the nodes and the outside degrees
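The condition above matches what is commonly known as the Erdős–Gallai test: the k highest degrees must not exceed k(k-1) (edges among those k nodes) plus what the remaining nodes can absorb, min(d(i), k) each. A minimal check under that reading; the function name `is_realizable` is illustrative.

```python
def is_realizable(degrees):
    """True if some simple graph has exactly this degree sequence."""
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False  # the degree sum of any graph is even
    for k in range(1, len(d) + 1):
        inside = k * (k - 1)                         # absorbed within the top k
        outside = sum(min(x, k) for x in d[k:])      # absorbed by the other nodes
        if sum(d[:k]) > inside + outside:
            return False
    return True
```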

Construction algorithm

• Maintain residual degrees of vertices, d(v)

• Repeat until all vertices have been chosen:

– pick arbitrary vertex v

– add edges from v to d(v) vertices of highest residual degree

– update residual degrees

Note: order to pick v arbitrary
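The construction can be sketched as below, assuming vertices are numbered 0..n-1 and that a failed step means the sequence is not realizable. The `order` argument (a hypothetical name) makes the arbitrary choice of v explicit.

```python
def construct_graph(degrees, order=None):
    """Build a simple graph with the given degree sequence, or raise ValueError."""
    residual = dict(enumerate(degrees))
    order = list(residual) if order is None else list(order)
    edges = []
    for v in order:
        need = residual[v]
        residual[v] = 0  # v is now fully served and is never picked as a neighbor
        # Other vertices with residual degree left, highest residual first.
        others = sorted((u for u in residual if u != v and residual[u] > 0),
                        key=lambda u: residual[u], reverse=True)
        if len(others) < need:
            raise ValueError("degree sequence is not realizable")
        for u in others[:need]:
            edges.append((v, u))
            residual[u] -= 1  # update residual degrees
    return edges
```

For example, `construct_graph([3, 3, 2, 2, 2])` and `construct_graph([3, 3, 2, 2, 2], order=[4, 3, 2, 1, 0])` both satisfy the sequence but connect the vertices differently.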

Sparse/dense core

• Dense core

– pick v’s starting with high degree vertices

– will tend to connect high degree vertices

• Sparse core

– pick v’s starting with low degree vertices

– less likely to connect high degree vertices

• Large topology (11000+ nodes, 32000+ edges)

• Dense core

– diameter 5

– average path length 3.6

• Sparse core

– diameter 29

– average path length 17.9

Random instance

• Start from any realization of degree sequence

• Pick two edges at random, (u,v) and (s,t), with distinct endpoints

• If the swap does not disconnect the graph, remove both edges and insert (u,s) and (v,t)

• Result satisfies degree sequence

In the limit, reaches every possible connected realization with equal probability
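A sketch of the swap process, assuming nodes are numbered 0..n-1. It also rejects swaps that would create a multiple link, which is an added assumption to keep the graph simple. The full connectivity check on every step is the simplest option and would be slow on large topologies.

```python
import random

def is_connected(n, edges):
    """Depth-first search from node 0 over nodes 0..n-1 (n >= 1)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def random_swaps(n, edges, iterations, seed=None):
    """Apply degree-preserving edge swaps that keep the graph connected."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(iterations):
        i, j = rng.sample(range(len(edges)), 2)
        (u, v), (s, t) = edges[i], edges[j]
        if len({u, v, s, t}) < 4:
            continue  # endpoints must be distinct
        if frozenset((u, s)) in present or frozenset((v, t)) in present:
            continue  # swap would create a multiple link
        trial = list(edges)
        trial[i], trial[j] = (u, s), (v, t)
        if not is_connected(n, trial):
            continue  # swap would disconnect the graph
        present -= {frozenset((u, v)), frozenset((s, t))}
        present |= {frozenset((u, s)), frozenset((v, t))}
        edges = trial
    return edges
```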

• Different starting points

• Snapshots: initial, then 25k, 50k, 100k, 300k, 600k iterations

• Large topology, sparse initial core

– diameter: 29, 13, 11, 11, 10, 10

– avgspl: 5.6, 3.6, 3.4, 3.4, 3.4, 3.4

• Large topology, dense initial core

– diameter: 5, 10, 10, 10, 10, 10

– avgspl: 3.6, 3.2, 3.2, 3.4, 3.4, 3.4

Notes about models

• Variants on evolutionary models

• Variants on degree-driven models

• Appeal of evolutionary

• Relationship to work on “networks” in general

• Question: what determines whether a topology generator is “good”?

• Essentially an unsolved (and hard) problem

– depends on what topologies are used for

• NOT “degree sequence follows a power law!”

• Path-related metrics

– diameter, shortest path length

• Clustering metrics

– neighborhood size (“expansion”), eigenvalue decomposition, clustering coefficient

• Robustness metrics

– resilience

• Hierarchy metrics

– link usage, size of layers


Small world topologies

(Bu 2002)

• Defined by two measures:

– characteristic path length L = number of edges in shortest path between two vertices, averaged over all vertex pairs

– clustering coefficient C:

• take vertex v with k ≥ 1 neighbors

• at most k(k-1)/2 edges among neighbors

• C(v) = fraction of k(k-1)/2 edges present

• C = average clustering coefficient

• C >> C_random, L ≈ L_random

[Figure: vertex v and its k neighbor nodes]
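Both measures can be computed directly from an adjacency map. The sketch below assumes an undirected, connected graph given as a dict from each vertex to its set of neighbors, and it counts C(v) as 0 for vertices with fewer than two neighbors (a convention chosen here, not stated on the slide).

```python
from collections import deque
from itertools import combinations

def clustering_coefficient(adj):
    """Average over all vertices of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # C(v) counted as 0 (assumed convention)
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

def characteristic_path_length(adj):
    """Shortest-path length in edges, averaged over all reachable vertex pairs."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```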

• AS-level topologies satisfy small-world test

• Example Mar 00:

– L=3.7, L_random=3.8

– C=.39, C_random=.0023

• Example, Sept 01:

– L= 3.6, L_random=3.6

– C=.47, C_random=.0015

Distinguishing between types of generators

(Tangmunarunkit 2001)

• Goal: large-scale metrics that distinguish between classes of graphs

• Proposal: Expansion, resilience and distortion

– differentiate between canonical graphs (mesh, tree, random graph)

– differentiate between three types of generators

• random graph (e.g., Waxman)

• structural (e.g., Transit-Stub, Tiers)

• degree-based (e.g., PLRG, BRITE)

Model “signatures”

• Signature: expansion, resilience, distortion

• Waxman: H H H (like random)

• Tiers: L H L

• Transit-stub: H L L (like tree)

• PLRG: H H L (like complete graph)

• Also: real topologies and other degree-based generators have H H L signature

Measure of hierarchy

• link-value measure

• see paper for details…

• bottom line: degree-based generators contain loose notion of hierarchy that is somewhat similar to loose notion in Internet

Semantics: policy-based routes

• Internet routes are not hop-based shortest paths

• General policies:

– path between two nodes in a domain remains in that domain

– path between two nodes in two different domains traverses zero or more transit domains



• Use edge weights so that shortest-paths obey general policies

• Four weights (in order)

– intra-domain edges

– T-T edges

– S-T edges

– S-S edges

BGP peering relationships

(Gao 2000)

• Problem: Routes determined by routing policy, including AS-level contractual agreements

• Idea: label edges in AS-level graph as

– provider-to-customer (customer pays provider for connectivity to rest of Internet)

– peer-to-peer (exchange traffic between customers free of charge)

– sibling-to-sibling (provide connectivity to rest of Internet for each other)



• Use BGP routing table entries




• e.g., routing table entry = AS path 1849 702 701 1

• downhill path: all edges provider-to-customer or sibling-to-sibling

• uphill path: all edges customer-to-provider or sibling-to-sibling

• An AS path of a BGP routing table is:

– an uphill path followed by a downhill path (either path segment may be empty)…or...

– an uphill path followed by a peer-to-peer edge followed by a downhill path (either path segment may be empty)

• an uphill path followed by a downhill path

– AS4-AS2-AS1-AS3-AS5

– AS7-AS1-AS2

• an uphill path followed by a peer-to-peer edge followed by a downhill path

– AS5-AS6-AS3-AS5

– AS6-AS3-AS2-AS4
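The two allowed path shapes can be checked with a small state machine over the edge labels along a path. The label strings `"c2p"`, `"p2c"`, `"p2p"`, and `"s2s"` and the function name `is_valid_as_path` are shorthand invented for this sketch.

```python
def is_valid_as_path(labels):
    """labels: edge labels in path order.

    'c2p' = customer-to-provider, 'p2c' = provider-to-customer,
    'p2p' = peer-to-peer, 's2s' = sibling-to-sibling.
    """
    phase = "up"  # uphill first, then at most one peer edge, then downhill
    for label in labels:
        if label == "s2s":
            continue  # sibling edges are allowed in both segments
        if label == "c2p":
            if phase != "up":
                return False  # climbing again after the top is not allowed
        elif label == "p2p":
            if phase != "up":
                return False  # at most one peer edge, and only at the top
            phase = "down"
        elif label == "p2c":
            phase = "down"
        else:
            raise ValueError(f"unknown label: {label}")
    return True
```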

Basic algorithm sketch

• Compute degrees for each AS

• For each routing table path:

– find highest degree AS (“top provider” T)

– AS edge (u,v) to left of T: (u,v) assigned value 1

– AS edge (u,v) to right of T: (v,u) assigned value 1

• For each edge (u,v):

– if (u,v) =1 and (v,u) = 1 then sibling-to-sibling

– else if (v,u) = 1 then provider-to-customer

– else if (u,v) = 1 then customer-to-provider

• Note: complete algorithm also identifies peer-to-peer edges
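A sketch of the basic steps, with AS paths given as lists of AS numbers. It assumes that edges to the right of T are recorded in the reverse direction, (v,u), so that the classification rules can tell the two sides apart; peer-to-peer edges are not handled. The function name `infer_relationships` is hypothetical.

```python
from collections import defaultdict

def infer_relationships(paths):
    """Label each directed AS edge seen in the paths."""
    # Degree of each AS in the graph formed by all paths.
    neighbors = defaultdict(set)
    for path in paths:
        for u, v in zip(path, path[1:]):
            neighbors[u].add(v)
            neighbors[v].add(u)
    marked = set()  # (u, v) in marked means "(u,v) assigned value 1"
    for path in paths:
        # Top provider T: the highest-degree AS on the path.
        top = max(range(len(path)), key=lambda i: len(neighbors[path[i]]))
        for i in range(len(path) - 1):
            u, v = path[i], path[i + 1]
            if i < top:
                marked.add((u, v))  # left of T
            else:
                marked.add((v, u))  # right of T (assumed reverse direction)
    rel = {}
    for path in paths:
        for u, v in zip(path, path[1:]):
            if (u, v) in marked and (v, u) in marked:
                rel[(u, v)] = "sibling-to-sibling"
            elif (v, u) in marked:
                rel[(u, v)] = "provider-to-customer"
            elif (u, v) in marked:
                rel[(u, v)] = "customer-to-provider"
    return rel
```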

Hierarchical classification

(Subramanian 2002)

• Idea: partition ASes into hierarchical levels using directed graph of peering relationships

• Process:

– identify and remove nodes with out-degree 0 (customers)

– recursively identify and remove nodes with out-degree 0



– identify dense core as largest subset of nodes that is “almost a clique” (in- and out-degree at least half the nodes)

– identify transit core as smallest subset of nodes that peer primarily with each other and ASes in dense core

– remaining nodes are outer core
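The first two steps (peeling off nodes with out-degree 0) can be sketched as below. The sketch assumes the directed graph is a dict from each AS to the set of ASes it points to; the dense-core and transit-core steps are not attempted, and the names `peel_leaves` and `second_level` are illustrative.

```python
def peel_leaves(out_edges):
    """Return (customers, second_level, remaining) sets of nodes."""
    graph = {u: set(vs) for u, vs in out_edges.items()}
    for vs in out_edges.values():
        for v in vs:
            graph.setdefault(v, set())  # nodes that appear only as targets
    # Step 1: nodes with out-degree 0 are customers.
    customers = {u for u, vs in graph.items() if not vs}
    remaining = {u: vs - customers for u, vs in graph.items() if u not in customers}
    # Step 2: repeatedly remove nodes whose out-degree has dropped to 0.
    second_level = set()
    while True:
        leaves = {u for u, vs in remaining.items() if not vs}
        if not leaves:
            break
        second_level |= leaves
        remaining = {u: vs - leaves for u, vs in remaining.items() if u not in leaves}
    return customers, second_level, set(remaining)
```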

Example result

• Dense core - 20 ASes

• Transit core - 162 ASes

• Outer core - 675 ASes

• Small regional ISPs - 950 ASes

• Customers - 8852 ASes

Visualization: netvisor

(Eagan 2002)

• Tool for router-level layout

• Combines automatic placement with user-assisted placement

• Understands domain semantics

• Collaboration between Information Visualization experts and Networking experts


Visualization: conceptual model

(Tauro 2002)

• Idea: simple representation of AS-level topology, useful for intuitive understanding (and NY Times publication!)

• e.g., bowtie model for web

• jellyfish model

– highly connected core

– layers (“shells”)

– degree one nodes form legs

– length of legs denotes density

[Figure: jellyfish diagram labeled core, layers, legs]

Open Problems

• Evaluation

– what metrics are important?

• Useful modeling/scaling

– what topologies should be used for simulations?

• Semantics

– let’s move beyond simple topology

Are AS-level topologies useful?

• Many interesting problems arise due to large scale of Internet, hence need simulations that are “big enough”

• AS-level topology (about 10,000 nodes) manageable for some simulations

• But…representation of every AS as a comparable node (especially in 2-d space!) is a gross simplification


Observations on level of detail

• AS level models are limited (useless?)

– not enough distinction (all ASes look alike)

– not suitable for packet level simulations

• router level models are limited (useless?)

– too small to be realistic…or...

– too large for simulations

• need alternative models

– intermediate (border routers, exchange points,…)

– fluid flow network model??

• need better understanding of scaling

Reading List (1 of 3)

• [Broido 2001] Broido and Claffy, “Internet topology: local properties”, SPIE ITCom 2001.

• [Bu 2002] Bu and Towsley, “Distinguishing between Internet power-law generators”, IEEE Infocom 2002.

• [Burch 1999] Burch and Cheswick, “Mapping the Internet”, IEEE Computer, April 1999.

• [Chen 2002] Chen, Chang, Govindan, Jamin, Shenker and Willinger, “The origin of power laws in Internet topologies revisited”,

• [Calvert 1997] Calvert, Doar and Zegura, “Modeling Internet topology”, IEEE Communications Magazine, June 1997.

• [Doar 1997] Doar and Leslie, “How bad is naïve multicast routing”, IEEE Infocom 1993.

• [Eagan 2002] Netvisor.

• [Faloutsos 1999] Faloutsos, Faloutsos and Faloutsos, “On power-laws relationships of the Internet topology”, ACM Sigcomm 1999.

Reading List (2 of 3)

• [Gao 2000] Gao, “On inferring autonomous system relationships in the Internet”, IEEE Infocom 2000.

• [Govindan 2000] Govindan and Tangmunarunkit, “Heuristics for Internet map discovery”, IEEE Infocom 2000.

• [Jin 2000] Jin, Chen and Jamin, “Inet: Internet topology generator”, U. Michigan technical report CSE-TR-433-00, September 2000.

• [Medina 2000] Medina, Matta and Byers, “On the origin of power-laws in Internet topologies”, ACM CCR, April 2000.

• [Mihail 2002] Mihail, Gkantsidis, Saberi, Zegura, “On semantics of Internet topologies”, GT technical report, January 2002.

• [Subramanian 2002] Subramanian, Agarwal, Rexford and Katz, “Characterizing the Internet from multiple vantage points”, IEEE Infocom

• [Tauro 2002] Tauro, Palmer, Siganos and Faloutsos, “A simple conceptual model for the Internet topology”, Global Internet 2001.

Reading List (3 of 3)

• [Tangmunarunkit 2001] Tangmunarunkit, Govindan, Jamin, Shenker and Willinger, “Network topologies, power laws, and hierarchy”, USC technical report 01-746, 2001.

• [Waxman 1988] Waxman, “Routing of multipoint connections”, IEEE JSAC,

• [Zegura 1997] Zegura, Calvert and Donahoo, “A quantitative comparison of graph-based models for Internet topology”, IEEE/ACM Transactions on Networking, December 1997.

