The Internet structure


Principles in Communication Networks

• Instructor: Prof. Yuval Shavitt,

– Office hours: room 303, Software Engineering building, Tue 16:00-17:00

• Prerequisites:

– Introduction to computer communications (TAU, Technion, BGU)

• Expectations from students:

– Probability

– Queueing theory basics

– Graph theory

Course Syllabus (tentative)

• Internet structure

• Internet measurements

• Measurement optimization

• Measurement analysis

• Introduction to switching, router types

• Use of generating functions: HOL (head-of-line) blocking analysis, TCP analysis

• Matching algorithms and their analysis

• CLOS networks: non-blocking theorem, routing algorithms and their analysis

• Scheduling algorithms

Grade composition

• Final exam – 60%

• Project – 30%

• Home assignments (2-3) - 10%

Routing in the Internet


Routing in the Internet is done at three levels:

• In LANs, at the MAC layer:

– Spanning tree protocol for Ethernet transparent bridges

– Source routing for token rings

• Inside autonomous systems (ASes):

– RIP, OSPF, IS-IS, (E)IGRP

• Between ASes:

– BGP

Autonomous Systems

• Autonomous Routing Domains: A collection of physical networks glued together using IP, that have a unified administrative routing policy.

• An AS is an autonomous routing domain that has been assigned a number.

… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.

RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System

Internet Hierarchical Routing

[Figure: ASes A, B and C, with routers a, b, c, d, gateway routers A.a, A.c, B.a, C.b, and hosts h1 and h2. Labels: inter-AS routing between A and B; intra-AS routing within AS A; intra-AS routing within AS B.]

Why different intra- and inter-AS routing?

Policy:

• Inter-AS: the admin wants control over how its traffic is routed and over who routes through its network

• Intra-AS: single admin, so no policy decisions needed

Scale:

• Hierarchical routing saves table size and reduces update traffic

Performance:

• Intra-AS: can focus on performance

• Inter-AS: policy may dominate over performance

RIP

• A distance-vector protocol (distributed Bellman-Ford)

• Developed in the 1980s, based on a Xerox protocol

• RIP-2 is now often used due to its simplicity

• Distance metric: hop count (minimum-hop paths)
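
The distance-vector idea above can be sketched in a few lines of Python. This is a simplified, synchronous simulation of distributed Bellman-Ford with a hop-count metric, not the RIP wire protocol: the function name `distance_vector` and the topology format (a dict mapping each node to its set of neighbors) are assumptions made for illustration. The constant 16 follows RIP's convention of treating 16 hops as unreachable.

```python
INFINITY = 16  # RIP treats a hop count of 16 as "unreachable"


def distance_vector(neighbors):
    """Run synchronous rounds of distributed Bellman-Ford (hop-count metric).

    neighbors: dict mapping node -> set of adjacent nodes (assumed symmetric).
    Returns a dict mapping node -> {destination: hop count}.
    """
    # Initially every node knows only the route to itself.
    tables = {u: {u: 0} for u in neighbors}
    changed = True
    while changed:
        changed = False
        # Snapshot: in each round, nodes use the vectors advertised
        # by their neighbors at the end of the previous round.
        snapshot = {u: dict(t) for u, t in tables.items()}
        for u in neighbors:
            for v in neighbors[u]:
                for dest, dist in snapshot[v].items():
                    candidate = min(dist + 1, INFINITY)
                    if candidate < tables[u].get(dest, INFINITY):
                        tables[u][dest] = candidate
                        changed = True
    return tables
```

On a four-node chain a-b-c-d, node a ends up with a 3-hop route to d after three rounds of exchanges; each round extends every node's knowledge by one hop.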

OSPF / IS-IS

• Link-state protocols: each node sees the entire network map and calculates shortest paths using Dijkstra's algorithm

• Allow two levels of hierarchy

• Authentication

• Complex

• IS-IS is gaining popularity among large ISPs
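
As an illustration of the shortest-path step that every link-state router performs on its copy of the network map, here is a minimal sketch of Dijkstra's algorithm. The graph format (a dict of dicts holding link costs) and the function name `dijkstra` are assumptions for this example; real OSPF and IS-IS implementations work on a link-state database with many more details.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path costs from source.

    graph: dict mapping node -> {neighbor: non-negative link cost}.
    Returns a dict mapping each reachable node -> cost of the shortest path.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

For a triangle with costs a-b = 1, b-c = 2 and a-c = 4, the cheapest route from a to c goes through b at cost 3.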

The structure of the Internet

How are routers connected?

• Why should we care?

– While communication protocols will work correctly on ANY topology

– ….they may not be efficient for some topologies

– Knowledge of the topology can aid in optimizing protocols

The Internet as a graph

• Remember: the Internet is a collection of networks called autonomous systems (ASes)

• The Internet graph:

– The AS graph

• Nodes: ASes, links: AS peering

– The router level graph

• Nodes: routers, links: fibers, cables, MW channels, etc.

– There are mid-level aggregation schemes

• PoP topologies, city topologies

• What does it look like?

Random graphs in Mathematics

The Erdős-Rényi model

• Generation:

– create n nodes.

– each possible link is added with probability p .

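• Expected number of links: p·n(n-1)/2; expected node degree: (n-1)p ≈ np

• Degree distribution: Poisson (for large n with np held fixed)

• If we want to keep the number of links linear in n, what happens to p as n → ∞?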
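
A small sketch of the generation procedure, assuming nodes are numbered 0..n-1. The function names `erdos_renyi` and `p_for_average_degree` are illustrative, not from any library. The second helper makes the scaling question concrete: to keep the expected degree at a constant c (and hence the number of links linear in n), p must shrink like c/(n-1).

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the link list of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)
    # Each of the n(n-1)/2 possible links is added independently with prob. p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def p_for_average_degree(n, c):
    """Link probability that keeps the expected node degree equal to c."""
    return c / (n - 1)


n, c = 1000, 4
links = erdos_renyi(n, p_for_average_degree(n, c), seed=0)
# The expected number of links is c * n / 2, i.e. linear in n.
print(len(links), "links; expected about", c * n // 2)
```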

The Waxman model

• Integrating distance with the E-R model

• Generation

– Spread n nodes on a large enough grid.

– Pick a candidate link uniformly at random and add it with a probability that decreases exponentially with its length

– Stop when there are enough links

• Heavily used in the 90s
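
The generation steps above can be sketched as follows. This is only an illustration of the procedure as stated on the slide: the function name `waxman_like`, the grid size, and the decay parameter `scale` are all assumptions, and Waxman's published model has its own parameterization that is not reproduced here.

```python
import math
import random


def waxman_like(n, num_links, grid=100, scale=20.0, seed=None):
    """Place n nodes on a grid and add distance-penalized random links.

    grid and scale are illustrative parameters: a candidate link of length d
    is accepted with probability exp(-d / scale).
    """
    if num_links > n * (n - 1) // 2:
        raise ValueError("more links requested than node pairs exist")
    rng = random.Random(seed)
    pos = [(rng.randrange(grid + 1), rng.randrange(grid + 1)) for _ in range(n)]
    links = set()
    while len(links) < num_links:
        u, v = rng.sample(range(n), 2)  # pick a candidate link uniformly at random
        link = (min(u, v), max(u, v))
        if link in links:
            continue
        length = math.dist(pos[u], pos[v])
        if rng.random() < math.exp(-length / scale):
            links.add(link)
    return pos, links
```

Because short links are accepted far more often than long ones, the resulting graphs are dominated by local connections, which is the property that distinguishes this family from the plain Erdős-Rényi model.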


1999

The Faloutsos brothers

• Measured the Internet AS and router graphs.

• Mine, she looks different!

Notre Dame

• Looked at complex system graphs: social relationship, actors, neurons, WWW

• Suggested a dynamic generation model

The Faloutsos Graph

1995 Internet router topology

3888 nodes, 5012 edges, <k>=2.57

SCIENCE CITATION INDEX

Nodes : papers

Links : citations

1736 PRL papers (1988)

Witten-Sander

PRL 1981

25

2212

$P(k) \sim k^{-\gamma}$ ($\gamma = 3$)

(S. Redner, 1998)

Sex-web

Nodes: people (Females; Males)

Links: sexual relationships

4781 Swedes; 18-74;

59% response rate.

Liljeros et al. Nature 2001

Web power-laws

SCALE-FREE NETWORKS

(1) The number of nodes (N) is NOT fixed.

Networks continuously expand by the addition of new nodes

Examples:

WWW : addition of new documents

Citation : publication of new papers

(2) The attachment is NOT uniform.

A node is linked with higher probability to a node that already has a large number of links.

Examples:

WWW : new documents link to well known sites

(CNN, YAHOO, New York Times, etc)

Citation : well cited papers are more likely to be cited again

Scale-free model

(1) GROWTH :

At every timestep we add a new node with m edges

(connected to the nodes already present in the system).

(2) PREFERENTIAL ATTACHMENT :

The probability Π that a new node will be connected to node i depends on the connectivity $k_i$ of that node:

$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$

$P(k) \sim k^{-3}$

A.-L.Barabási, R. Albert, Science 286, 509 (1999)
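
A minimal sketch of growth with preferential attachment, under stated assumptions: the seed graph is a clique on m+1 nodes (the slide does not specify one), each new node picks m distinct targets, and the function name `preferential_attachment` is illustrative. Sampling a uniformly random entry from the list of all link endpoints selects node i with probability proportional to its degree, which is exactly the rule $\Pi(k_i) = k_i / \sum_j k_j$.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m existing nodes."""
    if m < 1 or n <= m:
        raise ValueError("need m >= 1 and n > m")
    rng = random.Random(seed)
    # Assumed seed graph: a clique on nodes 0..m.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Every node appears in `endpoints` once per incident link, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((t, new))
            endpoints.extend((t, new))
    return edges
```

Early nodes accumulate links faster than late arrivals, which is what produces hubs and the heavy-tailed degree distribution.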

The Faloutsos Graph

[Figure: log-log plot of node degree for AS20000102.m; both axes run from 10^0 to 10^4]

Back to the Internet

• Understanding its structure and dynamics

– help applications (WWW, file sharing)

– help improve routing

– predict Internet growth

• So let's look at the data...

…Data?

• The Internet is an engineered system, so someone must know how it is built, no?

• NO! It is an uncoordinated interconnection of Autonomous Systems (ASes = networks).

• No central database about Internet structure.

• Several projects attempt to reveal the structure: Skitter, RouteViews, …

The Internet Structure

[Figure: the router-level graph]

The Internet Structure

[Figure: the AS graph]

Revealing the Internet Structure


Diminishing return!

Deploying more boxes does not pay off

7 new links

30 new links

NO new links

Revealing the Internet Structure

To obtain the 'horizontal' links we need a strong presence at the edge

What is DIMES?

DIMES

• Distributed Internet measurement and monitoring

– Based on software agents downloaded by volunteers

• Diminishing return?

– Software agents

– The cost of the first agent is very high

– Each additional agent costs almost zero

• Capabilities

– Obtaining Internet maps at all granularity levels

• connectivity, delay, loss, bandwidth, jitter, ….

– Tracking the Internet evolution in time

– Monitoring the Internet in real time

Diminishing Return?

• [Chen et al 02], [Barford et al 01]: when you combine more and more points of view, the return diminishes very fast

• What have they missed?

– The mass of the tail is significant

No. of views


Diminish … shminimish

How many ASes see an edge?

~ 9000 / 6000 are seen only by one

Challenges

• It's a distributed system:

– Measurement traffic looks malicious

• Flying under the NOC radar screens (agents cannot measure too much)

– Optimize the architecture:

• Minimize the number of measurements

• Expedite the discovery rate

• BUT agents are

– Unreliable

– Some move around

real world complex system

Distributed System

Agents

• To be able to use agents wisely we need agent profiles:

– Reliability

– Location:

• Static

• Bi-homed: where mostly?

• Mobile: identify home base

– Abilities: what type of measurements can it perform?

[Figure: agent shavitt, 31-Aug-04 to 5-Oct-04 (y-axis 0 to 9000). Annotations: fairly stable measurements from Israel; 2 idle weeks; reappears in Spain.]

[Figure: agent prinCompNet (y-axis 0.8 to 1.8, × 10^4); x-axis: days since project launched, 75 to 140.]

Degree Distribution

[Figure: Pr(k) against log(degree), DIMES+BGP (Feb 05); axis labels include <k> and k.]

[Figure: Zipf plot against log(rank), DIMES+BGP (Feb 05).]

Quantifying the Distribution

Data Set

• Data is obtained from DIMES

– Community-based infrastructure, using almost 1000 active measuring software agents

– Agents follow a script and perform ~2 probes per minute (ICMP/UDP traceroute, ping)

– Most agents measure from a single AS, their vantage point (vp)

• But some (appear to) measure from more…

• Data need to be filtered to remove artifacts

– Traceroute data collected during March 2008

Filtering the data

• For each agent and each week, classify how many networks it measured the Internet from

Typical cases:

– AS_i: 15300, AS_j: 8

– AS_i: 10000, AS_j: 3178

– AS_i: 10000, AS_j: 412, AS_k: 201

– 18000, 12, 11, 9, 9, 3, 3, 2, 2, 1, 1, 1, 1, 1, ...
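
The classification step can be sketched as a per-agent, per-week count of measurements by origin AS. The slides do not state the actual filtering rule, so the share threshold `min_share` below is purely a hypothetical choice for illustration, as are the function name and the record format (agent, week, AS).

```python
from collections import Counter, defaultdict


def vantage_points_per_agent_week(records, min_share=0.01):
    """Group measurements by (agent, week) and keep the plausible origin ASes.

    records: iterable of (agent, week, asn) tuples, one per measurement.
    min_share: hypothetical threshold; an AS is kept only if it accounts for
    at least this fraction of the agent's measurements in that week.
    """
    counts = defaultdict(Counter)
    for agent, week, asn in records:
        counts[(agent, week)][asn] += 1
    result = {}
    for key, per_as in counts.items():
        total = sum(per_as.values())
        result[key] = {asn: c for asn, c in per_as.items() if c / total >= min_share}
    return result
```

With this threshold, the first typical case (15300 versus 8 measurements) collapses to a single vantage point, while the second (10000 versus 3178) keeps both networks, which suggests an agent that really measured from two places.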

Measurements Per Agent

[Figure: week 4, 2008; panels: measurements per network, agents per network.]

Filtering Results

• 96% of the agents have fewer than 4 different vps

• High-degree ASes tend to have more agents

• High number of measurements for all vp degrees

Diminishing Returns?

• Barford et al.

– The utility of adding many vps quickly diminishes

– In terms of ASes and AS-links

• Shavitt and Shir – utility indeed diminishes but the tail is long and significant

– Tail is biased towards horizontal links

• We wish to quantify how different aspects of AS-level topology are affected by adding more vps

Creating topologies per VP sort by

Topology Size

• The return (especially for AS links) does not diminish fast!

A VP with a small local topology can contribute many new links!
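
One way to quantify the return from each vantage point is to add local topologies one at a time and count the links not seen before. The sketch below assumes each local topology is a set of links and that VPs are processed in decreasing order of local topology size; the sort direction, the function name `marginal_link_gain`, and the data format are assumptions.

```python
def marginal_link_gain(local_topologies):
    """Count the new links each VP contributes when added in sorted order.

    local_topologies: dict mapping vp -> set of links, each link a tuple
    of two AS identifiers in a canonical order.
    Returns a list of (vp, number_of_new_links), largest topology first.
    """
    order = sorted(local_topologies,
                   key=lambda vp: len(local_topologies[vp]),
                   reverse=True)
    seen = set()
    gains = []
    for vp in order:
        new_links = local_topologies[vp] - seen
        gains.append((vp, len(new_links)))
        seen |= new_links
    return gains
```

In a toy input where a mid-sized VP sees only links already covered by a larger one, its gain is zero, while the smallest VP can still add a link nobody else saw. That is the pattern behind the long tail.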

Direction of Detected Links

• For each link, plot the maximum degree of its two adjacent ASes against the degree difference between them

Low degree difference: indicates tangential links and links between small ASes

High degree difference: indicates radial links towards the core
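
The two per-link quantities can be computed directly from a link list. This is a small sketch with an assumed input format (a list of AS pairs) and an illustrative function name, `link_degree_features`.

```python
from collections import Counter


def link_degree_features(links):
    """For each link return (max endpoint degree, endpoint degree difference)."""
    degree = Counter()
    for u, v in links:
        degree[u] += 1
        degree[v] += 1
    return {
        (u, v): (max(degree[u], degree[v]), abs(degree[u] - degree[v]))
        for u, v in links
    }
```

In a toy graph where a hub connects to three ASes and two of those also connect to each other, the hub-to-leaf link has a large degree difference (radial), while the link between the two small ASes has a difference of zero (tangential).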

Convergence of Properties

• Take several common AS-level graph properties and analyze their convergence as local topologies are added

– Keeping the sort order by number of links

• Slow convergence indicates the need for a broad and diverse set of vps

Density and Average Degree

Slow convergence of density and average degree – easy to detect ASes but difficult to find all links

Power-law and Max Degree

Fast convergence of maximal degree – core links are easily detected

Fair convergence of power-law exponent

Betweenness and Clustering

Fast convergence of max betweenness centrality (bc) – Level3 (AS3356), a tier-1 AS, is immediately detected as having max bc

Radial links decrease cc

Tangential links increase cc
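
The effect of radial and tangential links on the clustering coefficient (taking "cc" to mean the average local clustering coefficient, which is an assumption about the slide's abbreviation) can be checked on a toy graph. The function name and the adjacency format are illustrative.

```python
from itertools import combinations


def average_clustering(adj):
    """Average local clustering coefficient.

    adj: dict mapping node -> set of neighbors (assumed symmetric).
    Nodes with fewer than two neighbors contribute a coefficient of 0.
    """
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count neighbor pairs that are themselves connected.
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += closed / (k * (k - 1) / 2)
    return total / len(adj)


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
# Radial link: a new stub d hangs off a only.
radial = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
# Tangential link: d also connects sideways to b.
tangential = {"a": {"b", "c", "d"}, "b": {"a", "c", "d"},
              "c": {"a", "b"}, "d": {"a", "b"}}
```

Here the coefficient drops when the stub is attached by a single radial link and rises again once the sideways link closes a triangle.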

Revisiting Sampling Bias

• Lakhina et al. – AS degrees inferred from traceroute sampling are biased

– ASes in the vicinity of vps appear to have higher degrees

– The power law might be an artifact of this!

• Dall'Asta et al. – no... it is quite possible to have unbiased degrees with traceroutes

• Cohen et al. – when the exponent is larger than 2, the resulting bias is negligible

Evaluating Sampling Bias

• For each AS find:

– All the vps that have it in their local topology

– The valley-free distance in hops from each such vp

Valley-free path: uphill to the core (c2p), sideways in the core (p2p) and downhill from the core (p2c)
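
A valley-free distance can be computed with a breadth-first search over (AS, phase) states, where the phase records whether the path has already stopped climbing. The sketch below assumes the common definition: any number of c2p steps, then at most one p2p step, then only p2c steps. The input format, where each neighbor entry carries the relationship of the step from the node to that neighbor, and the function name are assumptions.

```python
from collections import deque


def valley_free_distance(links, src, dst):
    """Length in hops of the shortest valley-free path, or None if none exists.

    links: dict mapping node -> list of (neighbor, rel), where rel is the
    type of the step node -> neighbor: "c2p", "p2p" or "p2c".
    """
    start = (src, 0)  # phase 0: still climbing; phase 1: past the peak
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst:
            return dist[(node, phase)]
        for nbr, rel in links.get(node, ()):
            if rel == "c2p" and phase == 0:
                nxt = (nbr, 0)      # keep climbing
            elif rel == "p2p" and phase == 0:
                nxt = (nbr, 1)      # the single sideways step
            elif rel == "p2c":
                nxt = (nbr, 1)      # descending, allowed in both phases
            else:
                continue            # would create a valley
            if nxt not in dist:
                dist[nxt] = dist[(node, phase)] + 1
                queue.append(nxt)
    return None
```

A path from one provider down to a shared customer and back up to a second provider is rejected, because it descends and then climbs again.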

Dataset VPs and Distances

Low-degree ASes are seen from fewer vps than high-degree ASes…this makes sense!

In our dataset, most ASes have a vp that is only 1-2 hops away!

Average Distance per Degree

Low degree ASes are seen from farther vps…sampling bias?

No real bias!

• More VPs are located in high-degree ASes

• There are high-degree ASes that are seen from “far” vps

• Broad distribution – all ASes are pretty close to a vp!

Predicting Growth

Our Goal

• To measure the Internet evolution in time

– AS level - too coarse

– IP level - too fine

The Internet Structure

The AS graph

The Internet Structure

The PoP level graph

The AS graph

What is a PoP?

• PoP – Point of Presence of the ISP

Our Goal

• To measure the Internet evolution in time

– AS level - too coarse

– IP level - too fine

– PoP level – strikes the right balance

• Network size is reasonable

• Nodes are roughly the same size

• Has a good geographical grip (with some exceptions)

• Other uses of PoP maps

– Network distance estimation

The Algorithm Input & Output

Pivot Idea: What is a graph representation of the POP?

DIMES a historical perspective

– It will never fly

– You’ll be lucky to get 500 downloads in three years

– You’ll never be able to clean the noise

– How will you deal with problem i (i = 1, 2, 3, 4, ...)?

• Status in Feb 2010

– Over 21,700 downloads (over 100 nations)

– 1000-1200 active agents every day

– Measuring from over 200 ASes every week

– Data is used worldwide by EE, CS, Phys, Econ

– DIMES is highly cited

http://www.netDimes.org
