Principles in Communication Networks
• Instructor: Prof. Yuval Shavitt,
– Office hours: room 303 s/w eng. bldg., Tue 16:00-17:00
• Prerequisites:
– Introduction to computer communications (TAU, Technion, BGU)
• Expectations from students:
– Probability
– Queueing theory basics
– Graph theory
Course Syllabus (tentative)
• Internet structure
• Internet measurements
• Measurement optimization
• Measurement analysis
• Introduction to switching, router types
• Use of generating functions: HOL analysis, TCP analysis.
• Matching algorithms and their analysis
• CLOS networks: non-blocking theorem, routing algorithms and their analysis
• Scheduling algorithms
Grade composition
• Final exam – 60%
• Project – 30%
• Home assignments (2-3) - 10%
Routing in the Internet
Routing in the Internet is done at three levels:
• In LANs, in the MAC layer:
– Spanning tree protocol for Ethernet transparent bridges
– Source routing for token rings
• Inside autonomous systems (ASes):
– RIP, OSPF, IS-IS, (E)IGRP
• Between ASes:
– BGP
Autonomous Systems
• Autonomous Routing Domains: A collection of physical networks glued together using IP, that have a unified administrative routing policy.
• An AS is an autonomous routing domain that has been assigned a number.
… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.
RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System
Internet Hierarchical Routing
[Figure: Host h1 and Host h2 communicating across ASes A, B, and C, showing intra-AS routing within AS A, inter-AS routing between A and B, and intra-AS routing within AS B. Labeled gateway routers: A.a, A.c, B.a, C.b.]
Why different Intra- and Inter-AS routing?
Policy:
• Inter-AS: admin wants control over how its traffic is routed and who routes through its net.
• Intra-AS: single admin, so no policy decisions needed
Scale:
• Hierarchical routing saves table size and reduces update traffic
Performance:
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance
RIP
• A distance-vector protocol (distributed Bellman-Ford)
• Developed in the 80s based on a Xerox protocol
• RIP-2 is now often used due to its simplicity
• Distance metric: hop count (minimum number of hops)
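The distance-vector idea can be illustrated with a minimal sketch of synchronous distributed Bellman-Ford. This is not RIP itself (no timers, no split horizon, no hop limit); the function name `distance_vector` and the link format are assumptions made for illustration.

```python
INF = float("inf")


def distance_vector(links, max_rounds=100):
    """Synchronous distributed Bellman-Ford sketch.

    links: dict mapping (u, v) -> cost; every link is treated as bidirectional.
    Returns dist[u][dest] = cost of the best known path from u to dest.
    """
    neighbors = {}
    for (u, v), cost in links.items():
        neighbors.setdefault(u, {})[v] = cost
        neighbors.setdefault(v, {})[u] = cost

    # Initially each node knows only the distance to itself.
    dist = {u: {u: 0} for u in neighbors}

    for _ in range(max_rounds):
        changed = False
        # Snapshot: every node advertises the vector it had at the start of the round.
        advertised = {u: dict(vec) for u, vec in dist.items()}
        for u in neighbors:
            for v, cost in neighbors[u].items():
                for dest, d in advertised[v].items():
                    if cost + d < dist[u].get(dest, INF):
                        dist[u][dest] = cost + d
                        changed = True
        if not changed:  # no vector changed, so the tables have converged
            break
    return dist


# Hypothetical chain a - b - c - d with hop-count metric (cost 1 per link).
table = distance_vector({("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1})
print(table["a"])
```

Each round corresponds to one exchange of vectors between neighbors, so information about a destination travels one hop per round.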
OSPF / IS-IS
• Link-state protocols – each node sees the entire network map and calculates shortest paths using Dijkstra's algorithm.
• Allow two levels of hierarchy
• Authentication
• Complex
• IS-IS is gaining popularity among large ISPs
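As a rough illustration of the link-state computation, here is a minimal sketch of Dijkstra's algorithm run by one node over its copy of the network map. The map format (a dict of neighbor-cost dicts) and the example costs are assumptions, not taken from OSPF or IS-IS.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path costs from source.

    graph: dict mapping node -> {neighbor: non-negative link cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical three-router map: the direct a-c link costs more than going via b.
network_map = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2},
    "c": {"a": 4, "b": 2},
}
print(dijkstra(network_map, "a"))
```

Unlike the distance-vector approach, every node here needs the full map before it can compute anything, which is what the link-state flooding provides.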
The structure of the Internet
How are routers connected?
• Why should we care?
– Communication protocols will work correctly on ANY topology…
– …but they may not be efficient for some topologies
– Knowledge of the topology can aid in optimizing protocols
The Internet as a graph
• Remember: the Internet is a collection of networks called autonomous systems (ASes)
• The Internet graph:
– The AS graph
• Nodes: ASes, links: AS peering
– The router level graph
• Nodes: routers, links: fibers, cables, MW channels, etc.
– There are mid-level aggregation schemes
• PoP topologies, city topologies
• What does it look like?
Random graphs in Mathematics
The Erdős–Rényi model
• Generation:
– create n nodes.
– each possible link is added with probability p.
• Expected node degree: (n−1)p ≈ np; expected number of links: p·n(n−1)/2
• Degree distribution: Poisson (for large n with np fixed)
• If we want to keep the number of links linear in n, what happens to p as n → ∞?
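A minimal sketch of the generation procedure follows. The function name `erdos_renyi` and the constant `c` are illustrative assumptions. Since the expected number of links is p·n(n−1)/2, choosing p = c/n keeps it close to c·n/2, that is, linear in n, so p must shrink like 1/n.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the link list of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)
    # Every one of the n(n-1)/2 possible links is kept independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


c = 4  # assumed target mean degree
for n in (100, 1000):
    links = erdos_renyi(n, c / n, seed=1)
    # The link count should be close to c * (n - 1) / 2, i.e. it grows linearly with n.
    print(n, len(links), c * (n - 1) / 2)
```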
The Waxman model
• Integrates distance into the E-R model
• Generation:
– Spread n nodes on a large enough grid.
– Pick a candidate link uniformly at random and add it with a probability that decreases exponentially with its length.
– Stop when there are enough links.
• Heavily used in the 90s
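The generation steps above can be sketched as follows. The acceptance probability exp(−d / (alpha · L)), with L the grid diagonal, is one common parameterization; the parameter names `alpha`, `size`, and `max_tries` are assumptions for this sketch.

```python
import math
import random


def waxman(n, target_links, size=100.0, alpha=0.2, seed=None, max_tries=1_000_000):
    """Waxman-style random graph: returns (positions, set of links)."""
    rng = random.Random(seed)
    # Spread n nodes uniformly at random on a size x size square.
    pos = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]
    max_dist = size * math.sqrt(2)  # longest possible link (the diagonal)
    links = set()
    tries = 0
    # Stop when enough links exist (max_tries guards against unreachable targets).
    while len(links) < target_links and tries < max_tries:
        tries += 1
        u, v = rng.sample(range(n), 2)  # candidate link chosen uniformly at random
        d = math.dist(pos[u], pos[v])
        # Accept with probability that decays exponentially with link length.
        if rng.random() < math.exp(-d / (alpha * max_dist)):
            links.add((min(u, v), max(u, v)))
    return pos, links


pos, links = waxman(50, 80, seed=7)
mean_len = sum(math.dist(pos[u], pos[v]) for u, v in links) / len(links)
print(len(links), round(mean_len, 1))
```

Smaller values of `alpha` make long links rarer, so the resulting graph is dominated by short, local connections.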
[Figure: plot on a square grid; both axes run from 0 to 100.]
1999
The Faloutsos brothers
• Measured the Internet AS and router graphs.
• Mine, she looks different!
Notre Dame
• Looked at complex system graphs: social relationship, actors, neurons, WWW
• Suggested a dynamic generation model
The Faloutsos Graph
1995 Internet router topology
3888 nodes, 5012 edges, <k>=2.57
SCIENCE CITATION INDEX
Nodes : papers
Links : citations
1736 PRL papers (1988)
Witten-Sander
PRL 1981
25
2212
$P(k) \sim k^{-\gamma}$ ($\gamma = 3$)
(S. Redner, 1998)
Sex-web
Nodes: people (Females; Males)
Links: sexual relationships
4781 Swedes; 18-74;
59% response rate.
Liljeros et al. Nature 2001
Web power-laws
SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed.
Networks continuously expand by the addition of new nodes
Examples:
WWW : addition of new documents
Citation : publication of new papers
(2) The attachment is NOT uniform.
A node is linked with higher probability to a node that already has a large number of links.
Examples :
WWW : new documents link to well known sites
(CNN, YAHOO, New York Times, etc)
Citation : well cited papers are more likely to be cited again
Scale-free model
(1) GROWTH :
At every timestep we add a new node with m edges
(connected to the nodes already present in the system).
(2) PREFERENTIAL ATTACHMENT :
The probability Π that a new node will be connected to node i depends on the connectivity $k_i$ of that node:
$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
$P(k) \sim k^{-3}$
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
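Growth and preferential attachment can be sketched in a few lines. This is an illustrative implementation, not the authors' code: it uses the common trick of keeping a list in which each node appears once per incident edge, so a uniform draw from the list picks node i with probability k_i / Σ_j k_j. Redrawing to obtain m distinct targets is a simplifying assumption.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches m edges preferentially."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    endpoints = []            # node i appears k_i times in this list
    targets = list(range(m))  # the first new node links to the m seed nodes
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        # Preferential attachment: a uniform draw from endpoints is degree-proportional.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges


edges = barabasi_albert(2000, 2, seed=3)
degree = Counter(node for edge in edges for node in edge)
# A few early nodes end up with degrees far above the mean of about 2m.
print(max(degree.values()), sum(degree.values()) / len(degree))
```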
The Faloutsos Graph
[Figure: log-log plot of node degree for AS20000102.m; both axes span 10^0 to 10^4.]
Back to the Internet
• Understanding its structure and dynamics
– help applications (WWW, file sharing)
– help improve routing
– predict Internet growth
• So let's look at the data…
…Data?
• The Internet is an engineered system, so someone must know how it is built, no?
• NO! It is an uncoordinated interconnection of Autonomous Systems (ASes = networks).
• No central database about Internet structure.
• Several projects attempt to reveal the structure: Skitter, RouteViews, …
The Internet Structure routers
The Internet Structure
The AS graph
Revealing the Internet Structure
Diminishing return!
Deploying more boxes does not pay off
7 new links
30 new links
NO new links
Revealing the Internet Structure
To obtain the ‘horizontal’ links we need a strong presence at the edge
What is DIMES?
DIMES
• Distributed Internet measurement and monitoring
– Based on software agents downloaded by volunteers
• Diminishing return?
– Software agents
– The cost of the first agent is very high
– each additional agent costs almost zero
• Capabilities
– Obtaining Internet maps at all granularity levels
• connectivity, delay, loss, bandwidth, jitter, ….
– Tracking the Internet evolution in time
– Monitoring the Internet in real time
Diminishing Return?
• [Chen et al 02], [Barford et al 01]: when you combine more and more points of view the return diminishes very fast
• What have they missed?
– The mass of the tail is significant
No. of views
Diminish … shminimish
How many ASes see an edge?
~ 9000 / 6000 are seen only by one
Challenges
• It’s a distributed system:
– Measurement traffic looks malicious
• Flying under the NOC radar screens
(Agents cannot measure too much)
– Optimize the architecture:
• Minimize the number of measurements
• Expedite the discovery rate
• BUT agents are
– Unreliable
– Some move around
real world complex system
Distributed System
Agents
• To be able to use agents wisely we need agents profiles:
– Reliability
– Location:
• Static
• Bi-homed: where mostly?
• Mobile: identify home base
– Abilities: what type of measurements can it perform?
[Figure: activity of agent shavitt, 31-Aug-04 to 5-Oct-04 (y-axis 0 to 9000): fairly stable measurements from Israel, then 2 idle weeks, then the agent reappears in Spain.]
[Figure: activity of agent prinCompNet (y-axis 0.8 to 1.8, ×10^4) versus days since project launched (75 to 140).]
Degree Distribution
[Figure: degree distribution, DIMES+BGP (Feb 05); axis labels: Pr(k), <k>, k, log(degree).]
[Figure: Zipf plot, DIMES+BGP (Feb 05); x-axis: log(rank).]
Quantifying the Distribution
Data Set
• Data is obtained from DIMES
– Community-based infrastructure, using almost 1000 active measuring software agents
– Agents follow a script and perform ~2 probes per minute (ICMP/UDP traceroute, ping)
– Most agents measure from a single AS (vantage point, vp)
• But some (appear to) measure from more…
• Data need to be filtered to remove artifacts
– Traceroute data collected during March 2008
Filtering the data
• For each agent and each week, classify how many networks it measured the Internet from
Typical cases:
– AS_i: 15300, AS_j: 8
– AS_i: 10000, AS_j: 3178
– AS_i: 10000, AS_j: 412, AS_k: 201
– 18000, 12, 11, 9, 9, 3, 3, 2, 2, 1, 1, 1, 1, 1, …
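One possible way to classify such cases is sketched below, assuming the numbers are per-AS measurement counts for one agent in one week. The share-based rule and the 5% threshold are hypothetical choices for illustration; the slides do not state the actual filtering rule.

```python
def vantage_points(counts, min_share=0.05):
    """Return the ASes that plausibly hosted the agent during the week.

    counts: dict mapping AS label -> number of measurements seen from that AS.
    min_share: hypothetical threshold; ASes contributing a smaller fraction
               of the agent's measurements are treated as artifacts.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(asn for asn, c in counts.items() if c / total >= min_share)


# The first three typical cases from the slide.
print(vantage_points({"AS_i": 15300, "AS_j": 8}))
print(vantage_points({"AS_i": 10000, "AS_j": 3178}))
print(vantage_points({"AS_i": 10000, "AS_j": 412, "AS_k": 201}))
```

Under this assumed rule the first and third cases collapse to a single vantage point, while the second looks like a genuinely bi-homed agent.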
Measurements Per Agent
[Figure: measurements per network and agents per network, week 4, 2008.]
Filtering Results
• 96% of the agents have fewer than 4 different vps
• High-degree ASes tend to have more agents
• High number of measurements for vps of all degrees
Diminishing Returns?
• Barford et al.
– the utility of adding many vps quickly diminishes
– In terms of ASes and AS-links
• Shavitt and Shir – utility indeed diminishes but the tail is long and significant
– Tail is biased towards horizontal links
• We wish to quantify how different aspects of AS-level topology are affected by adding more vps
Creating topologies per VP sort by
Topology Size
• The return (especially for AS links) does not diminish fast!
A VP with a small local topology can contribute many new links!
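The marginal return of each VP can be computed with a short sketch like the one below. It assumes each VP's local topology is a set of AS-level links and that VPs are added in decreasing order of local topology size; the sort direction and the function name `marginal_links` are assumptions.

```python
def marginal_links(local_topologies):
    """For each VP, in sorted order, count the links it adds to the union.

    local_topologies: dict mapping vp -> iterable of (as_a, as_b) links.
    Returns a list of (vp, new_links, cumulative_links).
    """
    order = sorted(local_topologies,
                   key=lambda vp: len(local_topologies[vp]),
                   reverse=True)
    seen = set()
    rows = []
    for vp in order:
        # frozenset makes (a, b) and (b, a) the same undirected link.
        links = {frozenset(link) for link in local_topologies[vp]}
        new = links - seen
        seen |= new
        rows.append((vp, len(new), len(seen)))
    return rows


# Hypothetical toy data: the smallest VP still contributes a link nobody else saw.
toy = {
    "vp1": [(1, 2), (2, 3), (3, 4)],
    "vp2": [(2, 1), (2, 3)],
    "vp3": [(5, 6)],
}
for row in marginal_links(toy):
    print(row)
```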
Direction of Detected Links
• For each link: plot the larger degree of its two adjacent ASes against the degree difference between them
Low degree difference – indicates tangential links and links between small-size ASes
High degree difference – indicates radial links towards the core
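The two plotted quantities can be computed per link as in this sketch. Degrees are taken from the supplied link list itself, which is an assumption; the function name is illustrative.

```python
from collections import Counter


def link_degree_profile(links):
    """Map each link to (max adjacent AS degree, adjacent AS degree difference)."""
    degree = Counter()
    for u, v in links:
        degree[u] += 1
        degree[v] += 1
    return {
        (u, v): (max(degree[u], degree[v]), abs(degree[u] - degree[v]))
        for u, v in links
    }


# Hypothetical toy graph: AS 0 is a hub; link (1, 2) joins two of its neighbors.
toy_links = [(0, 1), (0, 2), (0, 3), (1, 2)]
for link, (max_deg, diff) in link_degree_profile(toy_links).items():
    print(link, max_deg, diff)
```

In the toy graph, the link (0, 3) has a large degree difference (radial, towards the hub), while (1, 2) has a difference of zero (tangential).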
Convergence of Properties
• Take several common AS-level graph properties and analyze their convergence as local topologies are added
– Keeping the sort order by number of links
• Slow convergence indicates the need for a broad and diverse set of vps
Density and Average Degree
Slow convergence of density and average degree – easy to detect ASes but difficult to find all links
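For reference, a minimal sketch of the two properties for an undirected graph with N nodes and E links: density = 2E / (N(N−1)) and average degree = 2E / N. The function name is an assumption, and the input is taken to be a collection of distinct undirected links.

```python
def density_and_average_degree(links):
    """Return (density, average degree) of the graph induced by the links.

    links: collection of distinct undirected (u, v) links.
    """
    nodes = {node for link in links for node in link}
    n, e = len(nodes), len(links)
    if n < 2:
        return 0.0, 0.0
    return 2 * e / (n * (n - 1)), 2 * e / n


# Triangle: every possible link is present.
print(density_and_average_degree({(1, 2), (2, 3), (1, 3)}))
# Path 1-2-3: same nodes, one link missing.
print(density_and_average_degree({(1, 2), (2, 3)}))
```

Both values depend on E, so they keep moving as long as new links are being discovered, even after the node set has stabilized.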
Power-law and Max Degree
Fast convergence of maximal degree – core links are easily detected
Fair convergence of power-law exponent
Betweenness and Clustering
Fast convergence of max bc – Level3 (AS3356), a tier-1 AS, is immediately detected as having max bc
Radial links decrease cc
Tangential links increase cc
Revisiting Sampling Bias
• Lakhina et al. – AS degrees inferred from traceroute sampling are biased
– ASes in the vicinity of vps have higher degrees
– Power-law might be an artifact of this!
• Dall’Asta et al. – no…it is quite possible to have unbiased degrees with traceroutes
• Cohen et al. – when the exponent is larger than 2, the resulting bias is negligible
Evaluating Sampling Bias
• For each AS find:
– All the vps that have it in their local topology
– The valley-free distance in hops
Up-hill to the core (c2p), sideways in the core (p2p) and down-hill from the core (p2c)
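A valley-free distance can be computed with a breadth-first search over (AS, phase) states, as in the sketch below. It assumes the common definition of a valley-free path: zero or more c2p hops, at most one p2p hop, then zero or more p2c hops. The edge format and function name are assumptions.

```python
from collections import deque


def valley_free_distance(edges, src, dst):
    """Hop count of the shortest valley-free path, or None if there is none.

    edges: iterable of (u, v, rel) with rel == "c2p" (u is a customer of v)
           or rel == "p2p" (u and v are peers).
    """
    adj = {}
    for u, v, rel in edges:
        if rel == "c2p":
            adj.setdefault(u, []).append((v, "c2p"))
            adj.setdefault(v, []).append((u, "p2c"))
        elif rel == "p2p":
            adj.setdefault(u, []).append((v, "p2p"))
            adj.setdefault(v, []).append((u, "p2p"))
        else:
            raise ValueError(f"unknown relationship: {rel}")

    # Phase 0: still climbing (only c2p hops so far). Phase 1: past the peak.
    start = (src, 0)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst:
            return dist[(node, phase)]
        for nxt, rel in adj.get(node, []):
            if phase == 0:
                next_phase = 0 if rel == "c2p" else 1
            elif rel == "p2c":
                next_phase = 1
            else:
                continue  # climbing or peering after the peak would form a valley
            state = (nxt, next_phase)
            if state not in dist:
                dist[state] = dist[(node, phase)] + 1
                queue.append(state)
    return None


# Hypothetical toy relationships: A and B share customer C; T1 and T2 are peering providers.
toy = [("A", "T1", "c2p"), ("B", "T2", "c2p"), ("T1", "T2", "p2p"),
       ("C", "A", "c2p"), ("C", "B", "c2p")]
print(valley_free_distance(toy, "A", "B"))
```

In the toy data the two-hop path A, C, B is a valley (down to a customer, then up again), so the search has to go over the T1-T2 peering instead.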
Dataset VPs and Distances
Low-degree ASes are seen from fewer vps than high-degree ASes…this makes sense!
In our dataset, most ASes have a vp that is only 1-2 hops away!
Average Distance per Degree
Low degree ASes are seen from farther vps…sampling bias?
No real bias!
• More VPs are located in high-degree ASes
• There are high-degree ASes that are seen from “far” vps
• Broad distribution – all ASes are pretty close to a vp!
Predicting Growth
Our Goal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
The Internet Structure
The PoP level graph
The AS graph
What is a PoP?
• PoP – Point of Presence of the ISP
Our Goal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
– PoP level – strikes the right balance
• Network size is reasonable
• Nodes are roughly the same size
• Has a good geographical grip (with some exceptions)
• Other uses of PoP maps
– Network distance estimation
The Algorithm Input & Output
Pivot Idea: What is a graph representation of the POP?
DIMES: a historical perspective
– It will never fly
– You’ll be lucky to get 500 downloads in three years
– You’ll never be able to clean the noise
– How will you deal with problem i (i = 1, 2, 3, 4, …)?
• Status in Feb 2010
– Over 21,700 downloads (over 100 nations)
– 1000-1200 active agents every day
– Measuring from over 200 ASes every week
– Data is used worldwide by EE, CS, Phys, Econ
– DIMES is highly cited
http://www.netDimes.org