Networking
Godmar Back
CS 4284 Spring 2013
Host, router network layer functions:
Network layer
Transport layer: TCP, UDP
Routing protocols
•path selection
•RIP, OSPF, BGP
IP protocol
•addressing conventions
•datagram format
•packet handling conventions forwarding table
ICMP protocol
•error reporting
•router “signaling”
Link layer
Physical layer
CS 4284 Spring 2013
IP protocol version number header length
(bytes)
“type” of data max number remaining hops
(decremented at each router) ver head.
len
32 bits type of service
16-bit identifier time to live upper layer flgs length fragment offset
Internet checksum
32 bit source IP address
32 bit destination IP address upper layer protocol to deliver payload to
Options (if any) data
(variable length, typically a TCP or UDP segment) total datagram length (bytes) for fragmentation/ reassembly
E.g. timestamp, record route taken, specify list of routers to visit.
CS 4284 Spring 2013
• network links have MTU
(max.transfer size) - largest possible link-level frame.
– different link types, different MTUs
• large IP datagram divided
(“fragmented”) within net
– one datagram becomes several datagrams
– “reassembled” only at final destination
– IP header bits used to identify, order related fragments reassembly fragmentation: in: one large datagram out: 3 smaller datagrams
CS 4284 Spring 2013
Example
• 4000 byte datagram
• MTU = 1500 bytes
1480 bytes in data field offset =
1480/8 length
=4000
ID
=x fragflag
=0 offset
=0
One large datagram becomes several smaller datagrams length
=1500
ID
=x fragflag
=1 offset
=0 length
=1500
ID
=x fragflag
=1 offset
=185 length
=1040
ID
=x fragflag
=0 offset
=370
CS 4284 Spring 2013
• IP address: 32-bit identifier for host or router interface
• interface: connection between host/router and physical link
– routers typically have multiple interfaces
– host may have multiple interfaces
223.1.1.1
223.1.1.2
223.1.1.3
223.1.3.27
223.1.2.1
223.1.1.4
223.1.2.9
223.1.2.2
– IP addresses are associated with each interface
– Link can be multipoint-link, e.g. LAN – or even entire network, e.g., ATM
• Key point: no routing table lookup is necessary to get to destination within subnet
223.1.3.1
223.1.3.2
223.1.1.1 = 11011111 00000001 00000001 00000001
223 1 1 1
CS 4284 Spring 2013
• IP address:
– subnet part (high order bits)
– host part (low order bits)
• What’s a subnet ?
– (a set of) device interfaces with a common subnet part of
IP address
– can physically reach each other without intervening router
223.1.1.1
223.1.2.1
223.1.1.2
223.1.1.4
223.1.2.9
223.1.1.3
223.1.3.1
223.1.3.27
223.1.2.2
LAN
223.1.3.2
network consisting of 3 subnets
CS 4284 Spring 2013
223.1.1.0/24
223.1.2.0/24
Recipe
• To determine the subnets, detach each interface from its host or router, creating islands of isolated networks.
Each isolated network is called a subnet
– And needs its own subnet address!
CS 4284 Spring 2013
223.1.3.0/24
Subnet mask: /24
255.255.255.0
How many?
223.1.1.2
223.1.1.1
223.1.1.4
223.1.9.2
223.1.1.3
223.1.7.1
223.1.2.1
223.1.9.1
223.1.8.1
223.1.2.6
223.1.8.2
223.1.7.2
223.1.2.2
223.1.3.1
223.1.3.27
223.1.3.2
CS 4284 Spring 2013
223.1.1.2
• IP addresses denote interfaces, not hosts
• Sets of interfaces form subnets
– Subnets share common prefix
223.1.1.1
223.1.9.2
223.1.1.3
• Route to CIDR-ized subnet addresses
– a.b.c.d/x
• Within subnet, reach
223.1.2.1
destination directly
223.1.9.1
223.1.2.6
223.1.8.1
223.1.2.2
223.1.1.4
223.1.7.1
223.1.8.2
223.1.3.1
223.1.7.2
223.1.3.27
223.1.3.2
CS 4284 Spring 2013
CIDR: C
I
D
R
– subnet portion of address of arbitrary length
– address format: a.b.c.d/x , where x is # bits in subnet portion of address subnet part host part
11001000 00010111 0001000 0 00000000
200.23.16.0/23
CS 4284 Spring 2013
• A, B, C: Pretty much only of historical interest today
CS 4284 Spring 2013
CS 4284 Spring 2013
Internet
__________
R1
PPP Link 1
__________
PPP Link 2
R2
__________
__________
Ethernet
LAN 1
60 Machines
Subnet address:
______________
Default gateway:
______________
__________
__________
R3
CS 4284 Spring 2013
Ethernet
LAN 2
120 Machines
Subnet address:
______________
Default gateway:
______________
Internet
191.23.25.197
R1
PPP Link 1
191.23.25.196/30
191.23.25.193
PPP Link 2
191.23.25.192/30
R2
191.23.25.194
191.23.25.129
191.23.25.198
191.23.25.1
R3
Ethernet
LAN 1
60 Machines
Subnet address:
191.23.25.128/26
Default gateway:
191.23.25.129
Ethernet
LAN 2
120 Machines
Subnet address:
191.23.25.0/25
Default gateway:
191.23.25.1
CS 4284 Spring 2013
• Typical: local subnets + default gateway (“firsthop router”)
• Example: “route print” on Windows XP
– 128.173.55.90 FastEthernet
– 192.82.175.230 802.11g wireless
Active Routes:
Network Destination Netmask Gateway Interface Metric
0.0.0.0 0.0.0.0 128.173.48.1 128.173.55.90 20
0.0.0.0 0.0.0.0 198.82.174.1 198.82.175.230 25
127.0.0.0 255.0.0.0 127.0.0.1 127.0.0.1 1
128.173.48.0 255.255.248.0 128.173.55.90 128.173.55.90 20
198.82.174.0 255.255.254.0 198.82.175.230 198.82.175.230 25
…
Default Gateway: 128.173.48.1
CS 4284 Spring 2013
• used by hosts & routers to communicate network-level information
– error reporting: unreachable host, network, port, protocol
– echo request/reply (used by ping)
• network-layer “above” IP:
– ICMP msgs carried in IP datagrams
• ICMP message: type, code plus first 8 bytes of IP datagram causing error
Type Code description
0 0 echo reply (ping)
3 0 dest. network unreachable
3 1 dest host unreachable
3 2 dest protocol unreachable
3 3 dest port unreachable
3 6 dest network unknown
3 7 dest host unknown
4 0 source quench (congestion control - not used)
8 0 echo request (ping)
9 0 route advertisement
10 0 router discovery
11 0 TTL expired
12 0 bad IP header
CS 4284 Spring 2013
• Source sends series of
UDP segments to dest
– First has TTL =1
– Second has TTL=2, etc.
– Unlikely port number
• When nth datagram arrives to nth router:
– Router discards datagram
– And sends to source an
ICMP message (type 11, code 0)
– Message includes name of router& IP address
• When ICMP message arrives, source calculates
RTT
• Traceroute does this 3 times
Stopping criterion
• UDP segment eventually arrives at destination host
• Destination returns ICMP
“port unreachable” packet
(type 3, code 3)
• When source gets this
ICMP, stops.
• See also
[ Heideman 2008 ]
CS 4284 Spring 2013
• Host gets IP address either hardcoded or via
DHCP ( D ynamic H ost C onfiguration P rotocol)
• Network gets subnet part of IP address allocated from ISP’s address space
• ISP gets address space assigned by ICANN
( I nternet C orporation for A ssigned N ames and
N umbers)
ISP's block 11001000 00010111 00010000 00000000 200.23.16.0/20
Organization 0 11001000 00010111 00010000 00000000 200.23.16.0/23
Organization 1 11001000 00010111 00010010 00000000 200.23.18.0/23
Organization 2 11001000 00010111 00010100 00000000 200.23.20.0/23
... ….. …. ….
Organization 7 11001000 00010111 00011110 00000000 200.23.30.0/23
CS 4284 Spring 2013
• Initial motivation: 32-bit address space soon to be completely allocated.
• Additional motivation:
– header format helps speed processing/forwarding
– header changes to facilitate QoS
– easier configuration of both hosts & backbone routers
IPv6 datagram format:
– fixed-length 40 byte header
– no fragmentation allowed
CS 4284 Spring 2013
Priority: identify priority among datagrams in flow
Flow Label: identify datagrams in same “flow.”
(concept of “flow” not well defined).
Next header: identify upper layer protocol for data
CS 4284 Spring 2013
• Checksum : removed entirely to reduce processing time at each hop
• ICMPv6: new version of ICMP
– additional message types, e.g. “Packet Too
Big”
– multicast group management functions
• Options: allowed, but outside of header, indicated by “Next Header” field
CS 4284 Spring 2013
• Grouped in six types:
– Hop-by-hop options, e.g. Jumbograms
– Destination options
– Routing, e.g. source routing
– Fragment – can be done, but end hosts only!
– Authentication
– Encapsulation
• Routers quickly know which headers they must examine and which they can skip
CS 4284 Spring 2013
0000 0000
0000 0001
0000 001
0000 010
0000 011 - 0001
001
010 - 110
1110 - 1111 1110 0
1111 110
1111 1110 10
Reserved
Unassigned
Reserved for NSAP (non-IP addresses used by ISO)
Reserved for IPX (non-IP addresses used by IPX)
Unassigned
Unicast Address Space
Unassigned
Unassigned
Unique Local Addresses (ULA)
Link Local Use addresses
1111 1110 11 Site Local Use addresses (Deprecated)
1111 1111 Multicast addresses
• Written as eight 16bit values
– e.g. fe80::020e:7bff:fe32:d716 (made from 00:0E:7B:32:D7:16)
CS 4284 Spring 2013
• stateless autoconfiguration see [ Donz é 2004 ]
– Plug in and interface creates link-local address based on adapter MAC
– Interface can have link-local (fe80::…), site-local & global (2001::…) addresses
• VT’s campus has had IPv6 testbed since 1998, now connected to public IPv6 network
• Try it out yourself!
– MacOS, Linux: enabled by default of recent installations
– Windows XP: “ipv6 install” at command prompt
– Tools add 6: ping6, traceroute6, etc..
CS 4284 Spring 2013
• Not all routers can be upgraded simultaneously
– no “flag days”
– How will the network operate with mixed IPv4 and IPv6 routers?
• Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers
CS 4284 Spring 2013
Logical view:
Physical view:
A B
IPv6
A
IPv6
B C
IPv6
Flow: X
Src: A
Dest: F data
IPv6 IPv4
Src:B
Dest: E
Flow: X
Src: A
Dest: F tunnel
D
E
IPv6
E
F
IPv6
F
IPv4 IPv6
Src:B
Dest: E
Flow: X
Src: A
Dest: F
Flow: X
Src: A
Dest: F data
IPv6 data data
A-to-B:
IPv6
B-to-C:
IPv6 inside
IPv4
CS 4284 Spring 2013
B-to-C:
IPv6 inside
IPv4
E-to-F:
IPv6
• Bernstein points out some hindrances [ The
IPv6 mess ]
– Lack of interoperability b/c no embedding of addresses
– Transition path (comparison to MX records)
• IPv6 – the next OSI?
• DoD requirement by 2008
– What happened to it?
• Federal 2012 deadline that all public-facing websites talk IPv6
• Asian countries are pushing for transition
CS 4284 Spring 2013
CS 4284 Spring 2013
Hierarchical addressing allows efficient advertisement of routing information:
Organization 0
200.23.16.0/23
Organization 1
200.23.18.0/23
Organization 2
200.23.20.0/23
Organization 7
..
200.23.30.0/23
Fly-By-Night-ISP
“Send me anything with addresses beginning
200.23.16.0/20”
Internet
ISPs-R-Us
“Send me anything with addresses beginning
199.31.0.0/16”
CS 4284 Spring 2013
ISPs-R-Us has a more specific route to Organization 1
Organization 0
200.23.16.0/23
Organization 2
200.23.20.0/23
Organization 7 .
..
200.23.30.0/23
Fly-By-Night-ISP
“Send me anything with addresses beginning
200.23.16.0/20”
Internet
Organization 1
200.23.18.0/23
ISPs-R-Us
“Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
CS 4284 Spring 2013
• In Internet:
– Intra-AS known as Interior Gateway Protocols
(IGP)
– Most common Intra-AS routing protocols:
• RIP: Routing Information Protocol (original protocol, now rarely used)
• OSPF: Open Shortest Path First
• IGRP/EIGRP: (Enhanced) Interior Gateway Routing
Protocol
– Inter-AS known as Border Gateway Protocols:
• BGP4: Only protocol used
CS 4284 Spring 2013
• Distance vector algorithm
– Included in BSD-UNIX Distribution in 1982
• Distance metric: # of hops (max = 15 hops)
• Distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement )
• Each advertisement: list of up to 25 destination nets within AS u v destination hops u 1
A B w v 2 w 2 x 3 z
C D y x y 3 z 2
A’s routing table
CS 4284 Spring 2013
w
A x
D y
B
C
Routing table in D
Destination Network Next Router Num. of hops to dest.
w A 2 y z x
….
B
B
--
….
2
7
1
....
z
CS 4284 Spring 2013
Dest Next hops w x z
….
-
C 4
… ...
w
A
Destination Network w y z x
….
Advertisement from A to D x
D y
B
C Routing table in D
Next Router Num. of hops to dest.
A 2
B
B A
--
….
2
7 5
1
....
CS 4284 Spring 2013 z
If no advertisement heard after 180 sec → neighbor/link declared dead
– routes via neighbor invalidated
– new advertisements sent to neighbors
– neighbors in turn send out new advertisements
(if tables changed)
– poison reverse used to prevent ping-pong loops
(infinite distance = 16 hops)
CS 4284 Spring 2013
• RIP routing tables managed by application-level process called route-d (daemon)
• advertisements sent in UDP packets, periodically repeated routed routed
Transprt
(UDP) network forwarding
(IP) table link physical forwarding table
Transprt
(UDP) network
(IP) link physical
CS 4284 Spring 2013
• Cisco proprietary
– See [ Cisco Whitepaper ], [ Malhotra 2002 ]
• Distance Vector Protocol with enhancements
– Explicit Signaling (HELLO packets)
• DUAL “diffusing update algorithm”
– “feasible successor” concept guarantees loop freedom
• Intuition: rather than count to infinity, trigger route recomputation unless another loop-free path is known
– Optimize this by keeping track of all advertised routes, not just best one
CS 4284 Spring 2013
• “open”: publicly available protocol (not proprietary)
• Uses Link State algorithm
– LS packet dissemination
– Topology map at each node
– Route computation using Dijkstra’s algorithm
• OSPF advertisement carries one entry per neighbor router
– Advertisements have age field to allow for expiration
• Advertisements disseminated to entire AS (via flooding)
– Carried in OSPF messages directly over IP (rather than TCP or UDP)
CS 4284 Spring 2013
• Security: all OSPF messages authenticated (to prevent malicious intrusion)
• Multi ple same-cost path s allowed (only one path in RIP)
• For each link, multiple cost metrics for different
TOS (e.g., satellite link cost set “low” for best effort; high for real time)
• Integrated uni- and multicast support:
– Multicast OSPF (MOSPF) uses same topology data base as OSPF
• Hierarchical OSPF in large domains.
CS 4284 Spring 2013
CS 4284 Spring 2013
• Two-level hierarchy: local area, backbone.
– link-state advertisements only in same area
– each nodes has detailed area topology; only know direction (shortest path) to nets in other areas.
• Area border routers : “summarize” distances to nets in own area, advertise to other Area Border routers.
• Backbone routers : run OSPF routing limited to backbone.
• Boundary routers: connect to other AS’s.
CS 4284 Spring 2013
• BGP (Border Gateway Protocol): the de facto standard
• BGP provides each AS a means to:
1. Obtain subnet reachability information from neighboring ASs.
2. Propagate the reachability information to all routers internal to the AS.
3.
Determine “good” routes to subnets based on reachability information and policy.
• Allows a subnet to advertise its existence to rest of the Internet: “I am here”
CS 4284 Spring 2013
• Pairs of routers (BGP peers) exchange routing info over semi-permanent TCP conctns: BGP sessions
• Note that BGP sessions do not always correspond to physical links.
• When AS2 advertises a prefix to AS1, AS2 is promising it will forward any datagrams destined to that prefix towards the prefix.
– AS2 can aggregate prefixes in its advertisement
3b
3c
3a
AS3
1c
2a
2c
1a
AS1 1d
1b
AS2
2b eBGP session iBGP session
CS 4284 Spring 2013
• With eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
• 1c can then use iBGP do distribute this new prefix reach info to all routers in AS1
• 1b can then re-advertise the new reach info to AS2 over the 1bto-2a eBGP session
• When router learns about a new prefix, it creates an entry for the prefix in its forwarding table.
3b
3c
3a
AS3
1a
AS1
1c
1d
1b
2a
2c
AS2
2b eBGP session iBGP session
CS 4284 Spring 2013
• When advertising a prefix, advert includes BGP attributes.
– prefix + attributes = “route”
• Two important attributes:
– AS-PATH: contains the ASs through which the advert for the prefix passed: AS 67 AS 17
– NEXT-HOP: Indicates the specific internal-AS router to next-hop AS. (There may be multiple links from current AS to next-hop-AS.)
• When gateway router receives route advert, uses import policy to accept/decline.
CS 4284 Spring 2013
• Router may learn about more than 1 route to some prefix. Router must select route.
• Elimination rules:
1. Local preference value attribute: policy decision
2. Shortest AS-PATH (like DV routing, except with more information!)
3. Closest NEXT-HOP router: hot potato routing
4. Additional criteria
CS 4284 Spring 2013
• Accomplished via AS-PATH attributes
– Each node is entire AS!
CS 4284 Spring 2013
B legend: provider network
X
W
A
C
Y
Figure 4.5-BGPnew : a simple BGP scenario
• A,B,C are provider networks
• X,W,Y are customer (of provider networks)
• X is dual-homed: attached to two networks
– X does not want to route from B via X to C
– .. so X will not advertise to B a route to C customer network:
CS 4284 Spring 2013
W
A
legend:
B provider network
X
C customer network:
Y
• B advertises to X the path BAW
• Should B advertise to C the path BAW?
– No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
– B wants to force C to route to w via A
– B wants to route only to/from its customers!
CS 4284 Spring 2013
• OSPF hierarchy is intra-AS
• BGP connects
ASs
CS 4284 Spring 2013
Policy:
• Inter-AS: admin wants control over how its traffic routed, who routes through its net.
• Intra-AS: single admin, so no policy decisions needed
Scale:
• hierarchical routing saves table size, reduced update traffic
Performance :
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance
CS 4284 Spring 2013
Intra-
Inter-
EBGP
Sessions
1,490
13,830
OSPF EIGRP
IGP
9,624 12,741
1,161 1,342
RIP Total
156 22,521
161 2,664
• Sample obtained by reverse-engineering router config files
• Source David Maltz et al:
– Routing Design in Operational Networks – A
Look from the inside, [ SIGCOMM 2004 ]
CS 4284 Spring 2013
• IP
– Addressing, subnets
• ICMP
• RIP
• OSPF
• BGP
CS 4284 Spring 2013