# Document

Given that the subnet mask is 255.255.255.224, consider the following hosts, which are connected through a router:

200.10.1.65
200.10.1.40
200.10.1.50
200.10.1.45
200.10.1.60
200.10.1.70
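As a minimal sketch of how this mask sorts the hosts into subnets, the following Python uses the standard `ipaddress` module. The helper name `subnet_of` and the grouping dictionary are assumptions made for illustration; they do not come from the slide.

```python
import ipaddress

MASK = "255.255.255.224"  # /27, as given on the slide
HOSTS = ["200.10.1.65", "200.10.1.40", "200.10.1.50",
         "200.10.1.45", "200.10.1.60", "200.10.1.70"]

def subnet_of(host: str, mask: str = MASK) -> ipaddress.IPv4Network:
    """Return the subnet a host belongs to (host address AND mask)."""
    return ipaddress.ip_network(f"{host}/{mask}", strict=False)

# Group the hosts by the subnet they fall into.
groups = {}
for host in HOSTS:
    groups.setdefault(str(subnet_of(host)), []).append(host)

for net, members in sorted(groups.items()):
    print(net, members)
```

Hosts that land in different groups are on different subnets, so traffic between them has to pass through the router.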
What Is a Variable-Length Subnet Mask?
(Figure: HQ holds 172.16.0.0/16; sites A, B and C use 172.16.14.32/27, 172.16.14.64/27 and 172.16.14.96/27.)
– Subnet 172.16.14.0/24 is divided into smaller subnets:
• Subnet with one mask at first (/27)
• Then further subnet one of the /27 subnets that is not used elsewhere (/30)
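The two-step division can be sketched with Python's `ipaddress` module. Which /27 is left unused is not stated on the slide, so picking the last one (172.16.14.224/27) below is purely an assumption for illustration.

```python
import ipaddress

block = ipaddress.ip_network("172.16.14.0/24")

# First pass: one mask for the LAN subnets (/27, 30 usable hosts each).
lans = list(block.subnets(new_prefix=27))

# Second pass: take a /27 that no LAN uses and cut it into /30 links.
# Choosing the last /27 here is an assumption, not taken from the slide.
spare = lans[-1]
links = list(spare.subnets(new_prefix=30))

print([str(n) for n in lans[1:4]])   # the A, B and C subnets in the figure
print([str(n) for n in links[:3]])   # first three point-to-point subnets
```

Each /30 has four addresses, two of which are usable, which is exactly what a point-to-point link needs.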
Calculating VLSMs
Subnetted address: 172.16.32.0/20
In binary: 10101100.00010000.00100000.00000000
VLSM address: 172.16.32.0/26
In binary: 10101100.00010000.00100000.00000000
Bit fields: Network (16 bits) | Subnet (4 bits) | VLSM subnet (6 bits) | Host (6 bits)
1st subnet: 10101100.00010000 .0010 0000.00 000000 = 172.16.32.0/26
2nd subnet: 172.16 .0010 0000.01 000000 = 172.16.32.64/26
3rd subnet: 172.16 .0010 0000.10 000000 = 172.16.32.128/26
4th subnet: 172.16 .0010 0000.11 000000 = 172.16.32.192/26
5th subnet: 172.16 .0010 0001.00 000000 = 172.16.33.0/26
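The bit arithmetic behind these subnets can be sketched directly: the 6 VLSM subnet bits sit just above the 6 host bits, so the n-th /26 is the base address with n shifted left by 6. The function names `dotted` and `vlsm_subnet` are hypothetical helpers introduced only for this example.

```python
BASE = (172 << 24) | (16 << 16) | (32 << 8)  # 172.16.32.0, the /20 subnet
HOST_BITS = 6                                # a /26 leaves 6 host bits

def dotted(value: int) -> str:
    """Format a 32-bit integer as a dotted-decimal address."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def vlsm_subnet(index: int) -> str:
    """Return the index-th /26 (0-based) inside 172.16.32.0/20."""
    if not 0 <= index < 2 ** (26 - 20):
        raise ValueError("a /20 holds only 64 subnets of size /26")
    # Place the index in the VLSM subnet field, just above the host bits.
    return dotted(BASE | (index << HOST_BITS)) + "/26"

for i in range(5):
    print(f"subnet {i + 1}: {vlsm_subnet(i)}")
```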
A Working VLSM Example
Derived from the 172.16.32.0/20 subnet (62 hosts each):
172.16.32.0/26
172.16.32.64/26
172.16.32.128/26
172.16.32.192/26
Derived from the 172.16.33.0/26 subnet (2 hosts each):
172.16.33.0/30
172.16.33.4/30
172.16.33.8/30
172.16.33.12/30
Written Exercise: Calculating VLSMs
• Using VLSMs, define appropriate subnets for addressing the networks using 192.168.49.0/24.
(Figure: five sites A, B, C, D and E, each with a LAN of 25 users, each connected to HQ by its own serial link: A Serial, B Serial, C Serial, D Serial and E Serial.)
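One possible allocation for this exercise is sketched below; it is not the only valid answer. It assumes each 25-user LAN gets a /27 (30 usable addresses) and each serial link gets a /30 carved from the next unused /27. The variable names are illustrative.

```python
import ipaddress

block = ipaddress.ip_network("192.168.49.0/24")
lan_names = ["A", "B", "C", "D", "E"]

# 25 users need 5 host bits (2**5 - 2 = 30 usable addresses), i.e. a /27.
subnets = list(block.subnets(new_prefix=27))
lans = dict(zip(lan_names, subnets))

# Serial links need 2 addresses each, so carve /30s out of the next free /27.
serial_pool = subnets[len(lan_names)]
serials = dict(zip(lan_names, serial_pool.subnets(new_prefix=30)))

for name in lan_names:
    print(f"LAN {name}: {lans[name]}   {name} Serial: {serials[name]}")
```

Under these assumptions the LANs occupy 192.168.49.0/27 through 192.168.49.128/27, and the serial links are taken from 192.168.49.160/27, leaving the remaining space free for growth.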
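IP datagram format
Header layout (each row is 32 bits wide):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | Internet checksum
32-bit source IP address
32-bit destination IP address
Options (if any)
data (variable length, typically a TCP or UDP segment)
Field notes:
• ver: IP protocol version number
• head. len: header length, counted in four-byte words
• type of service: "type" of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number of remaining hops, at most 255 (decremented at each router)
• upper layer: upper layer protocol to deliver the payload to (6: TCP, 17: UDP, 1: ICMP, 89: OSPF)
• Options: e.g. timestamp, record route taken, specify list of routers to visit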
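To make the layout concrete, here is a minimal sketch that decodes the fixed 20-byte part of an IPv4 header with Python's `struct` module. The function name `parse_ipv4_header`, the dictionary keys and the sample header values are assumptions chosen for illustration.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # field counts 4-byte words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "flags": flags_frag >> 13,
        "frag_offset": flags_frag & 0x1FFF,        # in 8-byte units
        "ttl": ttl,
        "protocol": proto,                         # 6 TCP, 17 UDP, 1 ICMP, 89 OSPF
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample header (checksum left at 0 for brevity).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"), socket.inet_aton("223.1.2.2"))
fields = parse_ipv4_header(sample)
print(fields)
```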
IP Fragmentation & Reassembly
• Network links have an MTU (max. transfer size): the largest possible link-level frame
– different link types, different MTUs
• A large IP datagram is divided ("fragmented") within the net
– one datagram becomes several datagrams
– "reassembled" only at the final destination
– IP header bits are used to identify and order related fragments
(Figure: fragmentation, in: one large datagram, out: 3 smaller datagrams; reassembly at the destination.)
MTU (bytes):
Ethernet: 1500
X.25: 576
FDDI: 4352
Token ring: 4464
IP Fragmentation and Reassembly
One large datagram (4000 bytes) becomes several smaller datagrams of at most 1500 bytes. The offset field is counted in units of eight bytes.
length   ID   fragflag   offset
4000     x    0          0      (original datagram)
1500     x    1          0
1500     x    1          185    (byte offset 1480; 1480/8 = 185)
1040     x    0          370    (byte offset 2960; 2960/8 = 370)
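The fragment sizes and offsets in this example can be reproduced with a short calculation. This is a sketch assuming a 20-byte header without options; the function name `fragment` and its tuple layout are made up for the illustration.

```python
def fragment(total_len: int, mtu: int, header_len: int = 20):
    """Split a datagram into (length, more_fragments_flag, offset_field) tuples.

    Offsets are expressed in 8-byte units, as in the IP header.
    """
    payload = total_len - header_len
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments, sent = [], 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0
        fragments.append((data + header_len, more, sent // 8))
        sent += data
    return fragments

for length, flag, offset in fragment(4000, 1500):
    print(f"length={length} fragflag={flag} offset={offset}")
```

The 4000-byte datagram carries 3980 data bytes, which split into 1480 + 1480 + 1020; adding the 20-byte header to each gives the lengths 1500, 1500 and 1040.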


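x = des_IP, X = des_MAC
z = my_default_router_IP, Z = my_default_router_MAC
w = my_IP, Y = my_subnet_mask (w and Y are not defined on the slide; these meanings are assumed)
IF (w AND Y) = (x AND Y) THEN        (the network numbers are equal)
    Look_up_MAC(X)
    IF found THEN
        SEND_PACKET(X, x)
    ELSE
        SEND_ARP(X, ?)
        SEND_PACKET(X, x)
ELSE                                 (the destination is on another network)
    Look_up_MAC(Z)
    IF found THEN
        SEND_PACKET(Z, z)
    ELSE
        SEND_ARP(Z, ?)
        SEND_PACKET(Z, z)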
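The same decision can be sketched in Python. The function name `next_frame_target`, its return shape and the MAC addresses in the usage lines are hypothetical; the ARP table is modelled as a plain dictionary, and a missing entry simply signals that an ARP request would have to be sent first.

```python
import ipaddress

def next_frame_target(my_ip, mask, dest_ip, router_ip, arp_table):
    """Return (mac, ip, arp_needed) for the frame a host would send.

    arp_table maps IP strings to MAC strings; a missing entry means an
    ARP request has to be sent first (arp_needed is True, mac is None).
    """
    m = int(ipaddress.ip_address(mask))
    same_net = ((int(ipaddress.ip_address(my_ip)) & m) ==
                (int(ipaddress.ip_address(dest_ip)) & m))
    # Same network: deliver to the destination itself, else to the router.
    target_ip = dest_ip if same_net else router_ip
    mac = arp_table.get(target_ip)
    return mac, target_ip, mac is None

# Made-up MAC addresses, for illustration only.
arp = {"223.1.1.2": "02:00:00:00:00:02", "223.1.1.4": "02:00:00:00:00:04"}
print(next_frame_target("223.1.1.1", "255.255.255.0", "223.1.1.2", "223.1.1.4", arp))
print(next_frame_target("223.1.1.1", "255.255.255.0", "223.1.2.2", "223.1.1.4", arp))
```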

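How is B's MAC address determined?
• Each IP node (host, router) has an ARP table
• ARP table: IP/MAC address mappings
< IP address; MAC address; TTL >
– TTL (Time To Live): the time after which this mapping is forgotten (typically 20 min)
ARP protocol
• A knows B's IP address and wants to learn B's physical address
• A broadcasts an ARP query packet that contains B's IP address
– all devices on the same physical segment as A receive the ARP query packet
• B receives the ARP packet and replies to A with its (B's) physical address
• A caches the IP-to-physical address pair until the entry times out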
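A minimal sketch of such a cache with expiring entries follows. The class name `ArpCache`, its methods and the MAC address used are assumptions for illustration; the optional `now` argument exists only so that the timeout can be demonstrated without waiting.

```python
import time

class ArpCache:
    """IP -> (MAC, expiry time) table; entries vanish after ttl seconds."""

    def __init__(self, ttl: float = 20 * 60):   # 20 min, as on the slide
        self.ttl = ttl
        self._entries = {}

    def learn(self, ip: str, mac: str, now: float = None) -> None:
        """Store a mapping learned from an ARP reply."""
        now = time.time() if now is None else now
        self._entries[ip] = (mac, now + self.ttl)

    def lookup(self, ip: str, now: float = None):
        """Return the cached MAC, or None if unknown or timed out."""
        now = time.time() if now is None else now
        entry = self._entries.get(ip)
        if entry is None or entry[1] <= now:
            self._entries.pop(ip, None)          # forget stale mappings
            return None
        return entry[0]

cache = ArpCache()
cache.learn("223.1.1.2", "02:00:00:00:00:0b", now=0)
print(cache.lookup("223.1.1.2", now=60))        # still cached
print(cache.lookup("223.1.1.2", now=20 * 60))   # expired, ARP query needed
```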

– Route table lookup:
IF (Mask[i] AND Destination Address) = Destination[i] THEN
Forward to NextHop[i]
– Subnet mask can end on any bit.
– Mask must have contiguous 1s followed by contiguous zeros. Routers do not support other types of masks.
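Route Table Lookup: Example
(Figure: four networks in a chain, 30.0.0.0, 40.0.0.0, 128.1.0.0 and 192.4.0.0, joined by three routers with interfaces 30.0.0.7/40.0.0.7, 40.0.0.8/128.1.0.8 and 128.1.0.9/192.4.10.9. The table belongs to the middle router, which is attached to 40.0.0.0 and 128.1.0.0.)
Destination   Mask            Next Hop
30.0.0.0      255.0.0.0       40.0.0.7
40.0.0.0      255.0.0.0       Deliver direct
128.1.0.0     255.255.0.0     Deliver direct
192.4.10.0    255.255.255.0   128.1.0.9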
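The lookup rule can be tried against this example table. The sketch below returns the first row whose masked destination matches; the names `ROUTES`, `to_int` and `lookup` are illustrative and not part of the slides.

```python
import ipaddress

# (destination, mask, next hop) rows from the example table.
ROUTES = [
    ("30.0.0.0",   "255.0.0.0",     "40.0.0.7"),
    ("40.0.0.0",   "255.0.0.0",     "Deliver direct"),
    ("128.1.0.0",  "255.255.0.0",   "Deliver direct"),
    ("192.4.10.0", "255.255.255.0", "128.1.0.9"),
]

def to_int(addr: str) -> int:
    """Convert a dotted-decimal address to a 32-bit integer."""
    return int(ipaddress.ip_address(addr))

def lookup(dest: str):
    """Return the next hop of the first row whose masked address matches."""
    for network, mask, next_hop in ROUTES:
        if (to_int(dest) & to_int(mask)) == to_int(network):
            return next_hop
    return None  # no matching row; this table has no default route

print(lookup("192.4.10.77"))
print(lookup("40.3.2.1"))
```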

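Routing table in A:
Dest. Net.   next router   Nhops
223.1.1      -             1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2
IP datagram: misc fields | source IP addr | dest IP addr | data
• The datagram remains unchanged as it travels from source to destination
• The address fields are what routing is interested in
(Figure: three networks joined by one router with interfaces 223.1.1.4, 223.1.2.9 and 223.1.3.27. Network 223.1.1 holds A (223.1.1.1), B (223.1.1.2) and 223.1.1.3; network 223.1.2 holds 223.1.2.1 and E (223.1.2.2); network 223.1.3 holds 223.1.3.1 and 223.1.3.2.)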

IP datagram: misc fields | 223.1.1.1 | 223.1.1.3 | data
A sends an IP datagram to B:
• Look up the network address of B
• Find that B is on the same network segment as A
• The link layer encapsulates the datagram in a link-layer frame and sends it directly to B
– B and A are directly connected

IP datagram: misc fields | 223.1.1.1 | 223.1.2.3 | data
A sends an IP datagram to E:
• Look up the network address of E
• E is on a different network segment
– A and E are not directly connected
• Routing table: the next hop towards E is 223.1.1.4
• The link layer encapsulates the datagram in a frame and sends it to router 223.1.1.4
• The datagram arrives at 223.1.1.4

IP datagram: misc fields | 223.1.1.1 | 223.1.2.3 | data
The datagram has arrived at router 223.1.1.4:
• Look up the address of E in the routing table
• E is on the same network as router interface 223.1.2.9
– the router and E are directly connected
• The link layer encapsulates the datagram in a frame addressed to 223.1.2.2
• The datagram arrives at 223.1.2.2!!!
Routing table in the router:
Dest. network   next router   Nhops   interface
223.1.1         -             1       223.1.1.4
223.1.2         -             1       223.1.2.9
223.1.3         -             1       223.1.3.27
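The two-stage walk from A to E can be sketched with the tables from these slides. The code assumes /24 networks as in the figure; the names `TABLE_A`, `TABLE_ROUTER`, `network_of`, `next_stop` and `path` are hypothetical helpers.

```python
# Forwarding tables from the slides: /24 network -> next router
# (None means the destination network is directly attached).
TABLE_A = {"223.1.1": None, "223.1.2": "223.1.1.4", "223.1.3": "223.1.1.4"}
TABLE_ROUTER = {"223.1.1": None, "223.1.2": None, "223.1.3": None}

def network_of(addr: str) -> str:
    """Network part of an address, assuming /24 networks as in the figure."""
    return addr.rsplit(".", 1)[0]

def next_stop(table: dict, dest: str) -> str:
    """Where the link layer should send the frame: dest itself or a router."""
    next_router = table[network_of(dest)]
    return dest if next_router is None else next_router

def path(dest: str) -> list:
    """Trace the stops of a datagram sent by A (223.1.1.1)."""
    hops = [next_stop(TABLE_A, dest)]
    if hops[-1] != dest:                      # handed to the router first
        hops.append(next_stop(TABLE_ROUTER, dest))
    return hops

print(path("223.1.1.2"))   # B: delivered directly
print(path("223.1.2.2"))   # E: via router 223.1.1.4
```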
The Internet Network layer
Host, router network layer functions:
(Figure: the network layer sits below the transport layer (TCP, UDP) and above the physical layer.)
Network layer components:
• Routing protocols: path selection; RIP, OSPF, BGP
• Forwarding table
• IP protocol: datagram format, packet handling conventions
• ICMP protocol: error reporting, router "signaling"
Data Forwarding: Steps
• Decrement TTL, check and update header checksum
• If error, drop the packet, and generate ICMP report
• Else look up packet destination address in
forwarding table:
– If datagram for a host on directly attached network,
forward
– Otherwise,
• find next-hop, and
• forward packet to outgoing interface (the next hop neighbor)
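The first two steps (check the header, decrement TTL, update the checksum) can be sketched as follows. This assumes a header given as raw bytes with TTL at byte 8 and the checksum at bytes 10 to 11, as in IPv4; the function names are illustrative, and a real router would also generate the ICMP report where this sketch only returns None.

```python
import struct

def checksum(header: bytes) -> int:
    """16-bit one's-complement sum of the header, complemented."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_step(header: bytes):
    """Return the updated header, or None if the packet must be dropped."""
    if checksum(header) != 0:          # a valid header sums to zero
        return None
    ttl = header[8]
    if ttl <= 1:                       # TTL would reach zero: drop the packet
        return None
    hdr = bytearray(header)
    hdr[8] = ttl - 1
    hdr[10:12] = b"\x00\x00"           # clear the old checksum before recomputing
    hdr[10:12] = struct.pack("!H", checksum(bytes(hdr)))
    return bytes(hdr)
```

Because the checksum covers the TTL byte, every router that decrements the TTL also has to rewrite the checksum, which is why the two actions appear together in the first step.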
How a Router Forwards Datagrams
• Every datagram contains a destination address.
• The router determines the prefix to which the address belongs and routes the datagram to the "Network ID" that uniquely identifies a physical network.
• Longest-prefix match
• All hosts and routers sharing a Network ID share the same physical network.
Longest-prefix match
• For example:
An IP datagram has the destination address 206.0.71.130, and the routing table contains the three entries below. Which entry is the most specific?
206.0.0.0/16
206.0.68.0/22
206.0.71.128/25
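A minimal sketch of the rule, using Python's `ipaddress` module: collect every entry that contains the destination and keep the one with the longest prefix. The function name `longest_prefix_match` is an assumption for illustration, and a linear scan like this is far slower than what a real router does.

```python
import ipaddress

ENTRIES = ["206.0.0.0/16", "206.0.68.0/22", "206.0.71.128/25"]

def longest_prefix_match(dest: str, entries=ENTRIES):
    """Return the matching entry with the longest prefix, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [ipaddress.ip_network(e) for e in entries
               if addr in ipaddress.ip_network(e)]
    return str(max(matches, key=lambda n: n.prefixlen)) if matches else None

print(longest_prefix_match("206.0.71.130"))
```

All three entries match 206.0.71.130, so the /25 entry wins because it has the longest prefix.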
How a Router Forwards Datagrams
(Figure: router R1 with ports 1, 2 and 3, connected to neighbouring routers R2, R3 and R4; the next-hop addresses are 128.17.20.1, 128.17.14.1 and 128.17.16.1.)
Forwarding/routing table:
Prefix         Next-hop      Port
65/8           128.17.16.1   3
128.9/16       128.17.14.1   2
128.9.16/20    128.17.14.1   2
128.9.19/24    128.17.10.1   7
128.9.25/24    128.17.14.1   2
128.9.176/20   128.17.20.1   1
142.12/19      128.17.16.1   3
e.g. 128.9.16.14 => Port 2
Inside a Router
1. Forwarding Decision, using the Forwarding Table (one of each per input)
2. Interconnect
3. Output Scheduling
Forwarding in an IP Router
• Lookup packet DA in forwarding table.
– If known, forward to correct port.
– If unknown, drop packet.
• Decrement TTL, update header Checksum.
• Forward packet to outgoing interface.
Question: How is the address looked up in a real router?
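Making a Forwarding Decision
(Figure: the address space divided into Class A, Class B, Class C and Class D ranges. The address 212.17.9.4 falls in Class C, so its network number 212.17.9.0 is looked up by exact match in the routing table.)
Routing Table: 212.17.9.0 -> Port 4
Exact Match: There are many well-known ways to find an exact match in a table.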
Direct Lookup
(Figure: the IP address is used directly as the memory address; the memory returns the next-hop and port.)
Problem: With 2^32 addresses, the memory would require 4 billion entries.
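Associative Lookups
(Figure: associative memory or CAM. The 32-bit search data is compared against every stored network entry at once, producing a "Hit?" signal and the port number.)
Advantage:
• Simple
Disadvantages:
• Slow
• High power
• Small
• Expensive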
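Hashed Lookups
(Figure: the 32-bit search data passes through a hashing function that produces a memory address of log2N bits, 16 in the figure; the memory returns the associated data and a "Hit?" signal.)
The hash is the XOR of the first 16 bits and the last 16 bits of the 32-bit IP address.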
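This XOR fold can be sketched in a few lines, together with a bucketed table in which addresses with the same hash key share a list. The names `hash16`, `build_table` and `lookup` are assumptions for illustration.

```python
import ipaddress

def hash16(addr: str) -> int:
    """Fold a 32-bit IPv4 address into 16 bits by XOR-ing its two halves."""
    value = int(ipaddress.ip_address(addr))
    return (value >> 16) ^ (value & 0xFFFF)

def build_table(routes: dict) -> dict:
    """Group (address, port) pairs by hash key; collisions share a bucket."""
    table = {}
    for addr, port in routes.items():
        table.setdefault(hash16(addr), []).append((addr, port))
    return table

def lookup(table: dict, addr: str):
    """Walk the bucket for addr and return its port, or None on a miss."""
    for stored, port in table.get(hash16(addr), []):
        if stored == addr:
            return port
    return None

# 10.1.1.32 and 1.32.10.1 swap their halves, so they collide on purpose.
table = build_table({"10.1.1.32": 2, "1.32.10.1": 5})
print(hex(hash16("10.1.1.32")), lookup(table, "1.32.10.1"))
```

The length of a bucket depends on which addresses happen to collide, which is why the lookup time of a hashed table is not deterministic.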

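Lookups Using Hashing
An example:
(Figure: the hashing function maps the 32-bit search data to a 16-bit memory address. Each memory location heads a linked list of entries (#1, #2, #3, #4) with same hash key; the list is searched for a hit and its associated data.)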
Lookups Using Hashing
• Simple
• Expected lookup time can be small
• Non-deterministic lookup time
• Inefficient use of memory
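Trees and Tries
Binary Search Tree:
(Figure: each node branches on < or >; N entries give a depth of log2N.)
Binary Search Trie ("reTRIEval"):
(Figure: each node branches on bit 0 or bit 1; example leaves 010 and 111.)
A binary trie over 32-bit addresses requires up to 32 memory references.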
Longest prefix matches using Binary Tries
(Figure: a binary trie branching on 0 and 1, with nodes a to j placed at the ends of their prefixes.)
Example prefixes:
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000
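A binary trie for these ten prefixes can be sketched as below. The lookup walks the address one bit at a time and remembers the last prefix it passed, which is the longest match. Class and function names are illustrative, and addresses are given as bit strings to stay close to the slide.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # "0" or "1" -> TrieNode
        self.label = None    # set when a prefix ends at this node

def build(prefixes: dict) -> TrieNode:
    """Insert every prefix bit by bit, labelling the node where it ends."""
    root = TrieNode()
    for label, bits in prefixes.items():
        node = root
        for bit in bits:
            node = node.children.setdefault(bit, TrieNode())
        node.label = label
    return root

def longest_match(root: TrieNode, address_bits: str):
    """Follow the address bit by bit, remembering the last prefix seen."""
    node, best = root, None
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best

PREFIXES = {"a": "00001", "b": "00010", "c": "00011", "d": "001", "e": "0101",
            "f": "011", "g": "100", "h": "1010", "i": "1100", "j": "11110000"}
root = build(PREFIXES)
print(longest_match(root, "00011010"))
```

Each step of the walk is one memory reference, so a lookup costs at most as many references as the address has bits.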
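Search Tries
Multiway tries reduce the number of memory references
16-ary Search Trie:
(Figure: each node holds 16 entries indexed by 4 address bits, 0000 to 1111. An entry stores either a pointer to a child node ("ptr") or 0. The example paths end at the keys 000011110000 and 111111111111.)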
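CIDR
(Figure: the address line from 0 to 2^32 - 1. Prefix 128.9/16 contains 128.9.16/20 and 128.9.176/20; 128.9.16/20 in turn contains 128.9.19/24 and 128.9.25/24. The address 128.9.16.14 lies inside 128.9/16 and 128.9.16/20.)
Most specific route = “longest matching prefix”
Question: How can we look up addresses if they are not an exact match?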
Ternary CAMs
Associative memory:
Value       Mask              Port
(missing)   255.255.255.255   1
10.1.1.0    255.255.255.0     2
10.1.3.0    255.255.255.0     3
10.1.0.0    255.255.0.0       4
10.0.0.0    255.0.0.0         4
Search key 10.1.1.32 -> Priority Encoder -> Port
Note: Most specific routes appear closest to top of table
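A software imitation of this lookup is sketched below. A real ternary CAM compares all rows in parallel in hardware; the list comprehension only mimics that, and the priority encoder is modelled as "first matching row wins". The first row of the slide is left out because its value did not survive; all names are illustrative.

```python
import ipaddress

# (value, mask, port) rows, most specific first, as listed on the slide.
# The first row of the slide is omitted because its value did not survive.
TCAM = [
    ("10.1.1.0", "255.255.255.0", 2),
    ("10.1.3.0", "255.255.255.0", 3),
    ("10.1.0.0", "255.255.0.0",   4),
    ("10.0.0.0", "255.0.0.0",     4),
]

def to_int(addr: str) -> int:
    """Convert a dotted-decimal address to a 32-bit integer."""
    return int(ipaddress.ip_address(addr))

def tcam_lookup(key: str):
    """Return the port of the topmost matching row, or None on a miss."""
    k = to_int(key)
    # Hardware compares every row at once; this list stands in for that.
    hits = [(k & to_int(mask)) == to_int(value) for value, mask, _ in TCAM]
    # Priority encoder: the first (topmost) matching row wins.
    for hit, (_, _, port) in zip(hits, TCAM):
        if hit:
            return port
    return None

print(tcam_lookup("10.1.1.32"))
```

Ordering the rows from most to least specific is what turns "first match" into "longest prefix match".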
Lookup Performance Required
Line    Line Rate   Pktsize=40B   Pktsize=240B
T1      1.5Mbps     4.68 Kpps     0.78 Kpps
OC3     155Mbps     480 Kpps      80 Kpps
OC12    622Mbps     1.94 Mpps     323 Kpps
OC48    2.5Gbps     7.81 Mpps     1.3 Mpps
OC192   10 Gbps     31.25 Mpps    5.21 Mpps
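These figures follow from dividing the line rate by the packet size in bits, as the sketch below shows. It assumes back-to-back packets with no framing overhead; the function name and the `LINES` dictionary are illustrative.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int) -> float:
    """Back-to-back packets per second at a given line rate."""
    return line_rate_bps / (packet_bytes * 8)

LINES = {"T1": 1.5e6, "OC3": 155e6, "OC12": 622e6, "OC48": 2.5e9, "OC192": 10e9}
for name, rate in LINES.items():
    print(f"{name}: {packets_per_second(rate, 40) / 1e6:.3f} Mpps at 40 B, "
          f"{packets_per_second(rate, 240) / 1e6:.3f} Mpps at 240 B")
```

Small differences from the table (for example about 484 Kpps rather than 480 Kpps for OC3 at 40 bytes) come from rounding in the slide.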
Router Architecture Overview
Two key router functions:
• run routing algorithms/protocol (RIP, OSPF, BGP)
• switching datagrams from incoming to outgoing link
Input Port Functions
Physical layer: bit-level reception
Data link layer: e.g., Ethernet
Decentralized switching:
• given datagram dest., look up output port using forwarding table in input port memory
• goal: complete input port processing at ‘line speed’
• queuing: if datagrams arrive faster than forwarding rate into switch fabric
Head-of-line blocking:
• The cell at the head of an input queue cannot be transferred, thus blocking the following cells
(Figure: Inputs 1 to 3 and Outputs 1 to 3. One cell cannot be transferred because it is blocked by the red cell at the head of its queue; another cannot be transferred because of output buffer overflow.)
• Maintain at each input N virtual queues, i.e., one per output
(Figure: Inputs 1 to 3, each with one virtual queue per output, and Outputs 1 to 3.)
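The effect of blocking at the head of a queue, and how per-output virtual queues avoid it, can be sketched with a one-slot simulation. This is a deliberately simplified model: each output accepts one cell per time slot and inputs are served greedily in index order, which is an assumption and not how a production scheduler works. Function names are illustrative.

```python
def fifo_slot(inputs):
    """One time slot with FIFO input queues: only head cells may compete."""
    busy, sent = set(), []
    for i, queue in enumerate(inputs):
        if queue and queue[0] not in busy:
            busy.add(queue[0])
            sent.append((i, queue[0]))
    return sent

def voq_slot(inputs):
    """One time slot with virtual output queues: any queued cell may compete."""
    busy, sent = set(), []
    for i, queue in enumerate(inputs):
        for out in queue:
            if out not in busy:
                busy.add(out)
                sent.append((i, out))
                break
    return sent

# Each list holds the output port wanted by each queued cell, head first.
queues = [[1, 2], [1, 3]]
print(fifo_slot(queues))   # the second input is stuck behind its head cell
print(voq_slot(queues))    # its cell for output 3 can go at once
```

With FIFO queues only one cell crosses the fabric in this slot, although output 3 is idle; with one virtual queue per output, two cells cross.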
Three Types of Switching Fabrics
Switching Via Memory
(Figure: Input Port -> Memory -> Output Port, connected by the System Bus.)
• Input port processor performs lookup, copies datagram into memory
• Speed limited by memory bandwidth (2 bus crossings per datagram)
• Cisco Catalyst 8500
Switching Via Bus
• Datagram moves from input port memory to output port memory via a shared bus
• Bus contention: switching speed limited by bus bandwidth
• 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone)
Switching Via An Interconnection Network
• Overcomes bus bandwidth limitations
• Banyan networks, and others
• Advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.
• Cisco 12000: switches Gbps through the interconnection network
• An active area of research for optical switches:
http://www.arl.wustl.edu/~jst/talks/switching_games.ppt
Output Ports
• Buffering required when datagrams arrive from
fabric faster than the transmission rate
• Scheduling discipline chooses among queued
datagrams for transmission
Output Port Queueing
• buffering when arrival rate via switch exceeds
output line speed
• queueing (delay) and loss due to output port buffer
overflow!
Overview
• What is ICMP?
• ICMP Messages
• ICMP applications: Ping, Traceroute, Path MTU discovery
Error Reporting (ICMP)
Internet Control Message Protocol:
– Used by a router/end-host to report some types of error:
– E.g. Destination Unreachable: packet can’t be forwarded to/towards its destination.
– E.g. Time Exceeded: TTL reached zero, or fragment didn’t arrive in time. Traceroute uses this error to its advantage.
– An ICMP message is an IP datagram, and is sent back to the source of the packet that caused the error.
ICMP Features
• Used by IP to send error and control messages
• Uses IP to send its messages
• ICMP messages are not required on datagram checksum errors and multicast errors
• ICMP reports error only on the first fragment
(Figure: the ICMP message is carried as IP data.)
ICMP Message Format
Field                Size
Type of Message      8b
Error Code           8b
Checksum             16b
Parameters, if any   Var
Information          Var
ICMP messages are divided into two broad categories: error reporting and query messages
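As a sketch of this format, the code below assembles an ICMP echo request (type 8, code 0). For echo messages the "parameters" are a 16-bit identifier and a 16-bit sequence number, and the "information" is the payload. The function names are illustrative, and nothing is sent on the network.

```python
import struct

def inet_checksum(data: bytes) -> int:
    """Internet checksum: complemented one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0)."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)   # checksum = 0 for now
    csum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload

msg = echo_request(ident=1, seq=1, payload=b"ping")
print(msg.hex())
```

A receiver can validate the message by running the same checksum over all of it; a correct message yields zero.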
Sample ICMP Messages
• Source Quench: "Please slow down! I just dropped one of your datagrams."
• Time Exceeded: "Time to live field in one of your packets became zero." or "Reassembly timer expired at the destination."
• Fragmentation Required: Datagram was longer than MTU and "No Fragment bit" was set.
Sample ICMP Messages
(Continued)
• Redirect: Send to router X instead of me.
• Time Stamp Request/Reply: used to find current
time or RTT.
• ICMP error messages normally include the IP
header of the datagram that generated the error,
plus at least 8 bytes following the IP header =>
ICMP message sizes = 70 bytes
ICMP: Message Types Summary
Type   Message
0      Echo reply
3      Destination unreachable
4      Source quench
5      Redirect
8      Echo request
11     Time exceeded
12     Parameter unintelligible (parameter problem)
13     Time-stamp request
14     Time-stamp reply
15     Information request
16     Information reply
17     Address mask request
18     Address mask reply
Error codes for destination unreachable:
0      Net unreachable
1      Host unreachable
2      Protocol unreachable
3      Port unreachable
Ping
• Ping is used to:
– test destination reachability
– compute round trip time
– count the # of hops to destination
– optionally record the route (record route option)
• Sample output:
Reply from 164.107.144.3: 48 bytes in 47 msec. TTL: 253
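Ping-of-death
• An IP packet can be at most 65535 bytes long
• Some systems (such as Win95) are able to send ICMP packets that carry more than 65535 bytes of data
• Such a packet is split into multiple fragments in transit
• After reassembly at the destination it exceeds the maximum size of an IP packet
• This often causes the receiver to overwrite its internal data structures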

Traceroute
• Traceroute: Exploit TTL and ICMP
– Send the packet with time-to-live = 1 (hop)
– The first router discards the packet and sends an
ICMP “time-to-live exceeded message”
– Send the packet with time-to-live = 2 (hops)
etc…
– Does not use optional features like record route
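The TTL trick can be illustrated with a small simulation that sends no real packets. The path is modelled as a list of router names followed by the destination; `probe` and `traceroute` are hypothetical helper names, and the message strings are simplified labels.

```python
def probe(path, ttl):
    """Simulate one probe: return (responder, message) for a given TTL.

    path lists the routers in order, followed by the destination host.
    """
    for hop, node in enumerate(path, start=1):
        if hop == len(path):
            return node, "reply from destination"
        ttl -= 1                       # each router decrements the TTL
        if ttl == 0:
            return node, "ICMP time exceeded"
    return None, "no response"

def traceroute(path):
    """Raise the TTL one hop at a time until the destination answers."""
    route = []
    for ttl in range(1, len(path) + 1):
        node, message = probe(path, ttl)
        route.append(node)
        if message == "reply from destination":
            break
    return route

print(traceroute(["router-1", "router-2", "destination"]))
```

Each probe reveals exactly one more hop, because the router at which the TTL reaches zero identifies itself as the sender of the time exceeded message.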
Path MTU Discovery
• Send a large IP datagram with “Don’t
fragment” bit set.
– Failure to fragment at a link will result in ICMP
message.
• Reduce MSS until success (No ICMP