Note - Gmu

advertisement
TCOM 509:
UDP, TCP/IP - Internet
Protocols
* Obtained permission to use Raj Jain’s technical material
2
IP Routing
 An
example routing table
Destination
127.0.0.1
Default
150.100.15.0
H=1
0
Next-Hop
127.0.0.1
150.100.15.54
150.100.15.11
Flags Network Interface
H
lo0
G
emd0
emd0
Destination is a complete host address
Destination is a network address
 Search
1:
2:
3:
4:
G=1
G=0
Next-Hop to a router
Next-Hop to a directly connected destination
Order of routing table
Complete match of destination IP
Match the network addr (including the subnet ID)
Use default router
If all previous steps fail to find a suitable entry, send an ICMP “host
unreachable error”
3
ROUTING SUMMARY
 Routing is the process of discovering, selecting and following paths from
the transmitting host to the receiving host in a network. There are two
categories of routing algorithms:
 Source Routing: The transmitting host inserts a list of routers that describe
a path through the network.
 Hop-by-Hop Routing: The transmitting host knows how to get to the first
router.
 The router then employs its Routing Table to select the next best hop (router),
which selects the next best router, etc.
Routing
Source
Routing
Strict
Hop-by-Hop
Routing
Loose
Static
Dynamic
Default
 Distance Vector: The router sends a list of
networks, how far they are and the next hop
Distance
direction.
Vector
 Link State: The router has a complete topology
1. RIP
map of the network.
 Path Vector: The router sends a complete path to 2. IGRP
get to a destination.
Link
State
1. OSPF
2. LS-LS
Path
Vector
1. EGP
2. BGP
4
IP Routing Strategies
 Static
Routing
 Pre-determined
administrator
 Dynamic
Routing
 Interior
 RIP,
routes setup by
Routing Protocols
OSPF
 Exterior
Routing Protocols
 BGP
5
How Does IP Routing Work?

Basic procedure:
search for a matching host address (/32)
2. search for a matching network address (/x,
where 0<x<32)
3. search for a default entry (0/0)
4. If all previous steps fail to find a suitable
entry, send an ICMP “host unreachable
error”
1.

IP packets are routed via a “bestmatch” or “longest-match” principle
6
Processing an IP packet
7
IP Source Routing



IP routing has no concept of the source
determining the route
What if the source wanted to specify the
packet’s path?
The source route option was added to the IP
protocol in order to assist in route debugging.
Nowadays, it seems to be mainly used by large
ISPs, to make sure that their peers aren't
inappropriately dumping traffic onto their
backbone links. A packet is given a list of
desired hops that should be taken on the way
to the final destination.
8
IP Source Routing via IP Options
CODE = 131
9
Source Routing example
10
Types of Routes

Static


All packets forwarded to predetermined
destinations defined by an administrator
Dynamic

Packets are forwarded to dynamically
calculated routes determined by a routing
protocol
11
12
Static Routing

Benefits




Good for small networks
Can help create a secure network
Efficiently uses router resources
Drawbacks


Does not handle network failures well
Does not scale well
13
Static Routing Example
Destination
Next Hop
10.0.0.0
Direct
172.16
Router B
192.168.5
Router C
192.168.6
Router C
Network
10
Router A
Destination
Next Hop
10
Router A
172.16
Direct
192.168.5
Router C
192.168.6
Router C
Router C
Router B
Destination
Next Hop
10
Router A
172.16
Router B
192.168.5
Direct
192.168.6
Router D
Network
192.168.5
Network
172.16
Router D
Destination
Next Hop
192.168.6
Direct
Default
Router C
Network 192.168.6
14
Static Routing with Link Failure
Destination
Next Hop
10
Direct
172.16
Router B
192.168.5
Router C
192.168.6
Router C
Network
10
Router A
Destination
Next Hop
10
Router A
172.16
Direct
192.168.5
Router C
192.168.6
Router C
Router C
Router B
Destination
Next Hop
10
Unreachable
172.16
Router B
192.168.5
Direct
192.168.6
Router D
Network
192.168.5
Network
172.16
Router D
Destination
Next Hop
192.168.6
Direct
Default
Router C
Network 192.168.6
15
Dynamic Routing
Communicate
what?
Distance-Vector
Link-State
Between
whom?
Routing tables
Neighbors
Interface status
All routers
16
Dynamic IP Routing Protocols
 RIP
(Distance Vector)
 OSPF (Link State)
 IS-IS (Link State)
 BGP (Path Vector)
17
Distance Vector vs. Link State
18
Concept of Administrative Distance
Connected
0
Static to Interface or
Static to Next Hop
1
E-IGRP (Cisco only)
90
OSPF
110
IS-IS
115
RIP v1 and v2
120
Only one IGP route is installed
in the routing table
Administrative Distances of Routing Protocols: Measures
trustworthiness of the source of route
- Handles preferences when multiple sources of routing
info exists in router
- Protocol with lowest admin weight wins
19
Distance-Vector and Link State Protocol
Protocol
Category
Metric
Algorithm
RIP v1
Distance Vector
Hop Count
Bellman-Ford
RIP v2
Distance Vector
Hop Count
Bellman-Ford
OSPF
Link State
Bandwidthbased cost
Shortest Path
First
IGRP
Distance Vector
Composite
Bellman-Ford
20
What is RIP (Routing Info Protocol)?

RIP is a Interior Gateway Protocol (IGP)

Used within an Autonomous System (AS)


A collection of routers under the same
administrative authority
Two versions


RIP v1 (RFC 1058)
RIP v2 (RFC 2453)
21
Distance Vector Routing Protocol RIP v1 - Characteristics
Directly connected subnets are known
 Routing updates are broadcasted to
neighbors
 Listen to routing updates
 Metrics are used
 Routing info consists of subnet and metric
 Periodic updates (30 sec)
 A route is learned via a neighbor
 Failed route has a metric of infinite

22
RIP Uses UDP
RIP is a UDP-based protocol. Each router that uses RIP has a
routing process that sends and receives datagrams on UDP port
number 520, the RIP-1/RIP-2 port. All communications intended
for another routers's RIP process are sent to the RIP port. All
routing update messages are sent from the RIP port.
23
24
25
RIP Characteristics

Distance-vector routing protocol

Updates contain routes (vectors) and the cost (distance)
to reach them and consist of the following steps:
Each node calculates the distances between itself and all other
nodes within the AS and stores this information as a table.
 Each node sends its table to all neighboring nodes.
 When a node receives distance tables from its neighbors, it
calculates the shortest routes to all other nodes and updates its
own table to reflect any changes





Does not scale well for large networks as every router has
to add a RIP route for every newly added network
Hop count is used as the metric for path
selection, based on Bellman-Ford distance-vector
routing algorithm
Maximum allowable hop count is 15
Routing updates are broadcast every 30 seconds
26
RIP Message Types

Two message types

Request message


Ask neighbors to send routes
Response message

Carries route updates
Advertises 25 routes per update
 Router decides how to handle routes in
update


Add, modify, or delete
27
RIP Routing Metrics

Counts the number of hops between
source and destination
Number of hops is the number of router hops
 Hop count equals the RIP metric
 RIP cannot determine measured delay, reliability,
load, or link bandwidth


With multiple paths to the same prefix,
one with fewest hops is selected

May not be optimum path
28
RIP in Action (1):
162.11.5.0
Router A
Router A & C is down
Tr0
1
s0
s1
162.11.9.0
162.11.8.0
Routing table B
s0
Router C
s0
Router B
E0
162.11.10.0
E0
162.11.7.0
162.11.8.0
E0
s0
1
1
162.11.7.0
29
RIP in Action (2):
162.11.5.0
Router C is down
Router A is switched on
Router A
Tr0
162.11.5.0 1
1
s0
s1
162.11.9.0
162.11.8.0
s0
Router C
162.11.9.0 1
s0
Router B
E0
162.11.10.0
E0
162.11.5.0
162.11.7.0
162.11.8.0
162.11.9.0
s0
E0
s0
s0
2
1
1
2
162.11.7.0
30
RIP in Action (3):
Router C is switched on
162.11.5.0
Router A
Tr0
2
s0
s1
162.11.9.0
162.11.5.0
1
162.11.9.0
1
162.11.10.0 2
162.11.8.0
1
162.11.10.0 1
s0
Router C
s0
Router B
E0
162.11.10.0
E0
162.11.5.0
162.11.7.0
162.11.8.0
162.11.9.0
162.11.10.0
s0
E0
s0
s0
s0
2
1
1
2
3
162.11.7.0
31
RIP Timers

RIP uses numerous timers to regulate its performance. These
include a routing-update timer, a route-timeout timer, and a
route-flush timer.

Routing-update timer - clocks the interval between periodic routing
updates. Generally, it is set to 30 seconds, with a small random amount
of time added whenever the timer is reset. This is done to help prevent
congestion, which could result from all routers simultaneously attempting
to update their neighbors.

Route-timeout timer - Each routing table entry has a routetimeout timer associated with it. When the route-timeout timer
expires, the route is marked invalid but is retained in the table until the
route-flush timer expires. Default value is 120 secs

Route-flush timer - If 180 seconds elapse from the last time the
timeout was initialized, the route is considered to have expired, and the
deletion process described below begins for that route. Default value is
180 secs.
32
Bellman Ford’s Distance Vector Algorithm –
Example
http://www.laynetworks.com/Simulation%20of%2
0Bellman%20Algorithm.htm
33
Disadvantages with the Bellman Ford’s
Algorithm
 Does
not scale well
 Changes in network topology are not reflected
quickly since updates are spread node-by-node.
 Counting to infinity (if link or node failures
render a node unreachable from some set of
other nodes, those nodes may spend forever
gradually increasing their estimates of the
distance to it, and in the meantime there may be
routing loops)
34
Count To Infinity Problem – URL Link
35
Improving Convergence

Split Horizon
 For interface X, don’t advertise routes out X that you learned via X
 prevents forwarding loops only for 2 adjacent router case
 joke analogy: if you tell me a joke and you get it, I don’t need to tell it back to you

Hold Down Timers
 refuse to accept any information for a period of time (60 secs) after a route is declared
unreachable
 can increase convergence time

Triggered Updates
 when a change occurs, send update immediately (don’t wait for next update interval)
 Change can be defined as an observed increase in hop count over time (1.6 –2.0 increase in
originally store hop count)
 Attempt to speed up convergence

Hold Down Timers and Triggered Updates
 Can be used together to be more effective

Split Horizon with Poison Reverse
 a.k.a. “Infinite Split Horizon”
 For interface X, DO advertise routes out X that you learned via X, but with a metric of
INFINITY
 advantage: eliminates two-router loops
 disadvantage: increases the size of routing updates

None of these mechanisms can completely avoid routing loops
 and counting to infinity doesn’t go away
36
RIPv2 (rfc 2453) – Solves some of the RIPv1
Shortcomings
RIPv2 is classless and uses UDP port 520 as does RIPv1
(classful). It is still distance vector and still uses hop count as
the metric with a max hop count of 15.
The ability to multicast saves other devices on the network from
wasting time opening broadcast packets.
37
38
RIPv2 – Subnet Mask
 Classless
routing protocols carry the
subnet mask. This allows all 0 and 1
subnets to be used, eliminating confusion
between 172.16.255.255 and
172.16.255.255. Here, one is the 'all
subnets' broadcast and one is broadcast
on the all 1s subnet - but which is which?
 If
the subnet mask is sent then
172.16.255.255 /16 and 172.16.255.255
/24 can be differentiated.
39
RIPv2 – Route Tag
 Each
RIPv2 entry includes a Route Tag
field, where additional information about a
route can be stored. It provides a method
for distinguishing between internal routes
(learned by RIP) and external routes
(learned from other protocols).
40
RIPv2 – Next Hop

In RIPv2, each RIP entry includes a space where an explicit IP address can be entered as the next hop router for
datagrams intended for the network in that entry

Specifying a value of 0.0.0.0 in this field indicates that routing should be via the originator of the RIP
advertisement.

The purpose of the Next Hop field is to eliminate packets being routed through extra hops in the system. It is
particularly useful when RIP is not being run on all of the routers on a network. A simple example is given in
Appendix A. Note that Next Hop is an "advisory" field. That is, if the provided information is ignored, a possibly
sub-optimal, but absolutely valid, route may be taken. If the received Next Hop is not directly reachable, it should
be treated as 0.0.0.0.
----- ----- --------- ----- ----|IR1| |IR2| |IR3|
|XR1| |XR2| |XR3|
--+-- --+-- --+---+-- --+-- --+-|
|
|
|
|
|
--+-------+-------+----------------+-----+------+-<-------------RIP-2----------------->
Assume that IR1, IR2, and IR3 are all "internal" routers which are under one
administration (e.g. a campus) which has elected to use RIP-2 as its IGP. XR1, XR2, and
XR3, on the other hand, are under separate administration (e.g. a regional network, of
which the campus is a member) and are using some other routing protocol (e.g. OSPF).
XR1, XR2, and XR3 exchange routing information among themselves such that they know
that the best routes to networks N1 and N2 are via XR1, to N3, N4, and N5 are via XR2,
and to N6 and N7 are via XR3. By setting the Next Hop field correctly (to XR2 for
N3/N4/N5, to XR3 for N6/N7), only XR1 need exchange RIP-2 routes with IR1/IR2/IR3
for routing to occur without additional hops through XR1. Without the Next Hop (for
example, if RIP-1 were used) it would be necessary for XR2 and XR3 to also participate in
the RIP-2 protocol to eliminate extra hops.
41
RIPv2 - Authentication
8
bits 8
Command
bits 8
Version
0XFFF
bits 8
bits
Unused - set to all zeros
Authentication Type
Password (bytes 0-3)
Password (bytes 4-7)
Password (bytes 8-11)
Password (bytes 12-15)
RIPv2 authenticates the source of the packets. The source of
the update uses the first field of the message that would
normally carry IP address, SM, Next Hop, Metric and hijacks
these for authentication. This leaves room for only 24 updates
per packet instead of 25 with RIPv1.
A password is indicated if the
AFI field is set to 0XFFF. The
authentication type for simple
authentication is set to 0X002.
The password is left justified
and unused bits are set to zero.
MD5 authentication may be
enabled to overcome plain-text
authentication. Use the
Authentication Type field to
identify the method used. MD5
computes a 128-bit hash value
from plain text plus password.
This hash is transmitted along
with the message and the hash
is recalculated at the far end
and the received and calculated
hash values are checked
against each other. If they
match, the message is
authenticated.
42
RIP v2 Packet Format
8
0
Command
16
Version
24
31
Reserved (Must be zero)
Route Tag
Address Family Identifier
IP Address
Subnet Mask
Next Hop
Metric
…
Route Tag
Address Family Identifier
IP Address
Subnet Mask
Next Hop
Metric
43
RIP Limitations










Maximum network diameter = 15
Lack of alternative routes. RIPv2 keeps only one
route to a destination in routing tables. It has to wait
for updates after a failure to assess whether a new (if
any) route exists
Regular updates include entire routing table
approximately every 30 seconds
Poison reverse increases the size of the routing
updates
Count to infinity slows route loop prevention
Metrics only involve hop count
Broadcasts between neighbors (RIPv1 only)
Classful routing means no prefix length carried in
route updates (RIPv1 only) – and no VLSM
No authentication mechanism (RIPv1 only)
Slow convergence
44
OSPF
OSPF Concept : Having the Same Copy
of Network Topology at Every Node
R1 LSA
R3 LSA
R2 LSA
R5 LSA
R4 LSA
xyz
LSA
R6 LSA
abc
LSA
pdq
LSA
46
SPF Algorithm

The Shortest Path First (SPF) routing algorithm is the basis for OSPF operations. When
an SPF router is powered up, it initializes its routing-protocol data structures and then waits for
indications from lower-layer protocols that its interfaces are functional.

After a router is assured that its interfaces are functioning, it uses the OSPF Hello
protocol to acquire neighbors, which are routers with interfaces to a common
network. The router sends hello packets to its neighbors and receives their hello packets. In
addition to helping acquire neighbors, hello packets also act as keepalives to let routers know that
other routers are still functional.

On multi-access networks (networks supporting more than two routers), the Hello
protocol elects a designated router and a backup designated router. Among other
things, the designated router is responsible for generating LSAs for the entire multi-access
network. Designated routers allow a reduction in network traffic and in the size of the topological
database.

When the link-state databases of two neighboring routers are synchronized, the
routers are said to be adjacent. On multiaccess networks, the designated router determines
which routers should become adjacent. Topological databases are synchronized between pairs of
adjacent routers. Adjacencies control the distribution of routing-protocol packets, which are sent
and received only on adjacencies.

Each router periodically sends an LSA to provide information on a router's
adjacencies or to inform others when a router's state changes. By comparing established
adjacencies to link states, failed routers can be detected quickly, and the network's topology can
be altered appropriately. From the topological database generated from LSAs, each router
calculates a shortest-path tree, with itself as root. The shortest-path tree, in turn, yields a routing
table.
47
How OSPF Protocol Works
Stage 1: Discovering Neighbors
=> Hello Message
Stage 2: Electing the Designated Router
=> Hello Message
Stage 3: Establishing Adjacencies
=> DB Description Msgs
Stage 4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
Stage 5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
48
First Requirements for the new IGP
to Replace RIP

Had to be more efficient than RIP


Faster convergence than RIP



consume fewer network resources: link bandwidth and
CPU cycles
communicate changes quickly
link, interface, and router failures
More descriptive metric than RIP


hop-count limitations
ability to include other factors (bandwidth, delay,
reliability, etc.)
 Figure

3.2
OSPF ended up using cost
 16
bits, no limit on total path cost
 eliminated network diameter limitations
49
New IGP Requirements (2)
 Support
for load balancing over multiple
equal-cost links to a destination
 more
efficient use of network resources
 implementation was not mandated by the
protocol
 multiple
strategies exist: flow-based, round-robin,
hash function, packet-by-packet
 in theory, multiple vendor strategies can be combined
in a single network and it can still work, but this
needs to be examined closely in some cases
 Support
for a routing hierarchy
 split
the AS up into mini-AS’s, in a sense
 a scalability mechanism
50
New IGP Requirements (3)

Separate internal and external routes



Support for more flexible subnetting


essentially CIDR addressing and notation, no notion of
classful routing
Security


RIPv1 had no way to distinguish
You generally trust info from your AS over routes from
outside your AS
ability to control what routers participate in OSPF
routings based on a password
ToS-based routing


allow specification of different metrics for each of the
original ToS categories
in reality, never really used; chicken-and-egg problem
51
52
What is OSPF?

An IGP using Link-State technique to update
routing tables



Based on the shortest path first (SPF) algorithm, also
known as the Dijkstra algorithm
Created to fill the need for a high functionality,
standards-based IGP for the TCP/IP protocol
family
Main RFCs:


1587 – OSPF NSSA Option
2328 – OSPF Version 2 (current implementation)
53
What is a Link-State Protocol ?





Link = router interface
State = description of interface and its
relationship to neighboring routers
OSPF routers send link-state advertisements
(LSAs) to all other routers within the same
hierarchical area
Routers store information in a link-state, or
topological, database
Each OSPF router uses the SPF algorithm to
calculate the shortest path to each node
54
Three (3) Types of OSPF LS Messages
1.
LSA (Link State Advertisement): LSAs are included in the
database description packets (DDPs or DBDs). LSA entries include
link-state type, the address of the advertising router, the cost of the
link, and the sequence number.
2.
LSR ( Link State Request): When a slave router receives an DDP
(Database Description Packet), it sends and LSAck packet. Then it
compares the received information with the information it has. If
the DDP has more recent information, the slave router sends a linkstate request (LSR) to the master router.
3.
LSU ( Link State Update): LSU packet is sent in response to LSR
(Link-State Request) packet sent from a slave router to a master
router. LSU contains complete information about the requested
entry.
55
What is SPF?

Places each router at the root of a tree
and calculates the shortest path to each
destination based on the cumulative cost
to reach that destination

Each router has its own view of the
topology even though all the routers build
a shortest path tree using the same linkstate database
56
SPF Cost

Cost, or metric, of an interface indicates
the overhead required to send packets
across that interface





Cost = 10**8/bandwidth (bps)
Higher bandwidth = lower cost
10M Ethernet line cost = 10**8/10**7 = 10
T1 line cost = 10**8/1544000 = 64
To handle hi-speed links, use a value
greater than 10**8 in the cost calculation

This is the Reference Bandwidth
57
58
Shortest Path Tree

Router A’s SPF tree



A is the Root; use the least-cost path to each IP prefix
If a link goes down, the SPF tree is recalculated
Each router calculates its own SPF tree
Router A
Router D
10
10
0
128.213.0.0
Router B
5
5
192.213.11.0
Router D
10
7
222.211.10.0
59
Dijkstra’s Link State Algorithm

Principle: Dijkstra's algorithm works on the principle that the shortest possible path
from the source has to come from one of the shortest path already discovered.

Layman’s Terms: Using the street map, you're marking over the streets
(tracing the street with a marker) in a certain order, until you have a route
marked in from the starting point to the destination. The order is
conceptually simple: from all the street intersections of the already
marked routes, find the closest unmarked intersection - closest to the
starting point (the "greedy" part). It's the whole marked route to the
intersection, plus the street to the new, unmarked intersection. Mark that
street to that intersection, draw an arrow with the direction, then repeat.
Never mark to any intersection twice. When you get to the destination,
follow the arrows backwards. There will be only one path back against the
arrows, the shortest one.

Demo:


http://www.eng.tau.ac.il/~shtilman/C-Programming/Year2/dijkstra.html
http://www.oopweb.com/Algorithms/Documents/PLDS210/Volume/dij-op.html
60
OSPF Breaks an AS into Areas
AS 100
ABR
Area
234
Area 0
Area 10
ABR
61
62
Area Sizing Guidelines
 Rules
of thumb for non-backbone area
 No
more than 100 routers
 No more than 50 neighbors per router
 Decrease
when media unstable
 Consider
static/default and demand
techniques
 Decrease
when large numbers of
externals injected
 Consider
if the incoming externals can be
summarized or filtered
63
When Might Single-Area OSPF make
sense?


Fewer than 50 routers with alternate paths
Needs:





multivendor compatibily
fast convergence
VLSM
complex defaults and externals
No clear candidates for core


OSPF power greatest with hierarchy
Multiple domains may be better than 1 area
64
Design Guidelines – Network Topology (Cont’d) OSPF Network Size Recommendation
65
Design Guidelines – Network Topology (Cont’d) How Many Areas Should Be Connected per ABR?
66
OSPF : Location of different routers
67
Different Types of OSPF routers

Internal router: An internal router has all the interfaces in the same area.
All internal routers have same link state databases.

Backbone router: Backbone routers sit on the perimeter of Area 0, with
at least one interface connected to backbone (Area 0).

Area Border Router (ABR): ABRs are routers that have interfaces
attached to multiple areas. It may be noted that these routers maintain
separate link-state databases for each area that they are connected. They
are capable of routing traffic destined for or arriving from other areas.

Autonomous System Boundary Router (ASBR): These are the routers
that have at least one interface to the external network (another
autonomous system). This autonomous network can be non-OSPF. ASBRs
are capable of route redistribution, a term used to imply that the concerned
router can import routing information from non-OSPF networks and
distribute the same in OSPF network for which it is responsible and visa
versa.
68
OSPF Terminology
 Area ID: A 32 bit number identifying an area. Acquired from
IANA.
 Router ID: A 32 bit number identifying a router. Normally
the lowest numbered IP address belonging to a router.
 Router Priority: An 8 bit number that indicates this router’s
willingness to be a designated/backup designated router.
 A router priority of Zero indicates that this router is ineligible
to be a designated router.
 LINK State Advertisement: Exchanged by adjacent routers
to allow area topology databases to be maintained and interarea and intra-AS routes to be advertised.
 The are five types of link state advertisements.
69
70
71
OSPF NETWORKS
 OSPF supports three kinds of connections and
networks.
 Point-to-Point between exactly two routers.
 Multi-access networks with broadcasting(e.g.,
Ethernet, T-R, etc)
 Multi-access networks without broadcasting (e.g.,
packet switching WANs)
Point-to-Point Network
Multi-access w/ Broadcasting
Multi-access w/o Broadcasting
X.25
NETWORK
72
PROTOCOL ENCAPSULATION and OSPF PROTOCOL NUMBER
NOTE
 OSPF uses
direct IP
encapsulation.

Protocol 89 is used for OSPF.
 OSPF is sent as multicast on pt-
OSPF
to-pt and broadcast networks.
 224.0.0.5
NETWORK
LAYER
Protocol Type
89
IP Header
Source IP Address: 128.66.12.2
Destination IP Address: 224.0.0.5
DATA LINK
LAYER
ETHERNET
PREAMBLE
DESTINATION ADDR
00 00 1B 12 23 34
SOURCE ADDR
00 00 1B 09 08 07
FIELD
TYPE
IP
HEADER
OSPF
FCS
73
74
OSPF MESSAGE TYPES
 HELLO (Type 1) is used to:
identify neighbors,
to elect a designated Router for multi-access network,
to find out about an existing Designated Router and
as "I'm alive" signal.
 DATABASE DESCRIPTION (Type 2) is used to exchange information during initialization so that a
router can find out what data is missing from its topology database. Each LSA is preceded by a
common LS Advertisement Header.
LSA Specific Type 1
Router Link Advertisement
LSA Specific Type 2
Network Link Advertisement
LSA Specific Type 3
Summary Link Advertisement to other Areas.
LSA Specific Type 4
Summary Link Advertisement to ASBR.
LSA Specific Type 5
AS External Link Advertisement
 LINK STATE REQUEST (Type 3) is used to ask for data that a router has discovered is missing from
its topology database or to replace data that is out of date.
Database descriptions are exchanged first then
Link State request are submitted to resolve missing or suspicious data.
 LINK STATE UPDATE (Type 4) is used to reply to a Link State Request and also to dynamically
report changes in network topology.
 LINK STATE ACKNOWLEDGEMENT (Type 5) is used to confirm receipt of a Link State Update. The
sender will retransmit until an update is ACKed.
75
Link State Advertisement Types
 Router Link Advertisement (LSA Type 1)

Generated by all OSPF routers and describe the state of the router's
interface (links) within the area.

They are flooded throughout a single area only.
 Network Link Advertisement (LSA Type 2)

Generated by the Designated Router (DR) on a multi-access network and
lists the routers connected to the network.

They are flooded throughout a single area only.
 Summary Link Advertisements

Generated by Area Border Routers (ABR) and flooded throughout a single
area only. There are two types:
 A summary advertisement (LSA Type 3) describing routes to
destinations in other areas within the same AS.
 A summary advertisement (LSA Type 4) describing routes to AS
Boundary Routers.
 For routers to get information out of the AS.
 AS External Link Advertisement (LSA Type 5)

Generated by the AS Boundary Routers(ASBR) to describe routes to
destinations external to the OSPF network.

They are flooded to all areas in the OSPF network.
76
LSA Types Used in Flooding
Router Links
Type 1
Summary Links
Types 3 and 4
ABR
Describe the state and cost of the router’s
links (interfaces) to the area (Intra-area).
Network Links
Type 2
DR
Originated for multi-access segments with
more than one attached router. Describe
all routers attached to the specific
segment. Originated by a Designated
Router (discussed later on).
Originated by ABRs only.
Describe networks in the AS but outside of area
(Inter-area).
Also describe the location of the ASBR.
External Links
Type 5
ASBR
Originated by an ASBR.
Describe destinations external
to the autonomous system or a
default route to the outside AS.
77
LSA Specific - Description
LSA Specific Type 1 (Router Description)
 These are the router-LSAs. They describe the collected states of
the router's interfaces. For more information, consult Section
12.4.1.
 LSA Specific Type 2 (Network Description)
 These are the network-LSAs. They describe the set of routers
attached to the network.
 LSA Specific Type 3 or 4
 These are the summary-LSAs. They describe inter-area routes, and
enable the condensation of routing information at area borders.
Originated by area border routers,
 The Type 3 summary-LSAs describe routes to networks (Network
Description)
 Type 4 summary-LSAs describe routes to AS boundary routers.
(Router Description)
 LSA Specific Type 5 (Network Description)
 These are the AS-external-LSAs. Originated by AS boundary
routers, they describe routes to destinations external to the
Autonomous System. A default route for the Autonomous System
can also be described by an AS-external-LSA.

78
BASIC OSPF OPERATIONAL SEQUENCE
S1: Routers discover their OSPF neighbors.
 When the OSPF routers first start they establish
and maintain a relationship with their neighbors using
the Hello protocol.
S2: Routers elect a Designated Router (DR) and a
Backup Designated Router (BDR) for a network
(LAN) with multiple routers using the Hello protocol.
S3: The routers form adjacencies.
 For routers on Multi-access networks all routers
become adjacent to the DR and the BDR.
79
BASIC OSPF OPERATIONAL SEQUENCE Contd
S4: Adjacent routers then exchange Database Description packets
which may be part or all of the routers Link State Database.
The adjacent routers then synchronize their Link State
Databases by requesting missing or outdated information on the
advertised links.
 This is done through a Link State Request packet .
The response is a Link State Update packet.
A Link State Acknowledgement packet is used to confirm the
correct receipt of a Link State Update packet.
S5: The routers then calculate the routing table by running the
Shortest Path First(SPF) algorithm using the Link State Database
as input.
The routers periodically engage in advertising its Link States
based upon a refresh timer expiration or a link state change.
They then recalculate their routing table.
80
How OSPF Protocol Works
S1: Discovering Neighbors
=> Hello Messages
S2: Electing the Designated Router
=> Hello Messages
S3: Establishing Adjacencies
=> DB Description Msgs
S4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
S5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
81
Discovering Neighbors – Hello Protocol
82
Hello Exchange Process – Pt-to-Pt Link
83
Hello Exchange Process – Ethernet Link
84
OSPF HELLO MESSAGE
0
8
16
24
Message
Type
Version
31
Message Length
Router Identification
Area Identification
Checksum
Common Message Header
Authentication Type
Authentication (octets 0-3)
Authentication (octets 4-7)
Network Mask
Options
Hello Interval
Dead Interval Timer
Designated Router
Backup Designated Router
...
Neighbor One IP Address


E T
Router Priority
Hello Message Type
Identifies neighbors
Elects the Designated Router(DR)
Find out about an existing DR
An Alive Signal
NETWORK MASK: This field contains the subnet mask of the network over which the
message was sent (the mask associated with the interface).

If this field does not match the receiving router's mask for that network, the
receiving router rejects the Hello message and does not accept the transmitting
router as a neighbor.

In the absence of subnetting it is set to the default subnet mask.
HELLO INTERVAL: This field tells how often in seconds this router transmits its Hello
messages.

A Broadcast is normally 10 seconds.

A non-Broadcast is normally every 30 seconds.

HelloInterval and RouterDeadInterval fields in sent OSPF packet must match
with the settings configured in the receiving interface.
85
How OSPF Protocol Works
S1: Discovering Neighbors
=> Hello Message
S2: Electing the Designated Router
=> Hello Message
S3: Establishing Adjacencies
=> DB Description Msgs
S4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
S5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
86
DR and BDR Election - Example
87
How OSPF Protocol Works
S1: Discovering Neighbors
=> Hello Message
S2: Electing the Designated Router
=> Hello Message
S3: Establishing Adjacencies
=> DB Description Msgs
S4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
S5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
88
89
Database Sync Process

In a link-state routing algorithm, it is very important for all routers' link-state databases to stay
synchronized in order to have a compatible routing tables. OSPF simplifies this by requiring only
adjacent routers to remain synchronized. The synchronization process begins as soon as the
routers attempt to bring up the adjacency. Each router describes its database by sending a
sequence of Database Description packets to its neighbor. Each Database Description Packet
describes a set of LSAs belonging to the router's database. This sending and receiving of
Database Description packets is called the "Database Exchange Process". During this process, the
two routers form a master/slave relationship. Each Database Description Packet has a sequence
number. Database Description Packets (DDPs) sent by the master (polls) are acknowledged by
the slave through echoing of the sequence number. The DB exchange initially only sends the LSA
headers and not the LSA info to achieve better BW and processing efficiencies.

The master is the only one allowed to retransmit Database Description Packets. It does so only at
fixed intervals, the length of which is the configured per-interface constant RxmtInterval. Each
Database Description contains an indication that there are more packets to follow --- the M-bit. The
Database Exchange Process is over when a router has received and sent Database Description
Packets with the M-bit off.

During and after the Database Exchange Process, each router has a list of those LSAs for which
the neighbor has more up-to-date instances. These LSAs are requested in Link State Request
Packets. Link State Request packets that are not satisfied are retransmitted at fixed intervals of
time RxmtInterval. When the Database Description Process has completed and all Link State
Requests have been satisfied, the databases are deemed synchronized and the routers are
marked fully adjacent. At this time the adjacency is fully functional and is advertised in the two
routers' router-LSAs.

Criteria for determining adjacency between 2 routers:
 Have the same number of LSAs in their LSDBs
 Sum of their LSA’s LS Checksum fields are equal
90
OSPF DB DESCRIPTION MESSAGE
0
8
16
24
Message
Type
Version
31
Message Length
Router Identification
Area Identification
Checksum
Common Message Header
Authentication Type
Authentication (octets 0-3)
0
4-7)
8 Authentication (octets
16
LS Age
24
Options
31
LS Type
DB Descrioption Message Type
Establishing adjacency
Link State Identification
Advertising Router
Link State Sequence Number
LS Checksum


Length
NETWORK MASK: This field contains the subnet mask of the network over which the
message was sent (the mask associated with the interface).

If this field does not match the receiving router's mask for that network, the
receiving router rejects the Hello message and does not accept the transmitting
router as a neighbor.

In the absence of subnetting it is set to the default subnet mask.
HELLO INTERVAL: This field tells how often in seconds this router transmits its Hello
messages.

A Broadcast is normally 10 seconds.

A non-Broadcast is normally every 30 seconds.

HelloInterval and RouterDeadInterval fields in sent OSPF packet must match
with the settings configured in the receiving interface.
91
OSPF DB DESCRIPTION MESSAGE WITH LINK
ADVERTISEMENT HEADER
Message
Type
Version
Headers
contain enough information
to identify the LS records needed
during synchronization.
The receiver marks the LS Records
to be requested.
Message Length
Router Identification
Area Identification
Checksum
Authentication Type
Authentication (octets 0-3)
0
4-7)
8 Authentication (octets
16
LS Age
24
Options
31
LS Type
Link State Identification
Advertising Router
LS Types are:
Type 1: Router Links
Type 2: Network Links
Type 3/4: Summary Links
Type 5: External Links
Link State Sequence Number
LS Checksum
Length
 LS Age: A 16 bit number indicating the time in seconds since the origin of the advertisements.
This
time increases as the link state advertisement resides in the router database and/or with each hop
count.
When it reaches a maximum value, normally one hour, it is discarded unless needed for
synchronization.
 Options: See the Hello Packet.
 LS Type: This field specifies which of five different link state advertisements is contained in this header.
 Link State ID: A unique ID for the advertisement which is dependent upon the message type.
 LSA Message Types 1/4 uses the Router ID.
 LSA Message Type 2 uses the IP address of the Designated Router.
 LSA Message Types 3/5 uses an IP network number.
92
OSPF DB DESCRIPTION MESSAGE WITH LINK
ADVERTISEMENT HEADER (CONT’D)
Headers
contain enough information to
identify the LS records needed during
synchronization.
The receiver marks the LS Records to be
requested.
Message
Type
Version
Message Length
Router Identification
Area Identification
Checksum
Authentication Type
Authentication (octets 0-3)
0
4-7)
8 Authentication (octets
16
LS Age
24
Options
31
LS Type
Link State Identification
Advertising Router
LS Types are:
Type 1: Router Links
Type 2: Network Links
Type 3/4: Summary Links
Type 5: External Links
Link State Sequence Number
LS Checksum
Length
Advertising Router: The Router ID of the router that originated the link state advertisement.
LSA
Message Type 1 is identical to the Link State ID.
LSA Message Type 2 uses the Router ID of the Network's Designated Router.
LSA Message Types 3/4 use Router ID of the Area Border Router.
LSA Message Type 5 uses the Router ID of the AS Boundary Router.
 LS Sequence Number: This field is used to sequence the advertisements and to detect duplicate
or old packets.
 LS Checksum: The Checksum of the complete Link State Advertisement excluding the LS Age
field.
 Length: The length is the size of the advertisements in bytes including the Advertisement Header.
93
LSA Flooding - Operations
94
Example of Router LSAs
95
Example of Network LSAs
96
Example of Summary LSAs
97
External Route LSAs - Example
98
How OSPF Protocol Works
S1: Discovering Neighbors
=> Hello Message
S2: Electing the Designated Router
=> Hello Message
S3: Establishing Adjacencies
=> DB Description Msgs
S4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
S5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
99
Propagating LS Info: When a link
state changes
100
Propagating LS Info: When a link
state changes
101
102
How OSPF Protocol Works
S1: Discovering Neighbors
=> Hello Message
S2: Electing the Designated Router
=> Hello Message
S3: Establishing Adjacencies
=> DB Description Msgs
S4: Propagating Link State Information
=> Flooding using LS Request/Update Msgs)
S5: Calculating the Routing Table(s)
=> (Diksjtra’s Algorithm)
103
Pros and Cons of OSPF
Advantages of OSPF:
1.Changes in an OSPF network are propagated quickly.
2.OSPF is heirarchical, using area 0 as the top as the heirarchy.
3.OSPF is a Link State Algorithm.
4.OSPF supports Variable Length Subnet Masks (VLSM).
5.OSPF uses multicasting within areas.
6.After initialization, OSPF only sends updates on routing table sections which have changed, it
does not send the entire routing table.
7.Using areas, OSPF networks can be logically segmented to decrease the size of routing tables.
Table size can be further reduced by using route summarization.
8.OSPF is an open standard, not related to any particular vendor.
9.Can load-balance up to 6 equal-cost routes with 4 as the default
Disadvantages of OSPF:
1.OSPF is very processor intensive.
2.OSPF maintains multiple copies of routing information, increasing the amount of memory
needed.
3.Using areas, OSPF can be logically segmented (this can be a good thing and a bad thing).
4.OSPF is not as easy to learn as some other protocols.
5.In the case where an entire network is running OSPF, and one link within it is "bouncing"
every few seconds, OSPF updates would dominate the network by informing every other router
every time the link changed state
104
105
BGP - Autonomous System
 Networks
and Routers under a
single administrative authority
 Each AS is assigned a number
 AS numbers range form 1 to
65,535
106
Different AS Types
http://ipmon.sprint.com/pubs_trs/tutorials/Taft_BGP.pdf (slide 30)
107
BGP is
 An
Exterior Gateway Protocol (EGP), used
to propagate tens or hundreds of
thousands of routes between networks
(ASs).
 The
only protocol used to do this on the
Internet today.
108
What is BGP?


BGP is an inter-domain routing protocol that
communicates prefix reachability
BGP is a path vector protocol

Similar to distance vector
BGP views the Internet as a collection of
autonomous systems
 Stability is very important to the Internet and
BGP
 BGP supports CIDR
 BGP routers exchange routing information
between peers
 Defined in RFC 1771

109
How Does BGP Work?

BGP uses TCP as its transport protocol (port 179). Two BGP routers
form a TCP connection between one another (peer routers) and
exchange messages to open and confirm the connection parameters.

BGP routers exchange network reachability information. This
information is mainly an indication of the full paths (BGP AS numbers)
that a route should take in order to reach the destination network. This
information helps in constructing a graph of ASs that are loop-free and
where routing policies can be applied in order to enforce some
restrictions on the routing behavior.

Any two routers that have formed a TCP connection in order to
exchange BGP routing information are called peers, or neighbors. BGP
peers initially exchange their full BGP routing tables. After this
exchange, incremental updates are sent as the routing table changes.
BGP keeps a version number of the BGP table, which should be the
same for all of its BGP peers. The version number changes whenever
BGP updates the table due to routing information changes. Keepalive
packets are sent to ensure that the connection is alive between the
BGP peers and notification packets are sent in response to errors or
special conditions.
110
BGP Fundamentals

BGP peers exchange routes and send updates not
faster than every 90 seconds by default.

Routes consist of destination prefixes with an AS
path and BGP-specific attributes

Each BGP update contains one path
advertisement and attributes

Many destinations can share the same path

BGP compares the AS path and attributes to
choose the best path

Unfeasible routes can be advertised

Unreachable routes are withdrawn
111
BGP Connections

BGP updates are incremental



No regular refreshes
Except at session establishment, when
volume of routing can be high
BGP runs over TCP connections


TCP port 179
TCP Services


Fragmentation, Acknowledgements, Checksums,
Sequencing, and Flow Control
No automatic neighbor discovery
112
BGP Peering

BGP sessions are established between
peers


Two types of peering sessions



BGP Speakers
E-BGP (external) peers with different ASs
I-BGP (internal) peers within the same AS
Still requires interior gateway protocols
(IGPs)


IGP connects BGP speakers within the AS
IGP advertises internal routes
113
iBGP
AS 3847
When BGP speakers in the same
AS form a BGP connection for
the purpose of exchanging routing
information, they are said to be
running IBGP or internal BGP.
A
c
B
IBGP speakers are usually
fully-meshed.
114
eBGP (1)
When BGP speakers in different
ASs form a BGP connection for
the purpose of exchanging routing
information, they are said to be
running EBGP or external BGP.
EBGP peers are usually directly
connected.
AS 3561
A
AS 3847
B
115
eBGP (2)
AS 2033
AS 7007
AS 4200
AS 2041
116
iBGP and eBGP Diagram
AS 1239
AS 7007
XP
AS 701
AS 6079
AS 4006
117
eBGP Rules
 By
default, only talks to directly-connected
router.
 Sends the one best BGP route for each
destination.
 Sends all of the important “attributes”;
omits the “local preference” attribute.
 Adds (prepends) the speaker’s ASN to the
“as-path” attribute.
 Usually rewrites the “next-hop” attribute.
118
iBGP Rules
 Can
talk to routers many hops away by
default.
 Can only send routes it “injects”, or routes
heard DIRECTLY from an external peer.
 Thus, requires a FULL mesh.
 Sends all attributes.
 Leaves the as-path attribute alone.
 Doesn’t touch the “next hop” attribute.
119
Logical view of 16 routers, fully
meshed
120
iBGP Restriction (1)
Assume AS1239 sends route 10.0.0.0/8 to
AS2828. Router A will send that route to
Routers B and C.
B
AS 2828
C
A
AS 1239
121
iBGP Restriction (2)
When Router B receives 10.0.0.0/8, it will
not propagate that route to Router C
because it was learned from an iBGP
neighbor. Router C will behave similarly.
B
AS 2828
C
A
AS 1239
122
BGP Route Advertisement

Only advertise the active BGP routes to
peers (by default)


Never forward I-BGP routes to I-BGP
peers


BGP Next-hop must be reachable
Prevents loops
Withdraw routes if active BGP routes
become unreachable
123
CIDR and Aggregate Addresses
(1) AS 2 has the
detailed routes
AS 1
(3) AS 1 learns
only the
aggregate and
not the details
192.168.0.0/24
192.168.1.0/24
192.168.2.0/24
192.168.3.0/24
Router A
Router B
2.2.2.2
3.3.3.2
192.168.0/22
192.168.0/22
2.2.2.1
3.3.3.1
Router C
AS 2
(2) BGP with
Routing Policy
can advertise a
prefix that
aggregates the
detailed routes
AS 3
124
IBGP, EBGP Example
AS 1
EBGP
AS 3
AS 2
EBGP
IBGP
125
Advertising Networks
 Using
the Network command
 Redistributing static routes
 Redistributing Dynamic routes
126
Advertising Networks
Using Network Command
Router A
11.0.0.0
12.0.0.0
router bgp 1
neighbor 1.1.1.2 remote-as 2
network 11.0.0.0
network 12.0.0.0
Router B
router bgp 2
neighbor 1.1.1.1 remote-as 1
network 92.0.0.0
network 93.0.0.0
AS1
A
EBGP
92.0.0.0
93.0.0.0
B
AS2
127
Advertising Networks
By redistributing Static Routes
11.0.0.0
12.0.0.0
A
AS1
Router A
router bgp 1
neighbor 1.1.1.2 remote-as 2
redistribute static
ip route 11.0.0.0 255.0.0.0 null 0
ip route 12.0.0.0 255.0.0.0 null 0
EBGP
92.0.0.0
93.0.0.0
B
AS2
128
Advertising Networks
By Redistributing Dynamic Routes
11.0.0.0
12.0.0.0
A
AS1
Router A
router bgp 1
neighbor 1.1.1.2 remote-as 2
redistribute ospf 1
EBGP
92.0.0.0
93.0.0.0
router ospf 1
network 11.0.0.0 0.255.255.255 area 0
B
AS2
129
BGP Attributes
 AS-path
 Next-hop
 Local
preference
 MED
 Origin
 Communities
130
BGP Attributes
AS-Path
traversed one or
more members of a set
{1880, 1881, 1882}
(as-set)
A list of AS’s that a route
has traversed
1880 1883 (sequence)
Shortest AS path preferred
1883
193.0.32/24
Path
1880
193.0.34/24
1881
193.0.33/24
1882
193.0.35/24
193.0.33/24 1880 1881
193.0.34/24 1880
193.0.35/24 1880 1882
193.0.32/22 1880 1983
131
BGP Attributes
Multi-Exit Discriminator (MED)
690
1883
1755
200
1880
209
Preference
sent to all routers in remote AS
Where do I want to receive the traffic ?
132
Multi-Exit Discriminator (MED)
Indication
to external peers of the
preferred path into an AS.
Affects routes with same AS path.
Advertised to external neighbors
Usually based on IGP metric
* Lowest MED preferred
133
MED Attribute (2)



The MED (multi-exit discriminator) is a
commonly used attribute. It comes after the
AS_PATH in evaluation, and thus isn’t quite as
much of a “hammer” as local-pref.
Commonly, MED is used to tack a distance on
BGP routes as they move within your network.
NSPs advertise MEDs to each other to let it be
known which POP the route is “closest” to.
134
BGP Attributes
Local Preference
690
1755
1880
A
Needs to go to 690
666
Preference
sent to all routers in local AS
Where do I want traffic to leave?
102
NW’98
135
© 1998, Cisco Systems, Inc.
135
Local Preference Attribute
AS 3847
F
G
E
C
208.1.1.0/24



*
D
80
Local to AS
Used to influence BGP
path selection
Default 100
Highest local-pref preferred
208.1.1.0/24
100
Preferred by all
AS3847 routers
A
B
208.1.1.0/24
AS 6201
136
Local-Pref Attribute (2)
 An
often-used attribute, local-pref
(normally 100) overrides AS_PATH, and
is transitive throughout your network.
It is never advertised to an eBGP peer.
 For example, you can express the policy
“prefer private interconnects” by
making the local_pref be 150 and
leaving all other peers at 100.
 Best used as an intermediate-level
knob.
137
The BGP Path Decision Algorithm

BGP determines the best path to each destination for a BGP
speaker by comparing path attributes according to the
following selection sequence:
Select a path with a reachable next hop.
2. Select the path with the highest weight.
3. If path weights are the same, select the path with the highest
local preference value.
4. Prefer locally originated routes (network routes, redistributed
routes, or aggregated routes) over received routes.
5. Select the route with the shortest AS-path length.
6. If all paths have the same AS-path length, select the path based
on origin: IGP is preferred over EGP; EGP is preferred over
Incomplete.
7. If the origins are the same, select the path with lowest MED value.
8. If the paths have the same MED values, select the path learned via
EBGP over one learned via IBGP.
9. Select the route with the lowest IGP cost to the next hop.
10. Select the route received from the peer with the lowest BGP
router ID.
1.
138
Common Internet Routing Phenomemon
 E-BGP
Route Flapping/Oscillation
 Remedy: Route Flap Dampening
139
BGP - Route Flapping


Routing instability
Routes disappear, appear again, then
disappear


Visible to the Internet


Withdrawal, announcement, withdrawal,
announcement
Waste resources
Some causes of route flapping





Flaky inter-AS links
Flaky or insufficient hardware
Link congestion
IGP instability
Operator error
140
BGP – Route Flap Dampening

If you are running BGP version 4, the BGP process assigns a penalty of
1000 to the route each time it flaps. When the penalty value exceeds
the first of two limits (Re-use limit, Suppress limit), the route is
moved into the 'historical' list of routes, dampened, and then is no
longer accepted from other peers or announced to any peers. After the
first limit has been exceeded, the timer which tracks the period for
which the route is to be dampened is doubled for each flap.

The suppression half-life is 15 minutes. The maximum suppress limit
is four times the half-life; thus, one hour is the default. The
suppression penalty decays at half the half life (7.5 minutes). So:
1.
2.
3.
4.
First flap, penalty 1000 assigned, route placed in 'historical' category and becomes less
preferred.
Second flap, route has met the suppression limit of 2000 (a Cisco default). The route is
dampened and no longer advertised to neighbors or accepted from neighbors.
If route does not flap any further the penalty is decayed. The decay process begins 7.5
minutes after the route stabilized and decays exponentially every 5 seconds thereafter.
Once the suppression penalty decays below 750 (the default value for the reuse
threshold), the route is removed from dampened state and reused. The router parses
the historical routes list every 10 seconds for reusable routes.
141
Route Flap Dampening - Operation
142
Useful Tool To Understand BGP
Peering Relationships
www.netlantis.org
143
UDP, TCP/IP - Internet
Protocols
UDP: User Datagram Protocol
UDP Header
146
What is UDP?
 Relatively
simple compared to TCP
 UDP provides connectionless service for
data delivery between two hosts
 No
logical connection is established by UDP, so
no connection-oriented services are supplied
 The application may need some of those
services (e.g., no corrupt packets), so the
application is responsible to provide them
 Applications
that use UDP: tftp, DNS (for
some functions), SNMP, RIP, VoIP
147
Why Use UDP over TCP?

No connection establishment. As we shall discuss in Section 3.5, TCP uses a three-way handshake
before it starts to transfer data. UDP just blasts away without any formal preliminaries. Thus UDP does
not introduce any delay to establish a connection. This is probably the principle reason why DNS runs
over UDP rather than TCP -- DNS would be much slower if it ran over TCP. HTTP uses TCP rather than
UDP, since reliability is critical for Web pages with text. But, as we briefly discussed in Section 2.2, the
TCP connection establishment delay in HTTP is an important contributor to the "world wide wait".

No connection state. TCP maintains connection state in the end systems. This connection state
includes receive and send buffers, congestion control parameters, and sequence and acknowledgment
number parameters. We will see in Section 3.5 that this state information is needed to implement TCP's
reliable data transfer service and to provide congestion control. UDP, on the other hand, does not
maintain connection state and does not track any of these parameters. For this reason, a server devoted
to a particular application can typically support many more active clients when the application runs over
UDP rather than TCP.

Small segment header overhead. The TCP segment has 20 bytes of header overhead in every
segment, whereas UDP only has 8 bytes of overhead.

Unregulated send rate. TCP has a congestion control mechanism that throttles the sender when one
or more links between sender and receiver becomes excessively congested. This throttling can have a
severe impact on real-time applications, which can tolerate some packet loss but require a minimum
send rate. On the other hand, the speed at which UDP sends data is only constrained by the rate at
which the application generates data, the capabilities of the source (CPU, clock rate, etc.) and the access
bandwidth to the Internet. We should keep in mind, however, that the receiving host does not
necessarily receive all the data - when the network is congested, a significant fraction of the UDPtransmitted data could be lost due to router buffer overflow. Thus, the receive rate is limited by network
congestion even if the sending rate is not constrained.
148
UDP Encapsulation
149
UDP Header
150
Computing the UDP Checksum
151
IP Fragmentation
152
Other UDP Uses
 Path
MTU Discovery
 Using
Traceroute
 Using UDP
 Max
UDP datagram size
 ICMP Source Quench
153
TCP: Transmission Control
Protocol
TCP Header
155
What is TCP?

Relatively complex compared to UDP

TCP provides connection-oriented service for data delivery
between two hosts

Client and server establish a logical TCP connection before exchanging
data

TCP segments flow over the network in IP packets (which are
connectionless) so that the logical TCP connection can be maintained
over a changing physical path

Connections are full-duplex

Timers are used to maintain connections

TCP relies on IP to provide hop-by-hop routing and error detection

Applications that use TCP: telnet, ftp, http, many others
156
TCP Logical Connections/Ports
 TCP
and UDP introduce the concept of
ports
 Common ports and the services that run
on them:
 FTP
 telnet
 SMTP
 http
 POP3
21 and 20
23
25
80
110
 Multiple
ports/logical connections can be
supported
157
TCP Header Fields and Other Info

Each connection uniquely identified by combination of src IP, dest IP, src
port, and dest port


socket = IP + port (e.g., 10.1.1.1.23)
Sequence numbers

The sequence number of the first data octet in this segment (except when SYN is present).
If SYN is present the sequence number is the initial sequence number (ISN) and the first
data octet is ISN+1.

Acknowledgements

If the ACK control bit is set this field contains the value of the next sequence number the
sender of the segment is expecting to receive. Once a connection is established this is
always sent.
Header Length: Length of header in bytes

Flag bits: Provides connection-oriented service







The SYN and Fin flags are used when establishing and terminating a TCP connection,
respectively.
The ACK flag is set any time the Acknowledgement field is valid, implying that the
receiver should pay attention to it.
The URG flag signifies that this segment contains urgent data. When this flag is set, the
UrgPtr field indicates where the non-urgent data contained in this segment begins.
The PUSH flag signifies that the sender invoked the push operation, which indicates to
the receiving side of TCP that it should notify the receiving process of this fact.
Finally, the RESET flag signifies that the receiver has become confused and so wants to
abort the connection.
Window size

The number of data octets beginning with the one indicated in the acknowledgment field
which the sender of this segment is willing to accept.
158
TCP Options
159
TCP Encapsulation
160
TCP Functionalities
Connection-Orient
Identifies traffic flow by some identifier rather than by explicitly listing source and destination addresses
Stream Data Transfer
From the application's viewpoint, TCP transfers a contiguous stream of bytes. TCP does this by grouping the bytes in
TCP segments, which are passed to IP for transmission to the destination. TCP itself decides how to segment the data
and it may forward the data at its own convenience.
Reliability
TCP assigns a sequence number to each byte transmitted, and expects a positive acknowledgment (ACK) from the
receiving TCP. If the ACK is not received within a timeout interval, the data is retransmitted. The receiving TCP uses
the sequence numbers to rearrange the segments when they arrive out of order, and to eliminate duplicate segments.
Flow Control
The receiving TCP, when sending an ACK back to the sender, also indicates to the sender the number of bytes it can
receive beyond the last received TCP segment, without causing overrun and overflow in its internal buffers. This is
sent in the ACK in the form of the highest sequence number it can receive without problems.
Logical Connections
The reliability and flow control mechanisms described above require that TCP initializes and maintains certain status
information for each data stream. The combination of this status, including sockets, sequence numbers and window
sizes, is called a logical connection. Each connection is uniquely identified by the pair of sockets used by the sending
and receiving processes.
Multiplexing
To allow for many processes within a single host to use TCP communication facilities simultaneously, the TCP
provides a set of addresses or ports within each host. Concatenated with the network and host addresses from the
internet communication layer, this forms a socket. A pair of sockets uniquely identifies each connection.
Full Duplex
TCP provides for concurrent data streams in both directions.
161
Important Factors That Affect TCP
(Application) Performance

Link BW (window size), network delay (RTT) and MTU size (Bit Error Rate)

How a receiver/sender implements Acknowlegment scheme



Speed at which received data is processed and ACKed at destination
(sender/receiver buffer size)

Maximum TCP Buffer (Memory) space for use by any TCP connection

Socket Buffer Sizes for individual TCP connection
Ability to actively manage speed mismatch between sender and receiver by
regulating how much can/should be sent without getting into a congestion
state


Window size based on Bandwidth Delay Product
Ability to actively detect and prevent dynamic congestion and re-act to it


Stop-N-Wait, Go-back-N, Selective Ack
Connection timeout, slow start, back-off strategies
Ability for sender to maximize/improve BW/network throughput

TCP Large Window Scaling
162
TCP Connection-Oriented Protocol:
TCP Connection Establishment
and Termination
Passive and Active Ports

TCP enables two methods to establish a
connection: active and passive. An active connection
establishment happens when TCP issues a request for
the connection, based on an instruction from an upperlevel protocol that provides the socket number. A passive
approach takes place when the upper-level protocol
instructs TCP to wait for the arrival of connection
requests from a remote system (usually from an active
open instruction). When TCP receives the request, it
assigns a port number. This enables a connection to
proceed rapidly, without waiting for the active process.
164
Connection Establishment
165
Connection Termination
166
MSS: Maximum Segment Size
167
168
169
TCP Half-Close
170
TCP State Diagram
171
States: Establishment and Termination
172
TCP Reset
173
Simultaneous Open
For example: An application at host A uses 7777 as the local port and connects
to port 8888 on host B. At the same time, an application at host B uses 8888 as
the local port and connects to port 7777 on host A. This is "Simultaneous Open".
Here is another example: The Telnet client at host A connects to the Telnet server
at host B. At the same time, the Telnet client at host B connects to the Telnet server
at host A. Be careful. This time, it's not "Simultaneous Open" because the two
Telnet servers on both sides do "passive open" instead of "active open". There are
actually two TCP connections, instead of one in "Simultaneous Open".
174
Simultaneous Close
175
TCP Provides a Byte-Stream Service
TCP is a byte-oriented protocol, which means the sender writes bytes into a TCP
connection and the receiver reads bytes out of the TCP connection. Although ``bytestream'' describes the service TCP offers to application processes, TCP does not, itself,
transmit individual bytes over the Internet. Instead, TCP on the source host buffers
enough bytes from the sending process to fill a reasonably sized packet, and then sends
this packet to its peer on the destination host. TCP on the destination host then empties
the contents of the packet into a receive buffer, and the receiving process reads from
this buffer at its leisure. This situation is illustrated in figure below, which for
simplicity, shows data flowing in only one direction. In general, remember, a single TCP
connection supports byte-streams flowing in both directions.
176
TCP Provides a Byte-Stream Service
 It
is a streaming protocol
 No
“record markers” inserted into data stream
 Writes
on one end and reads on the other
are independent of each other
 Ex:
data could be written in a sequence of 10 bytes,
then 20 bytes, then 50 bytes
 That data could be read as 4 x 20 byte
 No
interpretation of the application data
 Very
similar to Unix kernel’s treatment of files in a
filesystem
177
TCP Provides Reliability
 TCP
segments are sized for the application
 Segments must be acknowledged
 TCP checksum on header and data
 Out-of-sequence IP packets can be reordered
 Receiving TCP must discard duplicate
packets
 Flow control is employed to manage finite
buffer space
178
TCP Flow Control Algorithms

Sliding Window
 Regular Sliding Window
 The BW Delay Product (BWDP)
 Window Scaling option
 The Silly Window syndrome
 Tinygram Congestion Prevention: The Nagle algorithm

TCP Timeout
 RTT-Timeout Calculation

Acknowlegement schemes
 Stop-N-Wait, Go-back-N, Selective Ack schemes
 Ambiguous Acknowledgements: The Karn’s Algorithm

Congestion Avoidance
 Slow Start
179
End-to-end flow control
 Problem
 Sender
can send more traffic that receiver can
handle. (Too fast)
 Solution
 variable
sliding window protocol
each acknowledgement, which specifies how
many octets have been received, contains a
window advertisement that specifies how
many additional octets receiver are prepared to
accept.
180
Variable Window Size
…
…
Window Advertisement
Receiver
Transmitter
Transmitter Window Size
Value of
Window Advertisement
Free space in buffer to fill
increase
bigger
increase
decrease
smaller
decrease
Stop transmissions
0
full
181
Sliding window protocol in TCP
 TCP
allows the window size to vary over
time.
 Window size changes at the time it slides
forward.
 Advantage: it provides flow control as well
as reliable transfer.
182
TCP Sliding Window Algorithm
 Flow
control with the use of Window
concept
 via
specifying an acceptable range of
sequence numbers
183
Sliding Windows
Advertized Window
Available Window
184
Sliding Window: Details
Sender
Max ACK received
Receiver
Next expected
Next seqnum
…
…
…
…
Sender window
Sent & Acked
Sent Not Acked
OK to Send
Not Usable
Max acceptable
Receiver window
Received & Acked
Acceptable Packet
Not Usable
185
Window Flow Control: Header
Packet Received
Packet Sent
Source Port
Dest. Port
Source Port
Dest. Port
Sequence Number
Sequence Number
Acknowledgment
Acknowledgment
HL/Flags
Window
HL/Flags
Window
D. Checksum
Urgent Pointer
D. Checksum
Urgent Pointer
Options..
Options..
App write
acknowledged
sent
to be sent outside window
186
Sliding Windows Example - A Dynamic Parameter
187
Window Size – A Dynamic Parameter
188
How To Determine Optimal Window Size
(Keeping the Pipe Full): The Bandwidth-Delay
Product (BDP)
189
The TCP Window Scaling option

The original TCP specification included a window size no larger
than 64 KB. This limitation was introduced by the 16 bit header that
specified window size. To achieve the recommended 1 MB window size, TCP
extensions must be enabled to add another 14 bits to the window
size,making the total bist equal to 30 for window size

The TCP window scaling option works by including a scale factor in
a SYN packet imbedded in the TCP OPTIONS field. This scale factor
informs the receiver that the sender is willing to do window scaling and
offers a scale factor for the communication. The scale factor is used to shift
the window field before the data segment is sent.

It's important to note that the window size used in the actual 3way handshake is NOT the window size that is scaled. This means
that the first data packet sent after the 3-way handshake is the actual
window size. If there is a scaling factor, the initial window size of 65,535
bytes is always used. The window size is then multiplied by the scaling
factor identified in the 3-way handshake. The table below represents the
scaling factor boundaries for various window sizes.
190
The TCP Window Scaling Option
Scale factor Scale Value Initial
Window
0
1
65535 or
less
1
2
65535
2
4
65535
3
8
65535
Window
Scaled
65535 or
less
131,070
262,140
524,280
4
5
6
65535
65535
65535
1,048,560
…
…
65535
65535
…
1,073,725,4
40
…
14
16
16384
191
TCP Options – Window Scale Factor
192
The TCP Receiver Silly Window Syndrome

Problems Associated With "Shrinking" The TCP Window - when the receiver is much slower than
the sender (ex below where receiver can only process 1 out of 3 received packets)

This diagram shows one example of how the phenomenon known as TCP silly window syndrome can
arise. The client is trying to send data as fast as possible to the server, which is very busy and
cannot clear its buffers promptly. Each time the client sends data the server reduces its receive
window. The size of the messages the client sends shrinks until it is only sending very small,
inefficient segments.
193
Receiver SWS Avoidance

Let's start with SWS avoidance by the receiver. As we saw in the initial
example above, the receiver contributed to SWS by reducing the size of its receive
window to smaller and smaller values due its being busy. This caused the right edge
of the sender's send window to move by ever-smaller increments, leading to smaller
and smaller segments. To avoid SWS, we simply make the rule that the receiver may
not update its advertised receive window in such a way that this leaves too little
usable window space on the part of the sender. In other words, we restrict the
receiver from moving the right edge of the window by too small an amount. The
usual minimum that the edge may be moved is either the value of the MSS
parameter, or one-half the buffer size, whichever is less.

Let's see how we might use this in the example above. When the server
receives the initial 360-byte segment from the client and can only process 120 bytes,
it does not reduce the window size to 120. It reduces it all the way to 0, closing the
window. It sends this back to the client, which will then stop and not send a small
segment. Once the server has removed 60 more bytes from the buffer, it will now
have 180 bytes free, half the size of the buffer. It now opens the window up to 180
bytes in size and sends the new window size to the client.

It will continue to only advertise either 0 bytes, or 180 or more, not
smaller values in between. This seems to slow down the operation of TCP, but it
really doesn't. Because the server is overloaded, the limiting factor in overall
performance of the connection is the rate at which the server can clear the buffer. We
are just exchanging many small segments for a few larger ones.
194
Tinygram Congestion Prevention:
The Nagle algorithm (Sender SWS)

Nagle’s algorithm, named after John Nagle, is a means of improving the
efficiency of TCP/IP networks by reducing the number of packets that
need to be sent over the network.

Telnet example test over a long-haul link with a 5-second round trip time.
User sends 25 bytes. Without any mechanism to prevent small-packet (tinygram)
congestion 25 new packets would be sent in 5 seconds in accordance with the
delayed ACK algorithm, meaning delay up to 200 ms. Amount data to be sent is
41x25 bytes (20 bytes for IP header, 20 bytes for TCP header). Overhead here is
4000%.

With Nagle algorithm however, the first character from the user would be
sent immediately. The next 24 characters, arriving from the user at 200ms
intervals. When an ACK arrived for the first packet at the end of 5 seconds, a single
packet with the 24 queued characters would be sent, i.e. 41x2 +25 bytes would be
sent total. Overhead is only 320% with no penalty in response time.

The Nagle Algorithm is useful on a slow WAN when it is desired to reduce
the tinygram congestion.

Sometimes the Nagle algorithm needs to be turned off. For example, in X
Window System server small messages (mouse movements) must be
delivered without delay to provide real-time feedback for interactive user.
195
Key TCP Concepts
 Modern
TCP implementations incorporate
a set of SWS avoidance algorithms. When
receiving, devices are programmed not to
advertise very small windows, waiting instead
until there is enough room in the buffer for one
of a reasonable size. Transmitters use Nagle’s
algorithm to ensure that small segments are not
generated when there are unacknowledged
bytes outstanding.
196
TCP Timers

The Retransmission Timer
 The retransmission timer manages retransmission timeouts (RTOs), which occur
when a preset interval between the sending of a datagram and the returning
acknowledgment is exceeded. The value of the timeout tends to vary, depending
on the network type, to compensate for speed differences. If the timer expires,
the datagram is retransmitted with an adjusted RTO, which is usually increased
exponentially to a maximum preset limit. If the maximum limit is exceeded,
connection failure is assumed, and error messages are passed back to the upperlayer application.


Values for the timeout are determined by measuring the average time that data
takes to be transmitted to another machine and the acknowledgment received
back, which is called the round-trip time, or RTT. From experiments, these RTTs
are averaged by a formula that develops an expected value, called the smoothed
round-trip time, or SRTT. This value is then increased to account for unforeseen
delays.
The Delayed ACK Timer
 TCP uses delayed acknowledgments to reduce the number of packets that are
sent on the media. Instead of sending an acknowledgment for each TCP
segment received, TCP takes a common approach to implementing delayed
acknowledgments. As data is received by TCP on a particular connection, it
sends an acknowledgment back only if one of the following conditions is true:


No acknowledgment was sent for the previous segment received.
A segment is received, but no other segment arrives within 200 milliseconds for that
connection.
197
TCP Timers (Continued)

The Quiet Timer
 After a TCP connection is closed, it is possible for datagrams that are still making
their way through the network to attempt to access the closed port. The quiet
timer is intended to prevent the just-closed port from reopening again quickly
and receiving these last datagrams.


The quiet timer is usually set to twice the maximum segment lifetime (the same
value as the Time to Live field in an IP header), ensuring that all segments still
heading for the port have been discarded. Typically, this can result in a port
being unavailable for up to 30 seconds, prompting error messages when other
applications attempt to access the port during this interval.
The Keep-Alive Timer and the Idle Timer
 Both the keep-alive timer and the idle timer were added to the TCP specifications
after their original definition. The keep-alive timer sends an empty packet at
regular intervals to ensure that the connection to the other machine is still active.
If no response has been received after sending the message by the time the idle
timer has expired, the connection is assumed to be broken.

The keep-alive timer value is usually set by an application, with values ranging
from 5 to 45 seconds. The idle timer is usually set to 360 seconds.
198
RTT and TCP Retransmission Timeouts (RTO)

The RTO is typically calculated based on the RTT

The original TCP specification had TCP update a smoothed
RTT estimator (called R) using the low-pass filter


R <- aR + (1-a)M, M is the measured RTT

where a is a smoothing factor with a recommended value of 0.9.
This smoothed RTT is updated every time a new measurement is
made. Ninety percent of each new estimate is from the previous
estimate and 10% is from the new measurement.
Given this smoothed estimator, which changes as the RTT
changes, RFC 793 recommended the retransmission timeout
value (RTO) be set to RTO = Rb

where b is a delay variance factor with a recommended value of
2.
199
TCP Performance Enhancement - Congestion
Avoidance Using Slow Start
Cwnd = congestion window
Some implementation
recommends cwnd = 4
as the initial value to
improve throughput
200
The TCP Delayed ACK timer
Delayed ACK timer = 200 msec
by default
201
TCP Reliability – Use of
Acknowledgement
 TCP
is responsible for data recovery
 by
providing a sequence number with each
packet that it sends
 TCP
requires ACK (acknowledgement)
to ensure correct data is received
 Packet can be retransmitted if error
detected
 Three ACK schemes available which will
be discussed at length later
202
TCP Acknowledgement Schemes
 Stop-and-Wait
: individual ACK is required
for each segment received
 Go-Back-N
: cumulative ACK required for
consecutive segments received
 Selective-Ack:
selectively ACK only
segments that are in error as part of the
consecutive segments received
203
Stop-And-Wait ACK Scheme:
TCP Interactive Data Flow
Interactive Data: character/echo
205
Delayed Acknowledgements
Delayed ACK timer = 200 msec
by default
206
Regular Acknowledgements
207
Stop-And-Wait ACK Scheme – Loss Packet
Scenario
208
Stop-And-Wait ACK Scheme – Loss ACK
Scenario
209
Performance of Stop-And-Wait ACK Scheme
It works, but performance stinks
 Example: 1 Gbps link, 15 ms e-e prop. delay, 1KB
packet:

Ttransmit =
U



L (packet length in bits)
8kb/pkt
=
= 8 microsec
R (transmission rate, bps)
10**9 b/sec
=
sender
L/R
RTT + L / R
=
.008
30.008
= 0.00027
microsec
onds
U sender: utilization – fraction of time sender busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps
link
network protocol limits use of physical resources!
210
Ambiguous ACK – Loss ACK Scenario
211
Ambiguous Acknowledgement - The Karn Algorithm

Rule 1: Ignore measured RTT for retransmitted packets.
This removes ambiguity from RTT measurements.

Rule 2: RTO should be doubled after retransmission.
This is called "Exponential Back-off".

A problem occurs when a packet is retransmitted. Say a packet is transmitted, a
timeout occurs, the RTO is backed off, the packet is retransmitted with the longer
RTO, and an acknowledgment is received. Is the ACK for the first transmission or the
second? This is called the retransmission ambiguity problem.

[Karn and Partridge 1987] specify that when a timeout and retransmission occur, we
cannot update the RTT estimators when the acknowledgment for the retransmitted
data finally arrives. This is because we don't know to which transmission the ACK
corresponds. (Perhaps the first transmission was delayed and not thrown away, or
perhaps the ACK of the first transmission was delayed.)

Also, since the data was retransmitted, and the exponential backoff has been applied
to the RTO, we reuse this backed off RTO for the next transmission. Don't calculate a
new RTO until an acknowledgment is received for a segment that was not
retransmitted.
212
TCP Fast Retransmission

Fast Retransmission: When TCP detects segment loss
using retransmission timer, then ssthresh is set to half of
CWND. i.e. ( ssthresh = CWND / 2 ) and CWND is set
to 1-full size segment. Now instead of waiting for
retransmission timer to get off , TCP detects packet loss
by looking for packet re-ordering and retransmit the lost
packet. This scheme is called as " Fast Retransmit". In
this algorithm , TCP receiver sends an immediate
duplicate ACK on out-of-order segment arrival. The
other end TCP deduce from small number ( normally 3 )
of consecutive duplicate ACKs that the segment has
been lost and deduces the starting sequence number of
missing segment. The missing segment is retransmitted.
213
Go-Back-N ACK Scheme:
TCP Bulk Data Flow
Motivation: Pipelining Increases Utilization
sender
receiver
first packet bit transmitted, t = 0
last bit transmitted, t = L / R
first packet bit arrives
last packet bit arrives, send ACK
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
RTT
ACK arrives, send next
packet, t = RTT + L / R
Increase utilization
by a factor of 3
U
sender
=
3*L/R
RTT + L / R
=
.024
30.008
= 0.0008
microsecon
ds
215
Go-Back-N (GBN) - Sender
Sender:


k-bit seq # in pkt header
“sliding window” of up to N, consecutive unack’ed pkts
allowed
ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
 may receive duplicate ACKs (see receiver)
 The local node must also keep a buffer of all PDUs which have been sent,
but have not yet been acknowledged
 Timer for each in-flight pkt
 Timeout(n): retransmit pkt n and all higher seq # pkts in window

216
GBN: Receiver
ACK-only: always send ACK for correctly-received
pkt with highest in-order seq #



may generate duplicate ACKs
need only remember expectedseqnum
out-of-order pkt:


discard (don’t buffer) -> no receiver buffering!
Re-ACK pkt with highest in-order seq #
217
GBN In Action
218
GBN – How To Handle Packet Loss
Example of Go-Back-N. The sender in this example transmits four PDUs (1-4) and
the first one (1) of these is not successfully received. The receiver notes that it was
expecting a PDU numbered 1 and actually receives a PDU numbered 2. It therefore
deduces that (1) was lost. It requests retransmission of the missing PDU by sending
a Go-Back-N request (in this case N=1), and discards all received PDUs with a
number greater than 1.The sender receives the Go-Back-N request and retransmits
the missing PDU (1), followed by all subsequently sent PDUs (2-4) which the
receiver the correctly receives and acknowledges.
219
GBN: ACK of Multiple Segments
220
GBN: ACK of Multiple Segments –
Slight Differences
221
GBN: Fast Sender, Slow Receiver
222
Limitations With Stop-And-Wait and
Go-Back-N Ack Schemes
 Poor
performance when multiple packets
are lost from a window of data
 Cumulative ACKs provide limited
information
 Aggressively re-transmitting packets is
inefficient
 Selective-ACK solves these problems
223
Download