Ch3 Notes

ECE/CS 372 – introduction to computer networks
Lecture 7
Announcements:
 HW1 is due today
 LAB2 is due tomorrow
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 1
Chapter 3: Transport Layer
Our goals:
• understand principles behind transport layer services:
  • reliable data transfer
  • flow control
  • congestion control
• learn about transport layer protocols in the Internet:
  • UDP: connectionless transport
  • TCP: connection-oriented transport
  • TCP congestion control
Chapter 3, slide: 2
Transport services and protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  • Internet: TCP and UDP
[Figure: protocol stacks (application, transport, network, data link, physical) on the two end hosts]
Chapter 3, slide: 3
Transport vs. network layer
network layer: logical communication between hosts
 transport layer: logical communication between processes

Household case:

12 kids (East coast house)
sending letters to 12 kids
(West coast house)
 Ann is responsible for the
house at East coast
 Bill is responsible for the
house at West coast
 Postal service is
responsible for between
houses
Household analogy:
• kids = processes
• letters = messages
• houses = hosts
• home address = IP address
• kid names = port numbers
• Ann and Bill = transport protocol
• postal service = network-layer protocol
Chapter 3, slide: 4
Internet transport-layer protocols
• reliable, in-order delivery (TCP)
  • congestion control
  • flow control
  • connection setup
• unreliable, unordered delivery: UDP
  • no-frills extension of “best-effort” IP
• services not available:
  • delay guarantees
  • bandwidth guarantees
[Figure: the two end hosts run all five layers (application, transport, network, data link, physical); the routers along the path run only network, data link, and physical]
Chapter 3, slide: 5
UDP: User Datagram Protocol [RFC 768]
• “best effort” service, UDP segments may be:
  • lost
  • delivered out of order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others
Why is there a UDP?
• less delay: no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• less traffic: small segment header
• no congestion control: UDP can blast away as fast as desired
Chapter 3, slide: 6
UDP: more
• often used for streaming multimedia apps
  • loss tolerant
  • rate sensitive
• other UDP uses
  • DNS
  • SNMP
• reliable transfer over UDP: add reliability at application layer
  • application-specific error recovery!
UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  length | checksum
  application data (message)
length: length, in bytes, of UDP segment, including header
Chapter 3, slide: 7
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: 1’s complement of the addition (1’s complement sum) of segment contents
• sender puts checksum value into UDP checksum field
Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? ….
Chapter 3, slide: 8
Internet Checksum Example
• Note: when adding numbers, a carryout from the most significant bit needs to be added to the result
• Example: add two 16-bit integers
               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
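The wraparound addition in this example can be sketched in a few lines of Python. This is a minimal illustration, not a full UDP implementation: the function names are made up for this note, and it works on a list of 16-bit integers rather than on a real segment with its pseudo-header. The two inputs are the 16-bit integers 1110011001100110 and 1101010101010101.

```python
def ones_complement_sum16(words):
    """Add 16-bit words; any carry out of the top bit is wrapped around and added back."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wraparound carry
    return total


def internet_checksum(words):
    """Checksum = bitwise complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


def verify(words, checksum):
    """Receiver-side check: the words plus the checksum must sum to all ones."""
    return ones_complement_sum16(list(words) + [checksum]) == 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
s = ones_complement_sum16([a, b])
c = internet_checksum([a, b])
print(format(s, "016b"))  # the "sum" row
print(format(c, "016b"))  # the "checksum" row
```

Flipping a single bit of either word makes `verify` return False, which is the "error detected" branch at the receiver.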
Chapter 3, slide: 9
Chapter 3 outline
• Principles of reliable data transfer
• Connection-oriented transport: TCP
• Principles of congestion control
• TCP congestion control
Chapter 3, slide: 10
Principles of Reliable data transfer
 important in app., transport, link layers
 top-10 list of important networking topics!
 characteristics of unreliable channel will determine
complexity of reliable data transfer protocol (rdt)
Chapter 3, slide: 11
Reliable data transfer: getting started
send side:
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer
Chapter 3, slide: 14
Reliable data transfer: getting started
We will:
 incrementally develop sender, receiver sides of
reliable data transfer protocol (rdt)
 consider only unidirectional data transfer
 use finite state machines (FSM) to specify
sender, receiver
state: when in this “state”, next state is uniquely determined by next event
[Figure: FSM notation. The arrow from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the state transition]
Chapter 3, slide: 15
rdt1.0: reliable transfer over a reliable channel
• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel
sender FSM (single state "Wait for call from above"):
  rdt_send(data) -> packet = make_pkt(data); udt_send(packet)
receiver FSM (single state "Wait for call from below"):
  rdt_rcv(packet) -> extract(packet,data); deliver_data(data)
Chapter 3, slide: 16
rdt2.0: channel with bit errors
• underlying channel may flip bits in packet
  • receiver can detect bit errors (e.g., use checksum)
  • but still no packet loss
• questions: (1) how does sender know and (2) what does it do when packet is erroneous:
  • acknowledgements:
    • positive ack (ACK): receiver tells sender that pkt received OK
    • negative ack (NAK): receiver tells sender that pkt had errors
  • retransmission: sender retransmits pkt on receipt of NAK
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK,NAK) rcvr->sender
  • assume ACK/NAK are error free
Chapter 3, slide: 17
rdt2.0: operation with no errors
sender FSM:
  state "Wait for call from above":
    rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
  state "Wait for ACK or NAK":
    rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call from above"
receiver FSM:
  state "Wait for call from below":
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
Chapter 3, slide: 19
rdt2.0: error scenario
[Figure: same sender and receiver FSMs as on the no-error slide. In the error case the receiver takes the corrupt(rcvpkt) transition and sends a NAK, and the sender takes the isNAK(rcvpkt) transition and calls udt_send(sndpkt) again]
Chapter 3, slide: 20
rdt2.0 has a fatal flaw!
What happens if ACK/NAK is corrupted? That is, sender receives garbled ACK/NAK
• sender doesn’t know what happened at receiver!
• can’t sender just retransmit?
  • Sure: sender retransmits current pkt if ACK/NAK garbled
  • Any problem with this ??
Problem: duplicate
• Receiver doesn’t know whether received pkt is a retransmit or a new pkt
Handling duplicates:
• sender adds sequence number to each pkt
• receiver discards (doesn’t deliver up) duplicate pkt
stop and wait
• Sender sends one packet, then waits for receiver response
Chapter 3, slide: 21
rdt2.1: sender, handles garbled ACK/NAKs
sender FSM (four states):
  state "Wait for call 0 from above":
    rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
  state "Wait for ACK or NAK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call 1 from above"
  state "Wait for call 1 from above":
    rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
  state "Wait for ACK or NAK 1":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call 0 from above"
Chapter 3, slide: 22
rdt2.1: receiver, handles garbled ACK/NAKs
receiver FSM (two states):
  state "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay
  state "Wait for 1 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay
Chapter 3, slide: 23
rdt2.1: discussion
Sender:
• seq # added to pkt
• two seq. #’s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver can not know if its last ACK/NAK received OK at sender
Chapter 3, slide: 24
rdt2.2: a NAK-free protocol
 do we really need NAKs??
 instead of NAK, receiver sends ACK for last pkt
received OK

receiver must explicitly include seq # of pkt being ACKed
 duplicate ACK at sender results in same action as
NAK: retransmit current pkt
 rdt2.2: same functionality as rdt2.1, using ACKs only
Chapter 3, slide: 25
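A rough sketch of the NAK-free receiver described above, in Python. The class and method names are invented for this note, and the choice to answer a corrupted first packet with an ACK for seq # 1 is an assumption (the slide does not say what the receiver does before any packet has been received OK).

```python
class Rdt22Receiver:
    """Stop-and-wait receiver that replies with ACK + seq # instead of NAK (hypothetical helper)."""

    def __init__(self):
        self.expected = 0     # seq # (0 or 1) of the next new packet
        self.delivered = []   # data passed up to the application

    def on_packet(self, seq, data, corrupt=False):
        """Handle one arriving packet; return the seq # carried by the ACK sent back."""
        if not corrupt and seq == self.expected:
            self.delivered.append(data)
            self.expected ^= 1   # flip 0 <-> 1
            return seq           # ACK the packet just received OK
        # Corrupt or duplicate packet: re-ACK the last packet received OK.
        # The sender sees a duplicate ACK and retransmits the current pkt.
        return self.expected ^ 1


rx = Rdt22Receiver()
acks = [
    rx.on_packet(0, "a"),                # new pkt 0
    rx.on_packet(1, "b", corrupt=True),  # garbled pkt 1
    rx.on_packet(1, "b"),                # retransmitted pkt 1
    rx.on_packet(1, "b"),                # duplicate of pkt 1
]
print(acks, rx.delivered)
```

The second ACK repeats seq # 0, which plays the role the NAK played in rdt2.1, and the duplicate at the end is acknowledged but not delivered a second time.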
rdt3.0: channels with errors and loss
New assumption: packet may be lost: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough
• What else is needed?
Approach: timeout policy:
• sender waits “reasonable” amount of time for ACK
• retransmits if no ACK received in this time
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but use of seq. #’s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer
Chapter 3, slide: 26
ECE/CS 372 – introduction to computer networks
Lecture 8
Announcements:
 LAB2 is due today
 HW2 is posted and is due Tuesday next week
 LAB3 is posted and is due Tuesday next week
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 27
rdt3.0 in action (still stop-n-wait w/ (0,1) sn)
Chapter 3, slide: 28
Performance of rdt3.0: stop-n-wait
• rdt3.0 works, but performance stinks
• example: R = 1 Gbps, RTT = 30 ms, L = 1000 Byte packet:
[Figure: sender/receiver timing diagram. First packet bit transmitted at t = 0; last packet bit transmitted at t = L / R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet at t = RTT + L / R]
$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \cdot 10^{3} \text{ b/pkt}}{10^{9} \text{ b/sec}} = 8 \text{ microsec}$
Chapter 3, slide: 30
Performance of rdt3.0: stop-n-wait
• U_sender: utilization – fraction of time sender busy sending
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$ (both times in milliseconds)
Chapter 3, slide: 31
Performance of rdt3.0: stop-n-wait
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!
Chapter 3, slide: 32
Pipelining: increased utilization
[Figure: sender/receiver timing diagram with 3 packets in flight. First packet bit transmitted at t = 0; last bit transmitted at t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L / R]
Increase utilization by a factor of 3!
Question: What is the link utilization U_sender?
$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$ (both times in milliseconds)
Chapter 3, slide: 33
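The stop-n-wait and pipelining numbers above can be reproduced with a short Python calculation. The function name and the `n_in_flight` parameter are made up for this note. The formula is the one on the slides, generalized to N packets per RTT and capped at 1 (the cap is an addition, since a sender cannot be busy more than 100% of the time).

```python
def sender_utilization(L_bits, R_bps, rtt_s, n_in_flight=1):
    """Fraction of time the sender is busy when n_in_flight packets are sent per RTT + L/R."""
    t_transmit = L_bits / R_bps
    return min(1.0, n_in_flight * t_transmit / (rtt_s + t_transmit))


L_bits = 1000 * 8   # 1000-byte packet
R_bps = 1e9         # 1 Gbps link
rtt_s = 30e-3       # 30 ms round trip

t_transmit = L_bits / R_bps                              # 8 microseconds
u_stop_and_wait = sender_utilization(L_bits, R_bps, rtt_s)
u_pipelined = sender_utilization(L_bits, R_bps, rtt_s, n_in_flight=3)
throughput_Bps = 1000 / (rtt_s + t_transmit)             # one 1000-byte pkt per cycle

print(t_transmit, u_stop_and_wait, u_pipelined, throughput_Bps)
```

With these inputs the throughput comes out at roughly 33 kB/sec, matching the figure quoted for the 1 Gbps link.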
So far: rdt3.0
• Acknowledgment and retransmission
  • Reliable delivery
• Sequence numbers
  • Duplication detection
• Timeout and retransmit
  • Deal with packet losses
• Stop-n-wait
  • one packet at a time
• Problem: efficiency
Chapter 3, slide: 34
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-ACK’ed pkts
• What about the range of sequence numbers then??
• What about buffering at receiver??
• Two generic forms of pipelined protocols: Go-Back-N and selective repeat
Chapter 3, slide: 35
Go-Back-N
Sender side
• N pckts can be sent w/o being ACK’ed, “sliding window”
• Single timer, oldest non-ACK’ed pckt
• At timeout(n), retransmit pkt n and all successive pkts
Receiver side
• Cumulative ACK
• Discard out-of-order packets, no buffering
Chapter 3, slide: 36
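Selective repeat in action
[Figure: selective repeat timing diagram, annotated “GBN?” at three points to ask what Go-Back-N would do there instead]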
Chapter 3, slide: 37
Selective Repeat via comparison
Go-Back-N
• Sliding window
• N pckts on the fly
• Cumulative ACK
• Single timer
• Discard out-of-order pkts
• Retransmit all successive pkts at timeout
Selective Repeat
• Sliding window
• N pckts on the fly
• Selective ACK
• One timer for each pckt
• Buffer out-of-order pkts
• Retransmit only lost or delayed pkt at timeout
Chapter 3, slide: 38
Selective repeat: dilemma
Example:
• seq #’s: 0, 1, 2, 3
• window size=3
• receiver sees no difference in two scenarios! Even though,
  • (a) is a retransmit pkt
  • (b) is a new pkt
• in (a), receiver incorrectly passes old data as new
Q: what relationship between seq # size and window size to avoid duplication problem??
Chapter 3, slide: 40
Duplication problem: illustration
 Let’s assume window size W = 2 for illustration
 Consider a scenario where:
Only ACK 1 is lost
 Now let’s assume that seqNbr space n = W =2
That is, seqNbr pattern is 0,1,0,1,0,1,…
Chapter 3, slide: 41
Duplication problem: illustration
Scenario: n=W=2; only ACK 1 is lost
Receiver thinks of retransmission of 2nd Pkt (Pkt1, SN=1) as a 4th/new Pkt (Pkt3,SN=1)
 Duplication detection problem!
Chapter 3, slide: 42
Duplication problem: illustration
Recap:
 When window size W = seqNbr space n,
 there is a duplication problem
 receiver can’t tell a new transmission from a
retransmission of a lost pkt
 Let’s now see what happens if we increase n by 1:
Now n=3 and seqNbr pattern is: 0,1,2,0,1,2,0…
 Let’s revisit the scenario again
Chapter 3, slide: 43
Duplication problem: illustration
Scenario: n=3;W=2; ACK 1 is lost
Receiver still thinks of retrans. of 2nd Pkt (Pkt1, SN=1) as a new Pkt (4th w/ SN=1)
 Increasing n by 1 didn’t solve the duplication detection problem this time!!
Chapter 3, slide: 44
Duplication problem: illustration
Recap:
 Increasing seqNbr space n from 2 to 3 didn’t solve the
duplication problem yet!
 Let’s now see what happens if we again increase n by 1:
Now n=4 and seqNbr pattern is: 0,1,2,3,0,1,2,3,0…
 Let’s revisit the scenario with n=4 and see
Chapter 3, slide: 45
Duplication problem: illustration
Scenario: n=4;W=2; ACK 1 is lost
Receiver now drops this retransmission of 2nd Pkt (Pkt1, SN=1) since its SN falls outside the receiver’s window
 Duplication detection problem is now solved when n=4 w=2
Chapter 3, slide: 46
Duplication problem: illustration
Recap:
 Increasing seqNbr space n from 2 to 4 did solve the
duplication problem
 That is, when seqNbr space = twice window size (n = 2w)
the duplication problem was solved
 Hence, in this case:
when W ≤ n/2, the problem is solved
 Can we generalize?
Yes, W ≤ n/2 must hold in order to avoid dup problem
Chapter 3, slide: 47
SeqNbr space n & window size w relationship: general case
• Sender:
  • Suppose ½ n < W < n
  • W segments with seqNbr in [0, W-1] are sent, but their ACKs are not received yet
[Figure: circular seqNbr space 0, 1, ..., n-1; the arc [0, W-1] is the sender’s sliding window = pkts already sent, but ACKs not received yet]
SeqNbr space n & window size w relationship: general case
• Receiver: Now assume:
  • receiver received all W segments (SeqNbr in [0,W-1])
  • receiver sent all W ACKs, one for each received segment
  • receiver slides its window: sliding window (SeqNbr of expected segments) now runs from W to x = 2W-n-1
[Figure: circular seqNbr space 0, 1, ..., n-1; the seqNbr expected by receiver span [W, n-1] and wrap around to [0, x], with x = 2W-n-1]
Hint: (n-1-W+1) + (x-0+1) = W ⇒ x = 2W–n-1
Chapter 3, slide: 49
SeqNbr space n & window size w relationship: general case
• Consider worst case scenario: all W ACKs are lost
• Sender will re-transmit all the first W segments; i.e., with SeqNbr in [0,W-1]
• But receiver is expecting segments with SeqNbr in [W,n-1] U [0,2W-n-1]
• Retransmissions with seqNbr falling within [0,2W-n-1] will then be interpreted by receiver as new transmissions (⇒ dup problem)
To avoid dup problem, we must then have: 2W-n-1 ≤ -1 ⇒ W ≤ ½ n
[Figure: circular seqNbr space showing the seqNbr overlap [0, x], x = 2W-n-1, between the seqNbr retransmitted by sender and the seqNbr expected by receiver]
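The worst-case argument above (all W ACKs lost) can be checked by brute force. The Python sketch below uses invented function names and assumes W ≤ n. It builds the set of seqNbrs the sender retransmits and the set the receiver expects after sliding its window, and reports whether they overlap.

```python
def receiver_window_after_slide(n, W):
    """SeqNbrs the receiver expects once it has received and ACKed segments 0..W-1."""
    return {(W + i) % n for i in range(W)}


def duplicate_problem(n, W):
    """True if a retransmitted seqNbr in [0, W-1] falls inside the receiver's new window."""
    retransmitted = set(range(W))
    return bool(retransmitted & receiver_window_after_slide(n, W))


for n in (2, 3, 4):
    print(n, duplicate_problem(n, W=2))
```

For W = 2 this reports an overlap for n = 2 and n = 3 and none for n = 4, in line with the three scenarios walked through earlier. More generally the overlap disappears exactly when W ≤ n/2.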
ECE/CS 372 – introduction to computer networks
Lecture 9
Announcements:
 HW2 and LAB3 are due Tuesday 4th week.
 A possible room change (Will send email with change
if happened)
 Midterm July 19th, 2012 (Thursday 4th week)
 Ch1, Ch2, and Ch3. Make sure you review the
material, HW, labs
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 51
Chapter 3 outline
• Principles of reliable data transfer
• Connection-oriented transport: TCP
• Principles of congestion control
• TCP congestion control
Chapter 3, slide: 52
TCP Round Trip Time (RTT) and Timeout
Why need to estimate RTT?
• “timeout” and “retransmit” needed to address pkt loss
• need to know when to timeout and retransmit
Some intuition
• What happens if too short: premature timeout
  • unnecessary retransmissions
• What happens if too long:
  • slow reaction to segment loss
Ideal world:
• exact RTT is needed
Real world:
• RTTs change over time because
  • pkts may take different paths
  • network load changes over time
• RTTs can only be estimated
Chapter 3, slide: 53
TCP Round Trip Time (RTT) and Timeout
Technique: Exponential Weighted Moving Average (EWMA)
EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent)
0 < α < 1;
typical value: α = 0.125
 SampleRTT:
 measured time from segment transmission until ACK receipt
 current value of RTT
 Ignore retransmission
 EstimatedRTT:
 estimated based on past & present; smoother than SampleRTT
 to be used to set timeout period
Chapter 3, slide: 54
TCP Round Trip Time (RTT) and Timeout
Technique: Exponential Weighted Moving Average (EWMA)
EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent)
0 < α < 1;
typical value: α = 0.125
Illustration:
Assume: we received n RTT samples so far:
Let’s order them as: 1,2,3, … ,n
(1 is the most recent one, 2 is the 2nd most recent, etc.)
EstimatedRTT(n) = (1-α)*EstimatedRTT(n-1) + α*SampleRTT(1)
(EstimatedRTT(n) : estimated RTT after receiving ACK of nth Pkt.)
Example: Suppose 3 ACKs returned w/ SampleRTT(1), SampleRTT(2), SampleRTT(3)
Question: What would be EstimatedRTT after receiving the 3 ACKs?
Assume that EstimatedRTT(1) = SampleRTT(3) after receiving 1st ACK
Chapter 3, slide: 55
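One way to work the question above is to unroll the recursion twice, starting from the slide's stated assumption that EstimatedRTT(1) = SampleRTT(3) (sample 3 being the oldest of the three):

```latex
\begin{aligned}
\text{EstimatedRTT}(2) &= (1-\alpha)\,\text{SampleRTT}(3) + \alpha\,\text{SampleRTT}(2) \\
\text{EstimatedRTT}(3) &= (1-\alpha)\,\text{EstimatedRTT}(2) + \alpha\,\text{SampleRTT}(1) \\
 &= (1-\alpha)^{2}\,\text{SampleRTT}(3) + \alpha(1-\alpha)\,\text{SampleRTT}(2) + \alpha\,\text{SampleRTT}(1)
\end{aligned}
```

Each step back in time multiplies a sample's weight by another factor of (1-α), which is where the "exponential" in EWMA comes from.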
TCP Round Trip Time (RTT) and Timeout
Technique: Exponential Weighted Moving Average (EWMA)
EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent)
0 < α < 1;
typical value: α = 0.125
What happens if α is too small (say very close to 0):
• A sudden, real change in network load does not get reflected in EstimatedRTT fast enough
• May lead to under- or overestimation of RTT for a long time
What happens if α is too large (say very close to 1):
• Transient fluctuations/changes in network load affect EstimatedRTT and make it unstable when it should not be
• Also leads to under- or overestimation of RTT
Chapter 3, slide: 56
Example RTT estimation:
[Figure: RTT (milliseconds, 100 to 350) vs. time (seconds, 1 to 106) for gaia.cs.umass.edu to fantasia.eurecom.fr, plotting SampleRTT and Estimated RTT]
Chapter 3, slide: 57
TCP Round Trip Time (RTT) and Timeout
Setting the timeout
• timeout = EstimatedRTT, any problem with this???
• add a “safety margin” to EstimatedRTT
  • large variation in EstimatedRTT -> larger safety margin
• see how much SampleRTT deviates from EstimatedRTT:
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
Then set timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
Chapter 3, slide: 58
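The three formulas above fit in a small Python class. This is a sketch under stated assumptions rather than how any particular TCP stack does it. The class name is invented, the initial DevRTT of 0 is an assumption (the slides give no starting value), and DevRTT is updated before EstimatedRTT so that the deviation is measured against the estimate that was in force when the sample arrived.

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimator (hypothetical helper for illustration)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # seed with the first SampleRTT
        self.dev_rtt = 0.0                 # assumption: no deviation observed yet

    def update(self, sample_rtt):
        """Fold one SampleRTT into DevRTT and EstimatedRTT; return the new TimeoutInterval."""
        # Assumption: deviation is taken against the previous EstimatedRTT.
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(sample_rtt - self.estimated_rtt)
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(140.0)              # one sample 40 ms above the estimate
print(est.estimated_rtt, est.dev_rtt, timeout)
```

A single sample 40 ms above the estimate moves EstimatedRTT by only 5 ms but adds 40 ms of safety margin, which is the behaviour the "larger variation, larger safety margin" bullet asks for.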
TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no “message boundaries”
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door]
Chapter 3, slide: 59
TCP: a reliable data transfer
• TCP creates rdt service on top of IP’s unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control
Chapter 3, slide: 60
TCP sender events:
data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unACK’ed segment)
• expiration interval: TimeOutInterval
timeout:
• retransmit segment that caused timeout
• restart timer
Ack rcvd:
• If acknowledges previously unACK’ed segments
  • update what is known to be ACK’ed
  • start timer if there are outstanding segments
Chapter 3, slide: 61
TCP seq. #’s and ACKs
Seq. #’s:
• byte stream “number” of first byte in segment’s data
ACKs:
• seq # of next byte expected from other side
• cumulative ACK
[Figure: simple telnet scenario between Host A and Host B over time: user types ‘C’ at Host A; Host B ACKs receipt of ‘C’, echoes back ‘C’; Host A ACKs receipt of echoed ‘C’]
Chapter 3, slide: 62
TCP: retransmission scenarios
[Figures: two Host A / Host B timing diagrams, each with a Seq=92 timeout: the lost ACK scenario (ACK marked X, loss) and the premature timeout scenario]
Chapter 3, slide: 63
TCP retransmission scenarios (more)
[Figure: Host A / Host B timing diagram of the cumulative ACK scenario, with a loss (X) occurring before the timeout]
Chapter 3, slide: 64
TCP ACK generation [RFC 1122, RFC 2581]
Event at Receiver | TCP Receiver action
Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed | Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
Arrival of in-order segment with expected seq #. One other segment has ACK pending | Immediately send single cumulative ACK, ACKing both in-order segments
Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected | Immediately send duplicate ACK, indicating seq. # of next expected byte
Arrival of segment that partially or completely fills gap | Immediately send ACK, provided that segment starts at lower end of gap
Chapter 3, slide: 65
ECE/CS 372 – introduction to computer networks
Lecture 10
Announcements:
 Midterm July 19th, 2012 (Thursday 4th week)
 Ch1, Ch2, and Ch3. Make sure you review
the material, HW, labs
 Closed notes, book. Allowed only your
pen/pencil and a calculator.
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 66
Fast Retransmit
• Suppose: Packet 0 gets lost
  • Q: when will retrans. of Packet 0 happen?? Why does it happen at that time?
  • A: typically at t1; we think it is lost when timer expires
• Can we do better??
  • Think of what it means to receive many duplicate ACKs
    • it means Packet 0 is lost
    • Why wait till timeout since we know packet 0 is lost
    => Fast retransmit
    => better performance
• Why 3 dup ACKs, not just 1 or 2
  • Think of what happens when pkt0 arrives after pkt1 (delayed, not lost)
  • Think of what happens when pkt0 arrives after pkt1 & 2, etc.
[Figure: client/server timing diagram; timer is set at t0 and expires at t1]
Chapter 3, slide: 67
Fast Retransmit: recap
• Receipt of duplicate ACKs indicates loss of segments
  • Sender often sends many segments back-to-back
  • If segment is lost, there will likely be many duplicate ACKs.
This is how TCP works:
• If sender receives 3 ACKs for the same data, it supposes that segment after ACK’ed data was lost:
  • fast retransmit: resend segment before timer expires
  • better performance
Chapter 3, slide: 68
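The duplicate-ACK rule can be sketched as a small counter over the stream of ACK numbers arriving at the sender. The function name is invented for this note. The slide does not spell out whether the first ACK counts toward the three, so this sketch assumes the common reading of three repeats after the first ACK and exposes the threshold as the `dup_threshold` parameter.

```python
def fast_retransmit_events(acks, dup_threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:  # fire once per run of duplicates
                retransmits.append(ack)
        else:
            last_ack = ack                  # new data ACKed: reset the counter
            dup_count = 0
    return retransmits


print(fast_retransmit_events([100, 100, 100, 100]))            # one original + 3 duplicates
print(fast_retransmit_events([100, 100, 200, 200, 200, 200]))  # only the run at 200 reaches 3
```

One or two duplicates do not trigger anything, which leaves room for the "delayed, not lost" reordering cases raised earlier.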
TCP Flow Control
• receive side of TCP connection has a receive buffer:
• app process may be slow at reading from buffer
flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app’s drain rate
Chapter 3, slide: 69
TCP Flow control: how it works
• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn’t overflow
Chapter 3, slide: 70
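The RcvWindow bookkeeping above amounts to two subtractions. A minimal Python sketch follows. The function names and the example byte counts are made up for illustration, and real TCP carries the window in a header field rather than a function call.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(n_bytes, last_byte_sent, last_byte_acked, window):
    """Sender-side check: unACKed data after sending n_bytes must stay within the window."""
    return (last_byte_sent - last_byte_acked) + n_bytes <= window


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(can_send(1000, last_byte_sent=5000, last_byte_acked=4000, window=window))
print(can_send(1200, last_byte_sent=5000, last_byte_acked=4000, window=window))
```

As the application drains the buffer, LastByteRead advances and the advertised window grows again.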
Review questions
Problem:
• TCP connection between Host A and Host B
• B received up to 248 bytes
• A sends back-to-back 2 segments to B with 40 and 60 bytes
• B ACKs every pkt it receives
Q1: Seq# in 1st and 2nd seg. from A to B ?
Q2: Suppose: 1st seg. gets to B first. What is seq# in 1st ACK?
Q3: Suppose: 2nd seg. gets to B first. What is seq# in 1st ACK? And in 2nd ACK?
Chapter 3, slide: 72
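One way to check answers to these questions is to model byte-stream sequence numbers and cumulative ACKs directly. The Python sketch below rests on two assumptions that the problem statement does not settle. It reads "B received up to 248 bytes" as "B holds every byte through byte number 248", so the next byte expected is 249, and it lets the receiver buffer an out-of-order segment. The function names are invented for this note.

```python
def segment_seq_numbers(next_byte, sizes):
    """Seq # of each segment = byte-stream number of its first data byte."""
    seqs = []
    for size in sizes:
        seqs.append(next_byte)
        next_byte += size
    return seqs


def cumulative_acks(next_expected, arrivals):
    """arrivals: (seq, size) pairs in arrival order. Returns the ACK # sent after each one."""
    buffered = {}
    acks = []
    for seq, size in arrivals:
        buffered[seq] = size
        # Advance past every segment that is now contiguous with the received data.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        acks.append(next_expected)  # ACK carries the next byte expected
    return acks


seqs = segment_seq_numbers(249, [40, 60])
in_order = cumulative_acks(249, [(seqs[0], 40), (seqs[1], 60)])
reordered = cumulative_acks(249, [(seqs[1], 60), (seqs[0], 40)])
print(seqs, in_order, reordered)
```

Under a different reading of "up to 248 bytes" (for example, bytes numbered from 0), every number shifts by one, but the pattern of the ACKs stays the same.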
Review questions
Problem:
• TCP connection between A and B
• B received up to 248 bytes
• A sends back-to-back 2 segments to B with 5 and 10 bytes
• B ACKs every pkt it receives
Now suppose:
- 2 segs get to B in order.
- 1st ACK is lost
- 2nd ACK arrives after timeout
Question: fill out all pckt seq numbers, and ACKs seq numbers in the timing diagram
Chapter 3, slide: 74
Chapter 3 outline
• Principles of reliable data transfer
• Connection-oriented transport: TCP
• Principles of congestion control
• TCP congestion control
Chapter 3, slide: 75
Principles of Congestion Control
• Oct. ’86: LBL (Lawrence Berkeley Lab) to UC-Berkeley: throughput dropped from 32 kbps to 40 bps
• cause: end systems are sending too much data too fast for network/routers to handle
• manifestations:
  • lost/dropped packets (buffer overflow at routers)
  • long delays (queueing in router buffers)
• different from flow control!
• a top-10 problem!
Chapter 3, slide: 76
Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP
Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (DECbit, TCP/IP ECN)
  • explicit rate sender should send at
• One more restriction on sliding window: CongWin
Chapter 3, slide: 77
TCP congestion control
• Keep in mind:
  • Too slow: under-utilization => waste of network resources by not using them!
  • Too fast: over-utilization => waste of network resources by congesting them!
• Challenge is then:
  • Not too slow, nor too fast!!
• Approach:
  • Increase slowly the sending rates to probe for usable bandwidth
  • Decrease the sending rates when congestion is observed
  => Additive-increase, multiplicative decrease (AIMD)

Chapter 3, slide: 78
TCP congestion control: AIMD
Additive-increase, multiplicative decrease (AIMD) (also called “congestion avoidance”)
• additive increase: increase CongWin by 1 MSS every RTT until loss detected (MSS = Max Segment Size)
• multiplicative decrease: cut CongWin in half after loss
Saw tooth behavior: probing for bandwidth
[Figure: congestion window size (8, 16, 24 Kbytes) vs. time, showing the saw tooth]
Chapter 3, slide: 79
TCP Congestion Control: details
• sender limits transmission: LastByteSent-LastByteAcked ≤ CongWin
• Roughly, rate = CongWin/RTT Bytes/sec
• CongWin is dynamic, function of perceived network congestion
How does sender perceive congestion?
• loss event
  • timeout or
  • 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event
Improvements:
• AIMD increases window by 1 MSS every RTT
• AIMD, any problem??
  • Think of the start of connections
• Solution: start a little faster, and then slow down
  => “slow-start”
Chapter 3, slide: 80
TCP Slow Start
• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 Bytes, RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
TCP addresses this via Slow-Start mechanism
• When connection begins, increase rate exponentially fast
• When loss of packet occurs (indicates that connection reaches up there), then slow down
Chapter 3, slide: 81
TCP Slow Start (more)
How it is done
• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A / Host B timing diagram over successive RTTs]
Chapter 3, slide: 82
Refinement: TCP Tahoe
Question:
When should exponential increase (Slow-Start) switch to linear (AIMD)?
Here is how it works:
• Define a variable, called Threshold
• Start “Slow-Start” (i.e., CongWin = 1 and double CongWin every RTT)
• At 1st loss event (timeout or 3 Dup ACKs):
  • Set Threshold = ½ current CongWin
  • Start “Slow-Start” (i.e., CongWin = 1 and double CongWin every RTT)
• When CongWin = Threshold:
  • Switch to AIMD (linear)
• At any loss event (timeout or 3 Dup ACKs):
  • Set Threshold = ½ CongWin
  • Start “Slow-Start” (i.e., start over with CongWin = 1)

Chapter 3, slide: 83
Refinement: TCP Tahoe
[Figure: plot of CongWin under TCP Tahoe]
Chapter 3, slide: 84
More refinement: TCP Reno
Loss event: Timeout vs. dup ACKs
• 3 dup ACKs: fast retransmit
• Timeout: retransmit
Any difference (think congestion) ??
• 3 dup ACKs indicate network still capable of delivering some segments after a loss
• Timeout indicates a “more” alarming congestion scenario
[Figure: client/server timing diagram; timer is set at t0 and expires at t1]
Chapter 3, slide: 86
More refinement: TCP Reno
TCP Reno treats “3 dup ACKs” differently from “timeout”
How does TCP Reno work?
• After 3 dup ACKs:
  • CongWin is cut in half
  • congestion avoidance (window grows linearly)
• But after timeout event:
  • CongWin instead set to 1 MSS;
  • Slow-Start (window grows exponentially)
Chapter 3, slide: 87
Summary: TCP Congestion Control
 When CongWin is below Threshold,
sender in slow-start phase, window
grows exponentially.
 When CongWin is above Threshold,
sender is in congestion-avoidance
phase, window grows linearly.
 When a triple duplicate ACK occurs,
Threshold set to CongWin/2 and
CongWin set to Threshold.
 When timeout occurs, Threshold
set to CongWin/2 and CongWin is
set to 1 MSS.
Chapter 3, slide: 88
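The four summary rules can be turned into a per-RTT update function to see how CongWin evolves. This Python sketch works in units of MSS and makes several simplifying assumptions that are not on the slides. It advances in whole RTTs rather than per ACK, it starts with a Threshold of 8 MSS, and it caps slow-start doubling at Threshold. The function names and event labels are invented for illustration.

```python
def next_congwin(congwin, threshold, event):
    """Apply one event ('rtt' = a loss-free RTT, '3dup', 'timeout'); return (congwin, threshold)."""
    if event == "3dup":
        threshold = congwin / 2
        return threshold, threshold         # CongWin set to the new Threshold
    if event == "timeout":
        return 1, congwin / 2               # back to 1 MSS, slow start again
    if congwin < threshold:
        # Slow start: double per RTT (assumption: do not overshoot Threshold).
        return min(2 * congwin, threshold), threshold
    return congwin + 1, threshold           # congestion avoidance: +1 MSS per RTT


def trace(events, congwin=1, threshold=8):
    """Return CongWin after each event, starting from hypothetical initial values."""
    history = []
    for event in events:
        congwin, threshold = next_congwin(congwin, threshold, event)
        history.append(congwin)
    return history


events = ["rtt"] * 5 + ["3dup", "rtt", "timeout", "rtt", "rtt", "rtt"]
history = trace(events)
print(history)
```

In the trace, the window doubles up to the threshold, grows by one per RTT after that, halves on the triple duplicate ACK, and restarts from 1 MSS on the timeout.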
Average throughput of TCP
Avg. throughput as a function of window size W and RTT?
• Ignore Slow-Start
• Let W, the window size when loss occurs, be constant
• When window is W, throughput is ??
  throughput(high) = W/RTT
• Just after loss, window drops to W/2, throughput is ??
  throughput(low) = W/(2RTT)
• Throughput then increases linearly from W/(2RTT) to W/RTT
• Hence, average throughput = 0.75 W/RTT
Chapter 3, slide: 89
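The 0.75 factor follows from the linear ramp: a quantity that rises linearly between two values averages to their midpoint. Written out, with the same W and RTT as above:

```latex
\text{throughput}_{\text{avg}}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{1}{2}\cdot\frac{3W}{2\,RTT}
  = \frac{3}{4}\,\frac{W}{RTT}
```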
TCP Fairness
Fairness goal: if K TCP sessions share same
bottleneck link of bandwidth R, each should have
average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
Chapter 3, slide: 90
Fairness (more)
Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts.
• Web browsers do this
• Example: link of rate R supporting 9 connections;
  • 9 TCPs belong to same host, each getting R/9
  • new app asks for 1 TCP, gets rate R/10
  • One host gets 9/10 of R and one gets 1/10 of R!
Chapter 3, slide: 91
Question: is TCP fair?
Again assume two competing sessions only, and consider AIMD:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease halves throughput proportionally, when congestion occurs
Question:
• How would (R1, R2) vary when AIMD is used?
• Does it converge to equal share?
[Figure: throughput plane with Connection 1 throughput on the x-axis (0 to R) and the equal bandwidth share line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]
Chapter 3, slide: 93
Question?
Again assume two competing sessions only:
• Additive increase gives slope of 1, as throughput increases
• Constant/equal decrease instead of multiplicative decrease!!! (Call this scheme: AIED)
Question:
• How would (R1, R2) vary when AIED is used instead of AIMD?
• Is it fair? Does it converge to equal share?
[Figure: throughput plane with Connection 1 throughput on the x-axis (0 to R), the equal bandwidth share line, and a starting point (R1, R2)]
Chapter 3, slide: 94
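A toy simulation offers one way to explore the AIMD and AIED questions. The Python sketch below assumes both connections see every loss at the same time and react together, which is a strong simplification. The function name, the starting rates (30 and 60), the capacity of 100, and the AIED decrement of 10 are all hypothetical values chosen for illustration.

```python
def simulate(r1, r2, capacity, decrease, steps=2000):
    """Both flows add 1 per step; when their sum exceeds capacity, both apply `decrease`."""
    for _ in range(steps):
        if r1 + r2 > capacity:
            r1, r2 = decrease(r1), decrease(r2)   # synchronized loss event
        else:
            r1, r2 = r1 + 1, r2 + 1               # additive increase, slope 1
    return r1, r2


# AIMD: halve on loss. AIED: subtract a constant (hypothetical 10) on loss.
aimd = simulate(30.0, 60.0, 100.0, lambda r: r / 2)
aied = simulate(30.0, 60.0, 100.0, lambda r: r - 10)

print("AIMD gap:", abs(aimd[0] - aimd[1]))
print("AIED gap:", abs(aied[0] - aied[1]))
```

Additive steps move both rates by the same amount, so they never change the gap between them. Halving cuts the gap in half at every loss, while an equal decrease leaves it untouched, so in this model only AIMD drifts toward the equal bandwidth share line.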