Transport Layer

Part 2
TCP Flow Control, Congestion Control,
Connection Management, etc.
1
Transport Layer – TCP
B
Encapsulation in TCP/IP
IP datagram
TCP: Overview
Error detection, retransmission, cumulative ACKs, timers, header fields for sequence and ACK numbers
• point-to-point:
  - one sender, one receiver
• reliable, in-order byte stream:
  - no message boundaries
• pipelined:
  - TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  - bi-directional app. data flow in same connection
  - MSS: maximum segment size
• connection-oriented:
  - handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
  - sender will not "flood" receiver with data
[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its own socket door]
Recall
[Figure: the application writes data through the socket door into the TCP send buffer; packets travel to the TCP receive buffer, from which the receiving application reads data]
Reliable Data Transfer Mechanisms:
• Checksum
  - Verifies the integrity of a packet
• Timer
  - Signals that a retransmission is required
• Sequence number
  - Keeps track of which packets have been sent and received
• ACK, NAK
  - Indicate receipt of a packet in good (ACK) or bad (NAK) form
• Window, pipelining
  - Allows for the sending of multiple yet-to-be-acknowledged packets
Internet Checksum Example
• Note
  - When adding numbers, a carryout from the most significant bit needs to be added to the result
• Example: add two 16-bit integers
data            1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
To check (sum + checksum):
                1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
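The wraparound addition in the checksum example can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the two 16-bit words from the slide; the function names are made up for this sketch and do not come from any library.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """The checksum is the one's complement of the wrapped sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


# The two 16-bit integers from the example
a = 0b1110011001100110
b = 0b1101010101010101

wrapped_sum = ones_complement_sum16([a, b])
checksum = internet_checksum([a, b])
# Receiver-side check: data words plus checksum should give all ones
check = ones_complement_sum16([a, b, checksum])

print(format(wrapped_sum, "016b"))
print(format(checksum, "016b"))
print(format(check, "016b"))
```

A receiver that gets anything other than sixteen 1 bits in the final check knows that at least one bit was corrupted.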
Connection Oriented Transport: TCP
• TCP Segment Structure
• SEQ and ACK numbers
• Calculating the Timeout Interval
• The Simplified TCP Sender
• ACK Generation Recommendation (RFC 1122, RFC 2581)
• Interesting Transmission Scenarios
• Flow Control
• TCP Connection Management
TCP segment structure
Segment layout (each row is 32 bits wide):
  source port # | dest. port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | URGent data ptr
  Options (variable length)
  application data (variable length)
Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection establishment (setup, teardown commands)
• rcvr window size: # bytes the rcvr is willing to accept
• checksum: Internet checksum (as in UDP)
In practice, PSH, URG, and the Urgent Data Pointer are not used.
We can view these teeny-weeny details using Ethereal (now Wireshark).
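As a rough illustration of the segment layout, the fixed 20-byte part of the header can be decoded with Python's standard struct module. The function name, the dictionary keys and the example port numbers are assumptions made for this sketch; options are simply skipped.

```python
import struct

# Flag bits in the order they appear in the header: U A P R S F
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header; options (if any) are skipped."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # 'head len' counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],
    }


# Hypothetical SYN segment: ports 50000 -> 80, seq 42, header length 5 words
example = struct.pack("!HHIIHHHH", 50000, 80, 42, 0, (5 << 12) | 0x02, 65535, 0, 0)
fields = parse_tcp_header(example)
print(fields["flags"], fields["header_len"])
```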
Example
Suppose that a process in Host A wants to send a stream of data to a
process in Host B over a TCP connection.
Assume:
Data stream: file consisting of 500,000 bytes
MSS: 1,000 bytes
First byte of data stream: numbered as 0
TCP constructs 500 segments out of the data stream.
500,000 bytes/1,000 bytes = 500 segments
TCP sequence #'s and ACKs
Segment 1: bytes 0 1 2 3 4 ... 999
Segment 2: bytes 1000 1001 1002 ... 1999
...
Sequence numbers (#'s):
• byte-stream number of the first byte in the segment's data
• do not necessarily start from 0; a random initial number R is used
  - Segment 1: 0 + R
  - Segment 2: 1000 + R, etc.
ACKs (acknowledgments):
• Seq # of the next byte expected from the other side (last byte + 1)
• Cumulative ACK
• If segment 1 has been received, the receiver waits for segment 2
• E.g. Ack = 1000 + R (bytes 0 through 999 have been received)
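The numbering rule can be checked against the 500,000-byte file example with a short sketch. The initial number R = 5000 and the helper names are arbitrary assumptions for illustration; real TCP picks R itself.

```python
def segment_sequence_numbers(file_size, mss, isn=0):
    """Sequence number of each segment: offset of its first byte plus the initial number."""
    return [(isn + offset) % 2**32 for offset in range(0, file_size, mss)]


def ack_for(seq, length):
    """Cumulative ACK: the number of the next byte expected after this segment."""
    return (seq + length) % 2**32


R = 5000  # assumed random initial sequence number
seqs = segment_sequence_numbers(500_000, 1_000, isn=R)
print(len(seqs), seqs[0], seqs[1], ack_for(seqs[0], 1_000))
```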
TCP sequence #'s and ACKs
simple telnet scenario (with echo on)
Host A (client), Host B (server)
Assuming that the starting sequence numbers for Host A and Host B are 42 and 79 respectively:
1. User types 'C' at Host A. Host A: "I'm sending data starting at seq. num=42"
2. Host B ACKs receipt of 'C', echoes back 'C'. Host B: "Send me the bytes from 43 onward" (the ACK is being piggy-backed on server-to-client data)
3. Host A ACKs receipt of echoed 'C'
(time runs downward in the original diagram)
Q: how receiver handles out-of-order segments
• A: TCP specs. does not say - decide when implementing
Yet another server echo example
Segments exchanged, in time order:
1. Host A: user types 'Hello' (seq=42, ack=79)
2. Host B: ACKs receipt of 'Hello', echoes back 'Hello' (seq=79, ack=47)
3. Host A: ACKs receipt of echoed 'Hello', sends something else (seq=47, ack=84)
4. Host B: (seq=84, ack=50)
The ACK number tells up to which byte has been received, that is, the next starting byte the host is expecting to receive
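The seq/ack bookkeeping in the echo examples can be replayed with a toy model. The Endpoint class below is purely illustrative (not a real socket API), and the 3-byte payload b"abc" standing in for "something else" is an assumption inferred from ack=50.

```python
class Endpoint:
    """Tracks only the two counters a host needs for seq/ack numbering."""

    def __init__(self, isn, peer_isn):
        self.next_seq = isn            # number of the next byte we will send
        self.next_expected = peer_isn  # number of the next byte we expect

    def send(self, data=b""):
        segment = (self.next_seq, self.next_expected, data)  # (seq, ack, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.next_expected:  # in-order data advances the ACK number
            self.next_expected += len(data)


a = Endpoint(isn=42, peer_isn=79)
b = Endpoint(isn=79, peer_isn=42)

s1 = a.send(b"Hello"); b.receive(s1)  # user types 'Hello'
s2 = b.send(b"Hello"); a.receive(s2)  # server ACKs and echoes
s3 = a.send(b"abc");   b.receive(s3)  # client ACKs the echo, sends 3 more bytes
s4 = b.send()                         # pure ACK from the server

for seq, ack, _ in (s1, s2, s3, s4):
    print(f"seq={seq} ack={ack}")
```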
TCP Round Trip Time and Timeout
Main Issue: How long is the sender willing to wait before re-transmitting the packet?
Q: how to set TCP timeout value?
• longer than RTT (RTT = round trip time)
  - note: RTT will vary
• too short: premature timeout
  - unnecessary retransmissions
• too long: slow reaction to segment loss
Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions, cumulatively ACKed segments
• SampleRTT will vary, we would want estimated RTT to be "smoother"
  - use several recent measurements, not just current SampleRTT
TCP Round Trip Time and Timeout
EstimatedRTT = (1-x) * EstimatedRTT + x * SampleRTT
• Exponential weighted moving average
• influence of given sample decreases exponentially fast
• typical value of x: 0.125 (RFC 2988)
Setting the timeout
• EstimatedRTT plus "safety margin"
• large variation in EstimatedRTT -> larger safety margin
Deviation = (1-x) * Deviation + x * |SampleRTT - EstimatedRTT|
• recommended value of x here: 0.25 (a separate weight from the 0.125 used for EstimatedRTT)
Timeout = EstimatedRTT + (4 * Deviation)
Sample Calculations
EstimatedRTT = 0.875 * EstimatedRTT + 0.125 * SampleRTT
EstimatedRTT after the receipt of the ACK of segment 1:
EstimatedRTT = RTT for Segment 1 = 0.02746 second
EstimatedRTT after the receipt of the ACK of segment 2:
EstimatedRTT = 0.875 * 0.02746 + 0.125 * 0.035557 = 0.0285
EstimatedRTT after the receipt of the ACK of segment 3:
EstimatedRTT = 0.875 * 0.0285 + 0.125 * 0.070059 = 0.0337
EstimatedRTT after the receipt of the ACK of segment 4:
EstimatedRTT = 0.875 * 0.0337+ 0.125 * 0.11443 = 0.0438
EstimatedRTT after the receipt of the ACK of segment 5:
EstimatedRTT = 0.875 * 0.0438 + 0.125 * 0.13989 = 0.0558
EstimatedRTT after the receipt of the ACK of segment 6:
EstimatedRTT = 0.875 * 0.0558 + 0.125 * 0.18964 = 0.0725
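The sample calculations can be reproduced with a small sketch that also tracks Deviation and Timeout. Two things are assumptions of this sketch rather than statements from the slides: Deviation starts at 0, and it is updated with the previous EstimatedRTT before the estimate itself is refreshed.

```python
def rtt_estimates(samples, alpha=0.125, beta=0.25):
    """Return (EstimatedRTT, Deviation, Timeout) after each SampleRTT."""
    estimated = samples[0]  # first estimate is simply the first sample
    deviation = 0.0         # assumed starting value
    history = [(estimated, deviation, estimated + 4 * deviation)]
    for sample in samples[1:]:
        deviation = (1 - beta) * deviation + beta * abs(sample - estimated)
        estimated = (1 - alpha) * estimated + alpha * sample
        history.append((estimated, deviation, estimated + 4 * deviation))
    return history


# SampleRTT values (seconds) for segments 1 to 6
samples = [0.02746, 0.035557, 0.070059, 0.11443, 0.13989, 0.18964]
history = rtt_estimates(samples)
for number, (est, dev, timeout) in enumerate(history, start=1):
    print(f"segment {number}: EstimatedRTT={est:.4f} Deviation={dev:.4f} Timeout={timeout:.4f}")
```

The slides round each estimate to four decimals before reusing it, so the unrounded values here can differ from them in the fifth decimal place.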
RTT Samples and RTT estimates
[Figure: Sample RTT and Estimated RTT plotted against time; y-axis: RTT (msec.), 100 to 300]
The variations in the SampleRTT are smoothed out in the computation of the EstimatedRTT.
An Actual RTT estimation:
RTT: gaia.cs.umass.edu to fantasia.eurecom.fr
[Figure: SampleRTT and Estimated RTT; y-axis: RTT (milliseconds), 100 to 350; x-axis: time (seconds), 1 to 106]
FSM of TCP for Reliable Data Transfer
Simplified TCP sender, assuming:
- one way data transfer
- no flow, congestion control
[FSM: a single "wait for event" state with three transitions]
• event: data received from application above -> create, send segment
• event: timer timeout for segment with seq. number y -> retransmit segment
• event: ACK received, with ACK number y -> process ACK
SIMPLIFIED TCP
SENDER
Assumptions:
• sender is not constrained by TCP flow or congestion control
• that data from above is less than MSS in size
• that data transfer is in one direction only
Timer: associated with the oldest unACKed segment

sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
    switch (event)

    event: data received from application above
        create TCP segment with sequence number nextseqnum
        if (timer is currently not running)
            start timer for segment nextseqnum
        pass segment to IP
        nextseqnum = nextseqnum + length(data)

    event: timer timeout
        retransmit not-yet-ACKed segment with smallest seq. #
        start timer

    event: ACK received, with ACK field value of y
        if (y > sendbase) { /* cumulative ACK of all data up to y */
            sendbase = y
            if (there are currently any not-yet-ACKed segments)
                start timer
        }
} /* end of loop forever */
TCP with MODIFICATIONS
With Fast Retransmit
Why wait for the timeout to expire, when consecutive duplicate ACKs can be used to indicate a lost segment?
SENDER
sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
    switch (event)

    event: data received from application above
        create TCP segment with sequence number nextseqnum
        start timer for segment nextseqnum
        pass segment to IP
        nextseqnum = nextseqnum + length(data)

    event: timer timeout for segment with sequence number y
        retransmit segment with sequence number y
        compute new timeout interval for segment y
        restart timer for sequence number y

    event: ACK received, with ACK field value of y
        if (y > sendbase) { /* cumulative ACK of all data up to y */
            cancel all timers for segments with sequence numbers < y
            sendbase = y
        }
        else { /* a duplicate ACK for already ACKed segment */
            increment number of duplicate ACKs received for y
            if (number of duplicate ACKs received for y is 3) {
                /* perform TCP fast retransmission */
                resend segment with sequence number y
                restart timer for segment y
            }
        }
} /* end of loop forever */
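The sender pseudocode can be turned into a small event-driven sketch. This is a simplification under stated assumptions: it keeps the single-timer bookkeeping of the simplified sender, adds the duplicate-ACK rule of the fast-retransmit version, records actions in a log instead of touching the network, and assumes ACK numbers always fall on segment boundaries. The class and method names are hypothetical.

```python
class SimplifiedSender:
    def __init__(self, isn=0):
        self.sendbase = isn
        self.nextseqnum = isn
        self.unacked = {}          # seq number -> data not yet ACKed
        self.dup_acks = 0
        self.timer_running = False
        self.log = []              # (action, seq) pairs, for inspection only

    def on_data(self, data):
        seq = self.nextseqnum
        self.unacked[seq] = data
        self.log.append(("send", seq))
        if not self.timer_running:
            self.timer_running = True   # start timer for the oldest unACKed segment
        self.nextseqnum += len(data)

    def on_timeout(self):
        if self.unacked:
            self.log.append(("retransmit", min(self.unacked)))
            self.timer_running = True   # restart timer

    def on_ack(self, y):
        if y > self.sendbase:           # cumulative ACK of all data up to y
            self.sendbase = y
            self.dup_acks = 0
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
        else:                           # duplicate ACK
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.log.append(("fast-retransmit", y))


sender = SimplifiedSender(isn=92)
sender.on_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_data(b"y" * 20)   # Seq=100
sender.on_data(b"z" * 20)   # Seq=120
sender.on_ack(100)          # first segment ACKed
for _ in range(3):
    sender.on_ack(100)      # three duplicate ACKs: Seq=100 was lost
print(sender.log)
```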
TCP ACK generation
[RFC 1122, RFC 2581]
Receiver does not discard out-of-order segments
Event 1: in-order segment arrival, no gaps, everything else already ACKed
  Receiver action: Delay sending the ACK. Wait up to 500ms for next segment. If next segment does not arrive in this interval, send ACK
Event 2: in-order segment arrival, no gaps, one delayed ACK pending (due to action 1)
  Receiver action: immediately send a single cumulative ACK
Event 3: out-of-order segment arrival with higher than expected seq. # - a gap is detected
  Receiver action: send duplicate ACK, indicating seq. # of next expected byte
Event 4: arrival of segment that partially or completely fills gap
  Receiver action: immediately send an ACK if segment starts at lower end of gap
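The four events can be written as a single decision function. This is a sketch of the table only, not of a real TCP stack; the parameter names are invented here, and the final branch (a segment below the expected number, i.e. old data) is not covered by the table, so re-ACKing it is an assumption.

```python
def receiver_action(seg_seq, expected_seq, delayed_ack_pending, gap_buffered):
    """Pick the ACK behaviour for an arriving segment.

    seg_seq: sequence number of the arriving segment
    expected_seq: next in-order byte the receiver is waiting for
    delayed_ack_pending: True if an earlier ACK is still being delayed
    gap_buffered: True if out-of-order data is buffered above a gap
    """
    if seg_seq > expected_seq:
        return "send duplicate ACK"                # event 3: gap detected
    if seg_seq == expected_seq and gap_buffered:
        return "send ACK immediately"              # event 4: starts at lower end of gap
    if seg_seq == expected_seq and delayed_ack_pending:
        return "send single cumulative ACK"        # event 2
    if seg_seq == expected_seq:
        return "delay ACK up to 500 ms"            # event 1
    return "send duplicate ACK"                    # assumption: old data is re-ACKed


print(receiver_action(100, 100, False, False))
print(receiver_action(120, 100, False, False))
```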
TCP: Interesting Scenarios
Simplified TCP version
[Figure, left: lost ACK scenario. Host A sends to Host B, the ACK is lost (X), the Seq=92 timeout expires: retransmission due to lost ACK]
[Figure, right: premature timeout, cumulative ACKs. The Seq=92 timeout expires early at Host A; the timer is restarted here for Seq=92; the segment with Seq=100 is not retransmitted]
TCP: Retransmission Scenario
[Figure: Host A sends to Host B; one ACK is lost (X) within the Seq=92 timeout interval]
Cumulative ACK avoids retransmission of the first segment.
TCP Modifications:
Doubling the Timeout Interval
Provides a limited form of congestion control
• Timer expiration is more likely caused by congestion in the network
• Congestion may get worse if sources continue to retransmit packets persistently.
TimeoutInterval = 2 * TimeoutIntervalPrevious
• TCP acts more politely by increasing the TimeoutInterval, causing the sender to retransmit after longer and longer intervals.
• After ACK is received, TimeoutInterval is derived from most recent EstimatedRTT and DevRTT
Others: check RFC 2018 – selective ACK
TCP Flow Control
flow control: sender won't overrun receiver's buffer by transmitting too much, too fast
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
[Figure: receiver buffering]
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
• RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
FLOW CONTROL: Receiver
EXAMPLE: HOST A sends a large file to HOST B
RECEIVER: HOST B – uses RcvWindow, LastByteRcvd, LastByteRead
[Figure: HOST B receive buffer (RcvBuffer) with byte positions 0, 40, 50, 60 and 100 marked; data from IP fills the buffer up to LastByteRcvd, and the application process reads from the buffer up to LastByteRead]
HOST B tells HOST A how much spare room it has in the connection buffer by placing its current value of RcvWindow in the receive window field of every segment it sends to HOST A.
Initially, RcvWindow = RcvBuffer
RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
FLOW CONTROL: Sender
EXAMPLE: HOST A sends a large file to HOST B
SENDER: HOST A – uses RcvWindow of HostB, LastByteSent, LastByteACKed
[Figure: SENDER: HOST A send buffer with byte positions 0, 40, 50, 60 and 100 marked; data is sent up to LastByteSent, and ACKs from Host B advance LastByteACKed]
To ensure that HOST B does not overflow, HOST A maintains throughout the connection’s life that [LastByteSent - LastByteACKed] <= RcvWindow
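The two flow-control formulas can be tried out with concrete numbers. The byte counts below are made-up values chosen only to exercise the formulas, and both function names are hypothetical.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room left in the receive buffer."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(n_bytes, last_byte_sent, last_byte_acked, advertised_window):
    """Sender side: would sending n_bytes keep unACKed data within RcvWindow?"""
    return (last_byte_sent + n_bytes) - last_byte_acked <= advertised_window


# Assumed numbers: 100-byte buffer, 60 bytes received, 40 already read
window = rcv_window(rcv_buffer=100, last_byte_rcvd=60, last_byte_read=40)
print(window)                               # spare room advertised to HOST A
print(sender_may_send(70, 50, 40, window))  # 10 bytes in flight + 70 new
print(sender_may_send(71, 50, 40, window))  # one byte too many
```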
FLOW CONTROL
Some issues to consider:
RcvWindow – used by the connection to provide the flow control service
What happens when the receive buffer of HOST B is full? (that is, when RcvWindow=0)
• TCP sends a segment only when there is data or ACK to send. Therefore, the sender must keep the connection ‘alive’.
• TCP requires that HOST A continue to send segments with one data byte when HOST B’s receive window is 0. Such segments will be ACKed by HOST B.
• Eventually, the buffer will have some space and the ACKs will contain RcvWindow > 0
TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
• Initialize TCP variables:
  - sequence numbers
  - buffers, flow control info (e.g. RcvWindow)
• Client is the connection initiator
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) != 0) {
        printf("connect failed\n");
        WSACleanup();
        exit(1);
    }
  In Java: Socket clientSocket = new Socket("hostname", portNumber); the constructor performs the connect (the port is an int, not a string)
• Server is contacted by client
    ns = accept(s, (struct sockaddr *)(&remoteaddr), &addrlen);
  In Java: accept() on a ServerSocket returns the connected Socket
TCP Connection Management
Establishing a connection
[Figure: Client and Server timelines exchanging the three handshake segments]
This is what happens when we create a socket for connection to a server
Three way handshake:
Step 1: client end system sends TCP SYN control segment to server (executed by TCP itself)
  • specifies initial seq number (isn)
Step 2: server end system receives SYN, replies with SYNACK control segment
  • ACKs received SYN
  • allocates buffers
  • specifies server’s initial seq. number
Step 3: client ACKs the connection with ACK=server_isn +1
  • allocates buffers
  • sends SYN=0
Connection established!
After establishing the connection, the client can receive segments with app-generated data! (SYN=0)
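The numbers carried by the three handshake segments can be sketched as plain dictionaries. This only models the header fields named in the steps; the function name and the example initial sequence numbers 42 and 79 are assumptions for illustration.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}                                # step 1
    synack = {"SYN": 1, "seq": server_isn, "ack": client_isn + 1}      # step 2
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": server_isn + 1}     # step 3
    return syn, synack, ack


syn, synack, ack = three_way_handshake(client_isn=42, server_isn=79)
for segment in (syn, synack, ack):
    print(segment)
```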
TCP Connection Management (cont.)
How TCP connection is established and torn down
Closing a connection:
client closes socket: closesocket(s); Java: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client and server timelines: close (client), close (server), timed wait, closed]
TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK.
  • Enters "timed wait": will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client and server timelines: closing (client), closing (server), timed wait, closed, closed]
TCP Connection Management (cont)
[Figure: TCP client lifecycle and TCP server lifecycle state diagrams, with steps numbered 1 to 12]
• Timed wait: used in case ACK gets lost. It is implementation-dependent (e.g. 30 seconds, 1 minute, 2 minutes)
• Connection formally closes – all resources (e.g. port numbers) are released
End of Flow Control and Error Control
Flow Control vs. Congestion Control
Similar actions are taken, but for very different reasons
Flow Control
• point-to-point traffic between sender and receiver
• speed matching service, matching the rate at which the sender is
sending against the rate at which the receiving application is reading
• prevents Receiver Buffer from overflowing
Congestion – happens when there are too many sources attempting to
send data at too high a rate for the routers along the path
Congestion Control
• service that makes sure that the routers between End Systems are
able to carry the offered traffic
• prevents routers from overflowing
Same course of action: Throttling of the sender
Principles of Congestion Control
Congestion:
• Informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• Manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
• a top-10 problem!
Approaches towards congestion control
Two broad approaches towards congestion control:
1. End-to-end congestion control:
  • no explicit feedback from network
  • congestion inferred by end-systems from observed packet loss & delay
  • approach taken by TCP
2. Network-assisted congestion control:
  • routers provide feedback to End Systems in the form of:
    - single bit indicating link congestion (SNA, DECbit, TCP/IP ECN, ATM ABR)
    - explicit transmission rate the sender should send at
TCP Congestion Control
How does a TCP sender limit the rate at which it sends traffic into its connection?
SENDER: New variable! – Congestion Window
(Amount of unACKed data)SENDER <= min(CongWin, RcvWindow)
where amount of unACKed data = LastByteSent - LastByteACKed
• Indirectly limits the sender’s send rate
• By adjusting CongWin, sender can therefore adjust the rate at which it sends data into its connection
Assumptions:
• TCP receive buffer is very large – no RcvWindow constraint
  - Amt. of unACKed data at sender is solely limited by CongWin
• Packet loss delay & packet transmission delay are negligible
Sending rate: (approx.) CongWin / RTT
TCP Congestion Control
TCP uses ACKs to trigger (“clock”) its increase in congestion window size – “self-clocking”
Arrival of ACKs – indication to the sender that all is well
1. ACKs arrive at a slow rate
  • Congestion window will be increased at a relatively slow rate
2. ACKs arrive at a high rate
  • Congestion window will be increased more quickly
TCP Congestion Control
How does TCP perceive that there is congestion on the path?
“Loss Event” – when there is excessive congestion, router buffers along the path overflow, causing datagrams to be dropped, which in turn results in a “loss event” at the sender
1. Timeout
  • no ACK is received after segment loss
2. Receipt of three duplicate ACKs
  • segment loss is followed by three ACKs received at the sender
TCP Congestion Control: details
• sender limits transmission: LastByteSent - LastByteAcked <= cwnd
• roughly, rate = cwnd / RTT bytes/sec
• cwnd is dynamic, function of perceived network congestion
How does sender perceive congestion?
• loss event = timeout or 3 duplicate acks
• TCP sender reduces rate (cwnd) after loss event
Three mechanisms:
1. AIMD
2. slow start
3. conservative after timeout events
TCP congestion avoidance :
additive increase, multiplicative decrease
approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs
• additive increase: increase cwnd by 1 MSS every RTT until loss is detected
• multiplicative decrease: cut cwnd in half after loss
[Figure: cwnd (congestion window size) against time, y-axis marked at 8, 16 and 24 Kbytes; saw tooth behavior: probing for bandwidth]
TCP Slow Start
• when connection begins, increase rate exponentially until first loss event:
  - initially cwnd = 1 MSS
  - double cwnd every RTT
  - done by incrementing cwnd by 1 MSS for every ACK received
• summary: initial rate is slow but ramps up exponentially fast (doubling of the sending rate every RTT)
[Figure: Host A and Host B timelines, one round of segments per RTT]
Refinement: inferring loss
• after 3 dup ACKs:
  - cwnd is cut in half
  - window then grows linearly
• but after timeout event:
  - cwnd is set to 1 MSS
  - window then grows exponentially
  - up to a threshold, then grows linearly
Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout indicates a “more alarming” congestion scenario
Refinement
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.
Implementation:
• variable ssthresh (slow-start threshold)
• on loss event, ssthresh is set to 1/2 of cwnd just before loss event
TCP Sender Congestion Control
STATE: Slow Start (SS)
  EVENT: ACK receipt for previously unACKed data
  ACTION: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to “Congestion Avoidance”
  COMMENTARY: Resulting in a doubling of CongWin every RTT
STATE: Congestion Avoidance (CA)
  EVENT: ACK receipt for previously unACKed data
  ACTION: CongWin = CongWin + MSS * (MSS/CongWin)
  COMMENTARY: Additive increase, resulting in increasing of CongWin by 1 MSS every RTT
STATE: SS or CA
  EVENT: Loss event detected by triple duplicate ACK
  ACTION: Threshold = CongWin / 2; CongWin = Threshold; set state to “Congestion Avoidance”
  COMMENTARY: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
STATE: SS or CA
  EVENT: Timeout
  ACTION: Threshold = CongWin / 2; CongWin = 1 MSS; set state to “Slow Start”
  COMMENTARY: Enter Slow Start.
STATE: SS or CA
  EVENT: Duplicate ACK
  ACTION: Increment duplicate ACK count for segment being ACKed
  COMMENTARY: CongWin and Threshold not changed
Summary: TCP Congestion Control
[FSM with three states: slow start, congestion avoidance, fast recovery]
Initial transition: cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0 -> slow start
slow start:
• new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• cwnd > ssthresh: (no action) -> congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment (stay in slow start)
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment -> fast recovery
congestion avoidance:
• new ACK: cwnd = cwnd + MSS * (MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment -> slow start
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment -> fast recovery
fast recovery:
• duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
• new ACK: cwnd = ssthresh; dupACKcount = 0 -> congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment -> slow start
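The summary state machine can be sketched as a small class that tracks only cwnd, ssthresh and the duplicate-ACK count. It is a sketch under assumptions: MSS = 1000 bytes is an arbitrary choice, retransmissions and actual segment transmission are left out, and the class and method names are invented for illustration.

```python
MSS = 1000  # assumed segment size in bytes


class CongestionControl:
    def __init__(self):
        self.state = "slow start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64 * 1024
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "fast recovery":
            self.cwnd = self.ssthresh              # deflate the window
            self.state = "congestion avoidance"
        elif self.state == "slow start":
            self.cwnd += MSS                       # doubles cwnd every RTT
            if self.cwnd > self.ssthresh:
                self.state = "congestion avoidance"
        else:
            self.cwnd += MSS * MSS / self.cwnd     # about 1 MSS per RTT
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast recovery":
            self.cwnd += MSS
            return
        self.dup_acks += 1
        if self.dup_acks == 3:                     # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast recovery"

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow start"


cc = CongestionControl()
for _ in range(3):
    cc.on_new_ack()          # slow start: 1 MSS -> 4 MSS
print(cc.state, cc.cwnd)
for _ in range(3):
    cc.on_dup_ack()          # triple duplicate ACK
print(cc.state, cc.cwnd, cc.ssthresh)
```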
TCP’s Congestion Control Service
Problem: Gridlock sets in when there is packet loss due to router congestion
[Figure: CLIENT and SERVER connected through a congested path]
• The sending system’s packet is lost due to congestion, and the sender is alerted when it stops receiving ACKs of packets sent
• Congestion control forces the End Systems to decrease the rate at which packets are sent during periods of congestion
Macroscopic Description of TCP throughput
(Based on Idealised model for the steady-state dynamics of TCP)
• what’s the average throughput of TCP as a function of window size and RTT?
  - ignore slow start (typically very short phases)
• let W be the window size when loss occurs.
  - when window is W, throughput is W/RTT
  - just after loss, window drops to W/2, throughput to W/(2 RTT).
  - Throughput increases linearly (by MSS/RTT every RTT)
• Average Throughput: 0.75 W/RTT
TCP Futures: TCP over “long, fat pipes”
Example: GRID computing application
• 1500-byte segments, 100ms RTT, desired throughput of 10 Gbps
• requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:
  $$\text{Throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$
• ➜ L = 2·10^-10 – a very small loss rate! (1 loss event every 5 billion segments)
• new versions of TCP are needed for high-speed environments
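The numbers in the "long, fat pipes" example can be rechecked with a short calculation. The helper names are made up for this sketch; the inputs are the ones given in the example.

```python
def window_in_segments(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the throughput: rate * RTT / segment size."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Solve Throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


W = window_in_segments(10e9, 0.100, 1500)
L = loss_rate_for_throughput(10e9, 0.100, 1500)
print(f"W = {W:,.0f} segments")
print(f"L = {L:.2e} (about one loss every {1 / L:,.0f} segments)")
```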
TCP Fairness
Fairness goal: if N TCP sessions share same bottleneck link, each should get an average transmission rate of R/N, an equal share of the link’s bandwidth
[Figure: TCP connection 1 and TCP connection 2 pass through a bottleneck router of capacity R]
Analysis of 2 connections sharing a link
Assumptions:
• Link with transmission rate of R
• Both connections have the same MSS and RTT
• No other TCP connections or UDP datagrams traverse the shared link
• Ignore slow start phase of TCP
• Operating in congestion-avoidance mode (linear increase phase)
Goal: adjust sending rate of the two connections to allow for equal bandwidth sharing
Why is TCP fair?
Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease: decreases throughput proportionally
A point on the graph depicts the amount of link bandwidth jointly consumed by the connections
[Figure: Connection 2 throughput (0 to R) against Connection 1 throughput (0 to R), showing the equal bandwidth share line and the full bandwidth utilisation line; congestion avoidance: additive increase; loss: decrease window by factor of 2]
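The convergence argument can be illustrated with a toy simulation of two AIMD flows. It assumes, as in the analysis, that both flows have the same RTT and see every loss event together; the starting rates, capacity and round count are arbitrary values for this sketch.

```python
def aimd_two_flows(x1, x2, capacity, rounds, increase=1.0):
    """Each round: both flows halve on loss (sum above capacity), else add `increase`."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2                      # multiplicative decrease
        else:
            x1, x2 = x1 + increase, x2 + increase        # additive increase
    return x1, x2


# Deliberately unfair start: connection 1 has most of the link
x1, x2 = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(round(x1, 3), round(x2, 3))
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in two, so the rates drift toward the equal-share line.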
The End
The following slides are just for additional reading.
TCP Latency Modeling
Multiple End Systems sharing a link
[Figure: three End Systems with 1 TCP connection each and one End System with 3 TCP connections (multithreading implementation) share a link; R bps – link’s transmission rate]
Loopholes in TCP:
• In practice, client/server applications with smaller RTT get the available bandwidth more quickly as it becomes free. Therefore, they have higher throughputs
• Multiple parallel TCP connections allow one application to get a bigger share of the bandwidth
TCP latency modeling
Q: How long does it take to receive an object from a Web server?
Latency: the time from when the client initiates a TCP connection until when the client receives the requested object in its entirety
• TCP connection establishment time
• data transfer delay
• Actual data transmission time
Two cases to consider:
• WS/R > RTT + S/R: No data transfer delay
  - An ACK for the first segment in window returns to the Sender before a window’s worth of data is sent
• WS/R < RTT + S/R: There’s data transfer delay
  - Sender has to wait for an ACK after a window’s worth of data sent
TCP Latency Modeling
[Figure: SERVER sends a FILE to CLIENT over a link; R bps – link’s transmission rate]
Assumptions:
• O – size of object in bits
• S – number of bits of MSS (max. segment size)
• Network is uncongested, with one link between end systems of rate R
• CongWin (fixed) determines the amount of data that can be sent
• No packet loss, no packet corruption, no retransmissions required
• Header overheads are negligible
• File to send = integer number of segments of size MSS
• Connection establishment, request messages, ACKs, TCP connection-establishment segments have negligible transmission times
• Initial Threshold of TCP congestion mechanism is very big
TCP latency Modeling
Case Analysis: STATIC CONGESTION WINDOW
Case 1: WS/R > RTT + S/R:
An ACK for the first segment in window returns to the Sender before a window’s worth of data is sent
K = number of windows of data that cover the object
K = O/WS, rounded up to the nearest integer
e.g. O = 256 bits, S = 32 bits, W = 4 segments: K = 256/(4 * 32) = 2
Case 1: latency = 2RTT + O/R
TCP latency Modeling
Case Analysis: STATIC CONGESTION WINDOW
Case 2: WS/R < RTT + S/R:
Sender has to wait for an ACK after a window’s worth of data sent
K = O/WS: number of windows of data that cover the object
If there are K windows, sender will be stalled (K-1) times
[Figure: timeline showing a STALLED PERIOD between consecutive windows]
Case 2: latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
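Both static-window cases can be folded into one function, since Case 1 is simply the situation where the stall term is not positive. The link rate R = 32 bps and the two RTT values below are assumptions chosen only to make the arithmetic easy to follow; O, S and W are taken from the example.

```python
import math


def static_window_latency(O, S, W, R, RTT):
    """Latency with a fixed window of W segments (O, S in bits, R in bits/sec)."""
    K = math.ceil(O / (W * S))               # windows needed to cover the object
    stall = S / R + RTT - W * S / R          # idle time after each window
    if stall <= 0:                           # Case 1: ACKs return in time
        return 2 * RTT + O / R
    return 2 * RTT + O / R + (K - 1) * stall # Case 2: (K-1) stalled periods


case1 = static_window_latency(O=256, S=32, W=4, R=32, RTT=1)
case2 = static_window_latency(O=256, S=32, W=4, R=32, RTT=5)
print(case1, case2)
```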
Case Analysis: DYNAMIC CONGESTION WINDOW
[Figure: slow-start timeline with STALLED PERIODs between windows; example with O/S = 15 segments, covered by 4 windows]
Case Analysis: DYNAMIC CONGESTION WINDOW
• Let K be the number of windows that cover the object.
• We can express K in terms of the number of segments in the
object as follows:
$$K = \min\left\{k : 2^0 + 2^1 + \dots + 2^{k-1} \ge \frac{O}{S}\right\}$$
$$K = \min\left\{k : 2^k - 1 \ge \frac{O}{S}\right\}$$
$$K = \min\left\{k : k \ge \log_2\left(\frac{O}{S} + 1\right)\right\}$$
$$K = \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil$$
Case Analysis: DYNAMIC CONGESTION WINDOW
• Time from when the server begins to transmit the kth window until the server receives an ACK for the first segment in the window:
  $$\frac{S}{R} + RTT$$
• Transmission time of kth window:
  $$2^{k-1} \frac{S}{R}$$
• Stall time after the kth window, where $[x]^+ = \max(x, 0)$:
  $$\left[\frac{S}{R} + RTT - 2^{k-1} \frac{S}{R}\right]^+$$
• Latency:
  $$\text{Latency} = 2RTT + \frac{O}{R} + \sum_{k=1}^{K-1} \left[\frac{S}{R} + RTT - 2^{k-1} \frac{S}{R}\right]^+$$
Case Analysis: DYNAMIC CONGESTION WINDOW
• Let Q be the number of times the server would stall if the object contained an infinite number of segments.
  $$Q = \left\lfloor \log_2\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1$$
• The actual number of times that the server stalls is P = min{ Q, K-1 }.
Case Analysis: DYNAMIC CONGESTION WINDOW
• Closed-form expression for the latency, with P = min{ Q, K-1 }:
  $$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$
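The closed-form expression can be cross-checked against the window-by-window sum of stall times. The inputs are assumptions picked so that every quantity is an exact number: O/S = 15 segments as in the earlier figure, S/R = 1 second and RTT = 2 seconds.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Closed form: 2RTT + O/R + P(RTT + S/R) - (2^P - 1) S/R."""
    K = math.ceil(math.log2(O / S + 1))
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


def slow_start_latency_by_sum(O, S, R, RTT):
    """Same latency, adding up the stall time after each of the first K-1 windows."""
    K = math.ceil(math.log2(O / S + 1))
    stalls = sum(max(S / R + RTT - 2 ** (k - 1) * S / R, 0) for k in range(1, K))
    return 2 * RTT + O / R + stalls


closed = slow_start_latency(O=15, S=1, R=1, RTT=2)
summed = slow_start_latency_by_sum(O=15, S=1, R=1, RTT=2)
print(closed, summed)
```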
Case Analysis: DYNAMIC CONGESTION WINDOW
$$\frac{\text{Latency}}{\text{MinimumLatency}} \le 1 + \frac{P}{\dfrac{O/R}{RTT} + 2}$$
where MinimumLatency = 2RTT + O/R is the latency without any stalls.
*Slow start will not significantly increase latency if RTT << O/R
• http://www1.cse.wustl.edu/~jain/cis788-97/ftp/tcp_over_atm/index.htm#atm-features