Chapter 3
Transport Layer
Part 4: Congestion control

Computer Networking: A Top-Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012
Chapter 3: Transport Layer
Our goals:
- understand principles behind transport-layer services:
  - multiplexing/demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- learn about transport-layer protocols in the Internet:
  - UDP: connectionless transport
  - TCP: connection-oriented transport
  - TCP congestion control
Chapter 3 outline
- 3.1 Transport-layer services
- 3.2 Multiplexing and demultiplexing
- 3.3 Connectionless transport: UDP
- 3.4 Principles of reliable data transfer
- 3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
- 3.6 Principles of congestion control
- 3.7 TCP congestion control
Principles of Congestion Control
Congestion:
- informally: “too many sources sending too much data too fast for the network to handle”
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queueing in router buffers)
- a top-10 problem!
Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- output link capacity: R
- no retransmission
Host A and Host B each transmit at an average rate of λin bytes/sec (e.g., 50 MB/s); the receiver takes data in at a rate of λout bytes/sec.
[Figure: Host A and Host B send original data (λin) into one router with unlimited shared output-link buffers and output link capacity R; the result of sharing the link between the two hosts is shown in two graphs: throughput (λout) vs. λin and delay vs. λin.]
- maximum per-connection throughput: R/2
- when λin > R/2, the average number of packets queued in the router is unbounded, and the average delay between source and destination becomes infinite
- large delays as the arrival rate λin approaches capacity (a small numeric sketch follows below)
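A minimal numeric sketch of this slide's claims. The link capacity R and the 1/(R/2 - λin) delay shape are illustrative assumptions, not values from the slides; the point is only that per-connection throughput caps at R/2 while delay blows up as λin approaches R/2.

```python
# Sketch (not from the slides): per-connection throughput and delay in scenario 1.
# R and the delay formula are illustrative assumptions.

R = 1_000_000.0  # shared output link capacity, bytes/sec (assumed value)

def per_connection_throughput(lambda_in: float) -> float:
    # each of the two senders can get at most half of the shared link
    return min(lambda_in, R / 2)

def approx_delay(lambda_in: float) -> float:
    # idealized queueing-style delay: unbounded as lambda_in approaches R/2
    if lambda_in >= R / 2:
        return float("inf")
    return 1.0 / (R / 2 - lambda_in)

for frac in (0.10, 0.50, 0.90, 0.99, 1.00):
    lam = frac * R / 2
    print(f"lambda_in = {frac:.2f} * R/2: "
          f"throughput = {per_connection_throughput(lam):9.0f} B/s, "
          f"delay ~ {approx_delay(lam):.2e}")
```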
Causes/costs of congestion: scenario 2
- one router, finite buffers
- sender retransmits timed-out packets
  - application-layer input = application-layer output: λin = λout
  - transport-layer input includes retransmissions: λ'in ≥ λin
- λ'in is called the offered load
- performance depends on how retransmission is done!
[Figure: Host A sends original data (λin) plus retransmitted data (λ'in) into a router with finite shared output-link buffers; Host B receives λout.]
Causes/costs of congestion: scenario 2
Idealization: perfect knowledge
- sender sends only when router buffers are available
- λin = λ'in, since we only send a segment when we know there is buffer space
- no packets are lost, so the maximum average throughput is R/2
[Figure: Host A sends original data λin (λ'in, including copies) into a router with free buffer space and finite shared output-link buffers; Host B receives λout. Graph: λout rises linearly with λin up to R/2.]
Causes/costs of congestion: scenario 2
Idealization: known loss
- packets can be lost, dropped at the router due to full buffers
- sender only resends if a packet is known to be lost
[Figure: Host A sends original data λin plus retransmitted data (λ'in) into a router with no free buffer space; Host B receives λout.]
Causes/costs of congestion: scenario 2
Idealization: known loss
- packets can be lost, dropped at the router due to full buffers
- sender only resends if a packet is known to be lost
[Graph: λout vs. λin, both up to R/2. When sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?). Assumption: on average, of the 0.5R units/sec sent, 0.333R bytes/sec are original data and 0.166R bytes/sec are retransmissions.]
[Figure: Host A sends original data λin plus retransmitted data (λ'in) into a router with free buffer space; Host B receives λout.]
Causes/costs of congestion: scenario 2
Realistic: duplicates
- packets can be lost, dropped at the router due to full buffers
- sender times out prematurely, sending two copies, both of which are delivered
  - the sender may time out prematurely and retransmit a packet that is merely delayed in the queue
  - the receiver will then receive multiple copies of some packets
[Graph: λout vs. λin, both up to R/2. When sending at R/2, some packets are retransmissions, including duplicates that are delivered! Assuming (for no good reason) that on average each packet is forwarded twice.]
[Figure: Host A sends original data λin plus retransmitted/duplicate copies (λ'in, triggered by a timeout) into a router with free buffer space; Host B receives λout.]
Causes/costs of congestion: scenario 2
Realistic: duplicates
- packets can be lost, dropped at the router due to full buffers
- sender times out prematurely, sending two copies, both of which are delivered
[Graph: λout vs. λin, both up to R/2. When sending at R/2, some packets are retransmissions, including duplicates that are delivered!]

“costs” of congestion:
- more work (retransmissions) for a given “goodput”
- unneeded retransmissions: the link carries multiple copies of a packet, decreasing goodput
Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit
Q: what happens as λin and λ'in increase?
A: as red λ'in increases, all arriving blue packets at the upper queue are dropped, and blue throughput → 0
[Figure: Hosts A, B, C, and D connected through routers R1 and R2 with finite shared output-link buffers; Host A sends original data λin plus retransmitted data (λ'in), and λout is measured at the receiver.]
Low traffic
- Consider the connection A-C that goes through routers R1 and R2
- A-C shares router R1 with D-B
- A-C shares router R2 with B-D
- For extremely small values of λin:
  - buffer overflows are rare
  - throughput approximately equals the offered load, i.e., λin = λout
- With slightly larger values of λin, λout increases by the same amount, since buffer overflows are still rare
Large traffic
- Again consider the connection A-C that goes through routers R1 and R2
- A-C shares router R2 with B-D
- B-D will eventually offer extremely large values of λ'in
- This will overflow R2
- Segments from A that travel through R1 to R2 will tend to get discarded
- Result: throughput from A to C goes toward 0 (see next slide)
Causes/costs of congestion: scenario 3
[Figure: λout from Host A to Host B collapses toward zero as the offered load grows.]
another “cost” of congestion:
- when a packet is dropped, any upstream transmission capacity used for that packet was wasted!
- i.e., when R2 drops a packet forwarded by R1, R1’s transmission was wasted!
Approaches towards congestion control
two broad approaches towards congestion control:

end-end congestion control:
- no explicit feedback from the network
- congestion inferred from end-system observed loss and delay
- approach taken by TCP

network-assisted congestion control:
- routers provide feedback to end systems
  - Method 1: a single bit indicating congestion (SNA, DECbit, TCP/IP ECN (proposed), ATM)
  - Method 2: router gives the explicit rate at which the sender should transmit

Note: ATM was thought to be “the” networking technology for the Internet in the late 90s and was very popular with the telecoms.
Case study: ATM* ABR congestion control
ABR (available bit rate):
- “elastic service”
- if the sender’s path is “underloaded”: sender should use the available bandwidth
- if the sender’s path is congested: sender is throttled to a minimum guaranteed rate

RM (resource management) cells:
- sent by the sender, interspersed with data cells (~1 every 32 data cells)
- bits in the RM cell are set by switches (“network-assisted”)
  - NI bit: no increase in rate (mild congestion)
  - CI bit: congestion indication
  - ER setting: 2-byte explicit-rate field
- RM cells are returned to the sender by the receiver, with the bits intact

*ATM is not used on the Internet, but it is an interesting example of how congestion control can be handled.
Case study: ATM* ABR congestion control
- two-byte ER (explicit rate) field in the RM cell
  - a congested switch may lower the ER value in the cell
  - the sender’s send rate is thus the maximum supportable rate on the path
- EFCI bit in data cells: set to 1 by a congested switch
  - if the data cell preceding an RM cell has EFCI set, the receiver sets the CI bit in the returned RM cell
* EFCI: explicit forward congestion indication
* ATM (asynchronous transfer mode) is a dedicated-connection switching technology that organizes digital data into 53-byte cells
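A hedged sketch of the ABR feedback loop just described, not the real 53-byte ATM cell format or any vendor API: switches may lower the ER value and set NI/CI, the receiver echoes the RM cell back, and the sender adjusts its rate. The back-off and ramp-up policies in sender_adjust are assumptions for illustration.

```python
# Hedged sketch of the ABR feedback loop -- not the real ATM cell format, just
# the logic: switches may set NI/CI and lower ER, the receiver echoes the RM
# cell back, and the sender adapts its rate.
from dataclasses import dataclass

@dataclass
class RMCell:
    er: float          # explicit rate (cells/sec) allowed so far along the path
    ni: bool = False   # "no increase" (mild congestion)
    ci: bool = False   # "congestion indication"

def switch_process(cell: RMCell, supportable_rate: float, congested: bool) -> RMCell:
    # a congested switch may lower ER to what it can support and set CI
    cell.er = min(cell.er, supportable_rate)
    if congested:
        cell.ci = True
    return cell

def sender_adjust(current_rate: float, cell: RMCell, mcr: float) -> float:
    # sender never exceeds the returned ER and never drops below the minimum
    # guaranteed rate (MCR); back-off and ramp-up factors are assumed policies
    if cell.ci:
        current_rate = max(mcr, current_rate / 2)
    elif not cell.ni:
        current_rate = min(cell.er, current_rate * 1.1)
    return min(current_rate, cell.er)

cell = RMCell(er=10000.0)
cell = switch_process(cell, supportable_rate=6000.0, congested=False)
print(sender_adjust(current_rate=8000.0, cell=cell, mcr=1000.0))  # capped at 6000.0
```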
Chapter 3 outline
- 3.1 Transport-layer services
- 3.2 Multiplexing and demultiplexing
- 3.3 Connectionless transport: UDP
- 3.4 Principles of reliable data transfer
- 3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
- 3.6 Principles of congestion control
- 3.7 TCP congestion control
TCP congestion control:
- TCP must use end-to-end congestion control; IP will not help!
- Each sender limits the rate at which it sends data into its connection as a function of perceived network congestion.

Three questions:
1. Q: how does TCP limit the rate at which it sends traffic?
2. Q: how does a TCP sender perceive that there is congestion on the path between itself and the destination?
3. Q: what algorithm should the sender use to change its send rate as a function of perceived end-to-end congestion?
TCP congestion control:
1. Q: how does TCP limit the rate at which it sends traffic?
- Each side of a TCP connection has a receive buffer, a send buffer, and variables (LastByteRead, rwnd, etc.)
- TCP congestion control keeps another variable, the congestion window (cwnd), and limits the sender so that:
  LastByteSent – LastByteAcked <= min{cwnd, rwnd}
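A small sketch of this constraint and of the resulting sending rate (roughly cwnd/RTT, as on the next slide). The names LastByteSent, LastByteAcked, cwnd, and rwnd come from the slides; segment_size and the helper functions are illustrative additions.

```python
# Sketch of the sender-side constraint: unACKed bytes in flight stay below
# min(cwnd, rwnd); the resulting sending rate is roughly cwnd/RTT bytes/sec.

def can_send(last_byte_sent: int, last_byte_acked: int,
             cwnd: int, rwnd: int, segment_size: int) -> bool:
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + segment_size <= min(cwnd, rwnd)

def approx_send_rate(cwnd: int, rtt: float) -> float:
    # roughly: send cwnd bytes, wait one RTT for the ACKs, repeat
    return cwnd / rtt

print(can_send(last_byte_sent=20000, last_byte_acked=8000,
               cwnd=16000, rwnd=64000, segment_size=1460))  # True: 13460 <= 16000
print(approx_send_rate(cwnd=16000, rtt=0.2))                # 80000.0 bytes/sec
```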
TCP Congestion Control: details
[Figure: sender sequence-number space: bytes up to the last byte ACKed, then bytes sent but not yet ACKed (“in flight”, at most cwnd), then the last byte sent.]
- sender limits transmission:
  LastByteSent – LastByteAcked <= cwnd
- cwnd is dynamic, a function of perceived network congestion

TCP sending rate:
- roughly: send cwnd bytes, wait one RTT for the ACKs, then send more bytes
  rate ≈ cwnd / RTT bytes/sec

Assumptions:
- the receive buffer is so large that rwnd can be ignored
- the sender always has data to send
- loss and packet delays are negligible
TCP Congestion Control: details
- sender limits its rate by limiting the number of unACKed bytes “in the pipeline”:
  LastByteSent – LastByteAcked <= cwnd
  - cwnd: congestion window, in bytes
  - cwnd differs from rwnd (how, why?)
  - sender is limited by min(cwnd, rwnd)
[Figure: cwnd bytes sent per RTT, with the ACKs returning one RTT later.]
TCP congestion control:
2. Q: how does a TCP sender perceive that there is congestion on the path between itself and the destination?
- goal: the TCP sender should transmit as fast as possible, but without congesting the network
  - Q: how to find a rate just below the congestion level?
- decentralized: each TCP sender sets its own rate, based on implicit feedback:
  - ACK: segment received (a good thing!), network not congested, so increase the sending rate
  - lost segment: assume the loss is due to a congested network, so decrease the sending rate
TCP: congestion avoidance (AIMD)
- ACKs: increase cwnd by 1 MSS per RTT: additive increase
- loss: cut cwnd in half (for non-timeout-detected loss): multiplicative decrease

AIMD: Additive Increase, Multiplicative Decrease
TCP congestion control: additive increase, multiplicative decrease
approach: the sender increases its transmission rate (window size), probing for usable bandwidth, until loss occurs
- additive increase: increase cwnd by 1 MSS (Maximum Segment Size) every RTT until loss is detected
- multiplicative decrease: cut cwnd in half after loss
[Figure: AIMD sawtooth behavior, probing for bandwidth: cwnd (the TCP sender’s congestion window size) additively increases until loss occurs, then is cut in half, plotted against time. Note that the actual curve is more complicated.]
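A tiny simulation of the sawtooth, assuming (purely for illustration) that a loss is detected whenever cwnd reaches a fixed “pipe capacity”; real loss detection works through timeouts and duplicate ACKs, covered below.

```python
# AIMD sketch: +1 MSS per RTT while no loss, cwnd halved on loss. The loss
# trigger (cwnd reaching a fixed capacity) is an assumption for illustration.

MSS = 1                # work in units of MSS
PIPE_CAPACITY = 40     # assumed point at which the path starts dropping packets

def aimd_trace(rounds: int, cwnd: int = 10) -> list[int]:
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= PIPE_CAPACITY:      # pretend a loss is detected this RTT
            cwnd = max(1, cwnd // 2)   # multiplicative decrease
        else:
            cwnd += MSS                # additive increase: +1 MSS per RTT
    return trace

print(aimd_trace(60))   # climbs to ~40, halves to ~20, climbs again: the sawtooth
```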
TCP congestion control:
- the TCP congestion-control algorithm has three components:
  - slow start (required in TCP)
  - congestion avoidance (required in TCP)
  - fast recovery (recommended in TCP)
TCP Congestion Control: overview
ACK received: increase cwnd
- slow-start phase:
  - increase exponentially fast at connection start, or following a timeout
  - exponential growth continues until cwnd reaches the threshold (ssthresh)
- congestion avoidance:
  - then increase linearly

segment loss event: reduce cwnd
- timeout: no response from the receiver
  - cut cwnd to 1 MSS
- 3 duplicate ACKs (some segments are getting through):
  - cut cwnd in half
TCP Slow Start
- when the connection begins, cwnd = 1 MSS (maximum segment size)
  - example: MSS = 500 bytes and RTT = 200 msec
  - initial rate = MSS/RTT = 20 kbps
- available bandwidth may be >> MSS/RTT
  - quickly ramp up to a respectable rate
[Figure: Host A / Host B timeline, with cwnd growing to 2, 3, then 4 MSS over successive RTTs.]
TCP Slow Start
- increase the rate exponentially until the first loss event or until the threshold is reached
  - double cwnd every RTT
  - done by incrementing cwnd by 1 MSS for every ACK received
- Example (as sketched below):
  - 1 MSS sent; receive 1 ACK, so increase cwnd to 2 MSS (doubled)
  - now 2 MSS sent; receive 2 ACKs, increment cwnd by 1 MSS for each, to 4 MSS (doubled)
  - now 4 MSS sent; receive 4 ACKs, increment cwnd by 4 MSS to 8 MSS (doubled)
[Figure: Host A / Host B timeline, with cwnd growing each RTT.]
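The per-ACK rule above can be checked with a few lines of code; this sketch reproduces the example on this slide (1, 2, 4, 8 MSS), assuming no loss and one ACK per segment.

```python
# Slow-start sketch: incrementing cwnd by 1 MSS for every ACK doubles cwnd each
# RTT, matching the example above. Assumes no loss.

def slow_start_trace(rounds: int, cwnd_mss: int = 1) -> list[int]:
    trace = [cwnd_mss]
    for _ in range(rounds):
        acks = cwnd_mss        # one ACK per segment sent this RTT
        cwnd_mss += acks       # +1 MSS per ACK received
        trace.append(cwnd_mss)
    return trace

print(slow_start_trace(3))     # [1, 2, 4, 8]
```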
Transitioning into/out of slow start
ssthresh: a cwnd threshold variable maintained by TCP
- on a loss event: set ssthresh to cwnd/2
  - remembers (half of) the TCP rate when congestion last occurred
- when cwnd >= ssthresh: transition from slow start to the congestion-avoidance phase
TCP: congestion avoidance
- Goal: increase cwnd more slowly, to avoid congestion
- while cwnd <= ssthresh:
  - grow cwnd exponentially: increase cwnd by 1 MSS per ACK
- while cwnd > ssthresh:
  - grow cwnd linearly: cwnd = cwnd + MSS·(MSS/cwnd) per ACK
TCP: congestion avoidance
- while cwnd > ssthresh, grow cwnd linearly: cwnd = cwnd + MSS·(MSS/cwnd) per ACK
  - assume cwnd starts at 1 MSS
  - send one MSS
  - receive 1 ACK: cwnd = 1 MSS + MSS·(MSS/1 MSS), i.e., cwnd grows by 1 MSS
  - cwnd is now 2 MSS; send 2 MSS
  - receive 2 ACKs: cwnd = 2 MSS + (MSS·(MSS/2 MSS) × 2 ACKs) = 3 MSS
  - cwnd is now 3 MSS; send 3 MSS
  - receive 3 ACKs: cwnd = 3 MSS + (MSS·(MSS/3 MSS) × 3) = 4 MSS
- Result: cwnd grows by 1 MSS each RTT (see the sketch below)
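The same arithmetic in code. As in the worked example, the per-ACK increment is evaluated once per round (a real sender recomputes it on every ACK, giving slightly smaller values); MSS = 1000 bytes is an assumed value.

```python
# Congestion-avoidance sketch: adding MSS*(MSS/cwnd) per ACK grows cwnd by
# roughly one MSS per RTT. The increment is held fixed within a round, as in
# the worked example above.

MSS = 1000  # bytes (assumed)

def ca_round(cwnd: float) -> float:
    segments = int(cwnd // MSS)                 # segments sent this RTT
    increment_per_ack = MSS * (MSS / cwnd)      # evaluated at the start of the round
    return cwnd + segments * increment_per_ack  # one such increment per ACK

cwnd = 1.0 * MSS
for _ in range(3):
    cwnd = ca_round(cwnd)
    print(round(cwnd / MSS, 2))                 # 2.0, 3.0, 4.0 (one MSS per RTT)
```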
Relation between slow start and congestion avoidance
[FSM fragment, reconstructed from the diagram:]
- initialization: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; start in slow start
- slow start, on new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s) as allowed
- slow start, on duplicate ACK: dupACKcount++
- slow start, when cwnd > ssthresh: move to congestion avoidance
- slow start or congestion avoidance, on timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit the missing segment, (re)enter slow start
TCP: fast recovery
- Goal: increase cwnd faster, to get back to the maximum transmission rate
- increase cwnd by 1 MSS per duplicate ACK received for the missing segment
- when an ACK arrives for the missing segment, enter the congestion-avoidance state after deflating cwnd
- if a timeout occurs, go to the slow-start state after setting ssthresh to ½ cwnd and then cwnd to 1 MSS
TCP congestion control FSM: overview
[State diagram, reconstructed:]
- slow start → congestion avoidance: when cwnd > ssthresh
- slow start → fast recovery: on loss detected by 3 duplicate ACKs
- congestion avoidance → fast recovery: on loss detected by 3 duplicate ACKs
- fast recovery → congestion avoidance: on new ACK
- any state → slow start: on loss detected by timeout
Summary: TCP Congestion Control
[Full FSM, reconstructed from the diagram; a code sketch follows after this slide.]

Initialization: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; start in slow start.

slow start:
- new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- cwnd > ssthresh: move to congestion avoidance
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit the missing segment, move to fast recovery
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit the missing segment

congestion avoidance:
- new ACK: cwnd = cwnd + MSS·(MSS/cwnd), dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit the missing segment, move to fast recovery
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit the missing segment, move to slow start

fast recovery:
- duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s) as allowed
- new ACK: cwnd = ssthresh, dupACKcount = 0, move to congestion avoidance
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit the missing segment, move to slow start
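A Reno-style sketch of the FSM above in Python, with cwnd and ssthresh in bytes (MSS = 1000 is an assumed value). Retransmissions, timers, and the actual transmission of segments are left out; only the state and variable updates from the diagram are shown.

```python
MSS = 1000  # bytes (assumed value)

class TCPCongestionControl:
    """State/variable updates from the summary FSM; sending/retransmitting omitted."""

    def __init__(self):
        self.state = "slow_start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64_000        # initial ssthresh: 64 KB, as on the slide
        self.dup_ack_count = 0

    def on_new_ack(self):
        if self.state == "slow_start":
            self.cwnd += MSS                        # exponential growth
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        elif self.state == "congestion_avoidance":
            self.cwnd += MSS * (MSS / self.cwnd)    # linear growth
        else:                                       # fast_recovery
            self.cwnd = self.ssthresh               # deflate the window
            self.state = "congestion_avoidance"
        self.dup_ack_count = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += MSS                        # inflate window per dup ACK
            return
        self.dup_ack_count += 1
        if self.dup_ack_count == 3:                 # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS     # (retransmit missing segment)
            self.state = "fast_recovery"

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS                         # (retransmit missing segment)
        self.dup_ack_count = 0
        self.state = "slow_start"

cc = TCPCongestionControl()
for _ in range(8):
    cc.on_new_ack()
print(cc.state, cc.cwnd)    # slow_start 9000 (still below ssthresh)
for _ in range(3):
    cc.on_dup_ack()
print(cc.state, cc.cwnd)    # fast_recovery 7500.0 (= cwnd/2 + 3 MSS)
```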
TCP: detecting, reacting to loss (Tahoe vs. Reno, the newer version)
- loss indicated by timeout:
  - cwnd set to 1 MSS
  - window then grows exponentially (as in slow start) to the threshold, then grows linearly
- loss indicated by 3 duplicate ACKs: TCP Reno
  - duplicate ACKs indicate the network is capable of delivering some segments
  - cwnd is cut in half; the window then grows linearly
- TCP Tahoe always sets cwnd to 1 MSS (whether loss is indicated by timeout or by 3 duplicate ACKs)
Popular “flavors” of TCP
[Figure: cwnd window size (in segments) vs. transmission round*, comparing TCP Tahoe and TCP Reno, with ssthresh marked.]
*1 transmission round = send cwnd bytes, then receive all the ACKs for those bytes (or a loss occurs)
Summary: TCP Congestion Control
- when cwnd < ssthresh, the sender is in the slow-start phase; the window grows exponentially
- when cwnd >= ssthresh, the sender is in the congestion-avoidance phase; the window grows linearly
- when a triple duplicate ACK occurs, ssthresh is set to cwnd/2 and cwnd is set to ~ssthresh
- when a timeout occurs, ssthresh is set to cwnd/2 and cwnd is set to 1 MSS
TCP throughput
- Q: what is the average throughput of TCP as a function of window size and RTT?
  - ignoring slow start
- let W* be the window size when loss occurs
  - when the window is W, throughput is W/RTT
  - just after a loss, the window drops to W/2 and throughput to W/(2·RTT)
  - average throughput: 0.75 W/RTT (a short derivation follows below)
*W is just another symbol for cwnd
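A one-line check of the 0.75·W/RTT figure, sketched in LaTeX under the assumption that the window sweeps linearly from W/2 back up to W over each loss cycle:

```latex
% average window over one AIMD cycle (cwnd rises linearly from W/2 to W)
\bar{W} = \tfrac{1}{2}\left(\tfrac{W}{2} + W\right) = \tfrac{3W}{4}
\quad\Rightarrow\quad
\text{average throughput} \approx \frac{\bar{W}}{RTT} = 0.75\,\frac{W}{RTT}
```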
TCP throughput
- Problem: TCP was designed for the days of SMTP, FTP, and Telnet
- It was not designed for today’s HTTP-dominated Internet and video-on-demand
- The high-speed TCP connections needed for grid- and cloud-computing applications require different configurations
TCP Futures: TCP over “long, fat pipes”
- example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- if W is the window size in segments, throughput is W · 1500 bytes / RTT, so:
  W · 1500 · 8 bits / (100 × 10^-3 sec) = 10 Gbps
  W · 120,000 bits/sec = 10 Gbps
  W = 10 × 10^9 / (120 × 10^3) ≈ 0.0833 × 10^6
- requires 83,333 in-flight segments
TCP Futures: TCP over “long, fat pipes”
- 83,333 in-flight segments is a lot. Some segments will be lost.
- throughput can be stated in terms of the loss rate L (see problem P45):
  throughput ≈ 1.22 · MSS / (RTT · √L)
- to get 10 Gbps throughput ➜ L = 2·10^-10 (checked below). Wow.
- new versions of TCP are needed for high-speed connections
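A quick numeric check of the two slides above, using the loss-rate formula cited there (textbook problem P45); the script only recomputes the window size and the loss rate L.

```python
# Check the numbers above: required in-flight segments for 10 Gbps over a
# 100 ms RTT path, and the loss rate L implied by
#   throughput ~= 1.22 * MSS / (RTT * sqrt(L))
MSS_BITS = 1500 * 8      # bits per segment
RTT = 0.100              # seconds
TARGET = 10e9            # 10 Gbps

window_segments = TARGET * RTT / MSS_BITS
print(round(window_segments))                    # 83333 in-flight segments

L = (1.22 * MSS_BITS / (RTT * TARGET)) ** 2      # solve the formula for L
print(f"{L:.1e}")                                # ~2.1e-10
```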
TCP Fairness
fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]
Why is TCP fair?
Two competing sessions (see the sketch below):
- additive increase gives a slope of 1 as throughput increases
- multiplicative decrease reduces throughput proportionally
[Figure: connection 1 throughput vs. connection 2 throughput, both bounded by R. Congestion avoidance (additive increase) moves the operating point along a 45° line; each loss (window decreased by a factor of 2) pulls it halfway toward the origin, so the trajectory converges toward the equal-bandwidth-share line.]
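A small simulation of the picture above, under the simplifying assumption that both flows see loss (and halve) at the same instant, whenever their combined rate exceeds the bottleneck capacity R; the values of R and the starting rates are illustrative.

```python
# Two AIMD flows sharing one bottleneck of capacity R: both add 1 unit per
# round; when their sum exceeds R, both halve (synchronized loss is an
# assumption for illustration). The two rates converge toward each other.

R = 100.0

def simulate(x1: float, x2: float, rounds: int = 200) -> tuple[float, float]:
    for _ in range(rounds):
        if x1 + x2 > R:              # bottleneck overloaded: both flows see loss
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1  # additive increase
    return x1, x2

print(simulate(5.0, 80.0))  # after 200 rounds the two rates are nearly equal
```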
Fairness (more)
Fairness and UDP:
- multimedia apps often do not use TCP
  - they do not want their rate throttled by congestion control
- instead they use UDP:
  - pump audio/video at a constant rate, tolerate packet loss

Fairness and parallel TCP connections:
- nothing prevents an app from opening parallel connections between two hosts
- web browsers do this
- example: a link of rate R supporting 9 existing connections:
  - a new app that asks for 1 TCP connection gets rate R/10
  - a new app that asks for 11 TCP connections gets roughly R/2!
Chapter 3: Summary
- principles behind transport-layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation and implementation in the Internet:
  - UDP
  - TCP

Next:
- leaving the network “edge” (application and transport layers)
- into the network “core”