• Error Control
• Congestion Control
• Timers
Malathi Veeraraghavan
Originals by Jörg Liebeherr
Background on ARQ Error Control 1
• Two types of errors:
– Lost packets
– Damaged packets
• Error control schemes that involve error detection and
retransmission of lost or corrupted frames are referred to
as Automatic Repeat reQuest (ARQ) error control
• Most Error Control techniques are based on:
1. Error Detection Scheme (Parity checks, CRC).
2. Retransmission Scheme.
Background on ARQ Error Control 2
• All retransmission schemes use all or a subset of the
following procedures:
– Positive acknowledgments (ACK)
– Negative acknowledgments (NACK)
– Selective acknowledgments (SACK)
• All retransmission schemes (using ACK, NACK, SACK or
all) rely on the use of timers
• The most common ARQ retransmission schemes are:
– Stop-and-Wait ARQ
– Go-Back-N ARQ
– Selective Repeat ARQ
Error Control in TCP
• TCP maintains multiple timers for each connection
• TCP couples error control and congestion control (i.e., it
assumes that errors are caused by congestion)
TCP Timers
• TCP maintains multiple timers:
– Retransmission Timer:
• The timer is started during a transmission. A timeout causes a
retransmission
– Persist Timer
• Ensures that window size information is transmitted even if no data
is transmitted
– Keepalive Timer
• Detects crashes on the other end of the connection
– Other timers
• Delayed ACK timer, timeout of connection setup, abort timeout
(total timeout - keeps retransmitting till this timeout, then it kills the
connection), 2MSL timeout (when closing connection)
TCP Retransmission Timer
• Retransmission Timer:
– The setting of the retransmission timer is crucial for
efficiency
– Timeout value too small -> results in unnecessary
retransmissions
– Timeout value too large -> long waiting time before a
retransmission can be issued
– A problem is that the delays in the network are not fixed
– Therefore, the retransmission timers must be adaptive
Measuring TCP Retransmission Timers
[Figure: ftp session from aida (aida.poly.edu) to rigoletto (rigoletto.poly.edu)]
• Transfer file from aida to rigoletto
• Unplug Ethernet cable in the middle of file transfer
tcpdump Trace
10:42:01.704681 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:01.705603 aida.40001 > rigoletto.ftp-data: . 162649:164109(1460) ack 1 win 17520
10:42:01.706753 aida.40001 > rigoletto.ftp-data: . 164109:165569(1460) ack 1 win 17520
10:42:02.741764 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:05.741788 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:11.741828 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:23.741951 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:47.742176 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:43:35.742587 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:44:39.743140 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:45:43.743702 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:46:47.744271 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:47:51.752138 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:48:55.745547 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:49:59.746123 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:51:03.745839 aida.40001 > rigoletto.ftp-data: R 165569:165569(0) ack 1 win 17520
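The gaps between retransmissions can be read off the trace mechanically. The following is a minimal sketch, not part of the original slides: it copies the timestamps of the first transmission of segment 161189, its twelve retransmissions, and the final reset (R) from the trace above, and the names `STAMPS` and `intervals` are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the trace above: first transmission of
# segment 161189, twelve retransmissions, and the final reset (R).
STAMPS = """
10:42:01.704681 10:42:02.741764 10:42:05.741788 10:42:11.741828
10:42:23.741951 10:42:47.742176 10:43:35.742587 10:44:39.743140
10:45:43.743702 10:46:47.744271 10:47:51.752138 10:48:55.745547
10:49:59.746123 10:51:03.745839
""".split()


def intervals(stamps):
    """Return the gaps in seconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = intervals(STAMPS)
print([round(g, 2) for g in gaps])         # per-attempt gaps in seconds
print(round(sum(gaps) / 60, 2), "minutes")  # time until the reset
```

Rounded to whole seconds, the gaps double from about 1 second up to 64 seconds and then stay at 64.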
Interpreting the Measurements
• The intervals between retransmission attempts, in seconds, are:
1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
• Time between retransmissions is doubled each time (Exponential
Backoff Algorithm)
• Timer is not increased beyond 64 seconds
• TCP gives up after 13th attempt and 9 minutes (total timeout;
tcp_ip_abort_interval is 2 mins in Solaris and can be programmed by
administrator; 9 mins is the commonly used old timeout value)
[Figure: plot of Seconds (0 to 600) versus Transmission Attempts (0 to 12)]
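The doubling with a ceiling can be sketched in a few lines. This is an illustrative model only: the function name `backoff_schedule` is made up, the 64-second cap is the one observed in the measurements, and the nominal first value of 1.5 seconds is an assumption (the measured first gap was shorter, 1.03 sec).

```python
def backoff_schedule(first_rto, attempts, cap=64.0):
    """Return the timeout used for each attempt under exponential backoff.

    The timeout doubles after every expiry but is never allowed to
    exceed `cap` seconds.
    """
    schedule = []
    rto = first_rto
    for _ in range(attempts):
        schedule.append(min(rto, cap))
        rto *= 2
    return schedule


# Nominal schedule for 13 attempts, assuming a first timeout of 1.5 s.
print(backoff_schedule(1.5, 13))
```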
TCP timers
• First timeout occurs based on when timer was initialized.
• This explains why the first timeout occurs at 1.03 sec and not 1.5.
• If the base timer clock is 500 ms, the first timeout occurs after 3 timer ticks. This
happens to occur at 1.03 sec after first segment was sent. Subsequent
retransmissions occur at 3 sec, 6 sec, 12 sec, etc.
[Figure: timeline of ticks numbered 1 to 12, 500 ms per tick. TCP sends the first
segment somewhere within a tick interval; the retransmission timer expires after three
ticks (<1.5 sec; in this case it happens to be 1.03 sec); the next retransmission timer
expires after six ticks (3 sec).]
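The effect of a coarse tick clock can be illustrated with a small model. This sketch assumes that the timer is a counter decremented at every tick boundary, so the first tick may arrive almost immediately after the timer is started; the function name `expiry_delay` and the send time of 0.47 sec are hypothetical values chosen to reproduce the measured 1.03 sec.

```python
import math


def expiry_delay(send_time, ticks=3, tick=0.5):
    """Seconds between starting a timer and its expiry.

    The timer is modeled as a counter of `ticks` that is decremented
    at every multiple of `tick` seconds strictly after `send_time`.
    """
    first_tick = math.floor(send_time / tick + 1) * tick
    expiry = first_tick + (ticks - 1) * tick
    return expiry - send_time


print(expiry_delay(0.47))           # timer started 0.47 s into a tick interval
print(expiry_delay(1.5, ticks=6))   # timer restarted exactly on a tick
```

A timer started on a tick boundary runs for the full number of ticks, which is consistent with the later retransmissions occurring at whole multiples of 500 ms.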
Adaptive mechanism
• The retransmission mechanism of TCP is adaptive
• The retransmission timers are set based on round-trip time (RTT) measurements
that TCP performs
• The RTT is based on time difference between segment transmission and ACK
• But:
– TCP does not ACK each segment
– Can’t start a second RTT measurement if timing on one segment is in progress
– Each connection has only one timer
[Figure: timeline of Segments 1 to 5 and their ACKs (ACK for Segment 1, ACK for
Segment 2 + 3, ACK for Segment 4, ACK for Segment 5), with three RTT measurements
marked RTT #1, RTT #2, and RTT #3]
Computation of RTO in adaptive scheme
• Retransmission timer is set to a Retransmission Timeout (RTO) value.
• RTO is calculated based on the RTT measurements.
• The RTT measurements are smoothed by the following estimators A (smoothed mean RTT
value) and D (smoothed mean deviation of RTT), where M is the newly measured RTT:
$$\mathrm{Err} = M - A$$
$$A \leftarrow A + g\,\mathrm{Err} = (1-g)A + gM$$
$$D \leftarrow D + h\,(|\mathrm{Err}| - D) = (1-h)D + h\,|\mathrm{Err}|$$
$$\mathrm{RTO} = A + 4D \quad \text{(latest formula)}$$
(book also says A+2D for initial value; we’ll use A+4D)
• The gains are set to h=1/4 and g=1/8
– In the formula for computing the new smoothed mean RTT A, 0.125
times the newly measured value (M) is added to 0.875 times the old smoothed
value of A
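The update rules above translate directly into code. The class below is a minimal sketch rather than any particular TCP implementation; the name `RtoEstimator` and its interface are made up for illustration, while the gains g=1/8, h=1/4 and the formula RTO = A + 4D are taken from the slide.

```python
class RtoEstimator:
    """Smoothed RTT mean (A) and mean deviation (D), with RTO = A + 4D."""

    def __init__(self, a, d, g=0.125, h=0.25):
        self.a = a  # smoothed mean RTT
        self.d = d  # smoothed mean deviation of RTT
        self.g = g  # gain for the mean
        self.h = h  # gain for the deviation

    @property
    def rto(self):
        return self.a + 4 * self.d

    def update(self, m):
        """Fold in a new RTT measurement M and return the new RTO."""
        err = m - self.a
        self.a += self.g * err
        self.d += self.h * (abs(err) - self.d)
        return self.rto


est = RtoEstimator(a=1.0, d=1.0)
print(est.update(2.0))  # A becomes 1.125, D stays 1.0
```

With initial values A=1, D=1 and a measurement M=2, the sketch yields an RTO of 5.125 seconds.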
Example of RTO computation (adaptive)
• Assume A=1, D=1 (initial values)
• Err = 2 -1 =1 (since M, the measured RTT is 2)
• A = 1 + 0.125×1= 1.125; D = 1+0.25 (1-1)=1
• RTO = A+4D=1.125+4 = 5.125
• This is why in the figure below when segment 2 is lost, it is
retransmitted after 5.125 sec.
[Figure: Segment 1 is sent and the ACK for Segment 1 returns after RTT = 2; Segment 2
is sent and lost (X); after RTO = 5.125 Segment 2 is retransmitted and then acknowledged]
Karn’s Algorithm
[Figure: a segment is sent, a timeout occurs, the segment is retransmitted, and an
ACK arrives; two candidate intervals are each marked "RTT ?"]
• If an ACK for a retransmitted segment is received, the sender
cannot tell if the ACK belongs to the original or the
retransmission.
• Karn’s Algorithm:
– Don’t update A or D on any segments that have been
retransmitted.
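Karn's rule amounts to a guard in front of the RTT estimator. The following sketch shows that guard under assumed bookkeeping: the `SentSegment` record and the `rtt_sample` helper are hypothetical names, not part of any real TCP stack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentSegment:
    seq: int
    sent_at: float               # time of the first transmission
    retransmitted: bool = False  # set once the segment is sent again


def rtt_sample(segment: SentSegment, ack_time: float) -> Optional[float]:
    """Return an RTT sample, or None if Karn's algorithm forbids one.

    An ACK for a retransmitted segment is ambiguous, so it must not
    be used to update A or D.
    """
    if segment.retransmitted:
        return None
    return ack_time - segment.sent_at


print(rtt_sample(SentSegment(seq=1, sent_at=10.0), ack_time=10.5))
print(rtt_sample(SentSegment(seq=2, sent_at=11.0, retransmitted=True), 17.0))
```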
RTO Calculation: Example
• At t1: RTO = 6 sec
• At t2: RTO = 2 * 6 = 12 sec (exponential backoff)
• At t3: RTO is not updated (due to Karn’s algorithm)
[Figure: timeline with instants t1 to t9 showing a SYN, a timeout, a retransmitted SYN,
SYN + ACK, ACK, then Segments 1 to 6 and the ACKs for Segments 1 to 4, with RTT
measurements marked RTT #1, RTT #2, and RTT #3]
Congestion control (Second topic of this lecture)
• Most often, a packet loss in a network is due to an overflow at
a congested router (rather than due to a transmission error)
• A sender can detect lost packets through a:
• Timeout of a retransmission timer
• Receipt of a duplicate ACK
• TCP assumes that a packet loss is caused by congestion and
reduces the size of the sending window (cwnd)
• Algorithms that reduce and then reopen the sending window
as packets are lost:
– Congestion Avoidance
– Fast retransmit and Fast recovery
Recall Slow Start / Congestion Avoidance
• Here we give a recap of the normal operation of Slow Start
and Congestion Avoidance
If cwnd <= ssthresh then
    /* Slow Start Phase */
    Each time an ACK is received:
        cwnd = cwnd + segsize
else
    /* cwnd > ssthresh */
    /* Congestion Avoidance Phase */
    Each time an ACK is received:
        cwnd = cwnd + segsize * segsize / cwnd + segsize / 8
endif
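The pseudocode above can be written as a single per-ACK update. This is a direct transcription into Python for illustration; the function name `on_ack` is made up, and the values used in the loop (segsize = 512, ssthresh = 2560, starting cwnd = 512) are example numbers.

```python
def on_ack(cwnd, ssthresh, segsize):
    """Return the new cwnd (in bytes) after one ACK for new data."""
    if cwnd <= ssthresh:
        # Slow start: grow by one segment per ACK.
        return cwnd + segsize
    # Congestion avoidance: grow by roughly one segment per round trip.
    return cwnd + segsize * segsize / cwnd + segsize / 8


cwnd = 512
for _ in range(6):
    cwnd = on_ack(cwnd, ssthresh=2560, segsize=512)
    print(cwnd)
```

Note that slow start still applies when cwnd equals ssthresh, so the switch to congestion avoidance happens only once cwnd has grown past ssthresh.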
Congestion Avoidance Algorithm
• When congestion occurs (indicated by timeout or receipt of
duplicate ACK),
– ssthresh is set to half the current window size (the
minimum of the advertised window (AW) and cwnd):
ssthresh = min(cwnd,AW) / 2 but at least 2 segments
– cwnd is changed according to:
cwnd = 1 segsize = 1 MSS bytes (in case of timeout only)
• When new data is acknowledged, cwnd is increased according
to whether it is in slow start or congestion avoidance (CA)
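The reaction to congestion described above can be sketched as one function. The name `on_congestion` and its argument list are illustrative assumptions; the rules themselves (halve the current window with a floor of two segments, and reset cwnd to one segment only on a timeout) follow the slide.

```python
def on_congestion(cwnd, advertised_window, segsize, timeout):
    """Return (cwnd, ssthresh) after a congestion indication.

    `timeout` is True for a retransmission timeout and False for a
    duplicate ACK.
    """
    ssthresh = max(min(cwnd, advertised_window) / 2, 2 * segsize)
    if timeout:
        cwnd = segsize  # restart from one segment (slow start)
    return cwnd, ssthresh


# Hypothetical values, in bytes.
print(on_congestion(cwnd=8192, advertised_window=4096, segsize=512, timeout=True))
print(on_congestion(cwnd=1024, advertised_window=4096, segsize=512, timeout=False))
```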
Slow Start / Congestion Avoidance
• A typical plot of cwnd for a TCP connection (segsize = 1500 bytes):
[Figure: plot of cwnd]
Accelerated retransmissions (Fast retransmit)
• TCP allows accelerated retransmissions (Fast Retransmit)
– If receiver gets a segment out of order, it sends an ACK with
the expected sequence number. If sender receives one or
two duplicate ACKs, it assumes that segments were merely
reordered; once the expected segment arrives at the receiver,
the receiver sends the correct ACK. But if a third duplicate
ACK is received at sender, it assumes the segment was lost and
retransmits immediately without waiting for expiry of the
retransmission timer. Hence it is called fast retransmit.
Fast Retransmit and Fast Recovery
• After the third duplicate ACK (meaning fourth ACK) is received
by the sender, it transmits a single segment without waiting for a
timeout to expire.
[Figure: the receiver sends ACK 100 four times; Data (100:200) is lost on its first
transmission and is retransmitted after the third duplicate ACK 100]
• If 3rd duplicate ACK (this means fourth ACK with same ack no.) is received:
ssthresh = min(cwnd, receiver’s advertised window)/2
cwnd = ssthresh + 3 segsize; then retransmit segment
Reason: TCP receiver has to issue an ACK every time it receives a new segment.
Therefore when the sender receives 3 duplicate ACKs it implies that three
segments got through the network successfully; therefore it inflates the cwnd.
• For each additional duplicate ACK received:
cwnd = cwnd + segsize
and transmit a segment if allowed by new value of cwnd
• When an ACK arrives that acknowledges new data, set cwnd = ssthresh (this should
be the ACK for the retransmission from step 1); additionally, it will ack intermediate
segments between lost packet and receipt of third duplicate ACK, so set cwnd = cwnd
+ segsize; now in CA phase
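The three steps above form a small state machine driven by ACK arrivals. The class below is a sketch under simplifying assumptions: it tracks only cwnd, ssthresh, and the duplicate-ACK count, it does not send anything, and the name `FastRecovery` and the byte values in the usage are hypothetical.

```python
class FastRecovery:
    """Window bookkeeping for fast retransmit and fast recovery."""

    def __init__(self, cwnd, ssthresh, segsize, advertised_window):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.segsize = segsize
        self.advertised_window = advertised_window
        self.dup_acks = 0

    def on_duplicate_ack(self):
        """Return True when the missing segment should be retransmitted."""
        self.dup_acks += 1
        if self.dup_acks < 3:
            return False  # may only be reordering; wait
        if self.dup_acks == 3:
            self.ssthresh = min(self.cwnd, self.advertised_window) / 2
            self.cwnd = self.ssthresh + 3 * self.segsize
            return True
        self.cwnd += self.segsize  # each further duplicate inflates cwnd
        return False

    def on_new_ack(self):
        """Handle an ACK for new data, leaving recovery if it was active."""
        if self.dup_acks >= 3:
            self.cwnd = self.ssthresh + self.segsize  # deflate, then CA step
        self.dup_acks = 0


fr = FastRecovery(cwnd=4096, ssthresh=8192, segsize=512, advertised_window=8192)
print([fr.on_duplicate_ack() for _ in range(4)], fr.cwnd, fr.ssthresh)
fr.on_new_ack()
print(fr.cwnd)
```

Outside recovery, `on_new_ack` only resets the counter; normal window growth is left to the slow start and congestion avoidance rule.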
Example of slow start and congestion avoidance (MSS=512
bytes; advertised window =5120 bytes)
• Normal operation (each segment carries 512 bytes with ack 10)

Event               cwnd    ssthresh
start               512     2560
ack 513 received    1024    2560
ack 1025 received   1536    2560
ack 1537 received   2048    2560
ack 2049 received   2560    2560
ack 2561 received   3072    2560    (cwnd > ssthresh: enter congestion avoidance)
ack 3073 received   3222    2560

• Segments sent, in order: PSH 1:513, PSH 513:1025, PSH 1025:1537, PSH 1537:2049,
PSH 2049:2561, PSH 2561:3073
Example: computation of cwnd on previous slide
• Up to and including ack 2561, this TCP connection is in slow
start, and cwnd is increased by 1 MSS (512 bytes) each time an
ACK is received.
• Note that when cwnd = ssthresh, slow start is still applied.
Hence when ack 2561 is received, cwnd = 2560+512 = 3072.
• When the last ack shown on the previous slide is received,
the TCP connection is in congestion avoidance since cwnd is
> ssthresh. Therefore, cwnd = cwnd + MSS × MSS / cwnd +
MSS / 8 = 3072 + 512 × 512/3072 + 512/8 = 3072 + 85.33 + 64
≈ 3221.33, shown as 3222 in the example
Example: RTO timeout (see congestion
avoidance algorithm)
• Example of a retransmit based on a timeout
[Figure: with cwnd=3222 and ssthresh=2560, segment PSH 3073:3585 (512) ack 10 is sent
and lost (X); at RTO expiry it is retransmitted with cwnd=512 and ssthresh=1536]
• When segment is retransmitted, ssthresh is dropped to half of the
minimum of the cwnd and advertised window. Since advertised window is
5120 bytes for this example, the minimum is cwnd = 3222; half of 3222 is
1611, but this is rounded down to the next multiple of the MSS,
3 × 512 = 1536 (see page 316 for this rounding down concept).
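The rounding step in this example is easy to check with integer arithmetic. The helper name `round_down_to_mss` is made up for illustration; the numbers (cwnd = 3222, advertised window = 5120, MSS = 512) are the ones used in this example.

```python
def round_down_to_mss(nbytes, mss):
    """Round a byte count down to a whole number of segments."""
    return (nbytes // mss) * mss


cwnd, advertised_window, mss = 3222, 5120, 512
half = min(cwnd, advertised_window) // 2   # half of the current window
ssthresh = round_down_to_mss(half, mss)    # whole segments only
cwnd = mss                                 # a timeout restarts from one segment
print(half, ssthresh, cwnd)
```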
Example: duplicate ACKs
(congestion avoidance algorithm and fast retransmit/recovery algorithm)
• In case of duplicate ACKs, both congestion avoidance algorithm and fast
retransmit/recovery algorithms apply

Event                                           cwnd                ssthresh
ack 3073 received (new data)                    3222                2560
PSH 3073:3585 (512) ack 10 sent and lost (X)
PSH 3585:4097, PSH 4097:4609, PSH 4609:5121 sent
1st duplicate ack 3073                          3222                1536
2nd duplicate ack 3073                          3222                1536
3rd duplicate ack 3073                          1536+3*512=3072     1536
PSH 3073:3585 (512) ack 10 retransmitted
ack 5121 received                               ssthresh=1536,      1536
                                                then 2048

• For reason for last cwnd increase to 2048, see last case in Fig. 21.11
Repacketization
• When TCP does a retransmission, it can send the missing data in
differently sized segments
• Increase segment size (if allowed by MSS limit) to improve efficiency
(new data arrives after first transmitted segment was lost)
[Figure: Data (1:100) is acknowledged with ACK 100; Data (100:200) is lost; new data
arrives from application (100 bytes) before the retransmission timer times out; the
sender then transmits Data (100:300), which is acknowledged with ACK 300]
Persist Timer in TCP
• Assume the window size goes down to zero and the ACK that
opens the window gets lost
• If ACK (see figure) is lost, both sides are blocked.
• Persist Timer: Forces the sender to periodically query the
receiver about its window size (window probes)
[Figure: the sender transmits 2K at SeqNo=0 and 2K at SeqNo=2048; the receiver buffer
fills from 0 to 2K to 4K and the receiver answers AckNo=2048 Win=2048, then AckNo=4096
Win=0, so the sender is blocked; when the buffer drains to 3K the receiver sends
AckNo=4096 Win=1024, but this ACK is lost]
Persist Timer
• The persist timer is started by the sender when the sliding window is zero
• Persist timer uses exponential backoff (initial value is 1.5 seconds), but it
is bounded to the range [5 sec, 60 sec]
• So the time intervals between timeouts are:
5, 5, 6, 12, 24, 48, 60, 60, …
– The first two are 5 because the first two timer values, 1.5 and 3,
are both increased to be within bound [5, 60]
• The window probe packet contains one byte of data
• TCP allows sender to send one byte beyond close of receiver window
• Persist timer never gives up (till connection gets aborted)
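The bounded backoff of the persist timer can be reproduced with a clamp. This is an illustrative sketch; the function name `persist_intervals` is made up, while the initial value of 1.5 seconds and the bounds of 5 and 60 seconds come from the slide.

```python
def persist_intervals(count, initial=1.5, low=5.0, high=60.0):
    """Return the first `count` persist timeouts in seconds.

    The underlying value doubles each time; the value actually used
    is clamped to the range [low, high].
    """
    intervals = []
    value = initial
    for _ in range(count):
        intervals.append(min(max(value, low), high))
        value *= 2
    return intervals


print(persist_intervals(8))
```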
Persist Timer
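[Figure: after the receiver sends AckNo=4096 Win=0, the sender's persist timer expires
after 5 sec, 5 sec, and 6 sec; at each timeout the sender sends a probe of 1 byte at
SeqNo=4096. The receiver answers with AckNo=4096 Win=0 until it finally answers
AckNo=4096 Win=1024, after which the sender sends 1 KB (labeled SeqNo=5120 in the figure)]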
Keepalive Timer in TCP
• When a TCP connection has been idle for a long time, a
Keepalive timer reminds a station to check if the other side is
still there.
• A probe packet is sent if the connection has been idle for 2
hours
• Assume a probe has been sent from A to B:
(1) B is up and running: B responds with an ACK
(2) B has crashed and is down: A will send 10 more probes, each 75
seconds apart. If A does not get a response, it will close the connection
(3) B has rebooted: B will send a RST segment
(4) B is up, but unreachable: Looks to A the same as (2)
TCP Summary
• TCP Header - fields
• TCP connection open/close (SYN/FIN)
• Interactive TCP data transfer:
– Delayed ACKs
– Nagle’s algorithm
TCP Summary Contd.
• Bulk TCP data transfer:
– Flow control: sliding window (receiver paces sender)
– Error control: time-outs and retransmissions
• exponential backoff (in case of retransmits)
• RTO changing adaptively to measured RTTs
• Karn’s algorithm
– Congestion control: congestion window (sender has window)
• Slow start and congestion avoidance phases (normal operation)
• Lost packets (timeout or duplicate ACKs)
– congestion avoidance algorithm
– fast retransmit and fast recovery algorithm
• Because of the congestion recovery schemes, TCP’s ARQ scheme is Go-Back-N
if an error (loss) is detected by a retransmission time-out, but selective
repeat if an error (loss) is detected by triple duplicate ACKs.
– Repacketization
• Persist and Keep-alive timers
Different schemes for determining RTO
• Exponential backoff if a segment is retransmitted
• adaptive RTO as a function of RTT (A+4D)
– If an RTT measurement is in progress when a new segment is sent,
then no RTT measurement is taken for the new segment
• Karn’s algorithm
– no RTT measurement on retransmitted segment