LastByteAcked

advertisement
Reliable Byte-Stream (TCP)
Outline
Connection Establishment/Termination
Sliding Window Revisited
Flow Control
Adaptive Timeout
1
End-to-End Protocols
• Underlying best-effort network
–
–
–
–
–
drop messages
re-orders messages
delivers duplicate copies of a given message
limits messages to some finite size
delivers messages after an arbitrarily long delay
• Common end-to-end services
–
–
–
–
–
–
–
guarantee message delivery
deliver messages in the same order they are sent
deliver at most one copy of each message
support arbitrarily large messages
support synchronization
allow the receiver to flow control the sender
support multiple application processes on each host
2
Simple Demultiplexor (UDP)
•
•
•
•
Unreliable and unordered datagram service
Adds multiplexing
No flow control
Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
• Header format
0
16
31
SrcPort
DstPort
Length
Checksum
Data
• Optional checksum
– psuedo header + UDP header + data
3
TCP Overview
• Connection-oriented
• Reliable
• Byte-stream
• Full duplex
• Flow control: keep sender
from overrunning receiver
• Congestion control: keep
sender from overrunning
network
– app writes bytes
– TCP sends segments
– app reads bytes
Application process
Application process
Write
bytes
Read
bytes
TCP
TCP
Send buffer
Receive buffer
Segment
Segment ■ ■ ■ Segment
Transmit segments
4
Data Link Versus Transport
• Potentially connects many different hosts
– need explicit connection establishment and termination
• Potentially different RTT
– need adaptive timeout mechanism
• Potentially long delay in network
– need to be prepared for arrival of very old packets
• Potentially different capacity at destination
– need to accommodate different node capacity
• Potentially different network capacity
– need to be prepared for network congestion
5
Segment Format
0
10
4
16
31
SrcPort
DstPort
SequenceNum
Acknowledgment
HdrLen
0
Flags
AdvertisedWindow
Checksum
UrgPtr
Options (variable)
Data
6
Segment Format (cont)
• Each connection identified with 4-tuple:
– (SrcPort, SrcIPAddr, DsrPort, DstIPAddr)
• Sliding window + flow control
– acknowledgment, SequenceNum, AdvertisedWinow
Data (SequenceNum)
Receiver
Sender
Acknowledgment +
AdvertisedWindow
• Flags
– SYN, FIN, RESET, PUSH, URG, ACK
• Checksum
– pseudo header + TCP header + data
7
Connection Establishment and
Termination
Active participant
(client)
Passive participant
(server)
8
Three way handshake protocol to establish connection. The requirement is no connection
will be established by accident caused by duplicate connection request and its
acknowledgement.
Three protocol scenarios for establishing a connection using a three-way handshake. CR
denotes CONNECTION REQUEST.
(a) Normal operation,
(b) Old CONNECTION REQUEST appearing out of nowhere.
(c) Duplicate CONNECTION REQUEST and duplicate ACK.
9
Connection Release
• Two styles of connection release
– Asymmetric: like telephone, the connection is released
if any one party hangs up, abrupt, can lose data.
– Symmetric: used in TCP, the connection is treated as
two separate unidirectional connections and each one is
released separately, so that data loss is less likely.
10
An implementation of three way handshake protocol for
releasing a connection, four scenarios are shown.
(a) Normal case of a three-way handshake. (b) final ACK lost.
11
(c) Response lost. (d) Response lost and subsequent DRs lost.
12
State Transition Diagram
CLOSED
Active open /SYN
Passive open
Close
Close
LIST EN
SYN_RCVD
SYN/SYN + ACK
Send/SYN
SYN/SYN + ACK
ACK
SYN + ACK/ACK
EST ABLISHED
Close/FIN
Close/FIN
FIN/ACK
FIN_WAIT _1
ACK
FIN_WAIT _2
SYN_SENT
CLOSE_WAIT
AC FIN/ACK
K
+
FI
N
/A
CK
FIN/ACK
Close/FIN
CLOSING
ACK T imeout after two
segment lifetimes
T IME_WAIT
LAST _ACK
ACK
CLOSED
13
Sliding Window Revisited
Sending application
Receiving application
T CP
LastByteWritten
LastByteAcked
LastByteSent
(a)
• Sending side
– LastByteAcked < =
LastByteSent
– LastByteSent < =
LastByteWritten
– buffer bytes between
LastByteAcked and
LastByteWritten
T CP
LastByteRead
NextByteExpected
LastByteRcvd
(b)
• Receiving side
– LastByteRead <
NextByteExpected
– NextByteExpected < =
LastByteRcvd +1
– buffer bytes between
NextByteRead and
LastByteRcvd
14
Flow Control
• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
– LastByteRcvd - LastByteRead < = MaxRcvBuffer
– AdvertisedWindow = MaxRcvBuffer - (NextByteExpected NextByteRead)
• Sending side
– LastByteSent - LastByteAcked < = AdvertisedWindow
– EffectiveWindow = AdvertisedWindow - (LastByteSent LastByteAcked)
– LastByteWritten - LastByteAcked < = MaxSendBuffer
– block sender if (LastByteWritten - LastByteAcked) + y >
MaxSenderBuffer
• Always send ACK in response to arriving data segment
• Persist when AdvertisedWindow = 0
15
Silly Window Syndrome
• How aggressively does sender exploit open window?
Sender
Receiver
• Receiver-side solutions
– after advertising zero window, wait for space equal to a maximum
segment size (MSS) (or MTU of directly connected network)
– delayed acknowledgements
16
Nagle’s Algorithm
• How long does sender delay sending data?
– too long: hurts interactive applications
– too short: poor network utilization
– strategies: timer-based vs self-clocking
• When application generates additional data
– if fills a max segment (and window open): send it
– else
• if there is unack’ed data in transit: buffer it until ACK arrives
• else: send it
17
Protection Against Wrap Around
• 32-bit SequenceNum
– Maximum segment life(MSL): 120 s
– typical RTT: 50 ms
Bandwidth
T1 (1.5 Mbps)
Ethernet (10 Mbps)
T3 (45 Mbps)
Fast Ethernet (100 Mbps)
OC-3 (155 Mbps)
OC-12 (622 Mbps)
OC-48 (2.5 Gbps)
Time Until Wrap Around
6.4 hours
57 minutes
13 minutes
6 minutes
4 minutes
55 seconds
14 seconds
18
Keeping the Pipe Full
• 16-bit AdvertisedWindow
– Window size: 64 KB
Bandwidth
T1 (1.5 Mbps)
Ethernet (10 Mbps)
T3 (45 Mbps)
Fast Ethernet (100 Mbps)
OC-3 (155 Mbps)
OC-12 (622 Mbps)
OC-48 (2.5 Gbps)
Delay x Bandwidth Product
18KB
122KB
549KB
1.2MB
1.8MB
7.4MB
14.8MB
assuming 100ms RTT
19
TCP Extensions
• Implemented as header options
• Store timestamp in outgoing segments
– For better measurement of RTT
• Extend sequence space with 32-bit timestamp
(PAWS)
– For protecting against seq# wrap around
• Shift (scale) advertised window
– For improving pipeline efficiency
20
Adaptive Retransmission
(Original Algorithm)
• Measure SampleRTT for each segment / ACK pair
• Compute weighted average of RTT
–
–


EstRTT = a x EstRTT + b x SampleRTT
where a + b = 1
a between 0.8 and 0.9
b between 0.1 and 0.2
• Set timeout based on EstRTT
– TimeOut = 2 x EstRTT
21
Karn/Partridge Algorithm
Sender
Receiver
Orig
inal
t
r ans
miss
ion
Retr
an
Sender
Orig
smis
sion
Receiver
inal
t
r ans
miss
ion
ACK
Retr
ansm
ACK
(a)
issio
n
(b)
• Do not sample RTT when retransmitting
– Two kinds of wrong samples above
• Double timeout after each retransmission
– Amounting to exponential backoff, congestion assumed
22
Jacobson/ Karels Algorithm
•
•
•
•
New Calculations for average RTT
Diff = SampleRTT - EstRTT
EstRTT = EstRTT + (d x Diff)
Dev = Dev + d( |Diff| - Dev)
– where d is a factor between 0 and 1
• Consider variance when setting timeout value
• TimeOut = m x EstRTT + f x Dev
– where m = 1 and f = 4
• Notes
– algorithm only as good as granularity of clock (500ms on Unix)
– accurate timeout mechanism important to congestion control (later)
23
Download