TCP Slides

ECE 428
Transport-level Protocols
(Layer 4)
TCP: Transmission Control Protocol
UDP: User Datagram Protocol
1
Need for a Protocol above IP layer
• IP layer
– Delivers packets to a host from another host
– Delivery: best-effort basis
– Can reorder, lose, duplicate
– Is not sure if data has been delivered (no end-to-end ACK)
• Thus, need for an upper layer protocol => TCP
• Deliver data to applications <= end-to-end semantics
• Maintain data flow between applications.
– Receiver’s view
» Reliable: Ordered, without loss, no duplicate. Drop data belonging to an
earlier association between applications.
» Flow control
– Sender’s view: Confirmed delivery
• Congestion control: by the sender <= Reduce network congestion
2
Transport-level protocols
• TCP: Operates in a connection-oriented mode.
• Establish a connection between two applications
– Identify and discard old data segments from an earlier connection.
• Data transfer
– Flow and congestion control is done on a connection basis.
– Other desirable features: ordered, no loss, no duplicate.
• Disconnection
– Do it in a graceful manner so that segments are not transmitted when
there is no one to receive it.
• User Datagram Protocol (UDP): Just send …
3
TCP Header
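[Figure: TCP header layout, one 32-bit word per row, bits numbered 0 to 31]
Word 1: Source Port (bits 0-15) | Destination Port (bits 16-31)
Word 2: Sequence Number (bits 0-31)
Word 3: Acknowledgment Number (bits 0-31)
Word 4: Header Length (bits 0-3) | Reserved (bits 4-9) | U A P R S F (bits 10-15) | Window size (bits 16-31)
Word 5: Checksum (bits 0-15) | Urgent Pointer (bits 16-31)
Then: Options + Padding (variable length, end of header)
Then: Data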
U: URG (Urgent)
A: ACK
P: PSH (Push)
R: RST (Reset)
S: SYN (Sync.)
F: FIN (Finish)
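The fixed part of the header can be decoded in a few lines. The following is a minimal sketch, assuming the 20-byte fixed header with 6 reserved bits and 6 flag bits as on this slide; the names `TcpHeader` and `parse_tcp_header` are made up for illustration.

```python
import struct
from typing import Dict, NamedTuple

# Flag masks within the low 6 bits of the 16-bit "length/reserved/flags" word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int          # in bytes (field value x 4)
    flags: Dict[str, bool]
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the 20-byte fixed TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src, dst, seq, ack, len_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (len_flags >> 12) * 4          # top 4 bits, in 4-byte units
    flags = {name: bool(len_flags & bit) for name, bit in FLAG_BITS.items()}
    return TcpHeader(src, dst, seq, ack, header_len, flags,
                     window, checksum, urgent)
```

Anything between byte 20 and `header_len` is the Options area; the data starts at offset `header_len`.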
4
TCP: Application Context
[Figure: A client and a server application each Read/Write through a Port. Each host runs TCP over IP/DLC/MAC/PHY, and the TCP Connection between the two ports runs across the Internet.]
Ports
- Reserved for well-known services
- Telnet/23, SMTP/25, FTP/20,21, HTTP/80,
BGP/179, RIP/520, DNS/53, lp/515
- Free ports (allocated by the OS)
5
TCP: Header
• Source/destination Ports
– Port: A 16 bit local unique number on the host <= OS
– Port + Host IP => Unique end point of an application
– (Src Port + IP, Dst Port + IP): Unique connection ID
– Source and destination IP: NOT part of a TCP segment
• 32-bit seq. number
– SYN = 0 (DATA segment)
• Position of the first data byte of this segment in the sender’s data stream
– SYN = 1
• The field carries the Initial Sequence Number (ISN) of the sender’s byte stream; the first data byte is numbered ISN+1.
• The ISN is different each time a host requests a connection
6
TCP: Header
• 32-bit ACK number
– Valid if ACK = 1
– Identifies the sequence number of the NEXT data byte
that the sender of the ACK expects to receive.
• Header length in 4-byte units
– Lets the receiver know the beginning of the data area due
to the variable length of the Option field.
• Reserved (6 bits)
– For future use. All 0’s.
7
TCP: Header
• URG: ‘1’ => Urgent Pointer is valid
• ACK: ‘1’ => ACK Seq# is valid
• PSH:
• ‘1’: The receiving TCP module passes the data to the
application immediately
• ‘0’: The receiving TCP module may delay the data
• RST: ‘1’ => Tells the receiver to abort the conn.
• SYN: This bit requests a connection
• FIN
• ‘1’: Sender has no more data to send, but is ready to receive.
8
TCP: Header
• Window Size
• The number of bytes the sender of this segment is willing to receive (its receive window, RWND).
– Used in flow control; the peer combines it with CWND for congestion control
• Checksum: For error detection
• Urgent Pointer: Valid if URG = ‘1’
• Urgent data
– Start byte is not specified, but it is considered to be the start of the seg.
– Final byte in receiver’s buffer: Seq# + Urgent Ptr.
• The sender can send “control” information to the receiver to be
processed on a priority basis.
9
TCP: Header
• Options
• MSS
– The Max Segment Size accepted by the sender
– Specified during connection set up
• Window Scale
– Allows the use of a larger advertised Window Size
• TimeStamp
– Intended to be used on high-speed connection
– Sequence number may wrap around during a connection.
– New segments are distinguished from old segments.
– Also used in Round-Trip Time (RTT) calculation
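The three options above travel in the Options area as kind/length/value records. Here is a minimal sketch of a parser, assuming the standard option kinds (0 = end of list, 1 = no-op padding, 2 = MSS, 3 = Window Scale, 8 = TimeStamp); the function name `parse_tcp_options` is hypothetical.

```python
from typing import List, Tuple


def parse_tcp_options(options: bytes) -> List[Tuple[str, object]]:
    """Walk the Options area and decode MSS, Window Scale and TimeStamp."""
    result = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                      # End of option list
            break
        if kind == 1:                      # No-operation (1-byte padding)
            i += 1
            continue
        if i + 1 >= len(options):          # Truncated option: stop parsing
            break
        length = options[i + 1]            # Length covers kind + length + value
        if length < 2 or i + length > len(options):
            break                          # Malformed option: stop parsing
        value = options[i + 2:i + length]
        if kind == 2 and length == 4:
            result.append(("MSS", int.from_bytes(value, "big")))
        elif kind == 3 and length == 3:
            result.append(("WindowScale", value[0]))
        elif kind == 8 and length == 10:
            ts_value = int.from_bytes(value[:4], "big")
            ts_echo = int.from_bytes(value[4:], "big")
            result.append(("TimeStamp", (ts_value, ts_echo)))
        else:
            result.append(("kind%d" % kind, bytes(value)))
        i += length
    return result
```

With Window Scale = s, the advertised 16-bit Window Size is read as (Window Size x 2^s) bytes.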
10
TCP Connection: General
• TCP connection
• A short- or long-term association between two apps.
• Comm params are exchanged before data segments:
– ISN
– Receive Window (RWND)
– Max Segment Size (MSS)
• Start of a connection is known to both the parties so
that an old (terminated) connection has no impact.
• Bidirectional (Full-duplex)
11
TCP Conn.: Established in two ways
• Server (Passive: Listen) and Client (Active): Most common
• Peer (Active) and Peer (Active): Possible mode
The server must be running, and attached
to a port known to the client.
12
TCP Connection: 3-way handshake
• Use the fields necessary to understand it
– Connection request (SYN)
– Sequence number
– Acknowledgement (ACK)
– Window size
13
TCP Connection: 3-way handshake
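Client: Active open. Server: Passive open.
1. Client => Server: Seg(Seq# = 8000, SYN)
2. Server => Client: Seg(Seq# = 15000, Ack = 8001, SYN+ACK, RWND = 5000)
3. Client => Server: Seg(Seq# = 8001, Ack = 15001, ACK, RWND = 10000)
Note: The SYN uses up one sequence number, so the client's ACK carries Seq# = ISN + 1 = 8001.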
14
TCP Connection: 3-way handshake
– SYN segment from client to server
» SYN = 1
» A random initial Seq# (ISN)
» RWND is undefined (defined later …)
» Options
– SYN segment from server to client
» SYN = 1
» A random initial Seq# (ISN)
» ACK = 1 (server ACKs the received SYN segment)
» Ack Seq.#: The sequence # of the first data byte to be received
» RWND: Receive window size
– ACK from client to server
» ACKs the second SYN segment
» RWND
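The numbering in the three segments can be written out directly. This is a small sketch, assuming the usual rule that a SYN uses up one sequence number (so each side ACKs ISN + 1, and the client's final ACK carries Seq# = ISN + 1); the function name and the dictionary layout are made up for illustration.

```python
SEQ_MODULO = 2 ** 32   # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn, client_rwnd, server_rwnd):
    """Return the three handshake segments as simple dictionaries."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "from": "server",
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MODULO,   # next byte expected from client
        "rwnd": server_rwnd,
    }
    ack = {
        "from": "client",
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_MODULO,   # the SYN consumed one number
        "ack": (server_isn + 1) % SEQ_MODULO,   # next byte expected from server
        "rwnd": client_rwnd,
    }
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(8000, 15000, 10000, 5000)` gives Ack = 8001 in the server's SYN+ACK and Ack = 15001 in the client's ACK.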
15
TCP: Connection Management State Diagram
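[Figure: TCP state transition diagram. Each arrow is labelled "event / action". TCB = Transmission Control Block.]
States: CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED, FIN_WAIT1, FIN_WAIT2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
• CLOSED => LISTEN: LISTEN / (Create TCB)
• CLOSED => SYN_SENT: CONNECT / (Create TCB) SYN
• LISTEN => SYN_RCVD: SYN / SYN, ACK
• LISTEN => SYN_SENT: SEND / SYN
• LISTEN => CLOSED: CLOSE / (Delete TCB)
• SYN_SENT => SYN_RCVD: SYN / SYN, ACK
• SYN_SENT => ESTABLISHED: SYN,ACK / ACK
• SYN_SENT => CLOSED: CLOSE or Time-out or RST / (Delete TCB)
• SYN_RCVD => ESTABLISHED: ACK /
• SYN_RCVD => LISTEN: RST /
• SYN_RCVD => FIN_WAIT1: CLOSE / FIN
• SYN_RCVD => CLOSED: Timeout / RST
• ESTABLISHED => FIN_WAIT1: CLOSE / FIN
• ESTABLISHED => CLOSE_WAIT: FIN / ACK
• FIN_WAIT1 => FIN_WAIT2: ACK /
• FIN_WAIT1 => CLOSING: FIN / ACK
• FIN_WAIT1 => TIME_WAIT: FIN,ACK / ACK
• FIN_WAIT2 => TIME_WAIT: FIN / ACK
• CLOSING => TIME_WAIT: ACK /
• CLOSE_WAIT => LAST_ACK: CLOSE / FIN
• LAST_ACK => CLOSED: ACK /
• TIME_WAIT => CLOSED: 2MSL Time-out / (Delete TCB)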
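A state diagram like this maps naturally onto a lookup table. The sketch below covers only the common open and close paths, and the pairing of each "event / action" label with a source and target state follows the standard TCP state diagram, which is an assumption about how the arrows are drawn here. The event names and the `run` helper are made up for illustration.

```python
# (current state, event) -> (next state, action taken)
TRANSITIONS = {
    ("CLOSED", "LISTEN"): ("LISTEN", "create TCB"),
    ("CLOSED", "CONNECT"): ("SYN_SENT", "create TCB, send SYN"),
    ("LISTEN", "SYN"): ("SYN_RCVD", "send SYN+ACK"),
    ("LISTEN", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN_SENT", "SYN,ACK"): ("ESTABLISHED", "send ACK"),
    ("SYN_SENT", "SYN"): ("SYN_RCVD", "send SYN+ACK"),
    ("SYN_RCVD", "ACK"): ("ESTABLISHED", ""),
    ("SYN_RCVD", "RST"): ("LISTEN", ""),
    ("ESTABLISHED", "CLOSE"): ("FIN_WAIT1", "send FIN"),
    ("ESTABLISHED", "FIN"): ("CLOSE_WAIT", "send ACK"),
    ("FIN_WAIT1", "ACK"): ("FIN_WAIT2", ""),
    ("FIN_WAIT1", "FIN"): ("CLOSING", "send ACK"),
    ("FIN_WAIT1", "FIN,ACK"): ("TIME_WAIT", "send ACK"),
    ("FIN_WAIT2", "FIN"): ("TIME_WAIT", "send ACK"),
    ("CLOSING", "ACK"): ("TIME_WAIT", ""),
    ("CLOSE_WAIT", "CLOSE"): ("LAST_ACK", "send FIN"),
    ("LAST_ACK", "ACK"): ("CLOSED", ""),
    ("TIME_WAIT", "2MSL_TIMEOUT"): ("CLOSED", "delete TCB"),
}


def run(events, state="CLOSED"):
    """Feed a list of events to the machine and return the visited states."""
    visited = [state]
    for event in events:
        if (state, event) not in TRANSITIONS:
            raise ValueError("no transition from %s on %s" % (state, event))
        state, _action = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited
```

A client that opens and then closes first would see the events `CONNECT`, `SYN,ACK`, `CLOSE`, `ACK`, `FIN`, `2MSL_TIMEOUT`; the matching server sees `LISTEN`, `SYN`, `ACK`, `FIN`, `CLOSE`, `ACK`.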
16
Client/Server Communication and State Transitions
17
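[Figure: TCP Operation. Time-line of the segments exchanged between client and server, with the states on each side.]
Client states: Closed => (Active open) SYN SENT => Established => (Active close) FIN WAIT-1 => FIN WAIT-2 => TIME WAIT => (2MSL timer) Closed
Server states: Closed => (Passive open) LISTEN => SYN RCVD => Established => (Passive close, inform app.) CLOSE WAIT => LAST ACK => Closed
Segments, in order: SYN, SYN+ACK, ACK, Data Transfer, FIN, ACK, FIN, ACK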
18
TCP: Flow Control
• FC: Regulates the amount of data a source
can send before receiving an ACK.
• Using a Sliding Window Protocol
– The bytes within the window are the bytes that
can be in transit.
– The (sender’s) window is opened/ closed.
19
TCP: Flow Control
• Window Size = min(RWND, CWND)
– RWND: Receiver’s window
• The receiver sends this info to the sender in a segment
• There is a field for this in segment header.
– CWND: Congestion window
• Used for congestion control
• Managed by the sender
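The rule above tells the sender how much more it may put in transit. A minimal sketch, assuming the sender tracks the last byte sent and the last byte ACKed (both names, and the function name, are made up for illustration):

```python
def usable_window(rwnd, cwnd, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit before it must wait for an ACK."""
    window = min(rwnd, cwnd)                     # Window Size = min(RWND, CWND)
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet ACKed
    return max(0, window - in_flight)
```

For example, with RWND = 5000, CWND = 3000 and 1000 bytes in flight, the sender may send 2000 more bytes.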
20
TCP: Flow Control
• Silly Window Syndrome: TCP/IP header = 40 bytes
• (#of data bytes/total segment length) is very low.
• Can occur if the sender and/or the receiver is very slow.
• Syndrome created by sender (Nagle’s solution)
• Sender sends the first segment even if it is a small one.
• Next, the sender waits until
» An ACK is received, OR
» A maximum-size segment is accumulated
before sending the next segment, and repeats this step for each later segment.
• Syndrome created by receiver
• Clark’s solution:
– Send an ACK, and close the window until another segment can be
received or buffer is ½ empty.
• Delayed ACK: at most 500 ms;
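Nagle's rule for the sender side reduces to one decision each time data is waiting. The sketch below is a simplified reading of the rule on this slide, not a full implementation; the function and parameter names are hypothetical.

```python
def nagle_can_send(buffered_bytes, mss, unacked_bytes):
    """Decide whether the sender may transmit a segment right now."""
    if buffered_bytes == 0:
        return False              # nothing to send
    if unacked_bytes == 0:
        return True               # nothing in transit: send even a small segment
    return buffered_bytes >= mss  # otherwise wait for a maximum-size segment
```

While earlier data is still unACKed, small writes keep accumulating in the buffer, so the ratio of data bytes to the 40 header bytes improves.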
21
TCP: Error Control
• Mechanisms for detecting
– Corrupted segments, lost segments, out-of-order segments,
duplicated segments
• Mechanisms for error detection and correction
– Checksum (header + data)
– ACK
– Timeout (a retransmission timer for each segment)
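The checksum is the 16-bit one's-complement sum used across the Internet protocols. A minimal sketch follows; note that the real TCP checksum also covers a pseudo-header containing the IP addresses, which is left out here, and the function name is made up for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

The receiver runs the same sum over the segment including the checksum field; a result of 0 means no error was detected.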
22
TCP: ACK
• ACK Types
– Positive ACK
• ACK (flag) = 1
• ACK Sequence# => The expected sequence number
– Selective ACK (SACK)
• There is no provision for SACK in the fixed TCP header
• Implementations that support it carry the SACK information in an Option field
23
ACK Generation Rules
– When an in-order data segment is received, delay the ACK until
• Another data segment is received, OR
• 500 ms has elapsed.
– When an out of sequence segment with a higher sequence # arrives
• Immediately send an ACK with the expected seq# (a duplicate ACK)
• This asks for fast retransmission: the sender retransmits after 3 duplicate ACKs.
– When a missing segment arrives, send an ACK to announce the next seq#
expected.
– If a duplicate segment arrives, immediately send an ACK.
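The rules can be sketched as a receiver that buffers out-of-order segments and always ACKs the next byte it expects. This is a simplified model: it ignores sequence-number wrap-around and the 500 ms timer itself, and only reports whether the ACK may be delayed. The class and method names are made up for illustration.

```python
class Receiver:
    """Cumulative-ACK receiver that buffers out-of-order segments."""

    def __init__(self, expected_seq):
        self.expected = expected_seq   # next in-order byte we want
        self.buffer = {}               # seq -> length of buffered segments

    def on_segment(self, seq, length):
        """Return (ack_number, send_immediately) for an arriving segment."""
        if seq == self.expected:
            filled_gap = bool(self.buffer)      # a missing segment arrived
            self.expected += length
            while self.expected in self.buffer:  # absorb buffered segments
                self.expected += self.buffer.pop(self.expected)
            return self.expected, filled_gap     # plain in-order ACK may be delayed
        if seq > self.expected:
            self.buffer[seq] = length            # out of order: keep it
            return self.expected, True           # duplicate ACK, sent at once
        return self.expected, True               # duplicate segment: ACK at once
```

For example, if bytes 1100 to 1199 are lost, each later segment produces an immediate ACK for 1100, and the ACK jumps ahead once the missing segment arrives.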
24
TCP: Retransmission
– Central to error control
– Retransmission occurs
• When a retransmission timer expires
– Sender starts a Retrans. Time-Out (RTO) timer for each segment sent
(except for ACK segments)
• Three duplicate ACKs are received
– A mechanism for fast retransmission
– Useful when the receiver notices one missing segment, but the
subsequent segments are just fine.
Note: Out-of-order segments are simply buffered. Earlier implementations
simply dropped those.
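On the sender side, fast retransmission only needs a counter of repeated ACK numbers. A minimal sketch, assuming the first ACK for a given number is the original and the next three are the duplicates; the class name is hypothetical.

```python
class DupAckCounter:
    """Signal a fast retransmission on the third duplicate ACK."""

    def __init__(self):
        self.last_ack = None
        self.duplicates = 0

    def on_ack(self, ack_number):
        """Return True when the segment starting at ack_number should be resent."""
        if ack_number == self.last_ack:
            self.duplicates += 1
            return self.duplicates == 3   # fire once, on the third duplicate
        self.last_ack = ack_number        # new ACK: reset the count
        self.duplicates = 0
        return False
```

The retransmission happens without waiting for the RTO timer of the missing segment to expire.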
25
TCP: Congestion Control
[Figure: Hosts (H) send packets into the Internet (a net of routers). The plot shows total output rate (network output) against total input rate (network input): up to the network capacity there is no congestion; beyond it there is congestion.]
Too many packets are sent in => Congestion
26
TCP: Causes of congestion
• Packets arriving on different input links want to go
out on the same output link
• Queue builds up for the outgoing link.
• Router starts dropping packets.
• Slow routers
• Queues build up if computing tasks take too much time.
– Queuing buffers, updating tables, running routing protocols
27
General Principles of CC
Static decisions (Congestion prevention policy)
- Decide when to accept new traffic
- Decide when to discard packets
Dynamic decisions (in 3 parts)
- Monitor the system to know when and where congestion occurs.
- Pass on this information to where action can be taken.
- Adjust system operation to correct the problem.
28
Congestion Control
• Dynamic decision
– A variety of metrics can be used to monitor a system.
• Fraction of all packets discarded due to lack of buffer
• Average queue length
• Number of retransmitted packets
• Average packet delay
– Dissemination of congestion information
• A field can be reserved in packet header to carry this info.
• Hosts and routers can send probe packets to enquire.
– Flow adjustment
• Deny service to some users.
• Degrade service to some users.
• Have users schedule their demand in a more predictable manner.
29
Congestion Control
• Congestion Prevention Policies
– DLC level
• Don’t discard out-of-sequence packets.
– Selective-Repeat is better than Go-back-N.
• May not use a separate packet to ACK (use piggyback).
– Network level
• Spread traffic over many paths.
• Use a good discard policy
– File transfer: Drop new packets
– Real-time: Drop old packets
– TCP level …. Next …
30
TCP: Congestion Control
• Achieved by putting one more condition for FC
• Actual Window Size = min (RWND, CWND)
• Main idea
– Slow start
• but quickly speed up to a threshold
– Congestion avoidance
• beyond threshold, increase linearly
– Congestion detection
• Go back to slow start ….
31
TCP: Congestion Control
• Slow start
– Initially, CW = 1: Transmit 1 segment (MSS)
– If ACK received before TO
• CW = 2 (= CW x 2): Transmit 2 segments (MSS)
– If ACKs received before TO
• CW = 4 (= CW x 2): Transmit 4 segments (MSS)
– If ACKs received before TO
• CW = 8 (= CW x 2): Transmit 8 segments (MSS)
:
– Continue until you hit a threshold: Congestion Threshold (CT)
• Normally, CT = 64 KBytes
• Congestion Avoidance: Additive Inc.
– Each time the whole window of segments is ACKed
• CW = CW + 1
• CWmax = RWND
• Congestion Detection
– RTO timer goes off
– 3 copies of an ACK are received
• Update CT and CW
– RTO timer goes off
• CT = CW/2 and CW = 1
– 3 ACKs received
• CT = CW/2 and CW = CT
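The three phases can be traced with one update function applied once per round trip. This is a sketch in units of segments; capping the doubled window at CT during slow start and keeping CT at least 1 are assumptions, and the function and event names are made up for illustration.

```python
def next_cw(cw, ct, event, rwnd):
    """Return the new (CW, CT) after one event.

    event is "ack" (whole window ACKed), "timeout" (RTO timer went off)
    or "3dup" (3 copies of an ACK received).
    """
    if event == "timeout":
        return 1, max(cw // 2, 1)          # CT = CW/2 and CW = 1
    if event == "3dup":
        ct = max(cw // 2, 1)               # CT = CW/2 and CW = CT
        return ct, ct
    if cw < ct:
        cw = min(cw * 2, ct)               # slow start: double up to the threshold
    else:
        cw = cw + 1                        # congestion avoidance: additive increase
    return min(cw, rwnd), ct               # CWmax = RWND
```

Starting from CW = 1 and CT = 16, six ACKed windows give CW = 2, 4, 8, 16, 17, 18; a timeout then sets CT = 9 and CW = 1, and slow start begins again.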
32
TCP: Congestion Control Example: SS-AIMD
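[Figure: CW plotted against Time for slow start followed by additive increase and multiplicative decrease.]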
33
TCP: Timers
• Four kinds of timers
– Retransmission Time-Out (RTO) timer
– Persistence timer
– Keepalive timer
– TIME-WAIT timer
34
TCP: Timers (RTO)
– Operation
» For each segment transmitted (except ACK), start an RTO
» If RTO goes off, retransmit the segment and restart RTO
» If ACK is received before the RTO goes off, kill RTO
– RTTS (RTT Smoothed)
» After first measurement: RTTS = RTTM
» After another measurement: RTTS = (1 – α) · RTTS + α · RTTM
– RTTD (RTT Deviation)
» After first measurement: RTTD = RTTM/2
» After another measurement: RTTD = (1 – β) · RTTD + β · |RTTS – RTTM|
– RTO
» Original: Initial value
» After a measurement: RTO = RTTS + 4 · RTTD
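The RTO formulas can be checked with a small estimator. This is a sketch under stated assumptions: α = 1/8 and β = 1/4 are the commonly used values and are not given on the slide, the 1-second initial RTO is a placeholder, and RTTS is updated before RTTD, following the order in which the formulas are listed here (RFC 6298 instead updates the deviation first, using the old smoothed value). The class name is hypothetical.

```python
class RtoEstimator:
    """Track RTTS, RTTD and RTO from measured round-trip times (RTTM)."""

    def __init__(self, initial_rto=1.0, alpha=0.125, beta=0.25):
        self.alpha = alpha        # weight of a new sample in RTTS (assumed 1/8)
        self.beta = beta          # weight of a new sample in RTTD (assumed 1/4)
        self.rtts = None          # smoothed RTT, unset until the first sample
        self.rttd = None          # RTT deviation
        self.rto = initial_rto    # "Original" RTO: an initial value

    def update(self, rttm):
        """Fold in one measured RTT (seconds) and return the new RTO."""
        if self.rtts is None:
            self.rtts = rttm                      # RTTS = RTTM
            self.rttd = rttm / 2                  # RTTD = RTTM/2
        else:
            self.rtts = (1 - self.alpha) * self.rtts + self.alpha * rttm
            self.rttd = ((1 - self.beta) * self.rttd
                         + self.beta * abs(self.rtts - rttm))
        self.rto = self.rtts + 4 * self.rttd      # RTO = RTTS + 4 * RTTD
        return self.rto
```

For example, a first measurement of 1.5 s gives RTTS = 1.5, RTTD = 0.75 and RTO = 4.5 s; a second measurement of 2.5 s gives RTTS = 1.625, RTTD = 0.78125 and RTO = 4.75 s.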
35
TCP: Timers (Persistence)
– Problem
– A receiver can close the sender’s window and reopen it with an ACK
– If the ACK is lost, there is deadlock.
– Solution
– When a sending TCP receives a segment with RWND = 0, start a
persistence timer.
– Persistence timer goes off: Send a probe segment (1 byte data) to alert
the receiver.
– Persistence timer value
» Initially: Equal to RTO
» Subsequently: Doubled with each retransmission of the probe.
» Saturates at 60 sec.
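The probe schedule is a doubling sequence with a ceiling. A minimal sketch, with a hypothetical function name:

```python
def persistence_intervals(rto, probes, cap=60.0):
    """Return the persistence timer value (seconds) used before each probe."""
    intervals = []
    value = rto                      # initially equal to RTO
    for _ in range(probes):
        intervals.append(min(value, cap))   # saturates at 60 sec
        value *= 2                   # doubled with each retransmission
    return intervals
```

With RTO = 1.5 s the timer values are 1.5, 3, 6, 12, 24, 48, 60, 60, ... seconds.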
36
TCP: Timers (Keepalive and TIME-WAIT)
– Keepalive Timer
• To sustain mostly idle connections (as between BGP routers)
• Each time the server hears from a client
– Reset the timer: Length = 2 hours.
– If the server does not hear from the client for two hours
» Send a probe segment.
– If there is no response after 10 probes (75 sec apart)
» Assume that the client is down.
– TIME-WAIT Timer (2.MSL)
• Used during connection termination.
37
OS Support for TCP-based Network I/O
38
OS Support for TCP-based Network I/O
• Server’s calls
• sockfd = socket(protocol options, …)
• status = bind(sockfd, *myaddress, …)
• status = listen(sockfd, backlog)
» Convert the socket to a passive socket; -1 for error
• confd = accept(sockfd, *clientaddress, …)
» Returns a connected socket for a client; -1 for error
• status = read(confd, *buf, len)
• Client’s calls
• sockfd = socket(protocol options, …)
• status = connect(sockfd, *serveraddress, …)
• status = write(sockfd, *buf, len)
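The same sequence of calls is available from Python's `socket` module. The sketch below runs a one-shot server and a client on the loopback interface in a single process. The port is picked by the OS, and the upper-casing reply, the 5-second timeouts and the function names are made up for illustration.

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept() one client, read() its data, write() a reply."""
    conn, _client_address = server_sock.accept()   # returns a connected socket
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def demo(message=b"hello tcp"):
    """Run socket/bind/listen/accept against socket/connect/write."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    server.listen(1)                     # convert to a passive socket
    port = server.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    client.connect(("127.0.0.1", port))  # active open: 3-way handshake
    client.sendall(message)
    reply = client.recv(1024)
    client.close()

    worker.join()
    server.close()
    return reply
```

Calling `demo()` returns the server's reply to the client; the handshake, ACKs, and connection termination are all handled by the OS underneath these calls.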
39
OS Support for TCP-based Network I/O
• Interested in network programming?
– UNIX Network Programming The Socket Networking API
Vol. 1, 3rd Edition
W. Richard Stevens, et al.
Addison Wesley
40