Lecture 18

Homework 5, Wireshark Project 3, and Programming Project 3 are due today. Questions?

Thursday, October 18 CS 475 Networks - Lecture 18

Outline

Chapter 5 - End-to-End Protocols
5.1 Simple Demultiplexer (UDP)
5.2 Reliable Byte Stream (TCP)
5.3 Remote Procedure Call (RPC)
5.4 Transport for Real-Time Applications (RTP)
5.5 Summary

Introduction

We've covered connecting computers. The transport layer deals with connecting processes running on those computers.

A transport protocol may be expected to provide: guaranteed delivery, in-order delivery, no duplicates, support for large messages, synchronization, flow control, and support for multiple processes.

Lower layers may: drop messages, reorder messages, deliver duplicates, limit message size, or deliver messages after a long delay.

Simple Demultiplexer (UDP)

UDP extends the host-to-host service of the network to a process-to-process service but adds no other functionality.

UDP uses 16-bit port numbers to demultiplex between processes. (A process is identified by a port number/IP address pair.)

[Figure: UDP header format]

Simple Demultiplexer (UDP)

Implementation of the port abstraction may vary from OS to OS. Typically, each port is associated with a message queue. When a process receives a message, one is removed from the queue.

Simple Demultiplexer (UDP)

How does a client know which port to send a message to on the server? The server may use a well-known port (see /etc/services on a UNIX machine). Alternatively, the server could run a port mapper process on a well-known port. The client communicates with the port mapper to find the port number of the desired process.

UDP does employ a checksum to verify a message. Packets with errors are dropped.
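The checksum mentioned above can be illustrated with a short sketch. The function below computes the 16-bit ones' complement Internet checksum, the same arithmetic UDP uses. It is only an illustration: real UDP computes the sum over a pseudo-header, the UDP header, and the data, whereas this sketch checks a bare byte string. The names internet_checksum and is_intact and the sample payload are made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add the next 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare it with the one sent."""
    return internet_checksum(data) == checksum


message = bytes.fromhex("0001f203f4f5f6f7")  # hypothetical payload
checksum = internet_checksum(message)
corrupted = b"\x00\x02" + message[2:]  # same payload with one byte changed in transit

print(hex(checksum), is_intact(message, checksum), is_intact(corrupted, checksum))
```

A receiver that finds a mismatch simply discards the packet; UDP makes no attempt to recover it.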
Reliable Byte Stream (TCP)

In addition to demultiplexing, TCP provides guaranteed, reliable, in-order delivery with flow and congestion control. TCP connections are full-duplex.

Flow control prevents the sender from overwhelming the receiver. Congestion control prevents the sender from overwhelming the network (switches, links).

End-to-End Issues

The sliding window algorithm used by TCP is like that used on a point-to-point link (Section 2.5.2), but there are important differences:
1) Setup (an exchange of state so the sliding window algorithm can start) and teardown are needed.
2) RTTs are variable, so timeouts must be adaptive.
3) Packets can be reordered, so old packets may arrive late (the maximum segment lifetime, or MSL, is typically 120 s).
4) Resources are not tied to a single link and cannot be determined in advance (flow control is needed).
5) Congestion is possible (congestion control is needed).

Segment Format

TCP is a byte-oriented protocol. Bytes are normally collected into segments before being sent to the destination.

Segment Format

The TCP header is shown at right. A TCP connection is identified by the 4-tuple (SrcPort, SrcIPAddr, DstPort, DstIPAddr). The HdrLen field is the size of the header in 32-bit words.

Segment Format

The Acknowledgment, SequenceNum, and AdvertisedWindow fields are used by the sliding window algorithm. Each transmitted byte has a sequence number; the SequenceNum field carries the sequence number of the first data byte in the segment. Acknowledgment and AdvertisedWindow are associated with received data.

Segment Format

The Flags field contains 6 bits: SYN, FIN, RESET, PUSH, URG, and ACK. SYN and FIN are used to set up and tear down a connection, respectively. RESET indicates that the receiver is confused and wants to abort the connection.
PUSH indicates that data should be sent immediately. URG signifies that the segment contains urgent data. The UrgPtr field contains the number of urgent data bytes. ACK is set when the Acknowledgment field is valid.

Connection Establishment

A three-way handshake is used to set up the connection. Packets contain the initial sequence numbers (x and y) to be used by the client and the server in subsequent packets.

The TCP specification requires that the initial sequence numbers be random numbers.

Connection Establishment

A state-transition diagram for TCP setup and teardown is shown at right. Rectangles show states. Arcs have tags of the form event/action. Retransmissions due to timeouts are not shown.

Sliding Window Revisited

The sliding window algorithm discussed previously provided reliable, in-order delivery. TCP's sliding window algorithm extends the prior one by adding flow control.

Flow control is achieved by having the receiver advertise a window size to the sender instead of using a fixed-size window. The sender is limited to no more than AdvertisedWindow bytes of unacknowledged data at any time.

Sliding Window Revisited

The sender maintains three pointers, where:
  LastByteAcked ≤ LastByteSent ≤ LastByteWritten
while on the receiver:
  LastByteRead < NextByteExpected ≤ LastByteRcvd + 1

Sliding Window Revisited

Assume the send and receive buffers are of size MaxSendBuffer and MaxRcvBuffer.
On the receive side TCP must keep:
  LastByteRcvd - LastByteRead ≤ MaxRcvBuffer
The advertised window size is:
  AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)

Sliding Window Revisited

On the sending side TCP ensures:
  LastByteSent - LastByteAcked ≤ AdvertisedWindow
while maintaining:
  LastByteWritten - LastByteAcked ≤ MaxSendBuffer
If the sending process tries to write n bytes and doing so would violate this second inequality, the process is blocked.

Sliding Window Revisited

A 32-bit sequence number will wrap around in 57 minutes at a 10 Mbps transmit rate, but in only about 34 seconds at 1 Gbps. An extension to TCP extends the sequence number space.

A 16-bit AdvertisedWindow field allows for a 64 KB window. The window should be large enough to allow for a full delay × bandwidth product. A cross-country delay of 100 ms at 10 Mbps corresponds to 122 KB, which already exceeds 64 KB. The TCP extension increases the advertised window size also.

Triggering Transmission

TCP will transmit a segment when (1) it has collected a maximum segment size (MSS) number of bytes, (2) the sending process tells it to (a push), or (3) a “timer” expires.

Nagle's Algorithm:
  when there is data to send
    if both the data and the window ≥ MSS
      send a full segment
    else
      if there is unACKed data in flight
        buffer data until ACK arrives
      else
        send all data now

Adaptive Retransmission

Originally, a TimeOut value for retransmission was computed using:
  EstimatedRTT = α × EstimatedRTT + (1 - α) × SampleRTT
  TimeOut = 2 × EstimatedRTT
where SampleRTT is the time between when a segment is sent and when its ACK arrives.

The original TCP spec recommended a value of α between 0.8 and 0.9.
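As a rough illustration of the original timeout computation, the sketch below applies the two formulas to a short list of RTT samples. The sample values, the function names, the choice α = 0.875 (inside the recommended 0.8 to 0.9 range), and seeding EstimatedRTT with the first sample are all assumptions made for this example, not part of the TCP specification.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float, alpha: float = 0.875) -> float:
    """EstimatedRTT = alpha * EstimatedRTT + (1 - alpha) * SampleRTT."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt


def timeout(estimated_rtt: float) -> float:
    """TimeOut = 2 * EstimatedRTT."""
    return 2 * estimated_rtt


def trace(samples, alpha: float = 0.875):
    """Return (EstimatedRTT, TimeOut) after each sample following the first."""
    estimated_rtt = samples[0]  # assumption: seed the estimate with the first sample
    history = []
    for sample in samples[1:]:
        estimated_rtt = update_estimated_rtt(estimated_rtt, sample, alpha)
        history.append((estimated_rtt, timeout(estimated_rtt)))
    return history


samples_ms = [100.0, 100.0, 300.0, 100.0]  # hypothetical SampleRTT values in milliseconds
for estimated_rtt, rto in trace(samples_ms):
    print(f"EstimatedRTT = {estimated_rtt} ms, TimeOut = {rto} ms")
```

With α this close to 1, a single 300 ms spike moves the estimate only from 100 ms to 125 ms. The running average is smooth, but it reacts slowly to real changes in RTT.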
Adaptive Retransmission

Unfortunately, an ACK for a retransmission is identical to an ACK for the original transmission. This can lead to incorrect values for SampleRTT.

Adaptive Retransmission

The Karn/Partridge algorithm fixed the problem quite simply: SampleRTT is measured only for segments that have been sent once.

The new algorithm included a second change. After each retransmission, the next timeout value is set to twice the previous timeout value (exponential backoff). This helps to alleviate problems due to network congestion.

Adaptive Retransmission

The original algorithm did not handle situations in which SampleRTT might vary a lot. The Jacobson/Karels algorithm was an improvement:
  Difference = SampleRTT - EstimatedRTT
  EstimatedRTT = EstimatedRTT + (δ × Difference)
  Deviation = Deviation + δ × (|Difference| - Deviation)
  TimeOut = μ × EstimatedRTT + φ × Deviation
where δ is a fraction between 0 and 1, μ was typically 1, and φ was 4.

Record Boundaries

TCP has two features that allow record boundaries to be put into the byte stream.

TCP allows data to be flagged as urgent or out-of-band. Urgent data can be used to indicate the end of a record.

A TCP push operation can be used to indicate a complete record. (The sockets API does not provide access to the PUSH flag.)

It is usually simpler for record boundary markers to be inserted by the application.

TCP Extensions

There have been four optional extensions to TCP that are implemented using Options in the TCP header:
1) The sender places a 32-bit time stamp in the header. The receiver echoes the time stamp in the ACK. This allows for accurate measurement of the RTT.
TCP Extensions

2) The sequence number and the time stamp are examined together to determine if the sequence number has wrapped around.
3) A scaling factor can be included to advertise a window larger than 64 KB.
4) The receiver can respond with a selective acknowledgment (SACK). This allows the sender to (re)transmit just the missing segments.

Performance

Now that we have a complete protocol graph, we can discuss how to measure its performance as seen by applications. In particular, as network speeds increase, can a protocol like TCP provide enough data to keep the network full?

Test setup: simple host-to-host in a room; 2 × 2.4 GHz dual cores; 2 Gbps bandwidth.

Performance

TTCP benchmark using various sizes of messages.

Note: this is a "perfect" network, so the benchmark measures the TCP implementation and the workstation hardware/software only. We will see other issues, such as congestion, later.

Alternative Design Choices

TCP is a stream-oriented protocol as opposed to a request/reply protocol. We will examine a request/reply protocol (RPC) next time. (TCP can be used for request/reply applications, but there are complications.)

TCP is a byte-stream rather than a message-stream service. (Record boundaries can, however, be inserted into the byte stream.)

Alternative Design Choices

TCP uses connection setup and teardown. It is possible to send all connection parameters with the first data message. TCP setup allows a receiver to reject a connection before any data is sent. TCP teardown means that “keep alive” messages don't need to be sent.

TCP uses window-based rather than rate-based flow control. There are similarities but also some interesting differences.
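Since TCP delivers a byte stream, an application that wants messages has to mark record boundaries itself. One common approach, sketched below, is to prefix each record with its length. The 4-byte big-endian prefix and the names frame and extract_records are assumptions chosen for this example; they are not part of TCP or of the sockets API. The sketch works on an in-memory buffer, so it needs no network connection.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # assumed format: 4-byte big-endian record length


def frame(record: bytes) -> bytes:
    """Prefix a record with its length so the receiver can find its end."""
    return HEADER.pack(len(record)) + record


def extract_records(buffer: bytearray) -> List[bytes]:
    """Remove and return every complete record at the front of buffer."""
    records = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer)
        if len(buffer) < HEADER.size + length:
            break  # the rest of this record has not arrived yet
        records.append(bytes(buffer[HEADER.size:HEADER.size + length]))
        del buffer[:HEADER.size + length]
    return records


# Simulate a byte stream that arrives in arbitrary 3-byte pieces.
stream = frame(b"hello") + frame(b"world!")
received = bytearray()
for start in range(0, len(stream), 3):
    received += stream[start:start + 3]
    for record in extract_records(received):
        print(record)
```

The receiver recovers the two records no matter how the stream was cut into pieces, which is the situation a real recv() loop on a TCP socket faces.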
In-class Exercises

Log on locally under Linux or log on remotely to csserver to answer the following questions:

How do we send out-of-band data via TCP? (man send)
How do we receive out-of-band data? (man recv)
Which of the four TCP extensions described in class are supported under Linux? (man tcp)
What acronym is used for the TCP extension that helps to determine if the sequence number has wrapped around? What does this acronym stand for?
Is there a way to disable Nagle's algorithm so that segments are sent immediately? If so, how?

In-class Exercises

The Linux /proc pseudo-filesystem interface can be used to tune many of the TCP algorithms. Changing the parameters requires system administration privileges. Use cat to examine the appropriate /proc file contents (man tcp) and determine the answers to the following:

Is the optional SACK extension enabled?
What is the default receive buffer size?
Is the optional window scaling extension enabled?
What is the default congestion control algorithm? Which algorithms are available for use?