Introduction to TCP

A first look at the sockets API for ‘connection-oriented’ client/server application programs

Benefits of TCP
• Communication is ‘transparently reliable’
• Data is delivered in the proper sequence
• An application programmer does not need to worry about issues such as:
  – Lost or delayed packets
  – Timeouts and retransmissions
  – Duplicated packets
  – Packets arriving out-of-sequence
  – Flow and congestion control

Overview
[Diagram: process A (port P) and process B (port Q) sit at the application layer. At the transport layer, TCP joins them with a reliable byte-stream connection. Beneath that, at the network layer, IP carries unreliable datagrams.]

Transport-Layer’s duties
• Create the illusion of a reliable two-way point-to-point connection linking a client application with a server application
  – Manage the ‘error-control’ mechanism
  – Manage the ‘flow-control’ mechanism
  – Manage the connection’s ‘persistence’
  – Manage the connection’s ‘shutdown’

Interaction overview
• The ‘server’ application: socket(), bind(), listen(), accept(), read(), write(), close()
• The ‘client’ application: socket(), bind(), connect(), write(), read(), close()
• Between the two, in order: 3-way handshake, data flow to server, data flow to client, 4-way handshake

What is a ‘connection’?
• An application’s socket is ‘connected’ if it has a defined pair of socket-addresses:
  – An IP-address and port-number for the ‘host’
  – An IP-address and port-number for the ‘peer’

[Diagram: two machines joined by a 2-way data stream]
                  classroom workstation        USF’s web-server
  name            ‘hrn23501.usfca.edu’         ‘hopper.usfca.edu’
  host address    138.202.171.14 port 53124    138.202.192.14 port 80
  peer address    138.202.192.14 port 80       138.202.171.14 port 53124

Layout of TCP header (each row is 32 bits wide)
  source port address | destination port address
  sequence number
  acknowledgment number
  Header Length | reserved | URG ACK PSH RST SYN FIN | window size
  checksum | urgent pointer
  options and padding

Sequence number
[Diagram: a stream of 1000 bytes split into four segments]
  segment 1: 256 bytes of data, sequence number ISN
  segment 2: 256 bytes of data, sequence number ISN + 256
  segment 3: 256 bytes of data, sequence number ISN + 512
  segment 4: 232 bytes of data, sequence number ISN + 768

The sequence number field holds the number assigned to the first byte of data contained in this segment. During connection setup, each party to the connection uses a random number generator to choose the value it will assign to the first byte of data it transmits; this value is called its initial sequence number (ISN). Thereafter, the sequence number in each succeeding segment equals the sequence number of the prior segment plus the number of data bytes in that prior segment. With this scheme the receiver can arrange all the incoming data bytes in the proper order, even if some segments arrive out-of-order.
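The numbering scheme above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the fixed 256-byte segment size is taken from the figure, the names `segment_stream` and `reassemble` are made up for illustration, and real TCP also has to cope with duplicates and missing segments, which this sketch ignores.

```python
import random

SEQ_MODULUS = 2 ** 32  # the sequence number field is 32 bits wide


def segment_stream(data, isn, max_segment=256):
    """Split data into (sequence_number, payload) pairs (hypothetical helper)."""
    segments = []
    for offset in range(0, len(data), max_segment):
        payload = data[offset:offset + max_segment]
        # Each segment is numbered by its first byte: ISN plus the count
        # of data bytes sent before it, reduced to 32 bits.
        segments.append(((isn + offset) % SEQ_MODULUS, payload))
    return segments


def reassemble(segments, isn):
    """Rebuild the byte stream from segments received in any order."""
    # Ordering by distance from the ISN (mod 2**32) stays correct even
    # when the sequence numbers have wrapped past zero.
    ordered = sorted(segments, key=lambda seg: (seg[0] - isn) % SEQ_MODULUS)
    return b"".join(payload for _, payload in ordered)


stream = bytes(i % 251 for i in range(1000))   # 1000 bytes of sample data
isn = random.randrange(SEQ_MODULUS)            # randomly chosen ISN
segments = segment_stream(stream, isn)

offsets = [(seq - isn) % SEQ_MODULUS for seq, _ in segments]
sizes = [len(payload) for _, payload in segments]
print(offsets)  # [0, 256, 512, 768]
print(sizes)    # [256, 256, 256, 232]

shuffled = segments[:]
random.shuffle(shuffled)                       # simulate out-of-order arrival
print(reassemble(shuffled, isn) == stream)     # True
```

The printed offsets and sizes match the figure: four segments numbered ISN, ISN + 256, ISN + 512 and ISN + 768, the last one carrying the remaining 232 bytes.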
Acknowledgment number
• This field holds the number of the byte that the source of this segment is expecting to receive next from its connection partner
[Diagram: a sender transmits bytes in three groups (1, 2, 3, 4), (5, 6) and (7, 8, 9); the receiver replies with ACK 5]
• This field’s value is meaningful only when this segment’s ACK control flag bit is set

TCP Header Length
• Like the IP header’s length, the TCP Header’s length is expressed in multiples of 32 bits: its minimum value is 5 (i.e., 20 bytes), which occurs when no ‘TCP Options’ are included in the TCP header
[Diagram: IP header | TCP header | DATA, all spanned by the IP ‘Total Length’ (in bytes)]
• The amount of DATA in a TCP packet can be calculated by taking the IP Header’s ‘Total Length’ field and subtracting the number of bytes in the two headers (IP header + TCP header)

Control flags
  bit:   5    4    3    2    1    0
  flag:  URG  ACK  PSH  RST  SYN  FIN
Legend:
  FIN = Terminate the connection
  SYN = Synchronize sequence numbers
  RST = Reset the connection
  PSH = Push the data
  ACK = The value in the acknowledgment field is valid
  URG = The value in the urgent pointer field is valid

Establishing the connection
[Timeline diagram, ‘The 3-way Handshake’, between the client application (active) and the server application (passive)]
  1. client to server: SYN J
  2. server to client: SYN K, ACK J+1
  3. client to server: ACK K+1

Exchanging data
[Timeline diagram, ‘A typical Request and Reply transaction’, between the client application (active) and the server application (passive)]
  1. PSH+ACK (the request)
  2. ACK
  3. PSH+ACK (the reply)
  4. ACK

Connection shutdown
[Timeline diagram, ‘The 4-way Handshake’, between the client application (active) and the server application (passive)]
  1. ACK+FIN
  2. ACK
  3. ACK+FIN
  4. ACK

[Figure: a packet sequence annotated in three phases]
  – 3-way handshake: SYN, ACK+SYN, ACK
  – ‘request-and-reply’: ACK+PSH, ACK, ACK+PSH, ACK+FIN
  – 4-way handshake: ACK+FIN, ACK, ACK+FIN, ACK

TCP Timers
• To achieve transparent reliability, the TCP subsystem maintains some internal timers
• One of these is the ‘Retransmission Timer’
• If a packet is sent, but its ACK does not arrive before this timer expires, then the packet will be ‘retransmitted’
• Of course, this could result in the receiver getting duplicate packets (if the original packet was not lost, and the receiver was merely slow to acknowledge it)

‘lost’ versus ‘late’
[Timeline diagram: the client application sends PSH+ACK; the retransmit timeout expires and the client sends the same PSH+ACK again; the ACK for the first copy then arrives late, so the same PSH arrives twice at the server application. A busy server might be ‘slow’ to acknowledge.]

‘piggyback’
• To reduce traffic when possible, TCP delays the ACK for an arriving data-packet instead of sending it immediately, in case the receiver soon has some data of its own to send back; in that case the ACK can ‘piggyback’ on the outgoing data PSH
• This mechanism, of course, requires TCP to maintain a ‘Delayed ACK’ timer

Delayed ACK scenario
[Diagram: process A, at the application layer, writes to and reads from TCP through port P. TCP, at the transport layer, holds a buffer for incoming data, a buffer for outgoing data, and four timers (Delayed ACK timer, Retransmit timer, Window Probe timer, Keep Alive timer), and passes packets to/from the IP layer.]

Window Size
• During the ‘connection setup’ handshake, each host communicates to its partner a ‘window size’ parameter, which tells the partner how much incoming data it has the capacity to buffer
• It also conveys its ‘MSS’ parameter (as a ‘TCP header option’) to inform its partner of the Maximum Segment Size it can accept

MSS versus MTU
• A diagram shows the distinction between the protocol’s MSS (Maximum Segment Size) and the interface’s MTU (Maximum Transmission Unit); no TCP packets will be sent with a segment-size that’s larger than the MSS
[Diagram: a frame laid out as datalink header | IP header | TCP header | DATA | FCS. The MTU spans the packet (IP header, TCP header and DATA); the MSS spans only the DATA.]

Main TCP option types
• The TCP Header contains an options list, occupying from 0 to 10 longword values (the 4-bit Header Length field cannot exceed 15 longwords, and 5 of those are the fixed header)
• Each option is identified by an 8-bit ‘type’:
  – Type 0: End of the options list: data follows this
  – Type 1: No option: used for alignment padding
  – Type 2: Maximum Segment Size (MSS)
  – Type 3: Window scaling option (WSOPT)
  – Type 4: Selective Acknowledgments supported
  – Type 5: Selective Acknowledgment (SACK)
  – Type 8: Timestamp value and echo reply (TSOPT)

Option formats
• Option types 0 and 1 are single bytes
• All other option types are at least 2 bytes long, with the second byte containing the length
  type 2 | length 4  | MSS value                                         (4 bytes)
  type 3 | length 3  | WSOPT value                                       (3 bytes)
  type 4 | length 2                                                      (2 bytes)
  type 8 | length 10 | timestamp value | timestamp echo reply value     (10 bytes)

Looking at TCP options
[Figure: three examples, each marked with its HLEN field]

‘Wrapped’ sequences
• The TCP header’s ‘Sequence Number’ is a 32-bit value, initially chosen at random
• It could happen that a large number gets selected as an Initial Sequence Number and that a large amount of data gets sent, thus causing the 32-bit field to ‘overflow’
• So how does the receiver tell a ‘wrapped sequence’ from a ‘late-arriving’ segment?

Type 8: TSOPT
[Diagram, 16 bits wide: TYPE (=8) | LENGTH (=10), followed by the Timestamp Value, followed by the Timestamp Echo Reply]
• The TCP timestamps have two purposes:
  – RTTM: Round-Trip Time Measurement
  – PAWS: Protect Against Wrapped Sequences

The SACK option
• It conveys extended ‘acknowledgment’ information from a receiver to a sender about ‘gaps’ in the received data-stream
[Diagram, 32 bits wide, 2+n*8 bytes in total]
  Type (=5) | Length
  Left Edge of first Block
  Right Edge of first Block      (no gaps between these two edges)
  ...
  Left Edge of n-th Block
  Right Edge of n-th Block       (no gaps between these two edges)

Demo programs
• We put ‘tcpserver.cpp’ and ‘tcpclient.cpp’ on our class website, so you can watch actual TCP packets being exchanged by using our ‘nicwatch’ application (or some other packet-sniffer, e.g., ‘wireshark’)
• We deliberately used loops which write to, or read from, sockets one-byte-at-a-time so that you can observe ‘TCP buffering’!
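For readers without the class files at hand, the same idea can be sketched in Python. This is not the course’s ‘tcpserver.cpp’ or ‘tcpclient.cpp’ (their source is not shown here); it is a minimal stand-in that assumes a loopback connection on 127.0.0.1, a port chosen by the operating system, and a made-up message. The names `serve_once` and `run_demo` are hypothetical. Both sides move data one byte at a time, following the socket(), bind(), listen(), accept() and connect() sequence from the interaction overview.

```python
import socket
import threading

MESSAGE = b"Hello, TCP!"   # sample data, chosen only for this sketch


def serve_once(listener, received):
    """Accept one connection and read it one byte at a time until EOF."""
    conn, _peer = listener.accept()              # accept()
    with conn:
        while True:
            byte = conn.recv(1)                  # ask for at most one byte
            if not byte:                         # empty result: client sent FIN
                break
            received.append(byte)
            conn.sendall(byte.upper())           # reply one byte at a time too


def run_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    listener.bind(("127.0.0.1", 0))              # bind(); port 0 = OS picks one
    listener.listen(1)                           # listen()
    port = listener.getsockname()[1]

    received = []
    server = threading.Thread(target=serve_once,
                              args=(listener, received), daemon=True)
    server.start()

    reply = b""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect(("127.0.0.1", port))      # connect(): 3-way handshake
        for i in range(len(MESSAGE)):
            client.sendall(MESSAGE[i:i + 1])     # write one byte per call
        client.shutdown(socket.SHUT_WR)          # done writing: sends our FIN
        while True:
            chunk = client.recv(1)               # read one byte per call
            if not chunk:                        # server closed its side
                break
            reply += chunk

    server.join()
    listener.close()
    return b"".join(received), reply


received, reply = run_demo()
print(received)  # b'Hello, TCP!'
print(reply)     # b'HELLO, TCP!'
```

The application sees every byte exactly once and in order, no matter how TCP chose to group the bytes into segments underneath. How many segments were actually sent is not visible from this code; that is what a packet-sniffer pointed at the loopback interface would show.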