Chapter 3: Transport Layer

Our goals:
- understand principles behind transport layer services:
  - multiplexing/demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- learn about transport layer protocols in the Internet:
  - UDP: connectionless transport
  - TCP: connection-oriented transport
  - TCP congestion control

Transport Layer 3-1

Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    - segment structure
    - reliable data transfer
    - flow control
    - connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Transport services and protocols
- provide logical communication between app processes running on different hosts
- transport protocols run in end systems
  - send side: breaks app messages into segments, passes to network layer
  - rcv side: reassembles segments into messages, passes to app layer
- more than one transport protocol available to apps
  - Internet: TCP and UDP
[Figure: protocol stacks (application, transport, network, data link, physical) on two end systems]

Transport vs. network layer
- network layer: logical communication between hosts
- transport layer: logical communication between processes
  - relies on, enhances, network layer services
[Figure: hosts A, B, C, D; segments labeled Sport: 8050 / Dport: 25 and Sport: 4625 / Dport: 80]

Internet transport-layer protocols
- reliable, in-order delivery (TCP)
  - congestion control
  - flow control
  - connection setup
- unreliable, unordered delivery: UDP
- services not available:
  - delay guarantees
  - bandwidth guarantees
[Figure: two end systems running the full stack (application, transport, network, data link, physical), connected through intermediate nodes that run only the network, data link, and physical layers]

3.2 Multiplexing and demultiplexing

Multiplexing/demultiplexing
- Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
- Demultiplexing at rcv host: delivering received segments to correct socket
[Figure: processes P1 to P4 and their sockets on host 1, host 2, and host 3, each above the application, transport, network, link, and physical layers]

How demultiplexing works
- host receives IP datagrams
  - each datagram has source IP address, destination IP address
  - each datagram carries a transport-layer segment
  - each segment has source, destination port number
- host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (each row is 32 bits wide):
  | source port #  | dest port #  |
  | other header fields           |
  | application data (message)    |

Connectionless demultiplexing (UDP)
- Create a socket bound to a port number
- UDP socket identified by two-tuple: (dest IP address, dest port number)
- When host receives UDP segment:
  - checks destination port number in segment
  - directs UDP segment to socket with that port number
- IP datagrams with different source IP addresses and/or source port numbers can be directed to the same socket

Connectionless demux (cont)
- Socket tuple: (dest IP address, dest port number)
- Two clients’ traffic can be mixed together at server
[Figure: server (IP: C, port 6428) and two clients. Client A sends SP: 9157, DP: 6428; client B sends SP: 5775, DP: 6428. The server's replies carry SP: 6428 with DP: 9157 and DP: 5775.]

Connection-oriented demux (TCP)
- TCP socket identified by 4-tuple:
  - source IP address
  - source port number
  - dest IP address
  - dest port number
- recv host uses all four values to direct segment to appropriate socket
  - Two connections cannot be mixed together at the receiver host
- Server host may support many simultaneous TCP sockets:
  - each socket identified by its own 4-tuple
- Web servers have different sockets for each connecting client
  - Remember the fork() and the new socket generated by accept()

Connection-oriented demux (cont)
[Figure: client A, client B, and server C, with processes P1 to P6. Three segments arrive at the server with DP: 80: (SP: 9157, S-IP: A, D-IP: C), (SP: 5775, S-IP: B, D-IP: C), and (SP: 9157, S-IP: B, D-IP: C).]

Connection-oriented demux: Threaded Web Server
[Figure: the same three segments arrive at server IP: C, port 80; processes shown are P1 to P4.]

3.3 Connectionless transport: UDP

UDP: User Datagram Protocol [RFC 768]
- “no frills,” “bare bones” Internet transport protocol
- “best effort” service, UDP segments may be:
  - lost
  - delivered out of order to app
- connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently of others

Why is there a UDP?
- no connection establishment (which can add delay)
- simple: no connection state at sender, receiver
- small segment header
- no congestion control: UDP can blast away as fast as desired
  - example: UDP worm (Slammer)

UDP-based Worm: Slammer
- Worm code flow:
  - exploit code (buffer overflow)
  - generate random target IP address x
  - sendto() worm code to x on UDP port 1434
- Fast-spreading worm (Jan. 2003)
  - single UDP packet: 376 bytes
  - average scan rate: 4000 scans/sec
  - infected 90% of vulnerable hosts in 10 minutes
  - ~100,000 infected in an hour
- Bandwidth-limited worm
  - severely congested the Internet
  - stopped ATMs, flight check-in, …
- A TCP-based worm is much slower
  - TCP connection setup
    - connect() is a blocking call
  - needs multiple threads for spreading

UDP: more
- often used for streaming multimedia apps
  - loss tolerant
  - rate sensitive
- other UDP uses
  - DNS
  - SNMP
- reliable transfer over UDP: add reliability at application layer
  - application-specific error recovery!

UDP segment format (each row is 32 bits wide):
  | source port #  | dest port #  |
  | length         | checksum     |
  | application data (message)    |
  length: length, in bytes, of UDP segment, including header

UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
- treat segment contents as sequence of 16-bit integers
- checksum: 1’s complement of the 1’s complement sum of the segment contents
- sender puts checksum value into UDP checksum field
Receiver:
- add all received 16-bit words, including checksum
- check if result is 1111 1111 1111 1111:
  - NO - error detected
  - YES - no error detected. But maybe errors nonetheless? More later ….
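The sender and receiver steps of the UDP checksum can be sketched in a few lines of Python. This is a minimal illustration, not a full UDP implementation: it assumes the segment has already been split into 16-bit words and ignores the pseudo-header and odd-length padding that real UDP includes. The function names `ones_complement_sum`, `internet_checksum`, and `verify` are hypothetical names chosen for this sketch; the sample words are the ones used in the hexadecimal checksum example in this chapter.

```python
def ones_complement_sum(words):
    """Add 16-bit words; a carry out of the top bit is wrapped around
    and added back into the low 16 bits."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """Sender side: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def verify(words, checksum):
    """Receiver side: sum all words plus the checksum; all ones means
    no error was detected."""
    return ones_complement_sum(list(words) + [checksum]) == 0xFFFF


segment = [0xABCC, 0x960B, 0x5A3D]
chk = internet_checksum(segment)
print(hex(chk))                                   # 0x63ea
print(verify(segment, chk))                       # True
print(verify([0xABCD, 0x960B, 0x5A3D], chk))      # False: one bit flipped
```

Folding the carry after every addition keeps the running total within 16 bits, which mirrors the wraparound step done by hand on paper.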
Internet Checksum Example
- Note: when adding numbers, a carryout from the most significant bit needs to be added to the result
- Example: add two 16-bit integers

                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
  wraparound  1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
  sum           1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
  checksum      0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Internet Checksum Example 2
- Suppose the content of a 6-byte packet is 0xABCC, 0x960B, 0x5A3D. What is the checksum for this packet?
- 0x marks hexadecimal notation: each symbol (0-9, A-F) represents 4 binary bits with a value of 0-15. For more details see: http://en.wikipedia.org/wiki/Hexadecimal
- Normal summation: 0xABCC + 0x960B + 0x5A3D = 0x19C14
- Wrap around the carry-out value: 0x9C14 + 0x1 = 0x9C15
- So the checksum is: 0xFFFF - 0x9C15 = 0x63EA

3.4 Principles of reliable data transfer

Principles of Reliable data transfer
- important in app., transport, link layers
- top-10 list of important networking topics!
- characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started
- rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
- udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
- udt_rcv(): called when packet arrives on rcv-side of channel
- deliver_data(): called by rdt to deliver data to upper layer
[Figure: send side and receive side connected by the unreliable channel]

Reliable data transfer: getting started
We’ll:
- incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
- consider only unidirectional data transfer
  - but control info will flow in both directions!
- use finite state machines (FSM) to specify sender, receiver
  - state: when in this “state”, next state uniquely determined by next event
  - each transition from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the state transition

Rdt1.0: reliable transfer over a reliable channel
- Assumption: underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
- separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel
- Only need to chop bit-stream data into packets and send
  - A modern Internet packet is limited by the Maximum Transmission Unit (MTU): 1500 bytes on Ethernet

sender FSM
  state: Wait for call from above
    event:   rdt_send(data)
    actions: packet = make_pkt(data)
             udt_send(packet)
    next:    Wait for call from above

receiver FSM
  state: Wait for call from below
    event:   udt_rcv(packet)
    actions: extract(packet, data)
             deliver_data(data)
    next:    Wait for call from below

Rdt2.0: channel with bit errors
- Assumption #1: underlying channel may flip bits in packet
  - checksum to detect bit errors
- Assumption #2: no packet will be lost
- the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
- new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection (checksum)
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender
  - sender retransmits if NAK

rdt2.0: FSM specification
(Λ means no action)

sender FSM
  state: Wait for call from above
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(data, checksum)
             udt_send(sndpkt)
    next:    Wait for ACK or NAK
  state: Wait for ACK or NAK
    event:   udt_rcv(rcvpkt) && isNAK(rcvpkt)
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK
    event:   udt_rcv(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call from above

receiver FSM
  state: Wait for call from below
    event:   udt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: udt_send(NAK)
    next:    Wait for call from below
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    actions: extract(rcvpkt, data)
             deliver_data(data)
             udt_send(ACK)
    next:    Wait for call from below

rdt2.0: operation with no errors
[Figure: the rdt2.0 FSMs, tracing the path for an uncorrupted packet: the sender sends sndpkt, the receiver delivers the data and sends ACK, and the sender returns to "Wait for call from above".]

rdt2.0: error scenario
[Figure: the rdt2.0 FSMs, tracing the path for a corrupted packet: the receiver sends NAK, and the sender retransmits sndpkt and stays in "Wait for ACK or NAK".]

rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
- sender doesn’t know what happened at receiver!
- Time-out and retransmit
- can’t just retransmit: possible duplicate
Handling duplicates:
- sender retransmits current pkt if ACK/NAK garbled
- sender adds sequence number to each pkt
- receiver discards (doesn’t deliver up) duplicate pkt
stop and wait:
- Sender sends one packet, then waits for receiver response

rdt2.1: sender, handles garbled ACK/NAKs
(Λ means no action)

sender FSM
  state: Wait for call 0 from above
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(0, data, checksum)
             udt_send(sndpkt)
    next:    Wait for ACK or NAK 0
  state: Wait for ACK or NAK 0
    event:   udt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK 0
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call 1 from above
  state: Wait for call 1 from above
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(1, data, checksum)
             udt_send(sndpkt)
    next:    Wait for ACK or NAK 1
  state: Wait for ACK or NAK 1
    event:   udt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK 1
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

receiver FSM
  state: Wait for 0 from below
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    actions: extract(rcvpkt, data)
             deliver_data(data)
             sndpkt = make_pkt(ACK, chksum)
             udt_send(sndpkt)
    next:    Wait for 1 from below
    event:   udt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: sndpkt = make_pkt(NAK, chksum)
             udt_send(sndpkt)
    next:    Wait for 0 from below
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    actions: sndpkt = make_pkt(ACK, chksum)
             udt_send(sndpkt)
    next:    Wait for 0 from below
  state: Wait for 1 from below
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    actions: extract(rcvpkt, data)
             deliver_data(data)
             sndpkt = make_pkt(ACK, chksum)
             udt_send(sndpkt)
    next:    Wait for 0 from below
    event:   udt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: sndpkt = make_pkt(NAK, chksum)
             udt_send(sndpkt)
    next:    Wait for 1 from below
    event:   udt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    actions: sndpkt = make_pkt(ACK, chksum)
             udt_send(sndpkt)
    next:    Wait for 1 from below

Why ACK for wrong sequence packet?

rdt2.1: discussion
Sender:
- seq # added to pkt
- two seq. #’s (0,1) will suffice. Why?
- must check if received ACK/NAK corrupted
- twice as many states
  - state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
- must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
- note: receiver cannot know if its last ACK/NAK was received OK at sender
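
The rdt2.1 behaviour discussed above can be tried out with a small simulation. This is a rough sketch under stated assumptions, not the protocol's reference code: the channel never loses packets (as rdt2.x assumes), bit errors are injected on purpose at chosen transmission numbers, and the checksum is a toy byte sum. The names `Packet`, `garble`, `Receiver`, and `rdt_send_all` are hypothetical helpers invented for this example; `make_pkt`, `udt_rcv`, ACK, and NAK follow the slides' vocabulary.

```python
from dataclasses import dataclass


def checksum(seq, payload):
    # Toy checksum (assumption for illustration): byte sum modulo 2**16.
    return (seq + sum(payload.encode())) & 0xFFFF


@dataclass
class Packet:
    seq: int        # 0 or 1 for data packets; unused (0) for ACK/NAK
    payload: str    # application data, or "ACK" / "NAK"
    chk: int

    def corrupt(self):
        return self.chk != checksum(self.seq, self.payload)


def make_pkt(seq, payload):
    return Packet(seq, payload, checksum(seq, payload))


def garble(pkt):
    # Simulate bit errors by flipping every bit of the checksum field.
    return Packet(pkt.seq, pkt.payload, pkt.chk ^ 0xFFFF)


class Receiver:
    def __init__(self):
        self.expected = 0       # state: waiting for seq 0 or seq 1
        self.delivered = []     # what deliver_data() has passed up

    def udt_rcv(self, pkt):
        if pkt.corrupt():
            return make_pkt(0, "NAK")
        if pkt.seq == self.expected:
            self.delivered.append(pkt.payload)
            self.expected ^= 1
        # A duplicate is ACKed again but not delivered a second time.
        return make_pkt(0, "ACK")


def rdt_send_all(messages, receiver, data_errors=(), ack_errors=()):
    """Stop-and-wait sender. data_errors / ack_errors list the 0-based
    transmission numbers whose data packet or reply gets garbled."""
    seq, tx = 0, 0
    for msg in messages:
        sndpkt = make_pkt(seq, msg)
        while True:
            on_wire = garble(sndpkt) if tx in data_errors else sndpkt
            reply = receiver.udt_rcv(on_wire)
            if tx in ack_errors:
                reply = garble(reply)
            tx += 1
            if not reply.corrupt() and reply.payload == "ACK":
                break           # clean ACK: move on to the next packet
            # corrupt reply or NAK: loop and retransmit sndpkt
        seq ^= 1                # alternate between 0 and 1
    return tx


rcvr = Receiver()
transmissions = rdt_send_all(["a", "b", "c"], rcvr,
                             data_errors={1}, ack_errors={3})
print(rcvr.delivered, transmissions)
```

In this run the data packet for "b" is garbled once and the ACK for "c" is garbled once, so the sender transmits five times in total. The retransmitted "c" reaches the receiver as a duplicate and is ACKed without being delivered again, which is why the receiver must ACK a packet with the "wrong" sequence number.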