UDP, TCP/IP, and IP Multicast
COM S 414
Sunny Gleason, Vivek Uppal
Tuesday, October 23rd, 2001

In This Lecture
• We will build on understanding of IP (Internet Protocol)
  – UDP: User Datagram Protocol
    • Unreliable, packet-based protocol
  – TCP: Transmission Control Protocol
    • Reliable, connection-oriented, stream-based protocol
  – IP Multicast (if time allows…)
    • Facilities for delivering datagrams to multiple recipients
  – We won’t discuss ICMP (Internet Control Message Protocol), but you can look it up if you want

Where To Find More Info
• For More “Practical” Information
  – Network Programming in Java
    • The Java Custom Networking Trail
      http://java.sun.com/docs/books/tutorial/networking/sockets/
      http://java.sun.com/docs/books/tutorial/networking/datagrams/
  – Network Programming in C
    • Books by W. Richard Stevens [HIGHLY recommended!]
      – “TCP/IP Illustrated” Series
      – UNIX Network Programming, Vol. 1
  – Kernel Source – “Real” Protocol Stacks
    • Linux TCP/IP Stack
      – http://www.kernel.org/pub/linux/kernel/v2.4/
    • OpenBSD TCP/IP Stack
      – ftp://ftp.openbsd.org/pub/OpenBSD/src/sys/netinet/

Where to Find More Info
• Papers, Lecture Notes and RFCs
  – TCP Congestion Control
    • Van Jacobson, “Congestion Avoidance and Control”, 1988
    • Internet RFC Series: http://www.rfc-editor.org/
  – CS514 - Fall 2000 Lecture Notes
  – Birman, Kenneth. Building Secure and Reliable Network Applications. 1995.

First, some definitions…
• Keep the OSI Layers in mind!
• Address
  – An identifier, following an addressing convention, which allows a machine to be uniquely identified
• MAC Address, or Hardware Address
  – Numeric address used by Ethernet (data-link layer)
  – Might look like: “00:02:2D:08:68:F8”
• IP Address
  – Numeric address used by IP (network layer)
  – Might look like: “128.84.133.221”

First, some definitions…
• Packet, or Datagram
  – Self-contained unit of information
  – Consists of a header and body
• Packet Header
  – For now, realize that it includes source address, destination address
  – With layered model, “nesting” of headers

First, some definitions…
• Local Area Network (LAN)
  – Group of machines sharing a common communications medium (such as Ethernet)
  – High data rates, “private wires”, shorter distances
• Wide Area Network (WAN)
  – Spans a greater geographic area, may depend on publicly available network structures (telephone system, leased lines, satellites…)

First, some definitions…
• Router
  – Machine that moves packets from one network to a network that is closer to the destination
  – (Based on a routing table, which may change)
• Bridge
  – A machine that “indiscriminately” replicates packets between two LANs
  – Typically “not as smart” but faster than a router
• Gateway
  – A machine that routes packets from the LAN to the WAN
(What is a Firewall?)

First, some definitions…
• Port
  – In UDP and TCP, a number which the kernel uses to deliver datagrams to the appropriate application
  – For instance: HTTP is port 80, SMTP is port 25, Telnet is port 23, DNS is port 53, FTP is port 21
• In this model, receivers agree to wait for datagrams on a specified port
• Socket: {address, port}

The Internet
• A network based on the Internet Protocol (IP)
[Figure: a network of routers]

The Internet
• Routes IP Datagrams from point A to point B … [unreliably]
[Figure: routers connecting A (171.64.14.203) to B (128.84.154.132)]

Unreliably?
• What good is that?
• Packet loss rate is extremely low (<< 1%)
• Packets are usually dropped by overloaded routers (as we’ll see later)
• This is good enough for us to build the User Datagram Protocol (UDP)

UDP
• For applications where IP’s best-effort delivery is good enough
  – Streaming multimedia, stock quotes…
• Extends IP packet with source port, destination port
• In addition, provides a checksum; large datagrams are fragmented by IP

Fragmentation of UDP Datagrams
• Very simple: IP splits a large UDP datagram into multiple IP fragments, each with a fragment offset
• Sets the “more fragments” bit in the IP header of every fragment except the last
• If one fragment is lost, the whole UDP datagram is discarded
• UDP datagrams are discarded if checksum fails

The UDP API
• No-frills! Basically, you:
  – Create a socket {address, port}
  – Send data to a remote socket
  – Receive data on a given socket
• No guarantees about reliability, or even the ordering in which datagrams are received
• How can we get around this?

Adding Reliability to UDP
• Timeouts & Acknowledgements
  – Receiver sends acks of received datagrams
  – If sender does not receive an ack within a certain time, it retransmits the packet
• Sequence Numbers
  – Sender marks datagrams with sequence numbers
  – Receiver uses sequence numbers to restore order to the datagrams, and to ignore duplicates
• What if we have 100 or more concurrent applications? Is this efficient?

TCP
• A TCP connection is defined by:
  – { src_addr, src_port, dst_addr, dst_port }
  – Note symmetry at both ends of connection
  – Thus, sender is a receiver and vice-versa
• The goal: a reliable, stream-based, connection-oriented protocol
  – Reliable: data gets through [or connection breaks]
  – Stream-based: imagine reading a file in-order
  – Connection-oriented: point-to-point
• How is it all done?

Vivek Presents …
• The inner workings of the TCP protocol…
• Any questions before we move on?
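
Looking back at “The UDP API” and “Adding Reliability to UDP”: the sketch below is one possible way to try out those ideas with Python's standard socket module. It is not the lecture's own code. It runs a stop-and-wait exchange over the loopback interface, and everything in it is an assumption made for illustration: the 4-byte sequence-number header, the names receiver, send_reliably and demo, the 0.2 s timeout, and the receiver deliberately ignoring the first datagram to stand in for a lost packet.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # assumed format: 4-byte sequence number, network order


def receiver(sock, expected_count, delivered):
    """Ack every datagram; deliver each sequence number once, in order."""
    next_seq = 0
    dropped_one = False
    while next_seq < expected_count:
        packet, sender_addr = sock.recvfrom(2048)
        if not dropped_one:
            dropped_one = True  # pretend the first datagram was lost in transit
            continue
        (seq,) = HEADER.unpack(packet[:HEADER.size])
        if seq == next_seq:  # new data; anything else is a duplicate
            delivered.append(packet[HEADER.size:])
            next_seq += 1
        sock.sendto(HEADER.pack(seq), sender_addr)  # ack (duplicates are re-acked)


def send_reliably(sock, dest, messages, timeout=0.2, max_tries=5):
    """Stop-and-wait: send one datagram, wait for its ack, resend on timeout."""
    sock.settimeout(timeout)
    timeouts = 0
    for seq, payload in enumerate(messages):
        packet = HEADER.pack(seq) + payload
        for _ in range(max_tries):
            sock.sendto(packet, dest)
            try:
                ack, _addr = sock.recvfrom(2048)
            except socket.timeout:
                timeouts += 1  # no ack in time: loop around and retransmit
                continue
            if len(ack) == HEADER.size and HEADER.unpack(ack)[0] == seq:
                break  # acked; move on to the next datagram
        else:
            raise RuntimeError(f"no ack for datagram {seq}")
    return timeouts


def demo():
    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    messages = [b"alpha", b"beta", b"gamma"]
    delivered = []
    worker = threading.Thread(
        target=receiver, args=(recv_sock, len(messages), delivered), daemon=True
    )
    worker.start()
    try:
        timeouts = send_reliably(send_sock, recv_sock.getsockname(), messages)
        worker.join(timeout=5)
    finally:
        send_sock.close()
        recv_sock.close()
    return delivered, timeouts


if __name__ == "__main__":
    print(demo())
```

Every application that wants reliability over UDP would have to carry logic like this itself, which is the efficiency question the slide raises and part of the motivation for TCP.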
TCP
• TCP – Stream Protocol
• 3-way Handshake
• Closing a connection
• Acknowledgments
• Sliding Window
• Flow Control
• RED

TCP -- Stream Protocol
• Connection oriented
• Like a telephone connection
• Needs setup before the transfer starts
• Reliable, point-to-point communication
• In-order delivery
• No loss or duplication
• Flow control and error correction
• Duplex connections

3-Way Handshake
• TCP is connection oriented
• Connection initiated by a 3-way handshake
• Takes 3 packets
• Protection against duplicate Syn packets
[Figure: A sends Syn to B; B replies with Syn, Ack of Syn; A replies with Ack of Syn]

Basic 3-Way Handshake
    TCP A             SEQ     ACK     CTL              TCP B
1.  CLOSED                                             LISTEN
2.  SYN-SENT     -->  <100>           <SYN>       -->  SYN-RECV
3.  ESTABLISHED  <--  <300>   <101>   <SYN,ACK>   <--  SYN-RECV
4.  ESTABLISHED  -->  <101>   <301>   <ACK>       -->  ESTABLISHED

Duplicate Recovery
    TCP A             SEQ     ACK     CTL              TCP B
1.  CLOSED                                             LISTEN
2.  SYN-SENT     -->  <100>           <SYN>       ...
3.  (duplicate)  ...  <90>            <SYN>       -->  SYN-RECV
4.  SYN-SENT     <--  <300>   <91>    <SYN,ACK>   <--  SYN-RECV
5.  SYN-SENT     -->  <91>            <RST>       -->  LISTEN
6.               ...  <100>           <SYN>       -->  SYN-RECV
7.  SYN-SENT     <--  <400>   <101>   <SYN,ACK>   <--  SYN-RECV
8.  ESTABLISHED  -->  <101>   <401>   <ACK>       -->  ESTABLISHED

3-Way Handshake
• It ensures that both sides are ready to transmit data, and that both ends know that the other end is ready before transmission actually starts.
• It allows both sides to pick the initial sequence number to use.

Closing a Connection
• Send a Fin packet before tearing down the connection
• Both processes must send Fin packets separately, each closing the connection in its own direction
[Figure: A sends Fin, Ack to B; B replies with Ack of Fin]

Closing a Connection
    TCP A             SEQ     ACK     CTL              TCP B
1.  ESTABLISHED                                        ESTABLISHED
2.  (Close)
    FIN-WAIT-1   -->  <100>   <300>   <FIN,ACK>   -->  CLOSE-WAIT
3.  FIN-WAIT-2   <--  <300>   <101>   <ACK>       <--  CLOSE-WAIT
4.                                                     (Close)
    TIME-WAIT    <--  <300>   <101>   <FIN,ACK>   <--  LAST-ACK
5.  TIME-WAIT    -->  <101>   <301>   <ACK>       -->  CLOSED

Acknowledgements
• Receiver acks only the last in-order packet received
• Sends duplicate acks (in effect, nacks) for out-of-order packets
• Sender resends the first unacknowledged packet
• Timeout typically set to 1.5 * round-trip time

Sliding Window
• The sender window has k segments (buffers), initially empty
[Figure, in four steps: (1) both windows initially empty; (2) sender sends message m[i]; (3) window holds m[i], m[i+1], … m[i+k] and the receiver acks m[i]; (4) receiver acks m[i+1], so m[i] and m[i+1] have been acked and the window slides to m[i+2], m[i+3], … m[i+k+1]]

TCP Congestion Control
• Dynamically adjust window size
• Sender should not swamp the receiver
  – Both sides advertise maximum window size
• Linear increase -- when packets are getting through, increment the window size by 1
• When a packet is dropped, halve the window size, and double the retransmission timeouts -- exponential backoff
• Also called TCP fairness/friendliness

TCP Slow Start
• Might take some time to get to the maximum possible window size
Optimization:
• Exponential increase to start with
• Once the first packet is lost, switch to linear increase with exponential backoff

RED
• Random Early Detection
• Idea is very simple
• Router senses that load is increasing
• It simply notices that it has less available memory for buffering
• This is because packets are entering faster than they can be forwarded

RED …
• Picks a packet at random and discards it
• Even though perhaps it could be forwarded
• Receiver detects the loss and sends a NACK (in TCP, duplicate acks)
• The network isn’t completely overloaded yet so the NACK gets through
• Sender chokes back

Sunny Presents
• IP Multicast …
• Any questions before we move on?
• Note: Slides were stolen from CS514 FA2000 Web site

Unicast to multiple hosts
[Figure]

Multicast to multiple hosts
[Figure: sender multicasts “to group”]

Why do multicast?
• Send to a group, not to individual hosts
  – Reduces overhead in sender
  – Reduces bandwidth consumption in network
  – Reduces latency seen by receivers (all receive “at the same time”, in theory)

Logical addressing
• Multicast groups “handled by network”
• Senders, receivers do not need to know each other’s identities
• Group persists as long as it has at least one member
• A “rendezvous” mechanism

Applications
• Teleconferencing
• Distance learning
• Multimedia streaming
• Directory service lookup
• ...

Multicasting for resource location
• Expanding-ring search
• We want to find an instance of a resource (database, etc.) which is close by
• Use multicast with IP time-to-live (TTL) values

Time-to-live and hop counts
• TTL is a counter in the packet header
  – Decremented at each “hop” through a router
  – When TTL reaches zero, the packet is dropped
  – Special values for “global” and “regional” TTL (use with care!)

Expanding-ring search
[Figure: “Find me a database”, TTL=1]

Expanding-ring search
[Figure: “Find me a database”, TTL=2; reply: “I’m a database, what can I do for you?”]

Multicast addresses
• Class D IP addresses for group
  – 224.0.0.0 to 239.255.255.255
• Used much like any other IP address: can send to it or listen on it
• In practice, use UDP as well (more on this later)

Multicast at the LAN level
• Ethernet is a broadcast medium: all network cards see all packets
• Register the multicast address in the network card
  – Only pass matching packets to OS
  – All other packets are ignored

Multicast beyond the LAN
• We would like to multicast between hosts on different LANs
  – LANs are joined together directly by bridges
  – Or can be connected through the Internet by a sequence of routers
  – Need an inter-LAN (WAN) protocol
• (in fact, this is rarely enabled!)
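
To make the multicast-address and expanding-ring slides above more concrete, here is a hedged sketch using Python's standard socket module. The helper names (is_class_d, membership_request, open_receiver, expanding_ring_search), the group 239.1.2.3, and port 5007 are all hypothetical choices for illustration. The two socket functions also assume the host has a multicast-capable interface, which is not guaranteed.

```python
import ipaddress
import socket
import struct

CLASS_D = ipaddress.IPv4Network("224.0.0.0/4")  # 224.0.0.0 to 239.255.255.255


def is_class_d(addr):
    """True if addr is an IPv4 multicast (Class D) address."""
    return ipaddress.IPv4Address(addr) in CLASS_D


def membership_request(group, interface="0.0.0.0"):
    """Pack the 8-byte argument for IP_ADD_MEMBERSHIP: group, then interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group, port):
    """Bind a UDP socket and ask the kernel to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group)
    )
    return sock


def expanding_ring_search(group, port, query, max_ttl=4, wait=1.0):
    """Multicast the query with TTL=1, 2, ... until some host replies."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.settimeout(wait)
    try:
        for ttl in range(1, max_ttl + 1):
            # A larger TTL lets the query cross more routers (a wider "ring").
            sock.setsockopt(
                socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl)
            )
            sock.sendto(query, (group, port))
            try:
                reply, responder = sock.recvfrom(2048)
            except socket.timeout:
                continue  # nobody within this ring answered; widen it
            return ttl, responder, reply
        return None
    finally:
        sock.close()


if __name__ == "__main__":
    print(is_class_d("239.1.2.3"), is_class_d("128.84.133.221"))
    print(expanding_ring_search("239.1.2.3", 5007, b"Find me a database"))
```

A resource such as the database in the slides would call open_receiver and answer each query it receives by unicasting a reply to the sender's address. Whether queries with TTL greater than 1 actually leave the LAN depends on the routers, since wide-area multicast is rarely enabled.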
A naive approach
• We want to send multicasts everywhere where there are group members
  – Use flooding to send multicast between routers
  – When we get to a LAN, use regular (Ethernet) multicast

Multicast by flooding
[Figure: routers, group members, and non-members]

Why simple flooding doesn’t work
[Figure: routers, group members, and non-members; traffic flooded to LANs without group members is wasted!]

Multicast flooding
• Not a scalable mechanism
  – Every LAN sees every multicast
  – Every WAN router sees every multicast: wastes bandwidth, CPU
• Requires a two-part solution
  – Determining LAN group members
  – Omitting WAN routers from multicast

Multicast trees
• Shortest-path tree to all multicast members, rooted at sender
• But must be computed independently by each router
• And must be dynamically adjusted for joins and leaves

A multicast tree
[Figure]

IGMP
• Internet Group Management Protocol (Deering and Cheriton)
• Developed from work in V distributed operating system
  – Introduced notion of process groups (Cheriton and Zwaenepoel)
  – Groups for services, e.g. name resolution, remote paging

IGMP
• Detects if a multicast group has any members within a LAN
• Query and report messages
  – Router sends query of group membership periodically
  – Hosts report groups they’re in

IGMP
[Figure: the router connecting the LAN to the Internet asks “Who is a member?”; group members on the LAN answer “I am”]

Avoiding overloading
• Report packets may overload router
  – Upon getting a query, each group member sets a timer
  – If it sees a report for its group before the timer expires, it suppresses its report
  – Otherwise it reports on expiration

THE END!
• Any questions?
• Slides will be put up on the web
• If interested, check out the sources for more information
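
As a follow-up to the “Avoiding overloading” slide, the timer-based report suppression can be tried out with a small simulation. This is only a sketch under simplifying assumptions, not the IGMP implementation: the function name answer_query and the host and group names are hypothetical, the 10-second maximum delay is an assumed default, and a report is assumed to reach every other host on the LAN instantly.

```python
import random


def answer_query(members_by_group, max_delay=10.0, rng=None):
    """Return the reports a router would see after one membership query.

    members_by_group maps a group address to the hosts on this LAN that
    belong to it. Each member picks a random timer; the member whose timer
    expires first sends the report. Assumption: every other member hears
    that report immediately and therefore suppresses its own.
    """
    rng = rng or random.Random()
    reports = []
    for group, hosts in members_by_group.items():
        timers = {host: rng.uniform(0.0, max_delay) for host in hosts}
        if not timers:
            continue  # no members on this LAN: the router hears nothing
        first = min(timers, key=timers.get)
        reports.append((timers[first], group, first))
    return sorted(reports)  # in the order the router would receive them


if __name__ == "__main__":
    lan = {
        "239.1.2.3": ["host-a", "host-b", "host-c"],
        "239.9.9.9": ["host-b"],
        "239.7.7.7": [],
    }
    for when, group, host in answer_query(lan):
        print(f"t={when:.2f}s  {host} reports membership in {group}")
```

However many members a group has, the router receives one report per group per query, which is all it needs in order to know that the group still has at least one member on the LAN.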