20-755: The Internet Lecture 5: Internetworking II David O’Hallaron School of Computer Science and Department of Electrical and Computer Engineering Carnegie Mellon University Institute for eCommerce, Summer 1999 Lecture 5, 20-755: The Internet, Summer 1999 1 Today’s lecture • • • IP: Internetworking with routers (50 min) Break (10 min) UDP, TCP (35 min) Lecture 5, 20-755: The Internet, Summer 1999 2 Typical computer system Keyboard Processor Interrupt controller Mouse Keyboard controller Modem Serial port controller Printer Parallel port controller Local/IO Bus Memory IDE disk controller SCSI controller Video adapter Network adapter Display Network SCSI bus disk disk Lecture 5, 20-755: The Internet, Summer 1999 cdrom 3 IP: Internetworking with routers • • IP is the most successful protocol ever developed Keys to success: – simple enough to implement on top of any physical network » two tin cans and a string. – rich enough to serve as the base for implementations of more complicated protocols and applications. » The IP designers never dreamed of something like the Web. – “rough consensus and working code” » solid implementable specs. Many different kinds of applications and higher-level protocols IP Many different kinds of networks The “Hourglass Model”, Dave Clark, MIT Lecture 5, 20-755: The Internet, Summer 1999 4 Internet protocol stack Berkeley sockets interface User application program (FTP, Telnet, WWW, email) Unreliable best effort datagram delivery (processprocess) Unreliable best effort datagram delivery (host-host) User datagram protocol (UDP) Transmission control protocol (TCP) Internet Protocol (IP) Reliable byte stream delivery (processprocess) Network interface (ethernet) Lecture 5, 20-755: The Internet, Summer 1999 hardware Physical connection 5 IP service model • IP service model: – Delivery model: IP provides best-effort delivery of datagram (connectionless) packets between two hosts. » IP tries but doesn’t guarantee that packets will arrive (best effort) » packets can be lost or duplicated (unreliable) » ordering of datagrams not guaranteed (connectionless) – Naming scheme: IP provides a unique address (name) for each host in the Internet. • Why would such a limited delivery model be useful? – simple, so it runs on any kind of network – provides a basis for building more sophisticated and userfriendly protocols like TCP and UDP Lecture 5, 20-755: The Internet, Summer 1999 6 IP datagram delivery: Example internet Network 1 (Ethernet) H1 H2 H3 H7 Network 2 (Ethernet) R1 R2 R3 H8 Network 4 (Point-to-point) Network 3 (FDDI) H4 Lecture 5, 20-755: The Internet, Summer 1999 H5 H6 7 IP layering Protocol layers used to connect host H1 to host H8 in example internet. H1 R1 R2 R3 H8 TCP TCP IP ETH IP ETH IP FDDI FDDI Lecture 5, 20-755: The Internet, Summer 1999 IP P2P P2P IP ETH ETH 8 Encapsulating IP datagrams in Ethernet IP datagram header IP datagram data IP datagram Ethernet frame IP datagram header header IP datagram data Ethernet frame The same idea is used for other types of physical networks Lecture 5, 20-755: The Internet, Summer 1999 9 IP packet format 0 4 8 Ver Hlen 16 TOS 31 Length Datagram ID TTL 19 Flags Protocol Offset Checksum Source IP address Destination IP address Options (variable) VER IP version HL Header length (in 32-bit words) TOS Type of service (unused) Length Datagram length (max 64K B) ID Unique datagram identifier Flags xxM (more fragmented packets) Offset Fragment offset TTL Time to Live Protocol Higher level protocol (e.g., TCP) Data Lecture 5, 20-755: The Internet, Summer 1999 10 Fragmentation and reassembly • • Different networks types have different maximum transfer units (MTU). A problem can occur if packet is routed onto network with a smaller MTU. – e.g. FDDI (4,500B) onto Ethernet (1,500B) • Solution: break packet into smaller fragments. – each fragment has identifier and sequence number • Destination reassembles packet before handing it up in the stack. – alternative would be to reassemble when entering network with larger MTU • Sender can disable fragmentation using flag. Lecture 5, 20-755: The Internet, Summer 1999 11 Fragmentation example H1 R1 R2 R3 H8 TCP TCP IP ETH IP ETH ETH IP 1400 MTU=1500 IP FDDI FDDI FDDI IP 1400 MTU=4500 Lecture 5, 20-755: The Internet, Summer 1999 IP P2P P2P IP ETH ETH P2P IP 512 ETH IP 512 P2P IP 512 ETH IP 512 P2P IP 376 ETH IP 376 MTU=532 MTU=1500 12 Fragmentation example (cont) start of header ident=x m=1 offset=0 rest of header First packet 512 data bytes start of header ident=x m=1 offset=512 rest of header Second packet 512 data bytes start of header ident=x m=0 offset=1024 rest of header Third packet 376 data bytes Lecture 5, 20-755: The Internet, Summer 1999 13 Internet addresses • • Each host h has a physical address P(h) and a unique IP address I(h). IP addresses contain a network part and a host part: 3 classes of addresses: 012 8 16 0 network(7) 10 110 24 31 Class A (128 nets, 16 M hosts/net) host (24) network (14) host (16) network (21) Lecture 5, 20-755: The Internet, Summer 1999 host (8) Class B (16 K nets, 65 K hosts/net) Class C (2 M nets, 256 hosts/net) 14 Example Internet addresses Host IP Number Class Network cs.cmu.edu 128.2.222.173 cmu.edu 128.2.35.186 cs.stanford.edu 171.64.64.64 B B B 0x0002 0x0000 0x2640 att.com C 0x008085 01234 0 192.128.133.151 8 16 network 10 110 24 31 host network Class A host network Lecture 5, 20-755: The Internet, Summer 1999 Class B host Class C 15 IP Datagram Forwarding • • • Forwarding: the process of copying an input packet from an input port to an output port. Routing: the process of building the tables on each router that allow the correct output port to be determined (beyond our scope) Key points – Every IP datagram contains the IP address of the destination. – Network part of IP address uniquely identifies a single physical network. – All hosts and routers with same network field in address are on the same physical network. – Every physical network on the Internet has a router connected to at least one other physical network. Lecture 5, 20-755: The Internet, Summer 1999 16 IP Forwarding Algorithm Algorithm for host S sending to host D: if (NetworkNum(S) == NetworkNum(D)) { deliver packet directly to D /* IP->physical mapping via ARP */ else deliver packet to default router Algorithm for router receiving packet for host D NextHop = lookup(NetworkNum(D)); if (NextHop is an interface) deliver packet directly to D using interface NextHop else if (NextHop != <undefined>) deliver packet to NextHop (a router) else deliver packet to default router Lecture 5, 20-755: The Internet, Summer 1999 Forwarding table consists of (NetworkNum, NextHop) pairs 17 IP Forwarding example H1 H2 Network 1 (Ethernet) H3 H7 Network 2 (Ethernet) R1 R2 R3 H8 Network 4 (Point-to-point) Network 3 (FDDI) H4 Router R2 forwarding table NetworkNum 1 2 3 4 Lecture 5, 20-755: The Internet, Summer 1999 H5 H6 NextHop R3 R1 Interface 1 Interface 0 18 ARP: Address resolution protocol • Initially: – Hosts S and D on the same network with IP addresses I(S) and I(D) and physical addresses P(S) and P(D). • Problem: – Given I(D), host S wants to discover P(D). • (I(S), P(S), I(D), ???) S Solution: – Host S broadcasts triple (I(S), P(S), I(D),???) on network. – Host D (and only host D) responds with tuple (I(S), P(S), I(D), P(D)) – Both sender and receiver maintain a software cache of IP to physical mappings. – Time out old entries Lecture 5, 20-755: The Internet, Summer 1999 D (I(S), P(S), I(D), P(D)) S D 19 Subnetting • • Problem: IP addressing scheme makes inefficient use of addresses Partial solution: subnetting – physical network part of address identifies a “virtual” physical network to the external world. – use some of the high order “host” bits to identify local physical networks within the “virtual” physical network. network number host number 11111111 11111111 11111111 00000000 xxxxxxxx xxxxxxxx xxxxxxxx 00000000 Class B address & Subnet mask (255.255.255.0) = Subnet number - All hosts on same physical network have same subnet number. - There is exactly one subnet mask per subnet. - All hosts on subnet configured with this mask (ifconfig) Lecture 5, 20-755: The Internet, Summer 1999 20 IP forwarding with subnetting Algorithm on a host: D1 = SubnetMask & destination IP address if (D1 == MySubnetNum) deliver datagram directly to destination else deliver datagram to default router Algorithm on a router: for each forwarding table entry <SubnetNum,SubnetMask,NextHop> D1 = SubnetMask & destination IP address if (D1 == SubnetNum) if (NextHop is an interface) deliver datagram directly to destination else deliver datagram to NextHop (a router) Lecture 5, 20-755: The Internet, Summer 1999 21 Subnetting example subnet mask: 255.255.255.128 subnet number: 128.96.34.0 128.96.34.1 128.96.34.15 H1 R1 128.96.34.130 128.96.34.129 R2 subnet mask: 255.255.255.128 subnet number: 128.96.34.128 128.96.34.139 H2 128.96.33.1 128.96.33.14 H3 forwarding table for R1 subnet mask: 255.255.255.0 subnet number: 128.96.33.0 SubnetNum 128.96.34.0 128.96.34.128 129.96.33.0 Lecture 5, 20-755: The Internet, Summer 1999 SubnetMask 255.255.255.128 255.255.255.128 255.255.255.0 NextHop interface 0 interface 1 R2 22 IPv6 • • • 3 010 • Also called Next Generation IP and IPng Extends address space from 32 bits to 128 bits Hierarchical address space: registryID providerID SubscriberID SubnetID 48 InterfaceID neat feature – embedded InterfaceID allows host to assign itself an IP address! Lecture 5, 20-755: The Internet, Summer 1999 23 IPv6 packet format 4 8 Ver Pri 16 24 31 FlowLabel PayloadLen NextHdr HopLimit Source Address Ver Pri/Flowlabel PayloadLen NextHdr HopLimit Source Address Dest Address IP version (6) Quality of Service) packet len (max 64KB) optional/encapsulated header type same as TTL in IPv4 128-bit source addr 128-bit dest addr Destination Address Optional header examples: fragmentation (44) authentication (51) TCP (6) Next header/data Lecture 5, 20-755: The Internet, Summer 1999 24 Converting from IPv4 to IPv6 • • Not possible to have a “flag day” Must upgrade incrementally – dual stack operation » IPv6 nodes run both IPv4 and IPv6 protocol stacks – IP tunneling » IP packet sent as payload of another IP packet » networking community’s version of indirection! IPV6 IPv6 router IPv6 router IPv4 network IPV6 IPV4 Lecture 5, 20-755: The Internet, Summer 1999 IPV6 IPV6 IPV4 25 Break time! Lecture 5, 20-755: The Internet, Summer 1999 26 Today’s lecture • • • IP: Internetworking with routers (50 min) Break (10 min) UDP, TCP (35 min) Lecture 5, 20-755: The Internet, Summer 1999 27 UDP: User datagram protocol Berkeley sockets interface User application program (FTP, Telnet, WWW, email) Unreliable best effort datagram delivery (processprocess) Unreliable best effort datagram delivery (host-host) User datagram protocol (UDP) Transmission control protocol (TCP) Internet Protocol (IP) Reliable byte stream delivery (processprocess) Network interface (ethernet) Lecture 5, 20-755: The Internet, Summer 1999 hardware Physical connection 28 UDP: User datagram protocol • • • • Extends IP to provide process-to-process (end-to-end) datagram delivery Mechanism for demultiplexing IP packets Based on port abstraction Process identified by <host, port> pair. SrcPort DstPort CheckSum Length Data Lecture 5, 20-755: The Internet, Summer 1999 29 TCP: Transmission Control Protocol Berkeley sockets interface User application program (FTP, Telnet, WWW, email) Unreliable best effort datagram delivery (processprocess) Unreliable best effort datagram delivery (host-host) User datagram protocol (UDP) Transmission control protocol (TCP) Internet Protocol (IP) Reliable byte stream delivery (processprocess) Network interface (ethernet) Lecture 5, 20-755: The Internet, Summer 1999 hardware Physical connection 30 TCP: Transmission control protocol • Uses IP to provide reliable process-to-process byte stream delivery. – stream orientation » sender transfers ordered stream of bytes; receiver gets identical stream – virtual circuit connection » stream transfer analogous to placing phone call » sender initiates connection which must be accepted by receiver. – buffered data transfer » protocol software free to use arbitrary size transfer units – unstructured streams » stream is a sequence of bytes, just like Unix files – full duplex » concurrent transfers in both directions along a connection Lecture 5, 20-755: The Internet, Summer 1999 31 TCP functions • • • • Connections Sequence numbers Sliding window protocol Reliability and congestion control. Source Port Dest. Port Sequence Number Acknowledgment Hlen/Flags Window D. Checksum Urgent Pointer Options.. Lecture 5, 20-755: The Internet, Summer 1999 32 Connections • Connection is a fundamental TCP communication abstraction. – data sent along a connection arrives in order – implies allocation of resources (buffers) on hosts • The endpoint of a connection is a pair of integers: – (IP address, port) • A connection is defined by a pair of endpoints: – ((128.2.254.139, 1184), (128.10.2.3, 53)) (128.2.254.139, 1184) Lecture 5, 20-755: The Internet, Summer 1999 connection (128.10.2.3, 53) 33 Sequence space • • Each stream split into a sequence of segments which are encapsulated in IP datagrams. Each byte in the byte stream is numbered. – 32 bit value – wraps around – initial values selected at runtime • Each segment has a sequence number. – indicates the sequence number of its first byte – Detects lost, duplicate or out of order segments Lecture 5, 20-755: The Internet, Summer 1999 34 TCP flow control mechanism: sliding window • • The purpose of flow control is to keep senders from flooding receivers with packets and filling up their memories. Often confused with congestion control, which tries to keep the senders from flooding the network with packets. Lecture 5, 20-755: The Internet, Summer 1999 35 Sliding window protocol (sender) • Sender maintains a “window” of unacknowledged bytes that it is allowed to send, and a pointer to the last byte it sent: current window 1 2 3 4 5 6 7 8 9 10 11 ... left curr byte stream right Bytes through 2 have been sent and acknowledged (and thus can be discarded) Bytes 3 -- 6 have been sent but not acknowledged (and thus must be buffered) Bytes 7 -- 9 have been not been sent but will be sent without delay. Bytes 10 and higher cannot be sent until the right edge of window moves. Lecture 5, 20-755: The Internet, Summer 1999 36 Sliding window protocol (receiver) • Receiver acknowledges receipt of a segment with two pieces of information: – ACK: the sequence number of the next byte in the contiguous stream it has already received – WIN: amount of available buffer space. • ACK indicates that data was received correctly. – sender can increment left edge of window – sender can delete data to the left of the window. • WIN indicates that more buffer space was freed up. – sender can increment the right edge of its window – sender can transmit more data. Lecture 5, 20-755: The Internet, Summer 1999 37 Sliding window protocol (example) Sender Receiver’s buffer Receiver Application does 2K write 0 4K 2K, SEQ = 0 empty ACK=2K, WIN = 2K 2K Application does 3K write 2K, SEQ =2K ACK=4K, WIN = 0 4K Sender is blocked Application reads 2K ACK=4K, WIN = 2K 2K Sender may send up to 2K 1K, SEQ =4K 1K Lecture 5, 20-755: The Internet, Summer 1999 2K 38 Opening and closing connections Host 1 The three way handshake Application does a connect to a socket on Host 2 SYN, SEQ = J, WIN = 4K ACK =J +1, SYN, SEQ = K, WIN = 4K ACK = K+1, Application does a close on a connection FIN, SEQ = M Host 2 J is the initial sequence number for messages from Host 1 to Host 2. K is the initial sequence number for messages from Host 2 to Host 1. SYN is the “synchronize” flag ACK = M+1 FIN, SEQ = N ACK = N+1 Lecture 5, 20-755: The Internet, Summer 1999 Host 2 replies with its own close. FIN is the “finish” flag 39 Reliability and congestion control • Reliability: – sender » saves segments inside its window » uses timeouts and sequence numbers in ACKS to detect lost segments. » retransmit segments it thinks are lost – receiver » uses sequence numbers to assemble segments in order » also to detect duplicate segments (how might this happen?) • Congestion control – sender maintains separate separate congestion window – uses smaller of the two windows – uses “slow start” algorithm to adaptively set congestion window size. Lecture 5, 20-755: The Internet, Summer 1999 40 End-to-end data issues • Presentation formatting – must account for different data formats on different machines » different byte orders » different word sizes • Compression – data can be compressed/decompressed on the endpoints to save network bandwidth (beyond our scope) • Encryption – sensitive data can be encrypted/unencrypted on the endpoints. • Authentication – Receivers may want to verify that messages really do come from the sender. Lecture 5, 20-755: The Internet, Summer 1999 41 Key themes in IP internetworking • Protocol layering – Way to structure complex system – Handle different concerns at different layers • • • Must cope with heterogeneous networks Must cope with huge scale Must cope with imperfect environment – Packets get corrupted and lost • No one has complete routing table – Too many hosts – Hosts continually being added and removed – In the future, they will start moving around (mobile computing) Lecture 5, 20-755: The Internet, Summer 1999 42 Next time: Programming the global IP Internet Berkeley sockets interface User application program (FTP, Telnet, WWW, email) Unreliable best effort datagram delivery (processprocess) Unreliable best effort datagram delivery (host-host) User datagram protocol (UDP) Transmission control protocol (TCP) Internet Protocol (IP) Reliable byte stream delivery (processprocess) Network interface (ethernet) Lecture 5, 20-755: The Internet, Summer 1999 hardware Physical connection 43