CPS110: Networks
Landon Cox
March 20, 2008

Network hardware reality
  Lots of different network interface cards (NICs)
    3Com/Intel, Ethernet/802.11x
  Each NIC has a fixed hardware address
    MAC address: 01:10:C6:CE:8E:42
  Send packet to LAN by specifying MAC address
  Max packet size is 1500 bytes
  Packets can be reordered, corrupted, dropped
  Anyone can sniff packets from the network

Virtual/physical interfaces
  Applications              OS    Hardware
  Device independence             Many types of NICs
  Route across networks           Deliver only on LAN
  Symbolic host names             MAC addresses
  Large messages                  Small messages
  Process to process              NIC to NIC
  Ordered messages                Unordered messages
  Reliable messaging              Unreliable messaging
  Byte streams                    Distinct messages
  Secure transmission             Insecure transmission
  Procedure calls                 Messages

Distributed computing
  Try to make multiple computers look like one
  We won’t really cover
    Take CPS 214
  Distributed shared memory
  Distributed file systems
  Parallelizing compilers
  Process migration

Protocol layers
  Applications   NFS (files), HTTP (web), SMTP (email), SSH (login), RPC
  (abstraction)
                 UDP, TCP
  (abstraction)
                 IP
  (abstraction)
  Hardware       Ethernet, ATM, PPP

OSI model
  Open Systems Interconnection
  Layer 7   Application    Application
  Layer 6   Presentation   Presentation
  Layer 5   Session        Session
  Layer 4   Transport      Transport
  Layer 3   Network        Network
  Layer 2   Data link      Data link
  Layer 1   Physical       Physical

Network layers (the stack)
  Build higher-level services on simpler ones
    IP over Ethernet
    TCP over IP
    HTTP over TCP
  Why build in layers?
    Could have 0 layers (build directly on top of HW)
      What would happen?
      Have to build from scratch each time HW changes
      E.g. one Firefox for wired NIC, one for wireless NIC
    Could have 1 layer (OS provides single layer)
      What would happen?
      Better to let applications choose functionality they need
      Unneeded features usually cost something (performance)
      E.g. would you ever not need reliable communication?

Virtual/physical interfaces
  Applications: Route across networks
  OS
  Hardware: Deliver only on LAN

Routing
  HW lets us send to neighbor on same LAN
    Single-hop route
  Want to send to computer on another LAN
    Multi-hop route
  IP (Internet Protocol) handles this

Local-area network
  Typically, switched Ethernet
    Ethernet switch
  Messages delivered using Ethernet MAC address
    E.g. 00:0D:56:1E:AD:BB
    Unique to physical card (like a serial number)
  Switch knows all connected computers’ MAC addresses

Routing
  Can’t put all computers on one switch!
    Think of the wiring logistics
  Want to connect two LANs together
    Use a machine that straddles two networks
    Called a router or gateway or bridge
  LANs and routers form the Internet

Internet graph
  [Figure: graph with nodes A, B, C, D, E, F, G]
  Each letter is a router, possibly with a LAN connected to it.

Internet graph
  Each node is an Autonomous System (AS).
  Can think of each one as an ISP.

Internet graph
  [Figure: graph with nodes A, B, C, D, E, F, G]
  How does D know how to get to router G?
  Should it send messages to E, C, or F?

Internet routing is imprecise
  Internet has no centralized state
    Makes it (supposedly) more fault-tolerant
  Routing is hard when a network is
    Large (a lot to track)
    Dynamic (connections change quickly)
    Subject to incentives to lie (make money by accepting traffic)
  The Internet exhibits all three
  Basic idea
    Routers propagate info about the graph to each other
    BGP (Border Gateway Protocol)

Traceroute example
  www.kernel.org
  Unix traceroute utility

Virtual/physical interfaces
  Applications: Symbolic host names
  OS
  Hardware: MAC addresses

Naming other computers
  Low-level interface
    Provide the destination MAC address
    00:13:20:2E:1B:ED
  Middle-level interface
    Provide the destination IP address
    152.3.140.183
  High-level interface
    Provide the destination hostname
    crocus.cs.duke.edu

Translating hostname to IP addr
  Hostname → IP address
  Performed by the Domain Name System (DNS)
  Used to be a central server
    /etc/hosts at SRI
  What’s wrong with this approach?
    Doesn’t scale to the global Internet

DNS
  Centralized naming doesn’t scale
    Server has to learn about all changes
    Server has to answer all lookups
  Instead, split up data
    Use a hierarchical database
    Hierarchy allows local management of changes
    Hierarchy spreads lookup work across many computers

Example: www.cs.duke.edu
  nslookup in interactive mode

Translating IP to MAC addrs
  IP address → MAC address
  Performed by ARP
  Only done after you get to the right LAN
  How does a router know the MAC address of 152.3.140.183?
  ARP (Address Resolution Protocol)
    If it doesn’t know the mapping, broadcast through switch
    “Whoever has this IP address, please tell me your MAC address”
    Cache the mapping
    “/sbin/arp”
  Why is broadcasting over a LAN ok?
    Number of computers connected to a switch is relatively small

Virtual/physical interfaces
  Applications: Large messages
  OS
  Hardware: Small messages

Message sizes
  Hardware interface
    Max Ethernet message size is 1500 bytes
  Application interface
    IP maximum packet size is 64 KB
  What if the route narrows?
    Start at Ethernet max of 1500 bytes
    Could traverse ATM w/ max of 53 bytes

Message sizes
  IP layer fragments larger MTU to smaller MTU
  Computer 1     Router            Computer 2
  IP             IP                IP
  Ethernet       Ethernet | ATM    ATM

Virtual/physical interfaces
  Applications: Process to process
  OS
  Hardware: NIC to NIC

Processes vs machines
  IP is machine-to-machine
    E.g. crocus.cs.duke.edu → www.kernel.org
  Process abstraction
    Each app thinks it has its own machine
  Give each process multiple virtual NICs

Processes vs machines
  Hardware interface
    One network endpoint per machine
  Application interface
    Multiple network endpoints per machine
  Sockets
    Software endpoints for communication
    Like virtual network cards

Sockets
  Another example of virtualized hardware
    Thread → virtual processor
    Address space → virtual memory
    Endpoint/socket → virtual NIC
  NIC and socket both have unique identifiers
    NIC: MAC address
    Socket: ‹hostname, port number›
  bind() assigns a port number to a host’s socket

Sockets
  OS allows apps to program sockets
    E.g. BSD sockets
    WinSock has pretty much same interface
  Processes name each other via sockets
  Each message includes a destination ‹host, port›
    Tells routers which computer gets message
    Tells dst computer which process gets message

Sockets
  OS can multiplex multiple connections over one NIC
  Kinds of sockets: UDP (datagrams), TCP (ordered, reliable)

Course administration
  Exam regrades back on Tuesday
  Project 2 also due on Tuesday
    Four groups have submitted
  Any questions?

Virtual/physical interfaces
  Applications: Ordered messages, Reliable messages, Byte streams
  OS
  Hardware: Unordered messages, Unreliable messages, Distinct messages

Ordered messages
  Networks can re-order IP messages
    E.g. Send: A, B. Arrive: B, A
  How should we fix this?
    Assign sequence numbers (0, 1, 2, 3, 4, …)

Ordered messages
  Do what for a message that arrives out of order? (0, 1, 3, 2, 4)
    a. Save #3 and deliver after #2 is delivered (this is what TCP does)
    b. Drop #3, deliver #2, deliver #4
    c. Deliver #3, drop #2, deliver #4
  b. and c.
  are ordered, but not reliable (messages are dropped).
  Relies on the reliability layer to handle lost messages.

Ordered messages
  For a notion of order, first need “connections”
    Why? Must know which messages are related to each other
  Idea in TCP
    Open a connection
    Send a sequence of messages
    Close the connection
  Opening a connection ties two sockets together
    Connection is socket-to-socket unique: only these sockets can use it
    Sequence numbers are connection specific

Virtual/physical interfaces
  Applications: Ordered messages, Reliable messages, Byte streams
  OS
  Hardware: Unordered messages, Unreliable messages, Distinct messages

Reliable messages
  Usually paired with ordering
    TCP provides both ordering and reliability
  Hardware interface
    Network drops messages
    Network duplicates messages
    Network corrupts messages
  Application interface
    Every message is delivered exactly once

Detecting and fixing drops
  How to fix a dropped message?
    Have sender re-send it
  How does sender know it’s been dropped?
    Have receiver tell the sender
    Receiver may not know it’s been sent
    Like asking in the car, “If we left you at the theater, speak up.”

Detecting and fixing drops
  Have receiver acknowledge each message
    Called an “ACK”
  If sender doesn’t get an ACK
    Assume message has been dropped
    Resend original message
  Is this ok for the sender to assume?
    No. ACKs can be dropped too (or delayed)

Detecting and fixing drops
  Possible outcomes
    Message is delayed or dropped
    ACK is delayed or dropped
  Strategy
    Deal with all as though message was dropped
  Worst case if message wasn’t dropped after all?
    Need to deal with duplicate messages
  How to detect and fix duplicate messages?
    Easy. Just use the sequence number and drop the duplicate.

What about corruption?
  Messages can also be corrupted
    Bits get flipped, etc.
    Especially true over wireless networks
  How to deal with this?
    Add a checksum (a little redundancy)
    Checksum is usually a sum over the message contents (e.g. the Internet checksum sums 16-bit words)
    Drop corrupted messages

What about corruption?
  Dropping corrupted messages is elegant
    Transforms problem into a dropped message
    We already know how to deal with drops
  Common technique
    Solve one problem by transforming it into another
    1. Corruption → drops
    2. Drops → duplicates
    3. Drop any duplicate messages (very simple)

Virtual/physical interfaces
  Applications: Ordered messages, Reliable messages, Byte streams
  OS
  Hardware: Unordered messages, Unreliable messages, Distinct messages

Byte streams
  Hardware interface
    Send information in discrete messages
  Application interface
    Send data in a continuous stream
    Like reading/writing from/to a file

Byte streams
  Many apps think about info in distinct messages
  What if you want to send more data than fits?
    UDP max message size is 64 KB
  What if data never ends?
    Streamed media
  TCP provides “byte streams” instead of messages

Byte streams
  Sender writes messages of arbitrary size
  TCP breaks up the stream into fragments
    Reassembles the fragments at destination
  Receiver sees a byte stream
    Fragments are not visible to either process
  Programming the receiver
    Must loop until certain number of bytes arrive
    Otherwise, might get first fragment and return

Byte streams
  UDP makes boundaries visible
  TCP makes boundaries invisible (loop until you get everything you need)
  How to know # of bytes to receive?
    1. Size is contained in header
    2. Read until you see a pattern (sentinel)
    3. Sender closes connection

Sentinels
  Idea: message is done when special pattern arrives
  Example: C strings
    How do we know the end of a C string?
    When you reach the null-termination character (‘\0’)
  Ok, now say we are sending an arbitrary file
    Can we use ‘\0’ as a sentinel?
    No. The data payload may contain ‘\0’ chars
    What can we do then?
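
The receiver loop and option 1 from the list above (size contained in a header) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the lecture's own code: the 4-byte big-endian length header and the helper names recv_exact, send_message, and recv_message are hypothetical. Because the receiver learns the size from the header rather than from a pattern in the data, the payload is free to contain ‘\0’ bytes.

```python
import socket
import struct

# Assumed framing for this sketch: a 4-byte big-endian length header.
HEADER = struct.Struct("!I")


def recv_exact(sock, n):
    """Loop until exactly n bytes have arrived; one recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # empty result means the peer closed the connection
            raise ConnectionError("connection closed before message was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send the payload size in a header, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock):
    """Read the header to learn the size, then read exactly that many bytes."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)


if __name__ == "__main__":
    # socketpair() gives two connected stream sockets inside one process.
    a, b = socket.socketpair()
    data = b"file bytes with \0 inside"
    send_message(a, data)
    assert recv_message(b) == data
    a.close()
    b.close()
```

The stream itself still has no message boundaries; the header is what lets the receiver rebuild them, and recv_exact is the "loop until you get everything you need" step.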