Chapter 2 Delivering the data
Adapted from slides provided for: Computer Networking: A Top Down Approach, 5th edition. Jim Kurose, Keith Ross, Addison-Wesley. All material copyright 1996-2010 J.F Kurose and K.W. Ross, All Rights Reserved.
Network Layer 4-1

Chapter goals
Understand how data moves between layers and systems on the network:
• IP address, subnet
• Routing
• Routing table
• Address resolution
• Protocols, ports and sockets

Interplay between routing, forwarding
• The routing algorithm fills in the local forwarding table.
• The forwarding table maps a header value to an output link:

  header value   output link
  0100           3
  0101           2
  0111           2
  1001           1

• Example: the value in the arriving packet’s header is 0111, so the packet leaves on output link 2.

IP datagram format
Header fields, 32 bits per row:
• ver (IP protocol version number) | head. len (header length, bytes) | type of service (“type” of data) | length (total datagram length, bytes)
• 16-bit identifier | flgs | fragment offset (these three are for fragmentation/reassembly)
• time to live (max number of remaining hops, decremented at each router) | upper layer (upper layer protocol to deliver payload to) | header checksum
• 32 bit source IP address
• 32 bit destination IP address
• Options (if any), e.g. timestamp, record route taken, specify list of routers to visit
• data (variable length, typically a TCP or UDP segment)

IP Addressing: introduction
• IP address: 32-bit identifier for a network interface
• interface: connection between host/router and physical link
  • routers typically have multiple interfaces
  • a host typically has one interface
• A host with multiple interfaces can act as a router
• 223.1.1.1 = 11011111 00000001 00000001 00000001 (223 . 1 . 1 . 1)
• Command: ifconfig
[Figure: three subnets with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]

IP Address vs MAC Address
MAC address
• Globally unique
• Statically configured by manufacturer
• Flat
Discussion: difference between IP address vs domain name?
IP address
• Not necessarily unique
• Dynamically assigned
• Hierarchical: made up of network part and host part, corresponding to hierarchy in the Internet

[zhang@storm ~]$ ifconfig
em1     Link encap:Ethernet  HWaddr B4:99:BA:01:3B:F6
        inet addr:150.108.68.26  Bcast:150.108.68.255  Mask:255.255.255.0      (IP address within Fordham network)
        inet6 addr: fe80::b699:baff:fe01:3bf6/64 Scope:Link
        ….
em2     Link encap:Ethernet  HWaddr B4:99:BA:01:3B:F8
        UP BROADCAST MULTICAST  MTU:1500  Metric:1
        …
em3     Link encap:Ethernet  HWaddr B4:99:BA:01:3B:FA
        UP BROADCAST MULTICAST  MTU:1500  Metric:1
        ….
em4     Link encap:Ethernet  HWaddr B4:99:BA:01:3B:FC
        UP BROADCAST MULTICAST  MTU:1500  Metric:1
        ….
lo      Link encap:Local Loopback
        inet addr:127.0.0.1  Mask:255.0.0.0
        inet6 addr: ::1/128 Scope:Host
        …
virbr0  Link encap:Ethernet  HWaddr 52:54:00:F2:86:A6
        inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0     (private IP address used in CIS dept. network)

IP Addresses
“class-full” addressing (32 bits):

  class   leading bits   layout              address range
  A       0              network | host      1.0.0.0 to 127.255.255.255
  B       10             network | host      128.0.0.0 to 191.255.255.255
  C       110            network | host      192.0.0.0 to 223.255.255.255
  D       1110           multicast address   224.0.0.0 to 239.255.255.255

Getting a datagram from source to dest.
Forwarding table in A:

  Dest. Net.   next router   Nhops
  223.1.1      -             1
  223.1.2      223.1.1.4     2
  223.1.3      223.1.1.4     2

IP datagram: misc fields | source IP addr | dest IP addr | data
• The source and destination address fields are the fields of interest here.
• A datagram remains unchanged as it travels from source to destination.
[Figure: hosts A, B and E on subnets 223.1.1, 223.1.2 and 223.1.3, joined by a router with interfaces 223.1.1.4, 223.1.2.9 and 223.1.3.27]

Getting a datagram from source to dest.
Forwarding table in A: as above.
IP datagram: misc fields | 223.1.1.1 | 223.1.1.3 | data
Starting at A, send IP datagram addressed to B:
• look up net. address of B in forwarding table
• find B is on same net.
as A
• link layer will send datagram directly to B inside link-layer frame
• B and A are directly connected

Getting a datagram from source to dest.
Forwarding table in A: unchanged (223.1.1 is direct, 1 hop; 223.1.2 and 223.1.3 go via next router 223.1.1.4, 2 hops).
IP datagram: misc fields | 223.1.1.1 | 223.1.2.2 | data
Starting at A, dest. E:
• look up network address of E in forwarding table
• E is on a different network: A and E are not directly attached
• routing table: next hop router to E is 223.1.1.4
• link layer sends datagram to router 223.1.1.4 inside link-layer frame
• datagram arrives at 223.1.1.4
• continued…

Getting a datagram from source to dest.
IP datagram: misc fields | 223.1.1.1 | 223.1.2.2 | data
Arriving at 223.1.1.4, destined for 223.1.2.2:
• look up network address of E in router’s forwarding table
• E is on the same network as router’s interface 223.1.2.9: router and E are directly attached
• link layer sends datagram to 223.1.2.2 inside link-layer frame via interface 223.1.2.9
• datagram arrives at 223.1.2.2!!! (hooray!)
Forwarding table in router:

  Dest. Net.   next router   Nhops   interface
  223.1.1      -             1       223.1.1.4
  223.1.2      -             1       223.1.2.9
  223.1.3      -             1       223.1.3.27

Subnetting
• Problem 1: Any network needing more than 255 hosts needed a class B address, or many class C addresses.
• Problem 2: Each new network implies an additional entry in the forwarding table, so the table grows large.
• Solution: Share one network number between several networks.

…Subnetting
• Made most sense for large corporations or campuses
• Corporation networks share 1 network number
• Number of other networks within the corporation, using subnet masks
• E.g.
a class B address is shared among 8 networks by using a 19-bit “subnet mask” (255.255.224.0 = 11111111 11111111 11100000 00000000)
• I.e. subnet addresses are defined by the first 19 bits of the IP address.
• Host part now has a “subnet” part in it.
• Class B network address continues to be advertised to the rest of the Internet; subnetting is only used “within campus”.

Subnet mask
• Introduces another level of hierarchy into the IP address
• 8 bits are borrowed from the host address field to create a subnet address field.
• Subnet mask: 255.255.255.0, i.e., all 1’s in upper 24 bits and 0’s in lower 8 bits
  * 24 bits are network number
  * 8 bits are host number

Forwarding Ex. with Subnet Masks
• Routing Table:

  SubnetNumber    SubnetMask      NextHop
  128.96.170.0    255.255.254.0   Intface 0
  128.96.168.0    255.255.254.0   Intface 1
  128.96.166.0    255.255.254.0   R2
  128.96.164.0    255.255.252.0   R3
  Default                         R4

• Forwarding pseudocode:

  D = Dest IP Address
  For each table entry (SubnetNumber, SubnetMask, NextHop)
      If (D & SubnetMask == SubnetNumber)
          if NextHop is an interface
              forward datagram to the interface
          else
              deliver datagram to NextHop (a router)

IP addressing: CIDR
• Classful addressing: inefficient use of address space, address space exhaustion
  • e.g., a class B net is allocated enough addresses for 65K hosts, even if there are only 2K hosts in that network
• CIDR: Classless InterDomain Routing
  • network portion of address of arbitrary length
  • address format: a.b.c.d/x, where x is # bits in network portion of address
• Example: 200.23.16.0/23 = 11001000 00010111 00010000 00000000; the first 23 bits are the network part, the last 9 bits are the host part.

Special IP address within subnet
NETWORK ADDRESS
• A network address is an address where all host bits in the IP address are set to zero (0).
• It is the first and lowest numbered address.
BROADCAST ADDRESS
• All host bits in the IP address are set to one (1).
• It is the last address in the range of addresses.
• All hosts are to accept and respond to the broadcast address. This makes special services possible.
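The mask-and-compare forwarding rule and the network/broadcast addresses described above can be sketched in a few lines of Python. This is only an illustrative sketch: the table is the one from the subnet-mask forwarding example, the function names `next_hop` and `network_and_broadcast` are made up for this example, the destination addresses in the usage note are arbitrary, and the standard-library `ipaddress` module stands in for the bit arithmetic a real router does in its forwarding path.

```python
import ipaddress

# Forwarding table from the subnet-mask example: (SubnetNumber, SubnetMask, NextHop).
TABLE = [
    ("128.96.170.0", "255.255.254.0", "Interface 0"),
    ("128.96.168.0", "255.255.254.0", "Interface 1"),
    ("128.96.166.0", "255.255.254.0", "R2"),
    ("128.96.164.0", "255.255.252.0", "R3"),
]
DEFAULT_NEXT_HOP = "R4"  # the "Default" row of the table


def to_int(dotted: str) -> int:
    """Convert a dotted-decimal IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def next_hop(dest: str) -> str:
    """Return the next hop of the first entry where (D & SubnetMask) == SubnetNumber."""
    d = to_int(dest)
    for subnet, mask, hop in TABLE:
        if (d & to_int(mask)) == to_int(subnet):
            return hop
    return DEFAULT_NEXT_HOP


def network_and_broadcast(cidr: str) -> tuple[str, str]:
    """Return (network address, broadcast address) for a block written as a.b.c.d/x."""
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net.network_address), str(net.broadcast_address)


if __name__ == "__main__":
    for dest in ("128.96.171.92", "128.96.167.151", "128.96.165.121", "128.96.163.151"):
        print(dest, "->", next_hop(dest))
    print(network_and_broadcast("200.23.16.0/23"))
```

With this table, 128.96.171.92 matches the first entry (171 & 254 = 170) and goes to Interface 0, while 128.96.163.151 matches no entry and falls through to the default router R4. The entries are tried in table order, as in the pseudocode; a router using longest-prefix matching would instead prefer the most specific matching entry.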
Hosts
LOOPBACK ADDRESS
• The 127.0.0.0 class 'A' network is used for only a single address, the loopback address 127.0.0.1.
• It is used to test the local network interface device's functionality.
• All network interface devices should respond to this address.
• ping 127.0.0.1 to test network hardware and software

Special Use IP addresses
PRIVATE IP ADDRESSES
RFC 1918 defines a number of IP blocks set aside by the Internet Assigned Numbers Authority (IANA) for use as private addresses on private networks that are not directly connected to the Internet.

  Class   Start         End
  A       10.0.0.0      10.255.255.255
  B       172.16.0.0    172.31.255.255
  C       192.168.0.0   192.168.255.255

Special Use IP addresses
Multicast IP addresses: set aside for special purposes, such as the addresses used by OSPF, multicast, and experimental purposes; they cannot be used as ordinary host addresses on the Internet.

  Class   Start       End
  D       224.0.0.0   239.255.255.255

IP addresses: how to get one?
Q: How does a network get the network part of IP addr?
A: It gets allocated a portion of its provider ISP’s address space.

  ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
  Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
  Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
  Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
  ...              …..                                   ….
  Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23

Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
• Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) all connect to Fly-By-Night-ISP.
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16”

Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1:
• Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) stay with Fly-By-Night-ISP; Organization 1 (200.23.18.0/23) is now attached to ISPs-R-Us.
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
• Longest match: traffic for Organization 1 follows the more specific /23 route.

IP addressing: the last word...
Q: How does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
• allocates addresses
• manages DNS
• assigns domain names, resolves disputes

IP addresses: how to get one?
Q: How does a host get IP address?
• hard-coded by system admin in a file
  • Windows: control-panel->network->configuration->tcp/ip->properties
  • UNIX: /etc/rc.config
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
  • “plug-and-play”

DHCP: Dynamic Host Configuration Protocol
Goal: allow host to dynamically obtain its IP address from network server when it joins network
• Can renew its lease on address in use
• Allows reuse of addresses (only hold address while connected and “on”)
• Support for mobile users who want to join network (more shortly)
DHCP overview:
• host broadcasts “DHCP discover” msg [optional]
• DHCP server responds with “DHCP offer” msg [optional]
• host requests IP address: “DHCP request” msg
• DHCP server sends address: “DHCP ack” msg

DHCP client-server scenario
[Figure: the three-subnet network with a DHCP server attached; an arriving DHCP client needs an address in this network]

DHCP client-server scenario
DHCP server: 223.1.2.5, arriving client
DHCP discover src:
0.0.0.0, 68; dest: 255.255.255.255, 67; yiaddr: 0.0.0.0; transaction ID: 654
DHCP offer src: 223.1.2.5, 67; dest: 255.255.255.255, 68; yiaddr: 223.1.2.4; transaction ID: 654; Lifetime: 3600 secs
DHCP request src: 0.0.0.0, 68; dest: 255.255.255.255, 67; yiaddr: 223.1.2.4; transaction ID: 655; Lifetime: 3600 secs
DHCP ACK src: 223.1.2.5, 67; dest: 255.255.255.255, 68; yiaddr: 223.1.2.4; transaction ID: 655; Lifetime: 3600 secs
(The four messages are exchanged in this order over time.)

DHCP: more than IP address
DHCP can return more than just the allocated IP address on the subnet:
• address of first-hop router for client
• name and IP address of DNS server
• network mask (indicating network versus host portion of address)

NAT: Network Address Translation
[Figure: local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 and a NAT router (10.0.0.4 inside, 138.76.29.7 outside) connecting to the rest of the Internet]
• All datagrams leaving the local network have the same single source NAT IP address, 138.76.29.7, and different source port numbers.
• Datagrams with source or destination in this network have a 10.0.0/24 address for source, destination (as usual).

NAT: Network Address Translation
Motivation: the local network uses just one IP address as far as the outside world is concerned:
• range of addresses not needed from ISP: just one IP address for all devices
• can change addresses of devices in local network without notifying outside world
• can change ISP without changing addresses of devices in local network
• devices inside local net are not explicitly addressable or visible by the outside world (a security plus).

NAT: Network Address Translation
Implementation: NAT router must:
• outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
  . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
• remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
• incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

NAT: Network Address Translation
• 16-bit port-number field: 60,000 simultaneous connections with a single LAN-side address!
• NAT is controversial:
  • routers should only process up to layer 3
  • violates end-to-end argument
    • NAT possibility must be taken into account by app designers, e.g., P2P applications
  • address shortage should instead be solved by IPv6

NAT traversal problem
• client wants to connect to server with address 10.0.0.1
  • server address 10.0.0.1 is local to the LAN (client can’t use it as destination addr)
  • only one externally visible NATed address: 138.76.29.7
• solution 1: statically configure NAT to forward incoming connection requests at given port to server
  • e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000
[Figure: outside client, NAT router with public address 138.76.29.7, server 10.0.0.1 inside the LAN]

NAT traversal problem
• solution 2: Universal Plug and Play (UPnP) Internet Gateway Device (IGD) Protocol. Allows NATed host to:
  • learn public IP address (138.76.29.7)
  • add/remove port mappings (with lease times)
  i.e., automate static NAT port map configuration
[Figure: NATed host 10.0.0.1 talking IGD to the NAT router with public address 138.76.29.7]

NAT traversal problem
• solution 3: relaying (used in Skype)
  • NATed client establishes connection to relay
  • External client connects to relay
  • relay bridges packets between the two connections
[Figure: 1. connection to relay initiated by NATed host (10.0.0.1, behind NAT router 138.76.29.7); 2. connection to relay initiated by client; 3. relaying established]

ICMP: Internet Control Message Protocol
• used by hosts & routers to communicate network-level information
  • error reporting: unreachable host, network, port, protocol
  • echo request/reply (used by ping)
• network-layer “above” IP:
  • ICMP msgs carried in IP datagrams
• ICMP message: type, code plus first 8 bytes of IP datagram causing error

  Type   Code   description
  0      0      echo reply (ping)
  3      0      dest. network unreachable
  3      1      dest host unreachable
  3      2      dest protocol unreachable
  3      3      dest port unreachable
  3      6      dest network unknown
  3      7      dest host unknown
  4      0      source quench (congestion control - not used)
  8      0      echo request (ping)
  9      0      route advertisement
  10     0      router discovery
  11     0      TTL expired
  12     0      bad IP header

Traceroute and ICMP
• Source sends series of UDP segments to dest
  • first has TTL=1
  • second has TTL=2, etc.
  • unlikely port number
• When nth datagram arrives at nth router:
  • router discards datagram
  • and sends to source an ICMP message (type 11, code 0)
  • ICMP message includes name of router & IP address
• when ICMP message arrives, source calculates RTT
• traceroute does this 3 times
Stopping criterion
• UDP segment eventually arrives at destination host
• destination returns ICMP “port unreachable” packet (type 3, code 3)
• when source gets this ICMP, stops.

Hierarchical Routing
Our routing study thus far is an idealization:
• all routers identical
• network “flat”
… not true in practice
scale: with 200 million destinations:
• can’t store all dest’s in routing tables!
• routing table exchange would swamp links!
administrative autonomy
• internet = network of networks
• each network admin may want to control routing in its own network

Hierarchical Routing
• aggregate routers into regions, “autonomous systems” (AS)
• routers in same AS run same routing protocol
  • “intra-AS” routing protocol
  • routers in different AS can run different intra-AS routing protocol
• gateway router
  • at “edge” of its own AS
  • has link to router in another AS

Interconnected ASes
[Figure: three interconnected ASes: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); in each router the intra-AS and inter-AS routing algorithms feed the forwarding table]
• forwarding table configured by both intra- and inter-AS routing algorithm
  • intra-AS sets entries for internal dests
  • inter-AS & intra-AS set entries for external dests

Intra-AS Routing
• also known as Interior Gateway Protocols (IGP)
• most common Intra-AS routing protocols:
  • RIP: Routing Information Protocol
  • OSPF: Open Shortest Path First
  • IGRP: Interior Gateway Routing Protocol (Cisco proprietary)

RIP (Routing Information Protocol)
• included in BSD-UNIX distribution in 1982
• distance vector algorithm
  • distance metric: # hops (max = 15 hops), each link has cost 1
  • DVs exchanged with neighbors every 30 sec in response message (aka advertisement)
  • each advertisement: list of up to 25 destination subnets (in IP addressing sense)
[Figure: routers A, B, C, D and subnets u, v, w, x, y, z]
From router A to destination subnets:

  subnet   hops
  u        1
  v        2
  w        2
  x        3
  y        3
  z        2

OSPF (Open Shortest Path First)
• “open”: publicly available
• uses Link State algorithm
  • LS packet dissemination
  • topology map at each node
  • route computation using Dijkstra’s algorithm
• OSPF advertisement carries one entry per neighbor router
• advertisements disseminated to entire AS (via flooding)
  • carried in OSPF messages directly over IP (rather than TCP or UDP)

UNIX routing
Principle
1. Search for a matching host address
2. Search for a matching network address
3.
Search for a default entry (specified as a network entry, with network ID of 0)

netstat
Display routing table

[zhang@storm ~]$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         150.108.68.1    0.0.0.0         UG        0 0          0 em1
150.108.68.0    0.0.0.0         255.255.255.0   U         0 0          0 em1
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0

Multiplexing/demultiplexing
• Demultiplexing at rcv host: delivering received segments to correct socket
• Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
[Figure: hosts 1, 2 and 3, each with application, transport, network, link and physical layers; processes P1 to P4 attached to sockets]
Transport Layer 3-48

How demultiplexing works
• host receives IP datagrams
  • each datagram has source IP address, destination IP address
  • each datagram carries 1 transport-layer segment
  • each segment has source, destination port number
• host uses IP addresses & port numbers to direct segment to appropriate socket
TCP/UDP segment format (32 bits wide): source port # | dest port #; other header fields; application data (message)

Connectionless demultiplexing
• recall: create sockets with host-local port numbers:
    DatagramSocket mySocket1 = new DatagramSocket(12534);
    DatagramSocket mySocket2 = new DatagramSocket(12535);
• recall: when creating datagram to send into UDP socket, must specify (dest IP address, dest port number)
• when host receives UDP segment:
  • checks destination port number in segment
  • directs UDP segment to socket with that port number
• IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket

Connectionless demux (cont)
    DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: server IP C with socket on port 6428 and clients IP A and IP B]
• client IP A to server IP C: SP: 9157, DP: 6428; reply: SP: 6428, DP: 9157
• client IP B to server IP C: SP: 5775, DP: 6428; reply: SP: 6428, DP: 5775
• SP
provides “return address”

Connection-oriented demux
• TCP socket identified by 4-tuple:
  • source IP address
  • source port number
  • dest IP address
  • dest port number
• recv host uses all four values to direct segment to appropriate socket
• server host may support many simultaneous TCP sockets:
  • each socket identified by its own 4-tuple
• web servers have different sockets for each connecting client
  • non-persistent HTTP will have different socket for each request

Connection-oriented demux (cont)
[Figure: server IP C with processes P4, P5, P6, one socket per connection; segments arriving at the server: SP: 9157, DP: 80, S-IP: A, D-IP: C; SP: 5775, DP: 80, S-IP: B, D-IP: C; SP: 9157, DP: 80, S-IP: B, D-IP: C]

Connection-oriented demux: Threaded Web Server
[Figure: same three segments, but delivered to separate sockets of a single server process P4 that handles each connection in its own thread]

TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket("hostname", portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
• Step 1: client host sends TCP SYN segment to server
  • specifies initial seq #
  • no data
• Step 2: server host receives SYN, replies with SYNACK segment
  • server allocates buffers
  • specifies server initial seq. #
• Step 3: client receives SYNACK, replies with ACK segment, which may contain data

TCP Connection Management (cont.)
Closing a connection:
client closes socket: clientSocket.close();
• Step 1: client end system sends TCP FIN control segment to server
• Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
• Step 3: client receives FIN, replies with ACK.
  • Enters “timed wait”: will respond with ACK to received FINs
• Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: timeline with client states close, closing, timed wait, closed and server states closing, closed]

TCP Connection Management (cont)
[Figures: TCP client lifecycle and TCP server lifecycle state diagrams]

TCP segment structure
Header fields, 32 bits per row:
• source port # | dest port #
• sequence number
• acknowledgement number
• head len | not used | U A P R S F | Receive window
• checksum | Urg data pointer
• Options (variable length)
• application data (variable length)
Notes:
• sequence and acknowledgement numbers: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
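
The TCP segment structure above can be made concrete by unpacking the fixed 20-byte header with Python's standard `struct` module. This is a minimal sketch rather than a full TCP implementation: the names `parse_tcp_header` and `build_syn` are made up for this example, options are not decoded, and the checksum is left at zero instead of being computed.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" field.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

# Network byte order: two 16-bit ports, 32-bit seq and ack numbers, then four 16-bit fields.
HEADER_FORMAT = "!HHIIHHHH"


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header into a dictionary."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack,
     len_flags, window, checksum, urg_ptr) = struct.unpack(HEADER_FORMAT, segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                            # counts bytes of data, not segments
        "ack": ack,
        "header_len": (len_flags >> 12) * 4,   # head len is stored in 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if len_flags & bit},
        "window": window,                      # bytes the receiver is willing to accept
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


def build_syn(src_port: int, dst_port: int, seq: int) -> bytes:
    """Hypothetical helper: a SYN header with no options and a zero checksum."""
    len_flags = (5 << 12) | FLAG_BITS["SYN"]   # 5 words = 20 bytes, SYN flag set
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, 0, len_flags, 65535, 0, 0)


if __name__ == "__main__":
    header = parse_tcp_header(build_syn(9157, 80, 1000))
    print(header["src_port"], header["dst_port"], header["flags"], header["header_len"])
```

The SYN built here corresponds to step 1 of the three-way handshake: it carries an initial sequence number and no data. The port numbers 9157 and 80 are borrowed from the demultiplexing figures; the sequence number 1000 is arbitrary.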