3. INTERNETWORKING (PART 3: IP)
Rocky K. C. Chang
Department of Computing, The Hong Kong Polytechnic University
18 February 2016

1. The internetworking problem
• Problem: How to interconnect heterogeneous networks effectively?
• Three problems with interconnection at the data-link layer:
  • It does not scale to the number of data-link technologies.
  • It does not scale to the number of hosts (or networks).
  • It does not provide a common addressing space.
[Figure: Network 1 (Ethernet), Network 2 (Ethernet), Network 3 (FDDI), and Network 4 (point-to-point) connecting hosts H1 to H8 through LAN switches S1, S2, and S3.]

1.1 A layer-three internetworking solution
• Use IP, XNS, IPX, etc. on top of the networks.
• Replace LAN switches with layer-three switches, more commonly known as routers.
• Add IP software to each end host (with the whole protocol suite software).
• Assign an IP address to each network interface.
[Figure: the same four networks and hosts, with routers R1, R2, and R3 in place of the switches.]

2. Encapsulation and address binding
• To transmit IP datagrams over any data-link network, two things are needed:
  • A standard way to encapsulate IP datagrams
  • Address resolution between IP addresses and MAC addresses
• RFCs specify the datagram encapsulation, and possibly the address resolution, for each data-link technology, e.g., Ethernet (RFC 894), IEEE 802 (RFC 1042), etc.
• A shared medium uses an Address Resolution Protocol (ARP) for address binding.

2.1 Data encapsulation
• You have seen these in chapter 2:
  • IP over DIX Ethernet
  • IP over IEEE 802.3
  • IP over PPP
• Others are in the RFC documents.

2.2 Address resolution protocol
• An ARP request message carrying the target IP address is broadcast at the data-link layer on the LAN.
• Every IP host picks up a copy of the message and examines the target IP address.
  • If it matches the host’s own IP address, the host sends an ARP reply message back to the sender with its MAC address.
  • Else, it drops the message.
• To reduce broadcast traffic, each host uses an ARP cache to remember recent bindings.

ARP packet format (32 bits per row; HLen and PLen are shown in bits):
  Bits 0-7        Bits 8-15       Bits 16-31
  Hardware type = 1               ProtocolType = 0x0800
  HLen = 48       PLen = 32       Operation
  SourceHardwareAddr (bytes 0-3)
  SourceHardwareAddr (bytes 4-5)  SourceProtocolAddr (bytes 0-1)
  SourceProtocolAddr (bytes 2-3)  TargetHardwareAddr (bytes 0-1)
  TargetHardwareAddr (bytes 2-5)
  TargetProtocolAddr (bytes 0-3)

2.3 An internetworking example
• On each “hop” or link, both data encapsulation and address resolution occur.
[Figure: protocol stacks along the path H1, R1, R2, R3, H8. H1: TCP / IP / ETH. R1: IP over ETH and FDDI. R2: IP over FDDI and PPP. R3: IP over PPP and ETH. H8: TCP / IP / ETH.]

3. The IP service model
• The IP service model consists of
  • an addressing scheme to identify an IP host, and
  • a datagram (connectionless) model of data delivery.
• IP provides a best-effort service.
  • IP makes its best effort to send a datagram to its destination.
  • The best-effort service does not guarantee reliable datagram delivery, i.e., it is an unreliable service.

3.1 Internet protocol suite (incomplete)
  Application: FTP, Ping, DNS, HTTP, NV, TFTP, RTP, SSL
  Transport:   UDP, TCP
  Network:     ICMP, IGMP, IP, ARP & RARP
  Data-link:   NET1, NET2, …, NETn

4. IP datagram
IP header format (32 bits per row):
  Version (bits 0-3)   HLen (bits 4-7)      TOS (bits 8-15)        Length (bits 16-31)
  Ident (bits 0-15)                         Flags (bits 16-18)     Offset (bits 19-31)
  TTL (bits 0-7)       Protocol (bits 8-15)                        Checksum (bits 16-31)
  SourceAddr
  DestinationAddr
  Options (variable)                                               Pad (variable)
  Data
• Version: 4 for the current IP.
• Type of service (TOS) specifies how a router should handle this datagram.
• Header length handles a variable-length header.
  • The IP header is 20 bytes long without IP options.
• A 16-bit length limits the size of an IP datagram to 65,535 bytes, including the IP header.
• Identification, flags, and offset are used for packet fragmentation and reassembly.
• Time to live (TTL) limits the number of times that a datagram can be processed by routers.
• Protocol specifies the type of payload, e.g., 6 for TCP and 17 for UDP.
• Checksum is a 16-bit checksum computed over the IP header as a sequence of 16-bit words.
• IP options, e.g.,
  • Source routing
  • Record route

5. MTU and packet fragmentation
• Each network chooses a maximum packet size that can be sent on it, the Maximum Transmission Unit (MTU). For example,
  • 1500 bytes for 10-Mbps Ethernet
  • 4352 bytes for FDDI
  • 17914 bytes for 16-Mbps token ring
• Note that all these MTUs are smaller than an IP datagram’s maximum size.
• One internetworking problem is to accommodate various MTU values.
• To send datagrams to a directly attached host, use the network’s MTU.
• To send datagrams to a nondirectly attached host, use the path MTU.
  • Path MTU is the minimum of the networks’ MTUs on the path from the source to the destination.
• If a datagram is larger than the path MTU, packet fragmentation occurs.
  • Fragmentation occurs when a router attempts to forward the datagram to a network with a smaller MTU.

Example: a datagram with 1400 data bytes sent from H1 to H8.
  Hop        Frames on the link
  H1 to R1   ETH | IP (1400)
  R1 to R2   FDDI | IP (1400)
  R2 to R3   PPP | IP (512), PPP | IP (512), PPP | IP (376)
  R3 to H8   ETH | IP (512), ETH | IP (512), ETH | IP (376)

Header fields used in fragmentation:
  (a) Unfragmented datagram
      Ident = x, more-fragments flag = 0, Offset = 0, 1400 data bytes
  (b) Fragmented datagram
      Ident = x, more-fragments flag = 1, Offset = 0, 512 data bytes
      Ident = x, more-fragments flag = 1, Offset = 512, 512 data bytes
      Ident = x, more-fragments flag = 0, Offset = 1024, 376 data bytes
  (Offsets are shown in bytes; the Offset field itself counts 8-byte units, i.e., 0, 64, and 128 here.)

• Each IP fragment contains enough information to be forwarded to the destination.
• A fragmented IP datagram is reassembled only at the destination node.
• If any fragment does not arrive within a certain time, the other received fragments of the datagram are discarded.
• An IP datagram can be fragmented multiple times along its path.

6. IP subnets
• IP subnets introduce additional levels within an IP network:
  • A network address, a subnet ID, and a host ID.
• IP subnets offer flexibility in allocating addresses to sub-networks of different sizes.
• A subnet mask indicates which bits of an address belong to the network number and subnet ID.
• Each network interface stores a subnet mask together with its unicast IP address.
• Subnetting for a class B address:
  Class B address:                Network number (16 bits) | Host number (16 bits)
  Subnet mask (255.255.255.0):    1111111111111111 | 11111111 | 00000000
  Subnetted address:              Network number | Subnet ID | Host ID

Example subnets:
  Subnet number    Subnet mask        Attached interfaces
  128.96.34.0      255.255.255.128    H1 (128.96.34.15), R1 (128.96.34.1)
  128.96.34.128    255.255.255.128    R1 (128.96.34.130), H2 (128.96.34.139), R2 (128.96.34.129)
  128.96.33.0      255.255.255.0      R2 (128.96.33.1), H3 (128.96.33.14)

7. IP forwarding mechanisms
• Assume that both routers and hosts already have appropriate routing tables in place.
  • Routing tables for routers are constructed by routing protocols.
  • Routing tables for hosts are constructed by other means.
• Problem: Given a routing table, how do hosts and routers forward datagrams?

7.1 Examples of routing tables
• For example, R1’s routing table:
  Network/Subnet   Subnet Mask        Next Hop
  128.96.34.0      255.255.255.128    upper int.
  128.96.34.128    255.255.255.128    lower int.
  128.96.33.0      255.255.255.0      128.96.34.129
• For example, H1’s routing table:
  Network/Subnet   Subnet Mask        Next Hop
  128.96.34.0      255.255.255.128    upper int.
  0.0.0.0          0.0.0.0            128.96.34.1

7.2 Host’s forwarding mechanisms
• A host first determines whether the destination host is on the same LAN.
  • If it is, the host sends the datagram to the destination directly.
  • If it is not, the host sends the datagram to a default router.
• In both cases, the host uses its ARP cache, or ARP itself, to find the MAC address of the node it sends the frame to.
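
The host-side decision in 7.2 can be sketched with H1’s routing table from 7.1. This is a minimal illustration, not a real IP stack: the names `H1_TABLE`, `to_int`, and `host_next_hop` are hypothetical, and treating any next hop ending in "int." as a directly attached interface is an assumption made only for this example.

```python
import ipaddress

# H1's routing table from 7.1: (network/subnet, subnet mask, next hop).
# Assumption: a next hop ending in "int." names a local interface,
# meaning the destination is on the same LAN.
H1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "upper int."),
    ("0.0.0.0", "0.0.0.0", "128.96.34.1"),
]


def to_int(dotted):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def host_next_hop(dest, table):
    """Return (decision, address whose MAC address must be resolved)."""
    d = to_int(dest)
    for subnet, mask, next_hop in table:
        # Mask the destination and compare with the subnet number.
        if (d & to_int(mask)) == to_int(subnet):
            if next_hop.endswith("int."):
                return ("direct", dest)
            return ("via router", next_hop)
    return ("unreachable", None)


if __name__ == "__main__":
    for dest in ("128.96.34.100", "128.96.34.139", "128.96.33.14"):
        print(dest, host_next_hop(dest, H1_TABLE))
```

The second element of the result is the address the host would look up in its ARP cache: the destination itself for direct delivery, or the default router otherwise.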

7.3 A general forwarding mechanism
  D = destination IP address
  for each entry (Network/Subnet ID, Subnet Mask, Next Hop)
      D1 = Subnet Mask & D
      if D1 = Network/Subnet ID
          if Next Hop is an interface
              deliver datagram directly to destination
          else
              deliver datagram to Next Hop (a router)

7.4 Characteristics of IP forwarding
• Both hosts and routers are involved in forwarding.
  • Compared with routers, a host makes a much simpler binary decision.
• IP forwarding is done on a hop-by-hop basis.
  • It is assumed that the next-hop router is really closer to the destination.
• IP forwarding can specify a route to a network, and does not have to specify a route to every host.

8. The routing problem
• Problem: How does a router construct its routing table for IP forwarding?
• Forwarding vs routing
  • Routing is the process by which forwarding tables are built.
• Forwarding table vs routing table
  • A routing table is built by routing protocols as a precursor to building the forwarding table.
  • A forwarding table contains enough detailed information to speed up datagram forwarding.

8.1 Internet topology
[Figure: Internet topology showing large and small corporations, “consumer” ISPs, backbone service providers, and peering points.]
• Major components in the Internet topology:
  • Autonomous system (AS), e.g., polyu.edu.hk, ibm.com, etc.
  • Internet service providers (ISPs): local ISPs, regional ISPs, national ISPs, backbone ISPs.
  • Exchange networks: for local traffic interchange, e.g., HKIX.
  • Some special networks, like Harnet in Hong Kong.
• Routers (plus other networks) are usually used to connect these components together.

8.1 Not all routers are equal
• Interior routers: Only know how to route datagrams to destinations within the same AS.
• Border routers: Interface between their own AS and other ASes:
  • A nonbackbone router usually has a “default route” to another “more knowledgeable” router for “unknown destinations.”
  • A backbone router is supposed to know every IP network in the Internet.
• Intradomain routing vs interdomain routing

8.2 Distance vector routing protocols
• Each node does two things:
  • It constructs a one-dimensional array (a vector) containing the “distances” (costs) to all other nodes.
  • It distributes the vector to its immediate neighbors.
• Each node’s vector initially consists of
  • a distance of 0 for reaching itself, and
  • a distance of infinity for reaching other nodes.
• When the algorithm converges, each node knows for each destination node
  • (1) the next node closer to the destination, and
  • (2) the associated cost for this path.

8.2.1 An example
[Figure: example network with nodes A, B, C, D, E, F, and G.]
• Node A’s routing table (using hop count as the cost)
  1. Initially
     Destination   Cost   Next hop
     A             0      A
  2. After talking with its neighbors
     Destination   Cost   Next hop
     A             0      A
     B             1      B
     C             1      C
     E             1      E
     F             1      F
  3. After convergence
     Destination   Cost   Next hop
     A             0      A
     B             1      B
     C             1      C
     D             2      C
     E             1      E
     F             1      F
     G             2      F

8.2.2 Dynamic routing
• Each node periodically sends its distance vector to its neighbors (periodic updates).
• If link A-C fails,
  • The cost in A’s entry to C becomes infinity.
  • B will advertise to A a path to C with cost 1.
  • F will advertise to A a path to C with cost 2.
  • Therefore, A’s entry to C is updated to: Next hop = B and cost = 2.
• Each node may also send an updated distance vector to its neighbors when triggered by external events (triggered updates).
• If link A-C fails,
  • The cost in A’s entry to C becomes infinity.
  • A will immediately send its updated vector to B, E, F.
  • This update does not affect B’s routing table.
  • However, E will update its entry to C from 2 to infinity, and then from infinity to 3; and similarly for F.
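
The tables in 8.2.1 and the A-C failure in 8.2.2 can be reproduced with a small simulation. This is a sketch under stated assumptions: the edge list `EDGES` is not given in the text and is chosen here only because it is consistent with node A’s tables; the function names are hypothetical; and the failure case simply recomputes from scratch without link A-C rather than modelling periodic or triggered updates.

```python
INF = float("inf")

# Assumed topology (not stated in the text), chosen to match A's tables.
EDGES = [("A", "B"), ("A", "C"), ("A", "E"), ("A", "F"),
         ("B", "C"), ("C", "D"), ("D", "G"), ("F", "G")]


def neighbors_of(edges):
    """Build an adjacency map from an undirected edge list."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs


def distance_vector(edges):
    """Exchange distance vectors in rounds until no entry changes.

    Returns table[node][dest] = (cost, next_hop), with hop count as cost.
    """
    nbrs = neighbors_of(edges)
    nodes = sorted(nbrs)
    # Initially: cost 0 to itself, infinity to every other node.
    table = {n: {d: ((0, n) if d == n else (INF, None)) for d in nodes}
             for n in nodes}
    changed = True
    while changed:
        changed = False
        # Snapshot of the vector each node advertises in this round.
        adverts = {n: {d: table[n][d][0] for d in nodes} for n in nodes}
        for n in nodes:
            for nb in sorted(nbrs[n]):
                for d in nodes:
                    cost = adverts[nb][d] + 1  # one extra hop via neighbor nb
                    if cost < table[n][d][0]:
                        table[n][d] = (cost, nb)
                        changed = True
    return table


if __name__ == "__main__":
    converged = distance_vector(EDGES)
    print("A's table:", converged["A"])
    without_ac = distance_vector([e for e in EDGES if e != ("A", "C")])
    print("A to C after A-C fails:", without_ac["A"]["C"])
    print("E to C after A-C fails:", without_ac["E"]["C"])
```

Under the assumed topology, A reaches D through C and G through F at cost 2, and after the failure A reaches C through B at cost 2 while E’s cost to C rises to 3, matching the values quoted in the slides.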

8.2.3 Routing loops
• If the link A-E fails,
  • The corresponding entry in A is updated.
  • A sends a triggered update, while B, C, and F send periodic updates.
• Possible timing (>: earlier than):
  • Case 1: A > B and A > C and A > F
  • Case 2: A > B and A > C but A < F
  • Case 3: A > B and A > F, but A < C
• In case 1, all nodes will eventually conclude that E is unreachable.
• In case 2, a routing loop between A and F forms.
• In case 3, a routing loop between A and C forms.
• In both cases 2 and 3, the cost to E keeps on increasing.
• One solution to this problem is to declare the link unusable when the cost reaches, say, 16 (count to infinity).
• Split horizon is another solution, which addresses two-node routing loops.
  • A node will not advertise a route back to the node that serves as the next hop for that route.
  • For example, B, C, F will not advertise their routes to E back to A.

8.2.4 Routing information protocol (RIP)
• RIP implements the distance vector approach.
  • A hop count of 16 is interpreted as infinity.
  • Each RIP router broadcasts its distance vector to its neighbors every 30 seconds.
• RIP is implemented at the application level.
  • Common daemons used on Unix systems are the programs routed and gated.
  • RIP packets are carried over UDP and IP.

8.3 Link state routing protocols
• In this approach, every node maintains the network topology information in a link state database.
• Thus, this approach relies on two mechanisms:
  • reliable flooding for dissemination of link-state information, and
  • a shortest-path algorithm for computing routes.

8.3.1 An example
[Figure: example network with nodes A, B, C, D, E, F, and G.]
• Link state database:
  From   To   Metric   Seq. Number
  A      B    1        1
  A      C    1        1
  A      D    1        1
  A      E    1        1
  A      F    1        1
  A      G    1        1
  B      A    1        1
  B      C    1        1
  B      D    1        1
  B      E    1        1
  B      F    1        1
  B      G    1        1
  :      :    :        :

8.3.2 Link state updates
• The link state can be based on any metric, including hop count, latency, throughput, monetary cost, etc.
• When a link state changes, say from 1 to 2 for AE, A will send this update to all other nodes through a reliable flooding scheme.
  • A sends the update to B, C, F.
  • A ensures the reliable transmission of the update through positive acknowledgment and retransmission.
• B, C, F, upon receiving the update, compare the sequence number of the update with that in their databases.
  • If the sequence number in the update is higher, they update the link state in the database and forward the update on all interfaces except the one on which it was received.
  • Otherwise, they drop the update and leave the database unchanged.
• Although C receives two copies of the update, it forwards only one copy to D and discards the other.
• The new link state database becomes
  From   To   Metric   Seq. Number
  A      B    1        1
  A      C    1        1
  A      D    1        1
  A      E    2        2
  A      F    1        1
  A      G    1        1
  B      A    1        1
  B      C    1        1
  B      D    1        1
  B      E    1        1
  B      F    1        1
  B      G    1        1
  :      :    :        :

8.3.3 Computing optimal paths
• Given a link state database for the network topology, each node can apply any shortest-path algorithm to find optimal paths from itself to the other nodes in the network.
• For example, using the hop count as the metric, we have for node A:
[Figure: shortest-path tree rooted at node A, spanning nodes B, C, D, E, F, and G.]

8.3.4 Open shortest path first (OSPF) protocol
• OSPF implements a link state approach.
• OSPF supports different type-of-service routing by having different sets of metrics for route computation.
• OSPF supports equal-cost routes to a destination.
• OSPF reduces the amount of routing update messages as compared with RIP.
• OSPF provides fast and loop-free convergence.
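
The route computation in 8.3.3 can be sketched with Dijkstra’s algorithm, one possible shortest-path algorithm. The slides do not say which algorithm is used, so this is an illustration rather than the method OSPF mandates in every detail. The edge list `EDGES` is an assumption, chosen to agree with node A’s distance vector tables in 8.2.1 and not read from the link state database listing; `build_lsdb` and `shortest_paths` are hypothetical names.

```python
import heapq

# Assumed topology (not stated in the text); every link has metric 1.
EDGES = [("A", "B"), ("A", "C"), ("A", "E"), ("A", "F"),
         ("B", "C"), ("C", "D"), ("D", "G"), ("F", "G")]


def build_lsdb(edges, overrides=None):
    """Return link state records (from, to, metric), one per direction."""
    overrides = overrides or {}
    lsdb = []
    for u, v in edges:
        lsdb.append((u, v, overrides.get((u, v), 1)))
        lsdb.append((v, u, overrides.get((v, u), 1)))
    return lsdb


def shortest_paths(lsdb, source):
    """Dijkstra over the link state database.

    Returns (dist, first_hop): the cost to each node and the neighbor
    of `source` on the chosen path to that node.
    """
    graph = {}
    for u, v, metric in lsdb:
        graph.setdefault(u, []).append((v, metric))
        graph.setdefault(v, [])
    dist, first_hop, done = {}, {}, set()
    heap = [(0, source, source)]  # (cost so far, node, first hop used)
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # a cheaper path to this node was already settled
        done.add(node)
        dist[node], first_hop[node] = cost, hop
        for nbr, metric in graph[node]:
            if nbr not in done:
                next_hop = nbr if node == source else hop
                heapq.heappush(heap, (cost + metric, nbr, next_hop))
    return dist, first_hop


if __name__ == "__main__":
    dist, hop = shortest_paths(build_lsdb(EDGES), "A")
    print(dist, hop)
    # Link AE changes from metric 1 to 2, as in 8.3.2.
    changed = build_lsdb(EDGES, {("A", "E"): 2, ("E", "A"): 2})
    print(shortest_paths(changed, "A")[0]["E"])
```

With hop count as the metric and the assumed topology, the costs from A agree with the converged distance vector table; raising the AE metric to 2 changes only the cost to E, since E has no other link in this topology.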