CPS-356: Computer Networks
Class 8: IP Forwarding + Routing
Theophilus Benson
Based partly on lecture notes by Rodrigo Fonseca, David Mazières, Phil Levis, John Jannotti

Administrivia
• Midterm 1:
  – Day after UNC game
  – New proposed dates:
    • 02/24/2015
• HW #1: going up tomorrow on the website (due in a week, 02/12/2015)

Today’s Lecture
• Forwarding
  – IP-Address/IP-Packet Format
  – Fragmentation
  – Debugging the network: ICMP
  – Getting IP-Address: ARP + DHCP
• Routing
  – Intra-Domain Routing
  – Distance Vector Protocol
    • Loop Detection + Avoidance

Format of IP Addresses
Classed Addresses
• Pros
  – Very simple to use and implement
  – Allows for hierarchical routing
  – Use first 3 bits to determine address class (A, B, C)
  – Based on class you know what bits to ignore
• Cons
  – Wasteful allocation
  – Statically specify network and host portion of address
CIDR Addresses
• Pros
  – Efficient allocation of resources
  – Dynamically specify network and host portion of address
• Cons
  – More complex to implement in hardware

Format of IP Addresses
• Classed Addresses (static partitioning of network/host portions)
  – Class A (8-bit prefix), B (16-bit), C (24-bit)
• CIDR (dynamic partitioning of network/host portions)
  – 128.23.16.12/31: the number after the slash specifies the prefix size, i.e. the number of bits in the network portion (NetMask)
      Address: 10000000.00010111.00010000.00001100
      NetMask: 11111111.11111111.11111111.11111110
  – Prefix size = 31 bits
  – Host size = 32 - 31 = 1 bit
  – Only 2^1 hosts in the network

Other CIDR Examples
• 128.23.16.12/24
      Address: 10000000.00010111.00010000.00001100
      NetMask: 11111111.11111111.11111111.00000000
  – Prefix size = 24 bits
  – Host size = 32 - 24 = 8 bits
  – Only 2^8 hosts in the network
• 128.23.16.12/32
      Address: 10000000.00010111.00010000.00001100
      NetMask: 11111111.11111111.11111111.11111111
  – Prefix size = 32 bits
  – Host size = 32 - 32 = 0 bits
  – Only 2^0 hosts in the network

Where Does IP-Address Fit Into a Packet?
[Figure: a packet drawn as nested headers. Ethernet header: Destination MAC Address, Source MAC Address, Length/Type. IP header: Version, Hdr len, TOS, Total Length, Identification, flags, Fragment Offset, TTL, Protocol, Hdr checksum, Source IP Address, Destination IP Address, Options, Padding. Transport header: Src Port, Dst Port, Seq Number, Ack Number, Offset, Reserved, Window. Then Data (Payload).]

IPv4 packet format
[Figure: IPv4 header fields: vers, Hdr len, TOS, Total Length, Identification, DF, MF, Fragment Offset, TTL, Protocol, Hdr Checksum, Source IP Address, Destination IP Address, Options, Padding]
• Forward based on destination address
• TTL = Time to Live
  – Prevents forwarding loops
  – Decremented at each hop
• Cut large packets into smaller ones
  – E.g. from Ethernet to ATM
  – From 1500B to 64B
  – MF: more fragments
  – DF: don’t fragment (return an error to the sender)
• Version = IPv4 or IPv6
• Protocol = TCP/UDP?
• Header length == size of the header, which varies with the options carried (at most 60 bytes in total)
• Total length == length of header + payload

Why Do You Need to Fragment Packets?
• Different networks have different MTUs.
  – Router may need to fragment packets to allow them to cross different mediums
[Figure: three networks: DukeNet (Ethernet), MTU=1500; Le Theo Net (ATM), MTU=64; ATT (Ethernet), MTU=1500]

Implication of Fragmentation
• If a fragment is lost, must retransmit the whole packet! Why?
• Fragmentation delays reassembly of packet until all fragments are received
• Some people avoid fragmentation!

What do Fragmented Packets look like?
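One way to make the question concrete is to compute the fragments by hand. The sketch below is illustrative only: `Fragment` and `fragment` are hypothetical names, header bytes are ignored, and the only rules modelled are the ones named on the packet-format slides (a shared Identification value, the MF flag, and a Fragment Offset counted in 8-byte units).

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    ident: int    # Identification: identical on every fragment of one packet
    mf: int       # More Fragments flag: 1 on all fragments except the last
    offset: int   # Fragment Offset, counted in 8-byte units
    length: int   # payload bytes carried by this fragment


def fragment(payload_len: int, max_payload: int, ident: int) -> List[Fragment]:
    """Split payload_len bytes into fragments of at most max_payload bytes."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the offset field counts 8-byte units.
    chunk = (max_payload // 8) * 8
    if chunk <= 0:
        raise ValueError("max_payload must be at least 8 bytes")
    frags = []
    pos = 0
    while pos < payload_len:
        size = min(chunk, payload_len - pos)
        more = 1 if pos + size < payload_len else 0
        frags.append(Fragment(ident, more, pos // 8, size))
        pos += size
    return frags


if __name__ == "__main__":
    for f in fragment(1400, 512, ident=213):
        print(f)
```

Splitting 1400 bytes into pieces of at most 512 bytes gives three fragments of 512, 512 and 376 bytes with offsets 0, 64 and 128; only the last one has MF = 0.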
• Use ‘identification’, ‘fragment offset’ and ‘MF’ bit in IP header
  – Set the ‘MF’ bit (on every fragment except the last)
  – Use the same ‘Id’ for all fragments
  – Offset represents position in original packet (in units of 8 bytes)

    Packet       Id    MF   Offset   Data
    Original     213   0    0        1400 bytes
    Fragment 1   213   1    0        512 bytes
    Fragment 2   213   1    64       512 bytes
    Fragment 3   213   0    128      376 bytes

Internet Control Message Protocol (ICMP)
• Echo (ping)
• Redirect
• Destination unreachable (protocol, port, or host)
• TTL exceeded
• Checksum failed
• Reassembly failed
• Can’t fragment
• Many ICMP messages include part of packet that triggered them
• See http://www.iana.org/assignments/icmpparameters

[Figure: ICMP message format]

Example: Time Exceeded
• Code usually 0 (TTL exceeded in transit)
• Discussion: traceroute

Example: Can’t Fragment
• Sent if DF=1 and packet length > MTU
• What can you use this for?
• Path MTU Discovery
  – Can do binary search on packet sizes
  – But better: base algorithm on most common MTUs

How do you Make a Packet
[Figure: packet header diagram (transport, IP and Ethernet headers) with the source of each address annotated. Destination IP Address: "DNS gives this to you". Source MAC Address: "Comes with your hardware". Still open: "???????"]

Obtaining Host IP Addresses - DHCP
• Address must be assigned to each host by its network.
  – Manually: tedious and error-prone
  – Automatically: Dynamic Host Configuration Protocol
• Client: DHCP Discover to 255.255.255.255 (broadcast)
• Server(s): DHCP Offer to 255.255.255.255 (why broadcast?)
• Client: choose offer, DHCP Request (broadcast, why?)
• Server: DHCP ACK (again broadcast)
• Result: IP-address, gateway, netmask, DNS server

How do you Make a Packet
[Figure: packet header diagram with the source of each address annotated. Source IP Address: "DHCP". Destination IP Address: "DNS gives this to you". Source MAC Address: "Comes with your hardware". Destination MAC Address: "???????"]

What is the Destination Address?
• If dest. is in your network (e.g. Alice to Bob)
  – Then use the destination’s Ethernet address.
• If dest. is not in your network (e.g. Alice to Google)
  – Then use the gateway router’s Ethernet address.
  – The destination may use a different protocol
[Figure: Alice and Bob on DukeNet (Ethernet); Google beyond Le Theo Net (ATM); each link labelled Ethernet or ATM]

How do you find this destination address?
• Check local ARP table
  – If found use it. (DONE!)
  – Start sending packets!
• Compare my IP with dest IP
  – In same network?
    • Then ARP request for Dest IP
  – In different networks?
    • Then ARP request for Router IP
• Example
  – Alice: 128.23.16.12/30
  – Bob: 128.23.16.14
  – Google: 128.16.16.16
  – DukeNet: 128.23.16.12/30, 4 addresses, 128.23.16.12–128.23.16.15
  – Alice->Bob: same network
  – Alice->Google: diff network

How ARP works
[Figure: Alice and Bob on DukeNet (Ethernet), with speech bubbles in the order below]
• Alice (broadcast): "I am: 128.23.16.12. Who is IP: 128.23.16.14?"
• Hosts that hear the request: "Now I know who 128.23.16.12 is!"
• Bob: "I am: 128.23.16.14, MacAdd: 02……….."
• Hosts that hear the reply: "Now I know who 128.23.16.14 is!"

ARP Ethernet frame format
• Why include source hardware address?

How do you Make a Packet
[Figure: packet header diagram with the source of each address annotated. Source IP Address: "DHCP". Destination IP Address: "DNS gives this to you". Destination MAC Address: "ARP". Source MAC Address: "Comes with your hardware"]

Routing
• Routing is the process of updating forwarding tables
  – Routers exchange messages about routers or networks they can reach
  – Goal: find optimal route for every destination
  – … or maybe a good route, or any route (depending on scale)
• Challenges
  – Dynamic topology
  – Decentralized
  – Scale

Scaling Issues
• Every router must be able to forward based on any destination IP address
  – Given address, it needs to know next hop
  – Naïve: one entry per address
  – There would be 10^8 entries!
• Solutions
  – Hierarchy (many examples)
  – Address aggregation
    • Address allocation is very important (should mirror topology)
  – Default routes

IP Connectivity
• For each destination address, must either:
  – Have prefix mapped to next hop in forwarding table
  – Know “smarter router”: default for unknown prefixes
• Route using longest prefix match, default is prefix 0.0.0.0/0
• Core routers know everything: no default
• Manage using notion of Autonomous System (AS)

Internet structure, 1990
• Several independent organizations
• Hierarchical structure with single backbone

Internet structure, today
• Multiple backbones, more arbitrary structure

Autonomous Systems
• Correspond to an administrative domain
  – AS’s reflect organization of the Internet
  – E.g., DukeNet, large company, etc.
  – Identified by a 16-bit number
• AS’s are also called ISPs
  – ISP = Internet Service Provider
[Figure: DukeNet (routers A, B, C, D), ATT and Le Theo Net; Lnk 1 and Lnk 2 join DukeNet and ATT]
• AS’s choose their own local routing algorithm
  – How should A, B, C, D do routing?
• AS’s want to set policies about non-local routing
  – Should DukeNet use Link 1 or 2 to ATT?
• AS’s need not reveal internal topology of their network
  – That DukeNet has 4 routers

Inter and Intra-domain routing
• Routing organized in two levels
• Intra-domain routing
  – Complete knowledge, strive for optimal paths
  – Scale to ~100 networks
  – Today
• Inter-domain routing
  – Aggregated knowledge, scale to Internet
  – Dominated by policy
    • E.g., route through X, unless X is unavailable, then route through Y. Never route traffic from X to Y.
  – Policies reflect business agreements, can get complex
  – Next lecture
[Figure: DukeNet (routers A, B, C, D), ATT and Le Theo Net, with Lnk 1 and Lnk 2. Intradomain: routing inside DukeNet. Interdomain: routing across DukeNet, ATT, TheoNet]

Network as a graph
• Nodes are routers
• Assign cost to each edge
  – Can be based on latency, b/w, queue length, …
• Problem: find lowest-cost path between nodes
  – Each node individually computes routes

Basic Algorithms
• Two classes of intra-domain routing algorithms
• Distance Vector (Bellman-Ford SP Algorithm)
  – Requires only local state
  – Harder to debug
  – Can suffer from loops
• Link State (Dijkstra-Prim SP Algorithm)
  – Each node has global view of the network
  – Simpler to debug
  – Requires global state

Distance Vector
• Local routing algorithm
• Each node maintains a set of triples
  – <Destination, Cost, NextHop>
• Exchange updates with neighbors
  – Periodically (seconds to minutes)
  – Whenever table changes (triggered update)
• Each update is a list of pairs
  – <Destination, Cost>
• Update local table if receive a “better” route
  – Smaller cost
• Refresh existing routes, delete if time out

DV Example
• B only exchanges information with A and C
• C’s update to B: <D, 1>, <A, 1>

B’s routing table @ time = 0
    Destination   Cost       Next Hop
    A             1          A
    C             1          C
    D             infinity   --
    E             infinity   --
    F             infinity   --
    G             infinity   --

B’s routing table after C’s update
    Destination   Cost       Next Hop
    A             1          A
    C             1          C
    D             2          C
    E             infinity   --
    F             infinity   --
    G             infinity   --

Calculating the best path
• Bellman-Ford equation
• Let:
  – $D_b(d)$ denote the current best distance from b to d
  – $c(b,c)$ denote the cost of the link from b to c
• Then $D_b(d) = \min\big(D_b(d),\; c(b,c) + D_c(d)\big)$
• Routing messages contain D
• D is any additive metric
  – e.g., number of hops, queue length, delay
  – log can convert a multiplicative metric into an additive one (e.g., probability of failure)
• Applying C’s update at B:
  – $D_b(D) = \min(\infty,\; 1 + 1) = 2$
  – $D_b(A) = \min(1,\; 1 + 1) = 1$

DV Example
B’s routing table
    Destination   Cost   Next Hop
    A             1      A
    C             1      C
    D             2      C
    E             2      A
    F             2      A
    G             3      A

Adapting to Failures
[Figure: each node’s <Destination, Cost, NextHop> entry for G, before and after the F-G link fails]
• F-G fails
• F sets distance to G to infinity, propagates
• A sets distance to G to infinity
• A receives periodic update from C with 2-hop path to G
• A sets distance to G to 3 and propagates
• F sets distance to G to 4, through A

Count-to-Infinity
• Link from A to E fails
• A advertises distance of infinity to E
• B and C advertise a distance of 2 to E
• B decides it can reach E in 3 hops through C
• A decides it can reach E in 4 hops through B
• C decides it can reach E in 5 hops through A, …
• When does this stop?

Good news travels fast
[Figure: triangle of nodes A, B, C; edge labels 1, 4, 1, 10]
• A decrease in link cost has to be fresh information
• Network converges at most in O(diameter) steps

Bad news travels slowly
[Figure: triangle of nodes A, B, C (B-C cost 1, A-C cost 10, A-B cost rising from 4 to 12), with B’s and C’s routing tables redrawn at each step of the exchange below]
• An increase in cost may cause confusion with old information, may form loops
• Consider routes to A
• Initially, B:A,4,A; C:A,5,B
• Then B:A,12,A, selects C as next hop -> B:A,6,C
• C -> A,7,B; B -> A,8,C; C -> A,9,B; B -> A,10,C;
• C finally chooses C:A,10,A, and B -> A,11,C!
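The exchange above can be replayed with a small simulation. This is a sketch under stated assumptions, not the lecture's code: the link costs (A-B = 12 after the increase, B-C = 1, A-C = 10) are inferred from the routes listed, B and C are assumed to take turns recomputing starting with B, and `recompute` and `simulate_bad_news` are hypothetical names.

```python
import math


def recompute(node, links, dist):
    """One Bellman-Ford step: cheapest way from node to A via any neighbour."""
    best_cost, best_hop = math.inf, None
    for nbr, link_cost in links[node].items():
        candidate = link_cost + dist[nbr]
        if candidate < best_cost:
            best_cost, best_hop = candidate, nbr
    return best_cost, best_hop


def simulate_bad_news():
    # Link costs after the A-B link has risen from 4 to 12 (assumed values).
    links = {"B": {"A": 12, "C": 1}, "C": {"A": 10, "B": 1}}
    # State before the change: B reaches A directly (4), C goes via B (5).
    dist = {"A": 0, "B": 4, "C": 5}
    next_hop = {"B": "A", "C": "B"}
    trace = []
    node, other = "B", "C"
    quiet_turns = 0
    while quiet_turns < 2:          # stop once neither node changes its route
        cost, hop = recompute(node, links, dist)
        if (cost, hop) != (dist[node], next_hop[node]):
            dist[node], next_hop[node] = cost, hop
            trace.append((node, cost, hop))
            quiet_turns = 0
        else:
            quiet_turns += 1
        node, other = other, node
    return trace


if __name__ == "__main__":
    for node, cost, hop in simulate_bad_news():
        print(f"{node}: A, {cost}, {hop}")
```

Each turn raises the cost by only the B-C link cost, so seven route changes pass before C falls back to its direct link and the tables settle at B:A,11,C and C:A,10,A.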
How to avoid loops
• IP TTL field prevents a packet from living forever
  – Does not repair a loop
• Simple approach: consider a small cost n (e.g., 16) to be infinity
  – After n rounds decide node is unavailable
  – But rounds can be long, this takes time
• Problem: distance vector based only on local information

Bad news travels slowly
[Figure: triangle of nodes A, B, C with B’s and C’s routing tables in their final state]
• Why did it take a while to converge?

Better loop avoidance
• Split Horizon
  – When sending updates to node A, don’t include routes you learned from A
  – Prevents B and C from sending cost 2 to A
• Split Horizon with Poison Reverse
  – Rather than not advertising routes learned from A, explicitly include cost of ∞.
  – Faster to break out of loops, but increases advertisement sizes

Warning
• Split horizon/split horizon with poison reverse only help between two nodes
  – Can still get loop with three nodes involved
  – Might need to delay advertising routes after changes, but affects convergence time

Today’s Lecture
• Forwarding
  – IP-Address/IP-Packet Format
  – Fragmentation
  – Network Error Messages (Debugging): ICMP
  – Getting IP-Address: ARP + DHCP
• Routing
  – Intra-Domain Routing: RIP
• Next class:
  – Intra-Domain Routing: OSPF, OSPF v RIP
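The two advertisement rules from the loop-avoidance slides differ only in what a node says about routes whose next hop is the neighbor being updated. The sketch below shows one possible way to build such an update; `build_update` is a hypothetical helper, and the table is B's routing table from the DV example.

```python
import math
from typing import Dict, Tuple

# destination -> (cost, next hop), as in the <Destination, Cost, NextHop> triples
Table = Dict[str, Tuple[float, str]]


def build_update(table: Table, neighbor: str, poison_reverse: bool = False) -> Dict[str, float]:
    """Build the <Destination, Cost> list that is sent to one neighbor."""
    update = {}
    for dest, (cost, next_hop) in table.items():
        if next_hop == neighbor:
            # Route was learned from this neighbor.
            if poison_reverse:
                update[dest] = math.inf   # poison reverse: advertise infinity
            continue                      # plain split horizon: leave it out
        update[dest] = cost
    return update


if __name__ == "__main__":
    b_table: Table = {
        "A": (1, "A"), "C": (1, "C"), "D": (2, "C"),
        "E": (2, "A"), "F": (2, "A"), "G": (3, "A"),
    }
    print(build_update(b_table, "A"))
    print(build_update(b_table, "A", poison_reverse=True))
```

With plain split horizon B tells A only about C and D. With poison reverse the update carries all six destinations, four of them at infinity, which is the larger advertisement the slide warns about.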