Routing

Outline
  Algorithms
  Scalability

Overview
• Forwarding vs Routing
  – forwarding: select an output port based on the destination address and the routing table
  – routing: the process by which the routing table is built
• Network as a Graph
  [Figure: example graph with nodes A through F and weighted edges]
• Problem: Find the lowest-cost path between two nodes
• Factors
  – static: topology
  – dynamic: load

Distance Vector
• Each node maintains a set of triples
  – (Destination, Cost, NextHop)
• Directly connected neighbors exchange updates
  – periodically (on the order of several seconds)
  – whenever the table changes (called a triggered update)
• Each update is a list of pairs:
  – (Destination, Cost)
• Update the local table on receiving a “better” route
  – smaller cost
  – came from the current next hop
• Refresh existing routes; delete them if they time out

Example
  [Figure: example network with nodes A through G]

  Routing Table at B

  Destination   Cost   NextHop
  A             1      A
  C             1      C
  D             2      C
  E             2      A
  F             2      A
  G             3      A

Routing Loops
• Example 1
  – F detects that its link to G has failed
  – F sets its distance to G to infinity and sends an update to A
  – A sets its distance to G to infinity, since it uses F to reach G
  – A receives a periodic update from C with a 2-hop path to G
  – A sets its distance to G to 3 and sends an update to F
  – F decides it can reach G in 4 hops via A
• Example 2
  – the link from A to E fails
  – A advertises a distance of infinity to E
  – B and C advertise a distance of 2 to E
  – B decides it can reach E in 3 hops; advertises this to A
  – A decides it can reach E in 4 hops; advertises this to C
  – C decides that it can reach E in 5 hops…

Loop-Breaking Heuristics
• Set infinity to 16
  – a loop cannot persist forever
• Split horizon
  – B does not send update (D, h, A) to A, since it learned the route from A
  – prevents two-node loops
• Split horizon with poison reverse
  – B sends update (D, inf., A) to A, since it learned the route from A
  – prevents two-node loops
• These heuristics do not scale to large networks

Link State
• Strategy
  – send to all nodes (not just neighbors) information about directly connected links (not the entire routing table)
• Link State Packet (LSP)
  – id of the node that created the LSP
  – cost of the link to each directly connected neighbor
  – sequence number (SEQNO)
  – time-to-live (TTL) for this packet

Link State (cont)
• Reliable flooding
  – store the most recent LSP from each node
  – forward each LSP to all neighbors except the one that sent it
  – generate a new LSP periodically
    • increment SEQNO
  – start SEQNO at 0 on reboot
  – decrement the TTL of each stored LSP
    • discard when TTL = 0

Route Calculation
• Dijkstra’s shortest path algorithm
• Let
  – N denote the set of nodes in the graph
  – l(i, j) denote the non-negative cost (weight) of edge (i, j)
  – s denote this node
  – M denote the set of nodes incorporated so far (labeled set)
  – C(n) denote the cost of the path from s to node n

  M = {s}
  for each n in N - {s}
      C(n) = l(s, n)
  while (N != M)
      M = M union {w} such that C(w) is the minimum for all w in (N - M)
      for each n in (N - M)
          C(n) = MIN(C(n), C(w) + l(w, n))

Metrics
• Original ARPANET metric
  – measured the number of packets queued on each link
  – took neither latency nor bandwidth into consideration
• New ARPANET metric
  – stamp each incoming packet with its arrival time (AT)
  – record its departure time (DT)
  – when the link-level ACK arrives, compute
      Delay = (DT - AT) + Transmit + Latency
  – on timeout, reset DT to the departure time of the retransmission
  – link cost = average delay over some time period
• Fine Tuning
  – compressed the dynamic range
  – replaced Delay with link utilization

Routing Table at Routers
  [Figure: hosts H1, H2, H3 and routers R1, R2 on three subnets:
   128.96.34.0 (mask 255.255.255.128), 128.96.34.128 (mask 255.255.255.128),
   and 128.96.33.0 (mask 255.255.255.0). Addresses shown: 128.96.34.15,
   128.96.34.1, 128.96.34.130, 128.96.34.139, 128.96.34.129, 128.96.33.1,
   128.96.33.14]

  Forwarding table at router R1

  Subnet Number    Subnet Mask        Next Hop
  128.96.34.0      255.255.255.128    interface 0
  128.96.34.128    255.255.255.128    interface 1
  128.96.33.0      255.255.255.0      R2

Forwarding Algorithm

  D = destination IP address
  for each entry (SubnetNum, SubnetMask, NextHop)
      D1 = SubnetMask & D
      if D1 = SubnetNum
          if NextHop is an interface
              deliver datagram directly to D
          else
              deliver datagram to NextHop

• Use a default router if nothing matches
• The 1s in a subnet mask need not be contiguous
• Multiple subnets can share one physical network
• Subnets are not visible from the rest of the Internet

Internet Structure: Recent Past
  [Figure: NSFNET backbone connecting regional networks (BARRNET, Westnet,
   MidNet, ...) that serve sites including Stanford, Berkeley, PARC, NCAR,
   UNM, UA, UNL, KU, and ISU]

Internet Structure: Today
  [Figure: backbone service providers joined at peering points, serving
   “consumer” ISPs, large corporations, and small corporations]

How to Make Routing Scale
• Still Too Many Networks
  – routing tables do not scale
  – route propagation protocols do not scale

CIDR: Classless Inter-Domain Routing
• CIDR (RFC 1519) assigns variable-sized address blocks, without regard to classes, to ease the IPv4 address shortage.
  – An IP address is accompanied by a network mask that marks the boundary between the network part and the host part. Usually written as 128.131.0.0/22 (first IP address + number of bits in the network part).
• Longest prefix match and address aggregation make routing scalable.

Longest Prefix Match and Address Aggregation

  Entry   Address (binary)                      Prefix   Host bits
  A       11000010 00011000 00000000 00000000   /21      11
  B       11000010 00011000 00001000 00000000   /22      10
  C       11000010 00011000 00001100 00000000   /22      10
  D       11000010 00011000 00010000 00000000   /20      12

• If a packet arrives with destination address 11000010 00011000 00010001 00000100 (194.24.17.4), the only entry that produces a match is D.
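
The lookup above can be sketched in Python. This is a minimal illustration rather than how a real router does it: it assumes the standard-library `ipaddress` module, writes entries A to D in their dotted-decimal CIDR form, and uses the illustrative names `TABLE` and `longest_prefix_match`, which do not come from the slides. The membership test `addr in net` performs the same mask-and-compare step as the forwarding algorithm (D1 = SubnetMask & D, then compare with SubnetNum).

```python
import ipaddress

# Entries A-D from the slide, written as dotted-decimal CIDR prefixes.
TABLE = {
    "A": ipaddress.ip_network("194.24.0.0/21"),
    "B": ipaddress.ip_network("194.24.8.0/22"),
    "C": ipaddress.ip_network("194.24.12.0/22"),
    "D": ipaddress.ip_network("194.24.16.0/20"),
}


def longest_prefix_match(dest, table):
    """Return the name of the matching entry with the longest prefix, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for name, net in table.items():
        # "addr in net" masks addr with the entry's netmask and compares
        # the result with the entry's network number.
        if addr in net and (best is None or net.prefixlen > table[best].prefixlen):
            best = name
    return best


# Destination from the slide: only D matches.
match_slide = longest_prefix_match("194.24.17.4", TABLE)

# Hypothetical extra entry (an assumption, not on the slide) that overlaps D,
# added only to show that the longer prefix wins when two entries match.
overlap = dict(TABLE, WIDE=ipaddress.ip_network("194.24.0.0/19"))
match_overlap = longest_prefix_match("194.24.17.4", overlap)

print(match_slide, match_overlap)
```

A linear scan is used here for clarity; production routers typically rely on tries or hardware lookup tables instead.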
• The above 4 entries can be further aggregated into 1 if the router has the same next hop for all 4 destinations, in the form 194.24.0.0/19, or 11000010 00011000 00000000 00000000 /19.

How Routing Works in the Internet
• Know a smarter router
  – hosts know the local router (default router)
  – local routers know site routers
  – site routers know the core router
  – core routers know everything
• Autonomous System (AS)
  – corresponds to an administrative domain
  – examples: university, company, backbone network
  – each AS is assigned a 16-bit number
• Two-level route propagation hierarchy
  – interior gateway protocol (each AS selects its own)
  – exterior gateway protocol (Internet-wide standard)

Popular Interior Gateway Protocols
• RIP: Routing Information Protocol
  – developed for XNS
  – distributed with Unix
  – distance-vector algorithm
  – based on hop count
• OSPF: Open Shortest Path First
  – recent Internet standard
  – uses a link-state algorithm
  – supports load balancing
  – supports authentication

EGP: Exterior Gateway Protocol
• Overview
  – designed for a tree-structured Internet
  – concerned with reachability, not optimal routes
• Protocol messages
  – neighbor acquisition: one router requests that another be its peer; peers exchange reachability information
  – neighbor reachability: one router periodically tests whether the other is still reachable; they exchange HELLO/ACK messages; uses a k-out-of-n rule
  – routing updates: peers periodically exchange their routing tables (distance-vector)

BGP-4: Border Gateway Protocol
• AS Types
  – stub AS: has a single connection to one other AS
    • carries local traffic only
  – multihomed AS: has connections to more than one AS
    • refuses to carry transit traffic
  – transit AS: has connections to more than one AS
    • carries both transit and local traffic
• Each AS has:
  – one or more border routers
  – one BGP speaker that advertises:
    • local networks
    • other reachable networks (transit AS only)
    • path information

BGP Example
  [Figure: Backbone network (AS 1); Regional provider A (AS 2); Regional provider B (AS 3);
   Customer P (AS 4): 128.96, 192.4.153; Customer Q (AS 5): 192.4.32, 192.4.3;
   Customer R (AS 6): 192.12.69; Customer S (AS 7): 192.4.54, 192.4.23]
• Speaker for AS2 advertises reachability to P and Q
  – networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached directly from AS2
• Speaker for the backbone advertises
  – networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached along the path (AS1, AS2)
• A speaker can cancel previously advertised paths

IP Version 6
• Features:
  – Addresses are 16 bytes long (IPv4 addresses are 4 bytes).
  – The header is simplified, with only 8 fields (IPv4 has 13).
  – Rarely used features are moved into optional extension headers, which are easier to process.
  – Better support for security.
  – Better support for QoS.
• Header
  – 40-byte “base” header
  – extension headers (fixed order, mostly fixed length)
    • fragmentation
    • source routing
    • authentication and security
    • other options

The Main IPv6 Header
• The version field is always 6.
• The traffic class field indicates the QoS treatment required for the packet.
• The flow label field provides a mechanism to implement a virtual circuit, which is uniquely identified by the tuple (source address, destination address, flow label). A virtual circuit makes providing QoS easier.
• The payload length field gives the number of bytes in the packet, excluding the 40-byte fixed header.
• The next header field indicates which of the six extension headers (options) follows the fixed header; if there is none, it indicates the upper-layer protocol, e.g. TCP or UDP, to which the data is passed.
• The hop limit field gives the maximum number of hops the packet may traverse; it prevents a packet from looping forever, similar to TTL in IPv4.
• By contrast, the IPv4 header also has fragmentation fields (an extension header in IPv6), a checksum, HLEN, etc.

IPv6 Address
• The address fields use 16-byte IPv6 addresses. The number of possible IPv6 addresses is 2^128, or about 3.4 × 10^38. A new notation is used: an address is written as eight groups of four hexadecimal digits, with colons between the groups, like this:
    8000:0000:0000:0000:0123:4567:89AB:CDEF
• Since many zeros can appear in an address, three optimizations are made
  – Leading zeros within a group can be omitted, so 0123 becomes 123.
  – One run of one or more all-zero groups can be replaced by a pair of colons, so the above address becomes 8000::123:4567:89AB:CDEF.
  – IPv4 addresses can be written as a pair of colons followed by the old dotted decimal notation, e.g., ::192.31.20.46.
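
The three notation rules can be checked with Python's standard-library `ipaddress` module. This is a small sketch, not part of the original slides; the helper names `shorten` and `expand` are illustrative assumptions. Note that Python prints hexadecimal digits in lowercase, so the output differs in case from the slide's example.

```python
import ipaddress


def shorten(text):
    """Drop leading zeros in each group and replace one run of zero groups with '::'."""
    return ipaddress.IPv6Address(text).compressed


def expand(text):
    """Write all eight groups in full, four hexadecimal digits each."""
    return ipaddress.IPv6Address(text).exploded


full = "8000:0000:0000:0000:0123:4567:89AB:CDEF"
short = shorten(full)                            # '8000::123:4567:89ab:cdef'
restored = expand("8000::123:4567:89AB:CDEF")    # back to eight full groups

# Rule 3: the mixed form embeds an IPv4 address in the low 32 bits.
mixed = ipaddress.IPv6Address("::192.31.20.46")
same_bits = int(mixed) == int(ipaddress.IPv4Address("192.31.20.46"))

print(short)
print(restored)
print(same_bits)
```

The module accepts the mixed `::192.31.20.46` form on input, but its own compressed output always uses pure hexadecimal groups.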