Chapter 4: Network Layer
• 4.1 Introduction
• 4.2 Virtual circuit and datagram networks
• 4.3 What’s inside a router
• 4.4 IP: Internet Protocol
    • Datagram format
    • IPv4 addressing
    • ICMP
    • IPv6
• 4.5 Routing algorithms
    • Link state
    • Distance Vector
    • Hierarchical routing
• 4.6 Routing in the Internet
    • RIP
    • OSPF
    • BGP
• 4.7 Broadcast and multicast routing

Hierarchical OSPF
Perhaps some routers don’t need to know about every link.
[Figure: OSPF areas and backbone, with routers A, C, E, G, H]
• two-level hierarchy: local area, backbone.
    • Link-state advertisements only within the area
    • each node has detailed knowledge of its area topology
• area border routers (ABRs): “summarize” distances to nets in own area, advertise to other area border routers.
• backbone routers: run OSPF routing limited to backbone.
• boundary routers: connect to other ASs.

In the figure:
• ABR C announces link C<->A to Area 1
• ABR C announces link C<->E to Backbone
• ABR C announces a summary of Area 1 to the Backbone
• ABR C announces a summary of the Backbone and other areas to Area 1
• C learns about other areas from the other ABR
• ……

Area Border Router Summaries
Should the summaries include reachability information or path metrics?
[Figure: routers C, G, H]
• Routers in Area 1 do not need to know about the paths used to reach destinations in other areas
• They only need to know that they can be reached.
• In this case, reachability information is sufficient to compute optimal routes
• i.e., the ABR only announces which destinations it can reach.
• However, no one would make a topology as shown in the figure
• Why?
• If a single key link breaks or a router crashes, the network would be partitioned (and the network designer would be fired)

Area Border Router Summaries
e.g., if summaries only include reachability information
[Figure: area border routers C, F, G and routers A, B, D, E]
• ABR C announces to Area 1 that it can reach Area 2 in 1 hop (and includes a list of destinations in Area 2)
• ABR F announces to Area 1 that it can reach Area 2 in 0 hops
• Router A determines the path to D as follows
    • The path to Area 2 via F is 2 hops (2 to reach F and then 0 more to Area 2)
    • The path to Area 2 via C is 2 hops (1 to C and then 1 more to Area 2)
    • Either path appears equally good for reaching D
    • However, the path via F is better. A does not have sufficient information to determine this.

Area Border Router Summaries
[Figure: area border routers C, F, G and routers A, B, D, E]
• ABR G tells all routers in the Backbone that it can reach D in 2 hops.
• ABR F tells all routers in the Backbone that it can reach D in 1 hop
• ABR C tells all routers in Area 1 that it can reach D in 3 hops
• ABR F tells all routers in Area 1 that it can reach D in 1 hop
• A decides B is the best next hop toward D
• In this case, reachability information is not enough to compute optimal routes.
• Therefore, ABRs provide distance-vector-type information, i.e., which destinations can be reached and the cost to reach them
• Notice that C does not announce the link CG to Area 1.
• Notice that C gets a summary from G, which is distances to destinations, like distance vector.
• C uses the distances from G to determine its distances.
• C announces these distances to Area 1
• This is like a one hop distance vector protocol

Area Border Router Summaries
[Figure: Areas 1, 2 and 3 joined by a backbone of area border routers C, F, G, with routers A, B, D, E. Each ABR advertises a summary such as “A in 1 hop, B in 2 hops, …”, “A in 2 hops, B in 3 hops, …” or “A in 4 hops, B in 5 hops, …”]
The backbone is completely connected because each router essentially sends distance vector updates directly to its neighbor
• This is like a one hop distance vector protocol
• Convergence time: 1
• Loops are not possible

Recall: Subnets
[Figure: three routers and their subnets, with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]

IP addressing: CIDR
CIDR: Classless InterDomain Routing
• subnet portion of address of arbitrary length
• address format: a.b.c.d/x, where x is # bits in subnet portion of address

    11001000 00010111 00010000 00000000    200.23.16.0/23
    (first 23 bits: subnet part, or CIDR block; remaining 9 bits: host part)

IP addresses: how to get one?
Q: How does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space

    ISP's block      11001000 00010111 00010000 00000000    200.23.16.0/20
    Organization 0   11001000 00010111 00010000 00000000    200.23.16.0/23
    Organization 1   11001000 00010111 00010010 00000000    200.23.18.0/23
    Organization 2   11001000 00010111 00010100 00000000    200.23.20.0/23
    ...
    Organization 7   11001000 00010111 00011110 00000000    200.23.30.0/23

Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) attach to ISP1; ISP1 and ISP2 attach to a border router that leads to the Internet]
• ISP1: “Send me anything with addresses beginning 200.23.16.0/20”
• ISP2: “Send me anything with addresses beginning 199.31.0.0/16”
This way, the whole 32 bit address does not need to be examined

Hierarchical addressing: more specific routes
ISP2 has a more specific route to Organization 1
[Figure: as above, but Organization 1 (200.23.18.0/23) is now attached to ISP2]
• ISP1: “Send me anything with addresses beginning 200.23.16.0/20”
• ISP2: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”

Longest prefix matching
Border Router Forwarding Table

    Prefix Match       Link Interface
    200.23.16.0/20     0
    200.23.18.0/23     1
    199.31.0.0/16      1
    otherwise          2

If a packet with destination address 200.23.18.12 arrives at the border router, is it forwarded to interface 0 or 1?
Both 200.23.16.0/20 and 200.23.18.0/23 match, but 200.23.18.0/23 is the longer match, so the packet goes to interface 1.

A Problem with Longest Match and subnetting
In order to improve reliability, Organization 7 has a backup link with ISP2. This way, if ISP1 has problems or ISP1’s provider has problems, then Organization 7 is still reachable. Will this work?
[Figure: Organizations 0, 1, 2, ..., 7 attached to ISP1 as before, with Organization 7 (200.23.30.0/23) also linked to ISP2; the advertisements of ISP1 and ISP2 (“Send me anything with addresses beginning ……”) are left blank for the reader to fill in]

Hierarchical Routing
Our routing study thus far has been an idealization
• all routers identical
• network “flat”
… not true in practice

scale: with 200 million destinations:
• can’t store all dest’s in routing tables!
    • Memory for address table must be very fast
        • How fast? How long can an address lookup take on a 10 Gbit/s interface?
        • E.g., for a 64 B packet: 64 B × 8 bit/B ÷ 10^10 bit/s ≈ 50 ns
• routing table exchange would swamp links!
    • There are ~1 million links
    • If link state was flooded every 30 minutes and each link state is 20 B, then each router receives and processes roughly 100 kbps in link announcements (10^6 × 20 B × 8 bit/B ÷ 1800 s ≈ 89 kbps)
    • But, perhaps, only changes in link state could be distributed.

administrative autonomy
• internet = network of networks
• each network admin wants to control routing in its own network
    • ATT does not want Sprint to know what their topology is
        • Trade secret
        • Improves security
    • ATT wants to select a routing protocol and parameters without getting Sprint’s permission

Hierarchical Routing
• aggregate routers into regions, “autonomous systems” (AS)
• Single administrative domain
• Routers in the same AS run same routing protocol
    • “intra-AS” routing protocol
    • routers in different AS can run different intra-AS routing protocol
• An ISP may be made of 1 or more ASs
    • ATT-USA = 1 AS and ATT-Europe is another
    • Some stub networks are an AS
        • UD is an AS
        • Some companies have routers but are not ASs
• ASs have their own number, assigned by ICANN
• There are ~50K ASs

Gateway router
• Direct link to router in another AS
• Gateway routers run a common inter-networking routing protocol

Simple example
Connections to other ASs and the rest of the Internet
• AS2 (router E) is the service provider of AS1 (e.g., AS1=UD and AS2=cogent) and connects to the rest of the Internet
• Recall that ASs (ISPs) sometimes meet at NAPs (e.g., google: MAE-East). An AS could also meet its provider at a POP.
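The address lookup whose time budget is discussed under "scale" above is a longest-prefix match against the forwarding table. The following is a minimal Python sketch of that rule, using the border router table from the longest prefix matching slide. Encoding the "otherwise" row as 0.0.0.0/0, the function name `lookup`, and the linear scan are assumptions of this sketch; a real router would use a trie or dedicated hardware to meet a budget of tens of nanoseconds.

```python
import ipaddress

# Border router table from the longest-prefix-matching slide.
# Encoding the "otherwise" row as 0.0.0.0/0 is an assumption of this sketch.
FORWARDING_TABLE = [
    (ipaddress.ip_network("200.23.16.0/20"), 0),
    (ipaddress.ip_network("200.23.18.0/23"), 1),
    (ipaddress.ip_network("199.31.0.0/16"), 1),
    (ipaddress.ip_network("0.0.0.0/0"), 2),
]


def lookup(dest, table=FORWARDING_TABLE):
    """Return the interface of the longest prefix that contains dest."""
    addr = ipaddress.ip_address(dest)
    # Collect every entry whose prefix covers the destination address.
    matches = [(net.prefixlen, iface) for net, iface in table if addr in net]
    if not matches:
        raise LookupError(f"no route to {dest}")
    # Among the matching entries, the longest prefix wins.
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    for dest in ("200.23.18.12", "200.23.17.1", "199.31.200.9", "10.0.0.1"):
        print(dest, "-> interface", lookup(dest))
```

For 200.23.18.12 both the /20 and the /23 entries match, and the /23 entry decides the interface, as on the slide.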
[Figure: AS1, a stub network (at the edge of the network), with routers A, B, C and subnets 10.1.1.0/24, 10.1.2.0/24, 10.1.4.0/22; router interfaces are numbered 1 to 4]
These tables are made with RIP, OSPF, ISIS, etc.

    Forwarding tables in AS1 (one per router)
    Interface  Prefix           Interface  Prefix           Interface  Prefix
    4          10.1.1.0/24      3          10.1.1.0/24      3          10.1.1.0/24
    3          10.1.2.0/24      3          10.1.2.0/24      3          10.1.2.0/24
    2          10.1.4.0/22      2          10.1.4.0/22      2          10.1.4.0/22

Q: How can routers in AS1 know where to send pkts with destination not in AS1?
A: Easy, if a pkt is for an “unknown” address, then send it to B.
Specifically, B advertises a link to prefix 0.0.0.0/0
This is called a default route, and it can be statically set (no need for any routing protocol besides OSPF)

[Figure: the same stub network AS1; AS2 (router E) is the service provider of AS1 (e.g., AS1=UD and AS2=cogent) and leads to the rest of the Internet]
These tables are made with RIP, OSPF, ISIS, etc.

    Forwarding tables in AS1 with the default route added
    Interface  Prefix           Interface  Prefix           Interface  Prefix
    4          10.1.1.0/24      3          10.1.1.0/24      3          10.1.1.0/24
    3          10.1.2.0/24      3          10.1.2.0/24      3          10.1.2.0/24
    2          10.1.4.0/22      2          10.1.4.0/22      2          10.1.4.0/22
    1          0.0.0.0/0        1          0.0.0.0/0        1          0.0.0.0/0

We need to put prefixes 1.1.0.0/16, 1.2.0.0/16, 2.2.0.0/16 in the forwarding tables
How to get there?
1. B must learn from E that 1.1.0.0/16 and 1.2.0.0/16 are reachable through E
2. A must learn that 2.2.0.0/16 is reachable through D
3. B and A must distribute this information throughout AS1
Steps 1 and 2 need an exterior inter-networking routing protocol
Step 3 needs an interior inter-networking routing protocol
EBGP and IBGP (the border gateway routing protocol) can accomplish this
[Figure: AS1 (routers A, B, C, with the intra-AS forwarding tables shown earlier) connects through B to router E in AS2, which reaches 1.1.0.0/16, 1.2.0.0/16 and the rest of the Internet, and through A to router D in AS3, which holds 2.2.0.0/16]

Interconnected ASes
[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (routers 2a, 2b, 2c) and AS3 (routers 3a, 3b, 3c); inside each router, the intra-AS and inter-AS routing algorithms feed the forwarding table]
• forwarding table configured by both intra- and inter-AS routing algorithm
    • intra-AS sets entries for internal dests
    • inter-AS & intra-AS sets entries for external dests

Example: Setting forwarding table in router 1d
• suppose AS1 learns (via inter-AS protocol) that subnet x is reachable via AS3 (gateway 1c) but not via AS2.
• inter-AS protocol propagates reachability info to all internal routers.
• router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c.
    • installs forwarding table entry (x,I)
• Alternatively, 1d has two table entries
    • One entry says x is reachable via 1c (determined by IBGP)
    • A second entry says which is the next hop to reach 1c (determined by intra-AS routing protocol)
[Figure: subnet x attached to AS3; AS1, AS2 and AS3 as above]

Example: Choosing among multiple ASes
• now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2.
• to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
    • this is also job of inter-AS routing protocol!
    • If both gateways are equivalent, then the intra-AS routing protocol will route packets to the best gateway
        • This is called hot potato routing: send packet towards closest of two routers.
[Figure: subnet x reachable through both AS3 and AS2; AS1, AS2 and AS3 as above]

Hot Potato Routing
[Figure: AS1 holds 128.4.0.0/16 and connects to AS2 at gateways A and B; a pkt arrives in AS2 with dest in 128.4.0.0/16]
• AS2 could send the pkt to gateway B: hot potato routing.
• But AS1 would prefer AS2 to carry its own traffic.
• So AS1 might require that AS2 gives higher priority to gateway A.
• But how can AS1 force AS2 to do this?

Example: Choosing among multiple ASes
Steps for setting the forwarding table entry for subnet x:
1. Learn from inter-AS protocol that subnet x is reachable via multiple gateways
2. Use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways
3. Hot potato routing: choose the gateway that has the least cost
4. Determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table

Internet inter-AS routing: BGP
• BGP (Border Gateway Protocol): the de facto standard
• BGP provides each AS a means to:
    1. Obtain subnet reachability information from neighboring ASs.
    2. Propagate reachability information to all AS-internal routers.
    3. Determine “good” routes to subnets based on reachability information and policy.
• allows subnet to advertise its existence to rest of Internet: “I am here”

BGP basics
• pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions
    • BGP sessions need not correspond to physical links.
• when AS2 advertises a prefix to AS1:
    • AS2 promises it will forward datagrams towards that prefix.
    • AS2 can aggregate prefixes in its advertisement
        • But this can cause problems when some prefixes have backup links
[Figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c) and AS3 (3a, 3b, 3c), with eBGP sessions between routers in different ASs and iBGP sessions between routers inside AS1]

Distributing reachability info
• using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
    • 1c can then use iBGP to distribute new prefix info to all routers in AS1
    • 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session
• when router learns of new prefix, it creates entry for prefix in its forwarding table.

Aggregation Problem
[Figure: several ISPs, including W, X and Y, exchanging advertisements for 1.1.1.0/24, 1.1.2.0/24, 1.1.3.0/24, 1.1.4.0/24 and the aggregate 1.1.0.0/22]
From ISP W, the next hop to 1.1.4.0/24 is X; it should be Y

Path attributes & BGP routes
• advertised prefix includes BGP attributes.
    • prefix + attributes = “route”
• two important attributes:
    • AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17, …
    • NEXT-HOP: indicates specific internal-AS router to next-hop AS. (There may be multiple routers with links from current AS to next-hop AS. Each router can advertise the path.)
• when gateway router receives route advertisement, it uses import policy to accept/decline.

BGP route selection
• router may learn about more than 1 route to some prefix. Router must select route.
• elimination rules:
    1. local preference value attribute: policy decision
    2. shortest AS-PATH
    3. closest NEXT-HOP router: hot potato routing
    4. additional criteria

BGP messages
• BGP messages exchanged using TCP.
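The elimination rules on the BGP route selection slide above can be read as a lexicographic comparison. The following is a minimal Python sketch under stated assumptions: the `Route` class, its field names and the example values are hypothetical, a higher local preference is taken as preferred (the usual BGP convention), "closest NEXT-HOP" is modelled as the intra-AS cost to the next-hop router, and the "additional criteria" of rule 4 are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate route: prefix plus the attributes used for selection."""
    prefix: str
    local_pref: int   # rule 1: policy value, higher is preferred
    as_path: tuple    # rule 2: ASs the advertisement has passed through
    igp_cost: int     # rule 3: intra-AS cost to the NEXT-HOP router


def select_best(routes):
    """Apply rules 1-3 in order; later rules only break ties of earlier ones."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost),
    )


if __name__ == "__main__":
    # Hypothetical candidates for one prefix.
    candidates = [
        Route("128.4.0.0/16", 100, (1238, 174, 34), 5),
        Route("128.4.0.0/16", 100, (174, 34), 9),
        Route("128.4.0.0/16", 100, (3356, 34), 2),
    ]
    print(select_best(candidates))
```

With equal local preference, the two routes with the shorter AS-PATH survive rule 2, and the one with the cheaper path to its next hop wins under rule 3, which is the hot potato choice.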
• BGP messages:
    • OPEN: opens TCP connection to peer and authenticates sender
    • UPDATE: advertises new path (or withdraws old)
    • KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request
    • NOTIFICATION: reports errors in previous msg; also used to close connection
• TCP reset security risk

BGP routing policy
[Figure: provider networks A, B, C and customer networks W, X, Y]
• A, B, C are provider networks
• X, W, Y are customers (of provider networks)
• X is dual-homed: attached to two networks
    • X does not want to route from B via X to C
    • .. so X will not advertise to B a route to C

BGP routing policy (2)
[Figure: provider networks A, B, C and customer networks W, X, Y]
• A advertises path AW to B
• B advertises path BAW to X
• Should B advertise path BAW to C?
    • No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
    • B wants to force C to route to W via A
    • B wants to route only to/from its customers!

BGP route processing
• BGP advertises and withdraws paths with the UPDATE message
• UPDATE has three fields
    • Routes to withdraw
    • Attributes of routes to prefixes in NLRI
    • NLRI
• The NLRI is a list of prefixes that the list of attributes applies to.
• If two prefixes have different attributes, then these two prefixes need to be announced with different UPDATE messages.
In OSPF each path is a list of routes and a total cost (two attributes).
In BGP, routes have many attributes; the cost (in AS hops) is only one of them.
[Figure: routes from peers pass through the input policy engine, then the decision process (which fills the routing table), then the output policy engine, and are sent to peers; each stage is driven by configuration]

RIBs
• Routing information base (RIB): a list of routes (including attributes)
    • Adj-RIB-In: RIB learned from neighbor (many of these)
    • Adj-RIB-Out: RIB to be sent to neighbor (many of these)
    • Loc-RIB: RIB for local use (only one of these)
[Figure: each peer has an Adj-RIB-In and an Adj-RIB-Out; the Adj-RIB-Ins feed the input policy engine and the BGP decision process, which maintains the single Loc-RIB]

Sample routing environment
Routes received:
• from AS1: 192.213.1.0/24, 0/0
• from AS2: 192.213.1.0/24, 193.214.10.0/24, 0/0
Input policy engine:
• Deny 0/0 from AS1
• Give 192.213.1.0/24 from AS1 better preference
• Accept other routes
Decision process (routes used):
• Use 0/0 from AS2
• Use 192.213.1.0/24 from AS1
• Use 193.214.10.0/24 from AS2
• Use 172.16.10.0/24 from AS5 (this AS)
Output policy engine:
• Do not propagate 0/0
• Do not send 193.214.10.0/24 to AS4
• Give 192.213.1.0/24 with metric = 10 to AS3
Routes sent to AS3:
• 193.214.10.0/24 path=(AS5, AS2)
• 192.213.1.0/24 path=(AS5, AS1) metric=10
• 172.16.10.0/24 path=(AS5)
Routes sent to AS4:
• 172.16.10.0/24 path=(AS5)
• 192.213.1.0/24 path=(AS5, AS1)

Fun with BGP
• Routeviews.org collects and archives BGP announcements
• One way to use routeviews is with dig
    • At the linux prompt: dig txt 4.128.aspath.routeviews.org
    • Outputs various stuff and
        • Answer section: 4.128.aspath.routeviews.org 600 IN TXT "5056 1238 174 34" "128.4.0.0" "16"
        • Syntax = "AS path" "prefix" "prefix length"
• Now use whois -h whois.arin.net "a ASXX" to learn about ASs, where XX is an AS number.
  E.g., whois -h whois.arin.net "a AS34" gives information about AS34
• Try with some other AS

Check out a collection of path announcements
• Open bgp030408p39.Partial
    • http://www.eecis.udel.edu/~bohacek/Classes/ELEG651Spring2008/bgp030508p39.Partial
    • An old (2003) partial list of BGP announcements received by several routers
• Check which ASs peer with UD (ASN 34)

Why different Intra- and Inter-AS routing?
Policy:
• Inter-AS: admin wants control over how its traffic is routed and who routes through its net.
• Intra-AS: single admin, so no policy decisions needed
Scale:
• hierarchical routing saves table size and reduces update traffic
Performance:
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance
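Returning to the Routeviews exercise above: the TXT answer has the form "AS path" "prefix" "prefix length", so it is easy to pull apart by script. The following is a minimal Python sketch; the function name `parse_aspath_txt` is hypothetical, the input is the quoted TXT data copied from the dig answer section, and the sketch assumes a plain space-separated AS path with no AS sets.

```python
import shlex


def parse_aspath_txt(txt_data):
    """Split a Routeviews aspath TXT answer into (AS path, CIDR prefix)."""
    # shlex.split honours the double quotes, yielding exactly three fields.
    as_path_field, prefix, length = shlex.split(txt_data)
    as_path = [int(asn) for asn in as_path_field.split()]
    return as_path, f"{prefix}/{length}"


if __name__ == "__main__":
    # TXT data from the slide's example answer for 128.4.0.0/16.
    as_path, prefix = parse_aspath_txt('"5056 1238 174 34" "128.4.0.0" "16"')
    print("prefix:", prefix)
    print("AS path:", as_path)
    # The last AS in the path originated the prefix (AS34 is UD).
    print("origin AS:", as_path[-1])
```

The second-to-last entry of each path is a neighbor of the origin AS, which is one way to approach the "which ASs peer with UD" question on a list of announcements.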