Middle Technical University
Electrical Engineering Technical College
Computer Engineering Techniques Department

Presented by Karrar Shakir Muttair ALNomani
Email: karraralnomani123@gmail.com

Outline
4.1 Introduction
4.2 Virtual circuit and datagram networks
4.3 What's inside a router
4.4 IP: Internet Protocol
    • datagram format
    • IPv4 addressing
    • ICMP
    • IPv6
4.5 Routing algorithms
    • link state
    • distance vector
    • hierarchical routing
4.6 Routing on the Internet
    • RIP
    • OSPF
    • BGP
4.7 Broadcast and multicast routing

Introduction
• Transport segment from sending to receiving host.
• On the sending side, encapsulates segments into datagrams.
• On the receiving side, delivers segments to the transport layer.
• Network layer protocols run in every host and router.
• A router examines header fields in all IP datagrams passing through it.
[Figure: protocol stack (application, transport, network, data link, physical)]

Two key network-layer functions
• Forwarding: move packets from a router's input to the appropriate router output.
• Routing: determine the route taken by packets from source to destination.
Analogy (taking a trip):
• Forwarding: process of getting through a single interchange.
• Routing: process of planning the trip from source to destination.

Interplay between routing and forwarding
• Routing algorithm determines the end-to-end path through the network.
• Forwarding table determines local forwarding at this router.

Local forwarding table:
Header value   Output link
0100           3
0101           2
0111           2
1001           1

[Figure: a packet arrives with header value 0111; the router's output links are numbered 1, 2, 3]

Connection, connection-less service
• Datagram network provides network-layer connectionless service.
• Virtual-circuit network provides network-layer connection service.
• Analogous to TCP/UDP connection-oriented / connectionless transport-layer services, but:
    • Service: host-to-host
    • No choice: network provides one or the other
    • Implementation: in network core

Virtual circuits
"Source-to-destination path behaves much like a telephone circuit"
    • performance-wise
    • network actions along source-to-destination path
• Call setup and teardown for each call before data can flow.
• Each packet carries a VC identifier (not the destination host address).
• Every router on the source-destination path maintains "state" for each passing connection.
• Link and router resources (bandwidth, buffers) may be allocated to a VC (dedicated resources = predictable service).

VC implementation
A VC consists of:
1. Path from source to destination.
2. VC numbers, one number for each link along the path.
3. Entries in forwarding tables in routers along the path.
• A packet belonging to a VC carries the VC number (rather than the destination address).
• The VC number can be changed on each link.
• The new VC number comes from the forwarding table.

VC forwarding table
[Figure: path through the northwest router; VC numbers 12, 22, 32 label the links and 1, 2, 3 label the router's interfaces]

Forwarding table in northwest router:
Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

Virtual circuits: signaling protocols
• Used to set up, maintain, and tear down a VC.
• Used in ATM, frame-relay, X.25.
• Not used in today's Internet.
[Figure: two hosts, each with the full protocol stack, exchanging signaling messages]
1. initiate call
2. incoming call
3. accept call
4. call connected
5. data flow begins
6. receive data

Datagram networks
• No call setup at the network layer.
• Routers: no state about end-to-end connections.
• No network-level concept of "connection".
• Packets are forwarded using the destination host address.
[Figure: two hosts, each with the full protocol stack]
1. send datagrams
2. receive datagrams

Datagram forwarding table
• The routing algorithm fills in the local forwarding table; the router looks up the IP destination address in the arriving packet's header.
• There are 4 billion IP addresses, so rather than list individual destination addresses, the table lists ranges of addresses (aggregate table entries).

Dest address      Output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

Datagram forwarding table
Destination Address Range                          Link Interface
11001000 00010111 00010000 00000000
  through 11001000 00010111 00010111 11111111      0
11001000 00010111 00011000 00000000
  through 11001000 00010111 00011000 11111111      1
11001000 00010111 00011001 00000000
  through 11001000 00010111 00011111 11111111      2
otherwise                                          3

Longest prefix matching
When looking for a forwarding table entry for a given destination address, use the longest address prefix that matches the destination address.

Destination Address Range              Link interface
11001000 00010111 00010*** ********    0
11001000 00010111 00011000 ********    1
11001000 00010111 00011*** ********    2
otherwise                              3

Examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?

Router architecture overview
Two key router functions:
• Run routing algorithms/protocols (RIP, OSPF, BGP).
• Forward datagrams from incoming to outgoing link.
[Figure: routing processor, router input ports, high-speed switching fabric, router output ports]
• Routing, management: control plane (software).
• Forwarding: data plane (hardware).
• Forwarding tables are computed and pushed to the input ports.

Input port functions
[Figure: line termination, then link layer protocol (receive), then lookup, forwarding, queueing, then switch fabric]
• Physical layer: bit-level reception.
• Data link layer: e.g., Ethernet (see chapter 5).
• Decentralized switching: given the datagram destination, look up the output port using the forwarding table in input port memory ("match plus action").
• Goal: complete input port processing at "line speed".
• Queuing: occurs if datagrams arrive faster than the forwarding rate into the switch fabric.

Switching fabrics
• Transfer packet from input buffer to appropriate output buffer.
• Switching rate: rate at which packets can be transferred from inputs to outputs.
    • Often measured as a multiple of the input/output line rate.
    • N inputs: switching rate N times line rate desirable.
• Three types of switching fabrics: memory, bus, crossbar.

Switching via memory
First generation routers:
• Traditional computers with switching under direct control of the CPU.
• Packet copied to the system's memory.
• Speed limited by memory bandwidth (2 bus crossings per datagram).
[Figure: input port (e.g., Ethernet), memory, output port (e.g., Ethernet), connected by the system bus]

Switching via a bus
• Datagram moves from input port memory to output port memory via a shared bus.
• Bus contention: switching speed limited by bus bandwidth.
• 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers.

Switching via interconnection network
• Overcomes bus bandwidth limitations.
• Banyan networks, crossbar, and other interconnection nets were initially developed to connect processors in a multiprocessor.
• Advanced design: fragment the datagram into fixed-length cells and switch the cells through the fabric.
• Cisco 12000: switches 60 Gbps through the interconnection network.
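
Before a datagram enters any of these fabrics, the input port has already chosen its output port by longest prefix matching. The sketch below shows that lookup in software, using the three prefixes and the "otherwise" entry from the longest prefix matching slide. The names FORWARDING_TABLE, DEFAULT_INTERFACE and lookup, and the use of bit strings, are assumptions made for illustration; a real router performs this lookup in hardware at line speed.

```python
# Prefix table from the longest prefix matching slide.
# Each entry is (bit prefix, link interface); spaces are only for readability.
FORWARDING_TABLE = [
    ("11001000 00010111 00010".replace(" ", ""), 0),
    ("11001000 00010111 00011000".replace(" ", ""), 1),
    ("11001000 00010111 00011".replace(" ", ""), 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits: str) -> int:
    """Return the link interface for a 32-bit destination address (bit string)."""
    dest = dest_bits.replace(" ", "")
    best_len, best_iface = -1, DEFAULT_INTERFACE
    for prefix, iface in FORWARDING_TABLE:
        # Keep the matching entry with the longest prefix seen so far.
        if dest.startswith(prefix) and len(prefix) > best_len:
            best_len, best_iface = len(prefix), iface
    return best_iface


# The two example destination addresses from the slide.
first = lookup("11001000 00010111 00010110 10100001")
second = lookup("11001000 00010111 00011000 10101010")
print(first, second)
```

The second address matches both the 24-bit prefix (interface 1) and the 21-bit prefix (interface 2); the longer match wins.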
Output ports
[Figure: switch fabric, then datagram buffer (queueing), then link layer protocol (send), then line termination]
• Buffering is required when datagrams arrive from the fabric faster than the transmission rate.
• A scheduling discipline chooses among queued datagrams for transmission.

Output port queueing
[Figure: at time t, packets move from input to output through the switch fabric; the same fabric one packet time later]

The Internet network layer
Host, router network layer functions (between the transport layer: TCP, UDP above and the link and physical layers below):
• IP protocol
    • addressing conventions
    • datagram format
    • packet handling conventions
• Routing protocols
    • path selection
    • RIP, OSPF, BGP
• Forwarding table
• ICMP protocol
    • error reporting
    • router "signaling"

IP datagram format
Header layout (each row is 32 bits wide):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | header checksum
32-bit source IP address
32-bit destination IP address
options (if any)
data (variable length, typically a TCP or UDP segment)

Field notes:
• ver: IP protocol version number.
• head. len: header length (bytes).
• type of service: "type" of data.
• length: total datagram length (bytes).
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly.
• time to live: max number of remaining hops (decremented at each router).
• upper layer: upper layer protocol to deliver the payload to.
• options: e.g., timestamp, record route taken, specify list of routers to visit.

IP addressing introduction
• IP address: 32-bit identifier for a host or router interface.
• Interface: connection between host/router and physical link.
    • Routers typically have multiple interfaces.
    • A host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11).
• IP addresses are associated with each interface.
[Figure: router with three attached networks; interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
223.1.1.1 = 11011111 00000001 00000001 00000001
            (223)    (1)      (1)      (1)

Subnets
IP address:
• subnet part: high order bits
• host part: low order bits
What's a subnet?
• Device interfaces with the same subnet part of the IP address.
• They can physically reach each other without an intervening router.
• To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.
[Figure: network consisting of 3 subnets: 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24; subnet mask: /24]

IP addressing: CIDR
CIDR: Classless Inter Domain Routing
• Subnet portion of the address is of arbitrary length.
• Address format: a.b.c.d/x, where x is the number of bits in the subnet portion of the address.
Example: 200.23.16.0/23
11001000 00010111 0001000 | 0 00000000
(subnet part: first 23 bits | host part: last 9 bits)

DHCP: Dynamic Host Configuration Protocol
Goal: allow a host to dynamically obtain its IP address from a network server when it joins the network.
• Can renew its lease on an address in use.
• Allows reuse of addresses (only hold an address while connected/"on").
• Support for mobile users who want to join the network (more shortly).
DHCP overview:
• Host broadcasts "DHCP discover" msg [optional].
• DHCP server responds with "DHCP offer" msg [optional].
• Host requests IP address: "DHCP request" msg.
• DHCP server sends address: "DHCP ack" msg.
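
The address a DHCP server hands out belongs to a CIDR block such as the 200.23.16.0/23 example on the CIDR slide. The sketch below uses Python's standard ipaddress module to check what the /x notation implies for that block and for the 223.1.1.0/24 subnet. The individual host addresses tested (200.23.17.42 and 200.23.18.1) are made up for illustration.

```python
import ipaddress

# CIDR block from the slide: 23 subnet bits, 9 host bits.
block = ipaddress.ip_network("200.23.16.0/23")
print(block.prefixlen)       # number of bits in the subnet part
print(block.netmask)         # the same split written as a dotted mask
print(block.num_addresses)   # 2 ** (32 - 23) addresses in the block

# Membership test: does an address share the block's 23-bit subnet part?
inside = ipaddress.ip_address("200.23.17.42") in block    # hypothetical host
outside = ipaddress.ip_address("200.23.18.1") in block    # hypothetical host
print(inside, outside)

# An interface address plus its mask identifies the subnet it sits on.
iface = ipaddress.ip_interface("223.1.1.1/24")
print(iface.network)
```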
DHCP client-server scenario
[Figure: DHCP server attached to subnet 223.1.1.0/24; an arriving DHCP client needs an address in this network; the other subnets are 223.1.2.0/24 and 223.1.3.0/24]

ICMP: Internet Control Message Protocol
• Used by hosts and routers to communicate network-level information:
    • error reporting: unreachable host, network, port, protocol
    • echo request/reply (used by ping)
• Network layer "above" IP: ICMP msgs are carried in IP datagrams.
• ICMP message: type, code, plus the first 8 bytes of the IP datagram causing the error.

Type   Code   Description
0      0      echo reply (ping)
3      0      dest. network unreachable
3      1      dest. host unreachable
3      2      dest. protocol unreachable
3      3      dest. port unreachable
3      6      dest. network unknown
3      7      dest. host unknown
4      0      source quench (congestion control - not used)
8      0      echo request (ping)
9      0      route advertisement
10     0      router discovery
11     0      TTL expired
12     0      bad IP header

IPv6: motivation
• Initial motivation: 32-bit address space soon to be completely allocated.
• Additional motivation:
    • Header format helps speed processing/forwarding.
    • Header changes to facilitate QoS.
IPv6 datagram format:
• Fixed-length 40-byte header.
• No fragmentation allowed.

IPv6 datagram format
• Priority: identify priority among datagrams in a flow.
• Flow label: identify datagrams in the same "flow" (the concept of "flow" is not well defined).
• Next header: identify the upper layer protocol for the data.
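
The fixed-length 40-byte header can be illustrated by packing one with Python's struct module. This is a minimal sketch, not a complete IPv6 implementation. The slide does not give field widths, so the sketch assumes the standard layout: 4-bit version, 8-bit priority (traffic class), 20-bit flow label, 16-bit payload length, 8-bit next header, 8-bit hop limit, and two 128-bit addresses. The function name build_ipv6_header and the example addresses are hypothetical.

```python
import ipaddress
import struct


def build_ipv6_header(payload_len, next_header, hop_limit, src, dst,
                      priority=0, flow_label=0):
    """Pack a fixed-length IPv6 header (assumed field widths, see lead-in)."""
    # First 32-bit word: version (4 bits) | priority (8 bits) | flow label (20 bits).
    first_word = (6 << 28) | (priority << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",               # network byte order, 4+2+1+1+16+16 bytes
        first_word,
        payload_len,
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


# Example: a header announcing a 20-byte TCP payload (protocol number 6).
header = build_ipv6_header(20, 6, 64, "2001:db8::1", "2001:db8::2")
print(len(header))
```

Because the header never varies in length, a router can find every field at a fixed offset, which is part of what the slide means by a header format that speeds processing.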
Header layout (each row is 32 bits wide):
ver | pri | flow label
payload len | next hdr | hop limit
source address (128 bits)
destination address (128 bits)
data

Interplay between routing, forwarding
• Routing algorithm determines the end-to-end path through the network.
• Forwarding table determines local forwarding at this router.

Dest address      Output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

[Figure: the router looks up the IP destination address in the arriving packet's header; its output links are numbered 1, 2, 3]

Routing algorithm classification
• A link-state routing algorithm: Dijkstra's algorithm
• Distance vector algorithm

Dijkstra's algorithm: Example
Step   N'       D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u        7,u         3,u         5,u         ∞           ∞
1      uw       6,w                     5,u         11,w        ∞
2      uwx      6,w                                 11,w        14,x
3      uwxv                                         10,v        14,x
4      uwxvy                                                    12,y
5      uwxvyz
[Figure: weighted graph on nodes u, v, w, x, y, z]
Notes:
• Construct the shortest path tree by tracing predecessor nodes.
• Ties can exist (can be broken arbitrarily).

Dijkstra's algorithm: Another Example
Step   N'       D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u        2,u         5,u         1,u         ∞           ∞
1      ux       2,u         4,x                     2,x         ∞
2      uxy      2,u         3,y                                 4,y
3      uxyv                 3,y                                 4,y
4      uxyvw                                                    4,y
5      uxyvwz
[Figure: weighted graph on nodes u, v, w, x, y, z]

Distance vector algorithm
Bellman-Ford equation (dynamic programming)
Let d_x(y) := cost of the least-cost path from x to y. Then
    d_x(y) = min_v { c(x,v) + d_v(y) }
where c(x,v) is the cost to neighbor v, d_v(y) is the cost from neighbor v to destination y, and the min is taken over all neighbors v of x.

Bellman-Ford example
Clearly, d_v(z) = 5, d_x(z) = 3, d_w(z) = 3.
The B-F equation says:
    d_u(z) = min { c(u,v) + d_v(z), c(u,x) + d_x(z), c(u,w) + d_w(z) }
           = min { 2 + 5, 1 + 3, 5 + 3 } = 4
The node achieving the minimum is the next hop in the shortest path, used in the forwarding table.
[Figure: weighted graph on nodes u, v, w, x, y, z]

Distance vector example
[Figure: three nodes with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7; time runs left to right]
Each node keeps a table of the cost to x, y, z (columns) from each node (rows).

Node x table
         initial    after 1st exchange   after 2nd exchange
from x   0 2 7      0 2 3                0 2 3
from y   ∞ ∞ ∞      2 0 1                2 0 1
from z   ∞ ∞ ∞      7 1 0                3 1 0

Node y table
         initial    after 1st exchange   after 2nd exchange
from x   ∞ ∞ ∞      0 2 7                0 2 3
from y   2 0 1      2 0 1                2 0 1
from z   ∞ ∞ ∞      7 1 0                3 1 0

Node z table
         initial    after 1st exchange   after 2nd exchange
from x   ∞ ∞ ∞      0 2 7                0 2 3
from y   ∞ ∞ ∞      2 0 1                2 0 1
from z   7 1 0      3 1 0                3 1 0

Node x's update after the first exchange:
    D_x(y) = min{ c(x,y) + D_y(y), c(x,z) + D_z(y) } = min{ 2+0, 7+1 } = 2
    D_x(z) = min{ c(x,y) + D_y(z), c(x,z) + D_z(z) } = min{ 2+1, 7+0 } = 3

Hierarchical routing
Our routing study thus far is an idealization:
• all routers identical
• network "flat"
… not true in practice.
Scale: with 600 million destinations:
• can't store all destinations in routing tables!
• routing table exchange would swamp links!
Administrative autonomy:
• internet = network of networks
• each network admin may want to control routing in its own network

Intra-AS Routing
• Also known as interior gateway protocols (IGP).
• Most common intra-AS routing protocols:
    • RIP: Routing Information Protocol
    • OSPF: Open Shortest Path First
    • IGRP: Interior Gateway Routing Protocol (Cisco proprietary)

RIP (Routing Information Protocol)
• Included in the BSD-UNIX distribution in 1982.
• Distance vector algorithm:
    • distance metric: # hops (max = 15 hops), each link has cost 1
    • DVs exchanged with neighbors every 30 sec in a response message (aka advertisement)
    • each advertisement: list of up to 25 destination subnets (in the IP addressing sense)
[Figure: routers A, B, C, D and subnets u, v, w, x, y, z]

From router A to destination subnets:
Subnet   Hops
u        1
v        2
w        2
x        3
y        3
z        2

RIP: Example
[Figure: routers A, B, C, D and subnets w, x, y, z]
Routing table in router D:
Destination subnet   Next router   # hops to destination
w                    A             2
y                    B             2
z                    B             7
x                    --            1
….                   ….            ....
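
RIP's distance vector exchange follows the Bellman-Ford update shown in the three-node example above. The sketch below replays that example with the link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7. It is a simplified, synchronous model in which every node updates at once from its neighbors' previous vectors; real routers exchange vectors asynchronously. The names COST, initial_vectors, exchange_round and run_until_stable are illustrative assumptions.

```python
INF = float("inf")

# Link costs from the three-node example.
COST = {
    "x": {"y": 2, "z": 7},
    "y": {"x": 2, "z": 1},
    "z": {"x": 7, "y": 1},
}
NODES = sorted(COST)


def initial_vectors():
    """Each node starts with 0 to itself and the direct link cost to neighbors."""
    return {n: {d: 0 if d == n else COST[n].get(d, INF) for d in NODES}
            for n in NODES}


def exchange_round(vectors):
    """One round: every node applies d_x(y) = min over neighbors v of c(x,v) + d_v(y)."""
    updated = {}
    for x in NODES:
        updated[x] = {}
        for y in NODES:
            if x == y:
                updated[x][y] = 0
            else:
                updated[x][y] = min(COST[x][v] + vectors[v][y] for v in COST[x])
    return updated


def run_until_stable():
    """Repeat exchange rounds until no distance vector changes."""
    vectors = initial_vectors()
    while True:
        updated = exchange_round(vectors)
        if updated == vectors:
            return vectors
        vectors = updated


final = run_until_stable()
print(final["x"])  # x reaches z through y at cost 3 instead of the direct cost 7
```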
OSPF (Open Shortest Path First)
• "Open": publicly available.
• Uses a link state algorithm:
    • LS packet dissemination
    • topology map at each node
    • route computation using Dijkstra's algorithm
• An OSPF advertisement carries one entry per neighbor.
• Advertisements are flooded to the entire AS:
    • carried in OSPF messages directly over IP (rather than TCP or UDP)
• IS-IS routing protocol: nearly identical to OSPF.

Hierarchical OSPF
[Figure: backbone containing a boundary router and backbone routers; area border routers connect the backbone to area 1, area 2, and area 3, each with internal routers]

Internet inter-AS routing: BGP
• BGP (Border Gateway Protocol): the de facto inter-domain routing protocol.
    • "glue that holds the Internet together"
• BGP provides each AS a means to:
    • eBGP: obtain subnet reachability information from neighboring ASs.
    • iBGP: propagate reachability information to all AS-internal routers.
    • determine "good" routes to other networks based on reachability information and policy.
• Allows a subnet to advertise its existence to the rest of the Internet: "I am here".

BGP basics
• BGP session: two BGP routers ("peers") exchange BGP messages:
    • advertising paths to different destination network prefixes ("path vector" protocol)
    • exchanged over semi-permanent TCP connections
• When AS3 advertises a prefix to AS1:
    • AS3 promises it will forward datagrams towards that prefix.
    • AS3 can aggregate prefixes in its advertisement.
[Figure: AS3 (routers 3a, 3b, 3c), AS1 (routers 1a, 1b, 1c, 1d), and AS2 (routers 2a, 2b, 2c), with other networks attached to AS3 and AS2; a BGP message travels from AS3 to AS1]

BGP basics: distributing path information
• Using the eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
    • 1c can then use iBGP to distribute the new prefix info to all routers in AS1.
    • 1b can then re-advertise the new reachability info to AS2 over the 1b-to-2a eBGP session.
• When a router learns of a new prefix, it creates an entry for the prefix in its forwarding table.
[Figure: the same three ASs, showing eBGP sessions between ASs and iBGP sessions inside AS1]

Broadcast routing
• Deliver packets from source to all other nodes.
• Source duplication is inefficient.
• Source duplication: how does the source determine recipient addresses?
[Figure: source duplication (R1 creates and transmits every duplicate) versus in-network duplication (duplicates are made at R1 and R2 on the way to R3 and R4)]

Spanning tree
• First construct a spanning tree.
• Nodes then forward/make copies only along the spanning tree.
[Figure: spanning tree over nodes A to G; (a) broadcast initiated at A, (b) broadcast initiated at D]

Multicast routing: problem statement
Goal: find a tree (or trees) connecting routers having local multicast group members.
• Tree: not all paths between routers are used.
• Shared-tree: same tree used by all group members.
• Source-based: different tree from each sender to rcvrs.
[Figure: a shared tree and source-based trees. Legend: group member, not group member, router with a group member, router without group member]

Shortest path tree
• Multicast forwarding tree: tree of shortest path routes from source to all receivers.
    • Dijkstra's algorithm
[Figure: source s and routers R1 to R7, with links numbered 1 to 6. Legend: router with attached group member; router with no attached group member; link used for forwarding, where i indicates the order in which the link was added by the algorithm]
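
A shortest path tree of this kind can be built with Dijkstra's algorithm by recording each node's predecessor. The sketch below is a minimal heap-based version. It reuses the six-node graph from "Dijkstra's algorithm: Another Example" rather than the multicast figure. Only c(u,v) = 2, c(u,x) = 1 and c(u,w) = 5 are stated in the text; the remaining link costs are assumptions chosen to agree with that example's table. The names GRAPH and shortest_path_tree are illustrative.

```python
import heapq

# Undirected weighted graph; costs other than u-v, u-x, u-w are assumed.
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}


def shortest_path_tree(graph, source):
    """Return (dist, pred): least costs from source and each node's tree parent."""
    dist = {source: 0}
    pred = {}
    done = set()                      # nodes whose least cost is final (the set N')
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue                  # stale heap entry, already finalized
        done.add(node)
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                pred[neighbor] = node  # tracing pred back to source gives the tree
                heapq.heappush(heap, (candidate, neighbor))
    return dist, pred


dist, pred = shortest_path_tree(GRAPH, "u")
print(dist)
print(pred)
```

Tracing the predecessors from any receiver back to the source yields that receiver's branch of the tree, which is the same "tracing predecessor nodes" step noted in the first Dijkstra example.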