Computer Networks
The Network Layer

1 Main Functions
• Routing.
• Forwarding.

2 Design Issues
• Services provided to transport layer.
• How to design network-layer protocols.

3 Store-and-Forward Packet Switching
(Fig. 5-1: subnet.)
• Host sends packet to nearest router.
• Packet forwarded to next router.
• Until packet reaches destination.

4 Services
• What kind of services provided to transport layer?
• Connection-oriented versus connectionless service?

5 Connectionless Service
• Datagram network.
• “Move all intelligence to the edges”.
– Routers just route.
– Everything else should be done end-to-end.
• No ordering, no flow/congestion control, no reliable delivery.
• Best-effort service model.
• Packets are routed independently.
• E.g., Internet.

6 Connection-Oriented Service
• Virtual circuit networks.
• À la telephone network.
• Reliable, ordered service.
• Virtual connection established from source to destination.
• E.g., X.25, ATM.

7 Datagram Network Operation
• How does it work?
• Data from transport layer is broken into packets, or datagrams.
• Network layer at host adds network-layer header and forwards packets to directly connected router.

8 Datagram Network: Example
• Routing within a datagram subnet.

9 Virtual Circuit Network Operation
• Connection establishment before sending data.
– All traffic for that connection follows same route.

10 Virtual Circuit Network: Example
• Routing within a virtual-circuit subnet.

11 Virtual-Circuit versus Datagram Subnets
(Fig. 5-4.)

12 Routing

13 Routing
• One of the main functions of network layer.
• Routing versus forwarding?
• Datagram versus VC networks?

14 Routing Algorithm
• Computes routing tables.
• Properties:
– Correctness.
– Robustness.
– Stability.
– Optimality.
• Tries to optimize a certain metric.

15 Optimality Principle
• General statement about optimal routes (independent of topology and routing algorithm).
• If router J is on optimal path between I and K, then the optimal path from J to K also falls along the same route.
– Proof by contradiction.
• Corollary:
– Set of optimal routes from all sources to a destination forms a tree rooted at the destination.
– Sink tree.

16 Types of Routing Algorithms
• Non-adaptive versus adaptive.

17 Adaptive and Non-adaptive Routing
• Non-adaptive routing:
– Fixed routing, static routing.
– Does not take current state of the network (e.g., load, topology) into account.
– Routes are computed in advance, off-line, and downloaded to routers when booted.
• Adaptive routing:
– Routes change dynamically as function of current state of network.
– Algorithms vary on how they get routing information, metrics used, and when they change routes.

18 Static Algorithms (Non-Adaptive)
1. Shortest-path routing.
2. Flooding.

19 Shortest-Path Routing
• Problem: Given a graph, where nodes represent routers and edges represent links, find shortest path between a given pair of nodes.
• What is shortest in shortest path?
– Depends on the routing metric in use.
– Example: number of hops (static), geographic distance (static), delay, bandwidth (raw versus available), combination of a subset of these.
• Dijkstra’s shortest-path algorithm (1959).

20 Dijkstra’s Shortest-Path Algorithm
• Initially, links are assigned costs.
• As the algorithm executes, each node is labeled with its distance to the source along the best known path.
• Initially, no routes known, so all nodes are labeled with infinity.
• Labels change as the algorithm proceeds.
• Labels can be tentative (temporary) or permanent.
– Initially all labels are tentative.
– A label becomes permanent if it represents the shortest path from the source to the node.

21 Shortest Path Routing
• Find shortest path from A to D:
– Start at A; label each adjacent node with its distance to A.
– B is made permanent.

22 Flooding
• Every incoming packet forwarded on every outgoing link except the one it arrived on.
• Problem: duplicates.
• Constraining the flood:
– Hop count.
– Keep track of packets that have been flooded.
• Robust, shortest delay (picks shortest path as one of the paths).
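
The labeling procedure from the Dijkstra slides can be sketched as follows. This is a minimal illustration, not the course's reference code: the function names and the graph are assumptions, and the link costs in GRAPH are made up for the example rather than taken from the slide's figure.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from source.

    graph maps each node to a dict {neighbor: non-negative link cost}.
    """
    dist = {node: float("inf") for node in graph}  # all labels start at infinity
    prev = {node: None for node in graph}
    dist[source] = 0
    permanent = set()
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in permanent:
            continue  # stale heap entry for an already permanent node
        permanent.add(node)  # smallest tentative label becomes permanent
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate  # improve the tentative label
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev

def path_to(prev, target):
    """Walk the predecessor links back from target to the source."""
    path = []
    while target is not None:
        path.append(target)
        target = prev[target]
    return path[::-1]

# Hypothetical topology; costs are illustrative only.
GRAPH = {
    "A": {"B": 2, "G": 6},
    "B": {"A": 2, "C": 7, "E": 2},
    "C": {"B": 7, "D": 3, "F": 3},
    "D": {"C": 3, "H": 2},
    "E": {"B": 2, "F": 2, "G": 1},
    "F": {"C": 3, "E": 2, "H": 2},
    "G": {"A": 6, "E": 1, "H": 4},
    "H": {"D": 2, "F": 2, "G": 4},
}

dist, prev = dijkstra(GRAPH, "A")
print(dist["D"], path_to(prev, "D"))
```

With these assumed costs, B is the first neighbor of A to become permanent, which matches the step described on the Shortest Path Routing slide.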
23 Flooding: Example
• Stallings Figure 12.4 (hop-count=3)

24 Dynamic Routing Algorithms
• (Adaptive Routing)
– Distance vector routing.
– Link state routing.

25 Distance Vector Routing
• Aka Bellman-Ford (1957), Ford-Fulkerson (1962).
• Original ARPANET routing; also used by Internet’s RIP.
• Each router keeps routing table (or routing vector) with best known distance to each destination and corresponding outgoing interface.
• Routing tables are updated by exchanging routing information with neighbors.

26 Distance Vector (Cont’d)
• Routing table at each router:
– One entry per participating router.
– Each entry contains outgoing interface and distance to corresponding destination.
– Metric: number of hops, delay, queue length.
– Each router knows distance to its neighbors.
• Old ARPANET algorithm: DV where cost metric is outgoing link queue length.

27 Distance Vector Routing
• (a) A subnet. (b) Input from A, I, H, K, and the new routing table for J.

28 Routing Updates
• Every T interval, routers exchange routing updates.
• Routing update from router X consists of a vector with all destinations and the corresponding distance from X to them.
• When router Y receives an update from X, it can estimate its distance to router Z through X as Dyz = Dyx + Dxz.
• Router Y receives updates from all its neighbors and builds a new RT, keeping for each destination the neighbor that gives the smallest estimate.

29 Distance Vector: Example
[Figure: six-node example network with link costs.]
Routing table at node 1, T=T0:
Node  Distance  Next
1     0         -
2     2         2
3     5         3
4     1         4
5     6         3
6     8         3
[T=T1: distance vectors received from neighbors 2, 3, and 4.]
Routing table at node 1, T=T2:
Node  Distance  Next
1     0         -
2     2         2
3     3         4
4     1         4
5     2         4
6     4         4

30 Problems
1. Routing loops.
2. Slow convergence.
3. Counting to infinity.

31 Count-to-Infinity
• Good news propagates fast.
Distances to A (A is initially down, then comes up):
B     C     D     E
inf   inf   inf   inf   (initially, A down)
1     inf   inf   inf   (after 1 exchange)
1     2     inf   inf   (after 2 exchanges)
1     2     3     inf   (after 3 exchanges)
1     2     3     4     (after 4 exchanges)

32 Count-to-Infinity (Cont’d)
• But bad news propagates slowly!
Distances to A (initially all up, then A goes down):
B     C     D     E
1     2     3     4     (initially, all up)
3     2     3     4     (after 1 exchange)
3     4     3     4     (after 2 exchanges)
5     4     5     4     (after 3 exchanges)
5     6     5     6     (after 4 exchanges)
7     6     7     6     (after 5 exchanges)
7     8     7     8     (after 6 exchanges)
…
inf   inf   inf   inf

33 Count-to-Infinity (Cont’d)
• Gradually routers work their way up to infinity.
• Number of exchanges depends on how large infinity is.
• To reduce number of exchanges, if metric is number of hops, infinity = maximum path length + 1.

34 Solution
• Routing loops:
– Path vector: record actual path used in the DV.
– Previous hop tracing: records preceding router.
• Count-to-infinity:
– Split horizon: router reports to neighbor cost “infinity” for destination if route to that destination is through that neighbor.

35 Split Horizon
• Tries to make bad news spread faster.
• A node reports infinity as its distance to node X on the link over which it sends packets to X.
• Example: in the first exchange, C tells D its distance to A but tells B its distance to A is infinity.
– So B discovers its link to A is down and C’s distance to A is infinity; so it sets its distance to A to infinity.

36 Link State Routing
• DV routing used in the ARPANET until 1979, when it was replaced by link state routing.
• Used by the Internet’s OSPF.
• Based on Dijkstra’s shortest-path algorithm, run at every router.
• Plus link state updates.

37 Link State Routing (Cont’d)
• Link state routing is based on:
– Discover your neighbors and measure the communication cost to them.
– Send updates about your neighbors to all other routers.
– Compute shortest path to every other router.

38 Finding Neighbors
• When router is booted, its first task is to find who its neighbors are.
• Special single-hop “hello” packets.
• Cost metric:
– Number of hops: in this case, always 1.
– Delay: send “echo” packets and measure RTT/2.
– Load?

39 Generating Link State Updates
• Link state packets (LSP).
– Sender identity.
– Sequence number.
– TTL.
– List of (neighbor, cost).
• When to send updates?
– Proactive: periodic updates; how often?
– Reactive: whenever some significant event is detected, e.g., link goes down.
• Where to send them? Everywhere: flood.

40 Processing Updates
• When LSP is received:
– Check sequence number.
– If higher than current sequence number, keep it and flood it; otherwise, discard it.
– Periodically decrement TTL.
• When TTL=0, purge LSP.

41 Computing Routes
• Routers have global view of network.
– They receive updates from all other routers with their cost to their neighbors.
– Build network graph.
• Use Dijkstra’s shortest-path algorithm to compute shortest paths to all other nodes.

42 Measuring Line Cost
• A subnet in which the East and West parts are connected by two lines.

43 Building Link State Packets
• (a) A subnet. (b) The link state packets for this subnet.

44 Distributing the Link State Packets
• B’s LSP buffer: each row corresponds to a recently arrived LSP that hasn’t been fully processed yet.

45 Link State Routing: Problems
• Scalability:
– Storage: kn, where n is number of routers and k is number of neighbors.
– Computation time.
– LSP propagation via flooding.

46 DV versus LS
• DV:
– Node tells its neighbors what it knows about everybody.
– Based on others’ knowledge, node chooses best route.
– Distributed computation.
• LS:
– Node tells everyone what it knows about its neighbors.
– Every node has global view.
– Nodes compute their own routes.

47 Hierarchical Routing
• For scalability:
– As network grows, so does RT size, routing update generation, processing, and propagation overhead, and route computation time and resources.
• Divide network into routing regions.
– Routers within region know how to route packets to all destinations within region.
– But don’t know how to route within other regions.
– “Border” routers: route between regions.

48 Hierarchical Routing: Example
[Figure: network of 17 routers grouped into regions 1 to 5.]
Flat routing, table for 1A:
Dest.  Next  Hops
1A     -     -
1B     1B    1
1C     1C    1
2A     1B    2
2B     1B    3
2C     1B    3
2D     1B    4
3A     1C    3
3B     1C    2
4A     1C    3
4B     1C    4
4C     1C    4
5A     1C    4
5B     1C    5
5C     1B    5
5D     1C    6
5E     1C    5

49 Hierarchical Routing: Example
Hierarchy, table for 1A:
Dest.  Next  Hops
1A     -     -
1B     1B    1
1C     1C    1
2      1B    2
3      1C    2
4      1C    3
5      1C    4

50 Hierarchical Routing
• Optimal paths are not guaranteed.
– Example: the best route from 1A to 5C is via region 2, but with hierarchical routing all traffic to region 5 goes via region 3.
• How many hierarchical levels?
– Example: 720 routers.
• 1 level: each router needs 720 RT entries.
• 2 levels: 24 regions of 30 routers: each router’s RT has 30+23 = 53 entries.
• 3 levels: 8 clusters of 9 regions with 10 routers: each router’s RT has 10+8+7 = 25 entries.

51 Intra-AS and Inter-AS routing
[Figure: autonomous systems A, B, and C with gateway routers A.a, A.c, B.a, and C.b; gateway A.c runs inter-AS and intra-AS routing at the network layer, above the link and physical layers.]
Gateways:
• perform inter-AS routing amongst themselves
• perform intra-AS routing with other routers in their AS

52 Intra-AS and Inter-AS routing
[Figure: hosts h1 and h2 in different ASs, connected through gateways A.a, A.c, B.a, and C.b.]
• Inter-AS routing between A and B. Internet: BGP.
• Intra-AS routing within AS A and within AS B. Internet: OSPF, IS-IS, RIP.

53 Many-to-Many Routing
• Support many-to-many communication.
• Example applications: multi-point data distribution, multi-party teleconferencing.

54 Broadcasting
• Send to ALL destinations.
• Several possible routing mechanisms for broadcasting.
• Simplistic approach: send separate packet to each destination.
– Simple but expensive.
– Source needs to know about all destinations.
• Flooding:
– May generate too many duplicates (depending on node connectivity).

55 Multidestination Routing
• Packet contains list of destinations.
• Router checks destinations and determines on which interfaces it will forward packet.
– Router generates new copy of packet for each output line and includes in packet only the appropriate set of destinations.
– Eventually, packets will only carry 1 destination.
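
The per-router step of multidestination routing can be sketched as below. This is only an illustration of the idea on the slide: the function name, the routing table layout, and the destination and line names are all hypothetical.

```python
from collections import defaultdict

def split_destinations(destinations, next_hop):
    """Group a packet's destination list by outgoing line.

    next_hop is a hypothetical routing table: destination -> outgoing line.
    Each returned entry corresponds to one new copy of the packet, carrying
    only the destinations reached through that line.
    """
    per_line = defaultdict(list)
    for dest in destinations:
        per_line[next_hop[dest]].append(dest)
    return dict(per_line)

# Made-up routing table and packet, for illustration only.
ROUTING_TABLE = {"D1": "line0", "D2": "line1", "D3": "line0", "D4": "line2"}
copies = split_destinations(["D1", "D2", "D3"], ROUTING_TABLE)
print(copies)
```

Applying the same step at every router shrinks the lists hop by hop, which is why each copy eventually carries a single destination.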
56 Spanning Tree Routing
• Use spanning tree (sink tree) rooted at broadcast initiator.
• No need for destination list.
• Each router on the spanning tree forwards packets on all lines on the spanning tree (except the one the packet arrived on).
• Efficient but needs to generate the spanning tree and routers must have that information.

57 Reverse Path Forwarding
• Routers don’t have to know spanning tree.
• Router checks whether broadcast packet arrived on interface used to send packets to source of broadcast.
– If so, it’s likely that it followed best route and thus is not a duplicate; router forwards packet on all other lines.
– If not, packet discarded as likely duplicate.

58 Broadcast Routing
• Reverse path forwarding. (a) A subnet. (b) A sink tree. (c) The tree built by reverse path forwarding.

59 Multicasting
• Special form of broadcasting:
– Instead of sending messages to all nodes, send messages to a group of nodes.
• Multicast group management:
– Creating, deleting, joining, leaving group.
– Group management protocols communicate group membership to appropriate routers.

60 Multicast Routing
• Each router computes spanning tree covering all other participating routers.
– Tree is pruned by removing links that do not lead to any group members.
[Figure: network whose nodes are labeled with their multicast group memberships (1, 2, or 1,2).]

61 Shared Tree Multicasting
• Source-rooted tree approaches don’t scale well!
– 1 tree per source, per group!
– Routers must keep state for m*n trees, where m is number of sources in a group and n is number of groups.
• Core-based trees: single tree per group.
– Host unicasts message to core, from where message is multicast along shared tree.
– Routes may not be optimal for all sources.
– State/storage savings in routers.

62 Internetworking

63 Internetworking
• What is it?
– Connecting networks together forming a single “internet”.

64 Connecting Networks
• A collection of interconnected networks.
65 How Networks Differ 5-43 66 How Networks Can Be Connected • • (a) Two Ethernets connected by a switch. (b) Two Ethernets connected by routers. 67 How to Internet? • Connection-oriented versus connectionless internetworking. • Connection oriented internetworking: – Based on VC concatenation. • Connectionless internetworking follows the datagram model. 68 Concatenated Virtual Circuits Gateway . Builds VC crossing the different networks. . Use of gateways to perform necessary conversions. 69 Connectionless Internetworking . Follows datagram model. . Packets from Host X to Host Y may follow different routes. . Gateways make routing decisions and perform translations. 70 Translating versus “Gluing” • Translation: converting between different protocols. • Hard! • Alternative: “gluing”. – I.e., using the same network layer protocol everywhere. – That’s what IP does! 71 Tunneling • Interconnecting source and destination on separate networks but of the same type. S D 72 Tunneling Analogy 73 More Tunneling … 74 Internetworking 75 Internetwork Routing • Inherently hierarchical. – Routing within each network: interior gateway protocol (IGP). – Routing between networks: exterior gateway protocol (EGP). • Within each network, different routing algorithms can be used. • Each network is autonomously managed and independent of others: autonomous system (AS). 76 Internetwork Routing: Example • (a) An internetwork. (b) A graph of the internetwork. 77 Internetwork Routing (Cont’d) • Typically, packet starts in its LAN. Gateway receives it (broadcast on LAN to “unknown” destination). • Gateway sends packet to gateway on the destination network using its routing table. If it can use the packet’s native protocol, sends packet directly. Otherwise, tunnels it. 78 Fragmentation • Happens when internetworking. • Network-specific maximum packet size. – Width of TDM slot. – OS buffer limitations. – Protocol (number of bits in packet length field). 
• Maximum payloads range from 48 bytes (ATM cells) to 64 Kbytes (IP packets).

79 Problem
• What happens when large packet wants to travel through network with smaller maximum packet size? Fragmentation.
• Gateways break packets into fragments; each sent as separate packet.
• Gateway on the other side has to reassemble fragments into original packet.
• 2 kinds of fragmentation: transparent and nontransparent.

80 Types of Fragmentation
• (a) Transparent fragmentation. (b) Nontransparent fragmentation.

81 Transparent Fragmentation
• Small-packet network transparent to other subsequent networks.
• Fragments of a packet addressed to the same exit gateway, where packet is reassembled.
– OK for concatenated VC internetworking.
• Subsequent networks are not aware fragmentation occurred.
• ATM networks (through special hardware) provide transparent fragmentation.

82 Problems with Transparent Fragmentation
• Exit gateway must know when it has received all the pieces.
– Fragment counter or “end of packet” bit.
• Some performance penalty by requiring all fragments to go through same gateway.
• May have to repeatedly fragment and reassemble through series of small-packet networks.

83 Non-Transparent Fragmentation
• Only reassemble at destination host.
– Each fragment becomes a separate packet.
– Thus routed independently.
• Problems:
– Hosts must reassemble.
– Every fragment must carry header until it reaches destination host.

84 Keeping Track of Fragments
• Fragments must be numbered so that original data stream can be reconstructed.
• Tree-structured numbering scheme:
– Packet 0 generates fragments 0.0, 0.1, 0.2, …
– If these fragments need to be fragmented later on, then 0.0.0, 0.0.1, …, 0.1.0, 0.1.1, …
– But, too much overhead in terms of number of fields needed.
– Also, if fragments are lost, retransmissions can take alternate routes and get fragmented differently.
85 Keeping Track of Fragments (Cont’d)
• Another way is to define an elementary fragment size that can pass through every network.
• When packet is fragmented, all pieces are equal to elementary fragment size, except last one (may be smaller).
• Packet may contain several fragments.

86 Fragmentation: Example
• Fragmentation when the elementary data size is 1 byte.
(a) Original packet, containing 10 data bytes.
(b) Fragments after passing through a network with maximum packet size of 8 payload bytes plus header.
(c) Fragments after passing through a size 5 gateway.

87 Keeping Track of Fragments
• Header contains packet number, number of first fragment in the packet, and last-fragment bit.
(a) Original packet with 10 data bytes (each data byte is 1 elementary fragment):
Packet number | Number of first fragment | Last-fragment bit | Data
27            | 0                        | 1                 | A B C D E F G H I J
(b) Fragments after passing through network with maximum packet size = 8 bytes:
27            | 0                        | 0                 | A B C D E F G H
27            | 8                        | 1                 | I J

88 The Internet

89 Design Principles for Internet
• Keep it simple.
• Exploit modularity.
• Expect heterogeneity.
• Think robustness.
• Avoid static options and parameters.
• Think about scalability.
• Consider performance and cost.

90 Internet as Collection of Subnetworks

91 IP (Internet Protocol)
• Glues Internet together.
• Common network-layer protocol spoken by all Internet participating networks.
• Best effort datagram service:
– No reliability guarantees.
– No ordering guarantees.

92 IP
• Transport layer breaks data streams into datagrams; datagrams are transmitted over the Internet, possibly being fragmented along the way.
• When all fragments of a datagram arrive at destination, it is reassembled by network layer and delivered to transport layer at destination host.

93 Encapsulation & Demultiplexing
• When an application sends data using TCP, the data is sent down the protocol stack, through each layer, until it is sent as a stream of bits across the network.
• Each layer adds information to the data by prepending headers (and sometimes adding trailer information) to the data that it receives.
• The unit of data that TCP sends to IP is called a TCP segment. The unit of data that IP sends to the network interface is called an IP datagram. The stream of bits that flows across the Ethernet is called a frame.
• A physical property of an Ethernet frame is that the size of its data must be between 46 and 1500 bytes.
• We could draw a nearly identical picture for UDP data. The only changes are that the unit of information that UDP passes to IP is called a UDP datagram, and the size of the UDP header is 8 bytes.
• When an Ethernet frame is received at the destination host it starts its way up the protocol stack and all the headers are removed by the appropriate protocol box.

94 IP Versions
• IPv4: IP version 4.
– Current, predominant version.
– 32-bit long addresses.
• IPv6: IP version 6 (aka IPng).
– Evolution of IPv4.
– Longer addresses (16 bytes long).

95 IP Datagram Format
• IP datagram consists of header and data (or payload).
• Header:
– 20-byte fixed (mandatory) part.
– Variable length optional part.

96 The IPv4 Header

97 The format of an IP datagram
• The current protocol version is 4, so IP is sometimes called IPv4.
• The header length is the number of 32-bit words in the header, including any options. Since this is a 4-bit field, it limits the header to 60 bytes.
• The type-of-service field (TOS) is composed of a 3-bit precedence field (which is ignored today), 4 TOS bits, and an unused bit that must be 0. The 4 TOS bits are: minimize delay, maximize throughput, maximize reliability, and minimize monetary cost.
• The total length field is the total length of the IP datagram in bytes. Since this is a 16-bit field, the maximum size of an IP datagram is 65535 bytes.
• The identification, flags and fragment offset fields are used when the packet is fragmented.
• The time-to-live field, or TTL, sets an upper limit on the number of routers through which a datagram can pass. It limits the lifetime of the datagram.
• The header checksum is calculated over the IP header only. It does not cover any data that follows the header.

98 IP Options
(Fig. 5-54.)

99 IP Fragmentation & Reassembly
• network links have MTU (max. transfer size): largest possible link-level frame.
– different link types, different MTUs
• large IP datagram divided (“fragmented”) within net
– one datagram becomes several datagrams
– “reassembled” only at final destination
– IP header bits used to identify, order related fragments
[Figure: fragmentation (in: one large datagram, out: 3 smaller datagrams) and reassembly.]

100 IP Fragmentation & Reassembly
Example
• 4000 byte datagram
• MTU = 1500 bytes
One large datagram becomes several smaller datagrams:
length  ID  fragflag  offset
4000    x   0         0       (original datagram)
1500    x   1         0       (1480 bytes in data field)
1500    x   1         185     (offset = 1480/8)
1040    x   0         370

101 IP Addresses
• IP address formats.

102 IP Addresses (Cont’d)
• Class A: 128 networks with 16M hosts each.
• Class B: 16,384 networks with 64K hosts each.
• Class C: 2M networks with 256 hosts each.
• More than 500K networks connected to the Internet.
• Network numbers centrally administered by ICANN.

103 IP Addresses (Cont’d)
• Special IP addresses.

104 Scalability of IP Addresses
• Problem: a single A, B, or C address refers to a single network.
• As organizations grow, what happens?

105 Example: A Campus Network

106 Solution
• Subnetting: divide the organization’s address space into multiple “subnets”.
• How? Use part of the host number bits as the “subnet number”.
• Example: Consider a university with 35 departments.
– With a class B IP address, use 6-bit subnet number and 10-bit host number.
– This allows for up to 64 subnets each with 1024 hosts.
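
The 6-bit subnet / 10-bit host split from the Solution slide can be checked with Python's standard `ipaddress` module. This is a small sketch under one assumption: the class B network 130.50.0.0/16 is used as the example block, and the helper name `subnet_of` is made up for illustration.

```python
import ipaddress

# Assumed class B network for the example (16 network bits).
network = ipaddress.ip_network("130.50.0.0/16")

# Borrow 6 host bits for the subnet number: /16 becomes /22.
subnets = list(network.subnets(prefixlen_diff=6))

def subnet_of(address, prefix_len=22):
    """Return the subnet an address belongs to, given the subnet prefix length."""
    return ipaddress.ip_interface(f"{address}/{prefix_len}").network

print(len(subnets))                # number of subnets
print(subnets[1])                  # second subnet in the block
print(subnets[1].num_addresses)    # 2**10 host-number bit patterns
print(subnets[1].netmask)          # dotted-decimal form of /22
print(subnet_of("130.50.8.1"))
```

Note that `num_addresses` counts every 10-bit host pattern, including the all-zeros and all-ones values that are normally reserved.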
107 Subnets
• IP address:
– subnet part (high order bits)
– host part (low order bits)
• What’s a subnet?
– device interfaces with same subnet part of IP address
– can physically reach each other without intervening router
[Figure: network consisting of 3 subnets, with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27. Subnet mask: /24.]

108 Subnets
• A class B network subnetted into 64 subnets.

109 Subnet Mask
• Indicates the split between network + subnet number and host number.
• Subnet Mask: 255.255.252.0 or /22 (network + subnet part).

110 Subnetting: Observations
• Subnets are not visible to the outside world.
• Thus, subnetting (and how) is a decision made by local network admin.

111 Subnet: Example
• Subnet 1: 10000010 00110010 000001|00 00000001
– 130.50.4.1
• Subnet 2: 10000010 00110010 000010|00 00000001
– 130.50.8.1
• Subnet 3: 10000010 00110010 000011|00 00000001
– 130.50.12.1

112 Problem with IPv4
• IPv4 is running out of addresses.
• Problem: class-based addressing scheme.
– Example: Class B addresses allow 64K hosts.
• More than half of Class B networks have fewer than 50 hosts!

113 Solution: CIDR
• CIDR: Classless Inter-Domain Routing.
– RFC 1519.
• Allocate remaining addresses in variable-sized blocks without considering classes.
• Example: if an organization needs 2000 addresses, it gets a 2048-address block.
• Forwarding had to be modified.
– Each routing table entry is extended with a 32-bit mask, which is ANDed with the destination IP address.
– If there is a match, the packet is forwarded on that interface (if several entries match, the one with the longest mask is used).

114 IP addressing: CIDR
CIDR: Classless InterDomain Routing
– subnet portion of address of arbitrary length
– address format: a.b.c.d/x, where x is # bits in subnet portion of address
– example: 11001000 00010111 0001000|0 00000000 (subnet part | host part) = 200.23.16.0/23

115 IP addresses: how to get one?
Q: How does host get IP address?
• hard-coded by system admin in a file
– Windows: control-panel->network->configuration->tcp/ip->properties
– UNIX: /etc/rc.config
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
– “plug-and-play” (more in next chapter)

116 IP addresses: how to get one?
Q: How does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space
ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23

117 Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP, which is connected to the Internet alongside ISPs-R-Us.]
• Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16”

118 Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
[Figure: as before, but Organization 1 (200.23.18.0/23) now connects to ISPs-R-Us instead of Fly-By-Night-ISP.]
• Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”

119 IP addressing: the last word...
Q: How does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
– allocates addresses
– manages DNS
– assigns domain names, resolves disputes

120 Routing: Store-and-Forward
• Store-and-Forward Packet Switching
A host with a packet to send transmits it to the nearest router, either on its own LAN or over a point-to-point link to the carrier.
The packet is stored there until it has fully arrived so the checksum can be verified. Then it is forwarded to the next router along the path until it reaches the destination host, where it is delivered. 121 Router Architecture Overview Two key router functions: • run routing algorithms/protocol • forwarding datagrams from incoming to outgoing link 122 Input Port Functions Physical layer: bit-level reception Data link layer: e.g., Ethernet Decentralized switching: • given datagram dest., lookup output port using forwarding table in input port memory • goal: complete input port processing at ‘line speed’ • queuing: if datagrams arrive faster than forwarding rate into switch fabric 123 Three types of switching fabrics 124 Switching Via Memory First generation routers: • traditional computers with switching under direct control of CPU • packet copied to system’s memory • speed limited by memory bandwidth (2 bus crossings per datagram) Input Port Memory Output Port System Bus 125 Switching Via a Bus • datagram from input port memory to output port memory via a shared bus • bus contention: switching speed limited by bus bandwidth • 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone) 126 Switching Via An Interconnection Network (CrossBar) • overcome bus bandwidth limitations • Banyan networks, other interconnection nets initially developed to connect processors in multiprocessor • Advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric. • Cisco 12000: switches Gbps through the interconnection network 127 Output Ports • Buffering required when datagrams arrive from fabric faster than the transmission rate • Scheduling discipline chooses among queued datagrams for transmission 128 Output port queueing • buffering when arrival rate via switch exceeds output line speed • queueing (delay) and loss due to output port buffer overflow! 
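
The forwarding-table lookup done at the input port, combined with the "more specific route" rule from the hierarchical addressing slides, amounts to a longest-prefix match. The sketch below is a deliberately simple linear scan using Python's `ipaddress` module; the function name and table layout are assumptions, and the prefixes are reused from the ISP example. Real routers use much faster structures to keep up with line speed.

```python
import ipaddress

def lookup(forwarding_table, destination):
    """Return the output port of the longest matching prefix, or None.

    forwarding_table is a list of (prefix, port) pairs; this layout is a
    hypothetical one chosen for the example.
    """
    dest = ipaddress.ip_address(destination)
    best_net, best_port = None, None
    for prefix, port in forwarding_table:
        net = ipaddress.ip_network(prefix)
        # A match is kept only if it is more specific than the best so far.
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_port = net, port
    return best_port

# Prefixes taken from the route aggregation example; port names are labels only.
TABLE = [
    ("200.23.16.0/20", "Fly-By-Night-ISP"),
    ("199.31.0.0/16", "ISPs-R-Us"),
    ("200.23.18.0/23", "ISPs-R-Us"),
]

print(lookup(TABLE, "200.23.18.5"))
print(lookup(TABLE, "200.23.20.9"))
```

An address in 200.23.18.0/23 matches both the /20 and the /23 entry, and the longer prefix wins, so Organization 1's traffic goes to ISPs-R-Us.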
129 Network Address Translation
• Another “quick fix” to the address shortage in IPv4.
• Specified in RFC 3022.
• Each organization gets a single (or small number of) IP addresses.
– This is used for Internet traffic only.
– For internal traffic, each host gets its own “internal” IP address.
• Three IP ranges have been declared as “private”.
– 10.0.0.0 to 10.255.255.255 (10.0.0.0/8)
– 172.16.0.0 to 172.31.255.255 (172.16.0.0/12)
– 192.168.0.0 to 192.168.255.255 (192.168.0.0/16)
• No “private” IP address can show up on the Internet, i.e., outside the organization’s network.

130 NAT: Network Address Translation
[Figure: local network (e.g., home network) 10.0.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, and 10.0.0.4, connected through a NAT router with address 138.76.29.7 to the rest of the Internet.]
• All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers.
• Datagrams with source or destination in this network have 10.0.0.0/24 address for source, destination (as usual).

131 NAT: Network Address Translation (Cont’d)
• Motivation: local network uses just one IP address as far as outside world is concerned:
– no need to be allocated range of addresses from ISP: just one IP address is used for all devices
– can change addresses of devices in local network without notifying outside world
– can change ISP without changing addresses of devices in local network
– devices inside local net not explicitly addressable, visible by outside world (a security plus).

132 NAT: Network Address Translation (Cont’d)
Implementation: NAT router must:
– outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #) . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
– remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
– incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

133 NAT: Network Address Translation (Cont’d)
[Figure: local network with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, and 10.0.0.4 behind a NAT router with address 138.76.29.7.]
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345   D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001   D: 128.119.40.186, 80
3: Reply arrives, dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80   D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80   D: 10.0.0.1, 3345
NAT translation table:
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

134 NAT: Network Address Translation (Cont’d)
• 16-bit port-number field:
– 60,000 simultaneous connections with a single LAN-side address!
• NAT is controversial:
– routers should only process up to layer 3
– violates end-to-end argument
• NAT possibility must be taken into account by app designers, e.g., P2P applications
– address shortage should instead be solved by IPv6

135 Internet Control Protocols
• “Companion” protocols to IP.
• Control protocols used mainly for signaling and exchange of control information.
• Examples: ICMP, ARP, RARP, BOOTP, and DHCP.
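
The NAT translation table from the preceding NAT slides can be sketched as a pair of dictionaries. This is a toy model only: the class and method names are hypothetical, ports are handed out sequentially starting at 5001 to mirror the slide's example, and timeouts, protocols, and port exhaustion are ignored.

```python
class NatRouter:
    """Toy NAT translation table; names and port policy are assumptions."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self.next_port = first_port
        self.lan_to_wan = {}  # (LAN IP, LAN port) -> WAN port
        self.wan_to_lan = {}  # WAN port -> (LAN IP, LAN port)

    def outgoing(self, src_ip, src_port):
        """Rewrite the source of an outgoing datagram, remembering the pair."""
        key = (src_ip, src_port)
        if key not in self.lan_to_wan:
            wan_port = self.next_port
            self.next_port += 1
            self.lan_to_wan[key] = wan_port
            self.wan_to_lan[wan_port] = key
        return self.public_ip, self.lan_to_wan[key]

    def incoming(self, dst_port):
        """Map the destination port of a reply back to the LAN host, if known."""
        return self.wan_to_lan.get(dst_port)

nat = NatRouter("138.76.29.7")
print(nat.outgoing("10.0.0.1", 3345))   # step 2 on the slide
print(nat.incoming(5001))               # step 4 on the slide
```

A reply to a port with no table entry returns `None`, which reflects why hosts behind the NAT are not directly addressable from outside.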
136
ICMP: Internet Control Message Protocol
• used by hosts & routers to communicate network-level information
– error reporting: unreachable host, network, port, protocol
– echo request/reply (used by ping)
• network-layer “above” IP:
– ICMP msgs carried in IP datagrams
• ICMP messages are encapsulated within an IP datagram.
• ICMP message: type, code, plus the first 8 bytes of the IP datagram causing the error
137
ICMP: Internet Control Message Protocol (Cont’d)
• ICMP error messages are sometimes handled specially. For example, an ICMP error message is never generated in response to an ICMP error message.
• When an ICMP error message is sent, the message always contains the IP header and the first 8 bytes of the IP datagram that caused the ICMP error to be generated. This lets the receiving ICMP module associate the message with one particular protocol and one particular user process.
138
Example of ICMP: ping
• The name "ping" is taken from the sonar operation to locate objects. The ping program was written by Mike Muuss, and it tests whether another host is reachable.
• The program sends an ICMP echo request message to a host, expecting an ICMP echo reply to be returned.
• Normally, if you can't ping a host, you won't be able to Telnet or FTP to that host. Conversely, if you can't Telnet to a host, ping is often the starting point for determining what the problem is. Ping also measures the round-trip time to the host, giving some indication of how "far away" that host is.
• With the increased awareness of security on the Internet, ping may show a host as being unreachable even though we might still be able to Telnet to it.
139
Example of ICMP: ping (Cont’d)
• Format of ICMP message for echo request and echo reply.
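As an illustration of the echo request format named above, the following Python sketch packs an 8-byte ICMP header (type, code, checksum, identifier, sequence number) followed by a data payload, and fills in the standard Internet checksum. The function names and the identifier value are hypothetical. The sketch only builds the bytes and does not send them, since raw sockets usually need administrator rights.

```python
import struct

ICMP_ECHO_REQUEST = 8  # type 8 = echo request; an echo reply uses type 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP echo request: 8-byte header followed by the payload."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


# 56 data bytes plus the 8-byte ICMP header gives 64 bytes.
packet = build_echo_request(identifier=0x1234, sequence=0, payload=bytes(56))
print(len(packet))                # 64
print(internet_checksum(packet))  # 0, which is how a receiver verifies the message
```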
Example of ping output:
bsdi % ping svr4
PING svr4 (140.252.13.34): 56 data bytes
64 bytes from 140.252.13.34: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=2 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=3 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=4 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=5 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=6 ttl=255 time=0 ms
64 bytes from 140.252.13.34: icmp_seq=7 ttl=255 time=0 ms
^?    type interrupt key to stop
--- svr4 ping statistics ---
8 packets transmitted, 8 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
140
Example of ICMP: ping (Cont’d)
• IP Record Route Option
– The ping program gives us an opportunity to look at the IP record route (RR) option.
– The -R option causes ping to set the IP RR option in the outgoing IP datagram. This causes every router that handles the datagram to add its IP address to a list in the options field. When ping receives the echo reply, it prints the list of IP addresses.
– Most systems today do support these optional features, but some systems don't reflect the IP list.
– Example ……
141
Example of ICMP: ping (Cont’d)
• Broadcast ping
– For example,
% ping -b 211.69.193.255
WARNING: pinging broadcast address
PING 211.69.193.255 (211.69.193.255) from 211.69.193.53 : 56(84) bytes of data.
Warning: time of day goes back, taking countermeasures.
64 bytes from 211.69.193.53: icmp_seq=0 ttl=255 time=649 usec
64 bytes from 211.69.193.62: icmp_seq=0 ttl=255 time=1.243 msec (DUP!)
64 bytes from 211.69.193.13: icmp_seq=0 ttl=64 time=1.298 msec (DUP!)
64 bytes from 211.69.193.12: icmp_seq=0 ttl=64 time=1.316 msec (DUP!)
64 bytes from 211.69.193.52: icmp_seq=0 ttl=255 time=1.332 msec (DUP!)
64 bytes from 211.69.193.52: icmp_seq=0 ttl=255 time=1.348 msec (DUP!)
64 bytes from 211.69.193.11: icmp_seq=0 ttl=64 time=1.364 msec (DUP!)
64 bytes from 211.69.193.15: icmp_seq=0 ttl=64 time=1.380 msec (DUP!)
64 bytes from 211.69.193.25: icmp_seq=0 ttl=64 time=1.396 msec (DUP!)
64 bytes from 211.69.193.26: icmp_seq=0 ttl=64 time=1.412 msec (DUP!)
64 bytes from 211.69.193.41: icmp_seq=0 ttl=64 time=1.428 msec (DUP!)
64 bytes from 192.168.1.216: icmp_seq=0 ttl=64 time=1.485 msec (DUP!)
142
MAC Addresses and ARP
• 32-bit IP address:
– network-layer address
– used to get datagram to destination IP subnet
• MAC (or LAN or physical or Ethernet) address:
– used to get datagram from one interface to another physically-connected interface (same network).
– 48-bit MAC address (for most LANs) burned into the adapter ROM.
143
LAN Addresses and ARP
• Each adapter on a LAN has a unique LAN address.
[Figure: LAN (wired or wireless) with four adapters, labeled 1A-2F-BB-76-09-AD, 71-65-F7-2B-08-53, 58-23-D7-FA-20-B0, and 0C-C4-11-6F-E3-98.]
• Broadcast address = FF-FF-FF-FF-FF-FF
144
ARP: Address Resolution Protocol
Question: how to determine MAC address of B knowing B’s IP address?
[Figure: LAN with four nodes; IP addresses 237.196.7.78, 237.196.7.23, 237.196.7.14, and 237.196.7.88, and MAC addresses 1A-2F-BB-76-09-AD, 71-65-F7-2B-08-53, 58-23-D7-FA-20-B0, and 0C-C4-11-6F-E3-98, as labeled in the figure.]
• Each IP node (host, router) on a LAN has an ARP table.
• ARP table: IP/MAC address mappings for some LAN nodes
< IP address; MAC address; TTL >
– TTL (Time To Live): time after which address mapping will be forgotten (typically 20 min)
145
ARP Operation
• Host 1 broadcasts an ARP request on the Ethernet asking who owns host 2’s IP address.
• Host 2 replies with its Ethernet address.
• Some optimizations:
– ARP caches.
– Piggybacking host’s own Ethernet address on ARP requests.
– Proxy ARP: services ARP requests for hosts on separate LANs.
146
An Example of ARP
• “ftp bsdi”
– The application, the FTP client, calls the function gethostbyname to convert the hostname (bsdi) into its 32-bit IP address. This function is called a resolver in the DNS (Domain Name System).
– The FTP client asks its TCP to establish a connection with that IP address.
– TCP sends a connection request segment to the remote host by sending an IP datagram to its IP address.
– The IP datagram is sent to a host or router on a locally attached network.
– Assuming an Ethernet, the sending host must convert the 32-bit IP address into a 48-bit Ethernet address.
– ARP sends an Ethernet frame called an ARP request to every host on the network. This is called a broadcast.
– The destination host's ARP layer receives this broadcast, recognizes that the sender is asking for its hardware address, and replies with an ARP reply. This reply contains the IP address and the corresponding hardware address.
– The ARP reply is received, and the IP datagram that forced the ARP request-reply exchange can now be sent.
– The IP datagram is sent to the destination host.
147
Other Issues Concerning ARP
• The fundamental concept behind ARP is that the network interface has a hardware address (a 48-bit value for an Ethernet or token ring interface). Frames exchanged at the hardware level must be addressed to the correct interface.
• Knowing a host's IP address alone doesn't let the kernel send a frame to that host. The kernel (i.e., the Ethernet driver) must know the destination's hardware address to send it data.
• Point-to-point links don't use ARP.
• RARP (Reverse Address Resolution Protocol)
– This protocol performs exactly the reverse function of ARP: it maps a hardware address to an IP address.
• Important command for ARP
– arp (in both Windows and Unix)
148
DHCP
• Dynamic Host Configuration Protocol.
• RFCs 2131 and 2132.
• Assigns IP addresses to hosts dynamically.
• DHCP server may not be on the same LAN as the requesting host.
• DHCP relay agent.
149
DHCP Operation
• Newly booted host broadcasts a DHCP DISCOVER message.
• DHCP relay agent intercepts DHCP DISCOVERs on its LAN and unicasts them to the DHCP server.
150
DHCP Operation
151
DHCP: Address Reuse
• How long should an IP address be allocated?
• Issue: hosts come and go.
• IP addresses may be assigned on a “lease” basis.
• Hosts must renew their leases.
152
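The lease idea can be sketched in Python as follows. This is a toy model of address reuse, not the DHCP protocol itself: the LeasePool class, its method names, the client identifiers, the example address, and the explicit now argument (used instead of a real clock so the example is deterministic) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class LeasePool:
    """Toy pool of IP addresses handed out on a lease basis (hypothetical)."""

    def __init__(self, addresses: List[str], lease_seconds: float) -> None:
        self.free = list(addresses)
        self.lease_seconds = lease_seconds
        self.leases: Dict[str, Tuple[str, float]] = {}  # client -> (ip, expiry time)

    def expire(self, now: float) -> None:
        """Return addresses whose lease has run out to the free list."""
        for client, (ip, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[client]
                self.free.append(ip)

    def request(self, client: str, now: float) -> Optional[str]:
        """Grant a new lease or renew an existing one; None if the pool is empty."""
        self.expire(now)
        if client in self.leases:
            ip, _ = self.leases[client]  # renewal keeps the same address
        elif self.free:
            ip = self.free.pop(0)
        else:
            return None
        self.leases[client] = (ip, now + self.lease_seconds)
        return ip


pool = LeasePool(["192.168.0.10"], lease_seconds=60)
print(pool.request("host-a", now=0))    # 192.168.0.10
print(pool.request("host-b", now=30))   # None (address still leased to host-a)
print(pool.request("host-a", now=50))   # 192.168.0.10 (renewed until t=110)
print(pool.request("host-b", now=120))  # 192.168.0.10 (host-a did not renew)
```

The last call shows why leases help when hosts come and go: an address held by a host that disappears is reclaimed automatically once the lease expires.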