Internet Protocol
© Prof. Aiman Hanna
Department of Computer Science
Concordia University
Montreal, Canada

The Internet
• A WAN consisting of many networks
• Started in the late 1960s, when the Advanced Research Projects Agency (ARPA), a part of the U.S. Department of Defense, began funding universities and private organizations to develop communication systems
• The research led to the creation of the ARPANET, which eventually evolved into the Internet
• Today, many would find it impossible to live without the Internet!

TCP/IP
Networks connected to the Internet run the TCP (Transmission Control Protocol) / IP (Internet Protocol) protocols
TCP is a Transport Layer protocol; IP is a Network Layer protocol
TCP is a connection-oriented protocol
TCP relies on IP to route packets through the network
The two communicating ends run TCP, which establishes connections (through handshaking) and then enforces reliable transfer, flow control, etc.
The Network Control Protocol (NCP) was TCP’s predecessor in the original ARPANET, which was reasonably reliable
When the ARPANET evolved into the Internet, reliability was lost
Consequently, TCP replaced NCP to allow running over unreliable networks
The User Datagram Protocol (UDP) is a Layer 4 alternative to TCP
In contrast to TCP, UDP is a connectionless protocol
Figure 11.1 – Internet Protocols
TCP provides a reliable connection independent of the underlying network architecture
IP provides the packet delivery service
Figure 11.2 – IP Transmitting over Different Networks

Internet Addressing
To an Internet user, Internet addresses may look like:
Sunset.Concordia.ca
http://www.Microsoft.com
The same form also appears in e-mail addresses: Comp445@encs.concordia.ca
These textual representations, however, are not actual Internet addresses
Internet Domain: a collection of sites of a particular type, such as com, org, edu, gov, etc.
Internet Addressing
How do these textual addresses map to actual addresses?
The Internet is a collection of independent networks, each of which may contain many computers
Each computer can then be identified by two things:
• The network to which it belongs
• Its local address within this network
These two together define a 32-bit address for that machine
The address is often written as a sequence of four 8-bit numbers separated by “.”
For example, concordia.ca is 132.205.7.63
But which bits represent the network number and which represent the local ID?
This depends on the type of address; IP recognizes several classes of Internet addresses, depending on the size of the organization’s network

Classification  byte1     byte2     byte3     byte4     # of networks      max nodes
Class A         0nnnnnnn  xxxxxxxx  xxxxxxxx  xxxxxxxx  2^7  (128)         2^24 (>16 million)
Class B         10nnnnnn  nnnnnnnn  xxxxxxxx  xxxxxxxx  2^14 (16,384)      2^16 (65,536)
Class C         110nnnnn  nnnnnnnn  nnnnnnnn  xxxxxxxx  2^21 (2,097,152)   2^8  (256)
Class D         1110 followed by a 28-bit multicast address
Class E         1111; reserved

n’s represent bits in the network number; x’s represent bits in the local identifier

Example: To which class does Concordia.ca belong (assume IP address 132.205.7.63)? To which class does Nasa.gov belong (assume IP address 64.37.246.3)?
Concordia.ca is 132.205.7.63, which is 10000100.11001101.00000111.00111111; the leading bits are 10, so it is class B
Nasa.gov is 64.37.246.3, which is 01000000.00100101.11110110.00000011; the leading bit is 0, so it is class A

Classless Addresses
The older version of IP, IPv4, has an address depletion problem
Internet addresses are 32 bits long, so there is only a finite number of addresses
2^32 ≈ 4.3 billion; isn’t that enough?
What if an organization has 1000 computers? A class C network (at most 256 addresses) is too small, while a class B network (65,536 addresses) would leave most of its addresses unused
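As a brief aside, the classful rules in the table above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `address_class` and `to_binary` are hypothetical and not part of the lecture, and the code simply tests the leading bits of the first byte as the table describes.

```python
def to_binary(dotted: str) -> str:
    """Render a dotted-decimal IPv4 address as four 8-bit binary groups."""
    return ".".join(f"{int(part):08b}" for part in dotted.split("."))


def address_class(dotted: str) -> str:
    """Return the classful category (A to E) from the leading bits of byte 1."""
    first_byte = int(dotted.split(".")[0])
    if not 0 <= first_byte <= 255:
        raise ValueError(f"not a valid IPv4 byte: {first_byte}")
    if first_byte & 0b10000000 == 0b00000000:   # 0xxxxxxx
        return "A"
    if first_byte & 0b11000000 == 0b10000000:   # 10xxxxxx
        return "B"
    if first_byte & 0b11100000 == 0b11000000:   # 110xxxxx
        return "C"
    if first_byte & 0b11110000 == 0b11100000:   # 1110xxxx (multicast)
        return "D"
    return "E"                                  # 1111xxxx (reserved)


for addr in ("132.205.7.63", "64.37.246.3"):
    print(addr, to_binary(addr), "class", address_class(addr))
```

Run on the two example addresses, the sketch reports class B for 132.205.7.63 (leading bits 10) and class A for 64.37.246.3 (leading bit 0).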
Two solutions exist:
• Use more bits for addresses (more than 32)
• Use Classless Inter-Domain Routing (CIDR)

CIDR
Specifies a group of addresses that do not fall into any of the predefined classes
Each address in the group can still be interpreted as a network number followed by a local identifier
Commonly used to allocate multiple class C addresses
For example, if a network has 1000 computers, CIDR allocates 4 consecutive class C addresses to that network
Example:

Class C        Bit representation                     Address range
211.195.8.0    11010011-11000011-00001000-xxxxxxxx    211.195.8.0 to 211.195.8.255
211.195.9.0    11010011-11000011-00001001-xxxxxxxx    211.195.9.0 to 211.195.9.255
211.195.10.0   11010011-11000011-00001010-xxxxxxxx    211.195.10.0 to 211.195.10.255
211.195.11.0   11010011-11000011-00001011-xxxxxxxx    211.195.11.0 to 211.195.11.255

Further, a router can extract the network number (in this case, 211.195.8.0), which is the first address of this network
This can be done via a logical AND operation between a 32-bit subnet mask (255.255.252.0) and an IP address

IP address             11010011-11000011-000010xx-xxxxxxxx
AND with subnet mask   11111111-11111111-11111100-00000000
Network number         11010011-11000011-00001000-00000000 (that is, 211.195.8.0)

In effect, CIDR groups several smaller networks together and treats them as a single large network; this is referred to as supernetting
Advantages?
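The AND operation above can be tried out with a minimal Python sketch. The helper names (`to_int`, `to_dotted`, `network_number`) are hypothetical, and the sample host addresses are made up for illustration; only the mask 255.255.252.0 and the 211.195.8.0 to 211.195.11.255 block come from the example.

```python
def to_int(dotted: str) -> int:
    """Pack a dotted-decimal IPv4 address into a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_number(address: str, mask: str) -> str:
    """Bitwise AND of address and subnet mask, as a router would compute it."""
    return to_dotted(to_int(address) & to_int(mask))


MASK = "255.255.252.0"
for addr in ("211.195.8.17", "211.195.9.200", "211.195.10.1",
             "211.195.11.255", "211.195.12.1"):
    print(addr, "->", network_number(addr, MASK))
```

The first four sample addresses all yield 211.195.8.0, so they belong to the same supernet; 211.195.12.1 yields 211.195.12.0 and therefore lies outside it.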
Yet, there is another issue: each IP packet carries the destination IP address, where the first 3 bits determine whether it is class A, B, or C
The rest of the bits in the address are then extracted to determine the network number
Determining the network number is straightforward when the number of network bits is fixed
With CIDR, this is not the case
The router must know the number of network bits
To allow that, the usual representation of an address (w.x.y.z) is replaced by (w.x.y.z/m), where m is the number of bits in the network ID
For the previous example, that will be 211.195.8.0/22, which means that the network number is 22 bits long

Obtaining an Address
A computer connected to a company network, or connected to an ISP, will have an IP address assigned to it
The machine may have a static IP address, or it may dynamically request an IP address from a server
The server runs a protocol called Dynamic Host Configuration Protocol (DHCP), which allocates the machine one of the available IP addresses it maintains
Try the ipconfig command
Prior to 1999, all network addresses were managed by the Internet Assigned Numbers Authority (IANA)
In 1999, the Internet Corporation for Assigned Names and Numbers (ICANN) assumed that responsibility, along with others related to the Domain Name System (DNS)
After all, aren’t IP addresses hard to memorize?!

DNS
Use textual addresses instead of IP addresses to access a network
DNS is a distributed database that enables the mapping between textual names and IP addresses
Why is DNS not centralized?
DNS is distributed among a collection of DNS servers, which are scattered around the Internet
Try the nslookup command in Unix
To facilitate the lookup process, servers are organized into zones
A request for a textual name escalates up the hierarchy, and may then go down, until the IP address is found (or the process fails)
Figure 11.3 – DNS Hierarchy
Example:
• What happens if we try to access pc1.products.acme.com, where pc1 is an existing machine?
Once the IP address of a textual address is found, the servers involved in the lookup process keep this information in their caches so that the next request is processed faster
Figure 11.4 – Zones in DNS Hierarchy

IP Routing
Based on routing tables
There is a distinction between physical addresses and IP addresses
For example, an e-mail to user@host.department.university.domain is translated into a 32-bit dotted IP address
The IP address is the unique address of the computer
The physical address is the one used by the underlying network
For example, a device connected to an Ethernet would sense an Ethernet address on the segment to know which packet is destined for it
However, this Ethernet address is a 48-bit address that has no significance on a global IP scale
So, how can the device then recognize a packet containing an IP address?
How does the router determine the physical address from the IP address when the packet is embedded into a LAN frame?
When the router receives an IP packet, there are two possibilities:
• The packet’s destination machine is in a network to which the router is attached, or
• It is not
If the destination machine belongs to the same network, then the router can send the packet directly to the destination; this is called Direct Routing
The router knows this because the network part of the destination IP address is the same as its own network part
But still, how can the router determine the physical address from the IP address?
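Before turning to that question, the direct-routing test itself can be sketched with Python's standard `ipaddress` module. This is a minimal sketch under stated assumptions: the function name `is_direct` and the list of attached networks are hypothetical, and the networks are written in the w.x.y.z/m notation introduced earlier.

```python
import ipaddress
from typing import Sequence


def is_direct(destination: str, attached_networks: Sequence[str]) -> bool:
    """True if the destination's network part matches an attached network."""
    dest = ipaddress.ip_address(destination)
    # Membership compares the network bits of the destination with each prefix.
    return any(dest in ipaddress.ip_network(net) for net in attached_networks)


# Hypothetical router attached to two networks.
ATTACHED = ["211.195.8.0/22", "132.205.0.0/16"]

print(is_direct("211.195.10.7", ATTACHED))  # same network: direct routing
print(is_direct("64.37.246.3", ATTACHED))   # elsewhere: forward to another router
```

With these assumed networks, the first call returns True and the second returns False.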
One approach is Dynamic Binding, also called the Address Resolution Protocol (ARP)
The router transmits a broadcast request to all devices in the LAN, specifying the IP address
The device with the specified IP address responds with its physical address
The router can then send the packet to the proper device; it also stores this information in a local cache for future requests
What if the destination is not directly reachable through one of the router’s networks?
The router then uses hierarchical routing, as discussed in previous lectures, to determine another router to send the packet to
The packet will then travel from one router to another until it reaches a router connected to the same network as the destination machine
Example:
Figure 11.7 – IP Routing
Try the traceroute command in Unix, or tracert on Windows

Routers
Figure 11.8 – Router Function
Address lookup must occur very quickly. Why?
Any search method can be used, but some are too slow for many routers to function fast enough
Content Addressable Memory, a hash structure, is probably the fastest
Instead of searching, use a hash function to calculate the next hop from the destination
Figure 11.9 – HASH Function to Update Routing Tables
Moving the packet from the input port to the output port can use the bus transport system approach; however, this could be too slow. Why?
Another approach is to use shared memory, which multiple processors are capable of accessing at the same time
Once the next hop is known, the input port processor places the packet on the queue (just a part of that shared memory) of the output port
Another approach is to create a hardware switch that connects each input port with each of the output ports
Figure 11.10 – Switch-based Router
What if many input ports need to send to a single output port?
Scheduling is needed; first-in-first-out is simple, but it may not guarantee the required QoS
For example, compare a file transfer application to a video streaming one; the expected QoS is not the same
A priority queue may be used instead of a simple FIFO queue

Multicast Routing
One multicast address, a class D address, defines a set of destinations in contrast to a single one
Figure 11.11 – Multicasting Video
The Internet Group Management Protocol (IGMP) is used to enable a host to join or leave a multicast group
Multicast information must be delivered to all routers in the network
Routers must implement some type of multicast routing algorithm
Not all routers in the network may need to be involved in a multicast communication
Which routers in the network need to be involved in a multicast communication involving all the shown machines?
Figure 11.12 – Routers Involved in Multicasting
What do routers need to do when they receive a multicast packet?
If a minimal tree is created that connects all hosts in the communication, then the answer is simple!
Figure 11.13 – Possible Multicast Trees for the Network in Figure 11.12
A multicast tree can be created to connect all involved hosts, or
A multicast tree can be created to connect each source in a group to all the rest
Whichever is chosen, creating multicast trees, especially for a large number of involved nodes, is not easy
That is why the majority of IP routers do NOT support multicasting
Those routers that support multicasting make up the Mbone, a network within the Internet that supports class D address routing
Mbone utilizes the Distance Vector Multicast Routing Protocol (DVMRP), which has several components
One such component is Reverse-Path Broadcasting (RPB)
RPB allows packets to be broadcast while eliminating the potential loop problems that broadcasting might cause
RPB assumes that a router knows the next link along the shortest path to a given node
When a packet is received at a port, the router examines its source.
If packets from that source are expected at that port (from a previous hop), then the packet is broadcast to all other ports; otherwise it is dropped
The idea of RPB is to forward the packet away from the direction of the source; doing that in all routers eliminates loops
Figure 11.14 – Router Executing Reverse Broadcast Protocol
RPB by itself would not work for multicasting, since not all paths may have hosts that belong to the multicast group
Another component of DVMRP is a pruning algorithm, which limits when a router forwards a message, hence eliminating the improper branches
When a router receives a multicast packet but has no hosts listening to that address, it sends a prune message to the originating router from which the packet came
When the originating router receives the prune message, it stops multicasting in that direction
If the router later needs to receive these multicast packets, it must send a Graft message to the originating router
Figure 11.15 – Pruning

Transport Protocols
The transport layer provides the “connection” the user perceives; the connection is a logical connection
The transport layer provides that perception by acting as an interface between the user and the network protocols
Transport protocols are either connection-oriented or connectionless
Some transport layer functions include:
• Connection management: defines the rules that allow two users to start talking; handshaking may be performed to set up the connection
• Flow control
• Error detection: not all errors are detected at the lower layers; different errors are possible
• Response to users’ requests: for example, a user may request high throughput, low delays, reliable services, etc.
Transmission Control Protocol (TCP)
Connection-oriented protocol that provides a user-to-user byte stream service
It defines a logical connection between two sites and then transmits a byte stream between them
The byte stream is divided into a sequence of segments, which are then sent using a variation of a sliding window flow control protocol
The connection is initiated via handshaking
TCP is reliable, even if the network below it is unreliable
Figure 11.25 – TCP as a User-to-User Service

TCP’s Connection Management
Connection management is the process of establishing, maintaining, and terminating connections
A connection here is more virtual than physical
First establish the connection, then exchange the parameters needed for the connection, such as the sequence numbers used, the number of bytes an entity can receive, etc.
Establishing a connection may seem easy: one end requests a connection and the other end accepts; that is two-way handshaking
Figure 11.27 – Failure of 2-way Handshake Protocol
TCP divides the message (stream of bytes) into segments
Figure 11.26 – TCP Segment
TCP uses three-way handshaking instead
Figure 11.28 – 3-way Handshake Protocol
TCP provides full-duplex communication, so one entity wanting to disconnect does not mean the other is ready to disconnect
That is, the two parties must agree to disconnect
The protocol again uses three-way handshaking to disconnect
Figure 11.29 – TCP Disconnect Protocol

TCP’s Flow Control
TCP entities buffer sent segments, since there is no guarantee that they will be received correctly
Received segments are also buffered, since there is no guarantee that they arrive in sequence
Effectively, the entities use a variation of the sliding window protocol with the following differences:
• In TCP flow control, the sequence number refers to byte sequences instead of packet (or segment) sequences
• Either entity can dynamically alter the size of the other’s sending window
Each entity implements flow control using a credit mechanism, also called a window advertisement
A credit specifies the maximum number of bytes that the entity can receive and buffer from the other entity
The credit is in addition to the bytes that have already been received and buffered but not yet delivered
Upon receiving this information, the other entity can determine how many bytes it can still send before it must wait for acknowledgments (Acks)
Figure 11.30 – Flow Control Using a Credit Mechanism

User Datagram Protocol (UDP)
Connectionless transport layer protocol
Does not guarantee reliable delivery of data
No handshake to establish or terminate connections
Just construct what is to be sent and give it to IP for delivery
If the received data is okay, deliver it; otherwise discard it
No formal mechanism for acknowledging errors or for flow control
Why then use UDP?
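To make the connectionless nature of UDP concrete, here is a minimal Python sketch that sends one datagram over the loopback interface using the standard `socket` module. The function name `udp_round_trip`, the loopback address, and the 2-second timeout are assumptions chosen for illustration; note that there is no connect/accept handshake and no acknowledgment anywhere in the code.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        # No handshake: construct the datagram and hand it over for delivery.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data


print(udp_round_trip(b"hello over UDP"))
```

If the datagram were lost, `recvfrom` would simply time out; any retransmission or acknowledgment would have to be added by the application itself.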