Unit 01.01.01 CS 5220: COMPUTER COMMUNICATIONS
Evolution of Communication Networks
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

What is a Communication Network?
⚫ Communication Network: the equipment (hardware & software) and facilities that provide the basic communication service
⚫ Equipment
  o Routers, servers, switches, …
⚫ Facilities
  o Copper wires, optical fiber, …

Network Architecture Evolution
[Figure: information transfer per second (log scale, 1.0E+00 to 1.0E+14) versus year (1850 to 2000): Telegraph networks; Telephone networks; Internet, Optical & Wireless networks; Next Generation Internet (?)]

Telegraph Networks
⚫ Telegraph: a message is transmitted across a network using signals
  o Drums, beacons, mirrors, smoke, flags, semaphores…
  o Electricity, light
⚫ Digital Communications
  o Morse code converts a text message into a sequence of dots & dashes
  o Use transmission system designed to convey dots and dashes

Morse Code
A ·—      J ·———    S ···     2 ··———
B —···    K —·—     T —       3 ···——
C —·—·    L ·—··    U ··—     4 ····—
D —··     M ——      V ···—    5 ·····
E ·       N —·      W ·——     6 —····
F ··—·    O ———     X —··—    7 ——···
G ——·     P ·——·    Y —·——    8 ———··
H ····    Q ——·—    Z ——··    9 ————·
I ··      R ·—·     1 ·————   0 —————

Electric Telegraph Networks
⚫ Electric telegraph networks grew explosively
⚫ Message switching & Store-and-Forward operation
⚫ Key elements: Framing, Multiplexing, Addressing, Routing, Forwarding
[Figure: messages travel from Source through Switches to Destination]

Elements of Telegraph Networks
⚫ Digital transmission
  o Text messages converted into symbols
  o Transmission system designed to convey symbols
⚫ Multiplexing
  o Framing needed to recover text characters
⚫ Message Switching
  o Messages contain source & destination addresses
  o Store-and-Forward: messages forwarded hop-by-hop across network
  o Routing according to destination address

Bell’s Telephone
⚫ Alexander G.
Bell (1876), working on a harmonic telegraph to multiplex telegraph signals, discovered that voice signals can be transmitted directly
  o Microphone converts voice pressure variation into analogous electrical signal
  o Loudspeaker converts electrical signal back into sound
⚫ Basic telephone service involves two-way, real-time transmission of voice signals across a network
⚫ Signaling required to establish a call
[Figure: Signaling + voice signal transfer]

The N^2 Problem
⚫ Initially, point-to-point direct communications: N users fully connected directly need N(N - 1)/2 connections
  o Requires too much space for cables
  o Inefficient & costly since connections not always on
⚫ Example: N = 1000 gives N(N - 1)/2 = 499500
[Figure: users 1, 2, 3, 4, …, N, each wired to every other user]

Circuit Switching is Connection-oriented
⚫ Patchcord panel switch invented in 1878
⚫ Operators connect users on demand
  o Establish circuit to allow electrical current to flow from inlet to outlet
  o Only N connections required to central office
[Figure: users 1, 2, 3, …, N-1, N, each wired to one central switch]

Hierarchical Tele-Network Structure
⚫ End-to-end connection requires collaborative switching
[Figure: central offices (CO = central office switching) serve the last mile and connect through toll trunks to tandem switches]

Elements of Telephone Networks
⚫ Digital transmission & switching
  o Digital voice; Time Division Multiplexing
⚫ Circuit switching: connection oriented
  o User signals for call setup and tear-down
  o Route selected during connection setup
  o End-to-end connection across network
  o Signaling coordinates connection setup
⚫ Hierarchical Network Structure
  o Decimal numbering system
  o Hierarchical structure; simplified routing; scalability

Network Architecture Evolution
⚫ Telegraph Networks
  o Message switching & store-and-forward
⚫ Telephone Networks
  o Circuit Switching and connection oriented
⚫ Computer Networks and the Internet
  o Packet switching
  o Virtual circuit switching
⚫ Next-Generation Internet
  o ???

Summary of the Lesson
⚫ History often repeats itself

Unit 01.01.02 CS 5220: COMPUTER COMMUNICATIONS
Computer Network Evolution
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Computer Network Evolution
⚫ 1960s: Terminals access shared host computer
  o SAGE; SABRE airline reservation system
  o Tree-topology terminal-oriented networks
⚫ 1970s: Computers connect directly to each other
  o ARPANET packet switching network
  o TCP/IP Internet protocols
  o Ethernet local area network
⚫ 1980s - 2000s: New applications and Internet growth
  o Commercialization of Internet
  o E-mail, file transfer, web, P2P, …
  o Internet traffic surpasses voice traffic

Terminal-Oriented Networks
⚫ Early computer systems very expensive; time-sharing methods allowed multiple terminals to share local computer
⚫ Remote access via telephone modems
[Figure: terminals attach to the host computer directly or, remotely, through modems across the telephone network]

Medium Access Control
⚫ Dedicated communication lines were expensive
⚫ Terminals generated messages sporadically
⚫ Frames carried messages to/from attached terminals
⚫ Address in frame header identified terminal
⚫ Medium Access Controls for sharing a line in arbitrated manner
⚫ Example: Polling protocol on a multi-drop line
[Figure: polling frames & output frames go out to the terminals on a multi-drop line; input frames come back]

Multiplexing
⚫ Multiplexer allows a line to carry frames to/from multiple terminals
⚫ Frames are buffered at multiplexer until line becomes available, i.e. store-and-forward
⚫ Header carries other control information for framing
[Figure: frames (Header | Information | CRC) from the terminals pass through the multiplexer to the host computer]

Error Control Protocol
⚫ Communication lines introduced errors
⚫ Error checking codes used on frames
  o “Cyclic Redundancy Check” (CRC) calculated based on frame header and information payload, and appended
  o Header also carries ACK/NAK control information
  o Retransmission requested when errors detected
[Figure: frames (Header | Information | CRC) exchanged between terminal and host]

Computer-to-Computer Networks
⚫ As cost of computing dropped, terminal-oriented networks viewed as too inflexible and costly
⚫ Need to develop flexible computer networks
  o Interconnect computers as required
  o Support many applications
⚫ Application Examples
  o File transfer between arbitrary computers
  o Execution of a program on another computer

Packet Switching
⚫ Network should support multiple applications
  o Transfer arbitrary message size
  o Low delay for interactive applications
  o Store-and-forward operation on long messages could induce high delay on interactive messages
⚫ Packet switching introduced
  o Network transfers packets using store-and-forward
  o Packets have maximum length
  o Break long messages into multiple packets
  o By switching, packets delivered (and reassembled) at destination

The ARPANET
⚫ The vulnerability of the telephone system was a concern.
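The packet switching steps listed above (packets with a maximum length, long messages broken into multiple packets, reassembly at the destination), which the ARPANET adopted, can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `MAX_PAYLOAD`, `struct packet`, `segment`, and `reassemble` are hypothetical names, and real packet headers carry addresses and error-control fields that are omitted here.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 8   /* hypothetical maximum packet payload, in bytes */

/* Hypothetical packet layout: a sequence number plus a bounded payload. */
struct packet {
    unsigned seq;                 /* position of this packet in the message */
    size_t len;                   /* number of payload bytes actually used */
    char payload[MAX_PAYLOAD];
};

/* Break msg (msg_len bytes) into packets of at most MAX_PAYLOAD bytes.
 * Returns the number of packets written; returns 0 if out[] is too small
 * (or if the message is empty). */
size_t segment(const char *msg, size_t msg_len,
               struct packet *out, size_t max_packets)
{
    size_t count = (msg_len + MAX_PAYLOAD - 1) / MAX_PAYLOAD;
    if (count > max_packets)
        return 0;
    for (size_t i = 0; i < count; i++) {
        size_t off = i * MAX_PAYLOAD;
        size_t n = (msg_len - off < MAX_PAYLOAD) ? msg_len - off : MAX_PAYLOAD;
        out[i].seq = (unsigned)i;
        out[i].len = n;
        memcpy(out[i].payload, msg + off, n);
    }
    return count;
}

/* Rebuild the message at the destination. Packets may arrive in any order;
 * each one is copied to the offset implied by its sequence number.
 * Returns the message length, or 0 if buf is too small. */
size_t reassemble(const struct packet *pkts, size_t count,
                  char *buf, size_t cap)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        size_t off = (size_t)pkts[i].seq * MAX_PAYLOAD;
        if (off + pkts[i].len > cap)
            return 0;
        memcpy(buf + off, pkts[i].payload, pkts[i].len);
        total += pkts[i].len;
    }
    return total;
}
```

Because each packet carries its own sequence number, the destination can place it correctly even when the network delivers packets out of order, which is the property a connectionless packet network relies on.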
[Figure: (a) Telephone system structure; (b) Distributed switching system structure]

The ARPANET Design
⚫ Connection-less packet transmission
⚫ Packets are encapsulated in frames
⚫ Error control uses check bits
⚫ Destinations identified by unique addresses
⚫ Routing tables at the packet switches
⚫ Messages are segmented into packets
⚫ End-to-end congestion control
⚫ Flow control prevents buffer overflow

ARPANET Applications
⚫ ARPANET (NSF-NET) introduced new applications
  o Email, remote login, file transfer, …
[Figure: ARPANET map with nodes AMES, McCLELLAN, UTAH, BOULDER, GWC, CASE, RADC, ILL, CARN, LINC, USC, MIT, MITRE, UCSB, STAN, SCD, ETAC, UCLA, RAND, TINKER, BBN, HARV, NBS]

Ethernet Local Area Network
⚫ In 1980s, affordable workstations available
⚫ Need for low-cost, low error rate, high-speed network, possible using coaxial cable
⚫ Broadcasting, medium access control
⚫ Network interface card with a unique address
⚫ Ethernet is the standard for high-speed wired access to computer networks

Summary of the Lesson
⚫ Services and Applications drive network architecture design

Unit 01.01.03 CS 5220: COMPUTER COMMUNICATIONS
Examples of Protocols and Services
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Services & Applications
⚫ Service: information transfer capability
  o Internet transfer of individual block of information
  o Internet reliable transfer of a stream of bytes
  o Real-time transfer of a voice signal
⚫ Applications build on communication services
  o E-mail & web build on reliable stream service
⚫ New applications build on multiple networks
  o SMS builds on Internet reliable stream service and cellular telephone text messaging

Layers, Services & Protocols
⚫ The overall communications process between machines connected across one or more networks is very complex
⚫ Layering partitions related communications functions into groups that are manageable
⚫ Each layer provides a service to the layer above
⚫ Each layer operates according to a protocol

1. DNS
[Figure: Q. www.nytimes.com? A. 64.15.247.200]
⚫ User clicks on http://www.nytimes.com/
⚫ URL contains Internet name of machine (www.nytimes.com), but not Internet address
⚫ Internet needs Internet address to send information to a machine
⚫ Browser software uses Domain Name System (DNS) protocol to send query for Internet address
⚫ DNS responds with Internet address

2. TCP
[Figure: TCP Connection Request, From: 128.100.11.13 Port 1127, To: 64.15.247.200 Port 80; then ACK, TCP Connection Request, From: 64.15.247.200 Port 80, To: 128.100.11.13 Port 1127; then ACK]
⚫ Browser software uses HTTP to send request for document
⚫ HTTP server waits for requests by listening to a well-known port number (80 for HTTP)
⚫ HTTP client sends request messages through an “ephemeral port number,” e.g. 1127
⚫ HTTP needs a Transmission Control Protocol (TCP) connection between the HTTP client and HTTP server to transfer messages reliably

3. HTTP
[Figure: GET / HTTP/1.1; then 200 OK; then Content]
⚫ HTTP client sends its request message: “GET / HTTP/1.1”
⚫ HTTP server sends a status response: “200 OK”
⚫ HTTP server sends requested file
⚫ Browser displays document
⚫ Clicking a link sets off a chain of events across the Internet!
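The three steps above (DNS lookup, TCP connection, HTTP request and response) map fairly directly onto a handful of POSIX calls. The following is a minimal sketch, not a full HTTP client: the function names `build_get_request` and `fetch_reply_start` are hypothetical, the standard `getaddrinfo` call is assumed for the name-to-address step, and the `Connection: close` header is an added assumption so that the server closes the connection after replying.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 3 helper: format a minimal HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Steps 1 to 3 end to end: resolve the name, open a TCP connection to the
 * well-known HTTP port 80, send the GET, and copy the first bytes of the
 * reply (which start with the status line) into reply.
 * Returns the number of bytes read, or -1 on failure. */
int fetch_reply_start(const char *host, char *reply, size_t cap)
{
    struct addrinfo hints, *res, *ai;
    char req[512];
    int sd = -1, len;
    ssize_t n;

    if (cap == 0)
        return -1;
    len = build_get_request(req, sizeof(req), host, "/");
    if (len < 0)
        return -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;                 /* TCP */
    if (getaddrinfo(host, "80", &hints, &res) != 0)  /* 1. DNS: name -> address */
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {   /* 2. TCP: connect */
        sd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (sd == -1)
            continue;
        if (connect(sd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                                   /* connected */
        close(sd);
        sd = -1;
    }
    freeaddrinfo(res);
    if (sd == -1)
        return -1;

    if (write(sd, req, (size_t)len) != len) {        /* 3. HTTP: send request */
        close(sd);
        return -1;
    }
    n = read(sd, reply, cap - 1);                    /* start of the response */
    close(sd);
    if (n < 0)
        return -1;
    reply[n] = '\0';
    return (int)n;
}
```

The client never chooses its own port here: the kernel assigns the ephemeral port when `connect` is called, while the server side is reached through the well-known port passed to `getaddrinfo`.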
Protocols
⚫ A protocol is a set of precise & unambiguous rules that governs
  o How two or more communicating entities in a layer are to interact
  o Messages that can be sent and received
  o Actions that are to be taken when a certain event occurs
⚫ The purpose of a protocol is to provide a service to the layer above

Example: HTTP
⚫ HTTP is an application layer protocol
⚫ Retrieves documents on behalf of a browser application program
⚫ HTTP specifies fields in request messages and response messages
  o Request types; Response codes
  o Content type, options, cookies, …
⚫ HTTP specifies actions to be taken upon receipt of certain messages

HTTP uses service of TCP
[Figure, animated over six slides: the HTTP client (ephemeral port 1127) hands GET to TCP; a TCP segment labeled 80, 1127 carries GET to the HTTP server on port 80; the server hands RESPONSE to TCP; a TCP segment labeled 1127, 80 carries RESPONSE back to the HTTP client]

Example: DNS Protocol
⚫ DNS protocol is an application layer protocol
⚫ DNS is a distributed database that resides in multiple machines in the Internet
⚫ DNS protocol allows queries of different types
⚫ DNS usually involves short messages and so uses service provided by UDP
⚫ Well-known port 53
[Figure: query steps 1 to 6 among the Local Name Server, the Root Name Server, and the Authoritative Name Server]
⚫ Local Name Server: resolves frequently-used names
  o E.g., University department, ISP
⚫ Root Zone Name Servers: 13 authorities
  o Resolves query or refers query to Authoritative Name Server
⚫ Authoritative Name Server: last resort
  o Every machine must register its address with at least two servers

Summary
⚫ Services: a protocol provides a communications service to the layer above
⚫ DNS servers are one primary target of cyber attacks

Unit 01.02.01 CS 5220: COMPUTER
COMMUNICATIONS
Layered Architecture and OSI Model
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

7-Layer OSI Model
[Figure, repeated on three slides: two communicating end systems each run the Application, Presentation, Session, Transport, Network, Data Link, and Physical Layers under the application; one or more network nodes between them run only the Network, Data Link, and Physical Layers; the layers above the Network Layer use end-to-end protocols]

Why Layering Architectures?
⚫ Layering simplifies design, implementation, and testing by partitioning the communication process into manageable groups of functions
⚫ Protocol in each layer can be designed separately from those in other layers
⚫ Protocol makes “calls” for services from layer below
⚫ Layering provides flexibility for modifying and evolving protocols and services
⚫ Non-layered architectures are costly, inflexible, and soon obsolete

Physical Layer
⚫ Transfers bits across a link
⚫ Definition & specification of the physical aspects
  o Mechanical: cable, plugs, pins...
  o Electrical/optical: modulation, signal strength, voltage levels, bit times, …
  o Functional/procedural: how to activate, maintain, and deactivate physical links…
⚫ Ethernet, DSL, cable modem, telephone modems…
⚫ Twisted-pair cable, coaxial cable, optical fiber, radio, …

Data Link Layer
⚫ Transfers frames across direct connections
  o Groups bits into frames
  o Detection of bit errors; Retransmission of frames
⚫ Activation, maintenance of data link connections
⚫ Medium access control for local area networks
⚫ Node-to-node flow control
[Figure: peer Data Link Layers exchange frames; peer Physical Layers exchange bits]

Network Layer
⚫ Transfers packets across multiple links and/or multiple networks
⚫ Addressing must scale to large networks
⚫ Nodes execute routing algorithm to determine paths across the network
⚫ Routing protocol means the procedure used to select routing paths
⚫ Forwarding transfers packet across a node
⚫ Congestion control to deal with traffic surges
⚫ Most complex layer in the OSI reference model

Internetworking
⚫ Internetworking is part of network layer and provides transfer of packets across multiple and possibly dissimilar networks
⚫ Gateways (routers) direct packets across networks
[Figure: hosts (H) attached to Net 1 through Net 5, interconnected by gateways (G)]

Internetworking - II
[Figure: the same internetwork of hosts (H), gateways (G), and Net 1 through Net 5, with an Ethernet LAN and an ATM network of ATM switches shown as example networks]

Transport Layer
⚫ Transfers segments end-to-end from process in a machine to process in another machine
  o Reliable stream transfer or quick-and-simple single-block transfer
  o Message segmentation and reassembly
  o Connection setup, maintenance, and release
[Figure: the Transport Layers of the two end systems communicate end-to-end over the Network Layers of the communication network]

Application & Upper Layers
⚫ Application Layer: Provides services that are frequently required by applications: DNS, HTTP web access, file transfer, email…
⚫ Presentation Layer: machine-independent representation of data…
⚫ Session Layer: dialog management, recovery from errors, …
[Figure: the Presentation and Session Layers are incorporated into the Application Layer, which then sits directly on the Transport Layer]

Lesson Summary
⚫ The overall communication process between machines connected across one or more networks is very complex
⚫ Layering partitions related communication functions into groups that are manageable

Unit 01.02.02 CS 5220: COMPUTER COMMUNICATIONS
OSI Unified View of Protocols and Services
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

OSI Unified View: layers
⚫ A layer is a set of related communication functions managed and grouped together
⚫ Layer n in one machine interacts with layer n in another machine to provide a service to its upper layer n+1
⚫ The entities comprising the corresponding layers on different machines are called peer processes.
⚫ The processes at layer n are referred to as layer n entities.

OSI Unified View: Protocols
⚫ Peer processes at the same layer n follow a set of precise and unambiguous rules called the layer-n protocol.
⚫ Layer-n peer processes communicate by exchanging Protocol Data Units (PDUs)
[Figure: two n entities exchange n-PDUs under the layer n peer protocol]

OSI Unified View: Services
⚫ Communication between peer processes is virtual and actually indirect
⚫ Layer n+1 transfers information by invoking the services provided by layer n
⚫ Services are available at Service Access Points (SAPs)
⚫ Each layer passes data & control information to the layer below it until the physical layer is reached and transfer occurs
⚫ The data passed to the layer below is called a Service Data Unit (SDU); SDUs are encapsulated in PDUs

Layers, Services & Protocols
[Figure: each n+1 entity passes an n-SDU through the n-SAP to its n entity, which adds a header (H) to form the n-PDU exchanged with the peer n entity]

Encapsulation
[Figure: the PDU at layer n+1 (layer n+1 header plus SDU) becomes the layer n SDU, i.e. the payload that layer n wraps in its own header and trailer]

Headers & Trailers
⚫ Each protocol uses a header that carries addresses, sequence #...
[Figure: APP DATA gains AH at the Application Layer, TH at the Transport Layer, NH at the Network Layer, and DH plus CRC at the Data Link Layer; the Physical Layer transfers the bits]

Bandwidth Utilization
[Figure: data link frame DH | NH | TH | AH | APP DATA | CRC]
Utilization = APP DATA / (APP DATA + HEADERS + CRC)

Encapsulation in TCP/IP
⚫ TCP Header contains source & destination port numbers
⚫ IP Header contains source and destination IP addresses; transport protocol type
⚫ Ethernet Header contains source & destination MAC addresses; network protocol type
[Figure: HTTP Request; then TCP header | HTTP Request; then IP header | TCP header | HTTP Request; then Ethernet header | IP header | TCP header | HTTP Request | FCS]

Segmentation & Reassembly
⚫ A layer may impose a limit on the size of a data block that it can transfer
⚫ Thus a layer-n SDU may be too large to be handled as a single unit by layer (n-1)
[Figure: (a) Segmentation: one n-SDU is split into several n-PDUs; (b) Reassembly: several n-PDUs are merged back into one n-SDU]
⚫ Sender side: SDU is segmented into multiple PDUs
⚫ Receiver side: SDU is
reassembled from sequence of PDUs

Connectionless & Connection-Oriented Services
⚫ Connection-Oriented
  o Three phases:
    1. Connection setup between two SAPs to initialize state information
    2. SDU transfer
    3. Connection release
  o E.g. TCP, ATM
⚫ Connectionless
  o Immediate SDU transfer
  o No connection setup
  o E.g. UDP, IP

Why Internetworking?
⚫ To build a “network of networks” or Internet
  o operating over multiple, coexisting, different network technologies
  o providing ubiquitous connectivity through IP packet transfer
[Figure: hosts (H) on several networks joined by gateways (G); the internetwork offers a connection-oriented reliable stream service and a connectionless user datagram service]

Internet Protocol (IP) Approach
⚫ IP packets transfer information across Internet: Host A IP → router → router … → router → Host B IP
[Figure: Host A and Host B each run a Transport Layer, an Internet Layer, and a Network Interface; the routers between them run only an Internet Layer over their Network Interfaces and attach to different networks]

Lesson Summary
⚫ The unified view enables a common understanding of the protocols and services found in different layers.

Unit 01.02.03 CS 5220: COMPUTER COMMUNICATIONS
TCP/IP: Architecture and Routing Example
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

TCP/IP Protocol Suite
[Figure: HTTP and SMTP run over TCP (reliable stream service); DNS and RTP run over UDP (best-effort user datagram service); TCP and UDP run over IP (best-effort connectionless packet transfer, with ICMP and ARP), which runs over Network interface 1, 2, or 3]

Internet Protocol (IP)
⚫ Routers (gateways) interconnect different networks
⚫ Host computers prepare IP packets and transmit them over their attached network
⚫ Routers forward IP packets across networks
⚫ Best-effort IP transfer service
[Figure: a router connecting Net 1 and Net 2]

Internet Addresses
⚫ Hierarchical address: Net ID + Host ID
⚫ IP packets routed according to Net ID
⚫ Routers compute routing tables using distributed algorithm
[Figure: hosts (H) on Net 1 through Net 5, interconnected by gateways (G)]

Physical Addresses
⚫ LANs assign physical addresses to physical attachment to the network
⚫ The network uses its own address to transfer packets or frames to the appropriate destination
⚫ IP address needs to be resolved to physical address at each IP network interface, by address resolution protocol (ARP)
⚫ Example: Ethernet uses 48-bit addresses
  o Each NIC has globally unique physical address (called MAC address)
  o First 24 bits identify NIC manufacturer; second 24 bits are serial number

Example
[Figure: Server (1,1) s, Workstation (1,2) w, and router interface (1,3) r share an Ethernet (netid=1); router interface (2,1) connects over a PPP link (netid=2) to PC (2,2)]
*PPP does not use addresses (PPP stands for Point-to-Point Protocol)

              netid   hostid   Physical address
server          1       1        s
workstation     1       2        w
router          1       3        r
router          2       1        -
PC              2       2        -

Encapsulation
⚫ Ethernet header contains:
  o source and destination physical addresses
  o network protocol type (e.g. IP)
[Figure: IP header | IP Payload; then Ethernet header | IP header | IP Payload | FCS]

Example: IP packet from workstation to server
[Figure: the example network, with an Ethernet frame addressed w, s carrying the IP packet (1,2), (1,1)]
1. IP packet has (1,2) IP address for source and (1,1) IP address for destination
2. IP table at workstation indicates (1,1) connected to same network, so IP packet is encapsulated in Ethernet frame with addresses w and s
3. Ethernet frame is broadcast by workstation NIC and captured by server NIC
4. NIC examines protocol type field and then delivers packet to its IP layer

Example: IP packet from server to PC
[Figure: the example network, with an Ethernet frame addressed s, r carrying the IP packet (1,1), (2,2)]
1. IP packet has (1,1) and (2,2) as IP source and destination addresses
2. IP table at server indicates packet should be sent to router, so IP packet is encapsulated in Ethernet frame with addresses s and r
3. Ethernet frame is broadcast by server NIC and captured by router NIC
[Figure: the router forwards the IP packet (1,1), (2,2) over the PPP link to the PC]
4. Router NIC examines protocol type field and delivers packet to its IP layer
5. IP layer examines IP packet destination address and determines IP packet should be routed to (2,2)
6. Router’s table indicates (2,2) is directly connected via PPP link
7. IP packet is encapsulated in PPP frame and delivered to PC
8. PPP at PC examines protocol type field and delivers packet to PC IP layer

Lesson Summary
⚫ Encapsulation is key to layering
⚫ Layers work together for routing

Unit 01.03.01 CS 5220: COMPUTER COMMUNICATIONS
Berkeley Socket API - I
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Berkeley Socket API
⚫ Berkeley UNIX Sockets API
  o Abstraction for applications to send & receive data
  o Applications create sockets that “plug into” network
  o Applications write/read to/from sockets
  o Implemented in the kernel
  o Facilitates development of network applications
  o Hides details of underlying protocols & mechanisms
⚫ Also in Windows, Linux, and other OS’s

Communications through Sockets
[Figure: on both client and server, an application in user space reaches a socket through a descriptor at the socket interface; in the kernel, each socket has a port number and sits on the underlying communication protocols, which are joined by the communications network]
⚫ Application references a socket through a descriptor
⚫ Socket bound to a port number

Transport Protocols
⚫ Host computers run two transport protocols on top of IP to enable process-to-process communications
⚫ User Datagram Protocol (UDP) enables best-effort connectionless transfer of individual block of information
⚫ Transmission Control Protocol (TCP) enables connection-oriented reliable transfer of a stream of bytes
⚫ Two services through Sockets: connection-oriented and connection-less

Stream Mode of Service
Connection-oriented (TCP)
⚫ First, set up connection between two peer application processes
⚫ Then, reliable bidirectional in-sequence transfer of byte stream (boundaries not preserved in transfer)
⚫ Multiple write/read between peer processes
⚫ Finally, connection release
Connectionless (UDP)
⚫ Immediate transfer of one block of information (boundaries preserved)
⚫ No setup overhead & delay
⚫ Destination address with each block
⚫ Send/receive to/from multiple peer processes
⚫ Best-effort service only
  o Possible out-of-order
  o Possible loss

Client & Server Differences
⚫ Server
  o Specifies well-known port # when creating socket
  o May have multiple IP addresses (net interfaces)
  o Waits passively for client requests
⚫ Client
  o Assigned ephemeral port #
  o Initiates communications
with server
  o Needs to know server’s IP address & port #
      o DNS for URL & server well-known port #
      o Server learns client’s address & port #

Socket Calls for Connection-Oriented Mode
[Figure, repeated on each of the following slides: the Server calls socket(), bind(), listen(), accept() (blocks), read(), write(), close(); the Client calls socket(), connect(), write(), read(), close(); connect negotiation and data pass between them]

Server does Passive Open
⚫ socket call creates socket to listen for connection requests
⚫ Server specifies type: TCP (stream)
⚫ socket call returns: non-negative integer descriptor; or -1 if unsuccessful

⚫ bind assigns local address & port # to socket with specified descriptor
⚫ Can wildcard IP address for multiple net interfaces
⚫ bind call returns: 0 (success); or -1 (failure)
⚫ Fails, for example, if port # is already in use and the reuse option is not set

⚫ listen indicates to TCP readiness to receive connection requests for socket with given descriptor
⚫ Parameter specifies max number of requests that may be queued while waiting for server to accept them
⚫ listen call returns: 0 (success); or -1 (failure)

⚫ Server calls accept to accept incoming requests
⚫ accept blocks if queue is empty

Client does Active Open
⚫ socket call creates socket to connect to server
⚫ Client specifies type: TCP (stream)
⚫ socket call returns: non-negative integer descriptor; or -1 if unsuccessful

Socket Calls
for Connection-Oriented Mode
⚫ connect establishes a connection on the local socket with the specified descriptor to the specified remote address and port #
⚫ connect returns 0 if successful; -1 if unsuccessful
Note: connect initiates TCP three-way handshake

⚫ accept wakes with incoming connection request
⚫ accept fills client address & port # into address structure
⚫ accept call returns: descriptor of new connection socket (success); or -1 (failure)
⚫ Server uses the new socket for data transfer with the client
⚫ Original socket continues to listen for new requests

Data Transfer
⚫ Client or server call write to transmit data into a connected socket
⚫ write call returns: # bytes transferred (success); or -1 (failure); blocks until all data transferred
⚫ Client or server call read to receive data from a connected socket
⚫ read specifies: socket descriptor; pointer to a buffer; amount of data
⚫ read call returns: # bytes read (success); or -1 (failure); blocks if no data arrives
Note: write and read can be called multiple times to transfer byte streams in both directions

Connection Termination
⚫ Client or server call close when socket is no longer needed
⚫ close specifies the socket descriptor
⚫ close call returns: 0 (success); or -1 (failure)
Note: close initiates TCP graceful close sequence

Summary of the Lesson
⚫ Socket API hides details of underlying protocols & mechanisms

Unit 01.03.02 CS 5220: COMPUTER COMMUNICATIONS
Berkeley Socket API - II
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Stream Mode of Service
Connectionless (UDP)
⚫ Immediate transfer of one block of information (boundaries preserved)
⚫ No setup overhead & delay
⚫ Destination address with each block
⚫ Send/receive to/from multiple peer processes
⚫ Best-effort service only
  o Possible out-of-order
  o Possible loss

Socket Calls for Connection-Less Mode
[Figure, repeated on each of the following slides: the Server calls socket(), bind(), recvfrom() (blocks until server receives data from client), sendto(), close(); the Client calls socket(), sendto(), recvfrom(), close(); data passes in both directions]

Server starts first
⚫ socket call creates socket of type UDP (datagram)
⚫ socket call returns: descriptor; or -1 if unsuccessful
⚫ bind assigns local address & port # to socket with specified descriptor

⚫ recvfrom copies bytes received in specified socket into a specified location
⚫ recvfrom blocks until data arrives

Client started
⚫ socket creates socket of type UDP (datagram)
⚫ socket call returns: descriptor; or -1 if unsuccessful

⚫ sendto transfers bytes in buffer to specified socket
⚫ sendto specifies: socket descriptor; pointer to a buffer; amount of data; flags to control transmission behavior; destination address & port #; length of destination address structure
⚫ sendto returns: # bytes sent; or -1 if unsuccessful

⚫ recvfrom wakes when data arrives
⚫ recvfrom specifies: socket descriptor; pointer to a buffer to put data; max # bytes to put in buffer; control flags; copies: sender address & port #; length of sender address structure
⚫ recvfrom returns # bytes received or -1 (failure)
Note: recvfrom returns data from at most one send, i.e. from one datagram

Socket Close
⚫ Client or server call close when socket is no longer needed
⚫ close specifies the socket descriptor
⚫ close call returns: 0 (success); or -1 (failure)

Example-I: TCP Echo Server
⚫ As an illustration of the use of system calls and functions, let’s see two programs communicate via TCP.
⚫ The client prompts a user to type a line of text, sends it to the server, and reads the data back from the server.
⚫ The server acts as a simple echo server.
⚫ In this example, each program expects a fixed number of bytes from the other end, defined by BUFLEN.
⚫ The example code is given in the textbook, Chapter 2.4

TCP Echo Server - Binding

    /* Bind an address to the socket */
    bzero((char *)&server, sizeof(struct sockaddr_in));
    server.sin_family = AF_INET;
    server.sin_port = htons(port);               /* port in network byte order */
    server.sin_addr.s_addr = htonl(INADDR_ANY);  /* accept on any local interface */
    if (bind(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        fprintf(stderr, "Can't bind name to socket\n");
        exit(1);
    }

TCP Echo Server - Connections

    /* queue up to 5 connect requests */
    listen(sd, 5);

    while (1) {
        client_len = sizeof(client);
        if ((new_sd = accept(sd, (struct sockaddr *)&client, &client_len)) == -1) {
            fprintf(stderr, "Can't accept client\n");
            exit(1);
        }

TCP Echo Server - Repeated Byte Reads

        /* Repeated calls to read until all BUFLEN bytes are received */
        bp = buf;
        bytes_to_read = BUFLEN;
        while ((n = read(new_sd, bp, bytes_to_read)) > 0) {
            bp += n;
            bytes_to_read -= n;
        }
        printf("Rec'd: %s\n", buf);

        write(new_sd, buf, BUFLEN);   /* echo the data back */
        printf("Sent: %s\n", buf);
        close(new_sd);
    }   /* end of the accept loop opened on the previous slide */

TCP Echo Client - Name-to-Address

    bzero((char *)&server, sizeof(struct sockaddr_in));
    server.sin_family = AF_INET;
    server.sin_port = htons(port);
    /* Resolve the server's host name to an address */
    if ((hp = gethostbyname(host)) == NULL) {
        fprintf(stderr, "Can't get server's address\n");
        exit(1);
    }
    bcopy(hp->h_addr, (char *)&server.sin_addr, hp->h_length);

TCP Echo Client - Connection

    /* Connecting to the server */
    if (connect(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        fprintf(stderr, "Can't connect\n");
        exit(1);
    }
    printf("Connected: server's address is %s\n", hp->h_name);

TCP Echo Client - Repeated Reads

    printf("Receive:\n");
    bp = rbuf;
    bytes_to_read = BUFLEN;
    /* Keep reading until BUFLEN bytes have arrived or the peer closes */
    while ((n = read(sd, bp, bytes_to_read)) > 0) {
        bp += n;
        bytes_to_read -= n;
    }
    printf("%s\n", rbuf);

Example-II: UDP Echo Server

    while (1) {
        client_len = sizeof(client);
        /* Block until one datagram arrives; remember who sent it */
        if ((n = recvfrom(sd, buf, MAXLEN, 0,
                          (struct sockaddr *)&client, &client_len)) < 0) {
            fprintf(stderr, "Can't receive datagram\n");
            exit(1);
        }
        /* Echo the datagram back to the sender */
        if (sendto(sd, buf, n, 0,
                   (struct sockaddr *)&client, client_len) != n) {
            fprintf(stderr, "Can't send datagram\n");
            exit(1);
        }
    }

Example: UDP Echo Client

    gettimeofday(&start, NULL);   /* start delay measurement */
    server_len = sizeof(server);
    if (sendto(sd, sbuf, data_size, 0,
               (struct sockaddr *)&server, server_len) == -1) {
        fprintf(stderr, "sendto error\n");
        exit(1);
    }
    if (recvfrom(sd, rbuf, MAXLEN, 0,
                 (struct sockaddr *)&server, &server_len) < 0) {
        fprintf(stderr, "recvfrom error\n");
        exit(1);
    }
    gettimeofday(&end, NULL);     /* end delay measurement */

Summary: UDP Reliability
⚫ As UDP is unreliable, users may have to take care of reliability assurance by themselves.
⚫ LAN vs. WAN
⚫ A timeout mechanism avoids waiting forever
⚫ Retransmission recovers a lost message
⚫ Reordering and de-duplication are required for reliability

Unit 01.03.03 CS 5220: COMPUTER COMMUNICATIONS
Digital Communication Fundamentals

Questions of Interest
⚫ How long will it take to transmit a message?
    ⚫ How many bits are in the message (text, image)?
    ⚫ How fast does the network/system transfer information?
⚫ Can a network/system handle a voice (video) call?
    ⚫ How many bits/second does voice/video require?
    ⚫ How long will it take to transmit a message?
⚫ What transmission speed is possible over radio, copper cables, fiber, ...?

Bits, numbers, information
⚫ Bit: number with value 0 or 1
    ⚫ n bits: digital representation for 0, 1, ..., $2^n - 1$
    ⚫ Byte or octet, n = 8
    ⚫ Computer word, n = 16, 32, or 64
⚫ n bits allow enumeration of $2^n$ possibilities
    ⚫ n-bit field in a header
    ⚫ n-bit representation of a voice sample
    ⚫ Message consisting of n bits
⚫ The number of bits required to represent a message is a measure of its information content; more bits means more content

Block vs. Stream Information
Block
⚫ Information that occurs in a single block
    ⚫ Text message
    ⚫ Data file
    ⚫ JPEG image
⚫ Size = bits/block or bytes/block
    ⚫ 1 kbyte = $2^{10}$ bytes
    ⚫ 1 Mbyte = $2^{20}$ bytes
    ⚫ 1 Gbyte = $2^{30}$ bytes
Stream
⚫ Information that is produced & transmitted continuously
    ⚫ Real-time voice
    ⚫ Streaming video
⚫ Bit rate = bits/second
    ⚫ 1 kbps = $10^3$ bps
    ⚫ 1 Mbps = $10^6$ bps
    ⚫ 1 Gbps = $10^9$ bps

Delay - Propagation Delay
⚫ The delay of communication between two nodes has two components, the propagation delay and the transmission delay
⚫ The propagation delay $t_{prop} = d/v$
    ⚫ $t_{prop}$: time for the signal to propagate across the medium
    ⚫ $d$: distance between the two nodes in meters
    ⚫ $v$: signal propagation speed in the transmission medium ($3 \times 10^8$ m/s for light in vacuum)

Delay - Transmission Delay
⚫ The transmission delay: $t_{trans} = L/R$
    ⚫ $L$: number of bits in the message
    ⚫ $R$: bandwidth (bit rate) of the digital transmission system in bps
⚫ Overall delay = $t_{prop} + t_{trans} = d/v + L/R$
    ⚫ Use data compression to reduce $L$
    ⚫ Use a higher-bandwidth modem to increase $R$
    ⚫ Place the server closer to reduce $d$

Compression
⚫ Information is usually not represented efficiently
⚫ Data compression algorithms
    ⚫ Represent the information using fewer bits
    ⚫ Noiseless: original information recovered exactly
        ⚫ E.g. zip, compress, GIF, fax
    ⚫ Noisy: recover information approximately
        ⚫ E.g. JPEG
        ⚫ Tradeoff: # bits vs. quality
⚫ Compression ratio = #bits (original file) / #bits (compressed file)

Examples of Block Information
Type | Method | Format | Original | Compressed (Ratio)
Text | Zip, compress | ASCII | kbytes-Mbytes | (2-6)
Fax | CCITT Group 3 | A4 page, 200x100 pixels/in^2 | 256 kbytes | 5-54 kbytes (5-50)
Color Image | JPEG | 8x10 in^2 photo, 400^2 pixels/in^2 | 38.4 Mbytes | 1-8 Mbytes (5-30)

Examples of Digital Video Signals
Type | Method | Format | Original | Compressed
Video Conference | H.261 | 176x144 or 352x288 pix @ 10-30 fr/sec | 2-36 Mbps | 64-1544 kbps
Full Motion | MPEG2 | 720x480 pix @ 30 fr/sec | 249 Mbps | 2-6 Mbps
HDTV | MPEG2 | 1920x1080 @ 30 fr/sec | 1.6 Gbps | 19-38 Mbps

Transmission of Stream Information
⚫ Constant bit-rate
    ⚫ Signals such as digitized telephone voice produce a steady stream: e.g. 64 kbps
    ⚫ Network must support steady transfer of the signal, e.g. a 64 kbps circuit
⚫ Variable bit-rate
    ⚫ Signals such as digitized video produce a stream that varies in bit rate, e.g. according to motion and detail in a scene
    ⚫ Network must support a variable transfer rate, e.g. packet switching or rate-smoothing with a constant bit-rate circuit

Stream Quality-of-Service (QoS) Issues
Network Transmission Impairments
⚫ Delay: Is information delivered in timely fashion?
⚫ Jitter: Is information delivered in sufficiently smooth fashion?
⚫ Loss: Is information delivered without loss? If loss occurs, is the delivered signal quality acceptable?
⚫ Applications & application layer protocols are developed to deal with these impairments

A Transmission System
[Figure: Transmitter -> Communication channel -> Receiver]
Transmitter
⚫ Converts information into a signal suitable for transmission
⚫ Injects energy into the communications medium or channel
    ⚫ Telephone converts voice into electric current; modem converts bits into tones
Receiver
⚫ Receives energy from the medium
⚫ Converts the received signal into a form suitable for delivery to the user
    ⚫ Telephone converts current into voice; modem converts tones into bits

Transmission Impairments
[Figure: Transmitter -> transmitted signal -> Communication channel -> received signal -> Receiver]
Communication Channel
⚫ Pair of copper wires
⚫ Coaxial cable
⚫ Radio
⚫ Light in optical fiber
⚫ Infrared
Transmission Impairments
⚫ Signal attenuation
⚫ Signal distortion
⚫ Spurious noise
⚫ Interference from other signals

Digital Long-Distance Communications
[Figure: Source -> Regenerator -> ... -> Regenerator -> Destination; each hop is one transmission segment]
⚫ Regenerator (repeater) recovers the original data sequence and retransmits it on the next segment
⚫ Each regeneration is like the first time!
⚫ Can be designed so that the error probability is very small

Twisted Pair
⚫ A twisted pair consists of two insulated copper wires, typically about 1 mm thick, twisted together to reduce the susceptibility to interference.
⚫ More twists per cm lead to less crosstalk and better quality over longer distances
[Figure: (a) Category 3 UTP (16 MHz). (b) Category 5 UTP (100 MHz).]

Twisted Pair Bit Rates
Data rates of 24-gauge twisted pair
Standard | Data Rate | Distance
T-1 | 1.544 Mbps | 18,000 feet, 5.5 km
DS2 | 6.312 Mbps | 12,000 feet, 3.7 km
1/4 STS-1 | 12.960 Mbps | 4500 feet, 1.4 km
1/2 STS-1 | 25.920 Mbps | 3000 feet, 0.9 km
STS-1 | 51.840 Mbps | 1000 feet, 300 m
⚫ Twisted pairs can provide high bit rates at short distances
⚫ Asymmetric Digital Subscriber Loop (ADSL)
    ⚫ High-speed Internet access
    ⚫ Lower 3 kHz for voice
    ⚫ Upper band for data
    ⚫ 64 kbps inbound
    ⚫ 640 kbps outbound
⚫ Much higher rates possible at shorter distances
    ⚫ Strategy for telephone companies is to bring fiber close to the home & then use twisted pair

Ethernet LANs
⚫ Category 3 unshielded twisted pair (UTP): ordinary telephone wires
⚫ Category 5 UTP: tighter twisting to improve signal quality
⚫ Shielded twisted pair (STP): minimizes interference; costly
⚫ 10BASE-T Ethernet
    ⚫ 10 Mbps, Baseband, Twisted pair
    ⚫ Two Cat3 pairs
    ⚫ Manchester coding, 100 meters
⚫ 100BASE-T4 Fast Ethernet
    ⚫ 100 Mbps, Baseband, Twisted pair
    ⚫ Four Cat3 pairs
    ⚫ Three pairs for one direction at a time
    ⚫ 100/3 Mbps per pair; 8B6T line code, 100 meters
⚫ Cat5 & STP provide other options

Coaxial Cable
⚫ A good combination of high bandwidth and excellent interference immunity
    ⚫ Higher bandwidth than twisted pair
    ⚫ Cable TV distribution; long-distance telephone transmission
    ⚫ Used as the original Ethernet LAN medium

Optical Fiber
[Figure: Electrical signal -> Modulator (driven by an optical source) -> Optical fiber -> Receiver -> Electrical signal]
⚫ Light sources generate pulses of light that are transmitted on optical fiber
    ⚫ Very long distances (>1000 km), and very high speeds (>40 Gbps/wavelength)
    ⚫ Nearly error-free (bit error rate of $10^{-15}$)
⚫ Profound influence on network architecture
    ⚫ Dominates long-distance transmission
    ⚫ Distance is less of a cost factor in communications
    ⚫ Plentiful bandwidth for new services

Optical Fiber Properties
Advantages
⚫ Very low attenuation
⚫ Noise immunity
⚫ Extremely high bandwidth
⚫ Security: very difficult to tap without breaking
⚫ No corrosion
⚫ More compact & lighter than copper wire
Disadvantages
⚫ New types of optical signal impairments & dispersion
    ⚫ Wavelength dependence
⚫ Limited bend radius
    ⚫ If the physical arc of the cable is too tight, light is lost or won't reflect
    ⚫ Will break
⚫ Difficult to splice
⚫ Mechanical vibration becomes signal noise

Bit Rates of Digital Transmission Systems
System | Bit Rate (Bandwidth) | Observations
Telephone twisted pair | 33.6-56 kbps | 4 kHz telephone channel
Ethernet twisted pair | 10 Mbps, 100 Mbps | 100 meters of unshielded twisted copper wire pair
Cable modem | 500 kbps-4 Mbps | Shared CATV return channel
ADSL twisted pair | 64-640 kbps in, 1.536-6.144 Mbps out | Coexists with analog telephone signal
2.4 GHz radio | 2-11 Mbps | IEEE 802.11 wireless LAN
28 GHz radio | 1.5-45 Mbps | 5 km multipoint radio
Optical fiber | 2.5-10 Gbps | 1 wavelength
Optical fiber | >1600 Gbps | Many wavelengths

Summary of the Lesson
⚫ Different digital transmission systems differ in bit rate, cost, bit error rate, and usage.

Unit 01.04.01 CS 5220: COMPUTER COMMUNICATIONS
Error Control - Parity Checks

Error Control
⚫ Digital transmission systems introduce errors with different error probabilities (bit error rates)
⚫ Applications require a certain reliability level
⚫ Error control is used when the transmission system does not meet the application requirement
⚫ Error control ensures a data stream is transmitted to a certain level of accuracy despite errors
⚫ Two basic approaches:
    ⚫ Error detection & retransmission
    ⚫ Error correction

Codeword
⚫ An n-bit codeword: a frame of m data bits plus k redundant check bits (n = m + k)

Key Idea
⚫ All transmitted data blocks ("codewords") satisfy a pattern
    ⚫ If a received block doesn't satisfy the pattern, it is in error
    ⚫ Blindspot: when the channel transforms a codeword into another codeword
[Figure: User information -> Encoder -> codeword -> Channel -> channel output -> Pattern checking -> deliver user information or set error alarm; all inputs to the channel satisfy the pattern or condition]

Single Parity Check
⚫ Append an overall parity check to k information bits
    Info bits: $b_1, b_2, b_3, \ldots, b_k$
    Check bit: $b_{k+1} = (b_1 + b_2 + b_3 + \cdots + b_k) \bmod 2$
    Codeword: $(b_1, b_2, b_3, \ldots, b_k, b_{k+1})$
⚫ All codewords have an even # of 1s
⚫ Receiver checks to see if the # of 1s is even
⚫ Parity bit is used in ASCII code

Example of Single Parity Code
⚫ Information (7 bits): (0, 1, 0, 1, 1, 0, 0)
⚫ Parity bit: $b_8 = (0 + 1 + 0 + 1 + 1 + 0 + 0) \bmod 2 = 1$
⚫ Codeword (8 bits): (0, 1, 0, 1, 1, 0, 0, 1)
⚫ If single error in bit 3: (0, 1, 1, 1, 1, 0, 0, 1)
    ⚫ # of 1s = 5, odd
    ⚫ Error detected
⚫ If errors in bits 3 and 5: (0, 1, 1, 1, 0, 0, 0, 1)
    ⚫ # of 1s = 4, even
    ⚫ Error not detected

How good is the single parity check code?
⚫ Redundancy: the single parity check code adds 1 redundant bit per m information bits: overhead = $1/(m + 1)$
⚫ Coverage: all error patterns with an odd # of errors can be detected
    ⚫ An error pattern is a binary (m + 1)-tuple with 1s where errors occur and 0s elsewhere
    ⚫ Of the $2^{m+1}$ binary (m + 1)-tuples, half have an odd number of 1s, so 50% of error patterns can be detected
    ⚫ Error vector $(e_1, e_2, \ldots, e_n)$ where $e_i = 1$ if an error occurs in the ith transmitted bit and $e_i = 0$ otherwise

Error Probability
⚫ Many transmission channels introduce bit errors at random, independently of each other, and with probability p
⚫ For an n-bit frame,
    $P[\text{1-bit error}] = \binom{n}{1} p (1 - p)^{n-1}$
    and
    $P[j\text{-bit error}] = \binom{n}{j} p^{j} (1 - p)^{n-j}$

What if bit errors are random?
⚫ Some error patterns are more probable than others:
    $P[10000000] = p(1 - p)^7 = (1 - p)^8 \left(\frac{p}{1 - p}\right)$
    $P[11000000] = p^2(1 - p)^6 = (1 - p)^8 \left(\frac{p}{1 - p}\right)^2$
⚫ In any worthwhile channel $p < 0.5$, and so $p/(1 - p) < 1$
⚫ It follows that patterns with 1 error are more likely than patterns with 2 errors, and so forth

Single Parity - Undetectable Errors
⚫ What is the probability that an undetectable error pattern occurs?
⚫ An error pattern is undetectable if it has an even # of bit errors:
    $P[\text{error detection failure}] = P[\text{undetectable error pattern}] = P[\text{error patterns with even number of 1s}]$
    $= \binom{n}{2} p^2 (1 - p)^{n-2} + \binom{n}{4} p^4 (1 - p)^{n-4} + \cdots$
⚫ Example: for n = 32 and $p = 10^{-3}$ the sum is about $4.8 \times 10^{-4}$, so roughly 1 frame in 2000 arrives with an undetectable error pattern
⚫ Is it possible to detect more errors if we add more check bits?
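
The single parity check rules above can be sketched in C. This is a minimal illustration rather than code from the course or the textbook: the function names `parity_bit`, `parity_ok`, and `p_undetected` are hypothetical, bits are stored one per `int` for readability, and bit errors are assumed to be independent with probability p, as on the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Even-parity check bit: the modulo-2 sum (XOR) of the k information bits. */
int parity_bit(const int *bits, size_t k)
{
    int sum = 0;
    for (size_t i = 0; i < k; i++)
        sum ^= bits[i] & 1;
    return sum;
}

/* Receiver-side check: returns 1 if the n-bit block has an even number of 1s. */
int parity_ok(const int *block, size_t n)
{
    return parity_bit(block, n) == 0;
}

/*
 * Probability that an n-bit frame suffers an even, nonzero number of bit
 * errors (undetectable by a single parity check), assuming independent
 * bit errors with probability p.
 */
double p_undetected(int n, double p)
{
    assert(n >= 0 && p >= 0.0 && p < 1.0);

    /* term holds C(n, j) p^j (1 - p)^(n - j), starting from j = 0. */
    double term = 1.0;
    for (int i = 0; i < n; i++)
        term *= 1.0 - p;

    double sum = 0.0;
    for (int j = 0; j < n; j++) {
        /* Step from the j-error term to the (j + 1)-error term. */
        term *= (double)(n - j) / (double)(j + 1) * (p / (1.0 - p));
        if ((j + 1) % 2 == 0)
            sum += term;
    }
    return sum;
}
```

For the 7-bit example (0, 1, 0, 1, 1, 0, 0), `parity_bit` returns 1, and `parity_ok` accepts the block with errors in bits 3 and 5 even though it is corrupted. Calling `p_undetected(32, 1e-3)` gives about 4.8e-4, in line with the "roughly 1 in 2000" figure.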
Two-Dimensional Parity Check
⚫ More parity bits to improve coverage
⚫ Arrange information as columns
⚫ Add a single parity bit to each column
⚫ Add a final "parity" column

    1 0 0 1 0 0
    0 1 0 0 0 1
    1 0 0 1 0 0
    1 1 0 1 1 0
    1 0 0 1 1 1

    The last column consists of check bits for each row.
    The bottom row consists of check bits for each column.

Error-detecting capability

    One error        Two errors
    1 0 0 1 0 0      1 0 0 1 0 0
    0 0 0 0 0 1      0 0 0 0 0 1
    1 0 0 1 0 0      1 0 0 1 0 0
    1 1 0 1 1 0      1 0 0 1 1 0
    1 0 0 1 1 1      1 0 0 1 1 1

Error-detecting capability - II

    Three errors     Four errors (undetectable)
    1 0 0 1 0 0      1 0 0 1 0 0
    0 0 0 1 0 1      0 0 0 1 0 1
    1 0 0 1 0 0      1 0 0 1 0 0
    1 0 0 1 1 0      1 0 0 0 1 0
    1 0 0 1 1 1      1 0 0 1 1 1

    [Figure: arrows indicate failed check bits]

Summary of the Lesson
⚫ The single parity bit code is used in ASCII. Its overhead is low, yet it detects any odd number of errors, including the most probable 1-bit errors.
⚫ Two-dimensional parity checks were used in older systems because they detect all 1-, 2-, and 3-bit errors, but their overhead is high.

Unit 01.04.02 CS 5220: COMPUTER COMMUNICATIONS
Error Control - Polynomial Codes (CRC)

Polynomial Codes
⚫ Polynomial arithmetic instead of check sums
⚫ Implemented using shift-register circuits
⚫ Also called cyclic redundancy check (CRC) codes
⚫ Most data communications standards use polynomial codes for error detection
⚫ Polynomial codes are also the basis for powerful error-correction methods

Binary Polynomial Arithmetic
⚫ Binary vectors map to polynomials (polynomial degree k-1)
    $(i_{k-1}, i_{k-2}, \ldots, i_2, i_1, i_0) \rightarrow i_{k-1}x^{k-1} + i_{k-2}x^{k-2} + \cdots + i_2x^2 + i_1x + i_0$
⚫ Addition:
    $(x^7 + x^6 + 1) + (x^6 + x^5) = x^7 + x^6 + x^6 + x^5 + 1 = x^7 + (1+1)x^6 + x^5 + 1 = x^7 + x^5 + 1$, since $1 + 1 = 0 \bmod 2$
⚫ Multiplication:
    $(x + 1)(x^2 + x + 1) = x(x^2 + x + 1) + 1(x^2 + x + 1) = x^3 + x^2 + x + x^2 + x + 1 = x^3 + 1$

Division
⚫ Division with decimal numbers (Euclidean division): dividend = quotient x divisor + remainder
⚫ Example: 1222 divided by 35
    ⚫ 122 - 3 x 35 = 122 - 105 = 17; bring down 2 to get 172
    ⚫ 172 - 4 x 35 = 172 - 140 = 32
    ⚫ Quotient 34, remainder 32: 1222 = 34 x 35 + 32

Binary Polynomial Division
⚫ Polynomial division: $p(x) = q(x)\,g(x) + r(x)$
⚫ Example: divide the dividend $p(x) = x^6 + x^5$ by the divisor $g(x) = x^3 + x + 1$
    ⚫ $(x^6 + x^5) + x^3 g(x) = (x^6 + x^5) + (x^6 + x^4 + x^3) = x^5 + x^4 + x^3$
    ⚫ $(x^5 + x^4 + x^3) + x^2 g(x) = (x^5 + x^4 + x^3) + (x^5 + x^3 + x^2) = x^4 + x^2$
    ⚫ $(x^4 + x^2) + x\,g(x) = (x^4 + x^2) + (x^4 + x^2 + x) = x$
    ⚫ Quotient $q(x) = x^3 + x^2 + x$, remainder $r(x) = x$
⚫ Note: the degree of r(x) is less than the degree of the divisor

Cyclic Redundancy Check
⚫ Cyclic Redundancy Check (CRC) uses a polynomial code, treating bit strings as representations of polynomials with coefficients of 0 and 1 only.
⚫ A k-bit data frame is regarded as the coefficient list for a polynomial with k terms, ranging from $x^{k-1}$ to $x^0$. Such a polynomial is said to be of degree k-1
⚫ Polynomial arithmetic is done by per-bit XOR, so addition and subtraction are identical. Examples:
    10011011 + 11001010 = 01010001
    11110000 - 10100110 = 01010110

CRC Idea - Checkbits & Error Detection
[Figure: the sender calculates n-k check bits from the k information bits using the generator polynomial and sends both over the channel; the receiver recalculates the check bits from the received information bits with the same generator polynomial and compares them with the received check bits; the information is accepted if the check bits match]

CRC Procedure - Preparation
⚫ Given a generator polynomial g(x) of degree n-k:
    $g(x) = x^{n-k} + g_{n-k-1}x^{n-k-1} + \cdots + g_2x^2 + g_1x + 1$
⚫ The information polynomial i(x) has k information bits (degree k-1):
    $i(x) = i_{k-1}x^{k-1} + i_{k-2}x^{k-2} + \cdots + i_2x^2 + i_1x + i_0$

CRC Encoding Procedure
1. Multiply i(x) by $x^{n-k}$ (puts zeros in the n-k low-order positions)
2. Divide $x^{n-k} i(x)$ by g(x) and get a remainder polynomial r(x) of degree at most n-k-1. The remainder is the CRC check bits:
    $x^{n-k} i(x) = q(x)\,g(x) + r(x)$
3. Add the remainder r(x) to $x^{n-k} i(x)$ (puts the check bits in the n-k low-order positions). The resulting polynomial is the transmitted codeword:
    $b(x) = x^{n-k} i(x) + r(x)$

CRC Polynomial example: k = 4, n-k = 3
⚫ Generator polynomial: $g(x) = x^3 + x + 1$
⚫ Information: (1,1,0,0), so $i(x) = x^3 + x^2$
⚫ Encoding dividend: $x^3 i(x) = x^6 + x^5$
⚫ Dividing $x^6 + x^5$ by $x^3 + x + 1$ gives quotient $x^3 + x^2 + x$ and remainder $x$
⚫ The same division in binary:

              1110         quotient
    1011 ) 1100000
           1011
            1110
            1011
             1010
             1011
               010         remainder (check bits)

⚫ Transmitted codeword: $b(x) = x^6 + x^5 + x$, i.e. b = (1,1,0,0,0,1,0)

Summary of the Lesson
⚫ Binary polynomial codes and binary arithmetic are key to CRC encoding and check-bit calculation

Unit 01.04.03 CS 5220: COMPUTER COMMUNICATIONS
CRC Capability; Internet Checksum

CRC Encoding - Recap
1. Multiply i(x) by $x^{n-k}$ (puts zeros in the n-k low-order positions)
2. Divide $x^{n-k} i(x)$ by g(x) and get a remainder polynomial r(x) of degree at most n-k-1. The remainder is the CRC check bits
3. Add the remainder r(x) to $x^{n-k} i(x)$ (puts the check bits in the n-k low-order positions). The resulting polynomial is the transmitted codeword $b(x) = x^{n-k} i(x) + r(x)$

An Example - Step-by-Step
[Figures: Steps 1 through 10 of a worked CRC division; the images are not reproduced in this text]

Overall CRC Capability Analysis
⚫ What kind of errors will be detected?
⚫ Imagine that a transmission error e(x) occurs, so that instead of b(x) arriving, b(x) + e(x) arrives.
⚫ e(x) has 1s in error locations & 0s elsewhere; this is an additive error model that adds e(x) bit-by-bit to the input codeword b(x) using modulo 2 arithmetic
[Figure: the transmitter sends b(x); the channel adds the error polynomial e(x); the receiver gets R(x) = b(x) + e(x)]

Undetectable Error Patterns
⚫ The receiver divides the received polynomial R(x) by g(x)
⚫ Blindspot: if e(x) is a multiple of g(x), that is, e(x) is a nonzero codeword, then
    $R(x) = b(x) + e(x) = q(x)\,g(x) + q'(x)\,g(x)$
⚫ If e(x) is divisible by g(x), the error will slip by!
⚫ So how do we select g(x)?
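
The encoding steps and the blindspot above can be sketched in C with bitwise XOR. This is a minimal sketch under stated assumptions, not the course's implementation: the names `mod2_remainder` and `crc_encode` are hypothetical, polynomials are packed into 32-bit words with bit i holding the coefficient of x^i, and the parameter `r` stands for the generator degree n-k.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remainder of modulo-2 (XOR) long division of an nbits-wide value by a
 * generator polynomial of degree r. Bit i is the coefficient of x^i.
 */
uint32_t mod2_remainder(uint32_t value, int nbits, uint32_t gen, int r)
{
    assert(r >= 1 && r < 32 && nbits >= r && nbits <= 32);
    assert((gen >> r) == 1u);   /* generator must have degree exactly r */

    for (int i = nbits - 1; i >= r; i--) {
        if (value & (UINT32_C(1) << i))
            value ^= gen << (i - r);   /* subtract g(x) * x^(i-r) */
    }
    return value;               /* only the low r bits can remain set */
}

/* CRC encoding of k information bits: b(x) = x^r i(x) + r(x). */
uint32_t crc_encode(uint32_t info, int k, uint32_t gen, int r)
{
    uint32_t shifted = info << r;                            /* step 1 */
    uint32_t rem = mod2_remainder(shifted, k + r, gen, r);   /* step 2 */
    return shifted | rem;                                    /* step 3 */
}
```

With the slide example (information 1100, generator 1011), `crc_encode(0xC, 4, 0xB, 3)` yields 1100010. A receiver would call `mod2_remainder` on the 7 received bits and accept the frame only when the result is zero; adding the generator itself as an error pattern also leaves a zero remainder, which is the blindspot described above.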
Designing Good Polynomial Codes
⚫ Select the generator polynomial so that likely error patterns are not multiples of g(x)
⚫ Detecting single errors
    ⚫ $e(x) = x^i$ for an error in location i + 1
    ⚫ If g(x) has more than 1 term, it cannot divide $x^i$
⚫ Detecting double errors
    ⚫ $e(x) = x^i + x^j = x^i(x^{j-i} + 1)$ where $j > i$
    ⚫ If g(x) has more than 1 term, it cannot divide $x^i$
    ⚫ If g(x) is a primitive polynomial, it cannot divide $x^m + 1$ for all $m < 2^{n-k} - 1$ (need to keep the codeword length less than $2^{n-k} - 1$)
    ⚫ Primitive polynomials can be found by consulting coding theory books

Designing Good Polynomial Codes
⚫ Detecting odd numbers of errors
    ⚫ Suppose all codeword polynomials have an even # of 1s; then all odd numbers of errors can be detected
    ⚫ As well, b(x) evaluated at x = 1 is zero because b(x) has an even number of 1s
    ⚫ This implies x + 1 must be a factor of all b(x)
    ⚫ Pick $g(x) = (x + 1)\,p(x)$ where p(x) is primitive

Standard CRC Generator Polynomials
⚫ CRC-8 (ATM): $x^8 + x^2 + x + 1$
⚫ CRC-16 (Bisync): $x^{16} + x^{15} + x^2 + 1 = (x + 1)(x^{15} + x + 1)$
⚫ CCITT-16 (HDLC, XMODEM, V.41): $x^{16} + x^{12} + x^5 + 1$
⚫ CCITT-32 (IEEE 802, DoD, V.42): $x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1$

Internet Checksum
⚫ Internet protocols (IP, TCP, UDP) use a checksum to detect errors instead of a CRC polynomial
⚫ The rationale is simplicity: the checksum must be recalculated at every router, so the algorithm was selected for its ease of implementation rather than for the strength of its error detection capability

Internet (IP) Checksum Algorithm
⚫ Let the IP header consist of L 16-bit words, $b_0, b_1, b_2, \ldots, b_{L-1}$
⚫ The algorithm appends a 16-bit checksum $b_L$ to the header. The checksum $b_L$ is calculated as follows:
    ⚫ Treating each 16-bit word as an integer, find $x = (b_0 + b_1 + b_2 + \cdots + b_{L-1}) \bmod (2^{16} - 1)$
    ⚫ The checksum is then given by $b_L = -x$
⚫ Thus, the headers must satisfy the following pattern:
    $0 = (b_0 + b_1 + b_2 + \cdots + b_{L-1} + b_L) \bmod (2^{16} - 1)$
⚫ The checksum calculation is carried out in software using one's complement arithmetic

Internet Checksum Example
⚫ Assume 4-bit words
⚫ Use mod $2^4 - 1$ arithmetic
⚫ $b_0$ = 1100 = 12
⚫ $b_1$ = 1010 = 10
Use Modulo Arithmetic
⚫ $b_0 + b_1 = 12 + 10 = 22 = 7 \bmod 15$
⚫ $b_2 = -7 = 8 \bmod 15$
⚫ Therefore $b_2$ = 1000
Use Binary Arithmetic
⚫ Note 16 mod 15 = 1
⚫ So 10000 mod 15 = 0001: the leading bit wraps around
⚫ $b_0 + b_1$ = 1100 + 1010 = 10110 = 10000 + 0110 = 0001 + 0110 = 0111 = 7
⚫ Take the 1s complement: $b_2$ = -0111 = 1000

Summary of the Lesson
⚫ The choice of generator polynomial determines the error-detection capability of a CRC.
⚫ The Internet checksum favors ease of implementation over detection capability.

Unit 02.01.01 CS 5220: COMPUTER COMMUNICATIONS
Peer-to-Peer Protocols and Services

Peer-to-Peer Protocols
[Figure: two protocol stacks side by side, each with n+1, n, and n-1 peer processes; SDUs pass between layer n+1 and layer n, and PDUs are exchanged between the layer-n peers]
⚫ Peer-to-peer processes execute the layer-n protocol to provide service to layer-(n+1)
⚫ The layer-(n+1) peer calls layer-n and passes Service Data Units (SDUs) for transfer
⚫ Layer-n peers exchange Protocol Data Units (PDUs) to effect the transfer
⚫ Layer-n delivers SDUs to the destination layer-(n+1) peer

Service Models
⚫ The service model specifies the information transfer service layer-n provides to layer-(n+1)
⚫ The most important distinction is whether the service is:
    ⚫ Connection-oriented
    ⚫ Connectionless
⚫ A Quality-of-Service (QoS) requirement specifies a level of performance that can be expected in the transfer of information.

Examples of Services
⚫ Service model possible features:
    ⚫ Arbitrary message size or structure
    ⚫ Sequencing
    ⚫ Reliability
    ⚫ Timing
    ⚫ Flow control
    ⚫ Multiplexing
    ⚫ Privacy, integrity, and authentication

Message Size and Structure
⚫ What message size and structure will a service model accept?
    ⚫ Different services impose restrictions on the size & structure of the data they will transfer
    ⚫ Single bit? Block of bytes? Byte stream?
⚫ Ex: Transfer of voice mail = 1 long message
⚫ Ex: Transfer of voice call = byte stream
[Figure: (a) 1 voice mail = 1 message = entire sequence of speech samples; (b) 1 call = sequence of 1-byte messages]

Segmentation & Blocking
⚫ Segmentation & reassembly: a layer breaks long messages into smaller blocks and reassembles these at the destination
⚫ Blocking & unblocking: a layer combines small messages into bigger blocks prior to transfer
[Figure: 1 long message becomes 2 or more blocks; 2 or more short messages become 1 block]

Reliability & Sequencing
⚫ Reliability: what makes a transmission reliable?
    ⚫ Sequencing: are messages or the information stream delivered in order? Without duplication?
⚫ How to provide reliable communication?
    ⚫ ARQ protocols combine error detection, retransmission, and sequence numbering to provide reliability
    ⚫ Examples: TCP and HDLC

Flow Control
⚫ Messages can be lost if the receiving system does not have sufficient buffering to store arriving messages
⚫ If the destination layer-(n+1) does not retrieve its information fast enough, the destination layer-n buffers may overflow
⚫ Flow control provides backpressure mechanisms that control the transfer according to the availability of buffers at the destination
⚫ Examples: TCP and HDLC

Timing
⚫ Applications involving voice and video generate units of information that are related temporally
⚫ The destination application must reconstruct the temporal relation in voice/video units
⚫ Network transfer introduces delay & jitter
⚫ Timing recovery protocols use timestamps & sequence numbering to control the delay & jitter in delivered information
⚫ Examples: RTP & associated protocols in Voice over IP

Multiplexing
⚫ Multiplexing enables multiple layer-(n+1) users to share a layer-n service
⚫ What does it need?
⚫ ⚫ A multiplexing tag is required to identify specific users at the destination Examples: IP Privacy, Integrity, & Authentication ⚫ ⚫ ⚫ ⚫ Privacy: ensuring that information transferred cannot be read by others Integrity: ensuring that information is not altered during transfer Authentication: verifying that sender and/or receiver are who they claim to be Examples: IPSec, SSL End-to-End vs. Hop-by-Hop ⚫ A service feature can be provided by implementing a protocol ⚫ ⚫ ⚫ end-to-end across the network across every hop in the network Examples: ⚫ ⚫ Perform error control at every hop in the network or only between the source and destination? Perform flow control between every hop in the network or only between source & destination? Packets Packets Data link layer Data link layer (a) A Frames Physical layer Physical layer Error control in Data Link Layer B (b) 1 2 3 21 12 3 B 2 1 Medium A 1 Physical layer entity 2 Data link layer entity 3 Network layer entity ⚫ Data Link operates over wirelike, directly-connected systems ⚫ Frames can be corrupted or lost, but arrive in order ⚫ Data link performs errorchecking & retransmission ⚫ Ensures error-free packet transfer between two systems 21 Error Control in Transport Layer Messages Messages Segments Transport layer Transport layer Network layer Network layer Network layer Network layer Data link layer Data link layer Data link layer Data link layer layer Physical layer Physical layer Physical B layer End system Physical A Network End system Which Approach Preferred Hop-by-hop (HDLC) Data 1 Data 2 Data 3 ACK/ NAK Data 4 ACK/ NAK Hop-by-hop cannot ensure E2E correctness 5 ACK/ NAK Faster recovery ACK/ NAK Simple inside the network End-to-end (TCP) ACK/NAK 1 2 Data 3 Data 5 4 Data Data More scalable if complexity at the edge Summary of the Lesson ⚫ There is a basic tradeoff in choosing end-to-end and hop-byhop approaches. Unit 02.01.01 CS 5220: COMPUTER COMMUNICATIONS Stop-and-Wait ARQ Protocol XIAOBO ZHOU, Ph.D. 
Professor, Department of Computer Science Automatic Repeat Request (ARQ) ⚫ Purpose: to ensure a sequence of information packets is delivered in order and without errors or duplications despite transmission errors & losses ⚫ Sliding window: a set of Seq.# corresponding to frames permitted to send or receive. Automatic Repeat Request (ARQ) - CONT ⚫ Three ARQ protocols ⚫ ⚫ ⚫ ⚫ Stop-and-Wait ARQ Go-Back N ARQ Selective Repeat ARQ Basic elements of ARQ: ⚫ ⚫ ⚫ ⚫ Error-detecting code with high error coverage ACKs (positive acknowledgments) NAKs (negative acknowledgments) Timeout mechanism Transmit a frame, wait for ACK Stop-and-Wait ARQ Error-free packet Packet Information frame Timer set after each frame transmission Receiver (Process B) Transmitter (Process A) Control frame Header Information packet Information frame CRC Header CRC Control frame: ACKs Need for Sequence Numbers (a) Frame 1 lost A Time Frame 0 (b) ACK lost B ⚫ ⚫ Frame 1 ACK B A Time-out Frame 1 Frame 2 ACK Time-out Time Frame 0 ACK Frame 1 ACK Frame 1 Frame 2 ACK In cases (a) & (b) the transmitting station A acts the same way But in case (b) the receiving station B accepts frame 1 twice Sequence Numbers in ACK (c) Premature Time-out Time-out A Time Frame 0 ACK B ⚫ ⚫ Frame 0 Frame 1 Frame 2 ACK The transmitting station A misinterprets duplicate ACKs Incorrectly assumes second ACK acknowledges Frame 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 Rnext Slast Timer Slast Transmitter A Receiver B Rnext Global State: (Slast, Rnext) (0,0) Error-free frame 0 arrives at receiver ACK for frame 1 arrives at transmitter (1,0) Error-free frame 1 arrives at receiver (0,1) ACK for frame 0 arrives at transmitter (1,1) 1-Bit Sequence Numbering Suffices Stop-and-Wait ARQ Transmitter Ready state ⚫ ⚫ ⚫ Await request from higher layer for packet transfer When request arrives, transmit frame with updated Slast and CRC Go to Wait State Receiver Always in Ready State ⚫ ⚫ ⚫ Wait state ⚫ ⚫ Wait for ACK or timer to expire; block requests from 
higher layer
⚫ If timeout expires: retransmit frame and reset timer
⚫ If ACK received:
  ⚫ If sequence number is incorrect or if errors detected: ignore ACK
  ⚫ If sequence number is correct (Rnext = Slast + 1): accept frame, go to Ready state
⚫ Wait for arrival of new frame
⚫ When frame arrives, check for errors
  ⚫ If no errors detected and sequence number is correct (Slast = Rnext): accept frame, update Rnext, send ACK frame with Rnext, deliver packet to higher layer
  ⚫ If no errors detected and wrong sequence number: discard frame, send ACK frame with Rnext
  ⚫ If errors detected: discard frame

Summary: Applications of Stop-and-Wait ARQ
⚫ IBM Binary Synchronous Communications protocol (Bisync): character-oriented data link control
⚫ Xmodem: modem file transfer protocol
⚫ Trivial File Transfer Protocol (RFC 1350): simple protocol for file transfer over UDP

Unit 02.01.03 CS 5220: COMPUTER COMMUNICATIONS
S&W Performance, and Go-back-N ARQ
XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Stop-and-Wait Performance
⚫ Stop-and-Wait ARQ works well on channels that have low propagation delay
⚫ The protocol becomes inefficient when the propagation delay is much greater than the time to transmit a frame

Stop-and-Wait ARQ Efficiency
[Figure: timeline between A and B. At A: first frame bit enters channel, last frame bit enters channel, channel idle while transmitter waits for ACK, ACK arrives. At B: first frame bit arrives at receiver, last frame bit arrives at receiver, receiver processes frame and prepares ACK.]

Stop-and-Wait Delay Model
[Figure: timeline between A and B showing tprop, the frame time tf = nf/R, tproc, the ACK time tack = na/R, and tprop.]
t0 = total time to transmit 1 frame
$$t_0 = 2t_{prop} + 2t_{proc} + t_f + t_{ack} = 2t_{prop} + 2t_{proc} + \frac{n_f}{R} + \frac{n_a}{R}$$
where nf = bits/info frame, na = bits/ACK frame, R = channel transmission rate

S&W Efficiency on Error-free channel
Effective transmission rate (no = bits for header & CRC):
$$R_{eff}^0 = \frac{\text{number of information bits delivered to destination}}{\text{total time required to deliver the information bits}} = \frac{n_f - n_o}{t_0}$$
Transmission efficiency:
$$\eta_0 = \frac{R_{eff}^0}{R} = \frac{(n_f - n_o)/t_0}{R} = \frac{1 - \frac{n_o}{n_f}}{1 + \frac{n_a}{n_f} + \frac{2(t_{prop} + t_{proc})R}{n_f}}$$
⚫ Effect of frame overhead: the term no/nf
⚫ Effect of ACK frame: the term na/nf
⚫ Effect of Delay-Bandwidth Product: the term 2(tprop + tproc)R/nf

Delay-Bandwidth Product
[Figure: same timeline as in the Stop-and-Wait delay model; t0 = total time to transmit 1 frame.]
• Delay-bandwidth product is 2(tprop + tproc) * R, or RTT * R

Example: Impact of Delay-Bandwidth Product
nf = 1250 bytes = 10000 bits, na = no = 25 bytes = 200 bits
Each entry gives 2 x Delay x BW (in bits) and the resulting efficiency.
2 x Delay (RTT dist.)    1 Mbps         1 Gbps
1 ms (200 km)            10^3, 88%      10^6, 1%
10 ms (2000 km)          10^4, 49%      10^7, 0.1%
100 ms (20000 km)        10^5, 9%       10^8, 0.01%
1 sec (200000 km)        10^6, 1%       10^9, 0.001%

S&W Efficiency in Channel with Errors
⚫ Let 1 - Pf = probability frame arrives w/o errors
⚫ Avg. # of transmissions to first correct arrival is then 1/(1 - Pf)
⚫ “If 1-in-10 get through without error, then avg. 10 tries to success”
⚫ Avg. total time per frame is then t0/(1 - Pf)
$$\eta_{SW} = \frac{R_{eff}}{R} = \frac{\dfrac{n_f - n_o}{t_0/(1 - P_f)}}{R} = \frac{1 - \frac{n_o}{n_f}}{1 + \frac{n_a}{n_f} + \frac{2(t_{prop} + t_{proc})R}{n_f}}\,(1 - P_f)$$
⚫ Effect of frame loss: the factor (1 - Pf)

Example: Impact of Bit Error Rate
nf = 1250 bytes = 10000 bits, na = no = 25 bytes = 200 bits
Find efficiency for random bit errors with p = 0, 10^-6, 10^-5, 10^-4
$$1 - P_f = (1 - p)^{n_f} \approx e^{-n_f p} \quad \text{for large } n_f \text{ and small } p$$
p                              0      10^-6    10^-5    10^-4
1 - Pf                         1      0.99     0.905    0.368
Efficiency (1 Mbps & 1 ms)     88%    86.6%    79.2%    32.2%

Go-Back-N ARQ
⚫ Improve Stop-and-Wait by not waiting!
⚫ Keep channel busy by continuing to send frames
⚫ A procedure in which the transmission of a new frame begins before the previous frame transmission has completed is called pipelining.
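The stop-and-wait efficiency, (1 - no/nf)(1 - Pf) / (1 + na/nf + 2(tprop + tproc)R/nf) with 1 - Pf = (1 - p)^nf, can be checked numerically against the two example tables. The following is a minimal sketch; the function name `sw_efficiency` and its parameter names are illustrative choices, not part of the lecture material.

```python
def sw_efficiency(nf, na, no, rate, delay, p=0.0):
    """Stop-and-wait efficiency (illustrative helper, not from the slides).

    nf, na, no: bits per info frame, per ACK frame, and header+CRC overhead.
    rate:  channel transmission rate R in bits/second.
    delay: 2 * (tprop + tproc) in seconds.
    p:     probability of a random bit error (independent errors assumed).
    """
    frame_ok = (1.0 - p) ** nf      # 1 - Pf: frame arrives without errors
    dbw = delay * rate              # delay-bandwidth product in bits
    return (1.0 - no / nf) * frame_ok / (1.0 + na / nf + dbw / nf)


if __name__ == "__main__":
    nf, na, no = 10_000, 200, 200
    # Error-free channel: sweep delay and rate.
    for delay in (1e-3, 10e-3, 100e-3, 1.0):
        for rate in (1e6, 1e9):
            eff = sw_efficiency(nf, na, no, rate, delay)
            print(f"delay={delay:g} s  R={rate:g} bps  efficiency={eff:.4%}")
    # Channel with bit errors at 1 Mbps and 1 ms.
    for p in (0.0, 1e-6, 1e-5, 1e-4):
        eff = sw_efficiency(nf, na, no, 1e6, 1e-3, p)
        print(f"p={p:g}  efficiency={eff:.1%}")
```

The sketch uses the exact (1 - p)^nf rather than the e^(-nf p) approximation, so the printed values may differ from the slide tables in the last digit.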
Go-Back-N
⚫ Allow a window of up to Ws outstanding frames
⚫ Receiver’s window size is often 1
⚫ The window size (in frames) must be larger than the delay-bandwidth product to ensure that the channel is kept full
⚫ If ACK for oldest frame arrives before window is exhausted, continue transmitting
⚫ If window is exhausted, pull back and retransmit all outstanding frames

Go-Back-4
[Figure: Go-Back-4 timeline. A sends fr 0 through fr 6, then goes back and sends fr 3, fr 4, fr 5, fr 6, followed by fr 7, fr 8, fr 9. B returns ACK 1, ACK 2, ACK 3, ignores the out-of-sequence frames while Rnext stays at 3, then returns ACK 4 through ACK 9. 4 frames are outstanding, so go back 4.]
⚫ Frame with errors and subsequent out-of-sequence frames are ignored
⚫ Transmitter is forced to go back when window of 4 is exhausted

Summary of the Lesson
⚫ Delay-bandwidth product is a key element in performance evaluation of network protocols
⚫ Stop-and-wait is only efficient if the delay-bandwidth product is very small

Unit 02.01.04 CS 5220: COMPUTER COMMUNICATIONS
Go-back-N and Selective Repeat ARQ
XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Go-Back-N with Timeout
⚫ Problem with Go-Back-N as presented:
  ⚫ Window size should be long enough to cover round trip time
  ⚫ If a frame is lost and transmitter does not have a frame to send, then window will not be exhausted and recovery will not commence

Go-Back-N with Timeout
⚫ Use a timeout with each frame
⚫ When timeout expires, resend all outstanding frames
[Figure: A sends fr 0; the time-out expires and A resends fr 0, followed by fr 1. Receiver B is looking for Rnext = 0 and replies with ACK 1.]

Maximum Window Size
⚫ Given m-bit seq. numbers, what is the maximum number of frames that can be outstanding in “go back N”?
MAX_SEQ = 2^m - 1, while there are 2^m sequence numbers. Should the maximum number be MAX_SEQ or MAX_SEQ + 1?
Example: m = 2; sequence numbers: 0, 1, 2, 3
1) Sender sends 4 frames in a row, from 0 through 3
2) Receiver sends four corresponding ACKs back to sender, but all are lost!
3) Sender times out, re-sends the 4 frames (0 through 3)
4) Receiver is waiting for frame 0
Can receiver determine whether this is a new frame 0 or an old frame 0?

Maximum Allowable Window Size is Ws = 2^m?
M = 2^2 = 4, Go-Back-4:
[Figure: A sends fr 0, fr 1, fr 2, fr 3; B returns ACK 1, ACK 2, ACK 3, ACK 0 while Rnext goes 0, 1, 2, 3, 0; the transmitter goes back 4 and resends fr 0, fr 1, fr 2, fr 3.]
Receiver has Rnext = 0, but it does not know whether its ACK for frame 0 was received, so it does not know whether this is the old frame 0 or a new frame 0

Maximum Allowable Window Size is Ws = 2^m - 1
M = 2^2 = 4, Go-Back-3:
[Figure: the sequence of frame exchange. A sends fr 0, fr 1, fr 2; B returns ACK 1, ACK 2, ACK 3 while Rnext goes 0, 1, 2, 3; the transmitter goes back 3 and resends fr 0, fr 1, fr 2.]
Receiver has Rnext = 3, so it rejects the old frame 0

Applications of Go-Back-N ARQ
⚫ HDLC (High-Level Data Link Control): bit-oriented data link control
⚫ V.42 modem: error control over telephone modem links

Piggybacking and Bidirectional Links
⚫ In two-way transmission, data frames and ACK frames are interleaved, so why not give the ACK a “free” ride on an outgoing data frame?
⚫ Piggybacking: receiver inserts ACK in the next departing frame
⚫ For piggybacking, how long should the data link layer wait for a packet onto which to piggyback the ACK?
[Figure: transmitter A (Slast) and receiver B (Rnext) on a bidirectional link.]

Required Timeout & Window Size
[Figure: the timeout Tout spans Tf, Tprop, Tproc, Tf, Tprop.]
⚫ Timeout value should allow for:
  ⚫ Two propagation times + two transmission times + 1 processing time: 2 Tprop + 2 Tf + Tproc, assuming the receiver starts transmission right after receiving
⚫ Ws should be large enough to keep channel busy for Tout

Window Size for Delay-Bandwidth Product
Frame = 1250 bytes = 10,000 bits, R = 1 Mbps
Delay: 2(tprop + tproc)    Delay x BW        Window (1 + D * W / L)
1 ms                       1000 bits         1
10 ms                      10,000 bits       2
100 ms                     100,000 bits      11
1 second                   1,000,000 bits    101

Selective Repeat ARQ
⚫ Why is Go-Back-N ARQ inefficient?
  ⚫ Because multiple frames are resent when errors or losses occur
  ⚫ Correct but out-of-sequence frames are discarded (because the receiver buffer holds only 1 frame)

Selective Repeat ARQ
⚫ Selective Repeat retransmits only an individual frame
  ⚫ Timeout causes individual corresponding frame to be resent
  ⚫ NAK causes retransmission of oldest un-acked frame
⚫ Receiver maintains a receive window of sequence numbers that can be accepted
  ⚫ Receive window is made larger than 1
  ⚫ Error-free, but out-of-sequence frames with sequence numbers within the receive window are buffered
  ⚫ Arrival of frame with Rnext causes window to slide forward by 1 or more

Selective Repeat ARQ
[Figure: A sends fr 0 through fr 6, resends fr 2, then sends fr 7 through fr 12. B returns ACK 1, ACK 2, NAK 2, then ACK 2 for each out-of-sequence frame, then ACK 7 through ACK 12.]

What size Ws and Wr allowed?
⚫ Example (as in Go-back N): M = 2^2 = 4, Ws = 3, Wr = 3
[Figure: A sends fr 0, fr 1, fr 2 and then resends frame 0; send window goes {0,1,2}, {1,2}, {2}, {.}. B returns ACK 1, ACK 2, ACK 3; receive window goes {0,1,2}, {1,2,3}, {2,3,0}, {3,0,1}.]
Old frame 0 accepted as a new frame because it falls in the receive window

Ws + Wr = 2^m is maximum allowed
⚫ Example: M = 2^2 = 4, Ws = 2, Wr = 2
[Figure: A sends fr 0, fr 1 and then resends frame 0; send window goes {0,1}, {1}, {.}. B returns ACK 1, ACK 2; receive window goes {0,1}, {1,2}, {2,3}.]
Old frame 0 rejected because it falls outside the receive window

Applications of Sel. Repeat ARQ
⚫ TCP (Transmission Control Protocol): transport layer protocol uses variation of selective repeat to provide reliable stream service
⚫ Service Specific Connection Oriented Protocol: error control for signaling messages in ATM networks

Efficiency of Selective Repeat
[Figure: timeline between A and B showing tprop, the frame time tf = nf/R, tproc, the ACK time tack = na/R, and tprop; t0 = total time to transmit 1 frame.]
⚫ Assume Pf is the frame loss probability; then the number of transmissions required to deliver a frame is 1/(1 - Pf):
  ⚫ Average transmission time: tf/(1 - Pf)
$$\eta_{SR} = \frac{\dfrac{n_f - n_o}{t_f/(1 - P_f)}}{R} = \left(1 - \frac{n_o}{n_f}\right)(1 - P_f)$$

Comparison of ARQ Efficiencies
Assume na, no are negligible relative to nf, and L = 2(tprop + tproc)R/nf, Ws = L + 1
Selective-Repeat:
$$\eta_{SR} = (1 - P_f)\left(1 - \frac{n_o}{n_f}\right) \approx (1 - P_f)$$
Go-Back-N:
$$\eta_{GBN} = \frac{1 - P_f}{1 + (W_S - 1)P_f} = \frac{1 - P_f}{1 + L P_f}$$
Stop-and-Wait:
$$\eta_{SW} = \frac{1 - P_f}{1 + \frac{n_a}{n_f} + \frac{2(t_{prop} + t_{proc})R}{n_f}} \approx \frac{1 - P_f}{1 + L}$$
⚫ For Pf ≈ 0, SR & GBN same
⚫ For Pf → 1, GBN & SW same

Summary: Impact of Bit Error Rate on ARQ
nf = 1250 bytes = 10000 bits, na = no = 25 bytes = 200 bits
Compare S&W, GBN & SR efficiency for random bit errors with p = 0, 10^-6, 10^-5, 10^-4 and R = 1 Mbps & 100 ms
Efficiency    p = 0    10^-6    10^-5    10^-4
S&W           8.9%     8.8%     8.0%     3.3%
GBN           98%      88.2%    45.4%    4.9%
SR            98%      97%      89%      36%

Unit 02.02.01 CS 5220: COMPUTER COMMUNICATIONS
TCP Reliable Stream and Flow Control
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science TCP ARQ Model • TCP reliable stream service • Connection-oriented • Error free, without duplication, in order of sequence • TCP uses Selective Repeat ARQ • Transfers byte stream without preserving boundaries TCP Reliable Stream Service Application Layer writes bytes into send buffer through socket Application layer TCP transfers byte stream in order, without errors or duplications Write 45 bytes Write 15 bytes Write 20 bytes Application Layer reads bytes from receive buffer through socket Read 40 bytes Read 40 bytes Transport layer Segments Transmitter Receiver Receive buffer Send buffer ACKs TCP ARQ Environment • Operates over best effort service of IP that is not wirelike • Packets can arrive with errors or be lost • Packets can arrive out-of-order • Packets can arrive after very long delays • Old segments from previous connections may arrive, so detection and elimination of duplicates is hard TCP ARQ Sequence # • Sequence Numbers • Seq. # is number of first byte in segment payload • Very long Seq. #s (32 bits) to deal with long delays • Initial sequence numbers negotiated during connection setup (to deal with very old duplicates) • Accept segments within a receive window • Timeout at the end of connection to clear old segments TCP Connections ⚫ TCP Connection ⚫ ⚫ Connection Setup with Three-Way Handshake ⚫ ⚫ Three-way exchange to negotiate initial Seq. #’s for connections in each direction Data Transfer ⚫ ⚫ Identified uniquely by Send IP Address, Send TCP Port #, Receive IP Address, Receive TCP Port # Exchange segments carrying data Graceful Close ⚫ Close each direction separately Initial Seq. # from client to server SYN bit set indicates request to establish connection from client to server Transmitter Receiver Send Window Slast + Wa-1 ... bytes transmitted & ACKed ... Slast Srecent Receive Window Rlast Rlast + WR – 1 ... 
Slast + Ws – 1 Slast oldest unacknowledged byte Srecent highest-numbered transmitted byte Slast+Wa-1 highest-numbered byte that can be transmitted Slast+Ws-1 highest-numbered byte that can be accepted from the application Rnext Rnew Rlast highest-numbered byte not yet read by the application Rnext next expected byte Rnew highest numbered byte received correctly Rlast+WR-1 highest-numbered byte that can be accommodated in receive buffer TCP Data Exchange ⚫ ⚫ Application Layers write bytes into buffers TCP sender forms segments ⚫ ⚫ ⚫ ⚫ ⚫ ⚫ When bytes exceed threshold or timer expires Upon PUSH command from applications Consecutive bytes from buffer inserted in payload Sequence # & ACK # inserted in header Checksum calculated and included in header TCP receiver ⚫ ⚫ Performs selective repeat ARQ functions Writes error-free, in-sequence bytes to receive buffer TCP Sequence # • The segment contains a sequence number that corresponds to the number of the first byte in the string that is being transmitted • Significantly differs from ARQs TCP Flow Control ⚫ ⚫ ⚫ TCP receiver controls rate at which sender transmits to prevent buffer overflow TCP receiver advertises a window size specifying number of bytes that can be accommodated by receiver WA = WR – (Rnew – Rlast) TCP sender obliged to keep # outstanding bytes below WA (Srecent - Slast) ≤ WA Send Window Receive Window Slast + WA-1 ... ... Slast Srecent WA ... 
Slast + Ws – 1 Rlast Rnew Rlast + WR – 1

TCP Retransmission Timeout
⚫ TCP retransmits a segment after timeout period
  ⚫ Timeout too short: excessive number of retransmissions
  ⚫ Timeout too long: recovery too slow
  ⚫ Timeout depends on RTT: time from when segment is sent to when ACK is received
⚫ Round trip time (RTT) in Internet is highly variable
  ⚫ Routes vary and can change in mid-connection
  ⚫ Traffic fluctuates

Adaptive RTT
⚫ TCP uses adaptive estimation of RTT
⚫ Measure RTT each time ACK received: $\tau_n$
$$t_{RTT}(new) = \alpha\, t_{RTT}(old) + (1 - \alpha)\,\tau_n$$
⚫ $\alpha$ = 7/8 typical

RTT Variability
⚫ Estimate variance $\sigma^2$ of RTT variation
⚫ Estimate for timeout: $t_{out} = t_{RTT} + k\,\sigma_{RTT}$
⚫ If RTT is highly variable, the timeout increases accordingly
⚫ If RTT is nearly constant, the timeout is close to the RTT estimate
⚫ In practice, approximate estimation of deviation:
$$d_{RTT}(new) = \beta\, d_{RTT}(old) + (1 - \beta)\,|\tau_n - t_{RTT}|$$
$$t_{out} = t_{RTT} + 4\, d_{RTT}$$

Summary of the Lesson
⚫ TCP flow control is based on selective repeat ARQ, yet varies in several key aspects.

Unit 02.02.02 CS 5220: COMPUTER COMMUNICATIONS
Framing and PPP
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science Data Link Protocols A ⚫ ⚫ ⚫ Packets Packets Data link layer Data link layer Physical layer Frames Physical layer B Directly connected, wire-like Losses & errors, but no out-ofsequence frames Applications: Direct Links; LANs; Connections across WANs Data Links Services ⚫ Framing ⚫ Error control ⚫ Flow control ⚫ Multiplexing ⚫ Link Maintenance ⚫ Security: Authentication & Encryption Examples ⚫ PPP ⚫ HDLC ⚫ Ethernet LAN ⚫ IEEE 802.11 (Wi Fi) LAN Framing ⚫ ⚫ Mapping stream of physical layer bits into frames Mapping frames into bit stream Frame boundaries can be determined using: ⚫ ⚫ ⚫ Character Counts Control Characters Flags CRC Checks 0110110111 ⚫ received frames Framing 0111110101 ⚫ transmitted frames Character-Oriented Framing (Byte Stuffing) Data to be sent A DLE B ETX DLE STX E After stuffing and framing DLE STX A DLE DLE B ETX DLE DLE STX E DLE ETX ⚫ ⚫ ⚫ Frames consist of integer number of bytes Special 8-bit patterns used as control characters Byte used to carry non-printable characters in frame ⚫ ⚫ ⚫ ⚫ DLE (data link escape) = 0x10 DLE STX (DLE ETX) used to indicate beginning (end) of frame Insert extra DLE in front of occurrence of DLE STX (DLE ETX) in frame All DLEs occur in pairs except at frame boundaries Flag-based Framing & Bit Stuffing HDLC frame Flag Address Control Information FCS Flag any number of bits ⚫ Flag-based frame synchronization is for transferring an arbitrary number of bits within a frame Bit Stuffing ⚫ Frame delineated by flag character ⚫ It uses bit stuffing to prevent occurrence of flag 01111110 (HEX 7E) inside the frame ⚫ Transmitter inserts extra 0 after each consecutive five 1s inside the frame ⚫ Receiver checks for five consecutive 1s ⚫ ⚫ ⚫ if next bit = 0, it is removed if next two bits are 10, then flag is detected If next two bits are 11, then frame has errors Example: Bit stuffing & de-stuffing (a) Data to be sent 0110111111111100 After stuffing and framing 
0111111001101111101111100001111110
(b) Data received 0111111001101111101111100001111110
After destuffing and deframing 0110111111111100

PPP: Point-to-Point Protocol
⚫ Data link protocol for point-to-point lines in Internet
  ⚫ Router-router; dial-up to router
1. Provides Framing and Error Detection
  ⚫ Character-oriented HDLC-like frame structure
2. Link Control Protocol
  ⚫ Bringing up, testing, bringing down lines; negotiating options
  ⚫ Authentication: key capability in ISP access
3. A family of Network Control Protocols specific to different network layer protocols
  ⚫ IP, OSI network layer, IPX (Novell), Appletalk

PPP Frame
Flag 01111110 | Address 11111111 | Control 00000011 | Protocol | Information | CRC | Flag 01111110
⚫ Address: all stations are to accept the frame
⚫ Control: unnumbered frame
⚫ Protocol: specifies what kind of packet is contained in the payload, e.g., LCP, NCP, IP, OSI CLNP, IPX
⚫ Information: integer # of bytes
⚫ PPP uses a frame structure similar to HDLC, except
  ⚫ Protocol type field
  ⚫ Payload contains an integer number of bytes
⚫ PPP uses the same flag, but uses byte stuffing
⚫ Problems with PPP byte stuffing
  ⚫ Size of frame varies unpredictably due to byte insertion
  ⚫ Malicious users can inflate bandwidth by inserting 7D & 7E

Byte-Stuffing in PPP
⚫ PPP is character-oriented version of HDLC
⚫ Flag is 0x7E (01111110)
⚫ Control escape 0x7D (01111101)
⚫ Any occurrence of flag or control escape inside of frame is replaced with 0x7D followed by original octet XORed with 0x20 (00100000)
Data to be sent: 41 7D 42 7E 50 70 46
After stuffing and framing: 7E 41 7D 5D 42 7D 5E 50 70 46 7E

PPP Applications
PPP used in many point-to-point applications
⚫ Telephone Modem Links 30 kbps
⚫ Packet over SONET 600 Mbps to 10 Gbps
  ⚫ IP→PPP→SONET
⚫ PPP is also used over shared links such as Ethernet to provide LCP, NCP, and authentication features
  ⚫ PPP over Ethernet (RFC 2516)
  ⚫ Used over DSL

PPP Authentication
⚫ Password Authentication Protocol
  ⚫ Initiator must send ID & password
  ⚫ Authenticator replies with authentication
success/fail After several attempts, LCP closes link Transmitted unencrypted, susceptible to eavesdropping Challenge-Handshake Authentication Protocol (CHAP) ⚫ ⚫ ⚫ ⚫ ⚫ Initiator & authenticator share a secret key Authenticator sends a challenge (random # & ID) Initiator computes cryptographic checksum of random # & ID using the shared secret key Authenticator calculates cryptocgraphic checksum & compares to response Authenticator can reissue challenge during session Summary ⚫ Bit stuffing and byte stuffing are used for framing ⚫ PPP uses byte stuffing ⚫ CHAP is a security protocol Unit 02.02.03 CS 5220: COMPUTER COMMUNICATIONS HDLC, Multiplexing XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science High-Level Data Link Control (HDLC) ⚫ ⚫ ⚫ Bit-oriented data link control Derived from IBM Synchronous Data Link Control (SDLC) Related to Link Access Procedure Balanced (LAPB) ⚫ ⚫ LAPD in ISDN LAPM in cellular telephone signaling Network layer NLPDU Network layer “Packet” DLSDU DLSAP DLSAP Data link layer DLPDU “Frame” Physical layer DLSDU Data link layer Physical layer HDLC Data Transfer Modes ⚫ Normal Response Mode (NRM) ⚫ Used in polling multi-drop lines Commands Primary Responses Secondary ⚫ Secondary Secondary Asynchronous Balanced Mode (ABM) ⚫ Used in full-duplex point-to-point links Primary Commands Secondary Responses Responses Secondary Commands Primary HDLC Frame Format Flag Address Control ⚫ ⚫ Information FCS Flag Control field gives HDLC its functionality Codes in fields have specific meanings and uses ⚫ ⚫ ⚫ ⚫ ⚫ Flag: delineate frame boundaries Address: identify secondary station (1 or more octets) ⚫ In ABM mode, a station can act as primary or secondary so address changes accordingly Control: purpose & functions of frame (1 or 2 octets) Information: contains user data; length not standardized, but implementations impose maximum Frame Check Sequence: 16- or 32-bit CRC Control Field Format Information Frame 1 2-4 0 N(S) 5 6-8 P/F N(R) P/F N(R) 
Supervisory Frame 1 0 S S Unnumbered Frame 1 1 M M P/F M M M Error Detection & Loss Recovery ⚫ Frames lost due to loss-of-synch or receiver buffer overflow ⚫ Frames may undergo errors in transmission ⚫ CRCs detect errors and such frames treated as lost ⚫ Recovery through ACKs, timeouts & retransmission ⚫ Sequence numbering to identify out-of-sequence & duplicate frames ⚫ HDLC provides for options that implement several ARQ methods Statistical Multiplexing ⚫ ⚫ Multiplexing concentrates bursty traffic onto a shared line Greater efficiency and lower cost Header Data payload A B Buffer Output line C Input lines Tradeoff Delay for Efficiency (a) Dedicated lines A2 A1 B2 B1 C1 (b) ⚫ ⚫ Shared lines A1 C2 C1 B1 A2 B2 C2 Dedicated lines involve not waiting for other users, but lines are used inefficiently when user traffic is bursty Shared lines concentrate packets into shared line; packets buffered (delayed) when line is not immediately available Multiplexers inherent in Packet Switches 1 1 2 2 ⚫ ⚫ ⚫ N ⚫ ⚫ ⚫ ⚫ ⚫ N Packets/frames forwarded to buffer (queue) prior to transmission from switch Multiplexing occurs in these buffers; FIFIO vs priority scheduling Delay and Utilization Tradeoff ⚫ Buffering introduces packet delay ⚫ Buffer overflow introduces packet loss ⚫ End-to-end protocols deals with loss Multiplexer Modeling Input lines A Output line B Buffer C ⚫ Arrivals: What is the packet inter-arrival pattern? Service Time: How long are the packets? Service Discipline: What is order of transmission? Buffer Discipline: If buffer is full, which packet is dropped? 
⚫ Performance Measures: Delay Distribution; Packet Loss Probability; Line Utilization ⚫ ⚫ ⚫ Delay = Waiting + Service Times Packet completes transmission P2 P1 P3 P5 Service time Packet begins transmission Packet arrives at queue P1 P4 P2 P3 P4 P5 Waiting time Delay = Waiting + Service Times ⚫ ⚫ ⚫ ⚫ Packets arrive and wait for service Waiting Time: from arrival instant to beginning of service Service Time: time to transmit packet Delay: total time in system = waiting time + service time Fluctuations in Packets in the System (a) Dedicated lines A1 A2 B2 B1 C2 C1 (b) Shared line (c) N(t) Number of packets in the system A1 C1 B1 A2 B2 C2 Poisson Arrivals & Queuing ⚫ ⚫ ⚫ ⚫ Average Arrival Rate: l packets per second Arrivals are equally-likely to occur at any point in time Time between consecutive arrivals is an exponential random variable with mean 1/ l Number of arrivals in interval of time t is a Poisson random variable with mean lt ( l t ) k − lt P k arrivals in t seconds = e k! Summary ⚫ Multiplexing makes good tradeoff of delay and utilization ⚫ Queueing involves sophisticated modeling and analysis Unit 02.03.01 CS 5220: COMPUTER COMMUNICATIONS Medium Access Control XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science Medium Access Control ⚫ Broadcast LANs ⚫ ⚫ ⚫ ⚫ Simple and Cheap All information sent to all users No routing Medium Access Control ⚫ How to coordinate access to shared medium Multiple Access Communications ⚫ Shared media basis for broadcast networks ⚫ ⚫ ⚫ Inexpensive: radio over air; copper or coaxial cable M users communicate by broadcasting into medium Key issue: How to share the medium when there is a competition for it? 
3 2 4 1 Shared multiple access medium M 5 Approaches to Media Sharing Medium sharing techniques Static channelization ⚫ ⚫ ⚫ ⚫ Partition medium Dedicated allocation to users Satellite transmission Cellular Telephone Dynamic medium access control Scheduling ⚫ ⚫ ⚫ ⚫ Random access Polling: take turns Request for slot in transmission schedule Token ring Wireless LANs ⚫ ⚫ ⚫ ⚫ Loose coordination Send, wait, retry if necessary Aloha Ethernet Channelization Example: Satellite Satellite Channel uplink fin downlink fout Channelization: Cellular uplink f1 ; downlink f2 uplink f3 ; downlink f4 Why Channelization? ⚫ Channelization ⚫ Semi-static bandwidth allocation of portion of shared medium to a given user ⚫ Highly efficient for constant-bit rate traffic ⚫ Preferred approach in ⚫ ⚫ Cellular telephone networks Terrestrial & satellite broadcast radio & TV Channelization Approaches ⚫ Frequency Division Multiple Access (FDMA) ⚫ ⚫ ⚫ Time Division Multiple Access (TDMA) ⚫ ⚫ ⚫ Frequency band allocated to users Broadcast radio & TV, analog cellular phone Periodic time slots allocated to users Telephone backbone, GSM digital cellular phone Code Division Multiple Access (CDMA) ⚫ ⚫ Code allocated to users Cellular phones, 3G cellular Why not Channelization? ⚫ ⚫ ⚫ ⚫ Inflexible in allocation of bandwidth to users with different requirements Inefficient for bursty traffic Does not scale well to large numbers of users Dynamic MAC much better at handling bursty traffic Scheduling: Token-Passing token Ring networks Data to M Station that holds token transmits into ring Scheduling: Polling Inbound line Data from 1 Poll 1 Host computer Outbound line 1 2 M 3 Stations Random Access Multi-tapped (mulit-access) Bus Collision!! 
Transmit when ready
Transmission collisions can occur; need retransmission strategy

Delay-Bandwidth Product
⚫ Delay-bandwidth product is key parameter
  ⚫ Coordination in sharing medium involves using bandwidth (explicitly or implicitly)
  ⚫ Difficulty of coordination commensurate with delay-bandwidth product
Delay-bandwidth product is 2(tprop + tproc) * R, or RTT * R (if tproc is negligible)

Two-Station Example
⚫ Simple two-station example
  ⚫ Station with frame to send listens to channel and transmits if channel found idle
  ⚫ Station monitors channel to detect collision
  ⚫ If collision occurs, the station that began transmitting earlier retransmits (the propagation time is fixed & known between two stations)
[Figure: stations A and B on a shared channel.]

Two-Station MAC Example: Case I
Two stations are trying to share a common channel
[Figure: A and B are d meters apart, tprop = d/ν seconds. A transmits at t = 0.]
Case 1: B does not transmit before t = tprop & A captures channel

Two-Station MAC Example: Case II
Two stations are trying to share a common medium
[Figure: A and B are d meters apart, tprop = d/ν seconds. A transmits at t = 0; A detects collision at t = 2 tprop.]
Case 2: B transmits before t = tprop and detects collision soon thereafter

Efficiency of Two-Station Example
⚫ Each frame transmission requires 2tprop of quiet time
  ⚫ Station B needs to be quiet tprop before and after time when Station A transmits
  ⚫ R transmission bit rate (bandwidth)
  ⚫ L bits/frame
$$\text{Efficiency} = \rho_{max} = \frac{L}{L + 2t_{prop}R} = \frac{1}{1 + 2t_{prop}R/L} = \frac{1}{1 + 2a}$$
$$\text{MaxThroughput} = R_{eff} = \frac{L}{L/R + 2t_{prop}} = \frac{1}{1 + 2a}\,R \text{ bits/second}$$

Summary: Normalized Delay-Bandwidth
⚫ Normalized delay-bandwidth product plays a key role in performance of medium access control protocols
$$a = \frac{t_{prop}}{L/R} = \frac{\text{propagation delay}}{\text{time to transmit a frame}}$$

Unit 02.03.02 CS 5220: COMPUTER COMMUNICATIONS
MAC Random Access - ALOHA
XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Random Access
⚫ Random access
  ⚫ No scheduling overhead
  ⚫ Delay-bandwidth product is important factor
  ⚫ Uniform frame size

ALOHA
⚫ Wireless link to provide data transfer between campuses of University of Hawaii, a simple scheme
  ⚫ A station transmits whenever it has data to transmit
  ⚫ If more than one frame is transmitted at the same time, the frames collide and are lost
  ⚫ If ACK not received within timeout, then a station picks random backoff
  ⚫ Station retransmits frame after backoff time
[Figure: the first transmission occupies t0 to t0+X; the vulnerable period runs from t0-X to t0+X; the time-out is at t0+X+2tprop; after backoff period B, the retransmission starts at t0+X+2tprop+B.]

ALOHA Model
⚫ Definitions and assumptions
  ⚫ X: frame transmission time (assume constant)
  ⚫ S: throughput (average # successful frame transmissions per X seconds)
  ⚫ G: load (average # transmission attempts per X sec.)
  ⚫ Psuccess: probability a frame transmission is successful
$$S = G\,P_{success}$$
[Figure: a prior interval of length X followed by the reference frame transmission of length X.]
⚫ Any transmission that begins during vulnerable period leads to collision
⚫ Success if no arrivals during 2X seconds

Abramson’s Assumption
⚫ What is probability of no arrivals in vulnerable period?
⚫ Abramson assumption: Effect of backoff algorithm is that frame arrivals are equally likely to occur at any time interval
⚫ G is avg. # arrivals per X seconds; divide X into n intervals of duration Δ = X/n
⚫ p = probability of arrival in Δ interval, then G = n p since n intervals in X seconds
$$P_{success} = P[0 \text{ arrivals in } 2X \text{ seconds}] = P[0 \text{ arrivals in } 2n \text{ intervals}] = (1 - p)^{2n} = \left(1 - \frac{G}{n}\right)^{2n} \to e^{-2G} \text{ as } n \to \infty$$
$$S = G e^{-2G}$$

Slotted ALOHA
⚫ ALOHA performance depends on probability of collisions
⚫ Reducing vulnerable period can reduce collision probability
⚫ Slotted ALOHA reduces collision probability by constraining the stations to transmit in a synchronized manner.
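The pure ALOHA result S = G e^(-2G), obtained as the limit of (1 - G/n)^(2n) under Abramson's assumption, can be explored numerically. The sketch below is illustrative only; the function names are hypothetical and the grid search over G is just one simple way to locate the peak.

```python
import math


def p_success_finite(G, n):
    """P[0 arrivals in 2n mini-intervals] with per-interval probability p = G/n."""
    return (1.0 - G / n) ** (2 * n)


def aloha_throughput(G):
    """Limiting pure ALOHA throughput S = G * exp(-2G)."""
    return G * math.exp(-2.0 * G)


if __name__ == "__main__":
    G = 0.5
    # The finite-n expression approaches exp(-2G) as n grows.
    for n in (10, 100, 10_000):
        print(n, p_success_finite(G, n), math.exp(-2.0 * G))
    # Locate the load that maximizes throughput on a grid of step 0.001.
    best = max(range(1, 3001), key=lambda g: aloha_throughput(g / 1000))
    print("peak at G =", best / 1000, "S =", aloha_throughput(best / 1000))
```

On this grid the peak falls at G = 0.5, where S = 1/(2e), which is about 0.184 frames per frame time.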
Slotted ALOHA Model
⚫ Time is slotted in X seconds slots (X is frame transmission time)
⚫ Stations synchronized to frame times (may incur waiting time)
⚫ Stations transmit frames only at the beginning of a time slot
⚫ Backoff intervals in multiples of slots
[Figure: slots begin at kX and (k+1)X; the vulnerable period is one slot; the time-out is at t0+X+2tprop; after backoff period B, the retransmission starts at t0+X+2tprop+B.]

Throughput of Slotted ALOHA
$$S = G\,P_{success} = G\,P[\text{no arrivals in } X \text{ seconds}] = G\,P[\text{no arrivals in } n \text{ intervals}] = G(1 - p)^n = G\left(1 - \frac{G}{n}\right)^n \to G e^{-G}$$
[Figure: throughput S versus load G. Slotted ALOHA, S = Ge^-G, peaks at 1/e = 0.368; ALOHA, S = Ge^-2G, peaks at 1/2e = 0.184.]

Summary
⚫ ALOHA schemes are simple, but have low maximum system throughput

Unit 02.03.03 CS 5220: COMPUTER COMMUNICATIONS
Random Access: CSMA & CSMA-CD
XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

CSMA
⚫ Carrier Sensing Multiple Access (CSMA)
  ⚫ When collisions occur they involve entire frame transmission time
  ⚫ Sensing for detecting an ongoing transmission
  ⚫ Avoids transmission certain to cause collisions

Carrier Sensing Multiple Access (CSMA)
⚫ A station senses the channel before it starts transmission
  ⚫ If busy, either wait or schedule backoff (different options)
  ⚫ If idle, start transmission
  ⚫ Vulnerable period is reduced to tprop (due to channel capture effect)
[Figure: Station A begins transmission at t = 0; Station A captures channel at t = tprop.]

CSMA Options: 1-persistent CSMA
⚫ Transmitter behavior when busy channel is sensed
⚫ 1-persistent CSMA (most greedy)
  ⚫ Start transmission as soon as the channel becomes idle
  ⚫ Low delay and low efficiency

1-Persistent CSMA Throughput
[Figure: throughput S versus load G, with peaks of 0.53 for a = 0.01, 0.45 for a = 0.1, and 0.16 for a = 1.]
⚫ Normalized propagation delay a (tprop/X)
⚫ Better than Aloha & slotted Aloha for small a
⚫ Worse than Aloha for a > 1

CSMA Options
⚫ Non-persistent CSMA (least greedy)
  ⚫ If busy, wait a backoff period, then sense carrier again
  ⚫ High delay and high efficiency
⚫ p-persistent CSMA (adjustable greedy)
  ⚫ Wait till channel becomes idle, transmit with probability p; or wait one tprop & re-sense with probability 1-p
  ⚫ Delay and efficiency can be balanced

Non-Persistent CSMA Throughput
[Figure: throughput S versus load G, with peaks of 0.81 for a = 0.01, 0.51 for a = 0.1, and 0.14 for a = 1.]
⚫ Higher maximum throughput than 1-persistent for small a
⚫ Worse than Aloha for a > 1

CSMA with Collision Detection (CSMA/CD)
⚫ Monitor for collisions & abort transmission
  ⚫ Stations with frames to send first do carrier sensing
  ⚫ After beginning transmissions, stations continue listening to the medium to detect collisions
  ⚫ If collisions are detected, all stations involved abort transmission, reschedule random backoff times, and try again at scheduled times; quickly terminating a damaged frame saves time & bandwidth (T_trans >> T_prop)
⚫ In CSMA, collisions result in wastage of entire frame transmission time
⚫ CSMA-CD reduces wastage to time to detect collision & abort transmission

CSMA/CD reaction time
[Figure: A begins to transmit at t = 0; B begins to transmit at t = tprop - δ; B detects collision at t = tprop; A detects collision at t = 2 tprop - δ, where δ is a small interval.]
It takes 2 tprop to find out if channel has been captured

CSMA-CD Model
⚫ Collisions can be detected and resolved in 2tprop
⚫ Time slotted in 2tprop slots during contention periods
⚫ Once the contention period is over (a station successfully occupies the channel), it takes X seconds for a frame to be transmitted
⚫ It takes tprop before the next contention period starts.
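The CSMA-CD model describes one cycle as a contention period made of 2tprop slots, then X seconds of frame transmission, then tprop before the next contention period. As a rough sketch, the fraction of a cycle spent on useful transmission can be computed for a given number of contention slots. The function name and the choice to treat the slot count as an input are assumptions made for illustration.

```python
def csma_cd_cycle_efficiency(frame_time, t_prop, contention_slots):
    """Fraction of one CSMA-CD cycle spent transmitting a frame.

    frame_time:       X, seconds to transmit one frame.
    t_prop:           one-way propagation delay in seconds.
    contention_slots: number of 2*t_prop slots used to resolve contention
                      (an input here; it may be fractional for an average).
    """
    contention = contention_slots * 2.0 * t_prop   # contention period
    cycle = contention + frame_time + t_prop        # full cycle length
    return frame_time / cycle


if __name__ == "__main__":
    X = 1.0  # normalize the frame time to 1
    for a in (0.01, 0.1, 1.0):          # a = t_prop / X
        for slots in (0, 1, 3):
            eff = csma_cd_cycle_efficiency(X, a * X, slots)
            print(f"a={a:<5} slots={slots}  efficiency={eff:.3f}")
```

With zero contention slots the efficiency reduces to X/(X + tprop), so even an uncontended channel loses a fraction that grows with the normalized propagation delay a.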
[Figure: channel timeline alternating between Busy, Contention, and Idle periods.]

Contention Resolution
⚫ Contention is resolved if exactly 1 station transmits in a slot
⚫ Assume n busy stations, and each may transmit with probability p in each contention time slot
$$P_{success} = n p (1 - p)^{n-1}$$
⚫ By taking the derivative of Psuccess we find the max occurs at p = 1/n
$$P_{success}^{max} = n \cdot \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} \to \frac{1}{e}$$
⚫ On average, 1/Pmax = e = 2.718 time slots to resolve contention
Average Contention Period = $2 t_{prop} e$ seconds

CSMA/CD Throughput
⚫ At maximum throughput, the system alternates between contention periods and frame transmission times
$$\rho_{max} = \frac{X}{X + t_{prop} + 2e\,t_{prop}} = \frac{1}{1 + (2e + 1)a} = \frac{1}{1 + (2e + 1)Rd/(\nu L)}$$
⚫ where:
  R bits/sec, L bits/frame, X = L/R seconds/frame
  a = tprop/X (normalized propagation delay)
  ν meters/sec is the speed of propagation in the medium
  d meters is diameter of system
  2e + 1 = 6.44

Summary: Throughput for Random Access MACs
[Figure: maximum throughput ρmax (0 to 1) versus normalized propagation delay a (0.01, 0.1, 1) for CSMA/CD, 1-P CSMA, Non-P CSMA, Slotted ALOHA, and ALOHA.]
⚫ For small a: CSMA-CD has best maximum throughput
⚫ For larger a: Aloha & slotted Aloha have better maximum throughput, since they do not depend on a (normalized propagation delay)

Unit 02.03.04 CS 5220: COMPUTER COMMUNICATIONS
Scheduling Approaches
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science Scheduling for MAC ⚫ Schedule frame transmissions to avoid collision in shared medium ✓ ✓ ✓ ⚫ More efficient channel utilization Less variability in delays Can provide fairness to stations Increased computational or procedural complexity Two main approaches ⚫ ⚫ Reservation Polling Collision-free Reservation System Reservation interval r d Frame transmissions d d r d Cycle n r ⚫ ⚫ = 1 2 d Cycle (n + 1) 3 M Transmissions organized into cycles Cycle: a reservation interval + frame transmissions d Time Reservation Scheme ⚫ The reservation interval has a mini-slot for each station to request reservations for frame transmissions ⚫ ⚫ ⚫ ⚫ The stations announce their intention to transmit a frame by broadcasting their reservation bit during the appropriate mini-slot By listening to the reservation interval, stations determine the order of frame transmissions in the corresponding cycle Mini-slot should cover the round-trip propagation delay Collision is avoided Example (negligible propagation delay) ⚫ Initially stations 3 & 5 have reservations to transmit frames 8 (a) r 3 5 r 3 5 r 3 5 8 r 3 5 8 r 3 t ⚫ Cycle 3: Station 8 becomes active and makes reservation ⚫ Cycle 3: now includes frame transmission from station 8 Example (non-negligible prop. 
delay)
⚫ Initially stations 3 & 5 have reservations to transmit frames
[Figure (b): timeline of successive cycles, each a reservation interval r followed by the reserved frame transmissions; station 8 joins stations 3 and 5]
⚫ Cycle 2: Station 8 becomes active and makes reservation
⚫ Cycle 3: now includes frame transmission from station 8

Reservation System Options
⚫ Centralized or distributed system
   ⚫ Centralized systems: A central controller listens to reservation information, decides order of transmission, issues grants
   ⚫ Distributed systems: Each station determines its slot for transmission from the reservation information

Reservation System Options
⚫ Single or Multiple Frames
   ⚫ Single frame reservation: Only one frame transmission can be reserved within a reservation cycle
   ⚫ Multiple frame reservation: More than one frame transmission can be reserved within a minislot
⚫ Channelized or Random Access Reservations
   ⚫ Channelized (typically TDMA) reservation: Reservation messages from different stations are multiplexed without any risk of collision
   ⚫ Random access reservation: Each station transmits its reservation message randomly until the message goes through

Efficiency of Reservation Systems
⚫ Assume mini-slot duration = vX (v < 1; negligible delay)
⚫ A single frame reservation scheme, M stations
   ⚫ A single frame transmission requires (1+v)X seconds
   ⚫ Link is fully loaded when all stations transmit; maximum efficiency is:
      $\rho_{max} = \frac{MX}{MvX + MX} = \frac{1}{1 + v}$
⚫ A k frame reservation scheme
   ⚫ If k frame transmissions can be reserved with a reservation message and if there are M stations, as many as Mk frames can be transmitted in XM(k+v) seconds
   ⚫ Maximum efficiency is:
      $\rho_{max} = \frac{MkX}{MvX + MkX} = \frac{1}{1 + v/k}$

Random Access Reservation Systems
⚫ Large number of light-traffic stations
   ⚫ Dedicating a minislot to each station is inefficient
⚫ Slotted ALOHA reservation scheme
   ⚫ Stations use slotted Aloha on reservation minislots
   ⚫ On average, each reservation takes at least e minislot attempts
   ⚫ Effective time required for the reservation is evX = 2.718vX
      $\rho_{max} = \frac{X}{X(1 + ev)} = \frac{1}{1 + 2.718v}$

Example: GPRS
⚫
General Packet Radio Service
   ⚫ Packet data service in GSM cellular radio
   ⚫ GPRS devices, e.g. cellphones or laptops, send packet data over radio and then to Internet
   ⚫ Slotted Aloha MAC used for reservations
   ⚫ Single & multi-slot reservations supported

Polling Systems
⚫ Centralized systems: A central controller accepts requests from stations and issues grants to transmit
⚫ Distributed systems: Stations implement a decentralized algorithm to determine transmission order
[Figure: central controller polling the stations]

Polling System Options
⚫ Service Limits: How much is a station allowed to transmit per poll?
   ⚫ Exhaustive: until station’s data buffer is empty (including new frame arrivals)
   ⚫ Gated: all data in buffer when poll arrives
   ⚫ Frame-Limited: one frame per poll
   ⚫ Time-Limited: up to some maximum time
⚫ Priority mechanisms
   ⚫ More bandwidth and lower delay for stations that appear multiple times in the polling list
   ⚫ Issue polls for stations with message of priority k or higher

Comparison of MAC approaches
⚫ Channelization
   ⚫ Feasible if traffic is steady
⚫ Aloha & Slotted Aloha
   ⚫ Simple & quick transfer at very low load
   ⚫ Accommodates large number of low-traffic bursty users
   ⚫ Highly variable delay at moderate loads
   ⚫ Efficiency does not depend on a
⚫ CSMA & CSMA-CD
   ⚫ Quick transfer and high efficiency for low delay-bandwidth product
   ⚫ Can accommodate large number of bursty users
   ⚫ Variable and unpredictable delay

Comparison of MAC approaches (Cont)
⚫ Reservation
   ⚫ On-demand transmission of bursty or steady streams
   ⚫ Accommodates large number of low-traffic users with slotted Aloha reservations
   ⚫ Can incorporate QoS
   ⚫ Handles large delay-bandwidth product via delayed grants
⚫ Polling
   ⚫ Generalization of time-division multiplexing
   ⚫ Provides fairness through regular access opportunities
   ⚫ Can provide bounds on access delay
   ⚫ Performance deteriorates with large delay-bandwidth product

Typical MAC Efficiencies
⚫ Two-Station Example: Efficiency = $\frac{1}{1 + 2a}$
⚫ CSMA-CD protocol: Efficiency = $\frac{1}{1 + 6.44a}$
⚫ Token-ring network: Efficiency = $\frac{1}{1 + a'}$
   a′ = latency of the ring (bits) / average frame length
⚫ If a << 1, then efficiency is close to 100%
⚫ As a approaches 1, the efficiency becomes low

Typical Delay-Bandwidth Products (bits)
Distance     10 Mbps        100 Mbps       1 Gbps         Network Type
1 m          3.33 x 10^-2   3.33 x 10^-1   3.33 x 10^0    Desk area network
100 m        3.33 x 10^0    3.33 x 10^1    3.33 x 10^2    Local area network
10 km        3.33 x 10^2    3.33 x 10^3    3.33 x 10^4    Metropolitan area network
1000 km      3.33 x 10^4    3.33 x 10^5    3.33 x 10^6    Wide area network
100000 km    3.33 x 10^6    3.33 x 10^7    3.33 x 10^8    Global area network
⚫ Long and/or fat pipes give large delay-bandwidth products and large a (normalized delay)

Summary: MAC protocol features
⚫ Delay-bandwidth product / normalized delay
⚫ Efficiency
⚫ Transfer delay
⚫ Fairness
⚫ Reliability
⚫ Capability to carry different types of traffic
⚫ Quality of service
⚫ Cost

Unit 02.04.01
CS 5220: COMPUTER COMMUNICATIONS
Local Area Networks (LANs)
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

What is a LAN?
⚫ Private ownership
   ⚫ freedom from regulatory constraints of WANs
⚫ Short distance (~1km) between computers
   ⚫ low cost
   ⚫ high-speed, relatively error-free communication
   ⚫ complex error control unnecessary
⚫ Machines are constantly moved
   ⚫ Keeping track of location of computers a chore
   ⚫ Simply give each machine a unique address
   ⚫ Broadcast messages to all machines in the LAN
⚫ Need a medium access control protocol

Typical LAN Structure
[Figure: stations attached to a shared transmission medium through a NIC containing an Ethernet processor, RAM, and ROM]
⚫ Transmission Medium
⚫ Network Interface Card (NIC)
⚫ Unique MAC “physical” address

Medium Access Control Sublayer
In IEEE 802.1, Data Link Layer divided into:
1. Medium Access Control Sublayer
   ⚫ Coordinate access to medium
   ⚫ Connectionless frame transfer service
   ⚫ Machines identified by MAC/physical address
   ⚫ Broadcast frames with MAC addresses
2. Logical Link Control Sublayer
   ⚫ Between Network layer & MAC sublayer

MAC Sub-layer
[Figure: the OSI data link layer mapped onto IEEE 802: 802.2 Logical Link Control (LLC) above the MAC sublayers 802.3 CSMA-CD, 802.5 Token Ring, 802.11 Wireless LAN, and other LANs, each over its own physical layer]

Logical Link Control Layer
⚫ IEEE 802.2: LLC enhances service provided by MAC
[Figure: stations A and C; LLC provides a reliable frame service on top of the unreliable datagram service of the MAC and PHY layers]

Logical Link Control Services
⚫ Type 1: Unacknowledged connectionless service
   ⚫ Unnumbered frame mode of HDLC
⚫ Type 2: Reliable connection-oriented service
   ⚫ Asynchronous balanced mode of HDLC
⚫ Type 3: Acknowledged connectionless service
⚫ Additional addressing
   ⚫ A workstation has a single MAC physical address
   ⚫ Can handle several logical connections, distinguished by their SAP (service access points)

Encapsulation of MAC frames
[Figure: an IP packet is carried as data in an LLC PDU (LLC header + data), which is carried in a MAC frame (MAC header + LLC PDU + FCS)]

Ethernet - A bit of history…
⚫ 1970 ALOHAnet radio network deployed in Hawaiian islands
⚫ 1973 Metcalfe and Boggs invent Ethernet, random access in wired net
⚫ 1979 DIX Ethernet II Standard
⚫ 1985 IEEE 802.3 LAN Standard (10 Mbps)
⚫ 1995 Fast Ethernet (100 Mbps)
⚫ 1998 Gigabit Ethernet
⚫ 2002 10 Gigabit Ethernet
⚫ Ethernet is dominant LAN standard
[Figure: Metcalfe’s Sketch]

IEEE 802.3 MAC: Ethernet
⚫ MAC Protocol: CSMA/CD
⚫ Slot Time is the critical system parameter
   ⚫ upper bound on time to detect collision
   ⚫ upper bound on time to acquire channel
   ⚫ upper bound on length of frame segment generated by collision
   ⚫ quantum for retransmission scheduling
   ⚫ At least round-trip propagation
⚫ Truncated binary exponential backoff
   ⚫ for the n-th retransmission, wait r slot times: $0 \le r < 2^k$, where k = min(n, 10)
   ⚫ Give up after 16 retransmissions

IEEE 802.3 Original Parameters
⚫ Transmission Rate: 10 Mbps
⚫ Minimum Frame: 512 bits = 64 bytes
⚫ Max Length: 2500 meters + 4 repeaters
⚫ Slot time: 51.2 µsec = 512 bits / 10 Mbps
⚫ Each x10 increase in bit rate, must be accompanied by x10
decrease in distance, or x10 increase in minimum frame size

IEEE 802.3 MAC Frame
802.3 MAC Frame (field sizes in bytes)
   Preamble / Synch (7) | SD / Start frame (1) | Destination address (6) | Source address (6) | Length (2) | Information | Pad | FCS (4)
   Destination address through FCS: 64 - 1518 bytes
⚫ Every frame transmission begins “from scratch”
⚫ Preamble helps receivers synchronize their clocks to transmitter clock
⚫ 7 bytes of 10101010 generate a square wave
⚫ Start frame byte changes to 10101011
⚫ Receivers look for change in 10 pattern

IEEE 802.3 MAC Frame
• Destination address
   • single address (first address bit = 0)
   • group address (first address bit = 1)
   • broadcast = 111...111
• Addresses
   • local or global (second address bit: 0 = global address, 1 = local address)
   • Global addresses

IEEE 802.3 MAC Frame
⚫ Length: # bytes in information field
   ⚫ Max frame 1518 bytes, excluding preamble & SD
   ⚫ Max information 1500 bytes: 0x05DC
⚫ Pad: ensures minimum frame of 64 bytes
⚫ FCS: CCITT-32 CRC, covers addresses, length, information, pad fields
   ⚫ NIC discards frames if failed CRC

Ethernet Scalability
⚫ CSMA-CD maximum throughput depends on the normalized delay-bandwidth product a = $t_{prop}$/X (X is the frame transmission time)
⚫ x10 increase in bit rate = x10 decrease in X
⚫ To keep a constant, need to either: decrease $t_{prop}$ (distance) by x10; or increase frame length x10

Fast Ethernet
Table 6.4 IEEE 802.3 100 Mbps Ethernet medium alternatives
                      100baseT4                           100baseT                             100baseFX
Medium                Twisted pair category 3 UTP,        Twisted pair category 5 UTP,         Optical fiber multimode,
                      4 pairs                             two pairs                            two strands
Max. Segment Length   100 m                               100 m                                2 km
Topology              Star                                Star                                 Star
To preserve compatibility with 10 Mbps Ethernet:
⚫ Same frame format, same interfaces, same protocols
⚫ Hub topology only with twisted pair & fiber
⚫ Bus topology & coaxial cable abandoned

Gigabit Ethernet
Table 6.3 IEEE 802.3 1 Gbps Gigabit Ethernet medium alternatives
                      1000baseSX                 1000baseLX                  1000baseCX              1000baseT
Medium                Optical fiber multimode,   Optical fiber single mode,  Shielded copper cable   Twisted pair category 5 UTP
                      two strands                two strands
Max. Segment Length   550 m                      5 km                        25 m                    100 m
Topology              Star                       Star                        Star                    Star
⚫ Slot time increased to 512 bytes; small frames need to be extended to 512 bytes
⚫ Frame bursting to allow stations to transmit burst of short frames
⚫ Frame structure preserved but CSMA-CD essentially abandoned, and operated primarily in a switched mode
⚫ Extensive deployment in backbone of enterprise data networks and server farms

10 Gigabit Ethernet
Table 6.5 IEEE 802.3 10 Gbps Ethernet medium alternatives
                      10GbaseSR              10GBaseLR               10GbaseEW               10GbaseLX4
Medium                Two optical fibers,    Two optical fibers,     Two optical fibers,     Two optical fibers,
                      multimode at 850 nm,   single-mode at 1310 nm, single-mode at 1550 nm, multimode/single-mode with four
                      64B66B code            64B66B code             SONET compatibility     wavelengths at 1310 nm band, 8B10B code
Max. Segment Length   300 m                  10 km                   40 km                   300 m to 10 km
⚫ Frame structure preserved
⚫ CSMA-CD protocol officially abandoned
⚫ LAN PHY for local network applications
⚫ WAN PHY for wide area interconnection using SONET OC-192c
⚫ Extensive deployment in metro networks and in datacenters

Unit 02.04.02
CS 5220: COMPUTER COMMUNICATIONS
Wireless LANs: CSMA-CA
XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science Wireless Data Communications ⚫ Wireless communication is compelling ✓ ✓ ✓ ✓ Easy, low-cost deployment Mobility & roaming: Access information anywhere Supports personal devices ✓ PDAs, laptops, data-cell-phones Supports communicating devices ✓ Cameras, location devices, wireless identification Susceptible to noise and interference: reliability! Signal strength varies in space & time: coverage! Signal can be captured by snoopers: security! Hidden Terminal Problem (a) C A Data Frame A transmits data frame B (b) Data Frame A B C senses medium, station A is hidden from C Data Frame C C transmits data frame & collides with A at B (a) B RTS A requests to send (b) C CTS B CTS A C B announces A ok to send (c) Data Frame B A sends (d) CSMA-CA C remains quiet ACK B B sends ACK ACK Ad Hoc Communications C A The basic service set (BSS) The basic service area (BSA) B ⚫ D An ad-hoc network: temporary association of group of stations ⚫ ⚫ Within range of each other; Need to exchange information E.g. Presentation in meeting, or distributed computer game, or both Infrastructure Network Server Gateway to Portal the Internet Portal Distribution System AP1 AP2 A1 B1 BSS A A2 BSS B B2 IEEE 802.11 Wireless LAN ⚫ Stimulated by availability of unlicensed spectrum ⚫ ⚫ ⚫ ⚫ ⚫ ⚫ U.S. 
Industrial, Scientific, Medical (ISM) bands 902-928 MHz, 2.400-2.4835 GHz, 5.725-5.850 GHz Targeted wireless LANs @ 20 Mbps MAC for high speed wireless LAN Ad Hoc & Infrastructure networks Variety of physical layers 802.11 Definitions ⚫ Basic Service Set (BSS) ⚫ ⚫ ⚫ ⚫ ⚫ Group of stations that coordinate their access using a given instance of MAC Located in a Basic Service Area (BSA) Stations in BSS can communicate with each other Distinct collocated BSS’s can coexist Extended Service Set (ESS) ⚫ ⚫ ⚫ Multiple BSSs interconnected by Distribution System (DS) Each BSS is like a cell and stations in BSS communicate with an Access Point (AP) Portals (access point) attached to DS provide access to Internet Distribution Services ⚫ Stations in BSS can communicate directly with each other ⚫ DS provides distribution services: ⚫ Transfer MSDUs between APs in ESS ⚫ Transfer MSDUs between portals & BSSs in ESS ⚫ Transfer MSDUs between stations in same BSS ⚫ ⚫ Multicast, broadcast, or stations’s preference ESS looks like single BSS to LLC layer Infrastructure Services ⚫ Select AP and establish association with AP ⚫ ⚫ ⚫ ⚫ ⚫ Then can send/receive frames via AP & DS Reassociation service to move from one AP to another AP Dissociation service to terminate association Authentication service to establish identity of other stations Privacy service to keep contents secret Frame Types ⚫ Management frames ⚫ ⚫ ⚫ ⚫ Control frames ⚫ ⚫ ⚫ Station association & disassociation with AP Timing & synchronization Authentication & de-authentication Handshaking ACKs during data transfer Data frames ⚫ Data transfer Frame Structure MAC header (bytes) 2 Frame Control ⚫ ⚫ ⚫ 2 Duration/ ID 6 Address 1 6 Address 2 6 Address 3 2 6 Sequence Address control 4 0-2312 Frame body 4 CRC MAC Header: 30 bytes Frame Body: 0-2312 bytes CRC: CCITT-32 4 bytes CRC over MAC header & frame body Summary: IEEE 802.11 Physical Layer Options Frequency Band Bit Rate Modulation Scheme 802.11 2.4 GHz 1-2 Mbps Frequency-Hopping 
Spread Spectrum, Direct Sequence Spread Spectrum 802.11b 2.4 GHz 11 Mbps Complementary Code Keying & QPSK 802.11g 2.4 GHz 54 Mbps Orthogonal Frequency Division Multiplexing & CCK for backward compatibility with 802.11b 802.11a 5-6 GHz 54 Mbps Orthogonal Frequency Division Multiplexing Unit 02.04.03 CS 5220: COMPUTER COMMUNICATIONS Wireless LANs: Medium Access Control XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science IEEE 802.11 MAC ⚫ MAC sublayer responsibilities ⚫ ⚫ ⚫ ⚫ MAC security service options ⚫ ⚫ Channel access PDU addressing, formatting, error checking Fragmentation & reassembly of MAC SDUs Authentication & privacy MAC management services ⚫ ⚫ Roaming within ESS Power management MAC Services ⚫ ⚫ ⚫ DCF Contention Service: Asynchronous best-effort, required for all stations PCF Contention-Free Service: connection-oriented, time-bounded transfer MAC can alternate between Contention Periods (CPs) & Contention-Free Periods (CFPs) Contention MSDUs -free service Point coordination function Distribution coordination function (CSMA-CA) Physical MSDUs Contention service MAC Distributed Coordination Function (DCF) DIFS Contention window PIFS DIFS SIFS Busy medium Defer access ⚫ Next frame Wait for reattempt time Time CSMA-CA (carrier sense multiple access with collision avoidance) ⚫ ⚫ Ready stations wait for completion of transmission All stations must wait Interframe Space (IFS) Priorities through Interframe Spacing DIFS Contention window PIFS DIFS SIFS Busy medium Defer access ⚫ Next frame Wait for reattempt time High-Priority frames wait Short IFS (SIFS) ⚫ Typically to complete exchange in progress ⚫ ACKs, CTS, data frames of segmented MSDU, etc. 
⚫ PCF IFS (PIFS) to initiate Contention-Free Periods
⚫ DCF IFS (DIFS) to transmit data & MPDUs

Contention & Backoff Behavior
⚫ If channel is still idle after DIFS period, ready station can transmit an initial MPDU
⚫ If channel becomes busy before DIFS, then station must schedule backoff time for reattempt
   ⚫ Backoff period is integer # of idle contention time slots
   ⚫ Waiting station monitors medium & decrements backoff timer each time an idle contention slot transpires
   ⚫ Station can contend when backoff timer expires
⚫ A station that completes a frame transmission is not allowed to transmit immediately
   ⚫ Must first perform a backoff procedure

Carrier Sensing in 802.11
⚫ Physical Carrier Sensing
   ⚫ Analyze all detected frames
   ⚫ Monitor relative signal strength from other sources
⚫ Virtual Carrier Sensing at MAC sublayer
   ⚫ Source stations inform other stations of transmission time (in µsec) for an MPDU
   ⚫ Carried in Duration field of RTS & CTS
   ⚫ Stations adjust Network Allocation Vector to indicate when channel will become idle
⚫ Channel busy if either sensing is busy

Transmission of MPDU without RTS/CTS
[Figure: the source waits DIFS and sends Data; the destination waits SIFS and returns ACK; other stations set their NAV, defer access, then wait DIFS plus the reattempt time]

Transmission of MPDU with RTS/CTS
[Figure: the source waits DIFS and sends RTS; the destination returns CTS after SIFS; the source sends Data after SIFS; the destination returns Ack after SIFS; other stations defer access according to NAV (RTS), NAV (CTS), and NAV (Data)]

Collisions, Losses & Errors
⚫ Collision Avoidance
   ⚫ When station senses channel busy, it waits until channel becomes idle for DIFS period & then begins random backoff time (in units of idle slots)
   ⚫ Station transmits frame when backoff timer expires
   ⚫ If collision, recompute backoff over interval that is twice as long
⚫ Receiving stations of error-free frames send ACK
   ⚫ Sending station interprets non-arrival of ACK as loss
   ⚫ Executes backoff and then retransmits
   ⚫ Receiving stations use sequence numbers to identify duplicate frames

Point Coordination Function
⚫ Point coordinator (PC) in AP performs PCF
⚫ Polling table up to
implementor CFP repetition interval ⚫ ⚫ ⚫ ⚫ Determines frequency with which CFP occurs Initiated by beacon frame transmitted by PC in AP Contains CFP and CP During CFP stations may only transmit to respond to a poll from PC or to send ACK SUMMARY ⚫ Distributed Coordination Function DCF is required ⚫ Point Coordination Function PCF is optional Unit 03.01.01 CS 5220: COMPUTER COMMUNICATIONS Bridges and Data Link Layer Switching XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science Interconnecting Networks ⚫ Several ways of interconnecting networks Application layer Application gateway Transport layer Transport gateway Network layer Router Data Link layer Bridge, Switch Physical layer Repeater, Hub Hubs, Bridges & Routers ⚫ Hub: Active central element in a star topology ⚫ ⚫ ⚫ ⚫ Twisted Pair: inexpensive, easy to install Simple repeater in Ethernet LANs “Intelligent hub”: fault isolation, net configuration, statistics User community grows, need to interconnect hubs: ? Hub Two Twisted Pairs Two Twisted Pairs Station Hub Station Station Station Station Station Hubs, Bridges & Routers ⚫ Interconnecting Hubs/LANs ⚫ Repeater: Signal regeneration ⚫ All traffic appears in both LANs ⚫ Bridge: MAC address filtering ⚫ Local traffic stays in own LAN ⚫ Routers: Internet routing ⚫ All traffic stays in own LAN Higher Scalability General Bridge Issues Network Network LLC LLC MAC 802.3 802.3 802.5 802.5 MAC PHY 802.3 802.3 802.5 802.5 PHY 802.3 CSMA/CD ⚫ 802.5 Token Ring Operation at data link level implies capability to work with multiple network layers; However, must deal with ⚫ ⚫ Difference in MAC formats, and maximum frame length Difference in data rates; buffering; timers; security Bridges of Same Type Network Network Bridge LLC LLC MAC MAC MAC MAC Physical Physical Physical Physical ⚫ ⚫ Common case involves LANs of same type Bridging is done at MAC level Transparent Bridges ⚫ ⚫ Interconnection of IEEE LANs with complete transparency S2 S3 Use backward learning to build 
table
   ⚫ observe source address of arriving frame
   ⚫ handle topology changes by removing old entries
⚫ Use table lookup, and
   ⚫ discard frame, if source & destination in same LAN
   ⚫ forward frame, if source & destination in different LAN
   ⚫ use flooding, if destination unknown
[Figure: LAN1 and LAN2 joined by a bridge, with stations S1 to S6 attached]

Backward learning example
[Figure: LAN1, LAN2, and LAN3 in a row with stations S1 to S5 attached; bridge B1 joins LAN1 (port 1) and LAN2 (port 2); bridge B2 joins LAN2 (port 1) and LAN3 (port 2); both address tables start empty]

Frame     Appears on          B1 table (address: port)      B2 table (address: port)
S1→S5     LAN1, LAN2, LAN3    S1: 1                         S1: 1
S3→S2     LAN1, LAN2, LAN3    S1: 1, S3: 2                  S1: 1, S3: 1
S4→S3     LAN3, LAN2          S1: 1, S3: 2, S4: 2           S1: 1, S3: 1, S4: 2
S2→S1     LAN1                S1: 1, S3: 2, S4: 2, S2: 1    S1: 1, S3: 1, S4: 2

⚫ S1→S5 and S3→S2: destination unknown, so both bridges learn the source address and flood the frame
⚫ S4→S3: B2 forwards the frame to LAN2; B1 finds S3 on the arrival port and discards it
⚫ S2→S1: B1 finds S1 on the arrival port and discards the frame, so B2 never sees it

Summary: Adaptive Learning
⚫ In a static network, tables eventually store all addresses & learning stops
⚫ In practice, stations are added & moved all the time
   ⚫ Introduce timer (minutes) to age each entry &
force it to be relearned periodically If frame arrives on port that differs from frame address & port in table, update immediately Spanning tree algorithm is adopted to avoid loops in interconnection Unit 03.01.02 CS 5220: COMPUTER COMMUNICATIONS Network Layer Services and Topology XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science Network Layer ⚫ Network Layer: the most complex layer ⚫ ⚫ Requires the coordinated actions of multiple, geographically distributed network elements (switches & routers) Must be able to deal with very large scales ⚫ ⚫ Billions of users (people & communicating devices) Biggest Challenges ⚫ ⚫ Addressing: where should information be directed to? Routing: what path should be used to get information there? Packet Switching t1 t0 Network ⚫ ⚫ ⚫ Transfer of information as payload in data packets Packets undergo random delays & possible loss Different applications impose differing requirements on the transfer of information Network Service Messages Messages Segments Transport layer Transport layer Network service Network service End system α Network layer Network layer Network layer Network layer Data link layer Data link layer Data link layer Data link End layer Physical layer Physical layer Physical layer layer system Physical β Network layer can offer a variety of services to transport layer Network Service vs. Operation Network Service ⚫ Connectionless ⚫ ⚫ Datagram Transfer Connection-Oriented ⚫ Internal Network Operation ⚫ Connectionless Reliable and possibly constant bit rate transfer ⚫ ⚫ IP Connection-Oriented ⚫ ATM Various combinations are possible • Connection-oriented service over Connectionless operation • Connectionless service over Connection-Oriented operation • Context & requirements determine what makes sense Complexity at the Edge or in the Core? 
C 1 2 3 21 End system α 4 3 21 End system β 12 3 21 Medium Physical layer entity 2 Data link layer entity Network 3 Network layer entity 21 12 3 4 2 1 B A 1 3 12 3 4 Network layer entity Transport layer entity The End-to-End Argument for System Design ⚫ An end-to-end function is best implemented at a higher level than at a lower level ⚫ ⚫ ⚫ End-to-end service requires all intermediate components to work properly Higher-level better positioned to ensure correct operation Example: stream transfer service ⚫ ⚫ Establishing an explicit connection for each stream across network requires all network elements (NEs) to be aware of connection; All NEs have to be involved in reestablishment of connections in case of network fault In connectionless network operation, NEs do not deal with each explicit connection and are much simpler in design Network Layer Functions Essentials ⚫ Routing: mechanisms for determining the set of best paths for routing packets that requires the collaboration of network elements ⚫ Forwarding: transfer of packets from inputs to outputs ⚫ Priority & Scheduling: determining order of packet transmission in each network element Optional: congestion control, segmentation & reassembly, security End-to-End Packet Network Topology In Packet networks, ⚫ Individual packet streams are highly bursty ⚫ ⚫ User demand can undergo dramatic change ⚫ ⚫ Statistical multiplexing is used to concentrate streams Peer-to-peer applications stimulated huge growth in traffic volumes Internet structure highly decentralized ⚫ ⚫ Paths traversed by packets can go through many networks controlled by different organizations No single entity responsible for end-to-end service Access Multiplexing Access MUX To packet network ⚫ ⚫ ⚫ Packet traffic from users multiplexed at access to network into aggregated streams DSL traffic multiplexed at DSL Access Mux Cable modem traffic multiplexed at Cable Modem Termination System Oversubscription Access Multiplexer ⚫ ⚫ ⚫ ⚫ N subscribers connected 
@ c bps to mux
⚫ Each subscriber active r/c of time
⚫ Mux has C = mc bps to network
⚫ Oversubscription ratio: N/m
⚫ Find m so that at most 1% overflow probability
[Figure: N subscriber lines, each of capacity c and average rate r (total capacity Nc, total average rate Nr), multiplexed onto a link of capacity mc]

Oversubscription Ratio
⚫ Feasible oversubscription rate increases with size

N      r/c     m     N/m
10     0.01    1     10      10 extremely lightly loaded users
10     0.05    3     3.3     10 very lightly loaded users
10     0.1     4     2.5     10 lightly loaded users
20     0.1     6     3.3     20 lightly loaded users
40     0.1     9     4.4     40 lightly loaded users
100    0.1     18    5.6     100 lightly loaded users

Home LANs
[Figure: WiFi and Ethernet devices attached to a home router that connects to the packet network]
⚫ Home Router
   ⚫ LAN Access using Ethernet or WiFi (IEEE 802.11)
   ⚫ Private IP addresses in Home (192.168.0.x) using Network Address Translation (NAT)
   ⚫ Single global IP address from ISP issued using Dynamic Host Configuration Protocol (DHCP)

LAN Concentration
[Figure: LAN stations concentrated through hubs and switches into a switch / router]
⚫ LAN hubs and switches in the access network also aggregate packet streams that flow into switches and routers

Campus Network
[Figure: departmental LANs (stations s, switches S, departmental servers) attach through routers R to a high-speed backbone; organization servers and a gateway to the Internet or wide area network also attach to the backbone]
⚫ Servers have redundant connectivity to backbone
⚫ High-speed campus backbone net connects dept routers
⚫ Only outgoing packets leave LAN through router

Summary:
⚫ The end-to-end argument significantly affects networking system design, for performance and scalability
⚫ Oversubscription is a common technique with multiplexing for efficiency

Unit 03.01.03
CS 5220: COMPUTER COMMUNICATIONS
Packet Switching - Datagrams
XIAOBO ZHOU, Ph.D.
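Returning to the oversubscription table from the preceding unit: one plausible way to reproduce its m column is to model the number of active subscribers as binomial. The sketch below assumes N independent subscribers, each active with probability r/c, and takes "overflow" to mean that more than m are active at once. That model, and the function names, are assumptions made for illustration; the slides do not state how the table was computed.

```python
from math import comb


def overflow_probability(n: int, p: float, m: int) -> float:
    """P(more than m of n independent subscribers are active at once)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m + 1, n + 1))


def min_trunk_multiple(n: int, p: float, max_overflow: float = 0.01) -> int:
    """Smallest m (link capacity m*c) whose overflow probability is within the target."""
    for m in range(n + 1):
        if overflow_probability(n, p, m) <= max_overflow:
            return m
    return n


if __name__ == "__main__":
    # (N, r/c) pairs taken from the first rows of the table.
    for n, p in [(10, 0.01), (10, 0.05), (10, 0.1), (20, 0.1), (40, 0.1)]:
        m = min_trunk_multiple(n, p)
        print(n, p, m, round(n / m, 1))
```

Under this assumed model the feasible ratio N/m grows with N at a fixed r/c, which is the trend the table illustrates.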
Professor, Department of Computer Science

The Switching Function
⚫ Dynamic interconnection of inputs to outputs
⚫ Enables dynamic sharing of transmission resource
⚫ Two fundamental approaches:
   ⚫ Connectionless
   ⚫ Connection-Oriented
[Figure: access networks attached through switches to a backbone network]

Packet Switching Network
[Figure: users connected by transmission lines to a network of packet switches]
Packet switching network
⚫ Transfers packets between users
⚫ Transmission lines + packet switches (routers)
⚫ Origin in message switching
Two modes of operation:
⚫ Connectionless
⚫ Virtual Circuit

Message Switching
[Figure: a message relayed from the source through message switches to the destination]
⚫ Message switching invented for telegraphy
⚫ Entire messages multiplexed onto shared lines, stored & forwarded
⚫ Headers for source & destination addresses
⚫ Routing at message switches
⚫ Connectionless

Transmission delay vs. propagation delay
• Transmit a 1000-byte message from LA to DC via a 1 Gbps network, signal speed 200,000 km/sec.

Message Switching Delay
[Figure: timing diagram for source, switch 1, switch 2, and destination; each hop adds the message transmission time T and the propagation delay $\tau$]
⚫ Minimum delay = $3\tau + 3T$
⚫ Additional queueing delays possible at each link

Long Messages vs. Packets
⚫ 1 Mbit message sent from source to dest over links with bit error rate BER = p = $10^{-6}$
⚫ How many bits need to be transmitted to deliver message?
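One way to approach this question numerically is sketched below. It assumes independent bit errors, that a corrupted packet is retransmitted in full on the same hop until it gets through, and a two-hop path; the hop count is an assumption chosen to be consistent with the totals quoted on this slide, and the function name is hypothetical.

```python
def expected_bits_sent(message_bits: int, packet_bits: int, ber: float, hops: int) -> float:
    """Expected bits transmitted to deliver a message split into equal packets.

    Assumes independent bit errors and per-hop retransmission of any
    packet that arrives with at least one bit in error.
    """
    n_packets = -(-message_bits // packet_bits)   # ceiling division
    p_ok = (1.0 - ber) ** packet_bits             # packet arrives error-free
    tx_per_packet = 1.0 / p_ok                    # mean of a geometric distribution
    return hops * n_packets * packet_bits * tx_per_packet


if __name__ == "__main__":
    whole = expected_bits_sent(10**6, 10**6, 1e-6, hops=2)      # one 1 Mbit message
    packets = expected_bits_sent(10**6, 10**5, 1e-6, hops=2)    # ten 100-kbit packets
    print(round(whole / 1e6, 2), "Mbit vs", round(packets / 1e6, 2), "Mbit")
```

The exact expectation for the single long message comes out somewhat below the rounded slide figure, because $1/P_c$ is e rather than 3, but the comparison between the two approaches is unchanged.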
Approach 1: send 1 Mbit message
⚫ Probability message arrives correctly:
   $P_c = (1 - 10^{-6})^{10^6} \approx e^{-10^6 \cdot 10^{-6}} = e^{-1} \approx 1/3$
⚫ On average it takes about 3 transmissions/hop
⚫ Total # bits transmitted ≈ 6 Mbits

Approach 2: send 10 100-kbit packets
⚫ Probability packet arrives correctly:
   $P_c = (1 - 10^{-6})^{10^5} \approx e^{-10^5 \cdot 10^{-6}} = e^{-1/10} \approx 0.9$
⚫ On average it takes about 1.1 transmissions/hop
⚫ Total # bits transmitted ≈ 2.2 Mbits

Packet Switching - Datagram
⚫ Messages broken into smaller units (packets)
⚫ Source & destination addresses in packet header
⚫ Connectionless, packets routed independently (datagram)
⚫ Packets may arrive out of order
⚫ Pipelining of packets across network can reduce delay, increase throughput
⚫ Lower delay than message switching, suitable for interactive traffic
[Figure: packets 1 and 2 of the same message taking different paths through the network]

Packet Switching Delay
Assume three packets corresponding to one message
[Figure: timing diagram for source, switch 1, switch 2, and destination; packets 1, 2, 3 each take T/3 to transmit and are pipelined across the hops]
⚫ Minimum Delay = $3\tau + 5(T/3)$ (assumed single path, no queueing delay)
⚫ Packet pipelining enables message to arrive sooner

Delay for k-Packet Message over L Hops
[Figure: timing diagram for source, switch 1, switch 2, and destination with packets 1, 2, 3]
                      3 hops              L hops
first bit received    $3\tau + 2(T/3)$    $L\tau + (L-1)P$
first bit released    $3\tau + 3(T/3)$    $L\tau + LP$
last bit released     $3\tau + 5(T/3)$    $L\tau + LP + (k-1)P$
where T = kP

Summary:
⚫ Long message switching suffers long delay and is more vulnerable to errors.
⚫ Packet pipelining reduces delay and improves throughput.

Unit 03.02.01
CS 5220: COMPUTER COMMUNICATIONS
Packet Switching - Virtual Circuits
XIAOBO ZHOU, Ph.D.
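The minimum-delay expressions from the datagram unit, $L\tau + LT$ for message switching and $L\tau + LP + (k-1)P$ with $T = kP$ for store-and-forward packet switching, can be compared with a short sketch. The function names and the sample numbers are illustrative assumptions; queueing and header processing are ignored, as on the slides.

```python
def message_switching_delay(hops: int, tau: float, T: float) -> float:
    """Minimum delay when the whole message is stored and forwarded at each hop."""
    return hops * tau + hops * T


def packet_switching_delay(hops: int, tau: float, T: float, k: int) -> float:
    """Minimum delay when the message is split into k equal packets (T = k * P)."""
    P = T / k
    return hops * tau + hops * P + (k - 1) * P


if __name__ == "__main__":
    tau, T = 0.01, 0.3   # assumed per-hop propagation delay and message time, in seconds
    print("message switching:", message_switching_delay(3, tau, T))
    for k in (1, 3, 10, 100):
        print("k =", k, "->", round(packet_switching_delay(3, tau, T, k), 4))
```

With k = 1 the two expressions coincide; as k grows the packet delay approaches $L\tau + T$, which is the pipelining benefit the summary refers to.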
Professor, Department of Computer Science

Packet Switching - Virtual Circuit
[Figure: packets following one another along a virtual circuit through the network]
⚫ Call set-up phase sets up pointers in fixed path along network
⚫ All packets for a connection follow the same path
⚫ Abbreviated header identifies connection on each link
⚫ Variable bit rates possible, negotiated during call set-up
⚫ Physical-layer circuit vs Network-layer virtual circuit

Connection Setup
[Figure: connect request passes hop by hop through SW 1, SW 2, …, SW n; connect confirm returns along the same path]
⚫ Signaling messages propagate as route is selected
⚫ Signaling messages identify connection and set up tables in switches
⚫ Typically a connection is identified by a local tag, Virtual Circuit Identifier (VCI)
⚫ Each switch only needs to know how to relate an incoming tag in one input to an outgoing tag in the corresponding output
⚫ Once tables are set up, packets can flow along path

Virtual Circuit Forwarding Tables
Input VCI    Output port    Output VCI
12           13             44
15           15             23
27           13             16
58           7              34
⚫ Each input port of packet switch has a forwarding table
⚫ Lookup entry for per-link/port VCI of incoming packet
⚫ Determine output port (next hop) and insert VCI for next link
⚫ Very high speeds are possible (HW-based)
⚫ Table can also include priority or other information about how packet should be treated

Routing in Virtual Circuit Subnet
⚫ Label switching

Virtual Circuit w/ Connection Setup Delay
[Figure: timing diagram; connect request (CR) and connect confirm (CC) are exchanged hop by hop, then packets 1, 2, 3 are transferred, then the connection is released]
⚫ Connection setup delay is incurred before any packet can be transferred
⚫ Delay is acceptable for sustained transfer of large number of packets
⚫ This delay may be unacceptably high if only a few packets are being transferred

Example: ATM Networks
⚫ All information mapped into short fixed-length packets called cells
⚫ Connections set up across network
   ⚫ Virtual circuits established across networks
   ⚫ Tables set up at ATM switches
⚫ Several types of network services offered
   ⚫
Constant bit rate connections; variable bit rate connections

Cut-Through Switching
[Figure: timing diagram for Source, Switch 1, Switch 2, Destination; packets 1, 2, 3 forwarded as soon as the header is processed]
Minimum delay = $3\tau + T$
⚫ Some networks perform error checking on header only, so packets can be forwarded as soon as header is received & processed
⚫ Delays reduced further with cut-through switching

Message vs. Packet Minimum Delay
⚫ Message switching: $L\tau + LT = L\tau + (L-1)T + T$
⚫ Packet switching with store-and-forward: $L\tau + LP + (k-1)P = L\tau + (L-1)P + T$
⚫ Cut-through packet switching (immediate forwarding after header): $L\tau + T$
Above measurements neglect header processing delays

Unit 03.02.02 CS 5220: COMPUTER COMMUNICATIONS Routing in Packet Networks XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Routing in Packet Networks
[Figure: six-node network, nodes 1 to 6; each node is a switch or router]
⚫ Three possible (loop-free) routes from 1 to 6:
  ⚫ 1-3-6, 1-4-5-6, 1-2-5-6
⚫ Which is "best"?
  ⚫ Min delay? Min hop? Max bandwidth? Min cost?

Routing Algorithm Requirements
⚫ Responsiveness to changes
  ⚫ Topology or bandwidth changes, congestion
  ⚫ Rapid convergence of routers to consistent set of routes
  ⚫ Freedom from persistent loops
⚫ Optimality
  ⚫ Resource utilization, path length
⚫ Robustness
  ⚫ Continues working under high load, congestion, faults, equipment failures, incorrect implementations
⚫ Simplicity
  ⚫ Efficient software implementation, reasonable processing load

Creating the Routing Tables
⚫ Need information on state of links
  ⚫ Link up/down; congested; delay or other metrics
⚫ Need to distribute link state info using a routing protocol
  ⚫ What information is exchanged? How often?
  ⚫ Exchange with neighbors; Broadcast or flood
⚫ Need to compute routes based on information
  ⚫ Single metric; multiple metrics
  ⚫ Single route; alternate routes

Routing Tables in Datagram Networks
Destination address    Output port
0785                   7
1345                   12
1566                   6
2458                   12
⚫ Route determined by table lookup
⚫ Routing decision involves finding next hop in route to given destination
⚫ Routing table has an entry for each destination specifying output port that leads to next hop
⚫ Size of table becomes impractical for very large number of destinations

Example: Internet Routing
⚫ Internet protocol uses datagram packet switching across networks
  ⚫ Networks are treated as data links
⚫ Hosts have two-part IP address:
  ⚫ Network address + Host address
⚫ Routers do table lookup on network address
  ⚫ This reduces size of routing table
⚫ In addition, network addresses are assigned so that they can also be aggregated

Routing in Virtual-Circuit Packet Networks
[Figure: hosts A, B, C, D attached to switches/routers; each link labeled with the VCI used by a connection]
⚫ Route determined during connection setup
⚫ Tables in switches implement forwarding that realizes selected route

Non-Hierarchical Addresses and Routing
[Figure: routers R1 and R2 joined through ports 3; host groups {0000, 0111, 1010, 1101} and {0011, 0110, 1001, 1100} on R1 ports 1 and 2; host groups {0001, 0100, 1011, 1110} and {0011, 0101, 1000, 1111} on R2 ports 4 and 5]
R1 table: 0000 → 1, 0111 → 1, 1010 → 1, …
R2 table: 0001 → 4, 0100 → 4, 1011 → 4, …
⚫ No relationship between addresses & routing proximity
⚫ Routing tables require 16 entries each

Hierarchical Addresses and Routing
[Figure: routers R1 and R2 joined through ports 3; host groups {0000, 0001, 0010, 0011} and {1000, 1001, 1010, 1011} on R1 ports 1 and 2; host groups {0100, 0101, 0110, 0111} and {1100, 1101, 1110, 1111} on R2 ports 4 and 5]
R1 table: 00 → 1, 01 → 3, 10 → 2, 11 → 3
R2 table: 00 → 3, 01 → 4, 10 → 3, 11 → 5
⚫ Prefix indicates network where host is attached
⚫ Routing tables require 4 entries each

Specialized Routing
⚫ Flooding
  ⚫ Useful in starting up network
  ⚫ Useful in propagating information to all nodes
⚫ Deflection Routing
  ⚫ Fixed, preset routing procedure
  ⚫ No route synthesis

Flooding
Send a packet to all nodes in a network
⚫ No routing tables available
⚫ Need to broadcast packet to all nodes (e.g.
to propagate link state information)
Approach
⚫ Send packet on all ports except one where it arrived
⚫ Exponential growth in packet transmissions
[Figure: six-node network, nodes 1 to 6; flooding initiated from Node 1, showing hop 1, hop 2, and hop 3 transmissions]

Limited Flooding
⚫ Time-to-Live field in each packet limits number of hops to certain diameter
⚫ Each router adds its ID before flooding; discards repeats
⚫ Source puts sequence number in each packet; switches record source address and sequence number and discard repeats

Deflection Routing
⚫ Network nodes forward packets to preferred port
⚫ If preferred port busy, deflect packet to another port
⚫ Works well with regular topologies
  ⚫ Manhattan street network
  ⚫ Rectangular array of nodes
  ⚫ Nodes designated (i,j)
  ⚫ Rows alternate as one-way streets
  ⚫ Columns alternate as one-way avenues
⚫ Bufferless operation is possible
  ⚫ Proposed for optical packet networks
  ⚫ All-optical buffering currently not viable

Manhattan Street Network
[Figure: 4 × 4 grid of nodes (0,0) through (3,3); tunnel from last column to first column or vice versa]

Summary:
⚫ Routing algorithm optimality depends on the objective function that the network operator tries to optimize
⚫ Hierarchical addressing reduces the size of routing table
⚫ Flooding is useful when routing tables are unavailable

Unit 03.02.03 CS 5220: COMPUTER COMMUNICATIONS Shortest Path Routing – Distance Vector XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Shortest Paths & Routing
⚫ Many possible paths connect any given source to any given destination
⚫ Routing involves the selection of the path to be used to accomplish a given transfer
⚫ Typically it is possible to attach a cost or distance to a link connecting two nodes
⚫ Routing can then be posed as a shortest path problem

Routing Metrics
Means for measuring desirability of a path
⚫ Path Length = sum of costs or distances
⚫ Possible metrics
  ⚫ Hop count: rough measure of resources used
  ⚫ Reliability: link availability; BER
  ⚫ Delay: sum of delays along path; complex & dynamic
  ⚫ Bandwidth: "available capacity" in a path
  ⚫ Load: Link & router utilization along path
  ⚫ Cost: $$$

Shortest Path Approaches
Distance Vector Protocols
⚫ Neighbors exchange list of distances to destinations
⚫ Best next-hop determined for each destination
⚫ Bellman-Ford (distributed) shortest path algorithm
Link State Protocols
⚫ Link state information flooded to all routers
⚫ Routers have complete topology information
⚫ Shortest path (& hence next hop) calculated
⚫ Dijkstra (centralized) shortest path algorithm

Distance Vector
Do you know the way to San Jose?
[Figure: road signpost reading "San Jose 596"]

Distance Vector
Local Signpost
⚫ Direction
⚫ Distance
Routing Table
For each destination list:
⚫ Next Node
⚫ Distance
(table columns: dest, next, dist)
Table Synthesis
⚫ Neighbors exchange table entries
⚫ Determine current best next hop
⚫ Inform neighbors
  ⚫ Periodically
  ⚫ After changes

Shortest Path to SJ
Focus on how nodes find their shortest path to a given destination node, i.e.
SJ (San Jose)
[Figure: node i reaches San Jose through neighbor j; link cost $C_{ij}$, distances $D_i$ and $D_j$]
If $D_i$ is the shortest distance to SJ from i and if j is a neighbor on the shortest path, then $D_i = C_{ij} + D_j$

But we don't know the shortest paths
Router i only has local info from neighbors
[Figure: node i with neighbors j, j', j''; link costs $C_{ij}$, $C_{ij'}$, $C_{ij''}$ and distances $D_j$, $D_{j'}$, $D_{j''}$ to San Jose]
Pick current shortest path

How Distance Vector Works
[Figure: nodes grouped as 1 hop, 2 hops, and 3 hops from San Jose]
⚫ SJ sends accurate info
⚫ Hop-1 nodes calculate current (next hop, dist), & send to neighbors
⚫ Current info about SJ ripples across network; shortest path converges

Bellman-Ford Algorithm
⚫ Consider computations for one destination d
⚫ Initialization
  ⚫ Each node table has 1 row for destination d
  ⚫ Distance of node d to itself is zero: $D_d = 0$
  ⚫ Distance of other node j to d is infinite: $D_j = \infty$, for $j \neq d$
  ⚫ Next hop node $n_j = -1$ to indicate not yet defined for $j \neq d$
⚫ Send Step
  ⚫ Send new distance vector to immediate neighbors across local link
⚫ Receive Step
  ⚫ At node i, find the next hop that gives the minimum distance to d: $\min_j \{ C_{ij} + D_j \}$
  ⚫ Replace old $(n_i, D_i(d))$ by new $(n_i^*, D_i^*(d))$ if new next node or distance
⚫ Go to send step

[Figure: six-node network with link costs; San Jose is node 6]
Table entries are (next hop, distance) at each node for destination SJ.
Iteration   Node 1    Node 2    Node 3    Node 4    Node 5
Initial     (-1, ∞)   (-1, ∞)   (-1, ∞)   (-1, ∞)   (-1, ∞)
1           (-1, ∞)   (-1, ∞)   (6, 1)    (-1, ∞)   (6, 2)
2           (3, 3)    (5, 6)    (6, 1)    (3, 3)    (6, 2)
3           (3, 3)    (4, 4)    (6, 1)    (3, 3)    (6, 2)
Iteration 1: $D_6 = 0$; $D_3 = D_6 + 1$, $n_3 = 6$; $D_5 = D_6 + 2$, $n_5 = 6$

Summary: Shortest-Path Tree
[Figure: shortest-path tree rooted at San Jose (node 6)]

Unit 03.03.01 CS 5220: COMPUTER COMMUNICATIONS Shortest Path Routing – Link State XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Counting to Infinity Problem
[Figure: (a) chain 1 - 2 - 3 - 4 with all link costs 1; (b) the same chain with link (3,4) broken]
Destination/root is node 4; link (3,4) fails.

Counting to Infinity Problem (Cont.)
Update         Node 1    Node 2    Node 3
Before break   (2, 3)    (3, 2)    (4, 1)
After break    (2, 3)    (3, 2)    (2, 3)
1              (2, 3)    (3, 4)    (2, 3)
2              (2, 5)    (3, 4)    (2, 5)
3              (2, 5)    (3, 6)    (2, 5)
4              (2, 7)    (3, 6)    (2, 7)
5              (2, 7)    (3, 8)    (2, 7)
…              …         …         …
Nodes 2 and 3 believe best path is through each other (Destination is node 4)

Problem: Bad News Travels Slowly
Remedies
⚫ Split Horizon
  ⚫ Do not report route to a destination to the neighbor from which route was learned
⚫ Split Horizon with Poisoned Reverse
  ⚫ Report route to a destination to the neighbor from which route was learned, but with infinite distance
  ⚫ Breaks erroneous direct loops immediately
  ⚫ Does not work on some indirect loops

Split Horizon with Poisoned Reverse
[Figure: (a) chain 1 - 2 - 3 - 4 with all link costs 1, destination is node 4; (b) the same chain with link (3,4) broken]
Update         Node 1    Node 2    Node 3
Before break   (2, 3)    (3, 2)    (4, 1)
After break    (2, 3)    (3, 2)    (-1, ∞)
1              (2, 3)    (-1, ∞)   (-1, ∞)
2              (-1, ∞)   (-1, ∞)   (-1, ∞)
⚫ After break: node 2 advertises its route to 4 to node 3 as having distance infinity; node 3 finds there is no route to 4
⚫ Update 1: node 1 advertises its route to 4 to node 2 as having distance infinity; node 2 finds there is no route to 4
⚫ Update 2: node 1 finds there is no route to 4

Link-State Algorithm
⚫ Basic idea: three step procedure
  ⚫ Each source node creates a link state packet containing to-neighbor link metrics
  ⚫ Each source node broadcasts its link state packet so that every node gets a map of all nodes and link metrics of the entire network
  ⚫ Find the shortest path on the map from the source node to all
destination nodes

Link-State Algorithm - Broadcasting
⚫ Broadcast of link-state information
  ⚫ Every node i broadcasts to every other node in the network:
    ⚫ IDs of its neighbors: $N_i$ = set of neighbors of i
    ⚫ Distances to its neighbors: $\{ C_{ij} \mid j \in N_i \}$
⚫ Flooding is a popular method of broadcasting packets
⚫ How to limit flooding?

Building Link State Packets
A state packet starts with the ID of the sender, a seq#, age, and a list of neighbors with delay/distance information.
[Figure: (a) A subnet. (b) The link state packets for this subnet.]

Unit 03.03.02 CS 5220: COMPUTER COMMUNICATIONS Dijkstra Algorithm XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Dijkstra Algorithm
⚫ Finding the shortest paths from a source node to all other nodes in a network (graph)
⚫ More efficient than Bellman-Ford algorithm
⚫ Named after scientist Edsger W. Dijkstra

Dijkstra Algorithm: Idea and Procedure
⚫ First iteration
  ⚫ Finds the closest node from the source node; must be the neighbor of the source node
⚫ At the second iteration
  ⚫ Finds the second closest node from the source node; must be the neighbor of either the source node or the closest node to the source node
⚫ At the third iteration
  ⚫ Finds the third closest node; must be the neighbor of the source node or the first two closest nodes
⚫ At the k-th iteration
  ⚫ Finds the k-th closest node from the source node

Dijkstra Algorithm: Illustration
Find shortest paths from source s to all other destinations
⚫ Closest node to s is 1 hop away
⚫ 2nd closest node to s is 1 hop away from s or w
⚫ 3rd closest node to s is 1 hop away from s, w, or x
[Figure: source s with nodes w, w', w'', x, x', z, z']

Dijkstra's algorithm
⚫ N: set of nodes for which shortest path already found
⚫ Initialization: (Start with source node s)
  ⚫ $N = \{s\}$, $D_s = 0$, "s is distance zero from itself"
  ⚫ $D_j = C_{sj}$ for all $j \neq s$, distances of directly-connected neighbors
⚫ Step A: (Find next closest node i)
  ⚫ Find $i \notin N$ such that $D_i = \min D_j$ for $j \notin N$
  ⚫ Add i to N
  ⚫ If N contains all the nodes, stop
⚫ Step B:
(update minimum costs)
  ⚫ For each node $j \notin N$: $D_j = \min(D_j, D_i + C_{ij})$, where $D_i + C_{ij}$ is the minimum distance from s to j through node i in N
  ⚫ Go to Step A

Execution of Dijkstra's algorithm
[Figure: six-node network with link costs; source is node 1; the node added at each iteration is marked ✓]
Iteration   N               D2   D3   D4   D5   D6
Initial     {1}             3    2    5    ∞    ∞
1           {1,3}           3    2    4    ∞    3
2           {1,2,3}         3    2    4    7    3
3           {1,2,3,6}       3    2    4    5    3
4           {1,2,3,4,6}     3    2    4    5    3
5           {1,2,3,4,5,6}   3    2    4    5    3

Shortest-Path Tree from node 1 to other nodes
[Figure: the six-node network and the resulting shortest-path tree rooted at node 1]

Unit 03.03.03 CS 5220: COMPUTER COMMUNICATIONS Link State Routing, ATM Networks XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Reaction to Failure
⚫ In distance vector routing, if a link fails
  ⚫ Neighboring routers exchange routing tables that may use failed links
⚫ In link-state routing, if a link fails,
  ⚫ Router sets link distance to infinity & floods the network with an update packet
  ⚫ All routers immediately update their link database & recalculate their shortest paths
  ⚫ Recovery very quick

Why is Link State Better?
⚫ Fast, loopless convergence
⚫ Support for precise metrics, and multiple metrics if necessary (throughput, delay, cost, reliability)
⚫ Support for multiple paths to a destination
  ⚫ algorithm can be modified to find best two paths

Problem of Link State Routing
⚫ But watch out for old update messages
  ⚫ Add time stamp or sequence # to each update message
  ⚫ Check whether each received update message is new
  ⚫ If new, add it to database and broadcast
  ⚫ If older, send update message on arriving link

Source Routing vs. H-by-H
⚫ Source host selects path to be followed by a packet
  ⚫ Strict: sequence of nodes in path inserted into header
  ⚫ Loose: subsequence of nodes in path specified
⚫ Intermediate switches read next-hop address and remove address
  ⚫ Or maintained for the reverse path
⚫ Source routing allows the host to control the paths that its information traverses in the network
⚫ Potentially the means for customers to select what service providers they use

Example
[Figure: source host A sends to destination host B through nodes 1, 3, 6; the header carries 1,3,6,B, then 3,6,B, then 6,B, then B as each node removes its own address]

Asynchronous Transfer Mode (ATM)
⚫ Packet multiplexing and switching
  ⚫ Fixed-length packets: "cells"
  ⚫ Connection-oriented
  ⚫ Rich Quality of Service support
⚫ Conceived as end-to-end
  ⚫ Supporting wide range of services
    ⚫ Real time voice and video
    ⚫ Circuit emulation for digital transport
    ⚫ Data traffic with bandwidth guarantees

TDM vs. Packet Multiplexing
          Variable bit rate   Delay        Burst traffic   Processing
TDM       Multirate only      Low, fixed   Inefficient     Minimal, very high speed
Packet    Easily handled      Variable     Efficient       Header & packet processing required*
*In mid-1980s, packet processing mainly in software and hence slow; by late 1990s, very high speed packet processing possible

ATM: Attributes of TDM & Packet Switching
[Figure: voice, data packets, and images from inputs 1 to 4 enter a MUX; the TDM output shows wasted bandwidth in idle slots, the ATM output carries cells with packet headers]
⚫ Packet structure gives flexibility & efficiency
⚫ Fixed packet length simplifies implementation and enables high speed

ATM Virtual Connections
⚫ Virtual connections set up across network
⚫ Connections identified by locally-defined tags
⚫ ATM Header contains virtual connection information:
  ⚫ 8-bit Virtual Path Identifier
  ⚫ 16-bit Virtual Channel Identifier
⚫ Powerful traffic grooming capabilities
  ⚫ Multiple VCs can be bundled within a VP
[Figure: virtual channels bundled into virtual paths on a physical link]

MPLS & ATM
⚫ ATM initially touted as more scalable than packet switching
⚫ Advances in optical transmission proved ATM to be the less scalable: @ 10 Gbps
  ⚫ Segmentation & reassembly of messages & streams into 48-byte cell payloads difficult & inefficient
  ⚫ Header must be processed every 53 bytes vs. 500 bytes on average for packets
⚫ MPLS (multiprotocol label switching) uses tags to transfer packets across virtual circuits in Internet
  ⚫ Adopts label switching paradigm, but variable-length packets by packet-over-SONET encapsulation

Unit 03.03.04 CS 5220: COMPUTER COMMUNICATIONS RIP and OSPF XIAOBO ZHOU, Ph.D.
Professor, Department of Computer Science

Routing Information Protocol (RIP)
⚫ RIP based on routed, distributed in BSD UNIX
⚫ Uses the distance-vector algorithm
⚫ Runs on top of UDP, port number 520
⚫ Metric: number of hops
⚫ Max limited to 15
  ⚫ suitable for small networks (local area environments)
  ⚫ value of 16 is reserved to represent infinity
  ⚫ small number limits the count-to-infinity problem

RIP Operation
⚫ Router sends update message to neighbors every 30 sec
⚫ A router expects to receive an update message from each of its neighbors within 180 seconds in the worst case
⚫ If router does not receive update message from neighbor X within this limit, it assumes the link to X has failed and sets the corresponding minimum cost to 16 (infinity)
⚫ Uses split horizon with poisoned reverse
⚫ Convergence sped up by triggered updates
  ⚫ neighbors notified immediately of changes in distance vector table

Deficiencies in RIP Protocol
⚫ Limited Metric Use
⚫ Slow Convergence

Open Shortest Path First (OSPF)
⚫ Fixes some of the deficiencies in RIP
⚫ Enables each router to learn complete network topology
⚫ Each router monitors the link state to each neighbor and floods the link-state information to other routers
⚫ Each router builds an identical link-state database
⚫ Allows router to build shortest path tree with router as root
⚫ OSPF typically converges faster than RIP when there is a failure in the network

OSPF Network
⚫ To improve scalability, an autonomous system (AS) may be partitioned into two-level areas
  ⚫ Area defined by 32-bit area ID
  ⚫ Routers in an area know the complete topology only inside that area, which limits the flooding of link-state information to the area
  ⚫ Area border routers summarize info from other areas
⚫ Each area must connect to backbone area (0.0.0.0)
⚫ Internal router has all links to nets within the same area
⚫ Area border router has links to more than one area
⚫ Backbone router has links connected to the backbone
⚫ Autonomous system boundary (ASB) router has links
to another autonomous system OSPF Areas To another AS N1 R1 N2 R2 N5 R3 R6 R4 N4 R5 R7 N6 N3 Area 0.0.0.1 ASB: 4 Area border router : 3, 6, and 8 Internal router: 1,2,7 Backbone router: 3,4,5,6,8 R8 Area 0.0.0.0 Area 0.0.0.2 N7 R = router N = network Area 0.0.0.3 Unit 03.04.01 CS 5220: COMPUTER COMMUNICATIONS Packet level – Scheduling and QoS XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science Time Scales & Granularities ⚫ Packet Level ⚫ ⚫ ⚫ Flow Level ⚫ ⚫ ⚫ Queueing & scheduling at multiplexing points Determines relative performance offered to packets over a short time scale (microseconds) Management of traffic flows & resource allocation to ensure delivery of QoS (milliseconds to seconds) Matching traffic flows to resources available; congestion control Flow-Aggregate Level ⚫ ⚫ Routing of aggregate traffic flows across the network for efficient utilization of resources & meeting of service levels “Traffic Engineering”, at scale of minutes to days End-to-End QoS Packet buffer … 1 ⚫ ⚫ ⚫ 2 N–1 N A packet traversing network encounters delay and possible loss at various multiplexing points End-to-end performance is accumulation of per-hop performances Packet loss occurs when no more buffer available for a packet Scheduling & QoS ⚫ End-to-End QoS & Resource Control ⚫ ⚫ ⚫ Scheduling Concepts ⚫ ⚫ FQ/WFQ, PGPS Guaranteed Service ⚫ ⚫ fairness/isolation, priority, aggregation, Fair Queueing & Variations ⚫ ⚫ Buffer & bandwidth control → Performance Admission control to regulate traffic level FQ/WFQ, Rate-control Packet Dropping FIFO Queueing Packet buffer Arriving packets Packet discard when full Transmission link ⚫ All packet flows share the same buffer ⚫ Transmission Discipline: First-In, First-Out ⚫ Buffering Discipline: Discard arriving packets if buffer is full (Alternative: random discard; pushout head-of-line, etc.) 
⚫ Delay and loss depends on inter-arrival and packet lengths FIFO Queueing ⚫ Cannot provide differential QoS to packet flows ⚫ ⚫ Different packet flows interact strongly Statistical delay guarantees via load control ⚫ ⚫ Restrict number of flows allowed (admission control) Difficult to determine performance delivered ⚫ Finite buffer determines a maximum possible delay ⚫ Buffer size determines loss probability ⚫ But depends on arrival & packet length statistics FIFO w/o and w/ Discard Priority (a) Packet buffer Arriving packets FIFO queueing Packet discard when full (b) Transmission link Packet buffer Arriving packets Class 1 discard when full Transmission link Class 2 discard when threshold exceeded FIFO queueing with discard priority HOL Priority Queueing Packet discard when full High-priority packets Low-priority packets Packet discard when full ⚫ ⚫ ⚫ Transmission link When high-priority queue empty High priority queue serviced until empty High priority queue has lower waiting time Buffers can be dimensioned for different loss probabilities Summary: HOL Priority Features Delay (Note: Need labeling) Per-class loads ⚫ Provides differential QoS ⚫ High-priority classes can hog all of the bandwidth & starve lower priority classes ⚫ Need to provide some isolation between classes Unit 03.04.02 CS 5220: COMPUTER COMMUNICATIONS Packet Level: Fair Queuing and RED XIAOBO ZHOU, Ph.D. 
Professor, Department of Computer Science

Fair Queuing
⚫ Attempts to provide isolated and equitable access to transmission bandwidth (like Processor Sharing)
⚫ Each user flow has its own logical buffer
⚫ Idealized system assumes fluid flow from queues
⚫ Weighted fair queueing (WFQ) further addresses different users with different priorities/weights

Fair Queuing – Fluid
[Figure: packet flows 1, 2, …, n, each with its own queue, served by approximated bit-level round robin onto a transmission link of C bits/second]
⚫ Each flow has its own logical queue: prevents hogging; allows differential loss probabilities
⚫ C bits/sec allocated equally among non-empty queues
  ⚫ transmission rate = C / n(t), where n(t) = # non-empty queues

Fair Queuing - Approximation
⚫ Per-bit round-robin: decomposing the resulting bit stream into the component networks would be costly
⚫ In ATM, fair queueing can be approximated more easily
⚫ In packet networks, implementation requires approximation: simulate fluid system; sort packets according to completion time in ideal system

Example
⚫ FIFO (per-packet) -> Fair queueing (per-bit)
[Figure: (a) A router with five packets queued for line O. (b) Finishing times for the five packets.]
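The approximation described above (simulate the ideal bit-by-bit system, then send packets in order of their ideal completion times) can be sketched as follows. This is a minimal illustration under stated assumptions, not the slides' own algorithm: the function name `bit_round_robin_finish_times`, the dictionary input format, the equal weights, and the assumption that every packet is already queued at time 0 are all hypothetical choices made for the example.

```python
from collections import deque


def bit_round_robin_finish_times(queues):
    """Simulate bit-by-bit round robin over per-flow queues (equal weights).

    queues: dict mapping a flow name to a list of positive packet lengths
            in bits; all packets are assumed to be queued at time 0
            (hypothetical input format chosen for this sketch).
    Returns a list of (finish_tick, flow, packet_index) in order of
    completion, where one tick is the time to send one bit on the link.
    """
    # Each queue holds (packet_index, length) pairs in arrival order.
    pending = {flow: deque(enumerate(lengths)) for flow, lengths in queues.items()}
    # Bits still to send for the head-of-line packet of each flow.
    remaining = {flow: (q[0][1] if q else 0) for flow, q in pending.items()}
    tick = 0
    finished = []
    while any(pending.values()):
        for flow in queues:              # fixed visiting order within a round
            q = pending[flow]
            if not q:
                continue                 # empty queues get no service
            tick += 1                    # send one bit of this flow's head packet
            remaining[flow] -= 1
            if remaining[flow] == 0:
                index, _ = q.popleft()
                finished.append((tick, flow, index))
                if q:
                    remaining[flow] = q[0][1]
    return finished


# A FIFO link would send A, B, C in arrival order; sorting by the
# simulated finishing time sends the short packets first.
order = bit_round_robin_finish_times({"A": [3], "B": [1], "C": [2]})
print(order)  # [(2, 'B', 0), (5, 'C', 0), (6, 'A', 0)]
```

A real scheduler would not step through individual bits; it would compute the same finishing times arithmetically and handle packets that arrive later, which this sketch leaves out.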
Buffer Management
⚫ Drop strategy: Which packet to drop when buffers full
⚫ Fairness: protect behaving sources from misbehaving ones
⚫ Aggregation:
  ⚫ Per-flow buffers protect flows from misbehaving flows
  ⚫ Full aggregation provides no protection
  ⚫ Aggregation into classes provides intermediate protection
⚫ Drop priorities:
  ⚫ Drop packets from buffer according to priorities
  ⚫ Maximizes network utilization & application QoS
  ⚫ Examples: layered video, policing at network edge
⚫ Controlling sources at the edge

Random Early Detection (RED)
⚫ Early drop: discard packets before buffers are full; drop packets if short-term average of queue exceeds threshold
⚫ Packet drop probability increases linearly with queue length
⚫ Random drop causes some sources to reduce rate before others, causing gradual reduction in aggregate input rate
⚫ TCP sources reduce their input rate in response to network congestion
⚫ Improves performance of cooperating TCP sources
⚫ Increases loss probability of misbehaving sources
Algorithm:
⚫ Maintain running average of queue length
⚫ If Qavg < minthreshold, do nothing
⚫ If Qavg > maxthreshold, drop packet
⚫ If in between, drop packet according to probability
⚫ Flows that send more packets are more likely to have packets dropped

Packet Drop Profile in RED
[Figure: probability of packet drop (0 to 1) versus average queue length, with marks at minth, maxth, and full]

Unit 03.04.03 CS 5220: COMPUTER COMMUNICATIONS Flow Level: Leaky Bucket Policing XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Why Congestion?
Congestion 3 6 1 4 8 2 5 7 Approaches to Congestion Control: • Preventive Approaches (open –loop) • Reactive Approaches (closed-loop) Ideal Effect of Congestion Control Approaches to Congestion Control: • Preventive Approaches (open –loop) • Reactive Approaches (closed-loop) Open-Loop Control ⚫ ⚫ ⚫ Network performance is guaranteed to all traffic flows that have been admitted into the network Initially for connection-oriented networks Key Mechanisms ⚫ ⚫ ⚫ Admission Control Policing Traffic Shaping Admission Control ⚫ ⚫ Bits/second Peak rate Flows negotiate contract with network Specify requirements: ⚫ ⚫ Average rate ⚫ ⚫ Network computes resources needed ⚫ ⚫ Time Typical bit rate demanded by a variable bit rate information source Peak, Avg., Min Bit rate Maximum burst size Delay, Loss requirement “Effective” bandwidth If flow accepted, network allocates resources to ensure QoS delivered as long as source conforms to contract Policing ⚫ Network monitors traffic flows continuously to ensure they meet their traffic contract ⚫ When a packet violates the contract, network can discard or tag the packet giving it lower priority ⚫ If congestion occurs, tagged packets are discarded first Leaky Bucket Illustration water poured irregularly (a) A leaky bucket with water. (b) a leaky bucket with packets. Leaky Bucket in ATM Network ⚫ ATM Network ⚫ All packets are of same fixed length ⚫ A counter records the content of the leaky bucket. ⚫ When a packet arrives the counter is increased by I if bucket would not exceed the limit, packet is conforming ⚫ Value I indicates the nominal inter-arrival time of packets being policed Summary: Leaky Bucket Example I=4 L=6 Nonconforming Packet arrival t1 t5 t7 Time L+I Bucket content I t1 * * * * t5 * t7 * * * * Time Non-conforming packets not allowed into bucket & hence not included in calculations maximum burst size (MBS = 3 packets) Unit 03.04.04 CS 5220: COMPUTER COMMUNICATIONS Traffic Shaping by Token Bucket XIAOBO ZHOU, Ph.D. 
Professor, Department of Computer Science

Flow-level Traffic Shaping
⚫ Traffic shaping refers to the process of altering a traffic flow to ensure conformance
⚫ A traffic shaping device is often located at the node just before traffic flow leaves network
⚫ A traffic policing device is usually located at the node that receives the traffic flow from a network

Leaky Bucket Traffic Shaper
[Figure: incoming traffic enters a packet buffer of size N; a server plays out shaped traffic]
⚫ Buffer incoming packets
⚫ Play out periodically to conform to parameters
⚫ Surges in arrivals are buffered & smoothed out
⚫ Possible packet loss due to buffer overflow
⚫ Restrictive, not allowing any variable-rate outgoing traffic

Token Bucket Traffic Shaper
[Figure: tokens arrive periodically into a token bucket of size K; incoming traffic enters a packet buffer of size N; a server releases shaped traffic]
⚫ An incoming packet must have sufficient tokens before admission into the network
⚫ Token rate regulates transfer of packets
⚫ If sufficient tokens available, packets enter network without delay
⚫ K determines how much burstiness allowed into the network

Token Bucket Shaping Effect (full)
[Figure: b bytes released instantly, then r bytes/second; cumulative output b + rt versus time t]
The token bucket constrains the traffic from a source to be limited to b + rt bytes in an interval of length t
Q1: What are the two main differences between a leaky bucket and a token bucket? Allow saving for burst spending; packet discarding or not.
Q2: When is a token bucket the same as a leaky bucket? b = 0; but still different indeed: packet discarding or not

Token Bucket Shaping Effect (empty)
⚫ Behavior of the token bucket shaper is similar to that of the leaky bucket shaper
⚫ If the bucket size is reduced to zero, they are identical

Closed-Loop Flow Control
⚫ Congestion control
  ⚫ Feedback information to regulate flow from sources into network
  ⚫ Based on buffer length, link utilization, etc.
  ⚫ Examples: TCP at transport layer; congestion control at ATM level
⚫ End-to-end vs. Hop-by-hop
  ⚫ Delay in effecting control
⚫ Implicit vs.
Explicit Feedback ⚫ ⚫ Source deduces congestion from observed behavior Routers/switches generate messages alerting to congestion E2E vs. H2H Congestion Control Source Packet flow Destination (a) TCP vs. ATM Source Destination (b) Feedback information Congestion Warning ⚫ The Warning Bit in ACKs ⚫ Choke packets to the source ⚫ A time-out due to missing acknowledgement Aggregate Level - Traffic Engineering ⚫ Management exerted at flow aggregate level ⚫ Distribution of flows in network to achieve efficient utilization of resources (bandwidth) ⚫ Must take into account aggregate demand from all flows ⚫ “Traffic Engineering”, at scale of minutes to days Unit 04.01.01 CS 5220: COMPUTER COMMUNICATIONS TCP/IP Architecture and IP Packet XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science Why Internetworking? ⚫ To build a “network of networks” or internet ⚫ ⚫ ⚫ operating over multiple, coexisting, different networks providing ubiquitous connectivity through IP packet transfer achieving huge economies of scale H H Net51 Net G G G H Net52 Net Net53 Net G Net55 Net G Net54 Net G H TCP/IP Protocol Suite HTTP Reliable stream service SMTP DNS Distributed applications TCP UDP Best-effort connectionless packet transfer Network Interface 1 IP Network Interface 2 RTP User datagram service (ICMP, ARP) Network Interface 3 Encapsulation HTTP Request TCP Header contains source & destination port numbers IP Header contains source and destination IP addresses; transport protocol type Ethernet Header contains source & destination MAC addresses; network protocol type Ethernet header TCP header HTTP Request IP header TCP header HTTP Request IP header TCP header HTTP Request FCS Internet Addresses ⚫ Each host has globally unique logical IP address ⚫ Separate address for each physical connection to a network ⚫ Routing decision is done based on destination IP address ⚫ IP address has two parts: ⚫ ⚫ ⚫ netid and hostid netid unique, facilitates routing Dotted Decimal Notation: 
int1.int2.int3.int4 (intj = jth octet), e.g. 128.100.10.13
⚫ DNS resolves a host name to an IP address

Internet Protocol
⚫ Provides best effort, connectionless packet delivery
  ⚫ motivated by need to keep routers simple and by adaptability to failure of network elements
  ⚫ packets may be lost, out of order, or even duplicated
  ⚫ higher layer protocols must deal with these, if necessary
⚫ IP also includes:
  ⚫ Internet Control Message Protocol (ICMP)
  ⚫ Internet Group Management Protocol (IGMP)

IP Packet Header
[Figure: header layout, bit positions 0 to 31. Row 1: Version, IHL, Type of Service, Total Length. Row 2: Identification, Flags, Fragment Offset. Row 3: Time to Live, Protocol, Header Checksum. Row 4: Source IP Address. Row 5: Destination IP Address. Row 6: Options, Padding.]
⚫ Minimum 20 bytes
⚫ Up to 40 bytes in options fields

Version: current IP version is 4.
Internet header length (IHL): length of the header in 32-bit words.
Type of service (TOS): traditionally priority of packet at each router. Recent Differentiated Services redefines TOS field to include other services besides best effort.
Total length: number of bytes of the IP packet including header and data.
Identification, Flags, and Fragment Offset: for fragmentation and reassembly.
Time to live (TTL): number of hops packet is allowed to traverse in network.
• Each router along the path to the destination decrements this value by one.
• If the value reaches zero before the packet reaches the destination, the router discards the packet and sends an error message back to the source.
⚫ Protocol: specifies upper-layer protocol that is to receive IP data at the destination. Examples include TCP (protocol = 6), UDP (protocol = 17), and ICMP (protocol = 1).
⚫ Header checksum: verifies the integrity of the IP header.
⚫ Source IP address and destination IP address: contain the addresses of the source and destination hosts.
⚫ Options: variable-length field, allows packet to request special features such as security level, route to be taken by the packet, and timestamp at each router. Detailed descriptions of these options can be found in [RFC 791].
⚫ Padding: this field is used to make the header a multiple of 32-bit words.

IP Header Processing
1. Compute header checksum for correctness and check that fields in header (e.g. version and total length) contain valid values
2. Consult routing table to determine next hop
3. Change fields that require updating (TTL, header checksum)

Unit 04.01.02 CS 5220: COMPUTER COMMUNICATIONS IP Addressing XIAOBO ZHOU, Ph.D.
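The IP Header Processing steps 1 and 3 (verify the checksum and basic fields, then update TTL and the checksum) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: `build_header` and `forward` are hypothetical helper names, the header has no options, and step 2 (the routing table lookup) is left out.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(ttl, protocol, src, dst, payload_len=0):
    """Hypothetical helper: a 20-byte IPv4 header without options."""
    ver_ihl = (4 << 4) | 5                   # version 4, IHL = 5 words
    hdr = struct.pack("!BBHHHBBH4s4s", ver_ihl, 0, 20 + payload_len,
                      0, 0, ttl, protocol, 0, src, dst)
    csum = internet_checksum(hdr)            # computed with checksum field = 0
    return hdr[:10] + struct.pack("!H", csum) + hdr[12:]

def forward(header: bytes):
    """Return the updated header, or None if the packet must be dropped."""
    version, ihl = header[0] >> 4, (header[0] & 0x0F) * 4
    if version != 4 or ihl < 20 or len(header) < ihl:
        return None                          # invalid field values
    if internet_checksum(header[:ihl]) != 0:
        return None                          # a valid header sums to zero
    ttl = header[8]
    if ttl <= 1:
        return None                          # TTL would reach zero: discard
    hdr = bytearray(header[:ihl])
    hdr[8] = ttl - 1                         # decrement TTL
    hdr[10:12] = b"\x00\x00"                 # clear, then recompute checksum
    hdr[10:12] = struct.pack("!H", internet_checksum(bytes(hdr)))
    return bytes(hdr)
```

A real router would also send the ICMP error message mentioned in the TTL description when it discards the packet; the sketch only signals the drop by returning `None`.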
Professor, Department of Computer Science

IP Addressing
⚫ Each host on Internet has unique 32 bit IP address
⚫ Each address has two parts: netid and hostid
⚫ netid is unique & administered by Internet registration
⚫ netid facilitates routing and reduces routing table size
⚫ A separate address is required for each physical connection of a host to a network; “multi-homed” hosts
⚫ Dotted-Decimal Notation: int1.int2.int3.int4 where intj = integer value of jth octet
⚫ IP address of 10000000 10000111 01000100 00000101 is 128.135.68.5 in dotted-decimal notation

Classful Addresses
  Class  Leading bits  netid    hostid   Address range                  Capacity
  A      0             7 bits   24 bits  1.0.0.0 to 127.255.255.255     126 networks with up to 16 million hosts
  B      1 0           14 bits  16 bits  128.0.0.0 to 191.255.255.255   16,382 networks with up to 64,000 hosts
  C      1 1 0         21 bits  8 bits   192.0.0.0 to 223.255.255.255   2 million networks with up to 254 hosts

Class D Addresses
  Class D: leading bits 1 1 1 0, 28-bit multicast address, 224.0.0.0 to 239.255.255.255
⚫ Up to 250 million multicast groups at the same time
⚫ Permanent group addresses
  ⚫ All systems in LAN; All routers in LAN; All OSPF routers on LAN; All designated OSPF routers on a LAN, etc.
⚫ Temporary group addresses created as needed
⚫ Special multicast routers

Reserved Host IDs (all 0s & 1s)
⚫ Internet address used to refer to network has hostid set to all 0s
  ⚫ All 0s: this host (used when booting up)
  ⚫ netid of all 0s followed by host: a host in this network
⚫ Broadcast address has hostid set to all 1s
  ⚫ All 1s: broadcast on local network
  ⚫ netid followed by hostid of all 1s: broadcast on distant network

Private IP Addresses
⚫ Specific ranges of IP addresses set aside for use in private networks (RFC 1918)
⚫ Use restricted to private internets; routers in public Internet discard packets with these addresses
⚫ Range 1: 10.0.0.0 to 10.255.255.255
⚫ Range 2: 172.16.0.0 to 172.31.255.255
⚫ Range 3: 192.168.0.0 to 192.168.255.255
⚫ Network Address Translation (NAT) used to convert between private & global IP addresses

Example of IP Addressing
[Figure: router R joins Network 128.135.0.0 and Network 128.140.0.0; R = router, H = host]
⚫ Network 128.135.0.0: router interface address is 128.135.10.2; hosts 128.135.40.1, 128.135.10.20, 128.135.10.21
⚫ Network 128.140.0.0: router interface address is 128.140.5.35; hosts 128.140.5.40, 128.140.5.36
⚫ What class types?
⚫ Address with host ID = all 0s refers to the network
⚫ Address with host ID = all 1s refers to a broadcast packet

Subnets
⚫ A campus network consisting of LANs for various departments.
⚫ How to allow a network to be split into several parts for internal use but still act like a single network to the outside?

Unit 04.01.03 CS 5220: COMPUTER COMMUNICATIONS Subnetting XIAOBO ZHOU, Ph.D.
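The classful ranges and the RFC 1918 private ranges from the IP Addressing unit can be checked mechanically. The sketch below uses Python's standard `ipaddress` module; the function names `address_class` and `is_rfc1918` are made up for illustration, and the "E" result for first octets of 240 and above is an assumption, since the slides stop at Class D.

```python
import ipaddress

# The three private ranges listed on the Private IP Addresses slide, as prefixes.
PRIVATE_RANGES = [ipaddress.ip_network(n)
                  for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def address_class(addr: str) -> str:
    """Classify by the leading bits, i.e. by the value of the first octet."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:
        return "A"      # leading bit 0
    if first_octet < 192:
        return "B"      # leading bits 10
    if first_octet < 224:
        return "C"      # leading bits 110
    if first_octet < 240:
        return "D"      # leading bits 1110 (multicast)
    return "E"          # assumption: not covered in the slides

def is_rfc1918(addr: str) -> bool:
    """True if the address falls in one of the three private ranges."""
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in PRIVATE_RANGES)
```

For the "What class types?" question, both 128.135.0.0 and 128.140.0.0 start with the bits 10, so `address_class` reports "B" for hosts on either network.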
Professor, Department of Computer Science

Subnet Addressing
⚫ Subnet addressing introduces another hierarchical level
⚫ Transparent to remote networks
⚫ Simplifies management of multiplicity of LANs
⚫ Masking used to find subnet number
  Original address:   1 0 | Net ID | Host ID
  Subnetted address:  1 0 | Net ID | Subnet ID | Host ID

Subnetting Scheme
⚫ Organization has Class B address (16 host ID bits) with network ID: 150.100.0.0
⚫ Create subnets with up to 100 hosts each
  ⚫ 7 host ID bits sufficient for each subnet (2^7 – 2 = 126 hosts; mask ends in 7 0s)
  ⚫ 16 – 7 = 9 bits for subnet ID (2^9 – 2 = 510 subnets)
⚫ Apply subnet mask to IP addresses to find corresponding subnet
  ⚫ Example: Find subnet for 150.100.12.176
  ⚫ IP address = 10010110 01100100 00001100 10110000
  ⚫ Mask       = 11111111 11111111 11111111 10000000 (7 0s)
  ⚫ AND        = 10010110 01100100 00001100 10000000
  ⚫ Subnet = 150.100.12.128
  ⚫ Subnet address used by routers within organization

Subnet Range
⚫ Given the subnet 150.100.12.128
  ⚫ IP address 150.100.12.128 is used to identify the subnetwork
  ⚫ IP address 150.100.12.255 is used to broadcast packets in the subnet
  ⚫ Host IP addresses in the subnet range from
    10010110 01100100 00001100 10000001 to
    10010110 01100100 00001100 11111110
    That is, 150.100.12.129 to 150.100.12.254

Subnet Example
[Figure: three subnets connected by routers R1 and R2]
⚫ Subnet 150.100.12.128: H1 150.100.12.154, H2 150.100.12.176, R1 interface 150.100.12.129
⚫ R1 also has interface 150.100.0.1 to the rest of the Internet and interface 150.100.12.4 on subnet 150.100.12.0
⚫ Subnet 150.100.12.0: H3 150.100.12.24, H4 150.100.12.55, R2 interface 150.100.12.1
⚫ Subnet 150.100.15.0: H5 150.100.15.11, R2 interface 150.100.15.54

Routing with Subnetworks
⚫ IP layer in hosts and routers maintains a routing table
⚫ Originating host: To send an IP packet, consult routing table
  ⚫ If destination host is in same network, send packet directly using appropriate network interface
  ⚫ Otherwise, send packet indirectly; typically, routing table indicates a default router
⚫ Router: Examine IP destination address in arriving packet
  ⚫ If destination IP address not own, router consults routing table to determine next-hop
and associated network interface & forwards packet

Unit 04.01.04 CS 5220: COMPUTER COMMUNICATIONS Subnet Routing Examples XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Routing Table
⚫ Each row in routing table contains:
  ⚫ Destination IP address
  ⚫ IP address of next-hop router
  ⚫ Physical address
  ⚫ Statistics information
  ⚫ Flags
    ⚫ H=1 (0) indicates route is to a host (network)
    ⚫ G=1 (0) indicates route is to a router (directly connected destination)

Routing Search and Actions
⚫ Routing table search order & action
  ⚫ Complete destination address; send as per next-hop & G flag
  ⚫ Destination network ID; send as per next-hop & G flag
  ⚫ Default router entry; send as per next-hop
  ⚫ Declare packet undeliverable; send ICMP “host unreachable error” packet to originating host

Example 1: A packet with 150.100.15.11 arrives at R1
[Figure: subnet example topology with R1, R2, and hosts H1 to H5]
Routing Table at R1
  Destination      Next-Hop         Flags  Net I/F
  127.0.0.1        127.0.0.1        H      lo0
  150.100.12.128   150.100.12.129          emd0
  150.100.12.0     150.100.12.4            emd1
  150.100.15.0     150.100.12.1     G      emd1

Example 1: Subnetting Scheme
⚫ IP address 150.100.15.11 in binary string is 10010110 01100100 00001111 00001011
⚫ Apply subnet mask to IP addresses to find corresponding subnet
  ⚫ IP address = 10010110 01100100 00001111 00001011
  ⚫ Mask       = 11111111 11111111 11111111 10000000
  ⚫ AND        = 10010110 01100100 00001111 00000000
  ⚫ Subnet = 150.100.15.0
⚫ Routing table at R1: the 150.100.15.0 entry matches, so the packet is sent to next-hop 150.100.12.1 (R2, flag G) on interface emd1

Example 2: Host H5 sends packet to host H2
[Figure: same topology; the packet addressed to 150.100.12.176 is shown at H5, then at R2, then at R1]
Routing Table at H5
  Destination      Next-Hop         Flags  Net I/F
  127.0.0.1        127.0.0.1        H      lo0
  default          150.100.15.54    G      emd0
  150.100.15.0     150.100.15.11           emd0
⚫ At H5: 150.100.12.176 is not on subnet 150.100.15.0, so the default entry applies and the packet is sent to 150.100.15.54 (R2)

Routing Table at R2
  Destination      Next-Hop         Flags  Net I/F
  127.0.0.1        127.0.0.1        H      lo0
  default          150.100.12.4     G      emd0
  150.100.15.0     150.100.15.54           emd1
  150.100.12.0     150.100.12.1            emd0
⚫ At R2: the destination subnet is 150.100.12.128, which has no entry, so the default entry applies and the packet is sent to 150.100.12.4 (R1) on emd0

Routing Table at R1 (same entries as in Example 1)
⚫ At R1: the 150.100.12.128 entry matches; the packet is delivered directly to H2 through interface emd0 (150.100.12.129)

Unit 04.02.01 CS 5220: COMPUTER COMMUNICATIONS Classless Interdomain Routing (CIDR) XIAOBO ZHOU, Ph.D.
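The subnet routing examples combine two mechanical steps: AND the destination with the subnet mask, then search the routing table in the order given under Routing Search and Actions. A minimal sketch follows, assuming the 25-bit mask used throughout the examples and the table contents as read from the slide figures; `subnet_of` and `lookup` are hypothetical names, and the tables are plain dictionaries rather than any real router data structure.

```python
import ipaddress

# Mask with 7 trailing 0s, as in the Subnetting Scheme slide.
MASK = int(ipaddress.IPv4Address("255.255.255.128"))

def subnet_of(addr: str) -> str:
    """Bitwise AND of the address with the subnet mask."""
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(addr)) & MASK))

# destination -> (next hop, flags, interface), as read from the slides
R1_TABLE = {
    "127.0.0.1":      ("127.0.0.1",      "H", "lo0"),
    "150.100.12.128": ("150.100.12.129", "",  "emd0"),
    "150.100.12.0":   ("150.100.12.4",   "",  "emd1"),
    "150.100.15.0":   ("150.100.12.1",   "G", "emd1"),
}
H5_TABLE = {
    "127.0.0.1":    ("127.0.0.1",     "H", "lo0"),
    "default":      ("150.100.15.54", "G", "emd0"),
    "150.100.15.0": ("150.100.15.11", "",  "emd0"),
}

def lookup(table, dest: str):
    """Search order: complete address, then subnet ID, then default entry."""
    for key in (dest, subnet_of(dest), "default"):
        if key in table:
            return table[key]
    return None  # undeliverable: a router would send ICMP host unreachable
```

With these tables, `lookup(R1_TABLE, "150.100.15.11")` selects the 150.100.15.0 entry of Example 1, and `lookup(H5_TABLE, "150.100.12.176")` falls through to the default router as in Example 2.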
Professor, Department of Computer Science

IP Address Problems
⚫ IP Address Exhaustion
  ⚫ Class A, B, and C address structure inefficient
    ⚫ Class B too large for most organizations, but future proof
    ⚫ Class C too small
⚫ IP routing table size
  ⚫ Growth in # of networks in Internet reflected in # of table entries
  ⚫ Stress on router processing power and memory allocation
⚫ Short-term solution:
  ⚫ Classless Interdomain Routing (CIDR), RFC 1518
  ⚫ New allocation policy (RFC 2050)
  ⚫ Private IP Addresses set aside for intranets (NAT)
⚫ Long-term: IPv6 with much bigger address space

Classless Interdomain Routing Scheme
⚫ CIDR uses an arbitrary prefix length to indicate the network number
  ⚫ 205.100.0.0/22
  ⚫ /22 means mask: 11111111 11111111 11111100 00000000
  ⚫ 255.255.252.0
⚫ Packets are routed according to the prefix w/o address classes
  ⚫ An entry in CIDR routing table contains 32-bit IP address and 32-bit mask
  ⚫ Enables supernetting to allow a single routing entry to cover a block of classful addresses

CIDR Aggregation
⚫ A company is allocated the following four contiguous /24 networks. At some router, it is often true that all four networks use the same outgoing line. CIDR aggregation can then be done to reduce the number of entries at the router.
  ⚫ 128.56.24.0/24; 128.56.25.0/24; 128.56.26.0/24; 128.56.27.0/24.
    10000000 00111000 00011000 00000000
    10000000 00111000 00011001 00000000
    10000000 00111000 00011010 00000000
    10000000 00111000 00011011 00000000
  ⚫ By per-bit AND: 128.56.24.0/22
    10000000 00111000 00011000 00000000
  ⚫ (Instead of 4 entries in routing table, one entry is sufficient by CIDR)

CIDR Scheme and Range
⚫ CIDR deals with Routing Table Explosion Problem
  ⚫ Networks represented by prefix and mask
  ⚫ Summarize a contiguous group of class C addresses using variable-length mask, if all of them use the same outgoing line
⚫ Solution: Route according to prefix of address, not class
  ⚫ Routing table entry has <IP address, network mask>
  ⚫ Example: 192.32.136.0/21
    11000000 00100000 10001000 00000001 min address
    11111111 11111111 11111--- -------- mask
    11000000 00100000 10001--- -------- IP prefix
    11000000 00100000 10001111 11111110 max address
  ⚫ Eight C networks: 192.32.136.0/24 to 192.32.143.0/24

CIDR Supernetting Example (1)
⚫ Summarize a contiguous group of class C addresses using variable-length mask
⚫ Example: 150.158.16.0/20
  ⚫ IP Address (150.158.16.0) & mask length (20)
  ⚫ IP address = 10010110 10011110 00010000 00000000
  ⚫ Mask       = 11111111 11111111 11110000 00000000
  ⚫ Contains 16 Class C blocks:
  ⚫ From 10010110 10011110 00010000 00000000 i.e. 150.158.16.0/24
  ⚫ Up to 10010110 10011110 00011111 00000000 i.e. 150.158.31.0/24

CIDR Supernetting Example (2)
⚫ A router has the following CIDR entries in its routing table:
  Address/mask      Next hop
  128.56.24.0/22    Interface 0
  128.56.60.0/22    Interface 1
  default           Router 2
⚫ A packet comes with IP address of 128.56.63.10. What does the router do?

CIDR Supernetting Example (2) – Cont.
⚫ 128.56.63.10 and mask 22 bits
  ⚫ IP address = 10000000 00111000 00111111 00001010
  ⚫ Mask       = 11111111 11111111 11111100 00000000
  ⚫ By per-bit AND
  ⚫ Prefix     = 10000000 00111000 00111100 00000000 i.e.
128.56.60.0
⚫ Router table lookup and match, should go to interface 1

New Address Allocation Policy
⚫ Class A & B assigned only for clearly demonstrated need
⚫ Consecutive blocks of class C assigned (up to 64 blocks)
  ⚫ All IP addresses in the range have a common prefix, and every address with that prefix is within the range
  ⚫ Arbitrary prefix length for network ID improves efficiency
⚫ Address assignment should reflect the physical topology of the network
  ⚫ Facilitates the aggregation of logical packet flows into physical flows

  Address Requirement   Address Allocation
  < 256                 1 Class C
  256 to 512            2 Class C
  512 to 1024           4 Class C
  1024 to 2048          8 Class C
  2048 to 4096          16 Class C
  4096 to 8192          32 Class C
  8192 to 16384         64 Class C

Longest Prefix Match
⚫ With CIDR, multiple entries with different prefix lengths may match a given IP destination address
⚫ Example: perform CIDR aggregation on the following three /24 networks (128.56.24.0/24 goes to a different port)
  ⚫ 128.56.25.0/24; 128.56.26.0/24; 128.56.27.0/24
  ⚫ By CIDR aggregation: 128.56.24.0/22
  ⚫ What if a packet with destination IP address 128.56.24.1 comes?

Example of Longest Prefix Match
⚫ Packet coming with address 128.56.24.1
⚫ Router R: Port 0 leads to Company B (128.56.24.0/24); Port 1 leads to Company A (128.56.25.0/24, 128.56.26.0/24, 128.56.27.0/24)
Routing Table at R
  Address/mask      Next Hop
  128.56.24.0/24    0
  128.56.24.0/22    1
⚫ Longest Prefix Match: packet must be routed using the more specific route (128.56.24.0/24)

Unit 04.02.02 CS 5220: COMPUTER COMMUNICATIONS ARP, Fragmentation and Reassembly XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

Address Resolution Protocol
⚫ IP addresses are said to be logical, because they are defined in terms of logical topology of the routers and end systems.
⚫ The logical IP addresses need to be converted into specific physical addresses that identify the physical endpoints for the Ethernet sender and receiver
⚫ ARP: conversion between IP address and physical address

Address Resolution Protocol (ARP)
⚫ How to map an IP address to a physical address? How to speed up? How fresh?
⚫ H1 wants to learn physical address of H3 -> broadcasts an ARP request
  [Figure: H1 150.100.76.20, H2 150.100.76.21, H3 150.100.76.22, H4 150.100.76.23 on one network]
  ARP request (what is the MAC address of 150.100.76.22?)
⚫ Every host receives the request, but only H3 replies with its physical address
  ARP response (my MAC address is 08:00:5a:3b:94)

Fragmentation and Reassembly
⚫ Each physical network imposes a limit on the size of the packet it can carry, called the maximum transmission unit (MTU).
⚫ Q1: who does it? Q2: penalty?
[Figure: source, router, and destination joined by IP networks; labels: fragment at source, fragment at router, reassemble at destination]

RE: IP Packet Header
[Figure: IP packet header layout]
⚫ Identification, Flags, and Fragment Offset: used for fragmentation and reassembly
⚫ Fragment offset is 13 bits; total length is 16 bits, what does it imply?

Example: Fragmenting a Packet
⚫ Packet is to be forwarded to a network with MTU of 576 bytes.
⚫ The packet has an IP header of 20 bytes and a data part of 1484 bytes.
⚫ Maximum data length per fragment = 576 - 20 = 556 bytes.
⚫ Set maximum data length to 552 bytes to get multiple of 8.

                    Total Length   Id   MF   Fragment Offset
  Original packet   1504           x    0    0
  Fragment 1        572            x    1    0
  Fragment 2        572            x    1    69
  Fragment 3        400            x    0    138

Unit 04.02.03 CS 5220: COMPUTER COMMUNICATIONS DHCP, NAT XIAOBO ZHOU, Ph.D.
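The numbers in the table of Example: Fragmenting a Packet follow from one rule: every fragment except the last carries the largest multiple of 8 data bytes that fits in the MTU, and the offset field counts 8-byte units. A short sketch of that arithmetic is below; `fragment` is a hypothetical helper that returns only (total length, MF, fragment offset) per fragment and ignores the Identification field and IP options.

```python
def fragment(total_length: int, header_len: int, mtu: int):
    """Return a list of (total length, MF flag, fragment offset) tuples."""
    data_len = total_length - header_len
    # Largest data size per fragment that is a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0                                # offset in bytes into the data
    while offset < data_len:
        chunk = min(max_data, data_len - offset)
        more = 1 if offset + chunk < data_len else 0
        fragments.append((header_len + chunk, more, offset // 8))
        offset += chunk
    return fragments

# The example from the slide: 20-byte header, 1484 data bytes, MTU 576.
for row in fragment(1504, 20, 576):
    print(row)
```

Because the offset is stored in units of 8 bytes, a 13-bit offset field can address positions up to 8 x 2^13 = 65,536 bytes, which matches the range of the 16-bit total length field.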
Professor, Department of Computer Science

DHCP
⚫ Dynamic Host Configuration Protocol (RFC 2131)
⚫ Bootstrap Protocol BOOTP allows a diskless workstation to be remotely booted up in a network
  ⚫ UDP port 67 (server) & port 68 (client)
⚫ DHCP builds on BOOTP to allow servers to deliver configuration information to a host
  ⚫ Used extensively to assign temporary IP addresses to hosts
  ⚫ Allows ISP to maximize usage of their limited IP addresses
  ⚫ Time thresholds to enforce lease time

Network Address Translation (NAT)
⚫ Class A, B, and C addresses have been set aside for use within private internets
  ⚫ Private IP addresses are sufficient for use inside of private networks
  ⚫ But packets with private (“unregistered”) addresses are discarded by routers in the global Internet
⚫ NAT (RFC 1631): method for mapping packets from hosts in private internets into packets that can traverse the Internet
  ⚫ A device (computer, router, firewall) acts as an agent between a private network and a public network
  ⚫ A number of hosts can share a limited number of registered IP addresses

Placement and Operation of a NAT Box
⚫ NAT: provides mapping between public IP address and private IP addresses

NAT Operations
Address Translation Table:
  Private address; port   Public address; port
  192.168.0.10; x         128.100.10.15; y
  192.168.0.13; w         128.100.10.15; z
[Figure: private network hosts 192.168.0.10;x and 192.168.0.13;w reach the public network through the NAT device as 128.100.10.15;y and 128.100.10.15;z]
⚫ Hosts inside private networks generate packets with private IP address & TCP/UDP port #s
⚫ NAT maps each private IP address & port # into shared global IP address & available port #
⚫ Translation table allows packets to be routed unambiguously

NAT Discussions
⚫ In theory, up to 2^16 private IP addresses supported by a single public IP address in NAT box
⚫ Overhead in NAT operation
⚫ TCP/UDP port number used for NAT mapping at IP layer, violating OSI layer architecture principle

Unit 04.02.04 CS 5220: COMPUTER COMMUNICATIONS IPv6 XIAOBO ZHOU, Ph.D.
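The address translation table under NAT Operations can be pictured as two dictionaries, one per direction. The class below is an illustrative sketch only: the name `Nat`, the starting port 40000, and the sequential port allocation are assumptions, not how any particular NAT box works.

```python
class Nat:
    """Toy NAT box: maps (private IP, port) pairs onto ports of one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port   # assumption: ports handed out in order
        self.out = {}                 # (private_ip, private_port) -> public_port
        self.back = {}                # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int):
        """Translate the source of an outgoing packet, creating an entry if needed."""
        key = (private_ip, private_port)
        if key not in self.out:
            if self.next_port > 65535:            # 16-bit port space exhausted
                raise RuntimeError("no free public ports")
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def inbound(self, public_port: int):
        """Translate the destination of an incoming packet, or None if unknown."""
        return self.back.get(public_port)
```

Two hosts that happen to use the same private port still get distinct public ports, which is what lets the return traffic be routed unambiguously; the 16-bit port field is also where the 2^16 bound in NAT Discussions comes from.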
Professor, Department of Computer Science

IPv6
⚫ Longer address field:
  ⚫ 128 bits can support up to 3.4 x 10^38 hosts
⚫ Simplified header format:
  ⚫ Simpler format to speed up processing of each packet header
  ⚫ All fields are of fixed size

IPv4 vs IPv6 Overview
⚫ IPv4 vs IPv6 fields:
  ⚫ Same: Version
  ⚫ Dropped: Header length, ID/flags/frag offset, header checksum
  ⚫ Replaced:
    ▪ Datagram length by Payload length
    ▪ Protocol type by Next header
    ▪ TTL by Hop limit
    ▪ TOS by Traffic class
  ⚫ New: Flow label

IPv6 Header Format
  0         4               12      16            24          31
  Version | Traffic Class | Flow Label
  Payload Length                  | Next Header | Hop Limit
  Source Address
  Destination Address
⚫ Version field same size, same location
⚫ Traffic class to support differentiated services
⚫ Flow: sequence of packets from particular source to particular destination for which source requires special handling

IPv6 Basic Header Format
⚫ Payload length: length of data excluding header, up to 65535 B
⚫ Next header: type of extension header that follows basic header
⚫ Hop limit: # hops packet can travel before being dropped by a router

Extension Headers
⚫ Allows an arbitrary number of extension headers to be placed between the basic header and the payload (the extension headers are chained by the next header field)
⚫ Large Packet (Jumbo packet): payload > 64K
  0             8     16     24            31
  Next header | 0   | 194  | Opt len = 4
  Jumbo payload length

Extension Headers
⚫ Fragmentation: At source only
⚫ Source performs “path MTU discovery” (a fragment extension header for each packet fragment)
  0             8          16                29    31
  Next header | Reserved | Fragment offset | Res | M
  Identification

Extension Headers
⚫ Source Routing: strict/loose routes
  0             8               16                 24             31
  Next header | Header length | Routing type = 0 | Segment left
  Reserved    | Strict/loose bit mask
  Address 1
  Address 2
  ...
  Address 24

IPv6 Addressing
⚫ Address Categories
  ⚫ Unicast: single network interface
  ⚫ Multicast: group of network interfaces, typically at different locations. Packet sent to all.
  ⚫ Anycast: group of network interfaces. Packet sent to only one interface in group, e.g. nearest.
⚫ Hexadecimal notation
  ⚫ Groups of 16 bits represented by 4 hex digits
  ⚫ Separated by colons
    ⚫ 4BF5:AA12:0216:FEBC:BA5F:039A:BE9A:2176
⚫ Shortened forms:
  ⚫ 4BF5:0000:0000:0000:BA5F:039A:000A:2176
  ⚫ To 4BF5:0:0:0:BA5F:39A:A:2176
  ⚫ To 4BF5::BA5F:39A:A:2176
⚫ Mixed notation:
  ⚫ ::FFFF:128.155.12.198

Migration from IPv4 to IPv6
⚫ Gradual transition from IPv4 to IPv6
⚫ Dual IP stacks: routers run IPv4 & IPv6
  ⚫ Type field used to direct packet to IP version
⚫ IPv6 islands can tunnel across IPv4 networks
  ⚫ Encapsulate user packet inside IPv4 packet
[Figure: source in an IPv6 network, tunnel head-end, tunnel across an IPv4 network (IPv6 header carried behind an IPv4 header), tunnel tail-end, destination in an IPv6 network]

Unit 04.03.01 CS 5220: COMPUTER COMMUNICATIONS UDP and TCP XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

UDP
⚫ Best effort datagram service
⚫ Multiplexing enables sharing of IP datagram service
⚫ Simple transmitter & receiver
  ⚫ Connectionless: no handshaking & no connection state
  ⚫ Low header overhead
  ⚫ No flow control, no error control, no congestion control
  ⚫ UDP datagrams can be lost or out-of-order
⚫ Applications
  ⚫ multimedia (e.g. RTP)
  ⚫ network services (e.g.
DNS, RIP, SNMP)

UDP Datagram
  0             16                 31
  Source Port | Destination Port
  UDP Length  | UDP Checksum
  Data
⚫ Source and destination port numbers
  ⚫ Client ports are ephemeral
  ⚫ Server ports are well-known
  ⚫ Max number is 65,535
⚫ Port ranges
  ⚫ 0-255: well-known ports
  ⚫ 256-1023: less well-known ports
  ⚫ 1024-65535: ephemeral client ports
⚫ UDP length
  ⚫ Total number of bytes in datagram (including header)
  ⚫ 8 bytes ≤ length ≤ 65,535
⚫ UDP Checksum
  ⚫ Optionally detects errors in UDP datagram

UDP De-Multiplexing
⚫ All UDP datagrams arriving to IP address B and destination port number n are delivered to the same process
⚫ Source port number is not used in demultiplexing
[Figure: hosts A, B, and C, each with ports 1, 2, ..., n above UDP above IP]

UDP Checksum Calculation
UDP pseudoheader:
  0          8               16            31
  Source IP Address
  Destination IP Address
  00000000 | Protocol = 17 | UDP Length
⚫ UDP checksum detects end-to-end errors
⚫ Covers pseudoheader followed by UDP datagram
⚫ IP addresses included to detect misdelivery
⚫ The use of UDP checksums is optional
⚫ But hosts are required to have checksums enabled

TCP
⚫ Reliable byte-stream service
⚫ More complex transmitter & receiver
  ⚫ Connection-oriented: full-duplex unicast connection between client & server processes
  ⚫ Connection setup, connection state, connection release
  ⚫ Higher header overhead
  ⚫ Error control, flow control, and congestion control
  ⚫ Higher delay than UDP
⚫ Most applications use TCP
  ⚫ HTTP, SMTP, FTP, TELNET, POP3, …

TCP Multiplexing
⚫ A TCP connection is specified by a 4-tuple
  ⚫ (source IP address, source port, destination IP address, destination port)
⚫ TCP allows multiplexing of multiple connections between end systems to support multiple applications simultaneously
[Figure: hosts A, B, and C, each with numbered ports above TCP above IP]
⚫ Example connections in the figure: (A, 6234, B, 80), (A, 5234, B, 80), (C, 5234, B, 80)

Reliable Byte-Stream Service
⚫ Stream Data Transfer
  ⚫ transfers a contiguous stream of bytes across the network, with no indication of boundaries
  ⚫ groups bytes into segments
  ⚫ transmits segments as convenient (Push function defined)
⚫ Reliability: error control to deal with IP transfer impairments
[Figure: the sending application writes 45, 15, and 20 bytes into the transport buffer; segments cross the network with error detection & retransmission (ACKs, sequence #); the receiving application reads 40 bytes, then 40 bytes, from its buffer]

Flow Control
⚫ Buffer limitations & speed mismatch can result in loss of data that arrives at destination; p2p issue
⚫ Receiver controls rate at which sender transmits to prevent receiver’s buffer overflow
[Figure: segments arriving at the receiver's transport buffer; labels: buffer used, buffer available = B, advertised window size < B]

TCP Segment Format
  0               4          10                        16                 31
  Source port                                        | Destination port
  Sequence number
  Acknowledgment number
  Header length | Reserved | URG ACK PSH RST SYN FIN | Window size
  Checksum                                           | Urgent pointer
  Options                                                       | Padding
  Data

TCP Header
Window Size
⚫ 16 bits to advertise window size
⚫ Used for flow control
⚫ Sender of this segment will accept bytes with SN from ACK to ACK + window
⚫ Maximum window size 65535 bytes
TCP Checksum
⚫ Internet checksum method
⚫ TCP pseudoheader + TCP segment
TCP pseudoheader:
  0          8              16                   31
  Source IP address
  Destination IP address
  00000000 | Protocol = 6 | TCP segment length

Unit 04.03.02 CS 5220: COMPUTER COMMUNICATIONS TCP Three-way Handshake XIAOBO ZHOU, Ph.D.
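Both the UDP and the TCP checksum use the Internet checksum method over a pseudoheader followed by the datagram or segment. The sketch below works through the UDP case (protocol = 17); the TCP case would differ only in the protocol value 6 and the TCP header layout. The function names are hypothetical, and the addresses and ports in the usage lines are arbitrary sample values.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src_ip: str, dst_ip: str, protocol: int, length: int) -> bytes:
    """Source IP, destination IP, a zero byte, the protocol, and the length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, protocol, length))

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)                 # UDP header is 8 bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, 17, length)
                             + header + payload)
    return csum or 0xFFFF                     # 0 is reserved for "no checksum"

# Receiver-side check: summing everything, checksum included, must give 0.
c = udp_checksum("128.100.10.13", "150.100.12.176", 5234, 53, b"hello")
datagram = struct.pack("!HHHH", 5234, 53, 13, c) + b"hello"
print(internet_checksum(pseudo_header("128.100.10.13", "150.100.12.176", 17, 13)
                        + datagram) == 0)
```

Including the IP addresses in the sum is what lets the receiver notice a datagram that was delivered to the wrong host even though its own bytes are intact.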
Professor, Department of Computer Science

TCP Connection Management
Out-of-Order and Duplication Problem
⚫ Old segments from previous connections may still arrive
⚫ Use long sequence number (32 bits)
⚫ Establish randomly selected initial sequence number (ISN)
⚫ Accept sequence numbers from a small window
⚫ Enforce a time-out period at end of connection, called maximum segment lifetime (MSL), usually 2 minutes but round-trip delay dependent

TCP Header – Seq and Ack
Sequence Number
⚫ Byte count
⚫ First byte in segment
⚫ 32 bits long
⚫ 0 ≤ SN ≤ 2^32 - 1
⚫ Initial sequence number (ISN) selected during connection setup (SYN flag bit is 1)
Acknowledgement Number
⚫ SN of next byte expected by receiver
⚫ Acknowledges that all prior bytes in stream have been received correctly
⚫ Valid if ACK flag is set

TCP Header – Control bits
Control
⚫ 6 bits
⚫ URG: urgent pointer flag
  ⚫ Urgent message end = SN + urgent pointer
⚫ ACK: ACK packet flag
⚫ PSH: override TCP buffering
⚫ RST: reset connection
  ⚫ Upon receipt of RST, connection is terminated and application layer notified
⚫ SYN: establish connection
⚫ FIN: close connection

TCP Connection Establishment
• “Three-way Handshake”
• ISN’s protect against segments from prior connections
[Figure: Host A and Host B exchanging the three handshake segments]

If host always uses the same ISN
[Figure: Host A and Host B; a delayed segment with Seq_no = n+2 will be accepted]

TCP Connection Closing
“Graceful Close”
[Figure: Host A and Host B closing a connection; labels: deliver 150 bytes, TIME-WAIT (2 MSL)]

Unit 04.03.02 CS 5220: COMPUTER COMMUNICATIONS TCP Flow Control and Data Transfer XIAOBO ZHOU, Ph.D.
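The sequence and acknowledgement numbers exchanged in the three-way handshake can be written out explicitly. The handshake figure itself did not survive in these notes, so the sketch below assumes the standard TCP behavior that a SYN consumes one sequence number; `three_way_handshake` is a hypothetical helper that only computes the header values, with arithmetic modulo 2^32 to match the 32-bit sequence number field.

```python
MOD = 2 ** 32   # sequence numbers are 32 bits long and wrap around

def three_way_handshake(isn_a: int, isn_b: int):
    """Return the three handshake segments as dictionaries of header values."""
    syn = {"flags": "SYN", "seq": isn_a}
    # B acknowledges A's SYN (which consumes one sequence number)
    # and announces its own initial sequence number.
    syn_ack = {"flags": "SYN,ACK", "seq": isn_b, "ack": (isn_a + 1) % MOD}
    # A acknowledges B's SYN; the ACK number is the next byte A expects.
    ack = {"flags": "ACK", "seq": (isn_a + 1) % MOD, "ack": (isn_b + 1) % MOD}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(1000, 5000):
    print(segment)
```

Since each side picks its ISN independently, a delayed segment from an earlier connection is unlikely to fall inside the window of sequence numbers the receiver currently accepts, which is the protection the ISN bullets describe.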
Professor, Department of Computer Science

TCP Flow Control
TCP Data Transfer
⚫ Selective Repeat ARQ with positive ACK
⚫ Window slides on a byte basis instead of a packet basis
⚫ Dynamically advertising the window size
[Figure: timeline t1 to t6 of socket calls. Host A (client): socket, connect (blocks), connect returns, write, read (blocks), read returns. Host B (server): socket, bind, listen, accept (blocks), accept returns, read (blocks), read returns, write, read (blocks)]

TCP Window Flow Control
[Figure: Host A and Host B exchanging segments over times t0 to t4; labels: 1024 bytes to transmit (repeated), 128 bytes to transmit, can only send 512 bytes]
⚫ Why delay here?

TCP Connection Management
⚫ Select initial sequence numbers (ISN) to protect against segments from prior connections (delayed duplicates)
⚫ Use local clock to select ISN sequence number
⚫ Time for clock to go through a full cycle should be greater than the maximum lifetime of a segment (MSL); typically MSL = 120 seconds
⚫ High bandwidth connections pose a problem

Sequence Number Wraparound
⚫ 2^32 = 4.29 x 10^9 bytes = 34.3 x 10^9 bits
  ⚫ High bandwidth poses a problem; at 1 Gbps, sequence number wraparound in 34.3 seconds (< MSL that is 120 seconds).
⚫ Timestamp option: insert 32-bit timestamp in header of each segment
  ⚫ Timestamp + sequence no → 64-bit seq. no
  ⚫ Timestamp can be in TCP option
  ⚫ clock must:

Unit 04.03.04 CS 5220: COMPUTER COMMUNICATIONS TCP Congestion Control XIAOBO ZHOU, Ph.D. Professor, Department of Computer Science

TCP Congestion Control
[Figure: packet flows from many sources arrive at a router whose output link runs at R bps]
⚫ Congestion occurs when total arrival rate from all packet flows exceeds R over a sustained period of time
⚫ Buffers at multiplexer will fill and packets will be lost

Phases of Congestion Behavior
[Figure: throughput (bps) vs. arrival rate, and delay (sec) vs. arrival rate, with R marked on the axes]
1. Light traffic
  ⚫ Arrival Rate << R
  ⚫ Low delay
  ⚫ Can accommodate more
2. Knee (congestion onset)
  ⚫ Arrival rate approaches R
  ⚫ Delay increases rapidly
  ⚫ Throughput begins to saturate
3. Congestion collapse
  ⚫ Arrival rate > R
  ⚫ Large delays, packet loss
  ⚫ Useful application throughput drops

Congestion Window
⚫ Desired operating point: just before knee
⚫ TCP sender maintains a congestion window (cwnd) to control congestion at intermediate routers
⚫ Effective window is minimum of congestion window and advertised window
⚫ Problem: sender does not know what its “fair” share of available bandwidth should be
⚫ Solution: adapt dynamically to available BW
  ⚫ Senders probe the network by increasing cwnd
  ⚫ When congestion detected, senders reduce rate
  ⚫ Ideally, sending rate stabilizes near optimal point

Congestion Window (Cont.)
⚫ How does the TCP congestion algorithm change congestion window dynamically according to the most up-to-date state of the network?
⚫ At light traffic: each segment is ACKed quickly
  ⚫ Increase cwnd aggressively
⚫ At knee: segment ACKs arrive, but more slowly
  ⚫ Slow down increase in cwnd
⚫ At congestion: segments encounter large delays, timeout, segments are dropped in router buffers
  ⚫ Reduce transmission rate, then probe again

TCP Congestion Control (1): Slow Start
⚫ Slow start: increase congestion window size by one segment upon receiving an ACK from receiver
  ⚫ initialized at up to 2 segments; usually 1 segment
  ⚫ used at start of data transfer
  ⚫ congestion window increases exponentially
[Figure: cwnd of 1, 2, 4, 8 segments over successive RTTs, with segments sent and ACKs returned]

TCP Congestion Control (2): Congestion Avoidance
⚫ Algorithm progressively sets a congestion threshold
  ⚫ When cwnd > threshold, slow down rate at which cwnd is increased
⚫ Increase congestion window size by one segment per round-trip time (RTT)
  ⚫ Each time an ACK arrives, cwnd is increased by 1/cwnd
  ⚫ In one RTT, all cwnd segments are sent, so total increase in cwnd is cwnd x 1/cwnd = 1
  ⚫ cwnd grows linearly with time
[Figure: cwnd (1, 2, 4, 8) vs. RTTs, with the threshold marked]

TCP Congestion Control (3): Congestion
[Figure: congestion window (0 to 20) vs. round-trip times, showing slow start, threshold, congestion avoidance, and time-out]
⚫ Congestion is detected upon timeout or receipt of duplicate ACKs
⚫ Assume current cwnd corresponds to available bandwidth
⚫ Adjust congestion threshold = ½ x current cwnd
⚫ Reset cwnd to 1
⚫ Go back to slow-start
⚫ Over several cycles expect to converge to congestion threshold equal to about ½ the available bandwidth

Fast Retransmit & Fast Recovery
⚫ Congestion causes many segments to be dropped
⚫ But if only a single segment is dropped, then subsequent segments trigger duplicate ACKs before timeout
⚫ Can avoid large decrease in cwnd as follows:
  ⚫ When three duplicate ACKs arrive before timeout expires, retransmit lost segment immediately
  ⚫ Reset congestion threshold to ½ cwnd
  ⚫ Reset cwnd to congestion threshold + 3 to account for the three segments that triggered duplicate ACKs
  ⚫ Remain in congestion avoidance phase
  ⚫ In absence of timeouts, cwnd will oscillate around optimal value
[Figure: segments SN=1 to SN=5 are sent; the receiver returns ACK=2 four times]

TCP Congestion Control: Fast Retransmit & Fast Recovery
[Figure: congestion window (0 to 20) vs. round-trip times, showing slow start, threshold, congestion avoidance, and time-out]
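The sawtooth shape of the congestion window plots can be reproduced with a small per-RTT simulation of slow start, congestion avoidance, and the reaction to a time-out. This is a deliberately simplified sketch: `simulate_cwnd` is a hypothetical function, cwnd is counted in whole segments and updated once per RTT, slow start is capped at the threshold, the set of RTTs at which time-outs occur is supplied by the caller, and fast retransmit and fast recovery are not modeled.

```python
def simulate_cwnd(rtts: int, initial_threshold: int, timeouts: set):
    """Return the congestion window (in segments) at the start of each RTT."""
    cwnd, threshold = 1, initial_threshold
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in timeouts:
            threshold = max(cwnd // 2, 1)      # threshold = 1/2 x current cwnd
            cwnd = 1                           # reset and go back to slow start
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)    # slow start: doubles every RTT
        else:
            cwnd += 1                          # congestion avoidance: +1 per RTT
    return trace

print(simulate_cwnd(12, 8, {7}))
# prints [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6]
```

In the printed trace the window doubles up to the threshold of 8, grows by one segment per RTT until the time-out at cwnd = 12, and then restarts from 1 with the threshold halved to 6.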