Lecture 11: Addressing, Framing, and Switching in the Link Layer
CS 3035/GZ01: Networked Systems
Kyle Jamieson
Department of Computer Science, University College London

The link layer: Functionality
[Figure: the sending host's link-layer protocol carries an IP datagram inside a frame to the receiving host]
• Enables the exchange of messages (frames) between end hosts
• Functionality:
  1. Framing: Determine start and end of bits and frames
  2. Error control: Detect and/or correct errors
  3. Reliable delivery: Deliver frames exactly once
  4. Medium access control: Control hosts’ access to a shared medium, if applicable

Networked Systems 3035/GZ01

Today
• We finish the functionality of the link layer, and tie it in to IP
  1. Framing and addressing
  2. Repeaters, hubs, and switches
  3. Bootstrapping a host

Framing frames
• We have seen how to frame bits on a link
  – Ethernet’s Manchester encoding
  – Result: An infinite stream of bits on a link
• But two hosts connected on the same physical medium need to be able to exchange frames
  – Service provided by the link layer
  – Implemented by the network adaptor
• Problem: how does the link layer determine where each frame begins and ends? (…how hard can that be?)

Simple approach to framing: count bytes
• Sender includes number of bytes in header
• Receiver extracts this number of bytes of body
[Figure: consecutive frames, each a count field followed by a body of that many bytes of data]
• But what if the Count field is corrupted?
  – L2 will frame the wrong bytes
  – This is called a framing error
  – With high probability, the CRC will detect the framing error and discard that frame, but the receiver then reads its next count from the wrong position, so the following frames are misframed too
[Figure: a bogus count field (61) causes 61 bytes of data to be misdelivered; the next count is read from the wrong bytes, so an unknown number of further bytes is misdelivered]
• This state of persistent framing errors is called desynchronization

Desynchronization
• Once framing on a link is desynchronized, it can stay that way
• Need a method to resynchronize
• But once we have that method, why use counting?

Framing with sentinel bytes
• Delineate beginning of frame with special byte (SYN)
• Delineate end of frame with another special byte (ETX)
[Figure: SYN | Frame contents | ETX]
• What if a sentinel occurs in data?
  – Byte stuffing: insert another special “escape” byte DLE before the sentinel
• What if the escape byte itself occurs in data?
  – Byte stuffing again: Stuff DLE before DLE occurring in data
• Example: SYN | DLE, SYN, DLE, DLE, DLE, ETX | ETX (the frame contents SYN, DLE, ETX after stuffing)
• Can we be more efficient?

Framing with sentinel bits
• Delineate frame with special bit pattern
  – e.g., 01111110 start, 01111111 end
[Figure: 01111110 | Frame contents | 01111111]
• Problem: what if sentinel occurs within frame?
• Solution: bit stuffing
  – Sender always inserts a 0 after five 1s in the frame contents
  – Receiver always removes a 0 appearing after five 1s

When receiver sees five 1s…
[Figure: 01111110 | Frame content | 01111111]
• If next bit is 0, remove it, and begin counting again
  – Because this must be a stuffed bit; we can’t be at beginning/end of frame (those had six or seven 1s)
• If next bit is 1 (i.e., we’ve seen six 1s) then:
  – If following bit is 0, this is start of frame
    • Because the receiver has seen 01111110
  – If following bit is 1, this is end of frame
    • Because the receiver has seen 01111111

Example: sentinel bits
• Original data, including start/end of frame:
  01111110011111101111101111100101111111
• Sender rule: five 1s → insert a 0
  – After bit stuffing at the sender:
  01111110011111010111110011111000101111111
• Receiver rule: five 1s and next bit 0 → remove 0
  01111110011111101111101111100101111111

Comparing addressing schemes
• Network layer address (IP address)
  – Function: move datagram to destination network
  – 32-bit address, dotted quad notation a.b.c.d where each component is an eight-bit unsigned integer
  – Hierarchical address space
• Link layer address (MAC address, Ethernet address):
  – Function: move frame from one point to another point on the same network
  – Unique 48-bit address (in most LANs)
  – Burned into NIC ROM, also sometimes software settable
  – Usually a flat address space

Ethernet addresses
• 48-bit source and destination addresses
  – Receiver’s link layer passes frame up to network-level protocol:
    • If destination address matches the adaptor’s
    • Or the destination address is the broadcast address (ff:ff:ff:ff:ff:ff)
    • Or the card is in a mode of operation that receives all frames (promiscuous mode)
  – Addresses are globally unique
    • Assigned by NIC vendors (top three bytes specify vendor)

Today
• We finish the functionality of the link layer, and tie it in to IP
  1. Framing and addressing
  2. Repeaters, hubs, and switches
    – Comparison
    – Self-learning switches
    – The Spanning Tree Protocol
  3. Bootstrapping a host

Message, segment, datagram, and frame
[Figure: a host’s HTTP message is carried in a TCP segment, which is carried in an IP datagram. On the path host, router, router, host, each link wraps the IP datagram in its own frame: an Ethernet frame between Ethernet interfaces, a SONET frame between SONET interfaces, then an Ethernet frame again]

Different devices switch on different information
• Routers: forward IP datagrams based on network-layer addresses in the IP header (layers: Network, Link, Physical)
• Switches (Bridges): forward link-layer frames based on link-layer addresses in the link-layer header (layers: Link, Physical)
• Repeaters/Hubs: rebroadcast all bits in the physical-layer frame (layer: Physical)

Physical Layer: Repeaters
• Distance limitation in local-area networks
  – Electrical signal becomes weaker as it travels
  – Imposes a limit on the length of a LAN
    • In addition to limit imposed by collision detection
• Repeaters join LANs together
  – Analog electronic device
  – Continuously monitors electrical signals on each LAN
  – Transmits an amplified copy

Physical Layer: Hubs
• Join multiple input lines electrically
  – Do not necessarily amplify the signal
• Very similar to repeaters
  – Also operate at the physical layer

Limitations of repeaters and hubs
• One large place where packets collide (collision domain), since every bit is sent everywhere
  – So, aggregate throughput is limited
  – e.g., three departments each get 10 Mbps independently
  – ... but if they are connected via a hub, they must all share 10 Mbps
• Cannot support multiple LAN technologies
  – Repeaters/hubs do not buffer or interpret frames
  – So, can’t interconnect between different rates or formats
    • e.g., no mixing 100 Mbit/s Ethernet and Gigabit Ethernet
• Limitations on maximum nodes and distances
  – Does not circumvent limitations of the shared medium
  – e.g., still cannot go beyond 2500 m in commercial Ethernet

Link Layer: Switches
• Switches connect two or more LANs at the link layer. A switch:
  – Extracts destination address from the frame
  – Looks up the destination in a table
  – Forwards the frame to the appropriate LAN segment
    • Or point-to-point link, for higher-speed Ethernet
• Each port is its own collision domain (if not just a link)
[Figure: a switch joining a hub-based collision domain and other segments into an extended LAN]

Switches and concurrent communication
• Host A can talk to C, while B talks to D
[Figure: hosts A, B, C, and D each attached to one switch]
• If host has (dedicated) point-to-point link to switch:
  – Full duplex: each connection can send in both directions
  – Completely avoids collisions
    • No need for carrier sense, collision detection, and so on
    • Change in medium access control, but same framing

Switches: Advantages over hubs and repeaters
• Only forwards frames as needed
  – Filters frames to avoid unnecessary load on segments
  – Sends frames only to segments that need to see them
• Extends the geographic span of the network
  – Separate collision domains allow longer distances
• Improves privacy by limiting scope of frames
  – Hosts can “snoop” the traffic traversing their segment
  – ... but not all the rest of the traffic
• Applies CSMA/CD in segment (not whole net)
  – Smaller collision domain
• Joins segments using different technologies

Switches: Disadvantages compared with hubs and repeaters
• Higher cost
  – More complicated devices that cost more money
• Delay in forwarding frames
  – Bridge/switch must receive and parse the frame
  – ... and perform a look-up to decide where to forward
  – Introduces store-and-forward delay
    • Can ameliorate using cut-through switching
      – Start forwarding after only header received
• Need to learn where to forward frames
  – Bridge/switch needs to construct a forwarding table
  – Ideally, without intervention from network administrators
  – Solution: Self-learning algorithm

Motivation for self learning
• Benefit if switch forwards frame only on segment(s) that need it
  – Allows concurrent use of other links
• Switch forwarding table
  – Maps destination link-layer address to outgoing interface
  – Goal: construct the switch table automatically

Self learning algorithm: Building the table
• When a frame (e.g., from A to B) arrives at the switch:
  – Inspect the source link-layer address
    • Associate that address with the incoming switch port
    • Store the mapping in the switch table
    • Use a time-to-live field to forget the mapping once that much time has passed
  – This is an example of soft state
[Figure: a frame from A to B arrives on port 1 of a four-port switch with hosts A, B, C, and D. Switch forwarding table: Address A, Port 1, Time-to-live 2 minutes. The switch has just learned how to reach A.]

Self learning algorithm: Handling misses
• When frame arrives with unfamiliar destination (e.g., B)
  – Forward the frame out all ports except for the one on which the frame arrived
    • This is called flooding
  – Hopefully, this case won’t happen very often
    • When e.g. B replies, switch will learn that node, too
[Figure: same switch; the forwarding table holds only Address A, Port 1, Time-to-live 2 minutes, so the frame for B is flooded]

Self-learning algorithm
When switch receives a frame:
    index into the forwarding table using link-layer destination address
    if entry found for destination {
        if dest on segment from which frame arrived
            then drop frame
            else forward frame on interface indicated
    }
    else flood the frame
(Flood: forward on all ports except the port on which the frame arrived)
Problems?
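The self-learning pseudocode above can be sketched in a few lines of Python. This is a minimal illustration, not a real switch implementation: the class name `LearningSwitch`, the `receive` method, the use of short strings as addresses, and the explicit `now` argument are all assumptions made for the example. The 120-second time-to-live mirrors the "2 minutes" shown in the slides' forwarding table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

TTL_SECONDS = 120.0  # assumption: the 2-minute time-to-live from the slide's table


@dataclass
class LearningSwitch:
    """Hypothetical self-learning switch: address -> (port, expiry time)."""
    ports: List[int]
    table: Dict[str, Tuple[int, float]] = field(default_factory=dict)

    def receive(self, src: str, dst: str, in_port: int, now: float) -> List[int]:
        """Return the ports on which the frame is forwarded (empty = dropped)."""
        # Learn: the source is reachable through the incoming port (soft state).
        self.table[src] = (in_port, now + TTL_SECONDS)

        # Look up the destination, forgetting the entry if its time-to-live ran out.
        entry = self.table.get(dst)
        if entry is not None and entry[1] <= now:
            del self.table[dst]
            entry = None

        if entry is None:
            # Miss: flood on all ports except the one the frame arrived on.
            return [p for p in self.ports if p != in_port]
        out_port = entry[0]
        if out_port == in_port:
            return []  # destination is on the arrival segment: drop
        return [out_port]


if __name__ == "__main__":
    sw = LearningSwitch(ports=[1, 2, 3, 4])
    print(sw.receive("A", "B", in_port=1, now=0.0))  # B unknown: [2, 3, 4]
    print(sw.receive("B", "A", in_port=2, now=1.0))  # A was learned: [1]
    print(sw.receive("A", "B", in_port=1, now=2.0))  # B was learned: [2]
```

Passing the clock in as `now` keeps the sketch deterministic. Note that nothing here prevents a flooded frame from circulating forever if switches are connected in a cycle, which is the problem the "Problems?" question points to.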

Flooding can lead to loops
• Switches sometimes need to flood frames:
  – Upon receiving a frame with an unfamiliar destination
  – Upon receiving a frame sent to the broadcast address
• Flooding can lead to forwarding loops
  – e.g., if the network contains a cycle of switches
    • Either accidentally, or by design for higher reliability
• This is catastrophic, for two reasons:
  1. Unlike IP, layer 2 has no way of preventing frame looping
  2. Ethernet duplicates frames, leading to an exponential increase, quickly crashing the extended LAN (this is called a broadcast storm)
• How can we revise the bridge learning algorithm to avoid broadcast storms?

The spanning tree protocol (STP)
• Early 1980s: Digital Equipment Corporation, a key Ethernet vendor, wanted to leverage the benefits of loops while avoiding broadcast storms
• Radia Perlman’s idea: Switches agree on a loop-free and connected spanning tree
  – Spanning tree: a sub-graph that touches all vertices but contains no cycles
[Figure: a graph with cycles, and a spanning tree of it with no cycles]
• Once the spanning tree is formed:
  – Switches use the switch learning algorithm to forward data frames over the tree links only

Spanning Tree Protocol (STP): Overview
• Users connect Ethernet switches and shared-medium Ethernet LANs together
  – Arbitrarily, possibly creating forwarding loops
• Need a distributed algorithm so that:
  1. Switches cooperate to build the spanning tree
  2. Switches adapt automatically when failures occur
[Figure: example network of four switches, numbered 1 to 4]

STP: Key ingredients of the algorithm
• Switches elect one root switch from which to build the tree
  – Switch identifier = link-layer address on one port
• Switches block some ports from sending or receiving frames of Ethernet type IP (or other L3 data)
• To form tree, switches exchange configuration messages (R, d, X):
  – From switch X
  – Proposing switch R (which is d hops away) as the root
  – Configuration messages are never blocked
[Figure: the four-switch network with switch 1 as root switch and blocked ports marked B]
• Let’s begin with a simplified version of the full STP distributed algorithm

Simplified STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
[Figure: switch X with Root id: X]

Simplified STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
• Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure state: switch 4: root id 4; switch 3: root id 3; switch 2: root id 2; switch 1: root id 1]
• Switch 2 sends (2, 0, 2); switch 3 sets its root id to 2, switch 1 ignores
[Figure state: switch 4: root id 4; switch 3: root id 2; switch 2: root id 2; switch 1: root id 1]
• Switch 1 sends (1, 0, 1); switches 2 and 3 set their root ids to 1
[Figure state: switch 4: root id 4; switch 3: root id 1; switch 2: root id 1; switch 1: root id 1]
• Switch 3 sends (3, 0, 3); switch 4 sets its root id to 3, others ignore
[Figure state: switch 4: root id 3; switch 3: root id 1; switch 2: root id 1; switch 1: root id 1]
• Switch 4 sends (4, 0, 4); switch 3 ignores
• Not yet agreeing on the identity of the root: let’s now see how switches propagate information through the network

Simplified STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
  2. Its configuration message to send
    – Initially, announcing itself as root with zero distance to root: (X, 0, X)
[Figure: switch X with Root id: X, Msg: (X, 0, X)]

Simplified STP: Calculating the message
• Switch X finds its distance from the root (d):
  1. If X thinks it is the root, d = 0
  2. Otherwise, d = the minimum distance from messages received matching X’s root id (call it r), plus one
[Figure state: switch 4: root id 4, msg (4, 0, 4); switch 3: root id 3, msg (3, 0, 3); switch 2: root id 2, msg (2, 0, 2); switch 1: root id 1, msg (1, 0, 1)]
Configuration message rule: Switch X sets its configuration message to (r, d, X).
If its configuration message changes, X sends the updated message immediately.
• Switch 1 sends (1, 0, 1); switches 2 and 3 update their root ids and messages
[Figure state: switch 4: root id 3, msg (4, 0, 4); switch 3: root id 1, msg (1, 1, 3); switch 2: root id 1, msg (1, 1, 2); switch 1: root id 1, msg (1, 0, 1)]
• Switch 3 sends (1, 1, 3); switch 4 updates its root id and message
[Figure state: switch 4: root id 1, msg (1, 2, 4); switch 3: root id 1, msg (1, 1, 3); switch 2: root id 1, msg (1, 1, 2); switch 1: root id 1, msg (1, 0, 1)]
• Now all switches agree on the root identifier. But how do they decide which ports to block to form the spanning tree?

STP: Port status
• All switches connected to an Ethernet LAN (or the two at the ends of a cable) agree on a single “designated” port
• Designated port: The port on the shortest path from the LAN or cable to the root is the designated port (D)
  – The designated port forwards frames from the LAN to the root
  – Only designated ports send configuration messages
• Root port: Each non-root switch notes which of its ports is on the shortest path to the root; this port is the root port (R)
• Blocked port: If neither designated nor root, a port is a blocked port (B), not forwarding data traffic
[Figure: the four-switch network with each port marked D, R, or B]

STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
  2. Its configuration message to send
    – Initially, announcing itself as root with zero distance to root: (X, 0, X)
  3. For each of X’s ports:
    – Whether designated (D), root (R), or blocking (B) data traffic
      • Initially, designated (D)
    – “Best” configuration message heard on that port
      • Initially, its own configuration message (X, 0, X)
[Figure: switch X with Root id: X, Msg: (X, 0, X), and a port labelled D: (X, 0, X)]

STP: Designated port rule
• At a switch, for each port p:
  – Consider all configuration messages received on port p and the configuration message the switch would send
  – If switch receives a “better” configuration message on a port p, don’t send configuration messages on port p
  – Else, p is designated: send configuration message on p
• Rule for comparing configuration messages: (R1, d1, X1) is better than (R2, d2, X2) if
    R1 < R2, or
    (R1 = R2 and d1 < d2), or
    (R1 = R2 and d1 = d2 and X1 < X2)

STP: Complete example
• All switches begin thinking they are root with all ports in the designated state
[Figure state: each switch X has root id X, msg (X,0,X), and every port D: (X,0,X)]
• Switch 1 sends (1,0,1); switches 2 and 3 update their root ids, ports, and msgs
  – Each switch’s port remembers the best configuration message seen so far
  – Switch 2 breaks the “tie” between the two copies of (1,0,1) locally by numbering its ports
[Figure state: switch 4: root id 4, msg (4,0,4), port D: (4,0,4). Switch 3: root id 1, msg (1,1,3), ports R: (1,0,1), D: (3,0,3), D: (3,0,3). Switch 2: root id 1, msg (1,1,2), ports R: (1,0,1), B: (1,0,1), D: (2,0,2). Switch 1: root id 1, msg (1,0,1), ports D: (1,0,1), D: (1,0,1)]
• Switch 3 sends (1,1,3) from its designated ports; switch 4 updates its root id and message
  – Switch 2, port 3 remains designated because Switch 2’s message (1,1,2) is better than (1,1,3)
  – Switch 1, port 1 remains designated because Switch 1’s message (1,0,1) is better than (1,1,3)
[Figure state: switch 4: root id 1, msg (1,2,4), port R: (1,1,3). Switch 2, port 3 is now D: (1,1,3). Other ports unchanged]
• Switch 2 sends (1,1,2) from port 3 only
  – Switch 3 blocks its port 3 since (1,1,2) is better than its message (1,1,3)
[Figure state, as the port numbers in the figure suggest: switch 4: root id 1, msg (1,2,4), port R: (1,1,3). Switch 3: root id 1, msg (1,1,3), port 1 R: (1,0,1), port 2 D: (3,0,3), port 3 B: (1,1,2). Switch 2: root id 1, msg (1,1,2), port 1 R: (1,0,1), port 2 B: (1,0,1), port 3 D: (1,1,3). Switch 1: root id 1, msg (1,0,1), ports 1 and 2 D: (1,0,1)]

STP: Dynamics
• When do switches send configuration messages?
  – A switch that thinks it is the root sends periodically, with parameter hello time (two seconds recommended in 802.1d)
  – Other switches send on all designated ports upon receiving the root’s message
• How does the algorithm adapt to topology changes?
  – State table contains an age field, which is updated continuously
  – Aging rule: If age reaches a threshold max age (20 sec in 802.1d), discard that table entry and recalculate using all rules
  – Recalculate when a better or newer configuration message is received on port p (resulting in a table entry being overwritten)
• What happens if max age is too big? Too small?

STP: Handling failures
• Suppose the Ethernet LAN fails
[Figure state: as at the end of the complete example]
• Switch 3:
  – Stops hearing the root’s messages through port 1, so it becomes designated
  – Port 3 becomes root
  – Updates its own message
[Figure state: switch 3: root id 1, msg (1,2,3), port 1 D: (1,2,3), port 3 R: (1,1,2)]
• Switch 4:
  – Updates message heard on root port
  – Updates its own message
[Figure state: switch 4: root id 1, msg (1,3,4), port R: (1,2,3)]
• Switch 2:
  – Stops hearing the root’s messages through port 2, so it becomes designated
[Figure state: switch 2: root id 1, msg (1,1,2), port 2 D: (1,1,2)]

STP: Handling topology change
• Suppose we fix the LAN. Now we have created (temporary) forwarding loops
  – This also happens when switches are powered up
[Figure state: unchanged from the end of the failure example, so switch 3 port 1, switch 2 port 2, and switch 1 port 1 are all designated on the repaired LAN]

STP: Pre-forwarding port state
• Suppose any of the following apply to a port:
  1. Transition from B → D
  2. Any newly-connected port (detect Ethernet carrier)
  3. Any port on a freshly powered switch
• The port then enters the pre-forwarding (PF) state, where:
  – It sends configuration messages and transitions to blocked and root states as if designated
  – But it does not forward data frames, so can’t create loops
[Figure state: switch 3 port 1 PF: (1,2,3); switch 2 port 2 PF: (1,1,2); switch 1 port 1 PF: (1,0,1)]
• Switch 3 returns to old state
[Figure state: switch 3: msg (1,1,3), port 1 R: (1,0,1)]
• Switch 2 returns to old state
[Figure state: switch 2: port 2 B: (1,0,1)]
• Switch 4 returns to old state
[Figure state: switch 4: msg (1,2,4), port R: (1,1,3)]
• Now switch 1, port 1 remains in the pre-forwarding state

STP: Leaving the pre-forwarding state
• If still in PF state after some number of seconds (forwarding delay parameter) then the port becomes designated (D)
[Figure state: switch 1, port 1 changes from PF: (1,0,1) to D: (1,0,1)]
• How long should forwarding delay be?
– Long enough for the entire spanning tree to re-form, i.e.: R: (1,1,3) Root id: 1 Msg: (1,2,4) D: (3,0,3) 2 1 R: (1,0,1) 3 3 Root id: 1 Msg: (1,1,3) R: (1,1,2) D: (1,1,3) 3 2 B: (1,0,1) – Twice the maximum transit time across the extended LAN • 30 seconds in 802.1d 2 Root id: 1 Msg: (1,1,2) 1 R: (1,0,1) 1 D: (1,0,1) Networked Systems 3035/GZ01 4 1 Root id: 1 Msg: (1,0,1) 2 D: (1,0,1) 60 The evolution of Ethernet • From the coaxial cable shared medium to switches – Even more capacity, with simultaneous conversations • From 3 Mbit/s experimental Ethernet to 100 Gbit/s recent standards • From electrical signaling to optical • Changed everything except the frame format • Lesson: The right interface can accommodate many changes – Implementation is hidden behind interface Networked Systems 3035/GZ01 61 Today • We finish the functionality of the link layer, and tie it in to IP 1. Framing and addressing 1. Repeaters, hubs, and switches 2. Bootstrapping a host – Protocols for bootstrapping: DHCP, ARP – Communicating over the same, different networks Networked Systems 3035/GZ01 62 What does a host need to know? • What IP address should the host use? • What local DNS server to use? • How to tell which destinations are local? – How to address them using the local network? • How to send packets to remote destinations? ??? 1.2.3.7 1.2.3.156 host host ... DNS host host ... DNS 5.6.7.0/24 1.2.3.0/23 1.2.3.19 router Networked Systems 3035/GZ01 router router 63 Avoiding manual configuration • Dynamic Host Configuration Protocol (DHCP) – End host learns how to send packets – Learn IP address, DNS servers, “gateway,” what’s local • Address Resolution Protocol (ARP) – For local destinations, learn the mapping between IP address and MAC address 1.2.3.48 1.2.3.7 1.2.3.156 host host 1.2.3.0/23 255.255.254.0 ... DNS host host ... DNS 5.6.7.0/24 1.2.3.19 router Networked Systems 3035/GZ01 1A-2F-BB-76-09-AD router router 64 Key ideas in both protocols • Broadcasting: when in doubt, shout! 
– Broadcast query to all hosts in the local-area-network – … when you don’t know how to identify the right one • Caching: remember the past for a while – Store the information you learn to reduce overhead – Remember your own address and other host’s addresses • Soft state: eventually forget the past – Associate a time-to-live field with the information – On expiry either refresh or discard the information – This is key for robustness in the face of unpredictable change Networked Systems 3035/GZ01 65 Bootstrapping problem • Host doesn’t have an IP address yet – So, host doesn’t know what source address to use • Host doesn’t know whom to ask for an IP address – So, host doesn’t know what destination address to use host host router Networked Systems 3035/GZ01 router 66 DHCP discovery, from the client • DHCP Solution: “shout” to discover a server that can help – Client broadcasts a DHCP discover message (to the broadcast IP address, 255.255.255.255) – Two possibilities: 1. Server on same subnet sends a reply offering an address 2. 
Or: a DHCP relay agent (configured only with DHCP server’s IP address) unicasts to a DHCP server on another network • DHCP server replies unicast to relay agent; agent forwards replies to the new host’s network host DHCP server host DHCP server DHCP relay router Networked Systems 3035/GZ01 router 67 Response from the DHCP server • The server responds with a DHCP offer message – Contains configuration parameters (including proposed IP address, mask, gateway router, DNS server) – Contains lease time (duration the information remains valid) • Multiple servers may respond – Multiple servers on the same subnetwork – Each may respond with an offer • Accepting one of the offers – Client sends a DHCP request echoing the parameters – The DHCP server responds with a DHCP ACK to confirm – The other servers see they were not chosen • They can then safely offer those same parameters to other clients Networked Systems 3035/GZ01 68 Dynamic Host Configuration Protocol Arriving client DHCP server • Why all the broadcasts? • Discover broadcast: client doesn’t know DHCP server’s identity • Offer, ACK broadcast: client doesn’t have an IP yet • Request broadcast: so other servers can see Networked Systems 3035/GZ01 69 Soft state: Refresh or forget • Why is a lease time necessary? – Client can release the IP address (DHCP release) • e.g., clean shutdown of the computer – But, host might not release the address • e.g., the host crashes • e.g., buggy client software – And you don’t want the address to be allocated forever • Performance trade-offs – Short lease time: returns inactive addresses quickly – Long lease time: avoids overhead of frequent renewals & lessens frequency of lease being denied Networked Systems 3035/GZ01 70 So, now the host knows things… IP address Mask Gateway router DNS server • And can send packets to other IP addresses • But: how to use the local network to do this? Networked Systems 3035/GZ01 71 Figuring out where to send locally • Two cases: 1. 
Destination is on the local network: need to address it directly 2. Destination is not local (remote): need to figure out the first “hop” on the local network • Determining if it’s local: use the netmask – e.g., bitwise-AND the destination IP address with 255.255.254.0 – Is it the same value as when we do the same with ownIP address? • Yes destination IP is local; no destination IP is remote 1.2.3.48 1.2.3.7 1.2.3.156 host host 1.2.3.0/23 255.255.254.0 Networked Systems 3035/GZ01 ... DNS 1A-2F-BB-76-09-AD host host ... DNS 5.6.7.0/24 1.2.3.19 router router router 72 Figuring out where to send locally (2) • If it’s remote, look up the first hop in a (very small) local routing table – e.g., by default, route via 1.2.3.19 – Now do the local case but for 1.2.3.19 rather than ultimate destination IP address 1.2.3.48 1.2.3.7 1.2.3.156 host host 1.2.3.0/23 255.255.254.0 ... DNS 1A-2F-BB-76-09-AD host host ... DNS 5.6.7.0/24 1.2.3.19 router router router • For the local case, need to determine the destination’s link-layer address • How does a host translate the next hop IP address to a link-layer address? Networked Systems 3035/GZ01 73 Address Resolution Protocol (ARP) • Every node maintains an ARP table – (IP address, link-layer address) pairs • Consult the table when sending a packet – Map destination IP address to destination MAC address – Encapsulate and transmit the data packet • But: what if IP address not in the table? – Sender broadcasts: “Who has IP address 1.2.3.156?” – Receiver responds (unicast, to the source of the broadcast): “link-layer address 58-23-D7-FA-20-B0” – Sender caches result in its ARP table • Sender may include its own <IP, link-layer> address mapping in request, so that receiver can reply back to the sender Networked Systems 3035/GZ01 74 Example: Putting it all together • How does host A send a datagram to host B? 1. A sends packet to R 2. 
R sends packet to B
[Figure: host A (128.16.74.92, 74:29:9c:e8:ff:55, netmask 0xfffff000) on network 128.16.64.0/20; router R with interfaces 128.16.64.1 (e6:e9:00:17:bb:4b) and 128.17.0.1 (1a:23:f9:cd:06:9b); host B (128.17.0.2, 49:bd:d2:c7:56:2a) on network 128.17.64.0/20]

Host A decides to send through R
• Host A constructs an IP packet to send to B
  – IP source 128.16.74.92, IP destination 128.17.0.2
• Host A has a gateway router R
  – Used to reach any destination outside of 128.16.64.0/20
  – Address 128.16.64.1 for R learned via DHCP

Host A sends packet through R
• Host A learns the MAC address of R’s interface
  – ARP request: broadcast request for 128.16.64.1
  – ARP response: R responds with e6:e9:00:17:bb:4b
• Host A encapsulates the packet in a link-layer header and sends it to R
[Figure: frame addressed “To: R” carrying the IP packet (source A, destination B, data)]

R decides how to forward datagram
• Router R’s left interface receives the packet
  – R extracts the IP packet from the Ethernet frame
  – R sees the IP packet is destined to 128.17.0.2
• Router R consults its forwarding table
  – Packet matches 128.17.64.0/20 via right interface

R sends datagram to B
• Router R’s right interface learns the link-layer address of host B
  – ARP request: broadcast request for 128.17.0.2
  – ARP response: B responds with 49:bd:d2:c7:56:2a
• Router R encapsulates
the packet and sends to B
[Figure: the A, R, B topology from the preceding slides; frame addressed “To: B” carrying the IP packet (source A, destination B, data)]

Security analysis of ARP
• Impersonation
  – Any node that hears an ARP request can answer…
  – …and can say whatever it wants
  – The actual legitimate receiver never sees a problem
    • Because even though later packets carry its IP address, its NIC doesn’t capture them: they are not sent to its link-layer address
• Man-in-the-middle attack
  – Imposter updates frames with the correct link-layer address and forwards whatever it receives to the legitimate destination…
    • …but gets to inspect (and maybe alter) it first
• Does the attacker have to “win” a race?
  – Maybe not, if the sender blindly believes ARP responses

The problem with extended LANs
• Switched LANs afford greater scalability, but extended LANs do not isolate traffic
• Three resulting issues:
1. Security: Allows eavesdropping across LANs, just by putting an interface in promiscuous mode
2. Load: Some LANs are more heavily used than others; it may be desirable to separate them at times
3.
Broadcast scalability: Broadcast frames traverse the entire extended LAN; this reduces overall performance

Virtual LANs (VLANs)
[Figure: switch with numbered ports (1, 2, 4, 8, 9, 10, 15, 16, …) divided between Computer Science and Electrical Engineering]
• Switch assigns each port a color, an identifier designating the VLAN that port belongs to
• Traffic isolation: colors = broadcast domains
• Easily reconfigurable port assignments
• Routing between VLANs: layer 3 routing functionality

VLAN example
• Configure ports on W, X, Y, and Z to be in appropriate VLANs
  – Trunk ports between B1 and B2 configured for both VLANs
• Bridge inserts VLAN header containing color between Ethernet header and payload
[Figure: bridges B1 and B2 joined by a trunk link]
• If a packet contains a VLAN header, bridges only forward on matching-color or trunk ports

Comparing L2 switches and L3 routers
• Advantages of L2 switches over L3 routers
  – No human configuration is needed
  – Fast filtering and forwarding of frames
• Disadvantages of L2 switches relative to L3 routers
  – Topology restricted to a spanning tree
  – Large networks require large ARP tables
  – Broadcast storms can cause the network to collapse
  – Can’t accommodate non-Ethernet segments (why not?)

Acknowledgement
Selected parts adapted from lecture material by Scott Shenker (UC Berkeley) and Kurose and Ross, Computer Networking (4/e)

Coursework 2 due Friday 15th November, 4:05 PM
Midterm exam in regular lecture timeslot, Thursday 14th November
NEXT TIME
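
Worked sketch for the slides “Figuring out where to send locally” and “Host A decides to send through R”: a minimal Python illustration, under stated assumptions, of how a host might choose which IP address to resolve with ARP. The helper names next_hop and lookup_mac and the dict-based ARP table are hypothetical, introduced only for illustration; the addresses are the ones used in the example slides.

```python
import ipaddress


def next_hop(own_ip, netmask, gateway, dest_ip):
    """Return the IP address the host should resolve with ARP.

    Hypothetical helper: the destination itself if it is on the local
    network, otherwise the gateway router.
    """
    own = int(ipaddress.IPv4Address(own_ip))
    dest = int(ipaddress.IPv4Address(dest_ip))
    # Local if both addresses share the same network part under the mask.
    if (own & netmask) == (dest & netmask):
        return dest_ip
    return gateway


def lookup_mac(arp_table, ip):
    """Return the cached link-layer address, or None on a miss.

    On a miss, a real host would broadcast an ARP request and cache the reply.
    """
    return arp_table.get(ip)


# Host A's view, using the values from the example slides.
arp_table = {"128.16.64.1": "e6:e9:00:17:bb:4b"}  # assumed already learned from R's ARP response
hop = next_hop("128.16.74.92", 0xFFFFF000, "128.16.64.1", "128.17.0.2")
print(hop, lookup_mac(arp_table, hop))  # prints: 128.16.64.1 e6:e9:00:17:bb:4b
```

Because B is not on A’s network, the frame is addressed to R’s link-layer address while the IP destination stays 128.17.0.2.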