Spanning Tree and Multicast

The Story So Far
• Switched Ethernet is good
  – Besides, switching is needed even to join multiple classical Ethernet networks
• Routing is expensive; learning switches seem cheaper
• Flooding in a network with a cycle of switches is bad (why?)

Flooding With Cycles
[Figure: hosts A, B, C, D attached to switches 1, 2, 3, 4 respectively; the switches form the cycle 1-2-3-4-1]
A wants to send a packet to D
- A sends packet to 1
- 1 floods to 2 and 4
- 2 floods to B and 3
- 4 floods to D and 3 <- D receives packet
- 3 floods packet from 2 to C and 4
- 3 floods packet from 4 to C and 2
- 4 floods packet from 3 to D and 1
- 2 floods packet from 3 to B and 1
- 1 floods packet from 2 to A and 4
- 1 floods packet from 4 to A and 2
- …
- When does this craziness stop?

Fixing with Cycles
• DEC (which is long dead) faced the same problem in the mid-1980s (connecting Catenets)
• Choices offered
  – Rely on network administrators to build loop-free topologies
    • Turns out to be hard
  – Build a protocol
• Given a loopy graph, want no loops
  – Trees are nice, they don’t have loops
  – Need a tree that connects all the nodes somehow
  – Spanning Trees <- Trees spanning all nodes

The High Level Protocol
[Figure: the same topology of hosts A, B, C, D and switches 1, 2, 3, 4]
• Pick a root
  – For the purposes of this class, the root node is the one with the lowest MAC address.
    • In reality, a user-specified “priority” is also added into the mix

The High Level Protocol
• For each node, determine the shortest path to the root
  – Break ties by choosing the lower of two MAC addresses
  – In the example, 3 picks the path through 2 rather than 4

The High Level Protocol
• Disable all links not used by the previously picked paths.

The High Level Protocol
• Recomputation for resilience:
  – If the root goes down, select a new root and rerun the algorithm
  – If another node goes down, adjust links to recreate the tree.
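
The three high-level steps (pick a root, find shortest paths with tie-breaking, disable unused links) can be sketched centrally as below. This is only an illustration under stated assumptions, not how switches actually compute the tree: small integer switch IDs stand in for MAC addresses, every link costs 1, the graph is connected, hosts are left out, and the function name `spanning_tree` is made up for this example.

```python
from collections import deque

def spanning_tree(links):
    """Return (root, parent, disabled) for an undirected switch graph.

    links: iterable of (u, v) pairs. Switch IDs stand in for MAC addresses
    (an assumption of this sketch), and every link has cost 1.
    """
    links = list(links)
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    root = min(neighbors)          # step 1: lowest ID becomes the root
    dist = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:                   # step 2: breadth-first shortest paths
        u = queue.popleft()
        for v in sorted(neighbors[u]):
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
            elif dist[v] == dist[u] + 1 and u < parent[v]:
                parent[v] = u      # equal-length path: prefer the lower ID

    # step 3: every link that is not a (node, parent) pair gets disabled
    used = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    all_links = {frozenset(link) for link in links}
    disabled = sorted(tuple(sorted(link)) for link in all_links - used)
    return root, parent, disabled

# The four-switch cycle from the flooding example.
root, parent, disabled = spanning_tree([(1, 2), (2, 3), (3, 4), (4, 1)])
print(root, parent, disabled)  # disabled is [(3, 4)]: switch 3 reaches the root via 2
```

With the link between 3 and 4 disabled, flooding from any switch reaches every other switch exactly once.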

The Protocol in Real Life
• Uses messages of the form
  – (proposed root, distance to root, node sending message)
  – From the example: (1, 0, 1) <- “Node 1 proposes that 1 be the root, and it is distance 0 away from 1”
• Messages allow for election of the root and for determining distances
• Messages (when they are sent is described next) are always flooded out all ports of a switch
  – This is not a problem even in the presence of loops. Why?

The Protocol in Real Life
• Initially all switches send a message proposing themselves as the root
  – Messages like (1, 0, 1), (2, 0, 2), etc.
• Switches update their view of the root
  – On receiving a message (Y, d, Z) from Z, if id(Y) < id(root), root = Y
• Compute the distance from the root
• If the root or the shortest distance has changed, flood an update message notifying neighbors of the new root and distance
• Periodically everyone reannounces their distance and perceived root
  – Includes the root, which sends (root, 0, root)
  – Used to detect failures and recompute the tree when needed.

Multicast
• Promise of the 90s
  – All TV and live events broadcast over the Internet
    • More viewers, more revenue, all over the world.
  – Too expensive to run one stream per user.

The Problem
[Figure: your favorite media conglomerate (the Producer) sends a separate copy of the stream through the Network to each viewer: You, A, B, …, ZZZZ (everyone else in the world)]
• The individual connections from the “producer” to the network all carry the exact same data.
• Similarly, each individual stream uses network bandwidth, so there might be 15 copies of the same data needlessly using up bandwidth.
• IDEA: Why not have the network deal with this: give it one copy of the data and have it determine where the packets go.
• Does this violate End-to-End?
• Multicast!!!!
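
To make the bandwidth argument concrete, the sketch below counts how many copies of one packet cross each link, first with one unicast stream per viewer and then with the network replicating a single copy. The topology, the node names (`producer`, `r1`, `r2`, `you`, `a`, `b`), and the helper names are all hypothetical; delivery is assumed to follow a fixed tree given as parent pointers.

```python
from collections import Counter

def path_to_producer(tree_parent, node):
    """Links crossed from the producer down to node, found via parent pointers."""
    links = []
    while tree_parent[node] is not None:
        links.append((tree_parent[node], node))
        node = tree_parent[node]
    return links

def link_load(tree_parent, viewers, multicast):
    """Copies of one packet carried by each link."""
    load = Counter()
    for viewer in viewers:
        for link in path_to_producer(tree_parent, viewer):
            if multicast:
                load[link] = 1     # the network duplicates only where paths split
            else:
                load[link] += 1    # every viewer has a private stream
    return load

# Hypothetical topology: producer - r1 - r2, viewers hang off r1 and r2.
tree = {"producer": None, "r1": "producer", "r2": "r1",
        "you": "r2", "a": "r2", "b": "r1"}
viewers = ["you", "a", "b"]

unicast = link_load(tree, viewers, multicast=False)
multicast = link_load(tree, viewers, multicast=True)
print(unicast[("producer", "r1")], multicast[("producer", "r1")])  # 3 vs 1
print(sum(unicast.values()), sum(multicast.values()))              # 8 vs 5
```

The link next to the producer is where the saving is largest: its unicast load grows with the number of viewers, while its multicast load stays at one copy.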

Multicast
• Fundamental things to do
  – Join a multicast group (set of end hosts listening to packets)
  – Send to group
• Different implementations at different layers
  – Link layer
    • Easiest to implement, used in LANs
  – IP
    • Harder to implement, but allows for greater efficiency (as implemented)
  – Application level
    • We ignore this.

Link Layer Multicast
• Each multicast group is denoted by an address G
• Join a group by telling the NIC about G
  – NIC then listens for packets sent to address G
• Send by broadcasting a packet with a destination address of G.
• Very efficient in terms of state (end-host stores everything)
• Inefficient in terms of bandwidth

IP Multicast
[Figure: a network graph with one Sender and two Receivers]
• We focus on intra-domain
• Portion of IP address space is reserved for multicast
• Receivers join a group using IGMP (anyone can join)
• Anyone can send (don’t need to be a part of the group)

IP Multicast
• Take the graph in the figure.
• Want Sender to be able to multicast to receivers
• Minimize the number of packets sent to get one packet from the sender to all receivers.
• A few ways to do this

IP Multicast
• Must build a tree from source to all destinations
• We know how to flood along a tree
• Can build a tree based on a specific source
  – Distance Vector Multicast Routing Protocol
• Build one tree for all possible sources
  – Core Based Trees

DVMRP
• An extension to distance vector routing.
• Consider distances to the source (use the source as the root of a spanning tree).
• Three steps, each getting us closer to the ideal.
  – Reverse Path Flooding
  – Reverse Path Broadcasting
  – Truncated Reverse Path Broadcasting

DVMRP
• RPF
  – If the incoming link is on the shortest path to the source
    • Send on all links except the incoming one
  – Otherwise drop
• Packets sent along the black links in the direction marked will be flooded.
• Sometimes two copies of the same packet are sent.
  – For instance, node X receives the same packet along both L0 and L1 and forwards both copies along L2.
[Figure: Sender, two Receivers, and node X with links L0, L1, and L2]

DVMRP
[Figure: the same topology with nodes X, Y, Z and links L0, L1, L2]
• RPB
  – Pick a single parent for each link.
  – Send packets from the parent along a link.
• In the figure, Y is picked as the parent for L2.
• The packet from Z is not forwarded by X, while the packet from Y is, so only one packet goes through L2.
• Still suboptimal: X does not really need to receive the packet.

DVMRP
• Pruning
  – Do not send packets to destinations that are not in the multicast group.
• Start by sending to everyone as in RPB
• Nodes send an explicit non-membership report (NMR) if no one below them on the tree wants the data.
• In this example X might send an NMR.
• Do not send data to a pruned node.
• An NMR eventually expires, at which point data is transmitted again.

Core Based Trees
[Figure: the network graph with one Sender and two Receivers]
• Build a common tree
• For each group pick a “rendezvous point” (the Core)
• Build a spanning tree rooted at the rendezvous point
• Flood using the spanning tree algorithm
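
The three DVMRP steps (RPF, RPB, pruning) can be compared by counting transmissions on a small example. The sketch below makes several simplifying assumptions that are not from the slides: links are point-to-point with cost 1, the graph is connected, a node's single parent is its lowest-named neighbor that is one hop closer to the source, and pruning is modeled as the steady state after NMRs have been sent (no NMR messages or expiry timers). The topology and all function names are hypothetical; the node names only loosely mirror the figure.

```python
from collections import Counter, deque

def build(links):
    """Adjacency lists for an undirected graph."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    return neighbors

def distances(neighbors, source):
    """Hop counts from the source (what distance vector routing provides)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def rpf(neighbors, source):
    """Reverse path flooding: (transmissions, copies accepted per node)."""
    dist = distances(neighbors, source)
    sent, received = 0, Counter()
    queue = deque((source, v) for v in neighbors[source])
    while queue:
        sender, node = queue.popleft()
        sent += 1
        if dist[sender] + 1 != dist[node]:
            continue                  # not on a shortest path to the source: drop
        received[node] += 1
        queue.extend((node, v) for v in neighbors[node] if v != sender)
    return sent, received

def rpb(neighbors, source, members=None):
    """Reverse path broadcasting; if members is given, skip pruned subtrees."""
    dist = distances(neighbors, source)
    children = {n: [] for n in neighbors}
    for n in neighbors:
        if n != source:               # single parent: lowest-named closer neighbor
            parent = min(p for p in neighbors[n] if dist[p] + 1 == dist[n])
            children[parent].append(n)

    def wanted(node):                 # False models a node that has sent an NMR
        if members is None:
            return True
        return node in members or any(wanted(c) for c in children[node])

    sent, received, stack = 0, Counter(), [source]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if wanted(child):
                sent += 1
                received[child] += 1
                stack.append(child)
    return sent, received

net = build([("S", "Y"), ("S", "Z"), ("Y", "X"), ("Z", "X"),
             ("X", "W"), ("Y", "R1"), ("Z", "R2")])
print(rpf(net, "S")[0])                        # 10 transmissions, W accepts 2 copies
print(rpb(net, "S")[0])                        # 6 transmissions, one per node
print(rpb(net, "S", members={"R1", "R2"})[0])  # 4 transmissions, X and W pruned
```

In this example X accepts the packet from both Y and Z under RPF and forwards both copies to W. RPB keeps only the copy from Y, and pruning stops sending to X altogether because neither X nor W has group members.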