Connecting to the IP Multicast World David Mitchell NCNE, NLANR, NCAR, and the Pittsburgh Supercomputing Center ncne@ncne.org 2215 1195_05_2000_c2 © 2000, Cisco Systems, Inc. 1 Agenda • • • • • • What Is Multicast Multicast Protocols Multicast Configurations Troubleshooting Futures Useful Links 2 What Is Multicast • Unicast is one to one – traffic is sent from a single source to a single destination • Multicast is one to many – traffic is sent from a single source to a “group” address 3 Unicast vs. Multicast Unicast Server Router Multicast Server Router 4 Why Multicast • More efficient method of sending the same data to lots of hosts – Distributing a video stream to multiple viewers • Allows unique distributed protocols – clients can announce services or perform queries in a distributed fashion ala Appletalk’s “Chooser” 5 Multicast Disadvantages Multicast Is UDP Based!!! • Best Effort Delivery: Drops are to be expected. Multicast applications should not expect reliable delivery of data and should be designed accordingly. Reliable Multicast is still an area for much research. Expect to see more developments in this area. • No Congestion Avoidance: Lack of TCP windowing and “slow-start” mechanisms can result in network congestion. If possible, Multicast applications should attempt to detect and avoid congestion conditions. • Duplicates: Some multicast protocol mechanisms (e.g. Asserts, Registers and SPT Transitions) result in the occasional generation of duplicate packets. Multicast applications should be designed to expect occasional duplicate packets. • Out of Order Delivery : Some protocol mechanisms may also result in out of order delivery of packets. 6 Agenda • What Is Multicast • Multicast Protocols • Multicast Configurations • Troubleshooting • Futures • Useful Links 7 Multicast Protocols • Multicast Addressing Basics • IGMP • PIM • MSDP • MBGP 8 Multicast Addressing • IP Multicast Group Addresses – 224.0.0.0–239.255.255.255 – Class “D” Address Space • High order bits of 1st Octet = “1110” • Reserved Link-local Addresses – 224.0.0.0–224.0.0.255 – Transmitted with TTL = 1 – Examples: • • • • • 224.0.0.1 224.0.0.2 224.0.0.4 224.0.0.5 224.0.0.13 All systems on this subnet All routers on this subnet DVMRP routers OSPF routers PIMv2 Routers 9 Multicast Addressing • Administratively Scoped Addresses – 239.0.0.0–239.255.255.255 – Private address space • Similar to RFC1918 unicast addresses • Not used for global Internet traffic • Used to limit “scope” of multicast traffic • Same addresses may be in use at different locations for different multicast sessions – Examples • Site-local scope: 239.253.0.0/16 • Organization-local scope: 239.192.0.0/14 10 Multicast Addressing IP Multicast MAC Address Mapping (FDDI and Ethernet) 32 Bits 28 Bits 1110 239.255.0.1 5 Bits Lost 01-00-5e-7f-00-01 25 Bits 23 Bits 48 Bits 11 Multicast Addressing IP Multicast MAC Address Mapping (FDDI & Ethernet) Be Aware of the 32:1 Address Overlap 32 - IP Multicast Addresses 224.1.1.1 224.129.1.1 225.1.1.1 225.129.1.1 . . . 238.1.1.1 238.129.1.1 239.1.1.1 239.129.1.1 1 - Multicast MAC Address (FDDI and Ethernet) 0x0100.5E01.0101 12 Multicast Addressing • Dynamic Group Address Assignment – Historically accomplished using SDR application • Sessions/groups announced over well-known multicast groups • Address collisions detected and resolved at session creation time • Has problems scaling – Future dynamic techniques under consideration • Multicast Address Set-Claim (MASC) – Hierarchical, dynamic address allocation scheme – Extremely complex garbage-collection problem. – Long ways off • MADCAP – Similar to DHCP – Need application and host stack support 13 Multicast Addressing • Static Group Address Assignment (new) – Temporary method to meet immediate needs – Group range: 233.0.0.0 - 233.255.255.255 • Your AS number is inserted in middle two octets • Remaining low-order octet used for group assignment – Defined in IETF draft • “draft-ietf-mboned-glop-addressing-00.txt” 14 Multicast Protocols • Multicast Addressing Basics • IGMP • PIM • MSDP • MBGP 15 Host-Router Signaling: IGMP • How hosts tell routers about group membership • Routers solicit group membership from directly connected hosts • RFC 1112 specifies version 1 of IGMP Supported on Windows 95 • RFC 2236 specifies version 2 of IGMP Supported on latest service pack for Windows and most UNIX systems • IGMP version 3 is specified in IETF draft draft-ietf-idmr-igmp-v3-03.txt 16 Host-Router Signaling: IGMP Joining a Group H1 H2 224.1.1.1 H3 Report • Host sends IGMP Report to join group 17 Host-Router Signaling: IGMP Maintaining a Group 224.1.1.1 H1 224.1.1.1 X Suppressed H2 224.1.1.1 H3 X Report Suppressed Query • Router sends periodic Queries to 224.0.0.1 • One member per group per subnet reports • Other members suppress reports 18 Host-Router Signaling: IGMP Leaving a Group (IGMPv1) H1 H2 H3 #1 General Query #2 • • • • Host quietly leaves group Router sends 3 General Queries (60 secs apart) No IGMP Report for the group is received Group times out (Worst case delay ~= 3 minutes) 19 Host-Router Signaling: IGMP Leaving a Group (IGMPv2) H1 H2 224.1.1.1 H3 Leave to #1 224.0.0.2 Group Specific Query to 224.1.1.1 #2 • • • • Host sends Leave message to 224.0.02 Router sends Group specific query to 224.1.1.1 No IGMP Report is received within ~3 seconds Group 224.1.1.1 times out 20 IGMPv3 • draft-ietf-idmr-igmp-v3-??.txt • Early implementations now in beta – Enables hosts to listen only to a specified subset of the hosts sending to the group 21 IGMPv3 Source = 1.1.1.1 Group = 224.1.1.1 Source = 2.2.2.2 Group = 224.1.1.1 R2 R1 • H1 wants to receive from S = 1.1.1.1 but not from S = 2.2.2.2 • With IGMP, specific sources can be pruned back - S = 2.2.2.2 in this case R3 IGMPv3: Join 1.1.1.1, 224.1.1.1 Leave 2.2.2.2, 224.1.1.1 H1 - Member of 224.1.1.1 22 Multicast Protocols • Multicast Addressing Basics • IGMP • PIM • MSDP • MBGP 23 PIM • Distribution Trees • Neighbor Discovery • PIM-DM • PIM-SM 24 Multicast Distribution Trees Shortest Path or Source Distribution Tree Source 1 Notation: (S, G) S = Source G = Group Source 2 A B F D C E Receiver 1 Receiver 2 25 Multicast Distribution Trees Shortest Path or Source Distribution Tree Source 1 Notation: (S, G) S = Source G = Group Source 2 A B F D C E Receiver 1 Receiver 2 26 Multicast Distribution Trees Shared Distribution Tree Notation: (*, G) * = All Sources G = Group A B C D (RP) E F (RP) PIM Rendezvous Point Shared Tree Receiver 1 Receiver 2 27 Multicast Distribution Trees Shared Distribution Tree Source 1 Notation: (*, G) * = All Sources G = Group Source 2 A B C D (RP) E F (RP) PIM Rendezvous Point Shared Tree Source Tree Receiver 1 Receiver 2 28 Multicast Distribution Trees Characteristics of Distribution Trees • Source or Shortest Path trees Uses more memory O(S x G) but you get optimal paths from source to all receivers; minimizes delay • Shared trees Uses less memory O(G) but you may get sub-optimal paths from source to all receivers; may introduce extra delay 29 PIM Neighbor Discovery 171.68.37.2 PIM Router 2 Highest IP Address elected as “DR” (Designated Router) PIM Hello PIM Hello PIM Router 1 171.68.37.1 • PIMv2 Hellos are periodically multicast to the “All-PIM-Routers” (224.0.0.13) group address. (Default = 30 seconds) – Note: PIMv1 multicasts PIM Query messages to the “All-Routers” (224.0.0.2) group address. • If the “DR” times-out, a new “DR” is elected. • The “DR” is responsible for sending all Joins and Register messages for any receivers or senders on the network. 30 PIM Neighbor Discovery wan-gw8>show ip pim neighbor PIM Neighbor Table Neighbor Address Interface 171.68.0.70 FastEthernet0 171.68.0.91 FastEthernet0 171.68.0.82 FastEthernet0 171.68.0.86 FastEthernet0 171.68.0.80 FastEthernet0 171.68.28.70 Serial2.31 171.68.28.50 Serial2.33 171.68.27.74 Serial2.36 171.68.28.170 Serial0.70 171.68.27.2 Serial1.51 171.68.28.110 Serial3.56 171.68.28.58 Serial3.102 Uptime 2w1d 2w6d 7w0d 7w0d 7w0d 22:47:11 22:47:22 22:47:07 1d04h 1w4d 1d04h 12:53:25 Expires 00:01:24 00:01:01 00:01:14 00:01:13 00:01:02 00:01:16 00:01:08 00:01:21 00:01:06 00:01:25 00:01:20 00:01:03 Mode Dense Dense (DR) Dense Dense Dense Dense Dense Dense Dense Dense Dense Dense 31 PIM-DM — Evaluation • Most effective for small pilot networks • Advantages: – Easy to configure—two commands – Simple flood and prune mechanism • Potential issues... – Inefficient flood and prune behavior – Complex Assert mechanism – Mixed control and data planes • Results in (S, G) state in every router in the network • Can result in non-deterministic topological behaviors – No support for shared trees 32 PIM-SM Overview • Explicit join model –Receivers join to the Rendezvous Point (RP) –Senders register with the RP –Data flows down the shared tree and goes only to places that need the data from the sources –Last hop routers can join source tree if the data rate warrants by sending joins to the source • RPF check depends on tree type –For shared trees, uses RP address –For source trees, uses Source address 33 PIM-SM Overview • Only one RP is chosen for a particular group • RP statically configured or dynamically learned (Auto-RP, PIM v2 candidate RP advertisements) • Data forwarded based on the source state (S, G) if it exists, otherwise use the shared state (*, G) • RFC 2326 - “PIM Sparse Mode Protocol Spec” 34 PIM-SM Shared Tree Join RP (*, G) State created only along the Shared Tree. (*, G) Join Shared Tree Receiver 35 PIM-SM Sender Registration RP Source (S, G) State created only along the Source Tree. Traffic Flow Shared Tree Source Tree (S, G) Register (unicast) Receiver (S, G) Join 36 PIM-SM Sender Registration RP Source (S, G) traffic begins arriving at the RP via the Source tree. Traffic Flow Shared Tree Source Tree (S, G) Register (S, G) Register-Stop (unicast) Receiver RP sends a Register-Stop back to the first-hop router to stop the Register process. (unicast) 37 PIM-SM Sender Registration RP Source Traffic Flow Source traffic flows natively along SPT to RP. Shared Tree Source Tree From RP, traffic flows down the Shared Tree to Receivers. Receiver 38 PIM-SM SPT Switchover RP Source Last-hop router joins the Source Tree. Traffic Flow Shared Tree Source Tree (S, G) Join Additional (S, G) State is created along new part of the Source Tree. Receiver 39 PIM-SM SPT Switchover RP Source Traffic Flow Shared Tree Source Tree (S, G)RP-bit Prune Traffic begins flowing down the new branch of the Source Tree. Receiver Additional (S, G) State is created along along the Shared Tree to prune off (S, G) traffic. 40 PIM-SM SPT Switchover RP Source (S, G) Traffic flow is now pruned off of the Shared Tree and is flowing to the Receiver via the Source Tree. Traffic Flow Shared Tree Source Tree Receiver 41 PIM-SM SPT Switchover RP Source (S, G) traffic flow is no longer needed by the RP so it Prunes the flow of (S, G) traffic. Traffic Flow Shared Tree Source Tree (S, G) Prune Receiver 42 PIM-SM SPT Switchover RP Source (S, G) Traffic flow is now only flowing to the Receiver via a single branch of the Source Tree. Traffic Flow Shared Tree Source Tree Receiver 43 PIM-SM Protocol Mechanics • • • • • • PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning 44 PIM-SM State Example sj-mbone> show ip mroute IP Multicast Routing Table Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT M - MSDP created entry, X - Proxy Join Timer Running A - Advertised via MSDP Timers: Uptime/Expires Interface state: Interface, Next-Hop or VCD, State/Mode (*, 224.1.1.1), 00:13:28/00:02:59, RP 10.1.5.1, flags: SCJ Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:13:28/00:02:32 Serial0, Forward/Sparse, 00:4:52/00:02:08 (171.68.37.121/32, 224.1.1.1), Incoming interface: Serial0, Outgoing interface list: Ethernet1, Forward/Sparse, Ethernet0, forward/Sparse, 00:01:43/00:02:59, flags: CJT RPF nbr 192.10.2.1 00:01:43/00:02:11 00:01:43/00:02:11 45 PIM-SM (*,G) State Rules • (*,G) creation – Upon receipt of a (*,G) Join or – Automatically if (S,G) must be created • (*,G) reflects default group forwarding – IIF = RPF interface toward RP – OIL = interfaces • that received a (*,G) Join or • with directly connected hosts or • manually configured • (*,G) deletion – When OIL = NULL and – no child (S,G) state exists 46 PIM-SM (S,G) State Rules • (S,G) creation – By receipt of (S,G) Join or Prune or – By “Register” process – Parent (*,G) created (if doesn’t exist) • (S,G) reflects forwarding of “S” to “G” – IIF = RPF Interface normally toward source • RPF toward RP if “RP-bit” set – OIL = Initially, copy of (*,G) OIL minus IIF • (S,G) deletion – By normal (S,G) entry timeout 47 PIM-SM OIL Rules • Interfaces in OIL added – By receipt of Join message • Intfc’s added to (*,G) are added to all (S,G)’s • Interfaces in OIL removed – By receipt of Prune message • Intfc’s removed from (*,G) are removed from all (S,G)’s – Interface Expire timer counts down to zero • Timer reset (to 3 min.) by receipt of periodic Join or • By IGMP membership report 48 PIM-SM State Flags • S = Sparse Mode • C = Directly Connected Host • L = Local (Router is member) • P = Pruned (All intfcs in OIL = Prune) • T = Forwarding via SPT – Indicates at least one packet was forwarded 49 PIM-SM State Flags (Cont.) • J = Join SPT – In (*, G) entry •Indicates SPT-Threshold is being exceeded •Next (S,G) received will trigger join of SPT – In (S, G) entry •Indicates SPT joined due to SPT-Threshold •If rate < SPT-Threshold, switch back to Shared Tree • F = Register – In (S,G) entry •Indicates the router is a first-hop router and there is a directly connected source or proxy registers are being sent. – In (*, G) entry •Set when “F” set in at least one child (S,G) 50 PIM-SM State Flags (Cont.) • R = RP bit – (S, G) entries only – Set by (S,G)RP-bit Prune/Join – Indicates info is applicable to Shared Tree – Used to prune (S,G) traffic from Shared Tree • Initiated by Last-hop router after switch to SPT – Modifies (S,G) forwarding behavior • IIF = RPF toward RP (I.e. up the Shared Tree) • OIL = Pruned accordingly 51 PIM-SM Protocol Mechanics • • • • • • PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning 52 PIM-SM Forwarding Rules • Use longest match entry – Use (S, G) entry if exists – Otherwise, use (*, G) entry • RPF check first – If Packet didn’t arrive via IIF, drop it. • Forward Packet (if RPF succeeded) – Send out all “unpruned” interfaces in OIL 53 PIM SM Forwarding to RP (10.1.5.1) S0 Shared Tree Multicast Packets (128.9.160.1, 224.1.1.1) S1 E0 Source Tree (SPT) Multicast Packets (128.9.160.43, 224.1.1.1) rtr-a E0 E1 Rcvr A (*, 224.1.1.1) • Packets are “forwarded” out all interfaces in “oilist”. • PIM Sparse mode interfaces are placed on the “oilist” for a Multicast Group IF: – PIM neighbor Joins the group on this interface – Host on this interface has joined the group – Interface has been manually configured to join group. 54 PIM-SM Protocol Mechanics • • • • • • PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning 55 PIM SM Joining • Leaf routers send a (*,G) Join to toward RP – Joins sent hop-by-hop via unicast path toward RP • Each router along path creates (*,G) state – IF no (*,G) state, create it & send a Join toward RP – ELSE Join process complete. Reached the (*,G) tree. 56 PIM SM Joining S1 To RP (10.1.5.1) rtr-a S0 10.1.4.2 E0 10.1.2.1 Shared Tree 10.1.2.2 1 IGMP Join E1 E0 rtr-b Rcvr A 1• “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 57 PIM SM Joining S1 To RP (10.1.5.1) rtr-a S0 10.1.4.2 E0 10.1.2.1 Shared Tree 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:00:05/00:02:54, RP 10.1.5.1, flags: SC Incoming interface: Ethernet0, RPF nbr 10.1.2.1 Outgoing interface list: Ethernet1, Forward/Sparse, 00:00:05/00:02:54 “rtr-b” creates (*, 224.1.1.1) state 58 PIM SM Joining S1 To RP (10.1.5.1) rtr-a S0 10.1.4.2 E0 10.1.2.1 Shared Tree 10.1.2.2 E0 E1 2 PIM Join rtr-b Rcvr A 1• “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 2• “rtr-b” sends (*,G) Join towards RP. 59 PIM SM Joining S1 To RP (10.1.5.1) rtr-a S0 10.1.4.2 E0 10.1.2.1 Shared Tree 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:00:05/00:02:54, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.4.1 Outgoing interface list: Ethernet0, Forward/Sparse, 00:00:05/00:02:54 “rtr-a” creates (*, 224.1.1.1) state. 60 PIM SM Joining 4 Shared Tree S1 To RP (10.1.5.1) 3 PIM Join rtr-a S0 10.1.4.2 E0 10.1.2.1 Shared Tree 10.1.2.2 E0 E1 rtr-b Rcvr A 1• “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 2• “rtr-b” sends (*,G) Join towards RP. 3• “rtr-a” sends (*,G) Join towards RP. 4• Shared tree is built all the way back to the RP. 61 PIM-SM Protocol Mechanics • • • • • • PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning 62 PIM SM Registering • Senders begin sourcing Multicast Traffic – Senders don’t necessarily perform IGMP group joins. • 1st-hop router unicasts “Registers” to RP – A Mcast packet is encapsulated in each Register msg – Registers messages follow unicast path to RP • RP receives “Register” messages – De-encapsulates the Mcast packet inside Register msg – Forwards Mcast packet down Shared Tree – Sends (S,G) Join toward Source / 1st-Hop router to build an (S,G) SPT between Source and RP 63 PIM SM Registering • 1st-hop router receives (S,G) Join – SPT between Source and RP now built. – Begins forwarding traffic down (S,G) SPT to RP – (S,G) Traffic temporarily flowing down 2 paths to RP • RP receives traffic down native (S,G) SPT – Sends a “Register-Stop” msg to Source / 1st-Hop router. • 1st-Hop router receives “Register-Stop” msg – Stops encapsulating traffic in “Register” messages – (S,G) Traffic now flowing down single SPT to RP 64 PIM SM Register Examples • Receivers Join Group First • Source Registers First 65 PIM SM Registering Receiver Joins Group First RP E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c Shared Tree (*, 224.1.1.1), 00:00:03/00:02:56, RP 171.68.28.140, flags:S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Serial0, Forward/Sparse, 00:03:14/00:02:59 Serial1, Forward/Sparse, 00:03:14/00:02:59 State in “RP” before any source registers (with receivers on Shared Tree) 66 PIM SM Registering Receiver Joins Group First RP E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c Shared Tree rtr-b>sh ip mroute 224.1.1.1 No such group State in “rtr-b” before any source registers (with receivers on Shared Tree) 67 PIM SM Registering Receiver Joins Group First RP E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c Shared Tree rtr-a>sh ip mroute 224.1.1.1 No such group. State in “rtr-a” before any source registers (with receivers on Shared Tree) 68 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets RP 1 Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c Shared Tree 1• “Source” begins sending group G traffic. 69 PIM SM Registering Receiver Joins Group First 2 (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 S3 rtr-b S0 S1 rtr-c Shared Tree (*, 224.1.1.1), 00:00:03/00:02:56, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:00:03/00:02:56, flags: FPT Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering Outgoing interface list: Null “rtr-a” creates (S, G) state for source (After automatically creating a (*, G) entry) 1• “Source” begins sending group G traffic. 2• “rtr-a” encapsulates packets in Registers; unicasts to RP. 70 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets Source 171.68.37.121 E0 rtr-a Register Msgs 171.68.28.139 S0 S0 S1 rtr-b RP S3 S0 S1 rtr-c 3 (*, 224.1.1.1) Shared Tree (*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Serial0, Forward/Sparse, 00:09:21/00:02:38 Serial1, Forward/Sparse, 00:03:14/00:02:46 Mcast Traffic (171.68.37.121, 224.1.1.1, 00:01:15/00:02:46, flags: Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Serial0, Forward/Sparse, 00:00:49/00:02:11 Serial1, Forward/Sparse, 00:00:49/00:02:11 “RP” processes Register; creates (S, G) state 3• “rtr-c” (RP) de-encapsulates packets; forwards down Shared tree. 71 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs 4 Join S0 Source 171.68.37.121 E0 rtr-a S0 S0 RP S1 rtr-b S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree 4• RP sends (S,G) Join toward Source to build SPT. 72 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs 5 Join S0 Source 171.68.37.121 E0 S0 rtr-a RP S0 S1 S0 rtr-b S1 rtr-c (*, 224.1.1.1) Mcast Traffic 171.68.28.190 Shared Tree (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial1, RPF nbr 171.68.28.140, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: Incoming interface: Serial0, RPF nbr 171.68.28.190 Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 “rtr-b” processes Join, creates (S, G) state (After automatically creating the (*, G) entry) 4• 5• RP sends (S,G) Join toward Source to build SPT. “rtr-b” sends (S,G) Join toward Source to continue building SPT. 73 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering Outgoing interface list: Serial0, Forward/Sparse, 00:04:28/00:01:32 “rtr-a” processes the (S, G) Join; adds Serial0 to OIL 74 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs 6 Source 171.68.37.121 E0 rtr-a S0 RP S0 S1 rtr-b 7 Register-Stop 6• RP begins receiving (S,G) traffic down SPT. 7• RP sends “Register-Stop” to “rtr-a”. S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree 75 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets 8 Source 171.68.37.121 E0 rtr-a S0 RP S0 S1 rtr-b S3 S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering Outgoing interface list: Serial0, Forward/Sparse, 00:04:28/00:01:32 “rtr-a” stops sending Register messages (Final State in “rtr-a”) 8• (S,G) Traffic now flowing down a single path (SPT) to RP. 76 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 S0 rtr-a S0 S1 rtr-b S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial1, RPF nbr 171.68.28.140, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: T Incoming interface: Serial0, RPF nbr 171.68.28.190 Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 Final state in “rtr-b” 77 PIM SM Registering Receiver Joins Group First (171.68.37.121, 224.1.1.1) Mcast Packets 171.68.28.139 RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic Shared Tree (*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Serial0, Forward/Sparse, 00:09:21/00:02:38 Serial1, Forward/Sparse, 00:03:14/00:02:46 (171.68.37.121, 224.1.1.1, 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Serial0, Forward/Sparse, 00:00:49/00:02:11 Serial1, Forward/Sparse, 00:00:49/00:02:11 Final state in the “RP” (with receivers on Shared Tree) 78 PIM SM Register Examples • Receivers Join Group First • Source Registers First 79 PIM SM Registering Source Registers First RP E0 S0 S0 rtr-a S1 rtr-b S3 S0 S1 rtr-c rtr-c>show ip mroute 224.1.1.1 Group 224.1.1.1 not found. State in “RP” before Registering (without receivers on Shared Tree) 80 PIM SM Registering Source Registers First RP E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c rtr-b>show ip mroute 224.1.1.1 Group 224.1.1.1 not found. State in “rtr-b” before any source registers (with receivers on Shared Tree) 81 PIM SM Registering Source Registers First RP E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c rtr-a>show ip mroute 224.1.1.1 Group 224.1.1.1 not found. State in “rtr-a” before any source registers (with receivers on Shared Tree) 82 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP 1 Source 171.68.37.121 1• E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c “Source” begins sending group G traffic. 83 PIM SM Registering Source Registers First 2 (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 S3 rtr-b S0 S1 rtr-c (*, 224.1.1.1), 00:00:03/00:02:56, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:00:03/00:02:56, flags: FPT Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering Outgoing interface list: Null “rtr-a” creates (S, G) state for source (After automatically creating a (*, G) entry) 1• “Source” begins sending group G traffic. 2• “rtr-a” encapsulates packets in Registers; unicasts to RP. 84 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets Source 171.68.37.121 E0 rtr-a Register Msgs 171.68.28.139 S0 S0 S1 RP 3 S3 rtr-b S0 S1 rtr-c (*, 224.1.1.1), 00:01:15/00:01:45, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Null (171.68.37.121, 224.1.1.1), 00:01:15/00:01:45, flags: P Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Null “RP” processes Register; creates (S, G) state (After automatically creating the (*, G) entry) 3• “rtr-c” (RP) has no receivers on Shared Tree; discards packet. 85 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets Register Msgs RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c Register-Stop 4 3• “rtr-c” (RP) has no receivers on Shared Tree; discards packet. 4• RP sends “Register-Stop” to “rtr-a”. 86 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 rtr-a S0 5 S0 S1 rtr-b S3 S0 S1 rtr-c 3• “rtr-c” (RP) has no receivers on Shared Tree; discards packet. 4• RP sends “Register-Stop” to “rtr-a”. 5• “rtr-a” stops encapsulating traffic in Register Messages; drops packets from Source. 87 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 S3 rtr-b S0 S1 rtr-c (*, 224.1.1.1), 00:01:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:01:28/00:01:32, flags: FPT Incoming interface: Ethernet0, RPF nbr 0.0.0.0 Outgoing interface list: Null State in “rtr-a” after Registering (without receivers on Shared Tree) 88 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c rtr-b>show ip mroute 224.1.1.1 Group 224.1.1.1 not found. State in “rtr-b” after “rtr-a” Registers (without receivers on Shared Tree) 89 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets 171.68.28.139 RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c (*, 224.1.1.1), 00:01:15/00:01:45, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Null (171.68.37.121, 224.1.1.1), 00:01:15/00:01:45, flags: P Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Null State in “RP” after “rtr-a” Registers (without receivers on Shared Tree) 90 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c (*, G) Join 6 Receivers begin joining the Shared Tree 6• RP (“rtr-c”) receives (*, G) Join from a receiver on Shared Tree. 91 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets Join 7 Source 171.68.37.121 E0 rtr-a S0 S0 S1 RP S3 rtr-b S0 S1 rtr-c (*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Serial1, Forward/Sparse, 00:00:14/00:02:46 (171.68.37.121/32, 224.1.1.1, 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Serial1, Forward/Sparse, 00:00:14/00:02:46 “RP” processes (*,G) Join (Adds Serial1 to Outgoing Interface Lists) 7• RP sends (S,G) Joins for all known Sources in Group. 92 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets Join 7 8 Join S0 Source 171.68.37.121 E0 S0 rtr-a S0 S1 RP S3 rtr-b S0 S1 rtr-c 171.68.28.190 (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial1, RPF nbr 171.68.28.140, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: Incoming interface: Serial0, RPF nbr 171.68.28.190 Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 “rtr-b” processes Join, creates (S, G) state (After automatically creating the (*, G) entry) 7• 8• RP sends (S,G) Joins for all known Sources in Group. “rtr-b” sends (S,G) Join toward Source to continue building SPT. 93 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets 9 Source 171.68.37.121 E0 rtr-a S0 RP S0 S1 rtr-b S3 S0 S1 rtr-c 10 (*, 224.1.1.1) Mcast Traffic (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial0, RPF nbr 171.68.28.191, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering Outgoing interface list: Serial0, Forward/Sparse, 00:04:28/00:01:32 “rtr-a” processes the (S, G) Join; adds Serial0 to OIL •9 RP begins receiving (S,G) traffic down SPT. 10 • RP forwards (S,G) traffic down Shared Tree to receivers. 94 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets RP Source 171.68.37.121 E0 S0 rtr-a S0 S1 rtr-b S3 S0 S1 rtr-c 171.68.28.190 (*, 224.1.1.1) Mcast Traffic (*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP Incoming interface: Serial1, RPF nbr 171.68.28.140, Outgoing interface list: Null (171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: T Incoming interface: Serial0, RPF nbr 171.68.28.190 Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 Final state in “rtr-b” after Receivers Join 95 PIM SM Registering Source Registers First (171.68.37.121, 224.1.1.1) Mcast Packets 171.68.28.139 RP Source 171.68.37.121 E0 rtr-a S0 S0 S1 rtr-b S3 S0 S1 rtr-c (*, 224.1.1.1) Mcast Traffic (*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S Incoming interface: Null, RPF nbr 0.0.0.0, Outgoing interface list: Serial1, Forward/Sparse, 00:03:14/00:02:46 (171.68.37.121/32, 224.1.1.1, 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr 171.68.28.139, Outgoing interface list: Serial1, Forward/Sparse, 00:00:49/00:02:11 Final state in “RP” after Receivers Join 96 PIM SM SPT-Switchover • SPT Thresholds may be set for any Group – Access Lists may be used to specify which Groups – Default Threshold = 0kbps (I.e. immediately join SPT) – Threshold = “infinity” means “never join SPT”. • Threshold triggers Join of Source Tree – Sends an (S,G) Join up SPT for next “S” in “G” packet received. • Pros – Reduces Network Latency • Cons – More (S,G) state must be stored in the routers. 97 PIM SM SPT-Switchover SPT-Switchover Mechanism Once each second – Compute new (*, G) traffic rate – If threshold exceeded, set “J” flag in (*, G) For each (Si , G) packet received: – If “J” flag set in (*, G) • Join SPT for (Si , G) • Mark (Si , G) entry with “J” flag • Clear “J” flag in (*,G) 98 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b Rcvr A (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.5.1, Outgoing interface list: Serial1, Forward/Sparse, 00:01:43/00:02:11 Serial2, Forward/Sparse, 00:00:32/00:02:28 State in “rtr-c” before switch 99 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC Incoming interface: Serial0, RPF nbr 10.1.4.8, Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-d” before switch 100 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.4.1, Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-a” before switch 101 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b Rcvr A (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-b” before switch 102 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 1 Group “G” rate > Threshold rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC J Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 1 2 2 Group “G” rate exceeds SPT Threshold at “rtr-b”; Set J Flag in (*, G) and wait for next (Si,G) packet. 103 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 3 rtr-b Rcvr A (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC J Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 3 4 4 (Si,G) packet arrives down Shared tree. Clear J Flag in the (*,G) & create (Si,G) state. 104 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 SPT Tree J Flag indicates (S, G) created by exceeding the SPT-threshold (171.68.37.121/32, 224.1.1.1), 00:13:28/00:02:53, flags: CJT Incoming interface: Ethernet0, RPF nbr 10.1.2.1 Outgoing interface list: Ethernet1, Forward/Sparse, 00:13:28/00:02:53 New State in “rtr-b” 105 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 Rcvr A 5 (Si,G) Join rtr-b (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B 5 Send (Si,G) Join towards Si . 106 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.4.1, Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 (171.68.37.121/32, 224.1.1.1), 00:13:28/00:02:53, flags: T Incoming interface: Serial1, RPF nbr 10.1.9.2 Outgoing interface list: Ethernet0, Forward/Sparse, 00:13:25/00:02:30 New state in “rtr-a” 107 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 S1 S0 7 (Si,G)RP-bit Prune S2 rtr-a S1 6 (Si,G) Join S0 10.1.4.2 To Source “Si” E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b Rcvr A (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B 6 “rtr-a” forwards (Si,G) Join toward Si. 7 SPT & RPT diverge, triggering (Si,G)RP-bit Prunes toward RP. 108 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 S1 8 (S ,G) Traffic i S0 10.1.4.2 To Source “Si” E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b Rcvr A (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B 8 (Si, G) traffic begins flowing down SPT tree. 109 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 To Source “Si” S1 S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 rtr-b (Si, G) Traffic Flow Shared (RPT) Tree Rcvr A SPT Tree Rcvr B (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.5.1, Outgoing interface list: Serial1, Forward/Sparse, 00:01:43/00:02:11 Serial2, Forward/Sparse, 00:00:32/00:02:28 (171.68.37.121/32, 224.1.1.1), 00:13:28/00:02:53, flags: R Incoming interface: Serial0, RPF nbr 10.1.5.1 Outgoing interface list: Serial2, Forward/Sparse, 00:00:32/00:02:28 State in “rtr-c” after receiving the (Si, G) RP-bit Prune 110 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 9 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 Rcvr A rtr-b (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B •9 Unnecessary (Si, G) traffic is pruned from the Shared tree. 111 PIM SM SPT-Switchover To RP (10.1.5.1) rtr-c 10.1.4.1 rtr-a S1 S0 S2 10 S1 To Source “Si” S0 10.1.4.2 E0 10.1.2.1 S0 10.1.2.2 rtr-d E0 E1 E0 Rcvr A rtr-b (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree Rcvr B •9 Unnecessary (Si, G) traffic is pruned from the Shared tree. 10 • (Si, G) traffic still flows via other branches of the Shared tree. 112 PIM SM SPT-Switchover Shared Tree Switchback Mechanism • Once each second when the (S,G) state is older than 1 minute – If “J” flag set in (Si , G) entry • Compute new (Si , G) traffic rate • If rate < SPT-threshold – Rejoin (*, G) Tree for (Si , G) traffic – Send (Si , G) prune up SPT toward Si – Delete (Si , G) entry 113 PIM-SM Protocol Mechanics • • • • • • PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning 114 PIM SM Pruning • IGMP group times out / last host sends Leave • Interface removed from all (*,G) and (S,G) entries – IF all interfaces in “oilist” for (*,G) are pruned; THEN send Prune up shared tree toward RP – Any (S, G) state allowed to time-out • Each router along path “prunes” interface – IF all interfaces in “oilist” for (*,G) are pruned; THEN send Prune up shared tree toward RP – Any (S, G) state allowed to time-out 115 PIM SM Pruning Shared Tree Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-b” before Pruning 116 PIM SM Pruning Shared Tree Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.4.1, Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-a” before Pruning 117 PIM SM Pruning Shared Tree Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 10.1.2.1 10.1.2.2 Rcvr A E1 X 1 IGMP Leave E0 E0 3 (*,G) Prune rtr-b 2 1 “rtr-b” is a Leaf router. Last host “Rcvr A”, leaves group G. 2 “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”. 3 “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP. 118 PIM SM Pruning Shared Tree Case X To RP (10.1.5.1) (*,G) Prune (Si, G) Traffic Flow Shared Tree SPT Tree 5 6 S1 rtr-a S0 10.1.4.2 10.1.2.2 X E0 E1 4 E0 10.1.2.1 4 rtr-b “rtr-a” receives Prune; removes E0 from (*,G) “oilist”. (After the 3 second Multi-access Network Prune delay.) 5 “rtr-a” (*,G) “oilist” now empty; send (*,G) Prune toward RP. 6 Pruning continues back toward RP. 119 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:01:43/00:02:59, RP 10.1.5.1, flags: SC Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 (171.68.37.121/32, 224.1.1.1), 00:01:05/00:01:55, flags: CJT Incoming interface: Ethernet0, RPF nbr 10.1.2.1 Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:05/00:02:55 State in “rtr-b” before Pruning 120 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b Rcvr A (*, 224.1.1.1), 00:01:43/00:02:59, RP 10.1.5.1, flags: S Incoming interface: Serial0, RPF nbr 10.1.4.1, Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 (171.68.37.121/32, 224.1.1.1), 00:01:05/00:01:55, flags: T Incoming interface: Serial1, RPF nbr 10.1.9.2 Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:05/00:02:55 State in “rtr-a” before Pruning 121 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 E1 X Rcvr A E0 10.1.2.1 10.1.2.2 1 IGMP Leave To Source “Si” E0 3 (*,G) Prune rtr-b 2 1 “rtr-b” is a Leaf router. Last host “Rcvr A”, leaves group G. 2 “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”. 3 “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP. 122 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree 10.1.4.2 10.1.2.1 10.1.2.2 X E1 Rcvr A E0 E0 X SPT Tree To Source “Si” 4 Periodic (S, G) Join rtr-b 1 “rtr-b” is a Leaf router. Last host “Rcvr A”, leaves group G. 2 “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”. 3 “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP. 4 “rtr-b” stops sending periodic (S, G) joins. 123 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) (*,G) Prune (Si, G) Traffic Flow Shared Tree SPT Tree 6 E1 5 rtr-a S0 10.1.4.2 10.1.2.2 X E0 To Source “Si” E0 10.1.2.1 5 rtr-b “rtr-a” receives Prune; removes E0 from (*,G) “oilist”. (After the 3 second Multiaccess Network Prune delay.) 6 “rtr-a” (*,G) “oilist” now empty; sends (*,G) Prune toward RP. 124 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b (*, 224.1.1.1), 00:02:32/00:02:59, RP 10.1.5.1, flags: SP Incoming interface: Ethernet0, RPF nbr 10.1.2.1, Outgoing interface list: (171.68.37.121/32, 224.1.1.1), 00:01:56/00:00:53, flags: PT Incoming interface: Ethernet0, RPF nbr 10.1.2.1 Outgoing interface list: State in “rtr-b” after Pruning 125 PIM SM Pruning Source (SPT) Case S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b (*, 224.1.1.1), 00:02:32/00:02:59, RP 10.1.5.1, flags: SP Incoming interface: Serial0, RPF nbr 10.1.4.1, Outgoing interface list: (171.68.37.121/32, 224.1.1.1), 00:01:56/00:00:53, flags: PT Incoming interface: Serial1, RPF nbr 10.1.9.2 Outgoing interface list: State in “rtr-a” after Pruning 126 PIM SM Pruning Source (SPT) Case (Si,G) Data 7 S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” 8 (Si,G) Prune E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b 7 Another (Si,G) data packet arrives via Serial1. 8 ‘rtr-a’ responds by sending an (Si,G) Prune toward source. 127 PIM SM Pruning Source (SPT) Case 9 S1 To RP (10.1.5.1) rtr-a S0 (Si, G) Traffic Flow Shared Tree SPT Tree 10.1.4.2 To Source “Si” E0 10.1.2.1 10.1.2.2 E1 E0 rtr-b 7 Another (Si,G) data packet arrives via Serial1. 8 ‘rtr-a’ responds by sending an (Si,G) Prune toward source. 9 (Si,G) traffic ceases flowing down SPT. 128 Multicast Protocols • Multicast Addressing Basics • IGMP • PIM • MSDP • MBGP 129 MSDP—Multicast Source Discovery Protocol • • • • • • MSDP concepts MSDP design points MSDP example Cisco MSDP implementation MSDP configuration MSDP application—Anycast RP 130 MSDP Concept • Simple but elegant – Utilize inter-domain source trees – Reduces problem to locating active sources – RP or receiver last-hop can join inter-domain source tree 131 MSDP Concepts • Works with PIM-SM only – RP’s knows about all sources in a domain • Sources cause a “PIM Register” to the RP • Can tell RP’s in other domains of its sources – Via MSDP SA (Source Active) messages – RP’s know about receivers in a domain • Receivers cause a “(*, G) Join” to the RP • RP can join the source tree in the peer domain – Via normal PIM (S, G) joins 132 MSDP Overview MSDP Example Domain E MSDP Peers Source Active Messages RP SA r SA Domain C Join (*, 224.2.2.2) RP SA Domain B SA SA RP SA SA SA Message 192.1.1.1, 224.2.2.2 RP RP SA Message 192.1.1.1, 224.2.2.2 Domain D s Domain A Register 192.1.1.1, 224.2.2.2 133 MSDP Overview MSDP Example Domain E MSDP Peers RP r Domain C RP Domain B RP RP Domain D RP s Domain A 134 MSDP Overview MSDP Example Domain E MSDP Peers RP Multicast Traffic r Domain C RP Domain B RP RP Domain D RP s Domain A 135 MSDP Overview MSDP Example Domain E MSDP Peers RP Multicast Traffic r Domain C RP Domain B RP RP Domain D RP s Domain A 136 MSDP Overview MSDP Example Domain E MSDP Peers RP Multicast Traffic r Domain C RP Domain B RP RP Domain D RP s Domain A 137 MSDP Design Points • MSDP peers talk via TCP connections • Source Active (SA) messages – Peer-RPF forwarded to prevent loops • RPF check on AS-PATH back to the peer RP – other rules from draft apply • If successful, flood SA message to other peers • Stub sites can be configured to accept all SA messages (similar to static default for routing) – Since they have only one exit (e.g., default peer) – MSDP speaker may cache SA messages • Reduces join latency 138 MSDP Peers • MSDP establishes a neighbor relationship between MSDP peers – Peers connect using TCP port 639 – Peers send keepalives every 60 secs (fixed) – Peer connection reset after 75 seconds if no MSDP packets or keepalives are received • MSDP peers must run BGP! – May be an MBGP peer, a BGP peer or both – Exception: BGP is unnecessary when peering with only a single MSDP peer. 139 MSDP SA Messages • MSDP SA Message Contents – One or more messages (in TLV format) • • • • Keepalives Source Active (SA) Messages Source Active Request (SA-Req) Messages Source Active Response (SA-Resp) Message – Source Active (SA) Messages • Used to advertise active Sources in a domain • Can also carry initial multicast packet from source – Hack for Bursty Sources (ala SDR) • SA Message Contents: – IP Address of Originator (usually an RP) – Number of (S, G)’s pairs being advertised – List of active (S, G)’s in the domain – Encapsulated Multicast packet 140 MSDP SA Messages • MSDP Message Contents (cont.) – SA Request (SA-Req) Messages • Used to request a list of active sources for a group – Sent to an MSDP SA Cache Server – Reduces Join Latency to active sources • SA Request Messages contain: – Requested Group Address – SA Response (SA-Resp) Messages • Sent in response to an SA Request message • SA Response Messages contain: – IP Address of Originator (usually an RP) – Number of (S, G)’s pairs being advertised – List of active (S, G)’s in the domain – Keepalive messages • Used to keep MSDP peer connection up 141 Receiving SA Messages • Skip RPF Check and process SA if: – Sending MSDP peer = only MSDP peer (i.e. the ‘default-peer’ or only ‘msdp-peer’ configured.) – Sending MSDP peer = Mesh-Group peer • RPF Check the received SA message. – Lookup best BGP path to RP in SA message • Search MBGP RIB first then BGP RIB – If path to RP not found, RPF Check Fails; ignore SA message – Sending MSDP Peer = BGP peer? • Yes: Is best path to RP via this BGP peer? – If yes, RPF Check Succeeds; process SA message • No: Is next AS in best path to RP = AS of MSDP peer? – If yes, RPF Check Succeeds; process SA message 142 Receiving SA Messages • Detailed RPF Check rules – Case 1: Sending MSDP Peer = iBGP peer • Is best path to RP via this BGP peer? – Translation: Is the address of the iBGP peer that advertised the best path to the RP = address of the sending MSDP peer? – Case 2: Sending MSDP Peer = eBGP peer • Is best path to RP via this BGP peer? – Translation: Is the AS of eBGP peer = next AS in the best path to the RP? – Case 3: Sending MSDP Peer != BGP peer • Is the next AS in best path to RP = AS of the sending MSDP peer? – Translation: None needed. 143 RPF Check Example RPF rule when MSDP == internal (m)BGP peer BGP Table router B Network *> 172.16.0.2/32 B 4.2 MSDP Peers router B 4.1 2.1 RP A Path i Who is the iBGP peer adverting this route, in our example 172.16.3.1 AS1 3.1 Next Hop 172.16.3.1 MSDP Peer 172.16.3.1 172.16.4.1 AS 1 1 State UP UP C Is the MSDP == BGP peer Source ip pim rp-address 172.16.0.2 RPF Success! MSDP SA MSDP/iMBGP mesh-peering 144 RPF Check Example RPF rule when MSDP == internal (m)BGP peer BGP Table router B Network *> 172.16.0.2/32 B 4.2 MSDP Peers router B 4.1 2.1 RP A Path i Who is the iBGP peer adverting this route, In our example 172.16.3.1 AS1 3.1 Next Hop 172.16.3.1 MSDP Peer 172.16.3.1 172.16.4.1 AS 1 1 State UP UP C Is the MSDP == BGP peer Source ip pim rp-address 172.16.0.2 RPF Failure! MSDP SA MSDP/iMBGP mesh-peering 145 RPF Check Example MSDP Peers router B RPF rule when MSDP == external (m)BGP peer MSDP Peer 172.16.3.1 172.16.4.1 AS 2 B State UP UP BGP Neighbours router B 4.2 3.1 Neighbor 172.16.3.1 172.16.4.1 AS 1 3 Up/Down 06:06:08 06:09:06 4.1 2.1 RP AS 1 3 BGP Table A C AS3 AS1 Network *> 172.16.0.2/32 172.16.0.2/32 Next Hop 172.16.3.1 172.16.4.1 Path 1 i 3 1 i Is the MSDP AS == Last AS in RP route ip pim rp-address 172.16.0.2 Source RPF Success! MSDP SA MSDP/eMBGP mesh-peering Who is the BGP peer adverting this route 146 RPF Check Example MSDP Peers router B RPF rule when MSDP == external (m)BGP peer MSDP Peer 172.16.3.1 172.16.4.1 AS 2 B State UP UP BGP Neighbours router B 4.2 3.1 Neighbor 172.16.3.1 172.16.4.1 AS 1 3 Up/Down 06:06:08 06:09:06 4.1 2.1 RP AS 1 3 BGP Table A C AS3 AS1 Network *> 172.16.0.2/32 172.16.0.2/32 Next Hop 172.16.3.1 172.16.4.1 Path 1 i 3 1 i Is the MSDP AS == Last AS in RP route ip pim rp-address 172.16.0.2 Source RPF Failure! MSDP SA MSDP/eMBGP mesh-peering Who is the BGP peer adverting this route 147 RPF Check Example RPF rule when MSDP != (m)BGP peer MSDP Peer 172.16.3.1 172.16.4.1 AS 2 B MSDP Peers router B AS 1 3 State UP UP 4.2 BGP Table router B 3.1 Network *> 172.16.0.2/32 172.16.0.2/32 *> 172.16.4.0/24 *> 172.16.3.0/24 4.1 2.1 RP A Next Hop 172.16.3.1 172.16.4.1 172.16.4.1 172.16.3.1 Path 1 i 3 1 I 3 i 1 i C AS3 AS1 Is the MSDP AS == Last AS in RP route ip pim rp-address 172.16.0.2 Source RPF Success! MSDP SA MSDP/eMBGP mesh-peering 148 RPF Check Example RPF rule when MSDP != (m)BGP peer MSDP Peer 172.16.3.1 172.16.4.1 AS 2 B MSDP Peers router B AS 1 3 State UP UP 4.2 BGP Table router B 3.1 Network *> 172.16.0.2/32 172.16.0.2/32 *> 172.16.4.0/24 *> 172.16.3.0/24 4.1 2.1 RP A Next Hop 172.16.3.1 172.16.4.1 172.16.4.1 172.16.3.1 Path 1 i 3 1 I 3 i 1 i C AS3 AS1 Is the MSDP AS == Last AS in RP route ip pim rp-address 172.16.0.2 Source RPF Failure! MSDP SA MSDP/eMBGP mesh-peering 149 Cisco MSDP Implementation • Cisco implementation current with ID: – draft-ietf-msdp-spec-02.txt • Multiple peer support – Peer with BGP, MBGP, or static peers • SA caching (off by default) • Sending and receiving SA-requests • Sending and receiving SA-responses 150 Cisco MSDP Implementation • SA input and output filtering • SA-request input filtering • Default peer support – So a tail site can MSDP with a backbone provider without requiring the two to BGP peer • Triggered join support when creating an (S,G) learned by MSDP • Mesh groups – Reduces RPF-flooding of SA messages between fully meshed MSDP peers 151 MSDP Configuration • Configure peers ip msdp peer <ip-address> [connect-source <i/f>] • Configure default peer ip msdp default-peer <ip-address> [prefix-list acl] • SA caching ip msdp cache-sa-state [list <acl>] • Mesh groups ip msdp mesh-group <name> <ip-address> 152 MSDP Configuration • Filtering – Can filter SA in/out, groups, with acls or route-maps • TTL Scoping ip msdp ttl-threshold <ip-address> <ttl> • For more configuration commands see: ftp://ftpeng.cisco.com/ipmulticast/msdp-commands 153 ISP Requirements to Deploy Now • Want an explicit join protocol for efficiency PIM-SM (forwarding) • Use existing (unicast) operation model MBGP (routing) • Need interdomain source discovery MSDP (source discovery) 154 MSDP Application—Anycast RP • draft-ietf-mboned-anycast-rp-05.txt • Within a domain, deploy more than one RP for the same group range • Give each RP the same IP address assignment • Sources and receivers use closest RP • May be used intra-domain (enterprise) to provide redundancy and RP load sharing 155 MSDP Application—Anycast RP • Sources from one RP are known to other RPs using MSDP • When an RP goes down, sources and receivers are taken to new RP via unicast routing – Fast convergence 156 Anycast RP—Convergence Src Rec RP1 RP2 MSDP Rec X 10.0.0.1 10.0.0.1 Rec Rec Src 157 Anycast RP—Convergence Src Rec RP1 X RP2 Rec 10.0.0.1 10.0.0.1 Rec Rec Src 158 Multicast Protocols • Multicast Addressing Basics • IGMP • PIM • MSDP • MBGP 159 MBGP—Multiprotocol BGP • MBGP overview • MBGP capability negotiation • MBGP NLRI exchange 160 MBGP Overview • MBGP: Multiprotocol BGP (aka multicast BGP in multicast networks) – Defined in RFC 2283 (extensions to BGP) – Can carry different types of routes • Unicast • Multicast – Both routes carried in same BGP session – Does not propagate multicast state info – Same path selection and validation rules • AS-Path, LocalPref, MED, … 161 MBGP Overview • New multiprotocol attributes – MP_REACH_NLRI – MP_UNREACH_NLRI • MP_REACH_NLRI and MP_UNREACH_NLRI – Address Family Information (AFI) = 1 (IPv4) • Sub-AFI = 1 (NLRI is used for unicast) • Sub-AFI = 2 (NLRI is used for multicast RPF check) • Sub-AFI = 3 (NLRI is used for both unicast and multicast RPF check) 162 MBGP Overview • Separate BGP tables maintained – Unicast Routing Information Base (RIB) – Multicast Routing Information Base (MRIB) • Unicast RIB (U-RIB) – Contains unicast prefixes for unicast forwarding – Populated with BGP unicast NLRI • AFI = 1, Sub-AFI = 1 or 3 • Multicast RIB (M-RIB) – Contains unicast prefixes for RPF checking – Populated with BGP multicast NLRI • AFI = 1, Sub-AFI = 2 or 3 163 MBGP Overview • MBGP allows different unicast and multicast topologies and different policies – Same IP address may have different signification • Unicast routing information • Multicast RPF information – For same IPv4 address two different NLRI with different nexthops – Can use existing or new BGP peering topology for multicast 164 MBGP NLRI Exchange • BGP/MBGP configuration allows you to: – Define which NLRI type are exchanged (unicast, multicast, both) – Set NLRI type through route-maps (redistribution) – Define policies through standard BGP attributes (for unicast and/or multicast NLRI) • No redistribution allowed between MBGP and BGP tables – NLRI type can be set with set nlri route-map command 165 MBGP NLRI Exchange • MRIB is populated by: – Receiving AFI/SAFI 1/2 MP_REACH_NLRI from neighbors – Configured/stored locally by: network <foo> <foo-mask> [nlri multicast unicast] redistribute <unicast> route-map <map> aggregate-address <foo> <foo-mask> [nlri multicast unicast] neighbor <foo> default-originate [nlri multicast unicast] • Note: Syntax changes in 12.07T/12.1 forward 166 MBGP — NLRI Information Congruent Topologies BGP Session for Unicast and Multicast NLRI AS 123 .1 192.168.10.0/24 Receiver 192.168.100.0/24 AS 321 .2 router bgp 321 neighbor 192.168.100.1 remote-as 123 nlri unicast multicast network 192.192.25.0 255.255.255.0 nlri unicast multicast no auto-summary 192.192.25.0/24 Sender 167 MBGP — NLRI Information Congruent Topologies BGP Session for Unicast and Multicast NLRI AS 123 .1 192.168.100.0/24 AS 321 .2 Unicast Information 192.168.10.0/24 NLRI: 192.192.25/24 AS_PATH: 321 MED: Next-Hop: 192.168.100.2 ... 192.192.25.0/24 Multicast Information Receiver MP_REACH_NLRI: 192.192.25/24 AFI: 1, Sub-AFI: 2 (multicast) AS_PATH: 321 MED: Next-Hop: 192.168.100.2 ... Sender Routing Update 168 MBGP — NLRI Information Incongruent Topologies AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 Multicast Traffic .2 .2 192.168.200.0/24 192.192.25.0/24 router bgp 321 . . . network 192.192.25.0 nlri unicast multicast neighbor 192.168.100.1 remote-as 123 nlri unicast neighbor 192.168.200.1 remote-as 123 nlri multicast Sender 169 MBGP — NLRI Information Incongruent Topologies AS 123 AS 321 .1 Unicast Traffic 192.168.100.0/24 Multicast Traffic .1 .2 .2 192.168.200.0/24 192.192.25.0/24 Sender Unicast Information NLRI: 192.192.25/24 AS_PATH: 321 MED: Next-Hop: 192.168.100.2 Routing Update 170 MBGP — NLRI Information Incongruent Topologies AS 123 AS 321 .1 Unicast Traffic 192.168.100.0/24 Multicast Traffic .1 .2 .2 192.168.200.0/24 192.192.25.0/24 Multicast Information Sender MP_REACH_NLRI: 192.192.25/24 AFI: 1, Sub-AFI: 2 AS_PATH: 321 MED: Next-Hop: 192.168.200.2 Routing Update 171 MBGP — NLRI Information Incongruent Topologies AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 Multicast Traffic .2 .2 192.168.200.0/24 192.192.25.0/24 Unicast RIB Network Next-Hop Path 192.192.25.0/24 192.168.100.2 321 Sender Multicast RIB Network Next-Hop Path 192.192.25.0/24 192.168.200.2 321 172 Populating the MRIB • Just to restate network <foo> <foo-mask> [nlri multicast unicast] redistribute <unicast> route-map <map> aggregate-address <foo> <foo-mask> [nlri multicast unicast] neighbor <foo> default-originate [nlri multicast unicast] 173 MBGP—Summary • Solves part of inter-domain problem – Can exchange multicast routing information – Uses standard BGP configuration knobs – Permits separate unicast and multicast topologies if desired • Still must use PIM to: – Build distribution trees – Actually forward multicast traffic – PIM-SM recommended • But there’s still a problem using PIM-SM here… (more on that later) 174 Midterm Exam Which protocol... – ...is used to propagate forwarding state? PIM-SM - Protocol Independent Multicast-Sparse Mode – ...maintains routing information for interdomain RPF checking? MBGP - Multiprotocol Border Gateway Protocol – …exchanges active source information between RPs? MSDP - Multicast Source Discovery Protocol 175 Agenda • • • • • • What Is Multicast Multicast Protocols Multicast Configurations Troubleshooting Futures Useful Links 176 Putting it all together.. • PIM-SM, MBGP, MSDP • Public multicast peering (MIX) • Transit topology examples • Address allocation – GLOP draft-ietf-mboned-static-allocation-00.txt 177 ISP Requirements at the MIX • Current solution: MBGP + PIM-SM + MSDP – Environment • ISPs run iMBGP and PIM-SM (internally) • ISPs multicast peer at a public interconnect – Deployment • • • • Border routers run eMBGP The interfaces on interconnect run PIM-SM RPs’ MSDP peering is fully meshed All peers set a common distance for eMBGP 178 ISP Requirements at the MIX Peering Solution: MBGP + PIM-SM +MSDP ISP A Public Interconnect PIM-SM PIM-SM RP RP iMBGP ISP B iMBGP Redistribute old MBONE DVMRP routes into MBGP eMBGP iMBGP RP ISP C PIM-SM RP iMBGP AS 10888 179 Multicast Transit Design Objectives • PIM Border Constraints – – – – Confine registers within domain Confine local groups Confine RP announcements Control SA advertisements via MSDP • Border RPF check – RPF check against unicast routes to multicast sources • MSDP RPF check – RPF check toward RP in received SAs 180 Recommended MSDP SA Filter ftp://ftp-eng.cisco.com/ipmulticast/msdp-sa-filter.txt ! domain-local applications access-list 111 deny ip any host 224.0.2.2 ! access-list 111 deny ip any host 224.0.1.3 ! Rwhod access-list 111 deny ip any host 224.0.1.24 ! Microsoft-ds access-list 111 deny ip any host 224.0.1.22 ! SVRLOC access-list 111 deny ip any host 224.0.1.2 ! SGI-Dogfight access-list 111 deny ip any host 224.0.1.35 ! SVRLOC-DA access-list 111 deny ip any host 224.0.1.60 ! hp-device-disc ! auto-rp groups access-list 111 deny ip any host 224.0.1.39 access-list 111 deny ip any host 224.0.1.40 ! scoped groups access-list 111 deny ip any 239.0.0.0 0.255.255.255 ! loopback, private addresses (RFC 1918) access-list 111 deny ip 10.0.0.0 0.255.255.255 any access-list 111 deny ip 127.0.0.0 0.255.255.255 any access-list 111 deny ip 172.16.0.0 0.15.255.255 any access-list 111 deny ip 192.168.0.0 0.0.255.255 any access-list 111 permit ip any any 181 Multicast Boundary Filter Minimum Recommended !deny auto-rp groups access-list 1 deny 224.0.1.40 access-list 1 deny 224.0.1.39 !deny scoped groups access-list 1 deny 239.0.0.0 0.255.255.255 !permit the rest of 224/4 access-list 1 permit 224.0.0.0 15.255.255.255 182 Single-Homed, ISP RP, Non-MBGP PIM Border Constraints Transit AS109 Tail-site Customer pos0/0 1.1.1.2 pos0/0 1.1.1.1 RP int pos0/0 ip pim sparse-dense-mode 192.168.100.0/24 Receiver ip pim send-rp-announce Loopback0 scope 255 ip pim send-rp-discovery Loopback0 scope 255 int pos0/0 ip pim sparse-dense-mode 183 Single-Homed, ISP RP, Non-MBGP Checking PIM Border (RP mapping) Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP tail-gw#show ip pim rp mapping PIM Group-to-RP Mappings Receiver Group(s) 224.0.0.0/4 192.168.100.0/24 RP 3.3.3.7 (loopback.transit.net), v2v1 Info source: 1.1.1.2 (tail.transit.net), via Auto-RP Uptime: 21:57:41, expires: 00:02:08 184 Single-Homed, ISP RP, Non-MBGP Checking PIM Border (RP mapping) Transit AS109 Tail-site Customer pos0/0 1.1.1.1 Receiver pos0/0 1.1.1.2 RP Transit-tail#show ip pim rp mapping 192.168.100.0/24 PIM Group-to-RP Mappings This system is an RP (Auto-RP) This system is an RP-mapping agent Group(s) 224.0.0.0/4 RP 3.3.3.7 (loopback.transit.net), v2v1 Info source: 3.3.3.7 (loopback.transit.net), via Auto-RP Uptime: 22:08:47, expires: 00:02:14 185 Single-Homed, ISP RP, Non-MBGP Border RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP ip route 0.0.0.0 0.0.0.0 1.1.1.2 192.168.100.0/24 Receiver ip route 192.168.100.0 255.255.255.0 1.1.1.1 router bgp 109 ... network 192.168.100.0 nlri unicast multicast 186 Single-Homed, ISP RP, Non-MBGP MSDP RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP - no RP / no MSDP 192.168.100.0/24 Receiver - no downstream RP - no downstream MSDP peering 187 Single-Homed, Customer RP, Non-MBGP PIM Border Constraints Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP int pos0/0 ip pim sparse-mode ip pim bsr-border ip multicast boundary 1 192.168.100.0/24 Receiver ip msdp sa-filter out 1.1.1.2 111 ip msdp sa-filter in 1.1.1.2 111 Note: Access-list 111 = Recommended SA Filter 188 Single-Homed, Customer RP, Non-MBGP PIM Border Constraints Transit AS109 Tail-site Customer pos0/0 1.1.1.2 pos0/0 1.1.1.1 192.168.100.0/24 Receiver int ip ip ip RP pos0/0 pim sparse-mode pim bsr-border multicast boundary 1 ip msdp sa-filter out 1.1.1.1 111 ip msdp sa-filter in 1.1.1.1 111 Note: Access-list 111 = Recommended SA Filter 189 Single-Homed, Customer RP, Non-MBGP Border RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP ip route 0.0.0.0 0.0.0.0 1.1.1.2 192.168.100.0/24 Receiver ip route 192.168.100.0 255.255.255.0 1.1.1.1 router bgp 109 ... network 192.168.100.0 nlri unicast multicast 190 Single-Homed, Customer RP, Non-MBGP MSDP RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP ip msdp peer 1.1.1.1 connect-source pos0/0 192.168.100.0/24 Receiver ip msdp peer 1.1.1.2 connect-source pos0/0 191 Single-Homed, Customer RP, MBGP PIM Border Constraints Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP int pos0/0 ip pim sparse-mode ip pim bsr-border ip multicast boundary 1 192.168.100.0/24 Receiver ip msdp sa-filter out 1.1.1.2 111 ip msdp sa-filter in 1.1.1.2 111 Note: Access-list 111 = Recommended SA Filter 192 Single-Homed, Customer RP, MBGP PIM Border Constraints Transit AS109 Tail-site Customer pos0/0 1.1.1.2 pos0/0 1.1.1.1 192.168.100.0/24 Receiver int ip ip ip RP pos0/0 pim sparse-mode pim bsr-border multicast boundary 1 ip msdp sa-filter out 1.1.1.1 111 ip msdp sa-filter in 1.1.1.1 111 Note: Access-list 111 = Recommended SA Filter 193 Single-Homed, Customer RP, MBGP Border RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP router bgp 100 network 192.168.100.0 nlri unicast multicast neighbor 1.1.1.2 remote-as 109 nlri unicast multicast neighbor 1.1.1.2 update-source pos0/0 192.168.100.0/24 Receiver router bgp 109 neighbor 1.1.1.1 remote-as 100 nlri unicast multicast neighbor 1.1.1.1 update-source pos 0/0 194 Single-Homed, Customer RP, MBGP MSDP RPF Check Transit AS109 Tail-site Customer pos0/0 1.1.1.1 pos0/0 1.1.1.2 RP ip msdp peer 1.1.1.1 connect-source pos0/0 192.168.100.0/24 Receiver ip msdp peer 1.1.1.2 connect-source pos0/0 195 Which Mode—Sparse or Dense • Sparse mode – Must configure a Rendezvous Point (RP) • Statically (on every Router) • Using Auto-RP (Routers learn RP automatically) • Using BSR (Routers learn RP automatically) – Very efficient • Uses Explicit Join model • Traffic only flows to where it’s needed – Separate control and data planes • Router state only created along flow paths • Deterministic topological behavior – Scales better than dense mode • Works for both sparsely or densely populated networks 196 Which Mode—Sparse or Dense CONCLUSION Virtually all production networks should be configured to run in Sparse mode! 197 RP Placement • Q: “Where do I put the RP?” – A: “Generally speaking, it’s not critical” • SPT’s are normally used by default – RP is a place for source and receivers to meet – Traffic does not normally flow through the RP – RP is therefore not a bottleneck • Exception: SPT-Threshold = Infinity – Traffic stays on the shared tree – RP could become a bottleneck 198 Static RP’s • Hard-coded RP address – When used, must be configured on every router – All routers must have the same RP address – RP fail-over not possible • Exception: If Anycast RPs are used. • Command ip pim rp-address <address> [group-list <acl>] [override] – Optional group list specifies group range • Default: Range = 224.0.0.0/4 (Includes Auto-RP Groups!!!!) – Override keyword “overrides” Auto-RP information • Default: Auto-RP learned info takes precedence 199 Auto-RP Overview • All routers automatically learn RP address – No configuration necessary except on: • Candidate RPs • Mapping Agents • Makes use of Multicast to distribute info – Two specially IANA assigned Groups used • Cisco-Announce - 224.0.1.39 • Cisco-Discovery - 224.0.1.40 – Typically Dense mode is used forward these groups • Permits backup RP’s to be configured • Can be used with Admin-Scoping 200 Auto-RP Fundamentals • Candidate RPs – Configured via global config command ip pim send-rp-announce <intfc> scope <ttl> [group-list acl] – Multicast RP-Announcement messages • Sent to Cisco-Announce (224.0.1.39) group • Sent every rp-announce-interval (default: 60 sec) – RP-Announcements contain: • Group Range (default = 224.0.0.0/4) • Candidate’s RP address • Holdtime = 3 x <rp-announce-interval> 201 Auto-RP Fundamentals • Mapping agents – Configured via global config command ip pim send-rp-discovery scope <ttl> – Receive RP-Announcements • Select highest C-RP IP addr as RP for group range • Stored in Group-to-RP Mapping Cache with holdtimes – Multicast RP-Discovery messages • Sent to Cisco-Discovery (224.0.1.40) group • Sent every 60 seconds or when changes detected – RP-Discovery messages contain: • Contents of MA’s Group-to-RP Mapping Cache 202 Auto-RP Fundamentals • All Cisco routers – Join Cisco-Discovery (224.0.1.40) group • Automatic • No configuration necessary – Receive RP-Discovery messages • Stored in local Group-to-RP Mapping Cache • Information used to determine RP for group range 203 C-RP 1.1.1.1 C Announce Announce A Announce MA B Announce Announce MA D Announce Announce Auto-RP—From 10,000 Feet Announce C-RP 2.2.2.2 RP-Announcements multicast to the Cisco Announce (224.0.1.39) group Announce 204 Auto-RP—From 10,000 Feet MA A C-RP 1.1.1.1 C MA B D C-RP 2.2.2.2 RP-Discoveries multicast to the Cisco Discovery (224.0.1.40) group Discovery 205 BSR Overview • A single Bootstrap Router (BSR) is elected – Multiple Candidate BSR’s (C-BSR) can be configured • Provides backup in case currently elected BSR fails – C-RP’s send C-RP announcements to the BSR • C-RP announcements are sent via unicast • BSR stores ALL C-RP announcements in the “RP-set” – BSR periodically sends BSR messages to all routers • BSR Messages contain entire RP-set and IP address of BSR • Messages are flooded hop-by-hop throughout the network away from the BSR – All routers select the RP from the RP-set • All routers use the same selection algorithm; select same RP • BSR cannot be used with Admin-Scoping 206 BSR Fundamentals • Candidate RPs – Configured via global config command – ip pim rp-candidate <intfc> [group-list acl] – Unicast PIMv2 C-RP messages to BSR • Learns IP address of BSR from BSR messages • Sent every rp-announce-interval (default: 60 sec) – C-RP messages contain: • Group Range (default = 224.0.0.0/4) • Candidate’s RP address • Holdtime = 3 x <rp-announce-interval> 207 BSR Fundamentals • Bootstrap router (BSR) – Receive C-RP messages • Accepts and stores ALL C-RP messages • Stored in Group-to-RP Mapping Cache w/holdtimes – Originates BSR messages • Multicast to All-PIM-Routers (224.0.0.13) group – (Sent with a TTL = 1) • Sent out all interfaces. Propagate hop-by-hop • Sent every 60 seconds or when changes detected – BSR messages contain: • Contents of BSR’s Group-to-RP Mapping Cache • IP Address of active BSR 208 BSR Fundamentals • Candidate bootstrap router (C-BSR) – Configured via global config command ip pim bsr-candidate <intfc> <hash-length> [priority <pri>] – <intfc> » Determines IP address – <hash-length> » Sets RP selection hash mask length – <pri> » Sets the C-BSR priority (default = 0) – C-BSR with highest priority elected BSR – C-BSR IP address used as tie-breaker » (Highest IP address wins) – The active BSR may be preempted » New router w/higher BSR priority forces new election 209 BSR Fundamentals • All PIMv2 routers – Receive BSR messages • Stored in local Group-to-RP Mapping Cache • Information used to determine active BSR address – Selects RP using Hash algorithm • Selected from local Group-to-RP Mapping Cache • All routers select same RP using same algorithm • Permits RP-load balancing across group range 210 BSR—From 10,000 feet G BSR Msgs PIMv2 Sparse Mode BSR Msgs BSR F BSR Msgs A BSR Msgs D C-RP B C C-RP E BSR Msgs Flooded Hop-by-Hop 211 Auto-RP vs BSR • Auto-RP – Easy to configure – Supports Admin. Scoped Zones – Works in either PIMv1 or PIMv2 Cisco networks – Cisco proprietary • BSR – Easy to configure – Does not support Admin. Scoped Zones – Non-proprietary (PIMv2 networks only) 212 Anycast RP—Overview • Uses single statically defined RP address – Two or more routers have same RP address • RP address defined as a Loopback Interface. • Loopback address advertised as a Host route. – Senders & Receivers Join/Register with closest RP • Closest RP determined from the unicast routing table. • MSDP session(s) run between all RPs – Informs RPs of sources in other parts of network – RPs join SPT to active sources as necessary 213 Anycast RP—Convergence Src Rec RP1 RP2 MSDP Rec X 10.0.0.1 10.0.0.1 Rec Rec Src 214 Anycast RP—Convergence Src Rec RP1 X RP2 Rec 10.0.0.1 10.0.0.1 Rec Rec Src 215 PIM Configuration Steps • Enable Multicast Routing on every router • Configure every interface for PIM • Use Sparse-Dense Mode (to support Auto-RP) • Configure the RP (if using Sparse Mode) – Static RP addressing • RP address must be configured on every router – Using Auto-RP or BSR • Configure certain routers as Candidate RP(s) • All other routers automatically learn elected RP 216 Fool-Proof Multicast Configuration RP/Mapping Agent A B PIM Sparse Mode C RP/Mapping Agent D On every router: ip multicast-routing ip pim accept-RP Auto-RP On every interface: ip pim sparse-dense-mode On routers B and C: ip pim send-rp-announce loopback0 scope <ttl> ip pim send-rp-discovery loopback0 scope <ttl> 217 Classic Partial Multicast Cloud Mistake src T1 no ip pim dense-mode 56K ip pim dense-mode X We’ll just use the spare 56K line for the IP Multicast traffic and not the T1. RPF Failure!!!!! Network Engineer rcvr 218 Beginner’s Guide to IP Multicast JUST DO IT! Enable multicast on ALL interfaces in ALL routers in a given network. Failure to do so will require complex (and typically unnecessary) Multicast Traffic Engineering. 219 Agenda • • • • • • What Is Multicast Multicast Protocols Multicast Configurations Troubleshooting Futures Useful Links 220 Debugging and Troubleshooting • Monitoring Tools – sdr_monitor • Kamil Sarac and Kevin Almeroth - UCSB • http://www.cs.ucsb.edu/~ksarac/sdr-desc.html • Several sites on Abilene and vBNS – mlisten • http://www.cc.gatech.edu/computing/Telecomm/mbone/ – Abilene Core Node Router Proxy • http://hydra.uits.iu.edu/~abilene/proxy/ 221 Debugging and Troubleshooting • Multicast Looking Glass – Written by NLANR Engineering Services – Based on DIGEX Looking Glass – Queries Pittsburgh GigaPoP multicast border router for multicast information – http://www.ncne.nlanr.net/tools/mlg2.html 222 Debugging and Troubleshooting • Mtrace – No longer gives coherent stats – Won’t even traverse some routers – Basically useless • Multicast Debugging Handbook – draft-ietf-mboned-mdh-02.txt 223 Debugging and Troubleshooting (cont.) • Useful Cisco Commands show ip sdr [detail | address | <“session name”>] show ip mroute [summary | address | active] show ip rpf <ip address> show ip msdp [ summary | peer | count | sa-cache ] show ip mbgp [address | neighbor [ ip [adv]] 224 >show ip sdr SDR Cache - 67 entries 2/5 ASSIST J Kerr Repeat 3/4 LAN Systems - NT & SMS in Labs 4J KRVM Radio ABONE Discussion BGMP/MASC discussion CANARIE 5th Annual Advanced Networks Workshop(MPEG1) dslab-test ECG - Tunnel Test from UB GT test High Bandwidth Bird (HBB) iBAND3 Keynotes [MP3] at Stardust.com IMJ -- Channel 1 IMJ -- Channel 2 Korea Radio Broadcasting from SNU ... 225 >show ip sdr detail SDR Cache - 68 entries Session Name: BGMP/MASC discussion Description: BGMP/MASC discussion Group: 0.0.0.0, ttl: 0, Contiguous allocation: 1 Lifetime: from 12:30:00 EDT May 31 1999 until 19:50:00 EDT Jun 9 2000 Uptime: 12:17:02, Last Heard: 11:27:03 Announcement source: 128.9.160.100, destination: 224.2.127.254 Created by: pavlin 3136737610 3136737920 IN IP4 128.9.160.100 Phone number: Pavlin Radoslavov (USC/ISI) +1 310 822 1511 (ext. 8125) Email: Pavlin Radoslavov (USC/ISI) <pavlin@catarina.usc.edu> URL: http://netweb.usc.edu/bgmp/ Media: audio 17396 RTP/AVP 0 Media group: 224.2.134.153, ttl: 127 Attribute: ptime:40 Media: whiteboard 48173 udp wb Media group: 224.2.188.133, ttl: 127 Media: text 56770 udp nt Media group: 224.2.168.204, ttl: 127 226 >show ip sdr "NASA TV - Broadcast from NASA HQ" SDR Cache - 68 entries Session Name: NASA TV - Broadcast from NASA HQ Description: NASA TV Multicasting from NASA HQ Group: 0.0.0.0, ttl: 0, Contiguous allocation: 1 Lifetime: from 12:36:16 EDT Jun 11 1999 until 11:36:16 EST Dec 11 1999 Uptime: 1d17h, Last Heard: 18:21:10 Announcement source: 131.182.10.250, destination: 224.2.127.254 Created by: ellery 3132060082 3138107776 IN IP4 131.182.10.250 Phone number: +202 651 8512 Email: Ellery.Coleman@hq.nasa.gov URL: http://www.nasa.gov/ntv (Ellery D. Coleman) Media: audio 23748 RTP/AVP 0 Media group: 224.2.203.38, ttl: 127 Media: video 60068 RTP/AVP 31 Media group: 224.2.203.37, ttl: 127 Attribute: framerate:9 Attribute: quality:8 Attribute: grayed:0 227 >show ip bgp neighbors 192.88.115.113 BGP neighbor is 192.88.115.113, remote AS 145, external link Index 2, Offset 0, Mask 0x4 NEXT_HOP is always this router BGP version 4, remote router ID 204.147.128.139 BGP state = Established, table version = 2, up for 05:54:44 Last read 00:00:05, last send 00:00:07 Hold time 90, keepalive interval 30 seconds Neighbor NLRI negotiation: Configured for unicast and multicast routes Peer negotiated unicast and multicast routes Exchanging unicast and multicast routes ... Number of unicast/multicast prefixes received 0/6773 ... 228 >show ip mbgp neigh 192.88.115.113 adv Network Next Hop *> 128.182.0.0 0.0.0.0 *> 192.88.115.64/24 0.0.0.0 Metric LocPrf Weight Path 0 0 32768 i 32768 i 229 Agenda • • • • • • What Is Multicast Multicast Protocols Multicast Configurations Troubleshooting Futures Useful Links 230 Futures • PIM modes designed for specific scalings and applications – Bidir-PIM – Single Source Multicast (SSM) 231 Bidir PIM • Designed for many to many multicast groups • Builds a single (*,G) shared tree which all sources use • Relieves the router from having to keep state for each and every source 232 SSM • Designed to allow receivers to specify the single source which they wish to receive traffic from, such as when viewing a broadcast from a single provider • Requires IGMPv3 so that receivers can tell the first hop router which source they want • RP is not used • Requires an out of band method to discover the proper source (e.g. a web pag or email 233 Agenda • • • • • • What Is Multicast Multicast Protocols Multicast Configurations Troubleshooting Futures Useful Links 234 Useful Links • Multicast Documentation (FAQ and Tools): http://www.ncne.nlanr.net/faq/multicast.html • Cisco IOS Documentation • Cisco IP Multicast: ftp://ftp-eng.cisco.com/ipmulticast/ • Introduction to Multicast Routing: http://www.3com.com/nsc/501303.html • Send mail to ncne@nlanr.net 235