notes_v0.1

VLANs
Different broadcast domains
Switches create a separate CAM table per VLAN
VLANs (0-4095) 0 and 4095 are reserved
Normal VLANs 1-1005
Extended 1006-4094
Trunks: ISL or 802.1q (dot1q)
ISL is Cisco proprietary
- 30 byte encapsulation (26 byte header, 4 byte trailer)
802.1q is 4 byte tag inside existing header
Switchport modes
Access, trunk, dynamic (desirable, auto), tunnel. Dynamic modes assign access or trunk automatically via DTP.
Goal is to negotiate in order ISL, 802.1q, access port.
#switchport mode dynamic (auto | desirable)
#switchport nonegotiate
Dynamic auto will only listen for negotiation, it will not send. So 2 x autos will not negotiate.
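A minimal sketch of the DTP outcome for two link partners, assuming only the four modes named above and ignoring nonegotiate. The function name and mode labels are illustrative, not IOS output.

```python
# Modes that actively send DTP frames; "auto" only listens.
SENDS_DTP = {"trunk", "desirable"}
VALID = SENDS_DTP | {"auto", "access"}

def dtp_result(a: str, b: str) -> str:
    """Return 'trunk', 'access' or 'mismatch' for the two configured modes."""
    if a not in VALID or b not in VALID:
        raise ValueError("unknown switchport mode")
    modes = {a, b}
    if "access" in modes:
        # A static access port never trunks; against a static trunk the link is misconfigured.
        return "mismatch" if "trunk" in modes else "access"
    # Without an access end, a trunk forms only if at least one side sends DTP.
    return "trunk" if modes & SENDS_DTP else "access"

print(dtp_result("auto", "auto"))
print(dtp_result("desirable", "auto"))
```

Two autos fall back to access because neither side starts the negotiation.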
VTP solves the administration problem only. Cisco proprietary.
VTP does not define the broadcast domains. It simply resolves the administration problem. EG switches
in different VTP domains that share the same VLAN numbers are still a part of the same broadcast
domain.
Default mode is server
#vtp mode server
#vtp mode client
#vtp mode transparent
#vtp password (pwd)
VTP Pruning
Switches advertise which VLANs they need so their trunk links do not carry unneeded VLANs.
Comparable to #switchport trunk allowed vlan x but easier to administer. This does not operate in
transparent mode. You can enable this on a single VTP server and it will enable throughout the domain.
Etherchannel: Aggregate bandwidth from multiple physical interfaces into a single logical interface.
Consists of two parts
- Port channel interface (logical)
- Member interfaces (physical)
The logical channel can be any type of interface ie L2 trunk, access, l3 routed, etc.
#channel-group x mode y
Modes (y)
- On – no negotiate
- Desirable/auto – PAgP (Cisco proprietary)
- Active/passive – LACP (IEEE standard)
With ON mode, both ends need to be ON; there is no negotiation. With PAgP [desirable/desirable,
desirable/auto] and LACP [active/active, active/passive], links will form as long as one end is
sending negotiation parameters.
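The mode rules above can be sketched as a small compatibility check. This is an illustration only; the function name is made up and it models nothing beyond the five mode keywords.

```python
PAGP = {"desirable", "auto"}
LACP = {"active", "passive"}

def channel_forms(a: str, b: str) -> bool:
    """True if a channel-group with mode a on one end and b on the other should bundle."""
    pair = {a, b}
    if not pair <= PAGP | LACP | {"on"}:
        raise ValueError("unknown channel-group mode")
    if "on" in pair:
        return pair == {"on"}          # ON only works against ON
    if pair <= PAGP:
        return "desirable" in pair     # someone must send PAgP
    if pair <= LACP:
        return "active" in pair        # someone must send LACP
    return False                       # PAgP and LACP do not interoperate

print(channel_forms("desirable", "auto"))
print(channel_forms("passive", "passive"))
```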
Etherchannel load balancing based on source/destination MAC, source/dest IP or combination. This can
be modified with #port-channel load-balance command.
Spanning Tree Protocol 802.1d
STP Variations
- 802.1d (CST or common spanning tree)
- PVST/PVST+ (Cisco per vlan)
- 802.1w (RSTP)
- 802.1s MST
802.1d
- Elect root bridge
- Elect root port per bridge
- Elect designated ports
Root bridge election
Lowest bridge ID
- Bridge priority (lower is better) 0-61440 in increments of 4096
- System ID Extension
- MAC Address
Once elected, BPDU’s flow down from the root of the tree to the leaves.
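A rough sketch of the root election, assuming the bridge ID compares as (priority + system ID extension, MAC address) and that the system ID extension is the VLAN number. Switch names and MAC addresses are made up.

```python
def bridge_id(priority: int, vlan: int, mac: str) -> tuple:
    """Comparable bridge ID: lower tuple wins."""
    mac_value = int(mac.replace(".", "").replace(":", ""), 16)
    return (priority + vlan, mac_value)

def elect_root(bridges: dict, vlan: int) -> str:
    """bridges maps a name to (priority, mac); returns the name with the lowest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(bridges[name][0], vlan, bridges[name][1]))

switches = {
    "SW1": (32768, "0011.2233.4455"),
    "SW2": (32768, "0011.2233.0001"),
    "SW3": (4096, "00ff.ffff.ffff"),
}
print(elect_root(switches, vlan=10))
```

SW3 wins on priority despite its high MAC; with equal priorities the lowest MAC decides.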
Root port election
RP is upstream towards the root bridge
RP is elected based on lowest root path cost, which is the cumulative cost of all links to get to the root.
Link cost is based on inverse bandwidth but, unlike OSPF, is not linear.
Tie breaker
1. Lowest bridge ID
2. Lowest port ID
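A small sketch of root port selection, assuming the classic 802.1d short-mode costs (10 Mbps = 100, 100 Mbps = 19, 1 Gbps = 4, 10 Gbps = 2). The candidate tuple layout and port names are illustrative.

```python
# Link speed in Mbps -> 802.1d short-mode port cost (not linear in bandwidth).
STP_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}

def root_path_cost(speeds_mbps) -> int:
    """Cumulative cost of every link on the way to the root."""
    return sum(STP_COST[s] for s in speeds_mbps)

def pick_root_port(candidates) -> str:
    """candidates: (port, link speeds to root, sender bridge ID, sender port ID).
    Lowest root path cost wins, then lowest sender bridge ID, then lowest port ID."""
    return min(candidates, key=lambda c: (root_path_cost(c[1]), c[2], c[3]))[0]

ports = [
    ("Gi0/1", [1000, 1000], (32768, 5), 1),   # two gigabit hops, cost 8
    ("Fa0/1", [100], (32768, 2), 1),          # one FastEthernet hop, cost 19
]
print(pick_root_port(ports))
```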
Designated ports are downstream away from the root bridge. Typically all ports on the root bridge are
DP. Blocking ports still receive BPDU’s.
CST convergence based on timers set by RB.
- Hello timer (2 seconds), forward delay (15), max age timer (20)
Forward delay applies to the listening and learning states.
Port states: blocking, listening, learning, forwarding, disabled.
Topology change notification BPDU
- Used to notify the root bridge of changes. Flows up to the RB who replies with ACK.
PVST/PVST+
- Per vlan basis. Same algorithm.
- PVST only supports ISL
- PVST+ supports 802.1q. Default mode on switches.
STP Enhancements
PortFast: no forwarding delay. Does not generate TCN.
UplinkFast: Direct root port failure should reconverge immediately if alternate port is available.
Alternate ports are redundant blocking ports towards the root.
Backbone fast: Indirect failures should start recalculating immediately. Once an inferior BPDU is received
from an upstream failure, the switch can skip the max-age time and recalculate.
Other STP features
- BPDU Filter: filters BPDUs in and out. Passive-interface type of setting.
- BPDU Guard: Shuts down the port if it receives a BPDU, e.g. to prevent someone plugging a
switch that wants to run spanning tree into an access port.
- Root Guard: Listens for BPDUs. Blocks the port (root-inconsistent state) only if a superior BPDU
comes in. Protects the upstream root bridge.
- Loop guard and UDLD: Both used to prevent unidirectional links. Loop guard uses BPDU’s to
determine this whereas UDLD uses a lightweight L2 packet.
802.1w
- Rapid convergence based on sync process. Allows for faster initial convergence. EG Allows the
switch to elect root port while flooding BPDU’s down other links.
- Simplifies port states (discarding, learning, forwarding)
Discarding (blocking, listening, disabled)
Learning (Building MAC table but not forwarding)
Forwarding (Forwarding!)
- All bridges generate BPDU’s and send out at hello time.
In RSTP, if 3 hello’s are missed, neighbor is declared down.
Fast initial convergence and reconvergence. Also backwards compatible with 802.1d.
802.1s MST
PVST is one instance per vlan. MST is user defined.
- Uses 802.1w for rapid re/convergence
- Highly scalable.
IP Routing overview
Route selection
1. Longest match based on what is in the routing table, regardless of what protocol it comes from.
2. Administrative distance
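A hedged sketch of how the two rules interact: administrative distance decides which source installs a given prefix, and the lookup then takes the longest match. The AD values (EIGRP 90, OSPF 110, RIP 120) are the usual Cisco defaults; the prefixes are invented.

```python
import ipaddress

def build_table(learned):
    """Keep one route per prefix: the source with the lowest administrative distance."""
    table = {}
    for prefix, protocol, ad in learned:
        net = ipaddress.ip_network(prefix)
        if net not in table or ad < table[net][1]:
            table[net] = (protocol, ad)
    return table

def lookup(table, destination):
    """Return (prefix, protocol) of the longest matching prefix, or None."""
    dst = ipaddress.ip_address(destination)
    matches = [net for net in table if dst in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), table[best][0]

learned = [
    ("10.0.0.0/8", "eigrp", 90),
    ("10.1.0.0/16", "rip", 120),
    ("10.1.0.0/16", "ospf", 110),
]
table = build_table(learned)
print(lookup(table, "10.1.2.3"))
```

The /16 wins over the EIGRP /8 even though EIGRP has the better AD, because AD is only compared between sources of the same prefix.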
RIP
Standards based distance vector IGP.
- Uses split-horizon, poison reverse, count to infinity
UDP port 520 for transport
RIPv1
- Classful (NO VLSM)
- Updates are broadcasts (all hosts 255.255.255.255)
RIPv2
- Classless
- Updates as multicast 224.0.0.9
Enable global process
#router rip
Enable interfaces
#network x.x.x.x (Matches classful major networks only)
- Supports both v1 and v2 at the same time.
Interface level config
#ip rip send version (1/2)
- Summarization
o RIPv2 is classless but auto-summarizes at classful boundaries
o Summarization by default
o Manual summaries can be configured at the interface level
- Split-horizon: updates received on an interface will not be sent back out of the same interface.
- RIP updates can be configured per interface as broadcast, multicast or unicast.
The passive-interface command in the RIP global process prevents the sending of multicast updates. You can
then enable unicast updates by specifying the RIP neighbor under the RIP process.
#show ip protocols
- Metric calculation
1 hop per device, up to 16 (16 = unreachable).
Metric can be changed with offset list
#debug ip rip
- Convergence timers (update, invalid, hold down, flush)
o Update: how often updates are sent (30 seconds default)
o Invalid: how long without an update before the route is marked possibly down (180 seconds default)
o Hold down: how long updates for a possibly down route are ignored (180 seconds default)
o Flush: when the route is removed from the route table (240 seconds default)
Authentication. Clear text or MD5. Interface level. Specific to interface neighbor.
Filtering updates
o Passive interfaces
o Distribute lists (acl or prefix list)
o Offset lists (modify metrics)
o Admin distance
EIGRP
Enhanced interior gateway routing protocol
Cisco proprietary hybrid protocol (advanced distance vector)
Classless: Supports both VLSM and summarization
IPv4, IPX and AppleTalk
- Uses its own transport protocol. Protocol 88 RTP. 224.0.0.10 or unicast (RTP)
- Forms active neighbor adjacencies
- Guaranteed loop-free topology using DUAL Diffusing update algorithm
- Fast convergence (feasible successors)
- More granular metric (bandwidth, load, delay, reliability)
o Lowest cost wins.
- Supports unequal-cost load balancing.
- Lowest feasible distance = successor route (primary path). A feasible successor is a backup path
that meets the feasibility condition.
- Supports manual or auto-summarization.
#no auto-summary
- Supports control plane MD5 authentication. NO CLEAR TEXT.
- Neighbors are discovered with multicast hello packets 224.0.0.10.
- Neighbors must agree on
o ipv4 subnet
o autonomous system number
o authentication
o metric weightings (K values)
- Neighbors do not need to agree on timers. This is opposite to OSPF logic.
- Once neighbors are found, EIGRP update messages are used to exchange routes. Sent as 224.0.0.10
or unicast.
The update messages describe attributes of a route
- Prefix + length
- Next hop values
- Bandwidth
- Delay
- Load
- Reliability
- Mtu
- Hop count
- Other external attributes.
All routes learned from all neighbors make up the topology table. Once the topology is learned, DUAL
runs to determine the best loop free paths to each destination. Best path (based on lowest metric) is the
successor. Only the best routes, routes with the lowest metrics, are allowed to be advertised to
neighbors.
Loop prevention
- Split Horizon: don’t advertise routes out the links they came from
DUAL Feasibility Condition: if your metric is lower than mine, you are loop free.
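The feasibility condition can be sketched as follows, assuming each path is described by the neighbor's reported distance and the total metric through that neighbor. Router names and metric values are made up.

```python
def classify_paths(paths):
    """paths: {neighbor: (reported_distance, total_metric)}.
    Returns (successor, sorted list of feasible successors)."""
    successor = min(paths, key=lambda n: paths[n][1])
    feasible_distance = paths[successor][1]
    feasible = sorted(
        n for n, (reported, _) in paths.items()
        if n != successor and reported < feasible_distance   # feasibility condition
    )
    return successor, feasible

paths = {
    "R2": (1000, 2000),   # best total metric, becomes the successor
    "R3": (1500, 2500),   # reported 1500 < FD 2000, loop free
    "R4": (2200, 2400),   # reported 2200 >= FD 2000, not guaranteed loop free
}
print(classify_paths(paths))
```

R4 has a better total metric than R3 but still fails the check, which is why it cannot be used as an instant backup.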
EIGRP Reconvergence.
- Hello packets contain “hold time”. If hold time expires, neighbor is down. When a neighbor is lost,
as expected, paths via that neighbor are removed from the topology and routing tables. If a backup
path exist (feasible successor) they become the new best paths and are inserted in the routing
table. Sub second convergence. If no backup routes exist (feasible successors), then the router must
re-run the DUAL algorithm.
DUAL Reconvergence
- When the best path is lost and no backup route (or feasible successor) exists, the router goes into
an “active” state and the “active timer” starts.
- Stable routes not in active are considered “passive”
- The active state means that they are actively being recalculated.
- EIGRP Query messages are sent to active neighbors asking them if they have a loop free path to my
“active” routes. Query messages are flooded to all neighbors within the EIGRP “Query Domain”.
Where OSPF floods the area, EIGRP floods the query domain. Summarization and EIGRP stub
features limit the size of the query domain. EIGRP neighbors respond with EIGRP “reply” packets
indicating whether an alternate route is available. If an alternate route exists, DUAL recalculates the
new path. If not, it is removed from the topology table. However, if the “active timer” expires and
no reply is received, the route is declared “stuck in active” and removed from the topology table.
Essentially, SIA comes from your query domain being too large, so routers do not receive
responses to queries before the active timer expires.
OSPF
- Link state protocol. Everyone in the area has the same link state database.
- Classless protocol
- Supports VLSM and summarization
- Guarantees loop free topology
- Standards based
- Uses its own transport: IP protocol 89.
- 224.0.0.5 (all routers)
- 224.0.0.6 (DR & BDR)
- Or unicast
- Large scalability with areas
- Fast convergence (sub-second)
o Neighbor adjacencies
o Event driven incremental updates
- Efficient updating
o Uses reliable multicast and unicast
o Non-ospf devices do not process updates. Duh!!!
- Bandwidth based cost metric.
- Control plane security (clear text and md5)
- Extensible for applications.
OSPF Adjacencies
- Hello packets (not all OSPF neighbors form adjacencies)
To form adjacencies
- Unique router-id
- Unique IP address
- Area ID
- Hello and dead interval
- Interface network address
- Interface MTU
- OSPF Network type
- Authentication
- Stub flags
- Other optional capabilities.
OSPF Network types control
- How updates are sent
- Who forms the adjacencies
- How the next-hop value is calculated.
Network types
1. Broadcast (DR & BDR) 224.0.0.5, 224.0.0.6
2. Non-broadcast (multipoint (frame relay/ATM)) unicast.
a. Neighbors must be defined.
3. Point to point (multicast)
4. Point to multipoint. (multicast)
5. Point to multipoint non-broadcast. Unicast (defined neighbors)
6. Loopback
OSPF DR/BDR used on broadcast/non-broadcast networks to
- Minimize adjacencies
- Minimize LSA flooding
- Update to DR 224.0.0.6. Update to everyone 224.0.0.5.
BDR for redundancy of DR.
Everyone else is DROthers. They form full adjacencies with the DR & BDR and stop at 2-way
adjacency with each other
DR/BDR election.
- Interface priority and router-id
- Interface priority 0-255. Higher better. 0 is never.
- Router-id (highest loopback/interface IP)
o Can be statically set
o Higher better.
NOTE: Whoever takes role of DR stays DR even if a router with higher priority joins the segment.
There is no pre-emption
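A simplified sketch of the DR/BDR choice, assuming highest priority wins, then highest router-id, priority 0 never wins, and an existing DR is kept. The real election has more steps (the BDR is also non-preemptive); the dictionary layout is hypothetical.

```python
import ipaddress

def rank(router):
    """Higher tuple is better: priority first, then router-id."""
    return (router["priority"], int(ipaddress.IPv4Address(router["rid"])))

def elect(routers, current_dr=None):
    """Return (dr, bdr) router-ids for one segment."""
    eligible = [r for r in routers if r["priority"] > 0]
    if current_dr is not None and any(r["rid"] == current_dr for r in eligible):
        dr = current_dr                      # no pre-emption
    else:
        dr = max(eligible, key=rank)["rid"] if eligible else None
    rest = [r for r in eligible if r["rid"] != dr]
    bdr = max(rest, key=rank)["rid"] if rest else None
    return dr, bdr

routers = [
    {"rid": "1.1.1.1", "priority": 1},
    {"rid": "2.2.2.2", "priority": 1},
    {"rid": "3.3.3.3", "priority": 0},
    {"rid": "4.4.4.4", "priority": 5},
]
print(elect(routers))
print(elect(routers, current_dr="1.1.1.1"))
```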
Sending OSPF updates.
- Devices in the same area share the same database.
- Database is used to calculate the shortest path tree SPT.
How flooding occurs depends on LSA type. Different LSAs are used to describe different types of routes.
- Intra area routes LSA
- Inter area routes LSA
- External LSA
- NSSA External LSA
LSA Types
Type 1 Router LSA: Generated by every router describing directly connected links. Local to the area (O)
Type 2 Network LSA: Generated by the DR. Local to the area (O)
Type 3 Network Summary LSA: Used by ABR in order to exchange info about other areas. Inter-area
route (OIA)
Type 4 ASBR Summary LSA: Generated by ABR to describe the path to a router which is doing
redistribution.
Type 5 External LSA: Generated by the ASBR itself. (E1) or (E2).
Type 7 NSSA External LSA.
OSPF Path Selection
- Bandwidth relevant
- Intra-area over inter-area over E1 over E2 over N1 over N2
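A minimal sketch of the bandwidth-based cost, assuming the common default reference bandwidth of 100 Mbps and integer division with a floor of 1. The function names are illustrative.

```python
def ospf_cost(bandwidth_bps: int, reference_bps: int = 100_000_000) -> int:
    """Interface cost = reference bandwidth / interface bandwidth, never below 1."""
    return max(1, reference_bps // bandwidth_bps)

def path_cost(link_bandwidths_bps) -> int:
    """Total cost of a path is the sum of its outgoing interface costs."""
    return sum(ospf_cost(bw) for bw in link_bandwidths_bps)

print(ospf_cost(10_000_000))      # 10 Mbps Ethernet
print(ospf_cost(1_544_000))       # T1
print(ospf_cost(1_000_000_000))   # 1 Gbps, same cost as 100 Mbps with this reference
```

With the 100 Mbps reference, everything at or above FastEthernet costs 1, which is why the reference bandwidth is often raised on faster networks.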
OSPF Route Filtering
Routers in the same OSPF area must have the same database. This limits filtering capabilities. For
example a distribute list can prevent routes from entering the local routing table, but it cannot stop
LSAs from being flooded to neighbors. This typically means that filtering occurs on ABRs before routes
are flooded. ABRs support summarization, different stub areas, or LSA type 3 filtering.
Stub areas
- Reduce the size of the db without impacting reachability.
o Stub (no asbr summary or external routes) LSA 4, 5.
o Totally stubby (plus inter area routes) LSA type 3
o NSSA (allowed to redistribute into area, Type 7)
o Totally NSSA
Virtual links
OSPF must be contiguous.
Virtual links are used to connect to area 0 across a non-backbone (transit) area.
They are an area 0 adjacency between two ABRs over the transit area.
Requirements include:
- Transit area must have full routing information
- Cannot be stub area and should not have filtering.
Network type non-broadcast does require a DR.
Non-broadcast timers are 30/120 by default.
BGP
MD5 only!!!
Standards based exterior gateway protocol.
- Path vector protocol. Uses multiple attributes, for routing between autonomous systems.
- Classless protocol
o Supports VLSM and summarization
- Highly scalable
o IGP can scale to thousands
o BGP scales to 100,000+ routes
- Highly stable. Can handle routing and pushing traffic at the same time.
- Used to enforce routing policy (traffic engineering)
- BGP uses attributes of the route itself.
o Traffic engineering is feasible and simple to implement.
- Uses ASN to identify processes
o BGP ASNs: 2 byte field. 0-65535 (originally)
o Private ASNs are at the top of the range (64512 – 65534; 65535 is reserved)
o Now up to 4 byte field. AS dot notation. 65535.65535
- Doesn’t use its own transport. Unicast TCP 179
- BGP peers are not auto-discovered. Manually configured with neighbor statements.
- BGP neighbors do not have to be directly connected. They just need ip connectivity. Logical peering
over TCP. BGP has different types of neighbors.
o External and internal peers. EBGP IBGP.
- Path vector attributes.
o Choose BGP best paths to build routing table
- Control Plane security
o Support TCP MD5 signature option.
- BGP is an extensible protocol.
o Multiprotocol BGP extensions beyond normal ipv4 unicast routing
Establishing BGP Peers.
- Like IGP, first step is to find neighbors.
- Peering establishment and maintenance uses four types of packets.
o OPEN
o KEEPALIVE
o UPDATE
o NOTIFICATION
BGP Open (Negotiate with neighbors)
- Negotiate parameters
o BGP Version (4)
o ASN
o Router ID
o Hold time (Negotiated to lowest value)
o Options (AKA capabilities)
▪ IPv4, IPv6, MPLS VPNv4, IPv4 multi or unicast
BGP Keep alive (hellos)
- Used for dead neighbor detection (hello)
- If hold time is 0, keep alives are disabled
BGP Update (routing updates)
- Used to advertise or withdraw a prefix
- Includes
o Withdrawn routes (list of routes to be discarded)
o NLRI (routes being advertised)
o Path vector attributes (route attributes used for best path selection) next-hop, local pref,
med, origin, etc.
BGP Notification (used to convey ERROR messages)
- After notification sent, BGP session is closed.
o Unsupported version #
o Unsupported optional parameters (IPv4 vs IPv6, etc)
o Unacceptable hold timer
o Hold timer expired
BGP Peering types
EBGP (Outside AS)
IBGP (Inside AS)
- Update and path selection rules change depending on what type of peer a route is being sent
to/received from.
EBGP Peering
- Usually directly connected neighbors. You can have multihop EBGP neighbourships but the default
TTL is 1. This must be changed.
- Uses AS path attribute for loop prevention.
o Inbound routes containing my own AS are discarded. Like split-horizon.
IBGP Peering
- Same AS
- Many times not directly connected
o Implies IGP needed to provide TCP transport
- Loop prevention via route suppression.
o Routes learned via an iBGP peer cannot be advertised on to another iBGP peer.
- This implies that all routers running BGP within an AS must peer with each other.
- Quite simply, iBGP peers do not advertise other devices' routes.
- IBGP full mesh of n*(n-1)/2 peerings.
- The limitation of the IBGP full mesh is scalability. This can be fixed with two exceptions.
1. Route Reflectors (Similar to OSPF DR). A central point of peering. Everyone peers with RR, RR
peers with everyone. The RR controls the control plane but does not necessarily become the
next-hop in the data plane.
2. Break network into confederation: Split the AS into sub-ASes
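To see why the full mesh does not scale, a quick sketch comparing the n*(n-1)/2 session count with a single route reflector design (one RR, every other router as its client). The single-RR layout is an assumption for illustration.

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions needed when every router peers with every other router."""
    return n * (n - 1) // 2

def single_rr_sessions(n: int) -> int:
    """Sessions needed when one of the n routers is the RR and the rest are clients."""
    return max(0, n - 1)

for n in (4, 10, 50):
    print(n, full_mesh_sessions(n), single_rr_sessions(n))
```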
NLRI: Network Layer Reachability Information
BGP Peering Redundancy
- Peering is based on IP/TCP reachability.
- No reachability, peer is down.
- Using loopbacks allows for re-routing around link failures. Redundancy with Reconvergence.
- Can also be used for load balancing.
Building the BGP table (update messages)
- NLRI can be originated from:
o Network statement. Different from other network statements. Whereas with RIP we use it
to enable an interface, with BGP we are matching a route to be advertised.
o Redistribution
o Aggregation/summarization (take 2x /24 and make one /23)
o Conditional route injection: Taking aggregates and generating more specific routes. IE if you
wanted to highlight a route for traffic engineering.
o Unlike IGP, network do not have to be directly connected to be advertised, they only have
to be in the route table. IE routes learned by OSPF which are in the route table can be
advertised by BGP.
BGP Path Vector Attributes (update messages)
- Attributes fall into categories
o Well known: must be implemented
o Optional: may or may not be implemented
o Mandatory: must be present in update
o Discretionary: may or may not be present in update.
o Transitive: passes between EBGP and IBGP peers
o Non-transitive: passes only between iBGP peers
Well known mandatory types
- Next hop
- AS path
- Origin code (how it was originated into BGP)
Well known discretionary attributes
- Local preference (controls how traffic leaves the autonomous system). Exchanged between iBGP peers
but not eBGP, so not mandatory.
- Atomic aggregate
Optional transitive
- Aggregator (if a route is the result of a summarization, the aggregator attribute identifies
the device that performed the aggregation).
Optional non-transitive
- MED (Multi exit discriminator)
BGP Best Path Selection
- Once updates are exchanged, path selection begins. BGP best path selection algorithm. The best
path selection algorithm compares path vector attributes and elects one route as best for each
prefix. Only the best route is sent to the routing table and is the only one to be advertised to BGP
peers.
- Only the best route can be advertised to BGP peers! Not everyone in the same AS has the same BGP
table. Unlike OSPF, where all routers in an area must have matching databases.
- Multipath can occur but in very strict circumstance.
BGP Best Path Selection Order
- Algorithm runs top down until a deciding match occurs. Like an ACL.
- Cisco IOS Selection order is
o Weight: (highest) Locally significant value
o Local preference (highest)
o Locally originated routes (Generated with network statement or redistribution)
o Shortest AS Path
o Origin (lowest) IGP over EGP over incomplete
o MED (lowest)
o External learned over internal learned
o Smallest IGP metric to next-hop value
- Other tie breakers: oldest route, lowest router-id, lowest interface ip address, etc.
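The top-down order above can be sketched as a sort key. This is a simplified illustration: it stops before the final tie breakers and compares MED across all paths, which real implementations usually restrict to paths from the same neighboring AS. The Path class and its field defaults are hypothetical.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0

def sort_key(p: Path):
    """Smaller key is better; 'highest wins' attributes are negated."""
    return (-p.weight, -p.local_pref, not p.locally_originated, len(p.as_path),
            ORIGIN_RANK[p.origin], p.med, not p.ebgp, p.igp_metric)

def best_path(paths):
    return min(paths, key=sort_key)

a = Path("a", as_path=(65001, 65002))
b = Path("b", as_path=(65003,))
c = Path("c", local_pref=200, as_path=(65001, 65002, 65003))
print(best_path([a, b]).name, best_path([a, b, c]).name)
```

Path c wins despite the longest AS path because local preference is evaluated before AS path length.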
Manipulating BGP Best Path Selection (TE)
- Vector attributes can be manually modified to define different routing policy for different routes.
Eg. Control inbound/outbound traffic flow on a per-prefix basis.
- Attributes typically modified are
o Weight
o Local Pref
o AS Path
o MED
(Which attributes you change will depend on if you are trying to modify inbound or outbound traffic)
- Inbound routing policy affects outbound traffic flow
o Inbound, change weight or local pref to affect traffic out.
o Outbound, change AS Path or MED to affect traffic in.
IPv6
IPv4 (4 byte) IPv6 (16 byte, 128 bit)
IPv6 hexadecimal
4 main address types.
- Global unicast (publicly routable)
2000::/3 (2000-3FFF)
- Unique local (replaces site local) private addressing.
FC00::/7
- Link Local
FE80::/10
- Multicast Addressing
FF00::/8
Host portion is auto generated from the MAC address (EUI-64). MAC is 48 bit, IPv6 host address is 64 bit.
The extra 16 bits are derived as follows:
- MAC 1234.5678.9012
- Invert 7th most significant bit
1034.5678.9012
- Insert FFFE in middle
1034:56FF:FE78:9012
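The steps above can be sketched in a few lines. The function name is made up; the input accepts dotted, colon or dash separated MAC notation.

```python
def eui64_interface_id(mac: str) -> str:
    """Build the 64 bit interface ID from a 48 bit MAC address."""
    digits = mac.replace(".", "").replace(":", "").replace("-", "").lower()
    if len(digits) != 12:
        raise ValueError("expected a 48 bit MAC address")
    octets = bytearray.fromhex(digits)
    octets[0] ^= 0x02                                   # invert the 7th most significant bit
    full = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])   # insert FFFE in the middle
    text = full.hex()
    return ":".join(text[i:i + 4] for i in range(0, 16, 4))

print(eui64_interface_id("1234.5678.9012"))
```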
IPv6 Address Resolution.
Ethernet: ICMPv6 ND replaces ARP.
NBMA
- Static resolution on multipoint interfaces.
- Inverse neighbor discovery not implemented
ICMPv6 Neighbor Discovery (ND)
- NS (neighbor solicitation) (ARP request)
o Ask for information
- NA (Neighbor advertisement) (ARP reply)
- RS (router solicitation) ask for information about routers
- RA (router advertisement) advertise yourself as a router.
Once the neighbor joins the LAN segment:
- Send out neighbor solicitation to solicited node multicast
- If no reply, my address is unique (Duplicate address detection)
- Send unsolicited neighbor advertisement to announce yourself. All hosts multicast FF02::1
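For reference, a short sketch of how the solicited-node multicast address is formed: the prefix FF02::1:FF00:0/104 plus the low 24 bits of the unicast address. The sample address uses the documentation prefix 2001:db8::/32.

```python
import ipaddress

def solicited_node(address: str) -> str:
    """Solicited-node multicast group for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return str(ipaddress.IPv6Address(base | low24))

print(solicited_node("2001:db8::1034:56ff:fe78:9012"))
```

Only hosts whose addresses share those last 24 bits listen to the group, so a neighbor solicitation disturbs far fewer hosts than an ARP broadcast.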
IPv6 routing is disabled by default.
- #ipv6 unicast-routing
Dynamic routing with
- RIPng
- OSPFv3
- EIGRPv6
- IS-IS
- BGP
Dynamic routing protocols resolve next-hops to remote link-local addresses.
IPv6 Static Routing
Same as IPv4
- To next-hop
- To multipoint interface (do not use)
- To point-to-point interface
- RIPng, OSPFv3, EIGRPv6
o Use separate process.
- BGP & IS-IS
o Use the same global process
o Different address families
RIPng Overview
- Similar in operation to RIPv1/v2
- UDP port 521 multicast to FF02::9 (224.0.0.9)
- Configuration
o Interface level #ipv6 rip (process) enable
- Split horizon enabled globally
o #no split-horizon on NBMA interfaces
EIGRPv6
- Similar to v4
- IP Protocol 88 multicast to FF02::A (224.0.0.10)
- Config
o Interface level #ipv6 eigrp (AS)
o Process level #no shutdown
OSPFv3
Similar to OSPFv2
- Router ID is IPv4 address
o Use router-id if ipv4 is not configured.
- Configuration
o Interface level #ipv6 ospf (Process) area area-id
LSA’s
Most LSA’s are the same. 2 new LSA’s. Type 8 and type 9.
Type 8 LSA Link LSA
- Link local scope
- Used for link-local next-hop calculation. Only relevant to the segment.
Type 9 LSA Intra area prefix LSA
- Area scope
- Used to advertise global addresses of connected links.
LSA’s 1 and 2 are still used to build the intra-area graph of the network, but are now decoupled from the
actual addresses on the link. Basically you can have as many IPv6 addresses on a single interface as you
want. They are not secondary, they are all primary. LSAs 8 and 9 are used to advertise the actual
next-hop information.
Same network types
- Broadcast
o DR/BDR election
- Non-broadcast
o DR/BDR election
o Unicast updates to link local addresses.
- Point to point
- Point to multipoint
- Point to multipoint non broadcast
BGP for IPv6
- Same logical process.
o Uses address family identifier configuration.
- Normal BGP rules apply
o Requires underlying IGP transport
Tunneling IPv6 over IPv4
- Static tunnel (GRE)
- IPv6 IP tunnel
- Automatic Tunnels
o 6to4 tunnels: embed IPv4 address into IPv6 prefix to provide automatic tunnel endpoint
termination.
- ISATAP (intra site tunnel)
o Automatic host to router and host to host.
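A sketch of the 6to4 embedding, assuming the standard 2002::/16 prefix followed by the 32 bit IPv4 address to give a /48 per site. The IPv4 address is from the documentation range.

```python
import ipaddress

def six_to_four_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Return the 2002:AABB:CCDD::/48 prefix derived from a public IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4 << 80), 48))

def embedded_ipv4(prefix: ipaddress.IPv6Network) -> str:
    """Recover the tunnel endpoint from a 6to4 prefix."""
    value = (int(prefix.network_address) >> 80) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value))

prefix = six_to_four_prefix("192.0.2.1")
print(prefix, embedded_ipv4(prefix))
```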
MPLS Tag/Label Switching
- Can transport different protocols.
- Layer 2 (Ethernet, HDLC, PPP, Frame Relay, ATM)
- Layer 3 (IPv4 and IPv6)
- Traffic is switched based on locally significant label value.
MPLS label (4 byte (32 bit) header)
- 20 bit label = locally significant to the router
- 3 bit experimental (QoS COS)
- S bit = bottom of stack
If 1, label is last in the stack
- 8 bit TTL
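The 32 bit layout above (20 bit label, 3 bit EXP, 1 bit S, 8 bit TTL) can be sketched with plain bit shifts. Function names are illustrative.

```python
import struct

def pack_label(label: int, exp: int, s: int, ttl: int) -> bytes:
    """Encode one MPLS label stack entry as 4 bytes in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)

def unpack_label(data: bytes):
    """Decode 4 bytes back into (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data)
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF

entry = pack_label(label=16, exp=0, s=1, ttl=255)
print(entry.hex(), unpack_label(entry))
```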
How labels work
- MPLS labels are bound to FEC’s
o Forwarding equivalence class
o Mainly ipv4 prefix for our purposes.
o Could also be ipv6 prefix or L2 circuit.
- Router uses MPLS LFIB to switch traffic
o Essentially CEF table + label.
- Switching logic
o If traffic comes in if1 with label x, send it out if2 with label y.
MPLS Device roles
- PE/LER (Provider edge/Label edge router)
o Connects to customer edge
o Receives unlabeled packets and adds labels.
o PE runs both ip routing and MPLS label switching
- P/LSR (Provider router/Label switch router)
o Connects to PE’s and other P’s
o Switches based on MPLS label only.
Major MPLS operations Push/swap/pop
- PE and P routers perform three major operations
- Label Push
o Adds a label to an incoming packet
o Label imposition
- Label swap
o Replace a label on an incoming packet
- Label pop
o Remove a label from an outgoing packet
o Aka label disposition
Label distribution
- Adjacent P/PE’s must agree on label per FEC.
- Label binding can be dynamic through
o Tag distribution protocol (TDP) Cisco proprietary
o Label distribution protocol (LDP)
o Resource reservation protocol (RSVP)
o Multiprotocol BGP (MP-BGP)
LDP
- Neighbor discovery on UDP port 646 to 224.0.0.2 (all routers)
- Neighbor adjacency
o TCP port 646 to remote LDP router-id
- Label advertisement
o Advertise FEC for connected IGP interfaces
o Advertise FEC for IGP learned routes
Penultimate hop popping (PHP)
- Penultimate means next to last
- Normal last hop must (as an example)
o Lookup MPLS label in LFIB
o Pop MPLS label (remove label)
o Lookup IPv4 destination
- PHP avoids extra lookup on last hop
- Accomplished through implicit null label advertisement for connected prefixes
MPLS L3 VPNs AKA MPLS L3VPN
- Combines logic of MPLS tunnels with separation of layer 3 routing information
o PE’s learn customer routes from CE’s
o PE’s advertise CE’s routes to other PE’s via BGP
o BGP next-hop values point to MPLS tunnels. E.g. loopbacks of PE routers.
How MPLS L3 VPN’s work
- Two basic components
- Separation of customer routing information
o VRF instances
o Customers have different “virtual” routing tables.
- Exchange of customer routing information
o MP-BGP over MPLS network
o Traffic is label switched toward next-hop value.
Virtual routing and forwarding (VRF)
VRF’s without MPLS is considered “VRF Lite”. Routing inside a VRF is specific to that vrf.
VRF Lite vs MPLS VPN’s
- In VRF lite, all transit devices must carry all routes. Same as normal IP routing
- In MPLS L3 VPN, only PE routers need customer routes.
- Accomplished through
o VPNv4 route into BGP
- BGP route + RD (route distinguisher) makes VPN routes globally unique.
- VPN label: PE routers exchange a label for each customer route via VPNv4.
- Transport label: Label towards PE’s BGP next-hop. This is derived via LDP.
Multiprotocol BGP (BGP/MPLS)
Multicast IPv4 Addressing
224.0.0.0/4
Includes reserved ranges
- Link local ranges (224.0.0.x)
- Source specific multicast
o 232.0.0.0/8 (232.0.0.0 – 232.255.255.255)
- Administratively scoped (private)
o 239.0.0.0/8 (239.0.0.0 – 239.255.255.255)
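A small sketch that classifies a group address against the ranges listed above, assuming the link local range is 224.0.0.0/24. The labels are informal.

```python
import ipaddress

MULTICAST = ipaddress.ip_network("224.0.0.0/4")
RANGES = [
    ("link-local", ipaddress.ip_network("224.0.0.0/24")),
    ("source-specific", ipaddress.ip_network("232.0.0.0/8")),
    ("administratively scoped", ipaddress.ip_network("239.0.0.0/8")),
]

def classify(group: str) -> str:
    """Name the reserved multicast range an IPv4 address falls into."""
    addr = ipaddress.ip_address(group)
    if addr not in MULTICAST:
        return "not multicast"
    for label, network in RANGES:
        if addr in network:
            return label
    return "other multicast"

for group in ("224.0.0.5", "232.1.1.1", "239.255.0.1", "224.0.1.39"):
    print(group, classify(group))
```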
Multicast Control Plane
- Who is sending traffic and to what groups.
- Who is receiving traffic and for what groups.
- How traffic should be forwarded when it is received “The multicast tree”
The control plane is built with a combination of:
- Host to router communications (IGMP)
- Router to router communications (PIM & MSDP)
Multicast Data Plane
- Once the tree from senders to receivers is built, traffic begins to flow.
- Before forwarding, data plane checks occur.
o Reverse path forwarding (RPF) check. Was traffic received on the correct interface?
o Multicast routing table: What interface should I forward the packets out?
Control Plane IGMP
- Used by a receiver to signal routers on the LAN that it wants traffic for a specific group.
- IGMP host signals membership to router via report.
- IGMPv1/v2 supports only group specific joins
(*,G) report
- IGMPv3 supports group and source specific joins
(S,G) report
- IOS router listens for IGMPv1/v2 when PIM is enabled.
PIM
- Router to router
- Protocol independent because it does not advertise its own topology information. Implies that IGP
already run in the network to build a loop-free topology.
- IOS runs PIMv2 by default
PIM Modes: Controlling how the tree is built and who receives what traffic.
- Dense mode:
o Implicit join to all groups, unless specified
o Uses flood and prune behavior
- Sparse mode:
o Considered explicit join. No traffic unless specified.
o Utilizes RP to process join requests.
- Sparse/dense mode
o Sparse for groups with RP assigned
o Dense for all others.
MSDP (multicast source discovery protocol)
- Used between RP’s to signal each other about multicast senders.
- Originally designed for inter-as multicast
- Can be used for intra as anycast RP.
Data plane RPF check
- PIM does not exchange topology information. How do we know the network is loop free? RPF check
prevents loops in the data plane.
- When packet is received
o Check source ip and incoming interface
o If incoming mcast interface == outgoing unicast interface back to source, RPF check passes.
o If incoming mcast interface is not equal to outgoing unicast interface, RPF check fails and
packet is dropped.
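The RPF check can be sketched as a longest-match lookup on the source address followed by an interface comparison. The route table layout (prefix to outgoing interface) and the interface names are assumptions for illustration.

```python
import ipaddress

def rpf_interface(unicast_routes: dict, source: str):
    """Interface the unicast table would use to reach the source, or None."""
    src = ipaddress.ip_address(source)
    matches = [
        (ipaddress.ip_network(prefix), interface)
        for prefix, interface in unicast_routes.items()
        if src in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def rpf_check(unicast_routes: dict, source: str, incoming_interface: str) -> bool:
    """Pass only if the packet arrived on the interface that leads back to the source."""
    return rpf_interface(unicast_routes, source) == incoming_interface

routes = {"10.1.0.0/16": "Gi0/0", "10.1.1.0/24": "Gi0/1"}
print(rpf_check(routes, "10.1.1.5", "Gi0/1"))   # arrived on the expected interface
print(rpf_check(routes, "10.1.1.5", "Gi0/0"))   # arrived elsewhere, would be dropped
```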
Data Plane – Mcast Routing Table
Using PIM, router learns where sources and receivers exist.
- Interface facing upstream towards the source is the “incoming interface”
- Interface/s facing receivers or end hosts are downstream and “outgoing interface list” OIL
- Split-horizon prevents any single interface from being in both lists.
If RPF passes
- Prefer (S,G) over (*,G) in routing table
- Switch packets from incoming int to all interfaces in OIL.
PIM Dense mode PIM-DM
- Uses push model or implicit join
- Called flood and prune
- All traffic flooded through entire network
- Routers that have no receivers prune (unjoin) unused links.
Only suitable for small implementations
- Doesn’t scale because of flooding and (S,G) state creation.
PIM Dense Mode Operation
- Discover PIM neighbors (224.0.0.13)
- Flood all mcast traffic
- Prune unwanted traffic
- Multicast table maintenance
o Graft message (unpruned)
o Assert
o State refresh
PIM Sparse Mode PIM-SM
- Uses pull model or explicit join
o Traffic is not flooded unless you ask for it
- Uses both shared tree and source based tree
o Dense only uses source based trees
o More suitable. Better design choice.
Shared vs source trees
Source based tree (shortest path tree)
- Uses shortest path from sender to receiver.
- Dense or sparse mode.
- Relies on IGP
- (S,G) entries. Most specific.
Shared Trees
- Uses shortest path from sender to RP then shortest path from RP to receiver
- Sparse mode only
- Used to eliminate flooding and pruning and make routing table more scalable.
- You specify the RP
PIM-SM operation
- Discover PIM neighbors and elect DR
- Discover RP
- Tell RP about sources
- Tell RP about receivers
- Build shared tree from sender to receiver through RP
- Join shortest path tree
- Once the RP is used to build the control plane, the RP can leave the shared tree or data plane.
- MCast table maintenance: The RP is used to maintain and limit broadcast traffic on the control
plane. It does not insert itself into the data plane.
Learning the RP address
- Without the RP
o Sources can’t be registered
o Joins can’t be processed.
- All routers must agree on the same RP address on a per-group basis
o Registrations and joins are rejected for invalid RPs
RP Address can be assigned
- Statically
- Dynamically
o Auto-RP ((Cisco proprietary) legacy)
o BSR (more preferred)
Auto-RP
- Uses two functional roles
o Candidate RPs
▪ Devices willing to be the RP
▪ Sends advertisement to multicast address via 224.0.1.39
o Mapping agent
▪ Chooses the RP among candidates and relays this information to the rest of the PIM
domain
▪ Sends advertisements to all routers via 224.0.1.40
Auto-RP caveats
- Dynamically learned RP mappings are preferred over statically configured ones.
- Auto-RP control plane messages are subject to RPF check
- To successfully advertise Auto-RP information
o Mapping agent must listen for 224.0.1.39
o Everyone must listen for 224.0.1.40
- In PIM-SM
o Cannot join the auto-RP groups without knowing where the RP is
o Cannot know where the RP is without joining auto-rp groups
- Reverse logic loop!!
Auto-RP solutions
- Default RP assignment
o Assign a static RP for groups 224.0.1.39 and 224.0.1.40.
o Defeats the purpose of automatic assignment
- PIM sparse-dense mode
o Dense for groups without an RP
o Sparse for all others
- Auto-RP listener feature
o Dense for 224.0.1.39 and 224.0.1.40 only
o Sparse for all others
Bootstrap Router
- Standard per PIMv2
o Functionally similar to Auto-RP
- Defines two roles in BSR domain
- RP Candidate (candidate RP in Auto-RP)
o Uses unicast PIM to advertise itself to the bootstrap router
- Bootstrap router (BSR)
o Analogous to mapping agent in Auto-RP
o Advertises RP information to other routers with multicast PIM on a hop-by-hop basis.
Bidirectional PIM (BiDir-PIM)
- Traditionally two trees in sparse mode
o Unidirectional SPT from source to RP
o Unidirectional shared tree from RP to receivers
- Results in (*,G) and (S,G) entries in control-plane
o For many to many applications, doesn’t scale well.
- Bidirectional PIM solves this by only allowing the shared tree (*,G) and never a SPT (S,G)
NOTE: BiDir PIM fits many-to-many cases, e.g. a financial institution which is receiving multicast
streams from the web and also sending to local systems.
Source Specific Multicast
- IGMPv2 and PIM Sparse mode use (*,G) joins to discover sources
o This is why the RP is needed
- IGMPv3 and SSM change these rules
o Clients generate (S,G) IGMPv3 joins
o Routers build SPT directly to S for G
- Removes the need for RP
o Since no RP, no register messages
o Source discovery is out-of-band
- Typically uses 232.0.0.0/8
MSDP
- Allows service provider to use their own internal RP’s
o Not relying on internal routing of another AS
- MSDP is used to communicate between RP’s
o Who are the active sources?
- Does not eliminate the need for PIM between domains
o RP to RP messaging
How MSDP works
- TCP peering
- When RP receives PIM register for (S,G), it informs MSDP peers using a source active (SA) message.
o Allows other AS’s to know what senders there are.
- If another RP receives a (*,G) join for a group, an (S,G) join is sent toward the source.
- MSDP peers are used for control, not data plane.
Anycast RP
- Uses anycast load balancing to decentralize the placement of RPs
o PIM register and join messages go to the closest RP in the topology.
o If one RP goes down, convergence is up to the IGP
o As long as one RP is up, network is functional.
Anycast design issues
- All RPs must share info about senders and receivers.
- Anycast RPs are assigned a duplicate address and advertise it into the IGP.
- All routers point to the anycast address
o Static or dynamic assignment
- Anycast RPs are MSDP peers using a unique address
- When a PIM register is received, an MSDP SA is sent to MSDP peers.
o Results in synchronization of (S,G) information