Deployment and Operation of BGP
TECRST-2310
TECRST-2310_c1 © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Agenda
• Introduction to BGP
• BGP General Operation
• BGP Attributes and Policy Control
• BGP Path Selection Algorithm
• Applying Policy with BGP
• Multi-Protocol BGP
• BGP Load Balancing
• Full Mesh IBGP
• BGP Route-Reflectors
• Scaling BGP Updates
• BGP Fast Convergence
• A Little BGP “Show and Tell”

Introduction to BGP

Autonomous System
• A network sharing the same routing policy
• Possibly multiple IGPs
• Usually under single administrative control
• Contiguous internal connectivity
• Numbering range from 1 to 65,535; globally unique “AS Number”
• Private range: 64512–65534
• Reserved: 0 and 65535

Border Gateway Protocol - BGP
• BGP is classified as a path vector routing protocol (see RFC 1322)
• A path vector protocol defines a route as a pairing between a destination and the attributes of the path to that destination
• BGP is used internally (iBGP) and externally (eBGP)
• iBGP is used to carry
  • Some/all Internet prefixes across the ISP backbone
  • ISP’s customer prefixes
• eBGP is used to
  • Exchange prefixes with other Autonomous Systems (ASes)
  • Implement routing policy

BGP Basics
[Figure: routers A through E in AS 100, AS 101, and AS 102, with eBGP peering between ASes and iBGP peering within an AS]
• BGP speakers are called peers or neighbors

External BGP - eBGP
• Between BGP speakers in different ASes
• Usually directly connected
• Usually sets next-hop to self
[Figure: Router B in AS 2 (network 2.0.0.0), at .1 on link 2.0.1.0]
Router A:
    router bgp 1
     neighbor 2.0.1.1 remote-as 2
Router B:
    router bgp 2
     neighbor 2.0.1.2 remote-as 1
     neighbor 2.0.1.2 route-map X {in|out}
    ..
    route-map X permit 10
     {set | match} <attribute>
[Figure, continued: Router A in AS 1 (network 1.0.0.0), at .2]

Internal BGP - iBGP
• Neighbor in same AS
• Next-hop unchanged…usually
• May be several hops away
• Don’t forward iBGP-learned routes to other iBGP peers
• n*(n-1)/2 peering mesh: scaling problem!
• Route-Reflectors relax this constraint
Router B:
    router bgp 1
     neighbor 1.0.1.1 remote-as 1
Router A:
    router bgp 1
     neighbor 1.0.2.1 remote-as 1

iBGP and Loopback Interfaces
[Figure: RtrA and RtrB in AS 100]
RtrA:
    interface loopback0
     ip address 1.1.1.254 255.255.255.255
    !
    router bgp 100
     neighbor 1.1.2.254 remote-as 100
     neighbor 1.1.2.254 update-source loopback0
RtrB:
    interface loopback0
     ip address 1.1.2.254 255.255.255.255
    !
    router bgp 100
     neighbor 1.1.1.254 remote-as 100
     neighbor 1.1.1.254 update-source loopback0
• Why not peer to the address assigned to a physical interface?

Reasons for Using BGP
1. You need to scale your IGP
2. You’re a multihomed ISP customer and need to implement routing policy
3. You’re an MPLS/VPN subscriber to an SP service and want to run dynamic routing between CE and PE routers

Using BGP to Scale Your IGP
• Scaling a large network: “Divide and Conquer”
• Hierarchy
• Periodic IGPs/flooding
• Isolate network instability
• Complex policies
• Control reachability to prefixes
• Merge separate organizations
• Connect multiple IGPs

Best Path Selection for Cisco Routers
Which Route Is Best?
• First, always take the next-hop advertising the longest prefix (most specific route to destination)
  • Choose the next-hop advertising 10.1.1.0/24 over the next-hop advertising 10.1.0.0/16
• If two next-hop routers advertise the exact same route, refer to the default administrative distances as an index of believability
  • Lower is more believable
• Defaults can be modified if necessary (with caution)

Default Distance Values
    Route Source                                                      Default Distance
    Connected interface                                               0
    Static route*                                                     1
    Enhanced Interior Gateway Routing Protocol (EIGRP) summary route  5
    External Border Gateway Protocol (BGP)                            20
    Internal EIGRP                                                    90
    IGRP                                                              100
    OSPF                                                              110
    Intermediate System-to-Intermediate System (IS-IS)                115
    Routing Information Protocol (RIP)                                120
    Exterior Gateway Protocol (EGP)                                   140
    On Demand Routing (ODR)                                           160
    External EIGRP                                                    170
    Internal BGP                                                      200
    Unknown**                                                         255

General Operation

BGP General Operation
• Learns multiple paths via internal and external BGP speakers
• Picks the best path and installs it in the forwarding table
• Policies are applied by influencing the best path selection

Summary of Operation
• TCP connection established (port 179)
• Both peers attempt to connect; there is an algorithm to resolve “connection collisions”
• Exchange messages to open and confirm the connection parameters
• Initial exchange of entire table
• Incremental updates after initial exchange
• Keepalive messages exchanged when there are no updates

What Are Incremental Updates?
• IGPs typically rebroadcast routes
• BGP runs over TCP => reliable data delivery
• Once BGP sends a route to a peer, it assumes the peer will keep it unless:
  • A replacement route is sent: implicit withdraw of old route
  • The route is withdrawn: explicit withdraw
  • The BGP session goes down (keepalive failure)

Inserting Prefixes into BGP
• Two ways to insert/originate prefixes into BGP
  • Redistribute (static or dynamic)
  • Network command
    • Always necessary for default route
• Default rules for re-advertising BGP-learned prefixes to other BGP neighbors
  • eBGP-learned routes are sent to all eBGP and iBGP peers (ee, ei)
  • iBGP-learned routes are sent to all eBGP but NO iBGP peers (ie)
  • Exception: iBGP Route-Reflectors

Inserting Prefixes into BGP - Redistribute
Configuration examples:
    router bgp 109
     redistribute static
    ip route 198.10.4.0 255.255.254.0 serial0

    router bgp 109
     redistribute eigrp 100

Inserting Prefixes into BGP - Network
• Network: used to tell BGP which networks to advertise to neighbors; unlike IGPs, the network command is not used to determine which interfaces will be active for the protocol; networks must be in the IP routing table in order for them to be advertised
    router bgp 100
     neighbor x.x.x.x remote-as Y
     network 172.16.0.0
  • If auto-summary is on then a specific route from 172.16.0.0 must be in the routing table; if auto-summary is off then the prefix 172.16.0.0/16 must be in the IP routing table
     network 172.17.1.0 mask 255.255.255.0
  • Must be an exact match in the IP routing table
Inserting Prefixes into BGP – Network Command
Configuration example:
    router bgp 109
     network 198.10.4.0 mask 255.255.254.0
     network 0.0.0.0
• A matching route must exist in the routing table before the network is announced
• Exact prefix length: “show ip route x.x.x.x” must return the exact route before BGP will advertise
• Static route can be a real next hop or the null0 interface
    ip route 198.10.4.0 255.255.254.0 192.168.1.1
    ip route 192.10.4.0 255.255.254.0 null0
    ip route 0.0.0.0 0.0.0.0 null0 250

BGP Attributes and Policy Control

Route Metrics
• OSPF has a dimensionless metric based on interface speed
• EIGRP has a 5-tuple
    [(K1 * BW + K2 * BW/(256 – Load) + K3 * Delay) * K5/(K4 + Reliability)] * 256
• RIP has a hop count
• BGP has …

BGP Attributes (More Than Just Route Cost…)
• AS path
• Next hop
• Weight
• Local preference
• Multi-Exit Discriminator (MED)
• Community
• Atomic aggregate
• Origin
• Originator ID
• Cluster list

What Is an Attribute?
[Figure: an update carrying ... Next Hop, AS Path, MED ...]
• Properties associated with a prefix/route
• Used to determine the best path to a destination when multiple paths exist
• Attribute categories
  • Well-known, mandatory
  • Well-known, discretionary
  • Optional, transitive
  • Optional, non-transitive

AS-Path
• Well-known, Mandatory, Code = 2
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16), and AS 500; paths shown: 180.10.0.0/16 via 300 200 100, 170.10.0.0/16 via 300 200]
[Figure, continued: AS paths as labeled in the figure]
    180.10.0.0/16   300 200 100
    170.10.0.0/16   300 200
    150.10.0.0/16   300 400

Next Hop
• Well-known, Mandatory, Code = 3
• Next hop to reach a network
• Usually a local network is the next hop in an eBGP session
[Figure: AS 100 (160.10.0.0/16), AS 200 (150.10.0.0/16), and AS 300; routers A and B on link addresses 150.10.1.1 and 150.10.1.2; table entries: 150.10.0.0/16 next hop 150.10.1.1, 160.10.0.0/16 next hop 150.10.1.1]

Next Hop
[Figure: same topology with router C added; A to B is eBGP, B to C is iBGP; table entries: 150.10.0.0/16 next hop 150.10.1.1, 160.10.0.0/16 next hop 150.10.1.1]
• Next hop not changed

Local Preference
• Well-known, Code = 5
[Figure: AS 100 (160.10.0.0/16), AS 200, AS 300, and AS 400 with routers A through E; 160.10.0.0/16 is learned with local preference 500 and 800, and the path with 800 is marked best (>)]

Local Preference
• Local to an AS
• Local preference set to 100 when heard from neighbouring AS
• Used to influence BGP path selection
• Determines best path for outbound traffic
• Path with highest local preference wins

Local Preference
Configuration of Router B:
    router bgp 400
     neighbor 220.5.1.1 remote-as 300
     neighbor 220.5.1.1 route-map local-pref in
    !
    route-map local-pref permit 10
     match ip address prefix-list MATCH
     set local-preference 800
    !
    ip prefix-list MATCH permit 160.10.0.0/16
    ip prefix-list MATCH deny 0.0.0.0/0 le 32

MULTI_EXIT_DISC (MED or Metric)
• Optional, Non-transitive, Code = 4
• 4 octets
• Used by a BGP speaker’s Decision Process to discriminate among multiple entry points into a neighboring autonomous system
• If MED is missing, it is assumed MED=0
• If bgp bestpath med missing-as-worst is configured, a missing MED is assumed to be the MAXIMUM value
MULTI_EXIT_DISC (MED or Metric)
[Figure: 192.0.1.0/24 advertised with MED = 10 on one link and MED 20 on the other]
• Route with lowest MED wins!!

How to Scale Routing Policy
• Communities!
• NOT in decision algorithm
• A BGP route can be a member of many communities
• Really just a number for grouping prefixes
• Typical communities:
  • Destinations learned from customers
  • Destinations learned from ISPs or peers
  • Destinations in VPN: BGP community is fundamental to the operation of BGP VPNs

BGP Attributes: COMMUNITY
• Activated per neighbor/peer-group:
    neighbor {peer-address | peer-group-name} send-community
• Carried across AS boundaries
• BGP community values are configured as a 32-bit number (old format) or as two 2-byte numbers (new format)
• Common convention is a string of four bytes: <AS>:[0-65535]

IP BGP-Community New-Format
• Specifies that communities be displayed in a 4-byte AA:NN format
  • AA identifies the autonomous system
  • NN is a number that identifies the community within the autonomous system
    r2#show ip bgp 10.10.1.0/24
    BGP routing table entry for 65001:100:10.10.1.0/24, version 9
    <snip>
      Community: 6553700
    r2(config)#ip bgp-community new-format
    r2#show ip bgp 10.10.1.0/24
    BGP routing table entry for 65001:100:10.10.1.0/24, version 9
    <snip>
      Community: 100:100

BGP Attributes: COMMUNITY (Cont.)
• Each destination can be a member of multiple communities
• Using a route-map: set community
    <1-4294967295>  community number
    aa:nn           community number in aa:nn format
    additive        Add to the existing community
    none            No community attribute
    local-AS        Do not send to EBGP peers (well-known community)
    no-advertise    Do not advertise to any peer (well-known community)
    no-export       Do not export outside AS/confed (well-known community)

BGP Path Selection Algorithm

BGP Path Selection Algorithm
• Do not consider path if no route to next hop
  • Example: a router learns a route from an eBGP peer and then advertises it to an iBGP peer. If the iBGP peer does not know how to reach the next hop, the route is rejected. iBGP usually does not change the next hop.
• Do not consider iBGP path if not synchronized

Synchronization
• A BGP router will not accept a route from an iBGP neighbor unless the route is already in the IP routing table
[Figure: Rtr A, Rtr B, and Rtr C connected by iBGP; 172.16.0.0 learned over eBGP; Rtr D attached over eBGP]
• Rtr B does not know about 172.16.0.0; therefore, Rtr C should not advertise 172.16.0.0 to Rtr D
• Redistribute 172.16.0.0 into the IGP, use a full iBGP mesh, or disable synchronization if iBGP path = physical path

BGP Path Selection Algorithm
• Highest weight (local to router)
• Highest local preference (global within AS)
• Prefer locally originated route (aggregate address)
• Shortest AS path

BGP Path Selection Algorithm (Cont.)
• Lowest origin code: IGP < EGP < incomplete
  • IGP: network command
  • EGP: learned via the Exterior Gateway Protocol (EGP)
  • Incomplete: redistribution
• Lowest Multi-Exit Discriminator (MED)
  • If bgp deterministic-med, order the paths before comparing (not the default, but recommended)
  • If bgp always-compare-med, then compare for all paths; otherwise MED is only considered if paths are from the same AS (default)

BGP Path Selection Algorithm (Cont.)
• Prefer eBGP path over iBGP path
• Path with lowest IGP metric to next-hop
• For eBGP paths
  • If multipath enabled, install N parallel paths in routing table
  • If router-ID is the same, go to next step
  • If router-ID not the same, select “oldest”

BGP Path Selection Algorithm (Cont.)
• Lowest router-id (originator-id for reflected routes)
• Shortest Cluster-List
  • Client must be aware of Route Reflector attributes!
• Lowest neighbor IP address

Applying Policy with BGP

Constructing the Forwarding Table
[Figure: routes from a peer pass through input policies in the BGP “in” process, where they are accepted or discarded; everything accepted goes into the BGP table; best paths go to the forwarding table and, through the BGP “out” process and output policies, back out to peers]

Applying Policy with BGP
• Policy based on various attributes:
  • AS path
  • Community
  • Destination prefix
  • Many, many others…
• Reject/accept selected routes
• Set attributes to influence path selection
• Tools (IOS):
  • Distribute-list or prefix-list
  • Filter-list (as-path access-list)
  • Community-list
  • Route-maps (the Swiss army knife)
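The path selection steps listed on the preceding slides can be read as an ordered series of tie-breakers. The sketch below is a simplified illustration of that ordering, not IOS's actual implementation: the `Path` fields, `better`, and `best_path` are hypothetical names, and steps such as synchronization, multipath, "oldest" path, cluster-list length, and neighbor address are left out.

```python
from dataclasses import dataclass
from functools import reduce

# Lower rank is preferred: IGP < EGP < incomplete
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    """Hypothetical container for the attributes the slides compare."""
    next_hop_reachable: bool
    weight: int
    local_pref: int
    locally_originated: bool
    as_path: tuple
    origin: str
    med: int
    is_ebgp: bool
    igp_metric: int
    router_id: int


def better(a, b, always_compare_med=False):
    """Return the preferred of two paths, walking the tie-breakers in slide order."""
    first_steps = [
        lambda p: -p.weight,                        # highest weight
        lambda p: -p.local_pref,                    # highest local preference
        lambda p: 0 if p.locally_originated else 1, # prefer locally originated
        lambda p: len(p.as_path),                   # shortest AS path
        lambda p: ORIGIN_RANK[p.origin],            # lowest origin code
    ]
    for key in first_steps:
        if key(a) != key(b):
            return a if key(a) < key(b) else b
    # MED is compared only for paths from the same neighbor AS by default
    same_neighbor_as = a.as_path[:1] == b.as_path[:1]
    if (always_compare_med or same_neighbor_as) and a.med != b.med:
        return a if a.med < b.med else b
    last_steps = [
        lambda p: 0 if p.is_ebgp else 1,            # eBGP over iBGP
        lambda p: p.igp_metric,                     # lowest IGP metric to next hop
        lambda p: p.router_id,                      # lowest router ID
    ]
    for key in last_steps:
        if key(a) != key(b):
            return a if key(a) < key(b) else b
    return a


def best_path(paths):
    """Drop paths with an unreachable next hop, then reduce pairwise."""
    candidates = [p for p in paths if p.next_hop_reachable]
    return reduce(better, candidates) if candidates else None
```

Because MED is only comparable between some pairs of paths, a pairwise reduction like this can depend on the order of the input list; that order sensitivity is the problem `bgp deterministic-med` is meant to address.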
Policy Control - Prefix List
• Per-peer prefix filter, inbound or outbound
• Allows coverage for ranges of prefix lengths (ge, le)
• Based upon network numbers in NLRI (using familiar IPv4 address/mask format)
Example configuration:
    router bgp 200
     neighbor 220.200.1.1 remote-as 210
     neighbor 220.200.1.1 prefix-list PEER-IN in
     neighbor 220.200.1.1 prefix-list PEER-OUT out
    !
    ip prefix-list PEER-IN deny 218.10.0.0/16
    ip prefix-list PEER-IN permit 0.0.0.0/0 le 32
    ip prefix-list PEER-OUT permit 215.7.0.0/16
    ip prefix-list PEER-OUT deny 0.0.0.0/0 le 32

Policy Control - Prefix List
    a.b.c.d/x [ge | eq | le] y
[Callouts on the syntax: care vs. don’t care bits; base prefix; length to match; operator; operand]
    ip prefix-list PEER-IN permit 10.0.0.0/8 le 32
• 10.0.0.0/8 le 32 = all 10.x.x.x subnets, regardless of mask length (e.g. 10.1.2.0/24, 10.1.1.1/32, 10.1.0.0/16)

Policy Control - Prefix List
More examples:
• 0.0.0.0/0 eq 32 = all /32 prefixes (e.g. 1.2.3.4/32)
• 192.168.1.0/24 = 192.168.1.0/24 eq 24 (ONLY 192.168.1.0/24)
• 172.16.0.0/16 ge 28 = all subnets from 172.16.0.0/16 that have a mask length of /28 or greater (e.g. 172.16.4.0/28)

Policy Control - Filter List
• Filter routes based on AS path
• Inbound or outbound
Example configuration:
    router bgp 100
     neighbor 220.200.1.1 filter-list 5 out
     neighbor 220.200.1.1 filter-list 6 in
    !
    ip as-path access-list 5 permit ^200$
    ip as-path access-list 6 permit ^150$
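The ge/le matching described on the prefix-list slides can be sketched in a few lines of Python using the standard `ipaddress` module. This is an illustration under stated assumptions rather than IOS code: `entry_matches` and `evaluate` are hypothetical helper names, entries are evaluated first-match with an implicit deny at the end, and the slide's "eq y" form is modeled as ge=y together with le=y.

```python
import ipaddress


def entry_matches(entry, candidate, ge=None, le=None):
    """True if candidate falls inside entry and its length is in the allowed range."""
    base = ipaddress.ip_network(entry)
    route = ipaddress.ip_network(candidate)
    if ge is None and le is None:
        # No operator: only the exact prefix length matches
        low = high = base.prefixlen
    else:
        low = ge if ge is not None else base.prefixlen
        high = le if le is not None else base.max_prefixlen
    return route.subnet_of(base) and low <= route.prefixlen <= high


def evaluate(entries, candidate):
    """First matching entry decides; no match means implicit deny."""
    for action, entry, ge, le in entries:
        if entry_matches(entry, candidate, ge, le):
            return action == "permit"
    return False


# The PEER-IN list from the example configuration, as (action, prefix, ge, le)
PEER_IN = [
    ("deny", "218.10.0.0/16", None, None),
    ("permit", "0.0.0.0/0", None, 32),
]
```

With this model, `evaluate(PEER_IN, "218.10.0.0/16")` is denied, while a more specific such as 218.10.1.0/24 is permitted, because the deny entry has no ge/le and so only matches the exact /16.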
Policy Control - Regular Expressions
Simple examples:
    .*           Match anything
    .+           Match at least one character (cannot be empty)
    ^$           Match routes local to this AS (as-path is empty)
    _1800$       Originated by 1800 (as-path ends with 1800)
    ^1800_       Received from 1800 (as-path starts with 1800)
    _1800_       Via 1800 (1800 is somewhere in the middle of the as-path)
    _790_1800_   Passing through 1800 then 790
For more information on regular expressions:
    http://www.cisco.com/en/US/docs/ios/12_2/termserv/configuration/guide/tcfaapre_ps1835_TSD_Products_Configuration_Guide_Chapter.html
    http://www.ccietalk.com/2008/07/25/cisco-regular-expression-characters

Policy Control – Setting Communities
Example configuration:
    router bgp 100
     network 215.7.0.0
     neighbor 220.200.1.1 remote-as 200
     neighbor 220.200.1.1 send-community
     neighbor 220.200.1.1 route-map set-community out
    !
    ip bgp-community new-format
    !
    route-map set-community permit 10
     match ip address prefix-list NO-ANNOUNCE
     set community no-export
    !
    route-map set-community permit 20
     match ip address prefix-list EVERYTHING
    !
    ip prefix-list NO-ANNOUNCE permit 172.168.0.0/16 ge 17
    ip prefix-list EVERYTHING permit 0.0.0.0/0 le 32

Policy Control – Matching Communities
Example configuration:
    router bgp 100
     neighbor 220.200.1.2 remote-as 200
     neighbor 220.200.1.2 route-map filter-on-community in
    !
    route-map filter-on-community permit 10
     match community 1
     set local-preference 50
    !
    route-map filter-on-community permit 20
     match community 2 exact-match
     set local-preference 200
    !
    ip community-list 1 permit 150:3 200:5
    ip community-list 2 permit 88:6

Multi-protocol BGP
MP-BGP (RFC4760)
• Extension to the BGP protocol
• Carry routing information about other protocols:
  • IPv4 and IPv6 Unicast
  • IPv4/IPv6 + Label (RFC 3107, 6PE)
  • IPv4 and IPv6 Multicast
  • Multi-Protocol Label Switching (MPLS) VPN (IPv4 and IPv6)
  • Layer 2 VPN
  • …many others proposed
• Multi-Protocol Capabilities must be negotiated at session setup time (important!)

MP-BGP Attributes
• New non-transitive and optional Border Gateway Protocol (BGP) attributes
  • MP_REACH_NLRI: “Carry the set of reachable destinations together with the next-hop information to be used for forwarding to these destinations” (RFC4760)
  • MP_UNREACH_NLRI: carry the set of unreachable destinations
• Note: NEXT_HOP has a different format for different AFI/SAFI

MP-BGP Attributes (Cont.)
• Attribute contains one or more triples:
  • Address Family Information (AFI) with Sub-AFI: identifies the type of protocol information carried in the Network Layer Reachability Info (NLRI) field
  • Next-hop information
  • Reachability/non-reachability information

MP-BGP Capabilities Negotiation

MP-BGP Capabilities Negotiation (Cont.)
• BGP router sends an OPEN message with a CAPABILITIES parameter containing its capabilities:
    Value  Description                       Reference
    0      Reserved                          RFC 5492
    1      Multiprotocol Extensions          RFC 2858
    2      Route Refresh                     RFC 2918
    3      Outbound Route Filtering          RFC 5291
    4      Multiple Routes to Destination    RFC 3107
    5      Extended Next Hop Encoding        RFC 5549
    64     Graceful Restart                  RFC 4724
    65     4-octet AS number                 RFC 4893
    69     ADD-PATH                          draft-ietf-idr-add-paths
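To make the capabilities table concrete, here is a minimal sketch of decoding the value of an OPEN optional parameter of type 2 (Capabilities). It assumes the layout from RFC 5492 (one-octet code, one-octet length, then the value) and, for code 1, the RFC 4760 value of a 2-octet AFI, a reserved octet, and a 1-octet SAFI. The function name `decode_capabilities` and the trimmed `CAPABILITY_NAMES` map are illustrative, not taken from any router software.

```python
import struct

# Subset of the capability codes from the table above
CAPABILITY_NAMES = {
    1: "Multiprotocol Extensions",
    2: "Route Refresh",
    64: "Graceful Restart",
    65: "4-octet AS number",
}


def decode_capabilities(data: bytes):
    """Decode a sequence of (code, length, value) capabilities."""
    capabilities = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated capability header")
        code, length = data[offset], data[offset + 1]
        value = data[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated capability value")
        entry = {"code": code, "name": CAPABILITY_NAMES.get(code, "unknown")}
        if code == 1 and length == 4:
            # AFI (2 octets), reserved (1 octet), SAFI (1 octet)
            afi, _reserved, safi = struct.unpack("!HBB", value)
            entry["afi"] = afi
            entry["safi"] = safi
        capabilities.append(entry)
        offset += 2 + length
    return capabilities
```

For example, the six bytes `01 04 00 02 00 01` decode to capability code 1 with AFI 2 and SAFI 1, which corresponds to the "CAPABILITY code: 1, length 4" and "afi/safi: 2/1" lines in the session establishment debug output.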
MP-BGP Session Establishment
[Figure: session between AS 123 and AS 321]
    BGP: 3FFE:B00:C18:2:1::1 sending OPEN, version 4, my as: 100
    BGP: 3FFE:B00:C18:2:1::1 rcv OPEN, version 4
    BGP: 3FFE:B00:C18:2:1::1 rcv OPEN w/ OPTION parameter len: 16
    BGP: 3FFE:B00:C18:2:1::1 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
    BGP: 3FFE:B00:C18:2:1::1 OPEN has CAPABILITY code: 1, length 4
    BGP: 3FFE:B00:C18:2:1::1 OPEN has MP_EXT CAP for afi/safi: 2/1
    BGP: 3FFE:B00:C18:2:1::1 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
    BGP: 3FFE:B00:C18:2:1::1 went from OpenSent to OpenConfirm
    BGP: 3FFE:B00:C18:2:1::1 went from OpenConfirm to Established
    %BGP-5-ADJCHANGE: neighbor 3FFE:B00:C18:2:1::1 Up

BGP Load Balancing

Load Balancing
• BGP isn’t inherently designed to load-balance traffic
  • By default, BGP chooses, installs, and advertises one “best” route
• Attempting to balance traffic comes in two parts
  • Inbound traffic
  • Outbound traffic
• Load balancing is relatively trivial in some topologies
  • A pair of eBGP peers connected via multiple links
  • Two connections from one router to the same AS
• …but not others
  • Multi-homed to more than one provider

Single Path – eBGP Multihop
• A must do a recursive lookup for 2.2.2.2
• A has two equal cost paths to 2.2.2.2
• A will load balance traffic over these two links
• B must be configured similarly for bidirectional load balancing
Router A configuration:
    interface loopback 0
     ip address 1.1.1.1 255.255.255.255
    !
    router bgp 100
     neighbor 2.2.2.2 remote-as 200
     neighbor 2.2.2.2 update-source loopback0
     neighbor 2.2.2.2 ebgp-multihop
    !
    ip route 2.2.2.2 255.255.255.255 serial 0
    ip route 2.2.2.2 255.255.255.255 serial 1
    !
[Figure: router A in AS 100 connected to router B (Loopback 0 2.2.2.2/32) in AS 200]

eBGP Multipath Support
[Figure: router A in AS 100 peering with AS 200]
• A peers with multiple routers in the same neighbor AS
• Install multiple routes in the IP routing table
• Use the ‘maximum-paths ebgp’ command
• Routes must be identical in terms of LOCAL_PREF, AS_PATH, MED, etc… (probably true if coming from the same AS)
• Outbound traffic will be split over these two links
• A still advertises one best path to peers
• Next-hop is set to self (using loopback interface)

Multi-Homed AS
[Figure: AS 200 connected to both AS 100 and AS 300, with routers A, B, C, and D]
• Very common topology for many customers
• Customer wants to split traffic between AS 100 and AS 300
• Misconception: “I’ll make half of my routes preferred via AS 100 and the other half through AS 300. Then I’ll have load-balancing!!”…no, you’ll have prefix splitting!

Multi-Homed AS
• Huge difference between “load balancing” and “prefix splitting”
  • Traffic may be balanced perfectly…until traffic patterns change
  • Some customers use this method but they are forced to change their policies to accommodate changes in traffic patterns
• For outbound balancing use
  • Weight
  • LOCAL_PREF (recommended)
• For inbound balancing use
  • Conditional-advertisement
  • AS_PATH prepending (may not work)
  • MEDs (may not work)
  • Communities and LOCAL_PREF (recommended…but requires upstream coordination!)
BGP Multipath
• Multiple eBGP paths can be flagged as multipath as long as the paths are “similar”
• “Similar” means that all relevant BGP attributes are a tie (up to next-hop metric)
  • If paths 1 and 2 both have a local-pref of 200, MED of 300, etc… but the Router-IDs are different, then paths 1 and 2 are eligible for multipath
• These paths are installed in the RIB/FIB to load-balance outbound traffic
• Multipath is the correct approach to a difficult problem but not terribly useful because it can only be used in one specific topology
• iBGP multipath and Link-BW will help correct this

iBGP: Without Multipath
[Figure: R1, R2, and R3 in AS 100; R4 and R5 in AS 200, which originates 10.0.0.0/8]
• R1 has two paths for 10.0.0.0/8
• Both paths are identical in terms of localpref, med, IGP cost to next-hop, etc.
• Router-ID, peer-address, etc. are different, but these are arbitrary in terms of selecting a best path
• R1 will select one path as best and send all traffic for 10.0.0.0/8 towards one of the exit points

iBGP Multipath
• Flag multiple iBGP paths as ‘multipath’
• Each path must have a unique NEXT_HOP
• All multipaths are inserted into the RIB/FIB
• Number of multipaths can be controlled
    maximum-paths ibgp <1-6>
• Still advertise a single bestpath
• Each BGP next-hop is resolved and mapped to available IGP paths (not next-hop-self unless routing follows forwarding)
• Supported on all IOS versions in past ~10 yrs
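The eligibility rule above (attributes tied up to the next-hop metric, unique next hops, capped by maximum-paths) can be sketched as follows. This is a simplified model, not the IOS algorithm: paths are plain dictionaries with hypothetical keys, and the tie is checked on AS-path length rather than on full AS-path contents.

```python
def tie_key(path):
    """Attributes that must be equal for two paths to count as "similar"."""
    return (
        path["weight"],
        path["local_pref"],
        path["as_path_len"],
        path["origin"],
        path["med"],
        path["igp_metric"],
    )


def multipath_set(best, paths, maximum_paths):
    """Return the best path plus any tied paths with distinct next hops."""
    chosen = [best]
    next_hops = {best["next_hop"]}
    for path in paths:
        if len(chosen) >= maximum_paths:
            break
        if path["next_hop"] in next_hops:
            continue  # each multipath needs a unique NEXT_HOP
        if tie_key(path) == tie_key(best):
            chosen.append(path)
            next_hops.add(path["next_hop"])
    return chosen
```

Only the best path is still advertised to peers; the extra entries in the returned list model what would additionally be installed in the RIB/FIB.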
iBGP: With Multipath
[Figure: R1, R2, and R3 in AS 100; R4 and R5 in AS 200, which originates 10.0.0.0/8]
• R1 has two paths for 10.0.0.0/8
• Both paths are flagged as “multipath”
    R1#sh ip bgp 10.0.0.0
      200
        20.20.20.3 from 20.20.20.3 (3.3.3.3)
          Origin IGP, metric 0, localpref 100, valid, internal, multipath
      200
        20.20.20.2 from 20.20.20.2 (2.2.2.2)
          Origin IGP, metric 0, localpref 100, valid, internal, multipath, best

iBGP Multipath
• These two paths are installed in the RIB/FIB
• Traffic is load-balanced across the two paths/exit points based on per-packet hash
• Depending on platform/version, there may or may not be multiple levels of load balancing (IGP + BGP)
    R1#sh ip route 10.0.0.0
    Routing entry for 10.0.0.0/8
      * 20.20.20.3, from 20.20.20.3, 00:00:09 ago
          Route metric is 0, traffic share count is 1
          AS Hops 1
        20.20.20.2, from 20.20.20.2, 00:00:09 ago
          Route metric is 0, traffic share count is 1
          AS Hops 1

    R1#show ip cef 10.0.0.0
    10.0.0.0/8, version 237, per-destination sharing
    0 packets, 0 bytes
      via 20.20.20.3, 0 dependencies, recursive
        traffic share 1
        next hop 20.20.20.3, FastEthernet0/0 via 20.20.20.3/32
        valid adjacency
      via 20.20.20.2, 0 dependencies, recursive
        traffic share 1
        next hop 20.20.20.2, FastEthernet0/0 via 20.20.20.2/32

eiBGP Multipath
*Applies Only to the MPLS/VPN Case*
• The traffic destined to a site may be load shared between all entry points
• From the MPLS/VPN provider’s point of view, these entry points may not all correspond to internal or external peers
• The intent is for the MPLS/VPN network to be transparent to the customers
• The ability to consider both iBGP and eBGP paths, when using multipath, is needed
• Paths must match up to the MED attribute
eiBGP Multipath Example
[Figure: PE-1 and PE-2 in the provider network; CE-1 and CE-2 at Site-1 (SOO=100:65); CE-3 at Site-2]
• PE-2 has two possible paths into Site-1
• eiBGP Multipath allows both paths to be used
    maximum-paths eibgp <num>

Full Mesh iBGP

Full Mesh iBGP
“If a particular AS has multiple BGP speakers and is providing transit service for other ASes, then care must be taken to ensure a consistent view of routing within the AS. A consistent view of the interior routes of the AS is provided by the IGP used within the AS. For the purpose of this document, it is assumed that a consistent view of the routes exterior to the AS is provided by having all BGP speakers within the AS maintain IBGP sessions with each other.” RFC 4271

Full Mesh iBGP
• Why?
• Because BGP relies on the AS Path to prevent loops
• iBGP peers are in the same AS, so they do not add anything to the AS Path
• Thus… there’s no way to tell if a route advertised through several iBGP speakers is a loop!
• If a router learns a route from an iBGP peer, it will not re-advertise that route to another iBGP peer
[Figure: routers A, B, C, and D connected by iBGP; one router learns 10.1.1.0/24 over eBGP and advertises it over iBGP; a router that learns 10.1.1.0/24 over iBGP advertises it over eBGP but does not advertise it over iBGP]

Full Mesh iBGP
• How scalable is using a full mesh of iBGP speakers?
  • 2 speakers == 1 session
  • 3 speakers == 3 sessions
  • 4 speakers == 6 sessions
  • 5 speakers == 10 sessions
  • n(n-1)/2 = O(n^2) sessions
  • (n-1) sessions per speaker
• How can we better handle scale?
  • Confederations (yuck)
  • Route Reflectors (hooray!)
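The session counts on the scaling slide follow directly from n(n-1)/2. A short sketch makes the growth easy to check; the `rr_sessions` comparison rests on an assumption that is not from the slides, namely that every client peers with every reflector and that the reflectors are fully meshed among themselves.

```python
def full_mesh_sessions(speakers: int) -> int:
    """Number of iBGP sessions in a full mesh of the given size."""
    return speakers * (speakers - 1) // 2


def rr_sessions(clients: int, reflectors: int) -> int:
    """Sessions when each client peers with every reflector (assumed design)
    and the reflectors form a full mesh with each other."""
    return clients * reflectors + full_mesh_sessions(reflectors)


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 100):
        print(n, "speakers ==", full_mesh_sessions(n), "sessions")
```

Under that assumption, 100 routers in a full mesh need 4950 sessions, while 98 clients with 2 reflectors need 98 * 2 + 1 = 197.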
Confederations
[Figure: Confederation 100 made up of Sub-AS 65001, Sub-AS 65002, Sub-AS 65003, and Sub-AS 65004, containing routers A through H]

BGP Route Reflectors

BGP Route Reflectors
• Route Reflector Basics
• Hierarchical Route Reflectors
• Deploying Route Reflectors
• Route Reflector Redundancy

Route Reflector Basics
• A route reflector is an iBGP speaker that reflects routes learned from iBGP peers to other iBGP peers
• Route reflectors are designated by configuring some of their iBGP peers as route reflector clients
[Figure: route reflectors with clients A and B]
    neighbor <A> route-reflector-client
    neighbor <B> route-reflector-client

Route Reflector Basics
• A route reflector client is just an iBGP speaker
• There is no special configuration for a route reflector client
[Figure: route reflectors with route reflector clients A and B]

Route Reflector Basics
• A cluster is a route reflector and its clients
• Route reflector clusters may overlap
[Figure: a cluster made up of a route reflector and its clients A and B]
Route Reflector Basics
  A non-client is any iBGP peer of a route reflector that is not a route reflector client.
  Each route reflector is also a non-client of each other route reflector in this network.
  Route reflectors must be fully iBGP meshed with non-clients.

Hierarchical Route Reflectors - Motivation
  All of the route reflectors need to be fully meshed.
  Reflectors still follow the normal rules of iBGP route propagation between themselves.
  This full iBGP mesh between reflectors can still contain so many routers that it presents a scaling problem of its own.
  [Figure: full iBGP mesh between the reflectors of several clusters]

Hierarchical Route Reflectors
  To resolve this, route reflectors can be deployed in a hierarchy.
  A single router can be both a reflector client and a reflector.
  [Figure: nested clusters; one router marked "client and reflector"]

Hierarchical Route Reflectors
  An unlimited number of tiers can be used, but it is very rare to see more than 3 levels.
  Edges of route reflector tiers are a natural place to reduce the amount of routing information carried in the lower tiers.
    RRs would be ABRs in “textbook” network design
  The same topology rule applies: the reflector topology should follow the physical topology to prevent loops and black holes.
  RRs can lead to suboptimal routing because they can hide full path information from clients (an RR advertises only a single best path).
Route Reflector Basics
  If a route reflector receives a route from an eBGP peer:
    Send the route to all clients
    Send the route to all non-clients
  [Figure: reflector with one eBGP peer, two non-client iBGP peers and two clients]

Route Reflector Basics
  If a route reflector receives a route from a client:
    Reflect the route to all clients, unless client-to-client reflection is disabled (no bgp client-to-client reflection)
    Reflect the route to all non-clients
    Send the route to all eBGP peers

Route Reflector Basics
  If a route reflector receives a route from a non-client:
    Reflect the route to all clients
    Send the route to all eBGP peers

Route Reflector Basics
  What we need is a mechanism to prevent loops within the AS!
  RFC 2796 (since obsoleted by RFC 4456) defines two BGP attributes to provide loop detection within an AS:
    Originator ID
      Set to the router ID of the router injecting the route into the AS
    Cluster List
      Each route reflector the route passes through adds its cluster ID to this list
      Cluster ID = router ID by default

Route Reflector Basics
  When reflecting a route, a route reflector:
    Creates a cluster list if one doesn’t exist and adds its cluster ID (its router ID unless a cluster ID is configured)
    Sets the Originator ID to the router ID of the peer it received the route from, if the route does not already carry an Originator ID
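The reflection rules and the two loop-detection attributes above can be sketched in a few lines of Python. This is an illustration of the logic only, under stated assumptions: peers are identified by their router ID, the class and method names are hypothetical, a route is never sent back to the peer it came from, and eBGP-learned routes also go to other eBGP peers (ordinary BGP behaviour that the slide does not draw).

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Reflector:
    router_id: str
    clients: FrozenSet[str]
    non_clients: FrozenSet[str]
    ebgp_peers: FrozenSet[str]
    cluster_id: Optional[str] = None  # defaults to the router ID

    @property
    def effective_cluster_id(self) -> str:
        return self.cluster_id or self.router_id

    def targets(self, source: str) -> FrozenSet[str]:
        """Peers that receive a route learned from `source`."""
        if source in self.ebgp_peers or source in self.clients:
            out = self.clients | self.non_clients | self.ebgp_peers
        elif source in self.non_clients:
            # Non-client routes are NOT passed to other non-clients.
            out = self.clients | self.ebgp_peers
        else:
            raise ValueError(f"unknown peer {source!r}")
        return out - {source}

    def accepts(self, route: Route) -> bool:
        """Loop check: reject our own routes and routes that already crossed our cluster."""
        if route.originator_id == self.router_id:
            return False
        return self.effective_cluster_id not in route.cluster_list

    def reflect(self, route: Route, source_router_id: str) -> Route:
        """Stamp the attributes before reflecting to iBGP peers."""
        return replace(
            route,
            originator_id=route.originator_id or source_router_id,
            cluster_list=(self.effective_cluster_id,) + route.cluster_list,
        )
```

If a reflected route ever comes back to the same reflector, `accepts` returns `False` because the reflector finds its own cluster ID in the cluster list, which is the property the AS Path cannot provide inside a single AS.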
Deploying Route Reflectors
  Use the divide and conquer approach to convert from a full iBGP mesh to route reflectors:
    Divide the network into multiple clusters, using the physical topology as a guide to the logical divisions
    Pick one router to act as the reflector in each cluster, making certain reflection follows the physical topology
    Remove redundant iBGP sessions as you configure reflectors in each cluster

Deploying Route Reflectors
  This small network has nine routers and 36 iBGP sessions.
  First, choose clusters using the physical topology as a guide.
  Next, choose reflectors based on the physical topology.
  [Figure: routers A, B, C, D, E, F, G, H, J with physical links and iBGP sessions; reflectors marked]

Deploying Route Reflectors
  Configure each client in a single cluster.
  Start with B:
    neighbor <f> route-reflector-client
    neighbor <h> route-reflector-client
    neighbor <d> route-reflector-client
  Remove extra iBGP sessions.

Deploying Route Reflectors
  Next, configure G, E, and J as route reflector clients of C:
    neighbor <g> route-reflector-client
    neighbor <e> route-reflector-client
    neighbor <j> route-reflector-client
  Remove extra iBGP sessions.

Deploying Route Reflectors
  The resulting network has nine iBGP sessions along physical links.
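The drop from 36 sessions to nine in the example above can be checked with a short calculation. The sketch below assumes each client peers only with its own reflector and that router A is a non-client fully meshed with reflectors B and C; the slides do not state A's role explicitly, so treat that as an assumption. Function names are illustrative.

```python
from typing import Dict, List


def full_mesh_sessions(n: int) -> int:
    """Sessions in a full mesh of n routers."""
    return n * (n - 1) // 2


def reflector_sessions(clusters: Dict[str, List[str]], other_non_clients: List[str]) -> int:
    """Sessions when every client peers with one reflector and the
    reflectors plus remaining non-clients form a full mesh."""
    client_sessions = sum(len(clients) for clients in clusters.values())
    core_routers = len(clusters) + len(other_non_clients)
    return client_sessions + full_mesh_sessions(core_routers)


if __name__ == "__main__":
    clusters = {"B": ["F", "H", "D"], "C": ["G", "E", "J"]}
    print("full mesh:", full_mesh_sessions(9))
    print("with reflectors:", reflector_sessions(clusters, ["A"]))
```

With these inputs the client sessions contribute six and the A, B, C mesh contributes three, which matches the nine sessions stated on the final slide of the example.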
Route Reflector Design and Redundancy
  A client may peer with more than one reflector, in different clusters.
  A client that peers with only one reflector has a single point of failure.
  Clients should peer with at least two reflectors to provide redundancy.
  How many reflectors should a single client be peered to?
  Where should the RRs be placed in the network?
  How many RRs are needed?

Route Reflector Design and Redundancy
  Redundancy is needed, but…
    Too much burns memory on RRCs (route reflector clients) because the client learns the same information from each RR
    It also burns memory on the RRs because they learn multiple paths for each route introduced by an RRC
  Two route reflectors per client should be plenty…
    …but this is not a hard and fast rule
  As with everything else… “it depends”
    PEs, RRs, SLAs, network size, network topology, etc.
    Other sessions are dedicated to this topic…

Scaling BGP Updates
  Aggregation
  Peer Groups
  Input Queue Tuning
  Path MTU Discovery

Aggregation
  Why aggregate?
  Reduce the number of Internet prefixes
    Advertise only your CIDR block
    According to some studies, about 50% of the current Internet routing table represents “leakage past aggregates”
  Increase stability
    If you aggregate properly, the aggregate will remain stable even if specific components of the aggregate come and go
    Perhaps your upstream provider will not allow the more specifics (filter long prefixes, dampening)
Aggregation
  One of the easiest ways to scale eBGP is to aggregate routing information.
  To configure aggregation in BGP, use the aggregate-address command.
  The aggregated route is created if we have at least one component:
    Components are the longer prefixes that fall within the aggregate’s range
  By default:
    The aggregate-address command only creates an aggregate
    For the newly created aggregate route, the AS-PATH is empty (NULL) and other attributes are the defaults for local routes
  aggregate-address 10.1.0.0 255.255.252.0
  Routes in AS65100:
    10.1.0.0/24 “65001”
    10.1.1.0/24 “65001 65002”
    10.1.0.0/22 “”
  Routes as seen in AS65101:
    10.1.0.0/24 “65100 65001”
    10.1.1.0/24 “65100 65001 65002”
    10.1.0.0/22 “65100”

Aggregation
  Adding the keyword summary-only causes BGP to suppress the components of the aggregate.
  Suppressed route = use it, but do not advertise it to any peer.
  aggregate-address 10.1.0.0 255.255.252.0 summary-only
  Routes as seen in AS65101:
    10.1.0.0/22 “65100”

Aggregation
  Adding the keyword as-set causes BGP to:
    Set the aggregate’s AS-PATH to an AS Set made by merging all ASes of all the components
    Additionally (not shown): merge all the communities and extended communities of all components
  aggregate-address 10.1.0.0 255.255.252.0 summary-only as-set
  Routes as seen in AS65101:
    10.1.0.0/22 “65100 {65001 65002}”

Aggregation
  Use a route map to set the aggregate’s other attributes.
  Routes in AS65100:
    10.1.0.0/24 “65001”
    10.1.1.0/24 “65001 65002”
    10.1.0.0/22 “” LP=200
  Routes as seen in AS65101:
    10.1.0.0/22 “65100 {65001 65002}”
  aggregate-address 10.1.0.0 255.255.252.0 summary-only as-set route-map foo
  route-map foo
   set local-preference 200

Aggregation
  Other aggregate commands:
    advertise-map <route-map>: selects which components are considered part of the aggregate
    suppress-map <route-map>: selects which components to suppress
    neighbor … unsuppress-map <route-map>: unsuppresses (advertises) a suppressed component towards a particular peer

Aggregation
  Creating an aggregate with an aggregate command:
    Adds the AGGREGATOR attribute (troubleshooting info with the IP and AS of the router that did the aggregation)
    If the as-set keyword is NOT used, the Atomic Aggregate attribute is also added (troubleshooting info that indicates loss of AS Path information)

Peer Groups
  What is it?
    A way to group peers with similar configuration
  Configuration of a neighbor is now done in 2 steps:
    Define a peer-group like a neighbor
      It has associated neighbor commands, policies, etc.
    Define individual neighbors as members of that peer-group
      All the configuration of the peer-group applies to the member
  Reasons for using peer-groups:
    Ease of administration
    Scaling

Peer Groups
  Ease of administration:
    Offer customers a few options for the number of routes they receive, rather than filtering per customer
    Classify peering arrangements with other providers so you only manage two or three types of connections
  Example for customer types:
    cust-default: send default route only
    cust-cust: send customer routes only
    cust-full: send full Internet routes
Peer Groups
  [Figure labels: Your AS, CIDR block 1.0.0.0/8; CORE; Route Reflector; Aggregation Router (RR Client); Core Peer Group; Client Peer Group; Full Routes Peer Group; “Default” Peer Group; Customer Routes Peer Group]

Peer Groups
  NO PEER-GROUPS
    router bgp 65000
     neighbor 10.1.1.1 remote-as 65001
     neighbor 10.1.1.1 route-map cust-receive in
     neighbor 10.1.1.1 route-map cust-default out
     neighbor 10.1.1.1 send-community
     neighbor 10.1.1.2 remote-as 65002
     neighbor 10.1.1.2 route-map cust-receive in
     neighbor 10.1.1.2 route-map cust-default out
     neighbor 10.1.1.2 send-community
     neighbor 10.1.1.3 remote-as 65003
     neighbor 10.1.1.3 route-map cust-receive in
     neighbor 10.1.1.3 route-map cust-default out
     neighbor 10.1.1.3 send-community
  PEER-GROUPS
    router bgp 65000
     ! Defining peer-groups
     neighbor cust-default peer-group
     neighbor cust-default route-map cust-receive in
     neighbor cust-default route-map cust-default out
     neighbor cust-default send-community
     ! Applying peer-groups to neighbors
     neighbor 10.1.1.1 remote-as 65001
     neighbor 10.1.1.1 peer-group cust-default
     neighbor 10.1.1.2 remote-as 65002
     neighbor 10.1.1.2 peer-group cust-default
     neighbor 10.1.1.3 remote-as 65003
     neighbor 10.1.1.3 peer-group cust-default

Peer Groups
  Peer groups also improve scaling.
  Advertising 100,000+ routes to hundreds of peers is a big scalability challenge:
    (1) Each packet to each peer must be individually formatted
    (2) Each packet to each peer must be individually transmitted
  Peer-groups make it possible to do (1) only once for all the members of the peer-group.
  GOLDEN RULE of peer-groups:
    Outbound policy MUST be the same for every member of the peer-group
    Individual peers cannot be configured with their own outbound policy
Peer Groups
  Update generation without peer groups:
    The BGP table is walked for every peer
    Updates are generated and sent to each peer
  Update generation with peer groups:
    A peer-group leader is elected for each peer group
    The BGP table is walked for the leader only
    Updates are generated once for the peer-group leader, then replicated and transmitted to the rest of the peer-group members

Peer Groups
  [Chart; caption: for the same amount of convergence time]

Beyond Peer Groups
  Today peer-groups are no longer needed, but they live on in spirit.
  Peer-groups can still be configured.
  But their two functions have been decoupled:
    Scalability: update-groups
    Administration: peer templates

Beyond Peer Groups
  Update-groups
    Software automatically groups neighbors that can be included in the same update-group
      Basically, all the neighbors that share the same outbound policy
    Only one update is formatted for each update-group
    To check how many update-groups and members are created:
      show ip bgp [<af>] update-group
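The grouping idea behind update-groups can be sketched as a simple bucketing of neighbors by outbound policy. The Python below is an illustration only: it models a neighbor's outbound policy as a tuple of strings, which is a hypothetical simplification, and the real software considers more criteria than the ones shown here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

OutboundPolicy = Tuple[str, ...]


def update_groups(neighbors: Dict[str, OutboundPolicy]) -> List[List[str]]:
    """Bucket neighbors that share an identical outbound policy.

    One update needs to be formatted per bucket instead of one per neighbor.
    """
    groups: Dict[OutboundPolicy, List[str]] = defaultdict(list)
    for address, policy in neighbors.items():
        groups[policy].append(address)
    return [sorted(members) for members in groups.values()]


if __name__ == "__main__":
    neighbors = {
        "10.1.1.1": ("route-map cust-default out", "send-community"),
        "10.1.1.2": ("route-map cust-default out", "send-community"),
        "10.1.1.3": ("route-map cust-default out", "send-community"),
        "10.1.1.4": ("route-map cust-full out", "send-community"),
    }
    for members in update_groups(neighbors):
        print(len(members), members)
```

Four neighbors collapse into two groups here, so the table walk and update formatting happen twice rather than four times; only transmission remains per neighbor.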
Beyond Peer Groups
  Peer-templates
    Configuration is similar to peer-groups:
      Define a peer-template with configuration commands
      An individual neighbor is configured to inherit commands from the peer-template
    And additionally:
      Multiple peer-templates can be applied to a neighbor
      Peer-templates can be applied to another peer-template
      No GOLDEN rule: individual peers can be configured with outbound policy
    Two types of peer templates:
      peer-session: defines session commands (update independent)
        remote-as, update-source, …
      peer-policy: defines policy commands (associated with updates)
        Route-map inbound, route-map outbound, remove-private-as, …
    Neighbors can still be grouped in update-groups
      If the “total” outbound policy is the same

Beyond Peer Groups
  Peer-template (peer-policy) example:
    router bgp 1
     template peer-policy ppol1
      route-map map1 out
      filter-list 1 in
      inherit peer-policy ppol2 10
      inherit peer-policy ppol3 5
     template peer-policy ppol2
      filter-list 2 in
      distribute-list 2 in
      route-reflector-client
     template peer-policy ppol3
      distribute-list 3 in
      next-hop-self
     address-family ipv4
      neighbor 1.1.1.10 route-map map0 out
      neighbor 1.1.1.10 inherit peer-policy ppol1
  Neighbor 1.1.1.10 for IPv4 uses:
    Route-map out = map0
    Distribute-list in = 2
    Filter-list in = 1
    It’s a route-reflector client
    Uses next-hop-self

Input Queue Tuning
  Large bursts of input packets may overflow the input hold-queue and produce input queue drops.
  BGP packets may be dropped when many BGP peers are reached via the same interface (usually an Ethernet interface).
  The final effect is that the achieved throughput is lower than the available bandwidth (the TCP congestion window is reduced).
  Solutions:
    Increase the input hold-queue: hold-queue <1-4096> in (default is only 75!)
    Give extra buffer for BGP packets (marked with precedence 6): spd headroom <0-65535> (the default in recent versions, 2000, is good)
    Use show [ip] spd to verify

Larger Input Queues
  [Chart; caption: for the same amount of convergence time. Results from increasing the interface input queue depth from 75 (default) to 1000]

TCP Path MTU Discovery
  MSS (Max Segment Size)
    Largest segment that can traverse a TCP session
    Does not include IP or TCP headers
  MSS is 536 bytes by default (in multihop sessions)
    Anything larger must be fragmented and re-assembled
  536 bytes is inefficient for Ethernet (1500) and POS (4470)
    Increases the number of IP packets
    Makes TCP work harder
    Slows BGP convergence and reduces scalability
  Solution: ip tcp path-mtu-discovery
  Another helpful command:
    show ip bgp neighbors | include max data

TCP Path MTU Discovery
  Configuring path MTU discovery between BGP peers can provide dramatic results in the speed of convergence.
  [Chart: supported peers (0 to 350) against routes (80K to 120K), with and without PMTUD; MSS increased from 536 to 1460 bytes (GE)]
  MSS formula = lowest MTU - IP overhead (20 bytes) - TCP overhead (20 bytes)

BGP Fast Convergence

Faster Convergence
  Increased focus on faster BGP convergence
    Critical for traffic such as voice and video
    VPN customers want IGP-like convergence
  Several factors influence BGP convergence:
    Detection
    Propagation
    Scalability
    Stability
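The MSS formula from the Path MTU Discovery slides is easy to work through numerically. The Python sketch below assumes 20-byte IP and TCP headers with no options, as the slide does; the 1,000,000-byte payload in the usage example is an arbitrary illustrative figure, not a measurement.

```python
IP_OVERHEAD = 20   # bytes, IPv4 header without options
TCP_OVERHEAD = 20  # bytes, TCP header without options


def mss(lowest_mtu: int) -> int:
    """MSS = lowest MTU on the path minus IP and TCP header overhead."""
    return lowest_mtu - IP_OVERHEAD - TCP_OVERHEAD


def segments_needed(payload_bytes: int, mss_bytes: int) -> int:
    """Number of TCP segments needed to carry the payload (ceiling division)."""
    return -(-payload_bytes // mss_bytes)


if __name__ == "__main__":
    payload = 1_000_000  # hypothetical amount of BGP update data
    for label, size in (("default", 536), ("Ethernet", mss(1500)), ("POS", mss(4470))):
        print(label, size, segments_needed(payload, size))
```

Moving from the 536-byte default to the Ethernet value cuts the segment count for the same data by well over half, which is the effect the chart on the slide illustrates.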
Faster Convergence
  Typically there are two scenarios where we need faster convergence:
  Single route convergence
    A bestpath change occurs for one prefix
    How quickly can BGP propagate the change throughout the network?
    How quickly can the entire BGP network converge?
    Key for VPNs and voice networks
  Bootup or “clear ip bgp *” convergence
    Most stressful scenario for BGP
    CPU may be busy for several minutes
    Limiting factor in terms of scalability
    Key for any router with a full Internet table and many peers

Convergence Basics – BGP Scanner
  BGP Scanner plays a key role in convergence.
  A full BGP table scan happens every 60 seconds
    bgp scan-time X
    Affects only some AF-dependent tasks; most tasks are still performed every 60 seconds
  The full scan performs multiple housekeeping tasks:
    Validate nexthop reachability
    Validate bestpath selection
    Route redistribution and network statements
    Conditional advertisement
    Route dampening
    BGP database cleanup
  The import scanner runs once every 15 seconds
    Imports VPNv4 routes into VRFs
    bgp scan-time import X

Convergence Basics – BGP Nexthops
  Every 60 seconds the BGP scanner recalculates the best path for all prefixes.
  Changes to the IGP cost of a BGP nexthop will go unnoticed until the scanner’s next run:
    The IGP may converge in less than a second
    BGP may not react for as long as 60 seconds
  We need to change from a polling model to an event-driven model to improve convergence:
    Polling model: check each BGP nexthop’s IGP cost every 60 seconds
    Event-driven model: BGP is informed by a third-party process when the IGP cost to a BGP nexthop changes

ATF – Address Tracking Filter
  ATF is a middle man between the RIB and RIB clients.
    RIB clients: BGP, OSPF, EIGRP, etc.
  ATF and client interaction:
    The client tells ATF to register a given IP address (e.g., an IP address that is used as a BGP next-hop)
    The RIB notifies ATF of any route modification/creation/deletion
    ATF notifies the client if the route used to look up any registered IP address changes, switches, appears or disappears

ATF – Address Tracking Filter
  BGP tells ATF to let it know about any changes to 10.1.1.3 and 10.1.1.5.
  Changes to 10.1.1.3/32 and 10.1.1.5/32 are passed along to BGP.
  ATF filters out any changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32.
  [Figure: BGP (nexthops 10.1.1.3, 10.1.1.5), ATF, and RIB (10.1.1.1/32, 10.1.1.2/32, 10.1.1.3/32, 10.1.1.4/32, 10.1.1.5/32)]

NHT – Next Hop Tracking
  BGP Next Hop Tracking
    Enabled by default
    [no] bgp nexthop trigger enable
  BGP registers all nexthops with ATF
    A hidden command will let you see the list of nexthops:
      show ip bgp attr nexthop
  ATF will let BGP know when a route change occurs (if it is of interest for a BGP nexthop)
  An ATF notification will trigger a “lightweight BGP Scanner” run
    Bestpaths will be calculated
    The rest of the “Full Scan” work will NOT happen

NHT – Next Hop Tracking
  Once an ATF notification is received, BGP waits 5 seconds (default) before triggering the NHT scan
    bgp nexthop trigger delay <0-100>
    The configured value should be the maximum time it takes for the IGP to converge
  The event-driven model allows BGP to react quickly to IGP changes
    No longer need to wait as long as 60 seconds for BGP to scan the table and recalculate bestpaths
  Tuning your IGP for fast convergence is recommended
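The ATF register-and-notify model that next-hop tracking relies on can be illustrated with a small Python sketch. Everything here is hypothetical naming for illustration, and it simplifies the real behaviour in one important way: it notifies a client when any prefix covering a registered address changes, rather than tracking the single longest-match route used to resolve that address.

```python
import ipaddress
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]  # (registered address, event)


class AddressTrackingFilter:
    """Sits between the RIB and its clients, passing on only relevant changes."""

    def __init__(self) -> None:
        self._watchers: Dict[ipaddress.IPv4Address, List[Callback]] = {}

    def register(self, address: str, callback: Callback) -> None:
        """A client (for example BGP) asks to be told about changes affecting `address`."""
        key = ipaddress.ip_address(address)
        self._watchers.setdefault(key, []).append(callback)

    def rib_changed(self, prefix: str, event: str) -> int:
        """Called for every RIB change; returns how many notifications were sent."""
        network = ipaddress.ip_network(prefix)
        sent = 0
        for address, callbacks in self._watchers.items():
            if address in network:  # change is filtered out unless it covers a registered address
                for callback in callbacks:
                    callback(str(address), event)
                    sent += 1
        return sent
```

Registering 10.1.1.3 and 10.1.1.5 and then replaying changes for 10.1.1.1/32 through 10.1.1.5/32 results in exactly two callbacks, mirroring the filtering shown in the ATF figure.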
NHT – Next Hop Tracking
  Dampening is used to reduce the frequency of triggered scans
    It prevents the “lightweight BGP scanner” from running too frequently
  show ip bgp internal
    Displays data on when the last NHT scan occurred
    Time until the next NHT scan may occur
  New commands:
    bgp nexthop trigger enable
    bgp nexthop trigger delay <0-100>
    show ip bgp attr next-hop ribfilter
    debug ip bgp events nexthop
    debug ip bgp rib-filter

Fast External Fallover
  Objective:
    Tear down the session if the interface used to reach the peering address goes down
    No need to wait for the hold timer to expire!
  When does it work?
    Only when the peering address is directly connected
    Only for eBGP peers
    ebgp-multihop OR disable-connected-check can NOT be configured
  Configuration:
    ON by default
    Under router bgp: [no] bgp fast-external-fallover
    Under interface (takes priority over the router configuration): [no] ip bgp fast-external-fallover {permit|deny}
  Recommended if the interface goes down during the failure

Fast Session Deactivation (FSD)
  Objective:
    Tear down the session if the route to reach the peering address disappears
    No need to wait for the hold timer to expire!
  How does it work?
    BGP registers the peering address with ATF (similar to NHT)
    It’s triggered immediately (trigger delay = 0 and cannot be configured)
  Configuration:
    OFF by default
    Under router bgp: neighbor <neighbor-ip> fall-over
  Recommended for multihop eBGP peers known via the IGP
  Very dangerous for iBGP peers
    If we lose the route for a split second, we bring the peer down!
    iBGP sessions usually re-route!
Scalability Update – Overview
  Bootup convergence (or convergence after “clear ip bgp *”) is the biggest challenge:
    Must receive updates from all peers
    Must compute all best-paths
    Must format updates for all peers
    Must transmit updates to all peers
  To improve the process:
    Make sure that you don’t start computing best-paths until you have received updates from all peers
    All the peers will send you a keepalive (KA) or an End-of-RIB (EOR) marker when they have finished sending you their updates
    Maximum timer: bgp update-delay <1-3600> (default 120)
      Increase it if your network takes a lot of time to converge
      Depends on the number of routes, the number of peers and the specific platform

NSR – Non Stop Routing
  NSR and NSF (Non Stop Forwarding) are not the same
    Both allow a restarting speaker to continue forwarding
    Usually, the FIB is distributed and not affected while the main RP is restarting
  NSF in a nutshell:
    Needs the support of (NSF-aware) peers
    Peers are aware that the restarting speaker keeps forwarding while restarting and don’t delete the routes towards it
    BGP extensions required: GR (Graceful Restart)
    Not a challenge within an AS
    PE-CE is a problem
      Upgrading CEs is a huge deployment challenge

NSR – Non Stop Routing
  NSR in a nutshell:
    Provides forwarding and preserves routing during Active RP failover to the Standby RP
      This is called an SSO (stateful switchover)
    BGP peers’ TCP sessions are maintained
    BGP extensions: NOT required
    CEs do not need to be upgraded!
NSR – Non Stop Routing
  Deployment challenges:
    NSF is easy to implement inside an AS
      All the routers can be upgraded to support GR
      The problem is the CEs (upgrading them to support GR can be a huge deployment challenge)
    NSR is easier to implement
      No need to upgrade CEs
      PE uses NSR with CEs that are not NSF-aware
      PE uses NSF with NSF-aware CEs
      PE uses NSF with RRs (NSF-aware)

NSR – Non Stop Routing
  Simplified deployment for service providers
    Only PEs need to be upgraded to support NSR (incremental deployment)
    CEs are not touched! (i.e., no software upgrade required)

NSR – Related Commands
  show ip bgp vpnv4 all sso summary
    Used to display the number of BGP peers that support Cisco BGP NSR
      Router# show ip bgp vpnv4 all sso summary
      Stateful switchover support enabled for 40 neighbors

Route Flap Dampening
  Defined in RFC 2439
  Route flap:
    The bouncing of a path or a change in its characteristics
    A flap ripples through the entire Internet
    Consumes CPU cycles, causes instability
  Solution:
    Reduce the scope of route flap propagation
    Suppress oscillating routes (history predicts future behavior)
  Only eBGP routes are dampened

Route Flap Dampening
  Flap: every time we receive a withdrawal or a change of attributes for a given route
    Withdrawal: we increase the penalty by 1000
    Change of attributes: we increase the penalty by 500
  To suppress (dampen) a route:
    The accumulated penalty must be greater than the suppress-limit
  To reuse (“undampen”) a route:
    The penalty decreases exponentially
    When it reaches the reuse-limit, we use the route again
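The penalty arithmetic above can be worked through with a short Python sketch. The per-flap penalties come from the slide; the half-life of 15 minutes, reuse limit of 750 and suppress limit of 2000 are assumed here as commonly cited defaults and are not stated in this deck, so substitute the values configured on your platform.

```python
import math

WITHDRAW_PENALTY = 1000          # from the slide
ATTRIBUTE_CHANGE_PENALTY = 500   # from the slide

# Assumed parameters (not given in the slides).
HALF_LIFE_MIN = 15.0
REUSE_LIMIT = 750.0
SUPPRESS_LIMIT = 2000.0


def decayed(penalty: float, minutes: float, half_life: float = HALF_LIFE_MIN) -> float:
    """Exponential decay: the penalty halves every `half_life` minutes."""
    return penalty * 0.5 ** (minutes / half_life)


def is_suppressed(penalty: float, suppress_limit: float = SUPPRESS_LIMIT) -> bool:
    """A route is dampened once its penalty exceeds the suppress limit."""
    return penalty > suppress_limit


def minutes_until_reuse(penalty: float, reuse_limit: float = REUSE_LIMIT,
                        half_life: float = HALF_LIFE_MIN) -> float:
    """Time for the penalty to decay down to the reuse limit."""
    if penalty <= reuse_limit:
        return 0.0
    return half_life * math.log2(penalty / reuse_limit)


if __name__ == "__main__":
    penalty = 3 * WITHDRAW_PENALTY  # three withdrawals in quick succession
    print(is_suppressed(penalty), minutes_until_reuse(penalty))
```

Under these assumed parameters, three rapid withdrawals push the penalty to 3000, the route is suppressed, and it takes two half-lives for the penalty to fall to the reuse limit, provided the route stays stable in the meantime.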
Route Flap Dampening
  [Chart: penalty (0 to 4000) against time (0 to 25), with suppress-limit and reuse-limit thresholds; phases marked: network announced, network not announced, network re-announced]

Route Flap Dampening
  Benefits
    Reduces CPU load
    Does not propagate local flaps to the whole Internet
    Troubleshooting PLUS:
      Makes all these local flaps (routes that have been suppressed) visible

Route Flap Dampening
  Guidelines
  RIPE-229:
    “Progressive” dampening: more aggressive for longer prefixes
    Needs to be coordinated
    Some parameters are recommended
    “Golden” networks (like gTLD name servers) should not be dampened
    Apply as close as possible to the prefix being advertised
      Peering, upstream, customer boundaries
    No need to dampen routes from customers that use Provider Aggregated addresses

Route Flap Dampening
  Guidelines
  RIPE-378:
    Internet today:
      A single “normal” withdraw/update can be amplified into many withdraws/updates a few hops away
      Route dampening would keep these prefixes unreachable unnecessarily
    Routers today:
      Their processing power makes them more tolerant of route flapping
    Recommendation:
      Do NOT implement route dampening

A Little BGP “Show and Tell”
  [Figure labels: AS 1 with R1 (Loop0: 10.1.1.1) and R2 (Loop0: 10.1.1.2), 10.1.2.0/24; AS 3 with R3 (Loop0: 10.3.3.3), 10.3.1.0/30; AS 4 with R4 (Loop0: 10.4.4.4), 10.4.1.0/30; Internet, 10.201-249 /16; interfaces 1/0 and 2/0]