Swinog-3, 19 September 2001

BGP Oscillation
…the Internet routing protocol is diverging!

Fabien Berger, CCIE#6143
IP-Plus Backbone Engineering
Fabien Berger, berger@ip-plus.net    Swinog-3, BGP Oscillation


Well known issue?

Does a BGP system always converge?
  – NO!
  – feature not a bug :)
Researchers have shown theoretical eBGP convergence issues
  – [Griffin]: “bad gadget” topology diverges! backup scenario diverges!
iBGP diverges in complex RR/confederation environment (draft-ietf-idr-route-oscillation-00.txt)


Goal of the presentation

make you aware of the issue (before the customer :)
  – troubleshooting not easy
pointer to solutions/discussions
presentation based on [NANOG] [Cisco] [IETF]


Convergence

convergence = “process of bringing all route tables to a state of consistency”
  – no loops!
does not converge -> you see (on a RR or a confed border):

  #show ip bgp 10.0.0.0 | include best
  # Paths: (3 available, best #3)
  #show ip bgp 10.0.0.0 | include best
  # Paths: (3 available, best #2)
  #show ip bgp 10.0.0.0 | include best
  # Paths: (3 available, best #3)
  ...
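The repeated "show ip bgp" check above can be automated. The following is a minimal Python sketch, assuming the command output has already been captured as text, one sample per run. The names BEST_RE, best_path_indices and count_best_changes are hypothetical helpers, not part of any Cisco tool. Note that the "best #N" value is a position in the path list, so it can also change when a path is added or removed; a flapping index is a hint, not proof of an oscillation.

```python
import re

# Matches the summary line shown on the slide,
# e.g. "Paths: (3 available, best #3)".
BEST_RE = re.compile(r"Paths:\s*\((\d+) available, best #(\d+)")


def best_path_indices(samples):
    """Return the best-path index found in each captured output sample."""
    indices = []
    for text in samples:
        match = BEST_RE.search(text)
        if match:
            indices.append(int(match.group(2)))
    return indices


def count_best_changes(indices):
    """Count how often the best path differs between consecutive samples."""
    return sum(1 for prev, cur in zip(indices, indices[1:]) if prev != cur)


# Samples copied from the slide (hypothetical capture, one string per run).
samples = [
    "# Paths: (3 available, best #3)",
    "# Paths: (3 available, best #2)",
    "# Paths: (3 available, best #3)",
]

indices = best_path_indices(samples)
print(indices, count_best_changes(indices))
```

A best path that changes on every run, as in the three samples above, is the pattern the slide describes for a prefix that does not converge.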
Cause of the Oscillation

RR/confederation hides some information
  – RR/confederation sends best path only
  – not all routers know all best paths
MED (Multi Exit Discriminator) vs IGP cost to the neighbor:

  A: path 200 100, igp cost 5,   med 2
  B: path 300 100, igp cost 50
  C: path 200 100, igp cost 500, med 1

  if A,B,C are known: B is best (assuming “deterministic-med” is enabled [detMED] :)
  if C is hidden: A is best
  A<B<C<A


Oscillation

[Figure, repeated on each step slide: Cluster 1 with route reflector B and Cluster 2 with route reflector C; clients A, D, E; external routes from AS X, AS Y (MED 0) and AS Y (MED 1); IGP link costs 1, 10, 3, 2. Legend: route reflector, client, advertisement, withdrawal. In the tables, * marks the selected path.]

Step 1
  – B selects Y0
  – C selects Y1

  Router  Best  AS_PATH  MED  IGP
  B       *     Y        0    10
  C             X             3
  C       *     Y        1    2

Step 2
  – C selects X

  Router  Best  AS_PATH  MED  IGP
  B             Y        1    3
  B       *     Y        0    10
  C       *     X             3
  C             Y        1    2
  C             Y        0    11

Step 3
  – B selects X

  Router  Best  AS_PATH  MED  IGP
  B       *     X             4
  B             Y        0    10
  C       *     X             3
  C             Y        1    2
  C             Y        0    11

Step 4
  – C selects Y1

  Router  Best  AS_PATH  MED  IGP
  B       *     X             4
  B             Y        0    10
  C             X             3
  C       *     Y        1    2

Step 5
  – B selects Y0 = Step 1!!

  Router  Best  AS_PATH  MED  IGP
  B       *     Y        0    10
  C             X             3
  C       *     Y        1    2


How to detect an oscillation?
Observe the latest received routes:
  – run every minute for 5 minutes
      #show ip route | include ^B_.*_00:00:
  – prefixes that appear 60% of the time are probably oscillating
  – full routing table must be traversed :(
Via SNMP:
  – poll ipRouteAge of ipRouteTable
Observation should be made in the core (top level RR, backbone sub-AS)
  – eBGP within a confed applies flap damping
  – RR client may see only the replacement route


Shall we care?

MED usage
  – 34% of the prefixes we receive have the MED set
  – 75% of our peers have > 1 prefix with MED set
Potential AS that can oscillate (AS received via > 2 peers)
  – 60% (upper bound, as-path not taken into account!)
Oscillation not propagated to customers because of damping
Oscillation seen in our backbone :( but cured :)


Solutions

configure bgp deterministic-med
full iBGP mesh when you can
do not listen to the MED (or only with stub-AS)
  – set metric 0 on all prefixes
  – bgp always-compare-med
use local-pref to force decision
  – exit no longer chosen by peer = more work :(
allow peer to set local-pref using community
protocol improvement
  – RR/confederation should send more than just the best path
  – closer to the iBGP full mesh :(


Conclusion

It’s happening today :(
It is possible to detect
Solutions (fixes) exist today
Protocol improvement on the way by IETF


References

[Cisco] http://www.cisco.com/warp/customer/770/fn12942.html
[IETF] draft-ietf-idr-route-oscillation-00.txt, www.ietf.org
[NANOG] NANOG 21 Atlanta February 2001, www.nanog.org
[Griffin] http://www.research.att.com/~griffin/
[detMED] http://www.cisco.com/warp/public/459/37.html
[bgpDecision] http://www.cisco.com/warp/public/459/25.shtml


BGP Oscillation

comments?
questions?
experiences?


BGP Decision Process [bgpDecision]

1. Largest weight
2. Largest local preference
3. Locally originated
4. Shortest AS-Path length
5. Lowest origin
6. Lowest Multi Exit Discriminator (Cisco default = 0 unless “bgp bestpath missing-as-worst”)
7. Prefer EBGP over IBGP (conf EBGP=IBGP)
8. Lowest IGP metric
9. Lowest BGP router ID
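The interaction between step 6 (MED) and step 8 (IGP metric) that drives the oscillation can be sketched in Python with the three example paths A, B, C from the "Cause of the Oscillation" slide. This is a simplified illustration, not the router implementation: it models only steps 4, 6 and 8, compares MED only between paths from the same neighbor AS, and approximates "deterministic-med" by picking the best path per neighbor AS before comparing across ASes. Path, better and best_path are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Path:
    name: str
    as_path: Tuple[int, ...]  # e.g. (200, 100); first entry is the neighbor AS
    igp_cost: int
    med: int = 0              # a missing MED is treated as 0 (step 6 default)


def better(a: Path, b: Path) -> Path:
    """Pairwise comparison using only steps 4, 6 and 8 of the decision process."""
    if len(a.as_path) != len(b.as_path):           # step 4: shortest AS path
        return a if len(a.as_path) < len(b.as_path) else b
    if a.as_path[0] == b.as_path[0] and a.med != b.med:
        return a if a.med < b.med else b           # step 6: same neighbor AS only
    if a.igp_cost != b.igp_cost:                   # step 8: lowest IGP metric
        return a if a.igp_cost < b.igp_cost else b
    return a                                       # tie: remaining steps omitted


def best_path(paths: List[Path]) -> Path:
    """Best path per neighbor AS first, then the best among those winners."""
    per_as: Dict[int, Path] = {}
    for p in paths:
        key = p.as_path[0]
        per_as[key] = better(per_as[key], p) if key in per_as else p
    winners = list(per_as.values())
    best = winners[0]
    for p in winners[1:]:
        best = better(best, p)
    return best


A = Path("A", (200, 100), igp_cost=5, med=2)
B = Path("B", (300, 100), igp_cost=50)
C = Path("C", (200, 100), igp_cost=500, med=1)

print(best_path([A, B, C]).name)  # all three paths known
print(best_path([A, B]).name)     # C hidden by a RR/confederation
print(better(A, B).name, better(B, C).name, better(C, A).name)
```

With all three paths known the sketch selects B; with C hidden it selects A. The pairwise results (A over B, B over C, C over A) show that the preference is not a total order, which is why the outcome depends on which paths a router happens to see.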