Deploying and troubleshooting BGP

Deploying and Troubleshooting BGP Networks
© 2000, Cisco Systems, Inc.
Agenda
CCIE’00 Paris
• Basics
• Peering
• Attributes and Route Selection Algorithm
• Prefix Generation and Aggregation
Agenda (cont)
• Soft Reconfiguration
• Internal mesh reduction
• MP-BGP
Basics
Autonomous system
[Figure: routers A, B, C, D, E in AS 123, AS 456, AS 678]
• Collection of networks under a single technical administration
• Range: 1 to 65,535 (private: 64512 to 65534)
Autonomous systems
[Figure: two stub ASs and an ISP]
Autonomous systems
• Multihomed Nontransit AS
[Figure: AS 1, AS 2, AS 3]
Autonomous systems
• Multihomed Transit AS
[Figure: AS 1, AS 2, AS 3]
BGP session
• BGP session established on top of TCP (port 179)
• Reliable transport layer
• TCP needs a routing layer (IGP)
BGP table
[Figure: IGP, BGP, FIB]
• BGP uses a database (the BGP table)
• Databases are exchanged after session setup
• Incremental updates afterwards
Generalities
• BGP supports CIDR
• NLRI: Network Layer Reachability Information
Information carried and exchanged by BGP
iBGP vs eBGP
• eBGP is used to exchange NLRI
between Autonomous Systems
• iBGP is used to carry NLRI within
the Autonomous System
• A BGP router has internal and/or
external neighbors
iBGP vs eBGP
[Figure: AS 1 and AS 2, with the iBGP and eBGP sessions labelled]
General operation
• Learns multiple paths via internal
and external BGP speakers
• Picks THE best path and installs it in
the IP forwarding table
• Policies applied by influencing the
best path selection
General operation
• A BGP speaker advertises only the routes that it uses itself
“hop-by-hop” routing paradigm
• Runs over a reliable transport protocol
no need to implement fragmentation, retransmission, ACKs, or sequencing
assumes a “graceful” close: all outstanding data will be delivered
Information Transfer
• Learned from eBGP -> advertise to all peers
• Learned from iBGP -> advertise only to eBGP peers
a full iBGP mesh is required!!
• Propagate ONLY the best path
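The advertisement rules above can be sketched as a small predicate. This is only an illustration, not router code: the function name and the string labels "ebgp"/"ibgp" are hypothetical, and locally originated routes, route reflectors, and confederations are deliberately left out.

```python
def should_advertise(learned_from: str, peer_type: str, is_best: bool) -> bool:
    """Return True if a path may be sent to a peer under the basic BGP rules.

    learned_from and peer_type are "ebgp" or "ibgp" (labels are illustrative).
    """
    if not is_best:
        # Only the best path for a prefix is ever propagated.
        return False
    if learned_from == "ebgp":
        # eBGP-learned paths go to every peer, internal or external.
        return True
    # iBGP-learned paths are passed on to eBGP peers only, which is
    # why a full iBGP mesh is needed inside the AS.
    return peer_type == "ebgp"
```

For example, `should_advertise("ibgp", "ibgp", True)` is False: an iBGP speaker does not relay iBGP-learned paths to another iBGP speaker.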
When should you use BGP?
• Most appropriate for
Multihomed transit and non-transit AS
Scaling large networks
Deploying new IP-VPN services (MBGP)
• Not appropriate on stub AS (static
route instead)
Peering
Peers
[Figure: routers A, B, C, D, E in AS 100, AS 101, AS 102]
BGP message types
• OPEN
• UPDATE
• NOTIFICATION
• KEEPALIVE
• size: 19 to 4096 octets
Open message
Fields, in order:
Version (1 byte)
My Autonomous System (2 bytes)
Hold Time (2 bytes)
BGP Identifier (4 bytes)
Opt Parm Len (1 byte)
Optional Parameters (variable)
Hold time = max time (in seconds) that may elapse between the receipt of successive KEEPALIVE or UPDATE messages.
Negotiated when the session starts
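A minimal sketch of the hold-time negotiation, assuming the BGP-4 rule that both sides use the smaller of the two proposed values, that a non-zero value must be at least 3 seconds, and that keepalives are sent at roughly one third of the hold time. The function name is hypothetical.

```python
def negotiate_hold_time(local: int, remote: int) -> tuple:
    """Return (hold_time, keepalive_interval) in seconds for a session.

    Both peers propose a hold time in their OPEN message; the smaller
    value is used. A hold time of 0 disables keepalives.
    """
    hold = min(local, remote)
    if hold != 0 and hold < 3:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    # Keepalive interval is conventionally one third of the hold time.
    return hold, hold // 3
```

With proposals of 180 and 90 seconds the session would run with a 90-second hold time and 30-second keepalives.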
Notification message
Fields, in order: Error code (1 byte), Error subcode (1 byte), Data (variable)
Error code and error subcodes:
1 - Message header error
1: Connection not synchronized
2: Bad message length
3: Bad message type
2 - OPEN message error
1: Unsupported version number
2: Bad peer AS
3: Bad BGP identifier
4: Unsupported optional parameter
5: Authentication failure
6: Unacceptable hold time
3 - UPDATE message error
1: Malformed attribute list
2: Unrecognized well-known attribute
3: Missing well-known attribute (…)
4 - Hold timer expired: NA
5 - Finite state machine error: NA
6 - Cease: NA
Update message
Fields, in order:
Unfeasible Routes Length
Withdrawn routes (variable length): unreachable routes
Total Path Attribute length
Path Attributes (variable length)
NLRI: list of (Length, Prefix) entries, prefix of variable length (…)
Path Attributes
• 4 Categories:
Well-Known mandatory (ex: AS_Path, next-hop, origin)
Well-Known discretionary (ex: local pref)
Optional transitive: should be passed along even if not supported (ex: community, aggregator)
Optional nontransitive (ex: MED)
Keepalive message
• 19 Byte BGP header with no data
• Periodically exchanged.
• Hold time = max time between
successive Keepalive and
Update messages.
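As an illustration of the 19-byte header, here is a sketch that builds a KEEPALIVE message. It assumes the standard BGP-4 header layout (16-byte marker of all ones, 2-byte length, 1-byte type, with type 4 meaning KEEPALIVE); the constant and function names are hypothetical.

```python
import struct

MARKER = b"\xff" * 16   # 16-byte marker field, all ones
TYPE_KEEPALIVE = 4      # message type code for KEEPALIVE
HEADER_LEN = 19         # marker (16) + length (2) + type (1)


def build_keepalive() -> bytes:
    """Return the bytes of a KEEPALIVE: a bare header with no data."""
    # "!HB" = network byte order, unsigned 16-bit length, unsigned 8-bit type.
    return MARKER + struct.pack("!HB", HEADER_LEN, TYPE_KEEPALIVE)
```

The result is exactly 19 bytes long, which is also the minimum BGP message size quoted earlier.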
Neighbor negotiation’s finite state machine
States: Idle (START), Connect, Active, OpenSent, OpenConfirm, Established (GOAL)
• Idle
Start event (inc: reset) -> Connect
• Connect: BGP is waiting for the transport session to start
TCP session successful -> OpenSent
TCP session not OK -> Active
Connect retry timer expires -> new TCP session
• Active: BGP tries to establish TCP session and listens for other potential peers
TCP session successfully established -> OpenSent
Connect retry timer expires -> Connect
• OpenSent: Open message sent. BGP waits for neighbor’s open message
If Open mess. OK, send a Keepalive -> OpenConfirm
In case of error (ex: bad version) -> Notification message sent -> Idle
If TCP disconnect received -> Active
• OpenConfirm: BGP waits for Keepalive
Keepalive received -> Established
Notification message received -> Idle
• Established: sends periodic Keepalives
If notification message received or sent -> Idle
Troubleshooting tip:
A neighbor state flip-flopping between connect and active indicates a problem with the TCP session.
Use extended ping to check
eBGP Peering
• BGP speakers in different AS
• Should be directly connected
[Figure: router A (AS 109, 131.108.0.0/16) at .1 and router B (AS 110, 150.10.0.0/16) at .2 on 131.108.10.0/24]
• Configuration:
Router A
router bgp 109
network 131.108.0.0
neighbor 131.108.10.2 remote-as 110
Router B
router bgp 110
network 150.10.0.0
neighbor 131.108.10.1 remote-as 109
eBGP Peering
• Non directly connected neighbors -> ebgp-multihop
[Figure: router A (AS 109, 131.108.0.0/16) at .1 and router B (AS 110) at .2 on 131.108.10.0/24; router B at .1 on 150.10.0.0/16]
• Configuration:
Router A
router bgp 109
neighbor 150.10.0.1 remote-as 110
neighbor 150.10.0.1 ebgp-multihop
!
ip route 150.10.0.1 255.255.255.255 131.108.10.2
Router B
router bgp 110
neighbor 131.108.10.1 remote-as 109
neighbor 131.108.10.1 update-source ethernet 0
iBGP Peering
• BGP speakers in same AS
• Use loopback interfaces
-> Update source loopback 0
[Figure: router A (.1) and router B (.2, loopback 10.0.0.2/32) on 131.108.10.0/24 in AS 123]
• Configuration:
Router A
router bgp 123
neighbor 10.0.0.2 remote-as 123
Router B
router bgp 123
neighbor 131.108.10.1 remote-as 123
neighbor 131.108.10.1 update-source loopback 0
Load Balancing across parallel links
• Use of <ebgp-multihop>
• Use the loopback on both routers
• Define IGP between the loopback interfaces in DMZ
[Figure: AS 201 and the ISP joined by parallel links]
• Configuration:
router bgp 201
neighbor x.x.x.x remote-as ISP-AS
neighbor x.x.x.x update-source loopback0
neighbor x.x.x.x ebgp-multihop
!
ip route x.x.x.x 255.255.255.255 next-hop0/1
ip route x.x.x.x 255.255.255.255 next-hop0/2
Typical issue with eBGP multihop
• Use specific static routes
ex: ip route x.x.x.x 255.255.255.255 nexthop0/1
• Without a specific (host) static route, you could end up learning via BGP a better prefix (longer match) for reaching the neighbor.
-> Session restarts continuously.
[Figure: AS 201 and the ISP]
MultiPath Support
• Router peering with multiple routers in neighboring AS
• Install multiple routes in IP routing table
• Routes should be identical
• Next-hop is set to self (use loopback interface)
[Figure: router A in AS 201 and routers D and F in the ISP]
MultiPath Support (Cont.)
• Configuration:
router bgp 201
neighbor 141.153.12.1 remote-as 2
neighbor 141.153.17.2 remote-as 2
maximum-paths 2
• <sh ip route>
B 144.10.0.0/16 [20/0] via 141.153.12.1, 00:03:29
                [20/0] via 141.153.17.2, 00:03:29
[Figure: router A in AS 201 and routers D and F in the ISP]
Summary Typical Peering issues
• Extended ping fails -> IGP issue
• Update source missing
• No directly connected route to neighbor (eBGP) + forgot ebgp-multihop
• ebgp-multihop but wrong (or not specific enough) static route to neighbor
Attributes and Route Selection Algorithm
BGP Attributes
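Attribute                         Category
AS-path                           WKM
Next-hop                          WKM
Origin                            WKM
Local preference                  WKD
Atomic aggregate                  WKD
Aggregator                        OT
Community                         OT
Multi Exit Discriminator (MED)    ONT
(WKM = well-known mandatory, WKD = well-known discretionary, OT = optional transitive, ONT = optional nontransitive)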
Synchronization
“In a transit network, a route learned from an external peer should not be advertised to other eBGP peers until all the routers in the local AS have learned about it.”
Synchronization
[Figure: AS 690, AS 1880, AS 209; routers A and B]
• Rtr A won’t advertise the prefixes from AS209 until the IGP converges.
• Turn synchronization off!
next-hop has to be known via IGP
router bgp 1880
no synchronization
Synchronization
[Figure: AS 690, AS 1880, AS 209; routers A, B, and C]
• Rtr A won’t advertise the prefixes from AS209 until the IGP converges.
• Solutions:
redistribute into IGP (NOT!)
run BGP in rtr B
no synchronization
• Why?
not a transit network
all routers in transit path run BGP
• Advantages
carry fewer routes in IGP
BGP converges faster
NEXT_HOP
• The next hop to reach a network
eBGP: IP address of the peer
iBGP: NEXT_HOP advertised by eBGP
IGP should carry route to NEXT_HOPs
• Recursive route lookup
Unlinks BGP from the physical topology
Allows IGP to make intelligent forwarding decision
Unreachable next-hop -> route not used
[Figure: router A (AS 109, 131.108.0.0/16) at .1 and router B (AS 110, 150.10.0.0/16) at .2 on 131.108.10.0/24]
Third-Party NEXT_HOP
• Example:
A and B are in the same AS
Router A will advertise 192.68.1.0/24 with a NEXT_HOP of 150.1.1.3.
• More efficient!
[Figure: router C in AS 200; routers A and B (192.68.1.0/24) in AS 201; addresses 150.1.1.1, 150.1.1.2, 150.1.1.3 on the shared segment]
Third-Party NEXT_HOP
• Use of <next-hop-self>
• Example:
A and B are in the same AS
Router A will advertise 150.10.0.0 with a NEXT_HOP of 131.108.10.1, but router C can’t reach the next-hop!!
[Figure: routers A (.2), B, and C on Frame Relay network 131.108.10.0 (other addresses .1 and .3); network 150.10.0.0]
• Configuration (rtr A):
router bgp 109
network 150.10.0.0
neighbor 131.108.10.3 next-hop-self
Override Third-Party Next-Hop
• Alternative to configuring a specific
IP address to be the next-hop for
BGP routes
• Syntax (route-map command):
set ip next-hop peer-address
Override Third-Party Next-Hop (Cont.)
• set ip next-hop: best used on outbound route-map
• Be careful when manipulating next-hop and default routes. Routing loops can occur!
Solution: Good network design
WEIGHT
• Cisco specific (sort of router’s
internal local preference)
• Local to the router
Not propagated
• value: 0 - 65535
• Default:
originated locally = 32768
other = 0
LOCAL_PREF
• Indication of preferred path to exit the
local AS
• Global to the local AS
• Paths with highest LOCAL-PREF are
most desirable (default = 100)
bgp default local-preference value
LOCAL_PREF (Cont.)
• Configuration (rtr A):
router bgp 109
neighbor x.x.x.x remote-as 1880
neighbor x.x.x.x route-map foo in
!
route-map foo permit 10
match as-path 2
set local-preference 120
!
ip as-path access-list 2 permit ^1880_
[Figure: AS 690, AS 666, AS 1755, AS 1880, router A; label: Needs to go to 690]
AS_PATH
• AS-PATH contains the list of ASs the update had to traverse.
• AS-PATH is updated by the sending router with its own AS number.
• BGP uses the AS-PATH to detect routing loops.
AS_PATH
• Each time the router receives an eBGP update it checks the AS-PATH.
• If it finds its own AS number in the AS-PATH, the update is discarded.
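The loop check described above amounts to a membership test on the received AS-PATH. A minimal sketch follows; the function name is hypothetical and the AS-PATH is modelled as a plain list of AS numbers, ignoring AS_SET segments.

```python
def accept_ebgp_update(local_as: int, as_path: list) -> bool:
    """Return False when the update has already crossed the local AS.

    as_path is the list of AS numbers carried in the update,
    most recently added AS first.
    """
    return local_as not in as_path
```

A router in AS 1880 would reject an update carrying the path 200 690 1880, since its own AS number is already present.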
AS_PATH
[Figure: router A in AS 1880 (141.253.10.0/24), router B in AS 690, router C in AS 200]
1. Router A sends update for 141.253.10.0/24 with AS_PATH: 1880
2. Router B sends update for 141.253.10.0/24 with AS_PATH: 690 1880
3. Router C sends update for 141.253.10.0/24 with AS_PATH: 200 690 1880
4. Router A will detect its own AS number and will discard the update
AS_PATH manipulation
AS-PATH prepending
[Figure: You, ISP 1, ISP 2, Internet]
Problem: 80% of the incoming traffic comes from ISP 1
AS_PATH manipulation
AS-PATH prepending
Solution:
route-map prepend permit 10
match as-path 2
set as-path prepend 250 250
[Figure: AS 250, ISP 1, ISP 2, Internet; advertised paths labelled As-path: 250 250 250 and As-Path: 250]
AS_PATH manipulation
Private-AS Removal
• neighbor x.x.x.x remove-private-AS
available for eBGP neighbors only
Update must have AS_PATH exclusively made up of private-AS numbers.
Confederations: private AS will be removed only if it comes after the confederation’s set of ASs
remove-private-AS will not work if the private ASN you want to remove is the neighboring one!
Private-AS - Application
• Applications include:
ISP with singlehomed customers
Scaling big corporate networks
[Figure: AS 65001 193.0.32.0/24, AS 65002 193.0.33.0/24, AS 65003 193.2.35.0/24, AS 1880 193.1.34.0/24; router A advertises 193.1.32.0/22 with AS_PATH 1880]
misc issue with AS_PATH
• Error message: %BGP-3-INSUFCHUNKS: Insufficient chunk pools for aspath
• Router keeps working fine!!!
• Appears when router gets an update with AS_PATH > 50 AS
• Since 12.0(11) and 12.1(2), only appears when AS_PATH > 125
ORIGIN
• Origin of the prefix
• Values:
IGP (i) = via network command
EGP (e) = learned from EGP
incomplete (?) = redistribution
Multi-Exit Discriminator (MED)
• Indication (to external peers) of the
preferred path into an AS
used in multiple entry AS
non-transitive
• Compared only for routes from the
same AS
• Lower MED value is more preferable
MED
[Figure: AS 690, AS 1755, AS 1880, AS 209; routers A and B]
• Configuration (rtr B):
router bgp 1755
neighbor x.x.x.x remote-as 1880
neighbor x.x.x.x route-map set_MED out
!
route-map set_MED permit 10
match as-path 2
set metric 2
!
ip as-path access-list 2 permit _690$
MED & IGP Metric
• set metric-type internal
enables BGP to advertise a MED that corresponds to the IGP metric value
changes are monitored (and readvertised if needed) every 600s
bgp dynamic-med-interval <secs>
MED Comparison
• MED is compared ONLY for prefixes received
from the same AS
(unless bgp always-compare-med is enabled)
• If the AS_PATH is made up of only
confederation sub-ASs, its length is not
considered AND the MED is not compared
• If an update is received with no MED, the
router (by default) assigns it a value of 0
Community Attribute
RFC 1997
• Used to group destinations and apply a common policy
• Each prefix can belong to multiple communities
• Not propagated by default
neighbor ip-address send-community
Community Attribute (Cont.)
• 32 bits long
use the high-order 16 bits to indicate the ASN
ip bgp-community new-format
set community AS:community [additive]
set community none
erase all the values in the attribute
set comm-list <number> delete
erase selected communities
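To make the AS:community notation concrete, the sketch below converts between the text form and the 32-bit value, assuming the usual split of 16 high-order bits for the ASN and 16 low-order bits for the locally defined value. The function names are hypothetical.

```python
def encode_community(text: str) -> int:
    """Turn "ASN:value" into the 32-bit community number."""
    asn_text, value_text = text.split(":")
    asn, value = int(asn_text), int(value_text)
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    # ASN occupies the high-order 16 bits, the value the low-order 16.
    return (asn << 16) | value


def decode_community(number: int) -> str:
    """Turn a 32-bit community number back into "ASN:value"."""
    return f"{number >> 16}:{number & 0xFFFF}"
```

Decoding the result of `encode_community("109:120")` gives back "109:120", which is the form displayed once the new format is enabled.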
Well-Known Communities
• internet = all routes are members of
this community
• no-export = do not advertise to
eBGP peers
• no-advertise = do not advertise to
any peer
• local-AS = do not advertise outside
local AS (used with confederations)
No-Export Community
[Figure: AS 100 and AS 200 with routers A to G; prefix labels: 170.10.0.0/16, 170.10.X.X No-Export, 170.10.X.X]
Extended Community Attribute
draft-ramachandra-bgp-ext-communities-01
• Extended range
8 Bytes (64 bits)
• Structure
type:value
Value may be of the form AS:xxx
BGP Path Selection
• 1 Only consider paths with reachable NEXT_HOPs
• 2 Do not consider iBGP path if not synchronized
• 3 Highest WEIGHT
• 4 Highest LOCAL_PREF
• 5 Prefer locally originated route
• 6 Shortest AS_PATH
BGP Path Selection
• 7 Lowest ORIGIN code: IGP < EGP < incomplete
• 8 Lowest Multi-Exit Discriminator (MED)
8a IF bgp always-compare-med, then compare
it for all paths
8b Considered only if paths are from the same
neighbor AS
• 9 Prefer an External path over an Internal one
• 10 Lowest IGP metric to the NEXT_HOP
BGP Path Selection (Cont.)
• 11 IF multipath is enabled, the router may install
up to N parallel paths in the routing table
• 12 For eBGP paths, select the “oldest” to minimize
route-flap
• 13 Lowest Router-ID
Originator-ID is considered for reflected routes
• 14 Shortest Cluster-List
Client must be aware of RR attributes!
• 15 Lowest neighbor IP address
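The ordering of the selection steps can be sketched as a sort key. This is a simplified illustration, not the router's implementation: it covers steps 1, 3 to 10, and 13 only, it compares MED across all paths (that is, it assumes always-compare-med behaviour), and the `Path` class and its field names are hypothetical.

```python
from dataclasses import dataclass

# Lower rank is preferred: IGP < EGP < incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    next_hop_reachable: bool
    weight: int
    local_pref: int
    locally_originated: bool
    as_path: tuple
    origin: str
    med: int
    is_ebgp: bool
    igp_metric: int
    router_id: int


def preference_key(p: Path) -> tuple:
    """Smaller tuples win; each element mirrors one selection step."""
    return (
        -p.weight,                 # 3: highest WEIGHT
        -p.local_pref,             # 4: highest LOCAL_PREF
        not p.locally_originated,  # 5: prefer locally originated
        len(p.as_path),            # 6: shortest AS_PATH
        ORIGIN_RANK[p.origin],     # 7: lowest ORIGIN code
        p.med,                     # 8: lowest MED (always-compare-med assumed)
        not p.is_ebgp,             # 9: external over internal
        p.igp_metric,              # 10: lowest IGP metric to NEXT_HOP
        p.router_id,               # 13: lowest Router-ID
    )


def best_path(paths):
    """Return the preferred path, or None if no next hop is reachable."""
    candidates = [p for p in paths if p.next_hop_reachable]  # step 1
    return min(candidates, key=preference_key) if candidates else None
```

Because the tuple is compared element by element, a path with a higher LOCAL_PREF wins even if its AS_PATH is longer, matching the order of the list above.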
Prefix Generation And Aggregation
Say what?!
<network> Command
• Networks originated by the local router
• Matching IGP route must exist
dynamic or static entry in routing table
• Example:
router bgp 109
network 200.10.10.0
network 198.10.0.0 mask 255.255.0.0
!
ip route 198.10.0.0 255.255.0.0 null 0
Redistribution
• From IGP
Typically NOT a good thing!
• Static routes pointed to null0
• Example:
router bgp 109
redistribute static
!
ip route 198.10.0.0 255.255.0.0 null 0
Aggregate Addresses
• Combine different routes into one
• Advertised as coming from the local AS
• A component must exist in the BGP table
Aggregation Attributes
• Aggregator Attribute
Last AS number that formed the
aggregate route
IP address of the BGP speaker that formed the
aggregate route
• Atomic Aggregate attribute
indicates a more specific route exists
BGP speaker receiving this attribute shall not
remove the attribute when propagating it
• Useful for debugging. They don’t affect route selection.
Aggregate Attributes
NEXT_HOP = local
WEIGHT = 32768
LOCAL_PREF = best
AS_PATH = AS_SET or nothing
ORIGIN = worst
MED = none
<aggregate-address>
• With no options it propagates the aggregate and all the components
• summary-only
Advertise ONLY the aggregate (no components)
Example:
router bgp 109
aggregate-address 198.10.0.0 255.255.0.0 summary-only
as-set
• AS_SET
unordered set of all ASs traversed
helps avoid loops
• advertise the prefix and the components AND include AS_SET information in the path
as-set (Cont.)
• Example:
router bgp 1880
network 193.1.34.0
aggregate-address 193.0.32.0 255.255.254.0 as-set
[Figure: router A in AS 1880 (193.1.34/24), AS 1883 (193.0.32/24), AS 1881 (193.0.33/24)]
Prefix          AS_PATH
193.1.34/24     1880
193.0.33/24     1880 1881
193.0.32/24     1880 1883
193.0.32/23     1880 {1881,1883}
Options (Cont.)
suppress-map | advertise-map | attribute-map
suppress-map = suppress specific components
advertise-map = create an aggregate from specific components
attribute-map = set attributes for the aggregate route
Conditional Advertisement
• Conditionally advertise prefixes, useful for dual homing
• Syntax:
neighbor <address> advertise-map <route-map> non-exist-map <route-map>
non-exist-map is periodically checked; if satisfied (i.e. routes are not in the BGP table), the prefixes matched by the advertise-map are advertised to the neighbor
Soft Reconfiguration
BGP Soft-Reconfiguration
• Allows policies to be changed
without clearing the neighbor
• Both inbound and outbound
Inbound requires additional memory
Outbound is more efficient
Soft-Reconfiguration
• Outbound does not require any configuration
• Inbound configuration:
router bgp 30
neighbor 141.153.12.2 remote-as 32
neighbor 141.153.12.2 soft-reconfiguration inbound
neighbor 141.153.12.2 route-map filter in
neighbor 141.153.30.2 remote-as 31
• <clear ip bgp x.x.x.x soft [in|out]>
Managing Policy Changes
clear ip bgp <addr> [soft] [in|out]
• <addr> may be any of the following
x.x.x.x             IP address of a peer
*                   all peers
ASN                 all peers in an AS
external            all external peers
peer-group <name>   all peers in a peer-group
Route Refresh Capability
• Facilitates non-disruptive
policy changes
• No configuration is needed
• No additional memory is used
• clear ip bgp x.x.x.x in
Internal mesh reduction
IBGP Mesh
• IBGP speaker does not advertise IBGP
learned info to a third IBGP speaker!!!
• Avoids routing information loop
• Does not scale
• Following solutions do not change the
current behaviour
Route reflectors
Confederation
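A quick calculation shows why the full mesh does not scale. The sketch below counts iBGP sessions for a full mesh and, for comparison, for a single route reflector with every other router as its client; that single-RR layout is an assumption made for illustration, and the function names are hypothetical.

```python
def full_mesh_sessions(routers: int) -> int:
    """Every router peers with every other one: n * (n - 1) / 2 sessions."""
    return routers * (routers - 1) // 2


def single_rr_sessions(routers: int) -> int:
    """One route reflector, all other routers are clients: n - 1 sessions."""
    return max(routers - 1, 0)
```

With 100 routers the full mesh needs 4950 sessions, against 99 with one route reflector.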
Normal IBGP
[Figure: routers A, B, and C in AS 100]
Route Reflector: Principle
[Figure: Route Reflector with routers A, B, and C in AS 100]
Route-reflector
• Multiple levels of RR
[Figure: two RRs in AS 1, router B, AS2]
Loop Avoidance
• Originator_ID Attribute
carries the RID of the originator of the
route in the local AS
• Cluster_list Attribute
The local cluster-id (RR router-ID) is
added when the update is reflected
(added by the RR)
Loop Avoidance
• When RR receives an update:
Check if its cluster-id is on the cluster-list
If cluster-id is on the cluster-list the
update is silently discarded
If the BGP update is ok, the RR updates the
cluster-list with its cluster-id and reflects
the update (according to the rules)
With multiple RRs in the same cluster, the same cluster-id should be set on each of them by configuration
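The cluster-list check can be sketched in a few lines. This only illustrates the loop-avoidance step; the function name is hypothetical, cluster-ids are modelled as strings, and the reflection rules themselves (client versus non-client peers) are not covered.

```python
def reflect_cluster_list(cluster_list: list, my_cluster_id: str):
    """Return the cluster-list to send when reflecting, or None to discard.

    If the RR finds its own cluster-id in the received cluster-list the
    update has looped and is silently dropped; otherwise the RR adds its
    cluster-id at the front before reflecting.
    """
    if my_cluster_id in cluster_list:
        return None
    return [my_cluster_id] + list(cluster_list)
```

An update that returns to a reflector after passing through it once carries that reflector's cluster-id and is therefore discarded.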
Confederations
• Collection of ASs (sub-ASs)
• Visible to outside world as single AS
• Uses reserved (private) AS numbers for internal sub-AS
• Routers inside each sub-AS are fully meshed (iBGP)
• EBGP between sub-AS
Confederation
[Figure: Confederation 100 made of Sub-AS 65001, Sub-AS 65002, and Sub-AS 65003, with routers A, B, and C]
Confederation: Principle
• Mini-ASs have eBGP-like connections to other mini-ASs
• However, they carry all the usual iBGP information: MED, local-pref, next-hop.
Confederation: AS-path
[Figure: Confederation 100 with Sub-AS 65001, 65002, 65003, and 65004 and routers A to H. AS_PATH shown for 180.10.0.0/16 at successive points: 200; {65002} 200; {65004 65002} 200; 100 200]
RR vs Confederations
• Route-Reflectors
– Easy to configure (clients are unchanged)
– RR configuration does not require any
downtime
– RR will scale easily
RR vs Confederations
• Confederations
– Maintenance is complex due to reconfiguration of ALL routers in AS
– Sub-confederation may have different BGP policies
Route Dampening
Route Flap Dampening
• Route flaps ripple through the
entire Internet
up and down of path
change in attributes
• Wastes CPU
• Objective: reduce the scope of route
flap propagation
Route Flap Dampening
[Figure: penalty (0 to 4) plotted against time (0 to 25), with the Suppress-Limit and Reuse-Limit thresholds marked]
Flap Dampening: Operation
• Add fixed penalty for each flap
flap = withdraw or attribute change
• Exponentially decay penalty
half-life determines rate
• Penalty above suppress-limit = do not
advertise up route
• Penalty decayed below reuse-limit =
advertise route
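The penalty mechanics can be sketched as follows. This is a minimal model under stated assumptions: the penalty halves every half-life, a route becomes suppressed once the penalty exceeds the suppress-limit, and it stays suppressed until the penalty falls below the reuse-limit. The function names are hypothetical and no vendor default values are implied.

```python
def decayed_penalty(penalty: float, elapsed: float, half_life: float) -> float:
    """Exponential decay: the penalty halves every half_life time units."""
    return penalty * 0.5 ** (elapsed / half_life)


def update_suppressed(suppressed: bool, penalty: float,
                      suppress_limit: float, reuse_limit: float) -> bool:
    """Return the new suppression state for the current penalty."""
    if penalty > suppress_limit:
        return True       # too many flaps: stop advertising the route
    if penalty < reuse_limit:
        return False      # decayed far enough: advertise again
    return suppressed     # between the limits: keep the previous state
```

The gap between the two limits is what keeps a route from bouncing in and out of suppression while its penalty hovers near a single threshold.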
MP-BGP
Multi-Protocol BGP
• Extension to the BGP protocol in
order to carry routing information
about other protocols
ex: Multicast, MPLS-VPN, IPv6, CLNS, ...
• Exchange of Multi-Protocol NLRI
must be negotiated at session set up
BGP Capabilities negotiation
Multi-Protocol BGP - RFC2283
• New non-transitive and optional
BGP attributes
MP_REACH_NLRI
“Carry the set of reachable destinations
together with the next-hop information to be
used for forwarding to these destinations”
(RFC2283)
MP_UNREACH_NLRI
Carry the set of unreachable destinations
Multi-Protocol BGP - RFC2283
• Attribute contains one or more triples
1) Address Family Identifier (AFI) with Sub-AFI
Identifies the protocol information carried in the NLRI field
2) Next-Hop Information
Next-hop address must be of the same family
3) NLRI
BGP Capabilities Negotiation
• BGP routers establish BGP sessions
through the OPEN message
• OPEN message contains optional
parameters
• BGP session is terminated if OPEN
parameters are not recognised
• A new optional parameter:
CAPABILITIES
BGP Capabilities Negotiation
• A BGP router sends an OPEN
message with CAPABILITIES
parameter containing its
capabilities:
Multiprotocol extension
Route-refresh
...
BGP Capabilities Negotiation
• BGP routers determine capabilities
of their neighbors by looking at the
capabilities parameters in the
open message
• Unknown or unsupported
capabilities may trigger the
transmission of a
NOTIFICATION message
MBGP
• MBGP: Multiprotocol BGP for
Multicast NLRIs
Multicast-BGP
• Unicast and Multicast routes are
carried through same BGP session
MBGP
• AFI, Sub-AFI part of
MP_REACH_NLRI and
MP_UNREACH_NLRI
AFI = 1 (IPv4)
Sub-AFI = 1 (NLRI is used for unicast)
Sub-AFI = 2 (NLRI is used for multicast)
Sub-AFI = 3 (NLRI is used for both unicast and
multicast)
• Separate BGP tables
MBGP
• MBGP is used to match RPF
• MBGP does NOT propagate any
multicast state
• Same rules apply to path selection
and validation
BGP attributes (AS-Path, LocalPref, MED, …)
• Recursive RPF lookup is done in
unicast routing table
MBGP
• BGP/MBGP configuration allows you to
define which NLRI types are exchanged (unicast, multicast, both)
set NLRI type through route-maps (redistribution)
define policies through standard BGP attributes (for unicast and/or multicast NLRI)
• Translation between multicast and unicast NLRIs
MBGP
BGP session for unicast and multicast NLRI
[Figure: AS 321 and AS 123 joined by 192.168.100.0/24, two RPs, a sender and a receiver]
BGP: 192.168.100.2 open active, local address 192.168.100.1
BGP: 192.168.100.2 went from Active to OpenSent
BGP: 192.168.100.2 sending OPEN, version 4
BGP: 192.168.100.2 OPEN rcvd, version 4
BGP: 192.168.100.2 rcv OPEN w/ option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/1
BGP: 192.168.100.2 rcv OPEN w/ option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/2
BGP: 192.168.100.2 went from OpenSent to OpenConfirm
BGP: 192.168.100.2 went from OpenConfirm to Established
MBGP and non-congruent topologies
Single BGP session across loopback interfaces
[Figure: AS 321 (sender, 192.168.25.0/24) and AS 123; unicast traffic over 192.168.100.0/24, multicast traffic over 192.168.200.0/24]
router bgp 321
network 192.168.100.0 nlri unicast
network 192.168.200.0 nlri multicast
network 192.168.25.0 nlri unicast multicast
neighbor 192.168.1.1 remote-as 123 nlri unicast multicast
neighbor 192.168.1.1 ebgp-multihop 255
neighbor 192.168.1.1 update-source Loopback0
neighbor 192.168.1.1 route-map setNH out
!
route-map setNH permit 10
match nlri multicast
set ip next-hop 192.168.200.2
!
route-map setNH permit 15
match nlri unicast
set ip next-hop 192.168.100.2
MPLS-VPN
What is an IP VPN ?
• An IP network infrastructure
delivering private network services
over a public infrastructure
Use a layer 3 backbone
Scalability, easy provisioning
Global as well as non-unique private
address space
VPN Models - The Overlay model
• Private trunks over a TELCO/SP
shared infrastructure
Leased/Dialup lines
FR/ATM circuits
IP (GRE) tunnelling
• Transparency between provider
and customer networks
• Optimal routing requires full mesh
over backbone
VPN Models - The Peer model
• Both provider and customer
network use same network protocol
• CE and PE routers have a routing
adjacency at each site
• All provider routers hold the full
routing information about all
customer networks
• Private addresses are not allowed
VPN Models - MPLS-VPN: The True Peer model
• Same as Peer model, BUT:
• Provider Edge routers receive and hold routing information only about directly connected VPNs
• Reduces the amount of routing information a PE router will store
• Routing information is proportional to the number of VPNs a router is attached to
• MPLS is used within the backbone to switch packets (no need of full routing)
MPLS VPN Connection Model
[Figure: CE routers of VPN_A and VPN_B sites (10.1.0.0, 10.2.0.0, 10.3.0.0, 11.5.0.0, 11.6.0.0) attached to PE routers around a core of P routers; MP-iBGP sessions between the PEs]
• P routers (LSRs) are in the core of the MPLS cloud
• PE routers use MPLS with the core and plain IP with CE routers
• P and PE routers share a common IGP
• PE routers are MP-iBGP fully meshed
MPLS VPN Connection Model
[Figure: two PE routers around a core of P routers (VPN Backbone IGP), with an MP-iBGP session between the PEs]
• Multiple routing tables (VRFs) are used on PEs
Each VRF contains customer routes
Customer addresses can overlap
VPNs are isolated
• MP-BGP is used to propagate these addresses between PE routers
MPLS VPN Connection Model
Addresses overlap
[Figure: two PE routers around a core of P routers (VPN Backbone IGP), with an MP-iBGP session between the PEs]
• BGP always propagates ONE route per destination
• What if two customers are using the same address?
BGP will propagate only one route - PROBLEM !!!
• Therefore MP-BGP will distinguish between customer addresses
MPLS VPN Connection Model
Route propagation through MP-BGP
[Figure: PE-1 and PE-2 around a core of P routers (VPN Backbone IGP); CE-1 advertises Net1; Site-1 and Site-2 of VPN-A and VPN-B; an "update for Net1" is shown at each hop]
VPN-IPv4 update: RD1:Net1, Nexthop=PE-1, SOO=Site1, RT=Yellow, Label=10
VPN-IPv4 update: RD2:Net1, Nexthop=PE-1, SOO=Site1, RT=Green, Label=12
VPN-IPv4 updates are translated into IPv4 addresses and inserted into the VRF corresponding to the RT value
MP-BGP assigns an RD to each route in order to make them unique, so that all of them can be propagated
MP-BGP assigns a Route-Target so that remote PEs insert the route into the corresponding routing table (VRF)
Route-Target is the colour of the route
VPN Connection Model: Route propagation through MP-BGP
[Figure: PE-1 and PE-2 around a core of P routers (VPN Backbone IGP); CE-1 advertises Net1; Site-1 and Site-2 of VPN-A and VPN-B; an "update for Net1" is shown at each hop]
VPN-IPv4 update: RD1:Net1, Nexthop=PE-1, SOO=Site1, RT=Yellow, Label=10
VPN-IPv4 update: RD2:Net1, Nexthop=PE-1, SOO=Site1, RT=Green, Label=12
VPN-IPv4 updates are translated into IPv4 addresses and inserted into the VRF corresponding to the RT value
When a PE router receives an MP-BGP route it checks the route-target value
If that value matches the one configured for a particular routing table, the route is inserted into it
The label associated with the route is stored and used to send packets towards the destination
MPLS VPN Connection Model
MP-BGP Update
• VPN-IPv4 address
Route Distinguisher (64 bits)
Makes the IPv4 route globally unique
RD is configured in the PE for each VRF
IPv4 address (32 bits)
• Extended Community attribute (64 bits)
Site of Origin (SOO): identifies the originating site
Route-target (RT): identifies the set of sites the route has to be advertised to
MPLS VPN Connection Model
MP-BGP Update
Any other standard BGP attribute
Local Preference
MED
Next-hop
AS_PATH
Standard Community
...
A Label identifying:
The outgoing interface
The VRF where a lookup has to be done
(aggregate label)
The BGP label will be the second label in the
label stack of packets travelling in the core
Scaling
• Existing BGP techniques can be
used to scale the route distribution:
route reflectors
• Each edge router needs only the
information for the VPNs it supports
Directly connected VPNs
• RRs are used to distribute VPN
routing information
Scaling
• Very highly scalable:
Initial VPN release: 1000 VPNs x 1000 sites/VPN =
1,000,000 sites
Architecture supports 100,000+ VPNs, 10,000,000+
sites
BGP “segmentation” through RRs is essential !!!!
• Easy to add new sites
• configure the site on the PE connected to it
• the network automagically does the rest
MPLS-VPN Scaling BGP
Route Reflectors
[Figure: CE routers of VPN_A and VPN_B sites (10.1.0.0, 10.2.0.0, 10.3.0.0, 11.5.0.0, 11.6.0.0) attached to PE routers (including PE1 and PE2) around a core of P routers, with two RRs]
• Route Reflectors may be partitioned
Each RR stores routes for a set of VPNs
• Thus, no BGP router needs to store ALL VPNs information
• PEs will peer to RRs according to the VPNs they directly connect
MPLS-VPN Scaling
BGP updates filtering
iBGP full mesh between PEs results in flooding all VPN routes to all PEs
This causes scaling problems when there is a large number of routes. In addition, PEs need only the routes for their attached VRFs
Therefore each PE will discard any VPN-IPv4 route that doesn’t have a route-target configured to be imported into any of the attached VRFs
This significantly reduces the amount of information each PE has to store
Volume of the BGP table is equivalent to the volume of the attached VRFs (nothing more)
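The filtering behaviour can be sketched as a route-target match. This is only an illustration: updates are modelled as dictionaries and each VRF as a set of import route-targets, all of which are hypothetical structures, with the values borrowed from the Yellow/Green example earlier in the deck.

```python
def import_routes(updates: list, vrf_import_rts: dict) -> dict:
    """Place each VPN-IPv4 update into every VRF that imports one of its RTs.

    updates: dicts with "rd", "prefix", "rts" (list of RTs), "label".
    vrf_import_rts: VRF name -> set of route-targets that VRF imports.
    Updates matching no VRF are simply not stored (discarded).
    """
    vrfs = {name: {} for name in vrf_import_rts}
    for update in updates:
        for name, wanted in vrf_import_rts.items():
            if wanted & set(update["rts"]):
                # Inside the VRF the RD is dropped: plain IPv4 prefix again.
                vrfs[name][update["prefix"]] = update["label"]
    return vrfs


updates = [
    {"rd": "RD1", "prefix": "Net1", "rts": ["Yellow"], "label": 10},
    {"rd": "RD2", "prefix": "Net1", "rts": ["Green"], "label": 12},
]
# A PE attached only to a VRF importing "Yellow" keeps one of the two routes.
tables = import_routes(updates, {"VPN-A": {"Yellow"}})
```

Here the Green route is never stored on this PE, which is the saving the text above describes.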
Conclusion
Summary
• BGP represents a viable solution today for Service Providers to:
Offer new world IP-VPN services.
Interconnect transit and non-transit AS to the Internet
• And for Enterprise customers to
Scale big networks and dual-home their AS.
Thanks to
• Stefano Previdi for his slides!!!
• You for your attention!!!!!