#CLUS
ACI Troubleshooting:
Endpoints
Andy Gossett, DCBU ACI Escalation
@agccie
BRKACI-2641
© 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public
[Schedule slide: related ACI sessions BRKACI-1001, BRKACI-3545, BRKACI-2641, BRKACI-2642, BRKACI-2643, BRKACI-2644, BRKACI-2645, BRKACI-2271, BRKACI-2934; start times from 8:00 a.m. to 4:00 p.m., durations from 60 to 120 minutes]
Agenda
• ACI Endpoint Learning
• Configuration Options
• Endpoint Learning Troubleshooting Tips
Acronyms/Definitions (Reference Slide)
Acronym        Definition
ACI            Application Centric Infrastructure
ACL            Access Control List
APIC/IFC       Application Policy Infrastructure Controller / Insieme Fabric Controller
BD             Bridge Domain
COOP           Council of Oracle Protocol
ECMP           Equal Cost Multipath
EP             Endpoint
EPG            Endpoint Group
EPM            Endpoint Manager
EPMC           Endpoint Manager Client (LC component)
FTEP/VTEP      Fabric/Virtual or VXLAN Tunnel Endpoint
LPM            Longest Prefix Match
MDT            Multicast Distribution Tree
pcTag          Policy Control Tag
PL             Physical Local
sclass         Source class (source pcTag)
SVI            Switch Virtual Interface
TC             Topology Change
VL             Virtual Local
VNID           Virtual Network Identifier
VXLAN/iVXLAN   Virtual Extensible LAN / Insieme VXLAN
XR             VXLAN Remote
Endpoint Learning
What is an ACI Endpoint
Depends on who’s counting…
• An endpoint is a MAC with one or more IPv4 (/32) or IPv6 (/128) addresses
• An endpoint is a MAC, IPv4 (/32), or IPv6 (/128) address
Example endpoint: MAC 00:00:00:00:00:0a, IP 10.0.0.10
• APIC objects:
  fvCEp  <epg-dn>/cep-00:00:00:00:0a
  fvIp   <epg-dn>/cep-00:00:00:00:0a/ip-[10.0.0.10]
• Spine coop db: mac: 00:00:00:00:0a, count: 1, ip0: 10.0.0.10
• Two hardware entries (synthetic IPs 28.186.73.78 and 21.215.190.9)
What is an ACI Endpoint
Why the count matters
• #MAC w/ one or more IPs
• #MAC + #IP
• 450K max
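The two counting conventions can give quite different numbers for the same set of hosts. The sketch below illustrates the difference on a made-up table; the `endpoints` dictionary and both function names are hypothetical and are not part of any ACI API.

```python
# Hypothetical endpoint table: MAC -> list of IPs learned on that MAC.
endpoints = {
    "00:00:00:00:00:0a": ["10.0.0.10"],
    "00:00:00:00:00:0b": ["10.0.0.11", "10.0.0.12"],
}


def count_mac_with_ips(eps):
    """Count each MAC, together with all of its IPs, as one endpoint."""
    return len(eps)


def count_mac_plus_ip(eps):
    """Count every MAC and every IP as a separate entry."""
    return len(eps) + sum(len(ips) for ips in eps.values())


mac_view = count_mac_with_ips(endpoints)
entry_view = count_mac_plus_ip(endpoints)
print(mac_view, entry_view)
```

When comparing an endpoint count against a documented scale limit, it helps to confirm first which of the two conventions that limit uses.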
Classical Learning
Frame fields: DMAC, SMAC, 802.1Q, SIP, DIP, Proto, L4/Payload
Encap + Interface => VLAN
VLAN => VRF
L2 Forwarding for (VLAN, DMAC)
L2 Learning for (VLAN, SMAC) => (Interface)
L3 Forwarding for (VRF, DIP)
L2 Forwarding:
• (VLAN, DMAC) Miss => Flood
• (VLAN, DMAC) Gateway MAC => Route
• (VLAN, DMAC) Hit => Destination Port
  Config on destination port + VLAN determines egress encap (tagged or untagged)
L3 Forwarding (Longest Prefix Match):
• (VRF, DIP) Miss => Drop
• (VRF, DIP) Hit => Adjacency
  Might be Glean or packet rewrite (SMAC, DMAC, VLAN, etc…), may include destination port in adjacency or require second L2 lookup on new DMAC
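The classical lookup rules can be sketched in a few lines. This is a toy model under stated assumptions, not switch code: the table layouts, function names, gateway MAC, and route entries are all hypothetical, and only the miss/hit behavior follows the slide.

```python
import ipaddress

mac_table = {}  # (vlan, mac) -> interface
routes = {}     # vrf -> list of (ip_network, adjacency)


def l2_learn(vlan, smac, interface):
    """L2 learning: (VLAN, SMAC) => interface."""
    mac_table[(vlan, smac)] = interface


def l2_forward(vlan, dmac, gateway_mac):
    """Gateway MAC => route, hit => destination port, miss => flood."""
    if dmac == gateway_mac:
        return "route"
    return mac_table.get((vlan, dmac), "flood")


def l3_forward(vrf, dip):
    """Longest prefix match on (VRF, DIP); miss => drop."""
    addr = ipaddress.ip_address(dip)
    matches = [(net, adj) for net, adj in routes.get(vrf, []) if addr in net]
    if not matches:
        return "drop"
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical state: one connected route (glean) and one host route.
routes["v1"] = [
    (ipaddress.ip_network("10.1.1.0/24"), "glean"),
    (ipaddress.ip_network("10.1.1.101/32"), "eth1/1"),
]
l2_learn(101, "0000.0000.000a", "eth1/1")
```

With this state, a destination inside 10.1.1.0/24 that has no host route resolves to the glean adjacency, which is the case where a classical router would generate an ARP request.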
Classical Learning
LPM Routes
• Connected/direct routes manually configured
• Static/dynamic routing protocols to learn prefixes
Host Routes (IP Endpoints)
• Glean adjacency for connected routes to punt frame and generate ARP request
• ARP/ND used to create MAC to IP binding and install host route into routing table
Example routing table (hosts 10.1.1.101/24 and 20.1.1.101/24):
Route            Adj
10.1.1.101/32    …
20.1.1.101/32    …
10.1.1.0/24      Glean
20.1.1.0/24      Glean
ARP Packet fields: DMAC, SMAC, Eth: 0x0806, Hdr/Opcode, Sender MAC, Sender IP, Target MAC, Target IP
ACI Learning (Physical Local - PL)
Frame fields: DMAC, SMAC, 802.1Q, SIP, DIP, Proto, L4/Payload
Encap + Interface => EPG
EPG => BD
BD => VRF
L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Interface)
L3 Learning for (VRF, SIP) => (EPG, Interface)
L3 Forwarding for (VRF, DIP)
Callout: EPGs and L3 Learning
L2 Forwarding:
• (BD, DMAC) Miss => (Flood/Proxy+Drop)
• (BD, DMAC) Gateway MAC => Route
• (BD, DMAC) Hit => Adjacency
L3 Forwarding (Longest Prefix Match):
• (VRF, DIP) Miss => Drop
  Proxy/Glean for BD subnets
• (VRF, DIP) Hit => Adjacency
Adjacency contains dst EPG, encap information, dst VTEP or port, etc… (more in upcoming slides)
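The local (PL) learning chain, encap and interface to EPG to BD to VRF, can be sketched as follows. The mapping tables, the `learn_local` function, and the sample interface are hypothetical; the model only illustrates that the MAC is learned in BD scope while the IP is learned in VRF scope, and that IP learning applies to routed and ARP frames.

```python
# Hypothetical policy mappings, as they might be resolved on a leaf.
encap_to_epg = {("eth1/1", "vlan-101"): "e1"}
epg_to_bd = {"e1": "bd1"}
bd_to_vrf = {"bd1": "v1"}

mac_db = {}  # (bd, mac) -> (epg, interface)
ip_db = {}   # (vrf, ip) -> (epg, interface)


def learn_local(interface, encap, smac, sip=None, routed_or_arp=False):
    """Classify the frame into EPG/BD/VRF and record the source addresses."""
    epg = encap_to_epg[(interface, encap)]
    bd = epg_to_bd[epg]
    vrf = bd_to_vrf[bd]
    # L2 learning: (BD, SMAC) => (EPG, Interface)
    mac_db[(bd, smac)] = (epg, interface)
    # L3 learning: (VRF, SIP) => (EPG, Interface), only for routed or ARP frames
    if routed_or_arp and sip is not None:
        ip_db[(vrf, sip)] = (epg, interface)
    return epg, bd, vrf


scope = learn_local("eth1/1", "vlan-101", "0000.0000.000a",
                    sip="10.1.1.101", routed_or_arp=True)
print(scope)
```

A bridged frame from the same host would refresh only the MAC entry, which is why an endpoint can show a MAC with no IP.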
ACI Learning (ARP)
Optimize Forwarding (ARP Flooding disabled)
ARP frame fields: DMAC, SMAC, 802.1Q, ethtype (ARP), Hdr/Opcode, Sender MAC, Sender IP, Target MAC, Target IP
Encap + Interface => EPG
EPG => BD
BD => VRF
L2 Learning for (BD, SMAC) => (EPG, Interface)
L2 Learning for (BD, ARP SMAC) => (EPG, Interface)
L3 Learning for (VRF, ARP Sender IP) => (EPG, Interface)
L3 Forwarding for (VRF, ARP Target IP)
ARP L3 Forwarding:
• (VRF, ARP Target IP) Miss => Proxy
• (VRF, ARP Target IP) Hit => Adjacency
L3 forwarding is based on the ARP target IP field, with a miss sent to spine proxy
ACI Learning (Virtual Local - VL)
VXLAN tunnel between host (AVS/AVE/OVS) and fabric
VXLAN Outer Header: DMAC (Infra BD MAC), SMAC (Host MAC), 802.1Q (Infra VLAN), SIP (Host VTEP), DIP (Fabric TEP), Proto, UDP, Rsvd, VNID
Inner Header: DMAC, SMAC, ethtype, SIP, DIP, Proto, L4/Payload
External VNID => EPG
EPG => BD
BD => VRF
L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Tunnel)
L3 Learning for (VRF, SIP) => (EPG, Tunnel)
L3 Forwarding for (VRF, DIP)
iVXLAN Header
Packet layout: OUTER MAC Header, 802.1Q, IPv4 Header, UDP Header, iVXLAN Header, INNER MAC Header, IPv4 Header, UDP Header, PAYLOAD, FCS
iVXLAN header (bits 0-63): Flags (including DL, E, SP, DP), Source Group, Virtual Network Identifier (VNID), Reserved
Abbr.          Name                                Description
DL             Do not learn                        Informs remote leaf that it should not perform dataplane learning from this frame
E              Exception                           Set when frame has gone through proxy path
SP             Source-policy-applied               Policy has already been applied to this frame
DP             Destination-policy-applied          (DP and SP are always set together)
sclass/pcTag   Source group (policy-control tag)   16-bit policy control tag representing the EPG that sourced the frame
ACI Learning (Remote - XR)
iVXLAN Outer Header: DMAC/SMAC (Internal MAC), 802.1Q (Fabric QoS), SIP (Src Leaf VTEP), DIP (Dst Leaf VTEP), Proto, UDP, flags, EPG (pcTag), VNID (BD or VRF VNID, based on routed or switched)
Inner Header: DMAC, SMAC, ethtype, SIP, DIP, Proto, L4/Payload
L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Tunnel)
L3 Learning for (VRF, SIP) => (EPG, Tunnel)
L3 Forwarding for (VRF, DIP)
ACI Learning
Learning Exceptions
• No IP EP learning if routing is disabled on the BD
• No IP EP learning on external BDs (Layer-3 Outside interfaces)
• No IP EP learning on Infra VLAN
• No IP learning of shared service prefixes outside of our VRF
LPM Routes (Same as Classical)
• Pervasive SVI Routes (BD Subnets)
• Static and dynamic routing protocols on L3Out
Diagram labels: VXLAN/Opflex traffic between host (AVS/AVE/OVS) and fabric on Infra VLAN over a VXLAN tunnel; static/dynamic routing on L3Out toward WAN/Internet
ACI Learning
Frame                     Forwarding Operation   Learn
NonIP/IP                  Bridged                MAC
ARP                       -                      MAC (sender-HW), IP (sender-IP)
IPv4 Unicast              Routed                 MAC, IP
IPv6 Unicast              Routed                 MAC, IP
IPv6 Neighbor Discovery                          MAC, IP
Leaf Endpoint Database
• VRF scope: Remote IP Entries (VRF, IP)
• BD scope: Remote MAC Entries (VRF, BD, MAC)
• Encap scope: Local MAC and IP Entries (VRF, BD, VLAN/VXLAN, MAC) and (VRF, BD, VLAN/VXLAN, IP)
• Endpoint Entry contains: EPG (pcTag), Interface/Tunnel, Control flags
• A MAC entry has a relationship to multiple IP entries
ACI Learning (COOP and EP Sync)
• COOP sync between oracles (spines)
• Spines learn all endpoints through COOP
• COOP citizen (leaf) update to oracle (spine) for local EP learn
• Local learn on leaf from dataplane packet
• Remote learn on leaf from dataplane packet
• EP sync between vPC peers for local learns (both orphan and vPC ports)
• EP sync between vPC peers for remote learns
Diagram labels: vPC Domain 1, vPC Domain 2
ACI Learning: Review
• MAC learning for all frames
• IP learning for routed packets and ARP packets
• No IP learning on frames received on L3Out or Infra VLAN
• All local endpoint learns are published to COOP; spine has full knowledge of all fabric endpoints
• Proxy forwarding for any fabric endpoint, allowing for zero-penalty impact for remote endpoint miss
Moves and Bounce
Initial State: Host A on vPC pair leaf101/102, Host B on leaf104
Spines
Addr   Interface   Detail
A      tun1001     leaf101/102 vTEP
B      tun4        leaf104 TEP
Leaf101/102
Addr   Interface   Detail
A      vpc1        local vpc
B      tun4        XR -> leaf104
Leaf 103
Addr   Interface   Detail
-      -           -
-      -           -
Leaf 104
Addr   Interface   Detail
A      tun1001     XR -> leaf101/102 VIP
B      eth1/1      local learn
Moves and Bounce
1. Host A moves to leaf-103
2. Local learn on leaf103 (from 1st packet), published to coop
3. Spines receive event and update leaf101/102
4. Bounce set on old leaf101/102
! leaf104 still points to old tunnel
Spines
Addr   Interface   Detail
A      tun3        leaf103 TEP (was tun1001, leaf101/102 vTEP)
B      tun4        leaf104 TEP
Leaf101/102
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set (was vpc1, local vpc)
B      tun4           XR -> leaf104
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn from 1st packet
-      -           -
Leaf 104
Addr   Interface   Detail
A      tun1001     XR -> leaf101/102 VIP
B      eth1/1      local learn
Moves and Bounce
1. Host B sends packet to host A
2. leaf101/102 bounce to leaf103
3. leaf103 learns host B to leaf104
Spines
Addr   Interface   Detail
A      tun3        leaf103 TEP
B      tun4        leaf104 TEP
Leaf101/102
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set
B      tun4           XR -> leaf104
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn
B      tun4        XR -> leaf104 (new)
Leaf 104
Addr   Interface   Detail
A      tun1001     XR -> leaf101/102 VIP
B      eth1/1      local learn
Moves and Bounce
4. Host A sends packet to host B
5. leaf104 updates XR to leaf103
Spines
Addr   Interface   Detail
A      tun3        leaf103 TEP
B      tun4        leaf104 TEP
Leaf101/102
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set
B      tun4           XR -> leaf104
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn
B      tun4        XR -> leaf104
Leaf 104
Addr   Interface   Detail
A      tun3        XR -> leaf103 TEP (was tun1001, XR -> leaf101/102 VIP)
B      eth1/1      local learn
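The move-and-bounce sequence can be replayed with a small table-driven model. This is a simplified sketch: the `tables` structure and `deliver` function are hypothetical, a miss raises an error instead of going to the spine proxy, and the state starts right after Host A has moved to leaf103 and the bounce entry has been installed on leaf101/102.

```python
# Per-leaf endpoint tables: addr -> (kind, target)
tables = {
    "leaf101/102": {"A": ("bounce", "leaf103"), "B": ("xr", "leaf104")},
    "leaf103": {"A": ("local", "eth1/1")},
    "leaf104": {"A": ("xr", "leaf101/102"), "B": ("local", "eth1/1")},
}


def deliver(src_leaf, src, dst, max_hops=4):
    """Follow XR/bounce entries from src_leaf; return the leaves visited."""
    path = [src_leaf]
    leaf = src_leaf
    for _ in range(max_hops):
        kind, target = tables[leaf].get(dst, ("miss", None))
        if kind == "local":
            # The egress leaf learns the sender as remote (XR) toward the
            # original ingress leaf, even when the frame was bounced.
            if leaf != src_leaf:
                tables[leaf][src] = ("xr", src_leaf)
            return path
        if kind == "miss":
            raise LookupError(f"{leaf} has no entry for {dst}")
        leaf = target  # XR or bounce: forward to the leaf the entry points at
        path.append(leaf)
    raise RuntimeError("too many hops")


first = deliver("leaf104", "B", "A")   # stale XR on leaf104, bounced once
reply = deliver("leaf103", "A", "B")   # leaf104 relearns A toward leaf103
second = deliver("leaf104", "B", "A")  # now goes directly to leaf103
print(first, reply, second)
```

The first packet takes the extra hop through the old leaf pair; after Host A answers, leaf104 points directly at leaf103 and the bounce entry is no longer used for this flow.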
Aging
Example entry (before timer reset -> after timer reset):
Addr   Time-left                   Reset-count   Hit
A      15 seconds -> 900 seconds   224 -> 225    Yes -> No
• Hardware maintains a hit-bit for each entry, which is set whenever a frame is received from the corresponding source address
• If a packet is not seen within the timeout, the entry is aged and removed from hardware
• Else, if the leaf receives a frame and the hit-bit is set, software resets the timer and the hit-bit, and the entry is not aged out
• For local IP endpoints, at 75% of the endpoint timer, host tracking sends 3x ARP/ND to verify if the endpoint is still present
  • ARP/ND reply resets timer for both IP and MAC
  • Support for silent hosts
  • No response and endpoint will eventually age-out
Note: No regular ARP/ND required to verify IP is still present if traffic is regularly received!
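The hit-bit mechanism can be modeled in a few lines. This is an illustrative sketch only: the `Entry` class and function names are hypothetical, the 900 second default is taken from the local endpoint timer in this deck, and host tracking at 75% of the timer is left out.

```python
class Entry:
    """Hypothetical endpoint entry with an aging timer and a hardware hit-bit."""

    def __init__(self, timeout=900):
        self.timeout = timeout
        self.time_left = timeout
        self.hit = False
        self.reset_count = 0


def frame_seen(entry):
    """Hardware sets the hit-bit when a frame arrives from this source."""
    entry.hit = True


def timer_expired(entry):
    """Run when the timer reaches zero; return True if the entry is aged out."""
    if entry.hit:
        # Traffic was seen during the interval: reset timer and hit-bit.
        entry.time_left = entry.timeout
        entry.hit = False
        entry.reset_count += 1
        return False
    return True


e = Entry()
frame_seen(e)
kept = not timer_expired(e)   # hit-bit was set, entry survives
aged = timer_expired(e)       # no traffic since the reset, entry ages out
print(kept, aged, e.reset_count)
```

In this model a single frame per interval is enough to keep the entry alive, which matches the note that regular ARP/ND is not required when traffic is regularly received.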
VPC Aging
Host A is a vpc host; host B is an orphan host.
vPC peer 1
Addr   Hit   Flags
A      No    local, vpc-attached
B      No    peer-attached
vPC peer 2
Addr   Hit   Flags
A      No    local, vpc-attached
B      No    local
• For vpc, both leaves in the vpc domain have to age out the entry before it is removed. This applies to remote and local entries
• For orphan ports, as soon as the local leaf ages it out it is deleted from both switches.
VPC Aging (vpc host)
1. When vpc endpoint is aged, set local-aged flag and send update to peer
2. Peer-aged flag set, indicating that the peer has aged the entry. The entry will be deleted once the local leaf ages it out as well.
vPC peer 1
Addr   Hit   Flags
A      No    local, vpc-attached, peer-aged
B      No    peer-attached
vPC peer 2
Addr   Hit   Flags
A      No    local, vpc-attached, local-aged
B      No    local
VPC Aging (vpc host)
3. Endpoint is locally-aged, send update to peer. Since both local-aged and peer-aged are set, delete entry
4. Receive peer-aged from peer. Since both local-aged and peer-aged are set, delete entry
vPC peer 1
Addr   Hit   Flags
A      No    local, vpc-attached, peer-aged, local-aged
B      No    peer-attached
vPC peer 2
Addr   Hit   Flags
A      No    local, vpc-attached, local-aged, peer-aged
B      No    local
VPC Aging (orphan host)
1. When orphan port is locally-aged, simply delete and send update to peer
2. Orphan port deleted as soon as peer ages it out
vPC peer 1
Addr   Hit   Flags
A      No    local, vpc-attached
B      No    peer-attached
vPC peer 2
Addr   Hit   Flags
A      No    local, vpc-attached
B      No    local, local-aged
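The deletion rule for vPC and orphan entries reduces to a small flag check. The sketch below is a hypothetical model of that rule; the function name and flag strings mirror the slide labels but are not an actual switch API.

```python
def on_age_event(flags, event):
    """Apply a 'local-aged' or 'peer-aged' event to an entry's flag set.

    Returns True when the entry should be deleted on this leaf.
    """
    if event not in ("local-aged", "peer-aged"):
        raise ValueError(f"unknown event: {event}")
    if "vpc-attached" not in flags:
        # Orphan port: deleted as soon as the attached leaf ages it out
        # (locally, or as reported by the peer).
        return True
    flags.add(event)
    # vPC entry: both leaves must have aged it before it is removed.
    return {"local-aged", "peer-aged"} <= flags


vpc_entry = {"local", "vpc-attached"}
after_peer = on_age_event(vpc_entry, "peer-aged")    # kept, waiting for local age
after_local = on_age_event(vpc_entry, "local-aged")  # both flags set, delete
orphan_deleted = on_age_event({"peer-attached"}, "peer-aged")
print(after_peer, after_local, orphan_deleted)
```

The order of the two events does not matter for a vPC entry; the entry is removed only once both flags are present.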
Configuration Options
Nerd Knobs
Timers – Endpoint Retention Policy
Timer    Default   Applied at BD   Applied at VRF
Local    900 sec   MAC and IP      -
Bounce   630 sec   MAC             IP
Remote   300 sec   MAC             IP
Move     256/sec   -               -
Hold     300 sec   -               -
XR MACs are always learned at BD level; XR IPs are always learned at VRF level.
• If moves/sec exceed the rate, then learning is disabled on the BD for the hold time as a protection mechanism for software components (epm/epmc/coop)
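The interaction between the move rate and the hold timer can be sketched as a simple rate guard. This is an assumption-laden illustration: the `BdMoveGuard` class is hypothetical, it uses a fixed one-second counting window (the real implementation may count differently), and only the 256/sec and 300 sec defaults come from the table above.

```python
class BdMoveGuard:
    """Hypothetical per-BD guard: too many moves per second pauses learning."""

    def __init__(self, max_moves_per_sec=256, hold_sec=300):
        self.max_moves = max_moves_per_sec
        self.hold_sec = hold_sec
        self.window_start = None
        self.moves = 0
        self.hold_until = 0.0

    def learning_enabled(self, now):
        return now >= self.hold_until

    def record_move(self, now):
        """Record one endpoint move at time `now` (seconds).

        Returns True if learning is still enabled after this move.
        """
        if not self.learning_enabled(now):
            return False
        if self.window_start is None or now - self.window_start >= 1.0:
            # Start a new one-second counting window.
            self.window_start = now
            self.moves = 0
        self.moves += 1
        if self.moves > self.max_moves:
            self.hold_until = now + self.hold_sec
            return False
        return True


# Small limits so the behavior is easy to follow.
guard = BdMoveGuard(max_moves_per_sec=3, hold_sec=10)
results = [guard.record_move(t) for t in (0.0, 0.25, 0.5, 0.75)]
print(results, guard.learning_enabled(5.0), guard.learning_enabled(11.0))
```

The fourth move inside the same second trips the guard, and learning stays disabled until the hold time has passed.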
Timers – Endpoint Retention Policy
• Custom aging timers at BD level
• Custom aging timers at VRF level
Issue #1
Switch independent NIC team and load/spreading (Misconfigured Host)
Host with IP C and two NICs: eth2-1 (mac A) and eth2-2 (mac B)
1. ARP on eth2-1 with mac A, IP C
2. Source traffic for flow-X from B
3. Source traffic for flow-Y from A
• Each routed IP frame triggers a new IP learn within the fabric, and the endpoint is rapidly moving between mac A and mac B
• Possibly no perceived impact on dataplane traffic, however high CPU on leaf. If the NIC team is split between two leaves, then the coop process may run high on the spine as well.
Issue #1
Fix – Enable Rogue Endpoint Detection
Available in 3.2(1)
System -> System Settings -> Endpoint Controls -> Rogue EP Control
• An endpoint is marked as Rogue if its number of moves exceeds the multiplication factor within the detection interval.
• Endpoint is programmed as static to prevent new local learns, and the DL bit is set for all frames to prevent XR updates.
• Fault raised for endpoints detected as rogue.
Note, this is not a fix but allows operators an opportunity to protect their fabric and get notified of misconfigured hosts
Issue #1
Fix – Enable Rogue Endpoint Detection
Example Fault
• Fault is raised under the node and can also be seen under System faults.
Issue #1
Fix – Enable Rogue Endpoint Detection
Check EPM flag on leaf
fab4-leaf101# show system internal epm endpoint ip 10.1.1.101
MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags : rogue|
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80080c05 ::: sclass : 10932 ::: Ref count : 5
EP Create Timestamp : 12/31/1969 19:00:00.000000
EP Update Timestamp : 05/13/2019 19:58:26.310178
EP Flags : local|vPC|IP|MAC|sclass|rogue|
::::
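When checking many endpoints, the rogue flag can be picked out of saved `show system internal epm endpoint` output with a small parser. The sketch below works on a shortened copy of the output shown on the slide; the function names are hypothetical, and the regular expression assumes the `EP Flags : a|b|c|` line format seen here.

```python
import re

# Shortened copy of the leaf output shown on the slide.
SAMPLE = """\
MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags : rogue|
Interface : port-channel3
Flags : 0x80080c05 ::: sclass : 10932 ::: Ref count : 5
EP Flags : local|vPC|IP|MAC|sclass|rogue|
"""


def ep_flags(output):
    """Return the set of flags from the 'EP Flags' line, or an empty set."""
    match = re.search(r"^EP Flags\s*:\s*(.*)$", output, re.MULTILINE)
    if not match:
        return set()
    return {flag for flag in match.group(1).strip().split("|") if flag}


def is_rogue(output):
    return "rogue" in ep_flags(output)


flags = ep_flags(SAMPLE)
print(sorted(flags), is_rogue(SAMPLE))
```

The pattern is anchored to `EP Flags` at the start of a line so that the hexadecimal `Flags :` field and the per-IP flags are not matched by mistake.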
Issue #1
What about EP Loop Protection?
Not RECOMMENDED
• Action is potentially disruptive to other stable endpoints.
• BD Learn disable prevents new learns on the entire BD
• Port disable may impact a critical port such as a fabric-interconnect or DCI link. No mechanism to prioritize a host port.
Issue #2
Old IP never times out after new IP is assigned to host
fab4-leaf101# show endpoint ip 10.1.1.101
Legend:
 s - arp              H - vtep             V - vpc-attached     p - peer-aged
 R - peer-attached-rl B - bounce           S - static           M - span
 D - bounce-to-proxy  O - peer-attached    a - local-aged       L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
3028                                  vlan-101        0000.0000.000a    LV              po3
ag:v1                                 vlan-101        169.254.8.62      LV              po3
ag:v1                                 vlan-101        10.1.1.101        LV              po3
Issue #2
Fix: Enable IP Aging Policy
Available in 2.1(1)
System -> System Settings -> Endpoint Controls -> IP Aging
• For aging, an endpoint is a MAC with one or more IP addresses. If the MAC is active then all IPs learned on the MAC will remain active.
• IP Aging policy performs aging on each IP individually
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
Initial Working State
Hosts A and B are on the Service Leaf (SL); host C is on the Border Leaf (BL); IP X is behind the L3Out.
IP X represents a prefix that is learned on the L3Out. During stable state, the service leaf would have an LPM route pointing to the border leaf for this prefix
Border Leaf (BL)
Addr   Interface   Detail
A      tun1        XR -> Service Leaf
B      tun1        XR -> Service Leaf
C      eth1/1      local learn
Service Leaf (SL)
Addr   Interface   Detail
A      eth1/1      local learn
B      eth1/2      local learn
C      tun6        XR -> Border Leaf
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
1. Host-A sends pkt with source-IP X (frame: dmac, smac, SIP-X, DIP-C)
2. Triggers a local learn on service leaf
3. Triggers a learn on border leaf
Border Leaf (BL)
Addr   Interface   Detail
A      tun1        XR -> Service Leaf
B      tun1        XR -> Service Leaf
C      eth1/1      local learn
X      tun1        XR -> Service Leaf
Service Leaf (SL)
Addr   Interface   Detail
A      eth1/1      local learn
B      eth1/2      local learn
C      tun6        XR -> Border Leaf
X      eth1/1      local learn
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
1. Host-C sends pkt to IP X (frame: dmac, smac, SIP-C, DIP-X)
2. BL has learned IP X toward SL
3. Packet incorrectly sent to SL instead of L3Out
Same problem if Host-B tries to send packet to IP X. All connectivity to this IP is broken
Border Leaf (BL)
Addr   Interface   Detail
A      tun1        XR -> Service Leaf
B      tun1        XR -> Service Leaf
C      eth1/1      local learn
X      tun1        XR -> Service Leaf
Service Leaf (SL)
Addr   Interface   Detail
A      eth1/1      local learn
B      eth1/2      local learn
C      tun6        XR -> Border Leaf
X      eth1/1      local learn
Issue #3
Fix: Limit IP Learning to Subnet
Available in 1.1(1)
Tenant -> Networking -> Bridge Domain
• Default setting for new BDs created in 2.3(1e) and 3.0(1k) and above.
Issue #3
Fix: Limit IP Learning to Subnet (Partial Fix)
Frame from Host-A: dmac, smac, SIP-X, DIP-C
1. Local off-subnet learn is ignored
2. Packet is still forwarded to BL
3. Triggers a learn on border leaf
Limit IP learning to subnet prevents off-subnet learn on local leaf but border leaf cannot apply off-subnet logic on XR frame since BD information is not present in packet, only VRF VNID in iVXLAN header
Border Leaf (BL)
Addr   Interface   Detail
A      tun1        XR -> Service Leaf
B      tun1        XR -> Service Leaf
C      eth1/1      local learn
X      tun1        XR -> Service Leaf
Service Leaf (SL)
Addr   Interface   Detail
A      eth1/1      local learn
B      eth1/2      local learn
C      tun6        XR -> Border Leaf
Issue #3
Fix: Enforce Subnet Check
Available in 2.2(2) and 3.0(2)
System -> System Settings -> Fabric Wide Settings
Issue #3
Fix: Enforce Subnet Check
Available in 2.2(2) and 3.0(2)
Frame from Host-A: dmac, smac, SIP-X, DIP-C
1. Local off-subnet learn is ignored
2. XR off-subnet for all BDs in VRF is ignored
• This feature is available only for Gen2 switches and above
• This implicitly enables the local subnet check whether or not it is enabled on the BD (i.e., Limit IP Learning to Subnet on the BD is no longer required).
• For remote learns, the IP is only learned if the IP belongs to at least one BD in the VRF.
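The subnet checks described for local and remote learns can be expressed with the standard `ipaddress` module. This is a sketch of the decision only, under the assumption that subnets are supplied as gateway-style prefixes such as `10.1.1.1/24`; the function name and the second subnet are hypothetical.

```python
import ipaddress


def should_learn_ip(ip, bd_subnets, vrf_subnets, local):
    """Decide whether an IP learn passes the subnet check.

    Local learns are checked against the subnets of the ingress BD.
    Remote (XR) learns carry only the VRF VNID, so they are checked
    against the subnets of every BD in the VRF.
    """
    addr = ipaddress.ip_address(ip)
    candidates = bd_subnets if local else vrf_subnets
    # strict=False accepts gateway-style prefixes such as 10.1.1.1/24.
    return any(addr in ipaddress.ip_network(s, strict=False) for s in candidates)


bd1 = ["10.1.1.1/24"]
vrf = ["10.1.1.1/24", "10.1.2.1/24"]  # second subnet is a made-up example

local_ok = should_learn_ip("10.1.1.101", bd1, vrf, local=True)
local_off = should_learn_ip("192.0.2.10", bd1, vrf, local=True)
remote_ok = should_learn_ip("10.1.2.5", bd1, vrf, local=False)
remote_off = should_learn_ip("192.0.2.10", bd1, vrf, local=False)
print(local_ok, local_off, remote_ok, remote_off)
```

The remote case is intentionally looser than the local one: an address is accepted if it falls in any BD subnet of the VRF, because the ingress BD is not known from the iVXLAN header of a routed frame.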
Issue #4
Stale Endpoint on Border Leaf
Topology: L3Out on the border leaf; Host-A moved from leaf101 to leaf103; Host-B
Traffic from L3out destined to Host-A is bounced through leaf101
Leaf101
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn
Border Leaf
Addr   Interface   Detail
A      tun1        XR -> leaf101 TEP
• In initial state, Host-A has triggered an XR learn on the border leaf. Let’s assume in this example that Host-A was communicating with Host-B.
• Host-A then moves to leaf103. It no longer sends any frames to Host-B but continues sending frames out the L3out toward the border leaf.
• Leaf101 maintains a bounce-entry for Host-A
Issue #4
Stale Endpoint on Border Leaf
Leaf101 (eventually the bounce entry times out)
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn
Border Leaf (HIT bit set, but move ignored due to DL bit)
Addr   Interface   Detail              Hit
A      tun1        XR -> leaf101 TEP   No -> Yes
• Leaf103 is a Gen1 leaf and the VRF is in ingress enforcement. Due to a hardware restriction on Gen1, traffic sent to the L3Out has the DL (don’t-learn) bit set in the iVXLAN header.
• When the border leaf receives the frame, it updates the aging hit bit but does not update the learn entry since the DL bit is set.
• Eventually, the bounce entry on leaf101 will time out but the border leaf will still have an XR entry pointing to leaf-101. Any traffic destined to host-A will be dropped
Issue #4
Stale Endpoint on Border Leaf
Traffic from L3out toward Host-A is sent to leaf-101; leaf-101 drops the packet
Leaf101
Addr   Interface   Detail
A      -           Bounce entry timed-out
Leaf 103
Addr   Interface   Detail
A      eth1/1      local learn
Border Leaf
Addr   Interface   Detail              Hit
A      tun1        XR -> leaf101 TEP   Yes
Entry on BL is now stale. It points to leaf-101, which is not where Host-A exists
Issue #4
Fix: Disable Remote Endpoint Learning on Border Leaf
Available in 2.2(2) and 3.0(1)
System -> System Settings -> Fabric Wide Settings
• No XR IP learning on Border Leaf
• L3Out deployed with VRF in ingress policy enforcement mode
• Prevents stale endpoint caused by Gen1 sending traffic to L3Out with DL bit set
• Note, routed multicast will still trigger an XR IP learn on Border Leaf with Gen2 switches
Stale Endpoint Software Fix
Feature: EP Announce on Bounce Delete
Leaf101
Addr   Interface      Detail
A      tun3, bounce   XR -> leaf103 with bounce bit set
Border Leaf
Addr   Interface   Detail              Hit
A      tun1        XR -> leaf101 TEP   Yes
• Let’s consider the same scenario as Issue #4. Host-A moved from leaf101 to leaf103, a bounce entry for Host-A is present on leaf101, and some flow is resetting the XR hit-bit on the border leaf toward leaf101
Stale Endpoint Software Fix
Feature: EP Announce on Bounce Delete
Leaf101: bounce timer expires, send EP Announce Delete
Addr   Interface   Detail
A      -           Bounce entry timed-out (was tun3, XR -> leaf103)
Border Leaf: announce triggers XR delete on any leaf still pointing to leaf101
Addr   Interface   Detail
A      -           Deleted by announce (was tun1, XR -> leaf101 TEP)
• Enabled by default in 3.2.2 and above, no configuration required
• Supports Gen1 and Gen2
• Prevents stale endpoint issues
Issue #5
I have no control over the devices connected to the network…
• Some environments must support VMs with multiple NICs that perform their own routing OR allow users to spin up their own virtual routers, load-balancers, or firewalls
• There are supported design recommendations to address each scenario, however it is too difficult or not possible to address each in the current network
• Can we just do traditional IP learning?
Diagram labels: users routing through their own virtual firewalls; servers IP load-sharing; virtual routers; dynamic load-balancers
Issue #5
Fix: Disable IP Dataplane Learning on the VRF
Available in 4.0(1)
Tenant -> Networking -> VRFs (IP Dataplane learning setting)
• Local MAC learning still occurs via dataplane
• Remote MAC learning still occurs via dataplane for Gen2
• BD L2 hardware proxy is required to support Gen1 since remote MAC learning will not occur
• Local IPs are only learned via ARP/ND control plane
• Remote IPs are not learned from unicast
• Remote IPs are still learned from routed multicast packets
Issue #5
What about Disable IP Dataplane Learning on the BD?
Tenant -> Networking -> Bridge Domains
Not recommended to disable
• Disabling IP Dataplane learning on the BD is only tested/supported for service graph BDs with PBR
• In 3.1 and above with Gen2, this feature is auto-enabled on the PBR node EPG, so disabling on BD is not required with PBR
Endpoint Control Best Practices
• Run 3.2 or above to take advantage of EP Announce Delete
• Per BD, enable Limit IP Learning to Subnet
• Enable Global IP Aging
• Enable Global Enforce Subnet Check (not applicable for Gen1)
• If Gen1 leaf present, enable Disable Remote EP Learn on Border Leaf
Endpoint Learning Troubleshooting Tips
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
Host A: 10.1.1.101, 0000.0000.000A, EPG: e1, BD: bd1, VRF: v1
• Start with the basics:
  ☐ Verify EPG/BD/VRF basic config
  ☐ What leaf/port is the host connected to?
  ☐ Is the vlan-encap deployed to the leaf?
  ☐ Is the port a member of the vlan?
  ☐ Is the SVI present with gateway config?
  ☐ Is the endpoint learned?
If we were learning the endpoint in the fabric, we could quickly tell which leaf/port it was connected to and, most likely, it would be able to ping its gateway…
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
☐ Is the endpoint learned?
Skip to the last step first, since it can validate all other steps
• Check EP Tracker in APIC UI
• Check for endpoint on APIC CLI:
fab4-apic1# show endpoint ip 10.1.1.101
Legends:
(P):Primary VLAN
(S):Secondary VLAN
Total Dynamic Endpoints: 0
Total Static Endpoints: 0
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
Validate static path attachment and encap. In this example, vpc on node101/102 and VLAN encap 101
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Ensure the BD is associated to the EPG. Also (not shown), ensure the BD is associated to the VRF
• Ensure there are no faults for the EPG that might have stopped deployment to your leaf.
• Network faults may require you to verify your access policy configuration (AEP, phy domain, vlan pool, switch/interface selectors)
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
☐ Is the vlan-encap deployed?
☐ Is the port a member of the vlan?
Port-channel ag_po1001 with id Po3 and member interface Eth1/3:
fab4-leaf101# show port-channel extended | egrep ag_po1001
3     Po3(SU)     ag_po1001     LACP     Eth1/3(P)
Get the PI vlan for the encap (FD) and the BD vlans:
fab4-leaf101# vsh_lc -c 'show system internal eltmc info vlan access_encap_vlan 101' | egrep "vlan_id"
vlan_id:       3028   :::   hw_vlan_id:   3009
vlan_id:       3028   :::   isEpg:        1
bd_vlan_id:    3027   :::   hwEpgId:      12766
Verify my interface is forwarding for both EPG and BD vlans:
fab4-leaf101# show vlan id 3028 extended
VLAN Name                             Encap            Ports
---- -------------------------------- ---------------- ------------------------
3028 ag:app:e1                        vlan-101         Eth1/3, Eth1/4, Eth1/6,
                                                       Po3, Po4
fab4-leaf101# show vlan id 3027 extended
VLAN Name                             Encap            Ports
---- -------------------------------- ---------------- ------------------------
3027 ag:bd1                           vxlan-15958069   Eth1/3, Eth1/4, Eth1/6,
                                                       Po3, Po4
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Is the SVI present with gateway config?
• Is the endpoint learned?
fab4-leaf101# show ip interface vlan 3027
IP Interface Status for VRF "ag:v1"
vlan3027, Interface status: protocol-up/link-up/admin-up, iod: 1028, mode: pervasive
IP address: 10.1.1.1, IP subnet: 10.1.1.0/24
IP broadcast address: 255.255.255.255
IP primary address route-preference: 1, tag: 0
Remember, vlan-3027 is the vlan for bd1
Queries EPM state directly (fast):
fab4-leaf101# show system internal epm endpoint ip 10.1.1.101
<none>
Same command used on APIC, queries epm MIT state:
fab4-leaf101# show endpoint ip 10.1.1.101
Legend:
 s - arp              H - vtep             V - vpc-attached     p - peer-aged
 R - peer-attached-rl B - bounce           S - static           M - span
 D - bounce-to-proxy  O - peer-attached    a - local-aged       L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
<none>
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Is the endpoint learned?
• Is the correct subnet pushed?
• Is learning enabled?
fab4-leaf101# show system internal epm vlan 3027 detail | egrep "Learn|fwd_mode|BD Subnet"
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
fwd_mode : route,bridge ::: fwd_ctrl : mdst-flood,ip-lrn-pfx-check,
BD Subnet ip_pfx-1 : 10.1.1.1/24
fab4-leaf101# vsh_lc -c 'show system internal epmc vlan 3027 detail' | egrep "Learn|fwd_mode|BD Subnet"
fwd_mode : route,bridge ::: fwd_ctrl : mdst-flood,ip-lrn-pfx-check, ::: bridge_mode: mac ::: unk_mac_ucast: proxy
Learning disabled :no
BD Subnet ip_pfx-1 : 10.1.1.1/24
Both epm (sup component) and epmc (LC component) have routing enabled on the BD and learning is enabled.
Also BD subnet list contains our prefix
Gen2 only, ensure that learning is globally enabled in Hal
fab4-leaf101# vsh_lc -c 'show system internal epmc global-info' | egrep "Hal Learn"
Hal Learn Disabled : No
fab4-leaf101# vsh_lc -c 'show platform internal hal learn learn' | egrep status
status : Enabled
status_reason : None
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Is the endpoint learned?
• Is the correct subnet pushed?
• Is learning enabled?
  • Under what conditions do we expect learning to be disabled?
Endpoint Retention Policy
Timer     Default    Applied at BD    Applied at VRF
Local     900 sec    Mac and IP       -
Bounce    630 sec    Mac              IP
Remote    300 sec    Mac              IP
Move      256/sec    -                -
Hold      300 sec    -                -
Remember, if moves per second exceed BD configured policy, learning will temporarily be disabled!
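How the Move and Hold values interact can be sketched with a toy model. This is not EPM's implementation: the class name, the one-second sliding window, and the method names are assumptions made for illustration. Only the defaults (256 moves/sec, 300 sec hold) come from the Endpoint Retention Policy values.

```python
from collections import deque


class MoveDampener:
    """Toy model: hold learning down for hold_sec once moves/sec exceeds the limit."""

    def __init__(self, move_limit=256, hold_sec=300):
        self.move_limit = move_limit   # Move default from the retention policy
        self.hold_sec = hold_sec       # Hold default from the retention policy
        self.moves = deque()           # timestamps of moves inside the window
        self.disabled_until = 0.0

    def learning_enabled(self, now):
        return now >= self.disabled_until

    def record_move(self, now):
        """Register one move at time `now` (seconds); return whether learning is enabled."""
        if not self.learning_enabled(now):
            return False               # moves are ignored while learning is held down
        self.moves.append(now)
        # Drop moves that fell out of the (assumed) one-second window.
        while self.moves and now - self.moves[0] >= 1.0:
            self.moves.popleft()
        if len(self.moves) > self.move_limit:
            self.disabled_until = now + self.hold_sec
            self.moves.clear()
        return self.learning_enabled(now)
```

With a small limit for readability, MoveDampener(move_limit=3) stays enabled for three moves inside one second and reports learning disabled on the fourth, until the hold interval has passed.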
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Is the endpoint learned?
• Are we receiving the frame?
What tools do we have to help? SPAN, ELAM (ELAM-Assistant App)
fab4-leaf101# show endpoint mac 0000.0000.000a
Legend:
 s - arp              H - vtep             V - vpc-attached     p - peer-aged
 R - peer-attached-rl B - bounce           S - static           M - span
 D - bounce-to-proxy  O - peer-attached    a - local-aged       L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
291/ag:v1                             vlan-102        0000.0000.000a    LV              po3
We did learn the MAC, but in the wrong vlan. Misconfigured host
• We got lucky that the vlan-encap the host was sending in was configured on the leaf, else the frame would have been dropped and no MAC learn triggered
• Why wasn’t the IP learned? Limit IP Learning to Subnet is enabled by default and vlan-102 is in a different BD, or unicast routing is disabled on that BD
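The two reasons given for the missing IP learn can be written as a small check. This is a simplified sketch, not the leaf's actual logic: the function name and parameters are hypothetical, and the 10.1.2.1/24 subnet used for the other BD in the usage note is an assumed value.

```python
import ipaddress


def ip_learn_allowed(src_ip, bd_subnets, unicast_routing=True, limit_to_subnet=True):
    """Return True if the BD would be expected to learn src_ip (simplified model)."""
    if not unicast_routing:
        return False          # no IP learning when routing is disabled on the BD
    if not limit_to_subnet:
        return True           # simplification: no subnet check applied
    ip = ipaddress.ip_address(src_ip)
    # BD subnets are written gateway/prefix-length, e.g. "10.1.1.1/24".
    return any(ip in ipaddress.ip_interface(s).network for s in bd_subnets)
```

For Host-A, ip_learn_allowed("10.1.1.101", ["10.1.1.1/24"]) is true, while the same source IP arriving on a BD whose only subnet is 10.1.2.1/24 (assumed) is rejected, leaving only the MAC learn.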
Packet Walk Checklist
Fixed: Host-A can ping the gateway
• Is the endpoint learned?
Fixed the host config and now we’re learning the IP!
fab4-leaf101# show endpoint ip 10.1.1.101
Legend:
 s - arp              H - vtep             V - vpc-attached     p - peer-aged
 R - peer-attached-rl B - bounce           S - static           M - span
 D - bounce-to-proxy  O - peer-attached    a - local-aged       L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
3028                                  vlan-101        0000.0000.000a    LV              po3
ag:v1                                 vlan-101        10.1.1.101        LV              po3
fab4-leaf101# show system internal epm endpoint ip 10.1.1.101
MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags :
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80000c05 ::: sclass : 10932 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:14:09.965041
EP Update Timestamp : 05/17/2019 02:14:09.965041
EP Flags : local|vPC|IP|MAC|sclass|
::::
Remember that epm/epmc treat an endpoint as a MAC with one or more IPs, so MAC is also displayed for local IP endpoints
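When comparing EPM state across several leaves, it can help to turn the "key : value ::: key : value" layout of the epm endpoint output into a dictionary. The sketch below is a convenience parser written for this deck's sample output only. parse_epm and SAMPLE are hypothetical names, and fields with no value (such as "IP# 0 flags :") are skipped.

```python
SAMPLE = """\
MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags :
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Interface : port-channel3
Flags : 0x80000c05 ::: sclass : 10932 ::: Ref count : 5
EP Flags : local|vPC|IP|MAC|sclass|
::::
"""


def parse_epm(text):
    """Parse 'key : value ::: key : value' lines into a flat dict of strings."""
    record = {}
    for line in text.splitlines():
        for field in line.split(":::"):
            key, sep, value = field.partition(" : ")
            if sep and key.strip():
                record[key.strip()] = value.strip()
    return record


if __name__ == "__main__":
    ep = parse_epm(SAMPLE)
    print(ep["MAC"], ep["VRF name"], ep["sclass"])
```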
Packet Walk Checklist
Fixed: Host-A can ping the gateway
• Does coop have the endpoint?
Bonus validation
Verify endpoint in coop using VRF vnid and IP address
fab4-spine201# show coop internal info ip-db key 2555909 10.1.1.101
IP address : 10.1.1.101
Vrf : 2555909
Flags : 0
EP bd vnid : 15958069
EP mac : 00:00:00:00:00:0A
Publisher Id : 10.0.128.93
Record timestamp : 06 09 2019 13:32:53 827717825
Publish timestamp : 06 09 2019 13:32:53 828777370
Seq No: 0
Remote publish timestamp: 12 31 1969 19:00:00 0
URIB Tunnel Info
Num tunnels : 1
Tunnel address : 10.0.128.95
Tunnel ref count : 1
::::
EP bd vnid / EP mac: Mac and BD VNID
Tunnel address: pTEP/vTEP/eTEP of leaf/pod/site
• Endpoint must be in coop in order for proxy lookups to work. This is critical
for XR miss for both intra/inter-pod and intra/inter-site. You should see the
same state on all spines.
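The "same state on all spines" check can be sketched as a comparison of per-spine coop records. This assumes the ip-db output of each spine has already been collected and parsed into a dict (or None when the key is absent). The function name and the second spine name in the usage note are hypothetical.

```python
def coop_mismatches(records):
    """records maps spine name -> parsed coop record (dict), or None if missing.

    Returns a list of problems; an empty list means every spine agrees.
    """
    problems = [f"{spine}: endpoint missing from coop"
                for spine, rec in records.items() if rec is None]
    present = [rec for rec in records.values() if rec is not None]
    tunnels = {rec.get("Tunnel address") for rec in present}
    if len(tunnels) > 1:
        problems.append("tunnel address differs across spines: "
                        + ", ".join(sorted(str(t) for t in tunnels)))
    return problems
```

For example, coop_mismatches({"fab4-spine201": {"Tunnel address": "10.0.128.95"}, "fab4-spine202": None}) reports the second spine as missing the endpoint.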
Packet Walk Checklist
Fixed: Host-A can ping the gateway
• Does coop have the endpoint?
Bonus validation
Verify endpoint is in coop using BD VNID and mac address
fab4-spine201# show coop internal info repo ep key 15958069 00:00:00:00:00:0A | egrep "^Vrf|^Tunnel nh|^EP|num of active|^Real"
EP bd vnid : 15958069
EP mac : 00:00:00:00:00:0A
Vrf vnid : 2555909
Tunnel nh : 10.0.128.95
num of active ipv4 addresses : 4
num of active ipv6 addresses : 1
Real IPv4 EP : 10.1.1.101
Real IPv4 EP : 10.1.1.102
Real IPv4 EP : 10.1.1.103
Real IPv4 EP : 10.1.1.104
Real IPv6 EP : 2001:0000:0000:0000:0000:0000:0000:0065
Tunnel nh: Tunnel next-hop
Real IPv4/IPv6 EP: IPv4/IPv6 addresses tied to this MAC
Endpoint Learning Troubleshooting Review
• Verify logical config (EPG/BD/VRF and contracts)
• Verify no network faults under the EPG that would prevent the encap from being deployed
• Verify that the leaf has the encap deployed
• Verify that the port is a member of the vlan
• Verify that the SVI is present on the leaf with the proper subnets
• Verify that local leaf is learning the endpoint
• Verify learning is enabled on the BD
• Verify software components have the correct BD prefixes programmed
• Verify the leaf is receiving the frame on expected interface and encapsulation
• Verify that endpoint is present in coop and coop has correct tunnel address
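The earlier tip to check the last step first, since it can validate the others, can be expressed as a small helper over an ordered checklist. This is only an illustration of that ordering: the function name and the example check names are hypothetical, and each check is assumed to be a callable that returns True when the step is healthy.

```python
def first_failure(checks):
    """checks: ordered list of (name, callable returning bool).

    Try the last check first. If it passes, the earlier steps are taken as
    healthy and None is returned. Otherwise walk the earlier steps in order
    and return the name of the first one that fails.
    """
    if not checks:
        return None
    last_name, last_check = checks[-1]
    if last_check():
        return None
    for name, check in checks[:-1]:
        if not check():
            return name
    return last_name  # every earlier step passed, so the last step is the problem
```

Usage: first_failure([("encap deployed", lambda: True), ("port in vlan", lambda: False), ("endpoint learned", lambda: False)]) returns "port in vlan".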
Recommended Troubleshooting Apps
https://aciappcenter.cisco.com/
ELAM Assistant
The ELAM Assistant performs ELAM to capture a packet and decode the result.
ELAM is a built-in tool that captures a single packet at the ASIC level to check forwarding decision details. It is typically used by Cisco TAC as it requires a deep knowledge of each ACI ASIC to both perform and correctly understand the resulting output.
This app wraps the differences between each ACI ASIC and provides a UI to perform an ELAM capture for those who don't have access to ASIC level information. It then decodes the results of the ELAM capture in a user-friendly format.
EnhancedEndpointTracker
The EnhancedEndpointTracker is a Cisco ACI application that maintains a database of endpoint events on a per-node basis allowing for unique fabric-wide analysis. The application can be configured to analyze, notify, and automatically remediate various endpoint events. This gives ACI fabric operators better visibility and control over the endpoints in the fabric.
Enhanced Endpoint Tracker
Active endpoint count
and fast search
Start/Stop the monitor
Uptime of the monitor
and number of queued
events to process
Health/history of the
monitor itself
Enhanced Endpoint Tracker
Fast search for IP or MAC
~150ms for search to
complete
Enhanced Endpoint Tracker
Historical tables to browse various events along
with browsing all endpoints in the fabric
Top moves in the fabric, quickly see any
unstable/misconfigured endpoints
Enhanced Endpoint Tracker
Full details of current state of endpoint within
the fabric including local and XR learns
Also per-node detailed history, move events, and rapid/offsubnet/stale/clear events
History of where endpoint was learned or if it
was deleted from the fabric
Enhanced Endpoint Tracker
Clear problem endpoints on
multiple nodes quickly
Cisco Live sessions will be available for viewing
on demand after the event at ciscolive.cisco.com.
Appendix
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• Is this frame bridged or routed?
  Subtle but important. If bridged then we need to check MAC endpoints, if routed we need to check IP…
• Am I learning Host-A and Host-B IPs in the fabric?
• Do we have a remote learn for Host-B on ingress leaf or are we using proxy-path?
• Do the spines have Host-B entry programmed to handle proxy forwarding?
• For the leaf that is performing policy enforcement, do I have the appropriate contract?
Topology: leaf101, leaf102, leaf103; VRF: v1
Host-A: 10.1.1.101, 0000.0000.000A, EPG: e1, BD: bd1
Host-B: 10.1.2.102, 0000.0000.000B, EPG: e2, BD: bd2
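A quick way to answer "bridged or routed?" from the host's point of view is to compare the destination with the source subnet. The sketch below encodes only that host-side rule and is not a statement of how the leaf classifies frames. The function name is hypothetical, and the /24 prefix length is taken from the 10.1.1.0/24 SVI shown earlier.

```python
import ipaddress


def lookup_type(src_ip, dst_ip, src_prefix_len):
    """Same subnet -> 'bridged' (check MAC endpoints); otherwise 'routed' (check IP endpoints)."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{src_prefix_len}").network
    return "bridged" if ipaddress.ip_address(dst_ip) in src_net else "routed"
```

For Host-A and Host-B, lookup_type("10.1.1.101", "10.1.2.102", 24) returns "routed", so the IP endpoints are the ones to validate.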
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• Am I learning Host-A and Host-B IPs in the fabric?
We can check the endpoint directly on the APIC. If not present, then repeat previous local learn troubleshooting
fab4-apic1# show endpoint ip 10.1.1.101
<snip>
Dynamic Endpoints:
Tenant      : ag
Application : app
AEPg        : e1
End Point MAC      IP Address                                Node        Interface
-----------------  ----------------------------------------  ----------  ------------------------------
00:00:00:00:00:0A  10.1.1.101                                101 102     vpc ag_po1001
fab4-apic1# show endpoints ip 10.1.2.102
<snip>
Dynamic Endpoints:
Tenant      : ag
Application : app
AEPg        : e2
End Point MAC      IP Address                                Node        Interface
-----------------  ----------------------------------------  ----------  ------------------------------
00:00:00:00:00:0B  10.1.2.102                                103         eth1/5
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• Do we have a remote learn for Host-B on ingress leaf or are we using proxy-path?
fab4-leaf101# show endpoint ip 10.1.2.102
Legend:
 s - arp              H - vtep             V - vpc-attached     p - peer-aged
 R - peer-attached-rl B - bounce           S - static           M - span
 D - bounce-to-proxy  O - peer-attached    a - local-aged       L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
<none>
Leaf-101 (ingress leaf) does not have an XR IP learn for Host-B
fab4-leaf101# show ip route 10.1.2.0 vrf ag:v1
IP Route Table for VRF "ag:v1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>
10.1.2.0/24, ubest/mbest: 1/0, attached, direct, pervasive
    *via 10.0.208.64%overlay-1, [1/0], 00:24:38, static, tag 4294967295
         recursive next hop: 10.0.208.64/32%overlay-1
Ensure that the route has pervasive flag for ‘pervasive BD’
fab4-leaf101# show isis dteps vrf overlay-1 | grep 10.0.208.64
10.0.208.64     SPINE     N/A     PHYSICAL,PROXY-ACAST-V4
Next-hop IP is spine anycast IPv4 Proxy
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• Do the spines have Host-B entry programmed to handle proxy?
First, we need the VNID for the VRF to validate routed flow.
We can get the VRF VNID from the leaf:
fab4-leaf101# moquery -c fvCtxDef -x 'query-target-filter=eq(fvCtxDef.ctxDn,"uni/tn-ag/ctx-v1")'
scope : 2555909
…
fab4-leaf101# vsh_lc -c 'show system internal eltmc info vrf ag:v1' | egrep vnid: | head -1
overlay_index: 0 ::: vnid: 2555909
We can get the VRF VNID from the APIC CLI:
fab4-apic1# moquery -d uni/tn-ag/ctx-v1 | egrep scope
scope : 2555909
We can get the VRF VNID from the APIC UI:
Tenant -> Networking -> VRFs
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• Do the spines have Host-B entry programmed to handle proxy?
fab4-spine201# show coop internal info ip-db key 2555909 10.1.2.102
IP address : 10.1.2.102
Vrf : 2555909
Flags : 0x2
EP bd vnid : 16187409
EP mac : 00:00:00:00:00:0B
Publisher Id : 10.4.0.2
Record timestamp : 12 31 1969 19:00:00 0
Publish timestamp : 12 31 1969 19:00:00 0
Seq No: 0
Remote publish timestamp: 05 17 2019 02:22:08 814730181
URIB Tunnel Info
Num tunnels : 1
Tunnel address : 10.0.16.94
Tunnel ref count : 1
Spine has the entry in coop (should validate each spine)
The tunnel address can be one of several different types of TEPs:
• Physical TEP within same pod
• VPC TEP within same pod
• Anycast External IP for remote pod or site
• RemoteLeaf PTEP leaf
In this case, this is leaf103 PTEP
admin@fab4-apic1:~> acidiag fnvread | grep 10.0.16.94
103     1     fab4-leaf103     SAL19069BUY     10.0.16.94/32
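Mapping a coop tunnel address back to a node can be scripted against saved acidiag fnvread output. The sketch below assumes the first five columns are node id, pod id, name, serial, and IP/mask, exactly as they appear in the grep result above. The function name is hypothetical.

```python
def node_for_tep(fnvread_lines, tunnel_address):
    """Return (node_id, name) for the node whose TEP equals tunnel_address, else None.

    Assumed column order: id, pod, name, serial, ip/mask.
    """
    for line in fnvread_lines:
        cols = line.split()
        if len(cols) >= 5 and cols[4].split("/")[0] == tunnel_address:
            return int(cols[0]), cols[2]
    return None
```

node_for_tep(["103 1 fab4-leaf103 SAL19069BUY 10.0.16.94/32"], "10.0.16.94") returns (103, "fab4-leaf103"). A None result suggests the address is not a node's physical TEP and may be one of the other TEP types listed above.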
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• For the leaf that is performing policy enforcement, do I have the appropriate contract?
Which Leaf applies the contract?
• Ingress leaf applies contract if remote endpoint is known so packet does not have to be forwarded all the way through the fabric
• Egress leaf applies contract if packet was sent via spine proxy. Will focus on leaf-103
• Border leaf in ingress policy enforcement does not apply contract unless application EPG is deployed locally.
To Verify Contract
• VRF VNID
• Source EPG pcTag (Host-A)
• Destination EPG pcTag (Host-B)
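The first two rules for which leaf applies the contract reduce to one decision. The sketch below models only those two rules. The border leaf exception is deliberately left out, and the function and parameter names are hypothetical.

```python
def enforcement_leaf(ingress_leaf, egress_leaf, remote_ep_known):
    """Pick the leaf expected to apply the contract (simplified).

    The ingress leaf enforces when it already has a remote (XR) learn for the
    destination. Otherwise the packet goes via spine proxy and the egress
    leaf enforces.
    """
    return ingress_leaf if remote_ep_known else egress_leaf
```

In this example leaf101 has no XR learn for Host-B, so enforcement_leaf("leaf101", "leaf103", False) returns "leaf103", which is why the contract checks focus there.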
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• For the leaf that is performing policy enforcement, do I have the appropriate contract?
Host-A local EPM entry on leaf101 contains source pcTag
fab4-leaf101# show system internal epm end ip 10.1.1.101
MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags :
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80004c05 ::: sclass : 49155 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:14:09.965041
EP Update Timestamp : 05/17/2019 03:46:08.819921
EP Flags : local|vPC|IP|MAC|sclass|timer|
::::
Host-B local EPM entry on leaf103 contains dest pcTag
fab4-leaf103# show system internal epm endpoint ip 10.1.2.102
MAC : 0000.0000.000b ::: Num IPs : 1
IP# 0 : 10.1.2.102 ::: IP# 0 flags :
Vlan id : 279 ::: Vlan vnid : 8293 ::: VRF name : ag:v1
BD vnid : 16187409 ::: VRF vnid : 2555909
Phy If : 0x1a004000 ::: Tunnel If : 0
Interface : Ethernet1/5
Flags : 0x80004c04 ::: sclass : 16389 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:21:47.612351
EP Update Timestamp : 05/17/2019 03:45:01.836174
EP Flags : local|IP|MAC|sclass|timer|
::::
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• For the leaf that is performing policy enforcement, do I have the appropriate contract?
fab4-leaf103# show zoning-rule scope 2555909
Rule ID   SrcEPG   DstEPG   FilterID   operSt    Scope     Action
=======   ======   ======   ========   ======    =====     ======
4419      0        0        implicit   enabled   2555909   deny,log
4420      0        0        implarp    enabled   2555909   permit
4421      0        15       implicit   enabled   2555909   deny,log
4535      0        49154    implicit   enabled   2555909   permit
Available since 3.2.2:
fab4-leaf103# contract_parser.py --vrf ag:v1
Key:
[prio:RuleId] [vrf:{str}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:{str}] [hit=count]
[16:4535] [vrf:ag:v1] permit any epg:any tn-ag/bd-bd2(49154) [contract:implicit] [hit=0]
[16:4420] [vrf:ag:v1] permit arp epg:any epg:any [contract:implicit] [hit=0]
[21:4419] [vrf:ag:v1] deny,log any epg:any epg:any [contract:implicit] [hit=5157]
[22:4421] [vrf:ag:v1] deny,log any epg:any pfx-0.0.0.0/0(15) [contract:implicit] [hit=0]
fab4-leaf103# show logging ip access-list internal packet-log deny | egrep 10.1.2.102 | head
[ Fri May 17 04:02:02 2019 634490 usecs]: CName: ag:v1(VXLAN: 2555909), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 10.1.1.101, DIP: 10.1.2.102, SPort: 0, DPort: 0, Src Intf: Tunnel14, Proto: 1, PktLen: 98
<snip>
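Why the ping lands on the implicit deny can be reasoned through with a small matcher over the leaf103 rules shown above. This is a sketch under stated assumptions, not the hardware lookup: it assumes a lower prio number is evaluated first, that pcTag 0 means any EPG, and that protocol "any" matches everything. Rule, RULES, and match are hypothetical names.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    prio: int
    rule_id: int
    action: str
    proto: str   # "any" is assumed to match every protocol
    src: int     # pcTag; 0 is assumed to mean any EPG
    dst: int


# Values copied from the leaf103 zoning-rule and contract_parser output.
RULES = [
    Rule(16, 4535, "permit", "any", 0, 49154),
    Rule(16, 4420, "permit", "arp", 0, 0),
    Rule(21, 4419, "deny,log", "any", 0, 0),
    Rule(22, 4421, "deny,log", "any", 0, 15),
]


def match(rules, src, dst, proto) -> Optional[Rule]:
    """Return the first rule, in assumed priority order, that matches the flow."""
    for r in sorted(rules, key=lambda r: r.prio):
        if r.src in (0, src) and r.dst in (0, dst) and r.proto in ("any", proto):
            return r
    return None
```

With Host-A pcTag 49155 and Host-B pcTag 16389, match(RULES, 49155, 16389, "icmp") returns rule 4419 (deny,log), which is consistent with the hit counter on that rule and the packet-log entry.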
Packet Walk Checklist
Problem: Host-A cannot ping Host-B
• For the leaf that is performing policy enforcement, do I have the appropriate contract?
In this instance the contract was missing. Add the proper consumer/provider and/or vzAny/preferred group updates to allow communication between the two EPGs
fab4-leaf101# show zoning-rule scope 2555909 | egrep "Rule|===|16389"
Rule ID   SrcEPG   DstEPG   FilterID   operSt    Scope     Action
=======   ======   ======   ========   ======    =====     ======
4735      49155    16389    7          enabled   2555909   permit
4700      49155    16389    default    enabled   2555909   permit
4736      16389    49155    default    enabled   2555909   permit
6137      16389    49155    6          enabled   2555909   permit
Traffic from Host-A (pcTag 49155) to Host-B (pcTag 16389)
fab4-leaf101# contract_parser.py --vrf ag:v1 --epg tn-ag/ap-app/epg-e1
Key:
[prio:RuleId] [vrf:{str}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:{str}] [hit=count]
[7:6137] [vrf:ag:v1] permit ip tcp tn-ag/ap-app/epg-e2(16389) tn-ag/ap-app/epg-e1(49155) eq 80 [contract:uni/tn-ag/brc-c1] [hit=0]
[7:4735] [vrf:ag:v1] permit ip tcp tn-ag/ap-app/epg-e1(49155) eq 80 tn-ag/ap-app/epg-e2(16389) [contract:uni/tn-ag/brc-c1] [hit=0]
[9:4736] [vrf:ag:v1] permit any tn-ag/ap-app/epg-e2(16389) tn-ag/ap-app/epg-e1(49155) [contract:uni/tn-ag/brc-c1] [hit=0]
[9:4700] [vrf:ag:v1] permit any tn-ag/ap-app/epg-e1(49155) tn-ag/ap-app/epg-e2(16389) [contract:uni/tn-ag/brc-c1] [hit=220,+10]