BRKACI-2934.pdf - Cisco Live
Cisco Webex Teams
Questions? Use Cisco Webex Teams to chat with the speaker after the session
How
1. Find this session in the Cisco Events Mobile App
2. Click “Join the Discussion”
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space
© 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public
Agenda
• Introduction
• Multipod Overview
• Troubleshooting the Multipod Setup Process
• Troubleshooting Unicast Flows
• Troubleshooting Multi-destination Flows
• Troubleshooting External Routed Communication
• Quality of Service
• Conclusion
Acronyms/Definitions
Acronym       Definition
ACI           Application Centric Infrastructure
ACL           Access Control List
APIC/IFC      Application Policy Infrastructure Controller / Insieme Fabric Controller
BD            Bridge Domain
COOP          Council of Oracle Protocol
ECMP          Equal Cost Multipath
EP            Endpoint
EPG           Endpoint Group
EVPN          Ethernet VPN BGP Address-Family
FTEP/VTEP     Fabric/Virtual or VXLAN Tunnel Endpoint
GIPo          Outer Group IP Address
ISIS          Intermediate System to Intermediate System
LPM           Longest Prefix Match
MDT           Multicast Distribution Tree
MST           Multiple Spanning Tree
OSPF          Open Shortest Path First Protocol
pcTag         Policy Control Tag
PIM           Protocol Independent Multicast
PL            Physical Local
SVI           Switch Virtual Interface
TC            Topology Change
VL            Virtual Local
VNID          Virtual Network Identifier
VPNv4         BGP VPN Address-Family
VXLAN/iVXLAN  Virtual Extensible LAN / Insieme VXLAN
XR            VXLAN Remote
Feature Evolution
• Effective Troubleshooting requires understanding…
• Why does the feature exist?
• What problems does it solve?
• How does it solve them?
• How do the components interact?
Understand the “why”
I said no marketing…why is this necessary?
Feature Evolution – Classic ACI
• VXLAN TEP reachability learned through ISIS
• Endpoint Repo on Spines handled by COOP
• MP-BGP to distribute external routes through fabric
[Diagram: a single fabric of spines, leafs, and three APICs running ISIS, COOP, and MP-BGP]
Feature Evolution
What if ACI must be extended to other locations?
• TEP reachability must be communicated
• Endpoints must be synced across locations
• Mechanism needed for BUM traffic
• APIC Cluster must be extended
Feature Evolution – Stretched Fabric
[Diagram: a single fabric stretched across two locations, each with spines and leafs, with three APICs in the cluster]
Feature Evolution – Stretched Fabric
• Transit leafs connect to all spines
• COOP, ISIS, and BGP extended across locations
Not scalable
Feature Evolution – Multipod
• Single Fabric Extended
• Each pod runs its own local instance of ISIS and COOP
• Inter-pod connectivity through IPN
• Inter-pod BUM uses PIM-Bidir
• BGP between pods to share endpoints and external routes
[Diagram: Pod1 and Pod2, each with APICs, spines, and leafs running their own ISIS, COOP, and MP-BGP; the pods connect through the IPN (OSPF, PIM) and exchange BGP VPNv4/EVPN]
IPN Requirements
❑ OSPF
❑ DHCP relay
❑ Jumbo MTU (9150 Bytes)
❑ Routed Subinterfaces
❑ PIM Bidir with at least /15 Mask
❑ QoS (optional)
Multipod Setup Overview
1. Configure Pod 1 (TEP pool, infra l3out)
2. Configure Remote Pod (TEP pool, infra l3out)
3. Register Remote Pod Spines (DHCP)
4. Discover Remote Pod Leafs (LLDP)
5. Remote Pod APICs join cluster
Multipod Setup Process
Setting up Pod 1 (Seed Pod)
Multipod Setup Process
➢ Configure Addressing for Pod 1 Spine > IPN connection
L3 Interface used for OSPF peering with IPN. Make sure MTU matches IPN!
➢ Configure OSPF parameters for Pod 1 Spine > IPN connection
➢ Configure Dataplane TEP
Anycast Address used for Pod x > Pod 1 Proxied traffic. More on this in Unicast Section
Not needed for Multipod, leave blank if not used
➢ Review POD1 configurations
After setting up Seed Pod (Pod 1)…
✓ Verify OSPF is up between Pod1 Spines and IPN
pod1-spine1# show ip ospf neighbors vrf overlay-1
OSPF Process ID default VRF overlay-1
Total number of neighbors: 1
Neighbor ID     Pri State      Up Time   Address      Interface
172.31.255.1    1   FULL/ -    00:00:14  172.21.0.0   Eth1/23.23
✓ Ensure IPN is pre-provisioned for Pod 2 Connections
interface Ethernet1/21.4
  mtu 9150
  encapsulation dot1q 4
  vrf member P1IPN
  ip address 172.22.0.0/31
  ip ospf network point-to-point
  ip router ospf P1IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.0.0.1
  ip dhcp relay address 10.0.0.2
  no shutdown
Callouts:
- Configure Jumbo MTU
- Use VLAN 4 Subinterface
- Ensure PIM is Enabled
- Make sure DHCP relay is configured pointing to APIC Infra IPs
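The callouts above can be turned into a quick offline check of an IPN subinterface configuration. The following is a minimal sketch, not a Cisco tool: the function name `check_ipn_interface` is hypothetical, and treating any MTU of 9150 or higher as acceptable is an assumption.

```python
import re

REQUIRED_MTU = 9150   # jumbo MTU from the IPN requirements (assumed to be a minimum)
REQUIRED_VLAN = 4     # spine-facing subinterface uses dot1q 4

def check_ipn_interface(config):
    """Return a list of problems found in one IPN subinterface config (hypothetical helper)."""
    problems = []
    mtu = re.search(r"^\s*mtu (\d+)", config, re.M)
    if not mtu or int(mtu.group(1)) < REQUIRED_MTU:
        problems.append("jumbo MTU not configured")
    encap = re.search(r"^\s*encapsulation dot1q (\d+)", config, re.M)
    if not encap or int(encap.group(1)) != REQUIRED_VLAN:
        problems.append("subinterface is not dot1q 4")
    if not re.search(r"^\s*ip pim sparse-mode", config, re.M):
        problems.append("PIM not enabled")
    if not re.search(r"^\s*ip dhcp relay address \S+", config, re.M):
        problems.append("no DHCP relay configured")
    return problems

# Interface configuration taken from the slide
SAMPLE = """\
interface Ethernet1/21.4
  mtu 9150
  encapsulation dot1q 4
  vrf member P1IPN
  ip address 172.22.0.0/31
  ip ospf network point-to-point
  ip router ospf P1IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.0.0.1
  ip dhcp relay address 10.0.0.2
  no shutdown
"""

print(check_ipn_interface(SAMPLE))
```

The check does not confirm that the relay addresses are the APIC infra IPs; that still has to be compared against the APICs themselves.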
Multipod Setup Process
Setting up Pod 2
➢ Assign TEP pool to non-seed Pod
➢ Configure L3 parameters
L3 Interface used for OSPF peering with Pod 2 IPN.
➢ Configure OSPF parameters
➢ Configure Dataplane TEP for Pod 2
Anycast Address used for Pod x > Pod 2 Proxied traffic. More on this in Unicast Section
Not needed for Multipod, leave blank if not used
Where to find the lesser-known MPOD configurations?
- Dataplane TEPs from Setup
- Spine > IPN subnets leaked into ISIS
- Leave default, allows PODs to import BGP paths from each other
Multipod Setup Process
➢ POD 2 Spines should now be discoverable
Remote Pod Discovery
1. Remote Pod Spines send DHCP DISCOVER. IPN relays to APICs.
[Diagram: DHCP DISCOVER from the Pod2 spines relayed through the IPN toward the APICs]
2. IP Address from Multipod l3out is assigned.
[Diagram: DHCP OFFER returned through the IPN to the Pod2 spines]
What’s in the DHCP OFFER?
• IP address from L3out interface profile assigned
• Gateway is next-hop for default route
• Bootstrap file communicates location of l3out Config
Packet capture callouts:
- Offered IP (from l3out interface profile)
- Directory on APIC from which Spine downloads full l3out configuration (*full directory is /firmware/fwrepos/fwrepo/boot/bootstrap-202.xml)
- Pod2-facing IPN IP address (relay)
- Default Gateway, used for downloading config
Remote Pod Discovery
3. Spine configures static default route for APIC reachability with NH of IPN.
pod2-spine2# vsh -c "show ip route 0.0.0.0/0 vrf overlay-1"
IP Route Table for VRF "overlay-1"
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

0.0.0.0/0, ubest/mbest: 1/0
    *via 172.22.0.0, Eth1/23.23, [250/0], 5d17h, static
Remote Pod Discovery
4. Spine downloads bootstrap XML from APIC, which contains the l3out configuration
pod1-apic1# grep bootstrap /var/log/dme/log/access.log
172.22.0.1 - - [18/Apr/2019:14:13:16 +0000] "GET /fwrepo/boot/bootstrap-202.xml HTTP/1.1" 200 8561 "-" "-"
200 OK code. GET was a success!
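When several spines are being discovered, the same check can be scripted against a copy of the access log. This is a rough sketch assuming the log line layout shown above; the regular expression and the function name `bootstrap_downloads` are illustrative, not part of any APIC tooling.

```python
import re

# Matches the access-log layout shown on the slide (assumed to be stable)
LOG_LINE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"GET (?P<path>\S*bootstrap-\d+\.xml) HTTP/[\d.]+" (?P<status>\d{3}) '
)

def bootstrap_downloads(log_text):
    """Return (client IP, requested path, HTTP status) for each bootstrap GET."""
    results = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            results.append((match["client"], match["path"], int(match["status"])))
    return results

SAMPLE = ('172.22.0.1 - - [18/Apr/2019:14:13:16 +0000] '
          '"GET /fwrepo/boot/bootstrap-202.xml HTTP/1.1" 200 8561 "-" "-"')

for client, path, status in bootstrap_downloads(SAMPLE):
    print(client, path, "OK" if status == 200 else f"failed ({status})")
```

A spine that never appears in the result has not reached the APIC at all, which points back to the DHCP and default-route steps.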
switch# moquery -c topSystem
# top.System
address : 0.0.0.0
bootstrapState : downloading-bootstrap-config
role : spine
Remote Pod Discovery
5. Spine acts as self relay for TEP DHCP request
6. TEP address from POD2 pool is assigned
[Diagram: a Pod2 spine sends a DHCP DISCOVER sourced from Lo0 through the IPN and receives an OFFER]
Remote Pod Discovery
7. Pod2 Leafs discovered through normal process (LLDP/DHCP)
[Diagram: Pod2 leafs discovered by the Pod2 spines using LLDP and DHCP]
Remote Pod Discovery
8. Pod2 APIC(s) join cluster
*Non-seed pod APICs still use Pod1 TEP Pool!
[Diagram: a third APIC attached to a Pod2 leaf joins the cluster across the IPN]
Common Multipod Discovery Problems
Issue #1: Pod2 Spines Don’t Receive L3out IP or Config
Possible Causes
1. DHCP Relays on IPN point to APIC OOB rather than infra
✓ Configure Relays to point to infra (show controller on APICs)
2. IPN doesn’t have route to APICs
✓ Check that OSPF is up between IPN and Pod1
3. Miscabling results in Spine receiving IP in different subnet than GW
✓ Correct cabling or addressing, then remove and rediscover Spine
4. Spines can’t resolve ARP for connected IPN interface
✓ Ensure SW version supports multipod + spine hw (ex: for 9364C MPOD supported in 3.1(1))
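Cause 3 (a spine offered an IP outside the gateway's subnet) is easy to confirm offline once the offered address and the gateway are known. A minimal sketch using Python's standard `ipaddress` module; the helper name `same_subnet` is hypothetical, the first pair of addresses comes from the /31 link shown earlier in this deck, and the second pair is an invented miscabling example.

```python
import ipaddress

def same_subnet(offered_ip_with_mask, gateway):
    """True if the gateway falls inside the subnet of the offered interface address."""
    interface = ipaddress.ip_interface(offered_ip_with_mask)
    return ipaddress.ip_address(gateway) in interface.network

# Spine 172.22.0.1/31 with IPN gateway 172.22.0.0 (values from the earlier slides)
print(same_subnet("172.22.0.1/31", "172.22.0.0"))   # True
# Invented example: the offer belongs to a different link than the cabled gateway
print(same_subnet("172.23.0.1/31", "172.22.0.0"))   # False
```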
Common Multipod Discovery Problems
Issue #2: Pod2 Spines Don’t Receive TEP Addresses
Ensure leafs are connected to spine
- Spine TEP not assigned until leaf-facing interfaces are “up”
Common Multipod Discovery Problems
Issue #3: Remote Pod APIC Not Joining Cluster
Check your setup parameters!
Cluster configuration ...
  Enter the fabric name [ACI Fabric1]: CL-Fab
  Enter the fabric ID (1-128) [1]: 1
  Enter the number of active controllers in the fabric (1-9) [3]:
  Enter the number of active controllers in the fabric (1-9) [3]: 3
  Enter the POD ID (1-12) [1]: 2
  Is this a standby controller? [NO]:
  Is this an APIC-X? [NO]:
  Enter the controller ID (1-3) [1]: 3
  Enter the controller name [apic3]: p2-apic3
  Enter address pool for TEP addresses [10.0.0.0/16]:
  Note: The infra VLAN ID should not be used elsewhere in your environment
        and should not overlap with any other reserved VLANs on other platforms.
  Enter the VLAN ID for infra network (1-4094): 3967
Out-of-band management configuration ...
  Enable IPv6 for Out of Band Mgmt Interface? [N]:
  Enter the IPv4 address [192.168.10.1/24]: 10.122.143.14/26
  Enter the IPv4 address of the default gateway [None]: 10.122.143.1
  Enter the interface speed/duplex mode [auto]:
Callouts:
- Ensure Pod ID is Correct
- Remote Pod APIC must use Pod 1 TEP Pool
✓ Run “acidiag avread” to check setup config
✓ If wrong, wipe and reload the APIC
  “acidiag touch clean”
  “acidiag touch setup”
  “acidiag reboot”
Multipod Setup Verification Checklist
❑ Verify BGP EVPN and VPNv4 are up between pods
❑ Verify both unicast and multidestination interpod flows work
❑ Verify jumbo MTU interpod flows work
❑ Verify the above flows work during various Spine > IPN and IPN > IPN link failures
Multipod Unicast Overview
Troubleshooting Single Pod and Multipod is Similar!
Key Differences from Single Pod Unicast
• Spines share EPs via BGP EVPN
• Tunnels to remote pod dynamically built
• IPNs must support overlay traffic requirements
Layer 2 Unicast
BD Settings - UUC Proxy, ARP Flooding Enabled, UC Routing Disabled
[Diagram: EP1 (172.16.1.1/24, 0050.56a8.b003) attached to a Pod1 leaf; Pod1 and Pod2 each with spines and leafs]
Verify first that the flow is unicast
root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192
root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
Layer 2 Unicast
Ingress traffic triggers local learn
a-leaf101# show endpoint mac 0050.56a8.b003 detail | grep epg-l2-2
123/CiscoLive2020:vrf1 vlan-1011 0050.56a8.b003 L eth1/26 CiscoLive2020:ap1:epg-l2-2
Layer 2 Unicast
Ingress leaf updates COOP record on Spines
a-spine1# show coop internal info repo ep | grep -B 8 -A 35 00:50:56:A8:B0:03
------------------------------------------
**omitted
EP bd vnid : 15761417
EP mac : 00:50:56:A8:B0:03
**omitted
Tunnel nh : 10.0.72.67
**omitted
Who owns the tunnel next-hop?
a-apic1# moquery -c ipv4Addr -f 'ipv4.Addr.addr=="10.0.72.67"'
Total Objects shown: 1

# ipv4.Addr
addr : 10.0.72.67/32
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-overlay-1/if-[lo0]/addr-[10.0.72.67/32]
**omitted
Leaf 101 has the local learn
Layer 2 Unicast
How does the remote pod learn about the EP?
1. Local Pod Spine installs COOP record
   show coop internal info repo ep | grep -B 8 -A 35 <mac address>
2. Local Pod Spine exports into BGP EVPN
   show bgp l2vpn evpn <mac address> vrf overlay-1
3. Remote Pod Spine receives through EVPN
   show bgp l2vpn evpn <mac address> vrf overlay-1
4. Remote Pod Spine imports into COOP
   show coop internal info repo ep | grep -B 8 -A 35 <mac address>
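The four stages above form a chain, so the first stage where the endpoint is missing tells you where to look. A small sketch of that reasoning follows; the stage names and the helper `first_missing_stage` are illustrative only, and the booleans are whatever you observed from running each command by hand.

```python
# Each stage is paired with the command from the slide used to verify it.
STAGES = [
    ("local pod spine COOP", "show coop internal info repo ep | grep -B 8 -A 35 <mac address>"),
    ("local pod spine BGP EVPN", "show bgp l2vpn evpn <mac address> vrf overlay-1"),
    ("remote pod spine BGP EVPN", "show bgp l2vpn evpn <mac address> vrf overlay-1"),
    ("remote pod spine COOP", "show coop internal info repo ep | grep -B 8 -A 35 <mac address>"),
]

def first_missing_stage(present):
    """Given one boolean per stage (endpoint seen or not), return the first failing stage."""
    for (stage, command), seen in zip(STAGES, present):
        if not seen:
            return stage, command
    return None

# Example: the EP is in COOP and EVPN in its own pod but never arrives in the other pod
print(first_missing_stage([True, True, False, False]))
```

Here "local" means the pod where the endpoint is attached, matching the slide's wording.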
Layer 2 Unicast
Local spines export to EVPN (COOP > EVPN)
a-spine1# show bgp l2vpn evpn 00:50:56:A8:B0:03 vrf overlay-1
Route Distinguisher: 1:16777199 (L2VNI 1)
BGP routing table entry for [2]:[0]:[15761417]:[48]:[0050.56a8.b003]:[0]:[0.0.0.0]/216, **omitted
Paths: (1 available, best #1)
Flags: (0x00010a 00000000) on xmit-list, is not in rib/evpn
Multipath: eBGP iBGP

  Advertised path-id 1
  Path type: local 0x4000008c 0x0 ref 0, path is valid, is best path
  AS-Path: NONE, path locally originated
    0.0.0.0 (metric 0) from 0.0.0.0 (192.168.1.101)
      Origin IGP, MED not set, localpref 100, weight 32768
      Received label 15761417
      Extcommunity:
          RT:5:16

  Path-id 1 advertised to peers:
    192.168.2.101 192.168.2.102
Callouts:
- BD VNID
- Originated Locally
- Advertised to Remote Pod Spines
Layer 2 Unicast
Remote spines receive EP through EVPN
a-spine3# show bgp l2vpn evpn 00:50:56:A8:B0:03 vrf overlay-1
Route Distinguisher: 1:16777199
BGP routing table entry for [2]:[0]:[15335345]:[48]:[0050.56a8.b003]:[0]:[0.0.0.0]/216, **omitted
Paths: (2 available, best #1)
Flags: (0x000202 00000000) on xmit-list, is not in rib/evpn, is locked
Multipath: eBGP iBGP

  Advertised path-id 1
  Path type: internal 0x40000018 0x2040 ref 1, path is valid, is best path
  AS-Path: NONE, path sourced internal to AS
    192.168.1.254 (metric 3) from 192.168.1.101 (192.168.1.101)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 15335345
      Received path-id 1
      Extcommunity:
          RT:5:16
          ENCAP:8
Callouts:
- Dataplane TEP/ETEP of POD1 (192.168.1.254)
- BGP Address of Spine 1 (192.168.1.101)
Layer 2 Unicast
What is the Dataplane TEP/External Proxy TEP (ETEP)?
• Per-Pod anycast address
• Owned by all spines in a pod
• Set as Next Hop for BGP EVPN Paths
• COOP Placeholder for external proxy lookups
a-apic1# moquery -c ipv4If -f 'ipv4.If.mode*"etep"' -x 'rsp-subtree=children'
# ipv4.If
id : lo14
adminSt : enabled
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo14]
donorIf : unspecified
lcOwn : local
modTs : 2019-02-20T16:58:34.113-04:00
mode : etep
rn : if-[lo14]
# ipv4.Addr
addr : 192.168.1.254/32
**omitted
Callouts:
- Loopback14 on Node 1001
- Multipod Dataplane TEP/ETEP
- NH Address in BGP Update
- Not Actually Used for Dataplane!
Layer 2 ETEP Lookup
1. Spine Proxied Layer 2 Traffic
2. Spine COOP Lookup Points to Remote POD ETEP
3. Forward to Remote Pod External MAC Proxy TEP
apic1# moquery -c ipv4If -f 'ipv4.If.mode=="anycast-mac,external"' -x 'rsp-subtree=children' | egrep "addr" | grep dn
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo8]/addr-[10.0.0.33/32]
dn : topology/pod-2/node-2001/sys/ipv4/inst/dom-overlay-1/if-[lo1]/addr-[10.0.128.33/32]
Layer 3 ETEP Lookup
1. Spine Proxied Layer 3 Traffic
2. Spine COOP Lookup Points to Remote POD ETEP
3. Forward to Remote Pod External v4/v6 Proxy TEP
apic1# moquery -c ipv4If -f 'ipv4.If.mode=="anycast-v4,external"' -x 'rsp-subtree=children' | egrep "addr" | grep dn
dn : topology/pod-2/node-2001/sys/ipv4/inst/dom-overlay-1/if-[lo2]/addr-[10.0.128.34/32]
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo9]/addr-[10.0.0.34/32]
Layer 2 Unicast
Verify Remote Pod COOP Entry
a-spine3# show coop internal info repo ep | grep -B 8 -A 35 00:50:56:A8:B0:03
------------------------------------------
**omitted
EP bd vnid : 15761417
EP mac : 00:50:56:A8:B0:03
Remote Type : MPOD
MAC Tunnel : 10.0.0.33
IPv4 Tunnel : 10.0.0.34
IPv6 Tunnel : 10.0.0.35
ETEP Tunnel : 192.168.1.254
**omitted
COOP Entry exists and points to POD1
Proxied L2 traffic will forward to the Pod1 External MAC-proxy Address
Layer 2 Unicast
BD Settings - UUC Proxy, ARP Flooding Enabled, UC Routing Disabled
[Diagram: EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1 sends to EP2 (172.16.1.2/24, 8c60.4f02.88fc) in Pod2]
root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192
root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
1. No EP entry for dmac, send to proxy
pod1-leaf101# show endpoint mac 8c60.4f02.88fc
<no output>
pod1-leaf101# show isis dteps vrf overlay-1
10.0.120.33 SPINE PHYSICAL,PROXY-ACAST-MAC
2. Local Spines have COOP entry pointing to remote ETEP
3. Remote Spines have COOP entry pointing to Local Pod Leaf
4. Local POD Leaf receives, installs tunnel to source leaf, and forwards to EP
Dynamic Tunnel Learns
VXLAN Tunnels are Created 3 Ways
Remote Pod Endpoint Learns
a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'
id : tunnel1
dest : 10.0.72.67
idRequestorDn : sys/inst-overlay-1/db-dtep/dtep-[10.0.72.67]
Remote POD External Routes
a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'
id : tunnel1
dest : 10.0.72.64
idRequestorDn : sys/bgp/inst/dom-overlay-1/db-dtep/dtep-[10.0.72.64]
Local POD ISIS Database
a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'
# tunnel.If
id : tunnel1
dest : 10.0.152.64
idRequestorDn : sys/isis/inst-default/dom-overlay-1/lvl-l1/db-dtep/dtep-[10.0.152.64]
Dynamic Tunnel Learns
Endpoint Created Tunnels
Pod1 TEP Pool: 10.0.0.0/17
Pod2 TEP Pool: 10.0.128.0/17
Pod1 Leaf TEP: 10.0.72.67, EP1 172.16.1.1/24 0050.56a8.b003
Pod2 Leaf TEP: 10.0.200.67, EP2 172.16.1.2/24 8c60.4f02.88fc
EP1: ping 172.16.1.2
Leafs install white-list for remote TEP ranges
Dynamic Tunnel Learns
Endpoint (Dataplane) Created Tunnels
Outer Dst IP   Outer Src IP   Inner Dst IP   Inner Src IP
10.0.200.67    10.0.72.67     172.16.1.2     172.16.1.1
Verify Tunnel Created on Dst Leaf
moquery -c tunnelIf -f 'tunnel.If.dest=="10.0.72.67"'
Total Objects shown: 1

# tunnel.If
id : tunnel3
dest : 10.0.72.67
operSt : up
**omitted
Verify White-List on Dst Leaf
vsh_lc -c "show sys internal eltmc info pfxwl_table" | grep "Prefix" | grep -v Tunnel
**omitted
Prefix: 10.0.20.64
Prefix len: 27
Prefix: 10.0.64.64
Prefix len: 27
Prefix: 10.0.72.64    (Src TEP matches here)
Prefix len: 27
**omitted
White-list prefixes are based on DHCP pools within the remote POD TEP range
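To confirm that an outer source TEP falls inside one of the white-listed prefixes without doing the binary math by hand, the comparison can be sketched with Python's standard `ipaddress` module. The helper name `matching_prefix` is hypothetical; the prefixes and TEP addresses are the ones shown on this slide.

```python
import ipaddress

def matching_prefix(src_tep, whitelist):
    """Return the first white-list prefix that contains the source TEP, or None."""
    address = ipaddress.ip_address(src_tep)
    for prefix in whitelist:
        if address in ipaddress.ip_network(prefix):
            return prefix
    return None

# Prefixes (with their /27 length) from the pfxwl_table output above
WHITELIST = ["10.0.20.64/27", "10.0.64.64/27", "10.0.72.64/27"]

print(matching_prefix("10.0.72.67", WHITELIST))   # 10.0.72.64/27
print(matching_prefix("10.0.200.67", WHITELIST))  # None
```

A result of `None` for the source leaf's TEP would suggest the destination leaf will not build the dataplane tunnel, under the behavior described on this slide.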
Layer 2 Unicast
Remote Leaf Installs EP to Source
a-leaf205# show endpoint mac 0050.56a8.b003 detail
**omitted
100/CiscoLive2020:vrf1 0050.56a8.b003 tunnel16 CiscoLive2020:bd-L2-2
Where does tunnel16 point?
a-leaf205# show interface tunnel16
Tunnel16 is up
Tunnel source 10.0.200.67/32 (lo0)
Tunnel destination 10.0.72.67
**omitted
a-apic1# moquery -c ipv4Addr -f 'ipv4.Addr.addr=="10.0.72.67"'
addr : 10.0.72.67/32
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-overlay-1/if-[lo0]/addr-[10.0.72.67/32]
**omitted
Layer 2 Unicast
Return Path…
[Diagram: EP2 (172.16.1.2/24, 8c60.4f02.88fc) in Pod2 replies to EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1]
1. Pod2 Leaf forwards based on remote learn
2. Spines simply provide transit
3. Pod1 Leaf installs tunnel and remote learn to Pod2 leaf
Using Ftriage to Troubleshoot Multipod (14.2+)
*Recommended with EX or Later Hardware
a-apic1# ftriage bridge -ii LEAF:101,103 -dip 172.16.1.2 -sip 172.16.1.1
Starting ftriage
ftriage: main:839 L2 frame Seen on a-leaf101 Ingress: Eth1/30 (Po15) Egress: Eth1/54 Vnid: 16056274
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:839 L2 frame Seen on a-spine2 Ingress: Eth1/25 Egress: Eth1/31 Vnid: 16056274
ftriage: fib:332 a-spine2: Transit in spine
ftriage: unicast:1458 a-spine2: Infra route 10.0.200.67 present in RIB
ftriage: unicast:1681 a-spine2: Packet is exiting the fabric through {a-spine2: ['Eth1/31']}
ftriage: main:839 L2 frame Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-22/0 Port-1 Vnid: 16056274
ftriage: fib:332 a-spine3: Transit in spine
ftriage: unicast:1458 a-spine3: Infra route 10.0.200.67 present in RIB
ftriage: unicast:1774 L2 frame Seen on FC of node: a-spine3
….
ftriage: main:622 Found peer-node a-leaf205 and IF: Eth1/53 in candidate list
ftriage: main:839 L2 frame Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: 11371
ftriage: main:522 Computed egress encap string vlan-1039
ftriage: main:332 Egress BD(s): jy:cl1
ftriage: unicast:1833 a-leaf205: Dst EP is local
ftriage: misc:657 a-leaf205: EP if(Eth1/31) same as egr if(Eth1/31)
Callouts:
- Look for bridged flow ingressing 101 or 103
- Frame seen on leaf101
- Frame seen on spine2
- Frame seen on pod2 spine3
- Frame seen on pod2 leaf205
- Forwards out eth1/31
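For long ftriage runs it can help to reduce the log to the hop-by-hop path. The sketch below assumes the "L2 frame Seen on ... Ingress: ... Egress: ... Vnid: ..." line layout shown in the output above; the regular expression and the function name `frame_path` are illustrative and not part of ftriage.

```python
import re

# One hop per "L2 frame Seen on <node> Ingress: ... Egress: ... Vnid: ..." line
HOP = re.compile(
    r"L2 frame Seen on (\S+) Ingress: (\S+)(?: \(\S+\))? Egress: (.+?) Vnid: (\d+)"
)

def frame_path(ftriage_output):
    """Return (node, ingress, egress, vnid) tuples in the order the frame was seen."""
    return [(node, ingress, egress, int(vnid))
            for node, ingress, egress, vnid in HOP.findall(ftriage_output)]

# Lines copied from the ftriage output on this slide
SAMPLE = """\
ftriage: main:839 L2 frame Seen on a-leaf101 Ingress: Eth1/30 (Po15) Egress: Eth1/54 Vnid: 16056274
ftriage: main:839 L2 frame Seen on a-spine2 Ingress: Eth1/25 Egress: Eth1/31 Vnid: 16056274
ftriage: main:839 L2 frame Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-22/0 Port-1 Vnid: 16056274
ftriage: unicast:1774 L2 frame Seen on FC of node: a-spine3
ftriage: main:839 L2 frame Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: 11371
"""

for hop in frame_path(SAMPLE):
    print(hop)
```

The "Seen on FC of node" line is deliberately not treated as a hop because it carries no ingress or egress interface.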
Troubleshooting Scenario:
EPs cannot communicate in L2 BD
Pod 1 Verifications
1. Is communication unicast or multi-destination?
root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192
root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
ARP is resolved so host sends unicast
a-leaf101# show endpoint mac 8c60.4f02.88fc
<no entry>
Ingress leaf has no remote learn
Does BD flood or proxy unknown unicast?
a-apic1# moquery -c fvBD -f 'fv.BD.name=="bd-L2-2"'
name : bd-L2-2
dn : uni/tn-CiscoLive2020/BD-bd-L2-2
unkMacUcastAct : proxy
UUC set to “Proxy”
Troubleshooting Scenario:
EPs cannot communicate in L2 BD
Pod 1 Verifications
2. Does Local Pod Spine have the EP?
a-spine1# moquery -c coopEpRec -f 'coop.EpRec.mac=="8c60.4f02.88fc "'
No Mos found
MAC not in COOP
a-spine1# show bgp l2vpn evpn 8c60.4f02.88fc vrf overlay-1
<no output>
MAC not in EVPN
Troubleshooting Scenario:
EPs cannot communicate in L2 BD
Pod 2 Verifications
3. Does Remote Pod Spine have the EP?
a-spine3# moquery -c coopEpRec -f 'coop.EpRec.mac=="8c60.4f02.88fc "'
# coop.EpRec
vnid : 15761417
mac : 8C:60:4F:02:88:FC
**omitted**
MAC is in COOP
a-spine3# show bgp l2vpn evpn 8c60.4f02.88fc vrf overlay-1
<omitted>
  AS-Path: NONE, path locally originated
    0.0.0.0 (metric 0) from 0.0.0.0 (192.168.2.101)
      Origin IGP, MED not set, localpref 100, weight 32768
      Received label 15761417
      Extcommunity:
          RT:5:16
Remote Spine Exports to EVPN
Troubleshooting Scenario:
EPs cannot communicate in L2 BD
Pod 1 or Pod 2 Verifications
4. Is EVPN up between Pods?
a-spine1# show bgp l2vpn evpn summ vrf overlay-1
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.2.101   4 65000   57380   66362        0    0    0 00:00:21 Active
192.168.2.102   4 65000   57568   66357        0    0    0 00:00:22 Active
BGP is down
Next Steps…
• Do the local spines have routes to remote spines?
• Does IPN support jumbo MTU?
• Can spines ping between each other?
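The summary check above is easy to automate across spines. This sketch relies on the usual NX-OS convention that an established session shows a numeric prefix count in the State/PfxRcd column, while a session that is down shows a state name such as Active; the function name `down_neighbors` is hypothetical, and multi-word states are not handled.

```python
def down_neighbors(summary_text):
    """Return (neighbor, state) for every BGP neighbor that is not established."""
    down = []
    for line in summary_text.splitlines():
        fields = line.split()
        # Neighbor rows have 10 columns and start with a dotted IPv4 address
        if len(fields) == 10 and fields[0].count(".") == 3:
            state = fields[-1]
            if not state.isdigit():
                down.append((fields[0], state))
    return down

# Output copied from the slide
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.2.101   4 65000   57380   66362        0    0    0 00:00:21 Active
192.168.2.102   4 65000   57568   66357        0    0    0 00:00:22 Active
"""

print(down_neighbors(SAMPLE))
```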
Layer 3 Unicast
…nearly identical to layer 2 unicast
• Differences from Layer 2
  • VRF Lookup rather than BD lookup
  • VRF VNID used instead of BD VNID
  • Spines trigger ARP Glean if Dst is Unknown (leverages fabric multicast)
Layer 3 Unicast – Glean Scenario
BD Settings - UC Routing Enabled
[Diagram: EP1 (172.16.1.1/24, 0050.56a8.b003) attached to a Pod1 leaf]
root@vm1:/home/joyo# ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2) 56(84) bytes of data.
1. No EP learn, check routing table
a-leaf101# show endpoint ip 172.16.2.2
+--------+-------+-------------+-----------+-----------+
 VLAN/    Encap   MAC Address   MAC Info/   Interface
 Domain   VLAN    IP Address    IP Info
+--------+-------+-------------+-----------+-----------+
2. Pervasive Flag indicates BD Subnet (Static Route)
a-leaf101# show ip route 172.16.2.2 vrf CiscoLive2020:vrf1
172.16.2.0/24, ubest/mbest: 1/0, attached, direct, pervasive
    *via 10.0.120.34%overlay-1, [1/0], 12:02:55, static
         recursive next hop: 10.0.120.34/32%overlay-1
3. Next-hop is spine Proxy
a-leaf101# show isis dtep vrf overlay-1 | grep 10.0.120.34
10.0.120.34 SPINE N/A PHYSICAL,PROXY-ACAST-V4
Layer 3 Unicast – Glean Scenario
BD Settings - UC Routing Enabled
[Diagram: EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1 pings EP2 (172.16.2.2/24, 8c60.4f02.88fc) in Pod2; 172.16.2.2 not learned yet]
root@vm1:/home/joyo# ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2) 56(84) bytes of data.
Local Spines have no COOP entry for Dst IP
a-spine1# show coop internal info ip-db | grep -F -B 1 -A 15 "172.16.2.2"
No COOP Entry! This will trigger a Glean
Layer 3 Unicast
What is a Glean?
• If the Spines do not have an IP learn and…
• The destination IP is within a deployed BD Subnet
✓ The spine floods the proxied request with a special ethertype
✓ Gleans flooded to 239.255.255.240 (*see hidden slide)
✓ Leafs with destination subnet generate ARP
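The two glean conditions can be written as a small decision function. This is a conceptual sketch only, not how the spine is implemented: the names are hypothetical, and the third outcome (no COOP entry and no matching BD subnet) is not covered by the slides, so it is returned as a plainly labeled placeholder.

```python
import ipaddress

def spine_proxy_action(dst_ip, coop_ips, bd_subnets):
    """Model the spine proxy decision described above for a routed packet."""
    if dst_ip in coop_ips:
        return "forward"   # spine has an IP learn, so normal proxy forwarding applies
    address = ipaddress.ip_address(dst_ip)
    if any(address in ipaddress.ip_network(subnet) for subnet in bd_subnets):
        return "glean"     # no IP learn, but the destination is in a deployed BD subnet
    return "not covered by the slides"  # assumption: outcome not described in this deck

# 172.16.2.2 is unknown to COOP but sits inside the deployed BD subnet 172.16.2.0/24
print(spine_proxy_action("172.16.2.2", set(), ["172.16.1.0/24", "172.16.2.0/24"]))
```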
Inter-Pod Glean
nexus5k# ping 172.16.2.2 source 172.16.1.1 vrf jy1
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1: 56 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out
ERSPAN of Spine > IPN Link
- Src: Originating Leaf
- Dst: Reserved Glean Multicast Group
- Custom Ethertype for Gleans
- Source IP that triggered Glean: 0xac100101 = 172.16.1.1
- Dst IP that triggered Glean: 0xac100202 = 172.16.2.2
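The hex-to-dotted-decimal conversion used in the capture callouts can be reproduced with Python's standard `ipaddress` module. The helper name `hex_to_ipv4` is only illustrative.

```python
import ipaddress

def hex_to_ipv4(value):
    """Convert a 32-bit hex string such as '0xac100101' to dotted-decimal IPv4."""
    return str(ipaddress.IPv4Address(int(value, 16)))

print(hex_to_ipv4("0xac100101"))  # 172.16.1.1
print(hex_to_ipv4("0xac100202"))  # 172.16.2.2
```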
System Gipo Usage
• If “Use Infra as System Gipo” is enabled, the actual BD GIPos are used rather than 239.255.255.240
Capturing a Glean with Tcpdump
ACI Leafs and Spines contain pseudo interfaces for traffic to and from the CPU
1st Gen Leaf: CPU (kpm_inb) <> ASIC (knet0, knet1) <> Phys Port
• For traffic going to the CPU, check knet0 and kpm_inb
• For traffic coming from the CPU, check knet1 and kpm_inb
EX (or Later) Leaf: CPU (kpm_inb) <> ASIC (tahoe0) <> Phys Port
• For traffic to and from the CPU, check tahoe0 and kpm_inb
• Traffic on the knet or tahoe pseudo interface will have a special ieth header. It must be decoded.
• Starting in 3.2 the knet_parser.py script is available on the switch CLI to decode
*Note: not all traffic will show up on the kpm_inb interface. However, all traffic shows on the pseudo interface.
*Gen 1 and Gen 2 modular spines use psdev0, psdev1, and psdev2 interfaces. Gen 2 fixed spines use tahoe0. Gen 1 fixed spines use knet0-3.
Capturing a Glean with Tcpdump
Egress Leaf Verification
Gen2 or Later Leaf
tcpdump -xxxvei tahoe0 -w /bootflash/tahoe0.pcap
knet_parser.py --file /bootflash/tahoe0.pcap --pcap --decoder tahoe
Frame 111
Time: 2019-05-16T16:56:33.059831+00:00
Header: ieth_extn CPU Receive
  sup_qnum:0x14, sup_code:0x21, istack:ISTACK_SUP_CODE_SPINE_GLEAN(0x21)
Header: ieth
  sup_tx:0, ttl_bypass:0, opcode:0x6, bd:0x120e, outer_bd:0x27, dl:0, span:0, traceroute:0, tclass:0
  src_idx:0x3a, src_chip:0x0, src_port:0x19, src_is_tunnel:1, src_is_peer:1
  dst_idx:0x0, dst_chip:0x0, dst_port:0x0, dst_is_tunnel:0
Len: 148
Eth: 000d.0d0d.0d0d > 0100.5e7f.fff1, len/ethertype:0x8100(802.1q)
802.1q: vlan:2, cos:5, len/ethertype:0x800(ipv4)
ipv4: 10.0.116.64 > 239.255.255.241, len:130, ttl:249, id:0x0, df:0, mf:0, offset:0x0, dscp:32, prot:17(udp)
udp: (ivxlan) 0 > 48879, len:110
ivxlan: n:1, l:1, i:1,
  vnid: 0x2b0000
  lb:0, dl:1, exception:0, src_policy:0, dst_policy:0, src_class:0x5c0
  mcast(routed:0, ingress_encap:0/802.1q), ac_bank:0, src_port:0x0
Eth: 000c.0c0c.0c0c > ffff.ffff.ffff, len/ethertype:0xfff2(aci-glean)
ipv4: 172.16.1.1 > 172.16.2.2, len:84, ttl:63, id:0x71f9, df:1, mf:0, offset:0x0, dscp:0, prot:1(icmp)
icmp: echo request id:0x9092, seq:0x1980
Callouts:
- Decode type should be tahoe for tahoe interface
- RX sup traffic rather than TX
- Switch recognizes this as a Glean
- Traffic that triggered Glean
Capturing a Glean with Tcpdump
69BRKACI-2934
Gen1 Leaf Example
tcpdump -xxxvei knet0 -w /bootflash/knet0.pcapknet_parser.py --file /bootflash/knet0.pcap --pcap --decoder knet
Egress Leaf Verification
tcpdump -xxxvei knet1 -w /bootflash/knet1.pcapknet_parser.py --file /bootflash/knet1.pcap --pcap --decoder knet
tcpdump -xxxvei kpm_inb ether proto 0xfff2a-leaf102# tcpdump -xxxvei kpm_inb ether proto 0xfff2tcpdump: listening on kpm_inb, link-type EN10MB (Ethernet), capture size 65535 bytes15:27:37.663580 00:0c:0c:0c:0c:0c (oui Unknown) > Broadcast, ethertype Unknown (0xfff2), length 94:
0x0000: ffff ffff ffff 000c 0c0c 0c0c fff2 45000x0010: 0054 aa4b 4000 3f01 825d 0404 0464 03030x0020: 0396 0800 0dc6 2384 38db 5275 dd5c 00000x0030: 0000 9e35 0100 0000 0000 1011 1213 14150x0040: 1617 1819 1a1b 1c1d 1e1f 2021 2223 24250x0050: 2627 2829 2a2b 2c2d 2e2f 3031 3233
knet0 would show Rx traffic (similar output as Tahoe0)
knet1 would show Tx traffic
No decode necessary for kpm_inb (cpu) interface…Gleans aren’t easily readable
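Since tcpdump does not decode ethertype 0xfff2, the kpm_inb hex dump has to be read by hand. The sketch below decodes the first 48 bytes of the dump from this slide, assuming the payload after the 14-byte Ethernet header is a plain IPv4 header without options (which the leading 0x45 suggests); the helper name is hypothetical:

```python
import ipaddress
import struct

# First three rows of the kpm_inb capture, offsets stripped.
HEX_DUMP = """
ffff ffff ffff 000c 0c0c 0c0c fff2 4500
0054 aa4b 4000 3f01 825d 0404 0464 0303
0396 0800 0dc6 2384 38db 5275 dd5c 0000
"""


def decode_glean(hex_text):
    frame = bytes.fromhex("".join(hex_text.split()))
    ethertype = struct.unpack("!H", frame[12:14])[0]
    ip = frame[14:34]                      # 20-byte IPv4 header
    return {
        "dst_mac": frame[0:6].hex(":"),
        "src_mac": frame[6:12].hex(":"),
        "is_glean": ethertype == 0xFFF2,   # ACI glean ethertype
        "ip_len": struct.unpack("!H", ip[2:4])[0],
        "ttl": ip[8],
        "proto": ip[9],                    # 1 = ICMP
        "src_ip": str(ipaddress.IPv4Address(ip[12:16])),
        "dst_ip": str(ipaddress.IPv4Address(ip[16:20])),
    }


print(decode_glean(HEX_DUMP))
```

The destination IP in the result is the address the leaf is being asked to glean for.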
Layer 3 Unicast – Glean Scenario
IPN Must Route 239.255.255.240 (*see Troubleshooting Multidestination Flows Section)
IPN1# show run | grep 239
ip pim rp-address 192.168.100.1 group-list 239.0.0.0/8 bidir
ip pim rp-address 10.10.1.1 group-list 239.0.0.0/8 bidir
a-leaf205# show ip arp internal event-history event | grep -F -B 1 172.16.2.2
73) Event:E_DEBUG_DSF, length:127, at 316928 usecs after Wed May 1 08:31:53 2019
Updating epm ifidx: 1a01e000 vlan: 105 ip: 172.16.2.2, ifMode: 128 mac: 8c60.4f02.88fc
75) Event:E_DEBUG_DSF, length:152, at 316420 usecs after Wed May 1 08:31:53 2019
log_collect_arp_pkt; sip = 172.16.2.2; dip = 172.16.2.254; interface = Vlan104;info = Garp Check adj:(nil)
77) Event:E_DEBUG_DSF, length:142, at 131918 usecs after Wed May 1 08:28:36 2019
log_collect_arp_pkt; dip = 172.16.2.2; interface = Vlan104;iod = 138; Info = Internal Request Done
78) Event:E_DEBUG_DSF, length:136, at 131757 usecs after Wed May 1 08:28:36 2019
log_collect_arp_glean;dip = 172.16.2.2;interface = Vlan104;info = Received pkt Fabric-Glean: 1
79) Event:E_DEBUG_DSF, length:174, at 131748 usecs after Wed May 1 08:28:36 2019
log_collect_arp_glean; dip = 172.16.2.2; interface = Vlan104; vrf = CiscoLive2020:vrf1; info = Address in PSVI subnet or special VIP
Glean Group Range included as Bidir on IPN
Verify ARP on Remote Leaf
Glean Received, Dst IP is in BD Subnet
ARP Request is generated by leaf
Response Received
Endpoint Learn Installed
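The first check on this slide is that the glean group falls inside a group-list configured as bidir on the IPN. A small sketch of that test, with the two rp-address lines from the IPN1 output transcribed by hand into a list (the list layout and function name are assumptions made for illustration):

```python
import ipaddress

# (RP address, bidir group-list) pairs copied from "show run | grep 239".
RP_CONFIG = [
    ("192.168.100.1", "239.0.0.0/8"),
    ("10.10.1.1", "239.0.0.0/8"),
]


def covering_rps(group, rp_config):
    """Return the RPs whose bidir group-list contains the multicast group."""
    addr = ipaddress.ip_address(group)
    return [rp for rp, rng in rp_config if addr in ipaddress.ip_network(rng)]


print(covering_rps("239.255.255.240", RP_CONFIG))
```

An empty list for the glean group would mean the IPN has no bidir RP mapping for it, so inter-pod gleans would not be routed.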
Using Ftriage to Troubleshoot Multipod (14.2+)
L3 Proxy/Glean Scenario
a-apic1# ftriage route -ii LEAF:101,103 -dip 172.16.2.3 -sip 172.16.1.1
ftriage: main:839 L3 packet Seen on a-leaf103 Ingress: Eth1/30 (Po15) Egress: Eth1/49 Vnid: 2588674
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:301 Ingress Ctx: jy:vrf11
ftriage: main:933 SIP 172.16.1.1 DIP 172.16.2.3
ftriage: unicast:973 a-leaf103: <- is ingress node
ftriage: unicast:1194 a-leaf103: Dst EP is unknown - proxy
ftriage: main:839 L3 packet Seen on a-spine1 Ingress: Eth2/29 Egress: LC-2/3 FC-23/0 Port-1 Vnid: 2588674
ftriage: fib:323 a-spine1: EP not found in COOP! for VRF VNID: 2588674
ftriage: unicast:1373 a-spine1: EP is unknown in COOP. Ftriage will exit but continue with further fault isolation
ftriage: unicast:1412 a-spine1: Egress node not provided. Cannot check local EP. Exiting!
ftriage: unicast:1413 : Ftriage Completed with hunch: Check if local EP learnt on egress node(s)
Look for routed flow ingressing 101 or 103
Frame seen on leaf103
Dst Unknown, Proxy!
Seen on spine1
EP not in COOP!
✓ EP Not in COOP, gleans should be generated. Check local learn on egress leaf
Multipod Multicast
What does Multipod use BUM for?
…it doesn’t just affect multidestination flows
• Unknown Unicast Flooding
• Multidestination Traffic (ARP, Multicast, BPDUs)
• Inter-pod Glean Messages
• EP Announce Messages
IPN Multicast Control-plane
• Spines act as multicast hosts (IGMP only)
• Spines join fabric multicast groups (GIPos)
• IPNs receive the IGMP joins
• IPNs send PIM joins to RP
• PIM Bidir is used, so there is no (S,G) state
What is a GIPo?
• Multicast group allocated per-VRF and per-BD.
• Used for all flooded traffic
a-apic1# moquery -c fvBD -f 'fv.BD.name=="bd-L3-1"'
# fv.BD
name : bd-L3-1
bcastP : 225.0.80.64
dn : uni/tn-CiscoLive2020/BD-bd-L3-1
ipLearning : yes
multiDstPktAct : bd-flood
unicastRoute : yes
unkMacUcastAct : proxy
unkMcastAct : flood
v6unkMcastAct : flood
IPN Multicast Control-plane
[Diagram: Pod1 and Pod2 spines connected through the IPN, with the RP inside the IPN. BD GIPo example: 225.0.80.64.]
Spine sends IGMP Join for GIPo
IPN Routers Send PIM Join to RP
IPN Multicast Dataplane
[Diagram: Pod1 and Pod2 spines connected through the IPN, with the RP inside the IPN. BD GIPo example: 225.0.80.64.]
All Multicast Dataplane Goes Through RP
IPN Multicast Control-plane
Only one spine in each pod joins each group
[Diagram: Pod1 and Pod2 each send an IGMP Join to their IPN router. BD GIPo example: 225.0.80.64.]
a-spine1# show ip igmp gipo joins
GIPo list as read from IGMP-IF group-linked list
------------------------------------------------
225.0.80.64 0.0.0.0 Join Eth1/25.25 95 Enabled
Spine1 Joins for Pod1
IPN Multicast Control-plane
Only one spine in each pod joins each group
[Diagram: Pod1 and Pod2 each send an IGMP Join to their IPN router. BD GIPo example: 225.0.80.64.]
IPN1# show ip mroute 225.0.80.64 vrf IPN
IP Multicast Routing Table for VRF "IPN"
(*, 225.0.80.64/32), bidir, uptime: 13:00:48, igmp ip pim
  Incoming interface: loopback1, RPF nbr: 192.168.100.1
  Outgoing interface list: (count: 3)
    Ethernet8/2, uptime: 01:34:42, pim
    loopback1, uptime: 13:00:48, pim, (RPF)
    Ethernet1/1.4, uptime: 13:00:48, igmp
IPN1# show ip igmp groups 225.0.80.64 vrf IPN
Type: S - Static, D - Dynamic, L - Local, T - SSM Translated
Group Address Interface Uptime Expires Last Reporter
225.0.80.64 Ethernet1/1.4 13:02:14 00:04:02 192.168.1.0
IPN Multicast Control-plane
RPF for all IPNs must point to same RP
[Diagram: Pod1 and Pod2 send IGMP Joins into the IPN (IPN1, IPN2, IPN3); RPF on the IPN routers points toward the RP. BD GIPo example: 225.0.80.64.]
IPN3# show ip mroute 225.0.80.64 vrf IPN
IP Multicast Routing Table for VRF "IPN"
(*, 225.0.80.64/32), bidir, uptime: 01:34:35, igmp ip pim
  Incoming interface: Ethernet8/25, RPF nbr: 10.255.0.0
  Outgoing interface list: (count: 2)
    Ethernet8/25, uptime: 01:34:35, pim, (RPF)
    Ethernet1/17.4, uptime: 01:34:35, igmp
IPN3# show ip pim rp 225.0.80.64 vrf IPN
PIM RP Information for group 225.0.80.64 in VRF "IPN"
RP: 192.168.100.1, (1)
IPN3# show ip route 192.168.100.1 vrf IPN
192.168.100.0/30, ubest/mbest: 1/0
    *via 10.255.0.0, Eth8/25, [110/5], 13:01:42, ospf-IPN, intra
RPF must not point to ACI
IPN Multicast Control-plane
Phantom RP
• Bidir PIM doesn’t support multiple RPs
• Phantom RP is the only means of RP redundancy
• Works by advertising varied prefix lengths for the RP subnet
• Failover is handled via the IGP
• Loopback must be OSPF P2P network type
• Exact RP address must not be configured on any device
IPN Multicast Control-plane
Phantom RP Load-Balancing
Requirement: Each multicast group must have only one RP
To use multiple RPs…
Break mcast group range into smaller groups
225.0.0.0/8 becomes
225.0.0.0/9 – RP 192.168.255.1
225.128.0.0/9 – RP 192.168.255.2
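The split of 225.0.0.0/8 into two /9 ranges can be reproduced with Python's standard ipaddress module. This is only a sketch for planning group-lists; the function name is hypothetical and it assumes the number of RPs is a power of two so the range divides evenly:

```python
import ipaddress


def split_rp_ranges(group_range, rps):
    """Divide a multicast range evenly and pair each sub-range with one RP."""
    n = len(rps)
    if n == 0 or n & (n - 1):
        raise ValueError("number of RPs must be a power of two")
    extra_bits = n.bit_length() - 1
    subnets = ipaddress.ip_network(group_range).subnets(prefixlen_diff=extra_bits)
    return {str(net): rp for net, rp in zip(subnets, rps)}


for rng, rp in split_rp_ranges("225.0.0.0/8", ["192.168.255.1", "192.168.255.2"]).items():
    print(rng, "-> RP", rp)
```

Each resulting range still maps to exactly one RP address, which keeps the one-RP-per-group requirement intact.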
IPN Multicast Control-plane
Phantom RP
RP Addr - 192.168.255.1
IPN1# show run int lo1
interface loopback1
  ip address 192.168.255.2/27
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN2# show run int lo1
interface loopback1
  ip address 192.168.255.2/29
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN3# show run int lo1
interface loopback1
  ip address 192.168.255.2/28
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN4# show run int lo1
interface loopback1
  ip address 192.168.255.2/30
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN4 is RP due to Longest Prefix
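Why IPN4 wins, and who takes over when it fails, follows from longest-prefix match on the advertised loopback subnets. The sketch below models that with the four loopbacks from this slide; it assumes every loopback is OSPF point-to-point (so its configured subnet is what gets advertised), and the function and its `failed` parameter are illustrative only:

```python
import ipaddress

LOOPBACKS = {
    "IPN1": "192.168.255.2/27",
    "IPN2": "192.168.255.2/29",
    "IPN3": "192.168.255.2/28",
    "IPN4": "192.168.255.2/30",
}


def active_rp(rp_addr, loopbacks, failed=()):
    """Return the router whose advertised subnet is the longest match for the RP."""
    rp = ipaddress.ip_address(rp_addr)
    candidates = []
    for router, cidr in loopbacks.items():
        iface = ipaddress.ip_interface(cidr)
        if iface.ip == rp:
            raise ValueError(f"{router} owns the exact RP address")
        if router not in failed and rp in iface.network:
            candidates.append((iface.network.prefixlen, router))
    return max(candidates)[1] if candidates else None


print(active_rp("192.168.255.1", LOOPBACKS))                    # all routers up
print(active_rp("192.168.255.1", LOOPBACKS, failed={"IPN4"}))   # after IPN4 fails
```

With IPN4 withdrawn, the /29 from IPN2 becomes the longest match, which is the IGP-driven failover the Phantom RP design relies on.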
Common Multicast Problems
Issue #1: RP Address Exists on Multiple Routers
RP Addr - 192.168.255.1
IPN1# show run int lo1
interface loopback1
  ip address 192.168.255.1/29
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN2# show run int lo1
interface loopback1
  ip address 192.168.255.1/30
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN1# show ip route 192.168.255.1 vrf IPN
IP Route Table for VRF "IPN"
192.168.100.1/32, ubest/mbest: 1/0, attached
    *via 192.168.100.1, Lo1, [0/0], 21:01:48, local
• IPN Routers see local /32 for RP address
• All Routers think they are RP
Exact RP address can’t exist with Phantom RP
Common Multicast Problems
Issue #2: RP Loopback not OSPF P2P Network
RP Addr - 192.168.255.1
IPN1# show run int lo1
interface loopback1
  ip address 192.168.255.2/29
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN2# show run int lo1
interface loopback1
  ip address 192.168.255.2/30
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
IPN1# show ip route 192.168.255.1 vrf IPN
IP Route Table for VRF "IPN"
192.168.100.0/29, ubest/mbest: 1/0, attached
    *via 192.168.100.2, Lo1, [0/0], 21:15:36, direct
Where is the /30 from IPN2?
Common Multicast Problems
Issue #2: RP Loopback not OSPF P2P Network
• Loopbacks advertise /32 by default
Without P2P Network Type (/32 Advertised)
IPN1# show ip ospf database router 1.1.1.1 detail vrf IPN
Link connected to: a Stub Network
(Link ID) Network/Subnet Number: 192.168.255.2
(Link Data) Network Mask: 255.255.255.255
Number of TOS metrics: 0
TOS 0 Metric: 1
With P2P Network Type (/29 Advertised)
IPN1# show ip ospf database router 1.1.1.1 detail vrf IPN
Link connected to: a Stub Network
(Link ID) Network/Subnet Number: 192.168.255.0
(Link Data) Network Mask: 255.255.255.248
Number of TOS metrics: 0
TOS 0 Metric: 1
Configure “ip ospf network point-to-point” on RP loopbacks
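The difference between the two LSA outputs can be modeled in a few lines: without the point-to-point network type OSPF advertises the loopback as a host route, so no advertised prefix covers the phantom RP address. A hedged sketch (function names are hypothetical, and it models only the stub-network advertisement described on this slide):

```python
import ipaddress


def advertised_prefix(loopback_cidr, p2p):
    """Prefix OSPF advertises for a loopback, per the behavior on this slide."""
    iface = ipaddress.ip_interface(loopback_cidr)
    if p2p:
        return iface.network                         # configured subnet
    return ipaddress.ip_network(f"{iface.ip}/32")    # default: host route


def covers_rp(loopback_cidr, p2p, rp_addr):
    return ipaddress.ip_address(rp_addr) in advertised_prefix(loopback_cidr, p2p)


print(advertised_prefix("192.168.255.2/29", p2p=False))
print(advertised_prefix("192.168.255.2/29", p2p=True))
print(covers_rp("192.168.255.2/29", False, "192.168.255.1"))
```

When the last line reports that the RP is not covered, remote IPN routers have no route toward the RP address through that loopback.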
Common Multicast Problems
Issue #3: RPF Points to ACI
[Diagram: Pod1 and Pod2 spines attached to IPN1, IPN2 and IPN3, with the RP in the IPN. Links are labeled "Low Speed Link: Cost 10" and "High Speed Link: Cost 1".]
IPN3# show ip mroute 225.0.80.64 vrf IPN
(*, 225.0.80.64/32), bidir, uptime: 00:00:26, igmp ip pim
  Incoming interface: Ethernet1/1.4, RPF nbr: 192.168.1.0
  Outgoing interface list: (count: 2)
    Ethernet1/1.4, uptime: 00:00:26, igmp, (RPF)
Common Multicast Problems
Issue #3: RPF Points to ACI
• Spines don’t run PIM, so they are not a valid RPF neighbor
• Applicable when a single spine is connected to multiple IPNs
IPN3# show ip mroute 225.0.80.64 vrf IPN
(*, 225.0.80.64/32), bidir, uptime: 00:00:26, igmp ip pim
  Incoming interface: Ethernet1/1.4, RPF nbr: 192.168.1.0
  Outgoing interface list: (count: 2)
    Ethernet1/1.4, uptime: 00:00:26, igmp, (RPF)
IPN1# show ip pim int eth1/1.4 brief vrf IPN
PIM Interface Status for VRF "IPN"
Interface IP Address PIM DR Address Neighbor Count
Ethernet1/1.4 192.168.1.1 192.168.1.1 0
RPF is Eth1/1.4 (ACI)
No PIM Neighbor on that Link
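The two outputs on this slide combine into one check: take the RPF (incoming) interface from the mroute entry and confirm it has at least one PIM neighbor. A rough sketch of that logic, with the neighbor counts supplied as a hand-built dictionary rather than parsed from the device (the function names and the dictionary are assumptions):

```python
import re


def rpf_interface(mroute_text):
    """Extract the incoming (RPF) interface from 'show ip mroute' text."""
    match = re.search(r"Incoming interface:\s*([^,\s]+)", mroute_text)
    return match.group(1) if match else None


def rpf_is_valid(mroute_text, pim_neighbor_counts):
    intf = rpf_interface(mroute_text)
    if intf is None:
        return False
    if intf.lower().startswith("loopback"):
        return True   # RP subnet is local to this router
    # A spine-facing link has no PIM neighbor, so it cannot be a valid RPF.
    return pim_neighbor_counts.get(intf, 0) > 0


bad = "Incoming interface: Ethernet1/1.4, RPF nbr: 192.168.1.0"
print(rpf_interface(bad), rpf_is_valid(bad, {"Ethernet1/1.4": 0}))
```

A result of False for the RPF interface is the signature of this issue: the best IGP path to the RP runs through the fabric.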
Common Multicast Problems
Issue #3: RPF Points to ACI
[Diagram: Pod1 and Pod2 spines attached to IPN1, IPN2 and IPN3, with the RP in the IPN. Links are labeled "Low Speed Link: Cost 10" and "High Speed Link: Cost 1".]
Make IPN-IPN links have equal or better OSPF Cost
External Routed L3out Control-Plane
Almost the same as traditional L3outs
• External routes are redistributed into fabric BGP
• Each pod is a BGP VPNv4 route-reflector cluster
• Spines reflect external routes across pods
• Internal leafs import VPNv4 routes
• Internal leafs see the border leaf as next hop
External Routed L3out Control-Plane
How do internal Leafs learn external routes?
[Diagram: Pod1 and Pod2 leafs and spines; the external route in the example is 10.13.13.13/32.]
1. Border Leaf Learns External Routes
2. Border Leaf Exports into BGP and sends to Spines
3. Spines Reflect VPNv4 Paths between Pods
4. Internal Leafs Import Routes from BGP
5. Internal Leaf installs tunnel to Border Leaf based on BGP Next Hop
External Routed L3out Control-Plane
External Route on Internal Leaf
a-leaf101# show bgp ipv4 unicast 10.13.13.13/32 vrf CiscoLive2020:vrf1
Advertised path-id 1, VPN AF advertised path-id 1
Path type: internal adv path ref 2, path is valid, is best path
Imported from 10.0.200.67:10:13.13.13.13/32
AS-Path: NONE, path sourced internal to AS
10.0.200.67 (metric 64) from 10.0.64.64 (192.168.1.102)
Origin incomplete, MED 5, localpref 100, weight 0
Received label 0
Received path-id 2
Extcommunity:
RT:65000:2818051
COST:pre-bestpath:165:2415919104
VNID:2818051
COST:pre-bestpath:162:110
Originator: 10.0.200.67 Cluster list: 192.168.1.102 192.168.2.254
Next-hop, this is the Border Leaf TEP
Imported Route-target
Vxlan Vnid used for traffic using this route
Source route AD is 110 (must be OSPF)
BGP Route-Reflector Cluster-list, one for each pod
External Routed L3out Control-Plane
Tunnel Built by BGP on Internal Leaf
a-leaf101# show ip route 10.13.13.13 vrf CiscoLive2020:vrf1
IP Route Table for VRF "CiscoLive2020:vrf1"
'*' denotes best ucast next-hop
10.13.13.13/32, ubest/mbest: 1/0
    *via 10.0.200.67%overlay-1, [200/5], 1d00h, bgp-65000, internal, tag 65000
         recursive next hop: 10.0.200.67/32%overlay-1
External Route Learned via BGP
a-leaf101# moquery -c tunnelIf -f 'tunnel.If.dest*"10.0.200.67"'
# tunnel.If
id : tunnel47
dest : 10.0.200.67
idRequestorDn : sys/bgp/inst/dom-overlay-1/db-dtep/dtep-[10.0.200.67]
type : fabric-ext,physical
vrfName : overlay-1
Tunnel Created by BGP
*Note, Tunnel could be created by unrelated EP learn first
a-leaf101# vsh
a-leaf101# show bgp internal event-history objstore | grep a00c843
2019 Apr 2 21:12:30 bgp 65000 [58156]: TID 58302: (0) OBJ: bgp_dtep_add: tep=a00c843
Initial BGP Tunnel Creation
Dest IP in hex
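The grep pattern a00c843 is the border leaf TEP 10.0.200.67 written as a hexadecimal integer without leading zeros. A small helper sketch for converting in both directions (the function names are illustrative):

```python
import ipaddress


def tep_to_hex(tep):
    """Render an IPv4 TEP the way the BGP objstore event history prints it."""
    return format(int(ipaddress.IPv4Address(tep)), "x")


def hex_to_tep(value):
    """Convert a hex TEP from the event history back to dotted decimal."""
    return str(ipaddress.IPv4Address(int(value, 16)))


print(tep_to_hex("10.0.200.67"))
print(hex_to_tep("a00c843"))
```

Note that the leading zero of the first octet (0x0a) is dropped, so the string is seven characters rather than eight.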
External Routed L3out Control-Plane
How do Border Leafs forward to internal Leafs?
• Exactly the same as non-multipod…
• Bridge Domain static route is pushed to the Border Leaf by contract
• Border Leafs redistribute it (if configured) into the external protocol
• External > Internal traffic hits an EP learn or the BD static (proxy) route
External Routed L3out Control-Plane
How do Border Leafs forward to internal Leafs?
[Diagram: Pod1 and Pod2 leafs and spines; EPG A with BD Subnet 10.1.1.0/24 has a contract with the L3Out.]
a-leaf205# show ip route 10.1.1.1 vrf CiscoLive2020:vrf1
IP Route Table for VRF "CiscoLive2020:vrf1"
'*' denotes best ucast next-hop
10.1.1.0/24, ubest/mbest: 1/0, attached, direct, pervasive
    *via 10.0.120.34%overlay-1, [1/0], 22:14:10, static
         recursive next hop: 10.0.120.34/32%overlay-1
Troubleshooting TIP
When Troubleshooting Layer 3 Flows Always…
1) Check if there is an Endpoint Learn
If not then…
2) Check if there is a BD (pervasive) static route
If not then…
3) Check if there is an External Route
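The three checks are an ordered lookup: stop at the first one that matches. A minimal sketch of that ordering, where the boolean inputs stand in for whatever the operator found on the switch (the function and key names are hypothetical):

```python
# Order matters: an endpoint learn is preferred over a pervasive route,
# which is preferred over an external route.
CHECK_ORDER = (
    "endpoint learn",
    "BD (pervasive) static route",
    "external route",
)


def forwarding_basis(findings):
    """Return the first check that matched, or None if nothing matched."""
    for check in CHECK_ORDER:
        if findings.get(check):
            return check
    return None


print(forwarding_basis({"endpoint learn": False,
                        "BD (pervasive) static route": True,
                        "external route": True}))
```

A return value of None means none of the three lookups explains the flow, which is itself a finding worth chasing.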
Using Ftriage to Troubleshoot Multipod (14.2+)
L3Out Scenario
a-apic1# ftriage route -ii LEAF:101,103 -dip 10.13.13.13 -sip 172.16.1.1
ftriage: main:839 L3 packet Seen on a-leaf103 Ingress: Eth1/30 (Po15) Egress: Eth1/50 Vnid: 2588674
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:301 Ingress Ctx: jy:vrf11
ftriage: nxos:1404 a-leaf103: nxos matching rule id:4572 scope:63 filter:65535
ftriage: main:933 SIP 172.16.1.1 DIP 10.13.13.13
ftriage: unicast:1058 a-leaf103: Dst EP is a WAN EP
ftriage: unicast:1070 a-leaf103: Policy enforcement mode is ingress
ftriage: unicast:1215 a-leaf103: Dst EP is remote
ftriage: misc:657 a-leaf103: RwDMAC DIPo(10.0.200.67) is one of dst TEPs ['10.0.200.67']
ftriage: main:839 L3 packet Seen on a-spine2 Ingress: Eth1/27 Egress: Eth1/31 Vnid: 2588674
ftriage: main:839 L3 packet Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-26/0 Port-1 Vnid: 2588674
ftriage: main:839 L3 packet Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: Null
ftriage: pktrec:490 a-leaf205: Collecting transient losses snapshot for LC module: 1
ftriage: fib:169 a-leaf205: L3 out interface Ethernet1/31
ftriage: main:522 Computed egress encap string vlan-1055
ftriage: main:313 Building egress BD(s), Ctx
ftriage: main:331 Egress Ctx jy:vrf11
ftriage: main:332 Egress BD(s): jy:vrf11:l3out-bgp:vlan-1055
Look for routed flow ingressing 101 or 103
Frame seen on leaf103
Dst is behind L3out
Sends to this TEP (leaf 205)
Arrives on 205 and forwards out l3out bgp on vlan 1055
Common Multipod L3out Problems
Issue #1: Asymmetric Routing with Active/Active Pods
[Diagram: Pod1 and Pod2, each with leafs and a spine. BD1 10.1.1.0/24 is present in both pods; EP1 is 10.1.1.1/24.]
• Both Pods advertise same BD Subnet
• External Device performs stateful inspection
• Return traffic ingresses different pod than where EP exists
• Traffic dropped
EP sends outbound traffic
Return traffic goes to Pod2 and is dropped by FW
Common Multipod L3out Problems
Issue #1: Asymmetric Routing with Active/Active Pods
[Diagram: Pod1 and Pod2, each with leafs and a spine. BD1 10.1.1.0/24 is present in both pods; EP1 is 10.1.1.1/24.]
• Pods advertise local /32 EP information
• Requires GOLF or Host Border Route Feature (HBR in 4.0)
Pod 1 Advertises 10.1.1.1/32
Return traffic goes to Pod 1
Implement Host Route Advertisement
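Host route advertisement fixes the asymmetry purely through longest-prefix match on the external device. The sketch below illustrates this with a toy routing table; the table contents mirror the slide (both pods advertise 10.1.1.0/24, Pod 1 additionally advertises 10.1.1.1/32) and the function is an illustration, not a model of any particular router:

```python
import ipaddress


def best_next_hops(dst, routes):
    """Return the next hops of the longest matching prefix (ECMP if several)."""
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), nh)
               for prefix, nh in routes
               if addr in ipaddress.ip_network(prefix)]
    if not matches:
        return []
    longest = max(net.prefixlen for net, _ in matches)
    return sorted(nh for net, nh in matches if net.prefixlen == longest)


subnet_only = [("10.1.1.0/24", "Pod1"), ("10.1.1.0/24", "Pod2")]
with_host_route = subnet_only + [("10.1.1.1/32", "Pod1")]

print(best_next_hops("10.1.1.1", subnet_only))       # either pod may be chosen
print(best_next_hops("10.1.1.1", with_host_route))   # pinned to the EP's pod
```

Without the /32 the return path depends on the external device's ECMP or tie-breaking choice, which is what allows traffic to land in the wrong pod.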
Common Multipod L3out Problems
Issue #2: Stretched L3out VIP Failover
[Diagram: Pod1 and Pod2, each with leafs and a spine, attached to firewalls sharing VIP 10.2.2.2, labeled Active and Standby.]
Pod 2 Becomes Active
New Active FW Sends GARP
IPN Forwards GARP
Pod 1 Leafs don’t see GARP, still think local FW is active
Two Common Problems Here
• Same encap VLAN not deployed in each pod (breaks flooded traffic)
• IPN isn’t routing multicast properly
Common Multipod L3out Problems
Issue #2: Stretched L3out VIP Failover
Which VNID and GIPo should the l3out use?
a-apic1# moquery -c fvIfConn -f 'fv.IfConn.dn*"uni/tn-CiscoLive2020/out-EIGRP/"'
Total Objects shown: 2
# fv.IfConn
bcastP : 225.1.188.208
dn : uni/epp/rtd-[uni/tn-CiscoLive2020/out-EIGRP/instP-defaultNet]/node-101/stpathatt-[shared-5596-A-VPC]/conndef/conn-[vlan-1052]-[52.52.52.101/24]
extEncap : vxlan-15466402
# fv.IfConn
bcastP : 225.1.188.208
dn : uni/epp/rtd-[uni/tn-CiscoLive2020/out-EIGRP/instP-defaultNet]/node-205/stpathatt-[shared-5596-A-VPC]/conndef/conn-[vlan-1052]-[52.52.52.103/24]
extEncap : vxlan-15466402
“EIGRP” is the name of the L3out
The same VNID and GIPo is extended to nodes 101 and 205
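Comparing the fvIfConn objects across nodes is a quick way to confirm the SVI is really stretched. The sketch below does that comparison on records transcribed by hand from the moquery output; the dictionary layout and function name are assumptions for illustration, not an APIC API:

```python
# One record per fvIfConn object; "encap" is taken from conn-[vlan-...] in the dn.
CONNS = [
    {"node": 101, "encap": "vlan-1052", "bcastP": "225.1.188.208",
     "extEncap": "vxlan-15466402"},
    {"node": 205, "encap": "vlan-1052", "bcastP": "225.1.188.208",
     "extEncap": "vxlan-15466402"},
]


def stretched_svi_mismatches(conns):
    """Return the attributes that differ between nodes (empty list = consistent)."""
    mismatched = []
    for attr in ("encap", "bcastP", "extEncap"):
        if len({conn[attr] for conn in conns}) > 1:
            mismatched.append(attr)
    return mismatched


print(stretched_svi_mismatches(CONNS))
```

Any attribute in the result points at the mismatch to fix first, since flooded traffic such as a GARP relies on a shared VNID and GIPo.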
Common Multipod L3out Problems
Issue #2: Stretched L3out VIP Failover
If there’s a problem check these things…
• Ensure an SVI is used for the l3out (no flooding for routed interfaces)
• Ensure the same vlan encap is used in each pod
• Ensure the IPN agrees on the tree for the GIPO
• Ensure a GARP is sent by external router
• Check if the GARP is sent with COS 6 (more on this later)
ACI QoS Overview
Key Points
• Fabric QoS is based on COS and DEI bits in outer L2 header
• Incoming COS on non-fabric ports not preserved through fabric but…
• Incoming COS is written into outer DSCP so it can be preserved on egress
• Traffic is level 3 by default (COS 0 + DEI 0)
• COS 6 traffic from IPN may be mistreated (pre-4.0)
ACI QoS Overview
COS | Function | Notes
3,4,5 | APIC, SPAN, Control Plane | SPAN = low priority
6 | iTraceroute | Reserved, sent to CPU; actual iTraceroute is dscp 6
2, 1, 0 | Level 1, Level 2, Level 3 (Default) | User Traffic, 1 Priority Class
2+DEI, 3+DEI, 5+DEI | Level 4, Level 5, Level 6 | New in 4.0! User Traffic, 5 Priority Classes
Used for tracing flows within the fabric.
Reserved for CPU generated traffic
[Diagram: iVXLAN packet layout showing the outer header (DMAC, SMAC, 802.1Q carrying fabric QoS, SIP, DIP, Proto, UDP), the iVXLAN header (flags, EPG, VNID) and the inner header (DMAC, SMAC, ethtype, SIP, DIP, Proto, L4/Payload).]
ACI QoS Overview
Where is QoS Behavior Configured?
Configuration | Where is it Configured? | Function
Dot1p Preserve | Global Access Policies | Causes egress leaf to rewrite cos to original value when forwarding
QoS Class | App EPG, Contract, Subject | Defines prioritization of traffic through the fabric
Custom QoS | App EPG | Re-marks traffic based on incoming COS or DSCP
Target DSCP (L3out) | L3out, Contract, Subject | Sets the DSCP value
DSCP Class-Cos Translation Policy | Infra > Networking > Protocols | Spines re-map QoS of traffic going to and coming from IPN/ISN
ACI QoS – Preserve COS
Egress leaf rewrites COS based on DSCP
ACI Forwarding and QoS – Preserve COS
[Diagram: iVXLAN packet layout showing the outer header (DMAC, SMAC, 802.1Q, SIP, DIP, DSCP), the iVXLAN header (flags, EPG, VNID) and the inner header (DMAC, SMAC, 802.1Q, SIP, DIP, Proto, L4/Payload).]
Layer 2 COS encoded into most significant 3 bits of DSCP
Outer COS Value matches the Level (Contract/EPG)
Note: Incoming COS and DSCP are not used unless a custom QoS policy is configured
Configure Dot1p Preserve! The egress leaf will look at the 3 most significant bits of the DSCP value to know which COS value to use for packet rewrite
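The encoding described here is a 3-bit shift: the COS value occupies the top three bits of the 6-bit DSCP field. A short sketch of both directions (function names are illustrative):

```python
def cos_to_dscp(cos):
    """Place a 3-bit COS value in the 3 most significant bits of DSCP."""
    if not 0 <= cos <= 7:
        raise ValueError("COS must be 0-7")
    return cos << 3


def dscp_to_cos(dscp):
    """Recover COS from the 3 most significant bits of a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be 0-63")
    return dscp >> 3


print(cos_to_dscp(6), format(cos_to_dscp(6), "06b"))
print(dscp_to_cos(48))
```

The same arithmetic is why a device that derives COS from DSCP turns DSCP 48 into COS 6.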
Pre-4.0 COS 6 Problem
[Diagram: Data Center 2 and Data Center 1 joined by a datacenter interconnect (IPN, ISN).]
1. Frame with COS 6 set
2. Leaf forwards frame towards DC1 with COS 0 and an outer DSCP of 48 (0b110 000)
3. IP packet with DSCP 48
4. Last hop IPN router writes COS based on DSCP … DSCP 48 = COS 6
5. DC1 treats packet as iTraceroute
Fix? Configure “DSCP class-cos translation policy for L3 traffic”
The spine will map the outer COS value to a new DSCP class on egress and map DSCP to COS on ingress
DSCP – COS Translation Policy
✓ COS 6 Problem solved by using DSCP – COS Translation Policy
Incoming traffic from IPN now classified on DSCP and not COS
Note: DSCP-COS Translation Policy Will Negate Dot1p Preserve
After 4.0 Software…
• All devices trust DSCP markings set on ingress leaf
• QoS class is derived from DSCP
• Spine rewrites COS received from IPN based on DSCP
• Traceroute is DSCP 6 so COS 6 + DSCP 48 is forwarded normally
[Diagram: COS 6 + DSCP 48 arriving at a spine from the datacenter interconnect (IPN, ISN), shown for pre-4.0 and after 4.0.]
Pre-4.0: Traceroute, not forwarded on egress leaf due to COS 6
After 4.0: Whichever class DSCP 48 maps to
✓ DSCP – COS Translation Policy Not Required
QoS CLIs
show queueing interface ethernetx/y | See per class Interface Stats
show queueing interface ethernetx/y detail | Same as above + control and policy classes
show system internal qos classes | Check queueing behavior per class
moquery -c qospInfraDscpMap | Check COS to DSCP translation policy on spine
moquery -c qosInstPol | Verify if dot1P preserve is enabled (apic or switch)