BRKACI-2934.pdf - Cisco Live


Joseph Young, ACI Technical Leader

BRKACI-2934

ACI Troubleshooting: Multipod


© 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public


Agenda

• Introduction

• Multipod Overview

• Troubleshooting the Multipod Setup Process

• Troubleshooting Unicast Flows

• Troubleshooting Multi-destination Flows

• Troubleshooting External Routed Communication

• Quality of Service

• Conclusion

Acronyms/Definitions

ACI: Application Centric Infrastructure
ACL: Access Control List
APIC/IFC: Application Policy Infrastructure Controller / Insieme Fabric Controller
BD: Bridge Domain
COOP: Council of Oracle Protocol
ECMP: Equal Cost Multipath
EP: Endpoint
EPG: Endpoint Group
EVPN: Ethernet VPN BGP Address-family
FTEP/VTEP: Fabric/Virtual or VXLAN Tunnel Endpoint
GIPo: Outer Group IP Address
ISIS: Intermediate System to Intermediate System
LPM: Longest Prefix Match
MDT: Multicast Distribution Tree
MST: Multiple Spanning Tree
OSPF: Open Shortest Path First Protocol
pcTag: Policy Control Tag
PIM: Protocol Independent Multicast
PL: Physical Local
SVI: Switch Virtual Interface
TC: Topology Change
VL: Virtual Local
VNID: Virtual Network Identifier
VPNv4: BGP VPN Address-Family
VXLAN/iVXLAN: Virtual Extensible LAN / Insieme VXLAN
XR: VXLAN Remote

Multipod Overview


Feature Evolution

• Effective Troubleshooting requires understanding…

• Why does the feature exist?

• What problems does it solve?

• How does it solve them?

• How do the components interact?

Understand the “why”

I said no marketing…why is this necessary?


Feature Evolution – Classic ACI

• VXLAN TEP reachability learned through ISIS

• Endpoint Repo on Spines handled by COOP

• MP-BGP to distribute external routes through fabric

[Figure: a single fabric with three APICs attached to leafs, and leafs connected to spines; ISIS, COOP, and MP-BGP run inside the fabric]

Feature Evolution

• TEP reachability must be communicated

• Endpoints must be synced across locations

• Mechanism needed for BUM traffic

• APIC Cluster must be extended

What if ACI must be extended to other locations?


Feature Evolution – Stretched Fabric

[Figure: a single fabric stretched across locations, each with an APIC, leafs, and spines]

• Transit leafs connect to all spines

• COOP, ISIS, and BGP extended across locations

Not scalable


Feature Evolution – Multipod

• Single Fabric Extended

• Each pod runs its own local instance of ISIS and COOP

• Inter-pod connectivity through IPN

• Inter-pod BUM uses PIM-Bidir

• BGP between pods to share endpoints and external routes

[Figure: Pod1 and Pod2, each with an APIC, leafs, and spines running local ISIS, COOP, and MP-BGP; the pods connect through the IPN (OSPF, PIM) and exchange BGP VPNv4/EVPN]

IPN Requirements

❑ OSPF

❑ DHCP relay

❑ Jumbo MTU (9150 Bytes)

❑ Routed Subinterfaces

❑ PIM Bidir with at least /15 Mask

❑ QoS (optional)

Troubleshooting Multipod Setup


Multipod Setup Overview

1. Configure Pod 1 (TEP pool, infra l3out)

2. Configure Remote Pod (TEP pool, infra l3out)

3. Register Remote Pod Spines (DHCP)

4. Discover Remote Pod Leafs (LLDP)

5. Remote Pod APICs join cluster

Multipod Setup Process

Setting up Pod 1 (Seed Pod)

➢ Configure Addressing for Pod 1 Spine > IPN connection

L3 Interface used for OSPF peering with IPN. Make sure MTU matches IPN!

➢ Configure OSPF parameters for Pod 1 Spine > IPN connection

➢ Configure Dataplane TEP

Not needed for Multipod; leave blank if not used

Anycast Address used for Pod x > Pod 1 Proxied traffic. More on this in Unicast Section


➢ Review POD1 configurations


After setting up Seed Pod (Pod 1)…

✓ Verify OSPF is up between Pod1 Spines and IPN

pod1-spine1# show ip ospf neighbors vrf overlay-1
OSPF Process ID default VRF overlay-1
Total number of neighbors: 1
Neighbor ID Pri State Up Time Address Interface
172.31.255.1 1 FULL/ - 00:00:14 172.21.0.0 Eth1/23.23

✓ Ensure IPN is pre-provisioned for Pod 2 Connections

interface Ethernet1/21.4
mtu 9150
encapsulation dot1q 4
vrf member P1IPN
ip address 172.22.0.0/31
ip ospf network point-to-point
ip router ospf P1IPN area 0.0.0.0
ip pim sparse-mode
ip dhcp relay address 10.0.0.1
ip dhcp relay address 10.0.0.2
no shutdown

Configure Jumbo MTU

Use Vlan 4 Subinterface

Ensure PIM is Enabled

Make sure DHCP relay is configured and points to the APIC infra IPs
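The four callouts above can be turned into a quick offline check of an IPN sub-interface stanza. The following is a minimal sketch, assuming the running configuration has been pasted into a string; the function name `check_ipn_interface` and the problem messages are hypothetical, and the checks cover only what this slide calls out.

```python
import re

# Sub-interface configuration as shown on the slide (indentation added).
IPN_INTERFACE = """\
interface Ethernet1/21.4
  mtu 9150
  encapsulation dot1q 4
  vrf member P1IPN
  ip address 172.22.0.0/31
  ip ospf network point-to-point
  ip router ospf P1IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.0.0.1
  ip dhcp relay address 10.0.0.2
  no shutdown
"""


def check_ipn_interface(config: str) -> list:
    """Return a list of problems found in one IPN sub-interface stanza."""
    problems = []
    mtu = re.search(r"^\s*mtu (\d+)", config, re.M)
    if not mtu or int(mtu.group(1)) < 9150:
        problems.append("jumbo MTU (9150) not configured")
    if not re.search(r"^\s*encapsulation dot1q 4\s*$", config, re.M):
        problems.append("sub-interface is not using VLAN 4")
    if not re.search(r"^\s*ip pim sparse-mode", config, re.M):
        problems.append("PIM is not enabled")
    if not re.search(r"^\s*ip dhcp relay address \S+", config, re.M):
        problems.append("no DHCP relay configured")
    if not re.search(r"^\s*ip router ospf \S+ area \S+", config, re.M):
        problems.append("interface is not in OSPF")
    return problems


problems = check_ipn_interface(IPN_INTERFACE)
print(problems or "interface passes the basic checks")
```

The sketch cannot tell whether the relay addresses are really APIC infra addresses; that still has to be confirmed on the APICs.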


Setting up Pod 2

L3 Interface used for OSPF peering with Pod 2 IPN.

➢ Assign TEP pool to non-seed Pod

➢ Configure L3 parameters


➢ Configure OSPF parameters


➢ Configure Dataplane TEP for Pod 2

Not needed for multipod, leave blank if not used

Anycast Address used for Pod x > Pod 2 Proxied traffic. More on this in Unicast Section

Where to find the less-known MPOD configurations?

Dataplane TEPs from Setup

Spine > IPN subnets leaked into ISIS

Leave default, allows PODs to import BGP paths from each other


➢ POD 2 Spines should now be discoverable


Remote Pod Discovery

1. Remote Pod Spines send DHCP DISCOVER. The IPN relays it to the APICs.

2. IP Address from Multipod l3out is assigned.


What’s in the DHCP OFFER?

Offered IP (from l3out interface profile)

Directory on APIC from which Spine downloads full l3out configuration (*full directory is /firmware/fwrepos/fwrepo/boot/bootstrap-202.xml)

Pod2-facing IPN IP address (relay)

Default gateway, used for downloading config

• IP address from L3out interface profile assigned

• Gateway is next-hop for default route

• Bootstrap file communicates location of l3out Config


3. Spine configures static default route for APIC reachability with NH of IPN.

pod2-spine2# vsh -c "show ip route 0.0.0.0/0 vrf overlay-1"
IP Route Table for VRF "overlay-1"
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

0.0.0.0/0, ubest/mbest: 1/0
*via 172.22.0.0, Eth1/23.23, [250/0], 5d17h, static

4. Spine downloads bootstrap XML from APIC which contains l3out configuration


pod1-apic1# grep bootstrap /var/log/dme/log/access.log

172.22.0.1 - - [18/Apr/2019:14:13:16 +0000] "GET /fwrepo/boot/bootstrap-202.xml HTTP/1.1" 200 8561 "-" "-"

200 OK Code. GET was success!
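When many spines are being registered, the same check can be scripted against the access log. This is a rough sketch, assuming the log lines follow the format of the excerpt above; the function name is hypothetical, and treating the number in the file name (202 here) as a node ID is an assumption, not something the slide states.

```python
import re

# Log line copied from the APIC access.log excerpt above.
LINE = ('172.22.0.1 - - [18/Apr/2019:14:13:16 +0000] '
        '"GET /fwrepo/boot/bootstrap-202.xml HTTP/1.1" 200 8561 "-" "-"')

PATTERN = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def bootstrap_request(line: str):
    """Return (client_ip, file_number, http_status) or None if no match."""
    m = PATTERN.match(line)
    if not m:
        return None
    number = re.search(r"bootstrap-(\d+)\.xml$", m.group("path"))
    if not number:
        return None
    return m.group("client"), int(number.group(1)), int(m.group("status"))


print(bootstrap_request(LINE))
```

A status other than 200, or no matching line at all, suggests the spine never completed the download from its L3Out address.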

switch# moquery -c topSystem
# top.System
address : 0.0.0.0
bootstrapState : downloading-bootstrap-config
role : spine

5. Spine acts as self relay for TEP DHCP request


6. TEP address from POD2 pool is assigned


7. Pod2 Leafs are discovered through the normal process (LLDP/DHCP)

8. Pod2 APIC(s) join cluster

*Non-seed pod APICs still use Pod1 TEP Pool!


Common Multipod Discovery Problems

Possible Causes

1. DHCP Relays on IPN point to APIC OOB rather than infra

✓Configure Relays to point to infra (show controller on APICs)

2. IPN doesn’t have route to APICs

✓Check that OSPF is up between IPN and Pod1

3. Miscabling results in Spine receiving IP in different subnet than GW

✓Correct cabling or addressing then remove and rediscover Spine

4. Spines can’t resolve ARP for connected IPN interface

✓Ensure the SW version supports Multipod on the spine hardware (e.g., the 9364C supports MPOD starting in 3.1(1))
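Cause 3 (offered IP in a different subnet than the gateway) is easy to confirm with a subnet test. A minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical, the first pair of addresses comes from the earlier slides, and the second pair is an invented miscabling case.

```python
import ipaddress


def gateway_on_link(offered: str, gateway: str) -> bool:
    """True when the gateway lies inside the subnet of the offered address."""
    interface = ipaddress.ip_interface(offered)
    return ipaddress.ip_address(gateway) in interface.network


# From the slides: the IPN sub-interface is 172.22.0.0/31 and the spine
# downloaded its bootstrap file from 172.22.0.1.
print(gateway_on_link("172.22.0.1/31", "172.22.0.0"))

# Hypothetical miscabled case: the gateway belongs to a different /31.
print(gateway_on_link("172.22.0.1/31", "172.22.0.2"))
```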

Issue #1: Pod2 Spines Don’t Receive L3out IP or Config

Issue #2: Pod2 Spines Don’t Receive TEP Addresses

Ensure leafs are connected to the spine

- Spine TEP is not assigned until leaf-facing interfaces are “up”

Issue #3: Remote Pod APIC Not Joining Cluster

Check your setup parameters!

Cluster configuration ...
Enter the fabric name [ACI Fabric1]: CL-Fab
Enter the fabric ID (1-128) [1]: 1
Enter the number of active controllers in the fabric (1-9) [3]:
Enter the number of active controllers in the fabric (1-9) [3]: 3
Enter the POD ID (1-12) [1]: 2
Is this a standby controller? [NO]:
Is this an APIC-X? [NO]:
Enter the controller ID (1-3) [1]: 3
Enter the controller name [apic3]: p2-apic3
Enter address pool for TEP addresses [10.0.0.0/16]:
Note: The infra VLAN ID should not be used elsewhere in your environment
and should not overlap with any other reserved VLANs on other platforms.
Enter the VLAN ID for infra network (1-4094): 3967

Out-of-band management configuration ...
Enable IPv6 for Out of Band Mgmt Interface? [N]:
Enter the IPv4 address [192.168.10.1/24]: 10.122.143.14/26
Enter the IPv4 address of the default gateway [None]: 10.122.143.1
Enter the interface speed/duplex mode [auto]:

Ensure Pod ID is Correct

Remote Pod APIC must use Pod 1 TEP Pool

✓ Run “acidiag avread” to check setup config

✓ If wrong, wipe and reload the APIC:

“acidiag touch clean”

“acidiag touch setup”

“acidiag reboot”

Multipod Setup Verification Checklist

❑ Verify BGP EVPN and VPNv4 are up between pods

❑ Verify both unicast and multidestination interpod flows work

❑ Verify jumbo MTU interpod flows work

❑ Verify the above flows work during various Spine > IPN and IPN > IPN link failures

Troubleshooting Unicast Flows

Multipod Unicast Overview

Troubleshooting Single Pod and Multipod is similar!

Key differences from Single Pod unicast:

• Spines share EPs via BGP EVPN

• Tunnels to the remote pod are built dynamically

• IPNs must support overlay traffic requirements

Layer 2 Unicast

BD Settings - UUC Proxy, ARP Flooding Enabled, UC Routing Disabled

[Figure: Pod1 and Pod2, each with leafs and spines; EP1 (172.16.1.1/24, 0050.56a8.b003) is attached to a Pod1 leaf]

Verify first that the flow is unicast

root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192

root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.

Layer 2 Unicast

Ingress traffic triggers local learn

a-leaf101# show endpoint mac 0050.56a8.b003 detail | grep epg-l2-2
123/CiscoLive2020:vrf1 vlan-1011 0050.56a8.b003 L eth1/26 CiscoLive2020:ap1:epg-l2-2

Ingress leaf updates COOP record on Spines

a-spine1# show coop internal info repo ep | grep -B 8 -A 35 00:50:56:A8:B0:03
------------------------------------------
**omitted
EP bd vnid : 15761417
EP mac : 00:50:56:A8:B0:03
**omitted
Tunnel nh : 10.0.72.67
**omitted

Who owns the tunnel next-hop?

a-apic1# moquery -c ipv4Addr -f 'ipv4.Addr.addr=="10.0.72.67"'
Total Objects shown: 1

# ipv4.Addr
addr : 10.0.72.67/32
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-overlay-1/if-[lo0]/addr-[10.0.72.67/32]
**omitted

Leaf 101 has the local learn

Layer 2 Unicast

How does the remote pod learn about the EP?

1. Local Pod Spine installs COOP record
show coop internal info repo ep | grep -B 8 -A 35 <mac address>

2. Local Pod Spine exports into BGP EVPN
show bgp l2vpn evpn <mac address> vrf overlay-1

3. Remote Pod Spine receives through EVPN
show bgp l2vpn evpn <mac address> vrf overlay-1

4. Remote Pod Spine imports into COOP
show coop internal info repo ep | grep -B 8 -A 35 <mac address>
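The four stages form a chain, so the first missing stage tells you where to look. The sketch below is one way to record that walk, assuming each command has been run by hand and its result noted as present or missing; the stage labels and the function name are hypothetical.

```python
# Stage order follows the slide: the endpoint's own pod first, then the remote pod.
STAGES = [
    ("local spine COOP record",
     "show coop internal info repo ep | grep -B 8 -A 35 <mac address>"),
    ("local spine BGP EVPN export",
     "show bgp l2vpn evpn <mac address> vrf overlay-1"),
    ("remote spine BGP EVPN route",
     "show bgp l2vpn evpn <mac address> vrf overlay-1"),
    ("remote spine COOP record",
     "show coop internal info repo ep | grep -B 8 -A 35 <mac address>"),
]


def first_missing_stage(present):
    """Return (stage, command) for the first missing stage, or None.

    `present` holds one boolean per stage, filled in by hand after running
    the matching command on the relevant spine.
    """
    if len(present) != len(STAGES):
        raise ValueError("expected one result per stage")
    for (stage, command), ok in zip(STAGES, present):
        if not ok:
            return stage, command
    return None


# Example: the local pod looks fine but the remote spines never got the route.
print(first_missing_stage([True, True, False, False]))
```

"Local" here means the pod where the endpoint is attached, which may be Pod 2 from the point of view of the host that cannot reach it.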


Local spines export to EVPN

a-spine1# show bgp l2vpn evpn 00:50:56:A8:B0:03 vrf overlay-1
Route Distinguisher: 1:16777199 (L2VNI 1)
BGP routing table entry for [2]:[0]:[15761417]:[48]:[0050.56a8.b003]:[0]:[0.0.0.0]/216, **omitted
Paths: (1 available, best #1)
Flags: (0x00010a 00000000) on xmit-list, is not in rib/evpn
Multipath: eBGP iBGP

Advertised path-id 1
Path type: local 0x4000008c 0x0 ref 0, path is valid, is best path
AS-Path: NONE, path locally originated
0.0.0.0 (metric 0) from 0.0.0.0 (192.168.1.101)
Origin IGP, MED not set, localpref 100, weight 32768
Received label 15761417
Extcommunity:
RT:5:16

Path-id 1 advertised to peers:
192.168.2.101 192.168.2.102

BD VNID: 15761417

Originated Locally: "path locally originated"

Advertised to Remote Pod Spines: 192.168.2.101, 192.168.2.102
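The bracketed route entry can be split mechanically to compare the BD VNID and MAC against the COOP record. A small sketch, assuming the seven-field layout seen in the output above; the slide only confirms the third field as the BD VNID and the fifth as the MAC, so the other key names are my own labels.

```python
import re


def parse_evpn_mac_route(prefix: str) -> dict:
    """Split a '[2]:[..]:[vnid]:[48]:[mac]:[..]:[ip]/len' entry into fields."""
    fields = re.findall(r"\[([^\]]*)\]", prefix)
    if len(fields) != 7 or fields[0] != "2":
        raise ValueError("not a 7-field type-2 entry: %r" % prefix)
    return {
        "route_type": int(fields[0]),
        "bd_vnid": int(fields[2]),   # annotated as BD VNID on the slide
        "mac_bits": int(fields[3]),  # assumed: MAC length in bits
        "mac": fields[4],
        "ip": fields[6],             # 0.0.0.0 in this MAC-only example
    }


route = parse_evpn_mac_route(
    "[2]:[0]:[15761417]:[48]:[0050.56a8.b003]:[0]:[0.0.0.0]/216"
)
print(route["bd_vnid"], route["mac"])
```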


Remote spines receive EP through EVPN

a-spine3# show bgp l2vpn evpn 00:50:56:A8:B0:03 vrf overlay-1
Route Distinguisher: 1:16777199
BGP routing table entry for [2]:[0]:[15335345]:[48]:[0050.56a8.b003]:[0]:[0.0.0.0]/216, **omitted
Paths: (2 available, best #1)
Flags: (0x000202 00000000) on xmit-list, is not in rib/evpn, is locked
Multipath: eBGP iBGP

Advertised path-id 1
Path type: internal 0x40000018 0x2040 ref 1, path is valid, is best path
AS-Path: NONE, path sourced internal to AS
192.168.1.254 (metric 3) from 192.168.1.101 (192.168.1.101)
Origin IGP, MED not set, localpref 100, weight 0
Received label 15335345
Received path-id 1
Extcommunity:
RT:5:16 ENCAP:8

192.168.1.254: Dataplane TEP/ETEP of POD1

192.168.1.101: BGP Address of Spine 1

Layer 2 Unicast

What is the Dataplane TEP/External Proxy TEP (ETEP)?

• Per-Pod anycast address

• Owned by all spines in a pod

• Set as Next Hop for BGP EVPN Paths

• COOP Placeholder for external proxy lookups

a-apic1# moquery -c ipv4If -f 'ipv4.If.mode*"etep"' -x 'rsp-subtree=children'

# ipv4.If
id : lo14
adminSt : enabled
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo14]
donorIf : unspecified
lcOwn : local
modTs : 2019-02-20T16:58:34.113-04:00
mode : etep
rn : if-[lo14]

# ipv4.Addr
addr : 192.168.1.254/32

**omitted

Loopback14 on Node 1001

mode etep: Multipod Dataplane TEP/ETEP

addr 192.168.1.254: NH Address in BGP Update

Not Actually Used for Dataplane!

Spine Proxied Layer 2 Traffic

1. Spine COOP lookup points to remote pod ETEP
2. Layer 2 ETEP lookup
3. Forward to remote pod external MAC proxy TEP

apic1# moquery -c ipv4If -f 'ipv4.If.mode=="anycast-mac,external"' -x 'rsp-subtree=children' | egrep "addr" | grep dn
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo8]/addr-[10.0.0.33/32]
dn : topology/pod-2/node-2001/sys/ipv4/inst/dom-overlay-1/if-[lo1]/addr-[10.0.128.33/32]

Spine Proxied Layer 3 Traffic

1. Spine COOP lookup points to remote pod ETEP
2. Layer 3 ETEP lookup
3. Forward to remote pod external v4/v6 proxy TEP

apic1# moquery -c ipv4If -f 'ipv4.If.mode=="anycast-v4,external"' -x 'rsp-subtree=children' | egrep "addr" | grep dn
dn : topology/pod-2/node-2001/sys/ipv4/inst/dom-overlay-1/if-[lo2]/addr-[10.0.128.34/32]
dn : topology/pod-1/node-1001/sys/ipv4/inst/dom-overlay-1/if-[lo9]/addr-[10.0.0.34/32]

Layer 2 Unicast

Verify Remote Pod COOP Entry

a-spine3# show coop internal info repo ep | grep -B 8 -A 35 00:50:56:A8:B0:03
------------------------------------------
**omitted
EP bd vnid : 15761417
EP mac : 00:50:56:A8:B0:03
Remote Type : MPOD
MAC Tunnel : 10.0.0.33
IPv4 Tunnel : 10.0.0.34
IPv6 Tunnel : 10.0.0.35
ETEP Tunnel : 192.168.1.254
**omitted

COOP Entry exists and points to POD1

Proxied L2 traffic will forward to the Pod1 External MAC-proxy address

Layer 2 Unicast

BD Settings - UUC Proxy, ARP Flooding Enabled, UC Routing Disabled

[Figure: EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1 pings EP2 (172.16.1.2/24, 8c60.4f02.88fc) in Pod2]

root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192

root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.

1. No EP entry for dmac, send to proxy

pod1-leaf101# show endpoint mac 8c60.4f02.88fc
<no output>

pod1-leaf101# show isis dteps vrf overlay-1
10.0.120.33 SPINE PHYSICAL,PROXY-ACAST-MAC

2. Local Spines have COOP entry pointing to remote ETEP

3. Remote Spines have COOP entry pointing to Local Pod Leaf

4. Local Pod Leaf receives, installs tunnel to source leaf, and forwards to EP

Dynamic Tunnel Learns

VXLAN Tunnels are Created 3 Ways

1. Remote Pod Endpoint Learns

a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'

id : tunnel1
dest : 10.0.72.67
idRequestorDn : sys/inst-overlay-1/db-dtep/dtep-[10.0.72.67]

2. Remote POD External Routes

a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'

id : tunnel1
dest : 10.0.72.64
idRequestorDn : sys/bgp/inst/dom-overlay-1/db-dtep/dtep-[10.0.72.64]

3. Local POD ISIS Database

a-leaf205# moquery -c tunnelIf -f 'tunnel.If.id=="tunnel1"'

# tunnel.If
id : tunnel1
dest : 10.0.152.64
idRequestorDn : sys/isis/inst-default/dom-overlay-1/lvl-l1/db-dtep/dtep-[10.0.152.64]

Dynamic Tunnel Learns

Endpoint (Dataplane) Created Tunnels

[Figure: Pod1 (TEP pool 10.0.0.0/17) leaf with TEP 10.0.72.67 and EP1 (172.16.1.1/24, 0050.56a8.b003); Pod2 (TEP pool 10.0.128.0/17) leaf with TEP 10.0.200.67 and EP2 (172.16.1.2/24, 8c60.4f02.88fc); EP1 runs "ping 172.16.1.2"]

Leafs install white-list for remote TEP ranges

Outer Dst IP    Outer Src IP    Inner Dst IP    Inner Src IP
10.0.200.67     10.0.72.67      172.16.1.2      172.16.1.1

Verify Tunnel Created on Dst Leaf

moquery -c tunnelIf -f 'tunnel.If.dest=="10.0.72.67"'
Total Objects shown: 1

# tunnel.If
id : tunnel3
dest : 10.0.72.67
operSt : up
**omitted

Verify White-List on Dst Leaf

vsh_lc -c "show sys internal eltmc info pfxwl_table" | grep "Prefix" | grep -v Tunnel
**omitted
Prefix: 10.0.20.64
Prefix len: 27
Prefix: 10.0.64.64
Prefix len: 27
Prefix: 10.0.72.64 (Src TEP matches here)
Prefix len: 27
**omitted

White-list prefixes are based on DHCP pools within the remote POD TEP range
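To see why the outer source TEP 10.0.72.67 matches the 10.0.72.64/27 entry, the white-list lookup can be reproduced with Python's `ipaddress` module. This is only an offline illustration of the prefix match, not how the leaf implements it; the list and function names are hypothetical.

```python
import ipaddress

# Prefixes as listed in the pfxwl_table output above (all /27).
WHITELIST = ["10.0.20.64/27", "10.0.64.64/27", "10.0.72.64/27"]


def matching_prefix(src_tep: str, whitelist=WHITELIST):
    """Return the first white-list prefix containing src_tep, or None."""
    addr = ipaddress.ip_address(src_tep)
    for prefix in whitelist:
        if addr in ipaddress.ip_network(prefix):
            return prefix
    return None


print(matching_prefix("10.0.72.67"))   # outer source TEP of the Pod1 leaf
print(matching_prefix("10.0.200.67"))  # the Pod2 leaf's own TEP, not listed
```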

Layer 2 Unicast

Remote Leaf Installs EP to Source

a-leaf205# show endpoint mac 0050.56a8.b003 detail
**omitted
100/CiscoLive2020:vrf1 0050.56a8.b003 tunnel16 CiscoLive2020:bd-L2-2

Where does tunnel16 point?

a-leaf205# show interface tunnel16
Tunnel16 is up
Tunnel source 10.0.200.67/32 (lo0)
Tunnel destination 10.0.72.67
**omitted

a-apic1# moquery -c ipv4Addr -f 'ipv4.Addr.addr=="10.0.72.67"'

addr : 10.0.72.67/32
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-overlay-1/if-[lo0]/addr-[10.0.72.67/32]
**omitted

Layer 2 Unicast

Return Path…

[Figure: EP2 (172.16.1.2/24, 8c60.4f02.88fc) in Pod2 replies to EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1]

1. Pod2 Leaf forwards based on remote learn

2. Spines simply provide transit

3. Pod1 Leaf installs tunnel and remote learn to Pod2 leaf

Using Ftriage to Troubleshoot Multipod (14.2+)

*Recommended with EX or Later Hardware

a-apic1# ftriage bridge -ii LEAF:101,103 -dip 172.16.1.2 -sip 172.16.1.1
Starting ftriage
ftriage: main:839 L2 frame Seen on a-leaf101 Ingress: Eth1/30 (Po15) Egress: Eth1/54 Vnid: 16056274
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:839 L2 frame Seen on a-spine2 Ingress: Eth1/25 Egress: Eth1/31 Vnid: 16056274
ftriage: fib:332 a-spine2: Transit in spine
ftriage: unicast:1458 a-spine2: Infra route 10.0.200.67 present in RIB
ftriage: unicast:1681 a-spine2: Packet is exiting the fabric through {a-spine2: ['Eth1/31']}
ftriage: main:839 L2 frame Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-22/0 Port-1 Vnid: 16056274
ftriage: fib:332 a-spine3: Transit in spine
ftriage: unicast:1458 a-spine3: Infra route 10.0.200.67 present in RIB
ftriage: unicast:1774 L2 frame Seen on FC of node: a-spine3
….
ftriage: main:622 Found peer-node a-leaf205 and IF: Eth1/53 in candidate list
ftriage: main:839 L2 frame Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: 11371
ftriage: main:522 Computed egress encap string vlan-1039
ftriage: main:332 Egress BD(s): jy:cl1
ftriage: unicast:1833 a-leaf205: Dst EP is local
ftriage: misc:657 a-leaf205: EP if(Eth1/31) same as egr if(Eth1/31)

Look for bridged flow ingressing 101 or 103

Frame seen on leaf101

Frame seen on spine2

Frame seen on pod2 spine3

Frame seen on pod2 leaf205

Forwards out eth1/31
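The hop-by-hop path called out above can be pulled out of a saved ftriage log with a regular expression. A rough sketch, assuming the "L2 frame Seen on <node> Ingress: ... Egress: ... Vnid: ..." line format shown in this output; the sample keeps only the four matching lines, and the function name is hypothetical.

```python
import re

# The four "frame Seen on" lines from the ftriage output above.
LOG = """\
ftriage: main:839 L2 frame Seen on a-leaf101 Ingress: Eth1/30 (Po15) Egress: Eth1/54 Vnid: 16056274
ftriage: main:839 L2 frame Seen on a-spine2 Ingress: Eth1/25 Egress: Eth1/31 Vnid: 16056274
ftriage: main:839 L2 frame Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-22/0 Port-1 Vnid: 16056274
ftriage: main:839 L2 frame Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: 11371
"""

HOP = re.compile(
    r"frame Seen on (?P<node>\S+) Ingress: (?P<ingress>.+?) "
    r"Egress: (?P<egress>.+?) Vnid: (?P<vnid>\d+)"
)


def hops(log: str) -> list:
    """Return one dict per node where ftriage saw the frame, in log order."""
    return [m.groupdict() for m in HOP.finditer(log)]


for hop in hops(LOG):
    print(hop["node"], hop["ingress"], "->", hop["egress"], "vnid", hop["vnid"])
```

A node missing from the expected leaf, spine, spine, leaf sequence marks where to start looking for the drop.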

Troubleshooting Scenario:

EP’s cannot communicate in L2 BD

Pod 1 Verifications

1. Is communication unicast or multi-destination?

root@vm1:/home/joyo# arp -an
? (172.16.1.2) at 8c:60:4f:02:88:fc [ether] on ens192

root@vm1:/home/joyo# ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.

ARP is resolved so host sends unicast

a-leaf101# show endpoint mac 8c60.4f02.88fc
<no entry>

Ingress leaf has no remote learn

Does BD flood or proxy unknown unicast?

a-apic1# moquery -c fvBD -f 'fv.BD.name=="bd-L2-2"'

name : bd-L2-2
dn : uni/tn-CiscoLive2020/BD-bd-L2-2
unkMacUcastAct : proxy

UUC set to “Proxy”

Pod 1 Verifications

2. Does Local Pod Spine have the EP?

a-spine1# moquery -c coopEpRec -f 'coop.EpRec.mac=="8c60.4f02.88fc "'
No Mos found

MAC not in COOP

a-spine1# show bgp l2vpn evpn 8c60.4f02.88fc vrf overlay-1
<no output>

MAC not in EVPN

Pod 2 Verifications

3. Does Remote Pod Spine have the EP?

a-spine3# moquery -c coopEpRec -f 'coop.EpRec.mac=="8c60.4f02.88fc "'
# coop.EpRec
vnid : 15761417
mac : 8C:60:4F:02:88:FC
**omitted**

MAC is in COOP

a-spine3# show bgp l2vpn evpn 8c60.4f02.88fc vrf overlay-1
<omitted>
AS-Path: NONE, path locally originated
0.0.0.0 (metric 0) from 0.0.0.0 (192.168.2.101)
Origin IGP, MED not set, localpref 100, weight 32768
Received label 15761417
Extcommunity:
RT:5:16

Remote Spine Exports to EVPN

4. Is EVPN up between Pods?

a-spine1# show bgp l2vpn evpn summ vrf overlay-1

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
192.168.2.101 4 65000 57380 66362 0 0 0 00:00:21 Active
192.168.2.102 4 65000 57568 66357 0 0 0 00:00:22 Active

BGP is down
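In the summary table, an established neighbor shows a prefix count in the last column, while a neighbor that is not up shows a state name such as Active. A minimal sketch that flags the latter from pasted output, assuming the ten-column layout shown above; the function name is hypothetical.

```python
# Summary rows as shown in the output above.
SUMMARY = """\
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
192.168.2.101 4 65000 57380 66362 0 0 0 00:00:21 Active
192.168.2.102 4 65000 57568 66357 0 0 0 00:00:22 Active
"""


def down_neighbors(summary: str) -> list:
    """Return (neighbor, state) for rows whose last column is not a number."""
    down = []
    for line in summary.splitlines():
        cols = line.split()
        if len(cols) != 10 or cols[0] == "Neighbor":
            continue  # skip the header and anything that is not a neighbor row
        if not cols[-1].isdigit():
            down.append((cols[0], cols[-1]))
    return down


print(down_neighbors(SUMMARY))
```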

Next Steps…

• Do the local spines have routes to remote spines?

• Does IPN support jumbo MTU?

• Can spines ping between each other?

Pod 1 or Pod2 Verifications

Layer 3 Unicast

…nearly identical to layer 2 unicast

• Differences from Layer 2

  • VRF Lookup rather than BD lookup

  • VRF VNID used instead of BD VNID

  • Spines trigger ARP Glean if Dst is Unknown (leverages fabric multicast)

Layer 3 Unicast – Glean Scenario

BD Settings - UC Routing Enabled

[Figure: EP1 (172.16.1.1/24, 0050.56a8.b003) attached to a Pod1 leaf]

root@vm1:/home/joyo# ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2) 56(84) bytes of data.

No EP learn, check routing table

a-leaf101# show endpoint ip 172.16.2.2
+--------+-------+-------------+-----------+-----------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+--------+-------+-------------+-----------+-----------+

Pervasive flag indicates BD subnet; the route is static

a-leaf101# show ip route 172.16.2.2 vrf CiscoLive2020:vrf1

172.16.2.0/24, ubest/mbest: 1/0, attached, direct, pervasive
*via 10.0.120.34%overlay-1, [1/0], 12:02:55, static
recursive next hop: 10.0.120.34/32%overlay-1

Next-hop is spine proxy

a-leaf101# show isis dtep vrf overlay-1 | grep 10.0.120.34
10.0.120.34 SPINE N/A PHYSICAL,PROXY-ACAST-V4

Layer 3 Unicast – Glean Scenario

BD Settings - UC Routing Enabled

[Figure: EP1 (172.16.1.1/24, 0050.56a8.b003) in Pod1 pings EP2 (172.16.2.2/24, 8c60.4f02.88fc) in Pod2]

root@vm1:/home/joyo# ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2) 56(84) bytes of data.

Local Spines have no COOP entry for Dst IP (172.16.2.2 not learned yet)

a-spine1# show coop internal info ip-db | grep -F -B 1 -A 15 "172.16.2.2"

No COOP Entry! This will trigger a Glean

Layer 3 Unicast

What is a Glean?

• If the Spines do not have an IP learn

and…

• The destination IP is within a deployed BD Subnet

✓ The spine floods the proxied request with a special ethertype

✓ Gleans are flooded to 239.255.255.240 (*see hidden slide)

✓ Leafs with the destination subnet generate ARP

Inter-Pod Glean

ERSPAN of Spine > IPN Link

nexus5k# ping 172.16.2.2 source 172.16.1.1 vrf jy1
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1: 56 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out

Src: Originating Leaf

Dst: Reserved Glean Multicast Group

Custom Ethertype for Gleans

Source IP that triggered Glean: 0xac100101 = 172.16.1.1

Dst IP that triggered Glean: 0xac100202 = 172.16.2.2
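The hex values seen in the capture convert to dotted-quad addresses one byte at a time (0xac = 172, 0x10 = 16, and so on). A short sketch of that conversion using Python's standard `ipaddress` module; the helper name is hypothetical.

```python
import ipaddress


def hex_to_ipv4(value: str) -> str:
    """Convert a 32-bit hex string such as '0xac100101' to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(value, 16)))


print(hex_to_ipv4("0xac100101"))  # source IP that triggered the glean
print(hex_to_ipv4("0xac100202"))  # destination IP that triggered the glean
```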

System Gipo Usage

• If “Use Infra as System Gipo” is enabled, the actual BD GIPos are used rather than 239.255.255.240

Capturing a Glean with Tcpdump

ACI Leafs and Spines contain pseudo interfaces for traffic to and from the CPU

1st Gen Leaf (CPU: kpm_inb; ASIC: knet0, knet1)

• For traffic going to the CPU, check knet0 and kpm_inb

• For traffic coming from the CPU, check knet1 and kpm_inb

EX (or Later) Leaf (CPU: kpm_inb; ASIC: tahoe0)

• For traffic to and from the CPU, check tahoe0 and kpm_inb

*Note: not all traffic will show up on the kpm_inb interface. However, all traffic shows on the pseudo interface.

*Gen1 and 2 modular spines use psdev0, psdev1, and psdev2 interfaces. Gen 2 fixed spines use tahoe0. Gen 1 fixed spines use knet0-3.

• Traffic on the knet or tahoe pseudo interface will have a special ieth header. It must be decoded.

• Starting in 3.2 the knet_parser.py script is available on the switch CLI to decode
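The interface choice for leafs can be written down as a small lookup that also produces the capture and decode commands used in this session. A sketch only: the platform labels `gen1-leaf` and `ex-leaf` and the function name are hypothetical, spines are left out, and the command strings follow the tcpdump and knet_parser.py usage shown in this deck.

```python
# Hypothetical platform labels; interface and decoder names come from the slides.
PSEUDO_INTERFACES = {
    "gen1-leaf": {"to_cpu": "knet0", "from_cpu": "knet1", "decoder": "knet"},
    "ex-leaf": {"to_cpu": "tahoe0", "from_cpu": "tahoe0", "decoder": "tahoe"},
}


def capture_plan(platform: str, direction: str) -> list:
    """Return the capture command and the decode command for one leaf type."""
    info = PSEUDO_INTERFACES[platform]
    ifname = info[direction]
    pcap = f"/bootflash/{ifname}.pcap"
    return [
        f"tcpdump -xxxvei {ifname} -w {pcap}",
        f"knet_parser.py --file {pcap} --pcap --decoder {info['decoder']}",
    ]


for command in capture_plan("gen1-leaf", "from_cpu"):
    print(command)
```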

© 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public

Capturing a Glean with Tcpdump

68BRKACI-2934

Gen2 or Later Leaftcpdump -xxxvei tahoe0 -w /bootflash/tahoe0.pcapknet_parser.py --file /bootflash/tahoe0.pcap --pcap --decoder tahoe

Frame 111Time: 2019-05-16T16:56:33.059831+00:00Header: ieth_extn CPU Receive

sup_qnum:0x14, sup_code:0x21, istack:ISTACK_SUP_CODE_SPINE_GLEAN(0x21)Header: ieth

sup_tx:0, ttl_bypass:0, opcode:0x6, bd:0x120e, outer_bd:0x27, dl:0, span:0, traceroute:0, tclass:0src_idx:0x3a, src_chip:0x0, src_port:0x19, src_is_tunnel:1, src_is_peer:1dst_idx:0x0, dst_chip:0x0, dst_port:0x0, dst_is_tunnel:0

Len: 148Eth: 000d.0d0d.0d0d > 0100.5e7f.fff1, len/ethertype:0x8100(802.1q)802.1q: vlan:2, cos:5, len/ethertype:0x800(ipv4)ipv4: 10.0.116.64 > 239.255.255.241, len:130, ttl:249, id:0x0, df:0, mf:0, offset:0x0, dscp:32, prot:17(udp)udp: (ivxlan) 0 > 48879, len:110ivxlan: n:1, l:1, i:1,

vnid: 0x2b0000lb:0, dl:1, exception:0, src_policy:0, dst_policy:0, src_class:0x5c0mcast(routed:0, ingress_encap:0/802.1q), ac_bank:0, src_port:0x0

Eth: 000c.0c0c.0c0c > ffff.ffff.ffff, len/ethertype:0xfff2(aci-glean)ipv4: 172.16.1.1 > 172.16.2.2, len:84, ttl:63, id:0x71f9, df:1, mf:0, offset:0x0, dscp:0, prot:1(icmp)icmp: echo request id:0x9092, seq:0x1980

Traffic that triggered Glean

Switch recognizes this as a Glean

RX sup traffic rather than TX

Decode type should be tahoe for tahoe interface

Egress Leaf Verification

Capturing a Glean with Tcpdump

Gen1 Leaf Example

tcpdump -xxxvei knet0 -w /bootflash/knet0.pcap
knet_parser.py --file /bootflash/knet0.pcap --pcap --decoder knet

Egress Leaf Verification

tcpdump -xxxvei knet1 -w /bootflash/knet1.pcap
knet_parser.py --file /bootflash/knet1.pcap --pcap --decoder knet

a-leaf102# tcpdump -xxxvei kpm_inb ether proto 0xfff2
tcpdump: listening on kpm_inb, link-type EN10MB (Ethernet), capture size 65535 bytes
15:27:37.663580 00:0c:0c:0c:0c:0c (oui Unknown) > Broadcast, ethertype Unknown (0xfff2), length 94:
0x0000: ffff ffff ffff 000c 0c0c 0c0c fff2 4500
0x0010: 0054 aa4b 4000 3f01 825d 0404 0464 0303
0x0020: 0396 0800 0dc6 2384 38db 5275 dd5c 0000
0x0030: 0000 9e35 0100 0000 0000 1011 1213 1415
0x0040: 1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
0x0050: 2627 2829 2a2b 2c2d 2e2f 3031 3233

knet0 would show Rx traffic (similar output as Tahoe0)

knet1 would show Tx traffic

No decode necessary for kpm_inb (cpu) interface…Gleans aren’t easily readable
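Because the raw kpm_inb hex is hard to read, a few lines of Python can pull out the inner IPv4 addresses. This is a minimal sketch that assumes the bytes after the 0xfff2 ethertype are a plain 20-byte IPv4 header with no options; the function name is illustrative, and the hex is the first three rows of the kpm_inb capture with the offsets removed.

```python
import ipaddress
import struct

# First 48 bytes of the kpm_inb capture (offset prefixes removed).
HEX_DUMP = """
ffff ffff ffff 000c 0c0c 0c0c fff2 4500
0054 aa4b 4000 3f01 825d 0404 0464 0303
0396 0800 0dc6 2384 38db 5275 dd5c 0000
"""

GLEAN_ETHERTYPE = 0xFFF2  # ethertype shown for ACI glean frames in the capture


def decode_glean(hex_text: str) -> dict:
    """Decode the Ethernet header and the inner IPv4 header of a glean frame."""
    frame = bytes.fromhex("".join(hex_text.split()))
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != GLEAN_ETHERTYPE:
        raise ValueError(f"ethertype {ethertype:#06x} is not a glean frame")
    ip_header = frame[14:34]  # assumption: 20-byte IPv4 header, no options
    return {
        "src_mac": src_mac.hex(":"),
        "dst_mac": dst_mac.hex(":"),
        "ttl": ip_header[8],
        "protocol": ip_header[9],
        "src_ip": str(ipaddress.IPv4Address(ip_header[12:16])),
        "dst_ip": str(ipaddress.IPv4Address(ip_header[16:20])),
    }


print(decode_glean(HEX_DUMP))
```

The destination IP is the address the fabric is trying to resolve, so it is the value to look for in the ARP event history on the egress leaf.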

Layer 3 Unicast – Glean Scenario

IPN Must Route 239.255.255.240 (*see Troubleshooting Multi-destination Flows section)

IPN1# show run | grep 239
ip pim rp-address 192.168.100.1 group-list 239.0.0.0/8 bidir
ip pim rp-address 10.10.1.1 group-list 239.0.0.0/8 bidir

a-leaf205# show ip arp internal event-history event | grep -F -B 1 172.16.2.2
73) Event:E_DEBUG_DSF, length:127, at 316928 usecs after Wed May 1 08:31:53 2019
Updating epm ifidx: 1a01e000 vlan: 105 ip: 172.16.2.2, ifMode: 128 mac: 8c60.4f02.88fc
75) Event:E_DEBUG_DSF, length:152, at 316420 usecs after Wed May 1 08:31:53 2019
log_collect_arp_pkt; sip = 172.16.2.2; dip = 172.16.2.254; interface = Vlan104;info = Garp Check adj:(nil)
77) Event:E_DEBUG_DSF, length:142, at 131918 usecs after Wed May 1 08:28:36 2019
log_collect_arp_pkt; dip = 172.16.2.2; interface = Vlan104;iod = 138; Info = Internal Request Done
78) Event:E_DEBUG_DSF, length:136, at 131757 usecs after Wed May 1 08:28:36 2019
log_collect_arp_glean;dip = 172.16.2.2;interface = Vlan104;info = Received pkt Fabric-Glean: 1
79) Event:E_DEBUG_DSF, length:174, at 131748 usecs after Wed May 1 08:28:36 2019
log_collect_arp_glean; dip = 172.16.2.2; interface = Vlan104; vrf = CiscoLive2020:vrf1; info = Address in PSVI subnet or special VIP

Glean Group Range included as Bidir on IPN

Verify ARP on Remote Leaf

Glean Received, Dst IP is in BD Subnet

ARP Request is generated by leaf

Response Received

Endpoint Learn Installed

Using Ftriage to Troubleshoot Multipod (14.2+)

L3 Proxy/Glean Scenario

a-apic1# ftriage route -ii LEAF:101,103 -dip 172.16.2.3 -sip 172.16.1.1
ftriage: main:839 L3 packet Seen on a-leaf103 Ingress: Eth1/30 (Po15) Egress: Eth1/49 Vnid: 2588674
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:301 Ingress Ctx: jy:vrf11
ftriage: main:933 SIP 172.16.1.1 DIP 172.16.2.3
ftriage: unicast:973 a-leaf103: <- is ingress node
ftriage: unicast:1194 a-leaf103: Dst EP is unknown - proxy
ftriage: main:839 L3 packet Seen on a-spine1 Ingress: Eth2/29 Egress: LC-2/3 FC-23/0 Port-1 Vnid: 2588674
ftriage: fib:323 a-spine1: EP not found in COOP! for VRF VNID: 2588674
ftriage: unicast:1373 a-spine1: EP is unknown in COOP. Ftriage will exit but continue with further fault isolation
ftriage: unicast:1412 a-spine1: Egress node not provided. Cannot check local EP. Exiting!
ftriage: unicast:1413 : Ftriage Completed with hunch: Check if local EP learnt on egress node(s)

Look for routed flow ingressing 101 or 103

Frame seen on leaf103

Dst Unknown, Proxy!

Seen on spine1

EP not in COOP!

✓ EP Not in COOP, gleans should be generated. Check local learn on egress leaf

Troubleshooting Multi-destination Flows

Multipod Multicast

What does Multipod use BUM for?

• Unknown Unicast Flooding

• Multidestination Traffic (ARP, Multicast, BPDUs)

• Inter-pod Glean Messages

• EP Announce Messages

…it doesn’t just affect multidestination flows

IPN Multicast Control-plane

• Spines act as multicast hosts (IGMP only)

• Spines join fabric multicast groups (GIPos)

• IPNs receive IGMP Joins

• IPNs send PIM joins to RP

• PIM Bidir is used, so there is no (S,G) state

What is a GIPo?

• Multicast group allocated per-VRF and per-BD.

• Used for all flooded traffic

a-apic1# moquery -c fvBD -f 'fv.BD.name=="bd-L3-1"'
# fv.BD
name : bd-L3-1
bcastP : 225.0.80.64
dn : uni/tn-CiscoLive2020/BD-bd-L3-1
ipLearning : yes
multiDstPktAct : bd-flood
unicastRoute : yes
unkMacUcastAct : proxy
unkMcastAct : flood
v6unkMcastAct : flood
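When many BDs are involved, the GIPo can be pulled out of saved moquery text with a few lines of Python. This is a sketch only: it assumes the output has already been captured to a string, that each object starts with a `# class` line, and that attributes appear as `key : value`. The function name is hypothetical.

```python
from __future__ import annotations

# Sample text in the shape of the moquery output shown in this session.
SAMPLE = """
# fv.BD
name : bd-L3-1
bcastP : 225.0.80.64
dn : uni/tn-CiscoLive2020/BD-bd-L3-1
unkMacUcastAct : proxy
"""


def parse_moquery(text: str) -> list[dict]:
    """Split saved moquery text into one dict per object."""
    objects: list[dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # A "# class" line starts a new object.
            objects.append({"class": line.lstrip("# ").strip()})
        elif " : " in line and objects:
            key, value = line.split(" : ", 1)
            objects[-1][key.strip()] = value.strip()
    return objects


for bd in parse_moquery(SAMPLE):
    print(bd["name"], "uses GIPo", bd["bcastP"])
```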

IPN Multicast Control-plane

[Diagram: Pod1 and Pod2 spines connect through the IPN, where the RP lives. BD GIPo example: 225.0.80.64]

Spine sends IGMP Join for GIPo

IPN Routers Send PIM Join to RP

IPN Multicast Dataplane

[Diagram: Pod1 and Pod2 spines connect through the IPN, where the RP lives. BD GIPo example: 225.0.80.64]

All Multicast Dataplane Goes Through RP

IPN Multicast Control-plane

Only one spine in each pod joins each group

[Diagram: Pod1 and Pod2 each send an IGMP Join to their IPN router. BD GIPo example: 225.0.80.64]

a-spine1# show ip igmp gipo joins
GIPo list as read from IGMP-IF group-linked list
------------------------------------------------
225.0.80.64 0.0.0.0 Join Eth1/25.25 95 Enabled

Spine1 Joins for Pod1

IPN Multicast Control-plane

Only one spine in each pod joins each group

[Diagram: Pod1 and Pod2 each send an IGMP Join to their IPN router. BD GIPo example: 225.0.80.64]

IPN1# show ip mroute 225.0.80.64 vrf IPN
IP Multicast Routing Table for VRF "IPN"
(*, 225.0.80.64/32), bidir, uptime: 13:00:48, igmp ip pim
Incoming interface: loopback1, RPF nbr: 192.168.100.1
Outgoing interface list: (count: 3)
Ethernet8/2, uptime: 01:34:42, pim
loopback1, uptime: 13:00:48, pim, (RPF)
Ethernet1/1.4, uptime: 13:00:48, igmp

IPN1# show ip igmp groups 225.0.80.64 vrf IPN
Type: S - Static, D - Dynamic, L - Local, T - SSM Translated
Group Address Interface Uptime Expires Last Reporter
225.0.80.64 Ethernet1/1.4 13:02:14 00:04:02 192.168.1.0

IPN Multicast Control-plane

RPF for all IPNs must point to same RP

[Diagram: Pod1 and Pod2 send IGMP Joins for BD GIPo 225.0.80.64 into the IPN (IPN1, IPN2, IPN3); the RPF arrow points toward the RP]

IPN3# show ip mroute 225.0.80.64 vrf IPN
IP Multicast Routing Table for VRF "IPN"
(*, 225.0.80.64/32), bidir, uptime: 01:34:35, igmp ip pim
Incoming interface: Ethernet8/25, RPF nbr: 10.255.0.0
Outgoing interface list: (count: 2)
Ethernet8/25, uptime: 01:34:35, pim, (RPF)
Ethernet1/17.4, uptime: 01:34:35, igmp

IPN3# show ip pim rp 225.0.80.64 vrf IPN
PIM RP Information for group 225.0.80.64 in VRF "IPN"
RP: 192.168.100.1, (1)

IPN3# show ip route 192.168.100.1 vrf IPN
192.168.100.0/30, ubest/mbest: 1/0
*via 10.255.0.0, Eth8/25, [110/5], 13:01:42, ospf-IPN, intra

RPF must not point to ACI

IPN Multicast Control-plane

Phantom RP

• Bidir PIM doesn’t support multiple RPs

• Phantom RP is the only means of RP redundancy

• Works by advertising varied prefix lengths for the RP subnet

• Failover handled via IGP

• Loopback must be OSPF P2P network type

• Exact RP address must not exist anywhere

IPN Multicast Control-plane

Phantom RP Load-Balancing

Requirement: Each multicast group must have only one RP

To use multiple RPs…

Break mcast group range into smaller groups

225.0.0.0/8 becomes

225.0.0.0/9 – RP 192.168.255.1

225.128.0.0/9 – RP 192.168.255.2
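The range split can be checked with Python's ipaddress module. This is a minimal sketch using the example range and RP addresses from this slide; the function name is illustrative.

```python
import ipaddress

# Example values from the slide: one /8 GIPo range split across two phantom RPs.
FABRIC_RANGE = ipaddress.ip_network("225.0.0.0/8")
RP_ADDRESSES = ["192.168.255.1", "192.168.255.2"]

# subnets(prefixlen_diff=1) yields the two /9 halves in ascending order.
RP_BY_RANGE = dict(zip(FABRIC_RANGE.subnets(prefixlen_diff=1), RP_ADDRESSES))


def rp_for_group(group: str) -> str:
    """Return the single RP responsible for a multicast group."""
    address = ipaddress.ip_address(group)
    for network, rp in RP_BY_RANGE.items():
        if address in network:
            return rp
    raise LookupError(f"{group} is outside {FABRIC_RANGE}")


for network, rp in RP_BY_RANGE.items():
    print(network, "->", rp)
print("225.0.80.64 ->", rp_for_group("225.0.80.64"))
```

Every group still maps to exactly one RP, which is the requirement stated above; only the set of groups per RP shrinks.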

IPN Multicast Control-plane

Phantom RP

[Diagram: IPN1, IPN2, IPN3, IPN4]

RP Addr - 192.168.255.1

IPN1# show run int lo1
interface loopback1
ip address 192.168.255.2/27
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN2# show run int lo1
interface loopback1
ip address 192.168.255.2/29
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN3# show run int lo1
interface loopback1
ip address 192.168.255.2/28
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN4# show run int lo1
interface loopback1
ip address 192.168.255.2/30
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN4 is RP due to Longest Prefix
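The longest-prefix selection and the IGP failover order can be reproduced offline from the loopback configuration. This sketch uses the four loopback addresses above; the function name and the `down` parameter are illustrative assumptions, and the exact-address check mirrors the rule that the RP address itself must not be configured anywhere.

```python
import ipaddress

RP_ADDRESS = ipaddress.ip_address("192.168.255.1")

# Loopback1 addresses from the four IPN routers on this slide.
LOOPBACKS = {
    "IPN1": "192.168.255.2/27",
    "IPN2": "192.168.255.2/29",
    "IPN3": "192.168.255.2/28",
    "IPN4": "192.168.255.2/30",
}


def active_rp(loopbacks, rp=RP_ADDRESS, down=()):
    """Return the router whose advertised subnet is the longest match for the RP."""
    candidates = {}
    for router, cidr in loopbacks.items():
        interface = ipaddress.ip_interface(cidr)
        if interface.ip == rp:
            # Phantom RP breaks if any router owns the exact RP address.
            raise ValueError(f"{router} is configured with the exact RP address {rp}")
        if router not in down and rp in interface.network:
            candidates[router] = interface.network.prefixlen
    if not candidates:
        raise LookupError("no router advertises a subnet covering the RP")
    return max(candidates, key=candidates.get)


print(active_rp(LOOPBACKS))                  # longest prefix wins
print(active_rp(LOOPBACKS, down=("IPN4",)))  # next-longest prefix takes over
```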

Common Multicast Problems

Issue #1: RP Address Exists on Multiple Routers

[Diagram: IPN1, IPN2]

IPN1# show run int lo1
interface loopback1
ip address 192.168.255.1/29
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN2# show run int lo1
interface loopback1
ip address 192.168.255.1/30
ip ospf network point-to-point
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

RP Addr - 192.168.255.1

IPN1# show ip route 192.168.255.1 vrf IPN
IP Route Table for VRF "IPN"
192.168.100.1/32, ubest/mbest: 1/0, attached
*via 192.168.100.1, Lo1, [0/0], 21:01:48, local

• IPN Routers see local /32 for RP address

• All Routers think they are RP

Exact RP address can’t exist with Phantom RP

Common Multicast Problems

Issue #2: RP Loopback not OSPF P2P Network

[Diagram: IPN1, IPN2]

IPN1# show run int lo1
interface loopback1
ip address 192.168.255.2/29
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

IPN2# show run int lo1
interface loopback1
ip address 192.168.255.2/30
ip router ospf IPN area 0.0.0.0
ip pim sparse-mode

RP Addr - 192.168.255.1

IPN1# show ip route 192.168.255.1 vrf IPN
IP Route Table for VRF "IPN"
192.168.100.0/29, ubest/mbest: 1/0, attached
*via 192.168.100.2, Lo1, [0/0], 21:15:36, direct

Where is the /30 from IPN2?

Common Multicast Problems

Issue #2: RP Loopback not OSPF P2P Network

• Loopbacks advertise /32 by default

Without P2P Network Type (/32 Advertised)

IPN1# show ip ospf database router 1.1.1.1 detail vrf IPN
Link connected to: a Stub Network
(Link ID) Network/Subnet Number: 192.168.255.2
(Link Data) Network Mask: 255.255.255.255
Number of TOS metrics: 0
TOS 0 Metric: 1

With P2P Network Type (/29 Advertised)

IPN1# show ip ospf database router 1.1.1.1 detail vrf IPN
Link connected to: a Stub Network
(Link ID) Network/Subnet Number: 192.168.255.0
(Link Data) Network Mask: 255.255.255.248
Number of TOS metrics: 0
TOS 0 Metric: 1

Configure "ip ospf network point-to-point" on RP loopbacks

Common Multicast Problems

Issue #3: RPF Points to ACI

[Diagram: Pod1 and Pod2 spines connect to the IPN (IPN1, IPN2, IPN3), with the RP inside the IPN. Low Speed Link: Cost 10. High Speed Link: Cost 1]

IPN3# show ip mroute 225.0.80.64 vrf IPN
(*, 225.0.80.64/32), bidir, uptime: 00:00:26, igmp ip pim
Incoming interface: Ethernet1/1.4, RPF nbr: 192.168.1.0
Outgoing interface list: (count: 2)
Ethernet1/1.4, uptime: 00:00:26, igmp, (RPF)

Common Multicast Problems

Issue #3: RPF Points to ACI

• Spines don’t run PIM, so a spine is not a valid RPF neighbor

• Applicable when a single spine is connected to multiple IPNs

IPN3# show ip mroute 225.0.80.64 vrf IPN
(*, 225.0.80.64/32), bidir, uptime: 00:00:26, igmp ip pim
Incoming interface: Ethernet1/1.4, RPF nbr: 192.168.1.0
Outgoing interface list: (count: 2)
Ethernet1/1.4, uptime: 00:00:26, igmp, (RPF)

IPN1# show ip pim int eth1/1.4 brief vrf IPN
PIM Interface Status for VRF "IPN"
Interface IP Address PIM DR Address Neighbor Count
Ethernet1/1.4 192.168.1.1 192.168.1.1 0

RPF is Eth1/1.4 (ACI)

No PIM Neighbor on that Link

Common Multicast Problems

Issue #3: RPF Points to ACI

[Diagram: Pod1 and Pod2 spines connect to the IPN (IPN1, IPN2, IPN3), with the RP inside the IPN. Low Speed Link: Cost 10. High Speed Link: Cost 1]

Make IPN-IPN links have equal or better OSPF Cost

Troubleshooting External Routed Communication

External Routed L3out Control-Plane

Almost the same as traditional L3outs

• External routes are redistributed into fabric BGP

• Each pod is a BGP VPNv4 route-reflector cluster

• Spines reflect external routes across pods

• Internal leafs import VPNv4 routes

• Internal leafs see the border leaf as next hop

External Routed L3out Control-Plane

How do internal Leafs learn external routes?

[Diagram: Pod1 (leafs, spines) and Pod2 (leaf, spine); external route 10.13.13.13/32]

1. Border Leaf Learns External Routes

2. Border Leaf Exports into BGP and sends to Spines

3. Spines Reflect VPNv4 Paths between Pods

4. Internal Leafs Import Routes from BGP

5. Internal Leaf installs tunnel to Border Leaf based on BGP Next Hop

External Routed L3out Control-Plane

External Route on Internal Leaf

a-leaf101# show bgp ipv4 unicast 10.13.13.13/32 vrf CiscoLive2020:vrf1
Advertised path-id 1, VPN AF advertised path-id 1
Path type: internal adv path ref 2, path is valid, is best path
Imported from 10.0.200.67:10:13.13.13.13/32
AS-Path: NONE, path sourced internal to AS
10.0.200.67 (metric 64) from 10.0.64.64 (192.168.1.102)
Origin incomplete, MED 5, localpref 100, weight 0
Received label 0
Received path-id 2
Extcommunity:
RT:65000:2818051
COST:pre-bestpath:165:2415919104
VNID:2818051
COST:pre-bestpath:162:110
Originator: 10.0.200.67 Cluster list: 192.168.1.102 192.168.2.254

Next-hop, this is the Border Leaf TEP

Imported Route-target

Vxlan Vnid used for traffic using this route

Source route AD is 110 (must be OSPF)

BGP Route-Reflector Cluster-list, one for each pod

External Routed L3out Control-Plane

Tunnel Built by BGP on Internal Leaf

a-leaf101# show ip route 10.13.13.13 vrf CiscoLive2020:vrf1
IP Route Table for VRF "CiscoLive2020:vrf1"
'*' denotes best ucast next-hop
10.13.13.13/32, ubest/mbest: 1/0
*via 10.0.200.67%overlay-1, [200/5], 1d00h, bgp-65000, internal, tag 65000
recursive next hop: 10.0.200.67/32%overlay-1

External Route Learned via BGP

a-leaf101# moquery -c tunnelIf -f 'tunnel.If.dest*"10.0.200.67"'
# tunnel.If
id : tunnel47
dest : 10.0.200.67
idRequestorDn : sys/bgp/inst/dom-overlay-1/db-dtep/dtep-[10.0.200.67]
type : fabric-ext,physical
vrfName : overlay-1

Tunnel Created by BGP

*Note: the tunnel could be created by an unrelated EP learn first

a-leaf101# vsh
a-leaf101# show bgp internal event-history objstore | grep a00c843
2019 Apr 2 21:12:30 bgp 65000 [58156]: TID 58302: (0) OBJ: bgp_dtep_add: tep=a00c843

Initial BGP Tunnel Creation

Dest IP in hex
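The grep pattern in the event-history command is the TEP address written as a hexadecimal integer without leading zeros. A short sketch of the conversion in both directions (the function names are illustrative):

```python
import ipaddress


def tep_to_hex(ip: str) -> str:
    """Render an IPv4 TEP address the way it appears in the tep= field."""
    return format(int(ipaddress.IPv4Address(ip)), "x")


def hex_to_tep(value: str) -> str:
    """Convert a hex tep= value back to dotted-decimal."""
    return str(ipaddress.IPv4Address(int(value, 16)))


print(tep_to_hex("10.0.200.67"))
print(hex_to_tep("a00c843"))
```

Each octet maps to two hex digits (10 = 0a, 0 = 00, 200 = c8, 67 = 43); the leading zero of the first octet is dropped in the log, which is why the pattern is seven characters long.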

External Routed L3out Control-Plane

How do Border Leafs forward to internal Leafs?

• Exactly the same as non-multipod…

• Bridge Domain static route is pushed to the Border Leaf by the contract

• Border Leaf redistributes it (if configured) into the external protocol

• External > Internal traffic hits an EP learn or the BD static (proxy) route

External Routed L3out Control-Plane

How do Border Leafs forward to internal Leafs?

[Diagram: Pod1 (leafs, spines) and Pod2 (leaf, spine); EPG A with BD Subnet 10.1.1.0/24 has a contract with the L3Out]

a-leaf205# show ip route 10.1.1.1 vrf CiscoLive2020:vrf1
IP Route Table for VRF "CiscoLive2020:vrf1"
'*' denotes best ucast next-hop
10.1.1.0/24, ubest/mbest: 1/0, attached, direct, pervasive
*via 10.0.120.34%overlay-1, [1/0], 22:14:10, static
recursive next hop: 10.0.120.34/32%overlay-1

Troubleshooting TIP

When Troubleshooting Layer 3 Flows Always…

1) Check if there is an Endpoint Learn

If not then…

2) Check if there is a BD (pervasive) static route

If not then…

3) Check if there is an External Route
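The three checks can be written down as a simple ordered lookup. This is a sketch of the troubleshooting order only, not of the hardware forwarding pipeline; the function name and the input tables are hypothetical stand-ins for what you would collect from the leaf (endpoint table, pervasive routes, external routes).

```python
import ipaddress


def classify_l3_destination(dst, endpoints, pervasive_routes, external_routes):
    """Report which of the three checks, in order, matches the destination IP."""
    address = ipaddress.ip_address(dst)
    if dst in endpoints:
        return "endpoint learn"
    if any(address in ipaddress.ip_network(prefix) for prefix in pervasive_routes):
        return "BD pervasive static route"
    if any(address in ipaddress.ip_network(prefix) for prefix in external_routes):
        return "external route"
    return "no match"


# Illustrative tables built from addresses used in this session.
endpoints = {"172.16.2.2"}
pervasive = ["10.1.1.0/24"]
external = ["10.13.13.13/32"]

for destination in ("172.16.2.2", "10.1.1.1", "10.13.13.13", "192.0.2.1"):
    print(destination, "->", classify_l3_destination(destination, endpoints, pervasive, external))
```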

Using Ftriage to Troubleshoot Multipod (14.2+)

L3Out Scenario

a-apic1# ftriage route -ii LEAF:101,103 -dip 10.13.13.13 -sip 172.16.1.1
ftriage: main:839 L3 packet Seen on a-leaf103 Ingress: Eth1/30 (Po15) Egress: Eth1/50 Vnid: 2588674
ftriage: main:242 ingress encap string vlan-1062
ftriage: main:301 Ingress Ctx: jy:vrf11
ftriage: nxos:1404 a-leaf103: nxos matching rule id:4572 scope:63 filter:65535
ftriage: main:933 SIP 172.16.1.1 DIP 10.13.13.13
ftriage: unicast:1058 a-leaf103: Dst EP is a WAN EP
ftriage: unicast:1070 a-leaf103: Policy enforcement mode is ingress
ftriage: unicast:1215 a-leaf103: Dst EP is remote
ftriage: misc:657 a-leaf103: RwDMAC DIPo(10.0.200.67) is one of dst TEPs ['10.0.200.67']
ftriage: main:839 L3 packet Seen on a-spine2 Ingress: Eth1/27 Egress: Eth1/31 Vnid: 2588674
ftriage: main:839 L3 packet Seen on a-spine3 Ingress: Eth1/29 Egress: LC-1/3 FC-26/0 Port-1 Vnid: 2588674
ftriage: main:839 L3 packet Seen on a-leaf205 Ingress: Eth1/53 Egress: Eth1/31 Vnid: Null
ftriage: pktrec:490 a-leaf205: Collecting transient losses snapshot for LC module: 1
ftriage: fib:169 a-leaf205: L3 out interface Ethernet1/31
ftriage: main:522 Computed egress encap string vlan-1055
ftriage: main:313 Building egress BD(s), Ctx
ftriage: main:331 Egress Ctx jy:vrf11
ftriage: main:332 Egress BD(s): jy:vrf11:l3out-bgp:vlan-1055

Look for routed flow ingressing 101 or 103

Frame seen on leaf103

Dst is behind L3out

Sends to this TEP (leaf 205)

Arrives on 205 and forwards out l3out bgp on vlan 1055

Common Multipod L3out Problems

Issue #1: Asymmetric Routing with Active/Active Pods

[Diagram: Pod1 and Pod2, each with a spine and two leafs; BD1 10.1.1.0/24 is present in both pods; EP1 is 10.1.1.1/24]

• Both Pods advertise same BD Subnet

• External Device performs stateful inspection

• Return traffic ingresses different pod than where EP exists

• Traffic dropped

EP sends outbound traffic

Return traffic goes to Pod2 and is dropped by FW

Common Multipod L3out Problems

Issue #1: Asymmetric Routing with Active/Active Pods

[Diagram: Pod1 and Pod2, each with a spine and two leafs; BD1 10.1.1.0/24 is present in both pods; EP1 is 10.1.1.1/24]

• Pods advertise local /32 EP information

• Requires GOLF or the Host Based Routing feature (HBR in 4.0)

Pod 1 Advertises 10.1.1.1/32

Return traffic goes to Pod 1

Implement Host Route Advertisement

Common Multipod L3out Problems

Issue #2: Stretched L3out VIP Failover

[Diagram: Pod1 and Pod2, each with a spine and two leafs; VIP 10.2.2.2 was active behind Pod 1 and is now standby there; Pod 2 becomes active for VIP 10.2.2.2]

Pod 2 Becomes Active

New Active FW Sends GARP

IPN Forwards GARP

Pod 1 Leafs don’t see GARP, still think local FW is active

Two Common Problems Here

• Same encap vlan not deployed in each pod – breaks flooded traffic

• IPN isn’t routing multicast properly

Common Multipod L3out Problems

Issue #2: Stretched L3out VIP Failover

Which VNID and GIPo should the l3out use?

a-apic1# moquery -c fvIfConn -f 'fv.IfConn.dn*"uni/tn-CiscoLive2020/out-EIGRP/"'
Total Objects shown: 2

# fv.IfConn
bcastP : 225.1.188.208
dn : uni/epp/rtd-[uni/tn-CiscoLive2020/out-EIGRP/instP-defaultNet]/node-101/stpathatt-[shared-5596-A-VPC]/conndef/conn-[vlan-1052]-[52.52.52.101/24]
extEncap : vxlan-15466402

# fv.IfConn
bcastP : 225.1.188.208
dn : uni/epp/rtd-[uni/tn-CiscoLive2020/out-EIGRP/instP-defaultNet]/node-205/stpathatt-[shared-5596-A-VPC]/conndef/conn-[vlan-1052]-[52.52.52.103/24]
extEncap : vxlan-15466402

"EIGRP" is the name of the L3out

The same VNID and GIPo are extended to nodes 101 and 205

Common Multipod L3out Problems

Issue #2: Stretched L3out VIP Failover

If there’s a problem check these things…

• Ensure an SVI is used for the l3out (no flooding for routed interfaces)

• Ensure the same vlan encap is used in each pod

• Ensure the IPN agrees on the tree for the GIPo

• Ensure a GARP is sent by external router

• Check if the GARP is sent with COS 6 (more on this later)

Quality of Service

ACI QoS Overview

Key Points

• Fabric QoS is based on COS and DEI bits in outer L2 header

• Incoming COS on non-fabric ports not preserved through fabric but…

• Incoming COS is written into outer DSCP so it can be preserved on egress

• Traffic is level 3 by default (COS 0 + DEI 0)

• COS 6 traffic from IPN may be mistreated (pre-4.0)

ACI QoS Overview

COS | Function | Notes
3,4,5 | APIC, SPAN, Control Plane | SPAN=low Priority
6 | iTraceroute | Reserved, sent to CPU; actual iTraceroute is dscp 6
2 | Level 1 | User Traffic; 1 Priority Class
1 | Level 2 | User Traffic; 1 Priority Class
0 | Level 3 (Default) | User Traffic; 1 Priority Class
2+DEI | Level 4 | New in 4.0! User Traffic; 5 Priority Classes
3+DEI | Level 5 | New in 4.0! User Traffic; 5 Priority Classes
5+DEI | Level 6 | New in 4.0! User Traffic; 5 Priority Classes

Used for tracing flows within the fabric.

Reserved for CPU generated traffic

[Diagram: packet layout. Outer header: DMAC, SMAC, 802.1Q (Fabric QoS), SIP, DIP, Proto, UDP. iVXLAN header: flags, EPG, VNID. Inner header: DMAC, SMAC, ethtype, SIP, DIP, Proto, L4/Payload]

ACI QoS Overview

Where is QoS Behavior Configured?

Configuration | Where is it Configured? | Function
Dot1p Preserve | Global Access Policies | Causes egress leaf to rewrite cos to original value when forwarding
QoS Class | App EPG, Contract, Subject | Defines prioritization of traffic through the fabric
Custom QoS | App EPG | Re-marks traffic based on incoming COS or DSCP
Target DSCP (L3out) | L3out, Contract, Subject | Sets the DSCP value
DSCP Class-Cos Translation Policy | Infra > Networking > Protocols | Spines re-map QoS of traffic going to and coming from IPN/ISN

ACI QoS – Preserve COS

Egress leaf rewrites COS based on DSCP

ACI Forwarding and QoS – Preserve COS

[Diagram: packet layout. Outer header: DMAC, SMAC, 802.1Q, SIP, DIP, DSCP. iVXLAN header: flags, EPG, VNID. Inner header: DMAC, SMAC, 802.1Q, SIP, DIP, Proto, L4/Payload]

Layer 2 COS encoded into most significant 3 bits of DSCP

Outer COS Value matches the Level (Contract/EPG)

Note: Incoming COS and DSCP are not used unless a custom QoS policy is configured

Configure Dot1p Preserve! The egress leaf will look at the 3 MSB bits of the DSCP value to know which COS value to use for packet rewrite
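The COS-to-DSCP relationship is a 3-bit shift. The sketch below assumes the low three DSCP bits are zero when the COS is written, which is what makes COS 6 appear as DSCP 48 in the next slides; the function names are illustrative.

```python
def cos_to_dscp(cos: int) -> int:
    """Place a 3-bit COS value in the three most significant bits of a 6-bit DSCP."""
    if not 0 <= cos <= 7:
        raise ValueError("COS must be 0-7")
    return cos << 3  # assumption: low three DSCP bits are left at zero


def dscp_to_cos(dscp: int) -> int:
    """Recover the COS value from the three most significant DSCP bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be 0-63")
    return dscp >> 3


for cos in range(8):
    dscp = cos_to_dscp(cos)
    print(f"COS {cos} -> DSCP {dscp:2d} (0b{dscp:06b}) -> COS {dscp_to_cos(dscp)}")
```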

Pre-4.0 COS 6 Problem

1. Data Center 2: Frame with COS 6 set

2. Leaf forwards frame towards DC1 with COS 0 and an outer DSCP of 48 (0b110 000)

3. Datacenter interconnect (IPN, ISN): IP packet with DSCP 48

4. Last hop IPN router writes COS based on DSCP … DSCP 48 = COS 6

5. Data Center 1: DC1 treats packet as iTraceroute

Fix? Configure "DSCP class-cos translation policy for L3 traffic"

The spine will map the outer COS value to a new DSCP class on egress and map DSCP to COS on ingress

DSCP – COS Translation Policy

✓ COS 6 Problem solved by using DSCP – COS Translation Policy

Incoming traffic from IPN now classified on DSCP and not COS

Note: DSCP-COS Translation Policy Will Negate Dot1p Preserve

After 4.0 Software…

• All devices trust DSCP markings set on ingress leaf

• QoS class is derived from DSCP

• Spine rewrites COS received from IPN based on DSCP

• Traceroute is DSCP 6 so COS 6 + DSCP 48 is forwarded normally

Pre-4.0: COS6 + DSCP 48 arriving at the spine from the datacenter interconnect (IPN, ISN) is treated as Traceroute, not forwarded on egress leaf due to COS6

After 4.0: COS6 + DSCP 48 arriving at the spine from the datacenter interconnect (IPN, ISN) is placed in whichever class DSCP 48 maps to

✓ DSCP – COS Translation Policy Not Required

QoS CLIs

show queueing interface ethernetx/y | See per class Interface Stats
show queueing interface ethernetx/y detail | Same as above + control and policy classes
show system internal qos classes | Check queueing behavior per class
moquery -c qospInfraDscpMap | Check COS to DSCP translation policy on spine
moquery -c qosInstPol | Verify if dot1P preserve is enabled (apic or switch)
