Layer 3 Forwarding Modes on Cisco Nexus 9k and 3k Platform Switches
White Paper
Cisco Public

Layer 3 Forwarding Modes on Cisco Nexus 9500, 9300, 3164, and 3200 Platform Switches

Authors
Ambrish Mehta, Cisco Systems Inc.
Swaminathan Narayanan, Cisco Systems Inc.

© 2016 Cisco and/or its affiliates. All rights reserved.

What You Will Learn

This document takes a closer look at the Layer 3 routing and forwarding modes available on Cisco Nexus® 9500, 9300, 3164, and 3200 platform data center switches. Its goal is to help network architects and engineers select the appropriate Layer 3 routing and forwarding mode for their current deployments and for future needs. It assumes that the reader has a high-level understanding of these switches and of routing concepts.

Background

Today’s data center network needs to be agile to meet the increasing demands of the new workloads constantly being brought online. With the rapid increase in the number of IPv6-enabled applications, organizations need larger hardware tables to accommodate IPv6 prefixes. Increasing deployment of overlay networking technologies and virtualization also increases the need for larger hardware tables for IPv4 and IPv6 host prefixes.

The dynamic nature of data center networks and the need for agility directly dictate new Layer 3 route scalability requirements. The scalability requirements for host and nonhost (longest-prefix match [LPM]) routes differ from one network design to another. Dual-stack networks make the prefix-scale requirements more complex, making it difficult to find a single solution that can meet all needs.

This document discusses the routing and forwarding options that are available on the Broadcom Trident II and Tomahawk ASICs.
For the routing offerings of Cisco’s next-generation application-specific integrated circuit (ASIC), see http://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-736863.html.

Layer 3 Forwarding Architecture Designs

Layer 3 forwarding architecture for the Cisco Nexus 9500, 9300, 3164, and 3200 platforms is categorized into the following designs:

• Modular multichip design
-- Modular chassis (Cisco Nexus 9500 platform switches) belong in this category.
-- Line cards and fabric modules operate independently.
-- Depending on the line-card type and chassis form factor, a different number of network forwarding engines (NFEs) is present in the line card and fabric module.
• Fixed multichip design
-- The Cisco Nexus 3164Q Switch belongs in this category.
-- Virtual and pseudo line cards and fabric modules operate independently.
• Fixed single-chip design
-- The Cisco Nexus 9300 platform and the 3232C and 3264Q switches belong in this category.
-- The single chip makes all forwarding decisions.

Prefix blocks across the network can be logically categorized into the following types:

• IPv4 host
• IPv4 LPM (/31 to /0)
• IPv6 host
• IPv6 LPM (/127 to /0)

NFE and NFE2 Forwarding Architecture

The network forwarding engines, NFE (Broadcom Trident II ASIC) and NFE2 (Broadcom Tomahawk ASIC), have a unified forwarding table (UFT). Some banks are dedicated to the Layer 2 MAC address table and Layer 3 host routes (IPv4 and IPv6), and some banks can be configured for Layer 2 MAC addresses, Layer 3 hosts, or LPM, depending on deployment needs. The configuration parameters are set by the application, in this case Cisco® NX-OS Software, when the ASIC is initialized.
Different Layer 3 routing and forwarding modes make use of this configurable ternary content-addressable memory (TCAM) space to address different routing requirements.

Tables 1 and 2 summarize the Layer 2, Layer 3 host, and LPM scale parameters for the various UFT modes for NFE and NFE2. From the perspective of Layer 3 data forwarding, the functions and behavior of NFE and NFE2 are identical, with the only difference being the scale supported. Throughout this document, NFE and NFE2 both refer to the forwarding ASIC and are used interchangeably.

Table 1. NFE Supported Scale for UFT Modes

Mode    Layer 2 (MAC)    Layer 3 Host**    LPM**
0       288,000          16,000            16,000
1       224,000          56,000            16,000
2       160,000          144,000           16,000
3       96,000           208,000           16,000
4       32,000           16,000            128,000

Table 2. NFE2 Supported Scale for UFT Modes

Mode    Layer 2 (MAC)    Layer 3 Host**    LPM**
0       144,000          8000              16,000
1       104,000          40,000            16,000
2       72,000           72,000            16,000
3       40,000           104,000           16,000
4       8000             8000              128,000

Layer 3 Forwarding Modes: In-Depth Look

The previous section discussed the UFT modes and associated scale for Layer 2 (MAC address table), Layer 3 host, and LPM. This section examines various Layer 3 routing and forwarding modes in detail. Each Layer 3 forwarding mode has a unique value proposition, and NX-OS has been enhanced to address various deployment needs.

Routing and Forwarding Modes for Cisco Nexus 9500 Platform and 3164Q Switch

This section describes the various routing and forwarding modes supported on modular multichip and fixed multichip architecture, which includes the Cisco Nexus 9500 platform and 3164Q Switch.

The Cisco Nexus 9500 platform and 3164Q Switch use multichip architecture with NFE at two levels: the line card and the fabric module. NFE provides the forwarding function by performing a lookup and deriving the egress port.
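As a reading aid for Tables 1 and 2, the following Python sketch encodes the UFT scale values and lists which UFT modes could hold a given mix of MAC, host, and LPM entries. The names UftMode, UFT_MODES, and modes_that_fit are hypothetical and exist only for this illustration. NX-OS sets the UFT mode itself as part of the configured routing mode, so this is not a switch CLI or API, and it treats the three limits as independent, which is a simplification.

```python
from typing import List, NamedTuple


class UftMode(NamedTuple):
    mode: int
    l2_mac: int   # Layer 2 MAC entries
    l3_host: int  # Layer 3 host entries, counted as IPv4 entries (table note **)
    lpm: int      # LPM entries, counted as IPv4 entries (table note **)


# Values transcribed from Table 1 (NFE) and Table 2 (NFE2).
UFT_MODES = {
    "NFE": [
        UftMode(0, 288_000, 16_000, 16_000),
        UftMode(1, 224_000, 56_000, 16_000),
        UftMode(2, 160_000, 144_000, 16_000),
        UftMode(3, 96_000, 208_000, 16_000),
        UftMode(4, 32_000, 16_000, 128_000),
    ],
    "NFE2": [
        UftMode(0, 144_000, 8_000, 16_000),
        UftMode(1, 104_000, 40_000, 16_000),
        UftMode(2, 72_000, 72_000, 16_000),
        UftMode(3, 40_000, 104_000, 16_000),
        UftMode(4, 8_000, 8_000, 128_000),
    ],
}


def modes_that_fit(asic: str, l2_mac: int, l3_host: int, lpm: int) -> List[int]:
    """Return the UFT modes whose three limits all cover the requested counts."""
    return [
        m.mode
        for m in UFT_MODES[asic]
        if m.l2_mac >= l2_mac and m.l3_host >= l3_host and m.lpm >= lpm
    ]


if __name__ == "__main__":
    # Hypothetical requirement: 50,000 MACs, 100,000 hosts, 10,000 LPM routes.
    print(modes_that_fit("NFE", l2_mac=50_000, l3_host=100_000, lpm=10_000))
```

An empty result means that no single UFT mode on that ASIC covers the requested mix, which is the situation the multichip hierarchical modes are meant to address.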
Two modes of operation are available:

• Hierarchical routing: In this mode, the line-card NFE handles forwarding decisions for some prefix blocks, and the fabric-module NFE handles decisions for the remaining prefix blocks.
-- Line-card lookup: If the destination address belongs to a line-card prefix block, the forwarding decision is made at the line card, and the packet is sent directly to the egress port. For packets that travel from one line card to another, the fabric module performs only the switching.
-- Fabric-module lookup: If the destination address does not belong to a line-card prefix block, a catch-all entry at the line card matches the destination, and the packet is sent to the fabric-module NFE for forwarding assistance (another lookup). For such packets, the fabric-module NFE derives the egress port and forwards the packet.
Hierarchical routing provides greater routing scale (more than a single chip can handle) because the line-card and fabric-module NFEs both handle forwarding.
• Nonhierarchical routing: In this mode, all Layer 3 unicast forwarding decisions are handled at the line-card NFE, and the fabric module acts only as the switching entity between line cards. The scale is limited to that of a single chip because the fabric module is not used for forwarding lookups.

Default Hierarchical Routing Mode

Hierarchical routing mode is the default mode for the Cisco Nexus 9500 platform and 3164Q Switch. In this mode:

• All host routes (/32 IPv4 and /128 IPv6) are stored in the line-card NFE. Along with these host routes, one catch-all route is programmed on the line card. Data traffic that does not match a host entry hits this catch-all route and is sent to the fabric module for an additional lookup.
• All LPM prefixes (for IPv4 and IPv6) are stored in fabric-module NFEs. Depending on the modular chassis form factor, the fabric module contains a different number of NFE instances. For example, the Cisco Nexus 9504 chassis has one NFE instance, and the Cisco Nexus 9508 chassis has two NFE instances.
• This mode provides a good balance between host and LPM routes.
• Table 3 shows the prefix scale limit in this mode.

Table 3. Prefix Scale Limit in Hierarchical Routing Mode

ASIC         NFE                  NFE2
IPv4 Host    208,000 (48,000^)    104,000 (32,000^)
IPv6 Host    104,000 (48,000^)    52,000 (32,000^)
IPv4 LPM     128,000/64,000       128,000/64,000
IPv6 LPM     16,000/20,000+       10,000/15,000+

For deployments requiring more than the default IPv6 LPM prefix scale, you can use the following software configuration knob to increase the scale to 20,000 IPv6 LPM routes:

    hardware profile ipv6 alpm carve-value 3072¹, ²

As shown in Figure 1, in this mode the fabric module is configured in UFT mode 4, and the line card is configured in UFT mode 3.

Figure 1. Default Hierarchical Routing Mode for Cisco Nexus 9500 Platform and 3164Q Switch
[Figure: fabric module in UFT mode 4 holding IPv4 LPM and IPv6 LPM; line card in UFT mode 3 holding IPv4 host and IPv6 host; HiGig port channel between line card and fabric module]

Hierarchical 64-Bit Algorithmic LPM Mode

The algorithmic LPM (ALPM) routing mode is supported on the Cisco Nexus 9500 platform and 3164Q Switch. To put the switch into 64-bit ALPM mode, use the following command:

    system routing mode hierarchical max-mode l3 64-alpm¹

In this mode:

• As in the default hierarchical mode, all host routes (/32 for IPv4 and /128 for IPv6) are stored in line-card NFEs. Along with these host routes, one catch-all route is programmed on the line card.
• LPM prefixes for IPv4 (all prefix lengths) and IPv6 (prefix length of /64 or less) are stored in the fabric-module NFE. The fabric-module NFE runs in 64-bit mode, which provides high IPv6 route scale for prefix lengths of /64 or less. However, this characteristic also means that IPv6 LPM entries from /65 to /127 cannot be programmed in the fabric module.
• LPM prefixes for IPv6 with prefix lengths from /65 to /127 are therefore stored in the line-card NFE. By default, NX-OS allocates hardware resources for 1000 IPv6 (/65 to /127) entries, but this value can be adjusted using the following configuration knob:

    hardware profile ipv6 lpm-entries maximum <#>¹

• Table 4 shows the prefix scale limit in this mode.

Table 4. Prefix Scale Limit in ALPM Mode

ASIC                       NFE                  NFE2
IPv4 Host                  208,000 (48,000^)    104,000 (32,000^)
IPv6 Host                  104,000 (48,000^)    52,000 (32,000^)
IPv4 LPM                   128,000              128,000
IPv6 LPM (<=64)            80,000               80,000
IPv6 LPM (/65 to /127)     1000 to 3000*        1000 to 3000*

As shown in Figure 2, in this mode the fabric module is configured in 64-bit UFT mode 4, and the line card is configured in UFT mode 3.

Figure 2. 64-Bit ALPM Mode for Cisco Nexus 9500 Platform and 3164Q Switch
[Figure: fabric module in 64-bit UFT mode 4 holding IPv4 LPM and IPv6 /64 LPM; line card in UFT mode 3 holding IPv4 host, IPv6 host, and IPv6 /65 to /127 LPM]

Maximum Host Mode (Hierarchical)

The hierarchical maximum host routing mode is supported only on the Cisco Nexus 9500 platform. To put a switch in maximum host mode, use the following command:

    system routing max-mode host¹

In this mode:

• All IPv4 routes (host and LPM) are stored in the fabric-module NFE. The fabric module runs in UFT mode 3.
• All IPv6 routes (host and LPM) are stored in the line-card NFE. The line card runs in UFT mode 2.
• This routing mode is well suited for environments in which a large number of attached hosts are present.
• Table 5 shows the prefix scale limit in this mode.

Table 5. Prefix Scale Limit in Maximum Host Mode

ASIC                       NFE                  NFE2
IPv4 Host                  208,000 (80,000^)    104,000 (32,000^)
IPv6 Host                  104,000 (40,000^)    52,000 (32,000^)
IPv4 LPM                   16,000               16,000
IPv6 LPM (<=64)            6000                 6000
IPv6 LPM (/65 to /127)     1000 to 3000*        1000 to 3000*

As shown in Figure 3, in this mode the fabric module is configured in UFT mode 3, and the line card is configured in UFT mode 2.

Figure 3. Maximum Host Mode for Cisco Nexus 9500 Platform
[Figure: fabric module in UFT mode 3 holding IPv4 host and IPv4 LPM; line card in UFT mode 2 holding IPv6 host and IPv6 LPM]

Nonhierarchical Routing Mode

The nonhierarchical routing mode is supported on the Cisco Nexus 9500 platform and 3164Q Switch. To put the switch in nonhierarchical mode, use the following command:

    system routing mode non-hierarchical¹

In this mode:

• All host routes (/32 IPv4 and /128 IPv6) are stored in line-card NFEs.
• All LPM routes are also stored in the line-card NFE. No routes are stored in the fabric-module NFE. As a result, the LPM route scale is reduced to 16,000 IPv4 LPM entries.
• This mode is well suited for deployments in which the overall LPM route scale is less than 16,000 IPv4 LPM entries. It avoids the additional lookups that the fabric module performs for LPM prefixes in the hierarchical routing mode. It also provides consistent latency between host and LPM flows if the source and destination ports are connected to the same NFE on a given line card.
• Table 6 shows the prefix scale limit in this mode.

Table 6. Prefix Scale Limit in Nonhierarchical Routing Mode

ASIC                       NFE                  NFE2
IPv4 Host                  208,000 (48,000^)    104,000 (32,000^)
IPv6 Host                  104,000 (48,000^)    52,000 (32,000^)
IPv4 LPM                   12,000/4000          12,000/4000
IPv6 LPM (<=64)            6000/2000            6000/2000
IPv6 LPM (/65 to /127)     1000 to 3000*        1000 to 3000*

As shown in Figure 4, in this mode the line card is configured in UFT mode 3. The UFT mode on the fabric module is not relevant.

Figure 4. Nonhierarchical Mode for Cisco Nexus 9500 Platform and 3164Q Switch
[Figure: fabric module in UFT mode X; line card in UFT mode 3 holding IPv4 host, IPv4 LPM, IPv6 host, and IPv6 LPM]

Nonhierarchical ALPM Mode

Nonhierarchical ALPM routing mode is supported on the Cisco Nexus 9500 platform only. To put a switch in nonhierarchical ALPM mode, use the following command:

    system routing mode non-hierarchical max-mode l3¹

In this mode:

• All host routes (/32 for IPv4 and /128 for IPv6) are stored in line-card NFEs. However, the host route scale is reduced to 16,000 IPv4 entries.
• All LPM routes are also stored in line-card NFEs. No routes are stored in fabric-module NFEs. The LPM route scale is 128,000 IPv4 LPM entries.
• This mode is well suited for deployments in which the overall host route scale is less than 16,000 IPv4 entries. It provides the same LPM route scale as the default (hierarchical) mode, but a smaller scale for host routes.
• This mode is the same as the nonhierarchical mode but with higher LPM route scale.
• Table 7 shows the prefix scale limit in this mode.

Table 7. Prefix Scale Limit in Nonhierarchical ALPM Mode

ASIC                       NFE                  NFE2
IPv4 Host                  16,000               8000
IPv6 Host                  8000                 4000
IPv4 LPM                   128,000/64,000+      128,000/64,000+
IPv6 LPM (/0 to /127)      16,000 to 20,000+    10,000 to 15,000+

As shown in Figure 5, in this mode the line card is configured in UFT mode 4. The UFT mode on the fabric module is not relevant.

Figure 5. Nonhierarchical Maximum Layer 3 Mode for Cisco Nexus 9500 Platform
[Figure: fabric module in UFT mode X; line card in UFT mode 4 holding IPv4 host, IPv4 LPM, IPv6 host, and IPv6 LPM]
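The two-stage lookup of the default hierarchical routing mode described in this section can be sketched in Python as follows. This is a toy model under stated assumptions, not the ASIC implementation: the Nfe class, the TO_FABRIC marker, the port names, and the example prefixes are all hypothetical, and a linear scan over a dictionary stands in for the hardware tables.

```python
import ipaddress

# Hypothetical marker meaning "hand the packet to the fabric module".
TO_FABRIC = "to-fabric"


class Nfe:
    """Toy forwarding table: returns the port of the longest matching prefix."""

    def __init__(self):
        self.routes = {}  # ip_network -> egress port (or TO_FABRIC)

    def add(self, prefix, port):
        self.routes[ipaddress.ip_network(prefix)] = port

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [n for n in self.routes if n.version == ip.version and ip in n]
        if not matches:
            return None
        best = max(matches, key=lambda n: n.prefixlen)  # longest prefix wins
        return self.routes[best]


def hierarchical_lookup(line_card, fabric_module, addr):
    """Return (egress port, where the forwarding decision was made)."""
    port = line_card.lookup(addr)
    if port != TO_FABRIC:
        return port, "line card"
    # The catch-all entry matched: the fabric module performs a second lookup.
    return fabric_module.lookup(addr), "fabric module"


# Line card: host routes plus one catch-all route per address family.
line_card = Nfe()
line_card.add("10.1.1.10/32", "Eth1/1")
line_card.add("2001:db8::10/128", "Eth1/2")
line_card.add("0.0.0.0/0", TO_FABRIC)
line_card.add("::/0", TO_FABRIC)

# Fabric module: LPM prefixes only.
fabric_module = Nfe()
fabric_module.add("10.2.0.0/16", "Eth2/1")
fabric_module.add("10.2.3.0/24", "Eth2/2")
fabric_module.add("2001:db8:2::/48", "Eth2/3")

if __name__ == "__main__":
    for dst in ("10.1.1.10", "10.2.3.4", "2001:db8:2::1"):
        print(dst, hierarchical_lookup(line_card, fabric_module, dst))
```

In this model, the nonhierarchical modes correspond to loading both the host and the LPM prefixes into line_card and omitting the catch-all entries, so the second lookup never happens.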
Routing Modes for Cisco Nexus 9300, 3100, and 3200 Platform Switches

This section describes the routing modes for fixed single-chip architecture, including the routing modes supported on the Cisco Nexus 9300 platform and the 3132Q, 3232C, and 3264Q Switches.

Default Routing Mode

In the default routing mode:

• IPv4 and IPv6 (host and LPM) routes are stored in the single NFE instance present in the switch.
• The prefix scale limit in this mode is as follows:
-- Host: IPv4 up to 208,000, and IPv6 up to 104,000
-- LPM: IPv4 up to 12,000, and IPv6 up to 7000
The 7000 IPv6 prefixes consist of 6000 with a prefix length of /64 or less and 1000 with a prefix length from /65 to /127. By default, NX-OS allocates hardware resources for 1000 IPv6 (/65 to /127) entries, but you can adjust this value using the following configuration knob:

    hardware profile ipv6 lpm-entries maximum <#>¹, ³

• Table 8 shows the prefix scale limit in this mode.

Table 8. Prefix Scale Limit in Default Routing Mode

ASIC                       NFE                  NFE2
IPv4 Host                  208,000 (48,000^)    104,000 (32,000^)
IPv6 Host                  104,000              52,000
IPv4 LPM                   12,000/4000          12,000/4000
IPv6 LPM (<=64)            6000/2000            6000/2000
IPv6 LPM (/65 to /127)     1000 to 3000*        1000 to 3000*

Figure 6 shows the configuration of the default routing mode.

Figure 6. Default Routing Mode for Cisco Nexus 9300 Platform and 3132Q, 3232C, and 3264Q Switches
[Figure: single NFE in UFT mode 3 holding IPv4 host, IPv4 LPM, IPv6 host, and IPv6 LPM]

Maximum Layer 3 (ALPM) Mode

For the ALPM mode, use the following command:

    system routing max-mode l3¹

In this mode:

• IPv4 and IPv6 (host and LPM) routes are stored in a single NFE instance (configured in UFT mode 4).
• Table 9 shows the prefix scale limit in this mode.

Table 9. Prefix Scale Limit in Maximum Layer 3 (ALPM) Mode

ASIC                       NFE                  NFE2
IPv4 Host                  16,000               8000
IPv6 Host                  8000                 4000
IPv4 LPM                   128,000/64,000+      128,000/64,000+
IPv6 LPM (0 to 127)        16,000 to 20,000+    10,000 to 15,000+

Figure 7 shows the configuration of the ALPM mode.

Figure 7. Maximum Layer 3 (ALPM) Mode for Cisco Nexus 9300 Platform and 3132Q, 3232C, and 3264Q Switches
[Figure: single NFE in UFT mode 4 holding IPv4 host, IPv4 LPM, IPv6 host, and IPv6 LPM]

For More Information

For more information about the Cisco Nexus 9000 Series Switches, see the detailed product information at the product homepage at http://www.cisco.com/c/en/us/products/switches/nexus-9000-series-switches/index.html.

For more information about the Cisco Nexus 3000 Series Switches, see the detailed product information at the product homepage at http://www.cisco.com/c/en/us/products/switches/nexus-3000-series-switches/index.html.

Endnotes

1. This configuration change requires you to reboot the switch after you save the configuration.
2. The IPv4 LPM scale is restricted to 64,000 entries.
3. Increasing this allocation reduces the IPv4 LPM scale of 12,000 entries and the IPv6 LPM scale of 6000 entries for a prefix length of /64 or less.

** Number of IPv4 entries
^ Address Resolution Protocol (ARP) and Neighbor Discovery
+ “hardware profile ipv6 alpm carve-value <#>” configuration knob
* “hardware profile ipv6 lpm-entries maximum <#>” configuration knob

© 2016 Cisco and/or its affiliates. All rights reserved. Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: www.cisco.com/go/trademarks. Third-party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (1110R) C11-736548-02 05/16