STACKING FEATURES IN 8.020
Kevin Lin, Principal Engineer
Aug 2014
© 2014 Brocade Communications Systems, Inc. CONFIDENTIAL - For Internal Use Only

Legal Disclaimer
All or some of the products detailed in this presentation may still be under development and certain specifications, including but not limited to, release dates, prices, and product features, may change. The products may not function as intended and a production version of the products may never be released. Even if a production version is released, it may be materially different from the pre-release version discussed in this presentation.
Nothing in this presentation shall be deemed to create a warranty of any kind, either express or implied, statutory or otherwise, including but not limited to, any implied warranties of merchantability, fitness for a particular purpose, or non-infringement of third-party rights with respect to any products and services referenced herein.
ADX, Brocade, Brocade Assurance, Brocade One, the B-wing symbol, DCX, Fabric OS, ICX, MLX, SAN Health, VCS, and VDX are registered trademarks, and AnyIO, HyperEdge, MyBrocade, NET Health, OpenScript, and The Effortless Network are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or service names mentioned may be trademarks of their respective owners.
*** NOTE: Backdoor and hidden commands are for INTERNAL USE ONLY ***
*** Debug commands are for use under direction of Technical Support only ***

Agenda
• Stacking Introduction
• Supported platforms
• Existing features for new platforms
• Differences
• HW differences
• Config mismatch
• Architectures

Acronyms
Acronym          Description
Katara-L         ICX6430
bootup master    A unit that was a master when it reloaded.
Katara-H         ICX6450
RPC              Remote procedure call
Kataras          Katara-L and -H
IPC              Inter-processor communication
Chow             ICX6610
STK port         Stacking port
Sidewinder       ICX7750
Spatha           ICX7450
master           Active controller
standby          Standby controller
member           Member unit
clean unit       A standalone unit without startup configuration

STACKING IS SIMILAR TO A CHASSIS SYSTEM
• Stacking links multiple stackable units to form one logical unit.
• A stacking system is similar to a chassis system:
  • Active controller (master) = management module
  • Standby controller = standby management module
  • Member controller = line card
  • Stacking ports = switching fabric
• A stacking system uses a single IP to manage the entire stack.
• The master is the single point of contact for all outside applications. For example, it provides port statistics for every unit.

STACKING IS SIMILAR TO A CHASSIS SYSTEM (continued)
• All CPU-bound packets, except internal communication and sFlow, are redirected to the master CPU by HW.
• Telnet to any data port links to the master.
• A unit’s local console can “rconsole” to the master and vice versa.
• Note: Only the master’s management port is enabled. Therefore the system’s management port changes after failover/switch-over.
• Single configuration on the master. Other units are configured by the master.
• Configuration is very similar to a chassis that uses modules and ports, not units.
• The master downloads software images to all units.
• All protocols operate in the same way as on a chassis system.

DIFFERENCE FROM A CHASSIS SYSTEM
• Supports linear or ring topology for homogeneous stacking systems.
• Supports mesh in family stacking (mixing ICX6610 and ICX6450).
• Need unit ID assignment.
• Split and merge.
• Any unit can repeatedly change role among master, standby and member.
• All units have data ports.
• Cannot reload the old master after switch-over.
• Standby failover virtual machine architecture becomes more complicated because its local ports are managed by the master.
• A unit has multiple modules.
• Ports are represented by unit/module/port such as 2/1/48.

NON-OPERATIONAL MODE
• IPC version mismatch: Units cannot communicate at all. This happens when units are running different major releases such as 7300x and 7400x.
• Image mismatch: IPC version matches, but images are different such as 7400f/g, or switch/router. Autocopy downloads the correct image if available.
• Config mismatch: A member unit module is different from the master’s provisional configuration. Removing the unit config by “no unit ?” lets the master relearn the unit and put it into an operational state.
• License mismatch: The master has an advanced license and the stack is running advanced features such as BGP. A unit without a license joins.
• A mismatched unit is non-operational. The master can still communicate with the unit to download the image or reload the unit.

Supported platforms
• FCX (Marvell Cheetah 3) since 7.0
• ICX 6610 (Chow) (Marvell Cheetah 5) since 7.3
• ICX 6430 (Katara-L) (Marvell HW) since 7.4
• ICX 6450 (Katara-H) (Marvell HW) since 7.4
• Pugio does not support stacking
• ICX 7750 (Sidewinder) (Broadcom Trident-2) since 8.010 (standalone)
• ICX 7450 (Spatha) (Broadcom Firescout) since 8.020

Stacking ports: All platforms have up to 2 stack ports/trunks.
• FCX: flexible ports. See the release notes for the different FCX types.
• ICX 6610: always 2 trunks (x/2/1 to x/2/2 and x/2/6 to x/2/7). One is 40G, and the other is 4x10G (4 ports).
  40G and 4x10G cannot be data ports even when stacking is not enabled.
• ICX 6430 (Katara-L): flexible ports/trunks. Slot 2 has four 1-Gig ports.
• ICX 6450 (Katara-H): flexible ports/trunks. Slot 2 has four 10G ports.
• ICX 7750 (Sidewinder): flexible ports/trunks. Slot 2 has six 40G ports. Slot 3 has six 40G ports. Cannot mix slots 2 & 3 in ports/trunks. A trunk has 2-3 ports.
• ICX 7450 (Spatha): flexible ports. No trunk. Slot 3 has one 40G. Slot 4 has one 40G.

SAME CONFIGURATION
• Same configuration commands for new products ICX7750 and ICX7450.
• stack-port: configure 1 or 2 ports. Default: 2. ICX6610 cannot change it.
• stack-trunk: 2 ports for Kataras, 2 (40G + 4x10G) for ICX6610, 2-3 for ICX7750. Not available on FCX and ICX7450. ICX6610 cannot change it.
• default-ports: A unit always has two default-ports. Default ports are used in secure setup as candidate stack ports. A stack port is always a default-port. A default port is not necessarily a stack port. Not available on ICX6610. For example, you can configure “default-ports” to change the factory-set ICX7750 x/2/1 and x/2/4 to x/3/1 and x/3/4.
• multi-stack-trunk/port: change both ends of a live link to trunk/port. Available only after 8.0.
• peri-trunk/port: only available on ICX6610 for family stacking (mixing with Katara-H).
• NOTE: The system blocks any config CLI that may break the stack or cause a port-to-trunk connection. It warns if existing links have any port-to-trunk connections.

SAME FEATURES
• ICX6610 40G and 4x10G ports cannot be data ports even when stacking is not enabled. It always has two stack trunks.
                          FCX   Katara-L/H   ICX6610         ICX7450   ICX7750
  stack-port              Yes   Yes          cannot change   Yes       Yes
  stack-trunk             No    Yes          cannot change   No        Yes
  default-ports           Yes   No           No              No        Yes
  multi-stack-trunk/port  No    Yes          No              No        Yes
  peri-trunk/port         No    No           Yes             No        No

SAME RELOAD BEHAVIORS AS 7.3 AND LATER
• All units reload as a standalone, master or member. No unit reloads as a standby.
• A unit without the stacking.boot flash file reloads as a standalone.
• A master manages the entire stack.
• Master and standalone parse the startup-config flash. A member does not do so.
• The master gets all system parameters from the startup-config. A member gets them from stacking.boot.
• If a member reloads with different system parameters such as jumbo or VE number, the master sends parameters and reloads the member.
• The master reloads a member at most once to correct its bootup ID or system max setting.

SAME MASTER ELECTION
• Election criteria:
  • BOOTUP_MASTER, MORE_MEMBER, PRIORITY, LONGER_UPTIME, LOWEST_BOOTUP_ID, LOWEST_MAC_ADDRRESS
• The winning master always keeps its own bootup ID.
• The master assigns unit IDs to others. If a standalone unit configures “stack suggested-id”, the master will try to assign that ID to it.
• ID assignment considers the configuration and physical location of a unit.
• Stack merge: The losing stack reloads, and loses its configuration.
• Master & standby elections do not consider licenses.

SAME STANDBY BEHAVIORS
• The master does standby assignment about 1 minute after topology discovery. Debug CLI “dm stk disable assign” disables standby assignment.
• A member is not a standby candidate if it was once a master or standby, and its “show run” is different from the master’s.
• A stack may have no standby due to the above reason.
• Suppose there is no master after the stack reloads:
  1. If a member was a standby before its reload, it waits for about 2 minutes, then reloads the entire stack and boots up as a master. Debug CLI “dm stk disable standby” disables the standby taking over.
  2. Else: The stack remains in a non-functional state because there is no master.

ADVANCED FEATURES’ LICENSE MANAGEMENT
• Some features, such as BGP, need every unit to have a license.
• Master/standby elections do not consider licenses.
• A master without a license cannot access the BGP CLIs at all.
• A master with a license can access BGP CLIs. However, the system blocks BGP config actions if any operational unit has no license.
• If a unit without a license joins a stack running BGP, it is put into a non-operational state.
• The user can purchase a license and download it to a unit through the master. The master puts the unit into an operational state without a reload.
• Licenses have no effect on switch-over/failover.

SAME STACK CONSTRUCTION METHODS
• Configure and join.
  • Every unit configures “stack enable” and optional “priority” / “stack suggested-id”, and all units are linked together.
• Secure Setup (an interactive program for a user to probe the topology and select an ID).
  • Secure Setup allows a user to change the ID of a non-master unit. The system does not adjust the configuration related to the old ID.
• Unit Replacement: replace a unit with a clean unit of the same model with no user intervention.
• Only the master has the configuration for every unit. Then it links clean units to form a stack. (Same as Unit Replacement.)

SAME FAILOVER BEHAVIORS
• FAILOVER: The standby takes over if the master is gone.
  If “hitless-failover enable” is configured, the failover does not require reloads. Otherwise, the standby reloads the entire stack, and it boots up as the new master.
• Switch-over: Switch the master and standby. It requires that the master and standby have the same priority.
• If there is no standby and the master is gone, the stack cannot learn or communicate. Reloading won’t help because a member is still a member after a reload.
• Use “stack unconfigure” to change to standalone units.
  Syntax: stack unconfigure [ stack-unit | all | me | clean | mixed-stack ]

FAILOVER WITH HITLESS-FAILOVER ENABLED (SAME)
• Suppose master U1 crashes. Standby U2 takes over immediately without reloading any units.
• U1 comes up again. There are two masters.
  • U2 wins the master election if it has more members.
  • If there are only U1 and U2, the one with higher priority wins.
  • If they have the same priority, the one with longer uptime wins.
• The losing unit reloads as a member. It is assigned as a standby later.
• If the standby has a higher priority than the master, the system triggers a switch-over after the standby has learned L2 protocols.
• NOTE: hitless-failover is enabled by default since 8.020.

SAME RCONSOLE FEATURE
• A member’s local console is redirected to the master unless “stack rconsole-off” (hidden config CLI) is configured.
• Many debug commands must be executed on a member’s local console. There are three ways to get to a member’s local console:
  • From the master, rconsole to a member unit by “rconsole unit-ID”.
  • Type “shift ctrl - x” to go back to the local console. The three keys shift, ctrl and dash must be pressed together, followed by ‘x’.
  • Configure “stack rconsole-off” (hidden command) on the master. This avoids automatically redirecting a member’s local console to the master when a member joins.
• Ctrl-o x (ctrl and o together, followed by x) or “exit” terminates a master-to-member rconsole.

INFORMATION AVAILABLE IN STANDBY & MEMBER
• Local console information is used in troubleshooting failover or platform issues.
• If a member was once a master or standby, it has “show run”.
• Never trust application information such as “show mac”, “show arp” and “show ip ..” on a standby or member. Every application is different.
• The standby runs protocols in its virtual machine. Its application show commands depend on the implementation and may differ between Marvell and Broadcom platforms.
• Most CLIs, except show, debug and dm, are disabled on the standby and members.
• Type “develop” (hidden CLI) to toggle full accessibility of all CLIs. (Be careful.)

INFORMATION AVAILABLE IN STANDBY & MEMBER (continued)
                    standby                       member
  show run          fully functional              only local module
  show interface    fully functional              only local module
  show statistics   7750/7450: local ports;       7750/7450: local ports;
                    else: not available           else: not available
  show stack        fully functional              fully functional except no stacking port info for other units
  debug             fully functional              fully functional
  dm                fully functional              fully functional

USEFUL SHOW AND DEBUG COMMANDS
• show int brief, show stat brief
• show stack
• show stack ipc: repeat twice to see the difference, or use “dm stk clear ipc”
• dm stk show hit log: prints hundreds of lines of useful stacking logs.
• debug prestera reg /table: monitor writing a register or table in Marvell HW
• debug bcm api: monitor calling a specific Broadcom API
• dm stk debug-save/del: save/delete the current debug setting into flash so it is loaded during bootup.
  If there are lots of debug messages even after reload, do “no debug all” and “dm stk debug-del”.

SAME FEATURES
• Same display commands
• show stack, show stack detail/neighbors/stack-ports: They are similar.
• A dynamic unit is removed from “show run” when it leaves.
• “write memory” changes a dynamic unit to a static unit.

  ICX7750-26Q Router#show stack detail
  T=3d2h24m22.0: alone: standalone, D: dynamic cfg, S: static, A=10, B=11, C=12
  ID   Type          Role    Mac Address    Pri State   Comment
  1  S ICX7750-20QXG active  cc4e.2438.7480 128 local   Ready
  6  S ICX7750-48XGC member  0000.0000.0000   0 reserve
  11 D ICX7750-48XGC member  cc4e.2438.6e80 128 remote  Ready, standby if reloaded
  12 S ICX7750-48XGC member  cc4e.2438.8c00   0 remote  Ready, standby if reloaded

       active
       +---+        +---+        +---+
   -2/1| 1 |2/4--2/4| C |2/1--2/1| B |2/4|
   |   +---+        +---+        +---+   |
   |                                     |
   |-------------------------------------|
  There is no standby. Will assign in 46 sec @3d2h25m8s

  (continued in the next slide)

Same display commands
Continued from previous slide

  Current stack management MAC is cc4e.2438.7480
  Image-Auto-Copy is Enabled.

          Stack Port Status              Neighbors
  Unit#   Stack-port1    Stack-port2     Stack-port1     Stack-port2
  1       up (1/2/1)     up (1/2/4)      U11 (11/2/4)    U12 (12/2/4)
  11      up (11/2/1)    up (11/2/4)     U12 (12/2/1)    U1 (1/2/1)
  12      up (12/2/1)    up (12/2/4)     U11 (11/2/1)    U1 (1/2/4)

  Unit#   System uptime
  1       3 days 2 hours 24 minutes 22 seconds
  11      2 minutes 7 seconds
  12      54 seconds

If unit ID > 9, it is shown as A (10), B (11) and C (12).
NOTE: up* means an ICX7450 stacking port is in IEEE mode.

Same display commands
• show stack ipc: shows internal communication between CPUs.
• show stack ipc unit-number: shows IPC to/from one unit. Recv shows two numbers for two directions.
  FCX/ICX6610/Kataras may show two numbers only in certain types (using hop by hop communication). “dm stk clear ipc” clears IPC counters.

  ICX7750-26Q Router#show stack ipc
  V21, G2, src=cc4e.2438.7480, , Recv: SkP0: 1645885, P1: 1580780, sum: 3226665, t=268072.7
  Message types have callbacks:
  Send message types:
  [1]=1261028, [5]=532970, [6]=19073, [9]=63317, … removed
  Recv message types:
  [1]= 0:699208, 1:635040, [5]= 0:266955, 1:266087, [6]= 0:18231, 1:18231, ... Removed

  (continued at the next slide)

Same display commands
Continued from previous slide “show stack ipc”

  Statistics:
  send pkt num       :   2986384,   recv pkt num       :   3226657,
  send msg num       :   2986384,   recv msg num       :   3226658,
  send frag pkt num  :         0,   recv frag pkt num  :         0,
  pkt buf alloc      :   2986534,

  Reliable-mail       send   success   receive   duplic   T (us)
  target ID              2         2         0        0   33172644
  target MAC          8659      8654         0        0   33172644
  unrel target ID        1                   0
  unrel target MAC      14                   0
  There is 0 current jumbo IPC session

  Possible errors:
  *** send error pkt num:   150,   *** send error msg num:   150,
  *** no DSA tag :          146,   *** state not ready :       2,
  *** rel-mail-mac fail :     5,
  *** recv from non-exist unit 6 times: unit 12
  ICX7750-26Q Router#

Same display commands
• show stack rel-ipc: shows reliable IPC channels. This is used to debug different applications, such as rconsole, ACL and sFlow, which use reliable IPC channels. “dm stk clear rel-ipc” clears reliable IPC counters.
• show stack flash: shows the current stacking.boot and what was read during a reload. stacking.boot stores all stacking information required for a reload, such as bootup ID, bootup role (master or member) and system parameters. A unit without stacking.boot always reloads as a standalone unit. A system parameter shown as ‘x’ means the value is taken from the startup-config.
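The stacking.boot rules above (no file means standalone, the stored bootup role is master or member, and ‘x’ means "take the value from the startup-config") can be illustrated with a minimal Python sketch. This is not FastIron code: the StackingBoot class, its field names, and both helper functions are hypothetical and exist only to make the rules concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StackingBoot:
    """Hypothetical in-memory view of the stacking.boot flash file."""
    bootup_id: int
    bootup_role: str  # "master" or "member", as described on the slide
    # None models a system parameter that "show stack flash" displays as 'x'.
    params: Dict[str, Optional[int]] = field(default_factory=dict)


def reload_role(boot: Optional[StackingBoot]) -> str:
    """Return the role a unit reloads as; no unit reloads as a standby."""
    if boot is None:
        # A unit without stacking.boot always reloads as a standalone.
        return "standalone"
    if boot.bootup_role not in ("master", "member"):
        raise ValueError(f"unexpected bootup role: {boot.bootup_role!r}")
    return boot.bootup_role


def resolve_param(boot: StackingBoot, name: str,
                  startup_config: Dict[str, int]) -> int:
    """Resolve one system parameter; 'x' (None) falls back to startup-config."""
    value = boot.params.get(name)
    return startup_config[name] if value is None else value
```

For example, reload_role(None) yields "standalone", while a StackingBoot whose "jumbo" parameter is None resolves that parameter from the startup-config dictionary passed in. The parameter key "jumbo" is only an illustrative name.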
Same display commands
• show stack connection: probes the topology and shows details. <---> means good bi-directional traffic. <--- or ---> means a bad uni-directional link.

  ICX7750-26Q Router#show stack conn
  Probing the topology. Please wait ...
  ICX7750-26Q Router#
       active       standby
       +---+        +---+        +---+
   -2/1| 1 |2/4--2/4| C |2/1--2/1| B |2/4|
   |   +---+        +---+        +---+   |
   |                                     |
   |-------------------------------------|
  trunk probe results: 3 links
  Link 1: u1 -- u11, num=1
    1: 1/2/1 (P0) <---> 11/2/4 (P1)
  Link 2: u1 -- u12, num=1
    1: 1/2/4 (P1) <---> 12/2/4 (P1)
  Link 3: u11 -- u12, num=1
    1: 11/2/1 (P0) <---> 12/2/1 (P0)
  CPU to CPU packets are fine between 3 units.

DIFFERENCES
• “hitless-failover” is enabled by default since 8.020.
• “stack mac-address” is configured by default since 8.020 if “hitless-failover” is enabled. The MAC address is changeable but cannot be removed.
• FCX, ICX6610, ICX6450 and ICX6430 use Marvell HW.
• ICX7750 and ICX7450 use Broadcom HW.
• HW debugging commands are very different.
• Removable ICX7750 slot 3 and ICX7450 slots 3 and 4.
• FCX, ICX6610 and ICX6450 support up to 8 units. ICX6430: 4 units.
• ICX7750 and ICX7450 support up to 12 units.
• ICX7450 stacking ports may be in IEEE data or HiGi mode. ICX7750 stacking ports are always in HiGi mode.
• ICX7450 cannot enable stacking with 4x10G ports.
• An ICX7450 stacking port must exist or be provisional to be in “stack-port”.

HiGi VS IEEE modes
• All stacking (STK) ports are considered internal ports. They are not used to connect to other switches/routers or computers.
• Packets to/from a STK port (HiGi mode) have special tags (Marvell: 8 bytes, Broadcom: 16 bytes).
• Linking a STK port (HiGi mode) to a data port (IEEE mode) causes HW decoding errors.
• Secure setup and unit replacement need to work even over a mixed link. The CPU can decode CPU-bound packets correctly.
• ICX7450 cannot bring a port up when linking a HiGi port to an IEEE port.
• ICX7450 STK ports are initially in IEEE mode. After negotiation with the neighboring port, a STK port may enter HiGi mode.
• Whenever an ICX7450 STK port goes down, it enters IEEE mode.
• “show stack detail” shows “up*” if a STK port is up and in IEEE mode. This happens when its neighbor port is not a STK port.

CONFIG MISMATCH
• The master can have provisional configurations for non-existing units or modules.
• “show interface” displays all ports in a provisional module in Link Down state.
• A member’s module type does not match the provisional configuration:
  • Port number mismatch: 24-port module vs 48-port module: mismatch
  • POE mismatch: POE vs non-POE modules: may mismatch?
  • Module mismatch: ICX7450 4x10G vs 40G: mismatch
  • Non-existing modules (ICX7750 module 3, ICX7450 modules 3 and 4): OK
• The master learns new modules if a unit has more modules than the config.
• The master keeps config modules if a unit has fewer modules than the config.

UPGRADE AND DOWNGRADE
• “hitless-failover enable” is enabled by default.
• Pre-8.020 no config (not enabled) -> 8.020: no config (enabled).
• Pre-8.020 has config (enabled) -> 8.020: has config (enabled).
• 8.020 “no hitless-failover enable” (disabled): downgrade safe, but an upgrade becomes enabled if “write mem” is done in pre-8.020.
• 8.020 hitless-failover enabled has two states: 1) no config, 2) has “hitless-failover enable” config. The latter is due to config in pre-8.020.
• 8.020 -> 8.030 (has 10G breakout ports): The internal port number changes. For example, port 2/2/4 is 0x143 in 8.020 and 0x14c in 8.030.
• Stacking.boot stores the internal port number, not the “2/2/3” string, for stacking ports.
  Each unit must detect the version number and do the conversion.

COMMUNICATIONS BETWEEN CPUS
• Hop to hop mails: Do not need a device number. Used in topology discovery.
• Reliable mails: Used in critical communication during topology changes.
• Inter-processor Communication (IPC): unicast from one CPU to another CPU.
• Reliable IPC (Rel-IPC): It is similar to TCP in that it has sliding windows and transmission buffers. Applications can have their own channels.
• Rel-mail and rel-IPC fail if the transmission buffer is full.
  1. The application should check the available rel-IPC space before depositing messages.
  2. Implement error recovery.
• Application IPC (App-IPC): APIs to send messages by mail, rel-mail, IPC or rel-IPC. Applications can register a recovery function to handle rel-mail failures.
• DY-sync for syncing large tables. It has built-in error recovery capabilities.

ARCHITECTURE FOR MARVELL PLATFORMS
• L3 uses distributed architectures. The master sends ARP and routing tables to its members, and members do HW programming based on the tables.
• L2 uses register caches. The master can read/write remote units’ registers without waiting, as if they were local.
• Platform uses both register caches and distributed architectures. The master syncs port control tables to its members. Some features use register caches.
• ACL uses IPC to sync internal data structures to members to do ACL programming.
• Counters and link statuses are periodically sent by each member to the master register caches.

REGISTER CACHES
• Marvell register APIs are redirected to the register caches.
• Write: the value is written to the caches and the call returns immediately. The value is sent by reliable IPC to a remote unit to write a register.
• Read: read from caches.
• Counters, link status and other HW-updated registers are periodically pushed to the master.
• Different kinds of registers need different processing: Read/Write (RW), Read Only (RO), Write Only, Read and Clear, RO and updated by HW, RW and also updated by HW.
• Many registers that the master never accesses are classified as LOCAL. They are not in register caches.
• The classification is defined in tables. Different chips require different tables.

HOTSWAP FOR MARVELL PLATFORMS
• The member uploads its non-LOCAL registers to the master to form the register caches’ baseline.
• The master programs VLAN, STP, trunks and FDB (MAC forwarding) in register caches. Then the register caches are synced to the members.
• Every application checks its SW data structure (representing show run config) and programs the member using register caches or by sending IPC messages.
• The master puts a member into READY state.
• The member begins to periodically push its counters and link statuses to the master register caches.
• The master periodically polls the link registers of every unit, does port up/down event handling, and calls upper layer applications.

ARCHITECTURE FOR BROADCOM PLATFORMS
• We capture all Broadcom APIs (about 3000) so we can do remote procedure calls (RPC), caller tracking, statistics and sanity checks.
• RPC is non-blocking (without waiting for the result from the remote unit). Therefore only write functions can use RPC.
• A member periodically pushes its counters to the master’s caches. The master can read/write without waiting.
• Most applications use distributed approaches by sending IPC messages.
• RPC is used for convenience. (Calling Broadcom APIs without implementing code to send and process IPC messages.)
• Port status updates are done by interrupts, not by polling.
  The SC/LC model sends status changes from a member to the master.
• Hotswap is similar to Marvell platforms: replace register caches by IPC messages.

SWITCH-OVER & FAILOVER: STANDBY VIRTUAL MACHINE
• The standby must have proper dynamically learned data structures or protocol states to do hitless failover. This is the standby virtual machine (VM).
• The standby VM builds L2 protocol states, such as STP, MRP and LACP, by snooping control packets.
• Some applications sync data structures, such as MAC tables and errdisable tables, to the standby VM.
• L3 protocols use graceful restart.
• The standby VM can be different from the master’s view.
• The standby’s HW is controlled by the master. The standby cannot write to its HW when running protocols.
  • Marvell platforms: the standby writes to virtual machine register caches.
  • Broadcom platforms: the standby must block all virtual machine HW writes.

SWITCH-OVER & FAILOVER
• The new master (old standby) must program the HW of every unit, including itself, based on its virtual machine (VM) states.
• Marvell platforms: The new master populates some SW tables, e.g. MAC tables, into register caches, then does HW repainting based on register caches.
• Broadcom platforms: The new master programs every unit’s HW based on VM data structures and protocol states.
• The VM becomes the new master’s view.
• Switch-over: The new standby (old master) must delete all dynamic data. Its data structures and protocol states become the VM.
• A standby becoming a member needs to remove all dynamic data from the VM, preserving it in case it becomes a standby again.

THANK YOU
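As a supplementary illustration, the election criteria listed on the SAME MASTER ELECTION slide can be read as an ordered tie-break. The Python sketch below assumes the criteria are applied strictly in the listed order, which the slides do not state explicitly; the real firmware may weigh them differently, for example after a hitless failover. The Candidate class, its field names, and elect_master are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one unit taking part in a master election."""
    bootup_id: int
    mac: str             # dotted hex form as printed by the CLI, e.g. "cc4e.2438.7480"
    priority: int
    uptime_s: int
    members: int         # number of members the candidate already controls
    bootup_master: bool  # True if the unit was a master when it reloaded


def election_key(c: Candidate) -> Tuple[bool, int, int, int, int, int]:
    """Sort key: the smallest key wins (assumed strict ordering of criteria)."""
    return (
        not c.bootup_master,              # BOOTUP_MASTER: bootup masters first
        -c.members,                       # MORE_MEMBER: more members first
        -c.priority,                      # PRIORITY: higher priority first
        -c.uptime_s,                      # LONGER_UPTIME: longer uptime first
        c.bootup_id,                      # LOWEST_BOOTUP_ID
        int(c.mac.replace(".", ""), 16),  # lowest MAC address as an integer
    )


def elect_master(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the winning candidate under the assumed ordering."""
    return min(candidates, key=election_key)
```

With two candidates that differ only in the number of members, elect_master returns the one with more members; if everything else ties, the lower bootup ID and then the lower MAC address decide.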