Functional Specification
MV-S102110-02, Rev. E
August 24, 2006
Document Classification: Restricted Information

4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786

98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX163R, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, and 98DX803

MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED

Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches

Document Status
Technical Publication: Preliminary

Document Conventions
Note: Provides related information or information of special importance.
Caution: Indicates potential damage to hardware or software, or loss of data.
Warning: Indicates a risk of personal injury.

For further information about Marvell® products, see the Marvell website: http://www.marvell.com

Disclaimer

No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose, without the express written permission of Marvell. Marvell retains the right to make changes to this document at any time, without notice. Marvell makes no warranty of any kind, expressed or implied, with regard to any information contained in this document, including, but not limited to, the implied warranties of merchantability or fitness for any particular purpose. Further, Marvell does not warrant the accuracy or completeness of the information, text, graphics, or other items contained within this document.

Marvell products are not designed for use in life-support equipment or applications that would cause a life-threatening situation if any such products failed. Do not use Marvell products in these types of equipment or applications.

With respect to the products described herein, the user or recipient, in the absence of appropriate U.S. government authorization, agrees:

1) Not to re-export or release any such information consisting of technology, software or source code controlled for national security reasons by the U.S. Export Control Regulations ("EAR"), to a national of EAR Country Groups D:1 or E:2;

2) Not to export the direct product of such technology or such software, to EAR Country Groups D:1 or E:2, if such technology or software and direct products thereof are controlled for national security reasons by the EAR; and,

3) In the case of technology controlled for national security reasons under the EAR where the direct product of the technology is a complete plant or component of a plant, not to export to EAR Country Groups D:1 or E:2 the direct product of the plant or major component thereof, if such direct product is controlled for national security reasons by the EAR, or is subject to controls under the U.S. Munitions List ("USML").

At all times hereunder, the recipient of any such information agrees that they shall be deemed to have manually signed this document in connection with their receipt of any such information.

Copyright © 2006. Marvell International Ltd. All rights reserved. Marvell, the Marvell logo, Moving Forward Faster, Alaska, Fastwriter, Datacom Systems on Silicon, Libertas, Link Street, NetGX, PHYAdvantage, Prestera, Raising The Technology Bar, The Technology Within, Virtual Cable Tester, and Yukon are registered trademarks of Marvell.
Ants, AnyVoltage, Discovery, DSP Switcher, Feroceon, GalNet, GalTis, Horizon, Marvell Makes It All Possible, RADLAN, UniMAC, and VCT are trademarks of Marvell. All other trademarks are the property of their respective owners.

Table of Contents

Preface
  About This Document
  Document Organization
  Related Documentation
  Document Conventions

Section 1. Product Overview
  1.1 Product Family Overview
  1.2 Prestera Software Suite

Section 2. Summary of Features
  2.1 Port MAC Features
  2.2 Port Trunking Features
  2.3 Distributed Switching Architecture (DSA) Features
  2.4 Quality of Service Features
  2.5 Policy Features
  2.6 Bridging Features
  2.7 Unicast Routing Features
  2.8 Traffic Policing Features
  2.9 Bandwidth Management Features
  2.10 Secure Control Technology (SCT) Features
  2.11 Traffic Monitoring Features

Section 3. Functional Overview
  3.1 98DX106, 98DX163, 98DX166, 98DX243, and 98DX246 Block Diagram
  3.2 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, and 98DX803 Block Diagram
  3.3 98DX107, 98DX167, and 98DX247 Block Diagram
  3.4 98DX133, 98DX253, 98DX263, and 98DX273 Block Diagram
  3.5 98DX169 and 98DX249 Block Diagram
  3.6 98DX269 Block Diagram
  3.7 High-Level Packet Walkthrough

Section 4. Distributed Switching Architecture
  4.1 Cascade Ports
  4.2 Single-Target Destination in a Cascaded System
  4.3 Multi-Target Destination in a Cascaded System
  4.4 Loop Detection
  4.5 QoS on Cascade Interface
  4.6 DSA Tag
  4.7 Cascading

Section 5. Packet Command Assignment and Resolution
  5.1 Ingress Packet Command Assignment
  5.2 Command Resolution Matrix

Section 6. Host Management Interfaces
  6.1 PCI Interface
  6.2 Serial Management Interfaces (SMI)
  6.3 Two Wire Serial Interface (TWSI)
  6.4 Device Address Space
  6.5 CPU MII/GMII/RGMII Port
  6.6 Interrupts
  6.7 General Purpose Pins (GPP)

Section 7. CPU Traffic Management
  7.1 CPU Port Number
  7.2 Packets to the CPU
  7.3 Packets from the CPU

Section 8. Quality of Service (QoS)
  8.1 QoS Model
  8.2 Initial QoS Marking
  8.3 Traffic Policing
  8.4 QoS Enforcement
  8.5 Setting Packet Header QoS Fields
  8.6 Applications

Section 9. Network Interfaces and Media Access Controllers (MACs)
  9.1 Tri-Speed Port Overview
  9.2 HyperG.Stack Port Overview
  9.3 HX and QX Ports Overview
  9.4 MAC Operation and Configuration
  9.5 Tri-Speed Ports Auto-Negotiation
  9.6 MAC MIB Counters
  9.7 MAC Error Reporting

Section 10. Ingress Policy Engine
  10.1 Policy Engine Concepts
  10.2 Policy Engine Overview
  10.3 Policy Lookup Configuration
  10.4 Triggering Policy Engine Processing
  10.5 Policy Search Keys
  10.6 Policy Actions
  10.7 Applications

Section 11. Bridge Engine
  11.1 Bypassing Bridge Engine
  11.2 VLANs
  11.3 Spanning Tree Support
  11.4 Bridge Forwarding Database (FDB)
  11.5 Bridge Multicast (VIDX) Table
  11.6 Bridge Security Breach Events
  11.7 IPv4/6 Multicast (S, G, V) Bridging
  11.8 Control Traffic Trapping/Mirroring to the CPU
  11.9 Private VLAN Edge (PVE)
  11.10 Ingress Port Packet Rate Limiting
  11.11 Unknown and Unregistered Packet Filtering
  11.12 IP and Non-IP Multicast Filtering
  11.13 Bridge Local Switching
  11.14 Bridge Source-ID Egress Filtering
  11.15 Bridge Ingress Command Resolution
  11.16 Bridge Counters

Section 12. IPv4 and IPv6 Unicast Routing
  12.1 Unicast Routing Features
  12.2 Unicast Routing Overview
  12.3 Policy Engine Support of Unicast Routing
  12.4 Bridge Engine Support for Unicast Routing
  12.5 Router Engine Processing
  12.6 Routed Packet Header Modification
  12.7 Layer 3 Control Traffic to the CPU
  12.8 One-Armed Router Configuration

Section 13. Port Trunking
  13.1 Port Trunk-ID Assignment
  13.2 Forwarding to a Single Trunk Destination
  13.3 Forwarding of Multi-Destination Packets
  13.4 Trunking over Cascade Link

Section 14. Ingress Traffic Policing Engine
  14.1 Traffic Policing Engine Overview
  14.2 Triggering Traffic Policing
  14.3 Policer Configuration

Section 15. Bandwidth Management
  15.1 Buffers and Descriptors
  15.2 Ingress Bandwidth Management
  15.3 Egress Bandwidth Management

Section 16. Traffic Monitoring
  16.1 Traffic Sampling to the CPU
  16.2 Traffic Mirroring to Analyzer Port

Section 17. LED Interface
  17.1 LED Interface Overview
  17.2 LED Indications
  17.3 LED Indication Groups
  17.4 Other Indications
  17.5 LED Stream

Appendix A. DSA Tag Formats
  A.1 Extended DSA Tag in TO_CPU Format
  A.2 Extended DSA Tag in FROM_CPU Format
  A.3 Extended DSA Tag in TO_ANALYZER Format
  A.4 Extended DSA Tag in FORWARD Format

Appendix B. CPU Codes

Appendix C. Register Set
  C.1 Registers Overview
  C.2 Global, TWSI Interface and CPU Port Configuration Registers
  C.3 GPP Configuration Registers
  C.4 PCI SDMA Registers
  C.5 Master XSMI Interface Configuration Registers
  C.6 Router Header Alteration Configuration Registers
  C.7 Tri-Speed Ports MAC, CPU Port MAC, and SGMII Configuration Registers
  C.8 HyperG.Stack and HX/QX Ports MAC, Status, and MIB Counters, and XAUI Control Configuration Registers
  C.9 XAUI PHY Configuration Registers
  C.10 HX Port Registers Registers
  C.11 LEDs, Tri-Speed Ports MIB Counters, and Master SMI Configuration Registers
  C.12 PCI Registers
  C.13 Policy Engine and Bridge Engine Configuration Registers
  C.14 Policers and Unicast Routing Engine Configuration Registers
  C.15 Pre-Egress Engine Configuration Registers
  C.16 Egress, Transmit Queue and VLAN Configuration Registers and Tables
  C.17 Buffers Memory, Ingress MAC Errors Indications, and Egress Header Alteration Configuration Tables and Registers
  C.18 Buffers Management Registers
  C.19 Summary of Interrupt Registers

Appendix D. Packet Format
  D.1 Referenced Documents
  D.2 Ethernet Headers
  D.3 Layer 2 Protocol Headers
  D.4 IPv4 Header Format
  D.5 IPv6 Header Format
  D.6 Layer 4 Headers
  D.7 Layer 5 Headers

Appendix E. Revision History

List of Tables

Table 1: Acronyms
Table 2: Packet Command Resolution
Table 3: Host Management Interfaces
Table 4: Rx SDMA Descriptor Format
Table 5: Rx SDMA Descriptor—Command/Status Field
Table 6: Receive Descriptor—Byte Count Field
Table 7: Rx SDMA Descriptor—Buffer Pointer
Table 8: Rx SDMA Descriptor—Next Descriptor Pointer
Table 9: Transmit Descriptor Format
Table 10: Transmit Descriptor—Command/Status
Table 11: Transmit Descriptor—Byte Count
Table 12: Transmit Descriptor—Buffer Pointer
Table 13: Transmit Descriptor—Next Descriptor Pointer
Table 14: SMI Interfaces
Table 15: SMI Interface Framing
Table 16: XSMI Interface Framing
Table 26: Address Space Partitioning
Table 27: CPU Port Interface According to CPU_IF_TYPE[2:0]
Table 28: Example Layer 2 System Traffic Type to Traffic Class Mapping Table
Table 29: Traffic Types
Table 30: Packet QoS Attributes
Table 31: QoS Profile Table Entry
Table 32: FDB-Based QoS Marking Resolution
Table 33: Network Ports QoS Operation Modes
Table 34: Recommended QoS Configuration For Network Ports
Table 35: Recommended QoS Configuration For Cascading Ports
Table 36: Example {TC, DP} Sample {TC, DP} Assignment
Table 37: Configuration for Table 36
Table 38: MAC MIB Counters for Tri-State Ports
Table 39: MAC MIB Counters for HyperG.Stack Ports
Table 40: PCL Configuration Table Entry Lookup Cycle Parameters
Table 41: Standard (24-bytes) Key Format
Table 42: Extended (48-Byte) Key Fields
Table 43: Ternary Digit Representation
Table 44: Internally Generated Fields
Table 45: Key Fields Directly Extracted from the Packet
Table 46: Number of User-Defined Bytes Per Key
Table 47: Policy Action Entry
Table 48: Dividing Applications Between Lookup Cycles
Table 49: Global Protocol Table Entry
Table 50: Per Port Protocol Table Entry .... 209
Table 51: VLAN Entry Fields .... 217
Table 52: Spanning Tree Port State Behavior .... 220
Table 53: FDB Address Table Entry .... 222
Table 54: Additional Address Update Fields .... 227
Table 55: IEEE Reserved Multicast Addresses .... 245
Table 56: Cisco Proprietary L2 Protocols .... 245
Table 57: MLD Messages over ICMPv6 .... 247
Table 58: Common IPv4/6 Link-Local Multicast Addresses .... 250
Table 59: Host Counters .... 261
Table 60: Matrix Source Destination Counters .... 262
Table 61: Ingress Port/VLAN/Device Counters per Counter-Set .... 262
Table 62: Egress Counters per Counter-Set .... 264
Table 63: Routing PCL Rule Classification Key Fields .... 268
Table 64: Policy Action Entry As a Route Entry .... 269
Table 65: Configuration Range of CIR and CBS .... 299
Table 66: Number of 256-Byte Buffers For Each Device .... 302
Table 67: SDWRR vs. DWRR .... 307
Table 68: Tri-Speed Ports and CPU Port Indication Classes Description .... 319
Table 69: HyperG.Stack Port Indication Classes Description .... 321
Table 70: XAUI PHY LED Indications .... 322
Table 71: Group Data Description .... 324
Table 72: LED Interface 0 Ordered by Class .... 326
Table 73: LED Interface 1 Ordered by Class .... 328
Table 74: LED Interface 0 Ordered by Port .... 329
Table 75: LED Interface 1 Ordered by Port .... 331
Table 76: Extended TO_CPU DSA Tag .... 333
Table 77: Extended FROM_CPU DSA Tag .... 336
Table 78: Extended TO_ANALYZER DSA Tag .... 339
Table 79: Extended FORWARD DSA Tag .... 341
Table 80: CPU Codes .... 343
Table 81: Standard Register Field Type Codes .... 374
Table 82: Valid Ports for Each Device .... 375
Table 83: Global, TWSI Interface and CPU Port Configuration Register Map Table .... 376
Table 109: GPP Configuration Register Map Table .... 390
Table 113: SDMA Register Map Table .... 392
Table 130: Master XSMI Interface Register Map Table .... 400
Table 133: Router Header Alteration Configuration Registers Map Table .... 404
Table 144: Tri-Speed Ports MAC, CPU Port MAC and SGMII Configuration Registers Map Table .... 410
Table 158: HyperG.Stack and HX/QX Ports MAC and XAUI PHYs Configuration Register Map Table .... 426
Table 167: XAUI Register Map Table .... 438
Table 222: Register Map Table for the HX Port Registers .... 475
Table 253: LEDs, Tri-Speed Ports MIB Counters, and Master SMI Register Map Table .... 507
Table 298: PCI Registers Map Table .... 552
Table 311: Policy Engine and Bridge Engine Configuration Registers Map Table .... 560
Table 400: Address Update Message Types .... 662
Table 416: Policers Configuration Registers Map Table .... 679
Table 430: Pre-Egress Engine Configuration Registers Map Table .... 692
Table 457: Egress, Transmit Queue and VLAN Configuration Register Map Table .... 711
Table 527: Buffers Memory, Ingress MAC Errors Indications and Egress Header Alteration Configuration Tables and Registers Map Table .... 769
Table 541: Buffers Management Registers Map Table .... 777
Table 554: Summary of Interrupts Register Map Table .... 790
Table 605: Referenced Standards .... 821
Table 606: Revision History .... 846

List of Figures

Figure 1: 98DX106, 98DX163, 98DX166, 98DX243, and 98DX246 Top Level Block Diagram .... 32
Figure 2: 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, and 98DX803 Top Level Block Diagram .... 33
Figure 3: 98DX107, 98DX167, and 98DX247 Top Level Block Diagram .... 34
Figure 4: 98DX133, 98DX253, 98DX263, and 98DX273 Top Level Block Diagram .... 35
Figure 5: 98DX169 and 98DX249 Top Level Block Diagram .... 36
Figure 6: 98DX269 Top Level Block Diagram .... 37
Figure 7: SecureSmart and Layer 2+ Switches Ingress and Egress Processing Engines .... 38
Figure 8: Multilayer Stackable and SecureSmart Stackable Switches Ingress and Egress Processing Engines .... 39
Figure 9: Example of Single-Target Destination Forwarding in a Cascaded System .... 45
Figure 10: Example of Multi-Destination Forwarding in a Cascaded System .... 47
Figure 11: DSA Tag in the Ethernet Frame .... 48
Figure 12: Host Management Interfaces: 98DX130, 98DX133, 98DX250, 98DX253, 98DX260, 98DX263, 98DX270, 98DX273, and 98DX803 .... 57
Figure 13: Host Management Interfaces: 98DX106, 98DX107, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX262, and 98DX269 .... 57
Figure 14: CPU Descriptors and Memory Buffers .... 66
Figure 15: Serial ROM Data Structure .... 91
Figure 16: TWSI Bus Transaction—External Master Write to a Device Register .... 92
Figure 17: TWSI Bus Transaction—External Master Read from a Device Register .... 92
Figure 18: Hierarchical Interrupt Scheme .... 98
Figure 19: QoS Processing Walkthrough .... 112
Figure 20: Port-Based QoS Marking Operation .... 117
Figure 21: MAC-Address-Based QoS Marker Configuration .... 120
Figure 22: QoS Enforcement Walkthrough .... 123
Figure 23: {TC, DP} Assignment Algorithm for Data Traffic .... 124
Figure 24: {TC, DP} Assignment for Control Packets .... 125
Figure 25: {TC, DP} Assignment of Mirrored Packets .... 125
Figure 26: DiffServ Domains Crossing Using a Single DSCP to DSCP Mutation Table .... 134
Figure 27: Functional Block Diagram of Tri-Speed Port in 1000BASE-X Mode .... 137
Figure 28: Functional Block Diagram of Tri-Speed Port in SGMII Mode .... 138
Figure 29: Functional Block Diagram of the HyperG.Stack Port .... 139
Figure 30: Functional Block Diagram of the HX/QX .... 142
Figure 31: MAC Loopback Packet Walkthrough .... 155
Figure 32: PCS Loopback Packet Walkthrough .... 155
Figure 33: Analog Loopback Packet Walkthrough .... 156
Figure 34: Repeater Loopback Packet Walkthrough .... 156
Figure 35: Ingress Pipe Block Diagram for SecureSmart and Layer 2+ Stackable Switches .... 173
Figure 36: Ingress Pipe Block Diagram for Multilayer Stackable Switches .... 174
Figure 37: Organization of the Policy TCAM .... 176
Figure 38: Packet Walkthrough For a Lookup Cycle .... 178
Figure 39: Access Procedure to The Policy Configuration Table .... 181
Figure 40: Interface Mapping to Policy Table Index .... 182
Figure 41: Search Key Selection Procedure for SecureSmart and Layer 2+ Stackable Switches .... 184
Figure 42: Search Key Selection Procedure for Multilayer Stackable Switches .... 184
Figure 43: VLAN Classification Algorithm .... 207
Figure 44: Nested VLAN Cloud Ingress and Egress .... 212
Figure 45: Processing of Unicast Routed Packets .... 266
Figure 46: Minimum Trunk-ID Allocation for Cascade Trunks .... 282
Figure 47: Hash Index Generation Procedure .... 284
Figure 48: Sample Logical Configuration of the Designated Trunk Port Table .... 287
Figure 49: Sample Configuration of Designated Trunk Members Table .... 288
Figure 50: Example Distribution Port List and Non-Trunk Members Table Configuration .... 290
Figure 51: Ingress Pipe Block Diagram: SecureSmart, SecureSmart Stackable and Layer 2+ Stackable Switches .... 293
Figure 52: Ingress Pipe Block Diagram: Multilayer Stackable Switches .... 293
Figure 53: Policer Packet Walkthrough .... 297
Figure 54: Example Profile of Queue Scheduling Groups .... 309
Figure 55: Example .... 821
Figure 56: MAC Address .... 823
Figure 57: Ethernet v2 .... 823
Figure 58: IEEE 802.3 LLC Header (LLC header in boldface) .... 824
Figure 59: IEEE 802.3 LLC/SNAP Header (LLC and SNAP headers in boldface) .... 825
Figure 60: Novell (Raw Ethernet) Header .... 826
Figure 61: IEEE 802.1Q Tag in Ethernet v2 Packet (tag in boldface) .... 827
Figure 62: Double Tag (Q-in-Q) .... 828
Figure 63: ARP .... 829
Figure 64: IPv4 Header .... 830
Figure 65: IPv4 TOS .... 830
Figure 66: IPv4 Multicast Destination Address .... 831
Figure 67: Mapped MAC Multicast Destination Address .... 831
Figure 68: IPv6 Header .... 832
Figure 69: IPv6 Traffic Class .... 833
Figure 70: IPv6 Multicast Destination Address .... 834
Figure 71: Mapped MAC Multicast Destination Address .... 834
Figure 72: IPv6 Hop-by-Hop Extension Header .... 835
Figure 73: UDP Header .... 836
Figure 74: TCP Header .... 836
Figure 75: TCP/UDP Pseudo-Header .... 837
Figure 76: ICMP Echo Request .... 837
Figure 77: IGMPv1 .... 837
Figure 78: IGMPv2 .... 838
Figure 79: IGMPv3 Membership Query Message .... 840
Figure 80: IGMPv3 Membership Report Message .... 841
Figure 81: ICMPv6 .... 842
Figure 82: IPv6 MLDv1 .... 843
Figure 83: RIPv1 .... 844
Figure 84: RIPv2 .... 845
Figure 85: RIP Entry .... 845

Preface

About This Document
This document describes the architecture and features of the Prestera-DX SecureSmart switches, Layer 2+ stackable switches, and Multilayer stackable switches. It also provides full register definitions for these devices.

All feature descriptions and specifications in this document refer to all of the following packet processors, unless otherwise specified:

98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX163R, 98DX166, 98DX167, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, and 98DX803

In this document, any or all of these packet processors are referred to as “the device” or “the devices”.
Wherever a section is relevant for only some of these devices, this is stated at the beginning of the section in the following way:

This section is relevant for the following devices:
• SecureSmart: 98DX262
• Layer 2+ Stackable: 98DX130, 98DX260, 98DX270, 98DX803
• Multilayer Stackable: 98DX133, 98DX263, 98DX273
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269

If a section is relevant for all but one or two of the devices, the excluded devices are called out explicitly, as follows:

This section is relevant for the following devices:
• SecureSmart: 98DX106, 98DX163, 98DX163R, 98DX243, 98DX262
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
• Layer 2+ Stackable: 98DX130, 98DX166, 98DX246, 98DX250, 98DX260, 98DX270
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
Not relevant for: 98DX803

Document Organization
The sections in this specification are organized according to architectural and functional topics. Section 3. "Functional Overview" on page 31 provides a general description of the functional units in the device and a packet walk-through description. Subsequent chapters each focus on one of the device’s architectural and functional topics. Each chapter describes the particular functional behavior, followed by the associated hardware register and table configurations. References to registers and table entries are hyperlinks to the corresponding register definitions in the appendix of this document.
Related Documentation
The following documents contain additional information related to the Prestera® family chipset:
• RFC and IEEE standards (Table 605, “Referenced Standards,” on page 821)
• Prestera-DX Packet Processors Hardware Design Guide (Document Control # MV-S300644-00)
• 98DX130/133/250/253/260/263/270/273 Hardware Specifications (Document Control # MV-S102110-00)
• 98DX166/167/246/247 Hardware Specifications (Document Control # MV-S102727-00)
• 98DX803 Hardware Specifications (Document Control # MV-S103020-00)
• 98DX163/243 Hardware Specifications (Document Control # MV-S103374-00)
• 98DX106-BCW Hardware Specifications (Document Control # MV-S103473-00)
• 98DX106-LKJ Hardware Specifications (Document Control # MV-S103381-00)
• 98DX107-BCW Hardware Specifications (Document Control # MV-S102993-00)
• 98DX107-LKJ Hardware Specifications (Document Control # MV-S103560-00)
• 98DX262 Hardware Specifications (Document Control # MV-S103020-00)
• 98DX249 and 98DX269 Hardware Specifications (Document Control # MV-S103653-00)
• 98DX169, 98DX249, and 98DX269 Product Brief (Document Control # MV-S103614-00)

Document Conventions
The following name and usage conventions are used in this document:

Signal Range
A signal name followed by a range enclosed in brackets represents a range of logically related signals. The first number in the range indicates the most significant bit (MSb) and the last number indicates the least significant bit (LSb).
Example: CPU_TXD[7:0]

Active Low Signals n
An n symbol at the end of a signal name indicates that the signal’s active state occurs when voltage is low.
Example: INTn

State Names
State names are indicated in italic font.
Example: linkfail

Register Naming Conventions
Register field names are indicated as follows:
Example: The <DeviceEn> field in the Global Control register.
If the field name is in blue font (<DeviceEn>), this indicates a hyperlink.
Register field bits are enclosed in brackets.
Example: Field [1:0]
Register addresses are represented in hexadecimal format.
Example: 0x0
Reserved: The contents of the register are reserved for internal use only or for future use.

Glossary of Acronyms
The acronyms in Table 1 are used in Prestera documentation.
Table 1: Acronyms

AA        Aged Address
ACL       Access Control List
AF        DiffServ “Assured Forwarding” Per-Hop Behavior
ARP       Address Resolution Protocol
AU        Address Update
BE        Best Effort
BPDU      Bridge Protocol Data Unit
CBS       Committed Burst Size
CIDR      Classless Interdomain Routing
CIR       Committed Information Rate
CoS       Class of Service
CRC       Cyclic Redundancy Check
CS        DiffServ “Class Selector” Per-Hop Behavior
DA        Destination MAC Address
DF        IPv4 Header “Don’t fragment” field
DIP       Destination IP Address
DP        Drop Precedence
DS        Differentiated Service
DSA       Distributed Switching Architecture
DSAP      IEEE 802.2 Destination Service Access Point
DSCP      DiffServ Codepoint
ECMP      Equal/Weighted Cost Multipath
EF        DiffServ “Expedited Forwarding” Per-Hop Behavior
FDB       Forwarding Database
GVRP      GARP VLAN Registration Protocol
ICMP      Internet Control Message Protocol
IVL       Independent VLAN Learning
LPM       Longest Prefix Match
MAC       Media Access Control
MF flag   IPv4 header “More Fragments” Flag
MLD       Multicast Listener Discovery
MPPS      Million packets per second
MST       Multiple Spanning Tree
MTU       Maximum Transmission Unit
NA        “New Address” Address Update message
PACL      Port-based ACL
PCE       Policy Control Entry
PCL       Policy Control List
PHB       Per-Hop Behavior
PVID      Port VLAN-ID
QA        “Query Address” Address Update message
QoS       Quality of Service
QR        “Query Reply” Address Update message
RS        Reconciliation Sublayer
SA        Source MAC Address
SDWRR     Shaped Deficit Weighted Round Robin
SIP       Source IP Address
SLA       Service-Level-Agreement
SP        Strict Priority
SSAP      IEEE 802.2 Source Service Access Point
SST       Single Spanning Tree
STP       Spanning Tree Protocol
SVL       Shared VLAN Learning
TA        “Transplanted Address” Address Update message
TC        Traffic Class
TOS       IPv4 header “Type of Service” field
TTL       IPv4 header “Time to Live” field
UP        User Priority
VACL      VLAN-based ACL
VID       VLAN Identification
VIDX      Multicast group index
VLAN      Virtual Local Area Network
VLSM      Variable-Length Subnet Masking
WRR       Weighted Round Robin
XAUI      10 Gigabit Attachment Unit Interface

Section 1. Product Overview

1.1 Product Family Overview
The Marvell® Prestera®-DX family of packet processors delivers the optimal desktop switching solution for Enterprise (desktop and stackable) and Small-to-Medium Size Business (SMB) networks.

This functional specification describes three families of Prestera-DX devices:
• SecureSmart switches
• Layer 2+ stackable switches
• Multilayer stackable switches

1.1.1 Prestera-DX SecureSmart Switches
The Prestera-DX SecureSmart switches are targeted at the SMB market. They integrate Gigabit Ethernet ports with integrated SERDES (serializer-deserializer), as well as HyperG.Stack ports with XAUI transceivers, a Layer 2+ switching engine, a Layer 2 through Layer 4 Policy engine, an MII/GMII/RGMII Ethernet port for management, and on-chip buffer memory. These complete system-on-a-chip (SoC) packet processors support line-rate Layer 2 bridging with a 128-byte deep packet inspection Policy Control List and full IEEE 802.1p and DiffServ QoS support.

The Host CPU management interface of these devices consists of an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities.

These devices do not support IPv4/IPv6 Unicast routing.

The Prestera-DX SecureSmart family of switches consists of the following devices:
98DX106            10 Tri-Speed Ports SecureSmart switch
98DX163/98DX163R   16 Tri-Speed Ports SecureSmart switch
98DX243            24 Tri-Speed Ports SecureSmart switch
98DX262            24 Tri-Speed Ports + 2 HyperG.Stack Ports SecureSmart switch

98DX106, 98DX163, 98DX163R, and 98DX243
Apart from their port configurations, the 98DX106, 98DX163, 98DX163R, and 98DX243 devices are:
• Footprint compatible
• Feature compatible
• Software compatible
• Footprint compatible with the 98DX160, 98DX240, 98DX162, 98DX242, 98DX166, 98DX246, 98DX107, 98DX167, and 98DX247 devices.
98DX262

Apart from its pin configuration, the 98DX262 is footprint compatible with the 98DX250, 98DX260, 98DX270, 98DX803, 98DX253, 98DX263, and 98DX273 devices.

1.1.2 Prestera-DX SecureSmart Stackable Switches

The Prestera-DX SecureSmart Stackable switches are targeted at the SMB market. They integrate Gigabit Ethernet ports with integrated SERDES (serializer-deserializer), as well as HyperG.Stack ports with XAUI transceivers and HX/QX ports with integrated SERDES, a Layer 2+ switching engine, an IPv4/IPv6 Unicast routing engine, a Layer 2 through Layer 4 Policy engine, an MII/GMII/RGMII Ethernet port for management, and on-chip buffer memory. These complete system-on-a-chip (SoC) packet processors provide support for line-rate Layer 2 bridging with a 128-byte deep packet inspection Policy Control List and full IEEE 802.1p and DiffServ QoS support. The HX/QX ports provide cost-effective stacking over low-cost HDMI or SATA cables, which makes them well suited to the SMB market.

The Host CPU management interface of these devices is an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities. Up to 32 of these devices can be stacked.
The Prestera-DX SecureSmart Stackable family of switches consists of the following devices:

98DX169    16 Tri-Speed Ports + 2 HX/QX Ports SecureSmart Stackable switch with IPv4/IPv6 Unicast Routing capabilities
98DX249    24 Tri-Speed Ports + 2 HX/QX Ports SecureSmart Stackable switch
98DX269    24 Tri-Speed Ports + 2 HX/QX Ports + 1 HyperG.Stack Port SecureSmart Stackable switch with IPv4/IPv6 Unicast Routing capabilities, or
           24 Tri-Speed Ports + 1 HX/QX Port + 2 HyperG.Stack Ports SecureSmart Stackable switch with IPv4/IPv6 Unicast Routing capabilities

98DX169, 98DX249, and 98DX269

Apart from their port configurations, the 98DX169, 98DX249, and 98DX269 devices are:
• Features compatible
• Software compatible

Apart from its port configuration, the 98DX269 is footprint compatible with the 98DX250, 98DX260, 98DX262, and 98DX270.

1.1.3 Prestera-DX Layer 2+ Stackable Switches

The Prestera-DX Layer 2+ stackable switches are targeted at the Layer 2+ stackable market. They integrate Gigabit Ethernet ports with integrated SERDES (serializer-deserializer), as well as HyperG.Stack ports with XAUI transceivers, a Layer 2+ switching engine, a Layer 2 through Layer 4 Policy engine, a PCI or MII/GMII/RGMII Ethernet port for management, and on-chip buffer memory. These complete system-on-a-chip (SoC) packet processors provide support for line-rate Layer 2 bridging with a 128-byte deep packet inspection Policy Control List and full IEEE 802.1p and DiffServ QoS support.

The Host CPU management interface of these devices is a PCI Interface, or an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities. Up to 32 of these devices can be stacked. They do not support IPv4/IPv6 Unicast Routing.
The Prestera-DX Layer 2+ family of switches consists of the following devices:

98DX166    16 Tri-Speed Ports Layer 2+ stackable switch
98DX246    24 Tri-Speed Ports Layer 2+ stackable switch
98DX250    24 Tri-Speed Ports Layer 2+ stackable switch
98DX130    12 Tri-Speed Ports + 1 HyperG.Stack Layer 2+ stackable switch
98DX260    24 Tri-Speed Ports + 2 HyperG.Stack Layer 2+ stackable switch
98DX270    24 Tri-Speed Ports + 3 HyperG.Stack Layer 2+ stackable switch
98DX803    3 HyperG.Stack Layer 2+ stackable switch

Note
The 98DX166 and 98DX246 do not incorporate a PCI interface for management. Like the SecureSmart switches, their management interface is an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities.

98DX166 and 98DX246

Apart from their port configurations, the 98DX166 and 98DX246 devices are:
• Footprint compatible
• Features compatible
• Software compatible
• Footprint compatible with the 98DX160, 98DX240, 98DX162, 98DX242, 98DX163, 98DX243, 98DX107, 98DX167, and 98DX247 devices.

98DX250, 98DX130, 98DX260, 98DX270, and 98DX803

Apart from their port configurations, the 98DX130, 98DX250, 98DX260, 98DX270, and 98DX803 devices are:
• Footprint compatible
• Features compatible
• Software compatible
• Footprint compatible with the 98DX262, 98DX253, 98DX263, and 98DX273.

1.1.4 Prestera-DX Multilayer Stackable Switches

The Prestera-DX Multilayer stackable switches are targeted at the stackable edge router market. They integrate Gigabit Ethernet ports with integrated SERDES (serializer-deserializer), as well as HyperG.Stack ports with XAUI transceivers, a Layer 2+ switching engine, an IPv4/IPv6 Unicast Routing engine, a Layer 2 through Layer 4 Policy engine, a PCI or MII/GMII/RGMII Ethernet port for management, and on-chip buffer memory. These complete system-on-a-chip (SoC) packet processors provide support for line-rate Layer 2 bridging with a 128-byte deep packet inspection Policy Control List and full IEEE 802.1p and DiffServ QoS support.

The Host CPU management interface of these devices is a PCI Interface, or an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities. Up to 32 of these devices can be stacked, and they support IPv4/IPv6 Unicast Routing.

The Prestera-DX Multilayer stackable family of switches consists of the following devices:

98DX107    10 Tri-Speed Ports Multilayer stackable switch
98DX167    16 Tri-Speed Ports Multilayer stackable switch
98DX247    24 Tri-Speed Ports Multilayer stackable switch
98DX253    24 Tri-Speed Ports Multilayer stackable switch
98DX133    12 Tri-Speed Ports + 1 HyperG.Stack Multilayer stackable switch
98DX263    24 Tri-Speed Ports + 2 HyperG.Stack Multilayer stackable switch
98DX273    24 Tri-Speed Ports + 3 HyperG.Stack Multilayer stackable switch

Note
The 98DX107, 98DX167, and 98DX247 do not incorporate a PCI interface for management. Like the SecureSmart switches, their management interface is an MII/GMII/RGMII Ethernet port for packet forwarding and a Slave SMI Interface for access to address-mapped entities.

98DX107, 98DX167, and 98DX247

Apart from their port configurations, the 98DX107, 98DX167, and 98DX247 are:
• Footprint compatible
• Features compatible
• Software compatible
• Footprint compatible with the 98DX160, 98DX240, 98DX162, 98DX242, 98DX106, 98DX163, 98DX243, 98DX166, and 98DX246 devices.

98DX253, 98DX133, 98DX263, and 98DX273

Apart from their port configurations, the 98DX133, 98DX253, 98DX263, and 98DX273 are:
• Footprint compatible
• Features compatible
• Software compatible
• Footprint compatible with the 98DX262, 98DX250, 98DX260, and 98DX270.

1.2 Prestera Software Suite

The Prestera Software Suite (PSS) is a set of production-quality, comprehensive drivers for managing a Prestera-based system. It serves as a foundation for customer-developed applications, such as IEEE 802.1 bridging services, IPv4/IPv6 routing, Policy Control Lists, Traffic Conditioning, and Quality of Service.

Based on a modular architecture and comprehensive APIs, the Prestera Software Suite enables software developers to integrate high-level applications with minimal effort, without register-level knowledge of the Prestera chipset registers and tables. The software is written in ANSI C and is OS and CPU independent for easy porting.

See the Prestera Software Suite User Guide for additional information.

Section 2. Summary of Features

This section summarizes the features supported by the device. Unless specified otherwise, it is applicable to all devices. Detailed feature definitions and configuration descriptions can be found in the associated sections of this document.

2.1 Port MAC Features

The device incorporates 10/12/16/24 independent 10/100/1000 Mbps Ethernet MACs with integrated 1.25 Gbps SERDES. In addition, the following devices incorporate independent HyperG.Stack MACs with integrated XAUI transceivers:

1 HyperG.Stack port:     98DX130, 98DX133
2 HyperG.Stack ports:    98DX260, 98DX262, 98DX263
3 HyperG.Stack ports:    98DX270, 98DX273, 98DX803
2 HX/QX ports:           98DX249, 98DX169
2 HX/QX ports and 1 HyperG.Stack port, or 1 HX/QX port and 2 HyperG.Stack ports: 98DX269

The MAC port features include:
• 10/100/1000 Mbps Ethernet MAC:
  – Integrated SGMII interface on all 10/16/24 tri-speed ports. SGMII is a serialized version of the IEEE 802.3 GMII interface, which supports a triple-speed MAC (1000/100/10 Mbps) using only four I/Os per port.
  – IEEE 802.3x Flow Control support on full-duplex links and back-pressure Flow Control on half-duplex links.
  – Two IEEE 802.3 Clause 22 compliant master SMI interfaces for external PHY management and Auto-Negotiation.
  – Supports manual or automatic setting for link, speed, duplex, and IEEE 802.3x Flow Control.
  – Support of Automatic Media Select when connected to an 88E1112 Alaska® PHY, without CPU intervention.
  – Support for Virtual Cable Tester® (VCT) technology, using the Alaska transceiver.
  – Support for 1000BASE-X for fiber and backplane applications.
  – Support for pre-emphasis on serial driver.
  – Support for Ethernet-like and RMON EtherStats counters.
  – Support for Jumbo frames of up to 10 KB.
• HyperG.Stack MAC:
  – The HyperG.Stack port integrates a XAUI transceiver using 16 I/Os, incorporating four synchronized lanes that deliver bi-directional point-to-point data transmission of 3.125 Gbps or 3.75 Gbps per lane.
  – IEEE 802.3ae XAUI-compliant Quad 3.125 Gbps/lane.
  – Supports pre-emphasis on serial driver.
  – Three IPG modes: LAN mode, Fixed mode, and WAN mode.
  – IEEE 802.3x Flow Control support.
  – Packet Level Flow Control support via digital pins.
  – IEEE 802.3 Clause 45 compliant master XSMI interface for configuration of the HyperG.Stack MACs and configuration of external XFP or XENPAK PHYs.
  – Per-port IEEE 802.3 Clause 45 compliant Slave XSMI interface for port configuration.
  – Two preamble modes: Standard and Enhanced.
  – Exceeds IEEE 802.3ae jitter requirements in 10 Gbps applications.
  – On-chip 50 ohm serial receiver termination.
  – Support for Ethernet-like and RMON EtherStats counters.
  – Supports Jumbo frames of up to 10 KB.
• HX/QX MAC:
  – The HX/QX port integrates two (HX) or one (QX) SERDES lanes using four (HX) or two (QX) I/Os. It incorporates two synchronized lanes, delivering bi-directional point-to-point data transmission of 3.125 Gbps per lane.
  – A QX port uses a single 3.125 Gbps SERDES lane for 2.5 Gbps throughput.
  – An HX port uses two 3.125 Gbps SERDES lanes for 5 Gbps throughput.
  – Supports pre-emphasis on serial driver.
  – Exceeds IEEE 802.3ae jitter requirements in 10 Gbps applications.
  – On-chip 50 ohm serial receiver termination.
  – IEEE 802.3x Flow Control support.
  – Packet Level Flow Control support via digital pins.
  – Support for Ethernet-like and RMON EtherStats counters.
  – Supports Jumbo frames of up to 10 KB.

The port MAC features and configuration are described in detail in Section 9. "Network Interfaces and Media Access Controllers (MACs)" on page 136.

2.2 Port Trunking Features

Port trunking (also known as link aggregation) allows multiple physical ports to function as a single high-bandwidth logical port between the device and other switching devices or end-stations.
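A trunk behaves as a single logical port only if every packet is steered to exactly one member port, and packets of the same flow keep using the same member. The following is a minimal Python sketch of one common way to do this: hash selected header fields and take the result modulo the number of members. It is an illustration only, not the device's algorithm. The choice of CRC-32 as the hash, the fields included in the key, and the names `select_trunk_member`, `flow_key`, and `MAX_TRUNK_MEMBERS` are assumptions made for the example; the limit of eight members per trunk group is taken from this specification.

```python
import zlib

# Per this specification, a trunk group has up to eight port members.
MAX_TRUNK_MEMBERS = 8


def select_trunk_member(members, src_mac, dst_mac, src_ip="", dst_ip="", l4_ports=()):
    """Pick one trunk member for a packet (hypothetical helper, not a PSS API).

    members:  list of (device, port) tuples; members may sit on different devices
    l4_ports: optional (source_port, destination_port) tuple
    """
    if not 1 <= len(members) <= MAX_TRUNK_MEMBERS:
        raise ValueError("a trunk group needs between 1 and 8 members")
    # Build a key from the L2/L3/L4 fields so that one flow always hashes alike.
    flow_key = "|".join([src_mac, dst_mac, src_ip, dst_ip, *map(str, l4_ports)])
    # CRC-32 is an assumed stand-in for whatever hash the hardware implements.
    index = zlib.crc32(flow_key.encode()) % len(members)
    return members[index]


trunk = [(0, 1), (0, 2), (1, 5)]  # (device, port) pairs, one on a second device
chosen = select_trunk_member(trunk, "00:11:22:33:44:55", "66:77:88:99:aa:bb",
                             "10.0.0.1", "10.0.0.2", (1234, 80))
print(chosen)
```

Because the member is derived only from header fields, packets of one flow stay in order on one link, while different flows spread across the members.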
The device’s port trunking support is compliant with the IEEE 802.3ad Link Aggregation standard. The following port trunk features are supported by the device:
• Support for 127 trunk groups. (Among the SecureSmart devices, the 98DX163, 98DX243, and 98DX262 support 32 trunk groups, and the 98DX106 supports 8 trunk groups.)
• Each trunk group can be configured with up to eight port members. The Marvell Distributed Switching Architecture (DSA) enables the trunk group members to reside on any device in the system.
• Unicast and Multicast packets are load-balanced among the trunk group port members using either:
  – A hash function based on the packet’s L2, L3, and/or L4 header fields
  – The ingress port number or Trunk-ID

Port trunking features and configuration are described in detail in Section 13. "Port Trunking" on page 281.

2.3 Distributed Switching Architecture (DSA) Features

The device supports DSA, which allows multiple devices to be cascaded through any of its Ethernet MAC port interfaces with other devices in these three families, or with any Marvell device that supports DSA tag cascading (e.g., the 98DX240). The cascade port can be a single MAC port or a trunk group consisting of several MAC ports on the device. Up to 32 devices can be cascaded to create a single cascaded system. Any cascade topology (e.g., chain, ring, or mesh) is supported.

A cascaded system of devices in these three families supports the same features as a non-cascaded single device in these three families. This includes:
• Trunk groups with port members on multiple devices in the system.
• Mirroring to analyzer port on any device in the system.
• Traffic to the CPU can be sent through any device in the system.
• CPU can inject traffic to be transmitted through a port on any device in the system.
• A source-ID based egress filtering mechanism may be used to prevent loops in the forwarding topology.

In addition, cascading ports can be trunked to create a high-bandwidth interconnection between Prestera-DX devices.

Cascaded system features and configuration are described in detail in Section 4. "Distributed Switching Architecture" on page 44.

2.4 Quality of Service Features

The device provides extensive Layer-2 and Layer-3 Quality of Service (QoS) capabilities, allowing it to support IEEE 802.1p and IETF DiffServ requirements. The device QoS features include:
• 72 global QoS Profiles.
• A QoS Profile determines the packet’s traffic class, drop precedence, user priority, and DSCP:
  – Eight traffic class assignments for segregation on egress queues. (SecureSmart and SecureSmart Stackable devices have four traffic classes for network ports and eight queues for the CPU port.)
  – Two drop precedence level assignments for tail-dropping on congested egress queues.
  – On egress, optional QoS marking of packet user priority and/or DSCP.
• QoS initial marking mechanisms: Port-based, Protocol-based, Policy-based, or FDB-based.
• Layer-2 and/or Layer-3 QoS Trusted Port modes:
  – Maps the packet User Priority or DSCP to a QoS profile.
  – Optional DSCP Mutating for crossing of DiffServ domains.
• Policing: Out-of-profile packets may be QoS remarked or dropped.
• Setting packet header 802.1p User Priority and/or DSCP QoS fields.

The Quality of Service features and configuration are described in detail in Section 8. "Quality of Service (QoS)" on page 110.

2.5 Policy Features

The device incorporates an on-chip line-rate ingress Policy engine. The Policy engine is suited for supporting port or VLAN access control lists (ACLs/VACLs), policy-based QoS, VLANs, mirroring or trapping to the CPU, or switching. The Policy engine features include the following:
• Inspection of the first 128 bytes of the packet.
• Up to 1024 policy rules where each rule key is 24 bytes, or 512 policy rules where each rule key is 48 bytes, or a combination of 24-byte and 48-byte rules. (SecureSmart and SecureSmart Stackable devices: Up to 256 policy rules, where each rule key is 24 bytes, or 128 policy rules, where each rule key is 48 bytes, or a combination of 24-byte and 48-byte rules.)
• Each rule key has per-bit masking capability.
• The key consists of well-known fixed Layer-2/3/4 fields, as well as user-defined fields.
• Rules are associated with a policy-ID. Packets are assigned a policy-ID based on the source port/trunk number, or the packet’s VID.
• Supports two policy searches per packet, each with a separate policy-ID assignment.
• Rule match counters: a rule action can be bound to one of 32 global rule-match counters.
• The policy actions include the following features:
  – Accept/Deny
  – Trap/Mirror to CPU
  – Mirror to analyzer port
  – Assign QoS attributes
  – VLAN assignment or VLAN translation
  – Redirect to a target destination
  – Bind to one of 256 Policers (SecureSmart and SecureSmart Stackable devices support four Policers per port.)

The Policy features and configuration are described in detail in Section 10. "Ingress Policy Engine" on page 172.

2.6 Bridging Features

The device supports wire-speed 802.1D/Q bridging features, together with many additional bridging feature enhancements. The device’s bridging features include:
• 16K entry Forwarding Database (FDB) (SecureSmart devices, SecureSmart Stackable devices, and 98DX107: 8K entry FDB):
  – Automatic and CPU-controlled learning and aging modes.
  – New Source Addresses can be dropped, trapped, or forwarded. This is an important security hook for IEEE 802.1X Port Based Access Control, and for its proprietary extension, MAC-based access control.
  – Independent and Shared VLAN Learning.
  – CPU-triggered delete of entries by VLAN and/or port/trunk.
  – MAC-based filtering, trapping, mirroring to CPU, or mirroring to analyzer port.
  – Address transplanting from an old device, port, or trunk to a new device, port, or trunk. This is an important hook for efficient implementation of IEEE 802.1w Rapid Reconfiguration.
  – Address Update messages to/from the CPU for FDB management.
  – IPv4/6 Multicast bridging based on the packet (Source-IP, Group-IP, VLAN-ID).
• VLANs:
  – 4K entry VLAN table. (SecureSmart and SecureSmart Stackable devices have 256 active VLANs.)
  – Port, Protocol, and Policy-based VLAN assignment mechanisms.
  – Nested VLAN support for Provider Bridging.
  – VLAN ingress and egress filtering.
• 4K entry Multicast Group table.
  (SecureSmart and SecureSmart Stackable devices have 256 Multicast Group tables.)
• Support for single spanning tree and multiple spanning tree with up to 256 spanning tree groups. (SecureSmart and SecureSmart Stackable devices do not support Multiple Spanning Tree.)
• Private VLAN Edge for secure forwarding to an uplink port.
• Trapping/Mirroring of well-known control protocols.
• Trapping/Mirroring/Dropping of Unknown or Unregistered packets.
• Rate limiting of Known, Unknown Unicast, Multicast, and Broadcast packets.
• Counters:
  – RMON 1 Host Group and Matrix Group counters.
  – Port/VLAN/Device ingress counters.
  – Port/VLAN/drop precedence/traffic class egress counters.

The Bridge features and configuration are described in detail in Section 11. "Bridge Engine" on page 203.

2.7 Unicast Routing Features

This section is relevant for the following devices:
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
It is not relevant for the SecureSmart or Layer 2+ Stackable devices.

The device supports the following Unicast routing features:
• Per-port and per-VLAN enabling of IPv4 and IPv6 Unicast routing.
• Policy-based IPv4/v6 routing lookup.
• Up to 1K prefix/host entries and 1K ARP MAC addresses. SecureSmart Stackable devices support up to 32 static IPv4 prefix/host entries and 256 ARP MAC addresses.
• Next-hop forwarding to any {device, port}, trunk, or VLAN group in the system.
• Per route entry QoS assignment.
• Per route entry mirroring-to-CPU or mirroring to Ingress Analyzer port.
• Router exception checking:
  – IPv4/v6 Header Error
  – TTL/Hop Limit Exceeded
  – Options
• Routed packet modifications:
  – MAC SA assignment based on port or VLAN.
  – IPv4 TTL and IPv6 Hop Limit decrement.
  – IPv4 Checksum update.
• Support for Layer 3 control traffic:
  – RIPv1
  – IPv4/v6 control protocols running over link-layer Multicast, e.g., RIPv2, OSPFv2
  – UDP Relay
• Egress mirroring of routed packet to an Analyzer port.

The Unicast Routing features and configuration are described in detail in Section 12. "IPv4 and IPv6 Unicast Routing" on page 265.

2.8 Traffic Policing Features

The device supports 256 on-chip wire-speed ingress traffic policers. (SecureSmart and SecureSmart Stackable devices support four per-port on-chip wire-speed ingress traffic policers.) Each Policer supports the following features:
• Single meter, configurable with a maximum rate and burst-size:
  – Rates range from a minimum of 1 Kbps to a maximum of 100 Gbps, with six levels of granularity, from the finest (1 Kbps, for rates under 1 Mbps) to the coarsest (100 Mbps, for rates up to 100 Gbps).
  – Large burst-size supports temporal bursts without impacting TCP’s sliding window algorithm.
• Color-aware and color-unaware operational modes.
• Out-of-profile packets are either remarked with QoS or dropped. QoS Remarking is based either on explicit QoS assignment or mapped according to the incoming DSCP.
• Conformance counters:
  – 16 global conformance counter sets. Each set counts in-profile and out-of-profile packets.
  – Each policer may be bound to a conformance counter set.

The Policing features and configuration are described in detail in Section 14. "Ingress Traffic Policing Engine" on page 292.

2.9 Bandwidth Management Features

The device provides the bandwidth management features required for QoS (lossy) systems and flow-control (lossless) systems. These features include:
• Ingress bandwidth management using flow-control with XOFF/XON buffer limits.
• Eight egress traffic class queues per port (including the CPU port). (SecureSmart and SecureSmart Stackable devices have four traffic classes for the network ports and eight traffic classes for the CPU.)
• Egress tail-dropping, for congestion avoidance:
  – Based on the queued buffers limit and queued packets limit.
  – Two levels of drop precedence for color-aware tail-dropping.
• Egress queue scheduling algorithms:
  – Shaped Deficit Weighted Round Robin (SDWRR), for minimum bandwidth assignment.
  – Strict Priority (SP) provides low-latency scheduling, for high-priority low-latency traffic.
  – Hybrid scheduling of both SP and SDWRR queues.
• Egress per-port and per-queue shaping, for limiting the maximum bandwidth:
  – Byte-based shaping rates ranging from 64 Kbps to 12 Gbps, with a granularity of under 64 Kbps.

The bandwidth management features and configuration are described in detail in Section 15. "Bandwidth Management" on page 302.
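The hybrid scheduling listed in Section 2.9 can be pictured with a small model: strict-priority queues are always drained first, and the remaining queues share the link by deficit round robin, where each queue earns a byte quantum per round and may send packets as long as its accumulated credit covers them. The Python sketch below is a textbook deficit round robin combined with strict priority, offered only as an illustration. It does not model the shaping part of SDWRR, and it is not a description of the device's scheduler; the function name `schedule`, the queue numbering, and the quantum values are assumptions made for the example.

```python
from collections import deque


def schedule(queues, sp_queues, quanta, rounds):
    """Return the (queue, packet_size) order in which packets are sent.

    queues:    dict of queue id -> deque of packet sizes in bytes
    sp_queues: queue ids served in strict priority, highest priority first
    quanta:    dict of weighted queue id -> byte credit added per round
    rounds:    number of round-robin passes to simulate
    """
    sent = []
    deficit = {q: 0 for q in quanta}
    for _ in range(rounds):
        # Strict-priority queues are emptied before any weighted queue is served.
        for q in sp_queues:
            while queues[q]:
                sent.append((q, queues[q].popleft()))
        # Deficit round robin over the weighted queues.
        for q, quantum in quanta.items():
            if not queues[q]:
                deficit[q] = 0  # an idle queue does not accumulate credit
                continue
            deficit[q] += quantum
            # Send while the head-of-line packet fits in the remaining credit.
            while queues[q] and queues[q][0] <= deficit[q]:
                size = queues[q].popleft()
                deficit[q] -= size
                sent.append((q, size))
    return sent


example = {7: deque([64]), 1: deque([1500, 1500]), 0: deque([1500, 1500])}
# Queue 7 is strict priority; queue 1 gets twice the quantum of queue 0.
print(schedule(example, sp_queues=[7], quanta={1: 3000, 0: 1500}, rounds=1))
```

In this example the weights are byte quanta, so a queue with twice the quantum is offered roughly twice the bandwidth when both queues are backlogged, regardless of packet sizes.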
For the ingress traffic policers of Section 2.8, a traffic flow is bound to a policer through the Policy rule action. Any number of policy rules may be bound to a single policer.

2.10 Secure Control Technology (SCT) Features

In managed systems, it is critical that the CPU receive only traffic that requires software processing. Unwanted traffic unnecessarily burdens the CPU and delays handling of other traffic that requires processing. Furthermore, traffic that is sent to the CPU must be properly prioritized into separate queues. This allows the CPU to process high-priority traffic with minimum delay, even when overloaded with low-priority traffic.

The device provides Secure Control Technology both for selecting the traffic to be sent to the CPU and for prioritizing and managing the bandwidth of that traffic.
• 8 Traffic Class CPU queues: Same queueing, scheduling algorithms, and shaping as non-CPU-port queues; see Section 7. "CPU Traffic Management" on page 102.
• For each packet type trapped or mirrored to the CPU, the user can configure the following packet attributes:
  – Traffic Class
  – Drop Precedence
  – CPU destination device
  – Packet truncation to 128 bytes
  – Statistical dropping
• Explicit mechanisms to trap or mirror well-known Multicast and Broadcast control packets to the CPU:
  – ARP Request
  – IPv4 IGMPv1/2/3
  – IPv6 MLDv1/2
  – IPv6 Neighbor Discovery
  – IP Broadcast
  – Spanning tree BPDU
  – Other IEEE reserved Multicast packets (e.g., GVRP, LACP, PAE)
  – Cisco Layer-2 Multicast control packets
  – Unicast MAC-to-me packets

The device’s physical management interface may be the PCI interface for both packet Rx/Tx and register access, or MII/GMII/RGMII for packet Rx/Tx and the SMI interface for register access.

The SCT features and configuration are described in detail in Section 7. "CPU Traffic Management" on page 102.

2.11 Traffic Monitoring Features

• Ingress and/or egress port packet sampling compliant with RFC 3176: sFlow - A Method for Monitoring Traffic in Switched and Routed Networks.
• Packets may be truncated to 128 bytes and sent to any CPU in the system.
• Sampling to the CPU is independent of ingress/egress packet mirroring to analyzer port.
• Mirroring to Analyzer Port:
  – Independent ingress and egress analyzer port configuration.
  – Ingress mirroring enabled per port, policy rule action, VLAN, and/or FDB.
  – Egress mirroring enabled per port.
  – Unlimited number of ingress and egress mirrored ports.
  – Supports ingress and/or egress statistical mirroring to the destination analyzer port.

The traffic monitoring features and configuration are described in detail in Section 16. "Traffic Monitoring" on page 312.
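The sampling behavior summarized in Section 2.11 can be illustrated with a short model: each packet is selected with probability 1/N, and a selected packet is truncated before it is handed to the CPU. The Python sketch below assumes simple random 1-in-N selection, which is the general approach sFlow describes; how the device actually selects packets is not stated in this summary. The names `sample_packets` and `TRUNCATE_BYTES` are hypothetical; the 128-byte truncation length comes from this specification.

```python
import random

# Sampled packets may be truncated to 128 bytes, per this specification.
TRUNCATE_BYTES = 128


def sample_packets(packets, rate, rng):
    """Yield a truncated copy of, on average, one in `rate` packets.

    packets: iterable of bytes objects
    rate:    mean sampling interval N (1 samples every packet)
    rng:     a random.Random instance, passed in so runs can be reproduced
    """
    for pkt in packets:
        # randrange(rate) is 0 with probability 1/rate.
        if rng.randrange(rate) == 0:
            yield pkt[:TRUNCATE_BYTES]


traffic = [bytes(200)] * 1000  # 1000 dummy 200-byte packets
samples = list(sample_packets(traffic, rate=10, rng=random.Random(0)))
print(len(samples), "packets sampled, each", len(samples[0]), "bytes")
```

Randomizing the selection, rather than taking every Nth packet, avoids locking onto periodic traffic patterns, which is why the number of samples varies from run to run around the expected 1/N share.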
CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 • AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The device’s traffic monitoring features include: • Statistical packet sampling to the CPU: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification The devices are members of the Marvell® Prestera®-DX family of networking switches. This single-chip packet processor integrates 10/12/16/24, ports of Gigabit Ethernet with integrated SERDES, and one, two, or three HyperG.Stack ports, each with integrated quad SERDES interface XAUI transceivers, Layer 2+ switching engine, IPv4/IPv6 Unicast routing engine, powerful Layer 2 through Layer 4 Policy engine, PCI or MII/GMII/RGMII Ethernet port for management, and on-chip 6 Mbit buffer memory. This complete system-on-a-chip (SoC) packet processor provides support for line rate Layer 2 bridging, IPv4 and IPv6 Unicast routing, deep packet inspection Policy engine, and Layer 2/3 QoS. M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The device integrates the following functions: • Store and forward switching architecture with on-chip packet buffering. • Layer 2 through Layer 4 packet Policy engine. • Ethernet Bridge engine. • IPv4/IPv6 Unicast routing engine • Ingress policers. • Support for packet header manipulation, including VLAN insertion/removal/replacement, IEEE 802.1p User Priority field remarking and DSCP field remarking • Marvell Distributed Switching Architecture, based on the DSA tag for the CPU packet interface and cascade ports between devices. • On-chip transmit queues, including congestion handling and scheduling. • Egress rate shapers. 
• Support for host processor interface—PCI, Ethernet + SMI, or MII/GMII/RGMII Ethernet. In addition, Marvell provides a comprehensive set of software tools and software drivers supporting the Prestera chipset and the Alaska® transceivers. The Prestera Software Suite (PSS) provides the user with high-level APIs and OS/CPU independence. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 31 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Section 3. Functional Overview M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Functional Overview 98DX106, 98DX163, 98DX166, 98DX243, and 98DX246 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 1: 98DX106, 98DX163, 98DX166, 98DX243, and 98DX246 Top Level Block Diagram Switch Address Mapped Entities TW SI CPU SMI Pre-Egress Engine PLLs and Misc VLAN Multicast Groups TWSI Controller Reference Clocks and Misc. Egress Control Pipe Forwarding Database Slave SMI Ingress Control Pipe Serial LED Controller LED Interface Transmit Queues Memory Transmit Queues CPU Port MAC Buffer Memory M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 MII/ GMII/ RGMII Policy Rules TCAM and Policy Action Table DMAs and Buffer Management Tx DMA per port and Rx DMA per port PHY Polling Port 0 GMAC Ext PHY Reg I/F 1.25Gbps SERDES ... ... Port 23 GMAC 1.25Gbps SERDES Tri-Speed Ethernet Ports Master SMI 98DX106: 98DX163: 98DX166: 98DX243: 98DX246: MV-S102110-02 Rev. 
E Page 32 10 16 16 24 24 Tri-Speed Tri-Speed Tri-Speed Tri-Speed Tri-Speed Ethernet Ethernet Ethernet Ethernet Ethernet Ports Ports Ports Ports Ports 0 0 0 0 0 – – – – – 9 15 15 23 23 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 3.1 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Functional Overview 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, and 98DX803 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 2: 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, and 98DX803 Top Level Block Diagram Switch Address Mapped Entities TWSI CPU SMI Pre-Egress Engine TWSI Controller Egress Control Pipe PLLs and Misc VLAN Multicast Groups and Span Tree Groups Forwarding Database Slave SMI Serial LED Controller Ingress Control Pipe Reference Clocks and Misc. LED Interface Transmit Queues Memory Transmit Queues CPU Port MAC M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 MII/ GMII/ RGMII/ PCI Policy Rules TCAM and Policy Action Table Buffer Memory DMAs and Buffer Management Tx DMA per port and Rx DMA per port PHY Polling Ext PHY Reg I/F Master SMI Port 0 GMAC ... 1.25Gbps SERDES August 24, 2006, Preliminary Port 24 HX.S MAC Dual 1.25Gbps SERDES Dual 3.125 Gbps SERDES Tri-Speed Ethernet Ports 98DX130: 98DX250: 98DX260: 98DX262: 98DX270: 98DX803: Copyright © 2006 Marvell Port 23 HX.S MAC 12 Tri-Speed Ethernet Ports 24 Tri-Speed Ethernet Ports 24 Tri-Speed Ethernet Ports 24 Tri-Speed Ethernet Ports 24 Tri-Speed Ethernet Ports 3 HyperG.Stack Port 24-26 Port 26 HX.S MAC ... 
Dual 3.125 Gbps SERDES HyperG.Stack Ports 0 0 0 0 0 – – – – – 11 23 23 23 23 and 1 HyperG.Stack Port 24 and 2 HyperG.Stack Ports 24-25 and 2 HyperG.Stack Ports 24-25 and 3 HyperG.Stack Port 24-26 CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 33 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 3.2 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, and 98DX803 Block Diagram 98DX107, 98DX167, and 98DX247 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 3: 98DX107, 98DX167, and 98DX247 Top Level Block Diagram Switch Address Mapped Entities TWSI Pre-Egress Engine TWSI Controller Egress Control Pipe PLLs and Misc VLAN Multicast Groups and Span Tree Groups Reference Clocks and Misc. Forwarding Database CPU SMI Slave SMI ARP Table MII/ GMII/ RGMII CPU Port MAC Transmit Queues Memory Serial LED Controller LED Interface Policy Rules TCAM and Policy Action Table M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Transmit Queues Ingress Control Pipe Buffer Memory DMAs and Buffer Management Tx DMA per port and Rx DMA per port PHY Polling Port 0 GMAC Ext PHY Reg I/F 1.25Gbps SERDES ... ... Port 23 GMAC 1.25Gbps SERDES Tri-Speed Ethernet Ports Master SMI MV-S102110-02 Rev. 
E Page 34 98DX107: 10 Tri-Speed Ethernet Ports 0 – 9 98DX167: 16 Tri-Speed Ethernet Ports 0 – 15 98DX247: 24 Tri-Speed Ethernet Ports 0 – 23 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 3.3 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Functional Overview 98DX133, 98DX253, 98DX263, and 98DX273 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 4: 98DX133, 98DX253, 98DX263, and 98DX273 Top Level Block Diagram Switch Address Mapped Entities TWSI Pre-Egress Engine TWSI Controller Egress Control Pipe PLLs and Misc VLAN Multicast Groups and Span Tree Groups Reference Clocks and Misc. Forwarding Database CPU SMI Slave SMI ARP Table MII/ GMII/ RGMII/ PCI CPU Port MAC Transmit Queues Memory LED Interface Policy Rules TCAM and Policy Action Table M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Transmit Queues Serial LED Controller Ingress Control Pipe Buffer Memory DMAs and Buffer Management Tx DMA per port and Rx DMA per port PHY Polling Port 0 GMAC Ext PHY Reg I/F 1.25Gbps SERDES Master SMI ... August 24, 2006, Preliminary Port 24 HX.S MAC Dual 1.25Gbps SERDES Dual 3.125 Gbps SERDES Tri-Speed Ethernet Ports 98DX133: 98DX253: 98DX263: 98DX273: Copyright © 2006 Marvell Port 23 HX.S MAC 12 24 24 24 Tri-Speed Tri-Speed Tri-Speed Tri-Speed Ethernet Ethernet Ethernet Ethernet Ports Ports Ports Ports ... Port 26 HX.S MAC Dual 3.125 Gbps SERDES HyperG.Stack Ports 0 0 0 0 – – – – 11 and 1 HyperG.Stack Port 24 23 23 and 2 HyperG.Stack Ports 24-25 23 and 3 HyperG.Stack Port 24-26 CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. 
E Page 35 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 3.4 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED 98DX133, 98DX253, 98DX263, and 98DX273 Block Diagram 98DX169 and 98DX249 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 5: 98DX169 and 98DX249 Top Level Block Diagram Switch Address M apped Entities Pre-Egress Engine VLAN M ulticast G roups TW SI TW SI Controller PLLs and M isc Reference Clocks and M isc. Egress Control Pipe Forwarding Database CPU SM I Slave SM I Transm it Q ueues M em ory Serial LED Controller LED Interface ARP Table Transm it Queues CPU Port M AC Policy Rules TCAM and Policy Action Table Buffer M em ory M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 M II/ G M II/ RG M II Ingress Control Pipe D M As and Buffer M anagem ent Tx DM A per port and Rx D M A per port PHY Polling Port 0 GM AC Ext PHY Reg I/F 1.25Gbps SERDES M aster SM I ... Port 23 G M AC Port 25 HX.S M AC Port 26 HX.S M AC 1.25Gbps SERDES Dual 3.125 G bps SERDES Dual 3.125 Gbps SERDES Tri-Speed Ethernet Ports HyperG .Stack Ports 98DX169 = 16 Tri-Speed Ethernet Ports 0-15 98DX249 = 24 Tri-Speed Ethernet Ports 0-23 MV-S102110-02 Rev. 
E Page 36 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 3.5 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Functional Overview 98DX269 Block Diagram AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 6: 98DX269 Top Level Block Diagram Switch Address Mapped Entities TWSI CPU SMI Pre-Egress Engine PLLs and Misc VLAN Multicast Groups TWSI Controller Reference Clocks and Misc. Egress Control Pipe Forwarding Database Slave SMI Ingress Control Pipe Serial LED Controller LED Interface Transmit Queues Memory Transmit Queues M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 MII/ GMII/ RGMII Policy Rules TCAM and Policy Action Table Buffer Memory CPU Port MAC DMAs and Buffer Management Tx DMA per port and Rx DMA per port PHY Polling Port 0 GMAC Ext PHY Reg I/F 1.25Gbps SERDES Master SMI ... Port 23 GMAC Port 24 H.GS MAC 1.25Gbps SERDES XAUI Quad 3.125/3.75 Gbps SERDES 24 Tri-Speed Ethernet Ports Port 25 H.GS/HX.S MAC XAUI Quad 3.125/3.75 Gbps SERDES Dual 3.125 Gbps SERDES Two HyperG.Stack ports Port 26 HX.S MAC Dual 3.125 Gbps SERDES Two HX.Stack ports When Port 25 mode is HyperG.Stack, its HX.Stack interface is inactive. When Port 25 mode is HX.Stack, its HyperG.Stack interface is inactive. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. 
3.7 High-Level Packet Walkthrough

This section provides a functional walkthrough of the device.

The device's processing engines are pipelined. The device maintains two pipelines, an ingress pipeline and an egress pipeline, which process all the traffic received on the device and transmitted from it. Figure 7 and Figure 8 illustrate the ingress and egress pipelines and the engines in each pipeline stage.

Note
The only difference between the Layer 2+ devices and the Multilayer devices is that the Multilayer devices incorporate an IPv4/IPv6 Unicast Routing Engine.

Figure 7: SecureSmart and Layer 2+ Switches Ingress and Egress Processing Engines

Ingress Pipeline (in processing order): Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, Policing Engine, Pre-Egress Engine.
Egress Pipeline (in processing order): Egress Filtering, Multi-Target Replication, Descriptor Enqueueing, Rate Shaping, Transmit Scheduler, Headers Alteration, Ports MAC Tx.
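As a compact summary of the stage ordering illustrated in Figure 7 and Figure 8, the two pipelines can be written down as ordered lists. This is only an illustrative sketch: the stage names come from the figures, while the constant and function names are hypothetical and not part of any Marvell API.

```python
from typing import List

ROUTING = "IPv4/IPv6 Unicast Routing Engine"

# Ingress stages in processing order (names as labeled in the figures).
INGRESS_STAGES: List[str] = [
    "Ports MAC Rx",
    "Header Decode Engine",
    "Policy Engine",
    "Bridge Engine",
    ROUTING,  # present only on devices that incorporate the routing engine
    "Policing Engine",
    "Pre-Egress Engine",
]

# Egress stages in processing order.
EGRESS_STAGES: List[str] = [
    "Egress Filtering",
    "Multi-Target Replication",
    "Descriptor Enqueueing",
    "Rate Shaping",
    "Transmit Scheduler",
    "Headers Alteration",
    "Ports MAC Tx",
]


def ingress_stages(has_routing_engine: bool) -> List[str]:
    """Return the ingress stages, dropping the routing engine if absent."""
    return [s for s in INGRESS_STAGES if has_routing_engine or s != ROUTING]
```

Calling `ingress_stages(False)` gives the Figure 7 ordering, and `ingress_stages(True)` gives the Figure 8 ordering; the egress list is the same in both figures.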
Figure 8: Multilayer Stackable and SecureSmart Stackable Switches Ingress and Egress Processing Engines

Ingress Pipeline (in processing order): Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, IPv4/IPv6 Unicast Routing Engine, Policing Engine, Pre-Egress Engine.
Egress Pipeline (in processing order): Egress Filtering, Multi-Target Replication, Descriptor Enqueueing, Rate Shaping, Transmit Scheduler, Headers Alteration, Ports MAC Tx.

3.7.1 Ingress Pipeline

Packets received on a device's port are first processed by the ingress pipeline, which consists of the following processing units:
• Port MAC Rx
• Header Decode engine
• Policy engine
• Bridge engine
• IPv4/IPv6 Unicast Routing Engine (98DX107/167/247/253/263/273/169/249/269 devices only)
• Policing engine
• Pre-egress engine

3.7.1.1 Ingress Processing on a Cascade Port

Although all packets received by the device pass through the ingress pipeline, not all ingress engines are necessarily enabled. Specifically, the Policy and Bridge engines can be independently enabled or disabled on a per-port basis.

In a cascaded/stackable system based on devices in these families, the device in the stack through which the packet is received performs the policy and bridge decisions. If the packet is forwarded through a cascade port, the forwarding decision and packet descriptor information are attached to the packet using the DSA tag. The Policy and Bridge engines are configured to be disabled on the device's cascade ports, since the packet was already processed by these engines on the ingress device.

In addition, any ingress trapping or mirroring of a packet to the CPU is also performed only on the ingress pipeline on the ingress device. This ensures that a packet is not sent to the CPU multiple times.

3.7.1.2 Ports MAC Rx

Each port MAC Rx operates independently. The port MAC is responsible for IEEE 802.3 MAC functionality, packet reception, allocation of buffers in the device's packet memory, and DMA of the packet data into the buffers memory.

Packets that contain errors, such as FCS errors or length errors, are discarded. Error-free packets continue to be processed by the ingress pipeline.

3.7.1.3 Header Decode Engine

If the packet is not filtered by the port MAC, the packet's header of up to 128 bytes is decoded by the Header Decode engine. This engine decodes the packet header and extracts the packet fields (e.g., MAC SA, MAC DA, EtherType, SIP, DIP) that are required by the subsequent pipe engines.

3.7.1.4 Policy Engine

If the packet is not filtered by the port MAC, and the ingress port configuration setting enables Policy engine processing, the packet is processed by the Policy engine.

The Policy engine allows Policy Control Lists to be applied to packets based on flexible criteria, including the packet Layer 2, Layer 3, and Layer 4 field content. The Policy engine may be used to implement user applications such as Access Control Lists (ACLs), Quality of Service (QoS), and Policy-based VLANs.

The Policy engine may perform two lookups per packet. On the 98DX107/167/247/253/263/273/169/249/269 devices, the Policy engine second lookup may be used to perform IPv4/IPv6 Unicast Longest Prefix Match on the packet's DIP. If a match is found, the policy action entry is used as a next hop entry for the IPv4/IPv6 Unicast Routing engine.

3.7.1.5 Bridge Engine

If the packet is not filtered by the previous engines, and the ingress port configuration is enabled for Bridge engine processing, the packet is sent to the Bridge engine.

The Bridge engine is responsible for the following functions:
1. IEEE 802.1Q/D Bridging. This includes functions such as VLAN assignment, Spanning Tree support, MAC learning, Address table entries aging, filtering, and forwarding.
2. IPv4 IGMP snooping and IPv6 MLD snooping.
3. Control packet trapping and mirroring to the CPU. This includes identifying IEEE reserved Multicast, IPv4/v6 link layer Multicast, IGMP and MLD, ARP, RIPv1, and IPv4 Broadcast packets.
4. Ingress port rate limiting of Broadcast, Multicast, and Unknown Unicast packets.
5. Filtering/trapping/mirroring of unknown and/or unregistered packets.
6. Private VLAN Edge (PVE), if enabled, overrides the bridge forwarding decision and sends the packet to a pre-configured destination.
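The per-port enabling of the Policy and Bridge engines described in Section 3.7.1.1 can be pictured with a small model. The sketch below is an assumption-laden illustration, not device code: the class and function names are hypothetical, and MAC-level filtering, routing, and policing are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IngressPortConfig:
    """Hypothetical per-port ingress settings (names are illustrative)."""
    policy_enabled: bool = True
    bridge_enabled: bool = True


def cascade_port_config() -> IngressPortConfig:
    # On cascade ports both engines are disabled, because the ingress device
    # already made the policy and bridge decisions (carried in the DSA tag).
    return IngressPortConfig(policy_enabled=False, bridge_enabled=False)


def engines_applied(cfg: IngressPortConfig) -> List[str]:
    """List the ingress engines that act on a packet received on this port."""
    engines = ["Port MAC Rx", "Header Decode"]
    if cfg.policy_enabled:
        engines.append("Policy")
    if cfg.bridge_enabled:
        engines.append("Bridge")
    engines.append("Pre-egress")
    return engines
```

With these definitions, `engines_applied(IngressPortConfig())` includes both "Policy" and "Bridge", while `engines_applied(cascade_port_config())` skips them, which mirrors why a packet is not processed, trapped, or mirrored a second time on downstream devices.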
3.7.1.6 IPv4/v6 Unicast Routing Engine

This section is relevant for the following devices:
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
Not relevant for the SecureSmart and Layer 2+ devices.

The IPv4/IPv6 Unicast Routing engine is triggered, and the packet is routed, when all of the following conditions are met:
• The packet is IPv4 or IPv6 Unicast.
• The packet has not been filtered by the previous engines.
• The Policy engine has performed Longest Prefix Match on the packet's DIP.
• The packet has been triggered for routing by the Bridge engine.
• The packet's VLAN is enabled for routing.

The IPv4/IPv6 Routing engine is responsible for the following functions:
• Router exception checking:
  – IPv4/v6 Header Error
  – TTL/Hop Limit Exceeded
  – Options
• Next-hop forwarding to any {device, port}, trunk, or VLAN group in the system
• Per route entry QoS assignment
• Per route entry mirroring-to-CPU or mirroring to ingress analyzer port

3.7.1.7 Policing Engine

The Policing engine is invoked when a policy rule binds the matching packet flow to a policer instance. The Policing engine can meter the flow and maintain flow statistics. Non-conforming packets can be dropped, or have their QoS re-marked.

3.7.1.8 Pre-Egress Engine

At the end of the ingress pipeline, the Pre-egress engine examines the decisions made by the ingress pipeline and prepares/duplicates the packet descriptor for the egress pipeline processing.

3.7.1.8.1 Ingress Pipeline Packet Commands

The ingress pipeline packet descriptor command may be any ONE of the following:
• Drop the packet.
• Trap the packet to the CPU.
• Forward the packet to a target destination(s).
• Forward the packet to a target destination(s) AND mirror the packet to the CPU.

In addition to one of the possible packet commands listed above, the ingress pipeline may set the packet descriptor with either or both of the following commands:
• Mirror to the ingress analyzer port.
• Ingress sample to the CPU.

3.7.1.8.2 Packet Descriptor Processing

If the packet descriptor command is "Drop the packet" and it is neither ingress-mirrored nor sampled to the CPU, the packet is dropped and its buffers are released. The descriptor of this dropped packet is not forwarded to the egress pipeline.

If the packet is mirrored to the CPU and/or sampled to the CPU and/or mirrored to the ingress analyzer port, the packet descriptor is replicated by the pre-egress engine for each target.

If, for any reason, the packet is to be sent to the CPU, the packet descriptor is set with a specific CPU code and its associated attributes, and then forwarded to the specified CPU in the system. The global CPU Code table defines the attributes of each CPU code: the CPU device through which the packet is to be forwarded, its traffic class, drop precedence, the statistical sampling ratio, and an option to truncate the packet to 128 bytes.

3.7.2 Egress Pipeline

The ingress pipeline Pre-egress unit passes to the egress pipeline packet descriptors with either a Unicast or Multicast destination target.
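Looking back at the ingress packet commands of Section 3.7.1.8 before the egress stages are detailed, the descriptor replication rule can be sketched as a function from the command and the two optional flags to a list of targets. The enum, function, and target labels below are hypothetical names chosen for illustration; the treatment of a dropped packet that is still mirrored or sampled follows a reading of Section 3.7.1.8.2 and is an assumption.

```python
from enum import Enum
from typing import List


class Command(Enum):
    DROP = "drop"
    TRAP = "trap to CPU"
    FORWARD = "forward"
    FORWARD_AND_MIRROR = "forward and mirror to CPU"


def descriptor_targets(command: Command,
                       mirror_to_analyzer: bool = False,
                       sample_to_cpu: bool = False) -> List[str]:
    """Return one entry per descriptor copy the pre-egress stage would emit.

    An empty list models the case where the packet is dropped and its
    buffers are released without reaching the egress pipeline.
    """
    targets: List[str] = []
    if command in (Command.FORWARD, Command.FORWARD_AND_MIRROR):
        targets.append("destination")
    if command is Command.TRAP:
        targets.append("CPU (trap)")
    if command is Command.FORWARD_AND_MIRROR:
        targets.append("CPU (mirror)")
    if mirror_to_analyzer:
        targets.append("ingress analyzer port")
    if sample_to_cpu:
        targets.append("CPU (ingress sample)")
    return targets
```

For example, `descriptor_targets(Command.DROP)` returns an empty list, while `descriptor_targets(Command.DROP, mirror_to_analyzer=True)` still yields one copy for the ingress analyzer port. Every CPU-bound copy would additionally carry a CPU code whose attributes come from the global CPU Code table, which this sketch does not model.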
The egress pipeline consists of the following functional units:
• Egress Filtering
• Multi-Target Replication
• Descriptor Queueing
• Rate Shaping
• Transmit Scheduler

3.7.2.1 Egress Filtering

The following is performed prior to enqueueing a packet on a given egress port queue:
1. VLAN egress filtering
2. Spanning Tree egress filtering
3. Source port/trunk group egress filtering for multi-target packets
4. Egress port unregistered Multicast and Unknown Unicast filtering
5. Source ID multi-target filtering

Each of the egress filtering unit mechanisms can be independently enabled or disabled.

3.7.2.2 Multi-Target Replication

A multi-target destination is indicated via a multi-target group index, also known as VIDX. If the packet is multi-target (i.e., Broadcast, Multicast, or Unknown Unicast), the packet descriptor is replicated for each egress port member of the VIDX group.

3.7.2.3 Descriptor Queueing

Non-filtered packet descriptors are enqueued on the egress traffic class queue. To prevent head-of-line blocking, queueing uses a tail-drop algorithm based on the number of packets or buffers in the queue.

If the egress port is configured for egress mirroring to analyzer port, every packet descriptor enqueued on the egress port queue has its descriptor duplicated and forwarded back to the ingress pipeline Pre-egress unit. It is then forwarded to the configured egress analyzer port.

If the egress port is configured for statistical sampling of packets to the CPU, for every packet selected for sampling, the descriptor is duplicated and forwarded back to the ingress pipeline Pre-egress unit for forwarding to the configured CPU.
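The interaction of multi-target replication and tail-drop queueing described above can be sketched as follows. This is a simplified model under stated assumptions: the class and function names are hypothetical, each port has a single queue, the only egress filter modeled is source-port filtering, and the exact limit comparison is an assumption (the text only says the decision depends on the number of packets or buffers in the queue).

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple


class TailDropQueue:
    """Egress queue that rejects new descriptors once a limit is reached."""

    def __init__(self, max_descriptors: int, max_buffers: int) -> None:
        self.max_descriptors = max_descriptors
        self.max_buffers = max_buffers
        self.descriptors: Deque[Tuple[str, int]] = deque()
        self.buffers_in_use = 0

    def enqueue(self, packet_id: str, buffers: int) -> bool:
        """Enqueue a descriptor, or tail-drop it (return False) if full."""
        if (len(self.descriptors) >= self.max_descriptors
                or self.buffers_in_use + buffers > self.max_buffers):
            return False
        self.descriptors.append((packet_id, buffers))
        self.buffers_in_use += buffers
        return True


def replicate_to_vidx(packet_id: str, buffers: int,
                      vidx_members: Iterable[int], source_port: int,
                      queues: Dict[int, TailDropQueue]) -> List[int]:
    """Replicate a descriptor to each VIDX member port and enqueue it.

    Returns the ports on which the descriptor was actually enqueued.
    """
    enqueued: List[int] = []
    for port in vidx_members:
        if port == source_port:  # source port egress filtering
            continue
        if queues[port].enqueue(packet_id, buffers):
            enqueued.append(port)
    return enqueued
```

Because each port's queue makes its own tail-drop decision, a congested member port loses its copy without affecting the copies sent to the other members of the group.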
If the packet has a Unicast destination set to a trunk group, the destination is converted to one of the trunk group port members (on any device), based on the trunk hash function.

3.7.2.4 Queue and Port Shaping

Each egress port traffic class queue can be configured with a token bucket for shaping traffic transmitted from the queue.

In addition, the aggregate traffic of all the traffic class queues for a given egress port can be configured with a token bucket shaper.

3.7.2.5 Transmit Scheduler

Packet descriptors are de-queued from the egress port traffic class queue according to the configured scheduling algorithm:
• Deficit Weighted Round Robin
• Shaped Weighted Round Robin
• Strict Priority

3.7.2.6 Header Alteration

When the packet is read from the buffers memory for transmission, its header is altered according to its descriptor content and according to the type of port from which it is being sent.

The packet's header may be modified by one or more of the following actions:
• In the 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273, 98DX169, 98DX249, and 98DX269 devices, if a packet is routed, its header is changed as follows:
  – MAC DA is modified to reflect the next hop MAC.
  – MAC SA is modified to reflect the router's MAC.
  – VLAN is modified to reflect the next hop subnet.
  – IPv4 header TTL field is decremented (optional).
  – IPv6 hop limit is decremented (optional).
• VLAN Tag added/removed/modified
• IEEE 802.1p user priority remarked
• IPv4/IPv6 DSCP remarked
• If the IPv4 header is modified, by either TTL decrement for routed packets or DSCP remarking, its checksum is recalculated.

If the packet is sent via a cascade port or to the CPU, a DSA tag is attached.

3.7.2.7 Ports MAC Tx

After the packet has been read from the buffers memory and its header has been altered, it is transmitted via the port's MAC, which performs the IEEE 802.3 MAC functionality. If the packet header has been altered, the MAC generates a new CRC and appends it to the packet. If necessary, it pads the packet to a MinFrameSize of 64 bytes.

Section 4. Distributed Switching Architecture

This section describes the Marvell® Distributed Switching Architecture (DSA).

The DSA architecture allows multiple devices to be cascaded through any of its Ethernet MAC port interfaces with other devices, or with any Marvell device that supports DSA tag cascading.

Up to 32 devices can be cascaded to create a single cascaded system.

A cascaded system of devices in these three families supports the same features as a non-cascaded single device in these three families. This includes:
• Trunk groups with port members on multiple devices in the system.
• Mirroring to analyzer port on any device in the system.
• Traffic to the CPU can be sent through any device in the system.
• CPU can inject traffic to be transmitted out through a port on any device in the system.

4.1 Cascade Ports

A device's port used for interconnecting Marvell devices is configured as a cascade port.

All traffic sent and received on cascade ports is always DSA-tagged (Section 4.6 "DSA Tag"). Consequently, the cascade ports should only connect to other cascade ports.

Multiple cascade ports can be configured as a trunk group, to support large bandwidth inter-device connections (Section 13.4 "Trunking over Cascade Link" on page 291).

Cascade ports should be members of all active VLANs. However, because the DSA tag replaces the packet's VLAN Tag (see Section 4.6 "DSA Tag"), the VLAN tagged state for cascade ports is not relevant.

To allow the CPU to transmit a packet to any port in the system, and to learn about received packets (e.g., CPU Code, source device/port), the CPU port must be configured as a cascade port (Section 7.1 "CPU Port Number" on page 102); however, it cannot be a member of any VLAN.

Configuration
• To configure a port as a cascade port, set the <Cascading Port[26:0]> field in the Cascading and Header Insertion Configuration Register (Table 528 p. 770).
• To configure a CPU port as a cascade port, set the <CPUPort DSAtagEn> bit in the Cascading and Header Insertion Configuration Register (Table 528 p. 770).
• To configure the cascade port as a member of all active VLANs, set the corresponding port as a member in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

4.2 Single-Target Destination in a Cascaded System

A packet with a single-target destination from the CPU or a cascade or network port is associated with, among other parameters, a destination device number and port number. The destination port may be the CPU port or any of the other local ports on the destination device.

If the destination device is not the local device number, then the packet must be sent through a cascade port that leads to the destination device (either directly or through intermediate devices).

To facilitate this, each device supports a Device Map table that maps a destination device number to a cascade port or cascade group. A single-target destination packet whose destination device is not the local device is sent out the cascade port or trunk group specified in the Device Map table entry for the given destination device.

In the example in Figure 9, four devices are connected in a ring topology. The cascade ports on each device are labeled P0 and P1. In this example, a network port on device 0 receives a known Unicast packet and is assigned a forwarding target destination port on device 2. The Device Map table on device 0 indicates that the packet must be egressed on P1 to reach device 2. The packet is then received on device 1, where its Device Map table indicates that the packet must be egressed on P1 to reach device 2. The packet is then received on device 2 and egressed on the target destination port.
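The hop-by-hop lookup in this example can be traced in a few lines of code. The sketch below transcribes the four Device Map tables given in Figure 9; the ring wiring in `LINKS` is an assumption inferred from the walkthrough (P1 of each device leads to the next device in the ring and P0 to the previous one), and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# DEVICE_MAP[local_device][destination_device] -> cascade port.
# None marks the local device itself (shown as N/A in Figure 9).
DEVICE_MAP: Dict[int, Dict[int, Optional[str]]] = {
    0: {0: None, 1: "P1", 2: "P1", 3: "P0"},
    1: {0: "P0", 1: None, 2: "P1", 3: "P1"},
    2: {0: "P1", 1: "P0", 2: None, 3: "P1"},
    3: {0: "P1", 1: "P1", 2: "P0", 3: None},
}

# Assumed ring wiring: (device, cascade port) -> neighbouring device.
LINKS: Dict[Tuple[int, str], int] = {
    (0, "P1"): 1, (1, "P1"): 2, (2, "P1"): 3, (3, "P1"): 0,
    (0, "P0"): 3, (1, "P0"): 0, (2, "P0"): 1, (3, "P0"): 2,
}


def trace(src_dev: int, dst_dev: int, max_hops: int = 32) -> List[Tuple[int, str]]:
    """Return the (device, egress cascade port) hops from src_dev to dst_dev."""
    hops: List[Tuple[int, str]] = []
    dev = src_dev
    while dev != dst_dev:
        if len(hops) >= max_hops:
            raise RuntimeError("Device Map tables contain a forwarding loop")
        port = DEVICE_MAP[dev][dst_dev]
        assert port is not None  # only the local device maps to N/A
        hops.append((dev, port))
        dev = LINKS[(dev, port)]
    return hops
```

`trace(0, 2)` returns `[(0, "P1"), (1, "P1")]`, matching the walkthrough: device 0 egresses on P1, device 1 egresses on P1, and the packet arrives at device 2. The `max_hops` guard reflects the 32-device limit of a cascaded system and catches misconfigured tables that would otherwise loop.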
Figure 9: Example of Single-Target Destination Forwarding in a Cascaded System

[Figure: devices 0 through 3 connected in a ring through cascade ports P0 and P1. A known Unicast packet enters device 0 and is carried with a FORWARD DSA tag through device 1 to device 2.]

Device Map tables (cascade port used to reach each destination device):

Destination    On Dev 0    On Dev 1    On Dev 2    On Dev 3
Dev 0          N/A         P0          P1          P1
Dev 1          P1          N/A         P0          P1
Dev 2          P1          P1          N/A         P0
Dev 3          P0          P1          P1          N/A

Configuration
• To configure the local device number, set the <DeviceID[4:0]> field in the Global Control Register (Table 84 p. 377) accordingly.
• For each device number in a cascaded system (excluding the local device number), set the corresponding Device<n> Map Table Entry (0<=n<32) (Table 471 p. 728) accordingly.

Note
Every device in a cascaded system must be assigned a unique 5-bit device number.

4.3 Multi-Target Destination in a Cascaded System

A multi-destination packet (i.e., unknown Unicast, Multicast, or Broadcast) is forwarded by the bridge egress processing according to the packet VLAN and its VIDX Multicast group assignment (Section 11.5 "Bridge Multicast (VIDX) Table" on page 240).

However, if the topology of the cascaded system is not a Spanning Tree, i.e., a loop exists in the topology (e.g., as in a ring topology), then multiple copies of the multi-target destination packets may be received by devices.
The device supports a unique feature that allows Multicast destination packets to be flooded in the cascaded topology according to a source device rooted Spanning Tree. The packet is assigned a unique source device identifier by the ingress device, and this “source ID” value is propagated with the packet as part of the DSA tag. The egress pipeline of each device performs egress filtering of packets based on their source-ID. A port bitmap per source-ID defines whether a packet originating from a given source device should be filtered or forwarded on the corresponding port.

Once the cascade topology is learned, the system management software calculates, for each device in the system, a Spanning Tree rooted at that device and reaching all the other devices in the system. The cascade port(s) on the leaf devices of the Spanning Tree are configured to be filtered for this source-ID Spanning Tree (Section 11.14 "Bridge Source-ID Egress Filtering" on page 257).

In the example in Figure 10, four devices are connected in a ring topology. The Source-ID table on each device keeps the packet from looping around the ring. The Source-ID table forwards the packet along a Spanning Tree path, reaching each device exactly once. For clarity, the Source-ID table in Figure 10 shows the port bitmap only for the cascade ports P0 and P1 on each device. The network ports on each device are all set to ‘1’ (i.e., forward).

In the example, a Multicast packet is received on one of the network ports on device 0 and is assigned source-ID 0. The packet is assigned a VLAN and VIDX Multicast group index, and is flooded to the local member ports on the device. Note that the VLAN/VIDX must always include the cascade ports. The flooding of this packet through the ring is depicted by the red arrows.
Specifically, the Source-ID table on device 0 enables the packet to be forwarded on its cascade ports 0 and 1 to reach devices 3 and 1 respectively. Device 3 allows the packet to be flooded to device 2, but the Source-ID table on device 1 does not permit the packet to be forwarded to device 2, and the Source-ID table on device 2 does not permit the packet to be forwarded to device 1. Multi-target packets received by device 0 are therefore forwarded along a Spanning Tree path where device 0 is the root, and each device in the stack is reached.

To continue the example, a Multicast packet is received on a network port on device 1. Here the packet is flooded along a different Spanning Tree path (the blue arrows) to each of the devices in the stack.

Note
The cascade port(s) on the device must be fixed members of all VLAN and Multicast groups, to ensure that the packet is propagated to all devices in the system.
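The effect of the Source-ID egress filter in this example can be checked with a short simulation. This is a sketch under stated assumptions: the bitmaps are transcribed from the Source-ID tables of Figure 10, the ring wiring (P1 leads to the next device, P0 to the previous one) is inferred from the description above, and the names `SRC_ID_FILTER`, `neighbor`, and `flood` are hypothetical.

```python
# Illustrative model only; not a device API.
# SRC_ID_FILTER[dev][src_id] = (P0 bit, P1 bit); 1 = forward, 0 = filter.
# Values transcribed from the Source-ID tables in Figure 10.
SRC_ID_FILTER = {
    0: {0: (1, 1), 1: (1, 0)},
    1: {0: (0, 0), 1: (1, 1)},
    2: {0: (0, 0), 1: (0, 0)},
    3: {0: (1, 0), 1: (0, 0)},
}
NUM_DEVICES = len(SRC_ID_FILTER)


def neighbor(dev, port):
    """Assumed ring wiring: port 1 (P1) leads to dev+1, port 0 (P0) to dev-1."""
    step = 1 if port == 1 else -1
    return (dev + step) % NUM_DEVICES


def flood(ingress_dev, src_id, max_copies=64):
    """Flood a multi-target packet and count the copies each device receives."""
    copies = {dev: 0 for dev in SRC_ID_FILTER}
    copies[ingress_dev] = 1          # the copy received on the network port
    pending = [ingress_dev]
    while pending:
        dev = pending.pop()
        for port, forward in enumerate(SRC_ID_FILTER[dev][src_id]):
            if forward:
                nxt = neighbor(dev, port)
                copies[nxt] += 1
                if copies[nxt] > max_copies:
                    raise RuntimeError("packet is looping around the ring")
                pending.append(nxt)
    return copies


if __name__ == "__main__":
    print(flood(0, 0))  # packet received on device 0, source-ID 0
    print(flood(1, 1))  # packet received on device 1, source-ID 1
```

With the Figure 10 bitmaps every device receives exactly one copy for both source-IDs. Setting all bitmaps to (1, 1) in this model makes the flood raise the loop error, which is the situation the filter is meant to prevent.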
Figure 10: Example of Multi-Destination Forwarding in a Cascaded System

[Figure: devices 0 through 3 connected in a ring through cascade ports P0 and P1, each with its Src-ID table. Red arrows: Multicast packet received on a network port on Device 0. Blue arrows: Multicast packet received on a network port on Device 1.]

Source-ID Egress Filter table on each device (cascade port bitmap P0/P1):

SrcID    Dev 0    Dev 1    Dev 2    Dev 3
0        1/1      0/0      0/0      1/0
1        1/0      1/1      0/0      0/0

4.4 Loop Detection

In the event of a topology change or misconfiguration of the Device Map or Source-ID Egress Filtering, a packet may be looped back to its original source device.

To prevent such packets from continuing to loop between devices in the cascaded system, a global configuration can be enabled to discard received DSA-tagged packets whose DSA tag <source device> is equal to the local device number.

This filter is applied to all types of DSA-tagged packets, with the exception of FROM_CPU, which does not contain a <source device> field.

Configuration
To enable ingress filtering of a DSA-tagged packet whose source device is equal to the local device, set the <DSATag SrcDevIsLocalFilterDis> field in the Bridge Global Configuration Register1 (Table 371 p. 647) accordingly.

4.5 QoS on Cascade Interface

When oversubscribed, a cascade interface may suffer from congestion on its egress traffic class queues, leading to packet loss.

Traffic on cascade ports is classified as either data, control, or mirror-to-analyzer.
Control traffic is defined as either traffic to the CPU, or traffic from the CPU that is specified as control traffic. Control traffic is further classified as either CPU-to-CPU traffic, or “other” traffic to/from the CPU.

To segregate control traffic from data and mirror-to-analyzer traffic, control traffic is assigned a configurable traffic class on the cascade port.

To segregate CPU-to-CPU traffic (typically used for internal system control) from other, less critical traffic to/from the CPU, each type of control traffic (CPU-to-CPU and Other-Control) is assigned a configurable drop precedence.

Data traffic, defined as network-to-network traffic, is sent across a cascade port according to a global table that maps the packet traffic class and drop precedence to a cascade port traffic class and drop precedence.

Mirror-to-analyzer traffic is assigned a dedicated traffic class and drop precedence for ingress mirrored traffic, and a dedicated traffic class and drop precedence for egress mirrored traffic.

4.6 DSA Tag

In a cascaded system, the DSA tag records the relevant packet information that must be passed from one device to another, to correctly process the packet.

The DSA tag architecture is extensible, allowing additional fields to be added in new generation devices, while remaining backward-compatible with older devices. The first version of the DSA tag specified a 4-byte DSA tag format. The enhanced version of the DSA tag implements an extended 8-byte DSA tag.
All packets transmitted through the device’s cascade ports must contain either a 4-byte DSA tag (from legacy devices) or an extended 8-byte DSA tag. Legacy devices accept the extended tag from the device, but only recognize the first 4 bytes of the tag.

The DSA tag is a superset of the fields contained in the IEEE 802.1Q tag. Thus a packet that arrives from a network port with an IEEE 802.1Q tag and is forwarded through a cascade port has its 4-byte Q-tag converted into an extended 8-byte DSA tag, without any loss of the original Q-tag information.

Because a port configured as a cascade port always sends and receives DSA-tagged packets, there is no need for a special Ethertype to identify the DSA tag. The DSA tag appears directly after the MAC source address, as illustrated in Figure 11.

A packet received from a cascade port and transmitted through a network (i.e., non-cascade) port has its DSA tag either stripped or converted to an 802.1Q tag.

Figure 11: DSA Tag in the Ethernet Frame

Field                     Size
Preamble                  7 octets
SFD                       1 octet
Destination Address       6 octets
Source Address            6 octets
Marvell Tag (Extended)    8 octets
Length/Type               2 octets
MAC Client Data
FCS                       4 octets

The 8-octet tag spans bits b63 through b0 and consists of the Tag Command followed by the DSA Tag Data. Tag Command encoding:
00 - TO_CPU
01 - FROM_CPU
10 - TO_ANALYZER
11 - FORWARD

For further details on the DSA tag, see Appendix A. "DSA Tag Formats" on page 333.

For further details on configuration for the cascade port QoS, see Section 8.4.1 "Traffic Class and Drop Precedence Assignment" on page 123.
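As an illustration of the frame layout in Figure 11, the following sketch extracts the Tag Command from a frame captured on a cascade port. It assumes that the frame buffer starts at the Destination Address (preamble and SFD already removed), that the tag is the extended 8-byte format, and that the Tag Command occupies the two most significant bits (b63:b62) of the tag. The bit position is read from Figure 11 and should be confirmed against Appendix A. The names `TAG_COMMANDS` and `tag_command` are hypothetical.

```python
# Illustrative sketch only; field positions are assumptions (see lead-in).
MAC_ADDR_LEN = 6
DSA_TAG_OFFSET = 2 * MAC_ADDR_LEN   # tag follows the DA and SA
EXT_DSA_TAG_LEN = 8                 # extended DSA tag, in octets

# Tag Command encoding from Figure 11.
TAG_COMMANDS = {
    0b00: "TO_CPU",
    0b01: "FROM_CPU",
    0b10: "TO_ANALYZER",
    0b11: "FORWARD",
}


def tag_command(frame: bytes) -> str:
    """Return the DSA Tag Command name of a frame received on a cascade port."""
    end = DSA_TAG_OFFSET + EXT_DSA_TAG_LEN
    if len(frame) < end:
        raise ValueError("frame too short to hold an extended DSA tag")
    tag = int.from_bytes(frame[DSA_TAG_OFFSET:end], "big")
    # Assumed position: Tag Command in bits b63:b62 of the 64-bit tag.
    return TAG_COMMANDS[tag >> 62]


if __name__ == "__main__":
    da_sa = bytes(12)                        # placeholder addresses
    tag = bytes([0b11 << 6]) + bytes(7)      # Tag Command = 11, other bits zero
    frame = da_sa + tag + b"\x08\x00"        # followed by Length/Type
    print(tag_command(frame))
```

No Ethertype check is needed in this sketch because, as stated above, a cascade port always carries DSA-tagged packets.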
4.6.1 DSA Tag Commands

Packets sent between devices (through cascade ports) are always DSA-tagged.

There are four DSA tag commands:
• FORWARD
• TO_CPU
• FROM_CPU
• TO_ANALYZER

For each command there is a different DSA tag format, which carries the relevant fields for the respective command.

4.6.1.1 FORWARD DSA Tag Command

The FORWARD command indicates that the packet is forwarded to a specified destination device/port, trunk group, or Multicast group.

The extended 8-byte FORWARD DSA tag includes the forwarding destination of the packet, as determined by the ingress pipeline engines of the ingress device.

In a homogeneous system where all devices support the same ingress processing engines, it is recommended to disable ingress processing engines on the cascade port, and to rely on the FORWARD DSA tag decision made by the ingress device (Section 11.1 "Bypassing Bridge Engine" on page 203 and Section 10.4.2 "Enabling Policy Engine Processing" on page 183). In heterogeneous systems, the cascade port may be enabled for ingress engine processing.

The legacy 4-byte DSA tag does not include the forwarding destination. Packets received on a cascade port with the 4-byte FORWARD DSA tag are subject to processing by the ingress pipeline engines. This is the case when the device is connected to a legacy device.
Note
The DSA tag determines how the packet is processed by the receiving device.

If the packet is received on a cascade port with a FORWARD DSA tag whose destination is a remote device, or is received on a network port and the forwarding destination is a remote device, the packet is sent with a FORWARD DSA tag via the cascade port configured in the Device Map table for the requested destination device (Section 4.2 "Single-Target Destination in a Cascaded System").

If the FORWARD destination is Multicast, the packet is flooded according to the packet’s VLAN and Multicast group assignment (Section 11.5 "Bridge Multicast (VIDX) Table" on page 240). If the packet is queued for transmission on a local network port(s), the packet is transmitted without the DSA tag. If the packet is queued for transmission on a cascade port, the packet is transmitted with a FORWARD DSA tag.

For further details on the DSA tag FORWARD format, see Appendix A. "DSA Tag Formats" on page 333.

4.6.1.2 TO_CPU DSA Tag Command

Packets are sent to the CPU with the TO_CPU DSA tag, which provides the CPU with all the relevant information regarding the packet. The CPU port must be configured as a cascade port to receive this information.

A packet is sent to the CPU as a result of one of the following actions:
• Packet is assigned a TRAP command by an ingress processing engine (Section 5.1.3 "TRAP Command" on page 53).
• Packet is assigned a MIRROR command by an ingress processing engine (Section 5.1.2 "Mirror-to-CPU Command" on page 53).
• Packet is assigned a FORWARD command to the virtual CPU port 63 by an ingress processing engine (Section 5.1.1 "FORWARD Command" on page 52).
• Packet is selected for sampling to the CPU by the ingress or egress port sampling mechanism (Section 16.1 "Traffic Sampling to the CPU" on page 312).

A configurable CPU code table defines a set of configurable attributes associated with each CPU code. This set of attributes includes the packet QoS, statistical sampling ratio, truncation enable, and the device number through which the packet is sent to the CPU (Section 7.2.1 "CPU Code Table" on page 103).

The destination device number for a given CPU code can be the local device or a remote device. If the target is the local device CPU, the packet is sent to the local device CPU packet interface with the TO_CPU DSA tag. If the target is a remote device, the packet is sent with a TO_CPU DSA tag over a cascade port selected according to the Device Map table for the destination device (Section 4.2 "Single-Target Destination in a Cascaded System"). The packet is queued on the cascade port according to the configured control traffic class, and the configured drop precedence for Other-Control traffic (Section 4.5 "QoS on Cascade Interface").

Packets received on a cascade port with the TO_CPU DSA tag are not subject to processing by ingress pipeline engines, even if enabled on the port. The packet is forwarded to the destination CPU device derived from the DSA tag CPU code, as described above.

For further details on the DSA tag TO_CPU format, see A.1 "Extended DSA Tag in TO_CPU Format" on page 333.

For further details on packets to/from the CPU, see Section 6.1 "PCI Interface" on page 58.

4.6.1.3 FROM_CPU DSA Tag Command

The CPU can inject a packet for a specific destination by creating a FROM_CPU DSA-tagged packet.
The FROM_CPU DSA tag contains all the relevant information required by the device to send the packet to the specified destination.

The FROM_CPU tag has special provisions for:
• Sending the packet to the CPU attached to an adjacent neighbor device without knowing its device number. This “mailbox” mechanism is useful for topology discovery in a multi-CPU system.
• Sending to a VLAN or Multicast group, excluding a device/port or trunk group.
• Marking the packet as “control”, so it is queued on the cascade control traffic class.
• Disabling egress VLAN and Spanning Tree egress filtering. This is important for sending BPDUs out ports that are in the blocked Spanning Tree state.

For more details on the FROM_CPU DSA tag, see Section 7.2 "Packets to the CPU" on page 102 and Section 7.3 "Packets from the CPU" on page 107.

Packets received on a cascade port with the FROM_CPU DSA tag are not subject to processing by ingress pipeline engines, even if enabled on the port. The packet is forwarded to the target destination defined in the FROM_CPU DSA tag.

If the FROM_CPU DSA tag has a single destination with a target device equal to the local device, the packet is transmitted on the local port without the DSA tag.

If the FROM_CPU DSA tag has a single destination with a target device not equal to the local device, the packet is sent with the FROM_CPU DSA tag on the cascade port configured in the Device Map table for the destination device (Section 4.2 "Single-Target Destination in a Cascaded System").

Among the fields in the TO_CPU DSA tag is an 8-bit CPU code, which defines the mechanism responsible for sending the packet to the CPU.
4.6.1.4 TO_ANALYZER DSA Tag Command

The device supports ingress and egress mirroring to an analyzer port.

The device supports a configurable ingress analyzer port and a configurable egress analyzer port. The destination analyzer port may be a network port on the local device or a remote device. When an analyzer port is not local to a device, the cascade port leading to the device supporting that analyzer port is configured to be the analyzer port for this device.

If a packet is marked to be ingress or egress mirrored to the analyzer port and the respective analyzer port is a cascade port, the packet is sent with the TO_ANALYZER DSA tag.

Packets received on a cascade port with the TO_ANALYZER DSA tag are not subject to processing by ingress pipeline engines, even if enabled on the port. The packet is forwarded to the ingress or egress analyzer port configured in this device.

Packets transmitted out the analyzer network port are sent without the TO_ANALYZER DSA tag.

For more details on mirroring to an analyzer port, see Section 16.2 "Traffic Mirroring to Analyzer Port" on page 314.

4.7 Cascading

In a cascaded system, the Policy, Bridge, Unicast Routing, and Policing Ingress Processing engines should be enabled only on network ports and disabled on cascade ports.

In this model, the ingress device of a cascaded system performs the policy, bridging, Unicast routing, and policing of traffic. If the packet is sent through a cascade port to adjacent devices in the system, the packet is processed based on its DSA tag information only.
Configuration
In the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567):
• To disable Bridge engine processing on a port, set the <Bridge BypassEn> bit.
• To disable Policy and Policing engine processing on a port, clear the <PolicyEn> bit.

Note
Even when the bridge engine is disabled, the Source address lookup for address learning is still performed (see Section 11.4.7 "FDB Source MAC Learning" on page 231).

If the FROM_CPU destination is Multicast, the packet is flooded according to the VLAN and Multicast group assignment defined in the FROM_CPU DSA tag (Section 11.5 "Bridge Multicast (VIDX) Table" on page 240). If the packet is queued for transmission on a local network port(s), the packet is transmitted without the DSA tag. If the packet is queued for transmission on a cascade port, the packet is transmitted with a FROM_CPU DSA tag. To prevent loops in non-Spanning Tree cascading topologies, see Section 4.3 "Multi-Target Destination in a Cascaded System".

Section 5. Packet Command Assignment and Resolution

This section describes packet command assignment and resolution.

At each stage of processing in the ingress pipeline, a packet is associated with a specific command value.

5.1 Ingress Packet Command Assignment

Packets received on network ports are assigned the packet command FORWARD at the beginning of the ingress pipeline. The packet command may then be modified by the ingress pipeline engines enabled on the given port.

Packets received on cascade ports must always be DSA-tagged (Appendix A. "DSA Tag Formats" on page 333).
DSA-tagged packets received with the FORWARD command may be processed by the ingress processing engines, according to the cascade port configuration (Section 11.1 "Bypassing Bridge Engine" on page 203 and Section 10.4.2 "Enabling Policy Engine Processing" on page 183).

Note
In a homogeneous system where all devices support the same ingress processing engines, it is recommended to disable ingress processing engines on the cascade port, and to rely on the FORWARD DSA tag decision made by the ingress device. In heterogeneous systems, the cascade port may be enabled for ingress engine processing.

DSA-tagged packets received with the command TO_CPU, FROM_CPU, or TO_ANALYZER are not eligible for ingress processing, regardless of the cascade port configuration, i.e., the DSA tag command is final and is treated as such by all devices in the system.

The following packet commands can be assigned by the ingress processing engines to packets received from network ports and to packets received from cascade ports with a FORWARD DSA tag:
• FORWARD
• MIRROR
• TRAP
• SOFT DROP
• HARD DROP

Marking a packet for ingress-mirroring-to-analyzer and/or ingress-sampling-to-CPU is orthogonal to the packet command, i.e., a packet can have any of the above packet commands and still be ingress mirrored to the analyzer port and/or ingress sampled to the CPU (Section 16.1 "Traffic Sampling to the CPU" on page 312 and Section 16.2 "Traffic Mirroring to Analyzer Port").

5.1.1 FORWARD Command

The FORWARD packet command sends the packet to the assigned single or multi-target destination, where a single-target destination is a device/port or trunk group, and a multi-target destination is a VLAN-ID (VID) and Multicast group index (VIDX).

Packets received on network ports are assigned the initial packet command FORWARD and a null destination assignment.
Packets received on cascade ports with the FORWARD DSA tag are assigned the initial packet command FORWARD. If the packet has an extended FORWARD DSA tag, the destination in the DSA tag is used as the packet destination.

A packet with the FORWARD command and destination port set to the virtual port 63 is sent to the CPU with a TO_CPU DSA tag, according to the attributes defined in the CPU code table for the packet CPU code assignment (see Section 7.2.1 "CPU Code Table" on page 103).

Notes
• The destination device number is ignored when the destination port is 63. The TO_CPU packet is sent to the target device specified in the CPU Code table.
• The FORWARD packet command sends a packet to the CPU only if the destination port is port 63. Multi-target packets (Unknown Unicast, Multicast, and Broadcast) with the FORWARD packet command do not reach the CPU, as the CPU cannot be a member of a VLAN or Multicast group.

5.1.2 Mirror-to-CPU Command

The MIRROR command is a superset of the FORWARD command. In addition to being sent to its destination device/port, trunk group, or VID/VIDX group, the packet is also sent to the CPU.
A copy of the packet is sent to the CPU with a TO_CPU DSA tag, according to the attributes defined in the CPU code table for the packet CPU code assignment (Section 7.2.1 "CPU Code Table").

5.1.3 TRAP Command

The TRAP command forwards the packet to the CPU only. The packet destination assignment is ignored.

The packet is sent to the CPU with a TO_CPU DSA tag, according to the attributes defined in the CPU code table for the packet CPU code assignment (Section 7.2.1 "CPU Code Table").

5.1.4 SOFT/HARD DROP Command

The HARD/SOFT DROP command prevents the packet from being sent to its FORWARD destination device/port, trunk group, or VID/VIDX group.

The HARD DROP also prevents the packet from being trapped or mirrored to the CPU by any mechanism in the ingress pipeline.

The SOFT DROP does not prevent the packet from being trapped or mirrored to the CPU, i.e., if some other mechanism assigns a MIRROR or TRAP command, a SOFT DROP command still allows the packet to be sent to the CPU (but the packet is not sent to any other destination).

5.2 Command Resolution Matrix

Each ingress pipeline engine can assign a new packet command to the packet. However, the new command is not automatically applied to the packet. The state machine in Table 2 defines the command resolution rules for assigning the packet command. The left column is the packet command from the previous engine and the top row is the new packet command assignment.

At the end of the ingress pipeline there is a single packet command that is assigned to the packet.
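The resolution rules of Table 2 can be expressed as a lookup that is applied once per engine. The following is a sketch only: the matrix entries are transcribed from Table 2 and should be checked against it, and the names `RESOLUTION` and `resolve` are hypothetical rather than part of any device software interface.

```python
# Illustrative model only; entries transcribed from Table 2.
from functools import reduce

F, M, T, S, H = "FORWARD", "MIRROR", "TRAP", "SOFT DROP", "HARD DROP"

# RESOLUTION[previous][new] -> resulting packet command.
RESOLUTION = {
    F: {F: F, M: M, T: T, S: S, H: H},
    M: {F: M, M: M, T: T, S: T, H: H},
    T: {F: T, M: T, T: T, S: T, H: H},
    S: {F: S, M: T, T: T, S: S, H: H},
    H: {F: H, M: H, T: H, S: H, H: H},
}


def resolve(assignments, initial=F):
    """Fold the commands assigned by successive engines into one command.

    `initial` is FORWARD because packets enter the pipeline with that command.
    """
    return reduce(lambda prev, new: RESOLUTION[prev][new], assignments, initial)


if __name__ == "__main__":
    # Policy engine assigns SOFT DROP, then the bridge engine assigns MIRROR.
    print(resolve([S, M]))
```

In this model the outcome does not depend on the order in which the engines run: HARD DROP overrides everything, TRAP overrides the remaining commands, and MIRROR combined with SOFT DROP becomes TRAP, because the packet reaches the CPU but no other destination.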
Table 2: Packet Command Resolution

Rows: previous packet command. Columns: new packet command.

Previous \ New    FORWARD      MIRROR       TRAP         SOFT DROP    HARD DROP
FORWARD           FORWARD      MIRROR       TRAP         SOFT DROP    HARD DROP
MIRROR            MIRROR       MIRROR       TRAP         TRAP         HARD DROP
TRAP              TRAP         TRAP         TRAP         TRAP         HARD DROP
SOFT DROP         SOFT DROP    TRAP         TRAP         SOFT DROP    HARD DROP
HARD DROP         HARD DROP    HARD DROP    HARD DROP    HARD DROP    HARD DROP

Notes
• If multiple mechanisms assign the TRAP command, the CPU code reflects the last mechanism to assign the TRAP command.
• If multiple mechanisms assign the MIRROR command, the CPU code reflects the last mechanism to assign the MIRROR command.
• If the packet command is updated from MIRROR to TRAP, the CPU code is always updated to reflect the TRAP CPU code.

Example
If the Policy engine has a rule whose action assigns the SOFT DROP command to the packet, and the bridge engine has a mechanism that assigns the MIRROR command to the packet, then the resulting packet command is TRAP.

Section 6. Host Management Interfaces

This section describes the device’s Host Management interfaces. The device incorporates the Host Management interfaces listed in Table 3.

Table 3: Host Management Interfaces

Interface: PCI interface
Relevant Devices: 98DX130, 98DX133, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX270, 98DX273, 98DX803
Purpose:
• Forwarding packets to/from the host CPU.
• Asynchronous message notification of the device FDB events.
• Read and write access to all the device's address-mapped entities.
• Interrupt handling.

Interface: CPU MII/GMII/RGMII Ethernet Port interface
Relevant Devices: 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803
Purpose:
• Forwarding packets to/from the host CPU for the devices that do not incorporate a PCI interface or for the devices that do incorporate a PCI interface but choose to use this interface instead.

Interface: IEEE 802.3 Clause 22 compliant Slave SMI interface
Relevant Devices: 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803
Purpose:
• Address-mapped read and write access for the devices that do not incorporate a PCI interface or for the devices that do incorporate a PCI interface but choose to use this interface instead.

Interface: TWSI interface
Relevant Devices: 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803
Purpose:
• Initialization of the device from an EEPROM device.
• Address-mapped entities read and write access.
Table 3: Host Management Interfaces (Continued)

Interface: Two IEEE 802.3 Clause 22 compliant Master SMI interfaces
Relevant Devices: 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273
Purpose:
• Read/write access to the PHY devices attached to the device’s Tri-Speed ports.
• Out-Of-Band Auto-Negotiation with the PHY devices attached to the device’s Tri-Speed ports.

Interface: IEEE 802.3 Clause 45 compliant Master XSMI interfaces
Relevant Devices: 98DX130, 98DX133, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803
Purpose:
• Read/write access to HyperG.Stack ports 24, 25, and 26 integrated XAUI PHYs registers via their Slave SMI Interface.
• Read/write access to an external device (e.g., 88X2010 XFP PHY) attached to the device’s HyperG.Stack ports.

Interface: IEEE 802.3 Clause 45 compliant Slave XSMI interface, per HyperG.Stack port
Relevant Devices: 98DX130, 98DX133, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803
Purpose:
• Read/write access to HyperG.Stack ports 24, 25, and 26 integrated XAUI PHYs registers.

Figure 12 and Figure 13 illustrate the device’s Host Management interfaces.
Figure 12: Host Management Interfaces: 98DX130, 98DX133, 98DX250, 98DX253, 98DX260, 98DX263, 98DX270, 98DX273, and 98DX803
[Block diagram. Labels recoverable from the figure: Host CPU, EEPROM, PCI or MII/GMII/RGMII, TWSI, CPU_MDC/CPU_MDIO, M_MDC0/M_MDIO0, M_MDC1/M_MDIO1, M_XMDC/M_XMDIO, S_XMDC0/S_XMDIO0, S_XMDC1/S_XMDIO1, S_XMDC2/S_XMDIO2, Alaska Gigabit PHY (4 x SGMII), XAUI, and x2010 XFP PHY. Several connections are marked as applying to specific device subsets only.]

Figure 13: Host Management Interfaces: 98DX106, 98DX107, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX262, and 98DX269
[Block diagram. Labels recoverable from the figure: Host CPU, EEPROM, MII/GMII/RGMII, TWSI, CPU_MDC/CPU_MDIO, M_MDC0/M_MDIO0, M_MDC1/M_MDIO1, M_XMDC/M_XMDIO, S_XMDC0/S_XMDIO0, S_XMDC1/S_XMDIO1, Alaska Gigabit PHY (4 x SGMII), XAUI, and x2010 XFP PHY. Several connections are marked as applying to specific device subsets only.]

6.1 PCI Interface

This section is relevant for the following devices:
• Layer 2+ Stackable: 98DX130, 98DX250, 98DX260, 98DX262, 98DX270, 98DX803
• Multilayer Stackable: 98DX133, 98DX253, 98DX263, 98DX273
Not relevant for the SecureSmart and SecureSmart Stackable devices.

The PCI interface is used for:
• Forwarding packets to/from the host CPU.
• Asynchronous message notification of the device FDB events.
• Address-mapped entities read and write access.
• Interrupt handling.

The device has a 32-bit, 66 MHz, 3.3V PCI bus, which is compliant with Revision 2.1 of the PCI specification.

Note
The device does not support 5V tolerance.

The PCI interface consists of a Master unit and a Slave unit. The Slave unit responds to all transactions from the host PCI controller and host PCI bridge. The Master unit is used by the device's internal DMAs to access the host memory for packet transmission and receipt, and for sending asynchronous Address Update messages to host memory.

6.1.1 PCI Slave Unit
The PCI Slave unit is used to access the device's address-mapped memory regions and the address-mapped memory regions on the devices attached to the device via the various SMI interfaces. The host processor uses the PCI Slave unit to manage the device, i.e., to read and write registers and table entry data structures.

The PCI Slave unit has two internal buffers. Each buffer is dedicated to a single write transaction and can hold transactions of up to 32 bytes. These buffers are used to store data coming from and transmitted to the PCI.

The PCI Slave unit is also used to configure the DMAs that utilize the PCI Master unit.

The PCI Slave unit is capable of responding to the following PCI transactions:
• Memory Read
• Memory Write
• Memory Read Line
• Memory Read Multiple
• Memory Write and Invalidate
• Configuration Read
• Configuration Write

The PCI Slave unit does not support the following PCI transactions:
• I/O Read
• I/O Write
• Special Cycle
• Dual-Address-Cycle
• Lock
• Interrupt Acknowledge

Note
When the PCI interface is used for management of the device, the CPU MII/GMII/RGMII port cannot be used.

6.1.1.1 PCI Posted Write Operations

All PCI writes (except for configuration writes) are posted. The device incorporates dual 32-byte posted write buffers. In every posted write cycle, the data is first written to the buffers. When a buffer fills up (32 bytes) or PCI_FRAMEn is de-asserted, the data is written to the destination while the second buffer fills up. The initiating PCI master is released once the data is stored in one of the slave buffers.

6.1.1.2 PCI Non-Posted Write Operation

PCI configuration writes are non-posted writes. The slave asserts PCI_TRDYn only when data is actually written to the configuration register. This implementation guarantees that there is never a race condition between the PCI transaction changing address mapping (Base Address registers) and subsequent transactions.

6.1.1.3 PCI Read Operation

In the device, the PCI Slave unit supports conventional read accesses to the PCI. Upon detection of a read cycle, the slave fetches the required data from the device's internal register files or internal RAM arrays to its 32-byte prefetch buffer, and then sends it to the host device on the PCI. The device starts sending the data as soon as it reaches the PCI slave prefetch buffer.

Most of the internal registers and tables in the device's address space may be read one DWORD (4 bytes) at a time. When the CPU reads one of the internal registers, PCI_FRAMEn should be asserted for one cycle only, indicating a one-DWORD read transaction. Otherwise, the slave terminates the transaction using the Disconnect target termination.

The device supports burst reads from its internal memory address space. When accessing a large memory space in one of the burst-capable address spaces of the device, it is recommended to use the Memory Read Multiple transaction. In this kind of transaction the device's slave pre-fetches bursts from the internal memory so that the PCI transaction is faster and more efficient.

6.1.1.4 PCI Slave Termination

The device's PCI Slave unit supports the following three types of target termination events described in the PCI specification:
• Target Abort
• Retry
• Disconnect

The PCI specification requires that latency for the first data read does not exceed 15 cycles, and that latency for consecutive data (in the case of a burst) does not exceed 7 cycles. The slave has an idle timer which counts the number of PCI clock cycles while the target awaits data from the device's resources. The timer is activated when the PCI slave identifies a transaction to one of the device's BARs and is reset after a transaction occurs (assertion of PCI_TRDYn for reads and PCI_IRDYn for writes). In the following section, this timer is referred to as the Idle counter.

The slave has two configuration parameters, Timeout0 and Timeout1:
• Timeout0 is the latency for the first data read.
• Timeout1 is the latency for consecutive data transactions.

Configuration
• To configure the latency for the first data read, set the <Timeout0> field in the Timer and Retry Register (Table 308 p. 557) accordingly.
• To configure the latency for consecutive data transactions, set the <Timeout1> field in the Timer and Retry Register (Table 308 p. 557) accordingly.

Target Abort

Target Abort is activated when detecting a parity error on the address sent on the PCI. When detecting a Target Abort, the device performs the following:
• Terminates the transaction with Target Abort.
• Sets the <PCITarAbort> bit in the PCI Interrupt Cause Register (Table 565 p. 797).
• Asserts the PCI_SERRn signal when the <TarAbort> bit is set in the PCI_SERRn Mask Register (Table 309 p. 558).

Target Retry

The slave has a configurable timeout limit, which is applied to the Idle counter.
The limit is configured using the <Timeout0> field in the Timer and Retry Register (Table 308 p. 557).

The RETRY termination is activated in one of the following cases:
• In a write transaction, the Idle counter reaches the <Timeout0> limit and the buffers are still full of data from the previous transaction.
• In a read transaction, the Idle counter reaches the <Timeout0> limit and read data is not yet available.
• In non-posted writes (configuration writes), the Idle counter reaches the <Timeout0> limit and the write buffer is not empty.
In any of the above cases, the slave terminates the transaction with a RETRY termination. The master initiating the transaction is expected to issue the exact same transaction again.

Target Disconnect

The slave has a configurable timeout limit, which is applied to the Idle counter. The limit is configured using the <Timeout1> field in the Timer and Retry Register (Table 308 p. 557).

The slave generates a DISCONNECT termination in any of the following cases:
• Upon a burst read or write access with address bits [1:0] not equal to '00'.
• Upon a burst read or write access that reaches a BAR boundary.
• Upon a burst read or write access to internal registers.
• Upon a burst write in which <Timeout1> has expired and data n+1 cannot be stored in the slave buffers because they are full.
In any of the above cases, the slave terminates the transaction with a DISCONNECT termination.
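The Retry and Disconnect cases listed above can be summarized as a small decision function. The following Python sketch is a behavioral illustration only, not the device logic: the `SlaveAccess` fields, the `classify` function, and the example timeout values are hypothetical names and numbers introduced here, and the Timeout1 rule is applied to burst writes only because that is the only case the text states.

```python
from dataclasses import dataclass
from enum import Enum


class Termination(Enum):
    NONE = "none"
    RETRY = "retry"
    DISCONNECT = "disconnect"


@dataclass
class SlaveAccess:
    """Hypothetical snapshot of one access as seen by the PCI Slave unit."""
    address: int                        # PCI byte address of the access
    is_burst: bool                      # more than one data phase requested
    is_write: bool                      # True for writes, False for reads
    to_internal_register: bool = False  # target is a register, not burst-capable memory
    reaches_bar_boundary: bool = False  # burst runs up to the end of the BAR
    first_data_done: bool = False       # at least one data phase has completed
    resource_ready: bool = True         # read data available or write buffer free
    idle_count: int = 0                 # current value of the Idle counter


def classify(access: SlaveAccess, timeout0: int, timeout1: int) -> Termination:
    """Return the target termination suggested by the rules in the text."""
    if access.is_burst:
        # Burst with address bits [1:0] != '00'.
        if access.address & 0x3:
            return Termination.DISCONNECT
        # Burst that reaches a BAR boundary, or burst to internal registers.
        if access.reaches_bar_boundary or access.to_internal_register:
            return Termination.DISCONNECT
    if not access.resource_ready:
        # Before the first data phase: Idle counter reaching Timeout0 gives Retry.
        if not access.first_data_done and access.idle_count >= timeout0:
            return Termination.RETRY
        # Mid-burst write: Timeout1 expired and data n+1 cannot be buffered.
        if (access.first_data_done and access.is_burst and access.is_write
                and access.idle_count >= timeout1):
            return Termination.DISCONNECT
    return Termination.NONE
```

For example, with illustrative limits of 15 and 7 cycles, a single read whose data is still unavailable when the Idle counter reaches 15 is classified as Retry, while an unaligned burst is classified as Disconnect regardless of the counter.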
6.1.1.5 PCI Configuration Cycle

The device's PCI interface supports a Type 00 configuration space header, as defined in the PCI specification. The devices in these three families are single-function devices.

The slave responds to a Type 0 PCI configuration transaction when:
• IDSEL is active
• A configuration command is decoded
• AD[1:0] is '00' (Type 0 Configuration Command)

See Appendix C.12.1 "PCI Registers" on page 553 for a full description of the configuration registers supported by the device. Other optional registers that are specified in the PCI specification but are not implemented by the device return all 0s when read from the PCI.

6.1.1.6 Fast Back-to-Back

The device's slave is capable of supporting fast back-to-back transactions, compliant with the PCI specification and requirements. The slave can track bus transactions from a final data transfer (PCI_FRAMEn high, PCI_IRDYn low) directly to an address phase (PCI_FRAMEn low, PCI_IRDYn high) on consecutive clock cycles. Since the device's PCI_DEVSELn timer is set to medium, contentions can be avoided on the PCI_DEVSELn, PCI_PERRn, PCI_TRDYn, and PCI_STOPn signals.

6.1.1.7 I/O-Mapped Transactions

The device does not support I/O transactions on the PCI.

6.1.1.8 Memory-Mapped Transactions

This section describes the device's PCI slave memory-mapped transactions.

Address Spaces

The device consumes two address regions on the PCI:

PCI Internal Address Space
12-bit address space (base address is the 20 MSbs) that contains the PCI Configuration (Header Type 00) and debug registers. A PCI Internal Address Space transaction is answered by the device's PCI Slave unit if PCI_AD[31:12] equals the 20 MSbs of the PCI memory-mapped internal base address. The PCI internal address space is also accessible from the TWSI interface.

Device Address Space
26-bit address space (base address is the 6 MSbs). This address space is used to address the majority of the device's registers, as well as all internal memories. A Device Address Space transaction is answered by the device's PCI Slave unit if PCI_AD[31:26] equals the 6 MSbs of the device's memory-mapped internal base address.

Note
All memory-mapped transactions are accepted only if the <Memory SpaceEn> bit in the PCI Status and Command Register (Table 300 p. 553) is set.

Configuration
• The PCI Internal Address Space base address and the Device Address Space base address may be configured during the PCI configuration cycle or via the TWSI interface.
• To configure the base address for the PCI internal address space, set the <PCIBaseAddr> field in the PCI Memory Mapped Internal Base Address Register (Table 303 p. 556) accordingly.
• To configure the base address for the PCI device address space, set the <DevBaseAddr> field in the Device Memory Mapped Internal Base Address Register (Table 304 p. 556) accordingly.

Address Completion Register

The device consumes a 26-bit address space on the PCI bus; however, it uses 32 bits for its internal address space. To enable the host processor to access the device's entire address space, a window-based approach is used. The device defines four address windows (address regions) of 24 bits each. Every PCI access is mapped to one of these address regions.

Whenever the device identifies an access on the PCI bus to its address space, it uses the PCI_AD[25:24] pins as a region identifier and performs address completion to one of four 8-bit values. The address seen internally by the device is {8-bit address completion value, PCI_AD[23:0]}. The 8-bit address completion value is taken from the Address Completion register, which holds four possible values indexed by PCI_AD[25:24].

Configuration
In the Address Completion Register (Table 85 p. 379):
• To configure region0 (for PCI_AD[25:24] = 0), set the <Region0> field accordingly.
• To configure region1 (for PCI_AD[25:24] = 1), set the <Region1> field accordingly.
• To configure region2 (for PCI_AD[25:24] = 2), set the <Region2> field accordingly.
• To configure region3 (for PCI_AD[25:24] = 3), set the <Region3> field accordingly.

6.1.2 PCI Master Unit

The PCI Master unit is used by the device for the following:
• Receive packets from the host CPU memory.
• Transmit packets to the host CPU memory.
• Asynchronous message notification of the device FDB events to the host CPU memory.

The device's PCI Master unit supports the following transactions:
• Memory Read
• Memory Write
• Memory Read Line
• Memory Read Multiple
• Memory Write and Invalidate

Memory Write and Invalidate and Memory Read Line

Memory Write and Invalidate and Memory Read Line cycles are performed when a transaction accessing the PCI memory space requests a data transfer size equal to a multiple of the PCI cache-line size and the transfer is also cache-line aligned. For these transactions, <MemWrInv> in the PCI Status and Command Register (Table 300 p. 553) must be set to 1.

Memory Read Multiple

Memory Read Multiple is performed when a transaction accessing the PCI memory space requests a data transfer size greater than a cache line or crosses the PCI cache-line size boundary.

Note
The device supports a cache-line size of eight 32-bit words only. If the PCI cache-line register is set to any other value, the cache-line size is regarded as being set to 0.

The PCI Master unit consists of 512 bytes of posted-write buffer data and 512 bytes of read buffer data. It can accommodate up to four write transactions plus four read transactions.

In the device, the PCI master posted-write buffer permits the SDMA to complete memory writes when the PCI bus becomes available, even if the PCI bus is busy while the posted data is written to the target PCI device (host bridge).
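The command-selection rules stated above for Memory Write and Invalidate, Memory Read Line, and Memory Read Multiple can be read as simple predicates on the transfer address and size. The Python sketch below is illustrative only: the function names are hypothetical, and treating an effective cache-line size of 0 as "no cache-line command applies" is an assumption of this sketch, not a statement from the text. The text also does not say which rule takes precedence when both hold, so the two predicates are kept separate.

```python
SUPPORTED_CACHE_LINE_WORDS = 8  # from the text: eight 32-bit words only
WORD_BYTES = 4


def effective_cache_line_bytes(cache_line_register: int) -> int:
    """Cache-line size in bytes; any value other than 8 words is treated as 0."""
    if cache_line_register == SUPPORTED_CACHE_LINE_WORDS:
        return SUPPORTED_CACHE_LINE_WORDS * WORD_BYTES
    return 0


def line_command_eligible(address: int, size_bytes: int,
                          cache_line_register: int, mem_wr_inv: bool) -> bool:
    """Memory Write and Invalidate / Memory Read Line rule:
    size is a multiple of the cache line, start is aligned, <MemWrInv> is 1."""
    line = effective_cache_line_bytes(cache_line_register)
    if line == 0 or not mem_wr_inv or size_bytes <= 0:
        return False  # assumption: no cache-line commands with size 0
    return address % line == 0 and size_bytes % line == 0


def read_multiple_eligible(address: int, size_bytes: int,
                           cache_line_register: int) -> bool:
    """Memory Read Multiple rule: larger than one cache line,
    or the transfer crosses a cache-line boundary."""
    line = effective_cache_line_bytes(cache_line_register)
    if line == 0 or size_bytes <= 0:
        return False  # assumption: no cache-line commands with size 0
    last_byte = address + size_bytes - 1
    crosses_boundary = (address // line) != (last_byte // line)
    return size_bytes > line or crosses_boundary
```

With the supported setting of 8 words (32 bytes), an aligned 64-byte transfer satisfies the line rule, and a 24-byte read starting at 0x1010 satisfies the Read Multiple rule because it crosses the boundary at 0x1020.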
The read buffer is used to absorb the incoming data from the PCI. The read and write buffer implementation guarantees that no wait states are inserted by the master.

Note
PCI_IRDYn is never de-asserted in the middle of a transaction.

The following transactions are not initiated by the device's PCI Master unit:
• I/O Read
• I/O Write
• Configuration Read
• Configuration Write
• Interrupt Acknowledge
• Special Cycle
• DAC Cycles

6.1.2.1 PCI Master Write Operation

The PCI Master unit supports the combining of memory writes: it combines consecutive write transactions whenever possible. This is especially useful for long DMA transfers, where a long burst write is required. Combining is always enabled in the device.

For write combining, the following conditions must be met:
• The start address of the second transaction matches the address of data n+1 of the first transaction (for example, a write to a continuous buffer in host memory).
• The request for the new transaction arrives while the first transaction is still in progress.

The benefit of write combining is seen when writing packets to the CPU memory and when the buffer is longer than 128 bytes.

Master Fast Back-to-Back

The master supports fast back-to-back transactions. If a new transaction is pending while a transaction is in progress, the master starts the new transaction as soon as the first one ends, without inserting a dead cycle. The master issues a fast back-to-back transaction if the following conditions occur:
• The first transaction is a write.
• The new transaction request arrives while the first transaction is still in progress.

Configuration
To enable/disable Fast Back-to-Back, set the <MasterFast B2BEn> bit in the PCI Status and Command Register (Table 300 p. 553).

6.1.2.2 PCI Master Read Operation

On a read transaction, as soon as the SDMA requests PCI read access, the master drives the transaction on the bus (after obtaining bus mastership). The returned data is written into the read buffer.

The master also supports combining read transactions. This is especially useful for long SDMA reads. The PCI target is capable of driving long burst data without inserting wait states. Combining is always enabled in the device. Whenever possible, the master combines consecutive read transactions.

For read combining, the following conditions must be met:
• The start address of the second transaction matches the address of data n+1 of the first transaction.
• The request for the new transaction arrives while the first transaction is still in progress.

6.1.2.3 PCI Master Termination

This section describes the device's PCI Master target termination.

PCI Master Termination Overview

The master issues a Master Abort event only if there is no target response to the initiated transaction within four clock cycles. In this case, the master de-asserts PCI_FRAMEn and, on the next cycle, de-asserts PCI_IRDYn. When this happens, the <PCIMasAbort> bit in the PCI Interrupt Cause Register (Table 565 p. 797) is set.

The master supports all types of target termination: Retry, Disconnect, and Target Abort.

Target Retry

If a target has terminated a transaction with Retry, the device's master re-issues the transaction. By default, the master retries a transaction until it is served. However, the number of Retry attempts can be limited. If the master reaches this number, it stops retrying, and the <PCIRetryCtr> bit in the PCI Interrupt Cause Register (Table 565 p. 797) is set.

Configuration
To configure the number of retries, set the <RetryCtr> field in the PCI Status and Command Register (Table 300 p. 553) accordingly.

Target Disconnect

If a target terminates a transaction with Disconnect, the master re-issues the transaction from the point at which it was disconnected. If, for example, the master attempts to burst eight 32-bit words starting at address 0x18 and the target disconnects the transaction after the fifth data transfer, the master re-issues the transaction with the address 0x2C (0x18 plus five 4-byte transfers) in order to burst the three remaining words.

Target Abort

If a target abnormally terminates a transaction with a Target Abort, the master does not attempt to re-issue the transaction. In this event, the <PCITarAbort> bit in the PCI Interrupt Cause Register (Table 565 p. 797) is set.

6.1.3 PCI Parity and Error Support

The device implements all parity features required by the PCI specification. This includes PCI_PARITYn, PCI_PERRn, and PCI_SERRn generation and checking.

Assertion of the PCI_PERRn and PCI_SERRn signals is configurable, depending on the configuration of the <ParityError ResEn> bit in the PCI Status and Command Register (Table 300 p. 553).

Configuration
• To enable/disable Parity Error support, set the <ParityError ResEn> bit in the PCI Status and Command Register (Table 300 p. 553).
• Assertion of PCI_SERRn depends on the setting of the PCI_SERRn Mask Register (Table 309 p. 558).

6.1.4 Disabling the PCI Interface

When the PCI interface is not used, the PCI_EN pin must be pulled down. The rest of the PCI interface pins may be left unconnected (NC).

6.1.5 Packet Reception and Transmission

This section describes the device's packet reception and transmission via the PCI bus.

6.1.5.1 SDMA Overview

The device incorporates 16 Bus Master DMAs, which transfer packets from device memory to CPU memory (receive path) and from CPU memory to device memory for transmission (transmit path). The 16 DMAs are divided into eight RxDMAs and eight TxDMAs, one for each of the eight traffic class (TC) queues. (SecureSmart devices have 4 traffic classes.) They are referred to as Serial DMAs (SDMA), as they operate on a serialized linked list of descriptors. For convenience, the SDMA is designed as a NIC-style DMA as seen by the host processor.

The SDMA uses the PCI as a Master for all its transactions: descriptor read/write and buffer read/write. The maximum burst size is 128 bytes and the minimum burst size is 8 bytes.

The device incorporates eight dedicated receive DMA queues and eight dedicated transmit DMA queues, operated by two dedicated DMA engines (one for receive and one for transmit) that operate concurrently. Each queue is managed by buffer descriptors, which are chained together and managed by the software. The Rx SDMA engine transfers packets from the device memory to CPU memory (receive path), and the Tx SDMA transfers packets from CPU memory to the device memory for transmission (transmit path).

The packet data is stored in memory buffers, with any single packet spanning multiple buffers if necessary. The buffers are allocated by the CPU and are managed through chained descriptor lists. Each descriptor points to a single memory buffer and contains all the relevant information relating to that buffer (i.e., buffer size, buffer pointer, etc.) and a pointer to the next descriptor. Each descriptor also has an ownership bit, which indicates the current owner of the descriptor and its buffer (CPU or SDMA). The SDMA may process only descriptors marked with SDMA ownership.

Data is read from the buffer or written to the buffer according to information contained in the descriptor. Whenever a new buffer is needed (end of buffer or end of packet), a new descriptor is automatically fetched and the data movement operation is continued using the new buffer. Ownership of the processed packet descriptors is returned by the SDMA to the CPU. (This operation is also referred to as Close Descriptor.) A Tx/Rx buffer maskable interrupt is asserted to indicate a Close Descriptor event. The SDMA also delivers status information per packet via the representative descriptor of a packet (the first descriptor at Rx and the last descriptor at Tx).

Figure 14 illustrates an example of memory arrangement for a single packet using three buffers. A cyclic ring of descriptors can be built by the software, or the chain can terminate with a NULL next-pointer.
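The descriptor chaining described above can be sketched in software as follows. This is a minimal host-side model in Python, not the device's descriptor format: the `Descriptor` class, the `build_descriptor_chain` helper, the 16-byte descriptor size (four 32-bit words), and the encoding 1 = SDMA ownership are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List

OWNER_CPU = 0
OWNER_SDMA = 1          # assumed encoding for this sketch
NULL_POINTER = 0
DESCRIPTOR_BYTES = 16   # assumed: four 32-bit words per descriptor


@dataclass
class Descriptor:
    address: int          # host-memory address of this descriptor
    owner: int            # OWNER_CPU or OWNER_SDMA
    buffer_size: int      # size of the buffer in bytes
    buffer_pointer: int   # host-memory address of the buffer
    next_pointer: int     # address of the next descriptor, or NULL_POINTER


def build_descriptor_chain(desc_base: int, buf_base: int, count: int,
                           buffer_size: int, cyclic: bool) -> List[Descriptor]:
    """Lay out `count` descriptors back to back, one buffer per descriptor.

    With cyclic=True the last descriptor points back to the first (ring);
    otherwise the chain ends with a NULL next-pointer.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    chain = []
    for index in range(count):
        address = desc_base + index * DESCRIPTOR_BYTES
        if index < count - 1:
            next_pointer = address + DESCRIPTOR_BYTES
        elif cyclic:
            next_pointer = desc_base
        else:
            next_pointer = NULL_POINTER
        chain.append(Descriptor(
            address=address,
            owner=OWNER_SDMA,  # hand every descriptor to the SDMA
            buffer_size=buffer_size,
            buffer_pointer=buf_base + index * buffer_size,
            next_pointer=next_pointer,
        ))
    return chain
```

Calling `build_descriptor_chain(0x1000, 0x8000, 3, 2048, cyclic=True)` models the three-buffer arrangement with the third descriptor pointing back to the first; passing `cyclic=False` gives a chain that terminates with a NULL next-pointer.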
Figure 14: CPU Descriptors and Memory Buffers
[Diagram: three chained descriptors (Descriptor 1, Descriptor 2, Descriptor 3), each holding cmd/status, byte count/buffer size, buffer pointer, and next descriptor pointer. Each buffer pointer references one buffer of Packet 1 (buffer 1, buffer 2, buffer 3).]

6.1.5.2 Packet Reception (from the Device to the CPU)

The CPU receives packets from the device directly into a pre-allocated space in host memory. To receive packets for a given CPU traffic class queue, the CPU must do the following:
1. Prepare a linked list of descriptors located in the host memory.
2. Configure one of the eight device Rx SDMAs with the address of the first descriptor in the list.
3. Enable the given device SDMA.

A packet may occupy several buffers. For all descriptors except the first one, the DMA closes the descriptor by returning ownership after filling the buffer with packet data. After the entire packet has been copied to the host memory, the DMA closes the first descriptor: ownership is returned, and the packet information is written to the relevant descriptor fields. Then the DMA automatically starts transferring the next packet.

Rx SDMA Initialization

For each CPU traffic class to which traffic to the CPU is assigned, the CPU must prepare the host memory descriptor list and initialize the corresponding Rx SDMA. The host CPU may also evaluate which SDMA Rx queues have packets pending by reading the <PendingRx[7:0]> field in the Receive SDMA Status Register (Table 121 p. 397).

To initialize a receive operation, the CPU must do the following:
1. Prepare a chained list of descriptors and packet buffers with ownership=1 (SDMA).
   NOTE: The Rx SDMA supports eight priority queues. If the user wants to take advantage of this capability, a separate list of descriptors and buffers must be prepared for each of the priority queues.
2. Write the pointer to the first descriptor to <CurrtRxDesc Ptr<n>> in the Receive SDMA Current Descriptor Pointer<n> Register (0<=n<8) (Table 122 p. 397) associated with the priority queue to be started. If multiple priority queues are needed, the user must initialize <CurrtRxDesc Ptr<n>> for each queue.
3. Enable the Rx SDMA channel by setting the relevant bit of <RxENQ[7:0]> to 1 in the Receive SDMA Queue Command Register (Table 115 p. 394).

After completing these steps, the Rx SDMA is ready to deliver packets destined to that queue.

The Rx SDMA fetches the current descriptor according to the address in <CurrtRxDesc Ptr<n>>, which is triggered by the <RxENQ[7:0]> setting. The Rx SDMA generates write bursts via the PCI to deliver the packet data to the current buffer. In parallel, the Rx SDMA updates <CurrtRxDesc Ptr<n>> and fetches the next descriptor. If the buffer is filled before the end of the packet, the Rx SDMA continues to write the packet data to the next buffer. Upon every descriptor close, the Rx SDMA updates the descriptor First and Last indications: the first descriptor with First=1 and the last descriptor with Last=1 (see "Rx SDMA Descriptor" on page 70).

Note
The Rx SDMA closes the descriptor upon completing the writing of data to the descriptor buffer; however, the first descriptor of the packet is closed only after the packet is received in its entirety, at which time the Rx SDMA sets the first descriptor's ownership to 'CPU', sets the First indication, and updates the status word and byte count of this packet.

Rx SDMA Descriptor Pointer

The Rx SDMA holds one 32-bit pointer register per queue, <CurrtRxDesc Ptr<n>>. This 32-bit register is used to point to the first descriptor of a receive packet. The CPU must initialize this register before enabling DMA operation. The value used for initialization should be the address of the first descriptor to use. After the DMA channel is enabled, this register is updated by the Rx SDMA as it moves through the descriptor chain, and it reflects the current pointer location within the linked list of descriptors.

Note
The CPU must not write to this register while its corresponding DMA is enabled. Modifying this register is allowed only when the respective bit of <RxENQ[7:0]> is reset.

This register may be read to assess the progress of the DMA, as well as to monitor the status of the queue.

Rx SDMA Buffer Write Done Interrupt

When the Rx SDMA completes writing to the current descriptor buffer, it sets descriptor ownership to the CPU. If the descriptor Enable Interrupt <EI> bit of the current buffer is set to 1, the Rx SDMA sets the relevant bit of the <RxBuffer Queue[7:0]> interrupt in the Receive SDMA Interrupt Cause Register (Table 561 p. 796).

Rx SDMA Resource Error Event

A resource error event occurs when the Rx SDMA reaches the end of available descriptors in the linked list. The condition occurs if either:
• The current descriptor for the Rx SDMA has a NULL next-descriptor pointer, OR
• The SDMA has reached a descriptor that is owned by the CPU (i.e., a cyclic descriptor list is implemented).

This may happen at the start of a packet or in the middle of writing it to host memory (if it occupies more than a single descriptor buffer). This may also happen due to a speculative pre-fetch of a descriptor (i.e., without any current packet destined to the given queue).

In the case of a resource error event, the device Rx SDMA mechanism performs the following actions:
• The last valid descriptor is closed by setting the descriptor Last=1, and the first descriptor of the broken packet is closed by setting the descriptor First=1 and Resource-Error=1 (see "Rx SDMA Descriptor" on page 70).
• The corresponding Rx SDMA Interrupt Cause bit <RxBuffer Queue[7:0]> is set in the Receive SDMA Interrupt Cause Register (Table 561 p. 796).
• If the resource error is due to reaching a descriptor with a NULL next-descriptor pointer, the relevant Rx SDMA Enable Queue bit <RxENQ[7:0]> is cleared in the Receive SDMA Queue Command Register (Table 115 p. 394). If the resource error is due to reaching a descriptor owned by the CPU, the Rx SDMA Enable Queue bit remains set. The Rx SDMA maintains its pointer to the current descriptor (that is currently owned by the CPU).
• If the resource event occurred on a real packet (and not upon a speculative pre-fetch), the 8-bit resource event counter for the given queue is incremented (see "Rx SDMA Counters" on page 70).
Note M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 If the resource error is due to reaching a descriptor owned by the CPU, then the CPU must update the current descriptor with a pointer to a free buffer and set the descriptor ownership to the SDMA. On the next transmit attempt, the Rx SDMA will detect that descriptor ownership is now the SDMA and will proceed to transmit the packet to the descriptor buffer. If the CPU does not immediately have a free buffer to supply the descriptor, it is recommended that the CPU: 1) Disable the Rx SDMA queue by setting the <RxDISQ[7:0]> field to 1 in the Receive SDMA Queue Command Register (Table 115 p. 394). This prevents the Rx SDMA from performing a PCI read of the descriptor on every attempt to transmit the packet for this queue. 2) Once the descriptor is updated, the Rx SDMA can be enabled by setting the relevant bit of <RxENQ[7:0]> to 1 in the Receive SDMA Queue Command Register (Table 115 p. 394). If the resource error is due to reaching a descriptor with a NULL next-descriptor pointer, the CPU must: 1. Prepare the new linked list of descriptors in host memory. 2. Write the pointer to the start of the descriptor list to <CurrtRxDesc Ptr<n>> in the Receive SDMA Current Descriptor Pointer<n> Register (0<=n<8) (Table 122 p. 397) associated with the priority queue to be started. 3. Enable the relevant DMA channel by setting the relevant bit of <RxENQ[7:0]> to 1 in the Receive SDMA Queue Command Register (Table 115 p. 394). If the Rx SDMA successfully writes the packet to the descriptor buffers, the SDMA closes the first descriptor of the packet by returning ownership, setting the First bit, and writing the packet information in the descriptor fields. MV-S102110-02 Rev. 
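The receive initialization and CPU-owned-descriptor recovery steps can be sketched in C as follows. This is a minimal illustration, not driver code: the names rx_desc_t, rx_ring_init, and rx_desc_give_to_sdma are hypothetical, the word layout is assumed from the Rx SDMA descriptor tables (Table 4 through Table 8), and the bus addresses are supplied by the caller. Writing <CurrtRxDesc Ptr<n>> and <RxENQ[7:0]> requires platform-specific register access and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rx SDMA descriptor: four 32-bit words, 16 bytes in total
 * (layout assumed from the Rx SDMA descriptor tables). */
typedef struct {
    uint32_t cmd_status; /* +0: bit 31 = ownership (1 = SDMA, 0 = CPU)   */
    uint32_t byte_count; /* +4: Buffer Size in bits [13:3], 8-byte units */
    uint32_t buf_ptr;    /* +8: 128-byte aligned buffer address          */
    uint32_t next_desc;  /* +C: 16-byte aligned next descriptor, 0 = end */
} rx_desc_t;

#define RX_OWN_SDMA (1u << 31) /* O bit  */
#define RX_EI       (1u << 29) /* EI bit */

/* Hypothetical helper: link `count` descriptors to consecutive buffers of
 * `buf_size` bytes. desc_bus and buf_bus are the bus addresses of the
 * descriptor array and of the first buffer. If `cyclic` is nonzero the
 * list is closed into a ring; otherwise the last descriptor gets a NULL
 * next-descriptor pointer. */
static void rx_ring_init(rx_desc_t *ring, size_t count, uint32_t desc_bus,
                         uint32_t buf_bus, uint32_t buf_size, int cyclic)
{
    assert(sizeof(rx_desc_t) == 16);
    assert(count > 0);
    assert((desc_bus & 0xFu) == 0);  /* descriptors are 16-byte aligned */
    assert((buf_bus & 0x7Fu) == 0);  /* buffers are 128-byte aligned    */
    /* This sketch packs buffers back to back, so it requires a size that
     * is a multiple of 128 and fits the Buffer Size[13:3] field. */
    assert(buf_size >= 128 && buf_size <= 0x3F80u && buf_size % 128 == 0);

    for (size_t i = 0; i < count; i++) {
        ring[i].cmd_status = RX_OWN_SDMA | RX_EI;
        ring[i].byte_count = buf_size & 0x3FF8u; /* size lands in bits [13:3] */
        ring[i].buf_ptr    = buf_bus + (uint32_t)i * buf_size;
        if (i + 1 < count)
            ring[i].next_desc = desc_bus + (uint32_t)(i + 1) * 16u;
        else
            ring[i].next_desc = cyclic ? desc_bus : 0u;
    }
}

/* Hypothetical helper for recovery from a CPU-owned descriptor: attach a
 * free buffer, clear the status bits written by the device, and hand the
 * descriptor back to the SDMA. */
static void rx_desc_give_to_sdma(rx_desc_t *d, uint32_t free_buf_bus)
{
    assert((free_buf_bus & 0x7Fu) == 0);
    d->buf_ptr    = free_buf_bus;
    d->cmd_status = RX_OWN_SDMA | RX_EI; /* ownership set in the same store */
}
```

On real hardware the descriptor memory must be visible to the device before ownership is handed over, so a memory barrier or cache flush appropriate to the platform would normally precede the register writes.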
The device terminates packet writing to host memory if it reaches the end of the linked list before packet reception is complete.

Rx SDMA Retry/Abort on Resource Error

If the resource error occurred on a real packet (and not upon a speculative pre-fetch), each queue can be independently configured to operate in one of two modes:

ABORT mode
Upon Rx SDMA resource error, the packet that failed to be written in its entirety to host memory is dropped. The next packet is scheduled according to the CPU port scheduling algorithm (see Transmit Queue Scheduling (Table 15.3.2 p. 306)).

RETRY mode
Upon Rx SDMA resource error, the packet that failed to be written in its entirety to host memory remains scheduled for transmission. No other packet is handled from other traffic class queues until the current packet is successfully transmitted to host memory. When the resource error is resolved (i.e., the current descriptor is owned by the SDMA), the packet is transmitted again in its entirety.

To configure queue RETRY/ABORT mode, set the relevant bit of the <Retry AbortTC[7:0]> field in the SDMA Configuration Register (Table 114 p. 393) accordingly.

Rx SDMA Disable Queue Operation

The CPU may disable any active DMA queue at any time. To disable a queue, set the relevant bit of the <RxDISQ[7:0]> field to 1 in the Receive SDMA Queue Command Register (Table 115 p. 394).

If the queue is disabled while the Rx SDMA is in the middle of writing a packet to host memory, the Rx SDMA continues to process the packet until the packet end or a resource error event, and then disables the queue operation by resetting its <RxENQ[7:0]> bit.

Rx SDMA Parity Error Handling

The Rx SDMA may encounter a parity error event, derived from the PCI bus, upon a descriptor fetch operation. This case is handled like a resource error event, except that a parity interrupt is asserted (see Section 6.1.3 "PCI Parity and Error Support" on page 65).

Rx SDMA Parity Error on Data Read from the Device’s Buffers Memory

When the device reads the packet from its buffer memory, it conducts a parity check. A parity error indicates that the packet’s data was corrupted due to a soft error in the device’s buffers memory. When the packet is transmitted via a network port, it is transmitted with a bad CRC so that the receiving device will reject it. On the Rx SDMA CPU interface, this is indicated by setting the bus_error bit in the first packet descriptor to 1 (see "Rx SDMA Descriptor" on page 70).

Rx SDMA Invalid CRC

All packets forwarded to the host CPU contain a TO_CPU DSA tag. Due to the DSA tag packet modification, the packet Ethernet CRC is always invalid. The Rx SDMA does not recalculate the packet’s CRC. Instead, it indicates that the packet contains the four bytes of the packet’s CRC, but that these bytes are invalid. This is indicated by setting the invalid_CRC bit in the first packet descriptor to 1 (see "Rx SDMA Descriptor" on page 70).

Rx SDMA Data Byte Order

In default mode, the device writes data to the PCI bus in Big Endian byte order.

Two bits are defined to support byte-swapping, for CPUs that use different Endian byte ordering and different word sizes (32-bit or 64-bit):

Word Swap Mode
Within each 64-bit format, swaps the highest 32 bits with the lowest 32 bits.

Byte Swap Mode
Swaps bytes within every 32 bits (byte 3 swapped with byte 0; byte 2 swapped with byte 1).

Configuration
In the SDMA Configuration Register (Table 114 p. 393):
• To configure Word Swap mode, set the <RxWordSwap> bit.
• To configure Byte Swap mode, set the <RxByteSwap> bit.

Rx SDMA Counters

The Rx SDMA maintains the following per-queue counters:

Packet counters (32 bit)
<RxSDMA<n> PktCnt> in the Receive SDMA<n> Packet Count Register (0<=n<8) (Table 126 p. 398).

Byte counters (32 bit)
<RxSDMA<n> BCCnt> in the Receive SDMA<n> Byte Count Register (0<=n<8) (Table 127 p. 398).

Resource Error counters (8 bit)
<RxSDMA0 ResrcErrCnt>, <RxSDMA1 ResrcErrCnt>, <RxSDMA2 ResrcErrCnt> and <RxSDMA3 ResrcErrCnt> in the Receive SDMA Resource Error Count 0 Register (Table 128 p. 399).
<RxSDMA4 ResrcErrCnt>, <RxSDMA5 ResrcErrCnt>, <RxSDMA6 ResrcErrCnt> and <RxSDMA7 ResrcErrCnt> in the Receive SDMA Resource Error Count 1 Register (Table 129 p. 399).

For each packet successfully forwarded to the CPU, the relevant packet counter is incremented by one and the relevant byte counter is incremented by the packet’s byte count. If the packet is unsuccessful due to a resource error, the Rx SDMA increments the respective Resource Error counter by one. When working in Retransmit On Error mode, this counter is incremented upon each unsuccessful packet transfer, until the packet transfer succeeds.

Rx SDMA Descriptor

The CPU can learn information about the packet received from the device by examining the Rx SDMA descriptor and the packet TO_CPU DSA tag.

Notes
• All packets forwarded to the CPU contain a TO_CPU DSA tag (see Section 7.2 "Packets to the CPU" on page 102).
• The format of the TO_CPU DSA tag is found in Section A.1 "Extended DSA Tag in TO_CPU Format" on page 333.
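As an illustration of examining the Rx SDMA descriptor, the following C sketch decodes the Command/Status word and the Byte Count word of a packet's first descriptor. The bit positions are taken from Table 5 and Table 6; the type rx_status_t and the functions rx_decode and rx_len_without_crc are hypothetical names introduced only for this example, and the words are assumed to have already been read in CPU byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the first descriptor of a received packet
 * (hypothetical structure, for illustration only). */
typedef struct {
    bool     cpu_owned;      /* O bit (31) is 0: descriptor closed by SDMA */
    bool     bus_error;      /* bit 30 of Command/Status                   */
    bool     resource_error; /* bit 28 of Command/Status                   */
    bool     first;          /* F bit (27)                                 */
    bool     last;           /* L bit (26)                                 */
    bool     invalid_crc;    /* bit 30 of the Byte Count word              */
    uint16_t pkt_bytes;      /* Packet Byte Count[13:0], bits 29:16        */
} rx_status_t;

static rx_status_t rx_decode(uint32_t cmd_status, uint32_t byte_count_word)
{
    rx_status_t s;
    s.cpu_owned      = ((cmd_status >> 31) & 1u) == 0;
    s.bus_error      = ((cmd_status >> 30) & 1u) != 0;
    s.resource_error = ((cmd_status >> 28) & 1u) != 0;
    s.first          = ((cmd_status >> 27) & 1u) != 0;
    s.last           = ((cmd_status >> 26) & 1u) != 0;
    s.invalid_crc    = ((byte_count_word >> 30) & 1u) != 0;
    s.pkt_bytes      = (uint16_t)((byte_count_word >> 16) & 0x3FFFu);
    return s;
}

/* The packet byte count always includes the 4 CRC bytes, so software that
 * wants the length without the CRC subtracts them. The count is valid only
 * on the first descriptor of a packet. */
static uint16_t rx_len_without_crc(const rx_status_t *s)
{
    assert(s->first && s->pkt_bytes >= 4);
    return (uint16_t)(s->pkt_bytes - 4);
}
```

Because the bus_error, Resource Error, Invalid_CRC, and Packet Byte Count fields are valid only when F is set, a caller would normally check `first` before trusting any of them.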
Rx SDMA descriptor properties:
• Descriptor length is 16 bytes, and it must be 16-byte aligned (i.e., Descriptor_Address[3:0]==0000).
• Descriptors may reside anywhere in the address space except for the null address (0x00000000), which is used to indicate the end of a descriptor chain.
• If the linked list of descriptors is in a chain formation (i.e., not a ring formation), the last descriptor in the chain must have a null value in the Next Descriptor Pointer field. In a ring formation, the end of the chain is indicated by arriving at a descriptor that is owned by the CPU.
• Rx buffers (whose pointer resides in the Rx SDMA descriptor) are limited to 16 KB in length and must have a 128-byte aligned address (i.e., Buffer_Address[6:0]==7’b0000000).
• The minimum Rx buffer length is eight bytes.

The Rx SDMA descriptor format is defined in the following tables:

Table 4: Rx SDMA Descriptor Format
Offset | Contents (bits 31:0)
+0 | Command / Status (see Table 5)
+4 | Byte Count[13:0], Buffer Size[13:3]
+8 | Buffer Pointer[31:7]
+C | Next Descriptor Pointer[31:4]

Table 5: Rx SDMA Descriptor - Command/Status Field
Bits | Name | Description | Set By
31 | O | Ownership bit: 0 = Buffer owned by CPU; 1 = Buffer owned by the device SDMA. | CPU/Device
30 | bus_error | The packet had an error while being fetched from device memory (see "Rx SDMA Parity Error on Data Read from the Device’s Buffers Memory" on page 69). NOTE: Valid only on the packet’s first descriptor (<F> = 1). | Device
29 | EI | Enable Interrupt. When set, a maskable interrupt is generated upon closing this descriptor (see "Rx SDMA Buffer Write Done Interrupt" on page 67). | CPU
28 | Resource Error | A resource error event occurred (see "Rx SDMA Resource Error Event" on page 67). NOTE: Valid only on the packet’s first descriptor (<F> = 1). | Device
27 | F | First. When set, indicates that the buffer associated with this descriptor is the first buffer of a packet. | Device
26 | L | Last. When set, indicates that the buffer associated with this descriptor is the last buffer of a packet. | Device
25:0 | Reserved | Must be 0. | Device

Table 6: Receive Descriptor - Byte Count Field
Bits | Name | Description | Set By
31 | Reserved | Must be 0. | Device
30 | Invalid_CRC | States whether the Ethernet packet had a valid or an invalid CRC. An invalid CRC can be caused by the device modifying the packet’s content (see "Rx SDMA Invalid CRC" on page 69). 0 = Valid; 1 = Invalid. NOTE: Valid only on the packet’s first descriptor (<F> = 1). | Device
29:16 | Packet Byte Count[13:0] | When the last descriptor is closed, this field in the first descriptor of the packet is written by the device with the total byte count of the received packet. For Ethernet packets, the packet length ALWAYS includes the 4 bytes of CRC. NOTE: Valid only on the packet’s first descriptor (<F> = 1). | Device
15:14 | Reserved | Must be 0. | Device
13:3 | Buffer Size | Buffer size in units of 8 bytes. This field indicates the size, in 8-byte resolution, of the CPU memory allocated for this descriptor. When the number of bytes written to this buffer equals the Buffer Size value, the DMA closes the descriptor and moves to the next descriptor. | CPU
2:0 | Reserved | Must be 0. | Device

Table 7: Rx SDMA Descriptor - Buffer Pointer
Bits | Name | Description | Set By
31:7 | Buffer Pointer | 25 most significant bits of the 32-bit address of the buffer associated with this descriptor. NOTE: The buffer address must be 128-byte aligned. | CPU
6:0 | Reserved | Must be 0. |

Table 8: Rx SDMA Descriptor - Next Descriptor Pointer
Bits | Name | Description | Set By
31:0 | Next Descriptor Pointer | 32-bit pointer that points to the beginning of the next descriptor. Bits[3:0] must be set to 0. DMA operation is stopped when a NULL (all zero) value in the Next Descriptor Pointer field is encountered. | CPU

6.1.5.3 Packet Transmission (from the CPU to the Device)

The CPU transmits packets to the device using the Tx Serial DMA (SDMA) mechanism. The CPU initializes a linked list of Tx SDMA descriptors in host memory space.
Each descriptor is associated with a data buffer to be transmitted, where a packet can span multiple descriptors.

Once the Tx SDMA descriptor linked list is set up and ready for processing, the CPU configures the device with the address of the head of the linked list and enables the relevant Tx SDMA. The device starts reading the data from the linked list and passes it to the device’s buffer memory until reaching a descriptor marked with ‘L=1’.

Multiple packets can be in a single linked list. If the list is in a chain format, the last descriptor of the last packet must be marked with the next-descriptor pointer set to NULL (all zeros). If a ring descriptor list format is used, the list is terminated by reaching a descriptor owned by the CPU.

Once enabled, the device’s Tx SDMA transmits all the packets in the linked list, stopping when it reaches a descriptor with the NULL pointer (all zeros) or a descriptor owned by the CPU.

Tx SDMA Initialization

To initialize a transmit operation, the CPU must do the following:
1. Prepare a chained list of initialized Tx SDMA descriptors. The descriptor owner must be set to ‘1’ (SDMA).
   NOTE: The Tx SDMA supports eight priority queues. To take advantage of this capability, a separate list of descriptors and buffers must be prepared for each of the priority queues.
2. Write the pointer to the first descriptor to <CurrtTxDesc Ptr<n>> in the Transmit SDMA Current Descriptor Pointer<n> Register (0<=n<8) (Table 117 p. 396) associated with the priority queue to be started. If multiple priority queues are needed, the user must initialize <CurrtTxDesc Ptr<n>> for each queue.
3. Enable the Tx SDMA channel by setting the relevant bit of <TxENQ[7:0]> to 1 in the Transmit SDMA Queue Command Register (Table 116 p. 395).

After completing these steps, the Tx SDMA starts to perform arbitration between the active transmit queues, according to fixed priority, on a packet-by-packet basis. The Tx SDMA then fetches the first descriptor from the specific queue it has decided to serve (according to the address at <CurrtTxDesc Ptr<n>>) and starts transferring data from the memory buffer to the device’s buffers memory. The <CurrtTxDesc Ptr<n>> is updated with the next descriptor pointer. When the Tx SDMA completes sending the buffer’s data to the device, the Tx SDMA closes the respective descriptor by returning ownership to the CPU. If this is not the end of the packet, the Tx SDMA then fetches the next descriptor and continues to read packet data from the next buffer. Upon completing the processing of all the buffers of a packet, the Tx SDMA again performs arbitration between the active transmit queues, to schedule the next packet for transmission.

On the selected queue, if the Tx SDMA has reached the end of the linked list, the Tx SDMA resets its respective <TxENQ[7:0]> bit, thus disabling that DMA channel. A maskable <TxEnd Queue[7:0]> interrupt in the Transmit SDMA Interrupt Cause Register (Table 563 p. 797) is also generated, to indicate this event. The CPU can transmit packets again on the disabled channel by repeating the above initialization sequence.

Tx SDMA Descriptor Pointer

The Tx SDMA holds one 32-bit pointer register per queue: <CurrtTxDesc Ptr<n>>. This register points to the first descriptor of a packet transmitted from the CPU. The CPU must initialize this register before enabling DMA operation. The value used for initialization should be the address of the head of the Tx SDMA descriptor list. After the DMA channel is enabled, the Tx SDMA updates this register as it moves through the descriptor chain, so that the register reflects the current pointer location within the linked list of descriptors.

Note
The CPU must not write to this register while its corresponding DMA is enabled. Modifying this register is allowed only when the respective <TxENQ[7:0]> bit is reset. This register may be read to assess the progress of the DMA, as well as to monitor the status of the queue.

Tx SDMA Buffer Read Done Interrupt

When the Tx SDMA completes the read operation on the current buffer, it transfers ownership of the buffer’s descriptor to the CPU. In addition, if the <EI> bit in the descriptor of the current buffer is set to 1, the Tx SDMA sets the relevant bit of the <TxBuffer Queue[7:0]> interrupt in the Transmit SDMA Interrupt Cause Register (Table 563 p. 797).

Tx SDMA Resource Error Event

A Tx SDMA resource error event occurs when the Tx SDMA reaches the end of the descriptors linked list (a null next-descriptor pointer or a CPU-owned descriptor) in the middle of a packet chain (i.e., after fetching the first descriptor and before fetching the last descriptor of the packet). This event can occur only as a result of a misconfiguration by the CPU.

In the event of a resource error, the following steps are taken by the Tx SDMA:
1. The Tx SDMA disables the DMA channel of the relevant queue by resetting the relevant bit of <TxENQ[7:0]> in the Transmit SDMA Queue Command Register (Table 116 p. 395).
2. The Tx SDMA asserts the relevant bit of <TxError Queue[7:0]> in the Transmit SDMA Interrupt Cause Register (Table 563 p. 797). Due to the chain end, <TxEnd Queue[7:0]> is also asserted.
3. The Tx SDMA discards the partial packet read.

Tx SDMA Recovery from Resource Error

To restart the DMA channel, the CPU must perform the initialization operations defined in "Tx SDMA Initialization" on page 73.

Tx SDMA Disable Queue Operation

The CPU may stop any active Tx SDMA queue at any time. To disable a queue, set the relevant bit of <TxDISQ[7:0]> to 1 in the Transmit SDMA Queue Command Register (Table 116 p. 395). If this happens in the middle of a packet, the Tx SDMA continues to process the packet until the packet end or a resource error event, and then disables the queue operation by resetting its <TxENQ[7:0]> bit. In addition, the relevant <TxEnd Queue[7:0]> interrupt in the Transmit SDMA Interrupt Cause Register (Table 563 p. 797) is asserted.

Tx SDMA Parity Error Handling

The Tx SDMA may encounter a parity error event (derived from the PCI bus) upon a descriptor fetch operation or upon a buffer data read operation.

If a parity error occurs upon descriptor fetch, the Tx SDMA performs the same operations as described in "Tx SDMA Resource Error Event" on page 74; however, a PCI parity interrupt is asserted instead of a Tx SDMA Resource Error interrupt.

If a parity error occurs upon buffer data read, the Tx SDMA continues processing the current packet, and the packet read is discarded by the device.

Tx SDMA Data Byte Order

In default mode, the device reads data from the PCI bus in Big Endian byte order.

Two bits are defined to support byte-swapping, for CPUs that use different Endian byte ordering and different word sizes (32-bit or 64-bit):

Word Swap Mode
Within each 64-bit format, swaps the highest 32 bits with the lowest 32 bits.

Byte Swap Mode
Swaps bytes within every 32 bits (byte 3 swapped with byte 0; byte 2 swapped with byte 1).

Configuration
In the SDMA Configuration Register (Table 114 p. 393):
• To configure Word Swap mode, set the <TxWordSwap> bit.
• To configure Byte Swap mode, set the <TxByteSwap> bit.

Tx SDMA Descriptor

The CPU indicates to the device Tx SDMA how the packet is to be forwarded through the fields in the Tx SDMA descriptor and through the fields in the packet DSA tag.

Notes
• All packets sent from the CPU to the device must contain a FROM_CPU or FORWARD DSA tag (see 7.3 "Packets from the CPU" on page 107).
• The format of the FROM_CPU DSA tag is found in A.2 "Extended DSA Tag in FROM_CPU Format" on page 336.
• The format of the FORWARD DSA tag is found in A.4 "Extended DSA Tag in FORWARD Format" on page 341.

Tx SDMA descriptor properties:
• Descriptor length is 16 bytes, and it must be 16-byte aligned (i.e., Descriptor_Address[3:0]==0000).
• Descriptors may reside anywhere in the CPU address space except for a null address (0x00000000), which is used to indicate the end of the descriptor chain.
• Descriptors are always fetched in a 16-byte burst.
• The Tx SDMA descriptor list is terminated by either a NULL value in the Next Descriptor Pointer field, or by a Tx SDMA descriptor owned by the CPU.
• Tx buffers associated with Tx descriptors are limited in length to a maximum of 64 KB. There are no alignment restrictions for buffers with a length greater than 8 bytes. However, buffers with a payload of one to eight bytes must be aligned to a 64-bit boundary. Zero-size buffers are illegal.

Table 9: Transmit Descriptor Format
Offset | Contents (bits 31:0)
+0 | Command / Status
+4 | Byte Count[13:0], Desc info[15:0]
+8 | Buffer Pointer[31:0]
+C | Next Descriptor Pointer[31:4], bits [3:0] = 0

Table 10: Transmit Descriptor - Command/Status
Bits | Name | Description | Set By
31 | O | Ownership bit. When set to ‘1’, the buffer is owned by the Tx SDMA. When set to ‘0’, the buffer is owned by the CPU. | CPU/Device
30 | Reserved | Must be set to 0. | Device
29:24 | Reserved | Must be set to 0. | Device
23 | EI | Enable Interrupt. When set, a maskable interrupt is generated upon closing the descriptor. To limit the number of interrupts and prevent an interrupt-per-buffer situation, the user should set this bit only in descriptors associated with LAST buffers. If this is done, a TxBuffer interrupt is set only when transmission of a frame is completed. | CPU
22 | Reserved | Reserved |
21 | F | If set, indicates the descriptor is the FIRST descriptor of the packet. | CPU
20 | L | If set, indicates the descriptor is the LAST descriptor of the packet. | CPU
19:13 | Reserved | Reserved | Device
12 | RecalcCRC | If set, indicates that the four CRC bytes of the packet are invalid and that the CRC of the packet should be recalculated when transmitting the packet to its destination. | CPU
11:0 | Reserved | Reserved | Device

Table 11: Transmit Descriptor - Byte Count
Bits | Name | Description | Set By
31:30 | Reserved | Must be 0. | CPU
29:16 | Byte Count | Number of bytes in the corresponding buffer. This is the payload size of the buffer in bytes. This field contains only the byte count of the buffer pointed to by this descriptor. | CPU
15:0 | Reserved | Must be 0. |

Table 12: Transmit Descriptor - Buffer Pointer
Bits | Name | Description | Set By
31:0 | Buffer Pointer | 32-bit pointer to the beginning of the buffer associated with this descriptor. Has a 64-bit alignment requirement for buffers with a byte count of 1 to 8 bytes. | CPU

Table 13: Transmit Descriptor - Next Descriptor Pointer
Bits | Name | Description | Set By
31:0 | Next Descriptor Pointer | 32-bit pointer that points to the beginning of the next descriptor. Bits[3:0] must be set to 0 (descriptor must be 16-byte aligned). A value of NULL (all zero) can be used to terminate the descriptor list. | CPU
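The transmit descriptor fields can be put together as in the following C sketch, which builds a NULL-terminated descriptor chain for one packet that spans several buffers. It is an illustration under stated assumptions: the names tx_desc_t, tx_frag_t, and tx_build_packet are hypothetical, the bit positions follow Table 10 and Table 11, the per-descriptor length limit is taken from the 14-bit Byte Count field, and RecalcCRC and the write to <CurrtTxDesc Ptr<n>> and <TxENQ[7:0]> are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tx SDMA descriptor: four 32-bit words, 16 bytes in total. */
typedef struct {
    uint32_t cmd_status; /* +0: O, EI, F, L bits                        */
    uint32_t byte_count; /* +4: Byte Count in bits [29:16]              */
    uint32_t buf_ptr;    /* +8: buffer address                          */
    uint32_t next_desc;  /* +C: next descriptor address, 0 ends the list */
} tx_desc_t;

#define TX_OWN_SDMA (1u << 31)
#define TX_EI       (1u << 23)
#define TX_FIRST    (1u << 21)
#define TX_LAST     (1u << 20)

/* One buffer of a packet: bus address and payload length in bytes. */
typedef struct {
    uint32_t bus_addr;
    uint32_t len;
} tx_frag_t;

/* Hypothetical helper: fill descs[0..n-1] for a single packet made of n
 * buffers. desc_bus is the bus address of descs[0]. EI is set only on the
 * LAST descriptor, so one interrupt is raised per transmitted frame. */
static void tx_build_packet(tx_desc_t *descs, uint32_t desc_bus,
                            const tx_frag_t *frags, size_t n)
{
    assert(sizeof(tx_desc_t) == 16);
    assert(n > 0 && (desc_bus & 0xFu) == 0);

    for (size_t i = 0; i < n; i++) {
        /* Zero-size buffers are illegal; the field is 14 bits wide. */
        assert(frags[i].len >= 1 && frags[i].len <= 0x3FFFu);
        /* Buffers of 1 to 8 bytes must be aligned to a 64-bit boundary. */
        assert(frags[i].len > 8 || (frags[i].bus_addr & 0x7u) == 0);

        uint32_t cmd = TX_OWN_SDMA;
        if (i == 0)
            cmd |= TX_FIRST;
        if (i == n - 1)
            cmd |= TX_LAST | TX_EI;

        descs[i].cmd_status = cmd;
        descs[i].byte_count = frags[i].len << 16;
        descs[i].buf_ptr    = frags[i].bus_addr;
        descs[i].next_desc  = (i == n - 1)
                                  ? 0u
                                  : desc_bus + (uint32_t)(i + 1) * 16u;
    }
}
```

A single-buffer packet gets both F and L set on the same descriptor, which falls out of the two independent checks in the loop.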
CPU 6.1.6 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Bits Asynchronous Notifications for FDB Update Messages The device implements an asynchronous message notification mechanism, to notify the host processor about: • Changes performed automatically by the device on the bridge Forwarding Database (FDB). • New Source Address received on a port or trunk group. • Reply to a host CPU query to the FDB. This mechanism is called AUQ (Address Update Queue), as it implements an internal queue to store the messages received from the Bridge engine, until their release to the CPU predefined memory via the PCI bus. These messages, called Address Update (AU) messages, can be transferred over the PCI bus into an Address Update Queue (AUQ) defined in host memory. (In non-PCI systems, MAC Update messages can be read by the host CPU as regular address-mapped registers.) The AU message size is 16 bytes and each AU message transfer is performed using a single PCI Master transaction. For detailed information regarding the FDB Address Update messages see Section 11.4.5 "Address Update (AU) Messages" on page 226. The FDB Address Update message format is defined in MAC Update Message Format (Table 401 p. 663). Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 77 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Table 11: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED PCI Interface 6.1.6.1 Allocating Host Memory for the Address Update Queue (AUQ) AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The host device allocates one or two contiguous buffers (i.e., AUQs) in its memory for AU messages. 
The device is configured with the starting location and size of each AUQ buffer area. If two AUQs are allocated, the device automatically switches to the second AUQ when the first AUQ is full, thus allowing more efficient processing of AU messages.

Configuration
• To configure the size of the AUQ (in terms of 16-byte AU messages), set <AUQSize> in the Address Update Queue Control Register (Table 94 p. 384).
• After configuring the AUQ size, configure the 16-byte aligned AUQ base address by setting the <PCIAUQ Base> field in the Address Update Queue Base Address Register (Table 93 p. 384). When the CPU sets the AUQ base address, the device validates the AUQ size and base address, and the AUQ loads these values only if they are valid. Loading the AUQ invalidates the values held in these registers.
• To configure the second AUQ, reconfigure the AUQ size and base address using the same registers.

6.1.6.2 AUQ Forwarding to the Host CPU

To initialize the AUQ after exit from reset, the host CPU must configure the AUQ size and base address. Configuring the base address triggers the AUQ to update its internal data structures.

AU messages from the device’s Bridge engine generate a write burst on the PCI bus, using the PCI Master. For each AU message transfer, the AUQ sets the <AUQPending> interrupt in the Miscellaneous Interrupt Cause Register (Table 557 p. 794).

When the AUQ is filled (i.e., the last AU message entry is written to the allocated AUQ memory), the device sets the <AUQueueFull> interrupt in the Miscellaneous Interrupt Cause Register (Table 557 p. 794). If the CPU has configured another AUQ memory area, the AUQ loads its values and continues to write the incoming AU messages to the new area. If there is no other valid AUQ memory area, the device stops sending AU messages until the host CPU configures a valid AUQ memory area, as described in Section 6.1.6.1 "Allocating Host Memory for the Address Update Queue (AUQ)".

Notes
• The CPU is expected to allocate a new AUQ area as soon as the <AUQueueFull> interrupt is asserted. This prevents the AUQ from running out of address space, which stops FDB table updates.
• For continuous operation, it is recommended to allocate two buffers for AU messages.

6.1.7 PCI Interrupts

See Section 6.6 "Interrupts" on page 98.

6.2 Serial Management Interfaces (SMI)

The device incorporates the following SMI interfaces:

Table 14: SMI Interfaces
Interface | Relevant Devices | Description
CPU SMI | 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Slave SMI interface for CPU management access to all address-mapped entities in the device and all PHY devices connected to it. This interface complies with IEEE 802.3 Clause 22.
Master SMI interface 0 | 98DX106, 98DX107, 98DX130, 98DX133, 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Master SMI interface for CPU management of, and Auto-Negotiation with, PHY devices connected to the Tri-Speed device’s Ethernet Port0 through Port11. This interface complies with IEEE 802.3 Clause 22.
Master SMI interface 1 | 98DX163, 98DX166, 98DX167, 98DX169, 98DX243, 98DX246, 98DX247, 98DX249, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Master SMI interface for CPU management of, and Auto-Negotiation with, PHY devices connected to the Tri-Speed device’s Ethernet Port12 through Port23. This interface complies with IEEE 802.3 Clause 22.
Master XSMI interface | 98DX130, 98DX133, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Master SMI for packet processor access to the integrated HyperG.Stack XAUI PHYs and any external components such as XFP PHYs. The CPU may also indirectly access the integrated XAUI PHYs and any external XFP PHY through this interface. This interface complies with IEEE 802.3 Clause 45.
HyperG.Stack Port24 Slave XSMI interface | 98DX130, 98DX133, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Slave SMI interface for CPU management of the device HyperG.Stack Port24. This interface complies with IEEE 802.3 Clause 45. Typically, this interface is directly connected to the device’s Master XSMI interface via the board.
HyperG.Stack Port25 Slave XSMI interface | 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, 98DX803 | Slave SMI interface for CPU management of the device HyperG.Stack Port25. This interface complies with IEEE 802.3 Clause 45. Typically, this interface is directly connected to the device’s Master XSMI interface via the board.
HyperG.Stack Port26 Slave XSMI interface | 98DX270, 98DX273, 98DX803 | Slave SMI interface for CPU management of the device HyperG.Stack Port26. This interface complies with IEEE 802.3 Clause 45. Typically, this interface is directly connected to the device’s Master XSMI interface via the board.

Note
IEEE 802.3 Clause 22 compliant SMI interfaces are referred to as SMI interfaces. IEEE 802.3 Clause 45 compliant SMI interfaces are referred to as XSMI interfaces.

6.2.1 Serial Management Interface Overview

This section gives an overview of the serial management interface.

6.2.1.1 MDC: Serial Management Interface Clock

MDC is the Serial Management clock. In the Master SMI interfaces, M_MDC0, M_MDC1, and M_XMDC are output pins of the device and run at either 1.56 MHz or 12 MHz. In the Slave SMI interface, the CPU_MDC input can run from DC to a maximum rate of 10 MHz. The S_XMDC0, S_XMDC1, and S_XMDC2 inputs can run from DC to a maximum rate of 8.33 MHz.

Configuration
• To set the M_MDC0 speed for Master SMI Interface0, set the <SMI0 FastMdc> bit in the PHY Address Register0 (for Ports 0 through 5) (Table 258 p. 514).
• To set the M_MDC1 speed for Master SMI Interface1, set the <SMI1 FastMdc> bit in the PHY Address Register2 (for Ports 12 through 17) (Table 260 p. 516).
• To set the M_XMDC speed for the Master XSMI Interface, set the <XSMIFastMDC> bit in the HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register (Table 163 p. 430).

6.2.1.2 MDIO: Serial Management Interface Data

MDIO, for all SMI interfaces, is the Serial Management data input/output pin. It is a bi-directional signal that runs synchronously to the relevant MDC.
6.2.1.3 IEEE 802.3 Clause 22 SMI Framing

Table 15 outlines the SMI Interface framing support:

Table 15: SMI Interface Framing
Frame | 32-Bit PRE | 2-Bit Start of Frame | 2-Bit OpCode | 5-Bit PHY Addr | 5-Bit Reg Addr | 2-Bit TA | 16-Bit Data Field | Idle
Read | 1...1 | 01 | 10 | PPPPP | RRRRR | z0 | DDDDDDDDDDDDDDDD | Z
Write | 1...1 | 01 | 01 | PPPPP | RRRRR | 10 | DDDDDDDDDDDDDDDD | Z

Note
Although the SMI Interface requires a preamble of 32 bits, the devices are permanently programmed for preamble suppression. A minimum of one Idle MDC cycle is required between two consecutive transactions.

6.2.1.4 IEEE 802.3 Clause 45 SMI Framing

The XSMI implements a 16-bit address register, and each frame carries a 16-bit address/data field (see Table 16) that is used as follows:
• For an address cycle, the field contains the address of the register to be accessed on the next cycle. This address is stored in the address register.
• For read, write, and post-read-increment-address cycles, the field contains the data for the register.

At power-up and reset, the content of the address register is undefined. Write, read, and post-read-increment-address frames access the register whose address is held in the address register, though write and read frames do not modify the contents of the address register.
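The address-register behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the type `xsmi_mmd`, the function `xsmi_frame`, and the eight-entry register file are hypothetical names and sizes chosen for illustration, and the opcode values are the ones listed in Table 16. It is not a description of the device’s internal implementation.

```c
#include <stdint.h>

/* XSMI (Clause 45) frame opcodes, as listed in Table 16. */
enum xsmi_op {
    XSMI_OP_ADDRESS  = 0x0,  /* 00: address frame                      */
    XSMI_OP_WRITE    = 0x1,  /* 01: write frame                        */
    XSMI_OP_READ_INC = 0x2,  /* 10: post-read-increment-address frame  */
    XSMI_OP_READ     = 0x3   /* 11: read frame                         */
};

#define XSMI_MODEL_REGS 8u   /* hypothetical register-file size, for illustration only */

/* Hypothetical model of one addressed device: a 16-bit address register
 * plus a small register file indexed by that address. */
struct xsmi_mmd {
    uint16_t addr;                    /* undefined after reset in real hardware */
    uint16_t regs[XSMI_MODEL_REGS];
};

/* Apply one frame to the model. `field` is the 16-bit Addr/Data field of the
 * frame (ignored for reads). Returns the data driven by the slave for read
 * frames and 0 for all other frames. */
static uint16_t xsmi_frame(struct xsmi_mmd *m, enum xsmi_op op, uint16_t field)
{
    uint16_t idx = (uint16_t)(m->addr % XSMI_MODEL_REGS);
    uint16_t data = 0;

    switch (op) {
    case XSMI_OP_ADDRESS:             /* only address frames load the address register */
        m->addr = field;
        break;
    case XSMI_OP_WRITE:               /* address register is left unchanged */
        m->regs[idx] = field;
        break;
    case XSMI_OP_READ:                /* address register is left unchanged */
        data = m->regs[idx];
        break;
    case XSMI_OP_READ_INC:            /* read first, then advance the address */
        data = m->regs[idx];
        m->addr++;
        break;
    }
    return data;
}
```

In this model, an address frame followed by any number of read or write frames keeps targeting the same register, while each post-read-increment-address frame moves on to the next one.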
Table 16 outlines the XSMI Interface frames supported:

Table 16: XSMI Interface Framing
Frame | 32-Bit PRE | 2-Bit Start of Frame | 2-Bit OpCode | 5-Bit PHY Addr | 5-Bit Dev Addr | 2-Bit TA | 16-Bit Addr/Data | Idle
Address | 1...1 | 00 | 00 | PPPPP | DDDDD | 10 | AAAAAAAAAAAAAAAA | Z
Write | 1...1 | 00 | 01 | PPPPP | DDDDD | 10 | DDDDDDDDDDDDDDDD | Z
Read | 1...1 | 00 | 11 | PPPPP | RRRRR | z0 | DDDDDDDDDDDDDDDD | Z
Read Inc | 1...1 | 00 | 10 | PPPPP | RRRRR | z0 | DDDDDDDDDDDDDDDD | Z

Note
Although the SMI Interface requires a preamble of 32 bits, the devices are permanently programmed for preamble suppression. A minimum of one Idle MDC cycle is required between two consecutive transactions.

6.2.2 CPU SMI Interface

The device incorporates a Slave SMI interface, which is used to access address-mapped entities when the PCI interface is not used. This interface may also be used in parallel with the PCI Interface.

The CPU can access the device memory space via read and write cycles on the SMI Interface. The device registers are PCI-compatible registers with a 32-bit address and 32-bit data. Because the Slave SMI interface is compliant with IEEE 802.3u Clause 22, it uses a 5-bit PHY address, a 5-bit register address, and 16-bit data. Consequently, direct access to the device’s 32-bit registers is not supported.

Read and write accesses are performed indirectly, using the 16-bit data portion of the SMI transaction: the device’s register address is written in two SMI transactions, and the device’s register data is written or read in two further SMI transactions.

6.2.2.1 Device SMI PHY Address

All accesses to the device are performed using the device’s PHY address.
The device uses only one PHY address for all accesses to address-mapped entities. The PHY address is a 5-bit address. It is sampled at reset (for details, see the relevant device Hardware Specifications). The host CPU may change <DevicePhy Addr[4:0]>.

6.2.2.2 Device SMI Registers

Because the device uses one SMI PHY address, its SMI register space consists of 32 registers.
The device uses the following SMI registers:
• SMI Read-Write Status Register (Table 17 p. 83)
• SMI Write Address MSBs Register (Table 18 p. 83)
• SMI Write Address LSBs Register (Table 19 p. 83)
• SMI Write Data MSBs Register (Table 20 p. 83)
• SMI Write Data LSBs Register (Table 21 p. 84)
• SMI Read Address MSBs Register (Table 22 p. 84)
• SMI Read Address LSBs Register (Table 23 p. 84)
• SMI Read Data MSBs Register (Table 24 p. 84)
• SMI Read Data LSBs Register (Table 25 p. 85)

The following subsections define the offset and content of each SMI register.

SMI Read-Write Status Register
Read-only register that indicates the read and write status of the SMI interface.
Table 17: SMI Read-Write Status Register
Offset: 0x1F
Bits | Field | Type/InitVal | Description
15:2 | Reserved | RO | Reserved for future use.
1 | SMIWriteDone | RO, 0x1 | SMI Interface write status. When this bit is HIGH, the device has completed the current write transaction and is ready for another read or write transaction.
0 | SMIReadRdy | RO, 0x1 | An indication that the device has completed the read transaction and the read data is ready for the CPU to read in the SMI Read Data MSBs Register (Table 24 p. 84) and the SMI Read Data LSBs Register (Table 25 p. 85).

SMI Write Address MSBs Register
Write-only register for the 16 MSBs of the 32-bit address of the device in a write transaction.

Table 18: SMI Write Address MSBs Register
Offset: 0x0
Bits | Field | Type/InitVal | Description
15:0 | SMIWrAddr MSBs | WO | 16 MSBs of the 32-bit Address of the device in a write transaction.

SMI Write Address LSBs Register
Write-only register for the 16 LSBs of the 32-bit Address of the device in a write transaction.

Table 19: SMI Write Address LSBs Register
Offset: 0x1
Bits | Field | Type/InitVal | Description
15:0 | SMIWrAddr LSBs | WO | 16 LSBs of the 32-bit Address of the device in a write transaction.

SMI Write Data MSBs Register
Write-only register for the 16 MSBs of the 32-bit Data to be written to a device’s address-mapped entity.

Table 20: SMI Write Data MSBs Register
Offset: 0x2
Bits | Field | Type/InitVal | Description
15:0 | SMIWrData MSBs | WO | 16 MSBs of the 32-bit Data to be written to a device’s address-mapped entity.
SMI Write Data LSBs Register
Write-only register for the 16 LSBs of the 32-bit Data to be written to a device’s address-mapped entity.

Table 21: SMI Write Data LSBs Register
Offset: 0x3
Bits | Field | Type/InitVal | Description
15:0 | SMIWrData LSBs | WO | 16 LSBs of the 32-bit Data to be written to a device’s address-mapped entity.

SMI Read Address MSBs Register
Write-only register for the 16 MSBs of the 32-bit Address of the device in a read transaction.

Table 22: SMI Read Address MSBs Register
Offset: 0x4
Bits | Field | Type/InitVal | Description
15:0 | SMIRdAddr MSBs | WO | 16 MSBs of the 32-bit Address of the device in a read transaction.

SMI Read Address LSBs Register
Write-only register for the 16 LSBs of the 32-bit Address of the device in a read transaction.

Table 23: SMI Read Address LSBs Register
Offset: 0x5
Bits | Field | Type/InitVal | Description
15:0 | SMIRdAddr LSBs | WO | 16 LSBs of the 32-bit Address of the device in a read transaction.

SMI Read Data MSBs Register
Read-only register for the 16 MSBs of the 32-bit Data read from the device’s address-mapped entity.

Table 24: SMI Read Data MSBs Register
Offset: 0x6
Bits | Field | Type/InitVal | Description
15:0 | SMIRdData MSBs | RO | 16 MSBs of the 32-bit Data read from the device. NOTE: This register contains valid data only when <SMIReadRdy> in the SMI Read-Write Status Register (Table 17 p. 83) is HIGH.

SMI Read Data LSBs Register
Read-only register for the 16 LSBs of the 32-bit Data read from the device’s address-mapped entity.

Table 25: SMI Read Data LSBs Register
Offset: 0x7
Bits | Field | Type/InitVal | Description
15:0 | SMIRdData LSBs | RO | 16 LSBs of the 32-bit Data read from the device. NOTE: This register contains valid data only when <SMIReadRdy> in the SMI Read-Write Status Register (Table 17 p. 83) is HIGH.

6.2.2.3 Write Transaction

To write to a device’s address-mapped entity, perform the following SMI transactions:
1. Write the 16 MSBs of the device’s 32-bit Address to the SMI Write Address MSBs Register (Table 18 p. 83).
2. Write the 16 LSBs of the device’s 32-bit Address to the SMI Write Address LSBs Register (Table 19 p. 83).
3. Write the 16 MSBs of the device’s 32-bit Data to the SMI Write Data MSBs Register (Table 20 p. 83).
4. Write the 16 LSBs of the device’s 32-bit Data to the SMI Write Data LSBs Register (Table 21 p. 84).
5. Read the SMI Read-Write Status Register (Table 17 p. 83) until <SMIWriteDone> is HIGH.

6.2.2.4 Read Transaction

To read from a device’s address-mapped entity, perform the following SMI transactions:
1. Write the 16 MSBs of the device’s 32-bit Address to the SMI Read Address MSBs Register (Table 22 p. 84).
2. Write the 16 LSBs of the device’s 32-bit Address to the SMI Read Address LSBs Register (Table 23 p. 84).
3. Read the SMI Read-Write Status Register (Table 17 p.
83) until <SMIReadRdy> is HIGH.
4. Read the SMI Read Data MSBs Register (Table 24 p. 84).
5. Read the SMI Read Data LSBs Register (Table 25 p. 85).

6.2.3 Master SMI Interfaces

This section is relevant for the following devices:
• SecureSmart: 98DX106, 98DX163, 98DX163R, 98DX243, 98DX262
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
• Layer 2+ Stackable: 98DX130, 98DX166, 98DX246, 98DX250, 98DX260, 98DX270
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
Not relevant for the 98DX803.

The device maintains two IEEE 802.3 Clause 22 compliant Master Serial Management interfaces for managing external GbE PHY devices, as well as for Auto-Negotiation with an external GbE PHY device.

Master SMI Interface0 (M_MDC0/M_MDIO0) is used for managing GbE PHY devices connected to the device’s Tri-Speed ports 0 through 11.
Master SMI Interface1 (M_MDC1/M_MDIO1) is used for managing GbE PHY devices connected to the device’s Tri-Speed ports 12 through 23.

6.2.3.1 Tri-Speed Ports PHY Address

The device holds, in configurable registers, a PHY address for the PHY connected to each of the device’s ports. PHY address<n> is used by the SMI Interface when accessing the PHY device connected to port<n>.
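The port-to-interface split described above, together with the grouping of six PHY addresses per PHY Address Register given in the Configuration text that follows, reduces to simple integer arithmetic. The sketch below is illustrative only: `phy_addr_locate` and `struct phy_addr_loc` are hypothetical names, and the position of a port’s field inside its register (`field_idx`) is an assumption based on port order, not something this section states.

```c
#include <stddef.h>
#include <stdint.h>

#define TRI_SPEED_PORTS      24u  /* Tri-Speed ports 0 through 23 */
#define PORTS_PER_MASTER_SMI 12u  /* ports 0-11: interface 0, ports 12-23: interface 1 */
#define PHY_ADDRS_PER_REG    6u   /* each PHY Address Register holds six PHY addresses */

/* Hypothetical result type: where the PHY address of a port is configured
 * and which Master SMI interface reaches that PHY. */
struct phy_addr_loc {
    uint8_t smi_if;     /* Master SMI interface index: 0 or 1 */
    uint8_t reg_idx;    /* PHY Address Register<reg_idx>: 0 through 3 */
    uint8_t field_idx;  /* ASSUMED position of the field within the register: 0 through 5 */
};

/* Returns 0 on success, or -1 if `port` is not a Tri-Speed port or `out` is NULL. */
static int phy_addr_locate(unsigned port, struct phy_addr_loc *out)
{
    if (port >= TRI_SPEED_PORTS || out == NULL)
        return -1;

    out->smi_if    = (uint8_t)(port / PORTS_PER_MASTER_SMI);
    out->reg_idx   = (uint8_t)(port / PHY_ADDRS_PER_REG);
    out->field_idx = (uint8_t)(port % PHY_ADDRS_PER_REG);
    return 0;
}
```

For example, port 13 resolves to Master SMI Interface1 and PHY Address Register2, in line with the port ranges given in the text.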
Configuration
To configure the PHY addresses of the PHY devices connected to the Tri-Speed ports, set the PHY address fields in the following registers accordingly:
• PHY Address Register0 (for Ports 0 through 5) (Table 258 p. 514) holds six PHY addresses, <PhyAd0> through <PhyAd5>, for PHY devices connected to ports 0 through 5.
• PHY Address Register1 (for Ports 6 through 11) (Table 259 p. 515) holds six PHY addresses for PHY devices connected to ports 6 through 11.
• PHY Address Register2 (for Ports 12 through 17) (Table 260 p. 516) holds six PHY addresses for PHY devices connected to ports 12 through 17.
• PHY Address Register3 (for Ports 18 through 23) (Table 261 p. 516) holds six PHY addresses for PHY devices connected to ports 18 through 23.

6.2.3.2 Tri-Speed Ports PHY Registers Management via SMI Interface

The device provides a mechanism for read and write access to PHY registers. PHY register read and write transactions are supported by both SMI interfaces, using the SMI0 Management Register (Table 256 p. 513) for accessing PHY devices connected to Master SMI interface0 and the SMI1 Management Register (Table 257 p. 514) for accessing PHY devices connected to Master SMI interface1.

6.2.3.3 Reading a PHY Register

To read the value of a PHY register via the PCI Interface, TWSI Interface, or CPU SMI interface, perform the following procedure.
The procedure is described for Master SMI Interface0, but it is the same for Master SMI Interface1:
1. Read the SMI0 Management Register (Table 256 p. 513) until <Busy> is 0. When the read is performed via the TWSI Interface, which is considerably slower than the SMI Interface, this stage may be skipped.
2. Write the following to the SMI0 Management Register (Table 256 p. 513):
   <OpCode> - 1: Op code for read transaction.
   <RegAd> - Address of the register to be read.
   <PHYAd> - PHY address from which the register is to be read.
3. Read the SMI0 Management Register (Table 256 p. 513) until <ReadValid> is set to 1. When <ReadValid> is 1, the content of the read register is in <Data>.

6.2.3.4 Writing to a PHY Register

To write the value of a PHY register via the PCI Interface, TWSI Interface, or CPU SMI interface, perform the following procedure. The procedure is described for Master SMI Interface0, but it is the same for Master SMI Interface1:
1. Read the SMI0 Management Register (Table 256 p. 513) until <Busy> is 0. When the read is performed via the TWSI Interface, which is considerably slower than the SMI Interface, this stage may be skipped.
2. Write the following to the SMI0 Management Register (Table 256 p. 513):
   <OpCode> - 0: Op code for write transaction.
   <RegAd> - Address of the register to be written.
   <PHYAd> - PHY address in which the register is to be written.
   <Data> - Data to be written into the register.
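The read and write procedures above can be sketched in C as follows. This is a minimal illustration, not driver code: the bit positions assigned to <Data>, <PHYAd>, <RegAd>, <OpCode>, <ReadValid>, and <Busy> are assumptions made for the example (the authoritative layout is Table 256), the `smi_mgmt` accessor structure stands in for whichever host interface is used, the poll limit is arbitrary, and `fake_dev` is a stand-in that completes every transaction immediately so the sketch can run without hardware.

```c
#include <stdint.h>

/* ASSUMED field layout of the SMI0 Management Register; see Table 256 for the real one. */
#define SMI_DATA_MASK   0xFFFFu
#define SMI_PHYAD_SHIFT 16
#define SMI_REGAD_SHIFT 21
#define SMI_OPCODE_READ (1u << 26)  /* <OpCode> = 1: read, <OpCode> = 0: write */
#define SMI_READ_VALID  (1u << 27)
#define SMI_BUSY        (1u << 28)
#define SMI_POLL_LIMIT  1000u       /* hypothetical bound so a dead bus cannot hang the caller */

/* Hypothetical accessors for the management register (PCI, TWSI, or CPU SMI). */
struct smi_mgmt {
    uint32_t (*read)(void *ctx);
    void (*write)(void *ctx, uint32_t value);
    void *ctx;
};

/* Poll until (register & mask) == want; optionally return the last value read. */
static int smi_wait(const struct smi_mgmt *r, uint32_t mask, uint32_t want, uint32_t *last)
{
    for (unsigned i = 0; i < SMI_POLL_LIMIT; i++) {
        uint32_t v = r->read(r->ctx);
        if ((v & mask) == want) {
            if (last)
                *last = v;
            return 0;
        }
    }
    return -1;
}

static int phy_read(const struct smi_mgmt *r, unsigned phy, unsigned reg, uint16_t *data)
{
    uint32_t v;
    if (smi_wait(r, SMI_BUSY, 0, 0) != 0)                                /* step 1 */
        return -1;
    r->write(r->ctx, SMI_OPCODE_READ | ((phy & 0x1Fu) << SMI_PHYAD_SHIFT)
                                     | ((reg & 0x1Fu) << SMI_REGAD_SHIFT)); /* step 2 */
    if (smi_wait(r, SMI_READ_VALID, SMI_READ_VALID, &v) != 0)            /* step 3 */
        return -1;
    *data = (uint16_t)(v & SMI_DATA_MASK);
    return 0;
}

static int phy_write(const struct smi_mgmt *r, unsigned phy, unsigned reg, uint16_t data)
{
    if (smi_wait(r, SMI_BUSY, 0, 0) != 0)                                /* step 1 */
        return -1;
    r->write(r->ctx, ((phy & 0x1Fu) << SMI_PHYAD_SHIFT)
                   | ((reg & 0x1Fu) << SMI_REGAD_SHIFT) | data);         /* step 2 */
    return 0;
}

/* Stand-in for the device, for illustration only: 32 PHYs x 32 registers. */
struct fake_dev { uint32_t mgmt; uint16_t phy_regs[32][32]; };

static uint32_t fake_read(void *ctx) { return ((struct fake_dev *)ctx)->mgmt; }

static void fake_write(void *ctx, uint32_t v)
{
    struct fake_dev *d = ctx;
    unsigned phy = (v >> SMI_PHYAD_SHIFT) & 0x1Fu;
    unsigned reg = (v >> SMI_REGAD_SHIFT) & 0x1Fu;
    if (v & SMI_OPCODE_READ) {
        d->mgmt = SMI_READ_VALID | d->phy_regs[phy][reg];
    } else {
        d->phy_regs[phy][reg] = (uint16_t)(v & SMI_DATA_MASK);
        d->mgmt = 0;
    }
}
```

The bounded polling loop is a design choice of this sketch. The procedures above simply poll until the status bit changes, and the remark that the <Busy> poll may be skipped over TWSI is not modeled here.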
6.2.3.5 SMI Auto-Negotiation and Auto-Media Selection

The device uses a standard master Serial Management Interface bus for reading from and writing to the PHY registers. In addition, the PHY polling unit performs Auto-Negotiation with PHY devices attached to the Tri-Speed ports via the Master SMI Interface. For ports that are Auto-Media Selection enabled, the PHY polling unit also polls the PHY registers to determine the medium connected to the port (fiber or copper).

In normal operation, the device polls the Status register of each PHY in a round-robin manner. The PHY polling unit uses Master SMI Interface0 for Auto-Negotiation and for polling PHY devices connected to Tri-Speed ports 0 through 11, and uses Master SMI Interface1 for Auto-Negotiation and for polling PHY devices connected to Tri-Speed ports 12 through 23. Thus, PHY devices connected to Tri-Speed ports 0 through 11 must be connected to Master SMI Interface0, and PHY devices connected to Tri-Speed ports 12 through 23 must be connected to Master SMI Interface1.

If the device detects a link change from down to up on one of the ports, it performs a series of register reads from the PHY and updates the Auto-Negotiation results in the device’s registers. The Port MAC Status register is updated with these results only if Auto-Negotiation is enabled (see Port<n> Status Register0 (0<=n<24, CPUPort = 63) (Table 149 p. 418)).

The device enables full configuration of the Auto-Negotiation functionality via the Port<n> AutoNegotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415).
For further details about Auto-Negotiation settings and Auto-Media Select, see Section 6.5.3 "CPU Port MAC Operation and Configuration" on page 95.

6.2.4 Master XSMI Interface

This section is relevant for the following devices:
• SecureSmart: 98DX262
• SecureSmart Stackable: 98DX269
• Layer 2+ Stackable: 98DX130, 98DX260, 98DX270, 98DX803
• Multilayer Stackable: 98DX133, 98DX263, 98DX273

The 98DX130, 98DX133, 98DX260, 98DX262, 98DX263, 98DX270, 98DX273, and 98DX803 maintain an IEEE 802.3 Clause 45 compliant Master Serial Management interface for managing their integrated XAUI PHYs and other devices, such as XFP PHYs, connected to their HyperG.Stack ports.

Since the 98DX130 and 98DX133 integrate one XAUI PHY, the 98DX260, 98DX262, and 98DX263 integrate two, and the 98DX270, 98DX273, and 98DX803 integrate three, the XSMI Master must be connected to the slave XSMI interfaces of these XAUI PHYs, as illustrated in Figure 12. This connection is made over the board, so that additional devices with a Slave XSMI interface can be connected to the device’s Master XSMI interface.

The device provides a mechanism for register read and write access via the PCI interface, TWSI interface, or CPU SMI interface. This access consists of two or three phases. The first phase is performed as a regular general access via the PCI/SMI/TWSI interface, to configure the XSMI transaction parameters (located in the XSMI Management Register (Table 131 p. 401) and the XSMI Address Register (Table 132 p. 403)). The next phase is performed by the XSMI Master, which generates the appropriate transaction over the XSMI bus. If this is a read transaction, there is a third phase in which the CPU reads the XSMI Management Register (Table 131 p. 401) via the PCI/SMI/TWSI interface until it determines that the read data within it is valid (according to the <ReadValid> bit).
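The two- and three-phase access described above can be sketched as follows. This is a hedged illustration rather than the device’s programming model: `struct xsmi_mgmt` is a hypothetical decoded view of the XSMI Management Register fields (the real bit positions are in Table 131 and are deliberately not assumed), the `xsmi_port` hooks stand for register accesses over PCI, TWSI, or CPU SMI, the poll limit is arbitrary, and `fake_xsmi` is a stand-in target that completes phase two instantly. The <OpCode> values 5 and 7 are the ones listed in Section 6.2.4.1.

```c
#include <stddef.h>
#include <stdint.h>

/* <OpCode> values used here, as listed in Section 6.2.4.1. */
enum { XSMI_OP_ADDR_WRITE = 5, XSMI_OP_ADDR_READ = 7 };

#define XSMI_POLL_LIMIT 1000u  /* hypothetical bound on status polling */

/* Hypothetical decoded view of the XSMI Management Register fields. */
struct xsmi_mgmt {
    unsigned opcode, phy_ad, dev_ad;
    uint16_t data;
    int busy, read_valid;
};

/* Hypothetical platform hooks; each call is one register access by the host. */
struct xsmi_port {
    void (*write_addr)(void *ctx, uint16_t addr);              /* XSMI Address Register <Addr> */
    void (*write_mgmt)(void *ctx, const struct xsmi_mgmt *m);  /* starts the XSMI transaction */
    struct xsmi_mgmt (*read_mgmt)(void *ctx);
    void *ctx;
};

/* Phase 1: wait for <Busy> to clear, then program the address and the request.
 * Phase 2 (the frames on the XSMI bus) is carried out by the XSMI Master itself. */
static int xsmi_start(const struct xsmi_port *p, const struct xsmi_mgmt *req, uint16_t addr)
{
    unsigned i;
    for (i = 0; i < XSMI_POLL_LIMIT && p->read_mgmt(p->ctx).busy; i++)
        ;
    if (i == XSMI_POLL_LIMIT)
        return -1;
    p->write_addr(p->ctx, addr);
    p->write_mgmt(p->ctx, req);
    return 0;
}

static int xsmi_write(const struct xsmi_port *p, unsigned phy, unsigned dev,
                      uint16_t addr, uint16_t data)
{
    struct xsmi_mgmt req = { XSMI_OP_ADDR_WRITE, phy, dev, data, 0, 0 };
    return xsmi_start(p, &req, addr);
}

static int xsmi_read(const struct xsmi_port *p, unsigned phy, unsigned dev,
                     uint16_t addr, uint16_t *data)
{
    struct xsmi_mgmt req = { XSMI_OP_ADDR_READ, phy, dev, 0, 0, 0 };
    if (xsmi_start(p, &req, addr) != 0)
        return -1;
    /* Phase 3 (reads only): poll until <ReadValid> reports valid data. */
    for (unsigned i = 0; i < XSMI_POLL_LIMIT; i++) {
        struct xsmi_mgmt st = p->read_mgmt(p->ctx);
        if (st.read_valid) {
            *data = st.data;
            return 0;
        }
    }
    return -1;
}

/* Stand-in target for illustration: one device with 256 registers. */
struct fake_xsmi { uint16_t addr; struct xsmi_mgmt mgmt; uint16_t regs[256]; };

static void fake_write_addr(void *ctx, uint16_t a) { ((struct fake_xsmi *)ctx)->addr = a; }
static struct xsmi_mgmt fake_read_mgmt(void *ctx) { return ((struct fake_xsmi *)ctx)->mgmt; }

static void fake_write_mgmt(void *ctx, const struct xsmi_mgmt *m)
{
    struct fake_xsmi *f = ctx;
    f->mgmt = *m;
    f->mgmt.busy = 0;
    if (m->opcode == XSMI_OP_ADDR_WRITE) {
        f->regs[f->addr & 0xFFu] = m->data;
    } else if (m->opcode == XSMI_OP_ADDR_READ) {
        f->mgmt.data = f->regs[f->addr & 0xFFu];
        f->mgmt.read_valid = 1;
    }
}
```

Keeping the register fields in a decoded structure avoids guessing bit positions; a real driver would pack and unpack them according to Table 131 and Table 132.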
Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 87 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 In addition, the PHY polling unit performs Auto-Negotiation with PHY devices attached to the Tri-Speed ports via the Master SMI Interface. M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Serial Management Interfaces (SMI) XSMI Transactions AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The XSMI Master is able to perform the following operations, which are coded at the <OpCode> field in the XSMI Management Register (Table 131 p. 401): • Write only (Opcode = 1). • Incremented address read only (Opcode = 2). • Read only (Opcode = 3). • Address then Write (Opcode = 5). • Address then incremented address read (Opcode = 6). • Address then Read (Opcode = 7). The “Read/Write only” consists of content frame only, whereas the Address-then-read/write/incremented read consist sof address and content frames. The “Incremented address read” is content-only access, which is followed by an increment of the stored address at the PHY device. This type of operation is used in order to efficiently access several PHY registers mapped to successive addresses. The following subsections describe the details of several basic management operations via the XSMI Master. 6.2.4.2 Reading a PHY Register 6.2.4.3 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 To read the value of a PHY register, perform the following procedure: 1. Read XSMI Management Register (Table 131 p. 401) until <Busy> is 0. When Read is performed via the TWSI Interface, which is considerably slower than the PCI/SMI Interfaces, this stage may be skipped. 2. Write the PHY register Address to the <Addr> field in the XSMI Address Register (Table 132 p. 403). 3. 
Write the following to the XSMI Management Register (Table 131 p. 401): <OpCode> - 7: Op code for Address then Read transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. 4. Set <ReadValid> in the Read XSMI Management Register (Table 131 p. 401). When <ReadValid> is 1, the content of the Read register is in the Data field. Writing to a PHY Register To write to a PHY register, perform the following procedure: 1. Read the XSMI Management Register (Table 131 p. 401) until <Busy> is 0. When Read is performed via the TWSI Interface, which is considerably slower than the PCI/SMI Interfaces, this stage may be skipped. 2. Write the PHY register Address to the <Addr> field in the XSMI Address Register (Table 132 p. 403). 3. Write the following to the XSMI Management Register (Table 131 p. 401): <OpCode>- 5: Op code for Address then Write transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. <Data> - Write data. 6.2.4.4 Read Modify Write to a PHY Register For a “Read Modify Write” operation of a PHY register, perform the following procedure: 1. Read XSMI Management Register (Table 131 p. 401) until <Busy> is 0. When Read is performed via the TWSI Interface, which is considerably slower than the PCI/SMI Interfaces, this stage may be skipped. 2. Write the PHY register Address to the <Addr> field in the XSMI Address Register (Table 132 p. 403). MV-S102110-02 Rev. E Page 88 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 6.2.4.1 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Host Management Interfaces 4. 5. 
6.2.4.5 Read of Several Successive PHY Registers 6.2.5 R M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 To read several successive PHY registers, perform the following procedure: 1. Read the XSMI Management Register (Table 131 p. 401) until <Busy> is 0. When Read is performed via the TWSI Interface, which is considerably slower than the PCI/SMI Interfaces, this stage may be skipped. 2. Write the first PHY register Address to the <Addr> field in the XSMI Address Register (Table 132 p. 403). 3. Write the following to the XSMI Management Register (Table 131 p. 401): <OpCode> - 6: Op code for Address then incremented address Read transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. 4. Read XSMI Management Register (Table 131 p. 401) <ReadValid>. When <ReadValid> is 1, the content of the Read register is in the Data field. 5. Perform steps 3 and 4 as many times as necessary, to read all registers except the last one. 6. On the last register read, write the following to the XSMI Management Register (Table 131 p. 401): <OpCode> - 3: Op code for Read only transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. 7. Read XSMI Management Register (Table 131 p. 401) <ReadValid>. When <ReadValid> is 1, the content of the Read register is in the Data field. Slave XSMI Interfaces This section is relevant for the following devices: D SecureSmart: 98DX262 D SecureSmart Stackable: 98DX269 D Layer 2+ Stackable: 98DX260, 98DX270 D Multilayer Stackable: 98DX263, 98DX273 Each of the HyperG.Stack ports incorporates a XAUI PHY. The transceiver XGXS/XAUI function is controlled by a dedicated XMDIO management interface, as defined in IEEE 802.3ae Clause 45. The register map is composed of IEEE-defined manageable device registers and vendor-specific registers. The XGXS/XAUI can be configured as a PCS, PHY, and DTE device. 
Each of these Slave XSMI Interfaces is connected to the device’s Master XSMI Interface, as illustrated in Figure 12. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 89 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Write the following to the XSMI Management Register (Table 131 p. 401): <OpCode> - 7: Op code for Address then Read transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. Read XSMI Management Register (Table 131 p. 401) <ReadValid>. When <ReadValid> is 1, the content of the Read register is in the Data field. Write the following to the XSMI Management Register (Table 131 p. 401): <OpCode>- 5: Op code for Address then Write transaction. <PHYAd> - Address of the PHY. <DevAd>- Address of the device within the PHY. <Data> - Write data. AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 3. M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Serial Management Interfaces (SMI) Configuration Two Wire Serial Interface (TWSI) This section describes the device’s Two-Wire Serial Interface (TWSI). The device provides full TWSI support. It acts as a master—generating read/write requests, as well as a slave— responding to read/write requests. The device fully supports multiple TWSI master environments (clock synchronization, bus arbitration etc.). The primary use of the TWSI interface is for serial ROM initialization. 6.3.1 TWSI Overview The TWSI is used as a master for EPROM initialization of the device. After the EEPROM initialization phase is done, the TWSI moves to TWSI slave mode. It can then be used for read and write access of all of the device’s address-mapped entities 6.3.2 TWSI Bus Operation M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The TWSI can be operated with a 100 KHz clock. 
The TWSI port consists of two open-drain signals: Serial Clock (SCL) and Serial Data/Address (SDA). The TWSI master starts a transaction by driving a start condition followed by a 7-bit or 10-bit slave address and a read/write bit indication. The target TWSI slave responds with an acknowledge.

For a write access (R/W bit is 0), following the acknowledge, the master drives 8-bit data and the slave responds with an acknowledge. This write access (8-bit data followed by an acknowledge) continues until the TWSI master ends the transaction with a stop condition.

For a read access, following the slave address acknowledge, the TWSI slave drives 8-bit data and the master responds with an acknowledge. This read access (8-bit data followed by an acknowledge) continues until the TWSI master ends the transaction by responding with no acknowledge to the last 8-bit data, followed by a stop condition.

If a target slave cannot drive valid read data immediately after it has received the address, it can insert “wait states” by forcing SCL low until it has valid data to drive on the SDA line.

A master is allowed to combine two transactions. After the last data transfer, it can drive a new start condition followed by a new slave address, rather than driving a stop condition. Transaction combining guarantees that the master will not lose arbitration to another TWSI master.

6.3.3 Serial ROM Initialization

The device supports initialization of all its configuration registers and memories, as well as other system components, through the TWSI master interface. At exit from reset, the device’s TWSI master starts reading initialization data from the serial ROM and writes it to the appropriate registers or RAM arrays.

EEPROM initialization can be triggered by a global hard or soft reset of the device.

The size of the EEPROM is determined according to the reset sampling (see Prestera® Hardware Specifications).
Configuration of the Slave XSMI Interfaces (Section 6.2.5)
In the Port<n> XAUI PHY Configuration Register1 (24<=n<27) (Table 166 p. 436):
• To configure the XAUI transceiver SMI PHY address, set the <PHYAddr> field accordingly.
• To configure the XAUI transceiver SMI Device address, set the <PHYDevAddr> field accordingly.
• To configure the XAUI transceiver to accept any device address on the SMI interface, set the <AnyDevAddrEn> bit.

6.3.3.1 Serial ROM Data Structure

The Serial ROM data structure consists of a sequence of 32-bit address and 32-bit data pairs, as illustrated in Figure 15.

          MSB                                                        LSB
Start     Address0[31:24]  Address0[23:16]  Address0[15:8]  Address0[7:0]
          Data0[31:24]     Data0[23:16]     Data0[15:8]     Data0[7:0]
          Address1[31:24]  Address1[23:16]  Address1[15:8]  Address1[7:0]
          .....
          Address[31:0]/Data[31:0]
          .....
          last_address[31:0], default: 0xFFFFFFFF

Figure 15: Serial ROM Data Structure

The device reads eight bytes at a time. The first four bytes are treated as address and the last four bytes are treated as data.

Address bit [31] indicates which device internal address space is accessed:
Address[31] = 0: Access to the PCI register address space.
Address[31] = 1: Access to the Device address space.

The TWSI Last Address Register (Table 97 p. 386) contains the expected value of the last serial data item (default value is 0xFFFFFFFF).
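The address/data pair layout can be illustrated with a short Python sketch that walks a serial ROM image eight bytes at a time. Several points are assumptions made for the example rather than statements from this specification: the function name parse_serial_rom is hypothetical, each 32-bit word is taken most significant byte first as drawn in Figure 15, and the walk ends when the address word of a pair equals the Last Address value, with that terminating pair not reported.

```python
import struct

LAST_ADDRESS_DEFAULT = 0xFFFFFFFF  # default of the TWSI Last Address Register


def parse_serial_rom(image, last_address=LAST_ADDRESS_DEFAULT):
    """Return a list of (space, address, data) tuples from a ROM image.

    Assumptions: words are stored MSB first (Figure 15), and the pair whose
    address word equals `last_address` ends the walk without being reported.
    """
    entries = []
    # Step through the image in 8-byte records; ignore a trailing partial one.
    for offset in range(0, len(image) - 7, 8):
        address, data = struct.unpack_from(">II", image, offset)
        if address == last_address:
            break
        # Address[31] selects the internal address space.
        space = "device" if address & 0x80000000 else "pci"
        entries.append((space, address, data))
    return entries


# Example image with two pairs, the terminator, and one pair that is ignored.
# The address and data values are arbitrary illustration values.
example = struct.pack(
    ">8I",
    0x80000058, 0x00000001,
    0x00000010, 0x0000ABCD,
    0xFFFFFFFF, 0xFFFFFFFF,
    0x80000000, 0x00000005,
)
parsed = parse_serial_rom(example)
```

A tool that generates ROM images could run such a check to confirm that every intended pair precedes the terminating entry.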
When the device reaches the last data, it stops the initialization sequence.

Notes
• The data read from the Serial ROM is assumed to have Little Endian byte ordering.
• Use EEPROMs with an 8- or 16-bit slave address, according to the size of the EEPROM device used.
• The EEPROM device slave address is sampled at reset (for details, see the relevant device Hardware Specifications).
• The device must not be reset when it is in the middle of a master EEPROM transaction. This may cause the EEPROM to hang.

6.3.3.2 Disabling the TWSI Interface

When the TWSI interface is not in use, it must be disabled as follows:
1. Tie SCL and SDA to pullup VDD_MISC. Each pin must use a separate resistor.
2. Tie SRESET_EEPROM_DIS to a pullup resistor (see the Prestera® Hardware Specification).

6.3.4 TWSI Bus Operation After Initialization

After EEPROM initialization has been completed, the TWSI interface enters TWSI slave mode. It can then be used for read and write access of all the device registers that are mapped to the device address space.

The device TWSI slave interface address is 7 bits. The value of the four highest bits and the value of the lowest three bits are sampled (for details, see the relevant device Hardware Specifications).

To transfer the register address and operation code, the Master writes four bytes. This is followed by four byte writes to, or reads from, the register data, according to the operation code.
The address and data format are illustrated in the following figures:

Figure 16: TWSI Bus Transaction, External Master Write to a Device Register
(Every ACK is driven by the Slave; all other states are driven by the Master.)
   Start
   Device TWSI Addr[7:1], R/W=0                 ACK
   Data0[7:0] = {0,0,RegAddr[29:24]}            ACK
   Data1[7:0] = RegAddr[23:16]                  ACK
   Data2[7:0] = RegAddr[15:8]                   ACK
   Data3[7:0] = RegAddr[7:0]                    ACK
   Data4[7:0] = RegData[31:24]                  ACK
   Data5[7:0] = RegData[23:16]                  ACK
   Data6[7:0] = RegData[15:8]                   ACK
   Data7[7:0] = RegData[7:0]                    ACK
   Stop

Figure 17: TWSI Bus Transaction, External Master Read from a Device Register
(In the address phase every ACK is driven by the Slave. In the data phase the Slave drives the data bytes and the Master drives the ACKs.)
   Start
   Device TWSI Addr[7:1], R/W=0                 ACK
   Data0[7:0] = {1,0,RegAddr[29:24]}            ACK
   Data1[7:0] = RegAddr[23:16]                  ACK
   Data2[7:0] = RegAddr[15:8]                   ACK
   Data3[7:0] = RegAddr[7:0]                    ACK
   Stop
   Start
   Device TWSI Addr[7:1], R/W=1                 ACK
   Data0[7:0] = RegData[31:24]                  ACK
   Data1[7:0] = RegData[23:16]                  ACK
   Data2[7:0] = RegData[15:8]                   ACK
   Data3[7:0] = RegData[7:0]
   Stop

To perform write access, the following sequence from an external TWSI master is applied:
• The Master transfers a 32-bit address in which bits [31:30] are {0,0} and bits [29:0] are the offset of the register address in the device address space.
• The Master transfers 32 bits of data to be written to the addressed register.

To perform read access, the following sequence from an external TWSI master is applied:
• The Master transfers a 32-bit address in which bits [31:30] are {1,0} and bits [29:0] are the offset of the register address in the device address space.
• The Master reads the 32 bits of register data as four 8-bit reads.
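The byte sequences of Figure 16 and Figure 17 can be sketched in Python as follows. This is a minimal illustration, not a driver: the function names are hypothetical, the slave address byte and the bus signalling (Start, ACK, Stop) are left to whatever TWSI master is used, and the register offset 0x02040000 in the example is an arbitrary value.

```python
READ_OPCODE_BITS = 0b10  # bits [31:30] = {1,0} select a read access


def twsi_write_frame(reg_addr, reg_data):
    """Return the eight data bytes of an external-master register write.

    Bytes 0-3 carry {0,0,RegAddr[29:0]} and bytes 4-7 carry RegData[31:0],
    each most significant byte first (Figure 16).
    """
    if not 0 <= reg_addr < (1 << 30):
        raise ValueError("register offset must fit in bits [29:0]")
    if not 0 <= reg_data < (1 << 32):
        raise ValueError("register data must fit in 32 bits")
    return reg_addr.to_bytes(4, "big") + reg_data.to_bytes(4, "big")


def twsi_read_address_frame(reg_addr):
    """Return the four address bytes that precede a register read (Figure 17)."""
    if not 0 <= reg_addr < (1 << 30):
        raise ValueError("register offset must fit in bits [29:0]")
    word = (READ_OPCODE_BITS << 30) | reg_addr
    return word.to_bytes(4, "big")


def twsi_decode_read_data(data_bytes):
    """Assemble the four bytes returned by the slave into a 32-bit value."""
    if len(data_bytes) != 4:
        raise ValueError("expected exactly four data bytes")
    return int.from_bytes(data_bytes, "big")


write_bytes = twsi_write_frame(0x02040000, 0x12345678)
read_bytes = twsi_read_address_frame(0x02040000)
value = twsi_decode_read_data(bytes([0x12, 0x34, 0x56, 0x78]))
```

Keeping the frame construction separate from the bus transfer makes it easy to check the encoding on a host before using it with real hardware.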
6.4 Device Address Space

The address space is accessible via the PCI interface, the CPU Slave SMI interface, or the TWSI interfaces.
Table 26 outlines the device’s address space partitioning.

Table 26: Address Space Partitioning

Bit #   Definition and Notes
31      Reserved. When accessing the register Address Space from the TWSI bus, this bit is significant:
        0 = Access to PCI space.
        1 = Access to the device’s space.
30      0 = Access to the device’s Address space. Set to 0.
29      0 = Internal register or tables. Set to 0.
28:23   When Addressing Internal Registers and Tables ([29] = 0):
        0 = Global registers, SDMA registers, Master XSMI registers and TWSI registers.
        1–2 = Reserved.
        3 = Transmit Queue registers.
        4 = Ethernet Bridge registers.
        5 = Reserved.
        6 = Buffer Management register.
        7 = Reserved.
        8 = Ports group0 configuration registers (port0 through port5), LEDs interface0 configuration registers, and Master SMI interface0 registers.
        9 = Ports group1 configuration registers (port6 through port11) and LEDs interface0.
        10 = Ports group2 configuration registers (port12 through port17), LEDs interface1 configuration registers, and Master SMI interface1 registers.
        11 = Ports group3 configuration registers (port18 through port23) and LEDs interface1.
        12 = MAC Table Memory.
        13 = Internal Buffers memory Bank0 address space.
        14 = Internal Buffers memory Bank1 address space.
        15 = Buffers memory block configuration registers.
        20 = VLAN table configuration registers and VLAN table address space.
        21 = Ports registers, including CPU port (port# 0x3F). Bits [13:8] are used as port number. Bits [7:2] are used as the register address. Bits [1:0] should be “00”.
        22 = Eq.
        23 = PCL.
        24 = Policer.
        25–63 = Reserved.

6.5 CPU MII/GMII/RGMII Port

The device implements a standard Gigabit Ethernet MAC over MII, GMII, or RGMII, to interface with the host CPU. This is an alternative interface for the CPU to send/receive traffic to/from the device.

Notes
• In PCI systems, the PCI is used as the CPU packet interface. For details, see 6.1.5 "Packet Reception and Transmission".
• This CPU MII/GMII/RGMII interface must not be used when the PCI Interface is used.

6.5.1 Port Type Configuration

The device’s CPU port interface type is sampled at reset (for details, see the relevant device Hardware Specifications). Table 27 describes the CPU interface type according to this configuration. When the CPU port is not used, set CPU_IF_TYPE[1:0] to 0.
Table 27: CPU Port Interface According to CPU_IF_TYPE[2:0]

CPU_IF_TYPE[1:0]   CPU Port Interface
0                  CPU port interface is MII MAC Mode
1                  CPU port interface is MII PHY Mode
2                  CPU port interface is GMII
3                  CPU port interface is RGMII

For further details regarding the CPU Interface type setting, see the Prestera® Hardware Specification.

The CPU interface type is reflected in the <CPUPortIFType> field of the CPU Port Global Configuration Register (Table 100 p. 387).

6.5.2 CPU Port MAC Overview

The CPU Port MAC implements a standard Gigabit Ethernet MAC over MII, GMII, or RGMII, to interface with the host CPU.

The device implements a standard MAC, which filters out received frames shorter than 64 bytes or longer than the Maximum Receive Unit. It also filters out received packets with a bad CRC or those in which a receive error occurred during packet reception. The MAC also maintains the minimum IPG restriction on transmitted packets. In 10/100 Mbps half-duplex modes it implements the CSMA/CD protocol (collision detect and retransmit).

The MAC has a set of LED indicators (Section 17. "LED Interface" on page 318).

The 10/100 Mbps speeds and half-duplex mode are supported in MII and RGMII modes only.

The CPU port does not support Auto-Negotiation and must be set manually for Link, Speed, Duplex mode and Flow Control support.

All packets received via the CPU port must be DSA-tagged (see Section 7. "CPU Traffic Management" on page 102).
6.5.3 CPU Port MAC Operation and Configuration

This section specifies the MAC operation and configuration.

6.5.3.1 CPU Port Activation

Configuration
To activate/de-activate the CPU Port, set the <CPUPortActive> bit in the CPU Port Global Configuration Register (Table 100 p. 387).

6.5.3.2 Port Enable

See Section 9.4.1 "Port Enable" on page 144 for Tri-Speed ports.

6.5.3.3 Link State

As the CPU port does not support Auto-Negotiation, its link state must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To force a Tri-Speed port link state to up, set the <ForceLinkUp> bit.
• To force a Tri-Speed port link state to down, set the <ForceLinkDown> bit.

Interrupt
The <LinkStatus Change> field in the Port<n> Interrupt Cause Register (0<=n<24, CPUPort = 63) (Table 571 p. 802) is set upon any change of the Tri-Speed Port link state.

6.5.3.4 Port Status Register

See Section 9.4.3 "Port Status Register" on page 145 for Tri-Speed ports.

6.5.3.5 Disable CRC Checking on Received Packets

See Section 9.4.4 "Disable CRC Checking on Received Packets" on page 145 for Tri-Speed ports.

6.5.3.6 Short Packets Padding

See Section 9.4.5 "Short Packets Padding" on page 146 for Tri-Speed ports.

6.5.3.7 Preamble Length

See Section 9.4.6 "Preamble Length" on page 146 for Tri-Speed ports.
6.5.3.8 Maximum Receive Unit (MRU)

See Section 9.4.7 "Maximum Receive Unit (MRU)" on page 147 for Tri-Speed ports that are configured as Cascading ports.

Regarding CPU port activation (Section 6.5.3.1): to save power, the CPU port is inactive by default. For it to be operational, the CPU port must be activated.

6.5.3.9 802.3x Flow Control

As the CPU port does not support Auto-Negotiation, Flow Control support must be set manually.

See Section 9.4.8 "802.3x Flow Control" on page 148 for Tri-Speed ports.

6.5.3.10 Back Pressure in Half-Duplex Mode

6.5.3.11 Speed Setting

Speed Setting in MII MAC Mode or MII PHY Mode
In MII MAC mode or MII PHY mode, the MAC speed may be set to 10 Mbps or 100 Mbps. As Auto-Negotiation is not supported on the CPU port, the port speed must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnSpeedEn> bit.
• Clear the <SetGMIISpeed> bit.
• To set the port’s speed to 10 Mbps or 100 Mbps, set the <SetMIISpeed> bit accordingly.

Speed Setting in GMII Mode
In GMII mode, the MAC speed must be set to 1000 Mbps. As Auto-Negotiation is not supported on the CPU port, this must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnSpeedEn> bit.
• Set the <SetGMIISpeed> bit.

Speed Setting in RGMII Mode
In RGMII mode, the MAC speed must be set to 10/100/1000 Mbps.
As Auto-Negotiation is not supported on the CPU port, the port speed must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnSpeedEn> bit.
• To set the speed to 10 Mbps:
  – Clear the <SetGMIISpeed> bit.
  – Clear the <SetMIISpeed> bit.
• To set the speed to 100 Mbps:
  – Clear the <SetGMIISpeed> bit.
  – Set the <SetMIISpeed> bit.
• To set the speed to 1000 Mbps:
  – Set the <SetGMIISpeed> bit.

For back pressure in half-duplex mode (Section 6.5.3.10), see Section 9.4.9 "Tri-Speed Port: Back Pressure in Half-Duplex Mode" on page 151.

6.5.3.12 Duplex Mode Setting

Duplex Mode in MII MAC Mode or MII PHY Mode
In MII MAC Mode or MII PHY mode, the MAC may operate in full-duplex or half-duplex mode. As Auto-Negotiation is not supported on the CPU port, the duplex mode must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnDuplexEn> bit.
• To configure full-duplex or half-duplex mode, set or clear the <SetFullDuplex> bit.

Duplex Mode in GMII Mode
In GMII mode the port speed is 1000 Mbps. As the MAC does not support half-duplex at this speed, the duplex mode must be set to full-duplex. As Auto-Negotiation is not supported on the CPU port, the duplex mode must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p.
415):
• Clear the <AnDuplexEn> bit.
• Set the <SetFullDuplex> bit.

Duplex Mode in RGMII Mode
In RGMII mode, half-duplex mode is supported only when the port’s speed is 10 Mbps or 100 Mbps. When the port’s speed is 1000 Mbps, Duplex mode must be set to full-duplex (see "Duplex Mode in GMII Mode" above). As Auto-Negotiation is not supported on the CPU port, the port’s duplex mode must be set manually.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnDuplexEn> bit.
• To configure full-duplex or half-duplex mode, set or clear the <SetFullDuplex> bit.

6.5.3.13 Excessive Collisions

See 9.4.13 "Tri-Speed Port: Excessive Collisions" on page 154.

6.5.4 Auto-Negotiation

The CPU port does not support Auto-Negotiation. The port’s link, speed, Duplex mode and Flow Control support must be set by the host CPU.

6.5.5 CPU Port MIB Counters

The device implements a limited number of MAC MIB Counters for the CPU port, thus providing the necessary counters to support CPU port statistics. The counters and their addresses are listed in C.2.8 "CPU Port Configuration Register and MIB Counters" on page 387.

Configuration
• To enable/disable the counters, set the <MIBCntEn> bit in the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411).
• When the <MIBCountMode> bit in the CPU Port Global Configuration Register (Table 100 p. 387) is set, any CPU port MIB counter that is read is reset to ZERO (cleared on read).
• When the <MIBCountMode> bit in the CPU Port Global Configuration Register (Table 100 p. 387) is cleared, a CPU port MIB counter that is read is not reset to ZERO (not cleared on read).

6.6 Interrupts

The device implements a hierarchical interrupt scheme.
An Interrupt Cause and Interrupt Mask register is defined for each functional block. Each block provides a summary bit of all its interrupts to one Global Interrupt Cause register. The Global Interrupt Cause register contains the summary bits sent from all peripheral functional blocks and other interrupts belonging to the PCI.

The PCI interrupt pin is an open-drain output, which is asserted LOW if any bit in the Global Interrupt Summary is set. The Global Interrupt Summary is a summary of all interrupts in the Global Interrupt Cause register. It is derived from a logical OR of all unmasked interrupt bits in the Global Interrupt Cause register.

At reset, all Interrupt Mask and Cause registers are set to zero; therefore, all interrupts are masked after reset.

For a summary of all of the device’s interrupts, see C.19 "Summary of Interrupt Registers" on page 790.

The interrupt scheme is illustrated in Figure 18.

Figure 18: Hierarchical Interrupt Scheme
[Figure: Reg File A, Reg File B, through Reg File N each contain an Int Cause Reg and an Int Cause Mask Reg and drive the summary signals int_sumA_, int_sumB_, through int_sumN_ into the Global Int Cause Reg, which, together with the Global Cause Mask Reg, drives the INTn and PCI_INTn outputs.]
6.6.1 Interrupt Types

The interrupts are divided into the following two groups:
• Summary interrupts: Logical OR of all unmasked bits in the Cause register.
• Functional interrupts: Interrupts asserted by the device after an event has occurred.

The functional interrupts are located within the Functional Block Cause register and the summary interrupts are always held in the least significant bit of each local Interrupt Cause register. A copy of each summary interrupt is held in the Global Cause register.

6.6.2 Setting and Resetting Interrupts

All interrupts are set asynchronously by the device. The functional interrupts are set after the relevant event has occurred. Summary interrupts are set only as a result of assertion of one of the unmasked bits in the respective Interrupt Cause registers.

The device resets the interrupts after the host device reads the Cause register. Functional interrupts are reset by the device immediately after the Interrupt Cause register is read (Auto Clear).
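The relationship between functional interrupts, summary bits, and read-to-clear behavior can be sketched with a small Python model. It is an illustration under stated assumptions only: the class and function names are hypothetical, the assignment of block i to bit i of the global register is invented for the example, and a mask bit of 1 is taken to mean "unmasked", consistent with all interrupts being masked when the mask registers reset to zero.

```python
class BlockInterrupts:
    """Hypothetical model of one functional block's Cause/Mask register pair."""

    def __init__(self):
        self.cause = 0  # functional cause bits; bit 0 is the summary bit
        self.mask = 0   # reset value 0: every interrupt is masked

    def raise_event(self, bit):
        """Record a functional interrupt in the cause register."""
        if bit < 1:
            raise ValueError("bit 0 holds the summary interrupt")
        self.cause |= 1 << bit

    def summary(self):
        """Logical OR of all unmasked cause bits."""
        return (self.cause & self.mask) != 0

    def read_cause(self):
        """Return the register value (summary in bit 0), then auto-clear it."""
        value = self.cause | int(self.summary())
        self.cause = 0
        return value


def global_summary(blocks, global_mask):
    """Return (global cause value, whether the interrupt pin is asserted).

    Assumption: bit i of the global cause mirrors the summary of blocks[i].
    """
    global_cause = 0
    for index, block in enumerate(blocks):
        global_cause |= int(block.summary()) << index
    return global_cause, (global_cause & global_mask) != 0


block_a, block_b = BlockInterrupts(), BlockInterrupts()
block_a.mask = 0b100          # unmask functional interrupt bit 2 in block A
block_a.raise_event(2)
block_b.raise_event(3)        # block B leaves everything masked
before = global_summary([block_a, block_b], global_mask=0b11)
value_a = block_a.read_cause()
after = global_summary([block_a, block_b], global_mask=0b11)
```

In this model the global value is derived from the block registers, so the only way to clear a global summary bit is to read the local cause register behind it.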
Notes
• When reading the Interrupt Cause register, all interrupts in the register are cleared, including the masked interrupts.
• When an interrupt occurs, the corresponding Interrupt Cause is set, regardless of its interrupt mask bit.

Summary interrupts always reflect the status of the corresponding Cause register. When the Interrupt Cause register in the functional block is read, the summary bit reflecting the unmasked bit of that functional block is cleared. When reading the Global Interrupt Cause register, the summary interrupts within the register are not reset. To reset the Global Interrupt Cause register, all the summary interrupts must be cleared by reading the corresponding local Cause register.

6.6.3 Interrupt Coalescing

When CPU interrupts are generated at extremely high rates, the CPU is unable to achieve any useful processing. Interrupt coalescing enforces a minimum time (measured in clock cycles) between successive hardware interrupts generated to the CPU. Unmasked interrupts that occur during this forced delay generate a hardware interrupt at the end of the delay, after which a new delay period begins.

Only the generation of the hardware interrupt is delayed by this mechanism. The interrupt cause bits are still set when the interrupt occurs.

The device has a global configurable interrupt coalescing period, which may be changed dynamically during normal operation. The default value is 0, i.e., a hardware interrupt is generated as soon as any interrupt cause bit is set. The period value defines the delay period with a resolution of 64 clock cycles.

Note
On devices that do not incorporate a PCI Interface or when the PCI Interface is used as the management interface, the PCI_INTn pin must be used as the device’s interrupt signal.
When the PCI interface is not used, the INTn pin must be used as the device’s interrupt signal.

Note
If the coalescing period is modified in the middle of an old period, the device finishes the current period and then, upon the next global summary interrupt, starts working according to the new period.

Configuration
To limit the minimal IDLE period between two consecutive assertions of the PCI_INTn pin or the INTn pin, set the <Coalescing Period> field in the Interrupt Coalescing Configuration Register (Table 95 p. 384) accordingly.

6.6.3.1 Interrupt Coalescing Override Due to Ports Interrupt

Typically, a Link Change event on one of the device’s ports must be addressed immediately by the Host interface. The device enables override of the coalescing IDLE period if one of the port’s links has changed and its Link Change Interrupt is not masked. Once a port’s link status has changed, the Interrupt pin is asserted immediately, regardless of the configured coalescing IDLE period.

Configuration
To enable/disable Interrupt Coalescing IDLE Period override due to a link change, set the <LinkChange Coalescing OverrideDis> bit in the Interrupt Coalescing Configuration Register (Table 95 p. 384).

6.7 General Purpose Pins (GPP)

This section is relevant for the following devices:
• Layer 2+ Stackable: 98DX166, 98DX130, 98DX246, 98DX250, 98DX260, 98DX270, 98DX803
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
• Multilayer Stackable: 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273

Not relevant for the SecureSmart devices.

6.7.1 GPP Overview

The device includes eight general purpose pins (GPP), which are very useful for system design.
These pins receive indications from the device’s environment, e.g., from the connector and/or the backplane of a chassis. The GPPs can also be used to drive indications from the device to the peripheral devices. They can be used as inputs, outputs, or edge-sensitive interrupts.

The pins are 2.5V or 3.3V LVCMOS, compatible with LVTTL; however, they are not 5V tolerant (see Prestera® Hardware Specification).

6.7.2 Working with GPPs

6.7.2.1 Control I/O Direction

Each of the eight GPPs may be configured as an input or an output.

Configuration
To configure the direction of the GPPs, set the <GPPIOCtr[7:0]> field in the GPP I/O Control Register (Table 112 p. 391) accordingly.

6.7.2.2 GPP as Input

As the device has an internal pulldown resistor of 50 kilohms for each of the GPPs, a GPP input that is to be connected to LOW voltage level (logical ZERO) may be left unconnected and will be reflected as logical ZERO in the GPP Input Register (Table 111 p. 391).

If the GPP acts as an output, the corresponding value in the GPP Input register is meaningless.

6.7.2.3 GPP as Output

When a GPP acts as an output, the GPP pin is driven with the value of the corresponding bit in the GPP Output Register (Table 110 p. 391).

6.7.2.4 GPP as Interrupt

When a GPP acts as an input, an interrupt can be enabled by unmasking the corresponding bit in <GPP IntMask[7:0]> in the GPP Interrupt Mask Register (Table 560 p. 796).
The interrupt is conveyed to the corresponding bit in <GPPInt[7:0]> in the GPP Interrupt Cause Register (Table 559 p. 795).

The interrupt is edge-sensitive and is asserted on every change of the input signal. When the pin is set as an output, the GPP Interrupt Mask register masks the corresponding bit.

6.7.2.5 Disabling the GPP

By default, GPPs are set as inputs. As the device has an internal pulldown resistor of 50 kilohms for each of the GPPs, an unused GPP may be left unconnected.

As noted for GPP inputs (Section 6.7.2.2), when a GPP acts as an input, the value of the GPP pin is reflected in the GPP Input Register (Table 111 p. 391).

Section 7. CPU Traffic Management

This section describes the device’s CPU traffic management.

In managed systems, it is critical that the CPU receive only traffic that requires software processing. Unwanted traffic unnecessarily burdens the CPU and delays handling of other traffic that requires processing.

Furthermore, traffic that is sent to the CPU must be properly prioritized into separate queues. This allows the CPU to process high-priority traffic with minimum delay, even when overloaded with low-priority traffic.

The device supports the Marvell Secure Control Technology features for selecting only the required traffic to the CPU, as well as prioritizing and managing the bandwidth of traffic sent to the CPU.

7.1 CPU Port Number

The CPU is represented as port 63 (0x3F). The CPU port has the same configuration options and egress queueing/scheduling features as the physical MAC ports.
Specifically, the CPU port has eight egress traffic class queues. Each traffic class can be assigned a minimum bandwidth using the WRR scheduler, and the maximum rate can be limited with a leaky bucket shaper. The aggregate traffic to the CPU can also be rate-limited with a leaky bucket shaper (Section 15.3 "Egress Bandwidth Management" on page 305).

The CPU port must be configured as a cascade port (Section 4.1 "Cascade Ports" on page 44).

When it is configured as a cascade port, all packets sent to the CPU carry a TO_CPU DSA tag that provides information about the packet (e.g., CPU code, source device, source port) (Section 7.2.2 "TO_CPU DSA Tag").

The CPU can inject FROM_CPU or FORWARD DSA-tagged packets to the CPU port (Section 7.3 "Packets from the CPU").

Note
Typical applications require the CPU port to be configured as a cascade port, to allow the CPU to inject FROM_CPU DSA-tagged packets to a specific destination, and to receive TO_CPU DSA-tagged packets containing information about the packet (e.g., CPU code, source device, source port, etc.).

7.2 Packets to the CPU

The device offers tight control of specific types of traffic to be sent to the CPU.

In the device’s architecture, the CPU is not a member of the VLAN, so there is no implicit flooding of Unknown Unicast, Multicast and Broadcast traffic to the CPU port.

A packet is sent to the CPU as a result of one of the following actions:
• Packet is assigned a TRAP command by an ingress processing engine (Section 5.1.3 "TRAP Command" on page 53).
• Packet is assigned a MIRROR command by an ingress processing engine (Section 5.1.2 "Mirror-to-CPU Command" on page 53).
E Page 102 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 This section describes the device’s CPU traffic management. M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification CPU Traffic Management • Packet is assigned a FORWARD command to the CPU port 63 by an ingress processing engine (Section 5.1.1 "FORWARD Command" on page 52). Packet is selected for sampling to the CPU by the ingress or egress port sampling mechanism (Section 16.1 "Traffic Sampling to the CPU" on page 312). AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 • The list of CPU codes is found in Appendix B. "CPU Codes" on page 343. The Policy engine rule action is capable of Mirroring or Trapping a packet to the CPU with a specific user-defined CPU code, see Section 10.6 "Policy Actions" on page 195. The Bridge engine has many mechanisms for Mirroring or Trapping a packet to the CPU. This includes BPDUs (Section 11.3.1 "Trapping BPDUs" on page 221) and other well-known control packets (Section 11.8 "Control Traffic Trapping/Mirroring to the CPU" on page 244). In addition, the bridge VLAN configuration can Mirror or Trap to the CPU various kinds of flooded traffic: unknown Unicast, unregistered IPv4/6 and non-IP Multicast, unregistered IPv4 Broadcast (Section 11.11.1 "Per-VLAN Unknown/Unregistered Filtering Commands" on page 253). 7.2.1 CPU Code Table • Packet truncation on the CPU port M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The CPU Code table is a 256-entry table that defines for each CPU code a set of configurable attributes controlling how the packet is sent to the CPU. 
The CPU code attributes are: • Packet device destination to CPU port • Packet QoS on the CPU port • Packet statistical sampling to CPU port These attributes are described in detail in the subsequent subsections. Packet Device Destination to the CPU Port In a single device system, packets are sent to the CPU via the device host interface (Section 6. "Host Management Interfaces" on page 55). In a cascaded system, however, it may not always be desirable for the packets to be sent to the CPU attached to the local device. For example, a CPU attached to one of the devices may serve as a master CPU for the system. Packets with a specific CPU code should be sent directly to that CPU. The device allows a CPU Destination Device table containing up to seven device numbers. The CPU Code table entry has a 3-bit field <Destination Device Pointer>, which serves as an index into a CPU Destination Device table. The <Destination Device Pointer> value of 0 is reserved to indicate that the packet is sent to the local device CPU port. For a given CPU Code table entry, a <Destination Device Pointer> value of 1–7 sends CPU traffic to the device with the corresponding number in the CPU Destination Device table. This allows distributed processing of protocols by multiple CPUs in the system, e.g., BPDUs are sent to the device attached to CPU #1, and GVRP PDUs are sent to the device attached to CPU#2. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 103 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Packets sent to the CPU are assigned an 8-bit CPU code, which indicates the reason the packet is sent to the CPU. In addition, the CPU code determines the attributes controlling how the packet is sent to the CPU, according to the CPU code table (Section 7.2.1 "CPU Code Table"). 
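The destination lookup described under "Packet Device Destination to the CPU Port" can be sketched as follows. This is a minimal software model, not a register-level implementation: the function name, the dictionary used for the CPU Destination Device table, and the device numbers are all hypothetical and chosen only for illustration.

```python
LOCAL_POINTER = 0  # <Destination Device Pointer> value reserved for the local device CPU


def resolve_cpu_device(dest_pointer, dest_device_table, local_device):
    """Return the number of the device whose CPU receives the packet.

    dest_pointer      -- 3-bit <Destination Device Pointer> from the CPU Code table entry
    dest_device_table -- hypothetical model of the CPU Destination Device table,
                         a dict mapping pointer values 1..7 to device numbers
    local_device      -- device number of the local device
    """
    if not 0 <= dest_pointer <= 7:
        raise ValueError("<Destination Device Pointer> is a 3-bit field (0-7)")
    if dest_pointer == LOCAL_POINTER:
        return local_device
    return dest_device_table[dest_pointer]


# Mirrors the example in the text: BPDUs go to the device attached to CPU #1 and
# GVRP PDUs to the device attached to CPU #2. Device numbers 10 and 11 are made up.
dest_device_table = {1: 10, 2: 11}
cpu_code_pointer = {"BPDU": 1, "GVRP": 2, "OTHER": 0}
```

With a local device number of 3, a packet whose CPU code maps to pointer 0 stays on device 3, while the BPDU and GVRP codes resolve to devices 10 and 11 respectively.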
Configuration
• To configure CPU Destination Device table entries, set the CPU Target Device Configuration Register0 (Table 443 p. 702) and CPU Target Device Configuration Register1 (Table 444 p. 702) accordingly.
• To configure the destination CPU device index per CPU code, set the <CPUCode TrgDevIndex> field in the respective entry in the CPU Code Table Entry<n> (0<=n<256) (Table 437 p. 698) accordingly.

Packet QoS on the CPU Port

Like all ports in the device, the CPU port supports eight traffic class egress queues and two levels of drop precedence. (SecureSmart devices have 4 traffic classes.)

The CPU Code table associates a traffic class and drop precedence with each CPU code. This allows differentiation between different kinds of traffic to the CPU on a per-CPU-code basis, in terms of the traffic class assignment and the drop precedence.

The following table provides an example mapping of traffic type to traffic class.

Table 28: Example Layer 2 System Traffic Type to Traffic Class Mapping Table

CPU Code | Traffic Type | Traffic Class Queue
---------|--------------|--------------------
CPU TO CPU; MAIL FROM NEIGHBOR CPU | CPU to CPU management traffic in a multi-CPU cascaded system (Section 7.3 "Packets from the CPU" on page 107). | 7
BPDU TRAP | Bridge Spanning Tree BPDUs. | 6
IEEE RESERVED MULTICAST ADDR TRAP/MIRROR | IEEE Layer 2 control protocols (e.g., BPDUs, GARP, LACP). | 5
MAC TABLE ENTRY TRAP/MIRROR | Unicast packets whose MAC DA is the CPU MAC Address. This includes ARP reply and IP management protocols, e.g., SNMP, Telnet, and HTTP. | 4
ARP BROADCAST TRAP/MIRROR; IGMP TRAP/MIRROR; IPv6 ICMP TRAP/MIRROR; IPv6 NEIGHBOR SOLICITATION TRAP/MIRROR | ARP Broadcasts; IGMP/MLD packets (for Snooping application); IPv6 Solicited Neighbor Multicast. | 3
IPv4 BROADCAST TRAP/MIRROR | IPv4 Broadcasts. | 2
USER DEFINED n | Policy Mirrored/Trapped Traffic (e.g., ACL security logging). | 1
INGRESS SAMPLED; EGRESS SAMPLED | Ingress and egress packet sampling to the CPU (monitoring application, e.g., sFlow). | 0

Notes
• The traffic class and drop precedence assignment is applied to the queueing on the CPU port only. If the packet is sent over a cascade port with a TO_CPU DSA tag, the packet is assigned the cascade control traffic class and drop precedence (Section 8.5.3 "Setting QoS Fields on Cascaded Ports" on page 127).
• If the packet is CPU-to-CPU, the traffic class and drop precedence assignment on the destination CPU port is taken from the FROM_CPU DSA tag and not from the CPU Code table. This applies to the case where the destination port is the CPU port 63 and to the case where the FROM_CPU <mailbox> field is set (Section "CPU Mailbox to Neighbor CPU Device" on page 108).

Configuration
In the CPU Code Table Entry<n> (0<=n<256) (Table 437 p. 698):
• To configure the CPU Code table entry with the traffic class assignment, set the <CPUCode TC> field accordingly.
• To configure the CPU Code table entry with the drop precedence assignment, set the <CPUCode DP> field accordingly.

Statistical Sampling

The device supports statistical sampling of packets sent to the CPU on a per-CPU-code basis.

Each CPU Code table entry contains a 6-bit pointer to one of thirty-two 32-bit sampling thresholds.

A pseudo-random number ranging from 1 to 2^32-1 is generated for every packet to be sent to the CPU. If the sampling threshold pointed to by the CPU Code entry is less than the pseudo-random number generated for the packet, the packet is not sent to the CPU. (If the packet command is MIRROR, it is still sent to the network ports.)

If the maximum sampling threshold value of 0xFFFFFFFF is configured, all packets with the given CPU code are forwarded to the CPU. Conversely, if the minimum sampling threshold value of 0x0 is configured, no packets with the given CPU code are forwarded to the CPU. (Note that the pseudo-random number is always > 0.)

In a cascaded system, the CPU code statistical sampling mechanism is applied only to packets received from network ports (i.e., on the ingress device), and not to DSA-tagged packets received on cascade ports.

This mechanism can be used to sample to the CPU a statistical percentage of an arbitrary traffic flow that is identified by the Policy engine. The Policy engine command can be to MIRROR the packet with a user-specified CPU code. The corresponding CPU code table entry is configured to the desired sampling percentage.

Note
The CPU code statistical sampling mechanism is orthogonal to the port statistical sampling mechanism (Section 16.1 "Traffic Sampling to the CPU" on page 312).
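The threshold comparison described under "Statistical Sampling" can be modelled as below. This is a sketch only: the function names are hypothetical, and the conversion between a threshold and a sampling rate assumes the pseudo-random number is uniformly distributed over 1 to 2^32-1, which the text does not state.

```python
RAND_MAX = 2**32 - 1          # largest pseudo-random value (the smallest is 1)
THRESHOLD_ALL = 0xFFFFFFFF    # every packet with this CPU code reaches the CPU
THRESHOLD_NONE = 0x0          # no packet with this CPU code reaches the CPU


def is_sent_to_cpu(threshold, rand_value):
    """Apply the rule from the text: the packet is dropped from the CPU path
    when the threshold is less than the per-packet pseudo-random number."""
    if not 1 <= rand_value <= RAND_MAX:
        raise ValueError("pseudo-random number must be in 1..2^32-1")
    return not (threshold < rand_value)


def threshold_for_rate(rate):
    """Hypothetical helper: threshold that passes roughly `rate` (0.0-1.0) of the
    packets, ASSUMING a uniform pseudo-random number."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return round(rate * RAND_MAX)


def expected_rate(threshold):
    """Fraction of packets passed under the same uniformity assumption."""
    return threshold / RAND_MAX
```

For example, under the uniformity assumption a 1% sampling rate corresponds to a threshold of about 1% of 0xFFFFFFFF, while the two extreme thresholds reproduce the all-packets and no-packets cases stated in the text.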
Configuration
• To configure the CPU Code table entry with the sampling threshold pointer to one of the 32 profiles, set the <CPUCode StatRate LimitIndex> field accordingly in the CPU Code Table Entry<n> (0<=n<256) (Table 437 p. 698).
• To configure the set of 32 sampling threshold profiles, set the Statistical Rate Limits Table Entry<n> (0<=n<32) (Table 440 p. 700) accordingly.

Packet Truncation

Some statistical sampling applications only require the packet header information and not the entire packet data. Truncation conserves the host memory required for queueing received packets.

The device allows packets to the CPU to be truncated to 128 bytes, on a per-CPU-code basis.

The TO_CPU DSA tag contains a flag indicating the received packet was truncated to 128 bytes. In addition, the original packet length is reported in the TO_CPU DSA tag (Section 7.2.2 "TO_CPU DSA Tag").

Configuration
To configure the CPU Code table entry to truncate packets to 128 bytes, set the <CPUCode Truncated> bit in the respective entry in the CPU Code Table Entry<n> (0<=n<256) (Table 437 p. 698).

7.2.2 TO_CPU DSA Tag

When the CPU port is configured as a cascade port, all packets sent to the CPU are DSA-tagged TO_CPU.

The TO_CPU DSA tag provides the following important packet attributes to the application:
• 8-bit CPU code: Indicates the mechanism that caused the packet to be sent to the CPU (Appendix B. "CPU Codes" on page 343).
• Truncation flag: Indicates that the packet was truncated to 128 bytes ("Packet Truncation" on page 106).
• Original packet byte length: Packet length (independent of truncation).
• Ingress/egress attributes: The following attributes are defined in the TO_CPU DSA tag. These attributes reflect either the packet ingress or packet egress, depending on whether the packet was sent to the CPU by the ingress or egress pipeline.
  – Ingress or egress port number
  – Ingress or egress device number
  – Ingress or egress VLAN
  – Ingress or egress tag CFI bit

Notes
• For caveats about the TO_CPU ingress port and device fields, see "CPU to CPU" on page 108 and "CPU Mailbox to Neighbor CPU Device" on page 108.
• For further information about DSA tag TO_CPU format, see A.1 "Extended DSA Tag in TO_CPU Format" on page 333.

7.3 Packets from the CPU

The CPU can inject packets into the device using the FROM_CPU or FORWARD DSA tags.

7.3.1 FROM_CPU DSA Tag

The FROM_CPU DSA tag allows the CPU to specify the following packet attributes:
• Packet destination
  The packet destination can be a single-target device/port, a multi-target destination of either a VLAN or Multicast group, or a neighboring device CPU port (useful for device/topology discovery).
• Single-destination packet is to be sent on a network port as VLAN-tagged or Untagged
  If set, the single-target packet is sent on the egress port with a VLAN tag. This flag is not applied to Multicast packets, which are sent tagged/untagged according to the VLAN tagged state.
• Packet 802.1Q VLAN tag fields: VID, CFI, and user priority
  These are the VLAN tag values used if the packet is transmitted VLAN-tagged.
• Packet traffic class and drop precedence
  These attributes are applied on the network destination port(s) and not the cascade ports.
• Use Control traffic class on cascade port
  If set and the packet is sent to cascade port(s), the packet is assigned the control traffic class and drop precedence on the cascade port. If disabled, the packet traffic class and drop precedence are mapped to a cascade port traffic class and drop precedence (Section 8.5.3 "Setting QoS Fields on Cascaded Ports" on page 127).
• Egress Filter Enable
  If set, the packet is subject to VLAN egress filtering and spanning tree state egress filtering. In particular, when injecting BPDUs, this field should be disabled.
• Source-ID
  This is the source-ID used for egress Multicast source-ID filtering (Section 11.14 "Bridge Source-ID Egress Filtering" on page 257).
• Exclude device/port or trunk group
  This option allows a Multicast packet to be injected while excluding a specific device/port or trunk group.
• Mailbox to neighbor device CPU
  This option is used for topology discovery prior to knowing the device number assignment of other devices in a cascaded system. The CPU specifies the local cascade port through which to send the packet, and the receiving device automatically sends the packet to the CPU.

Note
For further information about DSA tag TO_CPU format, see A.1 "Extended DSA Tag in TO_CPU Format" on page 333.
The CPU can inject packets with a specific destination and QoS using the FROM_CPU DSA tag. The FROM_CPU packet destination options are discussed in the following subsections.

CPU to Network

The CPU can send a packet to any network device/port, VLAN group, or Multicast group. To set the destination to a VLAN group, the Multicast group index (VIDX) is set to a special value 0xFFF (Section 11.5 "Bridge Multicast (VIDX) Table" on page 240).

If the destination is a VLAN or Multicast group, it is possible to exclude a specific device/port or trunk group from the flood domain. This is useful if the packet was initially trapped to the CPU and now needs to be re-injected to the VLAN, but it should not be resent on its original source port.

If the destination is a VLAN or Multicast group in a cascaded system, the source-ID is used to prevent forwarding loops in the cascade topology (Section 4.3 "Multi-Target Destination in a Cascaded System").

CPU to CPU

In a cascaded system, the CPU can send a packet to the CPU port on any device in the system.

The FROM_CPU DSA tag destination port field is set to the CPU port 63. The destination device is set to the device number on which the CPU port resides.

The packet is sent across the cascade port as a FROM_CPU DSA-tagged packet. When it arrives at the destination device, the packet is converted to a TO_CPU packet with the following settings:
• CPU code = CPU TO CPU
• Source device = FROM_CPU<source device>
• Source port = Undefined (based on the CPU code, it can be determined that the source port is the CPU port on the source device)

The packet is queued on the CPU port according to the traffic class and drop precedence defined in the FROM_CPU DSA tag (and not according to the CPU Code table).

CPU Mailbox to Neighbor CPU Device

In cascaded systems, the mailbox mechanism allows a CPU to communicate with its adjacent neighbor CPU prior to having learned the device numbers in the system and programming the Device Map table.

To send a Mailbox packet to the CPU attached to the adjacent device on a cascade port, the FROM_CPU DSA tag must have the Mailbox field set, the destination device must be set to the local device number, and the destination port must be set to the cascade port through which the packet is to be sent.

When received by the neighboring device, the packet is converted to a TO_CPU packet with the following settings:
• CPU code = MAIL FROM NEIGHBOR CPU
• Source device = local device number
• Source port = local cascade port from which the packet was received

The packet is queued on the CPU port according to the traffic class and drop precedence defined in the FROM_CPU DSA tag.

7.3.2 FORWARD DSA Tag

The CPU can inject packets to the ingress pipeline using the FORWARD DSA tag.

The motivation for injecting packets with the FORWARD DSA tag is to allow the ingress pipeline to make all the forwarding, filtering, and QoS decisions.

The packet is processed just as any FORWARD DSA-tagged packet is processed when received on a cascade port. The fact that the packet is received from the CPU port does not affect the way it is processed by the ingress pipeline.
The forwarding destination and QoS attributes of the FORWARD DSA-tagged packet are irrelevant when the packet is injected by the CPU. The packet forwarding decision is assigned by the ingress processing engines enabled on the CPU port (e.g., policy, bridge, and router).

The basic attributes of the FORWARD DSA tag that must be set by the CPU are:
• Source device/port or trunk group: This determines the Source Address location if learned by the bridge engine.
• VLAN assignment: If set to 0, the packet is assigned a VID by the ingress pipeline.
• User Priority assignment.
• Egress filtering source-ID: This is used to prevent loops in cascaded systems (Section 4.3 "Multi-Target Destination in a Cascaded System" on page 46).

The remaining attributes are assigned by the ingress pipeline.

Note
For further information about FORWARD DSA tag format, see A.4 "Extended DSA Tag in FORWARD Format" on page 341.

7.3.3 Ethernet Frame Alignment

Currently, IP stacks require that IP and TCP/UDP headers be 32-bit aligned. Ethernet headers preceding the IP header generally have a 14-byte length (6 bytes for the MAC DA, 6 bytes for the MAC SA, and 2 bytes for the EtherType). Ethernet headers may also have an 18-byte length (802.1Q tagged), a 22-byte length (with LLC/SNAP), or a 26-byte length (802.1Q tagged with LLC/SNAP). In any case, they are never aligned to multiples of 32 bits.

The device supports prepending 2 bytes to all packets forwarded to the host CPU, via the PCI interface or the CPU Port MAC, so that the IP and TCP/UDP headers are 32-bit aligned.
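The alignment argument above is simple modular arithmetic and can be checked with a short sketch. The header lengths are the four cases listed in the text; the names used here are illustrative only.

```python
# Ethernet header lengths (bytes) named in the text.
ETHERNET_HEADER_LENGTHS = {
    "untagged": 14,
    "802.1Q tagged": 18,
    "LLC/SNAP": 22,
    "802.1Q tagged with LLC/SNAP": 26,
}
PREPEND_BYTES = 2  # size of the optional prepended header


def ip_header_offset(l2_header_len, prepend_enabled):
    """Byte offset of the IP header from the start of the buffer handed to the host."""
    return l2_header_len + (PREPEND_BYTES if prepend_enabled else 0)


def is_32bit_aligned(offset):
    """True when the offset is a multiple of 4 bytes (32 bits)."""
    return offset % 4 == 0


for name, length in ETHERNET_HEADER_LENGTHS.items():
    without = ip_header_offset(length, prepend_enabled=False)
    with_prepend = ip_header_offset(length, prepend_enabled=True)
    print(f"{name}: offset {without} aligned={is_32bit_aligned(without)}, "
          f"with prepend {with_prepend} aligned={is_32bit_aligned(with_prepend)}")
```

Each listed header length leaves a remainder of 2 when divided by 4, so adding 2 bytes brings the IP header to a 4-byte boundary in all four cases.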
Configuration
To enable/disable prepending a 2-byte (16-bit) header to packets forwarded to the host CPU, set the <PrePendTwo BytesHeaderEn> bit in the Cascading and Header Insertion Configuration Register (Table 528 p. 770).

For FORWARD DSA-tagged packets injected by the CPU (Section 7.3.2 "FORWARD DSA Tag"), the packet QoS Profile is assigned by the ingress processing engines when the CPU port is configured with <Trust DSA tag QoS> set to 0 (Section 8. "Quality of Service (QoS)" on page 110).

Section 8. Quality of Service (QoS)

This section describes how QoS is implemented in the device.

Quality of Service (QoS) provides preferential treatment to specific traffic, possibly at the expense of other traffic. Without QoS, the device offers best-effort service to each packet and transmits the packets without any assurance of reliability, delay bounds, or throughput. Implementing QoS in a network makes performance more predictable and bandwidth utilization more effective.

QoS implementation in the device supports the IETF-DiffServ and IEEE-802.1p standards.

The typical QoS model is based on the following:
• At the network edge, the packet is assigned to a QoS service. The service is assigned based on the packet header information (i.e., the packet is trusted) or on the ingress port configuration (the packet is not trusted).
• The QoS service defines the packet's internal QoS handling (e.g., traffic class and drop precedence) and optionally the packet's external QoS marking through the 802.1p User Priority and/or the IP header DSCP field.
• Subsequent devices within the network core provide consistent QoS treatment to traffic based on the packet's 802.1p or DSCP marking. As a result, an end-to-end QoS solution is provided.
• A device may modify the assigned service if a packet stream exceeds the configured profile. In this case, the packet may be dropped or reassigned to a lower QoS service.

The device incorporates the required QoS features to implement network-edge as well as network-core switches/routers:
• The device provides flexible mechanisms to classify packets into as many as 72 different services.
• Up to 256 traffic policers may be used to control the maximum rate of specific traffic flows. (The SecureSmart devices support four Ingress Traffic Policers per port.)
• The packet header may have its User Priority and/or DSCP set to reflect the QoS assignment.
• The service application mechanism is based on eight egress priority queues per port (including the CPU port), on which congestion-avoidance and congestion-resolution policies are applied.

8.1 QoS Model

This section describes the QoS model.

8.1.1 Traffic Types

The device classifies incoming traffic into data, control, and mirror-to-analyzer. Table 29 describes these traffic types.

To assure predictable performance, as well as to simplify configuration, the device uses different mechanisms to assign QoS service to each traffic type. In addition, whenever different traffic types compete for shared resources (e.g., egress queuing resources), the device can be configured to control the degree of cross-interaction between different traffic types, by limiting or completely eliminating resource sharing.

See Section 8.4.1 "Traffic Class and Drop Precedence Assignment" for further details on assignment of QoS per traffic type. See Section 15. "Bandwidth Management" on page 302 for further details about configuring queuing resources.

Table 29: Traffic Types

Traffic Type: Data
Description:
Data packets are defined as either:
• Network-to-Network traffic: Packets received on a cascade port with DSA tag <command> = FORWARD, or packets received on a network port that are assigned a <command> FORWARD or MIRROR TO CPU
or
• Data traffic from the CPU: Packets with a DSA tag <command> = FROM_CPU whose DSA tag <Cascade Control> flag is clear.
See Appendix A. "DSA Tag Formats" on page 333 for further details about DSA tag encapsulation formats.
Traffic classified as Data is subject to Initial QoS Marking as described in 8.2 "Initial QoS Marking" on page 115.

Traffic Type: Control
Description:
Control packets are defined as one of the following:
• Packet to the CPU:
  – Packets sent to the CPU due to trapping, mirroring, or forwarding a Data packet to the CPU by one of the ingress pipe or egress pipe engines
  or
  – Packets received via a cascading port with a DSA tag <command> = TO_CPU
• CPU-to-CPU traffic: Packets received via a cascading port with a DSA tag <command> = FROM_CPU that are destined to another CPU
• Control packet from the CPU: Packets sent by the CPU with DSA tag <command> = FROM_CPU whose DSA tag <Cascade Control> flag is set and that are not targeted to another port
Traffic classified as Control is not subject to Initial QoS Marking as described in 8.2 "Initial QoS Marking" on page 115 and is not subject to policing as described in 8.3 "Traffic Policing" on page 121.
Traffic classified as Control is assigned a TC and DP as described in 8.4.1.2 "Control Packet Traffic Class and Drop Precedence Assignment" on page 124.

Traffic Type: Mirrored to Analyzer Port
Description:
Packets mirrored to an analyzer port are defined as either:
• packets received on a cascade port with a DSA tag <command> = TO_ANALYZER, or
• packets that are duplicated for either ingress or egress mirroring to analyzer port.
An additional condition is that the target analyzer port is NOT the CPU port on the local device. If it is, the packet is treated as Control.
Traffic classified as Mirrored to Analyzer Port is not subject to Initial QoS Marking as described in 8.2 "Initial QoS Marking" on page 115 and is not subject to policing as described in 8.3 "Traffic Policing" on page 121.
Traffic classified as Mirrored to Analyzer Port is assigned a TC and DP as described in 8.4.1.3 "Mirrored Packet Traffic Class and Drop Precedence Assignment" on page 125.
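The traffic type definitions above can be read as a decision procedure. The sketch below is a simplified software model under stated assumptions: it classifies one packet instance at a time, the parameter names are hypothetical, and the flags `to_cpu`, `cpu_to_cpu`, and `analyzer_copy` stand in for decisions the pipeline makes that are not spelled out at this level of the text.

```python
from enum import Enum


class TrafficType(Enum):
    DATA = "Data"
    CONTROL = "Control"
    MIRRORED = "Mirrored to Analyzer Port"


def classify(dsa_command=None, cascade_control=False, to_cpu=False,
             cpu_to_cpu=False, analyzer_copy=False, analyzer_is_local_cpu=False):
    """Classify one packet instance.

    dsa_command           -- None for an untagged packet on a network port, else
                             "FORWARD", "FROM_CPU", "TO_CPU" or "TO_ANALYZER"
    cascade_control       -- FROM_CPU <Cascade Control> flag
    to_cpu                -- instance is being trapped/mirrored/forwarded to the CPU
    cpu_to_cpu            -- FROM_CPU packet destined to another CPU
    analyzer_copy         -- instance was duplicated for ingress/egress mirroring
    analyzer_is_local_cpu -- target analyzer port is the local device CPU port
    """
    if dsa_command == "TO_ANALYZER" or analyzer_copy:
        # An analyzer target on the local CPU port makes the packet Control.
        return TrafficType.CONTROL if analyzer_is_local_cpu else TrafficType.MIRRORED
    if dsa_command == "TO_CPU" or to_cpu:
        return TrafficType.CONTROL
    if dsa_command == "FROM_CPU":
        if cpu_to_cpu or cascade_control:
            return TrafficType.CONTROL
        return TrafficType.DATA
    # FORWARD DSA tag, or a network-port packet assigned FORWARD / MIRROR TO CPU.
    return TrafficType.DATA
```

For instance, a FROM_CPU packet with <Cascade Control> clear is classified as Data and is therefore subject to Initial QoS Marking, whereas the same packet with the flag set is Control and bypasses marking and policing.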
Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 111 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Traffic Types AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 29: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED QoS Model 8.1.2 QoS Processing Walkthrough AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 QoS processing by the device is illustrated in Figure 19. The processing is described in four stages, where the first two stages—Ingress QoS Initial Marking and Ingress Traffic Policing and QoS Remarking —are performed in the ingress pipeline and the latter two stages—QoS Enforcement and Packet QoS Marking—are performed in the egress pipeline. C o n tro l o r M irro re d to A n a ly z e r P o rt P a c k e t D a ta P a c k e t In g re s s Q o S In itia l M a rk in g In g re s s T ra ffic P o lic in g a n d Q o S R e m a rk in g M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 E g re s s Q o S E n fo rc e m e n t E g re s s S e ttin g o f P a c k e t H e a d e r Q o S F ie ld s D a ta P a c k e t E g re s s 8.1.2.1 QoS Initial Marking QoS initial marking associates every packet classified as Data with a set of QoS attributes that determines the QoS processing by subsequent stages. For further details see Section 8.2 "Initial QoS Marking". Note Traffic classified as Control or Mirrored to Analyzer Port (see: 8.1.1 "Traffic Types" on page 110) is not subject to QoS Initial Marking. MV-S102110-02 Rev. 
Figure 19: QoS Processing Walkthrough

8.1.2.2 Traffic Policing and QoS Remarking
If enabled on a policy-based traffic flow and if the packet is classified as Data, the policer meters the given flow according to a configurable rate profile and classifies packets as either in-profile or out-of-profile. Out-of-profile packets may be discarded or have their QoS attributes remarked. For further details see Section 8.3 "Traffic Policing".

Note
Traffic classified as Control or Mirrored to Analyzer Port (see Section 8.1.1 "Traffic Types" on page 110) is not subject to Traffic Policing and QoS Remarking.

8.1.2.3 QoS Enforcement
In the device, QoS enforcement utilizes eight priority egress queues per port. Congestion avoidance and congestion resolution techniques are used to provide the required service. For a detailed description of QoS enforcement see Section 8.4 "QoS Enforcement".

8.1.2.4 Setting Packet Header QoS Fields
The device supports setting or modifying the packet header 802.1p User Priority and/or IP-DSCP. For a detailed description see Section 8.5 "Setting Packet Header QoS Fields".

8.1.3 Packet QoS Attributes
Every packet classified as Data is assigned a set of QoS attributes, which can be modified by each ingress pipeline engine. The ingress pipeline contains several Initial QoS Markers, which assign the packet’s initial QoS attributes, as described in Section 8.2 "Initial QoS Marking". The ingress pipeline also contains a QoS Remarker, which can modify the initial QoS attributes, as described in Section 8.3 "Traffic Policing".

The packet QoS attributes are defined in the following table.

Table 30: Packet QoS Attributes

QoS Parameter | Description
QoS Precedence | The device incorporates multiple QoS markers operating in sequence. As a result, a later marker overrides an earlier QoS attribute assignment. By setting the QoS Precedence flag to HARD, a QoS marker can prevent modification of packet QoS attributes by subsequent QoS markers. 0 = SOFT QoS Precedence: Subsequent QoS markers can override the existing packet QoS attributes assigned by a previous marker. 1 = HARD QoS Precedence: Subsequent QoS markers cannot override the existing QoS attributes that were assigned by a previous marker. NOTE: The traffic policer remarker can modify packet QoS attributes regardless of the QoS precedence value.
QoS Profile index | Ranges from 0–71. See Section 8.1.4 "QoS Profile".
Modify DSCP | Enables packet DSCP field modification. 0 = Packet DSCP field is not modified when the packet egresses the device. 1 = Packet DSCP field is modified to the <DSCP> value of the QoS Profile entry for the packet’s QoS Profile index. NOTE: This flag is valid only for IPv4/6 packets. See Section 8.5.2 "Setting the Packet IP Header DSCP Field" for a detailed description of packet DSCP field modification rules.
Modify User Priority | Enables packet 802.1p User Priority field modification. 0 = Packet User Priority is preserved when the packet egresses the device. 1 = Packet User Priority field is modified to the <UP> value of the QoS Profile entry for the packet’s QoS Profile index, when the packet egresses the device. NOTE: This flag is relevant only if a VLAN tag is added or modified when the packet egresses the device. See Section 8.5.1 "Setting the Packet Header 802.1p User Priority Field" for a detailed description of packet User Priority field modification rules.

Note
Traffic classified as Control or Mirrored to Analyzer Port (see Section 8.1.1 "Traffic Types" on page 110) is not assigned QoS attributes.

8.1.4 QoS Profile
The device supports up to 72 QoS Profiles.

Every packet that is classified as Data is assigned the QoS attribute <QoS Profile index>, which is used by the egress pipeline to apply the QoS service. The QoS Profile index is used as a direct index, ranging from 0 to 71, into the global QoS Profile table. Each entry in the QoS Profile table contains the set of attributes defined in Table 31.

Table 31: QoS Profile Table Entry

QoS Profile Field | Description
TC | Traffic class queue assigned to the packet.
DP | Drop precedence assigned to the packet.
UP | If the packet QoS attribute <Modify UP> is set, and the packet is transmitted tagged, this field is the value used in the packet 802.1p User Priority field. If the packet was received tagged, the existing User Priority is modified with this value.
DSCP | If the packet QoS attribute <Modify DSCP> is set, and the packet is IPv4 or IPv6, this field is the value used to modify the packet IP-DSCP field.
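The relationship between the packet QoS attributes (Table 30), the SOFT/HARD precedence rule, and the QoS Profile table (Table 31) can be illustrated with a minimal sketch. This is a software model only, not the device implementation: the class and function names are hypothetical, and the marker is simplified to propose a complete attribute set, whereas some markers described later can also preserve individual attributes.

```python
from dataclasses import dataclass, replace

SOFT, HARD = 0, 1          # QoS Precedence values from Table 30
NUM_QOS_PROFILES = 72      # QoS Profile index ranges from 0 to 71


@dataclass(frozen=True)
class QosAttributes:
    """Per-packet QoS attributes (hypothetical model of Table 30)."""
    precedence: int = SOFT
    profile_index: int = 0
    modify_dscp: bool = False
    modify_up: bool = False


@dataclass(frozen=True)
class QosProfileEntry:
    """One QoS Profile table entry (hypothetical model of Table 31)."""
    tc: int
    dp: int
    up: int
    dscp: int


def apply_marker(current: QosAttributes, proposed: QosAttributes) -> QosAttributes:
    """An initial QoS marker takes effect only if the current precedence is SOFT."""
    if current.precedence == HARD:
        return current
    return proposed


def apply_policer_remark(current: QosAttributes, **changes) -> QosAttributes:
    """The policer remarker modifies attributes regardless of precedence."""
    return replace(current, **changes)


def lookup_profile(table, attrs: QosAttributes) -> QosProfileEntry:
    """Use the QoS Profile index as a direct index into the profile table."""
    if not 0 <= attrs.profile_index < NUM_QOS_PROFILES:
        raise IndexError("QoS Profile index out of range")
    return table[attrs.profile_index]
```

In this model a marker that sets HARD precedence blocks later initial markers, while the policer remarker can still change the profile index.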
In a cascaded system, the QoS Profile is conveyed between devices through the extended FORWARD DSA tag. To ensure consistent QoS treatment within the system, the application must configure QoS Profile attributes consistently in every device in the system.

Note
Traffic classified as Control or Mirrored to Analyzer Port (see Section 8.1.1 "Traffic Types" on page 110) is not assigned a QoS Profile.

8.2 Initial QoS Marking
Initial QoS marking provides various methods of assigning QoS attributes (Section 8.1.3 "Packet QoS Attributes") to packets classified as Data. The device supports the initial QoS markers described in the following sub-sections, listed according to their sequential order in the ingress pipeline.

Note
Traffic classified as Control or Mirrored to Analyzer Port (see Section 8.1.1 "Traffic Types" on page 110) is not subject to Initial QoS Marking. The following sub-sections are therefore not relevant for Control traffic or for Mirrored to Analyzer Port traffic.

8.2.1 Port-Based QoS Marking
The port-based QoS marker provides port-based assignment of the QoS attributes:
• QoS Precedence
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

See Section 8.1.3 "Packet QoS Attributes" for descriptions of these fields.

In addition, port-based QoS marking supports a default port 802.1p User Priority assignment.
Note
Applications of the port-based marker, as well as recommended configuration, are described in Section 8.6.1 "Applications of Port-Based QoS Marker" and Section 8.6.2 "QoS Configuration of Cascade Ports".

Assignment of the packet QoS attributes <QoS Precedence>, <Modify DSCP>, and <Modify User Priority> is based on the ingress port configuration. However, the packet <QoS Profile> attribute can be assigned according to one of the following:
• The ingress port configuration: This approach is used when the packet QoS for a given port is not trusted.
or
• The incoming packet QoS: This approach is used when the packet QoS for a given port is trusted. If the packet does not carry the trusted type of QoS, then the packet <QoS Profile> is assigned from the ingress port configuration.

The application is free to configure the 72 QoS Profiles in any suitable fashion. When supporting DiffServ and 802.1p, one approach is to assign QoS Profiles 0–63 for DiffServ PHBs, and QoS Profiles 64–71 for 802.1p User Priorities.

8.2.1.1 Port QoS Trust Modes
The device supports the following types of port QoS Trust modes:
• Layer 2 User Priority Trust Mode
The packet QoS Profile is assigned by mapping the packet User Priority to a QoS Profile, using a global User-Priority-to-QoS-Profile-Index mapping table. The table size is eight entries, one entry per User Priority value.
• Layer 3 DSCP Trust Mode
The packet QoS Profile is assigned by mapping the packet DSCP (or remapped DSCP) to a QoS Profile index using a global DSCP-to-QoS-Profile-Index mapping table. The table size is 64 entries, one entry per DSCP value.
When Layer 3 Trust mode is enabled, there is an additional sub-mode called DSCP-to-DSCP remapping, where the packet DSCP is mutated to a new DSCP using a global DSCP-to-DSCP mapping table. This table maps the incoming packet DSCP to a new, mutated DSCP. The table size is 64 entries, one entry per DSCP value. The new DSCP is used by all subsequent functions based on the packet DSCP.
• Layer 2 Trust Mode + Layer 3 Trust Mode
This allows both layers of trust to be enabled concurrently; however, Layer 3 trust has precedence over Layer 2 trust. In this mode, IPv4/v6 packets are subject to Layer 3 trust mode, and non-IPv4/6 tagged packets are subject to Layer 2 trust mode.
• DSA-Tag QoS Profile Trust Mode
If the extended FORWARD DSA tag <QoS Profile> is trusted, the packet QoS Profile index may be assigned from the incoming packet DSA tag <QoS Profile>.

Note
Typically the DSA tag <QoS Profile> is trusted, since the DSA tag originates within the device’s system.

The following three global mapping tables support the above port trust modes:
• User-Priority-to-QoS-Profile-Index mapping table
• DSCP-to-DSCP mapping table
• DSCP-to-QoS-Profile-Index mapping table

8.2.1.2 Port Default User Priority
The port default user priority is used in two ways:
If a VLAN tag (or Nested VLAN tag) is added to the packet by the egress port and the packet QoS attribute <Modify User Priority> is disabled, then:
• If the packet was received VLAN-tagged and Nested VLAN Access is enabled, then the UP assignment in the new tag is the same as in the original tag.
• If the packet was received untagged, then the UP assignment in the new tag is set according to the ingress port default User Priority.

For details about setting the packet header User Priority field at the egress port, see Section 8.5.1 "Setting the Packet Header 802.1p User Priority Field" on page 127.
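The way the port trust modes of Section 8.2.1.1 and the three global mapping tables select a QoS Profile index can be sketched as follows. This is an illustrative model under stated assumptions: all names are hypothetical, the table contents follow the example allocation mentioned above (profiles 0–63 for DSCP, 64–71 for User Priority), and the precedence order (DSA tag, then Layer 3, then Layer 2, then the port default) is the one given in Section 8.2.1.3.

```python
from dataclasses import dataclass
from typing import Optional

# Example table contents (assumption): identity DSCP remap,
# DSCP n -> profile n, User Priority p -> profile 64 + p.
DSCP_TO_DSCP = list(range(64))
DSCP_TO_PROFILE = list(range(64))
UP_TO_PROFILE = [64 + up for up in range(8)]


@dataclass
class PortConfig:
    """Per-port trust configuration (hypothetical field names)."""
    trust_dsa: bool = False
    trust_l3: bool = False
    trust_l2: bool = False
    remap_dscp: bool = False
    port_profile: int = 0


@dataclass
class Packet:
    """Only the header fields relevant to trust; None means 'not present'."""
    dsa_profile: Optional[int] = None  # <QoS Profile> of an extended FORWARD DSA tag
    dscp: Optional[int] = None         # present only for IPv4/IPv6 packets
    up: Optional[int] = None           # present only for tagged packets


def select_profile(port: PortConfig, pkt: Packet) -> int:
    """Return the packet QoS Profile index chosen by the port-based marker."""
    if port.trust_dsa and pkt.dsa_profile is not None:
        return pkt.dsa_profile
    if port.trust_l3 and pkt.dscp is not None:
        dscp = DSCP_TO_DSCP[pkt.dscp] if port.remap_dscp else pkt.dscp
        return DSCP_TO_PROFILE[dscp]
    if port.trust_l2 and pkt.up is not None:
        return UP_TO_PROFILE[pkt.up]
    # No trusted QoS field is carried: fall back to the port configuration.
    return port.port_profile
```

With both Layer 2 and Layer 3 trust enabled, an IP packet is classified by its DSCP, a tagged non-IP packet by its User Priority, and any other packet by the port default profile.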
8.2.1.3 Port-Based QoS Marking Algorithm
The algorithm of the port-based QoS marker is shown in Figure 20.

Figure 20: Port-Based QoS Marking Operation
[Flowchart. The decisions are evaluated in the following order:]
1. If Trust DSA tag* is set and the packet is DSA-tagged with a QoS Profile: Packet QoS Profile = packet DSA tag QoS Profile.
2. Else, if Trust L3* is set and the packet is IPv4/6: if DSCP-to-DSCP Remapping* is set, DSCP = DSCP-to-DSCP remapping of the packet DSCP; otherwise DSCP = packet DSCP. Packet QoS Profile = DSCP-to-QoS-Profile mapping.
3. Else, if Trust L2* is set and the packet is tagged**: Packet QoS Profile = UP-to-QoS-Profile mapping.
4. Else: Packet QoS Profile = Port QoS Profile.
5. In all cases: Packet QoS Precedence = Port QoS Precedence; Packet Modify DSCP = Port Modify DSCP; Packet Modify UP = Port Modify UP.
* These flags are per-port configuration.
** Tagged implies either DSA-tagged or 802.1Q-tagged.

Notes
• Trust operation depends on the per-port trust configuration being enabled and on the existence of a QoS field in the packet header.
• Multiple Trust Modes can be enabled concurrently. The Trusted QoS Modes have the following order of precedence, starting with the highest:
1. Trust Extended FORWARD DSA tag <QoS Profile>.
2. Trust packet Layer 3 DSCP field.
3. Trust packet Layer 2 802.1p User Priority field.
• Modification of packet DSCP and User Priority fields can be independently enabled. Note that packet QoS modification is enabled in the ingress pipe, but it is performed in the egress pipe.
• QoS precedence can be set to HARD to prevent subsequent QoS marking.

Configuration
• To configure the port QoS Precedence, set the <PortQoS Precedence> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To configure the port Trust DSA tag QoS mode, set the <Trust DSATagQoS> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To configure the port Trust mode, set the <QoSTrustMode> field in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• To enable DSCP-to-DSCP re-mapping, set the <ReMapDSCP> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To enable modification of the packet DSCP, set the <PortModify DSCP> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To enable modification of the packet User Priority, set the <PortModifyUP> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To configure the port default QoS Profile, set the <PortQoSProfile> field in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• To configure the port default User Priority, set the <PUP> field in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• To configure the DSCP-to-DSCP mapping table, set the DSCP<4n...4n+3>2DSCP Map Table (0<=n<16) (Table 318 p. 579) accordingly.
• To configure the DSCP-to-QoS-Profile mapping table, set the DSCP<4n...4n+3>2QoSProfile Map Table (0<=n<16) (Table 319 p. 579) accordingly.
• To configure the User-Priority-to-QoS-profile mapping table, set the UP<4n...4n+3>2QoSProfile Map Table (0<=n<2) (Table 320 p. 580) accordingly.

8.2.2 Protocol-Based QoS Marking
The protocol-based QoS marker provides protocol-based per-port assignment of the QoS attributes:
• QoS Precedence
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

See Section 8.1.3 "Packet QoS Attributes" for descriptions of the above fields.

The protocol-based QoS marker supports per-port initial QoS marking of the packet QoS attributes, according to the packet protocol field. The QoS marker operation is piggy-backed on the protocol-based VID assignment lookup. For a description of the mechanism, see Section 11.2.2.2 "Protocol-Based VLANs" on page 208.

The QoS marking for a given port/protocol can be applied selectively as follows:
• Do not QoS mark packets matching this protocol on this port.
• QoS mark only untagged packets matching this protocol on this port.
• QoS mark only VLAN-tagged, Priority-tagged, or DSA-tagged packets matching this protocol on this port.
• QoS mark all packets matching this protocol on this port.

Protocol-based QoS marking updates all of the QoS attributes.

Configuration
• To enable protocol-based QoS marking on a port, set the <ProtBased QoSEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To configure the QoS marking attributes, configure the Port<n> Protocol<m> VID and QoS Configuration Entry Register (0<=n<27, for CPU Port n = 0x3F, 0<=m<8, for <m> corresponding to the matching global protocol entry and encapsulation) (Table 313 p. 572) accordingly.

8.2.3 Policy-Based QoS Marking
The policy-based QoS marker provides policy-based assignment of the QoS attributes:
• QoS Precedence
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

See Section 8.1.3 "Packet QoS Attributes" for descriptions of these fields.

The policy-based QoS marker allows the application to perform QoS marking based on Layer 2 to Layer 4 packet information, as well as on device parameters such as the ingress port and the packet QoS Profile.

The policy QoS marker can independently assign a new value or maintain the existing value for each of the QoS Profile, Enable Modify DSCP, and Enable Modify User Priority attributes. However, the QoS Precedence is always updated by a matching policy rule action.

The Policy engine supports two lookup cycles per packet, while maintaining wire-speed performance. Even though the device can perform QoS marking on either lookup (or on both lookups), it is common practice to dedicate the second lookup to implementing QoS policy rules, while the first lookup is dedicated to security access rules.

Operation and configuration of the policy-based QoS marker are described in Section 10 "Ingress Policy Engine" on page 172.

8.2.4 Bridge FDB-Based QoS Marking
The Bridge FDB QoS marker performs initial marking of the QoS attributes:
• QoS Precedence
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

See Section 8.1.3 "Packet QoS Attributes" for descriptions of the above fields.

The QoS marking is based on the Bridge FDB source and destination lookup (Section 11.4.3 "FDB Lookup" on page 225), but is decoupled from the bridge forwarding decision. For example, FDB-based QoS marking still applies even when the forwarding decision is not based on the FDB (e.g., the packet is forwarded based on the PVE, as described in Section 11.9 "Private VLAN Edge (PVE)" on page 251).

An FDB entry contains the following two fields (see MAC QoS Table Entry<n> (1<=n<8) (Table 363 p. 627)):
• <DA QoS attribute set index>: This is the QoS attribute set applied to the packet if there is a destination lookup match.
• <SA QoS attribute set index>: This is the QoS attribute set applied to the packet if there is a source lookup match.

Figure 21 describes the interactions between the FDB entry and the QoS attribute set index.
It shows an FDB entry that applies QoS attribute set #2 for a source lookup match and preserves the previous QoS marking for a destination lookup match. An example application is to mark all packets sourced by a specific device (e.g., an IP phone, or a video streamer) with an appropriate QoS.

Figure 21: MAC-Address-Based QoS Marker Configuration
[Figure: a Bridge FDB entry consisting of non-QoS-related fields, a DA QoS attribute set index of 0, and an SA QoS attribute set index of 2. The indexes select from the FDB QoS attribute sets: index 0 is the NULL QoS attribute index, and indexes 1 through 7 select QoS Attribute Set #1 through QoS Attribute Set #7.]

The FDB-based QoS marker supports seven QoS attribute sets, indexed 1 through 7. QoS attribute set 0 is the NULL set, which implies that FDB-based QoS is not applied to the packet.

A lookup failure is implicitly assigned the NULL FDB QoS attribute index. In the event that both the source and destination FDB lookups find a matching entry whose attribute index is not NULL, a global FDB QoS Marking Conflict Resolution command selects either the source lookup attribute index or the destination lookup attribute index.

Table 32 describes the QoS marking command selection process.

Table 32: FDB-Based QoS Marking Resolution

Source Lookup FDB Entry QoS Attribute Index | Destination Lookup FDB Entry QoS Attribute Index | Resolution
NULL | NULL | Packet QoS information is preserved.
NULL | Valid | MAC-DA marker is applied.
Valid | NULL | MAC-SA marker is applied.
Valid | Valid | Marking command is selected according to the global FDB QoS Marking Conflict Resolution command.

Configuration
• To configure the FDB QoS Attribute sets, set the MAC QoS Table Entry<n> (1<=n<8) (Table 363 p. 627) accordingly.
• To add, remove, or modify an FDB entry, see Section 11.4.6 "CPU Update and Query of the FDB" on page 230.
• To configure QoS Marking Conflict Mode, set the <MACQoS ConfilctMode> bit in the Bridge Global Configuration Register0 (Table 370 p. 643).

8.2.5 Unicast Router-Based QoS Marking
This section is relevant for the following devices:
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269
Not relevant for the SecureSmart and Layer 2+ Stackable devices.

The Unicast Router QoS marking allows the application to assign identical service to all packets sharing the same Unicast destination IP address.

The Router-based QoS marker performs initial marking of the QoS attributes:
• QoS Precedence
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

See Section 8.1.3 "Packet QoS Attributes" for descriptions of the above fields.

Operation and configuration of the Router-based QoS marker are described in Section 12 "IPv4 and IPv6 Unicast Routing" on page 265.

8.3 Traffic Policing
Traffic Policing is triggered by the Policy engine for traffic classified as Data, by binding a specific flow with a rate limiter.

Note
Traffic classified as Control or Mirrored to Analyzer Port (see Section 8.1.1 "Traffic Types" on page 110) is not subject to Traffic Policing.

The traffic policer uses two-color rate limiters to classify packets as in-profile or out-of-profile. If a flow rate exceeds the configured rate limiter settings, a portion of the flow is classified as out-of-profile. Out-of-profile packets may be discarded or may have their packet QoS attributes remarked. See Section 8.1.3 "Packet QoS Attributes" for descriptions of the packet QoS attributes.

Note
QoS re-marking of out-of-profile packets is performed regardless of the packet’s QoS precedence.

The policer entry out-of-profile command supports two modes of assigning the QoS Profile: absolute and relative.
• In absolute mode the packet QoS Profile is assigned a new value regardless of its previous value.
• In relative mode the packet QoS Profile is assigned a new value based on its previous value.

Absolute mode should be used to assign a specific service to out-of-profile traffic, regardless of the initially assigned service. For example, absolute mode can be used to remark all out-of-profile traffic to best-effort service. The application must be aware that redirecting out-of-profile traffic to a different traffic class than in-profile traffic may result in out-of-order delivery.

Relative mode uses a global 72-entry QoS Profile Remarking table. This table maps the packet <QoS Profile> attribute marked in the initial marking stages to a new <QoS Profile>. The format of the QoS Profile Remarking table is described in "Remarked QoS Profile Index" on page 294.
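The difference between the absolute and relative out-of-profile modes can be made concrete with a short sketch. It models only the QoS Profile assignment, not the rate limiter itself. The function name, the mode strings, and the table contents are illustrative assumptions; only the 72-entry table size comes from the text.

```python
NUM_QOS_PROFILES = 72

# Hypothetical QoS Profile Remarking table: identity mapping, except that
# profile 10 is remarked to profile 11 (for instance, a profile configured
# by the application with the same TC but a higher DP).
REMARK_TABLE = list(range(NUM_QOS_PROFILES))
REMARK_TABLE[10] = 11


def remark_out_of_profile(profile: int, mode: str, absolute_profile: int = 0) -> int:
    """Return the QoS Profile index assigned to an out-of-profile packet.

    mode == "absolute": the new profile ignores the previous value.
    mode == "relative": the new profile is looked up from the previous value.
    """
    if not 0 <= profile < NUM_QOS_PROFILES:
        raise IndexError("QoS Profile index out of range")
    if mode == "absolute":
        return absolute_profile
    if mode == "relative":
        return REMARK_TABLE[profile]
    raise ValueError(f"unknown remarking mode: {mode}")
```

In this model, absolute mode sends every out-of-profile packet to one profile (here profile 0, standing in for a best-effort service), while relative mode lets each initial profile have its own out-of-profile counterpart.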
Relative mode is recommended for use when the out-of-profile service depends on the initial service. For example, a common QoS re-marking command is “Increase packet drop precedence but don’t change its egress priority queue.” This command guarantees that traffic is delivered in order, regardless of the policer classification decision. Nevertheless, out-of-profile traffic has a higher drop precedence.

Out-of-profile packets may have each of the following QoS attributes either remarked or preserved:
• QoS Profile index
• Modify DSCP flag
• Modify User Priority flag

The out-of-profile command is described in detail in Section 14 "Ingress Traffic Policing Engine" on page 292.

8.4 QoS Enforcement
This section describes how the device applies the required QoS service.

QoS enforcement in the device is performed in the egress pipe. For multi-destination packets, the service is applied individually, per packet copy.

QoS enforcement takes the steps shown in Figure 22 and described in the following sub-sections.

Figure 22: QoS Enforcement Walkthrough
[Figure: the input is either the Data packet QoS attributes or a Control / Mirrored to Analyzer Port packet. Stages, in order: (1) packet {TC, DP} assignment (QoS enforcement parameters); (2) packet enqueue according to {TC, DP} (congestion avoidance); (3) packet dequeue according to shaper/scheduler policy (congestion resolution); followed by packet egress.]

8.4.1 Traffic Class and Drop Precedence Assignment
QoS enforcement parameters are the traffic class (TC) and the Drop Precedence (DP). These parameters are used by the congestion avoidance algorithm to enqueue the packet to the correct priority queue and to set the packet drop precedence for the tail-drop algorithm.

The assignment of the packet TC and DP differs for Data traffic and Control traffic. See Section 8.1.1 "Traffic Types" for definitions of these traffic types.

8.4.1.1 Data Packet Traffic Class and Drop Precedence Assignment
Data packets are assigned a <QoS Profile> attribute by the QoS marking/remarking mechanisms in the ingress pipeline. The egress pipeline uses the <QoS Profile> to derive the packet {TC, DP} required by the congestion avoidance.

The data packet <QoS Profile> is used as an index into the global QoS Profile Mapping table, which maps it to a set of QoS parameters {TC, DP, UP, DSCP}. The QoS Profile Mapping table has 72 entries, one entry per QoS Profile. (The UP and DSCP are used later for the packet QoS modification.)

The data packet {TC, DP} may be re-mapped to new {TC, DP} values, using the global Data Traffic {TC, DP} Remapping table. The ability to reassign the {TC, DP} is useful on cascading ports, to segregate data traffic from control traffic and monitor traffic.

Note, however, that the {TC, DP} assignment of data packets has local device significance only. If the packet is sent out a cascade port, the QoS is recorded in the FORWARD DSA tag as the <QoS Profile>. On the next device, the QoS Profile will again be mapped (and possibly remapped) to a {TC, DP} for egress queueing.

Figure 23 shows the {TC, DP} assignment algorithm for Data traffic. An example of {TC, DP} assignment over cascade ports is described in Section 8.6.3 "{TC, DP} Assignment on Cascade Ports".
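The two-step derivation of {TC, DP} for Data packets described in Section 8.4.1.1 can be sketched as follows. This is a minimal model, not the device implementation: the function name and the dictionary-based table formats are hypothetical, and leaving a {TC, DP} pair unchanged when it has no entry in the remapping dictionary is a modelling convenience of this sketch.

```python
from typing import Dict, List, Tuple

TcDp = Tuple[int, int]


def data_tc_dp(profile_index: int,
               profile_table: List[Dict[str, int]],
               remap_enabled: bool,
               remap_table: Dict[TcDp, TcDp]) -> TcDp:
    """Derive the egress {TC, DP} of a Data packet on one egress port.

    profile_table models the global QoS Profile table (72 entries, each
    holding tc, dp, up and dscp); remap_table models the global
    Data Traffic {TC, DP} Remapping table.
    """
    entry = profile_table[profile_index]
    tc_dp = (entry["tc"], entry["dp"])
    if remap_enabled:
        # Per-port option, recommended on cascade ports.
        tc_dp = remap_table.get(tc_dp, tc_dp)
    return tc_dp
```

For example, a remapping table that moves data traffic off the highest traffic class leaves that queue free for control traffic on a cascade port, while ports with remapping disabled keep the {TC, DP} taken from the QoS Profile table.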
Figure 23: {TC, DP} Assignment Algorithm for Data Traffic
[Flowchart, for Data traffic on egress port <n>:]
1. Map <QoS Profile> to {TC, DP, UP, DSCP} using the global QoS Profile table.
2. If Remap Data TC/DP* is enabled on port<n>, {TC, DP} is remapped to a new {TC, DP} using the global Data {TC, DP} Remapping table. Otherwise, {TC, DP} is not modified.
* Remapping is recommended on cascade ports to segregate Data traffic from Control traffic.

8.4.1.2 Control Packet Traffic Class and Drop Precedence Assignment
The control packet {TC, DP} assignment algorithm differs, depending on two factors:
• The type of the control packet (see Section 8.1.1 "Traffic Types"):
– Packet to CPU
– Control packet from CPU
• Whether the control packet is forwarded via a cascade port. If so, its TC and DP may be remapped to segregate control traffic from Data traffic. This is called control traffic {TC, DP} remapping.

Accordingly, the control packet {TC, DP} is assigned from one of three sources:
• If the control packet is forwarded via a cascade port (with control traffic {TC, DP} remapping enabled), its TC is assigned from a global <Control TC> field and its DP from either the global <Control DP0> or <Control DP1> field.
• Else, if the packet is a Packet to CPU, it is assigned a TC and DP according to the CPU code assigned to it. See Section 7.2 "Packets to the CPU" on page 102.
• Else, the packet is a Control packet from CPU, and it is assigned the TC and DP extracted from its DSA tag.
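The three sources listed above, together with the CPU-to-CPU distinction described later in this section, can be sketched as a single selection function. This is an illustrative model only. The names are hypothetical, a Packet to CPU is identified here simply by the presence of a CPU code, and the formats of the CPU Code table and of the DSA tag fields are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

TcDp = Tuple[int, int]


@dataclass
class ControlGlobals:
    """Global control-traffic configuration (hypothetical model)."""
    control_tc: int
    control_dp0: int  # DP used for CPU-to-CPU traffic
    control_dp1: int  # DP used for CPU-to-Network and Network-to-CPU traffic


def control_tc_dp(remap_on_port: bool,
                  cpu_to_cpu: bool,
                  cpu_code: Optional[int],
                  cpu_code_table: Dict[int, TcDp],
                  dsa_tc_dp: Optional[TcDp],
                  cfg: ControlGlobals) -> TcDp:
    """Select the {TC, DP} of a control packet on one egress port."""
    if remap_on_port:
        # Control traffic {TC, DP} remapping (recommended on cascade ports).
        dp = cfg.control_dp0 if cpu_to_cpu else cfg.control_dp1
        return (cfg.control_tc, dp)
    if cpu_code is not None:
        # Packet to CPU: {TC, DP} comes from the CPU code assigned to it.
        return cpu_code_table[cpu_code]
    if dsa_tc_dp is None:
        raise ValueError("Control packet from CPU must carry {TC, DP} in its DSA tag")
    # Control packet from CPU: {TC, DP} is extracted from the DSA tag.
    return dsa_tc_dp
```

The order of the tests mirrors the "else" chain of the list above: remapping takes priority, then the CPU code, then the DSA tag.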
When control traffic {TC, DP} remapping is performed, for the purpose of assigning the drop precedence, a further distinction is made between two types of control traffic:
• CPU-to-CPU traffic
• CPU-to-Network and Network-to-CPU traffic

The motivation for this distinction is to allow CPU-to-CPU traffic, which is used for critical internal system management, to have a lower drop precedence on the control traffic class than other control traffic.

Figure 24 outlines the TC and DP assignment for Control packets.

Figure 24: {TC, DP} Assignment for Control Packets
[Flowchart, for Control traffic on egress port <n>:]
• If Remap Control {TC, DP}* is enabled on port<n>: TC = global <Control TC> configuration. If the packet is CPU-to-CPU traffic, DP is taken from the global <Control DP0> configuration; otherwise DP is taken from the global <Control DP1> configuration.
• Otherwise: if the packet is a Control packet from CPU, {TC, DP} is taken from the DSA tag; else the packet is a Packet to CPU, and its CPU code is mapped to {TC, DP} using the CPU Code table.
* Remapping Control packet {TC, DP} is recommended on cascade ports, excluding the CPU port.

8.4.1.3 Mirrored Packet Traffic Class and Drop Precedence Assignment
Mirrored packets are assigned a traffic class and drop precedence according to the global mirroring configuration (Section 16.2 "Traffic Mirroring to Analyzer Port" on page 314).
Figure 25: {TC, DP} Assignment of Mirrored Packets Mirrored traffic on egress port 'n' IF (packet is ingress mirrored) {TC,DP} is according to global ingress mirroring <TC> and <DP> configuration\ ELSE if (packet is egress mirrored) {TC,DP} is according to global egress mirroring <TC> and <DP> configuration Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 125 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Control traffic on egress port 'n' M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED QoS Enforcement accordingly. To enable {TC,DP} remapping of data traffic, set the <Remap DataTCEn> bit in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722). – • • To configure the device global {TC,DP}->{TC,DP} data mapping table, set the <Data Packets TC and DP0 Remapping Register> and Data Packets TC and DP1 Remapping Register (Table 468 p. 725) accordingly. To enable {TC,DP} assignment of control traffic, set the <Remap ControlTCEn> bit in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722). – To configure the dedicated control traffic class, set the<ControlTC> field in the <Transmit Queue Extended Control Register (Table 459 p. 717) accordingly. – – To configure the DP for CPU<->CPU traffic, set the <ControlDP0> bit. To configure the DP for CPU<->Network traffic, set the <ControlDP1> bit. To configure {TC, DP} for mirrored traffic, set the Ingress and Egress Monitoring to Analyzer QoS Configuration Register (Table 452 p. 708) accordingly. 8.4.2 Congestion Avoidance M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Congestion avoidance consists of the following: • A priority queue for packet enqueuing is selected by the packet traffic class assignment (Section 8.4.1 "Traffic Class and Drop Precedence Assignment"). 
  The device incorporates eight egress priority queues per port (including the CPU port).
• The packet may be discarded, based on the decision of a tail-drop algorithm. The color-aware tail-drop algorithm uses the packet drop precedence assignment.

See Section 15.3 "Egress Bandwidth Management" on page 305 for further information about the tail-drop mechanism.

8.4.3 Congestion Resolution

Congestion resolution consists of the following:
• A per-priority queue shaper limits the maximum throughput from a priority queue.
• A per-port scheduler services competing priority queues according to a pre-configured strict priority and/or Shaped Deficit Weighted Round Robin (SDWRR) policy.
• A per-port global shaper limits the maximum port throughput.

See Section 15.3 "Egress Bandwidth Management" on page 305 for further information about egress scheduling and traffic shaping operation and configuration.

8.5 Setting Packet Header QoS Fields

This section describes how the device sets the packet header User Priority and/or DSCP fields.

Header modification is performed in the egress pipe, after QoS enforcement. However, it is enabled in the ingress pipe, by setting the QoS attributes <Modify DSCP> and/or <Modify User Priority>.

8.5.1 Setting the Packet Header 802.1p User Priority Field

The device may set the packet 802.1p User Priority field only if the packet is transmitted with a VLAN tag. This may be a new VLAN tag that is added, or the same VLAN tag with which the packet was received on its ingress port. For details on VLAN tagging, see Section 11.2.1 "VLAN Tag Rx and Tx" on page 204.

If the packet is transmitted VLAN-tagged, and the packet QoS attribute <Modify User Priority> flag is set, the packet User Priority assignment is based on the packet's <QoS Profile> attribute. The <QoS Profile> is used as an index into the QoS Profile table, which maps it to the set {TC, DP, UP, DSCP} (see Section 8.1.4 "QoS Profile").

If the packet is received with a VLAN tag and transmitted with a VLAN tag, and the packet QoS attribute <Modify User Priority> flag is clear, the User Priority field is not changed.

If the packet is transmitted with a new VLAN tag, but the packet QoS attribute <Modify User Priority> flag is clear, the packet User Priority is assigned according to the ingress port <default User Priority> configuration. This allows simple port-based User Priority assignment, rather than basing the User Priority on the overall QoS service assigned to the packet.

Configuration
• To configure the ingress default port User Priority, set the <PUP> field of the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• To configure the QoS Profile table, set the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696) accordingly.
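The three User Priority cases of this section can be summarized as a small decision function. The following is only a sketch in Python, not device code: the names `QosProfileEntry` and `transmitted_user_priority` are hypothetical, and it assumes that "transmitted with a new VLAN tag" can be modeled as "received untagged" (`rx_up` is `None`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QosProfileEntry:
    """One QoS Profile table entry: the set {TC, DP, UP, DSCP} (hypothetical model)."""
    tc: int
    dp: int
    up: int
    dscp: int


def transmitted_user_priority(
    tx_tagged: bool,
    rx_up: Optional[int],
    modify_up: bool,
    profile: QosProfileEntry,
    port_default_up: int,
) -> Optional[int]:
    """Return the 802.1p User Priority carried by the transmitted packet.

    rx_up is the User Priority of the received VLAN tag, or None when the
    packet arrived untagged (assumption: any transmitted tag is then new).
    """
    if not tx_tagged:
        return None                # no VLAN tag, so there is no User Priority field
    if modify_up:
        return profile.up          # <Modify User Priority> set: use the QoS Profile entry
    if rx_up is not None:
        return rx_up               # same tag as received: field is not changed
    return port_default_up         # new tag: ingress port default User Priority


voice = QosProfileEntry(tc=6, dp=0, up=5, dscp=46)   # illustrative values only
print(transmitted_user_priority(True, 3, True, voice, 1))      # 5
print(transmitted_user_priority(True, 3, False, voice, 1))     # 3
print(transmitted_user_priority(True, None, False, voice, 1))  # 1
```

The order of the checks mirrors the text: the <Modify User Priority> flag takes precedence, and the port default is used only when a new tag is added and the flag is clear.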
8.5.2 Setting the Packet IP Header DSCP Field

If the packet QoS attribute <Modify DSCP> is set, and the packet is IPv4/6, then the DSCP in the packet's IPv4/6 header is modified. The packet DSCP assignment is based on the packet's <QoS Profile index> attribute. The <QoS Profile index> is used as an index into the QoS Profile table, which maps it to the set {TC, DP, UP, DSCP} (see Section 8.1.4 "QoS Profile").

Configuration
To configure the QoS Profile table, set the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696) accordingly.

8.5.3 Setting QoS Fields on Cascaded Ports

8.5.3.1 QoS Profile

In a cascaded system, the data packet QoS Profile used to access the QoS table is recorded in the packet DSA tag when the packet traverses a cascade port. This preserves the QoS Profile assignment along the packet path in the system. Consequently, keeping the devices' QoS tables synchronized ensures a consistent service assignment across the system.

The QoS Profile is recorded in DSA tag type=FORWARD, described in A.4 "Extended DSA Tag in FORWARD Format" on page 341.

8.5.3.2 Marking Packet DSCP, User Priority Fields in a Cascaded System

In a cascaded system, packet DSCP and User Priority fields are modified by the ingress device. Cascade ports are configured so that the headers of packets received on them are not modified.
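The division of work described in Section 8.5.3 can be illustrated with a toy model of two devices that share one QoS table. This is a hedged sketch only: `Packet`, `ingress_device`, `cascade_device`, and the table contents are hypothetical names and values, the DSA tag is reduced to a single QoS Profile field, and both modify flags are assumed to be set on the ingress device.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple

# QoS Profile index -> (TC, DP, UP, DSCP); the contents are illustrative only.
QosTable = Dict[int, Tuple[int, int, int, int]]


@dataclass(frozen=True)
class Packet:
    dscp: int
    up: int
    dsa_qos_profile: Optional[int] = None   # stand-in for the DSA tag QoS Profile field


def ingress_device(pkt: Packet, profile: int, table: QosTable) -> Packet:
    """Ingress device: mark the header and record the profile in the DSA tag."""
    _tc, _dp, up, dscp = table[profile]
    return replace(pkt, dscp=dscp, up=up, dsa_qos_profile=profile)


def cascade_device(pkt: Packet, table: QosTable) -> Tuple[Packet, int, int]:
    """Device behind a cascade port: trust the DSA tag, leave the header untouched."""
    if pkt.dsa_qos_profile is None:
        raise ValueError("packet on a cascade port is expected to carry a DSA tag")
    tc, dp, _up, _dscp = table[pkt.dsa_qos_profile]
    return pkt, tc, dp


shared_table: QosTable = {10: (5, 0, 4, 34)}   # identical on both devices
marked = ingress_device(Packet(dscp=0, up=0), profile=10, table=shared_table)
forwarded, tc, dp = cascade_device(marked, shared_table)
print(forwarded.dscp, forwarded.up, tc, dp)    # 34 4 5 0
```

Because the second device derives {TC, DP} from the same profile index, the service level stays consistent only while the two tables hold the same entries, which is the synchronization requirement stated in Section 8.5.3.1.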
8.5.3.3 Recommended Configuration of Cascading Ports

See Section 8.6.2 "QoS Configuration of Cascade Ports" on page 130 for detailed QoS configuration of cascaded ports in a homogeneous system.

8.6 Applications

This section describes common QoS applications.

8.6.1 Applications of Port-Based QoS Marker

A basic QoS application must provide the service assignment modes described in Table 33. Table 34 shows the recommended network port configuration for each of these modes.

Throughout this section, it is assumed that the Nested VLANs feature is disabled (Section 11.2.3 "Nested VLAN Tags" on page 211).

Note
The device port-based QoS marker configuration spans a wider range of capabilities than described here. This section describes only the common modes from the network user's perspective. For a detailed description of the port-based QoS marker, see Section 8.2.1 "Port-Based QoS Marking".

Table 33: Network Ports QoS Operation Modes

QoS Blind Mode

  Mode: QoS is disabled
  Description: The device operates in a best effort network. All traffic is serviced equally. In this mode, the port is not QoS-aware.
  • All traffic ingressing at this port is assigned a single, port default, QoS Profile.
  • Packet DSCP, User Priority fields are not modified.

Network/DiffServ Boundary Mode

  Mode: Untrust mode
  Description: The port operates at the network edge, but service information from outside the network, embedded in packet DSCP and User Priority fields, cannot be trusted.
  • All traffic ingressing at this port is assigned a single, port default, QoS Profile.
  • Packet DSCP, User Priority fields are optionally modified to convey service information to the next switches/routers along the packet path.

Network/DiffServ Core Mode

  Mode: Trust L2 802.1p User Priority
  Description: The port operates at the network core and only the User Priority field of the arriving packets can be trusted. This is typical when the network edge switches are incapable of marking the packet DSCP to convey service information to the core switches/routers, or when the traffic is not IP.
  • The QoS Profile is set according to the packet User Priority field if the packet is tagged, or to the port default for untagged packets.
  • Packet DSCP is modified to convey service information to the next switches/routers along the packet path.

  Mode: Trust L3 DSCP
  Description: The port operates at the network core, or the port is located at the edge of a trusted DiffServ domain. Therefore, the DSCP field of the arriving packets can be trusted.
  • If the port is located at the edge of a trusted DiffServ domain, a DSCP to DSCP mapping may be performed to support DiffServ domain crossing. The packet DSCP is used for DSCP to DSCP mutation. If the packet is non-IP, the DSCP mutation stage is skipped.
  • The QoS Profile is set according to the packet/mutated DSCP if the packet is IPv4/6, or to the port default for non-IP packets.
  • Packet User Priority is optionally modified to convey service information to the next switches/routers along the packet path, which may not be able to assign service based on the DSCP field.
  • If the DSCP field is mutated, the packet DSCP field must be modified to reflect the correct service information to the network core.

  Mode: Trust L3+L2
  Description: The port operates at the network core and both the DSCP and User Priority fields of the arriving packets can be trusted.
  • For IP packets, the QoS Profile is set according to the packet DSCP.
  • For tagged non-IP packets, the QoS Profile is set according to User Priority.
  • For untagged non-IP packets, the QoS Profile is set according to the port default.
  • Packet DSCP and User Priority fields are not modified.
  NOTE: In Trust L3+L2 mode, the device provides up to 64 service levels to IP traffic and eight service levels to non-IP traffic, concurrently.

Table 34: Recommended QoS Configuration For Network Ports

Columns are the QoS operation modes (per Table 33): QoS Disabled Mode, Untrust Mode, Trust L2 802.1p User Priority Mode, Trust L3 DSCP Mode, Trust L3+L2 Mode.

Hardware Configuration       QoS Disabled  Untrust   Trust L2  Trust L3              Trust L3+L2
Trust DSA tag QoS Profile    Disabled      Disabled  Disabled  Disabled              Disabled
Trust L3                     Disabled      Disabled  Disabled  Enabled               Enabled
Trust L2                     Disabled      Disabled  Enabled   Disabled              Enabled
Enable DSCP Mapping          Disabled      Disabled  Disabled  Enabled/Disabled (1)  Enabled/Disabled
Modify DSCP                  Disabled      Enabled   Enabled   Enabled/Disabled (2)  Enabled/Disabled
Modify User Priority         Disabled      Enabled   Disabled  Enabled               Disabled

Port Default QoS Profile (all modes): Set to a QoS Profile pointing to the best effort or less-than-best-effort services.

Ingress Port Default User Priority:
  QoS Disabled, Trust L2: User Priority used for packets that arrive untagged and are transmitted tagged.
  Untrust, Trust L3: Not relevant (3).
  Trust L3+L2: User Priority value pointed to by the port default QoS Profile.

Port QoS Precedence (all modes): Set to SOFT to allow subsequent markers to modify QoS attributes.

1. Enable if DSCP mutation is required.
2. Enable packet DSCP modification if DSCP mutation is enabled.
3. When modification of the packet User Priority is enabled, the User Priority is always taken from the QoS Profile table. Therefore, the ingress port default User Priority is not relevant.

8.6.2 QoS Configuration of Cascade Ports

Table 35 presents the recommended configuration of cascaded ports in a system based on these devices.

A cascaded system should be viewed as a small-scale network. The ingress network port acts as a network edge port for the system, so the ingress device assigns the service level and performs packet QoS marking. A cascaded port never resides on the system boundary, so it can be considered a 'core' port for that system. The recommended configuration for cascaded ports resembles the configuration of network core ports, i.e., service information is trusted and the packet QoS header is not modified.
Table 35: Recommended QoS Configuration For Cascading Ports

Port Configuration                  Value         Description
Trust DSA tag QoS Profile           Enabled       The QoS Profile recorded in the DSA tag is used to convey service information set by the ingress device along the packet path.
Trust L3                            Disabled
Trust L2                            Disabled
Enable DSCP Mapping                 Disabled
Modify DSCP                         Disabled      Packet DSCP has been modified by the ingress device.
Modify User Priority                Disabled      Packet User Priority has been modified by the ingress device.
Port Default QoS Profile            Not relevant  QoS Profile in DSA tag always exists.
Ingress Port Default User Priority  Not relevant  Packet is considered tagged by the ingress port.
Port QoS Precedence                 HARD          Prevents QoS marking.

8.6.3 {TC, DP} Assignment on Cascade Ports

The queuing and bandwidth resources of cascade ports are shared between traffic types. By assigning different traffic types to distinct priority queues, the application can control queuing resources and bandwidth allocation per traffic type.

Table 36 shows an example of {TC,DP} assignment over cascade ports. Table 37 shows the associated configuration.

Table 36: Sample {TC, DP} Assignment

Traffic Type                {TC, DP} Assignment
Control traffic             Assigned to TC#7.
                            CPU<->CPU traffic is assigned DP=0.
                            CPU<->Network traffic is assigned DP=1.
Data traffic                Mapped to TCs 6-0.
                            DP is remapped to reflect the reduced number of TCs.
Mirror-to-analyzer traffic  Assigned to {TC=0, DP=1}. (This example assumes the destination analyzer port is on a remote device and the mirrored packet is sent across a cascade port.)
                            In addition, the queuing resource limits of the ingress/egress monitor traffic are set to consume less than the configured share of TC#0, DP=1 (see Section 15.3.1 "Enqueueing and Congestion Avoidance Tail-Dropping").

Table 37: Configuration for Table 36

Traffic Type                Configuration                 Value
Control traffic             Remap Control Traffic         Enabled
                            Control TC                    7
                            CPU<->CPU DP                  0
                            CPU<->Network DP              1
Data traffic                Remap Data Traffic            Enabled
                            {TC,DP}->{TC,DP} table        Application specific
Mirror-to-Analyzer Traffic  Ingress monitor {TC,DP}       {TC=0, DP=1}
                            Egress monitor {TC,DP}        {TC=0, DP=1}

8.6.4 Marking the 802.1p User Priority in the Nested VLAN Tag

This sub-section describes the marking of the User Priority field in the additional nested VLAN tag prepended to the packet. Nested VLANs are described in Section 11.2.3 "Nested VLAN Tags" on page 211.

The User Priority field in the nested VLAN tag is marked according to the following policy:
• If the packet QoS attribute <Modify User Priority> is disabled, the User Priority is set to the port default User Priority.
• Otherwise, the User Priority is set to the User Priority value extracted from the QoS Profile table.
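The policy above selects a 3-bit value that ends up in the top bits of the tag's Tag Control Information (TCI) field. The sketch below, in Python, applies the two-branch policy and then packs a 4-byte 802.1Q-style tag. The function names are hypothetical, and the TPID value 0x8100 is an assumption made for illustration (the EtherType actually used for the nested tag is not stated in this section).

```python
import struct


def nested_tag_user_priority(modify_up: bool, port_default_up: int, profile_up: int) -> int:
    """Two-branch policy: port default when <Modify User Priority> is disabled."""
    return profile_up if modify_up else port_default_up


def build_vlan_tag(tpid: int, up: int, vid: int, cfi: int = 0) -> bytes:
    """Pack TPID followed by TCI = User Priority (3 bits) | CFI (1 bit) | VID (12 bits)."""
    if not 0 <= up <= 7:
        raise ValueError("User Priority is a 3-bit field")
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit field")
    tci = (up << 13) | ((cfi & 1) << 12) | vid
    return struct.pack("!HH", tpid, tci)   # network byte order


up = nested_tag_user_priority(modify_up=True, port_default_up=2, profile_up=5)
tag = build_vlan_tag(tpid=0x8100, up=up, vid=100)
print(tag.hex())   # 8100a064
```

With <Modify User Priority> disabled, the same call would carry the port default (2 in this example) in the upper three TCI bits instead.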
Every access port can be independently configured to mark User Priority according to any of the following modes:
• Mark all customer packets by port default User Priority.
  – The port must be configured to untrust mode.
  – Set port default User Priority and port default QoS Profile to the desired values.
  – 'Modify User Priority' can be disabled.
• Copy customer packet User Priority to nested VLAN tag User Priority.
  – The port must be set to trust-L2.
  – The packet QoS Profile index is assigned based on the global User Priority to QoS Profile Index mapping table. The QoS Profile table entry associated with the assigned index contains the same User Priority as the original trusted packet.
  – 'Modify User Priority' must be enabled.
• Use policy-based User Priority marking. Every port can use a different marking policy, by binding each port to a different PCL.
  – The assigned QoS Profile index determines the nested VLAN User Priority via the QoS Profile table entry.
  – 'Modify User Priority' must be enabled.

8.6.5 DiffServ Support

8.6.5.1 Number Of DiffServ PHBs

The device provides 72 QoS Profiles. Sixty-four profiles can be dedicated to support all 64 DiffServ Codepoints. The remaining QoS Profiles can be used to support eight User Priority values concurrently (see Section 8.6.7.1 "Number of IEEE 802.1p User Priority Values").

8.6.5.2 Support Unrecognized DSCP Values

Remapping unrecognized DSCP values to the default DSCP is required by RFC 2474. The device supports this feature by assigning unrecognized DSCP values to a single QoS Profile defined by the application as the DEFAULT PHB. DSCP to QoS Profile mapping is done using the global DSCP to DSCP mutation table, see Figure 26.

8.6.5.3 Support DiffServ Domain Crossing

The DiffServ domain crossing feature enables the user to connect two DiffServ domains, each using different DSCP values to indicate identical service.

The device supports DiffServ domain crossing between two domains, where the boundary between them is located at the port ingress. The DSCP to DSCP mutation table is used to map DSCP values between the domains. Figure 26 describes the DiffServ domains boundary location and how the DSCP to DSCP mutation table is used.

Figure 26: DiffServ Domains Crossing Using a Single DSCP to DSCP Mutation Table
(Diagram: Switch A in DiffServ Domain A and Switch B in DiffServ Domain B, each with an ingress and an egress side. The ingress of Switch B performs DSCP A to DSCP B mapping, and the ingress of Switch A performs DSCP B to DSCP A mapping. Each switch shows the tables port<x>QosProfile, upToQosProfileTbl, dscpToQosProfileTbl, and qosProfileToQosTbl.)

8.6.5.4 Backward Compatibility to IP Precedence

RFC 2474 defines backward compatibility of DSCP values to IP precedence by mapping the codepoints 'xxx000' to their respective class selector PHBs.

The device supports backward compatibility to IP precedence, as required in RFC 2474, using the DSCP to QoS Profile table. The table must be configured to map DSCP values equal to 'xxx000' (where 'xxx' denotes the IP precedence bits) to the appropriate QoS Profiles.

8.6.6 ToS Byte and IP Precedence Support

The functionality of the IP ToS byte is defined in RFC 791.

The device provides full compatibility to RFC 791 by the following means:
• If the DiffServ domain uses the ToS byte convention, the packet DSCP field must be considered by the application as the ToS byte. Use the DSCP to QoS Profile table to map packet ToS bits to a QoS Profile.
• Use the DSCP to DSCP mutation table to map ToS values to DSCP values.

Note
The device ignores the CU (Currently Unused) bits defined in RFC 2474 - Differentiated Services Field.

8.6.7 IEEE 802.1p Support

8.6.7.1 Number of IEEE 802.1p User Priority Values

The device provides 72 QoS Profiles. Eight profiles can be dedicated to support all 802.1p User Priority levels.
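The profile budget described in Sections 8.6.5.1 and 8.6.7.1 (64 + 8 = 72) and the 'xxx000' class selector codepoints of Section 8.6.5.4 can be checked with a few lines of Python. This is an illustrative sketch only: the index layout (profiles 0 to 63 for DSCP, 64 to 71 for User Priority) is an assumption chosen for the example, not a layout required by the device, and the table and function names are hypothetical.

```python
NUM_PROFILES = 72
DSCP_VALUES = 64
UP_VALUES = 8

# Assumed layout: profile index equals the DSCP for IP traffic,
# and profiles 64..71 serve User Priority 0..7 for non-IP traffic.
dscp_to_profile = {dscp: dscp for dscp in range(DSCP_VALUES)}
up_to_profile = {up: DSCP_VALUES + up for up in range(UP_VALUES)}

# Together the two mappings use each of the 72 profiles exactly once.
used = set(dscp_to_profile.values()) | set(up_to_profile.values())
assert used == set(range(NUM_PROFILES))


def class_selector_dscp(ip_precedence: int) -> int:
    """RFC 2474 class selector codepoint 'xxx000' for a 3-bit IP precedence."""
    if not 0 <= ip_precedence <= 7:
        raise ValueError("IP precedence is a 3-bit field")
    return ip_precedence << 3


def dscp_from_ds_byte(ds_byte: int) -> int:
    """Upper six bits of the ToS/DS byte; the two CU bits are ignored."""
    return (ds_byte & 0xFF) >> 2


print(class_selector_dscp(5))                    # 40
print(dscp_to_profile[class_selector_dscp(5)])   # 40
print(up_to_profile[7])                          # 71
print(dscp_from_ds_byte(0xB8))                   # 46
```

Under this layout, IP precedence 5 lands on DSCP 40 ('101000'), so only the eight 'xxx000' entries of the DSCP to QoS Profile table need to point at the class selector profiles.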
Section 9. Network Interfaces and Media Access Controllers (MACs)

This section describes the device's network interfaces and its Media Access Controllers (MACs).

The device implements the following media interfaces:

SGMII  A physical interface for the Tri-Speed 10/100/1000 Mbps ports. This is a serialized version of the IEEE 802.3 GMII interface. Through its 1.25 Gbps SERDES transceiver, it connects gluelessly to the 88E1xxx Alaska family of 1000BASE-T Ethernet transceivers and to 1000BASE-X fiber transceivers.

XAUI   IEEE 802.3ae 10 Gigabit Attachment Unit Interface, incorporating four 3.125 Gbps duplex differential serial lanes with 8B/10B ENDEC, as defined by the 802.3ae 10GBASE-X/XGXS device requirements. In an inter-device connection, this interface may be accelerated to 12 Gbps, where each of the four SERDES lanes runs at 3.75 Gbps.

HX     Auxiliary Interface, incorporating two 3.125 Gbps duplex differential serial lanes, with 8B/10B ENDEC. The HX interface runs at 5 Gbps.

QX     Auxiliary Interface, with one 3.125 Gbps duplex differential serial lane, with 8B/10B ENDEC. The QX interface runs at 2.5 Gbps.

The device implements a standard 802.3 10/100/1000/10000 Mbps Ethernet MAC.

The network interface of the Tri-Speed ports (ports 0 through 23) is a serial interface (SERDES), providing 1000BASE-X for a Gigabit Ethernet fiber application and SGMII for a copper interface.

The network interface of the HyperG.Stack ports is the IEEE 802.3ae 10 Gigabit Attachment Unit Interface (XAUI), incorporating four 3.125 Gbps duplex differential serial lanes, with 8B/10B ENDEC as defined by the 802.3ae 10GBASE-X/XGXS device requirements. In an inter-device connection, this interface may be accelerated to 12 Gbps, where each of the four SERDES lanes runs at 3.75 Gbps.

An HX/QX port incorporates two 3.125 Gbps SERDES lanes. This port can work in a single lane mode (QX mode) or in a dual lane mode (HX mode).

The network interface of the HX port incorporates two 3.125 Gbps duplex differential serial lanes with 8B/10B ENDEC. The HX interface runs at 5 Gbps.

The network interface of the QX port is one 3.125 Gbps duplex differential serial lane with 8B/10B ENDEC. The QX interface runs at 2.5 Gbps.

The MAC layer includes packet delineation, erroneous frame detection (e.g., fragments and jabbers), CRC checking and generation, frame padding, and MIB counters.

The MAC interface supports 802.3x Flow Control, and Auto-Negotiation of speed, duplex, and Flow Control support for the relevant speeds. This interface is also capable of polling the PHY status, to detect changes in the link.

The Ethernet MAC supports half-duplex mode only when running at 10 Mbps or 100 Mbps. Half-duplex mode is not supported at 1 Gbps or 10 Gbps. The MAC supports back pressure when running at 10/100 Mbps in half-duplex mode.

The Ethernet MAC layer is fully compliant with IEEE 802.3.
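The interface speeds quoted in this section follow from the lane count, the per-lane line rate, and the 8B/10B coding overhead (8 data bits carried in every 10 line bits). The short Python sketch below reproduces that arithmetic; the function name and the dictionary are hypothetical, and the figures are the ones given in the text above.

```python
def payload_rate_gbps(lanes: int, lane_rate_gbps: float) -> float:
    """Usable data rate of an 8B/10B-coded interface."""
    return lanes * lane_rate_gbps * 8 / 10


# (number of lanes, line rate per lane in Gbps), as stated in this section
interfaces = {
    "SGMII / 1000BASE-X": (1, 1.25),
    "QX": (1, 3.125),
    "HX": (2, 3.125),
    "XAUI": (4, 3.125),
    "XAUI, accelerated": (4, 3.75),
}

for name, (lanes, rate) in interfaces.items():
    print(f"{name}: {payload_rate_gbps(lanes, rate)} Gbps")
```

The results are 1.0, 2.5, 5.0, 10.0, and 12.0 Gbps respectively, matching the 2.5 Gbps (QX), 5 Gbps (HX), 10 Gbps (XAUI), and 12 Gbps (accelerated XAUI) figures above.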
M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Network Interfaces and Media Access Controllers (MACs) R Tri-Speed Port Overview This section is relevant for the following devices: D SecureSmart: 98DX106, 98DX163, 98DX163R, 98DX243, 98DX262 D SecureSmart Stackable: 98DX169, 98DX249, 98DX269 D Layer 2+ Stackable:98DX130, 98DX166, 98DX246, 98DX250, 98DX260, 98DX270 D Multilayer Stackable: 98DX107,98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273 U Not relevant for the 98DX803. Each of the Tri-Speed ports (ports 0 through 23) incorporates a standard MAC and has an integrated Physical Coding Sub-layer (PCS) and an integrated SERDES. The physical interface is an SGMII interface or an 1000Base-X interface. SGMII is a serialized version of the IEEE 802.3 GMII interface. It connects gluelessly to the 88E1xxx Alaska family of copper PHY Ethernet transceivers and to 1000BASE-X fiber transceivers. Figure 27 and Figure 28 are functional block diagrams of a single port. Figure 27: Functional Block Diagram of Tri-Speed Port in 1000BASE-X Mode MAC Status Register MIB Counters M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 802.3 MAC Link State, Flow Control GMII M PCS Compliant with IEEE 802.3 Clause 36 In-band Auto-Negotiation (IEEE 802.3 Clause 37) TBI LED Indications 1.25 Gbps SERDES P<n>_TX_P P<n>_TX_N August 24, 2006, Preliminary P<n>_RX_P Copyright © 2006 Marvell Differential Pairs (CML) CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. 
E Page 137 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 9.1 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The Media Access Controller (MAC) layer incorporates all MAC-specific functions compliant with the IEEE 802.3 Ethernet standard for 10/100/1000/10000 Mbps speeds, including a Serial Management Interface (SMI) for PHY management and Auto-Negotiation. P<n>_RX_N MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Tri-Speed Port Overview MAC Status Register 802.3 MAC MIB Counters GMII/MII InBandAnEn SGMII Convertor GMII Link State, Speed, Duplex mode Link State, Speed, Duplex mode, Flow Control LED Indications PCS Compliant with IEEE 802.3 Clause 36 SGMII In-band Auto-Negotiation PHY Polling Unit TBI 1.25 Gbps SERDES M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 M_MDIO<0/1> M_MDC<0/1> P<n>_TX_P P<n>_TX_N P<n>_RX_P Differential Pairs (CML) Master SMI I/F (IEEE 802.3 Clause 22 ) M Each port consists of a very accurate, low-power SERDES, based on unique Marvell technology. The SERDES has an input differential pair and an output differential pair. The SERDES has a built-in signal-detect circuit, a comma detection and alignment, as well as a Pseudo Random Bit Stream (PRBS) generator and checker. The Physical Coding Sub-layer (PCS) resides above the SERDES and implements IEEE 802.3 PCS, as described in Clause 36. The PCS performs 8b/10b encoding/decoding mapping between IEEE 802.3 1000BASE-X symbols and GMII and implements the 1000BASE-X SYNC machine. The PCS supports loopback for software diagnostics. An integral part of the PCS layer is the in-band Auto-Negotiation machine, which operates according to the IEEE 802.3, Clause 37.3. The thin SGMII layer is a bus converter unit that converts between GMII and MII, when working at 10/100 Mbps speeds in SGMII. 
9.2 R HyperG.Stack Port Overview This section is relevant for the following devices: D SecureSmart: 98DX262 D SecureSmart Stackable: 98DX269 D Layer 2+ Stackable: 98DX130, 98DX260, 98DX270, 98DX803 D Multilayer Stackable: 98DX253, 98DX133, 98DX263, 98DX273 Each of the HyperG.Stack ports (ports 24 through 26) incorporates a standard MAC and has an integrated XAUI transceiver with four SERDES Lanes. Figure 29 is a functional block diagram of the interface for a HyperG.Stack port. MV-S102110-02 Rev. E Page 138 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 28: Functional Block Diagram of Tri-Speed Port in SGMII Mode P<n>_RX_N MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Network Interfaces and Media Access Controllers (MACs) AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 29: Functional Block Diagram of the HyperG.Stack Port M A C S tatu s R e g iste r 8 0 2 .3 MAC M IB C o u n ters XAUI PHY X G M II Interface M A C L E D In d ica tio n s LE D P H Y L E D In dicatio n s M a ster IE E E 8 0 2.3 C la u se 45 co m p lia n t S M I Inte rfa ce 10G B A S E -X R egisters FInput IF O FInput IF O F IF O A lign and 10 G B A S E -X ADlingn esk ewand A and Dlingn esk ew A and Dlingn esk ew D esk ew XG M D IO M anagem ent Interfac e D e-S erialize r D e-S C loerializer ck C lo ck R eco very R ec overy C LK 10G S E -X Tx B PA CS 10G Tx B PA CSSE -X 10G Tx B PA CSSE -X Tx PCS TBG/ BG/ CTlock Tlock BG/ S yth C BG/ CTlock S erializer S erializer S erializer M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo 
A# gie 12 s 10 17 86 XAUI P<n>_TX_P/N[3] P<n>_TX_P/N[2] P<n>_TX_P/N[1] S yth Lane0 lock SCyth Lane1 S yth Lane2 Lane3 S erializer P<n>_TX_P/N[0] P<n>_RX_P/N[3] P<n>_RX_P/N[2] P<n>_RX_P/N[1] P<n>_RX_P/N[0] P<n>_REF_CLK_125_P P<n>_REF_CLK_125_N S_XMDIO S_XMDC M_XMDIO M Input FInput IF O 10G S E -X R xB PA CS 10G R x BPACSSE -X 10G B R x PACSSE -X Rx PCS Each port integrates a XAUI PHY that integrates: • IEEE 802.3ae 10 Gigabit Auxiliary Attachment Unit Interface (XAUI) • IEEE 802.3ae 10 Gigabit Ethernet XGMII Extender sublayer (XGXS) • Serial Management Interface And an 802.3 MAC that integrates: IEEE 802.3ae 10 Gigabit Ethernet Reconciliation sublayer (RS) IEEE 802.3ae 10 Gigabit Ethernet Medium Access Control (MAC) layer • • 9.2.1 XAUI PHY 9.2.1.1 XAUI Interface and XGXS Each port XAUI PHY integrates: • IEEE 802.3ae 10 Gigabit Auxiliary Attachment Unit Interface (XAUI) • IEEE 802.3ae 10 Gigabit Ethernet XGMII Extender sublayer (XGXS) The purpose of the XGXS/XAUI is to extend the physical separation between MAC/RS and PHY components in a 10 Gigabit Ethernet system. The MAC/RS data stream is organized into four lanes, with each lane conveying a Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 139 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 X G M II M_XMDC MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED HyperG.Stack Port Overview AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 data octet or control character. The XGXS converts bytes of the MAC data stream lane into a self-clocked serial 8B/10B encoded data stream. Each lane is transmitted across one XAUI lane. Transmit The transmit data from the MAC/RS sub-layer is 8B/10B encoded, serialized and transmitted sequentially over the differential high-speed pins. 
The inter-frame "Idle" control characters are coded into the 8B/10B code sequence.

Receive
The transceiver accepts serial data at its pins. CDR circuitry recovers the clock and the data on each lane. The receiver performs serial-to-parallel conversion, comma detection, lane alignment, and 8B/10B decoding. The receiver uses a FIFO to phase-align the lanes' recovered clocks into a single clock domain. It is assumed that the recovered clocks are frequency-locked, but are not necessarily in phase. The lane alignment ensures that symbols received on the four lanes are aligned when presented to the MAC/RS sublayer. The receiver adds to or deletes from the inter-frame code as needed to compensate for clock rate disparity, prior to converting the inter-frame code into "Idle" control characters.

Clock Recovery and Jitter Tolerance
The transceiver uses a DPLL to generate a clock with the same frequency as the incoming serial data. The clock is phase-aligned by the PLL, so that it samples the data in the center of the data eye pattern. The total loop dynamics of the clock recovery DPLL yield a jitter tolerance that exceeds the tolerance proposed for 10 GbE equipment by IEEE 802.3ae Clause 47.3.4.4. The receiver tolerates a total of 0.6 UI (unit interval) peak-to-peak jitter amplitude and a minimum of 0.36 UI peak-to-peak deterministic jitter. In the presence of maximum jitter, the receiver is still expected to meet a BER of 10^-12.

9.2.1.2 Serial Management Interface (XSMI)
The transceiver XGXS/XAUI function is controlled by a dedicated XMDIO management interface, as defined in IEEE 802.3ae Clause 45. The register map is composed of IEEE-defined manageable device registers and vendor-specific registers. The XGXS/XAUI can be configured as a PCS, PHY, and DTE device.
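The Clause 45 framing that the XMDIO interface relies on can be illustrated with a minimal Python sketch. It is not taken from this specification: the bit layout follows the general IEEE 802.3 Clause 45 frame format (ST, OP, PRTAD, DEVAD, TA, 16-bit address/data), and the function names, the port address 1, and the register address 0x8000 are hypothetical values chosen only for illustration. Read frames, in which the managed device drives the turnaround and data bits, are left out.

```python
"""Sketch: encode the 32-bit body of Clause 45 address and write frames.

Assumption: generic IEEE 802.3 Clause 45 layout; the 32-bit preamble of
ones that precedes every frame on the wire is omitted.
"""

ST_CLAUSE45 = 0b00   # start-of-frame code for Clause 45
OP_ADDRESS = 0b00    # loads the 16-bit address register
OP_WRITE = 0b01      # writes the register selected by the address register
TA_STATION = 0b10    # turnaround bits driven by the station


def encode_frame(op, prtad, devad, payload):
    """Pack one address or write frame into a 32-bit integer."""
    if not 0 <= prtad < 32 or not 0 <= devad < 32:
        raise ValueError("PRTAD and DEVAD are 5-bit fields")
    if not 0 <= payload <= 0xFFFF:
        raise ValueError("address/data field is 16 bits")
    return ((ST_CLAUSE45 << 30) | (op << 28) | (prtad << 23)
            | (devad << 18) | (TA_STATION << 16) | payload)


def write_register(prtad, devad, reg_addr, value):
    """A register write is an address frame followed by a write frame."""
    return [encode_frame(OP_ADDRESS, prtad, devad, reg_addr),
            encode_frame(OP_WRITE, prtad, devad, value)]


# Hypothetical example: port address 1, device address 4, register 0x8000.
frames = write_register(prtad=1, devad=4, reg_addr=0x8000, value=0x1234)
print([hex(f) for f in frames])
```

The two-frame sequence reflects the 16-bit address register: the address frame selects the target register, and the following write frame carries the data.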
The management interface implements a 16-bit address register that stores the address of the XGXS/XAUI function register to be accessed. Write, read, and posted-read-increment-address XMDIO serial frames access the address register. The PHY and DEV addresses may be configured.

The CPU can use the XMDIO management interface to access and control the XGXS/XAUI registers. See Section 6.2 "Serial Management Interfaces (SMI)" on page 79 for details about the SMI interface.

Note
To manage the XGXS/XAUI via the MAC XMDIO interface, externally connect the M_XMDC output to the S_XMDC<0,1,2> input and M_XMDIO to S_XMDIO<0,1,2>.

The 10 GbE port integrates a serializer/deserializer transceiver, which incorporates four synchronized lanes delivering bi-directional point-to-point data transmission of 3.125 Gbps or 3.75 Gbps per lane. The device accepts a reference clock of 125 MHz. The external clock must be within +/- 100 ppm. The transceiver supports pre-emphasis on the serial driver, to compensate for skin effect.

9.2.2 HyperG.Stack Port Media Access Control (MAC)
The HyperG.Stack Port MAC integrates:
• IEEE 802.3ae 10 Gigabit Ethernet Reconciliation Sublayer (RS)
• IEEE 802.3ae 10 Gigabit Ethernet Medium Access Control (MAC) layer

9.2.2.1 Reconciliation Sublayer (RS)
The Reconciliation Sublayer (RS) converts between the MAC data stream and the XGXS function. The RS generates continuous data or control characters on the transmit path and expects continuous data or control characters on the receive path.
The RS participates in link fault detection and reporting by monitoring the receive path for status reports indicating an unreliable link. Status reports are also generated on the transmit path, to report detected link faults to the DTE on the remote end of the connecting link.

Fault messages are special in-band symbols that indicate either a remote fault in the path between the local and remote RS layers, or a local fault that results from a problem in the local lower layers (e.g., PCS, PMD, or PMA).

The RS has an internal state machine for detection of local and remote faults. When a local fault is detected, the MAC stops sending packets (possibly in the middle of a packet) and continuously generates remote fault messages. If the fault message received is a remote fault, the MAC stops sending packets, but continues to send IDLE messages (and not remote fault messages). When the fault messages from the PHY stop, the MAC returns to normal operation and resumes transmitting data.

Whenever a MAC changes state from NORMAL to FAULT or vice versa, an interrupt is conveyed to the host processor. In addition, the MAC Status register shows the state of the MAC layer (link up vs. link down).

9.2.2.2 Ethernet MAC
The device implements an IEEE 802.3ae MAC sublayer with proprietary extensions. The 10 Gbps MAC includes the following standard features:
• Support for WAN mode (IfStretchMode), rate adaptation to OC-192 speed.
• Support for reception of frames with short inter-packet-gap (IPG) down to 4 bytes.
• CRC generation on transmit and CRC checking on receive.
• IEEE 802.3x Flow Control frame detection and generation.
• Support for any preamble length (multiple of 4 bytes).
Configuration of the XAUI PHY SMI Addresses (Section 9.2.1.2)
In the Port<n> XAUI PHY Configuration Register1 (24<=n<27) (Table 166 p. 436):
• To configure the XAUI transceiver SMI PHY address, set the <PHYAddr> field accordingly.
• To configure the XAUI transceiver SMI Device address, set the <PHYDevAddr> field accordingly.
• To configure the XAUI transceiver to accept any device address on the SMI interface, set the <AnyDevAddrEn> bit.

9.3 HX and QX Ports Overview
This section is relevant for the following devices:
• SecureSmart-Stackable: 98DX169, 98DX249, 98DX269

Each of the HX/QX ports (ports 25 and 26) incorporates a standard MAC and has an integrated Physical Coding Sub-layer (PCS) and two SERDES lanes. Figure 30 is a functional block diagram of the interface for the HX/QX ports.

Figure 30: Functional Block Diagram of the HX/QX
[Block diagram. Blocks shown: 802.3 MAC, MIB Counters, XGMII, MAC LED Indications, Port LED Indications, MAC Status Register, LED, Generic PCS for 1/2 Lanes, Registers, 3.125 Gbps SERDES, CLK. External signals: HX_REF_CLK_125_P, HX_REF_CLK_125_N, Pn_HX_TX_P/N[1:0], Pn_HX_RX_P/N[1:0].]

9.3.1 Generic Physical Coding Sub-layer (PCS)
The HX/QX port integrates a generic PCS for 1/2 lanes, based on the IEEE 802.3ae standard. The purpose of the PCS is to separate the MAC/RS and SERDES components. The MAC/RS data stream is organized into four lanes, with each lane conveying a data octet or control character. The PCS converts the bytes of each MAC/RS data stream lane into a self-clocked, serial, 8B/10B-encoded data stream. This data stream is then transmitted over one lane (QX mode) or two lanes (HX mode). Each lane is transmitted across one SERDES lane.

The HX/QX port integrates a serializer/deserializer transceiver, which incorporates two synchronized lanes delivering bi-directional point-to-point data transmission of 3.125 Gbps per lane. The device accepts a reference clock of 125 MHz. The external clock must be within +/- 100 ppm. The transceiver supports pre-emphasis on the serial driver, to compensate for skin effect.

Transmit
The transmit data from the MAC/RS sublayer is 8B/10B encoded, serialized, and transmitted sequentially over the differential high-speed pins. The inter-frame "Idle" control characters are coded into the 8B/10B code sequence.

Receive
The transceiver accepts serial data at its pins. CDR circuitry recovers the clock and the data on each lane. The receiver performs serial-to-parallel conversion, comma detection, lane alignment, and 8B/10B decoding. The receiver uses a FIFO to phase-align the lanes' recovered clocks into a single clock domain. It is assumed that the recovered clocks are frequency-locked, but are not necessarily in phase. The lane alignment ensures that symbols received on four lanes are aligned when presented to the MAC/RS sublayer. The receiver adds to or deletes from the inter-frame code as needed for clock rate compensation, prior to converting the inter-frame code into "Idle" control characters.
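The lane counts and per-lane line rates quoted for the HX/QX ports (and, for comparison, the four-lane XAUI interface of Section 9.2) imply a payload rate once the 8B/10B overhead is removed, since every 10 line bits carry 8 data bits. The short Python sketch below works through that arithmetic. The function name is hypothetical, and the resulting figures are derived here rather than quoted from the specification.

```python
"""Sketch: payload rate of an 8B/10B-encoded multi-lane SERDES link."""

LINE_RATE_MBPS = 3125  # per-lane line rate stated for XAUI and HX/QX lanes


def payload_rate_mbps(lanes, line_rate_mbps=LINE_RATE_MBPS):
    """8B/10B carries 8 payload bits in every 10 line bits."""
    return lanes * line_rate_mbps * 8 // 10


print("QX, 1 lane  :", payload_rate_mbps(1), "Mbps")
print("HX, 2 lanes :", payload_rate_mbps(2), "Mbps")
print("XAUI, 4 lanes:", payload_rate_mbps(4), "Mbps")
print("XAUI at 3.75 Gbps per lane:", payload_rate_mbps(4, 3750), "Mbps")
```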
Clock Recovery and Jitter Tolerance
The transceiver uses a DPLL to generate a clock with the same frequency as the incoming serial data. The clock is phase-aligned by the PLL, so that it samples the data in the center of the data eye pattern. The total loop dynamics of the clock recovery DPLL yield a jitter tolerance that exceeds the tolerance proposed by IEEE 802.3ae Clause 47.3.4.4. The receiver tolerates a total of 0.6 UI (unit interval) peak-to-peak jitter amplitude and a minimum of 0.36 UI peak-to-peak deterministic jitter. In the presence of maximum jitter, the receiver is still expected to meet a BER of 10^-12.

9.3.2 HX/QX Port Media Access Control (MAC)
The HX/QX port is connected to a standard XG port MAC. For further information see Section 9.2.2 "HyperG.Stack Port Media Access Control (MAC)" on page 141.

9.3.3 HX/QX Port Modes
9.3.3.1 Port 25 Mode
In the 98DX269, port 25 can operate as a HyperG.Stack port or as an HX/QX port.

Configuration
• To set port 25 to HyperG.Stack Port mode, set the <Port25Mode> field in the HXPorts Global Configuration (Table 237 p. 489).
• To set port 25 to HX Port mode, clear the <Port25Mode> field in the HXPorts Global Configuration (Table 237 p. 489).

Note
When Port 25 operates as a HyperG.Stack port, it uses different pins than the ones used when it operates as an HX/QX port. This can be configured only when the port SERDES and MAC are in RESET.

9.3.3.2 HX/QX Mode Selection
Each HX/QX port may be configured to work with one or two lanes. When the port works with one lane, the active lane can be selected to be Lane0 or Lane1.
Configuration
• To set the port to one lane mode, clear the <HXPortMode> field in the HXPort<n> Configuration Register0 (Table 223 p. 477).
• To set the port to two lanes mode, set the <HXPortMode> field in the HXPort<n> Configuration Register0 (Table 223 p. 477).
• To select the active lane when the port works with one lane, set the <QXLaneSel> field in the HXPort<n> Configuration Register0 (Table 223 p. 477).

Note
This can be configured only when the port SERDES and MAC are in RESET.

9.4 MAC Operation and Configuration
This section specifies the MAC operation and configuration.

The device implements a standard MAC, which filters out received frames shorter than 64 bytes or longer than the Maximum Receive Unit. It also filters out received packets with a bad CRC or those in which a receive error occurred during packet reception. The MAC also maintains the minimum IPG restriction on transmitted packets. In 10/100 Mbps half-duplex modes it implements the CSMA/CD protocol (collision detect and retransmit).

Each MAC has a set of LED indicators (Section 17. "LED Interface" on page 318).

Note
10/100 Mbps speed and half-duplex mode are supported only in the SGMII physical interface.

9.4.1 Port Enable
The device may be configured to drop packets received on all ports or to drop packets received on a specific port, independent of the port link status.

In addition to dropping received packets, disabling a port also halts the queueing of additional packets to this port's transmit queues. Packets destined to a disabled port are tail-dropped. Once a port is disabled, it rejects all received packets and transmits all packets that were pending in its transmit queues before it was disabled.

Configuration
To disable all of the ports on the device, clear the <DeviceEn> bit in the Global Control Register (Table 84 p. 377).

Tri-Speed Ports
In the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411):
• To disable Tri-Speed port n, clear the <PortEn> bit.
• To enable Tri-Speed port n, set the <PortEn> bit.

HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To disable HyperG.Stack port n, clear the <PortEn> bit.
• To enable HyperG.Stack port n, set the <PortEn> bit.

9.4.2 Link State
The port's link state may be set manually or it may be resolved by the MAC. In the Tri-Speed port MAC, the link state is resolved by the PCS layer. In the HyperG.Stack port MAC, the link state is resolved by the RS layer. The link state may also be forced to the link up state.

The Tri-Speed port MAC detects that the link is up (if not configured to be forced to the link up state) when:
• The PCS link state machine has achieved synchronization
AND
• (SGMII mode only) In-Band SGMII Auto-Negotiation is disabled and the Copper PHY transceiver link state is up, as reported by polling the PHY registers via the SMI. (Link Auto-Negotiation).
The HyperG.Stack port MAC detects that the link is up (if not configured to be forced to the link up state) when the RS state machine has not detected a remote fault and has not generated a local fault.

Configuration
Tri-Speed Ports
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To force a Tri-Speed port link state to up, set the <ForceLinkUp> bit.
• To force a Tri-Speed port link state to down, set the <ForceLinkDown> bit.

Configuration of HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To force a HyperG.Stack port link state to up, set the <ForceLink Pass> bit.
• To force a HyperG.Stack port link state to down, set the <ForceLink Down> bit.

Any change in the port link state sets a maskable interrupt.

Interrupts: Tri-Speed Ports
The <LinkStatus Change> field in the Port<n> Interrupt Cause Register (0<=n<24, CPUPort = 63) (Table 571 p. 802) is set upon any change of the Tri-Speed port link state.

Interrupts: HyperG.Stack and HX/QX Ports
The <LinkStatus Change> field in the Port<n> Interrupt Cause Register (24<=n<27) (Table 573 p. 803) is set upon any change of the HyperG.Stack port link state.

9.4.3 Port Status Register
The Port Status register is a read-only register reflecting the port's status and mode of operation:
Tri-Speed ports: Port<n> Status Register0 (0<=n<24, CPUPort = 63) (Table 149 p. 418).
HyperG.Stack and HX/QX ports: Port<n> Status Register (24<=n<27) (Table 162 p. 429).

9.4.4 Disable CRC Checking on Received Packets
The MAC can ignore the CRC check results on received packets. This mode is useful for proprietary purposes, where non-Ethernet frames are passed on an Ethernet medium or bus, or for debug purposes.

Configuration
HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To enable CRC checking on received packets, set the <RxCRC CheckEn> bit.
• To disable CRC checking on received packets, clear the <RxCRC CheckEn> bit.

9.4.5 Short Packets Padding
Upon reception, the port MAC drops packets shorter than 64B.

Upon transmission, there is a configuration option to pad packets to the minimum frame size of 64B. The need to pad packets upon transmission can occur in the following two cases:
• A packet shorter than 64B was received from the CPU and is to be transmitted via a network port. (The CPU port does not enforce the minimum packet length on reception.)
• A tagged packet of 64B is received on a port and is transmitted through a port as an untagged packet. In this case the four bytes of the VLAN tag are removed and the size of the packet to be transmitted is 60B.

Configuration
Tri-Speed Ports
In the Port<n> MAC Control Register2 (0<=n<24, CPUPort = 63) (Table 147 p. 414):
• To enable padding of short packets, set the <PaddingDis> bit.
• To disable padding of short packets, clear the <PaddingDis> bit.

HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To enable padding of short packets, clear the <PaddingDis> bit.
• To disable padding of short packets, set the <PaddingDis> bit.
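The two padding cases described in Section 9.4.5 can be checked with a small Python sketch. It models only the length arithmetic described in the text (a 4-byte VLAN tag removal and padding up to the 64B minimum). The function and parameter names are hypothetical and do not correspond to device registers.

```python
"""Sketch: transmitted packet length with optional short-packet padding."""

MIN_FRAME_BYTES = 64  # minimum frame size stated in Section 9.4.5
VLAN_TAG_BYTES = 4


def tx_length(length, tag_removed, padding_enabled):
    """Return the on-wire length of a packet queued for transmission.

    length          -- packet length before egress processing, in bytes
    tag_removed     -- True if a VLAN tag is stripped on egress
    padding_enabled -- True if the port pads short packets
    """
    if tag_removed:
        length -= VLAN_TAG_BYTES
    if padding_enabled and length < MIN_FRAME_BYTES:
        length = MIN_FRAME_BYTES
    return length


# A 64B tagged packet sent untagged shrinks to 60B unless padding is enabled.
print(tx_length(64, tag_removed=True, padding_enabled=False))
print(tx_length(64, tag_removed=True, padding_enabled=True))
# A short packet from the CPU (hypothetical 40B) is padded to the minimum.
print(tx_length(40, tag_removed=False, padding_enabled=True))
```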
Tri-Speed Ports (CRC checking on received packets, Section 9.4.4)
In the Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63) (Table 146 p. 412):
• To enable CRC checking on received packets, set the <RxCRC CheckEn> bit.
• To disable CRC checking on received packets, clear the <RxCRC CheckEn> bit.

9.4.6 Preamble Length
All packets received and transmitted via the cascading ports are DSA-tagged. The DSA tag adds eight bytes to untagged packets and four bytes to tagged packets. To enable wire-speed connection across devices, four bytes of the packet's preamble may be removed.

The MAC has two preamble modes:
Standard: As defined in the IEEE 802.3 standard. Length of the packet preamble is 8B, including Start Symbol and SFD.
Short: (Relevant only for devices with HyperG.Stack ports) For cascading ports. Length of the packet preamble is 4B, including Start Symbol and SFD.

Configuration
Tri-Speed Ports
To enable short preamble transmission, set the <Short PreambleEn> bit in the Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63) (Table 146 p. 412).

HyperG.Stack Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To enable short preamble transmission, clear the <Preamble LengthTx> bit.
• To enable reception of packets with short preamble, clear the <Preamble LengthRx> bit.

9.4.7 Maximum Receive Unit (MRU)
Although a standard Ethernet frame has a maximum length of 1518 bytes (or 1522 bytes for 802.1Q tagged packets), the MAC and the device can support frame lengths up to 10240 bytes.

Configuration
Tri-Speed Ports
To configure the MRU per port, set the <FrameSizeLimit> field in the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411) accordingly.

HyperG.Stack and HX/QX Ports
To configure the MRU per port, set the <FrameSizeLimit> field in the Port<n> MAC Control Register1 (24<=n<27) (Table 160 p. 428) accordingly.

9.4.7.1 Filtering of Untagged Packets Larger than 1518 Bytes
To accommodate 802.1Q tagged packets, the port MAC MRU should be configured to 1522 bytes. Because the MAC does not parse the packet's content and is unaware of the packet's VLAN tag format (tagged or untagged), the MAC does not discard untagged packets larger than the standard 1518 bytes. Filtering of untagged packets larger than 1518 bytes is performed at the bridge engine and may be enabled or disabled per port.

Configuration
In the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567):
• To enable filtering of untagged packets larger than 1518 bytes, set the <OverSize Untagged PacketsFilterEn> bit.
• To disable filtering of untagged packets larger than 1518 bytes, clear the <OverSize Untagged PacketsFilterEn> bit.

Note
As this filtering is not done by the MAC, untagged packets larger than 1518 bytes are counted as "good" packets received.
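The split between the MAC length check and the bridge-engine filter can be sketched as two independent stages. The Python below is a simplified model of the behaviour described in Sections 9.4 and 9.4.7.1, not device code: the function names are hypothetical, and only the length-based checks are modelled (CRC and receive errors are ignored).

```python
"""Sketch: MAC MRU check followed by the bridge oversize-untagged filter."""

MIN_FRAME_BYTES = 64
MAX_UNTAGGED_BYTES = 1518


def mac_accepts(length, mru):
    """The MAC only compares the frame length to 64B and the MRU."""
    return MIN_FRAME_BYTES <= length <= mru


def bridge_accepts(length, tagged, oversize_filter_en):
    """The bridge engine knows the tag state and can drop oversize untagged."""
    if oversize_filter_en and not tagged and length > MAX_UNTAGGED_BYTES:
        return False
    return True


def forwarded(length, tagged, mru=1522, oversize_filter_en=True):
    """A packet is forwarded only if both stages accept it."""
    return mac_accepts(length, mru) and bridge_accepts(
        length, tagged, oversize_filter_en)


print(forwarded(1522, tagged=True))    # tagged maximum-size packet
print(forwarded(1522, tagged=False))   # passes the MAC, dropped by the bridge
print(forwarded(1523, tagged=True))    # exceeds the MRU, dropped by the MAC
```

In this model, the second case is still accepted by `mac_accepts`, which matches the note that such packets are counted as "good" packets by the MAC.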
9.4.7.2 MRU Setting for Cascading Ports
A DSA tag is added to all packets transmitted via the cascading ports. If the packet was originally VLAN-tagged, four bytes are added to the packet; if the packet was originally untagged, eight bytes are added to the packet.

The MRU of cascading ports must be set to the ports' MRU plus the additional four or eight bytes of the DSA tag.

9.4.8 802.3x Flow Control
The MAC supports reception and transmission of IEEE 802.3x Flow Control packets.

Note
Flow Control support is applicable only when the port is in full-duplex mode (Section 9.4.12 "Tri-Speed Port: Duplex Mode Setting" on page 153).

9.4.8.1 802.3x Flow Control Packets Reception
Flow Control Packets Termination
Regardless of the port's Flow Control support mode (Flow Control enabled or disabled), the port MAC terminates all 802.3x Flow Control packets and does not forward them to the ingress pipe for further processing.

A packet received by the port is considered a Flow Control packet if its MAC DA is equal to one of the following:
• 01-80-C2-00-00-01
• The port's configured MAC Address (Section 9.4.8.3 "Port MAC Address").

Flow Control Packets Recognition
A packet is considered a valid Flow Control packet (i.e., it may be used to halt the port's packet transmission if it is an Xoff packet, or to resume the port's packet transmission if it is an Xon packet) if all of the following are true:
• Packet's MAC DA is 01-80-C2-00-00-01 or the port's configured MAC Address (Section 9.4.8.3 "Port MAC Address").
• Packet's Length/EtherType field is 88-08.
• Packet's OpCode field is 00-01.

When Flow Control is enabled (Section 9.4.8.4 "Flow Control Support Setting") and a Flow Control packet with a non-zero timer is received, the device halts, or continues to halt, the transmit packet stream.

When Flow Control is enabled (Section 9.4.8.4 "Flow Control Support Setting") and a Flow Control packet with a zero timer is received, the device resumes, or continues, packet transmission.

9.4.8.2 802.3x Flow Control Packet Transmission
In the opposite direction, the device sends Flow Control packets according to the number of packet buffers allocated for packets received on this port and according to its thresholds for X-on and X-off (see Section 15.2 "Ingress Bandwidth Management" on page 303).

In a transmitted Flow Control packet, the pause time parameter is set as follows:
• If this is an Xon packet, the pause time parameter is set to 00-00.
• If this is an Xoff packet, the pause time parameter is set to FF-FF.
The rest of the packet's payload is padded with zeros.

X-off Packet Transmission
When Flow Control is enabled (Section 9.4.8.4 "Flow Control Support Setting") and the number of buffers allocated for packets received on this port exceeds the X-off threshold, a pause frame is sent with pause_time = FF-FF. In addition, a pause frame with pause_time = FF-FF is sent periodically, as long as the number of buffers allocated for packets received on this port is above this port's X-on threshold.

For the Tri-Speed ports, the period between the transmission of two consecutive Flow Control packets is 30 ms, regardless of the port's speed. For the HyperG.Stack ports, the period between transmission of two consecutive Flow Control packets is 3 ms.
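The recognition rules of Section 9.4.8.1 and the packet fields used for transmission in Section 9.4.8.2 can be sketched together in Python. This is an illustrative model only: the function names and the example port MAC address are hypothetical, and the builder returns 60 bytes on the assumption that the stated 64-byte packet length includes a 4-byte FCS appended by the MAC.

```python
"""Sketch: build and recognize 802.3x Flow Control (pause) packets."""
import struct

PAUSE_DA = bytes.fromhex("0180c2000001")  # 01-80-C2-00-00-01
ETHERTYPE_MAC_CONTROL = 0x8808
OPCODE_PAUSE = 0x0001


def build_pause_packet(port_mac, xoff):
    """Return the packet without FCS (assumed to be added by the MAC)."""
    pause_time = 0xFFFF if xoff else 0x0000
    header = PAUSE_DA + port_mac + struct.pack(
        "!HHH", ETHERTYPE_MAC_CONTROL, OPCODE_PAUSE, pause_time)
    return header.ljust(60, b"\x00")  # rest of the payload is zero padding


def classify(packet, port_mac):
    """Return 'halt', 'resume', or None if not a valid Flow Control packet."""
    if len(packet) < 18:
        return None
    ethertype, opcode, timer = struct.unpack("!HHH", packet[12:18])
    if packet[0:6] not in (PAUSE_DA, port_mac):
        return None
    if ethertype != ETHERTYPE_MAC_CONTROL or opcode != OPCODE_PAUSE:
        return None
    return "halt" if timer != 0 else "resume"


port_mac = bytes.fromhex("00aabbccdd19")  # hypothetical port MAC address
print(classify(build_pause_packet(port_mac, xoff=True), port_mac))
print(classify(build_pause_packet(port_mac, xoff=False), port_mac))
```

In this sketch `classify` reflects only the packet checks. Whether the port actually halts or resumes also depends on Flow Control being enabled for the port.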
Notes
• In the Tri-Speed ports, periodic transmission of X-off Flow Control packets cannot be disabled.
• In the HyperG.Stack ports, periodic transmission of X-off Flow Control packets may be disabled.

Configuration
HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register2 (24<=n<27) (Table 161 p. 429):
• To enable periodic transmission of X-off Flow Control packets, set the <PeriodicXoffEn> bit.
• To disable periodic transmission of X-off Flow Control packets, clear the <PeriodicXoffEn> bit.

X-on Packet Transmission
When Flow Control is enabled (Section 9.4.8.4 "Flow Control Support Setting") and the number of buffers allocated for packets received on this port exceeds the X-off threshold, a pause packet is sent with pause_time = FF-FF. After the number of buffers allocated for packets received on this port becomes lower than the X-on threshold, a pause packet is sent with pause_time = 00-00. In addition, a pause packet with pause_time = 00-00 may be sent periodically, as long as the number of buffers allocated for packets received on this port is below this port's X-on threshold.

The Flow Control packet sent (Xon or Xoff) is 64 bytes with the following fields:
• The packet's MAC DA is 01-80-C2-00-00-01.
• The packet's MAC SA is equal to the port's configured MAC Address (Section 9.4.8.3 "Port MAC Address").
• The packet's Length/EtherType field is 88-08.
• The packet's OpCode field is 00-01.
• The packet's pause time parameter, set according to the packet type (Xon or Xoff).

Configuration
HyperG.Stack and HX/QX Ports
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To enable periodic transmission of X-on Flow Control packets, set the <PeriodicXonEn> bit.
• To disable periodic transmission of X-on Flow Control packets, clear the <PeriodicXonEn> bit.

9.4.8.3 Port MAC Address
For transmission and reception of Flow Control packets, each port has its own unique MAC Address. This MAC Address is used as the MAC SA for Flow Control packets transmitted by the port and may be used as the MAC DA for Flow Control packets received by this port.

The upper 40 bits of this MAC Address are the same for all ports in the device and the lower eight bits are unique per port.

Configuration
The Source Address Middle Register (Table 254 p. 512) and Source Address High Register (Table 255 p. 512) are used to configure the upper 40 bits of the ports' MAC Address.

Tri-Speed Ports
To configure the lower 8 bits of the port's MAC Address, set the <SALow[7:0]> field in the Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63) (Table 146 p. 412) accordingly.

HyperG.Stack and HX/QX Ports
To configure the lower 8 bits of the port's MAC Address, set the <SALow[7:0]> field in the Port<n> MAC Control Register2 (24<=n<27) (Table 161 p. 429) accordingly.

9.4.8.4 Flow Control Support Setting
Flow Control can be enabled manually. In addition, for the Tri-Speed ports, Flow Control may be resolved automatically. When Flow Control Auto-Negotiation is enabled, Flow Control support for the Tri-Speed ports is resolved by means of Auto-Negotiation.
When Flow Control Auto-Negotiation is disabled, Flow Control support for the Tri-Speed ports is set via configuration.

Configuration
Tri-Speed Ports
Enabling Flow Control Auto-Negotiation
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To enable Auto-Negotiation for Flow Control, set the <AnFcEn> bit.
• To advertise Flow Control support, set the <PauseAdv> bit.
• To advertise no Flow Control support, clear the <PauseAdv> bit.

Note
The devices support symmetric Flow Control only.

Manual Setting of Flow Control Disable
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To disable Auto-Negotiation for Flow Control, clear the <AnFcEn> bit.
• Clear the <SetFcEn> bit.

Manual Setting of Flow Control Enable
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnFcEn> bit.
• Set the <SetFcEn> bit.

HyperG.Stack and HX/QX Ports
Manual Setting of Flow Control Disable
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To disable transmission of Flow Control packets, clear the <TxFcEn> bit.
• To disable reception of Flow Control packets, clear the <RxFcEn> bit.

Manual Setting of Flow Control Enable
In the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427):
• To enable transmission of Flow Control packets, set the <TxFcEn> bit.
• To enable reception of Flow Control packets, set the <RxFcEn> bit.

Note
For Tri-Speed ports, when the port operates in half-duplex mode, manual Flow Control must be disabled.

Tri-Speed Ports (periodic X-on transmission, Section 9.4.8.2)
In the Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63) (Table 146 p. 412):
• To enable periodic transmission of X-on Flow Control packets, set the <PeriodicXonEn> bit.
• To disable periodic transmission of X-on Flow Control packets, clear the <PeriodicXonEn> bit.

9.4.9 Tri-Speed Port: Back Pressure in Half-Duplex Mode
This section is relevant for Tri-Speed ports only.

Back pressure is a proprietary, non-standard feature in the Ethernet MAC for 10 Mbps and 100 Mbps half-duplex operation only. It is therefore applicable to the SGMII interface only.

When back pressure is enabled and the port's buffer fill level is higher than the X-off threshold, the device transmits a long JAM pattern. This pattern mimics a collision condition on the wire and prevents other devices on the network from transmitting. The JAM pattern is a series of 0101.... The IPG between two consecutive JAM patterns (or between the last transmit and the first JAM) is 64BT.

When a port in back pressure mode has a packet pending transmission, it halts transmission of the JAM pattern for 64BT, and then transmits the packet. If the port remains in back pressure mode, it resumes transmitting the JAM pattern 64BT after transmission of the packet is completed.

Configuration
In the [...] Configuration Register (0<=n<24, CPUPort = 63) (Table 151 p. 421):
• To enable back pressure in half duplex, clear the <Back PressureEn> bit.

9.4.10 Tri-Speed Port: Network Interface Mode
This section is relevant for Tri-Speed ports only.

The Tri-Speed port may operate in one of the following two modes:
SGMII mode: Enables connection to external copper transceivers such as the Alaska® device. It also facilitates 10/100 Mbps speeds in half- and full-duplex modes. In this mode, Auto-Negotiation may be performed out-of-band via the device's Master SMI interface, or in-band.
1000BASE-X mode: The port is connected to a 1000BASE-X fiber transceiver. In this mode, the port operates at 1000 Mbps, full-duplex only, and supports in-band Auto-Negotiation for link and for Flow Control.

Configuration
Configuration of all of the device's ports to SGMII mode or to 1000BASE-X mode is performed at reset (see the 98DX250/260/270 Advanced Hardware Specification, Document Control Number MV-S102110-00). This configuration may be overridden per port.

SGMII Configuration
• Clear the <PortType> bit in the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411).
• For Out-of-Band Auto-Negotiation, clear the <InBandAnEn> bit in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415).
• For In-Band Auto-Negotiation, set the <InBandAnEn> bit in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415).

1000BASE-X Configuration
• Set the <PortType> bit in the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411).
• Set the <InBandAnEn> bit in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415).

9.4.11 Tri-Speed Port: Speed Setting
This section is relevant for Tri-Speed ports only.

9.4.11.1 Speed Setting in 1000BASE-X Mode
In 1000BASE-X mode, the MAC speed is set to 1000 Mbps only.
When all ports are operating in 1000BASE-X mode, Speed Auto-Negotiation must be disabled at reset (see the 98DX250/260/270 Advanced Hardware Specification). Configuration In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415): • Clear the <AnSpeedEn> bit. • Set the <SetGMIISpeed> bit. MV-S102110-02 Rev. E Page 152 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 • AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Configuration To enable back pressure in half duplex, set the <Back PressureEn> bit in the Port<n> Serial Parameters Con• M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Network Interfaces and Media Access Controllers (MACs) 9.4.11.2 Speed Setting in SGMII Mode AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 In SGMII, the MAC may operate at 10 Mbps, 100 Mbps, or 1000 Mbps. MAC speed can be resolved by Speed Auto-Negotiation or it can be set manually. Enabling Speed Auto-Negotiation for all ports is performed at reset (see the 98DX250/260/270 Advanced Hardware Specification). When Speed Auto-Negotiation is enabled, the port’s speed is resolved via Speed Auto-Negotiation. When Speed Auto-Negotiation is disabled, the port’s speed is set manually. Configuration Enabling Speed Auto-Negotiation Set the <AnSpeedEn> bit in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415). Manual Setting the Port’s Speed to 10 Mbps In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415): • Clear the<AnSpeedEn> bit. • Set the <SetGMIISpeed> bit. 
• Clear the <SetMIISpeed> bit. M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Manual Setting the Port’s Speed to 100 Mbps In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415): • Clear the <AnSpeedEn> bit. • Clear the <SetGMIISpeed> bit. • Set the <SetMIISpeed> bit. Manual Setting the Port’s Speed to 1000 Mbps In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415): • Clear the <AnSpeedEn> bit. • Set the <SetGMIISpeed> bit. 9.4.12 Tri-Speed Port: Duplex Mode Setting 9.4.12.1 Duplex Mode Setting in 1000BASE-X Mode In 1000BASE-X mode, the MAC operates in full-duplex mode only. Configuration in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415): • Clear the <AnDuplexEn> bit. • Set the <SetFullDuplex> bit. 9.4.12.2 Duplex Mode Setting in SGMII Mode In SGMII, the MAC can operate in half-duplex or in full-duplex mode. The MAC can operate in full-duplex mode at all speeds—10 Mbps, 100 Mbps and 1000 Mbps. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 153 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 This configuration can be overridden by the host CPU on a per port basis. M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED MAC Operation and Configuration AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Note Half-duplex mode is supported only when the MAC operates at speeds of 10 Mbps or 100 Mbps. At 1000 Mbps ONLY full-duplex mode is supported. When duplex Auto-Negotiation is enabled, the port’s duplex mode is resolved via Speed Auto-Negotiation. 
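The manual speed and duplex settings described so far can be collected into a small lookup, which also makes the full-duplex-only restriction at 1000 Mbps explicit. The following is a minimal sketch, not a device API: the function name and the dictionary-of-booleans representation are hypothetical, the 10 Mbps entry assumes that both speed-select bits are cleared (by analogy with the 100 Mbps case), and half duplex is taken to mean that <SetFullDuplex> is cleared.

```python
from typing import Dict

# Bit names follow the Port<n> Auto-Negotiation Configuration Register fields.
# True means "set the bit", False means "clear the bit".
# Assumption: 10 Mbps clears both speed-select bits.
MANUAL_SPEED_BITS: Dict[int, Dict[str, bool]] = {
    10: {"AnSpeedEn": False, "SetGMIISpeed": False, "SetMIISpeed": False},
    100: {"AnSpeedEn": False, "SetGMIISpeed": False, "SetMIISpeed": True},
    1000: {"AnSpeedEn": False, "SetGMIISpeed": True},
}


def manual_port_settings(speed_mbps: int, full_duplex: bool) -> Dict[str, bool]:
    """Return bit settings for a manually configured SGMII port (hypothetical helper)."""
    if speed_mbps not in MANUAL_SPEED_BITS:
        raise ValueError(f"unsupported speed: {speed_mbps} Mbps")
    if speed_mbps == 1000 and not full_duplex:
        # Half duplex is supported only at 10 Mbps or 100 Mbps.
        raise ValueError("half duplex is supported only at 10 or 100 Mbps")
    bits = dict(MANUAL_SPEED_BITS[speed_mbps])
    bits["AnDuplexEn"] = False          # duplex Auto-Negotiation disabled
    bits["SetFullDuplex"] = full_duplex  # cleared for half duplex (assumption)
    return bits
```

A caller would translate the returned mapping into a read-modify-write of the register; that access path is board and driver specific and is deliberately left out.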
The MAC’s duplex mode can be resolved via duplex Auto-Negotiation or it can be set manually.

Configuration
Enabling Duplex Auto-Negotiation
Set the <AnDuplexEn> bit in the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415).
Manual Setting of the Port to Full Duplex Mode
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnDuplexEn> bit.
• Set the <SetFullDuplex> bit.
Manual Setting of the Port to Half Duplex Mode (Applicable for speeds of 10 Mbps or 100 Mbps only)
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• Clear the <AnDuplexEn> bit.
• Clear the <SetFullDuplex> bit.

9.4.13 Tri-Speed Port: Excessive Collisions
Relevant for Tri-Speed ports operating at a speed of 10 Mbps or 100 Mbps in half-duplex mode.
According to the IEEE 802.3 standard, the maximum number of transmission attempts for a collided packet is 16. Accordingly, the port attempts to gain media access and transmit the packet. If a collision occurs, the port tries to transmit the packet again. If 16 consecutive collisions occur, the port drops the packet.
However, for some proprietary applications in which packets must not be lost, this behavior is undesirable. A configuration option allows the port to keep attempting to transmit the packet until it succeeds, regardless of the number of collisions.

Configuration
In the Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63) (Table 146 p. 412):
• To disable the Collisions Attempt Limit, set the <Disable Excessive CollisionDrop> bit.
• To enable the Collisions Attempt Limit, clear the <Disable Excessive CollisionDrop> bit.

9.4.14 HX/QX Ports PCS Configuration
This section describes the HX/QX ports PCS configurations.

9.4.14.1 Loopback
The HX/QX Ports PCS support four loopback modes:
MAC Loopback
All packets transmitted from the MAC to the HX/QX port are sent directly back to the MAC. In this mode the packets do not pass the PCS.
PCS Loopback
Each packet sent from the MAC to the HX/QX port passes the PCS and is then sent back to the PCS layer. In this mode the PCS transmit and receive signals are connected internally.
Analog Loopback
Each packet sent from the MAC to the HX/QX port passes the PCS and SERDES layers and is sent back to the Rx lanes. In this mode the SERDES transmit lanes are connected internally to the SERDES Rx lanes.
Repeater Loopback
Each packet received on the HX/QX port Rx lanes is sent back to the Tx lanes. The packet passes the SERDES and the PCS layers, but is not sent to the MAC.

Figure 31: MAC Loopback Packet Walkthrough
Figure 32: PCS Loopback Packet Walkthrough
Figure 33: Analog Loopback Packet Walkthrough
Figure 34: Repeater Loopback Packet Walkthrough
[Each figure shows the packet path through the 802.3 MAC, the Generic PCS for 1/2 Lanes, and the 3.125 Gbps SERDES (Pn_HX_RX_P/N[1:0] and Pn_HX_TX_P/N[1:0]).]

Configuration
• To set the HX/QX port to MAC loopback mode, set the <MACLoopBackEn> field in the HXPort<n> Configuration Register1 (Table 224 p. 478).
• To set the HX/QX port to PCS loopback mode, set the <PCSLoopBackEn> field in the HXPort<n> Configuration Register1 (Table 224 p. 478).
• To set the HX/QX port to Analog loopback mode, set the <LOOPBACK> field in the Analog All Lane Control Register1 (Table 242 p. 495).
• To set the HX/QX port to Repeater loopback mode, set the <RepeaterModeEn> field in the HXPort<n> Configuration Register1 (Table 224 p. 478) and clear the <MACResetn> field in the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 428).

9.4.14.2 Pseudo Random Bit Generator
The HX/QX PCS layer incorporates a built-in Pseudo Random Bit Stream (PRBS) generator and checker using the 2^7 - 1 or the 2^23 - 1 industry standard PRBS polynomials.
The generator and checker can be enabled independently for each lane.
When the checker is enabled, it tries to lock onto the incoming bit stream. Once this is achieved, the PRBS checker starts counting the number of bit errors.

Configuration
Enabling PRBS Generator
• Set the <TestModeEn> bit in the HXPort<n> Test Configuration and Status (Table 233 p. 486).
• Set the <TestMode> field in the HXPort<n> Test Configuration and Status (Table 233 p. 487).
• Set the <TestGenEn lane1> and/or <TestGenEn lane0> field in the HXPort<n> Test Configuration and Status (Table 233 p. 487).
Enabling PRBS Checker
• Set the <TestModeEn> bit in the HXPort<n> Test Configuration and Status (Table 233 p. 486).
• Set the <TestMode> field in the HXPort<n> Test Configuration and Status (Table 233 p. 487).
• Set the <PRBSCheckEn lane1> and/or <PRBSCheckEn lane0> field in the HXPort<n> Test Configuration and Status (Table 233 p. 487).

Status
• The <PRBSCheckLocked lane1> and <PRBSCheckLocked lane0> fields in the HXPort<n> Test Configuration and Status (Table 233 p. 487) indicate that the PRBS checker is locked.
• The <PrbsErrorCnt> field in the HXPort<n> Lane0 PRBS Error Counter (Table 234 p. 488) and in the HXPort<n> Lane1 PRBS Error Counter (Table 235 p. 488) counts the number of errors since the last Host CPU read.

9.4.14.3 SERDES Reset
The HX/QX ports SERDES may be reset.

Configuration
• To reset the SERDES of lane0 of port 25, set the <RESET lane0> field in the HXPort25 SERDES Power and Reset Control (Table 243 p. 496).
• To reset the SERDES of lane1 of port 25, set the <RESET lane1> field in the HXPort25 SERDES Power and Reset Control (Table 243 p. 496).
• To reset the SERDES of lane0 of port 26, set the <RESET lane0> field in the HXPort26 SERDES Power and Reset Control (Table 248 p. 501).
• To reset the SERDES of lane1 of port 26, set the <RESET lane1> field in the HXPort26 SERDES Power and Reset Control (Table 248 p. 501).

9.4.14.4 PCS Reset
The HX/QX ports PCS may be reset.

Configuration
• To reset the PCS, set the <PCSReset> field in the HXPort<n> Configuration Register0 (Table 223 p. 476).

9.4.15 Tri-Speed Port 1.25 Gbps SERDES Configuration
This section outlines the Tri-Speed port SERDES configurations.

9.4.15.1 Power Up
At exit from reset and according to the reset configuration, the port SERDES may be powered up or powered down.
If a port is inactive, its SERDES may be powered down to save power.

Configuration
To power down the SERDES, clear the <PU_IVREF>, <PU_RX>, <PU_PLL>, and <PU_TX> bits in the Port<n> SERDES Configuration Register0 (0<=n<24) (Table 152 p. 422).

9.4.15.2 Reset
The SERDES may be reset.

Configuration
To reset the SERDES, set the <SerdesReset> bit in the Port<n> SERDES Configuration Register0 (0<=n<24) (Table 152 p. 422).

9.4.15.3 Loopback
To perform board-level checks, the SERDES differential pairs may be looped back so that the transmit pair is connected directly to the receive pair.

Configuration
To loop back the SERDES, set the <LOOPBACK> bit in the Port<n> SERDES Configuration Register0 (0<=n<24) (Table 152 p. 422).

9.4.15.4 Pseudo Random Bit Generator
The SERDES interface incorporates a built-in Pseudo Random Bit Stream (PRBS) generator and checker using the 2^7 - 1 industry standard PRBS polynomial.
The generator and checker can be enabled independently.
When the checker is enabled, it seeks to lock onto the incoming bit stream. Once this is achieved, the PRBS checker starts counting the number of bit errors. In addition, once an error is detected, a maskable interrupt is set.

Configuration
Enabling PRBS Generator
In the Port<n> Status Register1 (0<=n<24, CPUPort = 63) (Table 150 p. 420):
• Set the <SelectData ToTransmit> bit.
• Set the <PRBSGenEn> bit.
Enabling PRBS Checker
Set the <PRBSCheckEn> bit in the Port<n> Status Register1 (0<=n<24, CPUPort = 63) (Table 150 p. 420).

Interrupts
The <PRBSError OnPort> bit in the Port<n> Interrupt Cause Register (0<=n<24, CPUPort = 63) (Table 571 p. 802) is asserted if a PRBS error is detected.
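To illustrate the lock-then-count behavior of a PRBS checker, the following software model may help. It is a sketch under stated assumptions, not a description of the device logic: the text names only the 2^7 - 1 sequence length, so the commonly used x^7 + x^6 + 1 polynomial is assumed, and the function names and the `lock_after` threshold are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def prbs7_bits(seed: int = 0x7F) -> Iterator[int]:
    """Yield PRBS7 bits forever (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def check_prbs7(rx_bits: Iterable[int], lock_after: int = 16) -> Tuple[bool, int]:
    """Self-synchronizing checker: lock on the stream, then count mismatches.

    Each received bit is compared with the XOR of the bits received six and
    seven positions earlier. One corrupted bit may therefore be counted more
    than once, because it also feeds later predictions.
    """
    history: List[int] = []  # the last seven received bits, oldest first
    locked = False
    run = 0
    errors = 0
    for bit in rx_bits:
        if len(history) == 7:
            expected = history[-6] ^ history[-7]
            if locked:
                if bit != expected:
                    errors += 1
            else:
                run = run + 1 if bit == expected else 0
                locked = run >= lock_after
        history.append(bit)
        if len(history) > 7:
            history.pop(0)
    return locked, errors


# Usage: a clean stream locks with no errors; a corrupted bit is detected.
stream = list(islice(prbs7_bits(), 300))
clean_result = check_prbs7(stream)
corrupted = list(stream)
corrupted[100] ^= 1
corrupted_result = check_prbs7(corrupted)
```

The generator repeats every 127 bits, which is the 2^7 - 1 length quoted above.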

9.4.16 Transmit Voltage Swing Configuration
The Transmit Voltage Swing of the SERDES may be configured using the <OutAmp> field in the Port<n> SERDES Configuration Register0 (0<=n<24) (Table 152 p. 422) and the <MEN> and <PEN> parameters in the Port<n> SERDES Configuration Register1 (0<=n<24) (Table 153 p. 423).
For further information, contact your Marvell representative.

9.4.17 HyperG.Stack Port: Transmit Inter-Packet Gap (IPG)
The HyperG.Stack port supports three IPG modes:
LAN Mode
Maintains a Deficit Idle Count (DIC), which is used to decide whether to add or delete idle characters and maintain an average IPG of 96BT.
Fixed Mode
Adds up to three idle symbols to a Base IPG that ranges from 64BT to 120BT in steps of 32BT, to align the start symbol to Lane 0.
WAN Mode
IPG is stretched to adapt to OC-192 speed.

For Fixed Mode, the Base IPG may be one of the following:
96BT
This configuration results in an IPG of 96BT to 120BT.
64BT
This configuration results in an IPG of 64BT to 88BT.

Configuration
• To configure the IPG Mode, set the <TxIPGIMode> field in Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427) accordingly.
• To configure the IPG base for Fixed IPG mode, set the <FixedIPGBase> bit in Port<n> MAC Control Register2 (24<=n<27) (Table 161 p. 429).

9.4.18 HyperG.Stack Port: Speed Setting
The HyperG.Stack port may operate at one of two speeds:
Standard 10 Gbps
Each of the four SERDES lanes runs at 3.125 Gbps.
12 Gbps
Used to connect the devices in this mode. Each of the four SERDES lanes runs at 3.75 Gbps.

Configuration
To configure the HyperG.Stack port speed, set the <SpeedSelect> field in Port<n> XAUI PHY and HX/QX PCS Configuration Register0 (24<=n<27) (Table 165 p. 435) accordingly.
Caution
To perform this configuration, the <XAUIPHY Resetn> bit must be cleared and then set again.

Status (Tri-Speed port SERDES PRBS checker, Section 9.4.15.4)
• The <PrbsCheckRdy> bit in the Port<n> PRBS Status Register (0<=n<24) (Table 156 p. 425) indicates that the PRBS checker is ready.
• The <PrbsCheck Locked> bit in the Port<n> PRBS Status Register (0<=n<24) (Table 156 p. 425) indicates that the PRBS checker is locked on the incoming stream.
• The <PrbsBitErrCnt> field in the Port<n> PRBS Error Counter (0<=n<24) (Table 157 p. 425) counts the number of errors since the Host CPU last read this register.

9.4.19 HyperG.Stack Port: Lane Swap
When connecting two devices on the same board, the HyperG.Stack port SERDES lanes may be swapped. By swapping the SERDES lanes, routing between the two devices may be performed using a single PCB layer.
The lanes are swapped as follows:
• Lane0 PCS Tx Data is connected to SERDES Lane 3.
• Lane1 PCS Tx Data is connected to SERDES Lane 2.
• Lane2 PCS Tx Data is connected to SERDES Lane 1.
• Lane3 PCS Tx Data is connected to SERDES Lane 0.
When Lane Swap is enabled, the SERDES lanes routing on the board from one device to the other may be swapped so that:
• P<n>_TX/RX_N/P[0] from the first device is connected to P<n>_RX/TX_N/P[3] of the second device.
• P<n>_TX/RX_N/P[1] from the first device is connected to P<n>_RX/TX_N/P[2] of the second device.
• P<n>_TX/RX_N/P[2] from the first device is connected to P<n>_RX/TX_N/P[1] of the second device.
• P<n>_TX/RX_N/P[3] from the first device is connected to P<n>_RX/TX_N/P[0] of the second device.
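The four-lane swap listed above is a simple reversal of the lane order, which can be expressed in one line. The sketch below is illustrative only; the function name and its boolean argument are hypothetical and merely stand in for the state of the swap enable bit.

```python
def hyperg_serdes_lane(pcs_lane: int, swap_enabled: bool) -> int:
    """Return the SERDES lane that carries the given PCS lane (hypothetical helper)."""
    if not 0 <= pcs_lane <= 3:
        raise ValueError("the HyperG.Stack port has lanes 0 through 3")
    # With swap enabled the lane order is reversed: 0<->3 and 1<->2.
    return 3 - pcs_lane if swap_enabled else pcs_lane


# Usage: the mapping for all four lanes with swap enabled.
swapped = [hyperg_serdes_lane(lane, True) for lane in range(4)]
```

Because the reversal is its own inverse, applying the same mapping on both devices restores the original lane order end to end.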

Configuration
To configure HyperG.Stack port swap, set the <SwapEn> bit in Port<n> XAUI PHY and HX/QX PCS Configuration Register0 (24<=n<27) (Table 165 p. 435).
Caution
To perform this configuration, the <XAUIPHY Resetn> bit must be cleared and then set again.

9.4.20 HX Port: Lane Swap
It is possible to swap the HX port lanes. Swap configuration can be set for the transmit or the receive lanes. Swapping the lanes may ease the board routing.
When performing swap (TX or RX):
• Lane0 PCS Tx Data is connected to SERDES Lane 1.
• Lane1 PCS Tx Data is connected to SERDES Lane 0.
When Lane Swap is enabled, the SERDES lanes routing on the board from one device to the other may be swapped so that:
• P<n>_TX/RX_N/P[0] from the first device is connected to P<n>_RX/TX_N/P[1] of the second device.
• P<n>_TX/RX_N/P[1] from the first device is connected to P<n>_RX/TX_N/P[0] of the second device.

Configuration
To configure HX port swap mode, set the <Lane0RxSel>, <Lane1RxSel>, <Lane0TxSel>, and <Lane1TxSel> fields in the HXPort<n> Tx and Rx Swap Control (Table 225 p. 479).

9.5 Tri-Speed Ports Auto-Negotiation
The Tri-Speed ports MAC supports Auto-Negotiation functionality with the link partner, to establish link, speed, duplex mode, and Flow Control support.
The MAC supports the following Auto-Negotiation modes:
1000Base-X In-band Auto-Negotiation
Performed by the PCS layer to establish link and Flow Control support. In 1000Base-X mode, speed is 1000 Mbps and duplex mode is full.
SGMII In-band Auto-Negotiation
Performed by the PCS layer to establish link, speed, and duplex mode.
Out-of-band Auto-Negotiation
Performed by the PHY polling unit using the device’s Master SMI Interface to access the Copper PHY registers, to establish link, speed, duplex mode, and Flow Control support.

9.5.1 In-Band Auto-Negotiation
In-Band Auto-Negotiation is performed when the port is in 1000Base-X mode or in SGMII mode.

9.5.1.1 1000Base-X In-band Auto-Negotiation
Clause 37 of the IEEE 802.3 specification describes the 1000BASE-X Auto-Negotiation (AN) function, which allows a device (local device) to advertise the modes of operation it possesses to a device at the remote end of a link segment (link partner) and to detect corresponding operational modes that the link partner may be advertising.
The Auto-Negotiation function exchanges information between two devices that share a link segment and automatically configures both devices to take maximum advantage of their abilities. Auto-Negotiation is performed with /C/ and /I/ ordered_sets, such that no packet or upper layer protocol overhead is added to the network device. If a Start-of-Packet or End-of-Packet symbol is received during the Auto-Negotiation period, it is considered RUDI and treated as an invalid symbol.
The device’s 1000BASE-X Auto-Negotiation block supports Auto-Negotiation for the following:
Link state
When the link is not forced to link up or link down (Section 9.4.2 "Link State").
Flow Control
Only symmetric 802.3x Flow Control is supported when Flow Control Auto-Negotiation is enabled (Section 9.4.8 "802.3x Flow Control").
There is no support for Auto-Negotiation of speed, duplex, asymmetric Flow Control, or next-page.
Caution
In 1000BASE-X, the port must be set to work in full-duplex mode, at 1000 Mbps. Duplex Auto-Negotiation and Speed Auto-Negotiation must be disabled.

9.5.1.2 SGMII In-band Auto-Negotiation
As PCS layers are present in both MAC and PHY, the PCS layer may also be used to exchange further information.
SGMII In-band Auto-Negotiation for link is the same as for 1000Base-X In-Band Auto-Negotiation (Section 9.5.1.1).
In addition, in this mode the PHY performs Auto-Negotiation with its link partner. In this process both PHYs communicate their abilities. The decisions of the PHY regarding link, speed, and duplex mode are forwarded to the MAC in-band through a mechanism similar to that of 1000BASE-X, using the same set of Config_Reg registers, with different encoding. However, in this mode the control data flows only from the PHY to the MAC. The value of the transmitted Config_Reg is therefore constant and does not represent the abilities of the MAC.
The following are resolved in-band in SGMII mode:
Speed
When speed Auto-Negotiation is enabled (Section 9.4.11 "Tri-Speed Port: Speed Setting").
Duplex mode
When duplex mode Auto-Negotiation is enabled (Section 9.4.12 "Tri-Speed Port: Duplex Mode Setting").
There is no support for Auto-Negotiating Flow Control.
Caution
SGMII In-Band Auto-Negotiation may be used only if the Tri-Speed port is connected to the Marvell® Alaska® PHY.
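The three Auto-Negotiation modes differ in which parameters they resolve, and anything a mode does not resolve has to be configured manually. A minimal lookup sketch of that relationship follows; the mode keys, parameter names, and function name are hypothetical labels chosen for illustration, and link is listed as resolved on the understanding that it is not forced up or down.

```python
from typing import Dict, FrozenSet, List

ALL_PARAMETERS: FrozenSet[str] = frozenset(
    {"link", "speed", "duplex", "flow_control"}
)

# Parameters each Auto-Negotiation mode can establish, per the mode descriptions.
AN_RESOLVES: Dict[str, FrozenSet[str]] = {
    "1000base-x-in-band": frozenset({"link", "flow_control"}),
    "sgmii-in-band": frozenset({"link", "speed", "duplex"}),
    "sgmii-out-of-band": ALL_PARAMETERS,
}


def manually_configured(mode: str) -> List[str]:
    """Return the parameters the host must set itself in the given mode."""
    return sorted(ALL_PARAMETERS - AN_RESOLVES[mode])
```

For example, `manually_configured("sgmii-in-band")` yields only Flow Control, matching the statement that SGMII In-band Auto-Negotiation does not negotiate it.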
The device’s SGMII In-band Auto-Negotiation block also supports Auto-Negotiation of the link state, when the link is not forced to link up or link down (Section 9.4.2 "Link State").

9.5.1.3 In-Band Auto-Negotiation Completion and Restart
Once In-Band Auto-Negotiation is completed, an interrupt is asserted to the host.
The host can restart In-Band Auto-Negotiation.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To enable 1000Base-X Auto-Negotiation, set the <InBandAnEn> bit.
• To restart 1000Base-X Auto-Negotiation, set the <InBand ReStartAn> bit.

Interrupts
The <AnCompleted OnPort> bit in the Port<n> Interrupt Cause Register (0<=n<24, CPUPort = 63) (Table 571 p. 802) is asserted when In-Band Auto-Negotiation is done.

9.5.1.4 1000Base-X In-Band Auto-Negotiation Bypass Mode
This section is relevant only when the port is in 1000Base-X mode. It is not relevant for SGMII In-Band Auto-Negotiation.
The IEEE standard Auto-Negotiation state machine, per IEEE 802.3 Clause 37, requires that both sides support Auto-Negotiation before the link can be up. If one side implements the Auto-Negotiation function and the other does not, two-way communication is not established unless the user manually disables Auto-Negotiation and configures both sides to work in the same operational mode.
When Bypass mode is enabled, the PCS Auto-Negotiation state machine changes from the one specified in Clause 37 in the following manner:
When entering the state “Ability_Detect”, a timer is started to count down with an initial value of 20 times the link timer (the link timer is ~10 msec, so this time is ~200 msec). If the timer expires while the receive synchronization machine has remained in sync throughout and has not reported RUDI(INVALID), and the state machine is still in the Ability_Detect state, this is interpreted as a sign that the other side is “alive” but cannot send configuration codes to perform Auto-Negotiation. Therefore, the state machine moves to a new state called “Bypass_Link_Up”, in which the MAC assumes a link up and the operational mode is set to whatever the <SetFcEn> bit in Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415) is at that time.
Note that Auto-Negotiation is restarted automatically once the other device is replaced by a device that can perform Auto-Negotiation.
If the Bypass was performed, the MAC reports this via <InBand AutoNeg BypassAct> in the Port<n> Status Register0 (0<=n<24, CPUPort = 63) (Table 149 p. 418), together with the interrupt stating link change. Therefore, management can recognize whether the link was resumed due to standard Auto-Negotiation or Bypass.

Configuration
In the Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) (Table 148 p. 415):
• To enable In-Band Auto-Negotiation bypass, set the <InBandAn ByPassEn> bit.
• To disable In-Band Auto-Negotiation bypass, clear the <InBandAn ByPassEn> bit.
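The Bypass decision described in this section reduces to a single condition evaluated when the extended timer expires. The sketch below models only that transition, under stated assumptions: the function and argument names are hypothetical, the 10 msec link timer is the approximate figure quoted in the text, and the rest of the Clause 37 state machine is not modeled.

```python
LINK_TIMER_MS = 10                    # approximate link timer from the text
BYPASS_TIMER_MS = 20 * LINK_TIMER_MS  # about 200 msec


def next_an_state(
    state: str,
    bypass_enabled: bool,
    elapsed_ms: float,
    in_sync: bool,
    rudi_invalid_seen: bool,
) -> str:
    """Return the state after evaluating the Bypass condition (hypothetical model).

    elapsed_ms is the time spent since entering Ability_Detect; in_sync and
    rudi_invalid_seen summarize the receive side over that whole interval.
    """
    if (
        state == "Ability_Detect"
        and bypass_enabled
        and elapsed_ms >= BYPASS_TIMER_MS
        and in_sync
        and not rudi_invalid_seen
    ):
        # The partner appears alive but is not sending configuration codes.
        return "Bypass_Link_Up"
    return state
```

Any loss of synchronization, any RUDI(INVALID) report, or leaving Ability_Detect before the timer expires keeps the state machine on its standard path.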
9.5.2 Out-of-Band Auto-Negotiation in SGMII Mode

In SGMII mode, Auto-Negotiation with the PHY for link, duplex mode, speed, and Flow Control may be performed via Out-of-Band Auto-Negotiation, using the PHY polling unit, via the device’s Master SMI Interfaces (Section 6. "Host Management Interfaces" on page 55).

In this mode, the PHY performs Auto-Negotiation with its link partner. In this process, both PHYs communicate their abilities. The PHY polling unit polls the PHY registers. The polling data is then used to establish the port status. The addresses of the different PHYs are specified in the PHY address registers.

The device’s Out-of-Band Auto-Negotiation block supports Auto-Negotiation for the following:
Link state: When the link is not forced to link up or link down (Section 9.4.2 "Link State").
Speed: When Speed Auto-Negotiation is enabled (Section 9.4.11 "Tri-Speed Port: Speed Setting").
Duplex mode: When duplex mode Auto-Negotiation is enabled (Section 9.4.12 "Tri-Speed Port: Duplex Mode Setting").
Flow Control: Only symmetric 802.3x Flow Control is supported when Flow Control Auto-Negotiation is enabled (Section 9.4.8 "802.3x Flow Control").

9.5.3 Auto Media Select in SGMII Mode

The 88E1112 is a dual-media PHY. It can be used for “combo” ports in which either fiber media or copper media is used. When connected to an 88E1112 PHY, the device enables Auto Media Selection without management intervention. This support is performed using the PHY polling unit, via the SMI interface, to poll the PHY registers and retrieve the selected media.

In 1000Base-X Bypass mode (Section 9.5.1.4), the Bypass is performed if the other device does not transmit anything other than Idles during the extended timer period.
(Configuration codes are not Idles; therefore, a regular Auto-Negotiating device does not allow the Bypass to take place.)

Configuration
Enabling Auto Media Selection for a port:
• To configure ports 0 through 5, set the <AutoMedia SelectEn port[5:0]> field in PHY Auto-Negotiation Configuration Register0 (Table 262 p. 517) accordingly.
• To configure ports 6 through 11, set the <AutoMedia SelectEn port[11:6]> field in PHY Auto-Negotiation Configuration Register1 (Table 263 p. 518) accordingly.
• To configure ports 12 through 17, set the <AutoMedia SelectEn port[17:12]> field in PHY Auto-Negotiation Configuration Register2 (Table 264 p. 518) accordingly.
• To configure ports 18 through 23, set the <AutoMedia SelectEn port[23:18]> field in PHY Auto-Negotiation Configuration Register3 (Table 265 p. 519) accordingly.
The <MediaActive> bit in Port<n> Status Register1 (0<=n<24, CPUPort = 63) reflects the active media.

9.6 MAC MIB Counters

The device implements MAC MIB Counters, providing the counters necessary to support MAU, 802.3 and EtherLike MIB. Each port has a set of counters residing in consecutive address space.

Table 38 describes the MIB counters supported by Tri-Speed ports.
Table 39 describes the MIB counters supported by HyperG.Stack and HX/QX ports.

Table 38: MAC MIB Counters for Tri-Speed Ports
Counter Name | Counter Width | Description
Late Collision | 32 bits | Number of late collisions seen by the MAC.
Collisions | 32 bits | Number of collision events seen by the MAC.
BadCRC | 32 bits | Number of CRC error events.
RxError Frame received | 32 bits | Number of Rx Error events seen by the receive side of the MAC.
Jabber | 32 bits | Number of jabber packets received.
Oversize | 32 bits | Number of oversize packets received.
Fragments | 32 bits | Number of fragments received.
Undersize | 32 bits | Number of undersize packets received.
Receive FIFO Overrun | 32 bits | Number of instances that the port was unable to receive packets due to insufficient bandwidth to one of the packet processor internal resources, such as the DRAM or buffer allocation.
FCReceived | 32 bits | Number of 802.3x Flow Control packets received.
FCSent | 32 bits | Number of Flow Control frames sent.
Sent Multiple | 32 bits | Valid frame transmitted on a half-duplex link that encountered more than one collision. Byte count and cast are valid.
BroadcastFramesSent | 32 bits | Number of good frames sent that had a Broadcast destination MAC Address. This does not include: IEEE 802.3 Flow Control frames, collided packets, packets dropped due to excessive collision, or packets with Tx Error Event.
MulticastFramesSent | 32 bits | Number of good frames sent that had a Multicast destination MAC Address. This does not include: IEEE 802.3 Flow Control frames, collided packets, packets dropped due to excessive collision, or packets with Tx Error Event.
ExcessiveCollision | 32 bits | Number of frames dropped in the transmit MAC due to excessive collision condition. This counter is applicable for half-duplex mode only.
Unicast Frame Sent | 32 bits | Number of Ethernet Unicast frames sent from this MAC.
This does not include: IEEE 802.3 Flow Control frames, collided packets, packets dropped due to excessive collision, or packets with Tx Error Event.
GoodOctetsSent | 64 bits | Sum of lengths of all good Ethernet frames sent from this MAC. This does not include 802.3 Flow Control frames, packets dropped due to excessive collision, or packets with Tx Error Event. NOTE: When reading a 64-bit counter, the 32 LSBs must be read first.
Frames1024toMaxOctets | 32 bits | Number of received and/or transmitted good and bad frames that are more than 1023 bytes in size. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames512to1023Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 512-1023 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames256to511Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 256-511 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames128to255Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 128-255 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames65to127Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 65-127 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames64Octets | 32 bits | Number of received and/or transmitted good and bad frames that are 64 bytes in size. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
MulticastFramesReceived | 32 bits | Number of good frames received that had a Multicast destination MAC Address. NOTE: This does not include 802.3 Flow Control messages, as they are considered MAC Control packets.
BroadcastFrames Received | 32 bits | Number of good frames received that had a Broadcast destination MAC Address.
Sent deferred | 32 bits | Valid frame transmitted on a half-duplex link with no collisions, but where the frame transmission was delayed due to the media being busy. Byte count and cast are valid.
Good Unicast Frames Received | 32 bits | Number of Ethernet Unicast frames received that are not bad Ethernet frames or MAC Control packets.
Note that this number does include Bridge Control packets such as LACP and BPDU.
Tx fifo overrun and tx CRC errors | 32 bits | Invalid frame transmitted when one of the following occurs: 1. A frame with bad CRC was read from the memory. 2. Underrun occurs.
BadOctetsReceived | 32 bits | Sum of lengths of all bad Ethernet frames received.
GoodOctetsReceived | 64 bits | Sum of lengths of all good Ethernet frames received, i.e., frames that are not bad frames or MAC Control packets. This sum does not include IEEE 802.3x Pause messages, but does include bridge control packets such as LACP and BPDU. NOTE: When reading a 64-bit counter, the 32 LSBs must be read first.

Table 39: MAC MIB Counters for HyperG.Stack Ports
Counter Name | Counter Width | Description
BadCRC | 32 bits | Number of CRC error events.
RxError Frame received | 32 bits | Number of Rx Error events seen by the receive side of the MAC.
Jabber | 32 bits | Number of jabber packets received.
Oversize | 32 bits | Number of oversize packets received.
Fragments | 32 bits | Number of fragments received.
Undersize | 32 bits | Number of undersize packets received.
Received FIFO Overrun | 32 bits | Number of instances that the port was unable to receive packets due to insufficient bandwidth to one of the packet processor internal resources.
FCReceived | 32 bits | Number of 802.3x Flow Control frames received.
FCSent | 32 bits | Number of 802.3x Flow Control frames sent.
BroadcastFramesSent | 32 bits | Number of good frames sent that had a Broadcast destination MAC Address. This does not include 802.3 Flow Control frames.
MulticastFramesSent | 32 bits | Number of good frames sent that had a Multicast destination MAC Address. This does not include 802.3 Flow Control messages.
Unicast Frame Sent | 32 bits | Number of good frames sent that had a Unicast destination MAC Address.
GoodOctetsSent | 64 bits | Sum of lengths of all good Ethernet frames sent from this MAC. This does not include 802.3 Flow Control frames or packets with Transmit Error Event counted in CRCErrorSent. NOTE: When reading a 64-bit counter, the 32 LSBs must be read first.
Frames1024toMaxOctets | 32 bits | Number of received and/or transmitted good and bad frames that are more than 1023 bytes in size. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames512to1023Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 512-1023 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames256to511Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 256-511 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames128to255Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 128-255 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames65to127Octets | 32 bits | Number of received and/or transmitted good and bad frames whose size is between 65-127 bytes. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
Frames64Octets | 32 bits | Number of received and/or transmitted good and bad frames that are 64 bytes in size. NOTE: This does not include MAC Control Frames. The counter is enabled/disabled for Rx and/or Tx packets according to the <RxHistogram En> and <TxHistogram En> configuration of the associated port group.
MulticastFrames Received | 32 bits | Number of good frames received that had a Multicast destination MAC Address. NOTE: This does NOT include 802.3 Flow Control messages, as they are considered MAC Control packets.
BroadcastFrames Received | 32 bits | Number of good frames received that had a Broadcast destination MAC Address.
Good Unicast Frames Received | 32 bits | Number of Ethernet Unicast frames received that are not bad Ethernet frames or MAC Control packets. Note that this number includes Bridge Control packets such as LACP and BPDU.
CRCErrorsSent | 32 bits | Invalid frame transmitted when one of the following occurs: 1. A frame with bad CRC was read from the memory. 2. Underrun occurs.
BadOctetsReceived | 32 bits | Sum of lengths of all bad Ethernet frames received.
GoodOctetsReceived | 64 bits | Sum of lengths of all good Ethernet frames received, i.e., frames that are not bad frames or MAC Control packets. This sum does not include 802.3x Pause messages, but does include bridge control packets such as LACP and BPDU. NOTE: When reading a 64-bit counter, the 32 LSBs must be read first.

The Tri-Speed Ports MIB Counters are listed in C.11.5 "Tri-Speed Ports MAC MIB Counters Registers" on page 544.
The HyperG.Stack ports MIB Counters are listed in C.8 "HyperG.Stack and HX/QX Ports MAC, Status, and MIB Counters, and XAUI Control Configuration Registers" on page 426.

9.6.1 MAC MIB Counters Configuration

Port MAC MIB counters support the following configuration options:
• Enable/disable MAC MIB counting per port.
• Enable/disable updating of the RMON Etherstat histogram counters (see footnote 1) for received packets.
• Enable/disable updating of the RMON Etherstat histogram counters for transmitted packets.
• For the Tri-Speed ports, enable/disable Counters Clear on read per group of six ports.
• For the HyperG.Stack ports, enable/disable Counters Clear on read per port.

Configuration
Tri-Speed Ports
• To enable/disable MIB counting per port, set the <MIBCntEn> field in the Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63) (Table 145 p. 411).
• To enable/disable received packets histogram for ports 0 through 5, set the <RxHistogram En> bit in the MIB Counters Control Register0 (for Ports 0 through 5) (Table 293 p. 544).
• To enable/disable received packets histogram for ports 6 through 11, set the <RxHistogram En> bit in the MIB Counters Control Register1 (for ports 6 through 11) (Table 294 p. 545).
• To enable/disable received packets histogram for ports 12 through 17, set the <RxHistogram En> bit in the MIB Counters Control Register2 (for Ports 12 through 17) (Table 295 p. 545).
• To enable/disable received packets histogram for ports 18 through 23, set the <RxHistogram En> bit in the MIB Counters Control Register3 (for Ports 18 through 23) (Table 296 p. 546).
• To enable/disable transmitted packets histogram for ports 0 through 5, set the <TxHistogramEn> bit in the MIB Counters Control Register0 (for Ports 0 through 5) (Table 293 p. 544).
• To enable/disable transmitted packets histogram for ports 6 through 11, set the <TxHistogramEn> bit in the MIB Counters Control Register1 (for ports 6 through 11) (Table 294 p. 545).
• To enable/disable transmitted packets histogram for ports 12 through 17, set the <TxHistogramEn> bit in the MIB Counters Control Register2 (for Ports 12 through 17) (Table 295 p. 545).
• To enable/disable transmitted packets histogram for ports 18 through 23, set the <TxHistogramEn> bit in the MIB Counters Control Register3 (for Ports 18 through 23) (Table 296 p. 546).
• To configure counters clear on read for ports 0 through 5 MIB counters, set the <DontClear OnRead> bit in the MIB Counters Control Register0 (for Ports 0 through 5) (Table 293 p. 544).
• To configure counters clear on read for ports 6 through 11 MIB counters, set the <DontClear OnRead> bit in the MIB Counters Control Register1 (for ports 6 through 11) (Table 294 p. 545).
• To configure counters clear on read for ports 12 through 17 MIB counters, set the <DontClear OnRead> bit in the MIB Counters Control Register2 (for Ports 12 through 17) (Table 295 p. 545).
• To configure counters clear on read for ports 18 through 23 MIB counters, set the <DontClear OnRead> bit in the MIB Counters Control Register3 (for Ports 18 through 23) (Table 296 p. 546).

HyperG.Stack Ports
• To enable/disable MIB counting per port, set the <MIBCntDis> bit in the Port<n> MAC Control Register0 (24<=n<27) (Table 159 p. 427).
• To enable/disable received packets histogram for the HyperG.Stack ports, set the <Port24Rx HistogramEn>, <Port25Rx HistogramEn>, and <Port26Rx HistogramEn> bits in the HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register (Table 163 p. 430).

1. The RMON Etherstat histogram counters are Frames1024toMaxOctets, Frames512to1023Octets, Frames256to511Octets, Frames128to255Octets, Frames65to127Octets, and Frames64Octets.

9.6.2 Port MIB Counters Capture

The host CPU usually reads more than one MIB counter or all of the ports’ MAC MIB counters.
When counters are read one by one, they are read at different times and are therefore not coherent with each other. To overcome this coherency problem, the device implements a port MIB counters capture mechanism. When this mechanism is triggered, the device reads all of a port’s MIB counters into a capture area in a single atomic action. Once this read is done, the host CPU may read the counters from the capture area.

To capture port<n> MIB counters, the host CPU must perform the following:

Tri-Speed Ports
1. Configure the port to be captured and trigger the capture action:
– For ports 0 through 5, write port# modulo 6 to <CapturePort> and set the <CaptureTrigger> bit in the MIB Counters Control Register0 (for Ports 0 through 5) (Table 293 p. 544).
– For ports 6 through 11, write port# modulo 6 to <CapturePort> and set the <CaptureTrigger> bit in the MIB Counters Control Register1 (for ports 6 through 11) (Table 294 p. 545).
– For ports 12 through 17, write port# modulo 6 to <CapturePort> and set the <CaptureTrigger> bit in the MIB Counters Control Register2 (for Ports 12 through 17) (Table 295 p. 545).
– For ports 18 through 23, write port# modulo 6 to <CapturePort> and set the <CaptureTrigger> bit in the MIB Counters Control Register3 (for Ports 18 through 23) (Table 296 p. 546).
2. <CaptureTrigger> is a self-clearing bit. It is set back to 0 once the capture action is done. The host CPU must poll this bit until it is LOW. Alternatively, the host CPU can unmask the <CountCopy Done> maskable interrupt in the Tri-Speed Ports GOP<n> MIBs Interrupt Cause Register (0<=n<4) (Table 567 p. 800).
3. The captured port MAC MIB counters are ready in the Capture Area (see C.11.5 "Tri-Speed Ports MAC MIB Counters Registers" on page 544 for Capture Area offsets).

HyperG.Stack and HX/QX Ports
1.
Configure the port to be captured and trigger the capture action in the HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register (Table 163 p. 430):
– To trigger a capture of Port24 MIB counters, set the <Port24Capture Trigger> bit.
– To trigger a capture of Port25 MIB counters, set the <Port25Capture Trigger> bit.
– To trigger a capture of Port26 MIB counters, set the <Port26Capture Trigger> bit.
2. Each capture trigger bit is self-clearing. It is set back to 0 once the capture action is done. The host CPU must poll this bit until it is LOW. Alternatively, the host CPU can unmask the <Port24MIB CapureDoneInt>, <Port25MIB CapureDoneInt>, and <Port26MIB CapureDoneInt> maskable interrupts in the Transmit Queue General Interrupt Cause Register (Table 593 p. 813).
3. The captured port MAC MIB counters are ready in the Capture Area (see C.8 "HyperG.Stack and HX/QX Ports MAC, Status, and MIB Counters, and XAUI Control Configuration Registers" on page 426 for Capture Area offsets).

HyperG.Stack and HX/QX Ports MIB counters configuration (Section 9.6.1 "MAC MIB Counters Configuration"):
• To enable/disable transmitted packets histogram for the HyperG.Stack/HX/QX ports, set the <Port24Tx HistogramEn>, <Port25Tx HistogramEn>, and <Port26Tx HistogramEn> bits in the HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register (Table 163 p. 430).
• To configure counters clear on read, set the <Port24 DontClear AfterRead>, <Port25 DontClear AfterRead>, and <Port26 DontClear AfterRead> bits in the HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register (Table 163 p. 430).
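The Tri-Speed port capture procedure of Section 9.6.2 (select the port, set the trigger, poll the self-clearing bit, then read the Capture Area) can be sketched in host software roughly as follows. This is a minimal illustration, not driver code: the bit positions of <CapturePort> and <CaptureTrigger>, the `FakeDevice` class standing in for real register access, and all function names are assumptions made here and are not taken from the register tables.

```python
# Hypothetical bit layout, NOT taken from Table 293 through Table 296.
CAPTURE_TRIGGER = 1 << 0
CAPTURE_PORT_SHIFT = 1
CAPTURE_PORT_MASK = 0x7 << CAPTURE_PORT_SHIFT


class FakeDevice:
    """Toy model of one group of six Tri-Speed ports (illustration only)."""

    def __init__(self, live_counters):
        self.live = live_counters        # {port_in_group: {counter: value}}
        self.control = 0                 # MIB Counters Control Register<k>
        self.capture_area = {}
        self.busy_polls = 0

    def write_control(self, value):
        self.control = value
        if value & CAPTURE_TRIGGER:
            port = (value & CAPTURE_PORT_MASK) >> CAPTURE_PORT_SHIFT
            self.capture_area = dict(self.live[port])   # atomic snapshot
            self.busy_polls = 2          # trigger reads back as 1 twice

    def read_control(self):
        if self.control & CAPTURE_TRIGGER:
            if self.busy_polls == 0:
                self.control &= ~CAPTURE_TRIGGER         # self-clearing bit
            else:
                self.busy_polls -= 1
        return self.control


def locate(port):
    """Map a Tri-Speed port to (control register index, <CapturePort> value)."""
    if not 0 <= port < 24:
        raise ValueError("Tri-Speed ports are numbered 0 through 23")
    return port // 6, port % 6           # port# modulo 6, as in step 1


def capture_port_counters(groups, port, max_polls=100):
    """Trigger a capture for `port` and return the captured counters."""
    reg_index, capture_port = locate(port)
    dev = groups[reg_index]
    value = dev.read_control() & ~CAPTURE_PORT_MASK
    dev.write_control(value | (capture_port << CAPTURE_PORT_SHIFT) | CAPTURE_TRIGGER)
    for _ in range(max_polls):           # step 2: poll until the bit is LOW
        if not dev.read_control() & CAPTURE_TRIGGER:
            return dict(dev.capture_area)    # step 3: Capture Area is ready
    raise TimeoutError("capture did not complete")


def read_counter64(read_word, lsb_addr, msb_addr):
    """Read a 64-bit counter; the 32 LSBs must be read first."""
    low = read_word(lsb_addr)
    high = read_word(msb_addr)
    return (high << 32) | low
```

In this model, port 14 maps to control register 2 with <CapturePort> = 2. A real driver would replace `FakeDevice` with its own register read and write primitives and would typically bound the polling loop by time rather than by iteration count.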
9.7 MAC Error Reporting

In addition to the ports’ MIB counters, the device incorporates status registers that report the offending MAC source address and the error type for a CPU-specified port.

The following MAC error types are reported:
• Jabber
• Fragment
• Undersize
• Oversize
• Rx Error
• CRC Error
• Overrun

When such an error is detected, the device updates the status register with the associated MAC source address and the error type, and raises an associated interrupt (maskable). The information remains in the status register until read by the CPU.

Configuration
To configure the port on which MAC Errors are reported, set <MACError IndicationPort> in the MAC Error Indication Port Configuration Register (Table 533 p. 773).

Interrupts
The <MACErrorInt> maskable interrupt in the Buffer Memory Interrupt Cause Register (Table 597 p. 815) is asserted when a new MAC Error is latched to the MAC Error Indication Status Registers.

Status Registers
MAC Error Indication Status Register0 (Table 534 p. 773) and MAC Error Indication Status Register1 (Table 535 p. 773) contain the following fields:
<MACError>: Error type
<MACError MACSA[47:32]> and <MACError MACSA[31:0]>: 48-bit MAC SA of the packet causing the reported error
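As an illustration of how host software might combine the two MAC SA fields into one address, consider the sketch below. It assumes the field values have already been extracted from the status registers; the function name is hypothetical, and the position of each field within its register is deliberately not modeled because it is defined only in Table 534 and Table 535.

```python
def mac_sa_from_fields(sa_47_32, sa_31_0):
    """Combine <MACError MACSA[47:32]> and <MACError MACSA[31:0]> into a MAC string."""
    if not 0 <= sa_47_32 <= 0xFFFF:
        raise ValueError("MACSA[47:32] is a 16-bit field")
    if not 0 <= sa_31_0 <= 0xFFFFFFFF:
        raise ValueError("MACSA[31:0] is a 32-bit field")
    value = (sa_47_32 << 32) | sa_31_0
    # Format the 48-bit value as six bytes, most significant byte first.
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
```

For example, the field values 0x0050 and 0x43ABCDEF combine into the address 00:50:43:ab:cd:ef. Because the information remains latched until read, a handler for <MACErrorInt> would normally read both status registers before decoding.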
Section 10. Ingress Policy Engine

This section describes the device’s Policy engine.

10.1 Policy Engine Concepts

The Policy engine performs per-flow processing. It classifies packets into flows and then processes each flow with a flow-specific action. Per-flow processing is required to support applications such as Security and QoS.

The Policy engine is capable of classifying packets based on Layer 2 to Layer 4 header information (or any packet data in the first 128 bytes of the packet) and on device ingress parameters such as the packet ingress port or QoS Profile.

A flow is subject to the following types of actions:
• Packet command: Forward, Drop, Mirror, or Trap to CPU
• QoS marking command
• Redirect flow traffic to a specific target port, trunk, or Multicast group, bypassing the Layer 2 Bridge engine
• Trigger traffic policing
• VLAN ID assignment
• Ingress mirroring to an analyzer port
• Packet counting

In addition to the above, in the multilayer stackable switches the Policy engine also performs Policy-based Longest Prefix Match for the Unicast Routing engine. In this case, the Policy action entry is regarded as a next hop entry for the Unicast Routing engine. It includes all parameters necessary for routing (e.g., ARP Pointer, VLAN-ID, Egress Interface). For further details see Section 12. "IPv4 and IPv6 Unicast Routing" on page 265.

The Policy engine is designed to support easy implementation of Access Control Lists (ACLs) and Policy Control Lists (PCLs). A PCL is an ordered collection of rules, each rule defining a classification criterion and an action. A packet is classified to a flow if it matches the classification criterion.
A matched packet is processed according to the rule’s action.

Packet classification is performed by executing an ordered lookup over the PCL rules. The lookup is terminated on the first matching rule. If there is no match, the packet is not subject to any policy action.

A search key is a binary string constructed by concatenating packet header information and device ingress information. The search key is used to match against the classification criterion. The Policy engine is capable of creating six search keys according to the following criteria:
• There are two optional search key lengths: a 24-byte standard key and a 48-byte extended key. Key length is selected per PCL.
• A different search key is created per packet type. The supported packet types are non-IP, ARP, IPv4, and IPv6. Every search key holds information specific to its packet type. For example, IPv4/6 keys hold Layer 3–4 information, while the non-IP key holds only Layer 2 information.

In addition to the above six keys, in the multilayer stackable switches, when the Policy engine is used to perform Longest Prefix Match for the Unicast Routing engine, a seventh 24-byte standard key is created for IPv6 packets. For further details see Section 12. "IPv4 and IPv6 Unicast Routing" on page 265.

The classification criterion is a string of ternary digits, equal in length to the search key. Every digit can have one of three values: ‘0’, ‘1’, or ‘x’. A search key is said to match a classification criterion if the search key and the classification criterion are exactly equal, excluding all digits equal to ‘x’.
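The ternary matching rule and the first-match lookup described above can be modeled in a few lines. This is a software illustration only: the device performs the lookup in the Policy TCAM, and the string representation of keys and criteria, the function names, and the example actions are assumptions introduced here.

```python
def ternary_match(key_bits, criterion):
    """Return True if a binary key matches a ternary criterion ('0', '1', 'x')."""
    if len(key_bits) != len(criterion):
        raise ValueError("the criterion must be equal in length to the search key")
    # Digits equal to 'x' are excluded from the comparison.
    return all(c == "x" or c == k for k, c in zip(key_bits, criterion))


def first_match(key_bits, rules):
    """Ordered lookup over (criterion, action) rules; stops at the first match."""
    for criterion, action in rules:
        if ternary_match(key_bits, criterion):
            return action
    return None   # no match: the packet is not subject to any policy action
```

With the rules [("00xx", "drop"), ("1xxx", "forward"), ("1011", "trap")], the key "1011" is given the action "forward": the more specific third rule is never reached because the lookup terminates on the first matching rule. Rule order within a PCL therefore matters.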
To support multiple PCLs, the Policy engine incorporates the following mechanisms:
• A PCL is identified by a unique PCL-ID. An ingress interface is assigned a PCL-ID, which binds it to the PCL. Multiple interfaces can be bound to the same PCL by assigning the interfaces the same PCL-ID. The interface types can be physical ports, VLANs, or trunk groups.
• Every packet is subject to up to two policy lookup cycles. This feature allows simple implementation of application-specific PCLs. For example, the first lookup can be dedicated to security rules, while the second lookup can be dedicated to QoS and policy-based switching rules.

10.2 Policy Engine Overview

10.2.1 Policy Engine Location

The Policy engine is located in the ingress pipe, as shown in Figure 35 for the Layer 2+ stackable switches and in Figure 36 for the multilayer stackable switches.

The following functions are executed before the Policy engine:
• MAC-level error filtering (e.g., bad CRC, runt, etc.)
• Port-based and protocol-based QoS marking (Section 8.2.1 "Port-Based QoS Marking" and Section 8.2.2 "Protocol-Based QoS Marking" on page 118).
• Port-based and protocol-based VID assignment (Section 11.2.2.1 "Port-Based VLANs" and Section 11.2.2.2 "Protocol-Based VLANs" on page 208).
• Port-based Nested VLAN configuration (Section 11.2.3.2 "Nested VLAN Support" on page 212).
Figure 35: Ingress Pipe Block Diagram for SecureSmart and Layer 2+ Stackable Switches
[Block diagram. Blocks shown: Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, Policing Engine, Pre-Egress Engine.]

Following the standard/extended key terminology, a standard rule uses a 24-byte classification criterion and an extended rule uses a 48-byte classification criterion. The Policy engine can accommodate up to 1024 standard rules, 512 extended rules, or various combinations of standard and extended rules. (SecureSmart and SecureSmart Stackable devices: Up to 256 standard rules and 128 extended rules.)

The Policy engine supports multiple PCLs. This property allows a PCL to be application specific (e.g., QoS or Security), and/or interface specific (e.g., core port or access ports, guest VLAN or marketing VLAN).
Figure 36: Ingress Pipe Block Diagram for Multilayer Stackable Switches
[Block diagram. Blocks shown: Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, IPv4/v6 Unicast Routing Engine, Policing Engine, Pre-Egress Engine.]

10.2.2 Policy Engine Walkthrough
The Policy engine includes the following sub-blocks:
• Lookup configuration
• Search key generator
• Policy Lookup Memory: Policy TCAM
• Policy Actions Table
• Policy Match counters

10.2.2.1 Lookup Cycle Configuration
This block is used to obtain the lookup cycle configuration, which controls the lookup cycle behavior:
• A lookup cycle can be enabled or disabled.
• If the lookup cycle is enabled, the PCL-ID and the search key type (either standard or extended) are assigned to the packet.

The lookup cycle configuration is maintained per ingress interface and per lookup cycle. The ingress interface can be either a physical port or a VLAN.

The policy lookup configuration is described in Section 10.3 "Policy Lookup Configuration" on page 179.

10.2.2.2 Search Key Generator
This block generates a search key per lookup, according to the lookup configuration and packet type. It generates the key using fields extracted from the first 128 bytes of the packet and internally generated fields, such as the PCL-ID assigned to the packet.
Notes
• Layer 4 fields in IPv6 packets cannot be extracted if the IPv6 header includes extension headers, other than a single Hop-by-Hop Options header.
• The device cannot parse search key fields relative to the packet Layer 3 and 4 headers if the packet is received with more than four Nested VLAN tags.

The Policy engine informs the application of parsing errors by setting the parsing error flags in the search key (see Section 10.5.3.1 "Internally Generated Fields" on page 188 for a description of the parsing error flags). The application may respond to parsing errors by setting a rule on these flags.

Search key generation is described in Section 10.5 "Policy Search Keys".

10.2.2.3 Policy TCAM
The Policy TCAM holds the classification criteria (i.e., rules) of all PCLs. It is implemented using a Ternary Content-Addressable Memory (TCAM).

The Policy TCAM is organized as a 512 x 48-byte (rows x columns) matrix. Every row can hold two standard rules or a single extended rule. Extended rules cannot be divided between rows. Every TCAM row is associated with a flag indicating its contents. To ensure an accurate lookup, the application must configure this flag properly.

The search key type determines which TCAM rows are searched. For example, if the search key is standard, then the lookup is performed over all TCAM rows holding standard rules.
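As a rough illustration of the row organization above, the following sketch computes how many standard rules still fit for a given number of extended rules, and which rows a lookup visits for a given key type. It is a minimal model under stated assumptions: it uses the 512-row TCAM described in the text, and the names `max_standard_rules`, `rows_searched`, and the `"standard"`/`"extended"` flag values are hypothetical, not register encodings.

```python
TCAM_ROWS = 512          # rows in the Policy TCAM, per the text
STD_RULES_PER_ROW = 2    # a row holds two standard rules ...
EXT_RULES_PER_ROW = 1    # ... or a single extended rule


def max_standard_rules(extended_rules: int) -> int:
    """Standard rules that still fit once extended rules take whole rows."""
    if not 0 <= extended_rules <= TCAM_ROWS * EXT_RULES_PER_ROW:
        raise ValueError("extended rule count exceeds the TCAM size")
    free_rows = TCAM_ROWS - extended_rules
    return free_rows * STD_RULES_PER_ROW


def rows_searched(row_flags, key_type):
    """Rows visited by a lookup: only rows whose flag equals the key type.

    row_flags is a list with one entry per row, each 'standard' or
    'extended' (hypothetical values modelling the per-row content flag).
    """
    return [row for row, flag in enumerate(row_flags) if flag == key_type]


flags = ["extended", "standard", "standard", "extended"]
standard_rows = rows_searched(flags, "standard")
```

The two extremes reproduce the capacities quoted earlier in this section: no extended rules leaves room for 1024 standard rules, and 512 extended rules leave room for none.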
The PCL-ID is embedded in each rule and search key, thus allowing a lookup to be restricted to a specific PCL.

Figure 37 illustrates the Policy TCAM organization. Extended rules are accessed and searched by row order, starting at row #0. Standard rules are accessed and searched by column order, and by row order within each column. Referring to Figure 37, standard rules in the left column are searched before the rules in the right column.

To utilize the TCAM efficiently, the application should split PCLs containing standard rules in such a way that they consume a minimum number of TCAM rows. One approach is to divide a Standard PCL into two equal groups and fit each group into a different column, using the same rows for both groups. This approach reserves space for extended rules. Figure 37 shows a sample 7-rule Standard PCL, fit into locations 2, 3, 4, 5, 514, 515, and 516. The lookup order is column-wise, by ascending location index, as described above.

When a match is found, the Policy TCAM reports the index of the matched rule to the actions memory. In the case of a match on an extended rule, the reported index is the rule row index. If a standard rule is matched, the rule column index is reported. For example, a match on the last rule in the 7-rule Standard PCL, shown in Figure 37, is reported to the actions memory as a match on rule column index #516.

The lookup engine provides the following capabilities:
• Standard and Extended PCLs can co-exist in the Policy TCAM, thus optimizing Policy TCAM utilization.
  – However, all the rules of a given PCL-ID must consist of either Standard rules or Extended rules.
• Rules can be shared between PCLs.
  – This property can be used to efficiently implement default actions that apply to all PCLs. Rule sharing is done by completely or partially masking the rule's PCL-ID field.
    To have the same relative order in all the relevant PCLs, the shared rules should be placed at the lowest or highest addresses of the search memory.
• Rules can be shared between PCLs, but restricted to a specific lookup cycle.
  – This property can be used to define different default actions per lookup cycle. To implement this property, the application must designate one of the bits in the PCL-ID field to indicate the lookup cycle. For example, all PCLs searched at the first lookup cycle are allocated PCL-IDs with the designated 'lookup cycle' bit cleared. Consequently, the PCL-ID of a rule shared in a specific lookup cycle is masked, except for the 'lookup cycle' digit, which must be set or cleared.

Note
Sharing rules between Extended and Standard PCLs is not supported. The application must duplicate the shared rule as both a Standard rule and an Extended rule.

Figure 37: Organization of the Policy TCAM
[Diagram. 512 TCAM rows (rule row index 0 to 511), each holding two 24-byte columns. Rule column indexes 0 to 511 run down the left column and rule column indexes 512 to 1023 run down the right column, so that row n holds rule column indexes n and n + 512.]
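Two points from this section lend themselves to a short sketch: the mapping between a rule column index and its TCAM row and column, and rule sharing through PCL-ID masking. The code below is a software model only. The choice of bit 9 as the 'lookup cycle' bit is an assumption made for illustration (the text leaves the choice to the application), the function names are hypothetical, and the 10-bit PCL-ID width is taken from Table 40.

```python
TCAM_ROWS = 512        # rows in the Policy TCAM (Figure 37)
PCL_ID_MASK = 0x3FF    # the PCL-ID is a 10-bit field
CYCLE_BIT = 1 << 9     # assumption: bit 9 is the 'lookup cycle' bit


def standard_rule_location(column_index):
    """Map a rule column index (0..1023) to a (row, column) pair."""
    if not 0 <= column_index < 2 * TCAM_ROWS:
        raise ValueError("rule column index out of range")
    return column_index % TCAM_ROWS, column_index // TCAM_ROWS


def rows_used(column_indexes):
    """Number of distinct TCAM rows consumed by a set of standard rules."""
    return len({standard_rule_location(i)[0] for i in column_indexes})


def pcl_id_matches(key_pcl_id, rule_value, rule_care_mask):
    """Compare only the PCL-ID bits that the rule does not mask out."""
    return (key_pcl_id & rule_care_mask) == (rule_value & rule_care_mask)


# (value, care-mask) pairs for the PCL-ID part of three rules.
private_rule = (5, PCL_ID_MASK)              # belongs to PCL 5 only
shared_by_all = (0, 0)                       # PCL-ID fully masked
shared_in_cycle_1 = (CYCLE_BIT, CYCLE_BIT)   # only the cycle bit is compared
```

With this model, the 7-rule example PCL at locations 2, 3, 4, 5, 514, 515, and 516 occupies four rows, whereas placing the same seven rules in a single column would occupy seven.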
10.2.2.4 Policy Action Table
The Policy Action table holds 1024 per-flow actions, one action entry per standard rule. (SecureSmart and SecureSmart Stackable devices: 256 per-flow actions.) An action is selected using the index reported by the Policy TCAM. Every extended rule makes one action entry unusable. For example, an extended rule residing at TCAM row #1 is associated with action #1, making action #513 unusable.

10.2.2.5 Match Counters
The Policy engine maintains 32 32-bit counters. The counters may be used to count the number of times a rule has been matched by a packet. The application can use the counters to detect Denial of Service (DoS) attacks or to gather statistics. A counter is bound to a rule using the rule's action. A counter may be bound to multiple rules.

The match counters may be updated twice by every packet (once for a match on each lookup). Counters are read/write, and auto-wrap to 0 on overflow.

Configuration
• To read/write a policy counter, access the Policy Rule Match Counter<n> (0<=n<32) (Table 347 p. 606) accordingly.
• To bind a counter to a rule, use the per-flow action, described in Section 10.6 "Policy Actions".

10.2.3 Packet Walkthrough
The Policy engine processes packets in two lookup cycles.
Figure 38 illustrates a single lookup cycle as follows:
1. The lookup configuration is obtained (Section 10.3 "Policy Lookup Configuration").
2. The packet is checked to determine whether it is subject to policy processing (Section 10.4 "Triggering Policy Engine Processing"). Packets not subject to policy processing bypass both lookup cycles.
3. A search key is generated (Section 10.5 "Policy Search Keys").
4. A lookup is performed.
5. If the lookup fails to find a matching rule, no action is performed for this lookup cycle.
6. If the lookup finds a matching rule, the associated action is extracted and applied (Section 10.6 "Policy Actions"). If the rule is bound to a match counter, the match counter is incremented. The action may modify packet information, including packet processing parameters such as packet command, QoS information, forwarding information, and VLAN information.

For further information about the Policy Action table see Section 10.6 "Policy Actions".

Figure 38: Packet Walkthrough For a Lookup Cycle
[Flowchart. Packet begins first/second lookup cycle. Obtain lookup configuration. Enter lookup cycle? If no, keep previous packet information. If yes, generate search key and perform TCAM lookup. Match? If no, keep previous packet information. If yes, get action, apply counting (optional), and update packet information. Go to next lookup cycle or exit Policy engine.]
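The single lookup cycle of Figure 38 can be modelled roughly as follows. This is a sketch under stated assumptions rather than a description of the hardware: the packet is represented as a plain dictionary, the key generator and TCAM lookup are passed in as hypothetical callables, and the `Action` and `CycleConfig` types are invented for illustration. The 32-bit counter wrap follows Section 10.2.2.5.

```python
from dataclasses import dataclass, field
from typing import Optional

COUNTER_MASK = 0xFFFFFFFF  # 32-bit match counters auto-wrap to 0


@dataclass
class Action:
    updates: dict = field(default_factory=dict)  # packet fields to overwrite
    counter: Optional[int] = None                # bound match counter, if any


@dataclass
class CycleConfig:
    enabled: bool
    pcl_id: int = 0


def run_lookup_cycle(packet, cfg, make_key, lookup, actions, counters):
    """Run one lookup cycle and return the (possibly updated) packet info."""
    if not cfg.enabled:                  # lookup cycle disabled: skip it
        return packet
    key = make_key(packet, cfg)          # step 3: generate the search key
    index = lookup(key)                  # step 4: TCAM lookup
    if index is None:                    # step 5: no match, keep packet info
        return packet
    action = actions[index]              # step 6: fetch and apply the action
    if action.counter is not None:
        n = action.counter
        counters[n] = (counters[n] + 1) & COUNTER_MASK
    return {**packet, **action.updates}


counters = [0] * 32
counters[3] = 0xFFFFFFFF                 # about to wrap
actions = {7: Action({"qos_profile": 5}, counter=3)}
packet = {"vid": 10, "qos_profile": 0}
result = run_lookup_cycle(packet, CycleConfig(True, 1),
                          lambda p, c: "key", lambda k: 7,
                          actions, counters)
```

Running the function twice, once per lookup cycle configuration, models the two-cycle processing; the output of the first cycle is the input of the second.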
10.3 Policy Lookup Configuration

10.3.1 Global Search Memory Modes
The search memory can be operated in one of the following modes:
• Standard rules mode. Search memory supports standard rules.
• Extended rules mode. Search memory supports extended rules.
• Mixed mode. Search memory supports standard and extended rules concurrently.

Configuration
To configure the search memory mode, set the <Policy TCAMMode> and <TCAMMode Select> bits in the Policy Global Configuration Register (Table 333 p. 586).

10.3.2 PCL Configuration Table
The PCL Configuration table is accessed at the start of each lookup cycle, based on the interface-to-PCL binding configuration (Section 10.3.3 "Interface Binding to a Policy Configuration Entry").

Each Policy Configuration table entry contains two sets of parameters, one for each lookup cycle. These parameters bind the lookup cycle to a PCL in the search memory. The lookup cycle parameters are defined in Table 40.

Table 40: PCL Configuration Table Entry Lookup Cycle Parameters

Field Name            Description
Enable Lookup         Enable/disable the policy lookup cycle.
                      NOTE: If disabled, none of the other parameters in the lookup cycle entry are relevant.
PCL-ID                10-bit PCL-ID used in the TCAM search key.
                      NOTE: The PCL rule may mask any part of the PCL-ID to permit rule sharing between PCLs.
PCL Rule Type         Selects whether this PCL uses standard or extended rules.
                      NOTE: Relevant only if the TCAM is configured to operate in Mixed mode (Section 10.3.1 "Global Search Memory Modes" on page 179).
IP Search Key Format  If the PCL Rule Type of this entry is set to standard (24-byte) rules and the packet type is IPv4, the search key is determined according to this field's configuration:
                      • L2+IPv4/6 QoS. This key contains Layer 2 information and a Layer 3 DSCP field.
                      • IPv4+L4. This key contains Layer 3 and Layer 4 information.
                      If the PCL Rule Type of this entry is set to extended (48-byte) rules and the packet type is IPv6, the search key is determined according to this field's configuration:
                      • L2+IPv6. This key contains Layer 2 information and some of the Layer 3 information.
                      • IPv6+L4. This key contains Layer 3 and Layer 4 information.
                      NOTE: For a detailed description of the search key selection process, see Section 10.5.1 "Search Key Selection Procedure".

In addition to the above, in the multilayer stackable switches, the PCL Configuration table contains an additional field to determine the standard key type for IPv6 packets. If the second lookup is used for Longest Prefix Match for the Unicast Routing engine, this field indicates which key is used for IPv6 packets. For further information see Section 12. "IPv4 and IPv6 Unicast Routing" on page 265.
For the PCL Configuration table entry format see C.13.9 "PCL Configuration Table" on page 602.

10.3.3 Interface Binding to a Policy Configuration Entry
The Policy Configuration table has 1152 rows, where each row contains two sets of lookup-cycle parameters, as described in the previous section.

The index selection for accessing the Policy Configuration table is based on the configuration for binding an interface to a PCL.

There are two global access modes for binding an interface to a Policy Configuration table index:
• Access mode based on the port configuration:
  – In this global mode, the port configuration determines whether the access index is based on either:
    {local port #}, for port-based PCLs, or
    {packet VLAN-ID assignment}, for VLAN-based PCLs.
• Access mode based on the packet's original¹ source interface:
  – The original source interface may be either:
    {source device, source port}, if received on a non-trunk port, or
    {source trunk-ID}, if received on a trunk interface.
  – This mode is applicable when the device must provide Policy engine services for traffic from a remote device attached via a cascade port.

The access procedure to the Policy Configuration table is illustrated in Figure 39 on page 181.

Binding options of PCLs to interfaces are described in Section 10.7.2 "Binding Options of PCLs to Interfaces" on page 200.

Notes
• In VLAN-ID access mode, only VLAN-IDs 0-1023 are supported. The table is accessed by the local port number for VLAN-IDs greater than 1023.
• In VLAN-ID access mode, the packet VLAN-ID assignment is used to access the table. If the VLAN-ID assignment is modified in the first lookup, then the second lookup configuration is based on the new VLAN-ID assignment.

1. "Original" means that if the packet is received on a cascade port, the source interface is taken from the DSA tag and not from the local device's source interface.
Figure 39: Access Procedure to The Policy Configuration Table
[Flowchart. Packet ingress from local port 'n'. The Global Policy Table Access Mode selects one of two branches.
 Branch "Local source port # or Packet VID": the Port<n> Policy Table Access Mode is examined. Port mode: Index <- n + 1024. VLAN mode: if VLAN ID < 1024, Index <- VLAN ID[9:0]; otherwise Index <- n + 1024.
 Branch "{Source device, source port} or Source trunk-ID": if the original source interface is a trunk, Index <- Source Trunk[6:0] + 1024; otherwise Index <- {Source Device[4:0], Source Port[4:0]}.
 Note in figure: CPU port number is 31.]

Figure 40 on page 182 shows the mapping of an interface to the Policy Configuration entry for each global access mode.

In the access mode according to "port configuration", the first 1K entries are used for ports configured for VLAN-ID based PCLs, and the entries with index 1024 and above are used for port-based PCLs.

In the access mode according to "original interface", the first 1K entries are used for packets with non-trunk source interfaces, and the entries with index 1024 and above are used for packets with trunk source interfaces.
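The index computation of Figure 39 can be expressed compactly. The sketch below is a hedged software model of the two global access modes; the function and parameter names are hypothetical, and the bit widths are the ones shown in the figure (VLAN ID[9:0], Source Device[4:0], Source Port[4:0], Source Trunk[6:0]).

```python
TRUNK_OR_PORT_BASE = 1024   # first entry of the upper part of the table
TABLE_ROWS = 1152           # rows in the Policy Configuration table


def index_by_port_configuration(local_port, vid, port_uses_vlan_mode):
    """Global access mode based on the port configuration."""
    if port_uses_vlan_mode and vid < 1024:
        return vid & 0x3FF                    # Index <- VLAN ID[9:0]
    # Port-based PCLs, and VLAN-IDs above 1023, use the local port number.
    return local_port + TRUNK_OR_PORT_BASE    # Index <- n + 1024


def index_by_original_interface(src_dev, src_port, src_trunk=None):
    """Global access mode based on the packet's original source interface."""
    if src_trunk is not None:                 # received on a trunk interface
        return (src_trunk & 0x7F) + TRUNK_OR_PORT_BASE
    # Non-trunk: concatenate {Source Device[4:0], Source Port[4:0]}.
    return ((src_dev & 0x1F) << 5) | (src_port & 0x1F)


cpu_port_entry = index_by_port_configuration(63, 0, False)
last_trunk_entry = index_by_original_interface(0, 0, src_trunk=127)
```

The largest index either function can return is 1151, which is consistent with the 1152-row table size stated in Section 10.3.3.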
Figure 40: Interface Mapping to Policy Table Index

Entry   Access mode according to port          Access mode according to original
Index   configuration:                         interface:
        {local port #} or {packet VID}         {src device, src port} or {src trunk-ID}
0       VID = 0                                {dev=0,port=0}
1       VID = 1                                {dev=0,port=1}
31      VID = 31                               {dev=0,port=31 (CPU)}
32      VID = 32                               {dev=1,port=0}
1023    VID = 1023                             {dev=31,port=31 (CPU)}
1024    port=0                                 NA (trunk-ID=0)
1025    port=1                                 trunk-ID=1
1087    port=63 (CPU)                          trunk-ID=63
1088    NA                                     trunk-ID=64
1151    NA                                     trunk-ID=127

Each Policy Configuration table entry holds an entry for lookup-cycle #0 and an entry for lookup-cycle #1.

Configuration
• To configure the global access mode to the policy table, set the <PCL-ID Mode> bit in the Policy Global Configuration Register (Table 333 p. 586).
• To configure the port access mode to the policy table, set the <VACL PCL-ID AssignMode> bit in Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• The read/write access procedure to the policy configuration table is described in C.13.9.3 "Read and Write Access to the PCL Configuration Table".
10.4 Triggering Policy Engine Processing
This section describes the conditions to enable Policy engine processing.

10.4.1 Packet Eligibility for Policy Processing
DSA-tagged packets whose command is TO_CPU, FROM_CPU, or TO_ANALYZER are not eligible for Policy engine processing.

10.4.2 Enabling Policy Engine Processing
The Policy engine can be globally enabled/disabled. In addition, each port can be independently enabled/disabled for Policy engine processing.

Eligible packets are subject to Policy engine processing if the Policy engine is globally enabled and the Policy engine is enabled on the ingress port.

Configuration
• To enable the Policy engine, clear the <PolicyDisable> bit in the Policy Global Configuration Register (Table 333 p. 586). This is a global enable configuration.
• To enable the Policy engine on a specific port, set the <PolicyEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).

10.4.3 Enabling a Lookup Cycle
Each lookup cycle is associated with a set of lookup cycle parameters set according to the Policy Configuration table lookup cycle entry (Section 10.3.2 "PCL Configuration Table"). Each lookup cycle can be independently enabled or disabled.
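The triggering conditions of Section 10.4 combine into a single predicate. The sketch below is illustrative only: the function name and the string encoding of commands are assumptions, while the bit polarities follow the Configuration notes above (the global <PolicyDisable> bit is cleared to enable, the per-port <PolicyEn> bit is set to enable). It also applies the Section 10.4.1 restriction that only packets whose command assignment is FORWARD are eligible.

```python
# DSA commands that make a packet ineligible (Section 10.4.1).
INELIGIBLE_DSA_COMMANDS = {"TO_CPU", "FROM_CPU", "TO_ANALYZER"}


def policy_processing_triggered(policy_disable_bit, port_policy_en_bit,
                                packet_command, dsa_command=None):
    """Return True if the packet is subject to Policy engine processing.

    dsa_command is None for packets that are not DSA-tagged.
    """
    if dsa_command in INELIGIBLE_DSA_COMMANDS:
        return False                      # not eligible
    if packet_command != "FORWARD":
        return False                      # not eligible
    globally_enabled = policy_disable_bit == 0   # cleared means enabled
    port_enabled = port_policy_en_bit == 1       # set means enabled
    return globally_enabled and port_enabled


triggered = policy_processing_triggered(0, 1, "FORWARD")
```

Whether an individual lookup cycle then runs is a separate decision, taken per cycle from the Policy Configuration table entry.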
Only packets whose command assignment is FORWARD are eligible for Policy engine processing.

Note
If the first cycle lookup results in a command assignment of HARD DROP, the second cycle lookup is skipped.

10.5 Policy Search Keys

10.5.1 Search Key Selection Procedure
The Layer 2+ stackable switches offer six lookup-key combinations and the Multilayer stackable switches offer seven.

A search key combination is selected as a function of the following parameters:
• Search key length, defined by the global or lookup configurations. See Section 10.3.1 "Global Search Memory Modes" and Section 10.3.2 "PCL Configuration Table".
• Packet type, which can be non-IP, IPv4, IPv6, or ARP.
• IP Search Key type, defined by the lookup configuration in Section 10.3.2 "PCL Configuration Table".

The search key selection procedure in the SecureSmart and Layer 2+ stackable switches is illustrated in Figure 41. The search key selection procedure in the Multilayer stackable switches is illustrated in Figure 42.

The search key fields and formats are described in the following sections.
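The selection procedure can be sketched as a small decision function. This is a reading of Table 40 and Figures 41 and 42, not a definitive statement of device behavior: the function name, the parameter names, and the use of key names as strings are assumptions, and the handling of ARP packets alongside IPv4 follows the "IPv4 or ARP" branch of the figures.

```python
PACKET_TYPES = ("non-IP", "ARP", "IPv4", "IPv6")


def select_search_key(rule_type, packet_type,
                      ipv4_key="L2+IPv4/6 QoS",    # or "IPv4+L4"
                      ipv6_key="L2+IPv6",          # or "IPv6+L4"
                      multilayer=False, lookup_cycle=0,
                      ipv6_lookup1_key="L2+IPv4/6 QoS"):  # or "IPv6 DIP"
    """Return the name of the search key generated for a lookup."""
    if packet_type not in PACKET_TYPES:
        raise ValueError("unknown packet type")
    if rule_type == "extended":
        # Extended PCL: only IPv6 packets consult the IPv6 key format.
        return ipv6_key if packet_type == "IPv6" else "L2+IPv4+L4"
    if rule_type != "standard":
        raise ValueError("rule type must be 'standard' or 'extended'")
    if packet_type == "non-IP":
        return "L2"
    if packet_type in ("IPv4", "ARP"):
        return ipv4_key
    # Standard PCL, IPv6 packet: multilayer devices may use a dedicated
    # key in the second lookup cycle (the seventh key).
    if multilayer and lookup_cycle == 1:
        return ipv6_lookup1_key
    return "L2+IPv4/6 QoS"


key = select_search_key("standard", "IPv6", multilayer=True,
                        lookup_cycle=1, ipv6_lookup1_key="IPv6 DIP")
```

Enumerating the return values gives six distinct keys without the multilayer option and seven with it, which matches the counts stated at the start of this section.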
Figure 41: Search Key Selection Procedure for SecureSmart and Layer 2+ Stackable Switches
[Flowchart. The PCL type is examined first.
 Extended rules: packet type "Not IPv6" generates the L2+IPv4+L4 lookup key. Packet type "IPv6" is resolved by the IPv6 Search Key Type: type 0 generates the L2+IPv6 lookup key, type 1 generates the IPv6+L4 lookup key.
 Standard rules: packet type "Non IP" generates the L2 lookup key. Packet type "IPv4 or ARP" is resolved by the IPv4 Search Key Type: type 0 generates the L2+IPv4/6 QoS lookup key, type 1 generates the IPv4+L4 lookup key. Packet type "IPv6" generates the L2+IPv4/6 QoS lookup key.]

Figure 42: Search Key Selection Procedure for Multilayer Stackable Switches
[Flowchart. Same as Figure 41, except for IPv6 packets under Standard rules: in the first lookup cycle the L2+IPv4/6 QoS lookup key is generated; otherwise the IPv6 Lookup1 Key Type selects between the L2+IPv4/6 QoS lookup key and the IPv6 DIP lookup key.]

10.5.2 Search Keys Format
The format of the standard keys and extended keys is outlined in Table 41 and Table 42.
A single rule may match on different rule key types. To enable this property, the keys were designed to maximize the number of overlapping fields.

Table 41: Standard (24-bytes) Key Format

Fields common to all standard key types
Bits      Field
0         Valid
10:1      PCL-ID
16:11     SrcPort
17        IsTagged
29:18     VID
32:30     UP
39:33     QoSProfile
40        IsIPv4
41        IsIP

Key Type: L2
Bits      Field
42        Reserved
58:43     EtherType / DsapSsap
66:59     User Defined Byte2
72:67     Reserved
73        IsARP
74        Reserved
82:75     User Defined Byte0
90:83     User Defined Byte1
91        Encap Type
139:92    MAC SA
187:140   MAC DA
188       User Defined Valid
191:189   Reserved

Key Type: L2+IPv4/v6
Bits      Field
49:42     IpProtocol
55:50     PacketDscp
56        IsL4Valid
64:57     L4 Byte Offset 3
72:65     L4 Byte Offset 2
73        IsARP
74        Ipv6_EH_exist
82:75     User Defined Byte0
90:83     User Defined Byte1
91        IsIPv6 HopByHop
139:92    MAC SA
187:140   MAC DA
188       UserDefine Valid
189       Reserved
190       IPv4 fragmented
191       IP Header OK

Key Type: IPv4+L4
Bits      Field
49:42     IpProtocol
55:50     PacketDscp
56        IsL4Valid
64:57     L4 Byte Offset 3
72:65     L4 Byte Offset 2
73        IsARP
74        IsBC
82:75     User Defined Byte0
90:83     User Defined Byte1
98:91     User Defined Byte2
130:99    SIP[31:0]
162:131   DIP[31:0]
170:163   L4 Byte Offset 13
178:171   L4 Byte Offset 1
186:179   L4 Byte Offset 0
187       Reserved
188       User Define Valid
189       Reserved
190       IPv4 fragmented
191       IP Header OK

Key Type: IPv6 DIP
Bits      Field
49:42     IpProtocol
55:50     PacketDscp
56        IsL4Valid
72:57     DIP[15:0]
73        IsARP
74        Ipv6_EH_exist
90:75     DIP[31:16]
91        IsIPv6HopByHop
187:92    DIP[127:32]
188       Reserved
189       Reserved
190       IPv4 fragmented
191       IP Header OK

Note
The IPv6 DIP key is relevant only for the Multilayer stackable switches. It may be used only in the second PCL lookup, for IPv6 packets, to perform the Longest Prefix Match lookup for the Unicast Routing engine.
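To make the bit layout concrete, the following sketch packs and extracts the fields that occupy bits 41:0 of every standard key in Table 41. It is an illustration under stated assumptions: the key is modelled as a Python integer in which key bit 0 is the least significant bit, and the helper names `pack_fields` and `extract_field` are hypothetical.

```python
# (msb, lsb) positions of the fields shared by all standard keys (Table 41).
COMMON_FIELDS = {
    "Valid": (0, 0),
    "PCL-ID": (10, 1),
    "SrcPort": (16, 11),
    "IsTagged": (17, 17),
    "VID": (29, 18),
    "UP": (32, 30),
    "QoSProfile": (39, 33),
    "IsIPv4": (40, 40),
    "IsIP": (41, 41),
}


def pack_fields(values, layout=COMMON_FIELDS):
    """Pack named field values into an integer key image."""
    key = 0
    for name, value in values.items():
        msb, lsb = layout[name]
        width = msb - lsb + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        key |= value << lsb
    return key


def extract_field(key, name, layout=COMMON_FIELDS):
    """Read one named field back out of a key image."""
    msb, lsb = layout[name]
    return (key >> lsb) & ((1 << (msb - lsb + 1)) - 1)


key = pack_fields({"Valid": 1, "PCL-ID": 5, "SrcPort": 63, "VID": 100})
```

The same two helpers work for any of the per-key layouts in Table 41 or Table 42 if a matching dictionary of bit positions is supplied.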
Table 42: Extended (48-Byte) Key Fields

Fields common to all extended key types
Bits      Field
0         Valid
10:1      PCL-ID
16:11     SrcPort
17        IsTagged
29:18     VID
32:30     UP
39:33     QoSProfile
40        IsIPv6
41        IsIP
49:42     IpProtocol
55:50     PacketDscp
56        IsL4Valid
64:57     L4 Byte Offset 3
72:65     L4 Byte Offset 2
80:73     L4 Byte Offset 13
88:81     L4 Byte Offset 1
96:89     L4 Byte Offset 0
128:97    SIP[31:0]

Key Type: L2+IPv4 + L4
Bits      Field
160:129   DIP[31:0]
161       EncapsulationType
177:162   EtherType / DsapSsap
178       IPv4_fragmented
234:179   Reserved
282:235   MAC SA
330:283   MAC DA
338:331   UserDefinedBytes3
346:339   UserDefinedBytes4
354:347   UserDefinedBytes5
362:355   UserDefinedBytes0
370:363   UserDefinedBytes1
378:371   UserDefinedBytes2
381:379   Reserved
382       UserDefineValid
383       IP HeaderOk

Key Type: L2+IPv6
Bits      Field
224:129   SIP[127:32]
225       Ipv6_EH_exist
226       IsIPv6_EH_HopByHop
234:227   DIP[127:120]
282:235   MAC SA
330:283   MAC DA
338:331   UserDefinedBytes3
346:339   UserDefinedBytes4
354:347   UserDefinedBytes5
362:355   UserDefinedBytes0
370:363   UserDefinedBytes1
378:371   UserDefinedBytes2
381:379   Reserved
382       UserDefineValid
383       IP HeaderOk

Key Type: L4+IPv6
Bits      Field
224:129   SIP[127:32]
225       Ipv6_EH_exist
226       IsIPv6_EH_HopByHop
234:227   DIP[127:120]
322:235   DIP[119:32]
354:323   DIP[31:0]
362:355   UserDefinedBytes0
370:363   UserDefinedBytes1
378:371   UserDefinedBytes2
381:379   Reserved
382       UserDefineValid
383       IP HeaderOk

Configuration
• The write procedure to the Policy TCAM is described in C.13.8.6 "Write Access to the Policy TCAM" on page 600. Table 43 describes how a ternary digit is represented in the search memory.
• The read procedure to the Policy TCAM is described in C.13.8.4 "TCAM Direct Read Access" on page 596.

Table 43: Ternary Digit Representation

Mask Bit  Data Bit  Ternary Representation
0         0         'x'
0         1         NA
1         0         '0'
1         1         '1'

10.5.3 Search Key Fields Description
This section defines the semantics of the search key fields.

10.5.3.1 Internally Generated Fields
Table 44 defines the semantics of the internally generated fields (see Section 10.5.2 "Search Keys Format" on page 185).

Table 44: Internally Generated Fields

Field               Width  Description
General
Valid               1b     Always set to '1', which implies valid. The search key is only compared against "valid" rules in the search memory.
PCL Type            1b     Set according to the global memory search mode and the lookup configuration. If the search memory configuration is Mixed mode, this flag indicates whether the current PCL lookup is based on standard or extended keys.
                           See Section 10.3.1 "Global Search Memory Modes" on page 179 and Section 10.3.2 "PCL Configuration Table" on page 179.
userDefineValid (1b)
  User-defined bytes (UDBs) are valid. Indicates that all user-defined bytes used in the search key were successfully parsed.
  0 = At least one user-defined byte could not be parsed (see "Failure to Extract a User-Defined Byte" on page 195).
  1 = All user-defined bytes used in this search key were successfully parsed.
  When a UDB is extracted from the Layer 4 header of an IPv6 packet that includes extension header(s), the userDefineValid flag and the UDB are set as follows:
  • If the extension header is Hop-by-Hop (HBH), the search key generator assumes that the Layer 4 header starts after the HBH header and attempts to extract the UDB accordingly. If the UDB lies within the first 128B of the packet, userDefineValid is set to VALID.
  • If the extension header is other than HBH, userDefineValid is set to VALID and the UDB is zeroed.

PCL-ID (10b)
  PCL-ID, as extracted from the PCL Configuration Table (Section 10.3.2 "PCL Configuration Table" on page 179).
  NOTE: This field may be modified between lookups if the port is operated in VACL mode and the packet VID is modified in the first lookup.

SrcPort (6b)
  Source Port. The port number from which the packet ingressed the device. 63 is the CPU port.

QosProfile (7b)
  QoS Profile. The QoS Profile field, as assigned by the device according to the QoS marking algorithm (see Section 8. "Quality of Service (QoS)" on page 110).
  NOTE: This field may be modified between lookup cycles, according to the QoS marking command.

L2

IsBC (1b)
  Ethernet Broadcast packet. Indicates an Ethernet Broadcast packet (MAC_DA == FF:FF:FF:FF:FF:FF).
  0 = MAC_DA is not Broadcast.
  1 = MAC_DA is Broadcast.
  NOTE: IPv6 does not send packets using the Broadcast address.

VID (12b)
  VLAN-ID. The VLAN-ID field, as assigned according to the VLAN assignment algorithm specified in Section 11.2.2 "VLAN Assignment Mechanisms" on page 206. This field may be modified between lookups if the packet VID is modified by the first lookup.

IsTagged (1b)
  Packet is identified as "tagged". For a definition of "tagged", see Section 11.2.1.1 "Identifying Tagged Packets" on page 204.
  0 = Packet is identified as untagged.
  1 = Packet is identified as tagged (either VLAN-tagged or Priority-tagged).

EncapsulationType (1b)
  Determines whether the EtherType/DsapSsap field contains an EtherType or an LLC DSAP-SSAP.
  0 = EtherType/DsapSsap field contains LLC DSAP-SSAP.
  1 = EtherType/DsapSsap field contains EtherType.

EtherType/DsapSsap (16b)
  Holds the innermost EtherType field in the Ethernet header. The value of this field depends on the Ethernet encapsulation:
  • Ethernet V2 - EtherType field, copied from the packet header.
  • 802.3 RAW - constant 0x8138 (IPX).
  • 802.3 (LLC) - {DSAP, SSAP} fields, copied from the packet header.
  • 802.3 SNAP - SNAP TYPE field, copied from the packet header.
IsIPv4 (1b)
  Identifies an IPv4 packet. IPv4 packets are identified by the EtherType field == 0x0800.
  0 = Non-IPv4 packet.
  1 = IPv4 packet.
  NOTE: IP packets are correctly identified in Ethernet V2 and 802.3-SNAP encapsulations. All packets encapsulated in 802.3-LLC (non-SNAP) are considered non-IP packets by the device.

IsIPv6 (1b)
  Identifies an IPv6 packet. IPv6 packets are identified by the EtherType field == 0x86DD.
  0 = Non-IPv6 packet.
  1 = IPv6 packet.

IsIP (1b)
  Identifies an IPv4/6 packet.
  0 = Non-IP packet.
  1 = IPv4/6 packet.

IsARP (1b)
  Identifies an ARP packet. An ARP packet is identified by EtherType == 0x0806.
  0 = Non-ARP packet.
  1 = ARP packet.

L3

IP Header OK (1b)
  Indicates a valid IP header.
  0 = Packet IP header is invalid.
  1 = Packet IP header is valid.
  A valid IPv4 header must meet all of the following conditions:
  • The header checksum is correct.
  • Version == 4.
  • IHL >= 5 (a header of at least 20B).
  • IHL*4 <= <IP total length>.
  • <Total packet byte count> - (L2 header + trailer) >= IP length.
  A valid IPv6 header must meet all of the following conditions:
  • Version == 6.
  • <Total packet byte count> - (L2 header + trailer) >= IP length + 40B.
  NOTE: For non-IP packets the device sets this flag to a random value.
  To configure the packet length check criteria, set the <IPLength CheckMode> field of the Policy Global Configuration Register (Table 333 p. 586).

IPv4_fragmented (1b)
  Identifies an IPv4 fragment.
  0 = Not an IPv4 packet, or not an IPv4 fragment.
  1 = Packet is an IPv4 fragment.

IsL4Valid (1b)
  Layer 4 information is valid. This field indicates that all the Layer 4 information required for the search key is available and the IP header is valid. Layer 4 information is defined in Table 45.
  0 = Layer 4 information is not valid. Layer 4 information may be unavailable for any of the following reasons:
  • Layer 4 information is not included in the packet. For example, Layer 4 information is not available in non-IP packets or in IPv4 non-initial fragments.
  • Parsing failure: Layer 4 information lies beyond the first 128B of the packet, or beyond the IPv6 extension header parsing capabilities.
  • The IP header is invalid.
  1 = Layer 4 information is valid.
  A set flag indicates that the IP header is valid and any of the following conditions is true:
  • Packet is IPv4 and not fragmented, or is an initial IPv4 fragment, and Layer 4 information has been successfully parsed.
  • Packet is IPv6, and ((the IPv6<OriginalNextHeader> field holds a valid Layer 4 protocol) or (IPv6<OriginalNextHeader> == HOP_BY_HOP header && IPv6_HopByHopHeader<NextHeader> holds a valid Layer 4 protocol)), and Layer 4 information has been successfully parsed.
  NOTE: Whenever the search key contains a Layer 4 field, it must also contain an IsL4Valid field.

Ipv6_EH_exist (1b)
  IPv6 extension header exists. Indicates that an IPv6 extension header exists.
  0 = Non-IPv6 packet, or IPv6 extension header does not exist.
  1 = Packet is IPv6 and an extension header exists.

IsIPv6_EH_HopByHop (1b)
  IPv6 original extension header is hop-by-hop. Indicates that the IPv6 Original Extension Header is hop-by-hop.
  0 = Non-IPv6 packet, or IPv6 extension header type is not Hop-by-Hop Option Header.
  1 = Packet is IPv6 and extension header type is Hop-by-Hop Option Header.

IpProtocol (8b)
  Layer 4 protocol. Indicates the Layer 4 IP protocol/Extension Header type. This field is valid for IPv4/6 packets only.
  For IPv4 packets this field holds the Layer 4 IP protocol.
  For IPv6 packets this field takes the Original Extension Header or the Next Header, according to the following logic:
    if (originalNextHeader == HOP_BY_HOP)
        ipProtocol <- nextHeader;
    else
        ipProtocol <- originalNextHeader;
10.5.3.2 Key Fields Directly Extracted From the Packet
Table 45 defines the semantics of the key fields that are directly extracted from the packet.

Table 45: Key Fields Directly Extracted from the Packet
Each entry gives the field name, its width in parentheses, and its description.

Layer 2 Information

Packet User Priority (3b)
  802.1p User Priority field extracted from the packet. If the packet header does not include the User Priority field (i.e., the packet is neither VLAN-tagged, Priority-tagged, nor DSA-tagged), this field is set to the port default user priority in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567). If the packet includes multiple tags, this field is taken from the outer tag.

MAC_DA (48b)
  Ethernet Destination MAC address.

MAC_SA (48b)
  Ethernet Source MAC address.

Layer 3 Information
NOTE: Layer 3 information is valid only for IP/ARP packets. The device zero-pads L3 information for non-IP packets.

Packet DSCP (6b)
  DSCP field directly extracted from the IPv4/6 header.

DIP (4 or 16B)
  IPv4/6 destination IP field. For ARP packets this field holds the target IPv4 address.

SIP (4 or 16B)
  IPv4/6 source IP field. For ARP packets this field holds the sender's IPv4 address.
Layer 4 Information
NOTE: Layer 4 information is valid only for IP packets.

UDP Data Offsets (4B)
  The following Layer 4 information is available for UDP packets:
  • Offsets 0–3: source/destination ports.

TCP Data Offsets (5B)
  The following Layer 4 information is available for TCP packets:
  • Offsets 0–3: source/destination ports.
  • Offset 13: TCP flags.

IGMP Data Offsets (1B)
  The following Layer 4 information is available for IGMP packets:
  • Offset 0: IGMP Type.

ICMP Data Offsets (2B)
  The following Layer 4 information is available for ICMP packets:
  • Offsets 0–1: ICMP Type and Code fields.

Other L4 Data Offsets (4B)
  The following Layer 4 information is available:
  • Offsets 0–3.

10.5.3.3 User-Defined Bytes

Overview
User-defined bytes allow the application to classify packets based on header fields that are not predefined. Every search key incorporates an independent set of user-defined bytes.
Extracting a Layer 4 UDB from an IPv6 packet with extension header(s) yields the following results:
• If the extension header is other than HBH, userDefineValid is set to VALID and the UDB is zeroed.
• If the extension header is HBH, a Layer 4 header is assumed to be located right after the first extension header. userDefineValid is set to VALID if the UDB lies within the first 128 bytes of the packet.
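The two IPv6 extension-header cases above can be sketched as a small model. This is an illustration only, not device code: the function name, the `ext_header` labels, and the `assumed_l4_start` parameter are hypothetical, and treating a UDB that falls outside the first 128 bytes as a failed extraction (flag cleared, byte zeroed) is an assumption based on the failure handling described for user-defined bytes.

```python
PARSE_WINDOW = 128  # the search key generator parses only the first 128 bytes


def l4_udb_from_ipv6(packet: bytes, ext_header: str,
                     assumed_l4_start: int, udb_offset: int):
    """Return (udb_value, user_define_valid) for a UDB anchored at the L4 header.

    ext_header: "hbh" if the first extension header is Hop-by-Hop,
                "other" for any other extension header (hypothetical labels).
    assumed_l4_start: byte index just after the first extension header,
                      where the L4 header is assumed to begin.
    """
    if ext_header == "other":
        # Non-HBH extension header: flag is VALID and the byte is zeroed.
        return 0, True
    if ext_header != "hbh":
        raise ValueError("ext_header must be 'hbh' or 'other'")

    position = assumed_l4_start + udb_offset
    if position < min(len(packet), PARSE_WINDOW):
        return packet[position], True

    # Assumption: a UDB beyond the parse window counts as an extraction failure.
    return 0, False


packet = bytes(range(200))
udb_other = l4_udb_from_ipv6(packet, "other", 62, 0)
udb_hbh_in_window = l4_udb_from_ipv6(packet, "hbh", 62, 2)
udb_hbh_out_of_window = l4_udb_from_ipv6(packet, "hbh", 62, 70)
```

Note that in the non-HBH case the flag reads VALID even though the byte carries no packet data, so a rule that depends on such a UDB should also qualify on Ipv6_EH_exist and IsIPv6_EH_HopByHop.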
Table 46 shows the number of user-defined bytes per search key.

Table 46: Number of User-Defined Bytes Per Key

Search Key Name                     # of User-Defined Bytes
Standard search keys (24 bytes)
  L2                                3
  L2+IPv4/6 QoS                     2
  IPv4+L4                           3
  IPv6 DIP                          0
Extended search keys (48 bytes)
  L2+IPv4+L4                        6
  L2+IPv6                           6
  IPv6+L4                           3

NOTE: This key is relevant for the Multilayer stackable switches only.

A user-defined byte is set using the following parameters:
Anchor  Allows the application to define a byte relative to the beginning of a specific layer. There are four anchor types: Start of packet, Start of L3 header, Start of L4 header, and Start of IPv6 Extension header.
Offset  Defines the position of the byte relative to the anchor.

Note
Only UserDefinedByte0 and UserDefinedByte1 support the "Start of Packet" anchor type. The other user-defined bytes cannot use the "Start of Packet" anchor type.

Invalidating Unused User-Defined Bytes
An unused user-defined byte must be configured in a way that guarantees successful parsing. For example, the default configuration (Start of Layer 3 header, offset 0) guarantees successful parsing. The application must mask unused bytes in every rule.

Failure to Extract a User-Defined Byte
When the search key generator fails to extract a user-defined byte, the Policy engine can additionally be globally configured to take one of the following actions:
• Continue the search. Let the application handle the parsing error.
• TRAP the packet to CPU, with a CPU code equal to 'INVALID_USER_DEFINED_BYTES'.
• Skip the search and mark the packet command to SOFT DROP.
• Skip the search and mark the packet command to HARD DROP.

Configuration
• To define a user-defined byte, see C.13.11 "Lookup Key User Defined Bytes" on page 606.
• To define the action taken when the search key generator fails to extract a user-defined byte, set the <InValid UserDefined KeyBytesCmd> field of the Policy Global Configuration Register (Table 333 p. 586) accordingly.

10.6 Policy Actions
This section describes the Actions that may be assigned to a matching packet.
For each matching policy lookup, the device accesses the Policy Action table using the index reported by the lookup TCAM and performs the Actions in that entry.
The Actions performed on the packet are taken from the matching Policy Action entry and may be one or more of the following:
• Packet command: Forward, Mirror to CPU, Trap to CPU, Hard Drop, or Soft Drop. Packets forwarded to the CPU due to a match are assigned a CPU code from the Policy Action Entry.
• Mirroring the packet to the Ingress Analyzer port.
• Redirecting the packet to an Egress interface (port, trunk, or Multicast group), bypassing the Bridge engine.
• VLAN assignment to both tagged and untagged packets.
• QoS assignment.
• Binding a flow to a Policer.
• Binding a flow to one of the 32 Policy match counters.
• In the Multilayer stackable switches, the Policy Action Entry is regarded as a Next Hop entry for the Unicast Routing engine. When the Entry is regarded as a Next Hop entry, it contains relevant Next Hop fields (e.g., ARP Pointer). For further details see Section 12. "IPv4 and IPv6 Unicast Routing" on page 265.

10.6.1 Policy Action Table
The Policy Action table is a 1024-entry table (256 entries in the SecureSmart devices). Each entry contains the Action to be performed on a flow matching the corresponding Policy TCAM rule.
Table 47 outlines the fields of the Policy Action Entry. For the entry format, see Table 334: "Policy Action Entry<n> (0<=n<1024)" on page 589.

Note
In the Multilayer stackable switches, when the Policy Action Entry is regarded as a Next Hop Route Entry for the Unicast Routing engine, it contains different fields from those specified in Table 47 and in Table 334: "Policy Action Entry<n> (0<=n<1024)" on page 589. For the fields of a Policy Action Entry as a Next Hop Route entry, see Table 64, "Policy Action Entry As a Route Entry," on page 269. For the entry format of the Policy Action Entry used as a Next Hop Route entry, see Table 335: "Policy Action Entry as a Next Hop Entry<n> Register (0<=n<1024)" on page 593.

Note (Section 10.5.3.3, "Failure to Extract a User-Defined Byte")
When the search key generator fails to extract a user-defined byte, <userDefineValid> in Internally Generated Fields (Table 44 p. 188) is cleared in the search key and the byte is zeroed.

Table 47: Policy Action Entry
Each entry gives the field name followed by its description.

Packet Command
  The entry forwarding command.
  0 = Forward
  1 = Mirror-to-CPU
  2 = Trap
  3 = Hard Drop
  4 = Soft Drop
  5–7 = Reserved

CPU_CODE
  Only relevant if <Packet Command> is Mirror-to-CPU or Trap.
  The CPU code assigned to packets Mirrored to CPU or Trapped to CPU due to a match in this entry.
  To avoid conflict with other CPU codes assigned by the device, it is recommended to use a CPU code from the User-Defined CPU code range (see Appendix B. "CPU Codes" on page 343).

Mirror To Ingress Analyzer Port
  Enables mirroring the packet to the ingress analyzer port.
  0 = Packet is not mirrored to the ingress analyzer port.
  1 = Packet is mirrored to the ingress analyzer port.

Redirect Command
  Enables redirection of the matching flow packets to an egress interface (a port, trunk, or Multicast group) or, in the Multilayer stackable switches, causes this Policy Action Entry to be regarded as a Next Hop Route Entry used by the Unicast Routing engine.
  0 = Do not redirect.
  1 = Redirect the packet to the Egress interface specified in <PCE Egress Interface>, bypassing the Layer 2 Bridge engine bridging decision.
  2 = Reserved.
  3 = In the Multilayer stackable switches, regard the Policy Action Entry as a Next Hop Route Entry.
  When the value of this field is 3, the Policy Action Entry is regarded as a Next Hop Route Entry with a format different from that specified in this table. For the entry format of the Policy Action Entry used as a Next Hop Route entry, see Table 335: "Policy Action Entry as a Next Hop Entry<n> Register (0<=n<1024)" on page 593.

Egress Interface
  Only relevant when <PCERedirect Cmd> = 1.
  The Egress Interface (Port, Trunk, or Multicast Group) through which the packet is forwarded.

VLAN Command
  Relevant only if the VID precedence set by the previous VID assignment mechanisms (Port, Protocol-based VLANs, and previous matching rule) is Soft.
  0 = Don't modify the VID assigned to the packet so far.
  1 = <Policy VID> is assigned to untagged packets or priority-tagged packets.
  2 = <Policy VID> is assigned to tagged packets.
  3 = <Policy VID> is assigned to all packets.

Policy VID
  Only relevant if the VID precedence set by the previous VID assignment mechanisms (Port, Protocol-based VLANs, and previous matching rule) is Soft, <PCE VLANCmd> is not 0, and the packet's tag format matches this command.
  The VLAN-ID set by this entry.

VID Precedence
  The VLAN assignment precedence for the subsequent VLAN assignment mechanism, which is the next policy-pass rule of the Policy engine.
  Only relevant if the VID precedence set by the previous VID assignment mechanisms (Port, Protocol-based VLANs, and previous matching rule) is Soft.
  0 = Soft precedence: The VID assignment can be overridden by the subsequent VLAN assignment mechanism, which is the Policy engine.
  1 = Hard precedence: The VID assignment is locked to the last VLAN assigned to the packet and cannot be overridden.

EnNestedVLAN
  When this field is set to 1, the flow matching this rule is defined as an access flow. The VID of every packet received on this flow is discarded, and the packet is assigned a VID using the device's VID assignment algorithms, as if it were untagged. When a packet received on an access flow is transmitted via a core port or a Cascading port, a VLAN tag is added to the packet (on top of the existing VLAN tag, if any). The VID field of the added tag is the VID assigned to the packet as a result of all VLAN assignment algorithms. The 802.1p User Priority field of the added tag depends on the ModifyUP QoS parameter set for the packet at the end of the Ingress pipe:
  • If ModifyUP is 1, it is the UP extracted from the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696).
  • If ModifyUP is 0, it is the original packet 802.1p User Priority field if the packet is tagged, and <PUP> if the original packet is untagged.
Modify UP
  Enables the modification of the packet's 802.1p User Priority field.
  Only relevant if the QoS precedence of the previous QoS assignment mechanisms (Port, Protocol-based QoS, and previous matching rule) is Soft.
  Applies to packets that are transmitted as VLAN-tagged or transmitted via Cascading ports with a DSA tag in FORWARD format.
  When this field is set to 1 and the subsequent QoS assignment ingress pipeline engines have not cleared this setting, the packet's 802.1p User Priority field is set, when the packet is transmitted, to the UP mapped from the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696). This applies regardless of the incoming packet tag format (tagged or untagged).
  0 = Keep previous ModifyUP setting.
  1 = Set ModifyUP to 1.
  2 = Set ModifyUP to 0.
  3 = Reserved.

Modify DSCP
  Enables the modification of the packet's DSCP field.
  Only relevant if the QoS precedence of the previous QoS assignment mechanisms (Port, Protocol-based QoS, and previous matching rule) is Soft.
  Relevant for IPv4/IPv6 packets only.
  When this field is set to 1 and the subsequent QoS assignment ingress pipeline engines have not cleared this setting, the packet's DSCP field is modified, when the packet is transmitted, to the DSCP mapped from the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696).
  0 = Keep previous DSCP modification command.
  1 = Enable modification of the DSCP field in the packet.
  2 = Disable modification of the DSCP field in the packet.
  3 = Reserved.

QoS Profile Marking En
  Relevant only if the QoS precedence of the previous QoS assignment mechanisms (Port, Protocol-based QoS, and previous matching rule) is Soft.
  0 = Preserve previous QoSProfile setting.
  1 = Assign <QoS Profile> to the packet.

QoS Profile
  Only relevant if the QoS precedence of the previous QoS assignment mechanisms (Port, Protocol-based QoS, and previous matching rule) is Soft and <PCEQoS Profile MarkingEn> is set to 1.
  The QoSProfile is used to index the QoSProfile to QoS Table Entry<n> (0<=n<72) (Table 434 p. 696) and map the QoS parameters for the packet, which are TC, DP, UP, and DSCP.
  Valid range: 0 through 71.

QoS Precedence
  PCE marking of the QoSProfile precedence.
  Only relevant if the QoS precedence of the previous QoS assignment mechanisms (Port, Protocol-based QoS, and previous matching rule) is Soft.
  Setting this bit locks the QoS parameter settings against modification by subsequent QoS assignment engines in the ingress pipe.
  0 = QoS precedence is soft. The packet's QoS parameters may be overridden by subsequent QoS assignment engines.
  1 = QoS precedence is hard. The packet's QoS parameter settings made up to this stage are locked and cannot be overridden by subsequent QoS assignment engines.
  The QoS precedence is not relevant in the policing engine. Non-conforming traffic is remarked according to the policing engine setting, regardless of its QoS precedence.

Match Counter Enable
  Enables the binding of this Policy Action Entry to the Policy Rule Match Counter<n> (0<=n<32) (Table 347 p. 606) indexed by <Match CounterIndex>.
  0 = Match counter binding is disabled.
  1 = Match counter binding is enabled.

Match Counter Index
  Valid if <Match Counter Enable> is set to 1.
  A pointer to one of the 32 policy rule match counters, Policy Rule Match Counter<n> (0<=n<32) (Table 347 p. 606).
  The counter is incremented for every packet satisfying both of the following conditions:
  • The packet matches this rule.
  • The previous packet command is not Hard Drop.

PolicerEn
  When set to 1, this rule is bound to the policer indexed by <PolicerIndex>.
  0 = Don't use policer.
  1 = Use policer.

PolicerIndex
  Only valid if <PolicerEn> = 1.
  The index of the Policer bound to this flow.
  Valid range: 0–255.

Configuration
The read/write access procedure to the Action memory is described in C.13.8.7 "Read and Write Access to the Action Table" on page 601.

10.6.2 Applying the Action on the Packet
The packet information includes the following packet processing parameters:
• Packet command and associated CPU code (if the packet command is TRAP or MIRROR)
• Ingress mirroring command
• QoS information
• Traffic Policing information
• Forwarding information
• VLAN information
An action is applied to a packet by modifying the packet information. In some cases, packet information is modified as a function of both the action and the current packet information. Section 10.6.2.1 "Packet Command Modification" and the sections that follow it define how packet information is modified after every lookup cycle.
Note
If a lookup cycle is skipped, packet information is not modified in that cycle.

10.6.2.1 Packet Command Modification
The packet command is modified as specified in Section 5.2 "Command Resolution Matrix" on page 53.
If the packet already carries a TRAP or MIRROR command and the action assigns a TRAP or MIRROR command, but the two commands differ, the packet is always assigned the TRAP command and the CPU code associated with the TRAP command.
However, if the packet and action commands are both TRAP or both MIRROR, a global configuration selects either the first action CPU code or the second action CPU code.

Configuration
To configure which CPU code is chosen, set the <CPUCode Precedence> bit in the Policy Global Configuration Register (Table 333 p. 586). This configuration is in effect only if the packet commands in the action and in the packet information are equal.

10.6.2.2 Ingress Mirroring Command
The packet ingress mirroring command can be enabled by any lookup cycle, but cannot be disabled by the Policy engine.

10.6.2.3 QoS Marking Command
A later packet QoS marking command overrides earlier QoS settings, unless the QoS Precedence flag in the packet information has been set to HARD by an earlier QoS marker.

10.6.2.4 Traffic Policing Command
Traffic Policing can be enabled at any lookup cycle, but cannot be disabled by the Policy engine. If traffic policing is enabled by the second lookup, the second action selects the policer.
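The TRAP/MIRROR rule of Section 10.6.2.1 can be sketched as follows. This is a minimal model covering only the case in which both the packet and the action carry a CPU-bound command; the full resolution matrix of Section 5.2 is not modelled. The function name and the boolean `prefer_second_code` are hypothetical, and the mapping between that boolean and the polarity of the <CPUCode Precedence> bit is an assumption.

```python
CPU_BOUND = {"TRAP", "MIRROR"}


def merge_cpu_command(packet_cmd, packet_code, action_cmd, action_code,
                      prefer_second_code):
    """Combine the packet's current command/CPU code with a matching action.

    packet_cmd / packet_code: command and CPU code already in the packet
        information (for example, from the first lookup).
    action_cmd / action_code: command and CPU code of the matching action.
    prefer_second_code: hypothetical stand-in for <CPUCode Precedence>.
    """
    if packet_cmd not in CPU_BOUND or action_cmd not in CPU_BOUND:
        raise ValueError("only the TRAP/MIRROR cases are modelled here")

    if packet_cmd != action_cmd:
        # Commands differ: TRAP always wins, with the CPU code that came
        # with the TRAP command.
        code = packet_code if packet_cmd == "TRAP" else action_code
        return "TRAP", code

    # Commands are equal: the global precedence setting picks the CPU code.
    return packet_cmd, (action_code if prefer_second_code else packet_code)


mirror_then_trap = merge_cpu_command("MIRROR", 10, "TRAP", 20, False)
trap_then_mirror = merge_cpu_command("TRAP", 10, "MIRROR", 20, True)
trap_twice = merge_cpu_command("TRAP", 10, "TRAP", 20, True)
mirror_twice = merge_cpu_command("MIRROR", 10, "MIRROR", 20, False)
```

The precedence setting has no effect in the first two calls, which mirrors the statement that the configuration applies only when the two commands are equal.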
10.6.2.5 Redirect Forwarding Command
The Redirect Forwarding command can be enabled at any lookup cycle, but it cannot be disabled. If Redirect Forwarding is enabled by the second lookup, the second action sets the packet destination. Only the second lookup can redirect to a next hop.

10.6.2.6 VLAN ID Marking Command
A later VID marking command overrides earlier VID settings, unless the VID Precedence flag in the packet information has been set to HARD by an earlier VID marker.

10.7 Applications
This section describes specific aspects of PCL implementation using the Policy engine.

10.7.1 Dividing Application PCLs Between Lookup Cycles
An application-specific PCL provides a single service. For example, a Security PCL provides a packet filtering service.
Implementation of application-specific PCLs requires a dedicated lookup per application.
The following example demonstrates that application-specific PCLs cannot share the same lookup. If two PCLs, A and B, share the same lookup cycle, and A is located before B in the search memory, then a match in A skips B. As a result, application B is not processed.
Table 48 shows two alternative ways to divide applications between the lookups.

Table 48: Dividing Applications Between Lookup Cycles

Example #   Lookup #1                                 Lookup #2
1           Security                                  QoS & traffic policing
2           Policy-based VLAN Assignment & Security   QoS & traffic policing & Policy-based forwarding

10.7.2 Binding Options of PCLs to Interfaces
PCLs are activated by binding them to ingress interfaces. Using the PCL Configuration Table (Section 10.3.2 "PCL Configuration Table" on page 179), PCLs can be bound to the following types of interfaces:
• Physical ports
• VLANs
• Trunk groups
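The lookup-sharing problem described in Section 10.7.1 follows from each lookup reporting only the first matching rule. The sketch below illustrates this with a simplified search memory, modelled as an ordered list of (name, data, mask) rules over an integer key. It assumes that a mask bit of 1 means "compare this bit" and a mask bit of 0 means "don't care", following Table 43. The rule names, key width, and key values are hypothetical.

```python
def first_match(key, rules):
    """Return (index, name) of the first rule matching key, or None.

    Each rule is (name, data, mask); bits where mask is 0 are don't-care.
    """
    for index, (name, data, mask) in enumerate(rules):
        if (key & mask) == (data & mask):
            return index, name
    return None


# Two application PCLs placed in the same lookup: A sits before B.
rules = [
    ("A: security permit", 0x0A00, 0xFF00),  # matches any key 0x0Axx
    ("B: QoS marking",     0x0A05, 0xFFFF),  # matches exactly 0x0A05
]

# The key 0x0A05 satisfies both rules, but only A's index is reported,
# so B's action is never applied.
shared_lookup_result = first_match(0x0A05, rules)
no_match_result = first_match(0x0B05, rules)
```

Placing A and B in different lookup cycles, as in Table 48, lets each PCL report its own match and apply its own action.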
10.7.2.1 Binding Capabilities in Homogeneous Systems
In homogeneous systems all devices incorporate Policy engines. In such systems, every device is capable of applying policy services on its ingress network ports.
In homogeneous systems PCLs can be bound to the following interfaces:
• Port-based and VLAN-based PCLs are supported concurrently. Every port (including the CPU port) can be independently configured to operate in port-based or VLAN-based PCL mode.
• VLAN-based PCLs are supported for VLAN-IDs 0-1023. VLAN-IDs greater than 1023 are processed by a port-based PCL.
• Trunk PCLs are supported by binding the same PCL to all physical ports in a trunk group.

Configuration
• Set the <PCL-ID Mode> bit in the Policy Global Configuration Register (Table 333 p. 586).
• To configure port 'n' to operate in Port-PCL mode, clear the <VACL PCL-ID AssignMode> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567). To operate the port in VLAN-PCL mode, set the <VACL PCL-ID AssignMode> bit.

10.7.2.2 Binding Capabilities in Heterogeneous Systems
In heterogeneous systems not all devices incorporate Policy engines. In such systems, the devices in these families provide policy services to policy-incapable devices. The policy-incapable devices must be able to communicate with the devices in these families using DSA-tagged packets.
For example, a system built from devices in these families and 98DX16x-16xR-24x devices belongs to this category. In heterogeneous systems PCLs can bound to the following interfaces: • The Policy engine can service remote devices in either port-based or VLAN-based PCL modes, but not concurrently. The binding mode is a global configuration. Port-based PCLs are supported for up to 1024 remote ports residing on up to 31 devices, each having at most 32 ports. VLAN-PCLs are supported for VLAN-IDs 0-1023. VLAN-IDs greater than 1023 are processed by a local portbased PCL. • Up to 127 remote trunks groups are supported in port-based PCL mode. (The SecureSmart devices support the following: 98DX163, 98DX243, and 98DX262 support 32 trunk groups and the 98DX106 supports 8 trunk groups.) • Local network ports can be operated in the following modes: If the global access mode to the Policy Configuration table is set to lookup by "local port configuration", each local port can be configured to either VLAN or port-based PCLs. If the global access mode to the Policy Configuration table is set to lookup by "remote device/port", local mode is enabled, local ports can operate only in port-based PCL mode. 10.7.3 Implementation of Subnet-Based VLAN Marking Subnet-based VLAN assignment can be directly implemented using the Policy engine. The Policy engine allows examination of the IP addresses carried in ARP messages. The application can use this capability to assign ARP packets to the correct VLAN based on the source IP address field inside the ARP data. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 201 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 For example, a system built from devices in these three families belongs to this category. 
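The subnet-based assignment described in Section 10.7.3 can be illustrated with a small software model. This is only a sketch of the decision the application wants the Policy rules to express, not the device's rule format: the table SUBNET_TO_VID, its subnets and VLAN-IDs, and the helper names are all hypothetical. The ARP field offsets follow the standard IPv4-over-Ethernet ARP layout.

```python
import ipaddress
import struct

# Hypothetical subnet-to-VID table; the subnets and VLAN-IDs are illustrative only.
SUBNET_TO_VID = [
    (ipaddress.ip_network("10.1.0.0/16"), 100),
    (ipaddress.ip_network("10.2.0.0/16"), 200),
]


def arp_sender_ip(arp_payload: bytes) -> ipaddress.IPv4Address:
    """Return the sender IP address of an IPv4-over-Ethernet ARP message."""
    if len(arp_payload) < 28:
        raise ValueError("ARP payload too short")
    htype, ptype, hlen, plen, _oper = struct.unpack("!HHBBH", arp_payload[:8])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an IPv4-over-Ethernet ARP message")
    # Bytes 8..13 hold the sender MAC address, bytes 14..17 the sender IP address.
    return ipaddress.IPv4Address(arp_payload[14:18])


def vid_for_arp(arp_payload: bytes, default_vid: int) -> int:
    """Pick the VID of the first subnet that contains the ARP sender IP."""
    sender = arp_sender_ip(arp_payload)
    for subnet, vid in SUBNET_TO_VID:
        if sender in subnet:
            return vid
    return default_vid  # no subnet matched: keep the previous assignment


def make_arp_request(sender_ip: str, target_ip: str) -> bytes:
    """Build a minimal ARP request payload for the usage example."""
    header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    return (header + bytes(6) + ipaddress.IPv4Address(sender_ip).packed
            + bytes(6) + ipaddress.IPv4Address(target_ip).packed)
```

For example, vid_for_arp(make_arp_request("10.2.3.4", "10.2.0.1"), default_vid=1) selects VID 200 under the illustrative table, while a sender outside both subnets keeps the default VID.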
10.7.4 Applying Policy on Packets from the CPU

Packets sent by the CPU with a FORWARD DSA tag are processed by the Policy engine. PCLs can be bound to the ingress CPU port. The local CPU occupies port #63.

If the CPU is connected to several devices and has a PCL assigned to it, the PCL must be duplicated across devices.

Section 11. Bridge Engine

This section describes the device’s ingress and egress bridge features.

11.1 Bypassing Bridge Engine

As part of the ingress pipeline, the bridge engine processes packets in sequence after the Policy engine.

Packets bypass the bridge engine processing if any of the following conditions is true:
• Packet is received on a port configured with <Bypass Bridge> enabled. This configuration is recommended on cascade ports attached to other Prestera devices that support the extended DSA tag, as these packets already have the bridge forwarding decision recorded in their DSA tag (Section 4.7 "Cascading" on page 51).
• Packet is received on a cascade port with a DSA tag command that is not FORWARD (Section 4.6 "DSA Tag" on page 48).
• Packet is policy-based switched to a target destination.
• Packet is assigned a packet command of HARD DROP prior to the bridge engine. Such a packet bypasses the bridge engine and all other engines in the ingress pipeline.

Note
If the Bridge engine is bypassed for reasons other than the packet command being set to HARD DROP or the packet being received with a DSA tag command not equal to FORWARD, the Bridge engine FDB Source MAC Address lookup is still performed (Section 11.4.7 "FDB Source MAC Learning" on page 231). However, if the MAC SA FDB entry is found, the entry does not affect the packet forwarding in any way.

Configuration
To enable/disable the bridge-bypass mode on a port, set the <Bridge BypassEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).

11.2 VLANs

IEEE 802.1Q Virtual LANs (VLANs) are the standard mechanism used today to partition a single Layer 2 bridging domain into multiple, separate, independent bridging domains. Conceptually, each VLAN constitutes a bridged network.

VLANs provide many useful features such as:
• Better utilization of bandwidth by restricting the flooding of Broadcast, Multicast, and unknown Unicast packets to relevant ports only.
• Natural mechanism to separate traffic with different protocols, e.g., IPX, IPv4, IPv6, or departmental traffic (e.g., finance, R&D, etc.).
• Increased security by preventing access between VLANs. For example, networks with sensitive information such as marketing and finance can be separated from the rest of the network.
• Segregation of customer traffic for MAN and access networks.

The device is fully compliant with the IEEE 802.1Q VLAN standard. The full range of 4K VLAN-IDs (VIDs) with 4K active VLAN groups is supported. In addition, the device can function as a conventional VLAN-unaware bridge (Section 11.2.5 "VLAN-Unaware Mode").

Notes
• VLAN port membership does not include the CPU port. For traffic to be sent to the CPU, it must be explicitly trapped or mirrored to the CPU, or Unicast bridged to the CPU port 63.
• For trapping/mirroring known Unicast or registered Multicast traffic, see Section 11.4.4 "FDB Entry Command".
• For trapping/mirroring control traffic to the CPU, see Section 11.8 "Control Traffic Trapping/Mirroring to the CPU" on page 244.
• For trapping/mirroring flooded traffic to the CPU, see Section 11.11 "Unknown and Unregistered Packet Filtering" on page 253.

This section describes the device’s VLAN support and related features.

11.2.1 VLAN Tag Rx and Tx

The IEEE 802.1Q and IEEE 802.3ac standards define a VLAN frame format that carries VLAN identification and User Priority information over Ethernet. This information is carried in an additional header field, known as the Tag Header.

11.2.1.1 Identifying Tagged Packets

Packets received on a port are identified as either tagged or untagged. Tagged packets are further identified as VLAN tagged or Priority-tagged. VLAN tagged packets have a VLAN-ID (VID) that is not the NULL VID 0. Priority-tagged packets have a VLAN-ID that is the NULL VID 0.

These identifications are used by the VID assignment mechanisms and the egress tag modification mechanism.
The identification criteria differ for packets received on a cascade port and packets received on a network port.

Packets Received on a Network Port

Packets received on a network port may be VLAN tagged, Priority-tagged, or untagged. (DSA-tagged packets cannot be correctly interpreted when received on a port that is not configured as a cascade port.)

Tagged packets received on a network port are identified by examining the packet EtherType value. The device supports two globally configurable ingress VLAN EtherType values. Each port is configured to select one of the ingress VLAN EtherTypes to be used for determining whether the packet is tagged.

A packet is considered tagged if its EtherType matches the ingress VLAN EtherType selected for the port; otherwise it is considered untagged. (Nested VLAN access ports are an exception to this rule, as described in Section 11.2.3 "Nested VLAN Tags".)

If the packet is untagged or Priority-tagged, the packet is assigned a VID by one of the VID assignment mechanisms (Section 11.2.2 "VLAN Assignment Mechanisms").

If the packet is VLAN tagged, the tag VID may be either preserved or overridden by one of the VID assignment mechanisms.

The packet VLAN assignment is done prior to the bridge engine processing.

The tag user priority may be used to set the packet QoS Profile or it may be ignored (Section 8.2.1.1 "Port QoS Trust Modes" on page 116).
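The network-port identification rule can be sketched in a few lines. This is a software illustration only, under the assumption that the frame buffer starts at the destination MAC address, so the outer EtherType sits at byte offset 12; the names TagState and classify_network_port_packet are hypothetical, and the Nested VLAN access port exception is not modeled.

```python
from enum import Enum


class TagState(Enum):
    UNTAGGED = "untagged"
    PRIORITY_TAGGED = "priority-tagged"
    VLAN_TAGGED = "vlan-tagged"


def classify_network_port_packet(frame: bytes, port_ingress_ethertype: int):
    """Return (TagState, tag VID or None) for a frame received on a network port.

    port_ingress_ethertype is the ingress VLAN EtherType selected for the port
    (one of the two global values).
    """
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype != port_ingress_ethertype:
        # No match with the port's selected VLAN EtherType: untagged.
        return TagState.UNTAGGED, None
    if len(frame) < 16:
        raise ValueError("frame too short for a VLAN tag")
    tci = int.from_bytes(frame[14:16], "big")
    vid = tci & 0x0FFF  # the low 12 bits of the tag control field hold the VID
    if vid == 0:
        return TagState.PRIORITY_TAGGED, 0  # NULL VID
    return TagState.VLAN_TAGGED, vid
```

Note that the same frame can be classified differently on two ports: a frame carrying an 0x8100 tag is untagged on a port whose selected ingress VLAN EtherType is some other value.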
The tag Canonical Format Indicator (CFI) bit is always preserved, but does not play any role in the packet processing.

Packets Received on a Cascade Port

All packets received on a cascade port must be DSA-tagged (Section 4.6 "DSA Tag" on page 48). The DSA tag contains the packet’s VID assignment and an indication (the <srcTagged> field) of whether the packet was originally received tagged or untagged on the ingress network port.

The port VID assignment mechanism (Section 11.2.2.1 "Port-Based VLANs") identifies all DSA-tagged packets received on the cascade port as VLAN tagged packets, regardless of whether the packet was originally received VLAN tagged or not.

However, the protocol/policy-based VID assignment mechanisms use the DSA tag <srcTagged> field to identify whether the packet is VLAN tagged or not (Section 11.2.2.2 "Protocol-Based VLANs", Section 11.2.2.3 "Policy-Based VLANs").

Configuration
• To configure the ingress VLAN EtherType, set the <IngressVLAN Ethertype0> or <IngressVLAN Ethertype1> field in the Ingress VLAN EtherType Configuration Register (Table 323 p. 583) accordingly.
• To select the port ingress VLAN EtherType, set the <Ingress VLANSel> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).

11.2.1.2 Egress Port Tag Modification

The egress port tag modification criteria differ for packets transmitted on a cascade port and packets transmitted on a network port.

Each VLAN table entry contains a bit per port on the device indicating whether packets transmitted through the respective port are tagged or untagged (Section 11.2.6 "VLAN Table Entry" on page 216).

Packets Transmitted on a Network Port

A packet is transmitted on a network port either VLAN tagged or untagged, but never DSA-tagged, and never Priority-tagged unless there is an explicit VLAN assignment of 0.

How the packet is transmitted depends on the internal packet command (Section 4.6.1 "DSA Tag Commands" on page 49):
• If the packet command is FORWARD, or FROM_CPU to a Multicast group index (VIDX) destination (Section 11.5 "Bridge Multicast (VIDX) Table"), the packet is transmitted VLAN tagged or untagged according to the egress port tagged state in the VLAN table entry for the respective packet's VID assignment.
• If the packet command is FROM_CPU to a single destination, the packet is transmitted VLAN tagged or untagged according to the FROM_CPU DSA tag <trgTag> field (A.2 "Extended DSA Tag in FROM_CPU Format" on page 336).
• If the packet command is TO_ANALYZER and the packet is ingress mirrored to an analyzer port, the packet is transmitted tagged or untagged as it was received on the ingress port.
• If the packet command is TO_ANALYZER and the packet is egress mirrored to an analyzer port, the packet is transmitted VLAN tagged or untagged as it was transmitted on the egress port (Section 16.2 "Traffic Mirroring to Analyzer Port" on page 314).
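The per-command rules for network ports can be summarized as one decision function. The sketch below is a software model for illustration only; the EgressContext fields and the function name are hypothetical and simply name the inputs that the text refers to.

```python
from dataclasses import dataclass


@dataclass
class EgressContext:
    command: str                      # "FORWARD", "FROM_CPU", or "TO_ANALYZER"
    to_vidx: bool = False             # FROM_CPU only: destination is a Multicast group (VIDX)
    vlan_port_tagged: bool = False    # egress port bit in the VLAN table entry of the packet VID
    trg_tagged: bool = False          # FROM_CPU DSA tag <trgTag> field
    ingress_mirrored: bool = False    # TO_ANALYZER only: ingress (True) or egress (False) mirroring
    received_tagged: bool = False     # tag state as received on the ingress port
    transmitted_tagged: bool = False  # tag state as transmitted on the mirrored egress port


def transmit_tagged_on_network_port(ctx: EgressContext) -> bool:
    """Return True if the packet leaves the network port VLAN tagged."""
    if ctx.command == "FORWARD" or (ctx.command == "FROM_CPU" and ctx.to_vidx):
        return ctx.vlan_port_tagged
    if ctx.command == "FROM_CPU":
        # Single destination: the CPU chooses the tag state in the DSA tag.
        return ctx.trg_tagged
    if ctx.command == "TO_ANALYZER":
        if ctx.ingress_mirrored:
            return ctx.received_tagged
        return ctx.transmitted_tagged
    raise ValueError(f"unsupported packet command: {ctx.command}")
```

For example, a FROM_CPU packet sent to a single destination with <trgTag> set is transmitted tagged even if the egress port is an untagged member of the packet's VLAN.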
If the packet is transmitted tagged, the tag fields are set as follows:
• EtherType is set according to the egress port VLAN EtherType select configuration.
– The VLAN tag EtherType is always set in the packet according to this configuration, even if the packet was not modified.
– Typically, the port is configured with the same VLAN EtherType selection for both ingress and egress.
• VID is set according to the packet VID assignment.
• User Priority is set according to the packet QoS assignment (Section 8.5.1 "Setting the Packet Header 802.1p User Priority Field" on page 127).
• CFI bit is preserved if the packet was received tagged. If the packet was received untagged, the tag CFI bit is set to zero.

Configuration
• To configure the egress VLAN EtherType, set the <EgressVLAN Ethertype0> or <EgressVLAN Ethertype1> field in the Egress VLAN EtherType Configuration Register (Table 531 p. 772) accordingly.
• To select the port egress VLAN EtherType, set the <Egress VLANEType SelPortMap> field in the Egress VLAN Ethertype Select Register (Table 532 p. 772) accordingly.

Packets Transmitted on a Cascade Port

Packets transmitted on a cascade port are always transmitted with a DSA tag. The VLAN port tagged/untagged configuration is not relevant for cascade ports, nor is the VLAN EtherType selection (the DSA tag does not contain an EtherType field).

Packets received untagged on a network port have a DSA tag added to the packet when transmitted on a cascade port.

Packets received tagged on the network port have their tag overwritten with a DSA tag when transmitted on a cascade port.

The DSA tag contains an indicator of whether the packet was received tagged or untagged on the network ingress port. The DSA tag <VID> field contains the packet VID assignment.

11.2.2 VLAN Assignment Mechanisms

All packets are assigned a VLAN-ID (VID) prior to processing by the Bridge engine.

The device supports multiple mechanisms for assigning the packet VID:
• Port-based (Section 11.2.2.1 "Port-Based VLANs")
• Protocol-based (Section 11.2.2.2 "Protocol-Based VLANs")
• Policy-based (Section 11.2.2.3 "Policy-Based VLANs")

Multiple VID assignment mechanisms may be enabled simultaneously. The VID assignment made by a mechanism is given either “hard” or “soft” precedence. Hard precedence means that subsequent mechanisms may not override the current assignment. Soft precedence means that subsequent mechanisms may override the current assignment.

Packets received on cascade ports are subject to the VLAN assignment mechanisms only if the DSA tag command is FORWARD (Section 4.6 "DSA Tag" on page 48).

The following subsections describe each of these mechanisms in detail. Figure 43 defines the order of the VLAN assignment mechanisms.

Figure 43: VLAN Classification Algorithm
(Flowchart. Recoverable nodes: Packet arrives at port; Port enabled for Nested VLAN Access Port?; Packet is VLAN-tagged or DSA-tagged with VID > 0?; Port enabled for Force PVID assignment?; VID = PVID; VID = packet tag or DSA tag VID; Port enabled for Protocol-based VLAN & protocol match & port action assigns VID? Yes: VID = port protocol VID, No: previous VID assignment is preserved; Port enabled for Policy & rule action assigns VID? Yes: VID = policy action VID, No: previous VID assignment is preserved.)
Note: If any mechanism assigns the VID precedence to Hard, this completes the VID assignment algorithm.

11.2.2.1 Port-Based VLANs

Port-based VLAN assignment is the initial mechanism for VID assignment and it is always enabled.

Each port maintains a configurable Port VLAN-ID (PVID) value. Untagged and Priority-tagged packets always receive their initial VID assignment according to the ingress port PVID value.

VLAN tagged packets and DSA-tagged packets are assigned the VID from the tag. This VLAN assignment precedence is implicitly soft.

Alternatively, there is a non-standard option to force the PVID assignment for VLAN tagged and DSA-tagged packets. (According to 802.1Q, the PVID is only assigned to untagged and Priority-tagged packets.)

If the PVID is assigned to the packet, the VID assignment precedence is configurable as either Hard or Soft.

Configuration
• To configure the port default VID, set the <PVID> field in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• To configure the PVID assignment precedence, set the <PVID Precedence> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To configure the PVID Force Mode, set the <PVIDMode> field in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.

11.2.2.2 Protocol-Based VLANs

The Protocol-based VLAN mechanism is compliant with IEEE 802.1v VLAN Classification by Protocol and Port.

The Protocol-based VID assignment mechanism is enabled per port. Eight global protocol values are supported per device. For each global protocol a different VID assignment can be made on a per-port basis.

If the protocol-based VID assignment is enabled on the ingress port and the packet EtherType or DSAP/SSAP protocol matches one of the global protocols, the VID assignment is made according to the per-port per-protocol configuration. The per-port per-protocol VID assignment command can be configured to apply to one of the following:
• Only untagged and Priority-tagged packets
• Only VLAN tagged packets
• All packets

Global Protocol Table

A protocol match is based on comparing the packet protocol and encapsulation type to entries in the global protocol table.

If the packet encapsulation is either EthernetV2 or 802.3 LLC/SNAP, the 16-bit EtherType value is used as the protocol value. If the packet is tagged or has multiple tags (up to four tags), the inner-most EtherType is used as the protocol value.

If the packet encapsulation is 802.3 non-SNAP LLC, the 16-bit LLC DSAP-SSAP is used as the protocol value.
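The protocol value selection for the global protocol table lookup can be sketched as follows. The encapsulation test uses the Length/EtherType and LLC <DSAP-SSAP-Control> criteria that this section gives for identifying the encapsulation type. This is a software illustration, not the device implementation: the function name, the default set of VLAN EtherTypes used when skipping tags, and the assumption that the frame buffer starts at the destination MAC address are all hypothetical.

```python
JUMBO_LLC_ETHERTYPE = 0x8870  # EtherType for Jumbo LLC packets
SNAP_LLC_HEADER = b"\xaa\xaa\x03"  # DSAP-SSAP-Control of an LLC/SNAP packet


def protocol_key(frame: bytes, vlan_ethertypes=frozenset({0x8100})):
    """Return (encapsulation name, 16-bit protocol value) for the lookup."""
    offset = 12  # skip destination and source MAC addresses
    # Skip up to four VLAN tags so that the inner-most EtherType is used.
    for _ in range(4):
        if int.from_bytes(frame[offset:offset + 2], "big") in vlan_ethertypes:
            offset += 4
        else:
            break
    if len(frame) < offset + 2:
        raise ValueError("frame too short")
    length_or_type = int.from_bytes(frame[offset:offset + 2], "big")

    if length_or_type > 1500 and length_or_type != JUMBO_LLC_ETHERTYPE:
        return "EthernetV2", length_or_type

    if len(frame) < offset + 5:
        raise ValueError("frame too short for an LLC header")
    llc = frame[offset + 2:offset + 5]  # DSAP, SSAP, Control
    if llc == SNAP_LLC_HEADER:
        if len(frame) < offset + 10:
            raise ValueError("frame too short for a SNAP header")
        # The SNAP header is a 3-byte OUI followed by the 2-byte EtherType.
        return "LLC/SNAP", int.from_bytes(frame[offset + 8:offset + 10], "big")
    # Non-SNAP LLC: the 16-bit DSAP-SSAP pair is the protocol value.
    return "LLC/Non-SNAP", int.from_bytes(llc[:2], "big")
```

With this model an IPX raw packet, whose first two bytes after the length field are 0xFF 0xFF, yields the LLC/Non-SNAP protocol value 0xFFFF.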
The following criteria are used to identify the encapsulation type:
EthernetV2: Length/EtherType > 1500 decimal AND Length/EtherType != 0x8870 (EtherType for Jumbo LLC packets).
LLC/SNAP: (Length/EtherType <= 1500 decimal OR Length/EtherType = 0x8870 (EtherType for Jumbo LLC packets)) AND the LLC <DSAP-SSAP-Control> = 0xAA-0xAA-0x03.
LLC/Non-SNAP: (Length/EtherType <= 1500 decimal OR Length/EtherType = 0x8870 (EtherType for Jumbo LLC packets)) AND the LLC <DSAP-SSAP-Control> != 0xAA-0xAA-0x03.

As some protocols may have multiple encapsulations, the encapsulation type or types are specified for each protocol.

Note
IPX raw packets are identified as LLC/Non-SNAP packets where the DSAP=SSAP=0xFF.

The global protocol table contains eight entries, each consisting of the fields shown in Table 49.

Table 49: Global Protocol Table Entry
Field: Description
Entry Valid Bit: If set, entry is valid.
Entry Protocol: 16-bit Protocol value.
Entry Encapsulations: 3-bit encapsulation bitmap. Any combination of bits may be set.
  • Bit 0: EthernetV2.
  • Bit 1: non-SNAP LLC.
  • Bit 2: LLC/SNAP.

If there is a protocol/encapsulation match in global entry N, then the per-port protocol table entry N defines the actions taken.

Each of the eight global entries has a corresponding per-port assignment entry consisting of the fields shown in Table 50.

Table 50: Per Port Protocol Table Entry
Field: Description
Entry Valid Bit:
  0 = Entry is invalid and the packet VID and QoS are not modified by the protocol-based mechanism.
  1 = Entry is valid.
VID Assignment command:
  0 = Do not change the VID assignment.
  1 = Set the VID assignment only if the packet is untagged or Priority-tagged.
  2 = Set the VID assignment only if the packet is VLAN tagged.
  3 = Set the VID assignment regardless of packet tagging state.
  For the purpose of protocol-based VLANs, DSA-tagged packets are identified as tagged or untagged according to the DSA tag <srcTagged> field.
  Options 1, 2, and 3 are contingent upon the incoming descriptor VID-Precedence being set to Soft.
VID Assignment: The 12-bit VID assignment for the corresponding protocol entry in the global protocol table.
VID Precedence:
  0 = Set to Soft precedence - The VID assignment can be overridden by other subsequent mechanisms in the ingress pipeline.
  1 = Set to Hard precedence - The VID assignment cannot be overridden by other subsequent mechanisms in the ingress pipeline.
QoS Assignment command:
  0 = Do not change the QoS Profile assignment or its attributes.
  1 = Set the QoS Profile assignment only if the packet is untagged.
  2 = Set the QoS Profile assignment only if the packet is VLAN-tagged, Priority-tagged, or DSA-tagged.
  3 = Set the QoS Profile assignment regardless of packet tagging state.
  Options 1, 2, and 3 are contingent upon the incoming descriptor QoS-Precedence being set to Soft.
QoS Profile Assignment: 7-bit QoS Profile (Section 8.1.4 "QoS Profile" on page 114).
QoS Precedence:
  0 = Set to Soft precedence - The QoS Profile assignment can be overridden by other subsequent mechanisms in the ingress pipeline.
  1 = Set to Hard precedence - The QoS Profile assignment cannot be overridden by other subsequent mechanisms in the ingress pipeline.
Modify Packet DSCP:
  0 = The IPv4/6 packet DSCP is not modified.
  1 = The IPv4/6 packet DSCP is updated according to the QoS Profile assignment.
Modify Packet User Priority:
  0 = If the packet was received tagged and transmitted tagged, the user priority is not modified. If the packet was received untagged, the user priority is set according to the default port user priority.
  1 = If the packet is transmitted tagged, the user priority is set according to the QoS Profile table.

Configuration
• To configure the global protocols/encapsulation, set the Protocols<2n...2n+1> Configuration Register (0<=n<4) (Table 321 p. 581) and the Protocols Encapsulation Configuration Register (Table 322 p. 581) accordingly.
• To configure the port VID and QoS assignment values, set the Port<n> Protocol<m> VID and QoS Configuration Entry Register (0<=n<27, for CPU Port n = 0x3F, 0<=m<8, for <m> corresponding to the matching global protocol entry and encapsulation) (Table 313 p. 572) accordingly.
• To enable protocol-based VLANs on a port, set the <ProtBased VLANEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• To enable protocol-based QoS on a port, set the <ProtBased QoSEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567). For details on protocol-based QoS assignment see Section 8.2.2 "Protocol-Based QoS Marking" on page 118. Policy-Based VLANs The Policy engine is also capable of making the VID assignment. The Policy engine is the last VID assignment mechanism prior to the Bridging engine. A VID assignment made by the Policy engine is thus reflected in the Bridge learning and forwarding mechanisms. A Policy rule action supports the same fields as those defined in the Protocol-based VLAN mechanism, namely: • VID Assignment command • VID Assignment • VID Precedence See Table 50 for a description of these fields. The Policy engine is described in Section 10.6 "Policy Actions" on page 195. 11.2.3 Nested VLAN Tags 11.2.3.1 Service VLANs M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Nested VLANs (aka double-tagging or VLAN stacking) is now being standardized by IEEE 802.1ad, which is defining an addendum to the IEEE 802.1Q VLAN Bridge standard to support “provider bridging” based on adding an additional tag to the original packet. 802.1ad defines a “Service VLAN tag” or S-tag. The S-tag has the same format as the current standard customer VLAN tag (C-tag). IEEE will be standardizing a new EtherType value to be used for S-tags. On the access of a provider network, traffic is encapsulated with a Service tag (S-tag), regardless of whether it was originally C-tagged or untagged. The S-tag allows this traffic to be treated as an aggregate within a provider bridge network, where the bridging is based on the S-tag VID (S-VID) only. For example, a C-tagged packet received at the access of the provider network traverses the provider core network with two tags—the outer S-tag and inner C-tag. 
When leaving the provider core network through another access port, the S-tag is stripped and the original C-tagged packet is transmitted on the egress port. Note that if the original packet is received Priority-tagged, it will also be transmitted on the customer egress port as a Prioritytagged packet. The provider network can be seen by the customer's network as a layer 1 transport. If an untagged packet is received on an access port of a provider network, the packet traverses within the provider core network with a single VLAN tag. When leaving the provider core network through another access port, the tag is stripped and the original untagged packet is transmitted on the egress port. Based on whether the egress port is a tagged or untagged member of a VLAN, the device is capable of adding or removing a single tag on transmission out an egress port. This is illustrated in Figure 44. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 211 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 11.2.2.3 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED VLANs Push S-tag at access ports untagged packet Strip S-tag at access ports single-tagged packet untagged packet Provider Network 11.2.3.2 double-tagged packet Nested VLAN Support single tagged packet M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 single tagged packet Service VLAN is a common application, but the nested VLAN support can by applied to any application requiring multiple VLAN tags. 
As described in Section 11.2.1.1 "Identifying Tagged Packets" and Section 11.2.1.2 "Egress Port Tag Modification", there are two pairs of global VLAN EtherType values: • Ingress EtherType0 and Egress EtherType0 • Ingress EtherType1 and Egress EtherType1 (Typically the Ingress and Egress EtherType pair are configured to the same value, so in essence, there are two global EtherType values.) Each port is configured to “select” either VLAN EtherType0 or VLAN EtherType1. When Nested VLAN support is not required in the system, only one EtherType value (i.e., EtherType0) is configured in the system. We call this the Customer Tag (C-Tag) EtherType, and is typically configured to 0x8100 to conform to IEEE 802.1Q VLAN tags. In this case, all ports are configured with the VLAN EtherType Selection set to EtherType0. When Nested VLAN support is required in the system, an additional EtherType value (i.e., EtherType1) must be configured. We call the Service Tag (S-Tag) EtherType. There is no current standard for this EtherType value, but is expected to be defined in the future by IEEE 802.1ad. The device’s implementation of Nested VLANs is based on configuring each port as either a Core port or an Access port. Packets can be switched bi-directionally between Access and Core ports, between two Core ports, and between two Access ports. MV-S102110-02 Rev. 
E Page 212 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Figure 44: Nested VLAN Cloud Ingress and Egress M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Bridge Engine Core Ports AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Core ports face the provider network. As such, tagged packets sent and received on Core ports are have the S-tag EtherType. Packets received on Core ports whose outer-most EtherType matches the port's VLAN EtherType selection (i.e. S-Tag EtherType) are treated as “tagged” packets. In this case, the initial VID assignment is taken from the tag, and the packet is treated as "tagged" by egress port(s), e.g. if the egress port is an untagged member of the VLAN, the VLAN tagged is removed. The packet received on a Core port may have additional nested S-Tags or C-Tags after the first S-Tag. The device is capable of parsing over the first S-tag and over up to three additional S-tags or C-tags to identify the packet EtherType protocol field and Layer 3/4 fields that are required for protocol-based VLANs/QoS and Policy rules. Packets received on Core ports whose outer-most EtherType does not match the port's VLAN EtherType selection (i.e. S-Tag EtherType) are treated as “untagged” packets, and will be assigned a VID according to the device VID assignment mechanisms. If the packet received on a Core port is C-tagged, the packet is treated as “untagged” and the device will not be able to parse the packet EtherType protocol and Layer 3/4 fields. Core ports are configured as tagged members of S-tag VLANs. 
Core ports are configured with the VLAN EtherType Selection set to the S-Tag EtherType.

Note
When Nested VLANs are disabled, the port configuration is for Core port mode and the port VLAN EtherType Selection is set to the C-Tag EtherType.

Access Ports

Access ports face the customer network. All packets received on Access ports are ALWAYS:
• Assigned a VID according to the device configuration. By default, port-based VID assignment is used, but the VID assignment may be overridden by the protocol-based or policy-based VID assignment mechanisms (Section 11.2.2 "VLAN Assignment Mechanisms"). The VID assignment is never taken from the incoming packet tag.
• Considered “untagged” by the egress port(s), i.e., if the egress port is a tagged member of the VLAN, a VLAN tag is added to the packet.

Access ports are configured with the VLAN EtherType Selection set to the C-Tag EtherType.

If a packet received on an Access port has an outer-most EtherType that matches the port's VLAN EtherType selection (i.e., C-Tag EtherType), the device is capable of parsing over the first C-tag to identify the packet EtherType protocol field and the Layer 3/4 fields that are required for protocol-based VLANs/QoS and Policy rules.

If the packet received on an Access port is S-tagged, the device will not be able to parse the packet EtherType protocol and Layer 3/4 fields.

Access ports are configured as untagged members of S-tag VLANs.

Configuration

To configure EtherType0 for ingress and egress:
• Set the <IngressVLAN Ethertype0> field in the Ingress VLAN EtherType Configuration Register (Table 323 p. 583) accordingly.
• Set the <EgressVLAN Ethertype0> field in the Egress VLAN EtherType Configuration Register (Table 531 p. 772) accordingly.

To configure EtherType1 for ingress and egress:
• Set the <IngressVLAN Ethertype1> field in the Ingress VLAN EtherType Configuration Register (Table 323 p. 583) accordingly.
• Set the <EgressVLAN Ethertype1> field in the Egress VLAN EtherType Configuration Register (Table 531 p. 772) accordingly.

Access Port Configuration
• Set the <NestedVLAN AccessPortEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Clear the <Ingress VLANSel> bit (VlanEtherType0) in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Clear the corresponding access port bits in the <Egress VLANEType SelPortMap> field in the Egress VLAN Ethertype Select Register (Table 532 p. 772).
• Set the <PVID> field to the desired S-tag VID for this port in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• For each S-tag VID in use, make all the Access ports untagged members by clearing the corresponding VLAN table entry port bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

Core Port Configuration
• Clear the <NestedVLAN AccessPortEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Set the <Ingress VLANSel> bit (VlanEtherType1) in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Set the <Egress VLANEType SelPortMap> field for the respective core port bits in the Egress VLAN EtherType Configuration Register (Table 531 p. 772).
• For each S-tag VID in use, make all the core ports tagged members by setting the corresponding VLAN table entry <port n tagged> port bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

Customer Bridges (i.e., bridges that don’t employ nested VLANs)
• Clear the <NestedVLAN AccessPortEn> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Clear the <Ingress VLANSel> bit (VlanEtherType0) in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567).
• Clear the <Egress VLANEType SelPortMap> field in the Egress VLAN Ethertype Select Register (Table 532 p. 772).

11.2.4 VLAN Filtering

Once the final packet VID assignment is made, the packet is subject to the VLAN filtering mechanisms described in the following sub-sections.

11.2.4.1 VLAN Ingress Filtering

Per the IEEE 802.1Q specification, when VLAN ingress filtering is enabled on an ingress port, that port is checked to determine whether it is a member of the incoming packet’s VLAN classification. If it is not a member, the Bridge engine drops the packet and the packet is not learned in the FDB. The drop type (Hard or Soft) is configurable.

When a packet is discarded due to the VLAN ingress filtering mechanisms, it is treated as a security breach event (Section 11.6 "Bridge Security Breach Events" on page 241).

Configuration
• To enable the port for VLAN ingress filtering, set the <Ingress Filtering> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n=0x3F) (Table 356 p. 619).
• To configure the hard/soft drop mode of VLAN ingress filtering, set the <VLANIngress DropMode> bit in the Bridge Global Configuration Register1 (Table 371 p. 647).

11.2.4.2 VLAN Egress Filtering

Prior to being enqueued on an egress port, the packet is subject to VLAN egress filtering.

For known Unicast packets, the VLAN egress filter verifies that the egress port is a member of the VID assigned to the packet.

For unknown Unicast, Multicast, and Broadcast packets, the packet is forwarded to a portlist resulting from AND’ing the Multicast group index (VIDX) port map with the packet VID port map. This operation performs VLAN egress filtering by ensuring the packet is flooded only to VID members. For details on the VIDX assignment, see Section 11.5 "Bridge Multicast (VIDX) Table".

Note
VLAN egress filtering is not performed in the following cases:
• Packet is mirrored to an analyzer port (DSA tag <command>=TO_ANALYZER).
• Packet is sent from the CPU and designated with egress filtering disabled (DSA tag <command>=FROM_CPU, and <egressFilterEn>=0).
• Packet is bridged Unicast and the device has “VLAN egress filtering for known Unicast packets” set to disable.
• In the Multilayer stackable switches, the packet is routed Unicast and the egress port has “VLAN egress filtering for routed Unicast packets” set to disable.
The CPU cannot be a member of a VLAN, so VLAN egress filtering is never applied to packets sent to the CPU.

Configuration
• To disable VLAN egress filtering for known Unicast packets, clear the <BridgedUC EgressFilter En> bit in the Egress Filtering Register0 (Table 463 p. 721). (VLAN egress filtering cannot be disabled for multi-target packets.)
• To disable VLAN egress filtering for routed packets, clear the <RoutedUc EgressFilterEn> field in the Egress Filtering Register0 (Table 463 p. 721). (Relevant for the Multilayer stackable switches only.)
• For the CPU to send a packet that bypasses VLAN egress filtering, clear the DSA tag <EgressFilterEn> bit in the Extended FROM_CPU DSA Tag (Table 77 p. 336).

11.2.4.3 Invalid VLAN Filtering

If the VLAN to which the packet is assigned is not configured as a valid VLAN in the VLAN table, the bridge engine drops the packet. The drop type (Hard or Soft) is configurable.

When a packet is discarded due to the invalid VLAN filtering mechanisms, it is treated as a security breach event (Section 11.6 "Bridge Security Breach Events" on page 241).

Configuration
• To configure the hard/soft drop mode of invalid VLAN filtering, set the <VLANNotValid DropMode> bit in the Bridge Global Configuration Register1 (Table 371 p. 647).

11.2.4.4 VLAN Range Filtering

VLAN Range Filtering supports a per-device configurable maximum-allowed VLAN-ID value. Incoming VLAN tagged packets with a VLAN-ID greater than this value are dropped by the bridge engine and are not learned. The drop type (Hard or Soft) is configurable.

When a packet is discarded due to the VLAN Range filtering mechanisms, it is treated as a security breach event (Section 11.6 "Bridge Security Breach Events" on page 241).

Configuration
• To set the maximum VID associated with packets with the C-tag EtherType, set the <Ingress VLANRange0> field in the Ingress VLAN Range Configuration Register (Table 375 p. 650) accordingly.
• To set the maximum VID associated with packets with the S-tag EtherType, set the <Ingress VLANRange1> field in the Ingress VLAN Range Configuration Register (Table 375 p. 650) accordingly.

11.2.5 VLAN-Unaware Mode

The device supports a VLAN-unaware mode for 802.1D bridging.

When this mode is enabled:
• The device does not perform any packet modifications. Packets are always transmitted as received (i.e., packets received tagged are transmitted tagged; packets received untagged are transmitted untagged).
• Packets are implicitly assigned VLAN-ID 1, regardless of the VLAN-assignment mechanisms.
• Packets are implicitly assigned the VIDX Multicast group index 0xFFF, indicating that the packet flood domain is according to the VLAN (in this case VLAN 1). Registered Multicast is not supported in this mode.

All other features are operational in this mode, including internal packet QoS, trapping, filtering, policy, etc.

Configuration
• To configure the VLAN-unaware mode, set the <BasicMode> bit in the Global Control Register (Table 84 p. 377) accordingly.

11.2.6 VLAN Table Entry

The VLAN table contains 4K entries to support the full 4K VLAN-ID range defined by IEEE 802.1Q. (SecureSmart and SecureSmart Stackable devices have 256 active VLANs.)

Although VID 0 and 0xFFF are reserved VLAN IDs in IEEE 802.1Q, the device VLAN table treats these as normal VLAN table entries.

At initialization, the VLAN table is cleared to all zeros, with the exception of the entry for VID 1, which has all ports as untagged members. The remaining fields in the entry are initialized to zero.

The VLAN entry hardware format is defined in C.16.18.1 "VLAN Table Entry Format" on page 754.

CPU access to the VLAN table is defined in C.16.18.4 "VLT Tables Access Control Registers" on page 765.
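The VLAN filtering checks of Sections 11.2.4.1 through 11.2.4.4 can be summarized in a short Python sketch. This is only an illustration under stated assumptions: the order in which the checks are applied, the `VlanEntry` structure, the bitmap encoding of port membership, and all function and parameter names are hypothetical and are not taken from the register definitions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VlanEntry:
    valid: bool
    member_ports: int  # hypothetical bitmap: bit n set means port n is a member


def ingress_drop_reason(vid: int, ingress_port: int,
                        vlan_table: Dict[int, VlanEntry], max_vid: int,
                        ingress_filtering_enabled: bool,
                        tagged: bool = True) -> Optional[str]:
    """Return why the packet would be dropped, or None if it passes.

    The check order below is an assumption made for this sketch.
    """
    # VLAN range filtering applies to incoming VLAN tagged packets.
    if tagged and vid > max_vid:
        return "range"
    # Invalid VLAN filtering: the assigned VLAN must be valid in the table.
    entry = vlan_table.get(vid)
    if entry is None or not entry.valid:
        return "invalid_vlan"
    # VLAN ingress filtering: the ingress port must be a VLAN member.
    if ingress_filtering_enabled and not (entry.member_ports >> ingress_port) & 1:
        return "ingress_filter"
    return None


def flood_portlist(vidx_port_map: int, vlan_port_map: int) -> int:
    """Egress filtering for multi-target packets: AND the VIDX port map
    with the packet VID port map so only VLAN members receive the packet."""
    return vidx_port_map & vlan_port_map
```

Each non-None result corresponds to a drop that the text treats as a security breach event; the hard/soft drop mode is not modeled here.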
For VLAN Range Filtering (Section 11.2.4.4), a maximum VID value can be set for the C-tag and the S-tag, as identified by their respective EtherTypes (Section 11.2.3 "Nested VLAN Tags").

Table 51: VLAN Entry Fields

Field Name
    Description

VLAN is valid
    • VLAN entry is valid.
    • VLAN entry is invalid. A packet with this VLAN-ID assigned is dropped and a security breach event is generated (Section 11.2.4.3 "Invalid VLAN Filtering").

For each local port: Port is VLAN member
    Port is/is not a member of the VLAN.

For each local port: Packets egressed on this port are tagged
    Packets egressed on the port are tagged/untagged.

Unknown Source Address is not security event
    • Packets with unknown Source MAC Addresses generate security breach events (Section 11.6 "Bridge Security Breach Events").
    • Packets with unknown Source MAC Address are processed according to the normal Source Address learning rules (Section 11.4.7 "FDB Source MAC Learning").

IGMP Trap/Mirror Enable
    IGMP Trapping/Mirroring to the CPU is enabled/disabled (Section 11.8.4 "IGMP").

IPv6 ICMP Trap/Mirror
    IPv6 ICMPv6 Trapping/Mirroring to the CPU is enabled/disabled (Section 11.8.5 "MLD and other IPv6 ICMP").
    NOTE: IPv6 MLD packets can be trapped/mirrored to the CPU using this mechanism.

IPv4/6 Control to CPU Enable
    IPv4/6 control traffic trapping/mirroring to the CPU is enabled/disabled (Section 11.8.6 "IPv4/6 Interface Control Traffic").

Ingress Mirror
    • The packet mirroring state is not modified.
    • Mark the packet to be Mirrored to the Ingress Analyzer Port (Section 16.2.3 "Ingress Mirroring" on page 316).

Spanning Tree Group
    8-bit spanning tree group table index (Table 522: "Span State Group<n> Entry (0<=n<256)").

IPv4 Multicast Bridging Enable
    Options:
    • Disabled: IPv4 Multicast packets are forwarded according to the MAC DA (as in standard bridging).
    • Enabled: IPv4 Multicast packets are bridged according to the VLAN entry <IPv4 Multicast Bridging Mode>.
    (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging")

IPv4 Multicast Bridging Mode
    If VLAN entry <IPv4 Multicast bridging enable> = 1, then IPv4 Multicast packets are bridged according to the mode:
    • (*, G, V)
    • (S, G, V)
    (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging")

IPv6 Multicast Bridging Enable
    Options:
    • Disabled: IPv6 Multicast packets are bridged according to the MAC DA (as in standard bridging).
    • Enabled: IPv6 Multicast packets are bridged according to the VLAN entry <IPv6 Multicast Bridging Mode>.
    (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging")

IPv6 Multicast Bridging Mode
    If VLAN entry <IPv6 Multicast bridging enable> = 1, then IPv6 Multicast packets are bridged according to the mode:
    • (*, G, V)
    • (S, G, V)
    (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging")

Unregistered Non-IPv4/6 Multicast Filter
    Filter command for unregistered Multicast packets (does not include MAC Broadcast) that are neither IPv4 nor IPv6 Multicast packets, identified by not having a MAC destination prefix of either 01-00-5E-xx-xx-xx/25 or 33-33-xx-xx-xx-xx/16 (Section 11.11.1.2 "Per-VLAN Unregistered Non-IPv4/6 Multicast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for unregistered L2 non-IPM Multicast.
    • Mirror the packet to the CPU with specific CPU code for unregistered L2 non-IPM Multicast.
    • Soft Drop the packet.
    • Hard Drop the packet.

Unregistered IPv4 Multicast filter
    Filter command for unregistered IPv4 Multicast packets (Section 11.11.1.3 "Per-VLAN Unregistered IPv4 Multicast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for unregistered L2 IPv4 IP Multicast.
    • Mirror the packet to the CPU with specific CPU code for unregistered L2 IPv4 IP Multicast.
    • Soft Drop the packet.
    • Hard Drop the packet.

Unregistered IPv6 Multicast filter
    Filter command for unregistered IPv6 Multicast packets (Section 11.11.1.4 "Per-VLAN Unregistered IPv6 Multicast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for unregistered L2 IPv6 IP Multicast.
    • Mirror the packet to the CPU with specific CPU code for unregistered L2 IPv6 IP Multicast.
    • Soft Drop the packet.
    • Hard Drop the packet.

Unregistered IPv4-BC filter
    Filter command for unregistered IPv4 Broadcast packets (Section 11.11.1.5 "Per-VLAN Unregistered IPv4 Broadcast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for unregistered L2 IPv4 BC.
    • Mirror the packet to the CPU with specific CPU code for unregistered L2 IPv4 BC.
    • Soft Drop the packet.
    • Hard Drop the packet.

Unregistered non-IPv4BC filter
    Filter command for unregistered non-IPv4 Broadcast packets (Section 11.11.1.6 "Per-VLAN Unregistered non-IPv4 Broadcast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for unregistered L2 non-IPv4 BC.
    • Mirror the packet to the CPU with specific CPU code for unregistered L2 non-IPv4 BC.
    • Soft Drop the packet.
    • Hard Drop the packet.

Unknown Unicast
    Filter command for unknown Unicast packets (Section 11.11.1.1 "Per-VLAN Unknown Unicast Filtering"):
    • Flood the packet according to its VLAN assignment.
    • Trap the packet to the CPU with specific CPU code for Layer 2 Unknown Unicast.
    • Mirror the packet to the CPU with specific CPU code for Layer 2 Unknown Unicast.
    • Soft Drop the packet.
    • Hard Drop the packet.

IPv4 Unicast routing Enable
    Relevant for the Multilayer stackable switches only.
    Enable IPv4 Unicast Routing for this VLAN (Section 12.5.1 "Triggering Unicast Routing"). IPv4 Unicast Routing is enabled/disabled for this VLAN.

IPv6 Unicast routing Enable
    Relevant for the Multilayer stackable switches only.
    Enable IPv6 Unicast Routing for this VLAN (Section 12.5.1 "Triggering Unicast Routing"). IPv6 Unicast Routing is enabled/disabled for this VLAN.

11.3 Spanning Tree Support

The device has the necessary hardware hooks to support Single Spanning Tree (SST) and Multiple Spanning Tree (MST) per IEEE 802.1D and 802.1s. While the Spanning Tree Protocol (STP) is executed by the CPU, enforcement of the Spanning Tree port state in the packet data path is handled by the device at line rate.

The device supports a 256-entry Spanning Tree Group (STG) table. Each entry contains the Spanning Tree state for each physical port on the device. This table is not relevant for the SecureSmart and SecureSmart Stackable devices.

Each VLAN entry contains an STG index field that binds it to an STG table entry.

If the system software is running Single Spanning Tree, all active VLAN entries should be configured with the same STG table index.

If the system software is running Multiple Spanning Tree, each set of active VLANs comprising a Spanning Tree Group should be configured to a unique STG table index.

If STP is enabled on a port, its state is set to one of the following configurations:
• Blocking/Listening
• Learning
• Forwarding

Table 52 describes the functionality of each STP state.
Table 52: Spanning Tree Port State Behavior

STP State
    Device Behavior

State 0: STP Disabled
    Ingress Traffic Filtering: FORWARD all traffic.
    Egress Traffic Filtering: FORWARD all traffic.
    Learning: Permitted

State 1: Blocking/Listening
    Ingress Traffic Filtering: BPDUs are assigned the command TRAP. Non-BPDU traffic is assigned the command SOFT-DROP.
    Egress Traffic Filtering: All packets are discarded except for packets with DSA tag <command> = FROM_CPU and <EgressFilterEn> = 0.
    Learning: Not permitted

State 2: Learning
    Ingress Traffic Filtering: BPDUs are assigned the command TRAP. Non-BPDU traffic is assigned the command SOFT-DROP.
    Egress Traffic Filtering: All packets are discarded except for packets with DSA tag <command> = FROM_CPU and <EgressFilterEn> = 0.
    Learning: Permitted

State 3: Forwarding
    Ingress Traffic Filtering: BPDUs are assigned the command TRAP. Non-BPDU traffic is assigned the command FORWARD.
    Egress Traffic Filtering: Unrestricted
    Learning: Permitted

Notes
• Learning “permitted” implies:
  1) If enabled, auto-learning is performed (Section 11.4.7.2 "Source MAC Address Auto-Learning").
  2) If enabled, New Address (NA) updates are sent to the CPU ("New Address (NA) Update Messages" on page 226).
• Learning “not permitted” implies that auto-learning and NA updates to the CPU are NOT performed, even if enabled to do so.

Configuration
• Configure the VLAN table entry with the appropriate STG index (C.16.18.1 "VLAN Table Entry Format" on page 754).
• Configure the STG table entry with the appropriate Spanning Tree state (C.16.18.3 "Span State Groups Table Entry Format" on page 762).
• To disable Spanning Tree egress filtering for routed packets, clear the <RoutedSpan EgressFilterEn> field in the Egress Filtering Register0 (Table 463 p. 721). (Relevant for the Multilayer stackable switches only.)
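The per-state behavior in Table 52 can be expressed as a small Python sketch. The enum and function names are hypothetical and exist only for this illustration; the sketch encodes the table rows and does not model other conditions (for example, the VLAN validity requirement for BPDU trapping).

```python
from enum import IntEnum


class StpState(IntEnum):
    DISABLED = 0
    BLOCKING_LISTENING = 1
    LEARNING = 2
    FORWARDING = 3


def ingress_command(state: StpState, is_bpdu: bool) -> str:
    """Ingress command assigned by the Spanning Tree state (Table 52)."""
    if state == StpState.DISABLED:
        return "FORWARD"  # all traffic, including BPDUs, is forwarded
    if is_bpdu:
        return "TRAP"
    return "FORWARD" if state == StpState.FORWARDING else "SOFT-DROP"


def egress_allowed(state: StpState, from_cpu: bool, egress_filter_en: bool) -> bool:
    """True if a packet may egress a port in the given state (Table 52)."""
    if state in (StpState.DISABLED, StpState.FORWARDING):
        return True
    # Blocking/Listening and Learning: only FROM_CPU packets with
    # <EgressFilterEn> = 0 are exempt from the discard.
    return from_cpu and not egress_filter_en


def learning_permitted(state: StpState) -> bool:
    """Learning is permitted in every state except Blocking/Listening."""
    return state != StpState.BLOCKING_LISTENING
```

Note that `learning_permitted` only reflects the Spanning Tree state; whether auto-learning or NA updates actually occur also depends on those features being enabled.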
STP can be enabled or disabled per STG port.

11.3.1 Trapping BPDUs

Packets with the BPDU Multicast DA of 01-80-C2-00-00-00 are automatically trapped to the CPU when received on a VLAN configured as "valid" and the port Spanning Tree state is not set to "STP Disabled". The packet is sent to the CPU with a DSA tag <CPU code>=BPDU TRAP, according to the attributes in the CPU code table (Section 7.2.1 "CPU Code Table" on page 103).

If the BPDU is received on a port disabled for Spanning Tree, the packet is forwarded like a normal bridge Multicast packet.

11.3.2 CPU Generated BPDUs

The CPU can generate BPDUs to be transmitted out of any device port in the system, regardless of the port Spanning Tree state, by setting the packet DSA tag with the following field values (Section 7. "CPU Traffic Management" on page 102):

<Command> = FROM_CPU

<EgressFilterEn> = 0
    Disables egress VLAN and Spanning Tree filtering.

<CascadeControl> = 1
    If the packet is queued on a cascade port, it is queued according to the configured Control TC, and Control-DP0 (if <mailbox>=1, or port=CPU port 63) or Control-DP1. On non-cascade ports, the packet is queued according to the DSA tag TC and DP.

11.4 Bridge Forwarding Database (FDB)

The device contains an internal 16K-entry bridge FDB. (SecureSmart and SecureSmart Stackable devices, and 98DX107: 8K-entry FDB.)

11.4.1 FDB Unicast and Multicast Entries

FDB entries can be categorized as either Unicast or Multicast entries. These entry types are discussed in further detail in the subsequent sub-sections.

11.4.1.1 FDB Unicast Entry

FDB Unicast entries contain a MAC Unicast address that is associated with a single location, which may be either of the following:

Device number and port number on that device
    The device can be the local device number or any remote device number that is cascaded in the system. Forwarding to a remote device is done according to the Device Map Table, and is described in Section 4.2 "Single-Target Destination in a Cascaded System" on page 45.

Trunk-ID
    The Trunk-ID is treated as a single bridge interface. Forwarding to a trunk destination is described in Section 13. "Port Trunking" on page 281.

Learning of Unicast MAC entries is described in Section 11.4.7 "FDB Source MAC Learning".

11.4.1.2 FDB Multicast Entry

MAC Multicast
    FDB MAC Multicast entries provide for Multicast bridging based on the packet’s MAC Multicast group address. MAC Multicast entries can be configured to be static or not static (Section 11.4.8 "FDB Static Entries").

IPv4/6 Multicast
    FDB IPv4/6 Multicast entries provide for Multicast bridging based on the packet VLAN and its source IP address and/or group IP address (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging"). IPv4/6 Multicast entries are not subject to aging, even if the entry static bit is clear.

MAC Unicast forwarded to a Multicast group
    A MAC Unicast FDB entry with the <Multiple Target> bit set forces the Unicast entry to be treated like a registered MAC Multicast. Such entries must be defined as static entries.
Multicast entries are associated with a Multicast group index (aka VIDX). The VIDX is used as an index into a 4K entry Multicast table. The Multicast table is in addition to the 4K VLAN table (Section 11.5 "Bridge Multicast (VIDX) Table"). Multicast entries are added to the FDB only by the CPU (Section 11.4.6 "CPU Update and Query of the FDB"). If the Multicast packet is to be forwarded to a subset of the VLAN port members, the CPU must create a VIDX entry with the appropriate port list, and set the FDB Multicast entry <VIDX> with the corresponding index. 11.4.2 FDB Entry M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 If the Multicast packet is to be flooded to the VLAN group, there is no need to define a VIDX entry whose portlist is the same as the VLAN entry. Instead, the entry <VIDX> is set to 0xFFF, and the packet is flooded according to VLAN table portlist for the packet’s VID. Table 53 contains the fields in the FDB entry. Note that some of the fields are applicable only in specific FDB entry types. The device FDB format is defined in the FDB Entry (Table 412 p. 673). Table 53: FDB Address Table Entry Name Description Valid Entry Valid (Section 11.4.3 "FDB Lookup"): • Invalid entry and end of hash chain. • Entry may contain valid data, depending on <Skip>. Skip This bit is only relevant if <Valid>=1. (Section 11.4.3 "FDB Lookup"): • Entry contains valid data. • Entry does not contain valid data but is not the end of the hash chain. This entry is skipped by the search algorithm. Entry Type Entry types: • MAC Address entry • IPV4 Multicast address entry (IGMP snooping). (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging") • = IPV6 Multicast address entry (MLD snooping). (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging") MV-S102110-02 Rev. 
E Page 222 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The FDB supports the following Multicast entry types: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Bridge Engine FDB Address Table Entry (Continued) Description Age/Refresh This bit is used for the two-step Aging process. (Section 11.4.9 "FDB Entry Aging"): • Entry will be marked for aging in next age pass. • Entry will be marked for aging after two age-passes, in the event that the entry is not “refreshed” by a packet with this source MAC Address. When automatically learned, the bit is set. The Source MAC Address lookup process refreshes this bit when a match is found. Static The entry is subject to auto-learning and aging, if these mechanisms are enabled. or The entry is not subject to auto-learning and aging. (Section 11.4.8 "FDB Static Entries") MAC Address For <entry type>=MAC Address entry, the 48-bit Unicast or Multicast MAC Address. DIP For <entry type>=IP Multicast entry (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging") If IPv4, this is the 32-bit destination IPV4 address. If IPv6, this is the 4 selected octets of destination IPv6 address. VID DA Command For <entry type>=IP Multicast entry (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging") If IPv4, this is the 32-bit source IPv4 address. If IPv6, this is the 4 selected bytes of source IPv6 address. If this entry is (*, G, V), the SIP must be set to 0. The VLAN-ID associated with the MAC Address or IPv4/v6 (S, G) address. The command when this entry matches the destination lookup for either MAC DA or IPv4/6 Multicast entry. 
(Section 11.4.4 "FDB Entry Command"): • Forward • Mirror to the CPU with CPU code FDB_ENTRY_TRAP/MIRROR • Trap to the CPU with CPU code FDB_ENTRY_TRAP/MIRROR • Hard Drop • Soft Drop M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 SIP AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Name SA Command The command when this entry matches the Source Address lookup for a MAC SA entry (Section 11.4.4 "FDB Entry Command") • Forward • Mirror to the CPU with CPU code FDB_ENTRY_TRAP/MIRROR • Trap to the CPU with CPU code FDB_ENTRY_TRAP/MIRROR • Hard Drop • Soft Drop Is Trunk The MAC Unicast address is associated with a device number and port number. (Section 13. "Port Trunking" on page 281). or The MAC Unicast address is associated with a Trunk group. Trunk-ID If <Is Trunk>=1, the Trunk-ID associated with the Unicast MAC Address. (Section 13. "Port Trunking" on page 281). The Trunk-ID range is from 1 to 127. (For the SecureSmart devices: 98DX163, 98DX243, and 98DX262 Trunk-ID range is from 1 to 32 and for the 98DX106 from 1 to 8.) Device Number The device number associated with this address. Range is 0 to 31. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 223 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Table 53: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Bridge Forwarding Database (FDB) Table 53: Port Number The port number on <Device Number> associated with the Unicast MAC Address. VIDX 12-bit Multicast group Index (Section 11.4.1.2 "FDB Multicast Entry") 0-0xEFF - The Multicast group table index for this MAC Multicast or IPv4/v6 Multicast entry. 0xFFF - The Multicast packet is forwarded according to the VID group membership. 
DA QoS Parameter Set Index: One of the following (Section 8.2.4 "Bridge FDB-Based QoS Marking" on page 119):
• No QoS parameters are modified when this entry matches the destination lookup for either MAC DA or IPv4/6 Multicast entry.
• The QoS parameter set index from which to assign the QoS parameters when this entry matches the destination lookup for this MAC DA or IPv4/6 Multicast entry.

SA QoS Parameter Set Index: One of the following (Section 8.2.4 "Bridge FDB-Based QoS Marking" on page 119):
• No QoS parameters are modified based on the source MAC lookup entry match.
• The QoS parameter set index from which to assign the QoS parameters on a source MAC lookup entry match.

Ingress Mirror to Analyzer Port: One of the following (Section 16.2 "Traffic Mirroring to Analyzer Port" on page 314):
• Do not change the packet ingress mirroring setting.
• When this entry matches either the source or destination lookup, mark the packet for ingress mirroring to the ingress analyzer port.

NA Storm Prevention Flag: This field is relevant when auto-learning is disabled and New Address (NA) update Storm Prevention is enabled ("NA Storm Prevention in Controlled-Learning" on page 235):
• This is a regular entry.
• This is a storm prevention entry, indicating that an NA message has been sent to the CPU but the CPU has not yet learned this MAC Address at its current location. The device will not send further NA messages to the CPU for this source MAC Address. Should a MAC DA lookup match this entry, the packet is treated as an unknown Unicast packet.

Source-ID: 5-bit source-ID assignment for this source MAC entry. The source-ID assignment is used by the egress source-ID filtering mechanism (Section 11.14 "Bridge Source-ID Egress Filtering" on page 257).

Multiple Target: For Unicast MAC entries:
• Forward to the Unicast target device/port or Trunk group.
• Treat this entry as if it were a Multicast entry and forward according to the entry <VIDX>.
This entry must be configured as static (Section 11.4.1.2 "FDB Multicast Entry" on page 222).

User Defined: Four user-defined bits. May be used to store per-Unicast-entry user data.

Router DA: Relevant for the Multilayer stackable switches only. Indicates whether this MAC Address is associated with the Router MAC (see Section 12.5.1 "Triggering Unicast Routing"):
• This MAC Address is not associated with the Router MAC.
• This MAC Address is associated with the Router MAC and this packet is sent to the Unicast Routing Engine.

11.4.3 FDB Lookup

The FDB lookup is performed for the packet source and destination. However, the destination lookup is not performed if Bridge Bypass is enabled (Section 11.1 "Bypassing Bridge Engine" on page 203).

The FDB internal memory is organized as 4K rows, where each row contains four FDB entries.

An entry match is based on the entry type key fields. The key for MAC Unicast and Multicast entries is the MAC Address and, optionally, the packet VID assignment (Section 11.4.3.1 "FDB MAC Lookup Modes"). The key for IPv4/v6 Multicast entries is (S, G, VID) or (*, G, VID) (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging").

There is a configurable upper limit on the number of FDB rows that are searched.
The default maximum hash chain length is one FDB row, which is four FDB entries. The search limit can be increased to a maximum of eight rows; however, wire-speed performance degrades if it is increased beyond a single row.

The search algorithm compares the entry type key fields, starting from the hash index start-row, until a match is found, an entry with <valid>=0 is found, or the maximum search limit is reached.

Configuration
To configure the maximum number of FDB rows to search, set the <NumOfEntries InMacLookup> field in the FDB Global Configuration Register (Table 399 p. 659) accordingly.

11.4.3.1 FDB MAC Lookup Modes

There are two FDB MAC lookup modes:

VLAN Independent Lookup
In this mode, the packet VLAN assignment together with the MAC Address is used to determine whether a matching entry exists in the FDB. This mode is used for independent VLAN learning (IVL), as described in IEEE 802.1Q.

VLAN Shared Lookup
In this mode, only the MAC Address is used to determine whether a matching entry exists in the FDB. This mode is used for shared VLAN learning (SVL). This mode requires all MAC Addresses to be unique among VLANs.

The lookup mode applies to Unicast and Multicast MAC Address lookups. The FDB MAC lookup mode is irrelevant for FDB IPv4/6 Multicast lookups.

Configuration
To configure the FDB MAC Lookup mode, set the <VLANLookup Mode> bit in the FDB Global Configuration Register (Table 399 p. 659).

11.4.3.2 FDB Hash Functions

The device can be configured to use either a CRC-based or an XOR-based FDB hash function.

The CRC-based hash function provides the best hash index distribution for random addresses and VLANs. However, for controlled testing scenarios where sequential addresses and VLANs are often used, the XOR hash function provides optimal hash index distribution.
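The search procedure of Section 11.4.3 can be sketched in software as follows. This is a minimal illustrative model, not the device implementation. The hash function is passed in as a parameter because the CRC and XOR functions are not specified here. The entry layout, the wrap-around of the row index at the end of the table, the treatment of entries with <skip> set as non-matching, and all names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Optional, Tuple

ENTRIES_PER_ROW = 4  # each FDB row holds four entries (Section 11.4.3)


@dataclass
class FdbEntry:
    valid: bool = False
    skip: bool = False               # assumption: skipped entries never match
    key: Optional[Hashable] = None   # e.g. (mac, vid) for IVL, (mac,) for SVL


def fdb_lookup(
    table: List[List[FdbEntry]],
    key: Hashable,
    hash_fn: Callable[[Hashable], int],   # hypothetical stand-in for CRC/XOR hash
    max_rows: int = 1,                    # default chain length is one row
) -> Optional[Tuple[int, int]]:
    """Return (row, slot) of the matching entry, or None if there is no match."""
    num_rows = len(table)
    start_row = hash_fn(key) % num_rows
    for i in range(max_rows):
        row = (start_row + i) % num_rows  # assumption: chain wraps at table end
        for slot in range(ENTRIES_PER_ROW):
            entry = table[row][slot]
            if not entry.valid:
                return None               # search stops at the first invalid entry
            if not entry.skip and entry.key == key:
                return (row, slot)        # match found
    return None                           # maximum search limit reached
```

With `max_rows=1` at most four entries are compared per lookup, which corresponds to the default hash chain length described above.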
The FDB is searched using a hash function to calculate the search start-row index, from which point a linear search is performed to resolve hashing collisions.

Configuration
To configure the CRC-based or XOR-based hash function, set the <FDBHashMode> bit in the FDB Global Configuration Register (Table 399 p. 659).

11.4.4 FDB Entry Command

Commands are associated with an FDB lookup match for the Source Address lookup and the Destination Address lookup. The following commands are supported:
• Forward
• Mirror to the CPU
• Trap to the CPU
• Hard Drop
• Soft Drop

If there is no match for the FDB source or destination lookup, the implicit command is Forward. However, for secure environments, it is possible to drop, mirror, or trap Unknown Unicast destinations or Unregistered Multicast destinations (Section 11.11 "Unknown and Unregistered Packet Filtering" on page 253).

The final command is resolved from the source and destination commands according to the command resolution rules (Section 11.15 "Bridge Ingress Command Resolution" on page 259).

If the packet is trapped or mirrored to the CPU by the FDB entry command, the packet DSA tag <CPU code> is set to FDB_ENTRY_TRAP/MIRROR.

An FDB entry command of either Hard Drop or Soft Drop is treated as a security breach, as described in Section 11.6 "Bridge Security Breach Events".

The FDB entry command is not relevant if Bridge Bypass is enabled (Section 11.1 "Bypassing Bridge Engine" on page 203).
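The per-lookup behavior described in Section 11.4.4 can be summarized in a short sketch. It models only the rules stated above: the implicit Forward on a lookup miss, the CPU code used for Trap and Mirror, and the classification of drop commands as security breaches. It does not model unknown/unregistered packet filtering (Section 11.11) or the final command resolution (Section 11.15). The type and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class FdbCommand(Enum):
    FORWARD = "forward"
    MIRROR_TO_CPU = "mirror_to_cpu"
    TRAP_TO_CPU = "trap_to_cpu"
    HARD_DROP = "hard_drop"
    SOFT_DROP = "soft_drop"


FDB_CPU_CODE = "FDB_ENTRY_TRAP/MIRROR"


def lookup_command(entry_command: Optional[FdbCommand]) -> FdbCommand:
    """Command for one lookup; None means the lookup found no matching entry."""
    return FdbCommand.FORWARD if entry_command is None else entry_command


def is_security_breach(command: FdbCommand) -> bool:
    """Hard Drop and Soft Drop entry commands are treated as security breaches."""
    return command in (FdbCommand.HARD_DROP, FdbCommand.SOFT_DROP)


def cpu_code(command: FdbCommand) -> Optional[str]:
    """CPU code placed in the DSA tag when the entry traps or mirrors to the CPU."""
    if command in (FdbCommand.MIRROR_TO_CPU, FdbCommand.TRAP_TO_CPU):
        return FDB_CPU_CODE
    return None
```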
11.4.5 Address Update (AU) Messages

Address Update (AU) messages allow FDB events or queries to be exchanged between the CPU and the device.

The device sends AU messages to the CPU to indicate an event or change in the FDB table. AU messages are sent to the CPU through the mechanisms described in Section 11.4.5.3 "Address Update Messages to the CPU".

The CPU sends AU messages to the device to request a query or change to the FDB table (Section 11.4.6 "CPU Update and Query of the FDB").

11.4.5.1 Address Update Message Types

The following AU message types are defined:
• New Address (NA)
• Aged Address (AA)
• Transplanted Address (TA)
• Query Address (QA)
• Query Response (QR)

New Address (NA) Update Messages

The device sends an NA message to the CPU when ALL of the following criteria are met:
• NA messages to the CPU are enabled on the ingress port.
• An FDB entry is not found for the packet Source MAC Address (and VLAN, in IVL mode), OR the Source MAC Address matched a non-static FDB entry that is associated with a different device/port or Trunk-ID (i.e., the station moved location) AND the entry <NA Storm Prevention Flag> is clear ("NA Storm Prevention in Controlled-Learning" on page 235).
• The packet did not generate a Security Breach event (Section 11.6 "Bridge Security Breach Events").
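The three NA criteria listed above can be expressed as a single predicate. This is a sketch of one reading of the list: the second criterion is interpreted as "not found, OR (moved non-static entry AND storm prevention flag clear)". The data structure and names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class SaLookup:
    """Result of the source MAC lookup (hypothetical structure)."""
    found: bool
    is_static: bool = False
    same_location: bool = True      # entry device/port or Trunk-ID equals the packet source
    storm_prevention: bool = False  # entry <NA Storm Prevention Flag>


def should_send_na(na_enabled_on_port: bool,
                   lookup: SaLookup,
                   security_breach: bool) -> bool:
    """Return True when all NA criteria are met."""
    if not na_enabled_on_port:
        return False                # criterion 1: NA messages enabled on ingress port
    if security_breach:
        return False                # criterion 3: no security breach event
    if not lookup.found:
        return True                 # criterion 2, first arm: unknown source address
    moved = (not lookup.is_static) and (not lookup.same_location)
    return moved and not lookup.storm_prevention  # criterion 2, second arm
```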
The CPU may write an NA message to the device to add, update, or remove an FDB entry (Section 11.4.6 "CPU Update and Query of the FDB").

The use of NA messages for CPU-controlled learning or auto-learning is described in Section 11.4.7 "FDB Source MAC Learning".

Aged Address (AA) Update Messages

The device sends an AA message to the CPU when an entry is aged by the FDB aging or deleting mechanism (Section 11.4.9 "FDB Entry Aging" and Section 11.4.11 "Deleting FDB Entries"). There is a global device configuration to enable/disable AA messages to the CPU.

The CPU does not write AA messages to the device. Entries are deleted using NA messages with the entry <Skip> set to 1.

The FDB aging and deleting mechanisms are described in Section 11.4.9 "FDB Entry Aging" and Section 11.4.11 "Deleting FDB Entries".

Transplanted Address (TA) Update Messages

The device sends a TA message to the CPU as a response to an FDB transplant address command triggered by the CPU. There is a global configuration to enable/disable TA messages to the CPU.

The FDB transplant address mechanism is described in Section 11.4.12 "FDB Unicast Entry Transplanting".

Query Address and Response (QA/QR) Update Messages

The CPU writes a QA message to the device to query the FDB for a given MAC Address or MAC Address + VLAN (depending on the FDB MAC lookup mode), or IP Multicast (S, G, V).
In response to a QA message, the device sends a QR message to the CPU. The QR message contains a field called <Entry Found>, which indicates whether the queried address was found. If the entry was found, the QR message contains the FDB entry content and the FDB entry offset relative to its hash index.

11.4.5.2 Address Update (AU) Entry

The AU message contains all the fields defined in the FDB Address Table Entry (Table 53 p. 222), with the single exception of the <valid> field. To remove an entry, the CPU must set the <Skip> field.

In addition, the following fields are included in the AU message:

Table 54: Additional Address Update Fields

Field Name: Description

AU Message ID: This is a constant that is always set to 0x2. All other values are reserved.

AU Message Type:
0 = New Address (NA)
1 = Query Address (QA)
2 = Query Response (QR)
3 = Aged Out Address (AA)
4 = Transplanted Address (TA)

NA Chain Too Long: This field is relevant in NA messages to the CPU:
• In auto-learning mode, this flag indicates that the new address could not be learned in the FDB due to the length of the FDB hash chain.
• In controlled-learning mode with NA storm prevention enabled, this flag indicates that a Storm Prevention (SP) entry for the new address could not be created due to the length of the FDB hash chain.
• In controlled-learning mode with NA storm prevention disabled, this flag indicates that the FDB hash chain for this source MAC Address was full at the time the packet was received.

Entry Found: This field is relevant in QR messages to the CPU:
0 = The address requested in a previous QA message sent by the CPU was not found, and the FDB entry fields in this message are invalid.
1 = The address requested in a previous QA message sent by the CPU was found, and the FDB entry fields in this message are valid.

Entry Index: This is valid in the AA and TA messages to the CPU. This is the FDB entry index for this address.

Entry Offset: This is valid in the following messages:
- NA messages, if the AU entry <NA Chain Too Long> is 0.
- QR messages, if the AU entry <Entry Found> is 1.
This is the FDB entry offset relative to the hash index for this address.

The physical format of the FDB Address Update Message is defined in MAC Update Message Format (Table 401 p. 663).

11.4.5.3 Address Update Messages to the CPU

Address Update (AU) messages are generated by the device to indicate an FDB event to the CPU (i.e., New Address, Aged Address, or Transplanted Address) or a Query Response.

If AU messages of a given type are enabled to be sent to the CPU, but the device cannot send the AU message to the CPU (e.g., no memory is available to store the AU message), the current FDB operation (e.g., aging, deleting, transplanting, or query) is paused on the last processed entry until the AU message can be sent. This ensures that the CPU is always in full synchronization with the FDB content.

AU messages generated by the device are stored in an on-chip FIFO. The FIFO size is configurable, with a maximum capacity of 16 AU messages.
There are two mechanisms through which the CPU can receive AU messages from the device:
• On-chip address update FIFO
• PCI Master write to host memory Address Update Queue (AUQ)

If the device is managed by the PCI, the CPU may choose either of the above methods for reading the AU messages. The on-chip address update FIFO requires CPU-initiated reads from the device address space. The AUQ method utilizes the device PCI master capability to perform burst write transactions to the host memory address space.

If the device is managed using the CPU SMI management interface (i.e., no PCI), the CPU must read the on-chip AU FIFO to receive AU messages.

Configuration
To select the AU update mechanism, set the <AUMsgs ToCPUIF> bit in the FDB Global Configuration Register (Table 399 p. 659).

On-Chip AU FIFO

Configuration
• To read AU messages from the head of the FIFO, the CPU performs four consecutive 32-bit read transactions of the Message to CPU Register. If the data in the FIFO is not valid, reading it returns 0xFFFFFFFF. When the entry is valid, the first read returns Address Update Message word0, the second read returns Address Update Message word1, the third read returns Address Update Message word2, and the fourth read returns Address Update Message word3.
• To configure the AU FIFO size, set the <CPUFifo Threshold> field in the AU FIFO to CPU Configuration Register (Table 403 p. 666) accordingly.
• The <AUMsgTO CPUReady> bit in the FDB Interrupt Cause Register (Table 583 p. 809) is raised when an AU message is ready to be read by the CPU. The interrupt remains active until the FIFO is empty.

Host Memory Address Update Queue

This mechanism is recommended for systems that manage the device via the PCI bus.

The device asynchronously transfers AU messages to a predefined region of host memory called the Address Update Queue (AUQ). Each AU transfer is done with a single PCI master burst transaction.

Should the AUQ become full, the device prevents any new AU messages from being queued. Any active pass on the FDB is paused, waiting for available space in the AUQ.

To prevent pausing the FDB pass when the AUQ fills, the device supports configuration of two AUQ memory regions. At initialization, the CPU configures the device with two AUQ memory regions, each defined by a base address and size. When an AUQ region reaches its maximum size, the device automatically begins to fill the second region. When the CPU has completed processing all the AU messages from a filled AUQ region, the memory region should be written back to the device for the next switch-over.

Unless PCI cache-coherency is supported, the AUQ regions must be allocated from the non-cacheable host memory address space.

Each AU message in the AUQ is 16 bytes. At initial AUQ allocation time, prior to updating the device with the AUQ base address and size, the first byte of every AU message must be zeroed. All AU messages sent by the device start with a constant message ID of 0x2 in the first nibble of the record. This technique allows the CPU to locate the last AU message recorded in the AUQ by the device.

The CPU must maintain a pointer to the next AUQ message to be processed. Initially this is the first record of the AUQ.
When an AU pending interrupt is generated, the CPU processes AU messages starting at the next AUQ message pointer. First, the CPU verifies that the first byte of the AU message is non-zero, which indicates that the message was written by the device. After reading the AU message, the CPU must zero the first byte of the message and increment its next message pointer. The cycle continues until the next message pointer points to an entry whose first byte is zero, or the next message pointer has passed the end of the AUQ.

For the on-chip AU FIFO, the CPU can either poll the AU FIFO for valid data or read from the AU FIFO when an interrupt is raised.

Configuration
• To configure the AUQ size, set the <AUQSize> field in the Address Update Queue Control Register (Table 94 p. 384) accordingly. The size must be configured before the AUQ base address.
• To configure the AUQ base address, set the <PCIAUQ Base> field in the Address Update Queue Base Address Register (Table 93 p. 384) accordingly.
• The CPU receives an indication that AU messages are pending in the current AUQ from the <AUQPending> bit in the Miscellaneous Interrupt Cause Register (Table 557 p. 794).

Address Update Message Rate Limiting

When address update messages are written to the host memory Address Update Queue (AUQ), the rate of address messages is limited only by the performance of the PCI bus until the AUQ becomes full. If the software is event-driven, a large burst of AU messages can temporarily monopolize the CPU. To protect the CPU from such bursts, the rate of AU messages queued to the AUQ can be limited.

AU rate limiting is globally enabled. It limits the maximum number of AU messages queued within a fixed 10 millisecond window of time.
The limit ranges from a minimum of two AU messages per 10 ms to a maximum of 512 AU messages per 10 ms, in steps of 2.

Note
At the beginning of each time window, the internal counter is reset to zero. For each AU message sent to the AUQ, the counter is incremented by 1. If the new value of the counter exceeds the configured permitted value, the AU message is not queued to the AUQ and it is treated as if the AUQ were full, i.e., no auto-learning or storm-prevention entries are added to the FDB and FDB aging is stopped.

In the event that an aging/deleting FDB pass cannot send an AU message to the CPU (e.g., due to rate limiting), the aging/deleting FDB pass is stalled at the specific FDB entry until the AU message can be sent to the CPU. This ensures that the CPU is always kept in sync with FDB events.

Configuration
• To globally enable AU message rate limiting, set the <AURateLimEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• To configure the maximum number of AU messages enqueued within a fixed 10 ms window of time, set the <AURateLimit> field in the FDB Global Configuration Register (Table 399 p. 659) accordingly.

11.4.6 CPU Update and Query of the FDB

There are two mechanisms available to the CPU for updating or querying the FDB entry:
• Address Update messages from the CPU
• Direct entry update

11.4.6.1 Address Update (AU) Messages from the CPU

The CPU can send AU messages through the Address Update register. The CPU can write the following AU messages to the device:
• New Address (NA), to add a new entry or update an existing entry. An entry is removed by setting the entry <skip> field in an existing FDB entry.
• Query Address (QA), to query the FDB for an existing entry.
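As background for software that consumes AU messages, the fixed-window counter described under "Address Update Message Rate Limiting" in Section 11.4.5.3 can be modelled as below. This is an illustrative sketch only: the class name, the millisecond timestamp argument, and the choice to align windows to multiples of 10 ms from time zero are assumptions, not a description of the hardware.

```python
class AuRateLimiter:
    """Software model of a fixed 10 ms AU message rate-limit window."""

    WINDOW_MS = 10

    def __init__(self, limit: int) -> None:
        # The limit ranges from 2 to 512 messages per window, in steps of 2.
        if limit < 2 or limit > 512 or limit % 2 != 0:
            raise ValueError("limit must be an even number from 2 to 512")
        self.limit = limit
        self.window_start_ms = 0
        self.count = 0

    def try_queue(self, now_ms: int) -> bool:
        """Return True if an AU message arriving at now_ms may be queued."""
        elapsed = now_ms - self.window_start_ms
        if elapsed >= self.WINDOW_MS:
            # A new window has begun: advance the window start and reset the counter.
            self.window_start_ms += (elapsed // self.WINDOW_MS) * self.WINDOW_MS
            self.count = 0
        if self.count + 1 > self.limit:
            return False   # treated as if the AUQ were full
        self.count += 1
        return True
```

A caller that receives `False` would behave as described for a full AUQ: no auto-learning or storm-prevention entry is added and any aging pass stalls until a later window.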
The CPU must prepare the AU message according to the hardware AU message format (MAC Update Message Format (Table 401 p. 663)) and write it to the device Address Update register. The device is responsible for calculating the hash and searching for the correct FDB entry to perform the operation.

Configuration
• The CPU writes an AU message to a set of four 32-bit registers: Message from CPU Register0 (Table 404 p. 667), Message from CPU Register1, Message from CPU Register2, and Message from CPU Register3. This can be written as four 32-bit PCI write operations or as a four-word PCI burst write operation.
• The Message From CPU Management (Table 408 p. 668) contains the <NewMessage Trigger> bit, which starts the internal processing of the AU message. The CPU polls the <NewMessage Trigger> bit, waiting for the value to be reset to 0. This indicates that the operation is complete.
• The CPU reads the <AUMsg StatusOK> bit of the Message From CPU Management (Table 408 p. 668). If it is not set, the AU operation failed. An AU message sent by the CPU can fail in the following cases:
  - The message type is NA and the hash chain has reached its maximum length.
  - The message type is QA and the FDB entry does not exist.

11.4.6.2 FDB Table Read/Write Access

The device supports CPU read/write access to the physical FDB entries.
This mechanism can be used to read or write any of the 16K FDB entries (SecureSmart and SecureSmart Stackable devices, and 98DX107: 8K FDB entries).

The read access is useful for building a software shadow of the FDB when in auto-learning mode, as an alternative to the CPU receiving Address Update messages from the device.

The write access option is an alternative to using the FDB address update mechanism (Section 11.4.6.1 "Address Update (AU) Messages from the CPU").

The CPU must specify the FDB entry index to be written. In some cases, an existing entry's index can be learned from the <entry offset> or <entry index> of an AU message that was received from the device. However, if the CPU needs to add, remove, or modify an entry that is not the result of receiving an AU message, the CPU must keep a shadow of the physical FDB. The entry index is based on calculating the hash index and finding the correct entry in the hash chain.

Note
If this mechanism is used to invalidate an entry (i.e., set the <Valid> bit to 0), the entry <Skip> bit must also be set to 0.

Configuration
To read or write an FDB entry, follow the procedure defined in C.13.30.3 "Read and Write Access to the FDB" on page 677.

11.4.7 FDB Source MAC Learning

A Source MAC Address is associated with a location, which can be a device and port number, or a trunk-ID (Section 13. "Port Trunking" on page 281).

FDB source MAC Address learning can be done through an auto-learning mechanism or through CPU-controlled learning. In either case, New Address (NA) update messages can be enabled to notify the CPU of new Source MAC Addresses or existing Source MAC Addresses that have changed location.
If the packet is received on a cascade port and the DSA tag command is FORWARD, the source MAC lookup is based on the DSA tag location and not the cascade port or trunk on the local device.

In the Multilayer stackable switches, routed packets received on a cascade port can be configured to bypass the source MAC lookup process. This may be desirable if the packet MAC SA has already been modified by the previous device to reflect the router MAC Address. The device identifies routed packets according to the DSA tag with <command>=FORWARD and <routed>=1.

The source MAC lookup process is not affected by the Bridge-Bypass configuration, and the learning mechanisms can still be enabled.

A packet whose Source MAC Address is a Multicast address is non-standard and is treated as a security event. The packet is either hard dropped, soft dropped, or forwarded, depending on the configuration.

Packets that generate security breach events are not subject to auto-learning and do not cause New Address update messages to be sent to the CPU (Section 11.6 "Bridge Security Breach Events").

Configuration
To configure whether a packet with a Multicast Source Address is hard dropped, soft dropped, or forwarded, set the <InvalidSA DropMode> and <Drop InvalidSA> bits in the Bridge Global Configuration Register0 (Table 370 p. 643).

To disable source MAC lookup for routed packets, set the <Routed LearningMode> field in the Bridge Global Configuration Register0 (Table 370 p.
643) (relevant for the Multilayer stackable switches only).

11.4.7.1 FDB Maintenance in a Cascaded System

In a cascaded system, there are two common system approaches to FDB management:
• Synchronized FDB Management
• Non-synchronized FDB Management

Synchronized FDB Management

In the synchronized FDB approach, the contents of the FDBs are synchronized across all devices. Each CPU sends addresses learned on its local network ports to the master CPU. The master CPU, in turn, instructs all the CPUs in the system to add the MAC Addresses to the device FDB. Devices do not perform learning on traffic received from cascade ports, as these addresses are learned from the master CPU.

Each device ages only the FDB entries associated with its device number. Each device notifies the master CPU, which, in turn, instructs all the CPUs in the system to remove the aged entry. However, trunk entries can be treated as an exception and aged independently of their associated device numbers. Trunk entries are aged "without-delete" on each device. The master CPU can keep a scoreboard of the devices on which the trunk entry has been aged, and remove the entry only when it has been aged on all devices. For further details see Section 11.4.9 "FDB Entry Aging" on page 237.

Non-Synchronized FDB Management

In the non-synchronized FDB management approach, each device learns Source MAC Addresses independently, based on traffic received on network and cascade ports. The FDB content may differ between devices.

The source MAC lookup process is performed on all packets, even when the bridge engine is bypassed (Section 11.1 "Bypassing Bridge Engine"), except in either of the following cases:
• The packet was received on a cascade port with a DSA tag command not equal to FORWARD.
OR
• The packet was assigned a packet command of HARD DROP prior to the Bridge engine.

Each device ages all FDB entries, regardless of the device association.

11.4.7.2 Source MAC Address Auto-Learning

Source MAC Address auto-learning can be enabled/disabled per port.

On a per-VLAN basis, auto-learning can be disabled by setting the VLAN entry <Unknown Source Address is Security Event>. In this case, a packet with an unknown or changed Source MAC Address is treated as a security breach event (Section 11.6 "Bridge Security Breach Events"). It is not learned and does not generate an NA update message to the CPU.

If the port is enabled for auto-learning and the packet is not a security breach, the device automatically learns new source MAC Addresses, or updates an existing source MAC entry if there is a change of the source port or trunk-ID from which the packet was received.

If this is a new entry (the entry key does not exist), the new address is learned in the first FDB entry in the hash chain with <valid>=0 or <skip>=1, within the maximum hash chain length (Section 11.4.3 "FDB Lookup"). If this is an existing entry but the associated location has changed, the entry is updated with the new location.

If the software needs to keep a shadow of FDB entries, the port can be enabled to send New Address (NA) update messages to the CPU. If an NA update message cannot be sent to the CPU because the Address Update Queue (AUQ) is full, the address is not auto-learned (but the packet is still forwarded as usual). This ensures that the CPU is always fully synchronized with the FDB contents (Section 11.4.5.3 "Address Update Messages to the CPU").
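The auto-learning rules above can be sketched as a small software model operating on one hash chain. It is a simplified illustration, not the device logic. The entry structure, the function name, the string results, and the decision to apply the "NA cannot be sent" rule to location updates as well as to new addresses are assumptions. Static entries are left untouched because Table 53 states that they are not subject to auto-learning.

```python
from dataclasses import dataclass
from typing import Hashable, List, Optional


@dataclass
class Entry:
    valid: bool = False
    skip: bool = False
    static: bool = False
    key: Optional[Hashable] = None       # MAC, or (MAC, VID) in IVL mode
    location: Optional[Hashable] = None  # (device, port) or a trunk-ID


def auto_learn(chain: List[Entry], key: Hashable, location: Hashable,
               na_enabled: bool = False, na_can_be_sent: bool = True) -> str:
    """Learn or update `key` within `chain` (already limited to the max chain length).

    Returns "learned", "updated", "unchanged", or "not_learned".
    """
    blocked = na_enabled and not na_can_be_sent   # e.g. the AUQ is full

    # Step 1: look for an existing entry; the search stops at the first invalid entry.
    for entry in chain:
        if not entry.valid:
            break
        if entry.skip or entry.key != key:
            continue
        if entry.location == location or entry.static:
            return "unchanged"
        if blocked:
            return "not_learned"
        entry.location = location                 # station moved: update location
        return "updated"

    # Step 2: new address; use the first entry with valid=0 or skip=1.
    if blocked:
        return "not_learned"
    for entry in chain:
        if not entry.valid or entry.skip:
            entry.valid, entry.skip, entry.static = True, False, False
            entry.key, entry.location = key, location
            return "learned"
    return "not_learned"                          # hash chain is full
```

In every "not_learned" case the packet itself would still be forwarded; the model covers only the FDB update.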
If a new Source MAC Address is received, but the hash chain in the FDB for this MAC Address has reached its maximum length, the address cannot be auto-learned (but the packet is still forwarded as usual). Although the Source Address cannot be auto-learned, there is a configuration option whether or not to send the CPU an NA update message with the <HashChainTooLong> bit set. The CPU can then overwrite an existing entry in the hash chain with the new address.

If the associated location of an existing entry has changed, but the existing FDB entry for this MAC Address is static, the static address is unmodified and an NA update message is not sent to the CPU. It is a configuration option whether the packet is forwarded normally or is considered a security event (Section 11.4.8 "FDB Static Entries").

The device maintains a counter for the number of packets with new source MAC Addresses that could not be auto-learned, either because the hash chain had reached its maximum length or because an NA message could not be sent due to a full AUQ or AU FIFO. This 32-bit counter satisfies the Learned Entry Discard MIB object counter in the RFC 1493 Bridge MIB.

If a shadow FDB table is required in software, as an alternative to receiving NA messages, the CPU can periodically read the entire contents of the on-chip FDB memory (Section 11.4.6.2 "FDB Table Read/Write Access").

The system designer must be aware of the following issues when using the non-synchronized FDB management approach:
• To present a single FDB for management purposes (e.g. Bridge MIB), the central management CPU must merge the FDB content across the devices.
• Because the device FDBs are not in sync (i.e., an entry may be active on one device but aged out on another), a device may flood a packet whose Source Address has already been learned on some of the devices.
• Stale FDB entries may exist when a MAC Address moves from one device to another. The stale entry remains until the source sends a flooded packet that re-syncs the FDBs with the new location, or until the stale entry is aged out.

Configuration
• To enable auto-learning for traffic received on a given port/VLAN, clear the <AutoLearnDis> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619) and set the <NewSrcAddrIsNotSecurityBreach> field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).
• To enable/disable sending NA update messages to the CPU for traffic received on a given port, set the <NaMsgToCpuEn> field in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To disable auto-learning for traffic received on a given VLAN, clear the <NewSrcAddrIsNotSecurityBreach> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754). When cleared, an unknown source MAC Address for this VLAN generates a security breach event. The address is not auto-learned and an NA update message is not sent to the CPU.
• To enable/disable sending an NA update message when the FDB hash chain has reached its maximum length for a given Source Address, set the <NAMsgOnChainTooLongEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• The number of packets with new Source Addresses that were not learned due to internal reasons can be read from the <Learned Entry Discards> field in the Learned Entry Discards Count Register (Table 415 p. 678).

11.4.7.3 Source MAC Address CPU Controlled Learning
It may be desirable for the CPU to maintain strict control of the FDB content rather than have new Source Addresses automatically learned. In this case, auto-learning must be disabled on the port.

Forwarding of packets with a new source MAC Address, or a source MAC Address that has changed location, is controlled by a per-port Unknown Source MAC Address command. The following Unknown Source MAC commands are supported:
• Forward
• Mirror-to-CPU
• Trap-to-CPU
• Soft Drop
• Hard Drop

If the packet is mirrored or trapped to the CPU, the packet DSA tag <CPU code> is set to UNKNOWN SOURCE MAC ADDRESS TRAP/MIRROR.

The Unknown Source MAC command is useful in secure applications that do not allow packets from unknown Source Addresses to be forwarded (e.g., MAC-based authentication applications).

The CPU can be notified of the new source MAC Address, or one that has changed location, by one of the following methods:

New Address (NA) update message
• If NA update messages to the CPU are enabled on the port, an NA update message is sent to the CPU for each packet with a new or changed source MAC Address (11.4.5.3 "Address Update Messages to the CPU" on page 228).
• See "NA Storm Prevention in Controlled-Learning" on page 235 for preventing redundant NA messages for the same address.
• The NA message contains the entry offset (relative to the hash index for this address) where the entry would have been learned if auto-learning had been enabled.

Trapping/Mirroring the packet to the CPU
• This is accomplished by setting the port <Unknown Source MAC Command> to Trap or Mirror.

The CPU can then authenticate the source MAC Address and explicitly add the approved address to the FDB. CPU updating of the FDB is described in Section 11.4.6 "CPU Update and Query of the FDB".

Some secure environments consider any new source MAC Address seen on a port as a security breach. These packets should not be forwarded and the CPU should be informed of violations. In this case, the port can be configured with:
• Unknown Source MAC command set to DROP
• Unknown Source MAC is Security Breach Event

A Security Breach event is generated for each packet with a new or changed source MAC Address (Section 11.6 "Bridge Security Breach Events").

Configuration
• To disable auto-learning on a port, set the <AutoLearnDis> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable NA messages to the CPU for unknown/changed source MAC Addresses received on a port, set the <NaMsgToCpuEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To configure the per-port Unknown Source MAC Address command, set the <UnkSrcAddrCmd> field in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619) accordingly.
• To enable Security Breach Events for unknown/changed source MAC Addresses received on a port, set the <NewSrcAddrIsSecurityBreach> field in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619) accordingly.
• To enable Security Breach Events for unknown/changed source MAC Addresses received on a VLAN, clear the <NewSrcAddrIsNotSecurityBreach> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754). When this field is clear, unknown/changed source MAC Addresses on this VLAN generate Security Breach Events and do not generate NA messages to the CPU, even if enabled on the port.

NA Storm Prevention in Controlled-Learning
If a rapid stream of packets with an unknown source MAC Address, or one with a changed location, is received on a port with NA messages to the CPU enabled, the CPU may receive multiple NA messages for this MAC Address. This is due to the time delay required for the CPU to receive the first NA message and program the Source Address into the FDB. During this period, NA messages continue to be sent to the CPU for each packet with an unknown/changed source.

To avoid this behavior, the “NA storm prevention” mechanism can be enabled. This mechanism auto-learns a storm prevention (SP) FDB entry the first time the Source Address is encountered and an NA message is sent to the CPU. Subsequently, when the Source Address FDB lookup finds the SP entry, the SP entry indicates to the device that an NA has already been sent and that a new NA message must not be sent.

An FDB MAC destination lookup that matches an SP entry is treated as an unknown Unicast, even though a match has occurred.
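The storm prevention behavior described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the class `StormPreventFdb`, its dictionary-based table, and the `na_messages` list (standing in for the AUQ) are hypothetical, and the model covers only the new-address case, not a changed location, a full AUQ, or a full hash chain.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (MAC address, VLAN-ID)


class StormPreventFdb:
    """Toy FDB: maps (MAC, VLAN) to {'sp': bool, 'location': str or None}."""

    def __init__(self) -> None:
        self.entries: Dict[Key, dict] = {}
        self.na_messages: List[Key] = []  # stands in for the AUQ

    def source_lookup(self, mac: str, vlan: int) -> None:
        """Source-MAC lookup on a port with NA messages and SP enabled."""
        key = (mac, vlan)
        if key not in self.entries:
            # First sighting: create the SP entry and send exactly one NA.
            self.entries[key] = {"sp": True, "location": None}
            self.na_messages.append(key)
        # An SP (or regular) entry was found: no further NA is sent.

    def destination_lookup(self, mac: str, vlan: int) -> Optional[str]:
        """Return the egress location, or None for unknown Unicast."""
        entry = self.entries.get((mac, vlan))
        if entry is None or entry["sp"]:
            return None  # an SP match is still treated as unknown Unicast
        return entry["location"]

    def cpu_overwrite(self, mac: str, vlan: int, location: str) -> None:
        """CPU replaces the SP entry with a normal entry."""
        self.entries[(mac, vlan)] = {"sp": False, "location": location}
```

A burst of packets from the same unknown source therefore produces one NA message in this model, and traffic addressed to that MAC is flooded until the CPU installs a normal entry.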
If the SP entry to be created is a new entry (entry key does not exist), the SP entry is auto-learned in the first FDB entry in the hash chain with <valid>=0 or <skip>=1, within the maximum chain length. The FDB entry has the <NA Storm Prevent Flag> bit set.

If the SP entry to be created is an existing entry, but the associated location has changed, the entry is converted to become an SP entry.

The NA update message sent to the CPU indicates that an SP entry was created in the FDB by setting the <NA Storm Prevent Flag> bit (Section 11.4.2 "FDB Entry"). The CPU can then overwrite the SP entry with a non-SP entry associated with a given port or trunk-ID (11.4.6 "CPU Update and Query of the FDB"). If the direct FDB entry access mechanism is used (11.4.6.2 "FDB Table Read/Write Access"), the NA entry contains the SP offset (relative to the hash index for this address) to directly update the SP entry.

If an NA message cannot be sent to the CPU due to the AUQ being full, the SP entry is not created in the FDB.

If the SP entry cannot be created in the FDB because the hash chain for this address has reached its maximum length, there is a configuration option whether to send the CPU an NA update message. If enabled, the NA update message has the <HashChainTooLong> bit set and the <NA Storm Prevent Flag> bit cleared. The CPU can then overwrite an existing entry in the hash chain with the new address. In this hash-chain-full scenario, multiple NA messages for the same address are sent to the CPU.

The SP entry is always created non-static, with the <age> bit set to 1. It is unconditionally aged-with-delete after two aging passes. The <age> bit of SP entries is not subject to auto-refresh when packets with this MAC SA are received. It is configurable whether to send the CPU an Aged Address (AA) update message for SP entries that are aged out (Section 11.4.9 "FDB Entry Aging").

(It is not expected that SP entries will exist long enough to be aged out. Typically, the CPU receives an NA message indicating that the SP entry was created and immediately overwrites the SP entry with a normal FDB entry. The aging mechanism is intended to handle the case where, for some reason, the CPU does not overwrite the SP entry within the age time. In this case the entry is deleted, and the next packet with this new source MAC Address again causes an SP entry to be created and an NA update message to be generated.)

Configuration
• To enable NA storm prevention for traffic received on a given port, set the <NAStormPreventionEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable/disable sending an NA update message when the FDB hash chain has reached its maximum length for a given Source Address, set the <NAMsgOnChainTooLongEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• To enable AA update messages to be sent for aged-out SP entries, set the <SPAAMsgToCPUEn> bit in the FDB Global Configuration Register (Table 399 p. 659).

11.4.8 FDB Static Entries
Static FDB entries are created by the CPU through the Address Update from CPU mechanism (Section 11.4.6 "CPU Update and Query of the FDB").

Static FDB entries are skipped by the FDB aging process (Section 11.4.9 "FDB Entry Aging"); however, static entries still have the <age/refresh> bit set when the Source Address lookup matches the entry.

Static entries are optionally subject to the FDB delete and transplant mechanisms (Section 11.4.11 "Deleting FDB Entries" and Section 11.4.12 "FDB Unicast Entry Transplanting").
In the event that a source MAC FDB lookup matches a static MAC entry, but the packet is received on a different location than that of the entry, there is a configuration option to treat the packet as a security breach (Section 11.6 "Bridge Security Breach Events"), in which case the packet is either hard or soft dropped.

In the Bridge Global Configuration Register0 (Table 370 p. 643):
• To configure the mode where a source MAC Address received on a location other than the static FDB entry configuration constitutes a security breach event, set the <MoveStaticAddressIsSecurityBreach> bit.
• If the <MoveStaticAddressIsSecurityBreach> bit is set, then to set whether the packet is hard- or soft-dropped, set the <MoveStaticDropMode> bit.

11.4.9 FDB Entry Aging
Automatic FDB aging allows non-active entries to be auto-deleted and/or an Aged Address (AA) update message to be sent to the CPU.

Static entries and IPv4/v6 Multicast entries are not subject to FDB aging.

The FDB entry <Age/Refresh> bit is always refreshed to 1 when the source MAC Address lookup matches an FDB entry. This is done for all FDB entry types, including static and IPv4/v6 entries.

For each entry processed by the FDB aging pass, the <Age> bit is examined. If the <Age> bit is 1, it is reset to 0. If the <Age> bit is already 0, then the entry is marked for aging.

Depending on the configuration, an entry marked for aging may generate the following actions:
• If Aged Address (AA) update messages to the CPU are enabled, an AA update message is sent to the CPU for the entry marked for aging.
• If the Aging mode is Age-with-delete, the entry marked for aging has its <Skip> bit set. The entry <Valid> bit is cleared if it is followed by an entry whose <Valid> bit is clear.
• If the Aging mode is Age-without-delete, the entry is not modified. The AA update message can serve as a trigger for the CPU to remove the FDB entry.

If AA messages are sent to the CPU and the CPU Address Update Queue (AUQ) fills, the aging pass is paused at the entry which generated the address update, and proceeds only when the AUQ has room for new AA entries. This ensures that the CPU is always synchronized with the FDB.

As a configuration option, an FDB aging pass can be invoked automatically every S seconds, where S ranges from 10 to 630 seconds, in steps of 10. Alternatively, the CPU can trigger a single FDB aging pass at any frequency. A status bit can be polled to indicate the completion of an aging pass.

The aging pass can be configured to age all the non-static and non-IPv4/6 MAC Unicast/Multicast FDB entries, or it may be restricted to act only on the subset of those entries associated with a configured VLAN-ID and/or device number.

A common approach in a multi-device cascaded system is to keep the contents of all FDBs fully synchronized with each other (Section 11.4.7.1 "FDB Maintenance in a Cascaded System"). This approach is realized by having each device age (with or without delete) only its entries that are associated with the local device number. The central CPU then instructs the entry to be removed on all devices in the system.

If the aging pass is restricted to FDB entries associated with a device number, it is possible to configure whether this restriction is applied to trunk entries. The motivation for this is to age trunk entries regardless of whether they were learned on the local device or not.
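The two-pass nature of the <Age> bit handling can be illustrated with a short model. This is a sketch under stated assumptions, not the device algorithm: the `Entry` class and `aging_pass` function are hypothetical, the returned list stands in for AA update messages, and the model ignores the <Valid> bit handling, the AUQ pause, the VLAN/device filters, and IPv4/6 Multicast entries.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """Hypothetical software model of an FDB entry, reduced to aging state."""
    mac: str
    age: bool = True      # refreshed to True whenever the source MAC matches
    static: bool = False
    skip: bool = False


def aging_pass(fdb: List[Entry], with_delete: bool) -> List[str]:
    """Run one aging pass and return the MACs marked for aging."""
    marked = []
    for entry in fdb:
        if entry.skip or entry.static:
            continue  # deleted and static entries are not aged
        if entry.age:
            entry.age = False          # first pass without traffic: clear the bit
        else:
            marked.append(entry.mac)   # bit already clear: entry is marked for aging
            if with_delete:
                entry.skip = True      # Age-with-delete sets <Skip>
    return marked
```

An entry that receives no traffic is therefore marked on the second pass after its last refresh, which is why the effective age-out time lies between one and two aging intervals.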
One approach is to have all devices age trunk entries in age-without-delete mode, and to remove the entry on all devices only when the trunk entry has aged on all devices.

Configuration
• To globally enable/disable sending of Aged Address (AA) and Transplant Address (TA) update messages to the CPU, set/clear the <AAandTAMsgToCPUEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• To configure the aging mode to Age-with-delete or Age-without-delete, set the <ActionMode> field in the FDB Action0 Register (Table 409 p. 669) accordingly.
• To configure the automatic aging time, set the <ActionTimer> field in the FDB Action0 Register (Table 409 p. 669) accordingly.
• To select automatic or Triggered aging mode, set the <TriggerMode> bit in the FDB Action0 Register (Table 409 p. 669) accordingly.
• When in Triggered aging mode, trigger an aging pass by setting the <AgingTrigger> bit in the FDB Action0 Register (Table 409 p. 669). This field can be polled until it returns to 0, indicating that the aging pass is complete.
• To restrict the aging to entries with a specific VLAN-ID, set the <ActVLAN> field to the VID and <ActVLANMask> to 0xFFF in the FDB Action1 Register (Table 410 p. 671). To age regardless of the entry VLAN-ID, set these fields to 0.
• To restrict the aging to entries with a specific device number, set the <ActDev> field to the device number and <ActDevMask> to 0x1F (all ones) in the FDB Action1 Register (Table 410 p. 671). To age regardless of the entry device number, set these fields to 0.
• When aging by device or VLAN, the port/trunk filter should be disabled. Clear the <ActTrunkPort> and <ActTrunkPortMask> fields in the FDB Action1 Register (Table 410 p. 671).
• To age trunk entries regardless of the device number restriction, set the <AgeOutAllDevOnTrunk> bit in the FDB Action2 Register (Table 411 p. 671).

11.4.10 Removal of FDB Entries on Hot-Removal
When aging each MAC Address table entry, a check is made to verify that the entry device number is a registered device in the system. If it is not, the entry is removed.

Upon detecting the removal of a device from the system, the CPU clears the registration for the removed device number and triggers an aging pass.

It is a configurable option whether to remove static FDB entries whose device number is not registered. Static entries are not automatically removed by this operation unless this option is enabled.

Aged Address (AA) update messages are not sent to the CPU for entries removed due to an invalid device number.

Configuration
• For each device number that resides in the system, set the corresponding device number bit in the Device Table (Table 369 p. 642).
• To enable removal of static FDB entries whose device number is not registered, set the <RemoveStatOfNonExistDevEn> bit in the FDB Action2 Register (Table 411 p. 671).

11.4.11 Deleting FDB Entries
A delete pass is always triggered by the CPU.
The delete pass may delete all FDB entries, or it may be restricted to act only on a configured subset of the entries associated with any of the following filters:
• VLAN-ID
• Trunk-ID
• Port/Device number
• Device number (regardless of port)

These filters enable quick, easy removal of FDB entries when a VLAN or trunk is deleted, a port is disabled, or a link goes down.

The delete mechanism has a configurable option to delete or not delete static entries.

To avoid processing the AA messages, the CPU can disable them prior to triggering the delete operation, and update its own shadow FDB accordingly.

If AA messages are sent to the CPU and the CPU Address Update Queue (AUQ) fills, the delete pass pauses at the entry that generated the address update and proceeds only when the AUQ has room for new AA entries. This ensures that the CPU is always synchronized with the FDB.

FDB deleting utilizes the same hardware mechanism as the aging mechanism, with the main difference being that each FDB entry included in the FDB pass filter is deleted regardless of the state of its <Age> bit.

Configuration
• To globally enable/disable sending of Aged Address (AA) and Transplant Address (TA) update messages to the CPU, set the <AAandTAMsgToCPUEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• To configure the action to delete entries, set the <ActionMode> field in the FDB Action0 Register (Table 409 p. 669) accordingly.
• To configure the action to triggered mode, set the <TriggerMode> bit in the FDB Action0 Register (Table 409 p. 669).
• To configure the action to include or not include static entries in the delete pass, set the <StaticAddrDelEn> bit in the FDB Action0 Register (Table 409 p. 669).
• Trigger the delete pass by setting the <AgingTrigger> bit in the FDB Action0 Register (Table 409 p. 669). This field can be polled until it returns to 0, indicating that the delete pass is complete.
• To enable a VLAN-ID filter, set the <ActVLAN> field to the VID and <ActVLANMask> to 0xFFF in the FDB Action1 Register (Table 410 p. 671). To disable a VLAN-ID filter, clear these fields.
• To enable a Trunk-ID filter, set the <ActIsTrunk> and <ActTrunkMask> bits, <ActTrunkPort> to the Trunk-ID, and <ActTrunkPortMask> to 0x7F (all ones) in the FDB Action1 Register (Table 410 p. 671). To disable a trunk-ID filter, clear all of these fields.
• To enable a port/device filter, clear the <ActIsTrunk> bit, set the <ActTrunkMask> bit, set <ActTrunkPort> to the port number, <ActTrunkPortMask> to 0x7F (all ones), <ActDev> to the device number, and <ActDevMask> to 0x1F (all ones) in the FDB Action1 Register (Table 410 p. 671). To disable a port/device filter, clear all of these fields.
• To enable a device number filter (regardless of the port), clear <ActIsTrunk>, <ActTrunkMask>, <ActTrunkPort>, and <ActTrunkPortMask>. Set <ActDev> to the device number, and <ActDevMask> to 0x1F (all ones) in the FDB Action1 Register (Table 410 p. 671). To disable the device number filter, clear these fields.

11.4.12 FDB Unicast Entry Transplanting
Address transplanting allows Unicast MAC FDB entries associated with a given device/port or trunk-ID to be reassociated with a new device/port or trunk-ID.
FDB address transplanting is intended to efficiently handle the relocation of a set of Unicast MAC Address table entries from one device/port to another. This feature is very useful for implementing IEEE 802.1w Rapid Reconfiguration. For example, when an active link fails and the previously blocked redundant link becomes active, all the addresses learned on the failed link can quickly be relearned to reside on the activated redundant link.

The CPU configures the “old” device/port or trunk-ID and the “new” device/port or trunk-ID to which the entry is reassociated. A VLAN filter can be enabled to restrict the transplanting to the “old” device/port or trunk-ID associated with a given VLAN.

The transplant address mechanism has a configurable option to transplant or not transplant static entries.

If Transplant Address (TA) update messages to the CPU are enabled, a TA update message is sent to the CPU for each transplanted entry.

If TA messages are sent to the CPU and the CPU Address Update Queue (AUQ) fills, the transplant pass pauses at the entry that generated the address update, and proceeds only when the AUQ has room for new TA entries. This ensures that the CPU is always synchronized with the FDB.

Configuration
• To globally enable/disable sending of Aged Address (AA) and Transplant Address (TA) update messages to the CPU, set the <AAandTAMsgToCPUEn> bit in the FDB Global Configuration Register (Table 399 p. 659).
• To configure the action to transplant entries, set the <ActionMode> field in the FDB Action0 Register (Table 409 p. 669) accordingly.
• To configure the action to triggered mode, set the <TriggerMode> bit in the FDB Action0 Register (Table 409 p. 669).
• To configure the action to include or not include static entries in the transplant pass, set the <StaticAddrTransEn> bit in the FDB Action0 Register (Table 409 p. 669).
• Trigger the transplant pass by setting the <AgingTrigger> bit in the FDB Action0 Register (Table 409 p. 669). This field can be polled until it returns to 0, indicating that the transplant pass is complete.
• To enable a VLAN-ID filter, set the <ActVLAN> field to the VID and <ActVLANMask> to 0xFFF in the FDB Action1 Register (Table 410 p. 671). To disable a VLAN-ID filter, clear these fields.

11.5 Bridge Multicast (VIDX) Table
In addition to the 4K VLAN groups, the device supports 4K bridge Multicast groups. A bridge Multicast group is identified by a group index called VIDX.

A packet forwarded to a Multicast group is assigned a VIDX. A registered Multicast packet is assigned a VIDX from the bridge Multicast FDB entry (Section 11.4.1.2 "FDB Multicast Entry"). Unknown Unicast and unregistered Multicast and Broadcast packets are assigned an implicit VIDX of 0xFFF.

At initialization the Multicast table is cleared to all zeros, with the exception of the entry for VIDX 0xFFF, which is initialized to have all ports as members. This allows Multicast packets to be flooded to all the VLAN ports. The Multicast table entry 0xFFF should not be modified by the CPU.

The VIDX serves as a direct index into the on-chip bridge Multicast table. Each entry in the bridge Multicast table contains a port map of the Multicast group member ports on the local device.

To perform bridge VLAN egress filtering, the VIDX port map is AND’ed with the VID port map, ensuring that the egress ports are members of the packet VLAN group assignment.
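The VLAN egress filtering described for the bridge Multicast (VIDX) table is a bitwise AND of two port maps. The following sketch illustrates the arithmetic only; the dictionaries, the function name `egress_port_map`, and the 8-port width are hypothetical choices made for readability and do not reflect the device's table layout.

```python
NUM_PORTS = 8                       # hypothetical port count for this example
ALL_PORTS = (1 << NUM_PORTS) - 1    # bit n set means port n is a member
FLOOD_VIDX = 0xFFF                  # implicit VIDX for unknown/unregistered traffic

# Multicast table after initialization: all zeros except VIDX 0xFFF.
multicast_table = {FLOOD_VIDX: ALL_PORTS}
# VLAN membership port maps, indexed by VID (filled in by the example below).
vlan_table = {}


def egress_port_map(vidx: int, vid: int) -> int:
    """AND the VIDX port map with the VID port map."""
    vidx_ports = multicast_table.get(vidx, 0)
    vid_ports = vlan_table.get(vid, 0)
    return vidx_ports & vid_ports


vlan_table[10] = 0b00001111        # VLAN 10 contains ports 0-3
multicast_table[5] = 0b00110101    # group 5 has members on ports 0, 2, 4, 5

registered = egress_port_map(5, 10)         # ports 0 and 2 only
flooded = egress_port_map(FLOOD_VIDX, 10)   # every port in VLAN 10
```

Because VIDX 0xFFF has all ports as members, the AND with the VLAN port map reduces flooding to exactly the VLAN member ports.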
M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Bridge Engine The Multicast table entry format is defined in C.16.18.2 "Multicast Group Table Entry Format" on page 761. AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 CPU access to the Multicast table is defined in C.16.18.4 "VLT Tables Access Control Registers" on page 765. Bridge Security Breach Events The following bridge security breach events are defined: • VLAN Range Filtering (Section 11.2.4.4) • VLAN Ingress Filtering (Section 11.2.4.1) • Source MAC Address is Multicast (Section 11.4.7) • FDB DA or SA Command is Hard or Soft Drop (Section 11.4.4) • Invalid VLAN (Section 11.2.4.3) • Moved Static Address (Section 11.4.8) • Unknown Source Address (Section 11.4.7.3) Security breach packets are not subject to auto-learning and they do not generate New Address (NA) update messages to the CPU. See the specific security breach event type regarding the forwarding/dropping behavior. Security breach packets update the Security Breach counters and Security Breach Status register, as described in the following subsections. 11.6.1 Security Breach Status M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The Security Breach Status registers provide the CPU with details about packets that generate security breach events. When a security event occurs, an Update Security Breach interrupt is generated and the Security Breach Status register is updated with the offending packet’s Source MAC Address, ingress port, VLAN assignment, and security breach type. The Security Breach Status register remains valid until the CPU reads the status content. 
From the time the interrupt is raised until the CPU reads the entire contents of the status registers, subsequent security breach events update only the security breach counters. They do not update the status register or generate interrupts.

The recommended software handling of the Update Security Breach interrupt is as follows:
1. The <Update SecurityBreach RegisterInt> field is set in the Bridge Interrupt Cause Register (Table 581 p. 808).
2. The interrupt service routine (ISR) clears the interrupt and sets the corresponding mask bit in the Bridge Interrupt Mask Register (Table 582 p. 808).
3. The event is passed to a software task for asynchronous handling.
4. The task reads the Security Breach Status Register0 (Table 376 p. 651), Host Outgoing Packets Count Register (Table 385 p. 654), and Security Breach Status Register2 (Table 378 p. 651), and then clears the corresponding mask bit in the Bridge Interrupt Mask Register (Table 582 p. 808).
5. A new Update Security Breach interrupt can now occur.

11.6.2 Security Breach Counters

There are two security breach counters:
• Global security breach counter
• Port/VLAN security breach counter

Every security breach event increments the global security breach counter.

The port/VLAN security breach counter is configurable to count security breach events on either a specific port or a specific VLAN.

The security breach counters are 32-bit. When a counter reaches its maximum value, it wraps to 0 and an interrupt is generated.

Configuration
• The global security breach counter is read from the <GlobalSecurity BreachDropCnt> field in the Global Security Breach Filter Counter (Table 373 p. 649).
• The port/VLAN security breach counter is read from the <Port/VLAN SecurityBreach DropCnt> field in the Port/VLAN Security Breach Drop Counter (Table 374 p. 650).
• To configure the port/VLAN security breach counter to monitor security breach events from either a configured port or a configured VLAN, set the <Port/VLAN SecurityBreach DropCnt> field in the Bridge Global Configuration Register1 (Table 371 p. 647).
• To configure the VLAN to be monitored by the port/VLAN security breach counter, set the <SecurityBreach DropCntVID> field in the Bridge Global Configuration Register1 (Table 371 p. 647).
• To configure the port to be monitored by the port/VLAN security breach counter, set the <SecurityBreach DropCntPort> field in the Bridge Global Configuration Register1 (Table 371 p. 647).

11.7 IPv4/6 Multicast (S, G, V) Bridging

IGMP and MLD are the protocols used by IPv4 and IPv6 nodes, respectively, to report their Multicast group membership to neighboring Multicast routers.

IGMP or MLD snooping is a common technique used to conserve bandwidth on switch ports where no node has expressed interest in receiving the specific Multicast traffic. This is in contrast to the normal switch behavior, where Unregistered Multicast traffic is forwarded to all bridge ports in the VLAN group.

IGMPv1/2 and MLDv1 support Join requests to receive traffic for a given IPM Group, i.e., (*, G). This is known as Any-Source Multicast (ASM).

IGMPv3 and MLDv2 support Join requests to receive traffic for an IPM Group from a list of senders, i.e., a list of (S, G) pairs. This is known as Source-Specific Multicast (SSM).

MAC Multicast FDB entries can be used for limiting the forwarding of IPv4/6 Multicast groups, but this approach suffers from two problems:
• MAC group address ambiguity: There is a many-to-one mapping between IPv4/6 group addresses and the corresponding MAC group address.
• No source-based lookup: The Bridge lookup is based on MAC DA + VLAN and does not include the IP source.

To overcome these limitations, the FDB destination lookup can be based on the packet source IP (S) and/or destination IP group address (G), and the packet VLAN (V).

IPv4/6 Multicast bridging is enabled on a per-VLAN basis. If IPv4/6 Multicast-based bridging is disabled for the VLAN, the IPv4/6 packet is forwarded based on its MAC DA FDB lookup.

The IPv4/6 Multicast mode is set on a per-VLAN basis to either:
• (S, G, V): This mode is used for SSM snooping. The FDB lookup is based on the packet source IP (SIP), group destination IP (DIP), and VLAN-ID.
or
• (*, G, V): This mode is used for ASM snooping. The FDB lookup is based on the packet group destination IP (DIP) and VLAN-ID.

Based on IGMP/MLD report messages trapped to the CPU (Section 11.8.4 "IGMP"), the CPU configures a Multicast table entry with the appropriate port membership (11.5 "Bridge Multicast (VIDX) Table"), and then configures an FDB (S, G, VLAN) IPv4 or IPv6 Multicast entry associated with the VIDX index for this group (Section 11.4.6 "CPU Update and Query of the FDB").

The FDB <SIP> and <DIP> are 32-bit fields. For IPv4 Multicast bridging, the full IPv4 address is used in the FDB match lookup.
However, for IPv6, which uses a 128-bit address, four octets must be selected from the IPv6 address to be used for the FDB match lookup.

The IPv6 octets used for the <SIP> and <DIP> FDB match lookup in IPv6 Multicast bridging are configured globally. For each of the IPv6 SIP and DIP, four of the sixteen octets in the IPv6 address are selectable.

The IPv6 SIP default octet selection is octets 15, 14, 13, and 10 (where octet 15 is the least significant octet of the IP address). In cases where the IPv6 SIP is derived from the node's MAC Address (RFC 2464), these octets contain the four least significant octets of the node's MAC Address. The assumption is that this four-octet selection provides a very high probability of uniqueness.

The IPv6 DIP default octet selection is octets 15, 14, 13, and 12. This corresponds to the least significant octets of the IPv6 group address. Note that, according to RFC 3307 (Allocation Guidelines for IPv6 Multicast Addresses), IPv6 group IDs are allocated from the least significant 32 bits of the 128-bit group address. Based on this assumption, the selection of the least significant four octets provides an exact match of the group address.

In the event that multiple IPv6 Multicast flows collide on the same FDB (S, G, V) entry because only 32 of the 128 bits of S and G are represented, the entry's associated VIDX group must be configured with a superset of the port members for all the colliding IPv6 Multicast flows.

Configuration
In the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754):
• To enable/disable IPv4 Multicast-based bridging, set/clear the <IPv4IPM BridgingEn> bit.
• To enable/disable IPv6 Multicast-based bridging, set/clear the <IPv6IPM BridgingEn> bit.
• To set the IPv4 Multicast bridging mode to ASM or SSM, set the <IPv4IPM Bridging Mode> bit accordingly.
• To set the IPv6 Multicast bridging mode to ASM or SSM, set the <IPv6IPM BridgingMode> bit accordingly.

• To configure the IPv6 DIP and SIP octets to be selected for the FDB IPv6 Multicast lookup, set the fields in the IPv6 MC Bridging Bytes Selection Configuration Register (Table 380 p. 653) accordingly.
• To configure FDB IPv4 Multicast entries, set the <MACEntry Type> field in the FDB Entry (Table 412 p. 673) accordingly. If this is a (*, G, V) entry, set the FDB <SIP> to 0; otherwise, set it to the source IPv4 address. Set the FDB entry <DIP[31:16]> in the MAC Update Message Format (Table 401 p. 663) to the IPv4 group address. Add the entry to the FDB according to Section 11.4.6 "CPU Update and Query of the FDB" on page 230.
• To configure FDB IPv6 Multicast entries, set the <EntryType> field to 2 in the MAC Update Message Format (Table 401 p. 663). If this is a (*, G, V) entry, clear the <SIP[31:28]> field; otherwise, set it to the selected four IPv6 SIP octets. Set the <DIP[31:16]> field to the selected four IPv6 group address octets. Add the entry to the FDB according to Section 11.4.6 "CPU Update and Query of the FDB" on page 230.

If the IPv4/6 Multicast mode is (*, G, VLAN), the FDB <SIP> field must be set to 0.

Note
IGMPv3 and MLDv2 also support Join requests for an IPM Group excluding a list of senders (zero or more). This is known as Source-Filtered Multicast (SFM). There is no bridging support of IPv4/6 Multicast for SFM IGMP/MLD reports.
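The IPv6 octet selection described in this section can be illustrated with a minimal Python sketch. It assumes that the first selected octet is packed into the least significant byte of the 32-bit <SIP>/<DIP> value. The specification text does not state the packing order, so this ordering, along with the helper names select_octets and fdb_ipv6_mc_key, is hypothetical and serves only to show how the default selections behave.

```python
import ipaddress

# Default octet selections from the text. Octet 15 is the least significant
# octet of the IPv6 address, so octet n corresponds to packed[n].
DEFAULT_SIP_OCTETS = (15, 14, 13, 10)
DEFAULT_DIP_OCTETS = (15, 14, 13, 12)


def select_octets(addr, octets):
    """Pack four selected IPv6 octets into a 32-bit value.

    Assumption (not from the specification): the first listed octet
    lands in the least significant byte of the result.
    """
    packed = ipaddress.IPv6Address(addr).packed
    value = 0
    for shift, octet in enumerate(octets):
        value |= packed[octet] << (8 * shift)
    return value


def fdb_ipv6_mc_key(sip, dip, vlan, ssm=True):
    """Build an illustrative (S, G, V) or (*, G, V) lookup key tuple."""
    # For (*, G, V) entries the <SIP> field must be 0.
    s = select_octets(sip, DEFAULT_SIP_OCTETS) if ssm else 0
    g = select_octets(dip, DEFAULT_DIP_OCTETS)
    return (s, g, vlan)


# Source address derived from MAC 00-11-22-33-44-55 (RFC 2464 interface ID).
key_a = fdb_ipv6_mc_key("fe80::211:22ff:fe33:4455", "ff3e::8000:1234", vlan=10)
# A different group address that shares the same low 32 bits.
key_b = fdb_ipv6_mc_key("fe80::211:22ff:fe33:4455", "ff3e::1:8000:1234", vlan=10)
print([hex(x) for x in key_a], key_a == key_b)
```

In this sketch the two group addresses produce the same key, which is the collision case described above: the VIDX group of the shared entry would need to carry the superset of both flows' port members.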
11.8 Control Traffic Trapping/Mirroring to the CPU

Protocols implemented in software require specific control traffic to be trapped or mirrored to the CPU. The device provides trapping/mirroring mechanisms for many well-known control traffic protocols. Other control traffic that does not have specific support can be trapped/mirrored to the CPU by the Policy engine.

11.8.1 Unicast Management Traffic (MAC-to-Me)

Unicast management traffic for the system management entity arrives with a MAC DA set to the management MAC address on a given VLAN interface. Typical examples of Unicast management traffic are:
• IP management protocols like SNMP, HTTP, ICMP, etc.
• ARP Reply packets

To direct Unicast management traffic to the CPU, the FDB is configured with an entry as follows:
• <MAC Address> = management Unicast MAC Address
• <VLAN> = management VLAN interface
• <DA Command> = FORWARD
• <port number> = CPU port 63 (Section 7.1 "CPU Port Number" on page 102)
• <static> = 1

Packets with a MAC DA matching the management Unicast MAC Address are sent to the CPU with the CPU code BRIDGE PACKET FORWARD.

For the Multilayer stackable switches, the Router MAC Address is configured in the FDB, as described in Section 12.4.1 "FDB Router MAC Entry" on page 270.

See Section 11.4.6.2 "FDB Table Read/Write Access" on page 231 for details on how the FDB is updated.

Notes
• When the FDB entry <port number> is set to the CPU port 63, the FDB entry <device number> is not relevant, as the packet is forwarded to the target device according to the CPU code table configuration, as described in Section 7.2.1 "CPU Code Table" on page 103.
• As an alternative to setting the <DA Command> to FORWARD and the <port number> to 63, the <DA Command> can be set to TRAP, and the packet is then sent to the CPU with CPU code FDB_ENTRY_TRAP/MIRROR. However, in devices that support L3 routing, the TRAP command precludes the packet from being processed by the router engine.

11.8.2 IEEE Reserved Multicast

IEEE 802.1D/Q defines the following reserved Multicast MAC ranges:
• Bridge Standard Protocols: 01-80-C2-00-00-00 - 01-80-C2-00-00-0F
• Bridge GARP Applications: 01-80-C2-00-00-20 - 01-80-C2-00-00-2F

Common addresses that fall into these ranges are listed in Table 55.

Table 55: IEEE Reserved Multicast Addresses

Protocol                               Identified by
802.3ad Slow Protocols (e.g., LACP)    DA=01-80-C2-00-00-02
802.1X PAE address                     DA=01-80-C2-00-00-03
802.1Q GVRP                            DA=01-80-C2-00-00-21
802.1D GMRP                            DA=01-80-C2-00-00-20

As a generic mechanism to trap or mirror the above reserved IEEE ranges and possibly future IEEE protocols, the device supports a 255-entry command table that is applied to all packets with a MAC destination Multicast address in the range 01-80-C2-00-00-01 through 01-80-C2-00-00-FF. The command table is indexed by the last byte of the Multicast address. The command for each entry can be set to SOFT DROP, FORWARD, MIRROR, or TRAP. If it is set to TRAP or MIRROR, the packet is sent to the CPU with a DSA tag <CPU code> set to IEEE RESERVED MULTICAST ADDRESS TRAP/MIRROR.

Note
Although Spanning Tree BPDUs with a MAC Destination Address of 01-80-C2-00-00-00 are part of the IEEE reserved Multicast range, this address is excluded from this mechanism. BPDUs are handled separately, according to the Spanning Tree enable state of the port (Section 11.3.1 "Trapping BPDUs" on page 221).

Configuration
To configure the IEEE Reserved Multicast command, set the respective IEEE Reserved Multicast table entry (Multicast Group<n> Entry (0<=n<4096) (Table 521 p. 761)).

11.8.3 Cisco Layer 2 Control Multicast

Cisco has reserved the IEEE Multicast MAC range 01-00-0C-XX-XX-XX for Cisco proprietary protocols. Common Cisco protocols in this Multicast range are listed below:

Table 56: Cisco Proprietary L2 Protocols

Message Type                            MAC MC DA            SNAP HDLC Protocol Type
Port Aggregation Protocol               01-00-0C-CC-CC-CC    0x0104
Spanning Tree PVSTP+                    01-00-0C-CC-CC-CD    0x010B
VLAN Bridge                             01-00-0C-CD-CD-CE    0x010C
Unidirectional Link Detection (UDLD)    01-00-0C-CC-CC-CC    0x0111
Cisco Discovery Protocol                01-00-0C-CC-CC-CC    0x2000
Dynamic Trunking (DTP)                  01-00-0C-CC-CC-CC    0x2004
STP Uplink Fast                         01-00-0C-CD-CD-CD    0x200A
Inter Switch Link (ISL)                 01-00-0C-00-00-00    N/A
VLAN Trunking (VTP)                     01-00-0C-CC-CC-CC    0x2003

If globally enabled, all packets in the Cisco MAC Multicast range can be trapped or mirrored to the CPU with a CPU code CISCO CONTROL MULTICAST MAC ADDR TRAP/MIRROR.

Configuration
To trap or mirror Cisco Multicast to the CPU, set the <CiscoCommand> field in the Bridge Global Configuration Register0 (Table 370 p. 643).

11.8.4 IGMP

IGMP (Internet Group Management Protocol) is the protocol used by IPv4 systems to report their IP Multicast group membership to neighboring Multicast routers. IGMP is defined in RFC 1112 (IGMPv1), RFC 2236 (IGMPv2), and RFC 3376 (IGMPv3).

There are three IGMP message types:
• Query messages: Sent by routers to hosts
• Report messages: Sent by hosts/routers to routers
• Leave messages (IGMPv2 only): Sent by hosts to routers

IGMP packets are trapped/mirrored to the CPU for applications of IGMP snooping in switches, or Multicast routing in routers.

IGMP packets are identified as packets with:
• EtherType=0x0800 (IPv4)
• Valid IPv4 header (Version=4, IP Header length >=5, correct IP checksum)
• IP protocol=2

IGMP trapping can be enabled per port. This mechanism traps all IGMP packets to the CPU.
Alternatively, IGMP trapping/mirroring can be enabled/disabled per VLAN interface, where the behavior is defined according to a globally configurable IGMP trapping/mirroring mode. Three modes are defined:
• Mode 0: Trap all IGMP message types (IPv4 protocol = 2).
• Mode 1: IGMP Snoop mode. Mirror Query messages to the CPU (IPv4 protocol = 2 and IGMP Message Type = 0x11) and trap non-Query IGMP messages to the CPU (IPv4 protocol = 2 and IGMP Message Type != 0x11).
• Mode 2: IGMP Router mode (without Layer 2 IGMP snooping). Mirror all IGMP messages to the CPU (IPv4 protocol = 2).

Note
If IGMP trapping is enabled on the port, IGMP packets are trapped to the CPU regardless of the VLAN IGMP trapping/mirroring mode.

For details about programming the bridge FDB for IGMP snooping, see Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging".

Configuration
• To enable/disable IGMP trapping on a port, set the <IGMPTrapEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable/disable IGMP trapping/mirroring on a VLAN interface, set the <IPv4IGMP ToCPUEn> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).
• To set the global IGMP trapping/mirroring mode, set the <IGMPMode> field in the Bridge Global Configuration Register0 (Table 370 p. 643) accordingly.

11.8.5 MLD and other IPv6 ICMP

MLD (Multicast Listener Discovery) is the IPv6 protocol equivalent of IPv4 IGMP. MLDv1 is defined in RFC 2710 and is based on IGMPv2. MLDv2 is defined in RFC 3810 and is based on IGMPv3.

Per the standard, MLD is transported in IPv6 ICMP packets and always contains an IPv6 Router Alert Hop-by-Hop Option. MLD packets can be identified by the ICMPv6 message type, with or without the presence of a Hop-by-Hop Option header.

MLD messages are assigned the ICMPv6 message types shown in Table 57.

Table 57: MLD Messages over ICMPv6

MLD Message       ICMPv6 Message Type
MLDv1/2 Query     130
MLDv1 Report      131
MLDv1 Done        132
MLDv2 Report      143

Trapping or mirroring of ICMPv6 Multicast packets is enabled on a per-VLAN basis. A global ICMPv6 table allows for trapping or mirroring up to eight ICMPv6 message types. If the packet is trapped or mirrored to the CPU, it is assigned a CPU code of IPv6 ICMP TRAP/MIRROR.

Configuration
• To enable ICMPv6 trapping/mirroring on a VLAN interface, set the <IPv6ICMP ToCPUEn> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).
• To configure the ICMPv6 message type and command, set the respective entries in the IPv6 ICMP Message Type Configuration Register<n> (0 <= n < 2) (Table 365 p. 631) and the IPv6 ICMP Command Register (Table 366 p. 632).

Note
For details about programming the bridge FDB for MLD snooping, see Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging".

11.8.6 IPv4/6 Interface Control Traffic

A VLAN can be configured for "IPv4/6 Control to CPU".
This mechanism allows the CPU to receive only IP control traffic from relevant VLAN interfaces. On IPv4/6 interfaces, some specific types of control traffic may need to be mirrored or trapped to the CPU. However, it may be desirable to limit IP control traffic to the CPU to selected VLANs that are enabled for "IP control traffic". These are referred to as "IP VLAN interfaces".

When a VLAN is enabled for IP control traffic, the following control traffic becomes eligible for mirroring/trapping to the CPU:
• ARP (Section 11.8.6.1)
• IPv6 Neighbor Solicitation (Section 11.8.6.2)
• IPv4/v6 Control Protocols Running Over Link-Local Multicast (Section 11.8.6.3)
• RIPv1 MAC Broadcast (Section 11.8.6.4)

Each of the above sections describes the specific functionality for that type of control traffic.

Configuration
To enable a VLAN interface for IP control packets to the CPU, set the <IPControlTo CPUEnable> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

11.8.6.1 ARP

ARP (Address Resolution Protocol) is defined in RFC 826. There are two types of ARP packets: Requests and Replies.

ARP Request

An ARP Request is sent as a link-layer Broadcast packet with an EtherType of 0x0806.

An IP management interface may or may not exist on a given VLAN. If an IP management interface is defined on a VLAN, the ARP Broadcast must be relayed to the CPU, to allow it to respond with an ARP Reply to nodes requesting its link-layer address.

If an ARP Request is received on a VLAN where the VLAN entry <IP control to the CPU> is set, the packet is assigned a command of FORWARD, TRAP, or MIRROR-to-CPU, according to a global configuration. Typically, the command is set to MIRROR-to-CPU, which allows the packet to be forwarded to the VLAN and mirrored to the CPU as well.

As an alternative to the per-VLAN mechanism, the device supports a per-port configuration to trap all ARP Broadcast traffic to the CPU. This is useful for systems implementing a proxy-ARP application, where all ARP Broadcasts on the port are trapped to the CPU only. If the port is enabled for ARP Broadcast trapping, the ARP Broadcasts are trapped to the CPU regardless of the per-VLAN interface configuration.

If the ARP Request packet is trapped or mirrored to the CPU by either the port or interface mechanism, it is assigned a CPU code of ARP TRAP/MIRROR.

Configuration
• To configure the global ARP Request command (FORWARD, TRAP, or MIRROR), set the <ARPBCCmd> field in the Bridge Global Configuration Register0 (Table 370 p. 643) accordingly. This command is applied to ARP Requests only if the packet is received on a VLAN with the <IPControlTo CPUEnable> bit set in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).
• To configure a port for ARP Request trapping to the CPU, set the <ARPBCTrapEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).

ARP Reply

ARP Reply packets are Unicast packets that are sent to the source MAC of the ARP Request. If the CPU generates an ARP Request, the ARP Reply packet is addressed to the MAC Address of the interface. The packet is sent to the CPU, as described in Section 11.8.1 "Unicast Management Traffic (MAC-to-Me)" on page 244.
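The ARP Request handling described in this section (per-port trapping taking precedence over the per-VLAN command) can be summarized in a short Python sketch. The function name, the string command values, and the (None, None) return for packets that receive no ARP-specific command are illustrative assumptions, not part of the device interface.

```python
VALID_COMMANDS = ("FORWARD", "TRAP", "MIRROR")


def arp_request_command(port_arp_trap_en, vlan_ip_control_en, global_arp_cmd):
    """Return (command, cpu_code) for a received ARP Request.

    port_arp_trap_en   -- models the per-port <ARPBCTrapEn> bit
    vlan_ip_control_en -- models the per-VLAN <IPControlTo CPUEnable> bit
    global_arp_cmd     -- models the global <ARPBCCmd> field
    Returns (None, None) when neither mechanism applies (assumed to mean
    that the packet is bridged normally).
    """
    if global_arp_cmd not in VALID_COMMANDS:
        raise ValueError("unknown ARP command: %r" % (global_arp_cmd,))

    if port_arp_trap_en:
        # Per-port trapping applies regardless of the VLAN configuration.
        command = "TRAP"
    elif vlan_ip_control_en:
        # The global command applies only on IP-control-enabled VLANs.
        command = global_arp_cmd
    else:
        return (None, None)

    cpu_code = "ARP TRAP/MIRROR" if command in ("TRAP", "MIRROR") else None
    return (command, cpu_code)


print(arp_request_command(False, True, "MIRROR"))
print(arp_request_command(True, False, "FORWARD"))
```

The first call models the typical management-interface setup (forward to the VLAN and mirror to the CPU); the second models a proxy-ARP port, where the request is trapped to the CPU only.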
11.8.6.2 IPv6 Neighbor Solicitation

Neighbor Solicitation messages are used to learn the link-layer address of a target neighbor IPv6 Unicast or Anycast address. They are defined in RFC 2461 (IPv6 Neighbor Discovery) and RFC 3513 (IPv6 Address Architecture).

A Neighbor Solicitation message is sent to the IPv6 Multicast address FF02:0:0:0:0:1:FF00:0/104, where the low-order 24 bits are taken from the target IPv6 address. An IPv6 node must receive solicited-node Multicast packets for each IPv6 Unicast/Anycast address that it has been assigned.

The device supports a configurable 128-bit Neighbor Solicitation IPv6 address and mask. If the packet's IPv6 Destination Address matches the configured Neighbor Solicitation prefix AND the VLAN entry <IP control to the CPU> is set, the packet is assigned a command of FORWARD, TRAP, or MIRROR-to-CPU, according to a global configuration.

If the packet is mirrored or trapped to the CPU, it is assigned the CPU code IPv6 NEIGHBOR SOLICITATION TRAP/MIRROR.

The goal is to mirror or trap to the CPU only Neighbor Solicitation messages that are relevant for the local device interfaces and to exclude non-relevant messages. If the device has only a single interface, the configured solicited neighbor address can be the full address with a full mask (i.e., all ones). If the device has multiple interfaces, the solicited neighbor address and mask are configured to match the Neighbor Solicitation address for all its interfaces. In some cases this may require the CPU to receive Neighbor Solicitation packets that are not targeted to one of its local interface addresses.

Configuration
• To configure the Neighbor Solicitation prefix, set the following registers accordingly:
  IPv6 Solicited-Node Multicast Address Configuration Register0 (Table 325 p. 584)
  IPv6 Solicited-Node Multicast Address Configuration Register1 (Table 326 p. 585)
  IPv6 Solicited-Node Multicast Address Configuration Register2 (Table 327 p. 585)
  IPv6 Solicited-Node Multicast Address Configuration Register3 (Table 328 p. 585)
  IPv6 Solicited-Node Multicast Address Mask Register0 (Table 329 p. 585)
  IPv6 Solicited-Node Multicast Address Mask Register1 (Table 330 p. 585)
  IPv6 Solicited-Node Multicast Address Mask Register2 (Table 331 p. 586)
  IPv6 Solicited-Node Multicast Address Mask Register3 (Table 332 p. 586)
• To configure the IPv6 Neighbor Solicitation command, set the <IPv6Neighbor Solicited NodeCmd> field in the Bridge Global Configuration Register0 (Table 370 p. 643) accordingly. The command is applied to packets that match the Neighbor Solicitation prefix only if the packet is received on a VLAN with the <IPControlTo CPUEnable> field set in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

11.8.6.3 IPv4/v6 Control Protocols Running Over Link-Local Multicast

The following section is relevant for the Multilayer stackable switches only.

IPv4 and IPv6 define link-local Multicast addresses that are used by various protocols.

In IPv4, the link-local IANA Multicast range is 224.0.0.0/24.

In IPv6, the link-local IANA Multicast range is FF02::/16.

If an IPv4/6 link-local Multicast packet is received on a VLAN whose VLAN table entry has the field <IP Control to CPU> set, a lookup is performed in the IPv4 or IPv6 link-local Multicast command table, to determine whether to mirror the specific protocol type packet to the CPU.
The device supports a 256-entry IPv4 link-local Multicast command table, where the lower eight bits of the IPv4 Destination Address of packets in the range 224.0.0.0/24 are used as an index to the table.

The device supports a 256-entry IPv6 link-local Multicast command table, where the lower eight bits of the IPv6 Destination Address of packets in the range FF02::/120 are used as an index to the table.

In each table, the entry command is configured to either FORWARD or MIRROR.

If the packet is mirrored to the CPU, it is assigned a CPU code IPv4/IPv6 LINK-LOCAL MULTICAST DIP TRAP/MIRROR.

For example, the IPv4 or IPv6 link-local Multicast command table can be configured for VRRP control packets to be mirrored to the CPU, while DVMRP control traffic is forwarded to the VLAN, but not mirrored to the CPU.

Examples of common IPv4/6 link-local control protocols and their corresponding addresses are listed in the following table.

Table 58: Common IPv4/6 Link-Local Multicast Addresses

Message Type                  IPv4 Link-Local Multicast Address    IPv6 Link-Local Multicast Address
All Systems on this Subnet    224.0.0.1                            FF02:0:0:0:0:0:0:1
All Routers on this Subnet    224.0.0.2                            FF02:0:0:0:0:0:0:2
DVMRP Routers                 224.0.0.4                            FF02:0:0:0:0:0:0:4
OSPF All Routers              224.0.0.5                            FF02:0:0:0:0:0:0:5
OSPF Designated Routers       224.0.0.6                            FF02:0:0:0:0:0:0:6
RIPv2 Routers                 224.0.0.9                            FF02:0:0:0:0:0:0:9
EIGRP Routers                 224.0.0.10                           FF02:0:0:0:0:0:0:A
All PIM Routers               224.0.0.13                           FF02:0:0:0:0:0:0:D
VRRP                          224.0.0.18                           N/A

Configuration
• To globally enable the IPv4 link-local Multicast mirroring mechanism, set the <IPv4LinkLocal MirrorEn> field in the Bridge Global Configuration Register0 (Table 370 p. 643).
• To globally enable the IPv6 link-local Multicast mirroring mechanism, set the <IPv6LinkLocal MirrorEn> field in the Bridge Global Configuration Register0 (Table 370 p. 643).
• To mirror an IPv4 link-local Multicast protocol, set the relevant bit in the IPv4 Multicast Link-Local Configuration Register<n> (Table 367 p. 633) corresponding to the lower eight bits of the address in the IPv4 Multicast range 224.0.0.0/24. The command is applied to packets in the range 224.0.0.0/24 only if the VLAN they are associated with is enabled for IPv4/6 Control packets to CPU. To enable a VLAN interface for IP control packets to the CPU, set the <IPControlTo CPUEnable> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).
• To mirror an IPv6 link-local Multicast protocol, set the relevant bit in the IPv6 Multicast Link-Local Configuration Register<n> (Table 368 p. 638) corresponding to the lower eight bits of the address in the IPv6 Multicast range FF02::/120. The command is applied to packets in the range FF02::/120 only if the VLAN they are associated with is enabled for IPv4/6 Control packets to CPU. To enable a VLAN interface for IP control packets to the CPU, set the <IPControlTo CPUEnable> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

11.8.6.4 RIPv1 MAC Broadcast

RIPv1 is an IPv4 routing protocol defined in RFC 1058 (Routing Information Protocol). RIPv1 packets are sent with a link-layer destination Broadcast address, IP protocol of 17 (UDP), and destination UDP port of 520.

A RIPv1 packet received on a VLAN whose VLAN entry has the field <IP Control to CPU> set is assigned a command of FORWARD or MIRROR, according to a global configuration.

If the RIPv1 packet is mirrored to the CPU by this mechanism, it is assigned a CPU code of IPv4 RIPv1 MIRROR.

Note
If RIPv1 Broadcast is mirrored to the CPU by this mechanism, the packet is not subject to the IPv4 Unregistered Broadcast command (Section 11.11.1.5 "Per-VLAN Unregistered IPv4 Broadcast Filtering" on page 255).

Configuration
• To globally enable RIPv1 mirroring to the CPU, set the <IPv4RIPv1 MirrorEn> field in the Bridge Global Configuration Register0 (Table 370 p. 643).
• The command is applied to RIPv1 packets only if the VLAN they are associated with is enabled for IPv4/6 Control packets to CPU. To enable a VLAN interface for IP control packets to the CPU, set the <IPControlTo CPUEnable> bit in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754).

11.9 Private VLAN Edge (PVE)

Private VLAN Edge (PVE) is a per-port configuration that assigns a configured bridge forwarding destination to all traffic received on the port.

The PVE configured bridge destination can be either a device/port or a trunk-ID.

Typically, the PVE target is an uplink to a common resource (e.g., router, server) required by multiple private ports, while, for security reasons, traffic between the private ports is not allowed.

PVE only overrides the bridging destination. PVE packets are still subject to the policy, bridging, and policing engines, and can be dropped, trapped, mirrored to the CPU, mirrored to the analyzer, or sampled to the CPU.

PVE traffic is NOT subject to unknown or unregistered bridge filtering (Section 11.11 "Unknown and Unregistered Packet Filtering").

Note
Another form of PVE allows multiple uplinks to be defined, rather than just a single uplink. In this case, the egress source-ID filtering mechanism can be used to disallow traffic between the private ports, while permitting traffic between the private ports and the public uplink port ( "Protected and Unprotected Ports" on page 258).

In the Multilayer stackable switches, PVE does NOT affect routed traffic. IPv4/6 Multicast packets are bridged through the PVE uplink as well as sent to the router. In this case, the IP Multicast router must work in non-distributed mode (i.e., IPM routing is performed only on the ingress device and the MLL downstream interfaces must contain the cascade ports. The IPM routed packets are subsequently bridged on the non-ingress devices.)

• To enable PVE globally, set the <PVLANEn> bit in the Bridge Global Configuration Register0 (Table 370 p. 643).
• To enable a port for PVE, set the <PortPVLANEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).

In the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619) and Ingress Port<n> Bridge Configuration Register1 (0<=n<27, for CPU Port n= 0x3F) (Table 357 p. 623):
• If the PVE bridge destination is a trunk group, set the <PortPVLAN TrgPort/><PortPVLAN TrgTrunk/> field accordingly. If it is a device/port, clear the field.
• To configure the port PVE bridge target port/trunk, set the <PortPVLAN TrgPort/>> field to the target port or trunk. • To configure the port PVE bridge target device (when the target is not a trunk), set the <PortPVLAN TrgDev> field. 11.10 Ingress Port Packet Rate Limiting M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 For protection against excessive rates of known Unicast, unknown Unicast, Multicast, and Broadcast packets, the device ports (excluding the CPU port) can be enabled to drop packets exceeding a configured rate. Rate limiting is globally configured to be either packet-based or byte-count based. Byte-count based rate limiting includes the 20 bytes of Ethernet framing overhead (preamble + SFD + IPG) in addition to the packet byte count. On each port, rate limiting can be independently enabled for each of the following traffic types—known Unicast, unknown Unicast, Multicast (registered and unregistered), and Broadcast packets. Each port maintains a single configurable limit threshold value, and a single internal counter. Four configurable global time window periods are assigned to ports as a function of their operating speed—10 Gbps, 1 Gbps, 100 Mbps, and 10 Mbps. The resolution of the time window value is 25.6 microseconds for the 10 Gbps ports, and 256 microseconds for the others. The default values for the 10 Gbps, 1 Gbps, 100 Mbps, and 10 Mbps time windows is 1 ms, 10 ms, 100 ms, and 1 second, respectively. The advantage of keeping the window time setting relative to the port speed is that a given limit value represents the same percentage of the traffic, independent of its speed. For example, ports running at 10 Mbps, 100 Mbps, 1 Gbps, and 10 Gbps all receive a maximum of 1250 KB in their default time window. So if the desired Broadcast rate limit is 10% of the port bandwidth, the port limit can be configured to 125 KB and the limit percentage is constant, independent of its current speed. 
At the beginning of each time window, the internal port counter is reset to zero. The port counter is incremented (according to the mode—packet or byte-based) for every packet type enabled for rate limiting. If the incremented port counter exceeds the configured port limit, the packet is assigned either a Hard or Soft DROP command based on the DROP type configuration (Section 5. "Packet Command Assignment and Resolution" on page 52). There is a global 40-bit counter to count the number of dropped packets or dropped bytes (according to the mode) due to ingress port rate limiting. Each port can be enabled to increment the global 40-bit counter (according to the mode—packet or byte-based) for every packet dropped due to ingress port rate limiting. MV-S102110-02 Rev. E Page 252 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Configuration • The PVE feature must be globally enabled before it can be enabled on specific ports. This is done by setting M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Bridge Engine • • • • • • • • • 11.11 Unknown and Unregistered Packet Filtering Unknown Unicast and unregistered Multicast packets are assigned a VIDX of 0xFFF, and they are forwarded according to the packet VLAN-ID assignment. This default flooding can be modified by various filtering mechanisms to trap/mirror the packet to the CPU, or drop them. Note See Section 11.15 "Bridge Ingress Command Resolution" for exceptions to Unknown and Unregistered packet filtering. 
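As a worked check of the arithmetic in Section 11.10 "Ingress Port Packet Rate Limiting", the following Python sketch computes the byte budget of each default time window and models the per-port counter. It is an illustrative model only: the names (`DEFAULT_WINDOW_US`, `PortRateLimiter`, `admit`) are hypothetical and not part of any device API, the limit is expressed directly in bytes or packets rather than in the units of the <IngressLimit> field, and the counter is assumed to keep incrementing for packets that are dropped.

```python
# Illustrative model of Section 11.10; all names here are hypothetical.
FRAMING_OVERHEAD_BYTES = 20  # preamble + SFD + IPG, added in byte-count mode

# Default time windows from the text, in microseconds, keyed by port speed (bit/s).
DEFAULT_WINDOW_US = {
    10_000_000_000: 1_000,   # 10 Gbps  -> 1 ms
    1_000_000_000: 10_000,   # 1 Gbps   -> 10 ms
    100_000_000: 100_000,    # 100 Mbps -> 100 ms
    10_000_000: 1_000_000,   # 10 Mbps  -> 1 second
}


def max_bytes_per_window(speed_bps, window_us):
    """Bytes a port can receive at line rate during one time window."""
    return speed_bps * window_us // 8_000_000


class PortRateLimiter:
    """One limit and one counter per port; the counter restarts every window."""

    def __init__(self, limit, byte_mode):
        self.limit = limit
        self.byte_mode = byte_mode
        self.counter = 0
        self.dropped = 0  # stands in for the global drop counter

    def start_window(self):
        self.counter = 0

    def admit(self, packet_len):
        """Count one rate-limited packet; return False if it must be dropped."""
        cost = packet_len + FRAMING_OVERHEAD_BYTES if self.byte_mode else 1
        self.counter += cost
        if self.counter > self.limit:
            self.dropped += cost
            return False
        return True


# Every default window carries the same byte budget, so one limit value is the
# same percentage of line rate at any port speed.
budgets = {max_bytes_per_window(s, w) for s, w in DEFAULT_WINDOW_US.items()}
ten_percent_limit = max(budgets) // 10
```

Under these assumptions every default window has a budget of 1,250,000 bytes, so a limit of 125,000 bytes corresponds to 10% of line rate at all four speeds.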
Configuration (Section 11.10 "Ingress Port Packet Rate Limiting")
• To enable a port for unknown Unicast rate limiting, set the <UnkUc RateLimEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable a port for known Unicast rate limiting, set the <UcRateLimEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable a port for Multicast rate limiting, set the <McRateLimEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To enable a port for Broadcast rate limiting, set the <BcRateLimEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To configure the port limit value, set the <IngressLimit> field in the Ingress Port<n> Bridge Configuration Register1 (0<=n<27, for CPU Port n= 0x3F) (Table 357 p. 623) accordingly.
• To enable a port to increment the global drop counter, set the <RateLimitDrop CountEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).
• To configure the window time for 10 Gbps ports, set the <10GWindow> field in the Ingress Rate limit Configuration Register1 (Table 359 p. 625) accordingly.
• To configure the window time for 1 Gbps ports, set the <1000MWindow> field in the Ingress Rate limit Configuration Register0 (Table 358 p. 624) accordingly.
• To configure the window time for 100 Mbps ports, set the <100MWindow> field in the Ingress Rate limit Configuration Register0 (Table 358 p. 624) accordingly.
• To configure the window time for 10 Mbps ports, set the <10MWindow> field in the Ingress Rate limit Configuration Register0 (Table 358 p. 624) accordingly.
• To read the 40-bit drop counter, perform two read operations: first read the Ingress Rate limit Drop Counter[31:0] (Table 360 p. 626), and then read the Ingress Rate limit Drop Counter[39:32] (Table 361 p. 626). Counter coherency is ensured between the two read operations.
• To configure the rate limit mode (packet-based or byte-based), set the <Ingress RateLimit Mode> bit in the Ingress Rate limit Configuration Register1 (Table 359 p. 625).

11.11.1 Per-VLAN Unknown/Unregistered Filtering Commands
This section defines the filtering commands for:
• 11.11.1.1 "Per-VLAN Unknown Unicast Filtering"
• 11.11.1.2 "Per-VLAN Unregistered Non-IPv4/6 Multicast Filtering"
• 11.11.1.3 "Per-VLAN Unregistered IPv4 Multicast Filtering"
• 11.11.1.4 "Per-VLAN Unregistered IPv6 Multicast Filtering"
• 11.11.1.5 "Per-VLAN Unregistered IPv4 Broadcast Filtering"
• 11.11.1.6 "Per-VLAN Unregistered non-IPv4 Broadcast Filtering"
The VLAN entry (Section 11.2.6 "VLAN Table Entry" on page 216) contains a filtering command for unknown Unicast and unregistered Multicast packets.
An unknown Unicast is defined as a packet with a Unicast MAC DA whose FDB lookup did not result in a match.
An unregistered Multicast is defined as a packet with a Multicast MAC DA whose FDB lookup did not result in a match. The FDB lookup can be IP-Multicast-based as well (Section 11.7 "IPv4/6 Multicast (S, G, V) Bridging").
If the packet matches the unknown Unicast filter or one of the unregistered Multicast filters defined in this section, one of the following filter commands is assigned:
• FORWARD
• MIRROR
• TRAP
• SOFT DROP
• HARD DROP

11.11.1.1 Per-VLAN Unknown Unicast Filtering
An unknown Unicast packet is defined as a packet whose MAC DA is Unicast and whose FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED UNKNOWN UNICAST TRAP/MIRROR.

Configuration
To configure the VLAN entry Unknown Unicast Filtering command, set the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.

11.11.1.2 Per-VLAN Unregistered Non-IPv4/6 Multicast Filtering
An unregistered non-IPv4/6 Multicast packet is defined as follows:
• MAC DA is Multicast (but not Broadcast), and
• MAC DA does not have either the IPv4 MAC prefix 01-00-5e-xx-xx-xx/25 or the IPv6 MAC prefix 33-33-xx-xx-xx-xx/16, and
• FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED NON IPv4/IPv6 UNREGISTERED MULTICAST TRAP/MIRROR.

Configuration
To configure the VLAN entry Unregistered non-IPv4/6 Multicast Filtering command, set the <Unregistered NonIP MulticastCmd> field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.

11.11.1.3 Per-VLAN Unregistered IPv4 Multicast Filtering
An unregistered IPv4 Multicast packet is defined as follows:
• MAC DA is Multicast (but not Broadcast), and
• MAC DA has the IPv4 MAC prefix 01-00-5e-xx-xx-xx/25, and
• FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED IPv4 UNREGISTERED MULTICAST TRAP/MIRROR.
Configuration
To configure the VLAN entry Unregistered IPv4 Multicast Filtering command, set the <Unregistered IPv4Multicast Cmd> field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.

11.11.1.4 Per-VLAN Unregistered IPv6 Multicast Filtering
An unregistered IPv6 Multicast packet is defined as follows:
• MAC DA is Multicast (but not Broadcast), and
• MAC DA has the IPv6 MAC prefix 33-33-xx-xx-xx-xx/16, and
• FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED IPv6 UNREGISTERED MULTICAST TRAP/MIRROR.

Configuration
To configure the VLAN entry Unregistered IPv6 Multicast Filtering command, set the <Unregistered IPv6Multicast Cmd> field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.

11.11.1.5 Per-VLAN Unregistered IPv4 Broadcast Filtering
An unregistered IPv4 Broadcast packet is defined as follows:
• MAC DA is Broadcast, and
• Ethertype is 0x800 (IPv4), and
• FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED IPv4 UNREGISTERED BROADCAST TRAP/MIRROR.

Configuration
To configure the VLAN entry Unregistered IPv4 Broadcast Filtering command, set the corresponding Unregistered IPv4 Broadcast command field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.
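The per-VLAN packet classes of Section 11.11.1 (11.11.1.1 through 11.11.1.6) can be summarized as a single classifier. The Python sketch below is illustrative only: the function name and the class labels are hypothetical, the FDB lookup result is passed in as a boolean, and the non-IPv4 Broadcast case is simplified to an Ethertype test (the encapsulation condition of Section 11.11.1.6 is not modeled).

```python
# Illustrative classifier for the per-VLAN filters; names are hypothetical.
BROADCAST = b"\xff" * 6
ETHERTYPE_IPV4 = 0x0800


def classify_unmatched(mac_da, ethertype, fdb_hit):
    """Return the per-VLAN filter class of a packet, or None if the FDB matched."""
    if fdb_hit:
        return None  # known Unicast or registered Multicast: no filter applies
    if mac_da == BROADCAST:
        if ethertype == ETHERTYPE_IPV4:
            return "UNREG_IPV4_BROADCAST"
        return "UNREG_NON_IPV4_BROADCAST"
    if mac_da[0] & 0x01:  # group bit set: Multicast
        # IPv4 prefix 01-00-5e-xx-xx-xx/25: three fixed bytes plus a zero 25th bit
        if mac_da[:3] == b"\x01\x00\x5e" and not mac_da[3] & 0x80:
            return "UNREG_IPV4_MULTICAST"
        # IPv6 prefix 33-33-xx-xx-xx-xx/16
        if mac_da[:2] == b"\x33\x33":
            return "UNREG_IPV6_MULTICAST"
        return "UNREG_NON_IP_MULTICAST"
    return "UNKNOWN_UNICAST"
```

Each returned class corresponds to one command field in the VLAN entry; the command configured for that class (FORWARD, MIRROR, TRAP, SOFT DROP, or HARD DROP) is then assigned to the packet.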
11.11.1.6 Per-VLAN Unregistered non-IPv4 Broadcast Filtering
An unregistered non-IPv4 Broadcast packet is defined as follows:
• MAC DA is Broadcast, and
• Ethertype is not 0x800 (IPv4), or packet encapsulation is non-SNAP/LLC; and,
• FDB destination lookup does not find a matching entry.
If the packet command for this filter is MIRROR or TRAP to the CPU, the packet is assigned a CPU code of BRIDGED NON IPv4 UNREGISTERED BROADCAST TRAP/MIRROR.

Configuration
To configure the VLAN entry Unregistered non-IPv4 Broadcast Filtering command, set the corresponding Unregistered non-IPv4 Broadcast command field in the VLAN<n> Entry (0<=n<4096) (Table 520 p. 754) accordingly.

11.11.2 Per-Egress Port Unknown Unicast Filter
An unknown Unicast packet is defined as a packet whose MAC DA is Unicast and whose FDB destination lookup does not find a matching entry.
Unknown Unicast packets can be filtered based on the packet egress port configuration.

Configuration
To enable/disable unknown Unicast packet filtering on a given egress port, set the <Port<n> UnkUcFilterEn> field in the Egress Filtering Register0 (Table 463 p. 721) accordingly.

11.11.3 Per-Egress Port Unregistered Multicast Filter
An unregistered Multicast packet is defined as follows:
• MAC DA is Multicast (but not Broadcast), and
• FDB destination lookup does not find a matching entry.
Unregistered Multicast packets can be filtered based on the packet egress port configuration.

Configuration
To enable/disable unregistered Multicast packet filtering on a given egress port, set the <Port<n>Unreg MCFilterEn> field in the Egress Filtering Register1 (Table 464 p. 721) accordingly.

11.12 IP and Non-IP Multicast Filtering
The bridge engine supports the following global filters for Multicast packets:

Drop IP Multicast Mode
Assign a DROP command to all packets with a MAC Multicast DA in the IPv4 MAC Multicast range 01-00-5E-00-00-00 to 01-00-5E-7F-FF-FF, or in the IPv6 MAC Multicast range 33-33-xx-xx-xx-xx. The DROP type can be configured to HARD or SOFT.

Drop non-IP Multicast Mode
Assign a DROP command to all packets with a MAC Multicast DA that is neither in the IPv4 MAC Multicast range 01-00-5E-00-00-00 to 01-00-5E-7F-FF-FF, nor in the IPv6 MAC Multicast range 33-33-xx-xx-xx-xx. (This does not include MAC Broadcast packets.) The DROP type can be configured to HARD or SOFT.

Configuration
In the Bridge Global Configuration Register0 (Table 370 p. 643) and Bridge Global Configuration Register1 (Table 371 p. 647):
• To filter IP Multicast packets received on the device, set the <DropIPMcEn> bit.
• To configure their DROP mode, set the <IPMCDropMode> bit.
• To filter non-IP Multicast packets, set the <DropNonIP McEn> bit.
• To configure their DROP mode, set the <NonIPMCDropMode> bit.

11.13 Bridge Local Switching
In standard bridging applications, a packet must not be egressed on the same port or trunk on which it was ingressed. However, some specialized applications require “local switching” back through the ingress interface.

11.13.1 Local Switching of Known Unicast Packets
Local switching of known Unicast packets is enabled/disabled per ingress port and is performed by the ingress pipeline Bridge engine.
If local switching for known Unicast packets is disabled, and the bridge destination address lookup has the same Unicast location (trunk or port) as the packet’s original source location, then the ingress bridge engine assigns the packet a SOFT DROP command.

Note
If the packet is received from a cascade port, the packet’s source location is taken from the FORWARD DSA tag and not according to the local device and port number.

Configuration
To enable/disable source filtering for known Unicast packets, set the <UcLocalEn> bit in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619).

11.13.2 Local Switching of Multi-Destination Packets
Local switching for multi-destination packets received on a non-trunk port is enabled/disabled per egress port, and is performed by the egress pipeline processing.
If local switching of multi-destination packets is disabled for the egress port, then the egress pipeline filters the local device source port from the multi-destination distribution port list.

Note
For details on source trunk filtering of multi-target packets, see Section 13.3.1 "Source Interface Filtering" on page 285.

Configuration
To enable/disable source filtering for multi-target VIDX packets received on a non-trunk port, set the <Port<i>Mc LocalEn> field in the Multicast Local Enable Configuration Register (Table 465 p. 722) accordingly.

In the Multilayer stackable switches, routed packets are not subject to local switching configuration constraints.

11.14 Bridge Source-ID Egress Filtering
The device supports a generic egress filtering mechanism that filters packets based on the packet Source-ID. Source-ID egress filtering can be used for various applications. Two common applications are:
• Loop Prevention in Cascaded Topology
• Protected and Unprotected Ports
All packets are assigned a 5-bit Source-ID by the ingress pipeline according to the following algorithm:
IF the packet arrived on a cascade port, and the extended DSA tag <command>=FROM_CPU or FORWARD, the source-ID assignment is taken from the DSA tag.
ELSE if the packet arrived on a cascade port, and the DSA tag <command>=TO_CPU or TO_ANALYZER, the source-ID assignment is 0.
ELSE if the packet source MAC Address lookup finds a matching FDB entry, and the Bridge is not bypassed, and the global FDB source-ID assignment is not disabled, the packet is assigned the source-ID according to the FDB entry <source-ID>.
ELSE the packet is assigned the source-ID according to the source port <source-ID> configuration.
IF the packet is routed by the Router engine (relevant for the Multilayer stackable switches only), the packet is assigned a source-ID according to the global configurable Router source-ID (Section 12.5.4 "Router Source-ID Assignment" on page 275).
When auto-learning is performed, the FDB source MAC entry is learned with the packet source-ID assignment as defined above.
The device supports a 32-entry Source-ID Filter table. In the egress pipeline this table is indexed with the packet Source-ID assignment. The table entry consists of a port bitmap, where a 0 bit for a given egress port indicates that the packet is to be filtered and a 1 bit indicates that the packet is to be forwarded out of the given egress port (however, it is still subject to other egress filtering mechanisms).
Source-ID egress filtering is always applied to multi-destination traffic. It is a configurable option to apply source-ID egress filtering to single-destination traffic. However, since there is only a single Source-ID assignment for a given packet, separate applications may require coordination of the source-ID space.

Loop Prevention in Cascaded Topologies
The Source-ID egress filtering mechanism can be used to create source-device-based Spanning Tree paths for load-balancing of multi-target packets in a cascaded loop topology, e.g., a ring stacking topology.
Multi-target destination forwarding is described in Section 4.3 "Multi-Target Destination in a Cascaded System" on page 46.

Protected and Unprotected Ports
Another application of source-ID filtering is to support the model of “protected” and “unprotected” ports, where traffic is not permitted to pass between protected ports. Protected ports are all assigned the “protected” Source-ID. The protected Source-ID entry has the protected ports set to 0 in the bitmap and the unprotected ports set to 1. Thus traffic received on protected ports is not forwarded to egress protected ports and is forwarded on unprotected ports.

Configuration
• To enable/disable source-ID egress filtering for single-destination traffic, set the <UC Source-ID Filter Enable> field in the Transmit Queue Extended Control Register (Table 459 p. 717) accordingly.
• To assign non-DSA tagged packets a source-ID according to the port default configuration (ignoring the source FDB entry Source-ID assignment), set the <UsePortDefaultSrcId> field in the Bridge Global Configuration Register0 (Table 370 p. 643).
• To set the port default Source-ID, set the <SrcID> field in the Ingress Port<n> Bridge Configuration Register0 (0<=n<27, for CPU Port n= 0x3F) (Table 356 p. 619) accordingly.
• When adding/modifying an FDB entry, set the entry <SrcID> accordingly in the FDB Entry (Table 412 p. 673).
• To configure the egress Source-ID table, set the <SrcIDMember [27:0]> field accordingly in the SrcID<n> Egress Filtering Table Entry (0<=n<32) (Table 472 p. 728).

11.15 Bridge Ingress Command Resolution
The bridge engine has many packet filters that modify the packet command (Section 5. "Packet Command Assignment and Resolution" on page 52). These bridge packet filters are grouped into two phases. Phase 1 filtering is performed before Phase 2 filtering.

11.15.1 Bridge Phase 1 Packet Command Resolution
Bridge Phase 1 has multiple packet filters operating in parallel that can modify the packet command.
Following is a list of the bridge Phase 1 packet filters:
• BPDU trap (Section 11.3.1 "Trapping BPDUs")
• FDB entry command (Section 11.4.4 "FDB Entry Command")
• ARP Request trap/mirror ("ARP Request" on page 248)
• IGMP trap/mirror (Section 11.8.4 "IGMP")
• Unknown source command (Section 11.4.7.3 "Source MAC Address CPU Controlled Learning")
• IEEE reserved MC trap/mirror (Section 11.8.2 "IEEE Reserved Multicast")
• IPv6 ICMP trap/mirror (Section 11.8.5 "MLD and other IPv6 ICMP")
• IPv4/6 link-local control Multicast trap/mirror (Section 11.8.6.3 "IPv4/v6 Control Protocols Running Over Link-Local Multicast") (relevant for the Multilayer stackable switches only)
• RIPv1 Mirror (Section 11.8.6.4 "RIPv1 MAC Broadcast") (relevant for the Multilayer stackable switches only)
• IPv6 neighbor solicitation trap/mirror (Section 11.8.6.2 "IPv6 Neighbor Solicitation")
• Cisco control Multicast trap/mirror (Section 11.8.3 "Cisco Layer 2 Control Multicast")
• Local packet filtering soft drop (Section 11.13 "Bridge Local Switching")
• Spanning tree port state soft drop (Section 11.3 "Spanning Tree Support")
• Ingress Rate Filtering soft/hard drop (Section 11.10 "Ingress Port Packet Rate Limiting")
• IPv4/non-IPv4 Multicast soft/hard drop (Section 11.12 "IP and Non-IP Multicast Filtering")
• VLAN Range Filtering soft/hard drop (Section 11.2.4.4 "VLAN Range Filtering")
• VLAN Ingress Filtering soft/hard drop (Section 11.2.4.1 "VLAN Ingress Filtering")
• Source MAC Address is Multicast soft/hard drop (Section 11.4.7 "FDB Source MAC Learning")
• Invalid VLAN soft/hard drop (Section 11.2.4.3 "Invalid VLAN Filtering")
• Moved Static Address soft/hard drop (Section 11.4.8 "FDB Static Entries")

11.15.1.1 Bridge Phase 1 Command Resolution
There is a single command resolution between all the bridge Phase 1 mechanisms according to Section 5.2 "Command Resolution Matrix" on page 53. The command resolution is re-applied between the incoming packet command (as set by the Policy engine) and the resolution of the bridge Phase 1 command.

11.15.1.2 Bridge Phase 1 CPU Code Resolution
As there are multiple Phase 1 mechanisms that can assign a packet command of TRAP or MIRROR, there must be a CPU code resolution process. The following list divides the mechanisms into six groups and associates a priority for assigning the CPU code.
1. IGMP trap/mirror
2. IPv6 Neighbor Solicitation Mirror
3. IPv6 ICMP trap/mirror
4. The following mechanisms are mutually exclusive, i.e., no conflict resolution is required for the following CPU code assignments:
– BPDU trap
– ARP Broadcast trap/mirror
– IEEE reserved Multicast trap/mirror
– IPv4/6 link-local control Multicast trap/mirror (relevant for the Multilayer stackable switches only)
– RIPv1 Mirror (relevant for the Multilayer stackable switches only)
– Cisco control Multicast trap/mirror
5. FDB trap/mirror
6. Unknown Source trap/mirror
When a mechanism from more than one group assigns a packet command of TRAP or MIRROR to the CPU, the CPU code assigned to the packet is selected from the highest priority group (starting with IGMP trap/mirror).
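The CPU code priority of Section 11.15.1.2 can be sketched as a lookup over ordered groups. This Python sketch is illustrative only: the mechanism labels and the function name are hypothetical, and the input is simply the set of Phase 1 mechanisms that assigned TRAP or MIRROR to the packet.

```python
# Priority groups from Section 11.15.1.2, highest priority first.
# The string labels are hypothetical names for the listed mechanisms.
CPU_CODE_PRIORITY_GROUPS = [
    {"IGMP"},
    {"IPV6_NEIGHBOR_SOLICITATION"},
    {"IPV6_ICMP"},
    {  # group 4: mutually exclusive mechanisms
        "BPDU",
        "ARP_BROADCAST",
        "IEEE_RESERVED_MC",
        "IP_LINK_LOCAL_MC",
        "RIPV1",
        "CISCO_CONTROL_MC",
    },
    {"FDB"},
    {"UNKNOWN_SOURCE"},
]


def resolve_cpu_code(triggered):
    """Return the mechanism whose CPU code is assigned, or None if none fired."""
    triggered = set(triggered)
    for group in CPU_CODE_PRIORITY_GROUPS:
        hits = group & triggered
        if hits:
            # Within group 4 at most one mechanism is expected to fire;
            # sorting only makes the sketch deterministic if that is violated.
            return sorted(hits)[0]
    return None
```

For example, a packet that hits both a trapping FDB entry and the IGMP trap is sent to the CPU with the IGMP CPU code, because IGMP belongs to the highest priority group.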
11.15.1.3 Bridge Phase 1 Modification of the Unregistered/Unknown Status
If the final command resolution from the bridge Phase 1 is not FORWARD (i.e., either the Policy engine or a bridge Phase 1 mechanism changed the command), or the packet was received on a PVE port (Section 11.9 "Private VLAN Edge (PVE)" on page 251), the packet is now considered registered (if Multicast) or known (if Unicast).
These packets are subsequently not affected by the bridge per-VLAN unregistered/unknown Phase 2 filters and the egress port unregistered/unknown filters.

11.15.2 Bridge Phase 2 Packet Command Modification
The bridge Phase 2 filters are applied after the Phase 1 filters.
The bridge Phase 2 consists of the per-VLAN unregistered and unknown packet filters (Section 11.11.1 "Per-VLAN Unknown/Unregistered Filtering Commands").
The unregistered/unknown status of a packet can be modified by the bridge Phase 1 results (Section 11.15.1.3 "Bridge Phase 1 Modification of the Unregistered/Unknown Status").

11.16 Bridge Counters
11.16.1 Bridge Ingress Counters
11.16.1.1 Bridge Drop Counter
The Bridge engine can assign a Hard or Soft DROP command to packets for many reasons. The Bridge engine maintains a 32-bit counter that can be configured to count ALL bridge packet drop events, or to count only packet drop events due to a specifically configured “reason”.
The specific Bridge drop reasons are:
• FDB entry command drop (Section 11.4.4 "FDB Entry Command")
• Unknown MAC SA drop (Section 11.4.7.3 "Source MAC Address CPU Controlled Learning")
• Invalid SA drop (Section 11.4.7 "FDB Source MAC Learning")
• VLAN not valid drop (Section 11.2.4.3 "Invalid VLAN Filtering")
• Port not member in VLAN drop (Section 11.2.4.1 "VLAN Ingress Filtering")
• VLAN range drop (Section 11.2.4.4 "VLAN Range Filtering")
• Static on Moved address drop (Section 11.4.8 "FDB Static Entries")
• Rate Limiting drop (Section 11.10 "Ingress Port Packet Rate Limiting")
• Local port drop (Section 11.13 "Bridge Local Switching")
• Spanning tree port state drop (Section 11.3 "Spanning Tree Support")
• IP Multicast drop (Section 11.12 "IP and Non-IP Multicast Filtering")
• Non-IP Multicast drop (Section 11.12 "IP and Non-IP Multicast Filtering")
• Unregistered Non-IP Multicast drop (Section 11.11.1.2 "Per-VLAN Unregistered Non-IPv4/6 Multicast Filtering")
• Unregistered IPv6 Multicast drop (Section 11.11.1.4 "Per-VLAN Unregistered IPv6 Multicast Filtering")
• Unregistered IPv4 Multicast drop (Section 11.11.1.3 "Per-VLAN Unregistered IPv4 Multicast Filtering")
• Unknown Unicast drop (Section 11.11.1.1 "Per-VLAN Unknown Unicast Filtering")
• Unregistered IPv4 Broadcast drop (Section 11.11.1.5 "Per-VLAN Unregistered IPv4 Broadcast Filtering")
• Unregistered non-IPv4 Broadcast drop (Section 11.11.1.6 "Per-VLAN Unregistered non-IPv4 Broadcast Filtering")

Configuration
• To configure the Bridge Drop Counter “reason”, set the <BridgeDropCnt Mode> field in the Bridge Global Configuration Register1 (Table 371 p. 647) accordingly.
• To read/write to the Bridge Drop Counter, read/write to the <BridgeDropCnt> field in the Bridge Filter Counter (Table 372 p. 649) accordingly.

11.16.1.2 Bridge Host Counters
A set of Host Group counters is maintained for a configured MAC Source Address and MAC Destination Address. These counters correspond to RMON-1 MIB (RFC 2819) Host counters.

Table 59: Host Counters
Counter Name | Counter Description
hostInPkts | The number of good packets(1) with a MAC DA matching the CPU-configured MAC DA.
hostOutPkts | The number of good packets with a MAC SA matching the CPU-configured MAC SA.
hostOutBroadcastPkts | The number of good Broadcast packets with a MAC SA matching the configured MAC SA.
hostOutMulticastPkts | The number of good Multicast packets with a MAC SA matching the configured MAC SA.
(1) Good packets are error-free Ethernet packets that have a valid frame length, per RFC 2819.

Configuration
• To configure the MAC DA to be used by the host counters, set the MAC Address Count0 Register (Table 381 p. 654) and the MAC Address Count1 Register (Table 382 p. 654) accordingly.
• To configure the MAC SA to be used by the host counters, set the MAC Address Count1 Register (Table 382 p. 654) and the MAC Address Count2 Register (Table 383 p. 654) accordingly.
• The hostInPkts counter can be read from the Host Incoming Packets Count Register (Table 384 p. 654). This counter is clear-on-read.

11.16.1.3 Bridge Matrix Group
A packet counter is maintained for a single CPU-configured MAC Source/Destination Address pair. These counters correspond to RMON-1 MIB (RFC 2819) Matrix counters.

Table 60: Matrix Source Destination Counters
Counter Name | Counter Description
matrixSDPkts | The number of good packets with a MAC SA/DA matching the CPU-configured MAC SA/DA.
Configuration The matrixSDPkts counter can be read from the <MatrixSDPkts> field in the Matrix Source/Destination Packet Count Register (Table 388 p. 655) accordingly. This counter is clear-on-read. M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 11.16.1.4 Bridge Port/VLAN/Device Counters There are two sets of ingress port/VLAN/device bridge counters—Set-0 and Set-1. Each counter-set is applied to a configured packet ingress stream based on "VLAN and port". VLAN and port can be set to "wildcard", e.g., to all ports in the VLAN, all VLANs in the port and all traffic in the switch. Each counterset can be configured independently. Each counter-set maintains four counters as described in Table 61. Table 61: Ingress Port/VLAN/Device Counters per Counter-Set Counte r Name Counte r Desc ription Bridge In Frames Counter Number of packets received by the bridge according to the specified mode criteria. Depending on the mode selected, this counter can be used to satisfy Bridge and SMON MIB objects (RFC 2674 and RFC 2617) such as: • dot1dTpPortInFrames (mode 1). • smonVlanIdStatsTotalPkts (mode 2). • dot1qTpVlanPortInFrames (mode 3). VLAN Ingress Filtered Packet Counter Number of packets discarded due to invalid VLAN, VLAN not in Range, or ingress port not VLAN member. This counter can be used to satisfy Bridge MIB object dot1qTpVlanPortInDiscard (mode 3) MV-S102110-02 Rev. E Page 262 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 • The hostOutPkts counter can be read from Host Outgoing Packets Count Register (Table 385 p. 654). This counter is clear-on-read. The hostOutBroadcastPkts counter can be read from Host Outgoing Packets Count Register (Table 385 p. 654) This counter is clear-on-read. The hostOutMulticastPkts counter can be read from Host Outgoing Packets Count Register (Table 385 p. 
654). This counter is clear-on-read. AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 • M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Bridge Engine Ingress Port/VLAN/Device Counters per Counter-Set (Continued) Counte r Desc ription Security Filtered Packet Counter Number of packets discarded due to Security Filtering measures: • FDB command drop (Section 11.4.4). • Invalid SA drop (Section 11.4.7). • Moved Static address is a Security Breach drop (Section 11.4.8). • Unknown source MAC command drop, and Unknown source MAC is Security breach (Section 11.4.7.3). Bridge Filtered Packet Counter Number of packets dropped by the Bridge for reasons other than VLAN ingress filtering and Security breach events. AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Counte r Name M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 This counter counts packets dropped due any of the following reasons: • Rate Limiting drop (Section 11.10 "Ingress Port Packet Rate Limiting") • Local port drop (Section 11.13 "Bridge Local Switching") • Spanning Tree state drop (Section 11.3 "Spanning Tree Support") • IP Multicast drop (Section 11.12 "IP and Non-IP Multicast Filtering") • Non-IP Multicast drop (Section 11.12) • Unregistered Non-IPM Multicast drop (Section 11.11.1.2 "Per-VLAN Unregistered Non-IPv4/6 Multicast Filtering") • Unregistered IPv6 Multicast drop (Section 11.11.1.4 "Per-VLAN Unregistered IPv6 Multicast Filtering") • Unregistered IPv4 Multicast drop (Section 11.11.1.3 "Per-VLAN Unregistered IPv4 Multicast Filtering") • Unknown Unicast drop (Section 11.11.1.1 "Per-VLAN Unknown Unicast Filtering") • Unregistered IPv4 Broadcast drop (Section 11.11.1.5 "Per-VLAN 
Unregistered IPv4 Broadcast Filtering") • Unregistered non-IPv4 Broadcast drop (Section 11.11.1.6 "Per-VLAN Unregistered non-IPv4 Broadcast Filtering") This counter can be used to satisfy Bridge MIB objects: • dot1dTpPortInDiscards • dot1qTpVlanPortInDiscards Configuration To configure counter set 0 criteria, set the <Set0Vid>, <Set0Port> and <Set0CntMode> fields in the Counters • • • Set0 Configuration Register (Table 389 p. 655) accordingly. To configure counter set the <Set1Vid>, <Set1Port>, and <Set1CntMode> fields in the 1 criteria, set Counters Set1 Configuration Register (Table 394 p. 657) accordingly. The following counters can be read from their respective registers. These counters are clear-on-read: – – – – – – – Set0 VLAN Ingress Filtered Packet Count Register (Table 391 p. 656) accordingly. Set0 Security Filtered Packet Count Register (Table 392 p. 656) accordingly. Set0 Bridge Filtered Packet Count Register (Table 393 p. 657) accordingly. Set1 Incoming Packet Count Register (Table 395 p. 657) accordingly. Set1 VLAN Ingress Filtered Packet Count Register (Table 396 p. 658) accordingly. Set1 Security Filtered Packet Count Register (Table 397 p. 658) accordingly. Set1 Bridge Filtered Packet Count Register (Table 398 p. 658) accordingly. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 263 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Table 61: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Bridge Counters 11.16.2 Bridge Egress Counters AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 There are two sets of bridge egress counters, Set-0 and Set-1. Examples: • Counter-set mode programmed with (*, 4, 2, *) counts traffic going to port 4, queue 2 (regardless of packet VLAN and drop precedence). 
• Counter-set mode programmed with (20, *, 3, 0) counts all packets going to VLAN 20, queue 3, with drop precedence of 0 (regardless of the packet destination port).
• Counter-set mode programmed with (*, *, *, *) counts all egress traffic.

Each counter-set is applied to a packet egress stream based on a configurable mode that can be set to any combination of the parameters {VLAN, port, traffic class, drop precedence}, where any of the components may be a wildcard '*'.

Each counter-set has the following counters:

Table 62: Egress Counters per Counter-Set
Counter Name | Counter Description
OutUcPkts | Unicast packets transmitted for the egress packet stream.
OutMcPkts | Multicast packets transmitted for the egress packet stream.
OutBcPkts | Broadcast packets transmitted for the egress packet stream.
BridgeEgressFilteredPkts | Packets filtered due to egress VLAN filtering and Spanning Tree egress filtering for the egress packet stream.
TxQFilteredPkts | Packets filtered due to egress queue congestion.
CntrlAndToAnalyzerPkts | Packets with command TO_ANALYZER, FROM_CPU, or TO_CPU.

Configuration
• To set the Mode and its attributes {port, VLAN, traffic class, drop precedence} for Set 0, set the respective fields in the Txq MIB Counters Set0 Configuration Register (Table 506 p. 750).
• To set the Mode and its attributes {port, VLAN, traffic class, drop precedence} for Set 1, set the respective fields in the Txq MIB Counters Set1 Configuration Register (Table 513 p. 752).
• The following counters can be read from their respective registers. These counters are clear-on-read:
  – Set0 Outgoing Unicast Packet Counter (Table 507 p. 751)
  – Set0 Outgoing Multicast Packet Counter (Table 508 p. 751)
  – Set0 Outgoing Broadcast Packet Count Register (Table 509 p. 751)
  – Set0 Bridge Egress Filtered Packet Count Register (Table 510 p. 751)
  – Set0 Tail Dropped Packet Counter (Table 511 p. 751)
  – Set0 Control Packet Counter (Table 512 p. 752)
  – Set1 Outgoing Unicast Packet Counter (Table 514 p. 753)
  – Set1 Outgoing Multicast Packet Counter (Table 515 p. 753)
  – Set1 Outgoing Broadcast Packet Count Register (Table 516 p. 753)
  – Set1 Bridge Egress Filtered Packet Count Register (Table 517 p. 753)
  – Set1 Tail Dropped Packet Counter (Table 518 p. 753)
  – Set1 Control Packet Counter (Table 519 p. 754)

Section 12. IPv4 and IPv6 Unicast Routing

This section is relevant for the following devices:
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
• SecureSmart Stackable: 98DX169, 98DX249, 98DX269

Not relevant for SecureSmart and Layer 2+ devices.

12.1 Unicast Routing Features

The device supports the following Unicast routing features:
• Per-port and per-VLAN enabling of IPv4 and IPv6 Unicast routing.
• Policy-based IPv4/v6 routing lookup.
• Up to 1K prefix/host entries and 1K ARP MAC addresses. SecureSmart Stackable devices support up to 32 static IPv4 prefix/host entries and 256 ARP MAC addresses.
• Next-hop forwarding to any {device, port}, trunk, or VLAN group in the system.
• Per route entry QoS assignment.
• Per route entry mirroring-to-CPU or mirroring to ingress analyzer port.
• Per route entry TTL/Hop Limit decrement enable.
• Per route entry binding to one of the 32 PCL match counters.
• Router exception checking:
  – IPv4/v6 Header Error
  – TTL/Hop Limit Exceeded
  – Options
• Routed packet modifications:
  – MAC header modifications: DA/SA, VLAN, user priority
  – IP header modifications: TTL/Hop-Limit, checksum, DSCP
• Support for Layer 3 control traffic:
  – IPv4/v6 control protocols running over link-layer Multicast, e.g., RIPv2, OSPFv2
  – RIPv1
  – UDP Relay
• Egress mirroring of routed packets to an analyzer port.

12.2 Unicast Routing Overview

The device's Unicast routing support is distributed among several pipeline engines:
• Policy engine
• Bridge engine
• Unicast Router engine
• Header Alteration

The Policy engine is used to implement policy-based routing. The Policy rule action determines the packet's next-hop forwarding decision. The standard "longest prefix match" (LPM) lookup is supported as a simple case of policy-based routing, where the routing policy rules are ordered according to the prefix length.

The Bridge engine performs all the normal bridge engine functions and mirroring/trapping of control packets. Packets to be routed by the device must have a MAC DA that matches the router MAC entry in the Bridge FDB.

The Router engine uses the forwarding information from the Policy and Bridge engines, and performs the router exception checking and forwarding decision.

The stages of Unicast routing processing are illustrated in Figure 45.
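The statement in the overview that LPM is a simple case of policy-based routing can be illustrated in software. The following is a minimal sketch, not the device's implementation: it assumes a first-match rule list (as in a TCAM search) and shows that ordering rules by descending prefix length makes the first match the longest match. The `RouteRule` type, the `next_hop` labels, and the function names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRule:
    prefix: str    # hypothetical rule key: DIP value plus mask, e.g. "10.1.0.0/16"
    next_hop: str  # hypothetical label standing in for the rule action


def order_rules(rules):
    """Place longer prefixes first so host routes precede network prefixes."""
    return sorted(
        rules,
        key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
        reverse=True,
    )


def lookup(ordered_rules, dip):
    """Return the action of the first matching rule (first-match semantics)."""
    addr = ipaddress.ip_address(dip)
    for rule in ordered_rules:
        net = ipaddress.ip_network(rule.prefix)
        # IPv4 rules only match IPv4 packets, and likewise for IPv6.
        if addr.version == net.version and addr in net:
            return rule.next_hop
    return None


rules = order_rules([
    RouteRule("0.0.0.0/0", "default-v4"),   # default route: DIP fully masked
    RouteRule("10.1.0.0/16", "port-4"),
    RouteRule("10.1.2.3/32", "port-7"),     # host route: DIP fully unmasked
    RouteRule("::/0", "default-v6"),
])
```

With this ordering, `lookup(rules, "10.1.2.3")` selects the host route, while other addresses in 10.1.0.0/16 fall through to the network prefix rule.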
Figure 45: Processing of Unicast Routed Packets
[Figure not reproduced. The figure shows the following processing blocks:]
• Rx MAC Port processing
• Policy engine processing: perform the route lookup in the Router PCL; store the next-hop information for later use by the Router.
• Bridge engine processing: full processing by the Bridge engine; mark the packet if the DA is the Router MAC address.
• Router engine processing: if the packet is triggered for Unicast routing, perform Router Exception Checking and apply the router next-hop information.
• Policing engine processing
• Egress processing: L2 Multi-Target Replication, Egress Filtering, Queuing, shaping, scheduling.
• Router Header Alteration: modification of MAC DA/SA, TTL, Checksum, etc.

Notes
• Although the policy-based route lookup takes place in the ingress pipeline prior to the Bridge engine, the decision whether to route is only made in the Router engine.
• If the packet is not routed (i.e., the routing trigger criteria are not met), the Routing PCL rule action has no impact on the packet forwarding decision.

The Header Alteration on the egress port is responsible for performing the necessary modifications of the routed packet's MAC and IP headers.

12.3 Policy Engine Support of Unicast Routing

IPv4/6 Unicast routes are configured as standard 24-byte rules in a Policy Control List (PCL). The PCL containing routing rules is referred to as the "Routing PCL".

The PCL rule Action associated with the matching Routing PCL rule key contains the next-hop forwarding information. However, this action is only applied later by the Router engine if the packet meets the routing trigger requirements.

12.3.1 Binding the Routing PCL to a Port

The Policy engine supports two search cycles of the Policy engine TCAM. On a per-port basis, each lookup cycle can be bound to a different PCL. This configuration is performed using the Policy Configuration table entry for the given port (Section 10.3.2 "PCL Configuration Table" on page 179).

If IPv4/v6 routing is enabled on the interface, the corresponding entry in the Policy Configuration table must bind the second lookup cycle to the Router PCL. The binding is performed in the Policy Configuration table (Section 10.3.3 "Interface Binding to a Policy Configuration Entry" on page 180).

The Policy engine supports different search key types (Section 10.5 "Policy Search Keys" on page 183).

For routing IPv4 Unicast packets, the short key type "IPv4+L4" is used. This key type format is defined in Table 41, "Standard (24-bytes) Key Format," on page 186.

For routing IPv6 Unicast packets, the short key type "IPv6 DIP" is used. This is a dedicated key type to support IPv6 Unicast routing. This key type format is also defined in Table 41, "Standard (24-bytes) Key Format," on page 186.
The first lookup cycle can be used for any general policy application (e.g., security and QoS).

To bind the Router PCL to an interface, the corresponding Policy Configuration Table entry for the second lookup is configured as follows:
• <Lookup Enable> = 1
• <Lookup PCL-ID> = the Routing PCL-ID
• <Lookup Key Size> = 24-Byte Key Size
• <Lookup IPv4 Short Key Type> = IPv4 + L4 Key Type
• <Lookup IPv6 Short Key Type> = IPv6 DIP Key Type

Notes
• Multiple routing lists can be supported by defining multiple Router PCLs, each with a different PCL-ID. Each port can then be bound to a different Router PCL.
• PCLs may be bound based on ports or VLANs. When the PCL binding mode is according to VLAN, the Routing PCL is also bound according to the packet's VLAN.

Configuration
Configuration of the Policy Lookup Configuration is described in Section 10.3 "Policy Lookup Configuration" on page 179.

12.3.2 Routing PCL Rule Classification

The Router PCL is composed of standard 24-byte PCL rules. IPv4 route rules use the Policy engine standard IPv4+L4 key type. IPv6 route rules use the Policy engine standard IPv6 DIP key type.

When implementing an LPM-based route lookup, each rule is configured with the following key fields:

Table 63: Routing PCL Rule Classification Key Fields
Field Name | Description
DIP | This field contains the destination IPv4/6 address for the Routing PCL rule. The DIP value is set to either a network prefix or a host address, and the DIP mask is set according to the prefix length of the address. If the DIP is a host address, then the entire field is unmasked.
IsIP | 0 = Packet is neither IPv4 nor IPv6. 1 = Packet is either IPv4 or IPv6. Router PCL rules are configured with this field set to '1'.
IsIPv4 | 0 = Packet is IPv6. 1 = Packet is IPv4. Router PCL rules use this bit to distinguish between IPv4 rules and IPv6 rules.

To meet the LPM requirements, the Routing PCL rules must be ordered according to the DIP prefix length. Rules with longer prefixes appear before rules with shorter prefixes. Thus, rules containing host addresses are positioned in the PCL before rules containing the network prefix address.

For finer granularity and policy-based lookup, additional fields can be set in the key (e.g., DSCP or TCP/UDP port). Key fields that are not relevant must be masked.

Notes
• ECMP can be accomplished by creating multiple rules for a given DIP, with each rule including additional criteria (e.g., TCP/UDP port).
• The default route rule is supported by setting the last Router PCL rule with the <DIP> field masked. There can be two separate default route rules: one for IPv4 and one for IPv6. To distinguish between the default rules, the IPv4 default rule has <isIP>=1 and <isIPv4>=1, and the IPv6 default rule has <isIP>=1 and <isIPv4>=0. Both default rules have the <DIP> field masked.

12.3.3 Route PCL Rule Actions

The Routing PCL rule action has a different format than the generic PCL rule action format defined in Section 10.6 "Policy Actions" on page 195.

The Routing PCL rule action contains the next-hop routing information to be assigned to the packet if it is to be routed.

Table 64 defines the Routing PCL Action entry fields. The hardware format of the Routing PCL rule action is defined in Table 335, "Policy Action Entry as a Next Hop Entry<n> Register (0<=n<1024)," on page 593.

Table 64: Policy Action Entry As a Route Entry
Field | Description
Redirect Command | UC_ROUTE_ENTRY: indicates that this action is a Unicast Route entry.
Packet Command | If the packet is to be routed by the Router engine, it is assigned the packet command defined in this field:
  • SOFT_DROP
  • HARD_DROP
  • TRAP with CPU code IPV4_UC_ROUTE or IPV6_UC_ROUTE
  • FORWARD (i.e., packet is routed)
  • MIRROR to the CPU (i.e., packet is routed and mirrored to the CPU) with CPU code IPV4_UC_ROUTE or IPV6_UC_ROUTE
NextHop Interface | Next-hop routing interface. Can be set to either a {Device, Port}, Trunk-ID, or VLAN.
ARP DA Index | Route Entry ARP Index to the ARP Table, if the packet is to be routed by the Router engine. (Section 12.6.1 "ARP MAC Destination Address" on page 276)
VLAN-ID | VID assignment of the packet, if it is to be routed by the Router engine. (Section 12.6.3 "VLAN-ID Assignment" on page 277)
Modify UP | Enable the packet 802.1p User Priority to be modified according to the QoS Profile Index, if the packet is to be routed by the Router engine. (Section 8.1.3 "Packet QoS Attributes" on page 113)
Modify DSCP | Enable the packet DSCP to be modified according to the QoS Profile Index, if the packet is to be routed by the Router engine. (Section 8.1.3 "Packet QoS Attributes" on page 113)
QoS Profile Marking Enable | Enable the packet QoS Profile Index to be modified, if the packet is to be routed by the Router engine. (Section 8.1.3 "Packet QoS Attributes" on page 113)
QoS Profile Index | The packet QoS Profile Index assignment, if the packet is to be routed by the Router engine. (Section 8.1.3 "Packet QoS Attributes" on page 113)
Mirror to Ingress Analyzer Enable | If set, the packet is mirrored to the Ingress Analyzer port, if the packet is to be routed by the Router engine. (Section 16.2 "Traffic Mirroring to Analyzer Port" on page 314)
TTL/Hop Limit Decrement Disable | If set, the routed packet IPv4 <TTL> or IPv6 <Hop Limit> is not decremented. (Section 12.6.5 "Decrement IPv4 TTL or IPv6 Hop Limit" on page 277)
Bypass Router engine TTL and Options Check | If set, the Router engine bypasses the IPv4 TTL/Option check and the IPv6 Hop Limit/Hop-by-Hop check. This is used for IP-TO-ME host entries, where the packet is destined to this device. (Section 12.5.2.2 "IPv4 TTL Exceeded" on page 272) (Section 12.5.2.3 "IPv4 Options" on page 273) (Section 12.5.2.5 "IPv6 Hop Limit Exceeded" on page 273) (Section 12.5.2.6 "IPv6 Hop-by-Hop Options Header" on page 274)
ICMP Redirect Check Enable | If set, the Router engine checks whether the next-hop VLAN is equal to the packet VLAN assignment. If so, the packet is mirrored to the CPU so that an ICMP Redirect message can be sent back to the sender. (Section 12.5.2.7 "ICMP Redirect Error" on page 274)
Match Counter Enable | If set, the PCL rule is bound to the specified match counter index, which is incremented for each packet that matches this rule. The match counter is incremented regardless of whether or not the packet is routed by the Router engine. (Section 10.2.2.5 "Match Counters" on page 177)
Match Counter Index | The match counter index to which this rule is bound. (Section 10.2.2.5 "Match Counters" on page 177)

Note
The Policy engine only applies the rule action configuration for the <Match Counter>. The remaining Routing PCL action values are stored internally but are not applied to the packet's forwarding information.

12.3.4 Aging Host Route Entries

Host route entries that are no longer active can be removed to free up TCAM rule entries.

One method of determining that a rule is not in use is to bind the rule to one of the 32 PCL Match counters. The counter can be polled periodically to determine whether any new traffic has matched this rule.

Another method is to use the Bridge aging mechanism as a trigger to delete IP host entries. When a Bridge Aged Address (AA) Update message is sent to the CPU for a given MAC Address, the CPU can use its ARP table shadow to reverse map the MAC Address to an IP address. If this IP address is a host entry, the CPU can free the PCL rule for this IP host and its associated entry in the hardware ARP MAC table (Section 12.6.1 "ARP MAC Destination Address" on page 276).

12.4 Bridge Engine Support for Unicast Routing

A Router PCL rule match does not affect the packet's subsequent processing by the Bridge engine. The packet has the same VLAN and QoS assignment that it had prior to the Router PCL lookup.

A packet that matched a Router PCL rule is still subject to all the Bridge engine processing mechanisms, e.g., ingress VLAN filtering, source learning, destination lookup, and the various dropping/trapping/mirroring mechanisms.

12.4.1 FDB Router MAC Entry

The FDB is configured with a static Unicast Router MAC address for each VLAN interface enabled for routing. The FDB Router MAC entry is configured as follows:
• <MAC Address> = Unicast Router MAC address
• <VLAN-ID> = VLAN-ID of the Router interface
• <static> = 1
• <DA Route> = 1, indicating that the entry Unicast MAC Address is the Router MAC
• <DA Command> = FORWARD
• <Device Number> = local device
• <Port Number> = CPU port (63)

Note
For IPv4/6 packets destined to the Unicast IP address of the router itself (IP-to-Me), see Section 12.7.1 "IPv4/v6 Unicast Management Traffic (IP-to-Me)" on page 278.
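The two host-route aging methods of Section 12.3.4 are CPU software procedures, so they can be sketched without reference to hardware access. The following is an illustrative sketch only: the class and method names are hypothetical, no register access is modelled, and the match counter is assumed to be free-running (not clear-on-read), which this specification excerpt does not state.

```python
class HostRouteAger:
    """CPU-side bookkeeping for aging IP host route entries (hypothetical)."""

    def __init__(self):
        self.arp_shadow = {}   # MAC address -> set of host IPs (ARP table shadow)
        self.last_count = {}   # match counter index -> value seen at last poll

    def add_host(self, mac, ip):
        """Record that a host route for `ip` resolves to `mac`."""
        self.arp_shadow.setdefault(mac, set()).add(ip)

    def on_aged_address(self, mac):
        """Handle a Bridge Aged Address (AA) message.

        Returns the host IPs whose PCL rule and ARP MAC table entry
        the caller may now free.
        """
        return sorted(self.arp_shadow.pop(mac, set()))

    def rule_was_idle(self, counter_index, current_value):
        """Match-counter method: True if no packet hit the rule since last poll.

        Assumes a free-running counter; the first poll only records a baseline.
        """
        previous = self.last_count.get(counter_index)
        self.last_count[counter_index] = current_value
        return previous is not None and previous == current_value
```

A caller would invoke `rule_was_idle` from a periodic timer with the value read from the bound match counter, and `on_aged_address` from the handler for AA Update messages.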
12.4.2 Per-VLAN Enable for IPv4 and IPv6 Unicast Routing

IPv4 Unicast routing and IPv6 Unicast routing can be independently enabled/disabled on a per-VLAN basis.

The VLAN routing enable state is one of the trigger requirements for the Router engine (Section 12.5.1 "Triggering Unicast Routing").

Configuration
• To enable/disable IPv4 Unicast Routing on a given VLAN, set the <IPv4 Unicast routing Enable> field in the VLAN Entry Fields (Table 51 p. 217) accordingly.
• To enable/disable IPv6 Unicast Routing on a given VLAN, set the <IPv6 Unicast routing Enable> field in the VLAN Entry Fields (Table 51 p. 217) accordingly.

12.5 Router Engine Processing

The Unicast Router engine is part of the ingress pipeline processing, residing between the Bridge engine and the Policing engine.

12.5.1 Triggering Unicast Routing

A packet must meet ALL of the following trigger requirements to be processed by the Unicast Router engine:
• The Bridge engine packet command resolution is FORWARD or MIRROR-to-CPU.
• The packet matched a PCL routing rule in the Policy engine second lookup cycle (Section 12.3 "Policy Engine Support of Unicast Routing").
• One of the following is true (Section 12.4.2 "Per-VLAN Enable for IPv4 and IPv6 Unicast Routing"):
  – The packet is IPv4 (EtherType=0x0800) AND the packet VLAN is enabled for IPv4 Unicast routing.
  – The packet is IPv6 (EtherType=0x86DD) AND the packet VLAN is enabled for IPv6 Unicast routing.
• One of the following is true:
  – The packet matched an FDB Router MAC entry with <DA Route> set (Section 12.4 "Bridge Engine Support for Unicast Routing" on page 270).
  – The packet <target device> = local device and the packet <target port> = 61 (the virtual router port) (Section 12.8 "One-Armed Router Configuration").

If the packet meets all the above criteria, it is then processed by the Router engine. If any criterion is not met, the packet forwarding information is not modified by the Router engine.
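The trigger requirements above combine into a single boolean condition. The following is a minimal sketch of that condition for illustration; the `PacketState` type and its field names are hypothetical stand-ins for the per-packet state that the Policy and Bridge engines hand to the Router engine, and are not register or descriptor names from this specification.

```python
from dataclasses import dataclass

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
VIRTUAL_ROUTER_PORT = 61  # virtual router port named in Section 12.5.1


@dataclass
class PacketState:
    bridge_command: str            # e.g. "FORWARD", "MIRROR_TO_CPU", "TRAP"
    matched_routing_rule: bool     # hit in the second Policy lookup cycle
    ethertype: int
    vlan_ipv4_uc_enabled: bool     # per-VLAN <IPv4 Unicast routing Enable>
    vlan_ipv6_uc_enabled: bool     # per-VLAN <IPv6 Unicast routing Enable>
    fdb_da_route: bool             # matched FDB entry has <DA Route> set
    target_is_local_device: bool
    target_port: int


def is_triggered_for_unicast_routing(p: PacketState) -> bool:
    """Return True only if ALL trigger requirements are met."""
    command_ok = p.bridge_command in ("FORWARD", "MIRROR_TO_CPU")
    protocol_ok = (
        (p.ethertype == ETHERTYPE_IPV4 and p.vlan_ipv4_uc_enabled)
        or (p.ethertype == ETHERTYPE_IPV6 and p.vlan_ipv6_uc_enabled)
    )
    destination_ok = p.fdb_da_route or (
        p.target_is_local_device and p.target_port == VIRTUAL_ROUTER_PORT
    )
    return command_ok and p.matched_routing_rule and protocol_ok and destination_ok
```

Writing the two "OR" requirements as separate sub-expressions makes it explicit that each alternative pair is evaluated on its own before being ANDed with the other requirements.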
The Router MAC FDB entry serves two purposes:
• The <DA Route> bit serves as input to the Router engine trigger check (Section 12.5.1 "Triggering Unicast Routing" on page 271) to determine whether the packet is eligible for routing. If the packet is subsequently routed by the Router engine, the bridge forwarding decision (CPU port 63) is overridden by the next-hop information assigned by the Router PCL rule action.
• If the packet is not IPv4/6 or does not pass the Router engine trigger check for any other reason, the packet is sent to the CPU port (as defined in the FDB entry) with the CPU code BRIDGE PACKET FORWARD. This applies to non-IP traffic such as ARP Reply packets sent to the Router MAC address.

The router processing consists of the following ingress pipeline stages in the Router engine:
• “Router Exception Checking”
• “Router Command and CPU Code Resolution”
and the following stage in the egress pipeline:
• “Routed Packet Header Modification”
These stages are discussed in the following sections.

12.5.2 Router Exception Checking
The following exception checks are made on IPv4 packets:
• “IPv4 Header Error”
• “IPv4 TTL Exceeded”
• “IPv4 Options”
The following exception checks are made on IPv6 packets:
• “IPv6 Header Error”
• “IPv6 Hop Limit Exceeded”
• “IPv6 Hop-by-Hop Options Header”
The “ICMP Redirect Error” check is performed for both IPv4 and IPv6 packets.
The following subsections describe each of the router exception checks.

12.5.2.1 IPv4 Header Error
An IPv4 Header Error is detected if ANY of the following criteria are not met:
• Valid IPv4 header <Checksum>
• IPv4 header Version = 4
• IPv4 header IP Header Length >= 5
• IPv4 header IP Header Length <= (IPv4 header Total Length / 4)
• IPv4 header IP Header Length + packet L3 byte offset + 4 CRC bytes <= packet Layer 2 byte count
If an IPv4 Header Error is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the IPv4 Header Error command is TRAP, the packet is assigned a CPU code IPv4 HEADER ERROR. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).

Configuration
To configure the IPv4 Header Error Command, set <IPv4BadHeaderCmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.5.2.2 IPv4 TTL Exceeded
If the Router PCL rule action has <TTL/Hop Limit Decrement Disable>=0, the packet TTL is decremented by 1 (but remains at zero if it is already zero).
After the optional decrement, if the IPv4 packet TTL field is either ‘0’ or ‘1’ and Route<DecTTL> = 1, a TTL Exceeded exception is detected.
If an IPv4 TTL Exceeded exception is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the packet is assigned the TRAP command, it is assigned a CPU code IPV4 TTL EXCEEDED. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).

Configuration
To configure the IPv4 TTL Exceeded Command, set <IPv4TTL ExceededCmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.5.2.3 IPv4 Options
If the IPv4 Header Length is greater than 5, the packet contains IPv4 options and an IPv4 Options exception is detected.
If an IPv4 Options exception is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the packet is assigned the TRAP command, it is assigned a CPU code IPv4 OPTIONS. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).
The IPv4 Options check is skipped if the matching Router PCL rule action has the field <Bypass Router engine TTL and Options Check> set. This is intended to allow packets destined to the router IP host address to be received even if the packet has IPv4 Options.

Configuration
To configure the IPv4 Options Command, set <IPv4Options Cmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).
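The IPv4 Header Error criteria (Section 12.5.2.1) and the IPv4 Options check (Section 12.5.2.3) can be sketched as simple predicates. This is an illustrative sketch only; all function and parameter names are hypothetical. Two interpretation assumptions are made and should be checked against the device behavior: the division in "Total Length / 4" is treated as integer division, and in the last criterion the IP Header Length (a count of 32-bit words) is converted to bytes before being added to the byte offsets.

```python
CRC_BYTES = 4


def ipv4_header_error(version, ihl, total_length, checksum_ok,
                      l3_offset, l2_byte_count):
    """Return True if any of the five listed criteria is not met.

    ihl is the IP Header Length field in 32-bit words.
    Assumption: the last criterion is evaluated in bytes, so ihl is
    scaled by 4 there; the text does not state the units explicitly.
    """
    criteria = (
        checksum_ok,                                       # valid <Checksum>
        version == 4,
        ihl >= 5,
        ihl <= total_length // 4,                          # assumed integer division
        ihl * 4 + l3_offset + CRC_BYTES <= l2_byte_count,  # assumed byte units
    )
    return not all(criteria)


def ipv4_options_exception(ihl, bypass):
    """Options exception: header longer than 5 words, unless the matching
    rule action has <Bypass Router engine TTL and Options Check> set."""
    return ihl > 5 and not bypass


def exception_result(detected, global_cmd, cpu_code):
    """Map a detected exception to (command, CPU code).

    global_cmd is the globally configured command, "TRAP" or "FORWARD".
    A CPU code is assigned only when the command is TRAP.
    """
    if not detected:
        return None
    if global_cmd == "TRAP":
        return ("TRAP", cpu_code)
    return ("FORWARD", None)
```

For example, a 20-byte header (`ihl=5`) in a 40-byte datagram at L3 offset 14 within a 64-byte frame passes all criteria, while the same packet with `ihl=4` is flagged as a header error.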
12.5.2.4 IPv6 Header Error
An IPv6 Header Error is detected if ANY of the following criteria are not met:
• IPv6 header IP Version = 6
• IPv6 Payload Length + 40 bytes of IPv6 header + packet L3 byte offset + 4 CRC bytes <= packet Layer 2 byte count
If an IPv6 Header Error is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the packet is trapped to the CPU, it is assigned a CPU code IPv6 HEADER ERROR. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).

Configuration
To configure the IPv6 Header Error Command, set <IPv6BadHeaderCmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.5.2.5 IPv6 Hop Limit Exceeded
If the Router PCL rule action has <TTL/Hop Limit Decrement Disable>=0, the packet Hop Limit is decremented by 1 (but remains at zero if it is already zero).
After the optional decrement, if the IPv6 packet Hop Limit field is either ‘0’ or ‘1’ and Route<DecHopLimit> = 1, a Hop Limit Exceeded exception is detected.
If an IPv6 Hop Limit Exceeded exception is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the packet is trapped to the CPU, it is assigned a CPU code IPv6 HOP LIMIT EXCEEDED. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).
The Hop Limit exception check is skipped if the matching Router PCL rule action has the field <Bypass Router engine TTL and Options Check> set. This is intended to allow packets destined to the router IP host address to be received even if the incoming packet Hop Limit is ‘0’ or ‘1’.
The IPv4 TTL Exceeded check (Section 12.5.2.2 "IPv4 TTL Exceeded") is skipped under the same condition, that is, if the matching Router PCL rule action has the field <Bypass Router engine TTL and Options Check> set. This is intended to allow packets destined to the router IP host address to be received even if the packet incoming TTL is ‘0’ or ‘1’.

Configuration
To configure the IPv6 Hop Limit Exceeded Command, set <IPv6HopLimit ExceededCmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.5.2.6 IPv6 Hop-by-Hop Options Header
If the IPv6 packet contains a Hop-by-Hop Options header, a Hop-by-Hop Options exception is detected.
If an IPv6 Hop-by-Hop Options exception is detected, there is a global configuration option to assign the packet command with either TRAP or FORWARD.
If the packet is trapped to the CPU, it is assigned a CPU code IPv6 HOP-BY-HOP OPTIONS. The packet can be dropped by setting the corresponding CPU code table entry with the sampling threshold value of ‘0’ (Section 7.2 "Packets to the CPU" on page 102).
The IPv6 Hop-by-Hop Options check is skipped if the matching Router PCL rule action has the field <Bypass Router engine TTL and Options Check> set. This is intended to allow packets destined to the router IP host address to be received even if the packet has an IPv6 Hop-by-Hop Options header.

Configuration
To configure the IPv6 Hop-by-Hop Option Command, set <IPv6HBH Cmd> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.5.2.7 ICMP Redirect Error
An ICMP Redirect Error message is sent by a router to a host to inform it of a better next-hop router on the path to the destination.
If the Routing PCL action next-hop VLAN-ID equals the VLAN-ID assigned to the packet, an ICMP Redirect exception is detected.
If the matching Router PCL action has the field <ICMPRedirect CheckEn> set and an ICMP Redirect exception is detected, the packet is assigned the MIRROR-to-CPU command with the CPU code ICMP REDIRECT.

12.5.3 Router Command and CPU Code Resolution
The final packet command resolution is based on the following:
• Bridge engine packet command (must be either FORWARD or MIRROR; otherwise the packet is not eligible for routing)
• Routing PCL action <Packet Command>
• Exception commands, in the event that one or more exceptions are detected

Multiple mechanisms in the Router engine can assign the TRAP command. Since each mechanism assigns a CPU code, the following is the order of precedence for the Router TRAP CPU code assignment:
1. IPv4 HEADER ERROR (Section 12.5.2.1 "IPv4 Header Error")
2. IPv6 HEADER ERROR (Section 12.5.2.4 "IPv6 Header Error")
3. IPv4 TTL EXCEEDED (Section 12.5.2.2 "IPv4 TTL Exceeded")
4. IPv6 HOP LIMIT EXCEEDED (Section 12.5.2.5 "IPv6 Hop Limit Exceeded")
5. IPv4 OPTIONS (Section 12.5.2.3 "IPv4 Options")
6. IPv6 HOP-BY-HOP OPTIONS (Section 12.5.2.6 "IPv6 Hop-by-Hop Options Header")
7. IPv4 UC Route Command (Packet Command (Table p. 269))
8. IPv6 UC Route Command (Packet Command (Table p. 269))
For example, if an IPv4 packet has a Header Error exception and the Header Error command is TRAP, and in addition the PCL Route Action Packet command is TRAP, the packet is trapped with the CPU code IPv4 HEADER ERROR.

Similarly, if multiple mechanisms in the Router engine assign the command MIRROR, the following is the order of precedence for the Router MIRROR CPU code assignment:
1. IPv4 ICMP Redirect (Section 12.5.2.7 "ICMP Redirect Error")
2. IPv6 ICMP Redirect (Section 12.5.2.7 "ICMP Redirect Error")
3. IPv4 UC Route Command (Packet Command (Table p. 269))
4. IPv6 UC Route Command (Packet Command (Table p. 269))

12.5.4 Router Source-ID Assignment
A globally configurable source-ID is assigned to routed packets. The source-ID is an egress filtering mechanism that is useful for preventing loops in a non-Spanning Tree topology (Section 11.14 "Bridge Source-ID Egress Filtering" on page 257).

Configuration
To configure the Router source-ID assignment, set <Router Source-ID> in the Unicast Routing Engine Configuration Register (Table 417 p. 680).

12.6 Routed Packet Header Modification
A packet routed by the Router engine undergoes packet header modification on transmission from the device.
The following packet fields are modified in routed packets:
• MAC DA modification
• MAC SA modification
• VLAN-ID modifications
• Packet IPv4/6 DSCP and 802.1p User Priority modification
• Decrement of the IPv4 TTL or IPv6 Hop Limit
• IPv4 Checksum modification
• DSA Tag source device and port modification
The following subsections describe the modifications of each of these packet fields.
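The CPU code precedence lists of Section 12.5.3 "Router Command and CPU Code Resolution" can be read as a first-match search over an ordered list. The sketch below illustrates that reading only. The string identifiers are hypothetical labels for the CPU codes named in the text, not register or API names, and the function name is an assumption.

```python
# Order of precedence for the Router TRAP CPU code (highest first).
TRAP_CPU_CODE_PRECEDENCE = [
    "IPV4_HEADER_ERROR",
    "IPV6_HEADER_ERROR",
    "IPV4_TTL_EXCEEDED",
    "IPV6_HOP_LIMIT_EXCEEDED",
    "IPV4_OPTIONS",
    "IPV6_HOP_BY_HOP_OPTIONS",
    "IPV4_UC_ROUTE",   # IPv4 UC Route Command (PCL action <Packet Command>)
    "IPV6_UC_ROUTE",   # IPv6 UC Route Command (PCL action <Packet Command>)
]

# Order of precedence for the Router MIRROR CPU code (highest first).
MIRROR_CPU_CODE_PRECEDENCE = [
    "IPV4_ICMP_REDIRECT",
    "IPV6_ICMP_REDIRECT",
    "IPV4_UC_ROUTE",
    "IPV6_UC_ROUTE",
]


def resolve_cpu_code(candidates, precedence):
    """Return the highest-precedence CPU code among the mechanisms that
    assigned the command, or None if no mechanism assigned it."""
    for code in precedence:
        if code in candidates:
            return code
    return None
```

With the example from Section 12.5.3, where both the Header Error command and the PCL Route Action command are TRAP, `resolve_cpu_code({"IPV4_UC_ROUTE", "IPV4_HEADER_ERROR"}, TRAP_CPU_CODE_PRECEDENCE)` selects `"IPV4_HEADER_ERROR"`.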
The packet command resolution described in Section 12.5.3 is performed according to Section 5.2 "Command Resolution Matrix" on page 53.

Notes
• Packets (including routed packets) queued to the CPU are not subject to any packet modifications other than DSA tagging.
• If a routed packet is transmitted on a port configured for egress mirroring, the packet is transmitted on the egress analyzer port with the SAME routed packet modifications that were performed on the original egress port.

12.6.1 ARP MAC Destination Address
If the packet is routed by the device, the packet’s MAC DA is modified to reflect the next-hop’s MAC Address.
The next-hop MAC Address is determined by the <ARP DA Index> assigned to the routed packet in the Router PCL rule action, which serves as an index into the 1K-entry ARP MAC DA table. Each entry in the ARP MAC DA table contains a 48-bit MAC Address.

Configuration
To configure the ARP MAC DA table, see the description in C.6.2 "Router ARP DA Table" on page 406.

12.6.2 Router MAC Source Address
The routed packet’s MAC SA is modified to reflect the router’s MAC Address.
There is a global base router MAC Address that defines the 40 most significant bits of the router MAC SA.
The least significant bits of the router MAC SA are defined according to the globally configurable router MAC SA modification mode:
• Mode 0: 12 LSB of base SA is set according to the 12 LSB of packet VID.
• Mode 1: 8 LSB of base MAC SA is set according to the 8-bit value configured per VLAN.
• Mode 2: 8 LSB of base MAC SA is set according to the 8-bit value configured per egress port.
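The three router MAC SA modification modes above can be sketched as bit operations on a 48-bit address. This is a sketch under stated assumptions rather than the device's implementation: the function and parameter names are hypothetical, the configured base is taken to be the 40 most significant bits with the low 8 bits initially zero, and Mode 0 is read literally, so it overwrites the 12 least significant bits of the address, including the low 4 bits of the configured base.

```python
def router_mac_sa(base40, mode, vid=0, vlan_value=0, port_value=0):
    """Return the 48-bit router MAC SA as an integer.

    base40:     40 most significant bits of the router MAC SA (global base)
    mode:       router MAC SA modification mode (0, 1 or 2)
    vid:        packet VLAN-ID (used in Mode 0)
    vlan_value: 8-bit value configured per VLAN (used in Mode 1)
    port_value: 8-bit value configured per egress port (used in Mode 2)
    """
    base48 = (base40 & 0xFFFFFFFFFF) << 8
    if mode == 0:
        # Literal reading: the 12 LSBs are taken from the 12 LSBs of the VID.
        return (base48 & ~0xFFF) | (vid & 0xFFF)
    if mode == 1:
        return (base48 & ~0xFF) | (vlan_value & 0xFF)
    if mode == 2:
        return (base48 & ~0xFF) | (port_value & 0xFF)
    raise ValueError("unsupported router MAC SA mode")
```

For example, with a base of 00:11:22:33:44 and Mode 1, a per-VLAN value of 0x5A yields 00:11:22:33:44:5A.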
If the packet is routed, either by the local device or by a remote device, there is a per-port configuration option that controls whether the router MAC SA modification is performed.
In a cascaded system where all devices are capable of modifying the MAC SA, it is recommended that the router MAC SA modification be performed only on the final egress network ports and disabled on cascade ports. Thus, if the routed packet is egressed on a remote device, the packet router MAC SA can be modified to reflect the final egress port (if working in router MAC SA mode 0), and the non-ingress devices can learn the original MAC SA of the bridged packet.

Configuration
• To configure the Router MAC SA base address, set the fields <Router MAC_SABase [31:0]> and <Router MAC_SABase [39:32]> in the Router MAC Base SA register0/1 accordingly.
• To enable the egress port for MAC SA modification of routed packets, set <MAC SAModeEn[31:0]> in the Router Header Alteration Enable MAC SA Modification Register (Table 135 p. 405).
• To configure the Router MAC SA mode, set the field <Router MACSAMode> in the Router Header Alteration Global Configuration Register (Table 134 p. 404).

12.6.3 VLAN-ID Assignment
The routed packet’s VLAN-ID assignment is modified according to the Router PCL rule action.
If the egress port is a tagged member of the VLAN, the packet is transmitted tagged, with the routed VLAN-ID assignment (Section 11.2.1.2 "Egress Port Tag Modification" on page 205).

12.6.4 IPv4/6 DSCP and 802.1p User Priority Modification
Routed packets are subject to DSCP and User Priority modification in the same way as bridged packets. For details on the DSCP and User Priority packet modifications, see Section 8.5 "Setting Packet Header QoS Fields" on page 126.

Note
The Router engine optionally modifies the packet’s QoS attributes according to the PCL Router Action fields <Modify UP>, <Modify DSCP>, <QoS Profile Marking Enable>, and <QoS Profile Index>. For details on QoS Attributes, see Section 8.1.3 "Packet QoS Attributes" on page 113.

12.6.5 Decrement IPv4 TTL or IPv6 Hop Limit
If the Router PCL rule action <Do Not Decrement TTL> is not set and the routed packet’s TTL/Hop Limit is greater than ‘0’, the TTL/Hop Limit field is decremented by ‘1’.
See Section 12.5.2.2 "IPv4 TTL Exceeded" and Section 12.5.2.5 "IPv6 Hop Limit Exceeded" for details about exception checking.

12.6.6 Update IPv4 Checksum
The routed packet’s IPv4 checksum field is always updated to reflect the changes in the packet header.

12.6.7 DSA Tag Source Device and Port Modification
In one-armed router stacking configurations, where one device provides routing services for another, the routing device must modify the original DSA tag <source device> and <source port> fields to reflect the routing device.
The motivation for this feature is to prevent potential problems in one-armed router configurations, such as:
• The original ingress device may be configured to discard DSA packets if <source device> is equal to the device number (Section 4.4 "Loop Detection" on page 47).
• The original ingress device may filter packets destined to the port on which the packet was originally received. (This is only true in legacy DX devices that are not routing aware.)
There is a per-port configuration to enable modifying the FORWARD DSA <source device> and <source port> when transmitting a packet that was routed by the device through a cascade port. The FORWARD DSA <source device> is set to the local device number and the <source port> is set to ‘61’, the virtual router port.
The one-armed router model and virtual router port are described in Section 12.8 "One-Armed Router Configuration" on page 279.
For the full format of the FORWARD DSA tag, see A.4 "Extended DSA Tag in FORWARD Format" on page 341.

Configuration
To enable modification of the FORWARD DSA <source device> and <source port> of packets routed by the local device, set <DevIDModEn[31:0]> in the Device ID Modification Enable Register (Table 138 p. 406).

12.7 Layer 3 Control Traffic to the CPU
The Bridge engine supports mechanisms for trapping or mirroring control traffic to the CPU (Section 11.8 "Control Traffic Trapping/Mirroring to the CPU" on page 244).
This includes support for mirroring the following Layer 3 protocols:
• ARP (Section 11.8.6.1)
• IGMP (Section 11.8.4)
• MLD and other IPv6 ICMP (Section 11.8.5)
• IPv6 Neighbor Solicitation (Section 11.8.6.2)
• IPv4/v6 Control Protocols Running Over Link-Local Multicast (Section 11.8.6.3). This includes RIPv2 and OSPFv2 packets.
• RIPv1 MAC Broadcast (Section 11.8.6.4)
Each packet type that is trapped or mirrored to the CPU is assigned a unique CPU code. Each CPU code has an associated entry in the CPU Code table.
The CPU Code table entry determines the packet destination device, the packet traffic class and Drop Precedence assignment, sampling threshold, and truncation option (Section 7.2 "Packets to the CPU" on page 102).

Notes
• Packets to the CPU are not subject to any packet modification other than adding a TO_CPU DSA tag.
• In a cascaded system, control traffic is trapped/mirrored to the CPU by the ingress device only. The cascade ports are configured to bypass the Bridge (Section 11.1 "Bypassing Bridge Engine" on page 203), so the Bridge control traffic mechanisms are disabled on the non-ingress devices.

12.7.1 IPv4/v6 Unicast Management Traffic (IP-to-Me)
IPv4/6 Unicast packets destined to one of the Router’s locally attached IP addresses are sent to the CPU by the Router engine.
The Router PCL action <Packet Command> associated with the Router IP address assigns the packet with the TRAP command. The packet is sent to the CPU with the CPU code IPV4_UC_ROUTE or IPV6_UC_ROUTE.
Alternatively, the Router PCL action <Packet Command> is set to FORWARD and the Router PCL action <Next Hop> is set to the CPU port 63. In this case, the packet is sent to the CPU with the CPU code ROUTED Packet FORWARD.

12.7.2 UDP Relay
Some UDP applications, such as DHCP, require a client station to send a Layer 2 Broadcast packet to the server station. If the server resides on the VLAN, it receives the Broadcast packet and replies with a Unicast packet back to the client. However, it is not always practical for the server to reside on every VLAN that requires the service.
To allow for a single server to reside in the network, UDP Relay allows UDP Broadcast packets to be mirrored or trapped to the router CPU. The router CPU can then forward the packet on the VLAN on which the server resides.
12.7.2.1 IPv4 UDP Relay
The Bridge mechanism Per-VLAN Unregistered IPv4 Broadcast Filtering (Section 11.11.1.5) can be used to mirror UDP Broadcast packets to the CPU.
Alternatively, if more selective UDP Broadcast mirroring is required (e.g., only for a given UDP protocol), a Router PCL rule can be defined that matches the packet based on the 24-byte IPv4+L4 key fields:
• <isBC>=1, indicating the packet is a Layer 2 Broadcast
• <isIPv4>=1
• <IPProtocol>=17
• <L4 Bytes 2/3> = UDP destination port
• <CPU code> = any user-defined CPU code
The PCL rule action for this rule is configured with:
• <Redirect Command> = NO REDIRECTION
• <FORWARD Command> = MIRROR-to-CPU

Notes
• In principle, the rule key can also include the VLAN, but this is not practical if UDP relay is required on many VLANs. Thus each approach has its advantages. Using the Bridge IPv4 Broadcast command provides control per VLAN, but is not protocol-specific. Using the PCL rule allows protocol-specific (per UDP port) mirroring/trapping, but it may not be practical to define such a rule per VLAN.
• If the Policy rule assigns the packet with a MIRROR/TRAP command, the Bridge Per-VLAN Unregistered IPv4 Broadcast Filtering is not applied to the packet.

12.7.2.2 IPv6 UDP Relay
IPv6 uses link-local Multicast addresses rather than a Broadcast address. For example, a client sends a packet to all DHCP relay agents and servers with a well-known link-local Multicast IPv6 destination address of FF02:0:0:0:0:0:1:2.
To mirror packets with such well-known Multicast addresses to the CPU, a policy rule can be defined based on the 24-byte IPv6 packet key fields:
• <isIP>=1
• <isIPv4>=0
• <DIP>=well-known IPv6 Multicast link-local address
• <CPU code> = any user-defined CPU code
The PCL rule action for this rule is configured with:
• <Redirect Command> = NO REDIRECTION
• <FORWARD Command> = MIRROR-to-CPU

12.8 One-Armed Router Configuration
A cascaded system may contain both Layer 2 (L2) and Layer 3 (L3) capable devices. In such systems, the L3 devices provide routing services on behalf of the L2 devices. This is referred to as a “one-armed router” system.
Traffic received on the L2 device that is destined to the router Unicast MAC address is bridged to the L3 device router engine.
The router engine on a remote device is represented by a special port configuration called the virtual router port. It is assigned the value 61 (decimal).
To bridge traffic received on the L2 device network ports that is destined to the Router MAC address, the L2 device Bridge FDB must be configured with a static Unicast MAC address for each VLAN for which Unicast routing is supported.
The Router MAC FDB entry is configured with:
• FDB entry <MAC Address> = Unicast Router MAC address
• FDB entry <VLAN> = VLAN interface of the Router
• FDB entry <static> = 1
• FDB entry <Device Number> = the L3 Routing device number
• FDB entry <Port Number> = the Virtual Router Port (port 61)
See Section 11.4.6.2 "FDB Table Read/Write Access" on page 231 for details on how the FDB is updated.
If there are multiple L3 Unicast routing devices in the cascaded system, the traffic can be distributed among the L3 devices by specifying the FDB Unicast Router MAC entry for each VLAN interface with a different L3 device number.
Traffic received on L2 device network ports with the Router MAC DA is bridged across the cascade ports of the L2 devices until it is received by the target L3 device. The packet DSA tag carries the destination L3 device number and port number. Intermediate L2 devices forward the packet towards the destination L3 device without any ingress Policy and Bridge engine processing (the cascade ports are configured to disable these engines).
The L3 device cascade ports must be enabled for Policy engine processing (but the Bridge engine processing is disabled). The cascade port has the second PCL lookup bound to the Router PCL-ID (Section 12.3.1 "Binding the Routing PCL to a Port").
If the packet meets the Router Trigger requirements (Section 12.5.1 "Triggering Unicast Routing"), it is then routed by the L3 device.

Note
All L2 and L3 control traffic trapping/mirroring is performed only on the ingress device, regardless of whether the ingress device is an L2 or L3 device. This ensures that only one copy of the control packet is sent to the CPU. Note that on a per-CPU code basis, each type of control packet can be sent to the CPU with a different device number (Section 7.2 "Packets to the CPU" on page 102).

Section 13. Port Trunking
This section describes the device’s Port Trunking features.
Trunking, also known as Link Aggregation, allows very high bandwidth interfaces to be created by bundling multiple physical links into a single logical link. Traffic forwarded to the trunk interface is load-balanced across the physical links, thus achieving an effective bandwidth close to the aggregate bandwidth of the port members of the trunk group.
Trunks are commonly deployed in the following applications:
• Cascade interfaces between DSA (Distributed Switching Architecture) enabled devices (Section 4. "Distributed Switching Architecture" on page 44).
• Uplink to a backbone switch/router.
• Stack interfaces between stacking units.
• High bandwidth switch-to-switch and switch-to-server inter-connection.
In addition to increased bandwidth, trunks offer resiliency in the event of a link failure.
When a port member of a trunk group has a link failure, the trunk bandwidth may be reduced, but traffic continues to pass through the remaining port members of the trunk.
The device supports the following trunking features:
• Up to 127 trunk groups per system. (The SecureSmart 98DX163, 98DX243, and 98DX262 devices support 32 trunk groups and the 98DX106/98DX107 support eight trunk groups. The SecureSmart Stackable devices support 32 trunk groups.)
• Up to eight trunk port members per trunk group.
• In a cascaded system, a trunk group may have port members distributed across multiple devices.
• Every forwarding mechanism can forward traffic to a trunk group.
• A hash function based on Layer 2/3/4 fields is used to load-balance Unicast and Multicast/flooded traffic on a trunk interface.
• Trunks support all port-based features and configurations.
• Trunks can be used to cascade devices.
• Trunking support is compliant with the IEEE 802.3ad Link Aggregation standard.

13.1 Port Trunk-ID Assignment
A trunk group is identified by a 7-bit trunk-ID.
A port that is a trunk group member is assigned a trunk-ID ranging from 1 to 127. (For the SecureSmart devices (98DX163, 98DX243, and 98DX262) the Trunk-ID range is from 1 to 32 and for the 98DX106/98DX107 from 1 to 8.) If the port is not a trunk group member, its trunk-ID assignment must be 0.
Packets received on a network trunk port (i.e., not a cascade port) are associated with the port’s trunk-ID assignment, rather than the physical port and device number. In this case, the bridge learning process is based on the packet’s source trunk-ID assignment (Section 11.4.7 "FDB Source MAC Learning" on page 231). The packet’s source trunk-ID is recorded in the FORWARD DSA tag when transmitted across a cascade port.
A cascade trunk consumes one trunk-ID, like any other trunk.
However, cascade trunks have a unique property: all their members reside on the local device. This allows the application to reuse cascade trunk-IDs, leaving more trunk-IDs for general use on network ports in the system.

Figure 46 shows two sample system configurations and the optimal trunk-ID allocation:
• Two devices connected by two HyperG.Stack ports: this system can be configured to consume a single trunk-ID.
• Multiple devices organized in a ring: this system can be configured to consume two trunk-IDs.

Figure 46: Minimum Trunk-ID Allocation for Cascade Trunks
[Figure not reproduced. Labels: devices A and B, 8 x GbE network ports, 2 x HyperG.Stack cascade links, Trunk #126, Trunk #127.]

Configuration
To configure the port trunk-ID, set the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567) accordingly.
• A value of zero means that the port is not a member of any trunk group.
• A value from 1 to 127 indicates that the port is a member of the trunk with that trunk-ID. (For the SecureSmart devices the value is from 1 to 15.)
For internal design reasons, this configuration must be replicated in the Port<4n,4n+1,4n+2,4n+3> TrunkNum Configuration Register<n> (0<=n<7) (Table 530 p. 771).

Note
All port members of a trunk-ID must have the same port configuration. Configuration categories include:
- Port MAC settings such as speed and 802.3x Flow Control.
- Operation mode (network or cascade).
- QoS settings (port-based and protocol-based).
- VID assignment settings (port-based and protocol-based).
- Policy settings.
- Bridge settings such as VLAN table, Spanning Tree table, ingress/egress filtering configuration.
- Ingress/egress packet mirroring/sampling settings.
- Egress enqueuing policy.
- Egress shaping and scheduling policy.

13.2 Forwarding to a Single Trunk Destination
This section describes how single-destination packets are forwarded to a trunk port.

A packet can be forwarded to a trunk-ID destination by any forwarding engine: Policy action, Bridge PVE, or Bridge FDB, or, in the Multilayer stackable switches, the Unicast Routing Engine.

The forwarding decision for single-destination traffic consists of the following steps:
1. The packet is assigned a forwarding destination trunk-ID.
2. The trunk-ID members are obtained from the Trunk Members table (Section 13.2.1 "Trunk Members Table").
3. A trunk member is selected using the hash function (Section 13.2.2 "Trunk Member Selection Hash").
4. The packet destination is set to the selected trunk member {device number, port number}.
5. If the selected member's device is not the local device, the packet is forwarded according to the Device Map table (Section 13.4.1 "Forwarding Single Destination Traffic over a Cascade Trunk").

13.2.1 Trunk Members Table
The Trunk Members table lists all the members of each trunk in the system. Every device has a copy of this table.

The Trunk Members table has 127 rows, one for each trunk-ID. (The SecureSmart devices support 16 Trunk groups, 1 through 15.) Every row can hold up to eight trunk members. A trunk member is defined by the pair {device number, port number}. (For the SecureSmart 98DX163, 98DX243, and 98DX262 devices and the SecureSmart Stackable devices, Trunk-ID range is from 1 to 32 and for the 98DX106/98DX107 from 1 to 8.)

In a multi-device system, the Trunk Members tables of all devices must be synchronized.

Configuration
• To configure the number of members in a trunk, set the Number of Trunk Members Table Entry<n> (0<=n<16) (Table 379 p. 652) accordingly.
• To configure trunk member 'i' for trunk 'n', set the Trunk Table Trunk<n> Member<i> Entry (1<=n<128, 0<=i<8) (Table 431 p. 693) accordingly. The table is accessed using the access control registers defined in Trunk Table Access Control Register (Table 432 p. 694).

13.2.2 Trunk Member Selection Hash
To load-balance traffic destined to the trunk group, a hash index is calculated to select the trunk member through which to forward the packet. The hash index is calculated so as to load-balance the traffic evenly across all trunk members, while ensuring that packets of any given flow are not reordered.

Trunk member selection is performed as follows:
1. A 6-bit hash index is generated based on the hash function selection (Section 13.2.2.1 "Hash Functions").
2. The number of trunk members is extracted from the Number of Trunk Members Table Entry<n> (0<=n<16) (Table 379 p. 652).
3. The selected trunk member index is calculated using the modulo operator (written "%" in C) as follows:
   selected_trunk_member = hash_index % number_of_trunk_members_in_trunk_ID
4. The result is a 3-bit index (a trunk has at most eight members) that selects the trunk member from the list of members in the group.

13.2.2.1 Hash Functions
The device provides the following types of hash functions:
• Hash based on the ingress interface, which can be either a port or a trunk.
• Hash based on packet header information. This hash offers different generators for non-IP, IPv4, and IPv6 packets. In addition, the application can configure which layer information is included in hash generation.

Figure 47 illustrates the hash generation procedure.
[Flowchart not reproduced; its branch layout is only partly recoverable. Recoverable content follows.]

Hash mode based on the global ingress interface:
    if (global ingress interface is port)
        hash[5:0] = port_num[5:0];
    else
        hash[5:0] = trunk_ID[5:0];

Hash mode based on the packet header:
    Decisions: packet type (non-IP or IP) and, for IP packets, whether IP hashing is enabled.
    Assignments shown for the branches that do not use the IP hash:
        hash[5:0] = 0;
        hash[5:0] = MAC_SA[5:0]^MAC_DA[5:0];
    IP hashing is enabled, packet type IPv4:
        hash[5:0] = SIP[5:0]^SIP[21:16]^DIP[5:0]^DIP[21:16];
    IP hashing is enabled, packet type IPv6:
        switch (IPv6 hash mode)
            case 0: hash[5:0] = SIP[5:0]^SIP[21:16]^DIP[5:0]^DIP[21:16]^flow[5:0]; break;
            case 1: hash[5:0] = SIP[69:64]^SIP[125:120]^DIP[69:64]^DIP[125:120]^flow[13:8]; break;
            case 2: hash[5:0] = case_0^case_1; break;
            case 3: hash[5:0] = SIP[5:0]^SIP[21:16]^DIP[5:0]^DIP[21:16]; break;
    If Layer 4 hashing is enabled && packet is TCP/UDP && Layer 4 valid:
        hash[5:0] = hash^l4_SrcPort[5:0]^l4_TrgPort[5:0];
        if (layer-4 long hash is enabled)
            hash[5:0] = hash^l4_SrcPort[13:8]^l4_TrgPort[13:8];
    If "Add MAC information" is enabled:
        hash[5:0] = hash^MAC_SA[5:0]^MAC_DA[5:0];

In all cases:
    index[2:0] = hash % num_of_trunk_members;
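The IPv4 branch of the procedure and the final modulo step can be sketched in C as follows. This is a minimal illustration, not the device implementation: the type and function names (ipv4_pkt_t, hash_cfg_t, trunk_hash_ipv4, trunk_member_index) are hypothetical, bit 0 is assumed to be the least significant bit of each field held as a host integer, and the zero-member guard is an added assumption. Only the case of an IPv4 packet with IP hashing enabled is covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 6-bit slice v[lsb+5:lsb]. */
static inline uint8_t bits6(uint64_t v, unsigned lsb)
{
    return (uint8_t)((v >> lsb) & 0x3F);
}

/* Hypothetical packet view: only the fields used by the IPv4 hash. */
typedef struct {
    uint32_t sip, dip;         /* IPv4 source/destination address      */
    uint16_t l4_src, l4_dst;   /* TCP/UDP source/destination port      */
    uint64_t mac_sa, mac_da;   /* 48-bit MAC addresses in the low bits */
    bool     l4_valid;         /* packet is TCP/UDP and Layer 4 valid  */
} ipv4_pkt_t;

/* Hypothetical mirror of the <EnL4Hash>, <L4LongTrunk Hash>, <AddMACHash> bits. */
typedef struct {
    bool en_l4_hash;
    bool l4_long_hash;
    bool add_mac_hash;
} hash_cfg_t;

/* 6-bit hash for an IPv4 packet, assuming IP hashing is enabled. */
uint8_t trunk_hash_ipv4(const ipv4_pkt_t *p, const hash_cfg_t *c)
{
    uint8_t h = bits6(p->sip, 0) ^ bits6(p->sip, 16) ^
                bits6(p->dip, 0) ^ bits6(p->dip, 16);

    if (c->en_l4_hash && p->l4_valid) {
        h ^= bits6(p->l4_src, 0) ^ bits6(p->l4_dst, 0);
        if (c->l4_long_hash)
            h ^= bits6(p->l4_src, 8) ^ bits6(p->l4_dst, 8);
    }
    if (c->add_mac_hash)
        h ^= bits6(p->mac_sa, 0) ^ bits6(p->mac_da, 0);

    return h & 0x3F;
}

/* index[2:0] = hash % num_of_trunk_members (guard for 0 is an assumption). */
uint8_t trunk_member_index(uint8_t hash, uint8_t num_members)
{
    return (num_members == 0) ? 0 : (uint8_t)(hash % num_members);
}
```

Because every input is a per-flow constant, all packets of a flow map to the same member, which is what keeps a flow in order.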
Figure 47: Hash Index Generation Procedure

Configuration
In the Policy Global Configuration Register (Table 333 p. 586):
• To configure whether the hash mode is packet-based or ingress-interface-based, set the <TrunkHash Mode> bit.
• The following configuration is applicable only in packet-based hashing, for IP packets only (the hash function for non-IP packets is not configurable):
  – To enable inclusion of Layers 3–4 information in hash generation, set the <EnIPHash> bit.
  – To enable inclusion of TCP/UDP ports in hash generation, set the <EnL4Hash> bit.
  – To use more bits from TCP/UDP ports in hash generation, set the <L4LongTrunk Hash> bit.
  – To select the hash generation function for IPv6 packets, set the <IPv6TrunkHash Mode> field accordingly.
  – To enable inclusion of Layer 2 information in hash generation, set the <AddMACHash> bit.

13.3 Forwarding of Multi-Destination Packets
This section describes how multi-destination packets are forwarded by the egress pipeline when the multi-target distribution list contains ports that are trunk members.

A multi-destination packet is forwarded according to its Multicast group index (VIDX) and VLAN assignment (Section 11.4.1 "FDB Unicast and Multicast Entries" on page 221). If a trunk group is a member of a VLAN/Multicast group, the VLAN/Multicast group membership must include all the trunk group port members.

The multi-target distribution port list is subject to two levels of filtering:
• Source interface filtering
• Filtering the non-designated trunk group ports

13.3.1 Source Interface Filtering
Source interface filtering removes the source interface, either a port or a trunk group (that is, the port members of the trunk group), from the multi-target distribution port list.

In a cascaded system, if the packet is received on a cascade port, there are two independent source interfaces:
• Local device source interface
• Ingress device source interface (i.e., original source interface)
In a single-device system, the local device source interface and the ingress device source interface are the same.

As described in Section 11.13 "Bridge Local Switching" on page 256, the local device source port (and, in a cascaded system, the ingress device source port) is filtered from the distribution port list.

If either the local device source interface or the ingress device source interface is a trunk, all ports that are members of either of these trunk groups must be filtered from the distribution port list.

The Non-Trunk Members table is used to filter the source trunk group(s). The table holds 128 entries, one entry per trunk-ID in the system, where entry #0 corresponds to trunk-ID zero, the null trunk group of which no port is a member. (For the SecureSmart 98DX163, 98DX243, and 98DX262 devices and the SecureSmart Stackable devices, Trunk-ID range is from 1 to 32 and for the 98DX106/98DX107 from 1 to 8.)

Each entry contains a port mask list, one bit per local port. For each trunk-ID, ports that are members of the respective trunk-ID are set to 0, and the remaining ports, which are not members of the trunk-ID, are set to 1. Each device has a different but coordinated Non-Trunk Members table indicating, per trunk group, which local ports are not members of that trunk.

The source trunk ports are filtered from the distribution port list by ANDing it with the Non-Trunk Members table entry for the source interface trunk-ID.

Configuration
To configure the Non-Trunk Members table, set the Trunk<n> Non-Trunk Members Table (0<=n<128) (Table 469 p. 727) accordingly.

13.3.2 Selecting the Designated Trunk Group Port
The designated trunk member is the trunk member from which a multi-destination packet egresses. To load-balance the multi-target traffic among the trunk group members, a hash function is used to select the designated trunk member. The hash function is flow-sensitive, ensuring that packets within a flow are hashed to the same designated trunk member.

The Designated Trunk Members table is used to select a single designated trunk port member through which to forward the multi-target packet, and to filter the non-designated trunk members from the distribution list. This table has eight entries. Each entry contains a port mask, one bit per port on the local device.

The port bit is set to 1 if EITHER of the following conditions is true:
• The port is not a member of any trunk group (so it can be considered its own "designated member").
• The port is a member of a trunk group AND it is the designated member for this entry.
The port bit is set to 0 if the port is a member of a trunk group AND is NOT the designated member for this entry.

The non-designated trunk ports are filtered from the distribution port list by ANDing it with the Designated Trunk Members table entry indexed by the hash function.

In a multi-device system, the Designated Trunk Members table should be calculated by the software as a system-wide logical table, where each entry contains a port list for all the ports in the system. The system software then derives the per-device Designated Trunk Members table configuration from the system-wide logical table.

In each logical entry, a trunk group must have ONE and ONLY ONE port set as the designated member of the entry. To load-balance the multi-target traffic across all the port members of a trunk group, the designated trunk port should be evenly distributed among the eight logical entries in the Designated Trunk Members table.

Figure 48 is a sample logical configuration for a trunk group with four members. Although the logical table contains a port list of all the ports in the system, for clarity only the trunk ports are represented, and the empty cells represent 0 bits. The trunk port members may physically reside on different devices. Only one designated member of the trunk group is selected in each entry, and the designated port is equally distributed among all the trunk members (i.e., each of the four trunk members has a one-in-four probability of being selected as the designated trunk port).
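The derivation of a per-device table from the system-wide trunk membership can be sketched as follows. This is an illustrative sketch only: the names (trunk_member_t, trunk_t, build_designated_table) are hypothetical, ports are assumed to fit in a 32-bit mask, and the rule that assigns member (e * count) / 8 to entry e is one possible even distribution, chosen here because it gives each member of a four-member trunk two consecutive entries.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_DESIGNATED_ENTRIES 8u

typedef struct { uint8_t dev; uint8_t port; } trunk_member_t;  /* {device, port} */
typedef struct { const trunk_member_t *members; size_t count; } trunk_t;

/*
 * Fill the Designated Trunk Members table of device `local_dev`.
 * `local_ports` is a mask of the ports that exist on the device (assumption).
 * Each trunk is expected to have 1 to 8 members.
 */
void build_designated_table(const trunk_t *trunks, size_t num_trunks,
                            uint8_t local_dev, uint32_t local_ports,
                            uint32_t table[NUM_DESIGNATED_ENTRIES])
{
    /* Collect the local ports that belong to any trunk. */
    uint32_t local_trunk_ports = 0;
    for (size_t t = 0; t < num_trunks; t++)
        for (size_t m = 0; m < trunks[t].count; m++)
            if (trunks[t].members[m].dev == local_dev)
                local_trunk_ports |= UINT32_C(1) << trunks[t].members[m].port;

    for (size_t e = 0; e < NUM_DESIGNATED_ENTRIES; e++) {
        /* Non-trunk ports are their own designated member in every entry. */
        uint32_t mask = local_ports & ~local_trunk_ports;

        for (size_t t = 0; t < num_trunks; t++) {
            if (trunks[t].count == 0)
                continue;
            /* Exactly one designated member per trunk per logical entry. */
            const trunk_member_t *d =
                &trunks[t].members[(e * trunks[t].count) / NUM_DESIGNATED_ENTRIES];
            if (d->dev == local_dev)   /* set the bit only if it is a local port */
                mask |= UINT32_C(1) << d->port;
        }
        table[e] = mask;
    }
}
```

Running the same function on every device with the same trunk list yields coordinated tables: for a trunk that spans devices, each entry has the designated bit set on exactly one device.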
Figure 48: Sample Logical Configuration of the Designated Trunk Port Table

            Trunk member #1   Trunk member #2   Trunk member #3   Trunk member #4
Entry #0          1
Entry #1          1
Entry #2                            1
Entry #3                            1
Entry #4                                              1
Entry #5                                              1
Entry #6                                                                1
Entry #7                                                                1

Figure 49 is an example configuration of the Designated Trunk Members table for the following system:
• Two devices, A and B, are cascaded by two HyperG.Stack ports, numbered #24 and #25.
• The cascading ports are members of trunk #127, i.e., in each device the cascade ports are members of trunk #127.
• Ports #1 in each device are members of trunk #1.
• Ports #2 and #3 in device A are members of trunk #2.
• Port #3 in device B is a single member of trunk #3.

The configuration is explained as follows:
• There are two trunks #127 in the system. Each trunk is local to one device and the configurations of these individual trunks are not related. The configuration of trunks #127 follows the guidelines in Figure 48.
• Trunk #1 is distributed between devices, therefore the configuration of the designated member is also distributed.
• Trunk #2 is local to device A. Its configuration follows the guidelines in Figure 48.
• Trunk #3 has a single member, therefore this member is always the designated one.
• Ports that are not trunk members, such as port #0 in both devices, must be set in every entry.

Figure 49: Sample Configuration of Designated Trunk Members Table
[Figure not reproduced. It contains a topology diagram of devices A and B (cascade ports #24 and #25 in trunk #127; trunks #1, #2, and #3 on the network ports) and the Designated Trunk Members tables of Device A and Device B, each with columns Port #0 to Port #4, Port #24, and Port #25 and rows Entry #0 to Entry #7.]

The entry selected by the hash index (Section 13.3.2.1 "Indexing the Designated Trunk Members Table") is ANDed with the multi-target distribution port list. As a result, for each trunk group in the distribution port list, the packet is forwarded only to the designated trunk member.

Figure 49 shows the configuration of the forwarding path for a packet ingressing {device A, port #1}, to be flooded to a VLAN whose members are {device A, port #0}, trunk #1, trunk #2, and trunk #3. Figure 50 shows the configuration of the respective distribution port list and the Non-Trunk Members table.

Processing in Device A
1. The ingress interface is filtered from the distribution port list, using the information in the Non-Trunk Members table.
2. Entry #3 is picked from the Designated Trunk Members table. Device B is configured to generate the same index.
3. Entry #3 filters the distribution port list.
4. The packet is forwarded to ports #0, #2, and #24.

Processing in Device B
1. The ingress and local device source interfaces are filtered from the distribution port list, using the information in the Non-Trunk Members table.
2. Entry #3 is picked from the Designated Trunk Members table. Note that both devices generated the same index.
3. Entry #3 filters the distribution port list and the packet is forwarded to port #3.
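The two filtering levels used in this walkthrough reduce to a chain of AND operations, sketched below. The function name and the PORT macro are hypothetical. When a source interface is not a trunk, the entry for the null trunk-ID (all ones) can be passed. Source-port filtering by Bridge Local Switching (Section 11.13) is not modeled.

```c
#include <stdint.h>

#define PORT(n) (UINT32_C(1) << (n))   /* one bit per local port (assumption: < 32 ports) */

/*
 * Apply source-trunk filtering and designated-member filtering to a
 * multi-target distribution port list.
 *   non_trunk_ingress_src: Non-Trunk Members entry of the ingress device source trunk-ID
 *   non_trunk_local_src:   Non-Trunk Members entry of the local device source trunk-ID
 *   designated_entry:      Designated Trunk Members entry selected by the index
 */
uint32_t filter_multi_dest(uint32_t distribution,
                           uint32_t non_trunk_ingress_src,
                           uint32_t non_trunk_local_src,
                           uint32_t designated_entry)
{
    uint32_t ports = distribution;
    ports &= non_trunk_ingress_src;  /* drop members of the original source trunk     */
    ports &= non_trunk_local_src;    /* drop members of the local source trunk        */
    ports &= designated_entry;       /* keep one designated member per trunk          */
    return ports;
}
```

With device A's distribution list (ports #0 to #3, #24, #25), the trunk #1 Non-Trunk Members entry, and an entry #3 in which ports #0, #1, #2, #4, and #24 are set (entry contents assumed here to match the walkthrough), the result is ports #0, #2, and #24.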
Figure 50: Example Distribution Port List and Non-Trunk Members Table Configuration

Device A
                            Port #0  Port #1  Port #2  Port #3  Port #4  Port #24  Port #25
Distribution port list         1        1        1        1        0        1         1
Non-Trunk Members Table
  Trunk #1                     1        0        1        1        1        1         1

Device B
                            Port #0  Port #1  Port #2  Port #3  Port #4  Port #24  Port #25
Distribution port list         0        1        0        1        0        1         1
Non-Trunk Members Table
  Trunk #1                     1        0        1        1        1        1         1
  Trunk #127                   1        1        1        1        1        0         0

Configuration
To configure the Designated Trunk Members table, set the <Port<i>Is Designated> field of the Designated Trunk Port Entry<n> Table (0<=n<8) (Table 470 p. 727) accordingly.

13.3.2.1 Indexing the Designated Trunk Members Table
There are two ways of deriving the index to the Designated Trunk Members table:
• The least significant 3 bits of the hash value calculated by the selected hash function (Section 13.2.2.1 "Hash Functions" on page 283).
• A new index derived by XORing the hash index with the packet VLAN-ID. The VLAN-ID is assigned according to the VLAN-ID assignment procedure (Section 11.2.2 "VLAN Assignment Mechanisms" on page 206). The following formula is used to calculate the index:
    index[2:0] = VID[2:0]^VID[5:3]^VID[8:6]^VID[11:9]^hash_index[2:0]

To ensure synchronized selection of the designated trunk member, all devices in a system must be configured to generate an identical index.

Configuration
To configure the indexing mode to the Designated Trunk Members table, set the <McTrunk HashVlanEn> bit in the Transmit Queue Extended Control Register (Table 459 p. 717).

13.4 Trunking over Cascade Link
A trunk can be used to cascade two devices. Such a trunk is referred to as a "cascade trunk". This section explains how packets are forwarded over cascade trunks.

13.4.1 Forwarding Single Destination Traffic over a Cascade Trunk
The forwarding decision of a single-destination packet is defined by the target {device number, port number}.

If the target device is the local device, the packet is queued to the target port number.

If the target device is not the local device, the packet must be forwarded over a cascade interface. The Device Map table is used to map the target device to a cascade interface (Section 4.2 "Single-Target Destination in a Cascaded System" on page 45).

If the cascade interface is a trunk:
• The cascade trunk members are obtained from the Non-Trunk Members table. The table is indexed by the cascade trunk-ID, obtained from the Device Map table.
• The cascade trunk port list, together with the designated trunk port list taken from the Designated Trunk Members table entry selected by the hash index (Section 13.2.2.1 "Hash Functions"), selects the designated cascade trunk port.

Configuration
To configure the Device Map table, set Device<n> Map Table Entry (0<=n<32) (Table 471 p. 728) accordingly.
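The index formula of Section 13.3.2.1 and the cascade trunk port selection described above can be sketched as follows. The function names and the use_vlan parameter are hypothetical (the polarity of the <McTrunk HashVlanEn> bit is not assumed), and combining the two port lists with a bitwise AND is an interpretation of "together with", not a statement of the hardware logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* index[2:0] = VID[2:0]^VID[5:3]^VID[8:6]^VID[11:9]^hash_index[2:0] when
 * VLAN-based indexing is used, otherwise hash_index[2:0]. */
uint8_t designated_index(uint8_t hash_index, uint16_t vid, bool use_vlan)
{
    uint8_t idx = hash_index & 0x7;
    if (use_vlan) {
        idx ^= (vid & 0x7) ^ ((vid >> 3) & 0x7) ^
               ((vid >> 6) & 0x7) ^ ((vid >> 9) & 0x7);
    }
    return idx & 0x7;
}

/*
 * Select the egress port of a cascade trunk.
 *   non_trunk_entry:  Non-Trunk Members entry of the cascade trunk-ID
 *                     (members are 0, non-members are 1)
 *   local_ports:      mask of ports that exist on the device (assumption)
 *   designated_entry: Designated Trunk Members entry selected by the index
 */
uint32_t cascade_egress_port(uint32_t non_trunk_entry, uint32_t local_ports,
                             uint32_t designated_entry)
{
    uint32_t cascade_members = ~non_trunk_entry & local_ports;
    return cascade_members & designated_entry;  /* one bit if the table is consistent */
}
```

Folding all twelve VID bits into three means that flows in different VLANs with the same packet hash can still land on different entries.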
13.4.2 Forwarding Multi-Destination Traffic over a Cascade Trunk
Forwarding multi-destination traffic over cascade trunks is handled as defined in Section 13.3 "Forwarding of Multi-Destination Packets".

The index to the Designated Trunk Members table (Section 13.3.2.1) is designed to select each entry with equal probability and to preserve packet order within a flow. Packet order is preserved by selecting the same entry for all packets in a flow.

Section 14. Ingress Traffic Policing Engine
This section describes the device's ingress traffic policing engine.

Traffic Policing is the act of metering a flow of traffic according to a configured traffic profile, and performing the configured action on packets that are determined to be out-of-profile.

Policing consists of creating a policer that specifies the following information:
• Bandwidth limits for the traffic, specified in terms of:
  – Committed Information Rate (CIR): Average bandwidth in bits per second (bps).
  – Committed Burst Size (CBS): The approximate size of the traffic bursts allowed (in bytes).
• Action to be taken on packets exceeding the configured bandwidth constraints:
  – Transfer the packet unmodified.
  – Drop the packet.
  – Remark the packet QoS attributes.

Packet processing by a policer consists of the following stages:
1. Traffic metering
   The traffic meter compares a flow of traffic to the configured bandwidth limits and classifies packets as in-profile if the consumed bandwidth meets the limits, or as out-of-profile if the consumed bandwidth exceeds the limits. The classification results are passed to the following stages.
2. Updating policing counters
   Policing counters are used to track the behavior of flows by measuring the volume of in-profile and out-of-profile traffic.
3. Execution of an out-of-profile action
   Out-of-profile packets are processed according to the policer's configured action.

The traffic policing engine incorporates 256 ingress policers, each of which can be bound to a flow or a flow aggregate. (The SecureSmart and SecureSmart Stackable devices support four per-port on-chip wire-speed ingress traffic policers.)

Traffic metering utilizes the color-aware and color-blind versions of the Single-Rate Two-Color algorithm.

The traffic policing engine incorporates 16 policing counter sets. Each counter set consists of an in-profile traffic counter and an out-of-profile traffic counter. A counter set can be bound to one or more policers.

14.1 Traffic Policing Engine Overview

14.1.1 Policing Engine Location
The Traffic Policing engine is located in the ingress pipe, as shown in Figure 51 for the SecureSmart and Layer 2+ stackable switches and in Figure 52 for the Multilayer stackable switches.

Traffic Policing is done after initial QoS marking, therefore the policing engine is the last stage in which a packet's QoS attributes may be modified.
Figure 51: Ingress Pipe Block Diagram: SecureSmart, SecureSmart Stackable and Layer 2+ Stackable Switches
[Block diagram not reproduced. Blocks: Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, Policing Engine, Pre-Egress Engine.]

Figure 52: Ingress Pipe Block Diagram: Multilayer Stackable Switches
[Block diagram not reproduced. Blocks: Ports MAC Rx, Header Decode Engine, Policy Engine, Bridge Engine, Unicast Routing Engine, Policing Engine, Pre-Egress Engine.]

14.1.2 Policing Engine Organization
The policing engine consists of:
• Policer Configuration table
• Global Policer QoS table
• Traffic meters
• Policing counters

14.1.2.1 Policer Configuration Table
The Policer Configuration table holds the per-policer configuration. It has 256 entries, one per policer. An entry is selected by directly indexing the table, using the packet policer index set in the Policy engine.

Every entry has the following configuration parameters:
• Enable policer
• Meter configuration
• Enable policing counters
• Action applied to out-of-profile traffic
For further details see Section 14.3 "Policer Configuration" on page 298.

14.1.2.2 Global Policer QoS Table
The Global Policer QoS table holds the QoS configuration shared by all policers. It has 72 entries, one per QoS Profile. An entry is selected by directly indexing the table, using the packet QoS Profile index set by the QoS initial marking stage.

Every entry has the following configuration parameters:
• Drop Precedence
• Remarked QoS Profile index

Drop Precedence (DP)
The device uses the packet Drop Precedence (DP) as the initial color indication for color-aware metering. DP '0' indicates a color of Green (in-profile). DP '1' indicates a color of Red (out-of-profile).

Packet DP is obtained from the Global Policer QoS table, using the packet QoS Profile index. If color-aware metering is used, the application must set DP in the Global Policer QoS table to be identical to DP in the QoS Profile table (Section 8.1.4 "QoS Profile" on page 114).

Remarked QoS Profile Index
Out-of-profile packets may be subject to QoS Profile index remarking based on the packet's incoming QoS Profile index marking. This is known as "relative" remarking: the initially assigned QoS Profile index is mapped to a new index. The mapping process remarks the packet QoS index according to the "Remarked QoS Profile index" field in the Global Policer QoS table.

The application needs to configure the "Remarked QoS Profile index" field only if at least one out-of-profile action is set to relative mode.

Configuration
To configure a Global Policer QoS table entry 'n', set the Policers QoS Remarking and Initial DP Table Entry<n> Register (0<=n<72) (Table 423 p. 687) accordingly.

The access procedure is described in C.14.4.3 "Read and Write Access to the Policers QoS Remarking and Initial DP Table" on page 689.

14.1.2.3 Traffic Meters
The traffic meter performs the actual classification of packets as in-profile or out-of-profile. The metering algorithm is Single-Rate Two-Color, which is a simplified version of the Single-Rate Three-Color Meter described in RFC 2697. Whereas the RFC 2697 meter is capable of classifying packets into three categories (Green, Yellow, or Red), the implemented meter classifies packets into two categories: Green (in-profile) or Red (out-of-profile).

The implementation supports two color-awareness modes: Color-Aware and Color-Blind.

The following sub-section defines the meter classification algorithm.

Single Rate Two-Color Meter Algorithm
The behavior of the meter is specified in terms of:
– Color mode: Color-Aware or Color-Blind.
– A single token bucket, C, characterized by its token count Tc. The token count update rate is CIR and the maximum bucket size is CBS.

At time 0 the token bucket is full, i.e., Tc(0) = CBS. Tc is updated CIR times per second as follows:
IF Tc is less than CBS, Tc is incremented by one
ELSE Tc is not incremented.
M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 When a packet of size B bytes arrives at time t, the following occurs if the meter is configured to operate in colorblind mode: IF Tc(t)-B >= 0, the packet is Green (in-profile) and Tc is decremented by B down to the minimum value of 0. ELSE the packet is Red (out-of-profile) and Tc is not decremented. When a packet of size B bytes arrives at time t, the following occurs if the meter is configured to operate in coloraware mode: IF the packet has been pre-colored as Green (in-profile) and Tc(t)-B >= 0, the packet is Green and Tc is decremented by B down to the minimum value of 0. ELSE the packet is Red (out-of-profile) and Tc is not decremented. Note The definition of packet size B depends on the global configuration described in the following sub-section. Metered Packet Size The Policy engine provides the following modes to define packet size, B: Layer 1 metering Packet size includes the entire packet + IPG + preamble. Layer 2 metering Packet size includes the entire packet, including Layer 2 header and CRC. Layer 3 metering Packet size includes Layer 3 information only, excluding Layer 2 header and CRC. Configuration Meter configuration, color mode, CIR and CBS are described in Section 14.3 "Policer Configuration". To configure the metered packet size, set the <Policing Mode[1:0]> field in the Policers Global Configuration Register (Table 418 p. 681) accordingly. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 295 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 14.1.2.3 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Traffic Policing Engine Overview 14.1.2.4 Policing Counters Counters have the following properties: • Counted packet size is defined by the metered packet size in "Metered Packet Size" above. 
• Every counter is 32 bits wide. • Wrap-around to 0 on overflow. • The set counters can be atomically accessed for reading. • Counters must be initialized by writing them. Counters Implementation Considerations The minimal counters wrap-around interval is 0.6 seconds. The calculation is based on the following scenario: • 54 Gbps traffic is bound to a single policer. • All of the traffic is considered in-profile. • Metered packet size is Layer 1 metering. Under full-rate conditions, every read access to the counters causes one to two packets not to be counted. The not-counted packets may belong to any policer. However, the metering operation is not affected by CPU access rate to the counters. Configuration 14.1.3 Packet Walkthrough M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 The read/write access procedure to the policing counters is described in C.14.5.3 "Read and Write Access to the Policer Counters Table" on page 691. Policer packet-walkthrough is illustrated in Figure 53. • When a packet arrives to the policing engine, a decision is taken whether this packet is subject to policing. • Policer configuration is obtained. • If the meter configuration indicates that a Color-Aware policer is used, the packet color (i.e. drop precedence) is obtained from the global policer QoS table, using the packet QoS Profile index as an index. • Metering algorithm is applied. • Policing counters are updated, if policing counters are enabled in the policer configuration. • If the outcome of the metering algorithm indicates the packet is out-of-profile, it is processed by the policer out-of-profile action. MV-S102110-02 Rev. 
The policer engine incorporates 16 policing counter sets. Each set can be bound to one or more policers. Once bound to a policer, the counter set is used to count in-profile and out-of-profile bytes.

Figure 53: Policer Packet Walkthrough
[Flowchart: Packet arrives at policer -> Do policing? (N: packet exits policer; Y: continue) -> Get policer configuration -> Optionally, get DP -> Do metering -> Optionally, update policing counters -> Meter result (In-profile (Green): packet exits policer; Out-of-profile (Red): apply out-of-profile action, then packet exits policer)]

14.2 Triggering Traffic Policing

Note
The Policy engine can process standard Ethernet and DSA tag type = FORWARD packets (Section 10.4.1 "Packet Eligibility for Policy Processing" on page 183). Therefore, only these packets may be processed by the traffic policing engine.

14.3 Policer Configuration
This section describes per-policer configuration.

14.3.1 Enable Policer
A policer must be enabled to process packets.
When a policer is disabled, all packets assigned for processing by this policer bypass the policing engine.
A policer must be disabled prior to configuring the Policer Configuration table.
The meter Token Count (Tc) is reset to CBS when the policer is enabled.

Configuration
To enable a policer, set the <PolicerEn> bit in Policers Table Entry<n>(0<=n<256) (Table 419 p. 682).

14.3.2 Strict Rate Limiter and Loose Rate Limiter
To implement the token bucket algorithm, each of the 256 policers incorporates a Bucket Size Counter (BucketSizeCnt). This counter is incremented with tokens, according to the configured policer rate, up to a maximal value of the configured PolicerBurstSize. The Byte Count of each conforming packet is subtracted from the counter.

When a new packet arrives, its conformance is checked in one of the following modes, selected by a global configuration:
Strict Rate Limiter    If BucketSizeCnt > packet's Byte Count, the packet is conforming; otherwise, it is out of profile.
Loose Rate Limiter     If BucketSizeCnt > MRU, the packet is conforming; otherwise, it is out of profile.

When the Policer mode is Strict Rate Limiter, some discrimination against larger packets may take place.
When the Policer mode is Loose Rate Limiter, PolicerBurstSize must be configured to a value larger than MRU.

Configuration
• To configure the Policers mode, set the <PolicerMode> field in the Policers Global Configuration Register (Table 418 p. 681).
• To configure the MRU when the policer mode is set to loose rate limiter, set the <PolicerMRU> field in the Policers Global Configuration Register (Table 418 p. 681).
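The difference between the two conformance modes can be illustrated with a short sketch. The function name `conforms` and its parameters are hypothetical; only the two comparisons are taken from the mode definitions above.

```python
def conforms(bucket_size_cnt: int, byte_count: int, mru: int, loose: bool) -> bool:
    """Return True if the packet conforms under the selected limiter mode.

    Strict: compare the bucket against the packet's own byte count.
    Loose:  compare the bucket against the configured MRU, regardless of
            the actual packet size.
    """
    threshold = mru if loose else byte_count
    return bucket_size_cnt > threshold


# With 1000 tokens in the bucket, strict mode passes a small packet but
# rejects a large one, which is the discrimination against larger packets.
small_ok = conforms(1000, byte_count=64, mru=1518, loose=False)
large_ok = conforms(1000, byte_count=1500, mru=1518, loose=False)

# Loose mode treats both sizes alike: neither conforms until the bucket
# exceeds the MRU, which is why PolicerBurstSize must be larger than MRU.
loose_small = conforms(1000, byte_count=64, mru=1518, loose=True)
loose_large = conforms(1600, byte_count=1500, mru=1518, loose=True)
```

Here `small_ok` is true and `large_ok` is false, while in loose mode the decision depends only on whether the bucket exceeds the MRU.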
A packet is subject to traffic policer processing (Section 14.2) if the following conditions are met:
• The packet is bound to a policer by the Policy engine rule action.
• The packet command is FORWARD or MIRROR.
• The selected policer is enabled.

14.3.3 Meter Configuration

14.3.3.1 Configuration of Color-Awareness Mode
Every policer can be either color-aware or color-blind.
To configure color awareness mode, set the <Policer ColorMode> bit in the Policers Table Entry<n>(0<=n<256) (Table 419 p. 682).

14.3.3.2 Configuration of CIR and CBS
To set the CIR and CBS for a given policer, the policer time-base generator must first be selected. For flexibility, six time-base generators are supported.
Table 65 shows the configuration range of CIR and CBS as a function of the time-base generator.

Table 65: Configuration Range of CIR and CBS

Time Base   CIR (1)                          CBS (2)
            Min   Step       Max             Min   Step       Max
0           0     1 Kbps     1.023 Mbps      0     1 Byte     64 KB - 1 B
1           0     10 Kbps    10.23 Mbps      0     8 Byte     512 KB - 8 B
2           0     100 Kbps   102.3 Mbps      0     64 Byte    4 MB - 64 B
3           0     1 Mbps     1.023 Gbps      0     512 Byte   32 MB - 512 B
4           0     10 Mbps    10.23 Gbps      0     4 KB       256 MB - 4 KB
5           0     100 Mbps   102.3 Gbps      0     32 KB      2 GB - 32 KB

1. Rate units use the following conventions: K = 10^3, M = 10^6, G = 10^9.
2. Burst units use the following conventions: K = 2^10, M = 2^20, G = 2^30.
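As an illustration of how the step sizes in Table 65 translate into field values, the following sketch divides a requested CIR and CBS by the step of the chosen time base. The function and table names are hypothetical. Rounding down and the range checks (10-bit rate, 16-bit burst, which match the maxima in Table 65) are assumptions of this sketch rather than documented device behavior.

```python
# Step sizes per time base (index 0..5), taken from Table 65.
CIR_STEP_BPS = [1_000, 10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]
CBS_STEP_BYTES = [1, 8, 64, 512, 4 * 1024, 32 * 1024]


def policer_fields(time_base: int, cir_bps: int, cbs_bytes: int) -> tuple[int, int]:
    """Return (rate_field, burst_field) for the chosen time base.

    Assumption: values that are not a multiple of the step are rounded down.
    """
    rate = cir_bps // CIR_STEP_BPS[time_base]
    burst = cbs_bytes // CBS_STEP_BYTES[time_base]
    if not 0 <= rate <= 0x3FF:       # assumed 10-bit rate field
        raise ValueError("CIR out of range for this time base")
    if not 0 <= burst <= 0xFFFF:     # assumed 16-bit burst field
        raise ValueError("CBS out of range for this time base")
    return rate, burst


# 100 Mbps with a 64 KB burst on time base 3 (1 Mbps / 512-byte steps).
example = policer_fields(3, 100_000_000, 64 * 1024)
```

With time base 3, a CIR of 100 Mbps gives a rate value of 100 and a CBS of 64 KB gives a burst value of 128. The same CIR cannot be expressed on time base 0, whose maximum is 1.023 Mbps.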
• To select the time-base generator, set the <PolicerRate Type[2:0]> field in Policers Table Entry<n>(0<=n<256) (Table 419 p. 682).
• To configure the CIR, set the <PolicerRate[9:0]> field in the Policers Table Entry<n>(0<=n<256) (Table 419 p. 682). To calculate the value to be configured in <PolicerRate[9:0]>, apply the following formula:
      <PolicerRate> = CIR / CIR_STEP
  where CIR is the required rate and CIR_STEP is taken from Table 65.
• To configure the CBS, set the <Policer BurstSize[15:0]> field in the Policers Table Entry<n>(0<=n<256) (Table 419 p. 682). To calculate the value to be configured in <Policer BurstSize[15:0]>, apply the following formula:
      <Policer BurstSize> = CBS / CBS_STEP
  where CBS is the required burst size and CBS_STEP is taken from Table 65.

Note
CBS must be larger than or equal to the maximum packet size. The meter classifies all packets longer than CBS as out-of-profile.

In summary, per-meter configuration consists of setting the following parameters:
• Color-awareness mode
• Committed Information Rate (CIR)
• Committed Burst Size (CBS)

14.3.4 Enable Policing Counters
This command enables binding a policer to one of the 16 policing counter sets. See Section 14.1.2.4 "Policing Counters" on page 296 for further information about policing counters.

Configuration
To bind policer ‘n’ to a policing counter set, in the Policers Table Entry<n>(0<=n<256) (Table 419 p. 682):
• To enable binding, set the <PolicerCntEn> bit.
• To specify the index of the counters (0–15), set the <PolicerCntIndex> field accordingly.

14.3.5 Out-of-Profile Action
Out-of-profile action is applied to packets that were classified as Red (out-of-profile) by the traffic meter. The following action types can be applied:
• No Operation
• DROP
• QoS Remark

14.3.5.1 No Operation (NOP)
NOP performs no action on the packet.

14.3.5.2 DROP
The packet is SOFT or HARD dropped according to a global configuration flag.

Configuration
To configure the DROP type, set the <PolicerDrop Mode> bit in Policers Global Configuration Register (Table 418 p. 681).

14.3.5.3 QoS Remark
Out-of-profile packets may have each of the following QoS attributes either remarked or preserved:
• Modify User Priority flag.
• Modify DSCP flag.
In addition, there are two options to remark the QoS Profile index:
Absolute:    Packet QoS Profile index is set to the Policer Configuration table entry QoS Profile index (see Configuration section below).
Relative:    Packet QoS Profile index is mapped to a new value, using the Global Policer QoS table (see Section 14.1.2.2 "Global Policer QoS Table" on page 294).

Note
The packet QoS attribute QoS Precedence is ignored by the policing engine.

Configuration
The configuration of the policer ‘n’ out-of-profile command is embedded in the Policers Table Entry<n>(0<=n<256) (Table 419 p. 682).
• To configure the action type, set the <Policer Cmd> field accordingly.
• When any of the QoS remarking modes is enabled:
  – To modify the Enable Modify User Priority flag, set the <PolicerModify UP> field accordingly.
  – To modify the Enable Modify DSCP flag, set the <Policer ModifyDSCP> field accordingly.
  – To configure the QoS Profile index, when the absolute QoS remarking mode is enabled, set the <PolicerEntry QosProfile> field accordingly.

14.3.6 Implications of Accessing the Policer Configuration Table
Every configuration access to the Policer Configuration table temporarily blocks the ingress pipe. Consequently, under full-rate and short packet conditions, every configuration cycle may cause the pipe to drop up to five packets which ingress from any port.

Section 15. Bandwidth Management
This section describes the device’s bandwidth management features.

15.1 Buffers and Descriptors
Buffer and descriptor management are integral aspects of bandwidth management in the device, affecting both ingress and egress packet processing.
All buffers and descriptor memory are internal to the device.
Table 66 lists the number of 256-byte buffers for each device. Buffers are dynamically allocated to packets, according to packet length.
For diagnostic purposes, buffer memory can be read and written to, as described in C.17.3 "Buffers Memory" on page 774.

Table 66: Number of 256-Byte Buffers For Each Device

Device                                        Number of Buffers    Number of Egress
                                              of 256 Bytes         Queue Descriptors
98DX106, 98DX107                              512                  512
98DX163, 98DX243, 98DX169, 98DX249            1.5K                 1.5K
98DX130, 98DX133, 98DX166, 98DX167,           3K                   4K
98DX246, 98DX247, 98DX250, 98DX253,
98DX260, 98DX262, 98DX263, 98DX269,
98DX270, 98DX273, 98DX803

Table 66 also lists the number of egress queue descriptors for each device. A descriptor is allocated for each packet egress destination. A packet queued on N egress ports has N descriptors that point to the same buffer linked-list. For single destination packets, N = 1, and for multi-destination packets, N > 1.

The descriptor is released when the packet instance is transmitted or dropped for any reason. The associated packet buffers are released only after all of the packet's descriptors have been released.

15.2 Ingress Bandwidth Management
The device’s ingress bandwidth management features are explained in the following sections.

15.2.1 Buffer Limits and Flow Control
Buffer parameters are applied at three levels: global, port-group, and port-profile.
Global buffer parameters apply to all ports.
There are two port-groups: the Gigabit port-group and the HyperG.Stack port-group. The CPU port is considered a member of the Gigabit port-group. Each port is associated with its respective port-group buffer parameters.
Up to four port-profiles can be configured.
Each port is configured to be associated with one of these four port-profiles.

Configuration
To associate a port to a port-profile, set the Buffers Limits Profile Association Register0 (Table 545 p. 781) and the Buffers Limits Profile Association Register1 (Table 546 p. 783) accordingly.

15.2.1.1 Buffer Allocation Limits
Buffer allocation is controlled according to three limit configurations:
• Port-profile buffer limit
• Port-group buffer limit
• Global buffer limit
Some applications require only a global buffer limit, rather than the three-level control described above. In this case, there is a global Buffer Mode to disable the port-profile and port-group buffer limitations, while enabling only the global buffer limit.
When any of the buffer limits is reached (or just the global buffer limit is reached, depending on the Buffer Mode), the incoming packet is dropped.

Configuration
• To disable the port-group and port-profile buffer limits while enabling only the global buffer limit, set the <BuffersMode> bit in the Buffer Management Global Buffers Limits Configuration Register (Table 542 p. 778) accordingly.
• To configure the global buffer limit, set the <MaxBufLimit> field in the Buffer Management Global Buffers Limits Configuration Register (Table 542 p. 778) accordingly.
• To configure the Gigabit port-group buffer limit, set the <GPorts MaxBufLimit> field in the Buffer Management GigE Port Group Limits Configuration Register (Table 543 p. 779) accordingly.
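The three-level limit check and the effect of the Buffer Mode can be sketched as below. The function name, the dictionary keys, and the treatment of "reached" as count >= limit are assumptions of this sketch; the device tracks these counts in hardware.

```python
LEVELS = ("profile", "group", "global")  # hypothetical key names


def admit_packet(counts: dict, limits: dict, global_only: bool) -> bool:
    """Return True if an incoming packet may be allocated buffers.

    counts / limits hold the current buffer count and the configured limit
    for the port-profile, port-group, and global levels. When global_only
    is set (Buffer Mode), only the global limit is checked.
    """
    checked = ("global",) if global_only else LEVELS
    # The packet is dropped as soon as any checked limit has been reached.
    return all(counts[level] < limits[level] for level in checked)


limits = {"profile": 64, "group": 512, "global": 1024}
counts = {"profile": 64, "group": 100, "global": 300}

three_level = admit_packet(counts, limits, global_only=False)  # profile limit reached
global_mode = admit_packet(counts, limits, global_only=True)   # only global checked
```

In this example the port-profile limit has been reached, so the packet is dropped under three-level control but admitted when only the global limit is enabled.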
Descriptor “duplication” for a given packet (see Section 15.1) is performed in the following cases:
• Packet is mirrored by an ingress pipeline engine to the CPU.
• Packet is sampled to the CPU by the ingress port or egress port.
• Packet is mirrored to the Analyzer port by the ingress pipeline or egress port.
• Packet is flooded to multiple destinations (e.g., Broadcast, Multicast, or Unknown Unicast).

15.2.1.2 Flow Control
The goal of Flow Control is to prevent the buffer allocation limits from being reached, so that packets are not dropped due to lack of buffers.
Flow Control is based on buffer allocation upper and lower watermark thresholds.
In 802.3x Flow Control, when the buffer allocation count reaches the upper watermark threshold (the XOFF threshold), PAUSE frames are sent requesting the remote node to stop sending frames (i.e., maximum pause time). To prevent packet loss, the XOFF threshold must be set such that there are enough buffers available to absorb the packets that are “in progress” until the remote node ceases to transmit.
When the buffer allocation count goes back down to the lower watermark threshold (the XON threshold), a PAUSE frame with a zero pause time is sent, requesting the remote node to resume sending frames.
Half-duplex Flow Control is similar, but instead of sending PAUSE frames when the XOFF threshold is reached, the port uses a carrier-assertion-based scheme to stop the remote node from transmitting. The carrier assertion ceases when the XON threshold is reached.
Similar to the buffer allocation limits, there are three levels of XOFF/XON thresholds:
• Port-profile XOFF/XON
• Port-group XOFF/XON
• Global XOFF/XON
If any of the XOFF thresholds is reached for a port that is enabled for Flow Control, the port transmits PAUSE XOFF frames.
If all of the XON thresholds are reached for a port that is enabled for Flow Control, the port transmits PAUSE XON frames.
It is a configurable option to send periodic Flow Control messages, to ensure reception by the remote station (Section 9.4.8 "802.3x Flow Control" on page 148).

Configuration
• To configure the global XOFF/XON thresholds, set the <GlobalXoff> and <GlobalXon> fields in the Buffer Management Global Buffers Limits Configuration Register (Table 542 p. 778) accordingly.
• To configure the Gigabit port-group XOFF/XON thresholds, set the <GPortsXoff> and <GportsXon> fields in the Buffer Management GigE Port Group Limits Configuration Register (Table 543 p. 779) accordingly.
• To configure the HyperG.Stack port-group XOFF/XON thresholds, set the <H.GS and HX/QX PortsXoff> and <H.GS and HX/QX PortsXon> fields in the Buffer Management HyperG.Stack and HX/QX Ports Group Limits Register (Table 544 p. 780) accordingly.
• To configure the port-profile XOFF/XON thresholds, set the XOFF/XON fields in the Ports Buffers Limit Profile0 Configuration Register (Table 547 p. 785), Ports Buffers Limit Profile1 Configuration Register (Table 548 p. 786), Ports Buffers Limit Profile2 Configuration Register (Table 549 p. 786), and Ports Buffers Limit Profile3 Configuration Register (Table 550 p. 787) accordingly.
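The any-XOFF / all-XON rule produces hysteresis between the two watermarks, which the following sketch makes explicit. The function name, the dictionary keys, and the use of >= and <= for "reached" are assumptions of this illustration.

```python
LEVELS = ("profile", "group", "global")  # hypothetical key names


def next_pause_state(paused: bool, counts: dict, xoff: dict, xon: dict) -> bool:
    """Return True if the port should be in the XOFF (paused) state.

    XOFF is entered when ANY level reaches its XOFF threshold.
    XON is signalled only when ALL levels are at or below their XON threshold.
    Between the watermarks the previous state is kept.
    """
    if any(counts[level] >= xoff[level] for level in LEVELS):
        return True
    if all(counts[level] <= xon[level] for level in LEVELS):
        return False
    return paused


xoff = {"profile": 40, "group": 400, "global": 900}
xon = {"profile": 20, "group": 200, "global": 500}

s1 = next_pause_state(False, {"profile": 40, "group": 100, "global": 300}, xoff, xon)
s2 = next_pause_state(s1, {"profile": 30, "group": 100, "global": 300}, xoff, xon)
s3 = next_pause_state(s2, {"profile": 20, "group": 100, "global": 300}, xoff, xon)
```

The port enters XOFF when the port-profile count reaches 40 (`s1`), stays paused at 30 because the port-profile XON threshold has not yet been reached (`s2`), and resumes only once every level is at or below its XON threshold (`s3`).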
The buffer allocation limits of Section 15.2.1.1 are additionally configured as follows:
• To configure the HyperG.Stack port-group buffer limit, set the <H.GS and HX/QX Ports MaxBufLimit> field in the Buffer Management HyperG.Stack and HX/QX Ports Group Limits Register (Table 544 p. 780) accordingly.
• To configure the port-profile buffer limit, set the <Profile0 RxBufLimit> field in the Ports Buffers Limit Profile0 Configuration Register (Table 547 p. 785), the <Profile1 RxBufLimit> field in the Ports Buffers Limit Profile1 Configuration Register (Table 548 p. 786), the <Profile2 RxBufLimit> field in the Ports Buffers Limit Profile2 Configuration Register (Table 549 p. 786), and the <Profile3 RxBufLimit> field in the Ports Buffers Limit Profile3 Configuration Register (Table 550 p. 787) accordingly.

15.2.2 Ingress Rate Limiting
The device supports Ingress Rate Limiting as follows:
• Ingress Port Rate Limiting of known Unicast, Unknown Unicast, Multicast, and Broadcast packets. For details, see Section 11.10 "Ingress Port Packet Rate Limiting".
• Ingress policing. For details, see Section 14. "Ingress Traffic Policing Engine" on page 292.

15.3 Egress Bandwidth Management
The device supports the egress bandwidth management features discussed in the following sub-sections.

15.3.1 Enqueueing and Congestion Avoidance Tail-Dropping
To support traffic-class-based QoS, each of the MAC ports and the CPU port in the device supports eight egress traffic class queues.
The egress queues are implemented as linked-lists of packet descriptors, where each descriptor points to a linked-list of buffers containing the packet data.
For handling congestion on egress queues, the device supports two levels of drop precedence. When congestion occurs on an egress queue, packets with high drop-precedence can be configured to be dropped, while packets with low drop-precedence are enqueued.
A packet is assigned its traffic class and Drop Precedence by the ingress pipeline engines.
Egress queue congestion occurs when the queue enqueue rate is greater than the dequeue (i.e., transmit) rate.
Congestion can have various causes, such as port shaping (Section 15.3.3 "Port and Queue Traffic Shaping"), Flow Control received on the link due to congestion on the peer link (Section 15.2.1.2 "Flow Control"), or oversubscription of the egress line rate (e.g., traffic from a GbE port to a 100 Mbps port).

The number of descriptors and buffers enqueued on egress traffic class queues is limited according to configurable thresholds. There are several limit types restricting the number of buffers and descriptors that can be enqueued:
• Global buffer limit and descriptor limit
• Port profile buffer limit and descriptor limit
• Queue/drop precedence profile buffer limit and descriptor limit
• Multi-target packet descriptor limit (i.e., descriptors that share packet buffers with other descriptors)
• Ingress mirrored-to-analyzer packet descriptor limit (this applies to the “copy” of the packet that is enqueued on the analyzer port and not the original packet)
• Egress mirrored-to-analyzer packet descriptor limit (this applies to the “copy” of the packet that is enqueued on the analyzer port and not the original packet)
In the event that enqueueing a descriptor would cause any of the buffer or descriptor limits to be exceeded, the descriptor is released. This packet instance discard is known as “tail-dropping”. (Note that dropping of a mirrored copy of a packet due to the mirrored-to-analyzer packet descriptor limit does not affect the original instance of the packet.)

Up to four tail-drop profiles can be configured. Each port can be associated with one of these tail-drop profiles.
The tail-drop profile defines the buffer and descriptor limits for the port and each queue/drop precedence. (There are 16 queue/drop precedence limits that cover the 8 queues x 2 levels of drop precedence.)
A tail-drop profile can be configured to allow a limited degree of descriptor and buffer sharing. In the event that a port has reached its descriptor and/or buffer limit according to its profile assignment, and there are no other descriptors queued in the packet’s target traffic class queue, then the descriptor is permitted to be enqueued rather than dropped. This feature prevents complete starvation of non-congested queues on the port, even when other queues of the port are consuming the maximum resources allocated to the port.

The following is an example of tail dropping by queue/drop precedence. (Note that DP 0 is low drop precedence, and DP 1 is high drop precedence.)
1. Port 5 is configured to be associated with Drop-profile 2.
2. Drop-profile 2 is configured with a descriptor limit of 16 for {TC 3, DP 0}, and a descriptor limit of 10 for {TC 3, DP 1}.
3. Port 5 queue 3 currently has a count of 12 descriptors enqueued.
4. A new descriptor arrives for target port 5, {TC 3, DP 1}. This descriptor is dropped and not enqueued, as its limit of 10 has been exceeded.
5. A new descriptor arrives for target port 5, {TC 3, DP 0}. This descriptor is enqueued, as its limit of 16 has not been exceeded.

Configuration
• To configure the total number of descriptors that can be enqueued, set the <TotalDescLimit> field in the Transmit Queue Control Register (Table 458 p. 717) accordingly.
• To configure the total number of multi-target descriptors, set the <MultiDescLimit> field in the Transmit Queue Control Register (Table 458 p. 717) accordingly.
• To configure the total number of buffers enqueued, set the <TotalBuffers Limit> field in the Total Buffer Limit Configuration Register (Table 461 p. 720) accordingly.
• To configure the tail-drop profile association for a port, set the <TailDrop Profile> field in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722) accordingly.
• To configure the profile port descriptor limit, buffer limit, and sharing mode, set the corresponding fields in the Profile0 Port Tail Drop Limits Configuration Register (Table 478 p. 732).
• To configure the profile queue/drop precedence descriptor and buffer limits, set the corresponding fields in Profile2 Port Tail Drop Limits Configuration Register (Table 484 p. 735).
• To configure the descriptor limit for ingress and egress mirrored packets, set the corresponding fields in the Mirrored Packets to Analyzer Port Descriptors Limit Configuration Register (Table 473 p. 728).
• To disable tail-dropping based on the profile limits associated with the port, set the <TailDropDis> bit in the Transmit Queue Control Register (Table 458 p. 717).
• To disable enqueueing on a given port queue, set the <EnQueue TC[7:0]> field in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722) accordingly.
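The worked example for port 5 can be reproduced with a few lines. The names below are hypothetical and only the queue/drop precedence descriptor limit is modelled; the global, port, multi-target, and mirrored-packet limits are omitted. As in the example, the queue's total descriptor count is compared against the limit for the arriving descriptor's drop precedence.

```python
# Descriptor limits of Drop-profile 2 for traffic class 3, keyed by (TC, DP).
PROFILE_2_LIMITS = {(3, 0): 16, (3, 1): 10}


def enqueue_allowed(queued: int, tc: int, dp: int, limits: dict) -> bool:
    """Return True if one more descriptor may be enqueued on the queue.

    The descriptor is tail-dropped when enqueueing it would cause the
    {TC, DP} descriptor limit to be exceeded.
    """
    return queued + 1 <= limits[(tc, dp)]


queued_on_port5_q3 = 12
dp1_result = enqueue_allowed(queued_on_port5_q3, tc=3, dp=1, limits=PROFILE_2_LIMITS)
dp0_result = enqueue_allowed(queued_on_port5_q3, tc=3, dp=0, limits=PROFILE_2_LIMITS)
```

With 12 descriptors already queued, the DP 1 descriptor is dropped (limit 10) and the DP 0 descriptor is enqueued (limit 16), matching steps 4 and 5 of the example.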
Tail-dropping prevents the congested egress queues from consuming excessive descriptor and buffer resources, which would otherwise impact traffic bound to other non-congested queues on the device.

15.3.2 Transmit Queue Scheduling
When multiple traffic class queues have packet descriptors enqueued for transmission on the port, the scheduler is responsible for selecting one of the queues for dequeueing. Scheduling plays an important role in achieving Quality of Service goals such as minimum delay and minimum bandwidth guarantee for a given traffic class.

Each of the device’s ports, including the CPU port, supports Strict Priority (SP) queue scheduling, Shaped Deficit Weighted Round Robin (SDWRR) queue scheduling, and a combination of both SP and SDWRR queue scheduling.
SP scheduling supports minimal latency for real-time traffic flows and DiffServ Expedited Forwarding (EF) support.
SDWRR scheduling supports guaranteed minimal bandwidth as required for DiffServ Assured Forwarding (AF) support, while reducing the burstiness and jitter associated with non-shaped DWRR scheduling.
A combination of SP and SDWRR scheduling provides minimal latency on the SP queues, while the other SDWRR queues share the remaining link bandwidth, according to their relative weights.
Note
If queue and/or port shaping is enabled, the scheduled queue is serviced only if the packet conforms to the respective shaper attributes (Section 15.3.3 "Port and Queue Traffic Shaping").

15.3.2.1 Strict Priority Scheduling Group
Within the SP scheduling group, queues are scheduled according to the queue number, starting with the highest queue 7, with decreasing priority down through queue 0. Traffic in higher queue numbers is always scheduled prior to traffic in lower queue numbers.
SP queues may be non-contiguous. Note that if an SDWRR-scheduled queue has a higher queue number than an SP-scheduled queue, the higher SDWRR queue is scheduled for transmission prior to the lower SP queue.
See Section 15.3.2.4 "Scheduling Arbiter" on page 308 regarding the arbitration between SP and SDWRR scheduled queues.
See Section 15.3.2.3 "Scheduling Profiles" on page 308 regarding configuration of SP queues.

15.3.2.2 Shaped Deficit Weighted Round Robin Scheduling Group
The device supports two independent groups of SDWRR queues on a port.
Within each SDWRR group, the queues are serviced according to their configured “weight”. The weight can be configured to be byte-based or packet-based (byte-based weight provides a more accurate minimum bandwidth allocation).
The queue weight is an 8-bit value, thus allowing for bandwidth ratios between queues ranging from 1:1 up through 1:255.
For example, to evenly divide up the available bandwidth among the queues in the SDWRR group, set the weight of each of the queues in the group to 1.
If there are 4 queues in the group, and the desired bandwidth division is 10%, 20%, 30%, and 40%, then the weights of the queues are set to 1, 2, 3, and 4 respectively.

The device implements Shaped DWRR, which has significant advantages over DWRR. In DWRR, packets continue to be scheduled for transmission on a queue as long as the queue’s Deficit Counter is greater than the packet byte count. The next queue is serviced only when the queue Deficit Counter becomes smaller than the packet size to be transmitted. As a result, traffic on queues with large weights causes increased latency and jitter for traffic waiting to be scheduled on other queues.
In Shaped DWRR, if two or more queues have traffic eligible for transmission (i.e., the Deficit Counter is greater than the packet size to be transmitted), then a round-robin scheme among the queues is used, while still preserving the overall weight ratios between the queues.
For example, if queues 0 and 1 have packet descriptors queued for transmission, and their respective weights are 5 and 3, DWRR and SDWRR would transmit from the queues in the order shown in the following table.

Table 67: SDWRR vs. DWRR

Algorithm   Order of queues scheduled for transmission
DWRR        0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1
SDWRR       0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0

As can be seen, SDWRR produces less jitter and lower maximum latency for traffic on both queues.
See Section 15.3.2.3 "Scheduling Profiles" regarding configuration of SDWRR queues.
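The two orders in Table 67 can be reproduced with a simplified model. This is only a sketch under stated assumptions: all packets are the same size, both queues are always backlogged, each queue may send a number of packets equal to its weight per round, and in the shaped variant the round-robin pointer carries over from one round to the next. The function names are hypothetical, and the device's actual Deficit Counter arithmetic is not modelled.

```python
def dwrr_order(weights: list[int], rounds: int) -> list[int]:
    """Plain DWRR model: each queue drains its whole quota before the next."""
    order = []
    for _ in range(rounds):
        for queue, weight in enumerate(weights):
            order.extend([queue] * weight)
    return order


def sdwrr_order(weights: list[int], rounds: int) -> list[int]:
    """Shaped DWRR model: round-robin among queues that still have quota."""
    order = []
    nxt = 0                      # round-robin pointer, kept across rounds
    n = len(weights)
    for _ in range(rounds):
        credit = list(weights)   # per-round quota, in packets
        while any(credit):
            while credit[nxt] == 0:      # skip queues with no quota left
                nxt = (nxt + 1) % n
            order.append(nxt)
            credit[nxt] -= 1
            nxt = (nxt + 1) % n
    return order


dwrr = dwrr_order([5, 3], rounds=2)
sdwrr = sdwrr_order([5, 3], rounds=2)
```

With weights 5 and 3, both models send five packets from queue 0 and three from queue 1 in every round, so the bandwidth ratio is preserved. Only the shaped variant interleaves the queues, which is what reduces the longest wait seen by queue 1.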
15.3.2.3 Scheduling Profiles
Four scheduling profiles define the port queue scheduling attributes. Each port is configured to be associated with one of the scheduling profiles.

Each of the profiles contains the following queue scheduling attributes:
• Scheduling algorithm for each of the eight port queues (SP, SDWRR-0, SDWRR-1)
• SDWRR weight for each of the eight port queues (not relevant for SP queues)

Configuration
• To configure the SDWRR count mode to be packet or byte based, set the <TxSchedCount Mode> bit in the Transmit Queue Extended Control Register (Table 459 p. 717).
• The internal SDWRR algorithm requires configuration of the maximum packet size. Set the <TxSchedMTU> bit in the Transmit Queue Extended Control Register (Table 459 p. 717).
• To associate a port with a scheduling profile, set the <TxSched Profile> field in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722) accordingly.
• To configure the profile attributes, set the fields in each of the profile registers:
  – Profile0 SDWRR Weights Configuration Register0 (Table 493 p. 742)
  – Profile0 SDWRR Weights Configuration Register1 (Table 494 p. 742)
  – Profile0 WRR & Strict Priority Configuration Register (Table 495 p. 743)
  – Profile1 SDWRR Weights Configuration Register0 (Table 496 p. 743)
  – Profile1 SDWRR Weights Configuration Register1 (Table 497 p. 744)
  – Profile1 SDWRR & Strict Priority Configuration Register (Table 498 p. 745)
  – Profile2 SDWRR Weights Configuration Register0 (Table 499 p. 745)
  – Profile2 SDWRR Weights Configuration Register0 (Table 499 p. 745)
  – Profile2 SDWRR & Strict Priority Configuration Register (Table 501 p. 746)
  – Profile3 SDWRR Weights Configuration Register0 (Table 502 p. 747)
  – Profile3 SDWRR Weights Configuration Register1 (Table 503 p. 747)
  – Profile3 WRR & Strict Priority Configuration Register (Table 504 p. 748)
• If any of the profile parameters are updated, the following procedure must be followed:
  – Disable dequeueing on ports associated with the profile by clearing the <EnTxTC[7:0]> field in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722).
  – Update the profile attributes.
  – Set the <UpdateSched VarTrigger> bit in the Transmit Queue Control Register (Table 458 p. 717), which loads the new profile configuration.

15.3.2.4 Scheduling Arbiter
Each queue is associated with one of the scheduling groups (SP, SDWRR-0, or SDWRR-1), according to the port profile assignment (Section 15.3.2.3 "Scheduling Profiles" on page 308). Each scheduling group, according to its own scheduling algorithm and configuration, chooses one of its queues (or possibly none, if all the queues within the group are empty) as its candidate for packet transmission. A port scheduling arbiter selects among the candidates the queue with the highest number.

The following table is an example queue scheduling configuration.

[Figure: queues Q7, Q6, Q5, Q4, Q3, Q2, Q1, Q0 assigned to the Strict Priority Group, SDWRR Group 0, and SDWRR Group 1]

Queue 7 has the highest priority.
It is selected for transmission whenever there are packet descriptors queued. Queue 6 has the next highest priority. It is selected for transmission whenever Queue 7 is empty. Queues 1 and 2 are scheduled according to the configured weight assignments in SDWRR Group 1, on condition that Queues 3-7 are empty. Lastly, Queue 0, although it is an SP queue, is the lowest queue and is therefore selected for transmission only when all the other queues are empty.

Figure 54: Example Profile of Queue Scheduling Groups

Note
If two SDWRR groups are configured, to maintain the correct bandwidth division within each group, each group should consist of a contiguous set of queue numbers.

15.3.3 Port and Queue Traffic Shaping
Traffic shaping limits packet transmission to within a maximum rate and burst size. A scheduled packet that conforms to the shaper attributes is dequeued and transmitted through the link. A non-conforming packet remains in the queue until it conforms to the shaper attributes. This is called a non-work-conserving scheduler, as non-conforming packets are not transmitted even when the link is available for transmission.

The device implements egress rate shaping using a token bucket rate shaper per port and per queue. The token bucket shaper is independently enabled per port and per traffic class queue on a port. The token bucket shaper is byte-based on all ports except for the CPU port, which is configurable to be either byte-based or packet-based.

The token bucket configurable attributes are:

Token refill value
The number of tokens added to the token bucket counter every token bucket refill period. The token refill value is configured independently for the per-port token bucket and for the per-queue token buckets. The token refill value ranges from 1 token to 4K tokens. One token represents one packet byte. The token bucket counter cannot exceed the token bucket size.

Bucket size
The maximum amount of tokens that can be held by the token bucket. The token bucket size determines the maximum burst size permitted. The bucket size ranges from 4 KB to 16 MB, in steps of 4 KB. The token bucket size is configured independently for the per-port token bucket and for the per-queue token buckets.

Refill period
The time period (in core clock cycles) in which the token refill value is added to the token bucket.
• For 1000/100/10 Mbps ports, the refill period ranges from 1 to 15, in units of 1024 clock cycles.
• For HyperG.Stack ports, the refill period ranges from 1 to 15, in units of 64 clock cycles.

Token bucket maximum transmission unit (MTU)
The token bucket MTU defines the minimum number of tokens required to permit a packet to be transmitted (i.e., conforming). The token bucket shaper MTU is a global attribute that is configurable to 1.5 KB, 2 KB, or 10 KB.

When a packet is scheduled for transmission, if the token bucket counter is equal to or greater than the token bucket MTU, the packet byte-count is decremented from the token bucket counter and the packet is transmitted (the token bucket is decremented by one on the CPU port if it is configured for packet-based shaping). If the token bucket counter is less than the MTU, then the packet dequeue is delayed for some number of refill periods until the token bucket counter reaches the MTU.

If the CPU port is configured for packet-based shaping, all packets queued to the CPU port are treated as if they have a byte-count equal to the token bucket MTU, regardless of their actual packet length. Thus the packet rate is a function of the configured bit-rate divided by the MTU size.

For the 10/100/1000 Mbps ports, there is a special slow-rate mode to provide slower rates. The slow-rate mode increases the refill period by a factor of up to 16. For example, if the refill period is set to 2 (with a resolution of 1024 clock cycles), and the slow-rate mode is set to 16, then the token bucket is refilled every (2 * 16 * 1024) clock cycles.

The general formula for defining the shaping rate is as follows:

Rate in Mbps = <Core Clock Frequency in MHz> * <Token refill value in bits> / <Refill period in clock cycles>

For a HyperG.Stack port, using a core clock frequency of 200 MHz, the minimum shaped rate is:
• Token bucket attributes: refill period=15, refill value=1:
  rate = 200 MHz * 1 byte * 8 bits / (15 * 64 clock cycles) = 1.67 Mbps

For a 10/100/1000 Mbps port, using a core clock frequency of 200 MHz, the minimum shaped rate is:
• Token bucket attributes: refill period=15, refill value=1, and slow-rate mode disabled:
  rate = 200 MHz * 1 byte * 8 bits / (15 * 1024 clock cycles) = 104 Kbps
• Token bucket attributes: refill period=15, refill value=1, slow-rate=15:
  rate = 200 MHz * 1 byte * 8 bits / (15 * 15 * 1024 clock cycles) = 6.9 Kbps

In the above examples the token refill value is set to the minimum value 1, and the refill period is set to the maximum value 15. Larger rates can be achieved by increasing the token refill value and/or decreasing the refill period.
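The shaping-rate formula and the worked examples in this section can be checked numerically. The following is a minimal sketch; the function and parameter names are illustrative and are not part of any device API. It assumes that one token equals one byte (8 bits), and that the refill period in clock cycles is the configured period multiplied by its unit (64 or 1024 cycles) and by the slow-rate factor when that mode is used.

```python
def shaped_rate_mbps(core_clock_mhz, refill_tokens, refill_period,
                     period_unit_cycles, slow_rate_factor=1):
    """Rate in Mbps = clock (MHz) * refill value (bits) / refill period (clock cycles)."""
    refill_bits = refill_tokens * 8  # one token represents one packet byte
    period_cycles = refill_period * slow_rate_factor * period_unit_cycles
    return core_clock_mhz * refill_bits / period_cycles


# HyperG.Stack port: refill period counted in units of 64 clock cycles.
hgs_min = shaped_rate_mbps(200, 1, 15, 64)
# 10/100/1000 Mbps port: units of 1024 clock cycles, slow-rate mode off and on.
gig_min = shaped_rate_mbps(200, 1, 15, 1024)
gig_slow = shaped_rate_mbps(200, 1, 15, 1024, slow_rate_factor=15)

print(f"{hgs_min:.2f} Mbps, {gig_min * 1000:.0f} Kbps, {gig_slow * 1000:.1f} Kbps")
```

With a 200 MHz core clock these three calls give roughly 1.67 Mbps, 104 Kbps, and 6.9 Kbps, matching the minimum rates quoted in the text.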
Configuration
• To configure the global token bucket MTU, set the <Token BucketMTU> field in the Transmit Queue Extended Control Register (Table 459 p. 717) accordingly.
• To configure the refill period for the HyperG.Stack ports and the 10/100/1000 Mbps ports, set the <TokenBucket GigUpdRate> and <TokenBucket XGUpdRate> fields in the Token Bucket Update Rate and MC FIFO Configuration Register (Table 490 p. 738) accordingly.
• To set the slow-rate factor for 10/100/1000 Mbps ports, set the <TokenBucket SlowUpdRatio> field in the Token Bucket Update Rate and MC FIFO Configuration Register (Table 490 p. 738) accordingly.
• To configure the port token bucket attributes, set the <MaxBucketSize>, <Tokens>, <SlowRateEn>, and <TokenBucketEn> fields in the Port<n> Token Bucket Configuration Registers (0<=p<27, CPUPort = 63) (Table 491 p. 740) accordingly.
• To configure the per-port per-queue token bucket attributes, set the <MaxBucketSize>, <Tokens>, <SlowRateEn>, and <TokenBucketEn> fields in the Port<n> Token Bucket Configuration Registers (0<=p<27, CPUPort = 63) (Table 491 p. 740) accordingly.
• To configure the CPU port shaper to be packet-based or byte-based, set <CPUPortTB Mode> in the Transmit Queue Misc. Control Register (Table 460 p. 719) accordingly.

15.3.4 Watchdog Timer
Watchdog timers are associated with each one of the egress transmit ports. The purpose of the watchdog timer is to monitor non-transmission on a port that has non-empty queues.

If the timer expires, the port queue descriptors and the associated buffers are freed (if not used by any other descriptor). This can occur when a port receives a Flow Control XOFF packet and the XON does not arrive within the time-out period, or on misbehaving links (e.g., a half-duplex link that is jammed all the time). The timer can be disabled by management per port. The watchdog timer is programmable per port and can range from 1 ms to 4 seconds. Upon timer expiration, an interrupt is also generated.

Note
If the port link is lost, traffic continues to be dequeued and discarded from the port queues, so the watchdog is not triggered by this event.

Configuration
• Configure the watchdog attributes in the Port<n> Watchdog Configuration Register (0<=n<27, CPUPort = 63) (Table 505 p. 749).
• Read the watchdog interrupt cause from the Transmit Queue WatchDog Interrupt Cause Register (Table 591 p. 812).

Section 16. Traffic Monitoring

16.1 Traffic Sampling to the CPU
The device supports ingress and egress port sampling to the CPU (aka STC). This feature meets the port sampling requirements of sFlow per the informational RFC 3176. The STC feature is independent of the Spanning Tree port state, of mirroring packets to an analyzer port, and of other mechanisms that mirror or trap packets to the CPU.

This feature allows a packet to be sampled to the CPU after a configurable number of packets. The device supports independent configuration for ingress and egress port sampling to the CPU.

Ingress port sampling operates in one of two configurable modes:

Good Packet Mode
All packets received on the port with no MAC-level errors are considered for sampling.
Forwarded Packet Mode
All packets that are not assigned a Soft or Hard DROP command by the ingress pipeline engines (Section 5. "Packet Command Assignment and Resolution" on page 52) are considered for sampling.

Egress sampling is based on all packets successfully enqueued for transmission on an egress port.

The CPU configures a 30-bit packet sampling frequency, per ingress port and per egress port. If this value is configured to n, it is loaded into an internal packet counter, which is decremented for each potentially sampled packet. When the internal counter transitions from one to zero, the nth packet is marked for sampling to the CPU.

When the internal port packet counter reaches zero, two global modes determine its reloading behavior:

Continuous Reload Mode
The port ingress/egress STC mechanism reloads the sampling frequency value that was last configured by the CPU.

CPU-Controlled Reload Mode
The port ingress/egress STC mechanism only reloads a new sampling frequency value if it was updated by the CPU since the previous reload operation. If the CPU did not update the frequency value since the previous reload, sampling is halted until a new frequency value is written by the CPU.

When a new sampling frequency is loaded into the port internal counter, a CPU interrupt is generated, indicating that a new port sampling frequency can be configured, to be used on the next internal counter reload. Alternatively, the CPU can read a status register field to learn whether the last sampling frequency has been loaded into the internal counter.

If the CPU configures a sampling frequency of ZERO, packet sampling halts immediately for the given ingress or egress port, effectively disabling ingress (or egress) sampling to the CPU on the port.

A packet marked for sampling to the CPU is duplicated by the pre-egress engine.
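The counter and reload behavior described above can be illustrated with a small model. This is a sketch under stated assumptions, not the device logic: the class and method names are hypothetical, interrupts are not modeled, and it assumes that writing a non-zero frequency to a halted counter restarts sampling at once (the text only says sampling is halted until a new value is written).

```python
class StcCounter:
    """Toy model of one port's sampling-to-CPU packet counter."""

    def __init__(self, limit, continuous=True):
        self.continuous = continuous  # True: Continuous Reload, False: CPU-Controlled Reload
        self.limit = limit            # sampling frequency last written by the CPU
        self.updated = True           # CPU wrote a value since the previous reload
        self.count = 0
        self._reload()

    def _reload(self):
        if self.limit > 0 and (self.continuous or self.updated):
            self.count = self.limit
            self.updated = False      # point at which the "loaded" indication would be raised
        else:
            self.count = 0            # sampling halted

    def write_limit(self, limit):
        """CPU configures a new sampling frequency."""
        self.limit = limit
        self.updated = True
        if limit == 0 or self.count == 0:
            self._reload()            # zero halts at once; a halted counter restarts (assumed)

    def on_packet(self):
        """Return True if this candidate packet is marked for sampling."""
        if self.count == 0:
            return False
        self.count -= 1
        if self.count == 0:           # transition from one to zero: the nth packet
            self._reload()
            return True
        return False


continuous = StcCounter(3, continuous=True)
print([continuous.on_packet() for _ in range(6)])

cpu_controlled = StcCounter(2, continuous=False)
print([cpu_controlled.on_packet() for _ in range(4)])
```

In this model the continuous counter marks every third packet, while the CPU-controlled counter marks one packet and then stays halted until `write_limit` is called again.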
The duplicated packet is sent with a TO_CPU DSA tag (Section 7.2.2 "TO_CPU DSA Tag" on page 106), with its <CPU code> field set to INGRESS SAMPLED or EGRESS SAMPLED. According to the configuration of the CPU Code table (Section 7.2.1 "CPU Code Table" on page 103), the sampled packet is assigned QoS attributes, the destination device CPU (local CPU or CPU on a remote device), and whether to truncate the packet to 128 bytes.

The CPU learns the ingress or egress port from which the packet was sampled by examining the TO_CPU DSA tag <srcPort>/<trgPort> and <srcDev>/<trgDev> fields.

The device supports traffic sampling to the CPU and Traffic Monitoring mechanisms.

Note
A packet assigned a command of MIRROR or TRAP and also marked for STC is sent to the CPU twice: once with the STC-assigned CPU code and once with the MIRROR/TRAP-assigned CPU code.

Configuration

Ingress STC Configuration
• To globally enable ingress sampling to the CPU, set the <IngressSTCEn> bit in the Ingress STC Configuration Register (Table 450 p. 707).
• To configure the ingress internal counter reload mode (continuous or CPU-controlled), set the <Ingress STCReloadMode> bit in the Ingress STC Configuration Register (Table 450 p. 707) accordingly.
• To configure the ingress packet sampling mode (good packets or non-dropped packets), set the <IngressSTCCountMode> bit in the Ingress STC Configuration Register (Table 450 p. 707).
• To configure the ingress sampling frequency of a port, set the <Port<n>Ingress STCLimit> field in the Port<n> Ingress STC Table Entry (0<=n<27) (Table 445 p. 703) using the access procedure defined in the Ingress STC Table Access Control Register (Table 446 p. 704).
• To read the number of packets marked for ingress sampling, read the <Port<n>STC SampledPktCnt> field in the Port<n> Ingress STC Table Entry (0<=n<27) (Table 445 p. 703) using the access procedure defined in the Ingress STC Table Access Control Register (Table 446 p. 704).
• In CPU-controlled reload mode, the CPU should write a new ingress sampling frequency value when the <Port<n>IngressSampleLoadedInt> interrupt in the Ingress STC Interrupt Cause Register (Table 585 p. 810) is received. Alternatively, the <Port<n>IngressSTCNewLim ValRdy> bit in the Port<n> Ingress STC Table Entry (0<=n<27) (Table 445 p. 703) can be polled to determine if a new sampling frequency should be written.

Egress STC Configuration
• To globally enable egress sampling to the CPU, set the <EgressSTCEn> bit in the Statistical and CPU-Triggered Egress Mirroring to Analyzer Port Configuration Register (Table 474 p. 729).
• To configure the egress internal counter reload mode (continuous or CPU-controlled), set the <Egress STCReloadMode> bit in the Statistical and CPU-Triggered Egress Mirroring to Analyzer Port Configuration Register (Table 474 p. 729) accordingly.
• To configure the egress sampling frequency of a port, set the <Port<n>Egress STCLimit> field in the Port<n> Egress STC Table Entry Word0 (0<=n<27) (Table 475 p. 730) accordingly.
• If any of the device cascade ports are configured for egress STC, set the <Cascade EgressMonitorEn> bit in the Egress Monitoring Enable Configuration Register (Table 454 p. 709).
• For each non-cascade port configured for egress STC, set the corresponding bit in the <PortEgressMonitorEn> field of the Egress Monitoring Enable Configuration Register (Table 454 p. 709).
• To read the number of packets marked for egress sampling, read the <Port<n>STC SampledPktCnt> field in the Port<n> Egress STC Table Entry Word2 (0<=n<27) (Table 477 p. 731).
• In CPU-controlled reload mode, the CPU should write a new egress sampling frequency value when the <Port<n>EgressSampleLoadedInt> interrupt in the Egress STC Interrupt Cause Register (Table 595 p. 814) is received. Alternatively, the <Port<n>EgressSTCNewLim ValRdy> bit in the Port<n> Egress STC Table Entry Word0 (0<=n<27) (Table 475 p. 730) can be polled to determine if a new sampling frequency should be written.

A 16-bit sampled-packet counter is maintained per ingress port and per egress port. The CPU can read this counter to determine if packets marked for sampling were dropped due to congestion on a cascade port or the CPU port.

16.2 Traffic Mirroring to Analyzer Port
The device supports mirroring a packet (i.e., sending a copy of the packet) to a configurable analyzer port. The following subsections define the Mirroring to Analyzer feature in detail.

16.2.1 Mirroring Overview
A packet received on a network port can be marked for mirroring to the ingress analyzer port by an ingress pipeline mechanism and/or marked for mirroring to the egress analyzer port by the egress pipeline.

A packet received on a cascade port is marked for ingress or egress mirroring if it is received with a TO_ANALYZER DSA tag. The field <rx_mirror> indicates whether the packet is ingress mirrored or egress mirrored.
Packets marked for ingress or egress mirroring are duplicated by the pre-egress engine and marked with a DSA tag command TO_ANALYZER (Section 4.6.1.4 "TO_ANALYZER DSA Tag Command" on page 51).

The pre-egress engine supports independent configuration of the ingress and egress mirrored traffic analyzer destinations, and of the ingress and egress mirrored traffic QoS attributes (traffic class and drop precedence).

In a cascaded system where the destination analyzer port resides on a remote device, each device assigns the destination and QoS to the mirrored packet according to its own analyzer port configuration. (This is because the destination and QoS are not contained in the TO_ANALYZER DSA tag.)

The mirrored packet is sent according to the configured ingress or egress analyzer destination device or the configured local port configuration. (It is not a combined destination device/port pair.)

If the configured analyzer device is the local device, the packet is queued to a local port according to the configured analyzer port, which may be a network port, cascade port, or the CPU port 63.

If the analyzer port is the CPU port 63, the pre-egress engine duplicates the packet and sends it with a TO_CPU DSA tag with its CPU code set to INGRESS MIRRORED or EGRESS MIRRORED. According to the configuration of the CPU Code table (Section 7.2.1 "CPU Code Table" on page 103), the mirrored packet is assigned QoS attributes, the destination device CPU (local CPU or CPU on a remote device), and whether to truncate the packet to 128 bytes. The CPU learns the ingress or egress port from which the packet was mirrored by examining the TO_CPU DSA tag <srcPort>/<trgPort> and <srcDev>/<trgDev> fields.

If the configured analyzer device is a remote device number, the packet is sent over a cascade port or trunk group according to the Device Map table.
(Section 4.2 "Single-Target Destination in a Cascaded System").

Mirrored packets queued on the analyzer port are not subject to VLAN or Spanning Tree egress filtering.

In a cascaded system, if the ingress and egress analyzer ports are always network ports on the local device, each device may configure the ingress and egress analyzer port independently. However, if mirroring to a remote analyzer port is required, then only one ingress analyzer port and one egress analyzer port are supported for the entire system.

Mirrored packets transmitted on the analyzer port are transmitted either VLAN-tagged or untagged as a function of the format in which the packet was ingressed (if the packet is ingress mirrored), or the format in which the packet was egressed (if the packet is egress mirrored). The tagged state of the analyzer port in the mirrored packet's VLAN is not relevant. (If the destination is the CPU port, the packet is always DSA-tagged with TO_CPU format.)

Notes
• As can be seen from the above description, if the analyzer is on a remote device, there are two configuration approaches:
  1. Configure the analyzer device to the remote device number on which the analyzer port resides. In this case, the configuration of the analyzer local port is not relevant.
  2. Configure the analyzer device to the local device number, and set the analyzer local port to the local cascade port that "leads" to the analyzer device.
• If the cascade interface leading to the destination device is not a trunk group, there is no functional difference between these approaches. However, option #1 has the advantage that if the cascaded system topology changes, only the Device Map table needs to be updated; the analyzer device configuration need not be updated.
• If the cascade port interface leading to the destination device is a trunk group, then there is a functional difference between the two approaches. Option #1 load-balances the ingress mirrored traffic across all port members of the trunk group, whereas option #2 always sends the ingress mirrored traffic through the same port member of the cascade trunk group.
• For details on segregating analyzer traffic, data traffic, and control traffic on a cascade port, see Section 8.4.1 "Traffic Class and Drop Precedence Assignment" on page 123.

Configuration
• To configure the ingress/egress mirrored packet QoS attributes, set the Ingress and Egress Monitoring to Analyzer QoS Configuration Register (Table 452 p. 708) fields accordingly.
• To configure the ingress/egress analyzer destination device and destination local port, set the Analyzer Port Configuration Register (Table 453 p. 709) fields accordingly.

16.2.2 Statistical Mirroring
The device supports ingress and/or egress statistical mirroring to the destination analyzer port.

The pre-egress engine can be configured to allow only a configurable ratio of 1:X packets marked for ingress/egress mirroring to be sent to the ingress/egress analyzer port. The ratio is independently configurable for ingress and egress mirrored traffic.

The configuration supports ratios of 1:X, where 0 <= X <= 2047. For example:
– X=0 does not send any mirrored packets to the analyzer port.
– X=1 sends ALL mirrored packets to the analyzer port.
– X=2 sends 1 out of every 2 mirrored packets to the analyzer port.
– X=3 sends 1 out of every 3 mirrored packets to the analyzer port.
– X=2047 sends 1 out of every 2047 mirrored packets to the analyzer port.

Configuration

Ingress Statistical Mirroring
• To enable ingress statistical mirroring, set the <IngressStat MirroringToAnalyzerPortEn> bit in the Statistic Sniffing Configuration Register (Table 451 p. 707).
• To configure the ingress statistical mirroring ratio, set the <IngressStat MirroringTo Analyzer PortRatio> field in the Statistic Sniffing Configuration Register (Table 451 p. 707) accordingly.

Egress Statistical Mirroring
• To enable egress statistical mirroring, set the <EgressStat MirrorTo AnalyzerEn> bit in the Statistical and CPU-Triggered Egress Mirroring to Analyzer Port Configuration Register (Table 474 p. 729).
• To configure the egress statistical mirroring ratio, set the <EgressStat MirroringTo Analyzer PortRatio> field in the Statistical and CPU-Triggered Egress Mirroring to Analyzer Port Configuration Register (Table 474 p. 729) accordingly.

16.2.3 Ingress Mirroring
Multiple mechanisms in the ingress pipeline can mark a packet to be mirrored to the ingress analyzer port. These mechanisms are defined in the following subsections.

16.2.3.1 Port-Based Ingress Mirroring
Any number of device ports (network or cascade) can be configured for ingress mirroring to the configured ingress analyzer port.

All "good" (no MAC-level errors) packets received on a port configured for ingress mirroring are marked for ingress mirroring, duplicated by the pre-egress engine, and sent to the destination ingress analyzer port, as described in Section 16.2.1 "Mirroring Overview".

Configuration
To configure an ingress port to mirror all "good" packets received to the configured ingress analyzer destination, set the <MirrorToIngress AnalyzerPort> bit in the Port<n> VLAN and QoS Configuration Entry (0<=n<27, for CPU port n=0x3F) (Table 312 p. 567), using the access procedure defined in the Ports VLAN, QoS and Protocol Access Control Register (Table 314 p. 575).

16.2.3.2 Policy-Based Ingress Mirroring
In the Policy engine, a policy action can be configured to enable ingress mirroring (see Section 10.6 "Policy Actions" on page 195). All packets matching this policy rule are marked for ingress mirroring, duplicated by the pre-egress engine, and sent to the destination ingress analyzer port, as described in Section 16.2.1 "Mirroring Overview".

16.2.3.3 VLAN-Based Ingress Mirroring
A VLAN entry (Section 11.2.6 "VLAN Table Entry") can be configured with the field <Ingress mirror> set via the VLT Tables Access Control Register (Table 523 p. 765).

In the bridge engine, all packets assigned to a VLAN whose VLAN table entry has the field <ingress mirror> set are marked for ingress mirroring, duplicated by the pre-egress engine, and sent to the destination ingress analyzer port, as described in Section 16.2.1 "Mirroring Overview".

16.2.3.4 FDB-Based Ingress Mirroring
An FDB entry (Section 11.4.2 "FDB Entry") can be configured with the field <ingress mirror to analyzer port> set (Section 11.4.6 "CPU Update and Query of the FDB").

In the bridge engine, all packets whose Source Address or Destination Address lookup matches an entry with <ingress mirror to analyzer port> set are marked for ingress mirroring, duplicated by the pre-egress engine, and sent to the destination ingress analyzer port, as described in Section 16.2.1 "Mirroring Overview".

16.2.4 Egress Mirroring
The following mechanisms in the egress pipeline can mark a packet to be mirrored to the egress analyzer port.

16.2.4.1 Port-Based Egress Mirroring
All packets enqueued for transmission on a port configured for egress mirroring are marked for egress mirroring, duplicated by the pre-egress engine, and sent to the destination egress analyzer port, as described in Section 16.2.1 "Mirroring Overview".

Note
If the packet is egressed on multiple ports of the device (e.g., the packet is unknown Unicast or Multicast/Broadcast) and more than one of the packet egress ports is enabled for egress mirroring, only the packet instance that is egressed on the lowest port number is mirrored to the Egress Analyzer port.
Any number of device ports (network or cascade) can be configured for egress mirroring to the configured egress analyzer port.

Configuration
• Prior to configuring any of the device cascade ports for egress mirroring, set the <Cascade EgressMonitorEn> bit in the Egress Monitoring Enable Configuration Register (Table 454 p. 709).
• Prior to configuring any non-cascade port for egress mirroring, set the corresponding bit in the <PortEgressMonitorEn> field of the Egress Monitoring Enable Configuration Register (Table 454 p. 709).
• To configure an egress port to mirror all enqueued packets to the configured egress analyzer, set the <EgressMirror ToAnalyzerEn> bit in the Port<n> Txq Configuration Register (0<=n<27, CPU port number is 0x3F) (Table 466 p. 722).

Section 17. LED Interface

17.1 LED Interface Overview
This section provides a detailed description of the LED interface and the LED pin functionality.

The LED interface consists of two 3-pin serial interfaces. The LED interface data stream can be loaded into an external shift register, which can be used to display the data via LED indicators. The data accessible on the LED Indications serial interfaces for each of the device's ports includes link status, speed, duplex mode, activity, etc.

To eliminate the need for an external logic device for more complex indications and multi-color LEDs, the LED serial data stream also provides extensive manipulation of the port indications.
The device incorporates two Serial LED Interfaces:

LED Interface 0 (in all devices):
• For Port0 through Port11 indications.
• For CPU port indications.
• For Port26 indications (relevant for 98DX249, 98DX269, 98DX270, 98DX273, and 98DX803 only).

LED Interface 1:
• In the 98DX107, 98DX167, 98DX243, 98DX246, 98DX247, 98DX250, 98DX253, 98DX260, 98DX262, 98DX263, 98DX270, 98DX273, 98DX803 devices: For Port12 through Port23 indications.
• In the 98DX163 and 98DX166 devices: For Port12 through Port15 indications.
• For Port24 and Port25 indications (relevant for 98DX130, 98DX133, 98DX249, 98DX260, 98DX262, 98DX263, 98DX269, 98DX270, 98DX273, and 98DX803 only).

The Serial LED Interface is a three-pin interface that consists of the following outputs:

LEDCLK0, LEDCLK1: LEDCLK is the primary time base of the LED Indication Interface. It is a free-running clock at a fixed frequency of system clock divided by 128. For a nominal Core Clock of 200 MHz, the frequency of LEDCLK is 1.56 MHz. To improve setup and hold conditions, LEDCLK may be inverted.

LEDSTB0, LEDSTB1: LEDSTB (Active High) indicates the beginning of the data frame, which by default starts at LED bit 0 and ends at LED bit 255. LEDSTB is activated for the duration of one LEDCLK cycle at the beginning of every data frame.

LEDDATA0, LEDDATA1: The serial frames of the LED Indications, a serial stream of up to 256 LED Indications. The serial stream may be ordered by indication type or by port. In addition, the serial stream start point and end point may be configured.

Configuration
• To invert LEDCLK0, set <LEDClkInvert> in the LED Interface0 Class6 Manipulation Register (for ports 0 through 11) (Table 278 p. 528).
• To invert LEDCLK1, set <LEDClkInvert> in the LED Interface1 Class6 Manipulation Register (for ports 12 through 23) (Table 279 p. 529).
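As a quick check of the LEDCLK figures above, the following Python sketch recomputes the clock frequency and the time needed to shift a full 256-bit frame. It assumes one LED bit is shifted out per LEDCLK cycle, which the text implies but does not state outright; the constant and function names are illustrative.

```python
CORE_CLOCK_HZ = 200_000_000  # nominal Core Clock from the text
LEDCLK_DIVIDER = 128         # LEDCLK = system clock / 128
FRAME_BITS = 256             # default frame: LED bit 0 through 255


def ledclk_hz(core_clock_hz=CORE_CLOCK_HZ):
    """LEDCLK frequency in Hz for a given core clock."""
    return core_clock_hz / LEDCLK_DIVIDER


def frame_time_us(bits=FRAME_BITS, core_clock_hz=CORE_CLOCK_HZ):
    """Time in microseconds to shift out one frame."""
    # Assumption: one LED indication bit per LEDCLK cycle.
    return bits / ledclk_hz(core_clock_hz) * 1e6


print(f"{ledclk_hz() / 1e6:.4f} MHz")  # 1.5625 MHz
print(f"{frame_time_us():.2f} us")     # 163.84 us
```

The 1.56 MHz quoted in the text is this 1.5625 MHz value, truncated.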
17.2 LED Indications
This section describes the LED Indications available via the LED interfaces.
The LED stream supplies the indications for the 14 indication classes. A class is a group of bits, containing a bit per port, representing a specific property of the port (e.g., link, collision).
Some of the indication classes may undergo the following actions:
• Stretch: Stretch an indication class to a visible length, from 5.5 ms to 704 ms.
• Disable when link is down: Disable the LED Indication classes when the link is down.

Configuration
• To stretch stretchable classes in LED Interface0, set <PulseStretch> in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520) to the desired length.
• To stretch stretchable classes in LED Interface1, set <PulseStretch> in the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522) to the desired length.
• To disable classes on link down in LED Interface0, set <DisableOnLinkDown> in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520).
• To disable classes on link down in LED Interface1, set <DisableOnLinkDown> in the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).
• To configure the beginning and the end of the serial LED Stream for LED Interface0, set <LEDStart> and <LEDEnd> in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520).
• To configure the beginning and the end of the serial LED Stream for LED Interface1, set <LEDStart> and <LEDEnd> in the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).

17.2.1 Tri-Speed Ports Classes
Table 68 outlines the Tri-Speed ports indication classes and the actions each class may undergo.

Table 68: Tri-Speed Ports and CPU Port Indication Classes Description
Class # | Indication | Description | Stretch Effect | Disable On Link Down Effect
0 | Speed is 1000 Mbps | Port’s speed is 1000 Mbps. | No | Yes
1 | Speed is 100 Mbps | Port’s speed is 100 Mbps. | No | Yes
2 | Full Duplex | Port’s duplex mode is full-duplex. | No | Yes
3 | Link Up | Port’s link is up. | No | No
4 | Activity | Receive activity or transmit activity. | Yes | Yes
5 | Half Duplex/Fiber Link Up | This class may be used for one of two indications. Half Duplex: Port’s duplex mode is half-duplex. Fiber Link Up: For dual-media ports connected to a dual-media PHY (such as the 88E1112), indicating that the port is connected to a fiber medium and the link is up. See Section 17.2.1.1. | No | Yes
6 | Port Disabled | Port is disabled. | No | No
7 | Collision | Collision is detected. | Yes | No
8 | Erroneous Frame in Rx | Erroneous Frame in Rx, due to Rx Error. | Yes | No
9 | Cascading Port | Port is configured as a Cascading port. | No | No
10 | Spanning Tree State | Spanning Tree state. 0 = Span State is disabled. 1 = Span State is enabled. | No | No
11 | Speed is 10 Mbps | Port’s speed is 10 Mbps. | No | No
12 | Port Pause | The port has reached its Rx Buffers threshold. In full-duplex mode, if Flow Control is enabled, it transmits an XOFF pause frame. In half-duplex mode, if back pressure is enabled, the port jams the line by transmitting a jam pattern. | No | No
13 | Link Down/Copper Link Up | This class may be used for one of two indications. Link Down: Port’s link is down. Copper Link Up: For dual-media ports connected to a dual-media PHY (such as the 88E1112), indicating that the port is connected to a copper medium and the link is up. See Section 17.2.1.1. | No | No

17.2.1.1 Dual-Media Ports
Class5 and Class13 indications may be used to display the selected medium on a dual-media port via the LED Interface.

Configuration
• To set Class5 to display Fiber Link Up on dual-media ports on LED Interface0, set <Class5Sel> to 1 in the LED Interface0 Class6 Manipulation Register (for ports 0 through 11) (Table 278 p. 528).
• To set Class13 to display Copper Link Up on dual-media ports on LED Interface0, set <Class13Sel> to 1 in the LED Interface0 Class6 Manipulation Register (for ports 0 through 11) (Table 278 p. 528).
• To set Class5 to display Fiber Link Up on dual-media ports on LED Interface1, set <Class5Sel> to 1 in the LED Interface1 Class6 Manipulation Register (for ports 12 through 23) (Table 279 p. 529).
• To set Class13 to display Copper Link Up on dual-media ports on LED Interface1, set <Class13Sel> to 1 in the LED Interface1 Class6 Manipulation Register (for ports 12 through 23) (Table 279 p. 529).

17.2.2 HyperG.Stack and HX/QX Ports Classes
Table 69 outlines the HyperG.Stack ports indication classes and the actions each class may undergo.

Table 69: HyperG.Stack Port Indication Classes Description
Class # | Indication | Description | Stretch Effect | Disable On Link Down Effect
0 | XAUI PHY Lane0, LED1 | See section 17.2.2.1. (In QX mode, relevant if lane 0 is used) | No | No
1 | XAUI PHY Lane0, LED0 | See section 17.2.2.1. (In QX mode, relevant if lane 0 is used) | No | No
2 | XAUI PHY Lane1, LED1 | See section 17.2.2.1. (In QX mode, relevant if lane 1 is used) | No | No
3 | XAUI PHY Lane1, LED0 | See section 17.2.2.1. (In QX mode, relevant if lane 1 is used) | No | No
4 | XAUI PHY Lane2, LED1 | See section 17.2.2.1. (Not relevant for HX/QX ports) | No | No
5 | XAUI PHY Lane2, LED0 | See section 17.2.2.1. (Not relevant for HX/QX ports) | No | No
6 | XAUI PHY Lane3, LED1 | See section 17.2.2.1. (Not relevant for HX/QX ports) | No | No
7 | XAUI PHY Lane3, LED0 | See section 17.2.2.1. (Not relevant for HX/QX ports) | No | No
8 | Port Disabled | Port is disabled. | No | No
9 | Cascading Port | Port is configured as a Cascading port. | Yes | No
10 | Spanning Tree State | Spanning Tree state. 0 = Span State is disabled. 1 = Span State is enabled. | Yes | No
11 | Port Pause | The port has reached its Rx Buffers threshold. In full-duplex mode, if Flow Control is enabled, it transmits an XOFF pause frame. | Yes | No
12 | Reserved | | No | No
13 | Reserved | | No | No

17.2.2.1 XAUI PHY and HX/QX PCS LEDs
Eight of the 14 HyperG.Stack and HX/QX ports LED classes are dedicated to LED Indications from the XAUI PHY and the HX/QX PCS attached to each of the HyperG.Stack and HX/QX ports. Two LEDs are dedicated to each lane.
XAUI PHY LEDs are programmed through LED Control registers 8004 to 8007. LED characteristics such as blinking and pulse-stretching can be set individually for each lane:
• LED 0 and LED 1 are XAUI PHY Lane0 indications controlled by the LED 0, 1 Control Device 3,4,5, Register (Table 185 p. 448).
• LED 2 and LED 3 are XAUI PHY Lane1 indications controlled by the LED 2, 3 Control Device 3,4,5, Register (Table 186 p. 449).
• LED 4 and LED 5 are XAUI PHY Lane2 indications controlled by the LED 4, 5 Control Device 3,4,5, Register (Table 187 p. 451).
• LED 6 and LED 7 are XAUI PHY Lane3 indications controlled by the LED 6, 7 Control Device 3,4,5, Register (Table 188 p. 452).
• HX/QX LEDs are programmed through HXPort<n> LED Control (Table 231 p. 483).
Table 70 outlines the indications that may be displayed on each of the LEDs, using the above registers.

Table 70: XAUI PHY LED Indications
LED Control Bits (Registers 8004–8007, Bits [11:8], [3:0]) | Indication
0 | LED Low
1 | LED High
2 | LED Blink (Not relevant for the HX/QX Ports)
3 | Sync Lane
4 | Rx FIFO Over/Underflow lane
5 | Tx FIFO Over/Underflow lane
6 | Byte Error
7 | Disparity Error
8 | Receive
9 | Transmit
10 | Activity
11 | Receive/Link
12 | Link
13 | Local Fault
15-14 | Reserved

The XAUI PHY registers 8004–8007 enable stretching the LED indications per lane for the HyperG.Link ports, from no stretch to a pulse stretch of 2150 ms in eight steps.
In addition, each pair may be configured to blink, from a blink rate of 34 ms to a blink rate of 538 ms in five steps.

17.2.3 LED Indication Class Manipulation
Indication classes may undergo the following manipulations:
Force: Instead of the primary indication defined in Table 68 for Tri-Speed ports and in Table 69 for HyperG.Stack ports, an Indication class may be forced to a configured value.
Blink: The Indication class blinks according to the blink configuration.
Invert: The Indication class polarity may be inverted.
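The three manipulations can be pictured as a small pipeline applied to the 12-bit class word of a Tri-Speed interface, in the order force, blink, invert. The Python sketch below is a behavioural model only: the function name and parameters are hypothetical, blinking is modelled as gating the class bits with the current level of the blink signal, and inversion is modelled as a bitwise complement. Neither modelling choice is taken from the register definitions.

```python
MASK_12 = 0xFFF  # one bit per port, 12 ports per Tri-Speed interface


def manipulate_class(primary, force_en=False, force_data=0,
                     blink_en=False, blink_level=1, invert_en=False):
    """Behavioural model of force -> blink -> invert on a 12-bit class word."""
    # Force: replace the primary indication with the configured data.
    data = (force_data if force_en else primary) & MASK_12
    if blink_en and not blink_level:
        # Assumption: during the "off" phase of the blink signal the class reads 0.
        data = 0
    if invert_en:
        # Assumption: inversion complements every port bit.
        data = ~data & MASK_12
    return data


# Force ports 0 and 1 on, blink enabled during the blink "on" phase, then invert.
print(hex(manipulate_class(0x000, force_en=True, force_data=0x003,
                           blink_en=True, blink_level=1, invert_en=True)))  # 0xffc
```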
All of the above manipulations may be done on classes 0 through 6 of the Tri-Speed ports indication classes and on classes 0 through 11 of the HyperG.Stack ports indication classes.

Configuration
All the LED class manipulations are configurable (see C.11.4 "LED Interfaces Configuration Registers" on page 520).

Note
All of the above manipulations are incremental. A class may be forced to a value from a configured register. This value may be configured to be blinking and then it may be inverted.

17.2.3.1 Forcing Indication Class
Instead of the primary indication, an Indication class may be forced to a configured value.
For example: Tri-Speed Port Class5 is half-duplex. To force data other than half-duplex onto the 12 bits of Class5, in the LED Interface0 Class5 Manipulation Register (for ports 0 through 11), set the <Class5ForceEn> field to 1 and set the <Class5ForceData> field to the 12 bits of data to be forced onto Class5.

17.2.3.2 Blinking Indication Class
An Indication class may be a static or blinking indication. The user may configure two blinking modes for each LED Interface.

Configuration
• Blink mode 0 is configured via the <Blink0Duration> and <Blink0DuTyCycle> fields in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520) or the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).
• Blink mode 1 is configured via the <Blink1Duration> and <Blink1 DutyCycle> fields in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520) or the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).

Each class may be static or associated with blinking mode 0 or 1.
Example for LED Interface0 Class0:
• By setting the <Class0BlinkEn> field in the LED Interface0 Class0–1 Manipulation Register (for Ports 0 through 11) (Table 270 p. 523) to 0, Indication Class0 is configured to be static. By setting the <Class0BlinkEn> field to 1, Indication Class0 is configured to be blinking.
• A class is associated with a blink mode according to the value configured in the <Class0BlinkSel> field. When the <Class0BlinkSel> field is 0, the class is associated with blink mode 0. When the <Class0BlinkSel> field is 1, the class is associated with blink mode 1.

17.2.3.3 Invert Indication Class
The active polarity of an indication may be inverted.
Example for LED Interface0 Class0:
• By setting the <Class0InvertEn> field in the LED Interface0 Class0–1 Manipulation Register (for Ports 0 through 11) (Table 270 p. 523) to 0, active polarity is not inverted and when a signal is active, it is set Low.
• By setting the <Class0InvertEn> field in the LED Interface0 Class0–1 Manipulation Register (for Ports 0 through 11) (Table 270 p. 523) to 1, active polarity is inverted and when a signal is active, it is set High.

17.3 LED Indication Groups
To enable more complex indications, the device defines groups of LED Indication classes.
Four groups are defined for the Tri-Speed ports LED Indication classes and two groups are defined for the HyperG.Stack ports LED Indication classes.
The groups enable the user to generate indications by performing combinational logic on up to four classes. For each group, there are four fields to select these classes, Group<n>Class<m>Sel, where n is in the range of 0–3 and m is one of {A, B, C, D}. Each field selects one of classes 0 through 13.
The result of the group data is (Class A AND Class B) OR (Class C AND Class D).
The user may wish to use fewer than four classes for the combinational logic. To allow this, selecting a non-existing class (14 or 15) sets the data as 0 for classes A and C, and as 1 for classes B and D.
For configuration of the indication groups see C.11.4.3 "Tri-Speed Ports LED Groups Configuration Registers" on page 530 and C.11.4.5 "HyperG.Stack and HX/QX Ports LED Groups Configuration Registers" on page 543.
Table 71 illustrates some possible implementations. It is assumed the class data is not manipulated (static).

Table 71: Group Data Description
ClassA_Sel | ClassB_Sel | ClassC_Sel | ClassD_Sel | Group Data
2 | 4 | 13 | 15 | (Full-duplex AND Activity) OR Link down.
2 | 4 | 15 | X | Full-duplex AND Activity.
13 | 7 | 6 | 1 | (Link Down AND Collision) OR (Port disabled AND Speed 100).

Note
Since the input to the groups is the manipulated class data, the user can implement more complicated expressions such as ((~Speed1000 AND Full-duplex) OR ((Activity AND Blink0) AND LinkUp)).

17.4 Other Indications

17.4.1 CPU Port Indications
LED Interface 0 also displays the CPU port indications. The CPU port has the same classes as the Tri-Speed ports (see Table 68, "Tri-Speed Ports and CPU Port Indication Classes Description," on page 319). These 14 bits are a static indication class and cannot be manipulated.

Note
Indications from the CPU port cannot be manipulated (forced, inverted, or blinked).

17.4.2 GPP
This section is relevant for the following devices:
• Layer 2+ Stackable: 98DX130, 98DX166, 98DX246, 98DX250, 98DX260, 98DX270, 98DX803
• Multilayer Stackable: 98DX107, 98DX133, 98DX167, 98DX247, 98DX253, 98DX263, 98DX273
Not relevant for the SecureSmart devices.

Both LED interfaces display the value of the eight GPP pins.

17.4.3 HyperG.Stack and HX/QX Ports TxQ Status
The Transmit Queue status of each of the HyperG.Stack and HX/QX ports (Port24 through Port26) may be displayed via both LED interfaces instead of GPP[7:5]. Port<i>TxqNotFull is set to 0 when the number of buffers in all of this port’s transmit queues exceeds the limit configured for this port.

Configuration
To display the HyperG.Stack and HX/QX ports Transmit Queue Status instead of GPP[7:5], set <HyperGStack PortTxqStaus ViaLedEn> in the Transmit Queue Misc. Control Register (Table 460 p. 719) to 1.

17.4.4 Blink Signals
Blink0 and Blink1 are configured in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520) or the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).
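The group logic of Section 17.3, (Class A AND Class B) OR (Class C AND Class D), can be checked with a short Python model. The function names and the dictionary representation of per-port class data are illustrative assumptions; the treatment of selector values 14 and 15 follows the rule stated in Section 17.3, and class numbers follow Table 68.

```python
def select(classes, sel, missing_value):
    """Return the class bit for selector sel, or the default for 14/15."""
    return classes[sel] if sel <= 13 else missing_value


def group_data(classes, a_sel, b_sel, c_sel, d_sel):
    """(A AND B) OR (C AND D) for one port.

    classes maps class number (0..13) to that port's bit (0 or 1).
    A non-existing class (14 or 15) reads as 0 for A and C, and 1 for B and D.
    """
    a = select(classes, a_sel, 0)
    b = select(classes, b_sel, 1)
    c = select(classes, c_sel, 0)
    d = select(classes, d_sel, 1)
    return (a & b) | (c & d)


# One port: full duplex (class 2) with activity (class 4); link is up,
# so Link Down (class 13) reads 0.
port = {n: 0 for n in range(14)}
port[2] = 1
port[4] = 1

# Full-duplex AND Activity: C selects a non-existing class, so D is don't-care.
print(group_data(port, 2, 4, 15, 15))  # 1
# (Full-duplex AND Activity) OR Link Down: D = 15 reads as 1, so C passes through.
print(group_data(port, 2, 4, 13, 15))  # 1
```

Using 15 for an unused B or D selector turns the corresponding AND term into a pass-through, which is how a two-input or three-input expression is built from the four-input structure.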
17.5 LED Stream
This section describes the 256 bits of the LED Stream. The LED stream may be organized by port or by class.

Configuration
• To organize LED Interface0 bit stream by class, set the <LEDOrganizeMode> bit in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520).
• To organize LED Interface0 bit stream by port, clear the <LEDOrganizeMode> bit in the LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) (Table 268 p. 520).
• To organize LED Interface1 bit stream by class, set the <LEDOrganizeMode> bit in the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).
• To organize LED Interface1 bit stream by port, clear the <LEDOrganizeMode> bit in the LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) (Table 269 p. 522).

17.5.1 LED Stream Ordered by Class

17.5.1.1 LED Interface 0 Ordered by Class
Table 72 describes the 256 bits of LED Interface 0 when it is organized by class.

Table 72: LED Interface 0 Ordered by Class
Bits | Indications
11:0 | Tri-Speed Ports 0 through 11 Class 0 indications. (Bit[0] is Port0 Class0 indication and bit[11] is Port11 Class0 indication.)
23:12 | Tri-Speed Ports 0 through 11 Class 1 indications.
35:24 | Tri-Speed Ports 0 through 11 Class 2 indications.
47:36 | Tri-Speed Ports 0 through 11 Class 3 indications.
59:48 | Tri-Speed Ports 0 through 11 Class 4 indications.
71:60 | Tri-Speed Ports 0 through 11 Class 5 indications.
83:72 | Tri-Speed Ports 0 through 11 Class 6 indications.
95:84 | Tri-Speed Ports 0 through 11 Class 7 indications.
107:96 | Tri-Speed Ports 0 through 11 Class 8 indications.
119:108 | Tri-Speed Ports 0 through 11 Class 9 indications.
131:120 | Tri-Speed Ports 0 through 11 Class 10 indications.
143:132 | Tri-Speed Ports 0 through 11 Class 11 indications.
155:144 | Tri-Speed Ports 0 through 11 Class 12 indications.
167:156 | Tri-Speed Ports 0 through 11 Class 13 indications.
179:168 | Tri-Speed Ports 0 through 11 Group 0 indications.
191:180 | Tri-Speed Ports 0 through 11 Group 1 indications.
203:192 | Tri-Speed Ports 0 through 11 Group 2 indications.
215:204 | Tri-Speed Ports 0 through 11 Group 3 indications.
229:216 | CPU port data.
234:230 | GPP[4:0].
235 | GPP[5] / Port24TxqNotFull
236 | GPP[6] / Port25TxqNotFull
237 | GPP[7] / Port26TxqNotFull
238 | HyperG.Stack or HX/QX Port26 Class 0 indication.
239 | HyperG.Stack or HX/QX Port26 Class 1 indication.
240 | HyperG.Stack or HX/QX Port26 Class 2 indication.
241 | HyperG.Stack or HX/QX Port26 Class 3 indication.
242 | HyperG.Stack or HX/QX Port26 Class 4 indication.
243 | HyperG.Stack or HX/QX Port26 Class 5 indication.
244 | HyperG.Stack or HX/QX Port26 Class 6 indication.
245 | HyperG.Stack or HX/QX Port26 Class 7 indication.
246 | HyperG.Stack or HX/QX Port26 Class 8 indication.
247 | HyperG.Stack or HX/QX Port26 Class 9 indication.
248 | HyperG.Stack or HX/QX Port26 Class 10 indication.
249 | HyperG.Stack or HX/QX Port26 Class 11 indication.
250 | HyperG.Stack or HX/QX Port26 Class 12 indication.
251 | HyperG.Stack or HX/QX Port26 Class 13 indication.
252 | HyperG.Stack or HX/QX Port26 Group 0 indication.
253 | HyperG.Stack or HX/QX Port26 Group 1 indication.
254 | Blink1.
255 | Blink0.

17.5.1.2 LED Interface 1 Ordered by Class for the Device
Table 73 describes the 256 bits of LED Interface 1 when it is organized by class.

Table 73: LED Interface 1 Ordered by Class
Bits | Indications
11:0 | Tri-Speed Ports 12 through 23 Class 0 indications. (Bit[0] is Port12 Class0 indication and bit[11] is Port23 Class0 indication.)
23:12 | Tri-Speed Ports 12 through 23 Class 1 indications.
35:24 | Tri-Speed Ports 12 through 23 Class 2 indications.
47:36 | Tri-Speed Ports 12 through 23 Class 3 indications.
59:48 | Tri-Speed Ports 12 through 23 Class 4 indications.
71:60 | Tri-Speed Ports 12 through 23 Class 5 indications.
83:72 | Tri-Speed Ports 12 through 23 Class 6 indications.
95:84 | Tri-Speed Ports 12 through 23 Class 7 indications.
107:96 | Tri-Speed Ports 12 through 23 Class 8 indications.
119:108 | Tri-Speed Ports 12 through 23 Class 9 indications.
131:120 | Tri-Speed Ports 12 through 23 Class 10 indications.
143:132 | Tri-Speed Ports 12 through 23 Class 11 indications.
155:144 | Tri-Speed Ports 12 through 23 Class 12 indications.
167:156 | Tri-Speed Ports 12 through 23 Class 13 indications.
179:168 | Tri-Speed Ports 12 through 23 Group 0 indications.
191:180 | Tri-Speed Ports 12 through 23 Group 1 indications.
203:192 | Tri-Speed Ports 12 through 23 Group 2 indications.
215:204 | Tri-Speed Ports 12 through 23 Group 3 indications.
217:216 | HyperG.Stack or HX/QX Ports 24 through 25 Class 0 indications.
219:218 | HyperG.Stack or HX/QX Ports 24 through 25 Class 1 indications.
221:220 | HyperG.Stack or HX/QX Ports 24 through 25 Class 2 indications.
223:222 | HyperG.Stack or HX/QX Ports 24 through 25 Class 3 indications.
225:224 | HyperG.Stack or HX/QX Ports 24 through 25 Class 4 indications.
227:226 | HyperG.Stack or HX/QX Ports 24 through 25 Class 5 indications.
229:228 | HyperG.Stack or HX/QX Ports 24 through 25 Class 6 indications.
231:230 | HyperG.Stack or HX/QX Ports 24 through 25 Class 7 indications.
233:232 | HyperG.Stack or HX/QX Ports 24 through 25 Class 8 indications.
235:234 | HyperG.Stack or HX/QX Ports 24 through 25 Class 9 indications.
237:236 | HyperG.Stack or HX/QX Ports 24 through 25 Class 10 indications.
239:238 | HyperG.Stack or HX/QX Ports 24 through 25 Class 11 indications.
241:240 | HyperG.Stack or HX/QX Ports 24 through 25 Group 0 indications.
243:242 | HyperG.Stack or HX/QX Ports 24 through 25 Group 1 indications.
245:244 | Reserved.
250:246 | GPP[4:0].
251 | GPP[5] / Port24TxqNotFull
252 | GPP[6] / Port25TxqNotFull
253 | GPP[7] / Port26TxqNotFull
254 | Blink1.
255 | Blink0.

17.5.2 LED Stream Ordered by Port

17.5.2.1 LED Interface 0 Ordered by Port
Table 74 describes the 256 bits of LED Interface 0 when it is organized by port.
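The Tri-Speed portion of each stream (bits 0 through 215) follows a regular pattern in both orderings, which makes the bit position of any class or group indication easy to compute. The Python sketch below encodes that pattern as read from the tables in this section; the function names are hypothetical, and port_index is the port's position within its interface (0 through 11, that is, the port number for LED Interface 0 and the port number minus 12 for LED Interface 1). Bits 216 through 255 are irregular and must be taken from the tables directly.

```python
PORTS = 12      # Tri-Speed ports per LED interface
CLASSES = 14    # indication classes per Tri-Speed port
GROUPS = 4      # indication groups per Tri-Speed port
GROUP_BASE = PORTS * CLASSES  # group bits start at bit 168 in both orderings


def class_bit(port_index, cls, by_class):
    """Stream bit of a Tri-Speed class indication."""
    if not (0 <= port_index < PORTS and 0 <= cls < CLASSES):
        raise ValueError("port_index or class out of range")
    # By class: 12 consecutive bits per class. By port: 14 consecutive bits per port.
    return cls * PORTS + port_index if by_class else port_index * CLASSES + cls


def group_bit(port_index, group, by_class):
    """Stream bit of a Tri-Speed group indication."""
    if not (0 <= port_index < PORTS and 0 <= group < GROUPS):
        raise ValueError("port_index or group out of range")
    if by_class:
        return GROUP_BASE + group * PORTS + port_index
    return GROUP_BASE + port_index * GROUPS + group


# Port 11, Class 0 (Speed is 1000 Mbps) in the class-ordered stream.
print(class_bit(11, 0, by_class=True))    # 11
# Port 1, Class 0 in the port-ordered stream.
print(class_bit(1, 0, by_class=False))    # 14
# Port 0, Group 3 in the port-ordered stream.
print(group_bit(0, 3, by_class=False))    # 171
```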
Table 74: LED Interface 0 Ordered by Port
Bits | Indications
13:0 | Port0 classes. (Bit[0] is a Port0 class 0 indication and bit[13] is a Port0 class 13 indication.)
27:14 | Port1 classes.
41:28 | Port2 classes.
55:42 | Port3 classes.
69:56 | Port4 classes.
83:70 | Port5 classes.
97:84 | Port6 classes.
111:98 | Port7 classes.
125:112 | Port8 classes.
139:126 | Port9 classes.
153:140 | Port10 classes.
167:154 | Port11 classes.
171:168 | Port0 groups. (Bit[168] is a Port0 group 0 indication and bit[171] is a Port0 group 3 indication.)
175:172 | Port1 groups.
179:176 | Port2 groups.
183:180 | Port3 groups.
187:184 | Port4 groups.
191:188 | Port5 groups.
195:192 | Port6 groups.
199:196 | Port7 groups.
203:200 | Port8 groups.
207:204 | Port9 groups.
211:208 | Port10 groups.
215:212 | Port11 groups.
229:216 | CPU port data.
234:230 | GPP[4:0].
235 | GPP[5] / Port24TxqNotFull
236 | GPP[6] / Port25TxqNotFull
237 | GPP[7] / Port26TxqNotFull
238 | HyperG.Stack or HX/QX Port26 Class 0 indication.
239 | HyperG.Stack or HX/QX Port26 Class 1 indication.
240 | HyperG.Stack or HX/QX Port26 Class 2 indication.
241 | HyperG.Stack or HX/QX Port26 Class 3 indication.
242 | HyperG.Stack or HX/QX Port26 Class 4 indication.
243 | HyperG.Stack or HX/QX Port26 Class 5 indication.
244 | HyperG.Stack or HX/QX Port26 Class 6 indication.
245 | HyperG.Stack or HX/QX Port26 Class 7 indication.
246 | HyperG.Stack or HX/QX Port26 Class 8 indication.
247 | HyperG.Stack or HX/QX Port26 Class 9 indication.
248 | HyperG.Stack or HX/QX Port26 Class 10 indication.
249 | HyperG.Stack or HX/QX Port26 Class 11 indication.
250 | HyperG.Stack or HX/QX Port26 Class 12 indication.
251 | HyperG.Stack or HX/QX Port26 Class 13 indication.
252 | HyperG.Stack or HX/QX Port26 Group 0 indication.
253 | HyperG.Stack or HX/QX Port26 Group 1 indication.
254 | Blink1.
255 | Blink0.

17.5.2.2 LED Interface 1 Ordered by Port
Table 75 describes the 256 bits of LED Interface 1 when it is organized by port.

Table 75: LED Interface 1 Ordered by Port
Bits | Indications
13:0 | Port12 classes. (Bit[0] is a Port12 class 0 indication and bit[13] is a Port12 class 13 indication.)
27:14 | Port13 classes.
41:28 | Port14 classes.
55:42 | Port15 classes.
69:56 | Port16 classes.
83:70 | Port17 classes.
97:84 | Port18 classes.
111:98 | Port19 classes.
125:112 | Port20 classes.
139:126 | Port21 classes.
153:140 | Port22 classes.
167:154 | Port23 classes.
171:168 | Port12 groups. (Bit[168] is a Port12 group 0 indication and bit[171] is a Port12 group 3 indication.)
175:172 | Port13 groups.
179:176 | Port14 groups.
183:180 | Port15 groups.
187:184 | Port16 groups.
191:188 | Port17 groups.
195:192 | Port18 groups.
199:196 | Port19 groups.
203:200 | Port20 groups.
207:204 | Port21 groups.
211:208 | Port22 groups.
215:212 | Port23 groups.
227:216 | Port24 classes.
232:228 | GPP[4:0].
233 | GPP[7] / Port26TxqNotFull
234 | GPP[6] / Port25TxqNotFull
235 | GPP[5] / Port24TxqNotFull
247:236 | Port25 classes.
249:248 | Port24 groups.
251:250 | Port25 groups.
253:252 | Reserved.
254 | Blink1.
255 | Blink0.

DSA Tag Formats
This section describes the DSA tag formats.
The device supports four DSA tag formats:
• TO_CPU
• FROM_CPU
• TO_ANALYZER
• FORWARD
This appendix describes these DSA tag formats in detail. For further information about the DSA tag see Section 4. "Distributed Switching Architecture" on page 44.
Packets sent between devices (through cascade ports) are always DSA-tagged.
When cascading ports are used to connect between devices in these families, the original DSA tag specification must be extended to contain two words of 4 bytes (32 bits) each.
When cascading ports connect a device in these families to a previous-generation Prestera®-DX device, such as the 98DX240, or to a Prestera®-EX 98EX1x6 device, the DSA tag must not be extended and contains a single 4-byte word.

The first word of the Extended DSA tag and the non-Extended DSA tag are the same, with the exception of an Extension bit.

A.1 Extended DSA Tag in TO_CPU Format

All packets forwarded to the Host CPU are forwarded with a TO_CPU Extended DSA tag format. Table 76 describes the TO_CPU Extended DSA tag format used for packets forwarded to the CPU.

Table 76: Extended TO_CPU DSA Tag

Bits   Name
       Description

Word 0

31:30  TagCommand
       0 = TO_CPU

29     SrcTagged/TrgTagged
       When the SrcTrg bit (Word1 Bit 8) = 0: SrcTagged.
       This tag contains Source Port information and this bit indicates the VLAN Tag format in which the packet was received on the network port:
       0 = Packet was received from a network port untagged.
       1 = Packet was received from a network port tagged.
       When SrcTrg = 1: TrgTagged.
       This tag contains Target Port information and this bit indicates the VLAN Tag format in which the packet was transmitted via the network port:
       0 = Packet was transmitted to a regular network port untagged.
       1 = Packet was transmitted to a regular network port tagged.
       NOTE: When SrcTrg is 0 and the packet forwarded to the CPU is received on a customer port on which Nested VLAN is implemented, SrcTagged is set to 0, regardless of the packet’s VLAN tag format. As the packet is considered untagged, when the packet is forwarded to the CPU, the customer’s VLAN tag (if any) resides after the DSA tag.
       NOTE: When this tag is not extended, this field indicates SrcTagged.

28:24  SrcDev/TrgDev
       According to the SrcTrg bit (Word1 Bit 8).
       When SrcTrg = 0: This tag contains Source Port information and this field indicates the number of the Source Device on which the packet was received.
       When SrcTrg = 1: This tag contains Target Port information and this field indicates the number of the Destination Device through which the packet was transmitted.
       NOTE: When the tag is not extended, this field indicates SrcDev[4:0] only.

23:19  SrcPort[4:0]/SrcTrunk[4:0]/TrgPort[4:0]
       According to the SrcTrg bit (Word1 Bit 8) and the OrigIsTrunk bit (Word1 Bit 27).
       When SrcTrg = 0 and OrigIsTrunk = 0: This tag contains Source Port information and this field, together with SrcPort[5] in Word1 Bit 10, indicates the number of the Source Port on which the packet was received.
       When SrcTrg = 0 and OrigIsTrunk = 1: This tag contains Source Trunk information and this field, together with SrcTrunk[6:5] in Word1 Bits[11:10], indicates the number of the Source Trunk on which the packet was received.
       When SrcTrg = 1: This tag contains Target Port information and this field, together with TrgPort[5] in Word1 Bit 10, indicates the number of the Destination Port through which the packet was transmitted.
       NOTE: When the tag is not extended, this field indicates SrcPort[4:0]. When the tag is extended, Word1 Bit 10 contains SrcPort[5]/SrcTrunk[5]/TrgPort[5] and Word1 Bit 11 contains SrcTrunk[6].

18:16  CPU_Code[3:1]
       CPU_Code[3:0] must be set to 0xF to indicate an Extended DSA tag.
       NOTE: CPU_Code[0] is in Word0 Bit 12. In the non-Extended tag, this field, together with Word0 Bit 12, contains the CPU code.

15:13  UP
       The 802.1p User Priority field assigned to the packet.
       When the SrcTrg bit (Word1 Bit 8) = 0: This field contains the packet’s incoming UP if SrcTagged = 1, or the UP assigned to the packet by the Ingress pipe of the device when SrcTagged = 0.
       When SrcTrg = 1: This field contains the packet’s outgoing UP if TrgTagged = 1, or the UP assigned to the packet by the Ingress pipe of the device when TrgTagged = 0.

12     CPU_Code[0]
       CPU_Code[3:0] must be set to 0xF to indicate an Extended DSA tag.

11:0   VID
       The VID assigned to the packet.
       When the SrcTrg bit (Word1 Bit 8) = 0: This field contains the packet’s incoming VID if SrcTagged = 1, or the VID assigned to the packet by the Ingress pipe of the device when SrcTagged = 0.
       When SrcTrg = 1: This field contains the packet’s outgoing VID if TrgTagged = 1, or the VID assigned to the packet by the Ingress pipe of the device when TrgTagged = 0.

Word 1

31     Extend
       Word1 is the last extension. This bit must be set to 0.

30     CFI
       When SrcTagged = 1: This is the VLAN Tag CFI bit with which the packet was received from the network port.
       When SrcTagged = 0: The packet was received untagged and the CFI bit is set to 0 by the device.

29:28  Reserved
       Reserved, must be set to 0.

27     OrigIsTrunk
       When a packet’s DSA tag is replaced from FORWARD to TO_CPU, FORWARD_DSA<SrcIsTrunk> = 1, and SrcTrg = 0, this bit is set to 1 and this tag’s {Word1[11:10], Word0[23:19]} = SrcTrunk[6:0], which is the Trunk number extracted from the FORWARD DSA tag.
       When SrcTrg = 0:
       0 = The packet’s source is a network port or Trunk on the local device, or a network port on a remote device, and {Word1[10], Word0[23:19]} = SrcPort[5:0].
       1 = The packet’s source is a Trunk on a remote device, its FORWARD DSA tag is replaced with a TO_CPU tag, and {Word1[11:10], Word0[23:19]} = SrcTrunk[6:0], which is the Trunk number extracted from the FORWARD DSA tag.
       When SrcTrg = 1: This field is always set to 0.

26     Truncated
       The packet sent to the CPU is truncated. Indicates that only the first 128 bytes of the packet are sent to the CPU. The packet’s original byte count is forwarded to the CPU in the <PktOrigBC> field (Word1 Bits[25:12]).

25:12  PktOrigBC
       The packet’s original byte count.

11     SrcTrunk[6]
       Bit[6] of the SrcTrunk[6:0] field.

10     SrcPort[5]/SrcTrunk[5]/TrgPort[5]
       Bit[5] of the SrcPort[5:0]/SrcTrunk[6:0]/TrgPort[5:0] field.

9      Reserved
       Must be set to 0.

8      SrcTrg
       Indicates the type of data forwarded to the CPU.
       0 = The packet was forwarded to the CPU by the Ingress pipe and this tag contains the packet’s source information.
       1 = The packet was forwarded to the CPU by the Egress pipe and this tag contains the packet’s destination information.

7:0    LongCPUCode
       8 bits of CPU code.
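The field layout in Table 76 can be exercised with a small decoder. The following Python sketch is illustrative only and is not part of the device or any Marvell software: the names `bits`, `ToCpuTag` and `parse_extended_to_cpu` are hypothetical, and it assumes the two tag words have already been read from the packet as 32-bit unsigned integers (the byte order on the wire is not covered here).

```python
from dataclasses import dataclass


def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] as an unsigned integer (hypothetical helper)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


@dataclass
class ToCpuTag:
    src_trg: int          # Word1[8]: 0 = source info, 1 = target info
    tagged: int           # Word0[29]: SrcTagged or TrgTagged
    dev: int              # Word0[28:24]: SrcDev or TrgDev
    number: int           # SrcPort[5:0], TrgPort[5:0] or SrcTrunk[6:0]
    is_trunk: bool        # True when `number` is a trunk number
    up: int               # Word0[15:13]
    vid: int              # Word0[11:0]
    truncated: bool       # Word1[26]
    orig_byte_count: int  # Word1[25:12]
    cpu_code: int         # Word1[7:0], LongCPUCode


def parse_extended_to_cpu(word0: int, word1: int) -> ToCpuTag:
    if bits(word0, 31, 30) != 0:
        raise ValueError("TagCommand is not TO_CPU")
    # CPU_Code[3:1] is Word0[18:16] and CPU_Code[0] is Word0[12];
    # the combined value 0xF marks the tag as extended.
    if ((bits(word0, 18, 16) << 1) | bits(word0, 12, 12)) != 0xF:
        raise ValueError("not an Extended TO_CPU tag")
    src_trg = bits(word1, 8, 8)
    is_trunk = src_trg == 0 and bits(word1, 27, 27) == 1
    low5 = bits(word0, 23, 19)
    if is_trunk:
        number = (bits(word1, 11, 10) << 5) | low5  # SrcTrunk[6:0]
    else:
        number = (bits(word1, 10, 10) << 5) | low5  # SrcPort/TrgPort[5:0]
    return ToCpuTag(
        src_trg=src_trg,
        tagged=bits(word0, 29, 29),
        dev=bits(word0, 28, 24),
        number=number,
        is_trunk=is_trunk,
        up=bits(word0, 15, 13),
        vid=bits(word0, 11, 0),
        truncated=bool(bits(word1, 26, 26)),
        orig_byte_count=bits(word1, 25, 12),
        cpu_code=bits(word1, 7, 0),
    )
```

The port or trunk number is reassembled from both words because Word0 carries only its five low bits; the high bits live in Word1 Bits[11:10].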
A.2 Extended DSA Tag in FROM_CPU Format

All packets sent from the Host CPU are sent with a FROM_CPU Extended DSA tag format. Table 77 describes the FROM_CPU Extended DSA tag format used for packets forwarded from the CPU to a port, to a group of ports, to another CPU in the system, or to a neighboring CPU.

Table 77: Extended FROM_CPU DSA Tag

Bits   Name
       Description

Word 0

31:30  Tag_Command
       1 = FROM_CPU, packet from CPU.

29     TrgTagged
       0 = Packet is sent via the network port untagged.
       1 = Packet is sent via the network port tagged.
       NOTE: This bit is relevant only when the use_vidx bit is 0. When use_vidx is 1, the destination tag format is according to the VID tag format in the target device.

Bits [28:19] when use_vidx = 1 (use_vidx is Word0[18])

28:19  VIDX[9:0]
       The low 10 bits of the VIDX according to which the packet is to be forwarded. When VIDX[11:0] is 0xFFF, the VIDX in the bridge is set to the VID. Otherwise it is a direct pointer to the VIDX table.
       NOTE: When the tag is not extended, the VIDX field is 9 bits: bit 19 is reserved and bits 28:20 are VIDX[8:0].

Bits [28:19] when use_vidx = 0 (use_vidx is Word0[18])

28:24  TrgDev
       Target Device to which the packet is forwarded.

23:19  TrgPort[4:0]
       Target port to which the packet is forwarded.
       NOTE: When the tag is extended, TrgPort is a 6-bit field. TrgPort[5] is in Word1 Bit 10.

18     use_vidx
       0 = Packet from the CPU is a Unicast packet forwarded to a target specified in this tag.
       1 = Packet from the CPU is a Multicast packet forwarded to the VLAN-ID and Multicast group index (VIDX) in this tag.

17     TC[0]
       The packet’s Traffic Class. TC is a 3-bit field: TC[0] is in Word0 Bit 17, TC[1] is in Word1 Bit 14, and TC[2] is in Word1 Bit 27.
       NOTE: When the tag is not extended, this field is set to 1.

16     CFI
       This bit is relevant only when the packet is transmitted tagged. It reflects the CFI bit with which the packet is to be transmitted via the network port.

15:13  UP
       The 802.1p User Priority field assigned to the packet. When the packet is transmitted via the network port with a VLAN Tag, this field is set to the packet’s 802.1p User Priority field.

12     Extend
       1 = There is one more DSA tag word.

11:0   VID
       VID assigned to the packet. This field is used by the switch to determine a Multicast packet’s Multicast group. When the packet is transmitted via the network port with a VLAN Tag, this field is set to the packet’s VLAN ID field.

Word 1

31     Extend
       Word1 is the last extension. This bit must be set to 0.

30     EgressFilterEn
       Egress filtering enable.
       0 = Packets from the CPU are not egress filtered. Unicast packets are forwarded regardless of the Egress port Span State or VLAN Membership. Multi-destination packets are forwarded to the Multicast group members specified in this tag, regardless of the target port’s Span state.
       1 = Packets from the CPU are Egress filtered.

29     Cascade Control
       Indicates which TC is assigned to the packet when it is forwarded over a cascading/stacking port.
       0 = If the packet is queued on a port that is enabled for Data QoS mapping (typically a cascade port), the packet is queued according to the data traffic {TC, DP} mapping table, which maps the DSA tag TC and DP to a cascade port TC and DP. On a port that is disabled for Data QoS mapping (a non-cascade port), the packet is queued according to the DSA tag TC and DP.
       1 = If the packet is queued on a port that is enabled for Control QoS mapping (typically a cascade port), the packet is queued according to the configured Control TC and DP. On a port that is disabled for Control QoS mapping (a non-cascade port), the packet is queued according to the DSA tag TC and DP.

28     EgressFilterRegistered
       Indicates that the packet is Egress filtered as a registered packet.
       0 = Packet is egress filtered as an Unregistered packet and is forwarded to an Egress port according to its type (Unicast or Multicast): according to the Egress port’s Port<n>UnkFilterEn configuration if the packet is Unicast, and according to its Port<n>UnregFilterEn configuration if the packet is Multicast.
       1 = Packet is egress filtered as a registered packet according to the members of the Multicast group.
       NOTE: When this field is 0, the type of the packet (Multicast or Unicast) is set according to the packet’s MAC DA[40].

27     TC[2]
       Bit[2] of the packet’s TC field.

26:25  DP[1:0]
       Packet’s drop precedence.

24:20  Src-ID
       Packet’s Source ID.

19:15  SrcDev
       Packet’s Source Device Number.

14     TC[1]
       Bit[1] of the packet’s TC field.

Bits [13:0] when use_vidx = 1 (use_vidx is Word0[18])

13:12  VIDX[11:10]
       The Multicast group to which the packet is transmitted.

11     ExcludeIsTrunk
       0 = A port is excluded from the Multicast group.
       1 = A trunk is excluded from the Multicast group.
       NOTE: To disable the exclusion of a port or a trunk from a Multicast group, set ExcludeIsTrunk to 1 and ExcludedTrunk to 0. When ExcludedTrunk = 0, no trunk or port is excluded from the Multicast group.

Bits [10:0] when ExcludeIsTrunk = 1

10:7   Reserved
       Must be set to 0.

6:0    ExcludedTrunk
       The excluded trunk number.
       NOTE: Setting this field to 0 disables the exclusion of a trunk from the Multicast group.

Bits [10:0] when ExcludeIsTrunk = 0

10:5   ExcludedPort
       Together with ExcludedDev, specifies the port that is excluded from the Multicast group.

4:0    ExcludedDev
       Together with ExcludedPort, specifies the port that is excluded from the Multicast group.

Bits [13:0] when use_vidx = 0 (use_vidx is Word0[18])

13     MailBoxToNeighborCPU
       Mail box to Neighbor CPU, for CPU-to-CPU mail box communication.
       NOTE: As a Mail message is sent to a CPU with an unknown Device Number, TrgDev must be set to the local device number and TrgPort must be set to the number of the Cascading port in the local device through which this packet is to be transmitted.

12:11  Reserved
       Must be set to 0.

10     TrgPort[5]
       Together with Word0 Bits[23:19], this is the target port, TrgPort[5:0], through which the packet is forwarded.

9:0    Reserved
       Must be set to 0.

A.3 Extended DSA Tag in TO_ANALYZER Format

All packets forwarded to the analyzer port are sent with a TO_ANALYZER Extended DSA tag format. Table 78 describes the TO_ANALYZER Extended DSA tag format.

Table 78: Extended TO_ANALYZER DSA Tag

Bits   Name
       Description

Word 0

31:30  TagCommand
       2 = TO_ANALYZER. To Target sniff port, no bridging, no egress filtering.

29     SrcTagged/TrgTagged
       When rx_sniff = 1: This field is SrcTagged.
       SrcTagged = 0: Packet was received from a regular network port untagged and is forwarded to the Ingress analyzer untagged.
       SrcTagged = 1: Packet was received from a regular network port tagged and is forwarded to the Ingress analyzer tagged, with the same VID and UP with which it was received.
       When rx_sniff = 0: This field is TrgTagged.
       TrgTagged = 0: Packet was sent via a regular network port untagged and is forwarded to the Egress analyzer untagged.
       TrgTagged = 1: Packet was sent via a regular network port tagged and is forwarded to the Egress analyzer tagged.
       NOTE: rx_sniff is in Word0 Bit 18.

28:24  SrcDev/TrgDev
       When rx_sniff = 1, or when rx_sniff = 0 and the tag is not extended: Together with SrcPort[5:0], SrcDev indicates the packet’s original Ingress port.
       When rx_sniff = 0 and the tag is extended: Together with TrgPort[5:0], TrgDev indicates the packet’s Egress port.
       NOTE: When the tag is not extended, this field always indicates SrcDev.

23:19  SrcPort[4:0]/TrgPort[4:0]
       When rx_sniff = 1, or when rx_sniff = 0 and the tag is not extended: SrcPort. Together with SrcDev, SrcPort[5:0] indicates the packet’s original Ingress port.
       When rx_sniff = 0 and the tag is extended: TrgPort. Together with TrgDev, TrgPort[5:0] indicates the packet’s Egress port.
       NOTE: When this tag is extended, the port is a 6-bit field. SrcPort[5]/TrgPort[5] is in Word1 Bit 10. When this tag is not extended, this field always indicates SrcPort.

18     rx_sniff
       0 = Packet was Tx sniffed and is forwarded to the Target Tx sniffer.
       1 = Packet was Rx sniffed and is forwarded to the Target Rx sniffer.

17     Reserved
       Must be set to 0.

16     CFI
       VLAN Tag CFI bit.
       When rx_sniff = 1: This bit is relevant only if SrcTagged = 1 and it reflects the CFI bit with which the packet was received from the network port.
       When rx_sniff = 0: This bit is relevant only if TrgTagged = 1 and it reflects the CFI bit with which the packet was transmitted to the network port.

15:13  UP
       802.1p User Priority field assigned to the packet.
       When rx_sniff = 1: This field contains the packet’s incoming UP if SrcTagged = 1, or the UP assigned to the packet by the Ingress pipe of the device when SrcTagged = 0.
       When rx_sniff = 0: This field contains the packet’s outgoing UP if TrgTagged = 1, or the UP assigned to the packet by the Ingress pipe of the device when TrgTagged = 0.

12     Extend
       1 = There is one more DSA tag word.

11:0   VID
       The VID assigned to the packet.
       When rx_sniff = 1: This field contains the packet’s incoming VID if SrcTagged = 1, or the VID assigned to the packet by the Ingress pipe of the device when SrcTagged = 0.
       When rx_sniff = 0: This field contains the packet’s outgoing VID if TrgTagged = 1, or the VID assigned to the packet by the Ingress pipe of the device when TrgTagged = 0.

Word 1

31     Extend
       Word1 is the last extension. This bit must be set to 0.

30:11  Reserved
       Must be set to 0.

10     SrcPort[5]/TrgPort[5]
       Bit[5] of SrcPort[5:0]/TrgPort[5:0].

9:0    Reserved
       Must be set to 0.

A.4 Extended DSA Tag in FORWARD Format

Packets forwarded in a regular way through cascading ports are forwarded with the FORWARD Extended DSA tag format.
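As a rough illustration of how the FORWARD fields defined in Table 79 fit together, the following Python sketch reassembles the source and target of a packet from the two tag words. It is a sketch under stated assumptions, not device or vendor code: `bits` and `parse_extended_forward` are hypothetical names, and the two words are assumed to be available as 32-bit unsigned integers.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] as an unsigned integer (hypothetical helper)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def parse_extended_forward(word0: int, word1: int) -> dict:
    if bits(word0, 31, 30) != 3:
        raise ValueError("TagCommand is not FORWARD")
    if bits(word0, 12, 12) != 1:
        raise ValueError("tag is not extended (Word0 Extend bit is 0)")

    low5 = bits(word0, 23, 19)
    if bits(word0, 18, 18):  # SrcIsTrunk
        # SrcTrunk[6:5] is Word1[30:29], SrcTrunk[4:0] is Word0[23:19].
        source = ("trunk", (bits(word1, 30, 29) << 5) | low5)
    else:
        # SrcPort[5] is Word1[29], SrcPort[4:0] is Word0[23:19].
        source = ("port", (bits(word1, 29, 29) << 5) | low5)

    if bits(word1, 12, 12):  # use_vidx
        target = {"vidx": bits(word1, 11, 0)}
    else:
        target = {"dev": bits(word1, 4, 0), "port": bits(word1, 10, 5)}

    return {
        "src_tagged": bits(word0, 29, 29),
        "src_dev": bits(word0, 28, 24),
        "source": source,
        "up": bits(word0, 15, 13),
        "vid": bits(word0, 11, 0),
        "routed": bool(bits(word1, 25, 25)),
        "src_id": bits(word1, 24, 20),
        "qos_profile": bits(word1, 19, 13),
        "target": target,
    }
```

Returning the source as a (kind, number) pair keeps trunk and port numbers from being confused, since the same Word0 bits carry either one depending on SrcIsTrunk.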
Table 79 describes the FORWARD format Table 79: Bits Extended FORWARD DSA Tag Name Des cription 31:30 TagCommand 3 = FORWARD 29 SrcTagged This bit indicates the VLAN Tag format, SrcTagged, in which the packet was received on the network port: 0 = Packet was received from a network port untagged. 1 = Packet was received from a network port tagged. SrcDev Source Device from which the packet was received. Word 0 28:24 Bits [23:19] when SrcIsTrunk = 1 (SrcIsTrunk is Word0 Bit 18) 23:19 SrcTrunk[4:0] Source trunk number on which the packet was received. NOTE: When this tag is extended, SrcTrunk[6:0] is a 7-bit field. SrcTrunk[6:5] are in Word1 Bits[29:30] Bits [23:19] when SrcIsTrunk = 0 (SrcIsTrunk is Word0 Bit 18) SrcPort[4:0] Together with SrcDev, SrcPort indicates the packet’s source port number. NOTE: When this tag is extended, SrcPort[5:0] is a 6-bit field. SrcPort[5] is in Word1 Bit[29] 18 SrcIsTrunk If the packet was received from a network port that is part of a trunk, this bit is set to 1 and Bits [23:19] contain the source trunk number. If the packet was received from a network port that is not part of a trunk, this bit is set to 0 and Bits [23:19] contain the source port number. Reserved Set to ZERO. 16 CFI When SrcTagged = 1 This is VLAN Tag CFI bit with which the packet was received from the network port When SrcTagged = 0 This is CFI bit assigned to the packet by the device, set to 0 15:13 UP This field contains the 801.p User Priority (UP) assigned to the packet by the Ingress pipe of the device. 12 Extend 1 = There is one more DSA tag word. 11:0 VID This field contains the VID assigned to the packet by the device’s Ingress pipe. Extend Word0 is the last extension. This bit must be set to 0. 
17 Wo rd 1 31 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 23:19 Bits [30:29] when SrcIsTrunk = 1 (SrcIsTrunk is Word0 bit[18]) 30:29 SrcTrunk[6:5] If the packet was received on a Trunk Port, together with Word0 Bits[23:19], this field contains the source trunk number SrcTrunk[6:0]. Bits [30:29] when SrcIsTrunk = 0 (SrcIsTrunk is Word0 bit[18]) 30 Reserved Must be set to 0. 29 SrcPort[5] If the packet was received on a non-Trunk Port, together with Word0 Bits[23:19], this field contains the source port number SrcTrunk[5:0]. Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 341 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 . M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Extended DSA Tag in FORWARD Format Name Des cription EgressFilter Registered Indicates that the packet is Egress filtered as a Registered packet. 0 = Packet is egress filtered as an Unregistered packet. 1 = Packet is egress filtered as a registered packet. NOTE: When this field is 0, the type of the packet—Multicast or Unicast—is set according to the packet’s MAC DA[40]. Reserved Must be set to 0. Routed Indicates whether the packet is routed. 0 = Packet has not be Layer 3 routed. 1 = Packet has been Layer 3 routed. AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Bits Extended FORWARD DSA Tag (Continued) 28 27:26 25 24:20 19:13 12 Src-ID Packet’s Source ID. QoSProfile Packet’s QoS Profile. use_vidx 0 = Unicast packet forwarded to the Target port specified in this tag. 1 = Multicast packet forwarded to the Multicast group specified in this tag. Bits [11:0] when use_Vidx = 1 (use_Vidx is Word1[12]) 11:0 VIDX[11:0] Multicast group to which the packet is transmitted. Bits [11:0] when use_Vidx = 0 (use_Vidx is Word1[12]) 10:5 4:0 Reserved Must be set to 0. 
TrgPort Together with TrgDev, specifies the target port to which the packet is forwarded. TrgDev Together with TrgPort, specifies the target port to which the packet is forwarded. MV-S102110-02 Rev. E Page 342 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 11 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Table 79: M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Appendix B. CPU Codes Table 80: CPU Codes Code N ame De scription 1–0 Reserved Reserved. BPDU TRAP BPDU packet trapped to the host CPU. FDB ENTEY TRAP/MIRROR Packet trapped or mirrored to the host CPU due to FDB Entry command set to Trap or Mirror on the packet’s MAC SA or MAC DA. Reserved Reserved. ARP BROADCAST TRAP/MIRROR ARP Broadcast packet trapped or mirrored to the host CPU. IPv4 IGMP TRAP/MIRROR IPv4 IGMP packet trapped or mirrored to the host CPU. Reserved Reserved. May not be used for user-defined CPU codes. 2 3 4 5 6 7 8 9 LEARN DISABLE UNKNOWN SOURCE MAC ADDR TRAP Reserved M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Code No. Packet with unknown Source Address received on a learning disabled port with learning disable command set to “Trap packets with unknown Source Address”. Reserved. 10 LEARN DISABLE UNKNOWN SOURCE MAC ADDR MIRROR 11-12 Reserved 13 IEEE RESERVED MULTICAST ADDR TRAP/MIRROR 14 IPv6 ICMP TRAP/MIRROR 15 Reserved 16 IPv4/IPv6 LINK-LOCAL MULTICAST DIP TRAP/MIRROR Relevant for the Multilayer stackable switches only. 
IPv4 or IPv6 packets with DIP in the Link Local Range trapped or mirrored to the host CPU.

Packet with unknown Source Address received on a learning disabled port with learning disable command set to “Mirror packets with unknown Source Address”.
Reserved. May not be used for user-defined CPU codes.
Packet with MAC DA in the IEEE Reserved Multicast range trapped or mirrored to the host CPU.
IPv6 ICMP packet trapped or mirrored to the host CPU.
Reserved.

This section describes the CPU codes.

Table 80: CPU Codes (Continued)

| Code No. | Code Name | Description |
|---|---|---|
| 17 | IPv4 RIPv1 MIRROR | Relevant for the Multilayer stackable switches only. IPv4 RIPv1 packet mirrored to the host CPU. |
| 18 | IPv6 NEIGHBOR SOLICITATION TRAP/MIRROR | IPv6 neighbor solicitation packet trapped or mirrored to the host CPU. |
| 19 | IPv4 BROADCAST TRAP/MIRROR | IPv4 Broadcast packet assigned to a VLAN with IPv4 Broadcast Command set to Trap or Mirror. |
| 20 | NON IPv4 BROADCAST TRAP/MIRROR | Non IPv4 Broadcast packet assigned to a VLAN with non IPv4 Broadcast Command set to Trap or Mirror. |
| 21 | CISCO CONTROL MULTICAST MAC ADDR TRAP/MIRROR | A Multicast packet with a MAC DA in the CISCO AUI range trapped or mirrored to the host CPU. |
| 22 | BRIDGED NON IPv4/IPv6 UNREGISTERED MULTICAST TRAP/MIRROR | Non IPv4/IPv6 Unregistered Multicast packet assigned to a VLAN with non IPv4/IPv6 Unregistered Multicast Command set to Trap or Mirror. |
| 23 | BRIDGED IPv4 UNREGISTERED MULTICAST TRAP/MIRROR | IPv4 Unregistered Multicast packet assigned to a VLAN with IPv4 Unregistered Multicast Command set to Trap or Mirror. |
| 24 | BRIDGED IPv6 UNREGISTERED MULTICAST TRAP/MIRROR | IPv6 Unregistered Multicast packet assigned to a VLAN with IPv6 Unregistered Multicast Command set to Trap or Mirror. |
| 25-64 | Reserved | Reserved. |
| 64 | ROUTED PACKET FORWARD | Packet forwarded to the host CPU by the Router (“IP to Me”). |
| 65 | BRIDGED PACKET FORWARD | Packet forwarded to the host CPU by one of the following engines in the device: redirected by the PCL engine to the CPU port number; MAC DA is associated with the CPU port number; Private VLAN edge target set to CPU port number. |
| 66 | INGRESS MIRRORED TO ANALYZER | Ingress mirrored packets to the analyzer port, when the ingress analyzer port number is set to the CPU port number. |
| 67 | EGRESS MIRRORED TO ANALYZER | Egress mirrored packets to the analyzer port, when the egress analyzer port number is set to the CPU port number. |
| 68 | MAIL FROM NEIGHBOR CPU | A packet sent to the local host CPU as a mail from the neighboring CPU. |
| 69 | CPU TO CPU | A packet sent to the local host CPU, from one of the other host CPUs in the system. |
| 70 | EGRESS SAMPLED | Packet mirrored to the host CPU by the egress STC mechanism. |
| 71 | INGRESS SAMPLED | Packet mirrored to the host CPU by the ingress STC mechanism. |
| 72-73 | Reserved | Reserved. |
| 74 | INVALID USER DEFINED SELECTED BYTES ON PCL KEY TRAP/MIRROR | Packet trapped/mirrored to the host CPU by the Policy Engine due to the following: User-defined bytes in the key could not be parsed due to packet header length or its format. |
| 75-136 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 133 | IPV4_UC_TTL_EXCEEDED | IPv4 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to TTL exceeded. |
| 135 | IPV6_UC_HOP_LIMIT_EXCEEDED | IPv6 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to Hop Limit exceeded. |
| 137 | IPV4_UC_HEADER_ERROR | Relevant for the Multilayer stackable switches only. IPv4 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to IP header Error. |
| 138 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 139 | IPV6_UC_HEADER_ERROR | Relevant for the Multilayer stackable switches only. IPv6 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to IP header Error. |
| 140 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 141 | IPV4_UC_OPTIONS | Relevant for the Multilayer stackable switches only. IPv4 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to IP header Options. |
| 142 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 143 | IPV6_UC_OPTIONS | Relevant for the Multilayer stackable switches only. IPv6 Unicast packet, triggered for routing, Trapped/Mirrored to the CPU due to IP header Hop-By-Hop Options. |
| 144 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 146 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 148-159 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 160 | IPV4_UC_ROUTE0 | Relevant for the Multilayer stackable switches only. IPv4 Unicast packet, triggered for routing, Trapped/Mirrored to CPU from NHE Entry with NHE<CPUCodeIndex> = 0. |
| 161-167 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 168 | IPV6_UC_ROUTE0 | Relevant for the Multilayer stackable switches only. IPv6 Unicast packet, triggered for routing, Trapped/Mirrored to CPU from NHE Entry with NHE<CPUCodeIndex> = 0. |
| 169-179 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 180 | IPV4_UC_ICMP_REDIRECT | Relevant for the Multilayer stackable switches only. IPv4 Unicast packet, triggered for routing, Mirrored to CPU due to ICMP redirect. |
| 181 | IPV6_UC_ICMP_REDIRECT | Relevant for the Multilayer stackable switches only. IPv6 Unicast packet, triggered for routing, Mirrored to CPU due to ICMP redirect. |
| 182-190 | Reserved | Reserved. May not be used for user-defined CPU codes. |
| 191 | PACKET_TO_VIRTUAL_ROUTER_PORT | Relevant for the Multilayer stackable switches only. Packet forwarded to the virtual router port and trapped to the CPU, because it was not routed or because it was directed to the CPU by the router. |
| 192-255 | User-Defined | User-Defined CPU codes. The user may use this range for any application specific purpose. |
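Host software that receives packets from the device typically dispatches on the CPU code. The following is a minimal sketch of such a dispatch helper, not an API of the device or of any Marvell driver. The numeric values come from the CPU codes table above; the enum and function names (`cpu_code`, `cpu_code_is_user_defined`, `cpu_code_name`) are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers; only the numeric values are taken from Table 80. */
enum cpu_code {
    CPU_CODE_ROUTED_PACKET_FORWARD         = 64,
    CPU_CODE_BRIDGED_PACKET_FORWARD        = 65,
    CPU_CODE_EGRESS_SAMPLED                = 70,
    CPU_CODE_INGRESS_SAMPLED               = 71,
    CPU_CODE_PACKET_TO_VIRTUAL_ROUTER_PORT = 191,
    CPU_CODE_USER_DEFINED_FIRST            = 192,
    CPU_CODE_USER_DEFINED_LAST             = 255
};

/* True when the code falls in the user-defined range 192-255.
 * The argument is 8 bits wide, so only the lower bound needs checking. */
static bool cpu_code_is_user_defined(uint8_t code)
{
    return code >= CPU_CODE_USER_DEFINED_FIRST;
}

/* Map a few of the codes from the table to printable names.
 * Codes not listed here are reported as "OTHER". */
static const char *cpu_code_name(uint8_t code)
{
    switch (code) {
    case CPU_CODE_ROUTED_PACKET_FORWARD:         return "ROUTED PACKET FORWARD";
    case CPU_CODE_BRIDGED_PACKET_FORWARD:        return "BRIDGED PACKET FORWARD";
    case CPU_CODE_EGRESS_SAMPLED:                return "EGRESS SAMPLED";
    case CPU_CODE_INGRESS_SAMPLED:               return "INGRESS SAMPLED";
    case CPU_CODE_PACKET_TO_VIRTUAL_ROUTER_PORT: return "PACKET_TO_VIRTUAL_ROUTER_PORT";
    default:
        return cpu_code_is_user_defined(code) ? "USER-DEFINED" : "OTHER";
    }
}
```

A receive handler could call `cpu_code_name(code)` for logging and use `cpu_code_is_user_defined(code)` to route application-specific traffic separately from the codes the device assigns itself. How the CPU code is extracted from a received packet is outside the scope of this sketch.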
Register Set Appendix
MV-S102110-02, Rev. E
August 24, 2006
All other trademarks are the property of their respective owners. MV-S102110-02 Rev. E Page 350 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Note M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification C.2 Registers Overview .........................................................................................................374 Global, TWSI Interface and CPU Port Configuration Registers ..................................376 Table 84: Global Control Register .............................................................................................................................. 377 Table 85: Address Completion Register..................................................................................................................... 379 Table 86: Sampled at Reset Register......................................................................................................................... 379 Table 87: Device ID Register...................................................................................................................................... 381 Table 88: Vendor ID Register ..................................................................................................................................... 382 Table 89: User Defined Register0 .............................................................................................................................. 382 Table 90: User-Defined Register1 .............................................................................................................................. 
382 Table 91: User-Defined Register2 .............................................................................................................................. 383 Table 92: User-Defined Register3 .............................................................................................................................. 383 Table 93: Address Update Queue Base Address Register ........................................................................................ 384 Table 94: Address Update Queue Control Register ................................................................................................... 384 Offset: 0x00000058 Offset: 0x00000000 Offset: 0x00000004 Offset: 0x0000004C Offset: 0x00000050 Offset: 0x000000F0 Offset: 0x000000F4 Offset: 0x000000FC Offset: 0x000000C0 Offset: 0x000000C4 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Offset: 0x000000F8 Table 95: Interrupt Coalescing Configuration Register............................................................................................... 384 Table 96: TWSI Global Configuration Register .......................................................................................................... 385 Table 97: TWSI Last Address Register ...................................................................................................................... 386 Table 98: TWSI Timeout Limit Register...................................................................................................................... 386 Table 99: TWSI State History Register....................................................................................................................... 386 Offset: 0x000000E0 Offset: 0x00000010 Offset: 0x00000014 Offset: 0x00000018 Offset: 0x00000020 Table 100: CPU Port Global Configuration Register .................................................................................................... 
387 Offset: 0x000000A0 Table 101: CPU Port GoodFramesSent Counter ......................................................................................................... 387 Offset: 0x00000060 Table 102: CPU Port MACTransErrorFramesSent Counter......................................................................................... 387 Offset: 0x00000064 Table 103: CPU Port GoodOctetsSent Counter ........................................................................................................... 388 Offset: 0x00000068 Table 104: CPU Port GoodFramesReceived Counter.................................................................................................. 388 Offset: 0x00000070 Table 105: CPU Port BadFramesReceived Counter .................................................................................................... 388 Offset: 0x00000074 Table 106: CPU Port GoodOctetsReceived Counter ................................................................................................... 388 Offset: 0x00000078 Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 351 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 C.1 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 List of Registers M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED List of Registers Table 107: CPU Port BadOctetsReceived Counter ..................................................................................................... 389 Offset: 0x0000007C AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 108: CPU Port Rx Internal Drop Counter ........................................................................................................... 
389 Offset: 0x0000006C C.3 GPP Configuration Registers ........................................................................................ 390 Table 110: GPP Output Register ................................................................................................................................. 391 Offset: 0x018001BC Table 112: GPP I/O Control Register........................................................................................................................... 391 Offset: 0x018001C8 C.4 PCI SDMA Registers....................................................................................................... 392 Table 114: SDMA Configuration Register .................................................................................................................... 393 Offset: 0x00002800 Table 115: Receive SDMA Queue Command Register ............................................................................................... 394 Offset: 0x00002680 Table 116: Transmit SDMA Queue Command Register .............................................................................................. 395 Offset: 0x00002868 Table 117: Transmit SDMA Current Descriptor Pointer<n> Register (0<=n<8)........................................................... 396 Offset: Ptr0: 0x000026C0, Prt1: 0x000026C4 ... Ptr7: 0x000026DC (8 Registers in Steps of 0x4) Table 118: Transmit SDMA Fixed Priority Configuration Register ............................................................................... 396 Offset: 0x00002870 Table 119: Transmit SDMA WRR Token Parameters Register ................................................................................... 396 Offset: 0x00002874 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 120: Transmit SDMA Weighted-Round-Robin Queue<n> Configuration Register (0<=n<8)............................. 397 Offset: Q0: 0x00002708, Q1: 0x00002718 ... 
Q7: 0x00002778 (8 Registers in steps of 0x10) Table 121: Receive SDMA Status Register ................................................................................................................. 397 Offset: 0x0000281C Table 122: Receive SDMA Current Descriptor Pointer<n> Register (0<=n<8)............................................................ 397 Offset: RxPrt0: 0x0000260C, RxPrt1: 0x0000261C ... RxPrt7: 0x0000267C (8 Registers in Steps of 0x10) Table 123: Tx SDMA Token-Bucket Queue<n> Counter (0<=n<8) ............................................................................. 397 Offset: Q0: 0x00002700, Q1:0x00002710 ... Q7: 0x00002770 (8 registers in steps of 0x10) Table 126: Receive SDMA<n> Packet Count Register (0<=n<8) ................................................................................ 398 Offset: RxSDMA0: 0x00002820, RxSDMA1: 0x00002820 ... RxSDMA7: 0x0000283C (8 Registers in steps of 0x4) Table 127: Receive SDMA<n> Byte Count Register (0<=n<8).................................................................................... 398 Offset: RxSDMA0: 0x00002840, RxSDMA1: 0x00002840 ... RxSDMA7: 0x0000285C (8 Registers in steps of 0x4) Table 124: Tx SDMA Token Bucket Queue<n> Configuration .................................................................................... 398 Offset: Qt0: 0x00002704, Q1: 0x00002714 ... Q7: 0x00002774 (8 registers in steps of 0x10) Table 125: Tx SDMA Token-Bucket Counter............................................................................................................... 398 Offset: 0x00002780 Table 128: Receive SDMA Resource Error Count 0 Register ..................................................................................... 399 Offset: 0x00002860 Table 129: Receive SDMA Resource Error Count 1 Register ..................................................................................... 399 Offset: 0x00002864 MV-S102110-02 Rev. 
E Page 352 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Offset: 0x018001C4 Table 111: GPP Input Register .................................................................................................................................... 391 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification C.5 Master XSMI Interface Configuration Registers ...........................................................400 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 131: XSMI Management Register ...................................................................................................................... 401 Offset: 0x01CC0000 Table 132: XSMI Address Register .............................................................................................................................. 403 Offset: 0x01CC0008 C.6 Router Header Alteration Configuration Registers......................................................404 Offset: 0x07800100 Table 135: Router Header Alteration Enable MAC SA Modification Register .............................................................. 405 Offset: 0x07800104 Table 136: Router MAC SA Base Register0 ................................................................................................................ 405 Offset: 0x07800108 Table 137: Router MAC SA Base Register1 ................................................................................................................ 406 Offset: 0x0780010C Table 138: Device ID Modification Enable Register ..................................................................................................... 
406 Offset: 0x07800110 Table 139: Router ARP DA Table Entry<n> (0<=n<1024) ........................................................................................... 406 Offset: N/A Table 140: VLAN/Port MAC SA Entry<n> (0<=n<4096)............................................................................................... 407 Offset: N/A Table 141: Router Header Alteration Tables Access Control Register ........................................................................ 408 Offset: 0x07800208 Table 142: Router Header Alteration Tables Access Data0 Register .......................................................................... 408 Offset: 0x07800200 Offset: 0x07800204 C.7 M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 143: Router Header Alteration Tables Access Data1 Register .......................................................................... 408 Tri-Speed Ports MAC, CPU Port MAC, and SGMII Configuration Registers..............410 Table 145: Port<n> MAC Control Register0 (0<=n<24, CPUPort = 63)....................................................................... 411 Offset: P0: 0x0A800000, P1: 0x0A800100... P23: 0x0A801700 (24 Registers in steps of 0x100) CPUPort: 0x0A803F00 Table 146: Port<n> MAC Control Register1 (0<=n<24, CPUPort = 63)....................................................................... 412 Offset: P0: 0x0A800004, P1: 0x0A800104... P23: 0x0A801704 (24 Registers in steps of 0x100) CPUPort: 0x0A803F04 Table 147: Port<n> MAC Control Register2 (0<=n<24, CPUPort = 63)....................................................................... 414 Offset: P0: 0x0A800008, P1: 0x0A800108... P23: 0x0A801708 (24 Registers in steps of 0x100) CPUPort: 0x0A803F08 Table 148: Port<n> Auto-Negotiation Configuration Register (0<=n<24, CPUPort = 63) ............................................ 415 Offset: P0: 0x0A80000C, P1: 0x0A80010C... 
P23: 0x0A80170C (24 Registers in steps of 0x100) CPUPort: 0x0A803F0C Table 149: Port<n> Status Register0 (0<=n<24, CPUPort = 63) ................................................................................. 418 Offset: P0: 0x0A800010, P1: 0x0A800110... P23: 0x0A801710 (24 Registers in steps of 0x100) CPUPort: 0x0A803F10 Table 150: Port<n> Status Register1 (0<=n<24, CPUPort = 63) ................................................................................. 420 Offset: P0: 0x0A800040, P1: 0x0A800140... P23: 0x0A801740 (24 Registers in steps of 0x100) CPUPort: 0x0A803F40 Table 151: Port<n> Serial Parameters Configuration Register (0<=n<24, CPUPort = 63) .......................................... 421 Offset: P0: 0x0A800014, P1: 0x0A800114... P23: 0x0A801714 (24 Registers in steps of 0x100) CPUPort: 0x0A803F14 Table 152: Port<n> SERDES Configuration Register0 (0<=n<24)............................................................................... 422 Offset: P0: 0x0A800028, P1: 0x0A800128... P23: 0x0A801728 (24 registers in steps of 0x100) Table 153: Port<n> SERDES Configuration Register1 (0<=n<24)............................................................................... 423 Offset: P0: 0x0A80002C, P1: 0x0A80012C... P23: 0x0A80172C (24 registers in steps of 0x100) Copyright © 2006 Marvell August 24, 2006, Preliminary CONFIDENTIAL Document Classification: Restricted Information MV-S102110-02 Rev. E Page 353 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Table 134: Router Header Alteration Global Configuration Register............................................................................ 404 M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED List of Registers Table 154: Port<n> SERDES Configuration Register2 (0<=n<24) .............................................................................. 423 Offset: P0: 0x0A800030, P1: 0x0A800130... 
P23: 0x0A801730 (24 registers in steps of 0x100) AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 155: Port<n> SERDES Configuration Register3 (0<=n<24) .............................................................................. 424 Offset: P0: 0x0A800034, P1: 0x0A800134... P23: 0x0A801734 (24 registers in steps of 0x100) Table 156: Port<n> PRBS Status Register (0<=n<24) ................................................................................................ 425 Offset: P0: 0x0A800038, P1: 0x0A800138... P23: 0x0A801738(24 Registers in steps of 0x100) Table 157: Port<n> PRBS Error Counter (0<=n<24) ................................................................................................... 425 C.8 HyperG.Stack and HX/QX Ports MAC, Status, and MIB Counters, and XAUI Control Configuration Registers ......................................................................... 426 Table 159: Port<n> MAC Control Register0 (24<=n<27)............................................................................................. 427 Offset: P24: 0x0A801800, P25: 0x0A801900, P26: 0x0A801A00 Table 160: Port<n> MAC Control Register1 (24<=n<27)............................................................................................. 428 Offset: P24: 0x0A801804, P25: 0x0A801904, P26: 0x0A801A04 Table 161: Port<n> MAC Control Register2 (24<=n<27)............................................................................................. 429 Offset: P24: 0x0A801808, P25: 0x0A801908, P26: 0x0A801A08 Table 162: Port<n> Status Register (24<=n<27) ......................................................................................................... 429 Offset: P24: 0x0A80180C, P25: 0x0A80190C, P26: 0x0A801A0C Table 163: HyperG.Stack and HX/QX Ports MIB Counters and XSMII Configuration Register................................... 
430 Offset: 0x01800180 Table 164: HyperG.Stack and HX/QX Ports MAC MIB Counters (Ports 24–26) ......................................................... 432 Offset: Port24: 0x01C00000–0x01C0007C, Port24Capture: 0x01C00080–0x01C000FC, Port25: 0x01C40000–0x01C4007C, Port25Capture: 0x01C40080–0x01C400FC, Port26: 0x01C80000–0x01C8007C, Port24Capture: 0x01C80080–0x01C800FC M AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 165: Port<n> XAUI PHY and HX/QX PCS Configuration Register0 (24<=n<27)............................................... 435 Offset: P24: 0x0A80181C, P25: 0x0A80191C, P26: 0x0A801A1C Table 166: Port<n> XAUI PHY Configuration Register1 (24<=n<27) .......................................................................... 436 Offset: P24: 0x0A801820, P25: 0x0A801920, P26: 0x0A801A20 C.9 XAUI PHY Configuration Registers............................................................................... 438 Table 168: Control 1Device 3,4,5 Register .................................................................................................................. 440 Offset: 0x0000 Table 169: Status Register 1 Device 3,4,5................................................................................................................... 441 Offset: 0x0001 Table 170: PHY Identifier 1 Device 3,4,5, Register...................................................................................................... 441 Offset: 0x0002 Table 171: PHY Identifier 2 Device 3,4,5, Register...................................................................................................... 442 Offset: 0x0003 Table 172: Speed Ability Register Bit Definitions Device 3,4,5, Register .................................................................... 442 Offset: 0x0004 Table 173: Devices In Package Register Bit Definitions Device 3,4,5, Register.......................................................... 
442 Offset: 0x0005 Table 174: Devices In Package Register Bit Definitions Device 3,4,5, Register.......................................................... 443 Offset: 0x0006 Table 175: Control Device 3,4,5, Register2 ................................................................................................................. 443 Offset: 0x0007 Table 176: Status Register Device 3,4,5, Register2 .................................................................................................... 443 Offset: 0x0008 Table 177: Package Identifier Device 3,4,5, Register .................................................................................................. 444 Offset: 0x000E Table 178: Package Identifier Device 3,4,5, Register .................................................................................................. 445 Offset: 0x000F MV-S102110-02 Rev. E Page 354 CONFIDENTIAL Document Classification: Restricted Information Copyright © 2006 Marvell August 24, 2006, Preliminary 4duun-fnzjmsxe * Opnet Technologies * UNDER NDA# 12101786 Offset: P0: 0x0A80003C, P1: 0x0A80013C... P23: 0x0A80173C (24 Registers in steps of 0x100) M MARVELL CONFIDENTIAL - UNAUTHORIZED DISTRIBUTION OR USE STRICTLY PROHIBITED Prestera®-DX SecureSmart, SecureSmart Stackable, Layer 2+ Stackable, and Multilayer Stackable Switches Functional Specification Table 179: 10GBASE-X PCS Status Device 3,4,5, Register........................................................................................ 445 Offset: 0x0018 AR VE 4du LL un CO -fn NF zjm ID sx EN e TI * O AL pn , U et ND Te ER chn ND olo A# gie 12 s 10 17 86 Table 180: 10GBASE-X PCS Test Control Device 3,4,5, Register .............................................................................. 446 Offset: 0x0019 Table 181: Mode/Lane 0 Control Device 3,4,5, Register.............................................................................................. 
446
Offset: 0x8000
Table 182: Lane 1 Control Device 3,4,5, Register .......... 447
Offset: 0x8001
Table 183: Lane 2 Control Device 3,4,5, Register .......... 447
Offset: 0x8002
Table 184: Lane 3 Control Device 3,4,5, Register .......... 448
Offset: 0x8003
Table 185: LED 0, 1 Control Device 3,4,5, Register .......... 448
Offset: 0x8004
Table 186: LED 2, 3 Control Device 3,4,5, Register .......... 449
Offset: 0x8005
Table 187: LED 4, 5 Control Device 3,4,5, Register .......... 451
Offset: 0x8006
Table 188: LED 6, 7 Control Device 3,4,5, Register .......... 452
Offset: 0x8007
Table 189: Interrupt Enable Device 3,4,5, Register .......... 453
Offset: 0x8008
Table 190: Interrupt Device 3,4,5, Register .......... 454
Offset: 0x8009
Table 191: Status Device 3,4,5, Register .......... 455
Offset: 0x800A
Table 192: Transmit FIFO Control Pad/Truncate Disparity A Byte 0 Device 3,4,5, Register .......... 455
Offset: 0x8010
Table 193: Transmit FIFO Control Pad/Truncate Disparity A Byte 1 Device 3,4,5, Register .......... 456
Offset: 0x8012
Table 194: Receive FIFO Control Pad/Truncate Disparity A Byte 0 Device 3,4,5, Register .......... 456
Offset: 0x8018
Table 195: Receive FIFO Control Pad/Truncate Disparity A Byte 1 Device 3,4,5, Register .......... 456
Offset: 0x801A
Table 196: Random Sequence Error Counter LSB Lane 0/Jitter Pattern Error Counter LSB Lane 0/Jitter Packet Error Counter LSB Device 3,4,5, Register .......... 457
Offset: 0x8020
Table 197: Random Sequence Error Counter MSB Lane 0/Jitter Pattern Error Counter MSB Lane 0/Jitter Packet Error Counter MSB Device 3,4,5, Register .......... 457
Offset: 0x8021
Table 198: Random Sequence Error Counter LSB Lane 1/Jitter Pattern Error Counter LSB Lane 1/Jitter Packet Received Counter LSB Device 3,4,5, Register .......... 457
Offset: 0x8022
Table 199: Random Sequence Error Counter MSB Lane 1/Jitter Pattern Error Counter MSB Lane 1/Jitter Packet Received Counter MSB Device 3,4,5, Register .......... 458
Offset: 0x8023
Table 200: Random Sequence Error Counter LSB Lane 2/Jitter Pattern Error Counter LSB Lane 2 Device 3,4,5, Register .......... 458
Offset: 0x8024
Table 201: Random Sequence Error Counter MSB Lane 2/Jitter Pattern Error Counter MSB Lane 2 Device 3,4,5, Register .......... 458
Offset: 0x8025
Table 202: Random Sequence Error Counter LSB Lane 3/Jitter Pattern Error Counter LSB Lane 3 Device 3,4,5, Register .......... 458
Offset: 0x8026
Table 203: Random Sequence Error Counter MSB Lane 3/Jitter Pattern Error Counter MSB Lane 3 Device 3,4,5, Register .......... 459
Offset: 0x8027
Table 204: Random Sequence Timer LSB/Jitter Pattern Timer LSB Device 3,4,5, Register .......... 459
Offset: 0x8028
Table 205: Random Sequence Timer MSB/Jitter Pattern Timer MSB Device 3,4,5, Register .......... 459
Offset: 0x8029
Table 206: Random Sequence Control Device 3,4,5, Register .......... 459
Offset: 0x802A
Table 207: Jitter Packet Transmit Counter LSB Device 3,4,5, Register .......... 461
Offset: 0x802C
Table 208: Jitter Packet Transmit Counter MSB Device 3,4,5, Register .......... 461
Offset: 0x802D
Table 209: Analog Test and TBG Control Device 3,4,5, Register2 .......... 461
Offset: 0xFF27
Table 210: Analog Lane 0 Transmitter Control Device 3,4,5, Register .......... 463
Offset: 0xFF28
Table 211: Analog Lane 1 Transmitter Control Device 3,4,5, Register .......... 464
Offset: 0xFF29
Table 212: Analog Lane 2 Transmitter Control Device 3,4,5, Register .......... 466
Offset: 0xFF2A
Table 213: Analog Lane 3 Transmitter Control Device 3,4,5, Register .......... 467
Offset: 0xFF2B
Table 214: Analog Receiver/Transmit Control Device 3,4,5, Register .......... 469
Offset: 0xFF2C
Table 215: VCO Calibration Control Register .......... 470
Offset: 0xFF2E
Table 216: Analog All Lane Control Register 2 .......... 471
Offset: 0xFF2F
Table 217: 10GBASE-X Power Management Control Device 3,4,5, Register5 .......... 471
Offset: 0xFF34
Table 218: VCO Calibration Control, Lane 0 Register .......... 472
Offset: 0xFF40
Table 219: VCO Calibration Control, Lane 1 .......... 472
Offset: 0xFF41
Table 220: VCO Calibration Control, Lane 2 .......... 473
Offset: 0xFF42
Table 221: VCO Calibration Control, Lane 3 .......... 474
Offset: 0xFF43
C.10 HX Port Registers .......... 475
Table 223: HXPort<n> Configuration Register0 .......... 476
Offset: port25: 0x0A803900, port26: 0x0A803A00
Table 224: HXPort<n> Configuration Register1 .......... 477
Offset: port25: 0x0A803904, port26: 0x0A803A04
Table 225: HXPort<n> Tx and Rx Swap Control .......... 479
Offset: port25: 0x0A803908, port26: 0x0A803A08
Table 226: HXPort<n> Lane<l> Symbol Error Counter .......... 479
Offset: port25lane0: 0x0A803948, port25lane1: 0x0A80394C, port26lane0: 0x0A803A48, port26lane1: 0x0A803A4C
Table 227: HXPort<n> Lane<l> Disparity Error Counter .......... 480
Offset: port25lane0: 0x0A803950, port25lane1: 0x0A803954, port26lane0: 0x0A803A50, port26lane1: 0x0A803A54
Table 228: HXPort<n> Deskew Error Counter .......... 480
Offset: port25: 0x0A803958, port26: 0x0A803A58
Table 229: HXPort<n> Interrupt Cause .......... 480
Offset: port25: 0x0A803940, port26: 0x0A803A40
Table 230: HXPort<n> Interrupt Mask .......... 481
Offset: port25: 0x0A803944, port26: 0x0A803A44
Table 231: HXPort<n> LED Control .......... 483
Offset: port25: 0x0A80393C, port26: 0x0A803A3C
Table 232: HXPort<n> Status .......... 485
Offset: port25: 0x0A80390C, port26: 0x0A803A0C
Table 233: HXPort<n> Test Configuration and Status .......... 486
Offset: port25: 0x0A803910, port26: 0x0A803A10
Table 234: HXPort<n> Lane0 PRBS Error Counter .......... 488
Offset: port25: 0x0A803914, port26: 0x0A803A14
Table 235: HXPort<n> Lane1 PRBS Error Counter .......... 488
Offset: port25: 0x0A803918, port26: 0x0A803A18
Table 236: HXPort<n> Lane<l> Cyclic Data<d> Configuration .......... 488
Offset: port25lane0data word0: 0x0A80391C, port25lane0data word1: 0x0A803920...port26lane1data word3: 0x0A803A38
Table 237: HXPorts Global Configuration .......... 489
Offset: 0x0A802000
Table 238: Analog Test and TBG Control Register .......... 489
Offset: 0x0A802004
Table 239: Analog Receiver/Transmit Control Register .......... 491
Offset: 0x0A802008
Table 240: VCO Calibration Control Register .......... 492
Offset: 0x0A80200C
Table 241: Analog All Lane Control Register0 .......... 493
Offset: 0x0A802010
Table 242: Analog All Lane Control Register1 .......... 495
Offset: 0x0A802014
Table 243: HXPort25 SERDES Power and Reset Control .......... 495
Offset: 0x0A803960
Table 244: HXPort25 Analog Lane 0 Transmitter Control Register .......... 496
Offset: 0x0A803964
Table 245: HXPort25 Analog Lane 1 Transmitter Control Register .......... 498
Offset: 0x0A803968
Table 246: HXPort25 lane0 VCO Calibration Control Register .......... 499
Offset: 0x0A80396C
Table 247: HXPort25 lane1 VCO Calibration Control Register .......... 500
Offset: 0x0A803970
Table 248: HXPort26 SERDES Power and Reset Control .......... 501
Offset: 0x0A803A60
Table 249: HXPort26 Analog Lane 0 Transmitter Control Register .......... 501
Offset: 0x0A803A64
Table 250: HXPort26 Analog Lane 1 Transmitter Control Register .......... 503
Offset: 0x0A803A68
Table 251: HXPort26 lane0 VCO Calibration Control Register .......... 505
Offset: 0x0A803A6C
Table 252: HXPort26 lane1 VCO Calibration Control Register .......... 505
Offset: 0x0A803A70
C.11 LEDs, Tri-Speed Ports MIB Counters, and Master SMI Configuration Registers .......... 507
Table 254: Source Address Middle Register .......... 512
Offset: 0x04004024
Table 255: Source Address High Register .......... 512
Offset: 0x04004028
Table 256: SMI0 Management Register .......... 513
Offset: 0x04004054
Table 257: SMI1 Management Register .......... 514
Offset: 0x05004054
Table 258: PHY Address Register0 (for Ports 0 through 5) .......... 514
Offset: 0x04004030
Table 259: PHY Address Register1 (for Ports 6 through 11) .......... 515
Offset: 0x04804030
Table 260: PHY Address Register2 (for Ports 12 through 17) .......... 516
Offset: 0x05004030
Table 261: PHY Address Register3 (for Ports 18 through 23) .......... 516
Offset: 0x05804030
Table 262: PHY Auto-Negotiation Configuration Register0 .......... 517
Offset: 0x04004034
Table 263: PHY Auto-Negotiation Configuration Register1 .......... 518
Offset: 0x04804034
Table 264: PHY Auto-Negotiation Configuration Register2 .......... 518
Offset: 0x05004034
Table 265: PHY Auto-Negotiation Configuration Register3 .......... 519
Offset: 0x05804034
Table 266: Flow Control Advertise for Fiber Media Selected Configuration Register0 (for ports 0 through 11) .......... 519
Offset: 0x04804024
Table 267: Flow Control Advertise for Fiber Media Selected Configuration Register1 (for ports 12 through 23) .......... 519
Offset: 0x05804024
Table 268: LED Interface0 Control Register0 (for Ports 0 through 11, CPU Port, and Port26) .......... 520
Offset: 0x04004100
Table 269: LED Interface1 Control Register0 (for Ports 12 through 23, Port24, and Port25) .......... 522
Offset: 0x05004100
Table 270: LED Interface0 Class0–1 Manipulation Register (for Ports 0 through 11) .......... 523
Offset: 0x04004108
Table 271: LED Interface1 Class0-1 Manipulation Register (for Ports 12 through 23) .......... 524
Offset: 0x05004108
Table 272: LED Interface0 Class2-3 Manipulation Register (for Ports 0 through 11) .......... 525
Offset: 0x04804108
Table 273: LED Interface1 Class2-3 Manipulation Register (for ports 12 through 23) .......... 525
Offset: 0x05804108
Table 274: LED Interface0 Class4 Manipulation Register (for ports 0 through 11) .......... 526
Offset: 0x0400410C
Table 275: LED Interface1 Class4 Manipulation Register (for ports 12 through 23) .......... 527
Offset: 0x0500410C
Table 276: LED Interface0 Class5 Manipulation Register (for ports 0 through 11) .......... 527
Offset: 0x0580410C
Table 278: LED Interface0 Class6 Manipulation Register (for ports 0 through 11) .......... 528
Offset: 0x04804100
Table 279: LED Interface1 Class6 Manipulation Register (for ports 12 through 23) .......... 529
Offset: 0x05804100
Table 280: LED Interface0 Group0–1 Configuration Register (for ports 0 through 11) ..........
530
Offset: 0x04004104
Table 281: LED Interface1 Group0–1 Configuration Register (for ports 12 through 23) .......... 531
Offset: 0x05004104
Table 282: LED Interface0 Group2–3 Configuration Register (for ports 0 through 11) .......... 531
Offset: 0x04804104
Table 283: LED Interface1 Group2–3 Configuration Register (for ports 12 through 23) .......... 532
Offset: 0x05804104
Table 284: LED Interface0 Class0–5 Manipulation Register (for Port26) .......... 532
Offset: 0x04005100
Table 285: LED Interface0 Class6-11 Manipulation Register (for Port26) .......... 534
Offset: 0x04005104
Table 286: LED Interface1 Class0-4 Manipulation Register (for Port24 and Port25) .......... 536
Offset: 0x05005100
Table 287: LED Interface1 Class5-9 Manipulation Register (for Port24 and Port25) .......... 538
Offset: 0x05005104
Table 288: LED Interface1 Class10-11 Manipulation Register (for Port24 and Port25) .......... 540
Offset: 0x05805100
Table 289: LED Interface0 HyperG.Stack Ports Debug Select Register (for Port26) .......... 541
Offset: 0x04005110
Table 290: LED Interface1 HyperG.Stack Ports Debug Select Register (for Port24 and Port25) .......... 542
Offset: 0x05005110
Table 291: LED Interface0 Group0–1 Configuration Register (for Port26) .......... 543
Offset: 0x04805104
Table 292: LED Interface1 Group0–1 Configuration Register (for Port24 and Port25) ..........
543
Offset: 0x05805104
Table 293: MIB Counters Control Register0 (for Ports 0 through 5) ..........