802.1ad Drop Precedence Architecture Proposal
Stephen Haddock
March 16, 2004

Bridge Model

[Figure: Bridge Model. Labels: Relay, EISS, MAC, Network, with the PRI and DE parameters passed between each port and the Relay.]

- Assume that the EISS carries a 3-bit Priority parameter (PRI) and a single-bit Drop Eligible parameter (DE). PRI is identical to the current priority parameter.
- How these parameters are generated at each port is discussed later; focus first on the drop precedence functionality in the Relay.

Extreme Networks Proprietary and Confidential Slide 2 July 2003

Objectives

- Maintain current frame ordering constraints.
- The probability of dropping a yellow frame (DE set) shall be greater than or equal to the probability of dropping a green frame (DE clear) in the same traffic class.
- Never promote yellow (DE set) to green (DE clear).
- Relative priority between any two frames will never be reversed (mapping to equal priority is OK).

Drop Precedence Relay Model

[Figure: Relay stages carrying PRI and DE. Ingress (0 or more Ingress Flow Meters), Forwarding, Queuing (1 to 8 Traffic Class Queues), Transmission (Scheduler).]

Ingress Rules (discussion points; incorporate in 8.6.1)

1. Zero or more flow meters may be implemented per ingress port.
2. Meters do not change the PRI value.
3. No restrictions on how an individual packet is directed to a specific flow meter (e.g. may be based on S-VID, PRI, a combination thereof, or something else).
4. If flow meters are implemented, not all flows are required to go through a meter.
5. The DE value shall not be changed for packets not going through a meter.
6. Flow meters may be buffered (shaper) or unbuffered (policer).
7. Flow meters may set the DE parameter and may drop packets, but shall not clear the DE parameter.

Queuing Rules (incorporate in 8.6.5)

1. One to eight queues may be implemented per egress port.
2. Individual packets are directed to a specific queue according to the PRI bits and the priority-to-traffic-class mapping table currently specified in 802.1D-2004.
3. Some or all queues may implement a drop precedence aware queue management algorithm (e.g. queue depth threshold for packets with DE set, RED, WRED, …).
4. Queues may discard packets. The probability of dropping a packet with DE set shall be greater than or equal to the probability of dropping a packet with DE clear.
5. Queues shall not change the PRI value or the DE value.

Transmission Rules (incorporate in 8.6.6)

1. As specified in 802.1D-2004, a strict priority scheduling algorithm shall be supported, and other scheduling algorithms may be supported.
2. The scheduler shall not change the PRI value.
3. An optional scheduling algorithm may incorporate a flow meter (i.e. rate-based scheduling or shaping). Such a scheduler may set the DE parameter. Otherwise the DE parameter is not modified by the scheduler.

Minimal Implementation

A minimal compliant implementation has no drop precedence awareness:

- Zero flow meters at any ingress port.
- One to eight queues at each egress port with no drop precedence aware queue management algorithms.
- No rate-based scheduling algorithms, or at least no algorithms that modify the DE value.

Therefore the PRI and DE values pass through the Relay unchanged. The only change from an 802.1D-2004 compliant bridge is the ability to carry the DE value through the Relay.

Implementation Consideration

- Just as an 802.1D bridge may support fewer than 8 traffic classes, an 802.1ad bridge may support fewer than 8 traffic classes, with only a subset of those traffic classes supporting drop precedence.
- If the number of PRI:DE combinations that are supported is 8 or fewer, an implementation may choose to carry PRI:DE through the Relay in a 3-bit field.
- There is a potential loss of information in encoding PRI:DE to a 3-bit field, but no more than occurs when encoding PRI:DE as a 3-bit field in an S-tag.
- Therefore the difference between this implementation and the architectural model is not externally observable, so it is an allowed implementation.

Bridge Model

[Figure: Bridge Model. Labels: Relay, EISS, S-TAG, MAC, Network, with the PRI and DE parameters passed between each port and the Relay.]

Now consider how to encode PRI:DE in the S-TAG.

Encoding: Conclusions from conf calls

- If we figure out how to make PRI:DE encoded in a 3-bit field work, then it should be simple to add the option to use CFI for DE.
- Encoding issues are very simple if every bridge uses the same encoding, but we need to consider the case of connecting "domains" that use different encodings.
- Having a restricted set of allowed mappings is acceptable.
- There should also be specified default mappings (similar to the current priority-to-traffic-class mapping table).
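
The ingress and queuing rules stated earlier in this deck can be illustrated with a small sketch. It is only an illustration under stated assumptions, not part of the proposal: the names `Frame`, `Policer`, and `TrafficClassQueue` are hypothetical, the meter is a simple single-rate token bucket (the deck does not prescribe a metering algorithm), and the queue uses a plain depth threshold for DE-set frames, which is one of the example algorithms the queuing rules mention.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    pri: int    # 3-bit priority parameter, 0..7
    de: bool    # Drop Eligible: True = yellow, False = green
    size: int   # frame length in bytes


class Policer:
    """Hypothetical unbuffered flow meter (single-rate token bucket)."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def meter(self, frame, now):
        # Refill the bucket for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if frame.size <= self.tokens:
            self.tokens -= frame.size
            return frame  # in profile: DE is left as received, never cleared
        # Out of profile: set DE. PRI is never changed by a meter.
        return Frame(frame.pri, True, frame.size)


class TrafficClassQueue:
    """Hypothetical queue with a lower depth threshold for DE-set frames."""

    def __init__(self, capacity, yellow_threshold):
        # yellow_threshold <= capacity guarantees that a yellow frame is
        # dropped whenever a green frame would be dropped, and possibly sooner.
        assert yellow_threshold <= capacity
        self.capacity = capacity
        self.yellow_threshold = yellow_threshold
        self.frames = deque()

    def enqueue(self, frame):
        limit = self.yellow_threshold if frame.de else self.capacity
        if len(self.frames) >= limit:
            return False  # discarded; PRI and DE of queued frames are untouched
        self.frames.append(frame)
        return True
```

Because the policer only ever sets DE and the queue only ever discards, a frame that enters the Relay yellow can leave yellow or be dropped, but it cannot leave green.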

802.1Q-2003

[Figure: content not present in the extracted text.]

Proposed default mapping

Encoded Value   PRI only Bridge   PRI w/ DP Bridge   PRI only Bridge
7               7                 6/7 green          7
6               6                 6/7 yellow         6
5               5                 4/5 green          5
4               4                 4/5 yellow         4
3               3                 0/3 green          3
2               2                 1/2 green          2
1               1                 1/2 yellow         1
0               0                 0/3 yellow         0

Identical to 802.1Q Traffic Class Mapping.

Proposed default EISS mapping

Egress EISS PRI:DE values: 7:G 7:Y 6:G 6:Y 5:G 5:Y 4:G 4:Y 3:G 3:Y 2:G 2:Y 1:G 1:Y 0:G 0:Y
[Figure: arrows from each egress PRI:DE value to an Encoded Tag Field value; the individual arrows are not preserved in the extracted text.]

Encoded Tag Field   Ingress EISS PRI:DE
7                   7:G
6                   7:Y
5                   5:G
4                   5:Y
3                   3:G
2                   2:G
1                   2:Y
0                   3:Y

Old Slides

Encoding PRI:DE in the S-tag

- Using the old CFI bit for DE in the S-tag allows PRI:DE to be carried without loss of information.
- Encoding PRI:DE in what is currently the 3-bit priority field of the S-tag (call it the Priority/Drop-Precedence or PDP field) is friendlier to existing equipment.
- But it does not "grandfather in" existing equipment.
- All ports connected to a LAN must use the same mapping between PDP and PRI:DE.
- There will be a loss of information when bridging between LANs that use different mappings between PDP and PRI:DE.
- It is possible to allow both as configurable options.
- Is support of one mode required and the other optional? If so, which is required?

Mapping observations -- 1

- Things will "just work" if we constrain all ports to use the same mapping between PDP and PRI:DE.
- A PDP → PRI:DE → PDP mapping does not result in any lost information.
- If the ports on each end of a network link need to use the same mapping, then it follows that all ports of all bridges on a network use the same mapping.
- Interoperability is only assured if all switches support all possible mappings, or if a constrained set of required mappings is specified.
- But are we willing to live with the constraint that all ports of all bridges connected in a network must have the same mapping? (Particularly problematic at the connection between two different Service Providers.)

Proposed Mapping Rules

1. Packet order within a priority shall be maintained.
   This means two values of PRI:DE that differ only in the DE value cannot be mapped to two PDP values that are interpreted as having different priority.
2. Relative priority between packets shall be maintained through all mapping tables.
   No packet that is initially tagged as higher priority than a second packet shall ever be mapped in a fashion that tags it as lower priority than the second packet.
3. When two values of PRI:DE that differ only in DE are mapped to a single PDP, packets with DE set may be transmitted (effectively clearing DE) or discarded.
   Do we specify discard vs. transmit in the standard, let the implementation decide, or mandate that an implementation must be configurable to do either?

Mapping Example 1

PDP   8/0 port    4/4 port
7     Priority7   Gold-G
6     Priority6   Gold-Y
5     Priority5   Silver-G
4     Priority4   Silver-Y
3     Priority3   Bronze-G
2     Priority2   Bronze-Y
1     Priority1   Lead-G
0     Priority0   Lead-Y

- PRI information is permanently lost going from 8/0 to 4/4.
- DE information is lost (or packets are discarded) going from 4/4 to 8/0.
- Traffic going between an 8/0 port and a 4/4 port gets the same effective behavior as if both ports were 4/0.

Key:
- m/n: m Priority values, of which n have Drop Precedence.
- G (Green): low discard probability.
- Y (Yellow): high discard probability.
- Dashed arrow: map or discard.

Mapping Example 2

PDP   A 6/2 port   B 6/2 port
7     Control      Control
6     VoIP         Platinum
5     EF           Gold-G
4     AF1-G        Gold-Y
3     AF1-Y        Gaming
2     AF2-G        Silver-G
1     AF2-Y        Silver-Y
0     Default      Other

- PRI information is permanently lost going from the A port to the B port.
- EF traffic that travels from network A through B and back to A will end up higher priority than EF traffic local to A.
- An alternative mapping would map EF to Gold and AF1 to Gaming, resulting in a loss of DE information instead of PRI. (Should one of these mapping alternatives be required by the standard?)

(It is conceivable that traveling A → B → C → B → A could result in a packet originally at EF ending up at Control, thus violating Mapping Rule 2.)

Key:
- m/n: m Priority values, of which n have Drop Precedence.
- G (Green): low discard probability.
- Y (Yellow): high discard probability.
- Dashed arrow: map or discard.

Issues

- Investigating the consequences of going from any given mapping to any other gets complex, and leads to network behavior that is certainly not obvious and possibly not desirable.
- Are the Proposed Mapping Rules sufficient to preserve interoperability (and hopefully sanity)? Do we need further rules (such as "when collapsing two PRI values to one PDP, always map to a PDP which is interpreted as the higher of the two priorities")?
- Do we try to describe the network behavior resulting from various mapping combinations, or do we let Service Providers sort it all out?
- Should we simplify the situation by specifying a small set of mappings that must be supported, or would that undermine the implicit objective of grandfathering existing equipment?
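
One way to explore whether a given pair of port mappings respects Proposed Mapping Rules 1 and 2 is to check them mechanically. The sketch below is an illustration only and rests on several assumptions not in the slides: the function name `check_mapping` is hypothetical, each port is described by a table from PDP value to (priority rank, DE), a higher rank simply means higher priority, and the translation applied at the boundary is given as a PDP-to-PDP table in which `None` means discard. The `collapse` table is a made-up example, not the mapping drawn in the Mapping Example 1 figure.

```python
from itertools import combinations

# Decode tables: PDP value -> (priority rank, DE flag).
# Assumption: a higher rank means a higher priority.
PORT_8_0 = {pdp: (pdp, False) for pdp in range(8)}
PORT_4_4 = {7: (3, False), 6: (3, True),   # Gold-G,   Gold-Y
            5: (2, False), 4: (2, True),   # Silver-G, Silver-Y
            3: (1, False), 2: (1, True),   # Bronze-G, Bronze-Y
            1: (0, False), 0: (0, True)}   # Lead-G,   Lead-Y


def check_mapping(src, dst, pdp_map):
    """List rule violations for frames crossing from a src port to a dst port.

    pdp_map translates a PDP value on the src side to a PDP value on the
    dst side; a value of None means the frame is discarded at the boundary.
    """
    violations = []
    for a, b in combinations(sorted(src), 2):
        out_a, out_b = pdp_map.get(a), pdp_map.get(b)
        if out_a is None or out_b is None:
            continue  # a discarded frame cannot be reordered or reprioritized
        rank_a, rank_b = src[a][0], src[b][0]
        new_a, new_b = dst[out_a][0], dst[out_b][0]
        if rank_a == rank_b and new_a != new_b:
            # Same priority, differing only in DE, now at different priorities.
            violations.append(("rule 1", a, b))
        elif (rank_a - rank_b) * (new_a - new_b) < 0:
            # Relative priority reversed (mapping to equal priority is allowed).
            violations.append(("rule 2", a, b))
    return violations


identity = {pdp: pdp for pdp in range(8)}
# Hypothetical translation that folds each yellow value onto its green value.
collapse = {7: 7, 6: 7, 5: 5, 4: 5, 3: 3, 2: 3, 1: 1, 0: 1}

print(check_mapping(PORT_8_0, PORT_4_4, identity))
print(check_mapping(PORT_4_4, PORT_8_0, identity))
print(check_mapping(PORT_4_4, PORT_8_0, collapse))
```

Under these assumptions, passing PDP values unchanged from the 4/4 port to the 8/0 port trips rule 1 for every green/yellow pair, while folding yellow onto green (losing DE) does not, which matches the observation that DE information is lost or packets are discarded in that direction. The check covers a single hop; multi-hop paths such as the A, B, C example would need the tables composed before checking.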

Note on placement of PRI:DE-to-PDP mapping function

- I've assumed it is between the ISS and EISS (therefore in section 6.x of 802.1ad), so the PRI:DE values are passed across the EISS.
- It could as easily go on the other side of the EISS, in the Ingress and Transmission portions of the Relay (sections 8.6.1 and 8.6.6 respectively), which means the PDP value is passed across the EISS.
- There is precedent for either solution:
  - The VID value passed across the EISS is taken exactly from the packet if tagged or priority-tagged, or null if untagged. It is translated to the default PVID or port-protocol ID in the Ingress portion of the Relay.
  - The current priority value is taken from a tagged packet or, if the packet is untagged, is "regenerated" prior to being passed across the EISS.
- The functionality is the same either way.