Flow Control – Pause Frames

http://hasanmansur.com/
POSTED BY HASAN MANSUR ⋅ DECEMBER 15, 2012
When an Ethernet device becomes overloaded, flow control allows it to send PAUSE requests to the devices
sending it data so that the overload can clear. If flow control is not enabled and an overload
occurs, the device drops packets. Dropping packets hurts performance far more than flow control
does.
802.3x flow control is not implemented on a flow basis, but on a link basis. The problem flow control is
intended to solve is input buffer congestion on oversubscribed full-duplex links that cannot handle
wire-rate input. Flow control was originally invented to prevent packet drops by switches that were
running at less than media speed. At that time the method of control was usually back-pressure. It
could also substantially lower overall throughput on the segments being flow controlled.
Pause Frames
When the Rx FIFO queue of the receive part (Rx) of the port fills and reaches the high water
mark, the transmit part (Tx) of the port starts to generate pause frames. The remote device is
expected to stop or reduce the transmission of packets for the interval specified in the
pause frame. If the Rx is able to clear the Rx queue or reach the low water mark within this interval,
Tx sends out a special pause frame that specifies an interval of zero (0x0). This allows the
remote device to start transmitting packets again. If the Rx is still working through the queue when
the interval expires, the Tx sends a new pause frame with a new interval value.
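The high/low water mark behaviour described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name RxPort, the frame-count units, and the choice to always request the maximum interval (0xFFFF, the largest value of the 16-bit pause_time field) are hypothetical and not taken from any particular switch.

```python
from dataclasses import dataclass
from typing import Optional

MAX_QUANTA = 0xFFFF  # largest value the 16-bit pause_time field can carry


@dataclass
class RxPort:
    """Hypothetical model of one port's Rx FIFO and its pause decisions."""
    capacity: int      # Rx FIFO size, counted in frames for simplicity
    high_water: int    # start pausing at or above this fill level
    low_water: int     # resume at or below this fill level
    fill: int = 0
    paused: bool = False

    def update(self, arrived: int, drained: int,
               timer_expired: bool = False) -> Optional[int]:
        """Apply one step of traffic; return the pause interval to send, or None."""
        self.fill = max(0, min(self.capacity, self.fill + arrived - drained))
        if not self.paused and self.fill >= self.high_water:
            self.paused = True
            return MAX_QUANTA      # high water mark reached: ask the peer to stop
        if self.paused and self.fill <= self.low_water:
            self.paused = False
            return 0               # zero interval: the peer may resume at once
        if self.paused and timer_expired:
            return MAX_QUANTA      # still congested: send a fresh pause frame
        return None                # nothing to send this step


port = RxPort(capacity=100, high_water=80, low_water=20)
print(port.update(arrived=90, drained=0))                      # pause requested
print(port.update(arrived=0, drained=30, timer_expired=True))  # pause refreshed
print(port.update(arrived=0, drained=50))                      # resume (0)
```

A real port makes this decision in hardware and per byte rather than per frame; the sketch only shows the order of the three cases (pause, resume, refresh).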
If Rx-No-Pkt-Buff is zero or does not increment while the TxPauseFrames counter increments, the
switch is generating pause frames and the remote end is obeying them, so the Rx FIFO queue drains.
If Rx-No-Pkt-Buff increments and TxPauseFrames also increments, the remote end is
disregarding the pause frames (it does not support flow control) and continues to send traffic. To
overcome this situation, manually configure the speed and duplex and, if required, disable flow
control. These types of errors on the interface point to a traffic problem with oversubscribed
ports.
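The two counter patterns above can be turned into a simple check over two successive counter snapshots. This is a minimal sketch: the function name diagnose and the dictionary keys are hypothetical, and how the counters are actually read (CLI, SNMP, etc.) is left out.

```python
def diagnose(prev: dict, curr: dict) -> str:
    """Compare two snapshots of a port's counters and classify the peer.

    Both dicts are assumed to hold 'rx_no_pkt_buff' and 'tx_pause_frames'
    (hypothetical key names standing in for Rx-No-Pkt-Buff and TxPauseFrames).
    """
    no_buf_delta = curr["rx_no_pkt_buff"] - prev["rx_no_pkt_buff"]
    pause_delta = curr["tx_pause_frames"] - prev["tx_pause_frames"]

    if pause_delta > 0 and no_buf_delta == 0:
        return "peer honors pause frames"    # Rx FIFO drains, no buffer drops
    if pause_delta > 0 and no_buf_delta > 0:
        return "peer ignores pause frames"   # drops continue despite pauses
    return "no pause frames sent"            # case not discussed in the text


before = {"rx_no_pkt_buff": 10, "tx_pause_frames": 100}
after = {"rx_no_pkt_buff": 25, "tx_pause_frames": 180}
print(diagnose(before, after))
```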
What Flow Control is not:
Not intended to solve the problem of steady-state overloaded networks or links.
It is not intended to address lack of network capacity. Properly used, flow control can be a useful tool to
address short term overloads on a single link.
Not intended to provide end-to-end flow control. End-to-end mechanisms, typically at the Transport
Layer, are intended to address such issues. The most common example is the TCP window, which provides
end-to-end flow control between source and destination for individual L3/L4 flows.
What would happen if Flow Control were not available?
For Ethernet, packets would continue to be sent to the receiving port, but there would be no room for
the packets to be temporarily stored. The receiving port would simply drop these incoming packets.
Ethernet itself does not recover them; TCP, running on top of it, has those “lost” packets
re-transmitted. However, it takes time to determine that packets have been dropped, request the
re-transmission of those missing packets, and then actually send them.
Flow Control – Where to use it, and where not.
Edge of a network
Flow control suits the edge, where GE-attached servers are operating at less than wire speed and the
link only needs to be paused for a short time, typically measured in microseconds. Individual clients
can be held off without potentially affecting large areas of the network. Flow control can be useful,
for example, if the uplink is being swamped by individual clients. CoS/QoS will become more important
here over time.
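To put numbers on "typically measured in microseconds": in 802.3x the pause interval is a 16-bit count of pause quanta, and one quantum is 512 bit times on the link. These two constants come from the standard rather than from the text above; the helper name pause_seconds is just for illustration.

```python
BITS_PER_QUANTUM = 512   # one pause quantum = 512 bit times (802.3x)
MAX_QUANTA = 0xFFFF      # pause_time is a 16-bit field


def pause_seconds(quanta: int, link_bps: float) -> float:
    """Return how long a pause of `quanta` lasts on a link of `link_bps` bit/s."""
    if not 0 <= quanta <= MAX_QUANTA:
        raise ValueError("pause_time must fit in 16 bits")
    return quanta * BITS_PER_QUANTUM / link_bps


GIGABIT = 1e9
print(f"1 quantum at 1 Gb/s:   {pause_seconds(1, GIGABIT) * 1e6:.3f} us")
print(f"100 quanta at 1 Gb/s:  {pause_seconds(100, GIGABIT) * 1e6:.1f} us")
print(f"max pause at 1 Gb/s:   {pause_seconds(MAX_QUANTA, GIGABIT) * 1e3:.2f} ms")
```

On a Gigabit link a single quantum is about half a microsecond, and even the largest single pause request is roughly 33.5 ms, so short pauses at the edge stay well inside the microsecond range.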
SAN
Flow control is very important to a well-designed, high-performance iSCSI Ethernet infrastructure.
On many networks, there can be an imbalance in the network traffic between the devices that send
traffic and the devices that receive the traffic. This is often the case in SAN configurations in which many
servers (initiators) are communicating with storage devices (targets). If senders transmit data
simultaneously, they may exceed the throughput capacity of the receiver. When this occurs, the receiver
may drop packets, forcing senders to re-transmit the data after a delay. Although this will not result in
any loss of data, latency will increase because of the re-transmissions, and I/O performance will
degrade. Flow control can help eliminate this problem: pausing the senders lets the receiver process
its backlog so it can later resume accepting input. The delay introduced by this action is
dramatically less than the overhead caused by TCP/IP packet re-transmission.
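A back-of-the-envelope calculation shows how quickly a receiver can be overrun when several initiators transmit at once. The sketch below assumes a simple model (constant sender rates, a single receive buffer, a fixed drain rate); the function name and the example figures (four 1 Gb/s initiators, a 512 KiB buffer) are purely illustrative.

```python
from typing import Optional, Sequence


def seconds_until_overflow(buffer_bytes: int,
                           sender_bps: Sequence[float],
                           receiver_bps: float) -> Optional[float]:
    """Time until the receive buffer fills, or None if the receiver keeps up."""
    excess_bps = sum(sender_bps) - receiver_bps   # rate at which backlog grows
    if excess_bps <= 0:
        return None                               # no sustained overload
    return buffer_bytes * 8 / excess_bps


# Four initiators at 1 Gb/s each sending to a target that drains 1 Gb/s.
t = seconds_until_overflow(512 * 1024, [1e9] * 4, 1e9)
print(f"buffer overflows after about {t * 1e3:.2f} ms")
```

With these example numbers the buffer is full in a little over a millisecond, which is why the receiver needs a way to hold senders off almost immediately.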
Switches should always be set to auto-negotiate flow control unless Support specifies otherwise. In
Cisco terminology, this means using the “desired” setting. If the switch is capable of both sending and
receiving pause frames (called symmetric flow control), enable negotiation in both directions (send
desired and receive desired). If the switch only supports receiving pause frames (asymmetric flow
control), then enable negotiation for receive only (receive desired).
On EqualLogic PS Series arrays, auto-negotiation for asymmetric flow control is always enabled.
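For readers who want to see how symmetric and asymmetric outcomes arise from negotiation, here is a sketch of the pause resolution used by Ethernet auto-negotiation, where each side advertises two bits, PAUSE and ASM_DIR. The bit names are from the 802.3 standard, not from the text above, and the function name is hypothetical. It does not model Cisco's "desired" keyword, only the underlying resolution of the advertised bits.

```python
from typing import Tuple


def resolve_pause(local_pause: bool, local_asm_dir: bool,
                  peer_pause: bool, peer_asm_dir: bool) -> Tuple[bool, bool]:
    """Resolve advertised pause bits for the local device.

    Returns (may_send_pause, must_honor_pause) from the local point of view.
    """
    if local_pause and peer_pause:
        return (True, True)            # symmetric: both directions enabled
    if local_asm_dir and peer_asm_dir:
        if local_pause:
            return (False, True)       # local only receives (honors) pause
        if peer_pause:
            return (True, False)       # local only sends pause
    return (False, False)              # no flow control on this link


# Both sides advertise PAUSE: symmetric flow control.
print(resolve_pause(True, False, True, False))
# Local advertises PAUSE + ASM_DIR, peer only ASM_DIR: local receive only.
print(resolve_pause(True, True, False, True))
```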
Core of Network
Using flow control in the core is actually more detrimental than helpful. Flow control in the core can
cause congestion in sections of the network that otherwise would not be congested. If particular links
are constantly in a congested state, there is most likely a problem with the current implementation of
the network. The right solution is to redesign the network with additional capacity, reduce the load, or
provide appropriate end-to-end QoS to ensure critical traffic can get through.
The best way to handle any potential congestion in the backbone is CoS/QoS controls. Prioritizing
packets through multiple queues provides far more sophisticated traffic control (such as targeting
specific application packet types) than an all-or-nothing, or even a throttled form of flow control.
QoS
QoS cannot operate properly if a switch sends PAUSE frames, because this slows all of that port's traffic,
including any traffic which may have high priority.
When you enable QoS on the switch, the port buffers are carved into one or more individual queues.
Each queue has one or more drop thresholds associated with it. The combination of multiple queues
within a buffer, and the drop thresholds associated with each queue, allow the switch to make
intelligent decisions when faced with congestion. Traffic sensitive to jitter and delay variance, such as
VoIP packets, can be moved to the head of the queue for transmission, while other less important or
less sensitive traffic can be buffered or dropped. Ingress and egress scheduling are always based on the
CoS value associated with the frame. By default, higher CoS values are mapped to higher queue
numbers. CoS 5 traffic, typically associated with VoIP traffic, is mapped to the strict priority queue, if
present.
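The queueing behaviour described above can be illustrated with a toy model. Everything specific here is an assumption made for the example: four queues per port, the highest-numbered queue acting as the strict priority queue, one tail-drop threshold per queue, and the particular CoS-to-queue mapping. Real switches have platform-specific mappings and often use weighted scheduling between the non-priority queues, which this sketch simplifies to strict ordering.

```python
from collections import deque

NUM_QUEUES = 4
STRICT_PRIORITY_QUEUE = NUM_QUEUES - 1   # hypothetical: top queue is strict priority


def cos_to_queue(cos: int) -> int:
    """Map a CoS value (0-7) to a queue; CoS 5 goes to the priority queue."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be between 0 and 7")
    if cos == 5:
        return STRICT_PRIORITY_QUEUE
    return min(cos // 2, NUM_QUEUES - 2)  # higher CoS -> higher normal queue


class Port:
    """Toy egress port: per-queue tail drop, highest queue served first."""

    def __init__(self, drop_threshold: int = 3):
        self.queues = [deque() for _ in range(NUM_QUEUES)]
        self.drop_threshold = drop_threshold

    def enqueue(self, frame: str, cos: int) -> bool:
        queue = self.queues[cos_to_queue(cos)]
        if len(queue) >= self.drop_threshold:
            return False               # threshold reached: frame is dropped
        queue.append(frame)
        return True

    def dequeue(self):
        for queue in reversed(self.queues):   # priority queue is checked first
            if queue:
                return queue.popleft()
        return None


port = Port(drop_threshold=2)
port.enqueue("data-1", cos=0)
port.enqueue("voice-1", cos=5)
port.enqueue("data-2", cos=0)
print(port.enqueue("data-3", cos=0))   # False: queue 0 is at its threshold
print(port.dequeue())                  # voice-1 leaves first
```

Only the low-priority frame is dropped and the voice frame is transmitted ahead of earlier data frames, in contrast to a PAUSE frame, which would stall every queue on the port at once.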