COMPUTER TECHNOLOGY INSTITUTE
TECHNICAL REPORT No. TR2000/04/04
14/04/2000

“IMPLEMENTING LAYER-2, CONNECTION-ORIENTED QoS ON A 3-STAGE CLOS SWITCHING ARCHITECTURE”

Fotios K. Liotopoulos & Mohsen Guizani
Abstract
Effective Quality of Service (QoS) guarantees for user services must be implemented
end-to-end and through all layers of the OSI hierarchy at each network node. In this paper, we consider the
layer-2 design of an ATM switch, which is built as a scalable, 3-stage Clos switching network and discuss
architectural choices for the implementation of QoS support on this switch. An ATM-like cell format is used
to encapsulate and propagate traffic and resource management information through the switch. Call
Admission Control, fast inter-stage cell transfers and priority-based queue arbitration are used as congestion
control mechanisms. Simulation results are presented to show the effect of architectural choices (internal
buffering) on QoS parameters (CDV).
1. Introduction
Asynchronous Transfer Mode (ATM) switch architectures can be designed around a variety of core media (e.g.,
shared memory, ring/bus architectures, or networks of switching elements) and a variety of buffering
schemes (input, output, or both) [9]. In the process of designing layer-2 ATM switches as a network of
switching elements, one can apply typical network theory and design principles, such as Quality of
Service (QoS) and traffic management support, which are often applied in higher layers of the OSI
hierarchy [1,2,3,4].
In order for QoS to be meaningful and effective for user services, it must: a) be end-to-end, from source to
destination and b) span and propagate through all layers of the OSI hierarchy at each network node.
Therefore, every layer-2 switch design must be aware of and support the QoS properties of the upper layers.
In this paper, we describe a proposed ATM switch architecture based on 3-stage Clos networks and discuss
issues related to the implementation of QoS properties in the layer-2 design of such a switch.
1.1 QoS Overview
The ability of ATM networks to provide large bandwidth and versatile Quality of Service (QoS)
guarantees can only be realized by applying effective traffic management strategies. Traffic management
includes Call Admission Control (CAC), Congestion Control (CC) and Virtual Path / Virtual Channel
(VP/VC) Routing [5,6,8].
The combination of Bandwidth Allocation, Call Admission Control and Congestion Control mechanisms
is used to ensure appropriate management of network resources, so that network performance meets the
QoS required by the user [5,6,7,10].
Standardized QoS parameters include:
- CLR: Cell Loss Ratio
- CTD: Cell Transfer Delay
- CDV: Cell Delay Variation
Multiple QoS guarantees are provided by means of multiple service classes, including:
- UBR: Unspecified Bit Rate
- CBR: Constant Bit Rate (or DBR: Deterministic Bit Rate)
- VBR: Variable Bit Rate (RT / NRT) (or SBR: Statistical Bit Rate)
- ABR: Available Bit Rate
For each service class, the user is requested to specify a set of traffic parameters, such as:
- PCR: Peak Cell Rate
- SCR: Sustainable Cell Rate
- CDVT: Cell Delay Variation Tolerance
- BT: Burst Tolerance
All of the aforementioned QoS principles must be considered in the design of a layer-2 ATM switch
architecture such as the one proposed in this paper.
1.2 Traffic Characteristics
Traffic is characterized by different parameters that define its transport behavior. These characteristics
should be considered for a good network design. There are four predominant traffic characteristics:
required capacity or required throughput of a call, delay tolerance, response time and burstiness. These
characteristics can be combined to derive the performance requirements of an application. In addition,
other requirements such as call setup response time, and routing algorithms are also important for the
design phase.
a) Capacity and throughput: The capacity requirement is the actual amount of resources required by an
application across a given path. The required throughput is a measure of how much application data must
pass across the switch fabric and arrive at the destination in a stated period of time. Typically, this refers
to user data. Some technologies use available bandwidth more efficiently than others, thus providing
higher throughput. A good switching network should be designed with high throughput capability and
extra available capacity in order to avoid congestion effects during high traffic conditions.
b) Response time: Variations in delay lead to variations in response time. Applications and users require
different response times, ranging from real-time applications, such as video conferencing, to batch
applications, such as file transfers or electronic mail. The key question is how important response time is
to the application and its users.
c) Traffic delay tolerance: Traffic delay tolerance defines how the application can tolerate the delay of the
cells. The bottom line is to determine the maximum delay the application and user can tolerate.
d) Traffic burstiness: Burstiness is a commonly used measure of how irregularly a source sends traffic.
A source that sends traffic in infrequent bursts is said to be bursty, whereas a source that always sends at
the same rate is said to be non-bursty. Burstiness is defined as the ratio of the peak rate to the average rate
of the traffic, based on a specific sampling period for the data:
Burstiness = Peak Rate / Average Rate
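As an illustration only (not part of the original study), the burstiness ratio can be computed from per-period cell counts of a measured stream; the sampling period and the traces below are hypothetical:

# Hypothetical sketch: burstiness = peak rate / average rate, estimated from
# per-sampling-period cell counts of a single source.
def burstiness(cells_per_period, period_sec):
    rates = [c / period_sec for c in cells_per_period]   # cells/sec in each period
    return max(rates) / (sum(rates) / len(rates))

# Example: an on/off source vs. a constant-rate source, 1-ms sampling periods.
print(burstiness([350, 0, 0, 350, 0, 0], period_sec=1e-3))   # -> 3.0 (bursty)
print(burstiness([120, 120, 120, 120], period_sec=1e-3))     # -> 1.0 (non-bursty)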
In Section 2, we briefly present the architectural characteristics of the proposed switch architecture. In
Section 3, we outline architectural choices that can be adopted in order to provide QoS support for the
specific switch architecture. In Section 4, we discuss how resource management functions should take into
account and exploit layer-2 QoS support in order to provide more reliable and concrete user support
services. Finally, in Section 5, we present simulation results and an analysis of the effect of switch-fabric
buffering on cell delay variation and, ultimately, on QoS.
2. The proposed Switch Architecture
In this paper, we propose a scalable switch
architecture, based on a 3-stage Clos network of
switching elements. Given this design model,
we can apply traditional network design
principles and analysis to it (such as QoS
guarantees), in order to produce a network
element (i.e., a switch) that blends transparently
with existing networks (LANs, WANs, etc.).
2.1 Three-Stage Clos Networks
Fig. 1. A Symmetric 3-stage Clos Network.
Three-stage Clos networks have been used for
multiprocessor interconnection networks, as
well as for data switching fabrics. For example,
the Memphis switch of the IBM GF-11 parallel
computer uses a symmetric three-stage Clos
network to interconnect 576 processors. A variety of 3-stage Clos network designs have also been
proposed for high-performance switching systems, including Fujitsu's FETEX-150, NEC's ATOM and
Hitachi's hyper-distributed switching system.
Clos networks can be used to efficiently implement low latency, high-bandwidth, connection-oriented
ATM switching. Clos-type switching fabrics have also been found to be economically and technically
attractive. Combined with their inherent fault-tolerance and multi-path routing properties, this makes them
a very appealing choice for reliable broadband switching.
A three-stage Clos network consists of three successive stages of switching elements, which are
interconnected by point-to-point links. In a symmetric three-stage network, all switching elements in a
stage are uniform (see Figure 1). In a symmetric Clos network such as this, there are r switches of size
(n x m) in the first stage, m switches of size (r x r) in the second stage and r switches of size (m x n) in the
third stage. This network thus interconnects n*r input ports with n*r output ports in one direction.
Bidirectional switching is achieved by using half of the input ports for the reverse direction.
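For reference, the classical nonblocking conditions for a symmetric Clos(n, m, r) network (m >= 2n-1 for strictly nonblocking and m >= n for rearrangeably nonblocking point-to-point switching) can be captured in a short sketch; the helper below is illustrative and not part of the proposed design:

# Illustrative summary of a symmetric 3-stage Clos(n, m, r) network:
# r ingress switches (n x m), m middle switches (r x r), r egress switches (m x n).
def clos_summary(n, m, r):
    return {
        "ports_per_direction": n * r,
        "strictly_nonblocking": m >= 2 * n - 1,    # Clos (1953) condition
        "rearrangeably_nonblocking": m >= n,       # Slepian-Duguid condition
    }

# The 8x16x8 configuration simulated in Section 5 (n=16, m=16, r=8) comes out
# rearrangeably, but not strictly, nonblocking, with 128 ports per direction.
print(clos_summary(n=16, m=16, r=8))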
2.2 A novel Clos Switch Architecture
Figure 2 depicts the basic components of a recent proposal for a 3-stage Clos switch architecture, scalable
to 32 switching elements in each stage. Each switching element consists of up to 32 input ports and 32
output ports (n=r=32) with internal input and output buffering. Such a switching element is constructed in
a modular and scalable way, consisting of up to 8 Core Switch Modules (CSM), described below.
[Fig. 2. A scalable, modular, 3-stage Clos Switch Architecture: 32 switching elements per stage, each with 32 input and 32 output ports; each switching element is built from Core Switch Modules (CSMs) with local input/output FIFOs (LiQ, LoQ) on a local bus (LBus) and global input/output FIFOs (GiQ, GoQ) on a global bus (GBus).]
The Core Switch Module (CSM)
Figure 2 also shows the modular implementation of a typical switching element (i.e., a stage-switch) of the
Clos switching fabric. It consists of 2 (up to 8) modules (CSMs) in parallel, interconnected via a (global)
shared bus.
Each CSM has four serial input ports and four serial output ports (e.g., ATM ports operating at 155 Mbps
(OC-3), also referred to as Synchronous Transport Signal level 3 (STS-3c)).
The serial data stream at each input port is transformed into parallel 32-bit words by a Serial-In-Parallel-Out
(SIPO) shift register. These words are subsequently stored in a (local) input FIFO queue (LiQ). Cells
can be switched from any input FIFO to any output FIFO within the same module or across different
modules of the same switch.
In this design, we break down the single shared bus into a two-level hierarchy of shared busses, one local
to the module (LBus) and one global for the entire switching element (GBus). Modules of the same switch
can communicate with each other, via the GBus, by means of a pair of (global) input and output FIFOs
(GiQ, GoQ).
A cell destined to an output port of the same module is transferred directly to the corresponding local
FIFO LoQ, via the local bus, LBus. If the destination of the cell is an output port of a different module,
within the same switch, then the cell is first transferred to the GoQ FIFO, and through the GBus, to the
GiQ of the target module. At the remote module, the cell is transferred from the GiQ to the appropriate
LoQ, before it exits the current switching element and moves to the next stage.
Fig. 3. An example of cell routing through 3 switching elements, one per stage.
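A minimal sketch of the local-versus-global forwarding decision described above is given below; the class and field names (CSM, loq, goq, giq) follow the FIFO names used in the text, but the code itself is purely illustrative:

from collections import deque

PORTS_PER_MODULE = 4   # each CSM has four output ports

class CSM:
    def __init__(self):
        self.loq = [deque() for _ in range(PORTS_PER_MODULE)]  # local output FIFOs
        self.goq = deque()                                      # global output FIFO (to GBus)
        self.giq = deque()                                      # global input FIFO (from GBus)

def forward(cell, src: CSM, modules):
    """Queue a cell toward its output port within one switching element."""
    dst = modules[cell["out_port"] // PORTS_PER_MODULE]
    if dst is src:
        src.loq[cell["out_port"] % PORTS_PER_MODULE].append(cell)  # LBus: direct local transfer
    else:
        src.goq.append(cell)   # detour: LBus -> GoQ, then GBus -> GiQ -> LoQ
                               # (the GBus leg is scheduled separately by the arbiter)

modules = [CSM() for _ in range(8)]
forward({"out_port": 2}, modules[0], modules)    # same module: straight to a LoQ
forward({"out_port": 30}, modules[0], modules)   # different module: via the global FIFOs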
With this design approach, each module contributes only one input load and one output load to the total
load of the GBus. Therefore, this design can scale to 8 or even 16 modules per switching element. Given
that each module has a switching capacity of 4*155=622 Mbps, the switching element can scale up to 10
Gbps. A 3-stage Clos network consisting of such switching elements can therefore achieve up to 160 Gbps
of strictly nonblocking switching capacity (in a [32 x 64 x 32] configuration), or up to 320 Gbps of
rearrangeably nonblocking switching capacity (in a [64 x 64 x 64] configuration).
With respect to the LBus, arbitration is performed among the four LiQ and the GiQ FIFOs, with outputs
to the four LoQ and the GoQ FIFOs. Therefore, the LBus is electrically loaded with only 5 inputs and 5
outputs, which is well below its limitation, but is kept at this level for overall performance reasons. For a
16-module switch implementation, the corresponding global bus loading is 8 or 16 inputs and 8 or 16
outputs, depending on the implementation.
A central scheduler performs control and bus arbitration, over the entire switch, and transfers one ATM
cell (14 32-bit words) at a time from an input FIFO (LiQ) to an output FIFO (LoQ), assuming that the
former has at least one cell to send and the latter has adequate free buffer space to accommodate it.
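The scheduler's per-transfer check can be sketched as follows (hypothetical code; the 14-word cell length follows the cell format of Figure 4):

from collections import deque

CELL_WORDS = 14   # header word + HEC/PEC word + twelve 4-octet payload words

def try_transfer(liq: deque, loq: deque, loq_capacity_words: int) -> bool:
    """Move one complete cell from an input FIFO to an output FIFO, if possible."""
    if len(liq) >= CELL_WORDS and loq_capacity_words - len(loq) >= CELL_WORDS:
        for _ in range(CELL_WORDS):
            loq.append(liq.popleft())
        return True
    return False   # source empty or destination full: try another FIFO pair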
Cells from the output FIFOs are then transformed into a serial stream by a Parallel-In-Serial-Out (PISO)
shift register, in order to be switched to the next stage via an internal link or to an output port. It is often
desirable for internal links to have higher capacity than the input or output ports. This is usually
implemented with either wider data paths, or higher transfer rates.
3. Layer-2 QoS issues
3.1 Call Admission Control (CAC)
Data stream transport through the switching network is achieved by means of Connection Oriented
Network Services (CONS). CONS require the establishment of a connection between the origin and
destination before transmitting the data. The connection is established by pre-allocating bandwidth on a
path of physical links through intermediate nodes of the network. Once established, all data travels over
this same path in the network. Control signalling is used to establish and take down a connection
dynamically (Switched Virtual Circuits, SVCs).
For the proposed switch architecture, we can apply a variety of Call Admission Control (CAC) algorithms
for Strictly Nonblocking (SNB), Wide-Sense Nonblocking (WSN), Semi-Rearrangeably Nonblocking
(SRN), or Rearrangeably Nonblocking (RNB) operation, as proposed by numerous researchers for
generalized 3-stage Clos switching networks in the multirate environment. Note at this point, that the term
“nonblocking” is used here to refer to the nonblocking property of the switching network at call set up
time. Even an SNB switch can still be blocking (at cell transport time), due to resource conflicts, although
this probability is generally small for an averagely utilized switch.
CBR services are often sensitive to delay variations (or “jitter”). For such services, it is useful to modify
the CAC function so as to avoid routing via global bus “detours” as much as possible. For example, we
can avoid such a detour in the 3rd stage by choosing a proper middle-stage switch to route through.
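A sketch of such a detour-aware middle-switch choice is given below. It assumes, following the CSM description of Section 2.2, that ports are grouped four per module; the helper names are hypothetical and a real CAC function would also check link capacities:

PORTS_PER_MODULE = 4

def choose_middle_switch(candidates, in_port, out_port):
    """Pick, among middle switches with spare capacity, one minimizing GBus detours.

    in_port / out_port are the port indices within the first- and third-stage
    switches; a cell arriving from middle switch k enters the third-stage switch
    on input port k, so detours depend only on module membership."""
    def detours(k):
        first = (in_port // PORTS_PER_MODULE) != (k // PORTS_PER_MODULE)
        third = (k // PORTS_PER_MODULE) != (out_port // PORTS_PER_MODULE)
        return int(first) + int(third)
    return min(candidates, key=detours)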
3.2 ATM Data Switching
Data transfers from source (some input port) to destination (some output port) are performed based on the
Asynchronous Transfer Mode (ATM) switching paradigm. Assuming 32-bit wide intra-stage data
transfers, Figure 4 shows the structure of an ATM cell and its parallel transmission from an input queue to
an output queue within a switch module (CSM). The small, fixed cell size and the statistical multiplexing
provide better cell delay variation (CDV) behavior and thus better QoS.
Cell Structure
The cell header contains information such as the cell's Virtual Path Identifier (VPI, 16 bits), Virtual
Channel Identifier (VCI, 8 bits), Generic Flow Control information (GFC, 4 bits), Payload Type (PT, 3
bits), Cell Loss Priority (CLP, 1 bit), Header Error Check (HEC, 8 bits) and Payload Error Check (PEC,
24 bits). The PEC field is not part of the standard ATM cell format, but we have included it for enhanced
reliability and in order for each cell to begin at a 32-bit word boundary.
[Fig. 4. The proposed structure of the ATM cell (14 32-bit words): word 1 carries the routing tag (6-bit
middle-stage switch, 5-bit third-stage switch, 5-bit output port), the 8-bit VCI, the 4-bit GFC, the 3-bit PT
and the 1-bit CLP; word 2 carries HEC-0 and PEC-0..PEC-2; words 3-14 carry the 48 payload octets
(Octet-01..Octet-48).]
VPI-based Self-Routing
In the proposed cell format, the VPI field contains all the routing information the cell needs to reach its
destination output port at the third stage of the switching network. The first 6 bits of this field (VPI[0:5])
denote the selected middle-stage switch the cell is chosen to go through; this is also the output port number
to which the cell should be switched inside the first-stage switch. The next 5 bits of the VPI field
(VPI[6:10]) denote the destination third-stage switch, or equivalently, the output port number of the
middle-stage switch. Finally, the last 5 bits of the VPI field (VPI[11:15]) denote the output port number of
the third-stage switch. The VCI field contains a virtual circuit ID number, which identifies a virtual circuit
(or service) relative to its corresponding ATM virtual path. Thus, the Absolute Service Identifier (ASI), an
18-bit identifier, is derived from the concatenation of VPI[6:15] and VCI.
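A possible decoder for this self-routing layout is sketched below, taking bit 0 as the most significant bit of the 16-bit VPI field (an assumption; the report does not state the bit ordering explicitly):

def decode_routing(vpi: int, vci: int):
    middle_switch = (vpi >> 10) & 0x3F   # VPI[0:5]  : middle-stage switch (6 bits)
    third_switch  = (vpi >> 5)  & 0x1F   # VPI[6:10] : third-stage switch (5 bits)
    out_port      = vpi         & 0x1F   # VPI[11:15]: output port of the third-stage switch
    asi           = ((vpi & 0x3FF) << 8) | vci   # ASI = VPI[6:15] ++ VCI, 18 bits
    return middle_switch, third_switch, out_port, asi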
The functionality of the aforementioned fields, with respect to their QoS significance, is described below:
- VPI: It carries routing information, which is used to self-route the cell through the switching network.
The routing function can be implemented such that the overall CTD and CDV are reduced for those classes
of services that have the strictest CDVT requirements.
- VCI: It can be used to implement VC-based prioritization and “per VC” scheduling within the
switching fabric.
- GFC: This field is used to carry priority information based on the specific route a cell follows. In
particular, cells that are routed through global FIFOs (i.e., they migrate between modules of the same
switch, taking a detour via the global bus) are penalized with higher CTD and CDV. To compensate for
this overhead, these cells need to be treated with higher priority, inversely proportional to their penalty.
Moreover, simulation results indicate that, due to the increased congestion in the early stages, the earlier
the penalty occurs, the higher it is. Therefore, the assigned priorities must be given as shown in Table 1:
(X: no detour, ✓: with detour)

1st-stage GBus detour | 2nd-stage GBus detour | 3rd-stage GBus detour | Priority (bigger = higher)
X | X | X | 0
X | X | ✓ | 1
X | ✓ | X | 2
✓ | X | X | 3
X | ✓ | ✓ | 4
✓ | X | ✓ | 5
✓ | ✓ | X | 6
✓ | ✓ | ✓ | 7

Table 1. Assigned priorities based on Global Bus detours per stage.
This prioritization scheme is called “Penalty Compensation Prioritization”, or PCP and can be used to
provide more consistent QoS parameter values, by compensating for architecture-imposed QoS penalties.
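The PCP assignment of Table 1 can be expressed directly as a lookup on the per-stage detour flags; the sketch below simply mirrors the table:

# Penalty Compensation Prioritization (PCP): map the global-bus detour flags of
# a route (1st, 2nd, 3rd stage) to the scheduling priority of Table 1.
PCP_PRIORITY = {
    (0, 0, 0): 0,  (0, 0, 1): 1,  (0, 1, 0): 2,  (1, 0, 0): 3,
    (0, 1, 1): 4,  (1, 0, 1): 5,  (1, 1, 0): 6,  (1, 1, 1): 7,
}

def pcp_priority(detour_1st: bool, detour_2nd: bool, detour_3rd: bool) -> int:
    """Higher value = higher priority (the more penalized the route, the higher)."""
    return PCP_PRIORITY[(int(detour_1st), int(detour_2nd), int(detour_3rd))]

assert pcp_priority(False, False, False) == 0   # no detours: lowest priority
assert pcp_priority(True, True, True) == 7      # detour in every stage: highest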
- PT: The “Payload Type” field encodes the service class type in 3 bits. This information is used to
perform “per-service-class” prioritization and scheduling. Values 0-5 encode up to 6 service-class types
(i.e., user data), PT=6 indicates a signalling data cell and PT=7 indicates a maintenance data cell.
- CLP: This 1-bit field indicates whether the specific cell should be discarded during network
congestion. CLP is set to 0 to indicate higher priority.
- HEC, PEC: The main function of these fields is to detect multiple-bit errors and correct single-bit
errors in the cell header and cell payload, respectively.
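For reference only: the standard ATM HEC is a CRC-8 over the four header octets with generator polynomial x^8 + x^2 + x + 1. The report does not state which codes its HEC and (non-standard) PEC fields actually use, so the following sketch is merely indicative:

def hec_crc8(header_octets, poly=0x07):
    """Bit-serial CRC-8 (x^8 + x^2 + x + 1) over the header octets."""
    crc = 0
    for octet in header_octets:
        crc ^= octet
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    # Note: ITU-T I.432 additionally XORs the final HEC value with 0x55.
    return crc

print(hex(hec_crc8([0x01, 0x23, 0x45, 0x67])))   # example header octets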
Data Buffering Scheme
Both input and output buffering are used, in order to isolate the switching functions of each stage. Internal
blocking is drastically reduced, or even avoided, by emptying the FIFO buffers in each stage as fast as
possible. This is achieved by speeding up all inter-stage (serial) cell transfers and also all intra-stage (bus)
transfers.
Priority-based arbitration of shared busses and queue multiplexing are used to enforce QoS requirements,
by implementing various types of priority class scheduling, including “per class” and “per VC”
scheduling.
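One possible arbitration discipline is strict priority over per-class queues, as sketched below (illustrative only; the hardware arbiter may differ, e.g., by adding round-robin among queues of equal priority):

from collections import deque

def arbitrate(queues_by_priority):
    """Serve the highest-priority non-empty queue feeding the shared bus."""
    for prio in sorted(queues_by_priority, reverse=True):
        if queues_by_priority[prio]:
            return queues_by_priority[prio].popleft()
    return None   # all queues empty: the bus idles this cycle

queues = {0: deque(["best-effort cell"]), 7: deque(["triple-detour cell"])}
print(arbitrate(queues))   # -> "triple-detour cell" (PCP priority 7 goes first)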
Simulation results indicate that small buffers result in better switch performance (i.e., lower CTD and
CDV) (see Section 5). In order to compensate for hot-spot effects, progressively larger buffer space in each
stage (i.e., larger FIFOs in stage S+1 than in stage S) may be appropriate, but this issue is still under
investigation.
4. Resource Management for Congestion Control
4.1 Congestion Control
Congestion can be defined as the state of network elements in which the network cannot provide the
agreed-upon QoS to already existing or newly requested connections. In other words, congestion occurs
when the offered load from the user to the network exceeds the available network resources to guarantee
the required QoS. Some of the network resources that can cause congestion are switch ports, buffers,
ATM adaptation layer processors, and CAC processors. Two types of congestion may occur in an ATM
network: Long-term congestion, which is caused by more traffic load than the network can handle; and
short-term congestion, which is caused by burstiness in the traffic. Techniques used for congestion control
include admission control, resource reservation, and rate-based congestion control.
1) Admission Control: In this scheme, once congestion is detected by the network, no new virtual
connection circuits are accepted. This technique is widely used and easy to implement. For example, in a
telephone network, once congestion occurs, no more new phone calls can be made. Admission control
may allow new connections if the network can find a route that avoids the network congestion ports and
has the demanded QoS. If no such route can be found, the connection will fail.
2) Resource Reservation: Another congestion control scheme that is similar to admission control is to
reserve resources in advance. In this algorithm, the network and the user establish an agreement about the
traffic flow, QoS, peak bandwidth, average bandwidth, and other parameters. When a virtual connection is
established, the required network resources are reserved with it. Hence, congestion rarely occurs. On the
other hand, its main disadvantage is the low utilization of the bandwidth as not all of the allocated
resources for a connection may be used at the same time.
3) Rate-based Congestion Control: Due to the real-time nature of the data, the congestion problem cannot
be solved by throttling CBR and VBR traffic. When the same problem occurs in UBR, excess cells are
simply dropped. Since it is possible to inform the sender to slow down ABR traffic, congestion control can
be applied to it. Three procedures for rate-based congestion control have been considered:
a) Whenever the sender wants to transmit a burst of data, an acknowledgement is needed before bursting.
This procedure was rejected because it introduces a long delay before sending.
b) When congestion occurs, the sender is notified with a notifier cell and then has to reduce its cell
flow by half. This procedure was also rejected because the notifier cell may be lost.
c) The congested cells are discarded in a highly selective way: the switch scans for the end of the
incoming packet and discards all cells of the next packet. This procedure was also rejected because
the discarded packet may not be the packet causing the congestion.
The accepted procedure is that after N data cells, each sender sends a resource management (RM) cell.
This cell travels along the cells' path and is treated in a special manner throughout the switches. When the
RM cell arrives at the destination, it is examined, updated, and returned to the sender.
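This closed loop can be sketched roughly as follows; Nrm = 32 (the usual ATM Forum default) and the explicit-rate field are assumptions made for the sake of the example, not details taken from the report:

NRM = 32   # assumed: one RM cell for every 31 data cells

class AbrSource:
    """Simplified ABR source: inserts RM cells and obeys the returned rate."""
    def __init__(self, initial_rate):
        self.allowed_rate = initial_rate
        self.sent_since_rm = 0

    def next_cell(self, payload):
        if self.sent_since_rm == NRM - 1:
            self.sent_since_rm = 0
            return {"type": "RM", "explicit_rate": self.allowed_rate}
        self.sent_since_rm += 1
        return {"type": "data", "payload": payload}

    def on_returned_rm(self, rm_cell):
        # Switches along the path may have lowered the explicit rate.
        self.allowed_rate = min(self.allowed_rate, rm_cell["explicit_rate"])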
Reactive congestion control may be used to provide more effective QoS. Reactive control is achieved via
network feedback information collected by network management and maintenance functions. “Probe
cells” (with PT=7), like RM cells, are injected into the network at one end and are collected at the other
end, having registered QoS-related measures in between. This information can then be used to perform
traffic management, evaluate the congestion levels of the switching fabric and estimate the cell loss
probability.
Additionally, preventive congestion control may be applied by means of multiple priority queues, which
are arbitrated over each shared bus, by means of priority arbiters. Cell streams are FIFO-queued based on
their header information discussed above.
4.2 Network Resource Management
The main role of network resource management is to allocate network resources to different applications
according to their service characteristics. Virtual paths can be used as a tool to achieve traffic control and
resource management. Virtual path connections (VPCs) are similar to virtual circuits. Hence, traffic
control capabilities such as Call Admission Control (CAC) and Usage/Network Parameter Control
(UPC/NPC) are simplified. Different types of network end connections lead to the following cases:
1) User-to-user applications: Specifying the QoS of the connection between two user network interfaces
is the responsibility of the end users, not the network.
2) User-to-network applications: Because the VPC is a connection between a network user and the
network itself, the network has to know the QoS of the internal virtual circuits.
3) Network-to-network applications: The VPC is a connection between two networks. The network has
to know the QoS of the virtual circuits.
The performance of virtual circuit connections, and hence of the virtual path connections, depends mainly
on the allocation of network resources. The allocation of network resources affects quality of service
parameters such as cell loss ratio, peak-to-peak cell delay variation and maximum cell transfer delay.
Virtual circuits that are given the same quality of service will behave similarly. The network has to
allocate its resources to different virtual paths according to the following algorithms:
a) Aggregate peak demand: In this algorithm, the total network resources allocated to one virtual path
equal the total peak resource requirements of its virtual circuits. This scheme has the advantage of
ensuring that all virtual circuits contained in the virtual path can operate under their peak conditions. On
the other hand, it does not provide full utilization of network resources under normal operation.
b) Statistical Multiplexing: In this algorithm, the total network resources allocated to one virtual path
are nearly equal to the average capacity of all its virtual circuits. The main disadvantage of this scheme is
that it causes increased cell delay variation and greater cell transfer delay; its advantage is better
utilization compared to aggregate peak demand.
c) Traffic Shaping: Traffic shaping enhances traffic flow, reduces cell delay, and allows better network
resource allocation. This is accomplished by properly spacing the cells of virtual circuit connections.
Traffic shaping has to ensure the cell sequence integrity of an ATM connection. It can be used by both the
network operator and the user, and it is usually used for cost-effective network dimensioning. Examples of
traffic shaping algorithms include the leaky bucket (closely related to the token bucket), which controls the
flow of compliant cells.
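A token-bucket shaper of the kind mentioned above can be sketched in a few lines; the rate and bucket-depth parameters are illustrative:

class TokenBucket:
    """Cells conform only while tokens are available; tokens accrue at the
    contracted rate and the bucket depth bounds the permitted burst size."""
    def __init__(self, rate_cells_per_sec, bucket_depth_cells):
        self.rate, self.depth = rate_cells_per_sec, bucket_depth_cells
        self.tokens, self.last_t = bucket_depth_cells, 0.0

    def conforms(self, now_sec):
        self.tokens = min(self.depth, self.tokens + (now_sec - self.last_t) * self.rate)
        self.last_t = now_sec
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True    # compliant cell: forward it
        return False       # non-compliant: delay (shape) or tag/discard (police)

shaper = TokenBucket(rate_cells_per_sec=1000, bucket_depth_cells=10)
print(shaper.conforms(0.0), shaper.conforms(0.0005))   # True True (within the burst allowance)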
5. QoS Performance Analysis
Analytical models of the proposed switch, as well as simulation studies, are very useful tools for estimating
the performance of the switch, its QoS tolerances and limitations. Performance analysis results can be
used to define optimal operating points of the switch. For example, if simulation results indicate that for
switch capacity utilizations greater than 90% the cell loss probability (CLP) becomes unacceptable (e.g.,
> 10^-4), then the CAC function is set to accept new calls only if the overall capacity utilization is less than
90%.
Another set of performance values that can be derived by analysis and/or simulation, and that is useful for
determining the appropriate operating points of the switch, includes typical and maximum (or x-percentile)
values of cell latency and cell delay variation. Knowing these values, we can accept QoS requirements
and provide QoS guarantees with a given degree of confidence.
In the following, we present such a set of simulation results, which is part of a more thorough study of how
specific QoS parameters are affected by architectural choices in the switch design. In particular, we study
the effect of internal FIFO sizes on the observed cell delay variation (CDV).
5.1 Simulation Results
We have simulated an 8x16x8 Clos switching network with continuous, CBR traffic of multiple ATM
cell streams. The simulated configuration consisted of 128 input ports and 128 output ports. Assuming 155
Mbps i/o links, this implies a total switching capacity of approximately 20 Gbps, while assuming 622
Mbps i/o links the total switching capacity becomes approximately 80 Gbps.
We use the “cell cycle” as a time unit, in order to make our results independent of the assumed i/o rate. A
cell cycle is the time needed to transmit one cell at the i/o rate (e.g., ~3 usec for 155 Mbps (OC-3) i/o
ports). All simulation experiments were run for 100,000 cell cycles and in all cases steady state was
reached early enough.
In order to evaluate the effect of internal buffer size on the cell delay (or latency) variation (CDV), i.e.,
how the switch architecture “shapes” the CBR traffic, we ran several experiments varying the sizes of the
local input/output FIFOs (LiQ, LoQ) (parameter LQ) and global input/output FIFOs (GiQ, GoQ)
(parameter GQ). In particular, we tested various combinations of LQ = {2, 4, 8, 16} cells and GQ = {4, 6,
8, 10, 12, 16, 20, 24} cells and measured the internal cell blocking, the goodput [1] and the end-to-end cell
latency. Next, we present the results for two such cases: one exhibiting strong congestion effects, which
shows more clearly the effect of FIFO size on CDV, and one with near-zero internal cell blocking, which
shows that it is possible to provide strong CDV guarantees for CBR services. In both cases, interesting
conclusions can also be drawn regarding the optimum sizes of the internal FIFOs.
[1] Goodput is defined as the ratio of the transferred cell rate to the offered cell rate. Goodput is 100% if no
internal blocking occurs, i.e., all incoming cells are transferred through the switching fabric unhindered.
Goodput is 50% if, in the time duration in which two incoming cells are offered, only one actually exits the
switching fabric, due to internal blocking.
[Fig. 5. Cell latency distribution for LQ=4 cells and various Global FIFO sizes (GQ = 4, 6, 8, 10, 12). x-axis: cell latency in cell cycles (1 cycle = 3 usec); y-axis: percentage of transferred cells.]
Case I: Internal bus speedup = 8 x (i/o rate), Inter-stage link speedup = 1 x (i/o rate)
Due to the relatively small speedups, this case exhibited high internal cell blocking, thus yielding goodput
values between 63% and 64% for all FIFO sizes simulated. Figure 5 shows cell delay distribution curves
for LQ=4 cells and GQ={4, 6, 8, 10, 12} cells. Similar results were observed for LQ=2 and the same GQ
values, thus suggesting that the combination of LQ=2 and GQ=4 is the best choice, if we aim at reducing
CDV. Most of the curves have two “peaks”: the former seems to correspond to the LiQ/LoQ delays and
the latter to the GiQ/GoQ delays. Since most of the conflicts are resolved during the first stage, the
two-peak pattern does not repeat three times.
[Fig. 6. Cell latency distribution for LQ=16 cells and various Global FIFO sizes (GQ = 8, 12, 16, 20, 24). x-axis: cell latency in cell cycles (1 cycle = 3 usec); y-axis: percentage of transferred cells.]
Figure 6 shows cell delay distribution curves for LQ=16 cells and GQ={8, 12, 16, 20, 24} cells. Similar
results were observed for LQ=8 and the same GQ values, thus suggesting that the combination of LQ=8
and GQ=8 seems to be the best choice.
In general, we observe that in a congested situation, larger FIFO sizes do not result in higher throughput
(or goodput), but they certainly increase CTD and CDV.
Case II: Internal bus speedup = 16 x (i/o rate), Inter-stage link speedup = 2 x (i/o rate)
In this case, due to the doubling of the speedups, the measured goodput reached 100% (i.e., no congestion
effects were observed). The cell latency distributions for the same sets of FIFO-size parameters as in Case I
are shown in Table 2. The table shows that for LQ=8 we obtain a very narrow cell latency distribution
(Probability[cell latency > 6 cell cycles] = 0). Goodput and internal cell blocking measurements indicate
that for all four cases goodput is near 100%, and for LQ=8 and LQ=16 the internal cell blocking is
practically zero (i.e., no cell blocking occurred for the duration of the simulation).
PARAMETERS | 3-4 cycles | 4-5 cycles | 5-6 cycles | 6-7 cycles | >7 cycles
LQ=2, GQ=4,8,10,12,16 | ~61.19% | ~32.47% | ~5.48% | ~0.86% | 0.00%
LQ=4, GQ=4,8,10,12,16 | 73.55% | 24.57% | 1.72% | 0.16% | 0.00%
LQ=8, GQ=8,12,16,20,24 | 73.40% | 24.73% | 1.87% | 0.00% | 0.00%
LQ=16, GQ=8,12,16,20,24 | 73.40% | 24.73% | 1.87% | 0.00% | 0.00%

Table 2. Cell latency distribution results for 16x internal bus speedup and 2x inter-stage link speedup [Goodput ~100%].
6. Conclusions
In this paper, we propose the design of a switch as a scalable network of switching elements. This concept
has the advantage that we can apply traditional network design and analysis principles during the design of
the switch. In this way, the switch itself, as a component of the macro-network (LAN/WAN), can carry
QoS properties through all OSI layers in a more natural way.
To demonstrate this concept, we discuss QoS issues and their implementation on the layer-2 design of a
novel 3-stage Clos switching architecture. Since internal buffering plays a crucial role in the performance
of the switch, we study how the proposed buffer design choices affect QoS properties, such as cell delay
and cell delay variation. Simulation results and analysis are used to support our claims.
Traffic management and resource management are essential for congestion prevention and control, and
related issues are briefly touched upon.
Clearly, many more issues can be discussed, and our analysis in this paper does not exhaust the subject. It
does, however, provide good motivation and insight for future work. Our immediate
future plans include the extension of our current study to other QoS measures and properties by means of
simulation and analytical techniques, as well as the further evaluation of other architectural models and
switch designs in a similar manner.
7. References
[1] Special issue on “Flow and congestion control”, IEEE Comm. Mag., vol.34, no.11, Nov. 1997.
[2] Special issue on “Bandwidth allocation in ATM networks”, IEEE Comm. Mag., vol.35, no.5, May
1997.
[3] U. Black, QoS in Wide Area Networks, Prentice Hall, January 2000.
[4] D. Black, W. Leitner, Building Switched Networks: Multilayer Switching, QoS, IP Multicast, Network
Policy, and Service Level Agreements, Addison Wesley Longman, January 1999.
[5] A. Croll, E. Packman, Managing Bandwidth: Deploying QoS across Enterprise Networks, Prentice
Hall, April 1999.
[6] D. McDysan, QoS and Traffic Management in IP and ATM Networks, McGraw-Hill, November 1999.
[7] P. Ferguson, G. Huston and C. Long, Quality of Service: Delivering QoS on the Internet and in
Corporate Networks, Wiley, John & Sons, January 1998.
[8] M. Murata, “Requirements on ATM Switch Architectures for Quality-of-Service Guarantees”, IEICE
Trans. Communications, vol. E81-B, no.2, pp.138-151, February 1998.
[9] J.S. Turner and N. Yamanaka, “Architectural Choices in Large Scale ATM Switches”, IEICE Trans.
on Communications, vol. E81-B, no.2, pp.120-137, February 1998.
[10] Related WWW link: http://www.qosforum.com/docs/faq/