Application Note 162
Increasing System Bandwidth with CDS
June 2001, ver. 1.0

Introduction
As system speeds have increased, semiconductor and board designers
have turned to source-synchronous clocking and differential signaling to
improve chip-to-chip data transfer rates. While source-synchronous
clocking does meet this need, it is not very flexible. Designers must closely
match the clock and data line lengths, complicating board design. Every
chip-to-chip data transfer must have a clock as well as data lines, so every
connection introduces a new clock domain. A device that receives data
from several devices must have dedicated circuitry for each connection
and manage data flow among several clock domains.
A new clocking technique called clock-data synchronization (CDS)
combines the advantages of traditional synchronous clocking and source-synchronous clocking by providing high-speed data transfer without the
need to closely match clock and data lines. Unlike clock-data recovery
(CDR), there is no need to encode or scramble data to meet any kind of
run-length requirement. This application note discusses how CDS works
and how it can be used in a variety of systems.
The look-up table (LUT)-based APEX II device family incorporates CDS
circuitry in its differential I/O circuitry. These devices offer four banks of
high-speed differential I/O pins: two output banks and two input banks.
Each bank contains 18 channels and one clock and supports LVDS,
LVPECL, PCML, and HyperTransport I/O standards at up to one gigabit
per second (Gbps). The two input banks incorporate CDS, providing the
advantages described below.
Altera Corporation, A-AN-162-01

Source-Synchronous Clocking
Source-synchronous clocking has become a popular technique for high-speed designs. With this technique, the transmitting device sends a clock
along with the data. The advantage of this approach is that the maximum
performance is no longer computed from the clock-to-out delay,
propagation delay, and setup times of the devices and board. Instead, the
maximum performance is related to the maximum edge rate of the driver
and the skew between the data signals and the clock signals. Using this
technique, data can be transferred at a 1-Gbps rate (1-ns bit period) even
though the propagation delay from transmitter to receiver may exceed 1
ns. Figure 1 shows an example of source-synchronous transfer.
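The timing argument above can be sketched with a little arithmetic. This is a simplified illustration, not an APEX II specification: the function names, the centered-clock model, and the 0.4-ns sampling window are assumptions made for the example. It shows that capture depends on the bit period and the clock-to-data skew, while the flight time shared by both traces drops out.

```python
def bit_period_ns(rate_gbps):
    """Bit period in nanoseconds for a data rate given in Gbps."""
    return 1.0 / rate_gbps


def max_skew_ns(rate_gbps, window_ns):
    """Largest clock-to-data skew the receiver can tolerate.

    Simplified model (an assumption): the clock edge is nominally centered
    in the data eye, and window_ns is a hypothetical combined setup-plus-hold
    requirement. Jitter and edge-rate effects are ignored.
    """
    return (bit_period_ns(rate_gbps) - window_ns) / 2.0


def transfer_ok(rate_gbps, data_delay_ns, clock_delay_ns, window_ns):
    """True if the skew between the data and clock traces fits the budget.

    Only the difference between the two delays matters; the flight time
    common to both traces cancels out.
    """
    skew = abs(data_delay_ns - clock_delay_ns)
    return skew <= max_skew_ns(rate_gbps, window_ns)


# 1 Gbps with about 2.5 ns of flight time: fine while the traces are matched.
matched = transfer_ok(1.0, data_delay_ns=2.5, clock_delay_ns=2.4, window_ns=0.4)
# Same rate with 0.5 ns of mismatch: the skew budget is exceeded.
mismatched = transfer_ok(1.0, data_delay_ns=2.5, clock_delay_ns=2.0, window_ns=0.4)
```

In this model the propagation delay is more than twice the bit period in both cases; only the mismatch between the traces decides whether the transfer works.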
Figure 1. Source-Synchronous Transfer
[Figure: a transmitter drives Data1, Data2, and Clock to a receiver. In a source-synchronous system, trace lengths must be matched to minimize skew between data traces and the clock trace.]
However, there are some drawbacks to the source-synchronous clocking
technique. The board design must be tightly controlled so that there is
minimal skew between the data and clock paths. Additionally, each set of
data driven from a device must be sent with a clock signal. Therefore, if a
device receives data from four other devices, that device must also receive
four clocks. These clocks can complicate the design of the receiver, as the
design now has to manage four clock domains using first-in first-out
(FIFO) buffers.
Clock-Data Synchronization
CDS is a new solution to this design challenge. With CDS, the receiving
device can synchronize multiple incoming streams of data to its own
system clock. This technique simplifies board design because skew
between data channels and the clock is no longer an issue. A receiver can
use CDS to correct any amount of clock-to-channel or channel-to-channel
skew. CDS allows designers to easily implement various system
topologies. Multiple devices can now feed into one receiving device,
which processes all incoming data in one clock domain. Figure 2 shows an
example of a system using CDS.
Figure 2. System Using CDS
[Figure: APEX II Devices 1, 2, and 3, each labeled "1 to 36", connect to APEX II Device 4; a Clock Signal is also shown.]
CDR has been used to address similar skew and topology requirements.
CDR has an advantage over CDS: the data transmitters can run from
separate crystals, because the receiver recovers an individual clock
from each incoming data channel. Every channel can have phase variation
as well as frequency variation within a specified limit. Although CDR
provides flexibility, the receiver design is more complicated because
every data channel has its own clock domain. With CDS, the data channels
may vary in phase, but must all be precisely the same frequency. To
ensure that all channels are the same frequency, all transmitters must be
clocked from the same system clock.
Compared to CDR, CDS has an advantage in data transmission efficiency.
For a CDR receiver to recover the clock and data, the data channel must
periodically toggle. This requirement is known as the maximum run
length. For example, a common CDR technique is to use 8B/10B
encoding, which ensures that more than five ones or five zeros are never
transmitted consecutively. However, this encoding scheme creates
inefficiency on the data channel: a 1.25-Gbps data channel carries only
1 Gbps of payload once the stream is 8B/10B-encoded. CDS does not have a
run-length requirement, so there is no need to encode the data stream.
Therefore, the entire bandwidth of the transmission channel can be used
for the system data; a 1.25-Gbps data channel can carry 1.25 Gbps
of data.
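The efficiency comparison is simple arithmetic, worked here as a short sketch. The function name is illustrative; the 1.25-Gbps line rate and the 8B/10B ratio come from the text above.

```python
def payload_rate_gbps(line_rate_gbps, data_bits, coded_bits):
    """Payload rate when every data_bits of payload occupy coded_bits on the line."""
    return line_rate_gbps * data_bits / coded_bits


# CDR with 8B/10B: every 8 payload bits are sent as a 10-bit code.
cdr_payload = payload_rate_gbps(1.25, data_bits=8, coded_bits=10)

# CDS: no line code, so every transmitted bit is payload.
cds_payload = payload_rate_gbps(1.25, data_bits=1, coded_bits=1)

# Fraction of the channel lost to 8B/10B encoding.
coding_overhead = 1 - cdr_payload / cds_payload
```

With these numbers the encoded channel delivers 1.0 Gbps of payload and the unencoded channel delivers the full 1.25 Gbps, a 20% difference.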
CDS Implementation
The APEX II CDS receiver works by aligning itself to a known training
pattern the transmitter sends over the data channels. When sending the
training pattern, the transmitter also enables a CDS pin on the receiving
device to synchronize the data to the system clock. The receiver’s circuit
captures the pattern with multiple phases of the system clock and then
selects whichever clock phase correctly captured the pattern. After the
training pattern is sent, the receiver uses the selected clock phase to
capture the actual data. Figure 3 shows the circuitry that selects which
phase of the clock captures the data.
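The phase-selection idea can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the APEX II circuit: the four-phase clock, the eight-bit training pattern, the one-slot "unstable" region at each bit boundary, and all function names are hypothetical. Skew of a whole bit or more shows up as a rotated copy of the pattern, which this model accepts, since it only looks for a phase that samples stable data.

```python
PHASES = 4                              # assumed clock phases per bit period
TRAINING = [1, 0, 1, 1, 0, 0, 1, 0]     # hypothetical training pattern


def capture(pattern, skew_slots, phase, unstable_slots=1):
    """Sample a repeating pattern once per bit period with one clock phase.

    Time is counted in slots of 1/PHASES of a bit period. The data arrives
    skew_slots late, and the first unstable_slots of every bit are treated
    as unreliable (returned as None).
    """
    n = len(pattern)
    out = []
    for k in range(n):
        idx, pos = divmod(k * PHASES + phase - skew_slots, PHASES)
        out.append(None if pos < unstable_slots else pattern[idx % n])
    return out


def rotations(pattern):
    """All whole-bit rotations of the pattern."""
    return [pattern[i:] + pattern[:i] for i in range(len(pattern))]


def select_phase(pattern, skew_slots):
    """Return the first phase that captures a clean copy of the pattern."""
    valid = rotations(pattern)
    for phase in range(PHASES):
        if capture(pattern, skew_slots, phase) in valid:
            return phase
    return None


# The selected phase changes with the skew, but one is found for every skew.
chosen = {skew: select_phase(TRAINING, skew) for skew in range(12)}
```

In this model a usable phase exists for every skew value, which mirrors the claim that the receiver can adapt to arbitrary clock-to-channel skew; a real receiver would also have to account for jitter and metastability, which are ignored here.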
Figure 3. CDS Implementation
[Figure: Input Data feeds several D registers; a PLL (1) driven by the System Clock provides phase outputs (0° and 90° are labeled); Control Logic selects which register supplies the Synchronized Data.]
Note to Figure 3:
(1) PLL: phase-locked loop.
When using source-synchronous clocking, the data stream can be
automatically byte-aligned. For example, if the data stream is eight times
as fast as the clock, the most significant bit (MSB) of each byte is the data
transmitted during the third bit period after the clock. This relationship
holds because skew between clock and data is limited. There is no limit on
skew between clock and data in a CDS system, so the designer
cannot use the relationship between clock and data to byte-align the
data stream. Instead, in a CDS system, a byte-alignment pattern is sent to the
receiver after the training pattern. The receiving device uses this pattern
to byte-align the data.
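Byte alignment against a known pattern can be sketched as a search over bit offsets. The alignment pattern, the helper names, and the MSB-first bit order below are assumptions chosen for illustration; the note does not specify the pattern the APEX II receiver uses.

```python
ALIGN = [1, 0, 1, 1, 1, 1, 0, 0]   # hypothetical byte-alignment pattern


def bits_of(byte):
    """Expand one byte into a list of bits, MSB first."""
    return [(byte >> shift) & 1 for shift in range(7, -1, -1)]


def find_byte_start(bits, align_pattern):
    """Return the index of the first bit after the alignment pattern, or None."""
    n = len(align_pattern)
    for i in range(len(bits) - n + 1):
        if bits[i:i + n] == align_pattern:
            return i + n
    return None


def to_bytes(bits, start):
    """Group bits into MSB-first bytes, beginning at bit index start."""
    out = []
    for i in range(start, len(bits) - 7, 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return out


# Three bits of arbitrary offset, then the alignment pattern, then two data bytes.
stream = [0, 0, 0] + ALIGN + bits_of(0xA5) + bits_of(0x3C)
start = find_byte_start(stream, ALIGN)
aligned = to_bytes(stream, start)
```

Once the offset is known, every later byte boundary follows from it, so the search only has to run while the alignment pattern is being sent.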
Transmitting and processing the training and byte-alignment sequence
takes only a few clock cycles, and is performed once at system power-up. If multiple transmitting devices are on the same board, they are
subject to the same voltage and temperature variation, so skews between
them will not vary and retraining is not necessary. All transmitting
devices send the training pattern simultaneously so that the receiver can
self-adjust for all skews simultaneously. However, if the transmitting
devices are on different boards or subsystems, they may experience
different voltage and temperature variation, and the design may need to
periodically resend the training pattern depending on the variation that
the system sees. Although additional clock cycles are necessary to resend
the training pattern, a CDS system is still more efficient than CDR system encoding schemes.
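The cost of periodic retraining can be estimated with a rough calculation. The training length and retraining interval below are hypothetical values chosen only to show the shape of the trade-off; the note says only that the sequence takes a few clock cycles.

```python
EIGHT_B_TEN_B_EFFICIENCY = 8 / 10   # payload fraction of an 8B/10B-encoded channel


def cds_efficiency(training_bits, bits_between_training):
    """Payload fraction when the training sequence is resent periodically.

    bits_between_training is the total number of bit times from the start
    of one training sequence to the start of the next.
    """
    return (bits_between_training - training_bits) / bits_between_training


# Hypothetical: a 64-bit training and alignment sequence every million bit times.
occasional = cds_efficiency(64, 1_000_000)

# Retraining would have to consume 20% of the channel to match 8B/10B overhead.
break_even = cds_efficiency(200, 1_000)
```

Under these assumptions retraining costs well under 0.01% of the channel, compared with the fixed 20% taken by 8B/10B encoding.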
CDS System Applications
CDS improves system efficiency in many ways. It can correct for skew that
cables and connectors introduce to data channels. CDS also adds
flexibility to overall system designs. Two examples are implementing a
switched backplane and breaking up large designs into multiple devices.
Many systems, including communications and storage systems,
incorporate a backplane to transmit data from one subsystem to another.
Historically, these designs have used a shared backplane (such as PCI).
However, the need for faster data transfer has revealed limitations of this
approach. A shared backplane can only support one transaction at a time,
and the bus speed cannot increase fast enough to support the data
requirements.
The switched backplane approach is a solution to higher data transfer
requirements. Rather than sharing a common bus, each card
communicates on a point-to-point link to a master switch. The switch
transfers the data to the destination point. Differential I/O standards are
well-suited to this architecture, as each point-to-point link can run at very
high speeds. Furthermore, since the bus is not shared, multiple
transactions can be executed simultaneously, as shown in Figure 4.
Figure 4. Switched Backplane Application
[Figure: APEX II Devices 1, 2, 3, and 4, with a Clock Signal.]
With source-synchronous clocking, every point-to-point link must have
its own clock. The master switch must implement multiple clock domains
and manage data and clock skew across the backplane. CDS is a good
solution to these concerns because all cards use the same system clock. The master
switch can use CDS to correct for any skews caused by system clock skew,
device-to-device variation, or data skew. Using CDS for this architecture
simplifies the overall system design by keeping the entire system
synchronized to one clock. The CDS circuitry in the APEX II device family
provides the flexibility necessary to easily implement a switched
backplane system.
Another example of a CDS application is design partitioning. Many
complex designs, such as packet processing, cannot easily fit into one
device or are partitioned for other reasons. For example, while software
running on network processors is useful for general packet processing,
ASICs or programmable logic devices (PLDs) are often used for
accelerating specific functions. Network processors and PLDs implement
different functions within the system. For example, classification and
queuing control are important to assure quality of service, and encryption
is important for security. These functions can be implemented at a higher
speed in a PLD than in a network processor. The size of these functions
may prevent them all from being incorporated into one PLD. Historically,
partitioning these functions into multiple devices has resulted in very
inefficient use of the devices. Each individual device would usually use
up all its I/O pins before using all of its logic. High-speed differential
interconnects in conjunction with CDS enable a very high bandwidth data
transfer from device to device so the required data transfer from chip to
chip can be implemented using only a few I/O pins.
Figure 5 shows a block diagram of an OC-192 data path. In this design, the
packet processing is divided between a network processor and multiple
PLDs. CDS is used to implement high-speed data transfer among the
multiple devices that make up the packet-processing function.
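The pin-count argument can be made concrete with a rough estimate. This is a sketch with assumed numbers: the roughly 10-Gbps figure approximates an OC-192 stream, the 1-Gbps channel rate is the maximum quoted earlier in this note, and the 100-Mbps single-ended pin rate is a hypothetical comparison point. Clock pins and control signals are not counted.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def differential_pins(bandwidth_mbps, channel_rate_mbps=1000):
    """Data pins needed with differential channels (two pins per channel)."""
    return 2 * ceil_div(bandwidth_mbps, channel_rate_mbps)


def single_ended_pins(bandwidth_mbps, pin_rate_mbps):
    """Data pins needed with single-ended signaling (one pin per signal)."""
    return ceil_div(bandwidth_mbps, pin_rate_mbps)


# Moving roughly 10 Gbps between two devices.
fast_pins = differential_pins(10_000)        # ten 1-Gbps differential channels
slow_pins = single_ended_pins(10_000, 100)   # hypothetical 100-Mbps pins
```

With these assumptions the differential link needs 20 data pins where the single-ended bus needs 100, which is why partitioned designs are less likely to run out of I/O pins before they run out of logic.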
Figure 5. OC-192 Design Partitioning
[Figure: block diagram with blocks labeled PMD Device (1), CDR Circuitry, Transceiver, Framer, SRAM and SDRAM Blocks, a Packet Processing CDS System containing four APEX II devices, Packet Processing, Switch Fabric, and Host Processor.]
Note to Figure 5:
(1) PMD: physical medium dependent.
Because CDS enables easier design partitioning, it is also useful for ASIC
prototyping. In many cases, a designer takes advantage of the flexibility
and easy reconfiguration of programmable logic to prototype a design,
and then moves a very large or extremely high-volume design to an ASIC.
Since the ultimate capacity of a standard-cell device is larger than that of
a programmable logic device, the designer will partition this design into
multiple PLDs. As discussed earlier, this may lead to inefficient use of the
logic within these devices. By using CDS, the designer can implement the
required data transfer between devices and use the full logic capacity of
the PLDs.
Summary
Increasing demand for data services has driven higher bandwidth
requirements for system designers. Differential signaling has been
successfully used to address this need. CDS builds on the success of
differential signaling, giving designers more flexibility in the design of
their boards and of their overall systems. By using CDS in APEX II
devices, designers can enhance their systems to provide flexibility and
performance.
101 Innovation Drive
San Jose, CA 95134
(408) 544-7000
http://www.altera.com
Applications Hotline:
(800) 800-EPLD
Customer Marketing:
(408) 544-7104
Literature Services:
lit_req@altera.com
Copyright © 2001 Altera Corporation. All rights reserved.