Uploaded by misterwhidden

serial-bus-white-paper

advertisement
High-speed Serial Bus Repeater Primer
Re-driver and Re-timer Micro-architecture, Properties, and Usage
Revision 1.2, Oct. 2015
By Samie B. Samaan, Dan Froelich, and Samuel Johnson, Intel Corporation
1. Introduction
1.1 Target Audience
This primer aims to inform High-speed Serial bus Signal Integrity engineers and system designers on the
characteristics of SerDes repeaters (re-drivers and re-timers), in order to enable proper and effective
use of those devices. It also provides useful background and insights for hardware validation and
debugging practitioners.
1.2 Motivation and Scope
Multi-gigahertz serial links, with progressively increasing bitrates, suffer from signal distortion (Intersymbol Interference, or ISI) due to PCB and package copper and dielectric losses. Other distortions also
occur due to impedance discontinuities in the channels, such as vias, connectors, and packages. While a
Serializer-Deserializer (SerDes) receiver (Rx for short) is designed to compensate for most of these
distortions, and create an internal eye open enough for reliable sampling, bit rates are rising faster than
Rx, PCB, or package High Density Interconnect (HDI) technologies can keep up with. The result is that
total channel reach is decreasing. Using expensive PCB materials to reduce loss will be reaching its
dielectric limits soon, and adds significant cost. Next generation buses, such as PCI Express (PCIe) 4.0 (16
Gb/s) and USB3.10 (10 Gb/s), which aim to double bit rates, will support shorter channel lengths (with
similar high volume PCB materials) than their existing 8 Gb/s and 5 Gb/s predecessors, respectively.
For the above reasons, and also various OEMs’ desires for differentiation through larger/modular
systems, entailing longer channels, there seems to be a rising need for active devices –or Repeaters-which “restore” the signals mid-flight, thus extending channel reach. This paper describes the two main
classes of SerDes repeaters: The analog “Re-drivers”, and the mixed-signal (analog/digital) “Re-timers”
(both protocol-aware, and protocol-un-aware re-timers).
This work describes the internal micro-architecture, properties, applications, and limitations of both
types of SerDes repeaters. It explains the subtle reasons why the presence of a re-driver in certain
busses which require adaptive transmitter (Tx) Linear Equalization (TxEQ), such as PCIe 3.0, and 10G-KR,
causes such links to be non-openly interoperable. It also addresses the special case of so-called “Closed”
Systems which do not require open interoperability (Section 10.1). Furthermore, this document
describes the functionality of PCIe 3.0 and 10G-KR re-timers, in detail.
1
1.3 Signal Distortion in High-speed Copper Links
Present multi-Gigahertz serial busses carry bit streams ranging in their rates from around 1 Gb/s and up
to (or exceeding) 32 Gb/s. The fundamental (Nyquist) or clock frequency of such patterns is half the bit
rate. PCBs, packages, cables, and connectors make up the majority of copper-based serial links, due to
their cost effectiveness. Optical links are also used at high data rates and long reach, but they have
different distortion mechanisms, and are not addressed directly here. Copper (or metal) suffers from a
resistance which increases with the square root of frequency, due to skin effect [1]. In addition, PCBs
have losses in the dielectrics (resin and glass) used to make them [1]. The dielectric loss increases
approximately linearly with frequency. Hence, depending on the dielectric loss constant (Dissipation
Factor (Dk), a.k.a. tan(δ)), a PCB’s trace loss ranges from having square root to linear dependence on
frequency. For present day FR4 PCBs (whose Dk might range from 0.03 to 0.015) , the loss is dominated
by the square root function at low frequencies, then becomes dominated by the linear dielectric loss at
higher frequencies, as seen in Figure 1.
Increasing attenuation with frequency causes signal distortion, and such distortion is worse for longer
lines. When higher frequency components are attenuated, the bits (Unit Intervals, or UIs) begin to lose
their sharpness, their tails extend beyond any single symbol (or bit), and start spilling over (interfering)
with subsequent symbols, hence the term Inter-symbol Interference (ISI).
Gain, dB(Sdd21)
0
-5
-10
-15
-20
0
2
4
6
8
Frequency, GHz
Figure 1 Measured differential Insertion Loss (Gain to be precise) of a terminated PCB strip-line trace,
showing an initial square-root-like dependence on frequency up to ~0.5 GHz (skin-loss-dominated),
then a linear dependence (dielectric-dominated).
In addition to attenuation, there are also internal reflections in a channel, due to impedance
discontinuities, causing further signal distortion. The impedance discontinuities in the channel are
caused by inter-layer via transitions, connectors, decoupling capacitor parasitics, packages, imperfect
terminations, etc. The insertion loss of such channels (e.g. Figure 2) is usually more complex than that
of a simple transmission line.
It is noteworthy that a significant fraction of the smooth (frequency-dependent) distortion due to
channel loss could be compensated for by using simple equalization circuits in the transmitter or the
2
receiver, whereas distortion caused by reflections is usually compensated by the more complex Decision
Feedback Equalization (DFE) technique [2].
Gain, dB(Sdd21)
0
-5
-10
-15
-20
0
2
4
6
8
Frequency, GHz
Figure 2 Measured differential Insertion Loss (Gain to be precise), of a practical well-terminated
channel employing 2 connectors, vias, and PCB traces.
1.4 Introductory Description of Linear and Non-linear Systems
In order to understand two main sub-types of repeaters called re-drivers, this section provides an initial
brief definition of linear and non-linear systems (or circuits). More will be given on this topic in section
3.2.
A Linear System: Is one where the amplitude of the output response is directly proportional to the
amplitude of the input, irrespective of the shape of the output. The output and input do not have to
have the same shape for a system to be linear (since shape is controlled separately by the transfer
function of the system, H(s)). See section 3.2.2 for more details.
A Non-linear System: Is one where the amplitude of the output is not directly proportional to the input.
The relationship between input and output amplitudes could be anything other than a straight line. In
extreme cases, it could be a step function, i.e. as the input amplitude rises, there is no output initially,
but then suddenly an output appears, and remains at about the same amplitude even if the input
amplitude keeps rising. Such a system is a “Sharply-limiting” non-linear system. In more moderate cases
the output amplitude reaches saturation gradually. See section 3.2.1 for more details.
2. Repeater Types
Fundamentally, high-speed serial repeaters are of two types: Re-drivers, and Re-timers.
2.1 Re-driver
This is usually a high-gain-bandwidth amplifier, employing input Continuous Time Linear Equalization
(CTLE, [3]) and sometimes also output Transmitter Linear Equalization (TxEQ [3] [4]). The amplifier could
be either linear or non-linear (limiting). These devices have no clock, and are pure analog devices, except
for the presence sometimes of a sideband low-frequency bus to program their analog settings. Limited
3
programming is usually also achievable by using strap pins. Re-drivers do not store data digitally, nor
could they be protocol-aware. They just compensate for ISI, cause a delay of the signal by a few 100s of
pico-seconds, and usually add some jitter. Re-drivers are not specified directly in most Hi-speed serial
bus standard specifications.
If not addressed explicitly in a bus’s specification, then in general, using re-drivers renders a link noncompliant, strictly speaking. Their use in such busses as USB3 and SATA3 is “Extra-spec” (which might be
surprising to some, but can be gleaned from careful reading of those specifications), but appears to be
tolerated practically. More details on these devices’ usage and subtleties will be given in later sections.
2.2 Re-timer
A device which has a Clock & Data Recovery circuit (CDR [5] [6]), which is the main component of a
SerDes Physical Layer (PHY). A re-timer has a Phase Locked Loop (PLL [7]), may require an input
Reference clock, and is a mixed-signal analog/digital device. It converts an incoming analog bit stream
into purely digital bits that are stored (or staged) internally. The internal digital data has no analog
information left in it from the incoming original bit stream. A re-timer re-transmits data anew, with new
equalization, and new jitter content, unrelated analog-wise to the input bit stream. Thus, a re-timer
breaks a link into two distinct sub-links, which are completely independent from each other, from a
Signal Integrity (analog amplitude and timing) perspective. Some Re-timers also provide debug
capabilities, such as eye margining, link status, and link health indicators. In addition, sophisticated retimers, which are protocol-aware, comprise digital logic to manage link initialization, training, data
encoding and decoding, and clock domain frequency differences.
Re-timers are more complex than re-drivers, larger in die and package size, are more expensive, and, as
of this writing, are offered by fewer vendors than re-drivers. Re-timers are, in turn, of two types:
Simple “Bit Re-timers” and “Intelligent Re-timers”. Bit re-timers are usually Protocol unaware, &
usually TxEQ Training-incapable, while intelligent Re-timers are Protocol-aware, and TxEQ Trainingcapable. Section 13 provides more details on re-timers.
3. Re-drivers
3.1 Re-driver Micro-architecture
This section discusses the generic internal micro-architecture of a re-driver. Any vendor’s re-driver might
have a different specific design, but they all share these general features. Figure 3 shows the internal
components of a differential re-driver (See section 3.2). Re-drivers usually come as an identical pair (or
set of pairs) in one package as shown in Figure 4, in order to accommodate the sending and receiving
signal directions of one or more lanes. Typically, each direction of transmission has its own independent
set of controls. The channel preceding the re-driver is referred to here as the “Pre-channel”, and the
one following is referred to as the “Post-channel”.
4
Straps, I2C,
or SM Bus
R-term
Driver
TxEQ
Gain
CTLE
R-term
Rx Detect
Squelch
Timer
Threshold
Detector
Figure 3 Conceptual micro-architecture of a differential SerDes re-driver, showing one signal direction,
comprising a differential data path, with programming and control blocks.
Re-driver Package
Pre-channel
-ve Input
Host
Rx
Post-channel
-ve Output
-ve Output
Redriver
+ve Output
+ve Output
Redriver
+ve Input
Host
Tx
Post-channel
Device
Rx
Pre-channel
Device
Tx
+ve Input
-ve Input
Figure 4 A typical re-driver package contains a full differential re-driver (as shown in Figure 3) in each
signal direction, in order to allow repeating a full link. Also shown are the pre-channel (before) and
the post-channel (after) the re-driver, in each direction.
A re-driver’s main components are:-
Input Terminations
Depending on the circuit technology of the re-driver, an input pair of differential termination resistors is
provided for matching the channel preceding the re-driver (pre-channel). These may range in value from
60 to 40 Ohms per side (120-80 Ohms differential), and are usually terminated to ground, Vdd, or
another bias voltage. Re-drivers are usually DC-terminated devices at both their input and output, since
they expect to operate in AC-coupled channels (using on-board capacitors).
The input impedance of a re-driver is not purely resistive, mainly due to the presence of package
parasitics (inductance) and die input capacitance. A vendor’s data sheet usually gives information (either
5
tabulated, or graphical) about the typical or worst-case SDD11 of the re-driver. SDD11 affects the shape
and amplitude of the signals impinging upon a re-driver’s input.
CTLE
A Continuous Time Linear Equalizer (CTLE) [8] is a linear amplifier whose transfer function is similar to
that shown in Figure 5. It attempts to represent the opposite of the loss-dominated pre-channel
attenuation, amplifying high-frequency components more than low-frequency ones, up to a certain
frequency. Semiconductor circuits have a limited gain-bandwidth product, hence the progressive
amplification of higher frequencies in a CTLE reaches a peak, then begins to decline. The peak is usually
aimed to be at least as high as the Nyquist (clock) frequency corresponding to the bitrate, and
preferably exceeding it as much as possible. The resulting band-pass filter opens the input eye by
compensating for (reversing) ISI, only partially, since it is not an exact opposite of a channel’s
progressively higher attenuation shown in Figure 1 and Figure 2.
AC peaking, which is the difference in dB between the maximum amplification at the peaking frequency
and the DC amplification, is usually externally programmable. In some re-drivers, the amount of DC gain
is also externally, and independently, adjustable. It is important to note that a re-driver’s CTLE setting is
static, once programmed, irrespective of system operating conditions. This is true for both pin-strapped
programming, and I2C (or SM) programming. A re-driver does not offer adaptive equalization, like some
intelligent SerDes PHYs and some re-timers do.
5
Magnitude, dB
0
-5
-10
2e+8
3e+8
4e+8 5e+8
7e+8
2e+9
1e+9
3e+9
4e+9 5e+9
7e+9
1e+10
2e+10
3e+10
3e+9
4e+9 5e+9
7e+9
1e+10
2e+10
3e+10
Frequency, Hz
40
20
Phase, Deg
0
-20
-40
-60
-80
-100
2e+8
3e+8
4e+8 5e+8
7e+8
2e+9
1e+9
Frequency, Hz
Figure 5 An illustrative example of the amplitude and phase Bode plots of a CTLE equalizer. A specific
re-driver’s CTLE is usually different from the above, depending on the design and bitrate. Notice the
nonlinear phase response (versus frequency) of this 1-zero 2-pole equalizer.
6
Gain Stage
In many re-drivers, the amplification in the CTLE stage might be insufficient; therefore a second
amplifier follows the CTLE. In addition, maximizing gain-bandwidth usually requires more than 1 or 2
stages. In the case of the limiting (or non-linear) re-drivers described in section 3.2.1, this is usually a
high-gain clipping (i.e. limiting) amplifier. In the case of a linear re-driver (section 3.2.2), this stage is
usually a more moderate-gain amplifier with “linear” input-output amplitude characteristics.
In the case of a limiting re-driver, there is usually a required minimum input differential voltage
specification (on the order of 100-150 mV). This is the minimum voltage to guarantee that the gain
stages, including the output driver, can swing completely to the full requested Output Differential
Voltage (VOD).
Output Driver
This is the final power amplification stage which drives the output load. It can be either a limiting
amplifier (section 3.2.1), or a linear amplifier (section 3.2.2). In some re-drivers, the gain of this stage is
also programmable, in order to allow setting the Differential Voltage Output swing (VOD), say from 800
mV to 1200 mV, as an example.
Output TxEQ
This feature attempts to add de-emphasis (TxEQ) to the output. Since a re-driver lacks a clock (data is
not staged digitally), an FIR-like behavior might be emulated either by delaying the output (using an
analog delay circuit), scaling it, then adding it back to the main signal flow path, or alternatively through
the use of an analog peaking circuit.
TxEQ capability is always present the case of a limiting (non-linear) re-driver. In the case of a linear redriver (section 3.2.2), this function has not been observed to exist. The input CTLE is usually made strong
enough to provide equalization, in order to cover what a TxEQ might add, since they both share roughly
similar frequency domain peaking transfer functions.
A consequence of the methods used to achieve pseudo-FIR TxEQ behavior, is that a limiting re-driver -which is designed for a certain bit rate-- may not be used at a significantly different bit rate. The reason
is that the internal analog delays or peaking filters, which attempt to affect de-emphasis, are tuned to
yield de-emphasis appearing after one UI (bit width) at the target frequency of operation. There is
always a manufacturing variation in those circuits, so reasonable deviation is not detrimental.
Output Terminations
An output pair of differential termination resistors is provided in order to match the output impedance
of the re-driver to the output post-channel. These may range in value from 40 to 60 Ohms per side (80120 Ohms differential), and are usually terminated to the supply (Vdd) or an internal bias voltage.
Just like the input, the output impedance is not purely resistive, mainly due to the presence of package
parasitics (inductance) and die output capacitance. A vendor’s data sheet usually gives information
(either numerical or graphical) about the typical or worst-case SDD22 of a re-driver.
7
Input Idle (Squelch) Threshold Detector
This circuit compares the input differential voltage to a fixed (sometime programmable) threshold on
the order of about 100 mV (e.g. USB3), set for entering the Electrical Idle state. It sends a signal to a
Squelch Timer if the input differential voltage is less than the idle threshold. This feature is used to
detect bus inactivity, and start a process to turn off the output driver in order to save power, and reduce
idle bus noise. Entering squelch is not immediate, and only happens after several nanoseconds of delay,
depending on the bus protocol, and the expected bit patterns (See 3.1.8).
If the input rises above a value larger than the Idle Exit Threshold (300 mV in USB3, and called “Max LFPS
Threshold”, [9]), then the output of the threshold detector is reversed, and the re-driver’s output is
turned on. This capability is used by USB3’s host’s Low Frequency Periodic Signaling (LFPS), and SATA3’s
host’s Out Of Band (OOB) signaling, in order to awaken a client device, before the resumption of data
transmission. The Idle exit threshold is usually larger than the idle entry threshold.
On the other hand, a PCIe downstream device enters idle only after detection of an Electrical Idle
Ordered Set (EIOS, [10]), and not due to signaling inactivity. However, it exits idle after detection of
input signal activity using the Electrical Idle Exit Ordered Set (EIEOS) which has a low frequency pattern.
Since re-drivers targeted for PCIe 3.0 do not have digital pattern detection ability, they are sometimes
designed to turn their output off (go to sleep) after detecting that the input voltage has dropped below
a certain threshold, for several nanoseconds. They do, however, exit idle upon bus activity (higher input
voltage) when EIEOS reaches their input.
The author has observed, in the lab, that for some limiting re-drivers, the input voltage has to exceed
the squelch exit threshold (reported in the datasheet) by tens of millivolts, before the output reaches its
full desired amplitude. In addition, linear re-drivers tend not to squelch their output.
Squelch Timer
This circuit receives a signal form the input idle (or squelch) threshold detector if the input voltage is less
than the Idle Entry Threshold, and after some time delay (typically 4-10 ns) it turns the output driver off,
in order to save power, and reduce idle bus noise. Conversely, if the input rises above the Exit Threshold
for a certain amount of time, then the output driver is turned back on, after a few nanoseconds. As
mentioned in 3.1.7, the squelch exit threshold is typically larger than the squelch entry threshold.
Momentary crossings of the Idle Entry Threshold, by the input voltage, do not trigger shutting off of the
output driver, due to the delay circuit. Such momentary drops can occur for very short periods of time in
an encoded high-speed serial bit stream (e.g. “…10101...”).
Also note that for the “Limiting, or “Non-linear” re-driver type describe in section 3.2.1, there is yet a
third voltage threshold worthy of understanding: It is a switching threshold which must be exceeded, in
order for the limiting amplifier to generate an output whose amplitude is at the full desired output
swing. A well-designed re-driver would ensure that the idle exit threshold is above this switching
threshold, in order to ensure that the output has reaches its full desired amplitude whenever it is turned
on.
8
Receiver Detection (Rx Detect)
The PCIe and USB3 Specs provide a procedure for a transmitter to detect if a receiver is connected to
the channel, thus giving it an opportunity to move to lower power states, when not. The procedure is
based on sensing the different common mode charging time constants of the channel when and a load is
either present or absent (Figure 6). The USB3 and PCIe procedures are similar. This is the procedure for
USB3 [11]:1. A Transmitter must start at a stable voltage prior to the “detect common mode” shift.
2. A Transmitter changes the common mode voltage on Tx-P and Tx-N, consistent with detection of
receiver high impedance which is bounded by parameter “ZRX -HIGH-IMP-DC-POS” in Table 6-13 of [11].
Furthermore, upon startup, detection is repeated periodically, until successful.
After shifting the common mode voltage, the output pins of the Tx (the Tx-P and Tx-P nodes between
“R_Detect” and “C_AC” in Figure 6) start an upward voltage ramp. If there is no Rx load at the far end of
the link (left circuit in Figure 6), then the Tx pins charge up rapidly, and a certain detection threshold is
crossed quickly, since the isolation capacitor “C_AC” (which is the largest cap in the link) is floating. But
if an “R_Term” is present (the right hand circuit in Figure 6), then the detection threshold is crossed
much more slowly, as “C_AC”, which is now in circuit, takes a longer time to be charged. Detection is
successful if a load impedance equivalent to a DC impedance “RRX-DC” (Table 6-13 in [11]) is present.
The R-term of the Rx does not have to be referenced to GND, for the technique to work correctly, since
detection is an AC mechanism. This is easily proven, as the pin voltage at the bottom of “R_Detect” has
to equal the applied voltage (“V_Detect”) once the transient current subsides.
Figure 6 Circuit diagram indicating the concept of fast (left), and slower (right) charging time constant
used by USB3 and PCIe for Rx detection, [11]. “R_Detect” is the common mode Tx source resistance.
Furthermore, in both PCIe and USB3, upon power-up, and as long as no load is detected at the output of
the transmitter, an agent must keep its own input common mode impedance at a high value (USB3: ZRXHIGH-IMP-DC-POS > 25 K-Ohms [11], and for PCIe: ZRX-HIGH-IMP-DC-POS and ZRX-HIGH-IMP-DC-NEG, in the range of >
10-20 K-Ohms [10]). This lets any upstream agents know, in turn, that a final destination Rx is still
absent, since their Rx detection would continue to fail. Conversely, when an agent detects an Rx at its
output, then it must turn its own input differential terminations and common mode input impedance
ON to their nominal (low) values (USB3: 36-60 Ohms per differential side and a common mode of 18-30
9
Ohms [11], and for PCIe 1.0/2.0 40-60 Ohms per differential side, and for PCIe 3.0 it is bounded by RLRXCM of 6 dB, [10]).
Many PCIe and USB re-drivers possess the ability to detect an Rx, and behave similarly to a link agent, in
order to allow power savings in a system. When no load is connected to a re-driver, it turns its input
terminations to high-impedance, in order to let the upstream agent’s transmitter (driving its input) know
also that no downstream receiver is present. The re-driver’s output is usually also turned off (set to a
high impedance of a few 10s of kilo-ohms) in order to save power. The differential outputs reach an
approximately equal voltage, with a certain common mode value.
All USB3 re-drivers seem to perform Rx detection. After power-up (or after a certain input control pin is
toggled), an Rx-Detect cycle is performed periodically (typically every 12 ms). In PCIe, most re-drivers are
capable of Rx detection, but not all offerings on the market now do. When a PCIe re-driver is incapable
of Rx detection, then its input termination is turned on continuously upon power up, and the upstream
agent’s Rx detection would always succeed, even though there might not be a real load at the end of the
link. In that case, the upstream agent goes immediately into training mode. But, since it does not detect
a training response back form a downstream agent, it goes next into Compliance mode, where it sends a
compliance pattern repeatedly. This is not a fatal error. It just implies extra power consumption in both
the re-driver and the upstream agent.
There is no Rx-detection specification in both the 10G-KR and SATA specifications. In SATA, some agents
turn their input terminations ON always to the 50-Ohm nominal value.
Sideband Programming Bus (Typically, I2C, SM Bus, or Strap Pins)
Depending on the market segment and bus specification intended for a re-driver, different means are
provided to enable the designer to select the CTLE, TxEQ, Output Voltage Differential (VOD) swing,
Squelch Threshold, etc. These means range from strap pins (set to ground, Vdd through a resistor, or left
open circuited), to full I2C [12] or SM Bus [13] inputs. USB3 and SATA3 re-drivers tend to use strap pins,
due to their relative simplicity, while re-drivers intended for PCIe and KR busses typically use I2C or SM
bus programming inputs, in order to provide fine equalization resolution, and more flexible parameter
and feature selection. Each direction of transmission, such as host to device (downstream), or device to
host (upstream) is usually controllable independently, so the designer could select the required CTLE,
TxEQ, Gain, etc. values appropriate for that direction’s channel, Tx, and Rx.
3.2 Re-driver Types: Limiting and Linear
There are two main types of re-drivers in terms of input-output amplitude transfer functions: Linear,
and Non-linear. This pertains to the amplifying stage (or stages) in the re-driver.
Limiting (Non-linear) Re-drivers
Some re-drivers in the PCIe and KR domains, and all of the ones in the USB3 and SATA3 domains, at the
time of this writing, are of the limiting amplifier type. Referring to Figure 7, this means that as the input
differential amplitude rises from zero, the output amplitude (irrespective of waveform shape), starts
rising very quickly, reaches saturation, and its wave shape is stable at full swing, unaffected by any
further input amplitude increases. This happens at a small differential input voltage of around ~100 mV,
10
usually. Limiting re-drivers are easier to design. Furthermore, they can employ both an input CTLE, and
an output TxEQ, as depicted in Figure 3 which showed the typical micro-architecture of a limiting redriver.
Figure 7 Relationship between input and output amplitudes of a “Limiting” re-driver, showing a
quickly-growing output (red) as the low switching threshold is approached, then a regerenarted
“square-like” output wave (green) once the threshold is surpassed. Jitter, added TxEQ, and
bandwidth-limitted smoothing are not shown.
Figure 8 shows time domain waveforms at the output of a limiting re-driver as its input signal amplitude
is raised in regular steps, where the output follows the input dis-proportionately, then quickly ceases to
increase, due to the onset of limiting. Notice that the output saturates at small input amplitude. In
addition, the pre-existing pre- and post-shoot equalizations of the input are obliterated. The pre-shoot
in the input signal disappears from the output of the re-driver which applies only its own equalization to
the output (only post-shoot, or de-emphasis in this example). Hence, a limiting re-driver severs the
analog wave shape relationship between its input and output, but does not sever the switching-edgecrossing timing relationship, or jitter.
11
Figure 8 Input (top) and output (bottom) of a limiting re-driver as the input amplitude is raised at
regular small intervals, and the re-driver output goes into saturation, ceasing to grow. Note the
obliteration of pre-existing equalization in the input; the re-driver adding only its own.
For the sake of further illustration, Figure 9 depicts the simulated bit stream and eye diagrams at the
input and output of a limiting re-driver placed in a lossy channel. The analog delay between the input
and output is indicated by a grey dot.
With limiting re-drivers, un-equalized pre-channel ISI (under-equalization) or added ISI (overequalization) by the re-driver’s CTLE appear at the output as deterministic timing jitter (Dj), with a
complete loss of signal shape information. This renders under- and over-equalization uncorrectable by a
final downstream receiver. Figure 10 shows the signal at the output of a limiting re-driver’s CTLE
followed by its first limiting amplifier stage, where it is easy to see that un-equalized ISI results in pure
Deterministic jitter (Dj) of the high-frequency type, which the Rx downstream cannot remove. This is
due to the clipping (or re-shaping) nature of the amplifier which morphs all signals at its input into ideal
“square-like” waves at its output, thus leaving only time axis crossing information, in the form of jitter.
This is one reason why limiting re-drivers usually extend channel reach less than linear ones, when all
other aspects are equal. A limiting re-driver might extend the channel by ~40% (in total dB), while a
linear one might be able to achieve ~60%.
12
Figure 9 Simulated waveforms at the input (top left) and output (top right) of a Limiting re-driver,
depicting the “regeneration” (“squaring”) of wave shape, and the associated eye diagrams (bottom).
TxEQ is also present at the output.
Figure 10 Simulated output of a limiting re-driver’s CTLE which is followed by a limiting amplification
stage (left), and the output from the final driver (right). TxEQ is present at the output. Notice the pure
time axis crossing jitter which is not compensable by the final Rx at the end of a link.
13
Linear Re-drivers
Some re-drivers (especially those targeting PCIe 3.0 and KR) are built around linear amplifiers. Linearity
means that --irrespective of incoming and outgoing signal shapes-- when the amplitude of the input
signal is varied, and the amplitude of the output signal is measured, that there is a near-linear
relationship between the two amplitudes, as shown in Figure 11.
Figure 11 Relationship between the input and output amplitudes of a “Linear” re-driver, showing
linearity over a certain input range, and the eventual onset of saturatuon (or compression).
Relative linearity persists up to a certain value of differential input amplitude, and then compression
sets in, until the output remains stable no matter how high is the input amplitude. The beginning of the
end of linearity is usually indicated by a compression level, of say 1 dB, and may occur at as little as ~200
mV or up to ~600 mV of input differential amplitude, depending on the vendor, and gain settings. The
compression level is defined as:
Compression level in dB = 20 * Log10 (Actual Output Amplitude / Linearly-projected output Amplitude)
Thus, a compression level of 1 dB means that the actual output is ~11% smaller in amplitude than the
straight line projection of its value (at the same input amplitude point), as shown by the small double
arrow in Figure 11.
It is worth noting that linearity does not mean that the input signal shape is identical to the output
shape. It merely means that the relationship between the input amplitude and the output amplitude is a
straight line, to a reasonable degree of approximation. The actual shape of the output is actually
determined by the input signal shape and the equalization imposed on it by CTLE and TxEQ (if the latter
is present). In other words, the output shape is a function of the input, and the overall frequency
domain transfer function of the re-driver, H(s).
Figure 12 shows the waveforms at the output of a linear re-driver as its input signal amplitude is
increased gradually, where the output follows the input proportionately, but eventually ceases to do so,
14
due to the onset of compression (or end of linearity). The magnitudes of pre- and post-shoot relative to
the full signal amplitude are affected as compression sets in.
Figure 12 Input (top) and output (bottom) of a linear re-driver where the input amplitude is increased
at regular intervals until the output goes into compression, i.e. ceases to grow at the same rate as the
input. Compression diminishes the relative magnitude of equalization.
Linear re-drivers may or may not have TxEQ at the output (usually not). When they do not, they usually
have a strong CTLE. In fact, TxEQ and CTLE perform similar (but not the same) transformations in the
frequency domain, and the absence of TxEQ can be overcome by making CTLE stronger, in the linear
case. Figure 13 depicts the micro-architecture of a linear re-driver which does not have output TxEQ.
For the sake of further illustration, Figure 14 shows an example of the expected wave shapes at the
input and output of a linear re-driver. The output is not square-like, and has some resemblance to the
input, and its shape is only altered by the amount of applied equalization. Both the analog amplitude
and timing relationships between the input and output are maintained, albeit altered by the transfer
function (H(s)). Unlike a limiting re-driver, the shape information connection between the pre-channel
and the post-channel is not severed by a linear re-driver.
15
Straps, I2C,
or SM Bus
Driver
R-term
Gain
CTLE
R-term
Rx Detect
Threshold
Detector
Squelch
Timer
Figure 13 Conceptual micro-architecture of a linear SerDes re-driver without TxEQ, showing one signal
direction, comprising a differential data path, with programming and control blocks. In this case,
equalization is achieved via a strong CTLE only.
Figure 14 Simulated waveforms at the input (top left) and output (top right) of a “Linear” re-driver,
depicting the equalized waveform, and the associated eye diagrams (bottom). Note the clear analog
relationship maintained between input and output, and the lack of square-wave-like regeneration.
16
Figure 15 shows the transfer function (magnitude of the differential S-parameter SDD21) of the
combination of a PCB transmission line and a linear re-driver for various equalization settings of that
particular re-driver. The settings range from under-compensation to overcompensation of the
transmission line by the re-driver. Figure 16 shows the residual phase deviation from linearity for the
same setup of Figure 15. Note how it is not possible to find a setting which makes the transmission line
“disappear” fully. This illustrates why re-drivers change the character of a channel into which they are
inserted.
Figure 15 Measured magnitude of SDD21 of a re-driver plus a transmission line channel for various redriver equalization settings, showing the overall transfer function of the combination.
Figure 16 Measured phase deviation from linearity of SDD21 of a re-driver plus a transmission line
channel for various re-driver equalization settings, showing the overall transfer.
17
An advantage of a linear re-driver is that its equalization applies toward the full channel (pre-channel
plus post-channel), since the re-driver and the channel represent a linear system where --to first order-the location of the transfer functions does not affect the final outcome --Other considerations however,
make certain placements more preferable (See section 10.3). Furthermore, if a linear re-driver’s CTLE
capability is insufficient to compensate well for the pre-channel and/or post-channel, then the final
receiver’s CTLE and DFE could supplement the overall channel equalization. This is in contrast to a
limiting re-driver (section 3.2.1), where under- or over-equalization appears at the output as timing jitter
only, with a complete loss of signal shape information, thus rendering it un-equalizable downstream.
While a linear re-driver’s equalization can apply toward compensating for the pre and/or post-channel,
there is a change in the overall channel’s transfer, even when the re-driver is programmed most
optimally. The residual transfer function expressed by Figure 15 and Figure 16 causes the overall
channel transfer function to deviate in its shape from that of a purely passive channel. This causes
distortion in the equalization sent by the source Tx, as it appears to the final Rx, and will be discussed in
latter sections on PCIe 3.0 and 10G-KR interoperability. This is in contrast to the limiting re-driver case,
where the residual (uncompensated) part of the channel expresses itself as just deterministic uncompensable jitter (section 3.2.1).
Other Attributes of Linear and Limiting Re-drivers
3.2.3.1 Jitter Handling
Random jitter (Rj) and non-ISI deterministic jitter (Dj) [14] entering a re-driver are essentially passed on
from input to output, albeit with a change in magnitude (spectral content) due to the re-driver’s
equalization circuits’ transfer functions and their limited bandwidth.
Both limiting and linear re-drivers add a certain amount of Rj and Dj due to their own circuits, and their
power supply noise. Hence, it is important that the supply (Vdd) of a re-driver meet the manufacturer’s
noise recommendations. This is made less important, in some case, by virtue of the presence of a Linear
Voltage Regulator (or Low-dropout Regulator, LDO) inside a re-driver, which improves its Power Supply
Rejection Ratio (PSRR). Added Random jitter (Rj) is caused by thermal noise and popcorn (Burst, or 1/f)
noise. Thermal noise is determined by the design, and can only be reduced by consuming more current
in the re-driver’s internal circuits (which raises device power) and is outside the control of the user.
Popcorn noise is determine by the semiconductor process technology and transistor sizes, and is also
outside the control of the user.
Since any re-driver’s CTLE and TxEQ settings cannot equalize the pre- and/or post-channels perfectly at
all frequencies, under or over-equalization of the channel will appear as either additional ISI (in the case
of a linear re-driver) or additional Dj (in the case of a limiting re-driver). Neither CTLE nor TxEQ could
achieve perfect cancellation of a channel’s loss, mainly since passive channel loss starts as a square root
function of frequency, and then quickly becomes virtually linear with frequency (section 1.3), whereas
CTLE and TxEQ have a frequency response representing a peaking band-pass function, with curved
amplitude vs. frequency shape. Figure 15 and Figure 16 showed the residual frequency response of a
channel plus re-driver combination, even after the best CTLE settings have been chosen.
18
Semiconductor process, voltage, and temperature (PVT) variations cause a re-driver’s equalization to
vary part to part, and system to system. Such variations might range from approximately +/-0.5 to +/-1.5
dB. In addition, PCB material high-volume manufacturing (HVM) variations and temperature/humidity
loss dependence, all render a re-driver’s fixed settings suboptimal, eventually. All these variations
express themselves as added ISI (linear re-driver) or un-compensable Dj (limiting re-driver). To give an
example of PCB loss variations, a 15 dB channel extension could suffer from an increase in loss of 1.5 to
3 dB over a 55-C temperature range and humidity range (for a server design), depending on its material
properties [15] [16] [17].
To illustrate the effect of temperature alone, Figure 17 shows the input eye to a limiting re-driver
located near the middle of a long channel operating at 5 Gb/s, at room temperature. It also shows the
re-driver’s output eye, and the eye inside a receiver having only a CTLE. Notice a receiver input eye
opening of 97 ps. Figure 18 on the other hand, depicts the same waveforms after raising the
temperature of only the PCB by 55 degree centigrade, and increasing the total PCB loss by only 10%.
Notice the increased jitter at the re-driver’s output eye, and the reduced eye opening post-CTLE, all
caused by the increase in channel loss, which remains uncompensated for by a statically-programmed
re-driver. Note that some PCB materials may have a loss increase approaching 20% for the same
temperature rise and humidity increase, and that would cause significant further deterioration. Also
note that the re-driver’s equalization is not assumed to change as a function of temperature in this
example, which is optimistic.
Any system could start up cold, warm up to it maximum temperature rating (or vice versa), and must
remain functional at the required BER. In addition moisture content may change as the system warms
up. Many receivers perform their initial training upon system start-up, and some of their equalization
settings may remain unchanged thereinafter. That is usually referred to as “Start or Train Cold, Run Hot”
(TCRH), or “Train Hot, Run Cold” (THRC). Such receivers are usually deigned to be able to accommodate
channel loss changes versus environmental conditions assuming a base maximum spec channel. When a
channel extension plus a re-driver are added, further accommodation of their environmental variations
requires spare link margin, and appropriate validation.
19
Figure 17 A simulated channel (~29 dB total @2.5 GHz) with a limiting re-driver located off the middle,
showing re-driver output eye (Total jitter of 31 ps due to uncompensated ISI) & receiver internal postCTLE eye (97 ps), at room temp. No non-ISI Dj & no Rj from the Tx, re-driver, or final Rx are included.
Figure 18 The channel of Figure 17 after raising only the PCB temperature by 55 C, & changing PCB loss
by only +10% (Some PCBs may manifest up to +20% for a 55-C rise). Notice the increase in the redriver’s output uncompensable ISI-caused jitter (rising to 39 ps), & the reduced eye inside the final
receiver (dropping to 75 ps). Temperature effects on the re-driver itself were not included.
20
3.2.3.2 Re-drivers and Non-ISI Distortions (e.g. Reflections)
Since a re-driver lacks a CDR or a clock, and has bandwidth-limited equalizing circuits, it can only
compensate partially (strictly speaking) for the loss in a channel. In addition, Distortions caused by
reflections (discontinuities) and cross talk (from neighboring aggressors), are handled differently by the
two types of re-drivers: Linear and non-linear:In the case of a linear re-driver reflections and cross talk artifacts are passed on to the output --albeit
with some shape change depending on the frequency domain transfer function and linearity of the
device, Figure 19. Furthermore, a linear re-driver amplifies noise and cross talk, just as it amplifies the
main signal, and hence does not improve the Signal to Noise ratio (SNR) significantly. This is why placing
a linear re-driver very close to the final Rx (short post-channel) runs the risk of shorter link extension,
than placement near mid-channel, due to amplification of accumulated crosstalk.
In the case of a limiting re-driver, reflections and crosstalk artifacts are largely eliminated from a
voltage (amplitude) point of view (assuming that they still occur above the re-driver’s switching
threshold), but are both converted into deterministic jitter, especially in the case of either artifact
occurring on the rising or falling edges of the incoming signal, Figure 20.
Figure 19 The simulated effect at the output of a "Linear" re-driver of a severely reflective channel
discontinuity. Notice the changes in the incoming and outgoing waveforms and the eyes, relative to
Figure 14.
21
Figure 20 The simulated effect at the output of a "Limiting" re-driver of a severely reflective channel
discontinuity. Notice the changes in the incoming waveforms and the eyes, and the increase in output
jitter, while maintaining the max eye amplitude, all relative to Figure 19.
3.2.3.3 The Notion of a so-called “Spec-compliant” Re-driver
It is meaningless to require (or state) that a re-driver be (or is) Spec-compliant if no such device has
been explicitly defined in a bus’s formal specification! The main reason is that Tx and Rx compliance
specs are usually designed and stated assuming a passive channel, and do not account for an active
element in the channel (with very few exceptions). An active device adds jitter, intra-pair skew, and
alters the channel by a transfer function which is not of the same nature as a passive channel’s.
Measuring Spec compliance at a re-driver output, and requiring it to be identical to that of a standard’s
Tx, is not meaningful, either. The reason is that a re-driver is not meant to drive a full maximum Spec
channel at its output, like a standard Tx is meant and specified to do. A seemingly more relevant
location at which to measure Spec-compliance (albeit still outside of the Spec, strictly speaking), is a
clearly defined compliance point, such as in the USB3 (after a reference input CTLE), SATA (at a
connector), or SFI (off a host compliance board). For these buses, a re-driver (be it limiting or linear) is
still non-Spec-compliant, strictly speaking, due to its altering the ratio of compensable ISI to uncompensable jitter. See section 4 for more details on this subtle point.
Further complications arise with standards like PCIe 3.0 and 10G-KR, both of which require the
transmitter to adjust Tx Equalization (TxEQ) to any setting requested by a receiver, within a specified
capability range. With these types of standards, showing that the reference receiver produces an open
eye with one TxEQ value, at a compliance point, is not sufficient to guarantee interoperability in the
presence of a re-driver. The reason is that the reference Rx used in the Spec is not a required minimum
capability (or design) for practical receivers, but is merely there to allow measuring “an eye” for the
purpose of tying different ends of the spec together. This will be discussed in detail in sections 5.5 and 6.
22
3.2.3.4 Re-driver Usability with Lower Risk of Open Interoperability
While still strictly non-spec compliant, re-drivers are being marketed and used for such busses as USB3
and SATA3. Please refer to section 4 for the details on the reasons for non-spec compliance. Busses
where re-drivers are being used, with a lower (but non-zero) risk of interoperability issues, are those
which:•
•
•
Have a static (non-adaptive) TxEQ, AND
Have a post-EQ Rx eye Spec, such as USB3, or,
An eye spec (mask) at a test point, such as SATA3.
In these cases, it is important to design the placement and programming of the re-driver with sufficient
guard-band, in order to ensure that the eye opening meets the Spec’s minimum eye mask, under all
HVM and environmental (PVT) conditions, and also allow the Rx some room to deal with the altered
ratio between increased non-compensable ISI (jitter), and compensable ISI (see section 4). Furthermore,
in the Case of USB3, SATA3, and even SFI, the cables or devices connected to a host may vary in length
from user to user (e.g. a 3m USB3 cable, versus a 10” cable, or a thumb drive). A host using a re-driver
must be designed to accommodate these differing usage modes.
4. Re-drivers for USB3, SATA3, and PCIe 2.0
The USB3 [9], SATA3 [18], and PCIe 2.0 (Gen 2) [10] standards do not specify re-drivers. Hence, re-driver
usage is outside those specifications. However, limiting re-drivers are being used in those open eco
systems. Except for one subtlety, these buses are more tolerant of re-drivers, mainly due to the fact that
they do not require TxEQ adaptation (unlike PSCIe3, and 10G-KR which do). USB3, SATA3, and PCIe 2.0
have a small set of required static TxEQ settings, which a statically-programmed limiting re-driver can
provide, and no further TxEQ changes need to pass through the re-driver.
Technically, USB3, SATA, and PCIe 2.0 re-drivers are not Spec-compliant, due to the following subtlety:
The aforementioned Specs define receiver compliance as the ability of a device to achieve the required
BER, when plugged into a compliance channel presenting it with an eye having a specified minimum
height and width. Due to the prescribed maximum Transmitter Dj and Rj, and the maximum length of
the compliance channel, there is an implied assumption about the ratio of channel-caused ISI
(compensable jitter) to the sum of Random and Deterministic jitters which are un-compensable. When
a limiting re-driver is inserted into an extended channel, then the residual (un-compensable) ISI portion
of the eye becomes larger (see section 3.2.1) relative to the Rj and Dj profiles. This may not suite the
equalization capabilities of a receiver which does pass compliance under normal circumstances.
Consider the case of an Rx which has very high internal jitter, but relies on superior equalization to pass
Compliance. If such an Rx is presented with a Spec eye which has the required height and width, but
which has more inherent un-compensable jitter and less compensable ISI (than during compliance), then
it is plausible that such a receiver might fail to open that eye well-enough for its own low BER sampling,
since the USB3 input CTLE is informative only (section 6.8.2. of [11]) and no equalizer is specified in PCIe
2.0.
It seems that despite the above cautionary consideration, hosts using a USB3, SATA3, or PCIe 2.0 redriver, which are capable of producing the pre-scribed eye at a compliance point, are being assumed to
23
be compliant by some designers. This would be especially incorrect for PCIe 3.0 and 10G-KR, and the
reader is referred to section 6 and 9 for a full explanation.
All re-drivers for USB3, SATA3, and PCIe 2.0 on the market are of the limiting type (section 3.2.1), at this
time. The ability of a good re-driver to extend the channel is on the order of roughly 50%. If one leaves a
guard-band of 10%, then it is on the order of roughly 40%. Thus, a 19.6-dB USB3 Spec channel could be
extended by about 8 dB of loss, assuming that the re-diver is placed appropriately, as explained in
section 10.3. There are some variations in the capabilities (and programming sophistication) of redrivers from different vendors.
Most datasheets tend to state channel extensibility in terms of inches of FR4. But, length is not the
relevant measure, since low-cost FR4 has a loss range from around 0.7 dB to 1 dB per inch at 4 GHz,
depending on the specific materials being used, and the line geometry. A better way of stating channel
extension would be in terms of total dB of loss at a specific frequency, of say 2.5 GHz for USB3 & PCIe
2.0, or 3 GHz for SATA3.
5. The PCIe 3.0 Equalization Adaptation Protocol
The PCIe 3.0 Base Specification [10] requires that a compliant Tx be able to produce any of the output
equalization (TxEQ) levels shown in Figure 21, whenever requested to do so by the receiving port, during
link equalization training. The same specification describes a protocol of TxEQ adaptation, where the
host (upstream agent) or the end device (downstream agent) can each request that any of the TxEQ
values in Figure 21 be presented to them by the other agent on the link. The intent of adaptation is to
allow both agents to adjust the link partner’s TxEQ to an optimal value for their receiver, and for the
specific channel and operating conditions at hand.
Figure 21 PCIe 3.0 required Tx Linear equalization settings according to the PCIe 3.0 base specification,
showing pre- and post-cursor coefficients.
24
The table in Figure 21 shows the minimum granularity and range of TxEQ for PCIe 3.0. There are two
equalization locations, “Pre-shoot” and “Post-shoot”. The Pre-shoot (Pre-cursor) level is indicated by the
coefficient C-1 and can range from 0 to 0.25. Post-shoot (also called De-emphasis) is indicated by the
coefficient C+1 which can also range from 0 to 0.25. The cursor is usually called C0. The sum of the
absolute values of all three coefficients is normalized to be equal to 1. Thus:
|C-1| + |C0| + |C+1| = 1,
C+1 < 0, C-1< 0
Only C-1 and C+1 are adjustable directly, whereas C0 is always derived from the above equation. The
output signal amplitude is composed by applying the following FIR filter function:
V_outn = (V_inn+1 * C-1) + (V_inn * C0) + (V_inn-1 * C+1)
The above FIR is depicted in Figure 22. Where V_inn is the current bit (cursor), V_inn+1 is the next bit,
and V_inn-1 is the previous bit (before the cursor). The differential voltage swing of the Tx is dependent
on the TxEQ being applied, but is scalable independently of the three coefficients (C-1 , C0 , C+1), and can
be in the range of 800-1200 mV while driving a 100-Ohm differential load, when there is zero TxEQ.
Figure 22 The PCIe 3.0 Tx Equalization FIR filter representation. (Note: The 10G-KR Filter is similar in
definition).
This PCIe 3.0 TxEQ adaptation [10] protocol has 4 phases, defined as follows:-
5.1 Phase 0
PCI Express links always start at 2.5 GT/s. Before changing to the higher rate (8 Gb/s), an upstream
agent (downstream port) and downstream agent exchange information on the range and granularity of
TxEQ values supported by each transmitter at 8 GT/s.
The upstream agent instructs the downstream agent on which TxEQ setting to start transmissions with,
once the link transitions to 8 GT/s. The upstream agent picks its initial TxEQ setting to use for its
25
upstream agent transmitter at 8 GT/s based on system knowledge of the channel lengths on the
motherboard.
5.2 Phase 1
The link transitions to 8 GT/s, and both receivers must be able to start receiving with a BER of E-4, or
better, with the initial TxEQ settings (decided in Phase 0) at 8 GT/s.
5.3 Phase 2
The downstream agent sends to the upstream agent one or more requests for any TxEQ settings chosen
from the table in Figure 21, until it decides on an optimal setting. The implementation details of how an
eye (or BER) is evaluated, is left to implementation. Different receivers have a variety of algorithms for
both internal CDR convergence, and eye goodness or BER estimation. Since they are implementationspecific, they could range from exhaustive searches, to methods of steepest descent, which could be
subject to local extrema traps. In addition, “no changes” in the perceived eye or BER (for example due to
a limiting re-driver) might also trip a “Best eye” or “Best BER” search state machine, depending on its
specific logical constructs.
5.4 Phase 3
Here, the roles of phase 2 are reversed, and the upstream agent (downstream port) makes TxEQ preset
and change requests to the downstream agent. If the training is successful, then at the end of this phase
the link is established at a BER of at least 1E-12, both in the downstream and upstream directions.
5.5 PCIe 3.0 Open Interoperability
Open Interoperability is the notion that any pair of specification-compliant devices (or agents) ought to
be able to find TxEQ equalization values appropriate for each agent’s Rx from the other agent’s Tx, and
be able to operate the link in both directions at BER of 1E-12, automatically.
The PCI-SIG facilitates standard compliance testing for devices implemented to the Electromechanical
(CEM) specification [19]. A compliant Tx passing CEM at 8 GT/s requires that it be able to produce a
minimum eye at a BER of 1E-12 when driving the appropriate CEM Reference channel. The eye is
measured after the reference equalizer defined in the PCI Express 3.0 base specification – including one
of 7 CTLE curves (whichever one works best) and a one tap DFE [10]. A compliant Tx is required to
produce the aforementioned minimum eye for at least one TxEQ preset. Furthermore --but using a
separate test-- a compliant Tx is also required to produce all the other equalizations values tabulated in
Figure 21, and this is done by testing it for a selected subset of values (the “Presets”), and requiring that
its design be able to produce all the other values in the table. Aside from equalization capabilities, a Tx
is also required to meet certain deterministic and random jitter maximum specifications [10].
It is important to understand why the accuracy of the Tx equalization across the required TxEQ space is
tested in addition to the eye test. The PCI Express 3.0 base specification defines a receiver “Stressed
Eye” test that all compliant receiver silicon must pass. To test Rx for compliance, it is plugged into the
CEM test channel, which is driven by a calibrated Tx (say from a JBERT). The Tx calibration process for
this receiver test is illustrated in Figure 23. The Tx amplitude calibration is done with the Tx Equalization
26
fixed first to 0 dB of pre-shoot and 0 dB of de-emphasis. Then the Tx EQ is set to Preset 7 (P7) [10], and
noise sources are added and adjusted until the eye --measured at the end of the reference channel (TP2)
is 25 mV and 0.3 UI at a BER of E-12 --after applying a reference receiver with a certain equalizer having
a CTLE and 1-tap DFE, defined in the specification [10].
After Tx and eye calibration, the Rx under test (such as one on a PCIe card, or a host) in attached to the
CEM setup. The Rx is allowed to request adjusting the TxEQ to any other value (other than P7) in the
space shown in Figure 21, until it achieves a BER of 1E-12. Silicon receivers are NOT required to work as
well as the reference receiver at a given TxEQ setting – regardless of channel length. Therefore – it is
important that a transmitter be able to reproduce the entire required TxEQ space faithfully. Silicon
receivers may have a strong preference anywhere in this space, and have performance that degrades
rapidly away from the ideal TxEQ region for that receiver. This means that performing a host Tx eye test
with the reference receiver, and showing that it passes with a single TxEQ setting, is not a sufficient to
guarantee interoperability. This would only be true if all silicon receivers were required to work as well
as the reference receiver at all possible TxEQ settings, which is not the case in PCIe 3.0.
One might then ask: Why does the PCIe 3.0 Spec use a reference Receiver? Section 4.3.4.3.5., of the
base specification [10], describes the role of the reference Rx, and how it is not a required design
implementation:“For the longest calibration channel the stressed eye will be closed, making direct measurement of the stressed eye
jitter parameters unfeasible. This problem is overcome by employing a behavioral receiver equalizer that
implements both a 1st order CTLE and a 1-tap DFE. For the short and medium calibration channels the behavioral
Rx equalizer shall implement a 1st order CTLE only.
Rx behavioral CTLE and/or DFE are intended only as a means of obtaining an open eye in the presence of calibration
channel ISI plus the other signal impairment terms. The behavioral Rx equalization algorithm is not intended to
serve as a guideline for implementing actual receiver equalization.” (Emphasis added).
Fixed TX EQ
8 GT/s PRBS
Generator
Combiner
Calibration Channel
TP1
Rj
Source
EW Adjust
Sj
Source
Diff
Interfer
ence
Replica
Channel
Test Equipment
TP2
Post Processing Scripts:
CM
Rx pkg model
Behaviorial CTLE/DFE
Behavioral CDR
Interfer
ence
EH Adjust
TP2P
25 mV / .3 UI at E-12 BER
Figure 23 PCI Express 3.0 Rx Stressed Eye Calibration Setup.
27
The PCIe 3.0 Base Spec [10], & PCIe 3.0 CEM spec [19] guarantee that devices whose Tx & Rx pass
compliance testing independently, will interoperate, by requiring (or allowing) all the following:1. A Tx must be able to produce any of the pre- and post-shoot values in Figure 21, upon request from
an Rx, and also meet specified tolerance requirements (+/-1 dB for pre-shoot, and +/-1 to 1.5 dB for
post-shoot). See table 4-16 of [10].
2. A Tx must pass a compliance eye mask test at a minimum of one (or more) of the TxEQ values of
Figure 21.
3. A Tx compliance eye must be obtained using an informative spec Rx, which is provided for
measurement purposes, and also for simulating a channel using a minimally-compliant spec Rx.
4. An Rx is allowed to request any of the TxEQ values of Figure 21, and receive them from the link Tx.
5. An Rx achieves a BER of 1E-12 for one (or more) of those TxEQ values --any of its choosing. A
spec-compliant Rx is not required to match the performance of the reference Rx at any specific TxEQ
setting – but must match, or exceed, the reference Rx for at least one TxEQ setting.
6. The TxEQ selection algorithm used by the Rx to achieve a BER of 1E-12 is implementation-specific.
7. The architecture of the Rx (its equalization and CDR) is implementation-specific.
8. The TxEQ at which a Tx passes compliance may differ from that required by different receivers to
pass compliance.
9. The channel between Tx and Rx --the CEM channel described in [19] is passive. That channel has 2
connectors and 3 boards, and transports the TxEQ space of Figure 21 in a deterministic manner. Due
to this attribute, the CEM channel is suitable for both Tx and Rx compliance testing. The CEM
Channel’s passivity assumption underlies open interoperability. An Rx design would rely on
such a channel during design simulations, and later also during physical compliance testing.
The presence of re-drivers (even the linear type), as will be shown in section 7.2, distorts the TxEQ space
of Figure 21 in a manner different than the passive CEM spec reference channel alone does. Thus, redrivers present a compliant Rx with a link where it might never be able to find a TxEQ which appears to
it in the same fashion it did during its compliance testing. The distortion is worse than that of added
vias, or other passive components, as can be shown in simulations within a relevant range of
frequencies.
Stated at a high level, an active component, which is not specified in the Base Specification, has no room
in a compliant link, since the Spec does not budget for such a device’s transfer function and tolerances.
This determination was also made by the PCI SIG. Hence, the SIG recommends using compliant re-timers
[20].
6. Re-driver issues in PCIe 3.0
The PCIe 3.0 Base Specification [10] is completely silent on extension devices (including re-drivers). A
repeater ECN --which specifies re-timers-- has been approved, and is available through the PCI-SIG’s
website [20]. When an active device (other than a re-timer) is placed in a PCIe 3.0 link, the above
28
scheme of open-interoperability (section 5.5) based on TxEQ Figure 21, passive channels, and the Phase
2/3 adaptation protocol, is no longer valid, strictly. More detailed reasons are given in the next two
subsections for each type of re-driver.
6.1 Limiting Re-drivers versus PCIe 3.0 TxEQ Requirements and Adaptation
Protocol
Referring to section 3.2.1, a limiting re-driver re-shapes the incoming signal at its output, turning it into a
pseudo-square wave (with finite rise and fall times), with a new statically-programmed TxEQ. Thus, the
TxEQ changes requested by either host or end agent in phases 2 and 3 (sections 5.3 and 5.4), are
blocked from passing through the re-driver. Hence, the port requesting the TxEQ change would not be
able to see its effects, since the expected signal shape change is blocked by the limiting re-driver.
Furthermore, since the Base Spec [10] is silent on the exact implementation of the TxEQ adaptation
algorithm in a receiver, then it is entirely possible that some algorithms might fail (time-out) completely.
Consider a case where an agent requests a TxEQ change, and expects to see a different eye, hence a
different BER, and checking that against a pass/fail criterion. If the check comes back with the same
result every time, (since the limiting re-driver is blocking TxEQ changes from passing), then depending
on the specific algorithm used, the Rx might time out.
Hypothetically, a system designer might be able to select a specific limiting re-driver, and program it
statically in such a way that it could operate with one, or a small and well-characterized (tested) set of
link partners while effectively disabling the TxEQ adaptation protocols on both ends of the link. This
implies that neither agent in the link should even attempt to optimize the link any further. A PCIe 3.0
host can disable Phases 2 and 3, and the end agent would default to the values sent to it by the host in
phase 0. Such a system is referred to here as a “Closed System” or a “Closed-slot System” which does
not support open interoperability (Section 10.1). The market might not offer a wide range of limiting redriver with a high enough frequency, and a wide enough TxEQ space for this to be possible.
Furthermore, disabling TxEQ training would entail intervention into the firmware of a system.
Hypothetically, such a usage model should ensure that firmware could disable coefficient update
requests during training.
6.2 Linear Re-drivers vs. PCIe 3.0 TxEQ Requirements and Adaptation Protocol
Referring to section 3.2.2, a linear re-driver does allow incoming signal shape changes to appear at its
output. TxEQ changes requested by either the host, or the end agent, in phases 2 and 3, could pass
through the re-driver, and the requesting agent can detect a change in the eye opening, or the BER.
There is an issue which arises when a “Linear” active device is introduced into a channel. As explained in
section 3.2.2, linearity only implies that the relationship between the incoming and outgoing signal
amplitudes is that of direct proportionality (a straight line), irrespective of their shapes. But, the active
device would alter the incoming signal shape according to its frequency domain transfer function H(s),
which is a function of its CTLE, TxEQ (if any), and amplifier frequency response. In principle, it is not
possible to design a compact equalizer (using a finite number of poles and zeros), to compensate
perfectly for the loss-vs-frequency function of a PCB channel (section 1.3), especially with current &
29
foreseeable gain-bandwidth process capabilities. There are extraneous amplitude and phase effects
which render the compensation imperfect.
Furthermore, PVT variations of the re-driver and the extension PCB would cause deviations in the
compensation, and add tolerances to apparent TxEQ which are unaccounted for by the pre- and postshoot tolerances allowed in table 4-16 of the base PCIe 3.0 specification [10]. In addition, re-drivers have
a finite number of programmable settings for their CTLE and TxEQ. The discrete nature of the settings
means that, even if the re-driver has the perfect frequency response which cancels the channel
extension, there would most likely be either over- or under-equalization, both of which alter the pulse
response of the channel. Figure 15 and Figure 16 are an example of such deviation.
When a channel extension is imperfectly compensated for by a linear equalizer (re-driver), then there is
no guarantee that the equalization space tabulated in Figure 21, would appear to the Rx in the same
way it does, when the channel is only a passive channel, within the limits of the specification.
Receiver design is implementation-specific (not even having to meet any specific CTLE or DFE
equalization requirements, as was stated in section 5.5), and some receivers seem to prefer certain
regions of the equalization table in Figure 21, e.g. a specific preference for pre-emphasis and/or deemphasis. If such a region appears distorted due to the placement of a re-driver, then the receiver might
not find an operating point at a BER of 1E-12. There are other factors also which might affect link health,
such as noise (deterministic or random jitter) which might be added (or amplified) by a re-driver.
The notion that if a linear re-driver is set to over-compensate an extension (hence compensate some of
the base Spec channel also), then CEM could be met, is also flawed. Actual testing has shown this notion
to be inaccurate, since over-compensation causes TxEQ distortion just as under-compensation would
(See section 7.2). Please also note that since re-drivers are not constrained by any specification, it is not
even possible to re-define a new “Re-driver-friendly” CEM eye --which might guarantee open
interoperability-- if presumably met by an extended channel which is compensated (or overcompensated) by a re-driver!
Practically, TxEQ adaptation might succeed in finding an operating point for several host/end device
combinations. However, there is no longer an implied Spec guarantee that this would hold true always,
for any two devices that pass CEM compliance independently, in the open market. Despite this, some
designers are using re-drivers in systems that are meant to be openly interoperable. With the
information provided in this document, a designer should be better-prepared to understand the
implications, and make his or her own risk-taking judgment.
In addition, a host designer might be able to select a specific linear re-driver, and program it in such a
way that it could operate with a well-characterized (tested) set of end agents. Such a system is still
referred to here as a “Closed-slot System”, which does not support “unconditional” open
interoperability (see 10.1).
30
7. Comparative Evaluation of Linear PCIe 3.0 Re-drivers
While PCIe 3.0 and 10G-KR re-drivers can never be openly interoperable, they are sought by many
designers as a convenient method to extend those channels. Pressure is mounting to provide designers
with a method to select the best re-driver offering. In support, the authors developed, and tested, such
a method. The method provides an objective approach for comparing (and choosing) the best device for
“Closed-slot” design, and also demonstrates the degree of departure from open interoperability, thus
giving the designer a measure of such risk. The procedure is based on the following principles:A. The CEM channel is the most relevant baseline for comparing re-drivers as they lengthen that same
channel (using a PCB extension), with both extended and un-extended channels tested under identical
spec stress levels.
B. A better re-driver is one which is able to convey the TxEQ space of Figure 21 with more fidelity than
another.
7.1 A CEM-based Method for Evaluating Linear PCIe 3 Re-drivers
Referring to Figure 24, which shows a CEM channel and a CEM channel plus extension, the test
procedure is as follows:1. Calibrate the BERT for the standard PCI-SIG CEM Rx test procedure (for the specific BERT being used),
as described by the CEM test procedures, including V-swing, RJ, Sj, and DMI. (See [19]).
2. Measure the CEM eye using the CEM channel alone (Figure 24 A), at room temperature (e.g. 27 C) for
a wide range of TxEQ values, using the standard presets (P0 through P9), plus additional points as
indicated by the green ellipses in Figure 25. These points almost fully enclose the TxEQ space prescribed
by the PCIe 3.0 Spec [10]. Repeat the measurement at least 5 times, and store each result, in order to
overcome random effects in signal acquisition and associated post-processing artifacts by SigTest.
3. Connect the extension channel to the re-driver alone (Figure 26), and configure the re-driver’s
settings such that it produces P7 most faithfully when driven by the BERT (no jitter or DMI enabled),
with an extension channel. Alternately – follow the re-driver vendor recommendations for how to
select the optimal re-driver settings for the desired extension channel.
4. Connect the re-driver and the extension after the CEM channel (Figure 24 B), then re-measure the
CEM eye for the same set of TxEQ values used in step #2 above. Repeat the measurement at least 5
times, and store each result, in order to overcome random effects in signal acquisition, and the
associated post-processing by SigTest.
5. Subtract the Eye Width (EW) and Height (EH) obtained in step #2 from those of step #4, then plot the
results versus post-shoot and pre-shoot, for each one of the 5 acquisitions. Positive deltas (extended
channel is better than CEM), or the smallest negative ones, are desirable.
6. Change the re-driver settings in search of better ones (e.g. sweep its CTLE peaking), and repeat steps
#4, and #5 only.
31
7. Repeat steps #2 while heating the base CEM channel in a heat chamber to 80C. Repeat step #4 at 80C
also, while heating the CEM channel, the re-driver, and the extension channel. Perform the subtraction
of step #5 using the 80C results.
8. Repeat steps #3 to #7 with different extension channel lengths of interest.
A better re-driver produces a positive difference, or the smallest negative difference, in steps #5, #7, and
#8, at more of the marked TxEQ values of Figure 25.
J-BERT
8 Gb/s
CEM Channel
Sigtest
3.2.0
Scope
Riser + CBB + CLB
A) Measuring the eye using the CEM channel stand-alone as a reference
J-BERT
8 Gb/s
CEM Channel
Riser + CBB + CLB
Extension
Channel
& Cables
Redriver
Sigtest
3.2.0
Scope
B) Measuring the eye using the CEM channel + Re-driver + Extension & needed cables
Figure 24 The two measurement setups required by the CEM-based methodology for evaluating PCIe
3.0 re-drivers, each labelled individually (A & B).
P4
P5
P6
P9
P3
P1
P2
P0
P7
P8
Figure 25 The TxEQ Pre- & Post-shoot space (showing C+1 and C-1 coefficient values) required by the
PCIe 3.0 Base Spec, with the Presets & additional (green) ellipses suggested for bounding the space.
J-BERT
8 Gb/s
Redriver
Extension
Channel
& Cables
Scope
Figure 26 The Measurement setup used to set, or verify, the re-driver equalization & any of its other
configurations.
32
7.2 Some Test Results
Figure 27 illustrates the difference in EW and EH between the extended CEM (using a re-driver) and the
CEM channel only (Step #5, and #6 in section 7.1), at room temperature, for one linear re-driver
offering. Figure 28 presents the result for a second re-driver. The first re-driver yields the smallest delta,
in EW and EH, between the extended CEM and the base CEM. However, none of its settings show
positive (or zero) deltas at all TxEQ values, thus demonstrating that the re-driver distort the TxEQ space,
and invalidates open interoperability. Notice that the second device (Figure 28) distorts the TxEQ space
so much, that for several TxEQ values, the delta is highly negative, and that only the middle range of
TxEQ settings is produced somewhat faithfully. Both devices are almost equally linear, but the better
device of Figure 27 has a frequency domain transfer function which emulates the inverse of PCB loss
(compensates) more faithfully than the device of Figure 28, and does so up to about twice the frequency
range. Early Heat Chamber results suggest further degradation in the ability to reproduce TxEQ, even for
the better device, as expected. Heat degrades a re-driver’s ability to equalize, while the extension PCB’s
loss increases at higher temperature also. These results clearly demonstrate which device is a better
choice for Closed-slot design, and also provide a basis for judging the degree of violation of Open
Interoperability. This comprehensive methodology –based on the most relevant requirements of the
Spec (CEM)-- is recommended for evaluating which is a better re-driver offering.
It is worthwhile noting the “humps” (or re-closing of the eye) in the responses in Figure 27, as the
equalization is set beyond the optimal value for any particular transmitter TxEQ. This shows that overequalizing a channel using a re-driver can distort the TxEQ space just as harmfully as under-equalization.
It also demonstrates why the notion, that one could use a re-driver to make an extended link look even
better than the maximum Spec channel, is flawed. Early test results (not shown here), involving an
actual silicon receiver (different than the reference receiver), also indicate that there is an optimal redriver setting for obtaining the best sampling eye, and that over-equalization also causes reduction of
eye width, and reduction of timing margins to the left and right of the sampling point.
While a good receiver might have a wide-enough range to work-around the presence of a re-driver, for
instance by asking for alternate TxEQ settings from the Tx, some receivers are known for requesting only
one TxEQ value, for a long channel, and are unable to search for better ones.
33
Variability Gauge
Variability Chart for EW_Extended-CEM_Minus_CEM ps
15
10
5
-10
-15
-20
-25
C
C
C
-30
-35
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
EW_Extended-CEM_Minus_CEM ps
0
-5
0
0.8
1.9
2.5
3.5
6
0
0
2.5
0
3.5
8
3.5
1.2
2.5
0
0
4.4
3.9
3.5
6
6
1.9
0
8.8
9.5
C
C
Redriver_EQ
Preshoot (dB)
Postshoot (-dB)
Variability Gauge
Variability Chart for EH_Extended-CEM_Minus_CEM mV
60
20
0
-20
-40
-60
C
C
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
EH_Extended-CEM_Minus_CEM mV
40
0
0.8
1.9
2.5
0
3.5
6
0
2.5
0
3.5
8
3.5
1.2
3.9
2.5
0
4.4
0
3.5
6
6
1.9
0
8.8
9.5
Redriver_EQ
Preshoot (dB)
Postshoot (-dB)
Figure 27 The difference between EW (top) & EH (bot) of the “CEM+10dB (@4GHz) extension” and the
“Ref. CEM channel” using one linear re-driver offering, versus the range of TxEQs selected in Figure 21
and a number of progressively higher re-driver equalization settings. Note how no one re-driver
setting produces a positive (or zero) delta at all TxEQs. Also note how there seem to be an optimal redriver settings which are different for different TxEQ values. The red boxes indicate closed Ref. CEM
eyes using the Reference receiver.
34
Variability Gauge
Variability Chart for EW_Extended-CEM_Minus_CEM ps
10
5
0
-10
-15
-20
-25
-30
C
C
C
-35
-40
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
EW_Extended-CEM_Minus_CEM ps
-5
0
0.8
2.5
1.9
3.5
6
0
0
2.5
0
3.5
8
1.2
0
4.4
3.9
3.5
2.5
0
3.5
6
6
1.9
0
8.8
9.5
C
C
Redriver_EQ
Preshoot (dB)
Postshoot (-dB)
Variability Gauge
Variability Chart for EH_Extended-CEM_Minus_CEM mV
30
20
10
-20
-30
-40
-50
C
C
-60
-70
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
A
B
C
D
E
EH_Extended-CEM_Minus_CEM mV
0
-10
0
0.8
2.5
1.9
0
3.5
6
0
0
2.5
3.5
8
3.5
1.2
3.9
2.5
0
4.4
0
3.5
6
6
1.9
0
8.8
9.5
Redriver_EQ
Preshoot (dB)
Postshoot (-dB)
Figure 28 The difference between the EW of the CEM+8dB (@4GHz) of total extension and the “Ref.
CEM channel” alone, using a second re-driver offering, vs. the range of TxEQ values selected in Figure
21. Only the most relevant re-driver setting is shown here. Note how no one re-driver setting
produces a positive (or zero) delta at all TxEQs. The red boxes indicate closed Ref. CEM eyes using the
Ref. receiver.
8. The 10GBASE-KR Standard
10GBASE-KR utilizes 64b/66b encoding (at 10.3125 G-baud) and is part of the IEEE 802.3 Ethernet
specification [21]. It is intended to operate using PCB traces up to 1 m long, from Tx to Rx, including two
connectors which connect two boards that plug into a backplane. The recommended trace impedance
(Zo) is 100 Ohms +/- 10%. There is an N to P side (differential) skew requirement which must be less
than the minimum transition time.
KR interconnect is defined between test points TP1 and TP4 as shown in Figure 29. The transmitter and
receiver blocks include any external AC-coupling capacitors, and also packages. “Informative”
characteristics and methods of calculation for the insertion loss, insertion loss deviation, return loss,
crosstalk, and the ratio of insertion loss to crosstalk between TP1 and TP4 are defined in 69B.4.3,
69B.4.4, 69B.4.5, 69B.4.6, and 69B.4.6.4, respectively [21]. These characteristics may be applied to the
full path, including transmitter and receiver packaging and other components, such as AC caps.
35
Figure 29 The KR Interconnect reference model defining the reference test points TP1 and TP4 (Fig.
69B-1 of Annex 69B, [21]).
This standard’s channel compliance is specified in a way different than PCIe 3.0’s. In PCIe 3.0, the
channel is considered acceptable if it is passive, and if when driven by a Spec Tx plus noise injection, it
produces a certain eye opening defined inside a Reference Receiver after application of a Reference
Equalization comprising a certain CTLE transfer function and a 1-tap DFE [10]. This is also referred to
usually as the “CEM3” eye. For a PCIe 3.0 host with an open slot, a Compliance Load Board (CLB) is
plugged into the slot, and a CEM test procedure is followed, aiming to obtain an eye equal to or better
than the CEM3 eye [19], after embedding an Rx package model, and applying the aforementioned
reference equalization. Conversely, for a PCIe 3.0 card, the card is plugged into the Channel Base Board
(CBB), and a CEM eye (or larger) is sought on the other end of the channel after Rx package embedding,
and application of the reference receiver.
In KR, however, the channel is specified as a standalone entity, using an informative specification (IEEE
802.3 Annex 69B, in [21]). The 69B specification defines maximum Insertion Loss (IL) bounds, and IL
deviation bounds (ILD) at every frequency, as shown in Figure 30. Other aspects of the full channel, such
as return loss, and insertion loss to crosstalk ratio, are also specified in Annex 69B.
This approach implies that if such a channel is driven by a Spec-compliant Tx, and signaling is received by
an Rx which passes the Rx interference tolerance test [Annex 69A, [21]], then there is a high-confidence
(albeit not guaranteed) probability that such a channel would allow such compliant agents to interoperate successfully. Although the specification is informative, industry treats it as normative (i.e.
required).
36
Figure 30 Annex 69B [21] fitted pad to pad channel Insertion Loss (IL), and allowed maximum IL
deviation (ILD) from the fit.
8.1 The 10GBASE KR Equalization Standard
10GBASE-KR Equalization Definitions
In 10G-KR, the transmitter is required to be able to produce a wide range of equalizations, slightly larger
de-emphasis (post shoot) than that of PCIe 3.0, but less pre-emphasis (pre-shoot). To understand 10GKR transmitter equalization, refer to Figure 31 which defines the voltage levels of a standard equalized
waveform when the Tx is driving a standard load of 50 Ohms per side, through 100-nF decoupling caps.
10G-KR defines so-called Equalization ratios Rpre and Rpst, which are defined as follows (referring to the
voltage levels in Figure 31):Rpst = v1 / v2 , Rpre = v3 / v2
Table 1 shows the min to max ranges of the TxEQ ratios defined above, along with the permissible
values of the DC voltage, when driving the reference load described above. Note that Rpst, in dB, can
range from +0.45 to -12 dB (including tolerance), while Rpre can range from +0.45 to -3.75 dB.
The waveform of Figure 31 is obtained at the Tx output by using an FIR filter which is identical to the one
used for PCIe 3.0, and shown previously in Figure 22. Hence, 10G-KR also implies use of the coefficients
C-1, C+1 and C0, where one could derive equations which define these coefficients in terms of the three
voltage levels v1 , v2 , and v3 defined in Figure 31 . Implementation of coefficient values greater than
zero or less than the minimum values defined by Rpre (min) and Rpst (min) in Table 1, is optional.
Note that unlike PCIe 3.0, the 10G-KR equalization space does not require that the sum of the absolute
values of the three coefficients (|C-1|+|C+1|+|C0|) be normalized to 1. Instead, the range of v2 defined in
Table 1, along with the range of the equalization ratios (Rpre and Rpst) are all that constrains the values
of those coefficients.
37
Table 1 The required ranges of the 10G-KR pre- and post-shoot equalization ratios Rpst and Rpre,
along with the range of the DC level v2.
Figure 31 The 10G-KR Transmitter output waveform equalization parameter definitions (Clause 72,
[21]).
Summary of the 10G-KR Equalization Training Protocol
When two Ethernet devices are connected, then upon Rx detection in both directions, the link starts
Auto Negotiation (AN) (section 69.2.4, [21]). AN is the process by which the two devices communicate
38
their interface capabilities (at low speed, using Manchester encoding), and settle upon the highest
common denominator of link type, and speed. After AN, the link starts the TxEQ training process by
which both receivers try to find the best TxEQ setting of the other port’s Tx, in order to ensure operation
at a BER of 1E-12. The protocol allows a total of only 500+/-5 ms for both link partners to finish their
training.
Training is conducted through use of a Training Frame which has four parts: A Frame marker to denote
the beginning of the frame, a coefficient update section where the Rx sends to the other agent’s Tx its
request for a new coefficient setting, a “Status Report” where the Rx reports back to the other agent
whether it has been able to tune its Rx using the coefficients that it has requested, and finally a training
pattern which consists of 4094 bits from the output of an 11-bit pseudo-random bit stream (PRBS-11).
Both ports (end agents) enter the training state concurrently, and they send each other the above
Frame, making coefficient change requests, and training their Rx, until both finally find the coefficients
that they prefer, in order to achieve “Receiver Ready” status. Training ends after both agents report
back to the other agent that their receiver is trained and ready to bring up the link.
The protocol begins Training (via use of a Training Frame at low speed) where each end point requires
the other Tx to set its TxEQ to the “INITIALIZE” values corresponding to Rpre = 1.29 (2.2 dB) +/-10%, and
Rpst = 2.57 (8.2 dB) +/-10%, and that C0 shall be set such that the peak-to-peak differential output
voltage is equal to, or exceeds, 800 mV for a 1010… pattern. From that point on, an Rx is allowed to
request from the other agent’s Tx to increment, decrement, or hold the coefficient values. However,
only one coefficient is allowed to be changed at a time, in each direction. Table 2 defines the required
change magnitudes of the three voltage values shown in Figure 31, every time there is a coefficient
change request. Note that changing any of the individual coefficients C-1, C+1 or C0, results in a change
required for all the three voltage values v1, v2, and v3.
After changing a coefficient, a Tx sends the training pattern using its new equalization settings. The
receiving Rx tries to detect the incoming pattern as it adapts its receiver’s CTLE (and DFE, or any other
resources). When the receiver has completed training, and has determined that it is ready to bring up
the link (the implementation of this decision process is left to the PHY designer) it then sets the
“Receiver Ready” bit in its Status Report which is sent back to the other agent, in order to inform it of
tuning success. The specification for the Rx interference tolerance test [Annex 69A [21]] requires a BER
of 1e-12 to pass, but this is not explicitly required for achieving “Receiver Ready”. That being said, most
users expect, or require, a BER of 1e-12 or better. Training ends if both receivers achieve “Receiver
Ready” within 500 ms +/-1%.
39
Table 2 Transmitter output waveform voltage level requirements related to 10G-KR coefficient
updates.
9. Re-driver issues in 10G-KR
The 10G-KR base specification (Clause 72, [21]) is silent on extension devices (including re-drivers).
When an active device (other than a Spec-compliant re-timer) is placed in a KR link, the above scheme of
open-interoperability (sections 8.1.1 and 8.1.2) based on passive channels meeting the 69B Spec, Tx
training protocol, and Rx tolerance testing is no longer valid strictly, for the reasons detailed in the next
two subsections on re-driver issues.
9.1 Limiting Re-driver and KR’s TxEQ Training Protocol
A limiting re-driver blocks the signal equalization (coefficient) changes (which are requested of a Tx)
from reaching the requesting Rx during training. In this case, the issues with such a re-driver are similar
to those described in section 6.1, for PCIe 3.0. Please refer to that section for more details. If this is the
desired implementation of the system, then the devices are no longer compliant with the IEEE
specification for KR, and it is therefore recommended to forego both AN and Training, and simply use an
engineered link that is forced to 10G.
9.2 Linear Re-driver and KR’s TxEQ Training Protocol
Since the KR Specification has a set of TxEQ range requirements as described in section 8.1.1, and a
channel specification (Annex 69B, [21]) which is treated as normative by the industry, then a channel
using a linear re-driver cannot be qualified, strictly, as being compliant, and openly interoperable. The
69B specification implies a passive channel. If a linear re-driver is introduced, and its transfer function is
convolved with the channel’s transfer function, then even if one could show that the resulting SDD21
amplitude of the channel is still within the bounds of Figure 30, there remain two issues: The first is that
69B has no phase shift specification against which to check the overall channel response, and the second
is that the shape of SDD21 will have characteristics that are not normal for a passive channel, due to the
presence of the active device.
40
Similar to the PCIe 3.0 case (section 6.2): There is an issue which arises when a linear active device is
introduced into a channel. As explained in section 3.2.2, linearity only implies that the relationship
between the incoming and outgoing signal amplitudes is that of direct proportionality (a straight line),
irrespective of their respective shapes. The active device would alter the incoming signal shape
according to its frequency domain transfer function H(s), which is a function of its CTLE, TxEQ, and
amplifier frequency response. In principle, it is very difficult to design an equalizer (using a finite number
of poles and zeros, and today’s gain-bandwidth product process capabilities), to compensate perfectly
for the near-linear loss function of a PCB channel (section 1.3). There are extraneous amplitude and
phase effects which render the compensation imperfect. In addition, re-drivers have a finite number of
programmable settings for their CTLE and TxEQ. The discrete nature of the settings means that there
would most likely be either over- or under-equalization, both of which alter the pulse response of the
channel. When a channel is extended, and the extension is imperfectly compensated by using a linear
equalizer, there is no guarantee that equalization space described in Table 1, would appear to the Rx in
the same way it does when the channel is only a passive within the limits of the specification (Figure 30).
Practically, TxEQ adaptation might succeed in finding an operating point for several host/link partner
combinations. However, there is no longer an implied Spec guarantee that this would hold true always,
for any two devices that pass compliance independently, in the open market. Despite this, some
designers are using re-drivers in systems that are meant to be openly interoperable. With the
information provided in this document, a designer should be better-prepared to understand the
implications, and make his or her own risk-taking judgment.
Linear re-drivers are, however, amenable to use in a “Closed System” (Section 10.1). Such a system
could operate with one, or a small and well-characterized (tested) set of link partners, while allowing the
TxEQ adaptation protocols on both ends of the link to proceed as defined in the specification. Just as
described for PCIe 3.0 (section 6.2), such a system should undergo careful design and validation, in order
to guarantee HVM yield, and tolerance to PVT conditions.
10. Designing with Re-drivers
Once it has been decided that a re-driver is appropriate for use in a link, then one would have to address
other practical considerations, as described I n the following sections.
10.1 Closed System Using a Re-driver
This is a system using a re-driver, and which is foregone as non-openly interoperable. It is limited to a
host, and one or a few devices (such as a few PCIe 3.0 add-in cards, or KR link partners). As mentioned in
prior sections, in the case of PCIe 3.0, such a system might deploy a limiting re-driver (if Phases 2 and 3
of TxEQ adaptation are disabled), or a linear re-driver with adaptation enabled. Programming the redriver settings during validation is the only available degree of freedom in such a system.
A main consideration with closed systems is that they require careful HW validation, in order to ensure
that the Host and end device Silicon skews, PCB, temperature, humidity, and voltage variations do not
cause failures in the field (HVM). The dilemma is that the system itself is usually not available for such
extensive testing, until it has been designed and prototypes delivered. To mitigate such a dilemma, re41
driver evaluation cards could be used to gain a very good level of confidence as to whether such a
system would be reliable. This still requires that the Host and end device (or devices) be available for
such testing, in addition to a platform with which to perform the evaluation. This latter requirement can
cause un-acceptable time to market delays. Erring on the side of shorter link extension is the best way to
enhance the chances of a successful design.
10.2 Choice of Re-driver
Choosing the right device for a design always starts at the datasheet. However, there are usually
unaddressed properties in published literature. For USB3, SATA, and PCIe 2.0, one of the most important
characteristics, of limiting re-drivers, is the amount of available pre-channel equalization (CTLE) at the
input of the device. Many data sheets state inches of FR4, but as discussed in section 4, FR4 can differ
widely in its loss. In addition, the shape of the equalization curve, as a function of frequency, is relevant,
and determines the amount of residual un-compensable ISI. One method to address this is to procure a
re-driver evaluation KIT, and measure the eye opening using a JBERT and various pre and post-channel
loss values. Despite this, one has to remember that even for those busses, which have simple TxEQ
schemes, using re-drivers is not strictly Spec-compliant (Section 4).
For Closed PCIe 3.0 or 10G-KR systems, linear re-drivers offerings differ in their linearity, and frequency
response. Linearity has been improving recently, but re-drivers still remain widely different in their
frequency response (transfer function, or SDD21), which distorts the TxEq space required by Specs.
Section 7 details the recommended method for evaluating a re-driver’s TxEQ distortion.
For all re-drivers, the authors have noticed that high temperature affects their behavior differently.
Hence high-temperature testing using Evaluation KITs is highly-recommended.
10.3 Placement
Placement of a re-driver is usually dictated by the system architecture and its usage model, in addition
to Signal Integrity considerations. A re-driver might be needed on one large coplanar board, where both
host and device (or output connector) reside, such as with USB3, where the designer has more
placement freedom. In many cases, there might be at least one connector, and in many cases a second
connector such as in the case of a backplane. In other cases, host and device might reside on different
boards, and the topology might comprise a mezzanine connector, in addition to a slot connector, or a
USB connector, etc. More complex multi-connector topologies have also been requested. Backplanes or
mid-planes may not be appropriate for repeater placement, due to layout, power supply availability, or
usage model restrictions such as re-usability across multiple generations of agents (or link partners).
From a usage model perspective, an open system (where add-in cards or blades have to be swapped)
imposes interoperability requirements that cannot be met in the case of PCIe 3.0, and also 10G-KR. In
the case of a closed PCIe 3.0 system (where both host and end agent are predetermined and known),
the designer might still have some flexibility in exactly where to place a re-driver.
Figure 32 shows the different Signal Integrity considerations, which arise, depending on the placement
of a limiting re-driver (e.g. in the case of USB3 and SATA3). Figure 33, on the other hand, shows the
42
considerations for a system using a linear re-driver (e.g. in the case of Closed-slot PCIe 3.0). These
considerations might have to be traded away, sometimes, for more overriding system design
requirements.
Close to the Tx:- Little use of Input CTLE
- Risk of higher un-compensable
ISI jitter due to over-equalization
caused by even the min CTLE setting
- Output TxEQ not enough to
compensate long post-channel.
Approx. in middle third:- Best use of Input CTLE in
conjunction with output
TxEQ
- Lower residual ISI
- Acceptable noise-caused
time jitter
- Lower risk of Rx overdrive.
Rx
Redriver
Redriver
Redriver
Redriver
Full Extended Channel Using a Limiting Re-driver
Tx
Close to the Rx:- Input CTLE not enough
to compensate long pre-channel
- Higher noise-caused jitter
- Higher residual ISI jitter
- Little use of output TxEQ
- Final Rx may saturate (overdrive).
- Risk of signal being weaker than
switching &/or squelch threshold.
Figure 32 Signal Integrity considerations and tradeoffs guiding the placement of a single limiting redriver in an extended full channel.
Approx. in middle third:- Acceptable linearity
- Acceptable noise amplification
- Maximal observed channel
extension in the lab.
Rx
Redriver
Redriver
Redriver
Close to the Tx:- High risk of operation in nonlinear region of amplitude.
- Higher risk of un-compensable
ISI due to non-linearity
- Low risk of noise amplification.
Redriver
Full Extended Channel Using a Linear Re-driver
Tx
Close to the Rx:- Best amplitude linearity
- High risk of noise amplification.
Figure 33 Signal Integrity considerations and tradeoffs guiding the placement of a single linear redriver in an extended full channel.
10.4 Power Supply Noise
Many re-drivers obtain their power from either 3.3V, 2.5V, or in some cases even 1.5V supplies. Some
re-drivers contain internal voltage regulators to control circuit behavior, and also reduce supply noise.
Attention should be paid to system supply noise, by following the vendor’s bypassing guidelines, in
order to reduce deterministic jitter.
10.5 PCB Material Reported in Re-driver Data sheets, and Evaluation Cards
The great majority of re-driver datasheets list channel extension in inches of FR4. There is a wide range
of FR4 loss in dB/in (from about 1 dB/in down to about 0.7 dB/in at 4GHz), and temperature affects its
43
loss by anywhere from 0.2%-0.4% per Degree C, in addition to moisture content. Therefore, it is
important to inquire from the vendor as to what the loss of the PCB is in their reported channel
extension tables. It is more meaningful to represent channel extension in dB at a bus’s Nyquist
frequency, than in inches.
Most vendors provide free Evaluation KIT (or board) samples. They come in Edge-connector, SMA, or
SMP form factors. Some USB3 re-driver boards also offer Std-A or Std-B connectors.
10.6 Power Consumption
During active mode, re-drivers consume a significant amount of power, ranging from 100 to 200 mW per
channel, roughly, depending on the specific device. Some re-drivers offer low-power or sleep modes
(e.g. when the load is disconnected), where standing current is reduced, and power consumption drops
to 20-50 mW per channel. For a single lane (two channels) the numbers are doubled. Furthermore, for
PCIe 3.0, which usually has the highest number of lanes (x4 and up to x32), the total active power can be
significant, and should be accounted for when architecting and designing a platform. If a bank of redrivers can be placed away from the hot spots of a platform, then that might be a good choice, assuming
that Signal integrity requirements are also fulfilled.
11. Simulating Channels with Re-drivers
11.1 The Two Modes of Signal Integrity Simulations
Most Signal Integrity simulators are behavioral (i.e. not at the transistor level). They allow a designer to
examine channels over HVM variations, including PCB and device PVT skews. They facilitate using a
Design Of Experiments (DOE) followed by a Response Surface Model (RSM) fit, which allows prediction
of failure rates. These simulators run much faster than transistor-level Spice circuit simulators. Their
shortcoming is the need to build (or obtain) a behavioral model which emulates the Tx and Rx circuit
behaviors, faithfully. These simulators usually require either a pulse response of the channel (from pad
to pad), a netlist, or an S-parameter file (e.g. a Touchstone file) representing the channel. Behavioral
simulators usually have two modes: A “Statistical mode”, and a true Time Domain (transient), or “Bit-byBit” mode.
Statistical (a.k.a. Analytical) Simulation Mode
In the statistical (or analytical) mode, the simulator assumes perfect linearity of all components. It
characterizes the full channel from Tx output die pad to Rx input die pad. It does so by either producing
(or requiring) a full-channel pulse or step response. It then applies a probability distribution convolution
of the bit stream (say a long LFSR random stream), the characterized channel, Tx equalization, and Rx
equalization. The resulting final statistical eye is then analyzed mathematically for width and height at a
specific BER. Rx equalization settings might also be adapted, in search for the best solution.
A Statistical simulator is suitable for analyzing full channels that comprise a linear re-driver, assuming
that the re-driver is receiving signals whose amplitude is small enough to guarantee that it is indeed
operating in the linear region, and not suffering from saturation (or compression, 3.2.2).
44
A statistical simulator is not fit for analyzing full links with limiting (non-linear) re-drivers. One might
think that such links could be analyzed in the statistical mode, if the full channel is broken into two
completely separate simulations: One from the source Tx, through the pre-channel, to the re-driver
input pads, and a second from the re-driver output Tx, the post-channel, to the final Rx postequalization. The problem with such an approach is that a statistical simulator assumes perfect linearity,
and cannot handle what happens inside the re-driver, since the re-driver’s limiting amplifier (post EQ)
reshapes the signal non-linearly. Even if one assumes that the output wave shape could be assumed to
be an ideal pulse train with equalization and proper edge rates, the user would have to gather all the
jitter information at the output of the re-driver’s equalizer followed by its non-linear amplifiers, and use
such jitter as input for the post-channel simulation. A very tedious task, especially when performing DOE
simulations, where the pre-channel might have tens of variants, each one of which producing a different
set of jitter numbers, each one of which, in turn, having to be combined with the tens of post-channel
simulations.
Bit-by-bit (Transient, or “Empirical”) Simulation Mode
In this mode, a simulator performs a true time domain transient simulation (picosecond by picosecond),
using 1E4 to 1E7 bits, and accumulates the eye inside the Rx, for later width and height analyses. These
simulators are slower than the statistical ones. Due to their longer run times, they are used to
accumulate an eye over at most 1E6-1E7 bits (at the time of this writing). They build a bath tub diagram,
then they project the eye width and height to higher bit counts (lower BER) mathematically. The bit-bybit mode of a simulator is suitable for analyzing both linear and non-linear (limiting) re-drivers.
11.2 General Simulation Guidelines for Re-drivers
The advent of IBIS-AMI [22], and the ability of simulators to read IBIS-AMI models make it possible to
simulate channels with re-drivers, behaviorally. However, there are still some issues to be considered.
When acquiring models from a re-driver vendor, it is important to inquire about the following:1. The Silicon, Voltage, and Temperature (PVT) corners that the model covers
Silicon circuit designers use several (sometimes tens) of corners for gauging the performance of their
circuits. There are many circuit variables; such as N-type versus P-type transistor cross skews, resistors
and capacitors which are unrelated to transistors, interconnect metal resistance and capacitance, plus
voltage and temperature corners. Thus, if a vendor gives only 3 model corners, then there might be predetermined statistical assumptions, such as: N and P transistors are both slow or both fast together (i.e.
no cross-skews), and that the back end of the process (resistors, caps, and metal) might also be assumed
to be slow or fast at the same time that the transistors are. This builds pessimism into the model, which
would yield pessimistic (albeit safer) simulation results, yielding shorter channel extension capability.
Model corners which predict temperature dependence (along with Si fast, typical, and slow skews) are
very important, since many circuits usually lose gain and bandwidth at high temperatures.
2. The number of Standard Deviations (Sigmas) of variation at model corners
A model which represents 6-Sigma corners will --when convolved with the channel variation corners-yield very pessimistic statistical results.
3. How well the model and simulator represent Rj and supply noise (Dj)
45
The deterministic non-ISI jitter of a re-driver is a function of its supply noise (which is determined to
some degree by the system power supplies), and also the variation in its output switching rise and fall
times (Duty Cycle Distortion). Furthermore, Random jitter (Rj), and supply-caused jitter are also
functions of the chosen output voltage swing. Model creators might not be accounting for such
dependencies, especially if they are not the original designers of the device.
4. How well the model represents the input SDD11, and the output SDD22
One must ascertain whether the model represents the input and output terminations correctly, and if a
package model and die input capacitance are included.
5. Whether the model has the same programmability of the VOD, CTLE, TxEQ, and other settings as
the physical part
6. How well the model reflects the phase response (of SDD21) in the case of a linear re-driver
7. How well the model represents the onset of non-linearity of a linear device
A model which assumes perfect linearity, irrespective of the input voltage amplitude, and provides only
the equivalent of a frequency domain transfer function, can misrepresent the true signal shapes at the
output of a re-driver. For instance, S-parameter models of a linear re-driver are sometimes offered.
While S-parameter models allow very fast simulations, they force the assumption that the device is
perfectly linear in the relevant region of operation, which might not be the case always.
The reader is advised to exercise the model stand-alone, e.g. with a simple resistive load, in order to
explore its properties, before advancing to full-channel simulations. One should understand what the
model corners cover, and how they should be used. If the model lacks corner settings, or if the corners
do not cover PVT far enough, then one has to keep additional design margin. Conversely, statisticallyunreasonable corners could add pessimism.
12. Ascertaining Link Robustness of Closed PCIe 3.0 and 10G-KR
Systems using Re-drivers
Due to TxEQ adaptation (or training) in both protocols, re-drivers present similar issues to both buses, in
addition to what was discussed in sections 6 and 9. When a system designer chooses to use a re-driver, a
determination has to be made as to which vendor’s device to choose, and how to select its settings for
maximum production yield. The selection unlikely to be based on accurate simulations, due to the lack
of accurate corner models of all of the host, re-driver, and add-in card (AIC) or link partner end agents,
all simultaneously. But simulations might give an initial approximate method to evaluate options.
Usually, an initial a determination is made for the placement of the re-driver (e.g. on a riser card, a
backplane, or somewhere between the host and a PCIe 3.0 slot on the mother board, etc.). This is based
on initial approximations, using the device datasheets, the loss of the overall channel, the availability of
power supplies, and ensuing pre and post channel lengths. See section 10.3 for some placement
considerations.
46
The final selection of the re-driver’s DC gain, amplitude, and peaking equalization settings have to be
made during system testing and validation. Those settings then have to be appropriate for the life of the
system over a number for years, where several parameters could vary.
During high-volume production, the items described in the following sub-sections could be changing
over time. They would all affect whether the receivers on both ends of a PCIe 3.0 or 10G-KR link could
find BER 1E-12 operating points or not. For PCIe 3.0 this applies to both the case of disabled Phase 2 & 3,
and the case where a linear re-driver is used while phases 2 & 3 are enabled. Similarly for 10G-KR’s
Training.
12.1 Silicon Skews of the Host and End-agent (AIC or Link Partner)
While Tx and Rx variations are present in spec-compliant un-extended links, their significance increases
with the presence of a re-driver. The reasons are that the distorted EQ space might be requiring the
receivers to work harder (go to an extreme AGC, CTLE, DFE, etc.), in order to support a re-driven
extension, and the added variations of the re-driver and link extension.
Transmitted jitter can be a strong function of process corners (the cold and slow process corner in CMOS
devices usually produces the most jitter). The Rx on the other hand is more sensitive due to its input
equalization’s sensitivity to process corners. An Rx’s CTLE equalizer gain, peaking, overall bandwidth,
and to some degree the amplitudes of the DFE coefficients, are process-sensitive and could shift,
especially if the device is not designed to compensate, at least partially, for process variations.
It is not always easy to obtain skewed devices from manufacturers for validating a system, due to the
logistics of fabrication plants. In some cases, even design teams do not have access to all possible skews,
unless the Fab runs skew lots, or cherry-picks dies, occasionally. Hence, link margining using whatever
parts are available, plus temperature and voltage skewing, could be used as a minimal alternative in
order to ascertain link health.
12.2 Silicon Skews of the Re-driver
Since a re-driver is an added element to a channel, unaccounted for by Specs, then its variation is also
not accounted for. Hence, re-driver variation has an added effect on a link.
When designing a system which uses re-drivers (e.g. USB3, SATA3), or a closed PCIe 3.0 or 10G-KR
system which uses re-drivers, it is important to study the effects of the re-driver’s semiconductor
process skews on link health. Gain-bandwidth is a function of the process corners, whose number can be
numerous. Vendors may, or may not, be able to supply parts (or evaluation boards) that represent
significant skew corners, due to logistical limitations in coordination with a fab, especially when dealing
with small volumes. Hence, some vendors rely on simulations, or cherry-picked devices, to ascertain if
their device remains within its datasheet specifications. Some semiconductor designs attempt to
stabilize their part -- versus skew corners-- by using post-Silicon trim (or fusing), in order to “trim”
variation. It is important to understand, from the vendor, the expected range of variation, which is not
always stated in a datasheet.
47
12.3 PCB Material Variations
The transmission line impedance and loss are also functions of HVM and PVT, even from one vendor.
While it is realistic that one vendor’s impedance might remain under control to within +/- 5% to +/-7%,
and loss to be within +/-5%, a 5% loss change in 10 dB of channel extension amounts to 0.5 dB shift in
required equalization.
12.4 Supply Voltage Effects
Deterministic uncorrelated jitter added by a re-driver is a function of noise in both its Tx’s and Rx’s
power supplies. Good repeater designs tend to have internal voltage regulators which provide an
accurate internal DC rail, with reasonable noise rejection. Nevertheless, some attention must be paid to
designing the local external supply, with minimal high-frequency AC noise.
A careful reading of serial bus specifications would reveal that jitter characterization usually requires
activation of all lanes in a port, not just one bit. Similar margining should be performed, when using a redriver, in order to understand the effect of its internal (and external) supply noise during heavy
switching, and also internal circuit (and package) coupling effects between its channels.
12.5 Temperature (and Humidity) Effects
Temperature has the most significant effect on a system, more so than well-regulated supplies.
Temperature affects not only host and end agent Tx and Rx analog circuit bandwidths, gain, and jitter,
but also the loss of the connecting PCB, and the re-driver itself. Compared to an unrepeated link, the
added pieces are only the re-driver and the additional PCB (or perhaps cable) extension. Hence, to a first
order, only the effect of temperature on the re-driver and the added PCB need special attention.
PCB loss variations from room temperature (27 C) to about 80 C (a 53-C temperature shift) and humidity
can range anywhere from 10-20%, or 1-2 dB of extra loss for 10-dB of extension [16] [17]. Furthermore,
the re-driver’s jitter and equalization are functions of temperature. The host and device Tx and Rx jitter
and equalization are also functions of temperature, but those are present in links even before extension.
It is only when an Rx has to exercise more of its resources to service a link with a re-driver that
temperature variation of that receiver might become a limiter.
Furthermore, in many cases, a link is trained when the system is still cold (at least the PCB and host are
cold), then has to operate when the chassis’ internal temperature rises to its allowed maximum air
temperature (e.g. 75-80 C). That is sometimes referred to as “Train Cold Run Hot” (TCRH), and the
opposite is also possible (THRC). When one examines all the components that could change, then unless
the receivers on both ends of the lane have enough dynamic equalization range, a system with a redriver might become marginal. It has been demonstrated in the lab that this usually requires reducing
the amount of extension, until proper BER operation is still achieved in TCRH cases.
One might argue that a linear re-driver’s EQ should be set to be higher than needed, in order to
compensate for the channel extension, and handle high temperature. In a validation environment at
lower temperatures, test results would actually appear to reduce the amount of possible extension as
demonstrated in section 7.2.
48
In the case of limiting re-drivers (for USB3, SATA, or PCIe 2.0), over or under-compensation is
transformed into un-compensable data-dependent pure jitter, which would also reduce link margins. A
possible alternative would be to select the re-driver’s optimal settings at a mid-range temperature, as a
way to deal with TCRH (or THRC), thus minimizing the amount of un-compensable jitter created at either
extreme.
12.6 Example of Adding Overall Sources of Variation
First, we have to assume that the temperature variations affecting the Tx and Rx, on both ends of the
un-extended maximum Spec link, could be absorbed by those two devices, as they are supposed to be
Spec-complaint at all of their PVT corners. What remains, then, is the variation in the PCB extension, the
re-driver, the distortion of equalization space caused by the introduction of the re-driver into the
channel, and how that might tax the receivers on both ends of a link.
Consider a PCIe 3.0 system using a re-driver to extend the channel by 10dB, with phases 2/3 enabled. It
trains cold, and then it has to operate at 80C. It is evaluated by using nominal devices (corner skews
unknown) from host and AIC vendors, and the re-driver settings are selected at a mid-range
temperature of 55 C. Over HVM, temperature, and humidity, the following variations might occur in the
link:•
•
•
PCB loss of the channel extension might change by +/-0.5 dB due to manufacturing and impedance
variations.
PCB loss of the channel extension changes by +/-0.5-1 dB due to a 25 C temperature rise or drop
around 55 C.
The re-driver Silicon skew is different and its equalization shifts by +/- 1 to 2 dB.
The above scenario shows that a re-driven channel can experience a total change on the order of up to
+/- 3.5 dB. This represents 15% of the PCIe 3.0 maximum channel loss Spec of 23.5 dB. Therefore, it is
advised that closed systems using re-drivers be reasonably stressed during validation. Hence, it is
advisable to try to exercise the following combinations or tests during validation:•
•
•
•
•
Obtaining impedance and loss corner PCB boards of all main pieces in the system.
Asking the vendors of all the 3 active devices in the link (host, re-driver, end agent) for skew corners,
but keeping in mind the low likelihood that such a request could be met.
Optimizing the re-driver gain, amplitude and equalization settings at a temperature mid-way
between the coldest and the hottest in the operating range.
If re-driver’s skew devices are unobtainable, then a qualitative indicator of link robustness could be
had by adjusting its equalizations up and down by +/- 1 to 2 dB around the optimal setting. If the
system continues to function properly, over the full temperature range, then that increases
confidence.
Agents differ in their ability to indicate link health, but if possible; evaluation of the eye margins (or
BER), on both ends of the link, using all the above combinations, is advisable.
49
13. Re-timers
Re-timers are mixed-signal analog and digital devices. A Re-timer has a Clock Data Recovery circuit (CDR,
[5] [6]) similar to a SerDes PHY. A re-timer converts an incoming analog bit stream into purely digital bits
that are stored (staged) internally; it then re-transmits the digital data anew. Thus, a re-timer breaks a
link into two distinct sub-links, which are completely independent from each other from a Signal
Integrity (analog amplitude and timing) perspective. Since a Spec-compliant re-timer re-establishes
signal shape and transmitted jitter just like a compliant Tx would, it could double channel reach. Hence,
re-timer allows longer reach than a re-driver, since it does not suffer from jitter accumulation. Re-timers
render pre- and post-channel analyses simpler than the case of re-drivers, due to the complete
separation of the two. They also tend to provide debug capabilities, such as eye margining, and
Compliance patterns. Re-timers are substantially more complex than re-drivers, larger in die and
package size, are more expensive, and are offered by fewer vendors than re-drivers.
13.1 Re-timer Types
As described in the introductory section (2.2), re-timers are of two types: Bit Re-timers (Protocol-unaware, or TxEQ Training-incapable), and Intelligent or Protocol-aware Re-timers (TxEQ Trainingcapable). The following subsections describe the main features of each type.
Bit Re-timer (Protocol-unaware or TxEQ training-incapable) Micro-architecture
Such a device usually comprises only a CDR [5] [6] to convert the incoming data stream into a purely
digital one, then re-transmitting it anew, with fixed equalization, and new jitter. Figure 34 shows the
conceptual micro-architecture of a Bit Re-timer. A bit re-timer is usually programmed by the designer for
a specific output TxEQ, and cannot participate in TxEQ training such as for PCIe 3.0 [10] or 10G-KR [23].
These re-timers are fit for busses which do not require such training, such as SFI [24]. Hence, in the case
of SFI, such a re-timer is not really “Protocol Un-aware” since Equalization training is not part of the
protocol, and such an appellation would be non-descriptive.
There are several CDR architectures, as described in [6], which is highly-recommended reading. Here we
chose one of the robust ones, for the sake of illustration only. Although a re-timer could recover the
clock from the data itself alone, a Reference Clock (RefClk) is usually provided, in order both to speed-up
locking onto the correct data rate, and also to avoid false locking. The input RefClk is used to synthesize
a nominal baud-rate frequency using a Voltage-controlled Oscillator (VCO) composed of a number of
delay stages (e.g. 5 stages). The output of each VCO stage is fed to a Phase Interpolator (PI) which is
used to create the final clock used to sample the data (eventually becoming the recovered clock at the
true incoming data rate).
Since the incoming data rate is determined by the RefClk of the sending Tx, there will be a slight
frequency difference between that clock and the RefClk shown in Figure 34. The difference is allowed by
Specs usually (in PCIe 3.0, a RefClk can be within +/-300 ppm of 100 MHz). To compensate for the small
frequency delta, the PI uses the VCO phases fed in from the PLL, to generate the “Recovered Clock”, by
interpolating between an adjacent pair of those phases. The PI keeps “sliding” its selection of what pair
to use, and the ratio between them, such that it adjusts its output phase continuously. The PI performs
its function based on feedback from the CDR. In the meantime, the CDR tries to position the sampling
clock close to the center of the incoming data eye, using its own internal phase detector and control
50
decision loop. The desired location of the sampling clock is communicated via controls sent back to the
PI, guiding it in sliding and adjusting its Recovered Clock phase.
In addition to the CDR’s internal control, a Finite State Machine (FSM) is used to adjust the Input CTLE,
and Gain (AGC) in order to provide a more open eye. The algorithm used to achieve this is usually
proprietary. In addition, the recovered bits are (or can be) used to feed a DFE generator, which adds
equalization to the output of the Gain stage, in order to open the eye further (as explained later in
13.2.6). DFE control is usually also under FSM control, and is part of the proprietary algorithm used by
the Receiver. At the end of this process, the receiver is expected to be achieving the desired BER (e.g.
1E-12 for PCIe 3.0). A re-timer usually comprises other functional blocks, which will be described later in
section 13.2.
Programming
and Readout
I2C, or SM
Bus
Microcontroller
Debug
Controls
Reset
R-term
Recovered
Bits
(PD & loop)
FlipFlop
FlipFlop
FlipFlop
PI Control
PI
Recovered Clock
Output
R-term
CDR
Σ
Rx Detect
Driver
Gain
CTLE
R-term
R-term
Input
Staged Data Bits: “11….101”
DFE
TxEQ
Finite State
Machine
Clk Phases
PLL
RefClk
(VCO-based)
Threshold
Detector
RefClk
To Microcontroller
Figure 34 Conceptual micro-architecture of a “Bit Re-timer”, showing input CTLE, AGC, DFE, a CDR, and
a data path connected directly to an output driver with TxEQ.
Protocol-aware (Or TxEQ training-capable) Re-timer Micro-architecture
Such a device comprises all the blocks of the Bit Re-timer described in section 13.1.1, and in addition
contains a more sophisticated microcontroller which allows it to participate in TxEQ training
transactions (Figure 35). Training must occur with the two other link partners at the re-timer’s input and
output (upstream and downstream, respectively), including for example Phase 0 through Phase 3 of
TxEQ Adaptation in PCIe 3.0 [10] (section 5), or TxEQ Training in Ethernet Fabric’s KR [23] (section 8.1).
Protocol-aware re-timers are the only kind of SerDes repeater which is capable of being truly fully Speccompliant with PCI3 [10] and 10G-KR [23]. They are the most-expensive, and are also offered by a small
number of vendors (at the time of this writing) due to their complexity, their need for both analog and
digital design expertise, and protocol IP ingredients, simultaneously with access to a high-gainbandwidth analog-digital semiconductor process.
51
Assuming that such a re-timer is truly both protocol-compliant and also electrically-compliant, then its
use is the most straightforward from a signal integrity point of view --but not necessarily from a
programming, or debugging, standpoint. Its use in PCIe 3.0/4.0 or 10G-KR should support openinteroperability, as described in sections 5.5 and 8.1. More details on the PCIe 3.0 and 10G-KR versions
of re-timers are given in sections 14 and 15.
In PCIe, there are 3 clocking architectures: Common Ref Clock with or without Spread Spectrum Clocking
(SSC), Separate Ref Clock with Independent SSC (SRIS), and Separate Ref Clock No SSC (SRNS) [10]. The
SRIS and SRNS architectures cause mismatch in the data rates between sender and receiver, which must
be compensated for in PCIe 3.0 by insertion or removal of extra bits using Skip Ordered Sets (SOSs). The
SRIS mode is more stressful than SRNS, and causes the highest mismatch in data rates, due to the added
independent clock modulation on both ends of the link. It is the mode which sets the maximum
requirement on buffer depth, and also the maximum number of re-timers in a link.
A protocol-aware re-timer has two main modes of operation: Forwarding, and Training (or Execution). In
training mode, the re-timer tries to set itself and its link partners for the appropriate bit rate, port
widths, etc., and complete TxEQ adaptation. After training, the re-timer moves into forwarding mode
(also called data mode), where it behaves like a bit re-timer, with some exceptions, such as insertion (or
removal) of SOSs, in order to match the baud rate throughout the link. Note that a re-timer could start
in forwarding mode upon system reset, in order to allow slow back channel communication, before
moving to training, and then the eventual return to forwarding as the final high-speed operational
mode.
The Physical Coding Sublayer (PCS), shown in Figure 35, performs bit stream encoding and decoding as
required by the different bit rates (PCIe 1.0, 2.0, or 3.0), and it also compensates for the different data
rates by removal or insertion of SOSs. This latter function depends on the clocking architecture of the
system, as explained above. It also depends on the direction of data. In a system where the re-timer
shares its RefClk with one of the agents, then traffic directed toward the other agent does not require
data rate compensation, whereas traffic in the opposite direction would invoke SOS manipulation. More
details are given, in the following section, about the additional blocks contained in both Bit and
Protocol-aware re-timers.
52
Programming
and Readout
Microcontroller
&Training FSMs
I2C, or SM
Bus, <JTAG>
Debug
Controls
Reset
(PD & loop)
FlipFlop
R-term
FlipFlop
FlipFlop
PCS
PI Control
PI
Recovered Clock
Serializer
Output
R-term
CDR
Σ
Recovered
Bits
Driver
Gain
CTLE
R-term
R-term
Input
Rx Detect
Staged Data Bits: “11….101”
DFE
TxEQ
Finite State
Machine
Clk Phases
PLL
RefClk
(VCO-based)
Threshold
Detector
RefClk
To Microcontroller
Figure 35 Conceptual micro-architecture of a “Protocol-aware Re-timer”, showing input equalization,
a CDR, and a data path connected to a PCS, Serializer, and output TxEQ driver, with a Micro-controller.
13.2 Additional Re-timer Micro-architectural Blocks
Figure 34 and Figure 35 showed the basic micro-architecture of re-timers, but specific implementations
differ. Re-timers usually come as an identical pair (or pairs) in one package (similar to re-drivers, as
depicted in Figure 4), in order to accommodate both the sending and the receiving directions of one or
more lanes (or links). Though the basic mechanics of re-timer operation and some of their basics
building blocks where described in sections 13.1.1 and 13.1.2, more detailed descriptions of those
blocks, and others, are given next.
Input Terminations
These are similar to the input terminations of a re-driver (section 3.1.1).
CTLE and Gain (or AGC) Stage
These circuits are similar to those in a re-driver (section 3.1.2). The main difference is that in a re-timer,
input equalization and gain are usually adaptive, under the control of the FSM which controls the
algorithm used to guide the EQ and the CDR into the best sampling point (using the most-open eye), and
consequently the lowest BER.
Clock and Data Recovery (CDR) Circuit
A CDR is a circuit which can extract both clock and the data from an incoming (specially-encoded) data
stream [5] [6]. A CDR depends on the requirement that the incoming data stream has a known ratio of
zeros to ones (e.g. 1:1, which also maintains DC level balance), enforced in SerDes specifications via
encoding in the transmitter’s PCS. A CDR employs a Phase Detector (PD) and a decision circuit used to
position the sampling clock, in the best eye location, leading to detecting zeros and ones properly. In
addition, a counter usually accumulates the number of detected zeros and ones, over a moderate period
of time, and produces an error signal which is a function of the imbalance between the detected zeros
and ones. The CDR’s control loop uses the fed-back error to try to improve the exact sampling location.
53
Furthermore, to achieve proper detection of zeros and ones, the CDR requires an eye which has been
opened by receiver’s equalization resources (such as Gain, CTLE, and DFE). In advanced receivers, a
Finite State Machine (FSM) runs a proprietary algorithm which steers the equalization for the best open
eye, to ensure success in CDR locking at the target BER (or better) for that bus.
Reference Clock (RefClk)
RefClk is usually a low-frequency clock (e.g. 100 MHz) which feeds a PLL. The PLL produces a highfrequency internal clock running at the bus’s bitrate. Some receiver CDRs can operate without the need
for a RefClk, but such designs have limitations, such as low frequency jitter accumulation, and potential
false clocking [6]. Most CDRs require an input RefClk, as explained in section 13.1.2. The allowable
frequency delta between independent clock sources is bounded a bus’s specification’s allowable RefClk
deviation from nominal (e.g. +/-300 ppm in PCIe 3.0, and +/-100 pm in 10G-KR).
Data Staging Registers (Flip-flops)
A CDR’s main outputs are the extracted clock and data bits. The CDR’s recovered clock is used to capture
and store the recovered bits into a shift register (consisting of serially-connected flip-flops). The bits are
pure digital data lacking any analog content from the pre-channel. In addition to being sent out to the
PCS layer, the staged bits are also fed, in parallel, to the DFE generator (see section 13.2.6).
Discrete Feedback Equalization (DFE)
DFE [2] is a non-linear equalization used to cancel the residual pulse response tails (and ripples) in the
incoming analog signal at the output of the CTLE. It helps produce a narrower recovered single pulse
response, thus reducing ISI. The idea is to add, or subtract, small amounts of Unit Interval-wide (UIwide) voltage (or current) at progressive bit intervals around the main cursor, in a fashion which
counters the tails and ripples in the channel’s pulse response. The amplitude of each DFE tap is
controlled by the receiver’s FSM, which also controls other resources such as CTLE and AGC. Eventually
this also drives toward a better balance of the received zeros and ones, implied by the CDR’s
convergence loop. In addition to channel pulse response tails, DFE also cancels some of the reflection
artifacts caused by discontinuities in the channel. The number of DFE taps ranges from 1 post-cursor tap
to several, in addition to pre-cursor taps in some designs.
Finite State Machine (FSM) Controller
This digital state machine usually has a proprietary algorithm which manages adjusting the input Gain
(AGC), CTLE, DFE settings, and other parameters, in such a way as to open the eye as widely as possible,
before it enters the phase detector (PD) inside the CDR. The FSM also helps locate the best position to
sample the eye. It is usually programmable through firmware (inaccessible to the user), in order to
optimize the CDR’s performance. The FSM may choose to train equalization upon startup then freeze it,
or continue to train throughout operation, depending on the sophistication of the design.
Physical Coding/Decoding Sublayer (PCS)
This digital block exists in protocol-aware re-timers, and performs decoding and encoding of data into
balanced zero and one streams (e.g. using 128b/130b encoding for PCIe 3.0, or 64b/66b encoding in
10G-KR). The PCS also manages insertion or removal of Skip Ordered Sets (SOSs), as described in section
13.1.2, in order to compensate for differences between the received bit rate and the RefClk-implied bit
54
rate. The parallel data output of the PCS is fed to the Serializer. In the case of a bit-re-timer (a.k.a.
protocol-unaware re-timer), a PCS is not needed usually.
Serializer
A Serializer converts the parallel digital data (at a lower frequency) to a stream of serial bits at the link’s
high bitrate. The serializer pre-drives the output transmitter. The simplest serializer is a synchronous
parallel-load serial shift register.
Transmitter Output Driver
The driver is the final power amplification stage which drives the output load. It is usually merged with
the TxEQ function described next. The Output Differential Voltage swing (VOD) is usually adjustable
from about 0.4V to 1.2V, for PCIe and KR. The driver can usually be turned off by the micro-controller
during low-power modes (e.g. squelch, or no output load).
Output TxEQ
This circuit is usually merged with the output driver, and is used to add de-emphasis and usually preemphasis also to the output. It is a true FIR filter, as described in section 3.1.5. TxEQ is usually under
micro-controller control, since its settings have to be programmable by the re-timer itself, in order to
allow the device to participate in the PCIe 3.0 or KR TxEQ adaptation or training.
Output Terminations
These are similar to the output terminations of a re-driver; please refer to section 3.1.6.
Microcontroller
The controller manages communication with the I2C (or SM) bus, and acts as the intermediary between
those I/O bus protocols, and the internal architecture of a re-timer. It allows receiving any external
presets (or user configurations), in addition to communicating factory settings to the correct registers
inside the re-timer. The controller configures the receiver FSM which manages input EQ adaptation, and
CDR convergence. That functionality is indicated by the dotted arrows in Error! Reference source not
found.. The most important function of the micro-controller in a protocol-aware re-timer, is
communicating with the PCS (which extracts and inserts information from/into the training sequences)
and managing all the phases of TxEQ adaptation from beginning to end. Hence, the controller is
responsible for transitioning the re-timer from Forwarding, to Training (or Execution) mode, and back to
Forwarding. In addition, the controller communicates with the input Signal Level (Threshold) Detector,
and the Rx Output Load Detector, in order to manage the power states transitions of the re-timer. It also
manages any available debugging features, such as eye margining, and status readouts which can be
accessed by the user, through I2C, SM, or even JTAG.
Input Idle Threshold (Squelch) Detector
This function is similar to that in a re-driver, as explained in section 3.1.7.
55
Receiver Detection (Rx Detect)
This function is identical to the one described in section 3.1.9 for re-drivers. The main difference is that
the output of the Rx detector is usually sent to the micro-controller, in order to manage power states,
and input/output termination settings.
I2C, or SM Programming Bus
Re-timers require initial configuration by the user and status readout, both of which are usually achieved
by using an I2C or an SM bus [12] [13]. Data sheets give the user extensive register tables indicating
what parameters can be set or read out.
Debugging Logic
Some re-timers allow the user to margin the post-EQ receiver eye width and height, in both directions,
to assess link health, or also allow detecting error rates (via communicating with a sophisticated PCS).
Since a re-timer is transparent in the link (after training --if applicable), it is difficult to determine if bit
errors originate downstream or upstream of a re-timer – however eye margining tools can quickly locate
the poorly performing channel. While no eye margining block was shown in Figure 34 or Figure 35 (for
simplicity), some re-timers do offer such a capability. Debug features can also allow tracing the
progression of the re-timer’s state machines during link initialization and operation. Debugging can be
achieved through a set of read and write registers connected to the micro-controller, and accessible via
the I2C or SM bus.
14. PCIe 3.0 Re-timer
An Engineering Change Notice (ECN) to the PCI-SIG 3.0 specification defining re-timers [20] has been
approved. That ECN guarantees interoperability with specification-compliant PCI Express 3.0 devices.
The PCI Express re-timer must be protocol aware. The PCI Express re-timer runs the adaptive phases,
(phases 2 and 3, described in section 5) of the link equalization protocol, in each direction. This
effectively splits the link, on each side of the re-timer, into two completely separate and independent
“Link Segments” at the analog level. Therefore, the channel on either side of the re-timer can be up to
the full worst-case channel allowed by the PCI Express specification, while still guaranteeing
interoperability. This also means that the electrical requirements for re-timer receivers and transmitters
are the same as those for standard PCI Express devices, as defined in the PCI Express specifications [10].
A PCI Express re-timer does not need to have a receiver that exceeds the requirements defined for
standard devices in the PCI Express specification, and implementations can be based on standard PCI
Express physical layer IP.
There can be at most 2 re-timers in the link between a PCI-Express upstream port and a downstream
port. The major reasons for the 2 re-timer limitation are timeouts and Reference Clock parts per million
(ppm) tolerances, which will be explained in subsequent sections. Figure 36 shows the possible PCI
Express link scenarios with one or two re-timers, along with the terminology used in the PCI Express
Extension Device ECN [20]. The portion of the link on each side of the re-timer is called a “Link
Segment”. The re-timer’s own upstream and downstream ports are defined as the “Upstream Pseudo
Port” and “Downstream Pseudo Port”, respectively. Unlike traditional PCI Express devices and upstream
and downstream ports – the re-timer is not directly visible to software through the PCI Express
configuration space. The PCI Express re-timer is software-transparent.
56
Upstream Component
(RC or Switch)
Downstream Port
Downstream Port
Retimer Component
Link Segment
Downstream Pseudo Port
Link Segment
link
Downstream Pseudo Port
Upstream Pseudo Port
link
Retimer Component
Upstream Path
Upstream Pseudo Port
Downstream Path
Link Segment
Link Segment
Upstream Component
(RC or Switch)
Upstream Pseudo Port
Upstream Port
Retimer Component
Downstream Component
(Endpoint or switch)
Link Segment
Downstream Pseudo Port
Upstream Port
Downstream Component
(Endpoint or switch)
Figure 36 PCI Express links with one or two re-timers.
14.1 PCI Express Re-timer Link Training and Status State Machine Overview
The PCI Express re-timer does not have the same link training and status state machine as do standard
PCI Express upstream or downstream ports. At a high level - the PCI Express re-timer operates in one of
two functional modes, Forwarding mode and Execution mode. The re-timer starts --and spends most of
its time-- in the Forwarding mode, where incoming traffic is mostly retransmitted through the re-timer
without changes. Technically, it could start in the Forwarding mode, even at 8 GT/s, although the
standard protocol does not allow that, as a matter of course. The re-timer does have to modify some
fields in training ordered sets – but never modifies or interacts with transaction layer and link layer
packets.
For a small portion of the time – the re-timer operates in Execution mode, where it must initiate traffic
directly, and act like a standard PCI Express upstream port on the re-timer Upstream Pseudo Port, and
like a standard PCI Express downstream port on the re-timer Downstream Pseudo Port. The main case,
where the re-timer operates in Execution mode, is that when the adaptive phases (Phases 2 and 3) of
the PCI Express link equalization protocol take place.
57
14.2 Rx Detection, Re-timer Orientation, and Hot Plug
The PCI Express protocol requires that silicon attempt to detect receiver terminations, repetitively.
Once receiver terminations have been detected, silicon must attempt to train, and if there is no
response to training, then silicon must go into a “transmit compliance” mode, where a specified
compliance pattern is transmitted repetitively. This test mode is designed to make analog transmitter
compliance testing possible, by connecting any compliant re-timer directly to a 50-ohm terminated realtime oscilloscope.
The re-timer does not present 50-Ohm terminations, on power up. The reason is that if it does, then
even if nothing is connected to the far re-timer port --or something is connected but not active– then
the device on the other side of the link will end up stuck in transmit compliance mode. A device stuck in
transmit compliance mode consumes unnecessary power, and potentially interferes with hot-plug. Since
re-timers power on without their receiver 50-ohm terminations applied, they perform Rx Detection
repetitively, in the same way as a standard PCI Express device on both Pseudo Ports (section 3.1.9).
Once Rx detection is successful on a Pseudo Port, the re-timer applies its own 50-ohm receiver
terminations on the opposite Pseudo Port. In addition, the re-timer has error detection rules which will
trigger it to re-attempt Rx Detection when a hot-pluggable device is disconnected.
Re-timers also detect which direction is upstream and which is downstream, dynamically. This is
obtained from information in the training sets – thus allowing a standard re-timer to be usable in a
reversible cable (where either end can be plugged into the upstream or downstream component).
14.3 Speed Detection and Speed Support
Re-timers must support all PCI Express speeds up to the maximum one that they are designed for. For
example – a PCI Express re-timer that supports 8.0 GT/s must also support 5 GT/s, and 2.5 GT/s. Retimers modify information in the training sets so that an unsupported speed is never advertised to the
upstream or downstream components. For example – a re-timer that only supports 2.5 GT/s and 5.0
GT/s would modify training set information and remove any advertisement of support for the 8 GT/s link
speed. Re-timers must properly detect the analog signaling rate on the link, whenever a re-timer
receiver detects exit from electrical idle, and begins to receive analog signaling. The re-timer must
ensure that it has the rate detected properly, before it starts forwarding received information to its
other Pseudo Port. This detection must be performed within small time windows defined in the PCI
Express ECN [20], in order to ensure that the delay in forwarding does not impact the PCI Express
training protocol.
14.4 Phase 2 and Phase 3 Adaptive Tx Equalization Training at 8 GT/s
Whenever a PCI Express re-timer detects an exit from electrical idle, properly detects the signaling rate,
and starts forwarding information, then it is in the Forwarding mode by default (Figure 37). When the
protocol reaches the beginning of the adaptive TxEQ phases (Phases 2 and 3), the re-timer switches into
the Execution mode, and starts to execute (actually initiates) the training protocol itself (independently)
on both the upstream and the downstream Pseudo Ports, simultaneously.
To begin with – the re-timer runs Phase 2 on its Upstream Pseudo Port, in exactly the same way a
standard PCI Express Upstream Port would behave (as was described in section 5.3), and runs Phase 2
58
on its Downstream Pseudo Port in exactly the same way a standard PCI Express Downstream Port would
behave (Figure 37).
Once the Upstream Pseudo Port has completed optimizing the TxEQ for its own receiver in Phase 2, it
keeps the link in Phase 2 by not updating the appropriate status fields in the training sets, until the
Downstream Pseudo Port has also completed optimizing the TxEQ for its receiver.
Once the downstream Pseudo Port has completed optimizing the TxEQ for its receiver in Phase 3
(section 5.4), then the upstream Pseudo port indicates the completion of Phase 2, and the link segment
moves to Phase 3, where it behaves as a standard PCI Express upstream port.
Though the Downstream port has finished Phase 3 (Figure 38), it stays in Phase 3 by continuing to
request, repetitively, the final TxEQ which was selected for its receiver, while the upstream Pseudo Port
completes Phase 3. Once the Upstream Pseudo Port completes Phase 3, then both Pseudo Ports return
to the forwarding mode, at the same time.
The re-timer keeps states synchronized, in order to prevent introducing new error conditions due to its
presence. This helps prevent such cases where, for instance, one Link Segment completes Phase 2 and
Phase 3, and moves on to the next phase of the protocol, while the other Link Segment encounters
errors and exits TxEQ negotiation, before successfully completing Phase 3.
The re-timer must be able to complete Phase 2 and Phase 3 equalization for its own receivers in 2.5 ms,
so that the standard Upstream and Downstream Port equalization timeouts are never violated.
59
Upstream Component
Upstream Component
Upstream Component
Upstream Pseudo Port
Forwarding
Upstream Pseudo Port
Acting as Standard
Upstream Port
Upstream Pseudo Port
Stalling in Phase 2 By
Repeasting Last TX EQ Request
Retimer
Phase 2
Retimer
Phase 2 Completes on
Upst-ream Pseudo Port
Downstream Pseudo
Port acting as Standard
Downstream Port
Downstream Pseudo
Port Forwarding
Downstream Component
Retimer
Downstream Pseudo Port
acting as Standard Downstream Port Phase 2 or Phase 3
Downstream Component
Downstream Component
Figure 37 Phase 2 and Phase 3 Sequence with a re-timer --Part 1.
Upstream Component
Upstream Component
Upstream Pseudo Port
Stalling in Phase 2 By
Repeating Last TX EQ Request
Retimer
Downstream Pseudo Port
acting as standard Downstream Port Phase 2 or Phase 3
Downstream Component
Upstream Component
Upstream Pseudo Port
Acting as Standard
Upstream Port – Phase 3
Phase 3 Completes on
Downstream Pseudo
Port
Retimer
Downstream Pseudo Port
Stalling in Phase 3 By Not
Advancing
Downstream Component
Upstream Pseudo Port
Forwarding
Phase 3 Completes on
Upstream Pseudo Port
Retimer
Downstream Pseudo
Port Forwarding
Downstream Component
Figure 38 Phase 2 and Phase 3 sequence with a re-timer --Part 2.
60
14.5 Clocking Architecture Support
The PCI Express specification [10] supports several different clocking architectures. The architectures
supported in the latest revision of the specification are a common clock architecture (where a common
reference clock is distributed to both the upstream and downstream components), and an independent
reference clock architecture, where the upstream and downstream components have independent
reference clocks. Both clocking architectures allow the use of “Spread Spectrum Clocking” (SSC), which
is defined by PCI Express as 30-33 KHz clock modulation of up to 0.5% below the nominal reference clock
frequency of 100 MHz. The PCI Express ECN covering re-timers [20] allows support of either clocking
mode. A PCI Express re-timer available on the market may or may not support both architectures. Retimer implementations must have significantly larger elasticity buffers to support the independent
reference clock mode and have increased jitter tolerance at low frequency.
The PCI Express protocol requires a transmitter to send SKIP Ordered Sets (SOSs) with a minimum and
maximum average rate that varies with the clocking architecture. The average rate of SOS transmission
is about 10 times higher for the independent reference clock architecture with SSC (SRIS) than with the
common clock architecture. The SOSs are designed to be a set that a receiver can either expand, or
reduce, in order to keep internal buffers from underflowing or overflowing due to ppm variations
between the transmit and receive clocks. A re-timer is required to modify the SOSs to keep its receiver
data buffers from ever underflowing or overflowing when forwarding data from one re-timer Pseudo
Port to the opposite Pseudo Port. Furthermore, the PCI Express protocol requires that changes to the
SOSs are done consistently on all lanes at the same time, and failure to do so may break existing PCI
Express receivers. The PCI Express transmitter requirements provide enough SOS content, such that a
maximum of two re-timers may modify SOSs between the Upstream and Downstream Component. If
additional re-timers are present, then there is no longer a guarantee that the final receiver will have
enough remaining SOS content to function – this is true for both clocking modes.
14.6 Debugging and Electrical Testing
All PCI Express transmitters are required to support a Compliance mode where they automatically start
transmitting a compliance pattern [10] if 50-ohm terminations are detected, but the other side of the
link does not attempt to train. This allows transmitter testing to occur by connecting devices directly to
a 50-ohm-terminated oscilloscope. Re-timers are also required to support this mechanism such that
standard electrical testing can still occur for a re-timer by itself, or for a system which includes a retimer. In addition, re-timers are required to interoperate with the standard loopback modes defined by
the PCI Express specification such that loopback would still work between the Upstream and
Downstream components when one or two re-timers are in the path. There is also an optional “Local
loopback” definition in the re-timer ECN which allows loopback to be run directly to and from a re-timer
which supports this optional feature. The optional “Local loopback” helps to isolate the location of
errors that occur using the standard loopback modes between the Upstream and Downstream
component in a system.
61
15. 10G-KR Re-timer
There is a variety of ways that a 10GBASE-KR re-timer may be implemented. One fundamental rule
applies regardless of the link up flow methodology: The re-timer must be protocol-aware and capable of
performing KR Training, as defined by IEEE 802.3 Clause 72 [18]. By performing this training, the re-timer
is effectively breaking the channel in two from an electrical stand-point, and therefore creating two
separate KR Channels. Each side of the channel must conform to the IEEE 802.3 Annex 69B channel
parameter requirements. For additional details on this training process, please refer to the specification
and section 8.1 of this document. In addition to performing training with each end point, the re-timer
must either perform, or at least not interfere with, the other phases of bringing up a 10G-KR link,
specifically “Auto Negotiation” and “PCS Block Lock”.
15.1 Auto-Negotiation
The IEEE 802.3 standard’s Clause 73 “Auto Negotiation” (AN) is the first step required during the process
of establishing a 10G-KR link. At the beginning of the link up process, the two Ethernet ports attempt to
establish a connection using AN “Base Pages” (lower speed signaling) to communicate between each
other, thus allowing the devices to determine the highest common link speed. They can also configure
Forward Error Correction and perform additional functions through “Next Pages” such as enabling
Energy Efficient Ethernet (EEE) and other proprietary link modes. AN is used for link speeds from 1G
through 100G, and will continue to be used for future IEEE backplane protocols. If the two devices are
both capable of 10G-KR, and are not both capable of a higher speed, then AN will resolve to establish a
10G-KR link. When link is resolved to 10G-KR, the two devices have 500 ms to complete the AN process,
which includes KR Training and PCS alignment. Once the AN Base Page communication has been
completed, the devices transition to the KR Training phase of the link up process. The re-timer’s role in
this process is either to complete AN with both devices independently, or to pass the signal through
itself, and allow AN to complete directly between the end points. See section 15.4 for more info on the
full link up flow.
15.2 KR Training
IEEE 802.3 Clause 72 defines the PMD control for 10GBASE-KR, which includes KR Training. This is when
the equalization for the 10G signaling is adapted to the present channel (see section 8.1.2). In a system
using a re-timer, the channel is divided into two completely separate electrical domains, where side of
the link must conform to the Annex 69B channel specification, and training must occur on each side of
the re-timer. As stated above, a KR re-timer must be capable of performing this training. Once the KR
Training phase of the link up process is entered, the re-timer should close off the path between the two
devices and train each side of the link individually. Link training gives each receiver time to converge on
the incoming signal, which includes adjusting the TxEQ settings of the connected transmitter, so at the
end of training, all four transmitters and receivers are adapted to the connected channels.
15.3 PCS Alignment
The final phase of establishing a KR link is transitioning to data mode, and aligning the PCS layer of the
PHY to the 64B/66B encoded data (IEEE clause 49, [21]). In the case of a re-timer with a PCS, a
62
connection to each end point can be established, independently, because the re-timer can supply the
encoded data needed to maintain link. The re-timer would go through AN and KR Training with the
device, then move into data mode with the re-timer supplying the 10G data until all channels are
trained. While link is only up on one side, the re-timer should send the end point “Remote Fault Ordered
Sets” (IEEE 46.3.4, [21]) to indicate that the far end device is not ready. When the re-timer has finished
training both sides of the link, then it can begin passing through the encoded data, and a full link can be
established through the re-timer. If the re-timer does not have a PCS, then the PCS alignment will occur
between the two end points with the re-timer simply passing the data along. In this case, the main
concern is enabling a clean and seamless transition into data mode (no loss of signal), as well as making
sure that all data paths are running on a congruent clock.
15.4 Link Up Flow
In the case of using a re-timer to extend a 10G-KR channel, there are two basic options for bringing link
up, depending on whether or not the re-timer implements a 64b/66b PCS. If the re-timer has a PCS, then
it can establish an independent link with each end point, and enable an extended channel once both
sides have been trained. The other option is only to implement the KR Training phase of the link up flow
within the re-timer, and allow the rest of the AN process to occur between the two end points, passing
through the re-timer. In this case, the re-timer must be able to recognize when the devices transition
from AN “Base Page” exchanges to KR “Training Frames” (section 8.1). At that time, the re-timer should
close off the channel, then train both channels independently. Once both sides have been trained, then
the re-timer can indicate to the end points that it is ready to transition to data mode. At that time, the
re-timer would go back into a pass-thru mode, exchanging information between the end points using
the equalization that was converged on during training. This transition into data mode should be
seamless, so that the end points do not lose CDR lock, and can lock to the incoming PCS signal as quickly
as possible.
There are advantages and disadvantages to both link up methods. A PCS-enabled re-timer would be
more complex and expensive, since it would need an AN layer as well as a PCS layer. Whereas a re-timer
which passes-through the AN signaling between end points, and does not have a PCS, may be more
prone to timing and interoperability issues, but is a simpler device. The most important consideration
with a “Training-only” re-timer is developing a robust link up flow. The recommended method is to
make sure that the re-timer is not indicating (to either end point) that it is ready to move to data mode,
until all receivers are ready to move to data mode. The way to accomplish that is by controlling the
“Receiver Ready” indication on all of the channels. Receiver Ready is a bit within the KR “Control
Channel” (which is a portion of the low-speed KR Training Frame) used to indicate to the link partner
that the local receiver is trained, and is ready to move to data mode. A Training-only re-timer should
only propagate the Receiver Ready indication when both of its ports are trained and are ready to move
to data mode. In other words, a re-timer should indicate “Receiver Ready” to an end point, only when
both re-timer ports, are trained AND when “Receiver Ready” is seen from the other end point. An
example of this type of link up flow is shown in Table 3 and Figure 40 with its associated legend in Figure
39.
63
Table 3 KR re-timer “Link Up Flow” Description.
Link Up Stage
Description
Note
1. Ports enabled and
initialized
Re-timer passes through End Point's AN signal. End Point's complete AN process
with each other
2. Independent link training
KR Training occurs independently on either side of the link
3. End Point #1 achieves
Receiver Ready
End Point #1 local receiver is trained (RX2 and TX2), RX Ready indicated to retimer. RX Ready not propagated through re-timer because re-timer ports are not
converged
4. Either port of the re-timer
achieves Receiver Ready
Does not matter which re-timer port converges first (4.1 or 4.2), RX Ready not
indicated/propagated to End Point #2 until both re-timer ports are converged
5. Both Re-timer ports
achieve Receiver Ready
Begin propagating RX Ready indication from End Point #1 to End Point #2 when
BOTH re-timer ports are trained
6. All ports achieve Receiver
Ready
All ports are trained and ready to move to data mode. Re-timer propagates RX
Ready indication from both End Points
7. Transition to Data Mode
Re-timer propagates received data using equalization determined during training.
Transition into data pass through mode (SEND_DATA) must be seamless
Figure 39 Legend for the KR re-timer “Link Up Flow” Block Diagram in Figure 40, shown below.
64
Figure 40 KR re-timer “Link Up Flow” Block Diagram.
65
15.5 Other Considerations For 10G-KR Re-timers
Energy Efficient Ethernet (EEE)
IEEE Clause 78, which addresses EEE [21], can cause issues with a protocol-aware re-timer if the retimer device is unaware that EEE is enabled. EEE enables the devices to turn off their transmitters
intermittently --without losing the clock recovery or channel equalization-- which could be seen by a retimer as link going down. EEE could only be enabled with a re-timer if the re-timer is specifically EEEcapable, and completes AN with the end points, which is when the EEE configuration is enabled. In the
case that the re-timer is only doing training and not AN, EEE should be disabled.
Cascaded Configuration
In the event that a KR channel is longer than what can be supported by a single re-timer, it may be
desired to use multiple KR re-timers in a cascaded configuration. This can potentially create additional
link establishment concerns, but if the re-timer implements either one of the two “Link up Flows”
recommended in section 15.4, then the re-timer should be capable of establishing link up in such a
cascaded configuration, without introducing new issues.
PCS- and AN-capable devices should be able to support cascading easily, since they can establish links
independently with each end point, and/or a connected re-timer. As the other PHYs are enabled, and
once link has been adapted and brought up on each channel, the full data path will be enabled and
ready to pass traffic.
If the re-timer is only capable of supporting KR Training, then it is critical that the re-timer implement
the method of only propagating “Receiver Ready” after both of its ports have been trained, and it is
receiving “Receiver Ready” from the far-end link partner. The following example (Figure 41) of a KR
Channel, with multiple cascaded re-timers in the path, highlights the expected behavior of a “Trainingonly” re-timer. There are four different channels to be trained, labeled #1 through #4, and each re-timer
is in a different phase of the link up process.
#1
TX1
Ready
#2
RX1 RX Ready TX3
KR #1
RX2
Ready
#3
RX1
Retimer
Train
TX2
TX3
Train
#4
RX1
Retimer
RX4
Train
TX2
TX3
Train
Retimer
RX4
Train
TX2
RX3
KR #2
RX4
Train
TX4
Figure 41 KR re-timer Cascaded “Link Up Flow” Example.
In this example, channel #1 is fully trained by both KR End Point #1 as well as the re-timer. Since the End
Point is trained, it is signaling “Receiver Ready” to the re-timer, and because both re-timer ports are
trained, the re-timer is propagating that “Receiver Ready” indication on channel #2. Since the re-timer is
not receiving “Receiver Ready” on channel #2, it does not indicate “Receiver Ready” back to End Point
#1. Also notice that even though channel #2 is fully trained, and the middle re-timer is receiving
66
“Receiver Ready”, it does not propagate it onto channel #3 because it does not yet have both of its local
receivers trained yet. All of this shows that even though half of the channels are fully trained, the link
will only be brought up when all channels are fully trained, and the “Receiver Ready” indication is
generated by both End Points and propagated through each re-timer.
KR Re-timer Link Debugging
KR Interoperability can be problematic between two PHYs from different industry vendors, and adding
an active device to the path would only make it more complicated. It is important for a KR PHY to have
certain debug features in order to allow investigating interoperability issues, and a KR Re-timer should
have similar features. If link is not getting established, then it will take a strong knowledge of both the
IEEE KR specification [21], as well as the end point and re-timer configuration to determine the cause of
the problem. In order to enable this, at a minimum the re-timer device should be able to indicate the
following to the user:-
Link status of all ports
KR Training State of all ports (IEEE Figure 72-5, [21])
Frame Lock status of all ports (IEEE Figure 72-4, [21])
Receiver convergence status of all ports
Initial and Final Transmitter equalization coefficients
Link partner “Receiver Ready” status
Received eye health
The following features are desired, and may help facilitate debug, but they are not required:-
-
Loopback modes at various levels (or layers) of the device
Rx/Tx Parts Per Million (PPM) clock differences. Knowing the various clock rates --or at least the
differences between them-- can help debug a variety of receiver issues, such as elastic buffer
overflow problems
Time-to-Link measurement
Tx Feed-forward Equalization (Tx FFE, or TxEQ) adaptation requests made by re-timer to a link
partner
Applicable re-timer receiver parameters (VGA, DFE, etc.)
Following are additional requirements if the re-timer contains a PCS and an AN layer:-
Auto Negotiation State of all ports (IEEE Figure 73-11, [21])
PCS Block alignment status of all ports
Remote and Local fault status of all ports
15.5.3.1 Interoperability Testing
Assuming that all of the end points and re-timers in a link are compliant with the Tx and Rx electrical
requirements of the IEEE specification [21], then the most critical part of ensuring that a system which
utilizes a KR re-timer will behave as expected, is testing the functional interoperability of the system.
This interop testing should happen as early on in the design process as possible, so that any issues that
67
are discovered can be addressed prior to building systems. Testing should, at a minimum, include
analyzing the full re-timer and End Point configuration for the following:
-
Analyzing “Time to Link”
Testing link up reliability when resetting any device
Passing traffic reliably across comparable channels
Testing interoperability between the End Points, without a re-timer in the path, as well as testing a
single End Point with the re-timer in a loopback mode, can help facilitate this testing, as well as help
isolate any issues that may exist.
15.5.3.2 Transition Timing issues
One very common source of issues in KR Interoperability, that is made more complicated by the
presence of a re-timer, is transition timing. This is the amount of time that it takes the devices to
transition from AN signaling to KR “Training Frames”. Every device behaves differently, and devices can
be sensitive to long timeouts, or to the presence of unexpected data during this transition. The re-timer
should be capable of tolerating long transition times from end points, but should minimize the amount
of time that it adds to this transition, while it is closing off the channel to train each side. As mentioned
above, the transition from KR Training Frames to the passing through of KR data, should be seamless.
15.5.3.3 CDR Time domain issues
An additional consideration, which must be taken into account with re-timers, is knowing which clocking
(data rate) domain drives each independent channel section. The CDR of a re-timer’s receiving port
should use the recovered clock of the first (pre-re-timer) section of the full link to drive the re-timer’s
output transmission path to the other end point (the post-re-timer link). This ensures that there are no
buffer overflow or data rate mismatches. This can have an impact on the link up flow of the device, and
is up to the re-timer designers to verify that systems, in real world applications, will not have issues
regarding the various clock domains. Again, interoperability testing is critical to uncover potential issues
such as this.
68
16. About the Authors
Samie Samaan is a principal engineer at Intel Corporation, where he joined in ‘94. He is currently in the
CPU design team in Oregon, working on the power modeling of a next generation of CPUs, including
power-performance optimization to manage process variations. Recently, he spent 4 years at the Data
Center Group (DCG), leading all platform-level modeling of I/O channels (PCB, connectors, sockets, etc.)
employing behavioral models of PHYs, for various high-speed SerDes and single-ended busses. He also
oversaw the creation of customer design guides addressing all I/Os of that upcoming platform. He
pioneered and led investigations of analog SerDes Re-driver characteristics, and their effects on highspeed SerDes bus behavior and specifications. In that role, he created several reports, test methods, and
helped set internal guidelines. Prior to DCG, he spent 15 years in CPU design, working on highperformance custom VLSI circuit and logic design. He documented Intel’s GTL+ bus electrical Specs, then
designed multi-GHz high-performance self-timed dynamic logic adder circuits for CPUs. He developed
expertise in process variation statistics, and subsequently invented a ubiquitous intra-die process
variations monitor, which became a staple of process monitoring. He co-designed a Soft Error Rate (SER)
monitor IC, and performed numerous process variation analyses on CPUs. Later, he designed several
Analog blocks in an industry-first integrated voltage regulator for high-performance CPUs, including a
control loop compensator, a DAC, and a real-time very high-speed intra-die voltage monitor.
Prior to Intel, Samie was VP of Engineering at Zeelan Technology, which specialized in Behavioral I/O
buffer simulation models for high-speed board design. He was briefly at Epson working on a Pentium
chipset project. He has also worked at the Grass Valley Group, in CA, on an ASIC design for Video
Compression.
Samie spent 8 years at Tektronix, working on multi-GHz Si & GaAs large-signal linear ICs. At Tektronix, he
also developed a numerical code to compute electron deflection in electrodynamic fields, used in GHzCRT studies.
Samie obtained his Master’s in Electromagnetics & Microwaves from the University of Mississippi. He
had also obtained a post-graduate “Advanced Studies Diploma” in Medical Electronics from Nancy-1
University in France. His later education includes several graduate-level courses in Semiconductor
Physics.
Dan Froelich is a principal engineer in the I/O Technology and Standards group. His expertise covers a
broad span in I/O including electrical, protocols, software, form-factor, and compliance. Dan chairs three
WGs (Electrical (acting), CEM, and compliance) in PCI-SIG, and is among the few people responsible for
the broad adoption of PCI Express in the industry. Dan works closely with the internal teams, as well as
the ecosystem, to ensure that PCI Express continues to work first and best with IA through its evolution.
Dan has defined and evolved PIPE, the core PHY IP SoC interface for PCI Express, USB, and SATA, which
has been universally adopted inside Intel as well as across the industry at large.
Sam Johnson is a Network Hardware Engineer in Intel’s Networking Division, supporting server-class
Ethernet controllers and high-speed communication technologies. Sam started at Intel in 2010, and
quickly became the primary debugger of high-speed serial Ethernet protocols for Intel’s Networking
69
Division. He is an expert in IEEE and industry standards for backplane and cabled Ethernet technologies,
including involvement with new and future protocols. He has also worked extensively with a wide
variety of external PHY devices; including enabling development of the first KR to KR re-timers and
conducting industry-leading cross vendor evaluations of re-drivers for 10G-KR applications. Sam works
closely with both customers and design teams to connect future device capabilities to customer needs,
and ensure that new devices will support the latest debug tools. Sam currently conducts regular debug
training classes, educating and mentoring engineers, from around the world, in Ethernet technologies
and debug methodologies.
70
References
[1] W. C. Johnson, "Transmission lines and networks," McGraw-Hill, Technology & Engineering, Jan 1,
1950.
[2] C. Belfiore and J. J. Park, "Decision feedback equalization," Proceedings of the IEEE, vol. 67, no. 8,
pp. 1143 - 1156, Aug 1979.
[3] I. Gerst and J. Diamond, "The Elimination of Intersymbol Interference by Input Signal Shaping,"
Proceedings of the IRE, pp. 49 , Issue: 7, 1961.
[4] W. Dally and J. Poulton, "Transmitter equalization for 4-Gbps signaling," Micro, IEEE, p. 48 – 56,
Volume: 49 , Issue: 7 1997.
[5] B. R. Saltzberg, "Timing recovery for synchronous binary data transmission," The Bell System
Technical Journal, vol. 46, no. 3, pp. Page(s): 593 - 622, 1967.
[6] M.-t. Hsieh and G. Sobelman, "Architectures for multi-gigabit wire-linked clock and data recovery,"
Circuits and Systems Magazine, IEEE, vol. 8, no. 4, pp. 45 - 57, 2008.
[7] R. E. Best, Phase-Locked Loops, McGraw Hill Professional, Jun 20, 2003.
[8] C. Lee, M. Mustaffa and K. Chan, "Comparison of receiver equalization using first-order and secondorder Continuous-Time Linear Equalizer in 45 nm process technology," in 4th International
Conference on Intelligent and Advanced Systems (ICIAS), Volume: 2, Page(s): 795 – 800., 2012.
[9] "Universal Serial Bus 3.0 Specification," June 6, 2011.
[10] "PCI Express® Base Specification, Revision 3.0," PCI-SIG, November 10, 2010.
[11] Universal Serial Bus 3.0 Specification Revision 1.0, USB-IF, June 6, 2011.
[12] "I2C-bus Specification and User Manual, Rev. 6 — 4," NXP, April 2014.
[13] "System Management Bus (SMBus) Specification, Version 2," SBS Implementers Forum, Aug. 3,
2000.
[14] H. L. H. Stephen H. Hall, Advanced Signal Integrity for High-Speed Digital Designs, Wiley, 2009.
[15] ASHRAE Technical Committee, "2011 Thermal Guidelines for Data Processing Environments –
Expanded
Data
Center
Classes
and
Usage
Guidance,"
http://ecoinfo.cnrs.fr/IMG/pdf/ashrae_2011_thermal_guidelines_data_center.pdf, 2011.
[16] J. K. R. a. B. G. Loyer, "Humidity and Temperature Effects on PCB Insertion Loss," in DesignCon,
71
2013.
[17] P. B. G. Hamilton and G. a. S. J. Barnes Jr., "Humidity-Dependent Loss IN PCB Substrates," Printed
Circuit Design & Manufacture, vol. 24, no. 6, p. 30, 2007.
[18] "Serial ATA Revision 3.2 (Gold Revision)," Serial ATA International Organization, 7 August 2013.
[19] "PCI Express® Card Electromechanical Specification, Revision 3.0, ver. 1.0," PCI SIG, March 6, 201.
[20] PCI-SIG, Extension Devices ECN, Oct. 6, 2014.
[21] I. S. 802.3, "Standard for Ethernet," IEEE, 2012.
[22] IBIS Open Forum, http://www.vhdl.org/ibis/.
[23] "Standard 802.3, clause 72, “Physical Medium Dependent Sublayer and Baseband Medium, Type
10GBASE-KR," IEEE, 2008.
[24] "Specification for SFP+," SFF Committee , Rev 4.1 July 6, 2009, Rev 4.1 Addendum September 15,
2013.
White Paper – November 2015
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at
intel.com, or from the OEM or retailer.
No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.
You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You
agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current
characterized errata are available on request.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice.
Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and noninfringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting
www.intel.com/design/literature.htm.
Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2015, Intel Corporation. All Rights Reserved.
72
Download