JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.3, JUNE, 2015
http://dx.doi.org/10.5573/JSTS.2015.15.3.404
ISSN(Print) 1598-1657
ISSN(Online) 2233-4866
An Adaptive-Bandwidth Referenceless CDR with
Small-area Coarse and Fine Frequency Detectors
Hye-Jung Kwon, Ji-Hoon Lim, Byungsub Kim, Jae-Yoon Sim, and Hong-June Park
Abstract—Small-area, low-power coarse and fine
frequency detectors (FDs) are proposed for an
adaptive bandwidth referenceless CDR with a wide
range of input data rate. The coarse FD implemented
with two flip-flops eliminates harmonic locking as
long as the initial frequency of the CDR is lower than
the target frequency. The fine FD samples the
incoming input data by using half-rate four phase
clocks, while the conventional rotational FD samples
the full-rate clock signal by the incoming input data.
The fine FD uses only half the number of flip-flops
of the rotational FD by sharing the
sampling and retiming circuitry with the PLL. The
proposed CDR chip, fabricated in a 65-nm CMOS process,
satisfies the jitter tolerance specifications of both USB
3.0 and USB 3.1. The proposed CDR works over input
data rates of 2 Gb/s to 8 Gb/s at 1.2 V and
4 Gb/s to 11 Gb/s at 1.5 V. It consumes 26 mW at
5 Gb/s and 1.2 V, and 41 mW at 10 Gb/s and 1.5 V.
The measured phase noise was -97.76 dBc/Hz at the
1 MHz frequency offset from the center frequency of
2.5 GHz. The measured rms jitter was 5.0 ps at 5 Gb/s
and 4.5 ps at 10 Gb/s.
I. INTRODUCTION
The clock-data recovery (CDR) circuit is widely used
at the receiver of high-speed serial link interfaces such as
USB, PCIe, SATA, and DisplayPort. The CDR circuit
extracts the data and clock signals from the received
signal. There are two kinds of CDRs: one is a reference-based CDR (Fig. 1(a)) and the other is a referenceless
CDR (Fig. 1(b)). The reference-based CDR [1, 2]
generates a clock signal from the reference clock source
of the receiver (CKREF2 of Fig. 1(a)) and adjusts the clock
phase to the center of the received data eye. The
CDR circuit is implemented by using a dual-loop
architecture, which consists of a frequency-locked loop
(FLL) and a phase-locked loop (PLL). Initially only the
FLL is enabled. It adjusts the frequency of the extracted
clock to within the PLL pull-in range around the target
frequency. After the FLL is locked, the PLL is enabled.
Fig. 1. Serial link transceiver (a) with reference-based CDR, (b) with referenceless CDR.
Index Terms: Clock and data recovery circuit, fine frequency detection, jitter tolerance, referenceless, adaptive bandwidth
Manuscript received Mar. 5, 2015; accepted May 19, 2015
Pohang University of Science and Technology, Dept. of Electrical Engineering
E-mail: hjpark@postech.ac.kr
With the reference-based CDR circuit, the usable data
rate is limited to one or a few discrete values. If the
usable data rate of CDR can change over a continuous
range, a single CDR can be used for different
applications. It will reduce the design cost. The
referenceless CDR satisfies this requirement. The
referenceless CDR [3-16] extracts the clock signal from
the received data signal alone without using any
reference clock sources (Fig. 1(b)). It can be used for
many different applications with a wide range of input
data rate.
The most significant challenge of the referenceless
CDR is the harmonic locking problem, in which the
frequency of the clock signal extracted by the CDR is a
sub-harmonic value of the target frequency. One solution
to this problem is to limit the output frequency of the
voltage-controlled oscillator (VCO) in the referenceless
CDR to within ± 50 % of the target frequency [3-7].
However, this restriction limits the usable range of input
data rate to a narrow range. The second solution is to
detect harmonic locking by checking the maximum run-length of the CDR output data for the case in which the
maximum run-length of the CDR input data is fixed to a
constant value [8, 9]. This method is limited to a specific
encoding scheme, such as the 2^7-1 PRBS data for [8] and
the 8B10B-encoded data for [9]. The third solution
recovers the clock signal by using the randomness of
input data [10-13]. The input data stream is divided by
more than 1000 and the resultant output is applied to a
frequency multiplier to recover the clock signal. This
method works only for random data that have a transition
density close to 0.5 to get a small frequency offset of
FLL, such that the output frequency of the locked FLL is
located within the PLL pull-in range (0.2 % of target
frequency). To achieve a recovered clock with a
reasonable jitter, this solution requires an excessively
narrow bandwidth for the FLL used for the multiplication.
The fourth solution [14] uses an extra delay-locked loop
(DLL) for a wide-range referenceless CDR.
In this paper, a small-area FLL for the referenceless
CDR is proposed to achieve a wide usable range of input
data rate and a small frequency offset of the locked FLL
(<0.2 % of target frequency) without using an extra DLL
and without imposing limits on the maximum run-length or transition
density of input data.
In a CDR circuit, the PLL bandwidth must be reduced
as much as possible to minimize the jitter of the
recovered clock and data, because the input data of the
CDR usually has a large jitter. However, the jitter
tolerance is reduced as the PLL bandwidth is reduced. In
the referenceless CDR with a wide range of input data
rate, the PLL bandwidth is usually fixed to ~1/1000 of
the minimum input data rate to minimize the jitter of the
recovered data and clock for the entire range of input
data rate. However, this method degrades the jitter
tolerance significantly at the maximum input data rate
[15, 16]. In this work, an adaptive-bandwidth tracking
scheme is used such that the PLL bandwidth is
proportional to the input data rate. Also, a digital loop
filter is used to compensate for PVT variations.
The proportionality constant of the PLL bandwidth to
the input data rate is fixed to ~1/1000 over the entire range
of input data rate. The proposed adaptive-bandwidth
scheme provides both a large jitter tolerance and a small
jitter of the recovered clock and data for the entire range
of input data rate.
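To make the trade-off concrete, the following sketch compares a PLL bandwidth fixed to 1/1000 of the minimum data rate with one that tracks the data rate. The 2 Gb/s to 11 Gb/s range is taken from the abstract; the ratio is treated as exactly 1/1000 (the text gives ~1/1000), and the function names are illustrative only.

```python
# Illustrative only: compares a fixed PLL bandwidth with an adaptive one.
# Assumption: the bandwidth-to-data-rate ratio is exactly 1/1000 (the text says ~1/1000).
RATIO_DENOMINATOR = 1000.0


def fixed_bandwidth_hz(min_rate_bps):
    """PLL bandwidth pinned to 1/1000 of the minimum supported data rate."""
    return min_rate_bps / RATIO_DENOMINATOR


def adaptive_bandwidth_hz(rate_bps):
    """PLL bandwidth that scales with the actual input data rate."""
    return rate_bps / RATIO_DENOMINATOR


if __name__ == "__main__":
    min_rate = 2e9  # 2 Gb/s, lower end of the range quoted in the abstract
    for rate in (2e9, 5e9, 8e9, 11e9):
        fixed = fixed_bandwidth_hz(min_rate)
        adaptive = adaptive_bandwidth_hz(rate)
        print(f"{rate / 1e9:4.1f} Gb/s: fixed = rate/{rate / fixed:.0f}, "
              f"adaptive = rate/{rate / adaptive:.0f}")
```

With the fixed setting, the bandwidth at 11 Gb/s is only 1/5500 of the data rate, which illustrates the loss of jitter tolerance at high data rates that the adaptive scheme avoids.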
Section II presents the architecture of the proposed
referenceless CDR, the FLL and the adaptive-bandwidth
tracking scheme of PLL. Section III explains the circuit
implementations. Section IV shows the measurement
results. Section V concludes this work.
II. ARCHITECTURE
The proposed CDR (Fig. 2) consists of a dual-loop
architecture of a PLL and an FLL. It accepts an input
data DIN and generates a four-phase recovered clock
signal CLKOUT[0:3] and a recovered data DOUT[0:1]. The
PLL is implemented by using digital circuits to maintain
an accurate ratio between the PLL bandwidth and the
Fig. 2. Proposed CDR.
input data rate and also to eliminate a large capacitor
from the loop filter. The FLL is also implemented by
using digital circuits to eliminate a huge capacitor from
the loop filter, which is required to get an extremely low
FLL bandwidth (a few tens of kilohertz). Initially after
the power-on reset, the FLL is enabled and adjusts the
frequency of CLKOUT[0:3] within the range of ±0.2 % of
the target frequency, which is one half of the input data
rate. A 2x oversampling is used to extract the recovered
data at the Alexander PD of the PLL. After the FLL is
locked, both the PLL and the FLL are enabled. To avoid
the interaction between the two loops, the FLL
bandwidth is set to a constant value, which is < 0.01
times the PLL bandwidth.
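The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes half-rate clocking as described (target frequency equal to one half of the data rate), a ±0.2 % FLL window, and an FLL bandwidth bound of 0.01 times the PLL bandwidth; the function names are hypothetical.

```python
def target_clock_hz(data_rate_bps):
    """Half-rate clocking: the target ICO frequency is one half of the data rate."""
    return data_rate_bps / 2.0


def fll_lock_window_hz(data_rate_bps, tolerance=0.002):
    """Return the (low, high) bounds of the +/-0.2 % window around the target."""
    f_target = target_clock_hz(data_rate_bps)
    return f_target * (1.0 - tolerance), f_target * (1.0 + tolerance)


def max_fll_bandwidth_hz(pll_bandwidth_hz, ratio=0.01):
    """Upper bound on the FLL bandwidth (it must stay below 0.01 x PLL bandwidth)."""
    return pll_bandwidth_hz * ratio


if __name__ == "__main__":
    low, high = fll_lock_window_hz(5e9)
    print(f"5 Gb/s: target {target_clock_hz(5e9) / 1e9:.3f} GHz, "
          f"window {low / 1e9:.3f} to {high / 1e9:.3f} GHz")
    # Assumed example: a 2 MHz PLL bandwidth (1/1000 of 2 Gb/s).
    print(f"FLL bandwidth bound: {max_fll_bandwidth_hz(2e6) / 1e3:.0f} kHz")
```

For a 2 MHz PLL bandwidth the bound is 20 kHz, which agrees with the "few tens of kilohertz" FLL bandwidth mentioned above.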
The proposed FLL has a wide range of frequency
acquisition. This helps to maximize the usable range of
input data rate of the CDR. The frequency acquisition
range refers to the range of initial ICO (current controlled
oscillator) frequency over which frequency locking can
be achieved. The PLL bandwidth is adjusted
proportionately to the input data rate such that the ratio
of the PLL bandwidth to the input data rate is maintained
constant at ~ 1/1000 for the entire range of usable input
data rate. This scheme improves the jitter tolerance at
frequencies near the PLL bandwidth.
1. Frequency-locked Loop (FLL)
The FLL sets the ICO frequency within the range of
±0.2% from the target frequency. Rotational frequency
detectors (RFDs) [17, 18] are widely used in
conventional FLL circuits, because RFDs lock for any
input data patterns that include ‘010’ or ‘101’. The input
data for an RFD are not restricted in the maximum run-length or the data transition density. An RFD is usually used
in a single-loop FLL. However, in the CDR with a
single-loop FLL using RFD, the frequency acquisition
range is limited to ±50 % of the target frequency.
A dual-loop FLL that consists of a coarse and a fine
frequency loop was proposed in [14]. The coarse
frequency loop includes an additional DLL to increase
the frequency acquisition range of CDR without
imposing any limits on the maximum run-length or the
data transition density of input data. The DLL eliminates
the upper limit of the CDR frequency acquisition range,
as long as the VCDL delay can be smaller than one
period of the output clock at the target frequency.
However, the additional DLL significantly increases the
chip area.
Fig. 3. FLL of proposed CDR.
In this work, a dual-loop FLL is proposed to achieve a
wide frequency acquisition range of CDR with relatively
small-area frequency detectors (Fig. 3). This work
eliminates the lower limit of the CDR frequency
acquisition range by using only a simple coarse
frequency detector (FD) with two flip-flops. Because the
FLL shares the ICO with the PLL, no additional DLL is
needed. The coarse frequency loop of this work works as
follows. After the power-on reset, the initial frequency of
the ICO is set to the minimum value of its oscillation
range. This value is guaranteed to be lower than the
target frequency. The coarse frequency loop increases the
ICO frequency in uniform steps until it exceeds the target
frequency. At this point the coarse frequency loop is
declared to be locked. This operation of the coarse
frequency loop eliminates the lower limit of the CDR
frequency acquisition range. This also guarantees that the
proposed CDR can lock at any target frequency that is
within the ICO’s oscillation range.
After the power-on reset, only the coarse frequency
loop is enabled. It sets the ICO frequency to within
± 2% of the target frequency. After the coarse
frequency loop is locked, the coarse frequency loop is
disabled and the fine frequency loop is enabled. The fine
frequency loop sets the ICO frequency within ± 0.2 % of
the target frequency. This range is mostly smaller than
the PLL pull-in range. The fine FD works similarly to the
conventional RFD. The fine FD consists of a sampling
circuit followed by a transition detector. The sampling
and retiming circuit is shared with the Alexander PD of
the PLL. This sharing greatly reduces the chip area of the
fine FD.
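The two-stage acquisition described above can be sketched behaviorally. The model below is a simplification under stated assumptions: both detectors are ideal, the coarse loop only steps upward in uniform frequency steps, and the fine step is 1/128 of the coarse step (borrowing the 1/128 loop-gain ratio given in Section III and treating it as a step ratio, which is an assumption). All names and example values are hypothetical.

```python
def acquire_frequency(f_target, f_min, coarse_step, gain_ratio=128,
                      max_fine_steps=100_000):
    """Return (frequency after coarse lock, frequency after fine lock) in Hz."""
    if f_min >= f_target:
        raise ValueError("the minimum ICO frequency must be below the target")

    # Coarse loop: start at the bottom of the ICO range and only step upward
    # until the target frequency is exceeded.
    f = f_min
    while f <= f_target:
        f += coarse_step
    f_coarse = f

    # Fine loop: bidirectional, with a much smaller step (assumed 1/128).
    fine_step = coarse_step / gain_ratio
    for _ in range(max_fine_steps):
        if abs(f - f_target) <= fine_step:
            break
        f += fine_step if f < f_target else -fine_step
    return f_coarse, f


if __name__ == "__main__":
    f_coarse, f_fine = acquire_frequency(2.5e9, 1.0e9, 25e6)
    print(f"coarse offset: {(f_coarse - 2.5e9) / 2.5e9 * 100:.2f} %")
    print(f"fine offset:   {(f_fine - 2.5e9) / 2.5e9 * 1e6:.0f} ppm")
```

In this example (2.5 GHz target, 1 GHz start, 25 MHz coarse step, all assumed values), the coarse loop stops about 1 % above the target and the fine loop ends well inside ±0.2 %.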
Fig. 4. s-domain approximation of PLL.
2. Adaptive-bandwidth tracking
The ratio of the PLL bandwidth to the input data rate is
fixed to a constant value in the proposed referenceless
CDR to maintain a good jitter tolerance performance for
a wide range of input data rate. This scheme, which is
called the adaptive-bandwidth tracking scheme, is
applied to the referenceless CDR for the first time in this
work. By using a simplified s-domain model of the PLL
(Fig. 4), the PLL bandwidth BWPLL can be derived as
follows if the proportional-path DAC (DACP) gain KP is
much larger than the integral-path DAC gain KI [19, 20]:
$$BW_{PLL} \approx K_{PD} \times K_P \times K_{ICO} \propto I_P \times K_{ICO} \qquad (1)$$
where KPD is the gain of the Alexander PD, KICO is the
ICO gain, and IP is the output current of DACP (Fig. 2).
KPD and KICO are kept constant independently of the input
data rate. By setting KP to be proportional to the input
data rate, the ratio of BWPLL to the input data rate is
maintained constant at ~1/1000. This achieves the
adaptive-bandwidth tracking. IP is proportional to KP.
After the frequency lock is achieved, both the phase
loop and the fine frequency loop work simultaneously.
To maintain the loop stability, the bandwidth BWFLL of
the fine frequency loop is set to a constant value which is
< 0.01BWPLL for the entire range of input data rate.
BWFLL is determined as
$$BW_{FLL} \approx K_{FINE\_FD} \times K_F \times K_{ICO} \qquad (2)$$
where KFINE_FD is the gain of the fine FD, and KF is the
gain of the frequency loop DAC (DACF).
III. CIRCUIT IMPLEMENTATION
A PLL and an FLL are combined to implement the
proposed referenceless CDR (Fig. 2). The PLL consists
of an Alexander PD, a digital loop filter, a DSM (delta
sigma modulator), two DACs, and an ICO. The
Alexander PD converts the phase difference between the
input data and the ICO output clock into two digital
codes, ‘E’ and ‘L’, which represent three cases of ‘early’
(E = 1, L = 0), ‘late’ (E = 0, L = 1), and ‘no action’ (E = 0,
L = 0). The two output codes of the Alexander PD are
sent to DACP of the proportional path and the
accumulator of the integral path of the PLL. The
Alexander PD is implemented by using sense-amp flip-flops to minimize the static phase offset between the
input data and the ICO output clock. DACP converts the
two output codes of the Alexander PD into three current
levels: 0, IP0, and 2IP0, where IP0 corresponds to the
proportional path gain KP. An 18-bit accumulator, a DSM,
and a 7-bit DAC are used in the integral path of the PLL.
The DSM enables use of a low-resolution DAC. The four
least significant bits (LSBs) of the accumulator are
discarded to reduce the dithering jitter of the ICO output
clock. The FLL is a first-order loop, which includes a
DSM to reduce the DAC size. A 7-bit R-2R DAC is
followed by an RC low-pass filter with a bandwidth of
around 1 MHz. The four LSBs of the 18-bit accumulator
output are discarded to eliminate the steady state
dithering jitter. The MUX shifter block reduces the fine
frequency loop gain to 1/128 of the coarse frequency
loop gain. This enables a relatively fast lock time for the
coarse frequency loop and a fine frequency resolution for
the fine frequency loop. The FD gain is the same for both
fine and coarse frequency loops. For both FLL and PLL,
the digital loop filter is used to avoid the huge capacitors
[21, 22]. The circuit operations of the coarse FD, the fine
FD, and the ICO are explained in the following
paragraphs.
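As a behavioral illustration of the digital PLL paths described above, the sketch below maps the Alexander PD codes to the three DACP current levels and models the 18-bit accumulator with its four discarded LSBs. The (E, L) to current mapping follows the one given later in the text; the accumulator polarity (up on ‘late’, down on ‘early’) and the saturation at the 18-bit limits are assumptions, and the names are hypothetical.

```python
ACC_BITS = 18        # accumulator width given in the text
DROPPED_LSBS = 4     # least significant bits discarded before the DSM
ACC_MAX = (1 << ACC_BITS) - 1


def dacp_current(e, l, ip0):
    """Proportional-path current for the PD codes (E, L).

    early (1, 0) -> 0, no action (0, 0) -> IP0, late (0, 1) -> 2 * IP0.
    """
    levels = {(1, 0): 0.0, (0, 0): ip0, (0, 1): 2.0 * ip0}
    if (e, l) not in levels:
        raise ValueError("(E, L) is not one of the three cases listed in the text")
    return levels[(e, l)]


def accumulate(acc, e, l):
    """Integral path: count up on 'late', down on 'early' (assumed polarity),
    saturating at the 18-bit limits (saturation is also an assumption)."""
    return max(0, min(ACC_MAX, acc + l - e))


def dsm_input(acc):
    """Drop the four LSBs so that small accumulator changes do not reach the DSM."""
    return acc >> DROPPED_LSBS
```

Because of the discarded LSBs, the accumulator must move by 16 counts before the DSM input changes by one, which is how the truncation suppresses steady-state dithering.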
1. Coarse Frequency Detector
The coarse FD of this work is similar to that of [23]; a
two-phase version is used in [23] while a four-phase
version is used in this work.
The coarse FD of this work identifies whether the ICO
frequency is higher than half the input data rate by
counting the maximum number of rising transitions of
input data during one period of the ICO output clock. For
this, the coarse FD is implemented by using a series
connection of two flip-flops (Fig. 5(a)).
Fig. 5. Coarse FD (a) circuit, (b) operation (ICO freq. < target freq.), (c) operation (ICO freq. > target freq.).
The ICO generates four-phase clocks (CLKOUT[0:3]),
of which the target frequency is set to half the input data
rate for 2x-oversampling. Initially after the power-on
reset, the ICO output frequency is set to the minimum
frequency available from the ICO. Because the ICO minimum
frequency is designed to be lower than the target frequency,
the maximum number of rising data transitions during one
clock period is two or more. Therefore, the
coarse FD initially sets the FC_UP signal to ‘1’ (Fig.
5(b)). Thus, the ICO output frequency increases
continuously with time. When the ICO output frequency
exceeds the target frequency, the maximum number of
rising transitions of input data during one period of the
ICO output clock is one and the FC_UP signal is set to
‘0’ (Fig. 5(c)). If the FC_UP signal remains at ‘0’ during
1024 consecutive rising transitions of data, the FLL declares
the coarse lock and the ICO output frequency is located
within the range from 0 to +2% from the target frequency.
This satisfies the requirement of the following fine
frequency loop: the initial ICO output frequency must be
located within the range from 50% to 150% of the target
frequency for the fine frequency loop to be locked. The
maximum number of data transitions during one clock period occurs
for the input data pattern of ‘0101’. The proposed coarse
frequency loop always locks to the target frequency
without harmonic locking, as long as the minimum
frequency of the ICO output is lower than the target
frequency.
Table 1. Comparison of coarse FDs with different time intervals for counting (5 Gb/s PRBS-7 input data)
| Time interval for counting (clock period) | 1/2 | 1 (this work) | 2 |
| --- | --- | --- | --- |
| Lock frequency offset | 3~4% | 1~2% | 0.5~1% |
| Lock time (coarse lock) | 0.6 μs | 1.5 μs | 41 μs |
| Circuit complexity | 8 FF + 1 OR | 12 FF + 1 OR | 16 FF + 1 OR |
| Power | 0.84 mW | 1.17 mW | 1.48 mW |
| Sensitivity to duty cycle of input data | Sensitive | Insensitive | Insensitive |
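The decision rule of the coarse FD can be illustrated with a small behavioral model. It is only a sketch: it considers the worst-case ‘0101...’ pattern, whose rising edges are 2 UI apart, and treats FC_UP as asserted when two or more rising edges fit into one clock period, which is what a series connection of two flip-flops clocked by DIN can register (an interpretation of the description, not taken from the schematic). The function names are hypothetical.

```python
import math


def max_rising_edges_per_clock_period(data_rate_bps, clock_hz):
    """Largest number of rising data edges that can fall inside one clock period
    for a '0101...' pattern, whose rising edges are 2 UI apart."""
    # Clock period expressed in units of the rising-edge spacing (2 UI).
    period_in_spacings = data_rate_bps / (2.0 * clock_hz)
    # A half-open window of that length contains at most ceil(length) edges.
    return max(1, math.ceil(period_in_spacings))


def fc_up(data_rate_bps, clock_hz):
    """FC_UP is 1 when two or more rising edges fit in one clock period."""
    return 1 if max_rising_edges_per_clock_period(data_rate_bps, clock_hz) >= 2 else 0


def coarse_locked(fc_up_history, required=1024):
    """Coarse lock: FC_UP stayed at 0 for the last `required` rising data edges."""
    recent = fc_up_history[-required:]
    return len(recent) == required and not any(recent)
```

For 5 Gb/s data, `fc_up` returns 1 for any clock below 2.5 GHz and 0 at or above it, so a loop that only steps upward stops just above the target and cannot settle on a sub-harmonic.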
In Fig. 5(a), the two flip-flops are clocked by the input
data (DIN). The first flip-flop is reset while the divided
clock (CLKOUT[0]/2) is high. The second flip-flop is reset
at the power-on reset. CLKOUT[0]/2 is generated by
dividing one of the four-phase ICO output clocks (CLK[0]) by two.
By using four copies of the above-mentioned flip-flop circuit in
parallel (Fig. 5(a)), the lock time of the coarse frequency
loop is reduced from 5 μs to 1.5 μs.
Instead of counting the rising edges of input data
during one clock period of CLK[0], we can count them
during either a half clock period or two clock periods
(Table 1). When half-clock-period counting is used,
the lock time is shorter than with one-clock-period counting,
but a frequency offset from the target frequency occurs
depending on the clock duty cycle and the inter-symbol
interference (ISI) on the incoming data signal. With
two-clock-period counting, the lock time is longer and
the circuit complexity increases compared to one-clock-period counting. Therefore, one-clock-period
counting was chosen in this work.
2. Fine Frequency Detectors
After the coarse frequency loop is locked, the coarse
frequency loop is disabled and the fine frequency loop is
enabled. As long as the initial frequency of the ICO
output clock is located within ± 50 % of the target
frequency, the fine frequency loop is required to lock
such that the final frequency of the ICO output clock is
located within ± 0.2 % of the target frequency. The
proposed fine FD generates the FF_UP and FF_DN
signals by comparing the input data signal (DIN) with the
four-phase ICO clock signals (CLKOUT[0:3]). The fine
FD responds only to the consecutive 4-bit input data pattern
‘0101’. For any other input data pattern, both the
FF_UP and the FF_DN signals of the fine FD are set to
‘0’. The fine frequency loop is declared to be locked if
both the FF_UP and FF_DN signals remain at ‘0’ during
a continuous time interval of 1024 UI (unit interval of
data period) (Table 2).
Table 2. Comparison of fine FDs (5 Gb/s PRBS-7 input data)
| Fine FDs | Conventional RFD | Proposed fine FD (not shared) | Proposed fine FD (shared w/ Alexander PD) |
| --- | --- | --- | --- |
| Clock rate | Full-rate | Half-rate | Half-rate |
| Clock time interval between adjacent F/F | 0.25 UI | 0.5 UI | 0.5 UI |
| Circuit complexity | 12 FF + 6 Gates | 24 FF + 14 Gates | 8 FF + 14 Gates (16 FF: shared with Alexander PD) |
| Power | 2.41 mW | 1.74 mW | 0.28 mW |
| Sensitivity to duty cycle of input data | Sensitive | Insensitive | Insensitive |
The proposed fine FD consists of a sampling and
retiming block and a transition detector (Fig. 6(a)). The
operation of the proposed fine FD is basically the same
as the conventional RFD, except that the proposed fine
FD samples DIN at the rising edges of the half-rate ICO
clock (CLKOUT[0:3]), whereas the conventional RFD
samples the full-rate ICO clock at the rising and falling
edges of DIN. The conventional RFD uses 32 flip-flops.
Although the same number of flip-flops is used
for the fine FD of this work (Fig. 6), 16 flip-flops for the
sampling and retiming block are shared with the
Alexander PD of the PLL. Therefore, only 16 flip-flops
are added for the fine FD of this work, which takes only
one-half area compared to the conventional RFD.
Fig. 6. Fine FD with DIN of ‘0101’ (a) circuit, (b) operation (ICO freq. < target freq.), (c) operation (ICO freq. > target freq.).
Because the half-rate clock is used for sampling in the
proposed fine FD, a significant power reduction is
achieved in clock drivers and flip-flops compared to the
conventional RFD. Also, no full-swing DIN is required
and the fine FD operation is insensitive to the duty cycle
change of DIN due to ISI, because DIN is sampled by the
ICO clock in this work. When the input data rate is faster
than the clock frequency (Fig. 6(b)), the data transition
interval changes from A to D between two consecutive
rising edges of CLKOUT[0]. A, B, C and D represent the
sequence of unit data intervals (UI) synchronized to the
ICO output clocks (CLK[0:3]). In this case, the FF_UP
signal is set to ‘1’ and this increases the ICO frequency.
Fig. 7. Current controlled oscillator (ICO).
When the input data rate is lower than the clock
frequency (Fig. 6(c)), the data transition interval changes
from D to A and the FF_DN signal is set to ‘1’. The
input data pattern is assumed to be ‘0101’ in these two
cases. Between two consecutive rising edges of
CLKOUT[0], the data transition interval can move only to
the adjacent one. This is because the initial ICO
frequency at the start of the fine frequency loop operation
is within ± 50% of the target frequency and the ICO
frequency is always adjusted to approach to the target
frequency during the fine frequency loop operation.
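The rotational detection rule can be illustrated with a simplified timing model. The sketch below tracks in which quarter (A to D) of the clock period each rising data edge of a ‘0101...’ pattern falls, and raises FF_UP on an A-to-D move and FF_DN on a D-to-A move, as described above. It evaluates the rule once per rising data edge and is only meaningful for small frequency offsets, where the edge moves by less than one quarter per step; both points are simplifying assumptions, and the names are hypothetical. Exact rational arithmetic is used to avoid rounding at the quarter boundaries.

```python
from fractions import Fraction

QUADRANTS = "ABCD"


def quadrant(edge_time, clock_period):
    """Quarter of the clock period ('A'..'D') in which a rising data edge falls."""
    phase = (edge_time % clock_period) / clock_period  # in [0, 1)
    return QUADRANTS[int(phase * 4)]


def fine_fd(prev_q, curr_q):
    """Return (FF_UP, FF_DN) for two consecutive edge positions."""
    if (prev_q, curr_q) == ("A", "D"):
        return 1, 0   # data faster than clock: speed the ICO up
    if (prev_q, curr_q) == ("D", "A"):
        return 0, 1   # data slower than clock: slow the ICO down
    return 0, 0


def count_up_dn(data_rate_bps, clock_hz, n_edges=2000):
    """Count FF_UP / FF_DN events; both arguments must be integers (b/s, Hz)."""
    edge_spacing = Fraction(2, data_rate_bps)   # rising edges are 2 UI apart
    clock_period = Fraction(1, clock_hz)
    start = clock_period / 8                    # assumed start: middle of quarter A
    ups = dns = 0
    prev = quadrant(start, clock_period)
    for k in range(1, n_edges):
        curr = quadrant(start + k * edge_spacing, clock_period)
        up, dn = fine_fd(prev, curr)
        ups += up
        dns += dn
        prev = curr
    return ups, dns
```

For 5 Gb/s data, a 2.45 GHz clock produces only FF_UP events, a 2.55 GHz clock produces only FF_DN events, and a 2.5 GHz clock produces neither, which matches the lock condition of both signals remaining at ‘0’.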
3. Adaptive BW Tracking ICO
Conventional adaptive-bandwidth CDRs are divided
into an analog type and a digital type. The analog type
uses parallel charge-pump circuits [22] that are turned on
or off by a thermometer code. The digital type uses
parallel switches at the VDD side of the DCO [24] that are
turned on or off by a thermometer code. Both [22] and
[24] are reference-based CDRs. An adaptive-bandwidth
scheme has not previously been published for
referenceless CDRs. In this work, the adaptive bandwidth
tracking scheme was applied to a referenceless CDR for
the first time, by fixing the ratio of the PLL bandwidth
(BWPLL) to the input data rate to a constant value. This
gives both a good jitter tolerance and a small jitter in the
recovered clock throughout a wide range of input data
rate.
The ICO is implemented by a 2-stage pseudo-differential inverter-type ring oscillator (Fig. 7) with
three separate current sources (IF, IP, and II); IF is the
DACF output of the frequency loop, IP and II are the
outputs of DACP and DACI associated with the
proportional and integral paths of the phase loop (Fig. 2).
The ICO output frequency fCLKOUT can be derived as
$$f_{CLKOUT} \approx (I_F + I_I + I_P) \times K_{ICO} \qquad (3)$$
Fig. 8. Generation of ICO current (a) IF, (b) IP.
After the fine frequency lock is achieved, fCLKOUT is
located in the range of ± 0.2% from the target frequency.
Thus, IF is proportional to fCLKOUT since it is dominant
over IP and II. Because fCLKOUT is the same as half the
input data rate during the locked state, IF is proportional
to the input data rate. BWPLL is proportional to IP as in (1).
In this work, the adaptive bandwidth tracking is achieved
by setting IP to be proportional to IF as shown in Fig. 8.
IF is generated by a 7-bit DAC (DACF) with the digital
input code of DF (Fig. 8(a)). II is generated similarly by
DACI with the digital input code of DI. IP is generated by
sharing the analog voltage VF’ from the IF generation
circuit (Fig. 8(b)). A 2-bit digital code (E, L) is used to
generate IP: IP = 0, IP0, and 2IP0 when (E, L) = (‘1’, ‘0’),
(‘0’, ‘0’), and (‘0’, ‘1’), respectively. In this work, IP0 =
IF/80. The DAC gains (KP, KI, KF of Fig. 2) include the
gains of the current mirror circuits of Fig. 8. Compared
to the analog-type CDR [22], the proposed adaptive-bandwidth CDR has a smaller phase noise and a smaller
area. Compared to the conventional digital-type CDR
[24], it has a better immunity to VDD noise, a smaller
area, and a better linearity.
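The chain of proportionalities used in this subsection can be written out in one line. The following is a worked step under the approximation stated above that IF dominates the ICO current in the locked state; the symbol $f_{DATA}$ for the input data rate is introduced here only for illustration.

```latex
f_{CLKOUT} \approx I_F \times K_{ICO} = \frac{f_{DATA}}{2}
\;\Rightarrow\;
I_F \approx \frac{f_{DATA}}{2\,K_{ICO}},
\qquad
I_{P0} = \frac{I_F}{80} \approx \frac{f_{DATA}}{160\,K_{ICO}}
\;\Rightarrow\;
BW_{PLL} \propto I_{P0} \times K_{ICO} \approx \frac{f_{DATA}}{160} \propto f_{DATA}
```

To first order, the product $I_{P0} \times K_{ICO}$ depends only on the data rate, so the ratio of the PLL bandwidth to the data rate stays constant as long as KPD is constant.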
Fig. 9. (a) Layout, (b) chip photograph.
Fig. 10. Recovered clock and data (measurements) at 2 Gb/s (1.2 V), 5 Gb/s (1.2 V), 8 Gb/s (1.2 V), and 10 Gb/s (1.5 V).
IV. MEASUREMENT RESULTS
The proposed adaptive-bandwidth referenceless CDR
was fabricated in a 65-nm standard CMOS process (Fig.
9) and mounted in an 80-pin QFN package. The chip area is 0.17 mm^2
excluding the input and output buffers.
The recovered half-rate data and clock (Fig. 10) were
measured at a supply voltage of 1.2 V for data rates from
2 Gb/s to 8.4 Gb/s. The CDR chip works at data rates up
to 11.2 Gb/s at a supply voltage of 1.5 V. It consumes
26 mW at 5 Gb/s and 1.2 V.
The jitter of the recovered clock was affected by
the supply voltage and the data rate (Fig. 11). The rms and peak-to-peak jitters of the recovered clock were 5.0 ps and
41.1 ps (Fig. 11(a)) at 5 Gb/s and 1.2 V for 2^7-1 PRBS
data. The rms jitter was reduced as the data rate was
increased (Fig. 11(b)), where 2^7-1 PRBS data were used.
This is because the update period of the phase detector
output is reduced as the data rate is increased. The rms
jitter was increased as the maximum run-length of data
was increased (Fig. 11(c)), where the maximum run-length is N for the 2^N-1 PRBS data.
Fig. 11. Measured jitter of recovered clock (a) jitter histogram, (b) rms jitter versus data rate, (c) rms jitter versus maximum run-length of data.
Fig. 12. Measured frequency offset of recovered clock.
Fig. 13. Measured frequency spectrum of recovered clock at 5 Gb/s (a) phase noise, (b) reference spur.
Fig. 14. Measured jitter tolerance.
The frequency offset of the recovered clock (Fig. 12) from
the target frequency (half the data rate) was measured not
to exceed 1000 ppm after the FLL was locked. The offset
is smaller than the design target of ± 2000 ppm. In this
measurement, 2^7-1 PRBS data were used. The supply
voltage was 1.5 V.
The frequency spectrum of the recovered clock was
measured for 5 Gb/s 2^7-1 PRBS data at 1.2 V (Fig. 13).
The phase noise at 1-MHz offset was -97.76 dBc/Hz, and
the integrated phase noise was 4.21 ps (Fig. 13(a)). The
reference spur was -32.4 dBc (Fig. 13(b)).
The measured jitter tolerance curve satisfies the USB
3.0 spec at 5 Gb/s and the USB 3.1 spec at 10 Gb/s (Fig.
14). The measured corner frequency was 6 MHz at 5 Gb/s,
9 MHz at 8 Gb/s and 16 MHz at 10 Gb/s. It is almost
proportional to the data rate, and is approximately the
same as the PLL bandwidth; this verifies the adaptive
bandwidth tracking operation.
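The proportionality claim can be checked directly from the three measured corner frequencies quoted above. The short sketch below computes the ratio of data rate to corner frequency for each point; the variable and function names are illustrative.

```python
# Measured jitter-tolerance corner frequencies quoted in the text
# (data rate in b/s -> corner frequency in Hz).
MEASURED_CORNERS = {5e9: 6e6, 8e9: 9e6, 10e9: 16e6}


def rate_to_corner_ratio(rate_bps, corner_hz):
    """Data rate divided by the measured corner frequency."""
    return rate_bps / corner_hz


if __name__ == "__main__":
    for rate, corner in MEASURED_CORNERS.items():
        ratio = rate_to_corner_ratio(rate, corner)
        print(f"{rate / 1e9:4.0f} Gb/s: corner = data rate / {ratio:.0f}")
```

The ratios fall between 625 and roughly 889, the same order as the ~1/1000 design value.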
The proposed CDR was compared with recently published referenceless CDRs (Table 3). This work
achieves an FOM of 4.1 mW/Gb/s at 10 Gb/s.
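The figure of merit used in this comparison is the power consumption divided by the data rate. As a quick check, the sketch below recomputes it for the two operating points reported for this work (26 mW at 5 Gb/s and 41 mW at 10 Gb/s); the function name is hypothetical.

```python
def fom_mw_per_gbps(power_mw, rate_gbps):
    """Figure of merit: power in mW divided by data rate in Gb/s."""
    return power_mw / rate_gbps


if __name__ == "__main__":
    for power_mw, rate_gbps in ((26.0, 5.0), (41.0, 10.0)):
        print(f"{power_mw:.0f} mW at {rate_gbps:.0f} Gb/s: "
              f"{fom_mw_per_gbps(power_mw, rate_gbps):.1f} mW/Gb/s")
```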
Table 3. Performance comparison of referenceless CDRs
| CDR | ISSCC09 [14] | JSSC13 [22] | JSSC11 [10] | This work |
| --- | --- | --- | --- | --- |
| Technology | 65 nm | 0.18 μm | 0.13 μm | 65 nm |
| Data rate [Gb/s] | 0.65-8 | 4.6-5.3 / 9.2-10.6 | 0.5-2.5 | 2-11 |
| FLL lock range (UI) | [1, ∞] | [0.85, 1.15] | [0, ∞] | [0, 1.5] |
| Architecture | Quarter-rate analog DLL | Full-rate analog PLL | Half-rate digital PLL | Half-rate digital PLL |
| Supply [V] | 1.2 | 1.8 | 0.8 / 1.2 | 1.2 / 1.5 |
| Jitter [ps rms / ps pp @ Gb/s] | 9.7 / 53.3 | 1.04 / 7.5 | 5.4 / 44 | 5.0 / 41 @ 5; 4.2 / 31 @ 10 |
| Power [mW @ Gb/s] | 20.6 @ 0.65; 88.6 @ 8 | 110.6 @ 10 | 6.1 @ 2 | 26 @ 5; 41 @ 10 |
| FOM [mW/Gb/s] | 31.7; 11.1 | 11 | 3.05 | 5.2; 4.1 |
V. CONCLUSIONS
A low-power small-area FD is proposed for an
adaptive-bandwidth referenceless CDR. The FD consists
of a coarse FD and a fine FD. The coarse FD eliminates
the harmonic locking problem by extending the
frequency lock range from 0% to 100% of the target
frequency. The coarse FD is implemented with two flip-flops. The fine FD saves power and area by sharing the
sampler and re-timer circuitry with an Alexander PD in
the PLL. The coarse and fine FDs adjust the ICO
frequency only when the data pattern ‘0101’ is detected
in the incoming data. They do not depend on the maximum
run-length or the data transition density for frequency
locking. Because the coarse and fine FDs respond only to
the rising edges of incoming data, the proposed algorithm
is insensitive to the duty cycle of the incoming data. To
maintain a good jitter tolerance for a wide range of input
data rate, the adaptive-bandwidth scheme is used to
maintain a constant ratio of the PLL bandwidth to the
input data rate. To achieve this, the proportional path
gain of PLL is maintained to be proportional to the input
data rate by setting the DAC gain of the proportional
path to be proportional to the DAC input code of the FLL.
The proposed adaptive-bandwidth referenceless CDR
was implemented in a 65-nm standard CMOS process.
The CDR worked for the input data rates from 2 Gb/s to
8 Gb/s at a supply voltage of 1.2 V, and from 4 Gb/s to
11 Gb/s at a supply voltage of 1.5 V. The power
consumption of the proposed CDR was 26 mW at 5 Gb/s and 1.2 V. The rms
jitter of the recovered clock was 5.0 ps at 5 Gb/s and
1.2 V. The phase noise of the recovered clock was -97.76
dBc/Hz at the 1-MHz frequency offset from the center
frequency of 2.5 GHz (5 Gb/s). Because of the adaptive
bandwidth scheme used in this work, the proposed CDR
satisfies the jitter tolerance specifications of USB 3.0 and
USB 3.1 at 5 Gb/s and 10 Gb/s, respectively.
ACKNOWLEDGMENTS
This work was supported by the National Research
Foundation of the MSIP Korea under the contract
numbers of 2014-048650 and 2014-052875, and the
ITRC support program (NIPA-2014-H0301-14-1007)
supervised by the NIPA Korea, and IDEC.
REFERENCES
[1] Pavan Kumar Hanumolu, Gu-Yeon Wei, and Un-Ku Moon, “A Wide-Tracking Range Clock and Data Recovery Circuit,” IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 268-278, Feb. 2008.
[2] Arnoud P. van der Wel and Gerrit W. den Besten, “A 1.2–6 Gb/s, 4.2 pJ/Bit Clock & Data Recovery Circuit With High Jitter Tolerance in 0.14 µm CMOS,” IEEE J. Solid-State Circuits, vol. 47, no. 7, pp. 1768-1775, Jul. 2012.
[3] Fan-Ta Chen, Min-Sheng Kao, Yu-Hao Hsu, Chih-Hsing Lin, Jen-Ming Wu, Ching-Te Chiu, and Shuo-Hung Hsu, “A 10 to 11.5GHz Rotational Phase and Frequency Detector for Clock Recovery Circuit,” in Proc. 2011 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 185–188, May 2011.
[4] Jri Lee and Ke-Chung Wu, “A 20-Gb/s Full-Rate Linear Clock and Data Recovery Circuit With Automatic Frequency Acquisition,” IEEE J. Solid-State Circuits, vol. 44, no. 12, pp. 3590-3602, Dec. 2009.
[5] Namik Kocaman, Siavash Fallahi, Mahyar Kargar, Mehdi Khanpour, Ali Nazemi, Ullas Singh, and Afshin Momtaz, “An 8.5–11.5-Gbps SONET Transceiver With Referenceless Frequency Acquisition,” IEEE J. Solid-State Circuits, vol. 48, no. 8, pp. 1875-1884, Aug. 2013.
[6] Junyoung Song, Inhwa Jung, Minyoung Song, Young-Ho Kwak, Sewook Hwang, and Chulwoo Kim, “A 1.62 Gb/s–2.7 Gb/s Referenceless Transceiver for DisplayPort v1.1a With Weighted Phase and Frequency Detection,” IEEE Trans. on Circuits and Systems I: Regular Papers, vol. 60, no. 2, pp. 268-278, Feb. 2013.
[7] R.-J. Yang, S.-P. Chen, and S.-I. Liu, “A 3.125-Gb/s Clock and Data Recovery Circuit for the 10-Gbase-LX4 Ethernet,” IEEE J. Solid-State Circuits, vol. 39, no. 8, pp. 1356–1360, Aug. 2004.
[8] Rong-Jyi Yang, Kuan-Hua Chao, Sy-Chyuan Hwu, Chuan-Kang Liang, and Shen-Iuan Liu, “A 155.52 Mbps–3.125 Gbps Continuous-Rate Clock and Data Recovery Circuit,” IEEE J. Solid-State Circuits, vol. 41, no. 6, pp. 1380-1390, Jun. 2006.
[9] M.-S. Hwang, S.-Y. Lee, J.-K. Kim, S. Kim, and D.-K. Jeong, “A 180-Mb/s to 3.2-Gb/s, continuous-rate, fast-locking CDR without using external reference clock,” in Proc. IEEE Asian Solid-State Circuits Conf., pp. 144–147, Nov. 2007.
[10] Rajesh Inti, Wenjing Yin, Amr Elshazly, Naga Sasidhar, and Pavan Kumar Hanumolu, “A 0.5-to-2.5 Gb/s Reference-Less Half-Rate Digital CDR With Unlimited Frequency Acquisition Range and Improved Input Duty-Cycle Error Tolerance,” IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 3150-3162, Dec. 2011.
[11] Jinho Han, Jaehyeok Yang, and Hyeon-Min Bae, “Analysis of a Frequency Acquisition Technique With a Stochastic Reference Clock Generator,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 59, no. 6, pp. 336-340, Jun. 2012.
[12] Jinho Han, Hyosup Won, and Hyeon-Min Bae, “0.6–2.7-Gb/s Referenceless Parallel CDR With a Stochastic Dispersion-Tolerant Frequency Acquisition Technique,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 6, Jun. 2014.
[13] Guanghua Shu, Woo-Seok Choi, Saurabh Saxena, Tejasvi Anand, Amr Elshazly, and Pavan Kumar Hanumolu, “A 4-to-10.5Gb/s 2.2mW/Gb/s Continuous-Rate Digital CDR with Automatic Frequency Acquisition in 65nm CMOS,” in IEEE ISSCC Dig. Tech. Papers, pp. 150-152, Feb. 2014.
[14] S.-K. Lee, Y.-S. Kim, H. Ha, Y. Seo, H.-J. Park, and J.-Y. Sim, “A 650Mb/s-to-8Gb/s referenceless CDR circuit with automatic acquisition of data rate,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, pp. 184–185, Feb. 2009.
[15] Shao-Hung Lin and Shen-Iuan Liu, “Full-Rate Bang-Bang Phase/Frequency Detectors for Unilateral Continuous-Rate CDRs,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 55, no. 12, Dec. 2008.
[16] Chang-Lin Hsieh and Shen-Iuan Liu, “A 1–16-Gb/s Wide-Range Clock/Data Recovery Circuit With a Bidirectional Frequency Detector,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 58, no. 8, Aug. 2011.
[17] David G. Messerschmitt, “Frequency Detectors for PLL Acquisition in Timing and Carrier Recovery,” IEEE Transactions on Communications, vol. COM-27, no. 9, Sep. 1979.
[18] B. Razavi, “Frequency Detectors for PLL Acquisition in Timing and Carrier Recovery,” Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design, 1996, pp. 107-114.
[19] Amr Elshazly, Rajesh Inti, Wenjing Yin, Brian Young, and Pavan Kumar Hanumolu, “A 0.4-to-3 GHz Digital PLL With PVT Insensitive Supply Noise Cancellation Using Deterministic Background Calibration,” IEEE J. Solid-State Circuits, vol. 46, no. 12, Dec. 2011.
[20] Mrunmay Talegaonkar, Rajesh Inti, and Pavan Kumar Hanumolu, “Digital Clock and Data Recovery Circuit Design: Challenges and Tradeoffs,” in Proc. 2011 IEEE Custom Integrated Circuits Conference (CICC), 2011, pp. 1–8.
[21] Pyung-Su Han and Woo-Young Choi, “1 Gb/s gated-oscillator burst mode CDR for half-rate clock recovery,” IEEE J. Semiconductor Technology and Science, vol. 4, no. 4, Dec. 2004.
[22] Hyung-Joon Jeon, Raghavendra Kulkarni, Yung-Chung Lo, Jusung Kim, and Jose Silva-Martinez, “A Bang-Bang Clock and Data Recovery Using Mixed Mode Adaptive Loop Gain Strategy,” IEEE J. Solid-State Circuits, vol. 48, no. 6, Jun. 2013.
[23] D. Dalton, K. Chai, E. Evans, M. Ferriss, D. Hitchcox, P. Murray, S. Selvanayagam, P. Shepherd, and L. Devito, “A 12.5-Mb/s to 2.7-Gb/s continuous-rate CDR with automatic frequency acquisition and data-rate read back,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2713–2725, Dec. 2005.
[24] Heesoo Song, Deok-Soo Kim, Do-Hwan Oh, Suhwan Kim, and Deog-Kyoon Jeong, “A 1.0–4.0-Gb/s All-Digital CDR With 1.0-ps Period Resolution DCO and Adaptive Proportional Gain Control,” IEEE J. Solid-State Circuits, vol. 46, no. 2, Feb. 2011.
Hye-Jung Kwon was born in Pohang,
Korea, in 1985. She received the B.S.
(2007) and the M.S. and Ph.D. (2014)
degrees from the Department of Electronic
and Electrical Engineering, Pohang
University of Science and Technology
(POSTECH), Gyeongbuk, Korea. She is currently
working at Samsung Electronics, Korea. Her research
interests include PLL/CDR circuits, PLL/CDR
behavioral simulators, and on-chip PVT variation monitoring.
Ji-Hoon Lim was born in Seoul,
Korea, in 1989. He received the B.S.
degree from the Department of Electronic
and Electrical Engineering, Pohang
University of Science and Technology
(POSTECH), Korea, in 2011. He is currently
pursuing the M.S. and Ph.D. degrees in the
same department. His research interests include
data converters, clock and data recovery,
high-speed interface circuits, and ultra-low-voltage
analog circuits.
Byungsub Kim received the B.S.
degree in Electronic and Electrical
Engineering (EEE) from Pohang
University of Science and Technology
(POSTECH), Pohang, Korea, in
2000, and the M.S. (2004) and Ph.D.
(2010) degrees in Electrical Engineering and Computer Science (EECS) from
Massachusetts Institute of Technology (MIT), Cambridge,
USA. From 2010 to 2011, he worked as an analog design
engineer at Intel Corporation, Hillsboro, OR, USA. In
2012, he joined the faculty of the department of
Electronic and Electrical Engineering at POSTECH,
where he is currently working as an assistant professor.
He has received several awards. In 2011, Dr. Kim
received the MIT EECS Jin-Au Kong Outstanding Doctoral
Thesis Honorable Mention and the IEEE 2009 Journal of
Solid-State Circuits Best Paper Award. In 2009, he
received the Analog Devices Inc. Outstanding Student
Designer Award from MIT, and was also a co-recipient of
the Beatrice Winner Award for Editorial Excellence at the
2009 IEEE International Solid-State Circuits Conference.
Jae-Yoon Sim received the B.S.,
M.S., and Ph.D. degrees in Electronic
and Electrical Engineering from
Pohang University of Science and
Technology (POSTECH), Korea, in
1993, 1995, and 1999, respectively.
From 1999 to 2005, he worked as a senior engineer at
Samsung Electronics, Korea. From 2003 to 2005, he was
a post-doctoral researcher with the University of
Southern California, Los Angeles. From 2011 to 2012, he
was a visiting scholar with the University of Michigan,
Ann Arbor. In 2005, he joined POSTECH, where he is
currently an Associate Professor. He has served in the
Technical Program Committees of the International
Solid-State Circuits Conference (ISSCC), Symposium on
VLSI Circuits, and Asian Solid-State Circuits
Conference. He is a co-recipient of the Takuo Sugano
Award at ISSCC 2001. His research interests include
high-speed serial/parallel links, PLLs, data converters,
and power modules for plasma generation.
Hong-June Park received the B.S.
degree from the Department of
Electronic Engineering, Seoul National
University, Seoul, Korea, in 1979,
the M.S. degree from the Korea
Advanced Institute of Science and
Technology, Taejon, in 1981, and the
Ph.D. degree from the Department of Electrical
Engineering and Computer Sciences, University of
California, Berkeley, in 1989. He was a CAD engineer
with ETRI, Korea, from 1981 to 1984 and a Senior
Engineer in the TCAD Department of INTEL from 1989
to 1991. In 1991, he joined the Faculty of Electronic and
Electrical Engineering, Pohang University of Science and
Technology (POSTECH), Gyeongbuk, Korea, where he
is currently a Professor. His research interests include
CMOS analog circuit design such as high-speed interface
circuits, ROIC of touch sensors and analog/digital
beamformer circuits for ultrasound medical imaging.
Prof. Park is a senior member of IEEE and a member of
IEEK. He served as the Editor-in-Chief of Journal of
Semiconductor Technology and Science, an SCIE journal
(http://www.jsts.org) from 2009 to 2012, also as the Vice
President of IEEK in 2012 and as the technical program
committee member of ISSCC, SOVC and A-SSCC for
several years.