IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 58, NO. 7, JULY 2011
Clock- and Data-Recovery Circuit With
Independently Controlled Eye-Tracking Loop
for High-Speed Graphic DRAMs
Jun-Yong Song, Student Member, IEEE, and Oh-Kyong Kwon, Member, IEEE
Abstract—An independently controlled eye-tracking clock- and data-recovery (CDR) circuit that achieves enhanced high-frequency jitter tolerance is presented in this brief. In the proposed
CDR, a data-tracking loop compensates interchannel timing skews
and rejects low-frequency jitter of the data, and an eye-tracking
loop tracks asymmetric jitter distribution and high-frequency
jitter of the data to enhance high-frequency jitter tolerance. This
can be achieved by independently controlling two loops in the
digital domain. The CDR is implemented in a 0.18-μm CMOS
process, and a bit error rate of less than $10^{-12}$ was achieved for
a data rate up to 5.8 Gb/s using a $2^{31}-1$ pseudorandom binary-sequence input.
Index Terms—Bang-bang phase detector (PD), clock and data
recovery (CDR), complementary metal–oxide–semiconductor
(CMOS), dynamic random access memory (DRAM), eye tracking,
jitter tolerance.
I. INTRODUCTION
The demand for high-bandwidth dynamic random access memory (DRAM) for graphic memory has recently
increased for 3-D graphics to process large amounts of multimedia data [1]. Source-synchronous multichannel links such
as DRAM input/output have large interchannel timing skews
and poor receiver timing margins [2]. In addition, as the data
rate between the memory controller and the memory module
reaches several gigabits per second, the received data suffer
from intersymbol interference and reflection noise, and the total
jitter of the transmitted data has asymmetric distribution [3].
Consequently, it is difficult to achieve a low bit error rate (BER)
with a delay-locked loop (DLL)-based DRAM receiver because
the sampling clock is simply shifted by 90° using the DLL at the
receiver.
A clock- and data-recovery (CDR) circuit can compensate
interchannel timing skews and track the jitter distribution of the
received data. Conventional CDRs have a single loop for data-tracking operation and can suffer a high BER when the jitter distribution of the received data is asymmetric. A CDR with variable-interval oversampling can measure the data eye width and
can achieve a low BER with an asymmetric jitter distribution [3].
The CDR in [3] has a reference loop, a tracking loop, and an eye-measuring loop. Because the reference loop and the tracking
loop share the voltage-controlled oscillator (VCO), it is difficult
to adapt to a multichannel link. The tracking loop and the eye-measuring loop are tightly coupled, and the eye-measuring loop
has a relatively low jitter-tracking bandwidth to achieve stable
operation.
Fig. 1. CDR architecture with DTL and ETL for multichannel link.
Manuscript received July 22, 2010; revised November 24, 2010 and January 31, 2011; accepted April 4, 2011. Date of publication July 5, 2011; date of current version July 20, 2011. This work was supported in part by Hynix Semiconductor Inc. and the IC Design Education Center. This work was recommended by Associate Editor H.-J. Yoo.
The authors are with the Department of Electronics and Computer Engineering, Hanyang University, Seoul 133-791, Korea (e-mail: okwon@hanyang.ac.kr).
Digital Object Identifier 10.1109/TCSII.2011.2158254
This brief presents a CDR with an independently controlled
eye-tracking loop (ETL) for high-speed graphic DRAM. It
adopts a shared phase-locked loop (PLL) for multichannel
DRAM architecture [4] and two loops for phase tracking and
eye tracking. The two loops are controlled independently in the
digital domain so that the bandwidth of each loop can be optimized
separately within a small area.
Section II describes the overall architecture of the CDR and
each loop. In Section III, the CDR building blocks for phase
detection and eye tracking are presented. The macromodeling
results of the CDR are presented in Section IV. The experimental
results of the CDR are presented in Section V, followed by the
conclusion in Section VI.
II. CDR ARCHITECTURE
Fig. 1 shows the block diagram of the proposed CDR.
The CDR operates at 1/4 rate to reduce the operating
speed of the digital logic and to maximize the data rate achievable in the
DRAM process. The PLL multiplies the frequency and reduces the jitter of the reference clock. It synthesizes
the four-phase 1/4-rate clocks (CLK0-3) from the reference
clock (REF_CLK). A data-tracking loop (DTL) compensates
skews between the data and the sampling clocks at each channel
and tracks low-frequency jitter of the data. The DTL aligns the
edge-sampling clocks (ϕ0-3) to the averaged transition positions of the 4-bit data by controlling the phase of CLK0-3.
The ETL takes over ϕ0-3, which carry the phase
information of the data, from the DTL. The ETL generates two eye-monitoring
clocks and one data-sampling clock per data bit by phase interpolating ϕi and ϕi+1. The phase of the data-sampling clock
is designed to be at the center of the data eye, which is the lowest
BER position. The bandwidth of the ETL is designed to
be higher than that of the DTL in order to track high-frequency jitter
of the received data; this is done by separating the control loop of the ETL
from the DTL and controlling the two eye-monitoring clocks
independently.
Fig. 2. Block diagram of the PLL.
Fig. 3. Block diagram of the DTL.
Fig. 4. Phase diagram of the DTL.
A. PLL
The shared PLL provides two equally spaced 1/4-rate differential clocks (CLK0-3) from REF_CLK, as shown in Fig. 2.
The differential delay cells of the VCO are implemented using symmetric loads [5] for high supply-noise rejection
and compatibility with the DRAM process. The frequency of
REF_CLK is 1/8 of the data rate, and that of CLK0-3 is 1/4
of the data rate. The phase difference between CLKi and CLKi+1
is one data unit interval (UI). CLK0-3 is distributed to the CDR
through current-mode logic buffers.
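As a quick check of these clock relationships, the following sketch computes the UI, the 1/4-rate clock frequency, and the 1/8-rate reference frequency for a given data rate. The function and key names are illustrative only and do not come from the brief; only the 1/4 and 1/8 ratios and the one-UI phase spacing are taken from the text, and the 5.8-Gb/s value is the maximum data rate reported in the abstract.

```python
def clock_plan(data_rate_gbps):
    """Derive the clock frequencies described for the shared PLL.

    Names are hypothetical; only the 1/4 and 1/8 ratios and the
    one-UI spacing between adjacent phases come from the text.
    """
    ui_ps = 1e3 / data_rate_gbps            # one unit interval, in ps
    clk_ghz = data_rate_gbps / 4.0          # CLK0-3 run at 1/4 of the data rate
    ref_clk_ghz = data_rate_gbps / 8.0      # REF_CLK runs at 1/8 of the data rate
    clk_period_ps = 1e3 / clk_ghz           # period of the 1/4-rate clock
    phase_spacing_ps = clk_period_ps / 4.0  # four phases, 90 degrees apart
    return {
        "ui_ps": ui_ps,
        "clk_ghz": clk_ghz,
        "ref_clk_ghz": ref_clk_ghz,
        "phase_spacing_ps": phase_spacing_ps,
    }


plan = clock_plan(5.8)
for name, value in plan.items():
    print(f"{name}: {value:.4f}")
```

At 5.8 Gb/s the 1/4-rate clock comes out at 1.45 GHz, which agrees with the recovered clock frequency quoted for Fig. 14, and the spacing between adjacent clock phases equals one UI.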
B. DTL
Fig. 3 shows the block diagram of the DTL. The DTL
adopts a digitally controlled phase-tracking loop with a 1/4-rate
twice-oversampling bang-bang phase detector (PD) and a digital phase rotator (PR) controller. ϕ0-3 is aligned to the averaged
transition position of the data by adjusting the phase of CLK0-3
using the PR and is sent to the ETL. Phase interpolators (PIs) generate two kinds
of sampling clocks (E0-3 and D0-3) for bang-bang phase
detection. Di is generated by phase interpolating ϕi and ϕi+1
with the same weight, and Ei is generated by buffering ϕi
using a PI to match the skews between Di and Ei [6]. The phase
relationship is shown in Fig. 4. The phase difference between
ϕi and ϕi+1 is 90° of the 1/4-rate clock, and the phase difference
between Ei and Di is 45° of the 1/4-rate clock.
Fig. 5. Block diagram of the ETL.
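The phase relationship between the DTL sampling clocks can be reproduced with a small calculation. The sketch below assumes an ideal, perfectly linear phase interpolator, which is a simplification, and all function names are hypothetical. It only illustrates that equal-weight interpolation between ϕi and ϕi+1 places Di 45° (half a UI) after Ei.

```python
NUM_PHASES = 4
PHASE_SPACING_DEG = 360.0 / NUM_PHASES  # 90 degrees of the 1/4-rate clock = 1 UI


def interpolate(phase_a_deg, phase_b_deg, weight_b):
    """Ideal linear phase interpolation; weight_b = 0 returns phase_a."""
    return phase_a_deg + weight_b * (phase_b_deg - phase_a_deg)


def dtl_sampling_phases(i):
    """Return (E_i, D_i) in degrees of the 1/4-rate clock."""
    phi_i = i * PHASE_SPACING_DEG
    phi_next = phi_i + PHASE_SPACING_DEG     # phi_(i+1), kept unwrapped
    e_i = interpolate(phi_i, phi_next, 0.0)  # buffered copy of phi_i
    d_i = interpolate(phi_i, phi_next, 0.5)  # equal weights on both inputs
    return e_i % 360.0, d_i % 360.0


def degrees_to_ui(deg):
    """Convert degrees of the 1/4-rate clock to data unit intervals."""
    return deg / PHASE_SPACING_DEG


for i in range(NUM_PHASES):
    e_i, d_i = dtl_sampling_phases(i)
    print(f"E{i} = {e_i:5.1f} deg, D{i} = {d_i:5.1f} deg")
```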
The DTL PD adopts bang-bang phase detection to align Ei
to the edges of the data. A majority vote circuit determines the state
of UP_DN0 once per clock cycle by comparing the number
of HIGHs in UP[0:3] with that in DN[0:3]. A DTL digital
loop filter accumulates UP_DN0 at an update frequency to
filter out the high-frequency jitter of the received data. The
accumulated result generates UP_DN1 to control the phase
of ϕ0-3 using a PR controller. The PR controller determines the
phase of ϕ0-3 with a 30-bit thermometer code TW[0:29]. The
PR controller is implemented with a bidirectional shift register
and an area-optimized 5-bit binary-to-thermometer decoder to
increase the linearity of the PR.
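The DTL control path can be sketched in software as a behavioral illustration, not the authors' circuit. Several details are assumptions: the tie handling in the vote, the accumulate-and-threshold form of the loop filter and its threshold value, and the decoder behavior when W[4] is LOW. The W[4]-HIGH case follows the decoder description in Section III-B. All function and class names are hypothetical.

```python
def majority_vote(up, dn):
    """Compare the number of HIGHs in UP[0:3] and DN[0:3].

    Returns +1 (up), -1 (down), or 0 on a tie (tie handling is assumed).
    """
    diff = sum(up) - sum(dn)
    return (diff > 0) - (diff < 0)


class DtlLoopFilter:
    """Accumulate UP_DN0 and emit UP_DN1 when a threshold is reached.

    The threshold form and its value are assumptions for illustration.
    """

    def __init__(self, threshold=4):
        self.threshold = threshold
        self.acc = 0

    def update(self, up_dn0):
        self.acc += up_dn0
        if self.acc >= self.threshold:
            self.acc = 0
            return 1
        if self.acc <= -self.threshold:
            self.acc = 0
            return -1
        return 0


def binary_to_thermometer_5bit(w):
    """Map a 5-bit code W (0..31) to a 30-bit thermometer code TW[0:29].

    A 4-bit decoder (15 outputs) is reused for both halves, and W[4]
    selects which half it drives, as described for the MUX-based decoder.
    """
    msb = (w >> 4) & 1
    low = w & 0xF  # W[0:3]
    half = [1 if i < low else 0 for i in range(15)]
    if msb:
        return [1] * 15 + half   # TW[0:14] all HIGH, TW[15:29] from W[0:3]
    return half + [0] * 15       # assumed: TW[15:29] all LOW


levels = {sum(binary_to_thermometer_5bit(w)) for w in range(32)}
print("distinct output levels:", len(levels))
```

Under these assumptions the codes 15 and 16 produce the same thermometer output, so the 32 input codes yield 31 distinct levels, consistent with the 31 steps stated for the decoder.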
C. ETL
The ETL adopts a triple-oversampling eye-monitoring
scheme [3] with two digitally controlled eye-monitoring clocks
(L0-3 and R0-3) and a data-sampling clock (C0-3). The ETL
generates Li, Ri, and Ci by phase interpolating ϕi and ϕi+1,
which carry the phase information of the data. The ETL also
independently controls Li, Ri, and Ci to track high-frequency
jitter of the data using an ETL PD, a digital loop filter, a PI
controller, and PIs, as shown in Fig. 5. Fig. 6(a) shows a phase
diagram of the ETL when the data eye opening leans to the
right side of the UI, and Fig. 6(b) shows the phases of the clocks with
PDF_jit, where PDF_jit denotes the probability density function
(PDF) of jitter. In the locked state, Li and Ri are positioned
at the edges of the data eye, and Ci is positioned at the center of
the data eye, which is the lowest BER position. Consequently,
the data sampled with Ci are the recovered data (RE_D[i]), and C0
is the recovered clock (RE_CLK). Additionally, the data eye
width can be measured with the digital control signals of Li
and Ri.
Fig. 6. (a) Phase diagram of the ETL and (b) phases of clocks with PDF_jit.
Fig. 7. (a) Schematic diagram of the DTL PD and (b) schematic diagram of the ETL PD.
Fig. 8. Schematic diagram of the 5-bit binary-to-thermometer decoder.
Fig. 9. (a) Schematic diagram of the PI and (b) the PICS.
The ETL PD generates DN_UP_L0 and UP_DN_R0 to
independently control the phases of Li and Ri and align them to
the edges of the data eye. The ETL digital loop filter updates
DN_UP_L1 and UP_DN_R1 by sampling DN_UP_L0
and UP_DN_R0 at an update frequency. The update frequency of the ETL is higher than that of the DTL to track high-frequency jitter of the data. Because there is no feedback from
the ETL to the DTL, the bandwidth of each loop can be
independently optimized. An ETL PI controller generates two
15-bit thermometer codes (TWL[0:14] and TWR[0:14]) to control the phases of
Li, Ri, and Ci using each PI.
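The ETL decision rule can be illustrated with a short behavioral sketch. It follows the rule stated for the ETL PD in Section III-A: a transition between Li and Ci drives DN_UP_L0 HIGH to delay Li, and a transition between Ci and Ri drives UP_DN_R0 LOW to advance Ri. The opposite polarities when no transition is seen, the bitwise comparison of samples, and the decimating form of the loop filter are assumptions, and every name below is hypothetical.

```python
def etl_pd(sample_l, sample_c, sample_r):
    """Return (DN_UP_L0, UP_DN_R0) for one 1/4-rate cycle.

    sample_l, sample_c, sample_r hold the four bits captured by
    L0-3, C0-3, and R0-3. A mismatch between neighboring samples of
    the same bit is treated as a data transition between those clocks.
    """
    transition_left = any(l != c for l, c in zip(sample_l, sample_c))
    transition_right = any(c != r for c, r in zip(sample_c, sample_r))
    dn_up_l0 = 1 if transition_left else 0   # HIGH delays L_i
    up_dn_r0 = 0 if transition_right else 1  # LOW advances R_i
    return dn_up_l0, up_dn_r0


class EtlLoopFilter:
    """Sample the PD outputs once every `update_every` cycles (assumed form)."""

    def __init__(self, update_every=4):
        self.update_every = update_every
        self.count = 0

    def update(self, dn_up_l0, up_dn_r0):
        self.count += 1
        if self.count == self.update_every:
            self.count = 0
            return dn_up_l0, up_dn_r0  # becomes (DN_UP_L1, UP_DN_R1)
        return None                    # no update in this cycle


flt = EtlLoopFilter(update_every=2)
print(flt.update(*etl_pd([0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1])))
print(flt.update(*etl_pd([1, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1])))
```

Because the ETL outputs never feed back into the DTL, a filter like this can run at its own update rate without affecting the DTL loop dynamics.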
III. CDR BUILDING BLOCKS
A. PD
The DTL PD is composed of samplers [7] for data sampling,
a retimer to align the sampled data in phase, and XOR gates for
bang-bang phase detection, as shown in Fig. 7(a). The states of
UP[i] and DN[i] are determined by the phase of Ei relative to the data
UI. The ETL PD adopts a triple-oversampling eye-monitoring
scheme [3], as shown in Fig. 7(b). If there is any transition
between Li and Ci, Li leads the left edge of the
data eye, and DN_UP_L0 becomes HIGH to delay Li. Likewise,
if there is any transition between Ci and Ri, Ri lags the right edge of
the data eye, and UP_DN_R0 becomes LOW to advance Ri.
B. 5-bit Binary-to-Thermometer Decoder
The 5-bit binary-to-thermometer decoder is used to generate
a thermometer code from the binary 5-bit PR control code
W[0:4] at the PR controller of the DTL. The 5-bit binary-to-thermometer decoder is implemented with a 4-bit binary-to-thermometer decoder and MUXs to optimize the resolution and
the area of the PR controller, as shown in Fig. 8. The most
significant bit of the 5-bit PR control code is used as the MUX
selection signal. When W[4] is HIGH, TW[0:14] is all HIGH,
and TW[15:29] is determined by W[0:3]. Accordingly, the
5-bit binary-to-thermometer decoder has 31 steps.
C. PI
The PIs have two roles in the proposed CDR. First, the PIs
generate multiphase clocks to oversample the received data.
Second, the PIs control the phases of the sampling clocks. Because
the inputs of the PI (VID0 and VID1) are differential signals
generated from the symmetric-load-based VCO, the
PIs are also implemented with symmetric loads for constant
input and output voltage swing, as shown in Fig. 9(a). A
PI current source (PICS) is composed of 60 PI control units
(PICUs) and is controlled with two 30-bit thermometer code
signals (TW[0:29] and TWb[0:29]). The PICU consists of
a switch for the digital control signal (TW[n] or TWb[n]) and a
current source with a bias voltage of VBN. Because the PICUs are
connected in parallel between a common source node (CS0
or CS1) and VSS, two adjacent PICUs share CSi or
VSS to minimize the area, as shown in Fig. 9(b). According to
simulation results, the integral nonlinearity of the PI is ±0.25
LSB due to process variations.
D. ETL PI Controller
The ETL PI controller generates 15-bit thermometer code
signals (TWL[0:14] and TWR[0:14]) to control the phases
of Li and Ri. The phase control signal for Ci (TWC[0:14]) is
composed of TWL[2m] and TWR[2m+1], where m is from
0 to 7, to position Ci at the center of Li and Ri and to minimize
the area. The phase ranges of Li, Ri, and Ci are limited to half
of the UI, as shown in Fig. 10, to simplify the PI controller logic
by assuming that the left or right edge of the data eye does not
cross the center of the UI. Consequently, the ETL PI has 5-bit
resolution and a 4-bit phase control range.
Fig. 11 depicts a floor plan of the six ETL PICSs with control
signals. There are two PICSs for each of Li, Ri, and Ci. To
minimize process variation caused by pattern density, dummy PICSs
are placed on both sides of the six PICSs. To limit the phase
range of Li to the left side and that of Ri to the right side of the UI, the fixed
control signals in the left and right PICSs are connected to
HIGH and LOW, respectively, as shown in Fig. 11. The center
PICSs have eight fixed control signals tied HIGH and seven
fixed control signals tied LOW. In the initial state, in order to
position Li, Ri, and Ci at the center of ϕi and ϕi+1, TWL[0:14] and TWR[0:14] are reset to HIGH and LOW, respectively.
Fig. 10. Phase ranges of Li, Ci, and Ri.
Fig. 11. Floor plan of the six ETL PICSs with control signals.
Fig. 12. Macromodeling results with an input data of 0.66-UI peak-to-peak jitter. (a) Control signal for ϕi. (b) Control signal for Li. (c) Control signal for Ri.
IV. MACROMODELING RESULTS OF CDR
We verified the tracking characteristics of the proposed eye-tracking method and designed the circuit using a macromodel
of the channel and the circuit. The transition probability and
jitter characteristics of the data are modeled using a $2^{31}-1$ pseudorandom binary-sequence (PRBS) input, white noise, and the
PCB channel. The channel length was varied to change the
magnitude of the jitter. The length of observation for the data eye
opening is determined by the bandwidth of the ETL.
Fig. 12 shows the macromodeling results of the CDR when the
input data have 0.66-UI peak-to-peak jitter and the reference
clock leads the input data. The bandwidth of the ETL is
set to four times that of the DTL. The weights of
the control signals for ϕi, Li, and Ri are presented in Fig. 12;
a high weight means a leading phase. The DTL positions the phase of
ϕi at the edge of the received data, and at the same time, the
ETL positions the phases of Li and Ri at the edges of the data
eye. While the loops are locking, Li approaches the left edge of
the data eye, and Ri is quickly delayed toward the right edge of the
data eye owing to the high bandwidth of the ETL. After the DTL
is locked to the data UI, Li and Ri quickly track the jitter of
the data, whereas Ci is located at the center of the data eye for
low BER.
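The eye-tracking behavior described here can be explored with a toy behavioral model. The sketch below is not the authors' macromodel. It assumes that the DTL is already locked, that the PI step is 1/30 UI, that Li and Ri are each confined to half of the UI, and that edge jitter follows a made-up asymmetric distribution with median 0 and a range of -0.1 to +0.3 UI. The observation window of 16 UI per update is also an assumption. All names are hypothetical.

```python
import random

STEPS_PER_UI = 30          # assumed PI resolution between phi_i and phi_(i+1)
HALF = STEPS_PER_UI // 2   # L stays in the left half, R in the right half


def draw_jitter(rng):
    """Asymmetric edge jitter in UI (assumed shape, median 0)."""
    if rng.random() < 0.5:
        return rng.uniform(-0.1, 0.0)
    return rng.uniform(0.0, 0.3)


def simulate_etl(num_updates=2000, uis_per_update=16, seed=1):
    """Bang-bang tracking of the eye edges; returns (L, R) per update in UI."""
    rng = random.Random(seed)
    l_idx, r_idx = HALF, HALF  # both clocks start at the center of the UI
    history = []
    for _ in range(num_updates):
        l_pos = l_idx / STEPS_PER_UI
        r_pos = r_idx / STEPS_PER_UI
        c_pos = 0.5 * (l_pos + r_pos)
        hit_left = False
        hit_right = False
        for _ in range(uis_per_update):
            if rng.random() < 0.5:             # transition at the start of the UI
                t = draw_jitter(rng)
                hit_left = hit_left or (l_pos < t < c_pos)
            if rng.random() < 0.5:             # transition at the end of the UI
                t = 1.0 + draw_jitter(rng)
                hit_right = hit_right or (c_pos < t < r_pos)
        l_idx += 1 if hit_left else -1         # delay L on a hit, else advance
        r_idx += -1 if hit_right else 1        # advance R on a hit, else delay
        l_idx = min(max(l_idx, 0), HALF)
        r_idx = min(max(r_idx, HALF), STEPS_PER_UI)
        history.append((l_idx / STEPS_PER_UI, r_idx / STEPS_PER_UI))
    return history


history = simulate_etl()
tail = history[len(history) // 2:]             # discard the locking transient
mean_l = sum(l for l, _ in tail) / len(tail)
mean_r = sum(r for _, r in tail) / len(tail)
mean_c = 0.5 * (mean_l + mean_r)
print(f"L ~ {mean_l:.3f} UI, R ~ {mean_r:.3f} UI, C ~ {mean_c:.3f} UI")
```

With this jitter shape the fully open eye spans 0.3 to 0.9 UI, so its center sits to the right of the UI center. In the model, L and R dither around those edges and C, taken as their midpoint, follows the shifted center. A longer observation window per update makes the clocks settle closer to the worst-case edges, at the cost of slower tracking.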
V. EXPERIMENT RESULTS OF CDR
The CDR circuit was fabricated in a 0.18-μm standard CMOS
technology. The microphotograph of the chip is shown in
Fig. 13. The core of the CDR circuit occupies 0.90 mm × 0.55 mm.
It consumes 109.8 mW at a 1.8-V supply voltage.
The area and power consumption were not optimized, in order to keep a large
design margin.
The chip has been tested with a chip-on-board assembly. With
the 5.8-Gb/s $2^{31}-1$ PRBS data input, the peak-to-peak jitter
of the received data is 25 ps owing to impedance mismatch, and
the measured BER is less than $10^{-12}$. The jitter histogram of
the recovered clock is shown in Fig. 14, where the rms and
peak-to-peak jitter are 13.5 and 121.2 ps, respectively. Fig. 15
shows the recovered parallel data at 1/4-rate 1.45 Gb/s. The maximum data rate and jitter tolerance of the CDR are limited by
the large peak-to-peak jitter of the sampling clock because the
supply noise of the digital buffers is transferred to the clock as large jitter
through the low bandwidth of the PLL. The measured
performance summary of the CDR and comparisons with other
works are listed in Table I.
Fig. 13. Chip microphotograph.
Fig. 14. Recovered 1/4-rate 1.45-GHz clock.
Fig. 15. Eye diagram of the 1/4-rate 1.45-Gb/s recovered data.
TABLE I
PERFORMANCE SUMMARY AND COMPARISONS
VI. CONCLUSION
A CDR circuit with an independently controlled ETL for
high-speed graphic DRAM has been presented. The ETL independently controls two eye-monitoring clocks to monitor
the data eye and to track high-frequency jitter of the data.
In addition, digital control enables simple eye-opening measurement. To achieve area efficiency, a modified 5-bit binary-to-thermometer decoder and a PI controller have been presented.
The CDR is implemented in 0.18-μm CMOS technology, the
maximum data rate is 5.8 Gb/s, and the BER is less than $10^{-12}$
with $2^{31}-1$ PRBS data. The CDR is applicable as a high-speed graphic DRAM receiver with high jitter tolerance.
ACKNOWLEDGMENT
The authors would like to thank K.-S. Kwak, S.-J. Ahn,
M.-S. Shin, E.-J. Kim, H.-R. Choi, and Y.-J. Kim for their
useful discussion and feedback.
REFERENCES
[1] H. Lee, K.-Y. K. Chang, J.-H. Chun, T. Wu, Y. Frans, B. Leibowitz,
N. Nguyen, T. J. Chin, K. Kaviani, J. Shen, X. Shi, W. T. Beyene,
S. Li, R. Navid, M. Aleksic, F. S. Lee, F. Quan, J. Zerbe, R. Perego, and
F. Assaderaghi, “A 16 Gb/s/link, 64 GB/s bidirectional asymmetric memory
interface,” IEEE J. Solid-State Circuits, vol. 44, no. 4, pp. 1235–1247,
Apr. 2009.
[2] E. Yeung and M. A. Horowitz, “A 2.4 Gb/s/pin simultaneous bidirectional parallel link with per-pin skew compensation,” IEEE J. Solid-State
Circuits, vol. 35, no. 11, pp. 1619–1628, Nov. 2000.
[3] S.-H. Lee, M.-S. Hwang, Y. Choi, S. Kim, Y. Moon, B.-J. Lee, D.-K. Jeong,
W. Kim, Y.-J. Park, and G. Ahn, “A 5-Gb/s 0.25 μm CMOS jitter-tolerant
variable-interval oversampling clock/data recovery circuit,” IEEE J. Solid-State Circuits, vol. 37, no. 12, pp. 1822–1830, Dec. 2002.
[4] Y.-S. Seo, J.-W. Lee, H.-J. Kim, C. Yoo, J.-J. Lee, and C.-S. Jeong, “A
5-Gb/s clock- and data-recovery circuit with 1/8-rate linear phase detector
in 0.18 μm CMOS technology,” IEEE Trans. Circuits Syst. II, Exp. Briefs,
vol. 56, no. 1, pp. 6–10, Jan. 2009.
[5] J. G. Maneatis, “Low-jitter process-independent DLL and PLL based
on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, no. 11,
pp. 1723–1732, Nov. 1996.
[6] S. Sidiropoulos and M. A. Horowitz, “A semidigital dual delay-locked loop,”
IEEE J. Solid-State Circuits, vol. 32, no. 11, pp. 1683–1692, Nov. 1997.
[7] P. K. Hanumolu, G.-Y. Wei, and U.-K. Moon, “A wide-tracking range clock
and data recovery circuit,” IEEE J. Solid-State Circuits, vol. 43, no. 2,
pp. 425–439, Feb. 2008.
[8] A. Agrawal, A. Liu, P. K. Hanumolu, and G.-Y. Wei, “An 8 × 5 Gb/s
parallel receiver with collaborative timing recovery,” IEEE J. Solid-State
Circuits, vol. 44, no. 11, pp. 3120–3130, Nov. 2009.