A CMOS 5.4/3.24-Gbps Dual-Rate CDR with Enhanced Quarter-Rate Linear Phase Detector Jae-Wook Yoo, Tae-Ho Kim, Dong-Kyun Kim, and Jin-Ku Kang This paper presents a clock and data recovery circuit that supports dual data rates of 5.4 Gbps and 3.24 Gbps for DisplayPort v1.2 sink device. A quarter-rate linear phase detector (PD) is used in order to mitigate high speed circuit design effort. The proposed linear PD results in better jitter performance by increasing up and down pulse widths of the PD and removes dead-zone problem of charge pump circuit. A voltage-controlled oscillator is designed with a ‘Mode’ switching control for frequency selection. The measured RMS jitter of recovered clock signal is 2.92 ps, and the peak-to-peak jitter is 24.89 ps under 231–1 bit-long pseudo-random bit sequence at the bitrate of 5.4 Gbps. The chip area is 1.0 mm×1.3 mm, and the power consumption is 117 mW from a 1.8 V supply using 0.18 µm CMOS process. Keywords: Linear phase detector, clock and data recovery, quarter-rate PD, dual-rate voltage controlled oscillator (VCO), DisplayPort. Manuscript received Oct. 04, 2010; revised Dec. 24, 2010; accepted Jan. 13, 2011. This work is supported by Inha University and Korea Institute for Advancement of Technology (KIAT) through the Human Resource Training Project for Strategic Technology. Jae-Wook Yoo (phone: +82 10 9197 6271, email: jwyoo@siliconworks.co.kr) is with Siliconworks Co., Daejeon, Rep. of Korea. Tae-Ho Kim (corresponding author, email: taeho.kim.inha@gmail.com), Dong-Kyun Kim (email: crow@inha.edu), and Jin-Ku Kang (email: jkang@inha.ac.kr) are with the School of Electronics Engineering, Inha University, Incheon, Rep. of Korea. http://dx.doi.org/10.4218/etrij.11.0110.0578 752 Jae-Wook Yoo et al. © 2011 I. Introduction Clock and data recovery (CDR) circuits are used extensively in high-speed interface systems, such as Ethernet receivers, disk drive, digital mobile receivers, and serial high-speed interfaces, to extract timing information from data. DisplayPort is a high-speed digital display interface standard set by the Video Electronics Standard Association (VESA) for highresolution display devices. The DisplayPort source transmits data in serial, and the sink device should recover clock from the input data [1]. On the receiver side, the most important building block is the CDR, and various design approaches of CDR design have been published [2]-[7]. The phase detector (PD) in the CDR, a linear type PD, is preferred to a bang-bang type PD since the linear PD has better jitter performance. Also, subrate PDs have been proposed to mitigate design effort and process restriction for high-speed operation circuit. However, the larger subrate linear PD (for example, 1/8 rate) increases a complexity compared to full-rate PD and has drawbacks such as larger area and layout mismatch [2]. In this paper, we designed a CDR with an enhanced quarterrate PD working at dual data rates, 5.4 Gbps and 3.24 Gbps. A quarter-rate PD generating increased up/down pulse widths is proposed. Section II describes the CDR architecture and operating principle. Details of the building block are given in section III. The measurement results are presented in section IV. Section V presents the conclusion and the performance summary of the work. II. CDR Architecture Figure 1 shows the block diagram of the proposed dual-rate CDR. A quarter-rate CDR architecture is adopted [3]. A ETRI Journal, Volume 33, Number 5, October 2011 Data Divider Lock detector REF_CLK Latch Mode UP [3:0] Loop filter Charge pump 1/4-rate PD DN (Iup:Idn=4:5) [3:0] MUX R1 C1 4 bit Latch VCO UP[0] DN[0] CLK270" C2 Loop 2 Latch 4-phase Rclk Latch quarter-rate PD generating enlarged up/down pulse widths and charge pump (CP) blocks are proposed. The dual-loop architecture consists of a frequencyacquisition loop and a phase-locked loop [2], [3], [6], [8]. During the frequency locking process (Loop 1), the frequency detector (FD) compares the frequency difference between an external reference clock and recovered clock. When the rising edge of the REF_CLK stays within two adjacent phase clocks for a certain period, the frequency lock detector sends the lock signal to shift the operation from the frequency acquisition process to the phase locking process. During the phase locking (Loop 2), the clock is recovered and the data are retimed. Since the first draft version of DisplayPort v1.2 standard should support the dual data rates (3.24 Gbps and 5.4 Gbps), the dualrate voltage-controlled oscillator (VCO) is designed. The target clock frequency can be selected by the ‘Mode’ signal generated by the DisplayPort link layer detecting the input data rate. At ‘Mode 0’, the VCO operates at 810 MHz, which is a quarter rate of 3.24 Gbps input data. At ‘Mode 1’, the VCO generates a 1.35 GHz clock for 5.4 Gbps data. The DisplayPort leaves the designer as an option for the CDR design whether using a reference clock or not using a reference clock (reference-less CDR). In this paper, the CDR with reference clock is adopted for the faster lock. III. Building Block Design 1. Enhanced Quarter-Rate Linear PD Figure 2 shows the proposed quarter-rate linear PD. The proposed PD consists of latches, XOR, and AND gates. The two outputs from the second (CLK90') and first (CLK90) latches are compared by XOR gates and generate UP[3:0] signals. These UP[3:0] signals also make DN[3:0] signals through the AND gates’ operation with the four phase- ETRI Journal, Volume 33, Number 5, October 2011 CLK270 Latch C Latch D Latch CLK0' Latch A Latch Latch Latch Latch G UP[2] DN[2] CLK90" H UP[3] DN[3] CLK180" PD core CLK90" CLK180" Latch CLK270' CLK270 CLK0" Latch CLK180' CLK180 DN[1] Latch CLK90' CLK90 UP[1] CLK0" CLK0' CLK0 F CLK270' Latch Recovered data Fig. 1. Block diagram of dual-loop CDR with quarter-rate PD. B CLK180' Latch CLK90 CLK180 Data recovery E Mode Loop 1 Data input Latch CLK90' Latch CLK0 FD A CLK270" Latch CLK dummy Fig. 2. Block diagram of proposed quarter-rate linear PD with dummy delay. recovered clocks from VCO (CLK0", CLK90", CLK180" and CLK270"). Also, the A, B, C, and D outputs are provided to the data recovery (DR) block to retime the data. Linear PD is affected by clock duty and unwanted gate delay mismatch due to parasitic elements. In practice, the proposed PD is implemented with clock and data buffer for delay matching. In this work, all gates in the PD are implemented with current mode logic for supporting high-speed operation. In the quarter-rate PD operation, the up pulse width is proportional to the phase error, whereas the down pulse width is constant regardless of the phase error. In the locked state, the up pulse width is 2.5 times of 1-bit data period (2.5×TBIT) and down pulse width is twice of data rate (2.0×TBIT). Thus, the ratio of the up and down pulse widths is 5/4. The difference between up and down pulse widths can be compensated by the CP with a reverse current ratio. The timing diagram of the quarter-rate linear PD is shown in Fig. 3. Waveforms, A, B, C, and D are the first latch outputs and E, F, G, and H are the second latch outputs. Figure 4 shows the comparison of the simulated current mismatch effect in the Jae-Wook Yoo et al. 753 DATA 0 1 2 3 4 5 6 7 8 200 150 PD output average (ps) CLK0 CLK90 CLK180 CLK270 A 0 1 B 2 1 2 C 0 5 H 0 UP1 4 UP3 1 6 DN2 1 2 7 5 5 6 2.0 15.0 12.5 10.0 1.6 7.5 1.4 A 5.0 1.2 2.5 1.0 0 0 1.0 2.0 Time (ns) (a) 0.8 3.0 I (µA) 1.8 Input B 2.0 1.8 10.0 7.5 A 5.0 1.6 1.4 V (V) B Y1 (V) Y0 (µA) 15.0 12.5 CP_out Input 1.2 2.5 1.0 0 –2.5 0 1.0 2.0 Time (ns) (b) 0.8 3.0 Fig. 4. Simulated current mismatch effect by up/down pulse width: (a) reference PD (1.5×TBIT) and (b) proposed PD (2.5×TBIT). Mismatch ratio: A/B. CP between the reference PD [3] and the proposed PD. Figure 4(a) represents the PD in [3], and Fig. 4(b) represents the proposed PD assuming the same resistor-capacitor (RC) time constant at the PD output node. The CP’s up/down current is proportional to up/down pulse width in the PD. As the input data rate goes higher, the current pulse will become shorter and the current mismatch ratio will be increased due to the RC effect. Compared to the current pulse in Fig. 4(a), the current 754 Jae-Wook Yoo et al. 100 150 Proposed [2] work Quarter1/8-rate rate [3] [4] [5] QuarterHalf-rate Full-rate rate 8 Fig. 3. Timing diagram of proposed quarter-rate linear PD. CP_out 0 50 Phase error (ps) Pulse Up 2.5×TBIT 3.5×TBIT 1.5×TBIT 0.5×TBIT 0.5×TBIT width Down 2.0×TBIT 4.0×TBIT 1.0×TBIT 1.0×TBIT 0.5×TBIT 7 4 4 DN3 6 3 3 8 Type of PD 5 DN1 Reference 5 2 2 7 7 4 DN0 8 6 UP2 –50 Table 1. Comparison of proposed quarter-rate linear PD with other subrate linear PDs (TBIT=width of one bit data). 8 5 3 3 –100 7 7 4 2 –150 Fig. 5. Simulated PD gain. 4 1 –50 –100 8 6 3 5.4 Gbps 3.24 Gbps 0 –200 –150 8 7 50 6 3 UP0 7 5 2 G 7 6 4 2 F 8 6 4 3 1 6 5 3 1 E 5 3 2 D 4 100 pulse in Fig. 4(b) has lower current mismatch ratio (A/B). Therefore, the proposed PD leads to the lower jitter generation. Figure 5 shows the simulated PD gain for each data rate. The pulse widths of a generated up/down pulses (PD output) versus phase difference between the data edge and clock edge are plotted for each data rate. Table 1 shows the comparison of the proposed quarter-rate linear PD with other subrate linear PDs published before. When the PD in [3] is compared with the PD in [4], the difference is an increased up pulse width. Compared to the PDs in [3]-[5], the proposed PD has the longer pulse width. Comparing the proposed PD to [2], the proposed PD’s pulse widths are equivalent to 4×TBIT (up pulse) and 5×TBIT (down pulse), respectively, if it is modified to 1/8 rate PD as in [2]. As a result, the proposed PD has the widest pulse width. Hence, it can show the better jitter performance. Though the pulse width is increased, total current consumption can be controlled by adjusting the CP current. 2. VCO with Mode Control Since the CDR should support two different data rates, the VCO should generate two different frequencies. At each frequency, the VCO generates four different phase clocks for the quarter-rate PD. The ring oscillator type VCO is proposed with a digital mode switch for selecting the operating frequency. The frequency ‘Mode’ value is provided from the link layer ETRI Journal, Volume 33, Number 5, October 2011 VDD Mode M9 2.25 M5 M3 M4 M6 Out Out M10 Mode 2.00 In Vctrl M1 M2 Frequency (GHz) 1.75 In M7 1.50 1.25 1.00 1.35 GHz (FF, 0°C) 1.35 GHz (TT, 27°C) 1.35 GHz (FF, 85°C) 810 MHz (FF, 0°C) 810 MHz (TT, 27°C) 810 MHz (FF, 85°C) 0.75 0.50 Fig. 6. Dual-rate VCO delay cell. td = Reff_load × Ceff_out . (1) The VCO changes its effective resistance by adding PMOS (M9-M10) elements which are controlled by the ‘Mode’ signal. At ‘Mode 0’, the M9 and M10 are in the cut-off (switch-off) region and the node is in open state. At ‘Mode 1’, the M9 and M10 are in the linear region (switch-on) and they act like an additional resistor. Assuming that the total effective resistance at ‘Mode 0’ is Reff1 and the effective resistance of M9 and M10 is Rm9,10, the total effective resistance at ‘Mode 1’ will be Reff1 || Rm9,10 and it is smaller than Reff1. Consequently, according to (2), the time delay at ‘Mode 1’ is shorter than at ‘Mode 0’. Thus, two target frequencies (810 MHz and 1.35 GHz) are controlled by the ‘Mode’ signal. Reff1 @ Mode 0 > Reff1 || Rm9,10 @ Mode 1. 0.50 3. CP Circuit The CP circuit is shown in Fig. 8. A unity gain buffer is used to clamp the terminal voltages of current sources during the zero-current pumping period. In this way, glitches on the loop 0.75 1.00 1.25 Control voltage (V) (a) 1.50 1.75 2.00 0 –25.0 1.35 GHz –50.0 810 MHz –75.0 –100 –125 –150 3 10 104 105 106 Relative frequency (Hz) (b) 107 108 Fig. 7. VCO characteristics: (a) VCO operating range and (b) phase noise. VDD Output UP UPB (2) The proposed VCO also adds current source to decrease VCO gain, and it covers all tuning range of the VCO. Simulations show that the gains of VCO are 920.55 MHz/V at 3.24 Gbps and 832.48 MHz/V at 5.4 Gbps, as shown in Fig. 7(a). Corner simulations show that the target frequency ranges are safely covered under the process, supply voltage, and temperature (PVT) variations. The simulated VCO phase noise shows –80 dBc/Hz and –88 dBc/Hz at 1 MHz offset from 810 MHz and 1.35 GHz clock frequency, respectively, as shown in Fig. 7(b). ETRI Journal, Volume 33, Number 5, October 2011 0.25 25.0 Phase noise (dBc/Hz) protocol. Figure 6 shows the schematic of the proposed dual-rate VCO delay cell. The load of the differential pair is made up of p-channel metal-oxide-semiconductor field-effect transistor (PMOS). According to (1), the delay of the delay cell can be decided by the load effective resistance and capacitance: 0.25 0.0 DN DNB 4Is 5Is Fig. 8. CP circuit. filter due to the charge sharing can be minimized. Since the up and down pulse widths of the PD are different, two different bias current sources are placed. The up/down current ratio is set to be 4/5 (reverse of the up/down pulse width ratio). The up and down current values of the CP are 12 μA and 15 μA, respectively, in the CDR loop. If the CP has a ±5% current mismatch due to PVT variations, 0.08 UI of data eye is affected. 4. Data Recovery Since the proposed CDR recovers the data with the quarter- Jae-Wook Yoo et al. 755 CLK0 A VDD DFF Latch DFF 4 bit counter RDATA1 MUX MUX C CLK90 DFF Latch DFF Count reset DFF Mode REF_CLK REF_CLK CLK0 B 4 DFF Lock signal Reset DFF Fig. 11. Block diagram of frequency lock detector. Latch RDATA2 MUX MUX D Detection phase margin (90°) Latch CLK90 ① ⑮ REF_CLK DDFF CLK0 CLK45 CLK90 MUX Count reset Lock signal VDD Count start (4 bit) FLD end Fig. 12. Timing diagram of frequency lock detector. Fig. 9. Block diagram of data recovery. 5. Frequency Lock Detector DATA 0 1 2 3 4 5 6 7 8 CLK0 CLK90 CLK180 CLK270 A 0 1 B C D MUX A&C 2 1 0 0 4 2 2 3 1 6 5 2 7 6 4 7 5 4 1 8 6 4 3 0 MUX B&D 5 3 8 7 8 6 3 8 5 7 CLK45 RDATA1 0 2 4 6 RDATA2 –1 1 3 5 Fig. 10. Timing diagram of data retiming block. rate PD, it needs data retiming block for recovering the data with 1:2 demultiplexed formats. The data retiming block is shown in Fig. 9. It consists of 2:1 multiplexer and double edge triggered D-F/F (DDFF). Figure 10 shows the timing diagram of the data retiming block. A, B, C, and D are outputs of the first latches in the PD. The A and C outputs are selected by CLK0, and the B and D outputs are selected by CLK90. Also, two multiplexed outputs (A and C, B and D) are realigned by the DDFF. Consequently, the parallelized 2-bit signals are recovered and aligned by CLK45 (45 degree shifted from CLK0). 756 Jae-Wook Yoo et al. Figure 11 shows the frequency lock detector block. The resolution of the frequency lock detector is determined by (3). Thus, the frequency detection resolution can be improved by increasing the number of the counter bit size. The timing diagram of the frequency lock detector is shown in Fig. 12. If the reference clock is located between zero phase clock (CLK0) and 90 degree phase clock (CLK90) for a certain period, the frequency lock signal is on. After the frequency lock detector generates the lock signal, the lock signal makes the CDR shift the operation from the frequency acquisition process to the phase locking process. Since the 4-bit counter is used in this design, the lock signal is on if the generated clocks (CLK0, CLK90) stay at the same position during continuous 16 reference clock period. resolution = 90° 2 bit_counter × 360° = ±1.56%. (3) IV. Measurement Results The CDR circuit using the quarter-rate linear PD with enhancing the CP pulse width has been fabricated in a 0.18-µm complementary metal-oxide-semiconductor (CMOS) RF technology. The chip consumes 117 mW at 5.4-Gbps data rate. The chip microphotograph is shown in Fig. 13. Loop bandwidth of the implemented CDR has 10 MHz, and its capacitances were implemented using metal-insulator-metal capacitor for accurate capacitance on the chip. Also, the loop ETRI Journal, Volume 33, Number 5, October 2011 102 Data buffer QuarterCDR tree rate PD CP VCO Loop filter PLL CP Jitter tolerance (UI) DR Output circuit buffer Jitter spec. Measurement 101 100 10–1 10–2 105 106 Fig. 13. Photomicrograph of proposed chip. 107 108 Jitter frequency (Hz) 109 1010 Fig. 15. Jitter tolerance. Table 2. Performance summary of proposed CDR and other works. (a) (b) Fig. 14. Jitter measurement result: recovered data eye and jitter histogram of recovered clock (a) at 5.4 Gbps PRBS input data and (b) at 3.24 Gbps PRBS input data. filter values are R1 (1.6 KΩ), C1 (900 fF), and C2 (200 fF). The core area of the CDR circuit is 1 mm×1.3 mm. For facilitating the measurements, a test chip was mounted on a FR-4 printed circuit board using bonding wires. Figure 14 shows the measured eye-diagram of the recovered half-rate data output and recovered clock for 231–1 pseudorandom bit sequence (PRBS) input data at 5.4 Gbps and 3.24 Gbps. The measured RMS jitter and peak-to-peak jitter of the recovered clock are 2.92 ps and 24.89 ps at 5.4 Gbps, and 4.55 ps and 27.4 ps at 3.24 Gbps, respectively. The bit error rate is measured to be less than 10-12 at both data rates with PRBS 231–1 data format. Figure 15 illustrates the jitter tolerance at 5.4 Gbps. It shows that the designed circuit meets the jitter tolerance specification. The performance comparison of the proposed CDR is given in Table 2. The proposed one and previous 5-Gbps CDRs with the same process were compared. The proposed circuit shows ETRI Journal, Volume 33, Number 5, October 2011 Specification This work [2] [5] Technology (CMOS) [9] 0.18 µm 0.18 µm 0.18 µm Supply voltage 1.8 V 1.8 V 1.8 V 1.8 V 0.18 µm Data rate 3.24/5.4 Gbps 5 Gbps 5 Gbps 155 Mbps to 3.125 Gbps PD type Quarter-rate 1/8-rate Full-rate Full-rate 2.92 ps Clock jitter (@5.4 Gbps) 6.8 ps 6.04ps 6.4ps (RMS) 4.55 ps (@5 Gbps) (@5 Gbps) (@3.125 Gbps) 31 (@2 –1 PRBS) (@3.24 Gbps) Power 117 mW consumption 144 mW 130 mW 95 mW (@5.4 Gbps) (w/o I/O buffer) better jitter and power consumption performance. Since [3] and [4] adopted LC-VCO rather than ring oscillator type VCO and designed for a 10 Gbps data rate, the measurement data of [3] and [4] were not included in the comparison table. V. Conclusion A CDR circuit that supports dual data rates of 5.4 Gbps and 3.24 Gbps for DisplayPort v1.2 sink device is presented in this paper. The CDR is realized with a quarter-rate PD with enhancing the up and down pulses. The proposed CDR circuit is fabricated in a 0.18-µm CMOS technology, and it shows 2.92-ps RMS and 24.89-ps peak-to-peak jitter in the recovered quarter-rate clock from 231–1 PRBS at 5.4-Gbps serial input. Acknowledgment Authors thank the IDEC program and for its hardware and software assistance for the design and simulation. Jae-Wook Yoo et al. 757 References [1] VESA, DisplayPort Standard, v1.2, Jan. 2010. [2] Y.S. Seo et al., “A 5-Gbit/s Clock-and-Data-Recovery Circuit With 1/8-Rate Linear Phase Detector in 0.18-um CMOS Technology,” IEEE Trans. Circuits Syst., vol. 56, no. 1, Jan. 2009, pp. 6-10. [3] S. Byun et al., “A 10-Gb/s CMOS CDR and DEMUX IC with a Quarter-Rate Linear Phase Detector,” IEEE J. Solid-State Circuits, vol. 41, no. 11, Nov. 2006, pp. 2566-2576. [4] J. Savoj and B. Razavi, “A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector,” IEEE J. Solid-State Circuits, vol. 36, no. 5, May 2001, pp. 761-767. [5] D. Rennie and M. Sachdev, “A 5-Gb/s CDR Circuit with Automatically Calibrated Linear Phase Detector,” IEEE Trans. Circuits Syst., vol. 55, no. 3, Apr. 2008, pp. 796-803. [6] H.S. Muthali, T.P. Thomas, and I.A. Young, “A CMOS 10-Gb/s SONET Transceiver,” IEEE J. Solid-State Circuits, vol. 39, no. 7, July 2004, pp. 1026-1033. [7] M. Hsieh and G.E. Sobelman, “Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery,” IEEE Circuit Syst. Mag., vol. 8, 2008, pp. 45-57. [8] L. Henrickson and D. Shen, “Low-Power Fully Integrated 10Gb/s SONET/SDH Transceiver in 0.13-μm CMOS,” IEEE J. Solid State Circuit, vol. 38, no. 10, Oct. 2003, pp. 1595-1601. [9] R. Yang et al., “A 155.52 Mbps-3.125 Gbps Continuous-Rate Clock and Data Recovery Circuit,” IEEE J. Solid-State Circuit, vol. 41, no. 6, June 2006, pp. 1380-1390. Dong-Kyun Kim received the BS in electronics engineering from Inha University, Incheon, Rep. of Korea, in 2009. He is currently toward the MS at the same university. His research interests include high speed clock and data recovery and VLSI circuit design. Jin-Ku Kang received his BS in nuclear engineering from Seoul National University, his MS in electrical engineering from New Jersey Institute of Technology, and his PhD in electrical engineering from North Carolina State University, in 1983, 1990, and 1996, respectively. From 1983 to 1988, he worked at Samsung Electronics, Inc., Rep. of Korea. In 1988, he was with Texas Instruments Korea. From 1996 to 1997, he was with Intel as a senior design engineer. Since 1997, he has been a professor in School of Electronics Engineering at Inha University. His research interests are high-speed CMOS VLSI design, mixed mode IC design, and highspeed serial interface design. Jae-Wook Yoo received the BS and MS in electronics engineering from Inha University, Incheon, Rep. of Korea, in 2008 and 2010, respectively. Since 2010, he has been with Siliconworks Co., Daejeon, Rep. of Korea. His research interests include high speed clock and data recovery and VLSI circuit design. Tae-Ho Kim received the BS and MS degrees in electronics engineering from Inha University, Incheon, Rep. of Korea, in 2007 and 2009, respectively. He is currently working toward the PhD at the same university. His research interests are mixed mode high speed links interface and signal integrity. 758 Jae-Wook Yoo et al. ETRI Journal, Volume 33, Number 5, October 2011