Journal of Semiconductors, Vol. 34, No. 2, February 2013

A 14-bit 1-GS/s DAC with a programmable interpolation filter in 65 nm CMOS

Zhao Qi(赵琦)1, Li Ran(李冉)1, Qiu Dong(邱东)1, Yi Ting(易婷)1,†, Bill Yang Liu2, and Hong Zhiliang(洪志良)1

1 State Key Laboratory of ASIC and System, Fudan University, Shanghai 201203, China
2 Analog Devices, Shanghai 200021, China

Abstract: A programmable 14-bit 1-GS/s current-steering digital-to-analog converter is presented. It features a selectable interpolation rate (2x/4x/8x) with a programmable interpolation filter. To improve the high-frequency performance, a “fast switching” technique that adds additional biasing to the current switch is adopted. The data-dependent clock loading effect is also minimized with an improved switch control that uses a double latch. The DAC is implemented in 65 nm CMOS technology with an active area of 1.56 mm². The measured SFDRs are 70.05 dB at 250 MS/s for a 120.65 MHz input sine-wave signal and 64.24 dB at 960 MS/s for a 56.3 MHz input sine-wave signal.

Key words: DAC; high speed; high resolution; programmable
DOI: 10.1088/1674-4926/34/2/025004
EEACC: 2570

1. Introduction

Performing analog functions in the digital domain by connecting the data converter directly to the terminals is becoming a trend, and it requires very high bandwidth and sampling speed. Meanwhile, the evolution towards high levels of integration in communication systems also drives the demand for high-speed and high-resolution digital-to-analog converters (DACs). For example, in a communication base station, the IEEE 802.3an 10GBASE-T Ethernet standard requires a 1 GS/s transmit DAC that drives a 50 Ω load with an amplitude of 2.5 V. In recent years, various high-speed and high-resolution DAC designs have been published [1–9].
Some designs concentrate on good low-frequency performance and adopt techniques such as calibration or dynamic element matching [1, 2], but these methods do not improve high-frequency performance, because at high frequencies matching is no longer the only limiting factor. The return-to-zero (RZ) method is used in some other designs to improve high-frequency performance [3, 4], but it loses half of the signal power, which limits its fields of application. In this paper, a segmented current-steering DAC is proposed that aims at good SFDR at high frequency without adopting the RZ method. In addition, a programmable interpolation filter is used to improve energy efficiency at different data rates.

2. DAC architecture

The block diagram of the proposed DAC is depicted in Fig. 1. It is mainly composed of a programmable digital interpolation filter and the DAC core. The programmable digital interpolation filter consists of a FIFO, three half-band filters (HBF1, HBF2 and HBF3), each with an interpolation factor of 2, and a MUX. The input signal to the DAC core (DATA<13:0>) can have the same update rate as the input to the DAC (Din<13:0>) or an update rate 2, 4 or 8 times that of the input, depending on the selection input SEL<1:0> of the MUX. The mode is chosen according to the DAC core update rate: the programmable DAC works in non-interpolation mode, 2x-interpolation mode, 4x-interpolation mode or 8x-interpolation mode for core update rates of 0–250 MS/s, 250–500 MS/s, 500–800 MS/s or 800–1000 MS/s, respectively.

For an interpolation filter with an interpolation factor of I, the output sequence x_I(n) is:

$$x_I(n) = \sum_{m=0}^{N} h\big(mI + (n)_I\big)\, x\!\left(\left\lfloor \frac{n}{I} \right\rfloor - m\right), \tag{1}$$

where N is the filter order, h(m) is the filter coefficient, x(n) is the input sequence, (n)_I denotes n modulo I, and ⌊·⌋ is the floor operation. I is 2 here, as each half-band filter doubles the sampling rate.
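As an illustration of Eq. (1), the following minimal Python sketch evaluates the sum directly for I = 2 and compares it with the textbook formulation that inserts zeros between samples and then filters. The 7-tap half-band coefficients `H_DEMO`, the input `x_demo` and the function names are assumptions chosen for illustration; they are not the coefficients used in HBF1–HBF3.

```python
def interpolate(x, h, factor=2):
    """Direct evaluation of Eq. (1): y[n] = sum_m h[m*I + (n mod I)] * x[n//I - m]."""
    y = []
    for n in range(len(x) * factor):
        phase, k = n % factor, n // factor
        acc = 0.0
        m = 0
        while m * factor + phase < len(h):
            if k - m >= 0:  # samples before the start of x are taken as zero
                acc += h[m * factor + phase] * x[k - m]
            m += 1
        y.append(acc)
    return y


def zero_stuff_reference(x, h, factor=2):
    """Reference: insert factor-1 zeros after each sample, then FIR-filter with h."""
    u = []
    for sample in x:
        u.append(sample)
        u.extend([0.0] * (factor - 1))
    return [sum(h[j] * u[n - j] for j in range(len(h)) if n - j >= 0)
            for n in range(len(u))]


# Hypothetical 7-tap half-band low-pass, scaled by 2 to keep unity passband gain.
H_DEMO = [-1 / 16, 0.0, 9 / 16, 1.0, 9 / 16, 0.0, -1 / 16]

x_demo = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0, -2.0, 1.0]
y_poly = interpolate(x_demo, H_DEMO)          # 16 output samples for 8 inputs
y_ref = zero_stuff_reference(x_demo, H_DEMO)  # same values, computed the slow way
```

Each output sample uses only the coefficients h(mI + (n)_I) of one phase, so the multiplications by inserted zeros that the reference version performs are never carried out in the direct form.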
HBF1, HBF2 and HBF3 are half-band low-pass filters. Because about 50 percent of the coefficients of a half-band filter are zero, it outperforms other filter types in low-power design. The filter order N and the quantized coefficients can be obtained with the MATLAB FDA tool. The performance of each half-band filter is listed in Table 1, and their frequency responses are plotted in Fig. 2.

In the hardware implementation, several measures are taken to reduce the area and power consumption of the interpolation filters HBF1, HBF2 and HBF3. Firstly, the polyphase structure is adopted to reduce the hardware consumption, as shown in Fig. 3. Besides that, the filter is realized in transposed form, which has the advantage of hardware reuse and also has the shortest critical path compared with other kinds of filter realization. Moreover, shift-and-add operations are used instead of dedicated multipliers to realize multiplication. Finally, a canonical signed digit (CSD) representation is adopted to express the coefficients. In this way, the coefficients are represented with the fewest non-zero bits, which means a minimal number of addition and subtraction operations.

Fig. 1. Block diagram of the proposed DAC.

Fig. 2. Frequency response of (a) HBF1, (b) HBF2 and (c) HBF3.

Table 1. Performance of each filter.

| HBF  | Passband | Passband ripple (dB) | Stopband (dB) | Filter order (N) | Word length of quantized coefficients |
|------|----------|----------------------|---------------|------------------|---------------------------------------|
| HBF1 | 0.4      | 0.001                | 85            | 54               | 16                                    |
| HBF2 | 0.25     | 0.001                | 85            | 22               | 16                                    |
| HBF3 | 0.2      | 0.001                | 85            | 14               | 14                                    |

* Project supported by the National High Technology Research and Development Program of China (No. 2009AA011605) and the National Natural Science Foundation of China (No. 61076027).
† Corresponding author. Email: yiting@fudan.edu.cn
Received 19 June 2012, revised manuscript received 6 September 2012
© 2013 Chinese Institute of Electronics

The block diagram of the DAC core is depicted in Fig. 4.
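The CSD recoding and shift-and-add multiplication used for the filter coefficients can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation; the function names and the example coefficient value 23 are hypothetical.

```python
def to_csd(value):
    """Return the CSD digits of an integer, least significant digit first.

    Each digit is -1, 0 or +1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while value != 0:
        if value % 2:
            digit = 2 - (value % 4)  # +1 if value = 1 (mod 4), -1 if value = 3 (mod 4)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value //= 2
    return digits


def from_csd(digits):
    """Rebuild the integer from its CSD digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def csd_multiply(sample, digits):
    """Multiply an integer sample by a CSD coefficient using shifts and adds only."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            acc += sample << shift
        elif digit == -1:
            acc -= sample << shift
    return acc


coeff = 23                     # binary 10111 has four non-zero bits
coeff_csd = to_csd(coeff)      # 32 - 8 - 1 needs only three non-zero digits
product = csd_multiply(5, coeff_csd)
```

In this example the multiplication needs two add/subtract operations instead of the three required by the plain binary form, which is the saving that CSD recoding aims at.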
The current-steering architecture is adopted here because it is so far the best choice for fast-sampling DACs. For simplicity and compactness, and to reduce differential nonlinearity (DNL) and glitch energy, a 5-5-4 segmented structure is used, which consists of 5 thermometer-coded most significant bits (5 MSB), 5 thermometer-coded upper least significant bits (5 ULSB), and 4 binary-coded lower least significant bits (4 LLSB). Most of the design effort is focused on the MSB and the ULSB, as these two sections determine the overall performance of the DAC. As shown in Fig. 4, cells in the MSB and the ULSB are driven by row-and-column thermometer decoders, which are easy to design and area efficient. The input signal is separated into two channels (D and DB) with complementary clocks (CLK and CLKB), so signal D is half a clock period ahead of signal DB. Although this doubles the area of the digital part, the high-frequency performance is greatly improved with a proper switching sequence.

To meet the requirement of a 2.5 V output amplitude, an off-chip transformer connects the DAC to the load. The impedance equals 100 Ω on both sides of the transformer, while the effective impedance seen by the DAC equals 50 Ω.

Fig. 3. Block diagram of the interpolation filter HBF.

Fig. 4. Block diagram of the DAC core.

3. Distortion mechanism and DAC core design

3.1. Spectral performance of the current cell

A simple model for the current cell is shown in Fig. 5. In this model, the current cell contains an ideal current source and a pair of switches, which determine the output current path and the differential voltage between the positive and negative output nodes. The major non-ideal effect of the current source is its finite, frequency-dependent output impedance, which is modeled by Rcs and Ccs in Fig. 5. The spectral impurity of the DAC caused by the non-ideal current cell at different frequencies is discussed using this model.
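To give a feeling for how strongly the parallel Rcs–Ccs model depends on frequency, the short Python sketch below evaluates the impedance magnitude of a resistor in parallel with a capacitor. The component values `R_CS_DEMO` and `C_CS_DEMO` are hypothetical round numbers chosen for illustration, not values from this design.

```python
import math


def cell_impedance_magnitude(r_cs, c_cs, freq_hz):
    """|Z| of R_cs in parallel with C_cs: R / sqrt(1 + (2*pi*f*R*C)^2)."""
    omega = 2.0 * math.pi * freq_hz
    return r_cs / math.sqrt(1.0 + (omega * r_cs * c_cs) ** 2)


R_CS_DEMO = 10e6    # ohms, hypothetical low-frequency output resistance
C_CS_DEMO = 10e-15  # farads, hypothetical parasitic capacitance (10 fF)

for freq in (1e6, 100e6, 1e9):
    z_mag = cell_impedance_magnitude(R_CS_DEMO, C_CS_DEMO, freq)
    print(f"{freq / 1e6:8.0f} MHz -> |Z| = {z_mag / 1e3:10.1f} kOhm")
```

With these assumed values the magnitude at 1 GHz is set almost entirely by the capacitor, close to 1/(2πfC), or roughly 16 kΩ, even though the low-frequency resistance is 10 MΩ.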
Transistor mismatch of the current sources is a main source of nonlinearity. To meet the INL requirement of the DAC, the relationship between the DAC INL_yield and the required accuracy of the current sources can be expressed as [5]:

$$\frac{\sigma(I)}{I} \le \frac{1}{2C\sqrt{2^{N}}}, \qquad C = \mathrm{inv\_norm}\!\left(0.5 + \frac{\mathrm{INL\_yield}}{2}\right), \tag{2}$$

where σ(I)/I is the relative standard deviation of a unit current source, N is the resolution of the DAC, inv_norm is the inverse cumulative normal distribution, and INL_yield is the fraction of DACs with INL < 0.5 LSB. A 14-bit converter with a 99.7% yield specification at INL < 0.5 LSB requires a current error standard deviation of less than 0.125%. The minimal area of the current cell can be expressed as:

$$WL = \left[A_{\beta}^{2} + \frac{4A_{VT}^{2}}{(V_{GS}-V_{TH})^{2}}\right] \Big/ \left(\frac{\sigma(I)}{I}\right)^{2}, \tag{3}$$

in which A_β and A_VT are technology parameters, WL represents the area of the matched transistors, and V_GS − V_TH is the gate overdrive voltage. According to Eqs. (2) and (3), the required area of the current sources can be determined.

The other factor that causes distortion at low frequency is the finite output impedance Rcs of the DAC current cell. As has been pointed out in other studies [6, 7], with the load resistor Rload and the total number of unit current sources N (here N denotes the number of unit sources, not the resolution), the influence of the cell impedance Rcs on the third-order distortion of the converter can be described as:

$$\mathrm{HD3} = \left[\frac{R_{\mathrm{load}}\,N}{R_{cs}}\right]^{2}. \tag{4}$$

From Eq. (4), it can be seen that the low-frequency SFDR can be improved by increasing Rcs. For this reason, the cascode current source structure is generally used.

Fig. 5. Ideal switched current cell and simple model for analysis.

Additional performance-degrading effects come into play when the sampling frequency gets high.
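As a numerical check of Eqs. (2) and (4) above, the Python sketch below evaluates the matching bound for a 14-bit DAC at 99.7% yield and the cell resistance needed for a given HD3 target. The function names are hypothetical, the HD3 inputs (50 Ω load, 31 unit cells, −75 dB) are illustrative values, and converting HD3 to decibels as 20·log10 of the ratio in Eq. (4) is an assumption of this sketch.

```python
from statistics import NormalDist


def max_relative_sigma(n_bits, inl_yield):
    """Eq. (2): upper bound on sigma(I)/I for a given resolution and INL yield."""
    c = NormalDist().inv_cdf(0.5 + inl_yield / 2.0)  # inv_norm in the text
    return 1.0 / (2.0 * c * (2 ** n_bits) ** 0.5)


def min_cell_resistance(r_load, n_cells, hd3_db):
    """Eq. (4) solved for R_cs, assuming HD3[dB] = 20*log10((R_load*N/R_cs)^2)."""
    ratio = 10.0 ** (hd3_db / 40.0)  # R_load*N/R_cs that gives the target HD3
    return r_load * n_cells / ratio


sigma_bound = max_relative_sigma(14, 0.997)
r_cs_needed = min_cell_resistance(50.0, 31, -75.0)

print(f"sigma(I)/I bound: {100.0 * sigma_bound:.3f} %")
print(f"required R_cs:    {r_cs_needed / 1e3:.0f} kOhm")
```

Under these assumptions the bound of Eq. (2) comes out at about 0.13%, so the 0.125% figure quoted in the text is slightly on the conservative side, and the required cell resistance is a little above 100 kΩ.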
Assuming that the output impedance can be modeled by the parallel connection of a resistor and a capacitor, the expression for HD3 becomes:

$$\mathrm{HD3} = \left[\frac{R_{\mathrm{load}}\,N}{Z_{cs}}\right]^{2}. \tag{5}$$

For the current sources of the 5-bit MSB, to get HD3 < −75 dB with a load resistance of 50 Ω, Zcs should be over 100 kΩ, which is very difficult to reach at 1 GHz output frequency. Therefore, the output parasitic capacitance should be very small. For a cascode current source structure, the cascode transistors and switches need to be small to minimize parasitic capacitance.

3.2. Current source implementation

The cascode current source structure is adopted in this design, as depicted in Fig. 6. The area of the current cell M1 is determined in a similar way to that described above, and the cascode transistor M2 and the switches M3–M6 are minimized to reduce the high-frequency distortion. In addition, a fast switching method is used to reduce the parasitic capacitance [6]. Although a three-cascode-transistor structure can guarantee high impedance at low frequency, M7 and M8 could still cause large parasitic capacitance due to their large size. Therefore, two additional current paths are added to the sources of M7–M8, so that these two transistors are never completely switched off and the voltages of node 1 and node 3 are much more stable. The parasitic capacitors Cgs of M7–M8 and Cgd of M3–M6 cannot be observed from the output node, so these capacitors do not contribute to distortion. The first observable capacitor due to switching is Cgs of M3–M6, but its effect on the finite output impedance is reduced by the intrinsic gain of both M3–M6 and M7–M8, that is, by a factor of (gm·rout)². The number of current switches is four instead of two. Although the additional switches bring in more parasitic capacitance, they improve high-frequency performance.

Fig. 6. Current source structure.

3.3. Solutions to the data-dependent clock loading effect

Any latch topology presents a load to the buffer that drives the clock input of the latch [8], and this load depends on whether the state of the latch is about to change or not. The effect becomes significant when a large number of latches change state in a data-dependent pattern, and it is one of the most important limitations for high-frequency operation. Several circuits are therefore implemented here to minimize this effect.

As mentioned in the DAC architecture, row-and-column decoders are used to reduce design complexity and save power. Furthermore, a symmetrical decoding method is adopted, as shown in Fig. 7, to minimize the decoding feed-through effect.

The latch shown in Fig. 7 is adopted to synchronize the switch-control signals, because it has the shortest clock-to-Q delay and the steepest transition. The transition of the digital signals still has a finite settling effect at the end of the clock period, but this is solved with the double-latch structure.

To minimize the sensitivity of the driver to the data-dependent clock load, the clock is locally buffered, which enhances the strength of the clock signal. Besides that, to minimize the intersymbol interference between successive input codes and thereby achieve low distortion, the input data is also locally converted into a differential equivalent and buffered through a successively scaled buffer, as shown in Fig. 7, which also improves the switching speed.

Fig. 7. Double latches with symmetrical decoding.

Fig. 8. Current cell switch timing.

Fig. 9. Switch driver circuit.

Fig. 10. Floor plan for MSB and ULSB current sources.

To further reduce the data-dependent clock loading effect, a four-switch method [9] is implemented, as shown in Fig. 6 (M3–M6). In this method, two of the four switches change their states, while the other two remain off when the DAC is operating.
The input data determines whether the output current flows to one output terminal or the other. It can be seen from Fig. 8 that, regardless of whether the input signal is at low frequency or high frequency, the clock sees the same load and node 2 in Fig. 6 suffers the same disturbances all the time.

Meanwhile, the crossing point of the switch-control signals is also important. To prevent the NMOS switch transistors from turning off at the same time, switch-control signals with a high crossing point are needed, which are provided by the circuit shown in Fig. 9.

3.4. Layout solutions

In the DAC design, layout matching has a great influence on the performance of the fabricated chip. To ensure transistor matching, a special layout technique called “INL bounded switching sequences” [10] is adopted to cancel both the linear and the quadratic gradient errors. The layout arrangement of the MSB and ULSB current sources is depicted in Fig. 10.

Another layout consideration is routing. According to Ref. [7], the delay differences caused by clock signal and output signal routing degrade SFDR performance in high-speed operation, so very tight control over relative timing is necessary. As shown in Fig. 11, a binary tree routing method is adopted here for the clock signal and the output signal.

Fig. 11. Clock signal and output signal routing.

Fig. 12. Microphotograph of the proposed DAC.

Fig. 13. DNL and INL.

Table 2. DAC performance summary.

| Parameter                               | Value                           |
|-----------------------------------------|---------------------------------|
| Resolution                              | 14-bit                          |
| Power supply                            | 2.5 V (Analog), 1.0 V (Digital) |
| Full-scale current                      | 16 mA                           |
| Power @ 1 GS/s                          | 82 mW                           |
| DNL                                     | 1.59 LSB                        |
| INL                                     | 2.50 LSB                        |
| FOM = I_load² · R_load / P_total [7]    | 15%                             |
| Process                                 | 65 nm CMOS                      |

4. Experimental results

The DAC was fabricated in TSMC 65 nm CMOS technology. Figure 12 is the chip microphotograph, which shows that the core area is 1.2 × 1.3 mm². The measured DNL and INL plots are shown in Fig. 13, in which the measured DNL is +1.59 LSB/−1.26 LSB and the measured INL is +2.5 LSB/−1.06 LSB with off-chip calibration on. It can be seen that the INL is not symmetrical. That is because the current mirrors cause mismatch and the switching from full ULSB to MSB also causes mismatch. Deriving the ULSB from the MSB through a current divider, instead of a current mirror, may achieve better performance.

With the programmable digital interpolation filter, the DAC can work in four different modes, namely non-interpolation mode, 2x-interpolation mode, 4x-interpolation mode and 8x-interpolation mode, depending on the core update rate. Figure 14(a) shows that the measured SFDR is 70.05 dB for a 120.65 MHz input signal when the update rate is 250 MS/s (non-interpolation mode). In the 8x-interpolation mode, the output spectra for a 5 MHz input signal and a 56.3 MHz input signal, both at 960 MS/s, are shown in Figs. 14(b) and 14(c), and the SFDRs are 67.13 dB and 64.24 dB, respectively. As can be seen from Fig. 14, SFDR performance is limited by the second-order harmonic distortion. That is because fabrication causes mismatch and the layout and routing are not perfectly symmetrical. To minimize this effect, a better layout technique will be needed.

The performance of the programmable DAC and a comparison between this work and other works are summarized in Tables 2, 3 and 4. The analog part of the DAC consumes 42 mW from the 2.5 V power supply while providing 16 mA full-scale current, and the digital part consumes 40 mW when the sampling frequency is 1 GHz.

5. Conclusion

This paper presents a programmable 14-bit 1 GS/s DAC, which features a selectable interpolation rate (2x/4x/8x). A “fast switching” technique is adopted to reduce parasitic capacitance at high input frequency, and the data-dependent clock loading effect is minimized by various methods such as symmetrical decoding, a double latch, a locally buffered clock and switch sequence control.
The layout techniques for the current source transistors, the clock binary tree and the output signal binary tree are implemented to further improve SFDR at high frequency. The full-scale output current is 16 mA and the power consumption is 82 mW at 1 GS/s. The converter is designed in TSMC 65 nm technology and has a core area of 1.56 mm².

Fig. 14. Output spectra at (a) f_sample = 250 MHz, f_in = 120.65 MHz, (b) f_sample = 960 MHz, f_in = 5 MHz, and (c) f_sample = 960 MHz, f_in = 56.3 MHz.

Table 3. SFDR performance summary.

| Mode              | Core update rate (MHz) | Input signal frequency (MHz) | SFDR (dB) |
|-------------------|------------------------|------------------------------|-----------|
| Non-interpolation | 250                    | 5                            | 72.39     |
| Non-interpolation | 250                    | 120.65                       | 70.05     |
| 2x-interpolation  | 500                    | 5                            | 70.65     |
| 2x-interpolation  | 500                    | 121                          | 66.59     |
| 4x-interpolation  | 800                    | 5                            | 68.86     |
| 4x-interpolation  | 800                    | 97                           | 65.05     |
| 8x-interpolation  | 960                    | 5                            | 67.13     |
| 8x-interpolation  | 960                    | 56.3                         | 64.24     |

Table 4. DAC performance comparison.

| Parameter             | This work    | ISSCC2009 [6] | ISSCC2004 [9]  | ISSCC2011 [3] |
|-----------------------|--------------|---------------|----------------|---------------|
| Tech. (nm)            | 65           | 65            | 180            | 90            |
| Area (mm²)            | 1.56         | 0.31          | 2.5            | 0.825         |
| Supply (V)            | 1/2.5        | 1.1/2.5       | 1.8/3.3        | 1.2/2.5       |
| Resolution            | 14           | 12            | 14             | 12            |
| Fclk (MHz)            | 960          | 2900          | 1400           | 1250          |
| Swing (mV)            | 1600         | 2500          | 1500           | 1200          |
| Power (mW)            | 82           | 188           | 200            | 128           |
| SFDR @ 5 MHz (dB)     | 72.39/67.13* | 76            | N.A.           | 74            |
| SFDR @ 120 MHz (dB)   | 70/66.6**    | 72            | 67 (@ 261 MHz) | 71            |
| FOM [7]               | 15%          | 66%           | 23%            | 10%           |

* f_sample = 250 MHz/960 MHz (non-interpolation/8x-interpolation).
** f_sample = 250 MHz/500 MHz (non-interpolation/2x-interpolation).

References

[1] Huang Q, Francese P A, Martelli C, et al. A 200 MS/s 14 b 97 mW DAC in 0.18 μm CMOS. ISSCC Dig Tech Papers, 2004: 364
[2] Chan K L, Galton I. A 14 b 100 MS/s DAC with fully segmented dynamic element matching. ISSCC Dig Tech Papers, 2006: 582
[3] Tseng W H. A 12 b 1.25 GS/s DAC in 90 nm CMOS with >70 dB SFDR up to 500 MHz. ISSCC Dig Tech Papers, 2011: 192
[4] Tseng W H, Wu J T, Chu Y C. A CMOS 8-bit 1.6-GS/s DAC with digital random return to zero. IEEE Trans Circuits Syst I, 2011, 58(1): 1
[5] Van den Bosch A, Borremans M A F, Steyaert M S J, et al. A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A converter. IEEE J Solid-State Circuits, 2001, 36(3): 315
[6] Lin C H, van der Goes F M L, Westra J R, et al. A 12 bit 2.9 GS/s DAC with IM3 < –60 dBc beyond 1 GHz in 65 nm CMOS. IEEE J Solid-State Circuits, 2009, 44(12): 3285
[7] Palmers P, Steyaert M S J. A 10-bit 1.6-GS/s 27-mW current-steering D/A converter with 550-MHz 54-dB SFDR bandwidth in 130-nm CMOS. IEEE Trans Circuits Syst I, 2010, 57(11): 2870
[8] Mercer D A. Low-power approaches to high-speed current-steering digital-to-analog converters in 0.18 μm CMOS. IEEE J Solid-State Circuits, 2007, 42(8): 1688
[9] Schafferer B, Adams R. A 3 V CMOS 400 mW 14 b 1.4 GS/s DAC for multi-carrier applications. ISSCC Dig Tech Papers, 2004: 360
[10] Cong Y, Geiger R L. Switching sequence optimization for gradient error compensation in thermometer-decoded DAC arrays. IEEE Trans Circuits Syst I, 2000, 47(7): 589
