Vol. 34, No. 2
Journal of Semiconductors
February 2013
A 14-bit 1-GS/s DAC with a programmable interpolation filter in 65 nm CMOS
Zhao Qi(赵琦)1, Li Ran(李冉)1, Qiu Dong(邱东)1, Yi Ting(易婷)1,†, Bill Yang Liu2, and Hong Zhiliang(洪志良)1
1 State Key Laboratory of ASIC and System, Fudan University, Shanghai 201203, China
2 Analog Devices, Shanghai 200021, China
Abstract: A programmable 14-bit 1-GS/s current-steering digital-to-analog converter (DAC) is presented. It features a selectable interpolation rate (2x/4x/8x) with a programmable interpolation filter. To improve the high-frequency performance, a “fast switching” technique that adds additional biasing to the current switch is adopted. The data-dependent clock loading effect is also minimized with an improved switch control using a double latch. The DAC is implemented in 65 nm CMOS technology with an active area of 1.56 mm². The measured SFDRs are 70.05 dB at 250 MS/s for a 120.65 MHz input sine-wave signal and 64.24 dB at 960 MS/s for a 56.3 MHz input sine-wave signal.
Key words: DAC; high speed; high resolution; programmable
DOI: 10.1088/1674-4926/34/2/025004
EEACC: 2570
1. Introduction
Performing analog functions with digital signal processing, by connecting the data converter directly to the terminals, is becoming a trend, and it requires very high bandwidth and sampling speed. Meanwhile, the evolution towards high levels of integration in communication systems also drives the demand for high-speed, high-resolution digital-to-analog converters (DACs). For example, in a communication base station, the IEEE 802.3an 10GBASE-T Ethernet standard requires a 1 GS/s transmit DAC driving a 50 Ω load with an amplitude of 2.5 V.
In recent years, various high-speed, high-resolution DAC designs have been published [1–9]. Some designs concentrate on obtaining good low-frequency performance and adopt techniques like calibration or dynamic element matching [1, 2], but these methods do not improve high-frequency performance, because at high frequencies matching is no longer the only limiting factor. Although the return-to-zero (RZ) method is used in some other designs to improve high-frequency performance [3, 4], it loses half of the signal power, which limits its application fields. In this paper, a segmented current-steering DAC is proposed that aims at achieving good SFDR at high frequency without adopting the RZ method. Meanwhile, a programmable interpolation filter is used to improve energy efficiency at different data rates.
2. DAC architecture
The block diagram of the proposed DAC is depicted in Fig. 1. It is mainly composed of a programmable digital interpolation filter and the DAC core. The programmable digital interpolation filter consists of a FIFO, three half-band filters (HBF1, HBF2 and HBF3), each with an interpolation factor of 2, and a MUX. The input signal to the DAC core (DATA<13:0>) can have the same update rate as the input to the DAC (Din<13:0>) or an update rate 2, 4 or 8 times that of the input to the DAC, depending on the selection input SEL<1:0> to the MUX. The mode is determined by the DAC core update rate: the programmable DAC works in non-interpolation mode, 2x-interpolation mode, 4x-interpolation mode or 8x-interpolation mode with the core update rate at 0–250 MS/s, 250–500 MS/s, 500–800 MS/s or 800–1000 MS/s, respectively.
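The mode selection described above can be illustrated with a short Python sketch. The function names are hypothetical, and assigning each shared range boundary (250, 500 and 800 MS/s) to the lower mode is an assumption, since the text does not say which mode owns the endpoints. The encoding of SEL<1:0> is not given in the paper and is deliberately left out.

```python
def interpolation_factor(core_rate_msps):
    """Return the interpolation factor (1, 2, 4 or 8) for a core update rate in MS/s.

    Range boundaries are assigned to the lower mode; this is an assumption.
    """
    if not 0 < core_rate_msps <= 1000:
        raise ValueError("core update rate must be within (0, 1000] MS/s")
    if core_rate_msps <= 250:
        return 1  # non-interpolation mode
    if core_rate_msps <= 500:
        return 2  # 2x-interpolation mode
    if core_rate_msps <= 800:
        return 4  # 4x-interpolation mode
    return 8      # 8x-interpolation mode


def input_rate_msps(core_rate_msps):
    """Update rate required at Din<13:0> for a given core update rate."""
    return core_rate_msps / interpolation_factor(core_rate_msps)
```

Under these assumptions, a core update rate of 960 MS/s selects the 8x mode, so the digital input only has to be updated at 120 MS/s.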
For an interpolation filter with an interpolation factor of I, the output sequence x_I(n) is:

$$x_I(n) = \sum_{m=0}^{N} h\big(mI + (n)_I\big)\, x\!\left(\left\lfloor \frac{n}{I} \right\rfloor - m\right), \qquad (1)$$

where N is the filter order, h(m) is the filter coefficient, x(n) is the input sequence, and (n)_I denotes n modulo I. I is 2 here, as each half-band filter doubles the sampling rate.
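As a sanity check on Eq. (1), the following Python sketch compares the polyphase form, read here as y(n) = Σ h(mI + n mod I)·x(⌊n/I⌋ − m), with a direct zero-stuff-and-filter reference. The 7-tap half-band coefficients are a textbook example chosen for illustration only; they are not the coefficients used in this design.

```python
def interpolate_direct(x, h, factor):
    """Reference: insert factor-1 zeros between samples, then FIR-filter with h."""
    up = [0.0] * (len(x) * factor)
    for k, sample in enumerate(x):
        up[k * factor] = sample
    out = []
    for n in range(len(up)):
        acc = 0.0
        for j, coeff in enumerate(h):
            if n - j >= 0:
                acc += coeff * up[n - j]
        out.append(acc)
    return out


def interpolate_polyphase(x, h, factor):
    """Polyphase form: y(n) = sum_m h(m*I + n mod I) * x(floor(n/I) - m)."""
    out = []
    for n in range(len(x) * factor):
        phase, base = n % factor, n // factor
        acc = 0.0
        m = 0
        # Stop when the coefficient index leaves h or the input index goes negative.
        while m * factor + phase < len(h) and base - m >= 0:
            acc += h[m * factor + phase] * x[base - m]
            m += 1
        out.append(acc)
    return out


# Illustrative half-band filter (-1, 0, 9, 16, 9, 0, -1)/16; NOT the paper's coefficients.
H_DEMO = [-1 / 16, 0.0, 9 / 16, 1.0, 9 / 16, 0.0, -1 / 16]
```

The polyphase version never multiplies by the stuffed zeros, which is why it needs fewer operations per output sample than the direct form.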
HBF1, HBF2 and HBF3 are half-band low-pass filters. Because 50 percent of the coefficients of a half-band filter are zero, it outperforms other filter types in low-power design. The filter order N and the quantized coefficients can be obtained using the MATLAB FDA tool. The performance of each half-band filter is listed in Table 1, and their frequency responses are plotted in Fig. 2.
In the hardware implementation, several measures are taken to reduce the area and power consumption of the interpolation filters HBF1, HBF2 and HBF3. Firstly, the polyphase structure is adopted to reduce the hardware cost, as shown in Fig. 3. Besides that, the filter is realized in transposed form, which has the advantage of hardware reuse and also has the shortest critical path compared with other filter realizations. Moreover, shift-and-add operations are used
* Project supported by the National High Technology Research and Development Program of China (No. 2009AA011605) and the National
Natural Science Foundation of China (No. 61076027).
† Corresponding author. Email: yiting@fudan.edu.cn
Received 19 June 2012, revised manuscript received 6 September 2012
© 2013 Chinese Institute of Electronics
J. Semicond. 2013, 34(2)
Zhao Qi et al.
Fig. 1. Block diagram of the proposed DAC.
Fig. 2. Frequency response of (a) HBF1, (b) HBF2 and (c) HBF3.
Table 1. Performance of each filter.
| HBF | Passband | Passband ripple (dB) | Stopband (dB) | Filter order (N) | Word length of quantized coefficients |
|---|---|---|---|---|---|
| HBF1 | 0.4 | 0.001 | 85 | 54 | 16 |
| HBF2 | 0.25 | 0.001 | 85 | 22 | 16 |
| HBF3 | 0.2 | 0.001 | 85 | 14 | 14 |

instead of dedicated multipliers to realize multiplication. Finally, a canonical signed digit (CSD) representation is adopted to express the coefficients. In this way, the coefficients can be represented with the fewest non-zero digits, which means minimal addition and subtraction operations.
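To illustrate why CSD coding reduces the adder count, the sketch below converts an integer coefficient into CSD digits and multiplies by it using only shifts, additions and subtractions. It uses the standard non-adjacent-form recoding; the function names and the example coefficient 23 are hypothetical and are not taken from the design.

```python
def to_csd(value):
    """Return the CSD digits (-1, 0 or +1) of an integer, least significant first."""
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)  # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def shift_add_multiply(x, digits):
    """Multiply integer x by a CSD-coded coefficient using shifts, adds and subtracts."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit > 0:
            acc += x << shift
        elif digit < 0:
            acc -= x << shift
    return acc
```

For example, 23 is 10111 in binary (four non-zero bits) but 32 − 8 − 1 in CSD (three non-zero digits), so one add/subtract operation is saved for that coefficient.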
The block diagram of the DAC core is depicted in Fig. 4. The current-steering architecture adopted here is widely regarded as the most suitable choice for high-sampling-rate DACs. In consideration of
Fig. 3. Block diagram of the interpolation filter HBF.
Fig. 4. Block diagram of the DAC core.
simplicity, compactness, and the reduction of differential nonlinearity (DNL) and glitch energy, a 5-5-4 segmented structure is used, which consists of 5 thermometer-coded most significant bits (5 MSB), 5 thermometer-coded upper least significant bits (5 ULSB), and 4 binary-coded lower least significant bits (4 LLSB). Most of the design effort is focused on the MSB and the ULSB segments, as these two sections determine the overall performance of the DAC. As shown in Fig. 4, cells in the MSB and the ULSB segments are driven by row-and-column thermometer decoders, which are easy to design and area efficient. The input signal is separated into two channels (D and DB) with complementary clocks (CLK and CLKB), so signal D leads signal DB by half a clock period. Although this doubles the area of the digital part, the high-frequency performance is greatly improved with a proper switching sequence.
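The 5-5-4 segmentation can be sketched behaviorally in Python as follows. This is a flat thermometer model used only to show the code split and the cell weights (512 LSB per MSB cell, 16 LSB per ULSB cell); it does not model the row-and-column decoders of Fig. 4, and all names are hypothetical.

```python
def segment_code(code):
    """Split a 14-bit code into MSB/ULSB thermometer codes and 4 binary LLSB bits."""
    if not 0 <= code < 2 ** 14:
        raise ValueError("code must be a 14-bit unsigned value")
    msb = code >> 9            # upper 5 bits, 0..31
    ulsb = (code >> 4) & 0x1F  # middle 5 bits, 0..31
    llsb = code & 0xF          # lower 4 bits, 0..15
    msb_therm = [1 if i < msb else 0 for i in range(31)]
    ulsb_therm = [1 if i < ulsb else 0 for i in range(31)]
    llsb_bits = [(llsb >> i) & 1 for i in range(4)]  # least significant bit first
    return msb_therm, ulsb_therm, llsb_bits


def output_in_lsb(msb_therm, ulsb_therm, llsb_bits):
    """Sum the switched-on cell currents, expressed in units of one LSB current."""
    binary_part = sum(bit << i for i, bit in enumerate(llsb_bits))
    return 512 * sum(msb_therm) + 16 * sum(ulsb_therm) + binary_part
```

In this model a one-LSB code step never switches more than one MSB cell, which is the usual motivation for thermometer coding of the upper bits.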
To meet the requirement of a 2.5 V output amplitude, an off-chip transformer connects the DAC to the load. The impedance equals 100 Ω on both sides of the transformer, while the effective impedance seen by the DAC equals 50 Ω.
3. Distortion mechanism and DAC core design
3.1. Spectral performance of the current cell
A simple model for the current cell is shown in Fig. 5.
In this model, the current cell contains an ideal current cell
and a pair of switches which determine the output current path
and the differential voltage between the positive and negative
output node. The major non-ideal effect of the current source
is the finite frequency-dependent output impedance, which is
modeled by Rcs and Ccs in Fig. 5. The spectral impurity of the
DAC caused by the non-ideal current cell under different frequencies is discussed under this model.
Fig. 5. Ideal switched current cell and simple model for analysis.

Transistor mismatch of the current sources is a main source of nonlinearity. To meet the INL requirement of the DAC, the relationship between the DAC INL_yield and the required accuracy of the current sources can be expressed as [5]:

$$\frac{\sigma(I)}{I} \le \frac{1}{2C\sqrt{2^N}}, \qquad C = \mathrm{inv\_norm}\!\left(0.5 + \frac{\mathrm{INL\_yield}}{2}\right), \qquad (2)$$

where σ(I)/I is the relative standard deviation of a unit current source, N is the resolution of the DAC, inv_norm is the inverse cumulative normal distribution and INL_yield is the fraction of DACs with INL < 0.5 LSB. A 14-bit converter with a 99.7% yield specification at INL < 0.5 LSB requires a current error standard deviation of less than 0.125%.
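The matching requirement can be checked numerically with a short Python sketch. It assumes the form of Eq. (2) given in Ref. [5], σ(I)/I ≤ 1/(2C√(2^N)) with C = inv_norm(0.5 + INL_yield/2); the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def max_relative_sigma(n_bits, inl_yield):
    """Upper bound on sigma(I)/I of a unit current source.

    Assumes sigma(I)/I <= 1 / (2 * C * sqrt(2**N)),
    with C = inv_norm(0.5 + INL_yield / 2).
    """
    c = NormalDist().inv_cdf(0.5 + inl_yield / 2)
    return 1.0 / (2.0 * c * sqrt(2 ** n_bits))


if __name__ == "__main__":
    bound = max_relative_sigma(14, 0.997)
    print(f"sigma(I)/I bound: {100 * bound:.3f} %")
```

For N = 14 and a 99.7% yield this evaluates to roughly 0.13%, which is close to the 0.125% budget quoted above.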
The minimal area of the current cell can be expressed as:

$$WL = \left[\frac{4A_{VT}^2}{(V_{GS} - V_{TH})^2} + A_\beta^2\right] \bigg/ \left(\frac{\sigma(I)}{I}\right)^2, \qquad (3)$$

in which A_β and A_VT are technology parameters, WL represents the area of the matched transistors, and V_GS − V_TH is the gate overdrive voltage. According to Eqs. (2) and (3), the required area of the current sources can be determined.
The other factor that causes distortion at low frequency is the finite output impedance Rcs of the DAC current cell. As has been pointed out in other studies [6, 7], with the load resistor Rload and the total number of unit current sources N, the influence of the cell impedance Rcs on the third-order distortion of the converter can be described as:

$$\mathrm{HD3} = \left[\frac{R_{load} N}{R_{cs}}\right]^2. \qquad (4)$$

From Eq. (4), it can be seen that the low-frequency SFDR can be improved by increasing Rcs. For this reason, the cascode current source structure is generally used.
Additional performance-degrading effects come into play when the sampling frequency gets high. Assuming that the output impedance can be modeled by the parallel connection of a resistor and a capacitor, the expression for HD3 becomes:

$$\mathrm{HD3} = \left[\frac{R_{load} N}{Z_{cs}}\right]^2. \qquad (5)$$

For the current sources of the 5-bit MSB segment, to get HD3 < −75 dB with a load resistance of 50 Ω, Zcs should be over 100 kΩ, which is very difficult to reach at 1 GHz output frequency. Therefore, the output parasitic capacitance should be very small. For a cascode current source structure, the cascode transistors and switches need to be small to minimize parasitic capacitance.
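The impedance requirement can be reproduced with a small Python sketch. It assumes HD3 = (Rload·N/|Zcs|)², treats HD3 as an amplitude ratio (so −75 dB corresponds to 10^(−75/20)), and takes N = 31 for a 5-bit thermometer-coded MSB segment; these readings, the function names, and the purely capacitive limit at the end are assumptions made for illustration.

```python
from math import pi, sqrt


def min_cell_impedance(hd3_db, r_load_ohm, n_cells):
    """Smallest |Z_cs| meeting an HD3 target, assuming HD3 = (R_load * N / |Z_cs|)**2."""
    hd3_ratio = 10 ** (hd3_db / 20)  # HD3 treated as an amplitude ratio (assumption)
    return r_load_ohm * n_cells / sqrt(hd3_ratio)


def max_shunt_capacitance(z_ohm, freq_hz):
    """Capacitance whose reactance equals z_ohm at freq_hz (purely capacitive limit)."""
    return 1.0 / (2 * pi * freq_hz * z_ohm)


if __name__ == "__main__":
    z_min = min_cell_impedance(-75.0, 50.0, 31)
    c_max = max_shunt_capacitance(z_min, 1e9)
    print(f"|Z_cs| > {z_min / 1e3:.0f} kOhm, C < {c_max * 1e15:.2f} fF at 1 GHz")
```

Under these assumptions the bound is about 116 kΩ, in line with the "over 100 kΩ" figure, and it corresponds to only about 1.4 fF at 1 GHz, which shows why the parasitic capacitance at the output must be kept very small.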
Fig. 6. Current source structure.
3.2. Current source implementation
The cascode current source structure is adopted in this design, as depicted in Fig. 6. The area of the current cell M1 is determined in a similar way to that described above, and the cascode transistor M2 and the switches M3–M6 are minimized to reduce the high-frequency distortion. In addition, a fast switching method is used to reduce the parasitic capacitance [6].
Although a three-cascode-transistor structure can guarantee high impedance at low frequency, M7 and M8 could still cause large parasitic capacitance due to their large size. Therefore, two additional current paths are added to the sources of M7–M8, so that these two transistors are never completely switched off and the voltages of node 1 and node 3 are much more stable. The parasitic capacitances Cgs of M7–M8 and Cgd of M3–M6 cannot be observed from the output node, so these capacitors do not contribute to distortion. The first observable capacitance due to switching is Cgs of M3–M6, but its effect on the finite output impedance is reduced by the intrinsic gain of both M3–M6 and M7–M8, by a factor of (gm·rout)². The number of current switches is four instead of two. Although the extra switches bring in more parasitic capacitance, they improve high-frequency performance.
3.3. Solutions to the data-dependent clock loading effect
Any latch topology presents a load to the buffer that drives the clock input of the latch [8], and this load depends on whether the state of the latch is about to change or not. The effect becomes significant when a large number of latches switch with different states, and it becomes one of the most important limitations for high-frequency operation. Several circuits are therefore implemented to minimize this effect.
As mentioned in Section 2, row-and-column decoders are used to reduce design complexity and save power. Furthermore, a symmetrical decoding method is adopted, as shown in Fig. 7, to minimize the decoding feed-through effect. The latch shown in Fig. 7 is adopted to synchronize the switch-control signals, because it has the shortest clock-to-Q delay and the steepest transition. Although the transition of digital signals still has a finite settling effect at
Fig. 9. Switch driver circuit.
Fig. 7. Double latches with symmetrical decoding.
Fig. 10. Floor plan for MSB and ULSB current sources.
Fig. 8. Current cell switch timing.
the end of the clock period, it is solved with the double-latch structure. To minimize the sensitivity of the driver to the data-dependent clock load, the clock is buffered locally, which strengthens the clock signal. Besides that, to minimize the intersymbol interference between successive input codes and thereby achieve low distortion, the input data is also locally converted into a differential equivalent and buffered through a successively scaled buffer, as shown in Fig. 7, which also improves the switching speed.
To further reduce the data-dependent clock loading effect, a four-switch method [9] is implemented, as shown in Fig. 6 (M3–M6). In this method, two of the four switches change their states while the other two remain off when the DAC is operating. The input data determines whether the output current flows to one output terminal or the other. It can be seen from Fig. 8 that, regardless of whether the input signal is at low frequency or high frequency, the clock has the same load and node 2 in Fig. 6 suffers the same disturbances all the time.
Meanwhile, the crossing point of the switch-control signals is also important. To prevent NMOS switch transistors
from turning off at the same time, high crossing switch-control
signals are needed, which are provided by the circuit shown in
Fig. 9.
3.4. Layout solutions
In the DAC design, layout matching has a great influence on the performance of the fabricated chip. In order to ensure transistor matching, a special layout technique called ‘INL bounded switching sequences’ [10] is adopted to cancel both the linear and quadratic gradient errors. The layout arrangement of the MSB and ULSB current sources is depicted in Fig. 10.
Another layout consideration is routing. According to Ref. [7], the delay differences caused by clock signal and output signal routing degrade SFDR performance at high-speed operation, so very tight control over relative timing is necessary. As shown in Fig. 11, a binary tree routing method is adopted here for the clock signal and the output signal.
Fig. 11. Clock signal and output signal routing.
Fig. 13. DNL&INL.
Fig. 12. Microphotograph of the proposed DAC.

Table 2. DAC performance summary.
| Parameter | Value |
|---|---|
| Resolution | 14-bit |
| Power supply | 2.5 V (Analog), 1.0 V (Digital) |
| Full-scale current | 16 mA |
| Power @ 1 GS/s | 82 mW |
| DNL | 1.59 LSB |
| INL | 2.50 LSB |
| FOM = I_load² R_load / P_total [7] | 15% |
| Process | 65 nm CMOS |
4. Experimental results
The DAC was fabricated in TSMC 65 nm CMOS technology. Figure 12 is the chip microphotograph, which shows that the core area is 1.2 × 1.3 mm². The measured DNL and INL plots are shown in Fig. 13, in which the measured DNL is +1.59 LSB/−1.26 LSB and the measured INL is +2.5 LSB/−1.06 LSB with off-chip calibration on. It can be seen that the INL is not symmetrical. This is because the current mirrors cause mismatch and the switching from full ULSB to MSB also causes mismatch. Deriving the ULSB currents from the MSB through a current divider, instead of a current mirror, may achieve better performance.
With the programmable digital interpolation filter, the DAC can work in four different modes: non-interpolation mode, 2x-interpolation mode, 4x-interpolation mode, and 8x-interpolation mode, depending on the core update rate. Figure 14(a) shows that the measured SFDR is 70.05 dB for a 120.65 MHz input signal when the update rate is 250 MS/s (non-interpolation mode). Under the 8x-interpolation mode, the output spectra for a 5 MHz input signal and a 56.3 MHz input signal at 960 MS/s are shown in Figs. 14(b) and 14(c), and the SFDRs are 67.13 dB and 64.24 dB, respectively. As can be seen from Fig. 14, SFDR performance is limited by the second-order harmonic distortion. This is attributed to fabrication mismatch and to layout and routing that are not perfectly symmetrical. To minimize the effect, a better layout technique will be needed.
A summarized performance of the programmable DAC
and a comparison between this work and other works are listed
in Tables 2, 3 and 4. The analog part of the DAC consumes
42 mW from 2.5 V power supply while providing 16 mA full
scale current and the digital part consumes 40 mW when the
sampling frequency is 1 GHz.
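The figure of merit of Ref. [7] reported for this work can be reproduced from the numbers above. The sketch reads the FOM as I_load²·R_load/P_total and assumes that I_load is the 16 mA full-scale current and R_load is the 50 Ω effective load seen by the DAC; the function name is hypothetical.

```python
def dac_fom(i_load_a, r_load_ohm, p_total_w):
    """Power delivered to the load divided by total power (FOM read as I^2 * R / P)."""
    return i_load_a ** 2 * r_load_ohm / p_total_w


if __name__ == "__main__":
    # 16 mA full scale, 50 ohm effective load, 42 mW analog + 40 mW digital.
    fom = dac_fom(16e-3, 50.0, 42e-3 + 40e-3)
    print(f"FOM = {100 * fom:.1f} %")
```

With these assumptions the result is about 15.6%, in line with the 15% reported for this work.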
5. Conclusion
This paper presents a programmable 14-bit 1 GS/s DAC, which features a selectable interpolation rate (2x/4x/8x). A “fast switching” technique is adopted to reduce parasitic capacitance at high input frequency, and the data-dependent clock loading effect is minimized by various methods such as symmetrical decoding, double latches, a locally buffered clock and switch sequence control. The layout techniques for the current source transistors, the clock binary tree and the output signal binary tree are implemented to further improve SFDR at high frequency. The full-scale output current is 16 mA and the power
Fig. 14. Output spectra at (a) f_sample = 250 MHz, f_in = 120.65 MHz, (b) f_sample = 960 MHz, f_in = 5 MHz, and (c) f_sample = 960 MHz, f_in = 56.3 MHz.
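Table 3. SFDR performance summary.
| Mode | Core update rate (MHz) | Input signal frequency (MHz) | SFDR (dB) |
|---|---|---|---|
| Non-interpolation | 250 | 5 | 72.39 |
| Non-interpolation | 250 | 120.65 | 70.05 |
| 2x-interpolation | 500 | 5 | 70.65 |
| 2x-interpolation | 500 | 121 | 66.59 |
| 4x-interpolation | 800 | 5 | 68.86 |
| 4x-interpolation | 800 | 97 | 65.05 |
| 8x-interpolation | 960 | 5 | 67.13 |
| 8x-interpolation | 960 | 56.3 | 64.24 |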
Table 4. DAC performance comparison.
| Parameter | This work | ISSCC2009 [6] | ISSCC2004 [9] | ISSCC2011 [3] |
|---|---|---|---|---|
| Tech. (nm) | 65 | 65 | 180 | 90 |
| Area (mm²) | 1.56 | 0.31 | 2.5 | 0.825 |
| Supply (V) | 1/2.5 | 1.1/2.5 | 1.8/3.3 | 1.2/2.5 |
| Resolution | 14 | 12 | 14 | 12 |
| Fclk (MHz) | 960 | 2900 | 1400 | 1250 |
| Swing (mV) | 1600 | 2500 | 1500 | 1200 |
| Power (mW) | 82 | 188 | 200 | 128 |
| SFDR @ 5 MHz (dB) | 72.39/67.13* | 76 | N.A. | 74 |
| SFDR @ 120 MHz (dB) | 70/66.6** | 72 | 67 (@ 261 MHz) | 71 |
| FOM [7] | 15% | 66% | 23% | 10% |
* f_sample = 250 MHz/960 MHz (non-interpolation/8x-interpolation).
** f_sample = 250 MHz/500 MHz (non-interpolation/2x-interpolation).
consumption is 82 mW at 1 GS/s. The converter is designed in TSMC 65 nm technology and has a core area of 1.56 mm².
References
[1] Huang Q, Francese P A, Martelli C, et al. A 200 MS/s 14 b 97 mW DAC in 0.18 μm CMOS. ISSCC Dig Tech Papers, 2004: 364
[2] Chan K L, Galton I. A 14 b 100 MS/s DAC with fully segmented dynamic element matching. ISSCC Dig Tech Papers, 2006: 582
[3] Tseng W H. A 12 b 1.25 GS/s DAC in 90 nm CMOS with >70 dB SFDR up to 500 MHz. ISSCC Dig Tech Papers, 2011: 192
[4] Tseng W H, Wu J T, Chu Y C. A CMOS 8-bit 1.6-GS/s DAC with digital random return to zero. IEEE Trans Circuits Syst I, 2011, 58(1): 1
[5] Van den Bosch A, Borremans M A F, Steyaert M S J, et al. A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A converter. IEEE J Solid-State Circuits, 2001, 36(3): 315
[6] Lin C H, van der Goes F M L, Westra J R, et al. A 12 bit 2.9 GS/s DAC with IM3 < –60 dBc beyond 1 GHz in 65 nm CMOS. IEEE J Solid-State Circuits, 2009, 44(12): 3285
[7] Palmers P, Steyaert M S J. A 10-bit 1.6-GS/s 27-mW current-steering D/A converter with 550-MHz 54-dB SFDR bandwidth in 130-nm CMOS. IEEE Trans Circuits Syst I, 2010, 57(11): 2870
[8] Mercer D A. Low-power approaches to high-speed current-steering digital-to-analog converters in 0.18 μm CMOS. IEEE J Solid-State Circuits, 2007, 42(8): 1688
[9] Schafferer B, Adams R. A 3 V CMOS 400 mW 14 b 1.4 GS/s DAC for multi-carrier applications. ISSCC Dig Tech Papers, 2004: 360
[10] Cong Y, Geiger R L. Switching sequence optimization for gradient error compensation in thermometer-decoded DAC arrays. IEEE Trans Circuits Syst I, 2000, 47(7): 589