Techniques for Low-Power High-Performance ADCs
by
Sunghyuk Lee
B.S., Electrical Engineering, Seoul National University (2005)
S.M., Electrical Engineering and Computer Science, Massachusetts Institute of Technology (2010)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2014
© Massachusetts Institute of Technology 2014. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, January 31, 2014
Certified by: Anantha P. Chandrakasan, Professor, Thesis Supervisor
Certified by: Hae-Seung Lee, Professor, Thesis Supervisor
Accepted by: Leslie A. Kolodziejski, Chair, Department Committee on Graduate Students

Abstract

Analog-to-digital converters (ADCs) are essential building blocks in many electronic systems which require digital signal processing and storage of analog signals. Traditionally, ADCs have been considered power-hungry circuits. This thesis investigates ADC design techniques to achieve high performance with low power consumption. Two designs are demonstrated. The first design is a voltage scalable zero-crossing based pipelined ADC. The zero-crossing based circuit technique is modified and optimized to improve the limited ADC resolution in nano-scaled CMOS technology.
The proposed unidirectional charge transfer scheme allows faster and more energy efficient operation by eliminating unnecessary charging and discharging of the capacitors. Furthermore, the reduced transient disturbance at the beginning of the fine charge transfer phase improves the accuracy of operation. Power supply scaling enhances power efficiency at low sampling rates, much like in digital circuits, and widens the conversion frequency range where the ADC operates with highest efficiency.

The second design is a high speed time-interleaved (TI) SAR ADC with background timing-skew calibration. A time-interleaved structure is employed to improve the effective sampling rate without sacrificing energy efficiency. SAR ADCs are used for each channel to make good use of device scaling. The proposed ADC architecture incorporates a flash ADC operating at the full sampling rate of the TI ADC. The flash ADC output is multiplexed to resolve MSBs of the SAR channels. Because the full-speed flash ADC does not suffer from timing-skew errors, the flash ADC output is also used as the timing reference to estimate the timing-skew of the SAR ADCs.

Thesis Supervisor: Anantha P. Chandrakasan
Title: Professor

Thesis Supervisor: Hae-Seung Lee
Title: Professor

Acknowledgments

During my time at MIT, I have encountered many people who have helped and supported me in completing this research, and I would like to thank them here.

I would like to deeply thank my advisors, Professor Anantha Chandrakasan and Professor Hae-Seung Lee. It has been a tremendous privilege to work with them. Their technical guidance and constant encouragement have allowed me to complete my research successfully. Not only from a technical point of view but also in many other respects, I will benefit greatly in the future from all the experiences that I had with them. I believe that the best and luckiest decision I made at MIT was having them as my advisors.
I would also like to thank Professor Duane Boning for being on my thesis committee. He has given me a lot of very useful comments and suggestions to improve my thesis.

I had the chance to work with Professor Charlie Sodini, Kailiang Chen, and Bonnie Lam on the ultrasound project. It was a great experience for me to explore new areas and share ideas. Also, I was able to learn system level design and optimization in this project.

Jack Chu and SungWon Chung were senior members of my research group. They patiently answered my questions with their knowledge, experience, and intuition. Jack provided many reference papers, shared his design, and allowed me to watch his measurement process. Thanks to his help, I was able to build my skills quickly. SungWon is an expert on RF/high-speed circuits and I relied heavily on his technical advice for my second project. He also shared a lot of his custom pcells, including pads and auto routing units with shielding.

Philip Godoy was very helpful with various tasks over the past few years. He helped with my PCB design and with programming to control the measurement equipment. He also kindly reviewed my poorly written papers many times and corrected the errors in them.

The students in the Lee, Sodini, and Chandrakasan groups have been a wonderful group of people to work with. I will not forget the enjoyable memories with senior members (Albert Chow, Matt Guyton, Mariana Markova, Albert Chang, Sushmit Goswami, Kevin Ryu, Khoa Nguyen, Ivan Nausieda, David He) and junior members (Daniel Kumar, Hyunho Boo, Doyeon Yoon, Xi Yang, Joohyun Seo, Eric Winoker, Grant Anderson, Bruno Do Valle, Sabino Pietrangelo, Meggie Delano). Thanks for making the office a fun and educational place.

Finally, I would like to thank my family members. I really appreciate my wife, Yunmi Kim, for her patience and understanding. My daughter, Seohyun Emily Lee, has given me small joys each day with her big smiles.
I would also like to thank my parents for always loving and supporting me unconditionally. I could never have come this far without them.

Contents

1 Introduction
    1.1 Motivation
    1.2 Thesis Organization
2 Zero-Crossing Based Circuit Technique
    2.1 Zero-Crossing Based Circuits Operation
    2.2 Error Sources of Zero-Crossing Based Circuits
    2.3 Prior Work Based on ZCBC
3 A Voltage Scalable Zero-Crossing Based Pipelined ADC
    3.1 System Level Improvement
        3.1.1 Unidirectional Two-Phase Charge-Transfer
        3.1.2 Voltage Scaling
        3.1.3 Reusable Design
    3.2 Circuit Implementation
        3.2.1 Block Diagram
        3.2.2 Pipelined Stages
        3.2.3 Sampling Switches
        3.2.4 Sub-ADC
        3.2.5 Zero-Crossing Detector
        3.2.6 Current Source
        3.2.7 Multiplying DAC
    3.3 Layout
    3.4 Measurement
        3.4.1 Measurement Setup
        3.4.2 Measurement Result
    3.5 Summary
4 Time-Interleaved ADCs
    4.1 Principle of Time-Interleaved ADCs
    4.2 Issues of Time-Interleaved ADCs
        4.2.1 Offset Mismatches
        4.2.2 Gain Mismatches
        4.2.3 Timing-Skew
    4.3 Research Direction
5 A Time-Interleaved SAR ADC with a Background Timing-Skew Calibration
    5.1 Block Diagram of the Proposed Time-Interleaved SAR ADC
    5.2 Circuit Implementation
        5.2.1 Clock Generator
        5.2.2 Flash ADC
        5.2.3 SAR ADC
    5.3 Timing-Skew Estimation
        5.3.1 Variance Based Estimation
        5.3.2 Correlation Based Estimation
        5.3.3 Limitations
    5.4 Layout
    5.5 Measurement
        5.5.1 Measurement Setup
        5.5.2 Measurement Result
    5.6 Summary
6 Conclusion
    6.1 Thesis Contribution
    6.2 Future Work

List of Figures

1-1 The impact of CMOS device scaling on ADC performance
2-1 Zero-crossing based circuit operation
3-1 Conceptual diagram of two-phase charge transfer: (a) bidirectional scheme, (b) unidirectional scheme, (c) voltage of output and Vx node in time
3-2 Transient effects from current sources
3-3 Block diagram of the proposed pipelined ADC
3-4 Residue plot of (a) the first stage, (b) subsequent stages
3-5 (a) Circuit implementation of the pipelined stages and (b) timing diagram of the pipelined stages
3-6 Schematic of the bootstrap switches
3-7 Simulated on-resistance of (a) bootstrap switches, (b) transmission gate switches at different supply voltages
3-8 Simulated output spectrum of sampled signal with bootstrap switches and transmission gate switches at different supply voltages
3-9 Sub-ADC design: (a) flash ADC structure, (b) comparator, and (c) binary weighted resistor ladder
3-10 Schematic of the zero-crossing detector (ZCD) for unidirectional two-phase charge-transfer
3-11 Simulated frequency response (left) and input referred noise spectrum (right)
3-12 Simulated delay of the threshold detector at different supply voltages
3-13 Schematic of the current source with variable current sourcing capacity
3-14 DAC voltages generated using split capacitors
3-15 Top layout of ADC
3-16 First stage layout of proposed ADC
3-17 Block diagram of the measurement setup
3-18 Test board design for measurement
3-19 48pin 5x5 mm QFN package bonding diagram
3-20 Die photo of GP process chip (left) and LP process chip (right)
3-21 Measured spectrum data and INL/DNL of GP chip at 50MS/s and 1.0V supply voltage, before calibration
3-22 Measured spectrum data and INL/DNL of GP chip at 5MS/s and 0.5V supply voltage, before calibration
3-23 Estimated code gaps (missing codes) are shown over the INL plots of the GP chip
3-24 Measured spectrum data and INL/DNL of GP chip at 50MS/s and 1.0V supply voltage, after calibration
3-25 Measured SNDR/SFDR versus input frequency of GP chip at 50MS/s and 1.0V supply voltage
3-26 Measured spectrum data and INL/DNL of GP chip at 5MS/s and 0.5V supply voltage, after calibration
3-27 Measured SNDR/SFDR versus input frequency of GP chip at 5MS/s and 0.5V supply voltage
3-28 Measured power consumption versus conversion frequency at different supply voltages
3-29 Power breakdown of the GP chip
4-1 Block diagram of TI ADC
4-2 FoM comparison between single channel ADCs and TI ADCs
4-3 Example of offset mismatches in a four channel TI ADC
4-4 Example of gain mismatches in a four channel TI ADC
4-5 Timing-skew error
4-6 Example of timing-skew in a four channel TI ADC
5-1 Block diagram of the proposed TI Flash/SAR ADC
5-2 Block diagram of clock generation
5-3 Schematic of LVDS clock receiver
5-4 Schematic of clock generator
5-5 Schematic of programmable delay block
5-6 Implementation of 4bit flash ADC
5-7 Implementation of 10bit SAR ADC
5-8 Timing diagram of the SAR ADC
5-9 An example of no redundancy: no redundancy between 3 bit flash ADC and 5 bit SAR ADC
5-10 An example of redundancy: 1 bit redundancy between 3 bit flash ADC and 5 bit SAR ADC
5-11 Custom designed unit capacitor of SAR ADC
5-12 Schematic of the SAR ADC comparator with offset calibration
5-13 Schematic of the SAR logic
5-14 A conceptual diagram of the ADC operation: (a) ideal case without timing-skew and (b) realistic case with timing-skew
5-15 Examples of the D_LSAR histogram without timing-skew (left) and with timing-skew (right)
5-16 Behavioral simulation result: Timing-skew vs. VAR(D_LSAR)
5-17 Behavioral simulation result with comparator offsets in the flash ADC: Timing-skew vs. VAR(D_LSAR)
5-18 Behavioral simulation result with comparator noise in the flash ADC: Timing-skew vs. VAR(D_LSAR), using 2^17 data values to calculate variance
5-19 Behavioral simulation result with comparator noise in the flash ADC: Timing-skew vs. VAR(D_LSAR), using 2^20 data values to calculate variance
5-20 Behavioral simulation result with comparator noise in SAR ADC: Timing-skew vs. VAR(D_LSAR)
5-21 Behavioral simulation result: Timing-skew vs. Corr(D_FLASH, D_CHANNEL)
5-22 Behavioral simulation result: Timing-skew vs. Corr(D_FLASH, D_CHANNEL)
5-23 Floor plan and top layout of the ADC
5-24 Layout of the SAR ADC
5-25 Layout of the flash ADC
5-26 Block diagram of the measurement setup
5-27 Test board design for measurement
5-28 Die photo of the time-interleaved SAR ADC
5-29 Measured spectrum before calibration: single channel result (left) and time-interleaved result (right) with a low frequency input signal at 11MHz
5-30 Measured spectrum before calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal at 479MHz
5-31 Measured variance of D_LSAR against coarse delay of SAR ADC sampling clock (top) with the examples of the D_LSAR histogram (bottom) for channel 1
5-32 Measured variance of D_LSAR against fine delay of SAR ADC sampling clock (top) with the examples of the D_LSAR histogram (bottom) for channel 1
5-33 The convergence of the background timing-skew calibration with a simple iterative linear minimum searching algorithm
5-34 Measured spectrum after background timing-skew calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal at 479MHz
5-35 Power breakdown of the chip
5-36 Measured INL/DNL of the ADC: single channel result (left) and time-interleaved result (right)
5-37 Measured spectrum after foreground timing-skew calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal
5-38 Measured SNDR versus input frequency
5-39 Measured SNDR versus input frequency
5-40 Measured SNDR versus input frequency from three different chips
5-41 Timing waveform to avoid crosstalk among the channels that share power lines and reference voltages
5-42 Measured spectrum at 200MS/s: single channel result (left) and time-interleaved result (right) with a low frequency input signal at 5MHz

List of Tables

2.1 Performance summary of prior work based on ZCBC
3.1 Summary of supply voltage scaling of GP process and LP process
3.2 Performance summary and comparison table (L <= 65nm, SNDR > 60dB)
3.3 Performance summary and comparison table (VDD <= 0.6V)
5.1 Performance summary and comparison table

Chapter 1
Introduction

Thanks to the continued scaling of CMOS technology, signal processing in the digital domain has benefited in many respects. Smaller devices with fewer parasitics enhance the maximum speed of operation while also reducing the power consumption. Furthermore, numerous functional blocks and identical blocks in parallel can be implemented on one chip at low cost with a high level of integration. To make use of these advantages, most electronic systems rely heavily on digital processing and tend to replace traditional analog blocks with digital ones.
Because of this, analog-to-digital converters (ADCs), which are required to interface digital systems with the real world, have been studied extensively.

1.1 Motivation

An analog-to-digital converter (ADC) is a device that converts a continuous input signal to a digital output that represents the input amplitude. Among the various metrics of ADCs, sampling rate, resolution, and power consumption are the key metrics that describe the performance of ADCs.

The sampling rate, which is the inverse of the time interval between two consecutive conversions, determines the input bandwidth. High-speed digital oscilloscopes, direct sampling receivers, mm-wave imaging systems, and digitally-equalized data links are typical applications of high sampling rate ADCs. In some applications, high-speed ADCs are used to reduce the quantization noise and to relax the constraints on anti-aliasing filters [1, 2]. For high-speed operation, flash ADCs, folding ADCs, and pipelined ADCs are commonly used.

The ADC resolution specifies the number of quantization levels and therefore determines the theoretical limit of the digital output accuracy for an ideal ADC. Signal to noise and distortion ratio (SNDR), effective number of bits (ENOB), spurious free dynamic range (SFDR), integral non-linearity (INL), and differential non-linearity (DNL) are several different ways of describing the accuracy of the conversion. Applications which require high-resolution ADCs include audio recording, bio-signal acquisition, medical imaging, and communication systems using complex modulation schemes. Sigma-delta modulators, dual-slope ADCs, successive approximation register (SAR) ADCs, and pipelined ADCs are typically used for these applications.

For portable electronic systems powered by a battery, power consumption is one of the most important performance parameters.
Along with research on battery technology, a massive amount of research and development has been performed recently to lower the power consumption of electronic systems and circuits, including ADCs. One interesting parameter in ADCs is the figure of merit (FoM), which is a measure of ADC power efficiency. An example is the FoM proposed by Walden [3]. Although it is an empirical FoM that depends on other metrics, this FoM is well accepted as a standard metric to compare ADCs. Because the FoM tends to favor slow and low-resolution ADCs, it is important to compare FoM among ADCs which have similar speeds and resolutions.

There are other ADC parameters, such as chip area, supply voltage, reference voltage, input impedance, and output latency. However, in general, they have less impact on the design and application of ADCs.

The impact of CMOS device scaling on ADC performance can be seen in published data [4]. (Sigma-delta ADCs are excluded from this data because their properties and applications are very different from those of Nyquist-rate ADCs. For a fair comparison, the bandwidth of time-interleaved (TI) ADCs is divided by the number of channels.) Figure 1-1(a) shows bandwidth versus SNDR of published ADCs. The blue points represent the ADCs implemented in processes of 65nm or finer. The red points show the ADCs designed with a minimum feature size of 180nm or longer. Energy per conversion versus SNDR is also plotted in Figure 1-1(b) to compare the efficiency in terms of FoM. In both plots, dotted lines are added to describe the state of the art of each group. The bandwidth improvement in Figure 1-1(a), especially in the lower SNDR range, and the improved FoM in Figure 1-1(b) are the positive impact of device scaling on ADC performance. The enhanced $f_T$ and the reduced parasitics of the scaled CMOS transistors can explain this improvement. The reduction in switching power and transition time due to the reduced supply voltage has also helped this progress.
However, Figure 1-1(a) also shows the limited SNDR range of the ADCs implemented in modern CMOS technology. The main cause of this limitation is the reduced signal range due to the lowered supply voltage, which exacerbates the noise and linearity issues. The reduced intrinsic gain of transistors is another reason for this limitation: comparators and op-amps with high gain, which are often required for precise quantization, become less ideal. The impact of random variation is more significant on small devices and is another factor that limits the resolution.

The goal of this research is to develop ADCs which exploit the benefits of scaled CMOS technology and overcome its limitations. As the first part of this thesis, a voltage scalable zero-crossing based (ZCB) pipelined ADC is proposed and demonstrated. A zero-crossing based circuit (ZCBC) technique is modified and optimized to improve the limited ADC resolution in nano-scaled CMOS technology. The proposed unidirectional charge transfer scheme allows faster and more energy efficient operation by eliminating unnecessary charging and discharging of the capacitors. Power supply scaling is demonstrated to enhance power efficiency at low sampling rates, much like in digital circuits, and to widen the conversion frequency range where the ADC operates with highest efficiency. The prototype ADC is implemented in 65nm CMOS processes and demonstrates 12-bit resolution at 50MS/s, which is suitable for many applications, such as ultrasound imaging, wireless communication receivers, and portable instrumentation systems.

[Figure 1-1: The impact of CMOS device scaling on ADC performance. (a) Bandwidth vs. SNDR, with reference lines for 1 ps rms and 0.1 ps rms jitter. (b) Energy per conversion vs. SNDR, with a constant-FoM reference line. Both panels distinguish designs with L >= 180nm from designs with L <= 65nm.]

The second part of this thesis is focused on the implementation of an energy efficient high speed ADC. A time-interleaved (TI) structure is used to boost sampling speed while keeping the power consumption low. For each channel, a SAR ADC is designed to make good use of device scaling. The proposed ADC architecture incorporates a flash ADC operating at the full sampling rate of the TI ADC. The flash ADC output is multiplexed to resolve MSBs of the SAR channels. Because the full-speed flash ADC does not suffer from timing-skew errors, the flash ADC output is also used as the timing reference to estimate the timing-skew of the SAR ADCs. The ADC is designed and fabricated in a 65nm CMOS process and achieves 10-bit resolution at 1GS/s.

1.2 Thesis Organization

The outline of this thesis is as follows. Chapter 2 briefly reviews the zero-crossing based circuits (ZCBC) technique, including principles of operation, sources of error, and prior work based on the ZCBC technique.

Chapter 3 describes a voltage scalable ZCB pipelined ADC. Particular attention is given to the proposed modifications which improve the resolution and accuracy of the ADC and enable voltage scaling for high efficiency. The details of the circuit-level implementation are then presented, along with the measurement results that demonstrate a significant improvement in resolution and FoM over the published state of the art.

Chapter 4 explains the time-interleaved ADC structure, including the principles of operation and the advantages and problems of the architecture. The published ideas to address the issues of the TI ADC are also discussed.

Chapter 5 describes a time-interleaved SAR ADC with background timing-skew calibration. The operation of the proposed ADC architecture is first explained at the block-diagram level. Then the transistor-level design of each block is detailed.
The operation of the timing-skew calibration is also discussed, including a behavioral simulation result. The chapter concludes with measurement results of the IC prototype, demonstrating the high speed and high efficiency of the TI SAR ADC.

Finally, Chapter 6 summarizes the contributions of this work and proposes new research projects to further improve ADC performance in nano-scaled CMOS technology.

Chapter 2
Zero-Crossing Based Circuit Technique

Most pipelined ADCs, which are a popular architecture for high speed and high resolution, rely on op-amps for accurate amplification of the residue voltage. This is mainly because charge transfer using op-amps is robust and the result depends only on capacitor ratios when the gain of the op-amp is large. However, the reduced supply voltage and small intrinsic gain of scaled transistors make it difficult and inefficient to implement high-gain and wide-bandwidth op-amps. To avoid the power inefficiency and the implementation challenges of op-amps, recent research has focused on possible substitutes for op-amps. Examples include open-loop amplifiers [5, 6], dynamic source followers [7], gain stages relying on capacitive charge pumps [8], charge-domain operation using a charge coupled device (CCD) [9] or bucket brigade device (BBD) [10], and ring-oscillator based amplifiers [11]. A zero-crossing based circuit (ZCBC) [12, 13] is another technique that replaces op-amps in switched capacitor circuits, including pipelined ADCs.

2.1 Zero-Crossing Based Circuits Operation

The zero-crossing based circuits (ZCBC) technique, shown in Figure 2-1 with a timing diagram, replaces the op-amp with a zero-crossing detector (ZCD) and current sources.

[Figure 2-1: Zero-crossing based circuit operation. (a) Sample, (b) amplify, (c) timing diagram.]
The sampling phase of a ZCBC is identical to the sampling phase of an op-amp based circuit. For bottom plate sampling, $\phi_{1e}$ opens the sampling switch earlier than the input switches. The charge transfer for amplification is divided into three sub-phases: a preset phase $\phi_{2p}$, a coarse charge transfer phase $\phi_{2e1}$, and a fine charge transfer phase $\phi_{2e2}$.

During the preset phase, $C_2$ is connected to the DAC output voltage ($V_{DAC}$) and the output of the stage is connected to the lowest system voltage through the switch S. This causes $v_{OUT}$ to step down, so that $v_X$ after the preset phase is less than $V_{CM}$ over the entire range of input voltages.

When the coarse charge transfer phase $\phi_{2e1}$ starts, a coarse current source $I_1$ is turned on and charges the output for a fast and coarse estimation of the output voltage and virtual ground condition. When the ZCD detects the virtual ground condition, the current source $I_1$ is turned off. The finite delay of the comparator together with the high output ramp rate results in a significant overshoot of the correct value.

During the fine charge transfer phase $\phi_{2e2}$, a fine current source $I_2$ discharges the output to obtain a more accurate measurement of the virtual ground condition and consequently a more accurate value for the output voltage. The fine phase current $I_2$ is much smaller than the coarse phase current $I_1$ and flows in the opposite direction. This substantially reduces the error caused by the comparator delay. When the comparator detects the virtual ground crossing the second time, the sampling switch of the load capacitor $C_{load}$ is opened. This defines the end of amplification and locks the sampled charge on the load capacitance. The $I_2$ current source is turned off slightly after the sampling switch opens, but the extra current it sinks does not disturb the sampled charge because it only discharges the parasitic capacitance at the output node.
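The coarse and fine phases described above can be sketched with a small time-stepped model. This is only an illustration under stated assumptions, not the thesis's simulation: the error is tracked directly at the output node rather than at the virtual ground node, the current sources are ideal, the ZCD has a fixed delay, and all names and component values (`two_phase_transfer`, 1 pF, 100 µA, 5 µA, 200 ps) are hypothetical.

```python
def ramp_until_detected(v, v_target, slope, delay_steps, dt):
    """Ramp v at `slope` (V/s) until it crosses v_target, then keep ramping
    for `delay_steps` more time steps to model the detector delay."""
    rising = slope > 0
    while (v < v_target) if rising else (v > v_target):
        v += slope * dt
    for _ in range(delay_steps):
        v += slope * dt
    return v


def two_phase_transfer(v_ideal, v_preset, i_coarse, i_fine, c_total,
                       t_delay, dt=1e-12):
    """Return (error after coarse phase, error after fine phase) in volts.

    Assumed model: ideal current sources, constant detector delay, and the
    crossing detected directly on the output voltage.
    """
    delay_steps = round(t_delay / dt)
    # Coarse phase: fast upward ramp, overshoots by about ramp_rate * delay.
    v_coarse = ramp_until_detected(v_preset, v_ideal, +i_coarse / c_total,
                                   delay_steps, dt)
    # Fine phase: slow downward ramp, undershoots by a much smaller amount.
    v_final = ramp_until_detected(v_coarse, v_ideal, -i_fine / c_total,
                                  delay_steps, dt)
    return v_coarse - v_ideal, v_final - v_ideal


# Hypothetical values chosen only to make the effect visible.
coarse_err, final_err = two_phase_transfer(
    v_ideal=0.6, v_preset=0.0, i_coarse=100e-6, i_fine=5e-6,
    c_total=1e-12, t_delay=200e-12)
print(f"error after coarse phase: {coarse_err * 1e3:.2f} mV")
print(f"error after fine phase:   {final_err * 1e3:.3f} mV")
```

With these assumed values the coarse overshoot is about 20 mV and the residual error after the fine phase is about -1 mV, which reflects the 20:1 ratio of the two currents.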
As demonstrated in [14, 15, 16], the two charge transfer phases (coarse and fine) can be merged into one charge transfer phase, so that the charge transfer is simplified into two sub-phases: a preset phase and a charge transfer phase. This can enhance the maximum speed of operation at the cost of reduced accuracy. However, in this project, a two-phase charge transfer scheme is adopted to achieve higher accuracy and is modified to minimize the penalty in operation speed.

2.2 Error Sources of Zero-Crossing Based Circuits

When a zero-crossing detector (ZCD) detects the virtual ground condition, all the circuit nodes have the same conditions as in op-amp based amplification. This gives the ZCBC technique the same degree of robustness as op-amp based circuits. However, unlike in op-amp based amplification, where all circuit nodes are settled to the final condition, the voltages of all circuit nodes are moving at that instant due to the constant charging and discharging current. Thus, the error sources in ZCBC are different from those in op-amp based amplification.

One of the major sources of error is the output overshoot caused by the non-zero delay of the comparator. To first order, the overshoot is the product of the ramp rate at the virtual ground condition and the delay of the comparator. The average of the overshoot over the output range corresponds to a fixed offset, and the variation of the overshoot over the output range causes a non-linearity.

$$v_{\mathrm{OVERSHOOT}}(v_{OUT}) = \frac{dv_{OUT}}{dt}(v_{OUT}) \times t_{delay}(v_{OUT}) \tag{2.1}$$

$$v_{\mathrm{OVERSHOOT}}(v_{OUT}) = v_{\mathrm{OFFSET}} + v_{\mathrm{NON\text{-}LINEARITY}}(v_{OUT}) \tag{2.2}$$

$$v_{\mathrm{OFFSET}} = \operatorname{avg}_{v_{OUT}}\left[ v_{\mathrm{OVERSHOOT}}(v_{OUT}) \right] \tag{2.3}$$

$$v_{\mathrm{NON\text{-}LINEARITY}}(v_{OUT}) = \frac{dv_{OUT}}{dt}(v_{OUT}) \times t_{delay}(v_{OUT}) - \operatorname{avg}_{v_{OUT}}\left[ v_{\mathrm{OVERSHOOT}}(v_{OUT}) \right] \tag{2.4}$$

Generally, a fixed offset in ADCs is more tolerable than non-linearity, because a fixed offset can be removed easily by analog or digital means.
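As a numerical illustration of the split of the overshoot into a fixed offset and a non-linearity, the sketch below evaluates the overshoot across the output range and separates its average from its variation. It assumes, for simplicity, a constant comparator delay and a hypothetical current source whose current falls linearly with output voltage through an output resistance `r_out`; the function names and all values are illustrative and are not taken from the thesis.

```python
def overshoot_profile(v_out_points, i0, r_out, c_total, t_delay):
    """Overshoot (V) at each output voltage: ramp rate times delay.

    Assumed model: the charging current drops from i0 as v_out rises,
    i(v) = i0 - v / r_out, so the ramp rate is i(v) / c_total.
    """
    return [(i0 - v / r_out) / c_total * t_delay for v in v_out_points]


def split_offset_nonlinearity(overshoot):
    """Average over the output range is the offset; the rest is non-linearity."""
    offset = sum(overshoot) / len(overshoot)
    nonlinearity = [x - offset for x in overshoot]
    return offset, nonlinearity


# Hypothetical values: 5 uA source, 1 Mohm output resistance, 1 pF, 200 ps delay.
v_points = [i / 10 for i in range(11)]          # 0 V to 1 V output range
overshoot = overshoot_profile(v_points, i0=5e-6, r_out=1e6,
                              c_total=1e-12, t_delay=200e-12)
offset, nonlin = split_offset_nonlinearity(overshoot)
print(f"offset: {offset * 1e3:.3f} mV")
print(f"non-linearity range: {min(nonlin) * 1e6:.1f} uV "
      f"to {max(nonlin) * 1e6:.1f} uV")
```

In this sketch the offset could be subtracted digitally as a single constant, while the residual that varies with output voltage would remain as a code-dependent error.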
In Equation 2.4, compared to the output ramp rate (dvOUT/dt(vOUT)), the comparator delay (tdelay(vOUT)) is a relatively weak function of the output, because the comparator input is always the same when charge transfer ends, regardless of the output. Thus, the non-linear ramp rate caused by the finite impedance of the current source is the dominant source of non-linearity. Treating the delay as a constant tdelay, the offset and non-linearity are given by the following equations.

\begin{align}
V_{OFFSET} &= \overline{\frac{dv_{OUT}}{dt}(v_{OUT})} \times t_{delay} \tag{2.5} \\
v_{NON\text{-}LINEARITY}(v_{OUT}) &= \left( \frac{dv_{OUT}}{dt}(v_{OUT}) - \overline{\frac{dv_{OUT}}{dt}(v_{OUT})} \right) \times t_{delay} \tag{2.6}
\end{align}

A detailed analysis of the non-linearity of the ZCBC technique can be found in [17]. Furthermore, in [18], effects from the finite impedance of the current source are analyzed and compared with op-amp based amplification.

The dominant noise source of a ZCBC is the ZCD. The noise from the current source matters only during the delay of the comparator, because the noise of the current source before the virtual ground condition is reached does not affect the final value of the output. A power-efficient ZCD design for a given noise requirement is studied in [19]. Reference [20] shows a noise-performance comparison between op-amp based circuits and zero-crossing based circuits (ZCBC).

2.3 Prior Work Based on ZCBC

The first prototype of the ZCB pipelined ADC was published in [12]. The op-amp is replaced by a threshold detection comparator and a current source. To achieve high gain in the comparator, several gain stages are cascaded together. Even though the op-amp is eliminated, the power consumption in the multi-stage comparator is considerable. Due to the single-ended implementation, the resolution is limited by the noise from the power supply and the substrate. The overall performance is 8 MS/s with 10b resolution (8.6 effective number of bits (ENOB)) consuming 2.5mW in a 0.18um process. In a later design [13], a more power efficient zero-crossing detector is used to replace the continuous time comparator.
The dynamic zero-crossing detector almost completely removes the static current and significantly improves power efficiency. However, the implementation is still single-ended, once again limiting the resolution. In this design, each capacitor has a current source for sampling and charge transfer. Thus only the mismatched current between the stages flows through the switch, and the voltage drop across the switches with finite on-resistance, which causes nonlinearity, is reduced. This results in a 200MS/s, 8b (6.4b ENOB) pipelined ADC which consumes 8.5mW in a 0.18um process.

To improve the robustness against substrate, power supply, and common-mode noise, a differential design has also been implemented [14, 15, 16, 21]. With the differential implementation, the input range is doubled in spite of the lower power supply voltage. The differential zero-crossing detector is composed of a preamplifier followed by a dynamic threshold-detecting latch. This approach increases the resolution at the expense of static current. For high-speed operation, a large current source is required with high output impedance. However, there is a strong trade-off between current sourcing capacity and output impedance, especially in the single-phase charge transfer scheme. Reference [15] achieves 100 MS/s, 12b (10.2 ENOB) resolution, and 6.2mW power consumption in a 90nm process.

Besides the aforementioned zero-crossing based pipelined ADCs, the zero-crossing based circuit technique has also been applied to other types of ADCs. A delta-sigma ADC employed zero-crossing-based integrators to achieve low power operation in [22, 23]. In [24], a reconfigurable analog circuit is implemented using the ZCBC technique. The ZCBC technique has also been used with an op-amp to make a hybrid correlated level-shifting (CLS)-opamp/ZCBC pipelined ADC [25], which removes the effect of the finite impedance of the current sources. Table 2.1 summarizes the performance of prior work based on the ZCBC technique.
Table 2.1: Performance summary of prior work based on ZCBC
Works compared: Fiorenza [12] (JSSC 2006), Brooks [13] (JSSC 2008), Shin [16] (VLSI 2008), Brooks [14] (JSSC 2009), Chu [15] (VLSI 2010), Hershberg [25] (JSSC 2010), Lajevardi [24] (JSSC 2011), Shin [21] (CICC 2012).
Parameters listed: technology, supply voltage (V), sampling rate (MS/s), resolution (bit), ENOB at Nyquist (bit), power (mW), figure of merit (fJ/step), and charge transfer scheme.

Chapter 3

A Voltage Scalable Zero-Crossing Based Pipelined ADC

A fully differential zero-crossing based ADC is designed and presented here with two major modifications [26]. First, unidirectional two-phase charge-transfer is proposed for high resolution. Second, supply voltage scalability is introduced for better energy efficiency. This chapter focuses on the implementation of the ADC. Although the design and simulation of the prototype ADC were described in the Master's thesis [27], some key design aspects are presented here as background. This chapter also includes the detailed characterization results.

3.1 System Level Improvement

3.1.1 Unidirectional Two-Phase Charge-Transfer

In this work, the two-phase charge-transfer scheme is modified to improve the speed and low voltage operation in scaled CMOS processes. The proposed unidirectional two-phase charge-transfer is faster than the previous bidirectional version, and the conversion speed is comparable to single-phase operation. Figure 3-1 shows two different realizations of the two-phase charge-transfer based on the direction of the two phases: bidirectional and unidirectional approaches.

Figure 3-1: Conceptual diagram of two-phase charge transfer: (a) bidirectional scheme, (b) unidirectional scheme, (c) voltage of the output and vX node in time

For the bidirectional scheme, the output overshoots during the coarse phase, and the fine phase proceeds in the opposite direction [12]. In unidirectional two-phase charge-transfer, on the other hand, the coarse phase ends with a small undershoot and the fine phase proceeds in the same direction.

One advantage of the unidirectional scheme is that the zero-crossing detector (ZCD) can be optimized in one direction, which makes the ZCD simpler and faster. For example, the bidirectional ZCD in [12] is implemented with several cascaded low-gain stages with clamped output range for fast recovery time. In contrast, the unidirectional ZCD in [13] uses one high gain amplifier with a dynamic threshold detector, which is simpler and more efficient. It is possible to use two separate ZCDs, each optimized for one direction [22], but this increases power consumption, area, and circuit complexity. In this work, a ZCD which is optimized for unidirectional two-phase charge-transfer without significant power and area penalty is proposed. The detailed implementation of the ZCD is explained in Section 3.2.

Figure 3-2: Transient effects from current sources

Another reason that the unidirectional charge-transfer scheme is faster is the shorter coarse phase. With the shorter coarse phase, the unidirectional charge-transfer can operate at a comparable speed to the bidirectional approach while providing higher accuracy. In addition, it is more energy efficient than its bidirectional counterpart, because the unidirectional charge-transfer does not charge and discharge the capacitors unnecessarily.
The amount of power saving can be roughly estimated from the magnitude of the overshoot in the coarse charge transfer. According to simulation results, it is typical to have an overshoot of a few tens of millivolts, which is about 10 percent of the maximum output range. This corresponds to a switching power saving of more than 10 percent. The benefit becomes significant at higher conversion frequencies, where the switching power consumption is dominant.

Yet another advantage of the unidirectional charge-transfer is that the transient effects from turning on the fine current source can be reduced. In a two-phase operation, the final accuracy is determined by the fine charge-transfer, thus the current source for the fine phase must settle before the zero-crossing. As shown in Figure 3-2, in the previous bidirectional two-phase charge-transfer scheme, the fine current sources (I2) are turned on at the end of the coarse phase. The turn-on transient of the fine current source must decay sufficiently before the zero crossing. On the other hand, the unidirectional approach allows turning on the fine current source (I2) at the beginning of the coarse phase, making it a part of the coarse current source. This gives the fine current source more time to settle before the zero-crossing without additional power consumption. Thus, the transient disturbance from turning on the fine current sources can be reduced, enabling a shorter fine phase. Although the fine current source can be turned on at the start of the bidirectional charge-transfer as well, this requires the coarse current to be larger to compensate for the fine current source, increasing power consumption.

3.1.2 Voltage Scaling

In digital circuits, when the operating speed requirement is relaxed, supply voltage scaling improves power efficiency due to significantly reduced switching power [28].
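For reference, the benefit of supply scaling in digital circuits follows from the textbook dynamic-power relation P = α·C·VDD²·f. The sketch below simply evaluates that relation; the capacitance, activity factor, and frequency values are hypothetical and are not measurements from this work.

```python
def switching_power(c_switched, vdd, f_clk, activity=1.0):
    """Dynamic switching power in watts: activity * C * VDD^2 * f."""
    return activity * c_switched * vdd ** 2 * f_clk


def energy_per_cycle(c_switched, vdd, activity=1.0):
    """Switching energy per clock cycle in joules: activity * C * VDD^2."""
    return activity * c_switched * vdd ** 2


# Hypothetical example: 1 pF of switched capacitance at two supply voltages.
e_nominal = energy_per_cycle(1e-12, vdd=1.0)
e_scaled = energy_per_cycle(1e-12, vdd=0.5)
print(f"energy ratio (0.5 V vs 1.0 V) = {e_scaled / e_nominal:.2f}")
print(f"power at 1.0 V, 50 MHz = {switching_power(1e-12, 1.0, 50e6) * 1e6:.1f} uW")
print(f"power at 0.5 V,  5 MHz = {switching_power(1e-12, 0.5, 5e6) * 1e6:.2f} uW")
```

The quadratic dependence on VDD is what makes scaling attractive whenever the required clock rate is low enough to be met at a reduced supply.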
However, this benefit from voltage scaling is generally not applicable to analog circuits, because their power consumption is usually dominated by the static power required to meet thermal noise constraints. Recently, mostly-digital ADCs, such as SAR ADCs [29, 30], flash ADCs [31, 32], and ring VCO based sigma-delta ADCs [33], have shown good energy efficiency at low supply voltages because their power consumption is dominated by switching power. However, the architectures chosen for mostly-digital implementation, such as SAR and flash ADCs, have limited resolution or sampling rate. Pipelined ADCs with low supply voltages that employ op-amps have been reported [34, 35]. In these converters, the resolution is limited by the op-amps as the supply voltage is lowered. In contrast, the proposed unidirectional two-phase ZCB pipelined ADC maintains high resolution and efficiency down to a very low supply voltage. This is primarily because the ZCD lacks the gain-bandwidth-stability trade-off which, combined with the output swing requirements, makes it very difficult for op-amps to achieve high performance at low supply voltages.

In this work, to minimize static power consumption, power gating is applied to circuits that consume static power, such as ZCDs and current sources. Additional switches added for the power gating cut off VDD or GND from the circuits during the inactive periods. To reduce the voltage drop across the switches, large power gating switches with lower threshold voltage are used. The gate control signals are buffered to turn the switches on and off quickly. This power gating makes the power consumption largely proportional to the operation frequency and reduces leakage power. Circuits consuming static power without an idle phase are avoided. One exception is the resistor ladder for the reference voltages in the sub-ADCs, where the large decoupling capacitors at the reference voltages make power gating less effective and cause settling issues.
The power consumption of the resistor ladder for the reference voltages in the sub-ADCs is scaled with the sampling rate by a parallel ladder structure, as explained in Section 3.2.

For voltage scaling, all signal levels, such as the input range of each stage, the common-mode voltage, and the reference voltages, are made proportional to the supply voltage. This is shown clearly in Section 3.2 with residue plots.

3.1.3 Reusable Design

Reusable design, which saves design resources and accelerates time to market, has been investigated for a long time by the integrated circuit industry. The rapid evolution of technology and standard protocols has made design reuse more attractive. However, reusable design has been adopted only in a few areas such as digital IP, due to many issues such as layer and layout compatibility, difficulties in IP description, and performance sensitivity to surrounding circuits.

Aside from these common issues, conventional pipelined ADCs based on op-amps have an important stability issue. A stringent stability margin cannot be guaranteed across different fabrication processes due to the variation in supply voltages, different device characteristics, and parasitic loading. In contrast, the stability issue is absent in the ZCBC technique due to the lack of continuous feedback. In this work, as a proof of concept of a reusable design, one design file is used to fabricate chips in two different processes which share the same design rule files. Because the bias voltages of the ZCDs and current sources are designed to be tunable for voltage scaling, the concept of reusable design can be tested without increasing circuit complexity, area, and power consumption.

3.2 Circuit Implementation

3.2.1 Block Diagram

The block diagram of the proposed ADC is shown in Figure 3-3. The ADC consists of seven stages. The first stage is composed of a 3.3 bit sub-ADC (9 comparators) and a 4x MDAC.
The subsequent stages (stage 2 to stage 6) are identical and have a 3 bit sub-ADC and a 4x multiplying DAC (MDAC). They are scaled down by a factor of two from the first stage to reduce area and power consumption. Stages 3 to 6 are not scaled down further because the power consumption in the second stage is already small, so further scaling does not have a large effect on the overall power consumption. A simple 3 bit flash ADC is used for the last stage. The clock generation circuit, bias circuits, and digital circuits for data alignment are also implemented on chip.

Figure 3-3: Block diagram of the proposed pipelined ADC

The residue plots of the first stage and the subsequent stages are shown in Figure 3-4. The first stage residue plot is optimized to increase the input signal range while keeping the output range reasonably small with a moderate gain factor. The wide input range facilitates resolving more bits in the first stage. The reason for keeping the output range of each stage small is to maintain linearity. The input range of the second stage is extended to have redundancy for digital error correction. A 3 bit sub-ADC with 1 bit redundancy is implemented in the second stage. It allows small errors in the sub-ADC output.

Figure 3-4: Residue plot of (a) the first stage, (b) subsequent stages

3.2.2 Pipelined Stages

Figure 3-5(a) shows the pipelined stages of the proposed ADC.
Each stage has a sub-ADC, a zero-crossing detector (ZCD), and current sources. During the sampling phase (φ1), the input signal is sampled on the sampling capacitors C1 and C2. Once the input signal is sampled, the sub-ADC is enabled and the capacitor C1 is split and connected to either VREFP or VREFM based on the digital output of the sub-ADC. The amplification phase (φ2) starts with a brief preset phase (φ2p). The preset signal (φ2p) initializes the output voltages to VDD and GND. During the coarse phase after the preset, the current sources start to charge/discharge the output nodes quickly. While the current sources charge/discharge the output nodes, the ZCD monitors the vx+ and vx- nodes and makes a coarse decision (φ2ee) when the vx+ voltage is close to the vx- voltage. After the coarse decision (φ2ee), the current sources are reduced significantly to lower the output voltage ramp rate for accurate charge-transfer during the fine phase. The fine decision (φ2e) is made when vx+ and vx- cross each other. A more detailed timing diagram is shown in Figure 3-5(b).

Figure 3-5: (a) Circuit implementation of the pipelined stages and (b) timing diagram of the pipelined stages

Because it samples the input signal from an external source directly, the first stage does not need current sources for the input sampling. Similarly, the last stage does not require current sources for the charge-transfer because its output voltage is not sampled.
However, to match the ramping speed of the last stage with the other stages, dummy capacitors and current sources are added.

3.2.3 Sampling Switches

The linearity of the switches located at the input of each stage can limit the overall accuracy. Especially when the supply voltage is scaled down, the overdrive voltage of the switches varies significantly with the input voltage, which degrades the linearity of the sampled signal. Bootstrap switches, shown in Figure 3-6, which have a nearly constant gate-source overdrive voltage without device reliability issues, are used for higher linearity over a wide range of input signals [36, 37].

Figure 3-6: Schematic of the bootstrap switches

The on-resistance of bootstrap switches and transmission gate switches is simulated at different supply voltages and plotted in Figure 3-7. Both switches are sized to have the same resistance at the nominal supply voltage. The simulation result shows that bootstrap switches have much less distortion, especially at low supply voltages. The simulated output spectra of the sampled signal with bootstrap switches and transmission gate switches are plotted in Figure 3-8. Also, current source splitting [13] is used to minimize the non-linear voltage drop across the input switches.

Figure 3-7: Simulated on-resistance of (a) bootstrap switches, (b) transmission gate switches at different supply voltages
Figure 3-8: Simulated output spectrum of sampled signal with bootstrap switches and transmission gate switches at different supply voltages: (a) VDD = 1.0V, 50MS/s, (b) VDD = 0.5V, 5MS/s

3.2.4 Sub-ADC

The sub-ADCs are implemented as flash ADCs, as shown in Figure 3-9(a). The sub-ADCs have separate sampling capacitors to sample the input voltage at the same instant as the MDAC. The timing and charge injection mismatch between the sampling circuit of the sub-ADC and the sampling circuit of the MDAC makes the sub-ADC output inaccurate. The mismatch increases the MDAC residue range and degrades the linearity of the MDAC residue. To minimize this error, the sampling circuit of the sub-ADC is implemented as a scaled version of the MDAC sampling circuit.

Figure 3-9: Sub-ADC design: (a) flash ADC structure, (b) comparator, and (c) binary weighted resistor ladder

A dynamic latch, shown in Figure 3-9(b), is used for the comparators in the flash ADCs. The input NMOS pair of the comparators is sized to reduce offsets. Simulations based on device mismatch show that the over-range protection of the ADC, shown in Figure 3-4(b), is wider than 3σ of the comparator offset variations. Increased meta-stability at scaled supply voltages is addressed by allowing more time to regenerate. Measurement results show that meta-stability is not a limiting factor of the comparator.

The voltage references for the flash ADCs are generated by parallel resistor ladders.
While the maximum conversion frequency goes down much faster than proportionally to the supply voltage, the static current through the resistors scales only linearly with the supply voltage, because the full-scale voltage is scaled proportionally to the supply voltage. Thus, the power consumption in the resistor ladder becomes significant at low supply voltages. To alleviate this problem, parallel resistor ladders are incorporated, as shown in Figure 3-9(c). At the highest supply voltage, all of the switches in Figure 3-9(c) are closed to reduce the resistance of the ladder for faster settling. By opening switches between the resistor ladders, the effective resistance is increased at low supply voltages, and unused resistor ladders are turned off to save power. The total resistance of each ladder is binary weighted for a wide range of adjustability.

3.2.5 Zero-Crossing Detector

Figure 3-10 shows the schematic diagram of the zero-crossing detector (ZCD). The ZCD is composed of a differential preamplifier, threshold detectors, and a level-shifting capacitor (CLS). The preamplifier amplifies the input signal and converts the differential input to a single-ended output. The gain of the preamplifier reduces the effect of noise in the threshold detectors and improves common-mode and power supply rejection. The threshold detectors are implemented as simple inverter chains optimized for fast transition in one direction.

Figure 3-10: Schematic of the zero-crossing detector (ZCD) for unidirectional two-phase charge-transfer

The level-shifting capacitor (CLS) is used to produce the lower threshold for the coarse decision. It is precharged to a constant voltage during the sampling phase (φ1). In the amplification phase (φ2), the precharged level-shifting capacitor (CLS) is inserted between the input nodes of the threshold detectors. Due to the voltage across CLS, the bottom inverter chain in Figure 3-10 always switches its output (φ2ee) earlier than the top inverter chain (φ2e). This early information (φ2ee) is used for the coarse decision to reduce the ramp rate for the fine phase. Although the precharging voltage is provided externally to facilitate testing in the prototype, its integration on chip is fairly straightforward. For example, a DAC can be used to generate the voltage across the level-shifting capacitor (CLS). A feedback control loop that monitors the coarse phase undershoot may be utilized to control the DAC voltage to achieve the optimum undershoot.

After the fine phase (φ2e), the preamplifier is turned off. This power gating of the ZCD saves static power consumption and makes the ZCD power consumption largely proportional to the conversion frequency. Because the preamplifier is shared between the two threshold detectors, the two-phase operation does not increase the static power consumption of the ZCD. Also, since the coarse and the fine phases share the same preamplifier, there is no offset mismatch between the coarse and the fine zero-crossing detection. If two separate preamplifiers were used, the offset mismatch between them would result in a non-optimum coarse phase undershoot.

The noise of the ZCD is mainly determined by the preamplifier. This is because the noise from the threshold detectors and the sampled noise on the level-shifting capacitor are divided by the open-loop gain (~18dB) of the preamplifier. Thus, the noise spectral density of the ZCD can be simplified as 8kTγ(1 + b)/gm, where γ is a noise coefficient that depends on the region of operation, gm is the transconductance of the input differential pair, and b represents noise from the PMOS loads.
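To give a feel for the magnitudes involved, the spectral density above can be converted into an rms input-referred noise once a noise bandwidth is chosen. The sketch below assumes a single-pole preamplifier, whose equivalent noise bandwidth is 1/(4·Rout·Cout); the function name and all device values (gm, Rout, Cout, γ, b, temperature) are hypothetical and are not taken from the design.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant (J/K)


def zcd_input_noise(gm, r_out, c_out, gamma=1.0, b=0.5, temperature=300.0):
    """Estimate the input-referred noise of a single-pole preamplifier.

    Returns (density in V^2/Hz, noise bandwidth in Hz, rms noise in V).
    """
    # Spectral density 8kT*gamma*(1 + b)/gm from the text.
    density = 8 * K_BOLTZMANN * temperature * gamma * (1 + b) / gm
    # Equivalent noise bandwidth of a single-pole response (assumption).
    bandwidth = 1 / (4 * r_out * c_out)
    v_rms = math.sqrt(density * bandwidth)
    return density, bandwidth, v_rms


# Hypothetical values: gm = 1 mS, Rout = 8 kohm (gm*Rout = 8, about 18 dB),
# Cout = 100 fF.
density, bandwidth, v_rms = zcd_input_noise(gm=1e-3, r_out=8e3, c_out=100e-15)
print(f"density   = {density:.3e} V^2/Hz")
print(f"bandwidth = {bandwidth / 1e6:.1f} MHz")
print(f"rms noise = {v_rms * 1e3:.3f} mV")
```

With Cout fixed, the product of density and bandwidth depends on the devices only through gm·Rout, so the estimate does not change as long as the preamplifier gain is held constant.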
The noise power can be calculated by integrating the noise spectral density over the bandwidth, or simply by multiplying it by the effective noise bandwidth [2]. The effective noise bandwidth is (π/2) × 1/(2πRoutCout) = 1/(4RoutCout), where Rout is the output impedance of the preamplifier and Cout is the total capacitance at the output of the preamplifier. Thus, the input referred noise power can be expressed as 8kTγ(1 + b)/gm × 1/(4RoutCout) = 2kTγ(1 + b)/(gmRoutCout). Note that gmRout is the DC gain of the preamplifier, and it is the only variable that controls the input referred noise power of the ZCD. Thus, if the gain of the ZCD is maintained, the noise power of the ZCD does not change at scaled supply voltage. This is confirmed by the simulated frequency response and input referred noise spectrum of the ZCD, shown in Figure 3-11.

Figure 3-11: Simulated frequency response (left) and input referred noise spectrum (right)

Although the noise power of the ZCD is maintained, the noise contribution of the ZCD is higher at scaled supply voltage due to the reduced signal power. The offsets of the preamplifier are compensated by a programmable PMOS load [14], which is not shown in the figure for simplicity.

The threshold detectors are optimized for unidirectional detection of the virtual ground crossing. Highly unbalanced inverters in the threshold detectors enhance the transition speed in one direction.
In the fine phase, due to the increased delay in the threshold detector caused by the slow ramp rate, the overall ZCD delay is dominated by the delay of the threshold detector. The simulated delay of the threshold detectors is plotted in Figure 3-12 with supply voltage scaling. The values are normalized to the delay of the unidirectional case at the highest supply voltage. The plot shows that the delay can be reduced by about 2.5 times through proper sizing of the threshold detector at the highest supply voltage, and the improvement becomes more significant at lower supply voltages.

Figure 3-12: Simulated delay of the threshold detector at different supply voltages

3.2.6 Current Source

For the two-phase charge-transfer, the current sources are designed to change their current sourcing capacity between the coarse and the fine phases. The capacity can be changed either by changing the bias voltage of the current sources or by turning parts of the current sources on and off. Changing the bias voltage saves area, because one current source can be used for both the coarse and the fine phase. However, changing the bias voltage of the current sources causes transient effects which limit current source linearity or speed. In this work, the current sourcing capacity is changed by turning parts of the current sources on and off.

Figure 3-13 shows the schematic of the current source used in this work. All the current sources in Figure 3-5 are composed of six small identical unit current sources. All six unit current sources are turned on during the coarse phase (φ2ee) and only one of them remains turned on during the fine phase (φ2e). Because the current source for the fine phase is turned on at the beginning of the coarse phase, the transient effects of turning the fine current source on are minimized when the fine charge-transfer starts.
Figure 3-13: Schematic of the current source with variable current sourcing capacity

The unit current sources (M1 - M4) are cascoded, and the gates of the cascode transistors (M2, M3) are controlled by switches (M7, M8) to turn the current source on and off. The additional switches (M5, M6) help turn the current source off quickly by discharging the parasitic capacitance at the source of the cascode transistors. Although the unidirectional charge-transfer scheme needs only one type of current source, all current sources in this design are made fully symmetric between the true and the complementary signal paths, and the unused current sources are disabled to improve high frequency supply rejection [14].

3.2.7 Multiplying DAC

As shown in Figure 3-5, standard switched-capacitor multiplying DACs (MDACs) are used in this design. The 4x multiplying factor is determined by the capacitor ratio between the sampling capacitors (C1 and C2, 12 unit capacitors) and the feedback capacitors (C2, 3 unit capacitors). The unit capacitor value of the first stage is chosen to be about 80fF so that thermal noise does not limit the SNR. For better matching between the unit capacitors, dummy capacitors are added around the sampling and feedback capacitors. The DAC voltages are generated using split capacitors, which are connected to either VREFP or VREFM, as shown in Figure 3-14.

Mismatches between the unit capacitors in the MDACs cause stage-gain error and non-linearity of the DAC voltage. These non-idealities lead to missing or overlapping digital output codes and degrade the overall linearity of the analog-to-digital conversion. In this work, a simple foreground digital calibration scheme, a decision boundary gap estimation algorithm [18], is applied to the first stage MDAC.
The gaps are estimated by minimizing local variations of the output code density around the decision boundary, and the error correction is done by shifting the output codes by the amount of the estimated gaps.

Figure 3-14: DAC voltages generated using split capacitors

3.3 Layout

Because ADCs are very sensitive to device matching and noise, the floor plan and the detailed layout should be done carefully. The overall layout of the proposed ADC is shown in Figure 3-15. All stages are placed in a row so that signals can go through all stages over a short distance. Other peripheral blocks, such as bias generation, sub-ADC reference voltages, and the digital data collecting circuit, are placed close to the ADC core.

Figure 3-15: Top layout of ADC

The layout of the first stage is shown in Figure 3-16. Sub-blocks in a stage are placed in such a way as to maximize symmetry and to minimize interconnect lengths. Differential circuits, such as comparators and ZCDs, are carefully laid out for symmetry. Also, deep-Nwell and Nwell guard rings are placed around the ZCDs, which are sensitive to noise.

Metal-oxide-metal (MOM) capacitors are used for the capacitor arrays (sampling and feedback capacitors). In the process used for this project, MOM capacitors using all metal layers have larger capacitance per area than MIM capacitors. The high capacitance density of the MOM capacitors helps to minimize the layout area. However, due to the proximity to the substrate, MOM capacitors have larger parasitic capacitance at both terminals. To reduce parasitic capacitance, the metal1 (M1) layer is not used for the capacitor arrays. Also, to improve matching between the capacitors, dummy capacitors surround the capacitor arrays. Subsequent stages have almost the same layout as the first stage, except that they are scaled down 2x.
Figure 3-16: First stage layout of proposed ADC

3.4 Measurement

3.4.1 Measurement Setup

The performance verification of a high resolution ADC is challenging. The noise and interference from the measurement setup must be kept well below the ADC resolution. The block diagram of the measurement setup for this work is shown in Figure 3-17. A synthesized signal generator is used to generate a low noise input signal. To reduce the distortion and noise of the signal source, a bandpass filter is inserted between the signal generator and the test board. A CMOS clock with a programmable amplitude is generated by a clock generator and provided to the test board. The configuration bits on the ADC chip, which are implemented by a flip-flop chain, are programmed by a pattern generator. The CMOS output signals of the ADC are captured by a logic analyzer triggered by the output clock of the ADC.

Figure 3-17: Block diagram of the measurement setup (blocks shown: Agilent E3646 DC power supply, Agilent 8644B signal generator, TTE bandpass filter, SRS CG635 clock generator, Tektronix TLA715, TLA7AA2 logic analyzer, TLA7PG2 pattern generator, Matlab, test board)

Figure 3-18 shows the test board for the chip. A single-ended input signal is converted to a differential signal on the PCB using off-the-shelf transformers (ADT16T, Mini-Circuits). Nine linear regulators (LT3021, Linear Technology) are used to generate supply voltages, reference voltages, and the common-mode voltage. For the supply voltage scaling, the output voltage of the regulators is adjusted by a variable resistor. An accurate 1 ohm resistor is added in series at the output of the regulator to measure the power consumption.

Figure 3-18: Test board design for measurement

Chips are wire bonded in a 48pin 5x5mm QFN package, as shown in Figure 3-19.
The ground pad at the center of the QFN package is used for all ground connections. This reduces the required number of package pins and allows the use of a small package with fewer parasitics. Also, the short bonding wires for ground reduce ground bounce and make the on-chip ground quieter. The die is intentionally placed off-center and close to the top, because the sensitive analog signals, such as the input, reference voltages, and supply voltage, are routed to the top of the chip. To reduce the pin count further, dual data rate output is used for the digital outputs.

Figure 3-19: 48pin 5x5mm QFN package bonding diagram

A prototype ADC is fabricated in both 65nm GP (general purpose) and LP (low power) CMOS processes with one design file. Because these two CMOS processes have the same minimum feature size, layout layers, and design rules, one design file can be used for both processes. The die photos of both the GP process and LP process chips are shown in Figure 3-20. The active area of the ADCs is highlighted in Figure 3-20 and occupies 0.36mm^2. The unused area of the chip is filled with decoupling capacitors.

Figure 3-20: Die photo of GP process chip (left) and LP process chip (right)

3.4.2 Measurement Result

The total quantization level of the ADC is 15.3 bits. The first stage resolves 3.3 bits and subsequent stages (stage 2 - 6) contribute two effective bits per stage. The 15.3 bit raw data are combined into a 12 bit quantization level in Matlab before the performance calculation.

Before Calibration

Figure 3-21 shows the output spectrum and INL/DNL of the GP chip at 1.0V nominal supply.
At 50MS/s with a 23.9MHz input signal, 63.3dB SNDR (10.2 ENOB), 67.3dB SFDR, 4 LSB INL and 0.5 LSB DNL (in 12 bit LSBs) are achieved. The performance at 0.5V scaled supply voltage is shown in Figure 3-22. At 5MS/s with a 2.1MHz input signal, 63.5dB SNDR (10.3 ENOB), 68.4dB SFDR, 4 LSB INL and 0.5 LSB DNL (in 12 bit LSBs) are achieved.

Figure 3-21: Measured spectrum data and INL/DNL of GP chip at 50MS/s and 1.0V supply voltage, before calibration

Figure 3-22: Measured spectrum data and INL/DNL of GP chip at 5MS/s and 0.5V supply voltage, before calibration

Calibration Process

The INL plots in Figure 3-21 and Figure 3-22 are plotted again in Figure 3-23 for comparison. The two INL plots are very similar to each other. In particular, multiple abrupt jumps are observed at the same digital output codes. Further analysis shows that the jumps coincide with the thresholds of the first stage subADC. Further measurement results reveal that the error is independent of conversion speed and supply voltage. This can be explained by the capacitor mismatches in the first stage MDAC.

The calibration scheme described in Section 3.2.7 for MDAC capacitor mismatches is implemented off-chip using Matlab. A calibration code (or estimated code gap) is shown in Figure 3-23 with the INL plots [38]. The calibration code is estimated at the 15.3 bit quantization level and applied to the raw output code. Once the capacitor mismatch calibration codes are determined, only a few digital logic blocks and adders are required to run at the conversion rate to shift the digital output codes by the amount of the calibration codes. The power consumption of the digital block for calibration is estimated by simulation.

Figure 3-23: Estimated code gaps (missing codes) are shown over the INL plots of the GP chip at 1.0V VDD and 0.5V VDD (estimated gaps annotated in the figure: 15, 18, 18, 7, 20, 8, 4, 8, 10)

After Calibration

Figure 3-24 shows the output spectrum and INL/DNL of the GP chip at 1.0V nominal supply after calibration. At 50MS/s with a 23.9MHz input signal, 67.7dB SNDR (11.0 ENOB), 85.8dB SFDR, 1.2 LSB INL and 0.4 LSB DNL (in 12 bit LSBs) are achieved. The power consumption is 4.07mW, which corresponds to a 41.0fJ/step FoM. The SNDR/SFDR versus input frequency at 50MS/s and 1V supply voltage is shown in Figure 3-25.

Figure 3-24: Measured spectrum data and INL/DNL of GP chip at 50MS/s and 1.0V supply voltage, after calibration

Figure 3-25: Measured SNDR/SFDR versus input frequency of GP chip at 50MS/s and 1.0V supply voltage

The performance at 0.5V scaled supply voltage is shown in Figure 3-26. The SNDR degrades only slightly to 66.3dB (10.7 ENOB) at 0.5V.
A 78dB SFDR, 1.0 LSB INL and 0.4 LSB DNL (in 12 bit LSBs) are achieved at 5MS/s with a 2.1MHz input signal. The power consumption is 0.237mW, which corresponds to a 28.0fJ/step FoM. The SNDR/SFDR versus input frequency at 5MS/s and 0.5V supply voltage is shown in Figure 3-27.

Figure 3-26: Measured spectrum data and INL/DNL of GP chip at 5MS/s and 0.5V supply voltage, after calibration

Figure 3-27: Measured SNDR/SFDR versus input frequency of GP chip at 5MS/s and 0.5V supply voltage

For the chip built in the LP CMOS process, similar performance was obtained at higher power supply voltages, as expected. At 1.2V nominal supply voltage, 68.1dB SNDR and 83.2dB SFDR are achieved with 4.93mW at 50MS/s, which corresponds to a 47.5fJ/step FoM. Scaling the supply voltage down to 0.8V also improves the FoM, to 37.8fJ/step.

Although the design is mainly simulated and optimized with GP process models, the two chips fabricated in the GP and LP processes achieve similar measured performance except for the scalable supply voltage range. The major differences between the GP and LP processes are the nominal supply voltage ($|V_{DD,GP}| < |V_{DD,LP}|$) and the threshold voltage ($|V_{th,GP}| < |V_{th,LP}|$). Due to the higher threshold voltage, the supply voltage scalability of the LP process is limited to 0.8V, which is the main difference between the two chips. For the same reason, the LP chip requires a higher supply voltage to operate at the same conversion frequency, which increases power consumption and FoM at the same performance.
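The FoM values quoted above can be cross-checked with a short calculation. The sketch below assumes the commonly used energy-per-conversion-step definition, FoM = P / (2^ENOB x fs) with ENOB = (SNDR - 1.76) / 6.02; the text does not spell out the formula, so this definition and the function names are assumptions made for illustration. Under that assumption the quoted figures are reproduced to within rounding.

```python
import math


def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB, assuming ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def fom_fj_per_step(power_w, sndr_db, fs_hz):
    """Energy per conversion step in fJ, assuming FoM = P / (2^ENOB * fs)."""
    return power_w / (2.0 ** enob_from_sndr(sndr_db) * fs_hz) * 1e15


# (label, power in W, SNDR in dB, conversion rate in Hz), all taken from the text
operating_points = [
    ("GP, 1.0V, 50MS/s", 4.07e-3, 67.7, 50e6),
    ("GP, 0.5V, 5MS/s", 0.237e-3, 66.3, 5e6),
    ("LP, 1.2V, 50MS/s", 4.93e-3, 68.1, 50e6),
]

for label, power_w, sndr_db, fs_hz in operating_points:
    enob = enob_from_sndr(sndr_db)
    fom = fom_fj_per_step(power_w, sndr_db, fs_hz)
    print(f"{label}: ENOB = {enob:.2f} bit, FoM = {fom:.1f} fJ/step")
```

Because the FoM divides power by both the conversion rate and 2^ENOB, a tenfold drop in conversion rate must be accompanied by more than a tenfold drop in power for the FoM to improve, which is what the supply scaling achieves here.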
Table 3.1 summarizes the supply voltage scaling results of the GP and LP chips. As the supply voltage is scaled down, the maximum conversion frequency decreases, but the energy efficiency (FoM) is improved due to the significant reduction in switching power. This demonstrates the idea that the supply voltage can be scaled to improve energy efficiency when the sampling rate requirements are relaxed, as in most digital circuits. The prototype maintains functionality for further supply voltage scaling down to 0.4V (GP) / 0.7V (LP), but insufficient settling and leakage in the sampling switches at low sampling rates degrade SNDR and FoM.

Table 3.1: Summary of supply voltage scaling of GP process and LP process

                              65nm GP process                     | 65nm LP process
Supply voltage (V)            1.0     0.8     0.6     0.5     0.4   | 1.2     1.0     0.8     0.7
Input range (Vdiff)           1.67    1.33    1.00    0.83    0.67  | 2.00    1.67    1.33    1.16
Conversion rate (MS/s)        50      30      10      5       2     | 50      30      10      4
SNDR* before cal. (dB)        63.3    63.1    63.7    63.5    56.5  | 63.9    63.5    64.2    62.6
SNDR* after cal. (dB)         67.7    66.5    66.9    66.3    61.4  | 68.1    66.7    67.1    63.9
SFDR* before cal. (dB)        67.3    68.6    68.8    68.4    65.9  | 67.6    68.6    68.3    68.6
SFDR* after cal. (dB)         85.8    83.9    81.0    78.0    78.1  | 83.2    87.4    78.4    73.7
Power** (mW)                  4.07    1.61    0.559   0.237   0.112 | 4.93    2.15    0.699   0.385
Estimated cal. power (uW)     6.03    2.38    0.606   0.281   0.125 | 8.91    3.81    1.11    0.441
FoM (fJ/step)                 41.0    31.1    30.9    28.0    58.3  | 47.5    40.6    37.8    75.2

* SNDR/SFDR is measured at Nyquist input frequency.
** Power consumption of the bias and clock distribution circuit is included.

To highlight the supply voltage scaling effects, the power consumption of the GP process chip versus conversion frequency is plotted at different supply voltages in Figure 3-28. The maximum operating frequency at each supply voltage is determined such that the SNDR degradation is less than 1.5dB from the nominal supply voltage result. Each curve shows the reduction of power consumption due to the reduced activity rate of digital circuits and the reduced duty cycle of the power gated circuits as the conversion frequency goes down. However, the power reduction in each curve is less than proportional to the conversion frequency decrease due to the unavoidable static power, such as currents through a resistor ladder for sub-ADC references and bias currents of ZCDs and current sources. This means that the efficiency, or FoM, gets worse when the ADC operates below its maximum operating frequency. Thus, a more significant power saving is achieved by scaling the supply voltage until the required speed equals the maximum operating frequency at the scaled supply voltage. For example, when a 10MS/s conversion speed is required, a 3.5x power saving is achieved by scaling the supply voltage to 0.6V, compared to simple duty cycling of the circuit. For applications that require a conversion frequency lower than 5MS/s, the supply voltage scaling should stop at 0.5V to maintain 12 bit accuracy of the operation, but the ADC can operate at any frequency lower than 5MS/s by duty cycling.

Figure 3-29 shows the power breakdown of the GP chip. Because of the merged power supply on chip, the power consumption of the bias and the ZCD cannot be measured separately, but is estimated from simulation and various measurement results. Compared to the nominal supply voltage case, the power consumption of the peripheral circuits, such as the subADC resistor ladder and bias, takes a significant portion of the power in the scaled supply voltage case. This suggests that better control of the power in those components can improve the efficiency further in the scaled supply voltage case.
Figure 3-28: Measured power consumption versus conversion frequency at different supply voltages

Figure 3-29: Power breakdown of the GP chip (panels: 1V VDD, 50MS/s, total power = 4.07mW; 0.5V VDD, 5MS/s, total power = 0.237mW)

Table 3.2 summarizes the performance with a comparison to previously published works [4] with L<=65nm, SNDR >60dB. This work achieves high resolution and good efficiency simultaneously. The performance of ADCs operating at lower than 0.6V supply voltage is summarized in Table 3.3.

Table 3.2: Performance summary and comparison table (L<=65nm, SNDR >60dB)

                        This work              ISSCC 2013   VLSI 2010   ISSCC 2013   ISSCC 2008
                        2011                   Kapusta      Lee         Harpe        Hsueh
Architecture            ZCB PIPE               TI SAR       PIPE/SAR    SAR          PIPE
Technology              65nm(GP)   65nm(LP)    65nm         65nm        65nm         65nm
Supply voltage (V)      1.0        1.2         1.2          1.3         0.6          1.0
Fs (MS/s)               50         50          80           50          0.04         200
Resolution (bit)        12         12          14           12          12           11
SNDR @ Nyquist (dB)     67.7       68.1        71.3         64.4        62.6         59.9
Power (mW)              4.07       4.93        31.1         3.5         0.0001       180
FoM (fJ/step)           41.0       47.5        129.5        51.8        2.2          1114.2

Table 3.3: Performance summary and comparison table (VDD <= 0.6V)

                        This work   ISSCC 2013   ISSCC 2011   VLSI 2009   VLSI 2007   ASSCC 2006   ISSCC 2008
                        2011        Harpe        Yip          Liu         Shen        Gambini      Daly
Architecture            ZCB PIPE    SAR          SAR          SAR         PIPE        SAR          FLASH
Technology              65nm(GP)    65nm         65nm         130nm       90nm        90nm         180nm
Supply voltage (V)      0.5         0.6          0.55         0.6         0.5         0.5          0.4
Fs (MS/s)               5           0.04         0.02         10          10          1.5          0.4
Resolution (bit)        12          12           10           10          8           6            6
SNDR @ Nyquist (dB)     66.3        62.6         55.0         51.8        48.1        35.3         32.5
Power (uW)              237         0.1          0.21         46          2400        14           1.66
FoM (fJ/step)           28.0        2.2          22.4         9.6         1150        240          125

The measured accuracy matches well with the expected result from simulations. The output spectrum plots indicate that SNDR is not limited by the non-linearity of the MDAC, but rather by noise. Input capacitors (12 x 80fF on both the positive and negative sides) are sized to achieve an SNR of 76dB (~12.3 ENOB).
Zero-crossing detectors (ZCDs) in the first stage are designed for 74dB (~12 ENOB) SNR. With 12 bit quantization noise, the theoretically achievable SNR (70dB) is close to the measured SNDR (68dB). However, the maximum conversion frequency that maintains the expected resolution is lower than in the simulation results. Increased delay in the ZCDs due to parasitics can degrade the maximum operating speed. Especially for the proposed unidirectional two-phase operation, the speed at which the coarse current source turns off is another factor that affects operation speed. Because the settling time of the externally provided reference voltages is sensitive to parasitic bondwire inductance, this can also be a reason for the reduced maximum speed.

3.5 Summary

This work describes a unidirectional two-phase zero-crossing based (ZCB) circuit to implement low power, high resolution ADCs in advanced technologies. Compared to the previous bidirectional two-phase ramp approach, the proposed unidirectional charge-transfer scheme allows faster and more energy efficient operation by eliminating unnecessary charging and discharging of the capacitors. Also, the reduced transient disturbance at the beginning of the fine charge-transfer phase improves the accuracy of the operation. The power supply scaling improves power efficiency at low sampling rates, much like in digital circuits, and widens the conversion frequency range where the ADC operates with highest efficiency.

Chapter 4

Time-Interleaved ADCs

Although device scaling has helped to increase the conversion rate of single channel ADCs, the improvement is still not enough to meet the requirements of recent high-speed applications. A time-interleaved (TI) ADC architecture can increase the effective conversion rate of an ADC by multiplexing several ADCs in parallel. In this chapter, the principles and issues of TI ADCs will be reviewed briefly with several examples of previous work.
4.1 Principle of Time-Interleaved ADCs

Figure 4-1 shows the block diagram of the time-interleaved (TI) ADC in [39], which was the first demonstration of a TI ADC. A TI ADC has several individual ADCs, each of which samples the input signal at interleaved timing. When the digital outputs are combined and placed in an interleaved sequence, the ADCs function effectively as a single converter operating at a higher conversion rate. It is notable that the TI structure can be applied to most ADC architectures including flash [40], folding [41], pipelined [42], SAR [43], single slope [44], and sigma-delta ADCs [45]. The following discussion briefly outlines the effects of a TI structure on the ADC performance metrics.

Figure 4-1: Block diagram of TI ADC [39]

The effective conversion rate of a TI ADC ($f_s$) can be expressed as

$$f_s = N \cdot f_c$$

where $f_c$ is the conversion rate of each channel and $N$ is the number of interleaved channels. Simply, the effective conversion rate can be increased linearly by interleaving more channels.

Power efficiency is the biggest benefit of a TI ADC. Generally, the power consumption of an ADC grows much faster than proportionally to the conversion rate of the ADC. However, in the time-interleaved structure, the effective conversion rate can be increased by N times at the cost of only N times the power consumption. Although there is some overhead, such as an interleaved clock generator and a digital output combiner, it is typically an insignificant portion of the overall power consumption. In Figure 4-2, the power efficiency of the published ADCs [4] is plotted in two groups: single channel ADCs and TI ADCs. It is observed that only TI ADCs have achieved conversion rates higher than 10 GS/s. Also, TI ADCs have demonstrated the best FoMs for sampling speeds higher than 500 MS/s.

Ideally, the resolution of TI ADCs is the same as the resolution of the ADC in each channel.
However, in reality, mismatches between the channels introduce additional error sources in TI ADCs that limit the accuracy of the analog-to-digital conversion. Because most errors from TI structures appear as harmonics and spurs in the output spectrum, they are especially detrimental to applications which require high SFDR. The impact of each error source and published calibration methods are discussed further in Section 4.2.

Figure 4-2: FoM comparison between single channel ADCs and TI ADCs

The effective sampling rate of the TI ADC cannot be increased arbitrarily due to fundamental and practical reasons. The sample and hold (S/H) circuits shown in Figure 4-1 present the fundamental limitation. Unlike the ADCs, which process the sampled data, the S/H should be fast enough to track the input signal accurately up to the speed and resolution of the TI ADC. Thus, the ultimate effective sampling rate is limited by the bandwidth of the S/H circuit.

Furthermore, the area and on-chip wiring of a TI ADC set a practical limitation on the number of channels. Because of the parallel structure, the area and cost of TI ADCs increase linearly with the number of channels. Also, it is difficult to connect sensitive signals, such as the input and the clock, over a wide area without adding mismatch and noise. Thus, increasing the number of interleaved channels can worsen the performance degradation introduced by the TI structure.

4.2 Issues of Time-Interleaved ADCs

Offset and gain mismatches and timing-skew between the interleaved channels are major issues of the time-interleaved ADCs.
These issues are investigated extensively in both circuit design and signal processing research [46, 47, 48, 49, 50, 40, 51]. In this section, instead of a detailed mathematical perspective, the cause and impact of the errors are explained with simple diagrams in the time and frequency domains.

4.2.1 Offset Mismatches

The offset of an ADC manifests itself as a static value added at the input regardless of the input signal. A fixed amount of error caused by random variation, such as charge injection from the sampling switches and the offsets of comparators and amplifiers, can cause ADC offsets. The offset of a single channel ADC appears as a DC signal in the frequency domain and does not degrade the dynamic performance of the ADC. In TI ADCs, however, the offset mismatches between channels generate tones at frequencies that are multiples of $f_s/N$.

Figure 4-3 shows an example of the offset errors in a 4-channel TI ADC. In the time domain, the error from the offset mismatches is periodic with a frequency of $f_s/4$. Thus, this error appears as multiple tones at the frequencies of $0$, $f_s/4$, $2f_s/4$, and $3f_s/4$ in the frequency domain. Because the offset error does not vary with the input signal, the amplitude and frequency of the error tones are independent of the input signal. Another way of analyzing TI ADC offset error is that the mean of the offsets of the interleaved channels is the offset of the TI ADC, and the deviation of the channel offsets from this mean causes spurs at frequencies of $f_s/4$, $2f_s/4$, and $3f_s/4$.

The offset of each ADC channel can be defined as the y intercept of the best fit line of each ADC transfer function. The calibration of offset mismatches can be done in either the analog domain or the digital domain. Digital offset calibration means an addition or subtraction of an offset value at the digital output. This can be done easily with low power consumption, but, due to the quantization level, the remaining error can be as large as 1/2 LSB of the ADC.
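The digital offset calibration described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the channel offsets, the sine-wave test input, the record length, and all function names are hypothetical and do not come from the thesis. It estimates each channel's offset from its mean output code, corrects it with an integer code shift, and shows that the residual mismatch stays within 1/2 LSB.

```python
import math


def ti_adc_codes(n_samples, cycles, amplitude, offsets):
    """Integer output codes of an ideal TI ADC whose channels have offsets (in LSB)."""
    n_ch = len(offsets)
    return [
        round(amplitude * math.sin(2 * math.pi * cycles * n / n_samples) + offsets[n % n_ch])
        for n in range(n_samples)
    ]


def channel_means(codes, n_ch):
    """Mean output code of each interleaved channel."""
    return [sum(codes[ch::n_ch]) / len(codes[ch::n_ch]) for ch in range(n_ch)]


def digital_offset_corrections(codes, n_ch):
    """Integer code shift per channel that moves each channel mean toward the overall mean."""
    means = channel_means(codes, n_ch)
    overall = sum(means) / n_ch
    return [round(m - overall) for m in means]


# Hypothetical per-channel offsets in LSB (assumed values, four channels)
OFFSETS_LSB = [3.3, -1.8, 0.8, -2.3]
N_CH = len(OFFSETS_LSB)

# An odd number of input cycles in a power-of-two record gives every channel
# the same set of input phases, so the channel means differ only by the offsets.
codes = ti_adc_codes(16384, 1021, 400.3, OFFSETS_LSB)
corrections = digital_offset_corrections(codes, N_CH)
corrected = [c - corrections[n % N_CH] for n, c in enumerate(codes)]

after = channel_means(corrected, N_CH)
residual = [m - sum(after) / N_CH for m in after]
print("corrections (LSB):", corrections)
print("residual mismatch (LSB):", [round(r, 2) for r in residual])
```

The correction is a single integer addition per sample, which is why it is cheap in hardware, but the same integer constraint is what leaves a residual of up to 1/2 LSB.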
On the other hand, the offset calibration in the analog domain has better accuracy at the cost of additional power and increased circuit complexity [14, 52, 30].

Figure 4-3: Example of offset mismatches in a four channel TI ADC

4.2.2 Gain Mismatches

The gain of an ADC is the slope of the best fit line of the ADC transfer function. Ideally, the gain is equal to 1, but several error sources, such as capacitor mismatches and variations in the reference voltage, cause gain error. In a single channel ADC, gain error changes the amplitude of the output signal, but does not degrade the dynamic performance of the ADC.

The effect of gain mismatches in a four channel TI ADC is described in Figure 4-4. Errors from gain mismatches are $\Delta g_i \cdot v_{in}$, where $\Delta g_i$ is the gain error of the $i$th channel and $v_{in}$ is the input signal. In the time domain, this error can be analyzed as a product of the periodic sequence of $\Delta g_1, \Delta g_2, \Delta g_3, \Delta g_4$ and the input signal $v_{in}$. Thus, the result in the frequency domain can be calculated by the convolution of the two spectra. Because the periodic sequence of $\Delta g_1, \Delta g_2, \Delta g_3, \Delta g_4$ has four tones at frequencies of $0$, $f_s/4$, $2f_s/4$, and $3f_s/4$, the spurs appear at $0 \pm f_{in}$, $f_s/4 \pm f_{in}$, $2f_s/4 \pm f_{in}$, and $3f_s/4 \pm f_{in}$ in the frequency domain.

Figure 4-4: Example of gain mismatches in a four channel TI ADC
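The spur locations produced by gain mismatches can be checked numerically. The following sketch is a minimal illustration with assumed values: the gain errors, the 64-point record, and the input frequency bin are hypothetical choices, not measurements from this work. It multiplies a cosine input by a periodic four-channel gain sequence and lists the DFT bins that contain energy.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the normalized DFT of a real sequence (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def ti_output_with_gain_mismatch(n_samples, fin_bin, gain_errors):
    """Cosine input sampled by a TI ADC whose channel i has gain (1 + gain_errors[i])."""
    n_ch = len(gain_errors)
    return [
        (1.0 + gain_errors[t % n_ch]) * math.cos(2 * math.pi * fin_bin * t / n_samples)
        for t in range(n_samples)
    ]


N_SAMPLES = 64                                # record length; bin k corresponds to k * fs / 64
FIN_BIN = 5                                   # input tone at 5 * fs / 64 (assumed)
GAIN_ERRORS = [0.010, -0.004, 0.006, -0.002]  # hypothetical gain errors of four channels

spectrum = dft_magnitude(ti_output_with_gain_mismatch(N_SAMPLES, FIN_BIN, GAIN_ERRORS))

# Bins between DC and fs/2 that contain a tone (numerical noise is far below 1e-9)
tone_bins = [k for k in range(N_SAMPLES // 2 + 1) if spectrum[k] > 1e-9]
print("tone bins up to fs/2:", tone_bins)
```

With fs/4 at bin 16, the tones other than the input land at bins 11, 21, and 27, that is, at fs/4 - fin, fs/4 + fin, and fs/2 - fin. Changing FIN_BIN moves the spurs with the input frequency, in contrast to offset spurs, which stay fixed at multiples of fs/4.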
This is different from the offset errors in that the frequencies of the error tones depend on the input frequency, and the amplitude of the error tones increases linearly with the amplitude of the input signal. Multiplications at the digital output can be used for gain-error correction. The digital multiplication increases hardware and power consumption compared to the addition needed for offset calibration.

4.2.3 Timing-Skew

The clock signals of a TI ADC, shown in Figure 4-1, are usually generated on-chip by dividing a high frequency clock. Ideally, the clocks should be uniformly spaced in time. However, due to unavoidable mismatches in layout, random mismatches in the clock generator circuits, and random variations of the threshold voltage in the sampling switches, the clock edges deviate from the ideal case and are spaced nonuniformly. The error of the clock edges in time is called timing-skew, as shown in Figure 4-5. It is a unique and additional error source of the TI ADC which does not occur in a single channel ADC.

Figure 4-5: Timing-skew error

When a pure single tone signal is applied to the input, the errors from the timing-skew can be expressed as

$$v_{in}(t) = A e^{j\omega_{in} t} \qquad (4.1)$$

$$e_{skew,i} = \frac{dv_{in}(t)}{dt} \cdot \Delta t_i \qquad (4.2)$$

$$= j\omega_{in} A e^{j\omega_{in} t} \cdot \Delta t_i \qquad (4.3)$$

$$= j\omega_{in} \Delta t_i \cdot v_{in}(t) \qquad (4.4)$$

where $\Delta t_i$ is the timing-skew of the $i$th channel. Similar to the gain error case, the error from timing-skew can be analyzed as a product of the periodic sequence of $j\omega_{in}\Delta t_1, j\omega_{in}\Delta t_2, \ldots, j\omega_{in}\Delta t_N$ and the input signal $v_{in}$. Figure 4-6 shows an example of the timing-skew error in a four channel interleaved ADC. Note that the frequencies of the error tones are the same as the frequencies of the tones generated by gain mismatches. This shows that timing-skew and gain errors cannot be distinguished from the frequency of the error tones.

Figure 4-6: Example of timing-skew in a four channel TI ADC

However, there is an important difference between the gain errors and timing-skew errors that allows the two error sources to be separated. As derived in Equation 4.4, the timing-skew errors are proportional to the input frequency, while the errors from gain mismatches are independent of the input frequency. Thus, the gain error can be measured and calibrated from a low frequency input test, and the timing-skew errors can be processed later with a high frequency input test.

The relation between the timing-skew error and the input frequency provides another important message. Considering that the TI structure is used to expand the input bandwidth, TI ADCs tend to have a high frequency input signal, and timing-skew errors can be the dominant error sources. For example, the timing-skew requirement for a 10 bit 1GS/s TI ADC can be roughly calculated as follows.

$$\frac{|e_{skew,i}|}{|v_{in}|} = \omega_{max} \Delta t_i < \frac{1}{2^N} \qquad (4.5)$$

$$\Delta t_i < \frac{1}{\omega_{max} \cdot 2^N} = \frac{1}{2\pi \cdot 0.5 \times 10^9 \cdot 2^{10}} \approx 300\,\mathrm{fs} \qquad (4.6)$$

Generally, timing-skew calibrations are processed in two steps: timing-skew measurement (or estimation) and error correction. Timing-skew can be measured with a predetermined input signal such as a linear ramp [47] or a sine wave with known frequency and amplitude [53]. This makes the measurement relatively easy and accurate. However, to apply such a specific input signal, the normal ADC operation must be stopped for the calibration. Timing-skew is sensitive to temperature and supply voltage changes.
Temperature and supply voltage change the sharpness of the clock transition edges, which affects the timing-skew caused by random variation of the threshold voltage in the clock generators. The ability to track variations in the timing-skew is therefore an important aspect of the calibration. Thus, calibrations that rely on a predetermined input are limited to applications where ADCs are allowed to have an off-phase for a foreground calibration.

Timing-skew can also be measured with arbitrary inputs, which allows for background calibration. A correlation-based algorithm was developed in [40] to measure the timing-skew. It uses a dedicated timing reference ADC channel, which does not suffer from timing-skew, as a reference for the time-interleaved channels. A recent study in [43] uses two additional ADC channels for a direct measurement of the timing-skew error. One of them works as a reference for the TI channels and the other one is used to measure the derivative of the input signal. Both [40] and [43] present a new and robust algorithm, but the additional channels required for the calibration are an expensive overhead.

The correction of timing-skew error can be done in either the digital domain or the analog domain. Digital interpolation filters [47], fractional delay filters [51], and Taylor series approximations [50] are examples of digital filters for skew calibration. Although they are demonstrated in simulation, it is difficult to find measurement results from silicon chips in the literature. This is partially because the cost in power consumption and area of the digital calibration block is high, even in modern CMOS technology. In [53, 40, 43], programmable-delay blocks are placed at the sampling clock to compensate for the timing-skew. This is the most practical and frequently used solution. However, it needs to be designed carefully because the delay circuits tend to increase the noise and jitter of the clock signal.
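To get a feel for the accuracy that any skew correction must reach, the budget of Equation 4.6 can be evaluated for other resolutions and input bandwidths. The sketch below is a minimal illustration that assumes the first-order error model of Equation 4.4 and the criterion that the normalized skew error stays below 1/2^N; the function names are hypothetical.

```python
import math


def max_timing_skew(n_bits, f_in_max_hz):
    """Largest skew in seconds for which w_max * dt stays below 1 / 2^N (Equation 4.6)."""
    return 1.0 / (2.0 * math.pi * f_in_max_hz * 2 ** n_bits)


def skew_error_ratio(f_in_hz, skew_s):
    """First-order skew error relative to the input amplitude: w_in * dt (Equation 4.4)."""
    return 2.0 * math.pi * f_in_hz * skew_s


# The example from the text: 10 bit, 1GS/s TI ADC with a Nyquist input of 0.5GHz
budget_s = max_timing_skew(10, 0.5e9)
print(f"10 bit, 0.5GHz input: skew budget = {budget_s * 1e15:.0f} fs")

# The same skew is far less harmful at a low input frequency, which is why
# gain errors can be calibrated first with a low frequency test signal.
for f_in in (10e6, 100e6, 500e6):
    ratio = skew_error_ratio(f_in, budget_s)
    print(f"f_in = {f_in / 1e6:.0f} MHz: error / input = {ratio:.2e}")
```

Each additional bit of resolution or each doubling of the input bandwidth halves the allowed skew, so the resolution of a programmable delay has to shrink accordingly.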
Clock jitter is not an additional error source of TI ADCs, but a common error source of all ADCs. The impact of clock jitter is usually negligible for low-performance ADCs, but it becomes more notable for high-speed and high-resolution ADCs. Thus, when the sampling speed is improved with a TI structure, clock jitter can limit the accuracy of the ADC output. Unlike timing-skew, due to its random characteristics, clock jitter increases the noise floor of the output spectrum, but does not cause spurs. The requirement on the clock jitter can be calculated similarly to Equation 4.6. A detailed analysis of clock jitter is provided in [54].

4.3 Research Direction

In this chapter, the benefits and problems of the time-interleaved ADC structure are described. Prior research techniques developed to mitigate the problems are also introduced, with particular attention to timing-skew calibration. However, each of the prior works has a drawback. They require either an off-phase of the ADC operation for foreground calibration or additional ADC channels for background calibration, which increases the cost and complexity of the system. This thesis proposes a TI ADC which allows a background timing-skew calibration without increasing circuit complexity, which will be described in detail in the next chapter.

Chapter 5

A Time-Interleaved SAR ADC with a Background Timing-Skew Calibration

A time-interleaved SAR ADC is designed to demonstrate the energy efficiency of the TI structure and to propose a background timing-skew calibration method. In the proposed ADC, a single high-speed flash ADC is added to improve the conversion rate of each channel and to enable a timing-skew estimation of the TI SAR channels. This chapter describes the detailed implementation of the ADC and the process to estimate timing-skew. Measured results from a prototype IC are shown at the end of this chapter.
5.1 Block Diagram of the Proposed Time-Interleaved SAR ADC

Figure 5-1 shows the block diagram of the time-interleaved SAR ADC and the timing waveform. The time-interleaved ADC is composed of a clock generator, a flash ADC, eight time-interleaved SAR ADCs, and digital circuits for bit combining and multiplexing. The computation for timing-skew estimation is performed off-chip.

Figure 5-1: Block diagram of the proposed TI Flash/SAR ADC

The flash ADC resolves the MSBs (4 bit) at the full speed (Φ) of the time-interleaved ADC; therefore, it does not suffer from timing-skew errors. The flash ADC output is used as a coarse estimation of the SAR conversion. Because it does not suffer from timing-skew, the flash ADC output is also used as a timing reference. The SAR ADCs resolve the LSBs (7 bit) at the divided clock speed (Φ1 to Φ8). The timing-skew estimator controls the programmable delay circuit in the clock generator to correct the timing-skew error. The main idea of the calibration is to align the sampling clock of each SAR ADC (Φx) to the sampling clock of the flash ADC (Φ). This calibration approach is explained thoroughly with examples in Section 5.3.

5.2 Circuit Implementation

5.2.1 Clock Generator

Figure 5-2 shows the block diagram of clock generation for the prototype. A low-voltage differential signal (LVDS) clock is provided externally and converted to a single-ended signal on-chip. The global delay block shown in Figure 5-2 corrects the average mismatch between the flash clock (Φ) and the SAR clocks (Φ1 to Φ8). Since it is located before the clock divider, the global delay block affects all SAR clocks equally. Thus, it does not change the timing-skew among the SAR clocks. A clock divider is used to generate the divided clocks for the interleaved SAR channels, Φ1 to Φ8 in Figure 5-1. To minimize the systematic mismatches in the clock path, an H-tree structure is used to route the clocks.
The local delay block shown in Figure 5-2 adjusts the timing of the divided SAR clocks (Φ1 to Φ8) separately to correct the timing-skew among the SAR ADCs.

Figure 5-2: Block diagram of clock generation

Figure 5-3 shows the schematic of the LVDS clock receiver [55]. The differential structure improves supply noise rejection and input common-mode signal rejection. The cross-coupled PMOS load provides weak positive feedback, which increases the differential gain and transition speed. The bias voltage of the tail current source is generated on-chip using a current mirror, which is not shown for simplicity.

Figure 5-3: Schematic of LVDS clock receiver

Figure 5-4 shows the schematic of the clock generator. A chain of eight D flip-flops is used to divide the high frequency clock. A power-on reset signal, which is not shown in the figure for simplicity, is connected to the SET and RESET inputs of the D flip-flops to initialize the divider. To minimize the systematic mismatch between the flash clock (Φ) and the SAR clocks (Φx), several dummy logic gates are added to the flash clock path.

Figure 5-4: Schematic of clock generator

The schematic of the programmable delay block is shown in Figure 5-5. It is composed of four inverters with different sizes. Capacitor banks with switches are placed between the inverters to control the delay. Each capacitor bank has seven minimum size MOS capacitors which are controlled by a 3 bit binary code. Capacitors at the output of a small inverter control the coarse delay and capacitors at the output of a large inverter control the fine delay. The last two inverters are added to recover the sharpness of the transition edge. The coarse and fine delays are designed to have 2 ps and 0.8 ps delay steps, respectively. After timing-skew calibration in 0.8 ps fine delay steps, the maximum timing-skew is 0.4 ps and the variance of the timing-skew is about 0.28 ps, which is within the requirement from Equation 4.6.

Figure 5-5: Schematic of programmable delay block

5.2.2 Flash ADC

Figure 5-6 shows the implementation of the flash ADC with the timing waveform. It is a 4 bit flash ADC with a capacitive DAC. Each comparator samples the input signal with a bottom plate switch on two capacitors with different sizes, which are then switched to the reference voltages. When the reference voltages settle, the comparators are enabled. Although a single-ended version is shown for simplicity, a differential structure is implemented. The bootstrap switches, shown in Figure 3-6 in Chapter 3, are used for input tracking to minimize the variation of on-resistance over a wide range of input voltages. NMOS or PMOS switches are used for the bottom plate sampling and the reference switches, because they always connect to the same voltage.

The two sampling capacitors are sized to generate the DAC voltages which are the thresholds of the flash ADC. For the kth comparator (k = 1, 2, ..., 14), the two capacitors C1 = 0.5 + k and C2 = 15.5 − k generate a DAC voltage of

$$DAC_k = \frac{C_1}{C_1 + C_2} V_{REFP} + \frac{C_2}{C_1 + C_2} V_{REFM}.$$

The reason for having non-integer capacitor sizes is that the threshold voltages of the flash ADC should be shifted by 0.5 LSBflash to have a symmetrical redundancy between the flash and SAR ADCs. This is more clearly described in the following SAR ADC section.

Figure 5-6: Implementation of 4 bit flash ADC
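The threshold placement described above can be checked numerically. The following is a minimal sketch, assuming the charge-sharing expression DAC_k = (C1·VREFP + C2·VREFM)/(C1 + C2) with C1 = 0.5 + k and C2 = 15.5 − k. The function names, the `scale` parameter, and the default reference values (VREFP = 1, VREFM = 0) are illustrative assumptions, not values from the prototype.

```python
"""Flash ADC threshold levels from the capacitor ratios (illustrative sketch)."""

N_COMPARATORS = 14  # comparators k = 1..14


def flash_threshold(k, vrefp, vrefm, scale=1):
    """DAC voltage of the k-th comparator.

    Only the ratio C1:C2 matters, so multiplying both capacitors by the
    same factor (the hypothetical `scale` argument) leaves the level unchanged.
    """
    c1 = scale * (0.5 + k)    # assumed to be switched to VREFP
    c2 = scale * (15.5 - k)   # assumed to be switched to VREFM
    return (c1 * vrefp + c2 * vrefm) / (c1 + c2)


def all_thresholds(vrefp=1.0, vrefm=0.0, scale=1):
    """Thresholds for k = 1..14; the default references are assumed values."""
    return [flash_threshold(k, vrefp, vrefm, scale)
            for k in range(1, N_COMPARATORS + 1)]


if __name__ == "__main__":
    for k, level in enumerate(all_thresholds(), start=1):
        print(f"k={k:2d}  threshold={level:.5f}")
```

With these assumptions the thresholds sit at (k + 0.5)/16 of the reference span, that is, half a flash LSB away from the integer positions k/16, which is the shift the text attributes to the non-integer capacitor sizes.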
In the actual implementation, to avoid non-integer capacitor sizes, the capacitors are doubled while keeping the relative ratio: C1 = 1 + 2k and C2 = 31 − 2k. A custom designed unit capacitor is used, which is described in Section 5.2.3.

A dynamic latch, shown in Figure 3-9(b) in Chapter 3, is used for the comparators in the flash ADC. Thanks to the redundancy in the SAR ADCs, the offsets of the comparators need not be corrected as long as they are within the correction range, which is ±0.5 LSBFLASH = ±32 LSBSAR ≈ ±62.5 mV. Simulations based on the device mismatches show that the σ of the comparator offset is about 5.2 mV; thus, the comparator has enough margin for offset.

The flash ADC described in this section has several advantages over the conventional flash ADC [40], which compares the input directly with a reference voltage. First, a rail-to-rail input range is enabled. Most conventional flash comparators have a limited common-mode range, so they cannot accept a rail-to-rail input signal. Second, it allows a true fully differential implementation. Although a conventional flash comparator with two differential input pairs is topologically fully differential, when the input or reference voltages are large, one of the differential pairs dominates, leading to an effectively single-ended comparison. This causes errors when the common-mode level is not stabilized, and it lowers the PSRR. Also, the lack of resistor ladders in the sampling flash comparator makes this flash ADC power efficient. Finally, because it is a scaled version of a SAR ADC, the SAR and flash sampling clocks are closely matched. This is especially important in this work, because the sampling clocks of the SAR ADCs must be aligned as closely to the sampling clock of the flash ADC as possible to minimize the timing-skew correction range.

5.2.3 SAR ADC

SAR ADCs have become more popular due to their highly digital operation and power efficiency. However, designing a high-performance SAR ADC is challenging.
Recent research has developed many ideas to improve the performance of SAR ADCs. Novel switching schemes for low power operation [56, 57, 29, 58], calibration algorithms for capacitor mismatches [59], high speed SAR design [60, 61], and comparator noise reduction [62, 63] are good examples. Although a conventional SAR ADC is designed in this project to demonstrate the timing-skew calibration of a TI ADC, this technique can be extended to such advanced SAR ADCs.

Figure 5-7 shows a 10 bit SAR ADC composed of 1024 unit capacitors, one comparator, and SAR logic. The MSB DACs have 14 unary weighted capacitors whose size is 64 and are controlled by the output of the flash ADC. The LSB DACs are composed of binary weighted capacitors (size of 1 to 64) and controlled by a SAR logic block. A 1 bit redundancy between the flash ADC and the SAR ADCs is added to cover the error from the flash ADC and to extract the timing-skew information, as explained in the later part of this section. The final output of each channel (DCHANNEL) is the weighted sum of the flash ADC output (DFLASH) and the lower output bits of the SAR conversion (DLSAR):

$$D_{CHANNEL} = 64 \cdot D_{FLASH} + D_{LSAR}.$$

Figure 5-7: Implementation of 10 bit SAR ADC

For the same reason as in the flash ADC, bootstrap switches are used for the input tracking and NMOS or PMOS switches are used for the bottom plate sampling and the reference switches. A programmable delay block, shown in Figure 5-5, is placed between the sampling clock and the NMOS bottom plate switches to correct the timing-skew.

The timing diagram of the SAR conversion is shown in Figure 5-8. To avoid requiring a high frequency clock and to increase the SAR conversion speed, asynchronous SAR logic is implemented [64] with programmable delays.
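The bit combining stated above, where the channel output is 64 times the flash code plus the 7 bit SAR code, can be sketched in a few lines. This is an illustrative sketch only: the function name, the range checks, and the sample codes are assumptions chosen to show how the 1 bit redundancy lets two different flash codes lead to the same channel output.

```python
FLASH_WEIGHT = 64   # each flash step corresponds to one size-64 MSB capacitor
MAX_FLASH = 14      # 14 unary MSB capacitors give flash codes 0..14
MAX_LSAR = 127      # 7 bit SAR code, 0..127


def channel_output(d_flash, d_lsar):
    """Combine the flash code and the lower SAR bits into a 10 bit output."""
    if not 0 <= d_flash <= MAX_FLASH:
        raise ValueError("flash code out of range")
    if not 0 <= d_lsar <= MAX_LSAR:
        raise ValueError("SAR code out of range")
    return FLASH_WEIGHT * d_flash + d_lsar


if __name__ == "__main__":
    # Two illustrative code pairs: a flash code that is one step low is
    # absorbed by a SAR code that is 64 higher, so the output is unchanged.
    print(channel_output(12, 38))
    print(channel_output(11, 102))
```

Because the SAR code spans 128 values while one flash step is worth only 64, neighboring flash codes overlap by half of the SAR range on each side, which is the margin available for flash errors.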
The input signal is sampled on the capacitor array at the falling edge of the SAR sampling clock (Φx), which is ideally synchronized with the sampling clock of the flash ADC (Φ). MSB_DELAY controls the timing to capture the output of the flash ADC. MSB_DAC_SETTLE_DELAY provides time for the MSB DAC settling and triggers the comparator. An XOR gate at the output of the differential comparator generates the COMP_VALID and UPDATE_LSB_DAC signals when the comparison is done. The UPDATE_LSB_DAC signal is used to update the LSB DACs. COMP_RESET_DELAY adjusts the reset timing of the comparator and LSB_DAC_SETTLE_DELAY provides settling time to the LSB DACs. After the last comparison, SAR_DONE is generated, which indicates the completion of the SAR conversion. A digital logic circuit to generate these signals is shown at the end of this section.

Figure 5-8: Timing diagram of the SAR ADC

Figure 5-9 and Figure 5-10 describe the redundancy. Because it is difficult to show the entire range of the 4 bit flash and 10 bit SAR, a simplified example with a 3 bit flash and a 5 bit SAR is described. Figure 5-9 shows the case without redundancy. In this case, the flash ADC needs 7 comparators for 8 level quantization. The thresholds of the flash ADC are uniformly distributed across the full input range. Note that the SAR searching range is the same as the uncertainty provided by the flash ADC. In the particular example shown in Figure 5-9, the flash ADC confines the input signal between 8 and 11, and the same range (8 to 11) is covered by the SAR ADC in two cycles. This structure is not robust to small errors from the flash ADC and is impractical for real implementation, because it requires the flash ADC to be accurate to the full ADC resolution.

Figure 5-10 shows the case with redundancy. In this case, the flash ADC needs 6 comparators for 7 level quantization.
Compared to Figure 5-9, the thresholds are shifted by 0.5 LSBflash. The redundancy comes from the fact that the SAR searching range is wider than the uncertainty provided by the flash ADC. Also, to use the redundancy efficiently, the SAR searching range is centered around the two thresholds of the flash ADC. In the particular example shown in Figure 5-10, the flash ADC confines the input signal between 10 and 13, but a wider range (8 to 15) is covered by the SAR ADC in three cycles. Due to the 1 bit redundancy, the overall resolution of the whole ADC is reduced from 6 bits to 5 bits. However, the robustness to the flash ADC error makes this design more practical.

Figure 5-9: An example of no redundancy: no redundancy between 3 bit flash ADC and 5 bit SAR ADC

Figure 5-10: An example of redundancy: 1 bit redundancy between 3 bit flash ADC and 5 bit SAR ADC

When selecting a capacitor size, there are two important factors to consider: thermal (kT/C) noise and capacitor mismatches. The kT/C noise limits the minimum size of the total capacitance to 100 fF on each side for a 10 bit ADC with a 2 V differential peak-to-peak input amplitude. Divided among 1024 units, this leads to a 0.1 fF unit capacitor for a 10 bit ADC, which is impractical and too small for 10 bit matching. According to published work using custom designed unit capacitors [29, 65, 66], 1 fF unit capacitors are a good compromise for 10 bit accuracy. The custom designed unit capacitor used in this work is shown in Figure 5-11.

Figure 5-11: Custom designed unit capacitor of SAR ADC (top view and side view; layers M3, VIA34, M4, VIA45, M5)
It is a combination of MIM and MOM structures. M3 and M5 are two plates having a MIM structure. The benefit of capacitor arrays with the MIM structure is that the top and bottom plates are shielded from each other in a capacitor array and each element can be accessed without adding parasitic capacitance. This is especially important when a small unit capacitor is used. Also, M5 is designed slightly smaller than M3 to reduce the errors from misalignment of the metal layers. However, the density of MIM capacitance without a special layer in the fabrication process is too low. To increase the density of the capacitor, interdigitated MOM structures are added in M4 and connected to M3 and M5. Metal strips in M4 are designed wider than minimum size to cover the capacitance from the VIAs. In the capacitor array, M5 is used as a bottom plate that is connected to the comparator input for smaller parasitic capacitance.

A dynamic latch with an offset control, shown in Figure 5-12, is used for the SAR ADC comparator. An offset calibration block, which is a capacitor bank with switches, is added at both outputs of the comparator [67]. The offset calibration block has 31 switches and 31 capacitors controlled by binary weighted 5 bit configuration bits. Although explicit capacitors are shown in Figure 5-12, the parasitic capacitance of the switches is used to control the offset finely. According to the simulation and measurement results, the offsets can be adjusted up to ±12 mV ≈ 6 LSBs with a 0.2 LSB step size. The input NMOS pair of the comparator is sized to keep the σ of the comparator offset below 1.5 mV. The noise of the comparator is controlled by the proper sizing of the NMOS strobe switch at the bottom of the comparator [68]. Simulations show that the input referred noise of the comparator is about 0.2 mVrms,diff.

Figure 5-12: Schematic of the SAR ADC comparator with offset calibration
The asynchronous SAR logic block is shown in Figure 5-13. Two D flip-flop chains shown at the bottom are used for LSB DAC control [69]. The digital logic circuit to generate the signals in Figure 5-8 is shown at the top.

Figure 5-13: Schematic of the SAR logic

5.3 Timing-Skew Estimation

In this section, two background timing-skew estimation methods which are applicable to the proposed ADC architecture are explained. The basic principle, using the coarse flash ADC output as a timing reference, is the same. However, they use different measures to estimate the timing-skew.

5.3.1 Variance Based Estimation

A conceptual diagram of the ADC operation, shown in Figure 5-14, is used to explain the variance based method. The input signal and the sampling clocks of the flash ADC (Φ) and the SAR ADC (Φx) are shown on the left side. The reference voltages of the flash ADC, which are equivalent to the MSB DAC levels of the SAR ADC, are shown in the middle. The LSB DAC code of the SAR ADC is shown on the right side.

In Figure 5-14(a), the input signal is sampled by the flash ADC (Φ) and the SAR ADC (Φx) simultaneously. Thus, assuming the flash ADC is accurate, they sample the same input signal and the coarse estimation from the flash ADC is accurate. In this example, the flash ADC output is 12. Although the flash ADC confines the SAR searching range between 32 and 96, due to the redundancy, the SAR covers a wider range than necessary. In the absence of noise, the final SAR output in this case always falls between 32 and 96, which is the nominal range.
However, when Φ and Φx are not aligned, as shown in Figure 5-14(b), the coarse estimation from the flash ADC is not accurate. In this example, the flash ADC output is 11. Thus, the SAR conversion starts from the wrong place. However, thanks to the redundancy, the SAR finds the accurate final value. Although the channel output, which is a weighted sum of the flash ADC output and the SAR conversion, is the same as before, the SAR conversion output, 102 in this example, goes beyond the nominal range.

Figure 5-14: A conceptual diagram of the ADC operation: (a) ideal case without timing-skew, where DCHANNEL = 64*DFLASH + DLSAR = 64*12 + 38 = 806, and (b) realistic case with timing-skew, where DCHANNEL = 64*11 + 102 = 806

The effect of the timing-skew can be found from the histogram of the SAR conversion output, DLSAR. Figure 5-15 shows an example of the DLSAR histogram with and without the timing-skew error. In the ideal case, DLSAR is confined in the nominal range and the density is uniform. However, with timing-skew, some of the DLSAR codes at the edge go to the other side to cover the error from the flash ADC, and this increases the variance of the DLSAR histogram. Thus, timing-skew can be estimated from the variance of DLSAR.

Figure 5-15: Examples of the DLSAR histogram without timing-skew (left) and with timing-skew (right)

To verify the idea, behavioral simulations are performed and plotted in Figure 5-16. In this simulation, two input signals with different frequencies are applied. Here, 128K (2^17) data values are used to calculate each variance value on the plot.
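A behavioral experiment of this kind can be sketched as follows. This is not the simulation used for Figure 5-16 but a minimal model under stated assumptions: ideal quantizers with no noise or offset, one channel sampled at 125 MS/s, a −2 dBFS sine input normalized to a full scale of 1.0, flash thresholds at (k + 0.5)/16 for k = 1..14, and 2^15 samples instead of 2^17 to keep the run short. The input frequencies are chosen as odd multiples of fc/2^15 (about 451 MHz and 150 MHz) so that the sample phases are spread evenly; all names and these numeric choices are illustrative.

```python
import numpy as np

FC = 1e9 / 8                        # per-channel sampling rate, 125 MS/s
N_SAMPLES = 2 ** 15                 # assumed record length (shorter than 2^17)
AMPLITUDE = 0.5 * 10 ** (-2 / 20)   # -2 dBFS sine, full scale normalized to 1.0

F_HIGH = FC * 118227 / N_SAMPLES    # about 451 MHz (assumed test tone)
F_LOW = FC * 39321 / N_SAMPLES      # about 150 MHz (assumed test tone)


def input_signal(t, f_in):
    """Normalized input in (0, 1), centered at mid-scale."""
    return 0.5 + AMPLITUDE * np.sin(2 * np.pi * f_in * t)


def var_dlsar(skew_s, f_in):
    """Variance of the lower SAR code for a given SAR-to-flash timing skew."""
    t_flash = np.arange(N_SAMPLES) / FC   # flash sampling instants for this channel
    t_sar = t_flash + skew_s              # SAR samples are shifted by the skew
    # Flash code: number of thresholds (k + 0.5)/16, k = 1..14, below the input.
    d_flash = np.clip(np.floor(16 * input_signal(t_flash, f_in) - 0.5), 0, 14)
    # Channel code: ideal 10 bit quantization of the value the SAR sampled.
    d_channel = np.clip(np.floor(1024 * input_signal(t_sar, f_in)), 0, 1023)
    # Lower SAR bits: what remains after removing the flash estimate.
    d_lsar = np.clip(d_channel - 64 * d_flash, 0, 127)
    return float(np.var(d_lsar))


if __name__ == "__main__":
    for skew_ps in range(-5, 6):
        skew = skew_ps * 1e-12
        print(f"skew={skew_ps:+d} ps  "
              f"VAR@451MHz={var_dlsar(skew, F_HIGH):.2f}  "
              f"VAR@150MHz={var_dlsar(skew, F_LOW):.2f}")
```

In this model the flash code is computed from the unskewed sample and the channel code from the skewed one, so any disagreement between the two shows up only in the lower SAR bits, which is the quantity whose variance is swept.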
As expected, VAR(DLSAR) is minimized when the timing-skew is zero, regardless of the input frequency. Figure 5-16 shows that VAR(DLSAR) has lower sensitivity when the input signal is at a low frequency. With a low frequency input signal, the VAR(DLSAR) curve has a small slope because this method does not measure the timing-skew directly in the time domain, but measures the errors caused by timing-skew in the voltage domain. With a low frequency input signal, the errors caused by timing-skew are small and thus difficult to detect.

Figure 5-16: Behavioral simulation result: Timing-skew vs. VAR(DLSAR) (top panel: input signal in time domain; bottom panel: variance of DLSAR vs. timing-skew for 150 MHz and 450 MHz input signals)

The proposed variance based timing-skew estimation can be explained by simple equations. Assuming the error due to the timing-skew is small and covered by the redundancy in the SAR ADC, the channel output DCHANNEL is completely determined by the sampled value of the SAR ADC. Then, the digital output of the flash ADC and the digital output of each channel can be written as

$$D_{FLASH}[n] = v_{IN}(nT) + Q_{FLASH}[n] \tag{5.1}$$

$$D_{CHANNEL}[n] = v_{IN}(nT + \Delta t) + Q_{CHANNEL}[n] \tag{5.2}$$

where DFLASH[n] and QFLASH[n] are the digital output and the quantization noise of the flash ADC, DCHANNEL[n] and QCHANNEL[n] are the digital output and the quantization noise of each ADC channel, and Δt is the timing-skew between the flash ADC and each ADC channel.
From these equations, the variance of DLSAR can be expressed as

$$
\begin{aligned}
VAR[D_{LSAR}[n]] &= VAR[D_{CHANNEL}[n] - D_{FLASH}[n]] && (5.3) \\
&= VAR[v_{IN}(nT + \Delta t) + Q_{CHANNEL}[n] - v_{IN}(nT) - Q_{FLASH}[n]] && (5.4) \\
&= E[(v_{IN}(nT + \Delta t) - v_{IN}(nT) + Q_{CHANNEL}[n] - Q_{FLASH}[n])^2] && (5.5) \\
&\approx E[(v_{IN}(nT + \Delta t) - v_{IN}(nT))^2] + Q_{CHANNEL,rms}^2 + Q_{FLASH,rms}^2 && (5.6)
\end{aligned}
$$

where Q²CHANNEL,rms and Q²FLASH,rms are the quantization noise power of each channel ADC and the flash ADC, respectively. Equation 5.6 shows that VAR[DLSAR[n]] is a function of Δt and that it is minimized when Δt is zero. Equation 5.6 also indicates that minimizing VAR[DLSAR[n]] is equivalent to finding a least mean square (LMS) error approximation between the two sampled signals, vIN(nT) and vIN(nT + Δt).

In the previous explanation, an ideal flash ADC is assumed. However, it is important to consider non-idealities of the flash ADC, such as the offset and noise of the comparators. The comparator offsets in the flash ADC have negligible impact on the timing-skew estimation. It is true that the comparator offsets in the flash ADC provide an inaccurate coarse estimation to the SAR ADCs and increase the range of DLSAR. However, since the polarity and amplitude of the comparator offsets do not change, they can be distinguished from timing-skew errors easily. The behavioral simulation result, plotted in Figure 5-17, shows that the comparator offsets in the flash ADC shift the VAR(DLSAR) curve upward, but do not change the fact that VAR(DLSAR) is minimized when the timing-skew is zero. In this simulation, comparator offsets with two different RMS values are tested with an approximately 451 MHz, −2 dBFS input signal. Here also, 128K (2^17) data values are used to calculate each variance value on the plot.

The noise of the comparators in the flash ADC has a different impact on the DLSAR histogram.
Because the polarity and the amplitude of the comparator noise vary randomly, it is difficult to distinguish the error caused by the comparator noise in the flash ADC from the error caused by timing-skew with one sample. However, due to the well defined statistics of noise, such as mean and variance, the effect of the comparator noise can be distinguished statistically. When the histogram and variance of DLSAR are calculated with a sufficient number of data, the effect of the comparator noise is averaged and does not change. In other words, the variance of DLSAR is increased by the comparator noise, but with sufficient samples the amount of increase is the same for every VAR(DLSAR) calculation. This is because the captured noise power of the comparators converges as the number of samples increases.

Figure 5-17: Behavioral simulation result with comparator offsets in the flash ADC: Timing-skew vs. VAR(DLSAR)

Figure 5-18: Behavioral simulation result with comparator noise in the flash ADC: Timing-skew vs. VAR(DLSAR), using 2^17 data values to calculate variance (curves: ideal flash ADC, flash ADC with 1 LSBrms and 3 LSBrms comparator noise)

Figure 5-19: Behavioral simulation result with comparator noise in the flash ADC: Timing-skew vs. VAR(DLSAR), using 2^20 data values to calculate variance (curves: ideal flash ADC, flash ADC with 1 LSBrms and 3 LSBrms comparator noise)

The effects of the comparator noise are simulated with behavioral models. Comparator noise with two different RMS values is tested with an approximately 451 MHz, −2 dBFS input signal.
As before, 128K (2^17) data values are used to calculate each variance value on the plot. Figure 5-18 shows that the noise of the comparators in the flash ADC adds noisy patterns, which degrades the sensitivity of the timing-skew calibration. Thus, it is important to keep the comparator noise low. If the comparator noise is higher than the required level and cannot be controlled, a simple solution is to use the averaging effect of noise power. By using more data for the variance calculation, the impact of the flash comparator noise can be reduced. Figure 5-19 shows the same behavioral simulation result as in Figure 5-18, but Figure 5-19 uses eight times more data (2^20) to calculate each variance. However, the improvement grows significantly more slowly than the cost in time and power consumption of the calibration. Thus, the number of data for the variance calculation should be optimized to the noise level of the flash comparators.

Finally, the noise of the comparators in the SAR ADCs also affects the histogram of DLSAR and may limit the sensitivity of the calibration. However, since the noise level of the SAR comparator is kept significantly lower than 1 LSB, it is usually negligible. The behavioral simulation result is shown in Figure 5-20.

Figure 5-20: Behavioral simulation result with comparator noise in SAR ADC: Timing-skew vs. VAR(DLSAR) (curves: ideal SAR ADC, SAR ADC with 0.2 LSB and 0.5 LSB comparator noise)

5.3.2 Correlation Based Estimation

A correlation based skew-estimation approach was presented in [40]. Although it was proposed for a different ADC architecture, a time-interleaved flash ADC, the same principle can be applied to this work. However, there is one important difference. [40] used a dedicated extra timing reference channel whose output is used only for timing reference.
In contrast, the output of the flash ADC in this work provides both a timing reference and a coarse estimation to the SAR channels. According to [40], the correlation between DFLASH and DCHANNEL can be a measure of the sampling clock alignment, and the correlation reaches its maximum value when the sampling clocks are aligned. Figure 5-21 shows the behavioral simulation result. Similar to the previous variance based method, the correlation based method is robust to the offset of the comparators in the flash ADC. However, as shown in Figure 5-22, it also suffers from sensitivity degradation with the comparator noise in the flash ADC. The same condition as for Figure 5-18 is used for this simulation.

Figure 5-21: Behavioral simulation result: Timing-skew vs. Corr(DFLASH, DCHANNEL)

Figure 5-22: Behavioral simulation result: Timing-skew vs. Corr(DFLASH, DCHANNEL) (curves: ideal flash ADC, flash ADC with 1 LSBrms and 3 LSBrms comparator noise)

The complexity of the digital circuits for skew-estimation is a disadvantage of this method. Note that both DFLASH and DCHANNEL contain the input signal and have a wide dynamic range. Thus, the calculation of the correlation between DFLASH and DCHANNEL requires multipliers and adders with wider input words.

5.3.3 Limitations

The timing-skew estimation methods introduced in this section have a few limitations. First, the input signal must be busy or active, so that the input signal crosses at least one of the reference voltages of the flash ADC. This is because the proposed methods detect the error (or correlation) between the two independent ADCs (flash ADC and SAR ADC).
For example, with a very small input signal which is represented with one flash ADC output, Corr(DFLASH, DCHANNEL) is always zero. However, this is not a critical issue, because if the input signal is limited between two reference voltages of the flash ADC, the derivative of the input, dvIN/dt, is also limited. Thus, in this case, the errors from the timing-skew are likely to be insignificant.

Second, the input frequency should not be an integer multiple of fc, where fc is the conversion rate of each channel. Although this is a pathological case, if the input frequency is an integer multiple of fc, each channel samples the same signal value repetitively. Thus, there is only one flash ADC output and one SAR output for each channel, and VAR(DLSAR) and Corr(DFLASH, DCHANNEL) are always zero.

Finally, because the proposed methods rely on statistics of the input signal, the input characteristic should be maintained during the timing-skew calibration. For example, if the input signal is busy but appears more frequently where DLSAR is close to the edges of the nominal range, 32 and 96 in the prototype, VAR(DLSAR) can increase. However, the increase of the variance is not caused by the timing-skew, but by an unexpected characteristic of the input. Similarly, if the input frequency increases temporarily only for one calibration cycle, VAR(DLSAR) can also increase regardless of the actual timing-skew. To address this issue, studies to enhance the calibration speed are recommended for future work.

5.4 Layout

The floor planning of a TI ADC is important to keep the symmetry of the channels. Any asymmetry in layout exacerbates the issues discussed in Section 4.2. A signal traveling a long distance can be affected by the voltage drop and by local noisy circuits. Figure 5-23 shows the floor plan and the top layout of the TI SAR ADC in this work. Four SAR ADCs are located on each of the right and left sides of the chip. The flash ADC is placed at the top of the chip.
Input signals and clocks are distributed to the interleaved channels from the bottom. To ensure the same distance to the TI channels, an H-tree structure is used to route the input signals and clocks. To avoid interference through capacitive coupling, the clocks are shielded with ground.

Figure 5-23: Floor plan and top layout of the ADC

Figure 5-24 shows the layout of the SAR ADC. A symmetric layout is required to exploit the benefits of a fully differential implementation. Thus, the comparator and the bottom sampling switches are placed between the main and complementary DACs. Dummy capacitors are added around the DAC for better matching. To keep the wiring length short, the input tracking switches and the reference switches are placed on opposite sides of the DACs. The layout of the flash ADC is shown in Figure 5-25. Because the sum of the two capacitors of each comparator is the same, the flash ADC layout becomes compact and modular.

Figure 5-24: Layout of the SAR ADC

Figure 5-25: Layout of the flash ADC

5.5 Measurement

5.5.1 Measurement Setup

The block diagram of the measurement setup for this work is shown in Figure 5-26. Synthesized signal generators are used for the low noise input signal and the clock source. To reduce the distortion and noise of the signal generator, a bandpass filter is inserted between the signal generator and the test board. The configuration bits on the ADC chip are implemented by a flip-flop chain and are programmed by a pattern generator. The LVDS output signals of the ADC are captured by a logic analyzer.
Figure 5-27 shows the test board for the measurement of the chip. To minimize the effects of off-chip parasitics, the chip is bonded directly to the test board. The single-ended input signal and clock are converted to differential signals on the PCB using an off-the-shelf balun (TC1-1-13M+, Mini-Circuits). Eight linear regulators (LT3021, Linear Technology) are used to generate the supply voltages, the reference voltages, and the common-mode voltage. An accurate 1 ohm resistor is added in series at the output of the regulators to measure the power consumption. A prototype ADC is fabricated in a 65nm GP CMOS process. The die photo is shown in Figure 5-28. The active area of the ADC is highlighted in Figure 5-28 and occupies 0.78mm². The unused area of the chip is filled with decoupling capacitors.

Figure 5-26: Block diagram of the measurement setup (instruments shown: Agilent E3646 DC power supply, Agilent 8644B signal generator, HP 83732B clock generator, TTE bandpass filters, Tektronix TLA715 with TLA7AA2 logic analyzer and TLA7PG2 pattern generator, Matlab)

Figure 5-27: Test board design for measurement

Figure 5-28: Die photo of the time-interleaved SAR ADC

5.5.2 Measurement Result

Before Calibration

Figure 5-29 shows the measured spectrum with a low-frequency (11MHz) input signal. The spectrum of a single channel is shown on the left side and the spectrum of the time-interleaved result is plotted on the right side. A typical single channel achieves 53.9dB SNDR and 70.2dB SFDR. This result confirms that the matching of the custom-designed capacitors is good enough for 10 bit accuracy. The time-interleaved result shows 52.7dB SNDR and 63.4dB SFDR. Because the input frequency is very low, the errors from the timing-skew have little effect. Thus, the 1.2dB SNDR degradation in the TI result is caused by the gain mismatches and offset errors between the channels.
Figure 5-30 shows the measured spectrum with an input signal close to the Nyquist rate (479MHz). A typical single channel result shows only a slight degradation in SNDR. This result means that clock jitter has negligible impact and does not limit the performance. Considering that this work has a fully differential structure and that the second harmonic does not appear with a low-frequency input signal, the second harmonic distortion can be the result of phase imbalance in the off-chip balun [70]. Although the SNDR of a single channel is maintained above 53dB, the time-interleaved spectrum suffers from large tones caused by timing-skews between channels. As a result, SNDR and SFDR are limited to 42.5dB and 46.6dB, respectively.

Figure 5-29: Measured spectrum before calibration: single channel result (left) and time-interleaved result (right) with a low frequency input signal at 11MHz (plot annotations: single channel fin/fs = 11MHz/125MS/s, SNDR = 53.9dB, SFDR = 70.2dB; time-interleaved fin/fs = 11MHz/1GS/s, SNDR = 52.7dB, SFDR = 63.4dB)

Figure 5-30: Measured spectrum before calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal at 479MHz (plot annotations: single channel fin/fs = 479MHz/125MS/s, SNDR = 53dB, SFDR = 59.6dB; time-interleaved fin/fs = 479MHz/1GS/s, SNDR = 42.5dB, SFDR = 46.6dB)

Calibration Process

The timing-skew calibration method described in Section 5.3.1 is performed. Figure 5-31 shows VAR(DLSAR) against the coarse delay control codes of the SAR sampling clocks. Here, 128K data values are used to calculate each variance.
All channels show a smooth curve with one minimum. The same process is repeated for the fine delay control code to complete the calibration, and the result is shown in Figure 5-32. To demonstrate the possibility of background calibration and to avoid errors from a deterministic input signal, the calibration is performed over four different input frequencies and two different input amplitudes, and the differences are found to be insignificant.

Although the brute-force sweep of the delay codes described above can be used to find the minimum variance, it can be slow and inefficient because it includes unnecessary calculation. An iterative linear searching method can speed up convergence and reduce the power consumption of the calibration. Figure 5-33 presents an example of an iterative linear searching algorithm. In this example, a pure sine input signal (479MHz, -1.5dBFS) is applied and the SNDR is calculated at the end of each calibration cycle to demonstrate the convergence and effectiveness of the calibration. The linear search starts from the default delay control codes and measures the variance of DLSAR. In the second calibration cycle, the delay code is randomly either increased or decreased by 1 and the variance of DLSAR is measured again. If the variance has decreased, the control code is updated further in the same direction in the next calibration cycle. If the variance has increased, the control code is updated in the opposite direction in the next calibration cycle. At the bottom of Figure 5-33, the coarse and fine codes for the channel 1 timing-skew calibration are presented as an example. The color of the coarse and fine delay control codes shows whether the variance is reduced (blue) or increased (red). With this searching method, the SNDR is significantly improved after a few calibration cycles. This linear searching method requires the programmable delay block to be monotonic.
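The iterative linear search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the measurements: `measure_variance` is a hypothetical callback standing in for one calibration cycle (for example, computing VAR(DLSAR) from 128K samples at a given delay code), and the code range limits and the best-code bookkeeping are additions made for the sketch.

```python
import random


def linear_min_search(measure_variance, start_code, min_code, max_code,
                      cycles, rng=None):
    """Step a delay code by +/-1 per cycle toward the minimum variance.

    measure_variance(code) is a hypothetical stand-in for one calibration
    cycle; it returns the measured variance at that delay code.
    Assumes min_code < max_code.
    """
    rng = rng or random.Random()
    code = start_code
    prev_var = measure_variance(code)        # cycle 1: default code
    best_code, best_var = code, prev_var
    direction = rng.choice((-1, 1))          # cycle 2: random first step
    history = [code]
    for _ in range(cycles - 1):
        nxt = code + direction
        if nxt < min_code or nxt > max_code:  # turn around at the range limits
            direction = -direction
            nxt = code + direction
        code = nxt
        var = measure_variance(code)
        if var < best_var:                   # remember the best code seen
            best_code, best_var = code, var
        if var > prev_var:                   # got worse: reverse next cycle
            direction = -direction
        prev_var = var
        history.append(code)
    return best_code, history


# Toy example: a smooth variance curve with a single minimum at code 5.
best, codes = linear_min_search(lambda c: 340 + 10 * (c - 5) ** 2,
                                start_code=4, min_code=0, max_code=7,
                                cycles=8, rng=random.Random(0))
print(best, codes)
```

In this sketch the same routine would be run first on the coarse code and then on the fine code. Like the procedure in the text, it relies on the variance curve having a single minimum, which in turn requires a monotonic delay block.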
Figure 5-31: Measured variance of DLSAR against coarse delay of SAR ADC sampling clock (top) with the examples of the DLSAR histogram (bottom) for channel 1

Figure 5-32: Measured variance of DLSAR against fine delay of SAR ADC sampling clock (top) with the examples of the DLSAR histogram (bottom) for channel 1

Figure 5-33: The convergence of the background timing-skew calibration with a simple iterative linear minimum searching algorithm (x-axis: calibration cycle, 128K samples per cycle)

After Calibration

Figure 5-34 shows the measured spectrum after timing-skew calibration. A typical single channel result is the same as before calibration. However, the error tones in the time-interleaved result are significantly reduced by the timing-skew calibration. SNDR and SFDR are improved to 51.4dB and 60.0dB, respectively. The power consumption is 18.9mW (clock 3.34mW, flash ADC 5.04mW, SAR ADCs 9.18mW, reference 1.35mW), which corresponds to 62.3fJ/step FoM.
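As a quick check of the quoted figure of merit, the sketch below assumes the conventional Walden definition, FoM = P / (2^ENOB x fs) with ENOB = (SNDR - 1.76) / 6.02. The definition is not restated at this point in the text, so it is an assumption here, and the function name `walden_fom` is illustrative only.

```python
def walden_fom(power_w, sndr_db, fs_hz):
    """Walden figure of merit in joules per conversion step (assumed definition)."""
    enob = (sndr_db - 1.76) / 6.02           # effective number of bits from SNDR
    return power_w / (2.0 ** enob * fs_hz)   # energy per conversion step


# Values reported for the calibrated prototype: 18.9 mW, 51.4 dB SNDR, 1 GS/s.
fom = walden_fom(18.9e-3, 51.4, 1.0e9)
print(f"{fom * 1e15:.1f} fJ/step")
```

With the reported numbers this evaluates to roughly 62.3fJ/step, in agreement with the value quoted above.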
The power consumption of the digital circuits for background timing-skew estimation is not included. Because timing-skew does not change frequently, the calibration does not need to run continuously. It may be initiated only when the chip is powered on and when a temperature or voltage fluctuation is detected. The power consumption of the calibration is then insignificant and can be ignored. Figure 5-35 shows the power breakdown of the chip. The INL/DNL plots are shown in Figure 5-36.

The residual error tones after calibration are about -60dBFS, which is slightly worse than the expected value from the fine delay step (~0.8ps step). A foreground timing-skew calibration is performed to find the limit of the calibration. For the foreground calibration, a pure sine wave is applied and the coarse and fine delay codes are tuned to maximize the SNDR. The result of foreground calibration is plotted in Figure 5-37. SNDR is improved to 52.9dB, which is 1.5dB better than with the proposed background calibration.

To highlight the effectiveness of the timing-skew calibration, SNDR versus input frequency is plotted in Figure 5-38. It clearly shows that the SNDR drop at high input frequency is recovered by both background and foreground calibration. It is also notable that the SNDR after calibration has a repetitive pattern over input frequency. To explain this SNDR fluctuation, a typical single channel SNDR curve is plotted in Figure 5-39. A similar SNDR pattern over input frequency is also observed in the single channel result, which means that the SNDR fluctuation should be explained by single channel operation.

The SNDR waviness is believed to be caused by data-dependent disturbances on the external input network. The SNDR curves have peaks at input frequencies around 0MHz, 125MHz, 250MHz, 375MHz, and 500MHz. At these frequencies, each channel converts an aliased low-frequency input signal.
Thus, the charge sampled in the previous cycle of each channel is very similar to the charge sampled by that channel in the current cycle. This means that the input network is not disturbed by the sampling capacitors at the beginning of the sampling phase. In contrast, about 3dB worse SNDR is measured at input frequencies around 62.5MHz, 187.5MHz, 312.5MHz, and 437.5MHz. These are the frequencies at which each channel experiences an aliased Nyquist rate input signal. In this case, each channel sees the worst-case sample-to-sample difference in charge, which causes the maximum disturbance on the input network. If this disturbance does not settle within 1ns, the sample taken by the next channel will have an error.

One possible solution to mitigate these data-dependent input network disturbances is to add a reset phase that clears the charge of the previous sample. This reset phase does not eliminate the disturbance on the input network, but makes it constant and data-independent. Another solution is to use a small sampling capacitor. With a small sampling capacitor, the charge provided by the input network in the sampling phase is also small, so the disturbance of the input network can be reduced. However, this may worsen the thermal noise and the matching of the sampling capacitor.

The measurement results of three different chips are plotted in Figure 5-40. Before calibration, SNDR is spread widely and limited by the timing-skew for all three chips at high frequencies. After calibration, the three chips provide similar performance with 1.2dB SNDR variation at the Nyquist input rate. The output spectrum plots shown in the previous part of this section are the results from Chip 1. Table 5.1 summarizes the performance with a comparison to previously published work with fs>0.8GS/s, SNDR >45dB, and FoM <180fJ/step.
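The peak and dip frequencies discussed above follow from folding the input frequency into the first Nyquist zone of a single channel. The following is a minimal sketch, assuming eight channels at 1GS/s so that each channel converts at fc = 125MS/s; the helper name `aliased_frequency_mhz` is hypothetical.

```python
def aliased_frequency_mhz(fin_mhz, fc_mhz=125.0):
    """Frequency seen by one channel when fin is sampled at the channel rate fc."""
    f = fin_mhz % fc_mhz           # fold into [0, fc)
    return min(f, fc_mhz - f)      # reflect into the first Nyquist zone [0, fc/2]


for fin in (125.0, 250.0, 62.5, 187.5, 479.0):
    print(fin, "MHz ->", aliased_frequency_mhz(fin), "MHz per channel")
```

Inputs near multiples of 125MHz alias to a low frequency in each channel (the SNDR peaks), while inputs near odd multiples of 62.5MHz alias to the per-channel Nyquist frequency (the SNDR dips). The 479MHz test tone aliases to 21MHz in each channel.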
Figure 5-34: Measured spectrum after background timing-skew calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal at 479MHz (plot annotations: single channel fin/fs = 479MHz/125MS/s, SNDR = 53dB, SFDR = 59.7dB; time-interleaved fin/fs = 479MHz/1GS/s, SNDR = 51.4dB, SFDR = 60.0dB)

Figure 5-35: Power breakdown of the chip (total power = 18.9mW)

Figure 5-36: Measured INL/DNL of the ADC: single channel result (left) and time-interleaved result (right)
Figure 5-37: Measured spectrum after foreground timing-skew calibration: single channel result (left) and time-interleaved result (right) with a Nyquist rate input signal (time-interleaved plot annotations: fin/fs = 479MHz/1GS/s, SNDR = 52.9dB, SFDR = 61.4dB)

Figure 5-38: Measured SNDR versus input frequency (curves: foreground calibration, background calibration, before calibration)

Figure 5-39: Measured SNDR versus input frequency (curves: single channel, TI channel with background calibration)

Figure 5-40: Measured SNDR versus input frequency from three different chips (curves: Chips 1 to 3, before and after calibration)

Table 5.1: Performance summary and comparison table

                                  ISSCC 2013  VLSI 2013  VLSI 2012   VLSI 2012  ISSCC 2011
                      This work   Hong        Chiang     Stepanovic  Sahoo      Mulder
Architecture          TI SAR      TI SAR      PIPE       TI SAR      PIPE       TI PIPE
Technology            65nm        45nm        65nm       65nm        65nm       40nm
Supply Voltage (V)    1.0         1.2         1.0        1.2         1.2        1.0/2.5
Fs (GS/s)             1.0         0.9         0.8        2.8         1.0        0.8
Resolution (bit)      10          9           10         11          10         12
SNDR @ Nyquist (dB)   51.4        51.2        52.2       48.2        52.4       59.0
Power (mW)            18.9        10.8        19.0       44.6        32.9       105
FoM (fJ/step)         62.3        40.5        71.4       75.8        96.6       180.2

Although this work demonstrates good SNDR at the targeted operating speed, the TI ADC was designed to achieve better than 58dB SNDR (60dB SNDR for a single channel) at 1GS/s.
As discussed in the previous part of this section, the SNDR degradation from the additional errors caused by the time-interleaved structure is about 1-2dB. However, the SNDR of a typical single channel is significantly worse than the simulation results and is a limiting factor of the overall SNDR. The output spectrum of a typical single channel in Figure 5-29 indicates that the SNDR of a single channel is limited by the noise power. However, based on simulation results, the thermal noise of the sampling capacitors and SAR comparators is well below the measured noise power.

One possible limiting factor is crosstalk among the time-interleaved channels. Crosstalk paths can occur through shared reference voltages, common power lines, and substrate coupling, and can increase the noise floor of the output spectrum. Due to the limited number of pins, reference voltages and power lines are shared by four channels in this chip. The best way to evaluate and measure the effects of crosstalk is to turn on only one channel and turn off all the other channels. Unfortunately, this testing mode is not implemented for this chip. Instead, an indirect measurement of the crosstalk is performed. Because the SAR ADCs in this work are controlled by asynchronous SAR logic, it is possible to avoid the overlap of the SAR operation among the channels. For asynchronous SAR ADCs, the time required for conversion is fixed and does not change with the sampling rate. Thus, with a significantly lower sampling rate (lower than 200MS/s for this chip), the SAR conversion of one channel ends before the next SAR conversion in another channel starts, as shown in Figure 5-41.

Figure 5-41: Timing waveform to avoid crosstalk among the channels that share power lines and reference voltages
This crosstalk hypothesis is consistent with the measurement result that higher than 58dB SNDR is achieved for a single channel at 200MS/s, as shown in Figure 5-42. The crosstalk effect is not observed in simulation. In simulation, large on-chip decoupling capacitors for the reference voltages and power lines absorb the noise caused by DAC switching and prevent crosstalk. However, we believe that parasitics make the decoupling capacitors less ideal. A simple solution is to have separate reference pins to the outside of the chip for all channels, but this is not usually possible. A more practical solution is to implement on-chip reference buffers for all channels. The crosstalk through power supply and ground can be alleviated with a fully differential implementation with high power supply rejection.

Figure 5-42: Measured spectrum at 200MS/s: single channel result (left) and time-interleaved result (right) with a low frequency input signal at 5MHz

Another possible error source leading to the increased noise floor of the single channel results is the limited driving capability of the external balun. As described in the measurement setup, an off-chip balun is used to generate the differential input signal and drive the input capacitors. When a balun drives a large switched-capacitor circuit, charge dumping and abrupt switching can cause unexpected transient behavior. This is different from the issue caused by the limited bandwidth of the sampling circuits, and it is difficult to model in simulation.

5.6 Summary

In this work, a time-interleaved structure is employed to improve the effective ADC sampling rate without sacrificing energy efficiency.
This work also proposes a background calibration approach for the timing-skew, which is the dominant error source in high-speed time-interleaved ADCs. After background calibration, 51.4dB SNDR, 59.1dB SFDR, and 1.0 LSB INL/DNL are achieved at 1GS/s with a Nyquist input signal. The power consumption is 18.9mW, which corresponds to 62.3fJ/step FoM.

Chapter 6 Conclusion

6.1 Thesis Contribution

This dissertation has investigated ADC design techniques that exploit the benefits of CMOS technology while overcoming previous limitations. There are two key contributions of this thesis.

The first contribution is the implementation of a voltage scalable zero-crossing based pipelined ADC. A zero-crossing based circuit technique is modified and optimized to improve the limited ADC resolution in nano-scaled CMOS technology. Compared to the previous bidirectional two-phase ramp approach, the proposed unidirectional charge-transfer scheme allows faster and more energy-efficient operation by eliminating unnecessary charging and discharging of the capacitors. Also, the reduced transient disturbance at the beginning of the fine charge-transfer phase improves the accuracy of operation. Power supply scaling improves power efficiency at low sampling rates, much like in digital circuits, and widens the conversion frequency range where the ADC operates with highest efficiency. At nominal supply voltage, 67.7dB SNDR (11.0 ENOB), 85.8dB SFDR, 1.2 LSB (12 bit) INL, and 0.4 LSB (12 bit) DNL are achieved at 50MS/s with a 23.9MHz input signal. The power consumption is 4.07mW, which corresponds to 41.0fJ/step FoM. Supply voltage scalability is demonstrated down to 0.5V. At the scaled supply voltage, the SNDR degrades only slightly to 66.3dB (10.7 ENOB) at 5MS/s and the FoM is improved to 28.0fJ/step. This
result strongly demonstrates that the ZCBC technique is an energy-efficient solution for high resolution ADCs in advanced CMOS technology.

The second contribution of this work is the design and implementation of a high-speed time-interleaved SAR ADC with background timing-skew calibration. A time-interleaved structure is employed to improve the effective sampling rate without sacrificing energy efficiency. SAR ADCs are used for each channel to make good use of device scaling. The proposed ADC architecture incorporates a flash ADC operating at the full sampling rate of the TI ADC. The flash ADC output is multiplexed to resolve the MSBs of the SAR channels. Because the full-speed flash ADC does not suffer from timing-skew errors, the flash ADC output is also used as the timing reference to estimate the timing-skew of the SAR ADCs. We believe that this calibration method is simpler than the prior work. 51.4dB SNDR, 59.1dB SFDR, and 1.0 LSB INL/DNL are achieved at 1GS/s with a Nyquist input signal. The power consumption is 18.9mW, which corresponds to 62.3fJ/step FoM. This work demonstrates the energy efficiency of a TI SAR ADC with background timing-skew calibration for high-speed applications.

6.2 Future Work

Although the prototypes implemented in this thesis have shown good results, there are still many opportunities for improvement.

For the zero-crossing based pipelined ADC, the current source non-linearity is a major source of error. Although the two-phase operation relaxes the linearity requirement, better solutions, such as correlated level shifting [23, 21], are necessary to achieve higher resolution. Due to the fully differential structure, the power supply noise rejection, common-mode rejection, and offset errors are typically less problematic. However, when the target resolution is high, for example 14 bit or higher, automatic cancellation or feedback control of the offset and common-mode is necessary. Reference [20] analyzed the cause and effect of power supply noise in a similar ZCB pipelined ADC, and reference [18] suggested a feedback control loop for the common-mode voltage control.
The offset errors of the ZCBC can be measured and stored on a capacitor during the off phase of the ZCD [71]. A parallel structure with a different charge transfer direction [72] is another solution to cancel the offset automatically.

Reference voltage settling is another issue for high-speed operation. Unlike opamp based circuits, ZCB pipelined ADCs require a preset phase before the charge transfer phase. In the preset phase, sampling capacitors are directly shorted between the reference voltages and the supply voltages. This causes a significant disturbance on the reference voltages, which must settle before the fine charge transfer phase. An on-chip reference buffer [21] and reference precharging [15] have been proposed in previous work to address this issue.

For the proposed time-interleaved SAR ADC, one block that has not been implemented on-chip is the digital computation logic for timing-skew estimation. To complete the background timing-skew calibration loop in real time, it is necessary to have a timing-skew estimator on-chip. Considering the speed of the TI ADC and the complexity of the calculation, it is important to optimize the area and power consumption of this block. It is also crucial to protect the ADCs from the noisy digital timing-skew estimator. A separate power supply for the digital circuits would alleviate this, while physical isolation with a high-resistivity substrate would reduce coupling through the substrate.

The algorithm to find the calibration code is worthy of further research. In Section 5.5.2, a linear searching algorithm is demonstrated as an example to accelerate convergence. However, it requires monotonicity of the delay adjustment over the entire programmable range. More advanced searching algorithms, such as gradient-based searching [73], may be a better solution to enhance the speed of the calibration.
To achieve an even higher effective sampling rate, the number of SAR channels can be increased. If the proposed ADC structure is maintained, the ultimate limit will be the speed of the flash ADC. Considering that state-of-the-art flash ADCs work well beyond 3GS/s, there is substantial room to increase the input bandwidth. However, the area of the chip is another limitation. Thus, a compact design and layout of the SAR ADC would be a precondition of this approach.

Although demonstrated with a conventional TI SAR structure, the proposed timing-skew calibration can be extended to other structures. In general, any time-interleaved ADC that has a coarse and fine quantization structure can incorporate the proposed timing-skew calibration. For example, a time-interleaved pipelined ADC can adopt this calibration method by sharing the flash ADC in the first stages.

Finally, supply voltage scaling can be applied to the proposed ADC. Previous work [30, 29] has demonstrated voltage scalability of single channel medium resolution SAR ADCs for better power efficiency. They show that the dominant switching power of SAR ADCs can be reduced significantly with voltage scaling. Although the noise contribution from the sampling capacitors and the comparator in a SAR ADC is increased due to the reduced signal power, SNDR is not degraded much at the scaled supply voltage. This is because the noise from the sampling capacitors and the SAR comparator is usually not a limiting factor for medium resolution SAR ADCs. As a result, FoM is improved at the scaled supply voltage. The same idea can be applied to the proposed TI SAR ADC. However, it is worthwhile to consider the impact of voltage scaling on the additional issues of the TI structure presented in Section 4.2. The offset mismatch of the proposed ADC is mainly determined by the SAR comparator offset. The SAR comparator offset is dominated by the threshold mismatches of the comparator input differential pair.
Although the threshold voltage mismatch does not change with the supply voltage, the relative error caused by threshold voltage mismatch increases at lower supply voltages due to the reduced signal power. Gain mismatch of a capacitive SAR ADC is determined by the capacitor matching and will not be affected by supply voltage scaling. Systematic timing-skew caused by unavoidable asymmetries in layout is fixed, regardless of supply voltage. In contrast, timing-skew from random threshold variation of transistors increases due to the slow transition edges in the clock generator. However, at lower supply voltage, the input bandwidth is expected to be lower, which relaxes the timing-skew requirement of TI ADCs. Thus, it is not easy to predict a general effect of supply voltage scaling on the timing-skew error, and it may depend on the specific design. The timing-skew detection methods presented in Section 5.3 can be used, but may require more data samples for accurate estimation due to the increased noise power.

Bibliography

[1] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, "Discrete-Time Signal Processing," Prentice Hall, 1999.
[2] D. A. Johns and K. Martin, "Analog Integrated Circuit Design," John Wiley & Sons, Inc., 1997.
[3] R. Walden, "Analog-to-Digital Converter Survey and Analysis," IEEE Journal on Selected Areas in Communications, vol. 17, no. 4, pp. 539-550, 1999.
[4] B. Murmann, "ADC Performance Survey 1997-2013," [Online]. Available: http://www.stanford.edu/~murmann/adcsurvey.html.
[5] B. Murmann and B. Boser, "A 12-bit 75-MS/s Pipelined ADC Using Open-Loop Residue Amplification," IEEE Journal of Solid-State Circuits, vol. 38, no. 12, pp. 2040-2050, Dec. 2003.
[6] B. Sahoo and B. Razavi, "A 10-b 1-GHz 33-mW CMOS ADC," IEEE Journal of Solid-State Circuits, vol. 48, no. 6, pp. 1442-1452, 2013.
[7] J. Hu, N. Dolev, and B.
Murmann, "A 9.4-bit, 50-MS/s, 1.44-mW Pipelined ADC Using Dynamic Source Follower Residue Amplification," IEEE Journal of Solid-State Circuits, vol. 44, no. 4, pp. 1057-1066, Apr. 2009.
[8] I. Ahmed, J. Mulder, and D. Johns, "A Low-Power Capacitive Charge Pump Based Pipelined ADC," IEEE Journal of Solid-State Circuits, vol. 45, no. 5, pp. 1016-1027, May 2010.
[9] S. Paul and H.-S. Lee, "A 9-b Charge-to-Digital Converter for Integrated Image Sensors," IEEE Journal of Solid-State Circuits, vol. 31, no. 12, pp. 1931-1938, Dec. 1996.
[10] M. Anthony, E. Kohler, J. Kurtze, L. Kushner, and G. Sollner, "A Process-Scalable Low-Power Charge-Domain 13-bit Pipeline ADC," IEEE Symposium on VLSI Circuits, 2008.
[11] B. Hershberg, S. Weaver, K. Sobue, S. Takeuchi, K. Hamashita, and U.-K. Moon, "Ring amplifiers for switched capacitor circuits," IEEE Journal of Solid-State Circuits, vol. 47, no. 12, pp. 2928-2942, 2012.
[12] J. Fiorenza, T. Sepke, P. Holloway, C. Sodini, and H.-S. Lee, "Comparator-Based Switched-Capacitor Circuits for Scaled CMOS Technologies," IEEE Journal of Solid-State Circuits, vol. 41, no. 12, pp. 2658-2668, Dec. 2006.
[13] L. Brooks and H.-S. Lee, "A Zero-Crossing-Based 8-bit 200 MS/s Pipelined ADC," IEEE Journal of Solid-State Circuits, vol. 42, no. 12, pp. 2677-2687, Dec. 2007.
[14] L. Brooks and H.-S. Lee, "A 12b, 50 MS/s, Fully Differential Zero-Crossing Based Pipelined ADC," IEEE Journal of Solid-State Circuits, vol. 44, no. 12, pp. 3329-3343, Dec. 2009.
[15] J. Chu, L. Brooks, and H.-S. Lee, "A Zero-Crossing Based 12b 100MS/s Pipelined ADC with Decision Boundary Gap Estimation Calibration," IEEE Symposium on VLSI Circuits, 2010.
[16] S.-K. Shin, Y.-S. You, S.-H. Lee, K.-H. Moon, J.-W. Kim, L. Brooks, and H.-S. Lee, "A Fully-Differential Zero-Crossing Based 1.2V 10b 26MS/s Pipelined ADC in 65nm CMOS," IEEE Symposium on VLSI Circuits, 2008.
[17] J. K.
Fiorenza, "A Comparator-Based Switched-Capacitor Pipelined Analog-to-Digital Converter," PhD thesis, Massachusetts Institute of Technology, June 2007.
[18] L. G. Brooks, "Circuits and Algorithms for Pipelined ADCs in Scaled CMOS Technologies," PhD thesis, Massachusetts Institute of Technology, June 2008.
[19] T. C. Sepke, "Comparator Design and Analysis for Comparator-Based Switched-Capacitor Circuits," PhD thesis, Massachusetts Institute of Technology, Sep. 2006.
[20] Y. J. Chu, "High Performance Zero-Crossing Based Pipelined Analog-to-Digital Converters," PhD thesis, Massachusetts Institute of Technology, June 2011.
[21] S.-K. Shin, J. Rudell, D. Daly, C. Munoz, D.-Y. Chang, K. Gulati, H.-S. Lee, and M. Straayer, "A 12b 200MS/s Frequency Scalable Zero-Crossing Based Pipelined ADC in 55nm CMOS," in IEEE Custom Integrated Circuits Conference, 2012, pp. 1-4.
[22] T. Musah, S. Kwon, H. Lakdawala, K. Soumyanath, and U.-K. Moon, "A 630μW Zero-Crossing Based Sigma-Delta ADC Using Switched-Resistor Current Sources in 45nm CMOS," IEEE Custom Integrated Circuits Conference, pp. 1-4, 2009.
[23] M. C. Guyton, "Low-Voltage ZCB Delta-Sigma Analog-to-Digital Converter," PhD thesis, Massachusetts Institute of Technology, June 2010.
[24] P. Lajevardi, A. Chandrakasan, and H.-S. Lee, "Zero-Crossing Detector Based Reconfigurable Analog System," IEEE Journal of Solid-State Circuits, vol. 46, no. 11, pp. 2478-2487, 2011.
[25] B. Hershberg, S. Weaver, and U.-K. Moon, "Design of a Split-CLS Pipelined ADC With Full Signal Swing Using an Accurate But Fractional Signal Swing Opamp," IEEE Journal of Solid-State Circuits, vol. 45, no. 12, pp. 2623-2633, 2010.
[26] S. Lee, A. Chandrakasan, and H.-S. Lee, "A 12 b 5-to-50 MS/s 0.5-to-1 V Voltage Scalable Zero-Crossing Based Pipelined ADC," IEEE Journal of Solid-State Circuits, vol. 47, no. 7, pp. 1603-1614, 2012.
[27] S.
Lee, "A Zero-Crossing Based Pipelined ADC with Voltage Scalability," S.M thesis, Massachusetts Institute of Technology, Sep. 2010. [28] A. Wang and A. Chandrakasan, "A 180-mV Subthreshold FFT Processor Using a Minimum Energy Design Methodology," IEEE Journal of Solid-State Circuits, vol. 40, no. 1, pp. 310-319, Jan. 2005. [29] C.-C. Liu, S.-J. Chang, G.-Y. Huang, and Y.-Z. Lin, "A 10-bit 50-MS/s SAR ADC With a Monotonic Capacitor Switching Procedure," IEEE Journal of SolidState Circuits, vol. 45, no. 4, pp. 731-740, Apr. 2010. [30] M. Yip and A. Chandrakasan, "A Resolution-Reconfigurable 5-to-10-Bit 0.4-to-1 V Power Scalable SAR ADC for Sensor Applications," IEEE Journal of SolidState Circuits, vol. 48, no. 6, pp. 1453-1464, 2013. [31] S. Weaver, B. Hershberg, D. Knierim, and U.-K. Moon, "A 6b Stochastic Flash Analog-to-Digital Converter without Calibration or Reference Ladder," IEEE Asian Solid-State Circuits Conference, pp. 373-376, 2008. [32] D. Daly and A. Chandrakasan, "A 6-bit, 0.2 V to 0.9 V Highly Digital Flash ADC With Comparator Redundancy," IEEE Journal of Solid-State Circuits, vol. 44, no. 11, pp. 3030-3038, Nov. 2009. [33] U. Wismar, D. Wisland, and P. Andreani, "A 0.2V, 7.5pW 20 kHz Sigma-Delta modulator with 69 dB SNR in 90 nm CMOS," IEEE European Solid State Circuits Conference, pp. 206-209, 2007. [34] J. Shen and P. Kinget, "A 0.5-V 8-bit 10-Ms/s Pipelined ADC in 90-nm CMOS," IEEE Journal of Solid-State Circuits, vol. 43, no. 4, pp. 787-795., Apr. 2008. [35] Y.-J. Kim, H.-C. Choi, Park, and J.-W. Kim, Low-Power 10b 0.13um cuits Conference, 2007, S.-W. Yoo, S.-H. Lee, D.-Y. Chung, K.-H. Moon, H.-J. "A Re-configurable 0.5V to 1.2V, 1OMS/s to 100MS/s, CMOS Pipeline ADC," in IEEE Custom Integrated Cirpp. 185-188. [36] A. Abo and P. Gray, "A 1.5-V, 10-bit, 14.3-MS/s CMOS Pipeline Analog-toDigital Converter," IEEE Journal of Solid-State Circuits, vol. 34, no. 5, pp. 599-606, may May. 1999. 129 [37] M. Dessouky and A. 
Kaiser, "Very Low-Voltage Digital-Audio Delta-Sigma Modulator with 88-dB Dynamic Range Using Local Switch Bootstrapping," IEEE Journal of Solid-State Circuits,vol. 36, no. 3, pp. 349-355, 2001. [38] L. Brooks and H.-S. Lee, "Background Calibration of Pipelined ADCs Via Decision Boundary Gap Estimation," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 10, pp. 2969-2979, 2008. [39] J. Black, W.C. and D. Hodges, "Time Interleaved Converter Arrays," IEEE Journal of Solid-State Circuits,vol. 15, no. 6, pp. 1022-1029, 1980. [40] M. El-Chammas and B. Murmann, "A 12-GS/s 81-mW 5-bit Time-Interleaved Flash ADC With Background Timing Skew Calibration," IEEE Journal of SolidState Circuits, vol. 46, no. 4, pp. 838-847, 2011. [41] K. Poulton, K. Knudsen, J. Kerley, J. Kang, J. Tani, E. Cornish, and M. VanGrouw, "An 8-GSa/s 8-bit ADC System," in IEEE Symposium on VLSI Circuits, 1997, pp. 23-24. [42] B. Setterberg, K. Poulton, S. Ray, D. Huber, V. Abramzon, G. Steinbach, J. Keane, B. Wuppermann, M. Clayson, M. Martin, R. Pasha, E. Peeters, A. Jacobs, F. Demarsin, A. Al-Adnani, and P. Brandt, "A 14b 2.5GS/s 8-Way Interleaved Pipelined ADC with Background Calibration and Digital Dynamic Linearity Correction," in IEEE InternationalSolid-State Circuits Conference Digest of Technical Papers, 2013, pp. 466-467. [43] D. Stepanovic and B. Nikolic, "A 2.8GS/s 44.6mW Time-Interleaved ADC Achieving 50.9dB SNDR and 3dB Effective Resolution Bandwidth of 1.5GHz in 65nm CMOS," in IEEE Symposium on VLSI Circuits, 2012, pp. 84-85. [44] S. Danesh, J. Hurwitz, K. Findlater, D. Renshaw, and R. Henderson, "A Reconfigurable lGSps to 250MSps, 7-bit to 9-bit Highly Time-Interleaved Counter ADC in 0.13um CMOS," in IEEE Symposium on VLSI Circuits, 2011, pp. 268269. [45] K. Lee, J. Chae, M. Aniya, K. Hamashita, K. Takasuka, S. Takeuchi, and G. 
Temes, "A Noise-Coupled Time-Interleaved Delta-Sigma ADC With 4.2 MHz Bandwidth, - 98 dB THD, and 79 dB SNDR," IEEE Journal of Solid-State Circuits, vol. 43, no. 12, pp. 2601-2612, 2008. [46] S. Jamal, D. Fu, M. Singh, P. Hurst, and S. Lewis, "Calibration of Sample-Time Error in a Two-Channel Time-Interleaved Analog-to-Digital Converter," IEEE Transactions on Circuits and Systems I, vol. 51, no. 1, pp. 130-139, 2004. [47] H. Jin and E. Lee, "A Digital-Background Calibration Technique for Minimizing Timing-Error Effects in Time-Interleaved ADCs," IEEE Transactionson Circuits and Systems II, vol. 47. no. 7, pp. 603-613, 2000. 130 [48] J. Elbornsson, F. Gustafsson, and J. E. Eklund, "Blind Adaptive Equalization of Mismatch Errors in a Time-Interleaved A/D Converter System," IEEE Transactions on Circuits and Systems I, vol. 51, no. 1, pp. 151-158, 2004. [49] K. Poulton, R. Neff, B. Setterberg, B. Wuppermann, T. Kopley, R. Jewett, J. Pernillo, C. Tan, and A. Montijo, "A 20 GS/s 8 b ADC with a 1 MB Memory in 0.18 ym CMOS," in IEEE InternationalSolid-State Circuits Conference, 2003, pp. 318-496 vol.1. [50] V. Divi and G. W. Nornell, "Blind Calibration of Timing Skew in TimeInterleaved Analog-to-Digital Converters," IEEE Journal of Selected Topics in Signal Processing,vol. 3, no. 3, pp. 509-522, 2009. [51] T. Laakso, V. Valimaki, M. Karjalainen, and U. Laine, "Splitting the Unit Delay [FIR/all pass filters design]," IEEE Signal Processing Magazine, vol. 13, no. 1, pp. 30-60, 1996. [52] M. Yoshioka, K. Ishikawa, T. Takayama, and S. Tsukamoto, "A 10b 50MS/s 820 pW SAR ADC with on-chip digital calibration," in IEEE International SolidState Circuits Conference Digest of Technical Papers, 2010, pp. 384-385. [53] J. Chu and H. S. Lee, "A 450 MS/s 10-bit Time-Interleaved Zero-Crossing Based ADC," in IEEE Custom Integrated Circuits Conference, 2011, pp. 1-4. [54] C. 
Vogel, "The impact of combined channel mismatch effects in time-interleaved ADCs," IEEE Transactions on Instrumentationand Measurement, vol. 54, no. 1, pp. 415-427, 2005. [55] A. Boni, A. Pierazzi, and D. Vecchi, "LVDS I/O Interface for Gb/s-per-pin Operation in 0.35um CMOS," IEEE Journal of Solid-State Circuits, vol. 36, no. 4, pp. 706-711, 2001. [56] B. Ginsburg and A. Chandrakasan, "An Energy-Efficient Charge Recycling Approach for a SAR Converter with Capacitive DAC," in IEEE InternationalSymposium. on Circuits and Systems, 2005, pp. 184-187 Vol. 1. [57] Y.-K. Chang, C.-S. Wang, and C.-K. Wang, "A 8-bit 500-KS/s Low Power SAR ADC for Bio-Medical Applications," in IEEE Asian Solid-State Circuits Conference, 2007, pp. 228-231. [58] V. Hariprasath, J. Guerber, S.-H. Lee, and U.-K. Moon, "Merged Capacitor Switching Based SAR ADC with Highest Switching Energy-Efficiency," Electronics Letters, vol. 46, no. 9, pp. 620-621, 2010. [59] W. Liu, P. Huang, and Y. Chiu, "A 12b 22.5/45MS/s 3.0mW 0.059mm 2 CMOS SAR ADC Achieving Over 90dB SFDR," in IEEE InternationalSolid-State Circuits Conference Digest of Technical Papers,2010, pp. 380-381. 131 [60] H. Wei, C.-H. Chan, U.-F. Chio, S.-W. Sin, U. Seng-Pan, R. Martins, and F. Maloberti, "A 0.024mm2 8b 400MS/s SAR ADC with 2b/cycle and Resistive DAC in 65nm CMOS," in IEEE InternationalSolid-State Circuits Conference Digest of Technical Papers, 2011, pp. 188-190. [61] Y.-C. Lien, "A 4.5-mW 8-b 750-MS/s 2-b/step Asynchronous Subranged SAR ADC in 28-nm CMOS technology," in IEEE Symposium on VLSI Circuits, 2012, pp. 88-89. [62] P. Harpe, E. Cantatore, and A. van Roermund, "A 2.2/2.7fJ/conversion-step 10/12b 40kS/s SAR ADC with Data-Driven Noise Reduction," in IEEE International Solid-State Circuits Conference Digest of Technical Papers, 2013, pp. 270-271. [63] T. Morie, T. Miki, K. Matsukawa, Y. Bando, T. Okumoto, K. Obata, S. Sakiyama, and S. 
Dosho, "A 71dB-SNDR 50MS/s 4.2mW CMOS SAR ADC by SNR Enhancement Techniques Utilizing Noise," in IEEE InternationalSolidState Circuits Conference Digest of Technical Papers, 2013, pp. 272-273. [64] S.-W. Chen and R. Brodersen, "A 6b 600MS/s 5.3mW Asynchronous ADC in 0.13mu CMOS," in IEEE InternationalSolid-State Circuits Conference Digest of Technical Papers, 2006, pp. 574-575. [65] Y.-Z. Lin, C.-C. Liu, G.-Y. Huang, Y.-T. Shyu, and S.-J. Chang, "A 9-bit 150MS/s 1.53-mW Subranged SAR ADC in 90-nm CMOS," in IEEE Symposium on VLSI Circuits, 2010, pp. 243-244. [66] P. Harpe, C. Zhou, Y. Bi, N. van der Meijs, X. Wang, K. Philips, G. Dolmans, and H. De Groot, "A 26 pW 8 bit 10 MS/s Asynchronous SAR ADC for Low Energy Radios," IEEE Journal of Solid-State Circuits, vol. 46, no. 7, pp. 15851595, 2011. [67] J. Craninckx and G. Van der Plas, "A 65fJ/Conversion-Step 0-to-50MS/s 0-to0.7mW 9b Charge-Sharing SAR ADC in 90nm Digital CMOS," in IEEE International Solid-State CircUits Conference, 2007, pp. 246-600. [68] J. Kim, B. Leibowitz, J. Ren, and C. Madden, "Simulation and Analysis of Random Decision Errors in Clocked Comparators," IEEE Transactions on Circuits and Systems I: Regular Papers,vol. 56, no. 8, pp. 1844-1857, 2009. [69] T. 0. Anderson, "Optimum Control Logic for Successive Approximation AD Converters," Computer Design, vol. 11, no. 7, pp. 81-86, 1972. [70] R. Reeder, "Wideband A/D Converter Front-End Design Considerations," [Online]. Available: http://www. analog.com/library/analogDialogue/archives/4007/transformer.pdf. [71] A. Chow and H.-S. Lee, "Offset Cancellation for Zero Crossing Based Circuits," IEEE InternationalSymposium on Circuits and Systems, 2010. 132 [72] S. Lee and H.-S. Lee, "An offset cancellation technique for zero-crossing based circuits," US Patent Pending. [73] M. El-Chammas and B. Murnann, "Background Calibration of Time-Interleaved Data Converters," Springer, 2012. 133