A 1.2GHz Adaptive Floating Gate Comparator with 13-bit Resolution

Yanyi Liu Wong, Marc H. Cohen and Pamela A. Abshire
yanyi.wong/marc.cohen/pamela.abshire@ieee.org
Institute for Systems Research, University of Maryland, College Park, MD 20742, U.S.A.

Abstract— We present a high-speed voltage comparator that uses floating gate adaptation to achieve high comparison resolution. The comparator uses nonvolatile charge storage for either offset nulling or automatic programming of a desired offset. We exploit the negative feedback functionality of pFET hot-electron injection to achieve fully automatic offset cancellation. The design has been fabricated in a commercially available 0.35µm process. Experimental results confirm the ability to reduce the variance of the comparator offset 3600× and to accurately program a desired offset with maximum observed residue offset of 469µV and standard deviation 199µV. We achieve controlled injection to accurately program the input offset to voltages uniformly distributed from −1V to 1V. The comparator operates at 1.2GHz with a power consumption of 2.97mW.

0-7803-8834-8/05/$20.00 ©2005 IEEE.

I. INTRODUCTION

Comparators are decision-making circuits that interface between analog and digital signals. Comparators are the core element for A/D converters, and oftentimes their performance directly affects that of the resulting A/D converters. Mismatches in the pre-amplifier and regenerative stage due to process variations cause offset that directly affects resolution of a comparator. A common approach used to cancel offset is dynamic switching, which adds switches and multiple nonoverlapping clocks, and excellent results have been reported [1]. Another approach for high speed operation without switching is on-chip averaging, which has been shown to reduce mismatch and boost resolution [2].

Since offset is a constant value, it is natural to store it using nonvolatile storage on a floating gate. Floating gate circuits have been used to cancel offsets in imagers [3], to trim current sources [4]–[6], and to autozero amplifiers [7], [8]. We previously introduced a simple floating gate comparator [9] with the ability to accurately trim and store desired zero or nonzero offsets, a feature that is not readily available using existing offset cancellation techniques. In this paper, we present the design and testing of a high-speed high-resolution comparator using floating gate adaptation.

II. BACKGROUND

A floating gate MOSFET uses an electrically isolated material to store charge indefinitely. In our comparator, the circuit offset is stored in this high-retention charge form, and altered by means of differential injection and tunneling. The injection mechanism has been extensively described in the literature [8], [10]. Impact-ionized hot-carrier injection occurs in p-type MOSFETs when two conditions are satisfied: a high lateral electric field EL across the channel to increase the likelihood of impact-ionization and produce high energy electrons, and a high vertical electric field EV across the gate oxide to sweep the hot electrons across the oxide barrier. For an ordinary pFET, it is relatively easy to achieve both conditions under normal operation. An accurate semiempirical model in [11] suggests that injection current scales as an exponential function of source-to-drain voltage Vsd. Using current sources and sinks, injection mechanisms have been exploited to satisfy different needs [3]–[10]. Our design approach is to use a current source at the source of the injection pFET to form a stable negative feedback loop that enables automatic and accurate adaptation [9].

Fig. 1. Simple five-transistor AFGC in [9].

III. ADAPTIVE FLOATING GATE COMPARATOR

Figure 1 shows a 5-transistor implementation of differential-mode hot-electron injection that uses local control and adapts the charge on the input pFETs’ floating gates [9]. Transistors M1,2 form the differential pair that compares the input voltages Vi+,− and at the same time are responsible for controlling injection. The adaptive element is integrated within the comparator itself.

In this paper we present an AFGC that separates the adaptive elements from the comparator core. Because the comparator itself does not carry out adaptation, its design can be much more flexible. We based our design on a 3-stage-pipelined high performance CMOS comparator [2] as the comparator core for our AFGC. Figure 2 (a), (b) and (c) show the three stage comparator core. Currents Ib1,2,3 provide tail currents for the 1st, 2nd and 3rd stages, respectively. The inputs to the 1st stage Vfg+,− are supplied by the floating gates of the nFET differential pair. Nodes n1+,− are the outputs of the 1st stage and the inputs to the 2nd stage. Nodes n2+,− are the outputs of the 2nd stage and the inputs to the 3rd stage. Nodes n3+,− are then fed to a latch that produces a rail-to-rail digital output do+. Clock c and its complement c̄ are supplied to the comparator. A separate analog Vdd (AVdd) of 3.3V is supplied to the comparator core.

Fig. 2. The comparator core consists of 3 stages.

Figure 3 shows one of the adaptive elements that is used to change the charge on one of the input floating gates. The inputs Vin+,− are the AFGC differential inputs, and they are coupled to comparator inputs Vfg+,− through capacitors Ci+,−. M2 is the injection transistor, with channel current supplied by M1.
M1 forms a current mirror with another diode-connected transistor and sets the channel current Ich. Capacitor Cp and diode D1 form a negative charge pump. During normal operation, node n4+ sits at the digital Vdd (DVdd), and n5+ sits around 0.65V since D1 is forward biased. Cp holds a voltage of DVdd − 0.65V. The maximum source-to-drain voltage Vsd on M2 is 2.65V, which is insufficient to produce impact-ionized hot-electron injection. When n4+ goes to ground, n5+ immediately goes to −DVdd + 0.65V. This increase in the Vsd across M2 causes injection to occur, and a small amount of charge is transferred onto the floating gate so that Vfg+ decreases by a small amount ∆V.

Fig. 3. The adaptive element for AFGC.

Fig. 4. The AFGC system with filtered output.

Suppose that injection occurs with inputs Vin+,− connected to some desired voltage Vin+ − Vin− = Vd, and that there is unknown charge on the floating gates Vfg+,−. Since Vfg+,− are capacitively coupled to the clamped inputs Vin+,−, they are held constant. We pulse n4+ to ground when the comparison outcome is positive (plus side greater than minus side), and pulse n4− to ground when the comparison outcome is negative. Differential injection therefore always occurs in the direction that reverses the outcome: by pulsing n4+ to ground, we decrease the gate voltage Vfg+ and move in the direction of a negative outcome. Eventually, the system reaches an equilibrium in which the comparison outcome alternates on every cycle, causing injection on alternating sides of the floating gate pair. After equilibrium has been established, the residual offset left on the floating gates after each injection is smaller than ∆V.
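The feedback rule described above can be sketched as a small behavioral simulation. This is a minimal sketch under stated assumptions, not a device model: the function name `adapt_offset`, the lumped offset `x_core`, the constant step `dv` and the noiseless comparator are all illustrative choices that do not appear in the paper, and the pipeline delay of the real comparator is ignored.

```python
def adapt_offset(v_d, x_core, dv=1e-4, n_pulses=2000):
    """Behavioral sketch of outcome-driven differential injection.

    v_d      -- desired offset applied across the clamped inputs (V)
    x_core   -- hypothetical lumped offset of the core plus initial
                floating-gate charge, referred to the input (V)
    dv       -- floating-gate step per injection pulse (V); assumed constant
    n_pulses -- number of compare-then-inject cycles to simulate
    Returns (programmed_offset, outcomes).
    """
    q_plus = 0.0    # injection-induced shift of Vfg+ (injection only lowers it)
    q_minus = 0.0   # injection-induced shift of Vfg-
    outcomes = []
    for _ in range(n_pulses):
        # Differential signal seen by an ideal, noiseless comparator.
        decision = v_d + (q_plus - q_minus) - x_core
        positive = decision > 0.0
        outcomes.append(positive)
        if positive:
            q_plus -= dv    # pulse n4+: lowers Vfg+, pushes toward a negative outcome
        else:
            q_minus -= dv   # pulse n4-: lowers Vfg-, pushes toward a positive outcome
    # Input-referred trip point after adaptation.
    programmed = x_core - (q_plus - q_minus)
    return programmed, outcomes


if __name__ == "__main__":
    programmed, outcomes = adapt_offset(v_d=0.0, x_core=33.685e-3)
    print(f"residual offset: {programmed * 1e6:.1f} uV")
    print("last outcomes:", outcomes[-6:])
```

In this idealized model the outcome alternates once equilibrium is reached and the programmed offset stays within one step `dv` of the target, which mirrors the qualitative behavior described in the text.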
As injection proceeds, the maximum Vsd for M2 decreases, so the incremental voltage change ∆V gradually diminishes, and we successfully program a precise desired offset Vd onto the differential floating gates of the AFGC. Setting Vd = 0, we achieve offset cancellation. This method of differential injection uses the outcome of comparison to correct offset; therefore the adaptation feedback loop encompasses all mismatch and offset within the circuit, and accurate offset adaptation can be achieved.

In the above description, injection is performed after every comparator outcome. In practice, we perform injection every 3 clock cycles. This is because the outcome immediately after an injection cycle is the old comparison result in the pipeline and should not be used to determine the update direction. After 3 clock cycles the pipeline is flushed and the outcome and injection are correctly aligned in time. We implement this delay with a finite state machine (shown as iFSM in Fig. 4).

Injection is a one-way process that lowers the floating gate voltages. To raise these floating gate voltages, we perform tunneling by grounding inputs Vin+,− and raising iVdd, the power supply voltage of the adaptive element, to 9.16V. The back gate (nWell) of M2 is also connected to iVdd and is therefore also raised to 9.16V. A large electric field now exists across the gate oxide at the side edge of M2, from nWell to the floating gates. This field tunnels electrons off the floating gate, raising the floating gate voltages.

IV. HARDWARE, EXPERIMENTS AND RESULTS

The AFGC was fabricated in a commercially available 0.35µm CMOS process and packaged in a DIP-40 ceramic package. As in all high-speed digital chips, several pins are dedicated to DVdd and GND to reduce ground bounce [12].
These pins are located near pin numbers 10 and 30, the center pins on the left and right side of the DIP-40 package, where the parasitic inductance in the package is minimum. Large area sandwiched layers of metal1, metal2, metal3, poly and poly2 form on-chip decoupling capacitors and at the same time satisfy the chemical-mechanical polishing (CMP) requirements for planarized processes. A 4-layer PCB was made with two inner layers dedicated to power and ground. We use surface mount ceramic capacitors to decouple the power rails. A PC-based DAC card supplies the common-mode voltage to the negative input Vin− and a precision voltage source (having 5µV steps) supplies the differential input Vi = Vin+ − Vin−. We use a fast, low-noise, fully differential voltage controlled oscillator (VCO) as the on-chip clock generator [13]. True single-phase clocked (TSPC) D-flip-flops are used as the building block for all synchronous logic circuits to enable a sampling rate above 1GHz [14].

Figure 4 shows the system architecture of the AFGC. We use a decimator to subsample the digital output by 64× to relax requirements on the output buffer that drives the pin Dout+. The 1.2GHz clock feeds the comparator core, the decimator and the iFSM. We enable adaptation by turning on inj_en. DVdd is supplied with 4.3V to boost the clock voltage to 4.3V because the clocked switches in the comparator core (Fig. 2) are too small to provide sufficient reset conductance with a 3.3V clock, which causes hysteresis. Future designs will remedy this by making all clocked switches bigger. We observe a maximum operating frequency of 1.2GHz with DVdd = 3.3V. Therefore, we confine the operating frequency to 1.2GHz even though we have the ability to operate faster for higher values of DVdd.

We low-pass-filter the subsampled digital output Dout+ with Rf = 25.6kΩ and Cf = 0.1nF to get a filtered output Vout. The cut-off frequency is 62.2kHz. We then sample Vout with PC-based data acquisition at 1kHz. We reduce noise by a factor of 2√10 in the measured voltage by averaging over 40 samples. Using the lowpass filter followed by averaging, the measured Vout approaches the average value of the comparator output Dout+. Therefore, the normalized quantity Vout/DVdd approaches the mean m for the comparison outcome, which includes the effects of deterministic offset as well as random noise.

Let X be the random variable representing the actual input offset, and suppose that the outcome of a comparison is zero ($D_0 = 0$) when the differential input signal $V_i = V_{in+} - V_{in-} < X$, and one ($D_1 = 1$) when $V_i > X$. Then m is equivalent to the cumulative distribution function (cdf) $p_1 = P[X < V_i]$, since $m = \sum_i p_i D_i = p_0 \cdot 0 + p_1 \cdot 1 = p_1$, where $p_0 = P[X > V_i]$. Empirically, we find that the distribution P[X < Vi] is Gaussian. Figure 5 shows the normalized measurement points Vout along with a least-square-fit Gaussian cdf curve. Here the measurement is taken after adaptation and has an offset of E[X] = −46µV.

Fig. 5. (a) The normalized Vout plotted against differential input voltage Vi and (b) the corresponding probability density function.

In addition to the floating gate comparator, there is also a comparator core with its inputs connected directly to external pins. We were able to measure the offset performance for this “bare” comparator and compare it to the AFGC. In the following, we present offset statistics for the bare comparator and for the AFGC before and after injection. We erase random charges on the floating gates by tunneling prior to adaptation. We measure 21 available chips and plot their offset distribution in Fig. 6. The mean µ and standard deviation σ statistics for the 21 measured offsets are: µ0 = 7.081mV and σ0 = 11.942mV for the bare comparator, µb = 33.685mV and σb = 25.246mV for the AFGC before injection, and µa = −6µV and σa = 199µV for the AFGC after injection.
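The improvement factors implied by these statistics can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the function name `improvement` is hypothetical, and counting one bit of resolution per factor of two in offset standard deviation is an assumption made here for illustration, not a formula stated in the paper.

```python
import math

# Offset statistics over the 21 measured chips, as quoted in the text (volts).
MU_BARE, SIGMA_BARE = 7.081e-3, 11.942e-3   # bare comparator
MU_AFTER, SIGMA_AFTER = -6e-6, 199e-6       # AFGC after injection


def improvement(sigma_ref, sigma_new):
    """Return (std ratio, variance ratio, bits gained).

    Bits gained is taken as log2 of the std ratio, i.e. one bit per
    halving of the offset standard deviation (an assumption).
    """
    ratio = sigma_ref / sigma_new
    return ratio, ratio ** 2, math.log2(ratio)


std_ratio, var_ratio, bits = improvement(SIGMA_BARE, SIGMA_AFTER)
mean_ratio = abs(MU_BARE / MU_AFTER)

print(f"std reduction:      {std_ratio:.1f}x")
print(f"variance reduction: {var_ratio:.0f}x")
print(f"bits gained:        {bits:.2f}")
print(f"mean reduction:     {mean_ratio:.0f}x")
```

Under these assumptions the standard deviation ratio comes out close to 60, the variance ratio close to 3600, and the gain just under 6 bits.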
These results clearly demonstrate the AFGC’s ability to reduce the mean offset by a factor of 1000 and to reduce the standard deviation by a factor of 60. Note that a mean approaching zero over a large number of samples simply indicates that the injection mechanism is balanced. The magnitude of the residual offset on the AFGC after adaptation is limited by the standard deviation. Thus, by reducing σ by a factor of 60, we gain approximately 6 bits in resolution compared to the bare comparator. An input offset with σa = 199µV for a 3.3V peak-to-peak sine wave input signal corresponds to an SNR of 81.4dB, or 13 effective bits.

Fig. 6. The input offset distribution over 21 chips for (a) the bare comparator, (b) AFGC before injection and (c) AFGC after injection.

We programmed one AFGC to 21 offset values evenly distributed from −1V to 1V. Figure 7 shows the residual offset (measured − programmed) versus the programmed offset. The magnitude of the residue is under 0.5mV over all programming voltages. The standard deviation for the residue is 178µV, which is comparable to the standard deviation obtained with 21 different chips. This demonstrates that the AFGC’s performance after adaptation holds for chip-to-chip variations, and is independent of programming voltage.

For the above experiments, we inject with Ich = 8µA. When we apply one short pulse (10µs) on the inj_en pin in Fig. 4, the offset shifts by 21.3mV. For higher Ich, we observe larger shifts. The offset shift is not linear in Ich because the negative voltage on node n5+ in Fig. 3 persists for less time as Ich increases. The offset stops shifting if we apply 32µA or more.
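The step from the SNR of 81.4dB quoted earlier in this section to 13 effective bits can be checked with the standard relation for an ideal N-bit quantizer driven by a full-scale sine wave, SNR = 6.02N + 1.76 dB. The paper does not state which conversion it used, so applying this relation is an assumption, and the function names below are hypothetical.

```python
def enob_from_snr_db(snr_db):
    """Effective number of bits from SNR in dB.

    Assumes the ideal-quantizer relation SNR = 6.02*N + 1.76 dB
    for a full-scale sine wave input.
    """
    return (snr_db - 1.76) / 6.02


def snr_db_from_enob(bits):
    """Inverse relation: SNR in dB of an ideal quantizer with `bits` bits."""
    return 6.02 * bits + 1.76


print(f"ENOB at 81.4 dB: {enob_from_snr_db(81.4):.2f} bits")
print(f"SNR at 13 bits:  {snr_db_from_enob(13):.2f} dB")
```

With this relation, 81.4dB maps to slightly more than 13 bits, and an ideal 13-bit quantizer corresponds to roughly 80dB.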
To quantify the ability of the AFGC to retain its post-adaptation stored offset voltage, we programmed three chips with 150mV, 0V and −150mV offset and, after three days, observed offset drifts of 3.935mV, 0.838mV and 0.219mV.

Fig. 7. The residue offset versus programming voltage.

V. CONCLUSION

A “bare” three stage comparator architecture was redesigned to include floating gate pFET adaptive circuit elements at the input stage. The output decision of this new floating gate comparator is used as the feedback control signal to guide the adaptation process so as to virtually eliminate all inherent circuit mismatches. The AFGC uses the mechanism of hot electron injection to adjust the voltages on the floating nodes of each of the input pFET transistors. When enabled, hot electron injection occurs during normal comparator operation. The AFGC was fabricated in a 0.35µm technology and, when supplied with 3.3V, it operates at 1.2GHz and consumes 2.97mW. Standard deviation of initial offset is reduced by a factor of 60, which translates into a 6 bit gain in resolution when compared with the “bare” comparator. An A/D converter using multiple AFGCs with input offset standard deviation of 199µV can achieve an SNR of about 80dB, which corresponds to 13 effective bits, when converting a 3.3V peak-to-peak sine wave. The performance of the adaptation is independent of chip-to-chip process variations and programming voltage. In addition to canceling offset, the AFGC can accurately store an arbitrary input offset, a feature not readily available in other offset cancellation schemes. Ongoing and future work will use our AFGC to implement new classes of adaptive data converters.

VI. ACKNOWLEDGEMENTS

We thank the MOSIS service for providing chip fabrication through their Educational Research Program. Y.W. is supported by Johns Hopkins University Applied Physics Laboratory. P.A. is supported by an NSF CAREER Award (NSF-EIA-0238061).

REFERENCES

[1] B. Razavi and B. Wooley, “Design techniques for high-speed, high-resolution comparators,” IEEE JSSC, vol. 27, no. 12, pp. 1916–1926, December 1992.
[2] M. Choi and A. Abidi, “A 6-b 1.3-Gsample/s A/D converter in 0.35-µm CMOS,” IEEE JSSC, vol. 36, no. 12, pp. 1847–1858, December 2001.
[3] M. Cohen and G. Cauwenberghs, “Floating-gate adaptation for focal-plane online nonuniformity correction,” IEEE TCAS.II, vol. 48, no. 1, pp. 83–89, January 2001.
[4] S. Shah and S. Collins, “A temperature independent trimmable current source,” in IEEE ISCAS, vol. 1, May 2002, pp. I713–I716.
[5] S. Jackson, J. Killens, and B. Blalock, “A programmable current mirror for analog trimming using single poly floating-gate devices in standard CMOS technology,” IEEE TCAS.II, vol. 48, no. 1, pp. 100–102, January 2001.
[6] J. Hyde, T. Humes, C. Diorio, M. Thomas, and M. Figueroa, “A 300MS/s 14-bit digital-to-analog converter in logic CMOS,” IEEE JSSC, vol. 38, no. 5, pp. 734–740, May 2003.
[7] P. Hasler, B. Minch, and C. Diorio, “An autozeroing floating-gate amplifier,” IEEE TCAS.II, vol. 48, no. 1, pp. 74–82, January 2001.
[8] T. Constandinou, J. Georgiou, and C. Toumazou, “An auto-input-offset removing floating gate pseudo-differential transconductor,” in IEEE ISCAS, vol. 1, May 2003, pp. 169–172.
[9] E. Wong, P. Abshire, and M. Cohen, “Floating gate comparator with automatic offset manipulation capability,” in IEEE ISCAS, vol. 1, May 2004, pp. I–529–532.
[10] P. Hasler and J. Dugger, “Correlation learning rule in floating-gate pFET synapses,” IEEE TCAS.II, vol. 48, no. 1, pp. 65–73, January 2001.
[11] K. Rahimi, C. Diorio, C. Hernandez, and M. Brockhausen, “A simulation model for floating-gate MOS synapse transistors,” in IEEE ISCAS, vol. 2, May 2002, pp. 532–535.
[12] H. W. Johnson and M. Graham, High-speed digital design: a handbook of black magic. Englewood Cliffs, NJ: Prentice Hall, 1993.
[13] L. Dai and R. Harjani, Design of high performance CMOS voltage-controlled oscillators. Boston, MA: Kluwer, 2003.
[14] J. Yuan and C. Svensson, “New single-clock CMOS latches and flip-flops with improved speed and power savings,” IEEE JSSC, vol. 32, no. 1, pp. 62–69, January 1997.