Sub-threshold Sense Amplifier (SA) Compensation Using Auto-zeroing Circuitry
Peter Beshay
Department of Electrical Engineering, University of Virginia, Charlottesville
Robust Low Power VLSI
01/21/2014

Outline
• Motivation
• Introduction
• DAZ Circuit
• 16kB SRAM
• Chip Measurements
• Conclusion

Motivation
[Figures. Sources: IdeaConnection.com, groups.csail.edu/, Implantable-device.com]

Motivation
• SRAMs are used in implantable devices.
• They contribute significantly to the total system-on-chip (SoC) power consumption.
[Figure: SRAM power consumption (1)]
(1) N. Verma, PhD thesis

Motivation
• Main limitations: process variation effects, slow speed.
• Minimum energy occurs in sub-threshold [1].
• $E_{active} = C V_{DD}^2$
• $E_{total}$/operation is minimized in sub-VT.
[Figure: Energy consumption vs. VDD (1). Axes: normalized energy vs. VDD (V)]
(1) N. Verma, PhD thesis

Motivation
• Work focus: minimizing the energy of the read operation of sub-threshold SRAMs.
• Sense amplifiers are used during the SRAM read operation.
• The intrinsic offset voltage of the SAs increases the read energy and degrades the performance of the SRAM read operation [2].

Sense Amplifier
• Vout = 1 if V1 > V2
• Vout = 0 otherwise
[Figure: analogy; SA symbol with inputs V1 and V2, output Vout, and Enable]

SA Offset Voltage
• Without offset: Vout = 1 if V1 > V2, Vout = 0 otherwise
• With offset: Vout = 1 if V1 > V2 + Voffset, Vout = 0 otherwise
[Figure: SA symbol with Voffset at its input; occurrences vs. Voffset]
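The decision rule with an offset can be sketched in a few lines of Python. This is only a behavioral illustration: the function name `sense`, the 30 mV offset and the input voltages are assumptions chosen for the example, not values from the slides.

```python
def sense(v1: float, v2: float, v_offset: float = 0.0) -> int:
    """Behavioral SA model: 1 if v1 exceeds v2 by more than the offset, else 0."""
    return 1 if v1 > v2 + v_offset else 0

# An SA without offset resolves a 20 mV differential input correctly.
ideal = sense(0.42, 0.40)

# With an assumed 30 mV offset, the same 20 mV input is misread.
skewed = sense(0.42, 0.40, v_offset=0.03)

print(ideal, skewed)
```

The sketch shows why the input differential has to exceed the offset before the SA is enabled.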
6T SRAM Read Operation
[Figure: 6T bitcell array with row decoder, wordline (WL), bitlines (BL, $\overline{BL}$) and sense amplifiers enabled by SAE]
• Bitlines precharged: BL = VDD, $\overline{BL}$ = VDD
• Wordline asserted: WL = VDD
• The accessed bitcell (storing 1 / 0) develops a differential voltage ∆V on BL, $\overline{BL}$
• The SA can resolve correctly once ∆V > Voffset
• SAE is asserted and the SA resolves
• Pre-charge: $E_{precharge} = C_{BL} V_{DD} \Delta V$
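As a rough numeric illustration of the precharge energy expression, the sketch below evaluates $E_{precharge} = C_{BL} V_{DD} \Delta V$ for two bitline swings. The bitline capacitance, supply voltage and swing values are assumed for illustration only and are not taken from the slides. Because the read requires ∆V > Voffset, a smaller offset permits a smaller swing and therefore less precharge energy.

```python
def precharge_energy(c_bl: float, vdd: float, delta_v: float) -> float:
    """Energy in joules to restore a bitline that dropped by delta_v back to vdd."""
    return c_bl * vdd * delta_v

C_BL = 100e-15  # assumed bitline capacitance (100 fF)
VDD = 0.5       # assumed sub-threshold supply voltage (0.5 V)

# Swing needed with a large offset vs. a small offset (both values assumed).
e_large = precharge_energy(C_BL, VDD, 0.050)
e_small = precharge_energy(C_BL, VDD, 0.020)

print(f"{e_large * 1e15:.2f} fJ vs {e_small * 1e15:.2f} fJ")
```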
PMOS-input Latch SA
[Figure: latch SA schematic with transistors M1 to M6, inputs BL and $\overline{BL}$, enable signals EN and $\overline{EN}$, outputs OUT and $\overline{OUT}$]
• Enable the SA.
• Sense the input voltage.
• Cross-coupled inverters latch the output.
• Precharge the output to VDD.

PMOS-input Latch SA
• Example: BL = 0.45 V, $\overline{BL}$ = 0.4 V
[Figure: EN, OUT and $\overline{OUT}$ waveforms; annotated level V = VDD]

Offset Voltage
$$I_D = I_0 \frac{W}{L} e^{\frac{V_{GS} - V_{th}}{n V_{thermal}}} \left(1 - e^{-\frac{V_{DS}}{V_{thermal}}}\right)$$
• $\Delta V_{th}$ mismatch causes the currents to differ even for a zero differential input (BL = $\overline{BL}$ = 0.5 V).
[Figure: latch SA schematic with BL = $\overline{BL}$ = 0.5 V]

Digital Auto-zeroing (DAZ)
• We propose a digital auto-zeroing (DAZ) scheme inspired by analog amplifier offset correction.
• The main advantages of the approach are:
  • Near-zero offset after cancellation.
  • Suitable for sub-threshold operation due to the repeated offset compensation phase.
• Several earlier attempts to tackle the problem include:
  • Redundancy [3]
  • Transistor upsizing [4]
  • Digitally controlled compensation [5]

Auto-zeroing in analog amplifiers
• Amplification is done in two phases:
  • Φ1: Sample the offset on a capacitor.
  • Φ2: Subtract the offset from the input signal.
[Figure: Dynamic offset cancellation (2)]
(2) K. Kang et al., "Dynamic Offset Cancellation Technique", cse.psu.edu/~chip/course/analog/insoo/S04AmpOffset.ppt

DAZ Scheme
• Phase 1 (ENR1): A zero differential input is applied to the sense amp.
• Phase 2 (ENO): The SA resolves based on its intrinsic offset.
[Figure: SA with Enable = 0, then Enable = 1 with Vout used to tune the SA]

DAZ Scheme
• Phase 3 (ENR2): The differential input is applied to the sense amp.
• Phase 4 (ENI): The SA resolves based on the differential input.
[Figure: SA with inputs BL and $\overline{BL}$, Enable = 0, then Enable = 1]
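The four phases can be mimicked by a small behavioral loop. This is a sketch under stated assumptions, not the circuit itself: the "tune the SA" action is modeled as a fixed-size correction step (`step`, a hypothetical parameter), the 30 mV intrinsic offset and the 0.5 V common-mode level are assumed, and the residual offset is taken to be the intrinsic offset minus the accumulated correction.

```python
def sense(v1: float, v2: float, v_offset: float) -> int:
    """Behavioral SA: 1 if v1 exceeds v2 by more than the offset, else 0."""
    return 1 if v1 > v2 + v_offset else 0

def daz_calibrate(intrinsic_offset: float, step: float = 0.002,
                  cycles: int = 50) -> float:
    """Repeat phases 1 and 2 and return the residual offset (assumed model)."""
    correction = 0.0
    for _ in range(cycles):
        residual = intrinsic_offset - correction
        # Phase 1 (ENR1): zero differential input, both inputs at 0.5 V (assumed).
        # Phase 2 (ENO): the SA resolves based on its remaining offset.
        out = sense(0.5, 0.5, residual)
        # Tune the SA: nudge the correction in the direction that balances it.
        correction += -step if out else step
    return intrinsic_offset - correction

def daz_read(bl: float, blb: float, residual_offset: float) -> int:
    """Phase 3 (ENR2): apply the differential input. Phase 4 (ENI): resolve."""
    return sense(bl, blb, residual_offset)

residual = daz_calibrate(0.030)  # assumed 30 mV intrinsic offset
before = daz_read(0.42, 0.40, 0.030)
after = daz_read(0.42, 0.40, residual)
print(before, after)
```

In this simplified model the residual offset ends up toggling within about one correction step around zero, so a smaller step gives a smaller residual at the cost of more calibration cycles.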
DAZ Circuit
[Figure: DAZ circuit schematic. Labels: control signals ENR1, ENR2, ENI, ENO and EN; inputs BL and $\overline{BL}$; outputs OUT and $\overline{OUT}$; transistors M1 to M6, MC1, MC2 and M9 to M13; capacitor Cp; charge pump]
• The DAZ circuit is applied to a latch-based sense amp with PMOS inputs.
• The DAZ circuit uses a split-phase clock and a charge pump (CP) feedback circuit for repetitive compensation.
• Transistors MC1 and MC2 control the drive strength of the right side of the SA.
• The CP controls the drive current in both MC1 and MC2 to equalize the strength of the right and left sides of the SA.

Phase 1
• ENR1: A zero differential input is applied to the sense amp.

Phase 2
• ENO: The SA resolves based on its intrinsic offset.

Phase 3
• ENR2: The differential input (∆V) is applied to the sense amp.

Phase 4
• ENI: The SA resolves based on the differential input.

Precision
• The precision of the scheme depends on the accuracy of setting the voltage on the output capacitor (Cp).
[Figure annotation: settling time = 60 µs]

Offset Tuning
• Accuracy (offset voltage) vs. settling time trade-off through Cp tuning.
[Figure: settling time (µs, 0 to 40) vs. minimum achieved offset (mV, 0 to 20) for Cp = 0.74 pF, 0.43 pF, 0.24 pF, 0.14 pF and 0.13 pF]

16kB SRAM Test-case
• A 20 mV DAZ SA is used in a 16 kB SRAM with 1 bank, 512 rows and 256 columns in a commercial 45 nm technology node [6].
• 10% reduction of the read energy
• 24% reduction of the read delay

Chip Measurements
• 45 nm technology test chip.
  • One regular SA array for benchmarking
  • DAZ SA array with Cp = 32 fF
• The DAZ circuit limits the absolute value of the maximum offset to 50 mV and provides an 80% improvement in σ [6].

Limitation
• Area overhead (major concern in SRAM designs)
  • 2.5X for 50 mV offset compensation
  • Can be significant for small offsets
• Energy overhead of the continuous calibration (split phases, charge pump)
  • 3.5X the energy of a regular SA
• Sensitivity to the split-phase frequency

Conclusion
• We proposed a circuit that is capable of reducing the sense-amp offset to near zero, which is valuable for sub-threshold operation due to the repeated calibration phase.
• Applying the scheme to a 16 kB SRAM in a 45 nm technology node showed a reduction in the total energy and delay of 10% and 24%, respectively.
• Measurements from a test chip fabricated in 45 nm technology showed the circuit's ability to limit the absolute maximum value of the offset voltage to 50 mV using a 32 fF output capacitance.

References
1. B. H. Calhoun et al., "Sub-threshold circuit design with shrinking CMOS devices," ISCAS 2009.
2. J. Ryan et al., "Minimizing Offset for Latching Voltage-Mode Sense Amplifiers for Sub-threshold Operation," ISQED 2008.
3. N. Verma et al., "A 256 kb 65 nm 8T Sub-threshold SRAM Employing Sense-Amplifier Redundancy," ISSCC 2008.
4. L. Pileggi et al., "Mismatch Analysis & Statistical Design," CICC 2008.
5. M. Bhargava et al., "Low-Overhead, Digital Offset Compensated, SRAM Sense Amplifiers," CICC 2009.
6. P. Beshay et al., "A Digital Auto-Zeroing Circuit to Reduce Offset in Sub-Threshold Sense Amplifiers," JLPEA 2013.

Questions