EEC 118 Lecture #18: Low Power Circuits, Future Directions, CMOS Alternative, Final Review Stanley Hsu 6/7/2012 Slides courtesy of Professor Rajeevan Amirtharajah, UC Davis and Jeff Parkhurst, Intel Corp. Permissions to Use Conditions & Acknowledgment • Permission is granted to copy and distribute this slide set for educational purposes only, provided that the complete bibliographic citation and following credit line is included: "Copyright 2002 J. Rabaey et al." Permission is granted to alter and distribute this material provided that the following credit line is included: "Adapted from (complete bibliographic citation). Copyright 2002 J. Rabaey et al." This material may not be copied or distributed for commercial purposes without express written permission of the copyright holders. Hsu/Amirtharajah, EEC 118 Spring 2012 2 Announcements • Final Exam on Monday 6/11, 1-3PM, Chem 166 • Bring: Calculator, Pencil, Eraser, One page of notes (both sides OK). – Note paper size must not exceed 8.5 in. x 11 in. Hsu/Amirtharajah, EEC 118 Spring 2012 3 Outline • Today: – Low Power Circuits – Emerging Technologies – Final Review Hsu/Amirtharajah, EEC 118 Spring 2012 4 Hsu/Amirtharajah, EEC 118 Spring 2012 5 Hsu/Amirtharajah, EEC 118 Spring 2012 6 Hsu/Amirtharajah, EEC 118 Spring 2012 7 Woren the power problem! Hsu/Amirtharajah, EEC 118 Spring 2012 8 How to reduce power 2 Pdynamic = α CV dd f • Reduce Vdd, f, C Q: What happens? Pstatic ≈ Pleakage = V dd I leakage • Reduce Vdd and Ileak Q:How to reduce Ileak? Hsu/Amirtharajah, EEC 118 Spring 2012 9 Total Power vs. Frequency 2 Ptotal = α CV dd f + V dd I leakage Total power Slope=? Y-intercept=? Frequency Hsu/Amirtharajah, EEC 118 Spring 2012 10 Pipeline Approach to Voltage Scaling Original data-path D LOGIC A LOGIC B output clock Amirtharajah, EEC 116 Fall 2011 11 Pipeline Approach Pipelined data-path D LOGIC A LOGIC B output clock • Shorter logic delay-> can run at lower supply voltage • Operate pipelined path at original frequency • Tradeoff power for a little more area and more (initial) latency by reducing voltage to meet fixed throughput Amirtharajah, EEC 116 Fall 2011 12 Pipeline Approach to Voltage Scaling • Start with a single design with two registers – Consider the logic in between allows freq = fmax • Now break the logic into N separate parts with equal delay – Separate each part by a register – Logic will be several times faster (New fmax = N x Old fmax) • Vdd can be lowered in order slow down logic to fit original fmax freq – However, additional capacitance of each register has been added. • Power savings could be as much as 80% once all things are considered Hsu/Amirtharajah, EEC 118 Spring 2012 13 Parallelization Driven Voltage Scaling Original data-path D LOGIC A LOGIC B output clock at frequency f Amirtharajah, EEC 116 Fall 2011 14 Parallelization Driven Voltage Scaling D LOGIC A LOGIC B LOGIC A LOGIC B f 2 • Parallelize computation up to N times • Reduce clock frequency by factor N • Reduce voltage to meet relaxed frequency constraint Amirtharajah, EEC 116 Fall 2011 15 Hardware Replication (Parallelism) • Create N redundant paths for data/logic • Input data sent to all path inputs – Outputs from the multiple paths arrive at same time • Have clock to each input register at Fclk/N • Use mux to select from all outputs • To reduce power – Reduce power supply voltage for each path • You can afford the slower speed since replication speeds up total circuit performance – Gate clocks (turn them off) for unused paths Amirtharajah, EEC 116 Fall 2011 16 Tradeoffs of Parallelization • Amount of parallelism in application may be limited • Extra capacitance overhead of multiple datapaths – N times higher input loading – N-to-1 selector on output – Lower clock frequency somewhat offset by higher clock load • Consumes more area, devices, more leakage power especially in deep submicron • Voltage reduction typically results in dramatic power gains – ~3X power reduction Amirtharajah, EEC 116 Fall 2011 17 Summary • Reducing power dissipation (besides scaling Vdd and frequency) – Pipeline approach – Hardware replication (parallelism) approach • Excellent paper on low power design: – Chandrakasan P. A., Sheng S., and Brodersen W. R., LowPower CMOS Digital Design, JSSC, 4/1992 Hsu/Amirtharajah, EEC 118 Spring 2012 18 Future Directions Alternatives to CMOS Digital Circuits Beyond CMOS • Past digital technologies – Electromagnetic Relay: mechanical switch controlled by a current; ideal switch but high power and slow – Vacuum Tube: three (usually) terminal device, similar to a MOSFET but very high power – Bipolar Transistor: high transconductance, but high power, finite input impedance limits fanout • State-of-the-art digital technology is bulk silicon CMOS – Low power, little static power (compared to other technologies) – Infinite input impedance allows large fanouts – Vast development investment has led to extremely high reliability, high yields, and high performance Hsu/Amirtharajah, EEC 118 Spring 2012 20 Digital Circuits Beyond CMOS • Modifications on CMOS can extend Moore’s Law – Strained silicon: mechanical stress on bulk crystal improves mobility – Silicon-on-Insulator: insulating bulk decreases parasitic capacitors, allows tighter integration – New materials: low-k interconnect dielectrics, high-k gate dielectrics, metal gates – New structures: 3D gates (finFET) • But what happens when scaling stops? – International Technology Roadmap for Semiconductors extends to about 20 nm gate lengths in 2015-2020 time frame – Research devices demonstrated down to 5 nm gate lengths Hsu/Amirtharajah, EEC 118 Spring 2012 21 Nanometer Gate Length Bulk FETs 10 nm FET 180 nm FET Doyle, Intel 2002 • Still more room at the bottom! • In 10 nm CMOS, Intel 386 occupies 25 µm x 25 µm • Fabricated using a combination of lithography and epitaxial growth to define feature sizes Hsu/Amirtharajah, EEC 118 Spring 2012 22 Three Dimensional Transistors: FinFETs Hisamoto, JSSC 2000 • Channel forms in thin silicon fin • Gate wraps around fin to control channel formation • Allows very small channel lengths Hsu/Amirtharajah, EEC 118 Spring 2012 23 Digital Circuits Beyond CMOS • New devices based on new materials or quantum effects – Gallium Arsenide (GaAs) MESFET: high electron mobility but no complementary device and poor oxide isolation – Josephson Junctions: superconducting logic and interconnect promises very high speed (THz), but only two terminals inconvenient for circuit design – Carbon Nanotube Transistors: carbon nanotube devices demonstrate high carrier mobilities and promise high speed – Molecular Electronics: single organic molecules shown to switch states in the lab, but also only two terminals – Organic Electronics: semiconducting plastic substrate, enables flexible displays and low cost but offers poor performance Hsu/Amirtharajah, EEC 118 Spring 2012 24 Alternatives to Electronics • A number of completely different digital technologies – Optical Computing: use photons to carry information instead of charge carriers, but no good three terminal nonlinear optical element and difficult integration – Quantum Computing: using various atomic-scale structures to store multiple bits simultaneously and operating on them using laws of quantum mechanics allows massive parallelism, but very sensitive to noise – Biological Computing: use DNA gene coding and promoter and repressor sites to control synthesis of proteins, which form the digital “signals” • But can anything replace CMOS? – Maybe, but not for a long time (CMOS is here to stay) – Some parts of a system (memory) before others (logic) Hsu/Amirtharajah, EEC 118 Spring 2012 25 Nano/Micro-Electro-Mechanical Relays Figure 1. Structure and operation of the MEM switch (SEM included) [1] Figure 2: 9- by 9-millimeter relay test chip, built at the University of California, Berkeley, contains a 100-relay multiplier [fourth row from bottom], designed by MIT researchers, that is the largest relay integrated circuit demonstrated to date. [2] Reference: [1] Stojanovic Vladimir, “Integrated Systems Group Research Webpage”. Online. http://www.rle.mit.edu/isg/research_project_nems.htm [2] H. Fariborzi, M. Spencer, V. Karkare, J. Jeon, R. Nathanael, C. Wang, F. Chen, H. Kam, V. Pott, T.-J. King Liu, E. Alon, V. Stojanovic´, D. Markovic´, "Analysis and Demonstration of MEM-Relay Power Gating," Custom Integrated Circuits Conference (CICC), Sept. 2010. Hsu/Amirtharajah, EEC 118 Spring 2012 26 Intel 22nm “Ivy Bridge” Processor • Multi-CPU and GPU SoC, 4 cores, 1.4 B xtrs. [3] • 3D “tri-gate” transistor types – Fast • nominal leakage – Medium • 1/4 leakage – Slow • 1/10 leakage Reference: [3] Damaraju, S.; George, V.; Jahagirdar, S.; Khondker, T.; Milstrey, R.; Sarkar, S.; Siers, S.; Stolero, I.; Subbiah, A.; , "A 22nm IA multi-CPU and GPU System-on-Chip," Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International Feb. 2012 Hsu/Amirtharajah, EEC 118 Spring 2012 27 Final Review Final Exam Review List Closed Book Closed Notes Calculator OK ONE 8.5 in. x 11 in. formula Sheet allowed, both sides Hsu/Amirtharajah, EEC 118 Spring 2012 29 Final Exam Review List (circuit level) • MOS transistor operation and bias (IV curves) • Static CMOS Design and Analysis – Inverter design and analysis (Vil, Vih, Vol, Voh, NM_low, NM_high, Vm), and VTC. – Delay calculation – Pseudo-NMOS design style and related issues. – Gate logic functions, truth table, switching function, circuit synthesis, transistor sizing, worst case path and timing analysis, parasitic capacitances and resistances, gate delay calculation, power consumption calculation, logical effort. • Dynamic, Domino, and Pass Transistor Logic – Gate design, implementation, sizing, and delay (RC) calculation – Issues and solutions in dynamic logic. Hsu/Amirtharajah, EEC 118 Spring 2012 30 Final Exam Review List (circuit level) • Sequential Circuit Design – Different flip-flop designs (see lecture) – Important timing parameters – Operating frequency estimation • Memory (DRAM, SRAM) – Bitcell and array structure, read and write operation, sizing calculations, required peripheral circuits • Low Power Circuits – Methods to reduce power consumption, and estimate power consumption due to scaling effects. • Arithmetic Circuits – Adder and multiplier critical path analysis Hsu/Amirtharajah, EEC 118 Spring 2012 31 Final Exam Review List (system level) • Power consumption – Dynamic, short circuit, leakage – How to estimate. How to compute transition probability. • Delay • Interconnect – Calculation of delay based on geometry. Effects on logic timing (skew). Miller effect/capacitance • Design Modeling and Abstraction Hierarchy • Chip design flow – Front-end to back-end, required steps and outcome of each step. • Study all homework, quiz, and lectures Hsu/Amirtharajah, EEC 118 Spring 2012 32 Final Exam Review List These are here to help focus your study but is not an exhaustive list of what you are responsible for Hsu/Amirtharajah, EEC 118 Spring 2012 33 Key Learnings • Should know what region of operation the transistor is in given the bias voltages at its terminals • Should know what the PMOS and NMOS Id vs Vds curves look like • Be able to identify major points of the VTC for a CMOS Device [Voh, Vol, Vih, Vil, Vm] • Need to know how to calculate the total capacitance at the output node (including Miller effect) • Know the relevant capacitances of a transistor used in transient analysis (i.e. Cgs, Cgd, …). (see HW5) Hsu/Amirtharajah, EEC 118 Spring 2012 34 Key Learnings • Know how to calculate propagation delay using Req and capacitance load. Be able to derive Req and Cload. • Know Pseudo NMOS and Pass Transistor Logic pros and cons • Know Different Adders and Multipliers – Know concepts of how they speed up these arithmetic units • Dynamic Logic concepts – Pros and cons, techniques used to cascade, avoid noise, etc. • Memory (different types, SRAM operation) • Low Power Design techniques (voltage scaling, pipelining, etc.) • Interconnect basics (resistance, capacitance) Hsu/Amirtharajah, EEC 118 Spring 2012 35 Key Learnings • Know how to find the Boolean function from a schematic • Know how to properly size transistors to get the equivalent resistance of a basic inverter • Know how to size devices and choose logic depth based on logical effort These are here to help focus your study but is not an exhaustive list of what you are responsible for Hsu/Amirtharajah, EEC 118 Spring 2012 36 Course Evaluations Thank you all for a great quarter! Hsu/Amirtharajah, EEC 118 Spring 2012 37