EE M216A Fall 2010
Design of VLSI Circuits and Systems
Prof. Dejan Marković
University of California, Los Angeles, USA
Email: ee216a@gmail.com

Course Description

This course focuses on advanced concepts of VLSI circuit and system design in state‐of‐the‐art CMOS technologies. Topics include:
– Circuit‐level optimization using gate size, supply and threshold voltage; layout of circuit blocks optimized for speed, power, or area.
– Advanced concepts of retiming, place and route will be employed in class projects, in addition to the design of custom blocks.
– The applications include micro‐processors, signal and multimedia processors, portable devices, memory and periphery.
– Course topics are continuously updated to track unique technological features such as power leakage, interconnect, clock and power distribution, and the impact of device variability on the design.
– This quarter, special focus will be given to design optimization and scaling.

EEM216A .:. Fall 2010 | Lecture 1: Introduction | D. Markovic

EE115C vs. EEM216A

EE115C (introductory material)
– Basic transistor and circuit models
– Basic circuit design styles and logic gates
– Design of custom blocks (adders, memories, …)
EEM216A (advanced material)
– Transistor models of varying accuracy
– Design under constraints: power, area, performance, robustness
– More advanced design techniques
– Learning challenges in the coming years
– Creating new solutions to challenging design problems

Class Topics

Fundamentals
– Technology and modeling
– Scaling and limits of scaling
Design for nano‐scale CMOS
– Static CMOS, transistor sizing, buffer design, high‐speed CMOS design styles (dynamic logic)
– Process variations, leakage
Design techniques for low power and low voltage
– Power minimization at technology, circuit, architecture levels
– Energy‐delay optimization
Arithmetic circuits
System‐level issues
– Timing strategies, logic synthesis
– Clock and power distribution
– Physical design

Teaching Staff

Instructor
– Prof. Dejan Marković
– Office hours
  ● Tu & Th 10:30‐11:45am
  ● 56‐147E Eng‐IV Bldg.
– Email: ee216a@gmail.com
MSOL TA
– Fang‐Li Yuan
Reader
– TBD
Admin
– Kim H
– Office: 56‐127CC Eng‐IV

Class Material

Textbook:
– J. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective, (2nd Edition), Prentice Hall, 2003.
Other books:
– N. Weste, D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, (3rd Edition), Addison Wesley, 2004.
– A. Chandrakasan, W. Bowhill, F. Fox, Design of High‐Performance Microprocessor Circuits, IEEE Press, 2001.
– W.J. Dally and J.W. Poulton, Digital System Engineering, Cambridge University Press, 1998.
– B. Wong, A. Mittal, Y. Cao, G.W. Starr, Nano CMOS Circuit and Physical Design, Wiley‐Interscience, 2004.
Selected papers:
– Available on classwiki
  ● Linked from IEEE Xplore (http://ieeexplore.ieee.org) (need to be logged in to a campus machine)

Other Sources

Core material
– IEEE Journal of Solid‐State Circuits (JSSC)
– IEEE International Solid‐State Circuits Conference (ISSCC)
– European Solid‐State Circuits Conference (ESSCIRC)
– Symposium on VLSI Circuits (VLSI)
– Custom Integrated Circuits Conference (CICC)
– Other conferences and journals
CAD topics
– International Conference on Computer Aided Design (ICCAD)
– Design Automation Conference (DAC)

Class Organization & Grading

Grading:
– Homeworks (4)    15%
– Labs (2)          4%
– Project          30%
– Midterm          25%
– Final exam       25%
– Course survey     1%

[Timeline figure: weeks 1 to 11. Milestones shown: H1, H2, H3, H4, M, L1, L2, S, F. Dates shown: Fri 10/8, Mon 10/18, Fri 10/29, Wed 11/3, Fri 11/12, Mon 11/22, Fri 11/26, Fri 12/10. Class project: Phase‐1, Phase‐2, Final PPT.]

Class Website

EEweb: grades only
Classwiki: notes, handouts, assignments, CAD tools, references, …

Classwiki

Create an account: use your UCLA username!

Homework #0 / Action Items

Get an EE account (if you haven’t already)
Sign up for classwiki
– Use your ee/seas username to sign up
– Once you sign up, I need to add you to the ee216a group
Server and CAD tool info is on the wiki

CAD Tools

Cadence & Synopsys software
– Phased out Electric software
– Online documentation and tutorials
90nm CMOS technology
– Cadence gpdk090 & gsclib & Synopsys generic 90nm library
– 9 metal layers
Important tools / skills from EE115C
– Design Capture: Virtuoso Schematic / Layout Editor
– Circuit Simulation: Spectre / Ocean
– Design Verification (DRC, LVS, Extraction): Assura/Diva/QRC

EEM216A Goals

Understanding the basic building blocks of VLSI
– Transistors/Wires
– Logic Gates and Layout
– Datapath Blocks
Be able to conceptually model a system
– Logic Optimization
– State Machine Design (RTL)
Be able to build a system (using a subset of the tools)
– Verilog Modeling
– Synthesis
– Place and Route
Understanding the constraints and tradeoffs
– Delay analysis (gates and interconnects)
– Clocking methodology
– System integration issues (Power/Ground routing, Noise)

EE M216A .:. Fall 2010
Lecture 1
Digital Integrated Circuit Design: Trends and Challenges
Prof. Dejan Marković
ee216a@gmail.com

Moore’s Law

In 1965, Gordon Moore noted that the number of transistors on a chip doubled every 18 to 24 months.
He predicted that the semiconductor industry would double its effectiveness every 18 months.

“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. Certainly over the short term, this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.” [G. Moore, Electronics, 1965]

Moore’s Law – 1965

[Figure]

Moore’s Law – 2005

[Figure]

Evolution in Complexity

[Figure]

Microprocessor Examples

Moore’s law
– Number of transistors
– Logic density
– Die size
– Frequency
– Power
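
As a quick check of the 1965 projection quoted on the Moore’s Law slide, the sketch below compounds a fixed doubling period over ten years. The starting count of 64 components in 1965 is an assumption (the slide does not state it), and the function and variable names are illustrative only.

```python
def project_components(start_count, years, doubling_period_years=1.0):
    """Compound one doubling every `doubling_period_years` over `years`."""
    return start_count * 2 ** (years / doubling_period_years)


# Assumption: roughly 64 components per chip in 1965 (not given on the slide).
components_1965 = 64

# Moore's quoted rate: a factor of two per year, 1965 to 1975.
components_1975 = project_components(components_1965, years=1975 - 1965)
print(f"Doubling every year: {components_1975:,.0f} components in 1975")

# The same ten years at the slower doubling periods mentioned on the slide.
for period in (1.5, 2.0):
    slower = project_components(components_1965, 10, period)
    print(f"Doubling every {period} years: {slower:,.0f} components in 1975")
```

Under that assumed starting point, ten annual doublings give 65,536 components, in line with the 65,000 in the quote; a two-year doubling period over the same decade would give only 2,048.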

#1: Number of Transistors

Transistors on lead microprocessors double every 2 years

[Figure: transistors (millions, log scale from 0.001 to 1000) vs. year, 1970 to 2010, for 4004, 8008, 8080, 8085, 8086 (P1), 286 (P2), 386 (P3), 486 (P4), Pentium® (P5), Pentium Pro (P6), Pentium 4. Annotation: 2X growth in 1.96 years! Source: S. Borkar (Intel)]

#2: Logic Density

Shrinks and compactions meet density goals
– New micro‐architectures drop density

[Figure: logic density (logic transistors/mm², log scale) vs. process generation, 1.5µ, 1.0µ, 0.8µ, 0.6µ, 0.35µ, 0.25µ, 0.18µ, 0.13µ, with a 2x trend line, for 386, i860, 486, Pentium, Pentium Pro, Pentium II. Source: Intel]

#3: Die Size Growth

Die size grows by 14% to satisfy Moore’s law

[Figure: die size (mm, log scale) vs. year, 1970 to 2010, for 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, Pentium Pro. Annotations: ~7% growth per year, ~2X growth in 10 years. Source: S. Borkar (Intel)]

#4: Frequency

Lead microprocessor frequency doubles every 2 years

[Figure: frequency (MHz, log scale from 0.1 to 10000) vs. year, 1970 to 2010, for 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, Pentium Pro, Pentium 4. Annotation: doubles every 2 years. Source: S. Borkar (Intel)]

Processor Frequency Trend

Frequency doubles each generation
– Number of gates/clock reduces by 25%

[Figure: frequency (MHz, log scale from 10 to 10,000) and gate delays/clock (1 to 100) vs. year, 1987 to 2005, for Intel (386, 486, Pentium, Pentium Pro, Pentium II), IBM PowerPC (601, 603, 604, 604+, MPC750) and DEC (21064A, 21066, 21164, 21164A, 21264, 21264S). Annotations: processor freq scales by 2X per generation; Game over! Source: V. De, S. Borkar, ISLPED’99]
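
The growth figures on the preceding slides (2X transistors in 1.96 years, ~7% die‐size growth per year, ~2X in 10 years) are tied together by compound‐growth arithmetic. The sketch below is only an illustration; the helper names are made up here, and reading the 14% on the die‐size slide as yearly area growth (7% per year in linear dimension, squared) is an interpretation, not something the slide states.

```python
import math


def doubling_time(annual_growth):
    """Years needed to double at a fixed fractional growth per year."""
    return math.log(2) / math.log(1 + annual_growth)


def annual_growth(doubling_years):
    """Fractional growth per year implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1


def growth_over(annual_rate, years):
    """Total growth factor after compounding `annual_rate` for `years`."""
    return (1 + annual_rate) ** years


# Die size: ~7% per year in linear dimension (from the slide).
print(f"7%/year over 10 years: {growth_over(0.07, 10):.2f}x")
print(f"Doubling time at 7%/year: {doubling_time(0.07):.1f} years")

# Interpretation (assumption): 7% per side is about 14% per year in area.
print(f"Area growth per year: {(1.07 ** 2 - 1) * 100:.1f}%")

# Transistor count: 2X in 1.96 years (from the slide).
print(f"Implied transistor growth: {annual_growth(1.96) * 100:.1f}% per year")
```

Compounding 7% for ten years gives a factor just under two, which is where the "~2X growth in 10 years" annotation comes from.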

Technology Roadmap (2002)

International Technology Roadmap for Semiconductors (ITRS)

Year                         2001   2003   2005   2007   2010   2013   2016
DRAM ½ pitch [nm]             130    100     80     65     45     32     22
MPU transistors/chip          97M   153M   243M   386M   773M  1.55G  3.09G
Wiring levels                   8      8     10     10     10     11     11
High-perf. phys. gate [nm]     65     45     32     25     18     13      9
High-perf. VDD [V]            1.2    1.0    0.9    0.7    0.6    0.5    0.4
Local clock [GHz]             1.7    3.1    5.2    6.7   11.5   19.3   28.8
High-perf. power [W]          130    150    170    190    218    251    288
Low-power phys. gate [nm]      90     65     45     32     22     16     11
Low-power VDD [V]             1.2    1.1    1.0    0.9    0.8    0.7    0.6
Low-power power [W]           2.4    2.8    3.2    3.5    3.0    3.0    3.0

Node years: 2007/65nm, 2010/45nm, 2013/32nm, 2016/22nm

Technology Scaling

[Figure, ISSCC data: (a) Power dissipation (W, log scale) vs. year, 1980 to 1995, for MPU and DSP. (b) Power density (mW/mm²) vs. scaling factor κ (normalized by 4µm design rule). Trend annotations: x4 / 3 years, x1.4 / 3 years, ∝ κ³, ∝ κ^0.7. Source: T. Kuroda]

New Trend: Parallel Hardware

Higher logic throughput, yet lower power

One logic block at Vdd
– Freq = 1
– Throughput = 1
– Active Power = 1
– SD Lkg Power = 1
Two logic blocks at 0.7 x Vdd
– Freq = 0.7
– Throughput = 1.4
– Active Power = 0.7
– SD Lkg Power = 0.7

Source: S. Borkar (Intel)

Dual Core

Rule of thumb
Voltage   Frequency   Power   Performance
1%        1%          3%      0.66%

In the same process technology…

Single core (Cache + Core)
– Voltage = 1
– Freq = 1
– Area = 1
– Power = 1
– Perf = 1
Dual core (Cache + Core + Core)
– Voltage = ‐15%
– Freq = ‐15%
– Area = 2
– Power = 1
– Perf = ~1.8

Source: S. Borkar (Intel)
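
The numbers on the Parallel Hardware and Dual Core slides can be approximately reproduced with a few lines of arithmetic. The sketch below makes two assumptions that the slides do not spell out: active power scales as blocks x Vdd² x f with switched capacitance held fixed, and the rule of thumb (1% voltage gives 1% frequency, 3% power, 0.66% performance) is applied linearly all the way to a 15% change. All function names are hypothetical.

```python
def active_power(vdd, freq, blocks=1):
    """Relative active power, assuming P ~ blocks * Vdd^2 * f."""
    return blocks * vdd ** 2 * freq


def throughput(freq, blocks=1):
    """Relative throughput of identical blocks working in parallel."""
    return blocks * freq


# Parallel hardware: two blocks at 0.7 x Vdd and 0.7 x frequency.
parallel_power = active_power(0.7, 0.7, blocks=2)
parallel_throughput = throughput(0.7, blocks=2)
print(f"Parallel: throughput {parallel_throughput:.2f}, power {parallel_power:.2f}")


def rule_of_thumb(voltage_change_pct, cores=1):
    """Linear rule of thumb: 1% voltage -> 3% power, 0.66% performance."""
    power_per_core = 1 + 3.0 * voltage_change_pct / 100
    perf_per_core = 1 + 0.66 * voltage_change_pct / 100
    return cores * power_per_core, cores * perf_per_core


# Dual core: voltage and frequency both lowered by 15%.
dual_power, dual_perf = rule_of_thumb(-15, cores=2)
print(f"Dual core (linear rule): power {dual_power:.2f}, perf {dual_perf:.2f}")

# For comparison, a cubic model (power ~ V^2 * f with f ~ V) for two cores.
print(f"Dual core (cubic model): power {2 * 0.85 ** 3:.2f}")
```

The parallel case lands close to the slide's Throughput = 1.4 and Active Power = 0.7. For the dual core, the linear rule gives a performance of about 1.8 at a power slightly above 1; the cubic model gives a somewhat higher power, so the slide's Power = 1 is best read as approximate.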

Future Multi‐core Platform

Heterogeneous Multi‐Core Platform (SOC)

[Figure: array of general‐purpose cores (GPC) and special‐purpose cores (SPC). Legend: General Purpose Cores, Special Purpose HW, Interconnect fabric. Source: S. Borkar (Intel)]

Software Challenge

[Figure. Source: ITRS 2007]

Impact of Process Variations

130nm data (getting worse with scaling)

[Figure: normalized frequency (0.9 to 1.4) vs. normalized leakage (1 to 5). Annotations: Frequency ~30%, Leakage Power ~5‐10X. Source: S. Borkar (Intel)]

Implications

Reliability
– Extreme variations (Static & Dynamic) will result in unreliable components
– Impossible to design reliable systems as we know them today
  ● Transient errors (Soft Errors)
  ● Gradual errors (Variations)
  ● Time dependent (Degradation)
Test
– One‐time factory testing will be out
– Burn‐in to catch chip infant‐mortality will not be practical
– Test HW will be part of the design
– Dynamically self‐test, detect errors, reconfigure, & adapt

Source: S. Borkar (Intel)

In a Nut‐shell…

100 BT integration capacity
– 100 Billion Transistors
– Billions unusable (variations)
– Some will fail over time
– Intermittent failures
Yet, deliver high performance in the power & cost envelope…

Source: S. Borkar (Intel)

Parallel Data Processing

Power limited technology scaling
– Increased impact of process variations
– More leakage power, multiple threshold devices
Single dimensional → Multidimensional data
– Multi‐core Processors (IBM / Sony / Toshiba)
– MIMO Communications (Belkin)
– Neuroscience (www.sci.utah.edu)

Energy‐Delay Optimization

Same principle, different optimization goals

Processors
– Maximize performance
– Highest VDD required
Communications
– Minimize energy & area
– Typically, sensitivity ~ 1
Neuroscience
– Power density: 0.8 mW/mm²
– Aggressive VDD scaling

[Figure: energy vs. delay curve traced by VDD scaling, with operating regions marked Processors, Communications, Neural]

ASICs on The Road to Extinction?

[Figure]

The Age of Concurrency and Flexibility

– UCB Pleiades (heterogeneous reconfigurable fabric)
– Xilinx Virtex‐4
– Intel Montecito
– ARM
– AMD DualCore
– NTT Video codec (4 Tensilica cores)
– IBM/Sony Cell Processor

Courtesy: J. Rabaey (UCB)

FPGAs going Multi‐core…

BEE2 compute module
– 14”x17” 22 layer PCB

Courtesy: J. Wawrzynek (UCB)

Moore’s Law and the Long Term

[Figure: Moore’s law trend from 1965 to 2005 and beyond. Questions posed: What level? When? Within your working life?]

Silicon Technology Reaches Nanoscale

[Figure. Source: Intel]

Sub‐wavelength Optical Lithography

[Figure]

Scaling Toward 10 nm Node

[Figure: technology nodes 65nm, 45nm, 32nm, 22nm, 16nm, 12nm, 5 nm, moving from Bulk/SOI CMOS to Multi‐gate CMOS to Post‐Silicon. Source: K. Cao (ASU)]

Technology: scaling, alternative structures and materials, post‐silicon devices
Design: billion transistors, GHz operation

Design of Nanoelectronics

[Figure: device timeline with years 1948, 1958, 1993, 2006, 2008, 2010; labels: [TI], [IBM] Carbon Nanotube FET, [UCB], Coming soon, ???]

Moore’s Law Challenge

Double transistors every two years
Stay within the expected power trend
Still deliver the expected performance
Power‐limited scaling regime
Two key issues:
– Design complexity
– Power efficiency
Looking at solutions to these challenges is what this course is all about!