ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems
Day 29: November 18, 2011
Dynamic Logic
Penn ESE370 Fall2011 -- DeHon

Today
• Memory Energy wrapup
• Dynamic Logic
  – Strategy
  – Form
  – Compare CMOS

Memory
• What fraction of memory cells is involved in a read/write?
• What are most cells doing on a cycle?
• Reads are slow
  – Cycles are long, so there is lots of time to leak

ITRS 2009 45nm
            High Performance   Low Power
Isd,leak    100 nA/µm          50 pA/µm
Isd,sat     1200 µA/µm         560 µA/µm
Cg,total    1 fF/µm            0.91 fF/µm
Vth         285 mV             585 mV
C0 = 0.045 µm × Cg,total

High Power Process
• V=1V, d=1000, γ=0.5, Waccess=Wbuf=2
• Full swing for simplicity
• Csc = 0
  – (just for simplicity, typically <Cload)
• BL: Cload = 1000 C0 ≈ 45 fF = 45×10⁻¹⁵ F
• WN = 2, Ileak = 9×10⁻⁹ A
• P = (45×10⁻¹⁵) freq + 1000×9×10⁻⁹ W

Relative Power
• P = (45×10⁻¹⁵) freq + 1000×9×10⁻⁹ W
• P = (4.5×10⁻¹⁴) freq + 9×10⁻⁶ W
• Crossover: leakage dominates for freq < 200 MHz
• How does partial swing on the bit line change this?
  – Reduces dynamic energy
  – Increases percentage in leakage energy
  – Reduces crossover frequency

Consequence
• Leakage energy can dominate in large memories
• Care about low operating (or stand-by) power
• Use process or transistors with high Vth
  – Reduce leakage at expense of speed

Memory in Processors
• Most of the area on modern processors is memory
  – Often accounts for 80-90% of transistors
• Example: Intel 6-core processor
  – 1.9 Billion transistors
  – 25MB of L3 + L2 cache
    • 25×10⁶ × 8 bits/byte × 6 tr/bit = 1.2 Billion
  – Plus L1 memories, RF, branch predict, reorder …..
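The bit-line power arithmetic from the High Power Process and Relative Power slides can be checked with a short sketch. The numbers are the slide's own; the variable names, and the reading of Cload as the access-transistor diffusion load of d cells (d × γ × Waccess × C0), are assumptions made here to reproduce the 1000 C0 figure.

```python
# Sketch of the slide's memory power estimate (ITRS 2009 45nm, high-performance column).
# Names are illustrative; the Cload decomposition below is an assumption.
V = 1.0                 # supply voltage (V), full swing assumed as on the slide
d = 1000                # cells sharing the bit line
gamma = 0.5             # diffusion-to-gate capacitance ratio
w_access = 2            # access transistor width, in minimum widths
w_n = 2                 # leaking NMOS width, in minimum widths
w_min_um = 0.045        # minimum width (um)
cg_total = 1e-15        # gate capacitance per width (F/um)
isd_leak = 100e-9       # leakage current per width (A/um)

c0 = w_min_um * cg_total                 # minimum-device gate capacitance (F)
c_load = d * gamma * w_access * c0       # = 1000 C0, about 45 fF
i_leak = isd_leak * w_min_um * w_n       # about 9 nA per cell
p_leak = d * i_leak * V                  # about 9 uW for the column
energy_per_cycle = c_load * V ** 2       # about 45 fJ switched per access

def total_power(freq_hz):
    """P = C V^2 f + d * Ileak * V, as on the Relative Power slide."""
    return energy_per_cycle * freq_hz + p_leak

# Frequency at which dynamic power equals leakage power.
crossover_hz = p_leak / energy_per_cycle

print(f"Cload = {c_load:.3e} F, Ileak = {i_leak:.3e} A")
print(f"crossover = {crossover_hz / 1e6:.0f} MHz")
```

Below the crossover frequency the leakage term is the larger of the two, which is the sense of "freq < 200 MHz" on the slide.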
Dynamic Logic

Motivation
• Like to avoid driving pullup/pulldown networks
  – reduce capacitive load
    • Power, delay
• Ratioed had problems with
  – Large device for ratioing
  – Slow pullup
  – Static power

Idea
• Use clock to disable pullup during evaluation

Discuss
• Use clock to disable pullup during evaluation
• What happens when
  – /pre=0, A=B=0?
  – /pre=1, A=B=0?
  – /pre=1, A=1, B=0?
• Sizing implication?
• Concerns?
• Requirements?

Advantages
• Large device
  – Driven by clock not data/logic
  – Can pullup quickly w/out putting load on logic
• Single network
  – Pulldown
  – Don't have to size for ratio with pullup
  – Swings rail-to-rail

Domino Logic

Domino
• Everything charged high
  – After inverter all inputs low
• Why do we want this?
• Disabled, waiting for an enabling transition

Requirements
• Single transition
  – Once fires, it is done, like domino falling
• All inputs at 0 during precharge
  – Precharge to 1 so inversion makes 0
• Non-inverting gates
Image: http://en.wikipedia.org/wiki/File:Domino_effect.jpg

Issues
• Noise sensitive
• Power?
• Activity?

Domino or4

Domino Logic
• Performance
  – R0/2 input
• Compare to CMOS cases?
  • nor4
  • or4
  • nand4

Dynamic OR4
• Precharge time?
• Driving input
  – With R0/2
• Driving inverter and self cap?
• Output self delay?

Class ended here

CMOS NOR4
• Driving input
  – With R0/2
• Driving self cap?
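The precharge/evaluate discipline from the Domino and Requirements slides can be illustrated with a logic-level sketch of a domino OR4 stage. This is a behavioral model only, assuming a parallel NMOS pulldown followed by a static inverter; the class and method names are hypothetical, and timing, charge sharing, leakage, and noise are not modeled.

```python
class DominoOr4:
    """Logic-level sketch of a domino OR4 stage (assumed structure, no timing)."""

    def __init__(self):
        self.node = 1  # dynamic node; starts precharged high

    def precharge(self):
        # /pre = 0: the clocked pullup restores the dynamic node to 1.
        # Inputs are required to be 0 here, so no pulldown path fights it.
        self.node = 1

    def evaluate(self, a, b, c, d):
        # /pre = 1: the pullup is off; any high input opens a pulldown path.
        if a or b or c or d:
            self.node = 0  # discharged; stays low until the next precharge
        return self.output

    @property
    def output(self):
        # The static inverter after the dynamic node makes the stage non-inverting.
        return 1 - self.node


gate = DominoOr4()
gate.precharge()
after_precharge = gate.output          # 0: output low, ready to drive the next stage
idle = gate.evaluate(0, 0, 0, 0)       # 0: node floats high, nothing fires
fired = gate.evaluate(0, 1, 0, 0)      # 1: one input rises, the domino falls
held = gate.evaluate(0, 0, 0, 0)       # 1: input glitching back low cannot undo it
gate.precharge()
reset = gate.output                    # 0: only a new precharge resets the stage
```

The `held` case is the single-transition requirement: once the node has discharged, the output cannot return low within the same evaluation phase, which is why inputs must rise at most once and gates must be non-inverting.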
CMOS NAND4
• Driving input
  – w/ R0/2
• Driving self cap?

Delay Roundup
Circuit      Precharge   Input   Driving Inv or Self   Output Delay
Or4 domino   2γ+1/3      3/2     15γ+2
Nor4 cmos    n/a         5       20γ
Nand4 cmos   n/a         5       20γ
For same output drive strength (R0/2), comparable or lower self load and lower input loading.

Discuss (time permitting)
• Avoid inversion?
• Converting from CMOS?
• Post-charge

Observe
• Better (lower) ratio of input capacitance to drive strength
• Particularly good for
  – Driving large loads
  – Large fanin gates
• Harder to design with
  – Timing and polarity restrictions
  – Avoiding noise
    • Especially with today's high variation tech.
• Can consume more energy/op

Admin
• Project 2: Due Wednesday
  – Hope you are well along
  – Optimization push over weekend
• Monday: in Detkin Lab
  – See posted lab description
  – Teams assigned

Idea
• Dynamic/clocked logic
  – Only build/drive one network
  – Fast transition propagation
  – Spend delay (capacitance) on pullup off critical path of logic
  – More complicated, power
• Reserve for when most needed
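To put numbers on the Delay Roundup, the sketch below evaluates its expressions for a chosen γ (written g in the raw slide text). It assumes the entries read column by column: Or4 domino has precharge 2γ+1/3, input 3/2, and inverter-plus-self 15γ+2, while both CMOS gates have no precharge, input 5, and self 20γ. The Output Delay column is left out because the slides give no values for it, and γ = 0.5 is borrowed from the memory slides as an illustrative choice.

```python
def roundup(gamma):
    """Evaluate the Delay Roundup entries (assumed column reading) for a given gamma."""
    cmos = {"precharge": None, "input": 5.0, "inv_or_self": 20 * gamma}
    return {
        "or4_domino": {
            "precharge": 2 * gamma + 1 / 3,   # off the logic's critical path
            "input": 3 / 2,
            "inv_or_self": 15 * gamma + 2,
        },
        "nor4_cmos": dict(cmos),
        "nand4_cmos": dict(cmos),
    }


table = roundup(0.5)
for circuit, row in table.items():
    print(circuit, row)

# Input loading of the domino gate relative to the CMOS gates.
input_ratio = table["or4_domino"]["input"] / table["nor4_cmos"]["input"]
```

With γ = 0.5 the self-load terms come out close to each other while the domino input term is well under the CMOS one, which matches the slide's remark about comparable or lower self load and lower input loading.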