ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems

Day 29: November 18, 2011
Dynamic Logic
Penn ESE370 Fall2011 -- DeHon
Today
• Memory Energy wrapup
• Dynamic Logic
– Strategy
– Form
– Compare CMOS
Memory
• What fraction of memory cells is
involved in a read/write?
• What are most cells doing on a cycle?
• Reads are slow
– Cycles long → lots of time to leak
ITRS 2009 45nm
Parameter   High Performance   Low Power
Isd,leak    100 nA/µm          50 pA/µm
Isd,sat     1200 µA/µm         560 µA/µm
Cg,total    1 fF/µm            0.91 fF/µm
Vth         285 mV             585 mV
C0 = 0.045 µm × Cg,total
High Power Process
• V=1V d=1000 g=0.5 Waccess=Wbuf=2
• Full swing for simplicity
• Csc = 0
– (just for simplicity, typically <Cload)
• BL: Cload = 1000 C0 ≈ 45 fF = 45×10^-15 F
• WN = 2 → Ileak = 9×10^-9 A
• P = (45×10^-15) freq + 1000×9×10^-9 W
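The numbers on this slide can be reproduced from the ITRS high-performance figures. The following is a minimal sketch, assuming full swing, V = 1 V, and leakage that scales linearly with transistor width; the variable and function names are illustrative and not from the lecture.

```python
# Sketch of the bit-line power estimate; names are illustrative assumptions.
FEATURE_UM = 0.045           # 45 nm feature size, in micrometers
CG_TOTAL_F_PER_UM = 1e-15    # high-performance gate capacitance, 1 fF/um
ISD_LEAK_A_PER_UM = 100e-9   # high-performance leakage, 100 nA/um
VDD = 1.0                    # supply voltage, volts
DEPTH = 1000                 # d: cells hanging on the bit line
W_N = 2                      # leaking transistor width, in feature sizes

c0 = FEATURE_UM * CG_TOTAL_F_PER_UM             # unit capacitance C0
c_load = DEPTH * c0                             # bit-line load, 1000 C0
i_leak = W_N * FEATURE_UM * ISD_LEAK_A_PER_UM   # leakage current per cell


def column_power(freq_hz):
    """Dynamic plus leakage power (watts) for one full-swing bit line."""
    dynamic = c_load * VDD ** 2 * freq_hz
    leakage = DEPTH * i_leak * VDD
    return dynamic + leakage
```

Calling `column_power(0)` isolates the leakage term, which does not depend on frequency.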
Relative Power
• P = (45×10^-15) freq + 1000×9×10^-9 W
• P = (4.5×10^-14) freq + 9×10^-6 W
• Crossover at 200 MHz: leakage dominates for freq < 200 MHz
• How does partial swing on the bit line change this?
– Reduces dynamic energy
– Increases the percentage of energy going to leakage
– Raises the crossover frequency (leakage dominates up to a higher frequency)
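The crossover is the frequency at which the dynamic term equals the leakage term. A minimal sketch using the full-swing numbers from this slide; the function names are illustrative assumptions.

```python
# Sketch: find where dynamic power equals leakage power (full swing).
E_DYNAMIC_J = 4.5e-14   # energy per cycle, C * V^2 with C = 45 fF and V = 1 V
P_LEAK_W = 9e-6         # total leakage power of the 1000 cells


def crossover_frequency(energy_per_cycle_j, leakage_power_w):
    """Frequency (Hz) at which dynamic power equals leakage power."""
    return leakage_power_w / energy_per_cycle_j


def leakage_fraction(freq_hz):
    """Share of total power that is leakage at the given frequency."""
    dynamic = E_DYNAMIC_J * freq_hz
    return P_LEAK_W / (dynamic + P_LEAK_W)


f_cross = crossover_frequency(E_DYNAMIC_J, P_LEAK_W)
print(f"crossover: {f_cross / 1e6:.0f} MHz")
print(f"leakage share at 100 MHz: {leakage_fraction(100e6):.2f}")
```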
Consequence
• Leakage energy can dominate in large
memories
• Care about low operating (or stand-by)
power
• Use process or transistors with high Vth
– Reduce leakage at expense of speed
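The ITRS 2009 45nm figures quoted earlier give a rough sense of this trade. A minimal sketch; only ratios are computed, so each pair of values just needs to share a unit, and the dictionary keys are illustrative.

```python
# Sketch: compare the high-performance (HP) and low-power (LP) options.
# Leakage is in pA per unit width (100 nA = 100000 pA); saturation current
# uses the table's unit for both entries.
HP = {"leak_pa": 100_000, "sat": 1200, "vth_mv": 285}
LP = {"leak_pa": 50, "sat": 560, "vth_mv": 585}

leak_reduction = HP["leak_pa"] / LP["leak_pa"]   # how much less the LP device leaks
drive_reduction = HP["sat"] / LP["sat"]          # how much drive current is given up

print(f"leakage drops about {leak_reduction:.0f}x")
print(f"drive current drops about {drive_reduction:.1f}x")
```

The higher-Vth device gives up roughly half its drive current in exchange for a much larger cut in leakage.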
Memory in Processors
• Most of the area on modern processors
is memory
– Often accounts for 80-90% of transistors
• Example: Intel 6-core processor
– 1.9 Billion transistors
– 25MB of L3 + L2 cache
• 25×10^6 × 8 bits/byte × 6 tr/bit = 1.2 billion
– Plus L1 memories, RF, branch prediction,
reorder, …
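The cache estimate above is simple arithmetic. A minimal sketch, counting only the storage cells of the L2 and L3 caches (no decoders, sense amplifiers, or other memories); the constant names are illustrative.

```python
# Sketch: estimate how many transistors the L2 + L3 cache cells account for.
CACHE_BYTES = 25e6          # 25 MB of L3 + L2 cache (slide figure)
BITS_PER_BYTE = 8
TRANSISTORS_PER_BIT = 6     # six transistors per bit, as on the slide
TOTAL_TRANSISTORS = 1.9e9   # whole chip (slide figure)

cache_transistors = CACHE_BYTES * BITS_PER_BYTE * TRANSISTORS_PER_BIT
cache_fraction = cache_transistors / TOTAL_TRANSISTORS

print(f"{cache_transistors:.2e} cache transistors, {cache_fraction:.0%} of the chip")
```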
Dynamic Logic
Motivation
• Like to avoid driving pullup/pulldown
networks
– reduce capacitive load
• Power, delay
• Ratioed logic had problems with
– Large device for ratioing
– Slow pullup
– Static power
Idea
• Use clock to disable pullup during evaluation
Discuss
• Use clock to disable pullup during evaluation
• What happens when
– /Pre=0, A=B=0?
– /Pre=1, A=B=0?
– /Pre=1, A=1, B=0?
• Sizing implication?
• Concerns?
• Requirements?
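The three cases above can be walked through with a small behavioral model. This is only a sketch: the slide's circuit figure is not reproduced here, so the pulldown network (A OR B, parallel nMOS) and the absence of a foot transistor are assumptions, and all names are hypothetical.

```python
# Behavioral sketch of a precharged gate with no foot (evaluate) transistor.
# ASSUMPTION: the pulldown network conducts when A OR B is high; the real
# network on the slide may differ.

def pulldown_conducts(a, b):
    """Hypothetical pulldown: conducts when either input is high."""
    return bool(a or b)


def dynamic_gate(pre_n, a, b, previous_out):
    """Return the output node state for one set of input values.

    pre_n is the active-low precharge signal (/Pre).
    """
    pulling_down = pulldown_conducts(a, b)
    if pre_n == 0:
        # Precharge pMOS is on; a conducting pulldown would fight it.
        return "contention" if pulling_down else 1
    if pulling_down:
        return 0            # evaluate: node discharges
    return previous_out     # evaluate: node floats and holds its charge


print(dynamic_gate(0, 0, 0, previous_out=0))  # precharge case
print(dynamic_gate(1, 0, 0, previous_out=1))  # node holds its precharged value
print(dynamic_gate(1, 1, 0, previous_out=1))  # discharges under the OR assumption
```

The "contention" result marks the case where inputs are high during precharge, and the floating hold state relies on stored charge.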
Advantages
• Large device
– Driven by clock not data/logic
– Can pull up quickly without putting load on logic
• Single network
– Pulldown
– Don’t have to size for ratio with pullup
– Swings rail-to-rail
Domino Logic
Domino
• Everything charged high
– After inverter all inputs low
• Why do we want this?
• Disabled, waiting for an enabling transition
Requirements
• Single transition
– Once it fires, it is done → like a domino falling
• All inputs at 0 during precharge
– Precharge to 1 so inversion makes 0
• Non-inverting gates
(Image: http://en.wikipedia.org/wiki/File:Domino_effect.jpg)
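These requirements can be illustrated with a behavioral model of a chain of domino stages. A minimal sketch, assuming each stage is a dynamic OR of the previous stage's output and one primary input, followed by a static inverter; the structure and names are illustrative, not the lecture's circuit.

```python
# Sketch of a chain of domino OR stages (dynamic gate plus inverter).

def precharge(num_stages):
    """Dynamic nodes charge to 1, so every inverter output is 0."""
    return [0] * num_stages


def evaluate(primary_inputs, num_stages):
    """Ripple the evaluation through the chain, one stage after another.

    Stage i computes OR(previous stage output, primary_inputs[i]).
    Outputs start at 0 after precharge and can only rise.
    """
    outputs = precharge(num_stages)
    previous = 0
    for i in range(num_stages):
        dynamic_node = 0 if (previous or primary_inputs[i]) else 1
        outputs[i] = 1 - dynamic_node   # static inverter restores polarity
        previous = outputs[i]
    return outputs


print(precharge(4))
print(evaluate([0, 1, 0, 0], 4))
```

Once one stage fires, every later stage in this model fires as well, which is the domino behavior the slide describes.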
Issues
• Noise sensitive
• Power?
• Activity?
Domino or4
Domino Logic
• Performance
– R0/2 input
• Compare to CMOS cases?
• nor4
• or4
• nand4
Dynamic OR4
• Precharge time?
• Driving input
– With R0/2
• Driving inverter and self cap?
• Output self delay?
Class ended here
CMOS NOR4
• Driving input
– With R0/2
• Driving self cap?
CMOS NAND4
• Driving input
– With R0/2
• Driving self cap?
Delay Roundup
Circuit      Precharge   Input   Driving Inv or Self Output Delay
Or4 domino   2g+1/3      3/2     15g+2
Nor4 cmos    n/a         5       20g
Nand4 cmos   n/a         5       20g
For the same output drive strength (R0/2), domino has
comparable or lower self load and lower input loading.
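To compare the roundup entries numerically, the expressions can be evaluated for a chosen g. A minimal sketch, assuming the entries are in normalized units and using g = 0.5 as in the memory example earlier; the dictionary keys are illustrative labels for the table columns.

```python
# Sketch: evaluate the roundup expressions for a given g.
# "self" labels the third value column (driving inverter or self load).

def roundup(g):
    """Return the table entries as numbers; None marks n/a."""
    return {
        "or4 domino": {"precharge": 2 * g + 1 / 3, "input": 3 / 2, "self": 15 * g + 2},
        "nor4 cmos": {"precharge": None, "input": 5, "self": 20 * g},
        "nand4 cmos": {"precharge": None, "input": 5, "self": 20 * g},
    }


for name, row in roundup(0.5).items():
    print(name, row)
```

In these expressions the domino self term (15g+2) is at or below the CMOS term (20g) whenever g is at least 0.4, and the input load is lower for any g.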
Discuss (time permitting)
• Avoid inversion?
• Converting from CMOS?
• Post-charge
Observe
• Better (lower) ratio of input capacitance
to drive strength
• Particularly good for
– Driving large loads
– Large fanin gates
• Harder to design with
– Timing and polarity restrictions
– Avoiding noise
• Especially with today’s high variation tech.
• Can consume more energy/op
Admin
• Project 2: Due Wednesday
– Hope you are well along
– Optimization push over weekend
• Monday: in Detkin Lab
– See posted lab description
– Teams assigned
Idea
• Dynamic/clocked logic
– Only build/drive one network
– Fast transition propagation
– Spend delay (capacitance) on the pullup, off the critical path of logic
– More complicated, can cost more power
• Reserve for when most needed