Memory Optimization at Design Time

Optimizing Power @ Design Time
Memory
Benton H. Calhoun
Jan M. Rabaey
Low Power Design Essentials ©2008
Chapter 7
Role of Memory in ICs
• Memory is very important
• Focus in this chapter is embedded memory
• Percentage of area going to memory is increasing
[Ref: V. De, Intel 2006]
Processor Area Becoming Memory Dominated
• On-chip SRAM contains 50-90% of total transistor count
– Xeon: 48M/110M
– Itanium 2: 144M/220M
• SRAM is a major source of chip static power dissipation
– Dominant in ultra-low power applications
– Substantial fraction in others
[Figure: Intel Penryn™ die photo, SRAM region labeled (picture courtesy of Intel)]
Chapter Outline
• Memory Introduction
• Power in the Cell Array
• Power for Read Access
• Power for Write Access
• New Memory Technologies
Basic Memory Structures
[Ref: J. Rabaey, Prentice’03]
SRAM Metrics
• Functionality
– Data retention
– Readability
– Writability
– Soft errors
• Area
• Power
Why is functionality a “metric”?
• Process variations increase with scaling
• Large number of cells requires analysis of tails (out to 6σ or 7σ)
• Within-die VTH variation due to Random Dopant Fluctuations (RDFs)
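Why the tails matter can be seen with a rough calculation. The sketch below assumes, purely for illustration, that a cell's margin is Gaussian and that a cell fails once it lies more than a given number of σ from the mean on one side; the function names are hypothetical and not from the source.

```python
import math


def tail_probability(n_sigma):
    """One-sided Gaussian tail probability P(X > n_sigma * sigma)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def expected_failing_cells(num_cells, n_sigma):
    """Expected number of cells beyond n_sigma (assumes independent Gaussian cells)."""
    return num_cells * tail_probability(n_sigma)


if __name__ == "__main__":
    one_megabit = 2 ** 20  # number of cells in a 1 Mb array
    for n_sigma in (4, 5, 6, 7):
        count = expected_failing_cells(one_megabit, n_sigma)
        print(f"{n_sigma} sigma: {count:.3e} expected failing cells in 1 Mb")
```

Under these assumptions a 1 Mb array expects roughly 33 failing cells if cells only tolerate 4σ, but about 0.001 at 6σ, which is why large arrays push the analysis out to 6σ or 7σ.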
Where Does SRAM Power Go?
• Numerous analytical SRAM power models
• Great variety in power breakdowns
• Different applications cause different components of power to dominate
• Hence: the breakdown depends on the application, e.g. high speed versus low power, portable
SRAM cell
Three tasks of a cell
• Hold data
– WL=0; BLs=X
• Write
– WL=1; BLs driven with new data
• Read
– WL=1; BLs precharged and left floating
[Figure: Traditional 6-Transistor (6T) SRAM cell with transistors M1-M6, storage nodes Q and QB, wordline WL, and complementary bitlines BL and BLB]
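The three tasks can be captured in a logic-level sketch. This is a hypothetical behavioral model (the class `SixTCell` and its `access` method are illustrative names, not from the source), and it deliberately ignores every electrical effect such as noise margins, ratioed fights, and bitline capacitance.

```python
class SixTCell:
    """Logic-level stand-in for a 6T cell; no electrical behavior is modeled."""

    def __init__(self, q=0):
        self.q = q  # stored value on node Q; QB is its complement

    @property
    def qb(self):
        return 1 - self.q

    def access(self, wl, bl=None, blb=None):
        """Apply one access. bl/blb set to None means the bitlines are not driven."""
        if wl == 0:
            return None  # hold: access transistors off, bitlines are don't-care
        if bl is not None and blb is not None:
            if bl == blb:
                raise ValueError("write needs complementary bitline values")
            self.q = bl  # write: driven bitlines overpower the cell
            return None
        # read: bitlines were precharged and left floating, the cell sets them
        return (self.q, self.qb)


if __name__ == "__main__":
    cell = SixTCell()
    cell.access(wl=1, bl=1, blb=0)  # write a 1
    cell.access(wl=0)               # hold
    print(cell.access(wl=1))        # read back (BL, BLB)
```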
Key SRAM cell metrics
Key functionality metrics
• Hold
– Static Noise Margin (SNM)
– Data retention voltage (DRV)
• Read
– Static Noise Margin (SNM)
• Write
– Write Margin
Metrics:
• Area is primary constraint
• Next: Power, Delay
[Figure: Traditional 6-Transistor (6T) SRAM cell with transistors M1-M6, storage nodes Q and QB, wordline WL, and complementary bitlines BL and BLB]
Static Noise Margin (SNM)
• SNM gives a measure of the cell’s stability by quantifying the DC noise required to flip the cell
• SNM is the length of the side of the largest square that can be embedded in the butterfly curve
[Figure: 6T cell (M1-M6, BL, BLB, WL) redrawn as cross-coupled inverters Inv 1 and Inv 2 with series noise sources VN at nodes Q and QB; butterfly plot of QB (V) versus Q (V), 0 to 0.3 V, showing VTC for Inv 2, VTC⁻¹ for Inv 1, and both curves with VN = SNM]
[Ref: E. Seevinck, JSSC’87]
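The "DC noise required to flip the cell" view can be sketched numerically. The code below is a minimal illustration, not a device-level simulation: it assumes both inverters share an idealized tanh-shaped VTC (the `steepness` parameter and all function names are hypothetical), applies the same adverse noise VN to both inverter inputs, and bisects on the largest VN for which the cell still keeps its state.

```python
import math


def inverter_vtc(vin, vdd, steepness):
    """Idealized symmetric inverter VTC (assumed tanh shape, peak gain = steepness/2)."""
    return 0.5 * vdd * (1.0 - math.tanh(steepness * (vin - 0.5 * vdd) / vdd))


def retains_state(vn, vdd, steepness, iterations=5000):
    """Relax the cross-coupled loop from Q=VDD, QB=0 with adverse noise vn."""
    q, qb = vdd, 0.0
    for _ in range(iterations):
        qb = inverter_vtc(q - vn, vdd, steepness)  # Inv 2 input: Q lowered by VN
        q = inverter_vtc(qb + vn, vdd, steepness)  # Inv 1 input: QB raised by VN
    return q > qb  # True if the stored value survived


def static_noise_margin(vdd=0.3, steepness=8.0, tol=1e-5):
    """Largest DC noise (V) the cell tolerates, found by bisection on vn."""
    lo, hi = 0.0, 0.5 * vdd
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if retains_state(mid, vdd, steepness):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print(f"SNM = {static_noise_margin():.4f} V")
```

With these assumed defaults the result is roughly 0.08 V at VDD = 0.3 V; a steeper VTC gives a larger margin. Real cells need simulated VTCs that include the access transistors for the read case.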
Static Noise Margin with Scaling
• Typical cell SNM deteriorates with scaling
• Variations lead to failure from insufficient SNM
[Figure: technology and VDD scaling lower SNM; variations worsen the tail of the SNM distribution. Results obtained from simulations with Predictive Technology Models]
[Ref: PTM; Y. Cao ‘00]
Variability: Write Margin
• Dominant fight (ratioed)
[Figure: 6T cell (BL, BLB, WL) during a write, plus three butterfly plots of normalized QB versus normalized Q, 0 to 1: cell stability prior to write; write failure: positive SNM; successful write: negative “SNM”]
Variability: Cell Writability
[Figure: write SNM (V), -0.25 to 0.05, versus temperature (°C), -40 to 120, at VDD = 0.6V for process corners TT, WW, SS, WS, SW; the positive-SNM region is labeled “Write Fails”]
Write margin limits VDD scaling for 6T cells to 600mV, best case.
• 65nm process, VDD = 0.6V
• Variability and the large number of cells make this worse
Cell Array Power
• Leakage power dominates while the memory holds data
• Importance of gate tunneling and GIDL depends on technology and voltages applied
[Figure: 6T cell (BL, BLB, WL) storing ‘0’ and ‘1’, with sub-threshold leakage paths marked]
Using Threshold Voltage to Reduce Leakage
• High VTH cells necessary if all else is kept the same
• To keep leakage in a 1-Mb memory within bounds, VTH must be kept in the [0.4, 0.6] V range
[Figure: 1-Mb array retention current (A), 10⁻⁸ to 10⁰, versus average extrapolated VTH (V) at 25 ºC, -0.2 to 1.0, for Tj = 25, 50, 75, 100, and 125 ºC; Lg = 0.1 μm, W(QT) = 0.20 μm, W(QD) = 0.28 μm, W(QL) = 0.18 μm; high speed (0.49) and low power (0.71) points marked, with 10 μA and 0.1 μA levels. Extrapolated VTH = VTH (nA/μm) + 0.3 V]
[Ref: K. Itoh, ISCAS’06]
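The strong dependence of retention current on VTH follows from the exponential form of sub-threshold leakage. The sketch below is illustrative only: the prefactor `i0` and the 100 mV/decade `swing` are assumed values, not numbers taken from the plot, and the function names are hypothetical.

```python
def subthreshold_leakage(vth, i0=1e-6, swing=0.1):
    """Per-cell leakage (A), modeled as i0 * 10^(-vth / swing).

    i0 (A) and swing (V/decade) are assumed illustrative values.
    """
    return i0 * 10.0 ** (-vth / swing)


def array_retention_current(vth, num_cells=2 ** 20, i0=1e-6, swing=0.1):
    """Total retention current (A) of an array of identical cells."""
    return num_cells * subthreshold_leakage(vth, i0, swing)


if __name__ == "__main__":
    for vth in (0.2, 0.4, 0.6):
        current = array_retention_current(vth)
        print(f"VTH = {vth:.1f} V -> {current:.3e} A for a 1-Mb array")
```

With a 100 mV/decade swing, every 0.2 V of additional VTH cuts the array current by a factor of 100, which is why the acceptable VTH window is narrow; higher temperature degrades the swing and shifts the whole curve up.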
Multiple Threshold Voltages
• Dual VTH cells with low VTH access transistors provide good tradeoffs in power and delay [Ref: Hamzaoglu, et al., TVLSI’02]
• Use high VTH devices to lower leakage for stored ‘0’, which is much more common than a stored ‘1’ [Ref: N. Azizi, TVLSI’03]
[Figure: two 6T cell schematics (BL, BLB, WL) with high VTH and low VTH devices marked; the second cell is shown storing ‘0’]
Multiple Voltages
• Selective usage of multiple voltages in cell array
– e.g. 16 fA/cell at 25 °C in 0.13 μm technology
• High VTH to lower sub-VTH leakage
• Raised source, raised VDD, and lower BL reduce gate stress while maintaining SNM
[Figure: cell bias conditions, with WL = 0V and node voltages of 1.5V, 1.0V, 1.0V, and 0.5V]
[Ref: K. Osada, JSSC’03]
Power Breakdown During Read
• Accessing correct cell
– Decoders, WL drivers
– For lower power: hierarchical WLs, pulsed decoders
• Performing read
– Charge and discharge large BL capacitance
– For lower power: SAs and low BL swing, hierarchical BLs, lower VDD (may require read assist), lower BL precharge
[Figure: read path showing Address, WL, Mem Cell, VDD_Prech, Sense Amp, and Data]
Hierarchical Word-line Architecture
• Reduces amount of switched capacitance
• Saves power and lowers delay
[Ref’s: Rabaey, Prentice’03; T. Hirose, JSSC’90]
Hierarchical Bitlines
• Divide up bitlines hierarchically
– Many variants possible
• Reduce RC delay, also decrease CV² power
• Lower BL leakage seen by accessed cell
[Figure: local BLs connected to global BLs]
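The effect of bitline segmentation and reduced swing on switched energy can be estimated with a first-order model. In the sketch below the energy drawn from the precharge supply to restore a bitline is taken as C × V_prech × ΔV; the capacitance values and function names are hypothetical, chosen only to illustrate the trend.

```python
def bitline_read_energy(cells_on_bitline, c_per_cell, v_prech, swing, c_fixed=0.0):
    """Supply energy (J) to restore one bitline after it dropped by `swing` volts.

    c_per_cell: assumed capacitance each attached cell adds to the bitline (F)
    c_fixed:    assumed extra capacitance, e.g. a global bitline wire (F)
    """
    c_bl = cells_on_bitline * c_per_cell + c_fixed
    return c_bl * v_prech * swing


if __name__ == "__main__":
    c_cell = 1e-15  # 1 fF per cell, illustrative only
    flat_full = bitline_read_energy(512, c_cell, 1.0, 1.0)
    flat_low = bitline_read_energy(512, c_cell, 1.0, 0.1)
    # Hierarchical: 64-cell local bitline plus an assumed 100 fF global line
    hier_low = bitline_read_energy(64, c_cell, 1.0, 0.1, c_fixed=100e-15)
    print(f"flat, full swing: {flat_full:.3e} J")
    print(f"flat, low swing:  {flat_low:.3e} J")
    print(f"hierarchical:     {hier_low:.3e} J")
```

The saving from hierarchy depends on how much capacitance the global line adds back, so the split point is a design choice rather than a fixed rule.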
BL Leakage During Read Access
• Leakage into non-accessed cells
– Raises power and delay
– Affects BL differential
[Figure: cells storing “1” and “0” sharing a bit-line, with leakage into the non-accessed cells]
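How leakage erodes the bitline differential can be estimated with a worst-case sketch. It assumes the accessed cell discharges one bitline with current `i_read` while every non-accessed cell on the column leaks from the opposite bitline with `i_leak`; all values and function names are hypothetical.

```python
import math


def bitline_differential(i_read, i_leak, cells_on_bitline, c_bl, t_sense):
    """Worst-case differential (V) developed after t_sense seconds."""
    leak_total = (cells_on_bitline - 1) * i_leak  # all other cells leak adversely
    return (i_read - leak_total) * t_sense / c_bl


def max_cells_per_bitline(i_read, i_leak, min_ratio=10.0):
    """Largest column height keeping i_read at least min_ratio times total leakage."""
    return math.floor(i_read / (min_ratio * i_leak)) + 1


if __name__ == "__main__":
    i_read, i_leak = 20e-6, 10e-9  # illustrative currents (A)
    for n in (64, 256, 1024):
        dv = bitline_differential(i_read, i_leak, n, 100e-15, 1e-9)
        print(f"{n} cells: differential = {dv * 1e3:.1f} mV after 1 ns")
    print("max cells:", max_cells_per_bitline(i_read, i_leak))
```

Shorter (hierarchical) bitlines and lower per-cell leakage both raise the usable differential, which matches the remedies listed for bitline leakage.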
Bitline Leakage Solutions
• Hierarchical BLs
• Raise VSS in cell
• Negative WL voltage
• Longer access FETs
• Alternative bit-cells
• Active compensation
• Lower BL precharge voltage
[Figure: two schemes, with cells storing “1” and “0”. Raise VSS in cell (VGND), with nodes VGND and Vg labeled; Negative Wordline (NWL), with wordlines at VSSWL]
[Ref: A. Agarwal, JSSC’03]
Lower Precharge Voltage
• Lower BL precharge voltage decreases power and improves Read SNM
• Internal bit-cell node rises less
• Sharp limit due to accidental cell writing if access FET pulls internal ‘1’ low
VDD Scaling
• Lower VDD (and other voltages) via classic voltage scaling
– Saves power
– Increases delay
– Limited by lost margin (read and write)
• Recover Read SNM with read assist
– Lower BL precharge
– Boosted cell VDD [Ref: Bhavnagarwala’04, Zhang’06]
– Pulsed WL and/or Write-After-Read [Ref: Khellah’06]
– Lower WL [Ref: Ohbayashi’06]
Power Breakdown During Write
• Accessing cell
– Similar to Read
– For lower power: hierarchical WLs
• Performing write
– Traditionally drive BLs full swing
– For lower power: charge sharing, data dependencies, low swing BLs with amplification
[Figure: write path showing Address, WL, Mem Cell, VDD_Prech, and Data]
Charge recycling to reduce write power
• Share charge between BLs or pairs of BLs
• Saves for consecutive write operations
• Need to assess overhead
Basic charge recycling – saves 50% power in theory
[Figure: three steps. Old values: BL = 0V, BLB = VDD. Connect floating BLs: BL = VDD/2, BLB = VDD/2. Disconnect and drive new values: BL = VDD, BLB = 0V]
[Ref’s: K. Mai, JSSC’98; G. Ming, ASICON’05]
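The theoretical 50% figure follows from a simple charge argument, sketched below under idealized assumptions: equal bitline capacitances, lossless equalization, and supply energy counted as charge delivered times VDD. The function names are hypothetical.

```python
def write_energy_conventional(c_bl, vdd):
    """Supply energy (J) to raise one bitline from 0 V to VDD."""
    return c_bl * vdd * vdd


def write_energy_recycled(c_bl, vdd):
    """Supply energy (J) when the bitlines are first shorted together.

    One bitline starts at 0 V and the other at VDD; with equal capacitance
    they settle at VDD/2, so the supply only lifts one line from VDD/2 to VDD.
    """
    v_shared = (0.0 * c_bl + vdd * c_bl) / (2.0 * c_bl)
    return c_bl * vdd * (vdd - v_shared)


if __name__ == "__main__":
    c_bl, vdd = 200e-15, 1.0  # illustrative values
    conv = write_energy_conventional(c_bl, vdd)
    rec = write_energy_recycled(c_bl, vdd)
    print(f"conventional: {conv:.3e} J, recycled: {rec:.3e} J, ratio {rec / conv:.2f}")
```

The ideal ratio is 0.5; the energy of the equalization switch and its control signals is the overhead that needs to be assessed.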
Memory Statistics
• 0’s more common
– SPEC2000: 90% 0s in data
– SPEC2000: 85% 0s in instructions
• Assumed write value using inverted data as necessary [Ref: Y. Chang, ISLPED’99]
• New bitcell: 1R, 1W port
– W0: WZ=0, WWL=1, WS=1
– W1: WZ=1, WWL=1, WS=0
[Figure: bitcell with BL, BLB, WL, WWL, WZ, and WS]
[Ref: Y. Chang, TVLSI’04]
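The idea of storing inverted data when that yields more of the preferred value can be sketched as follows. This is a generic illustration, not the exact scheme of the cited papers: it assumes one extra flag bit per word and inverts whenever 1s outnumber 0s; the function names are hypothetical.

```python
def encode_word(word, width=32):
    """Return (stored_word, inverted_flag), inverting when 1s outnumber 0s."""
    mask = (1 << width) - 1
    word &= mask
    ones = bin(word).count("1")
    if ones > width - ones:
        return (~word) & mask, 1  # store the complement, remember the flag
    return word, 0


def decode_word(stored, inverted_flag, width=32):
    """Recover the original word from the stored value and its flag bit."""
    mask = (1 << width) - 1
    return (~stored) & mask if inverted_flag else stored & mask


if __name__ == "__main__":
    for value in (0x00000003, 0xFFFFFFF0):
        stored, flag = encode_word(value)
        print(f"{value:#010x} -> stored {stored:#010x}, flag {flag}")
        assert decode_word(stored, flag) == value
```

The flag bit costs area and is itself written on every access, so the net saving depends on the data statistics.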
Low-Swing Write
• Drive the BLs with low swing
• Use amplification in cell to restore values
[Figure: cell with EQ and SLC devices, BL and BLB precharged from VDD_Prech, and a write driver (Din, WE, column decoder) supplied from VWR = VDD-VTH-ΔVBL; waveforms for EQ, WL, SLC, WE, BL/BLB (between VDD-VTH and VDD-VTH-ΔVBL), and Q/QB]
[Ref: K. Kanda, JSSC’04]
Write Margin
• Fundamental limit to most power-reducing techniques
• Recover write margin with write assist, e.g.
– Boosted WL
– Collapsed cell VDD [Itoh’96, Bhavnagarwala’04]
– Raised cell VSS [Yamaoka’04, Kanda’04]
– Cell with amplification [Kanda’04]
Non-traditional cells
• Key tradeoff is with functional robustness
• Use alternative cell to improve robustness, then trade off for power savings
• e.g. Remove read SNM
8T SRAM cell
• Register file cell
• 1R, 1W port
• Read SNM eliminated
• Allows lower VDD
• 30% area overhead
• Robust layout
[Figure: 8T SRAM cell with RWL, WWL, complementary WBLs, and RBL]
[Ref: L. Chang, VLSI’05]
Cells with Pseudo-Static SNM Removal
• Isolate stored data during read
• Dynamic storage for duration of read
[Figure: differential-read cell with BL, BLB, WL, and WLW] [Ref: S. Kosonocky, ICSICT’06]
[Figure: single-ended-read cell with BL, BLB, WL, WWL, and WLB] [Ref: K. Takeda, JSSC’06]
Emerging Devices: Double-gate MOSFET
• Emerging devices allow new SRAM structures
• Back-gate biasing of thin-body MOSFET provides improved control of short-channel effects, and re-instates effective dynamic control of VTH.
Back-gated (BG) MOSFET
• Independent front and back gates
• One switching gate and VTH control gate
[Figure: Double-gated (DG) MOSFET with source, drain, and gate; gate length = Lg, fin width = TSi, fin height HFIN = W/2. Back-gated (BG) MOSFET with source, drain, Gate1 (switching gate), and Gate2 (VTH control); gate length = Lg, fin height HFIN = W]
[Ref: Z. Guo, ISLPED’05]
6T SRAM Cell with Feed-back
• Double-Gated (DG) NMOS pull-down and PMOS load devices.
• Back-Gated (BG) NMOS access devices dynamically increase β-ratio.
– SNM during read ~ 300mV.
– Area penalty ~ 19%
[Figure: cell schematic with loads PL and PR, pull-downs NL and NR, and access devices AL and AR, storing “0” and “1”; beta ratio increased. Butterfly plots of Vsn2 (V) versus Vsn1 (V), 0 to 1, with READ and STANDBY curves: 6T DG-MOS, 210mV; 6T BG-MOS, 300mV]
[Ref: Z. Guo, ISLPED’05]
Summary and Perspectives
• Functionality is main constraint in SRAM
– Variation makes the outlying cells limiters
– Look at hold, read, write modes
• Use various methods to improve robustness, then trade off for power savings
– Cell voltages, thresholds
– Novel bit-cells
– Emerging devices
• Embedded memory major threat to continued technology scaling – innovative solutions necessary
References
Books and Book Chapters
• K. Itoh et al., Ultra-Low Voltage Nano-scale Memories, Springer, 2007.
• A. Macii, “Memory Organization for Low-Energy Embedded Systems,” in Low-Power Electronics Design, C. Piguet, Ed., Chapter 26, CRC Press, 2005.
• V. Moshnyaga and K. Inoue, “Low Power Cache Design,” in Low-Power Electronics Design, C. Piguet, Ed., Chapter 25, CRC Press, 2005.
• J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits, Prentice Hall, 2003.
• T. Kawahara and K. Itoh, “Memory Leakage Reduction,” in Leakage in Nanometer CMOS Technologies, S. Narendra, Ed., Chapter 7, Springer, 2006.
Articles
• A. Agarwal, H. Li, and K. Roy, “A Single-Vt Low-Leakage Gated-Ground Cache for Deep Submicron,” IEEE Journal of Solid-State Circuits, vol. 38, no. 2, pp. 319–328, Feb. 2003.
• N. Azizi, F. Najm, and A. Moshovos, “Low-leakage Asymmetric-Cell SRAM,” IEEE Transactions on VLSI, vol. 11, no. 4, pp. 701–715, August 2003.
• A. Bhavnagarwala, S. Kosonocky, S. Kowalczyk, R. Joshi, Y. Chan, U. Srinivasan, and J. Wadhwa, “A Transregional CMOS SRAM with Single, Logic VDD and Dynamic Power Rails,” in Symposium on VLSI Circuits, pp. 292–293, 2004.
• Y. Cao, T. Sato, D. Sylvester, M. Orshansky, and C. Hu, “New Paradigm of Predictive MOSFET and Interconnect Modeling for Early Circuit Design,” in Custom Integrated Circuits Conference (CICC), Oct. 2000, pp. 201–204.
• L. Chang, D. Fried, J. Hergenrother, et al., “Stable SRAM cell design for the 32 nm node and beyond,” Symposium on VLSI Technology, pp. 128–129, June 2005.
• Y. Chang, B. Park, and C. Kyung, “Conforming inverted data store for low power memory,” IEEE International Symposium on Low Power Electronics and Design, 1999.
• Y. Chang, F. Lai, and C. Yang, “Zero-aware asymmetric SRAM cell for reducing cache power in writing zero,” IEEE Transactions on VLSI Systems, vol. 12, no. 8, pp. 827–836, August 2004.
• Z. Guo, S. Balasubramanian, R. Zlatanovici, T.-J. King, and B. Nikolic, “FinFET-based SRAM design,” International Symposium on Low Power Electronics and Design, pp. 2–7, August 2005.
• F. Hamzaoglu, Y. Ye, A. Keshavarzi, K. Zhang, S. Narendra, S. Borkar, M. Stan, and V. De, “Analysis of Dual-VT SRAM Cells with Full-Swing Single-Ended Bit Line Sensing for On-Chip Cache,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 10, no. 2, pp. 91–95, Apr. 2002.
• T. Hirose, H. Kuriyama, S. Murakami, et al., IEEE Journal of Solid-State Circuits, vol. 25, no. 5, pp. 1068–1074, October 1990.
• K. Itoh, A. Fridi, A. Bellaouar, and M. Elmasry, “A Deep Sub-V, Single Power-Supply SRAM Cell with Multi-VT, Boosted Storage Node and Dynamic Load,” Symposium on VLSI Circuits, pp. 132–133, June 1996.
• K. Itoh, M. Horiguchi, and T. Kawahara, “Ultra-low voltage nano-scale embedded RAMs,” IEEE Symposium on Circuits and Systems, May 2006.
• K. Kanda, H. Sadaaki, and T. Sakurai, “90% Write Power-Saving SRAM Using Sense-Amplifying Memory Cell,” IEEE Journal of Solid-State Circuits, vol. 39, no. 6, pp. 927–933, June 2004.
• S. Kosonocky, A. Bhavnagarwala, and L. Chang, International Conference on Solid-State and Integrated Circuit Technology, pp. 689–692, October 2006.
• K. Mai, T. Mori, B. Amrutur, et al., IEEE Journal of Solid-State Circuits, vol. 33, no. 11, pp. 1659–1671, November 1998.
• G. Ming, Y. Jun, and X. Jun, “Low Power SRAM Design Using Charge Sharing Technique,” pp. 102–105, ASICON, 2005.
• K. Osada, Y. Saitoh, E. Ibe, and K. Ishibashi, “16.7-fA/Cell Tunnel-Leakage-Suppressed 16-Mb SRAM for Handling Cosmic-Ray-Induced Multierrors,” IEEE Journal of Solid-State Circuits, vol. 38, no. 11, pp. 1952–1957, Nov. 2003.
• PTM – Predictive Models. Available: http://www.eas.asu.edu/~ptm
• E. Seevinck, F. List, and J. Lohstroh, “Static Noise Margin Analysis of MOS SRAM Cells,” IEEE J. of Solid-State Circuits, vol. SC-22, no. 5, pp. 748–754, Oct. 1987.
• K. Takeda, Y. Hagihara, Y. Aimoto, M. Nomura, Y. Nakazawa, T. Ishii, and H. Kobatake, “A Read-Static-Noise-Margin-Free SRAM Cell for Low-Vdd and High-Speed Applications,” in IEEE International Solid-State Circuits Conference, pp. 478–479, February 2005.
• M. Yamaoka, Y. Shinozaki, N. Maeda, Y. Shimazaki, K. Kato, S. Shimada, K. Yanagisawa, and K. Osada, “A 300MHz 25μA/Mb Leakage On-Chip SRAM Module Featuring Process-Variation Immunity and Low-Leakage-Active Mode for Mobile-Phone Application Processor,” in IEEE International Solid-State Circuits Conference, 2004, pp. 494–495.