
Fabrizio Lombardi
ITC Endowed Chair Professor
Dept of ECE
Northeastern University, Boston




CMOS: currently at 28/22nm, soon to move
further down in scaling (ITRS)
New commercial markets: GPU, tablet, massive
external storage (mostly portable)
Emerging paradigms: multi-value operation,
non-volatile RAM, processing-in-memory
Challenges:
New designs abound, but not yet a clear winner



CMOS is not going away any time soon
More and More-Than Moore
Beyond CMOS
[Figure: axis labels "year" and "Elements"; region label "Beyond CMOS"]
Extending MOSFETs to the End of the Roadmap:
− CNTFETs
− Graphene nanoribbons
− III-V Channel MOSFETs
− Ge Channel MOSFETs
− Nanowire FETs
− Tunnel FET
− Non-conventional Geometry Devices
Unconventional FETs, Charge-based Extended CMOS Devices:
− Spin FET & Spin MOSFET
− Negative Cg MOSFET
− NEMS switch
− Excitonic FET, Mott FET
− Tunnel FET
− I-MOS
− SET
Non-FET, Non Charge-based 'Beyond CMOS' Devices:
− Spin Transfer Torque Logic
− Moving domain wall devices
− Pseudo-spintronic Devices
− Nanomagnetic (M:QCA)
− Negative Cg MOSFET
− All Spin Logic
− Molecular Switch
− Atomic Switch
− BiSFET
Resistive Memories:
− Spin Transfer Torque MRAM
− Nanoelectromechanical
− Nanowire PCM
− Macromolecular (Polymer)
− Electronic Effects Memory: charge trapping, Metal-Insulator Transition, FE barrier effects
− Redox Memory: nanoionic memory, electrochemical memory, fuse/antifuse memory
− Molecular Memory
Capacitive Memory:
− FeFET Memory
NVM cost/gigabyte ~ $1 (Intel)







PVT variations
Stability (SNM) concern
Power dissipation
Charge diffusion and
collection in the layout
Basic binary operation
(supply voltage
requirements)
Inability to meet large
storage needs
Likely soft errors





Avoid large capital
investment, selectively
use new/compatible
technologies
Preferably, hybrid
circuits
Multi-level (multi-bit)
operation
Processing in memory
(PIM)
Problematic endurance
Move to higher radix bases than binary: ternary,
quad or eventually octal
Bases:
1. Ternary: used for CAM processing mostly in
routers, but also in GPUs (cache)
2. Quaternary/Octal: increase capacity for massive
storage (to replace flash memories)
Not efficiently done in CMOS (additional voltage
rails and high area/power penalty)
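The capacity gain from a higher radix can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the 64-bit word size are assumptions, not taken from the slides, and it ignores noise margins and sensing overhead.

```python
import math

def bits_per_cell(radix: int) -> float:
    """Information held by one cell with `radix` distinguishable levels."""
    return math.log2(radix)

def cells_needed(total_bits: int, radix: int) -> int:
    """Smallest number of radix-`radix` cells able to hold `total_bits` bits."""
    # Integer arithmetic avoids floating-point rounding at exact powers.
    cells = 0
    capacity = 1
    while capacity < 2 ** total_bits:
        capacity *= radix
        cells += 1
    return cells

# Hypothetical 64-bit word stored in each radix named on the slide.
for name, radix in [("binary", 2), ("ternary", 3), ("quaternary", 4), ("octal", 8)]:
    print(f"{name:10s} {bits_per_cell(radix):.3f} bits/cell, "
          f"{cells_needed(64, radix)} cells for 64 bits")
```

The loop shows why quaternary and octal cells are attractive for massive storage: the cell count drops with log2 of the radix, provided the levels can be told apart reliably.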

Use radically new technologies


ITRS: memory has always met stated
objectives in the past
Late 2014 as crucial initial milestone with respect to
performance (power dissipation and density)
and design fundamentals.
Discuss new (emerging) directions:



Unorthodox technologies (briefly)
Material-based technologies
Focus on non volatile memories
Innovative operational paradigms for memory
using new physics storage phenomena:
1. QCA (memory in motion); challenge is room
temperature operation and CMOS
compatibility for manufacturing
2. SET (controlled transfer of electrons for
memory operation purposes)

Long term opportunities abound, but
grand challenges too
Currently applicable mostly to
academic investigation
Exploit new materials and fabrication methods
(CMOS compatible) to meet challenges
Additional criteria:
1. Hybrid operation is usually sought
2. Robustness to PVT variations/endurance.
3. New design realms:
Multi level (resistance) for increased capacity
Ambipolar operation for control
APPLICATION: non volatile storage

2011 Memory Application (ITRS)
Emerging Research Memory Technology           Stand-Alone / Embedded
Ferroelectric-gate FET                        X
Nanoelectromechanical RAM                     X
Spin Transfer Torque MRAM                     X X
Nanoionic or Redox Memory                     X X
Nanowire Phase Change Memory (PCM)            X X
Electronic Effects (Charge trapping, Mott)    X
Macromolecular memory                         X X
Molecular memory                              X X

Also known as Resistive RAMs: add (programmable)
resistive element(s) to active device(s) (usually
1T1R for simplest non-volatile cell design)
Issues:
1. Resistance range (Rmax-Rmin)
2. Power dissipation and leakage
3. Programmability and universal memory feature
4. Error/defect models (soft and drift)
5. Endurance (related to read/write operation)
6. Testing
FEATURE             NOR      NAND         PCM     MRAM     FRAM
Capacity            256MB    16GB         32MB    2MB      1MB
Random Read         Yes      No           Yes     Yes      Yes
Random Write        No       No           Yes     Yes      Yes
Endurance           10^5     10^5-10^3    10^6    10^15    10^14
Management          High     High         Mod     No       No
Error Correction    No       1-72 bits    *       No       No
Retention (years)   10       1-10         15      20       5-20
Read Access (ns)    60       60           10      35       60
Prog Access (us)    200      200          20      35       60
Erase Access (ms)   1-100    1-100        50      35       60
Power               Mid      Mid          Mid     Low      Low
Cell size (F^2)     10       4            4       6-20     4-15
Universal Memory    No       No           Yes     Yes      Yes
Flash memory seen as a mature technology,
unable to capitalize on scaling and not
meeting high-density storage needs for mobile
applications
• Low lifetime due to high-voltage based
process
• Apple and Anobit (2012)
• Additional players:
Samsung, Micron, IBM

• Does not require many transistors
or other access devices
Remove silicon requirements:
• Improve density
• Reduce power consumption
• Integrate with processors
• Reduce total area
• Crossbar Inc (August 2013):
3D stacking, 1TByte on chip
prototype (using FeRRAM)
Feature size F = litho node
Cell size = 4F^2
Pitch P = 2F for crossbars
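As a rough illustration of what a 4F^2 cell implies for density, the sketch below converts a feature size into cells per square centimetre. It assumes a single layer and ignores periphery, sensing and 3D stacking; the function name and the sample feature sizes are chosen only for illustration.

```python
def cells_per_cm2(feature_nm: float, cell_factor: float = 4.0) -> float:
    """Ideal single-layer cell density for a cell of area cell_factor * F^2."""
    cell_area_nm2 = cell_factor * feature_nm ** 2
    nm2_per_cm2 = 1e14  # (1e7 nm per cm) squared
    return nm2_per_cm2 / cell_area_nm2

# Illustrative feature sizes; 28 nm and 22 nm are the CMOS nodes quoted earlier.
for f_nm in (28, 22, 20, 10):
    print(f"F = {f_nm:2d} nm -> {cells_per_cm2(f_nm):.3e} cells/cm^2")
```

Halving F quadruples the ideal density, which is why the small cell factor of a crossbar matters more as the litho node shrinks.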
The Memristor: Prediction
Fourth Fundamental, Two-Terminal Circuit Element
Leon Chua, U.C. Berkeley
Relations among the four circuit variables v, i, q, φ:
dφ/dt = v
dq/dt = i
RESISTOR (Ohm, 1827): dv = R di
CAPACITOR (Von Kleist, 1745): dq = C dv
INDUCTOR (Faraday, 1831): dφ = L di
MEMRISTOR (Chua, 1971): dφ = M dq


Resistance depends on direction of voltage or
current across it (dϕ = M*dq)
Titanium dioxide film sandwiched between
two platinum electrodes; doped operation (HP
Labs), 5-10nm in length
Resistance Range
• Between Ron and Roff
• Roff : Highest resistance
• Ron : Lowest resistance






Excellent linearity in switching
Resistive range is good
I-V characteristics are also very good
Nanometric dimension (10nm in 2011, 5nm
in 2013): very high density potential at
extremely low power consumption
Manufacturing compatibility with CMOS
Problem: endurance and leakage (on read)



Ambipolar control of single memristor
No standby power, no direct path from VDD to
GND, only dynamic power dissipation
Fewer transistors than a 6T SRAM cell



Memristor changes its value when reading
Roff state
Refresh operation is required
Write time significantly higher than read
Technology   VDD (V)   Write time (ns)   Read time (ns)
32 nm        0.9       160               0.8
32 nm        1         150               0.75
45 nm        0.9       195               0.975
45 nm        1         180               0.9
65 nm        0.9       235               1.175
65 nm        1         200               1
[Figure: Ron and Roff resistance (ohm, 10^2 to 10^4) vs. switching cycles (10^0 to 10^6) for a Ti 1nm/Pt 100nm/TiOx 29nm/Ti4O7 100nm stack]



Use phases of GST (chalcogenide alloy)
High current-based process for two phases:
amorphous (high R) and crystalline (low R).
No erase-write cycle as for NAND flash (at
most 100,000 cycles for enterprise product)




Ron, programming (write) region: intersection
of Ron curve with voltage axis is Vh (holding
voltage)
Roff, read region: this can be changed by I or V
pulse; Roff = Ron exp(toff/t), where t = effective
recombination time (constant) and toff = non-programming
time
Vx as intersection point of Ron curve and Rset
curve: Vx = Vh x Rset/(Rset - Ron)
Typical values: Rset = 7 kΩ, Rreset = 200 kΩ, Ron = 1 kΩ,
Vh = 0.45 V, Rset < Roff < Rreset, t = 5 ns
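The two expressions above can be evaluated directly with the typical values quoted on the slide. This is a minimal sketch: the function and constant names are illustrative, and inverting the exponential to obtain toff is an added convenience, not something the slide states.

```python
import math

# Typical values from the slide (ohms, volts, seconds).
R_SET = 7e3
R_RESET = 200e3
R_ON = 1e3
V_H = 0.45
T_REC = 5e-9  # effective recombination time t

def v_x(v_h: float = V_H, r_set: float = R_SET, r_on: float = R_ON) -> float:
    """Intersection of the Ron and Rset curves: Vx = Vh * Rset / (Rset - Ron)."""
    return v_h * r_set / (r_set - r_on)

def r_off(t_off: float, r_on: float = R_ON, t: float = T_REC) -> float:
    """Read-region resistance: Roff = Ron * exp(toff / t)."""
    return r_on * math.exp(t_off / t)

def t_off_for(r_target: float, r_on: float = R_ON, t: float = T_REC) -> float:
    """Non-programming time at which Roff reaches r_target (inverse of r_off)."""
    return t * math.log(r_target / r_on)

print(f"Vx = {v_x():.3f} V")
print(f"toff for Roff = Rset:   {t_off_for(R_SET) * 1e9:.2f} ns")
print(f"toff for Roff = Rreset: {t_off_for(R_RESET) * 1e9:.2f} ns")
```

Under this model the constraint Rset < Roff < Rreset corresponds to a window of toff values between the two printed times.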




Mobile devices (Samsung)
PCM likely to be a depository (for less
frequently accessed data) next to DRAM for
processor design (IBM)
Networking/Communication systems:
CAM/TCAM designs
Massive storage for data acquisition systems




ISSCC11: Samsung (1-Gbit, 58-nm
manufacturing process, low-power double-data-rate nonvolatile memory interface)
ISSCC12 : Samsung (8-Gbit, 20-nm device).
IEDM11: Macronix/IBM (39-nm device with
30-microamp reset current and 10^9 cycling
endurance, 128-Mbit)
July 2012: Micron/Numonyx (45 nm PCM for
mobile devices in 1 Gb and 512 Mb multichip
packages); commercially available





Low voltage and moderate current as
operational characteristics
Multiple bit operation (at least 2): higher
resistance range (M ohms) than other RRAMs
Read Time: 12ns; Write time: 85ns (@45nm)
Soft error highly unlikely to occur for GST
Good endurance (IBM: 1 million cycles) and
density




Use 1T1P core for both CAM/TCAM
Functionality is at support circuitry
Voltage-based sensing for
comparison outcome in search
Use of circuit with ambipolar properties for
comparison and control
IBM (1/2 PCMs per core), current based operation
New cell (1 PCM per core), voltage based operation

Stored       Search           IML (A)
0 (200kΩ)    0 (VSL = 0)      -1.38*10^-9
0 (200kΩ)    1 (VSL = 0.4)    -1.97*10^-6
1 (7kΩ)      0 (VSL = 0)      -1.38*10^-9
1 (7kΩ)      1 (VSL = 0.4)    -4.15*10^-5

Circuit   Design     Write Time (ns)   Search Time (ns)   Transistors/Core   PCMs/Core   PDP of Search Operation (fJ)
CAM       [20]       199.34            1.326              1                  1           46.6886
CAM       Proposed   199.34            1.092              1                  1           36.4296
TCAM      [20]       209.53            1.346              2                  2           48.41
TCAM      Proposed   199.34            2.447              1                  1           43.4518





Practical problem: drift of resistance and
threshold voltage (when not read or
programmed)
Related to crystalline fraction (Cx) in GST
Rpcm=(1-Cx)*Ra+Rc*Cx (Ra >> Rc)
Ra=Rreset
Rc=Rset
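The mixing rule above is easy to tabulate. The sketch below uses Rreset = 200 kΩ and Rset = 7 kΩ, the typical values quoted earlier in the deck, as defaults; the function names and the inverse mapping from resistance back to Cx are illustrative additions.

```python
def r_pcm(cx: float, ra: float = 200e3, rc: float = 7e3) -> float:
    """Cell resistance for crystalline fraction cx: (1 - cx) * Ra + cx * Rc."""
    if not 0.0 <= cx <= 1.0:
        raise ValueError("crystalline fraction must lie in [0, 1]")
    return (1.0 - cx) * ra + cx * rc

def crystalline_fraction(r: float, ra: float = 200e3, rc: float = 7e3) -> float:
    """Inverse of r_pcm: the fraction cx that yields resistance r."""
    return (ra - r) / (ra - rc)

for cx in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"Cx = {cx:.2f} -> Rpcm = {r_pcm(cx) / 1e3:.2f} kOhm")
```

Because Ra >> Rc, a small change in Cx near the amorphous end moves the resistance by the same absolute amount as elsewhere, but intermediate levels placed for multi-bit storage sit close together on a logarithmic scale.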



Level drift is more pronounced for high
resistance states and non linear wrt time
Problematic for MVL storage (i.e. more than one
bit per cell)
Order of resistivity for states remains the same
(short term), so avoid overlap in long term.




Use advanced modulation coding technique
for solving short-term drift (analogous to
NAND flash, where electrons leak through the thin walls
of cells and create data read errors).
Apply a voltage pulse based on deviation
from desired level and measure resistance. If
desired level of resistance is not achieved,
apply another voltage pulse and measure
again, until the exact level is achieved
Only suitable for binary cell storage
It may reduce endurance (multiple writes)
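The pulse-and-measure procedure described above is an iterative program-and-verify loop. The sketch below shows its control flow only: `ToyCell` is a hypothetical stand-in for a real cell (each pulse moves the resistance a fixed fraction of the way to the requested value), and the tolerance, gain and pulse limit are assumed values.

```python
class ToyCell:
    """Hypothetical cell model, used only to exercise the loop below."""

    def __init__(self, resistance: float, gain: float = 0.6):
        self.resistance = resistance
        self.gain = gain  # fraction of the remaining deviation removed per pulse

    def apply_pulse(self, requested: float) -> None:
        # Pulse strength is based on the deviation from the desired level.
        self.resistance += self.gain * (requested - self.resistance)

    def read(self) -> float:
        return self.resistance


def program_and_verify(cell: ToyCell, target: float,
                       tolerance: float = 0.01, max_pulses: int = 50) -> int:
    """Pulse, measure, repeat until within tolerance; return pulses used."""
    for pulses in range(1, max_pulses + 1):
        cell.apply_pulse(target)
        if abs(cell.read() - target) <= tolerance * target:
            return pulses
    raise RuntimeError("target level not reached within max_pulses")


cell = ToyCell(resistance=200e3)
used = program_and_verify(cell, target=50e3)
print(f"reached {cell.read():.0f} ohm after {used} pulses")
```

The returned pulse count makes the endurance cost visible: every extra verify iteration is another write to the same cell.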





Assume cell independence in drift errors (?).
Data to be encoded not in the programmed
state but in the relative order of the states in a
small group of cells.
Error in encoding scheme only seen when
resistivity levels of states cross each other
Software-based error correction methodologies
are then applied (slow)
Reduction in capacity: from 2 bits/cell to 1.57
bits/cell
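Encoding data in the relative order of cell states can be sketched as a mapping between integers and permutations. The example below is an illustration under stated assumptions, not the scheme behind the 1.57 bits/cell figure: the group size of four cells, the resistance levels, and the drift model (a monotone shift that is larger for higher resistances) are all hypothetical.

```python
import math

LEVELS = [7e3, 21e3, 65e3, 200e3]  # hypothetical resistance levels, ascending

def encode(value: int, n_cells: int) -> list:
    """Map an integer in [0, n!) to a rank per cell (0 = lowest resistance)."""
    if not 0 <= value < math.factorial(n_cells):
        raise ValueError("value out of range for this group size")
    remaining = list(range(n_cells))
    ranks = []
    for i in range(n_cells, 0, -1):
        idx, value = divmod(value, math.factorial(i - 1))
        ranks.append(remaining.pop(idx))
    return ranks

def write_group(value: int, levels=LEVELS) -> list:
    """Program each cell to the level selected by its rank."""
    return [levels[rank] for rank in encode(value, len(levels))]

def decode(resistances: list) -> int:
    """Recover the integer from the relative order of the measured resistances."""
    n = len(resistances)
    by_resistance = sorted(range(n), key=lambda k: resistances[k])
    ranks = [0] * n
    for rank, cell in enumerate(by_resistance):
        ranks[cell] = rank
    remaining = list(range(n))
    value = 0
    for i, rank in enumerate(ranks):
        idx = remaining.index(rank)
        remaining.pop(idx)
        value += idx * math.factorial(n - 1 - i)
    return value

def drift(resistances: list, strength: float = 0.2) -> list:
    """Hypothetical drift: every level rises, high levels more, order kept."""
    top = max(resistances)
    return [r * (1 + strength * r / top) for r in resistances]

stored = write_group(17)
print(decode(stored), decode(drift(stored)))
print(f"{math.log2(math.factorial(4)) / 4:.3f} bits/cell for a 4-cell group")
```

Decoding only fails when drift makes two levels cross, which matches the error condition stated above; the price is fewer bits per cell than storing each level independently.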





Octal base for MVL (noise, crosstalk) and/or
single vs multiple storage elements
MVL implications on error detection/correction
Dynamic models of RRAM operation in HSPICE
(as related to drift evaluation and mitigation)
At system-level, improve endurance by
reducing maximum number of writes to a cell
System-level application modeling (for
example “normally-off instantly-on” operation:
combining SRAM with PCM)




Emergence of new paradigms: resistive RAMs,
non-volatile operation, multi-bit storage
Nearly all future memories will utilize new
phenomena away from 6T configuration
TECHNOLOGY TIME SCALE:
Hybrid implementations will be dominant in
the next 5-10 years
4Q-2014/1Q-2015 as crucial time frame for
PCM