Taylor_Slides

Defect and Fault Tolerant
Architectures for Nanoscale
Devices
David Newell, BSEE ‘07
Taylor Johnson, BSEE ‘08
ELEC527
March 22, 2007
Motivation
“As silicon manufacturing technology reaches the
nanoscale, architectural designs need to
accommodate the uncertainty inherent at such
scales. These uncertainties are germane in the
miniscule dimension of the devices, quantum
physical effects, reduced noise margins, system
energy levels reaching computing thermal limits,
manufacturing defects, aging and many other
factors. Defect tolerant architectures and their
reliability measures will gain importance for logic
and micro-architecture designs based on
nanoscale substrates.”
Debayan Bhaduri, Sandeep Shukla, NANOLAB: A Tool for Evaluating Reliability of Defect-Tolerant Nano
Architectures
(2/49)
State of the Art Yesterday
http://www.rpi.edu/~schubert/Educational%20resources/Educational%20resources.htm
(3/49)
State of the Art Yesterday

Intel 4004, 1971
  Max clock speed: 740kHz
  Process: 10um PMOS
  2250 transistors

Intel 8008, 1972
  Max clock speed: 800kHz
  Process: 10um PMOS
  3500 transistors
http://www.cpu-world.com/CPUs/CPU.html
(4/49)
State of the Art Yesterday (cont)

Intel 8080, 1974
  Max clock speed: 2MHz
  Process: 6um NMOS
  6000 transistors

Intel 80286, 1982
  Max clock speed: 12.5MHz
  Process: 1.5um CMOS
  134,000 transistors
http://www.cpu-world.com/CPUs/CPU.html
(5/49)
State of the Art Yesterday (cont)

Intel 80386, 1985
  Max clock speed: 16MHz
  Process: 1um CMOS
  275,000 transistors

Intel 80486, 1989
  Max clock speed: 25MHz
  Process: 1um CMOS
  1.2 million transistors
http://www.cpu-world.com/CPUs/CPU.html
(6/49)
State of the Art Yesterday (cont)

Pentium, 1993
  Max clock speed: 66MHz
  Process: 0.8um CMOS
  3.1 million transistors

Pentium Pro, 1995
  Max clock speed: 200MHz
  Process: 0.6um CMOS
  5.5 million transistors
http://www.cpu-world.com/CPUs/CPU.html
(7/49)
State of the Art Yesterday (cont)

Pentium II, 1997
  Max clock speed: 300MHz
  Process: 0.35um CMOS
  7.5 million transistors

Pentium III, 1999
  Max clock speed: 600MHz
  Process: 0.25um CMOS
  9.5 million transistors
http://www.cpu-world.com/CPUs/CPU.html
(8/49)
State of the Art Yesterday (cont)

Pentium 4, 2000
  Max clock speed: 1.5GHz
  Process: 0.18um CMOS
  42 million transistors

Pentium 4HT, 2002
  Max clock speed: 3.06GHz
  Process: 0.13um CMOS
  55 million transistors
http://www.cpu-world.com/CPUs/CPU.html
(9/49)
State of the Art Yesterday (cont)

Pentium 4EE, 2003
  Max clock speed: 3.2GHz
  Process: 0.13um CMOS
  178 million transistors

Pentium M, 2005
  Max clock speed: 2.13GHz
  Process: 90nm CMOS
  140 million transistors
www.wikipedia.org
(10/49)
State of the Art Yesterday (cont)

Core Duo, 2006
  Max clock speed: 2.33GHz
  Process: 65nm CMOS
  291 million transistors
www.wikipedia.org
(11/49)
[Chart: Transistors. Vertical axis: log(Number of Transistors), 1.0E+00 to 1.0E+09. Horizontal axis: Model and Year, from the 4004 (1971) to the Core 2 Duo.]
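The trend in the transistor chart can be checked numerically. The sketch below is only an illustration: it fits a straight line to log2(transistor count) against year for a subset of the data points listed on the preceding slides and reports the implied doubling time. The names CHIPS and doubling_time_years are made up for this example.

```python
import math

# (year, transistor count) pairs, a subset of the figures on the preceding slides.
CHIPS = [
    (1971, 2250),         # Intel 4004
    (1972, 3500),         # Intel 8008
    (1974, 6000),         # Intel 8080
    (1982, 134_000),      # Intel 80286
    (1985, 275_000),      # Intel 80386
    (1989, 1_200_000),    # Intel 80486
    (1993, 3_100_000),    # Pentium
    (1995, 5_500_000),    # Pentium Pro
    (1997, 7_500_000),    # Pentium II
    (1999, 9_500_000),    # Pentium III
    (2002, 55_000_000),   # Pentium 4HT
    (2003, 178_000_000),  # Pentium 4EE
    (2005, 140_000_000),  # Pentium M
    (2006, 291_000_000),  # Core Duo
]


def doubling_time_years(points):
    """Least-squares fit of log2(count) against year; returns years per doubling."""
    n = len(points)
    xs = [year for year, _ in points]
    ys = [math.log2(count) for _, count in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # doublings per year
    return 1.0 / slope


if __name__ == "__main__":
    print(f"Estimated doubling time: {doubling_time_years(CHIPS):.2f} years")
```

With these figures the fit comes out at roughly two years per doubling, which is why the points fall close to a straight line on the logarithmic axis.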
(12/49)
[Chart: Process Size. Vertical axis: log(Process Size), 0.01 to 10. Horizontal axis: Model and Year, from the 4004 (1971) to the Core 2 Duo.]
(13/49)
State of the Art Today

Core 2 Duo, 2006-2007
  Max clock speed: 2.66GHz
  Process: 65nm CMOS
  376 million transistors
www.wikipedia.org
(14/49)
State of the Art Tomorrow: Evolutionary

Fabrication (<45nm)
  Extreme ultraviolet lithography
  Electron projection lithography
  Interconnect problems
INTERNATIONAL TECHNOLOGY ROADMAP FOR SEMICONDUCTORS, http://www.sia-online.org
(15/49)
State of the Art Tomorrow: Revolutionary

Molecular Electronics
  Self-assembly
  Carbon nanotubes

Issues
  Nanotube transistors are only a few atoms across
  More transistors means more chances for failure
(16/49)
Traditional Full Adder
Ellenbogen, J.C., Love, J.C., Architectures for molecular electronic computers, PROCEEDINGS OF THE IEEE,
VOL. 88, NO. 3, MARCH 2000.
(17/49)
Molecular Electronics Full Adder using
Molecular Diodes
Ellenbogen, J.C., Love, J.C., Architectures for molecular electronic computers, PROCEEDINGS OF THE IEEE,
VOL. 88, NO. 3, MARCH 2000.
(18/49)
Ellenbogen, J.C., Love, J.C., Architectures for molecular electronic computers, PROCEEDINGS OF THE IEEE,
VOL. 88, NO. 3, MARCH 2000.
(19/49)
Architecture Tolerance Types

Defect Tolerance
  Manufacture-time defect detection and reconfiguration
  Ex: controlling placement of wires, orientation of wires, and interconnects

Fault Tolerance
  Operation-time fault detection, reconfiguration, recovery, etc.
Shukla, Goldstein, et al, Nano, Quantum, and Molecular Computing: Are We Ready for the Validation and Test
Challenges. In Eighth IEEE International High-Level Design Validation and Test Workshop, pages 3-7,
November, 2003.
(20/49)
Defect Tolerant Architecture


An architecture which uses
techniques to mitigate the effects of
defects in the devices that make up
the architecture, and guarantees a
given level of reliability
So, what are some of these
techniques?
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
(21/49)
Building on Traditional Tolerance
Methods

Teramac (1998)
  Massively parallel experimental computer built at Hewlett-Packard Laboratories to investigate a wide range of different computational architectures
  Defect-tolerant architecture of Teramac, which incorporates a high communication bandwidth that enables it to easily route around defects, has significant implications for any future nanometer-scale computational paradigm
  It may be feasible to chemically synthesize individual electronic components with less than a 100 percent yield, assemble them into systems with appreciable uncertainty in their connectivity, and still create a powerful and reliable data communications network
  Future nanoscale computers may consist of extremely large-configuration memories that are programmed for specific tasks by a tutor that locates and tags the defects in the system
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(22/49)
Building on Traditional Tolerance
Methods

Teramac (cont)






Consists of 65,536 LUTs connected via crossbars in
a fat-tree network.
Extremely flexible architecture with few critical
paths
Highly redundant connectivity
Contains about 220,000 hardware defects, any
one of which could prove fatal to a conventional
computer
Despite defects, operated 100 times faster than a
high-end single-processor workstation for some
of its configurations
Functions normally despite defects in 10% of cells
and interconnects
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(23/49)
Fault Tolerance:
Teramac Overview




Successful operation
due to learning
defects after
fabrication
Able to avoid running
into defects due to
extremely high
connectivity via high
bandwidth bus
Redundancy
Tree architecture
leads to intrinsic
ability to find paths to
an end node
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(24/49)
Fault Tolerance:
Teramac – Lesson #1


Possible to build a very powerful
computer that contains defective
components and wiring, given sufficient
communication bandwidth in the system
to find and use the healthy resources
Machine is built cheaply but imperfectly, a
map of the defective resources is
prepared, and then the computer is
configured with only the healthy
resources
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
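The "map the defects, then configure with only the healthy resources" idea can be sketched in a few lines. This is an illustration under stated assumptions, not Teramac's actual configuration software: the resources are modeled as a simple rectangular grid (Teramac itself uses crossbars in a fat-tree), the defect map is a set of coordinates, and find_route is a made-up helper that searches breadth-first for a path that avoids every mapped defect.

```python
from collections import deque


def find_route(width, height, defects, src, dst):
    """Breadth-first search over a grid, skipping cells listed in the defect map.

    Returns a list of (x, y) cells from src to dst, or None if no
    defect-free route exists.
    """
    if src in defects or dst in defects:
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            # Walk the parent links back to the source to recover the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in defects and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


if __name__ == "__main__":
    defect_map = {(1, 0), (1, 1)}  # hypothetical defects found after fabrication
    print(find_route(4, 3, defect_map, (0, 0), (3, 0)))
```

The route is longer than it would be on a perfect grid, but it exists as long as the connectivity is rich enough to leave some healthy path, which is the point of Lesson #1.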
(25/49)
Fault Tolerance:
Teramac – Lesson #2


Resources in a computer do not have to
be regular, but rather they must have a
sufficiently high degree of connectivity
System at the nanoscale that has some
random character can still be functional if
there is enough local intelligence to locate
resources, either through the laws of
physics or through the ability to reach
down through random but fixed local
connections
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(26/49)
Fault Tolerance:
Teramac – Lesson #3


Wires are by far the most plentiful
resource, and the most important are the
address lines that control the settings of
the configuration switches and the data
lines that link the LUTs to perform the
calculations
In a nanotechnology paradigm, these
wires may be physical or logical, but they
will be essential for the enormous amount
of communication bandwidth that will be
required
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(27/49)
Fault Tolerance:
Teramac – Lesson #4



The conventional paradigm for computation is to
design the computer, build it perfectly, compile
the program, and then run the algorithm
Teramac paradigm is to build the computer
(however imperfectly), find the defects, configure
the resources with software, compile the
program, and then run it
Moves what is difficult to do in hardware into a
software task, which is just the continuation of a
trend that has accompanied the development of
electronic computers from their first appearance
Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol.
280, JUNE 1998
(28/49)
Tolerance Methods in Traditional
Silicon Architectures

Von Neumann Defect
  Expect a 0 and see a 1
  Expect a 1 and see a 0

Byzantine Defect
  Unknown number of faulty inputs
  Given full communication, if fewer than 1/3 of inputs are faulty, the correct output can still be determined
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
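The one-third threshold for Byzantine defects can be made concrete. The classical bound for Byzantine agreement without message authentication is n >= 3f + 1, that is, strictly fewer than one third of the n participants may be faulty. The helper names below are made up for this sketch.

```python
def max_tolerable_byzantine_faults(n):
    """Largest f satisfying n >= 3f + 1 for n participants."""
    if n < 1:
        raise ValueError("need at least one participant")
    return (n - 1) // 3


def can_tolerate(n, f):
    """True if n participants can still agree with f Byzantine-faulty ones."""
    return 0 <= f <= max_tolerable_byzantine_faults(n)


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, "participants tolerate", max_tolerable_byzantine_faults(n), "faulty")
```

Note that three participants cannot tolerate even one Byzantine fault; four are the minimum needed to mask a single faulty input.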
(29/49)
Traditional Methods Applied:
NAND Multiplexing


Proposed by von Neumann in 1952
Idea: if the failure probabilities of the gates are
sufficiently small and failures are independent,
then computations may be done with a high
probability of correctness
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
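A small simulation can illustrate the idea. This is a simplified sketch, not the construction analyzed in the cited paper: it models only one executive stage (no restorative stage), each signal is carried on a bundle of wires, the wires of the two input bundles are paired at random, and each NAND gate independently inverts its output with probability eps. All function names are assumptions made for this example.

```python
import random


def noisy_nand(a, b, eps, rng):
    """NAND gate that inverts its correct output with probability eps."""
    out = 1 - (a & b)
    return out ^ 1 if rng.random() < eps else out


def multiplexed_nand(x_bundle, y_bundle, eps, rng):
    """One executive stage: randomly pair the wires of two bundles, NAND each pair."""
    shuffled_y = y_bundle[:]
    rng.shuffle(shuffled_y)
    return [noisy_nand(a, b, eps, rng) for a, b in zip(x_bundle, shuffled_y)]


def fraction_high(bundle):
    """Fraction of wires in the bundle carrying a 1."""
    return sum(bundle) / len(bundle)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    ones = [1] * n
    out = multiplexed_nand(ones, ones, eps=0.01, rng=rng)
    # The correct NAND of two logical 1s is 0, so most wires should be low.
    print(f"fraction of wires high: {fraction_high(out):.3f}")
```

The output bundle is read as logical 0 or 1 according to whether the fraction of high wires falls below or above chosen thresholds, so a small number of independent gate failures does not change the interpreted value.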
(30/49)
Traditional Methods Applied:
NAND Multiplexing
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
(31/49)
Traditional Methods Applied:
NAND Multiplexing
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
(32/49)
Fault Tolerance:
Modern Solutions

Pair and Spare
  2 pairs of circuits
  Choose the pair that agrees

Triple Modular Redundancy
  3 circuits take majority vote
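Both schemes reduce to a few lines of logic. The sketch below is illustrative only: it treats each circuit output as a single bit, assumes a perfect voter and comparator, and uses made-up function names. The reliability formula is the standard one for three independent modules of reliability r, where the system works when at least two modules work.

```python
def majority_vote(a, b, c):
    """Triple modular redundancy: return the bit at least two replicas agree on."""
    return (a & b) | (a & c) | (b & c)


def pair_and_spare(primary_pair, spare_pair):
    """Return the output of the first pair whose two circuits agree."""
    for first, second in (primary_pair, spare_pair):
        if first == second:
            return first
    raise RuntimeError("both pairs disagree internally; the fault cannot be masked")


def tmr_reliability(r):
    """P(at least 2 of 3 independent modules work) = 3r^2 - 2r^3, perfect voter assumed."""
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    print(majority_vote(1, 1, 0))
    print(pair_and_spare((1, 0), (1, 1)))
    print(f"{tmr_reliability(0.9):.3f}")
```

Triple modular redundancy only helps when the individual modules are already better than a coin flip: at r = 0.5 the voted result is no more reliable than a single module.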
(33/49)
Fault Tolerance:
Fault Protection

ACID
  Atomicity
    either all of the tasks of a transaction are performed or none of them is
  Consistency
    refers to being in a legal state when the transaction begins and when it ends.
  Isolation
    refers to the ability of the application to make operations in a transaction appear isolated from all other operations.
  Durability
    refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone.
(34/49)
Fault Tolerance:
Safe Failures

Fail-Safe
  Should a function fail, it will not cause harm to other areas

Graceful Degradation
  Operating quality degrades in proportion to the severity of the failure
(35/49)
Defect Tolerance:
Failure


Detecting failures in transistors
becomes more complex as size
decreases
Rather than detect and replace
failures, accept and overcome them
(36/49)
Defect Tolerance:
Accounting for failure

Architecture that does not require a
large number of working cells



Find other ways to reach cells
Find ways to avoid failed cells
Find logically equivalent circuits
Will Knight, Y-shaped nanotubes are ready-made transistors,
http://www.newscientist.com/article.ns?id=dn7847, 15 August 2005.
(37/49)
Defect Tolerance:
DNA Self-Assembly




Control over nanoscale devices is
exceedingly difficult
Exercising more control reduces the
speed of self-assembly
Exercising less control reduces the
possible size of self-assembly
Which methods of control allow the
greatest speed and size?
(38/49)
Defect Tolerance:
Controlled Parameters

Placement
  All nodes are set up in a grid format

Orientation
  All nodes are aligned the same direction

Interconnect
  All interconnects are straight and at right angles to the node
Jaidev P. Patwardhan, Chris Dwyer, and Alvin R. Lebeck, Self-Assembled Networks: Control vs. Complexity,
Duke University
(39/49)
Defect Tolerance:
Controlled Parameters
Patwardhan, et al, A Defect Tolerant Self-organizing Nanoscale SIMD Architecture, Twelfth International
Conference on Architectural Support for Programming Languages and Operating Systems
(40/49)
Defect Tolerance:
Network Organization
Patwardhan, et al, A Defect Tolerant Self-organizing Nanoscale SIMD Architecture, Twelfth International
Conference on Architectural Support for Programming Languages and Operating Systems
(41/49)
Defect Tolerance:
Results

Shows percent of nodes reachable
for each combination of control



With infinite backoff, there can only be
one receiver and one broadcaster
Infinite backoff not shown if fewer than 10%
of nodes are reachable
Device reliability from 99.99% to
100%
Patwardhan, et al, A Defect Tolerant Self-organizing Nanoscale SIMD Architecture, Twelfth International
Conference on Architectural Support for Programming Languages and Operating Systems
(42/49)
Defect Tolerance:
Reachable Nodes


Control of orientation and placement (N6) allows
for many more reachable nodes for lower device
reliability
Control of Interconnects and one other parameter
(N3, N5) leads to fewer reachable nodes
Patwardhan, et al, A Defect Tolerant Self-organizing Nanoscale SIMD Architecture, Twelfth International
Conference on Architectural Support for Programming Languages and Operating Systems
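The "percent of nodes reachable" metric can be illustrated with a toy Monte Carlo experiment. This is not the simulation from the cited paper: it assumes fully controlled placement (a square grid), a fixed anchor node at (0, 0) that always works, independent node failures, and links only between working 4-neighbours. The function name reachable_fraction and its parameters are made up for this sketch.

```python
import random
from collections import deque


def reachable_fraction(size, node_reliability, rng):
    """Fraction of all grid nodes reachable from the anchor at (0, 0).

    Every node other than the anchor works independently with probability
    node_reliability; messages travel only between working neighbours.
    """
    working = {
        (x, y)
        for x in range(size)
        for y in range(size)
        if (x, y) == (0, 0) or rng.random() < node_reliability
    }
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in working and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) / (size * size)


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (1.0, 0.9, 0.7, 0.5):
        print(f"node reliability {p:.2f}: {reachable_fraction(50, p, rng):.2%} reachable")
```

Sweeping the node reliability shows how quickly the reachable fraction can fall once too many nodes fail, which is the kind of comparison the slides make across the different combinations of control.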
(43/49)
Defect Tolerance:
Methods of Control

Orientation and Placement controlled through DNA placement.
  Control of one implies control of the other
  Better placement of DNA allows for more control of both parameters

Lack of control of Interconnect matters much less than other parameters
  More productive to focus on device reliability
Gaia Vince, Nano-transistor self-assembles using biology, http://www.newscientist.com/article.ns?id=dn4406,
20 November 2003.
(44/49)
Motivation Revisited
“With the continuing advances in the miniaturization of devices,
we are already at the deep submicron scale of device
manufacture. However, nanotechnology is emerging as the
technology of the not too distant future. In the nano era,
device sizes will be in the range of several nanometres,
leading to a high degree of failures, due to manufacturing
defects, transient faults resulting from reduced noise
tolerance at low voltage and current levels, and faults due to
ageing because of molecular and other kinds of techniques for
creating nano-devices. Although nano-scale manufacturing
will allow us to pack more devices on a chip, we have to live
with the possibilities of defects in the nano-substrate. As a
result, ‘defect-tolerant architecture’ is being posed as a way
to mitigate the challenge of the inherent unreliability at the
nano-scale. Defect-tolerance is built into the architecture in
the form of redundancy of devices and functional units.”
Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of
the 17th International Conference on VLSI Design, 2004.
(45/49)
Conclusions

Evolutionary Advances
  Traditional semiconductor technologies are reaching their limits

Revolutionary Advances
  Mandate some form of effective defect and fault tolerance to behave within desired error limits
  Currently researched methods are primarily probabilistic, with varying levels of effectiveness depending on the model
  Much more research is needed in this arena, especially using fabricated devices instead of solely modeled ones
(46/49)
References
1. Debayan Bhaduri, Sandeep Shukla, NANOLAB: A Tool for Evaluating Reliability of Defect-Tolerant Nano Architectures
2. http://www.rpi.edu/~schubert/Educational%20resources/Educational%20resources.htm
3. http://www.cpu-world.com/CPUs/CPU.html
4. www.wikipedia.org
5. INTERNATIONAL TECHNOLOGY ROADMAP FOR SEMICONDUCTORS, http://www.sia-online.org
6. Ellenbogen, J.C., Love, J.C., Architectures for molecular electronic computers, PROCEEDINGS OF THE IEEE, VOL. 88, NO. 3, MARCH 2000.
7. Shukla, Goldstein, et al, Nano, Quantum, and Molecular Computing: Are We Ready for the Validation and Test Challenges. In Eighth IEEE International High-Level Design Validation and Test Workshop, pages 3-7, November, 2003.
8. Heath, J. R., et al, A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, Vol. 280, JUNE 1998
9. Will Knight, Y-shaped nanotubes are ready-made transistors, http://www.newscientist.com/article.ns?id=dn7847, 15 August 2005.
10. Jaidev P. Patwardhan, Chris Dwyer, and Alvin R. Lebeck, Self-Assembled Networks: Control vs. Complexity, Duke University
11. Patwardhan, et al, A Defect Tolerant Self-organizing Nanoscale SIMD Architecture, Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems
12. Gaia Vince, Nano-transistor self-assembles using biology, http://www.newscientist.com/article.ns?id=dn4406, 20 November 2003.
13. Shukla, et al, Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology, Proceedings of the 17th International Conference on VLSI Design, 2004.
(47/49)
Thank You
(48/49)
Questions?
(49/49)