Everyone Has It All Wrong! - FAMU

FAMU-FSU
College of Engineering
Addressing the Funding Gap in
Energy-Efficient Computing:
Research Overview and
Program Management
Philosophy
By Michael P. Frank
Presented to the National Science Foundation
Directorate for Computer & Information Science & Engineering
Computer & Communication Foundations (CCF) Division
Monday, July 10, 2006
6/19/06
M. Frank, NSF/CISE/CCF job talk
1
Overview of Talk

Motivation:
  The Looming Energy Efficiency Crisis in Computing
  and the related Funding Gap between government & industry

The Science:
  Why something called Reversible Computing is really “Our Only Hope” for solving the problem
  And why we need to start major research on it now!

Why I’m Here:
  Convey my vision of CCF, the EMT program, and how the field of Reversible Computing fits into them
  Ideas on how I would help run the EMT program

Motivation
The Coming Crisis in Computer Energy Efficiency

Major Motivation of my Work: The Energy Efficiency Crisis

The bulk of past improvements in practical computer performance have been fundamentally enabled by steady improvements in the energy efficiency of computation…
  Defined as the number of useful computational operations performed per unit of available energy dissipated as waste heat

Unfortunately, an end to the past trend of steady energy efficiency improvements is now clearly within sight…
  Designs at many levels (devices, circuits, architectures, algorithms) for conventional computing are rapidly converging towards optimal design-point asymptotes, within a few-decade time-frame
  Beyond which substantial further progress will not be possible, at least not within the conventional classical, irreversible computing paradigm

To circumvent the crisis, a radical paradigm shift in our models and structures for computation is required!
  I will show why reversible computing will be an essential part of this.

Computing’s Rapid Climb

The raw performance & efficiency characteristics of our information processing technologies (computing, storage, communication) have been improving at a steady, exponentially increasing rate over time, for at least the past 50 years…
  Due to “Moore’s Law” (integration scale of electronics doubles every 1-2 years) and related technology trends
  Performance trends also span multiple pre-IC technologies (vacuum tubes, relays, etc.) going back ~100 years or more

Each generation of performance improvements has reliably led to significant new information-processing applications becoming practicable…

Substantial Societal Impact

Economic measures of the nation’s (& world’s) economy, such as GDP, per-capita income, and standard of living have also improved exponentially (although at slower rates) over this same period…
  It’s clear that a substantial portion of these gains was made possible by the introduction of new IT applications, itself made possible by raw technology improvements

Nearly every major industry today has relied on digital/electronic technologies for a substantial portion of the productivity gains it has made over the last few decades
  Effected either directly, or indirectly through its suppliers

These historical observations raise an important concern…

We can arguably expect that the future rate of growth of the entire world economy will substantially depend on future trends in information technology efficiency…
  I.e., will our raw technology capabilities flatten out, continue improving steadily, or accelerate even faster than before?

[Figure: log efficiency vs. time (by decade), with the present marked “now”]

But, a Severe Problem…

The energy efficiency (useful operations performed per unit energy dissipated) of all conventional information processing technologies will flatten out within the next few decades…
  This is true for fundamental and absolutely irrefutable physical reasons! (To be discussed)

As a consequence, the cost efficiency (ops performed per unit cost) and thus practical performance (e.g., FLOPS per dollar of annual operating budget) of systems will also flatten!
  This is assuming only that the economic cost of energy will not soon enter a new era of rapid exponential decay…
    Which seems unlikely since, at present, energy costs are rising

If this “flattening” happens, it can be expected to have a substantial braking effect on the entire world economy!
  This would be an extremely negative outcome, which we should try our best to avoid at all costs…

Why Energy Efficiency of Conventional Computing Must Flatten

The potential energy efficiency gains from all conventional sources are limited… For example:

  Decrease logic signal energy by lowering logic voltages
    This has already reached a practical limit of on the order of ~1V; going to much lower voltages leads to excessive FET energy leakage
    Also, signal energy is subject to thermodynamic limits to be discussed

  Eliminate speculative execution and other unnecessary CPU activity
    Soon, energy dissipation becomes dominated by “necessary” activity

  Turn off unused functional units when not in use to avoid unnecessary power dissipation from leakage currents
    Soon, power is dominated by active switching in units that are in use

  Replace algorithms for general-purpose CPUs with FPGA configurations or special-purpose architectures:
    This is quite helpful, but typically yields at most ~100x savings

  Find new high-level algorithms that require fewer total operations
    This is great when possible, but as our algorithms improve, significantly better algorithms become harder and harder to find

Trend of Minimum Transistor Switching Energy

[Figure: ITRS '97-'03 gate energy trends, based on data from the International Technology Roadmaps for Semiconductors. Vertical axis: $CV^2/2$ gate energy in joules, log scale from 1.E-14 down to 1.E-22 (fJ, aJ, zJ marked). Horizontal axis: year, 1995 to 2045. Series: LP min gate energy and HP min gate energy, with points labeled by node number (nm DRAM hp): 250, 180, 130, 90, 65, 45, 32, 22. Reference levels: room-temperature 100 kT reliability limit, one electron volt, room-temperature kT thermal energy, and room-temperature von Neumann - Landauer limit (kT ln 2).]

An Urgent Scientific Need

Given the above considerations, I would say that one of the most important basic research issues that our society needs the field of computer science & engineering to address is to find a definitive answer to the following question:
  Can the introduction of new alternative, unconventional computing paradigms (such as reversible, quantum, and bio-inspired computing) realistically prevent or forestall the “flattening” of the information technology curve?
  And if so, how exactly can this work?

My vision is that answering this question should be a primary scientific mission of the EMT program.
  Although other applications are also important…

The Science
Why Reversible Computing is Our “Last, Great Hope” for Continuing to Improve Computing Indefinitely

The von Neumann-Landauer (VNL) Bound

Physical theorem: To lose, obliviously erase, or otherwise irreversibly forget 1 bit’s worth of known information involves/requires the eventual dissipation of at least $k_B T \ln 2$ amount of free energy to heat in an external environment at some temperature $T$.
  $k_B$ here is Boltzmann’s constant, $1.38\times10^{-23}$ J/K in energy/temperature units

First alluded to by John von Neumann, 1949; clarified and proven by Rolf Landauer, 1961.

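As a quick numerical check of the bound stated on this slide, the following minimal sketch evaluates $k_B T \ln 2$ in joules and in meV. The function name `vnl_bound_joules` and the choice of 300 K for room temperature are illustrative assumptions, not part of the talk.

```python
import math

K_B = 1.380649e-23    # Boltzmann's constant, J/K
EV = 1.602176634e-19  # one electron volt, J

def vnl_bound_joules(temperature_k: float) -> float:
    """Minimum heat, in joules, for obliviously erasing one known bit at temperature T."""
    return K_B * temperature_k * math.log(2)

ROOM_TEMPERATURE_K = 300.0  # assumed value for "room temperature"
bound_j = vnl_bound_joules(ROOM_TEMPERATURE_K)
bound_mev = bound_j / EV * 1e3

print(f"k_B T ln 2 at {ROOM_TEMPERATURE_K:.0f} K: {bound_j:.3e} J ({bound_mev:.1f} meV)")
```

At 300 K this comes to a few zeptojoules per bit, on the order of 18 meV.
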
A simple proof of the VNL bound

Here’s a simple proof, from basic thermodynamic facts known for >100 years!

If known information becomes unknown, this is (by def’n) an increase of entropy.
  Because entropy is simply unknown physical information.
  And, all information that is accessible to us is physical information anyway.

Standard units of information and entropy are simply logarithmic units:
  1 bit = $\log 2 = \lambda b.\,\log_b 2$ (indefinite logarithm object), Boltzmann’s constant $k_B = \log e$
  Therefore, in units of Boltzmann’s constant, 1 bit = $k_B(\log 2/\log e) = k_B \ln 2$
  Thus, the loss (forgetting) of 1 bit is, by definition, the very same thing as an increase of entropy by the amount $k_B \ln 2$.

Once entropy is created, it can never be destroyed (2nd law of thermodynamics)
  This follows from the micro-scale reversibility of basic laws of (today quantum) mechanics

As entropy builds up in a system, its temperature rises.
  To operate sustainably without eventual meltdown, the entropy generated must be expelled to an external environment.

To add entropy $S$ to an environment at temperature $T$ requires adding energy $E = ST$ to that environment - this is the very definition of thermodynamic temperature!

Thus, to forget a bit (i.e., permanently expel it into the environment) requires that we eventually permanently commit energy $k_B T \ln 2$ to the environment (as heat).

An Essential Element of Future Paradigms: Reversible Computing

Basic idea: (R. Landauer, 1961 & C. Bennett, 1973)
  Fundamental physics suggests that in principle there is no limit to the energy efficiency of computing technologies, although this is true only to the extent that we avoid performing irreversible operations that discard information during the computing process…
  But, it seems that with sufficient engineering effort, we can in principle approach, as closely as we care to, the limit of a reversible computer that discards no information and dissipates no energy
    Our practical aim is not zero energy, just continued steady reductions!

Present status of reversible computing:
  Potential advantages/tradeoffs are reasonably well understood
  Models & early prototypes exist, but no practical systems yet

Of interest to other clusters: Implementing this notion would eventually impact computer engineering & CS at all levels!
  From low-level physical device requirements up through circuit design, theory, architecture, languages, & algorithms…

Irreversible vs. Reversible Digital Operations

A typical irreversible digital operation:
  Regardless of the previous digital contents x of some circuit node or memory cell, destructively overwrite it with a given new value y.
  [Diagram: the old value x goes to the “bit bucket” as y is written]

A closely corresponding, but reversible operation:
  Reversibly transform the old physical state representing x “in place” to a new state representing the new value y.
  [Diagram: x is transformed into y]

The semantic difference is that the 2nd op can only be done if the old value x is “known”…
  This means, it can be reconstructed based on the new value y together with other available information.
  This restricts the kinds of replacements that can be done reversibly;
    e.g., can’t replace two bits a,b with the product ab and 1 other bit

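The distinction on this slide can be checked by brute force: an operation on a few bits is logically reversible exactly when it is one-to-one over all input states. The sketch below is illustrative only; the helper names (`is_reversible`, `overwrite_with_zero`, `controlled_not`, and so on) are hypothetical and not from the talk. It also enumerates every possible choice of the "1 other bit" to confirm that replacing a,b with the product ab plus one more bit can never be one-to-one.

```python
from itertools import product

def is_reversible(op, n_bits):
    """An operation on n-bit states is logically reversible iff it is one-to-one."""
    states = list(product((0, 1), repeat=n_bits))
    images = {op(s) for s in states}
    return len(images) == len(states)

def overwrite_with_zero(state):
    # Irreversible: the old value x is discarded no matter what it was.
    return (0,)

def in_place_not(state):
    # Reversible: the old value can be recovered by applying NOT again.
    (x,) = state
    return (1 - x,)

def controlled_not(state):
    # Reversible: b is replaced by a XOR b while a is kept, so b is recoverable.
    a, b = state
    return (a, a ^ b)

def any_reversible_product_replacement():
    """Try every Boolean function of (a, b) as the 'other bit' next to the product ab."""
    states = list(product((0, 1), repeat=2))
    for table in product((0, 1), repeat=len(states)):
        other = dict(zip(states, table))
        if is_reversible(lambda s: (s[0] & s[1], other[s]), 2):
            return True
    return False

print("overwrite reversible?      ", is_reversible(overwrite_with_zero, 1))
print("in-place NOT reversible?   ", is_reversible(in_place_not, 1))
print("controlled-NOT reversible? ", is_reversible(controlled_not, 2))
print("(ab, other bit) reversible?", any_reversible_product_replacement())
```

The last check fails for a counting reason: three of the four input states have ab = 0, but only two output states have a 0 in the product position.
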
Simple Electronic Implementations

Irreversible CLEAR (set to 0) operation:
  Without knowing if there is charge on node N, connect it to ground (logic 0 reference level)
  The stored information is lost and the entire associated node energy E is dissipated to heat!
  [Diagram: with the switch open, node N is charged up with an amount E of electrostatic energy; with the switch closed, the node discharges suddenly, and all info & energy are fully lost]

Reversible “CLEAR” (change from 1 to 0):
  Given that N contains a 1, we connect it to a source that goes from 1 to 0 over time $t > t_c$
  Only a fraction $t_c/t$ of the node energy E is dissipated,
    $t_c = 2RC$ is a time constant
      R = resistance of path
      C = capacitance of node
  [Diagram: node N with capacitance C connected through resistance R to a variable source that ramps from 1 to 0 over time t]
  Charge $Q = (2EC)^{1/2}$ flows out in a controlled way over time t, dissipation $E_{diss} = I^2 R t = Q^2 R/t = E(2RC/t)$ (Adiabatic charge transfer)

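The dissipation formula on this slide can be evaluated directly. The sketch below is a minimal illustration; the component values (1 fF, 1 V, 10 kΩ, 2 ns) and the function names are assumptions chosen for the example, not figures from the talk. It uses the slide's constant-current approximation, which is only meaningful when the ramp time t is well above 2RC.

```python
import math

def node_energy(capacitance_f, voltage_v):
    """Electrostatic energy E = C V^2 / 2 stored on the node."""
    return 0.5 * capacitance_f * voltage_v ** 2

def adiabatic_dissipation(capacitance_f, voltage_v, resistance_ohm, ramp_time_s):
    """E_diss = I^2 R t for a constant current I = Q / t (assumes t >> 2RC)."""
    charge = capacitance_f * voltage_v  # Q = C V, which equals sqrt(2 E C)
    current = charge / ramp_time_s
    return current ** 2 * resistance_ohm * ramp_time_s

C, V, R = 1e-15, 1.0, 1e4  # assumed: 1 fF node, 1 V swing, 10 kOhm path
T_RAMP = 2e-9              # assumed: 2 ns ramp, 100x the 2RC time constant

e_node = node_energy(C, V)
e_diss = adiabatic_dissipation(C, V, R, T_RAMP)
fraction = e_diss / e_node

print(f"node energy E        = {e_node:.2e} J")
print(f"adiabatic E_diss     = {e_diss:.2e} J")
print(f"fraction dissipated  = {fraction:.4f} (2RC/t = {2 * R * C / T_RAMP:.4f})")
```

With these assumed values the conventional CLEAR would dissipate the full node energy E, while the ramped version dissipates about 1% of it.
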
Simulation Results (Cadence/Spectre)
Power vs. freq., TSMC 0.18 µm, Std. CMOS vs. 2LAL
2LAL = Two-level adiabatic logic (invented at UF, ’00)

Graph shows power dissipation vs. frequency in 8-stage shift register.

At moderate frequencies (1 MHz),
  Reversible uses < 1/100th the power of irreversible!

At ultra-low power (1 pW/transistor),
  Reversible is 100× faster than irreversible!

Minimum energy dissip. per nFET is < 1 eV!
  500× lower than best irreversible!
  500× higher computational energy efficiency!

Energy transferred is still ~10 fJ (~100 keV)
  So, energy recovery efficiency is 99.999%!
  Not including losses in power supply, though

[Figure: log-log plot of average power dissipation per nFET (W) and energy dissipated per nFET per cycle, scale 1.E-05 down to 1.E-14, vs. frequency (Hz), 1.E+09 down to 1.E+03, for standard CMOS and 2LAL]

Reversible and/or Adiabatic VLSI Chips Designed @ MIT, 1996-1999

By EECS grad students Josie Ammer, Mike Frank, Nicole Love, Scott Rixner, and Carlin Vieri under CS/AI lab members Tom Knight and Norm Margolus.

Some Important Results in Reversible Computing So Far

Landauer (IBM) 1961:
  The von Neumann limit of kT ln 2 energy dissipation per bit operation only holds for irreversible operations.

Lecerf 1963, Bennett (IBM) 1973:
  Computers that use only reversible operations are still Turing universal.

Fredkin & Toffoli (MIT), 1980:
  Reversible computers can be implemented in an idealized classical physical model.

Feynman (CalTech), 1982:
  Reversible computers can be implemented in a simple quantum physical model.
    This paper eventually spawned the field of quantum computing

Younis & Knight (MIT), 1993:
  Pipelined, sequential logic circuits can be implemented in fully-reversible CMOS.
    This paper helped to spawn the field of adiabatic circuits

MIT Pendulum Project (Ammer, Frank, Knight, Love, Margolus, Rixner, Vieri), 1994-1999:
  Designed & implemented fully reversible programmable circuits, general-purpose RISC architectures, high-level programming languages, and algorithms for a wide variety of classical CS problems

Frank (MIT) 1997-1999:
  When physical constraints are accounted for, reversible computers offer asymptotically lower energy, cost, and time complexity for broad classes of problems than conventional machines.

Frank (UF) 2000-2002:
  The advantages of reversible computing over conventional computing increase as small polynomials of the underlying technology characteristics… The trends show reversible winning within decades for machines at usual scales

Important Open Research Challenges in Reversible Computing

Fundamental research on practicability of reversible computing:

  (Physics) Can we invent post-transistor devices with lower leakage and energy coefficients?
    This research requires cross-disciplinary collaboration with physicists

  (Engineering) Can we tailor physical mechanisms to precisely execute complex trajectories (computations) with high energy-recovery efficiency?
    E.g. efficient resonators and power-clock distribution systems driving adiabatic logic. Collaboration with extremely skilled EEs is needed

  (Structures) Can we design mostly-reversible architectures with low overhead for practical special-purpose applications, at least?
    Existing general-purpose reversible architectures are highly suboptimal

  (Theory) Can we reversibly emulate general irreversible algorithms with less space-time complexity overhead than presently known?
    Oracle-based results suggest not, but more work is needed

The Funding Gap in Energy-Efficient Computing

As a proposal writer, I’ve found that reversible computing falls into a rather awkward, in-between position…
  Because it aims to help a broad range of practical applications, and is well-motivated by basic physics, many scientists who evaluate RC proposals say it seems “too practical” to receive basic research funding; they expect that its development should be funded by industry.
  Yet, because RC is high-risk, very disruptive, and probably will take much longer than industry’s traditional ~10-year lab-to-fab time lag to develop and broadly adopt, industry has largely ignored it, in favor of more short-term approaches to save energy

The major risk that society faces in allowing this funding gap to persist is that if industry steps in too late, then workable, practical implementations of RC might not be ready in time to prevent performance growth from stalling…
  If there is even a brief stall, the loss of momentum could breed pessimism and choke off industry’s will to continue innovating…

Why I’m Here
My vision of CCF, EMT, and how I and my field fit into it

Areas Covered by CCF

Emerging Models and Technologies (EMT)
  Paradigms: Nanocomputing, quantum computing, biologically inspired computing…
    I would add reversible computing to this list…

Foundations of Computing Processes and Artifacts (FCPA)
  Structures: Programming languages, computer architecture, VLSI design…

Theoretical Foundations (TF)
  Theory: Models of computation, complexity, parallelism, algorithms, information theory…

Some Highlights of My Related Educational Background

Early exposure to nanotech/nanocomputing concepts
  Nanotechnology course, K. Eric Drexler, Stanford, 1988

Solid general background in CS theory & AI
  BS in Symbolic Systems, Stanford, 1991
  MS in EECS on Decision-Theoretic techniques in AI, MIT, 1994

Ph.D. proposal on DNA-based computing
  MIT Lab for CS, ’94-’95

Fairly early exposure to Quantum Computing
  Reviewed the field for MIT EECS Ph.D. area exam, 1995

Ph.D. minor in conventional CMOS VLSI design
  Designed & had fabbed several chips, for courses & Ph.D. work

Ph.D. work on Reversible Computing
  Included development of nanocomputing models, complexity theory, architectures, programming languages, & VLSI design

What I See As Some General Research Questions Behind EMT

What are the fundamental physical limits of present & future information processing technologies?
  As opposed to the more abstract, algorithmic kinds of limits addressed by traditional theoretical CS

What fundamental changes to our underlying models/paradigms of computation may we need in order to fully harness emerging technologies?
  New models based on physics (or chemistry, biology?)

How can practical considerations help to guide our exploration of the emerging technology concepts?
  E.g., concerns with (at least estimates of) real-world cost, performance, energy efficiency, reliability, ease of use…

Some Cross-Cutting Questions to other areas of CCF

Cross-cutting to FCPA cluster:
  What would the emergence of new computing paradigms require in terms of new architectures, programming languages, & HW design tools?

Cross-cutting to TF cluster:
  What impacts do emerging technologies have on theoretical CS areas such as models of computation, complexity theory, algorithm design, and parallel computing?

What are the Fundamental Physical Limits of Computing?

Fundamental laws of physics impose a variety of universal limits that hold true in all physically possible information processing technologies:

  Thermodynamic von Neumann/Landauer (VNL) lower bound of kT ln 2 (~18 meV at room temperature) on energy dissipated per known bit that is discarded into a temperature-T environment.
    However, this one could be avoided via reversible computing

  Quantum performance limit (Margolus-Levitin bound) of at most a rate 2E/h (h=Planck’s constant) of ‘useful’ bit operations in any device with an active energy of E.
    This limit applies even to reversible & quantum computers!

  There are also fundamental physical limits on information density and bandwidth, but I won’t get into those here…

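To give a feel for the scale of the quantum performance limit, the sketch below evaluates the rate 2E/h exactly as written on this slide. The function name and the sample energy of 1 eV are illustrative assumptions; statements of this bound elsewhere can differ by a small constant factor depending on how E and an "operation" are defined, so the prefactor here should be read as the slide's convention rather than a definitive value.

```python
H_PLANCK = 6.62607015e-34  # Planck's constant, J*s
EV = 1.602176634e-19       # one electron volt, J

def max_bit_ops_per_second(active_energy_j: float) -> float:
    """Upper bound on 'useful' bit operations per second, using the slide's form 2E/h."""
    return 2.0 * active_energy_j / H_PLANCK

sample_energy_j = 1.0 * EV  # assumed active energy of 1 eV, for illustration only
rate = max_bit_ops_per_second(sample_energy_j)

print(f"2E/h for E = 1 eV: {rate:.3e} operations per second")
```

For an active energy of 1 eV this is a few times 10^14 operations per second, and the bound scales linearly with E.
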
New Paradigms for Computing

Reversible computing aims to directly circumvent the energy efficiency problem through the use of energy-conserving physical mechanisms for information processing…

Quantum computing aims for dramatic algorithmic improvements for some types of problems, using ‘shortcuts through state space’ made possible by nonclassical operations

Bio-inspired computing broadly includes:
  In vivo biological computing, e.g., bacteria genetically engineered to incorporate custom gene expression regulation networks
  In vitro biochemistry-based computing such as DNA computing and related approaches
  “In silico” but still biologically-inspired techniques such as digital & analog neural networks, other analog approaches, “neuromorphic” computing, etc…

New Paradigms in Relation to What I see as EMT’s Mission

Bio-inspired computing is interesting, but generally incapable of superseding the limits of conventional technology by very much…
  All realistic bio-inspired approaches could be simulated by conventional parallel digital machines with (at most) modest constant-factor overheads…
  The motivation for bio-inspired computing must come from other directions…

Quantum computing is nice if it can be made to work, but as far as we know, it is limited in its applicability to relatively narrow classes of problems (e.g., hidden subgroup, modest gains for search)…
  Its potential economic impact is therefore only a small fraction of that for all leading-edge computing in general
  Research that aims to broaden its applicability is potentially worthwhile

Reversible computing is the only unconventional paradigm that might possibly break down the roadblocks to indefinite future improvement of computer efficiency and practical performance in general applications…
  Its future economic value is thus potentially unlimited…
  However, it is difficult to do, and still in its infancy! Much research is needed.

Some Other Motivations for Paradigms Covered by EMT

Bio-inspired computing:
  In vivo computing: Self-reproducing, self-organizing microbial systems for various clinical or industrial applications
  In vitro computing: Self-assembly of nanostructures
  Neural networks: Applications in machine learning
  Analog electronics: Low-power signal processing

Quantum computing:
  Fast factoring etc. for cryptanalysis of PK cryptosystems
  Strong information security via quantum cryptography
  Fast, flexible, accurate simulation of quantum physical systems

Reversible computing:
  Reversible logic is already used in quantum computing, and has a few possible applications in other areas of CS:
    Security: auditable/verifiable computation, resilient systems
    Transaction rollback for concurrent systems
    May conceivably provide useful angles for tackling complexity-theory questions
      e.g., $\mathrm{FACTORING} \in \mathrm{P}$ iff $\exists$ a poly-time zero-garbage reversible alg. to multiply primes

Some Important Research Challenges in Quantum Computing

Important experimental physics challenges:
  Develop new experimental setups for prototype quantum computers that can effectively suppress decoherence to the threshold for fault-tolerance
    To enable more rapid improvement of machine sizes
  Develop effective physical architectures for efficient qubit transfer & execution of parallel quantum circuits

Important theory challenges:
  Better characterize the limits of applicability of quantum algorithms
    Find major new categories of applications beyond the scope of the standard hidden subgroup / unstructured search algorithms
  Resolve major open issues in quantum complexity theory
    Comparisons between BQP vs. BPP and NP, etc.

Program Administration Ideas

My personal program management philosophy:
  “Hands-on” leadership, guiding & steering the work of proposers & reviewers based on my vision and understanding of the program’s mission and the scientific needs of the fields that it touches on

Clarify the vision and goals of the funding program up-front with a technical “white paper” surveying important open scientific issues
  Include motivation for and summaries of important open research problems, with references to the literature
  Encourage proposal writers to address the listed issues, or else to thoroughly motivate their own alternative directions

Proactively seek out researchers whose background, skills, and research interests seem to mesh well with the cluster’s mission and vision
  and encourage them to submit proposals to the program

Encourage review panel members to carefully consider the quality & thoroughness of the motivation section when evaluating the scientific merit of proposals
  IMHO, too much of today’s research is not sufficiently well-motivated

Educational Component

Strongly encourage proposers to include educational activities in their proposals, including:
  Organizing of conferences
  Writing of technical books & textbooks
  Writing of introductory books for popular audiences

Even encourage submission of proposals for activity that is primarily educational in nature
  There is an “education gap” in the areas I discussed also
    Especially in reversible computing, which is still little known

Emphasize the need for educational materials that have a strong interdisciplinary perspective
  E.g., integrating CS, EE, physics issues

Conclusion

Among the various unconventional computing technologies, there are strong reasons to believe that reversible computing has the greatest potential to make an enormous, vital, broad, and timely economic impact in coming decades…
  Yet, compared to areas such as DNA, quantum, nano and bacterial computing, it has received by far the least attention and funding!

One of my main motivations for working in reversible computing has been to correct the imbalance between the underlying importance of and popular attention to this field…
  However, my influence as a lone researcher “in the trenches” is limited… No programs support this presently unfashionable field

I hope in my position at EMT to help to finally bring some much-needed funding and attention to this orphaned area, and help guide research in new, productive directions…
  While continuing support for well-motivated projects in other areas

finis
End of Presentation – Extra Slides Follow

Everyone Has It All Wrong!

As the talk proceeds,
  I’ll explain (in the proud MIT tradition) why most of the rest of the world is thinking about the future of computing in a completely wrong-headed way.

In particular,
  The Low-Power Logic Circuit Designers have it all wrong!
  The Semiconductor Process Engineers have it all wrong!
  (Most) Device Physicists have it all wrong!

The von Neumann-Landauer (VNL) principle

John von Neumann, 1949:
  Claim: The minimum energy dissipated “per elementary (binary) act of information” is kT ln 2.
  No published proof exists; only a 2nd-hand account of a lecture

Rolf Landauer (IBM), 1961:
  Logically irreversible (many-to-one) bit operations must dissipate at least kT ln 2 energy.
  Paper anticipated but didn’t fully appreciate reversible computing

One proper (i.e. correct) statement of the principle:
  The oblivious erasure of a known logical bit generates at least k ln 2 amount of new entropy.
    Releasing into environment at T requires kT ln 2 heat emission.

Proof of the VNL Principle

The principle is occasionally questioned, but:
  Its truth follows absolutely rigorously (and even trivially!) from rock-solid principles of fundamental physics!

(Micro-)reversibility of fundamental physics implies:
  Information (at the microscale) is conserved
    I.e., physical information cannot be created or destroyed, only transformed via reversible, deterministic processes

Thus, when a known bit is erased (lost, forgotten) it must really still be preserved somewhere in the microstate!
  But, since its value has become unknown, it has become entropy
    Entropy is just unknown/incompressible information

Types of Dynamical Processes

These animations illustrate how states transform in their configuration space, in:

  A nondeterministic process:
    One-to-many transformations

  An irreversible process:
    Many-to-one transformations

  Nondeterministic and irreversible:

  Deterministic and reversible:
    One-to-one transformations only!
    (WE ARE HERE)

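The four cases on this slide can be told apart mechanically once a process is written down as a set of allowed state transitions. The sketch below is illustrative; the `classify` function and the toy state names are hypothetical. A process counts as deterministic when no state has more than one successor, and as reversible when no state has more than one predecessor.

```python
def classify(transitions):
    """Return (deterministic, reversible) for an iterable of (state, next_state) pairs."""
    successors, predecessors = {}, {}
    for state, next_state in transitions:
        successors.setdefault(state, set()).add(next_state)
        predecessors.setdefault(next_state, set()).add(state)
    deterministic = all(len(nexts) == 1 for nexts in successors.values())  # no one-to-many
    reversible = all(len(prevs) == 1 for prevs in predecessors.values())   # no many-to-one
    return deterministic, reversible

# Toy examples (assumed for illustration), one per case on the slide.
one_to_many = [("a", "b"), ("a", "c")]
many_to_one = [("a", "c"), ("b", "c")]
both_kinds = [("a", "b"), ("a", "c"), ("d", "c")]
one_to_one = [("a", "b"), ("b", "c"), ("c", "a")]

for name, process in [("nondeterministic", one_to_many),
                      ("irreversible", many_to_one),
                      ("nondeterministic and irreversible", both_kinds),
                      ("deterministic and reversible", one_to_one)]:
    print(f"{name}: {classify(process)}")
```
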
Physics is Reversible!

Despite all of the empirical phenomenology relating to macro-scale irreversibility, chaos, and nondeterministic quantum events,

  Our most fundamental and thoroughly-tested modern models of physics (e.g. the Standard Model) are, at bottom, deterministic & reversible!
    All of the observed nondeterministic and irreversible phenomena can still be explained within such models, as emergent effects.

  Although classical General Relativity is argued by some researchers to have certain irreversible aspects,
    The general consensus seems to be that we’ll eventually find that the “correct” theory of quantum gravity will be reversible.

Reversible/Deterministic Physics is Consistent with Observations

Apparent quantum nondeterminism can validly be understood as an emergent phenomenon, an expected practical result of permanent wavefunction splitting
  As illustrated e.g. in the “many worlds” and “decoherent histories” pictures

Even if a quantum wavefunction does not split permanently, its evolution in a large system can quickly become much too complex to track within our models
  Thus we resort to using “reduced” density matrices, which discard some knowledge
  Thus entropy, for all practical purposes, tends to increase towards its maximum

Chaos (macro-scale nondeterminism) occurs when entropy at the microscale infects our ability to forecast the long-term evolution of macroscopic variables
  A necessary consequence of the computation-universality of physics?

The above effects, plus imprecision in our knowledge of fundamental constants, result in some practical unpredictability even for microscale systems

Meanwhile, averaging of many high-entropy microscopic details results in a “smoothing” effect that leads to irreversible evolution of macro-variables.

Reversible Computing

We’d like to design mechanisms that compute while producing as little entropy as possible…
  In order to minimize consumption of free energy / emission of heat to the environment
Losing known information necessarily results in a minimum k ln 2 entropy increase per bit lost, so…
Let’s consider what we can do using logically reversible (one-to-one) operations that don’t lose information.
Such operations are still computationally universal!
  Lecerf (1963), Bennett (1973)

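As a rough numerical illustration of the k ln 2 figure, the following sketch evaluates the minimum entropy increase per lost bit and the corresponding heat k T ln 2. The 300 K temperature and the function names are assumptions chosen for illustration; they do not come from the talk.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def min_entropy_per_bit():
    """Minimum entropy increase (J/K) when one known bit is lost: k ln 2."""
    return K_B * math.log(2)

def min_heat_per_bit(temperature_kelvin):
    """Minimum heat (J) released into an environment at temperature T: k T ln 2."""
    return temperature_kelvin * min_entropy_per_bit()

T_ROOM = 300.0  # assumed room temperature in kelvin (illustrative only)
print(f"k ln 2   = {min_entropy_per_bit():.3e} J/K")
print(f"k T ln 2 = {min_heat_per_bit(T_ROOM):.3e} J at {T_ROOM:.0f} K")
```

At the assumed temperature this comes to a few zeptojoules per erased bit, which is the scale against which the per-operation energies quoted elsewhere in the talk (in units of kT) can be compared.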
Conventional Gate Operations are Irreversible (even NOT!)

Consider a computer engineer’s (i.e., real world!) Boolean NOT gate (a.k.a. logical inverter)
  Specified function: Destructively overwrite output node’s value with the logical complement of the input!

[Figure: Hardware diagram: an inverter gate connecting two different physical logic nodes, in and out.]
[Figure: Space-time logic network diagram (not the same thing!!): along the time axis, the inverter operation takes (old in, old out) to (new in, new out).]

In-Place NOT (Reversible)

Computer scientist’s (i.e., somewhat fictionalized!) in-place logical NOT operation
  Specified operation: Replace a given logic signal with its logical complement.
People occasionally confuse the irreversible inverter operation with a reversible in-place NOT operation
  The same icon is sometimes used in spacetime diagrams

[Figure: two spacetime diagrams along the time axis, one labeled in and out, the other labeled old bit and new bit.]

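The difference between the two NOT operations can be checked by brute force. The sketch below models the inverter as a map on the pair (in, out) and the in-place NOT as a map on a single bit, then tests whether each map is one-to-one. The state encodings and function names are illustrative assumptions.

```python
from itertools import product

def inverter(state):
    """Engineer's inverter: overwrite the output node with NOT(in)."""
    inp, _old_out = state          # the old output value is discarded
    return (inp, 1 - inp)

def in_place_not(state):
    """In-place NOT: replace the single bit with its complement."""
    (bit,) = state
    return (1 - bit,)

def is_one_to_one(op, n_bits):
    """True if op maps distinct n-bit states to distinct states."""
    states = list(product((0, 1), repeat=n_bits))
    return len({op(s) for s in states}) == len(states)

print(is_one_to_one(inverter, 2))      # False: old output value is lost
print(is_one_to_one(in_place_not, 1))  # True
```

The inverter merges states that differ only in the old output value, which is exactly the many-to-one behavior that makes it logically irreversible.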
In-Place Controlled-NOT (cNOT)

Specified function: Perform an in-place NOT on the 2nd bit if and only if the 1st bit is a 1.
  Equiv., replace 2nd bit with XOR of 1st & 2nd bits

[Figure: spacetime diagram along the time axis, with a control line and a data line (old data, new data).]

Transition table:

Before    After
C  D      C  D
0  0      0  0
0  1      0  1
1  0      1  1
1  1      1  0

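The cNOT transition table can be regenerated directly from the XOR definition. This is a minimal sketch; the names `cnot` and `table` are illustrative.

```python
from itertools import product

def cnot(c, d):
    """In-place controlled-NOT: flip d iff c == 1, i.e. d := c XOR d."""
    return c, c ^ d

# Build the full transition table (C, D) before -> (C, D) after.
table = {(c, d): cnot(c, d) for c, d in product((0, 1), repeat=2)}
for before, after in table.items():
    print(before, "->", after)

# One-to-one: the four outputs are all distinct.
assert len(set(table.values())) == 4
# Self-inverse: applying cNOT twice restores the original state.
assert all(cnot(*cnot(c, d)) == (c, d) for c, d in table)
```

Because the operation is its own inverse, running the same gate a second time undoes it, which is what makes it usable for uncomputing intermediate results.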
Early Universal Reversible Gates

Controlled-controlled-NOT (ccNOT)
  A.k.a. Toffoli gate
  Perform cNOT(b,c) iff a=1.
  Equiv., c := c XOR (a AND b)
Controlled-SWAP (cSWAP)
  A.k.a. Fredkin gate
  Swap b with c iff a=1.
  Conserves 1s

[Figure: gate icons for ccNOT and cSWAP, each with lines A, B, C.]

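Both gates can be written in a few lines and their stated properties checked exhaustively over the eight input states. The helper names below are illustrative assumptions.

```python
from itertools import product

def ccnot(a, b, c):
    """Toffoli gate: c := c XOR (a AND b); a and b pass through unchanged."""
    return a, b, c ^ (a & b)

def cswap(a, b, c):
    """Fredkin gate: swap b and c iff a == 1."""
    return (a, c, b) if a else (a, b, c)

STATES = list(product((0, 1), repeat=3))   # all 3-bit states, in sorted order

def is_reversible(gate):
    """A gate is reversible if it merely permutes the set of states."""
    return sorted(gate(*s) for s in STATES) == STATES

def conserves_ones(gate):
    """True if the number of 1 bits is the same before and after."""
    return all(sum(gate(*s)) == sum(s) for s in STATES)

print(is_reversible(ccnot), is_reversible(cswap))    # True True
print(conserves_ones(ccnot), conserves_ones(cswap))  # False True
# With c preset to 0, ccNOT leaves (a AND b) in the third bit.
print([ccnot(a, b, 0)[2] for a, b in product((0, 1), repeat=2)])  # [0, 0, 0, 1]
```

The last line shows why ccNOT suffices for universal logic: with a constant 0 on the target it computes AND without discarding its inputs.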
The Adiabatic Principle

Applied physicists know that a wide class of physical transformations can be done adiabatically
  From Greek adiabatos, “It shall not be passed through”
    Used to mean, no passage of heat through an interface separating subsystems at different temperatures
  Newer, more general meaning: No increase of entropy
    Of course, exactly zero entropy increase isn’t practically doable
In practice, “adiabatic” is used to mean that the entropy generation scales down proportionally as the process takes place more gradually.
  The general validity of this 1/t scaling relation is enshrined in the famous adiabatic theorem of quantum mechanics.

Adiabatic Charge Transfer

Consider passing a total quantity of charge $Q$ through a resistive element of resistance $R$ over time $t$ via a constant current, $I = Q/t$.
  The power dissipation (rate of energy diss.) during such a process is $P = IV$, where $V = IR$ is the voltage drop across the resistor.
  The total energy dissipated over time $t$ is therefore:
    $E = Pt = IVt = I^2 R t = (Q/t)^2 R t = Q^2 R / t$.
  Note the inverse scaling with the time $t$.
In adiabatic logic circuits, the resistive element is a switch.
  The switch state can be changed by other adiabatic charge transfers.
In simple FET-type switches, the constant factor (“energy coefficient”) $Q^2 R$ appears to be subject to some fundamental quantum lower bounds.
  However, these are still rather far away from being reached.

[Figure: charge Q passing through a resistor R.]

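To make the inverse scaling with transfer time concrete, the sketch below evaluates E = Q²R/t for a few transfer times. All component values are hypothetical assumptions (a 1 fF load, a 1 V swing, a 10 kΩ switch), not figures from the talk. The comparison value C V²/2 is the standard dissipation for abruptly charging a capacitor and is included only for scale.

```python
def adiabatic_energy(charge, resistance, duration):
    """Energy (J) dissipated moving `charge` (C) through `resistance` (ohm)
    at constant current over `duration` (s): E = Q^2 R / t."""
    return charge ** 2 * resistance / duration

# Hypothetical example values (assumptions, not from the talk):
C_LOAD = 1e-15         # load capacitance, 1 fF
V_SWING = 1.0          # logic swing, 1 V
R_SWITCH = 1e4         # switch on-resistance, 10 kOhm
Q = C_LOAD * V_SWING   # charge moved per transition

for t in (1e-9, 1e-8, 1e-7):
    energy = adiabatic_energy(Q, R_SWITCH, t)
    print(f"t = {t:.0e} s  ->  E = {energy:.1e} J")

# For scale: abruptly charging the same load dissipates C V^2 / 2.
print(f"abrupt switching: {0.5 * C_LOAD * V_SWING ** 2:.1e} J")
```

Each tenfold increase in transfer time cuts the dissipation tenfold. The constant-current formula is only a good approximation when t is much longer than the RC time constant (here 10 ps under the assumed values).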
The Low-Power Design community has it all wrong!

Even (most of) the ones who know about adiabatics and even many who have done extensive amounts of research on adiabatic circuits still aren’t doing it right!
  Watch out! 99% of the so-called “adiabatic” circuit designs published in the low-power design literature aren’t truly adiabatic, for one reason or another!
As a result, most published results (and even review articles!) dramatically understate the energy efficiency gains that can actually be achieved with correct adiabatic design.
  Which has resulted in (IMHO) too little serious attention having been paid to adiabatic techniques.

Circuit Rules for True Adiabatic Switching

Avoid passing current through diodes!
  Crossing the “diode drop” leads to irreducible dissipation.
Follow a “dry switching” discipline (in the relay lingo):
  Never turn on a transistor when $V_{DS} \neq 0$.
  Never turn off a transistor when $I_{DS} \neq 0$.
Together these rules imply:
  The logic design must be logically reversible
    There is no way to erase information under these rules!
  Transitions must be driven by a quasi-trapezoidal waveform
    It must be generated resonantly, with high Q
Of course, leakage power must also be kept manageable.
  Because of this, the optimal design point will not necessarily use the smallest devices that can ever be manufactured!
    Since the smallest devices may have insoluble problems with leakage.

[Slide callout: Important but often neglected!]

Conditionally Reversible Gates

Avoiding the VNL (von Neumann-Landauer) limit actually only requires that the operation be one-to-one on the subset of states actually encountered in a given system
  This allows us to design with gates that do conditionally reversible operations
  That is, they are reversible if certain preconditions are met
  Such gates can be built easily using ordinary switches!
Example: cSET (controlled-SET) and cCLR (controlled-CLR) operations can be implemented with a single digital switch (e.g. a CMOS transmission gate), with operation & timing controlled by an externally-supplied driving signal
  These operations are conditionally reversible, if preconditions are met

[Figure: hardware icon and hardware schematic of the switch, with terminals in, drive, and out.]
[Figure: space-time logic diagram with lines in and drive. Labels: old, out = 0; 01; new, out = in; 10; final, out = 0.]

Reversible OR (rOR) from cSET

Semantics: rOR(a,b) ::= if a|b, c:=1.
  Set c:=1, if either a or b is 1.
  Reversible if initially a|b → ~c.
Two parallel cSETs simultaneously driving a shared output bus implement the rOR operation!
  This is a type of gate composition that was not traditionally considered.
Similarly, one can do rAND, and reversible versions of all Boolean operations.
Logic synthesis with these is extremely straightforward…

[Figure: hardware diagram with inputs a and b driving the shared output c.]
[Figure: spacetime diagram taking a, b, and c = 0 to a’, b’, and c’ = a OR b.]

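The conditional reversibility of rOR can be checked by enumeration. The sketch below models the two parallel cSETs as two successive conditional sets on the shared node c (logically equivalent for this purpose), then tests whether the operation is one-to-one on all states and on only those states meeting the precondition a|b → ~c. The function names and this sequential modeling are assumptions for illustration.

```python
from itertools import product

def cset(ctrl, out):
    """Controlled-SET: force out to 1 if ctrl is 1, else leave it unchanged."""
    return 1 if ctrl else out

def r_or(a, b, c):
    """rOR: two cSETs (controlled by a and by b) driving the shared node c."""
    return a, b, cset(b, cset(a, c))

def precondition(a, b, c):
    """a|b -> ~c: if either input is 1, c must start out at 0."""
    return not ((a | b) and c)

def one_to_one_on(states):
    """True if r_or maps the given states to pairwise distinct states."""
    states = list(states)
    return len({r_or(*s) for s in states}) == len(states)

ALL = list(product((0, 1), repeat=3))
print(one_to_one_on(ALL))                                 # False
print(one_to_one_on(s for s in ALL if precondition(*s)))  # True
```

On the unrestricted state space, (0,1,0) and (0,1,1) both map to (0,1,1), so information is lost. Restricting to states that satisfy the precondition removes every such collision.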
Semiconductor Process Engineers have it all wrong!

Everybody still thinks that smaller FETs operating at lower voltages will forever be the way to obtain ever more energy-efficient and more cost-efficient designs.
  But if correct adiabatic design techniques are included in our toolbox, this is simply not true!
With good energy recovery, higher switching voltages (requiring somewhat larger devices) enable strictly greater overall energy efficiency! (and thus lower energy cost!)
  This is due to the suppression of FET leakage currents exponentially with Vq/kT.
The hardware cost-performance overheads of this approach only grow polylogarithmically with the energy efficiency gains
  Over time, we can expect the overheads will be overtaken by competitively-driven per-device manufacturing cost reductions
If devices better than FETs aren’t found, then I predict an eventual “bounce” in device sizes

The Need for Ballistic Processes

In order to achieve low overall entropy generation in a complete system,
  Not only must the logic transitions themselves take place in an adiabatic fashion,
  but also the components that drive and control the signal levels and timing of logic transitions (“power clocks”) must proceed reversibly along the desired trajectory.
Thus, we require a ballistic driving mechanism:
  One that proceeds “under its own momentum” along a desired trajectory with relatively little entropy increase.
Many concepts for such mechanisms have been proposed, but…
  Designing a sufficiently high-quality power-clock mechanism remains the major unsolved problem of reversible computing

Requirements for Energy-Recovering Clock/Power Supplies

All of the known reversible computing schemes require the presence of a periodic and globally distributed signal that synchronizes and drives adiabatic transitions in the logic.
  For good system-level energy efficiency, this signal must oscillate resonantly and near-ballistically, with a high effective quality factor.
Several factors make the design of a resonant clock distributor that has satisfactorily high efficiency quite difficult:
  Any uncompensated back-action of logic on resonator
  In some resonators, Q factor may scale unfavorably with size
  Excess stored energy in resonator may hurt the effective quality factor
There’s no reason to think that it’s impossible to do it…
  But it is definitely a nontrivial hurdle, that we reversible computing researchers need to face up to, pretty urgently…
    If we hope to make reversible computing practical in time to avoid an extended period of stagnation in computer performance growth.

MEMS Resonator Concept
(PATENT PENDING, UNIVERSITY OF FLORIDA)

[Figure: MEMS resonator concept, drawn on x, y, z axes. A moving metal plate on a support arm/electrode travels over its range of motion between a phase 0° electrode and a phase 180° electrode. The arm is anchored to nodal points of fixed-fixed beam flexures, located a little ways away, in both directions (for symmetry). The interdigitated structure repeats arbitrarily many times along the y axis, all anchored to the same flexure. Two insets plot C(θ) against θ from 0° to 360°, one per electrode.]

MEMS Quasi-Trapezoidal Resonator: 1st Fabbed Prototype
(Funding source: SRC CSR program)
(PATENT PENDING, UNIVERSITY OF FLORIDA)

Post-etch process is still being fine-tuned.
  Parts are not yet ready for testing…

[Figure: the fabricated prototype, with labels: primary flexure (fin), sense comb, drive comb.]

Would a Ballistic Computer be a Perpetual Motion Machine?

Short answer: No, not quite!
  Hey, give us some credit here!
    We’re hard-core thermodynamics geeks, we know better than that!
Two traditional (and impossible!) kinds of perpetual motion machines:
  1st kind: Increases total energy - Violates 1st law of thermo. (energy conservation)
  2nd kind: Reduces total entropy - Violates 2nd law of thermo. (entropy non-decrease)
Another kind that might be “possible” in an ideal world, but not in practice:
  3rd kind: Produces exactly 0 increase in entropy!
    Requires perfect knowledge of physical constants, perfect isolation of system from environment, complete tracking of system’s global wavefunction, no decoherence, etc.
What we’re more realistically trying to build in reversible computing is none of the above, but only the more modest goal of a “For-a-long-time Motion Machine”
  I.e., one that just produces as close to zero entropy (per op) as we can possibly achieve!
    It would “coast” along for a while, but without energy input, it would eventually halt
  Such a “coasting” machine can perform no net mechanical work in a complete cycle,
    But it can potentially do a substantial amount of useful computational work!

Some Results on Scalability of Reversible Computers

In a realistic physics-based model of computation that accounts for thermodynamic issues:
  When leakage is negligible and heat flux density is bounded,
    Adiabatic machines asymptotically outperform irreversible machines (even per unit cost!) as problem sizes & machine sizes are scaled up
      But, the absolute speedup when total system power is unrestricted grows only as a small polynomial with the machine size
        E.g., exponents of 1/36 or 1/18, depending on problem class
      The speedup per unit surface area or (equivalently) per unit power dissipation grows at a somewhat faster (but still gradual) rate
        E.g., with the 1/6 power of machine size
  Even when leakage is non-negligible,
    Adiabatic machines can still attain constant-factor (i.e., problem-size-independent) energy savings (& speedups at fixed power) that scale as moderate polynomials of the device characteristics
      E.g., roughly with the transistor on-off ratio to at least the ~0.39 power
      Cost overheads from RC in these scenarios also grow, somewhat faster
        But, we can hope that device costs will continue to decline over time

Bennett’s 1989 Algorithm for Worst-Case “Reversiblization”

[Figure: two example schedules of the algorithm, one for k=2, n=3 and one for k=3, n=2.]

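The slide only names the algorithm and shows two (k, n) cases. As a hedged illustration, the sketch below simulates one common formulation of Bennett's 1989 checkpointing recursion as a "pebble game". To advance k^level segments, it computes k sub-blocks forward and then uncomputes the first k-1 intermediate checkpoints. It is written from general knowledge of the technique, not from the talk, and assumes that k and n play the roles of branching factor and recursion depth. All function and key names are hypothetical.

```python
def bennett_schedule(k, n):
    """Simulate the checkpoint ("pebble") recursion over k**n segments.

    Returns the number of segment executions and the peak number of
    checkpoints held at once (not counting the initial state).
    """
    checkpoints = {0}   # position 0 is the input state, always available
    stats = {"segments": k ** n, "segment_runs": 0, "peak_checkpoints": 0}

    def run_segment(i, undo):
        # Running segment i -> i+1 forward creates checkpoint i+1; running it
        # in reverse uncomputes checkpoint i+1. Both require checkpoint i.
        assert i in checkpoints
        if undo:
            checkpoints.remove(i + 1)
        else:
            assert i + 1 not in checkpoints
            checkpoints.add(i + 1)
        stats["segment_runs"] += 1
        stats["peak_checkpoints"] = max(stats["peak_checkpoints"],
                                        len(checkpoints) - 1)

    def advance(start, level, undo=False):
        """Create (or, if undo, remove) the checkpoint k**level past start."""
        if level == 0:
            run_segment(start, undo)
            return
        step = k ** (level - 1)
        forward_blocks = k - 1 if undo else k     # sub-blocks computed forward
        for j in range(forward_blocks):
            advance(start + j * step, level - 1)
        undone_blocks = k if undo else k - 1      # sub-blocks then uncomputed
        for j in reversed(range(undone_blocks)):
            advance(start + j * step, level - 1, undo=True)

    advance(0, n)
    assert checkpoints == {0, k ** n}   # only input and final result remain
    return stats

for k, n in [(2, 3), (3, 2)]:
    print(f"k={k}, n={n}:", bennett_schedule(k, n))
```

In this formulation the number of segment executions is (2k-1)^n and the peak checkpoint count is n(k-1)+1, so the two cases shown on the slide trade a little extra time for different amounts of temporary storage.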
Worst-Case Energy/Cost Tradeoff (Optimized Bennett-89 Variant)

[Figure: “Cost-Efficiency Gains, Modified Ben89: Advantage in Arbitrary Computation”. Log-scale plot against the On/Off Ratio of Individual Devices (1 to 1E+12), with fitted trend lines $y = 1.741\,x^{0.6198}$ and $y = 0.3905\,x^{0.3896}$, the annotation cost ∝ energy^1.59, a secondary axis (0 to 70) for the parameters k and n, and series labeled out and hw.]

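The two fitted trend lines on the plot can be evaluated directly. In the sketch below, reading the steeper line as the cost overhead and the shallower line as the energy gain is an assumption. It is suggested by the ratio of the two exponents matching the 1.59 annotation, and by the ~0.39 power quoted earlier in the talk for energy savings versus on/off ratio. The function names are illustrative.

```python
def steep_fit(on_off_ratio):
    """Steeper fitted trend line from the plot: y = 1.741 x^0.6198
    (assumed here to be the cost overhead)."""
    return 1.741 * on_off_ratio ** 0.6198

def shallow_fit(on_off_ratio):
    """Shallower fitted trend line from the plot: y = 0.3905 x^0.3896
    (assumed here to be the energy-efficiency gain)."""
    return 0.3905 * on_off_ratio ** 0.3896

# The exponent ratio reproduces the plot's "energy^1.59" annotation.
print(f"exponent ratio = {0.6198 / 0.3896:.2f}")

for ratio in (1e4, 1e8, 1e12):
    print(f"on/off = {ratio:.0e}: shallow fit = {shallow_fit(ratio):.3g}, "
          f"steep fit = {steep_fit(ratio):.3g}")
```

Under that reading, each tenfold improvement in device on/off ratio buys roughly a 2.5-fold energy gain (10^0.39) at the price of a roughly 4-fold growth in the cost factor (10^0.62).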
(Most) Device Physicists have it all wrong!

Unfortunately, I’d say >90% of papers published on new logic device concepts (whether based on CNTs, spintronics, etc.) either ignore or dramatically neglect the key issue of the energy efficiency of logic operations
  Even though, looking forward, this is absolutely the most crucial parameter limiting the practical performance of leading-edge computing systems!
And, even the rare few device physicists who study reversible devices don’t seem to be talking to the analog/RF/µwave engineers who might help them solve the many subtle and difficult problems involved in building extremely high-quality energy-recovering power-clock resonators

Device-Level Requirements for Reversible Computing

A good reversible digital bit-device technology should have:
  Low amortized manufacturing cost per device, ¢_d
    Important for good overall (system-level) cost-efficiency
  Low per-device level of static “standby” power dissipation $P_{sb}$ due to energy leakage, thermally-induced errors, etc.
    This is required for energy-efficient storage devices, especially
    but it’s still a requirement (to a lesser extent) in logic as well
  Low energy coefficient $c_{Et} = E_{diss} \cdot t_{tr}$ (energy dissipated per operation, times transition time) for adiabatic transitions between digital states.
    This is required in order to maintain a high operating frequency simultaneously with a high level of computational energy efficiency.
      And thus maintain good hardware efficiency (thus good cost-performance)
  High maximum available transition frequency $f_{max}$.
    This is especially important for applications in which the latency from inherently serial computing threads dominates total operating costs

Plenty of Room for Device Improvement

Recall, irreversible device technology has at most ~3-4 orders of magnitude of power-performance improvements remaining.
  And then, the firm kT ln 2 (VNL) limit is encountered.
But, a wide variety of proposed reversible device technologies have been analyzed by physicists.
  With preliminary estimates of theoretical power-performance up to 10-12 orders of magnitude better than today’s CMOS!
  Ultimate limits are unclear.

[Figure: “Power vs. freq., alt. device techs.”: power per device (W, 1.E-31 to 1.E-03) vs. frequency (Hz, 1.E+03 to 1.E+12). Series: .18um 2LAL, nSQUID, QCA cell, Quantum FET, Rod logic, Param. quantron, Helical logic, .18um CMOS, and kT ln 2. A callout marks “Various reversible device proposals”.]

One Optimistic Scenario

[Figure: “A Potential Scenario for CMOS vs. Reversible Raw Affordable Chip Performance”: device-ops/second per affordable 100W chip (1.00E+17 to 1.00E+23) vs. year (2004 to 2020), with one curve for CMOS and one for Reversible. Callouts: “e.g. 1 billion devices actively switching at 3.3 GHz, ~7,000 kT dissip. per device-op” and “40 layers, ea. w. 8 billion active devices, freq. 180 GHz, 0.4 kT dissip. per device-op”.]

Note that by 2020, there could be a factor of 20,000× difference in raw performance per 100W package. (E.g., a 100× overhead factor from reversible design could be absorbed while still showing a 200× boost in performance!)
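The two callouts on the chart can be sanity-checked against the 100 W budget. The sketch below assumes an operating temperature of 300 K (not stated on the slide). It takes the 7,000 kT callout as the CMOS design point and the 0.4 kT callout as the reversible one. The function name `chip` is illustrative.

```python
K_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # assumed operating temperature in kelvin (not on the slide)

def chip(devices, freq_hz, kt_per_op):
    """Return (device-ops per second, dissipated power in watts)."""
    ops_per_s = devices * freq_hz
    return ops_per_s, ops_per_s * kt_per_op * K_B * T

cmos_ops, cmos_w = chip(1e9, 3.3e9, 7000)     # 1 billion devices at 3.3 GHz
rev_ops, rev_w = chip(40 * 8e9, 180e9, 0.4)   # 40 layers x 8 billion at 180 GHz

print(f"CMOS:       {cmos_ops:.2e} device-ops/s at {cmos_w:.1f} W")
print(f"Reversible: {rev_ops:.2e} device-ops/s at {rev_w:.1f} W")
print(f"Ratio of the two callout points: {rev_ops / cmos_ops:,.0f}x")
print(f"20,000x gap after a 100x overhead: {20_000 / 100:.0f}x")
```

Under the assumed temperature both design points dissipate a little under 100 W, and the throughput ratio between the two callouts is of the same order as the 20,000× figure quoted for 2020.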
A Call to Action

The world of computing is threatened by permanent raw performance-per-power stagnation in ~1-2 decades…
  We really should try hard to avoid this, if at all possible!
  A wide variety of very important applications will be impacted.
Many more of the nation’s (and the world’s) top physicists and computer scientists must be recruited, to tackle the great “Reversible Computing Challenge.”
Urgently needed: A major new funding program; a “Manhattan Project” for energy-efficient computing!
  Mission: Demonstrate computing beyond the von Neumann-Landauer limit in a practical, scalable machine!
    Or, if it really can’t be done, for some subtle reason, find a completely rock-solid proof from fundamental physics showing why.

Efficiency in General, and Energy Efficiency

The efficiency $\eta$ of any process is: $\eta = P/C$
  Where $P$ = Amount of some valued product produced
  and $C$ = Amount of some costly resources consumed
In energy efficiency $\eta_e$, the cost $C$ measures energy.
We can talk about the energy efficiency of:
  A heat engine: $\eta_{he} = W/Q$, where:
    $W$ = work energy output, $Q$ = heat energy input
  An energy recovering process: $\eta_{er} = E_{end}/E_{start}$, where:
    $E_{end}$ = available energy at end of process,
    $E_{start}$ = energy input at start of process
  A computer: $\eta_{ec} = N_{ops}/E_{cons}$, where:
    $N_{ops}$ = # useful operations performed
    $E_{cons}$ = free-energy consumed

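The three efficiency measures share the same product-over-cost form and differ only in what is counted. The sketch below applies one helper to each case; every number in it is hypothetical and chosen only to exercise the definitions.

```python
def efficiency(product, cost):
    """eta = P / C: amount of valued product per unit of costly resource."""
    return product / cost

# All numbers below are hypothetical illustrations, not data from the talk.
eta_he = efficiency(product=300.0, cost=1000.0)  # W = 300 J work from Q = 1000 J heat
eta_er = efficiency(product=0.95, cost=1.00)     # E_end = 0.95 J from E_start = 1 J
eta_ec = efficiency(product=1e9, cost=1e-8)      # N_ops = 1e9 ops on E_cons = 1e-8 J

print(f"heat engine:     eta_he = {eta_he:.2f} (dimensionless)")
print(f"energy recovery: eta_er = {eta_er:.2f} (dimensionless)")
print(f"computer:        eta_ec = {eta_ec:.1e} ops per joule")
```

Note that the first two are dimensionless ratios bounded by 1, whereas the computational energy efficiency carries units of operations per joule and has no such fixed ceiling.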