“Information Friction” and Minimum Power Consumed The hope:

advertisement
“Information Friction” and Minimum Power Consumed
in Data Center and Big Data Networks
The hope:
Pulkit Grover
CMUdistortion?
y probes = significant near-field
the core idea:
coil in or around the circuit
coil is probed, detect probing by measuring field
Acknowledgments:
NSF ECCS 1343324
NSF CSoI !1 /15
Energy consumed in communication & computing
In the body
- 20% of total energy
In the world
- 8% of electricity in 2011
- Projection: 15% by 2020
!2 /15
Moore’s law,
and the difficulty of reducing computation power
Moore’s law is saturating
Higher frequencies cost more power
But this is the era of
BigData!
!3 /15
Solution to Moore’s law saturation: Parallelize!
Use communication to compute
Communication has become a bottleneck
“We need a “Moore’s Law” technology for networking” [Byrant, Katz, Lazowska ’08]
!4 /15
!
How much power must communication consume?
Transmitter
PT
A soft-spoken teacher’s dilemma
Speak slower? louder?
Channel
c
M
Receiver
“Shannon’s solution”
Neither!
Have a lot to say; “Mix” your information
−5
log (P(P
)
log
10 e e )
10
M
−10
fixed R,
distance
−15
path loss
Shannon limit
✓
◆
−25 (Tx power)
PT
C = W log 1 +
>R
N0
−20
−30
0
0.05
0.1
0.15
Total power (Watts)
Total power
(watts)
Transmit
0.2
!5 /15
How can we reduce communication power?
log10(P(P
)
log
10 e e )
−5
−10
fixed R,
distance
−15
path loss
Shannon limit
✓
◆
−25 (Tx power)
PT
C = W log 1 +
>R
N0
−20
−30
0
0.05
0.1
0.15
Total power (Watts)
Total power
(watts)
Transmit
0.2
Shannon limit is already achieved
[Turbo Codes ’93]
[LDPC codes ’98]
[Polar Codes ’08]
We are already at the theoretical minimum transmit power
/27
!6 /15
Communication systems: then and now
!7 /15
The problem has changed
We now want to minimize total (transmit + circuit) power
!8 /15
What could fundamental limits on total power look like?
M
log10(P(P
) )
log
e
e
10
−5
−10
Transmitter
PT
Channel
c
M
Receiver
Total power behavior
fixed rate,
distance
this?
... or this?
−15
−20
Shannon limit
−25 (Tx power)
−30
0
0.05
0.1
0.15
Total power (watts)
0.2
Total power (Watts)
!9 /15
Total power: Hints from experiments
Circuit-simulation-based modeling of decoding power
[Ganesan, Grover, Rabaey SiPS’11][Ganesan, Wen, Grover, Rabaey’12]
Regular LDPCs
(3,4) (3,6) (4,8) (5,10)
log10 (Pe )
−5
−10
Ptotal
⇤
PT
−15
−20
0
0.025
0.05
0.075
Power (Watts)
R = 7 Gbps
W = 7 GHz
path-loss coefft = 3
d = 3 meters
fc = 60 GHz
STM CMOS 90 nm
Cadence Encnter (Layout)
Synopsys HSPICE (post-lyt)
0.1
!10 /15
“Information-friction” energy model
Circuit implementation model:
Computational nodes that can
talk to one another over circuit links
Min distance between nodes
=
“Info-friction” model of energy
consumed in circuit-links: weight w
w
B bits
d
Efriction = µ w d
Example applications:
Einfo
d
friction
=µBd
Coefficient of
informational friction
!11 /15
Info-friction losses in encoding/decoding
M
Transmitter
PT
c
M
Receiver
Channel
Theorem [Grover, ISIT ’13]
0 s
⌦ @k
Eenc,dec
log
1
Pe
PT
1
for any code, and any encoding &
A decoding algorithm implemented
in the circuit model
log10 (Pe )
−5
Ptotal
−10
−15
⇠
−20
r
3
PT⇤
1
log
Pe
−25
0
Ptotal
r
1
3
2
1
PT ⇠ log
Pe
PT
with bounded
r
⇠
⇠
−30
In comparison,
uncoded transmission:
1
log
Pe
1
log
Pe
3
Power (Watts)
4
5
−5
x 10
Experimental
corroboration
[Ganesan, Grover,
Goldsmith, Rabaey ‘12]
!12 /15
Why does this matter?
Traditional view
Total-power perspective
41
Code choice (10-Gbase-T):
(6,32) - “RS-LDPC”, length 2048
10
FER/BER
FER/BER
10
10
10
10
10
10
10
0
20
30
−2
-5
−4
log10 (Pe )
10
Distance (meters)
−6
-10
−8
−10
−12
−14
2.5
-15
10 iterations
20 iterations
50 iterations
100 iterations
200 iterations
3
3.5
(6,32) - “RS-LDPC”
Code-choice must
length 2048
depend on distance
-20
4
4.5
5
5.5
6
Eb/No (dB)
SNR (dB)
FER (dotted lines) and BER (solid lines) performance of the Q4.2 sum-product
the (2048,1723) RS-LDPC code using different number of decoding iterations.
New codes that adapt and minimize total power [Mahzoon, Yang, Grover in prep]
!13 /15
Towards fundamental limits on energy of
computing networks
!
!
Total computation energy
on-chip energy + cable-energy
µon-chip Bon-chip don-chip + µcable Bcable dcable
Total energy for sorting, classification, …
!14 /15
Can noisy or approximate computing help?
The Shannon waterfall has inspired a study of noisy computing
[von Neumann’56]
[Pippenger’88]
[Hajek, Weller’91]
[Evans, Schulman’99]
… [Shanbhag, Kumar, Jones’10]
Theorem [Grover, ISIT ’14 Sub.]
Etotal per-bit
⌦
✓r
3
1
log
Pe
◆
for any implementation
with Gaussian noise
in circuit elements
… i.e., the “Shannon-capacity” of noisy computing is zero!
[Grover, ISIT’14 Sub.]
Nevertheless, noisy computing could help,
. . . but don’t expect Shannon waterfall’s “magic”
!15 /15
Summary
1) Traditional info theory ignores energy of encoding/decoding
- useful in long distance communication
- misleading in computation
−5
log10 (Pe )
log10(P(P
)
log
10 e e )
−5
−10
−10
−15
−15
−20
−20
−25
PT⇤
Shannon limit
(Tx power)
Ptotal
−25
−30
−30
0
Total power
with info-friction
0.05
0.1
0.15
Total power (Watts)
Total
Total
power
(watts)
0.2
0
1
2
3
4
Power (Watts)
5
−5
x 10
2) Info-friction model: fundamental limits on energy of various computations
- sorting/ranking, Fourier transforms, classification
!
!
3) Insights towards “good” computational network designs
4) Still need to understand noise in computing and its impact
!16 /15
!17 /15
Extra slides
!18 /15
cellationof wave functionsin re-
above use active channels
8. C. M. Cav
J. J. Hall
(Cambrid
89; W. Zu
R. Landa
C. H. Ben
__,
I
D. Frank
Requirements Internati
Minimal
Energy
sociation
...
in Communication
1995), pp
Worksho
'94 (IEE
’96]
the wave function within that eliminated.[Landauer
the chaoticnature
Furthermore,
CA, 1994
NOTES
literature describing the energy needs for a communications channel has been
13. C. M. Cav
atREFERENCES
a much AND
In
it
higher
energy.The
the
of
billiard
ball
collisions
makes
extremely
dominated by analyses of linear electromagnetic transmission, often without awareness
66, 481 (
. J. Betley, V. L. Miller,J. J. Mekalanos, Annu. Rev.
energy
that
an
amount
of
the
conclusion
case
leads
to
special
case.
This
that
this
is
a
the
mit,
approaches
in
the
that
sensitive
to
flaws
machinery.
Using
crobiol.40,
577 behavior
(1986); J. Morschhauser,
V. Vet14. L. Brillou
, L. Emody, J. Hacker, Mol. Microbiol. 11, 555
equal to kTln 2, where kT is the thermal noise per unit bandwidth, is needed to transmit
wo-level
such
motion
forphoton
a communications
as an
elecR. Taylor, C. system,
P. Spears,
994);
Shaw, K. Peterson,
energies hv > kT. Alternative demic Pr
if quantized ball
channels
are used with
a bit,
and morebilliard
Mekalanos, Vaccine
6,151
(1988);
J.
Reidl
and
J.
The
physicist
noframework
unavoidable minimal
to done
show that
there is a
methods
are proposed
communication
15. L. B. Le
where
noMicrobiol.
details
the
link
most
within
beyond
spinis
easily
Mekalanos, Mol.
18, 685
(1995).
energy requirement per transmitted bit. These methods are invoked as part of an analysis
J. Mekalanos, Cell 35, 253 (1983).
Physics
states
net rate
of billiardball transfer
in-down
exist.
where
D. N. Pearson, thesis,
Harvard
UniversityMedical
procedures.
of ultimate limits
and not the
as practical
Compute
hool (1989).
smatter
inA. two
an in- doesnot dependon the informationcontent
First,
D. N. Pearson,
Woods, ways.
S. L. Chiang,
J. J.
pp. 210-2
ekalanos, Proc. Natl. Acad. Sci. U.S.A. 90, 3750
t993).
can be transmitted.This is not a of the link. For example,0 is denotedby a 16. H. J. Br
Fasano et a/, ibid. 88, 5242 (1991).
Informationis inevitablytied to a physical with a known and specifiablelowerbound (1982); H
and
a 1 isIt was
J. Michalski,
the
of error
is ball
an
A. Fasano, J. B.
Trucksis, J. if
Galen,
oblem,
followed
empty
slot,
probability
by
information.
a
paper,
a
that
discard
as
a
mark
on
are
those
such
representation,
aper, ibid. 90, 5267 (1993).
17. E. Fredk
path
loss
that
operalong
ago
(9)
also
understood
spin emptyslot followedby a
in a punched
card, an electron
hole
B. Kaper,
J. Michalski,
J. M.
mall
error
rate
a
Ketley, M. M. Levine,
small
represented
by
an
requires
only
18. R. Landa
fect. Immun. 62,1480 (1994).
or down, or a chargepresentor tions that do throwawayinformation,such
pointingup
(channel
“frictional”
loss)
fK.redundancy
use
of
athereturn
Taylor,V. L. Miller,D. B. to
J. J. Mekalaerror
free
ball.
This
scheme
Furlong,
regain
permits
logicalOR, can Noise in
AND
and
logical
This
representation
as
the
a
capacitor.
absent
on
s, Proc. Natl. Acad. Sci. U.S.A. 84,✓
2833 (1987).
◆
that per- Mountain
operations
larger
be imbeddedin
of physics
whether
the laws
leads us to asklink
L. Miller
andThe
J. J. Mekalanos,
sion
if
are
ibid. 81,that
we
opti3471 Prestores
(3).
(Fig.
5).
Alternatively,
decoding
Trestrictthe handlingof informationand in form a logical 1:1 mapping and do not National
984); V. J. DiRita,Mol. Microbiol.6, 451 (1992).
C
=
W
log
1
+
>R
L. Miller
and J. J. Mekalanos,
J. Bacteriol.
nal
170,
message
is
not
about
lifetime
of
a
then
photon,
dissipationless;
mistic
the
a real 1981), pp
Nevertheless,
discard information.
enerminimal
whether
there are
particular
N
75 (1988); V. J. DiRita,C. Parsot, G. Jander, J. J. 0
is now calledreversof what
with are
understanding
associated
gy dissipationrequirements
ekalanos,
Proc.
Natl. Acad. Sci.away
U.S.A. 88, information
5403
we
are
inventions
there
throwing
possible
conceivably
,A
19.
information handling. The subject has ible computationcame from the work of
991); G. Jonson, J. Holmgren, A.M. Svennerholm,
20. E. Goto, J
fect.
Immun.(4).
60, 4278However,
(1992).
error
the
the
minimal
make
use
of
timing
that compupolarization
showed
branches
(10, 11 ), whoor
Bennett
but interrelated
three distinct that
Noise
Goldberg and J. J. Mekalanos, J. Bacteriol. 165,
[Shannon
The
engineer
conductedthrougha J. von Ne
can alwaysbe ’48]
measurement tation
with the
dealing,respectively,
3 (1986).
is
to
the
Or
there
couldBennett
be furdissipation
proportional
of
single
photons.
perhaps
/15
!19exam
process,the communicationschannel, and seriesof logical 1:1 mappings.
21. For
e primers used to determine the sequence of the
1994, Genetics Computer Group, 575 Science
grateful to Mekalanos laboratory members, espeof geneticelementsmediatingthe transDrive, Madison, WI 5371 1, USA).
ciallyE. Rubin,J. Tobias, S. Chiang, and G. Pearson,
0f virulence
or 1 state
channel.Such a
(Fig.
4A). The
two all along the length offorthe
genes with
the pathogenic
22. S. L. Chiang, R. K. Taylor,M. Koomey, J. J. Mekalavaluable support, suggestions, and strains. All
nos, Mol. Microbiol.17,1133 (1995).
experiments involvinganimals were done in accorderialeigenstates
species they infect. (Fig.
Thus, a virued
as
4B)
have
these
schemes
setup
makes
ance with theunappealing
guidelines for animal experimentation
e factor (TCP) is the receptorfor a 23. D. A. Herrington et al., J. Exp. Med. 168, 1487
of HarvardMedical School.
(1988).
fferent
energies.
The most 9.
Their
exact
technological
candidates.
eriophageencoding
another
virulence
24. wave
R. H. Hall,P. A.serious
Vial,J. B. Kaper,
J. J. Mekalanos, M.
22 April1996; accepted 28 May 1996
M. Levine, Microb. Pathogen. 4, 257 (1988).
or(CT), both of which arecoordinately
within
a
difpocket
active link, for the 10.
are
obvious
alternative
to
an
slightly
latedby the same virulenceregulatory
11.
In this case,
the natural
habitatand can(toxR).
ey
cannot
cancel
exactly
classicalcase, is the use of particleswith 12.
th phageand pathogenis the gastroinanal
totally
This sufficientlywell-definedspeed and directract. It isunoccupied
apparentthat thispocket.
host
partment
the necessary
envican beprovides
minimized
by
using very tion, as in the well-knowncollidingbilliard
mentalsignalsrequiredforthe expression
in which
tential
ball computer(29). This scheme assumes
sential
genepockets,
productsmediating
interac-the next
between
all threethe
participants,
namely,
ate
within
pocket,
a
that
friction
can
be
completely
having
and
noise
Rolf
Landauer
[1996]
erium,phage,and mammalianhost.
How much power must communication consume?
zero!
Total power: Hints from experiments
Circuit-simulation-based modeling of decoding power
[Ganesan, Grover, Rabaey SiPS’11][Ganesan, Wen, Grover, Rabaey’12]
Regular LDPCs
(3,4) (3,6) (4,8) (5,10)
log10 (Pe )
−5
−10
Ptotal
⇤
PT
−15
−20
0
0.025
0.05
0.075
Power (Watts)
R = 7 Gbps
W = 7 GHz
path-loss coefft = 3
d = 3 meters
fc = 60 GHz
STM CMOS 90 nm
Cadence Encnter (Layout)
Synopsys HSPICE (post-lyt)
0.1
!20 /15
Decoding complexity/power:
Fundamental limits
Theorem*[Grover, Woyach, Sahai ISIT ’08, JSAC ’11]
# of clock-cycles, ⇥
&
1
log(
1)
log
log P1e
(C(PT )
R)2 /V (PT )
!
. . . for any code and any decoding algorithm.
↵ : Max. node degree
V (PT ) : “channel dispersion”
Node mode: Enode joules per clock-cycle
* precise result for any Pe and any gap appears in [Grover, Woyach, Sahai ’11]
!21 /15
Moving information:
a model of implementation
Output nodes
Input nodes
Helper nodes
1) Asynchronous
2) No connectivity
limitations
3) No constraints
on interconnect
material
4) but fixed schedule
[Grover ’13]
!22 /15
Looking forward:
Neuronal and neuro-morphic processing
Nodes of Ranvier
(provides insulation)
friction + noise in computing?
!23 /15
Download