Synthesis of the Optimal 4-bit Reversible Circuits
Oleg Golubitsky, Google Inc., Waterloo, ON, Canada
Sean Falconer, Stanford University, Stanford, CA
Dmitri Maslov (speaker), University of Waterloo, Waterloo, ON, Canada
Basic Definitions
[Figure: circuit symbols for the NOT, CNOT, Toffoli, and Toffoli-4 gates. The Toffoli gate maps (x, y, z) to (x, y, z ⊕ xy).]
A reversible circuit is a string of gates. A reversible n-bit function is a permutation of $2^n$ elements.
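The definitions above can be made concrete with a minimal sketch. It assumes a circuit state is a 4-bit integer, wire i is bit i, and a gate flips its target bit when all of its control bits are 1. The names `make_gate` and `all_gates` are illustrative and do not come from the talk.

```python
from itertools import combinations

def make_gate(target, controls=(), n_bits=4):
    """Return the gate as a tuple p, where p[x] is the image of state x."""
    mask = sum(1 << c for c in controls)
    return tuple(x ^ (1 << target) if (x & mask) == mask else x
                 for x in range(1 << n_bits))

def all_gates(n_bits=4):
    """NOT, CNOT, Toffoli, and Toffoli-4: one target, any subset of the other wires as controls."""
    gates = {}
    for t in range(n_bits):
        others = [w for w in range(n_bits) if w != t]
        for k in range(n_bits):
            for controls in combinations(others, k):
                gates[(t, controls)] = make_gate(t, controls, n_bits)
    return gates

gates = all_gates()
toffoli = gates[(2, (0, 1))]   # z ^= x & y, with x = bit 0, y = bit 1, z = bit 2
print(len(gates))                                            # number of distinct gates
print(all(sorted(p) == list(range(16)) for p in gates.values()))  # every gate is a permutation
```

Under this encoding a circuit is simply a list of such tuples applied one after another, and its function is again a permutation of the 16 states.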
page 1/15
Problem
Synthesize optimal 4-bit reversible circuits, i.e., circuits containing
the minimal number of gates.
Complexity
-- There are 16! = 20,922,789,888,000 reversible functions.
-- There are 32 gates.
-- An average optimal circuit requires 11.94 gates.
=> Storing all optimal circuits takes 20,922,789,888,000 * log2(32) * 11.94 bits > 100 TB.
Murphy, David. "Western Digital Launches World-First 2TB
Hard Drive". PC World. Retrieved 2009-01-27.
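The storage estimate on this slide can be checked with a few lines of arithmetic. The sketch below only evaluates the expression as written. The conversion to decimal terabytes (1 TB = 10^12 bytes) is an assumption about the intended unit.

```python
import math

functions = math.factorial(16)      # number of 4-bit reversible functions
bits_per_gate = math.log2(32)       # 32 gates, so 5 bits name one gate
avg_gates = 11.94                   # average optimal circuit size quoted on the slide

total_bits = functions * bits_per_gate * avg_gates
total_tb = total_bits / 8 / 1e12    # assumption: decimal terabytes
print(functions)
print(f"{total_tb:.0f} TB")
```

The result is on the order of 150 TB, consistent with the "> 100 TB" bound.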
page 2/15
Importance
Library for physicists interested in performing a small
experiment, but having very limited control over their system.
Indispensable for peep-hole optimization methods. Peephole optimizations are an important part of any modern
compiler.
Mathematical curiosity. Computing the value of Shannon's
complexity function: L(3) = 8, L(4) in [14, 17], L(5) = ?
page 3/15
Solution
Denote $N := 16!$ (formally, $N := (2^n)!$).
Rough complexity analysis
-- time: $O(N)$
-- space: $O(N)$
Next, reduce these complexity figures to something
manageable.
page 4/15
Solution
Optimization 1
Synthesize and save only halves of all optimal circuits.
An optimal circuit for any function may be found by searching
for both of its halves.
Rough complexity analysis
-- space: $O(\sqrt{N})$
-- time: $O(\sqrt{N} \cdot \sqrt{N}) = O(N)$
page 5/15
Solution
Optimization 2
Store optimal halves in a hash table.
Rough complexity analysis
-- time: soft-$O(\sqrt{N})$
-- space: $O(\sqrt{N})$
Actual complexity is closer to
-- time: soft-$O(\mathrm{SpaceRequiredToStoreHalves}(N))$
-- space: $O(\mathrm{SpaceRequiredToStoreHalves}(N))$
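The idea of storing halves in a hash table can be sketched as a small meet-in-the-middle search. This is an illustration under stated assumptions, not the authors' implementation: a Python `dict` stands in for the hash table, functions are stored as plain tuples without any of the later optimizations, and the names `build_halves` and `optimal_size` are hypothetical.

```python
from itertools import combinations

def make_gates(n_bits=4):
    gates = []
    for t in range(n_bits):
        others = [w for w in range(n_bits) if w != t]
        for k in range(n_bits):
            for controls in combinations(others, k):
                mask = sum(1 << c for c in controls)
                gates.append(tuple(x ^ (1 << t) if (x & mask) == mask else x
                                   for x in range(1 << n_bits)))
    return gates

def build_halves(gates, max_gates):
    """Breadth-first search: map every function reachable with <= max_gates gates to its optimal size."""
    identity = tuple(range(len(gates[0])))
    size = {identity: 0}
    frontier = [identity]
    for depth in range(1, max_gates + 1):
        nxt = []
        for f in frontier:
            for g in gates:
                h = tuple(g[y] for y in f)   # circuit for f followed by gate g
                if h not in size:
                    size[h] = depth
                    nxt.append(h)
        frontier = nxt
    return size

def optimal_size(f, size):
    """Optimal gate count of f if it is at most twice the stored depth, else None."""
    best = None
    for h, s in size.items():
        # If f = (first half h) followed by (second half r), then r[h[x]] = f[x].
        r = [0] * len(f)
        for x, hx in enumerate(h):
            r[hx] = f[x]
        t = size.get(tuple(r))               # hash lookup of the second half
        if t is not None and (best is None or s + t < best):
            best = s + t
    return best

halves = build_halves(make_gates(), max_gates=2)
flip_all = tuple(x ^ 0b1111 for x in range(16))   # NOT on every wire
print(optimal_size(flip_all, halves))
```

Flipping all four wires needs four gates, because each gate changes a single wire. The search finds this with only the circuits of up to 2 gates stored, which is the point of keeping halves.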
page 6/15
Solution
Optimization 3
Simultaneous input/output relabeling does not change the
optimality of a circuit. Thus, we store a single canonical representative (the binary string that is least in lexicographic order).
In practice, a function has almost 4! = 24 different relabelings,
reducing the storage complexity by a factor of almost 24 and
helping to reduce the runtime.
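A minimal sketch of such a canonical form follows. It assumes that functions are stored as tuples of 16 values and that Python's lexicographic tuple ordering plays the role of the binary-string ordering. The helper names are hypothetical.

```python
from itertools import permutations

def relabel_state(x, pi):
    """Move bit i of x to position pi[i]."""
    return sum(((x >> i) & 1) << pi[i] for i in range(len(pi)))

def relabel(f, pi):
    """Apply the same wire relabeling to the inputs and the outputs of f."""
    g = [0] * len(f)
    for x, fx in enumerate(f):
        g[relabel_state(x, pi)] = relabel_state(fx, pi)
    return tuple(g)

def canonical(f, n_bits=4):
    """Lexicographically least function among all simultaneous relabelings of f."""
    return min(relabel(f, pi) for pi in permutations(range(n_bits)))

cnot_0_to_1 = tuple(x ^ ((x & 1) << 1) for x in range(16))
cnot_2_to_3 = tuple(x ^ (((x >> 2) & 1) << 3) for x in range(16))
print(canonical(cnot_0_to_1) == canonical(cnot_2_to_3))
print(len({relabel(cnot_0_to_1, pi) for pi in permutations(range(4))}))
```

A highly symmetric function such as CNOT has fewer than 24 distinct relabelings, which is why the saving is "almost" a factor of 24 rather than exactly 24.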
page 7/15
Solution
Optimization 4
If an optimal circuit is found for a function f, an optimal circuit
for the inverse function, $f^{-1}$, can be obtained by reversing the
optimal circuit for f.
In practice, a random f frequently differs from $f^{-1}$, which
reduces the storage requirement by an additional factor
of almost 2 and helps to further reduce the runtime.
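The reversal property holds because every gate in the library is its own inverse. A short sketch, using a hypothetical `gate` helper and an arbitrary example circuit:

```python
def gate(target, controls=(), n_bits=4):
    mask = sum(1 << c for c in controls)
    return tuple(x ^ (1 << target) if (x & mask) == mask else x
                 for x in range(1 << n_bits))

def function_of(circuit, n_bits=4):
    """Compose the gates left to right into a single permutation."""
    f = list(range(1 << n_bits))
    for g in circuit:
        f = [g[y] for y in f]
    return tuple(f)

def inverse(f):
    inv = [0] * len(f)
    for x, fx in enumerate(f):
        inv[fx] = x
    return tuple(inv)

# NOT, CNOT, Toffoli, Toffoli-4 in sequence (an arbitrary example circuit).
circuit = [gate(0), gate(1, (0,)), gate(2, (0, 1)), gate(3, (0, 1, 2))]
f = function_of(circuit)
f_reversed = function_of(circuit[::-1])
print(f_reversed == inverse(f))   # the reversed circuit computes the inverse
print(f == f_reversed)            # this f is not its own inverse
```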
page 8/15
Performance
Parameters of the linear hash table storing canonical
representatives.
k    Size        Memory usage    Load factor
7    $2^{25}$    256 MB          0.58
8    $2^{28}$    2 GB            0.91
9    $2^{32}$    32 GB           0.51
Using a high performance server with 16 AMD Opteron
2300 MHz processors, 64 GB RAM, and a Seagate
Barracuda ES2 SCSI 7200 RPM HDD running Linux, it took
10,549 seconds (under 3 hours) to synthesize all optimal
circuits with up to 9 gates.
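The memory figures in the table are consistent with 8 bytes per slot, which is exactly the space needed for one 4-bit permutation (16 values of 4 bits each). The sketch below assumes that layout. The slot size and the `pack`/`unpack` helpers are assumptions for illustration, not details confirmed by the slides.

```python
def pack(f):
    """Pack a 4-bit permutation (16 values of 4 bits) into one 64-bit integer."""
    word = 0
    for x, fx in enumerate(f):
        word |= fx << (4 * x)
    return word

def unpack(word):
    return tuple((word >> (4 * x)) & 0xF for x in range(16))

BYTES_PER_SLOT = 8   # assumption: one 64-bit word per hash table slot

# k: (number of slots, load factor), as listed on the slide
table = {7: (2**25, 0.58), 8: (2**28, 0.91), 9: (2**32, 0.51)}
for k, (slots, load) in table.items():
    mib = slots * BYTES_PER_SLOT // 2**20
    print(k, f"{mib} MiB", f"about {slots * load / 1e6:.0f} million occupied slots")

print(hex(pack(tuple(range(16)))))
```

The occupied-slot counts are only the product of size and load factor, so they are approximate.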
page 9/15
Performance
Number of functions in the random sample requiring a circuit of each size:
Size    Functions
14      17,191
13      2,371,039
12      5,110,943
11      2,051,507
10      392,108
9       50,861
8       5,269
7       455
6       24
5       3
Synthesis of 10,000,000 random functions (Fisher-Yates shuffle over
the Mersenne twister random number generator) took 104,616.716 seconds
(about 29 hours) of user time, with a maximal memory usage of 43.04 GB.
Loading optimal circuits with up to 9 gates into RAM took 1111 seconds.
On average, it took only 0.01035 seconds to synthesize an optimal circuit.
For comparison, the access time of a 5400-RPM HDD may be
expected to be on the order of 0.01 to 0.02 seconds.
page 10/15
Performance
[Figure: distribution of the number of functions requiring a circuit of a specified size (gate count).]
page 11/15
Performance
Distribution of the number of linear functions requiring a
circuit of a specified size (gate count).
Size    Functions
10      138
9       13,555
8       84,225
7       118,424
6       72,062
5       26,182
4       6,589
3       1,206
2       162
1       16
0       1
Total: 322,560
Weighted average (WA): 6.8816
It took under 2 seconds to synthesize all these circuits.
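A distribution of this kind can be obtained with a plain breadth-first search. The sketch below assumes that "linear" means functions computable with NOT and CNOT gates only and that circuit size is counted over that gate set. Each output bit is stored as an affine form (four coefficient bits plus a constant bit), an encoding chosen here for speed and not taken from the talk.

```python
def linear_size_distribution(n_bits=4):
    """Count NOT/CNOT-computable functions by minimal gate count (breadth-first search)."""
    width = n_bits + 1                      # coefficient bits plus one constant bit
    form_mask = (1 << width) - 1
    const_bit = 1 << n_bits
    identity = sum((1 << i) << (width * i) for i in range(n_bits))
    seen = {identity}
    frontier = [identity]
    counts = []
    while frontier:
        counts.append(len(frontier))        # functions needing exactly len(counts)-1 gates
        nxt = []
        for s in frontier:
            for t in range(n_bits):
                shift_t = width * t
                candidates = [s ^ (const_bit << shift_t)]          # NOT on wire t
                for c in range(n_bits):
                    if c != t:                                     # CNOT with control c, target t
                        candidates.append(s ^ (((s >> (width * c)) & form_mask) << shift_t))
                for m in candidates:
                    if m not in seen:
                        seen.add(m)
                        nxt.append(m)
        frontier = nxt
    return counts

counts = linear_size_distribution()
total = sum(counts)
weighted_average = sum(size * n for size, n in enumerate(counts)) / total
print(counts)
print(total, round(weighted_average, 4))
```

If the assumption about the gate set holds, the printed list can be compared entry by entry with the table above. The run takes a few seconds in pure Python.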
page 12/15
Performance
page 13/15
Future directions
Larger circuits
-- There are 80 transformations resulting from the application
of all possible Toffoli-type gates on 5 bits.
-- 80^6 * (log2 80)/5!/2 ~ 7.1 billion bits, which fits into RAM.
-- 6+6=12, meaning it is reasonable to expect that extending
the search for optimal 4-bit reversible circuits will make it possible to find
optimal 5-bit reversible circuits with up to 12 gates.
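The gate counts and the memory estimate above follow from simple counting: a Toffoli-type gate is a choice of target wire plus any subset of the remaining wires as controls. The sketch below evaluates the slide's expression as written. The function name `gate_count` is illustrative.

```python
import math

def gate_count(n_bits):
    """Toffoli-type gates: pick a target, then any subset of the other wires as controls."""
    return n_bits * 2 ** (n_bits - 1)

gates5 = gate_count(5)
strings_of_6 = gates5 ** 6                     # gate strings of length 6
bits = strings_of_6 * math.log2(gates5) / math.factorial(5) / 2

print(gate_count(4), gates5)
print(f"{bits / 1e9:.1f} billion bits, {bits / 8 / 2**30:.2f} GiB")
```

Evaluated this way the expression comes to roughly 6.9 billion bits, the same order as the figure on the slide and under 1 GiB. The division by 5! and by 2 reflects the relabeling and inversion symmetries.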
page 14/15
Future directions
Optimal circuits using other cost metrics
This search can be easily extended to account for other cost
metrics:
Weighted gate count optimal circuits: organize the breadth-first
search such that a gate with cost G is appended to a
circuit of cost C at iteration number G+C.
Depth optimal circuits: choose a different set of
elementary transformations, e.g., the circuit NOT(a)CNOT(b,c) is
now an elementary transformation.
Depth optimal weighted gate circuits: combine the previous
two modifications.
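The weighted variant can be sketched as a breadth-first search over cost buckets, where iteration number C handles all circuits of total cost C and schedules their extensions for iteration C+G. The gate costs below (1 for NOT and CNOT, 5 for Toffoli, 13 for Toffoli-4) are illustrative assumptions only, and the function names are hypothetical.

```python
from itertools import combinations

# Assumption for illustration: gate cost indexed by the number of controls.
COST_BY_CONTROLS = {0: 1, 1: 1, 2: 5, 3: 13}

def make_weighted_gates(n_bits=4):
    gates = []
    for t in range(n_bits):
        others = [w for w in range(n_bits) if w != t]
        for k in range(n_bits):
            for controls in combinations(others, k):
                mask = sum(1 << c for c in controls)
                perm = tuple(x ^ (1 << t) if (x & mask) == mask else x
                             for x in range(1 << n_bits))
                gates.append((COST_BY_CONTROLS[k], perm))
    return gates

def weighted_search(gates, max_cost):
    """Return {function: minimal weighted cost} for all functions of cost <= max_cost."""
    identity = tuple(range(len(gates[0][1])))
    buckets = [[] for _ in range(max_cost + 1)]
    buckets[0].append(identity)
    cost = {}
    for c in range(max_cost + 1):                 # iteration number equals total cost
        for f in buckets[c]:
            if f in cost:                         # already reached at a lower or equal cost
                continue
            cost[f] = c
            for g_cost, g in gates:
                if c + g_cost <= max_cost:
                    buckets[c + g_cost].append(tuple(g[y] for y in f))
    return cost

cost = weighted_search(make_weighted_gates(), max_cost=5)
toffoli = tuple(x ^ (((x & 1) & ((x >> 1) & 1)) << 2) for x in range(16))
print(cost[toffoli])
print(sum(1 for v in cost.values() if v == 1))
```

Because buckets are processed in increasing order of cost, the first time a function is taken out of a bucket its cost is minimal, just as depth is minimal in the unweighted search.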
page 15/15
END
Questions?
2 ! 5.418528796 10
10
2639
$(2^{10})!$ =
541852879605885728307692194468385473800155396353801344448287027068321061207337660373314098413621458671907918845
7089807539319941657701873682604541333337219391083675280127649937697682925169378911657556806596637479473145184048866
7767255612518869433525121367727452196343077013371320579624843312887008843617165469023751839045294473227780840293215
8722061853806162806063925435310822186848239287130261690914211362251144684713888587881629252104046295315949943900357
8824102439343150374441138908061814062108639532752353758850185984515822295996545585412427891309024869442986109231533
0757913167574514643630402489082044290773456182736903050225279692655307296737099075874779312763510470246988966796146
2133026237158973227857814631807156427767644064591085076564783456324457736853810336981776080498707767046394272605341
4167791256977333745680374751866762659616656158846814502633370425226641418621570468256847733609443267374936766749150
9895376811294583162664385647902781638573029154266772566564227682605826439388451491197641967550929020859271315636298
3290989441052732125187249527501314071676405516936190781821236701912295767363117054126589929916482008515781751955466
9109028387292322245099063886381477712552277826313223857569488193936588899089936708745168606530984110202998538162815
6433498184710577783953474253149962210348880758451370576983976399310392966504604612116665134513114951365740086905633
4867859885025601787284982567787314407216524272262997319791568603629406624740101482697559533155736658800562921274680
6572852015704019406922855578006114290557553245497940089398491468126398607500852632988202247195855053447737115906566
8282104141726504065860068384494510435499881288680131655155171467338832334085176381971359131237254867373478353731634
1517369387565212899726597964903241208727348690699802996369265070088758384854547542272771024255049902319275830918157
4482051964210728372049372935161753419577754224531524422803913724077178916612030610402558300550338867900521160254087
4045462093838436763788665876991279092232371737134317606748335251362912336288589362713229418356588401041872786935443
9077085278288558308427090461075019007184933139915558212752392329879780649639075333845719173822840501869570463626600
2352655875023355954893116375093802191198604713357716524039994032963602455772579636732866543489573257409997105671316
2327234576676193765140810399919363390828642051009857745452406810689739249313828736222625792000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Classically:
$(2^{10})!$ / (Lifespan_of_universe_in_Planck_time_units *
Estimated_number_of_atoms) ~ $10^{2452}$ universes!
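The size of this number is easy to confirm with exact integer arithmetic. A short check of the digit count and the leading digits:

```python
import math

value = math.factorial(2 ** 10)   # number of 10-bit reversible functions
digits = str(value)
print(len(digits))                # number of decimal digits
print(digits[:10])                # leading digits
```

A number with 2640 decimal digits corresponds to the exponent 2639 in the scientific notation above.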
Work in progress
Synthesize all optimal 4-bit circuits
-- Store circuits with up to 9 gates as we do now.
-- Store a bit vector (~250 GB) for canonical representatives
of circuits with 10, 11, 12, 13, and 14 gates, one at a time.
-- Use a minimal number of transfers of parts of
each such vector into and out of RAM.