New Ways of Generating
Large Realistic Benchmarks
for Testing Synthesis Tools
Petr Fišer, Jan Schmidt
Faculty of Information Technology
Czech Technical University in Prague
fiserp@fit.cvut.cz, schmidt@fit.cvut.cz
Outline
Motivation
New benchmark generation methods
Experimental results
Conclusions
IWSBP 2010, Freiberg
Motivation
… why another artificial benchmark generator?
To test logic synthesis tools
Capabilities of synthesis processes
Immunity to “bad” structures
Ability to discover “good” structures
Iterative power
Scalability
…
Motivation
J. Cong, K. Minkovich: Optimality study of logic synthesis
for LUT-based FPGAs, IEEE Trans. on CAD, vol. 26, 2007
They created artificially large circuits that are functionally equivalent to their small originals (70 LUTs)
Synthesis produced 10k to 30k LUTs
Motivation
P. Fišer, J. Schmidt: Small but Nasty Logic Synthesis
Examples, IWSBP'08
XOR tree is appended to the circuit outputs and
the circuit is collapsed
Synthesis produced >400 LUTs instead of 11
Motivation
Will my synthesis tool produce the same result for
different descriptions (versions) of one particular
circuit? (a.k.a. iterative power)
Most probably not!
(if things go bad)
What went wrong?
What descriptions are bad for me?
What structures caused my failure?
What should I do to perform better?
Proposed Benchmarks
Starting with seed circuit (could be small)
Functionally equivalent “big” circuit is created
The size of the benchmark circuit is adjustable
Ideal case:
Seed circuit → Transformation 1 → Bench circuit 1 → Synthesis → Result
Seed circuit → Transformation 2 → Bench circuit 2 → Synthesis → Result
Seed circuit → Transformation 3 → Bench circuit 3 → Synthesis → Result
(all three paths lead to the same Result)
Proposed Benchmarks
Real case:
Seed circuit → Transformation 1 → Bench circuit 1 → Synthesis → Result 1
Seed circuit → Transformation 2 → Bench circuit 2 → Synthesis → Result 2
Seed circuit → Transformation 3 → Bench circuit 3 → Synthesis → Result 3
Cong’s LEKU Benchmarks
J. Cong, K. Minkovich: Optimality study of logic synthesis
for LUT-based FPGAs, IEEE Trans. on CAD, vol. 26, 2007
LEKU = Logic Examples with Known Upper Bound
Based on elimination of the original circuit structure
… and bad decomposition
G5 (7 LUTs) → Replicate → G25 (70 LUTs) → Collapse → SOP (19K terms)
SOP → ABC balance → LEKU-CB (814 gates)
SOP → SIS tech_decomp → LEKU-CD (>1M gates)
1. Realistic LEKU Benchmarks
Any circuit may be used as a seed (instead of g25)
Possible chance of success
Global BDDs may be used instead of collapsing
Upper bound = size of the original circuit
Original circuit → Collapse → Possibly large SOP → SIS tech_decomp → Possibly large circuit
Original circuit → Global BDD → Possibly large SOP → SIS tech_decomp → Possibly large circuit
1. Realistic LEKU Benchmarks
Size increase by collapsing
[Plot: size increase factor (0x to 7x) versus circuit size in gates (0 to 12,000) for 250 ISCAS and IWLS benchmarks]
Size increase in 61% of circuits
1. Realistic LEKU Benchmarks
Experimental results
Benchmark circuit                         Synthesis (# of 4-LUTs)
Bench  Inp.  Out.  Process       Gates    ABC      #1       #2
c432   36    7     original      145      84       77       118
c432   36    7     global BDD    2,017    1,031    1,023    1,333
c432   36    7     ABC collapse  2,658    1,246    1,548    1,648
c432   36    7     SIS collapse  7,075    3,361    3,872    4,738
c880   60    26    original      208      113      110      122
c880   60    26    global BDD    407,098  93,190   174,983  N/A
c880   60    26    ABC collapse  13,727   7,437    8,109    9,460
c880   60    26    SIS collapse  30,015   19,787   20,487   28,017
2. Parity Benchmark Circuits
XOR tree is appended to the circuit outputs, then the
structure is destroyed (collapsing, BDD)
No guarantee of circuit size increase
Upper bound = size of the core circuit + XOR tree
Inputs x1 … xn → core circuit → outputs y1 … ym → XOR
Collapse → Possibly large SOP → SIS tech_decomp → Possibly large circuit
Global BDD → Possibly large MUX tree → SIS tech_decomp → Possibly large circuit
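The first step of this construction, appending the XOR tree, can be sketched as follows. This is only an illustration: the netlist representation (a dict mapping a signal name to an `(op, inputs)` pair), the helper names `append_xor_tree` and `evaluate`, and the `par` prefix are assumptions, not the authors' tool. Destroying the structure (collapsing or BDD construction) is left to the synthesis tools.

```python
from itertools import count, product

def append_xor_tree(netlist, outputs, prefix="par"):
    """Return (new_netlist, root): a tree of 2-input XORs over `outputs`."""
    new = dict(netlist)
    ids = count()
    level = list(outputs)
    while len(level) > 1:
        nxt = []
        # Pair up neighbouring signals; an odd one is carried to the next level.
        for i in range(0, len(level) - 1, 2):
            name = f"{prefix}{next(ids)}"
            new[name] = ("XOR", (level[i], level[i + 1]))
            nxt.append(name)
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return new, level[0]

def evaluate(netlist, signal, inputs):
    """Evaluate `signal` for a dict of primary input values (bools)."""
    if signal in inputs:
        return inputs[signal]
    op, args = netlist[signal]
    vals = [evaluate(netlist, a, inputs) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "XOR":
        return vals[0] != vals[1]
    if op == "NOT":
        return not vals[0]
    raise ValueError(f"unknown gate type {op}")

# Hypothetical three-output core circuit used only for this example.
core = {
    "y1": ("AND", ("x1", "x2")),
    "y2": ("OR", ("x1", "x2")),
    "y3": ("NOT", ("x1",)),
}
bench, root = append_xor_tree(core, ["y1", "y2", "y3"])
```

For m outputs the tree adds m - 1 XOR gates, which is where the "+ XOR tree" term of the upper bound comes from.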
2. Parity Benchmark Circuits
Size increase by appending parity & collapsing
[Plot: size increase (0x to 30x) versus circuit size in gates (0 to 5,000) for 100 ISCAS and IWLS benchmarks]
Size increase in 25% of circuits
2. Parity Benchmark Circuits
Experimental results
Benchmark circuit                         Synthesis (# of 4-LUTs)
Bench  Inp.  Out.  Process       Gates    ABC      #1       #2
s1238  32    1     original      493      229      241      263
s1238  32    1     global BDD    6,282    3,849    4,055    3,839
s1238  32    1     ABC collapse  31,839   19,741   21,875   25,793
s1238  32    1     SIS collapse  39,636   26,313   28,254   N/A
b4     33    1     original      267      110      108      116
b4     33    1     BDD           16,963   6,347    6,099    4,285
b4     33    1     ABC collapse  1,405    730      841      884
b4     33    1     SIS collapse  4,087    2,036    2,422    1,627
3. Tautology Benchmarks
Large random SOP is generated
When the number of terms exceeds some threshold,
the SOP is a tautology
Then, the big SOP is mapped into 2-input gates (SIS tech_decomp)
→ Big network
Upper bound = 0
The benchmark size may be adjusted by
1. Number of input variables
2. Dimension of SOP terms
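A minimal sketch of the idea, under stated assumptions: terms are drawn at random until the SOP covers every input assignment, and "dimension" is read here as the number of literals per term, which may differ from the authors' definition. The tautology check is brute force over all assignments, so it is only usable for a handful of variables. All function names are hypothetical.

```python
import random
from itertools import product

def random_term(n_vars, n_literals, rng):
    """A product term as {variable index: required value}."""
    chosen = rng.sample(range(n_vars), n_literals)
    return {v: rng.choice((0, 1)) for v in chosen}

def covers(term, assignment):
    """True if the term evaluates to 1 under the assignment."""
    return all(assignment[v] == val for v, val in term.items())

def is_tautology(terms, n_vars):
    """Brute-force check: every assignment is covered by some term."""
    return all(
        any(covers(t, a) for t in terms)
        for a in product((0, 1), repeat=n_vars)
    )

def random_tautology(n_vars, n_literals, seed=0):
    """Add random terms until the SOP becomes a tautology."""
    rng = random.Random(seed)
    terms = []
    while not is_tautology(terms, n_vars):
        terms.append(random_term(n_vars, n_literals, rng))
    return terms

terms = random_tautology(n_vars=4, n_literals=2, seed=1)
```

The resulting term list would then be written out in a format the decomposition tool accepts; that step is not shown.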
4. Partial Collapsing
Only parts of the network are collapsed
1. Choose one pivot gate
2. Extract its transitive fan-in and fan-out to a given radius
3. Collapse the extracted network part
4. Decompose into 2-input gates
5. Put it back
6. Iterate several times
Upper bound = size of the original circuit
The benchmark size may be adjusted by
1. Size of the extracted circuit
2. Number of iterations
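Steps 1 and 2 of the procedure can be sketched as a bounded breadth-first search in both directions from the pivot. The representation (a dict mapping each gate to the list of signals driving it) and the function name `neighbourhood` are assumptions for illustration; collapsing, decomposition and re-insertion (steps 3 to 5) are left to the synthesis tools. The example network uses the topology of the ISCAS'85 circuit c17.

```python
from collections import deque

def neighbourhood(fanin, pivot, radius):
    """Gates within `radius` levels of `pivot` in its fan-in or fan-out cone."""
    # Build the reverse relation: signal -> gates it drives.
    fanout = {}
    for gate, ins in fanin.items():
        for src in ins:
            fanout.setdefault(src, []).append(gate)

    def reach(edges):
        # Breadth-first search that stops expanding at distance `radius`.
        seen = {pivot: 0}
        queue = deque([pivot])
        while queue:
            g = queue.popleft()
            if seen[g] == radius:
                continue
            for nxt in edges.get(g, ()):
                if nxt not in seen:
                    seen[nxt] = seen[g] + 1
                    queue.append(nxt)
        return set(seen)

    # Primary inputs are not gates, so they are filtered out.
    return {g for g in reach(fanin) | reach(fanout) if g in fanin}

# Six NAND gates; G1, G2, G3, G6, G7 are primary inputs.
c17 = {
    "G10": ["G1", "G3"],
    "G11": ["G3", "G6"],
    "G16": ["G2", "G11"],
    "G19": ["G11", "G7"],
    "G22": ["G10", "G16"],
    "G23": ["G16", "G19"],
}
part = neighbourhood(c17, "G16", 1)
```

A larger radius yields a larger extracted part, which corresponds to the first size parameter listed above.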
4. Partial Collapsing
Example – c432
[Plot: gates (0 to 12,000) versus part size (0 to 140)]
4. Partial Collapsing
Example – big tautology
[Plot: gates (0 to 20,000) versus part size (0 to 10,000)]
4. Partial Collapsing
Example – big tautology
[Two plots: gates (0 to 20,000) versus part size (0 to 10,000); gates (8,000 to 11,000) versus part size (0 to 8,000)]
4. Partial Collapsing
Experimental results
Benchmark circuit                                  Synthesis (4-LUTs)
Bench  Inp.  Out.  Process                Gates    ABC      #1       #2
c432   36    7     original               145      84       77       118
c432   36    7     Part. coll., size 98   1,247    626      782      916
c432   36    7     Part. coll., size 109  3,077    1,445    1,699    2,422
c432   36    7     Part. coll., size 138  5,026    2,598    2,761    3,727
c432   36    7     Part. coll., size 140  11,531   6,647    6,844    9,255
c880   60    26    original               208      113      110      122
c880   60    26    Part. coll., size 129  1,008    485      601      597
c880   60    26    Part. coll., size 171  5,034    2,950    2,394    3,769
c880   60    26    Part. coll., size 201  10,423   6,224    5,010    7,887
5. Replicating Shared Logic
Duplicate a part of the logic that is shared
1. Find a branching signal
2. Duplicate its transitive fan-in, to a given depth
[Figure: example circuit (signals G1, G2, G3, G6, G7, G10, G11, G16, G19, G22, G23) before and after replication, which adds the copies G11’ and G16’]
Upper bound = size of the original circuit
The benchmark size may be adjusted by
1. Number of duplicated branches
2. Depth of duplication
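The two steps can be sketched on a NAND-only netlist as follows. The representation (gate name mapped to the list of driving signals), the function names and the `_dup` suffix are assumptions for illustration, not the authors' implementation; the sketch also assumes the suffixed names do not already exist in the netlist. The example network uses the topology of the ISCAS'85 circuit c17.

```python
from itertools import product

def duplicate_branch(fanin, signal, consumer, depth, suffix="_dup"):
    """Give `consumer` a private copy of `signal`'s fan-in cone, `depth` levels deep."""
    new = {g: list(ins) for g, ins in fanin.items()}

    def copy(gate, levels):
        # Primary inputs and gates below the requested depth stay shared.
        if gate not in fanin or levels == 0:
            return gate
        clone = gate + suffix
        if clone not in new:
            new[clone] = [copy(src, levels - 1) for src in fanin[gate]]
        return clone

    clone = copy(signal, depth)
    # Only this one branch is redirected to the copy.
    new[consumer] = [clone if s == signal else s for s in new[consumer]]
    return new

def nand_eval(fanin, signal, inputs):
    """Evaluate a NAND-only netlist for a dict of primary input values."""
    if signal in inputs:
        return inputs[signal]
    return not all(nand_eval(fanin, s, inputs) for s in fanin[signal])

# Six NAND gates; G1, G2, G3, G6, G7 are primary inputs, G11 and G16 branch.
c17 = {
    "G10": ["G1", "G3"],
    "G11": ["G3", "G6"],
    "G16": ["G2", "G11"],
    "G19": ["G11", "G7"],
    "G22": ["G10", "G16"],
    "G23": ["G16", "G19"],
}
dup1 = duplicate_branch(c17, "G11", "G19", depth=1)
dup2 = duplicate_branch(c17, "G16", "G23", depth=2)
```

Each call grows the netlist without changing its function; repeating it for many branching signals corresponds to the first size parameter listed above.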
5. Replicating Shared Logic
Experimental results
Benchmark circuit                              Synthesis (4-LUTs)
Bench  Inp.  Out.  Process            Gates    ABC      #1       #2
c432   36    7     original           145      84       77       118
c432   36    7     10k dup., depth 1  1,428    84       244      333
c432   36    7     10k dup., depth 2  4,905    84       447      586
c432   36    7     10k dup., depth 3  8,389    84       396      637
c432   36    7     10k dup., depth 4  11,349   84       452      739
c432   36    7     10k dup., depth 5  16,040   84       472      771
6. Adding Inverters
(special bonus – not included in the proceedings)
Add pairs of inverters to random locations
The network size may be arbitrarily expanded
And all the synthesis tools are completely immune to this!
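For completeness, the inverter insertion itself might look like the sketch below. The netlist representation (signal name mapped to an `(op, inputs)` pair), the function names and the `inv…a`/`inv…b` naming are assumptions for illustration; the names are assumed not to clash with existing signals.

```python
import random
from itertools import product

def add_inverter_pairs(netlist, pairs, seed=0):
    """Insert `pairs` double inversions on randomly chosen gate inputs."""
    rng = random.Random(seed)
    new = {g: (op, list(ins)) for g, (op, ins) in netlist.items()}
    for k in range(pairs):
        gate = rng.choice(sorted(new))
        _, ins = new[gate]
        pos = rng.randrange(len(ins))
        a, b = f"inv{k}a", f"inv{k}b"
        new[a] = ("NOT", [ins[pos]])   # first inverter on the chosen wire
        new[b] = ("NOT", [a])          # second inverter restores the value
        ins[pos] = b                   # the gate now reads the restored value
    return new

def evaluate(netlist, signal, inputs):
    """Evaluate `signal` for a dict of primary input values (bools)."""
    if signal in inputs:
        return inputs[signal]
    op, args = netlist[signal]
    vals = [evaluate(netlist, a, inputs) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "NOT":
        return not vals[0]
    raise ValueError(f"unknown gate type {op}")

# Hypothetical three-gate circuit used only for this example.
small = {
    "y1": ("AND", ["x1", "x2"]),
    "y2": ("OR", ["x1", "x2"]),
    "y3": ("NOT", ["x1"]),
}
padded = add_inverter_pairs(small, pairs=20, seed=3)
```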
Summary Experiments
[Plot for c432: LUTs after synthesis (0 to 10,000) versus source circuit gates (0 to 18k); series: #1, #2, ABC]
Summary Experiments
[Plot for s1238_p: LUTs after synthesis (0 to 60k) versus source circuit gates (0 to 140k); series: #1, #2, ABC]
Conclusions
Several new benchmark generation methods proposed
Artificially “big” circuits are generated from seed circuits
Benchmarks are functionally equivalent to the seed circuits
→ the complexity upper bound is known
Tested on ABC and 2 commercial tools
Unfortunate result: the bigger the circuit going to synthesis, the bigger the result