part-1

1
BSPlace: A BLE Swapping Technique for Placement
04.09.2014
Minsik Hong
George Hwang
Hemayamini Kurra
Minjun Seo
2
BSPlace: A BLE Swapping technique for placement
• BLE Level Swapping within Simulated Annealing
• Chen, Gang, and Jason Cong. "Simultaneous timing driven
clustering and placement for FPGAs." Field Programmable Logic
and Application. Springer Berlin Heidelberg, 2004. 158-167.
• Use Rent’s rule to determine swapping method
• Singh, Amit, Ganapathy Parthasarathy, and Malgorzata Marek-Sadowska. "Efficient circuit clustering for area and power reduction
in FPGAs." ACM Transactions on Design Automation of Electronic
Systems (TODAES) 7.4 (2002): 643-663.
3
Outline
• iRAC
• Clustering Comparison
• Rent’s Rule
• Key terms
• Clustering Step
• Results
• SCPlace
• Introduction
4
Efficient circuit clustering for area and
power reduction in FPGAs.
Singh, Amit, Ganapathy Parthasarathy, and Malgorzata Marek-Sadowska.
ACM Transactions on Design Automation of Electronic Systems (TODAES)
7.4 (2002): 643-663.
5
Clustering Comparison
• TVPACK
• What is different in RPACK?
• Gain functions for considering routing constraints in cost function
while clustering
• RPACK + ----- → iRAC
• Rent’s rule to depopulate the clusters!! → Best CW
6
7
Rent’s Rule
• $N_{io} = k B^{p}$, or equivalently $\log(N_{io}) = \log(k) + p \log(B)$
• where $N_{io}$ is the number of inputs and outputs of a CLB
• $k$ is the average number of connections per BLE
• Calculate $k$ in the technology mapping phase
• $B$ is the number of BLEs in a CLB
• $p$ is the Rent’s parameter
• Since an FPGA has uniform interconnect resources, $p$ at the local level is assumed to be uniform
• $p$ characterizes the complexity of a cluster
• Smaller values of $p$ mean that the cluster’s external routing requirement is low
• So, a good clustering solution will ensure that the Rent’s parameter of the generated cluster is small.
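To make the effect of $p$ concrete, here is a minimal Python sketch that evaluates Rent’s rule for a fixed cluster. The values k = 3 and B = 4 are illustrative assumptions, and `rent_io` is a hypothetical helper name, not something defined in the cited paper.

```python
def rent_io(k: float, b: int, p: float) -> float:
    """Number of cluster I/Os predicted by Rent's rule: N_io = k * B**p."""
    return k * b ** p


if __name__ == "__main__":
    k, b = 3.0, 4  # assumed values: 3 connections per BLE, 4 BLEs per CLB
    for p in (0.25, 0.5, 0.75, 1.0):
        # A smaller p gives fewer external pins for the same k and B.
        print(f"p = {p:.2f} -> N_io = {rent_io(k, b, p):.2f}")
```

For a fixed k and B, the predicted I/O count grows with p, which is why a small Rent’s parameter indicates a low external routing requirement.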
8
Net Length: Local Rent’s Parameter $p_{ld}$
• Complexity varies across a design.
• Solution: use a local interconnect complexity measure based on interconnect length distributions (Van Marck et al., 1995).
• Reduces to Rent’s exponent for a uniform design at the top level
9
Net Length : Rent’s Parameter
• Van Marck, Stroobandt, Campenhout, 1995
• $p = \Delta(\log N_i) / \Delta(\log L_i)$
• $p$: Rent’s parameter
• $L_i$: length of a net
• $N_i$: number of nets of length $L_i$
• First-order approximation for a varying Rent’s parameter
• Connects net length with Rent’s parameter!
• Wirelength, channel width, and routability estimation based on Rent’s parameter
10
Applications of Rent’s Rule
• Layout parameter estimation in Electronic Design Automation
• Studies of new computer architectures
• Generation of synthetic circuit benchmarks
11
Applications of Rent’s Rule
• The increasing problem sizes in electronic design and the sub-micron design challenges have placed the need for a priori estimates of chip layout parameters in the forefront.
• The generality and predictive power of Rent’s rule are perfect for such estimates.
• Another application of Rent’s rule tries to assess the merits of new chip or computer architectures before they have to be built, using wire length estimates based on Rent’s rule and a generic model for the architecture. This research has gained attention especially due to the possibilities of using optical interconnections to build three-dimensional chips.
12
Key terms
• Degree of a BLE
• The number of nets incident to that BLE
• Separation of a BLE
• The sum of all terminals of nets incident to the BLE
• Connectivity factor (c)
$c = \dfrac{\text{separation}}{\text{degree}^2}$
• Weight, w(e)
$w(e) = \dfrac{2}{r}$, where $r$ is the number of terminals on the net
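The key terms can be sketched in a few lines of Python. The netlist representation (a dictionary from net name to the BLEs on that net) and all BLE and net names are assumptions made for illustration. The example uses four nets whose terminal counts add up to 18; the split 6 + 5 + 4 + 3 is hypothetical.

```python
from typing import Dict, List

# Assumed representation: net name -> names of the BLEs (terminals) on that net.
Netlist = Dict[str, List[str]]


def degree(ble: str, nets: Netlist) -> int:
    """Number of nets incident to the BLE."""
    return sum(1 for terminals in nets.values() if ble in terminals)


def separation(ble: str, nets: Netlist) -> int:
    """Sum of the terminal counts of all nets incident to the BLE."""
    return sum(len(terminals) for terminals in nets.values() if ble in terminals)


def connectivity_factor(ble: str, nets: Netlist) -> float:
    """c = separation / degree**2."""
    d = degree(ble, nets)
    if d == 0:
        raise ValueError(f"BLE {ble!r} is not connected to any net")
    return separation(ble, nets) / d ** 2


def net_weight(terminals: List[str]) -> float:
    """w(e) = 2 / r, where r is the number of terminals on the net."""
    return 2.0 / len(terminals)


# Hypothetical netlist: BLE "A" sits on four nets with 6 + 5 + 4 + 3 = 18 terminals.
example_nets: Netlist = {
    "n1": ["A", "b1", "b2", "b3", "b4", "b5"],
    "n2": ["A", "b6", "b7", "b8", "b9"],
    "n3": ["A", "b10", "b11", "b12"],
    "n4": ["A", "b13", "b14"],
}

if __name__ == "__main__":
    print(degree("A", example_nets), separation("A", example_nets))
    print(connectivity_factor("A", example_nets))
```

A two-terminal net gets weight 1, and the weight falls as the fanout grows, so low-fanout nets count for more.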
13
Clustering step (1)
• First, calculate the connectivity factor of all unclustered BLEs.
[Figure: clustering example with the labels Terminal, Cluster, NET, and BLE]
14
Clustering step (1)
• First, calculate the c factor of all unclustered BLEs.
[Figure: nets incident to BLE A, numbered 1 to 4]
Degree: the number of nets incident to BLE A
15
Clustering step (1)
• First, calculate the c factor of all unclustered BLEs.
[Figure: terminals of the nets incident to BLE A, numbered 1 to 18]
Degree: the number of nets incident to BLE A
Separation: the sum of all terminals of nets incident to BLE A
$c = \dfrac{\text{separation}}{\text{degree}^2} = \dfrac{18}{4^2} = \dfrac{18}{16} = 1.125$
16
Clustering step (2)
• Second, choose as the seed the BLE which has the highest degree and the lowest c
[Figure: two candidate BLEs, one with degree = 4 and c = 1.125, the other with degree = 4 and c = 0.5]
Cluster size = 5
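Seed selection can be sketched as a two-key comparison. The slide does not spell out how the two criteria are combined, so this sketch assumes degree is compared first and c only breaks ties; `choose_seed` and the BLE names "P" and "Q" are hypothetical.

```python
from typing import Dict, Tuple


def choose_seed(candidates: Dict[str, Tuple[int, float]]) -> str:
    """Pick the seed BLE from a map of name -> (degree, c).

    Assumption: the highest degree wins, and the lowest c breaks ties.
    """
    if not candidates:
        raise ValueError("no unclustered BLEs left")
    # Negate the degree so that min() prefers high degree, then low c.
    return min(candidates, key=lambda name: (-candidates[name][0], candidates[name][1]))


if __name__ == "__main__":
    # The two candidates from the example: same degree, different c.
    unclustered = {"P": (4, 1.125), "Q": (4, 0.5)}
    print(choose_seed(unclustered))
```

With equal degrees, the BLE with c = 0.5 is preferred over the one with c = 1.125.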
17
Clustering step (3)
• Third, assign a gain value to each unclustered BLE and choose the BLE which has the highest gain
• $G(X, C, x) = 2\,n\,w(x)\,(1 + \alpha_x)$
• $G(X, C, x)$: the attraction of BLE X to BLE C
• $x$: the net between BLE X and BLE C
• $n$: the cluster size (number of BLEs in a CLB)
• $w(x)$: the weight of net x
• $\alpha_x$: the number of pins of net x already inside the cluster
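The gain formula can be checked numerically with a short Python sketch. It assumes that a BLE sharing several nets with the cluster gets the sum of the per-net gains; the function names are hypothetical. Each shared net is described by a pair (r, alpha): its terminal count and the number of its pins already inside the cluster.

```python
from typing import Iterable, Tuple


def net_gain(n: int, r: int, alpha: int) -> float:
    """Per-net gain G = 2 * n * w * (1 + alpha), with net weight w = 2 / r."""
    return 2 * n * (2.0 / r) * (1 + alpha)


def total_gain(n: int, shared_nets: Iterable[Tuple[int, int]]) -> float:
    """Sum of per-net gains over all nets shared with the cluster (assumed)."""
    return sum(net_gain(n, r, alpha) for r, alpha in shared_nets)


if __name__ == "__main__":
    n = 4  # cluster size
    # One shared two-terminal net with one pin already inside.
    print(total_gain(n, [(2, 1)]))
    # Three shared nets with 6, 3, and 3 terminals, one pin inside each.
    print(round(total_gain(n, [(6, 1), (3, 1), (3, 1)]), 1))
```

The first call gives 16 and the second roughly 26.7, so a BLE with several moderately shared nets can outscore one with a single tightly shared net.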
18
Clustering step (3)
Cluster size = 4
$G(X, C, x) = 2\,n\,w(x)\,(1 + \alpha_x)$
$w(e) = \dfrac{2}{r}$, where $r$ is the number of terminals on the net
$n$: the cluster size (number of BLEs in a CLB)
$\alpha_x$: the number of pins of net x already inside the cluster
$G(X) = 2 \cdot 4 \cdot \dfrac{2}{2} \cdot (1 + 1) = 16$
19
Clustering step (3)
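Cluster size = 4
$G(Y) = 2 \cdot 4 \cdot \dfrac{2}{6} \cdot (1 + 1) + 2 \cdot 4 \cdot \dfrac{2}{3} \cdot (1 + 1) + 2 \cdot 4 \cdot \dfrac{2}{3} \cdot (1 + 1) \approx 26.7$
[Figure: BLE Y with its nets y, z, and w]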
20
Clustering step (3)
Choose the BLE which has the highest gain
Cluster size = 4
$G(X) = 2 \cdot 4 \cdot \dfrac{2}{2} \cdot (1 + 1) = 16$
$G(X) = k \cdot 16 = 160$
$G(Y) = 2 \cdot 4 \cdot \dfrac{2}{6} \cdot (1 + 1) + 2 \cdot 4 \cdot \dfrac{2}{3} \cdot (1 + 1) + 2 \cdot 4 \cdot \dfrac{2}{3} \cdot (1 + 1) \approx 26.7$
If adding X to C fully absorbs net x, then G(X,C,x) is multiplied by a large constant value k (e.g., k = 10).
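The absorption bonus changes which BLE wins, as the following sketch shows. It assumes each candidate BLE contributes exactly one pin to a shared net, so a net with r terminals is fully absorbed when alpha + 1 equals r. The name `k_absorb` is used for the constant to keep it apart from the k of Rent’s rule; all function names are hypothetical.

```python
from typing import Dict, List, Tuple


def gain_with_absorption(n: int, r: int, alpha: int, k_absorb: float = 10.0) -> float:
    """Per-net gain, multiplied by k_absorb when the net becomes fully absorbed.

    Assumption: the candidate adds one pin, so absorption means alpha + 1 == r.
    """
    base = 2 * n * (2.0 / r) * (1 + alpha)
    return base * k_absorb if alpha + 1 == r else base


def pick_best(n: int, candidates: Dict[str, List[Tuple[int, int]]]) -> str:
    """Return the candidate whose summed gain over its shared nets is highest."""
    totals = {
        name: sum(gain_with_absorption(n, r, alpha) for r, alpha in nets)
        for name, nets in candidates.items()
    }
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # (r, alpha) pairs per candidate, taken from the worked example.
    candidates = {"X": [(2, 1)], "Y": [(6, 1), (3, 1), (3, 1)]}
    print(pick_best(4, candidates))
```

Without the bonus, Y (about 26.7) would beat X (16); with k = 10, X scores 160 and is chosen because it removes a net from the external routing entirely.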
21
Clustering step (4)
• Fourth, check spatial uniformity using Rent’s rule
$T_{io} = k B^{P}$, where $k = 3$, $B = 4$, $P = 0.5$
Threshold $T_{io} = 6$
Number of used I/Os of the cluster $N_{io} < T_{io}$, i.e. $p < P$
If $N_{io} > T_{io}$, then choose that BLE as another seed.
$p = \dfrac{\ln(N_{io}) - \ln(k)}{\ln(B)}$
$p = \dfrac{\ln(7) - \ln(3)}{\ln(4)} = 0.6112 > 0.5$
$p = \dfrac{\ln(5) - \ln(3)}{\ln(4)} = 0.3685 < 0.5$
Smaller values of p mean that the cluster’s external routing requirement is low
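The threshold check can be reproduced with a small Python sketch using the numbers from this slide (k = 3, B = 4, P = 0.5). The function names are hypothetical, and the sketch only covers the comparison itself, not how the clustering loop reacts to it.

```python
import math


def io_threshold(k: float, b: int, p_target: float) -> float:
    """I/O threshold T_io = k * B**P for a target Rent's parameter P."""
    return k * b ** p_target


def cluster_rent_parameter(n_io: int, k: float, b: int) -> float:
    """Solve N_io = k * B**p for p. Requires B > 1 and N_io > 0."""
    return (math.log(n_io) - math.log(k)) / math.log(b)


def exceeds_threshold(n_io: int, k: float, b: int, p_target: float) -> bool:
    """True when the cluster uses more I/Os than T_io allows."""
    return n_io > io_threshold(k, b, p_target)


if __name__ == "__main__":
    k, b, p_target = 3.0, 4, 0.5
    for n_io in (7, 5):
        p = cluster_rent_parameter(n_io, k, b)
        print(n_io, round(p, 4), exceeds_threshold(n_io, k, b, p_target))
```

Comparing N_io with T_io and comparing p with P give the same answer, so the integer comparison is enough in practice.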
22
Results(1)
• Random Seed of RPack
• iRAC is more effective in clustering circuits which have a higher percentage of low-fanout nets.
• Why?
23
Results(2)
iRAC is able to lower the number of external nets and the Rent’s parameter of the circuits after clustering!
24
Simultaneous timing driven clustering and
placement for FPGAs.
Chen, Gang, and Jason Cong. Field Programmable Logic and Application.
Springer Berlin Heidelberg, 2004. 158-167.
25
Why simultaneous placement and
clustering?
26
Why simultaneous placement and
clustering?
• In the clustering stage there is more freedom to change the circuit structure, but fast and accurate estimates of wirelength, timing, and routability are not available.
• In the placement stage the circuit structure is fixed, so simultaneous optimization of wirelength, timing, and routability is possible.
• Sub-optimal place and route result!
27
Key concept
• Fragment level move
• BLE to a new CLB
• Check for valid CLB configuration
• Feasibility (number of BLEs and input pins)
• Update the cost function
• Block level move
• CLB to CLB
• Logic duplication
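The feasibility check for a fragment-level move can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the cited paper: the `Ble` data structure, the capacity limits, and the rule that a net generated inside the CLB does not consume an input pin are all assumed.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Ble:
    name: str
    inputs: FrozenSet[str]  # nets feeding this BLE
    output: str             # net driven by this BLE


def clb_input_nets(bles: List[Ble]) -> Set[str]:
    """Nets that need a CLB input pin: used inside but not generated inside."""
    produced = {b.output for b in bles}
    used: Set[str] = set().union(*(b.inputs for b in bles))
    return used - produced


def move_is_feasible(ble: Ble, target: List[Ble], max_bles: int, max_inputs: int) -> bool:
    """Check the BLE count and input pin limits of the target CLB after the move."""
    candidate = target + [ble]
    return len(candidate) <= max_bles and len(clb_input_nets(candidate)) <= max_inputs


if __name__ == "__main__":
    target = [Ble("b1", frozenset({"a", "b"}), "n1")]
    mover = Ble("b2", frozenset({"n1", "c"}), "n2")
    # Hypothetical CLB limits: 4 BLEs and 10 input pins.
    print(move_is_feasible(mover, target, max_bles=4, max_inputs=10))
```

Only moves that pass this check would go on to the cost function update.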
28
To be Continued….