# 20071228003446

```
An Analytic Placer
Aplacer
Andrew B. Kahng and Qinke Wang
UCSD CSE Department
{abk, qiwang}@cs.ucsd.edu
Problem Formulation
• Minimize wirelength subject to the constraint
that cells do not overlap
• Therefore the objective includes
– Density objective: to spread cells
– Wirelength objective : to minimize wirelength
Wirelength Formulation
• Placement objective: HPWL
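• Smooth approximation [Naylor et al., US Patent 6301693, 2001]
– log-sum-exp formula: pick the most dominant terms
among pin coordinates
    WL(t) = α · ( ln Σ_i exp(x_i/α) + ln Σ_i exp(−x_i/α)
                + ln Σ_i exp(y_i/α) + ln Σ_i exp(−y_i/α) ),  i over the pins of net t
– α : smoothing parameter
– closer to HPWL when α → 0
– precise
– strictly convex
– continuously differentiable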
Density Control
• Common strategy
– divide the placement area into grids
– equalize the total cell area in each grid
• Squared deviation penalty of an uneven cell
distribution
– not smooth or differentiable
– difficult to optimize
Cell Potential Function
• Bell-shaped cell potential function
[Naylor et al., US Patent 6301693, 2001]
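• Cell c has potential(c, g) with respect to grid g
    potential(c, g) = C · p(|x − x'|) · p(|y − y'|)
• Cell c at (x, y) has area A
• Grid point g = (x', y')
• p(d) : bell-shaped function
    p(d) = 1 − 2d²/r²      for 0 ≤ d ≤ r/2
    p(d) = 2(r − d)²/r²    for r/2 ≤ d ≤ r
    p(d) = 0               for d > r
• r : the radius of cells' potential
• C : a proportionality factor, s.t. Σ_g potential(c, g) = A
[Figure: plot of p(d) against d, with ticks at r/2 and r on both sides of the cell center]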
Implementation
• Cells are spread by minimizing the smooth
density penalty function
• APlace combines the above two objectives
and optimizes the following function using a
conjugate-gradient solver:
    (wirelength weight) · wirelength + (density weight) · density penalty
– Density term drives cell spreading
– Wirelength term draws connected components
back toward each other
Wirelength vs. Density Objectives
Objective: (wirelength weight) · wirelength + (density weight) · density penalty
• Density weight: fixed
– larger ⇒ spreads cells out hastily without good wirelength
• Wirelength weight: variable
– larger ⇒ contracts cells together and prevents them from
spreading out
– initially set to be large
– repeat until all cells are spread out evenly:
• execute conjugate-gradient solver until convergence
• reduce the weight by half
Descent Method
• Produce a minimizing sequence x(k), k = 1, …
where
    x(k+1) = x(k) + t(k) Δx(k)
    t(k) > 0 : Step length
    Δx(k) : Search direction
such that
    f(x(k+1)) < f(x(k))
• From convexity
we know
    ∇f(x(k))^T Δx(k) < 0
A General Descent Method
Given a starting point x in dom f
Repeat
1. Determine a descent direction Δx
2. Line search. Choose a step length t > 0
3. Update. x := x + t Δx
Until stopping criterion is satisfied
• Step 2 is called a line search since t determines where
along the line { x + t Δx | t in R+ } the next iterate
will be
– Gradient descent: Δx := −∇f(x)
Algorithm
• Loop stopping criterion
– Predetermined number of iterations is reached
– Step length returned by the line search function
is small enough
– The function value is not changing significantly
FastPlace: Efficient Analytical
Placement using Cell Shifting,
Iterative Local Refinement and a
Hybrid Net Model
Natarajan Viswanathan
Chris Chong-Nuen Chu
Iowa State University
International Symposium on Physical Design
April 19, 2004
FastPlace – Key Features
Efficient Analytical Placement
using
1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model

Solution Quality
• There may be significant room for improvement
– For existing wirelength-driven placement algorithms:
Cong et al. [ASPDAC 03] [ISPD 03]
– For existing timing-driven placement algorithms:
Cong et al. [ICCAD 03]
Efficiency
• Important to have fast placement algorithms
– Circuit sizes are huge in modern design
– Placement must be run in early design stages
Analytical Placement Formulation
• Let (x_i, y_i) = coordinates of the center of cell i
      w_ij       = weight of the net between cell i and cell j
      x, y       = solution vectors
• Cost of the net between cell i and cell j:
      (1/2) w_ij [ (x_i − x_j)² + (y_i − y_j)² ]
• Total cost = (1/2) x^T Q x + d_x^T x + (1/2) y^T Q y + d_y^T y + const
Analytical Placement Framework:




repeat
until the cells are evenly distributed
FastPlace Approach
• Framework:
    repeat
        Solve the convex quadratic program
        Reduce wirelength by iterative heuristic
    until the cells are evenly distributed
• Special features of FastPlace:
– Hybrid Net Model: speeds up solving of the convex QP
– Iterative Local Refinement: minimizes wirelength based on a linear objective
– Cell Shifting: easy-to-compute technique that enables fast convergence
Outline
FastPlace:
Efficient Analytical Placement
using
1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model



relative position of cells
• Simple shifting of cells should be able to produce
a good placement
• Major difficulties:
1. How to shift cells in a 2-D region?
2. How to make sure wirelength will still be good?
• Our Approach:
1. Perform 1-D shifting in x and y directions
independently
2. Interleave a small amount of shifting with quadratic
placement
Cell Shifting
1. Shifting of bin boundary
   (uniform bin structure becomes a non-uniform bin structure)
2. Shifting of cells linearly within each bin
• Apply to all rows and all columns
independently
Cell Shifting – Animation …
[Figure: animation of cell shifting across Bin i and Bin i+1. Labels: Ui, Ui+1; old bin boundaries OBi-1, OBi, OBi+1; new bin boundary NBi; cells j, k, l]
Pseudo pin and Pseudo net
• Pseudo pins and pseudo nets prevent cells from
collapsing back during the
next global optimization
• Only the diagonal of Q and the linear terms
of the system need to be updated
• Takes a single pass of O(n)
time to regenerate matrix
Q (which is common for
both x and y problems)
[Figure labels: Pseudo pin, Pseudo net, Force, Target Position, Original Position]
Iterative Local Refinement
• Iteratively go through all the cells one by one
• For each cell, consider moving it in four directions by a
certain distance
• Compute a score for each direction based on
– Half-perimeter wirelength (HPWL) reduction
– Cell density at the source and destination regions
• Move in the direction with highest positive score
(Do not move if no positive score)
• Distance moved (H or V) is
decreasing over iterations
• Detailed placement is handled
by the same heuristic
[Figure: a cell with candidate moves of distance H (horizontal) and V (vertical)]
Effect of Net Model on Runtime
• Need to replace each multi-pin net by 2-pin nets
• Then the placement problem (even with pseudo nets) can
be formulated as a convex QP:
    Total cost = (1/2) x^T Q x + d_x^T x + (1/2) y^T Q y + d_y^T y + const
• Solved by any convex QP algorithm
– Use Incomplete Cholesky Conjugate Gradient (ICCG)
• Runtime is proportional to # of non-zero entries in Q
• Each non-zero entry in Q corresponds to one 2-pin net
• Traditionally, placers model each multi-pin net by a clique
• High-degree nets will generate a lot of 2-pin nets
– Slow down convex QP algorithms significantly
Clique, Star and Hybrid Net Models
• Star model is introduced by Mo et al. [ICCAD-00]
for macro placement
– Introduce a star node even for 2-pin nets
– Not clear how the placement result will be affected
[Figure: Clique Model next to Star Model with its Star Node]
• Hybrid Model
    # pins     2       3       4     5     6     …
    Net Model  Clique  Clique  Star  Star  Star  …
Equivalence of Clique and Star Models
• Lemma: By setting the net weights appropriately,
clique and star net models are equivalent.
• Proof: When the star node is at its equilibrium position,
the total forces on each cell are the same for clique and star
net models.
[Figure: Clique Model with edge weight = γW; Star Model with Star Node and edge weight = γkW for a k-pin net]
Experimental Setup
• ISPD-02 mixed-mode benchmark suite by IBM
• Macro blocks replaced by standard cells with width
set to 4 x average cell width
• 10% whitespace
• FastPlace implemented in C
• Compared with:
– MetaPl-Capo 8.8 in default mode
– Dragon 2.2.3 in fixed die mode
• All placers run on a 750MHz Sun Sparc-2 machine
Placement Benchmark Statistics
Circuit   #Nodes  #Terminals   #Nets    #Pins  #Rows
ibm01      12506         246   14111    50566     96
ibm02      19342         259   19584    81199    109
ibm03      22853         283   27401    93573    121
ibm04      27220         287   31970   105859    136
ibm05      28146        1201   28446   126308    139
ibm06      32332         166   34826   128182    126
ibm07      45639         287   48117   175639    166
ibm08      51023         286   50513   204890    170
ibm09      53110         285   60902   222088    183
ibm10      68685         744   75196   297567    234
ibm11      70152         406   81454   280786    208
ibm12      70439         637   77240   317760    242
ibm13      83709         490   99666   357075    224
ibm14     147088         517  152772   546816    305
ibm15     161187         383  186608   715823    303
ibm16     182980         504  190048   778823    347
ibm17     184752         743  189581   860036    379
ibm18     210341         272  201920   819697    361
Clique Net Model vs Hybrid Net Model
          # Non-zero Entries                              Speed-Up
Circuit   Clique Model  Hybrid Model  Clique / Hybrid    (Hybrid / Clique)
ibm01           109183         41164             2.65                  1.5
ibm02           343409         70014             4.90                  2.4
ibm03           206069         74680             2.76                  1.4
ibm04           220423         84556             2.61                  1.2
ibm05           349676        108282             3.23                  1.3
ibm06           321308        106835             3.01                  1.6
ibm07           373328        147009             2.54                  1.3
ibm08           732550        173541             4.22                  2.0
ibm09           478777        185102             2.59                  1.4
ibm10           707969        251101             2.82                  1.6
ibm11           508442        230865             2.20                  1.2
ibm12           748371        270849             2.76                  1.6
ibm13           744500        295048             2.52                  1.5
ibm14          1125147        456474             2.46                  1.3
ibm15          1751474        607289             2.88                  1.4
ibm16          1923995        668491             2.88                  1.3
ibm17          2235716        753507             2.97                  1.4
ibm18          2221860        711702             3.12                  1.4
Average                                          2.95                  1.5
Half Perimeter Wirelength
[Figure: bar chart of wirelength (x 10^6) for ibm01 to ibm18, comparing Capo 8.8, Dragon 2.2.3 and FastPlace]
Average Wirelength Ratio
FastPlace / Capo : 1.010
FastPlace / Dragon : 1.016
Runtime Comparison
          Runtime                                   Speed-Up
Circuit   Capo 8.8    Dragon 2.2.3  FastPlace       (Capo / FP)  (Dragon / FP)
ibm01     3 m 59 s    29 m 06 s     13 s            x 18.4       x 134.3
ibm02     7 m 15 s    31 m 13 s     33 s            x 13.2       x 56.8
ibm03     8 m 23 s    31 m 49 s     33 s            x 15.2       x 57.8
ibm04     10 m 46 s   1 h 05 m      39 s            x 16.6       x 100.0
ibm05     10 m 44 s   1 h 48 m      51 s            x 12.6       x 127.1
ibm06     12 m 08 s   1 h 21 m      45 s            x 16.2       x 108.0
ibm07     18 m 32 s   1 h 47 m      1 m 19 s        x 14.1       x 81.3
ibm08     19 m 53 s   4 h 30 m      1 m 33 s        x 12.8       x 174.2
ibm09     22 m 50 s   3 h 43 m      1 m 42 s        x 13.4       x 131.2
ibm10     29 m 04 s   3 h 19 m      2 m 25 s        x 12.0       x 82.3
ibm11     31 m 11 s   2 h 22 m      2 m 13 s        x 14.1       x 64.1
ibm12     30 m 41 s   3 h 48 m      2 m 23 s        x 12.9       x 95.7
ibm13     39 m 27 s   3 h 04 m      2 m 54 s        x 13.6       x 63.4
ibm14     1 h 12 m    7 h 37 m      5 m 34 s        x 12.9       x 82.1
ibm15     1 h 30 m    10 h 34 m     8 m 45 s        x 10.3       x 72.4
ibm16     1 h 31 m    12 h 06 m     10 m 52 s       x 8.4        x 66.8
ibm17     1 h 43 m    26 h 54 m     11 m 30 s       x 9.0        x 140.3
ibm18     1 h 44 m    23 h 39 m     12 m 21 s       x 8.4        x 114.9
Average                                             x 13.0       x 97.4
Summary
• FastPlace -- Efficient Flat Placement Algorithm
– 13.0x faster than Capo
– 97.4x faster than Dragon
– Comparable WL to Capo and Dragon
• Based on three techniques:
1. Cell Shifting
– Fast convergence
– Simple computation
2. Iterative Local Refinement
– Reduce wirelength based on HPWL measure
3. Hybrid Net Model
– 1.5x speedup compared to Clique
– Applicable to any analytical placement tools
Thank You !!
```
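
The Wirelength Formulation slide describes a log-sum-exp approximation of HPWL that gets closer to HPWL as α → 0. The sketch below illustrates that behaviour for a single net. It assumes the usual four-term log-sum-exp form (max and min in both x and y); the function names `hpwl`, `_lse` and `smooth_wl` are hypothetical and not taken from APlace.

```python
import math

def hpwl(xs, ys):
    """Exact half-perimeter wirelength of one net from its pin coordinates."""
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def _lse(values, alpha):
    """alpha * ln(sum(exp(v / alpha))), shifted by the max to avoid overflow."""
    m = max(values)
    return m + alpha * math.log(sum(math.exp((v - m) / alpha) for v in values))

def smooth_wl(xs, ys, alpha):
    """Assumed log-sum-exp wirelength; alpha is the smoothing parameter."""
    return (_lse(xs, alpha) + _lse([-x for x in xs], alpha)
            + _lse(ys, alpha) + _lse([-y for y in ys], alpha))
```

Each `_lse` term is a smooth upper bound on a max (or on a negated min), so `smooth_wl` never underestimates `hpwl`, and the gap shrinks as `alpha` decreases.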
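
The Cell Potential Function slide gives a bell-shaped p(d) built from the pieces 1 − 2d²/r² and 2(r − d)²/r². A minimal sketch follows, assuming the pieces meet at d = r/2 and that p is zero beyond r, as the plot ticks suggest. It also assumes the potential is the product of p in x and y, scaled by C so that a cell's potentials sum to its area. `bell` and `cell_potentials` are hypothetical names.

```python
def bell(d, r):
    """Bell-shaped potential p(d) with radius r; zero at and beyond d = r."""
    d = abs(d)
    if d <= r / 2:
        return 1 - 2 * d * d / (r * r)
    if d <= r:
        return 2 * (r - d) ** 2 / (r * r)
    return 0.0

def cell_potentials(x, y, area, grid_points, r):
    """Spread a cell's area over nearby grid points.

    Assumes potential(c, g) = C * p(|x - x'|) * p(|y - y'|), with C chosen
    so that the potentials over all grid points sum to the cell area.
    """
    raw = {g: bell(x - g[0], r) * bell(y - g[1], r) for g in grid_points}
    total = sum(raw.values())
    c = area / total if total > 0 else 0.0
    return {g: c * v for g, v in raw.items()}
```

The two pieces agree at d = r/2 (both give 0.5), which is what makes the penalty smooth enough for a gradient-based solver.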
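
The general descent method and the three loop stopping criteria from the APlace slides can be sketched as below. This is an illustration only: it uses the negative gradient as the direction and a backtracking (Armijo) line search, which is an assumption, since the slides do not say which line search APlace uses. The parameters `beta`, `c` and `tol` are hypothetical.

```python
def gradient_descent(f, grad, x, max_iters=1000, tol=1e-10, beta=0.5, c=1e-4):
    """General descent loop with direction -grad f(x) and backtracking line search."""
    for _ in range(max_iters):                  # stop: iteration budget reached
        g = grad(x)
        dx = [-gi for gi in g]                  # 1. descent direction
        slope = sum(gi * di for gi, di in zip(g, dx))   # grad^T dx, never positive
        t, fx = 1.0, f(x)
        # 2. line search: shrink t until f decreases enough (Armijo condition)
        while f([xi + t * di for xi, di in zip(x, dx)]) > fx + c * t * slope:
            t *= beta
            if t < tol:
                break
        if t < tol:                             # stop: step length is small enough
            break
        x_new = [xi + t * di for xi, di in zip(x, dx)]  # 3. update
        if abs(fx - f(x_new)) < tol:            # stop: value no longer changing
            return x_new
        x = x_new
    return x
```

For example, `gradient_descent(lambda v: (v[0] - 1) ** 2, lambda v: [2 * (v[0] - 1)], [0.0])` moves the starting point toward the minimizer at 1.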
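
The lemma on the equivalence of clique and star net models can be checked numerically in one dimension. The sketch below compares the force on each pin of a k-pin net under a clique with edge weight γW and under a star with edge weight γkW, with the star node placed at its equilibrium position (the centroid of the pins). The choice γ = 1/(k − 1) is an assumption for illustration; the equality holds for any γ. The function names are hypothetical.

```python
def clique_forces(xs, w):
    """Force on each pin of a k-pin net under the clique model (edge weight gamma*W)."""
    k = len(xs)
    gamma = 1.0 / (k - 1)          # assumed scaling; any gamma gives the same match
    return [gamma * w * sum(xj - xi for xj in xs) for xi in xs]

def star_forces(xs, w):
    """Force on each pin under the star model (edge weight gamma*k*W)."""
    k = len(xs)
    gamma = 1.0 / (k - 1)
    star = sum(xs) / k             # equilibrium position of the star node
    return [gamma * k * w * (star - xi) for xi in xs]
```

The match follows from Σ_j (x_j − x_i) = k · (mean − x_i), so a k-pin net contributes k star edges instead of k(k − 1)/2 clique edges without changing the forces.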