ECE 260B - CSE241A VLSI Digital Circuits

ECE260B – CSE241A
Winter 2007
Floorplanning, Partitioning and Placement
Website: http://vlsicad.ucsd.edu/courses/ece260b-w07
ECE 260B – CSE 241A Floorplanning, Partitioning and Placement
Andrew B. Kahng, UCSD
Floorplanning
Floorplanning Input
• Design netlist (required)
• Area requirements (required)
• Power requirements (required)
• Timing constraints (required)
• Physical partitioning information (required)
• Die size vs. performance vs. schedule trade-off (required)
• I/O placement (optional)
• Macro placement information (optional)
Floorplanning Output
• Die/block area
• I/Os placed
• Macros placed
• Power grid designed
• Power pre-routing
• Standard cell placement areas
→ Design ready for standard cell placement
Floorplan
• Blocks inside a pad frame
• Routing inside, between blocks
• Different-sized blocks more difficult than standard cells to place and route
• Blocks
  - Hard, soft, semi-soft
  - Rectangular, L-shaped, T-shaped, rectilinear
  - Can rotate, mirror, …
[Figure: floorplan showing blocks (std cell, RAM, data path), routing channels, and I/O pads]
Courtesy K. Yang, UCLA
Size Estimation
• Why we care:
  - If area is too small: P&R will not finish or meet timing, or will run too long
  - Schedule and die size are inversely related
  - Performance and die size have a complex relationship
[Figure: physical design trade-off among schedule, performance, and die size]
• Rule of thumb (must correct for power, clock, etc.):
  - 3LM: cell utilization 65 percent
  - 4LM: cell utilization 70 percent
  - 5LM: cell utilization 75 percent
  - 6LM: cell utilization 80 percent
• Floorplan metrics (what is utilization?)
  - Low interconnect density → cell utilization (standard cell area / standard cell row area)
  - High interconnect density → “net util” (number of nets / standard cell area)
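As a rough illustration of the cell-utilization metric, the sketch below computes utilization from two areas and inverts the ratio to estimate the row area needed for a target utilization. The function names and the example numbers are hypothetical, and the corrections for power, clock, etc. mentioned above are not modeled.

```python
def cell_utilization(std_cell_area, row_area):
    """Cell utilization = standard cell area / standard cell row area."""
    if row_area <= 0:
        raise ValueError("row area must be positive")
    return std_cell_area / row_area


def required_row_area(std_cell_area, target_utilization):
    """Row area needed so that the cells fill it to the target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target utilization must be in (0, 1]")
    return std_cell_area / target_utilization


# Hypothetical design: 3.0 mm^2 of standard cells, 75 percent target utilization.
row_area = required_row_area(3.0, 0.75)
utilization = cell_utilization(3.0, row_area)
print(row_area, utilization)
```

A lower target utilization leaves more free space for routing at the cost of a larger die.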
Channels
• Channels end at block boundaries
• Alternate channel definitions possible, depending on position of blocks
[Figure: arrangements of blocks A, B, C with the channels (ch 1, ch 2, ch 3) defined between them]
Courtesy K. Yang, UCLA
Channel Intersection Graph
• Nodes are channels; edges correspond to pairs of channels that touch
• Channel graph shows paths between channels
• Channel graph can be used to guide global routing
[Figure: floorplan with blocks A, B, C, D, E and its channel intersection graph]
Courtesy K. Yang, UCLA
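A minimal sketch of a channel intersection graph follows: channels are nodes, touching pairs are edges, and a breadth-first search finds a shortest sequence of channels between two of them, which is one simple way such a graph could guide global routing. The channel names and the list of touching pairs are made up for illustration and are not taken from the figure.

```python
from collections import deque


def build_channel_graph(touching_pairs):
    """Adjacency sets: one node per channel, one edge per touching pair."""
    graph = {}
    for a, b in touching_pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def channel_path(graph, start, goal):
    """Shortest sequence of channels from start to goal (BFS), or None."""
    if start not in graph or goal not in graph:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        ch = queue.popleft()
        if ch == goal:
            path = []
            while ch is not None:
                path.append(ch)
                ch = parent[ch]
            return path[::-1]
        for nxt in sorted(graph[ch]):
            if nxt not in parent:
                parent[nxt] = ch
                queue.append(nxt)
    return None


# Hypothetical channels and touching pairs.
pairs = [("ch1", "ch2"), ("ch2", "ch3"), ("ch1", "ch3"), ("ch3", "ch4")]
graph = build_channel_graph(pairs)
route = channel_path(graph, "ch1", "ch4")
print(route)
```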
Slicing Floorplan Represented by Binary Tree
• A slicing floorplan can be recursively cut in two without cutting any blocks
• A slicing floorplan is guaranteed to have no “wheels”, therefore guaranteed to have a feasible order of routing for the channels
• A slicing floorplan can be represented as a binary tree, with internal nodes representing slices in the floorplan and leaves representing blocks.
[Figure: slicing floorplan with blocks A, B, C, D, E and cuts 1 to 4, next to the corresponding binary tree]
Courtesy K. Yang, UCLA
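The binary-tree representation can be sketched as nested tuples: a leaf is a block name, and an internal node is a cut direction with two subtrees. The recursion below computes the bounding box of a slicing floorplan bottom-up. The tree shape, the cut directions, and the block dimensions are all hypothetical (the figure's actual tree is not reproduced here), and blocks are assumed hard and unrotated.

```python
def slicing_size(node, block_sizes):
    """Return (width, height) of the floorplan described by a slicing tree.

    A leaf is a block name; an internal node is (cut, left, right) where
    cut == "V" places the children side by side (vertical cut line) and
    cut == "H" stacks them (horizontal cut line).
    """
    if isinstance(node, str):
        return block_sizes[node]
    cut, left, right = node
    lw, lh = slicing_size(left, block_sizes)
    rw, rh = slicing_size(right, block_sizes)
    if cut == "V":
        return (lw + rw, max(lh, rh))
    if cut == "H":
        return (max(lw, rw), lh + rh)
    raise ValueError("cut must be 'H' or 'V'")


# Hypothetical block dimensions (width, height) and slicing tree with four cuts.
sizes = {"A": (2, 3), "B": (2, 1), "C": (3, 2), "D": (1, 2), "E": (4, 1)}
tree = ("V", ("H", "A", "B"), ("H", ("V", "C", "D"), "E"))

width, height = slicing_size(tree, sizes)
block_area = sum(w * h for w, h in sizes.values())
print(width, height, block_area / (width * height))
```

The ratio printed at the end is the fraction of the bounding box covered by blocks; the remainder is dead space.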
O-Tree
• Partial ordering based on projection overlapping (with given physical locations)
• Transforming into binary trees by pivoting, etc.
• Coded in a node sequence given a tree traversal algorithm
  - E.g., OACBDEF for DFS
• Condensed solution space
[Figure: placement of blocks A, B, C, D, E, F and the corresponding O-tree rooted at O]
Courtesy K. Yang, UCLA
Sequence Pair
• Based on layout partitions by nonoverlapping ascending/descending staircases
• Coded in two node sequences
  - E.g., CEDFAB for descending staircases and
  - ABCDEF for ascending staircases
• Larger solution space, finer representation
[Figure: placement of blocks A, B, C, D, E, F]
Courtesy K. Yang, UCLA
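To make the two-sequence encoding concrete, the sketch below decodes the relative position of two blocks from a sequence pair. It assumes one common convention: a block that precedes another in both sequences lies to its left, and a block that precedes in the first sequence but follows in the second lies above. Which of the slide's staircase sequences plays which role is not stated here, so the example uses small hypothetical sequences rather than the ones above.

```python
def relation(first, second, a, b):
    """Relative position of block a with respect to block b.

    Assumed convention: a before b in both sequences -> a is left of b;
    a before b in `first` but after b in `second` -> a is above b.
    """
    fa, fb = first.index(a), first.index(b)
    sa, sb = second.index(a), second.index(b)
    if fa < fb and sa < sb:
        return "left of"
    if fa > fb and sa > sb:
        return "right of"
    if fa < fb and sa > sb:
        return "above"
    return "below"


# Hypothetical three-block sequence pair.
first = ["A", "B", "C"]
second = ["B", "A", "C"]
for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(a, relation(first, second, a, b), b)
```

Every pair of distinct blocks gets exactly one of the four relations, which is why any two permutations describe an overlap-free relative placement.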
Partitioning
Hypergraphs in VLSI CAD
• Circuit netlist represented by hypergraph
Courtesy K. Yang, UCLA
Hypergraph Partitioning in VLSI
• Circuit netlist represented by hypergraph
• Variants
  - directed/undirected hypergraphs
  - weighted/unweighted vertices, edges
  - constraints, objectives, …
• Human-designed instances
• Benchmarks
  - up to 4,000,000 vertices
  - sparse (vertex degree ≈ 4, hyperedge size ≈ 4)
  - small number of very large hyperedges
• Efficiency, flexibility: KL-FM style preferred
Courtesy K. Yang, UCLA
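A small sketch of the hypergraph view of a netlist: each net is a hyperedge over the cells it connects, and the cut of a partition counts the hyperedges whose pins land in more than one part. The net names, cell names, and partition below are hypothetical; vertices and edges are unweighted.

```python
def cut_size(nets, part_of):
    """Number of hyperedges (nets) whose pins span more than one part."""
    return sum(
        1 for pins in nets.values()
        if len({part_of[cell] for cell in pins}) > 1
    )


# Hypothetical netlist: net name -> cells (vertices) on that net.
nets = {
    "n1": ["a", "b"],
    "n2": ["b", "c", "d"],
    "n3": ["c", "d"],
    "n4": ["a", "d"],
}
# Hypothetical bipartition: cell -> part index.
part_of = {"a": 0, "b": 0, "c": 1, "d": 1}

print(cut_size(nets, part_of))
```

Note that a three-pin net such as n2 contributes one to the cut no matter how its pins are split, which is what distinguishes the hypergraph model from a graph of pairwise edges.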
Example: Partitioning of a Circuit
[Figure: circuit partitioned into three parts by two cuts]
Input size: 48
Cut 1 = 4, Cut 2 = 4
Size 1 = 15, Size 2 = 16, Size 3 = 17
Courtesy K. Yang, UCLA
Hierarchical Partitioning
• Levels of partitioning:
  - System-level partitioning: each sub-system can be designed as a single PCB
  - Board-level partitioning: circuit assigned to a PCB is partitioned into sub-circuits, each fabricated as a VLSI chip
  - Chip-level partitioning: circuit assigned to the chip is divided into manageable sub-circuits
    NOTE: physically not necessary
Delay at Different Levels of Partitions
[Figure, built up over several slides: components A, B, C, D placed on PCB1 and PCB2, with relative connection delays x, 10x, and 20x at the different levels of partitioning]
Context: Top-Down Placement
• Speed
  - 6,000 cells/minute to final detailed placement
  - partitioning used only in top-down global placement
  - implied partitioning runtime: 1 second for 25,000 cells, < 30 seconds for 750,000 cells
• Structure
  - tight balance constraint on total cell areas in partitions
  - widely varying cell areas
  - fixed terminals (pads, terminal propagation, etc.)
Fiduccia-Mattheyses (FM) Approach
• Pass:
  - start with all vertices free to move (unlocked)
  - label each possible move with immediate change in cost that it causes (gain)
  - iteratively select and execute a move with highest gain, lock the moving vertex (i.e., cannot move again during the pass), and update affected gains
  - best solution seen during the pass is adopted as starting solution for next pass
• FM:
  - start with some initial solution
  - perform passes until a pass fails to improve solution quality
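The pass structure above can be sketched in a few dozen lines. This is a simplified illustration, not the linear-time implementation from the FM paper: gains are recomputed from scratch at every step instead of being kept in bucket lists, vertices have unit weight, and balance is enforced by a hypothetical `min_side_size` parameter. The example netlist and starting partition are made up.

```python
def cut_size(nets, side):
    """Number of nets with pins on both sides of the bipartition."""
    return sum(1 for pins in nets if len({side[v] for v in pins}) > 1)


def gain(v, nets, nets_of, side):
    """Immediate reduction in cut size if vertex v moves to the other side."""
    g = 0
    for n in nets_of[v]:
        same = sum(1 for u in nets[n] if side[u] == side[v])
        other = len(nets[n]) - same
        if same == 1 and other > 0:
            g += 1      # v is alone on its side: the net becomes uncut
        elif other == 0 and same > 1:
            g -= 1      # net is uncut now and would become cut
    return g


def fm_pass(nets, side, min_side_size):
    """One pass: move each free vertex at most once, keep the best prefix."""
    side = dict(side)
    nets_of = {v: [] for v in side}
    for n, pins in enumerate(nets):
        for v in pins:
            nets_of[v].append(n)
    sizes = [list(side.values()).count(0), list(side.values()).count(1)]
    locked = set()
    cur_cut = cut_size(nets, side)
    best_cut, best_side = cur_cut, dict(side)
    while True:
        # Unlocked vertices whose move keeps both sides large enough.
        free = [v for v in side
                if v not in locked and sizes[side[v]] - 1 >= min_side_size]
        if not free:
            break
        gains = {v: gain(v, nets, nets_of, side) for v in free}
        v = max(free, key=gains.get)      # highest gain, even if negative
        sizes[side[v]] -= 1
        side[v] = 1 - side[v]
        sizes[side[v]] += 1
        locked.add(v)
        cur_cut -= gains[v]
        if cur_cut < best_cut:
            best_cut, best_side = cur_cut, dict(side)
    return best_side, best_cut


def fm(nets, side, min_side_size):
    """Repeat passes until a pass fails to improve the cut."""
    best_cut = cut_size(nets, side)
    while True:
        new_side, new_cut = fm_pass(nets, side, min_side_size)
        if new_cut >= best_cut:
            return side, best_cut
        side, best_cut = new_side, new_cut


# Hypothetical netlist: two triangles joined by the net (c, d).
nets = [("a", "b"), ("b", "c"), ("a", "c"),
        ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
start = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
final_side, final_cut = fm(nets, start, min_side_size=2)
print(cut_size(nets, start), final_cut, final_side)
```

Because negative-gain moves are still executed within a pass, the cut can rise temporarily; only the best solution seen is kept, which is what lets the heuristic climb out of some local minima.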
Cut During One Pass (Bipartitioning)
[Figure: cut size versus number of moves during one pass]
Multilevel Partitioning
[Figure: multilevel partitioning, with a clustering phase followed by a refinement phase]
Placement
VLSI Design Flow and Physical Design Stage
Definitions:
• Cell: a circuit component to be placed on the chip area. In placement, the functionality of the component is ignored.
• Net: specifies a subset of terminals, to connect several cells.
• Netlist: a set of nets which contains the connectivity information of the circuit.
[Figure: physical design flow with stages IO Pad Placement; Power/Ground Stripes, Rings Routing; Global Placement; Detail Placement; Clock Tree Synthesis and Routing; Global Routing; Detail Routing; Extraction and Delay Calc.; Timing Verification]
Placement Problem
Input:
• A set of cells and their complete information (a cell library).
• Connectivity information between cells (netlist information).
Output:
• A set of locations on the chip; one location for each cell
Goal:
• The cells are placed to produce a routable chip that meets timing and other constraints (e.g., low-power, noise, etc.)
Challenge:
• The number of cells in a design is very large (> 1 million)
• The timing constraints are very tight
Optimal Relative Order:
[Figure sequence with cells A, B, C in a row: the optimal relative order; “To spread ...”; “... or not to spread”; “Place to the left”; “... or to the right”]
• Without “free” space, the placement problem is dominated by order
Placement Problem
[Figure: a bad placement versus a good placement]
Global and Detailed Placement
• In global placement, we decide the approximate locations for cells by placing cells in global bins.
• In detailed placement, we make some local adjustment to obtain the final nonoverlapping placement.
[Figure: Global Placement versus Detailed Placement]
Placement Footprints:
[Figures: standard cell; data path; IP - floorplanning]
[Figure: mixed data path & sea of gates, with core, control, IO, and reserved areas]
[Figures: perimeter IO; area IO]
Placement objectives are subject to user constraints /
design style
• Hierarchical Design Constraints
  - pin location
  - power rail
  - reserved layers
• Flat Design with Floorplan Constraints
• Fixed Circuits
• I/O Connections
Standard Cells
• Power connected by abutment, placed in sea-of-rows
• Rarely rotated
• DRC clean in any combination
• Circuit clean (i.e., no naked T-gates, no huge input capacitances)
• 8, 9, 10+ tracks in height
• Metal 1 only used (hopefully)
• Multi-height stdcells possible
• Buffers: sizes, intrinsic delay steps, optimal repeater selection
• Special clock buffers + gates (balanced P:N)
• Special metastability hardened flops
• Cap cells (metal1 used?)
• Gap fillers (metal1 used?)
• Tie-high, tie-low
[Figures: unconstrained placement; floorplanned placement]
Traditional Placement Algorithms
• Quadratic Placement
• Simulated Annealing
• Bi-Partitioning / Quadrisection
• Force Directed Placement
• Hybrid
[Figure: axes labeled Algorithm, Cost Function, Netlist Granularity, and Layout Coarseness]
Quadratic Placement
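• Quadratic Placement
  $$\min F = (x_1 - x_3)^2 + (x_1 - x_2)^2 + (x_2 - x_4)^2$$
  $$\partial F / \partial x_1 = 0, \quad \partial F / \partial x_2 = 0 \;\Rightarrow\; Ax = B$$
  $$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad B = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix}$$
[Figure: nodes x3, x1, x2, x4 connected in a chain]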
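As a worked instance of the system Ax = B above, the sketch below solves the 2x2 case directly with Cramer's rule. It assumes x3 and x4 are fixed coordinates (for example, pads) and picks hypothetical values for them; a real placer would solve a large sparse system instead.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a * x = b by Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return x1, x2


# Matrix from setting dF/dx1 = 0 and dF/dx2 = 0.
A = [[2, -1], [-1, 2]]
# Hypothetical fixed positions for x3 and x4.
x3, x4 = 0.0, 3.0

x1, x2 = solve_2x2(A, [x3, x4])
print(x1, x2)
```

With the fixed nodes at 0 and 3, the movable cells land at the evenly spaced positions between them, which is the minimum of the sum of squared lengths along the chain.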
Analytical Placement
• Get a solution with lots of overlaps
• What do we do with the overlap?
Pros and Cons of QP
• Pros
  - Very fast analytical solution
  - Can handle large design sizes
  - Can be used as an initial seed placement engine
• Cons
  - Can generate overlapped solutions: post-processing needed
  - Not suitable for timing-driven placement
  - Not suitable for simultaneous optimization of other aspects of physical design (clocks, crosstalk, …)
  - Gives trivial solutions without pads (and close to trivial with pads)
Simulated Annealing Placement
• Initial Placement Improved through Swaps and Moves
• Accept a Swap/Move if it improves cost
• Accept a Swap/Move that degrades cost under some probability conditions
[Figure: cost versus time]
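A toy sketch of the accept/reject loop follows. The slide leaves the "probability conditions" open; the code assumes the usual Metropolis rule (accept a worsening move with probability exp(-delta/T)) and a geometric cooling schedule, both of which are assumptions. The problem is a hypothetical single row of cells with two-pin nets, and all parameter values are arbitrary.

```python
import math
import random


def accept(delta, temperature, rng):
    """Metropolis rule (assumed): always accept improvements, sometimes accept degradations."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def row_wirelength(order, nets):
    """Sum over nets of the span between leftmost and rightmost cell slots."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net)
               for net in nets)


def anneal(order, nets, t_start=5.0, t_end=0.01, cooling=0.95,
           moves_per_t=50, seed=0):
    rng = random.Random(seed)
    order = list(order)
    cost = row_wirelength(order, nets)
    best, best_cost = list(order), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]      # try a swap
            new_cost = row_wirelength(order, nets)
            if accept(new_cost - cost, t, rng):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = list(order), cost
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
        t *= cooling                                     # geometric cooling
    return best, best_cost


# Hypothetical chain of cells a-b-c-d-e, initially scrambled.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
initial = ["c", "a", "e", "b", "d"]
best, best_cost = anneal(initial, nets)
print(row_wirelength(initial, nets), best_cost, best)
```

At high temperature almost every swap is accepted; as the temperature drops the loop behaves more and more like greedy improvement.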
Pros and Cons of SA
• Pros:
  - Can Reach Globally Optimal Solution (given “enough” time)
  - Open Cost Function
  - Can Optimize Simultaneously all Aspects of Physical Design
  - Can be Used for End Case Placement
• Cons:
  - Extremely Slow Process of Reaching a Good Solution
Bi-Partitioning / Quadrisection
Pros and Cons of Partitioning Based Placement
• Pros:
  - More suitable to timing-driven placement since it is move based
  - New innovations (hMetis) in partitioning algorithms have made this extremely fast
  - Open cost function
  - Move based means simultaneous optimization of all design aspects is possible
• Cons:
  - Not well understood
  - Lots of “indifferent” moves
  - May not work well with some cost functions
Cost Functions of Placement
• Net-cut
• Linear wirelength
• Quadratic wirelength
• Congestion
• Timing
• Coupling
• Other performance related cost functions
• Undiscovered: crossing
[Figure: axes labeled Algorithm, Cost Function, Netlist Granularity, and Layout Coarseness]
Net-cut Cost for Global Placement
• The net-cut cost is defined as the number of external nets between different global bins
• Minimizing net-cut in global placement tends to put highly connected cells close to each other.
Linear Wirelength Cost
• The linear length of a net between cell 1 at $(x_1, y_1)$ and cell 2 at $(x_2, y_2)$ is
  $$l_{12} = |x_1 - x_2| + |y_1 - y_2|$$
• The linear wirelength cost is the summation of the linear length of all nets.
Quadratic Wirelength Cost
• The quadratic length of a net between cell 1 at $(x_1, y_1)$ and cell 2 at $(x_2, y_2)$ is
  $$l_{12} = (x_1 - x_2)^2 + (y_1 - y_2)^2$$
• The quadratic wirelength cost is the summation of the quadratic length of all nets.
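The linear and quadratic wirelength costs can be compared side by side on a tiny example. The sketch below handles two-pin nets only, and the cell names and coordinates are hypothetical.

```python
def linear_length(p, q):
    """|x1 - x2| + |y1 - y2| for two cell locations."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def quadratic_length(p, q):
    """(x1 - x2)^2 + (y1 - y2)^2 for two cell locations."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wirelength_cost(positions, nets, length):
    """Sum the chosen length measure over all two-pin nets."""
    return sum(length(positions[a], positions[b]) for a, b in nets)


# Hypothetical placement and two-pin nets.
positions = {"c1": (0, 0), "c2": (3, 4), "c3": (3, 0)}
nets = [("c1", "c2"), ("c2", "c3")]

print(wirelength_cost(positions, nets, linear_length))
print(wirelength_cost(positions, nets, quadratic_length))
```

The quadratic measure penalizes long nets disproportionately, which is why a quadratic placer tends to shorten the longest connections first.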
Timing Cost
• Delay of the circuit is defined as the longest delay among all possible paths from primary inputs to primary outputs.
• Interconnection delay becomes more and more important in deep submicron regime.
[Figure: critical path]
Timing Analysis
[Figure: timing graph between latches, annotated with gate and interconnect delay values]
• How do we get the delay numbers on the gate/interconnect?
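Once gate and interconnect delays are available, the circuit delay (the longest path from primary inputs to primary outputs) can be found in one topological sweep. The sketch below assumes a combinational, acyclic timing graph with positive delays; the gate names and delay values are hypothetical and are not the numbers from the figure.

```python
from collections import defaultdict, deque


def longest_path_delay(gate_delay, edges):
    """Return (delay, path) of the critical path in an acyclic timing graph.

    gate_delay: node -> delay of that gate
    edges: (src, dst, wire_delay) triples
    """
    succ = defaultdict(list)
    indeg = {g: 0 for g in gate_delay}
    for u, v, w in edges:
        succ[u].append((v, w))
        indeg[v] += 1
    arrival = dict(gate_delay)            # arrival time at each gate output
    prev = {g: None for g in gate_delay}
    queue = deque(g for g in gate_delay if indeg[g] == 0)
    visited = 0
    while queue:                          # topological order (Kahn)
        u = queue.popleft()
        visited += 1
        for v, w in succ[u]:
            candidate = arrival[u] + w + gate_delay[v]
            if candidate > arrival[v]:
                arrival[v] = candidate
                prev[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(gate_delay):
        raise ValueError("timing graph contains a cycle")
    end = max(arrival, key=arrival.get)
    delay = arrival[end]
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = prev[node]
    return delay, path[::-1]


# Hypothetical gate delays and wire delays.
gate_delay = {"in1": 0, "g1": 2, "g2": 3, "g3": 1, "out": 0}
edges = [("in1", "g1", 1), ("in1", "g2", 1),
         ("g1", "g3", 2), ("g2", "g3", 2), ("g3", "out", 1)]

delay, path = longest_path_delay(gate_delay, edges)
print(delay, path)
```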
Approaches
• Budgeting
  - Inaccurate information
  - Fast
• Path Analysis
  - Most accurate information
  - Very slow
• Path analysis with infrequent path substitution
  - Somewhere in between
Timing Metrics
• How do we assess the change in a delay due to a potential move during physical design?
• Whether it is channel routing or area routing, the problem is the same
• Translate geometrical change into delay change
Other costs: Coupling Cost
• Hard to model during placement
• Can run a global router in the middle of placement
• Even at the global routing level it is hard to model it
→ Avoid it
Coupling Solutions
• Once we have some metrics for coupling, we can calculate sensitivities, and optimize the physical design...
[Figure: coupling solutions: segregation (noisy region, quiet region), spacing (extra space), shielding (grounded shields)]
Other Performance Costs
• Power usage of the chip.
  - Weighted nets
  - Dual voltages (severe constraint on placement)
• Very little known about these cost functions and their interaction with other cost functions
• Fundamental research is needed to shed some light on the structure of them
Placement References
• C. J. Alpert, T. Chan, D. J.-H. Huang, I. Markov, and K. Yan, “Quadratic Placement Revisited”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 752-757.
• C. J. Alpert, J.-H. Huang, and A. B. Kahng, “Multilevel Circuit Partitioning”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 530-533.
• U. Brenner and A. Rohe, “An Effective Congestion Driven Placement Framework”, International Symposium on Physical Design, 2002, pp. 6-11.
• A. E. Caldwell, A. B. Kahng, and I. L. Markov, “Can Recursive Bisection Alone Produce Routable Placements?”, Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 477-482.
• M. A. Breuer, “Min-Cut Placement”, J. Design Automation and Fault Tolerant Computing, 1(4), 1977, pp. 343-362.
• J. Vygen, “Algorithms for Large-Scale Flat Placement”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 746-751.
• H. Eisenmann and F. M. Johannes, “Generic Global Placement and Floorplanning”, Proc. 35th IEEE/ACM Design Automation Conference, 1998, pp. 269-274.
• S.-L. Ou and M. Pedram, “Timing Driven Placement Based on Partitioning with Dynamic Cut-Net Control”, Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 472-476.
• C. M. Fiduccia and R. M. Mattheyses, “A Linear Time Heuristic for Improving Network Partitions”, Proc. ACM/IEEE Design Automation Conference, 1982, pp. 175-181.
Reading Assignment (Posted on the web)
• C. M. Fiduccia and R. M. Mattheyses, “A Linear Time Heuristic for Improving Network Partitions”, Proc. ACM/IEEE Design Automation Conference, 1982, pp. 175-181.
• A. E. Caldwell, A. B. Kahng, and I. L. Markov, “Design and Implementation of the Fiduccia-Mattheyses Heuristic for VLSI Netlist Partitioning”, Proc. Workshop on Algorithm Engineering and Experimentation (ALENEX), January 1999.
• (Optional): C. J. Alpert and A. B. Kahng, “Recent Directions in Netlist Partitioning: A Survey”, Integration: The VLSI Journal 19 (1995), pp. 1-81.
Homework – Friday 1/19
• If I model one wire segment with R, C = Rw, Cw by two segments in series, each with R, C = Rw/2, Cw/2, how does Elmore delay change?
• What are the differences between Kernighan-Lin and Fiduccia-Mattheyses?