Fishbone: A Block-Level Placement and Routing Scheme
Fan Mo and Robert K. Brayton
EECS, UC Berkeley

Outline
• The block-level placement and routing problem
  – Routability, predictability
• Fishbone scheme
  – Spine net topology
  – Base/virtual pin pairs
  – Row/column routing with the left-edge algorithm
• Integrated placement and routing
• Experimental results
• Discussion

Block-Level P&R
• In the conventional design flow, block-level placement and routing are two sequential stages.
  – During placement, a fast but inaccurate net model is used to estimate wire length, congestion, etc.
  – During routing, the blocks are fixed and the nets are routed with a slow but accurate net model.
• Problems occur when a wrong estimate is made during placement, or even in the early steps of routing.

RST and HP
• Rectilinear Steiner Tree (RST)
  – Smallest wire length
  – Slowest computation
• Half-Perimeter Model (HP)
  – Good estimate of RST wire length
  – Faster computation
  – Strictly speaking, HP is not a net topology, so it cannot be used to predict congestion and routability.
• A common approach is to use HP in placement and RST in routing.

Block-Level Design in Reality
• Pins of the blocks lie on layer mB. Routing takes place on two higher metal layers, mB+1 and mB+2.
• Layers mB+1 and mB+2 have preferred routing directions (vertical or horizontal) for better manufacturability.
• Routability problems may occur, especially in and around pin regions.
  – Pins of a block may lie close to each other.
  – Pins of adjacent blocks may lie close to each other.
  – Such routability problems are quite local and are hard to predict even in the global routing step.

The New Routing Scheme
• We want a net topology and a routing scheme that have
  – Better predictability of routability than HP (or even RST), especially in and around pin regions.
  – Faster computation than RST.
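
To make the HP model from the "RST and HP" slide concrete, here is a minimal sketch of the half-perimeter estimate: the width plus the height of the bounding box around a net's pins. The function name and the pin representation as (x, y) tuples are assumptions made for illustration; they do not come from the slides.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # hypothetical pin representation: (x, y)


def half_perimeter(pins: Sequence[Point]) -> float:
    """Half-perimeter of the bounding box enclosing all pins of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    # Width plus height of the bounding box. No topology is implied,
    # which is why HP alone cannot predict congestion or routability.
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


if __name__ == "__main__":
    net = [(0, 0), (4, 1), (2, 5)]
    print(half_perimeter(net))  # bounding box is 4 wide and 5 high -> 9
```

The estimate only needs the extreme coordinates of the pins, which is what makes it cheap enough to call inside a placement loop.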

Spine Topology
• The output pin of a net sits on a vertical wire called the "trunk" (the spine), and all input pins connect to the trunk by horizontal "branches".
  – Given the pin positions, the net shape is fully determined.
  – Pin-to-pin distance is Manhattan.
  – Routability is easy to detect, given all pin positions.

Grids, Columns and Rows
• The routing grids are given cyclic indices 0, 1, 2, …, GR-1, where GR is the grid radix.
• The whole routing space is composed of rows (or columns), each containing grids 0 to GR-1.

Base Pins
[Figure: routing grid with GR=6, grid indices cycling 0 to 5 along both axes. Labels: base input pin (mB), base output pin (mB), obstruction (mB).]

Virtual Pins
[Figure: routing grid with GR=6. Labels: trunk, branch, virtual input pin (mB+1), virtual output pin (mB+2).]

Base/Virtual Pin-Pairs
[Figure: routing grid with GR=6. Labels: virtual output pin, virtual input pin.]

The Fishbone Routing
• Given a placement of the blocks, we know the base pin locations (the columns of the base output pins and the rows of the base input pins).
  – Only one coordinate of each virtual pin is known (Y of the virtual output pin and X of the virtual input pin).
• The trunks are assigned to the columns, and the branches are assigned to the rows.
• The "left-edge" algorithm arranges the trunks within columns and the branches within rows.
• After this step the virtual pin positions (points) are known.
• Overflows in the "left-edge" packing are counted as routing violations.
  – The Fishbone scheme seeks a placement (and thus a routing) with no violations and with some objective function (area and/or delay) minimized.

The Integrated Fishbone P&R
• Simulated-annealing framework.
  – Sequence-pair
  – Base/virtual pins and Fishbone routing
• After a random move (swapping blocks in the sequence pair, or swapping two I/O ports):
  – Evaluate area (sequence-pair)
  – Fishbone routing
  – Evaluate routing violations, wire length or delay

The I/O Ports
[Figure: routing grid with cyclic indices 0 to 5. Labels: virtual input pins, virtual output pin, extended region, pseudo branch.]

Experiment
Example   #Block  #I/O  #Net  #Pin   Grid radix
ami33     33      42    117   522    7
ami49     49      22    407   953    8
playout   62      192   1609  4656   9
ibm100    30      200   3327  9983   7
ibm101    40      300   4340  13021  8
ibm102    50      400   5402  16208  9
ibm103    60      500   6481  19443  9
ibm104    70      600   7621  22864  10

• Compare the areas, wire lengths and run times of:
  – Placement and routing with Fishbone.
  – Placement with RST and Warp Router routing.
  – Placement with HP and Warp Router routing.
  – HP placement, post wire length measurement with RST.
  – Fishbone placement, post wire length measurement with RST.
  – Fishbone placement, Warp Router routing with base pins.
  – Fishbone placement, Warp Router routing with virtual pins.

Experimental Results

Area (mm2)
example    FB     HP     RST
ami33      5.09   5.32   5.10
ami49      59.4   57.2   58.9
playout    349    296    298
ibm100     73.6   65.1   67.2
ibm101     102    83.9   81.0
ibm102     127    104    103
ibm103     151    128    131
ibm104     187    152    156
compare    1.14   1.00   1

Average wire length (mm)
           FB                            HP                     RST
example    pl     p-RST  ro-b   ro-v     pl     p-RST  ro       pl     ro
ami33      0.70   0.76   0.76   0.78     0.89   0.90   0.89     0.85   0.83
ami49      3.22   3.15   3.15   3.18     3.16   3.24   3.24     3.09   3.12
playout    9.14   9.42   9.13   9.39     8.92   9.07   9.13     8.91   8.97
ibm100     7.51   6.94   6.97   6.94     6.40   6.76   6.74     6.91   6.90
ibm101     8.58   7.92   7.93   7.91     7.28   7.68   7.64     7.90   7.87
ibm102     9.21   8.43   8.46   8.43     8.11   8.57   8.55     8.87   8.82
ibm103     10.9   9.92   9.93   9.89     9.01   9.50   9.48     9.57   9.53
ibm104     11.7   10.5   10.5   10.4     10.6   11.2   11.2     10.2   10.2
compare    1.05   1.00   1.00   1.00     0.98   1.02   1.01     1.00   1

#Routing violations
example    FB ro-b  FB ro-v  HP     RST
ami33      2        0        6      7
ami49      0        0        0      0
playout    4        0        4      2
ibm100     6        0        7      6
ibm101     5        0        11     7
ibm102     3        0        13     13
ibm103     15       0        16     16
ibm104     23       0        20     23
compare    0.77     0        1.06   1

pl: placed. p-RST: post-placement RST estimation. ro: Wrouter. ro-b: Wrouter, routing with Fishbone placement but with base pins only. ro-v: Wrouter, routing with Fishbone placement and virtual pins. compare: average relative to the RST flow (RST = 1).

• On average, the Fishbone scheme resulted in a 14% area overhead and a 5% increase in wire length.
  – This is the price paid for 100% routability and for predictability known during placement.
• Fishbone placement with virtual pins specified (FB ro-v) is 100% routable using the Wrouter. It also routes much faster, because there are no violations to be repaired.

Experimental Results

Placement time
example    FB     HP     RST
ami33      1m36   0m30   19m
ami49      5m07   1m05   54m
playout    20m    3m15   1h39
ibm100     53m    8m     4h24
ibm101     72m    9m     6h02
ibm102     1h34   10m    7h07
ibm103     1h56   12m    8h09
ibm104     2h17   13m    9h42
compare    0.19   0.03   1

Routing time
example    FB ro-b  FB ro-v  HP     RST
ami33      1m14     0m02     1m06   1m15
ami49      0m22     0m02     0m22   0m24
playout    1m55     0m13     1m55   2m33
ibm100     2m40     1m32     2m45   2m52
ibm101     3m40     2m26     4m27   4m37
ibm102     6m58     4m18     6m06   7m15
ibm103     9m13     6m26     9m07   10m
ibm104     13m      9m13     15m    13m
compare    0.91     0.39     0.92   1

• The run time of the Fishbone scheme is only the time taken by the simulated annealing phase, which is on average about 80% less than for RST placement. The RST and HP placements need extra time for routing.

Discussion
• Fishbone cannot handle obstructions in the routing layers.
• No 90° rotations of the blocks are allowed.
• Only a vertical spine is supported (the trunk is vertical).
• A pre-defined grid radix GR is needed.
• Extension to a timing-driven version is straightforward.
• Coupling capacitance extraction is easy.
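
The left-edge packing from the Fishbone routing slide can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. Each trunk in one column (or each branch in one row) is modelled as an integer interval (lo, hi), the column offers GR tracks, and an interval that fits on no track is counted as an overflow, that is, a routing violation. The name `left_edge_pack`, the interval representation, and the rule that touching intervals conflict are all hypothetical choices.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # hypothetical: (lo, hi) span of a trunk, lo <= hi


def left_edge_pack(
    intervals: List[Interval], gr: int
) -> Tuple[List[Optional[int]], int]:
    """Greedy left-edge style packing of intervals onto `gr` tracks.

    Returns (assignment, overflows): assignment[i] is the track index of
    interval i, or None if it did not fit; overflows counts such intervals.
    """
    # Process intervals in order of their left (lower) edge.
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    track_end: List[Optional[int]] = [None] * gr  # hi of last interval per track
    assignment: List[Optional[int]] = [None] * len(intervals)
    overflows = 0
    for i in order:
        lo, hi = intervals[i]
        for t in range(gr):
            # Assumption: intervals that merely touch (end == lo) still conflict.
            if track_end[t] is None or track_end[t] < lo:
                track_end[t] = hi
                assignment[i] = t
                break
        else:
            overflows += 1  # no free track among the GR available
    return assignment, overflows


if __name__ == "__main__":
    trunks = [(0, 5), (2, 8), (6, 9), (3, 4)]
    print(left_edge_pack(trunks, gr=2))  # one trunk overflows with 2 tracks
    print(left_edge_pack(trunks, gr=3))  # all trunks fit with 3 tracks
```

In an annealing loop of the kind described above, the overflow counts summed over all columns and rows would feed the routing-violation term of the cost function.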