Capo: Robust and Scalable Open-Source Min-cut Floorplacer
http://vlsicad.eecs.umich.edu/BK/PDtools/tar.gz/LATEST/
Jarrod A. Roy, David A. Papa, Saurabh N. Adya, Hayward H. Chan, James F. Lu, Aaron N. Ng, Igor L. Markov
EECS Department, University of Michigan, Ann Arbor
Credits for original Capo: Caldwell, Kahng & Markov

Original Motivation (ca. 2000)
• Co-developed at UCLA with Andy Caldwell under the guidance of Andrew Kahng
  – First fixed-die placer in the literature
  – First academic placer using multi-level partitioning (MLPart)
  – First academic placer in the US to compete head-on with commercial placers on large industrial netlists
  – All code written from scratch
• Capo - an experiment in min-cut placement
• DAC 2000: “Can Recursive Bisection Alone Produce Routable Placements?”
  – Message: minimizing HPWL is not enough

Min-cut Bisection: Basic Components & Techniques
• Overall runtime O(P (log P)^2)
• Three min-cut partitioners
  – MLPart and FMPart (heuristic) and BBPart (optimal): ASPDAC 2000, JEA 2000, ISPD '98/TCAD '01
  – Capo makes several different calls to MLPart every time (some are Vcycling)
• Shifting cut-lines (not a grid): ASIC '98
• Optimal end-case placers (B&B): ISPD '98
  – Also used in detail placement (“RowIroning”)
• Uniform whitespace allocation: TCAD '03
• Non-uniform whitespace allocation: ICCAD '03
• Feedback / cycling (Kahng & Reda, DAC '04)

Capo’s Distinctive Features
(can be found in some industry tools, but rarely in academic placers)
• Global placement with Capo often produces legal placements
  – No cell-shifting / legalization is necessary
  – Top-down estimates of cell locations are very accurate
  – This seems to improve “generic routability”
  – Ensured robust handling of obstacles since 2000
  [Callout: very low interconnect and via counts]
• Can make any netlist routable with a more generous floorplan (greater whitespace)
  – Request uniform whitespace distribution!
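The slides above name two objectives without defining them: net cut (the quantity a min-cut partitioner minimizes) and HPWL (half-perimeter wirelength). As a minimal sketch, assuming a netlist stored as lists of cell names and a placement stored as a name-to-(x, y) dictionary with pins at cell centers, they can be computed as follows. The data layout and the function names `hpwl`, `cut_size`, and `sides_for_vertical_cutline` are illustrative assumptions and are not taken from Capo's code.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Net = List[str]  # a net is the list of cell names it connects (assumed layout)


def hpwl(nets: List[Net], pos: Dict[str, Point]) -> float:
    """Sum, over all nets, of the half-perimeter of the net's bounding box."""
    total = 0.0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def cut_size(nets: List[Net], side: Dict[str, int]) -> int:
    """Number of nets that have cells on both sides of a bisection."""
    return sum(1 for net in nets if len({side[cell] for cell in net}) > 1)


def sides_for_vertical_cutline(pos: Dict[str, Point], x_cut: float) -> Dict[str, int]:
    """Assign side 0 to cells left of the cut-line and side 1 to the rest."""
    return {cell: (0 if x < x_cut else 1) for cell, (x, _y) in pos.items()}


if __name__ == "__main__":
    # Hypothetical four-cell example, not data from the slides.
    pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.0)}
    nets = [["a", "b"], ["b", "c", "d"]]
    print("HPWL:", hpwl(nets, pos))
    print("cut at x=0.5:", cut_size(nets, sides_for_vertical_cutline(pos, 0.5)))
```

The two quantities are related but not identical: a bisection with a small cut tends to keep connected cells together, while HPWL is measured only once the cells have coordinates.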
The Integration of Block-packing (using Parquet): ICCAD '04

Advanced Details
• Weighted terminal propagation
  – Described in a TR by Karypis & Selvakkumaran
  – More accurate capture of HPWL in min-cut partitioning
  – Nets with terminals too close to the cut-line have cost <1
  – Some original nets are modeled by two nets for min-cut
• To improve terminal propagation, use an SOR-based quadratic placer after every round of partitioning
  – Also tried ACG (Alpert, Nam & Villarrubia; ICCAD '03)
• Look-ahead criteria for Parquet were sharpened

Recent Improvements
• Whitespace allocation rewritten entirely
  – New params: -minLocalWS and -safeWS
• Cutline positioning relative to obstacles
  – “Feature locations” are corners of fixed macros
  – Among those, we minimize cutlength
  – Cutline direction is selected based on cutlength
• MLPart & FMPart are 2x faster (loop unrolling)
  – Also, bugfix in Vcycling
  – Also, variable-effort partitioning in Capo
• Stronger, much faster RowIroning
• Stronger macro legalizer, more scalable cell legalization
• Meta-options: -faster and -tryHarder

Large-scale Visualization
• Example: different whitespace allocation modes
• Plots of large designs with data compression
  – Less than 1 bit per cell
  – The plotter is in the GSRC bookshelf under PlaceUtils

Capo Memory & Runtime Data with -faster (Opteron @ 2.8GHz, Linux)
[Plot: memory usage (MB) vs. pin count, up to 1.00E+07 pins, in 32-bit and 64-bit modes; annotated runtimes: 17 mins and 7h40m]

Example of Self-profiling in Capo
Adaptec 1 (250K)
  MLPart took: 396.06sec (51.58%)
  FMPart took: 38.29sec (4.99%)
  BBPart took: 20.13sec (2.62%)
  SmPlace took: 139.14sec (18.12%)
  ProblemSetup took: 151.01sec (19.67%)
  SmPlProbSetup took: 4.85sec (0.63%)
  Level Stats took: 10.79sec (1.41%)
  Total runtime of measured components: 760.43sec (99.03%)
Big Blue 4 (2.1M)
  MLPart took: 9360.9sec (39.25%)
  FMPart took: 661.4sec (2.77%)
  BBPart took: 255.3sec (1.07%)
  SmPlace took: 1727.6sec (7.24%)
  ProblemSetup took: 11365.5sec (47.66%)
  SmPlProbSetup took: 133.5sec (0.56%)
  Level Stats took: 151.2sec (0.63%)
  Total runtime of measured components: 23657.1sec (99.2%)
RowIroning takes an additional 10-15% of runtime

Infrastructure Available in GSRC Bookshelf (April 2005)
• Source code & binaries
  – Converters: Bookshelf to/from LEFDEF
  – Gnuplotter with data compression
  – Macro & cell legalizer
  – Improved stand-alone RowIroning
  – Complete Capo 9.1 (with MLPart) for Linux (32/64), Solaris (32/64) and Windows
  http://vlsicad.eecs.umich.edu/BK/PDtools/
• Compatible with OpenAccess
• Ongoing work on further improvements
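
The self-profiling report shown earlier lends itself to simple post-processing. The sketch below parses lines of the form `<component> took: <seconds>sec (<percent>%)` and ranks components by runtime. It assumes the report prints one component per line (the original line breaks are not recoverable from the slide) and tolerates a missing colon after "took". `parse_profile` is a hypothetical helper written for this illustration and is not part of Capo.

```python
import re
from typing import Dict, Tuple

# Matches e.g. "MLPart took: 396.06sec (51.58%)"; the colon is optional.
_LINE = re.compile(r"^(.+?) took:?\s*([0-9.]+)\s*sec\s*\(([0-9.]+)%\)$")


def parse_profile(text: str) -> Dict[str, Tuple[float, float]]:
    """Map each component name to (seconds, percent of total runtime)."""
    result: Dict[str, Tuple[float, float]] = {}
    for raw in text.splitlines():
        match = _LINE.match(raw.strip())
        if match:  # lines without "took" (e.g. the summary line) are skipped
            name, seconds, percent = match.groups()
            result[name] = (float(seconds), float(percent))
    return result


# Adaptec 1 figures copied from the slide; one entry per line is an assumption.
ADAPTEC1 = """\
MLPart took: 396.06sec (51.58%)
FMPart took: 38.29sec (4.99%)
BBPart took: 20.13sec (2.62%)
SmPlace took: 139.14sec (18.12%)
ProblemSetup took: 151.01sec (19.67%)
SmPlProbSetup took: 4.85sec (0.63%)
Level Stats took: 10.79sec (1.41%)
Total runtime of measured components: 760.43sec (99.03%)
"""

if __name__ == "__main__":
    profile = parse_profile(ADAPTEC1)
    # Print components from slowest to fastest.
    for name, (sec, pct) in sorted(profile.items(), key=lambda kv: -kv[1][0]):
        print(f"{name:14s} {sec:9.2f} s  {pct:6.2f} %")
```

Sorting by seconds makes the dominant components easy to spot: MLPart on Adaptec 1, ProblemSetup on Big Blue 4.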