CS137 Electronic Design Automation Day 9: February 25, 2004 Placement II André DeHon, Michael Wrighton Today Continue with Placement Improving Quality Avoiding local minima Techniques: Simulated Annealing Exhaustive (Branch-and-bound) DeHon/Wrighton CALTECH CS137 Winter2004 Review: Cost Functions Total Wirelength Wirelength2 Critical Path Delay More on this later Close Things Placed Close DeHon/Wrighton CALTECH CS137 Winter2004 First Cut Randomly place netlist Consider moves of nodes Perform those moves which improve some global cost metric When no more moves possible -- terminate DeHon/Wrighton CALTECH CS137 Winter2004 Force-Directed Model netlist as being connected with springs F=kx Perform swaps when F is reduced DeHon/Wrighton CALTECH CS137 Winter2004 Force Directed Swaps DeHon/Wrighton CALTECH CS137 Winter2004 Problem: Greedy Approach Any greedy approach will tend to get wedged in some local minima FM attacked problem by swapping all nodes even if negative gain DeHon/Wrighton CALTECH CS137 Winter2004 Simulated Annealing Physically motivated approach Physical world has similar problems objects/atoms seeking minimum cost arrangement at high temperature (energy) can move around at low temperature, no free energy to move cool quickly, freeze in defects (weak structure) cool slowly, allow to find minimum cost DeHon/Wrighton CALTECH CS137 Winter2004 Key Benefit Avoid Local Minima Allowed to take locally not improving moves in order to avoid being stuck Cost of Solution Local Minimum Better Solution Best Solution Algorithm Progression DeHon/Wrighton CALTECH CS137 Winter2004 Simulated Annealing At high temperature can move around not trapped to only make “improving” moves free energy from temperature allows exploration of nonminimum states avoid being trapped in local minima As temperature lowers less energy to take big, non-minimizing moves more local / greedy moves DeHon/Wrighton CALTECH CS137 Winter2004 Design Optimization Components: “Energy” (Cost) function to minimize Moves represent entire state, drives system forward local rearrangement/transformation of solution Cooling schedule initial temperature temperature steps (sequence) time at each temperature DeHon/Wrighton CALTECH CS137 Winter2004 Basic Algorithm Sketch Pick an initial solution Set temperature (T) to initial value while (T> Tmin) for time at T pick a move at random compute Dcost if less than zero, accept else if RND<e-Dcost/T, accept update T DeHon/Wrighton CALTECH CS137 Winter2004 Details Initial Temperature T0=Davg/ln(Paccept) Cooling schedule fixed ratio: T=lT (e.g. l=0.85) temperature dependent Time at each temperature fixed number of moves? Fixed number of rejected moves? Fixed fraction of rejected moves? DeHon/Wrighton CALTECH CS137 Winter2004 Cost Function Can (Should) be very general Drive entire solution in right direction reward each good move Should be cheap to compute delta costs e.g. FM O(1) ideally DeHon/Wrighton CALTECH CS137 Winter2004 Bad Cost Functions Update cost rerun maze route on every move rerun timing analysis recalculate critical path delay Drive toward solution: size < threshold ? Critical path delay DeHon/Wrighton CALTECH CS137 Winter2004 Cost Functions Total Wire Length Bounding Box Semiperimeter Exact estimate for up to 3 node nets VPR Bounding Box (improves apprx): N nets Cost q(i) bbx (i) bby (i) i 1 q(i) varies from 1 -> 2.79 by the number of nodes on the net DeHon/Wrighton CALTECH CS137 Winter2004 Timing Cost Function TimingCost(i,j) = Dly(i,j) ∙Critclty(i,j)CritcltyExp Maintain all delay’s (i -> j) in a lookup table Recompute Net Criticality @ Every Temperature Balance w/ Wiring Cost Function Marquardt, Betz, & Rose, 2000 DeHon/Wrighton CALTECH CS137 Winter2004 Multiple Cost Functions 800 60 50 600 40 400 30 20 200 10 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Lambda DTimingCost DWiringCost DCost l (1 l ) OldTimingC ost OldWiringC ost DeHon/Wrighton CALTECH CS137 Winter2004 1 Timing Cost “Auto-Normalize” Function is not weighted on ‘absolute’ values Wiring Cost Handling ‘Max’ Style Cost Functions Max(a1, a2, a3, a4…) is difficult to ‘drive’ annealing Instead: Minimize ea1+ea2+ea3+ea4+… DeHon/Wrighton CALTECH CS137 Winter2004 Initial Solution Spectral Placement Random Constructive Placement DeHon/Wrighton CALTECH CS137 Winter2004 Moves Swap two cells swap regions …rows, columns, subtrees rotate cell (when feasible) flip (mirror) cell permute cell inputs (equivalent inputs) DeHon/Wrighton CALTECH CS137 Winter2004 Variant Allow non-legal solutions capture badness in cost function E.g. -- allow cells to overlap Just make sure cost function makes very expensive as cool settle out to legal solutions DeHon/Wrighton CALTECH CS137 Winter2004 Variant: “Rejectionless” Order moves by cost compare FM Pick random number first Use random to define range of move costs will currently accept Pick randomly within this range (never pick a costly move which will reject) DeHon/Wrighton CALTECH CS137 Winter2004 Theory If stay long enough at each cooling stage If cool long enough will achieve tight error bound will find optimum Long Enough If P ≠ NP – will be very long DeHon/Wrighton CALTECH CS137 Winter2004 Practice Good results ultimately, what most commercial tools use... Slow convergence Tricky to pick schedules to accelerate convergence DeHon/Wrighton CALTECH CS137 Winter2004 Black Magic VPR Details Rlimitnew = Rlimitold (1-0.44+α) Maintain moves accepted at 44% Lam and Delosme cN4/3 Moves @ Temp = Temperature Update Schedule Tnew = Told ∙γ α γ α > 0.96 0.5 0.8 < α ≤ 0.96 0.9 0.15 < α ≤ 0.8 0.95 α ≤ 0.15 0.8 Empirical DeHon/Wrighton CALTECH CS137 Winter2004 Big Hammer Costly, but general Works for most all problems Can have hybrid/mixed cost functions as long as weight to single potential With care, can attack multiple levels (part, placement, route, retime, schedule…) place and route Ignores structure of problem resignation to finding/understanding structure DeHon/Wrighton CALTECH CS137 Winter2004 Optimal/Exhaustive More on Long Enough… If you run simulated annealing long enough…. It should converge to optimum If you have enough monkeys typing at keyboards keyboards long enough They’ll eventually produce the works of Shakespeare…. DeHon/Wrighton CALTECH CS137 Winter2004 Brute Force? If you are really going to give up on structure and explore the entire space …there are more efficient ways to do it …and maybe they’re not terrible For small/modest problems As our computers get faster DeHon/Wrighton CALTECH CS137 Winter2004 Optimum Placement Simplest case: Gate/array (FPGA) w/ fixed cell locations N locations M cells Try all permutations of N choose M N!/(N-M)! cases DeHon/Wrighton CALTECH CS137 Winter2004 Improving Prune off symmetry cases Rotate 90, 180, 270 Mirror X, Y, XY Reject provably bad starts (return to in a minute) DeHon/Wrighton CALTECH CS137 Winter2004 Exhaustive Placement More general: Modules have variable size Modules can be rotated/flipped… To explore all cases: For each module For each orientation DeHon/Wrighton CALTECH CS137 Winter2004 Branch-and-Bound Recall from Logic Optimization Consider dense 1D placement Search space of all placements Tree branch is choice of logical cell for physical cell position Keep track of best solution so far as reach leaves If partial solution worse than best solution, prune branch DeHon/Wrighton CALTECH CS137 Winter2004 Pruning: 1D example Reducing channel width Have solution with width=10 When find a partial solution with width>=10 Can abort that branch Reducing Delay Have solution with delay=20 When find a partial solution with delay>=20 Can abort branch DeHon/Wrighton CALTECH CS137 Winter2004 Viable Only on small problems But “small” growing with machine speed Use for end-case in constructive Flatten bottom of hierarchy Maybe even in iteration/overlap relaxation DeHon/Wrighton CALTECH CS137 Winter2004 Caldwell et. al. Results DeHon/Wrighton CALTECH CS137 Winter2004 [TRCAD v19n11p1304-13] Runtime DeHon/Wrighton CALTECH CS137 Winter2004 [TRCAD v19n11p1304-13] Faster? Accounting and gain complexity Make it linear time But do make each update somewhat complex Exhaustive case less bookkeeping Not an atypical result… DeHon/Wrighton CALTECH CS137 Winter2004 Admin Will Meet on Monday Should have feedback from Dehon Reminder: No Support for Non-Standard Tools Jython, Optik, M3, C, Perl, Tcl/Tk, etc. Reading Scheduling DeHon/Wrighton CALTECH CS137 Winter2004 Summary Simulated Annealing General purpose solution use randomness to explore space accept “bad” moves to avoid local minima decrease tolerance over time costly in runtime Small (sub)problems May solve exhaustively Can prune to accelerate… DeHon/Wrighton CALTECH CS137 Winter2004 Big Ideas: Use randomness to explore large (nonconvex) space Simulated Annealing Use dominance to quickly skip over obviously bad solutions Branch and Bound DeHon/Wrighton CALTECH CS137 Winter2004