Chop-SPICE: An Efficient SPICE Simulation Technique For Buffered RC Trees
Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov
Dept. of EECS, University of Michigan
TAU 2011, Myung-Chul Kim, University of Michigan

Fast SPICE Simulation: Motivation
■ IC timing closure, especially at advanced technology nodes, heavily depends on highly accurate timing simulations
  − Increasing impact of PVT variation
  − Rigorous clock skew/slew constraints
■ Circuit size and complexity rapidly increasing
  − Scalable SPICE technique is critical

Key Features of Chop-SPICE
■ Developed as a compromise simulator (fast yet sufficiently accurate) for use by Contango2 software in the ISPD 2010 contest
■ Simple and practical divide-and-conquer approach
■ Can capture PVT variation and spatial correlation
■ Flexible trade-off between runtime and solution quality
■ Adaptability to various SPICE simulators

ISPD10 Clock Tree Synthesis Contest
■ 45nm 2GHz CPU benchmarks from IBM and Intel
■ Objective: Minimize the overall capacitance of the clock network
  − Subject to constraints:
    – Monte-Carlo SPICE simulations with PVT variations
    – Local clock skew < 7.5 ps
    – Slew rate < 100 ps
    – Hard runtime limit per benchmark < 12 hours
■ Low-skew clock trees are especially unforgiving to timing-analysis inaccuracies

Prior Work
[Figure: speed vs. accuracy chart with regions labeled Delay Models (Elmore, D2M, LnD), Simulation (SPICE, AWE), and Ideal Timing Evaluator]
■ Ideal Timing Evaluator
  − Fast runtime without sacrificing accuracy
  − High fidelity, adaptability to various SPICE tools

Chop-SPICE Algorithm
■ Definition: Probing Points
  − Given an RC tree, probing points are defined as
    A. Input nodes of buffers
    B. Sink nodes
  − [symbol missing] = Set of probing points
  − [symbol missing] = Number of fanouts to probing points at node si
■ Example [figure not reproduced]

Chop-SPICE Algorithm
■ Definition: Granularity
  − Maximum Granularity: [formula missing]
  − Minimum Granularity: [formula missing]
  − Granularity Range: [formula missing]
  − Target Granularity: [formula missing]
■ Target Granularity determines the minimum number of probing points to be included in sub-circuits

Chop-SPICE Flow
[Flowchart: RC Tree instance → RC Tree traversal → Target granularity reached? (no: continue traversal; yes: continue) → Sub-circuit generation → Apply input slew stimuli (delay and slew propagation) → Invoke SPICE simulation → Delay and slew update → RC tree exhausted? (no: return to traversal; yes: End)]

Sub-circuit Generation
■ Sub-circuits are always delimited by buffers
  − If a probing point is an input node of buffer(s), all fanout buffers are explicitly included in the current sub-circuit
  − Buffers at the boundary of a sub-circuit may also appear in another sub-circuit
■ Facilitates accurate reconstruction of circuit delay from sub-circuit simulation data
■ Can reduce AC sweep time for sub-circuits

Delay Propagation
■ Purpose: After the probing points' delays are retrieved from SPICE, they can be propagated in order to capture delay for probing points in subsequent sub-circuits.
■ Calculation of delay from the root node s0 to node sj
  − Find the sub-circuit containing sj.
  − Identify the shortest tree path from s0 to sj, and the earliest node si in the sub-circuit that lies on this tree path (assume that the signal delay from s0 to si was computed recursively).
  − The delay from si to sj is obtained by SPICE simulation and added to the delay at si.

Slew Propagation
■ Purpose: After the probing points' slews are retrieved from SPICE, they can be used in order to capture slew for probing points in subsequent sub-circuits.
■ Slew at a given node can be expressed as a function of input slew of a sub-circuit.
  − Slew measured at the previous stage (up to the root node si in a given sub-circuit) should be accounted for when stimuli for the current sub-circuit are generated.
  − Slew at a node is directly calculated by SPICE simulation.

Empirical Results: ISPD10 Benchmarks
■ Experimental setup
  − Single-threaded runs on a 3.2 GHz Intel Core i7 Quad CPU Q660 Linux workstation
  − Buffered RC networks generated by applying Contango2 to the ISPD’10 high-performance CNS contest benchmark suite
  − Open-source NgSPICE-2.2
■ Target granularity
  − Varies from [value missing] (full-scale SPICE simulation) to [value missing] in order to examine trade-offs

Empirical Results: Avg. Error
[Figure not reproduced]

Empirical Results: Max. Error and Trade-off
[Figure not reproduced]

Fidelity
■ Fidelity suggests whether Chop-SPICE is effective as a replacement for full-scale SPICE during optimization
  − On intermediate clock trees produced by Contango2, we use Chop-SPICE and full-scale SPICE to measure sink delays before and after optimization

Future work
■ Extension to general RC networks
  − An algorithm for computing signal delays in non-tree RC networks, which partitions a given circuit into a spanning tree and non-tree links and then invokes an RC-tree computation, is given in [6]
  − A recent study [16] reports 98% correlation to full SPICE runs.
■ Using parallelism
  − Two sub-circuits can be simulated in parallel if they do not lie on the same path to root.
  − The larger the RC tree, the more parallelism can be found.
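
The parallelism condition above, combined with the delay-propagation rule (delay at sj = delay at si + SPICE-measured delay from si to sj), can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the Chop-SPICE implementation: the `SubCircuit` record, the `simulate` stand-in (which returns canned delays instead of invoking SPICE), and the idea of scheduling by depth in the tree of sub-circuits. Sub-circuits at the same depth cannot be ancestors of one another, so they never lie on the same path to the root and may run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SubCircuit:
    """Hypothetical record for one sub-circuit (not the authors' data structure)."""
    name: str
    entry: str                  # earliest node s_i of this sub-circuit on the root path
    parent: Optional[str]       # name of the upstream sub-circuit, None at the root
    local_delay: Dict[str, float] = field(default_factory=dict)  # canned s_i -> s_j delays, ps


def simulate(sub: SubCircuit):
    """Stand-in for one SPICE run; a real flow would invoke the simulator here."""
    return sub, dict(sub.local_delay)


def waves(subs: List[SubCircuit]) -> List[List[SubCircuit]]:
    """Group sub-circuits by depth; members of one wave share no root path."""
    by_name = {s.name: s for s in subs}

    def depth(s: SubCircuit) -> int:
        return 0 if s.parent is None else 1 + depth(by_name[s.parent])

    levels: Dict[int, List[SubCircuit]] = {}
    for s in subs:
        levels.setdefault(depth(s), []).append(s)
    return [levels[d] for d in sorted(levels)]


def propagate(subs: List[SubCircuit], root: str = "s0") -> Dict[str, float]:
    """Accumulate root-to-node delays wave by wave."""
    delay = {root: 0.0}
    with ThreadPoolExecutor() as pool:
        for wave in waves(subs):
            # Upstream waves are finished, so every entry node already has a delay
            # (and, in a real flow, a slew to use as the input stimulus).
            for sub, local in pool.map(simulate, wave):
                for node, d in local.items():
                    delay[node] = delay[sub.entry] + d
    return delay


subs = [
    SubCircuit("A", entry="s0", parent=None, local_delay={"s1": 12.0, "s2": 15.0}),
    SubCircuit("B", entry="s1", parent="A", local_delay={"s3": 7.5}),
    SubCircuit("C", entry="s2", parent="A", local_delay={"s4": 6.0}),
]
delays = propagate(subs)
```

In this toy instance B and C hang off different boundary nodes of A, so they form one wave and are simulated together once A has finished. Scheduling by depth is only one conservative choice; a finer-grained scheduler could start a sub-circuit as soon as its own parent completes.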

Conclusions
■ Accurate estimation of circuit delay is becoming more difficult at new technology nodes
  − Clock-skew estimation in CNS requires picosecond precision
■ Chop-SPICE partitions the original RC tree into sub-circuits, simulates each of them with SPICE, and reconstructs global results from simulation data for sub-circuits
■ Empirical validation shows that Chop-SPICE offers attractive trade-offs between accuracy and runtime
■ Chop-SPICE provides not only good accuracy, but also fidelity sufficient for use in external optimization algorithms
■ Can be applied to any SPICE simulator

Questions and Answers
Thank you! Time for Questions