Simulation Based Power Estimation For Digital CMOS Technologies
Master's Thesis Defense
Jins Davis Alexander
Thesis Advisor: Dr. Vishwani D. Agrawal
Thesis Committee: Dr. Victor P. Nelson and Dr. Adit Singh
Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA
Sept 3, 2008, MS Thesis Defense

Outline
• Motivation and Problem Statement
• Background
• Contributions:
  1. PowerSim: A Generalized Power Analysis Tool
  2. A New Dynamic Power Analysis Algorithm
     • Bounded Delays and Ambiguity Intervals
     • Maximum Transitions
     • Minimum Transitions
     • Simulation and Power Estimation
• Experimental Results and Observations
• Conclusion

Motivation
• Dynamic power increases with glitch transitions, which in turn are a function of gate delays.
• Process variation can influence delays in a circuit, especially in nanoscale technologies.
• Monte Carlo simulation, used to address the variation, is time consuming and CPU intensive.
• Bounded delay models are usually considered to address process variations in logic level simulation and timing analysis (Grimes, MS Thesis, August 2008).

Problem Statement
• Given a set of vectors (random or selected functional set):
  1. Analyze a digital circuit for various power components (dynamic: logic and glitch; short circuit; leakage; clock; flip-flop) in a nominal delay circuit.
  2. Determine the range of dynamic power consumption for specified bounds on delay variations.

C880: Monte Carlo vs. New Method
[Figure: histogram of frequency vs. power (mW) for C880; 1000 random vectors, 1000 sample circuits]

Method                        Min Power (mW)   Max Power (mW)   CPU Time (secs)
Monte Carlo Simulation        1.42             11.59            262.7
New Bounded Delay Algorithm   1.35             11.89            0.3

Background
• Bounded delays model delay uncertainties by assigning each gate lower and upper bounds on delay, also known as min-max delays.
• The bounds can be obtained by adding a specified process-related variation to the nominal gate delay for the technology.
• In this model, regions of signal uncertainty are defined at the output of each gate node.

Specifying Ambiguity Delay Intervals
[Figure: signal waveforms annotated with IV, FV, EA and LS; labels include EAsv = -∞, EAdv, LSsv = ∞, LSdv and EAdv = -∞, EAsv, LSdv = ∞, LSsv]
• EA is the earliest arrival time
• LS is the latest stabilization time
• IV is the initial signal value
• FV is the final signal value

Propagating Ambiguity Intervals through Gates
The ambiguity interval (EA, LS) for a gate output is determined from the ambiguity intervals of the input signals, their pre-transition and post-transition steady-state values, and the min-max gate delays (mindel, maxdel).

Representative Formulae
• To evaluate the output of a gate, we analyze inputs i:
[Equations]

Formulae…
• and, where the inertial delay of the gate is bounded as (mindel, maxdel).

Finding Number of Transitions
[Figure: example gate with delay bounds (mindel, maxdel); signals annotated with EA and LS times and with [mintran, maxtran] ranges, e.g. [0,2] and [0,4]]
Here mintran is the minimum number of transitions and maxtran is the maximum number of transitions.

Estimating maxtran
• First upper bound: We calculate the maximum number of transitions (Nd) that can be accommodated in the ambiguity interval given by the gate delay bounds and the (IV, FV) output values.
• Second upper bound: We take the sum of the input transitions (N), as the number of output transitions cannot exceed this.
  We modify this by:
      N = N − k    (1)
  where k = 0, 1, or 2 for a 2-input gate and is determined by the ambiguity regions and (IV, FV) values of the inputs.
• The maximum number of transitions is the lower of the two upper bounds:
      maxtran = min(Nd, N)    (2)

Examples of maxtran (k = 0)
• Nd = ∞, N = 8: maxtran = min(Nd, N) = 8
• Nd = 6, N = 8: maxtran = min(Nd, N) = 6

Example: maxtran With Non-Zero k
[Figure: 2-input gate; one input with n1 = 6 transitions and the other with n2 = 4 transitions, with ambiguity bounds labeled EAsv = -∞, EAdv, LSdv = ∞, LSsv; output bounded by n1 + n2 − k = 6 + 4 − 2 = 8, where k = 2]

Estimating mintran
• First lower bound (Ns): Based on steady-state values, i.e., 00 and 11 count as no transition and 01 and 10 count as a single transition.
• Second lower bound (Ndet): The minimum number of transitions that can occur in the output ambiguity region is the number of deterministic signal changes that occur within the ambiguity region and are spaced at time intervals greater than or equal to the inertial delay of the gate.
• The minimum number of transitions is the higher of the two lower bounds:
      mintran = max(Ns, Ndet)    (3)

Example: mintran
[Figure: gate with delay bounds (mindel, maxdel) and inertial delay d; inputs labeled EAsv = -∞, EAdv, LSsv = ∞, LSdv and EAdv = -∞, EAsv, LSdv = ∞, LSsv; output labeled EA, LS]
• There will always be a hazard at the output as long as (EAsv − LSdv) ≥ maxdel.
• Thus, in this case, mintran is not 0 as the steady-state condition suggests, but 2.

Multiple Ambiguity Intervals
• Multiple ambiguity regions, separated by regions of deterministic signal states, may arise at the output.
• We arrange the (EA, LS) values in order of their temporal occurrence.
• If an LS value occurs before an EA value, then multiple ambiguity regions exist.
• We propagate them to the output on the condition that any two consecutive bound values are separated by at least the gate inertial delay.
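
The bound combinations in equations (1) to (3) and the region test above can be sketched in Python. This is an illustrative sketch only, not the thesis implementation: the function names are hypothetical, and the merging rule in `split_regions` (two intervals stay separate only when the next EA follows the previous LS by at least the inertial delay) is one possible reading of the last bullet, assumed here for illustration. Nd, k, Ns and Ndet are taken as given inputs, since their formulae are not reproduced in this text.

```python
import math


def maxtran(nd, n, k=0):
    """Upper bound on output transitions: min(Nd, N - k), equations (1)-(2)."""
    return min(nd, n - k)


def mintran(ns, ndet):
    """Lower bound on output transitions: max(Ns, Ndet), equation (3)."""
    return max(ns, ndet)


def split_regions(intervals, inertial_delay):
    """Group (EA, LS) intervals into separate ambiguity regions.

    Assumption for this sketch: a new region starts only when the next EA
    follows the current LS by at least the gate inertial delay; otherwise
    the two intervals are merged into a single region.
    """
    regions = []
    for ea, ls in sorted(intervals):  # temporal order of EA values
        if regions and ea - regions[-1][1] < inertial_delay:
            # Too close to the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], ls)
        else:
            regions.append([ea, ls])
    return [tuple(r) for r in regions]


# Values from the slide examples; Nd is assumed unbounded where not given.
print(maxtran(math.inf, 8))            # Nd = inf, N = 8, k = 0
print(maxtran(6, 8))                   # Nd = 6,   N = 8, k = 0
print(maxtran(math.inf, 6 + 4, k=2))   # n1 = 6, n2 = 4, k = 2
print(mintran(0, 2))                   # steady state gives 0, hazard gives 2
# Hypothetical interval values, chosen only to exercise the merge rule.
print(split_regions([(3, 5), (14, 17), (6, 7)], inertial_delay=2))
```

With the hypothetical intervals in the last call, (3, 5) and (6, 7) are closer than the inertial delay and merge into one region, while (14, 17) remains a separate region.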

Example
[Figure: two cases, (a) and (b), of propagating the input ambiguity intervals (EA1, LS1), (EA2, LS2) and (EA3, LS3) through a gate with delay bounds (d1, d2); output bounds shown include EA1 + d1, EA3 + d1, LS2 + d2 and LS3 + d2]

Simulation Methodology
• maxdel, mindel = nominal delay ± Δ%
• Three linear-time passes for each input vector:
  First pass: zero-delay simulation to determine the initial and final values, IV and FV, for all signals.
  Second pass: determines the earliest arrival (EA) and latest stabilization (LS) times from the IV and FV values and the bounded gate delays.
  Third pass: determines the upper and lower bounds, maxtran and mintran, for all gates from the above information.

Effect of Gate Delay Distribution
• An experiment was conducted to see whether the distribution of gate delays has an effect on the power distribution.
• For the uniform distribution: Gate delays were randomly sampled from a uniform distribution on [a, b], where a = nominal delay − Δ% and b = nominal delay + Δ%. This distribution has variance σ² = (1/12)(b − a)² = Δ²(nom. delay)²/30,000.
• For the normal distribution: Gate delays were randomly sampled from a Gaussian density with mean = nom. delay and variance σ² as above.

Experimental Setup
• A standard gate node delay of 100 ps was taken. A wire load delay model was followed, with each nominal gate delay being a function of its fan-out.
• The power distribution is for 1000 random vectors with a vector period of 10000 ps.
• For each vector pair, 1000 sample circuits were simulated.
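
The matched-variance sampling described under Effect of Gate Delay Distribution can be sketched as follows. This is a minimal sketch rather than the code used in the experiments: the function names, the fixed seed and the sample count are assumptions, and the values 100 ps and 20% are used only as example inputs. It checks numerically that (b − a)²/12 equals Δ²(nom. delay)²/30,000 and that both sampled populations have roughly that variance.

```python
import random
import statistics


def delay_bounds(nominal, delta_pct):
    """Return (mindel, maxdel) = nominal delay -/+ delta percent."""
    spread = nominal * delta_pct / 100.0
    return nominal - spread, nominal + spread


def matched_sigma(nominal, delta_pct):
    """Standard deviation of the uniform distribution on [a, b].

    sigma^2 = (b - a)^2 / 12 = delta^2 * nominal^2 / 30000
    """
    a, b = delay_bounds(nominal, delta_pct)
    return (b - a) / 12 ** 0.5


def sample_uniform(nominal, delta_pct, rng):
    """Draw one gate delay from the uniform distribution on [a, b]."""
    a, b = delay_bounds(nominal, delta_pct)
    return rng.uniform(a, b)


def sample_normal(nominal, delta_pct, rng):
    """Draw one gate delay from a Gaussian with the same mean and variance."""
    # Gaussian samples are unbounded; no truncation is applied in this sketch.
    return rng.gauss(nominal, matched_sigma(nominal, delta_pct))


rng = random.Random(0)          # fixed seed, chosen arbitrarily
nominal, delta = 100.0, 20.0    # example: 100 ps nominal delay, 20% variation
uniform = [sample_uniform(nominal, delta, rng) for _ in range(10000)]
normal = [sample_normal(nominal, delta, rng) for _ in range(10000)]

print(delay_bounds(nominal, delta))
print(matched_sigma(nominal, delta) ** 2, delta ** 2 * nominal ** 2 / 30000)
print(statistics.pvariance(uniform), statistics.pvariance(normal))
```

Note that the Gaussian samples can fall outside [a, b], whereas the uniform samples cannot; whether and how the original experiment bounded them is not stated in the slides.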

[Figure: histogram of frequency vs. power (mW), uniform distribution of gate delays]
[Figure: histogram of frequency vs. power (mW), normal distribution of gate delays]

Experimental Result (Maximum Power)
• Monte Carlo simulation vs. min-max analysis for circuit C880. For each vector pair (100 random vectors), 100 sample circuits with ±20% variation were simulated.
R² is the coefficient of determination; it equals 1.0 for an ideal fit.

Result…(Minimum Power)
R² is the coefficient of determination; it equals 1.0 for an ideal fit.

Results…(Average Power)
[Figure: Monte Carlo average power (mW) vs. min-max mean power (mW); R² = 0.9527]
R² is the coefficient of determination; it equals 1.0 for an ideal fit.

Effect of Inertial Delay
• Transition statistics for high-activity gate 1407 in c2670 for a random vector pair. Histograms were obtained from Monte Carlo simulations of 100 sample circuits.
[Figure: histograms of frequency vs. number of transitions for min-max delays (7 ps, 12 ps) and (1 ps, 3 ps); annotations: mintran = 0 in both panels, maxtran = 10 and maxtran = 8]

Effect of Inertial Delay…
[Figure: histograms of frequency vs. number of transitions for min-max delays (8 ps, 24 ps) and (11 ps, 33 ps); annotations: mintran = 0 in both panels, maxtran = 6 and maxtran = 4]

Power Estimation Result
• Circuits were implemented using the TSMC025 2.5 V CMOS library, with a standard-size gate delay of 10 ps and a vector period of 1000 ps.
  Min-max values were obtained by assuming ±20% variation. The simulation was run on a UNIX operating system using an Intel Duo Core processor with 2 GB RAM.

PowerSim
• A gate-level power analysis tool to efficiently and accurately estimate and separate the different power components.
• Libraries are characterized by SPICE for leakage power (all input states), node capacitance, temperature, supply voltage and other technology constraints.
• The tool performs an event-driven simulation of the input vectors and estimates power from the dynamically created libraries.
• It provides information on the effect of the different power components with respect to the circuit and technology used.

PowerSim Results on Benchmarks
• Average power dissipation of ISCAS85 benchmark circuits for 1000 random vectors, vector period 100 ns, 0.25 micron CMOS technology, supply voltage 2.5 V.

Histogram of c880 Leakage Power
• Histogram of leakage power for circuit c880 in 90 nm technology for 1000 random vectors.
[Figure: number of vectors vs. leakage power in microwatts, incremented in steps of 0.1 microwatts, from 4.206 to 5.254]

Short Circuit Power Against Sizing
• The average short circuit power dissipation for 6 inverters (the last inverter being used as a load) with constant size, increasing size and decreasing size. An input signal of 0 → 1 → 0 was applied to the first inverter.
[Figure: average short circuit power dissipation in microwatts (logarithmic scale) for inverters 1 to 5 under constant sizing, increasing sizing and decreasing sizing]

Conclusion
• We have used the bounded delay model to successfully develop a power estimation method that accounts for uncertainties in delays.
• The method has linear time complexity in the number of gates and is an efficient alternative to Monte Carlo analysis.
• Future work includes considering process-dependent variation in leakage as well as in node capacitances.
• PowerSim is a useful gate-level power analysis tool for VLSI designers.

Thank You.