Power estimation techniques and a new glitch filtering effect modeling based on probability waveform

Fei Hu
Department of Electrical and Computer Engineering, Auburn University, AL 36849
11/11/2004, ELEC6970 Fall 2004

Outline
- Introduction
  - Different levels of power estimation
- Gate-level probabilistic approach
  - Signal probability
  - Transition probability
  - Transition density
  - Probability waveform
- A new glitch filtering method
  - Based on probability waveform
  - The idea and examples
- Preliminary experimental results
- Summary

Introduction
- Power estimation is critical to (low-power) IC design
  - Total power consumption must be estimated during the design phase
  - Helps to find hot spots that may lead to failure
- Levels of power estimation
  - Transistor level
  - Gate level
  - RT level
  - Behavior level
  - Software level
- Two approaches
  - Simulation-based
  - Non-simulative

Simulation-based approach
- Transistor-level simulation
  - Circuit level
    - SPICE
      - Solves a large matrix of node currents using Kirchhoff's Current Law (KCL)
      - Basic components include resistors, capacitors, inductors, current sources and voltage sources
      - Diodes and transistors are modeled by basic components
    - PowerMill
      - Table-based device model
      - Event-driven timing simulation
      - 2-3 orders of magnitude faster than SPICE

Simulation-based approach
- Transistor-level simulation (continued)
  - Switch level
    - Models each transistor as an on-off switch with a resistor
    - Short-circuit power can be accounted for by observing the time during which the switches form a power-ground path
- Gate-level simulation
  - Basic components: logic gates
  - Logic simulation to find switching activity, $P = \frac{1}{2} C V^2 f_{\text{active}}$
  - Monte Carlo simulation, a statistical method
    - Each sample has N random input vectors
    - Energy consumption has a normal distribution
    - Stopping criterion derived from the sample average and sample standard deviation

Simulation-based approach
- RT-level simulation
  - Basic components: registers, adders, multipliers, etc.
  - RT-level simulation collects input statistics of each module
  - Macro-modeling of each component based on simulation
    - Simulate the component with random inputs
    - Fit a multi-variable regression curve (power macro-model equation) using a least-mean-square-error fit
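The macro-model fitting step above can be sketched as follows. This is only an illustration under stated assumptions: the model is taken to be linear in two input statistics, the function name `fit_linear_macromodel` is hypothetical, and the "simulated" power values are generated from made-up coefficients (2.0, 5.0, 3.0) instead of a real component simulation.

```python
import random


def fit_linear_macromodel(samples, powers):
    """Least-squares fit of power ~ c0 + c1*x1 + ... + ck*xk.

    samples: list of tuples of input statistics (one tuple per simulation run)
    powers:  list of simulated power values, same length as samples
    Returns [c0, c1, ..., ck].
    """
    rows = [[1.0] + list(x) for x in samples]  # leading 1.0 models the constant term
    n = len(rows[0])
    # Build the normal equations (A^T A) c = A^T y as an augmented matrix.
    aug = []
    for i in range(n):
        lhs = [sum(r[i] * r[j] for r in rows) for j in range(n)]
        rhs = sum(r[i] * y for r, y in zip(rows, powers))
        aug.append(lhs + [rhs])
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Hypothetical training data: random input statistics for one component
# (say, signal probability and transition density at its inputs) and a
# stand-in "simulated" power generated from an assumed linear relation.
rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(50)]
powers = [2.0 + 5.0 * sp + 3.0 * td for sp, td in samples]

coeffs = fit_linear_macromodel(samples, powers)
print([round(c, 6) for c in coeffs])
```

Because the stand-in data is noise-free, the fit recovers the assumed coefficients up to floating-point error. With real simulation data the residual would indicate how well a linear macro-model describes the component.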

High-level estimation
- Most high-level power prediction uses profiling and simulation techniques to address data dependencies
- Behavior-level estimation
  - Not much RT-level (or lower-level) circuit structure information is available
  - Information-theoretic models
    - Total capacitance estimated based on output entropy
    - Average switching activity for each line, approximated by ½ its entropy
  - Complexity-based models, "equivalent gate"
- Software-level estimation
  - Energy consumption by an application program
  - Instruction-level power macromodel
  - Profile-driven program synthesis, RT-level simulation

Non-simulative approach
- Gate-level probabilistic approach
  - Concepts
    - Signal probability
    - Transition probability
    - Transition density
    - Probability waveform
  - Factors
    - Spatial and temporal correlation
    - Zero delay or real delay (glitch power)
    - With or without glitch filtering

Gate-level probabilistic approach concepts: signal probability
- $P_s(x)$: the fraction of clock cycles in which the steady-state value of signal $x$ is high
- Spatial independence: the logic value of an input node is independent of the logic value of any other input node
- Under the spatial independence assumption, the signal probability for simple gates is:
  - NOT: $c = \bar{a}$, $P_s(c) = 1 - P_s(a)$
  - AND: $c = ab$, $P_s(c) = P_s(a) P_s(b)$
  - OR: $c = a + b$, $P_s(c) = 1 - [1 - P_s(a)][1 - P_s(b)]$

Signal probability with spatial correlation
- Example (figure: gate with inputs a, b and output c): $P_s(a) = 0.5$, $P_s(b) = 0.5$. Is $P_s(c) = 0.5 \times 0.5 = 0.25$? With correlated inputs, $P_s(c) = 0$.
- Signal correlation
  - S. Ercolani, M. Favalli, M. Damiani, P. Olivo, and B. Ricco. Estimate of signal probability in combinational logic networks. In Proceedings of the First European Test Conference, pages 132-138, 1989.
  - $P_s(x_1, x_2) = P_s(x_1) P_s(x_2) W_{x_1,x_2}$
  - Approximate higher-order correlations with pairwise correlations

Signal probability with spatial correlation
- Global OBDD
  - Ordered binary decision diagram corresponding to the global function of a node (the function of the node in terms of the circuit inputs)
  - Gives exact signal correlation
  - Example: function $y = x_1 x_2 + x_3$
    [Figure: OBDD for $y = x_1 x_2 + x_3$ with decision nodes $x_1$, $x_2$, $x_3$ and terminals 0 and 1]
  - $P_s(y) = P_s(x_1) P_s(f_{x_1}) + P_s(\bar{x}_1) P_s(f_{\bar{x}_1})$
  - Traversal from bottom to top to derive the signal probability

Gate-level probabilistic approach concepts: transition probability
- $P_t(x)$: average fraction of clock cycles in which the steady-state value of $x$ is different from its initial value
- Temporal independence: the signal value of a node at clock cycle $i$ is independent of its signal value at clock cycle $i-1$
- Under the temporal independence assumption, the transition probability is $P_t(x) = 2 P_s(x) [1 - P_s(x)]$

Transition probability with spatial and temporal correlations
- R. Marculescu, D. Marculescu, and M. Pedram. Logic level power estimation considering spatiotemporal correlations. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 294-299, Nov. 1994.
- Zero-delay assumption, lag-one Markov chain
  - $P_t(x) \neq 2 P_s(x) [1 - P_s(x)]$
- Transition correlations
  - Used to describe the spatial and temporal correlation between two signals in consecutive clock periods
  - $TC_{xy}(ij, mn) = \frac{P(x_{i \to j}, y_{m \to n})}{P(x_{i \to j}) P(y_{m \to n})}$, with $i, j, m, n \in \{0, 1\}$
- Propagate transition probability from the primary inputs (PI)
  - OBDD-based procedure
  - Global or local OBDD

Gate-level probabilistic approach concepts: transition density
- $D(x)$: average number of transitions a logic signal $x$ makes in a unit time (one clock cycle)
- Boolean difference: if $y$ is a function depending on $x$, then $\frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0}$ and $D(y) = \sum_i P\left(\frac{\partial y}{\partial x_i}\right) D(x_i)$
  - Under the differential delay assumption: no two signals make a transition at the same time
  - Under the spatial independence assumption
- Considers glitch power
- No glitch filtering effect

Transition density
- Example, $c = ab$
  - $P_s(a) = 0.5$, $D(a) = P_t(a) = 2 \times 0.5 \times 0.5 = 0.5$
  - $P_s(b) = 0.5$, $D(b) = P_t(b) = 2 \times 0.5 \times 0.5 = 0.5$
  - $P(\partial c / \partial a) = P(b) = 0.5$, $P(\partial c / \partial b) = P(a) = 0.5$
  - $D(c) = 0.5 \times D(a) + 0.5 \times D(b) = 0.5 \times 0.5 + 0.5 \times 0.5 = 0.5$
- Depending on the delay model, the above result can be true or false

Transition density

  Inputs       Number of transitions on c
  a     b      Zero delay   Unit delay
  00    00     0            0
  00    01     0            0
  00    10     0            0
  00    11     0            0
  01    00     1            1
  01    01     0            0
  01    10     1            1
  01    11     0            0
  10    00     1            1
  10    01     1            1
  10    10     0            2
  10    11     0            0
  11    00     0            0
  11    01     1            1
  11    10     1            1
  11    11     0            0
  Total        6            8

Gate-level probabilistic approach concepts: probability waveform
- F. N. Najm, R. Burch, P. Yang, and I. N. Hajj. CREST - a current estimator for CMOS circuits. In Proceedings of the IEEE International Conference on Computer-Aided Design, pages 204-207, Nov. 1988.
- A sequence of values indicating the probability that a signal is high during certain time intervals, and the probability that it makes a low-to-high transition at specific time points
- Real delay model
- Propagation of a probability waveform deals with the probability of making transitions
- Transition density is the sum of all transition probabilities
- CREST assumes spatial independence

Probability waveform
- Example, $c = ab$
  [Figure: probability waveforms of inputs a and b (steady-state probability 0.5; transition probabilities 0.1 and 0.2 at $t_1$, 0.2 and 0.1 at $t_2$) and of output c (transition probabilities 0.07 and 0.16 at $t_1$, 0.16 and 0.07 at $t_2$)]
- $P_c^{01}(t_1) = P_a^{01}(t_1) P_b^{01}(t_1) + P_a^{01}(t_1) P_b^{11}(t_1) + P_a^{11}(t_1) P_b^{01}(t_1) = 0.1 \times 0.1 + 0.1 \times 0.3 + 0.3 \times 0.1 = 0.07$
- $P_c^{10}(t_1) = P_a^{10}(t_1) P_b^{10}(t_1) + P_a^{10}(t_1) P_b^{11}(t_1) + P_a^{11}(t_1) P_b^{10}(t_1) = 0.2 \times 0.2 + 0.2 \times 0.3 + 0.3 \times 0.2 = 0.16$
- $P_c^{11}(t_1) = P_c^{1}(t_1^-) - P_c^{10}(t_1)$
- $P_c^{1}(t_1^+) = P_c^{01}(t_1) + P_c^{11}(t_1)$

Probability waveform
- Tagged probability waveform
  - Divide the probability waveform into 4 tagged waveforms depending on the steady-state signal values
  - Probability waveforms are for one clock period
  - Use the transition correlation of steady-state signals to approximate the spatial and temporal correlation between two inputs
    - $W_{a,b}^{xy,wz} = \frac{P_{a,b}^{xy,wz}}{P_a^{xy} P_b^{wz}}$
    - The transition correlation can be obtained from zero-delay logic simulation
    - Bit-parallel simulation
  - Glitch filtering effect considered

Tagged probability waveform
- Example of decomposition
  [Figure: a probability waveform (steady-state probability 0.5, transitions at $t_1$ and $t_2$) decomposed into four tagged waveforms with tags 00, 01, 10 and 11]

Tagged probability waveform
- Propagation of waveform
  - Similar to the untagged waveform
  - Two-input gates: 16 combinations of tags; sum up the waveforms with the same resulting tag to obtain 4 output waveforms
  - Example for an AND gate
    - $P_{c,uv}^{01}(t_1) \mathrel{+}= [P_{a,xy}^{01}(t_1) P_{b,wz}^{01}(t_1) + P_{a,xy}^{01}(t_1) P_{b,wz}^{11}(t_1) + P_{a,xy}^{11}(t_1) P_{b,wz}^{01}(t_1)] \cdot W_{a,b}^{xy,wz}$
    - $P_{c,uv}^{10}(t_1) \mathrel{+}= [P_{a,xy}^{10}(t_1) P_{b,wz}^{10}(t_1) + P_{a,xy}^{10}(t_1) P_{b,wz}^{11}(t_1) + P_{a,xy}^{11}(t_1) P_{b,wz}^{10}(t_1)] \cdot W_{a,b}^{xy,wz}$
    - $uv$, $xy$, $wz$ are tags (00, 01, 10, 11)
    - $uv = xy$ AND $wz$ here

Tagged probability waveform
- Glitch filtering scheme
  - If the pulse width is less than the gate inertial delay, the pulse is subject to glitch filtering
  - For time points $t_1$ and $t_2$ with $t_2 - t_1 < d$:
    - $P_{c,uv}^{01}(t_1) \mathrel{-}= P_{a,xy}^{01}(t_1) P_{b,wz}^{10}(t_2) W_{a,b}^{xy,wz}$
    - $P_{c,uv}^{10}(t_2) \mathrel{-}= P_{a,xy}^{01}(t_1) P_{b,wz}^{10}(t_2) W_{a,b}^{xy,wz}$
- Limitations
  - Rough filtering, not an exact description of the pulse
  - Cannot filter a glitch coming from one input

A new glitch filtering scheme
- Why important
  - Glitch power can be a significant portion of total switching power
  - A bad filtering scheme gives errors
- Basic idea: look at the exact condition for a pulse
  [Figure: timing diagram of inputs a, b and output c at time points $t_1$ and $t_2$]
  - P(c has transitions at $t_1$ and $t_2$) = P(a has 0->1 at $t_1$, b has 1->0 at $t_2$)
  - Was the tagged waveform correct?
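As a numerical check of the untagged AND-gate propagation rule from the probability waveform example, here is a minimal sketch assuming spatially independent inputs. The function name `and_gate_step` is hypothetical, and the input transition probabilities at t2 are read from the figure values, so they are an assumption here.

```python
def and_gate_step(p1_a, a01, a10, p1_b, b01, b10):
    """One time point of c = a AND b, assuming spatially independent inputs.

    p1_x : probability that x is 1 just before the time point
    x01  : probability of a 0->1 transition of x at the time point
    x10  : probability of a 1->0 transition of x at the time point
    Returns (c01, c10, p1_c_after).
    """
    a11 = p1_a - a10  # a stays at 1
    b11 = p1_b - b10  # b stays at 1
    c01 = a01 * b01 + a01 * b11 + a11 * b01
    c10 = a10 * b10 + a10 * b11 + a11 * b10
    c11 = p1_a * p1_b - c10  # P_c^1(t-) - P_c^10(t)
    return c01, c10, c01 + c11


# Both inputs use the same waveform: steady-state probability 0.5,
# at t1 P01 = 0.1 and P10 = 0.2, at t2 P01 = 0.2 and P10 = 0.1
# (the t2 values are an assumption taken from the figure).
p1 = 0.5
events = [("t1", 0.1, 0.2), ("t2", 0.2, 0.1)]

results = {}
density_c = 0.0  # unfiltered transition density: sum of all transition probabilities
for name, p01, p10 in events:
    c01, c10, p1_c = and_gate_step(p1, p01, p10, p1, p01, p10)
    results[name] = (c01, c10)
    density_c += c01 + c10
    p1 = p1 - p10 + p01  # probability that each input is 1 after this time point

print(results, round(density_c, 2))
```

At t1 this gives the 0.07 and 0.16 of the example, and the sum over both time points is the transition density of c before any glitch filtering is applied.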

New glitch filtering scheme
- In probability waveform (spatial independence)
  [Figure: probability waveforms of inputs a and b (steady-state probability 0.5; transition probabilities 0.1 and 0.2 at $t_1$, 0.2 and 0.1 at $t_2$) and of output c (0.07 and 0.16 at $t_1$, 0.16 and 0.07 at $t_2$)]
- P(c has a 0->1 transition at $t_1$ and a 1->0 transition at $t_2$) = P{(a,b) at $t_1$ is (01,11) or (11,01) or (01,01), and (a,b) at $t_2$ is (10,11) or (11,10) or (10,10)}

New glitch filtering
- Transitions of (a, b) at $t_1$: (01, 11), (11, 01), (01, 01)
- Transitions of (a, b) at $t_2$: (10, 11), (11, 10), (10, 10)
- Combined over ($t_1$, $t_2$):

  Term   a        b
  1      01,10    11,11
  2      01,11    11,10
  3      01,10    11,10
  4      11,10    01,11
  5      11,11    01,10
  6      11,10    01,10
  7      01,10    01,11
  8      01,11    01,10
  9      01,10    01,10

- $P_c^{01,10}(t_1,t_2)$ is a sum of 9 terms
  - Example term: $P_a^{01,10}(t_1,t_2) P_b^{11,11}(t_1,t_2)$
- This sum $P_c^{01,10}(t_1,t_2)$ is subtracted from $P_c^{01}(t_1)$ and $P_c^{10}(t_2)$
- Similarly, $P_c^{10,01}(t_1,t_2)$ is subtracted from $P_c^{10}(t_1)$ and $P_c^{01}(t_2)$

New glitch filtering
- Keep track of $P_c^{ij,kl}(t_1,t_2)$ for every signal during the waveform propagation, in the form of a correlation coefficient
  - $w_c^{ij,kl}(t_1,t_2) = \frac{P_c^{ij,kl}(t_1,t_2)}{P_c^{ij}(t_1) P_c^{kl}(t_2)}$
- After the filtering
  - If $t_2 - t_1 < d$
    - $P_c^{01,10}(t_1,t_2)$ is set to 0
    - $P_c^{01,11}(t_1,t_2)$ is set to $P_c^{01}(t_1)$
    - …
  - Otherwise
    - $P_c^{ij,kl}(t_1,t_2) = w_c^{ij,kl}(t_1,t_2) \cdot P'^{\,ij}_c(t_1) P'^{\,kl}_c(t_2)$
    - $P'^{\,ij}_c(t_1)$ and $P'^{\,kl}_c(t_2)$ are the transition probabilities at $t_1$ and $t_2$ after filtering

New glitch filtering scheme
- In tagged probability waveform
  - Considers spatial correlation
  - Approximates spatial correlation with the steady-state signal transition correlations $W_{a,b}^{xy,wz}$
  - $P_{c,uv}^{01,10}(t_1,t_2)$ is a sum of sub-sums of 9 terms
    - Each sub-sum corresponds to a pair of input waveforms
    - Example term in a sub-sum: $P_{a,xy}^{01,10}(t_1,t_2) P_{b,wz}^{11,11}(t_1,t_2) \cdot W_{a,b}^{xy,wz}$
  - $P_{c,uv}^{01,10}(t_1,t_2)$ is subtracted from $P_{c,uv}^{01}(t_1)$ and $P_{c,uv}^{10}(t_2)$ if $t_2 - t_1 < d$
  - Similarly, $P_{c,uv}^{10,01}(t_1,t_2)$ is subtracted from $P_{c,uv}^{10}(t_1)$ and $P_{c,uv}^{01}(t_2)$

Preliminary experimental results
- Small circuit with tree structure
  - No spatial correlations
  - Randomly specified delays
  - Input signal probability = 0.5
  - Results by probability waveform compared to logic simulation under random input vectors
  [Figure: tree-structured test circuit, nodes 1-33 with per-gate delays]

Preliminary results: tree circuit

  Node  Time interval  Logic simulator  Prob wave + new  Error   Tag       Error   Tag + new  Error
  17    (2,2)          0.45701          0.4608           0.83%   0.460576  0.78%   0.460576   0.78%
  18    (4,4)          0.45991          0.4608           0.19%   0.460277  0.08%   0.460277   0.08%
  19    (2,2)          0.46043          0.4608           0.08%   0.46039   0.01%   0.46039    0.01%
  20    (5,5)          0.46193          0.4608           0.24%   0.460373  0.34%   0.460373   0.34%
  21    (2,2)          0.46222          0.4608           0.31%   0.460688  0.33%   0.460688   0.33%
  22    (6,6)          0.46184          0.4608           0.23%   0.460396  0.31%   0.460396   0.31%
  23    (2,2)          0.46104          0.4608           0.05%   0.461046  0.00%   0.461046   0.00%
  24    (3,3)          0.45848          0.4608           0.51%   0.460052  0.34%   0.460052   0.34%
  25    (3,5)          0.58581          0.589824         0.69%   0.589791  0.68%   0.589791   0.68%
  26    (5,8)          0.48563          0.483656         0.41%   0.483775  0.38%   0.483775   0.38%
  27    (4,8)          0.59225          0.589824         0.41%   0.589889  0.40%   0.589889   0.40%
  28    (4,5)          0.48332          0.483656         0.07%   0.483814  0.10%   0.483814   0.10%
  29    (4,9)          0.60493          0.605951         0.17%   0.605226  0.05%   0.605226   0.05%
  30    (7,11)         0.32617          0.324893         0.39%   0.324305  0.57%   0.324305   0.57%
  31    (7,12)         0.49869          0.481971         3.35%   0.605226  21.36%  0.481551   3.44%
  32    (8,12)         0.32617          0.324893         0.39%   0.324305  0.57%   0.324305   0.57%
  33    (10,15)        0.46703          0.457838         1.97%   0.546224  16.96%  0.456386   2.28%
  Total                8.05286          8.0289           0.30%   8.23635   2.28%   8.02284    0.37%
  Std dev                                                0.84%             6.31%              0.89%
  Average                                                0.60%             2.55%              0.63%

Preliminary results: reconvergent fanout
- Small circuit with reconvergent fanout
  - Introduces spatial correlations
  - Randomly specified delays
  - Input signal probability = 0.5
  - Results by probability waveform compared to logic simulation under random input vectors
  [Figure: test circuit with reconvergent fanout, nodes 1-36 with per-gate delays]

Preliminary results: reconvergent fanout

  Node  Time interval  Logic simulator  Prob wave + new  Error   Tag       Error   Tag + new  Error
  25    (3,5)          0.58581          0.589824         0.69%   0.589791  0.7%    0.589791   0.7%
  26    (5,8)          0.48563          0.483656         0.41%   0.483775  0.4%    0.483775   0.4%
  27    (4,8)          0.59225          0.589824         0.41%   0.589889  0.4%    0.589889   0.4%
  28    (4,5)          0.48332          0.483656         0.07%   0.483814  0.1%    0.483814   0.1%
  29    (4,9)          0.60493          0.605951         0.17%   0.605226  0.0%    0.605226   0.0%
  30    (5,9)          0.49029          0.490675         0.08%   0.49081   0.1%    0.49081    0.1%
  31    (7,11)         0.32617          0.324893         0.39%   0.324305  0.6%    0.324305   0.6%
  32    (7,12)         0.49869          0.481971         3.35%   0.605226  21.4%   0.481551   3.4%
  33    (8,12)         0.32617          0.324893         0.39%   0.324305  0.6%    0.324305   0.6%
  34    (8,15)         0.29495          0.196084         33.52%  0.284186  3.6%    0.27001    8.5%
  35    (6,13)         0.47879          0.464104         3.07%   0.451358  5.7%    0.451358   5.7%
  36    (8,17)         0.47557          0.549163         15.47%  0.535572  12.6%   0.506832   6.6%
  Total (2,18)         9.32543          9.27109          0.58%   9.45205   1.4%    9.28546    0.4%
  Std dev                                                7.96%             5.37%              2.50%
  Average                                                3.02%             2.42%              1.46%

Preliminary results: benchmark circuits
- ISCAS'85 benchmark circuits
- Input signal probability 0.5
- Logic simulation with 40,000 random vectors
- Error statistics
  - Low-activity nodes show large percentage errors, so errors for nodes with activity less than 0.1 are excluded
  - Total energy is estimated from transition density and load capacitance.
    $C_L$ is proportional to the fanout of a gate.

  Circuit  Error            Prob wave + new  Tag      Tag + new
  c880     Avg node error   8.16%            6.10%    5.32%
           Std dev          9.31%            9.64%    8.11%
           Total energy     7.26%            1.64%    5.93%
  c1355    Avg node error   15.16%           22.99%   4.65%
           Std dev          15.95%           25.71%   6.47%
           Total energy     18.33%           32.85%   5.43%
  c1908    Avg node error   14.34%           10.66%   11.49%
           Std dev          20.01%           19.42%   19.29%
           Total energy     19.66%           4.09%    11.19%
  c2670    Avg node error   15.22%           12.65%   11.98%
           Std dev          16.42%           15.98%   14.15%
           Total energy     15.03%           7.24%    9.93%
  c3540    Avg node error   14.08%           16.29%   6.60%
           Std dev          19.24%           26.08%   11.31%
           Total energy     10.08%           9.78%    2.42%
  c5315    Avg node error   13.06%           8.14%    7.72%
           Std dev          14.87%           10.26%   10.62%
           Total energy     17.24%           2.29%    10.07%
  c6288    Avg node error   23.67%           28.62%   12.76%
           Std dev          13.52%           22.94%   11.11%
           Total energy     26.37%           32.12%   4.05%
  c7552    Avg node error   17.62%           12.40%   11.42%
           Std dev          25.76%           20.25%   18.61%
           Total energy     16.39%           3.17%    7.78%

Preliminary results: benchmark circuits
- Observations
  - A single probability waveform does not perform better than "Tag + new" for the benchmark circuits
  - New glitch filtering improves the node error standard deviation: errors are more evenly distributed
  - New glitch filtering improves overall and node estimation accuracy in some cases
  - In some other cases, it tends to give a worse total energy estimate
    - Filtering is based on the tagged probability waveform
    - The waveform before filtering is underestimated, so subtracting the correct amount for the filtering effect leads to an underestimated overall value
    - "Tag" tends to underestimate the amount subtracted by the filtering effect
- Estimation speed
  - "Prob + new": 10~30 times speed-up compared to logic simulation
  - "Tag": 200 times speed-up
  - "Tag + new": 1~5 times speed-up

Future work
- Improve the estimation accuracy by improving the estimation of the waveform before filtering
  - Spatial correlations need more consideration
  - Handle the special case in which the spatial correlation between two signals is poorly approximated by the steady-state transition correlation
- Improve the estimation speed
  - The new glitch filtering method takes too much time because of its propagation of $P_c^{ij,kl}(t_1,t_2)$
  - Only 1~5 times speed-up over logic simulation
  - Software optimization has to be done

Summary
- Introduction to different levels of power estimation
- Gate-level probabilistic approach
  - Signal probability
  - Transition probability
  - Transition density
  - Probability waveform
- A new glitch filtering method
  - Based on probability waveform
  - A more accurate glitch filtering
- Preliminary experimental results show the potential of the new method
- Problems remain to be tackled
- Questions?

Thank you!
For questions and comments, please contact me at hufei01@auburn.edu