Efficient Decoupling Capacitance Budgeting Considering Operation and Process Variations
Yiyu Shi*, Jinjun Xiong+, Chunchen Liu* and Lei He*
*Electrical Engineering Department, UCLA
+IBM T. J. Watson Research Center, Yorktown Heights, NY
This work is partially supported by an NSF CAREER award and a UC MICRO grant sponsored by Altera, RIO and Intel.

Motivation
- Continued semiconductor technology scaling leads to growing process variations, and statistical optimization has been actively researched to cope with them.
  - Stochastic gate sizing for power reduction [Bhardwaj:DAC’05, Mani:DAC’05]
  - Stochastic gate sizing for yield optimization [Davoodi:DAC’06, Sinha:ICCAD’05]
  - Stochastic buffer insertion to minimize delay [He:TCAD’07]
  - Adaptive body biasing with post-silicon tuning [Main:ICCAD’06]
- However, all of this work ignores operation variation, such as
  - crosstalk difference over input vectors
  - power supply noise fluctuation over time
  - processor temperature variation over workload
- A better design can be achieved by considering both operation and process variations.
- As a vehicle to demonstrate this point, we study the on-chip decoupling capacitance insertion and sizing (decap budgeting) problem, taking into account operation and process variations.

Decap Budgeting Overview
- Nodes far from the Vdd pin may suffer from supply noise due to a sudden burst of activity.
- Decap supplies the surplus current demand from locally stored charge.
- Side effects of adding too much decap
  - Increased leakage
  - Increased die area
  - Risk of yield loss
- Location matters
  - The closer the decap is to the turbulent point, the more noise reduction can be achieved.
  - Given the amount of decap to be inserted, find the optimal location so that the noise is suppressed to the maximum extent.
[Figure: power supply node with noise voltage Vn, intrinsic cap, decap, and load current]
- We define the noise as the integral over time of the area of the voltage waveform below the threshold U.
[Figure: voltage waveform dipping below U between t0 and t1]

Decap Budgeting Problem Formulation
- Objective
  - Find the distribution and location of the white space so that the noise on the power network is minimized.
- Constraints
  - Local decap constraints: the amount of decap allowed at each location is limited by the placement constraint.
  - Global decap constraints: the total amount of decap allowed is limited by the leakage constraint.
- Limitations of existing work
  - Most existing work in essence uses the worst-case load current to guarantee that there is no noise violation, which is too pessimistic.
  - It is not clear how to provide a decap budgeting solution that is robust to the current loads under all kinds of operations of a circuit.

Major Contribution of our work
- We develop a novel stochastic model for current loads, taking into account operation variation such as temporal and logic-induced correlations, and process variations such as systematic and random Leff variation.
- We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
- We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs.
- Experiments show that considering both operation and process variations can reduce over-design significantly. This demonstrates the importance of considering operation variation.
Outline
- Stochastic Modeling and Problem Formulation
- Algorithm
- Experimental Results
- Conclusions

Correlated Load Currents
- Strong correlation between load currents due to operation variation
  - Currents at different ports have logic-induced correlation
    - Large number of ports with limited control bits
    - Currents at certain ports cannot reach their maximum at the same time due to the inherent logic dependency of a given design
  - Currents at the same port have temporal correlation
    - The system takes several clock cycles to execute one instruction
    - The currents cannot reach their maximum in all clock cycles
- Process variation
  - Currents have intra-die variation due to process variation
    - The P/G network is robust to process variation, but the load currents have intra-die variation because the circuit suffers from process variation.
    - Leff variation is one of the primary variation sources, and the variation is spatially correlated [Cao:DAC’05]

Current Sampling
- Model the current in each clock cycle as a triangular waveform and assume constant rising/falling time
  - Other current waveforms can be used; this does not affect the algorithm
  - In our verification, we use the detailed non-simplified current waveform
- Partition a circuit into blocks and assume no correlation between different blocks [Najm:ICCAD’05]
- Run extensive simulation for each block to get the peak current value in each clock cycle and at each port.
- Assume there is temporal correlation only within a certain number of clock cycles L
  - L can be the number of clock cycles needed to execute a certain function

Stochastic Current Modeling
- Divide the peak current values into different sets according to the clock cycle and port number.
- The set $b_k^j$ contains the peak current values at port k in clock cycles j, j+L, j+2L, …
- Example: take L=2, and consider two ports in 8 consecutive clock cycles

  clock cycle   1     2     3     4     5     6     7     8
  port 1        0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8
  port 2        0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08

  $b_1^1$ = {0.1, 0.3, 0.5, 0.7}        $b_1^2$ = {0.2, 0.4, 0.6, 0.8}
  $b_2^1$ = {0.01, 0.03, 0.05, 0.07}    $b_2^2$ = {0.02, 0.04, 0.06, 0.08}

  (The clock cycle index j captures temporal correlation; the port index k captures logic-induced correlation.)
- Define $B_k^j$ to be the stochastic variable with the sample set $b_k^j$.
  - For example, $B_1^1$ has the samples 0.1, 0.3, 0.5, 0.7, and therefore has mean value 0.4.
- The correlation between $B_k^{j_1}$ and $B_k^{j_2}$ reflects the temporal correlation between clock cycles $j_1$ and $j_2$.
- The correlation between $B_{k_1}^j$ and $B_{k_2}^j$ reflects the logic-induced correlation between ports $k_1$ and $k_2$.
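
The partitioning in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `partition_peaks` and `pearson` are hypothetical (not from the slides), indices are 0-based, and the correlation is taken to be the ordinary Pearson coefficient, cov / (sigma * sigma). Because the toy data is perfectly linear, both coefficients come out as 1 up to rounding; real simulation data would not.

```python
from math import sqrt

def partition_peaks(peaks, L):
    """Split one port's per-cycle peak currents into L sample sets.

    Set j (0-based) holds the peaks of cycles j, j+L, j+2L, ...
    """
    return [peaks[j::L] for j in range(L)]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sample lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Peak currents from the slide example (8 consecutive clock cycles).
port1 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
port2 = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
L = 2  # window of temporal correlation, as in the example

b1 = partition_peaks(port1, L)  # b1[0] ~ b_1^1, b1[1] ~ b_1^2
b2 = partition_peaks(port2, L)  # b2[0] ~ b_2^1, b2[1] ~ b_2^2

mean_b11 = sum(b1[0]) / len(b1[0])      # mean of B_1^1
temporal = pearson(b1[0], b1[1])        # same port, cycles j1=1 and j2=2
logic_induced = pearson(b1[0], b2[0])   # same cycle, ports k1=1 and k2=2

print(b1, b2)
print(mean_b11, temporal, logic_induced)
```

With measured peak currents, the same two calls would fill in the temporal and logic-induced correlation maps, one entry per pair of clock cycles or pair of ports.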
Extraction of Correlations
- The logic-induced correlation coefficient between ports $k_1$ and $k_2$ at clock cycle j can be computed as
  $$\rho(j; k_1, k_2) = \frac{\mathrm{cov}(B_{k_1}^{j}, B_{k_2}^{j})}{\sigma(B_{k_1}^{j})\,\sigma(B_{k_2}^{j})}, \quad 1 \le k_1, k_2 \le p$$
- The temporal correlation coefficient between clock cycles $j_1$ and $j_2$ at port k can be computed as
  $$\rho(j_1, j_2; k) = \frac{\mathrm{cov}(B_{k}^{j_1}, B_{k}^{j_2})}{\sigma(B_{k}^{j_1})\,\sigma(B_{k}^{j_2})}, \quad 1 \le j_1, j_2 \le L$$
- To take process variation into consideration, sample each $B_k^j$ multiple times over different regions; the above two formulas still apply.
  $$I \sim L_{eff}^{-0.5}\, t_{ox}^{-0.8}\, (V_{dd} - V_t)$$

Extraction of Correlations
- As $B_k^j$ is not Gaussian, apply Independent Component Analysis [Hyvarinen’01] to remove the correlation between the $B_k^j$ and obtain a new set of independent variables $r_1, r_2, \ldots$
- Each $B_k^j$ can be represented as a linear combination of $r_1, r_2, \ldots$
- Accordingly, the waveform in each clock cycle can be reconstructed from $r_1, r_2, \ldots$ (equation not preserved in this transcript).
- The new variables $r_i$ capture both the operation and process variations.

Example of Extracted Temporal Correlation
[Figure: correlation map for peak currents between different clock cycles of one port from an industry application]
- The P/G network is modeled as an RC mesh.
- The load currents are obtained by detailed simulation of the circuit.
- The correlation matrix can be clearly divided into four blocks, and L can be set to 10.

Parameterized MNA Formulation
- Original MNA formulation
  $$(G + sC)\,x = Bu$$
- With the design variables (decap areas $w_i$), the G and C matrices can be expressed as
  $$G + \sum_{i=1}^{M} w_i G_{w,i}, \qquad C + \sum_{i=1}^{M} w_i C_{w,i}$$
- Together with the stochastic current model, the MNA formulation becomes
  $$\Big(G + \sum_{i=1}^{M} w_i G_{w,i}\Big)x + s\Big(C + \sum_{i=1}^{M} w_i C_{w,i}\Big)x = Bu$$
  with parameters $w_i$ and $r_i$, where the load currents in u depend on the $r_i$.
- The objective now is to find the optimal solution for those parameters.
  - More specifically, find the $w_i$ values that minimize the noise, with the $r_i$ corresponding to the load currents that introduce the maximum noise.

Stochastic Decap Formulation
$$(P1)\quad \min_{w_i}\ \sup_{r_k}\ \sum_{i=1}^{p} \int f\big(U - y_i(w_i, r_k; t)\big)\,dt$$
$$\text{s.t.}\quad \underline{r}_k \le r_k \le \overline{r}_k,\ 1 \le k \le q; \qquad 0 \le w_i \le \overline{w}_i,\ 1 \le i \le M; \qquad \sum_{i=1}^{M} w_i \le W$$
- Minimize the maximum noise summed over all ports
- Subject to the upper/lower bounds of the stochastic current variables
- Subject to
  - the local decap area constraint due to the placement constraint
  - the global decap area constraint due to the leakage constraint ($\sum_{i=1}^{M} w_i \le W$)
- Non-convex min/max optimization problem
  - Difficult to find the global optimal solution

Iterative Programming Algorithm
$$(P1)\quad \min_{w_i}\ \sup_{r_k}\ \sum_{i=1}^{p} \int f\big(U - y_i(w_i, r_k; t)\big)\,dt$$
$$\text{s.t.}\quad \underline{r}_k \le r_k \le \overline{r}_k,\ 1 \le k \le q; \qquad 0 \le w_i \le \overline{w}_i,\ 1 \le i \le M; \qquad \sum_{i=1}^{M} w_i \le W$$
- In each iteration we increase the white space allowed, until all the white space has been used up or the solution converges.
- Each iteration alternates between two steps:
  - Find the optimal decap budgeting for the given max droop/bounce, then update the decap budgeting.
  - Find the input corresponding to the max droop/bounce for the given decap budgeting, then update the max droop/bounce.
- Cannot guarantee optimality, but can guarantee convergence and efficiency.
- Experimental results show that our algorithm achieves good optimization results.

Illustration of Iterative Programming
[Figure: noise curves A0 to A3 at one randomly selected port]
- A0: Initial noise curve at one randomly selected port
- A1 (P3): The noise curve under the optimal decap budgeting for a given droop/bounce
- A2 (P2): The noise curve with the input corresponding to the max droop/bounce for the decap budgeting in A1
- A3 (P3): The noise curve under the optimal decap budgeting for the given max droop/bounce in A2

Sequential Programming
- We apply sequential linear programming (sLP) to solve each of the two sub-problems.
- For each sub-problem, we iterate the following two steps until the solution converges:
  - Compute the first-order sensitivities of all the variables by moment matching.
    $$x = x_0 + \sum_{i=1}^{M} \alpha_i w_i$$
    $$\Big(G + \sum_{i=1}^{M} w_i G_{w,i}\Big)x + s\Big(C + \sum_{i=1}^{M} w_i C_{w,i}\Big)x = Bu$$
    Matching the zeroth- and first-order terms in $w_i$ gives
    $$(G + sC)\,x_0 = Bu$$
    $$(G + sC)\,\alpha_i = -(G_{w,i} + sC_{w,i})\,x_0 \quad \text{(first-order sensitivities)}$$
  - Linearize the objective function with the sensitivities, so that the optimization problem becomes an LP:
    $$\min(\max)\ \sum_{i=1}^{M} \alpha_i w_i$$

Impact of Current Correlations
- Model 1: Maximum current at all ports
- Model 2: Stochastic model with logic-induced correlation
- Model 3: Model 2 + temporal correlation

           Noise (V*s)                      Runtime (s)
  Node #   Model 1   Model 2   Model 3     Model 1   Model 2   Model 3
  1284     6.33e-7   1.28e-7   4.10e-8     104.2     161.2     282.3
  10490    5.21e-5   1.09e-5   4.80e-6     973.2     1430      2199
  42280    7.92e-4   5.38e-4   9.13e-5     2732      3823      5238
  166380   1.34e-2   5.37e-3   2.28e-3     3625      5798      7821
  avg      1         1/2.68X   1/9.10X     1         1.50X     2.26X

- Compared with the model assuming maximum currents at all ports, under the same decap area:
  - The stochastic model with logic-induced (spatial) correlation only reduces the noise by about 3X (2.68X on average).
  - The stochastic model with both spatial and temporal correlation reduces the noise by about 9X (9.10X on average).

Impact of Leff Variation
                  sLP                                  sLP + Leff
  Node #   V.R.   mean (V*s)  std (V*s)  runtime (s)   mean (V*s)  std (V*s)  runtime (s)
  1284     10%    9.28e-7     3.97e-7    184.2         6.14e-7     1.38e-7    332.8 (1.81X)
           20%    9.43e-7     4.55e-7                  6.38e-7     1.86e-7
  10490    10%    1.03e-4     4.79e-5    1121          7.22e-5     1.23e-5    3429 (3.06X)
           20%    1.22e-4     4.38e-5                  7.94e-5     2.06e-5
  42280    10%    2.29e-3     9.72e-4    2236          8.23e-4     1.01e-4    6924 (3.10X)
           20%    4.43e-3     1.01e-3                  8.28e-4     1.92e-4
  166380   10%    2.06e-2     9.91e-3    3824          5.31e-3     8.92e-4    11224 (2.93X)
           20%    2.31e-2     1.03e-2                  5.92e-3     9.33e-4
  avg      10%    1           1          1             1/2.02X     1/5.05X    2.73X
           20%    1           1                        1/1.95X     1/4.05X

- Compared with the stochastic model without Leff variation, the stochastic model that considers it reduces the average noise by up to 4X and the 3-sigma noise by up to 13X.

Conclusions
- We develop a novel stochastic model for current loads, taking into account operation variation such as temporal and logic-induced correlations, and process variations such as systematic and random Leff variation.
- We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
- We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs.
- Experimental results show that the noise can be reduced by up to 9X.
- We also apply a similar idea to temperature-aware clock routing [Hao:ispd’07] and microprocessor floorplanning (Section 8C.2).

Thank you!