Stratified Sampling for Fault Coverage of VLSI Systems

Vishwani D. Agrawal
Agere Systems, Murray Hill, NJ 07974
va@agere.com
http://cm.bell-labs.com/cm/cs/who/va
September 26, 2001
Collaborators: Pradip Thaker, Acorn Networks, and Mona Zaghloul, GWU

Sep. 26, 2001   Agrawal: Stratified Sampling

VLSI System Design
90-100% stuck-at fault coverage required
Design flow:
  • Register-transfer level (RTL) design and verification
  • Logic synthesis
  • Test generation
  • Timing and physical design
  • Design and test data for manufacturing

Problem
Accurately estimate the gate-level fault coverage for a VLSI system at the RT-level
Advantages:
  • Improve test
  • Improve design
  • Avoid expensive design changes
Previous approaches do not accurately represent gate-level fault coverage (function errors, mutation, statement faults, branch faults, etc.)

Solution
Model faults as a representative sample of the targeted (gate-level stuck-at) faults.
Treat the coverage in an RTL module as a statistical sampling estimate.
For a multi-module VLSI system, combine module coverages according to the stratified sampling technique.

Outline of Talk
Introduction to fault sampling.
RTL fault model and application to modules.
Coverage in a multi-module system:
  • Need for stratified sampling
  • Stratum weights
  • Experimental results
Conclusion
References

Fault Sampling
A randomly selected subset (sample) of faults is simulated.
Measured coverage in the sample is used to estimate fault coverage in the entire circuit.
Advantage: Saving in computing resources (CPU time and memory).
Disadvantage: Limited data on undetected faults.
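The fault sampling idea can be illustrated with a minimal Python sketch. It is not the authors' tool: the function name `sample_coverage`, the boolean fault list, and the population figures are illustrative assumptions. In a real flow the detected/undetected status is known only for the faults that are actually simulated, which is where the savings come from.

```python
import random

def sample_coverage(detected_flags, sample_size, seed=0):
    """Estimate fault coverage from a random sample of faults.

    detected_flags: one boolean per fault in the population (True = detected).
    This list stands in for the fault simulator; in practice only the
    sampled faults would be simulated.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(detected_flags, sample_size)  # sampling without replacement
    return sum(sample) / sample_size

# Illustrative population (assumed numbers): 39,096 faults, about 87.1% detected.
population_size = 39096
detected = round(0.871 * population_size)
population = [True] * detected + [False] * (population_size - detected)

true_coverage = detected / population_size
estimate = sample_coverage(population, 1000)
print(f"true coverage = {true_coverage:.3f}, sample estimate = {estimate:.3f}")
```

The sample estimate varies with the seed; its spread around the true coverage is what the sampling error analysis quantifies.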

Random Sampling Model
All faults, with a fixed but unknown coverage; the sample is obtained by random picking.
Np = total number of faults (population size)
Ns = sample size, Ns << Np
C = fault coverage (unknown)
c = sample coverage (a random variable)
[Figure: population of detected and undetected faults, from which a sample is picked at random]

Probability Density of Sample Coverage, c
$$p(x) = \mathrm{Prob}(x < c < x + dx) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - C)^2}{2 \sigma^2}}$$
Variance: $\sigma^2 = \frac{C(1 - C)}{N_s}$
Mean = C
[Figure: density p(x) versus sample coverage x (up to 1.0), centered at C, with the sampling error σ and the interval from C − 3σ to C + 3σ marked]

Sampling Error Bounds
$$|x - C| = 3 \left[ \frac{C(1 - C)}{N_s} \right]^{1/2}$$
(Millot, 1923)
Solving the quadratic equation for C, we get the 3-sigma (99.8% confidence) estimate (Agrawal-Kato, 1990):
$$C_{3\sigma} = x \pm \frac{4.5}{N_s} \left[ 1 + 0.44 \, N_s \, x (1 - x) \right]^{1/2}$$
where Ns is the sample size and x is the measured fault coverage in the sample.
Example: A circuit with 39,096 faults has an actual fault coverage of 87.1%. The measured coverage in a random sample of 1,000 faults is 88.7%. The above formula gives an estimate of 88.7% ± 3%. CPU time for sample simulation was about 10% of that for all faults.

An RTL Fault Model (ITC-2000)
Language operators are assumed to be fault-free
Variables (map onto signal lines) contain faults
  • stuck-at-0
  • stuck-at-1
Only one fault is applied at a time (single fault assumption)

RTL Fault Injection
Not affected by faults:
  • Synthetic operators + - * >= <= == !=
  • Boolean operators & | ^ ~
  • Logical operators && || !
  • Sequential elements (flip-flops & latches)
Faults introduced in signal variables (stems and fan-outs)
Separate faults for bits of data words
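Returning to the Sampling Error Bounds slide, the arithmetic of its example can be checked with a short Python sketch. The function name `three_sigma_range` is an assumption made for illustration; the constants 4.5 and 0.44 and the example values (x = 0.887, Ns = 1000) are taken from the slide.

```python
import math

def three_sigma_range(x, ns):
    """Return (low, high) 3-sigma bounds on the true coverage C.

    x  : measured fault coverage in the sample (0..1)
    ns : sample size
    Uses the half-width (4.5 / ns) * sqrt(1 + 0.44 * ns * x * (1 - x)).
    """
    half_width = (4.5 / ns) * math.sqrt(1.0 + 0.44 * ns * x * (1.0 - x))
    return x - half_width, x + half_width

# Example from the slide: 1,000 sampled faults, 88.7% measured coverage.
low, high = three_sigma_range(0.887, 1000)
half_width = (high - low) / 2.0
print(f"estimate = 88.7% +/- {100 * half_width:.1f}%")
```

The half-width evaluates to about 0.030, and the actual coverage of 87.1% falls inside the resulting range.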

Fault Modeling for Boolean Operators
RTL Description:

    module mux(c, a, b, s);
      input a, b, s;
      output c;
      wire d, e, s1;
      assign d = a & s;
      assign e = s1 & b;
      assign s1 = !s;
      assign c = d | e;
    endmodule

[Figure: Symbolic Description, a gate-level drawing of the mux with signals a, b, s, s1, d, e, and c]

Stem and Fan-out Fault Modeling
RTL fan-out faults:
    if(X) then Z=Y; else Z=!Y;
Unique RTL fault is placed on each fan-out of each bit of a variable
Unique RTL fault on each stem
[Figure: panels (a) and (b), each showing a module]

More RTL Faults
[Figure: RTL datapath example with arithmetic and relational operators (+, *, -, <, >, =), a MUX, and signals a[1:0], b[1:0], c[1:0], d[1:0], e[3:0], f[2:0], g[2:0], h, i, j, k, v, w, out_sig1, out_sig2, clk, and reset_]

Observations and Assumption: RTL Faults
RTL faults may have a detection probability distribution similar to that of collapsed gate-level faults
Statistically, an RTL fault-list approximates a random sample from the gate-level fault-list
Number of RTL faults vs. gate-level faults depends on
  • Level of RTL description
  • Synthesis procedure used to convert RTL to gate level

RTL Fault Simulation
Analogous to gate-level approach
Faults injected in RTL code of the design description by a C++ parser; a simulatable logic buffer element inserted at fault site
Fault report contains statistics on detected and undetected RTL faults
Cadence's Verifault-XL used as RTL fault simulator

Estimation Error for Module Fault Coverage
RTL fault coverage assumed to be an estimate of the collapsed gate-fault coverage within statistical bound [Agrawal and Kato, D&T, 1990]:
$$\pm \frac{a^2 k}{2N} \sqrt{1 + \frac{4 N c (1 - c)}{a^2 k}}$$
a = 3.00 for confidence probability of 99.8%
c = ratio of detected to total number of RTL faults
M = number of gate faults
N = number of RTL faults
k = 1 - N/M
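The module-level bound can be evaluated numerically with a small Python sketch. It assumes the bound has the form (a²k / 2N) · sqrt(1 + 4Nc(1 − c) / (a²k)), which reduces to the 4.5/Ns expression of the sampling slide when a = 3 and k = 1. The function name and the module sizes below are hypothetical, chosen only for illustration.

```python
import math

def rtl_coverage_bound(c, n_rtl, m_gate, a=3.0):
    """Half-width of the estimate of gate-level coverage from RTL coverage.

    c      : RTL fault coverage of the module (detected / total RTL faults)
    n_rtl  : N, number of RTL faults
    m_gate : M, number of collapsed gate-level faults
    a      : confidence multiplier (3.0 on the slide)
    """
    if not 0 < n_rtl < m_gate:
        raise ValueError("expects 0 < N < M so that k = 1 - N/M is positive")
    k = 1.0 - n_rtl / m_gate  # finite-population correction
    return (a * a * k / (2.0 * n_rtl)) * math.sqrt(
        1.0 + 4.0 * n_rtl * c * (1.0 - c) / (a * a * k)
    )

# Hypothetical module: 200 RTL faults, 2,000 gate faults, 80% RTL coverage.
bound = rtl_coverage_bound(0.80, 200, 2000)
print(f"gate-level coverage estimate: 80.0% +/- {100 * bound:.1f}%")
```

Because k shrinks as N approaches M, a more detailed RTL description (more RTL faults relative to gate faults) tightens the bound.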

DSP Interface Module (3,168 Gates)
[Figure: RTL and gate fault coverage (%) versus test vectors (0 to 1500); curves: RTL Cov, Gate Cov]

RTL Faults and VLSI System Coverage
Experimental results demonstrate RTL fault coverage of a module to be a good statistical estimate of the gate-level fault coverage
A VLSI system consists of many interconnected modules
Overall RTL fault-list of a VLSI system does not constitute a representative sample of the gate-level fault-list

Error at System Level

    Level        Module   Faults   Coverage
    RTL          M1       100      91%
    RTL          M2       100      39%
    Gate-level   M1       150      90%
    Gate-level   M2       400      40%

RTL Coverage = (0.91 x 100 + 0.39 x 100) / 200 = 65%
Gate Coverage = (0.90 x 150 + 0.40 x 400) / 550 = 54%
A correct estimation of gate-level fault coverage from RTL coverage:
91 x (150 / 550) + 39 x (400 / 550) = 53%

Application of Stratified Sampling
Fault population of a VLSI system divided into strata according to RTL module boundaries
RTL faults in each module are considered a sample of corresponding gate-level faults
The stratified RTL coverage is an estimate of the gate-level coverage:
$$C = \sum_{m=1}^{M} W_m c_m$$
W_m = stratum weight of mth module = G_m/G
c_m = RTL fault coverage of mth module
G_m = number of gate-level faults in mth module
G = number of all gate-level faults in the system
M = number of RTL modules in the system

Application of Stratified Sampling
Range of coverage: $C \pm t\sigma$, where
$$\sigma^2 = \sum_{m=1}^{M} \frac{W_m^2 \, c_m (1 - c_m)}{r_m - 1}$$
r_m = number of RTL faults in mth module
t = value from tables of normal distribution
The technique requires knowledge of stratum weights and not absolute values of G_m and G

Stratum Weight Extraction Techniques
  • Logic synthesis based weight extraction: W_m = G_m/G
  • Floor-planning based weight extraction: W_m = A_m/A (area of the mth module over total area)
  • Entropy-measure based weight extraction

Experimental Procedure
Technology-dependent weight extraction
  • Several unique gate-level netlists obtained by logic synthesis from the same RTL code
  • Each synthesis run performed using a different set of constraints, e.g., area optimization (netlist 1), speed optimization (netlist 2), or combined area and speed optimizations (netlists 3 and 4)
  • Strata weights calculated using gate-level fault lists of various synthesized netlists
Technology-independent weight extraction
  • Stratum weights calculated using area distribution among modules
Each set of stratum weights used to calculate RTL fault coverage and error bounds
Impact of estimation error investigated

Experimental Data: Weight Distributions
[Figure: stratum weights (0 to 0.3) for modules 1 to 12; series: Netlist1, Netlist2, Netlist3, Netlist4, Area]

Experimental Data: RTL Fault Coverage
[Figure: fault coverage (%) versus test vector set (1 to 7); curves: RTL Cov. with W_m from Netlist1, Netlist2, Netlist3, Netlist4, and Area, and Gate Cov.]

Experimental Data: Error Bounds
[Figure: error bounds |E| (0 to 10) versus test vectors (1 to 7); curves: |E| with W_m from Netlist1, Netlist2, Netlist3, Netlist4, and Area]

Timing Controller ASIC (17,126 Gates)
[Figure: RTL and gate fault coverage (%) versus test vectors (0 to 600); curves: RTL Cov, Gate Cov]

A DSP ASIC (104,881 Gates)
[Figure: RTL and gate fault coverage (%) versus test vectors (0 to 1000); curves: RTL Cov, Gate Cov]
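The stratified estimate and its range can be worked through in Python for the two-module example of the Error at System Level slide. This is a sketch under stated assumptions: the function names are illustrative, the variance uses the standard stratified-sampling form with squared stratum weights, and t = 3 is chosen here only to mirror the 3-sigma convention used earlier in the talk.

```python
import math

def stratified_coverage(weights, coverages):
    """C = sum over modules of W_m * c_m."""
    return sum(w * c for w, c in zip(weights, coverages))

def stratified_sigma(weights, coverages, rtl_fault_counts):
    """Standard deviation of the stratified estimate.

    Assumes sigma^2 = sum of W_m^2 * c_m * (1 - c_m) / (r_m - 1).
    """
    variance = sum(
        w * w * c * (1.0 - c) / (r - 1)
        for w, c, r in zip(weights, coverages, rtl_fault_counts)
    )
    return math.sqrt(variance)

# Two-module example from the slide.
gate_faults = [150, 400]   # G_m, used only to derive the weights
rtl_faults = [100, 100]    # r_m
rtl_cov = [0.91, 0.39]     # c_m

total_gate_faults = sum(gate_faults)
weights = [g / total_gate_faults for g in gate_faults]  # W_m = G_m / G

# Unweighted RTL coverage, which overestimates the gate-level value.
naive = sum(c * r for c, r in zip(rtl_cov, rtl_faults)) / sum(rtl_faults)
C = stratified_coverage(weights, rtl_cov)
sigma = stratified_sigma(weights, rtl_cov, rtl_faults)
t = 3.0  # assumed multiplier from normal-distribution tables

print(f"naive RTL coverage  = {100 * naive:.0f}%")
print(f"stratified estimate = {100 * C:.1f}% +/- {100 * t * sigma:.1f}%")
```

Only the ratios G_m/G enter the calculation, so weights taken from area or any other proportional measure can be substituted for the gate-fault counts.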

Conclusion
Main ideas of RTL fault modeling
  • A small or high-level RTL module contributes few RTL faults, but large statistical tolerance gives a correct coverage estimate
  • Stratified sampling accounts for varying module sizes and for different RTL details that may be used
  • Stratum weights appear to be insensitive to specific details of synthesis
Advantages of the proposed RTL fault model
  • High-level test generation and evaluation
  • Early identification of hard-to-test RTL architectures
  • Potential for significantly reducing run-time penalty of the gate-level fault simulation

References
V. D. Agrawal, "Sampling Techniques for Determining Fault Coverage in LSI Circuits," J. Digital Systems, vol. V, no. 3, pp. 189-202, 1981.
V. D. Agrawal and H. Kato, "Fault Sampling Revisited," IEEE Design & Test of Computers, vol. 7, no. 4, pp. 32-35, Aug. 1990.
P. A. Thaker, M. E. Zaghloul, and M. B. Amin, "Study of Correlation of Testability Aspects of RTL Description and Resulting Structural Implementation," Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 256-259.
P. A. Thaker, V. D. Agrawal, and M. E. Zaghloul, "Validation Vector Grade (VVG): A New Coverage Metric for Validation and Test," Proc. 17th IEEE VLSI Test Symp., Apr. 1999, pp. 182-188.
P. A. Thaker, Register-Transfer Level Fault Modeling and Evaluation Techniques, PhD Thesis, George Washington University, Washington, D.C., May 2000.
P. A. Thaker, V. D. Agrawal, and M. E. Zaghloul, "Register-Transfer Level Fault Modeling and Test Evaluation Techniques for VLSI Circuits," Proc. Int. Test Conf., Oct. 2000, pp. 940-949.
This presentation is available from the website http://cm.bell-labs.com/cm/cs/who/va

Thank you