IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 1405 Diagnosis-Assisted Adaptive Test Xiaochun Yu, Member, IEEE, and Ronald DeShawn Blanton, Fellow, IEEE Abstract—This paper describes a method for improving the test quality of digital circuits on a per-design basis by: 1) monitoring the defect behaviors that occur through volume diagnosis; and 2) changing the test patterns to match the identified behaviors. To characterize the behavior of a defect (i.e., the conditions when a defect is activated), physically-aware diagnosis is employed to extract the set of signal lines relevant to defect activation. Then, based on the set of signal lines derived, the defect is attributed to one of several behavior categories. Our defect level model uses the behavior-attribution results of the current failing population to guide test-set customization to minimize defect level for a given constraint on test costs, or alternatively, ensure that defect level does not exceed some predetermined threshold. Circuit-level simulation involving various types of defects shows that defect level can be reduced by 30% using this method. Simulation experiment on actual chips also demonstrates quality improvement. Index Terms—Adaptive test, defect behavior, diagnosis, test quality. I. Introduction T HE DIAGNOSIS of test data for improving yield is a topic of great interest [1]–[5]. For example, the work in [2] uses diagnosis results to track failure rates for various design features. These empirically observed failure rates are compared with expected failure rates to identify anomalies in the design and/or manufacturing process. The thinking is that the identified anomalies can be remedied in order to improve yields. Diagnosis methods such as the ones described in [6] and [7] derive defect behaviors (i.e., defect activation conditions) in addition to the potential defect locations. The defect characteristics derived from a statistically significant population suggest the susceptibilities of a design to a specific manufacturing process. Customizing a test set to those failure characteristics is likely to increase test quality, which is commonly measured using defect level (DL). DL is the ratio of defective chips to all the chips that fully pass all the tests, and it is typically expressed in units of defective parts per million shipped (DPPM). Since the failure characteristics may change over time, the custom test set should also change accordingly. Custom test is a type of adaptive test. Different from a Manuscript received September 10, 2011; revised January 8, 2012; accepted February 17, 2012. Date of current version August 22, 2012. This work was supported by the National Science Foundation, under Award CCF-0427382. This paper was recommended by Associate Editor C.-W. Wu. X. Yu is with Intel Corporation, Hillsboro, OR 97124 USA (e-mail: xiaochun.yu@intel.com). R. D. Blanton is with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 USA (e-mail: blanton@ece.cmu.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCAD.2012.2193580 static test method (e.g., a test set that is generated using a fault model) that does not depend on changing parameters, an adaptive test method can alter test conditions, test flow, test content, and test limits based on statistical analysis of various forms of manufacturing data [8]. Adaptive test has been successfully employed in various ways. For example, test cost can be reduced by reordering test patterns based on fail count [9]–[12]. Adaptive limit setting for quiescent supply current (IDDQ) and minVDD outlier screening has also been shown to improve test quality [9], [11]–[15]. This paper describes diagnosis-assisted adaptive test (DAT), an approach that improves test quality or test cost for static defects by identifying and applying tests that optimally detect the defect characteristics of integrated circuits (ICs) being currently manufactured. In its current incarnation, DAT targets static defects which are different from a sequence-dependent defect (e.g., a transistor stuck-open defect) whose activation can require multiple clock cycles. The activation and error propagation of a static defect, on the other hand, occurs within a single clock cycle. Moreover, for a static defect at a site f , defect activation is assumed to be controlled by the neighborhood of f , which is the set of signals that are in close physical or logical proximity to f [6]. DAT uses diagnosis-extracted models of chip failures along with a model for estimating DPPM. Both are incorporated in a quality-monitoring methodology that ensures a desired level of quality by changing some proportion of the production test patterns to match the current defect characteristics derived from diagnosis. DAT is dynamic in nature and thus differs from the typical approach that assumes sufficient quality levels are maintained using the tests developed during the time of design or “first silicon.” DAT presupposes that failure characteristics can change over time but with a time constant that is sufficiently slow, thereby allowing the test set to be altered so as to maximize coverage of the failure characteristics actually occurring. If the failure characteristics are unchanging, it will then be possible to tune the test set once to match the fixed failure characteristics. We therefore believe it is possible to minimize DPPM for a given constraint on test costs, or alternatively, ensure that DPPM does not exceed some predetermined threshold. Fig. 1 illustrates a possible scenario where DPPM changes over time. At t1 , DPPM prediction begins based on a sufficient amount of diagnosis data obtained during the period between t0 and t1 . When the predicted DPPM approaches a specific DPPM limit at t2 , the tests are altered to lower DPPM at t3 . When DPPM is sufficiently low (e.g., t5 ), test quality can be traded off to reduce test application time, which leads to the slight increase in DPPM shown at t6 . Notice, however, at t4 , an excursion is assumed to have c 2012 IEEE 0278-0070/$31.00 © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. 1406 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 TABLE I Behavior-Attribution Rules Id 1 Relevant Neighborhood Fail in deriving RNf Defect Behavior Unknown 2 RNf = ∅ RNf = ∅ and ∃c, (RNf ∪ {f })\Pc = ∅ Stuck-at 3 Fig. 1. Illustration of how DPPM can change over time for a given design in a given manufacturing process. 4 5 6 7 8 Fig. 2. Overview of DAT. occurred. Again, DAT does not presumably need to cope with excursions since effective techniques already exist to deal with them [16]. An overview of DAT is shown in Fig. 2. Multiple defect diagnosis using X-evaluation (MD2 X) [17], a multiple defect diagnosis method that makes no assumption of defect behavior, is first applied to each failing chip to derive defect sites. Then, for each defect site found via MD2 X, the defect behavior (i.e., its activation condition) is derived using the failing behavior identification (FBI) method described in [7]. In FBI, activation of a static defect is assumed to be controlled by its relevant neighbors (i.e., signal lines in close physical and logical proximity to the defect site), which are identified using an entropy analysis. Based on the relevant neighbors identified, each chip failure is placed into one of several behavior categories, each of which corresponds to one or more defect types [7] (Section II). Using the behavior-attribution results derived from a large sample of failing ICs, the defect level of chips passing a specific test set can be estimated using a novel, data-based defect-level model (Section III). The defect-level model guides test-set customization for improving test quality without additional test cost (Section IV). II. Defect Behavior Attribution 2 MD X and FBI derive, for each failing chip, the defect sites and each site’s relevant neighbors. Based on the relevant ∀c, (RNf ∪ {f })\Pc = ∅ and RNf ∩ (Vtotal \Vf ) = ∅ ∀c, (RNf ∪ {f })\Pc = ∅ and RNf ∩ (Vtotal \Vf ) = ∅ and RNf ∩ (DPNf + DVf )\(PNf + SIf ) = ∅ |RNf | = |RNf ∩ PNf | = 1 and ∀c, (RNf ∪ {f })\Pc = ∅ and RNf ∩ (Vtotal \Vf ) = ∅ |RNf | = |RNf ∩ PNf | > 1 and ∀c, (RNf ∪ {f })\Pc = ∅ and RNf ∩ (Vtotal \Vf ) = ∅ |(RNf ∩ SIf )\(PNf + DPNf + DVf )| > 0 and ∀c, (RNf ∪ {f })\Pc = ∅ and RNf ∩ (Vtotal \Vf ) = ∅ Cell misbehavior f ’s value depends on other defective sites f ’s value depends on drive strength f ’s value depends on one physical neighbor f ’s value depends on > 1 physical neighbors f ’s value depends on its receiver-gate side inputs Defect Type Unknown All types possible All types possible Bridge Bridge Bridge or open Bridge or open Open neighbors identified, each defect can be placed into one of several mutually exclusive behavior categories. Since the behaviors of some defect types overlap, there is no one-onone mapping from defect behavior to defect type in many cases. Thus, the objective of defect behavior attribution is not to derive the defect type directly from the relevant neighbors identified. But instead, behavior-attribution results in a behavior distribution, that is, the relative occurrence rates of various behavior categories. The behavior distribution of a failing population of chips captures the failure characteristics of that population. The details of behavior attribution are explained in the following subsections. Specifically, in Section II-A, the behavior-attribution rules are described and the relationship between each behavior category and defect type is analyzed. In Section II-B, various experiments are conducted to validate the relationships described in Section II-A. A. Categorizing Defect Behavior Table I lists the behavior-attribution rules that are based on different sets of relevant neighbors. (The notation used in Table I is described in Table II.) Specifically, Table I lists eight different behavior categories and their association to the four defect types considered, namely, open, bridge, cell defect, and “unknown.” Though eight behavior categories and four defect types are considered in this paper, behavior attribution can be extended to other, and more refined defect types and behaviors. Category 1: FBI identifies the relevant neighbors using the characteristics of the common defect types under consideration. If the relevant neighborhood cannot be identified, then the behavior of the defect is deemed “unknown” to signify the observed behavior does not completely align with the common defect types under consideration. YU AND BLANTON: DIAGNOSIS-ASSISTED ADAPTIVE TEST TABLE II Definition of Notation Term RNf PNf DVf DPNf SIf Vtotal Vf Pc Description Deduced relevant neighborhood of f Physical neighbors of f Inputs of the cell driving f Inputs of the cells that drive lines in PNf Side inputs of the cells driven by f All defective sites identified by diagnosis Defective sites that have the same effect as f on the failing circuit under the given test set Inputs and output of a cell c Category 2: If a defective site f does not have any relevant neighbors, then f manifests faulty behavior regardless of its neighborhood. In other words, f behaves as a signal line being permanently stuck-at a fixed logic value (i.e., a stuck-at fault), under the applied test set. Any type of defect can behave as a stuck-at fault, especially when the applied tests do not expose all the possible behaviors of a defect. Category 3: If: 1) the union of a defective site f and its relevant neighborhood is a subset of the union of the inputs and the output of a cell; and 2) there is at least one relevant neighbor, then the behavior of the defect mimics a misbehaving cell (i.e., the truth table of the cell’s function has changed). This is likely due to a cell defect, but there are other possibilities. For example, a cell input affected by an open can manifest as a change in cell functionality. Also, for a given test set, a bridge whose activation depends on the driving strength of a cell can also mimic cell misbehavior. Category 4: If: 1) the relevant neighborhood of a defective site f includes another defective site g; 2) f and g have different effects on the failing circuit; and 3) the behavior does not agree with Categories 1, 2, or 3, then the value of f is dependent on g. A bridge defect affecting two signal lines can produce this behavior. Although an open on a stem line can also cause several (or all) of its downstream branches to be defective, error manifestation on the defective branches does not depend on each other. Neither can a cell defect cause the aforementioned behavior. Thus, a defect that causes this observed behavior can be classified as a bridge. Category 5: If: 1) the relevant neighborhood of a defective site f includes inputs to the cell driving f or inputs to the cells driving the physical neighbors of f ; 2) at least one of the inputs included in the relevant neighborhood is not a physical neighbor of f or a side input to any of the cells driven by f ; and 3) the behavior does not agree with any of the Categories 1–4, then the logical value resulting in a defective site f is a function of the drive strength of its driving cell and/or its physical neighbors. A bridge defect can cause such behavior, but an open or a cell defect cannot. Thus, a defect that causes this behavior can be classified as a bridge. Categories 6 and 7: If the relevant neighborhood of a defective site f includes only the physical neighbors of f , and the relevant neighborhood does not conform to any of the Categories 1–5, then the value of f depends only on its physical neighbors. For this situation, f can be affected by an open or bridge, but not a cell defect. If f is affected by 1407 an open and the activation of the defect is only dependent on its physical neighbors, then the coupling between f and its relevant neighbors should be large enough to determine the value of f . Thus, when f is affected by an open, it is more likely that the combined effect of multiple physical neighbors will determine the value of f rather than a single neighbor [18]. Also for an open, it is less likely that the value of f is always controlled by a specific physical neighbor. On the other hand, if f is affected by a bridge, the physical neighbors that control f are likely the lines bridged with f . A bridge involving two lines is much more likely than a bridge involving three or more lines. Therefore, an open defect in this behavior category is more likely to have more than one relevant neighbor. On the other hand, a bridge in this behavior category is more likely to have a single relevant neighbor. The behavior is further partitioned by examining: 1) the value of the defective site as a function of a single physical neighbor (Category 6); and 2) the value of the defective site as a function of two or more physical neighbors (Category 7). This is reasonable since the likelihood of a defect of Category 6 being an open over a bridge is likely different from that of Category 7 even though the two categories correspond to the same defect types. Category 8: If: 1) the value of a defective site f is dependent on the side inputs to the cells driven by f ; 2) at least one side input included in the relevant neighborhood is: a) not an input to any of the cells driving the physical neighbors of f ; b) not an input to the cell driving f ; and c) not a physical neighbor of f ; and 3) the behavior does not agree with any categories from 1 to 7, then the value of f depends on the side inputs of the cells driven by f . An open affecting f can cause this type of behavior, but a bridge or a cell defect cannot. Thus, a defect exhibiting this behavior is classified as an open. B. Validation Experiment To validate the effectiveness of the behavior-attribution rules of Table I, a simulation experiment is performed using several benchmark circuits [19], [20]. To better mimic reality, a pool of defective circuits is generated by randomly injecting cell defects, gross interconnect opens, and two-line bridge defects, one at a time, into a layout implementation of each benchmark [21]. The cell defects injected include internal line opens, bridges between internal lines, bridges between inputs of a cell, feedback bridges between the inputs and output of a cell, transistor stuck-opens, transistor stuck-close defects, and “nasty polysilicon” defects1 [22]. For each layout with an injected defect, the corresponding circuit-level [23] netlist is extracted and circuit-level simulation is performed using a test set that achieves 100% stuck-at fault efficiency (i.e., a 1-detect test set), and the circuit responses are digitized and collected. A clock cycle that is much slower than the rated clock is used so as to mimic scan test. Table III shows the number of defective circuits created for each defect type and benchmark. 1 A nasty polysilicon defect (NPSD) refers to a spot deformation of extra polysilicon that causes an unwanted connection between the gate and the source (drain) of a transistor that also prevents the metal terminal connection of the source (drain) of a transistor. Since an NPSD causes a short and an open at the same time, it is inherently “nasty.” 1408 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 TABLE III Number of Defective Circuits for Various Defect Types and Benchmarks Used for Validation Benchmark c432 c880 s713 s953 c1355 Cell Defect 363 1061 610 847 1250 Bridge 260 486 321 554 493 Open 1277 2262 1829 2643 3372 According to our analysis in Section II-A, all cell defects should be placed in either the stuck-at or the cell-misbehavior category. The behavior of a bridge can be placed in any of the categories except 1 or 8. For an open, its behavior can be in any category except 1, 4, and 5. To evaluate our analysis of the relationship between different categories of defect behaviors and defect types, each defective circuit is diagnosed using MD2 X and the relevant neighbors of each defective site are identified using FBI. Then, each defect is categorized using Table I. The percentages of defects exhibiting stuck-at behavior are 59%, 14%, and 97% for cell, bridge, and open defects, respectively. The behavior distributions of defects that do not behave as stuck-at faults are shown in Fig. 3. It shows that among defects that do not behave as stuck-at faults, 84% of the cell defects, 99% of the bridges, and 99% of the opens have behaviors that agree with our analysis. In other words, the behaviors of a vast majority of defects of each type are attributed to the behavior categories corresponding to that defect type (Table I). There is however a small percentage of disagreement, which is due to misidentification of the relevant neighborhood [7]. The behavior attribution error for cell defects is much higher than bridge and open defects, since the complexity exhibited by cell misbehavior can be mimicked by some other defect behaviors. Fig. 4 shows the behavior-attribution results for 2533 failing arithmetic logic unit (ALU) test chips manufactured in 0.11 μm technology by LSI Corporation, Milpitas, CA. The ALU design consists of 2509 gates. Fig. 4(a) shows that a majority of the chip failures have misbehaviors that fall into Categories 2 and 7. Furthermore, these failing chips can be partitioned into four sets according to the time they were manufactured and tested. Fig. 4(b) shows that the occurrence rates of different behavior categories have changed slightly over time. The most significant change occurred for defects with cell misbehavior from March 2004 to July 2005. It should be noted, however, that the sample size here is too small to draw any conclusions about the significance of these changes. In other words, based on the number of chips involved in deriving Fig. 4(b), changes in behavior distribution cannot be determined with high confidence. Defect behaviors are likely to vary across different processes and different designs, which means Fig. 4 varies for different technologies. Thus, for a design manufactured using a different process, behavior attribution needs to be applied to derive the behavior distribution for that design-process combination. III. DPPM In this section, a model for estimating DPPM using the behavior-attribution results is described. The derivation of the Fig. 3. Behavior distributions of non-stuck-at-behaving defects. The percentages of cell, bridge, and open defects, exhibiting stuck-at behavior (Category 2) are 59%, 14%, and 97%, respectively. Fig. 4. Occurrence rates for different behavior categories for (a) 2533 actual chip failures from LSI Corporation and (b) same chip failures partitioned over the time of manufacture. model for estimating DPPM is given in Section III-A, and experiment results that validate the model are described in Section III-B. A. DL Model Many models for predicting DL have been proposed. For example, a model that predicts DL using multimodel fault coverage is presented in [24]. Their DL model is based on the assumption that defects of different types are equally likely to occur, which of course may not be true in reality. In [25], a YU AND BLANTON: DIAGNOSIS-ASSISTED ADAPTIVE TEST DL model that is based on the number of times each site is observed is described. The particular activation conditions of a given defect type are, however, not taken into account. It is quite likely that the values on the relevant neighbors of a defect that affects a site i are the same for many patterns where i is observed. In such cases, using a test set with a repeated number of observations does not necessarily lead to lower defect level especially for the static defects considered here. In addition, the model of [25] assumes that a defective chip passes the testing process and therefore is shipped if the erroneous effect of at least one defective site is not observed. However, in the context of multiple defects, a defective chip passes test only if no error from any of the defective sites is observed. So, the assumption in [25] can lead to an overestimation of defect level. To address the shortcomings just outlined, a DL model using an estimate of the defect-type distribution has been proposed in [26]. As shown in [7], the defect-type distribution for a design fabricated in a given manufacturing process can be estimated using the behavior distribution of each defect type (e.g., Fig. 3), which is derived from the behaviorattribution results of a sufficient number of PFAed2 defects that affect the design. The method of [7] can effectively estimate the defect-type distribution if the behavior distribution of each defect type: 1) remains static or changes slowly; and 2) varies from other defect types. PFA is an expensive and time-consuming process however. Moreover, there may be no or not enough PFA results to generate an accurate behavior distribution per defect type at the beginning of a product’s life time. Thus, it is desirable if DL can be estimated without PFA results. In the work of [26]–[29], a defective chip examined using diagnosis is classified into one of several defect types directly from the defect behavior extracted without the use of PFA. However, due to the ambiguity in mapping defect behavior to defect type, which is well illustrated by Table I and Fig. 3, such classification methods result in defect-type distributions that are very different from the actual as demonstrated in [7]. The DL model presented here does not depend on an accurate estimation of defect-type distribution. Instead, it uses the occurrence rate of each behavior category, which is derived through diagnosis and the behavior attribution method described in Section II-A. In the following, we define the DL model in terms of several probabilities. For each site i, the following probabilities are defined. Pi (defective) = Probability that site i is affected by a defect. Pi (detected) = Probability of detecting the defect that affects site i. Pi (behavior j) = Probability of a defect with behavior j affecting i, where behavior j is one of the categories in Table I. Pi (j detected) = Probability of detecting a defect of behavior j that affects site i. Pi (j activated|state k) = Probability of activating a defect of behavior j that affects a site i when the corresponding relevant neighborhood has state k. To make the problem more tractable, independence is assumed to exist among the defect activation and corresponding 2 Physical failure analysis (PFA) is the very time-consuming process of physically inspecting an IC for uncovering the root cause of failure. 1409 error-propagation conditions for defects affecting different locations. In other words, it is assumed that the detection of a defect at one site does not influence the detection of a defect at another site when there are multiple defects in the circuit. A chip C is shipped if: 1) C is defect-free; or 2) none of the defects affecting C is detected by the testing process. In other words, a chip C is shipped if for every site i, there is no defect that affects i, or if there is one, it is not detected. Thus, the probability of C being shipped, Pship , is computed as follows: Pship = (1 − Pi (defective) × Pi (detected)) i∈{allsites} = ⎛ ⎝1− ⎞ Pi (behaviorj)×Pi (j detected)⎠ j∈BEHVC i∈{allsites} (1) where BEHVC is the set of all behavior categories. Derivations for both Pi (defective) and Pi (behavior j) are described next. A defect of behavior j affecting a site i is detected if there is a test pattern t that activates the defect and sensitizes i (despite the possible existence of other defects) to some observable point. Therefore Pi (j detected) = 1−P(j is not activated for any test sensitizing i) =1− (1 − Pi (j activated|state k)) (2) k∈RNSij where RNSij consists of all the states for the relevant neighborhood for a defect of behavior j at site i for the test patterns where site i is sensitized. RNSij can easily be obtained by analyzing the applied test set. Assume the probability of a defect of behavior j affecting a site i is the same for all sites. Thus, the fraction of a failure population affected by a defect of behavior j, which can be derived using the behavior attribution method (Section II-A), j) equals Pi (behavior . Moreover, the theoretical yield, P (behavior j) j∈BEHVC i which is the percentage of the manufactured chips that are defect-free, can be expressed as (1 − Pi (behavior j)). (3) Yield = i∈{allsites} Thus j∈BEHVC Pi (behavior j) = 1 − √ i 1 Yield. (4) j∈BEHVC Therefore, Pi (behavior j) can be computed for all behavior categories at every site i for a certain value of yield. Using the equations for Pi (j detected) and Pi (behavior j), Pship can be derived for a given test set using (1). Since defect level is defined to be the ratio of the defective chips in the chips that pass all the tests, defect level can therefore be expressed as Pship − Yield . (5) DL = Pship 1410 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 TABLE IV Number of Failing LSI Test Chips and Their Associated Test Sets Tester Data Set 1 2 3 No. of Tests 234 239 238 No. of Failing Chips 849 911 773 B. Validation An evaluation experiment is conducted using the LSI test chips whose behavior-attribution results are shown in Fig. 4. The test chip design consists of 384 ALUs. Although each ALU is identical in logic and layout, the ALUs were partitioned into several groups, each with its own test set. The test-set size and number of failing ALUs detected for each group are given in Table IV. The first 197 test patterns are the same across all three test sets. The 197 test patterns are used in this experiment. Given a yield value, Pship and DL can be computed according to (1) and (5) using the behavior-attribution results shown in Fig. 4 (Section II-B). For simplicity, it is assumed for a defect of behavior j affecting a site i that for any state k ∈ RNSij , Pi (j activated|state k) = 1/2. In other words, each state of the relevant neighborhood for a defect of behavior j at a site i is assumed to be equally likely to activate the defect at 50%. However, this probability can be made more precise by using a defect density distribution and the layout information. The 197 tests that are common to each ALU are simulated to derive RNSij ∀i, j, which is the set of relevant neighborhood states for each behavior category, site pair (i, j). Next, Pi (j detected) is computed for every site i and every behavior category j using (2). The probability of a defect of a specific behavior category affecting a site is derived according to (4) using the yield value and the behaviorattribution results from Fig. 4 (Section II-B). Then, Pship is computed using Pi (j detected) and Pi (behavior j), and DL is derived using (5). Since the actual escape is unknown, the DL model cannot be directly validated. However, the number of chips failed for each test pattern is known from the fail data. On the other hand, the expected number of chips failed by a set of test patterns, Estfails , equals (1 − Pship ) × Total, where Total is the total number of chips tested. Thus, it is possible to validate the model for Pship . Since the model for Pship is a function of Yield, Estfails is a function of Yield and Total, both of which are not given by the fail data. Thus, to validate the model for Pship , a four-fold cross validation [30] is performed. In other words, the set of the chips failed by these 197 test patterns, G, is randomly partitioned into four groups of equal size. For each group of failing chips, Gi , the maximum likelihood estimate [30] for Yield and Total, Yieldi and Totali are derived from the chips in G − Gi . Then, Estfails (Yieldi , Totali ) is compared with the observed number of chips failed in Gi . Fig. 5 shows for each group in the four-fold cross validation, the estimated and the observed cumulative number of failing chips versus test pattern index, as well as the 95% confidence intervals of the estimations. Specifically, Fig. 5(a) compares the estimated and the observed cumulative number of failing chips in group 1. Similarly, Fig. 5(b)–(d) makes similar comparisons for groups 2, 3, and 4, respectively. As shown in Fig. 5, the estimated cumulative fallout count tracks the observed very well for each group, and the estimated values are well contained in the corresponding 95% confidence intervals. The average differences between the estimated and the observed values for the four groups are 1.73, 2.98, 1.54, and 1.89, respectively. This result means that the estimation of the Pship model tracks the observed number of failing chips very well. IV. Custom Test for Quality This section explores customizing the actual test patterns (but not increasing the number of tests) of the applied test set to match the failure characteristics for reducing DPPM. The use of a test-selection method [31], [32] for reducing DPPM is described in Section IV-A, the improvement in DL is predicted in Section IV-B, and the resulting test-quality improvement is validated in Section IV-C. A. Test Selection for Reducing DPPM According to the defect level model described in Section III-A, DL can be reduced if for every site i, the defect possibly affecting i is activated using additional relevant neighborhood states for test patterns that sensitize i. In other words, DL can be reduced by applying a new test that sensitizes the site i and activates some defect affecting i using an uncovered relevant neighborhood state (i.e., a relevant neighborhood state which has not been applied by any test where i has been sensitized). This observation agrees with the result of an experiment for a physically aware N-detect (PAN-detect) test set [33], in which each stuck-at fault is detected using at least N unique neighborhood states. In that experiment, the PAN-detect test set, as well as stuck-at, IDDQ, logic builtin self-test, and transition-fault tests, are applied to 99 718 IBM ASICs manufactured in 0.13 μm technology. The testingexperiment result shows that the PAN-detect test set uniquely fails three chips that passed the other test sets. This result implies that detecting a fault using more unique neighborhood states increases the defect detection probability. However, since the probability of a defect of a particular behavior category affecting a given site is different, which is demonstrated in Section II-B, a test that activates a certain number of uncovered relevant neighborhood states for one behavior category can be more preferable in reducing DPPM than a test that activates the same number of uncovered relevant neighborhood states for another behavior category. The implications of this observation are explored next. According to (5), defect level decreases as Pship decreases for a given yield. Recall that Pship can be computed using (1). Taking the log of both sides of the equation yields the expression log(Pship ) = log(1 − Pi (behavior j) i∈{allsites} j∈BEHVC ×Pi (j detected)). (6) YU AND BLANTON: DIAGNOSIS-ASSISTED ADAPTIVE TEST 1411 Fig. 5. Failing chips are randomly partitioned into four groups of equal size. Based on the parameters derived from one group (i.e., groups one, two, three, and four), the estimated cumulative numbers of failing chips detected for the chips in the remaining three groups are shown in (a), (b), (c), and (d), respectively. Therefore, the reduction in log(Pship ) that results from activating a defect of behavior j at a sensitized site i for test tk using an uncovered relevant neighborhood state sijk , δijk , can be computed using the following equation: 1 − j∈BEHVC Pi (behavior j) × Pi (j detected) δijk =abs log 1 − j∈BEHVC Pi (behavior j) × Pi (j detected) (7) where Pi (j detected) = 1 − (1 − Pi (j activated|state k)) (8) k∈RNSij and Pi (j detected) = 1 − (1 − Pi (j activated|state k)). Procedure 1 Test selection for reducing defect level 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: Initialize the selected test set, Tsel , to be empty Generate an N-detect test set, TN -det Tpool = TN -det ∪ original test set while the size of Tsel is less than the original test-set size do for each test tk ∈ Tpool do Weight(tk ) = 0 if tk ∈ / Tsel then for each defect behavior j and each sensitized site i in tk do if an uncovered relevant neighborhood state sijk is established then Weight(tk ) = Weight(tk )+decrease of log(Pship ) due to sijk end if end for end if end for Add the test tk with the largest Weight(tk ) to Tsel end while k∈RNSij +{sijk } (9) Again, RNSij is the set of all relevant neighborhood states for a behavior category j in the already-selected test patterns when i is sensitized. The total decrease in log(Pship ) due to applying a new test tk is the summation of the decrease in log(Pship ) due to the activation of every uncovered relevant neighborhood state (if any) of each defect behavior at every sensitized site i in tk . The test that leads to the largest decrease in log(Pship ) is the most preferable. Thus, tests are selected, one at a time, from a large test set in a way that greedily decreases log(Pship ). The pseudocode for test selection is given in Procedure 1. B. Improved Test Quality In this experiment, Procedure 1 is used to select a test set for the ALU that is of the same size as the original test set corresponding to the first test set shown in Table IV. The union of a 20-detect test set generated by the Cadence Tool Encounter Test [34] and the original test set is used as the test pool Tpool for selection. The decrease in log(Pship ) due 1412 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 TABLE V Comparison of Test Escapes Corresponding to the Original 1-Detect Test Set, the New Selected Test Set Tsel , and Tpool , for Various Defect Types for the ISCAS89 Benchmark Circuit s713 Test Set 1-det Tsel Tpool Fig. 6. DPPM comparison between the original, selected, and N-detect test sets for an assumed yield = 85% for the LSI test chips considered in Section III-B. to activating a defect of behavior category j at a sensitized site i using an uncovered relevant neighborhood state sijk is derived using (7). To compare the quality of the new selected test set with traditional N-detect test sets, several N-detect test sets are also generated using Encounter Test [34]. Fig. 6 shows the DPPM comparison between the selected test set, the original test set, and the traditional N-detect test sets assuming a yield value of 85%. Due to the randomness in the generation of the N-detect test sets [i.e., each N-detect test set is generated independently which means that the tests in the (m − 1)-detect set are not necessarily a subset of the subsequent m-detect test set], the DPPM for the conventional N-detect test set does not necessarily decrease monotonically with an increase in N. The selected test set achieves a lower DPPM with a smaller test-set size than traditional N-detect test sets that are not neighborhood aware [32], nor focused on the characteristics of the failure population. Fig. 6 also demonstrates that the selected test set improves test quality (i.e., reduces DPPM) without increasing the number of tests. Although the data shown in Fig. 6 is based on a yield value of 85%, the DPPM comparison results hold for other values as well. C. Simulation Validation The experiment in Section IV-B has shown that the test set selected using Procedure 1 is expected to achieve a lower DPPM value. To show that the number of test escapes can be reduced by applying Procedure 1, a simulation experiment using the full-scan version of the ISCAS89 benchmark s713 [20] is performed. A pool of defective circuits is generated and the corresponding circuit responses are determined using the method described in Section II-B for a 100% stuck-at fault detection test set (i.e., a 1-detect test set). The resulting defect population consists of 419 front-end cell defects, 1829 gross interconnect opens, 806 missing vias, and 512 twoline bridges. During simulation, these defects are detected and those that escape detection are determined. To mimic application in a production environment, the following steps are performed. For each defect type, 25% Test Set Set Size 48 48 155 Cell 27 24 24 No. of Test Escapes Missing via Net Open Bridge 39 66 37 20 37 37 20 37 36 Total 169 118 117 of the defective circuits affected by that type are randomly selected. The selected defective circuits, referred to as Dcurrent , constitute the current failing population. The remaining 75%, which are referred to as Dfuture , are assumed to be the defective chips that are fabricated in the future. The test set applied to Dcurrent is a 1-detect test set. The test pool (Tpool ) used for test selection is the union of a 50-detect test set generated by Encounter Test [34] and the original 1-detect test set. MD2 X, FBI, and behavior attribution are performed on circuits in Dcurrent that fail the 1-detect test set to derive the occurrence rate of each behavior category. Then, a test set Tsel of the same size as the original 1-detect test set is selected using Procedure 1 and the derived occurrence rate of each behavior category. Finally, for the defective circuits in Dfuture , circuitlevel simulation is used again to identify the test escapes, that is, the defective circuits in Dfuture that are not detected by Tsel . The test-set sizes and the number of defective circuits in Dfuture that escape each test set are given in Table V. The simulation results reveal that the resulting selected test set Tsel reduces the number of test escapes in Dfuture by 30% without increasing the number of tests. Moreover, the DPPM of the selected test set is only 0.09% higher than Tpool , even though the size of Tpool is more than three times the size of Tsel . V. Application Scenarios In this section, challenges related to the application of a DAT-based test set in an actual production environment are discussed. The impact of process variations on the application of DAT is discussed in Section V-A. In Section V-B, the number of tested ICs required to determine that a DAT test set improves test quality with high confidence is derived. Finally, various scenarios of DAT are described in Section V-C. A. Impact of Process Variations The discussion thus far in this paper has been on defects that are composed of extra or missing material within the manufactured ICs. Common defects include missing vias, interconnect opens, unintended connections between signal lines, and others. However, due to uncertainty in the manufacturing process, all manufactured ICs are also affected by process variations, which are deviations from the intended structural or electrical parameters of concern [35]. For example, transistor gate length and width can vary significantly for small feature sizes due to the variations in the exposure system [36]. Due YU AND BLANTON: DIAGNOSIS-ASSISTED ADAPTIVE TEST to reasons such as nonuniform gate oxidation temperature and channel dopant fluctuations, manufactured transistors have different threshold voltages. Moreover, due to copper loss in chemical mechanical polishing, metal thickness can vary up to 30%–40% of its nominal value in a 90 nm fabrication process [35]. Process variations can influence the behavior distribution resulting from behavior attribution (Section II) in the following two ways. First, for different process parameters, the behavior of a defect at the same location may change. For example, the logic value interpreted by a gate driven by a net affected by an open defect is determined by: 1) the coupling capacitance between the floating net and nets that are driven to the supply voltage; 2) the coupling capacitance between the floating net and nets that are driven to the ground voltage; 3) the trapped charge on the floating net; and 4) the threshold voltage of the gate(s) driven by the floating net. For different values of coupling capacitance, threshold voltage, and trapped charge resulting from different process parameters, the logic value of the floating net may be interpreted differently. Therefore, the behavior of the floating net may change. For example, for a certain threshold voltage, the voltage of the floating net may be interpreted as logic 0 for all test patterns, in which case the defective line has stuck-at-0 behavior (Category 2 in Table I). However, for a lower value of threshold voltage, the value of the floating net may be interpreted as logic 0 for some test patterns and logic 1 for the remaining test patterns depending on the values of its physical neighbors, which corresponds either to Category 6 or 7 in Table I. Similarly, among all behaviors a bridge defect can exhibit, which are given in Table I, the behavior of a line affected by a bridge defect may change from one category to another for different process parameters. For example, the drive current of a gate increases with decreasing effective channel length, which may cause the behavior of a two-line bridge defect to change from Category 5 (i.e., the value of the defect site depending on the driving strength) to Category 6 (i.e., the value of the defect site depending on one of its physical neighbors). Second, unacceptable process variations may behave as a classic spotbased “defect.” For example, when the threshold voltage of a transistor is so high that the transistor is never turned on, the behavior of the transistor is the same as when it is affected by a transistor stuck-open defect, which can manifest as stuck-at or cell misbehavior that corresponds to Category 2 or 3 in Table I. Since behavior distribution depends on process parameters, the probability of a failing IC having behavior j, P(j), can be computed using P(j) = P(j, p) dp = P(j|p) × P(p) dp (10) PR PR where p is a set of values for the process parameters that can influence the defect behaviors, PR is a region containing all possible values for the relevant process parameters, P(j, p) is the probability that a failing IC has process parameter values p and exhibits behavior j, P(j|p) is the probability of a failing IC exhibiting behavior j given that the IC has process parameter values p, and P(p) is the probability that a failing IC has 1413 process parameter values p. Therefore, a good estimation for P(j) requires the behavior-attribution results from a sufficient amount of failing ICs that can approximate both the actual distribution of relevant process parameters and the behavior distribution for a given set of process parameter values. In the diagnosis-assisted test, the assumption is that the behavior distribution of failing ICs is monitored over time. A new DAT test set is generated if the following two criteria are satisfied. First, the current behavior distribution derived with high confidence is different from the one used to generate the currently employed DAT test set. Second, for any behavior category considered, the nonoverlapping interval between the confidence intervals for the current and the original behavior distribution is more than the error resulting from MD2 X and FBI, which can be deduced using simulation experiments. In general, it is expected that a small number of failing ICs will lead to large confidence intervals which means, of course, that a larger change in behavior distribution has to be observed to trigger the generation of a new DAT test set. B. Sample Size for Validating Quality Improvement To confirm that a DAT test set improves product quality for actual manufactured ICs, the current and the DAT test sets should be applied to a large population of manufactured ICs. In the following, the number of ICs required to draw a highconfidence conclusion concerning DAT effectiveness based on defect level (DL) is derived. Suppose test evaluation is conducted on a population of m ICs, some of which are defective while others are defect-free. ˆ Crnt be the defect level resulting from the application Let DL of the current test set on the m ICs. In other words ˆ Crnt = P̂shipCrnt − Yield DL (11) P̂shipCrnt where P̂shipCrnt is the percentage of ICs in the population of m ICs that pass all the tests in the current test set and Yield is the percentage of good ICs in the population of m ICs tested. Thus Yield P̂shipCrnt = . (12) ˆ Crnt 1 − DL Similarly, P̂shipDAT , the percentage of ICs in the set of m ICs under test that pass the DAT test set, can be calculated using the following equation: Yield P̂shipDAT = (13) ˆ DAT 1 − DL ˆ DAT is the defect level resulting from the application where DL of the DAT test set to the same m ICs. Let q be the relative reduction in defect level resulting from the DAT test set over the current test set for the m ICs. In other words ˆ DAT = (1 − q) × DL ˆ Crnt . DL (14) Thus P̂shipCrnt −P̂shipDAT = Yield Yield − ˆ Crnt ˆ Crnt 1 − DL 1 − (1 − q) × DL ˆ Yield × q × DLCrnt = . ˆ Crnt ) × (1 − DL ˆ Crnt ) (1 − (1 − q) × DL (15) 1414 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 Fig. 7. Number of ICs required to conclude that a DAT test set is of higher quality for (a) confidence level of 95% and Yield = 85%, (b) confidence level of 90% and Yield = 85%, (c) confidence level of 99% and Yield = 85%, and (d) confidence level of 95% and Yield = 95%. The random variable Xi is defined, where Xi equals 1 if an IC i passes the current test set but is failed by the DAT test set, Xi equals −1 if an IC i is failed by the current test set but passes the DAT test set, and Xi equals 0 otherwise. Thus 1 Xi = P̂shipCrnt − P̂shipDAT . m i=1 m (16) According to Hoeffding’s inequality [37], ∀ > 0 − m 1 (1−(−1)2 ) i=1 P( (17) Xi − (PshipCrnt − PshipDAT ) ≥ ) ≤ e m i=1 m 22 m2 where PshipCrnt (PshipDAT ) is the probability of an IC passing all the tests in the current (DAT) test set. Setting to m1 m i=1 Xi leads to the following inequality: m 2 m 1 P(PshipCrnt − PshipDAT ≤ 0) ≤ e− 2 ( m i=1 Xi ) . (18) By analyzing (15), (16), and (18), the minimum m required to conclude PshipCrnt > PshipDAT (i.e., the DAT test set results in a reduction in defect level) with a confidence level of (1 − α) is derived to be ˆ Crnt )×(1−DL ˆ Crnt ) 2 (1−(1−q) ×DL m=− 2× × ln α. (19) ˆ Crnt Yield × q × DL Fig. 7 shows the sample size m (i.e., the number of ICs) required to conclude that a DAT-based test set achieves a reduced DL (as compared to the current set of tests) for various values of observed DL reduction resulting from a ˆ Crnt ). DAT test set (q) and DL of the current test set (DL ˆ Assuming Yield = 85%, for DLCrnt = 10 000 DPPM and q = 20%, the required sample sizes m are 1.5 million, 2 million, and 3.1 million, for a confidence level of 90%, 95%, and 99%, respectively. For a confidence level of 95% and Yield = 85%, ˆ Crnt = 30 000 DPPM the sample size reduces to 31 000 for DL and q = 50%. This sample size changes to 25 150 if Yield is 95%. Thus, as shown in Fig. 7, as well as in (19), the ˆ Crnt , or sample size m reduces with an increase in Yield, q, DL a reduction in confidence level. The result shows that a DAT test set can be applied with high confidence when a slight quality improvement is observed for a large number of tested ICs. For low volume products, however, a significant quality improvement has to be observed to conclude that a DAT test set improves produce quality. C. Application Scenarios There are two application scenarios of DAT based on behavior distribution changes. If failure characteristics change slowly over time, DAT can be applied over time to improve test YU AND BLANTON: DIAGNOSIS-ASSISTED ADAPTIVE TEST quality. Since the tester response of the failing chips and their lot numbers are normally stored and diagnosed, the failure characteristics can be tracked over time by analyzing the diagnosis results of failing chips using behavior attribution (Section II). To determine whether the change is slow enough, (19) and Fig. 7 can be used. In other words, if: 1) the sample size is large enough so that the change in fallout characteristics can be determined with high confidence; and 2) the time needed to customize the test set is smaller than the time for manufacturing/collecting the failing chips, then distribution change is considered sufficiently slow. Alternatively, if the failure characteristics remain static, then the test set can be one-time tuned to match the unchanging failure characteristics using DAT. Under either scenario, DAT can be used to minimize DPPM for a given constraint on test costs, or alternatively, ensure that DPPM does not exceed some pre-determined threshold. VI. Conclusion A DAT methodology that ensures a desired level of quality for chip-logic circuits on a per-design basis by changing test content to match diagnosed-derived failure characteristics was described. DAT began by partitioning the failure population of a manufactured design into several behavior categories using diagnosis results. Then, based on the occurrence rate of each behavior category, DPPM was estimated using a new and novel data-driven defect-level model. Also, using the resulting behavior distribution, the test set was changed using a testselection methodology to reduce DPPM. The efficacy of DAT was demonstrated using detailed circuit-level simulations. Specifically, defects of various types were injected into the layout of a benchmark circuit, extracted to create a circuit-level netlist that accurately reflects the presence of the defect, and then simulated to produce virtual test responses. Applying DAT to 25% of the virtual fail population showed that escapes from the remaining 75% could be reduced by 30%. Although not verifiable, application to actual chips also demonstrated quality improvement. In addition, a formulation was derived for computing the number of tested ICs required to confidently conclude that a DAT-based test set would reduce DPPM. Analysis of this formulation revealed that a DAT-based test set can be applied with high confidence when a slight quality improvement is observed for a large number of tested ICs. On the other hand, a DAT-based test set can also be applied to low volume products if a significant reduction in defect level is achievable. Finally, though data used in this paper is based on 0.09 μm, 0.11 μm, and 0.13 μm technologies, DAT can be applied to chips manufactured in more advanced technologies, and to other types of defects such as delay defects. References [1] L. M. Huisman, “Diagnosing arbitrary defects in logic designs using single location at a time (SLAT),” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 23, no. 1, pp. 91–101, Jan. 2004. [2] M. Keim, N. Tamarapalli, H. Tang, M. Sharma, J. Rajski, C. Schuermyer, and B. Benware, “A rapid yield learning flow based on production integrated layout-aware diagnosis,” in Proc. IEEE Int. Test Conf., Oct. 2006, pp. 1–10. 1415 [3] H. Tang, S. Manish, J. Rajski, M. Keim, and B. Benware, “Analyzing volume diagnosis results with statistical learning for yield improvement,” in Proc. IEEE Eur. Test Symp., May 2007, pp. 145–150. [4] B. Seshadri, I. Pomeranz, S. Venkataraman, M. E. Amyeen, and S. M. Reddy, “Dominance based analysis for large volume production fail diagnosis,” in Proc. IEEE VLSI Test Symp., Apr.–May 2006, pp. 394– 399. [5] J. Rajski, “Logic diagnosis and yield learning,” in Proc. IEEE Des. Diagnostics Electron. Circuits Syst., Apr. 2007, p. 19. [6] R. Desineni, O. Poku, and R. D. Blanton, “A logic diagnosis methodology for improved localization and extraction of accurate defect behavior,” in Proc. IEEE Int. Test Conf., Oct. 2006, pp. 1–10. [7] X. Yu and R. D. Blanton, “Estimating defect-type distributions through volume diagnosis and defect behavior attribution,” in Proc. IEEE Int. Test Conf., Nov. 2010, pp. 1–10. [8] International Technology Roadmap for Semiconductors, 2009th ed., Semiconductor Industry Association, Austin, TX, 2009. [9] R. Madge, B. Benware, R. Turakhia, R. Daasch, C. Schuermyer, and J. Ruffler, “In search of the optimum test set: Adaptive test methods for maximum defect coverage and lowest test cost,” in Proc. IEEE Int. Test Conf., Oct. 2004, pp. 203–212. [10] K. M. Butler and J. Saxena, “An empirical study on the effects of test type ordering on overall test efficiency,” in Proc. IEEE Int. Test Conf., Oct. 2000, pp. 408–416. [11] OptimalTest [Online]. Available: http://www.optimaltest.com [12] Pintail [Online]. Available: http://www.pintail.com [13] R. Daasch, K. Cota, J. McNames, and R. Madge, “Neighbor selection for variance reduction in IDDQ and other parametric data,” in Proc. IEEE Int. Test Conf., Oct. 2002, pp. 1240–1248. [14] M. Rehani, R. Madge, K. Cota, and R. Daasch, “Statistical postprocessing at wafersort: An alternative to burn-in and a manufacturable solution to test limit setting for sub-micron technologies,” in Proc. IEEE VLSI Test Symp., May 2002, pp. 69–74. [15] R. Madge, B. H. Goh, V. Rajagopalan, C. Macchietto, R. Daasch, C. Schuermyer, C. Taylor, and D. Turner, “Screening minVDD outliers using feed-forward voltage testing,” in Proc. IEEE Int. Test Conf., Oct. 2002, pp. 673–682. [16] W. Shindo, E. H. Wang, R. Akella, A. J. Strojwas, W. Tomlinson, and R. Bartholomew, “Effective excursion detection by defect type grouping in in-line inspection and classification,” IEEE Trans. Semicond. Manuf., vol. 12, no. 1, pp. 3–10, Feb. 1999. [17] X. Yu and R. D. Blanton, “Diagnosis of integrated circuits with multiple defects of arbitrary characteristics,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 29, no. 6, pp. 977–987, Jun. 2010. [18] Y. Sato, L. Yamazaki, H. Yamanaka, T. Ikeda, and M. Takakura, “A persistent diagnostic technique for unstable defects,” in Proc. IEEE Int. Test Conf., Oct. 2002, pp. 242–249. [19] F. Brglez and H. Fujiwara, “A neutral netlist of 10 combinational benchmark circuits and a target translator in fortran,” in Proc. IEEE Int. Symp. Circuits Syst., Jun. 1985, pp. 695–698. [20] F. Brglez, D. Bryan, and K. Dozminski, “Combinational profiles of sequential benchmark circuits,” in Proc. IEEE Int. Symp. Circuits Syst., 1989, pp. 1929–1934. [21] W. Tam, O. Poku, and R. D. Blanton, “Automated failure population creation for validating integrated circuit diagnosis methods,” in Proc. ACM/IEEE Des. Autom. Conf., 2009, pp. 708–713. [22] R. D. Blanton, J. T. Chen, R. Desineni, K. N. Dwarakanath, W. Maly, and T. J. Vogels, “Fault tuples in diagnosis of deep-submicron circuits,” in Proc. IEEE Int. Test Conf., 2002, pp. 233–241. [23] T. L. Quarles. (1989). “Analysis of performance and convergence issues for circuit simulation,” Ph.D. dissertation, Electr. Eng. Comput. Sci. Dept., Univ. California, Berkeley, CA [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/1989/1216.html [24] S. K. Lu, T. Y. Lee, and C. W. Wu, “Defect level prediction using multimodel fault coverage,” in Proc. IEEE Asian Test Symp., Nov. 1999, pp. 301–306. [25] M. R. Grimaila, S. Lee, J. Dworak, K. M. Butler, B. Stewart, H. Balachandran, B. Houchins, V. Mathur, J. Park, L. C. Wang, and M. R. Mercer, “REDO-random excitation and deterministic observation-first commercial experiment,” in Proc. IEEE VLSI Test Symp., Apr. 1999, pp. 268–274. [26] X. Yu, Y.-T. Lin, W.-C. Tam, O. Poku, and R. D. Blanton, “Controlling DPPM through volume diagnosis,” in Proc. IEEE VLSI Test Symp., May 2009, pp. 134–139. [27] D. B. Lavo, I. Hartanto, and T. Larrabee, “Multiplets, model, and the search for meaning: Improving per-test fault diagnosis,” in Proc. IEEE Int. Test Conf., Oct. 2002, pp. 250–259. 1416 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 9, SEPTEMBER 2012 [28] M. Sharma, W.-T. Cheng, T.-P. Tai, Y. Cheng, W. Hsu, C. Liu, S. M. Reddy, and A. Mann, “Faster defect localization in nanometer technology based on defective cell diagnosis,” in Proc. IEEE Int. Test Conf., Oct. 2007, pp. 1–10. [29] W. Tam, O. Poku, and R. D. Blanton, “Precise failure localization using automated layout analysis of diagnosis candidates,” in Proc. ACM/IEEE Des. Autom. Conf., Jun. 2008, pp. 367–372. [30] C. M. Bishop, Pattern Recognition and Machine Learning. Berlin, Germany: Springer, 2006. [31] S. Seshu and D. N. Freeman, “The diagnosis of asynchronous sequential switching systems,” IRE Trans. Electron. Comput., vol. 11, no. 4, pp. 459–465, Mar.–Apr. 1962. [32] Y.-T. Lin, O. Poku, N. K. Bhatti, and R. D. Blanton, “Physically-aware N-detect test pattern selection,” in Proc. IEEE Des. Autom. Test Eur., Mar. 2008, pp. 634–639. [33] Y.-T. Lin, O. Poku, R. D. Blanton, P. Nigh, P. Lloyd, and V. Iyengar, “Evaluating the effectiveness of physically-aware N-detect test using real silicon,” in Proc. IEEE Int. Test Conf., Oct. 2008, pp. 1–9. [34] Cadence [Online]. Available: http://www.cadence.com [35] X. Li, J. Le, and L. T. Pileggi, Statistical Performance Modeling and Optimization. Boston, MA/Delft, The Netherlands: NOW Publishers, 2007. [36] B. Stine, D. Boning, and J. Chung, “Inter and intra-die polysilicon critical dimension variation,” Proc. SPIE, vol. 2874, pp. 27–36, Oct. 1996. [37] L. Wasserman, All of Statistics: A Concise Course in Statistical Inference. Berlin, Germany: Springer, 2004. Xiaochun Yu (S’08–M’12) received the B.S. degree in microelectronics from Fudan University, Shanghai, China, in 2005, and the M.S. and Ph.D. degrees in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, PA, in 2008 and 2011, respectively. Since June 2011, she has been with Intel Corporation, Hillsboro, OR, where she is engaged in diagnosis and debug solutions. Her current research interests include multiple defect diagnosis, layoutaware diagnosis, defect-type distribution estimation, and adaptive testing. Ronald DeShawn Blanton (S’93–M’95–SM’03– F’09) received the B.S. degree in engineering from Calvin College, Grand Rapids, MI, in 1987, the M.S. degree in electrical engineering from the University of Arizona, Tucson, in 1989, and the Ph.D. degree in computer science and engineering from the University of Michigan, Ann Arbor, in 1995. He is currently a Professor with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, where he serves as the Director of the Center for Silicon System Implementation, an organization consisting of 22 faculty members and over 80 graduate students focusing on the design and manufacture of siliconbased systems. His current research interests include the test and diagnosis of integrated, heterogeneous systems, and design-, manufacture-, and test information extraction from tester measurement data.