Performance Evaluation of Two Allocation Schemes for Combinatorial Group Testing Fault Isolation

Rawad N. Al-Haddad, Carthik A. Sharma, Ronald F. DeMara
University of Central Florida

Agenda
• Overview of Group Testing Algorithms
• Overview of Fault Handling Techniques
• Multi-stage Adaptive Group Testing
• Equal Share Allocation Scheme
• Interleaved Allocation Scheme
• Performance Comparison of Allocation Strategies

Group Testing Algorithms
• Origin – World War II blood testing
  - Problem: Test samples from millions of new recruits
  - Solution: Test blocks of samples before testing individual samples
• Problem Definition
  - Identify subset Q of defectives from set P
  - Minimize number of tests
  - Test v-subsets of P
  - Form suitable blocks

Fault-Handling Techniques
[Figure: taxonomy of fault-handling techniques by device failure characteristics. Duration: Transient (SEU) or Permanent (SEL, Oxide Breakdown, Electron Migration, LPD). Target: Device Configuration or Processing Datapath. Stages: Detection, Isolation, Diagnosis, Recovery. Labels appearing in the figure, in extraction order: Repetitive Readback, Majority Vote, Invert Bit Value, CGT-Based, STARS, CED, Dueling, Supplementary Testbench, Duplex Output Comparison, Cartesian Intersection, Worst-case Clock Period Dilation, TMR, Bitwise Comparison, BIST Methods, Ignore Discrepancy, Replicate in Spare Resource, Fast Run-time Location, Repetitive Intersections, unnecessary, Select Spare Resource, Evolutionary Algorithm using Intrinsic Fitness Evaluation.]

Isolation Problem Outline
• Objectives
  - Locate faulty logic and/or interconnect resource: a single stuck-at fault model is assumed
  - Online Fault Isolation: device not entirely removed from service
• Two Schemes
  - Equal Share: Suspect resources are divided into equal subsets, and each subset is assigned to one individual in the population. Each suspect resource is guaranteed to be covered by at least one individual.
  - Interleaved: Suspect subsets are shared among individuals. Coverage Factor (CF) determines the minimum number of individuals (≥ 1) which utilize each
    resource in the suspect pool

Equal Share Allocation
• Allocation Strategy
  - Suspect pool of N LUTs
  - Population of R individuals
  - Each individual gets M suspect resources, where M = N/R
  - Maximal possible gain, if the fault is articulated by the test vectors, is a factor of R (from N suspect resources to M)
  - Minimal possible testing phase gain: no gain at all if the fault is not articulated
[Figure: pool of N LUTs split into R subsets of M LUTs each, one subset per individual (Ind 1, Ind 2, Ind 3, Ind 4, ..., Ind R)]

Experiments
• Experimental Setup
  - DES-56 encryption circuit
  - Xilinx ISE design tools to place and route the design
  - Virtex II Pro FPGA device
  - Fault Injection and Analysis Toolkit (FIAT)
    - Application Programmer Interfaces (APIs) to interact with the Xilinx ISE tools to inject and evaluate faults
    - Editing the design file rather than the configuration bitstreams to introduce stuck-at faults
    - Editing User Constraint Files (UCF) to control resource usage

Equal Share Results
[Figure: Results of three CGT experiments with different population sizes (15, 20, and 25 individuals). One panel: total number of runs for each group count (x-axis: Groups; y-axis: Number of Runs). Other panel: number of test vectors required in each run (x-axis: Runs; y-axis: Test vectors).]

              Isolation results   Number of groups                     Required
Population    Success    Fail     3    4    5    6    Mean   SD       test vectors   Discrepancies
15            17         3        0    13   6    1    4.35   0.587    247.4          3.7
20            17         3        14   6    0    0    3.3    0.470    311.9          2.55
25            17         3        14   6    0    0    3.3    0.470    525.3          2.6

Interleaved Allocation
• Allocation Scheme
  - Each LUT in the suspect pool is utilized by more than one individual in the population
  - Implies "interleaving" of individuals over each LUT
  - Interleaving degree is decided by the Coverage Factor
  - Coverage Factor (CF): number of individuals utilizing each resource in the suspect pool
  - Example: CF = 2 means that each suspected LUT is covered by two different individuals
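As a minimal sketch of the two allocation strategies, the Python below divides a pool of N suspect LUTs among R individuals. The function name `allocate`, the zero-based indices, and the cyclic rule that gives subset s to individuals s, s+1, ..., s+CF-1 (mod R) are assumptions made for illustration; the slides do not state how subsets are paired with individuals. With CF = 1 the sketch reduces to Equal Share.

```python
from collections import Counter


def allocate(num_luts, num_individuals, coverage_factor=1):
    """Return {individual: [LUT indices]} for a suspect pool.

    The pool is cut into R contiguous subsets of M = N/R LUTs. Subset s is
    handed to `coverage_factor` individuals, chosen cyclically (an assumed
    pairing rule, not taken from the slides).
    """
    if not 1 <= coverage_factor <= num_individuals:
        raise ValueError("coverage_factor must be between 1 and R")
    if num_luts % num_individuals:
        raise ValueError("this sketch assumes N is a multiple of R")

    m = num_luts // num_individuals  # M = N/R suspect LUTs per subset
    allocation = {ind: [] for ind in range(num_individuals)}
    for subset in range(num_individuals):
        luts = range(subset * m, (subset + 1) * m)
        for k in range(coverage_factor):
            owner = (subset + k) % num_individuals
            allocation[owner].extend(luts)
    return allocation


# N = 20 suspect LUTs, R = 5 individuals (illustrative numbers).
equal_share = allocate(20, 5)                      # each individual holds M = 4 LUTs
interleaved = allocate(20, 5, coverage_factor=2)   # each individual holds 2M = 8 LUTs

# Count how many individuals cover each LUT in the interleaved case.
coverage = Counter(lut for luts in interleaved.values() for lut in luts)
```

In this example a discrepancy observed on one Equal Share individual narrows the suspects from 20 LUTs to 4, while a discrepancy on one interleaved individual narrows them to 8, in exchange for every LUT being exercised by two individuals.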

Interleaved Allocation Scheme
[Figure: Interleaved Allocation scheme with CF = 2. N LUTs divided into subsets S1 to S5 of M LUTs each, shared among Ind 1 to Ind 5.]
• N LUTs divided into R subgroups of M LUTs each, where M = N/R
• Each individual utilizes 2M LUTs
• A discrepancy will reduce the number of suspects to 2M rather than M
• However, (100/CF)% less chance of unarticulated faults

Two-Pass Algorithm
• Pass One
  - Reduce suspect list from N to CF·N/R, where CF is the coverage factor
  - Isolation granularity gain is reduced when CF is increased
  - Terminated once the first discrepant output is observed
• Pass Two
  - Reduce suspect list from CF·N/R to N/R (same gain as Equal Share)
  - A new data structure, the Interleaved Individuals Set (IIS), is introduced to expedite the process

Interleaved Individuals Set
• Purpose: Keep track of the interleaved individuals in a specific CGT configuration
• Example:

  Individual    Interleaved with
  Ind 1         Ind 3, Ind 4
  Ind 2         Ind 4, Ind 5
  Ind 3         Ind 5, Ind 1
  Ind 4         Ind 1, Ind 2
  Ind 5         Ind 2, Ind 3

[Figure: Interleaved Allocation scheme with CF = 2 (subsets S1 to S5 of M LUTs each, Ind 1 to Ind 5)]
• In pass two, the individuals interleaving with the one that articulated the fault in pass one are tested

Conclusion
• Equal Share
  - Best Case: Suspect list reduced from N to N/R
  - Worst Case: Zero gain (unarticulated fault)
  - One pass only
• Interleaved
  - Best Case: Suspect list reduced from N to N/R
  - Performed in two passes (N → CF·N/R → N/R)
  - IIS minimizes overhead in pass two
  - Worst Case: Zero gain also, but less likely to occur than in the Equal Share scheme (because of interleaving)

References
• Sharma, C. A. and DeMara, R. F. (2006), "A Combinatorial Group Testing Method for FPGA Fault Location," in Proceedings of the International Conference on Advances in Computer Science and Technology (ACST 2006), Puerto Vallarta, Mexico.
• Du, D. and Hwang, F. K. (2000), "Combinatorial Group Testing and its Applications," Series on Applied Mathematics, vol. 12, World Scientific.
• Sharma, C. A. (2007), "FPGA Fault Injection and Analysis Toolkit (FIAT)."
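
Backup: Two-Pass Sketch
The two-pass procedure and the Interleaved Individuals Set can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the names `build_iis` and `two_pass`, the subset-to-individual mapping `SUBSET_OWNERS` (chosen so that the derived IIS reproduces the five-individual, CF = 2 example), and the `is_discrepant` callback standing in for applying test vectors to an individual are all assumptions. The demo also assumes every individual that uses the faulty subset articulates the fault.

```python
def build_iis(subset_owners):
    """Interleaved Individuals Set: individual -> individuals sharing a subset with it."""
    iis = {}
    for owners in subset_owners.values():
        for ind in owners:
            iis.setdefault(ind, set()).update(o for o in owners if o != ind)
    return iis


def two_pass(subset_owners, is_discrepant):
    """Return the suspect subsets left after both passes, or None if no gain."""
    individuals = sorted({i for owners in subset_owners.values() for i in owners})

    # Pass one: stop at the first discrepant individual (N -> CF*N/R suspects).
    first = next((i for i in individuals if is_discrepant(i)), None)
    if first is None:
        return None  # fault not articulated: zero gain
    suspects = {s for s, owners in subset_owners.items() if first in owners}

    # Pass two: test only the individuals interleaved with `first`.
    iis = build_iis(subset_owners)
    for other in sorted(iis[first]):
        if is_discrepant(other):
            # Keep only the subsets both individuals use (CF*N/R -> N/R for CF = 2).
            return {s for s in suspects if other in subset_owners[s]}
    return suspects  # no further narrowing was possible


# Hypothetical CF = 2 assignment for five subsets and five individuals.
SUBSET_OWNERS = {"S1": (1, 3), "S2": (2, 4), "S3": (3, 5), "S4": (4, 1), "S5": (5, 2)}

# Simulated fault in subset S3: an individual is discrepant if it uses S3.
faulty_subset = "S3"
result = two_pass(SUBSET_OWNERS, lambda ind: ind in SUBSET_OWNERS[faulty_subset])
```

In this run, pass one stops at Ind 3 and leaves S1 and S3 as suspects; pass two consults only Ind 1 and Ind 5, and the discrepancy on Ind 5 leaves S3 alone.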