Design for Test of Digital Systems, TDDC33
Erik Larsson, Department of Computer Science

Course Outline
• Introduction: manufacturing, wafer sort, final test, board and system test, defects, and faults
• Test generation: combinational and sequential test generation
• Design-for-Test techniques: test point insertion, scan, enhanced scan
• Test data compression; Built-In Self-Test: logic BIST and memory BIST
• System chip test: test architectures, test planning, test scheduling, and power constraints
• System test and Boundary Scan

System-on-Chip
[Figure: die photo]
Viper 2.0 RevB, Analog/Digital TV Processor
• 10 mm x 10 mm (100 mm2)
• ~10 M gates
• ~50 M transistors
• ~100 clock domains

System-on-Chip
[Figure: block diagram with CPU, UDL, DSP, SRAM, DRAM, ROM, and PCI blocks]
• Processor cores: ARM, MIPS, PowerPC
• Memories: SRAM, ROM, Flash, DRAM
• DSP cores
• Peripherals: DMA controllers, MMU
• Interface: PCI, USB, UART
• Multimedia: JPEG compression, MPEG decoder
• Networking: Ethernet controller

Modular Test Design
Test quality
• Different parts (logic, memory, analog, RF) need different test methods
Black-boxed embedded core
• Implementation is not known; the integrator is forced to use tests developed by the provider
Divide-and-conquer
• Very large SOCs are intractable for ATPG/FSim tools
• Modular test approach allows concurrent development/engineering
Test reuse
• Module will be reused in other designs

Challenges
Distributed design and test development
• Standardized set of deliverables
Test access to embedded modules
• Standardized on-chip test access hardware
• Tools for test translation
Chip-level test optimization
• Tools to evaluate trade-offs: minimal impact on design (extra silicon, delay) while minimizing test application time and ATE memory requirement

Scan Test Application
Core1: scan chain 0 (20 FFs), scan chain 1 (20 FFs); test vectors: 10
Core2: scan chain 0 (10 FFs), scan chain 1 (10 FFs); test vectors: 20

Non-modular Alternative
Scan chain 0 (20 FFs + 10 FFs), scan chain 1 (20 FFs + 10 FFs)
Test vectors: 10 and 20
Non-modular alternative:
Test time = (20+10+1)*20 + (20+10) = 650
Capture: Max(10,20)

Modular Alternative
Core 1: scan chain 0 (20 FFs), scan chain 1 (20 FFs); test vectors: 10
Test time = (20+1)*10 + 20 = 230
Core 2: scan chain 0 (10 FFs), scan chain 1 (10 FFs); test vectors: 20
Test time = (10+1)*20 + 10 = 230
Total test time: 460

Scan Test Application
Core1: scan chain 0 (20 FFs), scan chain 1 (20 FFs); test vectors: 10
Core2: scan chain 0 (10 FFs), scan chain 1 (10 FFs); test vectors: 20
Scan in + capture + scan out/scan in + capture + scan out + ... + capture + scan out -> (sc+1)*p + sc
(sc = scan chain length, p = number of test patterns)
Non-modular alternative: Test time = (20+10+1)*20 + (10+20) = 650
Modular alternative:
• Core1: sc=20, p=10 -> (20+1)*10 + 20 = 230
• Core2: sc=10, p=20 -> (10+1)*20 + 10 = 230
• Total = Time(Core1) + Time(Core2) = 460 (about 30% cut)

Generic Test Access Architecture
[Figure: SoC with CPU, UDL, DSP, SRAM, DRAM, ROM, and PCI; a source feeds a TAM to the wrapped module under test (MUT), and a second TAM leads to the sink]
Test pattern source and sink
• Store/generate test stimuli and store/evaluate test responses
Test Access Mechanism (TAM)
• Transports test patterns to/from module under test (MUT)
Test wrapper
• Provides test access to MUT
• Isolates MUT at test

Test Planning
Objectives: optimizing test access to cores and scheduling tests
[Figure: flow covering test software planning and test hardware planning]
• Core test import
• Core import
• Core integration
• Test wrapper & TAM design
• Top-level ATPG (glue logic, soft cores; test wrappers)
• Test scheduling
• Test assembly

IEEE 1500 Core Test Standard
Goals
• Define a core test interface between an embedded core and the SOC
• Core isolation and protection
• Plug-and-play protocols
Scope
• Standardize core isolation protocols and test modes
• TAM design
• Type of test to be applied
• Test scheduling

Test Wrapper
Test wrapper
• Interface between the module and the rest of the chip; it makes it possible to access the core and to isolate the core from the rest of the system.
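The scan test time arithmetic from the Scan Test Application slides can be checked with a short sketch. The function names (`scan_test_time`, `non_modular_time`, `modular_time`) and the tuple layout are illustrative assumptions, not part of the slides. The sketch assumes, as the slide arithmetic does, that in the non-modular case the chain lengths of the cores add up and the pattern count is the maximum over the cores.

```python
def scan_test_time(sc, p):
    """Cycles for p patterns on a scan chain of length sc: (sc + 1) * p + sc.

    Each pattern costs sc shift cycles plus one capture cycle; scan-out of one
    response overlaps scan-in of the next pattern, and the last response needs
    sc extra cycles to shift out.
    """
    return (sc + 1) * p + sc


# Each core is (longest scan chain length in FFs, number of test patterns).
CORE1 = (20, 10)
CORE2 = (10, 20)


def non_modular_time(cores):
    """All cores tested as one unit: chains concatenated, max pattern count."""
    sc = sum(length for length, _ in cores)
    p = max(patterns for _, patterns in cores)
    return scan_test_time(sc, p)


def modular_time(cores):
    """Cores tested one after another, each with its own pattern count."""
    return sum(scan_test_time(length, patterns) for length, patterns in cores)


if __name__ == "__main__":
    flat = non_modular_time([CORE1, CORE2])
    modular = modular_time([CORE1, CORE2])
    print(flat, modular, round(1 - modular / flat, 2))
```

For the two cores on the slide this gives 650 and 460 cycles, a reduction of roughly 29%, in line with the "about 30%" figure.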
Test modes
• Normal: functional mode
• InTest: test of the module itself
• ExTest: test of interconnection between cores
IEEE 1500 Standard for Embedded Core Test

Test Wrapper
[Figure: core inside a wrapper with WPI, WPC, WPO, WSI, WSC, and WSO terminals]
WSP: Wrapper Serial Port (mandatory)
• WSI: Wrapper Serial Input
• WSC: Wrapper Serial Control
• WSO: Wrapper Serial Output
WPP: Wrapper Parallel Port (optional)
• WPI: Wrapper Parallel Input
• WPC: Wrapper Parallel Control
• WPO: Wrapper Parallel Output

Test Wrapper
[Figure: core with functional inputs and outputs, surrounded by wrapper boundary registers (WBR) built from wrapper boundary cells (WBC), with WBY, WIR, Test enable, WSI, WSC, and WSO]
• CTI: Cell Test Input
• CTO: Cell Test Output
• CFI: Cell Functional Input
• CFO: Cell Functional Output
• FI: Functional Input
• FO: Functional Output
• WFI: Wrapper Functional Input
• WFO: Wrapper Functional Output
• WBC: Wrapper Boundary Cell
• WBR: Wrapper Boundary Register
• WBY: Wrapper Bypass Register
• WIR: Wrapper Instruction Register
• WSC: WRCK, WRST, SelectWIR, ShiftWR, CaptureWR, UpdateWR
• User-defined WPP = WPI + WPO + WPC

Wrapper Boundary Cell
[Figure: input cell and output cell, each with a flip-flop (FF) clocked by Clk and controlled by shift and wci]
• Input cell: functional path from chip to core; test path from WSI/WPI to WSO/WPO
• Output cell: functional path from core to chip; test path from WSI/WPI to WSO/WPO

Test Wrapper: Functional Operation
Test wrapper is in functional mode; hence the test wrapper is transparent (invisible)
[Figure: wrapper with WBR, WBY, WIR, Test enable, WSI, WSO, and WSC; WFI and WFO pass through to FI and FO of the core]

Test Wrapper: WS_Bypass
Test data (test stimuli and test responses) are bypassed.
[Figure: wrapper in WS_Bypass with the wrapper boundary registers in normal mode; labels: WBR, WBY, WIR, Test enable, WSI, WSO, WSC]

Test Wrapper: WS_EXTEST
Wrapper cells are programmed to control the WFO and observe (capture) the WFI
[Figure: wrapped core surrounded by UDL blocks; labels: WBR, WBY, WIR, Test enable, WSI, WSO, WSC]

Test Wrapper: WS_INTEST
Wrapper cells are programmed to control FI and observe (capture) the FO of the core
[Figure: wrapped core; labels: WBR, WBY, WIR, Test enable, WSI, WSO, WSC]

Multiplexed TAM
[Figure: system chip with CoreA, CoreB, and CoreC, each with WPP and WSP; the TAM connects to WPI, WPC, and WPO; serial path from WSI to WSO]

Direct Access TAM
[Figure: system chip with CoreA, CoreB, and CoreC, each with WPP and WSP; the TAM provides separate WPI, WPC, and WPO connections per core; serial path from WSI to WSO]

Daisy-Chained TAM
[Figure: system chip with CoreA, CoreB, and CoreC, each with WPP and WSP; the TAM connects to WPI, WPC, and WPO; serial path from WSI to WSO]

Architecture Design
[Figure: test schedules drawn as ATE channels versus test time for the Multiplexed TAM, the Direct Access TAM, and the Daisy-Chained TAM with CoreA, CoreB, and CoreC; the daisy-chained schedule has the segments CoreA+CoreB+CoreC, B+C, and B]

Architecture Design
Multiplexed TAM + Direct Access TAM: flexible architecture
[Figure: CoreA, CoreB, and CoreC distributed over Test bus 1 and Test bus 2]

Problem
For a given SoC:
• form wrapper chains out of the scan chains and the wrapper cells at every core,
• connect the wrapper chains to TAMs, and
• assign a time for testing each core,
such that the total test time is minimized.

Architecture Design
[Figure: SoC with cores A, B, C, D, E, Mem 1, Mem 2, Logic 1, Logic 2, and CPU, connected to TAM 1, TAM 2, and TAM 3]

Wrapper Design
Core1: scan chains 0 to 3 (100 FFs each), SI[0:3], SO[0:3], SE; p = 10
Test time (T) = (sc+1)*p + sc
• Wrapper chains of 200 FFs (two scan chains each): T = (200+1)*10 + 200 = 2210
• One wrapper chain of 400 FFs (all four scan chains): T = (400+1)*10 + 400 = 4410

Wrapper Design
[Figure axis: longest wrapper scan chain]
Minimize length of longest wrapper scan in/out chain
Minimize number of wrapper scan chains
[Figure axis: TAM width]

Test Wrapper Optimization
Priority 1: Balanced wrapper scan chains
[Figure: a core with scan chains of 4 FFs and 8 FFs, once with an unbalanced wrapper and once with a balanced wrapper]
Minimize length of longest wrapper scan in/out chain

Reducing TAM Width
Priority 2: Minimize wrapper scan chains created
[Figure: a core with scan chains of 32 FFs, 8 FFs, 8 FFs, and 8 FFs plus input (I) and output (O) cells, once with 4 wrapper scan chains and once with 2 wrapper scan chains]

Core To TAM Assignment
[Figure: the cores A, B, C, D, E, Mem 1, Mem 2, Logic 1, Logic 2, and CPU of the SoC assigned to TAM 1, TAM 2, and TAM 3]

Test Bus Architecture
[Figure: architecture with cores A to F and the resulting serial schedule]
• Combination of multiplexing and distribution
• Supports only serial schedule
• Core-external testing is cumbersome or impossible

Test Scheduling
[Figure, three animation steps: the ATE applies the tests for A, B, C, D, E, Mem1, Logic1, Logic2, and CPU to the SoC with cores A, B, C, D, E, Mem 1, Mem 2, Logic 1, Logic 2, and CPU]

Problem
For a given SoC:
• form wrapper chains out of the scan chains and the wrapper cells at every core,
• connect the wrapper chains to TAMs, and
• assign a time for testing each core,
such that the total (or expected) test time is minimized.
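The two wrapper design priorities (first balance the wrapper chains, then use as few wrapper chains as possible) can be illustrated with a small sketch. It uses the longest-processing-time (LPT) greedy heuristic, which is a stand-in chosen here and not necessarily the algorithm behind the slides. It also ignores the wrapper input and output cells and only partitions the internal scan chains; all function names are hypothetical.

```python
import heapq


def scan_test_time(sc, p):
    """Test time T = (sc + 1) * p + sc from the Wrapper Design slide."""
    return (sc + 1) * p + sc


def design_wrapper_chains(scan_chains, w):
    """Partition scan chain lengths into w wrapper chains (LPT heuristic).

    Longest scan chain first, each placed on the currently shortest wrapper
    chain. Assumption: wrapper boundary cells are not modeled.
    """
    heap = [(0, i) for i in range(w)]  # (current length, wrapper chain index)
    heapq.heapify(heap)
    chains = [[] for _ in range(w)]
    for length in sorted(scan_chains, reverse=True):
        total, i = heapq.heappop(heap)
        chains[i].append(length)
        heapq.heappush(heap, (total + length, i))
    return chains


def longest_chain(chains):
    """Length of the longest wrapper chain; this determines the test time."""
    return max(sum(chain) for chain in chains)


def min_wrapper_chains(scan_chains, max_w):
    """Priority 2: fewest wrapper chains that keep the longest chain as short
    as it is when all max_w TAM wires are used."""
    best = longest_chain(design_wrapper_chains(scan_chains, max_w))
    for w in range(1, max_w + 1):
        if longest_chain(design_wrapper_chains(scan_chains, w)) == best:
            return w
    return max_w


if __name__ == "__main__":
    core1 = [100, 100, 100, 100]  # Core1 from the Wrapper Design slide, p = 10
    for w in (1, 2, 4):
        sc = longest_chain(design_wrapper_chains(core1, w))
        print(w, sc, scan_test_time(sc, 10))
    print(min_wrapper_chains([32, 8, 8, 8], 4))
```

With Core1 from the Wrapper Design slide, one wrapper chain gives 4410 cycles and two give 2210, matching the slide. For the 32/8/8/8 core of the Reducing TAM Width slide, two wrapper chains already reach a longest chain of 32 FFs, so the extra two TAM wires bring no benefit under these assumptions.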
Test Scheduling
[Figure, several animation steps: the ATE applies the tests for A, B, C, D, E, Mem1, Logic1, Logic2, and CPU to the SoC and compares the produced response with the expected response; a fault at module E shows up as a mismatch]

Abort-on-fail testing
[Figure: ATE stimuli, produced response, and expected response for the SoC with cores A, B, C, D, E, Mem1, Mem2, Logic 1, Logic 2, and CPU]
Over a large set of ICs, minimize overall test time

Abort-on-fail testing
[Figure: produced response compared with expected response; fault at module E]

Abort-on-fail testing
[Figure: test time needed to determine a possible fault in module E, without abort-on-fail and with abort-on-fail]

Abort-on-fail testing
Idea
• Spend less time on faulty circuits
• If the test fails, it is aborted early
• Low-yielding and short tests should be performed early
Problem
• Find a test schedule that minimizes the expected test time.
Assumptions
• Abort-on-fail: when a fault occurs, testing terminates.
• Defect probability for each testable unit is given.
• Sequential testing and concurrent testing.
References
• Erik Larsson, Julien Pouget, and Zebo Peng, Abort-on-Fail Based Test Scheduling, Journal of Electronic Testing: Theory and Applications (JETTA), Vol. 21, No. 6, Dec. 2005, pages 651-658
• Urban Ingelsson, Sandeep Goel, Erik Larsson, and Erik Jan Marinissen, Test Scheduling for Modular SOCs in an Abort-on-Fail Environment, European Test Symposium (ETS'05), 2005, pages 8-13

Example
[Figure: SoC with Core1, Core2, Core3, and Core4]
Core   Test time
1      2
2      4
3      3
4      6
Test time: 15.
Sequential testing
[Figure: two sequential schedules over time of Core1 ($\tau_1 = 2$), Core2 ($\tau_2 = 4$), Core3 ($\tau_3 = 3$), and Core4 ($\tau_4 = 6$)]
Without abort-on-fail: 15

Core   Test time   Pass probability
1      2           0.7
2      4           0.8
3      3           0.9
4      6           0.95
[Figure: schedule of the four cores in another order] Expected test time: 13.6
[Figure: schedule Core1, Core2, Core3, Core4] Expected test time: 9.5

Sequential testing
[Figure: schedule Core1, Core2, Core3, Core4 with $p_1 = 0.7$; a fault at $\tau_1 = 2$ aborts testing]
At time point $\tau_1$:
• Probability to pass: $p_1 = 0.7$. Weighted probability to pass: $\tau_1 p_1 = 2 \times 0.7 = 1.4$
• Probability to fail: $1 - p_1 = 0.3$. Weighted probability to fail: $\tau_1 (1 - p_1) = 2 \times 0.3 = 0.6$

Sequential testing
[Figure: schedule Core1, Core2, Core3, Core4 with $p_1 = 0.7$ and $p_2 = 0.8$]
At time point $\tau_1 + \tau_2$:
• Probability to pass: pass the test at core 1 and pass the test at core 2.
• Probability to fail: pass the test at core 1 and fail the test at core 2.

Sequential testing
[Figure: schedule Core1, Core2, Core3, Core4 with $p_1 = 0.7$, $p_2 = 0.8$, $p_3 = 0.9$, $p_4 = 0.95$]
• At $\tau_1$ (fail): $\tau_1 (1 - p_1) = 0.6$
• At $\tau_1 + \tau_2$ (fail): $(\tau_1 + \tau_2)\, p_1 (1 - p_2) = 0.84$
• At $\tau_1 + \tau_2 + \tau_3$ (fail): $(\tau_1 + \tau_2 + \tau_3)\, p_1 p_2 (1 - p_3) = 0.50$
• At $\tau_1 + \tau_2 + \tau_3 + \tau_4$ (fail): $(\tau_1 + \tau_2 + \tau_3 + \tau_4)\, p_1 p_2 p_3 (1 - p_4) = 0.38$
• At $\tau_1 + \tau_2 + \tau_3 + \tau_4$ (pass): $(\tau_1 + \tau_2 + \tau_3 + \tau_4)\, p_1 p_2 p_3 p_4 = 7.2$
Total expected time: $7.2 + 0.38 + 0.50 + 0.84 + 0.6 \approx 9.5$

Concurrent testing
[Figure: Core1, Core2, Core3, and Core4 scheduled over time on TAM 1, TAM 2, and TAM 3; $\tau_{11}$ marks the first time point]
Probability to pass at time point $\tau_{11}$ (= 2):
• Core 1: $p_1 = 0.7$, denoted $p_{11}$
• Core 2: $p_2^{k}$ with $k = \tau_{11}/\tau_2$, that is $0.8^{2/4}$, denoted $p_{21}$
• Core 3: $p_3^{k}$ with $k = \tau_{11}/\tau_3$, that is $0.9^{2/3}$, denoted $p_{31}$
Probability to fail at time point $\tau_{11}$ (= 2), weighted by time: $\tau_{11} (1 - p_{11})\, p_{21}\, p_{31}$

Constraints to consider
Power consumption
• An SOC is designed according to functional power consumption
• In testing, switch as many “sites” as possible in order to test as much as possible in a short time
• Power consumed during testing is higher than and different from “functional”-mode power
• Risk: burning an IC or getting wrong results

SoC Test Planning including Test Data Compression

IC Test
[Figure: the ATE sends the test stimulus to the SOC and receives the produced response]

Power-Aware Test Approaches
• Design SOC to handle test power consumption
• Design SOC with test power reducing techniques
• SOC test planning to handle test power consumption
Test planning is a low-cost alternative
to:
• Explore ordering of tests to lower the test application time
• Guide the search for bottlenecks where design-for-low-power techniques are to be included or (over)design for test power is needed

Test Data Compression
[Figure: IC/ASIC with a decoder fed by W TAM wires, m scan chains (sc1, sc2, sc3, ..., scm), and a compactor driving W wires]
• W TAM wires are expanded to m scan chains (m >> W)
• Reduces test time and test data volume

Test Data Compression for Core-based SOCs
[Figure: SOC with CORE1, CORE2, CORE3, and CORE4 sharing one decoder (W to m) and one compactor]
Major drawback
• A high number of TAM wires (m) are routed to all cores

Test Data Compression for Core-based SOCs
[Figure: SOC with CORE1 to CORE4 on test buses of widths w1 and w2; each core expands its bus to m1, m2, m3, or m4 wrapper chains]
• TAM wires (W) are partitioned into test buses
• Cores are connected to test buses
• Few TAM wires (W) are routed to the cores
• Decoder design at core-level

Analysis of Compression Techniques
[Figure: two plots against the No. of TAM wires (w), one showing test time (clock cycles) and one showing test-data volume (Mbits), each with curves for Selective Encoding, Vector Repeat, and Selective Encoding & Vector Repeat]

Analysis of Compression Techniques
At core-level, define
• number of TAM wires (w),
• number of wrapper-chains (mi),
• compression technique, and decoder
such that the core’s test time and test-data volume are minimized.
At SOC-level, define
• number of test buses,
• width (wj) of test buses,
• core’s assignment to test buses, and
• compression technique, and decoder
such that the SOC’s test time and test-data volume are minimized.
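Returning to the sequential abort-on-fail example earlier in the slides, the expected test time of a schedule can be computed with a short sketch. It assumes, as the slide derivation does, that a failing core is detected only when its test completes and that pass probabilities are independent. The names `CORES`, `expected_test_time`, and `best_order` are illustrative, and the exhaustive search is only practical for a handful of cores.

```python
from itertools import permutations

# Core id -> (test time tau, pass probability p), taken from the example table.
CORES = {1: (2, 0.7), 2: (4, 0.8), 3: (3, 0.9), 4: (6, 0.95)}


def expected_test_time(order, cores=CORES):
    """Expected sequential test time with abort-on-fail for a given core order.

    Sums (elapsed time) x (probability that testing stops at that point) over
    every possible abort point, plus the case where all tests pass.
    """
    expected = 0.0
    elapsed = 0.0
    reach = 1.0  # probability that all earlier tests passed
    for core in order:
        tau, p = cores[core]
        elapsed += tau
        expected += elapsed * reach * (1.0 - p)  # abort after this core fails
        reach *= p
    return expected + elapsed * reach  # every core passed


def best_order(cores=CORES):
    """Brute-force search over all orders (assumption: small core count)."""
    return min(permutations(cores), key=lambda o: expected_test_time(o, cores))


if __name__ == "__main__":
    print(round(expected_test_time((1, 2, 3, 4)), 3))
    print(round(expected_test_time((4, 3, 2, 1)), 3))
    print(best_order())
```

Under this model the order Core1, Core2, Core3, Core4 gives about 9.5, as on the slide. The reverse order gives about 13.6, which matches the slide's other figure, although the extracted text does not show which order the slide actually used. The exhaustive search also selects the order Core1, Core2, Core3, Core4, consistent with the guideline that low-yielding and short tests should be performed early.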