BackSpace: Formal Analysis for Post-Silicon Debug

Flavio M. de Paula*, Marcel Gort*, Alan J. Hu*, Steve Wilton*, Jin Yang+
* University of British Columbia
+ Intel Corporation

Outline
- Motivation
- Current Practices
- BackSpace – The Intuition
- Proof-of-Concept Experimental Results
- (Recent Experiments)
- Conclusions and Future Work

Motivation
- Chip is back from fab!
- Screened out chips w/ manufacturing defects
- A bring-up procedure follows: run diagnostics w/o problems, everything looks fine!
- But the system becomes unresponsive while running the real application…
- Every single chip fails in the same way (1M DPM: functional bugs)
- What do we do now?

Current Practices
- Apply inputs, then scan out the buggy state
- But the cause is not obvious!!!
- Guess when to stop and single-step
- Problems: single-stepping interference; non-determinism; too early/late to stop?
Current Practices
- Leveraging additional debugging support:
  - Trace buffer of the internal state: provides only a narrow view of the design, e.g., program counter, address/data fetches
  - Record all I/O and replay: solves the non-determinism problem, but requires highly specialized bring-up systems
- Just having additional hardware does NOT solve the problem

A Better Solution: BackSpace
Goal:
- Avoid guesswork
- Avoid interfering with the system
- Run at speed
- Portable debug support
- Compute an accurate trace to the bug

A Better Solution: BackSpace
Requires:
- Hardware: existing test infrastructure and scan chains; breakpoint circuit; good signature scheme
- Software: efficient SAT solver; BackSpace Manager

A Better Solution: BackSpace
1. Run at speed until the chip hits the buggy state
2. Scan out the buggy state and the history of signatures
3. Off-chip formal analysis (formal engine)
4. Off-chip formal analysis: compute the pre-image
5. Pick a candidate state and load the breakpoint circuit
6. Run until the chip hits the breakpoint
7. Pick another state and run until the chip hits the breakpoint (computed trace of length 2); iterate
8. BackSpace trace

Proof-of-Concept Experimental Results
- Components: chip on silicon, BackSpace Manager, SAT solver
- For the proof of concept: logic simulator in place of the chip on silicon

Proof-of-Concept Experimental Results
Setup:
- OpenCores’ designs: 68HC05 (109 latches); oc8051 (702 latches)
- Run real applications

Proof-of-Concept Experimental Results
Can we find a signature that reduces the size of the pre-image?
- Experiment: select 10 arbitrary ‘crash’ states on 68HC05; try different signatures
[Figure: Signature Size vs. States in Pre-Image]

Proof-of-Concept Experimental Results
How far can we go back?
Experiment:
- Select arbitrary ‘crash’ states: 10 each for 68HC05 and oc8051
- Set limit to 500 cycles of backspace
- Set limit on size of pre-image to 300 states
- Compare the best two types of signature: hand-picked; universal hashing of the entire state
[Figure: 68HC05 w/ 38-Bit Manual Signature]
[Figure: 68HC05 w/ 38-Bit Universal Hashing]
[Figure: 8051 w/ 281-Bit Manual Signature]
[Figure: 8051 w/ 281-Bit Universal Hashing]

Proof-of-Concept Experimental Results
Results:
- Signature: universal hashing
- Small size of pre-images
- All 20 cases successfully BackSpaced to limit

Proof-of-Concept Experimental Results
- Breakpoint circuitry: 40-50% area overhead
- Signature computation: a naïve implementation of universal hashing results in 150% area overhead

Recent Experiments
- OpenRISC 1200: 32-bit RISC processor; Harvard micro-architecture; 5-stage integer pipeline; virtual memory support; total of 3k+ latches
- BackSpace implemented in HW/SW:
  - AMIRIX AP1000 FPGA board (provided by CMC); board mimics bring-up systems
  - Host PC: off-chip formal analysis

Recent Experiments
BackSpacing OpenRISC 1200:
- Running a simple software application
- Backspaced for hundreds of cycles
- Demonstrated robustness in the presence of nondeterminism

Conclusions & Future Work
- Introduced BackSpace: a new paradigm for post-silicon debug
- Demonstrated it works
- Main challenges: find hardware-friendly & SAT-friendly signatures; minimize breakpoint circuitry overhead

Dfn. BackSpaceable Design
1) Augmented Machine
Given [machine definition lost in extraction], where [symbol lost] is the set of states, define the signature generator as [definition lost in extraction], where [symbol lost] is the set of states. Construct an augmented machine M_A such that: [condition lost in extraction]

2) BackSpaceable State
A state (s', t') of the augmented state machine M_A is backspaceable if its pre-image projected onto 2^S is unique.

3) BackSpaceable Machine
An augmented machine M_A is backspaceable iff all reachable states are backspaceable. A state machine M is backspaceable iff it can be augmented into a state machine M_A for which all reachable states are backspaceable.

Crash State History Algorithm
Given state (s_0, t_0) of a backspaceable augmented state machine M_A, compute a finite sequence of states (s_0, t_0), (s_1, t_1), … as follows:
- Since M_A is backspaceable, let s_{i+1} be the unique pre-image state (on the state bits) of (s_i, t_i).
- Run M_A (possibly repeatedly) until it reaches a state (s_{i+1}, x).
- Let t_{i+1} = x.

Theorem (Correctness)
If started at a reachable state, the sequence of states computed by the preceding algorithm is the (reversed) suffix of a valid execution of M.

Theorem (Probabilistic Termination)
If the forward simulation is random, then with probability 1, the preceding algorithm will reach an initial state.
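
Illustration (not from the original slides)
The Crash State History Algorithm can be sketched on a toy machine. Everything in this sketch is an assumption made for illustration: the machine is a 4-bit shift register with a 1-bit input, the signature is a hand-picked single bit (the bit shifted out each cycle), the "chip" is re-run from reset with the same stimulus every time, and the names step, pre_image, run_until, crash_state and backspace are hypothetical. Because the signature is hand-picked, the pre-image is a simple bit operation here; in the flow described above it would be computed off-chip by a SAT solver.

```python
from typing import List, Optional, Tuple

WIDTH = 4                      # toy machine: a 4-bit shift register (assumption)
MASK = (1 << WIDTH) - 1

State = Tuple[int, int]        # (s, t): state bits and one signature bit


def step(s: int, inp: int) -> State:
    """One cycle of the augmented toy machine: returns (next s, next t)."""
    next_s = ((s << 1) & MASK) | (inp & 1)   # the top bit of s is shifted out
    next_t = (s >> (WIDTH - 1)) & 1          # the signature records the lost bit
    return next_s, next_t


def pre_image(s: int, t: int) -> int:
    """Unique predecessor, on the state bits, of the augmented state (s, t)."""
    return (s >> 1) | (t << (WIDTH - 1))


def run_until(stimulus: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Re-run from reset; stop at the first cycle whose state bits equal target.

    Returns (signature, cycle) at the breakpoint, or None if it never fires.
    """
    s, t = 0, 0
    for cycle in range(len(stimulus) + 1):
        if s == target:
            return t, cycle
        if cycle < len(stimulus):
            s, t = step(s, stimulus[cycle])
    return None


def crash_state(stimulus: List[int]) -> State:
    """Run the whole stimulus from reset and return the final (s, t)."""
    s, t = 0, 0
    for inp in stimulus:
        s, t = step(s, inp)
    return s, t


def backspace(stimulus: List[int], crash: State, max_steps: int) -> List[State]:
    """Compute (s_0, t_0), (s_1, t_1), ... going backwards from the crash state."""
    trace = [crash]
    s, t = crash
    for _ in range(max_steps):
        prev_s = pre_image(s, t)              # s_{i+1}: unique pre-image
        hit = run_until(stimulus, prev_s)     # run until (s_{i+1}, x) is reached
        if hit is None:                       # breakpoint never fired: give up
            break
        prev_t, cycle = hit                   # t_{i+1} = x
        trace.append((prev_s, prev_t))
        if cycle == 0:                        # breakpoint fired at reset: done
            break
        s, t = prev_s, prev_t
    return trace


if __name__ == "__main__":
    stimulus = [1, 0, 1, 1, 0, 0, 1, 0]
    crash = crash_state(stimulus)
    for s, t in backspace(stimulus, crash, max_steps=100):
        print(f"s={s:04b} t={t}")
```

In this sketch the breakpoint may fire at an earlier visit to a state than the one on the actual crash path, so the computed sequence is a valid execution of the toy machine but not necessarily the exact one that was run, which matches the wording of the correctness theorem.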