BackSpace: Formal Analysis for Post-Silicon Debug

Flavio M. de Paula*, Marcel Gort*, Alan J. Hu*, Steve Wilton*, Jin Yang+
* University of British Columbia
+ Intel Corporation
Outline






Motivation
Current Practices
BackSpace – The Intuition
Proof-of-Concept Experimental Results
(Recent Experiments)
Conclusions and Future Work
2
Motivation

Chip is back from fab!


Screened out chips w/ manufacturing defects
A bring-up procedure follows:


Run diagnostics w/o problems, everything looks fine!
But the system becomes unresponsive while running
the real application…


Every single chip fails in the same way (1M DPM: functional bugs)
What do we do now?
7
Current Practices
Inputs
Scan-out
buggy state
But the cause is not obvious!
9
Current Practices
Guess when to stop and single step
Inputs
Problems:
Single-stepping interference;
Non-determinism;
Too early/late to stop?
Non-buggy path
11
Current Practices

Leveraging additional debugging support:

Trace buffer of the internal state


Provides only a narrow view of the design, e.g.,
program counter, address/data fetches
Record all I/O and replay


Solves the non-determinism problem, but…
Requires highly specialized bring-up systems
Just having additional hardware
does NOT solve the problem
15
A Better Solution: BackSpace

Goal:





Avoid guesswork
Avoid interfering with the system
Run at speed
Portable debug support
Compute an accurate trace to the bug
16
A Better Solution: BackSpace

Requires:

Hardware:




Existing test infrastructure and scan-chains;
Breakpoint circuit;
Good signature scheme;
Software:


Efficient SAT solver;
BackSpace Manager
17
A Better Solution: BackSpace
Inputs
1. Run at speed until the chip hits the buggy state
Non-buggy path
18
A Better Solution: BackSpace
2. Scan out the buggy state and the history of signatures
3. Off-chip formal analysis (formal engine): compute the pre-image
4. Pick a candidate state and load it into the breakpoint circuit
5. Run until the chip hits the breakpoint
6. Pick another state and run until the chip hits the breakpoint
   (computed trace of length 2)
7. Iterate
8. BackSpace trace
32
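The off-chip analysis step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 4-bit shift register `step`, the 1-bit `signature`, and the brute-force enumeration in `pre_image`, which stands in for the SAT solver that the slides call for. It only shows how a recorded signature narrows the set of candidate predecessor states.

```python
from itertools import product

def step(state, inp):
    """Hypothetical 4-bit shift register: shift right, new input bit enters at the top."""
    return (state >> 1) | (inp << 3)

def signature(state):
    """Hypothetical 1-bit signature: the bit that the next shift will discard."""
    return state & 1

def pre_image(target, sig=None):
    """All predecessor states of `target`, optionally filtered by the recorded signature.

    Brute-force enumeration over states and inputs; a real flow would pose
    this query to a SAT solver instead.
    """
    preds = set()
    for state, inp in product(range(16), range(2)):
        if step(state, inp) != target:
            continue
        if sig is None or signature(state) == sig:
            preds.add(state)
    return sorted(preds)

print(pre_image(0b1010))         # candidates without a signature
print(pre_image(0b1010, sig=1))  # candidates consistent with the scanned-out signature
```

In this toy machine the unconstrained pre-image of 0b1010 has two candidates (4 and 5), and the signature leaves only state 5, so no second breakpoint run is needed.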
Outline






Motivation
Current Practices
BackSpace – The Intuition
Proof-of-Concept Experimental Results
Recent Experiments
Future Work
33
Proof-of-Concept
Experimental Results
Chip on Silicon
BackSpace
Manager
SAT Solver
34
Proof-of-Concept
Experimental Results
Logic Simulator
BackSpace
Manager
SAT Solver
35
Proof-of-Concept
Experimental Results

Setup:

OpenCores’ designs:



68HC05: 109 latches
oc8051 : 702 latches
Run real applications
36
Proof-of-Concept
Experimental Results


Can we find a signature that reduces the size
of the pre-image?
Experiment:


Select 10 arbitrary ‘crash’ states on 68HC05;
Try different signatures
37
Signature Size vs.
States in Pre-Image
38
Proof-of-Concept
Experimental Results


How far can we go back?
Experiment:

Select arbitrary ‘crash’ states:




10 each for 68HC05 and oc8051;
Set limit to 500 cycles of backspace;
Set limit on size of pre-image to 300 states;
Compare the best two types of signature;


Hand-picked
Universal Hashing of entire state
39
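The slides do not say which universal hash family was used. As a sketch under that caveat, the following assumes the classic random-binary-matrix family, where each signature bit is the XOR (parity) of a randomly chosen subset of state bits. The name `make_h3_hash` and the `seed` parameter are hypothetical.

```python
import random

def make_h3_hash(n_state_bits, n_sig_bits, seed=0):
    """Build a signature function from a random binary matrix (assumed hash family).

    Each row selects a random subset of state bits; the matching signature bit
    is the parity of the selected bits.
    """
    rng = random.Random(seed)
    rows = [rng.getrandbits(n_state_bits) for _ in range(n_sig_bits)]

    def h(state):
        sig = 0
        for row in rows:
            parity = bin(row & state).count("1") & 1
            sig = (sig << 1) | parity
        return sig

    return h

# Sizes taken from the experiments: 68HC05 has 109 latches, signature is 38 bits.
h = make_h3_hash(109, 38)
print(hex(h(0x1234567)))
```

Because every signature bit is an XOR of state bits, this kind of signature maps to XOR trees in hardware, which is one way to read the area overhead reported for the naïve implementation.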
68HC05 w/ 38-Bit Manual Signature
40
68HC05 w/ 38-Bit Manual Signature
41
68HC05 w/ 38-Bit Universal Hashing
42
8051 w/ 281-Bit Manual Signature
43
8051 w/ 281-Bit Universal Hashing
44
Proof-of-Concept
Experimental Results

Results



Signature: Universal Hashing
Small size of pre-images
All 20 cases successfully BackSpaced to limit
45
Proof-of-Concept
Experimental Results

Breakpoint Circuitry


40-50% area overhead.
Signature Computation

A naïve implementation of universal hashing results in
150% area overhead.
46
Recent Experiments

OpenRISC 1200:






32-bit RISC processor;
Harvard micro-architecture;
5-stage integer pipeline;
Virtual memory support;
Total of 3k+ latches
BackSpace implemented in HW/SW



AMIRIX AP1000 FPGA board (provided by CMC)
Board mimics bring-up systems
Host-PC: off-chip formal analysis
47
Recent Experiments

BackSpacing OpenRISC 1200:



Running a simple software application
BackSpaced for hundreds of cycles
Demonstrated robustness in the presence of
nondeterminism
48
Conclusions & Future Work

Introduced BackSpace: a new paradigm for
post-silicon debug
Demonstrated it works

Main challenges:



Find hardware-friendly & SAT-friendly signatures
Minimize breakpoint circuitry overhead
49
50
Dfn. BackSpaceable Design
1) Augmented Machine

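Given [formula lost in extraction], where [symbol lost] is the set of states,
define the signature generator as [formula lost in extraction],
where [symbol lost] is the set of states, [formula lost in extraction].
Construct an augmented machine MA such that: [formula lost in extraction]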

51
Dfn. BackSpaceable Design
2) BackSpaceable State

A state (s',t') of an augmented state machine MA is
backspaceable if its pre-image projected onto 2^S
is unique.
52
Dfn. BackSpaceable Design
3) BackSpaceable Machine

An augmented machine MA is backspaceable iff
all reachable states are backspaceable. A state
machine M is backspaceable iff it can be
augmented into a state machine MA for which all
reachable states are backspaceable.
53
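The definitions of a backspaceable state and machine can be checked mechanically on a small example. The sketch below is an illustration only: the 4-bit shift register `step`, the initial state `(0, 0)`, and the assumption that the signature depends only on the previous state bits are all hypothetical, and exhaustive enumeration replaces any formal engine.

```python
from itertools import product

def step(state, inp):
    """Hypothetical 4-bit shift register: shift right, input bit enters at the top."""
    return (state >> 1) | (inp << 3)

def augmented_step(aug_state, inp, sig_fn):
    """Augmented machine: next state bits plus a signature of the current state bits."""
    s, _t = aug_state
    return (step(s, inp), sig_fn(s))

def projected_pre_image(aug_target, sig_fn):
    """State bits s such that some (s, t) steps to aug_target under some input."""
    s_next, t_next = aug_target
    return {s for s, inp in product(range(16), range(2))
            if step(s, inp) == s_next and sig_fn(s) == t_next}

def is_backspaceable_state(aug_target, sig_fn):
    """A state is backspaceable if its projected pre-image is a single state."""
    return len(projected_pre_image(aug_target, sig_fn)) == 1

def reachable(sig_fn, init=(0, 0)):
    """All augmented states reachable from the assumed initial state."""
    seen, frontier = {init}, [init]
    while frontier:
        cur = frontier.pop()
        for inp in range(2):
            nxt = augmented_step(cur, inp, sig_fn)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def is_backspaceable_machine(sig_fn):
    """The augmented machine is backspaceable iff every reachable state is."""
    return all(is_backspaceable_state(a, sig_fn) for a in reachable(sig_fn))

print(is_backspaceable_machine(lambda s: s & 1))  # signature keeps the discarded bit
print(is_backspaceable_machine(lambda s: 0))      # constant signature carries no information
```

With the signature that records the discarded bit, every reachable state has exactly one predecessor on the state bits. With a constant signature, each state has two, so the machine is not backspaceable.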
Crash State History Algorithm

Given state (s_0,t_0) of a backspaceable
augmented state machine MA, compute a
finite sequence of states (s_0,t_0), (s_1,t_1), … as
follows:


Since MA is backspaceable, let s_{i+1} be the unique
pre-image state (on the state bits) of (s_i,t_i).
Run MA (possibly repeatedly) until it reaches a
state (s_{i+1},x). Let t_{i+1} = x.
54
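A minimal sketch of this algorithm follows, under several assumptions that are not in the slides: the chip is replaced by a simulated 4-bit shift register driven by random inputs, the signature is the bit discarded by each shift, the initial state is `(0, 0)`, and the loop stops after a fixed number of steps `n_steps` (in the spirit of the cycle limit used in the experiments). All function names are hypothetical.

```python
import random

def step(state, inp):
    """Hypothetical 4-bit shift register: shift right, input bit enters at the top."""
    return (state >> 1) | (inp << 3)

def signature(state):
    """Hypothetical 1-bit signature: the bit that the next shift will discard."""
    return state & 1

def unique_pre_image(s, t):
    """Predecessor state bits of (s, t); unique because the toy machine is backspaceable."""
    preds = {p for p in range(16) for inp in range(2)
             if step(p, inp) == s and signature(p) == t}
    assert len(preds) == 1
    return preds.pop()

def run_until(breakpoint_state, rng, init=(0, 0), max_cycles=10_000):
    """Run the simulated chip from reset with random inputs until the state bits
    match the breakpoint; return the signature seen there, or None on timeout."""
    s, t = init
    for _ in range(max_cycles):
        if s == breakpoint_state:
            return t
        s, t = step(s, rng.getrandbits(1)), signature(s)
    return None

def backspace(crash, n_steps, rng):
    """Compute (s_0,t_0), (s_1,t_1), ... going backwards from the crash state."""
    trace = [crash]
    s, t = crash
    for _ in range(n_steps):
        s = unique_pre_image(s, t)
        x = None
        while x is None:          # "possibly repeatedly"
            x = run_until(s, rng)
        t = x
        trace.append((s, t))
    return trace

print(backspace((0b1010, 1), 5, random.Random(1)))
```

Each consecutive pair in the returned list is a valid transition of the toy machine when read in reverse, which mirrors the correctness claim on the next slide. The signatures collected along the way depend on the random run, so the recovered trace is a valid execution, not necessarily the one that originally led to the crash.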
Theorem (Correctness)

If started at a reachable state, the sequence
of states computed by the preceding
algorithm is the (reversed) suffix of a valid
execution of M.
55
Theorem
(Probabilistic Termination)

If the forward simulation is random, then with
probability 1, the preceding algorithm will
reach an initial state.
56