
Using Implications for Online Error Detection
Nuno Alves, Jennifer Dworak, R. Iris Bahar
Division of Engineering, Brown University, Providence, RI 02912
Kundan Nepal
Electrical Engineering Dept., Bucknell University, Lewisburg, PA 17837
NATW 2008
Online error detection
• Purpose: detect transient faults that may occur in a circuit during operation
• Critical as circuits scale to smaller sizes
• “Easy” in memory logic
• Not so easy in circuit logic
Common online detection techniques
1. Storing pre-computed test vectors in hardware
2. Duplicating the computation in disjoint hardware elements and voting on the result
3. Using check bits
Our approach
• Find invariant relationships in a circuit
• Violations of these expected relationships can
identify errors
Error detection implementation
Invariant relationships in circuits
[Figure: example circuit with internal nodes n1 to n8]
• Example invariant: n5=1 ⇒ n8=0
• These relationships are logic implications
Error detection with implications
[Figure: example circuit (nodes n1 to n8) with checker logic driving an ERROR signal]
• Implication checked: n5=1 ⇒ n8=0
• n5=1 & n8=1 will generate an error in the checker logic
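A minimal sketch of the checker for this single implication, assuming node values are available as 0/1 integers. The function name `checker_error` is hypothetical; the slide only states that n5=1 & n8=1 raises the error.

```python
def checker_error(n5: int, n8: int) -> int:
    """Return 1 when the implication n5=1 => n8=0 is violated, else 0."""
    # The implication fails only when the antecedent holds (n5 == 1)
    # and the consequent does not (n8 == 1), so the checker is an AND.
    return n5 & n8


# Print the checker output for all four (n5, n8) combinations.
for n5 in (0, 1):
    for n8 in (0, 1):
        print(n5, n8, checker_error(n5, n8))
```

Only the (1, 1) row produces an error, so in this sketch the checker for one implication costs a single two-input gate.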
How we find implications
[Flow diagram with the following boxes]
• Verilog Description
• Logic Simulation
• Find Implications
• Collect Logic Values At Each Site
• Validate Implications
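The flow can be illustrated with a small sketch. It assumes the steps run in the order simulate, collect values, find candidates, validate. The three-input toy circuit, the function names, and the exhaustive validation step are all hypothetical stand-ins: the slides do not say how validation is performed, and exhaustive enumeration only works for tiny circuits.

```python
import itertools
import random


def simulate(a, b, c):
    """Toy circuit (hypothetical): n1 = a AND b, n2 = n1 OR c, n3 = NOT n2."""
    n1 = a & b
    n2 = n1 | c
    n3 = 1 - n2
    return {"a": a, "b": b, "c": c, "n1": n1, "n2": n2, "n3": n3}


def candidate_implications(samples):
    """Find (x, vx, y, vy) such that y == vy in every sample where x == vx."""
    found = []
    for x, y in itertools.permutations(list(samples[0]), 2):
        for vx in (0, 1):
            seen = {s[y] for s in samples if s[x] == vx}
            if len(seen) == 1:  # x=vx was observed and y never varied
                found.append((x, vx, y, seen.pop()))
    return found


def validate(implication):
    """Check a candidate against every input vector (toy-sized circuits only)."""
    x, vx, y, vy = implication
    for vector in itertools.product((0, 1), repeat=3):
        values = simulate(*vector)
        if values[x] == vx and values[y] != vy:
            return False
    return True


# Random simulation proposes candidates; validation discards the ones that
# only held by chance on the sampled vectors.
rng = random.Random(0)
samples = [simulate(*(rng.randint(0, 1) for _ in range(3))) for _ in range(20)]
implications = [i for i in candidate_implications(samples) if validate(i)]
print(len(implications), "validated implications")
```

Random simulation can only suggest implications, since a relationship that held on every sampled vector may still fail on an unsampled one. That is why a separate validation step is needed.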
We have implications. Now what?
[Flow diagram with the following boxes]
• Remove Redundant Implications
• Select Useful Implications
• Pick Best Implications For Given HW Overhead
Why should we remove implications?
• With all implications in hand, we could generate checker logic for every one of them.
• Inefficient!
▫ A circuit can contain thousands of implications
▫ Generating separate checker logic for each implication could more than double circuit size.
• We want to check only the “most important” implications.
Removing redundant implications
[Figure: example circuit with nodes n1 to n13]
i1: n3=0 ⇒ n8=0
i2: n4=1 ⇒ n12=0
i3: n4=1 ⇒ n8=0
i4: n12=0 ⇒ n8=0
i5: n4=1 ⇒ n13=0
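One way to read this example is that an implication is redundant when it follows from chaining others. The sketch below assumes that reading; the slides do not state the exact redundancy criterion, nor which member of a chain is dropped. The function name `derivable_without` is hypothetical.

```python
# Implications from the slide as (antecedent, consequent) pairs of (node, value).
IMPLICATIONS = {
    "i1": (("n3", 0), ("n8", 0)),
    "i2": (("n4", 1), ("n12", 0)),
    "i3": (("n4", 1), ("n8", 0)),
    "i4": (("n12", 0), ("n8", 0)),
    "i5": (("n4", 1), ("n13", 0)),
}


def derivable_without(name, implications):
    """True if implication `name` follows by chaining the other implications."""
    start, goal = implications[name]
    edges = [pair for key, pair in implications.items() if key != name]
    reached, frontier = {start}, [start]
    while frontier:  # reachability search over the remaining implications
        current = frontier.pop()
        for src, dst in edges:
            if src == current and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return goal in reached


chained = [name for name in IMPLICATIONS if derivable_without(name, IMPLICATIONS)]
print(chained)  # ['i3']
```

Here i3 (n4=1 ⇒ n8=0) is flagged because it follows from i2 (n4=1 ⇒ n12=0) and i4 (n12=0 ⇒ n8=0). The sketch only flags such relationships; deciding which implication to keep is a separate design choice.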
Removing low coverage implications
• We only want implications that:
▫ Detect many faults
▫ Identify hard-to-detect faults
▫ Cover faults not detected by other implications
• Finding these important implications requires:
▫ fault analysis to determine the specific fault
coverage for each implication
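Once fault analysis has produced a set of covered faults for each implication, the three criteria above can be approximated with a greedy selection. This is a sketch under stated assumptions, not the authors' method: the slides do not name a selection algorithm, the coverage data and names (`imp_a`, `f1`, ...) are made up, and the budget is simplified to a count of implications rather than an area figure.

```python
def greedy_select(coverage, budget):
    """Repeatedly pick the implication that covers the most uncovered faults."""
    selected, covered = [], set()
    remaining = dict(coverage)
    while remaining and len(selected) < budget:
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        gain = remaining.pop(best) - covered
        if not gain:  # nothing left adds new fault coverage
            break
        selected.append(best)
        covered |= gain
    return selected, covered


# Hypothetical fault-analysis result: implication -> set of faults it detects.
coverage = {
    "imp_a": {"f1", "f2", "f3"},
    "imp_b": {"f2", "f3"},
    "imp_c": {"f4"},
    "imp_d": {"f1"},
}

selected, covered = greedy_select(coverage, budget=3)
print(selected)         # ['imp_a', 'imp_c']
print(sorted(covered))  # ['f1', 'f2', 'f3', 'f4']
```

In this toy data `imp_b` and `imp_d` are never chosen because every fault they detect is already covered by `imp_a`, which mirrors the "cover faults not detected by other implications" criterion.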
Reducing the number of implications
[Figure: bar chart (0% to 100%) dividing each benchmark circuit's implications into redundant, low coverage, and high-quality implications. Benchmarks: c1355, c499, c432, c1908, misex2, b12, Z9sym, clip, Z5xp1, rd73]
Covering faults with implications
• For each random input vector, and at each fault,
the implications-based circuit operation can fall
into the following 4 categories:
                              Case 1   Case 2   Case 3   Case 4
Error Propagates To Output    yes      no       no       yes
An Implication is Violated    yes      yes      no       no
Average distribution of the 4 scenarios
[Figure: bar chart of the percentage (0 to 70%) of each case per benchmark circuit. Benchmarks: misex2, c1908, c432, c499, c1355, b12, rd73, Z5xp1, clip, Z9sym]
Case 1: Error Propagated & Implication Violated
Case 2: Error NOT Propagated & Implication Violated
Case 3: Error NOT Propagated & Implication NOT Violated
Case 4: Error Propagated & Implication NOT Violated
How often do we detect errors?
Case1/[Case1+Case4]
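The ratio above counts only errors that reach an output (Cases 1 and 4) and asks what fraction of them also violated an implication. A minimal sketch, with hypothetical function names and made-up outcome data:

```python
from collections import Counter


def classify(propagated: bool, violated: bool) -> int:
    """Map one (fault, input vector) outcome to Case 1-4 as defined above."""
    if propagated and violated:
        return 1
    if not propagated and violated:
        return 2
    if not propagated and not violated:
        return 3
    return 4  # propagated but no implication violated: undetected error


def detection_rate(counts) -> float:
    """Case1 / (Case1 + Case4): share of observable errors that are detected."""
    observable = counts[1] + counts[4]
    return counts[1] / observable if observable else 0.0


# Hypothetical outcomes as (error propagated, implication violated) pairs.
outcomes = [(True, True), (True, True), (True, True),
            (True, False), (False, True), (False, False)]
counts = Counter(classify(p, v) for p, v in outcomes)
print(detection_rate(counts))  # 0.75
```

Cases 2 and 3 do not enter the ratio because the error never reaches an output in either case.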
Implications with fixed HW budgets
• Given a fixed HW budget, by how much can we
reduce the probability of an undetected error?
[Figure: bar chart, y-axis 0% to 20%, with series labeled 10%, 30%, and 50% for each benchmark circuit. Benchmarks: b12, misex2, rd73, Z5xp1, clip, Z9sym, c499, c432, c1908]
Conclusions
• Practical online error detection alternative based
on implication validation
• No modification of targeted logic
• Checker logic is added off the critical path and runs in parallel with the rest of the circuit.
• For several circuits, we can detect almost 90% of all errors that propagate to a primary output.
• With only a 10% area overhead, the probability of an error being both observable and undetected is reduced to 11% on average.