Multi-Cycle Implications

Detecting Errors Using Multi-Cycle
Invariance Information
Nuno Alves, Jennifer Dworak,
and R. Iris Bahar
Division of Engineering
Brown University
Providence, RI 02912
Kundan Nepal
Electrical Engineering Dept.
Bucknell University
Lewisburg, PA 17837
Design, Automation, and Test in Europe, April 20-24, 2009
Motivation
Errors in ICs are increasing
– Particle strikes, temperature, power, noise,
process variations, test escapes, etc.
Previously, we have proposed using logic
implications for online error detection during a
single clock cycle
What happens if we consider
implications across time cycles?
Outline
Introduction & Background
Logic Implications for Error Detection
Multi-Cycle Implications
Experimental Results
Conclusions
Other Work
Triple Modular Redundancy
Logic Duplication
Re-Execution in Multiple Threads
Codes (Parity, Berger, Bose-Lin, etc.)
High Level Fault Assertions
Fault Masking
Checking the Outputs Against a Subset of
the Truth Table
Our Approach
Find natural expected relationships and
check for their violation.
Water should be blue... not brown.
In circuits, expected relationships at the gate level
consist of logic implications.
Implications Naturally Occur in Circuits
[Figure: gate-level circuit with nodes n1 through n8, illustrating the implication below.]
n5 = 1 → n8 = 0
Implication Violations Can Be Used
to Detect Errors
Appropriate checker logic can detect multiple
errors with a single implication.
[Figure: the circuit with nodes n1 through n8 and checker logic that raises ERROR when the implication n5 = 1 → n8 = 0 is violated.]
[Figure: the same circuit and checker with four stuck-at-1 (sa1) fault sites marked; the single implication n5 = 1 → n8 = 0 raises ERROR.]
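As a rough illustration of the checker idea above, the following Python sketch models a checker for the implication n5 = 1 → n8 = 0. The function name `implication_violated` is hypothetical, and the sketch assumes only that the checker flags an error exactly when the antecedent holds and the consequent does not.

```python
def implication_violated(n5: int, n8: int) -> bool:
    """Return True when n5 = 1 -> n8 = 0 is violated (n5 = 1 and n8 = 1)."""
    return n5 == 1 and n8 == 1


# Enumerate all value pairs; only n5 = 1, n8 = 1 raises the error flag.
for n5 in (0, 1):
    for n8 in (0, 1):
        print(n5, n8, "ERROR" if implication_violated(n5, n8) else "ok")
```

For this particular implication the check reduces to a logical AND of the two monitored nodes, which is why one checker can flag any fault that drives the circuit into that forbidden combination.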
[Figure: Total Number of Implications With Distance 2 or Greater. Chart of Number of Implications (log scale, 10 to 100000) per Circuit.]
So….what’s the problem?
We have too many implications!
How do we efficiently find them
and which ones should we use?
Implication Algorithm
Gate-level implications can be found automatically, without functional knowledge of the circuit.
Start
– Identify potential implications w/ simulation
– Verify implications
– Eliminate subsumed implications
– Determine coverage of remaining implications
– Select best subset for target error detection and overhead
End
What determines which faults an
implication may cover?
Potential Spatial Fault Coverage
Each implication can only cover a limited area of the
circuit….
[Figure: three arrangements of implication sites P and Q, with example implications P=0 → Q=0, Q=0 → P=0, and P=1 → Q=1.]
– Direct Path: faults along the path may be detected
– Reconvergent Fanout: faults along reconverging paths may be detected
– Divergent Fanout: faults along paths to common ancestors may be detected
Implications cannot cover any
sites downstream of both
implication points!
Limitations of Single-Cycle
Implications
Implications may not exist to cover faults
far downstream, e.g. close to:
– Flip-flops
– Primary Outputs
It is possible for no useful implications to
exist in a single cycle
Optimal timing of capture is difficult
Many of these issues are alleviated if
we consider multi-cycle implications
Multi-Cycle Implications
[Figure: a sequential circuit with signals A, B, X, Y, and F that contains no non-trivial implications in its combinational logic. Time frame expansion unrolls it into two copies, Cycle t1 and Cycle t2, with each signal indexed by cycle (for example A1, B1, F1 and A2, B2, F2).]
A logic value in the first clock cycle implies a value at a different site in the second clock cycle:
B1 = 0 → F2 = 0
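The idea of time frame expansion can be sketched in Python on a toy example. The circuit below is hypothetical and is not the one drawn on the slide: it assumes F = A AND Y, with a flip-flop that captures B as the next value of Y. Unrolling two cycles and enumerating all inputs shows a cross-cycle implication that has no single-cycle counterpart.

```python
from itertools import product


def step(a: int, b: int, y: int):
    """One clock cycle of a hypothetical toy circuit (not the one on the slide).

    Output F = A AND Y; the flip-flop captures B as the next value of Y.
    """
    f = a & y
    next_y = b
    return f, next_y


def holds_across_cycles() -> bool:
    """Check B1 = 0 -> F2 = 0 by unrolling two time frames exhaustively."""
    for y0, a1, b1, a2, b2 in product((0, 1), repeat=5):
        _f1, y1 = step(a1, b1, y0)   # cycle t1
        f2, _y2 = step(a2, b2, y1)   # cycle t2
        if b1 == 0 and f2 != 0:
            return False             # counterexample found
    return True


def holds_single_cycle() -> bool:
    """Check the same-cycle relation B = 0 -> F = 0 with unrestricted state."""
    for y, a, b in product((0, 1), repeat=3):
        f, _ = step(a, b, y)
        if b == 0 and f != 0:
            return False
    return True


print(holds_across_cycles())  # True
print(holds_single_cycle())   # False
```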
Multi-Cycle Checker Hardware
[Figure: the time-frame-expanded circuit (Cycle t1 and Cycle t2) with checker hardware producing a violation signal for the implication below.]
B1 = 0 → F2 = 0
Checker hardware must hold state between the first
and second cycles.
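A minimal behavioral sketch of such a checker, assuming a single bit of held state is enough for the implication B1 = 0 → F2 = 0. The class name `CrossCycleChecker` and its `clock` method are hypothetical; this models the behavior only and is not the hardware from the slide.

```python
class CrossCycleChecker:
    """Models a checker for B(t) = 0 -> F(t+1) = 0 with one bit of held state."""

    def __init__(self) -> None:
        self.armed = False  # True when B was 0 in the previous cycle

    def clock(self, b: int, f: int) -> bool:
        """Process one cycle; return True if the implication is violated."""
        violation = self.armed and f == 1
        self.armed = (b == 0)  # hold the antecedent for the next cycle
        return violation


checker = CrossCycleChecker()
trace = [(0, 0), (1, 1), (1, 0)]  # (B, F) per cycle; F = 1 right after B = 0 is an error
flags = [checker.clock(b, f) for b, f in trace]
print(flags)  # [False, True, False]
```

The stored bit plays the role of the extra flip-flop that a cross-cycle checker needs, which is the source of the additional overhead compared with a single-cycle checker.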
Spatial Coverage of Multi-Cycle
Implications
[Figure: implication sites P and Q spanning Cycle t and Cycle t + 1.]
Advantages:
Good spatial coverage can be achieved near flip-flops
Logical distance may increase between implication sites
Delays captured at flip-flops in cycle t can be detected
without complex timing
Experimental Setup
ISCAS ’89 benchmark circuits
zChaff SAT solver to validate implications
Three sets of implications per circuit
– First cycle: both implication sites in cycle 1; obtained with single-cycle analysis and unrestricted inputs
– Second cycle: both implication sites in cycle 2; obtained with time frame expansion
– Cross cycle: one site per cycle; obtained with time frame expansion
So, how many implications exist?
[Figure: Number of Implications in Each Class. Chart of Number of Implications (0 to 30000) per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for 1st cycle, cross-cycle, and 2nd cycle only.]
What is the distance between
implication sites?
[Figure: Average Implication Distance for Single and Between Cycle Implications. Chart of Average Distance (2 to 14) per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for average single cycle distance and average cross-cycle distance.]
How do the different
implication classes compare
for error detection (if we use all
possible implications)?
[Figure: Contribution of Different Implication Classes to Error Detection. Chart of Error Coverage (0 to 100) per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for 1st cycle, 1st and 2nd cycle, cross cycle, and all.]
Developing a Compressed
Implication Set
Start
– Choose next fault in fault list
– Find implication with best coverage of this fault
– Add best implication to compressed list
– Any more faults? If yes, choose the next fault; if no, return implication list
End
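The loop above can be sketched in Python as follows. The fault names, implication labels, and coverage numbers are made up for illustration, the function name `compress` is hypothetical, and skipping an implication that is already in the list is an assumption of this sketch.

```python
def compress(faults, coverage):
    """Pick, for each fault, the implication that covers it best.

    coverage[implication][fault] is a hypothetical detection fraction in [0, 1].
    """
    compressed = []
    for fault in faults:                       # choose next fault in fault list
        best, best_cov = None, 0.0
        for implication, per_fault in coverage.items():
            cov = per_fault.get(fault, 0.0)
            if cov > best_cov:                 # best coverage of this fault so far
                best, best_cov = implication, cov
        # Assumption: do not add the same implication twice.
        if best is not None and best not in compressed:
            compressed.append(best)            # add best implication to the list
    return compressed


# Illustrative data only; these numbers are not from the experiments.
coverage = {
    "n5=1 -> n8=0": {"f1": 0.9, "f2": 0.4},
    "n4=1 -> n8=0": {"f1": 0.5, "f2": 0.7},
    "n10=0 -> n13=0": {"f2": 0.6},
}
print(compress(["f1", "f2", "f3"], coverage))
```

In this toy run fault f3 is covered by no implication, so it contributes nothing to the compressed list.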
Number of Compressed Implications
[Figure: chart of Number (0 to 500) of compressed implications per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for 1st cycle, cross-cycle, and 2nd cycle only.]
What if we further trade off
error coverage for reduced
area overhead?
[Figure: Average Error Coverage Achieved for Different Area Thresholds. Chart of Average Error Coverage (0 to 100) per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for area thresholds of 10%, 20%, 30%, 40%, and 50%, plus Compressed and All.]
% of Chosen Implications that are
Cross-Cycle
[Figure: Percentage of Cross-Cycle Implications Chosen for Different Area Overheads. Chart of percentage (0.00 to 100.00) per Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488), with series for 10% and 50% area overhead.]
Conclusions
Implications can be used to effectively detect
many errors at runtime
– Without requiring functional knowledge of the circuit
– Allowing tradeoffs to be made between error
coverage and overhead
Cross-cycle implications cover faults that cannot
be covered by single cycle implications
Even though they have larger overhead, cross
cycle implications are often an “optimal” choice
When optimizing for low area overhead, more
than 85% of the implications may be cross cycle
For Inquiring Minds
Implication Algorithm
Identify Potential Implications w/ Simulation:
– Run good circuit simulation with random vectors and monitor site values
[Figure: table of value combinations (00, 01, 10, 11) for the site pairs A,B; A,C; and A,D.]
A=0 → C=0
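A small Python sketch of this step, under stated assumptions: the three-input netlist in `simulate` is a hypothetical toy circuit, and only the site names A, C, and D are borrowed from the slide. The sketch records which value combinations each pair of sites takes under random vectors; a combination that never appears suggests a candidate implication, which still has to be verified.

```python
import random
from itertools import combinations


def simulate(x1: int, x2: int, x3: int) -> dict:
    """Hypothetical toy netlist; the gate structure is illustrative only."""
    a = x1 | x2
    c = a & x3
    d = x1 ^ x3
    return {"A": a, "C": c, "D": d}


def candidate_implications(num_vectors: int = 200, seed: int = 0):
    rng = random.Random(seed)
    seen = {}  # (site_p, site_q) -> set of observed (value_p, value_q) pairs
    for _ in range(num_vectors):
        values = simulate(*(rng.randint(0, 1) for _ in range(3)))
        for p, q in combinations(sorted(values), 2):
            seen.setdefault((p, q), set()).add((values[p], values[q]))
    candidates = []
    for (p, q), combos in seen.items():
        for vp in (0, 1):
            for vq in (0, 1):
                if (vp, vq) not in combos:
                    # (p=vp, q=vq) never occurred, so p=vp may imply q=1-vq.
                    candidates.append((p, vp, q, 1 - vq))
    return candidates


for p, vp, q, vq in candidate_implications():
    print(f"{p}={vp} -> {q}={vq}")
```

Because random simulation can miss rare value combinations, the output is only a list of potential implications.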
Implication Algorithm
Verify Implications:
– Using a SAT solver (such as zChaff)
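The verification question is whether any input assignment makes the antecedent true and the consequent false; if none exists, the implication is valid. The Python sketch below uses exhaustive enumeration as a stand-in for the SAT solver, which only works for tiny circuits. The netlist in `simulate` is a hypothetical toy example, and `find_counterexample` is an illustrative name.

```python
from itertools import product


def simulate(x1: int, x2: int, x3: int) -> dict:
    """Hypothetical toy netlist; the gate structure is illustrative only."""
    a = x1 | x2
    return {"A": a, "C": a & x3, "D": x1 ^ x3}


def find_counterexample(p: str, vp: int, q: str, vq: int):
    """Search for inputs with p = vp and q != vq; None means the implication holds."""
    for inputs in product((0, 1), repeat=3):
        values = simulate(*inputs)
        if values[p] == vp and values[q] != vq:
            return inputs
    return None


print(find_counterexample("A", 0, "C", 0))  # None: A=0 -> C=0 is valid
print(find_counterexample("A", 1, "C", 1))  # (0, 1, 0): A=1 -> C=1 is not valid
```

A SAT solver answers the same question by checking that the circuit constraints together with p = vp and q ≠ vq are unsatisfiable, without enumerating every input.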
Implication Algorithm
Eliminate Subsumed Implications:
[Figure: circuit with nodes n1 through n13 showing the implications n10 = 0 → n13 = 0 and n4 = 1 → n8 = 0.]
Implication Algorithm
Determine Coverage of Remaining Implications:
– Of all the patterns that will allow a fault to produce an error at an output, how many will each implication detect?
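One way to read this coverage question is as a ratio: among the input patterns for which a fault corrupts an output, the fraction for which the implication is violated in the faulty circuit. The Python sketch below computes that ratio for a hypothetical toy netlist with a single stuck-at fault; the circuit, the `stuck` parameter, and the function names are all assumptions made for illustration.

```python
from itertools import product


def simulate(x1, x2, x3, stuck=None):
    """Hypothetical toy netlist; `stuck` is an optional (site, value) stuck-at fault."""
    def force(site, value):
        return stuck[1] if stuck and stuck[0] == site else value

    a = force("A", x1 | x2)
    c = force("C", a & x3)
    return {"A": a, "C": c, "OUT": c}


def coverage(fault, implication):
    """Fraction of error-producing patterns on which the implication is violated."""
    p, vp, q, vq = implication
    errors = detected = 0
    for inputs in product((0, 1), repeat=3):
        good = simulate(*inputs)
        bad = simulate(*inputs, stuck=fault)
        if good["OUT"] != bad["OUT"]:          # fault produces an output error
            errors += 1
            if bad[p] == vp and bad[q] != vq:  # implication violated in faulty circuit
                detected += 1
    return detected / errors if errors else 0.0


print(coverage(("C", 1), ("A", 0, "C", 0)))  # 0.4
```

In this toy case C stuck-at-1 corrupts the output on five of the eight patterns, and A=0 → C=0 is violated on two of them.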