Design and synthesis of self-checking VLSI circuits

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 12, NO. 6, JUNE 1993
Design and Synthesis of Self-Checking VLSI Circuits
Niraj K. Jha, Member, IEEE, and Sying-Jyan Wang
Abstract: Self-checking circuits can detect the presence of
both transient and permanent faults. A self-checking circuit
consists of a functional circuit, which produces encoded output
vectors, and a checker, which checks the output vectors. The
checker has the ability to expose its own faults as well. The
functional circuit can be either combinational or sequential. A
self-checking system consists of an interconnection of self-checking circuits. The advantage of such a system is that errors
can be caught as soon as they occur; thus, data contamination
is prevented.
Although much effort has been concentrated on the design of
self-checking checkers by previous researchers, very few results have been presented for the design of self-checking functional circuits. In this paper, we explore methods for the cost-effective design of combinational and sequential self-checking
functional circuits and checkers. The area overhead for all proposed design alternatives is studied in detail.
I. INTRODUCTION
TECHNOLOGICAL advances have drastically increased the complexity of integrated circuits that can be
realized on a single chip, and the trend is likely to continue. Although the advantages of such a rapid advancement are obvious, we do have to learn to deal effectively
with some of the side effects, such as the increase in the
number of transient faults. It is well known that transient
faults are the predominant cause of system failures [1]-[3]. With the possibility of reduced voltage levels for
VLSI circuits and the resultant decrease in noise margins,
system susceptibility to transient faults is likely to increase [3]. Thus we have to design circuits and systems
that can expose such faults.
Self-checking circuits and systems can detect the presence of both transient and permanent faults. A self-checking circuit consists of a functional circuit, which produces
encoded output vectors, and a checker, which checks the
vectors to determine if an error has occurred. The checker
has the ability to give an error indication even when a
fault occurs in the checker itself. The functional circuit
can be either combinational or sequential. A self-checking system consists of an interconnection of self-checking
circuits. The most important concept developed for self-checking circuits is the totally self-checking (TSC) concept [4], [5]. The TSC concept was generalized to the
path fault-secure (PFS) and strongly fault-secure (SFS)
concepts [6]. These concepts are defined in Section II.
The output vectors of a self-checking functional circuit
are often encoded in a code that detects unidirectional errors. The reason is that unidirectional errors are very common in integrated circuits [3], [7]-[10]. Such an error is
said to occur when there are multiple transitions in either
the $0 \to 1$ direction or the $1 \to 0$ direction, but not both.
For example, in [9] it was shown that a stuck-at fault,
cross-point fault, or a short in an MOS programmable
logic array (PLA) or read-only memory causes a unidirectional error at its outputs. Even for random logic, it is
easy to ensure by simple design techniques that most single faults can cause only a unidirectional error. Many
codes have been proposed for detecting such errors [10]-[16], [22], [26], [27]. A nice mathematical framework for
dealing with such errors is given in [10].
A lot of work has been done in the area of self-checking
checker design for different codes. However, not as much
attention has been paid to the design of functional circuits. Self-checking combinational functional circuit design methods were presented in [17] and [6]. Results in
the area of self-checking sequential functional circuit design were presented in [18]-[20]. Applications of the self-checking concept to microprogram control units and
PLA’s were presented in [7], [3], and [9]. A theory of
TSC system design was presented in [21]. Most previous works on functional circuits give conditions that need to be satisfied to obtain a self-checking design. However, no attempt is made to evaluate the practicality of the design in terms of area overhead, etc. For example, the area overhead needed for a
self-checking sequential functional circuit design derived
from some existing methods can be quite substantial.
In this paper, we are concerned with deriving methods
for cost-effective design of combinational and sequential
functional circuits and checkers. In Section II, we describe briefly some previous results that will be used
herein. The basic self-checking circuit and system structures are given in Section III. Next, in Section IV, we
evaluate various synthesis methodologies by applying
them to some benchmarks. Concluding remarks are given
in Section V.
Manuscript received April 30, 1991; revised August 14, 1992. This work
was supported by the National Science Foundation under Grant MIP9010433. This paper was recommended by Associate Editor Franc Brglez.
The authors are with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544.
IEEE Log Number 9206845.
II. PRELIMINARIES
In this section, first we will review some unidirectional
error-detecting codes that will be used in the following
sections. Then we will discuss properties that define a self-checking circuit.
0278-0070/93$03.00 © 1993 IEEE
JHA AND WANG: SELF-CHECKING VLSI CIRCUITS
2.1. Systematic and Separable Codes
First, we will make a distinction between systematic
and separable codes. Suppose that the information symbol
has k bits and the appended check symbol has r bits. Thus,
a codeword has $n = k + r$ bits.
Definition 1: A set of binary n-tuples C is called a systematic code if 1) it contains $2^k$ n-tuples, $1 \leq k \leq n$, and
2) $C = \{X \mid X = IS_i\,CS_i\}$, where $|IS_i| = k$ and $|CS_i| = r$,
and each one of the $2^k$ information symbols appears as
part of some codeword.
Definition 2: A set of binary n-tuples C is called a separable code if 1) it contains $s < 2^k$ n-tuples, $1 \leq k \leq n$,
and 2) $C = \{X \mid X = IS_i\,CS_i\}$, where $|IS_i| = k$ and $|CS_i| = r$,
and each one of the s information symbols appears as
part of some codeword.
The Berger code [22] is an optimal systematic all-unidirectional error-detecting (AUED) code. For such a code,
$r = \lceil \log_2 (k + 1) \rceil$. This code is formed as follows: a
binary number corresponding to the number of 1’s in the
information symbol is obtained, and the binary complement of each bit in this number is taken; the resulting
binary number forms the check symbol. Sometimes
AUED codes are also called unordered. TSC checkers for
the Berger code were presented in [23]-[25] and [40].
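The construction just described can be sketched in a few lines of Python. This is only an illustration of the encoding rule stated above; the function names (`berger_check_symbol`, `berger_encode`, `covers`) are hypothetical and not taken from the paper. The helper `covers` tests the unordered property: no codeword has a 1 in every position where another codeword has a 1.

```python
def berger_check_symbol(info_bits: str) -> str:
    """Return the Berger check symbol for a string of '0'/'1' information bits."""
    k = len(info_bits)
    r = k.bit_length()                      # equals ceil(log2(k + 1)) for k >= 1
    ones = info_bits.count("1")             # number of 1's in the information symbol
    complemented = ones ^ ((1 << r) - 1)    # complement each bit of the r-bit count
    return format(complemented, f"0{r}b")


def berger_encode(info_bits: str) -> str:
    """Append the check symbol to the information symbol."""
    return info_bits + berger_check_symbol(info_bits)


def covers(x: str, y: str) -> bool:
    """True if x has a 1 in every position where y has a 1."""
    return all(a == "1" or b == "0" for a, b in zip(x, y))
```

For example, `berger_encode("1011")` yields `"1011100"`: the information symbol has three 1's (011 in three bits), whose complement is 100.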
In a systematic code in which the number of information bits is k, it is assumed that all $2^k$ possible information
symbols are present. However, more often than not, a
functional circuit with k outputs does not produce all the
$2^k$ output patterns. In such cases, it is possible to use a
separable AUED code [26], [16] for encoding purposes.
A separable code has redundancy no worse than a systematic code with the same capability; frequently its redundancy is less.
2.2. The m-out-of-n Code
The m-out-of-n code [27] is an optimal AUED code. In
such a code, all codewords have exactly m 1’s and (n - m) 0’s. For this reason, an m-out-of-n code is also called
an m-hot code. Such codes are “nonsystematic” because
the information bits cannot be separated from the check
bits. A lot of work has already been done in the area of
TSC checker design for m-out-of-n codes [5], [28]-[31].
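As a small illustration of why an m-out-of-n code detects all unidirectional errors, the sketch below enumerates the codewords and forces arbitrary bit positions to a single value, which can only raise or only lower the number of 1's. The helper names are hypothetical, and forcing bits to one value is an assumed way of modeling a unidirectional error.

```python
from itertools import combinations


def m_out_of_n_codewords(m, n):
    """All n-bit words with exactly m ones, as strings."""
    return ["".join("1" if i in ones else "0" for i in range(n))
            for ones in combinations(range(n), m)]


def is_m_out_of_n(word, m):
    """Membership test: the word must contain exactly m ones."""
    return word.count("1") == m


def force_bits(word, positions, value):
    """Unidirectional error model: force the chosen positions to value ('0' or '1')."""
    return "".join(value if i in positions else bit for i, bit in enumerate(word))
```

Any word produced by `force_bits` that differs from the original codeword has more than m or fewer than m ones, so `is_m_out_of_n` rejects it.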
2.3. Self-checking Circuits
The following definitions can be used to describe a TSC
system [4], [5]. F denotes the set of faults, and G is the
network in these definitions.
Definition 3: G is self-testing with respect to F if for
every fault in F the circuit produces an output noncodeword for at least one input codeword.
Definition 4: G is fault-secure with respect to F if for
every fault in F the circuit never produces an incorrect
output codeword for any input codeword.
Definition 5: G is TSC if it is both self-testing and fault-secure.
Definition 6: G is code-disjoint if it maps input codewords (noncodewords) to output codewords (noncodewords).
Definition 7: G is a TSC checker if it is self-testing,
fault-secure, and code-disjoint.
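Definitions 3 to 5 can be checked by brute force on a small circuit model. The sketch below assumes a circuit is a Python function `circuit(x, fault)` in which `fault=None` means fault-free operation; the toy circuit `two_rail_buffer` and its fault list are hypothetical and serve only to exercise the definitions.

```python
def is_self_testing(circuit, faults, input_codewords, is_output_codeword):
    """Definition 3: every fault yields a noncodeword for at least one input codeword."""
    return all(
        any(not is_output_codeword(circuit(x, f)) for x in input_codewords)
        for f in faults
    )


def is_fault_secure(circuit, faults, input_codewords, is_output_codeword):
    """Definition 4: a fault never produces an incorrect output codeword."""
    for f in faults:
        for x in input_codewords:
            y = circuit(x, f)
            if y != circuit(x, None) and is_output_codeword(y):
                return False
    return True


def two_rail_buffer(x, fault=None):
    """Toy circuit: passes a 1-out-of-2 pair through two lines, 'a' and 'b'."""
    lines = {"a": x[0], "b": x[1]}
    if fault is not None:
        line, stuck_value = fault
        lines[line] = stuck_value       # single stuck-at fault on one line
    return (lines["a"], lines["b"])


FAULTS = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]
INPUTS = [(0, 1), (1, 0)]


def one_out_of_two(y):
    """Output code of the toy circuit."""
    return y[0] != y[1]
```

Under these assumptions the toy circuit is both self-testing and fault-secure, hence TSC in the sense of Definition 5. If every output word were accepted as a codeword, both checks would fail.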
It has been pointed out in [32] that the fault-secure
property is not really necessary for TSC checkers. Conversely, most checkers, which are designed to be only
self-testing and code-disjoint, can also be found to be automatically fault-secure. Thus, in many cases we get the
fault-secure property in the checker for free.
The concept of TSC functional circuits was generalized
to SFS functional circuits in [6].
Definition 8: G is SFS with respect to F if for every
fault in F, either 1) the circuit is self-testing and fault-secure, or 2) the circuit is fault-secure, and if another fault
from F occurs in G then either property 1 or 2 is true for
the fault sequence.
The path fault-secure property also was presented in
[6]. This property is defined in terms of network structures and codes. If a circuit is PFS, a modeled fault can
result only in the correct codeword or a noncodeword at
the outputs, never an incorrect codeword. The following
theorems were proved for the PFS property [6].
Theorem 1: If G is PFS with respect to F, it is SFS
with respect to F.
Theorem 2: If G is PFS with respect to F and input
code space A , then G is also PFS with respect to F for the
input code space $A' \subset A$.
Theorem 3: Any inverter-free network with an unordered output code space is PFS with respect to unidirectional stuck-at faults.
III. SELF-CHECKING CIRCUIT AND SYSTEM STRUCTURES
In this section we use the concepts introduced in the
previous section to derive self-checking circuit and system structures.
3.1. Self-checking Combinational Functional Circuits
A circuit, which has been designed to be PFS for some
input code space, may actually receive a subset of the input code space under normal operation when it is placed
in an actual system environment. According to Theorems
1 and 2, the circuit remains PFS and, hence, SFS. However, if a fault is not detectable by the received input code
space, then the circuit can be simplified by removing the
line or gate corresponding to the fault. For example, if a
stuck-at 0 (stuck-at 1) fault at the input of an AND gate is
undetectable by any normally received input codeword
then the gate (line) can be removed. However, this process can be time-consuming.
Theorem 3 is strictly applicable to totally inverter-free
circuits. It is well known that such a circuit can be obtained if AUED encoding is used for both the inputs and
outputs [33], [6]. However, in general, such an approach
may be overly restrictive and may result in much more
area overhead than necessary. We extend this approach as
follows. We encode the output functions of the functional
circuit in a unidirectional error-detecting code, but the inputs are not encoded. We then use logic optimization tools
to obtain a realization of these functions which has inverters only at the primary inputs. In other words, the
internal part of the circuit consists of only AND and OR
gates (the circuit can have any number of levels), as shown
in Fig. 1. In fact, even if the internal part of the circuit
contains inverters, NAND gates, or NOR gates, but the circuit can be reduced to a circuit of the form shown in Fig. 1 only
through the use of De Morgan's theorem, then the realization is still acceptable. However, such a circuit can be
guaranteed only to be PFS with respect to internal single
stuck-at faults, not unidirectional stuck-at faults. Note
that, in general, the output functions of a functional circuit need not be binate in each of the input variables. In
Fig. 1, the outputs have been assumed to be binate in $x_1, \ldots, x_u$, but unate in $x_{u+1}, \ldots, x_p$.
Consider any single stuck-at fault in the circuit in Fig.
1 except those at the stems of the primary inputs $x_1, \ldots, x_u$. Define the inversion parity of a path to be the number
of inversions on the path modulo 2. For the above-mentioned faults, every path from the fault site to any circuit
output has the same inversion parity. Thus these faults
can result only in unidirectional errors at the outputs.
Therefore, an erroneous output vector would be a noncodeword,
which will be detected by the checker monitoring the outputs. This means that the circuit will be PFS and, hence,
SFS with respect to the aforementioned faults. When this
functional circuit is embedded in a system, it may receive
its inputs from another functional circuit. The checker that
monitors the outputs of the preceding functional circuit
can also be made to detect the stuck-at faults on the stems
of primary inputs $x_1, \ldots, x_u$ of the functional circuit
under question; this will become clearer in Subsection 3.3.
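The inversion-parity argument above can be checked exhaustively on a toy network. The sketch below uses a hypothetical three-input netlist whose only inverter sits directly on primary input x1, so x1 is binate while x2 and x3 are unate. It injects single stuck-at faults and classifies the resulting output errors. The netlist, line names, and helper functions are assumptions made for illustration, not a circuit from the paper.

```python
from itertools import product

# Hypothetical netlist: (output line, gate type, input lines).
# The only inverter sits directly on primary input x1.
NETLIST = [
    ("nx1", "NOT", ["x1"]),
    ("g1", "AND", ["x1", "x2"]),
    ("g2", "AND", ["nx1", "x3"]),
    ("g3", "OR", ["g1", "g2"]),
]
PRIMARY_INPUTS = ["x1", "x2", "x3"]
OUTPUTS = ["g1", "g2", "g3"]


def evaluate(assignment, fault=None):
    """Simulate the netlist; fault is (line, stuck_value) or None."""
    values = dict(assignment)
    if fault is not None and fault[0] in values:
        values[fault[0]] = fault[1]          # stuck-at fault on a primary input stem
    for out, gate, ins in NETLIST:
        args = [values[name] for name in ins]
        if gate == "NOT":
            value = 1 - args[0]
        elif gate == "AND":
            value = int(all(args))
        else:                                # "OR"
            value = int(any(args))
        if fault is not None and fault[0] == out:
            value = fault[1]                 # stuck-at fault on a gate output
        values[out] = value
    return tuple(values[name] for name in OUTPUTS)


def error_kind(good, bad):
    """Classify the difference between fault-free and faulty output vectors."""
    up = any(g == 0 and b == 1 for g, b in zip(good, bad))
    down = any(g == 1 and b == 0 for g, b in zip(good, bad))
    if up and down:
        return "bidirectional"
    return "unidirectional" if (up or down) else "none"


def error_kinds(lines):
    """Set of error kinds seen over all inputs and both stuck values on the given lines."""
    kinds = set()
    for bits in product([0, 1], repeat=len(PRIMARY_INPUTS)):
        assignment = dict(zip(PRIMARY_INPUTS, bits))
        good = evaluate(assignment)
        for line in lines:
            for stuck in (0, 1):
                kinds.add(error_kind(good, evaluate(assignment, (line, stuck))))
    return kinds
```

In this toy network, faults on the inverter output, the gate outputs, and the unate inputs never produce a bidirectional error, whereas a stuck-at fault on the stem of x1 can. This is why the stems of the binate inputs are left to the checker of the preceding circuit.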
Now suppose that the functional circuit that is to be
synthesized is not fed by any other functional circuit even
when placed in a system. For such cases, one solution, as
mentioned earlier, is to use AUED encoding for both inputs and outputs. Such a circuit is PFS with respect to all
single and unidirectional stuck-at faults. Note that there
is no need for an input checker. However, the disadvantage of a realization such as this is the need for extra primary inputs and most likely a significant increase in the
area required for its implementation. A designer may wish
to consider if even for this case the extra overhead over a
circuit realization given in Fig. 1 is beneficial, considering that the extra protection is not that much, especially
for VLSI functional circuits.
Fig. 1. Realization of the functional circuit.
3.2. Self-checking Sequential Functional Circuits
Previous works in the area of self-checking sequential
functional circuits have been presented in [18]-[20].
However, these methods can be impractical when applied
to VLSI circuits. The primary reasons are unacceptably
high area overhead and/or the difficulty of satisfying the
conditions required for making the circuits self-checking.
Fig. 2. Mealy sequential machine.
An alternative approach for Mealy machines is presented here. It can be trivially extended to Moore machines. Consider the Mealy machine shown in Fig. 2. In
this machine the next state functions and the output functions are encoded separately. Suppose the number of states
in the state table of the sequential machine that has to be
synthesized is s. If the next state functions are encoded in
the m-out-of-n code, then the number of state variables or
flip-flops, n, must satisfy the following condition: $\binom{n}{m} \geq s$.
Clearly, to reduce the number of flip-flops, we have to
choose the smallest n such that the preceding condition is
satisfied, and this leads to the choice of $m = \lfloor n/2 \rfloor$.
However, reducing the number m usually decreases the
size of the combinational logic of the sequential circuit,
as we will see in Section IV. Therefore, the choice of m
is a tradeoff between reducing the number of flip-flops and
reducing the size of the combinational logic.
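The flip-flop count implied by the condition above is easy to tabulate. The following sketch finds, for a given m, the smallest n whose number of m-out-of-n codewords is at least s. It also enforces the restriction that m does not exceed half of n, which is the restriction used for the experiments in Section IV. The function name is hypothetical.

```python
from math import comb


def min_flip_flops(s: int, m: int) -> int:
    """Smallest n with m <= n // 2 and C(n, m) >= s."""
    n = 2 * m                  # smallest n for which m <= floor(n / 2)
    while comb(n, m) < s:
        n += 1
    return n
```

For a 16-state machine this gives 16 flip-flops for 1-hot, 7 for 2-hot, and 6 for 3-hot encoding, which illustrates the tradeoff between the number of flip-flops and the choice of m discussed above.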
The output functions are encoded in an AUED code.
Using arguments from [33] it can be shown that, with the
above encoding, the circuit outputs and the next state outputs will be unate in the state variables, as shown in Fig.
2. The inputs need not be encoded if they are obtained
from a preceding functional circuit (combinational or sequential). The arguments for doing this are similar to those
given in Subsection 3.1. However, if the functional circuit is not fed by any other functional circuit, then the
designer may or may not wish to use input encoding, as
explained earlier.
There are two TSC checkers in this circuit to monitor
the circuit outputs; one checks the output functions, the
other checks the next state functions. The outputs of these
two checkers are then fed to a two-rail checker to produce
the final checker outputs. Any detectable stuck-at fault,
excluding those at the primary input stems of $x_1, \ldots, x_u$
in Fig. 2, results in a unidirectional error at the circuit
outputs and/or next state lines, which is caught by at least
one of the checkers. The stuck-at faults at the primary
input stems are detected by the checker, which checks the
preceding functional circuits in the system. If there is no
functional circuit preceding it, and AUED encoding is
used for the primary inputs, then it can be shown that the
output functions and next state functions will be unate in
the primary inputs. In such a case, any stuck-at fault can
cause only unidirectional errors. The preceding arguments can also be extended to PLA implementations. Thus
we see that the functional circuit is PFS and, hence, SFS.
This can be reduced to a TSC circuit if all gates and lines
corresponding to stuck-at faults not detected by input
codewords are removed. However, as mentioned earlier,
this can be time-consuming.
3.3. Self-checking Systems
The area of design and synthesis of self-checking systems has received scant attention from researchers in the
past. Applications of the self-checking property to microprogram control units were discussed in [7] and [3]. A
theory of TSC systems was given in [21]. However, the
results presented pertain more to analysis rather than synthesis of self-checking systems. Furthermore, it may not
always be possible to meet or even check for the conditions outlined in [21] especially for moderately large systems.
The design and synthesis methods for self-checking
combinational and sequential circuits were presented in
the previous two subsections. These are the building
blocks of a system. Next we need to design the system
from these building blocks. Consider the system in Fig.
3. In this system, functional circuits 1 and 2 are combinational, whereas functional circuit 3 is sequential. We
have assumed that systematic or separable codes have
been used to encode the output functions of the functional
circuits; however, nonsystematic codes could also be
used. TRC2 refers to a two-variable two-rail checker. The
outputs of checkers 1, 2, 3, and 4 are reduced ultimately
to only two outputs, $z_1$ and $z_2$, with the help of TRC2’s.
From functional circuits 1 and 2, only the information bits
go to functional circuit 3. In general, functional circuit 3
will require the complements of its primary inputs too (inverters are not shown in Fig. 3, but are implicit). Therefore, the branch of information bit lines that goes to
checker 1 or 2 should be derived from a point beyond
where these lines are fed to the inverters. Thus, we ensure
that stuck-at faults on the stem and all branches of these
lines can be detected.
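The reduction of several checker output pairs to a single pair can be illustrated with a behavioral model of a two-variable two-rail checker. The paper does not say which gate-level TRC2 design is used; the sketch assumes the commonly used AND-OR form, and the function names are hypothetical.

```python
from functools import reduce


def trc2(x, y):
    """Two-variable two-rail checker (assumed AND-OR form) on pairs (x0, x1), (y0, y1)."""
    x0, x1 = x
    y0, y1 = y
    return ((x0 & y0) | (x1 & y1), (x0 & y1) | (x1 & y0))


def reduce_two_rail(pairs):
    """Combine any number of checker output pairs into one pair with a chain of TRC2's."""
    return reduce(trc2, pairs)


def is_two_rail(pair):
    """A pair is a two-rail codeword when its two bits are complementary."""
    return pair[0] != pair[1]
```

With this model the final pair is complementary exactly when every input pair is complementary, so an error indication from any one checker propagates to the final outputs.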
Fig. 3. Self-checking system.

TABLE I
STATISTICS FOR BENCHMARK SEQUENTIAL MACHINES

CKT      #inp  #out  #sta
bbsse       7     7    16
cse         7     7    16
dk14        3     5     7
dk15        3     5     4
dk16        2     3    27
ex1         9    19    20
ex4         6     9    14
ex6         5     8     8
kirkman    12     6    16
mark1       5    16    15
mc          3     5     4
opus        5     6    10
planet      7    19    48
s1          8     6    20
sand       11     9    32
scf        27    56   121
sse         7     7    16
styr        9    10    30

IV. RESULTS
In this section, we give results for various synthesis
methodologies. The methodologies are applied to several
sequential MCNC benchmarks whose statistics are given
in Table I. For each machine in this table, the number of
inputs (#inp), the number of outputs (#out), and the number of states (#sta) are indicated. The outputs and the next
states are encoded with AUED codes, as mentioned earlier, and then synthesized by MIS [34], a multilevel logic
optimization program from Berkeley, to get the final realization.
4.1. Internally Inverter-Free Sequential Circuits
Encoded with Berger Code
The first synthesis scheme follows the procedure given
in Subsection 3.2. The output functions are encoded with
the Berger code, while the states of the finite state machines are encoded with the m-out-of-n code with all m
and n that satisfy the following conditions: $\binom{n}{m} \geq s$ and
$m \leq \lfloor n/2 \rfloor$, where s is the number of states. The functional circuits are optimized by a script that strictly uses
algebraic factorization without complement; i.e., the resulting circuits have inverters only at the primary inputs,
but not inside them, as shown in Fig. 2. However, when
such a script is used to optimize a circuit, the results are
usually slightly inferior to a script that includes Boolean
optimization.
Consider, for example, the state table shown in Fig.
4(a). With the given state assignment, the combinational
logic of the sequential circuit without any self-checking
properties is as shown. However, when 1-hot encoding
for the states and Berger encoding for the outputs are used,
then the self-checking circuit is as shown in Fig. 4(b).
Let us next consider the FSM benchmarks, which are
first converted to KISS2 format [35]. With an m-out-of-n
(or, equivalently, m-hot) state assignment, the circuit outputs and next state outputs can be made unate in state
variables by specifying the 0’s in the present state assignment to be “don’t care.” For instance, if the present state
is 11000, then it should be specified as 11---, where the
dash denotes a don’t care. If this is not done, MIS does
not necessarily find the solution that makes the outputs
unate in the present state inputs. The functional circuit
outputs are then encoded with the Berger code. The literal-counts of the combinational logic of the self-checking sequential circuits synthesized by our method are
given in Table II. For each benchmark, we tried all possible m-hot codes, and for each m-hot code, two circuits
were synthesized. In the first circuit, the output functions
are encoded with Berger code, whereas in the second one
they are not. Both of them are optimized with the algebraic script so that any internal stuck-at fault will cause
only unidirectional errors at the outputs. In the last column, the target machine’s states are assigned by MUSTANG [36], which gives a state assignment with $\lceil \log_2 s \rceil$
bits; the circuit is then fully optimized (i.e., Boolean optimization is performed). There are two options in MUSTANG, one being fan-in-oriented and the other fan-out-oriented. Synthesis was done based on both options, and
the better of the two results is given in Table II. In this
table the area occupied by the latches was not taken into
account because first we want to draw some conclusions
from these results about the overheads involved only in
the combinational logic. However, when we report the
total literal-count of the functional circuit and the checker,
we will also take into account the area of the latches.
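The don't-care convention for present states described above (11000 written as 11---) is mechanical and can be sketched as follows. The helper names and the order in which m-hot codes are handed to states are hypothetical; the paper does not prescribe a particular order.

```python
from itertools import combinations, islice


def m_hot_codes(num_states, m, n):
    """Return num_states distinct n-bit codes with exactly m ones (arbitrary order)."""
    codes = ["".join("1" if i in ones else "0" for i in range(n))
             for ones in islice(combinations(range(n), m), num_states)]
    if len(codes) < num_states:
        raise ValueError("C(n, m) is smaller than the number of states")
    return codes


def present_state_cube(code):
    """Replace the 0's of a present-state code by don't cares ('-')."""
    return code.replace("0", "-")
```

For instance, `present_state_cube("11000")` returns `"11---"`. Only the present-state field is rewritten in this way; the next-state field keeps the full m-hot code.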
It is interesting to note that, as the state encoding moves
from 1-hot to 2-hot, etc., the literal-count of the combinational logic of the self-checking sequential circuit increases as well. In many cases, the 1-hot code is better
than MUSTANG in terms of literal-count. However, the
1-hot code may not always be practical for the larger machines (with more states) since the number of required
latches increases so fast that the gain in the literal-count
by the 1-hot code may be more than offset by the
increase in the area due to the extra latches, as we will
Fig. 4. (a) Example of a sequential circuit. (b) Self-checking circuit for the
circuit in Fig. 4(a).

State table of Fig. 4(a) (columns: input, present state, next state, outputs):
0 st0 st1 000
1 st0 st0 100
0 st1 st2 001
1 st1 st0 101
0 st2 st2 010
1 st2 st3 111
0 st3 st1 011
1 st3 st0 011
State assignment: st0 -> 00, st1 -> 01, st2 -> 11, st3 -> 10

State table of Fig. 4(b) (outputs followed by the Berger check bits):
0 st0 st1 000 11
1 st0 st0 100 10
0 st1 st2 001 10
1 st1 st0 101 01
0 st2 st2 010 10
1 st2 st3 111 00
0 st3 st1 011 01
1 st3 st0 011 01
State assignment: st0 -> 0010, st1 -> 1000, st2 -> 0100, st3 -> 0001
TABLE II
LITERAL-COUNTS FOR THE BENCHMARK SEQUENTIAL CIRCUITS
1-hot
CKT
bbsse
cse
dk14
dk15
dk16
ex 1
ex4
ex6
kirkman
mark1
mc
opus
planet
s1
sand
scf
sse
styr
2-hot
3-hot
4-hot
Berger
Code
No
Code
Berger
Code
No
Code
Berger
Code
No
Code
169
266
175
141
238
422
111
137
449
106
44
95
493
267
518
698
169
538
126
208
146
116
20 1
210
76
101
267
92
44
74
356
233
449
556
126
385
217
319
235
169
355
502
160
173
467
152
67
134
667
363
641
979
217
660
186
278
196
137
321
295
117
124
312
136
57
108
533
326
562
830
186
536
241
431
234
371
-
-
-
-
457
561
165
428
404
129
-
*
Berger
Code
No
Code
MUSTANG
140
23 1
93
68
283
269
76
99
210
111
24
80
575
207
553
971
122
489
-
183
348
159
-
-
-
-
754
457
806
1226
24 1
952
619
416
688
1003
234
717
-: There is no need for such a code
*: Results not available.
see later. Conversely, the number of state variables for
the 2- and 3-hot codes is usually reasonably close to
$\lceil \log_2 s \rceil$, which makes them a good choice. For larger
machines, the 2-hot encoding can give better results than
MUSTANG, as can be seen for the benchmarks scf and
planet.
Suppose the machine to be synthesized has k primary
outputs and p state bits; then the number of redundant
outputs due to Berger encoding is $\lceil \log_2 (k + 1) \rceil$. Since
the original circuit has k + p outputs, one may expect the
overhead added by the Berger code to be close to $\lceil \log_2 (k + 1) \rceil / (k + p)$. In Table III, we compute the overhead
for every sequential circuit, which is given by {(l.c. of
self-checking circuit) - (l.c. of original circuit)} / (l.c. of
original circuit), and normalize it with respect to the preceding ratio (l.c. denotes the literal-count). Here original
circuit refers to a circuit whose states are encoded in an
m-hot code, but whose outputs are not encoded. In reality,
for different circuits this normalized overhead varies
widely. However, the average normalized overhead for
the 2- and 3-hot state assignments is close to 1, as one
may have naively expected. The reason for reporting normalized overhead, as opposed to simply the percent overhead, is that it allows us to see the trend. All the benchmark circuits are reasonably small. Hence, for these
circuits the percent overhead can be large. However, if
the preceding trend holds for larger sequential circuits,
then we can see that the percent overhead for such circuits
can go down drastically, since the ratio $\lceil \log_2 (k + 1) \rceil / (k + p)$ decreases as the value of k increases.
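The normalization described above amounts to dividing the relative literal-count overhead by the expected ratio of check outputs to original outputs. A minimal sketch follows, with a hypothetical function name. The sample values in the usage note are read from the extracted tables (for bbsse with 1-hot states: 169 literals with Berger code, 126 without, k = 7, p = 16) and should be treated as illustrative.

```python
def normalized_overhead(lc_self_checking, lc_original, k, p):
    """Relative literal-count overhead divided by ceil(log2(k + 1)) / (k + p)."""
    r = k.bit_length()                       # equals ceil(log2(k + 1)) for k >= 1
    overhead = (lc_self_checking - lc_original) / lc_original
    expected = r / (k + p)                   # check outputs over original outputs
    return overhead / expected
```

With the sample values, `normalized_overhead(169, 126, 7, 16)` evaluates to about 2.62. A value near 1 means the Berger-encoded outputs cost about as much logic per output as the original outputs.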
4.2. Fully Optimized Sequential Circuits Encoded with
Berger Code
If a circuit is optimized with algebraic factorization,
then any internal stuck-at fault will cause only unidirec-
TABLE III
ESTIMATION OF THE NORMALIZED OVERHEAD CAUSED BY THE BERGER CODE
1-hot
2-hot
3-hot
bbsse
cse
dk14
dk15
dk16
ex 1
ex4
ex6
kirkman
mark1
mc
opus
planet
0.78
0.69
0.67
0.70
0.58
2.63
1.38
1.28
2.15
0.52
0.53
0.88
1.51
0.49
0.63
2.18
0.78
1.10
0.13
0.70
sand
scf
sse
styr
2.62
2.14
0.80
0.65
2.76
7.87
2.65
1.43
5.00
0.91
0.00
0.79
5.16
1.26
1.58
7.53
2.62
3.97
Average
2.76
1.08
0.93
s1
4-hot
-
0.38
1.94
1.05
-
*
0.66
-
1.18
0.39
0.69
2.48
0.13
1.39
tional errors at the outputs. However, if there are inverters inside the circuit, then there might be bidirectional
errors at the outputs. But it is still possible that a bidirectional error can be detected by either the Berger-encoded circuit outputs or the m-out-of-n encoded next state
outputs. Therefore, if the fully optimized circuit has a
smaller overhead, yet if a high enough percentage of faults
are detected by the above codes, then it may be worthwhile using full optimization. In this subsection, we report results on the synthesis of the self-checking circuits
obtained by full optimization, and on the probability that
a fault will invalidate the above codes (i.e., change one
codeword into another). In Table IV we give the literal-
TABLE IV
LITERAL-COUNTS FOR FULLY-OPTIMIZED BENCHMARK
SEQUENTIAL CIRCUITS
1-hot
CKT
Berger
Code
2-hot
No
Code
Berger
Code
3-hot
No
Code
Berger
Code
bbsse
cse
dk14
dk15
dk16
ex I
ex4
ex6
kirkman
mark1
mc
opus
planet
s1
sand
scf
sse
styr
134
230
115
209
126
105
193
91
108
65
78
102
41
88
95
33
75
150
275
1 so
*
*
*
*
*
*
150
*
*
*
*
480
*
115
*
225
363
174
150
370
569
136
181
464
148
61
127
743
54s
827
*
225
147
179
301
139
123
328
315
102
143
308
132
43
100
533
469
639
848
179
594
counts for fully optimized circuits in the same way as in
Table II.
Comparing Tables II and IV, it is interesting to note
that fully optimized circuits sometimes have worse literal-counts than the circuits optimized by the algebraic script.
Actually, it seems that the algebraic script nearly always
produces better results for larger circuits. The reason,
however, may be that the output and the next state functions are unate in the present state variables, and this
property may be better exploited by algebraic operations.
Table V is similar to Table 111, and it shows the normalized overhead for the fully optimized circuits.
The next thing we want to know is the probability that
a fault will be detected if the method described in this
section is used. To obtain this probability, we generate a
test pattern for each stuck-at fault in the combinational
logic of the synthesized circuit, and see if the errors at the
output are detectable by the codes. We used the test pattern generation program STALLION [37] with some
modifications. The result shown in Table VI is the probability that a fault is not detected. The average probability
ranges from 4 to 7% for 2-, 3-, and 4-hot codes. Depending on the application, such a probability may or may not
be considered too high.
4.3. TSC Checkers
In this subsection, we describe briefly the checkers for
the codes used herein.
TSC checkers for the Berger code were presented in
[23]-[25] and [40]. We consider the area-efficient method
given in [24] next. The checkers presented in [24] consist
of half-adders, full-adders, and two-rail checkers. Suppose the number of information bits in a codeword is k ,
and the number of checkbits is r. Then the number of full-adders is $k - \lfloor \log_2 k \rfloor - 1$, the number of half-adders
is upper bounded by $\lfloor \log_2 k \rfloor$, and the number of two-
4-hot
No
Code
Berger
Code
No
Code
MUSTANG
232
395
140
23 1
93
68
283
269
76
99
210
111
24
80
575
207
553
971
122
489
202
354
-
-
508
624
137
455
376
120
-
-
503
164
311
130
-
-
-
-
793
608
904
1234
232
843
637
460
760
1073
202
658
TABLE V
ESTIMATION OF THE NORMALIZED OVERHEAD CAUSED BY THE BERGER CODE
CKT
1 -hot
2-hot
3-hot
bbsse
cse
dk14
dk15
dk16
ex 1
ex4
ex6
kirkman
mark1
mc
opus
planet
2.33
2.42
0.76
0.83
2.88
1.20
0.96
0.84
0.66
0.70
3.02
1.25
0.86
2.19
0.53
1.26
0.99
2.36
0.70
1.32
0.64
*
2.30
1.54
*
0.44
0.73
0.92
*
*
*
*
4-hot
0.50
0.58
3.30
0.53
-
2.47
1.15
-
sand
scf
sse
styr
2.33
*
1.20
1.22
1.32
1.29
0.76
1.68
0.64
1.19
Average
1.89
1.25
1.23
s1
*
variable two-rail checkers (TRC2’s) required to form an
r-variable two-rail checker is $\lceil \log_2 (k + 1) \rceil - 1$. The
least literal-counts for a full-adder, a half-adder, and a
TRC2 checker among various designs we considered are
13, 6, and 8, respectively. Therefore, if l.c. is the literal-count of a Berger code checker, the bound on it is given
by
$$13(k - \lfloor \log_2 k \rfloor - 1) \leq \text{l.c.} \leq 13(k - \lfloor \log_2 k \rfloor - 1) + 8(\lceil \log_2 (k + 1) \rceil - 1) + 6 \lfloor \log_2 k \rfloor$$
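The bound above can be evaluated directly for a given number of information bits. The sketch below mirrors the bound as printed, with the literal-counts 13, 6, and 8 from the text as default arguments. The function and parameter names are hypothetical.

```python
def berger_checker_literal_bounds(k, full_adder=13, half_adder=6, trc2=8):
    """Return (lower, upper) bounds on the literal-count of a Berger code checker."""
    floor_log = k.bit_length() - 1           # floor(log2 k) for k >= 1
    r = k.bit_length()                       # ceil(log2(k + 1)) check bits
    adders = full_adder * (k - floor_log - 1)
    lower = adders
    upper = adders + trc2 * (r - 1) + half_adder * floor_log
    return lower, upper
```

For k = 7 information bits this gives bounds of 52 and 80 literals; the full-adder term dominates, which reflects the roughly linear growth of the checker with k.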
There are many TSC m-out-of-n code checker designs
in the literature [5], [28]-[31], from which the designer
TABLE VI
PROBABILITY OF UNDETECTED FAULT IN FULLY-OPTIMIZED CIRCUITS
CKT
bbsse
cse
dk14
dk15
dk16
ex I
ex4
ex6
kirkman
mark 1
mc
opus
planet
s1
sand
scf
sse
styr
Average
1-hot
5%
6%
10%
10%
48 %
*
14%
2%
*
4%
11%
1%
*
*
*
*
6%
*
10.6%
2-hot
3-hot
2%
13%
5%
9%
17%
5%
0.3%
3%
5%
0%
15%
-
1%
0%
4%
3%
3%
bbsse
dk14
dk15
dk16
ex 1
ex4
ex6
kirkman
mark I
mc
2%
3%
11%
-
15%
0%
-
opub
-
2%
5%
4.0%
6.6%
*
CKT
cse
-
4%
4%
5%
1%
9%
6%
1%
4-hot
TABLE VI1
TOTAL
LITERAL-COUNTS FOR THE SEl.F-c"ECI(IhC
planet
SI
sand
scf
sse
styr
I -hot
2-hot
3-hot
355
452
289
210
497
780
317
30 I
629
429
I20
239
1121
47 1
888
2370
355
906
344
446
316
230
445
777
310
298
S88
407
135
221
1016
484
818
I876
344
789
B t . N C " M A K K cIKClJl~l
S
4-hot
Dup.
349
539
-
-
-
-
-
3 84
566
260
I96
652
752
272
296
516
398
IO8
256
I768
524
I240
2480
348
I120
55 I
817
305
-
-
.-
-
-
-
430
-
-
-
1082
559
976
2034
349
I134
I124
-
2235
-
4.0%
can choose the best. Usually these checkers are rather
small because of the small values of n , except when 1-hot
code is used. In the following discussion, for each rn-hot
code, we choose the design with the least literal-count.
The final circuit size depends on the size of the combinational logic of the sequential circuit, latches, and
checkers. The actual area occupied by the latches may be
a decisive factor in choosing among the different rn-hot
codes. As suggested in [38], we count a latch as three
literals. In Table VII, we list the final literal-counts (for
the functional circuit containing the combinational logic
and latches, and the two checkers) for all benchmark examples synthesized with algebraic factorization. Also
listed in the table are the literal-counts for the duplicateand-compare type of implementation of all the benchmark
circuits. This includes duplicated functional circuits (each
obtained after using MUSTANG and full optimization by
MIS) and two-rail checkers to compare the two sets of
circuit and next state outputs of the functional circuits.
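The accounting behind the total literal-counts (combinational logic, plus three literals per latch, plus the checkers) can be sketched as follows. This is a minimal illustration: the sizing rule in `latches_for_m_hot` (the smallest n with C(n, m) at least the number of states) is an assumption about how an m-hot state code is dimensioned, and the function names and the numbers in the usage lines are hypothetical, not values from the benchmark tables.

```python
from math import comb

LITERALS_PER_LATCH = 3  # convention adopted in the text, following [38]


def latches_for_m_hot(num_states: int, m: int) -> int:
    """Smallest n such that an m-out-of-n code has at least num_states
    codewords (assumed sizing rule for an m-hot state encoding)."""
    if num_states < 1 or m < 1:
        raise ValueError("num_states and m must be positive")
    n = m
    while comb(n, m) < num_states:
        n += 1
    return n


def total_literal_count(comb_logic: int, num_latches: int, checker_literals) -> int:
    """Combinational logic + 3 literals per latch + all checker literal-counts."""
    return comb_logic + LITERALS_PER_LATCH * num_latches + sum(checker_literals)


if __name__ == "__main__":
    # Hypothetical 20-state machine: latch count for 1-, 2-, and 3-hot codes.
    for m in (1, 2, 3):
        print(m, latches_for_m_hot(20, m))
    # Hypothetical literal-counts for the logic and the two checkers.
    print(total_literal_count(100, 7, [40, 30]))
```

The sketch shows why the latch term can dominate for the 1-hot code: the number of latches equals the number of states, whereas for m = 2 or 3 it grows much more slowly.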
4.4. Outputs Encoded with Separable Codes
In many cases the output patterns that normally occur
consist of only a small part of the whole output space.
Under these circumstances, separable codes may frequently need fewer checkbits and potentially lead to simpler functional circuits. Conversely, as seen in the previous section, the size of Berger code checkers grows only
linearly with the number of its inputs, whereas the size of
a separable code checker may grow more rapidly. For
some circuits, the checker accounts for a significant portion of the overhead. If this is the case, one may want to
try a different approach which may reduce the checker
size. We consider these issues next.
We use the following encoding scheme for obtaining
the separable code. This scheme is an extension of the one
presented in [16]. We first obtain the zero-count (the
number of 0's) of all the information symbols that can
normally be produced. We put all the information symbols with the same zero-count into one group. Suppose
there are $G$ such groups. Obviously, $1 \le G \le k + 1$. Let $r$ be the number of checkbits. We do not use the check symbol which has $r$ 1's. Therefore, $r$ is chosen to be $\lceil \log_2 (G + 1) \rceil$. The $r$-bit check symbols are assigned to the different groups in ascending order (i.e., the group with a larger zero-count is assigned a larger check symbol). For example, let $k = 9$ and the different groups present have zero-counts 1, 2, 5, 6, 7, and 9. Then $G = 6$ and $r = 3$. Therefore, one possible assignment would be to use the check symbols 110, 101, 100, 011, 010, 001, respectively, for the information symbols with zero-counts 9, 7, 6, 5, 2, 1.
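The grouping and assignment steps above can be sketched in a few lines. This is a minimal illustration under one assumption: the G check symbols are taken to be the G largest r-bit values below the all-1's symbol, which reproduces the example assignment in the text but is only one possible choice. The function name `assign_check_symbols` is hypothetical.

```python
def assign_check_symbols(info_symbols):
    """Group information symbols (bit strings) by zero-count and assign
    r-bit check symbols in ascending order, never using the all-1's symbol.

    Returns (r, table) where table maps zero-count -> check symbol string.
    """
    zero_counts = sorted({s.count("0") for s in info_symbols})
    G = len(zero_counts)
    if G == 0:
        raise ValueError("at least one information symbol is required")
    r = G.bit_length()          # equals ceil(log2(G + 1)) for G >= 1
    first = (1 << r) - 1 - G    # assumed choice: use the G values just below 11...1
    table = {z: format(first + i, f"0{r}b") for i, z in enumerate(zero_counts)}
    return r, table


if __name__ == "__main__":
    k = 9
    # One representative information symbol per zero-count from the example.
    symbols = ["0" * z + "1" * (k - z) for z in (1, 2, 5, 6, 7, 9)]
    r, table = assign_check_symbols(symbols)
    print(r, table)
```

For the example with k = 9, the sketch yields r = 3 and assigns 001 through 110 to zero-counts 1 through 9, matching the assignment given in the text.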
A self-checking circuit based on the preceding encoding technique is shown in Fig. 5. The functional circuit consists of two subcircuits: the information symbol generator (ISG) and the check symbol generator (CSG). These two subcircuits do not share any logic. The checker consists of a check symbol complement generator (CSCG), which generates the bit-by-bit complement of the check symbol in the fault-free case, and an $r$-variable two-rail checker (TRC$_r$). The structure of this checker for the separable code is the same as the structure frequently used for systematic codes [23].
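The behavior of the two-rail checker in this structure can be modeled at the bit level. The sketch below uses a commonly cited eight-literal TRC$_2$ realization and chains r - 1 such modules; whether this is the exact design and interconnection chosen by the authors is an assumption, and the function names are hypothetical.

```python
def trc2(a, b):
    """Two-variable two-rail checker: each argument is a pair (x, y).
    The output pair is complementary iff both input pairs are complementary."""
    (x0, y0), (x1, y1) = a, b
    f = (x0 & y1) | (y0 & x1)
    g = (x0 & x1) | (y0 & y1)
    return f, g


def two_rail_checker(pairs):
    """r-variable two-rail checker built from r - 1 TRC2 modules (chained here)."""
    result = pairs[0]
    for pair in pairs[1:]:
        result = trc2(result, pair)
    return result


def separable_code_checker(check_symbol, cscg_output):
    """Pair each checkbit with the corresponding CSCG output bit and feed the
    pairs to the two-rail checker. Returns True when the output is a codeword
    (01 or 10), i.e., no error is indicated."""
    f, g = two_rail_checker(list(zip(check_symbol, cscg_output)))
    return f != g


if __name__ == "__main__":
    print(separable_code_checker((0, 1, 1), (1, 0, 0)))  # complementary pairs
    print(separable_code_checker((0, 1, 1), (0, 0, 0)))  # CSCG saw an abnormal symbol
```

When the CSCG outputs are all 0's (an information symbol that should not occur), at least one pair is non-complementary, so the checker output leaves the two-rail code.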
The functions of the CSCG are derived as follows. For
all the information symbols that the ISG of the functional
circuit can produce under normal operation, the CSCG
outputs correspond to the bit-by-bit complement of the
expected check symbol. For the information symbols that
should not normally occur, all the CSCG outputs are assigned 0's. It is easy to specify such an input/output mapping in the ESPRESSO [39] format. Since the information
symbols that do not normally occur map to all 0's, they
do not need to be specified.
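As an illustration of how the CSCG mapping might be written down, the sketch below emits a truth table in the PLA text format read by ESPRESSO (`.i`, `.o`, one row per specified input, `.e`). It assumes the check symbol for each normally occurring information symbol is already known; the function name `cscg_pla` and the tiny example mapping are hypothetical.

```python
def cscg_pla(k, check_symbol_of):
    """Build a PLA description of the CSCG.

    check_symbol_of maps each normally occurring k-bit information symbol
    (bit string) to its r-bit check symbol (bit string). Each listed row
    outputs the bit-by-bit complement of the check symbol; unlisted inputs
    are left unspecified and are treated as producing all 0's.
    """
    if not check_symbol_of:
        raise ValueError("at least one information symbol is required")
    r = len(next(iter(check_symbol_of.values())))
    lines = [f".i {k}", f".o {r}"]
    for info, check in sorted(check_symbol_of.items()):
        if len(info) != k or len(check) != r:
            raise ValueError("inconsistent symbol widths")
        complement = "".join("1" if bit == "0" else "0" for bit in check)
        lines.append(f"{info} {complement}")
    lines.append(".e")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical 3-bit example with two normally occurring symbols.
    print(cscg_pla(3, {"011": "01", "001": "10"}))
```

Only the normally occurring information symbols appear as rows, in line with the remark that symbols mapping to all 0's do not need to be specified.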
Since ISG and CSG are synthesized separately, no internal stuck-at fault in the functional circuit can create an error at the outputs of both ISG and CSG. Based on this observation, it is easy to show that the checker is code-disjoint.

Fig. 5. Self-checking circuit based on the separable code. (Block labels: Inputs; Functional Circuit, containing the Check Symbol Generator; Outputs; Checkbits; Checker, containing the Check Symbol Complement Generator and the r-variable Two-Rail Checker; Complements of Checkbits.)

The fact that sharing of logic is not allowed between ISG and CSG tends to increase the area of the functional circuit. However, since there may be fewer bits in the check symbol and the area of the checker may be less, the overall area for a self-checking circuit based on a separable code may be less than the area of a self-checking circuit based on a systematic code. The preceding approach can be extended to sequential circuits, as before. In this extension, the ISG can be combined with the next state logic, but the CSG has to be synthesized separately.

The results for some large FSM sequential benchmark machines are given in Table VIII. As before, the states of each machine are encoded with all possible m-hot codes; the number given in the corresponding column is the literal-count for the combinational logic of the sequential circuit whose outputs are encoded with separable codes. The last column gives the literal-count of the separable code checker, which includes CSCG and TRC$_r$.

TABLE VIII
LITERAL-COUNTS FOR SEQUENTIAL CIRCUITS ENCODED WITH SEPARABLE CODES
CKT      1-hot   2-hot   3-hot   4-hot   Checker
ex1      457     617     747     -       367
mark1    106     155     181     -       78
planet   533     768     880     953     463
scf      694     992     1227    1364    641

Next, in Table IX we give the total literal-count of the self-checking circuit based on separable codes. This includes the literal-counts for the functional circuit (including both the combinational logic and latches), the separable code checker, and the m-out-of-n code checker. The literal-counts for the duplicate-and-compare scheme are also listed as a reference. Comparing with the results in Table VII, we find that the literal-counts for the self-checking circuits based on separable codes are better for mark1 and scf, and worse for ex1 and planet, than the literal-counts for self-checking circuits based on the Berger code, which is a systematic code.

TABLE IX
TOTAL LITERAL-COUNTS FOR THE SELF-CHECKING CIRCUITS ENCODED WITH SEPARABLE CODES
CKT      1-hot   2-hot   3-hot   4-hot   DUP
ex1      1002    1079    1190    -       752
mark1    329     319     335     -       398
planet   1414    1364    1460    1522    1768
scf      2297    1846    2027    2147    2480

V. CONCLUSION
In this paper we presented gate-level self-checking circuit, system, and checker synthesis methodologies. The overhead due to the proposed methodologies was evaluated by synthesizing self-checking circuits for MCNC benchmarks.
From the experimental results we found that, if the states of a sequential circuit are encoded with an m-hot code, the size of the combinational logic usually increases as m increases. However, when the area of the whole self-checking circuit is taken into account (including that of the latches and checkers), the 2-hot code seems to give the best result in most cases. Second, since the normalized overhead of the combinational logic is close to 1 on the average (at least for the 2- and 3-hot codes), one can expect the overhead introduced in the circuit to be close to $\lceil \log_2 (k + 1) \rceil / (k + p)$, where $k$ and $p$ are the number of primary outputs and next state lines, respectively. Therefore, the overhead due to Berger encoding should, in general, decrease as the size of the circuit becomes larger.
Self-checking systems are of great value for ensuring reliability of computations. Therefore, we hope that our results will help designers of such systems choose from among the different approaches.
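The overhead estimate quoted in the conclusion, ⌈log2 (k + 1)⌉/(k + p), is easy to tabulate. The sketch below is only an illustration of that expression; the function name `estimated_berger_overhead` and the (k, p) pairs are hypothetical and are not taken from the benchmark circuits.

```python
def estimated_berger_overhead(k: int, p: int) -> float:
    """Estimated overhead ceil(log2(k + 1)) / (k + p), where k is the number
    of primary outputs and p the number of next state lines."""
    if k < 1 or p < 0:
        raise ValueError("k must be positive and p non-negative")
    checkbits = k.bit_length()  # equals ceil(log2(k + 1)) for k >= 1
    return checkbits / (k + p)


if __name__ == "__main__":
    # Hypothetical circuit sizes, from small to large.
    for k, p in ((7, 1), (16, 4), (64, 8)):
        print(k, p, round(estimated_berger_overhead(k, p), 3))
```

Since the number of checkbits grows only logarithmically in k while the denominator grows linearly, the estimate shrinks as the circuit gets larger, which is the trend stated in the text.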
REFERENCES
[1] X. Castillo, S. K. McConnel, and D. P. Siewiorek, "Derivation and calibration of a transient error reliability model," IEEE Trans. Comput., vol. C-31, pp. 658-671, July 1982.
[2] Y. Savaria, N. C. Rumin, J. F. Hayes, and V. K. Agarwal, "Soft-error filtering: A solution to the reliability problem of future VLSI digital circuits," IEEE Proc., vol. 74, no. 5, pp. 669-683, May 1986.
[3] M. M. Yen, W. K. Fuchs, and J. A. Abraham, "Designing for concurrent error detection in VLSI: Application to a microprogram control unit," IEEE J. Solid-State Circuits, vol. SC-22, pp. 595-605, Aug. 1987.
[4] W. C. Carter and P. R. Schneider, "Design of dynamically checked computers," in Proc. IFIP '68, vol. 2, Edinburgh, Scotland, Aug. 1968, pp. 878-883.
[5] D. A. Anderson and G. Metze, "Design of totally self-checking circuits for m-out-of-n codes," IEEE Trans. Comput., vol. C-22, pp. 263-269, Mar. 1973.
[6] J. E. Smith and G. Metze, "Strongly fault secure logic networks," IEEE Trans. Comput., vol. C-27, pp. 491-499, June 1978.
[7] R. W. Cook, W. H. Sisson, T. F. Storey, and W. N. Toy, "Design of a self-checking microprogram control," IEEE Trans. Comput., vol. C-22, pp. 255-262, Mar. 1973.
[8] D. K. Pradhan and J. J. Stiffler, "Error correcting codes and self-checking circuits in fault-tolerant computers," IEEE Computer, vol. 13, pp. 27-37, Mar. 1980.
[9] G. P. Mak, J. A. Abraham, and E. S. Davidson, "The design of PLAs with concurrent error detection," in Proc. Int. Symp. Fault-Tolerant Comput., Santa Monica, CA, June 1982, pp. 303-310.
[10] D. K. Pradhan, "A new class of error correcting/detecting codes for fault tolerant applications," IEEE Trans. Comput., vol. C-29, pp. 471-481, June 1980.
[11] H. Dong, "Modified Berger codes for detection of unidirectional errors," in Proc. Int. Symp. Fault-Tolerant Comput., Santa Monica, CA, June 1982, pp. 317-320.
[12] B. Bose and D. J. Lin, "Systematic unidirectional error-detecting codes," IEEE Trans. Comput., vol. C-34, pp. 1026-1032, Nov. 1985.
[13] N. K. Jha and M. B. Vora, "A systematic code for detecting t-unidirectional errors," in Proc. Int. Symp. Fault-Tolerant Comput., Pittsburgh, PA, June 1987, pp. 96-101.
[14] B. Bose, "Burst unidirectional error detecting codes," IEEE Trans. Comput., vol. C-35, pp. 350-353, Apr. 1986.
[15] M. Blaum, "On systematic burst unidirectional error detecting codes," IEEE Trans. Comput., vol. C-37, pp. 453-457, Apr. 1988.
[16] N. K. Jha, "Separable codes for detecting unidirectional errors," IEEE Trans. Computer-Aided Design, vol. CAD-8, pp. 571-574, May 1989.
[17] J. E. Smith and G. Metze, "The design of totally self-checking combinational circuits," in Proc. Int. Symp. Fault-Tolerant Comput., Los Angeles, CA, June 1977, pp. 130-134.
[18] M. Diaz, "Design of totally self-checking and fail safe sequential machines," in Proc. Int. Symp. Fault-Tolerant Comput., Urbana, IL, June 1974, pp. 3-9 - 3-24.
[19] F. Ozguner, "Design of totally self-checking asynchronous sequential machines," in Proc. Int. Symp. Fault-Tolerant Comput., June 1977, pp. 124-129.
[20] J. Viaud and R. David, "Sequentially self-checking circuits," in Proc. Int. Symp. Fault-Tolerant Comput., June 1980, pp. 263-268.
[21] J. E. Smith and P. Lam, "A theory of totally self-checking system design," IEEE Trans. Comput., vol. C-32, pp. 831-844, Sept. 1983.
[22] J. M. Berger, "A note on error detecting codes for asymmetric channels," Information & Control, vol. 4, pp. 68-73, Mar. 1961.
[23] M. Ashjaee and J. M. Reddy, "On totally self-checking checkers for separable codes," IEEE Trans. Comput., vol. C-26, pp. 737-744, Aug. 1977.
[24] M. A. Marouf and A. D. Friedman, "Design of self-checking checkers for Berger codes," in Proc. Int. Symp. Fault-Tolerant Comput., June 1978, pp. 179-184.
[25] J.-C. Lo and S. Thanawastien, "The design of fast totally self-checking Berger code checkers based on Berger code partitioning," in Proc. Int. Symp. Fault-Tolerant Comput., Tokyo, Japan, June 1988, pp. 226-231.
[26] J. E. Smith, "On separable unordered codes," IEEE Trans. Comput., vol. C-33, pp. 741-743, Aug. 1984.
[27] C. V. Frieman, "Optimal error detecting codes for completely asymmetric binary channels," Information & Control, vol. 5, pp. 64-71, Mar. 1962.
[28] S. M. Reddy, "A note on self-checking checkers," IEEE Trans. Comput., vol. C-23, pp. 1100-1102, Oct. 1974.
[29] M. A. Marouf and A. D. Friedman, "Efficient design of self-checking checker for m-out-of-n code," IEEE Trans. Comput., vol. C-27, pp. 482-490, June 1978.
[30] N. Gaitanis and C. Halatsis, "A new design method for m-out-of-n TSC checkers," IEEE Trans. Comput., vol. C-32, pp. 273-283, Mar. 1983.
[31] T. Nanya and Y. Tohma, "A 3-level realization of totally self-checking checkers for m-out-of-n codes," in Proc. Int. Symp. Fault-Tolerant Comput., Milan, Italy, June 1983, pp. 173-176.
[32] Y. Tamir and C. H. Sequin, "Design and application of self-testing comparators implemented with MOS PLA's," IEEE Trans. Comput., vol. C-33, pp. 493-506, June 1984.
[33] G. Mago, "Monotone functions in sequential circuits," IEEE Trans. Comput., vol. C-22, pp. 928-933, Oct. 1973.
[34] R. K. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. R. Wang, "MIS: A multiple-level logic optimization program," IEEE Trans. Computer-Aided Design, vol. CAD-6, pp. 1062-1081, Nov. 1987.
[35] G. De Micheli, R. K. Brayton, and A. Sangiovanni-Vincentelli, "Optimal state assignment for finite state machines," IEEE Trans. Computer-Aided Design, vol. CAD-4, pp. 269-284, July 1985.
[36] S. Devadas, H-K. T. Ma, A. R. Newton, and A. Sangiovanni-Vincentelli, "MUSTANG: State assignment of finite state machines targeting multi-level logic implementations," IEEE Trans. Computer-Aided Design, vol. 7, pp. 1290-1300, Dec. 1988.
[37] H-K. T. Ma, S. Devadas, A. R. Newton, and A. Sangiovanni-Vincentelli, "Test generation for sequential circuits," IEEE Trans. Computer-Aided Design, vol. 7, pp. 1081-1093, Oct. 1988.
[38] X. Du, G. Hachtel, B. Lin, and A. R. Newton, "MUSE: A multilevel symbolic encoding algorithm for state assignment," IEEE Trans. Computer-Aided Design, vol. 10, pp. 28-38, Jan. 1991.
[39] R. K. Brayton, C. McMullen, G. D. Hachtel, and A. Sangiovanni-Vincentelli, Logic Minimization Algorithms for VLSI Synthesis. Kluwer Academic Publishers, 1984.
[40] S. J. Piestrak, "Design of fast self-testing checkers for a class of Berger codes," IEEE Trans. Comput., vol. C-36, pp. 629-634, May 1987.

Niraj K. Jha (S'85-M'86) received the B.Tech. degree in electronics and electrical communication engineering from the Indian Institute of Technology, Kharagpur, India, in 1981, the M.S. degree in electrical engineering from S.U.N.Y. at Stony Brook, NY, in 1982, and the Ph.D. degree in electrical engineering from University of Illinois, Urbana, IL, in 1985.
He is currently an Associate Professor of Electrical Engineering at Princeton University. From 1985 to 1987 he was an Assistant Professor of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor. He has served as the Program Chairman of the 1992 Workshop on Fault-Tolerant Parallel and Distributed Systems. He has also served on the program committees for the IEEE International Conference on Computer Design in 1988, 1989, and 1990, for the IEEE International Symposium on Fault-Tolerant Computing in 1990 and 1991, and the International Conference on VLSI Design in 1993. He has co-authored a book titled Testing and Reliable Design of CMOS Circuits. He has authored or co-authored over 80 technical papers. His research interests include digital system testing, fault-tolerant computing, computer-aided design of integrated circuits, and parallel processing.
He is the recipient of the AT&T special-purpose grant award and the NEC Preceptorship award.

Sying-Jyan Wang received the B.S. degree in electronics engineering from the National Taiwan University, Taiwan, China, in 1984, and the Ph.D. degree in electrical engineering from Princeton University in 1992.
From 1984 to 1986 he was an R.O.T.C. officer in the Chinese Air Force, Taiwan. From 1986 to 1987 he was a teaching assistant in the Department of Electrical Engineering, National Taiwan University. He is currently an Associate Professor of Institute of Information Science at National Chung-Hsing University. His research interests include fault-tolerant computing, digital system testing, and communication architectures.