Multiple Error Recovery in TMR System K.JansiRani ,P.Vinitha , Dr.G.K.D.PrasannaVenkatesan

advertisement
International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 7 - May 2014
Multiple Error Recovery in TMR System
K.JansiRani 1,P.Vinitha2, Dr.G.K.D.PrasannaVenkatesan3
Post Graduate student- ME in VLSI Design1, Assistant Professor2, Vice Principal, Professor and Head3
Department of E.C.E., PGP College of Engineering and technology, Namakkal, Tamilnadu,India
Abstract : A new approach to implement fault
tolerant FPGA ,which can mitigate radiation-induced
temporary faults (single-event upsets (SEUs) at
moderate cost. we present a scan-chain-based multiple
error recovery technique for triple modular
redundancy (TMR) systems (SMERTMR). The
proposed technique reuses scan-chain flip-flops
fabricated for testability purposes to detect and correct
faulty modules in the presence of single or multiple
transient faults. In the proposed technique, the
manifested errors are detected at the modules outputs,
while the temporary faults are detected by comparing
the internal states of the TMR modules, ours allows to
detect and eliminate its internal temporary
configuration upsets without interrupting normal
functioning. Faults are identified and Removed using a
self built Configuration to avoid a single point of
failure, is implemented as fault-tolerant using triple
modular redundancy system. Finally, This paper
proposes a test pattern generator(TPG) for built-in
self-test. Our method generates test pattern for multiple
single input change (MSIC) vectors in a pattern, each
vector applied to a scan chain is an SIC vector .The
proposed approach can be validated through fault
injection experiments.
Keywords – Fault tolerant design, automatic Test
pattern generation,Built in self test,Triple modular
redundancy(TMR)
I.INTRODUCTION
Xilinx FPGAs use Static RAM-based technologies
which is known to be very susceptible to radiation
and electromagnetic Noise. The major effects caused
by Single-Event Upsets (SEUs) or soft errors,
because only some logic states of memory elements
are changed but the circuit/device itself is not
permanently damaged. In FPGAs, Single Event
Upsets may directly corrupt computation results or
induce changes to configuration memory, Due to
their flexibility, FPGAs are attractive for critical
embedded applications, but their reliability is
insufficient unless some fault-tolerance techniques
capable of mitigating soft errors are used. This
technique should allow for online error detection,
correction during system operation, fast fault location
and quick recovery from temporary fault, and fast
permanent fault repair through reconfiguration
ISSN: 2231-5381
engine.we are interested in designing and
implementing a fault-tolerant mechanism for soft
core processor using FPGA. FPGA implementations
of various fault-tolerant soft core and hardcore
processors, lockstep systems using dual hardcore
processors PowerPC found in certain FPGAs are
constructed. However a limited number of available
Power PC processors in FPGAs and it restrict their
utilization to build a fault-tolerant Multiprocessor
System-on-Chip .
Therefore, one option to implement larger number of
lockstep modules in a single FPGA is to employ soft
core processors that can use all available
reconfigurable resources of the device
II.HARDWARE REDUNDANCY TECHNIQUES
FOR FPGAS
The most widely used fault-tolerant methods used to
mitigate logic errors in FPGA designs rely on
hardware redundancy 1.a) duplication with
comparison (DWC) for detecting faults (Fig. 1 a and
b ) 1.b)Triple modular redundancy (TMR) with
majority voter for masking faults (Fig. 1). In DWC,
the original block to be protected is replicated twice
and the results produced by the original and the
outputs of replicated blocks are compared to detect
faults. The lockstep scheme of Fig.1.a) is the
implementation of DWC at the processor level,
supported by some Xilinx FPGAs. Two identical
processors P1 and P2 receive the same inputs,
continuously execute the same instructions.Finally
the results are compared step-by-step at each clock
cycle. P2 generates the reference results to be
compared against those of _P1 that provides the
system output. Generally , DWC is able to detect the
errors but not able to correct errors, because it cannot
point out the faulty module.
Fig 1(a) DWR (b)TMR
http://www.ijettjournal.org
Page 329
International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 7 - May 2014
However, it capable to tolerate temporary faults,
provided that it is supported by some re execution
procedure. In TMR, the original block is replicated
inputs, and a majority 2-out-of-3 voter is used to
determine the correct result. It also possible to detect
fault in the outputs of the three blocks to identify the
faulty one. TMR mask
both permanent and
temporary faults of a single block.
III.NEW FAULT
TOLERANT
ARCHITECTURE WITH NEW VOTER
Fig 2 shows the architecture of the fault-tolerant
system proposed here, whose two main blocks
are:1)Enhanced Lockstep scheme 2)fault-tolerant
Configuration Engine. It realize on using two Xilinx
Virtex hardware primitives. The COMP_MUX
consists of two blocks: 1) The VOTER that indicates
any mismatch between the outputs the Multiplexer
which connects one of the processors to the system
output, so the another processor is reported to be
faulty, the other is switched on. Automatically
switching is performed and also executed in one
clock cycle. Then, the three modules need to be
synchronized to put the newly reconfigured one to
the same state as the correct one, and also enabling
them to continue executing the same task in Lockstep
again. The CRB has a memory shared by both
processors to store their context using on-chip.
BRAM that has dual-port configuration providing
simultaneous access to two Recovery Buses RB1 and
RB2. These buses perform to save and restore the
processor’s contexts to the BRAM as well as to
control the CRB during the context switching
process.
Fig 3.Proposed voter
the possibility to carry out automatic fault injection
campaigns for the configuration memory of P1 and
the COMP_MUX. The main goal is to obtain some
statistical data on the design complex, percentages of
sensitive bits and persistent errors .Once a fault is
injected, the lock stepped pair of processors _P1 and
_P2 executes the application datas that checks
whether a peripheral of a softcore operates correctly.
This application was chosen because it perform large
number of transactions passing through PLB bus and
thus offers the advantage that COMP_MUX
can verify the consistency of outputs of _P1 and P2
for frequently varying data. Although this application
data does not cover the complete set of instructions
for execution, it is a valuable proof of concept for
the proposed system.
The architecture of the proposed voter is depicted in
Fig. 3 As shown in the figure, three comparators
(C12, C13, and C23) are used to represent any
mismatch between TMR modules. As an example,
TE23 signal is activated once a mismatch between
Outputs II and III is detected. If one modules
generates an erroneous output (e.g., Output I), two of
the comparators (here, C12 and C13) will activate the
mismatch signals (here, T E12 and T E13) and only
one of the comparators (here, C23) will not activate
the corresponding mismatch signal (here, T E23). In
case of a faulty comparator (e.g., C13), only the
corresponding signal (here, T E13) is activated and
the other signals (here, T E12 and T E23) are
deactivated.
Fig 2. proposed scheme
IV.VALIDATION USING FAULT INJECTION
To validate the efficiency of the fault-tolerant method
is proposed, we have provided the Fault Tolerant
Configuration Engine with
Fig 4.Sic vector generation
ISSN: 2231-5381
http://www.ijettjournal.org
Page 330
International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 7 - May 2014
This paper has proposed a low-power test pattern
generation method.It can easily implemented by
hardware and also developed a theory to express a
sequence generated by linear sequential architectures,
and extracted a class of Single Input Change
sequences named MSIC. Analysis results showed that
an Multiple single Input Change sequence had the
favorable features of uniform distribution, low input
transition, and low dependency relationship between
the test length and the Test Pattern Generations in the
initial states.The proposed reconfigurable Johnson
counter or scalable SIC counter, the Multiple Single
Input Change-Test Pattern Generation can be
implemented easily,Test-per-clock schemes and testper-scan schemes are flexible for test pattern
generation.
V.TEST PATTERN GENERATION METHOD
Assume m primary inputs(PIs) and M scan chains. In
a full scan design and each scan chains has L scan
cells fig 4 shows the symbolic simulation for one
generated pattern. The test pattern vector generated
by an m-bit LFSR with the primitive polynomial Can
be expressed as S(t)=S0(t)S1(t)S3(t)…Sm-1(t)(S referred to
as the seed) and the vector generated by an l bit
Johnson
counter
can
be
expressed
as
J(t)=J0(t)J1(t)J2(t)…Jl-1(t).In the first clock cycle ,
J꞊JₒJ1J2…..J l-1(t) will bit XOR with S=S0S1S2 …SM-1
and the results X1Xl+1X2l+1…. Will be shifted into M
scan chains .Respectively,in the second clock cycle
J=J0,J2….Jl-2 will be circularly shifted as Jl-1J0J1……Jl-2
which will also bit XOR with the seed
S=S0S1S2…..SM-1 .The resulting X2Xl+2X2l+2…..X(M-1)l+2
will be shifted into M scan chains respectively After
1clocks each scan chain will be fully loaded with a
unique Johnson code word and seed S0S1S2…Sm-1 will
be applied to m PIs.
TABLE 1
EXPERIMENTAL RESULTS
Parameter
Existing
method
17
Proposed
method
16
Number of 4
input LUTs
Number of
bonded IOBs
31
28
50
46
Time (ns)
15.9
14.9
Number of
Slice LUTs
Based on the analysis, it is come to known that the
testing process which uses sic technique gives the
better reduction in area and time when compared to
the other methods and is shown in the fig.5.
35
30
25
20
existing
15
proposed
10
5
0
area
time
Fig.5 Area and Time Comparison
VIII.SIMULATION RESULTS
VI.EXPERIMENTAL RESULTS
The proposed enhanced TMR technique and tiling
technique are simulated by using Xilinx ISE 12.1i
and implemented in Virtex-5 FPGA processor. The
experimental results are given in Table 1.
VII.PERFORMANCE ANALYSIS
The performance of the our scheme with existing
scheme are analyzed based on the area and time
consumption which was given in table 1.
Fig 6:Fault Injection And Verification
ISSN: 2231-5381
http://www.ijettjournal.org
Page 331
International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 7 - May 2014
Programmable CustomComputing Machines, pp. 47-54, May
2010.
[6] Aeroflex Gaisler AB, “LEON3 Product Sheet,” www.aeroflex.
com/gaisler, 2012.
[7] Aeroflex Gaisler AB, “LEON3-FT SPARC V8 Processor,”
www. aeroflex.com/gaisler, 2012.
[8]
Xilinx,
Inc.,
“SEU
Strategies
for
VirtexDevices,”XAPP864,htttp://www.xilinx.com/support/documentatio
n/application_
notes/xapp864.pdf, Mar. 2009.
[9] M. Lanuzza et al., “An Efficient and Low-Cost Design
Methodology to Improve SRAM-Based FPGA Robustness in
Space and Avionics Applications,” Proc. Int’l Workshop
Reconfigurable Computing: Architectures, Tools and Applications,
vol. 5453, pp. 74-84, 2009.
[10] B. Dutton and C. Stroud, “Single Event Upset Detection and
Correction in Virtex-4 and Virtex-5 FPGAs,” Proc. ISCA 24th Int.
Conf. Computers and Their Applications (CATA 2009), pp. 57-62,
Apr. 2009.
Fig 7: Faulty Unit Identification
IX.CONCLUSION
In this paper, we presented a roll-forward technique,
called SMERTMR, to recover multiple errors in
TMR systems. In the proposed technique, we
employed built-in design for testability resources
available in digital circuits to recover faulty modules
in the presence of multiple transient faults. To
validate the proposed technique, we performed fault
injection experiments by generating test pattern using
Johnson counter. An analytical assessment also
demonstrated that SMERTMR improves the
reliability of TMR systems by several orders of
magnitude compared to the state-of-the-art
techniques.
AUTHORS PROFILE
1.K.JansiRani,Post Graduate student –ME in VLSI
Design at
PGP College of Engineering and Technology,Namakkal
2.P.Vinitha ,Assistant Professor /Electronics
and
communication engineering and Technology Department at
PGP
College
of
Engineering
and
Technology,Namakkal,Tamilnadu,india
3.Dr.G.K.D.Prasanna Venkatesan, Completed Ph.D from
College of Engineering, Anna University, Chennai, India. He is
Currently Working as Vice-Principal, Professor & Head of
Department of Electronics and Communication Engineering at
PGP College of Engineering and Technology. Namakkal, Tamil
REFERENCES
nadu, India. His research interests includes Wireless Sensor
[1] Xilinx, Inc., “PowerPC 405 Processor Block Reference Guide
(UG018),” www.xilinx.com/support/documentation/use
guides/ug018.pdf, 2012.
[2] Xilinx, Inc., “Single-Event Upset Mitigation Selection Guide,”Appl.
Note
XAPP987(v1.0),http://www.xilinx.com/support/documentation/ap
plication_notes/xapp987.pdf, Mar. 2008.
Networks, 4G Wireless Networks, Cloud Computing,Adhoc
Network, etc.,
.
[3] Xilinx, Inc., “PPC405 Lockstep System on ML310,”Appl.
NoteXAPP564 v1.0.2,
application_notes/xapp564.pdf, Jan. 2007.
[4] F. Abate et al., “New Techniques for Improving the
Performanceof the Lockstep Architecture for SEEs Mitigation in
FPGA
Embedded Processors,” IEEE Trans. Nuclear Science, vol. 56,no.
4, pp. 1992-2000, Aug. 2009.
[5] Y. Ichinomiya et al., “Improving the Robustness of a Softcore
Processor against SEUs by Using TMR and Partial
Reconfiguration,” Proc. IEEE Ann. Int’l Symp. Field-
ISSN: 2231-5381
http://www.ijettjournal.org
Page 332
Download