International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 7 - May 2014

Multiple Error Recovery in TMR System

K. JansiRani 1, P. Vinitha 2, Dr. G.K.D. PrasannaVenkatesan 3
Post Graduate Student, ME in VLSI Design 1; Assistant Professor 2; Vice Principal, Professor and Head 3
Department of E.C.E., PGP College of Engineering and Technology, Namakkal, Tamilnadu, India

Abstract: We propose a new approach to implementing a fault-tolerant FPGA that can mitigate radiation-induced temporary faults (single-event upsets, SEUs) at moderate cost. We present a scan-chain-based multiple error recovery technique for triple modular redundancy (TMR) systems (SMERTMR). The proposed technique reuses the scan-chain flip-flops fabricated for testability purposes to detect and correct faulty modules in the presence of single or multiple transient faults. In the proposed technique, manifested errors are detected at the module outputs, while temporary faults are detected by comparing the internal states of the TMR modules. This allows internal temporary configuration upsets to be detected and eliminated without interrupting normal operation. Faults are identified and removed using a self-built configuration engine which, to avoid a single point of failure, is itself implemented as fault-tolerant using triple modular redundancy. Finally, this paper proposes a test pattern generator (TPG) for built-in self-test. Our method generates test patterns made of multiple single-input-change (MSIC) vectors; each vector applied to a scan chain is an SIC vector. The proposed approach is validated through fault-injection experiments.

Keywords – Fault tolerant design, automatic test pattern generation, built-in self-test, triple modular redundancy (TMR)

I. INTRODUCTION

Xilinx FPGAs use static RAM-based technology, which is known to be very susceptible to radiation and electromagnetic noise.
The major effects are caused by single-event upsets (SEUs), also called soft errors because only the logic states of some memory elements are changed while the circuit or device itself is not permanently damaged. In FPGAs, SEUs may directly corrupt computation results or induce changes to the configuration memory. Due to their flexibility, FPGAs are attractive for critical embedded applications, but their reliability is insufficient unless fault-tolerance techniques capable of mitigating soft errors are used. Such techniques should allow for online error detection and correction during system operation, fast fault location and quick recovery from temporary faults, and fast repair of permanent faults through a reconfiguration engine. We are interested in designing and implementing a fault-tolerant mechanism for a soft-core processor on an FPGA. FPGA implementations of various fault-tolerant soft-core and hard-core processors have been constructed, including lockstep systems using the dual hard-core PowerPC processors found in certain FPGAs. However, the limited number of PowerPC processors available in FPGAs restricts their use for building a fault-tolerant multiprocessor system-on-chip. Therefore, one option for implementing a larger number of lockstep modules in a single FPGA is to employ soft-core processors, which can use all available reconfigurable resources of the device.

ISSN: 2231-5381

II. HARDWARE REDUNDANCY TECHNIQUES FOR FPGAS

The most widely used fault-tolerant methods for mitigating logic errors in FPGA designs rely on hardware redundancy:
a) duplication with comparison (DWC) for detecting faults (Fig. 1a);
b) triple modular redundancy (TMR) with a majority voter for masking faults (Fig. 1b).
In DWC, the original block to be protected is duplicated, and the outputs of the original and the replicated block are compared to detect faults. The lockstep scheme of Fig. 1a is the implementation of DWC at the processor level, supported by some Xilinx FPGAs.
Two identical processors, P1 and P2, receive the same inputs and continuously execute the same instructions, and their results are compared step by step at each clock cycle. P2 generates the reference results to be compared against those of P1, which provides the system output. In general, DWC can detect errors but cannot correct them, because it cannot point out the faulty module. However, it is capable of tolerating temporary faults, provided that it is supported by some re-execution procedure.

Fig. 1. (a) DWC (b) TMR

http://www.ijettjournal.org Page 329

In TMR, the original block is triplicated, the three replicas receive the same inputs, and a majority (2-out-of-3) voter is used to determine the correct result. It is also possible to compare the outputs of the three blocks to identify the faulty one. TMR masks both permanent and temporary faults of a single block.

III. NEW FAULT TOLERANT ARCHITECTURE WITH NEW VOTER

Fig. 2 shows the architecture of the fault-tolerant system proposed here, whose two main blocks are: 1) the enhanced lockstep scheme and 2) the fault-tolerant Configuration Engine. It relies on two Xilinx Virtex hardware primitives. The COMP_MUX consists of two blocks: 1) the VOTER, which indicates any mismatch between the outputs, and 2) the Multiplexer, which connects one of the processors to the system output, so that when one processor is reported to be faulty, the other is switched in. The switching is performed automatically and is executed in one clock cycle. Then, the three modules need to be synchronized to put the newly reconfigured one into the same state as the correct one, enabling them to continue executing the same task in lockstep again. The CRB has a memory shared by both processors to store their contexts. It uses on-chip BRAM in a dual-port configuration, which provides simultaneous access to two recovery buses, RB1 and RB2.
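The 2-out-of-3 voting and faulty-module identification described above for TMR can be illustrated with a minimal Python sketch. It is not the hardware voter of this paper: module outputs are modelled as integers treated as bit vectors, and the function name tmr_vote and the mismatch flag names are illustrative assumptions.

```python
from typing import Optional, Tuple


def tmr_vote(out1: int, out2: int, out3: int) -> Tuple[int, Optional[int]]:
    """Bitwise 2-out-of-3 majority vote over three module outputs.

    Returns (voted_output, faulty_module). faulty_module is 1, 2 or 3 when
    exactly one module disagrees with the other two, otherwise None.
    Names and the integer encoding are assumptions made for illustration.
    """
    # Each output bit takes the value held by at least two of the modules.
    voted = (out1 & out2) | (out1 & out3) | (out2 & out3)

    # Pairwise comparisons between the three module outputs.
    mismatch_12 = out1 != out2
    mismatch_13 = out1 != out3
    mismatch_23 = out2 != out3

    # A single faulty module disagrees with both of the others,
    # while the two healthy modules still agree with each other.
    if mismatch_12 and mismatch_13 and not mismatch_23:
        faulty = 1
    elif mismatch_12 and mismatch_23 and not mismatch_13:
        faulty = 2
    elif mismatch_13 and mismatch_23 and not mismatch_12:
        faulty = 3
    else:
        faulty = None  # no disagreement, or no single module can be blamed
    return voted, faulty
```

For example, tmr_vote(0b1110, 0b1010, 0b1010) masks the error in the first module and reports module 1 as faulty, which is the information a recovery step would need to decide which module to resynchronize.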
The recovery buses RB1 and RB2 are used to save and restore the processors' contexts to and from the BRAM, as well as to control the CRB during the context-switching process.

Fig. 2. Proposed scheme

IV. VALIDATION USING FAULT INJECTION

To validate the efficiency of the proposed fault-tolerant method, we have provided the Fault-Tolerant Configuration Engine with the possibility to carry out automatic fault-injection campaigns for the configuration memory of P1 and the COMP_MUX. The main goal is to obtain statistical data on the design complexity and on the percentages of sensitive bits and persistent errors. Once a fault is injected, the lockstepped pair of processors P1 and P2 executes an application that checks whether a peripheral of the soft core operates correctly. This application was chosen because it performs a large number of transactions over the PLB bus and thus offers the advantage that the COMP_MUX can verify the consistency of the outputs of P1 and P2 for frequently varying data. Although this application does not cover the complete instruction set, it is a valuable proof of concept for the proposed system.

The architecture of the proposed voter is depicted in Fig. 3. As shown in the figure, three comparators (C12, C13, and C23) are used to indicate any mismatch between the TMR modules. As an example, the TE23 signal is activated once a mismatch between Outputs II and III is detected. If one module generates an erroneous output (e.g., Output I), two of the comparators (here, C12 and C13) will activate their mismatch signals (here, TE12 and TE13), and only one comparator (here, C23) will not activate its mismatch signal (here, TE23). In the case of a faulty comparator (e.g., C13), only the corresponding signal (here, TE13) is activated, and the other signals (here, TE12 and TE23) remain deactivated.

Fig. 3. Proposed voter

This paper has proposed a low-power test pattern generation method that can be easily implemented in hardware. It has also developed a theory to express a sequence generated by linear sequential architectures and extracted a class of single-input-change (SIC) sequences named MSIC. Analysis results showed that an MSIC sequence has the favorable features of uniform distribution, low input transitions, and low dependency between the test length and the initial states of the TPG. With the proposed reconfigurable Johnson counter or scalable SIC counter, the MSIC-TPG can be implemented easily, and it is flexible for both test-per-clock and test-per-scan schemes.

V. TEST PATTERN GENERATION METHOD

Assume there are m primary inputs (PIs) and M scan chains in a full-scan design, and each scan chain has l scan cells. Fig. 4 shows the symbolic simulation for one generated pattern. The vector generated by an m-bit LFSR with a primitive polynomial can be expressed as $S(t) = S_0(t) S_1(t) S_2(t) \ldots S_{m-1}(t)$ (S is referred to as the seed), and the vector generated by an l-bit Johnson counter can be expressed as $J(t) = J_0(t) J_1(t) J_2(t) \ldots J_{l-1}(t)$. In the first clock cycle, $J = J_0 J_1 J_2 \ldots J_{l-1}$ is bit-XORed with $S = S_0 S_1 S_2 \ldots S_{M-1}$, and the results $X_1 X_{l+1} X_{2l+1} \ldots X_{(M-1)l+1}$ are shifted into the M scan chains, respectively. In the second clock cycle, $J = J_0 J_1 J_2 \ldots J_{l-1}$ is circularly shifted to $J = J_{l-1} J_0 J_1 \ldots J_{l-2}$, which is again bit-XORed with the seed $S = S_0 S_1 S_2 \ldots S_{M-1}$. The resulting $X_2 X_{l+2} X_{2l+2} \ldots X_{(M-1)l+2}$ are shifted into the M scan chains, respectively. After l clocks, each scan chain is fully loaded with a unique Johnson codeword, and the seed $S_0 S_1 S_2 \ldots S_{m-1}$ is applied to the m PIs.

Fig. 4. SIC vector generation

VI. EXPERIMENTAL RESULTS

The proposed enhanced TMR technique and tiling technique are simulated using Xilinx ISE 12.1i and implemented on a Virtex-5 FPGA. The experimental results are given in Table 1.

TABLE 1
EXPERIMENTAL RESULTS

Parameter                 Existing method   Proposed method
Number of Slice LUTs      17                16
Number of 4 input LUTs    31                28
Number of bonded IOBs     50                46
Time (ns)                 15.9              14.9

VII. PERFORMANCE ANALYSIS

The performance of our scheme and of the existing scheme is compared based on the area and time consumption given in Table 1. The analysis shows that the testing process using the SIC technique reduces both area and time compared with the existing method, as shown in Fig. 5.

Fig. 5. Area and time comparison (existing vs. proposed)

VIII. SIMULATION RESULTS

Fig. 6. Fault injection and verification

Fig. 7. Faulty unit identification

IX. CONCLUSION

In this paper, we presented a roll-forward technique, called SMERTMR, to recover from multiple errors in TMR systems. In the proposed technique, we employed the built-in design-for-testability resources available in digital circuits to recover faulty modules in the presence of multiple transient faults. To validate the proposed technique, we performed fault-injection experiments, generating test patterns with a Johnson counter. An analytical assessment also demonstrated that SMERTMR improves the reliability of TMR systems by several orders of magnitude compared to state-of-the-art techniques.

AUTHORS PROFILE

1. K. JansiRani, Post Graduate student, ME in VLSI Design at PGP College of Engineering and Technology, Namakkal.

2. P. Vinitha, Assistant Professor, Electronics and Communication Engineering Department at PGP College of Engineering and Technology, Namakkal, Tamilnadu, India.

3. Dr. G.K.D. Prasanna Venkatesan completed his Ph.D. at the College of Engineering, Anna University, Chennai, India. He is currently working as Vice-Principal, Professor and Head of the Department of Electronics and Communication Engineering at PGP College of Engineering and Technology, Namakkal, Tamil Nadu, India. His research interests include Wireless Sensor Networks, 4G Wireless Networks, Cloud Computing, Ad hoc Networks, etc.
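As an illustration of the scan-chain loading procedure of Section V, the following Python sketch models one test pattern being shifted into M scan chains over l clocks. It is a simplified software model under stated assumptions, not the hardware TPG: the seed is passed in as a list instead of being produced by an LFSR, the Johnson register is tapped at its last bit, new bits enter each chain at cell 0, and the function names johnson_next and load_scan_chains are hypothetical.

```python
from typing import List


def johnson_next(j: List[int]) -> List[int]:
    """Advance a Johnson (twisted-ring) counter by one step.

    The register shifts by one position and the inverted last bit is fed
    back into the first position, so consecutive codewords differ in one bit.
    """
    return [1 - j[-1]] + j[:-1]


def load_scan_chains(seed: List[int], j: List[int]) -> List[List[int]]:
    """Load one pattern into len(seed) scan chains of len(j) cells each.

    In every clock, one bit of the Johnson register is XORed with each seed
    bit S_i and shifted into chain i, then the register is circularly
    shifted. The tap position (last bit) and the shift-in end of the chain
    (cell 0) are assumptions made for this sketch.
    """
    reg = list(j)
    chains: List[List[int]] = [[] for _ in seed]
    for _ in range(len(j)):
        tap = reg[-1]  # assumed tap position
        for i, s in enumerate(seed):
            chains[i].insert(0, s ^ tap)  # new bit enters the chain at cell 0
        reg = [reg[-1]] + reg[:-1]  # circular shift: J_{l-1} J_0 ... J_{l-2}
    return chains
```

With seed [0, 1, 1] and Johnson vector [1, 1, 0, 0], chain 0 is loaded with the codeword itself and chains 1 and 2 with its complement. Advancing the counter with johnson_next before loading the next pattern changes exactly one bit in every chain, which is the single-input-change property the MSIC patterns rely on.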