Availability Analysis of Xilinx FPGA on Orbit

Nozomu Nishinaga, National Institute of Information and Communications Technology
Masayoshi Yoneda, NEC TOSHIBA Space Systems, Ltd.

Nishinaga No. 1 MAPLD2005

Outline
- Motivation
- Heavy ion test results of Virtex-II Pro
- Availability analysis
- Conclusion

Motivation
- Very high availability (that is, very low non-availability) is required for consumer communications equipment.
- A typical non-availability value for terrestrial network equipment is 10E-6.
- If an SEU can be treated as an accidental failure, and the failure can be fixed without any loss of the original device function, then the rebooting process can be treated as a repair.
- Does equipment with SRAM-type FPGAs meet the non-availability criterion?

Radiation test of Virtex-II Pro
- Devices: Virtex-II Pro (XC2VP7-5FG456 and XC2VP4)
- Tests carried out in November 2003 and February 2004 at TIARA in Takasaki, Japan
- Heavy ions (N, Ne, and Kr)
- Results compared with those of Virtex-II (Gary Swift, Candice Yui, and Carl Carmichael, "Single-Event Upset Susceptibility Testing of the Xilinx Virtex II FPGA," MAPLD2002, paper P29)

Devices Under Testing

[Figure: device block diagram with labels IOB/DCI, PowerPC 405 core, Rocket I/O, DCM, 18x18 Multipliers, 18k BlockSelect RAM, FF, JTAG, SelectMap, CLB, Config. Data]

                              XC2VP4        XC2VP7        XC2VP100
Configuration Memory          3.01 Mbit     4.49 Mbit     34.29 Mbit
DCM (Digital Clock Manager)   4 units       4 units       12 units
Block RAM                     28 units      44 units      7992 kbit
F/F                           6016 units    9856 units    9856 units
Multiplier                    28 units      44 units      44 units
Rocket I/O                    4 blocks      8 blocks      8 blocks

Radiation test result (1): Block RAM region

[Figure: SEU cross section [cm^2/bit] versus LET [MeV cm^2/mg], 0 to 70; series: XC2VP4, XC2VP7, Virtex-II]

Radiation test result (2): Configuration Memory region

[Figure: SEU cross section [cm^2/bit] versus LET [MeV cm^2/mg], 0 to 70; series: Virtex-II Pro (XC2VP4), Virtex-II Pro (XC2VP7), Virtex-II (iMPACT), Virtex-II (FIVIT); ions: N, Ne, Kr]

SEU frequency analysis (CREME96)

XC2VP4          Solar MAX         Flare Peak (1 week)
Conf. Memory    0.33 times/day    163.4 times/day
DCM             0.00 times/day    0.11 times/day
Block RAM       0.04 times/day    21.87 times/day
Multiplier      0.00 times/day    0.46 times/day

XC2VP7          Solar MAX         Flare Peak (1 week)
Conf. Memory    0.49 times/day    243.8 times/day
DCM             0.00 times/day    0.11 times/day
Block RAM       0.07 times/day    34.4 times/day
Multiplier      0.00 times/day    0.72 times/day

Mean Time Between Failures (MTBF) Analysis

All values in seconds. "Flare" means Flare Peak (1 week).

                XC2VP4                    XC2VP7                    XC2VP100 (Simulated)
                Solar MAX    Flare        Solar MAX    Flare        Solar MAX    Flare
Conf. Memory    2.64E+05     5.29E+02     1.77E+05     3.55E+02     2.32E+04     4.64E+01
DCM             4.14E+08     8.09E+05     4.14E+08     8.09E+05     1.38E+08     2.70E+05
Block RAM       2.02E+06     3.95E+03     1.28E+06     2.51E+03     1.27E+05     2.49E+02
Multipliers     7.89E+07     1.89E+05     5.02E+07     1.21E+05     4.98E+06     1.19E+04
SYSTEM          2.3267E+05   4.6495E+02   1.5501E+05   3.0972E+02   1.95E+04     3.90E+01

- If an SEU is considered a failure, the MTTR is roughly proportional to the device size.
- System MTBF -> harmonic combination of all functional blocks, i.e. 1 / sum(1 / MTBF_block), which reproduces the SYSTEM row.
- Assumption 1: All SEUs can be detected.
- Assumption 2: All gates are used.
- Assumption 3: All SEUs must be repaired as quickly as possible.

Mean Time To Repair (MTTR)

                            XC2VP4       XC2VP7       XC2VP100
Configuration data (bit)    3,006,560    4,485,472    34,292,832
MTTR (s) at 10 Mbyte/s      0.037582     0.056068     0.42866
MTTR (s) at 50 Mbyte/s      0.007516     0.011214     0.085732

REBOOT == Repair
- The effects of an SEU are volatile.
- Loading the correct configuration data returns the device to normal operation.
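The MTTR and SYSTEM MTBF figures tabulated above can be recomputed from the slide's own inputs. The following is a minimal sketch, not the authors' tool: it assumes MTTR is simply the configuration size divided by the load rate, and that block failure rates add (which matches the SYSTEM row to the printed precision). The function and variable names are hypothetical.

```python
"""Sketch: recompute MTTR and system MTBF from the tabulated values."""

SECONDS_PER_DAY = 86_400

# Configuration data sizes in bits, copied from the MTTR table.
CONFIG_BITS = {
    "XC2VP4": 3_006_560,
    "XC2VP7": 4_485_472,
    "XC2VP100": 34_292_832,
}

# Per-block MTBF in seconds for XC2VP4 at Solar MAX, copied from the MTBF table.
XC2VP4_SOLAR_MAX_MTBF = {
    "conf_memory": 2.64e5,
    "dcm": 4.14e8,
    "block_ram": 2.02e6,
    "multipliers": 7.89e7,
}


def mttr_seconds(config_bits: int, rate_bytes_per_s: float) -> float:
    """Assumed repair time: time to reload the whole configuration bitstream."""
    return config_bits / 8 / rate_bytes_per_s


def system_mtbf(block_mtbfs) -> float:
    """Assumed combination rule: failure rates (1 / MTBF) of all blocks add."""
    return 1.0 / sum(1.0 / m for m in block_mtbfs)


def upsets_per_day(mtbf_s: float) -> float:
    """Convert an MTBF in seconds into an upset frequency per day."""
    return SECONDS_PER_DAY / mtbf_s


if __name__ == "__main__":
    for device, bits in CONFIG_BITS.items():
        print(device, mttr_seconds(bits, 10e6), mttr_seconds(bits, 50e6))
    print("XC2VP4 system MTBF [s]:", system_mtbf(XC2VP4_SOLAR_MAX_MTBF.values()))
```

Because the configuration memory has by far the shortest MTBF of the four blocks, it dominates the system figure under this combination rule.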
- Rebooting time -> Repair time
- The maximum data rate for loading is fixed: 50 Mbyte/s for the XC2VP series.
- The larger the gate count or configuration size, the longer the MTTR.

Triple Modular Redundancy
- Case 1: Failure of one system out of three is acceptable.
  - Loose regulation
  - Acceptable when the MTBF is much larger than the MTTR
- Case 2: No failure is acceptable.
  - Tight configuration
  - The output is always guaranteed.

Non-Availability Analysis

\[ \mathrm{NonAvailability} = \frac{\mathrm{MTTR}}{\mathrm{MTBF} + \mathrm{MTTR}} \]

Case 1
              XC2VP4                   XC2VP7                   XC2VP100 (Simulated)
              Solar MAX   Flare Peak   Solar MAX   Flare Peak   Solar MAX   Flare Peak
10 Mbyte/s    7.83E-14    1.96E-08     3.93E-13    9.83E-08     1.45E-09    3.53E-04
50 Mbyte/s    3.13E-15    7.84E-10     1.57E-14    3.93E-09     5.79E-11    1.44E-05

Case 2
              XC2VP4                   XC2VP7                   XC2VP100 (Simulated)
              Solar MAX   Flare Peak   Solar MAX   Flare Peak   Solar MAX   Flare Peak
10 Mbyte/s    4.85E-07    2.42E-04     1.09E-06    5.43E-04     6.59E-05    3.23E-02
50 Mbyte/s    9.69E-08    4.85E-05     2.17E-07    1.09E-04     1.32E-05    6.57E-03

- MTBF is inversely proportional to the die area, while MTTR is proportional to it. -> A large FPGA is at a disadvantage.
- A large FPGA does not meet the 10e-6 criterion.
- How to mitigate? Divide the design across several small FPGAs.
- A much higher download rate will be needed (50 Mbyte/s is too slow).

Dividing
- The non-availability depends on the device size.
- A large FPGA is split up into several (d) small FPGAs.
- S_c -> configuration data size [bits]
- R -> configuration rate [bps]

\[ \mathrm{MTTR} = \frac{S_c}{R}, \qquad \mathrm{MTBF} = \frac{K}{S_c}, \qquad K = \mathrm{const.} \]

\[ NA = \frac{\mathrm{MTTR}}{\mathrm{MTTR} + \mathrm{MTBF}} = \frac{S_c / R}{S_c / R + K / S_c} = \frac{S_c^2}{S_c^2 + K R} \]

Sinusoidal

\[ NA_{div\_d} = \frac{S_c^2}{S_c^2 + d^2 K R} \]

Interstage VOTER
- The availability differs with and without the interstage voter.
- The performance with interstage voters is superior to that without them.

Non-Availability Analysis with dividing
- Area or gate loss due to the division is not taken into account in this figure. -> next issue
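The non-availability tables and the division formula above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' method: the slides do not spell out the TMR formulas, so the code assumes Case 1 means "at least two of three modules are down" and Case 2 means "any of the three modules is down". Those assumed forms reproduce the tabulated values to within rounding. All function names are hypothetical.

```python
"""Sketch: non-availability of one module, of a TMR system, and after dividing."""
import math


def non_availability(mttr_s: float, mtbf_s: float) -> float:
    """Single-module non-availability, MTTR / (MTBF + MTTR)."""
    return mttr_s / (mtbf_s + mttr_s)


def tmr_case1(na: float) -> float:
    """Assumed Case 1: the system is down when two or more of three modules are down."""
    return 3 * na**2 * (1 - na) + na**3


def tmr_case2(na: float) -> float:
    """Assumed Case 2: the system is down when any of three modules is down.

    Equal to 1 - (1 - na)**3, written with log1p/expm1 to keep precision
    when na is very small.
    """
    return -math.expm1(3 * math.log1p(-na))


def na_divided(s_c: float, k: float, r: float, d: int) -> float:
    """Per-device non-availability after splitting into d devices.

    s_c: configuration size [bits], r: configuration rate [bits/s],
    k: constant such that MTBF = k / s_c (depends on the radiation environment).
    """
    return s_c**2 / (s_c**2 + d**2 * k * r)


if __name__ == "__main__":
    # XC2VP4, Solar MAX, 10 Mbyte/s: MTTR and SYSTEM MTBF taken from the tables.
    na = non_availability(0.037582, 2.3267e5)
    print("single module:", na)
    print("Case 1:", tmr_case1(na))
    print("Case 2:", tmr_case2(na))
```

When the non-availability is small, the `d**2` term in the denominator means that halving the device size cuts the per-device non-availability by roughly a factor of four, before any area or gate loss from the division is considered.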

Conclusion
- A non-availability analysis was carried out for Virtex-II Pro.
- Large-scale FPGAs do not meet the non-availability criterion for communication equipment (10e-6).
- A much faster or wider configuration interface is needed to improve availability.