LAT CDR RFA #28 Response

Action Requested:

Describe the process for performing worst case circuit analyses for electronic circuits to show that the electronics can perform over their full temperature range over the life of the mission. Provide the results of these analyses, if you have performed them. If you have not performed WCCAs, then perform them. I suggest using NASA JPL preferred reliability practice PD-ED-1212 as a guideline.

Supporting Rationale:

This has not been described, although WCCAs may have been performed.

Response:

WORST CASE CIRCUIT ANALYSIS RESULTS

Gamma-ray Large Area Space Telescope (GLAST) Large Area Telescope (LAT) personnel performed a Worst Case Circuit Analysis (WCCA). The analysis was divided into three sections: Part Stress Analysis, Timing Analysis, and Failure Mode Effects Analysis. The WCCA uses the NASA JPL preferred reliability practice PD-ED-1212.

This guideline summarizes its implementation method as: "Derive part parameter variations for the environments and life of a specific mission and combine them with the initial tolerances of the parts as procured to produce a worst case part variation database for each mission or project. Apply classical circuit analysis techniques, and determine if each circuit and each assembly meets its specified performance attributes over the most extreme but realizable combinations of part variation sources."
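As a hedged illustration of the "most extreme but realizable combinations" idea in the summary quoted above, the sketch below runs an extreme value analysis on a simple unloaded voltage divider. The circuit, nominal values, tolerances, and specification limits are all hypothetical and are not taken from any LAT analysis. Evaluating only the tolerance corners finds the true extremes only when the output is monotonic in each part value, which holds for this divider.

```python
from itertools import product


def divider_out(v_in, r_top, r_bottom):
    """Output voltage of an unloaded resistive divider."""
    return v_in * r_bottom / (r_top + r_bottom)


def worst_case_range(nominals, tolerances, func):
    """Evaluate func at every low/high corner of the part values.

    Each tolerance is a total fractional variation (initial tolerance
    plus temperature and life drift) applied symmetrically to its nominal.
    """
    corners = [(n * (1 - t), n * (1 + t)) for n, t in zip(nominals, tolerances)]
    outputs = [func(*combo) for combo in product(*corners)]
    return min(outputs), max(outputs)


# Hypothetical values: 5 V supply, two 10 kOhm resistors.
nominals = (5.0, 10_000.0, 10_000.0)
# Hypothetical total variation per part over environment and mission life.
total_tolerances = (0.02, 0.03, 0.03)

lo, hi = worst_case_range(nominals, total_tolerances, divider_out)

# Hypothetical performance requirement for the divider output, in volts.
spec_lo, spec_hi = 2.35, 2.65
meets_spec = spec_lo <= lo and hi <= spec_hi

print(f"worst-case output range: {lo:.4f} V to {hi:.4f} V, meets spec: {meets_spec}")
```

For circuits whose output is not monotonic in every parameter, a corner sweep like this can miss the true extremes, and a sensitivity or Monte Carlo analysis would be the more appropriate tool.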

The WCCA yielded no concerns for flight success.

I. Part Stress Analysis

Part Stress Analysis was performed on all LAT subsystems, either by GSFC (for the ACD) or SLAC (for the remainder). The reports are collected in a single database and they are listed in the part stress analysis column of the Worst Case Circuit Analysis Summary of Reports (Table 1).

GSFC Code 300 contractors examined these part stress analyses. A small number of items requiring further analysis were identified. These items are summarized in the reports listed in the Part Stress Analysis Additional Notes at the bottom of Table 1.

These items were discrepancies such as an incorrect derating factor or an applied value that exceeds the derated value. The majority of these items have been resolved and deemed to be of no concern. The remaining items are presently being analyzed. Any issues that arise will be brought to the attention of GLAST Project Management, the GLAST Systems Assurance Manager, and the RFA originator.
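The kind of discrepancy described above, an applied value exceeding the derated value, can be sketched as a simple check. This is a minimal illustration only: the reference designators, ratings, derating factors, and applied values below are hypothetical and do not come from the LAT derating spreadsheets or from any specific derating guideline.

```python
from dataclasses import dataclass


@dataclass
class PartStress:
    ref_des: str            # hypothetical reference designator
    parameter: str          # stressed parameter and its unit
    rated: float            # manufacturer's rated value
    derating_factor: float  # allowed fraction of the rated value
    applied: float          # worst-case applied value in the circuit

    @property
    def derated_limit(self) -> float:
        """Maximum allowed applied value after derating."""
        return self.rated * self.derating_factor

    @property
    def passes(self) -> bool:
        """True when the applied stress is within the derated limit."""
        return self.applied <= self.derated_limit


# Hypothetical entries for illustration only.
parts = [
    PartStress("C1", "voltage (V)", rated=50.0, derating_factor=0.6, applied=28.0),
    PartStress("R12", "power (W)", rated=0.125, derating_factor=0.6, applied=0.1),
]

flagged = [p.ref_des for p in parts if not p.passes]

for p in parts:
    status = "OK" if p.passes else "EXCEEDS DERATED VALUE"
    print(f"{p.ref_des} {p.parameter}: applied {p.applied}, limit {p.derated_limit:.3f} -> {status}")
```

An incorrect derating factor, the other discrepancy type mentioned, would show up in such a table as a wrong `derating_factor` entry and can only be caught by comparing each factor against the governing derating requirement.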

II. Timing Analysis

Timing analysis is usually performed via simulation to determine whether timing specifications can be met on a circuit board and from board to board. However, none of the ASICs underwent timing analysis prior to fabrication. There was no simulation by the chip designers to verify that timing would be met over the mission's temperature and voltage ranges. Instead, the flight ASICs underwent Qualification Testing following fabrication, consisting of Initial Tests, Life Cycle testing, Highly Accelerated Stress Testing, and Temperature Cycle testing. The downside of this approach is that the tests cannot recreate all of the conditions and combinations that simulation would have covered. The upside is that a subset of the conditions and combinations is tested on the actual flight hardware rather than in simulation.

Because the ASICs are integral to the LAT and did not undergo simulation, a timing analysis of the LAT would have yielded highly incomplete results.

One timing analysis was performed at SLAC, on the Tower Electronics Module (TEM). That timing analysis looked at the key TEM internal and external interfaces.

The cognizant LAT engineers whose subsystems interface with the TEM verified that the interface timing is adequate and that their subsystems will operate within those constraints.

In lieu of the ability to perform useful simulation timing analysis, we followed the recommendation of JPL's Preferred Reliability Practice number PT-TE-1431 by performing Voltage and Temperature Margin Testing (VTMT) on the LAT instrument subsystems and on the LAT as a whole. JPL's Preferred Reliability Practice number PT-TE-1431, titled "Voltage & Temperature Margin Testing", states: "Voltage and Temperature Margin Testing (VTMT) is the practice of exceeding the expected flight limits of voltage, temperature, and frequency to simulate the worst case functional performance, including effects of radiation and operating life parameter variations on component parts. For programs subject to severe cost or schedule constraints, VTMT has proven an acceptable alternative to conventional techniques such as worst case analysis (WCA). WCA is the preferred approach to design reliability, but VTMT is a viable alternative for flight projects where trade-offs of risk versus development time and cost are appropriate…The VTMT must be designed to achieve adequate variation of circuit parameters through a judicious choice of voltage, temperature, and frequency margin combinations to achieve an optimal, yet realistic check of the design margin."

All subsystems undergo thermal/vacuum testing which spans the temperature and input voltage ranges thereby performing the needed VTMT. Additionally, the entire LAT instrument will be undergoing thermal/vacuum testing which will test the electronics over the temperature and spacecraft interface voltage ranges.
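To make the notion of voltage and temperature margin combinations concrete, the sketch below enumerates a small test matrix. It is an assumption-laden illustration, not the LAT test plan: the flight ranges and margins are hypothetical numbers, the helper `margin_points` is invented for this example, and the frequency axis mentioned in PT-TE-1431 is left out for brevity.

```python
from itertools import product


def margin_points(low, high, margin):
    """Return three test levels: below the flight low, mid-range, above the flight high."""
    return (low - margin, (low + high) / 2, high + margin)


# Hypothetical expected flight ranges and margins (not LAT values).
voltages = margin_points(27.0, 29.0, margin=1.0)        # input voltage, volts
temperatures = margin_points(-10.0, 40.0, margin=10.0)  # temperature, degrees C

# Every voltage level is paired with every temperature level.
test_matrix = list(product(voltages, temperatures))

for volts, temp_c in test_matrix:
    print(f"run functional test at {volts:.1f} V, {temp_c:.1f} C")
```

A full factorial matrix grows quickly as axes are added, which is one reason the practice calls for a judicious choice of combinations rather than exhaustive coverage.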

The reports that document the tests involved in this section are listed in the timing analysis column of the Worst Case Circuit Analysis Summary of Reports (Table 1). Where "T/V test results not yet available" is listed, the T/V test must be completed successfully before the subsystem is integrated into the LAT. Where "et. al." is listed, only one of the usually 16 report numbers is given for the multiple instances of the electronics in the LAT.

III. Failure Mode Effects Analysis

LAT personnel performed a Failure Mode Effects Analysis (FMEA) in an effort separate from the WCCA effort. As indicated in the failure mode effects analysis column of the Worst Case Circuit Analysis Summary of Reports (Table 1), the FMEA results are collected as LAT-TD-00374-D1 and the items that were considered can be found in the LAT DAQ FMEA Worksheets5_20.xls.

The FMEA provides an assessment of the hardware configuration of the LAT. A “bottom-up” examination of each LAT component was performed in order to identify potential failures and their effects at the local, instrument, and overall science mission levels. Specific attention was given to identifying any Single Point Failures (SPFs) that could cause failure of the GLAST science mission, and to recommending corrective actions or methods to alleviate their occurrence and/or impact.

This qualitative report will help answer these questions as each component of the LAT is analyzed:

1. How can the component fail? (There may be more than one mode of failure.)

2. What are the effects of the failure?

3. How critical are the effects?

4. How is the failure detected?

5. What are the safeguards against significant failures?

A Critical Items List (CIL) is also maintained as part of this activity. The CIL provides a summary of selected hardware items whose related failure modes can result in serious injury, loss of life (flight or ground personnel), loss of the launch vehicle, or the loss of one or more mission objectives (when no redundancy exists) as defined by the GSFC project office.

At present, one hardware item, the Micrometeoroid Shield (MMS), is an SPF item that remains on the Critical Items List (CIL). The MMS has a severity classification of 2 (i.e., a critical failure mode that could result in loss of one or more mission objectives). Its failure mode would be light leakage/damage to two or more tiles. Its mission effect would be an inability to achieve the Diffuse Background Rejection Objective as defined in Table 2.3.1-1, Item 15, of the GLAST Science Requirements.

Although the Micrometeoroid Shield is classified as a critical item, a supplemental rationale document provided by Steve Ritz/LAT Scientist (see Appendix B of PRA Report, LAT-TD-02510-02, dated 02/02/04) describes how the ACD and LAT are still capable of meeting science requirements (although efficiency may fall slightly below performance requirements) in the event a single tile, or localized pair, fails. The effect is a localized inefficiency that would not severely impact the real-time operation of the LAT.

Of the remaining 752 failure conditions that were assessed within the LAT instrument as part of the FMEA process, the current severity code rankings are summarized as follows:

* 106 failure conditions are of Category 2R (Critical), where the failure mode is of identical or equivalent redundant hardware items that could result in Category 2 effects if all failed. (Category 2 is a failure mode that could result in loss of one or more mission objectives as defined by the GSFC project office.)

* 606 failure conditions are of Category 2MR (Critical), where the failure mode is of more than one of many identical or equivalent redundant hardware items that could result in Category 2 effects if a significant fraction of the identical systems fail. (Category 2 is a failure mode that could result in loss of one or more mission objectives as defined by the GSFC project office.)

* 28 failure conditions are of Category 3 (Significant), where the failure mode is one that could cause degradation to mission objectives.

* 12 failure conditions are of Category 4 (Minor), where the failure mode results in insignificant or no loss to mission objectives.

Failure conditions with a Category 2R, 2MR, 3, or 4 classification are not catastrophic (Category 1) and are sufficiently mitigated via design or have an acceptable effect on mission objectives.
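The category counts listed above can be cross-checked against the stated total of 752 with a short tally. The counts are taken from the list above; the variable names are illustrative only.

```python
# Failure-condition counts per severity category, as listed in the FMEA summary.
severity_counts = {"2R": 106, "2MR": 606, "3": 28, "4": 12}

total = sum(severity_counts.values())
shares = {category: count / total for category, count in severity_counts.items()}

for category, count in severity_counts.items():
    print(f"Category {category}: {count} conditions ({shares[category]:.1%} of total)")

print(f"Total failure conditions: {total}")
```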

TABLE 1 -- Worst Case Circuit Analysis Summary of Reports

Columns: LAT subsystem; unit; unit acronym; board; board acronym; part stress analysis; timing analysis; failure mode effects analysis.

Failure mode effects analysis: Results in LAT-TD-00374-D1 and individual items analyzed are found in LAT DAQ FMEA Worksheets5_20.xls.

LAT subsystem: Anti-Coincidence Detector (ACD)

* Unit: Base Electronics Assembly (BEA)
* Board: FRont End Electronics (FREE). Part stress analysis: ACD_FREE_Parts_Derating_06-06-03A.xls
* Board: High Voltage Bias Supply (HVBS). Part stress analysis: Derating_Sheet_LAT_ACD_HVBS11_5-30-03.xls
* Timing analysis: VTMT on entire ACD. GAFE & GARC completed ASIC qualification testing. T/V test results not yet available.

LAT subsystem: Tracker (TKR)

* Board: TKR Front-End Electronics board (MCM). Part stress analysis: LAT_TRACKER_Parts_stress.xls
* Timing analysis: VTMT on entire MCM. GTFE & GTRC completed ASIC qualification testing. T/V test results LAT-TD-05495 et. al.

LAT subsystem: Calorimeter (CAL)

* Board: CAL Front-End Electronics (AFEE). Part stress analysis: AFEE_Board_Stress_Analysis_Rev_A1.xls
* Timing analysis: VTMT on entire MCM. GCRC & GCFE qual complete. T/V test results LAT-PS-05129-01 et. al.

LAT subsystem: Data Acquisition (DAQ) electronics

* Unit: Tower Electronics Module (TEM). Part stress analysis: LAT-TD-01782-02.pdf
* Unit: Tower Power Supply Module (TPS). Part stress analysis: LAT-TD-04516-02.pdf
* Timing analysis (TEM/TPS): LAT-TD-01785-02.pdf for a timing analysis. VTMT on entire TEM/TPS. T/V test results LAT-TD-03631 et. al.
* Unit: Global-Trigger / ACD-EM / Signal-Distribution Unit (GASU). Part stress analysis: LAT-TD-01823-01.pdf
* Unit: GASU Power Supply (GPS). Part stress analysis: LAT-TD-02655-02.pdf
* Timing analysis (GASU/GPS): VTMT on entire GASU/GPS. GLTC, GTCC & GCCC completed ASIC qualification testing. T/V test results not yet available.
* Unit: Spacecraft Interface Unit (SIU)
    * Board: Storage Interface Board (SIB): EEPROM storage; Spacecraft MIL1553 interface; Power control or PDU/GASU power switches in PDU; Power control or VCHP switches in heater box. Part stress analysis: LAT-TD-01818-52.pdf
    * Board: LAT Communication Board (LCB): Commanding of GASU; Read-back, housekeeping & event data from GASU. Part stress analysis: LAT-TD-01819-01.pdf
    * Board: Processor Board (CPU). Part stress analysis: 750_Reliability_Assessment_for_GLAST_(12.1).doc, GLAST_Radiation_and_Reliabiity.ppt
    * Board: Power Supply Board (PSB): 28V to 3.3V/5V conversion; LVDS-CMOS conversion of SC discretes; System clock to GASU. Part stress analysis: LAT-TD-01817-01.pdf
    * Timing analysis: SFC-Test-Procedures_RevD_2.4_A.pdf for the Processor Board. VTMT on entire SIU. T/V test results not yet available.
* Unit: Event Processor Unit (EPU)
    * Boards: SIB, LCB, CPU, PSB. Same component boards as SIU, except that it does not perform: SIB's spacecraft MIL1553 interface; SIB's power control or PDU/GASU power switches in PDU; SIB's power control or VCHP switches in heater box; PSB's LVDS-CMOS conversion of SC discretes.
    * Part stress analysis: same as SIU
    * Timing analysis: SFC-Test-Procedures_RevD_2.4_A.pdf for the Processor Board. VTMT on entire EPU. T/V test results not yet available.
* Unit: Power Distribution Unit (PDU). Part stress analysis: LAT-TD-01809-03a_released.pdf. Timing analysis: VTMT on entire PDU. T/V test results not yet available.
* Unit: Heater Control Box (HCB). Part stress analysis: LAT-TD-04576-01.pdf. Timing analysis: VTMT on entire HCB. T/V test results not yet available.

Part Stress Analysis additional notes: Tony DiVenti had contractors Al Kogan and Paula Simon audit SLAC's part stress analysis. Their analysis is contained in derating_09-13-04.doc, GLAST_LAT_Parts_Derating-TPS_rev_A.doc, and GLAST_RAD750_Parts_Derating.doc.

Timing Analysis additional notes: Where "T/V test results not yet available" is listed, the T/V test must be completed successfully before the subsystem is integrated into the LAT. Where "et. al." is listed, only one of the usually 16 report numbers is given for the multiple instances of the electronics in the LAT.
