IEEE 37 on Reliability Prediction - Society of Reliability Engineers

14 May 2002 Meeting Minutes
SRE Space Systems Reliability Tools Standards Working Group
The 4th meeting of the Space Systems Reliability Tools Standards Working
Group was held on Tuesday, May 14, 2002, from 8:30 AM to 11:30 AM PDT. The
meeting consisted of two separate teleconferences. One teleconference was mediated
at The Aerospace Corporation in El Segundo, CA, and the other teleconference was
mediated at DSI International, in Orange, CA. The meeting agenda is on page 3.
The objective of the SSRT Standards WG is to develop a commercial standard
that provides a single framework for linking different reliability assessment tools. This
framework shall be built by defining critical process addresses [1] and standard formats
for all data elements used in appropriate identification, analysis, and verification of
Reliability, Maintainability, and Availability (RMA) requirements for space systems. In
the context of this standard, “appropriate identification, analysis, and verification…”
means there would be negligible risk of adverse effects from using the results. The title
of the standard shall be, “Standard Format for Space System Reliability Computer
Applications,” and its scheduled completion date is 30 September 2002.
The WG is organized into two teams. Team 1 is tasked with defining the data
elements and their critical process addresses. The members of Team 1 are all
Reliability Engineering experts and their lead is Tyrone Jackson. Team 2 is tasked with
defining the standard formats for the data elements. The members of Team 2 are all
reliability tool developers and their lead is Dan Hartop.
Participants at the May 14th meeting were:
NAME                 COMPANY                  PHONE          E-MAIL
Steve Harbater       TRW                      858-592-3490   Steve.Harbater@trw.com
Dan Hartop (2)       DSI Intl.                714-637-9325   dhartop@dsiintl.com
Jim Sketoe           Boeing                   253-773-2891   James.E.Sketoe@boeing.com
Al Jackson           CSULB Eng Grad College   310-493-7469   jacksona@simanima.com
Tyrone Jackson (1)   Aerospace Corp.          310-336-6170   Tyrone.Jackson@aero.org
Xuegao (David)       SoHar Inc.               323-653-4717   Xuegao@sohar.com
Walt Willing         Northrop Grumman         410-765-7372   walter_e_willing@md.northgrum.com

(1) Meeting coordinator and Team 1 lead
(2) Team 2 lead
[1] The critical process addresses may be defined using machine-readable alphanumeric symbols or human-readable Extensible Markup Language (XML) keywords.
The following individuals are on regular distribution for the SSRT Standards WG
Meeting minutes:
NAME
Mike Canga
J C Cantrell
Terry Kinney
Robert Poltz
Kamran Nouri
James Womack
John Ingram-Cotton
Dave Dylis
Eric Gould
Jim Kallis
Bill Geimer
Leo F. Watkins
Marios Savva
Adamantios Mettas
Doug Ogden
Rich Pugh
Ken Murphy
Chuck Anderson
Myron Hecht
Rebecca Menes
Bob Miller
Kevin P. Van Fleet
Hunter Shaw
Clarence Meese
COMPANY
NASA JSC
Aerospace Corp.
Spectrum Astro
Design Analytx
Item Software
Aerospace Corp.
Aerospace Corp.
RAC
DSI Intl.
Raytheon
Northrop Grumman
Lockheed Martin
ReliaSoft
ReliaSoft
ReliaSoft
Pratt & Whitney
ARINC
GRC International
SoHar Inc.
SoHar Inc.
TRW
Relex Software
Relex Software
SRE
PHONE
281-483-5395
310-336-2899
719-550-0325
877-327-7550
714-935-2900
310-336-7647
310-336-1249
315-339-7055
714-637-9325
310-647-3620
626-812-2783
817-935-4452
520-886-0410
520-886-0366 Ext. 29
520-886-0366 Ext. 41
505-248-0640
281-483-4087
323-653-4717X111
323-653-4717X101
310-812-2840
724-836-8800 x105
724-836-8800
E-MAIL
michael.a.canga1@jsc.nasa.gov
John.C.Cantrell@Aero.org
Terry.Kinney@specastro.com
getreliability@designanalytx.com
kamran@itemsoft.com
James.M.Womack@aero.org
John.Ingram-Cotton@aero.org
DDylis@IITRI.ORG
egould@dsiintl.com
jmkallis@west.raytheon.com
William.Geimer@northropgrumman.com
Leo.F.Watkins@LMCO.com
Marios.Savva@reliasoft.com
Adamantios.Mettas@ReliaSoft.com
Doug.Ogden@ReliaSoft.com
pugh@pwfl.com
KMURPHY@arinc.com
charlton.r.anderson1@jsc.nasa.gov
Myron@sohar.com
Becky@sohar.com
Robert.Miller@trw.com
kevin.vanfleet@relexsoftware.com
Hunter.Shaw@relexsoftware.com
cmeese@nyx.net
May 14th Meeting Agenda

SRE Working Group Administrative Topics
  8:30 - 8:40 PDT     Take roll
                      Vote to approve the minutes of the April 30th meeting
                      Remind participants to pay their SRE membership dues

Team 1 Discussion Topics
  8:40 - 9:00 PDT     Discuss Status of Action Items from the April 30th Meeting – Tyrone Jackson
  9:00 - 9:20 PDT     Discuss the Electrical Stress Derating Analysis Flow Diagrams – Steve Harbater
  9:20 - 9:50 PDT     Discuss the Reliability Prediction Process Flow Diagram for the Preliminary Design Phase – Jim Sketoe
  9:50 - 10:00 PDT    Break
  10:00 - 10:45 PDT   Discuss the First-Cut Standard Formats for Reliability Data – Tyrone Jackson

Team 2 Discussion Topics
  8:30 - 9:50 PDT     Discuss Status of Action Items from the April 30th Meeting – Dan Hartop
  9:50 - 10:00 PDT    Break
  10:00 - 10:45 PDT   Begin Developing a Draft Outline for the Standard, which is titled, “Standard Format for Space System Reliability Computer Applications” – Dan Hartop

Team 1 & 2 Summary
  10:45 - 11:30 PDT   Summary and Review of Action Items – All
  11:30 PDT           Meeting Adjourns

Team 1 Discussion Topics

Team 1 participants in the May 14th meeting were:

Tyrone Jackson (Team Lead)
Steve Harbater
Walt Willing
Jim Sketoe

The group did not meet the minimum number of participants required for a Team
1 quorum and decided to postpone the vote on approval of the April 30th meeting
minutes until the next scheduled meeting on May 28th.

The group agreed that Visio 2000 diagrams should be converted to Visio 5
format before distribution to the working group for review.

The group reviewed Steve’s Electrical Stress Derating Process Flow Diagram
and accompanying write-up. Steve mentioned that, to save money, the secondary
parameters are sometimes left out of the stress derating analysis.
Tyrone volunteered to develop draft definitions for some of the electrical stress
derating parameters. He plans to use the Fortran source code of an old
MIL-HDBK-217 program to build a list of component-specific derated parameters.

The group reviewed Jim’s Reliability Prediction Process Diagram for the
Preliminary Design Phase. The group agreed that unit level and component
level trade studies are often performed during the Preliminary Design Phase.
Therefore, the use of reliability data to support trade studies should be added to
the Reliability Prediction Process Diagram. Jim will modify the diagram.

The group discussed the widespread trend away from piece-part FMECA. Walt
said that, at a minimum, FMECA should be performed to identify the effects of
failures at the interfaces of a Line Replaceable Unit (LRU). He added that
identifying the internal failure modes of an existing LRU would not be an
efficient use of an analyst’s time, whereas identifying the internal failure
modes of a new or modified LRU would be. The group agreed with Walt.

The group agreed that FMECA should be used to validate the Reliability Block
Diagram (RBD), and both the FMECA and RBD should begin at the same level
of indenture.

The group agreed on the following concepts:
o In an ideal world, where tools are available to apply all reliability methods
with equal effort to all items, the preferred order of reliability methods
would be:
1. Field data
2. Test data
3. Physics of failure (PoF) equations if they were derived from applicable
test data
4. Handbook reliability prediction equations if they were derived from
applicable field data
o The MTBF calculation for COTS should be based on either field data or
test data.
o In the real world (at least for now), handbook reliability prediction methods
are the most cost-effective choice for MTBF calculations because:
   - Sufficient field and test data are not available for all items in modern
     space systems.
   - A key goal of the Responsible Design Engineer (RDE) should be
     to eliminate all wearout mechanisms that can affect mission
     success. Therefore, PoF would not be necessary if this goal is
     met.
   - Cost-effective PoF tools are not available.
o Some of the problems associated with handbook reliability prediction
methods include:
   - Use of proprietary parameters
   - Failure rate equations that were not derived from field data
   - Unknown confidence bounds for calculated failure rates
   - Assumed exponential (constant) failure rates for all items
   - Lack of a comprehensive set of hazard rate equations for
     non-electronic parts
   - Lack of a comprehensive set of non-operating failure rate
     equations for electronic parts
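
The ideal-world preference order and the COTS rule agreed above can be sketched in Python. This is only an illustration and not part of the working group's output: the method labels, the function name select_mtbf_source, and the MTBF figures are hypothetical. It also assumes the caller lists a PoF or handbook estimate only when that estimate was derived from applicable test or field data, as the group required.

```python
from typing import Dict, Optional, Tuple

# Ideal-world preference order agreed by the group (labels are hypothetical).
IDEAL_ORDER = ("field_data", "test_data", "physics_of_failure", "handbook")

# For COTS items the group limited the MTBF basis to field data or test data.
COTS_ALLOWED = ("field_data", "test_data")


def select_mtbf_source(
    available: Dict[str, float], is_cots: bool = False
) -> Optional[Tuple[str, float]]:
    """Return (method, mtbf_hours) for the most preferred available method.

    `available` maps a method label to an MTBF estimate in hours.
    Returns None when no acceptable estimate exists.
    """
    order = COTS_ALLOWED if is_cots else IDEAL_ORDER
    for method in order:
        if method in available:
            return method, available[method]
    return None


# Hypothetical estimates for one item: test data outranks the handbook value.
estimates = {"handbook": 52000.0, "test_data": 41000.0}
print(select_mtbf_source(estimates))
# A COTS item with only a handbook estimate has no acceptable MTBF basis.
print(select_mtbf_source({"handbook": 52000.0}, is_cots=True))
```

In the real-world case described above, the dictionary would often contain only a handbook estimate, which is why that method is chosen in practice.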
Tyrone discussed an example for a standard reliability data format that he
derived from the old B1 and B2 sheets in MIL-STD-1388-A. The example
consists of predefined keywords that have origination points identified on critical
process flow diagrams. The points on the diagrams serve as data addresses.
To allow consistent identification of the data by different reliability assessment
tools, the keywords are arranged in an indentured configuration that is based on
data dependency. Take for example, a spacecraft Mean Mission Duration
(MMD) prediction. Its standard electronic data interchange format might look
something like this:
RELIABILITY
PREDICTION
MMD
RWEIBULL (Rayleigh-Truncated Weibull)
SCALE = 60.0
SHAPE = 1.75
BWEAROUT (Begin Wearout) = 36
MWEAROUT (Mean Wearout) = 48
CONFIDENCE = 0.5
UNITS = MONTHS
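
As a rough illustration of how a tool might read such an indentured keyword format, the Python sketch below parses it into nested dictionaries. Several things here are assumptions rather than part of Tyrone's example: the two-space indentation per level, the exact nesting of the keywords, the treatment of parenthesized text as a description to be discarded, and the function name parse_indentured.

```python
import re

# Assumed layout: two spaces of indentation per indenture level.
SAMPLE = """\
RELIABILITY
  PREDICTION
    MMD
      RWEIBULL (Rayleigh-Truncated Weibull)
        SCALE = 60.0
        SHAPE = 1.75
        BWEAROUT (Begin Wearout) = 36
        MWEAROUT (Mean Wearout) = 48
        CONFIDENCE = 0.5
        UNITS = MONTHS
"""


def parse_indentured(text, indent=2):
    """Parse indented KEYWORD / KEYWORD = value lines into nested dicts."""
    root = {}
    stack = [(-1, root)]  # (depth, dict that holds entries one level deeper)
    for raw in text.splitlines():
        if not raw.strip():
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        # Drop human-readable descriptions such as "(Begin Wearout)".
        line = re.sub(r"\s*\([^)]*\)", "", raw.strip())
        # Climb back up until the top of the stack is this line's parent.
        while stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1]
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            try:
                parent[key] = float(value)
            except ValueError:
                parent[key] = value  # non-numeric values stay as text
        else:
            node = {}
            parent[line] = node
            stack.append((depth, node))
    return root


data = parse_indentured(SAMPLE)
mmd = data["RELIABILITY"]["PREDICTION"]["MMD"]["RWEIBULL"]
print(mmd["SCALE"], mmd["SHAPE"], mmd["UNITS"])
```

The chain of keys used to reach a value plays the role of the data address described above: two tools that agree on the keywords and their nesting can locate the same data element without sharing any other code.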
Team 2 Discussion Topics

Team 2 participants in the May 14th meeting were:



Dan Hartop (Team Lead)
David Xuegao
Al Jackson

The group met the minimum number of participants required for a Team 2
quorum.

The following tasks have been completed:
o Created, Updated and Reviewed Initial Schema
o Documented Updated Schema Considerations for review by Team 2
o Discussed potential Interoperability paths and approach

As a side note, DSI will ultimately create an XSL style sheet (an XML
document that describes how to transform other XML documents) for converting a
Fault Tree XML (FTML ?) document into an Excel XML Spreadsheet (supported
by Excel 2002). This will be done over the next few months as DSI's
availability permits. Therefore, we will commit to an Action Item with no
definite date other than completion by September 2002.
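
To give a feel for this kind of conversion, the sketch below flattens a fault tree XML document into spreadsheet rows. It is not DSI's style sheet: it swaps XSL for Python's standard library, writes CSV rather than the Excel XML Spreadsheet format, and uses made-up element and attribute names (faultTree, gate, event, id, type, probability), because the Team 2 schema is not reproduced in these minutes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical fault tree document; the real schema is not shown here.
FAULT_TREE_XML = """\
<faultTree name="LossOfPower">
  <gate id="G1" type="OR">
    <event id="E1" probability="0.01"/>
    <gate id="G2" type="AND">
      <event id="E2" probability="0.1"/>
      <event id="E3" probability="0.2"/>
    </gate>
  </gate>
</faultTree>
"""

FIELDS = ["id", "kind", "type", "probability", "parent"]


def flatten(node, parent_id="", rows=None):
    """Walk the tree depth-first and emit one row per gate or event."""
    rows = [] if rows is None else rows
    for child in node:
        rows.append({
            "id": child.get("id"),
            "kind": child.tag,
            "type": child.get("type", ""),
            "probability": child.get("probability", ""),
            "parent": parent_id,
        })
        flatten(child, child.get("id"), rows)
    return rows


def to_csv(xml_text):
    """Return the flattened fault tree as CSV text."""
    rows = flatten(ET.fromstring(xml_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


print(to_csv(FAULT_TREE_XML))
```

Keeping the parent identifier in each row preserves the gate hierarchy, so the tree can be rebuilt from the flat spreadsheet if needed.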

Team 2 Future Agenda
o Team 2 - Complete review of Gate Types, ensure consistent parsing
for existing tools
o Team 2 - Define interoperability paths for Fault Tree and other
Schemas
o Team 1 - Provide input to Team 2 regarding current schema
Action Items
1. Team 1 Action Items –
a. All – Review the updated Fault Tree Schema that Team 2 constructed.
Specifically, check for correctness, completeness, and compliance with
the stated objective of the standard (see page 1).
b. Jim – Update the diagram for the Reliability Prediction Process during the
Preliminary Design Phase. Specifically, add references to Reliability Trade
Studies and FMECA.
c. Tyrone and Steve – Tackle the Team 2 action item to begin developing a
draft outline for the standard, which is titled, “Standard Format for Space
System Reliability Computer Applications”.
d. Tyrone – Construct a flow diagram for Similarity Analysis that shows how
individual reliability assessment tasks might be integrated at the Reliability
Program level.
e. Tyrone – Develop draft definitions for some of the more typical electrical
stress derating parameters.
f. Tyrone – Write a draft guide and construct Reliability Analysis Process
Flow Diagrams for the Detailed Design Phase.
2. Team 2 Action Items –
a. All - Review the updated Fault Tree Schema. Specifically, check for
correctness, completeness, and compliance with the stated purpose of
the standard (see page 1).
b. SOHAR - Define interoperability (inputs and outputs to existing tools).
c. SOHAR - Complete review for completeness of Gate Types.
d. John - Review and update schema documentation.
e. All - Review Team 1 documentation & findings.
Next Meeting
The next SSRT Standards WG Meeting is scheduled for May 28, 2002, at 8:30 AM
PDT. Team 1 and Team 2 will hold separate teleconferences from 8:30 AM to 10:45
AM PDT. At 10:45 AM PDT, Team 1 will join the Team 2 teleconference to discuss
progress and actions. The following teleconference numbers are to be used:

Team 1 teleconference number - (888) 550-5969, pass code 646354

Team 2 teleconference number - (888) 550-5969, pass code 162080
Arrangements have been made for Team 1 to use NetMeeting concurrently during
the teleconference. For those that prefer face-to-face discussions, meeting rooms have
been reserved at the following locations:

Team 1 meeting room - The Aerospace Corporation, Building D-8, 200 N.
Aviation Boulevard, El Segundo, CA 90245-4691

Team 2 meeting room - DSI International, 1574 N. Batavia, Suite 3, Orange, CA
92867
Planned Future Meetings
Location: The Aerospace Corporation, Building D-8, 200 N. Aviation
Boulevard, El Segundo, CA 90245-4691

Date (2002)   Format
5/28          Teleconference
6/11          Teleconference
6/25          Teleconference
7/16          Teleconference
7/30          Teleconference
8/13          Teleconference
8/27          Teleconference
9/10          Teleconference
9/24          Teleconference

Please direct all comments regarding these meeting minutes to:
Tyrone Jackson
SSRT Standards Working Group Coordinator
Reliability & Statistics Office
The Aerospace Corporation
Ph. (310) 336-6170
Fax (310) 336-5365
Email: Tyrone.Jackson@aero.org
Top-10 problems that affect the Reliability Programs of Space
Systems as determined by an internal working group survey:
1. Valuable reliability lessons learned often are not in a format that is readily useable by
the Reliability Program, or they have become “lessons forgotten” or “lessons ignored”.
2. Some reliability critical items often are not identified at all or are not properly
controlled.
3. System reliability predictions often do not include probability of occurrence estimates
for all relevant failure modes, failure mechanisms, and failure causes. (Probability of an
induced fault during manufacture, or probability of damage during assembly often is not
included in reliability predictions.)
4. The perceived accuracy of high-precision system reliability predictions often is not
supported by the input data, which is of lower precision than the result.
5. The steadily shrinking pool of “experienced” Reliability Engineering specialists is
unable to meet the needs of a steadily growing number of space system development
projects.
6. Many commercial reliability assessment tools have major shortcomings that may not
be obvious to the casual reliability analyst (e.g., inaccurate equipment failure rate
models, use of unverifiable parameters in equations, high misapplication rates, etc.).
7. Often, insufficient funding is provided to perform all of the tasks necessary for a
High-Reliability Program. (Some customers and managers believe that high reliability can be
tested-in more cost-effectively than it can be designed-in.)
8. Different approaches are being used across the space industry to perform reliability
assessment tasks that are called by the same name, but which often serve different
purposes. (Inconsistency in reliability assessment practices has become a major
problem since DoD canceled military standards in the late 90’s.)
9. Some customers believe that all dependability predictions for space vehicle
constellations are too conservative. (This belief is rooted in historical
evidence showing that contingency procedures of ground operations are very effective for
extending the useful life of a space vehicle far beyond its predicted mean life. This
phenomenon has resulted in many customers buying more space vehicles than
necessary to meet the dependability requirements of the constellation.)
10. Sometimes the reliability analyst cannot take advantage of (or is unaware of) some
of the critical data paths that link a particular task of the Reliability Program with:
a. Other tasks within the Reliability Program;
b. Systems Engineering Process functions outside the Reliability Program; or
c. External product-related data sources.