No Fault Found, Retest-OK, Cannot Duplicate or Fault Not Found? –
Towards a standardised Taxonomy
5th November 2012
Dr Samir Khan
Dr Paul Phillips
Mr Chris Hockley
Prof Ian Jennions
Agenda
• Overview of the NFF problem
– Cause and impact of NFF
– Tackling NFF
• The Multitude of Terms
– Impact of inconsistent terminology
• Maintenance culture
• Concluding Remarks
The NFF problem
[Figure: diagnostic outcomes. Labels: System; Strategy; Fault; No fault; Diagnostic success; Diagnostic failure]
NFF Definition
• Removals of equipment from service for
reasons that cannot be verified by the
maintenance process (shop or elsewhere).
(Source: ARINC Report 672 (2008), Guidelines for the Reduction of No Fault Found
(NFF), Avionics Maintenance Conference, Aeronautical Radio Inc.)
To Achieve Diagnostic Success?
• Implies identification of the root cause.
• That enables the correct maintenance activity
to be performed.
• It suggests a closed-loop system that can
relate the symptom to the fault to the correct
maintenance solution.
Some NFF history
1993: BA concern at the cost of removals where
- nothing was found wrong
- the same fault re-occurred
13.8% of all unscheduled removals were NFF
Cost: £17.6M per year
80.4% of all NFF were avionics
26.6% of all avionics removals were NFF
(Source: BA presentation, ERA Conference 1996)
BA estimates costs at £20M per year
(Source: Blischke WR & Murthy (2003), pub. John Wiley & Sons)
1996: Boeing states a 40% rate of incorrect parts removal from
the airframe
(Source: R Knotts, MPhil thesis, Exeter)
Avionics constitutes 75% of NFF occurrences in aerospace
(Source: Aviation Week, Feb 9 2007)
Common Stages of NFF
[Figure labels: Stage 1; Stage 2; Contractors]
• AC: avionics 74%, pneumatic 19%
• In aerospace, 50% of LRUs are classified NFF
• Satellite industry: 50% increase
• Surveys in electronic equipment: NFF 19% to 53%
• F16 radar: 67%
• 85% of all operational faults in aircraft electronics
• Train control system: 50% of faults
• AMC 2004 voted NFF ‘the most important issue’
Some NFF events seem to be independent of life cycle stage
Results from Past Experiences and Studies
NFF costs can be generated through the following
activities:
• Removals for the wrong reasons
• Workshops failing to find and then repair the reported
fault
• Inability to simulate the conditions in which the fault
occurred
Cost of removals due to No-Fault-Found can be huge
where:
• Nothing has been found at fault
• The same fault re-occurs in the next or a subsequent
mission, risking safety and/or mission success
Reasons for NFF
• Not enough emphasis on diagnostics training
• Pressure on quantity in workshops, not quality
• No emphasis on history of components
• No tracking of “rogue units”
[Figure: Causes of NFF. Labels: reliability issues; loose conn./intermittent; corrosion; system design; poor understanding; poor maintenance; organisational setup; maintenance team; management team; designer team; integration; BIT; BIT complexity]
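One of the reasons listed above is that “rogue units” are not tracked. As a minimal sketch of what such tracking could look like, the snippet below counts NFF findings per unit and flags any unit that reaches a chosen count. The serial numbers, the finding labels and the threshold of three are illustrative assumptions, not values from the studies cited in this presentation.

```python
from collections import Counter

# Hypothetical removal history: (unit serial number, shop finding).
# Serials, findings and the threshold are illustrative assumptions.
removals = [
    ("SN-001", "NFF"),
    ("SN-002", "fault confirmed"),
    ("SN-001", "NFF"),
    ("SN-003", "NFF"),
    ("SN-001", "NFF"),
]


def find_rogue_units(history, threshold=3):
    """Return the serials with at least `threshold` NFF findings, sorted."""
    nff_counts = Counter(serial for serial, finding in history if finding == "NFF")
    return sorted(serial for serial, count in nff_counts.items() if count >= threshold)


print(find_rogue_units(removals))  # ['SN-001']
```

A unit flagged this way is only a candidate for closer investigation; a suitable threshold would depend on the fleet and the component type.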
Why is Tackling NFF difficult?
• The solutions to NFF problems may not be possible
because of:
– Organisational culture
– Procedures and rules
– Technical inefficiencies
– Workforce behaviour
[Figure: contributors to NFF. Labels: inconsistent fault reporting; design related hard/software def.; insufficient time available for maintenance; poor on board maint. system; insufficient training of line and shop personnel; inadequate design for testability; performance measure; poor test coverage]
Technical Causes of NFF
• Undefined or limited performance measures
• BIT levels set too low
• BITE ability to detect a fault
• BITE inability to apply reliable fault diagnostics
• Lack of information on operating environment
• Inability to reproduce operating environment during test or diagnosis
• Intermittent failure caused by stress not replicated on test
• Design inadequate for robust testing of all faults
• Inadequate fault models and fault trees for determining root causes
• Lack of understanding of interactions between different integrated
systems and software
• Reluctance to adopt new technologies (health monitoring) due to the
need to alter system designs or because the data handling/decision
making infrastructures are not available
Organisational
• Time pressures on maintenance operations
• Organisational cultures with no cross-functionality, employee
empowerment or encouragement to identify root causes
• Inadequate training or lack of training tools
• No commitment to sharing information and knowledge between
designers, manufacturers, service providers and operators
• Solutions are likely to be disruptive to normal working practices, which leads to
reluctance to change
Behavioural and Procedural
• Discrepancies and faults in test procedures
• Incorrect fault reporting
• Wrong processes applied
• Incomplete documentation
• Lack of communication between maintenance personnel and
other experts.
Mitigation of NFF
• Management policy; coherent teams
• Tests should be certified by the supplier
• Diagnostic tests should be added before commissioning
• Modelling of intermittent faults, e.g. detect probability of fault
• Transfer of information
• Recognise fault alarms
• Design of fault tolerant systems; add redundancy
• Design methodology, e.g. include in-built health monitoring
• Address root cause of BIT deficiencies
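The modelling of intermittent faults mentioned above can be illustrated with a very simple calculation. This is a sketch under a strong assumption that is not taken from the presentation: the fault is active during any single test with a fixed probability, independently of the other tests. The function name and parameters are hypothetical.

```python
def detection_probability(p_active: float, n_tests: int) -> float:
    """Probability that at least one of n independent tests sees the fault.

    Assumes (for illustration only) that the intermittent fault is active
    during each test with the same probability p_active, independently.
    """
    if not 0.0 <= p_active <= 1.0:
        raise ValueError("p_active must be between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # The fault is missed only if it is inactive in every one of the n tests.
    return 1.0 - (1.0 - p_active) ** n_tests


# A fault active 10% of the time: one test versus ten repeated tests.
print(round(detection_probability(0.1, 1), 3))   # 0.1
print(round(detection_probability(0.1, 10), 3))  # 0.651
```

Under this assumption a single bench test misses such a fault nine times out of ten, which is one way of reading a "no fault found" result as a diagnostic failure and not as proof of a healthy unit.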
The Problem
Within this field there are several identified problems:
• A lack of common understanding on what constitutes the
phenomena resulting in Diagnostic Failure
• Wide concept and lack of commonality in processes
• Several existing standards deal with aspects of the
phenomena but may be overly specific
• Existing standards are incomplete and fail to fully define the
problem in a way which has been accepted by industry.
• A lack of interaction between the aspects, processes, activities
and stakeholders who suffer the impacts
A Multitude of Terms
• Missing published academic literature
• Early calls made into testability attributes of electronic equipment,
specifically to mitigate NFF – but this has not yet been achieved across
all test/maintenance levels.
• Cultural impacts
• Copernicus Technology Ltd survey
Q1) How can a true gauge of the problem be
investigated if there is no standardised term
used in the maintenance history?
Q2) Are all of these terms accurate – do they
actually describe the same event, or are there
subtle differences which need to be
recognised?
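Q1 can be made concrete with a small sketch: before an NFF rate can be computed from maintenance history, the different terms have to be mapped onto one label. The mapping below covers only the terms named in the title of this presentation, and treating them all as the same event is an assumption made purely for illustration, which is exactly what Q2 calls into question. The record values and function names are hypothetical.

```python
from collections import Counter

# Illustrative mapping only. Whether these terms really describe the
# same event is the open question (Q2), so this table is an assumption.
CANONICAL = {
    "no fault found": "NFF",
    "nff": "NFF",
    "retest ok": "NFF",
    "retest-ok": "NFF",
    "cannot duplicate": "NFF",
    "fault not found": "NFF",
}


def normalise(term: str) -> str:
    """Map a free-text finding to a canonical label, or 'OTHER'."""
    key = " ".join(term.lower().split())  # fold case and extra whitespace
    return CANONICAL.get(key, "OTHER")


def nff_rate(findings) -> float:
    """Fraction of findings that normalise to the 'NFF' label."""
    if not findings:
        return 0.0
    counts = Counter(normalise(f) for f in findings)
    return counts["NFF"] / len(findings)


# Hypothetical maintenance history using inconsistent terminology.
history = [
    "Retest OK",
    "cannot duplicate",
    "Connector replaced",
    "NFF",
    "Fault Not Found",
    "PSU failed",
]
print(nff_rate(history))  # 4 of the 6 records count as NFF
```

Searching the same history for the literal string "NFF" would find only one of the four records, which shows how inconsistent terminology can hide the scale of the problem.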
Do we need standards?
• Adopting standards helps industries (and research) overcome technical
barriers by promoting organisational success through better workflow
paradigms and maintenance strategies
• Most of the existing descriptions of these phenomena do not consider
where exactly these diagnostic failures occur within
the maintenance process
“No Fault Found is a reported failure that
cannot be confirmed, recognised or localised
during diagnosis and therefore cannot be
repaired” (Roke, 2009)
“The inability to replicate field failure
during repair shop test/diagnosis”
(Kirkland, 2011)
“A failure that may have occurred but
cannot be verified, or replicated at will,
or attributed to a specific root cause,
failure site, or failure mode” (Qi et al,
2008).
“Removals of equipment from service for
reasons that cannot be verified by the
maintenance process (shop or elsewhere)”
(ARINC Report 672, 2008)
Maintenance culture
Is a single term enough?
Most definitions lead the reader to believe that a failure during operation
(such as an intermittent fault) is the actual ‘No Fault Found’ event, which in
turn leads most of the academic literature to classify NFF into three
distinct categories:
• intermittent failures
• integration faults
• Built In Test Equipment (BITE) failures
However, this practice may be incorrect: these are primarily the root
causes, which begin a sequence of events through the various levels of
maintenance. They blend with other factors, such as organisational,
behavioural, cultural and technical abilities (the drivers towards diagnostic
failure), to result in the final outcome of NFF.
[Figure: from root causes to a maintenance action. Labels: root causes (sources); influencing factors that lead to a diagnostic failure (drivers); leads to a maintenance action; location of the fault; cables; chassis (LRU); connectors; components; intermittency; integration faults; BIT; false alarm; design defects; poor design; operator error; lack of communication; test procedures; test coverage; wrong process; incorrect reporting; insufficient training]
1) There is no evidence of faults during testing so the unit can be certified as
serviceable as there probably never was a fault
2) There has been no evidence of faults during testing but the test coverage may
be inadequate so it is best to replace the unit just to be on the safe side
3) A repeating NFF issue is not a single event but a sequence
4) Fault Not Found instead of No Fault Found?
“A reported fault for which the root
cause cannot be found – in other
words a diagnostic failure” (Cockram
and Huby, 2009)
Concluding Remarks
Research Scope
• Lack of standards, with different terms describing largely the same problem
• Human factors are one of the core drivers towards diagnostic failure
• To identify procedural, process and behavioural issues that need to be changed,
learning from best practice in each industry.
• To devise strategies, methodologies and system design rules to mitigate the
occurrence of intermittent failure mechanisms and to demonstrate their
effectiveness in reducing the likelihood of NFF occurrences.
• To develop a multi-disciplinary approach at the System level for the effective
analysis of the root causes of NFF in order to assist design activity across domains.
Key Challenges
• Providing solutions which do not have a negative impact on current operations –
i.e. increased downtime means reduced availability
• Specific training
• Development and adoption of appropriate Standards
• Diagnostic procedures which encourage identification of root causes
• Cultural changes
• Working with design to identify the necessary improvements to mitigate NFF
Contact info
Samir: Samir.khan@cranfield.ac.uk
Paul: p.phillips@cranfield.ac.uk