Dynamic Prognoser Architecture via the Path Classification and Estimation (PACE) Model

Dr. Dustin R. Garvey (dgarvey@expmicrosys.com)
Expert Microsystems, Inc.
Dr. J. Wesley Hines (jhines2@utk.edu)
University of Tennessee, Knoxville
Nuclear Engineering Department
thereby learning “real world” degradation patterns.
This paper addresses this problem by using expert
knowledge to initialize a path classification and
estimation (PACE) prognoser and then uses collected
examples of oil drill failure to incrementally
supplement its initial knowledge with example cases.
Abstract
Most modern prognostic algorithms are founded on
a simple abstraction of device degradation: for an
individual device, there exists a degradation signal
that progresses along a unique path until it crosses a
critical failure threshold. While this abstraction has
been shown to be valid for well understood failure
modes under controlled stress conditions, its
viability in "real world" devices being exposed to
"real world" stresses is questionable. Because the
complexity of degradation should scale in a similar
fashion as the complexity of the device, the
applicability of a simple abstraction of degradation
is increasingly arguable for modern devices. This
paper will propose an alternative to the current
abstraction of degradation, which is founded on the
premise that degradation data should be allowed to
speak for itself. In this way, many different forms
of information can be incorporated into a
prognoser's estimate of a device's remaining useful
life (RUL). More specifically, this paper will
outline a methodology for implementing a dynamic
prognoser that can be incrementally trained to learn
general (physical model output, expert opinion,
etc.) and specific ("real world" data) degradation
trends. This work will demonstrate the viability of
the proposed method by applying a particular
embodiment, namely the path classification and
estimation (PACE) model, to data collected from a
deep-well oil exploration drill. To begin, expert
opinion will be used to develop a PACE prognoser.
Next, data collected from individual drills will be
used to incrementally train the prognoser to learn
specific degradation trends.
2 PACE Model
The general path model (GPM) (Lu & Meeker 1993)
is founded on the concept that a degradation signal
collected from an individual device will follow a
general path until it reaches an associated failure
threshold. Since its introduction, the thought model
proposed in the GPM has been prolifically adopted
by modern researchers and has resulted in a plethora
of techniques that can be related to the GPM in one
way or another (Lu & Meeker 1993, Upadhyaya et al.
1994, Mishra et al. 2004, Yan et al. 2004, Xu & Zhao
2005, Liao et al. 2006).
From this cursory
description, it can be seen that there are two
fundamental assumptions of the GPM and its modern
counterparts: 1) there exists a path for the
degradation signal that can be parameterized via
regression, machine learning, etc. and 2) there exists
a failure threshold for the degradation signal that
accurately predicts when a device will fail. Given
modern computational capacity, the first assumption
is easily satisfied, in that many methods exist for
parameterizing simple (polynomial regression, power
regression, etc.) and complex (fuzzy inference
systems, neural networks, etc.) relationships from
data. The assumption of the existence of a threshold
that accurately predicts device failure is not so easily
reconciled. While the existence of a failure threshold
has been shown to be valid for well understood
degradation processes (e.g. seeded crack growth) and
controlled testing environments (e.g. constant load or
uniform cycling), Liao et al. (2006) observe that for
“real world” applications, where the failure modes
are not always well understood or can be too complex
to be quantified by a single threshold, the failure
boundary is “gray” at best. Wang and Coit (2007)
attempt to address this problem by integrating
1 Introduction
With the increased interest in the development of
usable prognosis systems, there is an increasing focus
on the details that can help develop an accurate and
adaptable system. One such issue is the need for an
adaptable system that would be able to use one set of
information to initialize the prognoser and then
continuously use collected data to update the system,
uncertainty into the estimate of the threshold, but in
the end the authors replace an estimate of the
threshold with another, more conservative estimate.
For this work, instead of saying that a device has
failed if its degradation signal exceeds its threshold,
the approach implemented by Liao et al. (2006) will
be adopted, where the data is allowed to speak for
itself. In other words, for this work a device is
interpreted as having failed if it in fact fails. Before
continuing, it is important to note that if the failure
thresholds are well established, the data can be
formatted such that the instant where the signal
crosses the threshold is interpreted as a failure event.
As its name suggests, the PACE model is
fundamentally composed of two operations: 1)
classify a current degradation path as belonging to
one or more previously collected exemplar
degradation paths and 2) use the resulting
memberships to estimate the RUL. Hence, the name
path classification (classify path according to
exemplar paths) and estimation (estimate the RUL
from the results of the classification). At this point,
the PACE is described in more detail by considering
a hypothetical example.
To begin, consider the example degradation
signals presented in Figure 1. The degradation
signals and their associated
failure times are presented in the top plot. Here, the
failure times can be set to be either the time that the
device fails or the time at which an expert determines
that the device performance has sufficiently degraded
such that it has effectively failed. For this example, it
can be seen that there is not a clear failure threshold
for the degradation signal. Notice that the paths are
generalized by fitting an arbitrary function to the data
via regression, machine learning, etc. There are two
useful pieces of information that can be extracted
from the degradation paths, specifically the failure
times and the “shape” of the degradation that is
described by the functional approximations. These
pieces of information can be used to construct a
vector of exemplar failure times and functional
approximations, as follows:
\[ \mathbf{T} = \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{bmatrix} \]
Figure 1 Example degradation signals and their
associated functional approximations
To illustrate the PACE, suppose that the degradation
signal of another, similar device is being monitored
and that an estimate of the RUL of the individual device is
needed at an arbitrary time t*. Such a case is
presented in Figure 2, where the degradation signal is
plotted in BLACK. The query observation of the
degradation signal at time t* is written as u(t*). To
estimate the RUL of the device via the PACE model,
the algorithm presented in Figure 3 is used. The
general process for estimating the RUL can be seen
to be composed of three steps. First, the expected
degradation signal values according to the exemplar
degradation paths are estimated by evaluating the
regressed functions at t*. At the same time, the
expected RULs are calculated by subtracting the
current time t* from the observed failure times of the
exemplar paths. Second, the observed degradation
signal u(t*) is then classified according to the vector
of expected degradation signal values U(t*). The new
degradation path is assigned a membership value for
each of the historical paths which characterizes its
similarity to that exemplar. Third, the vector of
memberships of the observed degradation value to
the exemplar degradation paths is combined with the
vector of expected RULs to estimate the RUL of the
individual device. The details of the PACE model
\[ \mathbf{f}(t, \Theta) = \begin{bmatrix} f_1(t, \theta_1) \\ f_2(t, \theta_2) \\ f_3(t, \theta_3) \\ f_4(t, \theta_4) \end{bmatrix} \]
Here, Ti and fi(t,θi) are the failure time and
functional approximation of the ith exemplar
degradation signal path, θi are the parameters of the
functional approximation of the ith exemplar
degradation signal path, and Θ is the set of all
parameters of the functional approximations.
context, the above vector can be rewritten as
follows:
can now be described in the context of the present
example.
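\[ \mathbf{U}(t^*) = \begin{bmatrix} f_1(t^*, \theta_1) \\ f_2(t^*, \theta_2) \\ f_3(t^*, \theta_3) \\ f_4(t^*, \theta_4) \end{bmatrix} = \begin{bmatrix} U_1(t^*) \\ U_2(t^*) \\ U_3(t^*) \\ U_4(t^*) \end{bmatrix} \]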
At the same time, the current time t* is used with the
vector of failure times to calculate the expected
RULs according to the exemplar degradation paths.
\[ \mathbf{L}(t^*) = \mathbf{T} - t^* = \begin{bmatrix} T_1 - t^* \\ T_2 - t^* \\ T_3 - t^* \\ T_4 - t^* \end{bmatrix} \]
Figure 2 Illustration of an observation of a
device’s degradation signal at time t* relative to
the functional approximations of the exemplar
devices
Now the currently observed degradation signal value
u(t*) can be compared to the expected degradation
signal values U(t*) by any one of a number of
classification algorithms to obtain a vector of
memberships μU[u(t*)]. Here, the memberships have
values on [0,1] and μ_{U_i}[u(t*)] denotes the
membership of u(t*) to the ith exemplar path.
\[ \boldsymbol{\mu}_U[u(t^*)] = \begin{bmatrix} \mu_{U_1}[u(t^*)] \\ \mu_{U_2}[u(t^*)] \\ \mu_{U_3}[u(t^*)] \\ \mu_{U_4}[u(t^*)] \end{bmatrix} \]
Finally, the above memberships and the expected
RULs are combined in some way to estimate the
current RUL of the individual device, such as a
simple weighted average.
Figure 3 Process diagram of the PACE model for
estimating the RUL
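The three steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text leaves the classification algorithm open, so a Gaussian kernel is assumed here for the memberships, and the function name pace_rul, the bandwidth parameter, and the two linear exemplar paths are hypothetical and do not come from the original work.

```python
import math


def pace_rul(t_star, u_star, paths, failure_times, bandwidth=1.0):
    """Estimate the RUL at time t_star from one observation u_star.

    paths         -- list of callables f_i(t), the exemplar approximations
    failure_times -- list of exemplar failure times T_i
    bandwidth     -- Gaussian kernel width (an assumed tuning parameter)
    """
    # Step 1: expected signal values U(t*) and expected RULs L(t*) = T - t*.
    expected = [f(t_star) for f in paths]
    ruls = [T - t_star for T in failure_times]

    # Step 2: memberships on [0, 1]; a Gaussian kernel is assumed here.
    mu = [math.exp(-((u_star - U) ** 2) / (2.0 * bandwidth ** 2))
          for U in expected]

    # Step 3: combine memberships and expected RULs by a weighted average.
    total = sum(mu)
    if total == 0.0:
        # All memberships underflowed: fall back to the closest exemplar.
        nearest = min(range(len(expected)),
                      key=lambda i: abs(u_star - expected[i]))
        return ruls[nearest]
    return sum(m * r for m, r in zip(mu, ruls)) / total


# Hypothetical exemplars: two linear degradation paths and failure times.
paths = [lambda t: 0.01 * t, lambda t: 0.02 * t]
failure_times = [100.0, 50.0]

# An observation lying on the second path is pulled toward that path's RUL.
print(pace_rul(20.0, 0.4, paths, failure_times, bandwidth=0.05))
```

With a narrow bandwidth the estimate is dominated by the closest exemplar, while a wide bandwidth approaches a plain average of the expected RULs, so the kernel width would need to be chosen for the application at hand.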
3 Dynamic Prognoser by Example
Now that the PACE has been described, let’s take a
look at how it can be used to implement a dynamic
prognoser. More specifically, how can the PACE be
structured to use expert opinion prior to data
collection, and how can the prognoser itself be
updated as degradation data is obtained? To begin,
let’s use the causal prognoser outlined in Garvey
(2007) and Garvey & Hines (2007), where the
cumulative vibration energy is used to infer the RUL
of an oil drill’s steering system. Let’s begin by
considering the relationship of vibration to
degradation by examining the signal itself.
First, the current time t* is used to estimate
the expected values of the degradation signal and
RULs according to the exemplar paths. In equation
form, the expected values of the degradation signal
according to the exemplar paths are simply the
approximating functions evaluated at the current time
t*.
\[ \mathbf{f}(t^*, \Theta) = \begin{bmatrix} f_1(t^*, \theta_1) \\ f_2(t^*, \theta_2) \\ f_3(t^*, \theta_3) \\ f_4(t^*, \theta_4) \end{bmatrix} \]
3.1 Vibration and Degradation
The function evaluations can be interpreted as
exemplars of the degradation signal at time t*. In this
Before we can get into the details about how the
PACE can be constructed to use expert opinion, we
need to take a look at how vibration is related to
steering system degradation. To do this, consider the
chart of the lateral vibration in G’s (1 G = 9.81
m/s²) presented in Figure 4. Also included in the
plot are the vibration level that separates normal and
moderate stresses (ORANGE) and the vibration level
that separates moderate and severe stresses (RED).
For this particular steering system, a failure occurs
just prior to the end of the file, after 160 elapsed
hours.
Notice that while the majority of the
observations are in the normal stress range, there are
many peaks that lie in the moderate and severe range.
In terms of degradation, this means that each
additional severe vibration event increases the
degradation of the steering system, until it
eventually fails.
we have a physical model, we could simulate the
degradation to obtain the curves. For this work, we’ll
use expert knowledge to construct the original
prognoser and then augment this knowledge with
example degradations as they become available. To
get started, consider the following rules of thumb that
could possibly be obtained from engineers and
operators:
1) For an environment with no moderate or
severe events, the steering unit can be
expected to survive 200 hours.
2) For an environment with 50 moderate or
severe events, the steering unit can be
expected to survive 100 hours.
3) For an environment with 200 moderate
or severe events, the steering unit can be
expected to survive 0 hours.
In order to use these rules in the PACE, we need a
vector of functional forms of the degradation and their
corresponding lifetimes. What we get are the
following vectors:
\[ \mathbf{f}(t, \Theta) = \begin{bmatrix} 0 \\ 50 \\ 200 \end{bmatrix}, \qquad \mathbf{T} = \begin{bmatrix} 200 \\ 100 \\ 0 \end{bmatrix} \]
Notice that the functional forms of the degradation
are constants, meaning that regardless of the current
time, they evaluate to 0, 50, and 200 respectively.
It’s important to note at this point that had we had a
physical model of the degradation, we could use the
parameterized physical form of the equations instead
of constants. The constants are only used to enable
easy communication of the underlying principle.
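As a minimal sketch of how the three rules of thumb could be encoded, the Python below stores each rule as a constant function of time paired with a lifetime. The names EXPERT_RULES, constant_path, expected_values, and expected_ruls are hypothetical and chosen only for illustration.

```python
# Expert rules of thumb from the text: (event count, expected lifetime in hours).
EXPERT_RULES = [(0, 200.0), (50, 100.0), (200, 0.0)]


def constant_path(level):
    """Return a degradation 'path' f(t) that ignores t and yields level."""
    return lambda t: float(level)


# Exemplar functional forms f(t, Theta) and exemplar lifetimes T.
paths = [constant_path(count) for count, _ in EXPERT_RULES]
failure_times = [life for _, life in EXPERT_RULES]


def expected_values(t_star):
    """Evaluate every exemplar path at the current time t_star."""
    return [f(t_star) for f in paths]


def expected_ruls(t_star):
    """Expected RULs according to the exemplars: T - t_star."""
    return [T - t_star for T in failure_times]


# The constant forms evaluate to the same values regardless of the time.
print(expected_values(10.0))
print(expected_values(75.0))
print(expected_ruls(10.0))
```

Note that an exemplar's expected RUL becomes negative once the current time exceeds its lifetime. How such values should be treated (for example, clipping at zero) is not stated in the text and is left open here.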
Figure 4 Example vibration signal (GRAY) with
normal-to-moderate (ORANGE) and moderate-to-severe (RED) stress levels
3.3 Updating the Prognoser with Examples
Let’s now suppose that we’ve had a steering unit
failure and obtained the vibration signal presented
earlier in Figure 4. In this case we’ve counted 646
moderate or severe vibration events and the
operational time before failure was 49 hours. Notice
we’re using the operational time and not the elapsed
time since we want to know how the degradation
affects the health while it is being used. Now that we
have this example, we can include it in the prognosis
model by simply appending the count and lifetime to
the previous vectors to obtain the following exemplar
vectors.
For the sake of simplicity, we’re going to
use a simple counting procedure to infer degradation.
More specifically, we’re going to count the number
of severe vibration events (i.e. vibration above 5 G)
and then relate this to the RUL.
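A simple counting procedure of this kind might look like the sketch below. The 5 G level comes from the text, but the definition of an "event" as one upward crossing of the threshold (so that consecutive samples above the level count once) is an assumption, as are the function name count_events and the sample values.

```python
def count_events(signal, threshold=5.0):
    """Count excursions of the vibration signal above threshold (in G).

    Assumption: a run of consecutive samples above the threshold is one
    event, so only upward crossings are counted.
    """
    count = 0
    above = False
    for g in signal:
        if g > threshold and not above:
            count += 1  # the signal has just risen above the threshold
        above = g > threshold
    return count


# Hypothetical lateral vibration samples in G.
vibration = [1.0, 6.0, 7.0, 2.0, 5.5, 3.0, 8.0]
print(count_events(vibration))
```

Counting individual samples above the level instead of crossings would be an equally plausible reading; the choice changes the scale of the counts but not the structure of the prognoser.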
3.2 Initial Prognoser from Expert Knowledge
As is the case for many engineering products, often
no data is available for a new design, but it is still
desired to have some means for inferring the RUL of
individual devices for the imposed environmental
stresses. Since the PACE doesn’t use a fixed
architecture, but relies on the general concept of
having a function form of the degradation and an
associated failure time for examples, we can train an
initial prognoser on almost anything. For example, if
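\[ \mathbf{f}(t, \Theta) = \begin{bmatrix} 0 \\ 50 \\ 200 \\ 646 \end{bmatrix}, \qquad \mathbf{T} = \begin{bmatrix} 200 \\ 100 \\ 0 \\ 49 \end{bmatrix} \]
At this point, the initial expert opinion has been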
supplemented with actual degradation data. In this
way, we can continuously update the prognoser
memory as additional data is made available.
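The update step amounts to appending one entry to each exemplar vector. The sketch below illustrates this with a small container; the class name ExemplarMemory and its method add are hypothetical and are not taken from the original implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ExemplarMemory:
    """Holds the exemplar degradation values and their lifetimes."""
    levels: list = field(default_factory=list)     # event counts
    lifetimes: list = field(default_factory=list)  # hours to failure

    def add(self, level, lifetime):
        """Append one exemplar (expert rule or observed failure)."""
        self.levels.append(float(level))
        self.lifetimes.append(float(lifetime))


memory = ExemplarMemory()

# Initialize with the three expert rules of thumb.
for count, life in [(0, 200), (50, 100), (200, 0)]:
    memory.add(count, life)

# Supplement with the observed failure: 646 events, 49 operational hours.
memory.add(646, 49)

print(memory.levels)
print(memory.lifetimes)
```

Because nothing is refit when an exemplar is added, the cost of an update is constant, which is what makes continuous updating practical in this scheme.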
It is important to note that the method
implemented in this discussion is rather simple, in
that we’re using a single number to quantify
degradation and not a “path”. The reason for the
simplicity was to enable effective communication of
the concept. If we were implementing a more
complex system, where we had functional forms of
the degradation derived from expert opinion or
simulation, we would simply append the functional
form of the degradation to the initial set as data is
made available. Furthermore, since the PACE is
purely data driven, if additional stress signals are
identified, they can easily be integrated by simply
appending another column to the matrix of functional
approximations of the degradation.
Table 1 Summary of the incremental prognoser
estimate accuracies

# Examples    MAE (hrs)
0             97
1             92
2             81
3             80
4             74
5             53

4 Conclusions
This paper has shown that by taking a more relaxed
approach and allowing degradation data to speak for
itself, we can develop a prognosis system that
incrementally learns “real world” degradation
mechanisms by simply retaining additional
degradation examples. The example presented in this
paper illustrates that incrementally training the
PACE prognoser can substantially improve its
performance over time. Furthermore, it is
important to note that while the present example was
simplified for the sake of discussion, there are no
technical roadblocks to implementing the described
approach with more complex functional
approximations of the degradation, provided the
functional approximations are known or can be
found.
3.4 Testing and Results
To test the approach, the prognoser was
incrementally trained on five “real world” data sets.
For each prognoser, a random sampling of 20
observations of the number of events, elapsed time,
and actual RUL for the 5 sets was used as a test set.
In other words, we used the samples of the number of
events and elapsed time to estimate the RUL with the
current prognoser. Next, the estimate of the RUL
was compared to the actual RUL to assess the
accuracy of the individual prognoser. This process
was repeated beginning with the prognoser trained
only with expert opinion and ending with the
prognoser trained on the expert opinion and the entire
set of example cases.
The results are presented in Table 1. In the
# Examples column the number of “real world”
examples used in the prognoser is listed. The
prognoser accuracy over the test set for each
prognoser is presented in the MAE (mean absolute
error) column. Notice that there is a clear downward
trend in the MAEs as the prognoser is incrementally
trained on real degradation data. If we examine the
prognoser trained only with the expert opinion, the
MAE is approximately 97 hours. After we’ve
supplemented the expert opinion with the five
examples, the MAE has improved to 53 hours. This
represents a 45% improvement in the
predictive performance of the prognoser.
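For readers who want to check the arithmetic, the sketch below shows how a mean absolute error and the relative improvement between two MAE values can be computed. The function names and the three estimate/actual pairs are hypothetical; only the 97 hour and 53 hour MAE values come from Table 1.

```python
def mean_absolute_error(estimates, actuals):
    """Mean of |estimate - actual| over paired RUL values (hours)."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(estimates)


def percent_improvement(baseline, updated):
    """Relative reduction of the error, in percent of the baseline."""
    return 100.0 * (baseline - updated) / baseline


# Hypothetical RUL estimates and actual RULs, for illustration only.
print(mean_absolute_error([100.0, 60.0, 20.0], [90.0, 75.0, 20.0]))

# MAE values reported in Table 1 for 0 and 5 "real world" examples.
improvement = percent_improvement(97.0, 53.0)
print(round(improvement))
```

The improvement is computed relative to the expert-only prognoser, that is, (97 - 53) / 97, which rounds to the 45% quoted above.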
5 References
Garvey, Dustin R. (2007), An Integrated Fuzzy
Inference Based Monitoring, Diagnostic, and
Prognostic System, Ph.D. Dissertation, Nuclear
Engineering Department, University of
Tennessee, Knoxville: 2007.
Garvey, Dustin R. and J. Wesley Hines (2007), “An
Integrated Fuzzy Inference Based Monitoring,
Diagnostic, and Prognostic System”, Proceedings
of the 61st Annual Meeting of the Machinery
Failure and Prevention Technology Society,
pp.59-68, Virginia Beach, VA: April 17-19, 2007.
Liao, Haitao, Wenbiao Zhao, and Huairui Guo
(2006), “Predicting Remaining Useful Life of an
Individual Unit Using Proportional Hazards
Model and Logistic Regression Model”,
Proceedings of the Reliability and Maintainability
Symposium (RAMS), pp.127-132: January 23-26,
2006.
Lu, C. Joseph and William Q. Meeker (1993), “Using
Degradation Measures to Estimate a Time-to-Failure Distribution”, Technometrics, Vol. 35,
No. 2, pp.161-174: May 1993.
Mishra, S., S. Ganesan, M. Pecht and J. Xie (2004),
"Life Consumption Monitoring for Electronic
Prognostics", Proceedings of the IEEE Aerospace
Conference, Vol. 5, pp.3455-3467: March 6-13,
2004.
Nadaraya, E.A. (1964), “On estimating
regression”, Theory of Probability and Its
Applications, Vol. 10, pp. 186-190, 1964.
Upadhyaya, Belle R., Masoud Naghedolfeizi, and B.
Raychaudhuri (1994), “Residual Life Estimation
of Plant Components”, Periodic and Predictive
Maintenance Technology, pp.22-29: June 1994.
Wang, Peng and David W. Coit (2007), “Reliability
and Degradation Modeling with Random or
Uncertain Failure Threshold”, Proceedings of the
Annual Reliability and Maintainability
Symposium, Las Vegas, NV: January 28-31,
2007.
Watson, G. S. (1964), “Smooth Regression
Analysis”, Indian Journal of Statistics, Series
A, Vol. 26, pp. 359-372: 1964.
Yan, Jihong, Muammer Koc, and Jay Lee (2004), “A
Prognostic Algorithm for Machine Performance
Assessment and Its Applications”, Production
Planning & Control, Vol. 15, No. 8, pp.796-801:
December 2004.
Xu, Di and Wenbiao Zhao (2005), "Reliability
Prediction using Multivariate Degradation Data",
Proceedings of the Annual Reliability and
Maintainability Symposium, pp.337-341,
Alexandria, VA: January 24-27, 2005.