IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 24, NO. 1, FEBRUARY 2009
Using State Diagrams for Modeling
Maintenance of Deteriorating Systems
Thomas M. Welte
Abstract—This paper discusses the use of state diagrams in
maintenance modeling. These diagrams frequently illustrate
deterioration, inspections and maintenance. Mathematically, the
state diagram can be represented by a Markov process. The
paper discusses the properties of such a Markov process. They
are compared with the maintenance situation in the real world. It
is shown that some properties make the model inconsistent with
reality especially in cases where a maintenance policy with nonperiodic inspections is modeled. A numerical example is provided
that shows that these model properties result in modeling errors.
The presented results make it clear that the common practice of
using Markov processes based on state diagrams must be judged
critically when they are used for modeling certain maintenance
strategies.
Index Terms—Deterioration, inspection, maintenance, Markov
processes, Monte Carlo methods, state diagrams.
I. INTRODUCTION
Most technical systems are subject to deterioration as a
result of usage and age. Thus, most major technical systems such as power plants, production systems, civil infrastructure, ships or planes are maintained according to a preventive
maintenance policy to avoid failure. These policies usually include corrective maintenance after a system failure, regular inspections to reveal the system condition and preventive maintenance to improve the system condition if the system has serious
signs of deterioration. Scheduling and optimization of the maintenance require mathematical models to quantify the impact of
maintenance on the lifetime and reliability of technical systems.
Some authors have proposed that deterioration, inspections
and maintenance can be illustrated by a state diagram [1],
[2]. Mathematically, the state diagram can be represented by
a Markov process, which may be solved by standard Markov
methods [1]–[4]. This paper critically discusses this practice
when it is applied to the modeling of maintenance of deteriorating systems. The advantage of using state diagrams is that
they provide a simple, graphical illustration of the maintenance
strategy. Furthermore, they can directly be used as a basis for
the mathematical model, i.e., the Markov process. Thus, the
diagrams provide an easy and straightforward tool that can be
used to build the mathematical model.
Manuscript received April 15, 2008; revised July 17, 2008. First published
December 09, 2008; current version published January 21, 2009. This work was
supported in part by the Norwegian Electricity Industry Association (EBL) and
in part by General Electric Energy, Norway. Paper no. TPWRS-00280-2008.
The author is with SINTEF Energy Research, Department of Energy Systems, Trondheim, Norway, and also with the Department of Production and
Quality Engineering, Norwegian University of Science and Technology, Trondheim, Norway (e-mail: thomas.welte@sintef.no).
Digital Object Identifier 10.1109/TPWRS.2008.2005711
The focus in this paper is on maintenance policies with state-dependent inspection frequencies (also called nonperiodic inspections). The models were originally introduced for analyzing
maintenance strategies with periodic inspections [5]–[7]. In this
case, they are useful because valid results can be obtained. Later
on, however, they were generalized to nonperiodic inspections [1], [3], [4], [8] without analyzing or reflecting on the consequences of this generalization. This paper fills this gap and
is therefore a contribution towards a better understanding of the
models, their properties and their influence on the modeling results.
This paper discusses properties of Markov processes, which
are based on state diagrams where inspections and maintenance
are directly incorporated into the diagrams. It is analyzed
whether these properties are realistic or not. A numerical
example investigates to what extent the model properties
influence numerical results, for example, when the models are
used for calculating reliability measures, such as failure rates,
state durations, mean time between failures or mean time to
first failure.
The remainder of this paper is organized as follows: Section II
provides a short overview of the use of state diagrams in modeling the maintenance of deteriorating systems. The maintenance situation in the real world is described in Section III.
Model properties and different concepts to realize a maintenance model mathematically are discussed in Section IV. A numerical example is presented in Section V. Finally, the paper is
summarized and conclusions are drawn in Section VI.
II. STATE DIAGRAMS IN MAINTENANCE MODELING
In many applications, failures can be divided into two categories: random failures and those arising as a consequence of
deterioration (ageing) [2]. In the latter case, the deterioration
process can be represented by a sequence of stages of increasing
wear, finally leading to equipment failure. A state diagram representing a simple failure-repair process for this case is shown
in Fig. 1(a). If no maintenance is carried out, a new system will
run through all stages of deterioration and will sooner
or later reach the fault state, denoted F. In a simple case, the
system will be replaced or repaired to a state of “as good as new,”
that is, the system is restored to the first deterioration stage after failure. In practice,
many technical systems are maintained regularly to avoid failures and to intervene if the technical condition becomes critical.
Therefore, maintenance actions that improve the system condition are either carried out according to a predefined schedule, or
the system is inspected regularly to decide if and what kind of
maintenance is done. According to [2], maintenance can easily
be added by introducing additional maintenance states; see the example in Fig. 1(b). The figure shows one example where it is assumed that maintenance will bring on average an improvement
of the system condition to the previous stage of deterioration.
Fig. 1. State diagrams for deteriorating systems (adapted from [2]). S: stages of deterioration, F: fault state, M: maintenance states. (a) Simple
failure-repair process. (b) Deteriorating system including maintenance.
Fig. 2. Basic principle for many maintenance models.
It is well known that the state diagram
turns into a Markov process if the state transitions occur with
a constant rate and if the future development of the system is
only dependent on the current state. This means that the time of
transition to a following state is modeled by an exponential distribution and the future process is independent of anything that
happened in the past. If the transition times are modeled by a
general probability distribution, the resulting model is called a
semi-Markov model [9]. The former is relatively easy to solve
and there are standard methods that can be used to calculate performance measures such as state probabilities, visit frequencies,
mean durations and mean time between failures [5], [9]–[11].
The solution of the latter requires more sophisticated mathematical techniques (see for example [11]). Monte Carlo simulations
or numerical methods are sometimes applied for computing reliability measures when the analytical solution is hard to derive.
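As a minimal sketch of the exponential (Markov) case, the following Python fragment simulates a deterioration chain without any maintenance and compares the simulated mean time to failure with the sum of the mean sojourn times. The three deterioration rates and the function names are hypothetical and chosen only for illustration; they are not values from this paper.

```python
import random

# Hypothetical deterioration rates per year for three stages (assumption,
# not taken from the paper).
RATES = [0.5, 1.0, 2.0]


def time_to_failure(rates, rng):
    """One realization: sum of independent exponential sojourn times."""
    return sum(rng.expovariate(rate) for rate in rates)


def simulated_mean_time_to_failure(rates, runs=20000, seed=1):
    """Monte Carlo estimate of the mean time to failure without maintenance."""
    rng = random.Random(seed)
    return sum(time_to_failure(rates, rng) for _ in range(runs)) / runs


# For constant rates the mean sojourn time in a stage is 1 / rate, so the
# analytic mean time to failure is the sum of these means.
analytic_mean = sum(1.0 / rate for rate in RATES)
simulated_mean = simulated_mean_time_to_failure(RATES)
```

For a semi-Markov variant, only the sampling line in `time_to_failure` would change, which is one reason why simulation is often used when the analytical solution is hard to derive.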
This paper discusses models that have the following basic
principle (see Fig. 2): A sequence of deterioration states is followed by the fault state and from one or several deterioration
states and/or from the fault state there are transitions to additional states representing inspections, maintenance, decisions,
waiting periods etc. From there, the system returns to one of
the deterioration states or to the fault state (e.g., if the maintenance action could result in a system failure). In the following,
this way of formulating the maintenance model is denoted a
“classical” state diagram. Maintenance models that have this
classical structure are presented in [1], [3], [4], [6]–[8], and
[12]–[15].
Most of the examples found in the literature analyze a
strategy where inspections and maintenance are performed
with a constant rate, which is independent of the condition of
the system. This means that the corresponding rates in Fig. 2
all have the same value [6], [7], [12], [14], [15]. In some other
applications [1], [3], [4], [8], however, state diagrams are used
to model maintenance policies with nonperiodic inspections,
where the inspection frequency depends on the stage of
deterioration and increases as deterioration progresses. This is a reasonable
assumption because it is common practice to inspect
technical systems more frequently if it is known that the system
has deteriorated. The objective of this strategy is to increase the
probability of detecting a critical situation at the end of the life
of the system, and to replace or repair the system before it
fails.
Fig. 3. Example of a classical state diagram.
The time intervals between inspections and maintenance are
often modeled by an exponential distribution. As already discussed in [1] and [16], this assumption is not always realistic and
there are techniques to consider non-exponential distributions;
see, e.g., [6] and [16]. Nevertheless, exponentially distributed
inspection intervals are used in this paper, as is common
practice in the literature [1], [3], [4], [6]–[8], [12]–[15].
A. State Diagram Used as Maintenance Model—An Example
A typical example of a classical state diagram is shown in
Fig. 3. The maintenance strategy visualized by this state diagram is used as a basis for discussions throughout this paper.
The deterioration process is represented by three discrete stages.
If no maintenance is carried out, the last deterioration
stage is followed by the fault state F. It is assumed that after
failure, the system is replaced or repaired to the first stage ("as good as new"). It is
well-known that this assumption can easily be relaxed [1]. In
order to extend the equipment lifetime, maintenance is carried out according to a predefined strategy. Inspections
(represented by the inspection states) are performed, which result
in the decision to
• do nothing, if the system is still in the first stage;
• carry out a maintenance action, if the system
is in the second stage. This will improve the system condition by
one stage;
• carry out a maintenance action, if the system
is in the third stage. This will also improve the system condition
by one stage.
III. MAINTENANCE SITUATION IN THE REAL WORLD
A maintenance model that mathematically describes the
policy in Section II-A can be classified into the category of
“inspection models.” Valdez-Flores and Feldman [17] define
an inspection model as follows: “The state of the system is
completely unknown unless inspection is performed. In the
absence of repair or replacement actions, the system evolves
as a non-decreasing stochastic process. In general, at every
decision epoch there are two decisions that have to be made.
One decision is to determine what maintenance action to take,
whether the system should be replaced or repaired to a certain
state or whether the system should be left as it is. The other
decision is to determine when the next inspection should be
performed.”
According to their definition, decisions about inspection frequencies and maintenance actions are based on the knowledge
about the deterioration stage of the equipment. This is in agreement with our logical understanding of the situation in the real
world. There often exist predefined rules that recommend a special inspection frequency that is dependent on the stage of deterioration. Such a recommendation could be, for example, that
if the system is in the first stage, inspections should be carried out
every second year. If the system is in the second or third stage, inspections
should be performed yearly. The described strategy implies that
the currently used inspection frequency is adjusted as soon as
new knowledge about the system is available. New knowledge
about the system is normally provided by an inspection, a maintenance action or a failure, that is, after an event that provides
new information about the system condition.
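The two decisions taken at each decision epoch can be sketched as a small Python function. This is only an illustration under stated assumptions: stages are numbered 1 to 3, the maintenance rule is the one from Section II-A (improve by one stage in stages 2 and 3), and the inspection intervals are the example values mentioned above (two years in the first stage, one year otherwise). The function name `decide` is hypothetical.

```python
def decide(detected_stage):
    """Return (stage_after_action, years_to_next_inspection).

    Assumed numbering: 1 = first stage ("as good as new"), 3 = last stage
    before the fault state.
    """
    if detected_stage not in (1, 2, 3):
        raise ValueError("detected_stage must be 1, 2 or 3")

    # Decision 1: which maintenance action to take.
    if detected_stage == 1:
        stage_after = 1                   # do nothing
    else:
        stage_after = detected_stage - 1  # maintenance improves by one stage

    # Decision 2: when to inspect next, based on the condition known
    # after the action (example intervals from the text).
    years_to_next_inspection = 2.0 if stage_after == 1 else 1.0
    return stage_after, years_to_next_inspection
```

Note that the next interval depends only on what is known at the decision epoch, not on deterioration that happens unobserved afterwards.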
In the classical state diagrams as presented in the literature,
there is usually a direct connection between the deterioration
states and the maintenance and inspection states. This connection
is illustrated by the arrows pointing from the deterioration states to the inspection states (cf.
Fig. 3). This connection is a poor model property and results
in errors, as shown later in this paper. Instead there should be
a clear separation of the deterioration process and the maintenance strategy (inspection and maintenance “process”) in a
model, because these two “processes” are independent of each
other, apart from when inspections and maintenance are carried
out or when failures occur. Furthermore, it is only at these points
in time that one can gather information about the system condition.
The discussed properties of the real maintenance situation
may be illustrated as in Fig. 4, where system deterioration, and
the maintenance and inspection strategy are illustrated as two
parallel “processes,” symbolized by the solid arrows. They are
only connected to each other at the points in time when inspections (I) are carried out or failures (F) occur. Then, decisions
(D) about maintenance actions (M) and the length of the next
inspection interval are made.
A. Alternative Illustration of the Maintenance Strategy
As an alternative to the classical state diagram, deterioration,
inspections and maintenance may be illustrated as in Fig. 5. The
graph in Fig. 5 incorporates the considerations discussed in the
previous section. Deterioration from the first stage to the fault state
F is represented by a chain of states. This is similar to the classical state diagrams. System deterioration can be modeled, for
example, by a stochastic process such as the Markov process.
In contrast to a classical state diagram, other ways of graphical representation are used to add inspections and maintenance. There is only one inspection state (I) in the graph. The
dash-dotted rectangle around the deterioration states represents that the
system is inspected without knowledge about the current deterioration state.
The actual inspection frequency depends
on a decision (D) that either has been made at the end of the
last inspection (eoi: end of inspection) or at the end of the last
maintenance action (eom: end of maintenance). The inspection
duration is represented by a mean inspection duration. It is not until the inspection
time has elapsed that the inspection result is available.
Now, a decision can be made whether to perform maintenance
or not. This is illustrated by the decision nodes, whose index represents
the detected system state. The inspection result and the decision
depend on the system state. This dependency is illustrated by
the dotted arrows between the deterioration states and the decision nodes, which is similar to the
illustration in Fig. 4 where the dotted arrows connect the deterioration process with the maintenance and inspection “process.”
Fig. 4. System deterioration and maintenance/inspection strategy illustrated as
two parallel “processes.” D: decision, F: failure, I: inspection, M: maintenance.
The inspection interval is also marked.
Fig. 5. Alternative illustration of deterioration, inspections, and maintenance.
D: decision, F: fault state, I: inspection, S: stages of deterioration, eoi: end of
inspection, eom: end of maintenance. Also shown are the inspection frequency, the mean inspection duration, the deterioration rate, and the mean maintenance duration.
At the end of the maintenance action, or if no maintenance
is carried out directly at the end of the inspection, a second
decision about the next inspection time is made. These decisions are also based on the knowledge about the system condition. Note that the maintenance action may have changed the
system condition and the next inspection is scheduled dependent on the system state after maintenance. After the inspection
and the maintenance action are finished, the system is immediately put into operation. This is illustrated by the dashed arrows
leading back to the deterioration states. Failures are assumed to be self-announcing
and a corrective maintenance action will bring the system back
to the first stage.
IV. DISCUSSION
This section discusses the realism of state diagrams in maintenance modeling. It is shown that in principle there are two concepts to realize the model mathematically: the redrawing concept (RD concept) and the no-redrawing concept (NRD concept). It is shown that the concepts lead to different results. It
is argued that the NRD concept is in perfect agreement with
our logical understanding of reality, whereas the RD concept
has some discrepancies with the maintenance situation in the
real world. A Markov process that is built on classical state diagrams is a smart mathematical solution for the RD concept. It
follows from this that this kind of model is a poor representation of reality. The meaning of the last inspection interval and
the computation of visit frequencies and state durations are also
discussed in this section.
A. Mathematical or Numerical Model Realization
The state diagrams in Figs. 3 and 5 are only a graphical representation of deterioration, inspections and maintenance, but not
an executable model. It is not until a mathematical or numerical
model is built on the diagrams that an executable model is obtained. As mentioned before, the classical state diagram directly
represents a Markov process if all transitions are exponentially
distributed. The graph in Fig. 5 cannot directly be used as a basis
for a mathematical model. Nevertheless, it may help to illustrate
the real situation and it makes clear how we must think when an
inspection model is realized mathematically.
At this point, we want to assume that Monte Carlo simulations
are used to realize the model. There are in principle two different
concepts to carry out these Monte Carlo simulations.
• Redrawing (RD)
Both the next time of system deterioration and the next inspection time are redrawn each time the system enters a deterioration state, whether by deteriorating from the previous stage or by returning from an inspection.
• No-redrawing (NRD)
The next time of system deterioration is only drawn when
there is a (physical) state change due to deterioration or
maintenance. Inspection times are only drawn when there
are decisions after an inspection or after a maintenance
action.
The two ways of carrying out Monte Carlo simulations can be
considered as fundamentally different concepts to realize the
mathematical model and to compute results. The RD concept
is illustrated by a classical state diagram as shown in Fig. 3,
whereas the graph in Fig. 5 is an attempt to illustrate the NRD
concept.
In order to clarify the difference between these two concepts, a deteriorating system is considered that is maintained
according to the policy described in Section II-A. Assume that
the system is “as good as new” in the first stage at time zero. If
Monte Carlo methods are used to compute a realization of the
model, one would start the simulation by computing an exponentially distributed random number with the deterioration rate of the first stage. This number
is a realization of the sojourn time in the first stage. In
addition, the next inspection interval could be computed
as an exponentially distributed random number with the inspection rate of the first stage.
Thus, the time of deterioration (transition from the first to the second stage) is at
the end of the drawn sojourn time, and the next inspection (transition
from the first stage to its inspection state) is at the end of the drawn inspection interval.
Assume that there is an inspection before the system deteriorates to the second stage, that is, the drawn inspection interval is shorter than the drawn sojourn time. This inspection would reveal
that the real system is in the first deterioration stage. According
to the predefined strategy, no maintenance action is performed
and the system is returned into operation. To simplify matters,
the inspection duration is assumed to be short compared to the
operating times. Thus, the inspection periods can be neglected
and the system returns to the first stage approximately at the time of the inspection.
If the RD concept is applied, the restart of system operation
means that the system once again reaches the first stage. Then, one
would again draw a random sojourn time and a
random inspection interval. This means that the new
transition time from the first to the second stage is given by the inspection time plus the newly drawn sojourn time, and the
next inspection time is at the inspection time plus the newly drawn inspection interval.
If the NRD concept is applied, the restart of system operation
means that the next inspection interval is computed as an
exponentially distributed random number, whereas the sojourn
time in the first stage is not redrawn. In this case, the new transition time
from the first to the second stage is still the previous transition time, that is,
the end of the originally drawn sojourn time. The next inspection time is
the inspection time plus the newly drawn inspection interval.
As can easily be seen, the difference between the concepts lies
in the different handling of the next transition times. However,
only one concept is correct for a given real-world system. In
Section IV-B, the concepts are compared with the maintenance
situation in the real world. It is shown that the RD concept violates our logical understanding of reality, whereas the NRD
concept realizes a situation that is in good agreement with our
understanding of reality.
In the considerations above, only the case where the inspection precedes the deterioration
was described. A similar situation occurs for the opposite case,
that is, when a system transition to
the next deterioration state occurs before the next inspection is
performed. In this case, the next inspection times are handled
differently. If the concept of redrawing is used, a new inspection
interval is drawn with the inspection rate of the new stage. Thus, the next inspection
is at the deterioration time plus the newly drawn interval, whereas for the NRD concept the next
inspection time remains at the previously drawn inspection time.
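The difference between the two concepts can be made concrete with a short Monte Carlo sketch in Python. It is an illustration under simplifying assumptions, not the implementation used for the results in this paper: stages are numbered 1 to 3, inspections and maintenance take no time, each run stops at the first failure, and all rates in `DET_RATE` and `INSP_RATE` are hypothetical. In the RD variant the inspection clock follows the true current stage, as in a Markov process built on the classical diagram; in the NRD variant it follows the stage last known to the operator.

```python
import random

# Hypothetical rates per year (assumptions, not the paper's Table I).
DET_RATE = {1: 0.5, 2: 1.0, 3: 2.0}    # deterioration rate out of stage i
INSP_RATE = {1: 0.5, 2: 1.0, 3: 4.0}   # inspection rate attached to stage i
FAULT = 4


def first_failure_rd(rng):
    """Redrawing: both clocks are redrawn on every arrival in a stage."""
    t, stage = 0.0, 1
    while True:
        to_det = rng.expovariate(DET_RATE[stage])
        to_insp = rng.expovariate(INSP_RATE[stage])  # follows the true stage
        if to_det <= to_insp:
            t += to_det
            stage += 1
            if stage == FAULT:
                return t
        else:
            t += to_insp
            if stage > 1:        # maintenance improves the stage by one
                stage -= 1


def first_failure_nrd(rng):
    """No redrawing: a clock is drawn only when its own cause occurs."""
    t, stage, known = 0.0, 1, 1
    next_det = rng.expovariate(DET_RATE[stage])
    next_insp = rng.expovariate(INSP_RATE[known])
    while True:
        if next_det <= next_insp:
            t = next_det
            stage += 1           # physical change: draw a new sojourn time,
            if stage == FAULT:   # but keep the scheduled inspection time
                return t
            next_det = t + rng.expovariate(DET_RATE[stage])
        else:
            t = next_insp
            if stage > 1:        # maintenance changes the physical state
                stage -= 1
                next_det = t + rng.expovariate(DET_RATE[stage])
            known = stage        # decision based on the revealed condition
            next_insp = t + rng.expovariate(INSP_RATE[known])


def mean_time_to_first_failure(simulate, runs=20000, seed=7):
    rng = random.Random(seed)
    return sum(simulate(rng) for _ in range(runs)) / runs


mttff_rd = mean_time_to_first_failure(first_failure_rd)
mttff_nrd = mean_time_to_first_failure(first_failure_nrd)
```

With these hypothetical rates the RD variant reports a clearly longer mean time to first failure than the NRD variant, because the high inspection rate of the third stage becomes active as soon as the simulated system deteriorates, although no operator would know about that deterioration yet.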
B. Comparison With the Real Maintenance Situation
Let us analyze the maintenance situation in the real world by
again considering the two cases: an inspection occurring before the deterioration, and the deterioration occurring before the inspection.
In the former case, the time of system deterioration does not
change (no redrawing) in the real world when inspections are
carried out, because the system does not know that it was inspected. This means that an inspection, followed by the decision to do nothing, has no influence on the system state. Only
a maintenance action would change the system state
and hence the next deterioration time. In the latter case, there is
no changing (redrawing) of the next inspection time in the real
world if the system deteriorates from the first to the second stage. The inspection
time is a decision of the operator, who does not know when
the state transition (deterioration) occurs. Thus, the concept of
redrawing provides an unrealistic solution, because the inspections have an influence on the physical deterioration process and
system deterioration influences the next time of inspection.
In other words, if the RD concept is used, the scheduling of
the inspections is triggered by a state transition, whereas in a real
world situation, the next inspection time is a decision, which is
normally triggered by information gathered through inspections
or during maintenance. Thus, if the state diagram is solved with
techniques applying the RD concept, this is an obvious violation
of our logical understanding of the situation in the real world.
The use of classical state diagrams can entice the analyst to use
the RD concept. The graphical representation in the diagrams
suggests that the system jumps from state to state and that the
next event is given by the possible transitions from the actual
state to the connected states, whichever transition happens first.
The example in Section V shows that the use of the RD concept
can result in modeling errors. The NRD concept, however, exactly reflects the real maintenance situation. Thus, an inspection
model should be solved by mathematical methods and numerical procedures that realize the NRD concept.
C. Markov Processes Based on Classical State Diagrams
A Markov process that is built on the classical state diagram
is a smart mathematical solution of the RD concept. This can
easily be proven when the Markov model is “solved” by Monte
Carlo simulation based on the RD concept instead of standard
Markov methods. In the previous section, it has been shown that
the RD concept violates our logical understanding of the reality.
The logical consequence is that a Markov process based on a
classical state diagram is a maintenance model with the same
unrealistic properties as the RD concept.
In the case of nonperiodic inspections, the inspection frequencies
(the inspection rates in Fig. 3) increase with the deterioration state.
“Standard Markov methods” have been proposed and used for
inspection models with a nonperiodic inspection strategy [1],
[3], [4], [8]. As soon as there is a state transition from one deterioration state to the next
in the model, there will also be a change of the inspection
frequency from the rate of the previous stage to the rate of the new stage. A consequence of this is that the
residual time to the next inspection is no longer redrawn from an exponential distribution with the same rate as the previously drawn
inspection time, but with an increased rate. This leads to erroneous results as shown in Section V.
The Markov model provides a good, simple and correct solution, if all transitions are exponentially distributed and if all inspection rates are equal (periodic inspections). In this case,
the time to next inspection and the time of system deterioration
can be redrawn each time when there is an arrival in a deterioration state. The
redrawn times can be interpreted as the residual time to system
deterioration and inspection, respectively. If all transitions have
constant rates, we can utilize the memoryless property of the exponential distribution. The (redrawn) residual time to the next
state transition is exponentially distributed with the same transition rate as the previously drawn transition time. This means
that it does not matter how long the system stayed in one state
and how often we redraw transition times. From this follows that
the RD concept and the NRD concept are equivalent. A numerical example for this case is given in Section V-B.
Note that the RD concept cannot be applied, if the transitions
have a general (non-exponential) distribution because the memoryless property is not valid anymore. This understanding is important when the model is realized by Monte Carlo simulation.
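The role of the memoryless property can be checked numerically with a small Python sketch. All parameter values are hypothetical. The sketch compares the mean of a freshly drawn transition time with the mean residual time of those draws that have already survived a given elapsed time, once for an exponential distribution and once for a Weibull distribution with increasing hazard.

```python
import random


def mean_residual(draws, elapsed):
    """Mean remaining time among the draws that exceed the elapsed time."""
    survivors = [x - elapsed for x in draws if x > elapsed]
    return sum(survivors) / len(survivors)


rng = random.Random(3)
N = 200000
ELAPSED = 1.5  # hypothetical time already spent in the state (years)

# Exponential transition time, hypothetical rate 0.5 per year (mean 2 years).
expo = [rng.expovariate(0.5) for _ in range(N)]
expo_fresh = sum(expo) / N
expo_residual = mean_residual(expo, ELAPSED)

# Weibull transition time, hypothetical scale 2 and shape 2.
weib = [rng.weibullvariate(2.0, 2.0) for _ in range(N)]
weib_fresh = sum(weib) / N
weib_residual = mean_residual(weib, ELAPSED)
```

For the exponential case the fresh and residual means agree up to sampling noise, so redrawing does no harm. For the Weibull case the residual mean is markedly shorter than the fresh mean, so redrawing a full transition time would overstate the remaining time in the state.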
Since the graph in Fig. 5 and the NRD concept cannot be
mathematically realized by standard Markov methods, it may
be realized by Monte Carlo methods as described above, or by
other numerical procedures, for example, as suggested in [18]
and [19]. For some special maintenance strategies, e.g., periodic inspections, the model may be solved analytically by using
renewal theory [11], [20], [21]. When deterioration can be measured by a continuous quantity, stochastic processes such as the
gamma process or the Brownian motion may alternatively be
applied in maintenance modeling, see, e.g., [22]–[25].
Regardless of the applied concept, when Monte Carlo simulation is used to solve a model, an assumption has to be made regarding the appropriate probability distributions. In general, more
flexible distributions (e.g., two-parameter distributions such as
the Weibull distribution or the gamma distribution) usually perform better than the exponential distribution, which has limitations due to the constant transition rate. The best way to find a
proper probability distribution is to collect data and
fit a probability distribution to the data set. However, collection
of data is not practical in many cases. It may require a long
time, or the system under consideration may be unique, which means
that no useful data set can be obtained. Furthermore, the exact
sojourn times are difficult to observe, or continuous monitoring would be required to get good observations of the sojourn times.
Some of these topics are discussed in more detail in [26] and
[27].
Knowledge about the deterioration process may help to define a suitable probability distribution for the sojourn times. For
example, if deterioration is caused by a series of randomly occurring shocks, the gamma distribution may be a good model.
If deterioration is caused by several “competing” deterioration
mechanisms, the Weibull distribution might be a good choice
(weakest link theory). For the repair time and the inspection intervals it is usually easier to collect data than for the sojourn
times. The lognormal distribution is considered to be a suitable distribution for repair times [9]. The inspection intervals
are deterministic rather than exponentially distributed. If the inspection intervals are not modeled as deterministic numbers, the
Weibull or gamma distribution with comparatively low variance
might be a proper choice.
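In a Monte Carlo realization, these alternatives can be swapped in by changing only the sampling functions. The following Python sketch uses the standard library and purely hypothetical parameter values; the helper names are illustrative and do not come from this paper.

```python
import math
import random


def gamma_time(rng, mean, shape):
    """Gamma-distributed time with the given mean (scale = mean / shape)."""
    return rng.gammavariate(shape, mean / shape)


def weibull_time(rng, mean, shape):
    """Weibull-distributed time with the given mean."""
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return rng.weibullvariate(scale, shape)


def lognormal_time(rng, median, sigma):
    """Lognormal time; mu is the logarithm of the median."""
    return rng.lognormvariate(math.log(median), sigma)


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_std(xs):
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


rng = random.Random(11)
N = 20000
# Hypothetical parameters (years), chosen only for illustration.
sojourn_shocks = [gamma_time(rng, 2.0, 3.0) for _ in range(N)]
sojourn_weakest_link = [weibull_time(rng, 2.0, 2.0) for _ in range(N)]
repair = [lognormal_time(rng, 0.05, 0.5) for _ in range(N)]
# A large shape parameter gives a low-variance, nearly deterministic interval.
inspection_interval = [gamma_time(rng, 1.0, 50.0) for _ in range(N)]
```

Because these distributions are not memoryless, a simulation that uses them has to follow the NRD concept.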
D. Meaning of the Last Inspection Rate
Consider a situation where it is known that the system is in
the second deterioration stage. Thus the operator will schedule the next
inspection approximately after a time period equal to the mean inspection interval for the second stage. We assume
now that the system will deteriorate one stage to the third stage and that
the next planned inspection will reveal the deterioration state.
According to the predefined strategy, the maintenance action
will be carried out and the system condition will be improved to
the second stage. The next inspection will again be scheduled after approximately
the mean inspection interval for the second stage.
The state diagram in Fig. 3 shows an inspection rate for the third stage,
which is attached to the transition from the third stage to its inspection state. In the real maintenance
situation, this rate does not exist. However, we need this rate to build
our state diagram with an inspection that reveals the third stage and a
following maintenance action that improves the system by one
stage. This means that there is again a mismatch between the
classical state diagram and the maintenance situation in the real
world. The proposed alternative in Fig. 5 gives a better representation of the real situation because it does not require this rate.
There are certainly situations where an inspection reveals
that the system is in the third stage and afterwards the decision
is taken to do nothing more than increase the inspection frequency
to the rate of the third stage. Then, this rate has a practical meaning. However, the
considered maintenance strategy does not allow for this case.
Even though the opportunity could be included in the model, this
is obviously not a standard case because there is a high risk of
equipment failure in the next time period. Thus, this would only
be done in rare situations where the equipment is indispensable
for a short time period. Another case where maintenance can
return the system back to the third stage is when the maintenance action
fails to perform its intended purpose. In this case, one does usually not know that the maintenance action had failed. One would
continue to operate the system under the assumption that maintenance was successful until a future inspection revealed that the
maintenance was not successful or until failure occurred.
E. Visit Frequencies
Reliability measures that are commonly calculated with
Markov processes are the visit frequency and the mean duration of states. In a Markov process, the visit frequency is the
frequency of arrivals in state and departures from state ,
respectively. If we consider, for example, visits in the first deterioration state , then each transition from to and each
transition from to is counted as one departure from in
the Markov process. In reality, however, the system does not
really leave state when there is a transition to . According
to the predefined maintenance policy, no maintenance action
is carried out and the system remains in . This means that
the visit frequency in state calculated with standard Markov
methods is the frequency of all departures from including
the imaginary departures from to . This frequency has no
practical meaning and only exists in the Markov process model.
It is not the frequency of visits we are really interested in when
we calculate reliability measures.
The same argument applies to the mean duration of the
deterioration state calculated with standard Markov methods.
This duration is also an imaginary duration of each imaginary
state visit in the model. It would be more interesting to compute the real visit frequency and mean duration in . The real
mean duration can be defined as the average duration of the
time interval between the commissioning of a new or maintained system and the time when the system finally leaves
TABLE I
TRANSITION RATES
because there is a (physical) deterioration to . Depending on
whether the system deteriorates during inspections or not, the
inspection durations have to be included in or excluded from this
time interval. Referring to the definition of state duration given
above, one run through this time interval can be counted as one
state visit. It is recommended to use methods that are capable of
computing the real visit frequency and state duration. This can be
done by means of a model realization that is based on the NRD
concept (e.g. Monte Carlo simulations using the NRD concept).
In the examples in Section V, numerical values are presented for
the different definitions of visit frequency and mean duration in
the deterioration states.
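The difference between the two definitions can be illustrated with a small calculation. The following Python sketch uses a hypothetical three-state chain; the state labels D1, I1 and D2 and all rates are assumptions chosen for illustration and are not the values of Table I. It computes the steady-state probabilities by iterating the uniformized chain and then compares the standard Markov measures for D1 with measures that count only physical departures.

```python
# Hypothetical chain (labels and rates are illustrative assumptions):
# D1 = deterioration state, I1 = inspection of D1 without maintenance,
# D2 = next deterioration state, from which the system is renewed to D1.
RATES = {
    ("D1", "D2"): 0.5,    # physical deterioration, per year
    ("D1", "I1"): 2.0,    # an inspection starts
    ("I1", "D1"): 100.0,  # the inspection ends, no maintenance
    ("D2", "D1"): 1.0,    # renewal
}
STATES = ["D1", "I1", "D2"]


def exit_rate(state):
    """Total rate of leaving `state` in the Markov process."""
    return sum(r for (src, _), r in RATES.items() if src == state)


def steady_state(n_iter=20000):
    """Steady-state probabilities by iterating the uniformized chain."""
    big = 1.01 * max(exit_rate(s) for s in STATES)
    pi = {s: 1.0 / len(STATES) for s in STATES}
    for _ in range(n_iter):
        new = {s: pi[s] * (1.0 - exit_rate(s) / big) for s in STATES}
        for (src, dst), r in RATES.items():
            new[dst] += pi[src] * r / big
        pi = new
    return pi


pi = steady_state()
# Standard Markov measures for D1: every departure counts, also D1 -> I1.
markov_freq = pi["D1"] * exit_rate("D1")
markov_duration = 1.0 / exit_rate("D1")
# "Real" measures: only the physical departure D1 -> D2 ends a visit;
# here the inspection time spent in I1 is included in the visit.
real_freq = pi["D1"] * RATES[("D1", "D2")]
real_duration = (pi["D1"] + pi["I1"]) / real_freq
```

With these assumed rates, the Markov visit frequency of D1 is five times the frequency of physical departures, and the Markov mean duration (0.4 years) is much shorter than the duration of a real visit (about 2 years).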
V. NUMERICAL EXAMPLE
The maintenance policy to be analyzed in this example is as
presented in Section II-A. The state diagram formulated in the
classical way is shown in Fig. 3. The model is a simplification of
the models presented in [1], [3], [4], and [8]. The transition rates
between the states are constant (see Table I). It is claimed [1],
[3], [4] that standard Markov methods can be used to calculate
state probabilities, visit frequencies and mean durations, that is,
the model can be realized mathematically by a Markov process.
It can easily be shown that this Markov process generates results identical to those of Monte Carlo simulations based on the RD concept. The results for the steady-state solution of the Markov
process are presented in Table II, where the columns with the results obtained by standard Markov methods are denoted “RD”
(since the RD concept is realized). The mean time between failures (MTBF) and the mean time to first failure (MTTFF) have
also been computed (Table III).
The development of analytical expressions for maintenance
models such as the described one is difficult (if not impossible).
It is therefore suggested to carry out calculations based on the
NRD concept by Monte Carlo simulation. The simulations
follow the descriptions in Section IV-A. Random transition
times are generated and the next time of system deterioration
is only drawn when there is a physical state change due to
deterioration or maintenance, and inspection times are only
drawn when there are decisions after an inspection or after a
maintenance action. In order to illustrate the difference between the two possible definitions of the state duration and the
visit frequency in the deterioration states (as discussed in
Section IV-E), two solutions are computed (see Table II). The
solutions are denoted “NRD-a” and “NRD-b”, where “NRD-a”
means that each departure from is counted as one visit,
whereas “NRD-b” means that imaginary departures from to
are not counted as a state visit.
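The simulation rules described above can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions, not the simulation used for Tables II and III: the rates in `DET` and `INSP` are hypothetical, inspections are assumed to be instantaneous, and the only maintenance action is an improvement by one stage when the last deterioration state is found. The function estimates the mean time to first failure for the RD concept, where the inspection time is redrawn whenever the true state changes, and for the NRD concept, where it is only drawn after an inspection or maintenance decision.

```python
import random

# Hypothetical parameters (assumptions for illustration, not Table I values).
DET = [1 / 3.0, 1 / 2.0, 1.0]  # deterioration rates out of D1, D2, D3 (per year)
INSP = [0.2, 2.0, 4.0]         # inspection rates used in condition D1, D2, D3


def time_to_first_failure(rng, concept):
    """One simulated life of a new system; concept is "RD" or "NRD".

    Assumed policy: inspections are instantaneous; finding D3 triggers
    maintenance that improves the system by one stage (back to D2).
    """
    state = 0   # true deterioration state (0 = D1, 1 = D2, 2 = D3)
    known = 0   # state revealed by the last inspection or maintenance
    t_det = rng.expovariate(DET[state])
    t_insp = rng.expovariate(INSP[known])
    while True:
        if t_det <= t_insp:              # physical deterioration comes first
            now = t_det
            state += 1
            if state == 3:               # failure
                return now
            t_det = now + rng.expovariate(DET[state])
            if concept == "RD":
                # RD: the inspection time is redrawn with the rate of the
                # new true state, although nobody has observed the change.
                t_insp = now + rng.expovariate(INSP[state])
        else:                            # inspection comes first
            now = t_insp
            if state == 2:               # D3 found: maintenance back to D2
                state = 1
                t_det = now + rng.expovariate(DET[state])
            known = state
            # Decision after inspection or maintenance: schedule the next one.
            t_insp = now + rng.expovariate(INSP[known])


def mean_ttff(concept, n_runs=20000, seed=1):
    """Monte Carlo estimate of the mean time to first failure."""
    rng = random.Random(seed)
    total = sum(time_to_first_failure(rng, concept) for _ in range(n_runs))
    return total / n_runs
```

Calling `mean_ttff("RD")` and `mean_ttff("NRD")` should give a clearly larger value for the RD concept (roughly 14 versus 8 years in expectation for these assumed rates), because the RD realization starts inspecting more often as soon as the system deteriorates.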
TABLE II
STATE PROBABILITIES, VISIT FREQUENCIES, AND MEAN DURATIONS
Models using the RD concept: Markov process
Models using the NRD concept: Monte Carlo simulation using two different ways (a, b) of computing visit
frequencies and mean durations
TABLE III
MTBF AND MTTFF
If we compare the results for both models, an apparent and surprising finding is that the MTBF and the MTTFF (see Table III)
differ considerably. The results for MTBF and MTTFF calculated with the Markov model are much larger than the corresponding results calculated with the Monte Carlo simulation realizing the NRD concept. The reason for this is the direct connection between the deterioration states and the inspection states
in the Markov model. This means that the inspection rate in the
Markov model increases as soon as there is a state transition
from to , whereas in the real world the inspection frequency will still remain at the current value until the next inspection reveals the state transition. Thus, we can expect more
frequent inspections in the Markov model and there is a better
chance to correct the critical situation by maintenance and to
extend the lifetime of the equipment. This is confirmed through
the results. Inspection states and have higher visit frequencies in the Markov model than the equivalent states in the alternative model. Note that an equivalent state to in the Markov
model is an inspection leading to the result that the system is in
state . This is denoted in Fig. 5. Furthermore, it is
not surprising that the steady-state probability is higher for the
“good” state and lower in the “bad” states F, and in the
Markov model, compared to the probabilities in the alternative
model, because there is a higher chance to detect a critical deterioration and to intervene by maintenance due to more frequent
inspection in the Markov model.
A. Influence of the First Inspection Rate
In this section, the dependency of the MTBF on the
first inspection rate is analyzed. Fig. 6 shows plots of the MTBF
for both models. The focus is on the model
behavior if the first inspection rate becomes very small. This means that the first inspection
interval is very long. Thus, if a new system is set into operation, in practice there will be no inspection or maintenance
before the system fails. In this situation, the MTBF is the
sum of the expected sojourn times of the system in the deterioration states if no maintenance is carried out, that is,
.
Fig. 6. MTBF as a function of the length of the first inspection rate , given
different values of . RD: Markov process/RD concept; NRD: NRD concept.
A case without inspections and maintenance could obviously
be represented as a simplified state diagram model as shown
in Fig. 1(a). This simplified model represents only the situation when . However, maintenance models are also used
for analyzing the relationship between the inspection rates and
MTBF or MTTFF; see, e.g., [8]. The complete (nonsimplified)
model is required for such analyses and the model should converge to years when the first inspection rate converges to small values.
The Monte Carlo simulations realizing the NRD concept, denoted “NRD” in Fig. 6, show this convergence. The Markov
process (RD concept) converges as well. The convergence limit,
however, is not as expected. It depends on the choice of other
model parameters, for example . This dependency is obviously wrong because the second inspection rate will never be
applied once the decision is made that no further inspections
are carried out. Thus, the Markov model yields incorrect results.
TABLE IV
MTBF AND STATE PROBABILITIES FOR THE CASE
WITH PERIODIC INSPECTIONS ( = 1/year)
RD: Markov process/RD concept, NRD: NRD concept
The reason for this is again the direct connection between the deterioration states and the inspection states in the classical state
diagram, something that will trigger a change of the inspection frequency as soon as the system deteriorates to state in
the model.
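The limiting behavior discussed in this section can be reproduced with a small closed-form calculation. The following Python sketch is based on a hypothetical, strongly simplified model and not on the model of Fig. 3: three deterioration states D1, D2, D3 with assumed exponential rates, instantaneous inspections, maintenance back to D2 when D3 is found, and two inspection rates `lam1` (condition D1) and `lam2` (condition D2 or D3). All names and values are assumptions. In this simplified setting the RD result does not depend on `lam1` at all, which differs from the curves in Fig. 6, but it shows the same effect: the RD result depends on the second inspection rate even when the first inspection rate goes to zero.

```python
# Hypothetical model (all names and rates are assumptions for illustration):
# D1 -> D2 -> D3 -> failure with exponential sojourn times; instantaneous
# inspections; finding D3 triggers maintenance back to D2. The inspection
# rate is lam1 while the condition is D1 and lam2 while it is D2 or D3.
D1_RATE, D2_RATE, D3_RATE = 1 / 3.0, 1 / 2.0, 1.0  # per year


def mttff_rd(lam1, lam2):
    """RD concept: the inspection rate follows the true state at once.

    lam1 has no influence here because inspections in D1 trigger no action.
    """
    t_d2 = (D3_RATE + lam2) / (D2_RATE * D3_RATE) + 1 / D3_RATE
    return 1 / D1_RATE + t_d2


def mttff_nrd(lam1, lam2):
    """NRD concept: the rate follows the last *revealed* state.

    Solved as a Markov chain on pairs (true state, revealed state).
    """
    # True D2, revealed D2: same equations as the RD case with rate lam2.
    t_22 = (D3_RATE + lam2) / (D2_RATE * D3_RATE) + 1 / D3_RATE
    # True D3, revealed D1: failure, or inspection at rate lam1 -> (D2, D2).
    t_31 = (1 + lam1 * t_22) / (D3_RATE + lam1)
    # True D2, revealed D1: deterioration to D3, or inspection at rate lam1.
    t_21 = (1 + D2_RATE * t_31 + lam1 * t_22) / (D2_RATE + lam1)
    return 1 / D1_RATE + t_21


no_inspection = 1 / D1_RATE + 1 / D2_RATE + 1 / D3_RATE  # sum of sojourn times
for lam2 in (2.0, 4.0):
    print(lam2, mttff_rd(1e-9, lam2), mttff_nrd(1e-9, lam2), no_inspection)
```

For a very small `lam1`, the NRD value approaches the sum of the expected sojourn times (6 years for these assumed rates) for any `lam2`, whereas the RD value stays at 10 or 14 years depending on `lam2`. As a consistency check, both functions return the same value when `lam1` equals `lam2`.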
B. Periodic Inspections
Finally, a case with periodic inspections is considered, that is,
the inspection rate is 1/year in all deterioration states. Both
the Markov process (model based on RD concept) and Monte
Carlo simulations based on the NRD concept yield the same
results if all transitions have an exponential distribution and if
all inspection rates are equal (see Table IV). Note that the RD
concept and the NRD concept are interchangeable only when
the inspection intervals are exponentially distributed.
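The agreement of both concepts in this case can be traced back to the memoryless property of the exponential distribution. As a short reminder, with T denoting an exponentially distributed inspection interval with rate \lambda (both symbols are introduced here only for illustration):

```latex
P(T > s + t \mid T > s)
  = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P(T > t)
```

Redrawing the inspection time when the system deteriorates, as the RD concept does, therefore leaves the distribution of the remaining time to the next inspection unchanged as long as the inspection rate itself stays the same.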
VI. SUMMARY AND CONCLUSIONS
This paper has discussed the use of state diagrams in maintenance modeling. The focus has been on inspection models. It
has been pointed out that the direct connection between the deterioration states and the maintenance and inspection states is a
poor property of classical state diagrams. It has been argued that
in reality, system deterioration and the maintenance and inspection strategy are two parallel and separated processes that are
only connected to each other when new information about the
condition of the system is gained by inspections or following a
failure.
It has been shown that there are two concepts to realize inspection models mathematically or numerically: the RD concept
and the NRD concept. The use of classical state diagrams can
entice the analyst to apply mathematical or numerical methods
that realize the RD concept; for example, Markov processes that
are based on such diagrams. A comparison with the real maintenance situation has shown that this can lead to discrepancies between the maintenance model and the real world. An unrealistic
dependency between system deterioration and inspection times
may arise. The numerical example presented in this paper has
illustrated that this can result in modeling errors when a maintenance strategy with nonperiodic inspections is analyzed. It has
been argued that a maintenance model should realize the NRD
concept. The proposed alternative graph may help to realize and
understand the NRD concept. The resulting inspection model
will be in good agreement with the real world situation.
The purpose of this paper is not a general critique of state diagrams. The diagrams and the resultant Markov processes can
provide useful, simple and correct solutions for different modeling situations. There are many good examples, as well as practical applications, where state diagrams are used as a basis for
further modeling steps. When a situation with nonperiodic inspections is analyzed, however, the Markov process is no longer
a good representation of reality. Obviously, a model is never an exact
representation of reality but a simplification, so that there will
always be some discrepancies between the model and the real
world. However, if the model no longer represents something
that is close to the real situation, the validity of the obtained results may be questioned.
ACKNOWLEDGMENT
The author would like to thank Prof. J. Vatn from the Department of Production and Quality Engineering, Norwegian University of Science and Technology, for his helpful suggestions
and comments.
REFERENCES
[1] J. Endrenyi, G. Anders, and A. Leite da Silva, “Probabilistic evaluation of the effect of maintenance on reliability—An application,” IEEE
Trans. Power Syst., vol. 13, no. 2, pp. 576–582, May 1998.
[2] J. Endrenyi, S. Aboresheid, R. Allan, G. Anders, S. Asgarpoor, R.
Billinton, N. Chowdhury, E. Dialynas, M. Fipper, R. Fletcher, C. Grigg,
J. McCalley, S. Meliopoulos, T. Mielnik, P. Nitu, N. Rau, N. Reppen,
A. Salvaderi, A. Schneider, and C. Singh, “The present status of maintenance strategies and the impact of maintenance on reliability,” IEEE
Trans. Power Syst., vol. 16, no. 4, pp. 638–646, Nov. 2001.
[3] G. J. Anders, J. Endrenyi, and C. Yung, “Risk-based planner for asset
management,” IEEE Comput. Appl. Power, vol. 14, no. 4, pp. 20–26,
Oct. 2001.
[4] G. J. Anders and J. Endrenyi, “Using life curves in the management of
equipment maintenance,” in Proc. 2002 Probabilistic Methods Applied
to Power Systems (PMAPS) Conf., 2002.
[5] J. Endrenyi, Reliability Modeling in Electric Power Systems. Chichester, U.K.: Wiley, 1978.
[6] G. J. Anders, J. Endrenyi, G. Ford, and G. Stone, “A probabilistic model
for evaluating the remaining life of electrical insulation in rotating machines,” IEEE Trans. Energy Convers., vol. 5, no. 4, pp. 761–767, Dec.
1990.
[7] G. Anders, J. Endrenyi, G. Ford, J. Lyles, H. Sedding, J. Maksymiuk,
J. Stein, and D. Loberg, “Maintenance planning based on probabilistic
modeling of aging in rotating machines,” in Proc. 1992 CIGRE Int.
Conf. Large High Voltage Electric Systems.
[8] P. Jirutitijaroen and C. Singh, “The effect of transformer maintenance
parameters on reliability and cost: A probabilistic model,” Elect. Power
Syst. Res., vol. 72, no. 3, pp. 213–224, 2004.
[9] M. Rausand and A. Høyland, System Reliability Theory: Models,
Statistical Methods, and Applications. Hoboken, NJ: Wiley-Interscience, 2004.
[10] G. J. Anders, Probability Concepts in Electric Power Systems. New
York: Wiley, 1990.
[11] S. M. Ross, Stochastic Processes. New York: Wiley, 1996.
[12] S. Amari and L. McLaughlin, “Optimal design of a condition-based
maintenance model,” in Proc. 2004 Reliability and Maintainability
Symp. (RAMS), pp. 528–533.
[13] A. Jayakumar and S. Asgarpoor, “Maintenance optimization of equipment by linear programming,” Probab. Eng. Inf. Sci., vol. 20, pp.
183–193, 2006.
[14] G. Chan and S. Asgarpoor, “Optimum maintenance policy with
Markov processes,” Elect. Power Syst. Res., vol. 76, no. 6–7, pp.
452–456, 2006.
[15] G. Theil, “Parameter evaluation for extended Markov models applied to
condition- and reliability-centered maintenance planning strategies,” in
Proc. 2006 Probabilistic Methods Applied to Power Systems (PMAPS)
Conf.
[16] S. Sim and J. Endrenyi, “Optimal preventive maintenance with repair,”
IEEE Trans. Reliab., vol. 37, no. 1, pp. 92–96, Apr. 1988.
[17] C. Valdez-Flores and R. M. Feldman, “Survey of preventive maintenance models for stochastically deteriorating single-unit systems,”
Naval Res. Logist., vol. 36, no. 4, pp. 419–446, 1989.
[18] T. M. Welte, J. Vatn, and J. Heggset, “Markov state model for optimization of maintenance and renewal of hydro power components,” in
Proc. 2006 Probabilistic Methods Applied to Power Systems (PMAPS)
Conf.
[19] T. Welte, “Deterioration and maintenance models for components
in hydropower plants,” Ph.D. dissertation, Norwegian Univ. Science
Technol., Trondheim, Norway, 2008.
[20] M. J. Kallen and J. M. van Noortwijk, “Optimal periodic inspection of a
deterioration process with sequential condition states,” Int. J. Pressure
Vessels Piping, vol. 83, pp. 249–255, 2006.
[21] M. J. Kallen, “Markov processes for maintenance optimization of civil
infrastructure in The Netherlands,” Ph.D. dissertation, Delft Univ.
Technol., Delft, The Netherlands, 2007.
[22] J. M. van Noortwijk, “A survey of the application of gamma processes
in maintenance,” Reliab. Eng. Syst. Safety, vol. 94, no. 1, pp. 2–21, Jan.
2009.
[23] R. Dagg, “Optimal inspection and maintenance for stochastically deteriorating systems,” Ph.D. dissertation, City Univ. London, London,
U.K., 1999.
[24] A. Grall, L. Dieulle, C. Berenguer, and M. Roussignol, “Continuoustime predictive-maintenance scheduling for a deteriorating system,”
IEEE Trans. Reliab., vol. 51, no. 2, pp. 141–150, Jun. 2002.
[25] R. P. Nicolai, R. Dekker, and J. M. van Noortwijk, “A comparison
of models for measurable deterioration: An application to coatings
on steel structures,” Reliab. Eng. Syst. Safety, vol. 92, no. 12, pp.
1635–1650, 2007.
[26] M. J. Kallen and J. M. van Noortwijk, “Statistical inference for Markov
deterioration models of bridge conditions in the Netherlands,” in Proc.
3rd Int. Conf. Bridge Maintenance, Safety and Management, 2006.
[27] T. M. Welte and A. O. Eggen, “Estimation of sojourn time distribution
parameters based on expert opinion and condition monitoring data,” in
Proc. 2008 Probabilistic Methods Applied to Power Systems (PMAPS)
Conf.
Thomas M. Welte was born in Böblingen, Germany, in 1976. He received the Dipl.-Ing. degree
in mechanical engineering from the University
of Stuttgart, Stuttgart, Germany, in 2003 and the
Ph.D. degree in safety, reliability, and maintenance
from the Norwegian University of Science and
Technology, Trondheim, Norway, in 2008.
He is a Research Scientist at SINTEF Energy Research, Department of Energy Systems,
Trondheim, Norway. He is working with maintenance and deterioration modeling, maintenance
optimization and renewal strategies.