Group Replacement Policies for Parallel Systems
Whose Components have Phase Distributed Failure Times
Elmira Popova 1
Graduate Program in Operations Research and Industrial Engineering
The University of Texas at Austin
Austin, TX 78712
John G. Wilson
Babcock Graduate School of Management
Wake Forest University
Winston-Salem, NC 27109
Abstract
Consider a system of components operating in parallel. Downtime costs are incurred when failed components are not repaired or replaced. There are also fixed, unit repair and replacement costs associated with the system. The failure times of the components are assumed to be identically distributed random variables. Results on calculating the expected cost and variance per unit time of various group replacement policies will be provided. Consideration of variance is important since, in many cases, practitioners wish not only to achieve small expected cost but also to reduce variability from cycle to cycle. Phase distributions allow for the modeling of a wide range of failure time behavior. Closed form results are derived for the three major classes of group replacement policy (m-failure, T-age, and (m, T)) when the underlying distribution is of phase type.
Keywords: Maintenance, Phase Distributions, Production Planning, Cost Variability
1 This research has been partially supported by Grant #0003658-472 from State of Texas Advanced Technology Program, National Science Foundation Grant #DMC-8910378, and a Babcock Research Grant
1 Introduction
Over the past few decades, the complexity of industrial systems has grown enormously.
New industrial paradigms to ensure fast production, delivery, and profit have been introduced. Just-in-time systems are among the most popular. Proper planning of maintenance
is important to minimize disruptions to the system. The literature on reliability and maintainability of complex systems has evolved at a relatively slow pace due to the mathematical
complexity of such problems and limited computational capabilities in the past.
In this paper, a system of stochastically independent and identical components is analyzed. Group replacement policies are investigated in detail. Group replacement policies
require the replacement or repair to as good as new of all components whenever any replacement or maintenance is performed. Their great advantages are that they allow for
economies of scale and are straightforward to implement.
There are three main classes of group replacement policy. A T-age policy (see, e.g.,
Okumoto and Elsayed [11]) calls for replacement every T units of time. An m-failure policy
(see, e.g., Assaf and Shanthikumar [1] and Wilson and Benmerzouga [17]) calls for replacing
the system at the time of the mth failure. A policy that combines features of both of the
above classes is the (m, T) policy, which calls for replacement at the time of the mth failure
or at time T, whichever occurs first (see Ritchken and Wilson [13] and Nakagawa [8]).
The work referenced above assumes that the parameters of the underlying failure time
distributions are known with certainty. In practice, an engineer's opinion of the failure time
distributions will change as data from the actual operation of the system is obtained. There
have been a number of recent results on adaptive Bayesian approaches to modeling this
situation. For the case of exponential failure times, Wilson and Benmerzouga [18] analyzed
a policy class that called for replacement of the system whenever the expected posterior
value of the exponential parameter exceeded a certain threshold. A more general form of
this policy for the case of three machines operating in parallel was considered in Wilson
and Popova [19]. The case of a single machine with a Weibull failure time is considered in
Mazzuchi and Soyer [7].
Group replacement policies are popular in large part due to the ease with which they
can be implemented in a real production setting. There has been little work in the literature
on identifying which class of policies contains the optimal policy for a given system. For
a parallel system where the components have exponential i.i.d. failure times, Assaf and
Shanthikumar [1] showed that the class of m-failure policies is optimal if one knows the
value of the underlying exponential parameter. Wilson and Popova [20] provide optimality
results for the adaptive Bayesian case where the parameter is continuously estimated from
the failure time data. The case where the system consists of only one component but where
the failure distribution is allowed to be any continuous distribution whose parameters are
continuously updated is considered in Popova and Wilson [12].
All of the above approaches assume that one is only interested in finding a policy that
minimizes the expected cost per unit time. However, many managers and engineers are often
just as interested in the variability of cost from cycle to cycle. Indeed, many might prefer
a policy with a slightly higher expected cost per unit time if the variability of these costs is
small. In any case, knowledge of the variance associated with a given policy provides useful
information. Consequently, it is somewhat surprising that most approaches in the literature
ignore variance considerations. In this paper both expected cost and variance per unit time
are explicitly modeled. Derivations of the quantities needed to compute the variance per
unit time are provided in the Appendix. These derivations apply to any continuous failure
distribution with finite first and second moments. (A summary of this material and some
examples can be found in Wilson [16].)
One difficulty with much of the literature on group replacement policies is the restrictive
assumptions that must often be made regarding the failure time distribution. A goal of this
paper is to analyze the situation where the failure distribution can be drawn from a very
wide class. Phase distributions (see Neuts [9]) can be used to approximate most existing
continuous distributions (see, e.g., Bobbio and Cumani [2], Johnson [5], Malhotra and
Reibman [6]). Consequently an extensive analysis for this case is provided.
Some research on the reliability of systems with phase distributed lifetimes has recently
appeared. The two unit priority redundant system with phase failure time and underlying
repair time of the non-priority unit is analyzed by Gururajan and Bhat [4], where closed-form results for the reliability and availability of the system are provided. Chakravarthy
[3] considers the system of two machines in series with a buffer in between. The machines
have exponential failure and repair times and the processing time is phase distributed.
An algorithm for obtaining the steady state probabilities and some system performance
measures is presented.
Consider the class of (m, T) policies. Assume that the failure distribution is continuous,
e.g. Weibull. Suppose the values for T are restricted to the set $\{T_1, T_2, \ldots, T_k\}$. Then a
total of $mk$ policies must be considered. Calculating the expected cost per unit time for
each of these policies involves many numerical integrations. If one also wishes to compute
the appropriate variance associated with each of these policies, then even more integrations
are required. However, as will be demonstrated in this paper, no integrations are required
if a phase distribution is used. One need only consider operations with matrices. However,
if one is analyzing a large system of components, the dimension of the matrices necessary
to compute the expected cost and variance for m-failure and (m, T) policies grows exponentially. Consequently, the size of the problem one can consider depends on the computer
power currently available. The class of phase distributions is sufficiently wide to capture
most reasonable failure behavior. For instance, one can find phase distributions that approximate the Weibull distribution (see, e.g., Johnson [5], Malhotra and Reibman [6]).
The applicability of group replacement models is greatly enhanced when the failure
distributions are realistic, managerially important quantities such as variances are also
calculated and results are relatively easy to obtain numerically. It is demonstrated in this
paper that all of these objectives are satisfied when phase distributions are used to model
failure times.
Notation and basic assumptions are provided in §2. Since the focus of this paper is
on replacement policy issues, most of the phase related derivations are summarized in the
Appendix in an effort to reduce the algebraic complexity of the paper. §4 and §5 contain
explicit results for the expected costs and variances associated with T-age and m-failure
policies, respectively. §6 contains an analysis of the more algebraically complex (m, T)
policies.
2 Notation and Assumptions
Assume that $n$ independent components, or machines, with identical failure time distributions are in operation. The system is working if at least one of the components is operating.
Each time group maintenance is performed a fixed cost of $c_0$ is incurred. The cost of either
replacing or repairing a broken component to as good as new is denoted by $c_r$. The cost of
either replacing or repairing a functioning component to as good as new is denoted by $c_s$.
The quantity $c_r - c_s$ is assumed to be positive and can be interpreted as the salvage value
for a used but functioning machine. Each failed machine results in a downtime cost of $c_d$
per unit time until the machine is repaired or replaced.
The times of group replacement/maintenance are renewal points for the system with
the renewal cycle being the time between successive group maintenance operations. Let $L$
and $C$, respectively, denote the random variables for the length of the cycle and the total
cost incurred during the cycle. The cost, $C$, incurred during the repair cycle can be written
as $C = c_0 + nc_s + (c_r - c_s)N + c_d D$, where $N$ is the total number of components that fail
during the cycle and $D$ is the total down time incurred during the cycle. The mean and
variance of $L$ will be denoted by $\mu$ and $\sigma^2$, respectively. Let $C(t)$ denote the total cost
incurred over all cycles between times 0 and $t$.
The goal of this paper is to develop explicit closed form expressions that do not require
integration for the expected cost per unit time $\lim_{t\to\infty} t^{-1}E[C(t)]$ and the asymptotic
variance per unit time $\lim_{t\to\infty} t^{-1}\mathrm{Var}[C(t)]$. It can be shown from renewal theory that the
following relationships hold:
\[ \lim_{t\to\infty} t^{-1}E[C(t)] = E[C]/\mu \]
\[ \lim_{t\to\infty} t^{-1}\mathrm{Var}[C(t)] = \{E[C]\}^2\sigma^2\mu^{-3} + \mu^{-1}\mathrm{Var}[C] + 2\{E[C]\}^2\mu^{-1} - 2\mu^{-2}E[CL]\,E[C] \tag{1} \]
(see Ross [14] and Smith [15]).
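The relationships in (1) translate directly into a few lines of arithmetic. The following is a minimal sketch, not code from the paper; the function name and argument names are hypothetical, and the five inputs are the per-cycle quantities $E[C]$, $\mathrm{Var}[C]$, $\mu$, $\sigma^2$ and $E[CL]$.

```python
def asymptotic_rates(mean_cost, var_cost, mean_len, var_len, mean_cost_len):
    """Return (expected cost per unit time, asymptotic variance per unit time).

    Arguments are the per-cycle quantities E[C], Var[C], mu, sigma^2 and E[CL]
    appearing in (1); the names are illustrative, not from the paper.
    """
    cost_rate = mean_cost / mean_len
    var_rate = (mean_cost ** 2 * var_len / mean_len ** 3
                + var_cost / mean_len
                + 2.0 * mean_cost ** 2 / mean_len
                - 2.0 * mean_cost_len * mean_cost / mean_len ** 2)
    return cost_rate, var_rate


# Sanity check with a deterministic cycle length T = 2 (so sigma^2 = 0 and
# E[CL] = T * E[C]); the variance rate should then reduce to Var[C] / T.
cost_rate, var_rate = asymptotic_rates(10.0, 4.0, 2.0, 0.0, 20.0)
```

When the cycle length is deterministic, as for T-age policies, the last two terms of (1) cancel and only $\mathrm{Var}[C]/T$ remains.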
Let $F(\cdot)$ and $f(\cdot)$ denote the cumulative distribution and density functions, respectively,
for the time to failure of a given machine. For $1 \le i \le m$, let $f_i(x)$ denote the density
function of $\tau_i$ (the time to the $i$th failure) and let $p(i, x)$ denote the probability that exactly
$i$ out of $n$ machines will have failed $x$ time units into the cycle, i.e.
\[ f_i(x) = \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\,[F(x)]^{i-1}\,[1-F(x)]^{n-i} \tag{2} \]
and
\[ p(i, x) = \binom{n}{i}[F(x)]^{i}\,[1-F(x)]^{n-i}. \tag{3} \]
Explicit results for the terms in (1) are derived in the Appendix. These results are
expressed in terms of $F(\cdot)$, $f(\cdot)$, $f_i(\cdot)$ and $p(\cdot,\cdot)$. Note that these results apply to any
failure time distribution whose first and second moments are finite.
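As an illustration of (2) and (3), the sketch below evaluates $p(i, x)$ and $f_i(x)$ for any lifetime distribution supplied through its values $F(x)$ and $f(x)$. The exponential lifetime, its rate, and the values of $n$ and $x$ are assumptions chosen only to have concrete numbers; the function names are hypothetical.

```python
import math


def p_failed(i, n, F_x):
    """Probability that exactly i of n machines have failed, equation (3)."""
    return math.comb(n, i) * F_x ** i * (1.0 - F_x) ** (n - i)


def f_ith(i, n, F_x, f_x):
    """Density of the time to the i-th failure, equation (2)."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return coeff * f_x * F_x ** (i - 1) * (1.0 - F_x) ** (n - i)


# Illustrative assumption: exponential lifetimes with rate 1.5, n = 3, x = 0.8.
rate, n, x = 1.5, 3, 0.8
F_x = 1.0 - math.exp(-rate * x)
f_x = rate * math.exp(-rate * x)
probs = [p_failed(i, n, F_x) for i in range(n + 1)]
first_failure_density = f_ith(1, n, F_x, f_x)
```

The probabilities $p(0, x), \ldots, p(n, x)$ sum to one, and the leading coefficient in (2) equals $i\binom{n}{i}$, which is the form used later when $f_i$ is expanded.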
For the remainder of the paper it will be assumed that the failure time is a phase
distributed random variable with representation $(\alpha, A)$, i.e.
\[ F(x) = 1 - \alpha e^{Ax} e, \tag{4} \]
where $e^t = (1, 1, \ldots, 1) \in R^r$ and $A$ is an $r \times r$ stable matrix with non-negative off diagonal
entries, non-positive row sums and negative diagonal entries. One interpretation for this distribution is
that it represents the time to absorption of a Markov process defined on the states labeled
$1, 2, \ldots, r+1$, where the states $1, \ldots, r$ are transient and the state $r+1$ is absorbing. (The
number $r$ is the dimension of the distribution representation.) The infinitesimal generator
for this process can be written as
\[ \begin{bmatrix} A & A^0 \\ 0 & 0 \end{bmatrix}, \]
where $Ae + A^0 = 0$, and the initial probability vector of the process is given by
$(\alpha, \alpha_{r+1})$, where $\alpha e + \alpha_{r+1} = 1$ ($\alpha_{r+1} = 0$ in our analysis).
The density function is given by
\[ f(x) = \alpha e^{Ax} A^0, \quad \text{for } x > 0, \tag{5} \]
(see Neuts [9], page 44 for details).
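Let $I$ denote the $r \times r$ identity matrix, let $I_k$ denote the $r^k \times r^k$ identity matrix and let
$e_k$ denote the $r^k$ column vector that consists entirely of 1's. Let $X_i$, $1 \le i \le n$, denote the
i.i.d. times to failure of the $n$ components. Then, $\min(X_1, X_2)$ has a phase distribution with
representation $(\alpha_2, A_2)$ where $\alpha_2 \equiv \alpha \otimes \alpha$, $A_2 \equiv A \otimes I + I \otimes A$ and $\otimes$ denotes the Kronecker
product (see Neuts [9] for details). Apply this recursively to see that $\min(X_1, \ldots, X_k)$ for
$k \in \{1, 2, \ldots, n\}$ has a phase distribution with representation $(\alpha_k, A_k)$ where
\[ \alpha_k \equiv \begin{cases} \alpha & \text{for } k = 1 \\ \alpha_{k-1} \otimes \alpha & \text{otherwise} \end{cases} \qquad A_k \equiv \begin{cases} A & \text{for } k = 1 \\ A_{k-1} \otimes I + I_{k-1} \otimes A & \text{otherwise.} \end{cases} \]
The function $1 - [1 - F(x)]^k$ is the distribution function of $\min(X_1, \ldots, X_k)$ which is a
phase distribution with representation $(\alpha_k, A_k)$. Consequently,
\[ 1 - [1 - F(x)]^k = 1 - \alpha_k e^{A_k x} e_k, \quad \text{for } k \ge 2. \tag{6} \]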
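The recursion for $(\alpha_k, A_k)$ and identity (6) can be checked numerically. The sketch below is an illustration in plain Python, not the authors' implementation: the two-phase representation is a made-up example, the helper names are hypothetical, and the matrix exponential is applied to a row vector by a Taylor series over short sub-steps rather than by a library routine.

```python
import math


def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def min_representation(alpha, A, k):
    """Representation (alpha_k, A_k) of the minimum of k i.i.d. phase lifetimes."""
    r = len(A)
    alpha_k, A_k = [list(alpha)], [row[:] for row in A]
    for _ in range(k - 1):
        size = len(A_k)
        # A_k = A_{k-1} (x) I + I_{k-1} (x) A and alpha_k = alpha_{k-1} (x) alpha.
        A_k = mat_add(kron(A_k, identity(r)), kron(identity(size), A))
        alpha_k = kron(alpha_k, [list(alpha)])
    return alpha_k[0], A_k


def row_times_expm(v, A, x, terms=25):
    """Row vector v * exp(A x), using a Taylor series over short sub-steps."""
    size = len(A)
    norm = max(sum(abs(a) for a in row) for row in A) * abs(x)
    steps = max(1, math.ceil(norm))
    h = x / steps
    for _ in range(steps):
        term, acc = list(v), list(v)
        for m in range(1, terms):
            term = [h / m * sum(term[i] * A[i][j] for i in range(size))
                    for j in range(size)]
            acc = [a + t for a, t in zip(acc, term)]
        v = acc
    return v


def survival(alpha, A, x):
    """alpha * exp(A x) * e, i.e. 1 - F(x) from (4)."""
    return sum(row_times_expm(alpha, A, x))


# Hypothetical two-phase representation, used only for illustration.
alpha = [0.6, 0.4]
A = [[-2.0, 1.0], [0.5, -1.5]]
alpha3, A3 = min_representation(alpha, A, 3)
lhs = survival(alpha3, A3, 0.7)        # alpha_3 exp(A_3 x) e_3
rhs = survival(alpha, A, 0.7) ** 3     # [1 - F(x)]^3
```

The dimension of $A_k$ is $r^k$, which is the exponential growth in matrix size mentioned in the Introduction.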
Expand $F(x)^{i-1} = [1 - (1 - F(x))]^{i-1}$ in (2) and (3) with the Binomial theorem and
use (4), (5) and (6) to see that $f_i(x)$ and $p(i, x)$ can be written as follows:
\[ f_i(x) = i\binom{n}{i}\,\alpha e^{Ax}A^0 \sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-1-k}\,\alpha_{n-k-1}e^{A_{n-k-1}x}e_{n-k-1} \tag{7} \]
\[ p(i, x) = \binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}x}e_{n-k}. \tag{8} \]
Example
In order to approximate a Weibull distribution with shape parameter equal to $c$ and
location parameter equal to $b$, Malhotra and Reibman [6] suggest solving the following two
equations for $r$ and $\lambda$:
\[ r\lambda^{-1} = b\,\Gamma\!\left(\frac{c+1}{c}\right), \qquad r(r+1)\lambda^{-2} = b^2\,\Gamma\!\left(\frac{c+2}{c}\right). \tag{9} \]
(If the above produces a nonintegral solution for $r$ then choose the closest integer to the
solution.) Then use an Erlang distribution with parameters $r$ and $\lambda$ to approximate the
given Weibull.
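The two equations in (9) can be solved in closed form, since dividing the second by the square of the first gives $(r+1)/r$. The sketch below is one way to carry this out; the function name is hypothetical, and the choice to recompute $\lambda$ from the first equation after rounding $r$ is an assumption (the text does not say which equation to keep once $r$ has been rounded).

```python
import math


def erlang_match_weibull(b, c):
    """Solve (9) for (r, lam), matching the first two moments of the Weibull."""
    m1 = b * math.gamma((c + 1.0) / c)          # right hand side of first equation
    m2 = b ** 2 * math.gamma((c + 2.0) / c)     # right hand side of second equation
    # r / lam = m1 and r (r + 1) / lam^2 = m2 imply (r + 1) / r = m2 / m1^2.
    r_exact = 1.0 / (m2 / m1 ** 2 - 1.0)
    r = max(1, round(r_exact))                  # closest integer to the solution
    lam = r / m1                                # assumption: keep the mean matched
    return r, lam, r_exact


# Weibull with c = 2 and b = 4.51 / 1.5.
r, lam, r_exact = erlang_match_weibull(4.51 / 1.5, 2.0)
```

For $c = 2$ the unrounded solution is about 3.66, so the closest integer is $r = 4$, and the recovered rate is close to 1.5.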
Let $\alpha = (1, 0, 0, 0)$ and
\[ A = \begin{bmatrix} -\lambda & \lambda & 0 & 0 \\ 0 & -\lambda & \lambda & 0 \\ 0 & 0 & -\lambda & \lambda \\ 0 & 0 & 0 & -\lambda \end{bmatrix}. \]
Then $X$ has an Erlang distribution with parameters $r = 4$ and $\lambda$ and distribution function
given by $F(x) = 1 - \alpha e^{Ax} e$.
From (9), this distribution can be used to approximate a Weibull distribution with
parameters $c = 2$ and $b = 4.51\lambda^{-1}$.
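As a check on this representation, the sketch below compares $1 - \alpha e^{Ax} e$ for the four-phase bidiagonal generator with the familiar closed form of the Erlang distribution function. It is an illustration only: the rate 1.5 and the evaluation point are assumed values, and the matrix exponential is applied by a simple Taylor series rather than a library routine.

```python
import math


def row_times_expm(v, A, x, terms=25):
    """Row vector v * exp(A x), using a Taylor series over short sub-steps."""
    size = len(A)
    norm = max(sum(abs(a) for a in row) for row in A) * abs(x)
    steps = max(1, math.ceil(norm))
    h = x / steps
    for _ in range(steps):
        term, acc = list(v), list(v)
        for m in range(1, terms):
            term = [h / m * sum(term[i] * A[i][j] for i in range(size))
                    for j in range(size)]
            acc = [a + t for a, t in zip(acc, term)]
        v = acc
    return v


lam, r = 1.5, 4                       # assumed rate; r = 4 phases
alpha = [1.0, 0.0, 0.0, 0.0]
# Bidiagonal generator: -lam on the diagonal, lam on the superdiagonal.
A = [[-lam if i == j else (lam if j == i + 1 else 0.0) for j in range(r)]
     for i in range(r)]


def phase_cdf(x):
    """F(x) = 1 - alpha exp(A x) e."""
    return 1.0 - sum(row_times_expm(alpha, A, x))


def erlang_cdf(x):
    """Closed form Erlang(r, lam) distribution function."""
    return 1.0 - math.exp(-lam * x) * sum((lam * x) ** j / math.factorial(j)
                                          for j in range(r))


difference = abs(phase_cdf(2.3) - erlang_cdf(2.3))
```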
3 Preliminary results
In order to simplify the algebraic exposition, a number of results and definitions that will
be needed in the rest of the paper are collected in this section. A number of identities
involving $F(\cdot)$ and $f(\cdot)$ will be required and are listed below:
\[ \int_0^T [1 - F(t)]^k dt = \alpha_k\left[e^{A_k T} - I_k\right]A_k^{-1}e_k, \quad \text{for } k = 1, \ldots, n \tag{10} \]
\[ \int_0^\infty [1 - F(t)]^k dt = -\alpha_k A_k^{-1}e_k, \quad \text{for } k = 1, \ldots, n \tag{11} \]
\[ \int_0^x y f(y)\,dy = \alpha A^{-1}e^{Ax}e - x\,\alpha e^{Ax}e - \alpha A^{-1}e, \quad \text{for } x > 0 \tag{12} \]
\[ \int_0^x F(t)\,dt = x - \alpha A^{-1}e^{Ax}e + \alpha A^{-1}e, \quad \text{for } x > 0 \tag{13} \]
\[ \int_0^T (T - t)^2 f(t)\,dt = 2T\alpha A^{-1}e + 2\alpha A^{-2}e - 2\alpha A^{-2}e^{AT}e + T^2, \tag{14} \]
(see Appendix). For $i \le j \le n$, define $S_1(j, T)$, $S_2(j, T)$, $S_3(j, T)$ as follows:
\[ S_1(j, T) \equiv \int_0^T x^2 f(x)\,[1 - F(x)]^j dx \tag{15} \]
\[ S_2(j, T) \equiv \int_0^T x f(x)\left[\alpha A^{-1}e^{Ax}e\right][1 - F(x)]^j dx \tag{16} \]
\[ S_3(j, T) \equiv \int_0^T x f(x)\,[1 - F(x)]^j dx. \tag{17} \]
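Then it is shown in the Appendix that the following identities hold:
\[ S_1(j, T) = \left\{ T^2\alpha_{j+1}e^{A_{j+1}T}A_{j+1}^{-1} - 2T\alpha_{j+1}e^{A_{j+1}T}A_{j+1}^{-2} + 2\alpha_{j+1}\left[e^{A_{j+1}T} - I_{j+1}\right]A_{j+1}^{-3} \right\}\left(e_j \otimes A^0\right) \tag{18} \]
\[ S_2(j, T) = \left\{ T\left(\alpha A^{-1} \otimes \alpha_{j+1}\right)e^{A_{j+2}T}A_{j+2}^{-1} - \left(\alpha A^{-1} \otimes \alpha_{j+1}\right)\left[e^{A_{j+2}T} - I_{j+2}\right]A_{j+2}^{-2} \right\}\left(e_{j+1} \otimes A^0\right) \tag{19} \]
\[ S_3(j, T) = \left\{ T\alpha_{j+1}e^{A_{j+1}T}A_{j+1}^{-1} - \alpha_{j+1}\left[e^{A_{j+1}T} - I_{j+1}\right]A_{j+1}^{-2} \right\}\left(e_j \otimes A^0\right). \tag{20} \]
On letting $T$ go to infinity in (18) to (20) and noting that for any substochastic matrix, $A$,
$\lim_{x\to\infty} e^{Ax} = 0$, the following can be obtained:
\[ S_1(j, \infty) = -2\alpha_{j+1}A_{j+1}^{-3}\left(e_j \otimes A^0\right) \tag{21} \]
\[ S_2(j, \infty) = \left(\alpha A^{-1} \otimes \alpha_{j+1}\right)A_{j+2}^{-2}\left(e_{j+1} \otimes A^0\right) \tag{22} \]
\[ S_3(j, \infty) = \alpha_{j+1}A_{j+1}^{-2}\left(e_j \otimes A^0\right). \tag{23} \]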
Now some identities involving $F(\cdot)$, $f(\cdot)$ and $f_i(\cdot)$ are needed. For $1 \le i \le n$ define
$U_1(i, T)$ and $U_2(i, T)$ by
\[ U_1(i, T) = \int_0^T x f_i(x) \left\{\int_0^x y f(y)\,dy\right\} \{F(x)\}^{-1} dx \tag{24} \]
and
\[ U_2(i, T) = \int_0^T x^2 f_i(x)\,dx, \tag{25} \]
respectively, and $U_3(i, T)$ and $U_4(i, T)$ by
\[ U_3(i, T) = \int_0^T x f_i(x) K_i(x) \left\{\int_0^x y f(y)\,dy\right\} \{F(x)\}^{-1} dx \tag{26} \]
and
\[ U_4(i, T) = \int_0^T x^2 f_i(x) K_i(x)\,dx, \tag{27} \]
respectively, where
\[ K_i(x) \equiv [1 - F(x)]^{-(n-i)} \sum_{j=m-i}^{n-i} \binom{n-i}{j}[F(T) - F(x)]^j\,[1 - F(T)]^{n-i-j}. \tag{28} \]
It is shown in the Appendix that $U_1(i, T)$, $U_2(i, T)$, $U_3(i, T)$ and $U_4(i, T)$ can be written in
terms of $S_1(\cdot, T)$, $S_2(\cdot, T)$ and $S_3(\cdot, T)$:
\[ \begin{aligned} U_1(i, T) = i\binom{n}{i}\Bigg\{ & \sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-k-1}S_1(n-k-1, T) \\ & - \sum_{k=0}^{i-2}\binom{i-2}{k}(-1)^{i-k-2}\Big[S_1(n-k-2, T) - S_2(n-k-2, T) + \alpha A^{-1}e\,S_3(n-k-2, T)\Big]\Bigg\} \end{aligned} \tag{29} \]
\[ U_2(i, T) = i\binom{n}{i}\sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-1-k}S_1(n-k-1, T) \tag{30} \]
\[ \begin{aligned} U_3(i, T) = i\binom{n}{i}\sum_{k=0}^{i-2}\sum_{j=m-i}^{n-i}\sum_{l=0}^{j} & \binom{i-2}{k}\binom{n-i}{j}\binom{j}{l}(-1)^{i-2-k+l}\,\alpha_{n+l-i-j}e^{A_{n+l-i-j}T}e_{n+l-i-j} \\ & \times\Big\{S_2(i+j-k-l-2, T) - \alpha A^{-1}e\,S_3(i+j-k-l-2, T) - S_1(i+j-k-l-1, T)\Big\} \end{aligned} \tag{31} \]
\[ U_4(i, T) = i\binom{n}{i}\sum_{k=0}^{i-1}\sum_{j=m-i}^{n-i}\sum_{l=0}^{j}\binom{i-1}{k}\binom{n-i}{j}\binom{j}{l}(-1)^{i-1-k+l}\,\alpha_{n+l-i-j}e^{A_{n+l-i-j}T}e_{n+l-i-j}\,S_1(i+j-k-l-1, T). \tag{32} \]
4 Expected cost and Variance per unit time for T-age replacement policies
A T-age replacement policy calls for replacement every $T$ units of time. The expected cost
per unit time equals
\[ T^{-1}\left\{ c_0 + nc_s + (c_r - c_s)\,nF(T) + nc_d\int_0^T F(t)\,dt \right\} \]
(see Okumoto and Elsayed [11]). Using (13) in the above expression, the expected cost per
unit time associated with a T-age replacement policy can be seen to equal
\[ T^{-1}\left\{ c_0 + nc_r + n(c_s - c_r)\,\alpha e^{AT}e + nc_d\left[T + \alpha A^{-1}e - \alpha A^{-1}e^{AT}e\right] \right\}. \tag{33} \]
The asymptotic variance per unit time can be written as
\[ T^{-1}\left\{ n(c_r - c_s)^2F(T)[1 - F(T)] + 2nc_d(c_r - c_s)[1 - F(T)]\int_0^T F(t)\,dt + nc_d^2\int_0^T (T - t)^2 f(t)\,dt - nc_d^2\left[\int_0^T F(t)\,dt\right]^2 \right\} \tag{34} \]
(see the Appendix). Use (4), (13) and (14) in the above to obtain a result not involving
integration.
For T-age policies, the expressions for expected cost and variance per unit time only
involve matrices of dimension r.
Example continued
Suppose three components are operating in parallel. Let the cost parameters $c_0$, $c_s$, $c_r$
and $c_d$ equal 70, 10, 50 and 30, respectively. Assume that $\lambda$ is equal to 1.5. Then the expected
cost per unit time of a T-age policy equals
\[ e^{-1.5T}\left[-33.75T^2 + 120T^{-1} + 90\right] - 20T^{-1} + 90, \]
while the variance per unit time equals
\[ e^{-1.5T}\left[2025T^3 - 4050T^2 - 8100T - 7200\right] + e^{-3T}\left[-379.69T^5 + 2025T^3 + 2700T^2 - 2700T - 4800T^{-1} - 7200\right] + 4800T^{-1}. \]
Figure 1 contains plots of the expected cost and variance per unit time as a function of T .
The 2.3-age replacement policy has an expected cost of 80.15, minimizes the expected cost
per unit time and has an associated variance of 1367. Because calculation of expected costs
and variances is now a computationally easy matter, the decision maker can also consider
other approaches. For instance, the decision maker might decide that the 2% increase
in expected cost in going from a 2.3-age to 1.8-age replacement policy is worth the 19%
decrease in variance.
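The figures quoted in this example can be reproduced from the general T-age expressions preceding (33) and in (34). The sketch below is an illustration, not the authors' code: it uses the standard closed forms for the partial moments of an Erlang lifetime in place of the matrix expressions, and the function names are hypothetical.

```python
import math


def erlang_cdf(t, k, lam):
    """Distribution function of an Erlang(k, lam) lifetime at t."""
    return 1.0 - math.exp(-lam * t) * sum((lam * t) ** j / math.factorial(j)
                                          for j in range(k))


def t_age_rates(T, n=3, c0=70.0, cs=10.0, cr=50.0, cd=30.0, r=4, lam=1.5):
    """Expected cost and asymptotic variance per unit time of a T-age policy."""
    F = erlang_cdf(T, r, lam)
    # Partial moments of the Erlang lifetime X on [0, T].
    m1 = r / lam * erlang_cdf(T, r + 1, lam)                  # E[X; X <= T]
    m2 = r * (r + 1) / lam ** 2 * erlang_cdf(T, r + 2, lam)   # E[X^2; X <= T]
    int_F = T * F - m1                          # integral of F(t) over [0, T]
    int_sq = T * T * F - 2.0 * T * m1 + m2      # integral of (T - t)^2 f(t)
    cost = (c0 + n * cs + (cr - cs) * n * F + n * cd * int_F) / T
    var = (n * (cr - cs) ** 2 * F * (1.0 - F)
           + 2.0 * n * cd * (cr - cs) * (1.0 - F) * int_F
           + n * cd ** 2 * int_sq
           - n * cd ** 2 * int_F ** 2) / T
    return cost, var


cost_23, var_23 = t_age_rates(2.3)
cost_18, var_18 = t_age_rates(1.8)
```

With these parameters the 2.3-age policy gives a cost of about 80.15 and a variance of about 1368, while the 1.8-age policy costs roughly 2% more with a variance roughly 19% smaller, in line with the comparison made in the text.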
insert Figure 1 about here
5 m-failure policies
In this section, expressions will be provided that enable computation of the expected cost
per unit time and asymptotic variance associated with any given m-failure policy.
5.1 Expected cost per unit time
Use (3), (48), (53) and (11) to see that the expected length of the cycle, $\mu$, and the expected
downtime incurred during the cycle, $E[D]$, can be written as follows:
\[ \mu = \sum_{i=0}^{m-1}\binom{n}{i}\sum_{k=0}^{i}(-1)^{i-k+1}\binom{i}{k}\,\alpha_{n-k}A_{n-k}^{-1}e_{n-k} \tag{35} \]
\[ E[D] = \sum_{i=1}^{m-1} i\binom{n}{i}\sum_{k=0}^{i}(-1)^{i-k+1}\binom{i}{k}\,\alpha_{n-k}A_{n-k}^{-1}e_{n-k}. \tag{36} \]
The expected cost of a cycle is given by the following:
\[ E[C] = c_0 + mc_r + (n - m)c_s + c_d E[D]. \tag{37} \]
The mean asymptotic cost $E[C]/\mu$ is obtained from (35), (36) and (37).
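To show how (35) to (37) are evaluated with matrix operations only, the sketch below builds the Kronecker representations $(\alpha_k, A_k)$ and solves a linear system for each $\alpha_k A_k^{-1} e_k$. It is a plain Python illustration under stated assumptions: the helper names are hypothetical, Gaussian elimination stands in for whatever linear algebra routine one prefers, and the parameter values are those of the running Erlang example ($n = 3$, $r = 4$, $\lambda = 1.5$).

```python
import math


def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def min_representation(alpha, A, k):
    """Representation (alpha_k, A_k) of the minimum of k i.i.d. phase lifetimes."""
    r = len(A)
    alpha_k, A_k = [list(alpha)], [row[:] for row in A]
    for _ in range(k - 1):
        left = kron(A_k, identity(r))
        right = kron(identity(len(A_k)), A)
        A_k = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(left, right)]
        alpha_k = kron(alpha_k, [list(alpha)])
    return alpha_k[0], A_k


def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    size = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda rr: abs(aug[rr][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for rr in range(col + 1, size):
            factor = aug[rr][col] / aug[col][col]
            if factor != 0.0:
                for cc in range(col, size + 1):
                    aug[rr][cc] -= factor * aug[col][cc]
    y = [0.0] * size
    for rr in range(size - 1, -1, -1):
        tail = sum(aug[rr][cc] * y[cc] for cc in range(rr + 1, size))
        y[rr] = (aug[rr][size] - tail) / aug[rr][rr]
    return y


def mean_min(alpha, A, k):
    """-alpha_k A_k^{-1} e_k, the integral in (11)."""
    alpha_k, A_k = min_representation(alpha, A, k)
    y = solve(A_k, [1.0] * len(A_k))
    return -sum(a * v for a, v in zip(alpha_k, y))


def m_failure_cost(m, n, alpha, A, c0, cs, cr, cd):
    """Expected cost per unit time of an m-failure policy via (35)-(37)."""
    h = {k: mean_min(alpha, A, k) for k in range(n - m + 1, n + 1)}

    def integral_p(i):  # integral of p(i, t) over [0, infinity), from (8), (11)
        return math.comb(n, i) * sum(math.comb(i, k) * (-1) ** (i - k) * h[n - k]
                                     for k in range(i + 1))

    mu = sum(integral_p(i) for i in range(m))                  # (35)
    exp_down = sum(i * integral_p(i) for i in range(1, m))     # (36)
    return (c0 + m * cr + (n - m) * cs + cd * exp_down) / mu   # (37) over mu


lam = 1.5
alpha = [1.0, 0.0, 0.0, 0.0]
A = [[-lam if i == j else (lam if j == i + 1 else 0.0) for j in range(4)]
     for i in range(4)]
costs = {m: m_failure_cost(m, 3, alpha, A, 70.0, 10.0, 50.0, 30.0)
         for m in (1, 2, 3)}
```

For these inputs the 2-failure policy comes out cheapest, at about 81.45 per unit time, and the largest matrix involved is $A_3$ of dimension $4^3 = 64$.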
5.2 The asymptotic variance associated with m-failure policies
Note that, for m-failure policies, $\mathrm{Var}[C] = c_d^2\left\{E[D^2] - E[D]^2\right\}$. Thus, from (1), the asymptotic variance per unit time associated with an m-failure policy can be calculated once expressions for $\sigma^2$, $E[CL]$, $E[D^2]$, $\mu$, $E[D]$ and $E[C]$ are available. Identities for $\mu$, $E[D]$ and
$E[C]$ have been provided in (35), (36) and (37). Expressions for $\sigma^2$, $E[CL]$ and $E[D^2]$ will
now be provided. Use (25) and (49) to obtain the following:
\[ \sigma^2 = E\left[L^2\right] - \mu^2 = U_2(m, \infty) - \mu^2. \tag{38} \]
For $m = 1$,
\[ E[D] = E[D^2] = 0, \qquad E[CL] = [c_0 + mc_r + (n - m)c_s]\,\mu, \qquad \mathrm{Var}[C] = 0. \tag{39} \]
In what follows the more difficult case where $m \ge 2$ is considered. The expression $E[CL]$
can be written as follows:
\[ E[CL] = [c_0 + mc_r + (n - m)c_s]\,\mu + (m - 1)c_d\int_0^\infty \left\{\int_0^x F(t)\,dt\right\}[F(x)]^{-1}\,x f_m(x)\,dx \]
(see Appendix, equation (52)). Use (2) and (13) to see that the integral on the right hand
side can be written as
\[ \begin{aligned} & \int_0^\infty x\left\{x - \alpha A^{-1}e^{Ax}e + \alpha A^{-1}e\right\} m\binom{n}{m} f(x)F(x)^{m-2}[1 - F(x)]^{n-m}\,dx \\ & \quad = \int_0^\infty x\left\{x - \alpha A^{-1}e^{Ax}e + \alpha A^{-1}e\right\} m\binom{n}{m} f(x)\sum_{k=0}^{m-2}\binom{m-2}{k}(-1)^{m-2-k}[1 - F(x)]^{n-2-k}\,dx. \end{aligned} \]
Now use (21), (22) and (23) to obtain the result
\[ \begin{aligned} E[CL] = {} & [c_0 + mc_r + (n - m)c_s]\,\mu \\ & + c_d\,m(m - 1)\binom{n}{m}\sum_{i=0}^{m-2}\binom{m-2}{i}(-1)^{m-2-i}\Big[S_1(n-2-i, \infty) - S_2(n-2-i, \infty) + \alpha A^{-1}e\,S_3(n-i-2, \infty)\Big]. \end{aligned} \tag{40} \]
From (67) in the Appendix, the following can be obtained
\[ \begin{aligned} E[D^2] = {} & \int_0^\infty x^2\left\{\sum_{i=1}^{m-1} f_i(x) + (m - 1)^2 f_m(x)\right\}dx \\ & + 2\int_0^\infty x\left\{\sum_{i=2}^{m-1}(i - 1)f_i(x) - (m - 1)^2 f_m(x)\right\}[F(x)]^{-1}\int_0^x y f(y)\,dy\,dx. \end{aligned} \tag{41} \]
Recall definitions (24) and (25) and apply (29) and (30) in (41) to obtain the result:
\[ E[D^2] = \sum_{i=1}^{m-1}U_2(i, \infty) + (m - 1)^2U_2(m, \infty) + 2\sum_{i=2}^{m-1}(i - 1)\,U_1(i, \infty) - 2(m - 1)^2U_1(m, \infty). \tag{42} \]
Example continued
Suppose that $n = 3$ and the failure distribution is phase type with representation $\alpha = (1, 0, 0, 0)$ and
\[ A = \begin{bmatrix} -\lambda & \lambda & 0 & 0 \\ 0 & -\lambda & \lambda & 0 \\ 0 & 0 & -\lambda & \lambda \\ 0 & 0 & 0 & -\lambda \end{bmatrix}. \]
Then, $\mu$, $E[D]$, $\sigma^2$, $E[CL]$ and $E[D^2]$ can be calculated from (35), (36), (38), (40) and
(42), respectively (for $m = 1$ apply (39)). For $n = 3$, $c_0 = 70$, $c_s = 10$, $c_r = 50$ and $c_d = 30$
and $\lambda = 1.5$, Table 1 contains the expected cost and variance per unit time for 1, 2 and
3-failure policies. The 2-failure policy has the smallest expected cost per unit time (81.45)
and its variance equals 1368.
insert Table 1 about here
6 (m, T)-policies
In this section assume that an (m, T)-policy (i.e. replace at the time of the mth failure or
time T, whichever occurs first) is being followed. First an expression for the expected cost
per unit time will be provided. Then explicit results will be provided for each of the terms
in (1), which is the expression for the asymptotic variance per unit time.
6.1 Expected cost per unit time
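From (3), (8) and (54):
\[ E[N] = \sum_{i=0}^{m-1} i\binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k} + m\sum_{i=m}^{n}\binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k}, \tag{43} \]
where the term with $n - k = 0$ is read as $[1 - F(T)]^0 = 1$.
From (8), (10) and (48) the expected cycle length can be written as
\[ \mu = \sum_{i=0}^{m-1}\binom{n}{i}\sum_{k=0}^{i}(-1)^{i-k}\binom{i}{k}\,\alpha_{n-k}A_{n-k}^{-1}\left[e^{A_{n-k}T} - I_{n-k}\right]e_{n-k}, \tag{44} \]
while the expected downtime is given by
\[ E[D] = \sum_{j=1}^{m-1} j\binom{n}{j}\sum_{k=0}^{j}\binom{j}{k}(-1)^{j-k}\,\alpha_{n-k}A_{n-k}^{-1}\left[e^{A_{n-k}T} - I_{n-k}\right]e_{n-k}. \tag{45} \]
The expected cost for a cycle is given by:
\[ E[C] = c_0 + nc_s + (c_r - c_s)E[N] + c_dE[D]. \tag{46} \]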
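As a cross-check on the expected cost of an (m, T) policy, the sketch below evaluates the general Appendix expressions (48), (53) and (54) by numerical quadrature and combines them as in (46). This deliberately swaps Simpson's rule in for the integration-free matrix forms (43) to (45), so it is a verification aid under stated assumptions rather than the method of the paper; the function names are hypothetical and the parameters are those of the running Erlang example.

```python
import math


def erlang_cdf(t, r=4, lam=1.5):
    """Distribution function of the Erlang(r, lam) lifetime of the example."""
    return 1.0 - math.exp(-lam * t) * sum((lam * t) ** j / math.factorial(j)
                                          for j in range(r))


def p_failed(i, n, t):
    """p(i, t) from (3): exactly i of n machines have failed by time t."""
    F = erlang_cdf(t)
    return math.comb(n, i) * F ** i * (1.0 - F) ** (n - i)


def simpson(g, a, b, panels=400):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = g(a) + g(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3.0


def mT_cost(m, T, n=3, c0=70.0, cs=10.0, cr=50.0, cd=30.0):
    """Expected cost per unit time of an (m, T) policy."""
    # Expected cycle length, equation (48).
    mu = simpson(lambda t: sum(p_failed(i, n, t) for i in range(m)), 0.0, T)
    # Expected downtime in a cycle, equation (53).
    exp_down = simpson(lambda t: sum(j * p_failed(j, n, t) for j in range(1, m)),
                       0.0, T)
    # Expected number of failures in a cycle, equation (54).
    exp_fail = (sum(i * p_failed(i, n, T) for i in range(1, m))
                + m * sum(p_failed(i, n, T) for i in range(m, n + 1)))
    # Expected cycle cost (46) divided by the expected cycle length.
    return (c0 + n * cs + (cr - cs) * exp_fail + cd * exp_down) / mu


cost_2_25 = mT_cost(2, 2.5)
```

For $m = 2$ and $T = 2.5$ this gives a cost of about 79.97 per unit time.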
6.2 The asymptotic variance associated with (m, T) policies
In order to calculate (1), it is necessary to calculate $\mu$, $E[D]$, $E[C]$, $\sigma^2$, $E[CL]$ and $\mathrm{Var}[C]$.
The terms $\mu$, $E[D]$ and $E[C]$ are provided in (44), (45) and (46), respectively. An expression
for $\sigma^2 = E[L^2] - \mu^2$ follows from (44) and the identity:
\[ \begin{aligned} E\left[L^2\right] & = \int_0^T t^2 f_m(t)\,dt + T^2\sum_{i=0}^{m-1}p(i, T) \\ & = U_2(m, T) + T^2\sum_{i=0}^{m-1}\binom{n}{i}\sum_{k=0}^{i}(-1)^{i-k}\binom{i}{k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k}, \end{aligned} \]
where the last equality follows from (2), (8) and (25).
For $m = 1$, $E[D] = E[D^2] = 0$,
\[ \begin{aligned} E[CL] & = \int_0^T (c_r - c_s)\,x f_1(x)\,dx + (c_0 + nc_s)\,\mu \\ & = \int_0^T (c_r - c_s)\,x\binom{n}{1}f(x)[1 - F(x)]^{n-1}dx + (c_0 + nc_s)\,\mu \\ & = (c_r - c_s)\binom{n}{1}S_3(n - 1, T) + (c_0 + nc_s)\,\mu, \end{aligned} \]
and $\mathrm{Var}[C] = (c_r - c_s)^2\left\{E\left[N^2\right] - E[N]^2\right\}$.
In what follows the more difficult case where $m \ge 2$ is considered. Use (2), (3) and (13), and
apply (18), (19) and (20) in (52) from the Appendix to get
\[ \begin{aligned} E[CL] = {} & m^2(c_r - c_s)\binom{n}{m}\sum_{k=0}^{m-1}\binom{m-1}{k}(-1)^{m-1-k}S_3(n-1-k, T) \\ & + c_d\,m(m - 1)\binom{n}{m}\sum_{k=0}^{m-2}\binom{m-2}{k}(-1)^{m-2-k}\Big\{S_1(n-2-k, T) - S_2(n-2-k, T) + \alpha A^{-1}e\,S_3(n-2-k, T)\Big\} \\ & + T(c_r - c_s)\sum_{i=1}^{m-1} i\binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k} \\ & + Tc_d\sum_{i=1}^{m-1} i\binom{n}{i}\sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-k-1}\,\alpha_{n-k-1}e^{A_{n-k-1}T}e_{n-k-1}\left\{T - \alpha A^{-1}e^{AT}e + \alpha A^{-1}e\right\} \\ & + (c_0 + nc_s)\,\mu. \end{aligned} \]
Calculation of $\mathrm{Var}[C]$ is somewhat more complicated than the other calculations. First
note that
\[ \mathrm{Var}[C] = (c_r - c_s)^2\left\{E\left[N^2\right] - E[N]^2\right\} + 2c_d(c_r - c_s)\left\{E[DN] - E[D]E[N]\right\} + c_d^2\left\{E\left[D^2\right] - E[D]^2\right\}. \]
Expressions for $E[N]$ and $E[D]$ are provided by (43) and (45), respectively. Condition on
the number of failures at time $T$ and use (8) to obtain
\[ \begin{aligned} E[N^2] & = \sum_{i=0}^{m-1} i^2p(i, T) + m^2\sum_{i=m}^{n}p(i, T) \\ & = \sum_{i=0}^{m-1} i^2\binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k} + m^2\sum_{i=m}^{n}\binom{n}{i}\sum_{k=0}^{i}\binom{i}{k}(-1)^{i-k}\,\alpha_{n-k}e^{A_{n-k}T}e_{n-k}. \end{aligned} \]
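Use expression (61) from the Appendix to obtain
\[ \begin{aligned} E[DN] = {} & \left\{\int_0^T F(t)\,dt\right\}\sum_{i=0}^{m-1} i^2\binom{n}{i}\sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-k-1}[1 - F(T)]^{n-k-1} \\ & + \sum_{i=m}^{n}\sum_{j=1}^{m-1} mj\binom{n}{i}\binom{i}{j}\sum_{k=0}^{j}\sum_{l=0}^{i-j}\binom{j}{k}\binom{i-j}{l}(-1)^{k+l}\left\{\int_0^T [1 - F(t)]^{i-j+k-l}dt\right\}[1 - F(T)]^{n-i+l}. \end{aligned} \]
Apply (8), (10) and (13) to get
\[ \begin{aligned} E[DN] = {} & \left\{T - \alpha A^{-1}e^{AT}e + \alpha A^{-1}e\right\}\sum_{i=0}^{m-1} i^2\binom{n}{i}\sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^{i-1-k}\,\alpha_{n-k-1}e^{A_{n-k-1}T}e_{n-k-1} \\ & + \sum_{i=m}^{n}\sum_{j=1}^{m-1} mj\binom{n}{i}\binom{i}{j}\sum_{k=0}^{j}\sum_{l=0}^{i-j}\binom{j}{k}\binom{i-j}{l}(-1)^{k+l}\,\alpha_{i-j+k-l}\left[e^{A_{i-j+k-l}T} - I_{i-j+k-l}\right]A_{i-j+k-l}^{-1}e_{i-j+k-l}\;\alpha_{n-i+l}e^{A_{n-i+l}T}e_{n-i+l}, \end{aligned} \]
where a factor whose subscript equals zero is read as the corresponding zeroth power, i.e. $\int_0^T dt = T$ and $[1 - F(T)]^0 = 1$.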
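Thus only the term $E[D^2]$ remains to be calculated. Note that
\[ E[D^2] = E\left[I_{\{\tau_m \le T\}}D^2\right] + E\left[I_{\{\tau_m > T\}}D^2\right]. \tag{47} \]
Thus $E[D^2]$ can be computed by computing the two terms on the right hand side of (47).
Use (13) and (14) in (62) from the Appendix to obtain:
\[ \begin{aligned} E\left[I_{\{\tau_m > T\}}D^2\right] = {} & \sum_{j=1}^{m-1} j\binom{n}{j}\sum_{k=0}^{j-1}(-1)^{j-1-k}\binom{j-1}{k}\,\alpha_{n-k-1}e^{A_{n-k-1}T}e_{n-k-1}\left[T^2 + 2T\alpha A^{-1}e + 2\alpha A^{-2}e - 2\alpha A^{-2}e^{AT}e\right] \\ & + \sum_{j=1}^{m-1} j(j - 1)\binom{n}{j}\sum_{k=0}^{j-2}(-1)^{j-2-k}\binom{j-2}{k}\,\alpha_{n-k-2}e^{A_{n-k-2}T}e_{n-k-2}\left[T - \alpha A^{-1}e^{AT}e + \alpha A^{-1}e\right]^2. \end{aligned} \]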
From (67) in the Appendix: apply (29), (30), (31) and (32) to get
\[ E\left[I_{\{\tau_m \le T\}}D^2\right] = 2\sum_{i=2}^{m-1}(i - 1)\,U_3(i, T) - 2(m - 1)^2U_1(m, T) + \sum_{i=1}^{m-1}U_4(i, T) + (m - 1)^2U_2(m, T). \]
The above expressions are algebraically complex. However, for a given phase distribution,
all reduce to tractable closed form expressions.
Example continued
For $n = 3$, $\lambda = 1.5$ and $c_0 = 70$, $c_s = 10$, $c_r = 50$, $c_d = 30$, Figures 2 and 3, respectively,
contain the expected costs and variance per unit time for (1, T), (2, T) and (3, T)
policies as a function of T. The (2, 2.5) policy has the smallest expected cost per unit time
(79.97) and its associated variance is 1479.
insert Figure 2 about here
insert Figure 3 about here
7 Conclusion
Group maintenance policies form an important part of the reliability literature. However
the analyst has often been restricted to a very narrow (and often inappropriate) range of
distributions. Also, given the computational complexity of the problems, sensitivity analyses where failure time and cost parameters can be varied have been problematic. By
allowing the analyst to choose an arbitrary phase distribution, the applicability of group
maintenance approaches is greatly increased. A contribution of this paper has been to
provide explicit closed form results for the major policy classes when the failure time has
a phase distribution. These results, which in general appear quite algebraically daunting,
are computationally relatively easy for any given problem. This demonstrates once again
that, as predicted by Neuts [9], use of phase distributions can be of great practical utility.
Sensitivity analyses are now very easy to conduct. Unlike most of the literature, the variability associated with group maintenance policies has been explicitly modeled. (The results
provided for calculating the asymptotic variability per unit time apply to general failure
time distributions as long as the first two moments are finite). The closed form results
for the asymptotic variance allow the analyst to consider criteria other than simply that
of minimizing expected cost per unit time. Indeed, in many applied situations, an analyst
might be willing to tolerate an increased expected cost in order to reduce variability. In
any case, even if choosing the policy that minimizes expected cost is the analyst's objective,
knowledge of the associated variability provides important managerial information.
Appendix
This Appendix is divided into two parts. In §A expressions for calculating (1) are
derived. §B contains the phase results listed in §3.
A Calculating Asymptotic Cost and Variance Per Unit Time
Explicit expressions for the terms in (1) required to compute the asymptotic expected cost
and variance per unit time associated with (m, T) policies will now be derived. (Similar
expressions for m-failure policies can be obtained by letting $T \to \infty$ in the appropriate places.)
The results of this section are general and apply to any continuous failure distribution with
finite first and second moments. (A summary of these results together with some examples
can be found in Wilson [16]).
A.1 Calculating $\mu$, $\sigma^2$, $E[CL]$ and $E[C]$ for (m, T) policies
Expressions for $\mu$, $\sigma^2$, $E[CL]$ and $E[C]$ are provided by (48), (49), (52), and (55), respectively.
Expressions for $\mu$ and $\sigma^2$ follow by noting that
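\[ \mu = \int_0^T P[\min(T, \tau_m) > t]\,dt = \int_0^T\sum_{i=0}^{m-1}p(i, t)\,dt \tag{48} \]
and
\[ \begin{aligned} \sigma^2 & = E\left[\{\min(T, \tau_m)\}^2\right] - \mu^2 \\ & = \int_0^T t^2 f_m(t)\,dt + T^2\sum_{i=0}^{m-1}p(i, T) - \left\{\int_0^T\sum_{i=0}^{m-1}p(i, t)\,dt\right\}^2. \end{aligned} \tag{49} \]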
In order to calculate $E[CL]$ it is first necessary to find expressions for $E[C \mid L = T]$ and
$E[C \mid L = x]$, where $x < T$. Suppose that replacement occurs at $x < T$, i.e. replacement
occurs at the $m$th failure. The downtime incurred over the cycle is the sum of the downtimes
for each of the first $m - 1$ failures. For any $y > 0$, the expected downtime for an individual
machine given that it has failed before time $y$ and is replaced at time $y$ is given by
\[ y - E[X \mid X < y] = y - \int_0^y P[X > t \mid X < y]\,dt = [F(y)]^{-1}\int_0^y F(t)\,dt. \]
Thus, for $x < T$
\[ E[C \mid L = x] = c_0 + mc_r + (n - m)c_s + c_d(m - 1)[F(x)]^{-1}\int_0^x F(t)\,dt, \tag{50} \]
where the first three terms on the right hand side represent the fixed and unit costs of
replacing the system with a new one, while the last term is the expected downtime cost.
Now suppose that the conditioning information is that the cycle ends at time $T$. Let $N_f(t)$
denote the number of machines that have failed by time $t$. Then
\[ E[C \mid L = T] = c_0 + nc_s + (c_r - c_s)E[N_f(T) \mid \tau_m \ge T] + c_d\sum_{i=1}^{m-1}P[N_f(T) = i \mid \tau_m \ge T]\;i\,[F(T)]^{-1}\int_0^T F(t)\,dt, \tag{51} \]
where the first three terms represent the fixed and unit costs of replacing the system with
a new one, while the last term is the expected downtime cost. Use (50) and (51) and the
expression for $\mu$ given by (48) to obtain the following:
\[ \begin{aligned} E[CL] & = \int x\,E[C \mid L = x]\,dG(x) \\ & = \int_0^T x\,E[C \mid L = x]\,f_m(x)\,dx + T\,E[C \mid L = T]\,P[\tau_m \ge T] \\ & = \int_0^T\left\{m(c_r - c_s) + c_d(m - 1)[F(x)]^{-1}\int_0^x F(t)\,dt\right\}x f_m(x)\,dx \\ & \quad + T\sum_{i=1}^{m-1}\left\{i(c_r - c_s) + ic_d[F(T)]^{-1}\int_0^T F(t)\,dt\right\}p(i, T) + (c_0 + nc_s)\int_0^T\sum_{i=0}^{m-1}p(i, t)\,dt. \end{aligned} \tag{52} \]
Let $N$ and $D$, respectively, denote the random variables corresponding to the number of
failures and the downtime accumulated during a cycle. On noting that
$E[\min(T, \tau_j)] = \int_0^T P[\tau_j > t]\,dt = \int_0^T\sum_{i=0}^{j-1}p(i, t)\,dt$, it can be seen that the expected downtime in a cycle
is given by the following:
\[ E[D] = \sum_{j=1}^{m-1} j\,E[\min(T, \tau_{j+1}) - \min(T, \tau_j)] = \int_0^T\sum_{j=1}^{m-1} j\,p(j, t)\,dt. \tag{53} \]
Condition on the number of failures at time $T$ to obtain the expected number of failures
in a cycle:
\[ E[N] = \sum_{i=1}^{m-1} i\,p(i, T) + m\sum_{i=m}^{n}p(i, T). \tag{54} \]
Using (53) and (54), the expected cost incurred during a cycle can be written as follows:
\[ \begin{aligned} E[C] & = c_0 + nc_s + (c_r - c_s)E[N] + c_dE[D] \\ & = c_0 + nc_s + (c_r - c_s)\sum_{i=1}^{m-1} i\,p(i, T) + m(c_r - c_s)\sum_{i=m}^{n}p(i, T) + c_d\int_0^T\sum_{j=1}^{m-1} j\,p(j, t)\,dt. \end{aligned} \tag{55} \]
A.2 Calculating V ar [C ] for (m; T ) policies
Now an expression for Var [C ] will be provided. The variance of the cost of one cycle is
given by
V ar [C ] = V ar [co + ncs + (cr ? cs) N + cdD]
n
h
i
= (cr ? cs )2 E N 2 ? E [N ]2
o
+2cd (cr ? cs ) fE [DN ] ? E [D] E [N ]g
n
h
i
h
io
n
o
+c2d E Ifm T g D2 + E Ifm >T gD2 ? c2d E [D]2 .
(56)
Expressions for E [D] and E [N ] are provided by (53) and (54), respectively. Expressions
h
i
h
i
for E [N 2], E [DN ], E Ifm T gD2 and E Ifm >T g D2 will now be provided. These together
with (53) and (54) can then be inserted into (56) for explicit evaluation of V ar[C ].
Condition on the number of failures at time $T$ to obtain:
$$
E[N^2] = \sum_{i=0}^{m-1} i^2\,p(i,T) + m^2 \sum_{i=m}^{n} p(i,T).
\tag{57}
$$
The random variable $N_f(T)$ equals the actual number of failures if the system is replaced
at time $T$. If the system is replaced before time $T$, $N_f(T)$ represents the number of failures
that would have occurred up to time $T$ if the system had not been replaced. Condition on
the value of this random variable to obtain
$$
E[DN] = \sum_{i=0}^{m-1} i\,E[D \mid N_f(T) = i]\,p(i,T) + \sum_{i=m}^{n} m\,E[D \mid N_f(T) = i]\,p(i,T).
\tag{58}
$$
From (52), for $i < m$,
$$
E[D \mid N_f(T) = i] = i\,[F(T)]^{-1} \int_0^T F(t)\,dt.
\tag{59}
$$
Now consider the case where $i \ge m$. Conditioned on $N_f(T) = i$, where $i \ge m$, one
has a system of $i$ independent machines. The (conditional) distribution function of the
failure time of one of these $i$ machines equals $[F(T)]^{-1} F(\cdot)$, since the only information
about the machine is that it would have failed before time $T$. So, in order to compute
the expected downtime conditioned on $N_f(T) = i$, one can act as if the system consists
of $i$ (instead of $n$) machines with i.i.d. failure times with distribution function equal to
$[F(T)]^{-1} F(\cdot)$. Thus, for this system, the probability of $j$ failures by time $t$ is given by
$$
\binom{i}{j} \left[ F(T)^{-1} F(t) \right]^{j} \left[ 1 - F(T)^{-1} F(t) \right]^{i-j}.
$$
Use this and proceed as in (53) to obtain:
$$
\begin{aligned}
E[D \mid N_f(T) = i] &= \sum_{j=1}^{m-1} j\,E[\min(T,\tau_{j+1}) - \min(T,\tau_j) \mid N_f(T) = i] \\
&= \int_0^T \sum_{j=1}^{m-1} j\,P[N_f(t) = j \mid N_f(T) = i]\,dt \\
&= \int_0^T \sum_{j=1}^{m-1} j \binom{i}{j} [F(T)]^{-i} [F(t)]^{j} [F(T) - F(t)]^{i-j}\,dt.
\end{aligned}
\tag{60}
$$
Use (59) and (60) in (58) to obtain
$$
\begin{aligned}
E[DN] &= [F(T)]^{-1} \int_0^T F(t)\,dt \sum_{i=0}^{m-1} i^2\,p(i,T) \\
&\quad + \sum_{i=m}^{n} m \binom{n}{i} \int_0^T \left\{ \sum_{j=1}^{m-1} j \binom{i}{j} [1 - F(T)]^{n-i} [F(t)]^{j} [F(T) - F(t)]^{i-j} \right\} dt.
\end{aligned}
\tag{61}
$$
It only remains to find expressions for $E[I_{\{\tau_m > T\}} D^2]$ and $E[I_{\{\tau_m \le T\}} D^2]$. Suppose it
is known that exactly $j < m$ machines have failed by time $T$. Then, conditioned on this
information, $D$ has the same distribution as $\sum_{i=1}^{j} (T - Y_i)$, where $Y_1, \ldots, Y_j$ are i.i.d.
random variables with density equal to $[F(T)]^{-1} f(\cdot)$, the density of the time to failure, $Y$,
of a single machine given that it fails by time $T$. Use this to obtain
$$
\begin{aligned}
E[I_{\{\tau_m > T\}} D^2] &= \sum_{j=1}^{m-1} E[D^2 \mid N_f(T) = j]\,p(j,T) \\
&= \sum_{j=1}^{m-1} E\left[ \left\{ \sum_{i=1}^{j} (T - Y_i) \right\}^2 \right] p(j,T) \\
&= \sum_{j=1}^{m-1} \left\{ j\,E[(T - Y)^2] + j(j-1) \left( E[T - Y] \right)^2 \right\} p(j,T),
\end{aligned}
$$
where the last equality follows since the i.i.d. nature of the $Y_i$ implies that $E[(T - Y_i)^2] = E[(T - Y)^2]$
and $E[(T - Y_i)(T - Y_j)] = \{E[T - Y]\}^2$, for $i \ne j$. Use the results that
$E[(T - Y)^2] = \int_0^T (T - y)^2 F(T)^{-1} f(y)\,dy$ and $E[T - Y] = \int_0^T F(T)^{-1} F(y)\,dy$ and simplify to obtain
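$$
\begin{aligned}
E[I_{\{\tau_m > T\}} D^2] &= \left\{ F(T)^{-1} \int_0^T (T - y)^2 f(y)\,dy \right\} \sum_{j=1}^{m-1} j\,p(j,T) \\
&\quad + \left\{ F(T)^{-1} \int_0^T F(t)\,dt \right\}^2 \left\{ \sum_{j=1}^{m-1} j(j-1)\,p(j,T) \right\}.
\end{aligned}
\tag{62}
$$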
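Equation (62) needs only two one-dimensional integrals, which makes it easy to evaluate numerically. The sketch below is an illustration under stated assumptions: the function names and the Simpson integrator are not from the paper, and `F` and `f` are the lifetime distribution function and density supplied by the caller.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def second_moment_no_mth_failure(m, T, n, F, f):
    """E[I{tau_m > T} D^2] evaluated from (62)."""
    FT = F(T)
    p = lambda j: math.comb(n, j) * FT**j * (1.0 - FT)**(n - j)   # p(j, T)
    e_sq = simpson(lambda y: (T - y)**2 * f(y), 0.0, T) / FT       # E[(T - Y)^2]
    e_1 = simpson(F, 0.0, T) / FT                                  # E[T - Y]
    return (e_sq * sum(j * p(j) for j in range(1, m))
            + e_1**2 * sum(j * (j - 1) * p(j) for j in range(1, m)))
```

As a hand check with lifetimes uniform on $[0, 1]$ (a hypothetical choice), $n = 2$, $m = 2$ and $T = 0.5$: only $j = 1$ contributes, $p(1,T) = 1/2$ and $E[(T - Y)^2] = 1/12$, so (62) gives $1/24$.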
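Suppose the $m$th failure occurs before time $T$; then $D = \sum_{i=1}^{m-1} (\tau_m - \tau_i)$. Use this to
obtain
$$
\begin{aligned}
E[I_{\{\tau_m \le T\}} D^2] &= E\left[ I_{\{\tau_m \le T\}} \left\{ \sum_{i=1}^{m-1} (\tau_m - \tau_i) \right\}^2 \right] \\
&= E\Bigg[ I_{\{\tau_m \le T\}} \Bigg\{ (m-1)^2 \tau_m^2 + \sum_{i=1}^{m-1} \tau_i^2 - 2(m-1)\,\tau_m (\tau_{m-1} + \cdots + \tau_1) \\
&\qquad\qquad + 2 \sum_{i=2}^{m-1} \tau_i (\tau_{i-1} + \cdots + \tau_1) \Bigg\} \Bigg].
\end{aligned}
\tag{63}
$$
For $2 \le i \le m$,
$$
\begin{aligned}
E[I_{\{\tau_m \le T\}} \tau_i^2] &= \int_0^T x^2\,E[I_{\{\tau_m \le T\}} \mid \tau_i = x]\,f_i(x)\,dx \\
&= \int_0^T x^2\,P[N_f(T) \ge m \mid N_f(x) = i]\,f_i(x)\,dx.
\end{aligned}
$$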
Conditioned on the event $\{N_f(x) = i\}$, there are $n - i$ functioning machines at time
$x$, each of whose lifetime distribution function equals $[1 - F(x)]^{-1} [F(\cdot) - F(x)]$. For the event
$\{N_f(T) \ge m\}$ to occur, at least $m - i$ of these must fail by time $T$, i.e.
$$
\begin{aligned}
P[N_f(T) \ge m \mid N_f(x) = i] &= \sum_{j=m-i}^{n-i} \binom{n-i}{j} \left\{ P[X \le T \mid X > x] \right\}^{j} \left\{ P[X > T \mid X > x] \right\}^{n-i-j} \\
&= [1 - F(x)]^{-(n-i)} \sum_{j=m-i}^{n-i} \binom{n-i}{j} [F(T) - F(x)]^{j} [1 - F(T)]^{n-i-j}.
\end{aligned}
\tag{64}
$$
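The binomial tail in (64) translates directly into code. The following sketch is an illustration only: the function name `K` and its argument order are assumptions, and `F` is any lifetime distribution function with $F(x) < 1$.

```python
import math

def K(i, x, m, T, n, F):
    """P[N_f(T) >= m | N_f(x) = i] from (64), for 0 <= x <= T and F(x) < 1."""
    survivors = n - i                 # machines still working at time x
    need = max(m - i, 0)              # further failures required by time T
    tail = sum(math.comb(survivors, j)
               * (F(T) - F(x))**j * (1.0 - F(T))**(survivors - j)
               for j in range(need, survivors + 1))
    return tail / (1.0 - F(x))**survivors
```

Two limiting cases are useful checks: with $i = m$ the sum runs over all $j$ and the probability equals 1, and with $i < m$ and $x = T$ no further failure can occur, so the probability equals 0.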
Let $K_i(x)$ denote (64), the probability that at least $m$ machines will have failed by time
$T$ given that exactly $i$ have failed by time $x$. Thus,
$$
E[I_{\{\tau_m \le T\}} \tau_i^2] = \int_0^T x^2 K_i(x)\,f_i(x)\,dx,
\tag{65}
$$
for $2 \le i \le m$. Note that conditioned on the events $\tau_i = x$ and $\{\tau_m \le T\}$, the quantity
$\tau_{i-1} + \cdots + \tau_1$ is the sum of $i - 1$ independent failure times each with density function equal
to $F(x)^{-1} f(\cdot)$. Consequently,
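$$
\begin{aligned}
E[I_{\{\tau_m \le T\}} \tau_i (\tau_{i-1} + \cdots + \tau_1)] &= \int_0^T x K_i(x)\,E[\tau_{i-1} + \cdots + \tau_1 \mid \tau_i = x]\,f_i(x)\,dx \\
&= \int_0^T x K_i(x)\,(i-1) \left\{ \int_0^x y\,[F(x)]^{-1} f(y)\,dy \right\} f_i(x)\,dx.
\end{aligned}
\tag{66}
$$
Use (65) and (66) in (63) to obtain
$$
\begin{aligned}
E[I_{\{\tau_m \le T\}} D^2] &= \int_0^T x^2 \left\{ \sum_{i=1}^{m-1} f_i(x) K_i(x) + (m-1)^2 f_m(x) \right\} dx \\
&\quad + 2 \int_0^T x \left\{ \sum_{i=2}^{m-1} (i-1) f_i(x) K_i(x) - (m-1)^2 f_m(x) \right\} [F(x)]^{-1} \int_0^x y f(y)\,dy\,dx.
\end{aligned}
\tag{67}
$$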
B Derivation of expressions given in §3
The following properties of the Kronecker product will be useful in the sequel. Let $P$, $Q$, $U$,
$V$, $W$, $Z$ be rectangular matrices such that the ordinary matrix products $PQU$ and $VWZ$
are defined; then
$$
(PQU) \otimes (VWZ) = (P \otimes V)(Q \otimes W)(U \otimes Z).
\tag{68}
$$
For any square matrices $P$ and $Q$:
$$
e^{(P \otimes I_Q)x + (I_P \otimes Q)x} = e^{Px} \otimes e^{Qx},
\tag{69}
$$
where $I_Q$ and $I_P$ are identity matrices of the same dimension as $Q$ and $P$, respectively (see
Neuts [10], p.373).
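Identity (69) can be checked numerically on small matrices. The sketch below uses only the Python standard library; the helper names and the truncated Taylor series for the matrix exponential are assumptions made for this illustration (the series is adequate only for matrices of small norm) and are not the computational method of the paper.

```python
def matmul(A, B):
    """Ordinary matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def scale(A, s):
    return [[s * v for v in row] for row in A]

def expm(A, terms=40):
    """Matrix exponential by a truncated Taylor series (small norms only)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]   # A^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def kron_sum(P, Q):
    """(P kron I_Q) + (I_P kron Q), the exponent appearing in (69)."""
    left = kron(P, identity(len(Q)))
    right = kron(identity(len(P)), Q)
    return [[l + r for l, r in zip(lr, rr)] for lr, rr in zip(left, right)]

# Hypothetical 2 x 2 test matrices.
P = [[-2.0, 1.0], [0.0, -3.0]]
Q = [[-1.0, 0.5], [0.2, -1.0]]
lhs = expm(scale(kron_sum(P, Q), 0.7))
rhs = kron(expm(scale(P, 0.7)), expm(scale(Q, 0.7)))
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
```

The identity holds because $P \otimes I_Q$ and $I_P \otimes Q$ commute, so the exponential of their sum factors into the product of the two exponentials.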
Derivation of (10)-(14)
The function $1 - [1 - F(t)]^k$ is the distribution function of $\min(X_1, \ldots, X_k)$, which is a
phase distribution with representation $(\alpha_k, A_k)$. Consequently,
$$
1 - [1 - F(t)]^k = 1 - \alpha_k e^{A_k t} e_k, \quad \text{for } k \ge 2,
\tag{70}
$$
and $\int_0^T [1 - F(t)]^k\,dt = \int_0^T \alpha_k e^{A_k t} e_k\,dt$, from which (10) and (11) follow.
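As a quick sanity check of (70), consider the simplest phase distribution, an exponential lifetime with rate $\lambda$. This special case is an assumption made only for illustration. Writing $(\alpha, A)$ for the representation of $F$, one has $\alpha = 1$, $A = -\lambda$ and $e = 1$, and, assuming $A_k$ is the Kronecker sum of $k$ copies of $A$, all quantities are scalars:

```latex
\alpha_k = 1, \qquad A_k = -k\lambda, \qquad e_k = 1,
\qquad
1 - [1 - F(t)]^k = 1 - e^{-k\lambda t},
\qquad
\int_0^T [1 - F(t)]^k \, dt = \frac{1 - e^{-k\lambda T}}{k\lambda}.
```

This agrees with the familiar fact that the minimum of $k$ independent exponential lifetimes with rate $\lambda$ is exponential with rate $k\lambda$.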
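Note that, for $x > 0$,
$$
\begin{aligned}
\int_0^x t A^{-1} e^{At} e\,dt &= x A^{-2} e^{Ax} e - A^{-3} \left( e^{Ax} - I \right) e, \\
\int_0^x A^{-1} e^{At} e\,dt &= A^{-2} \left( e^{Ax} - I \right) e, \\
\int_0^x e^{At} e\,dt &= A^{-1} \left( e^{Ax} - I \right) e;
\end{aligned}
$$
simplify and use integration by parts to obtain (12), (13) and (14).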
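For the first of these identities, the integration by parts step can be written out as follows. The only facts used are that $A$ commutes with $e^{At}$ and that $A$ is nonsingular, which holds for a phase representation in which absorption is certain.

```latex
\int_0^x t \, A^{-1} e^{At} e \, dt
  = \Big[ t \, A^{-2} e^{At} e \Big]_0^x - \int_0^x A^{-2} e^{At} e \, dt
  = x \, A^{-2} e^{Ax} e - A^{-3} \left( e^{Ax} - I \right) e .
```

The second and third identities follow in the same way from $\frac{d}{dt} A^{-1} e^{At} = e^{At}$.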
Derivation of (18)-(20)
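Use (6) and (5) and apply (68) and (69) to obtain
$$
\begin{aligned}
S_1(j,T) &= \int_0^T x^2\,\alpha_j e^{A_j x} e_j\;\alpha e^{Ax} A^0\,dx \\
&= \int_0^T x^2 \left( \alpha_j e^{A_j x} e_j \right) \otimes \left( \alpha e^{Ax} A^0 \right) dx \\
&= \int_0^T x^2 \left( \alpha_j \otimes \alpha \right) \left( e^{A_j x} \otimes e^{Ax} \right) \left( e_j \otimes A^0 \right) dx,
\end{aligned}
$$
where the second equality follows since the product of scalars is, trivially, a Kronecker
product. Now apply (68), (69) and the definition of $\alpha_{j+1}$ and $A_{j+1}$ to obtain
$$
S_1(j,T) = \left\{ \int_0^T x^2\,\alpha_{j+1} e^{A_{j+1} x}\,dx \right\} \left( e_j \otimes A^0 \right),
$$
from which (18) follows on integrating by parts 2 times.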
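Use (6), (5) and apply (68) and (69) to obtain
$$
\begin{aligned}
S_2(j,T) &= \int_0^T x\,\alpha e^{Ax} A^0\;\alpha A^{-1} e^{Ax} e\;\alpha_j e^{A_j x} e_j\,dx \\
&= \int_0^T x\,\alpha e^{Ax} A^0 \left[ \alpha A^{-1} \otimes \alpha_j \right] e^{A_{j+1} x} e_{j+1}\,dx,
\end{aligned}
$$
where the last equality follows by applying (68) and using the definition of $\alpha_{j+1}$ and $A_{j+1}$.
Again apply (68) and the definition of $A_{j+1}$ and $e_{j+1}$ to obtain
$$
\begin{aligned}
S_2(j,T) &= \int_0^T x \left\{ \left[ \alpha A^{-1} \otimes \alpha_j \right] e^{A_{j+1} x} e_{j+1} \right\} \otimes \left\{ \alpha e^{Ax} A^0 \right\} dx \\
&= \int_0^T x \left\{ \left[ \alpha A^{-1} \otimes \alpha_j \right] \otimes \alpha \right\} \left\{ e^{A_{j+1} x} \otimes e^{Ax} \right\} dx \left( e_{j+1} \otimes A^0 \right) \\
&= \int_0^T x \left[ \alpha A^{-1} \otimes \alpha_{j+1} \right] e^{A_{j+2} x}\,dx \left( e_{j+1} \otimes A^0 \right),
\end{aligned}
$$
from which (19) follows on applying integration by parts.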
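Again, use (6), (5), (68) and (69) to obtain
$$
\begin{aligned}
S_3(j,T) &= \int_0^T x\,\alpha_j e^{A_j x} e_j\;\alpha e^{Ax} A^0\,dx \\
&= \left\{ \int_0^T x\,\alpha_{j+1} e^{A_{j+1} x}\,dx \right\} \left( e_j \otimes A^0 \right).
\end{aligned}
$$
Integration by parts of the above expression yields (20).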
Derivation of (29)-(32)
Use the Binomial theorem to expand $[F(x)]^{i-1} = \{1 - [1 - F(x)]\}^{i-1}$ and $[F(T) - F(x)]^i =
\{[1 - F(x)] - [1 - F(T)]\}^i$ in (2) and (28), respectively. Insert the resulting expressions for
$f_i(x)$ and $K_i(x)$ into the definition for $U_3(i,T)$ and $U_4(i,T)$, gather terms and recall the
definition of $S_1(\cdot,\cdot)$, $S_2(\cdot,\cdot)$ and $S_3(\cdot,\cdot)$ to obtain the results given in (31) and (32).
Similarly, (29) and (30) follow by inserting $f_i(x)$ into (24) and (25), expanding $[F(x)]^{i-1}$
using the Binomial theorem and recalling the definition of $S_1(\cdot,\cdot)$, $S_2(\cdot,\cdot)$ and $S_3(\cdot,\cdot)$.
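The two binomial expansions rewrite powers of $F$ as sums of powers of the survival function $1 - F$, each of which has the phase form (70). A small Python sketch can confirm the algebra numerically; the function names are hypothetical and the inputs are plain numbers standing in for $F(x)$ and $F(T)$.

```python
import math

def power_of_F(F_x, i):
    """[F(x)]^(i-1) expanded in powers of the survival function 1 - F(x)."""
    s = 1.0 - F_x
    return sum(math.comb(i - 1, k) * (-1) ** k * s ** k for k in range(i))

def power_of_difference(F_T, F_x, j):
    """[F(T) - F(x)]^j expanded as {[1 - F(x)] - [1 - F(T)]}^j."""
    a, b = 1.0 - F_x, 1.0 - F_T
    return sum(math.comb(j, k) * a ** (j - k) * (-b) ** k for k in range(j + 1))
```

Each term of the first expansion is a constant times $[1 - F(x)]^k$, and each term of the second is a constant (depending on $T$) times a power of $1 - F(x)$, which is what allows the integrals to be collected into the $S_1$, $S_2$ and $S_3$ terms.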
References
[1] D. Assaf and J.G. Shanthikumar, Optimal group maintenance policies with continuous
and periodic inspection, Management Science 33 (1987) 1440-1452.
[2] A. Bobbio and A. Cumani, Modeling wear-out by multistate homogeneous Markov
models. In: Reliability in Electrical and Electronic Components and Systems , eds. C.
Lauger and J. Moltorf, North-Holland, 1982, pp. 101-106.
[3] S. Chakravarthy, Analysis of production line systems with two unreliable machines with
phase type processing times and a finite storage buffer, Communications in Statistics:
Stochastic Models 3, (1987) 369-391.
[4] M. Gururajan and K. S. Bhat, A complex priority redundant system with phase type
distribution, Microelectronics and Reliability. 30 (1990) 3, 453-455.
[5] M. A. Johnson, Selecting parameters of phase distributions: combining nonlinear programming, heuristics, and Erlang Distributions, ORSA Journal of Computing 5, 1
(1993) 69 - 80.
[6] M. Malhotra and A. Reibman, Selecting and implementing phase approximations for
semi-Markov models, Communications in Statistics: Stochastic Models. 9, 4 (1993)
473-506.
[7] T. A. Mazzuchi and R. Soyer, A Bayesian perspective on some replacement strategies,
Reliability Engineering and System Safety 61 (1996) 295-303.
[8] T. Nakagawa, Further results on replacement problem of a parallel system in a random
environment, Journal of Applied Probability 16 (1979) 923-926.
[9] M. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach ,
Johns Hopkins University Press, Baltimore, 1981.
[10] M. Neuts, Algorithmic probability: a collection of problems , Chapman and Hall, 1995.
[11] K. Okumoto and E.A. Elsayed, An optimum group maintenance policy, Naval Research
Logistics Quarterly 30 (1983) 667-674.
[12] E. Popova and J. G. Wilson, Selecting and implementing the best group replacement
policy for a non Markovian system, in: Proceedings of the International Conference on
Probabilistic Safety Assessment and Management , eds. C. Cacciabue et al., Springer-Verlag, 1996, pp. 58 - 63.
[13] P. Ritchken and J. G. Wilson, (m; T ) group maintenance policies, Management Science
36, 5 (1990) 632-639.
[14] S. Ross, Stochastic Processes , John Wiley, New York, 1996.
[15] W. Smith, Renewal theory and its ramifications, Journal of the Royal Statistical Society
B 20 (1958) 243-302.
[16] J. G. Wilson, A note on variance reducing group maintenance policies, Management
Science 42, 3 (1996) 452-460.
[17] J. G. Wilson and A. Benmerzouga, Optimal m-failure policies with random repair time,
Operations Research Letters 9 (1990) 203-209.
[18] J. G. Wilson and A. Benmerzouga, Bayesian group replacement policies, Operations
Research 43, 3 (1995) 471-476.
[19] J. G. Wilson and E. Popova, Adaptive replacement policies for a system of parallel machines, in: Lifetime Data: Models in Reliability and Survival Analysis , eds. N.P. Jewell
et al., Kluwer Academic Publishers, 1996, pp. 371-375.
[20] J. G. Wilson and E. Popova, Optimal Bayesian group maintenance policies, working
paper, department of Mechanical Engineering, The University of Texas at Austin,
Austin, 1997.
m    Expected cost per unit time    Asymptotic variance
1    85.73                          2177
2    81.45                          1368
3    84.77                          1059

Table 1.
Expected cost and variance per unit time for the 1-, 2- and 3-failure policies with $n = 3$, $c_o = 70$,
$c_s = 10$, $c_r = 50$, $c_d = 30$, where the failure distribution is phase with representation given in
the Example.