
State Probability of a Series-parallel Repairable System
with Two-types of Failure States
Gregory Levitin
Reliability Department, Planning, Development and Technology Division,
Israel Electric Corporation Ltd., P.O. Box 10, Haifa, 31000 Israel
Tieling Zhang, Min Xie
Department of Industrial and Systems Engineering
National University of Singapore, Singapore 117576
ABSTRACT
This paper presents a method for the analysis of series-parallel safety-critical systems
in which the failed states are divided into failure-safe and failure-dangerous states. The
method combines a Markov chain with the universal generating function technique. In the
model considered, both periodic inspection and repair (perfect and imperfect) of system
elements are taken into account. The system state distributions and the overall system
safety function are derived based on the developed model. The proposed method is
applicable to complex systems for analyzing state distributions and it is also useful in
decision-making such as determining the optimal proof-test interval or repair resource
allocation. An illustrative example is given.
Keywords: Availability, Safety-critical system, Markov model, Universal generating function,
Periodic inspection, Failure-safe, Failure-dangerous
1. Introduction
Safety is of paramount concern for large and complex systems such as nuclear power and
chemical processing plants, aircraft navigation control systems, power transmission and
high-speed railway networks. The complexity of large systems raises many important
safety problems, and it may be very difficult or even impossible to ensure that such
systems will always behave as expected under all foreseeable conditions.
Dangerous faults may be caused not only by random hardware failures but also by systematic
faults inadvertently designed into the system. Safety analysis or risk assessment for such a
system thus becomes a complex problem that involves the study of human factors (human
error), production processes, manufacturing control, on-line measurement or test and repair,
diagnosis with periodic inspections and so on. See Dominguez-Garcia et al. (2006), DeLong
et al. (2005), Cowing et al. (2004), Marseguerra et al. (2004) and Burgazzi (2003) for
related discussions of recent reliability research on safety-critical systems.
Safety-critical systems are proactive measures that prevent dangerous events from
occurring in a process plant. For example, emergency shutdown
controllers are widely used in the chemical processing industry. Their function is to monitor a
plant process and to identify whether the process is operating within acceptable limits. If the
process moves outside the acceptable operating range, the controller automatically shuts
the process down in a safe manner (Bukowski, 2001). In order to analyze safety-critical
systems properly, non-dangerous and dangerous failures should be distinguished; they
correspond to the failure-safe and failure-dangerous states of the system, respectively.
The international standard IEC 61508 (1998) includes two frameworks: one is risk
reduction with a Safety-Related System (SRS) and the other is the Overall Safety Life-cycle.
Since its publication, it has been widely adopted in various safety-related studies and
applications (see, e.g., Faller, 2004; Hokstad and Corneliussen, 2004; Zhang et al., 2003;
Nunns, 2000; Knegtering and Brombacher, 1999). A typical SRS architecture consists
of components with diagnosis and periodic inspection, where the failures in each
component are classified as detectable or undetectable. A number of studies address
safety-critical systems with specific system structures; see,
e.g., Kang and Jang (2006), Kim et al. (2005), Weber et al.
(2005), Lee et al. (2004), Latif-Shabgahi et al. (2004) and Son and Seong (2003).
Periodic inspection is important for safety-critical systems and has been studied in
reliability analysis in general (see, e.g., Cui et al., 2004; Biswas et al., 2003; Bris et al., 2003;
Bukowski, 2001). In various studies of safety-critical system performance, the effects of
periodic inspection have been either ignored or modeled by assigning much longer average
repair times to unrecognized degraded states (Zhang et al., 2003, 2006). In practice, an
unrecognized fault cannot be repaired until the next periodic inspection (proof-test), so
the repair of this kind of fault is carried out at predetermined times. However, only very
few studies have addressed this problem. Bukowski (2001) gives a method of
incorporating periodic inspection and repair into a Markov model in which both perfect and
imperfect inspection and repair can be modeled. However, the
situation in which unrecognized and recognized degraded states exist simultaneously
was not included in that Markov model. As an unrecognized failure can only be found at
periodic inspection, the two kinds of faults can coexist for some period of time.
The purpose of this paper is to present a method for evaluating the probabilities of
failure-safe and failure-dangerous states for arbitrarily complex series-parallel systems with
imperfect diagnostics and imperfect periodic inspections and repairs of elements. Each
kind of element failure, whether failure-safe or failure-dangerous, can be either
detected or undetected. The emphasis is on the exact state probability or availability of such a
system. See Bowles and Dobbins (2004), Chandrasekhar et al. (2004) and Carrasco (2004)
for related studies of other systems.
The remainder of this paper presents the Markov model for determining the state
distribution of a single system element, the universal generating function technique for
determining the state distribution of the entire system, and an illustrative example.
Acronyms & Notations
FD        failure-dangerous state
FS        failure-safe state
W         operational state
G         set of states of an element (system): G = {W, FS, FD}
$\varphi$         structure function
$\varphi_{par}$   structure function for elements connected in parallel
$\varphi_{ser}$   structure function for elements connected in series
$S_j$     random discrete state variable of element j
$s_{jk}$  k-th realization of $S_j$: $s_{jk} \in G$
Fd        detected failure
Fu        undetected failure
FDd       detected failure-dangerous
FDu       undetected failure-dangerous
FSd       detected failure-safe
FSu       undetected failure-safe
pfd       probability of failure on demand
pfdD      probability of failure-dangerous on demand
pfdS      probability of failure-safe on demand
$\Lambda$         system transition rate matrix
$0_k$     zero column vector of size $k \times 1$
$1_k$     unit column vector of size $k \times 1$
$P_W(t)$, $P_{FS}(t)$, $P_{FD}(t)$   probability that a subsystem or the entire system is in state W, FS, FD at time t
$\lambda_{sd}$, $\lambda_{dd}$, $\lambda_{du}$, $\lambda_{su}$   failure rates of FSd, FDd, FDu, FSu
$\mu_{sd}$, $\mu_{dd}$, $\mu_{du}$, $\mu_{su}$   repair rates of FSd, FDd, FDu, FSu
d         fraction of detected failures that are detected correctly
$T_I$     proof-test interval
Assumptions
1. The system is composed of elements, and each element can experience two categories of
failures, dangerous and non-dangerous, corresponding respectively to
failure-dangerous and failure-safe events. Failure-dangerous and failure-safe events
are independent.
2. Both categories of failures can be either detected or undetected.
3. Detected and undetected failures constitute independent events.
4. Failure rates for both kinds of failures are constant.
5. The element is in operation state if no failure event (detected or undetected) has
occurred.
6. The element is in failure-safe state if at least one non-dangerous failure (detected or
undetected) has occurred and no dangerous failure has occurred.
7. The element is in failure-dangerous state if at least one dangerous failure (detected or
undetected) has occurred.
8. The elements are independent and can undergo periodic inspections at different times.
9. The state of any composition of elements is unambiguously defined by the states of
these elements and the nature of elements interaction in the system.
10. The elements’ interaction is represented by series-parallel block diagram.
2. State distribution of single system element
According to IEC 61508, the typical system structure is composed of elements to which
diagnosis, periodic inspection and repair are applied. Failure-safe and failure-dangerous
events can occur independently. The failure category depends on the effect of the fault.
For example, if a failure results in the shutdown of a properly operating process, it
is a failure-safe (FS) event. This type of failure is variously referred to as a
false trip or a false alarm. However, if a safety-critical system fails to perform an operation
that is required to shut down a process, hazardous results may follow, as with the failure of a
monitor that controls an important process. This type of failure is generally
called failure-dangerous (FD).
Both FS and FD events can be detected or undetected. A detected failure is one that is
detected instantly by diagnostic devices. An imperfect diagnosis model presumes that a
fraction d of detected failures can be detected instantaneously by diagnostic devices.
Whenever a failure of this kind is detected, on-line repair is initiated. The failures
that cannot be detected by the diagnostic devices, or that remain undetected because of
imperfect diagnosis, are considered to be undetected failures. These failures can be found
only by the proof-test (periodic inspection) at the end of a proof-test interval. We
assume that the failure rates of detected failure-safe and failure-dangerous events
($\lambda_{sd}$ and $\lambda_{dd}$, respectively) as well as of undetected failure-safe and
failure-dangerous events ($\lambda_{su}$ and $\lambda_{du}$, respectively) can be calculated
or elicited from tests.
The state of any single element can be represented as a combination of two independent
states corresponding to detected and undetected failures. Each of the two can be in one of
three states: no failure (state W), failure of category FS and failure of category
FD. According to assumptions 5-7, the state of each element can be determined from
each combination of failure states using Table 1.
Table 1. States of single element.
                        Detected failure
Undetected failure      W       FSd     FDd
W                       W       FS      FD
FSu                     FS      FS      FD
FDu                     FD      FD      FD
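Assumptions 5-7 and Table 1 amount to a simple lookup rule, which can be written down directly. The following Python sketch is illustrative only: the function name `element_state` and the string labels are choices made here, not part of the original model description.

```python
# Illustrative sketch of Table 1 (names are hypothetical, not from the paper).
DETECTED = ("W", "FSd", "FDd")
UNDETECTED = ("W", "FSu", "FDu")


def element_state(detected, undetected):
    """Return the element state (W, FS or FD) for one combination of
    detected and undetected failure states, following assumptions 5-7."""
    if detected not in DETECTED or undetected not in UNDETECTED:
        raise ValueError("unknown failure state")
    # Assumption 7: any dangerous failure dominates.
    if detected == "FDd" or undetected == "FDu":
        return "FD"
    # Assumption 6: at least one non-dangerous failure, no dangerous one.
    if detected == "FSd" or undetected == "FSu":
        return "FS"
    # Assumption 5: no failure of either kind.
    return "W"


if __name__ == "__main__":
    for u in UNDETECTED:
        print(u, [element_state(d, u) for d in DETECTED])
```

Each printed row corresponds to one row of Table 1, with the columns in the order W, FSd, FDd.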
The state of each element j can be represented by a discrete random variable $S_j$ that
takes values from the set G = {W, FS, FD}. In order to obtain the element state
distribution $p_{jW} = \Pr(S_j = W)$, $p_{jFS} = \Pr(S_j = FS)$ and $p_{jFD} = \Pr(S_j = FD)$, one should
sum the probabilities of all combinations of states of detected and undetected
failures that result in the element states W, FS and FD, respectively. Based on element
state transition analysis, one can obtain the Markov state transition diagram presented in
Fig. 1. In this diagram, each possible combination of the states of detected and undetected
failures (marked inside the circles) belongs to one of the three sets corresponding to the three
different states of the element defined according to Table 1.
In practice, no repair action is applied to an undetected failure until the next
proof-test. In general, the periodic inspection and repair take a very short time
compared with the proof-test interval $T_I$, and the whole system stops operating (is in the down
state) during the periodic inspection and repair. Therefore, it is reasonable to set the
repair rates for undetected failures to $\mu_{du} = \mu_{su} = 0$ when analyzing the behavior of a
safety-critical system within the proof-test interval (unlike the equivalent repair rates for
$\mu_{du}$ and $\mu_{su}$ used in Zhang et al., 2003).
Fig. 1. Markov state transition diagram used for calculating state distribution of a single
element. The nine states are the combinations of detected (W, FSd, FDd) and undetected
(W, FSu, FDu) failure states, grouped into the element states W, FS and FD.
According to Fig. 1, the following group of equations describes the element's
behavior:
$\dot{P}_j(t) = P_j(t)\,\Lambda_j$,    (1)
where $P_j(t) = (p_{j1}(t), p_{j2}(t), \ldots, p_{j9}(t))$ is the vector of state probabilities, $\dot{P}_j(t)$ is the derivative of $P_j(t)$
with respect to t, and $\Lambda_j$ is the transition rate matrix (see the Appendix). According to Table 1, state
1 in the Markov diagram corresponds to state W of the element, states 2 - 4 correspond to
state FS of the element and states 5 - 9 correspond to state FD of the element. Having the
solution $P_j(t)$ of Eq. (1) for any element j, one can obtain $p_{jW} = p_{j1}$, $p_{jFS} = p_{j2} + p_{j3} + p_{j4}$ and
$p_{jFD} = p_{j5} + p_{j6} + p_{j7} + p_{j8} + p_{j9}$. The solution of Eq. (1) can be expressed as
$P_j(t) = P_j(0)\,\exp(\Lambda_j t)$,    for $t \ge 0$;
$P_j(t) = P_j(nT_I^{+})\,\exp(\Lambda_j (t - nT_I))$,    for $nT_I^{+} \le t \le (n+1)T_I$, $n = 0, 1, 2, \ldots$    (2)
Under imperfect inspection and repair, an undetected fault may not be restored to
as-good-as-new condition, and some faults may still exist after inspection and repair. A matrix
$M_{ji}$ is used to describe this behavior. Each entry of $M_{ji}$ is the probability that the
element moves from one state to another at the i-th proof-test. Thus, we have
$P_j(T_I^{+}) = P_j(T_I)\,M_{j1} = P_j(0)\,\exp(\Lambda_j T_I)\,M_{j1}$,    (3)
$P_j(2T_I^{+}) = P_j(2T_I)\,M_{j2} = P_j(0)\,\exp(\Lambda_j T_I)\,M_{j1}\,\exp(\Lambda_j T_I)\,M_{j2}$,
$P_j(nT_I^{+}) = P_j(nT_I)\,M_{jn}$
$= P_j((n-1)T_I^{+})\,\exp(\Lambda_j T_I)\,M_{jn}$
$= P_j((n-2)T_I^{+})\,\exp(\Lambda_j T_I)\,M_{j(n-1)}\,\exp(\Lambda_j T_I)\,M_{jn}$
$= P_j(0)\,\exp(\Lambda_j T_I)\,M_{j1}\,\exp(\Lambda_j T_I)\,M_{j2} \cdots \exp(\Lambda_j T_I)\,M_{j(n-1)}\,\exp(\Lambda_j T_I)\,M_{jn}$,    for $n = 1, 2, 3, \ldots$    (4)
In Eq. (4), n is the index of the proof-test interval and $M_{ji}$ ($i = 1, 2, 3, \ldots, n$) is the matrix
associated with the i-th proof-test.
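The alternation in Eq. (4) between free evolution over one proof-test interval and an instantaneous redistribution at the proof-test can be illustrated on a deliberately reduced model. The sketch below is not the nine-state model of Fig. 1: it assumes a hypothetical two-state element (W, and a single undetected failure state U) with failure rate `lam`, no repair between tests, and a proof-test matrix M = [[1, 0], [1 - r, r]], where r is the fraction of undetected faults that survives the test. For this reduced chain the matrix exponential has a closed form, so only the standard library is needed. All names and numbers are illustrative.

```python
import math


def propagate(lam, t_i, stay_fractions, t, p_w0=1.0):
    """Two-state illustration of Eqs. (2)-(4).

    Returns (P_W, P_U) at time t. At a proof-test instant the post-test
    value P(n*T_I^+) is returned. stay_fractions[i] is r for test i+1.
    """
    n_tests = int(t // t_i)
    if n_tests > len(stay_fractions):
        raise ValueError("need one stay fraction per proof-test up to time t")
    p_w = p_w0
    for r in stay_fractions[:n_tests]:
        # Free evolution over one interval: exp(Lambda * T_I) in closed form.
        p_w *= math.exp(-lam * t_i)
        p_u = 1.0 - p_w
        # Proof-test: row vector times M = [[1, 0], [1 - r, r]].
        p_w = p_w + (1.0 - r) * p_u
    # Evolution since the last proof-test, as in the second line of Eq. (2).
    p_w *= math.exp(-lam * (t - n_tests * t_i))
    return p_w, 1.0 - p_w


if __name__ == "__main__":
    hours_per_year = 8760.0          # assumed conversion
    t_i = 1.5 * hours_per_year       # proof-test interval in hours
    for t in (0.5 * t_i, t_i, 1.5 * t_i, 2.0 * t_i):
        print(t, propagate(1e-6, t_i, [0.1, 0.12], t))
```

With r = 0 the test restores the element completely (perfect repair); with r > 0 part of the undetected-failure probability is carried into the next interval, which is the effect the matrices $M_{ji}$ capture in the full model.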
3. State distribution of the entire series-parallel system
In order to obtain the state distribution of the entire system, the procedure used in this
paper is based on the universal generating function (u-function) technique. This method
was introduced in Ushakov (1987) and has proven to be very effective for the reliability
evaluation of different types of multi-state systems; see Levitin et al. (1998) and
Lisnianski and Levitin (2003). A comprehensive description of the method and its
numerous applications in reliability engineering can be found in Levitin (2005). For some
recent and related applications, see, e.g., Levitin (2004, 2005) and Korczak et al.
(2005).
The u-function of a discrete random variable Y is defined as a polynomial
$u(z) = \sum_{k=1}^{K} q_k z^{y_k}$,    (5)
where the variable Y has K possible values and $q_k$ is the probability that Y takes the value
$y_k$. In our case, the polynomial u(z) can define state distributions, i.e. it represents all of
the possible mutually exclusive states of the element (or any subsystem) by relating the
probability of each state to the value taken by the random state variable corresponding
to this element (subsystem) in that state. Note that the performance distribution of the
basic element j (probability mass function of discrete random variable $S_j$) can now be
represented as
$u_j(z) = \sum_{k=1}^{3} p_{jk} z^{s_{jk}}$,    (6)
where $s_{j1}$ = FD, $s_{j2}$ = FS, $s_{j3}$ = W for any j.
To obtain the u-function of a subsystem consisting of two elements, composition
operators are introduced. These operators determine the u-function for two elements
connected in parallel and in series, respectively, using simple algebraic operations on the
individual u-functions of basic elements. All the composition operators take the form
$u_j(z) \otimes_{\varphi} u_i(z) = \sum_{k=1}^{3} p_{jk} z^{s_{jk}} \otimes_{\varphi} \sum_{h=1}^{3} p_{ih} z^{s_{ih}} = \sum_{k=1}^{3} \sum_{h=1}^{3} p_{jk}\, p_{ih}\, z^{\varphi(s_{jk},\, s_{ih})}$.    (7)
The obtained u-function relates the probability of each combination of states of the
independent elements (which is equal to the product of the probabilities of these states) to
the value that the random state variable of the entire subsystem takes when this
combination is realized. The function $\varphi(\cdot)$ in the composition operators expresses the
dependence of the entire subsystem state on the states of both of its elements. The
definition of the function $\varphi(\cdot)$ strictly depends on the physical nature of the system and on
the nature of the interaction of the system elements.
The structure functions for pairs of elements connected in parallel and in series should
be defined for any specific application based on analysis of system functioning. For
example, in the widely applied conservative approach the following assumptions are
made. Any subsystem consisting of two parallel elements is in the failure-dangerous state if at
least one of the elements is in the failure-dangerous state; otherwise it is in the operational
state if at least one of the elements is in the operational state. In the remaining cases, the
subsystem is in the failure-safe state. This can be expressed by the structure function
$\varphi_{par}(\cdot)$ presented in Table 2. A
subsystem consisting of two elements connected in series is in the operational state if both
of the elements are in the operational state, whereas it is in the failure-dangerous state if at
least one of the elements is in the failure-dangerous state. In the remaining cases, the subsystem is in the
failure-safe state. This can be expressed by the structure function $\varphi_{ser}(\cdot)$ presented in Table
3.
Table 2. Structure function for pair of elements connected in parallel.
                Element 1
Element 2       W       FS      FD
W               W       W       FD
FS              W       FS      FD
FD              FD      FD      FD
Table 3. Structure function for pair of elements connected in series.
                Element 1
Element 2       W       FS      FD
W               W       FS      FD
FS              FS      FS      FD
FD              FD      FD      FD
In the numerical realization of the composition operator in Eq. (7), we can encode the
states W, FS and FD by the integers 3, 2 and 1, respectively, so that $s_{jk} = k$ for any j.
In our case, k = 1, 2, 3. With this encoding, the functions $\varphi_{par}(\cdot)$
and $\varphi_{ser}(\cdot)$ defined above take the form
$\varphi_{par}(s_{jk}, s_{ih}) = \max(s_{jk}, s_{ih})$ if $\min(s_{jk}, s_{ih}) > 1$, and $\varphi_{par}(s_{jk}, s_{ih}) = 1$ if $\min(s_{jk}, s_{ih}) = 1$;
$\varphi_{ser}(s_{jk}, s_{ih}) = \min(s_{jk}, s_{ih})$.
Note that the nine possible different combinations of element states produce only three
possible states of the subsystem. The probabilities of combinations that produce the same
subsystem state should be summed in order to obtain this state probability. This can be
done by collecting terms with equal exponents in the u-function obtained by Eq. (7).
Finally, any subsystem state distribution can be represented by the u-function taking the
form of Eq. (6).
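The composition operator of Eq. (7), together with the integer encoding of the states and the collection of like terms, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: a u-function is stored as a dictionary mapping the encoded state (the exponent) to its probability (the coefficient), and the element probabilities used below are hypothetical.

```python
# Integer encoding of the states, as suggested in the text.
W, FS, FD = 3, 2, 1


def phi_par(a, b):
    """Structure function of Table 2 under the integer encoding."""
    return 1 if min(a, b) == 1 else max(a, b)


def phi_ser(a, b):
    """Structure function of Table 3 under the integer encoding."""
    return min(a, b)


def compose(u_j, u_i, phi):
    """Eq. (7): multiply coefficients, map exponents through phi,
    and collect terms with equal exponents."""
    out = {}
    for s_j, p_j in u_j.items():
        for s_i, p_i in u_i.items():
            s = phi(s_j, s_i)
            out[s] = out.get(s, 0.0) + p_j * p_i
    return out


if __name__ == "__main__":
    # Hypothetical element state distribution (not taken from the paper).
    u = {W: 0.90, FS: 0.07, FD: 0.03}
    print("parallel:", compose(u, u, phi_par))
    print("series:  ", compose(u, u, phi_ser))
```

For two such identical elements, the nine products collapse to three terms in each case; for instance, the failure-dangerous probability is 1 - 0.97^2 for both connections, because a dangerous failure of either element dominates in Tables 2 and 3.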
Any subsystem consisting of two elements can be further treated as a single equivalent
element with a performance distribution that is equal to the performance distribution of
this subsystem. Consecutively applying the composition operators and replacing pairs of
elements by equivalent elements, one can obtain the u-function representing the
performance distribution of the entire system.
The recursive algorithm
The following recursive algorithm obtains the u-function that represents the entire system
state distribution:
Step 1. Obtain the state probabilities for each element j using the Markov
transition diagram method presented in Section 2.
Step 2. Define the u-functions $u_j(z)$ for each element j using Eq. (6).
Step 3. If the system contains a pair of elements connected in parallel or in
series, replace this pair with an equivalent element whose u-function is obtained
by the operator of Eq. (7) with the structure function $\varphi_{par}(\cdot)$ or $\varphi_{ser}(\cdot)$,
respectively.
Step 4. If the system contains more than one element, return to Step 3.
Otherwise, the algorithm stops.
The coefficients of the obtained u-function are equal to probabilities of operational,
failure-safe and failure-dangerous states of the entire system.
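Steps 2-4 can be sketched as a recursion over a nested description of the block diagram. In the Python sketch below, which is illustrative only, a subsystem is a tuple whose first entry is "par" or "ser" and whose remaining entries are element identifiers or further subsystems. The element probabilities are hypothetical placeholders for the output of Step 1, and the layout (two fuel supply systems in parallel, in series with a turbine block, and two such units in parallel) is an assumption made here for the example.

```python
from functools import reduce

W, FS, FD = 3, 2, 1  # integer encoding of the states

PHI = {
    "par": lambda a, b: FD if min(a, b) == FD else max(a, b),  # Table 2
    "ser": min,                                                # Table 3
}


def compose(u_a, u_b, phi):
    """Composition operator of Eq. (7) with collection of like terms."""
    out = {}
    for s_a, p_a in u_a.items():
        for s_b, p_b in u_b.items():
            s = phi(s_a, s_b)
            out[s] = out.get(s, 0.0) + p_a * p_b
    return out


def system_u(node, element_u):
    """Return the u-function of a subsystem by replacing groups of
    elements with equivalent elements (Steps 3 and 4)."""
    if not isinstance(node, tuple):
        return element_u[node]            # Step 2: a basic element
    kind, *children = node
    parts = [system_u(child, element_u) for child in children]
    return reduce(lambda a, b: compose(a, b, PHI[kind]), parts)


if __name__ == "__main__":
    # Hypothetical element distributions at one time instant.
    fuel = {W: 0.95, FS: 0.03, FD: 0.02}
    turbine = {W: 0.97, FS: 0.02, FD: 0.01}
    element_u = {1: fuel, 2: fuel, 3: fuel, 4: fuel, 5: turbine, 6: turbine}
    unit_a = ("ser", ("par", 1, 2), 5)
    unit_b = ("ser", ("par", 3, 4), 6)
    plant = ("par", unit_a, unit_b)       # assumed top-level connection
    u_sys = system_u(plant, element_u)
    print("P_W, P_FS, P_FD:", u_sys[W], u_sys[FS], u_sys[FD])
    print("safety:", u_sys[W] + u_sys[FS])
```

Repeating the call for a grid of time points, with the element distributions taken from the Markov model at each time, gives the system state probabilities as functions of time.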
With the state probabilities of each element in the form of functions of time, one can
use the algorithm presented above to get the probability values corresponding to any given
time. Finally, the entire system state probabilities and the overall system safety (defined as
the sum of operational probability and failure-safe state probability) as functions of time
can be obtained. In the following section, we use an example to illustrate the procedure
described here.
4. Illustrative example
Consider a combined-cycle power plant with two generating units. Each unit consists of a
gas turbine block and fuel supply systems. The fuel to each turbine block can be supplied
by two parallel systems. The simplified reliability block diagram of the plant is presented
in Fig. 2. Each fuel supply system as well as each turbine can experience both safe and
dangerous failures (detected and undetected).
Fig. 2. Reliability block diagram of the combined-cycle power plant: fuel supply systems 1 and 2
feed turbine block 5, and fuel supply systems 3 and 4 feed turbine block 6.
The parameters of the fuel supply systems are: $\lambda_{sd} = 2.56 \times 10^{-5}$, $\lambda_{su} = 10^{-5}$, $\lambda_{dd} = 8.9 \times 10^{-6}$,
$\lambda_{du} = 1 \times 10^{-6}$; $\mu_{sd} = 0.25$, $\mu_{dd} = 0.0833$, $\mu_{su} = \mu_{du} = 0$; d = 0.99; $T_I$ = 1.5 years. The fuel
supply systems are statistically identical, but the inspection times of systems 2 and 4 are
shifted 0.5 year earlier relative to the inspection times of systems 1 and 3. The matrix $M_{ji}$
associated with each fuel supply system is $M_{1i}$ (i = 1, 2, 3, 4), as shown in Eq. (A2) in the
Appendix.
The turbine blocks are also statistically identical. The parameters of the turbine blocks
are: $\lambda_{sd} = 2.56 \times 10^{-5}$, $\lambda_{su} = 6.540 \times 10^{-6}$, $\lambda_{dd} = 7.9 \times 10^{-6}$, $\lambda_{du} = 7.8 \times 10^{-7}$; $\mu_{sd} = 0.25$, $\mu_{dd} =
0.0625$, $\mu_{su} = \mu_{du} = 0$; d = 0.99; $T_I$ = 2 years. The matrix $M_{ji}$ associated with each turbine
block is $M_{2i}$ (i = 1, 2, 3), as shown in Eq. (A3) in the Appendix.
The probabilities pjW(t), pjFS(t) and pjFD(t) for each system element obtained by solving
equations (2) and (3) for a period of time, 65000 hours, are presented in Fig. 3 - 5. At the
same time, the probabilities PW(t), PFS(t) and PFD(t) for single generating unit and for the
entire system (the structure functions are defined in accordance with Tables 2 and 3,
respectively), obtained using the algorithm given in Section 3, are also presented in Fig. 3
through 5. These figures show that the variations of these probabilities for single
generating unit and the entire system have also the property of periodicity.
The system safety S(t)=PW(t)+PFS(t) as the function of time is presented in Fig. 6.
Fig. 3. Probabilities of working states ($P_W$ versus t in thousands of hours; curves for
elements 1,3, elements 2,4, elements 5,6, single unit and system).
Fig. 4. Probabilities of failure-safe states ($P_{FS}$ versus t in thousands of hours; curves for
elements 1,3, elements 2,4, elements 5,6, single unit and system).
Fig. 5. Probabilities of failure-dangerous states ($P_{FD}$ versus t in thousands of hours; curves for
elements 1,3, elements 2,4, elements 5,6, single unit and system).
Fig. 6. Overall system safety (S versus t in thousands of hours).
5. Conclusions
In this paper a method is proposed for the study of series-parallel systems with
imperfect diagnostics and imperfect periodic inspections and repairs of elements. Element
failures can be failure-safe and failure-dangerous and can be either detected or undetected.
The proposed model incorporates periodic inspection and repair (both perfect and
imperfect) of system elements. The Markov model is used to determine the state
distribution of a single system element, while the universal generating function technique
is used to obtain the state distribution of the entire system. The presented example shows
that the procedure can be easily implemented to estimate the state probabilities and the
overall safety of a safety-critical system.
The method presented in this paper can be applied to different research fields such as
power generation units, electronic devices and chips, data storage based on redundant
array of inexpensive disks (Katz et al., 1989; Gibson and Patterson, 1993, etc.) and so on. It
can be used for evaluating safety of a fault-tolerant single-chip multiple microprocessors
architecture (Yao, et al., 2004) which represents a promising solution to partly mitigate the
system faults and to increase the system dependability in mission-critical applications.
Acknowledgement:
This research was carried out while the first author was visiting National University of
Singapore supported by the research grant R-266-000-020-112 at National University of
Singapore. The authors would like to thank three referees for their constructive comments.
References
Biswas, A.; Sarkar, J. and Sarkar, S. (2003). Availability of a periodically inspected system, maintained
under an imperfect-repair policy. IEEE Transactions on Reliability, 52 (3), 311-318.
Bowles, J.B. and Dobbins, J.G. (2004). Approximate reliability and availability models for high availability
and fault-tolerant systems with repair. Quality and Reliability Engineering International, 20 (7),
679-697.
Bris, R., Chatelet, E. and Yalaoui, F. (2003). New method to minimize the preventive maintenance cost of
series-parallel systems. Reliability Engineering & System Safety, 82 (3), 247-255.
Bukowski, J.W. (2001). Modeling and analyzing the effects of periodic inspection on the performance of
safety-critical systems, IEEE Transactions on Reliability, 50 (2), 321 – 329.
Burgazzi, L. (2003). Reliability evaluation of passive systems through functional reliability assessment.
Nuclear Technology, 144 (2), 145-151.
Carrasco, J.A. (2004). Solving large interval availability models using a model transformation approach.
Computers & Operations Research, 31 (6), 807-861.
Chandrasekhar, P.; Natarajan, R. and Yadavalli, V.S.S. (2004). A study on a two unit standby system with
Erlangian repair time. Asia-Pacific Journal of Operational Research, 21 (3), 271-277
Cowing, M.M.; Pate-Cornell, M.E. and Glynn, P.W. (2004). Dynamic modeling of the tradeoff between
productivity and safety in critical engineering systems. Reliability Engineering & System Safety, 86 (3),
269-284.
Cui, L.R.; Loh, H.T. and Xie, M. (2004). Sequential inspection strategy for multiple systems under
availability requirement. European Journal of Operational Research, 155 (1), 170-177.
DeLong, T.A.; Smith, D.T. and Johnson, B.W. (2005). Dependability metrics to assess safety-critical
systems. IEEE Transactions on Reliability, 54, 498-505.
Dominguez-Garcia, A.D.; Kassakian, J.G. and Schindall, J.E. (2006). Reliability evaluation of the power
supply of an electrical power net for safety-relevant applications. Reliability Engineering & System
Safety, 91, 505-514.
Faller, R. (2004). Project experience with IEC 61508 and its consequences. Safety Science, 42 (5), 405-422.
Gibson G. A. and Patterson D.A. (1993). Designing Disk Arrays for High Data Reliability, Journal of
Parallel and Distributed Computing, 17, 4 – 27.
Goble, W.M. (1998). Control Systems Safety Evaluation and Reliability, 2nd ed: ISA.
Hokstad, P. and Corneliussen, J. (2004). Loss of safety assessment and the IEC 61508 standard.
Reliability Engineering & System Safety, 83 (1), 111-120.
IEC 61508 (1998). Functional safety of electric/electronic/programmable electronic safety-related systems,
Parts. 1–7, October 1998–May 2000.
Inagaki, T. and Ikebe, Y. (1989). Performance analysis of a safety monitoring system under human-machine
interface of safety-presentation type, Microelectronics and Reliability, 29 (2), 1989, 165 – 175.
Kang, H.G. and Jang, S.C. (2006). Application of condition-based HRA method for a manual actuation of
the safety features in a nuclear power plant. Reliability Engineering & System Safety, 91, 627-633.
Katz R.H.; Gibson G.A. and Patterson D. (1989). Disk System Architectures for High Performance
Computing, Proceedings of the IEEE, 77, No. 12, pp. 1842 – 1858.
Kim, H.; Lee, H. and Lee, K. (2005). The design and analysis of AVTMR (all voting triple modular
redundancy) and dual-duplex system. Reliability Engineering & System Safety, 88, 291-300.
Korczak, E.; Levitin, G. and Ben-Haim, H. (2005). Survivability of series-parallel systems with multilevel
protection. Reliability Engineering & System Safety, 66, 45-54.
Knegtering, B. and Brombacher, A.C. (1999). Application of micro Markov models for quantitative safety
assessment to determine safety integrity levels as defined by the IEC 61508 standard for functional
safety. Reliability Engineering & System Safety, 66 (2), 171-175.
Latif-Shabgahi, G.; Bass, J.M. and Bennett, S. (2004). Taxonomy for software voting algorithms used in
safety-critical systems. IEEE Transactions on Reliability, 53 (3), 319-328.
Lee, D.Y.; Han, J.B. and Lyou, J. (2004). Reliability analysis of the reactor protection system with fault
diagnosis. Key Engineering Materials, 270, 1749-1754.
Levitin, G. (2004). A universal generating function approach for the analysis of multi-state systems with
dependent elements. Reliability Engineering & System Safety, 66, 285-292.
Levitin, G. (2005). Uneven allocation of elements in linear multi-state sliding window system. European
Journal of Operational Research, 163, 418-433.
Levitin, G.; Lisnianski, A.; Ben-Haim, H. and Elmakis, D. (1998). Redundancy optimization for series-parallel
multi-state systems, IEEE Transactions on Reliability, 47 (2), 165-172.
Lisnianski, A. and Levitin, G. (2003). Multi-state System Reliability, World Scientific, Singapore.
Levitin, G. (2005). The Universal Generating Function in Reliability Analysis and Optimisation.
Springer-Verlag: Berlin, Springer Series in Reliability Engineering.
Marseguerra, M.; Zio, E. and Podofillini, L. (2004). A multiobjective genetic algorithm approach to the
optimization of the technical specifications of a nuclear safety system. Reliability Engineering & System
Safety, 84 (1), 87-99.
Nunns, S.R. (2000). Conformity assessment of safety related systems to IEC 61508 - the CASS initiative.
Computing & Control Engineering Journal, 11 (1), 33-39.
Olbrich, T; Richardson, A.M.D. and Bradley, D.A. (1996). Built-in self-test and diagnostic support for safety
critical Microsystems, Microelectronics and Reliability, 36, 1125– 1136.
Son, H.S. and Seong, P.H. (2003). Development of a safety critical software requirements verification
method with combined CPN and PVS: a nuclear power plant protection system application. Reliability
Engineering & System Safety, 80 (1), 19-32.
Ushakov I., (1987). Optimal standby problems and a universal generating function, Soviet Journal of
Computer System Science, 25, 79-82.
Wang, D. and Inagaki, T. (1994).Time-dependent optimality of an alarm subsystem, Microelectronics and
Reliability, 34, 1623 – 1633.
Weber, W.; Tondok, H. and Bachmayer, M.B. (2005). Enhancing software safety by fault trees: experiences
from an application to flight critical software. Reliability Engineering & System Safety, 89, 57-70.
Yao, W.B.; Wang D.S. and Zheng W.M. (2004). A Fault-tolerant Single-chip Multiprocessor, ACSAC 2004
 Proceedings of Advances in Computer Systems Architecture: 9 th Asia-Pacific Conference, Pen-Cheng
Yew and Jingling Xue (eds.), Berlin: Springer, 2004, p. 137-145.
Zhang, T.L.; Long, W. and Sato, Y. (2003). Availability of systems with self-diagnostic
components—applying Markov model to IEC 61508-6, Reliability Engineering & System Safety, 80,
133 – 141.
Zhang, T.L.; Xie, M. and Horigome, M. (2006). Availability and reliability of k-out-of-(M plus N): G warm
standby systems. Reliability Engineering & System Safety, 91, 381-387.
Zhou, Z. (1987). Analysis of a two unit standby redundant fail-safe system. Microelectronics and Reliability,
27, 469 – 474.
Appendix
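The transition rate matrix for one element is
$$
\Lambda_j =
\begin{pmatrix}
-c & \lambda_{su} & \lambda_{sd} & \lambda_{du} & \lambda_{dd} & 0 & 0 & 0 & 0 \\
0 & -(\lambda_{sd}+\lambda_{dd}) & 0 & 0 & 0 & \lambda_{sd} & 0 & \lambda_{dd} & 0 \\
\mu_{sd} & 0 & -(\lambda_{su}+\lambda_{du}+\mu_{sd}) & 0 & 0 & \lambda_{su} & \lambda_{du} & 0 & 0 \\
0 & 0 & 0 & -(\lambda_{sd}+\lambda_{dd}) & 0 & 0 & \lambda_{sd} & 0 & \lambda_{dd} \\
\mu_{dd} & 0 & 0 & 0 & -(\lambda_{su}+\lambda_{du}+\mu_{dd}) & 0 & 0 & \lambda_{su} & \lambda_{du} \\
0 & \mu_{sd} & 0 & 0 & 0 & -\mu_{sd} & 0 & 0 & 0 \\
0 & 0 & 0 & \mu_{sd} & 0 & 0 & -\mu_{sd} & 0 & 0 \\
0 & \mu_{dd} & 0 & 0 & 0 & 0 & 0 & -\mu_{dd} & 0 \\
0 & 0 & 0 & \mu_{dd} & 0 & 0 & 0 & 0 & -\mu_{dd}
\end{pmatrix}
$$
(A1)
Read row by row, the off-diagonal entries are consistent with the state order (detected,
undetected) = (W, W), (W, FSu), (FSd, W), (W, FDu), (FDd, W), (FSd, FSu), (FSd, FDu),
(FDd, FSu), (FDd, FDu) for rows and columns 1 to 9; every row sums to zero,
where $c = \lambda_{sd} + \lambda_{dd} + \lambda_{du} + \lambda_{su}$.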
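As a cross-check of Eq. (A1), the transition rate matrix can be assembled from transition rules rather than typed entry by entry. The Python sketch below is one possible reading of the matrix, not the authors' code: it assumes that a detected failure can occur only while no detected failure is present, that an undetected failure can occur only while no undetected failure is present, and that only detected failures are repaired between proof-tests. The rates are the fuel supply system values quoted in Section 4, taken here to be per hour (an assumption), and the diagnostic fraction d is not modeled. A small pure-Python matrix exponential (Taylor series with scaling and squaring) evaluates the first line of Eq. (2) without external libraries.

```python
import math

# State order (detected, undetected) assumed for rows/columns 1..9.
STATES = [("W", "W"), ("W", "FSu"), ("FSd", "W"), ("W", "FDu"), ("FDd", "W"),
          ("FSd", "FSu"), ("FSd", "FDu"), ("FDd", "FSu"), ("FDd", "FDu")]


def generator(lam, mu):
    """Build the 9x9 transition rate matrix from the transition rules."""
    idx = {s: i for i, s in enumerate(STATES)}
    q = [[0.0] * len(STATES) for _ in STATES]
    for (d, u), i in idx.items():
        moves = []
        if d == "W":      # a detected failure may occur
            moves += [(("FSd", u), lam["sd"]), (("FDd", u), lam["dd"])]
        else:             # on-line repair of the detected failure
            moves.append((("W", u), mu["sd"] if d == "FSd" else mu["dd"]))
        if u == "W":      # an undetected failure may occur
            moves += [((d, "FSu"), lam["su"]), ((d, "FDu"), lam["du"])]
        for target, rate in moves:
            q[i][idx[target]] += rate
        q[i][i] = -sum(rate for _, rate in moves)
    return q


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def expm(a, t, terms=20):
    """exp(a*t) by Taylor series with scaling and squaring."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a) * abs(t)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scale = t / (2 ** squarings)
    b = [[x * scale for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, b)]
        result = [[r + s for r, s in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def element_distribution(p):
    """Aggregate the nine state probabilities by Table 1."""
    dist = {"W": 0.0, "FS": 0.0, "FD": 0.0}
    for (d, u), prob in zip(STATES, p):
        key = "FD" if "FD" in d + u else ("FS" if "FS" in d + u else "W")
        dist[key] += prob
    return dist


if __name__ == "__main__":
    lam = {"sd": 2.56e-5, "su": 1e-5, "dd": 8.9e-6, "du": 1e-6}
    mu = {"sd": 0.25, "dd": 0.0833}
    t_i = 1.5 * 8760.0                      # assumed hours per year
    p0 = [[1.0] + [0.0] * 8]
    p = matmul(p0, expm(generator(lam, mu), t_i))[0]
    print(element_distribution(p))
```

Because the aggregation uses the state labels rather than the state numbers, the resulting element distribution does not depend on how the nine states are numbered.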
The matrices $M_{1i}$ (i = 1, 2, 3, 4) for the fuel supply system have columns $p_1, \ldots, p_9$.
In each matrix, columns $p_5$ to $p_9$ are zero vectors $0_9$; in columns $p_1$ to $p_4$, rows 5 to 9
are $(1_5, 0_5, 0_5, 0_5)$ and rows 1 to 4 are
$M_{11}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.90 & 0.10 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.80 & 0 & 0 & 0.20 \end{pmatrix}$,
$M_{12}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.88 & 0.12 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.776 & 0 & 0 & 0.224 \end{pmatrix}$,
$M_{13}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.85 & 0.15 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.747 & 0 & 0 & 0.253 \end{pmatrix}$,
$M_{14}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.808 & 0.192 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.711 & 0 & 0 & 0.289 \end{pmatrix}$.
(A2)
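The matrices $M_{2i}$ (i = 1, 2, 3) for the turbine block have columns $p_1, \ldots, p_9$.
In each matrix, columns $p_5$ to $p_9$ are zero vectors $0_9$; in columns $p_1$ to $p_4$, rows 5 to 9
are $(1_5, 0_5, 0_5, 0_5)$ and rows 1 to 4 are
$M_{21}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.92 & 0.08 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.85 & 0 & 0 & 0.15 \end{pmatrix}$,
$M_{22}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.804 & 0.096 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.832 & 0 & 0 & 0.168 \end{pmatrix}$,
$M_{23}$: $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.882 & 0.118 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0.810 & 0 & 0 & 0.190 \end{pmatrix}$.
(A3)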
Gregory Levitin received a PhD degree in Industrial Automation from Moscow Research Institute of
Metalworking Machines in 1989. From 1982 to 1990 he worked as software engineer and research associate
in the field of industrial automation. From 1991 to 1993 he worked at the Technion (Israel Institute of
Technology) as a postdoctoral fellow at the faculty of Industrial Engineering and Management. Dr. Levitin is
presently an engineer-expert at the Reliability Department of the Israel Electric Corporation and adjunct
senior lecturer at the Technion. His current interests are in operations research and artificial intelligence
applications in reliability and power engineering. In this field Dr. Levitin has published over 100 papers and
two books. He is senior member of IEEE. He serves in editorial boards of IEEE Transactions on Reliability
and Reliability Engineering and System Safety.
Tieling Zhang received a Ph.D. in engineering from Tokyo University of Mercantile Marine in 2001. He
has six years’ experience of teaching, three years’ working in industry and a few years holding research
positions. Currently he is with Hitachi GST, Singapore. He has 30 articles included in peer-review journals
and international conference proceedings. He holds a new practical patent of China. His research interests
include system reliability, maintainability and safety, system optimization and vibration control.
Min Xie received his Ph.D. in Quality Technology from Linkoping University, Sweden, in 1987. Dr Xie has
been active in reliability and quality related research since then. He has authored or co-authored over 100
articles in refereed journals and 6 books, including Software Reliability Modelling by World Scientific,
Statistical Models and Control Charts for High Quality Processes by Kluwer Academic Publisher, and
Weibull Models by John Wiley & Sons. He is a department editor of IIE Transactions, an associate editor of
IEEE Trans on Reliability, and on the editorial board of several other journals. He is a fellow of IEEE.