System Reliability and Risk Assessment: A Quantitative Extension

From: AAAI Technical Report SS-94-04. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.
System Reliability and Risk Assessment: A Quantitative Extension of IDEF Methodologies
Andrew Kusiak, Intelligent Systems Laboratory, Department of Industrial Engineering
Nick Larson, Intelligent Systems Laboratory, Department of Industrial Engineering
The University of Iowa, Iowa City, Iowa 52242-1527
1. Introduction
Evaluating system reliability requires modeling the interaction of resources, information, and material within the system. Such a model must consider quantitative data describing the reliability of each element of the system, as well as logical data describing the relationships between individual components. For example, a manufacturing system may assemble products X, Y, and Z on machines M1, M2, and M3, respectively, and package the products on a fourth machine, M4. Therefore, three different relationships exist between the components of the system, one for each product. If the reliability of each machine differs, the system reliability will differ depending on the product. Similarly, the reliability of the system as a whole will be affected by the production levels of products X, Y, and Z.
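The example can be made concrete with a short Python sketch. The machine reliabilities and production shares are hypothetical, and two modeling assumptions not stated in the text are made: each product's route is treated as a series of independent machines, and the system-level figure is taken as the production-share-weighted average of the per-product values.

```python
# Hypothetical machine reliabilities over one mission time (not from the paper).
machine_reliability = {"M1": 0.99, "M2": 0.95, "M3": 0.90, "M4": 0.98}

# Each product is assembled on one machine and packaged on M4.
routes = {"X": ["M1", "M4"], "Y": ["M2", "M4"], "Z": ["M3", "M4"]}

def route_reliability(route, reliability):
    """Series assumption: every machine on the route must work."""
    result = 1.0
    for machine in route:
        result *= reliability[machine]
    return result

def system_reliability(routes, reliability, mix):
    """Assumed aggregate: per-product reliability weighted by production share."""
    return sum(mix[p] * route_reliability(r, reliability) for p, r in routes.items())

per_product = {p: route_reliability(r, machine_reliability) for p, r in routes.items()}
balanced = system_reliability(routes, machine_reliability, {"X": 1/3, "Y": 1/3, "Z": 1/3})
z_heavy = system_reliability(routes, machine_reliability, {"X": 0.1, "Y": 0.1, "Z": 0.8})
```

Under these assumptions, shifting production toward product Z, which uses the least reliable machine, lowers the aggregate figure relative to the balanced mix.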
Given the example above, with only three products and four machines, it becomes apparent that determining system reliability requires a significant amount of data and a structured modeling methodology. Furthermore, to obtain an accurate assessment of system reliability, it is necessary to include additional data describing information and material components of the system, such as inspection procedures, assembly specifications, parts, and subassemblies.
System reliability describes the likelihood of success or failure in the operation of a system. Risk assessment is a technique for identifying scenarios that lead to problems in a system, determining the likelihood of each scenario, and evaluating the consequence of each scenario, i.e., the problem. Although quantitative approaches to risk assessment employ the principles of system reliability, the identification, quantification, and evaluation of risk is a more comprehensive modeling activity. Therefore, risk assessment projects are well suited for the tools developed for process modeling and analysis.
In 1978, the United States Air Force selected SADT (Structured Analysis and Design Technique) as the language to support the Integrated Computer Aided Manufacturing (ICAM) program. SADT activity modeling was adopted by the ICAM program and revised by SofTech, Inc. to develop the ICAM Definition Methodology (IDEF0). Ross (1985) states that "thousands of people from hundreds of organizations working on more than one hundred major projects" proceeded to use the methodology for system definition and design, as well as project management. IDEF0 introduced manufacturing to techniques that were developed for computer system and software engineering applications. Additional IDEF techniques were developed for information analysis (IDEF1), dynamic analysis (IDEF2), and process modeling (IDEF3).
This paper presents procedures for integrating system reliability and risk assessment techniques with IDEF0 and IDEF3 modeling. The paper is motivated by the need to increase the value of IDEF0 and IDEF3 models by incorporating quantitative data, thus extending model use to applications such as risk assessment. By extending the power of existing IDEF models, process modeling will become more attractive to management and evolve as a powerful tool for reengineering design and manufacturing systems.
2. Definitions of System Reliability and Risk Assessment
Reliability may be defined as the ability of an item (product, system, etc.) to operate under designated operating conditions for a designated period of time or number of cycles (Modarres, 1993). An item's reliability is often measured by the probability it will perform without failure given a set of conditions. The following expression is a probabilistic representation for reliability.
$$R(t) = P(T > t \mid c_1, c_2, \ldots) \tag{1}$$
In (1), $t$ is the period of time for the item's operation, $T$ is the time to failure of the item, $R(t)$ is the reliability of the item, and $c_1, c_2, \ldots$ are the conditions under which the item is operating. The variable $t$ is often referred to as the mission time. In practice, $T$ is a random variable representing the time-to-failure of the item, and $c_1, c_2, \ldots$ are implicitly considered. Furthermore, $f(t)$ may represent the probability density function of the random variable $T$. The probability that the item fails prior to time $t$ is defined in (2).
$$P(T < t) = \int_0^t f(\theta)\,d\theta = F(t), \quad \text{for } t > 0 \tag{2}$$
Since $F(t)$ denotes the probability the item will fail prior to time $t$, it is formally the unreliability of the item. Thus, the reliability of the item is determined by (3).
$$R(t) = 1 - F(t) = \int_t^{\infty} f(\theta)\,d\theta \tag{3}$$
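As a worked illustration of (1) through (3), the following Python sketch assumes an exponentially distributed time to failure with a hypothetical failure rate; the paper does not prescribe a distribution. It evaluates F(t) by numerically integrating f and compares R(t) = 1 - F(t) with the closed form exp(-rate * t) that holds for this particular distribution.

```python
import math

FAILURE_RATE = 0.002  # hypothetical failures per hour (assumption, not from the paper)

def pdf(theta, rate=FAILURE_RATE):
    """f(theta) for an exponentially distributed time to failure."""
    return rate * math.exp(-rate * theta)

def unreliability(t, steps=10_000):
    """F(t): integral of f from 0 to t, using the trapezoidal rule."""
    h = t / steps
    total = 0.5 * (pdf(0.0) + pdf(t))
    for k in range(1, steps):
        total += pdf(k * h)
    return total * h

def reliability(t):
    """R(t) = 1 - F(t), as in (3)."""
    return 1.0 - unreliability(t)

mission_time = 100.0  # hours, hypothetical
r_numeric = reliability(mission_time)
r_closed_form = math.exp(-FAILURE_RATE * mission_time)
```

The two values should agree closely; any other density could be substituted for `pdf` without changing `unreliability` or `reliability`.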
A system is a collection of entities (i.e., information and material) and resources (i.e., machines and workers) which interact to perform a set of activities in a given process. Successful completion of the process is dependent upon proper completion of the individual activities in the process. Therefore, it is necessary to model the relationship between various items (entities and resources), as well as the reliability of individual items, to assess the reliability of the system. Complex manufacturing systems produce many different products through many different sequences of manufacturing activities. The system reliability is a function of the activities performed and, thus, is product dependent. Therefore, if the production volume of a product exhibiting low system reliability is increased, the reliability of the entire manufacturing system will decrease. At this point, the necessity for evaluating system reliability in a manufacturing setting becomes obvious.
Many of the common techniques for modeling system reliability are difficult to apply to complex manufacturing systems with multiple product types. Therefore, the principles of such tools are more useful when applied to modeling schemes developed for manufacturing systems, such as IDEF0 and IDEF3. Also, the task of evaluating system reliability in a manufacturing setting is more attractive if performed as a component of a risk assessment study.
Risk is a measure of the probability and severity of adverse effects (Lowrance, 1976). Several types of risk associated with project planning and software development have been cited in the literature; see Ang and Gay (1993) and Chittister and Haimes (1993). The following manufacturing risks are generalized from those cited in various engineering disciplines.
1. Requirements risk. The concept of what the product is intended to accomplish is not accurate.
2. Technical risk. The product does not adhere to the requirements set forth by its design.
3. Schedule risk. The product will not be completed by the deadline set forth by production planning.
4. Cost risk. The production cost will overrun its budget.
5. Network risk. The mechanism for linking various production activities will not perform as intended.
Risk assessment is a process that attempts to answer three questions: (1) What can go wrong? (2) What is the likelihood that it will go wrong? (3) What are the consequences? (Kaplan and Garrick, 1981). Based on these questions, (4) is a quantitative definition of risk, where $S_i$ is a scenario of events that leads to a problem, $P_i$ is the likelihood of scenario $i$, and $C_i$ is the consequence of scenario $i$.
$$R = \{S_i, P_i, C_i\}, \quad i = 1, 2, \ldots, n \tag{4}$$
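The triplet in (4) maps naturally onto a small record type. The Python sketch below uses hypothetical scenarios, likelihoods, and dollar consequences, and the class and field names are illustrative. The probability-weighted sum it computes is one common way to summarize such a set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """One risk triplet (S_i, P_i, C_i); field names are illustrative."""
    description: str    # S_i: what can go wrong
    likelihood: float   # P_i: probability the scenario occurs
    consequence: float  # C_i: loss if it occurs, here in dollars

# Hypothetical values, not taken from the paper.
risk = [
    Scenario("Inspection misses defective subassembly", 0.02, 50_000.0),
    Scenario("Packaging machine down for a shift", 0.10, 8_000.0),
    Scenario("Assembly specification out of date", 0.01, 120_000.0),
]

def expected_loss(scenarios):
    """Probability-weighted sum of consequences."""
    return sum(s.likelihood * s.consequence for s in scenarios)

# Rank scenarios by their individual contribution to the expected loss.
ranked = sorted(risk, key=lambda s: s.likelihood * s.consequence, reverse=True)
total = expected_loss(risk)
```

Ranking by contribution shows that a rare scenario with a large consequence can outweigh a frequent one with a small consequence.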
This section has provided definitions of system reliability and risk assessment. Section 4 discusses techniques for determining system reliability and integrating risk assessment and IDEF models. Section 5 discusses issues related to risk assessment, such as developing quantitative risk models based on IDEF0 and IDEF3.
3. Fundamentals of IDEF0 and IDEF3
IDEF0 was developed for modeling a wide variety of systems which use hardware, software, and people to perform activities (U. S. Air Force, 1981). An IDEF0 model consists of three components: diagrams, text, and a glossary, all cross-referenced to each other. The box and arrow diagrams are the major components of the model. In a diagram, a box represents a function and an arrow represents an interface. A box is assigned an active verb phrase to represent the function. An interface may be an input, an output, a control, or a mechanism, and is assigned a descriptive noun phrase. Inputs (I) enter the box from the left, are transformed by the function, and exit the box to the right as an output (O). A control (C) enters the top of the box and influences or determines the function performed. A mechanism (M) is a tool or resource which performs the function. The interfaces are generally referred to as the ICOMs (see Figure 1).
[Figure 1. IDEF0 function box and interface arrows: Input (I), Control (C), Output (O), and Mechanism (M) around a FUNCTION box]
Each diagram has between three and six function boxes placed on a diagonal. The boxes each have a specific node number and are connected by all relevant interfaces. Each box on the diagram may be decomposed into a lower level of detail. This feature restricts the amount of information that may be contained in the model on a single level. The resulting diagrams form a hierarchy of information which is summarized in a node tree.
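One way to hold a function box, its ICOMs, and its decomposition in software is sketched below in Python. The class, the field names, the node numbers, and the example functions are all hypothetical; IDEF0 itself defines only the diagram notation.

```python
from dataclasses import dataclass, field

@dataclass
class FunctionBox:
    """An IDEF0 function box with its ICOMs (names are illustrative)."""
    node: str                                       # node number, e.g. "A1"
    name: str                                       # active verb phrase
    inputs: list = field(default_factory=list)      # enter from the left
    controls: list = field(default_factory=list)    # enter from the top
    outputs: list = field(default_factory=list)     # exit to the right
    mechanisms: list = field(default_factory=list)  # tools or resources
    children: list = field(default_factory=list)    # lower-level decomposition

    def node_tree(self, depth=0):
        """Yield (depth, node, name) for this box and everything below it."""
        yield depth, self.node, self.name
        for child in self.children:
            yield from child.node_tree(depth + 1)

# Hypothetical two-level model with three boxes on the lower diagram.
assemble = FunctionBox("A1", "Assemble product", inputs=["parts"],
                       controls=["assembly specification"],
                       outputs=["assembled product"], mechanisms=["machine M1"])
inspect_box = FunctionBox("A2", "Inspect product", inputs=["assembled product"],
                          controls=["inspection procedure"],
                          outputs=["accepted product"], mechanisms=["inspector"])
package = FunctionBox("A3", "Package product", inputs=["accepted product"],
                      outputs=["packaged product"], mechanisms=["machine M4"])
top = FunctionBox("A0", "Produce product", children=[assemble, inspect_box, package])
tree = list(top.node_tree())
```

Keeping the decomposition in `children` means the node tree is simply a depth-first walk of the model.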
IDEF0 provides a structured representation of the functions, information, and objects which are interrelated in a manufacturing system. IDEF3 was created specifically to model the sequence of activities performed in a manufacturing system. An IDEF3 model enables an expert to communicate the process flow of a system through defining a sequence of activities and the relationships between those activities. There are two basic components of the IDEF3 process description language: the process flow description and the object state transition network description. The two components are cross-referenced to build IDEF3 diagrams (Mayer et al., 1992).
[Figure 2. IDEF3 process flow diagram (Mayer et al., 1992). Box labels: Evaluate, Reject proposal, Negotiate, Accept, Award contract]
The IDEF3 process flow description is made up of units of behavior (UOBs), links, and junction boxes. A UOB represents a function or activity occurring in the process. For example, assemble parts, perform inspection, or evaluate proposal are all activities which may be represented as UOBs in a process model. Relationships between UOBs are modeled with three types of links: precedence links, relational links, and object flow links. Precedence links express simple temporal precedence between UOBs. Relational links highlight the existence of a relationship between two or more UOBs; however, no temporal constraint is implied. Object flow links provide a mechanism for capturing object related constraints between UOBs and carry the same temporal semantics as a precedence link. The logic of branching within a process is modeled using junctions. Several classifications are used to define junction boxes. Junctions are classified according to logical semantics as and (&), or (O), and exclusive or (X). Multiple process paths are classified as fan-in or fan-out, corresponding to converging and diverging paths, respectively. The relative timing of process paths that converge or diverge at a junction is classified as synchronous or asynchronous. An example of an IDEF3 process flow diagram is shown in Figure 2 (Mayer et al., 1992).
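A process flow description of this kind can be captured in a few plain data types. The Python sketch below is one possible encoding with hypothetical type and field names. The small example process (inspect, then either rework or ship) is invented for illustration and is not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum

class LinkType(Enum):
    PRECEDENCE = "precedence"
    RELATIONAL = "relational"
    OBJECT_FLOW = "object flow"

class JunctionLogic(Enum):
    AND = "&"
    OR = "O"
    XOR = "X"

@dataclass(frozen=True)
class UOB:
    number: int
    label: str

@dataclass(frozen=True)
class Junction:
    name: str
    logic: JunctionLogic
    fan_out: bool              # True for diverging paths, False for converging
    synchronous: bool = False

@dataclass(frozen=True)
class Link:
    source: object             # a UOB or a Junction
    target: object
    kind: LinkType = LinkType.PRECEDENCE

def implies_temporal_order(link):
    """Precedence and object flow links constrain order; relational links do not."""
    return link.kind in (LinkType.PRECEDENCE, LinkType.OBJECT_FLOW)

inspect_part = UOB(1, "Perform inspection")
rework = UOB(2, "Rework part")
ship = UOB(3, "Ship part")
j1 = Junction("J1", JunctionLogic.XOR, fan_out=True)
links = [Link(inspect_part, j1), Link(j1, rework), Link(j1, ship),
         Link(rework, ship, LinkType.RELATIONAL)]
```

Separating the link type from the junction logic mirrors the text: links say how two elements are related, while junctions say how paths branch or merge.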
4. Integrating System Reliability Techniques and IDEF Models
As stated in Section 2, system reliability tools may be quite useful when applied to modeling schemes developed for manufacturing systems, such as IDEF0 and IDEF3. In this section, several system reliability modeling techniques are integrated with IDEF0 and IDEF3. For a detailed discussion of each technique, see Modarres (1993).
4.1 Reliability Block Diagrams
Reliability block diagrams model the effect of component failure on system performance by capturing the physical arrangement of the system. Typical system configurations include series systems, parallel systems, standby redundant systems, shared load systems, complex parallel-series systems, and complex nonparallel-series systems (Modarres, 1993). Additional system configurations may be identified in various applications; however, most manufacturing systems may be accurately described using those listed above. Figure 3 shows reliability block diagrams for series and parallel systems and Figure 4 illustrates complex systems. The reliability of complex systems may be calculated using various analytical methods; however, such methods become computationally intensive as the number of components increases (Shooman, 1990).
The system configurations illustrated in Figures 3 and 4 may also be used to describe IDEF0 and IDEF3 models. Ang and Gay (1993) discuss extensions to IDEF0 models which enable project risk assessment. Several project situations are described which may be generalized to the system configurations described above. The extensions are efficient for including quantitative data in an IDEF0 model, such as a probability of occurrence. However, due to the decomposition principle of IDEF0, it is difficult to identify complex system configurations in the model. The concepts of reliability block diagrams are more easily adapted to IDEF3 models. Consider Figure 2; the five activities in the model are arranged in a complex parallel-series configuration. Unlike IDEF0 models, the number of activities (i.e., functions) in IDEF3 models is not restricted to six per level. Although IDEF3 allows for elaboration on a particular UOB (i.e., activity), the entire process flow may be constructed on a single level. This representation is more suitable for IDEF applications of risk assessment which are based on the principles of reliability block diagrams.
[Figure 3. Series (a) and parallel (b) system configurations]
In IDEF3, series systems (or the series components of a complex system) are identified by UOBs connected by precedence and/or object flow links. A series of UOBs may not contain a junction of any type; however, a junction box may begin or terminate a series. Furthermore, a UOB is independent if it is not connected to another UOB by a relational link. The reliability of a series of independent UOBs is defined by (5), where $R_s(t)$ is the reliability of the system, the system contains $N$ UOBs, and $R_i(t)$ is the reliability of the $i$th UOB.
$$R_s(t) = R_1(t) \times R_2(t) \times \cdots \times R_N(t) = \prod_{i=1}^{N} R_i(t) \tag{5}$$
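Equation (5) amounts to a single product. The Python sketch below applies it to three UOBs with hypothetical reliabilities; the function name is illustrative.

```python
import math

def series_reliability(uob_reliabilities):
    """R_s(t) as the product of R_i(t), per (5); assumes independent UOBs."""
    return math.prod(uob_reliabilities)

# Hypothetical R_i(t) values for three UOBs joined by precedence links.
uob_values = [0.99, 0.97, 0.95]
r_series = series_reliability(uob_values)
```

Because every factor is at most 1, the series value can never exceed the reliability of the weakest UOB in the chain.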
[Figure 4. Complex parallel-series (a) and nonparallel-series (b) systems]
Parallel systems are defined in IDEF3 using junction boxes. Each UOB immediately following an and (&) junction box will be performed in parallel. Therefore, the reliability of the parallel system following an & junction box is determined by (5) as well. However, only one UOB immediately following an exclusive or (X) junction box is performed. Thus, the system reliability for parallel UOBs following an X junction box is defined by (6), where the system contains $N$ UOBs, $R_i(t)$ is the reliability of the $i$th UOB, and $P_i$ is the probability of occurrence of the $i$th UOB ($P_1 + P_2 + \cdots + P_N = 1$).
$$R_s(t) = (P_1 \times R_1(t)) + (P_2 \times R_2(t)) + \cdots + (P_N \times R_N(t)) = \sum_{i=1}^{N} (P_i \times R_i(t)) \tag{6}$$
The system reliability for parallel UOBs following an or (O) junction box is more difficult to determine. At least one, and as many as all, of the UOBs following an O junction box may be executed. Unlike the & and X cases, the number of different UOBs performed in parallel is unknown. However, if the system contains $N$ UOBs, $R_i(t)$ is the reliability of the $i$th UOB, and $P_i$ is the probability of occurrence of the $i$th UOB ($P_i \le 1$), a lower bound on the system reliability is given by (7). This describes the worst case, where all UOBs are executed and subject to failure. If the reliability is to be evaluated for a known set of $M$ UOBs in the parallel system ($M \le N$), (5) may be used. Figure 5 illustrates parallel system UOBs for &, X, and O junction boxes in IDEF3.
$$R_s(t) \ge (P_1 \times R_1(t)) \times (P_2 \times R_2(t)) \times \cdots \times (P_N \times R_N(t)) = \prod_{i=1}^{N} (P_i \times R_i(t)) \tag{7}$$
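The three junction cases can be sketched in Python as follows. The reliabilities and occurrence probabilities are hypothetical and the function names are illustrative. The & case reuses the product in (5), the X case is the probability-weighted sum in (6), and the O case is the product-form lower bound in (7).

```python
import math

def and_junction(reliabilities):
    """All UOBs after an & junction are performed, so (5) applies."""
    return math.prod(reliabilities)

def xor_junction(probabilities, reliabilities):
    """Exactly one UOB runs after an X junction: sum of P_i * R_i(t), per (6)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("occurrence probabilities must sum to 1")
    return sum(p * r for p, r in zip(probabilities, reliabilities))

def or_junction_lower_bound(probabilities, reliabilities):
    """Lower bound after an O junction: product of P_i * R_i(t), per (7)."""
    return math.prod(p * r for p, r in zip(probabilities, reliabilities))

r = [0.98, 0.95, 0.90]  # hypothetical R_i(t)
r_and = and_junction(r)
r_xor = xor_junction([0.5, 0.3, 0.2], r)
r_or_low = or_junction_lower_bound([0.9, 0.8, 0.6], r)
```

Since every occurrence probability is at most 1, the O-junction bound never exceeds the & value for the same UOBs, which is what makes it conservative.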
Two or more UOBs connected by relational links may be considered a single UOB when calculating system reliability in IDEF3 models. In Figure 2, UOBs 3 and 4 are connected by a relational link, implying an interaction with UOB 4 if UOB 3 is executed. In this example, successfully executing UOB 3 will also require successful completion of UOB 4 (i.e., $R_3(t)$ is replaced by $R_3(t) \times R_4(t)$). However, UOB 4 does not require execution of UOB 3 and, therefore, is not dependent upon its success. The direction of the relational link determines the reliability of the connected UOBs. In general, if UOB $i$ is connected to UOB $(i + 1)$ with a relational link in the direction of UOB $(i + 1)$, the success or failure of UOB $i$ will determine the success or failure of UOB $(i + 1)$. In IDEF3 applications of system reliability modeling, relational links may be avoided by combining related activities into a single UOB.
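The adjustment for UOBs 3 and 4 can be sketched in Python. To avoid depending on how link direction is drawn, each relational link is written here as a hypothetical (dependent, required) pair; the reliabilities are also hypothetical, and chains or cycles of relational links are assumed not to occur.

```python
def effective_reliability(base, relational_links):
    """Fold the reliability of each required UOB into its dependent UOB.

    base: {uob_number: R_i(t)}
    relational_links: (dependent, required) pairs, meaning that executing
    `dependent` also requires `required` to succeed.
    Assumes independent UOBs and no chains or cycles of relational links.
    """
    effective = dict(base)
    for dependent, required in relational_links:
        effective[dependent] *= base[required]
    return effective

base = {3: 0.95, 4: 0.90}  # hypothetical R_3(t) and R_4(t)
adjusted = effective_reliability(base, [(3, 4)])
```

Only the dependent UOB changes: UOB 3 now carries the product of both reliabilities, while UOB 4 keeps its own value.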
[Figure 5. Parallel systems modeled using IDEF3 notation for and (a), exclusive or (b), and or (c) junction boxes]
4.2 Path Set and Cut Set Methods
Path set and cut set methods were developed to determine the reliability of complex systems described by reliability block diagrams. However, the principles of these methods are very useful in the analysis of IDEF3 models. A path set is a set of units (i.e., activities, functions, UOBs) that form a connection between input and output when traversed in the direction of the arrows (Modarres, 1993). For each path through an IDEF3 model, a minimal path set will exist containing only the UOBs on the path. Once again, consider Figure 2. There are three minimal path sets in the IDEF3 model: $P_1 = (1, 2)$, $P_2 = (1, 3, 5)$, and $P_3 = (1, 4, 5)$. Each path set represents an event that would successfully accomplish the objective of the system if each of the UOBs on the path executes successfully. Therefore, the union of all $m$ path sets defines the set of all successful completions of the system. The probability of this union represents the reliability of the system, as shown in (8).
$$R_s(t) = \mathrm{Prob}(P_1 \cup P_2 \cup \cdots \cup P_m) \tag{8}$$
Unfortunately, evaluating (8) by summing the path set probabilities requires that the path sets ($P_i$) be disjoint and, in practice, this is seldom true. An upper bound on the system reliability may be determined by assuming that the path sets are disjoint, as in (9). However, for reliability values greater than 0.9 for the mission time, as in most practical applications, (9) does not yield a useful bound (Modarres, 1993).
$$R_s(t) \le \mathrm{Prob}(P_1) + \mathrm{Prob}(P_2) + \cdots + \mathrm{Prob}(P_m) \tag{9}$$
A cut set is a set of units (i.e., activities, functions, UOBs) that interrupt all possible connections between the input and output points in the diagram (Modarres, 1993). In IDEF3 process flow models, the minimal cut set is the smallest set of UOBs which prevent flow from input to output. Failure of all UOBs in the minimal cut set results in system failure. The minimal cut sets in Figure 2 are $C_1 = (1)$, $C_2 = (2, 3, 4)$, and $C_3 = (5)$. If the model has $n$ minimal cut sets and $C_i$ represents the event that all UOBs in the cut set fail prior to the mission time $t$, the system reliability is obtained from (10). Since the probability that all the UOBs fail in at least one of the cut sets is the probability that the system fails, this value is subtracted from 1 to obtain the reliability of the system.
$$R_s(t) = 1 - \mathrm{Prob}(C_1 \cup C_2 \cup \cdots \cup C_n) \tag{10}$$
As in (8), the union in (10) is not usually disjoint; thus, (11) gives the lower bound for system reliability. Since the probability of failure is used to evaluate the cut sets and, in practice, these values are much lower than reliabilities, the lower bound in (11) is a better representation of system reliability.
$$R_s(t) \ge 1 - [\mathrm{Prob}(C_1) + \mathrm{Prob}(C_2) + \cdots + \mathrm{Prob}(C_n)] \tag{11}$$
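The gap between the exact value in (10) and the lower bound in (11) can be checked numerically for a small model. The Python sketch below takes the minimal cut sets listed in the text for Figure 2, uses hypothetical failure probabilities, and assumes that UOBs fail independently. The exact value is obtained by brute-force enumeration of every fail/work pattern, which is practical only for a handful of UOBs.

```python
from itertools import product
from math import prod

def cut_set_bounds(cut_sets, failure_prob):
    """Return (exact, lower_bound) for R_s(t) from minimal cut sets.

    exact follows (10) by enumerating every fail/work pattern of the UOBs;
    lower_bound follows (11). Assumes UOBs fail independently.
    """
    uobs = sorted({u for cut in cut_sets for u in cut})
    p_system_fails = 0.0
    for pattern in product([True, False], repeat=len(uobs)):
        failed = {u for u, is_failed in zip(uobs, pattern) if is_failed}
        # The system fails when every UOB of at least one cut set has failed.
        if any(set(cut) <= failed for cut in cut_sets):
            p_system_fails += prod(
                failure_prob[u] if u in failed else 1.0 - failure_prob[u]
                for u in uobs
            )
    bound = 1.0 - sum(prod(failure_prob[u] for u in cut) for cut in cut_sets)
    return 1.0 - p_system_fails, bound

# Cut sets as listed in the text; failure probabilities are hypothetical.
cut_sets = [(1,), (2, 3, 4), (5,)]
failure_prob = {1: 0.01, 2: 0.02, 3: 0.05, 4: 0.05, 5: 0.03}
exact, lower = cut_set_bounds(cut_sets, failure_prob)
```

With small failure probabilities the two values differ only in the fourth decimal place here, which illustrates why (11) is described as a good representation of system reliability.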
The principles of path sets may be adapted to evaluate decision making processes modeled with IDEF3. Each minimal path set identified in an IDEF3 model corresponds to a set of decisions in the operation of the system. Junction boxes identify points where decisions are made within a system. The decision set corresponding to path set $P_3 = (1, 4, 5)$ is {do not reject proposal, accept proposal}. Two decisions are made in the process, one corresponding to each diverging (or fan-out) junction box. The first decision is to "not reject the proposal" and the second is to "accept the proposal." Therefore, the reliability of the decision set may be easily determined by calculating the probability of successfully completing all UOBs in the corresponding path set.
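The reliability of a decision set can be sketched in Python as the product of the UOB reliabilities on its path set, assuming independent UOBs. The path sets are those given in the text for Figure 2; the reliability values are hypothetical. The same numbers also show why the sum in (9) is rarely informative when reliabilities are high.

```python
from math import prod

# Minimal path sets for Figure 2 as given in the text; R_i(t) values are hypothetical.
path_sets = {"P1": (1, 2), "P2": (1, 3, 5), "P3": (1, 4, 5)}
reliability = {1: 0.99, 2: 0.98, 3: 0.95, 4: 0.95, 5: 0.97}

def path_reliability(path, reliability):
    """Probability that every UOB on the path succeeds (independence assumed)."""
    return prod(reliability[u] for u in path)

per_path = {name: path_reliability(p, reliability) for name, p in path_sets.items()}
upper_bound = sum(per_path.values())  # right-hand side of (9)
```

Here `per_path["P3"]` is the reliability of the decision set {do not reject proposal, accept proposal}, while `upper_bound` exceeds 1 and therefore says nothing about the system.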
Cut sets may be used to expose critical activities in the system. UOBs in the intersection of all or many of the cut sets may be considered critical to the operation of the system. Furthermore, a cut set with a high probability of failure (i.e., there are few UOBs in the cut set and each has a high probability of failure) may be considered a critical group of UOBs.
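Both screening ideas can be sketched in a few lines of Python: counting how many cut sets each UOB belongs to, and ranking cut sets by the probability that all of their UOBs fail. The cut sets are those listed in the text for Figure 2; the failure probabilities are hypothetical and independence is assumed.

```python
from collections import Counter
from math import prod

cut_sets = [(1,), (2, 3, 4), (5,)]  # as listed in the text for Figure 2
failure_prob = {1: 0.01, 2: 0.02, 3: 0.05, 4: 0.05, 5: 0.03}  # hypothetical

# Number of cut sets each UOB appears in.
membership = Counter(u for cut in cut_sets for u in cut)

# Probability that every UOB in a cut set fails (independence assumed).
cut_failure = {cut: prod(failure_prob[u] for u in cut) for cut in cut_sets}
most_critical_cut = max(cut_failure, key=cut_failure.get)
```

For these cut sets every UOB appears exactly once, so membership counts do not discriminate and the ranking by failure probability is the more informative view; single-UOB cut sets dominate because no redundancy protects them.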
Applying system reliability techniques to IDEF3 models requires incorporating quantitative data regarding probability of occurrence of UOBs and probability of failure for each UOB. Reliability values for each of the ICOMs in IDEF0 models may be used to obtain probability of failure data for the corresponding UOB of an IDEF3 model. Diverging (or fan-out) junction boxes in IDEF3 may also contain probability of occurrence values for each UOB that immediately follows the junction box.
5. Risk Assessment in IDEF Models
Evaluating risk in IDEF models requires identifying scenarios, determining the likelihood of these scenarios, and estimating the consequences. This set of objectives is often called the "risk triplet." The first two components of the risk triplet are related to the techniques for determining system reliability discussed in Section 4. Path sets may be used to identify scenarios in the system. Each scenario is qualitatively evaluated to identify possible problems that may result. The likelihood of each scenario is determined by the probability of occurrence of the UOBs in the path set.
Determining the consequences of the scenario requires estimating the impact on a set of performance measures. Typical performance measures in manufacturing systems are in-process inventory levels, lead times, set-up times, scrap, rework, and resource utilization. When possible, consequences related to different performance measures should be converted to a single unit of measurement, such as dollar loss.
The total expected risk in the system is calculated by (12), where $P_i$ is the probability of scenario $i$, and $C_i$ is the consequence, or cost, of the scenario.
$$R = \sum_{i=1}^{n} (P_i \times C_i) \tag{12}$$
It must be realized that the values for $P_i$ and $C_i$ are based on approximations and assumptions and, thus, possess a high level of uncertainty. Formal methods for treating uncertainty analysis are discussed in the literature (Morgan and Henrion, 1990).
Upon calculating the expected risk in the system, two courses of action may be taken: (1) explore alternative management decisions to avoid risk (i.e., decrease the likelihood of high risk scenarios), and (2) reengineer processes to mitigate the consequences.
IDEF0 and IDEF3 were developed to provide a mechanism for evaluating the performance of complex manufacturing systems. System reliability and risk assessment are applications which provide a useful opportunity for extending the power of these methodologies.
Acknowledgment
This research has been partially supported by grant No. DAAE07-93-C-R080 from the U.S. Army Tank Automotive Command.
References
Ang, C. L. and R. K. L. Gay (1993). "IDEF0 modeling for project risk assessment," Computers in Industry, 22, pp. 31-45.
Chittister, C. and Y. Y. Haimes (1993). "Risk associated with software development: A holistic framework for assessment and management," IEEE Transactions on Systems, Man, and Cybernetics, 23(3), pp. 710-723.
Kaplan, S. and B. J. Garrick (1981). "On the quantitative definition of risk," Risk Analysis, 1(1).
Lowrance, W. W. (1976). Of Acceptable Risk, William Kaufmann, Los Altos, CA.
Mayer, R. J., T. P. Cullinane, P. S. deWitte, W. B. Knappenberger, B. Perakath and M. S. Wells (1992). Information Integration for Concurrent Engineering (IICE) IDEF3 Process Description Capture Method Report, Armstrong Laboratory, Wright-Patterson AFB, Ohio 45433, AL-TR-1992-0057.
Modarres, M. (1993). What Every Engineer Should Know About Reliability and Risk Analysis, Marcel Dekker, Inc., New York, NY.
Morgan, M. G. and M. Henrion (1990). Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, Cambridge University Press, Cambridge, U.K.
Ross, D. T. (1985). "Applications and extensions of SADT," Computer, April, pp. 25-34.
Shooman, M. L. (1990). Probabilistic Reliability: An Engineering Approach, 2nd Ed., Krieger, Melbourne, FL.
U. S. Air Force (1981). Integrated Computer Aided Manufacturing (ICAM) Architecture Part II, Volume IV - Functional Modeling Manual (IDEF0), Air Force Materials Laboratory, Wright-Patterson AFB, Ohio 45433, AFWAL-TR-81-4023.