Reliability Engineering - Part 1

Product Probability Law of Series Components

\[ R = \prod_{i=1}^{n} R_i \]

If a system comprises a large number of components, the system reliability may be rather low even though the individual components have high reliabilities, e.g. the V-1 missile in World War II.

Transition of Component States

[Diagram: two states, normal (N) and failed (F). Transitions: normal state continues (N to N); component fails (N to F); component is repaired (F to N); failed state continues (F to F).]

The Repair-to-Failure Process

Definitions of Reliability
• The probability that an item will adequately perform its specified purpose for a specified period of time under specified environmental conditions.
• The ability of an item to perform a required function, under given environmental and operational conditions and for a stated period of time. (ISO 8402)

Definition of Quality
The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs (ISO 8402).
Quality denotes the conformity of the product to its specification as manufactured, while reliability denotes its ability to continue to comply with its specification over its useful life. Reliability is therefore an extension of quality into the time domain.

REPAIR-TO-FAILURE PROCESS: MORTALITY DATA
t = age in years; L(t) = number living at age t

  t        L(t)
  0   1,023,102
  1   1,000,000
  2     994,230
  3     990,114
  4     986,767
  5     983,817
 10     971,804
 15     962,270
 20     951,483
 25     939,197
 30     924,609
 35     906,554
 40     883,342
 45     852,554
 50     810,900
 55     754,191
 60     677,771
 65     577,882
 70     454,548
 75     315,982
 80     181,765
 85      78,221
 90      21,577
 95       3,011
 99         125

After Bompas-Smith, J.H., Mechanical Survival: The Use of Reliability Data, McGraw-Hill Book Company, New York, 1971.
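As a hedged illustration of how the mortality data can be used, the sketch below converts L(t) into the surviving fraction L(t)/N and its complement, taking N = L(0) as the population size. The names `MORTALITY`, `survival_fraction`, and `death_fraction` are illustrative and not part of the original material, and only a few rows of the table are included.

```python
# Selected rows of the mortality table: age t (years) -> L(t), number living at age t.
MORTALITY = {
    0: 1_023_102,
    1: 1_000_000,
    10: 971_804,
    40: 883_342,
    70: 454_548,
    99: 125,
}

# Assumption: the population size N is the number living at age 0.
N = MORTALITY[0]


def survival_fraction(t):
    """Fraction of the initial population still living at age t, L(t) / N."""
    return MORTALITY[t] / N


def death_fraction(t):
    """Fraction of the initial population that has died by age t, 1 - L(t) / N."""
    return 1.0 - survival_fraction(t)


for age in sorted(MORTALITY):
    print(f"{age:3d}  {survival_fraction(age):.4f}  {death_fraction(age):.4f}")
```

The same two ratios are what the survival table for the human population tabulates as R(t) and F(t).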
HUMAN RELIABILITY
t = age in years; L(t) = number living at age t; N = L(0)

  t        L(t)   R(t) = L(t)/N   F(t) = 1 - R(t)
  0   1,023,102      1.0000           0.0000
  1   1,000,000      0.9774           0.0226
  2     994,230      0.9718           0.0282
  3     990,114      0.9678           0.0322
  4     986,767      0.9645           0.0355
  5     983,817      0.9616           0.0384
 10     971,804      0.9499           0.0501
 15     962,270      0.9405           0.0595
 20     951,483      0.9300           0.0700
 25     939,197      0.9180           0.0820
 30     924,609      0.9037           0.0963
 35     906,554      0.8861           0.1139
 40     883,342      0.8634           0.1366
 45     852,554      0.8333           0.1667
 50     810,900      0.7926           0.2074
 55     754,191      0.7372           0.2628
 60     677,771      0.6625           0.3375
 65     577,882      0.5648           0.4352
 70     454,548      0.4443           0.5557
 75     315,982      0.3088           0.6912
 80     181,765      0.1777           0.8223
 85      78,221      0.0765           0.9235
 90      21,577      0.0211           0.9789
 95       3,011      0.0029           0.9971
 99         125      0.0001           0.9999
100           0      0.0000           1.0000

repair = birth; failure = death

Meaning of R(t):
(1) The probability of survival of an individual to age t (e.g. 0.86 for t = 40).
(2) The proportion of a population that is expected to survive to a given age t.

[Figure: probability P (0 to 1.0) versus age t (0 to 100 years).]

Time to Failure
The time elapsing from when the unit is put into operation until it fails for the first time, i.e. a random variable T. It may also be measured by indirect time concepts:
• The number of times a switch is operated
• The number of kilometers driven by a car
• The number of rotations of a bearing
• etc.

State Variable
The state of the unit at time t can be described by the state variable

\[ X(t) = \begin{cases} 1 & \text{if the unit is functioning at time } t \\ 0 & \text{if the unit is in a failed state at time } t \end{cases} \]

Reliability, R(t)
= probability of survival to age t (t included)
= the number surviving at t divided by the total sample

Unreliability, F(t)
= probability of death before age t (t not included)
= the total number of deaths before age t divided by the total population

Reliability - R(t)
• The probability that the component experiences no failure during the time interval (0, t] or, equivalently, the probability that the unit survives the time interval (0, t] and is still functioning at time t.
\[ R(t) = \Pr(T > t), \qquad \lim_{t \to 0} R(t) = 1, \qquad \lim_{t \to \infty} R(t) = 0 \]

• Example: exponential distribution

\[ R(t) = e^{-\lambda t} \]

Unreliability - F(t)
• The probability that the component experiences the first failure during (0, t].

\[ F(t) = \Pr(T \le t), \qquad R(t) + F(t) = 1, \qquad \lim_{t \to 0} F(t) = 0, \qquad \lim_{t \to \infty} F(t) = 1 \]

• Example: exponential distribution

\[ F(t) = 1 - e^{-\lambda t} \]

FAILURE DENSITY FUNCTION f(t)
Δ = length in years of the age interval starting at t; N = 1,023,102

 Age in years   n(t+Δ) - n(t)   f(t) = [n(t+Δ) - n(t)] / (N Δ)
                   (deaths)
      0             23,102            0.02260
      1              5,770            0.00564
      2              4,116            0.00402
      3              3,347            0.00327
      4              2,950            0.00288
      5             12,013            0.00235
     10              9,534            0.00186
     15             10,787            0.00211
     20             12,286            0.00240
     25             14,588            0.00285
     30             18,055            0.00353
     35             23,212            0.00454
     40             30,788            0.00602
     45             41,654            0.00814
     50             56,709            0.01110
     55             76,420            0.01500
     60             99,889            0.01950
     65            123,334            0.02410
     70            138,566            0.02710
     75            134,217            0.02620
     80            103,554            0.02020
     85             56,634            0.01110
     90             18,566            0.00363
     95              2,886            0.00071
     99                125            0.00012
    100                  0               -

Values of f(t) = dF(t)/dt, in the order listed on the slide (the row alignment with the ages above is not recoverable): 0.00540, 0.00454, 0.00284, 0.00330, 0.00287, 0.00192, 0.00198, 0.00224, 0.00259, 0.00364, 0.00393, 0.00436, 0.00637, 0.00962, 0.01367, 0.01800, 0.02200, 0.02490, 0.02610, 0.02460, 0.01950, 0.00970, 0.00210, -

[Figure: number of deaths (thousands) versus age in years t.]

[Figure: failure density f(t) versus age in years t.]

Failure Density - f(t)

\[ f(t)\,\Delta t \approx \Pr(t < T \le t + \Delta t) \]

\[ f(t) = \frac{dF(t)}{dt} = \lambda e^{-\lambda t} \quad \text{(exponential distribution)} \]

\[ F(t) = \int_0^t f(u)\,du, \qquad R(t) = \int_t^{\infty} f(u)\,du \]

\[ r(t)\,\Delta \approx \frac{\text{number of deaths during } [t, t + \Delta)}{\text{number of survivors at age } t}, \qquad r(t) = \frac{f(t)}{R(t)} \]

CALCULATION OF FAILURE RATE r(t)

 Age in years   No. of failures   r(t) = f(t) / (1 - F(t))
                   (deaths)
      0             23,102            0.02260
      1              5,770            0.00570
      2              4,116            0.00414
      3              3,347            0.00338
      4              2,950            0.00299
      5             12,013            0.00244
     10              9,534            0.00196
     15             10,787            0.00224
     20             12,286            0.00258
     25             14,588            0.00311
     30             18,055            0.00391
     35             23,212            0.00512
     40             30,788            0.00697
     45             41,654            0.00977
     50             56,709            0.01400
     55             76,420            0.02030
     60             99,889            0.02950
     65            123,334            0.04270
     70            138,566            0.06100
     75            134,217            0.08500
     80            103,554            0.11400
     85             56,634            0.14480
     90             18,566            0.17200
     95              2,886            0.24000
     99                125            1.20000

[Figure: failure rate r(t) versus t, with regions labeled early failures, random failures, and wearout failures.]

[Figure 11-2: A typical "bathtub" failure rate curve for process hardware. Vertical axis: failure rate (faults/time); horizontal axis: time. Regions: infant mortality, period of approximately constant failure rate, old age. The failure rate is approximately constant over the mid-life of the component.]

Comments on Bathtub Curve
• Units are often tested before they are distributed to the users. Thus, much of the infant mortality will be removed before the units are delivered for use.
• For the majority of mechanical units, the failure rate will usually show a slightly increasing tendency in the useful life period.

Failure Rate - r(t)
• The probability that the component fails per unit time at time t, given that the component has survived to time t.

\[ r(t) = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)} \]

• Example: for the exponential distribution,

\[ r(t) = \lambda \]

A component with a constant failure rate is considered as good as new, if it is functioning.

As Good As New?

\[ \Pr(T > t + x \mid T > t) = \frac{\Pr(T > t + x)}{\Pr(T > t)} = \frac{e^{-\lambda (t + x)}}{e^{-\lambda t}} = e^{-\lambda x} = \Pr(T > x) \]

This implies that the probability that a unit will be functioning at time t + x, given that it is functioning at time t, is equal to the probability that a new unit has a time to failure longer than x. Hence the remaining life of a unit that is functioning at time t is independent of t. The exponential distribution has no "memory."

Relation Between Reliability and Failure Rate

\[ r(t) = -\frac{dR/dt}{R(t)} \]

\[ R(t) = \exp\left[-\int_0^t r(u)\,du\right] \]

\[ f(t) = r(t) \exp\left[-\int_0^t r(u)\,du\right] \]

\[ r(t)\,\Delta t \approx \Pr(t < T \le t + \Delta t \mid T > t) \]

Interpretation of Failure Rate
In actuarial statistics, the failure rate is called the force of mortality (FOM).
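The relation between reliability and failure rate can be checked numerically. The following is a minimal sketch, not part of the original slides: it integrates a given failure rate r(u) with the trapezoidal rule to obtain R(t) = exp[-∫ r(u) du], and also checks the "as good as new" property for a constant rate. The function name `reliability_from_failure_rate` and the numeric values are assumptions chosen for illustration.

```python
import math


def reliability_from_failure_rate(r, t, steps=10_000):
    """Approximate R(t) = exp(-integral_0^t r(u) du) with the trapezoidal rule."""
    h = t / steps
    integral = 0.5 * (r(0.0) + r(t))
    for k in range(1, steps):
        integral += r(k * h)
    integral *= h
    return math.exp(-integral)


# Constant failure rate (exponential distribution); 0.5 per year is a hypothetical value.
lam = 0.5
R_const = reliability_from_failure_rate(lambda u: lam, 2.0)

# Linearly increasing failure rate r(u) = 2u, for which R(t) = exp(-t**2).
R_linear = reliability_from_failure_rate(lambda u: 2.0 * u, 1.5)

# Memoryless check for the constant rate: Pr(T > t + x | T > t) equals Pr(T > x).
t, x = 3.0, 1.2
conditional = math.exp(-lam * (t + x)) / math.exp(-lam * t)

print(R_const, math.exp(-lam * 2.0))
print(R_linear, math.exp(-1.5 ** 2))
print(conditional, math.exp(-lam * x))
```

Each printed pair should agree to within floating-point rounding, since the trapezoidal rule is exact for constant and linear integrands.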
The failure rate or FOM is a function of the life distribution of a single unit and an indication of the "proneness to failure" of the unit after time t has elapsed.

Failure-Rate Experiment
Split the time interval (0, t) into disjoint intervals of equal length Δt. Then put n identical units into operation at time t = 0. When a unit fails, record the time and leave that unit out. For each interval, determine
1. The number of units n(i) that fail in interval i.
2. The functioning times of the individual units in interval i. If a unit has failed before interval i, its functioning time is zero.

Let T_{ji} be the functioning time of unit j in interval i. Then

\[ \sum_{j=1}^{n} T_{ji} = \text{total functioning time for all units in interval } i \]

\[ z(i) = \frac{n(i)}{\sum_{j=1}^{n} T_{ji}} = \text{failure rate in interval } i \]

Let m(i) denote the number of units that are functioning at the start of interval i. Then

\[ z(i) \approx \frac{n(i)}{m(i)\,\Delta t}, \qquad z(i)\,\Delta t \approx \frac{n(i)}{m(i)} \]

Mean Time to Failure - MTTF

\[ \mathrm{MTTF} = \int_0^{\infty} t f(t)\,dt = -\int_0^{\infty} t R'(t)\,dt = -\big[t R(t)\big]_0^{\infty} + \int_0^{\infty} R(t)\,dt = \int_0^{\infty} R(t)\,dt \]

For the exponential distribution, MTTF = 1/λ.

Variance of Time to Failure

\[ \operatorname{Var} T = \int_0^{\infty} (t - \mathrm{MTTF})^2 f(t)\,dt \]

For the exponential distribution, Var T = 1/λ².

[Figure 11-1: Typical plots of (a) the failure rate, (b) the failure density f(t), with area under the curve equal to 1, (c) the unreliability F(t), the integral of f from 0 to t, and (d) the reliability R(t) = 1 - F(t).]

TABLE 11-1: FAILURE RATE DATA FOR VARIOUS SELECTED PROCESS COMPONENTS (1)

 Instrument                               Faults/year
 Controller                                  0.29
 Control valve                               0.60
 Flow measurement (fluids)                   1.14
 Flow measurement (solids)                   3.75
 Flow switch                                 1.12
 Gas-liquid chromatograph                   30.6
 Hand valve                                  0.13
 Indicator lamp                              0.044
 Level measurement (liquids)                 1.70
 Level measurement (solids)                  6.86
 Oxygen analyzer                             5.65
 pH meter                                    5.88
 Pressure measurement                        1.41
 Pressure relief valve                       0.022
 Pressure switch                             0.14
 Solenoid valve                              0.42
 Stepper motor                               0.044
 Strip chart recorder                        0.22
 Thermocouple temperature measurement        0.52
 Thermometer temperature measurement         0.027
 Valve positioner                            0.44

(1) Selected from Frank P. Lees, Loss Prevention in the Process Industries (London: Butterworths, 1986), p. 343.

Example
Consider two independent components with failure rates λ1 and λ2, respectively. Determine the probability that component 1 fails before component 2.

\[ \Pr(T_2 > T_1) = \int_0^{\infty} \Pr(T_2 > t \mid T_1 = t)\, f_{T_1}(t)\,dt = \int_0^{\infty} e^{-\lambda_2 t}\, \lambda_1 e^{-\lambda_1 t}\,dt = \lambda_1 \int_0^{\infty} e^{-(\lambda_1 + \lambda_2) t}\,dt = \frac{\lambda_1}{\lambda_1 + \lambda_2} \]

Similarly,

\[ \Pr(\text{component } j \text{ fails first among } n \text{ components}) = \frac{\lambda_j}{\sum_{i=1}^{n} \lambda_i} \]

A System with n Components in Parallel
• Unreliability

\[ F = \prod_{i=1}^{n} F_i \]

• Reliability

\[ R = 1 - F = 1 - \prod_{i=1}^{n} (1 - R_i) \]

A System with n Components in Series
• Reliability

\[ R = \prod_{i=1}^{n} R_i \]

• Unreliability

\[ F = 1 - R = 1 - \prod_{i=1}^{n} (1 - F_i) \]

Upper Bound of Unreliability for Systems with n Components in Series

\[ F = \sum_{i=1}^{n} F_i - \sum_{i=2}^{n} \sum_{j=1}^{i-1} F_i F_j + \cdots + (-1)^{n-1} \prod_{i=1}^{n} F_i \le \sum_{l=1}^{n} F_l \]

The Poisson Process

Assumptions of Homogeneous Poisson Process (HPP)
Suppose we are studying the occurrence of a certain event A in the course of a given time period. Let us assume
1. A can occur at any time in the interval. The probability of A occurring in the interval (t, t + Δt] is independent of t and may be written as λΔt + o(Δt), where λ is a positive constant.
2. The probability of more than one event A in this interval is o(Δt), which is a function with the property

\[ \lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0 \]

3. Let (t11, t12], (t21, t22], ... be any sequence of disjoint intervals in the time period in question. Then the events "A occurs in (tj1, tj2]," j = 1, 2, ..., are independent.
The process is said to have intensity λ.

Probability of No Event Occurring in (0, t]
Let N(t) denote the number of times the event A occurs during the period (0, t], and let

\[ p(n, t) = \Pr\{N(t) = n\} \]

\[ p(0, t + \Delta t) = p(0, t)\,\big[1 - \lambda \Delta t + o(\Delta t)\big] \]

\[ \frac{p(0, t + \Delta t) - p(0, t)}{\Delta t} = -\lambda\, p(0, t) + p(0, t)\,\frac{o(\Delta t)}{\Delta t} \]

Letting Δt → 0,

\[ \frac{d}{dt}\, p(0, t) = -\lambda\, p(0, t) \]

From p(0, 0) = 1,

\[ p(0, t) = e^{-\lambda t} \quad \text{for } t \ge 0 \]

Exponential Distribution
Let T1 denote the time point when A occurs for the first time.
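The limiting argument for the probability of no event can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original slides: it iterates the difference relation p(0, t + Δt) = p(0, t)(1 - λΔt), with the o(Δt) term dropped, and compares the result with e^{-λt} as the step count grows. The function name `no_event_probability` and the values of λ and t are hypothetical.

```python
import math


def no_event_probability(lam, t, steps):
    """Iterate p(0, t + dt) = p(0, t) * (1 - lam * dt), dropping the o(dt) term."""
    dt = t / steps
    p = 1.0  # initial condition p(0, 0) = 1
    for _ in range(steps):
        p *= 1.0 - lam * dt
    return p


lam, t = 2.0, 1.5  # hypothetical intensity (events per unit time) and time horizon
exact = math.exp(-lam * t)
for steps in (10, 100, 1_000, 10_000):
    approx = no_event_probability(lam, t, steps)
    print(f"steps={steps:6d}  approx={approx:.6f}  exact={exact:.6f}")
```

As Δt shrinks, the iterated product approaches the exponential, which is the same limit the differential equation for p(0, t) expresses.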
T1 is a random variable and

\[ F_{T_1}(t) = \Pr\{T_1 \le t\} = 1 - \Pr\{T_1 > t\} = 1 - p(0, t) \]

Thus,

\[ F_{T_1}(t) = \begin{cases} 1 - e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad f_{T_1}(t) = \begin{cases} \lambda e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ E[T_1] = \int_0^{\infty} t f_{T_1}(t)\,dt = \frac{1}{\lambda} \]

Thus, the waiting time T between consecutive occurrences in a HPP is exponentially distributed.

Probability of n Events Occurring in (0, t]

\[ p(n, t + \Delta t) = p(n, t)\,\big[1 - \lambda \Delta t + o(\Delta t)\big] + p(n - 1, t)\,\lambda \Delta t = p(n, t) + \lambda \Delta t\,\big[p(n - 1, t) - p(n, t)\big] + p(n, t)\,o(\Delta t) \]

\[ \frac{p(n, t + \Delta t) - p(n, t)}{\Delta t} = \lambda\,\big[p(n - 1, t) - p(n, t)\big] + p(n, t)\,\frac{o(\Delta t)}{\Delta t} \]

\[ \frac{d}{dt}\, p(n, t) = \lambda\, p(n - 1, t) - \lambda\, p(n, t); \qquad p(n, 0) = 0 \text{ for } n \ge 1 \]

For n = 1,

\[ \frac{d}{dt}\, p(1, t) = \lambda\, p(0, t) - \lambda\, p(1, t) = \lambda e^{-\lambda t} - \lambda\, p(1, t) \quad \Rightarrow \quad p(1, t) = \lambda t\, e^{-\lambda t} \]

and in general

\[ p(n, t) = \frac{(\lambda t)^n}{n!}\, e^{-\lambda t} \quad \text{for } n = 0, 1, 2, \ldots \]

Poisson Distribution
Since p(n, t) = Pr{N(t) = n}, this distribution is called the Poisson distribution with parameter λt and random variable n.

\[ E[N(t)] = \sum_{n=0}^{\infty} n\, \frac{(\lambda t)^n}{n!}\, e^{-\lambda t} = \lambda t \]

Notice that, since the expected number of occurrences of event A per unit time (t = 1) is λ, λ expresses the intensity of the process.

Example 1
Suppose that exactly one event (failure) of a HPP with intensity λ is known to have occurred in the interval (0, t0]. Determine the distribution of the time T1 at which this event occurred.

\[ \Pr\{T_1 \le t \mid N(t_0) = 1\} = \frac{\Pr\{T_1 \le t \cap N(t_0) = 1\}}{\Pr\{N(t_0) = 1\}} = \frac{\Pr\{1 \text{ event in } (0, t] \cap 0 \text{ events in } (t, t_0]\}}{\Pr\{N(t_0) = 1\}} \]

\[ = \frac{\Pr\{1 \text{ event in } (0, t]\}\, \Pr\{0 \text{ events in } (t, t_0]\}}{\Pr\{N(t_0) = 1\}} = \frac{\lambda t\, e^{-\lambda t}\, e^{-\lambda (t_0 - t)}}{\lambda t_0\, e^{-\lambda t_0}} = \frac{t}{t_0} \quad \text{for } 0 < t \le t_0 \]

In other words, the time at which the first failure occurs is uniformly distributed over (0, t0]. The expected time is thus

\[ E[T_1 \mid N(t_0) = 1] = t_0 / 2 \]

Example 2
Suppose that the failures of a system occur in accordance with a HPP. Some failures develop into a consequence C, and others do not. The probability of this development is p and is assumed to be constant for each failure. The failure consequences are further assumed to be independent of each other. Determine the distribution of the consequences.

Let M(t) be the number of C failures in (0, t].

\[ \Pr\{M(t) = m \mid N(t) = n\} = \binom{n}{m} p^m (1 - p)^{n - m}, \qquad m = 0, 1, 2, \ldots, n \]

\[ \Pr\{M(t) = m\} = \sum_{n=m}^{\infty} \binom{n}{m} p^m (1 - p)^{n - m}\, \frac{(\lambda t)^n}{n!}\, e^{-\lambda t} = \frac{(p \lambda t)^m}{m!}\, e^{-p \lambda t} \]

Thus, M(t) is also a HPP, with intensity pλ. The mean number of C consequences in (0, t] is pλt.

Gamma Distribution
Consider a unit that is exposed to a series of shocks which occur according to a HPP with intensity λ. The time intervals T1, T2, T3, ..., between consecutive shocks are then independent and exponentially distributed with parameter λ. Assume that the unit fails exactly at the kth shock, and not earlier. The time to failure of the unit

\[ T = T_1 + T_2 + \cdots + T_k \]

is then gamma distributed (k, λ).

Consider a homogeneous Poisson process and let T_k here denote the waiting time until the kth occurrence of A. Then

\[ F_{T_k}(t) = \Pr\{T_k \le t\} = 1 - \Pr\{T_k > t\} = 1 - \sum_{j=0}^{k-1} p(j, t) = 1 - \sum_{j=0}^{k-1} \frac{(\lambda t)^j}{j!}\, e^{-\lambda t} \]

\[ f_{T_k}(t) = \frac{dF_{T_k}(t)}{dt} = \frac{\lambda}{(k - 1)!}\, (\lambda t)^{k-1} e^{-\lambda t} = \frac{\lambda}{\Gamma(k)}\, (\lambda t)^{k-1} e^{-\lambda t} \]

The waiting time until the kth occurrence of event A in a HPP with intensity λ is gamma distributed. The gamma distribution (1, λ) is an exponential distribution with parameter λ.

Weibull Distribution
For the majority of mechanical units, the failure rate is slightly increasing in the useful life period (not constant). A distribution often used when r(t) is monotonic is the Weibull distribution.

The time to failure T of a unit is said to be Weibull distributed with scale parameter λ and shape parameter α if

\[ F(t) = \Pr(T \le t) = \begin{cases} 1 - e^{-(\lambda t)^{\alpha}} & \text{for } t > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ f(t) = \frac{d}{dt} F(t) = \begin{cases} \alpha \lambda\, (\lambda t)^{\alpha - 1} e^{-(\lambda t)^{\alpha}} & \text{for } t > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ R(t) = \Pr(T > t) = e^{-(\lambda t)^{\alpha}} \quad \text{for } t > 0 \]

\[ r(t) = \frac{f(t)}{R(t)} = \alpha \lambda\, (\lambda t)^{\alpha - 1} \quad \text{for } t > 0 \]

1. α = 1: The Weibull distribution reduces to the exponential distribution.
2. α > 1: The failure rate is increasing.
3. α < 1: The failure rate is decreasing.

\[ R\!\left(\frac{1}{\lambda}\right) = e^{-1} = \frac{1}{e} \]

Hence, the probability that the time to failure is larger than 1/λ is independent of α. The quantity 1/λ is called the characteristic lifetime.
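To make the three cases and the characteristic lifetime concrete, here is a small sketch, assuming the two-parameter form R(t) = exp[-(λt)^α] and r(t) = αλ(λt)^{α-1} given above. The function names and the parameter values are illustrative only.

```python
import math


def weibull_reliability(t, lam, alpha):
    """R(t) = exp(-(lam * t) ** alpha) for t > 0, and 1 for t <= 0."""
    if t <= 0.0:
        return 1.0
    return math.exp(-((lam * t) ** alpha))


def weibull_failure_rate(t, lam, alpha):
    """r(t) = alpha * lam * (lam * t) ** (alpha - 1), defined for t > 0."""
    return alpha * lam * (lam * t) ** (alpha - 1.0)


lam = 0.25  # hypothetical scale parameter, so the characteristic lifetime is 4.0
for alpha in (0.5, 1.0, 2.0, 3.5):
    r_early = weibull_failure_rate(1.0, lam, alpha)
    r_late = weibull_failure_rate(8.0, lam, alpha)
    trend = "constant" if r_early == r_late else ("increasing" if r_late > r_early else "decreasing")
    print(f"alpha={alpha}: R(1/lam)={weibull_reliability(1.0 / lam, lam, alpha):.4f}, failure rate {trend}")
```

For every shape parameter the reliability at t = 1/λ is the same, which is why 1/λ is a convenient single-number summary of a Weibull lifetime.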
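For the Weibull distribution,

\[ \mathrm{MTTF} = \int_0^{\infty} R(t)\,dt = \frac{1}{\lambda}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \]

\[ \operatorname{Var} T = \frac{1}{\lambda^2} \left[ \Gamma\!\left(\frac{2}{\alpha} + 1\right) - \Gamma^2\!\left(\frac{1}{\alpha} + 1\right) \right] \]

Three-Parameter Weibull Distribution
With a location parameter, written ξ here,

\[ F(t) = \Pr(T \le t) = \begin{cases} 1 - e^{-[\lambda (t - \xi)]^{\alpha}} & \text{for } t > \xi \\ 0 & \text{otherwise} \end{cases} \]

\[ f(t) = \frac{d}{dt} F(t) = \begin{cases} \alpha \lambda\, [\lambda (t - \xi)]^{\alpha - 1} e^{-[\lambda (t - \xi)]^{\alpha}} & \text{for } t > \xi \\ 0 & \text{otherwise} \end{cases} \]

\[ R(t) = \Pr(T > t) = e^{-[\lambda (t - \xi)]^{\alpha}} \quad \text{for } t > \xi \]

\[ r(t) = \frac{f(t)}{R(t)} = \alpha \lambda\, [\lambda (t - \xi)]^{\alpha - 1} \quad \text{for } t > \xi \]

Normal Distribution
The normal distribution is sometimes used as a lifetime distribution, even though it allows negative TTF values with positive probability.

\[ f(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(t - \mu)^2}{2\sigma^2}\right], \qquad -\infty < t < \infty \]

The probability density of the standard normal distribution is

\[ \phi(t) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) \]

With Φ denoting the standard normal distribution function, the unreliability, reliability, and failure rate can be written as

\[ F(t) = \Pr(T \le t) = \int_{-\infty}^{(t - \mu)/\sigma} \phi(u)\,du = \Phi\!\left(\frac{t - \mu}{\sigma}\right) \]

\[ R(t) = 1 - F(t) = 1 - \Phi\!\left(\frac{t - \mu}{\sigma}\right) \]

\[ r(t) = -\frac{R'(t)}{R(t)} = \frac{1}{\sigma} \cdot \frac{\phi\big((t - \mu)/\sigma\big)}{1 - \Phi\big((t - \mu)/\sigma\big)} \]

Normal Distribution, Left Truncated at 0

\[ R(t) = \Pr(T > t \mid T > 0) = \frac{\Phi\big((\mu - t)/\sigma\big)}{\Phi(\mu/\sigma)} \quad \text{for } t \ge 0 \]

\[ r(t) = -\frac{R'(t)}{R(t)} = \frac{1}{\sigma} \cdot \frac{\phi\big((t - \mu)/\sigma\big)}{1 - \Phi\big((t - \mu)/\sigma\big)} \quad \text{for } t \ge 0 \]

Log Normal Distribution
The time to failure T of a unit is said to be lognormally distributed if Y = ln(T) is normally distributed.
The lognormal distribution is commonly used as a distribution for repair time. When modeling the repair time, it is natural to assume that the repair rate is increasing in a first phase. When the repair has been going on for a rather long time, this indicates serious problems. It is thus natural to believe that the repair rate is decreasing after a certain period of time.

\[ f(t) = \frac{1}{\sqrt{2\pi}\,\sigma\, t} \exp\left[-\frac{(\ln t - \mu)^2}{2\sigma^2}\right], \qquad t > 0 \]

\[ F(t) = \int_0^{t} \frac{1}{\sqrt{2\pi}\,\sigma\, u} \exp\left[-\frac{(\ln u - \mu)^2}{2\sigma^2}\right] du \]

\[ r(t) = \frac{f(t)}{1 - F(t)} \]

\[ \mathrm{MTTF} = \exp\left(\mu + \frac{\sigma^2}{2}\right) \]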
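The lognormal MTTF formula can be cross-checked against the general identity MTTF = ∫ R(t) dt. The sketch below is an illustration under stated assumptions, not part of the original slides: it evaluates F(t) with the standard library error function, integrates R(t) = 1 - F(t) with the trapezoidal rule over a truncated range, and compares the result with exp(μ + σ²/2). The function names, the parameter values, and the truncation point are hypothetical choices.

```python
import math


def lognormal_cdf(t, mu, sigma):
    """F(t) = Phi((ln t - mu) / sigma), written with the error function."""
    if t <= 0.0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_mttf(mu, sigma):
    """Closed form MTTF = exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma ** 2)


def mttf_by_integration(mu, sigma, upper=60.0, steps=6_000):
    """Trapezoidal estimate of the integral of R(t) = 1 - F(t) over [0, upper]."""
    h = upper / steps
    total = 0.5 * ((1.0 - lognormal_cdf(0.0, mu, sigma)) + (1.0 - lognormal_cdf(upper, mu, sigma)))
    for k in range(1, steps):
        total += 1.0 - lognormal_cdf(k * h, mu, sigma)
    return total * h


mu, sigma = 1.0, 0.5  # hypothetical mean and standard deviation of ln(T)
closed_form = lognormal_mttf(mu, sigma)
numerical = mttf_by_integration(mu, sigma)
median_cdf = lognormal_cdf(math.exp(mu), mu, sigma)  # F at t = exp(mu), the median

print(f"closed form MTTF = {closed_form:.4f}, numerical MTTF = {numerical:.4f}")
print(f"F(exp(mu)) = {median_cdf:.4f}")
```

The upper limit of 60 is adequate only for these parameter values; a heavier-tailed choice of σ would need a larger range.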