Chapter 4 Location-Scale-Based Parametric Distributions William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors’ text Statistical Methods for Reliability Data, John Wiley & Sons Inc. 1998. January 13, 2014 3h 41min 4-1 Chapter 4 Location-Scale-Based Parametric Distributions Objectives • Explain importance of parametric models in the analysis of reliability data. • Define important functions of model parameter that are of interest in reliability studies. • Introduce the location-scale and log-location-scale families of distributions. • Describe the properties of the exponential distribution. • Describe the Weibull and lognormal distributions and the related underlying location-scale distributions. 4-2 Motivation for Parametric Models • Complements nonparametric techniques. • Parametric models can be described concisely with just a few parameters, instead of having to report an entire curve. • It is possible to use a parametric model to extrapolate (in time) to the lower or upper tail of a distribution. • Parametric models provide smooth estimates of failure-time distributions. In practice it is often useful to compare various parametric and nonparametric analyses of a data set. 4-3 Functions of the Parameters • Cumulative distribution function (cdf) of T F (t; θ ) = Pr(T ≤ t), t > 0. • The p quantile of T is the smallest value tp such that F (tp; θ ) ≥ p. • Hazard function of T h(t; θ ) = f (t; θ ) , t > 0. 1 − F (t; θ ) 4-4 Functions of the Parameters-Continued • The mean time to failure, MTTF, of T (also known as expectation of T ) E(T ) = Z ∞ tf (t; θ ) dt = Z ∞ [1 − F (t; θ )] dt. 0 0 R∞ If 0 tf (t; θ ) dt = ∞, we say that the mean of T does not exist. • The variance (or the second central moment) of T and the standard deviation Var(T ) = SD(T ) = Z ∞ q0 [t − E(T )]2f (t; θ ) dt Var(T ). • Coefficient of variation γ2 γ2 = SD(T ) . E(T ) 4-5 Location-Scale Distributions Y belongs to the location-scale family of distributions if the cdf of Y can be expressed as y−µ , −∞ < y < ∞ F (y; µ, σ) = Pr(Y ≤ y) = Φ σ where −∞ < µ < ∞ is a location parameter and σ > 0 is a scale parameter. Φ is the cdf of Y when µ = 0 and σ = 1 and Φ does not depend on any unknown parameters. Note: The distribution of Z = (Y − µ)/σ does not depend on any unknown parameters. 4-6 Importance of Location-Scale Distributions Importance due to: • Most widely used statistical distributions are either members of this class or closely related to this class of distributions: exponential, normal, Weibull, lognormal, loglogistic, logistic, and extreme value distributions. • Methods of inference, statistical theory, and computer software generated for the general family can be applied to this large, important class of models. • Theory for location-scale distributions is relatively simple. 4-7 Examples of Exponential Distributions Probability Density Function Cumulative Distribution Function 2.0 1 1.5 F(t) f(t) .5 1.0 0.5 0.0 0 0.0 1.0 2.0 3.0 0.0 1.0 2.0 3.0 t t Hazard Function 2.0 1.5 h(t) 1.0 0.5 0.0 1.0 2.0 3.0 θ γ 0.5 1.0 2.0 0 0 0 t 4-8 Exponential Distribution For T ∼ EXP(θ, γ), t−γ F (t; θ, γ) = 1 − exp − θ t−γ 1 exp − f (t; θ, γ) = θ θ f (t; θ, γ) 1 h(t; θ, γ) = = , 1 − F (t; θ, γ) θ t > γ, where θ > 0 is a scale parameter and γ is both a location and a threshold parameter. When γ = 0 one gets the well-known one-parameter exponential distribution. Quantiles: tp = γ − θ log(1 − p). Moments: For integer m > 0, E[(T − γ)m] = m! θ m. Then E(T ) = γ + θ, Var(T ) = θ 2. 4-9 Motivation for the Exponential Distribution • Simplest distribution used in the analysis of reliability data. • Has the important characteristic that its hazard function is constant (does not depend on time t). • Popular distribution for some kinds of electronic components (e.g., capacitors or robust, high-quality integrated circuits). • This distribution would not be appropriate for a population of electronic components having failure-causing quality-defects. • Might be useful to describe failure times for components that exhibit physical wearout only after expected technological life of the system in which the component would be installed. 4 - 10 Examples of Normal Distributions Probability Density Function Cumulative Distribution Function 1 F(t) 1.2 f(t) .5 0.8 0.4 0 0.0 3 4 5 6 7 3 4 5 6 7 t t Hazard Function 20 15 h(t) 10 5 0 3 4 5 6 7 σ µ 0.3 0.5 0.8 5 5 5 t 4 - 11 Normal (Gaussian) Distribution For Y ∼ NOR(µ, σ) y−µ F (y; µ, σ) = Φnor σ 1 y−µ f (y; µ, σ) = φnor , σ σ √ −∞ < y < ∞. Rz 2 2π) exp(−z /2) and Φnor (z) = −∞ φnor (w)dw where φnor (z) = (1/ are pdf and cdf for a standardized normal (µ = 0, σ = 1). −∞ < µ < ∞ is a location parameter; σ > 0 is a scale parameter. −1 Quantiles: yp = µ + σΦ−1 nor (p) where Φnor (p) is the p quantile for a standardized normal. Moments: For integer m > 0, E[(Y − µ)m] = 0 if m is odd, and E[(Y − µ)m] = (m)!σ m/[2m/2 (m/2)!] if m is even. Thus E(Y ) = µ, Var(Y ) = σ 2. 4 - 12 Examples of Lognormal Distributions Probability Density Function Cumulative Distribution Function 1 1.2 F(t) f(t) .5 0.8 0.4 0 0.0 0.0 1.0 2.0 3.0 0.0 1.0 2.0 3.0 t t Hazard Function 4 3 h(t) 2 1 0 0.0 1.0 2.0 3.0 σ µ 0.3 0.5 0.8 0 0 0 t 4 - 13 Lognormal Distribution If T ∼ LOGNOR(µ, σ) then log(T ) ∼ NOR(µ, σ) with " # log(t) − µ σ " # log(t) − µ 1 φnor , f (t; µ, σ) = σt σ F (t; µ, σ) = Φnor t > 0. φnor and Φnor are pdf and cdf for a standardized normal. exp(µ) is a scale parameter; σ > 0 is a shape parameter. −1 Quantiles: tp = exp µ + σΦnor (p) , where Φ−1 nor (p) is the p quantile for a standardized normal. m 2 2 Moments: For integer m > 0, E(T ) = exp mµ + m σ /2 . h i 2 2 2 E(T ) = exp µ + σ /2 , Var(T ) = exp 2µ + σ exp(σ ) − 1 . 4 - 14 Motivation for Lognormal Distribution • The lognormal distribution is a common model for failure times. • It can be justified for a random variable that arises from the product of a number of identically distributed independent positive random quantities. • It has been suggested as an appropriate model for failure time caused by a degradation process with combinations of random rates that combine multiplicatively. • Widely used to describe time to fracture from fatigue crack growth in metals. • Useful in modeling failure time of a population electronic components with a decreasing hazard function (due to a small proportion of defects in the population). • Useful for describing the failure-time distribution of certain degradation processes. 4 - 15 Examples of Smallest Extreme Value Distributions Probability Density Function Cumulative Distribution Function 1 0.06 F(t) f(t) 0.04 .5 0.02 0 0.0 30 35 40 45 50 55 60 30 35 40 45 50 55 60 t t Hazard Function σ 1.5 h(t) 1.0 0.5 0.0 30 35 40 45 50 55 60 5 6 7 µ 50 50 50 t 4 - 16 Smallest Extreme Value Distribution For Y ∼ SEV(µ, σ), y−µ F (y; µ, σ) = Φsev σ 1 y−µ f (y; µ, σ) = φsev σ σ y−µ 1 exp , −∞ < y < ∞. h(y; µ, σ) = σ σ Φsev (z) = 1 − exp[− exp(z)], φsev (z) = exp[z − exp(z)] are cdf and pdf for standardized SEV (µ = 0, σ = 1). −∞ < µ < ∞ is a location parameter and σ > 0 is a scale parameter. Quantiles: yp = µ + Φ−1 sev (p)σ = µ + log [− log(1 − p)] σ. Mean and Variance: E(Y ) = µ − σγ, Var(Y ) = σ 2π 2/6, where γ ≈ .5772, π ≈ 3.1416. 4 - 17 Examples of Weibull Distributions Probability Density Function Cumulative Distribution Function 1 F(t) f(t) .5 0 0.0 0.5 1.0 1.5 2.0 2.5 2.0 1.5 1.0 0.5 0.0 0.0 0.5 1.0 1.5 2.0 t t Hazard Function 4 3 h(t) 2 1 0 0.0 0.5 1.0 1.5 2.0 β η 0.8 1.0 1.5 1 1 1 t 4 - 18 Weibull Distribution Common Parameterization: F (t) = Pr(T ≤ t) = 1 − exp − β η f (t) = β h(t) = η t η !β−1 t η !β−1 exp − , t η !β !β t η t>0 β > 0 is shape parameter; η > 0 is scale parameter. Quantiles: tp = η [− log(1 − p)]1/β . Moments: For integer m > 0, E(T m) = η mΓ(1+m/β). Then ! E(T ) = ηΓ 1 + 1 , β where Z ∞ Γ(κ) = 0 " Var(T ) = η 2 Γ 1 + ! 2 1 − Γ2 1 + β β !# wκ−1 exp(−w)dw is the gamma function. Note: When β = 1 then T ∼ EXP(η). 4 - 19 Alternative Weibull Parameterization Note: If T ∼ WEIB(µ, σ) then Y = log(T ) ∼ SEV(µ, σ). For T ∼ WEIB(µ, σ) then F (t; µ, σ) = 1 − exp − f (t; µ, σ) = β η t η !β−1 t η !β = Φsev exp − log(t) − µ σ # !β " # t 1 log(t) − µ = φsev η where σ = 1/β, µ = log(η), and " σt σ φsev (z) = exp[z − exp(z)] Φsev (z) = 1 − exp[− exp(z)]. Quantiles: tp = η [− log(1 − p)] 1/β h = exp µ + σΦ−1 sev (p) i where Φ−1 sev (p) is the p quantile for a standardized SEV (i.e., µ = 0, σ = 1). 4 - 20 Motivation for the Weibull Distribution • The theory of extreme values shows that the Weibull distribution can be used to model the minimum of a large number of independent positive random variables from a certain class of distributions. ◮ Failure of the weakest link in a chain with many links with failure mechanisms (e.g., creep or fatigue) in each link acting approximately independent. ◮ Failure of a system with a large number of components in series and with approximately independent failure mechanisms in each component. • The more common justification for its use is empirical: the Weibull distribution can be used to model failure-time data with a decreasing or an increasing hazard function. 4 - 21 Examples of Largest Extreme Value Distributions Probability Density Function Cumulative Distribution Function 1 0.06 F(t) f(t) 0.04 .5 0.02 0 0.0 0 10 20 30 40 0 10 20 30 40 t t Hazard Function 0.20 σ 0.15 h(t)0.10 0.05 0.0 0 10 20 30 40 5 6 7 µ 10 10 10 t 4 - 22 Largest Extreme Value Distribution When Y ∼ LEV(µ, σ), y−µ F (y; µ, σ) = Φlev σ y−µ 1 φlev f (y; µ, σ) = σ σ y−µ exp − σ i o, n h h(y; µ, σ) = y−µ −1 σ exp exp − σ −∞ < y < ∞. where Φlev (z) = exp[− exp(−z)] and φlev (z) = exp[−z−exp(−z)] are the cdf and pdf for a standardized LEV (µ = 0, σ = 1) distribution. −∞ < µ < ∞ is a location parameter and σ > 0 is a scale parameter. 4 - 23 Largest Extreme Value Distribution - Continued Quantiles: yp = µ − σ log [− log(p)] . Mean and Variance: E(Y ) = µ + σγ, Var(Y ) = σ 2π 2/6, where γ ≈ .5772, π ≈ 3.1416. Notes: • The hazard is increasing but is bounded in the sense that limy→∞ h(y; µ, σ) = 1/σ. • If Y ∼ LEV(µ, σ) then −Y ∼ SEV(−µ, σ). 4 - 24 Examples of Logistic Distributions Probability Density Function Cumulative Distribution Function 1 F(t) 0.25 0.20 0.15 f(t) 0.10 0.05 0.0 .5 0 5 10 15 20 25 5 10 15 20 25 t t Hazard Function h(t) 1.0 0.8 0.6 0.4 0.2 0.0 σ 5 10 15 20 25 1 2 3 µ 15 15 15 t 4 - 25 Logistic Distribution For Y ∼ LOGIS(µ, σ), F (y; µ, σ) f (y; µ, σ) h(y; µ, σ) −∞ < µ < ∞ is a eter. y−µ = Φlogis σ 1 y−µ = φlogis σ σ y−µ 1 Φlogis , −∞ < y < ∞. = σ σ location parameter; σ > 0 is a scale param φlogis and Φlogis are pdf and cdf for a standardized logistic distribution defined by φlogis(z) = exp(z) [1 + exp(z)]2 exp(z) . Φlogis(z) = 1 + exp(z) 4 - 26 Logistic Distribution-Continued Quantiles: p −1 yp = µ + σΦlogis(p) = µ + σ log 1−p , where Φ−1 logis (p) = log[p/(1 − p)] is the p quantile for a standardized logistic distribution. m ] = 0 if m is odd, Moments: For integer m > 0, E[(Y − µ) h iP ∞ (1/i)m if m m m m−1 and E[(Y − µ) ] = 2σ (m!) 1 − (1/2) i=1 is even. Thus E(Y ) = µ, σ 2π 2 . Var(Y ) = 3 4 - 27 Examples of Loglogistic Distributions Probability Density Function Cumulative Distribution Function 1 F(t) 1.2 f(t) .5 0.8 0.4 0 0.0 0 1 2 3 4 0 1 2 3 4 t t Hazard Function 3.0 2.0 h(t) 1.0 0.0 0 1 2 3 4 σ µ 0.2 0.4 0.6 0 0 0 t 4 - 28 Loglogistic Distribution If Y ∼ LOGIS(µ, σ) then T = exp(Y ) ∼ LOGLOGIS(µ, σ) with " # log(t) − µ F (t; µ, σ) = Φlogis σ # " 1 log(t) − µ f (t; µ, σ) = φlogis σt σ # " log(t) − µ 1 Φlogis , h(t; µ, σ) = σt σ t > 0. exp(µ) is a scale parameter; σ > 0 is a shape parameter. Φlogis and φlogis are cdf and pdf for a LOGIS(0, 1). 4 - 29 Loglogistic Distribution-Continued i −1 Quantiles: tp = exp µ + σΦlogis(p) = exp(µ) [p/(1 − p)]σ . h Moments: For integer m > 0, E(T m) = exp(mµ) Γ(1 + mσ) Γ(1 − mσ). The m moment is not finite when mσ ≥ 1. For σ < 1, E(T ) = exp(µ) Γ(1 + σ) Γ(1 − σ), and for σ < 1/2, i 2 2 Var(T ) = exp(2µ) Γ(1 + 2σ) Γ(1 − 2σ) − Γ (1 + σ) Γ (1 − σ) . h 4 - 30 Other Topics in Chapter 4 Pseudorandom number generation. 4 - 31 Topics in Chapter 5 • Parametric models with threshold parameters. • Important distributions used in reliability that can not be translated into location-scale distributions: gamma, generalized gamma, etc. • Finite (discrete) mixture distributions F (t; θ ) = ξ1F1(t; θ 1) + · · · + ξk Fk (t; θ k ) where ξi ≥ 0, and P i ξi = 1 • Compound (continuous) mixture distributions. If failure-times of units in a population are EXP(η) with 1/η ∼ GAM(θ, κ), then the unconditional failure time, T , of a unit selected at random from the population has a Pareto distribution of the form 1 F (t; θ, κ) = 1 − , t > 0. κ (1 + θt) 4 - 32