Chapter 11 Parametric Maximum Likelihood: Other Models William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors’ text Statistical Methods for Reliability Data, John Wiley & Sons Inc. 1998. January 13, 2014 3h 42min 11 - 1 Parametric Maximum Likelihood: Other Models Chapter 11 Objectives • ML estimation for the gamma and the extended generalized gamma (EGENG) distributions. • ML estimation for the BISA, IGAU, and GOMA distributions. • ML estimation for the limited failure population model. • ML estimation for truncated data (or data from truncated distributions). • ML estimation for threshold-parameter distributions like the 3-parameter lognormal and the 3-parameter Weibull distributions (using generalized threshold-scale or GETS distribution). 11 - 2 Fitting Other Distributions and Models • Likelihood principles similar to location-scale distributions. L(θ ) = n Y Li(θ ; datai) = i=1 n Y [f (ti; θ )]δi [1 − F (ti; θ )]1−δi i=1 where datai = (ti, δi), δi = ( 1 0 if if ti is an exact failure ti is a right censored observation and F (ti; θ ) and f (ti; θ ) are the specified distribution’s cdf and pdf, respectively. • For some non-location-scale distributions (e.g. GETS) the density approximation breaks down and one should use the actual interval probability instead. • Left censored and interval censored observations also could be included, as described in Chapter 2. 11 - 3 Confidence Intervals for Other Distributions and Models Confidence intervals and regions similar to location-scale distributions. • Normal approximation confidence intervals (using the delta method and appropriate transformations) are simple and are adequate in large samples or for rough approximations. • Profile likelihood and corresponding intervals provide useful insight into the information available about a particular parameter or functions of parameters. • Bootstrap and simulation-based intervals will generally provide confidence intervals with excellent approximations to nominal coverage probabilities, but will require more computer time (and may not be available in commercial software). 11 - 4 Lognormal Probability Plot of the Bearing Failure Data, Comparing ML Estimates of the Gamma, Lognormal and Weibull Distributions. Approximate 95% Pointwise Confidence Intervals for the Gamma Cdf are Also Shown .995 .98 .95 .9 • Proportion Failing .8 ••• .6 .4 .2 .1 .05 .02 • • • •• • ••• • •• • • • • • Gamma Lognormal Weibull 95% pointwise confidence intervals • .005 .001 10 20 50 100 200 Millions of Cycles 11 - 5 Fitting the Gamma Distribution • Scale and shape parameter estimated. • For the bearing data, the gamma, lognormal and Weibull distributions are similar over the range of the data. 11 - 6 Fitting the Extended Generalized Gamma (EGENG) Distribution • T ∼ EGENG(µ, σ, λ) • Special cases: Weibull (λ = 1), Lognormal (λ = 0), and Gamma (θ = λ2 exp(µ), σ = λ, κ = 1/(λ)2. • A more flexible curve for the data • Can use EGENG to see if there is evidence for one distribution over the other 11 - 7 Weibull Probability Plot of the Bearing Failure Data Showing Exponential, Weibull, Lognormal, and Generalized Gamma ML Estimates of F (t) .99 Proportion Failing .9 .7 .5 EXP .2 .1 • .05 WEIB .02 • •• • •••• • • •• •• • • • •• • • .01 .003 EGENG .001 LOGNOR 10 20 50 100 200 Millions of Cycles 11 - 8 Profile Likelihood Plot for EGENG λ for the Bearing Failure Data Showing Weibull and Lognormal Distributions as Special Cases 1.0 0.50 0.60 0.6 0.70 0.80 0.4 LOGNOR 0.2 Confidence Level Profile Likelihood 0.8 0.90 WEIB 0.95 0.99 0.0 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 λ 11 - 9 Fitting the EGENG Distribution to the Bearing Data-Conclusions • For the bearing data, the EGENG distribution provides a compromise between lognormal and Weibull. • The EGENG, lognormal and Weibull agree well within the range of the data. Important deviations in the lower tail of the distribution illustrate the danger of extrapolation. • The profile likelihood shows that the data do not, in this case, provide strong evidence for one distribution over the other. 11 - 10 Lognormal Probability Plot of the Fan Failure Data Showing Generalized Gamma ML Estimates and Corresponding Approximate 95% Pointwise Confidence Intervals for F (t) Along with Exponential, Weibull, and Lognormal ML Estimates of F (t) .4 .3 • Proportion Failing .2 • • .1 • .02 .005 .002 .001 • • .05 .01 • • • EXP • WEIB LOGNOR EGENG 200 500 2000 5000 20000 50000 Hours 11 - 11 Fitting the EGENG Distribution to the Fan Data • Only 12 failures out of 70 units (multiple censoring). • Lognormal fits the data well. Weibull and exponential also fit the data reasonably well. Can EGENG do better? • The EGENG has a larger likelihood than the other distributions, but the difference is statistically unimportant. • Comparison shows that the position of the smallest observation does not have much influence on the fit (small order statistics have a large amount of variability). • Fitting a 3-parameter distribution to 12 failures is overfitting. 11 - 12 Profile Likelihood Plot for EGENG λ for the Fan Failure Data Showing Weibull and Lognormal Distributions as Special Cases 1.0 0.50 0.60 0.6 0.70 0.80 0.4 LOGNOR 0.2 Confidence Level Profile Likelihood 0.8 0.90 WEIB 0.95 0.99 0.0 -8 -6 -4 -2 0 2 λ 11 - 13 Fitting the BISA and IGAU Distributions • Distributions motivated by similar degradation models ◮ Inverse-Gaussian (IGAU) based on time of first crossing of a threshold for a continuous-time Brownian motion process with drift. ◮ Birnbaum-Saunders (BISA) based on discrete-time growth of fatigue cracks until fracture. • For some values of their parameters, these distributions are very similar to each other and to the lognormal distribution. 11 - 14 Lognormal Probability Plot of Yokobori’s Fatigue Failure Data on Cylindrical Specimens at 52.658 ksi Showing Lognormal, BISA and IGAU Distribution ML Estimates .995 • .98 Proportion Failing .95 .9 .8 .7 .5 .3 .2 .1 .05 • .02 .005 •• • • • • • • • • • • •• • • • • • •• • • • • • • • •• • • • • •• •• • • •• • • • • • Lognormal Birnbaum-Saunders Inverse Gaussian • 20 50 100 200 500 Thousands of Load Oscillations 11 - 15 Weibull Probability Plot of Integrated Circuit Failure-Time Data with ML Estimates of the Weibull/LFP Model After 1370 Hours and 100 Hours of Testing .01 .005 Proportion Failing .003 • •• • • • • • • •• • • • • • • • .001 • • .0005 100-hour ML Fit 100-hour pointwise confidence intervals 1370-hour ML Fit 1370-hour pointwise confidence intervals .0003 • .0002 10 -2 10 -1 10 0 10 1 10 2 10 3 10 4 Hours 11 - 16 Limited Failure Population (LFP) Model • Only a small proportion (p) of the population is susceptible to failure. • The Weibull/LFP model is Pr(T ≤ t) = pF (t; µ, σ) = pΦsev " # log(t) − µ . σ Similar for lognormal or other distributions. • ML methods work. The likelihood has the form n ( Y p #)δ i log(ti) − µ L(µ, σ, p) = φsev × σ σ i=1 #)1−δ ( " i log(ti) − µ . 1 − pΦsev σ " Need to test until a high proportion (e.g. 90% or more) of the susceptible subpopulation has failed. See Meeker (1987) for more details. 11 - 17 Approximate Joint Confidence Regions For the LFP Parameters p and log(σ) Based on a Two-Dimensional Profile Likelihood After 100 Hours of Testing .9 99 95 50 70 Proportion Defective p .8 .6 .4 .2 .1 .05 80 .02 90 95 .01 .005 99 1.5 99 2.0 2.5 3.0 σ 11 - 18 Comparison of Profile Likelihoods for p, the LFP Proportion Defective, After 1370 and 100 Hours of Testing 1.0 Profile after 100-hours Profile after 1370-hours 0.50 0.60 0.6 0.70 0.80 0.4 Confidence Level Profile Likelihood 0.8 0.90 0.2 0.95 0.99 0.0 0.005 0.020 0.050 0.200 0.500 Proportion Defective p 11 - 19 Relationship Between Wald and Profile Likelihood-Based Confidence Regions/Intervals Result: Using the Wald (normal-theory) based interval is equivalent to using a quadratic approximation to the loglikelihood profile. • See Meeker and Escobar (1995) for proof. • Likelihood-based interval does not depend on transformation. • Simulation and some theory suggests that the likelihoodbased interval provides a better asymptotic approximation. • Interval for the LFP parameter provides an extreme example of where the quadratic approximation breaks down. 11 - 20 Analysis of Truncated Data Some relevant topics in the analysis of truncated data include: • Importance of distinguishing between truncated data and censored data. • Nonparametric estimation with left-truncated data. • ML estimation with left-truncated data. • Nonparametric estimation and ML estimation with right (and left) truncated data. 11 - 21 Distribution of Brake Pad Life from Observational Data • Pad wear (W as a proportion of wear at the end of life) was measured and distance driven (V in thousands of km) was recorded on automobiles that came in for service. Data from Kalbfleisch and Lawless (1992). • Time of failure for each pad was imputed from the observed wear rate as Y = V /W . • Units having already had a pad replacement were omitted from the data. Thus, high-rate units are under represented in the sample. • To analyze the data, each unit can be viewed as having been left-truncated at its observation time (if it had failed before its observation time, we would not know of the unit’s existence because it would have been omitted from the sample). 11 - 22 Weibull Probability Plot of the Nonparametric Estimate of Brake Pad Life, Conditional on Failure After 6.951 Thousand km .999 .98 .9 Proportion Failing .7 .5 .3 .2 .1 • .05 • • • ••• • • •• • • • • ••••• • • ••• • • • • ••• • • ••• • • •• • •• • •• • • • • • • • .03 .02 .01 • 20 40 60 80 100 120 140 180 km 11 - 23 Weibull Probability Plot of the Weibull-Adjusted Nonparametric Estimate of Brake Pad Life Distribution .999 .98 .9 Proportion Failing .7 .5 .3 .2 .1 • .05 .03 • •• • • •• • • • •• • ••• • • • •• • ••• •• • •• • • • •••• • • ••• • • • •• • • • ••••• • • • .02 20 40 60 80 100 120 140 180 km 11 - 24 IC Failure Data from a Limited Failure Population • Of the n =4,156 integrated circuits tested, there were 25 failures in the first 100 hours of testing. • The number of susceptible units in the sample is unknown. • The 25 failures can be viewed as a sample from a distribution truncated on the right at 100 hours. 11 - 25 Lognormal Probability Plot of the Nonparametric Estimate of the IC Failure-Time Distribution Conditional on Failure Before 100 Hours .99 • .98 .95 Proportion Failing .9 .8 .7 .6 .5 .4 .3 .2 • .1 .05 • • • • • • • • • • • • • • • • .02 10 -2 10 -1 10 0 10 1 10 2 10 3 Hours 11 - 26 Lognormal Probability Plot of the Lognormal-Adjusted (Unconditional) Nonparametric Estimate of the IC Failure-Time Distribution .6 .5 Proportion Failing .4 .3 .2 • • • • • • • • •• • • • • • .1 • • .05 • .02 10 -2 10 -1 10 0 10 1 10 2 10 3 Hours 11 - 27 Three-Parameter Weibull and Lognormal Distributions • Let γ be the threshold parameter. Then " # log(t − γ) − µ F (t; µ, σ, γ) = Φ σ " # log(t − γ) − µ 1 φ f (t; µ, σ, γ) = σ(t − γ) σ for t > γ. Φsev and φsev are used for the Weibull distribution and Φnor and φnor are used for the lognormal distribution. • Similarly, a threshold parameter can be added to other distributions for positive random variables. 11 - 28 Inferences for 3-Parameter Weibull or Lognormal Distributions Assuming that Threshold γ is Known • If γ can be assumed to be known, we can subtract γ from all times and fit the two-parameter Weibull distribution to estimate µ and σ. • Need to adjust inferences accordingly (e.g., add γ back into estimates of quantiles or subtract γ from times before computing probabilities). • Similar methods can be used for other distributions for positive random variables. • ML works if used correctly. 11 - 29 Fitting the Three-Parameter Weibull Distribution Likelihood with Unknown Threshold γ Two possible problems with ML that need to be avoided • If the smallest observation is an exact failure, there may be paths in the parameter space leading to infinite likelihood when the density approximation likelihood is used. Using the correct likelihood will allow one to avoid this problem. • For some data sets the ML estimate of σ will approach 0 (on the boundary of the parameter space). This problem can be avoided by extending to the parameter space to allow values of σ ≤ 0. For the 3-parameter Weibull (lognormal), use the SEV (NORM) GETS distribution. 11 - 30 SEV-GETS, NOR-GETS, and LEV-GETS pdfs with α = 0, σ = −.75, 0, .75, and ς = .5 (Least Disperse), 1, and 2 (Most Disperse) SEV σ = -.75 0.6 0.8 0.4 0.2 0.0 -3 Normal 0.8 0.4 0.4 0.0 -2 -1 0 1 2 0.0 -3 -1 0 1 2 3 0.8 0.6 0.4 0.2 0.0 0.8 0.4 0.0 -2 LEV σ = .75 σ=0 -1 0 1 2 3 0.8 0.6 0.4 0.2 0.0 -4 -2 0 2 4 1 2 2 3 -3 -2 -1 0 1 2 -2 -1 0 1 2 3 0.8 0.6 0.4 0.2 0.0 0.0 0 1 0.0 0.2 -1 0 0.4 0.4 -2 -1 0.8 0.6 -3 -2 -3 -1 0 1 2 3 11 - 31 The Three-Parameter Weibull Distribution Likelihood for Right Censored Data • The likelihood has the form L(µ, σ, γ) = = n Y i=1 n Y Li(µ, σ, γ; datai) {f (ti; µ, σ, γ)}δi {1 − F (ti; µ, σ, γ)}1−δi i=1 n ( Y " 1 log(ti − γ) − µ = φsev σ(t − γ) σ i i=1 " #)1−δ ( i log(ti − γ) − µ × 1 − Φsev σ #)δ i • Problem: when γ → t(1) and σ → 0, L(µ, σ, γ) → ∞. • Solution: Do not use the density approximation; use the correct likelihood (based on small intervals). 11 - 32 Density Approximation Profile Likelihood for γ and 3-Parameter Lognormal Probability Plots of the Fan Data with γ Varying Between −100 and 449.999 γ = -100 .1 .01 300 .2 .1 .05 • .02 .005 • • •• • •• • 100 .01 2000 10000 .1 .05 • • 200 1000 5000 • •• • •• • .3 • •• • •• • • • .2 .1 .05 .02 .01 .005 500 • γ = 449.99 • 50 • • •• • •• • .005 .2 .005 5000 • γ = 380 .01 500 .1 .02 .3 .02 • .2 .05 • 500 Proportion Failing Proportion Failing .3 •• .3 Proportion Failing 100 .005 Threshold Parameter Gamma γ = 300 .01 • • • • • • • Proportion Failing Proportion Failing -134.0 .2 .02 -100 γ=0 .3 .05 -134.6 lognormal Loglikelihood Threshold Profile 5000 • 10-2 100 102 104 11 - 33 Correct Likelihood (∆ = .01) Profile Likelihood for γ and 3-Parameter Lognormal Probability Plots of the Fan Data with γ Varying Between −100 and 449.999 γ = -100 .1 .01 300 .2 .1 .05 • .02 .005 • • • •• • • 100 • .02 .01 2000 10000 .1 .05 • .005 • 200 5000 • • •• • • .3 • .2 .1 .05 • .02 .01 .005 500 1000 γ = 449.99 • 50 • • • • •• • • • .005 .2 .02 5000 .1 .05 .3 .01 500 .2 γ = 380 • • • • • •• • • 500 Proportion Failing Proportion Failing .3 • Proportion Failing 100 .005 Threshold Parameter Gamma γ = 300 .01 • • .3 Proportion Failing Proportion Failing -106.6 .2 .02 -100 γ=0 .3 .05 -107.2 lognormal Loglikelihood Threshold Profile 5000 • •• • •• • • • 5 100 2000 11 - 34 Lognormal Probability Plot Comparing ML Estimates Three-Parameter Lognormal and Three-Parameter Weibull Distributions for the Turbine Fan Data .3 • Proportion Failing .2 • • .1 • • • • • .05 • .02 3-parameter lognormal 3-parameter Weibull 2-parameter lognormal .01 • .005 200 500 1000 2000 5000 10000 Hours 11 - 35 Fitting the 3-Parameter Lognormal and 3-Parameter Weibull Distributions to the Fan Data • Only 12 failures out of 70 units (multiple censoring). • 2-parameter Lognormal fits the data well. Weibull and exponential also fit the data reasonably well. Can 3-parameter distributions do better? • For the 3-parameter distributions, the density approximation breaks down. One should use the correct likelihood. • ML suggests that there is a positive threshold, but the level of improvement is statistically unimportant. • Fitting a 3-parameter distribution to 12 failures is overfitting. 11 - 36 Lognormal Probability Plot Comparing Three-Parameter Lognormal and Three-Parameter Weibull Distributions for the Alloy T7987 Data .95 3-parameter Weibull 3-parameter lognormal 2-parameter lognormal Proportion Failing .9 .8 .7 .6 .4 .3 .2 .1 .05 • • • .02 .005 50 •• 100 •• • •• • •• • • • ••• •• •• • •• •• ••• • • • • • 150 200 •• • •• • 250 350 Thousands of Cycles 11 - 37 Three-Parameter Weibull Distribution • Let γ be the threshold parameter. Then " # log(t − γ) − µ F (t; µ, σ, γ) = Φsev σ " # log(t − γ) − µ 1 φsev , f (t; µ, σ, γ) = σ(t − γ) σ t > γ. Both functions are 0 for t ≤ γ. • Similar for the two-parameter exponential and three-parameter lognormal. 11 - 38 Inferences for 3-Parameter Weibull Assuming that γ is Known • If γ can be assumed to be known, we can subtract γ from all times and fit the two-parameter Weibull distribution. • Need to adjust inferences accordingly (e.g., add γ back into estimates of quantiles or subtract γ from times before computing probabilities). • Similar for 3-parameter lognormal and 2-parameter exponential. 11 - 39