Chapter 11 Parametric Maximum Likelihood: Other Models Parametric Maximum Likelihood: Other Models Chapter 11 Objectives • ML estimation for the gamma and the extended generalized gamma (EGENG) distributions. • ML estimation for the BISA, IGAU, and GOMA distributions. • ML estimation for the limited failure population model. • ML estimation for truncated data (or data from truncated distributions). • • n Y i=1 • •• [f (ti; θ )]δi [1 − F (ti; θ )]1−δi • •• •• 50 • •• • •• •• • • • 200 Gamma Lognormal Weibull 95% pointwise confidence intervals 100 11 - 4 11 - 6 • For the bearing data, the gamma, lognormal and Weibull distributions are similar over the range of the data. • Scale and shape parameter estimated. Fitting the Gamma Distribution • Bootstrap and simulation-based intervals will generally provide confidence intervals with excellent approximations to nominal coverage probabilities, but will require more computer time (and may not be available in commercial software). • Profile likelihood and corresponding intervals provide useful insight into the information available about a particular parameter or functions of parameters. • Normal approximation confidence intervals (using the delta method and appropriate transformations) are simple and are adequate in large samples or for rough approximations. Confidence intervals and regions similar to location-scale distributions. 11 - 2 William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University 11 - 1 • ML estimation for threshold-parameter distributions like the 3-parameter lognormal and the 3-parameter Weibull distributions (using generalized threshold-scale or GETS distribution). Li(θ ; datai) = if ti is an exact failure if ti is a right censored observation • 20 Confidence Intervals for Other Distributions and Models Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors’ text Statistical Methods for Reliability Data, John Wiley & Sons Inc. 1998. December 14, 2015 8h 9min Fitting Other Distributions and Models n Y i=1 • Likelihood principles similar to location-scale distributions. L(θ ) = ( 1 0 where datai = (ti, δi), δi = and F (ti; θ ) and f (ti; θ ) are the specified distribution’s cdf and pdf, respectively. • For some non-location-scale distributions (e.g. GETS) the density approximation breaks down and one should use the actual interval probability instead. 11 - 3 • Left censored and interval censored observations also could be included, as described in Chapter 2. 10 Millions of Cycles 11 - 5 Lognormal Probability Plot of the Bearing Failure Data, Comparing ML Estimates of the Gamma, Lognormal and Weibull Distributions. Approximate 95% Pointwise Confidence Intervals for the Gamma Cdf are Also Shown .995 .98 .95 .9 .8 .6 .4 .2 .1 .05 .02 .001 .005 Proportion Failing EXP WEIB LOGNOR 200 -0.5 LOGNOR • EGENG 500 0.0 • • λ 0.5 • • 2000 WEIB •• Hours • 1.0 • 5000 • 1.5 20000 2.0 0.50 0.60 0.70 0.80 0.90 0.95 0.99 50000 11 - 11 EXP WEIB EGENG • LOGNOR 20 • • • •• ••• •• 50 Millions of Cycles • •• •• •• • •• 100 • 200 11 - 8 Weibull Probability Plot of the Bearing Failure Data Showing Exponential, Weibull, Lognormal, and Generalized Gamma ML Estimates of F (t) .99 .9 .7 .5 .2 .1 .05 .02 .01 10 Fitting the EGENG Distribution to the Bearing Data-Conclusions 11 - 12 • Fitting a 3-parameter distribution to 12 failures is overfitting. • Comparison shows that the position of the smallest observation does not have much influence on the fit (small order statistics have a large amount of variability). • The EGENG has a larger likelihood than the other distributions, but the difference is statistically unimportant. • Lognormal fits the data well. Weibull and exponential also fit the data reasonably well. Can EGENG do better? • Only 12 failures out of 70 units (multiple censoring). Fitting the EGENG Distribution to the Fan Data 11 - 10 • The profile likelihood shows that the data do not, in this case, provide strong evidence for one distribution over the other. • The EGENG, lognormal and Weibull agree well within the range of the data. Important deviations in the lower tail of the distribution illustrate the danger of extrapolation. • For the bearing data, the EGENG distribution provides a compromise between lognormal and Weibull. .001 .003 Proportion Failing Fitting the Extended Generalized Gamma (EGENG) Distribution • T ∼ EGENG(µ, σ, λ) • Special cases: Weibull (λ = 1), Lognormal (λ = 0), and Gamma (θ = λ2 exp(µ), σ = λ, κ = 1/(λ)2. • A more flexible curve for the data 11 - 7 • Can use EGENG to see if there is evidence for one distribution over the other 1.0 0.8 0.6 0.4 0.2 0.0 -1.0 11 - 9 Confidence Level Profile Likelihood Plot for EGENG λ for the Bearing Failure Data Showing Weibull and Lognormal Distributions as Special Cases Profile Likelihood Lognormal Probability Plot of the Fan Failure Data Showing Generalized Gamma ML Estimates and Corresponding Approximate 95% Pointwise Confidence Intervals for F (t) Along with Exponential, Weibull, and Lognormal ML Estimates of F (t) .4 .3 .2 .1 .05 .02 .01 .002 .001 .005 Proportion Failing -8 -6 -4 λ -2 LOGNOR 0 WEIB 2 0.99 0.95 0.90 0.80 0.70 0.60 0.50 Profile Likelihood Plot for EGENG λ for the Fan Failure Data Showing Weibull and Lognormal Distributions as Special Cases 1.0 0.8 0.6 0.4 0.2 0.0 .995 .98 .95 .9 .8 .7 .5 .3 .2 .1 .05 .02 .005 • • •• •• • • • 20 100 200 Lognormal Birnbaum-Saunders Inverse Gaussian • •• ••• •• • •• • • • •••• • ••• •••• •• • •• • • ••• 50 Thousands of Load Oscillations • • • • 500 • Limited Failure Population (LFP) Model • The Weibull/LFP model is Pr(T ≤ t) = pF (t; µ, σ) = pΦsev " Similar for lognormal or other distributions. σ n ( Y p φsev " " log(ti) − µ σ #) #)δ i # × 1−δi log(ti) − µ σ • ML methods work. The likelihood has the form L(µ, σ, p) = i=1 ( 1 − pΦsev . log(t) − µ . σ • Only a small proportion (p) of the population is susceptible to failure. 11 - 15 Lognormal Probability Plot of Yokobori’s Fatigue Failure Data on Cylindrical Specimens at 52.658 ksi Showing Lognormal, BISA and IGAU Distribution ML Estimates 11 - 13 Confidence Level 11 - 17 Need to test until a high proportion (e.g. 90% or more) of the susceptible subpopulation has failed. See Meeker (1987) for more details. Fitting the BISA and IGAU Distributions • Distributions motivated by similar degradation models ◮ Inverse-Gaussian (IGAU) based on time of first crossing of a threshold for a continuous-time Brownian motion process with drift. ◮ Birnbaum-Saunders (BISA) based on discrete-time growth of fatigue cracks until fracture. 11 - 14 • For some values of their parameters, these distributions are very similar to each other and to the lognormal distribution. 10 -2 • -1 • 10 • • 0 • 10 •• • • ••• •• •• • • 1 10 2 •• • 10 3 10 4 100-hour ML Fit 100-hour pointwise confidence intervals 1370-hour ML Fit 1370-hour pointwise confidence intervals 10 Hours 11 - 16 Weibull Probability Plot of Integrated Circuit Failure-Time Data with ML Estimates of the Weibull/LFP Model After 1370 Hours and 100 Hours of Testing .01 .005 .003 .001 .9 .8 .6 .4 .2 .1 .05 .02 .01 .005 99 1.5 99 95 2.0 99 σ 2.5 50 3.0 70 80 90 95 11 - 18 Approximate Joint Confidence Regions For the LFP Parameters p and log(σ) Based on a Two-Dimensional Profile Likelihood After 100 Hours of Testing .0002 .0003 .0005 Proportion Failing Proportion Defective p Profile Likelihood Proportion Failing 0.005 0.050 0.500 Profile after 100-hours Profile after 1370-hours 0.200 Proportion Defective p 0.020 0.99 0.95 0.90 0.80 0.70 0.60 0.50 Comparison of Profile Likelihoods for p, the LFP Proportion Defective, After 1370 and 100 Hours of Testing 1.0 0.8 0.6 0.4 0.2 0.0 .999 .98 .9 .7 .5 .3 .2 .1 .05 .03 .02 .01 • • 20 • km 60 80 100 120 • • • •••• •••• • •••• ••••• •• • •••••• •• •• ••• • •••• ••• • •• • •• •• • • 40 • 140 • 180 11 - 19 Weibull Probability Plot of the Nonparametric Estimate of Brake Pad Life, Conditional on Failure After 6.951 Thousand km 11 - 21 • Nonparametric estimation and ML estimation with right (and left) truncated data. • ML estimation with left-truncated data. • Nonparametric estimation with left-truncated data. • Importance of distinguishing between truncated data and censored data. Some relevant topics in the analysis of truncated data include: Analysis of Truncated Data Confidence Level 11 - 23 Relationship Between Wald and Profile Likelihood-Based Confidence Regions/Intervals Result: Using the Wald (normal-theory) based interval is equivalent to using a quadratic approximation to the loglikelihood profile. • See Meeker and Escobar (1995) for proof. • Likelihood-based interval does not depend on transformation. • Simulation and some theory suggests that the likelihoodbased interval provides a better asymptotic approximation. 11 - 20 • Interval for the LFP parameter provides an extreme example of where the quadratic approximation breaks down. Distribution of Brake Pad Life from Observational Data • Pad wear (W as a proportion of wear at the end of life) was measured and distance driven (V in thousands of km) was recorded on automobiles that came in for service. Data from Kalbfleisch and Lawless (1992). • Time of failure for each pad was imputed from the observed wear rate as Y = V /W . • Units having already had a pad replacement were omitted from the data. Thus, high-rate units are under represented in the sample. 11 - 22 • To analyze the data, each unit can be viewed as having been left-truncated at its observation time (if it had failed before its observation time, we would not know of the unit’s existence because it would have been omitted from the sample). .98 .999 .9 .7 .5 .3 .2 .1 .05 .03 .02 • • 20 • km 60 80 120 • •• • •••• 100 • •••• •••• •••• •• • •••• • ••• • •• •••• ••• ••• •• • • •• •• • • 40 • 140 • 180 11 - 24 Weibull Probability Plot of the Weibull-Adjusted Nonparametric Estimate of Brake Pad Life Distribution Proportion Failing Profile Likelihood Proportion Failing IC Failure Data from a Limited Failure Population • Of the n =4,156 integrated circuits tested, there were 25 failures in the first 100 hours of testing. • The number of susceptible units in the sample is unknown. 11 - 25 • The 25 failures can be viewed as a sample from a distribution truncated on the right at 100 hours. -2 -1 • 10 • • • 0 • 10 • • • Hours • • 1 • 10 • •• •• • • 10 2 10 3 11 - 27 Lognormal Probability Plot of the Lognormal-Adjusted (Unconditional) Nonparametric Estimate of the IC Failure-Time Distribution .6 .5 .4 .3 .2 .1 .05 .02 10 • ML works if used correctly. 11 - 29 • Similar methods can be used for other distributions for positive random variables. • Need to adjust inferences accordingly (e.g., add γ back into estimates of quantiles or subtract γ from times before computing probabilities). • If γ can be assumed to be known, we can subtract γ from all times and fit the two-parameter Weibull distribution to estimate µ and σ. Inferences for 3-Parameter Weibull or Lognormal Distributions Assuming that Threshold γ is Known Proportion Failing -2 -1 • 10 • • • 0 • Hours • • • 10 • • 1 • 10 • • • • • • • 10 2 10 3 11 - 26 Lognormal Probability Plot of the Nonparametric Estimate of the IC Failure-Time Distribution Conditional on Failure Before 100 Hours .99 .98 .9 .95 .8 .7 .6 .5 .4 .3 .2 .1 .05 .02 10 " # 11 - 30 This problem can be avoided by extending to the parameter space to allow values of σ ≤ 0. For the 3-parameter Weibull (lognormal), use the SEV (NORM) GETS distribution. • For some data sets the ML estimate of σ will approach 0 (on the boundary of the parameter space). Using the correct likelihood will allow one to avoid this problem. • If the smallest observation is an exact failure, there may be paths in the parameter space leading to infinite likelihood when the density approximation likelihood is used. Two possible problems with ML that need to be avoided Fitting the Three-Parameter Weibull Distribution Likelihood with Unknown Threshold γ 11 - 28 • Similarly, a threshold parameter can be added to other distributions for positive random variables. for t > γ. Φsev and φsev are used for the Weibull distribution and Φnor and φnor are used for the lognormal distribution. log(t − γ) − µ F (t; µ, σ, γ) = Φ σ " # 1 log(t − γ) − µ φ σ(t − γ) σ f (t; µ, σ, γ) = • Let γ be the threshold parameter. Then Three-Parameter Weibull and Lognormal Distributions Proportion Failing -3 -2 -2 -1 0 1 0 σ = -.75 -1 0 -1 1 2 1 2 3 2 0.6 0.4 0.2 0.0 0.8 0.6 0.4 0.2 0.0 0.6 0.4 0.2 0.0 -3 -4 -3 0 0 0 σ=0 -1 -2 -1 1 1 2 2 2 3 4 3 0.8 0.4 0.0 0.8 0.4 0.0 0.8 0.6 0.4 0.2 0.0 -2 -3 -2 -1 -2 -1 1 0 1 σ = .75 0 -1 0 2 1 2 11 - 31 3 2 3 SEV-GETS, NOR-GETS, and LEV-GETS pdfs with α = 0, σ = −.75, 0, .75, and ς = .5 (Least Disperse), 1, and 2 (Most Disperse) 0.8 0.4 0.0 0.8 0.4 0.0 0.8 0.6 0.4 0.2 0.0 -2 • 300 • • 5000 •• •• • γ = 300 • 500 .3 .2 .1 .05 .02 .01 .005 .3 .2 .1 .05 .02 .01 .005 • 500 • 50 • •• • • • γ = -100 • 2000 • • • • 10000 • 5000 •• •• • γ = 380 • 500 • 1000 • Hours • • 2000 .3 .2 .1 .05 .02 .01 .005 .3 .2 .1 .05 .02 .01 .005 • • 200 • • 10-2 • • • • 11 - 33 104 • • •• •• • • • 5000 •• •• • γ=0 • 1000 102 γ = 449.99 100 • • 10000 3-parameter lognormal 3-parameter Weibull 2-parameter lognormal 5000 The Three-Parameter Weibull Distribution Likelihood for Right Censored Data Li(µ, σ, γ; datai) " " log(ti − γ) − µ σ i #)1−δ 1 log(ti − γ) − µ φsev σ(ti − γ) σ i #)δ {f (ti; µ, σ, γ)}δi {1 − F (ti; µ, σ, γ)}1−δi i=1 n ( Y i=1 n Y n Y • The likelihood has the form L(µ, σ, γ) = = = i=1 ( × 1 − Φsev • Problem: when γ → t(1) and σ → 0, L(µ, σ, γ) → ∞. 11 - 32 • Solution: Do not use the density approximation; use the correct likelihood (based on small intervals). 100 300 Threshold Profile -100 100 • • • 5000 • •• • γ = 300 • 500 •• Threshold Parameter Gamma .3 .2 .1 .05 .02 .01 .005 .3 .2 .1 .05 .02 .01 .005 .3 .2 .1 .05 .02 .01 .005 • 500 • 50 • • • •• • γ = -100 • 2000 • • • • 10000 •• 5000 • •• • γ = 380 • 500 .3 .2 .1 .05 .02 .01 .005 .3 .2 .1 .05 .02 .01 .005 200 5 • • • •• 2000 • • •• • •• • • 5000 • •• • γ=0 • 1000 • γ = 449.99 100 11 - 36 • Fitting a 3-parameter distribution to 12 failures is overfitting. • ML suggests that there is a positive threshold, but the level of improvement is statistically unimportant. • For the 3-parameter distributions, the density approximation breaks down. One should use the correct likelihood. • 2-parameter Lognormal fits the data well. Weibull and exponential also fit the data reasonably well. Can 3-parameter distributions do better? • Only 12 failures out of 70 units (multiple censoring). Fitting the 3-Parameter Lognormal and 3-Parameter Weibull Distributions to the Fan Data 11 - 34 Correct Likelihood (∆ = .01) Profile Likelihood for γ and 3-Parameter Lognormal Probability Plots of the Fan Data with γ Varying Between −100 and 449.999 -106.6 -107.2 -3 100 Threshold Profile -100 100 • Threshold Parameter Gamma .3 .2 .1 .05 .02 .01 .005 • 500 Proportion Failing Proportion Failing Proportion Failing Proportion Failing Proportion Failing Proportion Failing Proportion Failing Proportion Failing Lognormal Probability Plot Comparing ML Estimates Three-Parameter Lognormal and Three-Parameter Weibull Distributions for the Turbine Fan Data .3 .2 .1 .05 .02 .01 .005 200 11 - 35 lognormal Loglikelihood Proportion Failing SEV Normal LEV Density Approximation Profile Likelihood for γ and 3-Parameter Lognormal Probability Plots of the Fan Data with γ Varying Between −100 and 449.999 lognormal Loglikelihood Proportion Failing -134.0 -134.6 Proportion Failing 200 • •• ••• •• •• •• •• •• ••• ••• •• • •• •• 150 Thousands of Cycles •• 3-parameter Weibull 3-parameter lognormal 2-parameter lognormal • • • 100 • •• •• • •• 250 350 11 - 37 Lognormal Probability Plot Comparing Three-Parameter Lognormal and Three-Parameter Weibull Distributions for the Alloy T7987 Data .95 .9 .8 .7 .6 .4 .3 .2 .1 .05 .02 50 Inferences for 3-Parameter Weibull Assuming that γ is Known 11 - 39 • Similar for 3-parameter lognormal and 2-parameter exponential. • Need to adjust inferences accordingly (e.g., add γ back into estimates of quantiles or subtract γ from times before computing probabilities). • If γ can be assumed to be known, we can subtract γ from all times and fit the two-parameter Weibull distribution. .005 Proportion Failing Three-Parameter Weibull Distribution " # • Let γ be the threshold parameter. Then log(t − γ) − µ F (t; µ, σ, γ) = Φsev σ " # 1 log(t − γ) − µ φsev , σ(t − γ) σ f (t; µ, σ, γ) = Both functions are 0 for t ≤ γ. t > γ. • Similar for the two-parameter exponential and three-parameter lognormal. 11 - 38