Chapter 7 Parametric Likelihood Fitting Concepts: Exponential Distribution William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University 7-1 Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors’ text Statistical Methods for Reliability Data, John Wiley & Sons Inc. 1998. December 14, 2015 8h 9min Example: Time Between α-Particle Emissions of Americium-241 (Berkson 1966) Berkson (1966) investigates the randomness of α-particle emissions of Americium-241, which has a half-life of about 458 years. Data: Interarrival times (units: 1/5000 seconds). • n =10,220 observations. • Data binned into intervals from 0 to 4000 time units. Interval sizes ranging from 25 to 100 units. Additional interval for observed times exceeding 4,000 time units. Frequency 7-3 • Smaller samples analyzed here to illustrate sample size effect. We start the analysis with n =200. 0 1000 2000 1/5000 Seconds 3000 4000 7-5 Chapter 7 Parametric Likelihood Fitting Concepts: Exponential Distribution Objectives • Show how to compute a likelihood for a parametric model using discrete data. • Show how to compute a likelihood for samples containing right censored observations and left censored observations. • Use a parametric likelihood as a tool for data analysis and inference. • Illustrate the use of likelihood and normal-approximation methods of computing confidence intervals for model parameters and other quantities of interest. • Explain the appropriate use of the density approximation for observations reported as exact failures. 7-2 All Times n = 10220 41 44 24 32 29 21 9 0 200 Random Sample of Times n = 200 dj 7-4 1609 2424 1770 1306 1213 1528 354 16 10220 Interarrival Times Frequency of Occurrence Data for α-Particle Emissions of Americium-241 Time 100 300 500 700 1000 2000 4000 ∞ Interval Endpoint lower upper tj−1 tj 0 100 300 500 700 1000 2000 4000 .99 .98 .95 .9 .8 .7 .6 .5 .3 .1 0 500 1000 1/5000 Seconds 1500 2000 7-6 Exponential Probability Plot of the n = 200 Sample of α-Particle Interarrival Time Data. The Plot also Shows Approximate 95% Simultaneous Nonparametric Confidence Bands. Probability Histogram of the n = 200 Sample of α-Particle Interarrival Time Data 40 30 20 10 0 Parametric Likelihood Probability of the Data • Using the model Pr(T ≤ t) = F (t; θ ) for continuous T , the likelihood (probability) for a single observation in the interval (ti−1, ti] is Li(θ ; datai) = Pr(ti−1 < T ≤ ti) = F (ti; θ ) − F (ti−1; θ ). Can be generalized to allow for explanatory variables, multiple sources of variability, and other model features. n Y i=1 Li(θ ; datai). • The total likelihood is the joint probability of the data. Assuming n independent observations L(θ ) = L(θ ; DATA) = C 7-7 • Want to estimate θ and g(θ ). We will find θ to make L(θ ) large. n=200 -> 800 1000 1200 <- estimate of mean using all n=10220 times 600 1600 0.99 0.95 0.90 0.80 0.70 0.60 0.50 b for the n = 200 α-Particle Interarrival R(θ) = L(θ)/L(θ) Time Data. Vertical Lines Give an Approximate 95% Likelihood-Based Confidence Interval for θ 1.2 1.0 0.8 0.6 0.4 0.2 0.0 400 θ 7-9 Samples of Times n=20000 n=2000 n=200 dj n=20 Interarrival Times Frequency of Occurrence Example. α-Particle Pseudo Data Constructed with Constant Proportion within Each Bin Time 100 300 500 700 1000 2000 4000 ∞ Interval Endpoint lower upper tj−1 tj 0 100 300 500 700 1000 2000 4000 3 5 3 3 2 3 1 0 20 300 500 300 300 200 300 100 000 2000 Li(θ) = 8 h Y exp − j tj−1 tj − exp − θ θ F (tj ; θ) − F (tj−1; θ) j=1 8 Y j=1 id Exponential Distribution and Likelihood for Interval Data Data: α-particle emissions of americium-241 • The exponential distribution is t , t > 0. F (t; θ) = 1 − exp − θ θ = E(T ), the mean time between arrivals. n Y • The interval-data likelihood has the form L(θ) = i=1 = dj 7-8 where dj is the number of interarrival times in the jth interval (i.e., times between tj−1 and tj ). 500 1000 1/5000 Seconds 2000 Exponential Distribution ML Fit Exponential Distribution 95% Pointwise Confidence Intervals 1500 7 - 10 Exponential Probability Plot for the n = 200 Sample of α-Particle Interarrival Time Data. The Plot Also Shows Parametric Exponential ML Estimate and 95% Confidence Intervals for F (t). .98 .95 .9 .8 .7 .6 .5 .4 .2 0 n=20 -> n=200 -> 600 1000 1200 <- estimate of mean using all n=20000 times 800 <- n=2000 θ 0.50 0.60 0.70 0.80 0.90 0.95 0.99 1600 7 - 12 b for the n = 20, 200, and 2000 Pseudo R(θ) = L(θ)/L(θ) Data. Vertical Lines Give Corresponding Approximate 95% Likelihood-Based Confidence Intervals 1.2 1.0 0.8 0.6 0.4 0.2 0.0 400 Confidence Level Confidence Level 30 50 30 30 20 30 10 0 200 3000 5000 3000 3000 2000 3000 1000 0000 20000 7 - 11 Probability Relative Likelihood Relative Likelihood tj All Times n = 10220 dj 292 494 332 236 261 308 73 4 2000 41 44 24 32 29 21 9 0 200 Li(θ ). 7 - 13 3 7 4 1 3 2 0 0 20 Random Samples of Times n = 2000 n = 200 n=20 Interarrival Times Frequency of Occurrence Example. α-Particle Random Samples Time tj−1 100 300 500 700 1000 2000 4000 ∞ Interval Endpoint lower upper 0 100 300 500 700 1000 2000 4000 1609 2424 1770 1306 1213 1528 354 16 10220 Likelihood as a Tool for Modeling/Inference n X i=1 What can we do with the (log) likelihood? L(θ ) = log[L(θ )] = • Study the surface. • Maximize with respect to θ (ML point estimates). • Look at curvature at maximum (gives estimate of Fisher information and asymptotic variance). 20 theta 30 40 50 0.50 0.60 0.70 0.80 0.90 0.95 0.99 7 - 15 200 n=200 θ 600 estimate of mean using all n=10220 times -> n=20 400 n=2000 800 1000 0.50 0.60 0.70 0.80 0.90 0.95 0.99 7 - 14 b for the n = 20, 200, and 2000 Samples R(θ) = L(θ)/L(θ) from the α-Particle Interarrival Time Data. Vertical Lines Give Corresponding Approximate 95% Likelihood-Based Confidence Intervals. 1.2 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 4.6 4.8 5.0 theta 5.2 5.4 5.6 0.99 0.95 0.90 0.80 0.70 0.60 0.50 Relative Likelihood for Simulated Exponential (θ = 5) Samples of Size n = 1000 • Use reparameterization to study functions of θ . 7 - 18 7 - 16 • Calibrate confidence regions/intervals with χ2 or simulation (or parametric bootstrap). • If the length of θ is > 1 or 2 and interest centers on subset of θ (need to get rid of nuisance parameters), look at profiles (suggests confidence regions/intervals for parameter subsets). • Regions of high likelihood are credible; regions of low likelihood are not credible (suggests confidence regions for parameters). Likelihood as a Tool for Modeling/Inference (Continued) Relative Likelihood Profile Likelihood • Observe effect of perturbations in data and model on likelihood (sensitivity, influence analysis). 10 7 - 17 Confidence Level Confidence Level 1.0 0.8 0.6 0.4 0.2 0.0 0 Confidence Level Relative Likelihood for Simulated Exponential (θ = 5) Samples of Size n = 3 Profile Likelihood b L(θ) L(θ) . Large-Sample Approximate Theory for Likelihood Ratios for a Scalar Parameter • Relative likelihood for θ is R(θ) = • If evaluated at the true θ, then, asymptotically, −2 log[R(θ)] follows, a chisquare distribution with 1 degree of freedom. • An approximate 100(1 − α)% likelihood-based confidence region for θ is the set of all values of θ such that i 2 −2 log[R(θ)] < χ(1−α;1) h or, equivalently, the set defined by 2 R(θ) > exp −χ(1−α;1) /2 . • General theory in the Appendix. 7 - 19 Normal-Approximation Confidence Intervals for θ (continued) e [θ , b θ̃] = [θ/w, θb × w] • A 100(1 − α)% normal-approximation (or Wald) confidence interval for θ is e b − log(θ) log(θ) ∼ ˙ NOR(0, 1) scelog(θ) b b ±z ˜ c log(θ)] = log(θ) b (1−α/2)selog(θ) b This follows after transformwhere w = exp[z(1−α/2)sceθb/θ]. ing (by exponentiation) the confidence interval [log(θ), which is based on Zlog(θ) b = closer to an NOR(0, 1) distribution than is Zθb. 7 - 21 b is unrestricted in sign, generally Z • Because log(θ) b is log(θ) Confidence Intervals for Functions of θ • For one-parameter distributions, confidence intervals for θ can be translated directly into confidence intervals for monotone functions of θ. λ̃] = [1/θ̃, e .00201]. e F (te; θ )]. 1/θ ] = [.00151, • The arrival rate λ = 1/θ is a decreasing function of θ. [λ, e F̃ (te)] = [F (te; θ̃), • F (t; θ) is a decreasing function of θ e [F (te), 7 - 23 Normal-Approximation Confidence Intervals for θ rh e b. is evaluated at θ θb − θ ∼ ˙ NOR(0, 1) sceθb i−1 θ̃] = θb ± z(1−α/2)sceθb. Zθb = −d2L(θ)/dθ 2 [θ, • A 100(1 − α)% normal-approximation (or Wald) confidence interval for θ is where sceθb = • Based on h i Pr z(α/2) < Zθb ≤ z(1−α/2) ≈ 1 − α i • From the definition of NOR(0, 1) quantiles h implies that Pr θb − z(1−α/2)sceθb < θ ≤ θb + z(1−α/2)sceθb ≈ 1 − α. 41.7 572 [289, 713] [281, 690] [242, 638] 101 440 7 - 20 6.1 596 [498, 662] [496, 660] [491, 654] 52 227 Sample of Times n = 200 n=20 [585, 608] [585, 608] [585, 608] 13 [140, 346] [145, 356] [125, 329] 175 [151, 201] [152, 202] [149, 200] 1.7 [164, 171] [164, 171] [164, 171] 168 All Times n =10,220 Comparisons for α-Particle Data ML Estimate θb bb Standard Error se θ 95% Confidence Intervals for θ Based on Likelihood Zlog(b ∼ ˙ NOR(0, 1) θ) Zb ∼ ˙ NOR(0, 1) θ b × 105 ML Estimate λ bb 5 Standard Error se λ×10 95% Confidence Intervals for λ × 105 Based on Likelihood Zlog(b ∼ ˙ NOR(0, 1) λ) Zb ∼ ˙ NOR(0, 1) λ 7 - 22 Density Approximation for Exact Observations • If ti−1 = ti − ∆i, ∆i > 0, and the correct likelihood F (ti; θ ) − F (ti−1; θ ) = F (ti; θ ) − F (ti − ∆i; θ ) (ti −∆i ) Z t i f (t)d t ≈ f (ti; θ )∆i can be approximated with the density f (t) as [F (ti; θ ) − F (ti − ∆i; θ )] = then the density approximation for exact observations Li(θ ; datai) = f (ti; θ ) may be appropriate. • For most common models, the density approximation is adequate for small ∆i. • There are, however, situations where the approximation breaks down as ∆i → 0. 7 - 24 ML Estimates for the Exponential Distribution Mean Based on the Density Approximation P • With r exact failures and n − r right-censored observations the ML estimate of θ is n ti TTT = i=1 θb = r r Pn TTT = i=1 ti, total time in test, is the sum of the failure times plus the censoring time of the units that are censored. • Using the observed curvature in the likelihood: v s u #−1 u" 2 2 b b θ d L(θ) θ u =√ . sceθb = t − = dθ 2 r r b θ 2(TTT ) θ̃] = 2 , χ(1−α/2;2r) 2(TTT ) . 2 χ(α/2;2r) 7 - 25 2 . • If the data are complete or failure censored, 2TTT /θ ∼ χ2r Then an exact 100(1 − α)% confidence interval for θ is e [θ , Exponential Analysis With Zero Failures • ML estimate for the Exponential distribution mean θ cannot be computed unless the available data contains one or more failures. 2(TTT ) TTT 2(TTT ) = = . 2 −2 log(α) − log(α) χ(1−α;2) • For a sample of n units with running times t1, . . . , tn and an assumed exponential distribution, a conservative 100(1 − α)% lower confidence bound for θ is e θ= e • The lower bound θ can be translated into an lower confidence bound for functions like tp for specified p or a upper confidence bound for F (te) for a specified te. 7 - 27 • This bound is based on the fact that under the exponential failure-time distribution, with immediate replacement of failed units, the number of failures observed in a life test with a fixed total time on test has a Poisson distribution. Other Topics in Chapter 7 • Inferences when there are no failures. 7 - 29 Confidence Interval for the Mean Life of a New Insulating Material • A life test for a new insulating material used 25 specimens which were tested simultaneously at a high voltage of 30 kV. • The test was run until 15 of the specimens failed. • The 15 failure times (hours) were recorded as: 1.15, 3.16, 10.38, 10.75, 12.53, 16.74, 22.54, 25.01, 33.02, 33.93, 36.17, 39.06, 44.56, 46.65, 55.93 Then TTT = 1.15 + · · · + 55.93 + 10 × 55.93 = 950.88 hours. 113.26]. 7 - 26 2(950.88) 2(950.88) 1901.76 1901.76 = 2 , , 2 = 46.98 16.79 χ(.975;30) χ(.025;30) θb = 950.88/15 = 63.392 hours • The ML estimate of θ and a 95% confidence interval are: e θ , θ̃ = [40.48, Analysis of the Diesel Generator Fan Data (Assuming Removal After 200 Hours of Service) • Here we do the analysis of the fan data after 200 hours of testing when all the fans were still running. e θ= 28000 2(TTT ) = = 4674. 2 5.991 χ(.95;2) • Thus TTT =14,000 hours. A conservative 95% lower confidence bound on θ is 18,485 hours. e • Using the entire data set, θb = 28,701 and a likelihoodbased approximate 95% lower confidence bound is θ = e • A conservative 95% upper confidence bound on F (10000; θ) is Fe (10000) = F (10000; θ ) = 1 − exp(−10000/4674) = .882. 7 - 28 This shows how little information comes from a short test with zero or few failures.