W Io C o

advertisement
Chapter 8
Maximum Likelihood
for Location-Scale Based Distributions
William Q. Meeker and Luis A. Escobar
Iowa State University and Louisiana State University
December 14, 2015
8h 9min
8-1
Copyright 1998-2008 W. Q. Meeker and L. A. Escobar.
Based on the authors’ text Statistical Methods for Reliability
Data, John Wiley & Sons Inc. 1998.
3.8
•
•
δ
•
4.2
15000
• •
Log Kilometers
4.0
10000
Li (µ, σ; datai )
φnor
δi
•
••
20000
1−δi
•
4.4
••
25000
× 1 − Φnor
0
-2
-4
-6
log(ti ) − µ
σ
ti is an exact observation
ti is a right censored observation
log(ti ) − µ
σ
[f (ti ; µ, σ)] i [1 − F (ti ; µ, σ)]
1
σti
if
if
1−δi
8-3
Weibull Probability Plot of the Shock Absorber Data
.98
.9
.7
.5
.3
.2
.1
.05
.02
.01
.005
.003
.001
.0005
5000
Kilometers
Lognormal Distribution Model Likelihood
for Right Censored Data
• The lognormal distribution model is
Standard quantile
Pr(T ≤ t) = F (t; µ, σ) = Φnor {[log(t) − µ]/σ} .
1
0
i=1
i=1
n Y
n
Y
i=1
n
Y
• The likelihood has the form
(
=
=
L(µ, σ) =
δi =
φnor (z) is the standardized normal density.
8-5
Chapter 8
Maximum Likelihood
for Location-Scale Based Distributions
Objectives
• Illustrate likelihood-based methods for parametric models
based on log-location-scale distributions (especially Weibull
and Lognormal).
• Construct and interpret likelihood-ratio-based confidence
intervals/regions for model parameters and for functions
of model parameters.
• Construct and interpret normal-approximation confidence
intervals/regions.
8-2
• Describe the advantages and pitfalls of assuming that the
log-location-scale distribution shape parameter is known.
Weibull Distribution Model Likelihood
for Right Censored Data
• The Weibull distribution model is
Li (µ, σ; datai )
δ
[f (ti ; µ, σ)] i [1 − F (ti ; µ, σ)]
1−δi
Pr(T ≤ t) = F (t; µ, σ) = Φsev {[log(t) − µ]/σ} .
i=1
1
0
if
if
ti is an exact observation
ti is a right censored observation
δi 1−δi
n Y
1
log(ti ) − µ
log(ti ) − µ
× 1 − Φsev
φsev
σti
σ
σ
i=1
n
Y
i=1
n
Y
• The likelihood has the form
(
=
=
L(µ, σ) =
δi =
10
-0.6
-0.8
-1
-1.2
log(σ)
-1.4
-1.6
8-6
8-4
φsev (z) is the standardized smallest extreme value density.
.4
10
.2
10
Weibull Relative Likelihood
for the Shock Absorber Data
b = 10.23 and σ
b = .3164
ML Estimates: µ
b log(σ
b ))
R(µ, log(σ)) = L(µ, log(σ))/L(µ,
µ
0.6
0.8
1
1
Relative Likelihood
0 0.2 0.4 0.6 0.8 1
Proportion Failing
Probability
b σ
b)
R(µ, σ) = L(µ, σ)/L(µ,
0.1
10.6
10.8
0.01
Weibull Relative Likelihood
for the Shock Absorber Data
b = 10.23 and σ
b = .3164
ML Estimates: µ
0.2
8-7
•
10000
•
•
•
15000
15000
Weibull Probability Plot
10000
15000
Lognormal Probability Plot
10000
10000
15000
Loglogistic Probability Plot
•
•
•
20000
Lognormal Distribution ML Fit
Weibull Distribution ML Fit
Lognormal Distribution 95% Pointwise Confidence Intervals
•
•
•
25000
•
25000
30000
0.5
0.4
0.3
0.2
9.8
10.0
10.2
25
µ
60
50
70 80
10.4
90
10.6
95
99
10.8
8-8
Weibull Likelihood-Based Joint Confidence Regions for
µ and σ for the Shock Absorber Data
b = 10.23 and σ
b = .3164
ML Estimates: h µ
i
2
R(µ, σ) > exp −χ(1−α;2)
/2 = 100α%
.9
.7
.5
.3
.2
.1
.05
.03
.02
.01
.005
.003
.001
4000
6000
8000
10000
Kilometers
• Relative likelihood for (µ, σ) is
R(µ, σ) =
• General theory in the Appendix.
14000
18000
28000
β^ = 3.16
^
η
= 27719
22000
8 - 12
• If evaluated at the true (µ, σ), then, asymptotically, −2 log[R(µ, σ)]
follows, a chisquare distribution with 2 degrees of freedom.
L(µ, σ)
.
b σ
b)
L(µ,
Large-Sample Approximate Theory for Likelihood
Ratios for Parameter Vector
8 - 10
Weibull Probability Plot of Shock Absorber Failure
Times (Both Failure Modes) with Maximum
Likelihood Estimates and Normal-Approximation 95%
Pointwise Confidence Intervals for F (t)
σ
0.7
0.4
10.4
5000
5000
25000
0.6
µ
0.7
0.9
0.001
10.2
.5
.1
.005
.0005
.8
.4
.1
.02
.005
.001
25000
0.7
0.001
10.0
30000
30000
8-9
0.6
0.5
0.4
0.3
0.2
9.8
Six-Distribution ML Probability Plot
of the Shock Absorber Data
25000
25000
5000
Shock Absorber Data (Both Failure Modes)
Probability Plots with ML Estimates and Pointwise 95% Confidence Intervals
25000
Smallest Extreme Value Probability Plot
5000
Fraction Failing
σ
.7
.3
.1
20000
.02
.005
15000
20000
20000
.8
.5
.2
.05
.02
10000
15000
Normal Probability Plot
10000
15000
Logistic Probability Plot
10000
30000
.001
5000
5000
5000
.005
.9
.6
.3
.1
.02
.002
.9
.6
.3
.1
.02
.005
.7
.6
.5
.4
.3
.2
.1
.05
.02
.01
Kilometers
8 - 11
Lognormal Probability Plots of Shock Absorber Data
with ML Estimates and Normal-Approximation 95%
Pointwise Confidence Intervals for F (t). The Curved
Line is the Weibull ML Estimate.
Proportion Failing
10.2
µ
10.4
10.6
10.8
0.99
0.95
0.90
0.80
0.70
0.60
0.50
Weibull Profile Likelihood R(µ) (exp(µ) ≈ t.63)
for the Shock Absorberi Data
h
L(µ,σ)
R(µ) = max
L(µ
b ,σ
b)
σ
10.0
Need: Inferences on subset θ 1, from the partition θ = (θ 1, θ 2)′.
• k1 = length(θ 1).
"
L(µ, σ)
.
b σ
b)
L(µ,
#
• When (θ 1, θ 2)′ = (µ, σ), profile likelihood for θ 1 = µ is
R(µ) = max
σ
8 - 15
• If evaluated at the true θ 1 = µ, then, asymptotically, −2 log[R(µ)]
follows, a chisquare distribution with k1 = 1 degrees of freedom.
• General theory in the Appendix.
0.6
0.5
0.01
18000
0.7 0.4 0.20.1
14000
0.9
b)
R(t.1, σ) = L(t.1, σ)/L(tb.1, σ
10000
8 - 17
Contour Plot of Weibull Relative Likelihood R(t.1, σ)
for the Shock Absorber Data
(Parameterized with t.1 and σ)
σ
0.4
0.3
0.2
6000
t.1
Profile Likelihood
0.3
0.4
σ
0.5
0.6
0.99
0.95
0.90
0.80
0.70
0.60
0.50
Weibull Profile Likelihood R(σ) (σ = 1/β)
for the Shock Absorberi Data
h
L(µ,σ)
R(σ) = max
L(µ
b ,σ
b)
µ
1.0
0.8
0.6
0.4
0.2
0.0
0.2
i
8 - 16
8 - 18
• Simple to implement if function and its inverse are easy to
compute.
• Then follow previous procedure.
• Define the function of interest as one of the parameters,
replacing one of the original parameters giving one-to-one
reparameterization g (µ, σ) = [g1(µ, σ), g2(µ, σ)].
• Likelihood approach can be applied to functions of parameters.
Confidence Regions and Intervals for
Functions of µ and σ
• Can improve the asymptotic approximation with simulation
(only small effect except in very small samples).
• Transformation of θ 1 will not affect the confidence statement.
2
R(θ 1) > exp −χ(1−α;k
/2 .
1)
h
or, equivalently, the set defined by
2
−2 log[R(θ 1)] < χ(1−α;k
1)
• An approximate 100(1 − α)% likelihood-based confidence
region for θ 1 is the set of all values of θ 1 such that
Asymptotic Theory of Likelihood Ratios – Continued
8 - 14
Confidence Level
1.0
0.8
0.6
0.4
0.2
0.0
9.8
8 - 13
Confidence Level
Large-Sample Approximate Theory for Likelihood
Ratios for Parameter Vector Subset
Profile Likelihood
10000
t.1
12000
16000
L(t.1,σ
b)
14000
18000
20000
Weibull Profile Likelihood R(t.1)
for the Shock Absorber Data
L(t.1,σ)
R(t.1) = max
b
σ
8000
h
θ
b − θ )′ Σ
(θ
b
i−1
b − θ)
(θ
#
0.50
0.60
0.70
0.80
0.90
0.95
0.99
8 - 19
has a chisquare distribution with k degrees of freedom,
where k is the length of θ .
"
∂ 2L(θ )
.
∂ θ∂ θ′
• Here, Σb = Iθ−1 is the large sample approximate covariance
θ
matrix where the Fisher information matrix for θ is
Iθ = E −
8 - 21
Asymptotic Theory for Wald’s Statistic – Continued
• An approximate 100(1 − α)% confidence region for θ is the
set of all values of θ in the ellipsoid
i
h
b − θ ) ≤ χ2
b − θ )′ Σ̂ −1 (θ
(θ
b
(1−α;k).
θ
• This is sometimes known as the normal-theory confidence
region.
• Can specialize to functions or subsets of θ .
• Can transform to improve asymptotic approximation. Try
to get a log likelihood with approximate quadratic shape.
8 - 23
0.02
0.04
0.08
F(10000)
0.06
0.10
0.12
0.14
0.99
0.95
0.90
0.80
0.70
0.60
0.50
Weibull Profile Likelihood R[F (10000)]
for the Shock Absorber
Data 



L[F
h (10000),σ]i
R [F (10000)] = max
σ  L Fb(10000),σb 
1.0
0.8
0.6
0.4
0.2
0.0
0.00
Asymptotic Theory for Wald’s Statistic
Profile Likelihood
"
∂ 2L(θ )
∂ θ∂ θ′
h
θ
b − θ )′ Σ̂
w(θ ) = (θ
b
• Asymptotically, the Wald statistic
"
d µ)
d µ,
b
b σ
b)
Var(
Cov(
d µ,
d σ
b σ
b ) Var(
b)
Cov(
q
e
[µ,
d µ).
b
Var(
#
#−1
=
b − θ)
(θ
"
8 - 20
8 - 22
.01208 .00399
.00399 .00535
b ± z(1−α/2)sc
eµb
µ̃] = µ
b /w,
b × w]
[σ , σ̃] = [σ
σ
e
q
i
h
d σ
b and sc
b ).
eσb = Var(
where w = exp z(1−α/2)sceσb /σ
#
8 - 24
b )−log(σ)]/sc
• Assuming that Zlog(σb) = [log(σ
elog(σb) ∼
˙ NOR(0, 1)
an approximate 100(1 − α)% confidence interval for σ is
where sceµb =
b − µ)/sc
• Assuming that Zµb = (µ
eµb ∼
˙ NOR(0, 1) distribution,
an approximate 100(1 − α)% confidence interval for µ is
Σ̂µb,σb =
• Estimated variance matrix for the shock absorber data
Normal-Approximation Confidence Intervals
for Model Parameters
when evaluated at the true θ , follows a chisquare distribution with k degrees of freedom, where k is the length of
θ.
i−1
b.
where the derivatives are evaluated at θ
θ
Σ̂b = −
• Let Σ̂b be a consistent estimator of Σb , the asymptotic
θ
θ
b . For example,
covariance matrix of θ
• Alternative asymptotic theory is based on the large-sample
distribution of quadratic forms (Wald’s statistic).
Confidence Level
1.0
0.8
0.6
0.4
0.2
0.0
6000
Asymptotic Theory of ML Estimation
Confidence Level
• If evaluated at the true value of θ , then asymptotically,
b has a MVN(θ , Σ ) and thus the Wald
(large samples) θ
b
θ
statistic
b denote the ML estimator of θ .
Let θ
Profile Likelihood
Normal-Approximation Confidence Intervals
for Function g1 = g1(µ, σ)
b σ
b ).
• ML estimate gb1 = g1(µ,
"
e
[g1,
d gb1) =
Var(
2
d µ
Var(
b) +
∂g1
∂σ
2
d σ
Var(
b) + 2
g̃1] = gb1 ± z(1−α/2)scegb1 ,
∂g1
∂µ
∂g1
∂µ
8 - 25
∂g1
∂σ
e
F̃ ] = Fb (te) ± z(1−α/2)sceFb .
Fb (te)
#
d µ
Cov(
b, σ
b)
• Assuming Zgb1 = (gb1 −g1)/scegb1 ∼
˙ NOR(0, 1), an approximate
100(1 − α)% confidence interval for g1 is
q
where
b bg =
se
1
b σ
b.
• Partial derivatives evaluated at µ,
• General theory in the appendix.
Confidence Interval for F (te; µ, σ)–Continued
Note: Wald’s confidence intervals depend on the parameterization used to derive the intervals.
"
[F ,
For example, an approximate 100(1−α)% confidence interval
for F (te; µ, σ) can be obtained using:
F̃ ] =
• The asymptotic normality of ZFb = (Fb − F )/sceFb
e
Fb (te) + (1 − Fb (te)) × w
Fb (te)
,
8 - 27
Fb (te) + (1 − Fb (te))/w
#
• The asymptotic normality of Zlogit(Fb) = [logit(Fb )−logit(F )]/scelogit(F
[F ,
where w = exp{z(1−α/2)sceFb /[Fb (te)(1 − Fb (te))]}.
ML Estimates for Biomedical Data
R Proc Reliability ML estimates (Weibull
Here we display SAS
and lognormal) for the DMBA and the IUD data.
8 - 29
Normal-Approximation Confidence Interval
for F (te; µ, σ)
Objective: Obtain a point estimate and a confidence interval
for Pr(T ≤ te) = F (te; µ, σ) at a fixed and known point te.
θ
b = (µ,
b σ
b ) and Σ̂b are available.
• The ML estimates θ
• The ML estimate for F (te; µ, σ) is
c)
b σ
b ) = Φ(ζ
Fb = F (te; µ,
e
b σ
b.
where ζce = [log(te) − µ]/
8 - 26
• In the context of Wald’s theory, however, there are many
ways to obtain a confidence interval for F (te; µ, σ).
Confidence Interval for F (te; µ, σ)—Continued
Comments:
• Often the confidence interval based on the asymptotic normality of ZFb has poor statistical properties caused by the
slow convergence toward normality of ZFb .
8 - 28
• The confidence interval based on the transformation Zlogit(Fb)
can have better statistical properties if Zlogit(Fb) converges
to normality faster than ZFb .
99.9
99
95
80
70
60
50
40
30
20
5
10
2
1
.5
.2
.1
100
Days
500
8 - 30
R Proc Reliability
SAS
Nonparametric and Weibull ML Estimate for DMBA
Data with Parametric Pointwise Approximate 95%
Confidence Intervals
Percent
R Proc Reliability
SAS
Nonparametric and Lognormal ML Estimate for
DMBA Data with Parametric Pointwise Approximate
95% Confidence Intervals
99.9
99
10
100
8 - 32
R Proc Reliability
SAS
Lognormal ML Estimate for IUD Data with a set of
Pointwise Approximate 95% Confidence Intervals
95
90
80
20
5
10
2
1
.5
.2
5
Weeks
Inference when σ (or Weibull β) is Given
sceη̂ =
η̂
β
s
1
.
r
• Simplifies problem. Only one parameter with r failures and
,
t , . . . , tn failures and censor times
1

P
β 1/β
n
t
η̂ =  i=1 i 
r
• Provides much more precision, especially with small r.
95
90
80
70
60
50
40
30
Row
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
0
Hours
1000
1500
Bearing Cage Failure Data
500
2000
2
11
6
123
93
47
41
27
106
99
110
114
119
127
111
124
288
148
Count
Bearing-Cage Fracture Data Event Plot
• Danger of misleading inferences.
• Requires sensitivity analysis because β is in doubt.
◮ Lower confidence bound on tp.
◮ Upper confidence bound on F (t).
20
5
2
1
.5
.2
5
Bearing-Cage Fracture Field Data
• A population of n = 1703 units had been introduced into
service over time and 6 failures have been observed.
• There is concern that the B10 design life specification of
t.1 = 8 thousand hours was not being met.
• ML estimate is tb.1= 3.903 thousand hours and an approximate 95% likelihood-ratio confidence interval for t.1 is [2.093,
22.144] thousand hours.
• Management also wanted to know how many additional failures could be expected in the next year.
8 - 35
8 - 36
8 - 34
• If 0 failures can provide
100
8 - 33
30
70
2
1
.5
.2
.1
100
5
10
80
70
60
50
40
30
20
95
8 - 31
90
500
60
50
40
Days
Percent
10
10
Weeks
R Proc Reliability
SAS
Weibull ML Estimate for IUD Data with a set of
Pointwise Approximate 95% Confidence Intervals
Percent
Percent
.7
.2
.03
.005
.001
.0002
.00003
.7
.2
.03
.005
.001
.0002
.00003
200
200
β estimated
Hours
1000
β = 2 (or σ = .5 )
1000
5000
5000
.7
.2
.03
.005
.001
.0002
.00003
.7
.2
.03
.005
.001
.0002
.00003
200
200
5000
5000
β = 1.5 (or σ = .667)
Hours
1000
β = 3 (or σ = .333)
1000
Hours
8 - 37
Weibull Probability Plots Bearing-Cage Fracture Data
with Weibull ML Estimates and Sets of 95% Pointwise
Confidence Intervals for F (t) with β Estimated, and
Assumed Known Values of β= 1.5, 2, and 3.
1
2
3
4
5
6
β
F(t)
.5
.3
.2
.1
.05
.03
.02
.01
.005
.003
.001
.0005
1.5
2
500
β= 2.5
1000
β
1
2000
5000
8 - 41
Weibull Model 95% Upper Confidence Bounds on F (t)
for Component-A with Different Fixed Values for the
Weibull Shape Parameter
8 - 39
dence bound for functions like tp for specified p or a upper
confidence bound for F (te) for a specified te.
e
• The lower bound η can be translated into an lower confi-
2 n t β
η =  2 i=1 i  .
χ(1−α;2)
e
 P
• For a sample of n units with running times t1, . . . , tn and no
failures, a conservative 100(1−α)% lower confidence bound
for η is
• ML Estimate for the Weibull Scale Parameter η Cannot be
Computed Unless the Available Data Contains One or More
Failures.
Weibull/SEV Distribution with Given β = 1/σ
and Zero Failures
Hours
Fraction Failing
Fraction Failing
Time
.6
.5
.4
.3
.2
.1
.05
.02
.01
.005
.002
200
•
•
•
•
1000
•
Hours
2000
Lognormal Distribution ML Fit
Weibull Distribution ML Fit
Lognormal Distribution 95% Pointwise Confidence Intervals
•
500
Component A Safe Data
5000
Lognormal and Weibull Comparison
Bearing-Cage Fracture Field Data
Lognormal Probability Paper
14
12
8
10
6
4
2
0
-2
0.9
0.5
2
0.3
0.2
mu
6
10
14
0.1
10000
8 - 38
3500
6
Relative Weibull Likelihood
with One Failure at 1 and One Survivor at 2
8 - 42
8 - 40
4000
3
• Can replacement be increased from 2000 hours to 4000
hours?
Hours:
500 1000 1500 2000 2500 3000
Number of Units:
10
12
8
9
7
9
Staggered entry data, with no reported failures.
• Newly designed components were put into service during
the past year and no failures have been reported.
• Previous experience suggests that the Weibull shape parameter is near β = 2, and almost certainly between 1.5 and
2.5.
• Because of persistent reliability problems, the component
was redesigned to have a longer service life.
• A metal component in a ship’s propulsion system fails from
fatigue-caused fracture.
.0001
.00005
.001
.0005
Proportion Failing
sigma
Fraction Failing
Fraction Failing
0.9
0.5
0.3
1.0
0.2
mu
1.4
0.1
1.8
8 - 43
Relative Weibull Likelihood
with One Failure at 1.9 and One Survivor at 2
1.0
0.8
0.6
0.4
0.2
0.0
0.6
Regularity Conditions – Continued
• Identifiability.
8 - 45
• Can exchange the order of differentiation of log likelihood
w.r.t. θ and integration w.r.t. data.
• Bounded derivatives of likelihood.
• Continuous derivatives of log likelihood (w.r.t. θ ).
• Number of parameters does not grow too fast with n.
• Support does not depend on unknown parameters.
Some typical regularity conditions include:
sigma
Regularity Conditions
• Each technical result (e.g., asymptotic distribution of an
estimator) has its own set of conditions on the model (see
Lehmann 1983, Rao 1973).
• Frequent reference to Regularity Conditions which give rise
to simple results.
• For special cases the regularity conditions are easy to state
and check. For example, for some location-scale distributions the needed conditions are:
z 2φ2(z)
= 0
lim
z→−∞ Φ(z)
z 2φ2(z)
= 0.
lim
z→+∞ 1 − Φ(z)
8 - 44
• In non-regular models, asymptotic behavior is more complicated (e.g., behavior depends on θ ), but there are still
useful asymptotic results.
Other Topics Related to Parametric Likelihood
Covered in the Book
• Truncated data (Chapter 11).
• Threshold parameters (Chapter 11).
• Other distributions (e.g., gamma) (Chapter 11).
• Bayesian methods (Chapter 14).
• Multiple failure modes (Chapter 15).
8 - 46
Download