Chapter 17
Failure-Time Regression Analysis
William Q. Meeker and Luis A. Escobar
Iowa State University and Louisiana State University
Copyright 1998-2008 W. Q. Meeker and L. A. Escobar.
Based on the authors’ text Statistical Methods for Reliability
Data, John Wiley & Sons Inc. 1998.
December 14, 2015
Computer Program Execution Time
Versus System Load
• Time to complete a computationally intensive task.
• Information from the Unix uptime command
• Predictions needed for scheduling subsequent steps in a
multi-step computational process.
Explanatory Variables for Failure Times
Useful explanatory variables explain/predict why some units
fail quickly and some units survive a long time.
• Continuous variables like stress, temperature, voltage, and
pressure.
• Discrete variables like number of hardening treatments or
number of simultaneous users of a system.
• Categorical variables like manufacturer, design, and location.
Regression model relates failure time distribution to explanatory variables x = (x1, . . . , xk ):
Pr(T ≤ t) = F (t) = F (t; x).
Chapter 17
Failure-Time Regression Analysis
Objectives
• Describe applications of failure-time regression
• Describe graphical methods for displaying censored regression data.
• Introduce time-scaling transformation functions.
• Describe simple regression models to relate life to explanatory variables.
• Illustrate the use of likelihood methods for censored regression data.
• Explain the importance of model diagnostics.
• Describe and illustrate extensions to nonstandard multiple
regression models
[Figure: Scatter Plot of Computer Program Execution Time Versus System Load. Horizontal axis: System Load; vertical axis: Seconds.]
Failure-Time Regression Analysis
• Material in this chapter is an extension of statistical regression analysis with normally distributed data and
mean = β0 + β1x1 + · · · + βkxk
where the xi are explanatory variables.
• The ideas presented here are more general:
◮ Data not necessarily from a normal distribution.
◮ Data may be censored.
◮ Nonstandard regression models that relate life to explanatory variables.
• Presentation motivated by practical problems in reliability analysis.
Nickel-Base Super-Alloy Fatigue Data
26 Observations in Total, 4 Censored
(Nelson 1984, 1990)
• Originally described and analyzed by Nelson (1984 and 1990).
• Thousands of cycles to failure as a function of pseudostress in ksi.
• 26 units tested; 4 units did not fail.
Objective: Explore models that might be used to describe
the relationship between life length and the amount of pseudostress applied to the tested specimens.
[Figure: SAFT Models Illustrating Acceleration and Deceleration. Time transformation plot with axes Time at Level x0 and Time at Level x; lines labeled "scale accelerating transformation" and "scale decelerating transformation".]
The SAFT Transformation Models and Acceleration
In a time transformation plot of T(x0) vs T(x):
• The SAFT models are straight lines through the origin.
• Accelerating SAFT models are straight lines below the diagonal.
• Decelerating SAFT models are straight lines above the diagonal.
[Figure: Nickel-Base Super-Alloy Fatigue Data (Nelson 1984, 1990). Scatter plot; horizontal axis: Pseudo-stress (ksi); vertical axis: Thousands of Cycles.]
Scale Accelerated Failure Time (SAFT) Model
• Scale Accelerated Failure Time (SAFT) models are defined by
T(x) = T(x0)/AF(x), AF(x) > 0, AF(x0) = 1
where typical forms for AF(x) are:
◮ log[AF(x)] = β1x with x0 = 0 for a scalar x.
◮ log[AF(x)] = β1x1 + . . . + βkxk with x0 = 0 for a vector x = (x1, . . . , xk).
• When AF(x) > 1, the model accelerates time in the sense that time moves more quickly at x than at x0 so that T(x) < T(x0).
• When 0 < AF(x) < 1, T(x) > T(x0), and time is decelerated (but we still call this a SAFT model).

Some Properties of SAFT Models
For a SAFT model T(x) = T(x0)/AF(x), (AF(x) > 0), with baseline cdf F(t; x0) so AF(x0) = 1:
• Scaled time: F(t; x) = F[AF(x) t; x0]. Thus the cdfs F(t; x) and F(t; x0) do not cross each other.
• Proportional quantiles: tp(x) = tp(x0)/AF(x). Then taking logs gives
log[tp(x0)] − log[tp(x)] = log[AF(x)].
This shows that in any plot with a log-time scale tp(x0) and tp(x) are equidistant.
In particular, in a probability plot with a log-time scale, F(t; x) is a translation of F(t; x0) along the log(t) axis.
[Figure: Weibull Probability Plot of Two Members from an SAFT Model with a Baseline Lognormal Distribution. Horizontal axis: Time (log scale); vertical axis: Probability; the separation ∆ between the two cdfs is marked.]
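Estimated Parameter Variance-Covariance Matrix
Local (observed information) estimate:
\[
\hat{\Sigma}_{\hat{\theta}} =
\begin{bmatrix}
\widehat{\mathrm{Var}}(\hat{\beta}_0) & \widehat{\mathrm{Cov}}(\hat{\beta}_0,\hat{\beta}_1) & \widehat{\mathrm{Cov}}(\hat{\beta}_0,\hat{\sigma}) \\
\widehat{\mathrm{Cov}}(\hat{\beta}_1,\hat{\beta}_0) & \widehat{\mathrm{Var}}(\hat{\beta}_1) & \widehat{\mathrm{Cov}}(\hat{\beta}_1,\hat{\sigma}) \\
\widehat{\mathrm{Cov}}(\hat{\sigma},\hat{\beta}_0) & \widehat{\mathrm{Cov}}(\hat{\sigma},\hat{\beta}_1) & \widehat{\mathrm{Var}}(\hat{\sigma})
\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_0^2} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_0\,\partial \beta_1} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_0\,\partial \sigma} \\
-\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_1\,\partial \beta_0} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_1^2} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \beta_1\,\partial \sigma} \\
-\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \sigma\,\partial \beta_0} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \sigma\,\partial \beta_1} & -\dfrac{\partial^2 \mathcal{L}(\beta_0,\beta_1,\sigma)}{\partial \sigma^2}
\end{bmatrix}^{-1}
\]
Here $\mathcal{L}$ denotes the log likelihood. Partial derivatives are evaluated at β̂0, β̂1, σ̂.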
[Figure: Computer Program Execution Time Versus System Load, Log-Linear Lognormal Regression Model. Horizontal axis: System Load; vertical axis: Seconds; fitted 10%, 50%, and 90% quantile lines from log[t̂p(x)] = μ̂(x) + Φnor⁻¹(p) σ̂.]

Lognormal Distribution Simple Regression Model with
Constant Shape Parameter β = 1/σ
• The lognormal simple regression model is
\[
\Pr(T \le t) = F(t;\mu,\sigma) = F(t;\beta_0,\beta_1,\sigma) = \Phi_{\mathrm{nor}}\!\left[\frac{\log(t)-\mu}{\sigma}\right]
\]
where µ = µ(x) = β0 + β1x and σ does not depend on x.
• The failure-time log quantile function
\[
\log[t_p(x)] = \mu(x) + \Phi_{\mathrm{nor}}^{-1}(p)\,\sigma
\]
is linear in x.
Notice that
\[
\frac{t_p(x)}{t_p(0)} = \exp(\beta_1 x)
\]
implies that this regression model is a scale accelerated failure time (SAFT) model with AF(x) = exp(−β1x).
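The quantile function above can be evaluated directly. The following is a minimal sketch, not the authors' software: the function name `lognormal_quantile` is hypothetical, and the parameter values are the ML estimates reported in this chapter for the computer execution time data. It also checks numerically that tp(x)/tp(0) = exp(β1x) regardless of p, which is the SAFT property.

```python
import math
from statistics import NormalDist


def lognormal_quantile(p, x, beta0, beta1, sigma):
    """Return t_p(x) = exp(beta0 + beta1*x + z_p*sigma) for 0 < p < 1."""
    z_p = NormalDist().inv_cdf(p)  # standard normal quantile
    return math.exp(beta0 + beta1 * x + z_p * sigma)


# ML estimates quoted in the chapter (computer execution time data).
beta0, beta1, sigma = 4.49, 0.290, 0.312

# Ratio of quantiles at load x = 3 versus load 0, for several p
# (x = 3 is an illustrative value, not taken from the slides).
ratios = [
    lognormal_quantile(p, 3.0, beta0, beta1, sigma)
    / lognormal_quantile(p, 0.0, beta0, beta1, sigma)
    for p in (0.1, 0.5, 0.9)
]
print(ratios, math.exp(beta1 * 3.0))
```

Because the ratio does not depend on p, the quantile lines at different loads are parallel on a log-time scale.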
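Likelihood for Lognormal Distribution Simple
Regression Model with Right Censored Data
The likelihood for n independent observations has the form
\[
L(\beta_0,\beta_1,\sigma) = \prod_{i=1}^{n} L_i(\beta_0,\beta_1,\sigma;\mathrm{data}_i)
= \prod_{i=1}^{n}
\left\{ \frac{1}{\sigma t_i}\,\phi_{\mathrm{nor}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma}\right] \right\}^{\delta_i}
\left\{ 1-\Phi_{\mathrm{nor}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma}\right] \right\}^{1-\delta_i}
\]
where data_i = (xi, ti, δi), µi = β0 + β1xi,
\[
\delta_i =
\begin{cases}
1 & \text{exact observation} \\
0 & \text{right censored observation}
\end{cases}
\]
φnor(z) is the standardized normal pdf and Φnor(z) is the
corresponding normal cdf.
The parameters are θ = (β0, β1, σ).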
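As an illustration of the likelihood above, here is a minimal sketch of the corresponding log likelihood using only the Python standard library. The function name `lognormal_loglik` and the two data points are hypothetical; a real analysis would pass this function to a numerical optimizer.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi_nor, cdf is Phi_nor


def lognormal_loglik(beta0, beta1, sigma, data):
    """Log likelihood for right-censored lognormal simple regression.

    data is an iterable of (x, t, delta) with delta = 1 for an exact
    failure time and delta = 0 for a right-censored observation.
    """
    total = 0.0
    for x, t, delta in data:
        z = (math.log(t) - (beta0 + beta1 * x)) / sigma
        if delta == 1:
            # density contribution: phi(z) / (sigma * t)
            total += math.log(STD_NORMAL.pdf(z) / (sigma * t))
        else:
            # survival contribution: 1 - Phi(z)
            total += math.log(1.0 - STD_NORMAL.cdf(z))
    return total


# Hypothetical toy data: one exact failure and one right-censored unit.
toy_data = [(0.0, 1.0, 1), (0.0, 1.0, 0)]
toy_value = lognormal_loglik(0.0, 0.0, 1.0, toy_data)
print(toy_value)
```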
Standard Errors and Confidence Intervals
for Parameters
• Lognormal ML estimates for the computer time experiment
were θ̂ = (β̂0, β̂1, σ̂) = (4.49, .290, .312) and an estimate of
the variance-covariance matrix for θ̂ is
\[
\hat{\Sigma}_{\hat{\theta}} =
\begin{bmatrix}
.012 & -.0037 & 0 \\
-.0037 & .0021 & 0 \\
0 & 0 & .0029
\end{bmatrix}.
\]
• Normal-approximation confidence interval for the computer
execution time regression slope is
\[
[\underset{\sim}{\beta_1},\ \tilde{\beta}_1] = \hat{\beta}_1 \pm z_{(.975)}\,\widehat{\mathrm{se}}_{\hat{\beta}_1} = .290 \pm 1.96(.046) = [.20,\ .38]
\]
where $\widehat{\mathrm{se}}_{\hat{\beta}_1} = \sqrt{.0021} = .046$.
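The interval arithmetic above can be reproduced in a few lines. This is only a sketch of the calculation on the slide, using the quoted estimate and variance; the variable names are illustrative.

```python
import math
from statistics import NormalDist

beta1_hat = 0.290       # ML estimate of the slope (from the slide)
var_beta1_hat = 0.0021  # estimated variance of the slope (from the slide)

se_beta1 = math.sqrt(var_beta1_hat)
z_975 = NormalDist().inv_cdf(0.975)  # approximately 1.96

lower = beta1_hat - z_975 * se_beta1
upper = beta1_hat + z_975 * se_beta1
print(round(se_beta1, 3), round(lower, 2), round(upper, 2))
```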
Standard Errors and Confidence Intervals for
Quantities at Specific Explanatory Variable Conditions
[Figure: Log-Quadratic Weibull Regression Model with Constant (β = 1/σ) Fit to the Fatigue Data. Horizontal axis: Pseudo-stress (ksi); vertical axis: Thousands of Cycles; fitted 10%, 50%, and 90% quantile curves. Fitted model: log[t̂p(x)] = μ̂(x) + Φsev⁻¹(p) σ̂, with x = log(pseudo-stress) and μ̂ = β̂0 + β̂1x + β̂2x².]
Likelihood for Weibull Distribution Quadratic
Regression Model with Right Censored Data
The likelihood for n independent observations has the form
\[
L(\beta_0,\beta_1,\beta_2,\sigma) = \prod_{i=1}^{n} L_i(\beta_0,\beta_1,\beta_2,\sigma;\mathrm{data}_i)
= \prod_{i=1}^{n}
\left\{ \frac{1}{\sigma t_i}\,\phi_{\mathrm{sev}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma}\right] \right\}^{\delta_i}
\left\{ 1-\Phi_{\mathrm{sev}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma}\right] \right\}^{1-\delta_i}
\]
where µi = β0 + β1xi + β2xi²,
\[
\delta_i =
\begin{cases}
1 & \text{exact observation} \\
0 & \text{right censored observation}
\end{cases}
\]
The parameters are θ = (β0, β1, β2, σ).

Weibull Distribution Quadratic Regression Model
with Nonconstant β = 1/σ
• The Weibull quadratic regression model is
\[
\Pr[T \le t] = \Phi_{\mathrm{sev}}\{[\log(t)-\mu]/\sigma\},
\]
where
\[
\mu = \mu(x) = \beta_0^{[\mu]} + \beta_1^{[\mu]} x + \beta_2^{[\mu]} x^2
\quad\text{and}\quad
\log(\sigma) = \log[\sigma(x)] = \beta_0^{[\sigma]} + \beta_1^{[\sigma]} x.
\]
• The failure-time log quantile function is
\[
\log[t_p(x)] = \mu(x) + \Phi_{\mathrm{sev}}^{-1}(p)\,\sigma(x)
\]
which is not quadratic in x.
Also
\[
t_p(x) = \exp[\mu(x)-\mu(0)]\,\exp\!\left\{\Phi_{\mathrm{sev}}^{-1}(p)\,[\sigma(x)-\sigma(0)]\right\} \times t_p(0).
\]
Because the coefficient in front of tp(0) depends on p this
shows that the model is not a SAFT model.
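• Unknown values of µ and σ at each level of x.
• μ̂ = β̂0 + β̂1x, σ̂ does not depend on x, and
\[
\hat{\Sigma}_{\hat{\mu},\hat{\sigma}} =
\begin{bmatrix}
\widehat{\mathrm{Var}}(\hat{\mu}) & \widehat{\mathrm{Cov}}(\hat{\mu},\hat{\sigma}) \\
\widehat{\mathrm{Cov}}(\hat{\mu},\hat{\sigma}) & \widehat{\mathrm{Var}}(\hat{\sigma})
\end{bmatrix}
\]
is obtained from
\[
\widehat{\mathrm{Var}}(\hat{\mu}) = \widehat{\mathrm{Var}}(\hat{\beta}_0) + 2x\,\widehat{\mathrm{Cov}}(\hat{\beta}_0,\hat{\beta}_1) + x^2\,\widehat{\mathrm{Var}}(\hat{\beta}_1)
\]
and
\[
\widehat{\mathrm{Cov}}(\hat{\mu},\hat{\sigma}) = \widehat{\mathrm{Cov}}(\hat{\beta}_0,\hat{\sigma}) + x\,\widehat{\mathrm{Cov}}(\hat{\beta}_1,\hat{\sigma}).
\]
• Use the above results with the methods from Chapter 8
to compute normal-approximation confidence intervals for
F(t), h(t), and tp.
• Could also use likelihood or simulation-based confidence intervals.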
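The variance propagation above is a small linear-algebra step. The sketch below applies it to the variance-covariance matrix quoted in this chapter for the computer execution time data; the function name `mu_sigma_covariance` and the evaluation point x = 2 are illustrative assumptions.

```python
import math

# Estimated variance-covariance matrix of (beta0_hat, beta1_hat, sigma_hat),
# as quoted in the chapter for the computer execution time data.
SIGMA_HAT = [
    [0.012, -0.0037, 0.0],
    [-0.0037, 0.0021, 0.0],
    [0.0, 0.0, 0.0029],
]


def mu_sigma_covariance(x, cov):
    """Return the 2x2 covariance matrix of (mu_hat(x), sigma_hat)."""
    var_mu = cov[0][0] + 2.0 * x * cov[0][1] + x**2 * cov[1][1]
    cov_mu_sigma = cov[0][2] + x * cov[1][2]
    var_sigma = cov[2][2]
    return [[var_mu, cov_mu_sigma], [cov_mu_sigma, var_sigma]]


# Hypothetical condition of interest: system load x = 2.
result = mu_sigma_covariance(2.0, SIGMA_HAT)
se_mu = math.sqrt(result[0][0])
print(result, se_mu)
```

The resulting standard error of μ̂(x) is the input needed for normal-approximation intervals on quantities such as tp at that load.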
Weibull Distribution Quadratic Regression Model
with Constant Shape Parameter β = 1/σ
This is a SAFT model with the following characteristics:
• The Weibull quadratic regression model is
Pr[T ≤ t] = Φsev{[log(t) − µ]/σ}
where µ = µ(x) = β0 + β1x + β2x² and σ does not depend
on x.
• The failure-time log quantile function
log[tp(x)] = µ(x) + Φsev⁻¹(p) σ
is quadratic in x. Also
tp(x) = exp(β1x + β2x²) × tp(0)
which shows that the model is a SAFT model.
[Figure: Log-Quadratic Weibull Regression Model with Nonconstant β = 1/σ Fit to the Fatigue Data. Horizontal axis: Pseudo-stress (ksi); vertical axis: Thousands of Cycles; fitted 10%, 50%, and 90% quantile curves. Fitted model: log[t̂p(x)] = μ̂(x) + Φsev⁻¹(p) σ̂(x), with x = log(pseudo-stress), μ̂ = β̂0^[µ] + β̂1^[µ]x + β̂2^[µ]x², and log(σ̂) = β̂0^[σ] + β̂1^[σ]x.]
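Likelihood for Weibull Distribution Quadratic
Regression Model with Nonconstant β = 1/σ
and Right Censored Data
The likelihood for n independent observations has the form
\[
L(\beta_0^{[\mu]},\beta_1^{[\mu]},\beta_2^{[\mu]},\beta_0^{[\sigma]},\beta_1^{[\sigma]})
= \prod_{i=1}^{n} L_i(\beta_0^{[\mu]},\beta_1^{[\mu]},\beta_2^{[\mu]},\beta_0^{[\sigma]},\beta_1^{[\sigma]};\mathrm{data}_i)
= \prod_{i=1}^{n}
\left\{ \frac{1}{\sigma_i t_i}\,\phi_{\mathrm{sev}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma_i}\right] \right\}^{\delta_i}
\left\{ 1-\Phi_{\mathrm{sev}}\!\left[\frac{\log(t_i)-\mu_i}{\sigma_i}\right] \right\}^{1-\delta_i}
\]
where
\[
\mu_i = \beta_0^{[\mu]} + \beta_1^{[\mu]} x_i + \beta_2^{[\mu]} x_i^2
\quad\text{and}\quad
\sigma_i = \exp\!\left(\beta_0^{[\sigma]} + \beta_1^{[\sigma]} x_i\right).
\]
Parameters are θ = (β0^[µ], β1^[µ], β2^[µ], β0^[σ], β1^[σ]).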
[Figure: Stanford Heart Transplant Data, Quadratic Log-Location Weibull Regression Model. Axis labels: Years and Days (log scale); fitted 10%, 50%, and 90% quantile curves.]
Checking Model Assumptions
• Graphical checks using generalizations of usual diagnostics
(including residual analysis)
◮ Residuals versus fitted values.
◮ Probability plot of residuals.
◮ Other residual plots.
◮ Influence (or sensitivity) analysis
(see Escobar and Meeker 1992 Biometrics).
• Most analytical tests can be suitably generalized, at least
approximately, for censored data (especially using likelihood
ratio tests).
[Figure: Stanford Heart Transplant Data, Quadratic Log-Mean Lognormal Regression Model. Axis labels: Years and Days (log scale); fitted 10%, 50%, and 90% quantile curves.]
Extrapolation and Empirical Models
• Empirical models can be useful, providing a smooth curve
to describe a population or a process.
• Should not be used to extrapolate outside of the range of
one’s data.
• There are many different kinds of extrapolation
◮ In an explanatory variable like stress
◮ To the upper tail of a distribution
◮ To the lower tail of a distribution
• Need to get the right curve to extrapolate: look toward
physical or other process theory

Definition of Residuals
• For location-scale distributions like the normal, logistic, and
smallest extreme value,
\[
\hat{\epsilon}_i = \frac{y_i - \hat{y}_i}{\hat{\sigma}}
\]
where ŷi is an appropriately defined fitted value (e.g., ŷi = μ̂i).
• With models for positive random variables like Weibull, lognormal, and loglogistic, standardized residuals are defined
as
\[
\exp(\hat{\epsilon}_i) = \exp\!\left[\frac{\log(t_i)-\log(\hat{t}_i)}{\hat{\sigma}}\right] = \left(\frac{t_i}{\hat{t}_i}\right)^{1/\hat{\sigma}}
\]
where t̂i = exp(μ̂i) and when ti is a censored observation,
the corresponding residual is also censored.
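The standardized residuals for positive-response models, exp(ε̂i) = (ti / t̂i)^(1/σ̂) with t̂i = exp(μ̂i), can be computed as in the following minimal sketch. The function names and the toy inputs are hypothetical; the censoring indicator is simply carried along so that a censored time yields a censored residual.

```python
import math


def standardized_residual(t, mu_hat, sigma_hat):
    """Return exp(eps_hat) = (t / t_hat) ** (1 / sigma_hat), t_hat = exp(mu_hat)."""
    t_hat = math.exp(mu_hat)
    return (t / t_hat) ** (1.0 / sigma_hat)


def residuals_with_censoring(times, deltas, mu_hats, sigma_hat):
    """Pair each residual with its indicator (1 = exact, 0 = right censored)."""
    return [
        (standardized_residual(t, mu, sigma_hat), d)
        for t, d, mu in zip(times, deltas, mu_hats)
    ]


# Hypothetical toy inputs: two observations, the second right censored.
toy = residuals_with_censoring([100.0, 250.0], [1, 0], [4.49, 5.07], 0.312)
print(toy)
```

A censored residual would then be plotted with a censoring symbol in residual plots and handled as censored in a probability plot.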
[Figure: Plot of Standardized Residuals Versus Fitted Values for the Log-Quadratic Weibull Regression Model Fit to the Super Alloy Data on Log-Log Axes. Horizontal axis: Fitted Values; vertical axis: Standardized Residuals.]
Empirical Regression Models
and Sensitivity Analysis
Objectives
• Describe a class of regression models that can be used to
describe the relationship between failure time and explanatory variables.
• Describe and illustrate the use of empirical regression models.
• Illustrate the need for sensitivity analysis.

[Figure: Picciotto Yarn Failure Data Showing Fraction Failing at Each Length. Axis labels: Length and Thousands of Cycles; fractions failing annotated on the plot: 22/99, 71/99, 92/100.]
[Figure: Probability Plot of the Standardized Residuals from the Log-Quadratic Weibull Regression Model Fit to the Super Alloy Data. Horizontal axis: Standardized Residuals; vertical axis: Probability.]
Picciotto Yarn Failure Data
(Picciotto 1970)
• Cycles to failure on specimens of yarn of different lengths.
• Subset of data at a particular level of stress and particular
specimen lengths (30mm, 60mm, and 90mm)
• Original data were uncensored. For purposes of illustration,
the data were censored at 100 thousand cycles.
[Figure: Picciotto Yarn Failure Data, Multiple Weibull Probability Plot with Weibull ML Estimates (individual Weibull MLEs for lengths 30, 60, and 90 mm). Horizontal axis: Thousands of Cycles; vertical axis: Probability.]
• Suppose that the goal is to estimate life for 10 mm units.
• Show how to conduct a sensitivity analysis.

[Figure: Picciotto Yarn Failure Data, Multiple Lognormal Probability Plot with Lognormal ML Estimates (individual lognormal MLEs for lengths 30, 60, and 90 mm). Horizontal axis: Thousands of Cycles; vertical axis: Probability.]

[Figure: Picciotto Yarn Failure Data, Multiple Lognormal Probability Plot with Lognormal ML Log-Linear Regression Model Estimates. Horizontal axis: Thousands of Cycles; vertical axis: Probability.]
Suggested Strategy for Fitting Empirical Regression
Models and Sensitivity Analysis
• Use data and previous experience to choose a base-line
model:
◮ Distribution at individual conditions
◮ Relationship between explanatory variables and distributions at individual conditions
• Fit models
• Use diagnostics (e.g., residual analysis) to check models
• Assess uncertainty
◮ Confidence intervals quantify statistical uncertainty
◮ Perturb and otherwise change the model and reanalyze
(sensitivity analysis) to assess model uncertainty.
[Figure: Scatter Plot of the Picciotto Yarn Failure Data with Lognormal ML Log-Linear Regression Model Estimate Densities. Horizontal axis: Length on log scale; vertical axis: Thousands of Cycles; fitted 10%, 50%, and 90% quantile lines from log[t̂p(x)] = β̂0 + β̂1 log(Length) + Φnor⁻¹(p) σ̂.]

[Figure: Picciotto Yarn Failure Data, Multiple Lognormal Probability Plot with Lognormal ML Squareroot-Linear Regression Model Estimates. Horizontal axis: Thousands of Cycles; vertical axis: Probability.]

[Figure: Picciotto Yarn Failure Data, Log-Linear Model Residual vs Fitted Values Plot. Horizontal axis: Fitted Values; vertical axis: Standardized Residuals.]

[Figure: Scatter Plot of the Picciotto Yarn Failure Data with Lognormal ML Squareroot-Linear Regression Model Estimate Densities. Horizontal axis: Length on sqrt scale; vertical axis: Thousands of Cycles; fitted 10%, 50%, and 90% quantile lines from log[t̂p(x)] = β̂0 + β̂1 √Length + Φnor⁻¹(p) σ̂.]
Examples of Monotone Increasing
Power Transformations of a
Positive Explanatory Variable

  λ        Transformation
  −2       xi∗ = −1/xi²
  −1       xi∗ = −1/xi
  −.5      xi∗ = −1/√xi
  −.333    xi∗ = −1/xi^(1/3)
  0        xi∗ = log(xi)
  .333     xi∗ = xi^(1/3)
  .5       xi∗ = √xi
  1        xi∗ = xi
  2        xi∗ = xi²
[Figure: Picciotto Yarn Failure Data, Relationship Sensitivity Analysis for the 0.1 Quantile at Length 30 mm. Horizontal axis: Box-Cox Transformation Power; vertical axis: 0.1 Quantile of Life in Thousands of Cycles; ML estimate of the 0.1 quantile with 95% pointwise confidence intervals.]

[Figure: Picciotto Yarn Failure Data, Squareroot-Linear Model Residual vs Fitted Values Plot. Horizontal axis: Fitted Values; vertical axis: Standardized Residuals.]
[Figure: Picciotto Yarn Failure Data, Relationship Sensitivity Analysis for the 0.1 Quantile at Length 10 mm. Horizontal axis: Box-Cox Transformation Power; vertical axis: 0.1 Quantile of Life in Thousands of Cycles; ML estimate of the 0.1 quantile with 95% pointwise confidence intervals.]

Box-Cox Transformation
• The Box-Cox family of power transformations of a positive
explanatory variable is
\[
x_i^{*} =
\begin{cases}
\dfrac{x_i^{\lambda}-1}{\lambda} & \lambda \neq 0 \\[1ex]
\log(x_i) & \lambda = 0
\end{cases}
\]
where xi is the original, untransformed explanatory variable
for observation i and λ is the power transformation parameter.
• The Box-Cox transformation has the following properties:
◮ The transformed value xi∗ is an increasing function of xi.
◮ For fixed xi, xi∗ is a continuous function of λ through 0.
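The Box-Cox family and its two stated properties (monotone in xi, continuous in λ through 0) can be checked numerically. This is a minimal sketch; the function name `box_cox` is illustrative, and `math.expm1` is used only to keep the computation stable for λ near zero.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive x: (x**lam - 1)/lam, or log(x) at lam = 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)
    # expm1(lam * log(x)) equals x**lam - 1, computed accurately for small lam
    return math.expm1(lam * math.log(x)) / lam


# Yarn specimen lengths used in the chapter (mm), transformed at several powers.
for lam in (-1.0, 0.0, 0.5, 1.0):
    print(lam, [box_cox(length, lam) for length in (30.0, 60.0, 90.0)])
```

In a sensitivity analysis, the regression model would be refit with the transformed explanatory variable for a grid of λ values, and the quantity of interest plotted against λ.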
[Figure: Picciotto Yarn Failure Data, Profile Plot for Different Box-Cox Parameters for the Length Relationship. Profile likelihood and 95% confidence interval for the Box-Cox transformation power from the lognormal distribution. Horizontal axis: Box-Cox Transformation Power; vertical axes: Profile Likelihood and Confidence Level.]

[Figure: Picciotto Yarn Failure Data, Relationship/Distribution Sensitivity Analysis, 0.1 Quantile at 10 mm Length. Horizontal axis: Box-Cox Transformation Power; vertical axis: 0.1 Quantile of Life in Thousands of Cycles; curves for the lognormal and Weibull 0.1 quantiles.]

[Figure: Scatter Plot of the Effect of Voltage and Temperature on Glass Capacitor Life (Zelen 1959). Horizontal axis: Volts; vertical axis: Hours; plotting symbols for 170 Degrees C and 180 Degrees C.]

[Figure: Picciotto Yarn Failure Data, Relationship Sensitivity Analysis, 0.1, 0.5, 0.9 Quantiles at 10 mm Length. Horizontal axis: Box-Cox Transformation Power; vertical axis: Quantile of Life in Thousands of Cycles; curves for the lognormal 0.1, 0.5, and 0.9 quantiles.]
Glass Capacitor Failure Data
• Original data from Zelen (1959).
• Experiment designed to determine the effect of voltage and
temperature on capacitor life.
• 2 × 4 factorial, 8 units at each combination.
• Test at each combination run until 4 of 8 units failed (Type
II censoring).

[Figure: Weibull Probability Plots of Glass Capacitor Life Test Results at Individual Temperature and Voltage Test Conditions. Horizontal axis: Hours; vertical axis: Proportion Failing; one line for each combination of 170 or 180 degrees C with 200, 250, 300, or 350 volts.]
The Proportional Hazard Failure Time Model
The proportional hazard (PH) model assumes
h(t; x) = Ψ(x)h(t; x0), for all t > 0.
h(t; x0) and F(t; x0) denote the baseline hazard function
and cdf of the model.
The PH model implies (some details later):
• S(t; x) = [S(t; x0)]^Ψ(x) or 1 − F(t; x) = [1 − F(t; x0)]^Ψ(x).
If Ψ(x) ≠ 1, F(t; x) and F(t; x0) do not cross and the model
is accelerating if Ψ(x) > 1 and decelerating if Ψ(x) < 1.
• From 1 − F(t; x) = [1 − F(t; x0)]^Ψ(x) and taking logs (twice):
log{− log[1 − F(t; x)]} − log{− log[1 − F(t; x0)]} = log[Ψ(x)].
Thus in a Weibull probability scale F(t; x) and F(t; x0)
are equidistant. In particular, Weibull plots of F(t; x) and
F(t; x0) are translations of each other along the probability
axis.

[Figure: Proportional Hazard Model (Lognormal Baseline) as a Time Transformation. Axes: Time at Level x0 and Time at Level x; curves labeled "accelerating proportional hazards" and "decelerating proportional hazards".]

[Figure: Weibull Probability Plots with Weibull Regression Model ML Estimates of F(t) at each Set of Conditions for the Glass Capacitor Data. Horizontal axis: Hours; vertical axis: Proportion Failing; one line for each combination of 170 or 180 Deg C with 200, 250, 300, or 350 V.]

[Figure: Estimates of Weibull t0.5 Plotted for each Combination of the Glass Capacitor Test Conditions (points are regression-model-free estimates). Horizontal axis: Volts; vertical axis: Hours; series for 170 Degrees C and 180 Degrees C.]

Two-Variable Regression Models
• Additive model
log[tp(x)] = yp(x) = β0 + β1x1 + β2x2 + Φ⁻¹(p)σ.
Because
tp(x) = exp[yp(x)] = exp(β1x1 + β2x2) tp(0)
this is a SAFT model.
• The interaction model
log[tp(x)] = yp(x) = β0 + β1x1 + β2x2 + β3x1x2 + Φ⁻¹(p)σ
is also SAFT.
• Comparing the two models gives
−2 × (L1 − L2) = −2 × (−244.24 + 244.17) = 0.14
which is small relative to χ²(.95,1) = 3.84.
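The likelihood-ratio comparison of the additive and interaction models is a one-line calculation. The sketch below reproduces it with the log likelihood values quoted on the slide, assuming (as the nesting of the models implies) that −244.24 belongs to the additive model and −244.17 to the interaction model. The 0.95 chi-square quantile with 1 degree of freedom is obtained as the square of the 0.975 standard normal quantile, which avoids any non-standard library.

```python
from statistics import NormalDist

loglik_additive = -244.24     # L1 on the slide (assumed: additive model)
loglik_interaction = -244.17  # L2 on the slide (assumed: interaction model)

# Likelihood-ratio statistic for the single extra parameter beta3.
lr_stat = -2.0 * (loglik_additive - loglik_interaction)

# For 1 degree of freedom, chi-square(.95) = [z(.975)]**2.
chi2_crit = NormalDist().inv_cdf(0.975) ** 2

interaction_supported = lr_stat > chi2_crit
print(round(lr_stat, 2), round(chi2_crit, 2), interaction_supported)
```

Since the statistic is far below the critical value, these data give no evidence that the interaction term is needed.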
[Figure: Weibull Probability Plot of Two Members from a PH Model with a Baseline Lognormal Distribution. Horizontal axis: Time (log scale); vertical axis: Probability; the vertical separation ∆ between the two cdfs is marked.]
Interpreting PH Models as a
Failure Time Transformation.
Suppose that T(x0) ∼ F(t; x0) and define the time transformation:
\[
T(x) = F^{-1}\!\left( 1 - \left\{ 1 - F[T(x_0); x_0] \right\}^{1/\Psi(x)} ;\ x_0 \right).
\]
It can be shown that:
• T(x) and T(x0) follow the PH relationship
h(t; x) = Ψ(x)h(t; x0).
• T(x) is a monotone transformation of T(x0) such that
T(x) > T(x0) if Ψ(x) < 1, decelerating
T(x) = T(x0) if Ψ(x) = 1, identity transform
T(x) < T(x0) if Ψ(x) > 1, accelerating
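The PH time transformation can be sketched for a lognormal baseline, for which F and its inverse are available in closed form through the standard normal distribution. The function names and the baseline parameters (µ = 0, σ = 1) are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def baseline_cdf(t, mu=0.0, sigma=1.0):
    """Lognormal baseline cdf F(t; x0)."""
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def baseline_quantile(p, mu=0.0, sigma=1.0):
    """Inverse of the lognormal baseline cdf."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(p))


def ph_transform(t0, psi, mu=0.0, sigma=1.0):
    """T(x) = F^{-1}(1 - [1 - F(t0; x0)]**(1/psi); x0)."""
    surv0 = 1.0 - baseline_cdf(t0, mu, sigma)
    return baseline_quantile(1.0 - surv0 ** (1.0 / psi), mu, sigma)


t0 = 2.0  # a time at the baseline condition x0
for psi in (0.5, 1.0, 3.0):
    print(psi, ph_transform(t0, psi))
```

With Ψ(x) < 1 the transformed time exceeds t0 (deceleration), with Ψ(x) = 1 it equals t0, and with Ψ(x) > 1 it is smaller (acceleration), matching the three cases listed above.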
[Figure: Weibull Probability Plot of Two Members from a PH/SAFT Model with a Baseline Weibull Distribution. Horizontal axis: Time (log scale); vertical axis: Probability; separations ∆1 and ∆2 between the two cdfs are marked.]
Statistical Methods for the
Semiparametric (Cox) PH Model
Data: n units, t(1), . . . , t(r) ordered failure times with n − r
censored observations (usually multiply censored).
• RS i = RS(t(i) − ε) is the risk set just before the failure at
time t(i).
• Each unit has a vector xi of explanatory variables (often
called covariates).
Some Comments on the
Proportional Hazard Failure Time Model
Here we consider the proportional hazards, PH, model
h(t; x) = Ψ(x)h(t; x0), for all t > 0.
• In general, the PH model does not preserve the baseline
distribution. For example, if T(x0) has a lognormal distribution then T(x) has a power lognormal distribution.
• The semiparametric model without a particular specification
of h(t; x0) is known as Cox’s proportional hazards model.
Contrasting SAFT and PH Models
The scale failure time model tp(x) = tp(x0)/Ψ(x) and the
PH model h(t; x) = Ψ(x)h(t; x0) are equivalent if and only if the
baseline distribution is Weibull.
In other words, the Weibull distribution is the only baseline
distribution for which a SAFT model is also a PH model,
and vice versa.
Heuristic argument: Consider F(t; x1) and F(t; x2). The
Weibull probability plots of these two cdfs are translations
of each other in both the probability and the log(t) scale if
and only if the plots are straight parallel lines.
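The Weibull case of this equivalence can be checked numerically. For a Weibull baseline with scale η and shape β, scaling time by a factor AF gives S(AF·t; x0) = [S(t; x0)]^(AF^β), so the SAFT model with factor AF is a PH model with factor AF^β. The sketch below verifies this; the function names and the parameter values are hypothetical.

```python
import math


def weibull_survival(t, eta, beta):
    """Baseline Weibull survival function S(t; x0) = exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def saft_survival(t, af, eta, beta):
    """SAFT model: S(t; x) = S(AF * t; x0)."""
    return weibull_survival(af * t, eta, beta)


def ph_survival(t, psi, eta, beta):
    """PH model: S(t; x) = [S(t; x0)]**psi."""
    return weibull_survival(t, eta, beta) ** psi


# Hypothetical baseline parameters and acceleration factor.
eta, beta, af = 100.0, 2.0, 1.5
psi = af ** beta  # PH factor matching the SAFT factor

max_gap = max(
    abs(saft_survival(t, af, eta, beta) - ph_survival(t, psi, eta, beta))
    for t in (10.0, 50.0, 100.0, 200.0)
)
print(max_gap)
```

Note that the two factors coincide numerically only when β = 1; in general the PH factor is the SAFT factor raised to the Weibull shape parameter.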
Cox PH Model Likelihood (Probability of the Data)
• The probability that individual i dies in [t, t + ∆t] is
h(t; xi)∆t = h(t; x0)∆tΨ(xi) = h(t; x0) exp(xi′β)∆t
where Ψ(xi) = exp(xi′β) = exp(β1x1 + · · · + βkxk) and x0 = 0.
• If a death occurs at time t, the probability that it was individual i is
\[
L_i = \frac{h(t; x_i)\,\Delta t}{\sum_{\ell \in RS_i} h(t; x_\ell)\,\Delta t}
= \frac{\exp(x_i'\beta)}{\sum_{\ell \in RS_i} \exp(x_\ell'\beta)}
\]
• Conditional on the observed failure times, the joint probability of the data is
\[
L(\beta) = \prod_{i=1}^{r} \frac{\exp(x_i'\beta)}{\sum_{\ell \in RS_i} \exp(x_\ell'\beta)}
\]
• Need mild conditions on xi and the censoring mechanism.
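The product above translates directly into code. The following is a minimal sketch for a single scalar covariate and no tied failure times; the function name `cox_partial_loglik` and the four-unit data set are hypothetical. The risk set at a failure time is taken to be all units whose observed time (failure or censoring) is at least that failure time.

```python
import math


def cox_partial_loglik(beta, times, deltas, xs):
    """Log of L(beta) for a scalar covariate, assuming no tied failure times.

    times  : observed times (failure or censoring)
    deltas : 1 for a failure, 0 for a right-censored observation
    xs     : covariate value for each unit
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, deltas)):
        if d_i != 1:
            continue  # censored units enter only through risk sets
        # Sum of exp(x_l * beta) over units still at risk just before t_i.
        risk_sum = sum(
            math.exp(beta * xs[l]) for l, t_l in enumerate(times) if t_l >= t_i
        )
        total += beta * xs[i] - math.log(risk_sum)
    return total


# Hypothetical toy data: four units, the second one right censored.
times = [5.0, 8.0, 12.0, 20.0]
deltas = [1, 0, 1, 1]
xs = [1.0, 0.0, 1.0, 0.0]

print(cox_partial_loglik(0.0, times, deltas, xs))
print(cox_partial_loglik(math.log(2.0), times, deltas, xs))
```

Maximizing this function over β (for example with a one-dimensional search) would give the estimate β̂; note that the baseline hazard h(t; x0) never appears.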
Comments on the Cox PH Model Likelihood
• Cox (1972) called L(β ) a conditional likelihood.
• L(β ) is not a true likelihood, but there is justification for
treating it as such.
• With no censoring, Kalbfleisch and Prentice (1973) show
that L(β ) can be justified as a marginal likelihood based on
ranks of the observed failures. Argument can be extended
to Type II censoring.
• Cox (1975) provided a partial likelihood justification.
• Asymptotic theory shows that the estimates obtained from
maximizing L(β ) have high efficiency.
[Figure: Picciotto Yarn Failure Data, Cox PH Model Survival Estimates (Linear Axes). Horizontal axis: Thousands of Cycles; curves for 30 mm, 60 mm, and 90 mm lengths.]
Estimating h(t; x0) [or S(t; x0) or F(t; x0)]
• Basic idea is to substitute β̂ for β and maximize the probability of the data (after adjustment for the explanatory
variables), as in the product limit estimator.
• Let qi = S(ti + ε; x0), i = 1, . . . , r for the r failure times and
let qj = S(tj + ε; x0), j = r + 1, . . . , n for the n − r censored
observations. If there are no ties among the failure times,
\[
L(q) = \prod_{i=1}^{r} \left[ q_{i-1}^{\hat{\Psi}(x_i)} - q_i^{\hat{\Psi}(x_i)} \right]
\prod_{j=r+1}^{n} q_{j-1}^{\hat{\Psi}(x_j)}
\]
where Ψ̂(xi) = exp(xi′β̂) and q0 ≡ 1.
• Maximize L(q) with respect to qi, i = 1, . . . , r to get the
step-function estimate of S(t; x0).
• Estimate of S(t) at x is Ŝ(t; x) = [Ŝ(t; x0)]^Ψ̂(x).
• Setup more complicated when there are ties.
• There are some technical difficulties when there are tied
observations. Approximations for L(β) are generally used.

[Figure: Picciotto Yarn Failure Data, Cox PH Model Survival Estimates (on Weibull Probability Scales). Horizontal axis: Thousands of Cycles; vertical axis: Proportion Failing; curves for 30 mm, 60 mm, and 90 mm lengths.]

[Figure: General Failure Time Transformation Graph. Axes: Time at Level x0 and Time at Level x; curves labeled "accelerating transformation", "decelerating transformation", and "accelerating and decelerating transformation".]
[Figure: Picciotto Yarn Failure Data, Cox PH Model Schoenfeld Residuals versus Time. Horizontal axis: Thousands of Cycles; vertical axis: Schoenfeld Residuals.]
General Failure Time Transformation Model
A general time transformation model is
T (x) = Υ [T (x0), x]
where x0 are some baseline conditions.
We assume that Υ (t, x) satisfies the following conditions:
• Υ (t, x) is nonnegative, i.e., Υ (t, x) ≥ 0 for all t and x.
• For fixed x, Υ (t, x) is monotone increasing in t.
• For all x, Υ (0, x) = 0.
• For x0 the transformation is the identity transformation,
i.e., Υ (t, x0) = t for all t.
Other Topics in Failure-Time Regression
• Models with non-location-scale distributions.
• Nonlinear relationship regression model.
• Fatigue-limit regression model.
• Special topics in regression: random effects models, nonparametric SAFT regression models, parametric PH regression models.
General Failure Time Transformation Model
This general assumed transformation model implies:
• The distribution of T (x) can be determined from the distribution of T (x0) and x. In particular, tp(x) = Υ [tp(x0), x]
for 0 ≤ p ≤ 1.
• In a plot of T (x0) versus T (x)
◮ T (x) entirely below the diagonal line implies acceleration.
◮ T (x) entirely above the diagonal line implies deceleration.
◮ Other T (x)s imply acceleration sometimes and deceleration other times. The cdfs of T (x) and T (x0) cross.
◮ Scale time transformations and proportional hazard model
are special cases.