Incorporating unobserved heterogeneity in Weibull survival models: A Bayesian approach

Catalina A. Vallejos (1) and Mark F.J. Steel (2)
(1) MRC Biostatistics Unit, EMBL-European Bioinformatics Institute.
(2) Dept. of Statistics, University of Warwick.
Workshop on Flexible Models for Longitudinal and Survival Data with
Applications in Biostatistics. 27-29 July, 2015
Motivation
What happens if we ignore unobserved heterogeneity?
[Figure: panels "Individual hazard rates" and "Population hazard rates"; axes h(t) against t (0 to 20); curves labelled Group 1 and Group 2.]
MRC Biostatistics Unit and EMBL European Bioinformatics Institute
Mixture families of life distributions
Definition
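$T_i$ is distributed as a mixture of life distributions iff its density function is given by
\[
f(t_i \mid \psi, \theta) \equiv \int_{\mathcal{L}} f^*(t_i \mid \psi, \Lambda_i = \lambda_i) \, dP_{\Lambda_i}(\lambda_i \mid \theta),
\]
where $f^*(\cdot \mid \psi, \Lambda_i = \lambda_i)$ is a lifetime density and $P_{\Lambda_i}(\cdot \mid \theta)$ is a cdf on $\mathcal{L}$, possibly depending on a parameter $\theta \in \Theta$.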
• Distinction between individual and population-level survival
• The intuition behind the underlying model is preserved
• The influence of outlying observations is attenuated
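The distinction between individual and population-level hazards can be illustrated with the simplest possible mixture. The sketch below is only an illustration and is not taken from the slides: it assumes exponential lifetimes (constant individual hazards) and a two-point mixing distribution, with hypothetical rates 0.5 and 1.0 and equal weights.

```python
import numpy as np

def population_hazard(t, rates, weights):
    """Hazard of a finite mixture of exponential lifetimes (density / survival)."""
    t = np.asarray(t, dtype=float)[:, None]
    rates = np.asarray(rates, dtype=float)[None, :]
    weights = np.asarray(weights, dtype=float)[None, :]
    surv = (weights * np.exp(-rates * t)).sum(axis=1)
    dens = (weights * rates * np.exp(-rates * t)).sum(axis=1)
    return dens / surv

# Hypothetical values: two equally likely individual hazard rates.
t_grid = np.linspace(0.0, 20.0, 5)
h_pop = population_hazard(t_grid, rates=[0.5, 1.0], weights=[0.5, 0.5])
print(h_pop)
```

Every individual has a constant hazard, yet the population hazard starts at the average rate and declines towards the smaller rate, because high-rate individuals tend to fail first.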
Rate Mixtures of Weibull distributions
Definition
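$T_i$ is distributed as a Rate Mixture of Weibull (RMW) distributions iff
\[
T_i \mid \alpha, \gamma, \Lambda_i = \lambda_i \sim \text{Weibull}(\alpha\lambda_i, \gamma), \qquad \Lambda_i \mid \theta \sim P_{\Lambda_i}(\cdot \mid \theta),
\]
i.e.
\[
f(t_i \mid \alpha, \gamma, \theta) = \int_{\mathcal{L}} \gamma \alpha \lambda_i \, t_i^{\gamma-1} e^{-\alpha \lambda_i t_i^{\gamma}} \, dP_{\Lambda_i}(\lambda_i \mid \theta), \qquad t_i > 0,
\]
where $\alpha, \gamma > 0$ and $P_{\Lambda_i}(\cdot \mid \theta)$ is a cdf on $\mathcal{L}$, possibly depending on a parameter $\theta \in \Theta$.
Denote $T_i \sim \text{RMW}_P(\alpha, \gamma, \theta)$.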
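The hierarchical form of the definition suggests a direct way to simulate from an RMW distribution. The following is a minimal sketch, assuming Gamma(θ, θ) mixing (shape θ, rate θ) so that a closed-form survival function is available for checking; the function names and parameter values are hypothetical.

```python
import numpy as np

def sample_rmw_gamma(n, alpha, gamma, theta, rng):
    """Draw T_i by first drawing Lambda_i, then a Weibull(alpha * Lambda_i, gamma) lifetime."""
    lam = rng.gamma(shape=theta, scale=1.0 / theta, size=n)  # E(Lambda_i) = 1
    e = rng.exponential(size=n)                               # unit exponential
    # If E ~ Exp(1), then (E / rate)^(1/gamma) has survival exp(-rate * t^gamma).
    return (e / (alpha * lam)) ** (1.0 / gamma)

def survival_rmw_gamma(t, alpha, gamma, theta):
    """Closed-form survival under Gamma(theta, theta) mixing."""
    return (1.0 + alpha * t ** gamma / theta) ** (-theta)

rng = np.random.default_rng(0)
draws = sample_rmw_gamma(200_000, alpha=1.0, gamma=2.0, theta=5.0, rng=rng)
empirical = np.mean(draws > 1.0)
exact = survival_rmw_gamma(1.0, alpha=1.0, gamma=2.0, theta=5.0)
print(empirical, exact)
```

The empirical and closed-form survival probabilities should agree up to Monte Carlo error.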
Rate Mixtures of Weibull distributions
• Relates to existing literature in frailty models
• Typically γ = 1 and Λi ∼ gamma (Lomax distribution)
⇒ e.g. Jewell (1982), Abbring and Van Den Berg (2007)
• Non-parametric mixtures ⇒ e.g. Kottas (2006)
• Case γ = 1: Rate Mixtures of Exponentials, $T_i \sim \text{RME}_P(\alpha, \theta)$
• If $T_i \sim \text{RME}_P(\alpha, \theta)$ then $T_i^{1/\gamma} \sim \text{RMW}_P(\alpha, \gamma, \theta)$.
• For γ ≤ 1: decreasing hazard rate (Marshall and Olkin, 2007)
• Identifiability precludes unknown scale parameters in P
⇒ Fix scale parameters in P or set E(Λi |θ) = 1
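As a worked case of the points above, take Gamma(θ, θ) mixing (shape θ, rate θ, so that E(Λi|θ) = 1). This parameterisation is an assumption made here for illustration. Integrating the conditional Weibull survival function against the gamma density gives:

```latex
\[
S(t \mid \alpha, \gamma, \theta)
  = \int_0^\infty e^{-\alpha \lambda t^{\gamma}}
    \frac{\theta^{\theta}}{\Gamma(\theta)} \lambda^{\theta-1} e^{-\theta\lambda} \, d\lambda
  = \left(1 + \frac{\alpha t^{\gamma}}{\theta}\right)^{-\theta},
\qquad
h(t \mid \alpha, \gamma, \theta)
  = \frac{\alpha \gamma t^{\gamma-1}}{1 + \alpha t^{\gamma}/\theta}.
\]
```

With γ = 1 this is the Lomax survival function, with hazard α/(1 + αt/θ), which decreases in t. For γ ≤ 1 the numerator of h is non-increasing and the denominator is increasing, so the hazard is again decreasing.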
Rate Mixtures of Weibull distributions
Example: RMW model with Gamma(θ, θ) mixing and α = 1
[Figure: density function f(t) and hazard function h(t) against t (0 to 3), for γ = 0.7 and γ = 2, with curves for θ = 1, θ = 5 and θ = ∞.]
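Curves of this kind can be computed in closed form. The sketch below assumes Gamma(θ, θ) mixing with shape θ and rate θ, and α = 1, for which S(t) = (1 + t^γ/θ)^(−θ); θ = ∞ is treated as the plain Weibull limit. The function names are hypothetical.

```python
import numpy as np

def rmw_gamma_hazard(t, gamma, theta):
    """Hazard of the RMW model with Gamma(theta, theta) mixing and alpha = 1."""
    t = np.asarray(t, dtype=float)
    weibull_hazard = gamma * t ** (gamma - 1.0)
    if np.isinf(theta):
        return weibull_hazard              # no mixing: Weibull hazard
    return weibull_hazard / (1.0 + t ** gamma / theta)

def rmw_gamma_density(t, gamma, theta):
    """Density = hazard * survival."""
    t = np.asarray(t, dtype=float)
    if np.isinf(theta):
        surv = np.exp(-t ** gamma)
    else:
        surv = (1.0 + t ** gamma / theta) ** (-theta)
    return rmw_gamma_hazard(t, gamma, theta) * surv

t_grid = np.array([0.5, 1.0, 2.0])
h_mix = rmw_gamma_hazard(t_grid, gamma=2.0, theta=1.0)      # rises, then falls
h_weib = rmw_gamma_hazard(t_grid, gamma=2.0, theta=np.inf)  # increasing
f_mix = rmw_gamma_density(t_grid, gamma=2.0, theta=1.0)
```

For γ = 2 the Weibull hazard is increasing, while the mixed hazard 2t/(1 + t²) peaks at t = 1 and then declines.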
Rate Mixtures of Weibull distributions
Coefficient of variation
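Theorem
If all the required moments exist, the coefficient of variation (cv) of distributions in the RMW family is
\[
cv(\gamma, \theta) = \sqrt{
  \frac{\Gamma(1 + 2/\gamma)}{\Gamma^2(1 + 1/\gamma)}
  \underbrace{\frac{\mathrm{var}_{\Lambda_i}(\Lambda_i^{-1/\gamma} \mid \theta)}{E^2_{\Lambda_i}(\Lambda_i^{-1/\gamma} \mid \theta)}}_{(cv^*(\gamma, \theta))^2}
  + \underbrace{\frac{\Gamma(1 + 2/\gamma) - \Gamma^2(1 + 1/\gamma)}{\Gamma^2(1 + 1/\gamma)}}_{(cv^W(\gamma))^2}
}.
\]
It simplifies to
\[
\sqrt{ 2 \, \frac{\mathrm{var}_{\Lambda_i}(\Lambda_i^{-1} \mid \theta)}{E^2_{\Lambda_i}(\Lambda_i^{-1} \mid \theta)} + 1 }
\]
when γ = 1.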
If θ is unknown, we restrict the range of (γ, θ) such that cv is finite
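To make the finiteness restriction concrete, the sketch below evaluates the cv for one particular case. It assumes Gamma(θ, θ) mixing (shape θ, rate θ), for which E(Λ^(−s)) = θ^s Γ(θ − s)/Γ(θ) whenever θ > s, so the cv is finite only if θ > 2/γ. The function name is hypothetical.

```python
import math

def cv_rmw_gamma(gamma, theta):
    """cv of an RMW distribution with Gamma(theta, theta) mixing.

    Uses cv^2 = E(T^2) / E(T)^2 - 1; returns inf when the second moment
    does not exist (theta <= 2 / gamma).
    """
    if theta <= 2.0 / gamma:
        return math.inf
    # log of Gamma(1 + 2/gamma) / Gamma(1 + 1/gamma)^2 (Weibull part)
    log_weibull = math.lgamma(1.0 + 2.0 / gamma) - 2.0 * math.lgamma(1.0 + 1.0 / gamma)
    # log of E(Lambda^(-2/gamma)) / E(Lambda^(-1/gamma))^2 (mixing part)
    log_mixing = (math.lgamma(theta - 2.0 / gamma) + math.lgamma(theta)
                  - 2.0 * math.lgamma(theta - 1.0 / gamma))
    return math.sqrt(math.exp(log_weibull + log_mixing) - 1.0)

print(cv_rmw_gamma(1.0, 4.0))   # gamma = 1: sqrt(theta / (theta - 2))
print(cv_rmw_gamma(0.5, 4.0))   # theta <= 2 / gamma, so the cv is infinite
```

For γ = 1 the function reproduces cv = sqrt(θ/(θ − 2)), and it approaches the exponential value 1 as θ grows.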
A regression model based on RMW distributions
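Proportional Hazards (PH) models are popular in this context:
\[
h_{T_i}(t_i \mid x_i, \beta^*, \Lambda_i = \lambda_i) = \lambda_i \gamma t_i^{\gamma-1} e^{x_i' \beta^*}, \qquad \Lambda_i \sim P_{\Lambda_i}(\theta).
\]
But the PH property is not preserved after mixing!
Instead, we use an Accelerated Failure Time (AFT) specification
\[
T_i \sim \text{RMW}_P(\alpha_i, \gamma, \theta), \qquad \alpha_i = e^{-\gamma x_i' \beta},
\]
which is equivalent to
\[
\log(T_i) = x_i' \beta + \log(\Lambda_i^{-1/\gamma} T_0), \qquad \Lambda_i \sim P_{\Lambda_i}(\theta), \quad T_0 \sim \text{Weibull}(1, \gamma).
\]
Conditionally on Λi = λi, the two regressions are equivalent, setting β = −β*/γ.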
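The log-linear form of the AFT specification can be checked by simulation. The sketch below is illustrative only: it assumes Gamma(θ, θ) mixing, a single standard normal covariate and hypothetical coefficient values, and it uses ordinary least squares on log(Ti) as a quick check rather than the Bayesian inference described in these slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n, gamma, theta = 50_000, 1.5, 5.0
beta = np.array([0.5, -1.0])                      # hypothetical intercept and slope

X = np.column_stack([np.ones(n), rng.normal(size=n)])
lam = rng.gamma(shape=theta, scale=1.0 / theta, size=n)  # Lambda_i ~ Gamma(theta, theta)
t0 = rng.weibull(gamma, size=n)                   # T_0 ~ Weibull(1, gamma)

# log(T_i) = x_i' beta + log(Lambda_i^(-1/gamma) * T_0)
log_t = X @ beta + np.log(lam ** (-1.0 / gamma) * t0)

coef, *_ = np.linalg.lstsq(X, log_t, rcond=None)
slope_hat = coef[1]
print(slope_hat)
```

Because the error term log(Λi^(−1/γ) T0) does not depend on xi, the slope is recovered. The fitted intercept absorbs the non-zero mean of the error term, so it is not an estimate of the intercept in β.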
Bayesian inference for the RMW-AFT model
A weakly informative prior
First consider the RME case (γ = 1)
Jeffreys and independence Jeffreys priors have structure
π(β, θ) ∝ π(θ),
but they are complicated to derive and π(θ) might not be proper.
Approach:
• Keep Jeffreys structure but use a proper π(θ)
• Match priors through common proper prior for cv , say π ∗ (cv )
• Exploit the functional relationship between cv and θ
Bayesian inference for the RMW-AFT model
A weakly informative prior
Table: Relationship between cv and θ for some RME models.

Mixing density           Range of cv   cv(θ)                                |d cv(θ)/dθ|
Gamma(θ, θ)              (1, ∞)        \sqrt{θ/(θ−2)}                       θ^{−1/2} (θ−2)^{−3/2}
Inverse-Gamma(θ, 1)      (1, \sqrt{3}) \sqrt{(θ+2)/θ}                       θ^{−3/2} (θ+2)^{−1/2}
Inverse-Gaussian(θ, 1)   (1, \sqrt{5}) \sqrt{(5θ^2+4θ+1)/(θ^2+2θ+1)}        (3θ+1) / [(5θ^2+4θ+1)^{1/2} (θ+1)^2]
Log-Normal(0, θ)         (1, ∞)        \sqrt{2e^θ − 1}                      e^θ (2e^θ − 1)^{−1/2}
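The table entries are what is needed to turn a prior on cv into a prior on θ by a change of variables, π(θ) = π*(cv(θ)) |d cv(θ)/dθ|. The sketch below does this for the Gamma(θ, θ) row. The choice of π*(cv), an exponential density shifted to (1, ∞) with mean E(cv), is a hypothetical stand-in, since the slides do not state which proper prior was used.

```python
import numpy as np

def cv_gamma_mixing(theta):
    """cv(theta) for Gamma(theta, theta) mixing in the RME case (theta > 2)."""
    return np.sqrt(theta / (theta - 2.0))

def abs_dcv_dtheta(theta):
    """|d cv / d theta| from the same table row."""
    return theta ** -0.5 * (theta - 2.0) ** -1.5

def prior_cv(cv, mean_cv=1.5):
    """Hypothetical pi*(cv): exponential shifted to (1, inf) with E(cv) = mean_cv."""
    rate = 1.0 / (mean_cv - 1.0)
    return rate * np.exp(-rate * (cv - 1.0))

def prior_theta(theta, mean_cv=1.5):
    """Induced prior on theta via the change of variables."""
    return prior_cv(cv_gamma_mixing(theta), mean_cv) * abs_dcv_dtheta(theta)

# Check: prior mass of 3 < theta < 10 equals the pi* mass of cv(10) < cv < cv(3).
theta_grid = np.linspace(3.0, 10.0, 20001)
dens = prior_theta(theta_grid)
mass = np.sum((dens[1:] + dens[:-1]) * np.diff(theta_grid)) / 2.0   # trapezoid rule
rate = 1.0 / (1.5 - 1.0)
expected = (np.exp(-rate * (cv_gamma_mixing(10.0) - 1.0))
            - np.exp(-rate * (cv_gamma_mixing(3.0) - 1.0)))
print(mass, expected)
```

The same construction applies to the other rows by swapping in the corresponding cv(θ) and derivative, which is how a common π*(cv) matches priors across mixing distributions.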
Bayesian inference for the RMW-AFT model
A weakly informative prior
For the general RMW case
We choose
π(β, γ, θ) ∝ π(γ, θ) ≡ π(θ|γ)π(γ),
where π(θ|γ) and π(γ) are proper.
Approach:
• Define π(θ|γ) as before through π ∗ (cv ), given γ
• Choose a proper π(γ)
These priors are improper (they are flat in β), but the posterior distribution is
well defined under mild conditions
Bayesian inference for the RMW-AFT model
Outlier detection
No heterogeneity ⇒ λ1 = λ2 = · · · = λn = λ
Extreme value of λi ⇒ Potential outlier
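Formally, we contrast the models
\[
M_0: \Lambda_i = \lambda_{ref} \qquad \text{versus} \qquad M_1: \Lambda_i \neq \lambda_{ref} \ \ (\text{with all other } \Lambda_j, \ j \neq i, \text{ free}),
\]
\[
BF_{01}^{(i)} = \pi(\lambda_i \mid t, c) \, E\!\left( \frac{1}{dP(\lambda_i \mid \theta)} \right) \Bigg|_{\lambda_i = \lambda_{ref}}.
\]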
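The Bayes factor above is a ratio of posterior to prior density for λi at λref. The sketch below shows that ratio in a deliberately simplified setting and is not the authors' procedure: it assumes Gamma(θ, θ) mixing, treats αi, γ and θ as known (so no expectation over θ is needed), and assumes ci = 1 marks an observed event. Under these assumptions the conditional posterior of Λi is Gamma(θ + ci, θ + αi ti^γ).

```python
import math

def gamma_logpdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)

def bf01_fixed_theta(t, observed, alpha, gamma, theta, lam_ref=1.0):
    """Posterior / prior density of Lambda_i at lam_ref, with all parameters fixed.

    lam_ref defaults to 1 = E(Lambda_i | theta) under Gamma(theta, theta) mixing.
    """
    a = alpha * t ** gamma
    c = 1.0 if observed else 0.0
    log_post = gamma_logpdf(lam_ref, theta + c, theta + a)
    log_prior = gamma_logpdf(lam_ref, theta, theta)
    return math.exp(log_post - log_prior)

bf_typical = bf01_fixed_theta(t=1.0, observed=True, alpha=1.0, gamma=1.0, theta=2.0)
bf_extreme = bf01_fixed_theta(t=20.0, observed=True, alpha=1.0, gamma=1.0, theta=2.0)
print(bf_typical, bf_extreme)
```

A typical survival time leaves the ratio near or above 1, while a very long survival time drives it towards 0, which is evidence against Λi = λref and flags a potential outlier.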
Choice of λref ?
Bayesian inference for the RMW-AFT model
Outlier detection
• In Vallejos and Steel (2014) we recommended λref = E(Λi |θ)
• This is not appropriate for RMW models, where censoring is
very informative for the λi's.
For censored observations we use a correction factor:
\[
\lambda^{c}_{ref} = R_i(\beta, \gamma, \theta) \, \lambda^{o}_{ref}, \qquad \text{with} \quad
R_i(\beta, \gamma, \theta) = \frac{E(\Lambda_i \mid t_i, c_i = 0, \beta, \gamma, \theta)}{E(\Lambda_i \mid t_i, c_i = 1, \beta, \gamma, \theta)}.
\]
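For one mixing distribution the correction factor has a simple closed form. The sketch below assumes Gamma(θ, θ) mixing and that ci = 1 marks an observed event and ci = 0 a censored one; both are assumptions made here. The conditional distribution of Λi is then Gamma(θ + ci, θ + αi ti^γ), with mean (θ + ci)/(θ + αi ti^γ).

```python
def cond_mean_lambda(t, observed, alpha, gamma, theta):
    """E(Lambda_i | t_i, c_i, ...) under Gamma(theta, theta) mixing."""
    c = 1.0 if observed else 0.0
    return (theta + c) / (theta + alpha * t ** gamma)

def correction_factor(t, alpha, gamma, theta):
    """R_i = E(Lambda_i | censored) / E(Lambda_i | observed)."""
    return (cond_mean_lambda(t, False, alpha, gamma, theta)
            / cond_mean_lambda(t, True, alpha, gamma, theta))

r = correction_factor(t=3.0, alpha=0.7, gamma=1.5, theta=2.0)
print(r)   # equals theta / (theta + 1) whatever t, alpha and gamma are
```

In this case Ri = θ/(θ + 1) < 1, so the reference value for censored observations is shifted downwards.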
Applications
To illustrate, we analyse 2 real datasets:
Dataset                                       n       Censoring   # covariates
Veteran's administration lung cancer (VA)     137     7%          5
Cerebral palsy (CP)                           1,549   84%         2
We use RMW-AFT models as well as a Weibull model.
Applications
Model comparison
We compare models defined by different mixing distributions using
• Bayes Factors
• Conditional Predictive Ordinate (CPO): for observation i,
CPOi = f (ti |t−i ), t−i = (t1 , ..., ti−1 , ti+1 , ..., tn ),
where f (·|t−i ) is the predictive density given t−i .
• PsML = \prod_{i=1}^{n} CPO_i (Geisser and Eddy, 1979)
⇒ Ratios of PsMLs define pseudo Bayes factors (PsBF)
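CPO values are commonly estimated from MCMC output as the harmonic mean of each observation's likelihood contribution across posterior draws. That estimator is a standard choice assumed here; the slides do not say how the CPOs were computed. The sketch below takes a hypothetical (draws × observations) matrix of log-likelihood contributions and works on the log scale for numerical stability.

```python
import numpy as np

def log_cpo(loglik):
    """Log CPO_i from an (M draws, n observations) array of log-likelihood terms.

    Uses CPO_i ~= [ (1/M) * sum_m 1 / f(t_i | params_m) ]^(-1).
    """
    n_draws = loglik.shape[0]
    neg = -loglik
    mx = neg.max(axis=0)                       # log-sum-exp stabilisation
    log_mean_inv = mx + np.log(np.exp(neg - mx).sum(axis=0)) - np.log(n_draws)
    return -log_mean_inv

def log_psml(loglik):
    """Log pseudo-marginal likelihood: sum of the log CPOs."""
    return log_cpo(loglik).sum()

# Toy input: 2 draws, 2 observations (values are made up for illustration).
loglik = np.log(np.array([[0.5, 0.2],
                          [0.25, 0.2]]))
cpo = np.exp(log_cpo(loglik))
print(cpo, log_psml(loglik))
```

The log pseudo Bayes factor between two models is then the difference of their `log_psml` values. For a censored observation the likelihood contribution would be the survival function rather than the density.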
Application: VA data
Model comparison in terms of BF and PsBF
[Figure: log-pseudo Bayes factors plotted against log-Bayes factors (both axes 0 to 6) for the Weibull, RMWEXP, RMWGAM, RMWIGAM, RMWIGAUSS and RMWLN models. Panels: E(cv) = 1.5 and E(cv) = 5.0, each with γ ~ Gamma(0.001, 0.001), γ ~ Gamma(1, 1) and γ ~ Gamma(4, 1).]
Application: CP data
Model comparison in terms of BF and PsBF (Geisser and Eddy, 1979)
[Figure: log-pseudo Bayes factors plotted against log-Bayes factors (both axes 0 to 8) for the Weibull, RMWEXP, RMWGAM, RMWIGAM, RMWIGAUSS and RMWLN models. Panels: E(cv) = 1.5 and E(cv) = 5.0, each with γ ~ Gamma(0.001, 0.001), γ ~ Gamma(1, 1) and γ ~ Gamma(4, 1).]
Applications: VA dataset
Posterior medians and HPD 95% interval for some regression coefficients
Applications: CP dataset
Posterior medians and HPD 95% interval for some regression coefficients
Applications
Posterior medians and HPD 95% interval for γ
VA dataset
CP dataset
Application: VA data
Outlier detection (Gamma(θ,θ) mixing)
[Figure: 2log(BF) (0 to 20) by patient (0 to 140), in two panels: E(cv) = 1.5 and E(cv) = 5.0. Labelled patients in the E(cv) = 1.5 panel: 17, 36, 44, 75, 78, 118. Labelled patients in the E(cv) = 5.0 panel: 9, 13, 17, 21, 27, 36, 44, 58, 70, 75, 78, 118, 125.]
Application: CP data
Outlier detection (Exponential(1) mixing)
[Figure: Bayes factors (0 to 2) by patient (0 to 1500).]
Conclusions
1. We explored mixtures of life distributions (e.g. the RMW family) to deal with unobserved heterogeneity and outliers
2. Covariates enter through an AFT specification, which retains the AFT structure and the interpretation of β
3. The prior is based on the structure of the Jeffreys prior but allows meaningful BFs
4. We proposed an outlier detection method based on the mixing parameters
5. The data support mixing, which is critical for the estimation of β and γ
Acknowledgements
This research project was funded by
• University of Warwick
• Pontificia Universidad Católica de Chile
Many thanks to P.O.D. Pharoah and Prof. Jane Hutton for access
to the cerebral palsy dataset.
Full references list and more details in
C.A. Vallejos and M.F.J. Steel (2014), Incorporating unobserved
heterogeneity in Weibull survival models: A Bayesian approach.
CRiSM-WP 14-20.
C.A. Vallejos and M.F.J. Steel (2015), Objective Bayesian survival
analysis using shape mixtures of log-normal distributions. JASA.