Regression modelling of misclassified correlated interval-censored data

Arnošt Komárek
Dept. of Probability and Mathematical Statistics

Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics
Warwick, July 27 – 29, 2015

Joint work with María José García-Zattera and Alejandro Jara
Pontificia Universidad Católica de Chile, Santiago de Chile

Outline
1. Misclassified interval-censored data.
2. Model for misclassified interval-censored data.
   a. Misclassification model.
   b. Event-time model.
3. Estimation and inference.
4. Simulation study.
5. Models comparison.
6. The Signal Tandmobiel® study.
7. Summary and conclusions.

Part I: Misclassified interval-censored data

Motivating dataset: the Signal Tandmobiel® study
- Longitudinal dental study, Flanders (Belgium), 1996 – 2001.
- 2 315 boys, 2 153 girls followed from 7 until 12 years old (primary school time).
- Annual dental examinations.
- Sixteen trained dental examiners.
- In general, each child was examined by a different examiner each year.
⇒ Clinical data; data on oral hygiene and dietary habits.

Main aim
Model the relationship between time to caries experience (CE) and potential risk factors:
- Gender (boys vs. girls).
- Presence of sealants.
- Frequency of brushing (daily / not daily).
- Geographical location.

Caries experience (CE)
[Figure: stages of caries experience; early stages are reversible, caries is irreversible. Illustration taken from Emmanuel Lesaffre (ERASMUS & KUL), Statistical Modeling in OH Research, 22 July 2009.]

Statistical modelling challenges
- CE is a progressive disease ⇒ we deal with a monotone 0/1 process.
- CE status is checked only at discrete occasions (visits/dental examinations) ⇒ interval censoring.
- Teeth in one mouth share a common environment, genetic dispositions, ... ⇒ dependence among processes on different teeth in one mouth.

CE process & interval censoring
[Figure: the 0/1 process $Y_{(i,j)}(t)$ jumps from 0 to 1 at $T_{(i,j)}$; the statuses recorded at visits $v_{(i,1)}, \dots, v_{(i,4)}$ are 0, 0, 1, 1.]
\[ T_{(i,j)} \in \bigl(v_{(i,2)},\, v_{(i,3)}\bigr], \qquad Y_{(i,j)} = (0, 0, 1, 1)^\top. \]

Summary of notation
- $T_{(i,j)}$: event (CE) time of tooth $j$ on subject $i$, $i = 1, \dots, N$, $j = 1, \dots, J$.
- $Y_{(i,j)}(t)$: 0/1 CE status of tooth $(i, j)$ at time $t$.
- $x_{(i,j)}$: potential risk factors, covariates to explain $T_{(i,j)}$ (equivalently $Y_{(i,j)}(t)$).
- $0 = v_{(i,0)} < v_{(i,1)} < v_{(i,2)} < \cdots < v_{(i,K_i)} < v_{(i,K_i+1)} = \infty$: visit times (of dental examinations) for subject $i$.
- $Y_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$: recorded 0/1 CE status of tooth $(i, j)$ at the performed visits.

Interval-censored data
Interest in: regression $T_{(i,j)} \sim x_{(i,j)} \;\equiv\; Y_{(i,j)}(t) \sim x_{(i,j)}$.
Observed data: monotone 0/1 sequence $Y_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$ together with the visit times $v_{(i,1)}, \dots, v_{(i,K_i)}$
≡ intervals $(L_{(i,j)}, U_{(i,j)}]$ such that $T_{(i,j)} \in (L_{(i,j)}, U_{(i,j)}]$ and $L_{(i,j)}, U_{(i,j)} \in \{0, v_{(i,1)}, \dots, v_{(i,K_i)}, \infty\}$.
- $L_{(i,j)}$: the last visit time when $Y_{(i,j,\ast)} = 0$,
- $U_{(i,j)}$: the first visit time when $Y_{(i,j,\ast)} = 1$.

Life is not so easy...
- Diagnosis of CE is not easy and is somewhat subjective ⇒ misclassification in the recorded values $Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}$.
⇒ Sensitivity/specificity of the diagnostic test towards caries are not one.
⇒ Misclassified correlated interval-censored data.

CE process & misclassified interval-censored data
[Figure: the 0/1 process $Y_{(i,j)}(t)$ with the statuses recorded at visits $v_{(i,1)}, \dots, v_{(i,4)}$ equal to 0, 0, 0, 1.]
\[ T_{(i,j)} \in \bigl(v_{(i,3)},\, v_{(i,4)}\bigr] \text{ (really?)}, \qquad Y_{(i,j)} = (0, 0, 0, 1)^\top. \]
[Figure: the 0/1 process $Y_{(i,j)}(t)$ with the statuses recorded at visits $v_{(i,1)}, \dots, v_{(i,4)}$ equal to 0, 1, 0, 1.]
\[ T_{(i,j)} \in \text{???}, \qquad Y_{(i,j)} = (0, 1, 0, 1)^\top. \]

Misclassified interval-censored data
Interest in: regression $T_{(i,j)} \sim x_{(i,j)} \;\equiv\; Y_{(i,j)}(t) \sim x_{(i,j)}$.
Observed data: $T_{(i,j)}$ (equivalently $Y_{(i,j)}(t)$) is observed only indirectly through $Y_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$
⇒ a not necessarily monotone sequence of 0/1, possibly misclassified, CE status indicators from visits performed at times $v_{(i,1)}, \dots, v_{(i,K_i)}$.

Study design that leads to misclassified interval-censored data
- Longitudinal follow-up.
- Event status checked at pre-specified time points. Assumption here: visit times are independent of the event time.
- Occurrence of the event is determined by a diagnostic test (with possibly imperfect sensitivity and/or specificity). This is frequent for many non-death events. Nevertheless, data are mostly analyzed as if both sensitivity and specificity were equal to one, and hence as if there were no event status misclassification.
- Follow-up is not scheduled to stop after the first positive result. This is frequent in longitudinal studies where the event is not the primary study outcome.
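The interval construction from the "Interval-censored data" slide, and the way it breaks down for a non-monotone sequence such as (0, 1, 0, 1), can be sketched as follows. This is a minimal illustration only: the function name `censoring_interval` and the numeric visit times are hypothetical and are not taken from the study.

```python
import math

def censoring_interval(visits, y):
    """Return (L, U] for a monotone 0/1 sequence, or None if it is not monotone.

    visits: increasing visit times v_1 < ... < v_K (hypothetical values below)
    y:      recorded 0/1 statuses at those visits
    """
    # A 1 followed by a 0 cannot come from an error-free monotone process.
    if any(a > b for a, b in zip(y, y[1:])):
        return None
    lower, upper = 0.0, math.inf
    for v, status in zip(visits, y):
        if status == 0:
            lower = v      # L: last visit time with recorded status 0
        else:
            upper = v      # U: first visit time with recorded status 1
            break
    return lower, upper

visits = [7.1, 8.0, 9.2, 10.1]                      # made-up ages at the visits
print(censoring_interval(visits, [0, 0, 1, 1]))     # (8.0, 9.2)
print(censoring_interval(visits, [0, 0, 0, 1]))     # (9.2, 10.1)
print(censoring_interval(visits, [0, 1, 0, 1]))     # None
```

Note that the second call returns a perfectly well-defined interval even though, under misclassification, nothing guarantees that the event time actually lies in it.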
Principal questions
Using just the observed data $Y_{(i,j)}$, with no external information on the sensitivity/specificity values:
1. Can we do valid statistical inference on the time to event $T_{(i,j)}$ in the presence of event misclassification even if no external information is available on the magnitude of the misclassification?
2. Can we evaluate the magnitude of misclassification? Can we estimate sensitivity/specificity of the event classification?
3. Do we get valid inference on the time to event $T_{(i,j)}$ if misclassification is ignored and it is assumed that $T_{(i,j)}$ lies in the first "possible" observed interval?

Part II: Modelling approach

Hierarchical model
Hierarchically specified model (likelihood) for the observed data $Y_i = \bigl(Y_{(i,1)}, \dots, Y_{(i,J)}\bigr)$.
Start with a joint likelihood of the unobservable $T_i$ and the observed $Y_i$:
\[ p(Y_i, T_i) = p(Y_i \mid T_i)\, p(T_i). \]
- $p(Y_i \mid T_i)$: (mis)classification model ⇒ visit times $v_{(i,1)}, \dots, v_{(i,K_i)}$ act as covariates here.
- $p(T_i)$: survival model for (correlated) event times ⇒ risk factors $x_{(i,1)}, \dots, x_{(i,J)}$ act as covariates here.
Likelihood of the observed data on subject $i$:
\[ p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, \mathrm{d}T_i. \]

Overall likelihood
Independence among subjects (children):
\[ p(Y_1, \dots, Y_N) = \prod_{i=1}^{N} p(Y_i), \qquad p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, \mathrm{d}T_i, \]
\[ p(Y_i, T_i) = \underbrace{p(Y_i \mid T_i)}_{\text{misclassification model}} \; \underbrace{p(T_i)}_{\text{event-time model}}. \]

Part III: Misclassification model

(Mis)classification model $p(Y_i \mid T_i)$
For each $i$ (each child), some conditional independence is assumed. Event classification $Y_{(i,j,k)}$ for a given unit (tooth $j$) at a given time ($k$) is (conditionally) independent of
(a) event classification $Y_{(i,j^\ast,l)}$ for other units (other teeth, $j^\ast \neq j$) at arbitrary times (arbitrary $l$);
(b) event classification $Y_{(i,j,k^\ast)}$ for the same unit (the same tooth) at other times ($k^\ast \neq k$);
(c) event times $T_{(i,j^\ast)}$ of other units (other teeth, $j^\ast \neq j$).
⇒
\[ p(Y_i \mid T_i) = \prod_{j=1}^{J} \prod_{k=1}^{K_i} p\bigl(Y_{(i,j,k)} \mid T_{(i,j)}\bigr). \]
In the rest: the form of $p(Y_{(i,j,k)} \mid T_{(i,j)})$ for a given $j$ (tooth) and $k$ (visit time).

Simple (mis)classification model $p(Y_i \mid T_i)$: only one examiner
Model parameters:
- $\alpha$: examiner's sensitivity, $\alpha = \mathsf{P}\bigl(Y_{(i,j,k)} = 1 \mid T_{(i,j)} \le v_{(i,k)}\bigr)$.
- $\eta$: examiner's specificity, $\eta = \mathsf{P}\bigl(Y_{(i,j,k)} = 0 \mid T_{(i,j)} > v_{(i,k)}\bigr)$.
Likelihood contribution:
\[ p\bigl(Y_{(i,j,k)} \mid T_{(i,j)}\bigr) = p\bigl(Y_{(i,j,k)} \mid T_{(i,j)};\, \alpha, \eta, v_{(i,k)}\bigr) =
\begin{cases}
\alpha^{Y_{(i,j,k)}} (1 - \alpha)^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} \le v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 1)},\\[4pt]
(1 - \eta)^{Y_{(i,j,k)}} \eta^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} > v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 0)}.
\end{cases} \]

Slightly more complicated (mis)classification model $p(Y_i \mid T_i)$
More ($Q > 1$) examiners involved in a study
- Signal Tandmobiel® study: $Q = 16$.
- Different examiners have different ability to detect the event (caries) ⇒ sensitivity/specificity should be allowed to depend on the examiner.
- It is not necessarily equally easy to detect caries on all teeth ($j = 1, \dots, J$) in the mouth ⇒ sensitivity/specificity should be allowed to depend on the tooth ($j$).

$Q$ examiners, dependence of sensitivity/specificity on tooth ($j$)
One more set of covariates in the model:
- $\xi_{(i,k)} \in \{1, \dots, Q\}$: index (id) of the examiner who scored (all) teeth of the $i$th child during his/her $k$th visit at time $v_{(i,k)}$.
Slightly more unknown parameters of the model ($q = 1, \dots, Q$):
\[ \alpha_q = \bigl(\alpha_{(q,1)}, \dots, \alpha_{(q,J)}\bigr)^\top, \qquad \eta_q = \bigl(\eta_{(q,1)}, \dots, \eta_{(q,J)}\bigr)^\top. \]
$\alpha_{(q,j)}$, $\eta_{(q,j)}$: sensitivity and specificity if the event classification of tooth $j$ is performed by examiner $q$, i.e.,
\[ \alpha_{(q,j)} = \mathsf{P}\bigl(Y_{(i,j,k)} = 1 \mid T_{(i,j)} \le v_{(i,k)};\, \xi_{(i,k)} = q\bigr), \qquad
\eta_{(q,j)} = \mathsf{P}\bigl(Y_{(i,j,k)} = 0 \mid T_{(i,j)} > v_{(i,k)};\, \xi_{(i,k)} = q\bigr). \]
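To make the classification model concrete, the following minimal sketch evaluates $\log p(Y_{(i,j)} \mid T_{(i,j)})$ for one tooth as the sum of the per-visit contributions defined above. The function name, the 0-based examiner ids, and the numeric values are illustrative assumptions; `alpha` and `eta` hold the sensitivities and specificities of the examiners for one fixed tooth $j$.

```python
import numpy as np

def classification_loglik(y, visits, examiner, t_event, alpha, eta):
    """log p(Y_(i,j) | T_(i,j)) for one tooth, summed over the visits.

    y        : recorded 0/1 statuses at the visits
    visits   : visit times v_(i,k)
    examiner : 0-based examiner id xi_(i,k) at each visit (hypothetical coding)
    t_event  : event time T_(i,j) on which we condition
    alpha    : sensitivities of the examiners for this tooth
    eta      : specificities of the examiners for this tooth
    """
    y = np.asarray(y)
    visits = np.asarray(visits, dtype=float)
    examiner = np.asarray(examiner)
    a = np.asarray(alpha)[examiner]
    e = np.asarray(eta)[examiner]
    true_status = t_event <= visits            # correct status is 1 once T <= v
    # P(Y = 1 | T): sensitivity if the event has occurred, 1 - specificity otherwise
    p_one = np.where(true_status, a, 1.0 - e)
    return float(np.sum(y * np.log(p_one) + (1 - y) * np.log1p(-p_one)))

# Made-up example: two examiners alternate over four visits
ll = classification_loglik(
    y=[0, 1, 0, 1], visits=[1.0, 2.0, 3.0, 4.0], examiner=[0, 1, 0, 1],
    t_event=2.5, alpha=[0.90, 0.80], eta=[0.95, 0.85],
)
print(np.exp(ll))   # 0.95 * 0.15 * 0.10 * 0.80
```

The non-monotone sequence (0, 1, 0, 1) has a positive likelihood for every event time, which is exactly what allows the model to use such records instead of discarding them.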
Model parameters:
- $\alpha = \bigl(\alpha_1^\top, \dots, \alpha_Q^\top\bigr)^\top$: sensitivities of all examiners for all teeth.
- $\eta = \bigl(\eta_1^\top, \dots, \eta_Q^\top\bigr)^\top$: specificities of all examiners for all teeth.
Likelihood contribution:
\[ p\bigl(Y_{(i,j,k)} \mid T_{(i,j)}\bigr) = p\bigl(Y_{(i,j,k)} \mid T_{(i,j)};\, \alpha, \eta, v_{(i,k)}, \xi_{(i,k)}\bigr) =
\begin{cases}
\alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}} \bigl(1 - \alpha_{(\xi_{(i,k)},j)}\bigr)^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} \le v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 1)},\\[4pt]
\bigl(1 - \eta_{(\xi_{(i,k)},j)}\bigr)^{Y_{(i,j,k)}} \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} > v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 0)}.
\end{cases} \]

Even more complicated (mis)classification model $p(Y_i \mid T_i)$
Sensitivities/specificities can further be modelled in a structural way as functions of characteristics (covariates) of examiners/teeth:
- age of examiner,
- gender of examiner,
- tooth position in the mouth,
- ...
"Only" one more hierarchical level of the model.

Hierarchical model: reminder
Likelihood contribution of the observed data of the $i$th child:
\[ p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, \mathrm{d}T_i = \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, \mathrm{d}T_i. \]

Part IV: Event-time model

Event-time model $p(T_i)$: one more reminder
- $T_i = \bigl(T_{(i,1)}, \dots, T_{(i,J)}\bigr)$ ≡ possibly correlated times to CE for the $J$ teeth of the $i$th child.
- $x_i = \bigl(x_{(i,1)}, \dots, x_{(i,J)}\bigr)$ ≡ covariates that may explain $T_i$.
The form of $p(T_i)$ can in principle be derived from any regression model for correlated event time data (if we believe that a given model is suitable for the data at hand):
- frailty Cox model,
- random intercept accelerated failure time (AFT) model,
- ...

Event-time model $p(T_i)$: random intercept AFT model
\[ \log T_{(i,j)} = x_{(i,j)}^\top \beta + b_i + \varepsilon_{(i,j)}, \qquad i = 1, \dots, N,\; j = 1, \dots, J. \]
- $\beta$: regression coefficients.
- $\varepsilon_{(1,1)}, \dots, \varepsilon_{(N,J)}$: i.i.d. with zero-mean density $g_\varepsilon(\cdot)$.
- $b_1, \dots, b_N$: i.i.d. with density $g_b(\cdot)$.
- $b_i$ (common for all $j$) induces dependence between $T_{(i,1)}, \dots, T_{(i,J)}$.

Event-time model $p(T_i)$: distributional assumptions
\[ g_\varepsilon(\cdot) \sim \mathcal{N}(0, \sigma_\varepsilon^2), \qquad
g_b(\cdot) \sim \mu + \tau \underbrace{\sum_{l=-M}^{M} w_l\, \mathcal{N}(\kappa_l, \zeta^2)}_{\text{penalized Gaussian mixture (PGM)}}. \]
[Figure: illustration of penalized Gaussian mixture densities; the horizontal axes show the knots $\kappa_{-9}, \kappa_{-6}, \dots, \kappa_{9}$.]
Model parameters: $\sigma_\varepsilon^2$, $w = \bigl(w_{-M}, \dots, w_M\bigr)^\top$, $\mu$, $\tau^2$.
Penalized Gaussian mixture:
- $M \approx 15$, $\zeta \approx 0.2$, $\kappa_{-M}, \dots, \kappa_M$: equidistant knots on an interval of approximately $[-4.5, 4.5]$ ⇒ flexible model for a distribution with approximately zero mean and unit variance.
- Regularization using penalized differences of the (transformed) weights $w_{-M}, \dots, w_M$.

Event-time model $p(T_i)$: distribution of the event time $T_{(i,j)}$
\[ \log T_{(i,j)} = x_{(i,j)}^\top \beta + b_i + \varepsilon_{(i,j)}, \qquad i = 1, \dots, N,\; j = 1, \dots, J. \]
- Up to the log-transformation: convolution of a fully parametric normal and a semi-parametric PGM.
⇒ The distribution of the event time is also specified semi-parametrically.
More details: Komárek, Lesaffre & Hilton (2005, J. of Computat. and Graphical Stat.), Komárek, Lesaffre & Legrand (2007, Statistics in Medicine), Komárek & Lesaffre (2008, J. of the American Statistical Association).

Part V: Estimation and inference

Likelihood
\[ p(Y_1, \dots, Y_N) = \prod_{i=1}^{N} p(Y_i) = \prod_{i=1}^{N} \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, \mathrm{d}T_i = \prod_{i=1}^{N} \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, \mathrm{d}T_i. \]
- $p(Y_i \mid T_i)$: (mis)classification model;
  unknown parameters: $\alpha = \bigl(\alpha_{(1,1)}, \dots, \alpha_{(Q,J)}\bigr)^\top$, $\eta = \bigl(\eta_{(1,1)}, \dots, \eta_{(Q,J)}\bigr)^\top$: sensitivities and specificities for examiners and teeth.
- $p(T_i)$: event-time model, a random intercept AFT model with a PGM distribution of the random intercept;
  unknown parameters: regression coefficients $\beta$, intercept $\mu$, variances $\tau^2$ and $\sigma_\varepsilon^2$, mixture weights $w$.

Likelihood: misclassification part
For each $i = 1, \dots, N$:
\[ p(Y_i) = \int_{\mathbb{R}_+^J} \underbrace{p(Y_i \mid T_i)}_{\prod_{j=1}^{J} \prod_{k=1}^{K_i} p(Y_{(i,j,k)} \mid T_{(i,j)})} p(T_i)\, \mathrm{d}T_i. \]
\begin{align*}
p\bigl(Y_{(i,j,k)} \mid T_{(i,j)}\bigr)
&= \Bigl\{ \alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}} \bigl(1 - \alpha_{(\xi_{(i,k)},j)}\bigr)^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (0,\, v_{(i,k)}]\}}
 \times \Bigl\{ \bigl(1 - \eta_{(\xi_{(i,k)},j)}\bigr)^{Y_{(i,j,k)}} \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,k)},\, +\infty)\}} \\
&= \prod_{l=1}^{k} \Bigl\{ \alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}} \bigl(1 - \alpha_{(\xi_{(i,k)},j)}\bigr)^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,l-1)},\, v_{(i,l)}]\}}
 \times \prod_{l=k+1}^{K_i+1} \Bigl\{ \bigl(1 - \eta_{(\xi_{(i,k)},j)}\bigr)^{Y_{(i,j,k)}} \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,l-1)},\, v_{(i,l)}]\}}.
\end{align*}

Likelihood: event-time part
For each $i = 1, \dots, N$:
\[ p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, \mathrm{d}T_i, \qquad
p(T_i) = \int_{\mathbb{R}} \Bigl\{ \prod_{j=1}^{J} p\bigl(T_{(i,j)} \mid b_i\bigr) \Bigr\}\, p(b_i)\, \mathrm{d}b_i. \]
- $p(T_{(i,j)} \mid b_i)$: log-normal, following from the AFT model with a normal error. Unknown parameters: $\beta$, $\sigma_\varepsilon^2$.
- $p(b_i)$: normal mixture, following from the PGM model. Unknown parameters: $w$, $\mu$, $\tau^2$.

Estimation and inference
- Maximum likelihood is clearly not tractable.
- Bayesian specification of the model (with weakly informative priors) and MCMC-based inference are possible.
- All integrals in the likelihood disappear from the calculations if Bayesian data augmentation is used for
  ⇒ the unobserved event times $T_{(i,j)}$;
  ⇒ the random effects (frailties) $b_i$.
- R package bayesSurv (≥ 2.3).
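The product form of the misclassification part implies that, for a fixed tooth, the classification likelihood is constant within each interval between consecutive visits. Under data augmentation, one update of an event time can therefore be sketched as: pick an interval with probability proportional to (classification weight) × (log-normal mass of the interval), then draw from the log-normal truncated to that interval. The code below is a minimal sketch of this idea derived from the formulas above, not the implementation in bayesSurv; all function names are hypothetical, `alpha` and `eta` are assumed to be already indexed by the examiner active at each visit, and inverse-CDF sampling is used for brevity although it can be numerically poor far in the tails.

```python
import numpy as np
from scipy.stats import norm

def interval_weights(y, alpha, eta):
    """Classification likelihood W_l if T lies in (v_{l-1}, v_l], l = 1, ..., K+1."""
    y = np.asarray(y)
    alpha = np.asarray(alpha, dtype=float)
    eta = np.asarray(eta, dtype=float)
    pos = np.where(y == 1, alpha, 1.0 - alpha)   # visit factor if true status is 1
    neg = np.where(y == 1, 1.0 - eta, eta)       # visit factor if true status is 0
    K = len(y)
    # Visits before the interval see true status 0, visits from its end onwards see 1.
    return np.array([np.prod(neg[:l]) * np.prod(pos[l:]) for l in range(K + 1)])

def sample_event_time(y, visits, alpha, eta, mean_log, sd_log, rng):
    """Draw T from a mixture of log-normals truncated to the intervals between visits."""
    cuts = np.concatenate(([-np.inf], np.log(visits), [np.inf]))  # log v_0, ..., log v_{K+1}
    cdf = norm.cdf((cuts - mean_log) / sd_log)
    mass = np.diff(cdf)                          # log-normal mass of each interval
    w = interval_weights(y, alpha, eta) * mass
    l = rng.choice(len(w), p=w / w.sum())        # choose the interval
    u = rng.uniform(cdf[l], cdf[l + 1])          # inverse-CDF draw inside it
    return float(np.exp(mean_log + sd_log * norm.ppf(u)))

rng = np.random.default_rng(2015)
visits = np.array([7.0, 8.0, 9.0, 10.0])         # made-up visit times
W = interval_weights([0, 1, 0, 1], np.full(4, 0.9), np.full(4, 0.8))
t_new = sample_event_time([0, 1, 0, 1], visits, np.full(4, 0.9), np.full(4, 0.8),
                          mean_log=np.log(8.5), sd_log=0.3, rng=rng)
```

Here `mean_log` plays the role of $x_{(i,j)}^\top \beta + b_i$ and `sd_log` the role of $\sigma_\varepsilon$ at the current MCMC iteration.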
Prior distributions
- AFT regression parameters: $\beta \sim$ Normal (with large variances);
- (Inverted) variance of the AFT error terms: $\sigma_\varepsilon^{-2} \sim$ Gamma (with small rate and shape parameters);
- Location of the random intercepts: $\mu \sim$ Normal (with large variance);
- (Inverted) squared scale of the random intercepts: $\tau^{-2} \sim$ Gamma (with small rate and shape parameters).

Random intercept distribution
Remember: $b_i \sim$ penalized Gaussian mixture (PGM).
[Figure: illustration of penalized Gaussian mixture densities; the horizontal axes show the knots $\kappa_{-9}, \kappa_{-6}, \dots, \kappa_{9}$.]

Prior for PGM weights
Mixture weights (from the PGM model for the distribution of the random intercept)
Remember: $b_i \sim \mu + \tau \sum_{l=-M}^{M} w_l\, \mathcal{N}(\kappa_l, \zeta^2)$, where $M$ is relatively large.
- The weights $w$ should sum up to one.
- Work is done primarily with the transformed weights $a = \bigl(a_{-M}, \dots, a_M\bigr)^\top$:
\[ w_l = \frac{\exp(a_l)}{\sum_{m=-M}^{M} \exp(a_m)}, \qquad l = -M, \dots, M, \qquad a_0 = 0. \]
- Regularization prior for the (transformed) weights:
\[ p(a \mid \lambda) \propto \exp\Bigl\{ -\frac{\lambda}{2} \sum_{j=-M+o}^{M} \bigl(\Delta^o a_j\bigr)^2 \Bigr\} = \exp\Bigl\{ -\frac{\lambda}{2}\, a^\top P_o^\top P_o\, a \Bigr\}. \]
  - $\Delta^o$: difference operator of order $o$ ⇒ $P_o$: corresponding difference operator matrix.
  - $\lambda$: smoothing hyperparameter ⇒ prior: $\lambda \sim$ Gamma.
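The weight transformation and the difference penalty above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the order of the difference operator is left as an argument because the slides do not fix it, and the normalizing constant of the prior is ignored.

```python
import numpy as np

def pgm_weights(a):
    """Map transformed weights a to mixture weights w (softmax), which sum to one."""
    a = np.asarray(a, dtype=float)
    e = np.exp(a - a.max())          # subtracting the maximum avoids overflow
    return e / e.sum()

def difference_matrix(n, order):
    """Difference operator matrix P_o with (n - order) rows, so that P_o a = Delta^o a."""
    return np.diff(np.eye(n), n=order, axis=0)

def log_prior_a(a, lam, order):
    """Unnormalized log p(a | lambda) = -(lambda / 2) a' P_o' P_o a."""
    a = np.asarray(a, dtype=float)
    P = difference_matrix(len(a), order)
    return -0.5 * lam * float(a @ P.T @ P @ a)

a = np.arange(7.0) ** 2              # made-up coefficients: a quadratic sequence
w = pgm_weights(a)
print(log_prior_a(a, lam=1.5, order=2))   # second differences are all 2: -0.5 * 1.5 * 20
```

A quadratic sequence has vanishing third-order differences, so it is not penalized at all when `order=3`; the penalty only discourages departures from such smooth shapes.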
Prior for misclassification parameters
Sensitivities and specificities of the event classification
For each $q$ (examiner) and $j$ (unit, i.e. tooth):
- $0 < \alpha_{(q,j)} < 1$: sensitivity of examiner $q$ when scoring the $j$th unit (tooth);
- $0 < \eta_{(q,j)} < 1$: specificity of examiner $q$ when scoring the $j$th unit (tooth).
- Identification constraint: $\alpha_{(q,j)} + \eta_{(q,j)} > 1$.
- Prior: $\bigl(\alpha_{(q,j)}, \eta_{(q,j)}\bigr) \sim$ Beta × Beta truncated by the identification constraint.

Markov chain Monte Carlo: block Gibbs sampler
- Parameters of the event-time model ($\beta$, $\sigma_\varepsilon^2$, PGM parameters $w$/$a$, $\lambda$, $\mu$, $\tau^2$ and augmented random effects $b_1, \dots, b_N$): nothing new compared to the situation without misclassification, see the earlier papers Komárek, Lesaffre (& Legrand) (2007, 2008).
- Augmented event times $T_{(i,j)}$: sampling from a mixture of truncated log-normals.
  - Truncation: intervals between the visit times.
  - Mixture weights: binomial probabilities that depend on the sensitivities/specificities and the observed values $Y_{(i,j,l)}$ of the 0/1 event classifications.
  - What would change if a model other than the AFT model with normal errors were assumed for the event times?
- Sensitivities ($\alpha$'s) and specificities ($\eta$'s): sampling from truncated Beta distributions.

Part VI: Simulation study

Simulation study
- $J = 4$ (teeth), $N = 500,\ 1\,000,\ 2\,000$ (children).
- $\log T_{(i,j)} = 2.0 + 0.2\, x_{(i,j),1} - 0.1\, x_{(i,j),2} + b_i + \varepsilon_{(i,j)}$.
- $x_{(i,j),1} \sim \text{Uniform}(0, 1)$, $x_{(i,j),2} \sim \text{Bernoulli}(0.5)$.
- $\operatorname{var}(b_i) + \operatorname{var}(\varepsilon_{(i,j)}) = 0.1$.
- $\sigma_b / \sigma_\varepsilon = \sqrt{\operatorname{var}(b_i) / \operatorname{var}(\varepsilon_{(i,j)})} = 0.5,\ 1,\ 2,\ 5$.
- $g_b$: (a) normal, (b) clearly bimodal two-component normal mixture, (c) Gumbel.
- $K_i = 10$ visits (in random intervals).
- $Q = 5$ examiners randomly assigned to the visits.
- Sensitivities and specificities ranging over 0.60 – 0.96.
- 500 data sets for each scenario.
- Each dataset was also analyzed while ignoring misclassification: the first $Y_{(i,j,k)} = 1$ determined the "observed" interval where $T_{(i,j)}$ occurred, leading to "standard" interval-censored data.
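One data set of the kind described above could be generated along the following lines. This is a rough sketch under stated assumptions, not the code used for the study: it covers only the normal choice of $g_b$, the gaps between visits are drawn as Uniform(0.5, 1.5) because the slides say only "random intervals", the examiner sensitivities and specificities are drawn uniformly from the reported range 0.60 – 0.96 and do not depend on the tooth, and the function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(n_children, ratio, rng, n_teeth=4, n_visits=10, n_examiners=5):
    """Generate misclassified 0/1 records; ratio = sigma_b / sigma_eps."""
    # Split the total variance 0.1 so that var_b / var_eps = ratio**2.
    var_eps = 0.1 / (1.0 + ratio ** 2)
    var_b = 0.1 - var_eps
    # ASSUMPTION: examiner-level sensitivities/specificities, uniform on the reported range.
    alpha = rng.uniform(0.60, 0.96, n_examiners)
    eta = rng.uniform(0.60, 0.96, n_examiners)
    # Event times from the random intercept AFT model (normal g_b only).
    b = rng.normal(0.0, np.sqrt(var_b), (n_children, 1))
    x1 = rng.uniform(0.0, 1.0, (n_children, n_teeth))
    x2 = rng.binomial(1, 0.5, (n_children, n_teeth))
    eps = rng.normal(0.0, np.sqrt(var_eps), (n_children, n_teeth))
    T = np.exp(2.0 + 0.2 * x1 - 0.1 * x2 + b + eps)
    # ASSUMPTION: visit gaps ~ Uniform(0.5, 1.5); one examiner per child and visit.
    visits = np.cumsum(rng.uniform(0.5, 1.5, (n_children, n_visits)), axis=1)
    examiner = rng.integers(0, n_examiners, (n_children, n_visits))
    # True status per (child, tooth, visit), then misclassify it.
    true_status = T[:, :, None] <= visits[:, None, :]
    p_one = np.where(true_status,
                     alpha[examiner][:, None, :],
                     1.0 - eta[examiner][:, None, :])
    Y = rng.binomial(1, p_one)
    return {"Y": Y, "T": T, "visits": visits, "examiner": examiner,
            "alpha": alpha, "eta": eta, "var_b": var_b, "var_eps": var_eps}

data = simulate(n_children=500, ratio=2.0, rng=np.random.default_rng(0))
print(data["Y"].shape)   # (500, 4, 10)
```

The returned `Y` is in general not monotone along the visit axis, which is the defining feature of misclassified interval-censored data.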
Simulation results: misclassification modelled
In all figures below, results are shown by $\sigma_b / \sigma_\varepsilon \in \{0.5, 1, 2, 5\}$ and $N \in \{500, 1000, 2000\}$.
[Figure: estimates of sensitivity $\alpha_{(1,1)} = 0.60$; $g_b$: bimodal two-component normal mixture.]
[Figure: estimates of sensitivity $\alpha_{(1,1)} = 0.60$; $g_b$: Gumbel.]
[Figure: estimates of sensitivity $\alpha_{(4,4)} = 0.91$; $g_b$: bimodal two-component normal mixture.]
[Figure: estimates of sensitivity $\alpha_{(4,4)} = 0.91$; $g_b$: Gumbel.]
[Figure: estimates of regression parameter $\beta_1 = 0.20$; $g_b$: bimodal two-component normal mixture.]
[Figure: estimates of regression parameter $\beta_1 = 0.20$; $g_b$: Gumbel.]

Simulation results: ignored misclassification
[Figure: estimates of regression parameter $\beta_1 = 0.20$, IGNORED MISCLASSIFICATION; $g_b$: bimodal two-component normal mixture.]
[Figure: estimates of regression parameter $\beta_1 = 0.20$, IGNORED MISCLASSIFICATION; $g_b$: Gumbel.]

Survival function for a certain covariates combination
Each figure shows $S(t)$ against time (0 to 15) in panels for $N = 500$, $N = 1000$, $N = 2000$, with one row for $g_b$: bimodal two-component normal mixture and one row for $g_b$: Gumbel.
[Figure: $\sigma_b / \sigma_\varepsilon = 5$.]
[Figure: $\sigma_b / \sigma_\varepsilon = 0.5$.]
[Figure: $\sigma_b / \sigma_\varepsilon = 5$, IGNORED MISCLASSIFICATION.]
[Figure: $\sigma_b / \sigma_\varepsilon = 0.5$, IGNORED MISCLASSIFICATION.]

Simulation study 2
What if no misclassification is present but we use the model that accounts for possible misclassification?
Simulation study 2: data generated without misclassification (all sensitivities and specificities being equal to one).
[Figure: estimates of sensitivity $\alpha_{(1,1)} = 1.00$; $g_b$: bimodal two-component normal mixture.]
[Figure: estimates of sensitivity $\alpha_{(1,1)} = 1.00$; $g_b$: Gumbel.]
[Figure: simulation study 2, estimates of regression parameter $\beta_1 = 0.20$ by $\sigma_b / \sigma_\varepsilon \in \{0.5, 1, 2, 5\}$ and $N \in \{500, 1000, 2000\}$; $g_b$: bimodal two-component normal mixture.]
[Figure: simulation study 2, estimates of regression parameter $\beta_1 = 0.20$; $g_b$: Gumbel.]
[Figure: simulation study 2, survival function $S(t)$ for a certain covariates combination, $\sigma_b / \sigma_\varepsilon = 5$; panels for $N = 500, 1000, 2000$; rows for $g_b$: bimodal two-component normal mixture and $g_b$: Gumbel.]
[Figure: simulation study 2, survival function $S(t)$ for a certain covariates combination, $\sigma_b / \sigma_\varepsilon = 0.5$; same layout.]

Part VII: Models comparison

Models comparison
- Two competing models $M_1$ and $M_2$. They may differ in the specification of the event-time and/or the misclassification model.
- Pseudo Bayes factor (PsBF) (Geisser and Eddy, 1979, JASA; Gelfand and Dey, 1994, JRSS, B):
\[ \text{PsBF}(M_1, M_2) = \frac{\text{PsML}_{M_1}}{\text{PsML}_{M_2}}. \]
- $\text{PsML}_M$: pseudo marginal likelihood given model $M$:
\[ \text{PsML}_M = \prod_{i=1}^{N} \prod_{j=1}^{J} p_M\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \bigm| Y_{[-(i,j)]}\bigr), \]
  - $Y_{[-(i,j)]}$: data without the observation of unit (tooth) $j$ of subject (child) $i$;
  - $p_M(\cdot \mid \cdot)$: posterior predictive distribution.

Pseudo marginal likelihood
Approximation based on the proposal of Gelfand and Dey (1994, JRSS, B):
\begin{align*}
p_M\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \bigm| Y_{[-(i,j)]}\bigr)
&= \Biggl\{ \mathsf{E}_{\alpha, \eta, \beta, b_i, \sigma_\varepsilon^2 \mid Y} \Biggl( \frac{1}{\mathsf{P}\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \bigm| \alpha, \eta, \beta, b_i, \sigma_\varepsilon^2\bigr)} \Biggr) \Biggr\}^{-1} \\
&\approx \Biggl\{ \frac{1}{B} \sum_{b=1}^{B} \frac{1}{\mathsf{P}\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \bigm| \alpha^{(b)}, \eta^{(b)}, \beta^{(b)}, b_i^{(b)}, \sigma_\varepsilon^{2(b)}\bigr)} \Biggr\}^{-1},
\end{align*}
where
\[ \mathsf{P}\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \bigm| \alpha, \eta, \beta, b_i, \sigma_\varepsilon^2\bigr)
= \sum_{k=1}^{K_i+1} \Biggl\{ \int_{v_{(i,k-1)}}^{v_{(i,k)}} \frac{1}{t}\, \varphi\bigl(\log t \bigm| x_{(i,j)}^\top \beta + b_i,\, \sigma_\varepsilon^2\bigr)\, \mathrm{d}t \Biggr\} \times W_{(i,j,k)}\bigl(Y_{(i,j)}, \alpha, \eta\bigr). \]
$W_{(i,j,k)}(Y_{(i,j)}, \alpha, \eta)$: a quantity for which we have a closed-form expression and which is also needed in the MCMC procedure.

Part VIII: The Signal Tandmobiel® study

Models
Event-time model
\[ \log T_{(i,j)} = b_i + x_{(i,j)}^\top \beta + \varepsilon_{(i,j)} \]
- $T_{(i,j)}$: age at getting caries on tooth $j$ ($\in \{1, 2, 3, 4\}$) of child $i$.
- $x_{(i,j)}$: gender, presence of sealants, frequency of brushing, x and y geographical coordinates.
Misclassification models
- 16 examiners.
- Model $M_1$: sensitivities/specificities both examiner and tooth specific (64 + 64 sensitivities and specificities).
- Model $M_2$: sensitivities/specificities only examiner specific (16 + 16 sensitivities and specificities).

Models comparison
- Pseudo marginal log-likelihoods: $M_1$: −16 545, $M_2$: −16 515.
- $\text{PsBF}(M_1, M_2) = \exp(-30) \approx 10^{-13}$.
- From a predictive point of view, the simpler model $M_2$ (sensitivities/specificities only examiner specific) is better.

[Figure: sensitivities (posterior means and 95% HPD credible intervals) for examiners 1 to 16; vertical axis from 0.70 to 1.00.]
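The harmonic-mean approximation of the pseudo marginal likelihood and the resulting pseudo Bayes factor can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that the log-likelihood contributions $\log \mathsf{P}(Y_{(i,j)} \mid \theta^{(b)})$ have already been evaluated for each MCMC draw, and it works on the log scale to avoid numerical underflow.

```python
import numpy as np
from scipy.special import logsumexp

def log_cpo(loglik_draws):
    """Log of the harmonic mean of P(Y_(i,j) | theta^(b)) over B draws."""
    ll = np.asarray(loglik_draws, dtype=float)
    # log[ { (1/B) sum_b 1 / P_b }^{-1} ] = log B - logsumexp(-log P_b)
    return float(np.log(len(ll)) - logsumexp(-ll))

def log_psml(loglik_matrix):
    """Log pseudo marginal likelihood from a (B draws) x (units) matrix of log-likelihoods."""
    ll = np.asarray(loglik_matrix, dtype=float)
    return float(np.sum(np.log(ll.shape[0]) - logsumexp(-ll, axis=0)))

# Tiny made-up check: the harmonic mean of 0.5 and 0.25 is 1/3
print(np.exp(log_cpo(np.log([0.5, 0.25]))))

# Pseudo Bayes factor from the reported pseudo marginal log-likelihoods
log_psbf = -16545 - (-16515)
print(log_psbf, np.exp(log_psbf))   # -30 and about 9.4e-14
```

Since only the difference of the two log pseudo marginal likelihoods matters, the comparison is usually reported on the log scale.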
[Figure: specificities (posterior means and 95% HPD credible intervals) for examiners 1 to 16; vertical axis from 0.70 to 1.00.]

[Figure: random intercept density $g(b)$ (standardized, pointwise posterior means) for $b$ from −3 to 3.]

Posterior summary for regression parameters

Parameter                  Model   Posterior mean   Posterior median   95% HPD interval
Gender (Girl)              1       −0.05971         −0.05964           (−0.09115; −0.02854)
                           2       −0.05984         −0.05993           (−0.09098; −0.02814)
Sealants (Present)         1        0.19027          0.19016           ( 0.16319;  0.21762)
                           2        0.19067          0.19054           ( 0.16378;  0.21787)
Freq. of Brush. (Daily)    1        0.16564          0.16562           ( 0.12168;  0.21056)
                           2        0.16538          0.16542           ( 0.12242;  0.20938)
x-ordinate                 1       −0.00092         −0.00092           (−0.00122; −0.00062)
                           2       −0.00092         −0.00092           (−0.00122; −0.00062)
y-ordinate                 1       −0.00002         −0.00004           (−0.00010;  0.00087)
                           2       −0.00007         −0.00007           (−0.00101;  0.00080)

[Figure: survival functions $S(t)$ (pointwise posterior means) against age (0 to 15). Curves, in legend order: Boy: Seal: More freq.; Girl: Seal: More freq.; Boy: Seal: Less freq.; Boy: No seal: More freq.; Girl: Seal: Less freq.; Girl: No seal: More freq.; Boy: No seal: Less freq.; Girl: No seal: Less freq.]

[Figure: hazard functions $h(t)$ (pointwise posterior means) against age (0 to 15). Curves, in legend order: Girl: No seal: Less freq.; Boy: No seal: Less freq.; Girl: No seal: More freq.; Girl: Seal: Less freq.; Boy: No seal: More freq.; Boy: Seal: Less freq.; Girl: Seal: More freq.; Boy: Seal: More freq.]

Part IX: Summary and conclusions

Misclassified interval-censored data
[Figure: the 0/1 process $Y_{(i,j)}(t)$ with the statuses recorded at visits $v_{(i,1)}, \dots, v_{(i,4)}$ equal to 0, 1, 0, 1.]
\[ T_{(i,j)} \in \text{???}, \qquad Y_{(i,j)} = (0, 1, 0, 1)^\top. \]
Conclusions
Interval-censored event time data are encountered whenever a certain evaluation (examination, laboratory procedure, ...) is needed to determine the event status.
Event status evaluation is often subject to misclassification.
  Not only human examiners but also laboratory procedures usually have sensitivity and/or specificity < 1.
Ignoring misclassification may lead to seriously biased results of the event time analysis.
Joint modelling of the misclassification and event-time processes allows for unbiased/consistent estimation of parameters of:
  the event-time process (survival functions, regression parameters, ...);
  the misclassification process (sensitivities, specificities).
No need for external (validation) data to get sensitivities/specificities related to classification.

Possible extensions/modifications
An event-time model other than the AFT model with random intercept.
  Only small parts of the MCMC scheme would have to be modified.
Examiner-specific covariates to model sensitivities/specificities in the misclassification model.
  Logit model.
Time-dependent sensitivities/specificities.
  Useful if learning-by-doing can be expected in event classification.
  Likely not possible with our “joint” approach due to identifiability problems.
  External (validation) data needed to estimate parameters of the misclassification process.

Applicability
Designed longitudinal studies with pre-specified visit times (independent of the event times).
Event status checked at each visit, independently of previous examination results, by an imperfect diagnostic procedure.
At least three visits (for at least some subjects) needed to identify parameters of the misclassification process (sensitivities and specificities).
The above conditions are quite often satisfied in practice, yet misclassification is ignored...
Practically nothing is lost if misclassification is considered even when it is not present.

THANK YOU FOR YOUR ATTENTION!

References
García-Zattera, Jara, Komárek (2015+). A flexible AFT model for misclassified clustered interval-censored data. Under review.
Komárek, Lesaffre, Hilton (2005). Accelerated failure time model for arbitrarily censored data with smoothed error distribution. Journal of Computational and Graphical Statistics, 14(3), 726–745.
Komárek, Lesaffre, Legrand (2007). Baseline and treatment effect heterogeneity for survival times between centers using a random effects accelerated failure time model with flexible error distribution.
Statistics in Medicine, 26(30), 5457–5472.
Komárek, Lesaffre (2008). Bayesian accelerated failure time model with multivariate doubly-interval-censored data and flexible distributional assumptions. Journal of the American Statistical Association, 103(482), 523–533.
García-Zattera, Mutsvari, Jara, Declerck, Lesaffre (2010). Correcting for misclassification for a monotone disease process with an application in dental research. Statistics in Medicine, 29(30), 3103–3117.
García-Zattera, Jara, Lesaffre, Marshall (2012). Modeling of multivariate monotone disease processes in the presence of misclassification. Journal of the American Statistical Association, 107(499), 976–989.