Regression with frailty in survival analysis

Kyuson Lim
November 26, 2021
STATS 756

Contents

1 Acknowledgement
2 Introduction
  2.1 Understanding the concept of the Cox proportional hazard model
    2.1.1 Parametric Cox model with frailty term
3 Partial Likelihood
  3.1 Breslow partial likelihood
    3.1.1 Example for computing partial likelihood
    3.1.2 Penalized partial likelihood (PPL)
  3.2 Newton-Raphson method
    3.2.1 Newton-Raphson algorithm - example
4 Simulation for data
  4.1 Simulation study: Infection in kidney patients
  4.2 Simulation study: testing the model fit
  4.3 Extensions
    4.3.1 Generalized gamma frailty model
5 Appendix: R codes

Chapter 1 Acknowledgement

The purpose of this report is solely to interpret and analyze 'Regression with frailty in survival analysis' by McGilchrist and Aisbett (1991). The concepts of penalized partial likelihood and the Newton-Raphson algorithm, used to obtain the maximizing coefficients in the Cox model, are developed further in 'Estimation in generalized mixed models' by the same first author. The original dataset used for the analysis is included in the R package 'survival' and is imported here to be re-assessed. Further details of the dataset, Infections in kidney patients, can be found on page 55 of the 'survival' package documentation. This report also restates the description of the dataset (including its outlier), the model specification, and the code used with the 'survival' package.
The examples and code, however, are taken from the textbook 'Applied survival analysis using R' by Dirk F. Moore, which provides the graphical visualization of the optimization in the Newton-Raphson method and guidance for the estimation process. Together with the textbook 'Frailty models in survival analysis' by Andreas Wienke, it is used to define the frailty terms and to derive the log-likelihood as well as the penalized partial likelihood. The interpretation of the original paper and of the optimization method rephrases the definitions used in these textbooks. Finally, the report briefly turns to the paper 'Generalized gamma frailty model' by Professor Dr. Balakrishnan, which extends the idea and method of the original paper. That paper treats the frailty distribution parametrically, covering the lognormal and Weibull frailty models and the generalized gamma distribution. Its comparison of model performance under these distributions, together with the Newton-Raphson algorithm, is used as an extension of the original paper.

I would like to thank the authors of all the textbooks and guidelines used in writing this report for the course STATS 756 on the analysis of the Cox model with frailty terms. I would also like to thank Professor Dr. Balakrishnan for supporting me in learning the ideas of survival analysis and censoring and in writing the report.

Chapter 2 Introduction

To begin with, some concepts and the relationship between the survival function and the hazard function are explained. The Cox proportional-hazards model (Cox, 1972) is essentially a regression model commonly used in medical research for investigating the association between the survival time of patients and one or more predictor variables.

[Figure: A graph of survival analysis]

The empirical hazard function is a step function evaluated at each time.
The survival probability, $S(t)$, is the probability that an individual survives from the time origin to a specified future time $t$. The hazard, $h(t)$, is the instantaneous rate at which an individual who is under observation at time $t$ has an event at that time. The hazard function and the survival function are modeled as follows. As the survival function is decreasing from $t = 0$ to $\infty$ for a continuous variable, the distribution function is $F(t) = P(T \le t)$. Hence,

$$S(t) = P(T > t) = \int_t^{\infty} f(u)\,du = 1 - F(t).$$

For survival time $T$, the hazard function $h(t) = f(t)/S(t)$ describes the event just after time $t$, over an instantaneous interval ($\Delta t \to 0$):

$$h(t) = -\frac{d}{dt}\log S(t) = \frac{f(t)}{1 - F(t)}, \qquad H(t) = \int_0^t h(u)\,du,$$

where $H(t)$ is the cumulative hazard. Therefore, $H(t) = -\log S(t)$ and $S(t) = \exp(-H(t))$. If one of $h$, $H$ and $S$ is known, the other two are easily determined.

2.1 Understanding the concept of the Cox proportional hazard model

The paper on regression with frailty in survival analysis contains data on infections in kidney patients. The survival plot for the 76 observations on kidney patients in the paper, produced in R, is shown in Figure 2.1.

Figure 2.1: A graph of survival analysis for Kidney data

As already noted, the survival function gives the probability of surviving beyond a point $t$, $S(t) = P(T > t)$. The hazard function is an instantaneous failure rate: given that a subject has survived up to time $t$, it is the rate of failing in the next small interval of time,

$$h(t) = \lim_{\delta \to 0} \frac{P(t < T < t + \delta \mid T > t)}{\delta}.$$

In this report, the goals for understanding the Cox proportional hazard model with frailty terms include the following.

• The Cox proportional hazard model analyzes several influential covariates simultaneously, where $h_0(t)$ is the non-parametric part and $\exp(\boldsymbol{\beta}'\mathbf{x}_i)$ is the parametric part.
• A regression model is fitted to censored survival data, and the partial likelihood allows us to compare 2 groups of survival data.
• The estimates of $\boldsymbol{\beta}$ are obtained by maximizing the log-likelihood with the Newton-Raphson algorithm.

An endpoint, a single cause of death, and the survival times of each case have been assumed to be independent. Methods for analyzing such survival data are not sufficient if cases are not independent or if the event can occur repeatedly. The proportional hazard model is described as

$$h(t \mid \mathbf{x}_i) = h_0(t)\exp(\boldsymbol{\beta}'\mathbf{x}_i(t)), \qquad \boldsymbol{\beta}' = (\beta_1, \ldots, \beta_p).$$

Cluster-specific random effect terms have a relative effect on the baseline hazard function $h_0(t)$, which reflects the underlying hazard for subjects with all covariates $x_1, \ldots, x_p$ equal to 0. The baseline hazard function is the non-parametric part and $\exp(\boldsymbol{\beta}'\mathbf{x}_i)$ is the parametric part, which is why the Cox proportional hazard model is called semi-parametric.

For a single binary covariate with $x = 1$ (treated) and $x = 0$ (control), the hazard rate for the treated group is

$$h_1(t) = h(t \mid x = 1) = h_0(t)\exp(\beta).$$

The ratio of the two hazards is the constant $\exp(\beta)$, which does not depend on time $t$, so the hazards of the 2 groups remain proportional over time:

$$\frac{h_1(t)}{h_0(t)} = \exp(\beta).$$

Moreover, as explained in the lecture, the relative hazard of two patients who differ only in the $k$th covariate ($x_{1k} = 1$, $x_{2k} = 0$, all other covariates equal) is

$$\frac{h_1(t \mid \mathbf{x})}{h_2(t \mid \mathbf{x}')} = \frac{h_0(t)\exp(\boldsymbol{\beta}'\mathbf{x})}{h_0(t)\exp(\boldsymbol{\beta}'\mathbf{x}')} = \frac{\exp(\beta_1 x_{11} + \beta_2 x_{12} + \cdots + \beta_k x_{1k} + \cdots + \beta_p x_{1p})}{\exp(\beta_1 x_{21} + \beta_2 x_{22} + \cdots + \beta_k x_{2k} + \cdots + \beta_p x_{2p})}$$

$$= \frac{\exp(\beta_1 x_{11} + \beta_2 x_{12} + \cdots + \beta_k \cdot 1 + \cdots + \beta_p x_{1p})}{\exp(\beta_1 x_{21} + \beta_2 x_{22} + \cdots + \beta_k \cdot 0 + \cdots + \beta_p x_{2p})} = \frac{\exp(\beta_k)}{\exp(0)} = \exp(\beta_k).$$

Hence, the Cox proportional hazards model is a linear model for the log of the hazard ratio.

[Figure: A graph of relative hazard ratio for each coefficient]

The relative hazard ratio of each predictor in the paper is computed as follows. The type of disease is found to influence the relative hazard ratio. Sex is a significant predictor to be considered as well.
The hazard ratio is estimated to be 1 for age and 0.23 for sex, while it is 1.09 for the GN disease, 1.42 for AN and 0.24 for PKD.

2.1.1 Parametric Cox model with frailty term

In clustered data, the survival times of individuals in the same unit or family tend to be more similar to each other than to those from other clusters, so independence no longer holds. One way to accommodate this structure is to assign each individual in a cluster a common factor, known as a frailty or random effect. Random effects are incorporated to account for within-cluster homogeneity in outcomes (see footnote 1).

Shared frailty model:

$$h(t \mid \mathbf{x}_{ij}) = h_0(t)\exp(\boldsymbol{\beta}'\mathbf{x}_{ij}(t) + w_i)$$

Here $w_i$ is the random effect for the $i$th cluster; it is common to all individuals in that cluster and varies across clusters. Subjects in the same cluster all share the same frailty factor. A frailty model refers to a survival model with only a random intercept. The frailty term follows some distribution, such as the log-normal, gamma or Weibull distribution. In the paper, the log-frailties are assumed to be normally distributed with

$$E(\log(w_i)) = 0, \qquad \mathrm{Var}(\log(w_i)) = \sigma^2(\mathbf{I} - M^{-1}\mathbf{1}\mathbf{1}')$$

(see footnote 2).

Footnote 1: The random effect increases or decreases the hazard for a distinct class.
Footnote 2: $w_i = 1$, $\mathbf{1}'\mathbf{u} = 0$.

Chapter 3 Partial Likelihood

The goal is to estimate $\boldsymbol{\beta}$ in a way that does not depend on $h_0(t)$, using the ordered death times of $r$ individuals, $t_{(1)} < \cdots < t_{(r)}$. First, we define the risk set, $R(t_{(j)})$, to be the group of individuals who are alive and uncensored at a time just prior to $t_{(j)}$.
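To make the risk-set definition concrete, the following is a minimal R sketch. The six survival times and censoring indicators are the toy values of the six-patient example in Section 3.1.1; the variable names (`time`, `status`, `risk_sets`) are illustrative choices and are not taken from the paper. The sketch assumes no tied event times and treats a subject as at risk at $t_{(j)}$ when its observed time is at least $t_{(j)}$.

```r
# Toy data (same values as the six-patient example in Section 3.1.1)
time   <- c(6, 7, 10, 15, 19, 25)   # observed survival or censoring times
status <- c(1, 0, 1, 1, 0, 1)       # 1 = event observed, 0 = censored

# Ordered distinct event times t_(1) < ... < t_(r)
event_times <- sort(unique(time[status == 1]))

# Risk set R(t_(j)): indices of subjects still under observation at t_(j)
risk_sets <- lapply(event_times, function(tj) which(time >= tj))
names(risk_sets) <- event_times

# Number of subjects at risk at each event time
risk_set_sizes <- lengths(risk_sets)
print(risk_sets)
print(risk_set_sizes)
```

In this toy data the risk sets shrink from all six subjects at the first event time to a single subject at the last one, which is why the denominator of each partial likelihood factor changes from one event to the next.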
The probability that individual $i$ dies at $t_{(j)}$, given that exactly one individual from the risk set $R(t_{(j)})$ dies at $t_{(j)}$, is

$$P(\text{individual } i \text{ dies at } t_{(j)} \mid \text{one death from } R(t_{(j)}) \text{ at } t_{(j)}) = \frac{P(\text{individual } i \text{ dies at } t_{(j)})}{P(\text{one death at } t_{(j)})}$$

$$= \frac{h_i(t_{(j)} \mid \mathbf{x}_i)}{\sum_{k \in R(t_{(j)})} h_k(t_{(j)} \mid \mathbf{x}_k)} = \frac{h_0(t_{(j)})\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_{(j)})} h_0(t_{(j)})\exp(\boldsymbol{\beta}'\mathbf{x}_k)} = \frac{\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_{(j)})} \exp(\boldsymbol{\beta}'\mathbf{x}_k)}.$$

The partial likelihood differs from the likelihood in that its factors are conditional probabilities, and the frailties are latent variables that are unobserved. The partial likelihood is then simply the product of these conditional probabilities over the $r$ events observed among the $n$ samples,

$$L(\boldsymbol{\beta}) = \prod_{j=1}^{r} \frac{\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_{(j)})} \exp(\boldsymbol{\beta}'\mathbf{x}_k)}, \qquad t_{(1)}, \ldots, t_{(r)} \text{ the ordered survival times}.$$

Here $\mathbf{x}_i$ is the vector of covariates for the individual $i$ who dies at $t_{(j)}$. As the equation shows, individuals who have left the risk set before the $j$th event are not counted in the denominator. The partial likelihood allows the baseline survival distribution to remain unspecified while the survival distributions of subjects are defined through their covariates. The maximizing $\boldsymbol{\beta}$ is found by taking the log of the equation.

3.1 Breslow Partial likelihood

The likelihood can also be expressed with the hazard function $h(t) = f(t)/S(t)$. Note that the density term contributes only for uncensored individuals.

$$L(\boldsymbol{\beta}, \mathbf{x}_i) = \prod_{i=1}^{n} f(t_i, \boldsymbol{\beta})^{d_i} S(t_i, \boldsymbol{\beta})^{1 - d_i} = \prod_{i=1}^{n} h(t_i, \boldsymbol{\beta})^{d_i} S(t_i, \boldsymbol{\beta}), \qquad h(t, \boldsymbol{\beta}) = h_0(t)\exp(\boldsymbol{\beta}'\mathbf{x}_i)$$

As one of the simplest methods, the Breslow approximation adjusts both terms of the marginal method so that they have the same denominator, corresponding to all subjects at risk:

$$L(\boldsymbol{\beta}) = \prod_{i=1}^{n} \left\{ \frac{\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_i)} \exp(\boldsymbol{\beta}'\mathbf{x}_k)} \right\}^{d_i}$$

Here $t_1, \ldots, t_n$ are the observed survival times of the $n$ individuals and $R(t_i)$ is the risk set at time $t_i$. Also, $d_i$ is an event indicator, defined as follows.
$$d_i = \begin{cases} 0 & \text{if patient is censored} \\ 1 & \text{if patient dies} \end{cases}$$

Likewise, the partial likelihood is written as a product of terms for each individual, as opposed to each failure time.

$$\log L(\boldsymbol{\beta}) = \log \prod_{i=1}^{n} \left\{ \frac{\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_i)} \exp(\boldsymbol{\beta}'\mathbf{x}_k)} \right\}^{d_i} = \sum_{i=1}^{n} d_i \log \frac{\exp(\boldsymbol{\beta}'\mathbf{x}_i)}{\sum_{k \in R(t_i)} \exp(\boldsymbol{\beta}'\mathbf{x}_k)}$$

$$= \sum_{i=1}^{n} d_i \log(\exp(\boldsymbol{\beta}'\mathbf{x}_i)) - \sum_{i=1}^{n} d_i \log \sum_{k \in R(t_i)} \exp(\boldsymbol{\beta}'\mathbf{x}_k)$$

$$= \sum_{i=1}^{n} d_i \boldsymbol{\beta}'\mathbf{x}_i - \sum_{i=1}^{n} d_i \log \sum_{k \in R(t_i)} \exp(\boldsymbol{\beta}'\mathbf{x}_k) \qquad \text{(e1)}$$

The partial likelihood is valid when no two subjects have the same event time. Variation in the hazard rate is attributed to dependence on risk variables or on frailty terms, hence the frailty is a random component. The computation of the partial likelihood can be followed in detail in the example below.

3.1.1 Example for computing Partial likelihood

A simple example of 6 patients in two groups, treatment (T) and control (C), is shown below. At time 0, all 6 patients are at risk of experiencing an event; this group of patients is the initial risk set $R_1$.

    Patient   Survtime   Censor   Group
    1          6         1        C
    2          7         0        C
    3         10         1        T
    4         15         1        C
    5         19         0        T
    6         25         1        T

1. Before the first failure at time $t = 6$, all 6 patients are at risk and any one of them could experience the event.

2. By group, with $x = 0$ for control and $x = 1$ for treatment, $\exp(x_1\beta) = \exp(x_2\beta) = \exp(x_4\beta) = 1$ and $\exp(x_3\beta) = \exp(x_5\beta) = \exp(x_6\beta) = \exp(\beta)$.

3. Substituting into $p_1 = \dfrac{h_0(t_1)\exp(x_i\beta)}{\sum_{k \in R_1} h_0(t_1)\exp(x_k\beta)}$, where $h_0(t_1)$ is the hazard for a subject from the control group, yields $p_1 = \dfrac{1 \cdot h_0(t_1)}{3h_0(t_1)\exp(\beta) + 3h_0(t_1)} = \dfrac{1}{3\exp(\beta) + 3}$.

4. At time 7, a control patient drops out, so at $t = 10$, $p_2 = \dfrac{\exp(\beta)}{3\exp(\beta) + 1}$. At time $t = 15$, three patients are at risk, giving $p_3 = \dfrac{1}{2\exp(\beta) + 1}$.

5. At the last event, $t = 25$, only one subject is at risk, so $p_4 = \exp(\beta)/\exp(\beta) = 1$, and the partial likelihood is the product of all terms, $L(\beta) = \dfrac{\exp(\beta)}{(3\exp(\beta) + 3)(3\exp(\beta) + 1)(2\exp(\beta) + 1)}$.

6. Taking the log transformation, $l(\beta) = \beta - \log(3\exp(\beta) + 3) - \log(3\exp(\beta) + 1) - \log(2\exp(\beta) + 1)$.

The example can be put into R code to estimate $\beta$ easily. Note that the estimate of $\beta$ is obtained by maximizing the partial likelihood.

    > plsimple <- function(beta) {
    +   psi <- exp(beta)
    +   result <- log(psi) - log(3*psi + 3) - log(3*psi + 1) - log(2*psi + 1)
    +   result }
    > result <- optim(par=0, fn = plsimple, method = "L-BFGS-B",
    +                 control=list(fnscale = -1), lower = -3, upper = 1)
    > result$par
    [1] -1.326129

The maximum partial likelihood estimate is $\hat{\beta} = -1.326$, which can also be read from the plot of $l(\beta)$ versus $\beta$.

[Figure: The optimized maximum value achieved by the maximum partial likelihood estimate by Newton-Raphson algorithm]

The solid curved black line is a plot of the log partial likelihood over a range of values of $\beta$. The maximum is indicated by the vertical dashed blue line, and the value of the log partial likelihood at that point is -3.672. The value of the log partial likelihood at the null hypothesis value, $\beta = 0$, is -4.277. The tangent to the $l(\beta)$ curve at $\beta = 0$ is shown by the straight red line. Its slope is the derivative of the log-likelihood evaluated at $\beta = 0$.

3.1.2 Penalized Partial Likelihood (PPL)

After taking logs, the random effects are treated as a penalty term, as in a GLM fitted by Best Linear Unbiased Prediction (BLUP). Previously, we wrote the partial log-likelihood as two terms in the unknown $\boldsymbol{\beta}$. This extends to the joint likelihood of the parameters $\theta$, $\boldsymbol{\beta}$ and $\mathbf{w}$, again in two separate parts as in equation (e1). The full likelihood is expressed as

$$l_{full}(h_0(\cdot), \theta, \boldsymbol{\beta}) = \log f(\mathbf{x}, \mathbf{u} \mid h_0(\cdot), \theta, \boldsymbol{\beta}) = \log f(\mathbf{x} \mid h_0(\cdot), \boldsymbol{\beta}, \mathbf{u}) + \log f(\mathbf{u} \mid \theta) = l_{full,1}(h_0(\cdot), \boldsymbol{\beta}) + l_{full,2}(\theta).$$
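For the normally distributed log-frailties assumed in the paper, the second term of the full likelihood can be written out explicitly. The block below is only a sketch under simplifying assumptions made here and not stated in the report: the $q$ cluster effects $u_1, \ldots, u_q$ are taken to be independent with $u_i \sim N(0, \theta)$, and $\theta$ plays the role of $\sigma^2$.

```latex
\begin{aligned}
l_{full,2}(\theta) = \log f(\mathbf{u} \mid \theta)
  &= -\frac{1}{2}\sum_{i=1}^{q}\left[\log(2\pi\theta) + \frac{u_i^2}{\theta}\right], \\
\frac{\partial l_{full,2}}{\partial \mathbf{u}} &= -\theta^{-1}\mathbf{u}, \qquad
-\frac{\partial^2 l_{full,2}}{\partial \mathbf{u}\,\partial \mathbf{u}'} = \theta^{-1}\mathbf{I}.
\end{aligned}
```

Under these assumptions the frailty term acts as a quadratic penalty on large $|u_i|$, and its first and second derivatives are where terms of the form $\sigma^{-2}\mathbf{w}_0$ and $\sigma^{-2}\mathbf{I}$ in the Newton-Raphson update of Section 3.2.1 come from.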
Maximization in PPL is a doubly iterative process that alternates between an inner loop ($l_{part}$) and an outer loop ($l_{penalize}$) until convergence. The penalty term penalizes random effects that lie far from their mean value 0 by reducing the penalized partial likelihood. If $\log w_i$ is $N(0, \sigma^2 D)$, where $D$ is a known matrix, then BLUP consists of maximizing a sum of two log-likelihoods:

$$l_{PPL}(\theta, \boldsymbol{\beta}, \mathbf{w}) = l_{part}(\boldsymbol{\beta}, \mathbf{w}) - l_{penalize}(\theta, \mathbf{w})$$

Here $l_{part}(\boldsymbol{\beta}, \mathbf{w})$ is the conditional likelihood for the data given the frailties, and $l_{penalize}(\theta, \mathbf{w})$ corresponds to the distribution of the frailties. The sum is termed a penalized likelihood function in the sense that $l_{penalize}$ is a penalty function for the conditional log-likelihood $l_{part}$. This procedure is also specified in the paper 'Estimation in generalized mixed models'. When a Cox model with shared random frailty terms is fitted, the median hazard ratio can be used as a measure of the magnitude of the effect of clustering on the hazard of the outcome. For $l_{part}$, Newton-Raphson uses local quadratic approximations of the penalty term. The procedure iterates to estimate $\boldsymbol{\beta}$ and $w_i$, using the derivative of the likelihood and the matrix $\mathbf{V}$ (see footnote 1).

Footnote 1: If $\mathbf{V}$ is replaced by $E(\mathbf{V})$, then the iterative procedure becomes the method of scoring.

3.2 Newton-Raphson Method

The Newton-Raphson algorithm originates from the Taylor series of $f$ about some point $x_k$,

$$f(x) \approx f(x_k) + (x - x_k)f'(x_k) + \frac{1}{2!}(x - x_k)^2 f''(x_k) + \cdots + \frac{1}{n!}(x - x_k)^n f^{(n)}(x_k),$$

and is a procedure for solving non-linear equations. The Newton-Raphson method keeps the first two terms of the expansion, $f(x) = f(x_k) + (x - x_k)f'(x_k)$, and assumes that $x = x_{k+1}$ is the solution of the equation $f(x) = 0$, so that $0 = f(x_k) + (x_{k+1} - x_k)f'(x_k)$, which is then rearranged for $x_{k+1}$. In this basic form, the Newton-Raphson method solves a non-linear equation in a single variable. We approximate the roots of $f(\beta) = 0$.

1. Start with an initial value $\beta^{(0)}$ of $\beta$.

2. Take the first-order linear approximation of $f$ at $\beta^{(0)} + h$: $f(\beta^{(0)} + h) \approx f(\beta^{(0)}) + h f'(\beta^{(0)})$.

3. Solve for the updated solution $\beta^{(1)} = \beta^{(0)} + h$ of $f(\beta) = 0$: setting $f(\beta^{(1)}) = 0$ gives $h = -\{f'(\beta^{(0)})\}^{-1} f(\beta^{(0)})$ and thus $\beta^{(1)} = \beta^{(0)} - \{f'(\beta^{(0)})\}^{-1} f(\beta^{(0)})$.

4. Iterate until the process converges, $\beta^{(k+1)} \approx \beta^{(k)}$.

A GLM (Poisson, logistic) uses this method of iteration for estimating its coefficients. A distribution of the frailties is obtained when the hazard depends on frailty terms. The Newton-Raphson procedure converges if sufficient variation of the measured risk variables exists within each patient.

3.2.1 Newton-Raphson algorithm - example

The goal is to produce successively better approximations to the roots of a real-valued function:

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}, \quad \ldots, \quad x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$

For example, take $f(x) = x^2 - a$ with $a = 612$ and $f'(x) = 2x$. The initial guess is $x_0 = 10$, and the iteration continues until the difference between successive values is small.

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 10 - \frac{10^2 - 612}{2 \times 10} = 35.6$$

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 35.6 - \frac{35.6^2 - 612}{2 \times 35.6} = 26.395$$

$$x_3 = \cdots = 24.790, \qquad x_4 = \cdots = 24.7387, \qquad x_5 = \cdots = 24.738633753$$

The same idea can be applied to a system of non-linear equations to solve for $\boldsymbol{\beta}$. Using the Jacobian matrix $J$ of the system, the update is

$$\boldsymbol{\beta}_k = \boldsymbol{\beta}_{k-1} - J(\boldsymbol{\beta}_{k-1})^{-1}\mathbf{V}(\boldsymbol{\beta}_{k-1}),$$

where, in this formula only, $\mathbf{V}(\boldsymbol{\beta})$ denotes the vector of equations to be solved.

With initial estimates $\boldsymbol{\beta}_0$, $\mathbf{w}_0$, the goal is to estimate $\boldsymbol{\beta}$ iteratively with the PPL. The log-likelihood is approximately quadratic in the region of the true values. Previously, we found that maximizing $l_{part} - l_{penalize}$ gives the estimators, where the joint log-likelihood is $l_{PPL}$:

$$\begin{pmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\mathbf{w}} \end{pmatrix} = \begin{pmatrix} \boldsymbol{\beta}_0 \\ \mathbf{w}_0 \end{pmatrix} + \mathbf{V}^{-1}\begin{pmatrix} \partial l_{part}/\partial\boldsymbol{\beta} \\ \partial l_{part}/\partial\mathbf{w} \end{pmatrix} - \mathbf{V}^{-1}\begin{pmatrix} \mathbf{0} \\ \sigma^{-2}\mathbf{w}_0 \end{pmatrix},$$

$$\mathbf{V} = \begin{pmatrix} -\partial^2 l_{part}/\partial\boldsymbol{\beta}\,\partial\boldsymbol{\beta}' & -\partial^2 l_{part}/\partial\boldsymbol{\beta}\,\partial\mathbf{w}' \\ -\partial^2 l_{part}/\partial\mathbf{w}\,\partial\boldsymbol{\beta}' & -\partial^2 l_{part}/\partial\mathbf{w}\,\partial\mathbf{w}' + \sigma^{-2}\mathbf{I} \end{pmatrix}.$$

The variance matrix of the log-frailties is taken to be $\sigma^2(\mathbf{I} - M^{-1}\mathbf{1}\mathbf{1}')$, which is annihilated by $\mathbf{1}'$. So $\hat{\boldsymbol{\beta}}$, $\hat{\mathbf{w}}$ have approximately a joint normal distribution with mean $\boldsymbol{\beta}$, $\mathbf{w}$ and variance matrix $\mathbf{V}^{-1}$.

Chapter 4 Simulation for data

4.1 Simulation study: Infection in Kidney patients

The model is fitted to the original dataset of 76 observations with the R package 'survival'; the fit is shown below.

    > data(kidney)
    > kfitm1 <- coxph(Surv(time, status) ~ age + sex + disease +
    +                 frailty(id, dist = "gauss"), data = kidney)
    > kfitm1
    Call:
    coxph(formula = Surv(time, status) ~ age + sex + disease +
        frailty(id, dist = "gauss"), data = kidney)

                              coef     se(coef) se2     Chisq    DF   p
    age                        0.00489 0.01497  0.01059  0.10678  1.0 0.74384
    sex                       -1.69728 0.46101  0.36170 13.55454  1.0 0.00023
    diseaseGN                  0.17986 0.54485  0.39273  0.10897  1.0 0.74131
    diseaseAN                  0.39294 0.54482  0.39816  0.52016  1.0 0.47077
    diseasePKD                -1.13631 0.82519  0.61728  1.89621  1.0 0.16850
    frailty(id, dist = "gauss                           17.89195 12.1 0.12376

    Iterations: 7 outer, 42 Newton-Raphson
         Variance of random effect= 0.493
    Degrees of freedom for terms=  0.5  0.6  1.7 12.1
    Likelihood ratio test=47.5  on 14.9 df, p=3e-05
    n= 76, number of events= 58

The only regression coefficient that is significantly large compared to its standard error is that of the sex variable, -1.697, indicating a lower infection rate for female patients. The estimate of $\sigma^2$ is quoted as 0.3821, while the output above reports a random-effect variance of 0.493. In general, the effect of the prior distribution on the frailty terms is to shrink the estimates toward the origin, which biases the estimates.
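The output above reports "42 Newton-Raphson" iterations. As a rough illustration of what one such iteration does, the sketch below applies the scalar update $\beta^{(k+1)} = \beta^{(k)} - l'(\beta^{(k)})/l''(\beta^{(k)})$ of Section 3.2 to the six-patient log partial likelihood of Section 3.1.1. The function names `score`, `hessian` and `newton_raphson`, the starting value 0 and the tolerance are illustrative assumptions; this is not the algorithm as implemented inside `coxph`.

```r
# Log partial likelihood of the six-patient example (psi = exp(beta))
plsimple <- function(beta) {
  psi <- exp(beta)
  log(psi) - log(3 * psi + 3) - log(3 * psi + 1) - log(2 * psi + 1)
}

# First derivative (score) of plsimple with respect to beta
score <- function(beta) {
  psi <- exp(beta)
  1 - 3 * psi / (3 * psi + 3) - 3 * psi / (3 * psi + 1) - 2 * psi / (2 * psi + 1)
}

# Second derivative of plsimple with respect to beta
hessian <- function(beta) {
  psi <- exp(beta)
  -(9 * psi / (3 * psi + 3)^2 + 3 * psi / (3 * psi + 1)^2 + 2 * psi / (2 * psi + 1)^2)
}

# Newton-Raphson: solve score(beta) = 0, starting from beta0
newton_raphson <- function(beta0 = 0, tol = 1e-8, max_iter = 50) {
  beta <- beta0
  for (k in seq_len(max_iter)) {
    step <- score(beta) / hessian(beta)
    beta <- beta - step
    if (abs(step) < tol) break   # stop once the update is negligible
  }
  list(par = beta, iterations = k, value = plsimple(beta))
}

fit <- newton_raphson()
print(fit)
```

Under these assumptions the iteration settles near $\hat{\beta} \approx -1.326$ with a log partial likelihood of about $-3.672$, in line with the values quoted in Section 3.1.1. Using the analytic second derivative rather than a numerical one keeps each step an exact instance of the update formula.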
    > kfit <- coxph(Surv(time, status) ~ age + sex + disease + frailty(id), kidney)
    > kfit
    Iterations: 6 outer, 35 Newton-Raphson
         Variance of random effect= 5e-07   I-likelihood = -179.1
    Degrees of freedom for terms= 1 1 3 0
    Likelihood ratio test=17.6  on 5 df, p=0.003
    n= 76, number of events= 58
    > round(kfit$coefficients, 3)
           age        sex  diseaseGN  diseaseAN diseasePKD
         0.003     -1.483      0.088      0.351     -1.431

Compared with the previous code, here the distribution of the frailty term is left unspecified, so the default of the 'survival' package (a gamma frailty) is used, and the Newton-Raphson algorithm needs only 35 iterations to find the approximate value of $\boldsymbol{\beta}$.

4.2 Simulation study: testing the model fit

We also test the proportional hazards assumption for the Cox regression model fit. Note that the model being tested was fitted with 'coxph'.

    > cox.zph(kfit)
            chisq df     p
    age     0.105  1 0.746
    sex     5.953  1 0.015
    disease 1.985  3 0.576
    GLOBAL  7.869  5 0.164

Figure 4.1: A graph of coefficient vs. time

The plot gives an estimate of the time-dependent coefficient $\boldsymbol{\beta}(t)$. If the proportional hazards assumption holds, then the true $\boldsymbol{\beta}(t)$ function is a horizontal line with slope 0.

In addition, the linearity of the regression model can be checked with a plot of martingale residuals. Martingale residuals are the discrepancy between the observed value of a subject's failure indicator and its expected value, integrated over the time for which that patient was at risk. The martingale residuals are plotted against covariates to detect non-linearity. Plots of martingale residuals and partial residuals are examined against two covariates, age and sex. Smooths are produced by local linear regression (using the lowess function). No non-linearity is observed.

Figure 4.2: A graph of residuals

Comparing the magnitudes of the largest values to the regression coefficients suggests that 1 observation is individually influential.
Figure 4.3: A graph of coefficient vs. time

One of the males (id 21) is a large outlier, with much longer survival than his peers. If this observation is removed, then no evidence remains for a random subject effect.

4.3 Extensions

The predicted survival profiles for patients 5 and 12 are modeled.

[Figure: Predicted survival curves for patients 5 and 12 using the penalization]

Survival analysis is not restricted to modeling the survival time of patients; it can also model the survival time of small and medium-sized corporations in the market. This use of the model is currently being studied by data scientists in South Korea as an extension of time-to-event data analysis. By applying the Cox model in the financial market, we may look forward to modeling corporations in ways that support various improvements.

4.3.1 Generalized gamma frailty model

The paper presents a frailty model that uses the generalized gamma distribution as the frailty distribution, with the lognormal and Weibull frailty models as special cases. Written by Dr. N. Balakrishnan and Y. Peng, the paper uses the BLUP method to build a new frailty model with a generalized gamma distribution, which has more parameters and is therefore less restrictive and more flexible. Instead of the EM algorithm, the Newton-Raphson algorithm is applied to obtain the MLE of the parameters. The use of the generalized gamma distribution as the frailty distribution substantially improves the goodness-of-fit of the frailty model. The model is particularly useful in reducing errors in frailty variance estimation. Also, the performance of the likelihood ratio test depends on the cluster size.
Chapter 5 Appendix: R codes

    library(survival); library(survminer); library(lubridate); library(penalized)

    # Models from Chapter 4 (the kidney data ship with the survival package)
    kfitm1 <- coxph(Surv(time, status) ~ age + sex + disease +
                    frailty(id, dist = "gauss"), data = kidney)
    kfit <- coxph(Surv(time, status) ~ age + sex + disease + frailty(id),
                  data = kidney)

    # MPLE for the six-patient example
    plsimple <- function(beta) {
      psi <- exp(beta)
      result <- log(psi) - log(3*psi + 3) - log(3*psi + 1) - log(2*psi + 1)
      result
    }
    result <- optim(par = 0, fn = plsimple, method = "L-BFGS-B",
                    control = list(fnscale = -1), lower = -3, upper = 1)
    result$par

    # survival analysis, plot
    ggsurvplot(survfit(kfit), data = kidney)

    # used in report/presentation
    ggsurvplot(survfit(kfit), pval = F, conf.int = TRUE,
               risk.table = TRUE,            # Add risk table
               risk.table.col = "strata",    # Change risk table color by groups
               linetype = "strata",          # Change line type by groups
               surv.median.line = "hv",      # Specify median survival
               fun = "pct", data = kidney, legend = "none",
               ggtheme = theme_bw())

    # hazard ratio, confidence interval
    ggforest(kfitm1, data = kidney, fontsize = 1.25)

    # model diagnostics for events
    plot(cox.zph(kfit))
    plot(survfit(kfitm1)[1], lty = 2, lwd = 2, fun = "event")

    # model validation: proportional hazards test
    temp <- cox.zph(kfit)
    print(temp)   # display the results
    plot(temp)    # plot curves

    # model validation: martingale and partial residuals
    par(mfrow = c(2, 2))
    res <- residuals(kfit, type = "martingale")
    X <- as.matrix(kidney[, c("age", "sex")])   # matrix of covariates
    for (j in 1:2) {   # residual plots
      plot(X[, j], res, xlab = c("age", "sex")[j], ylab = "residuals")
      abline(h = 0, lty = 2)
      lines(lowess(X[, j], res, iter = 0))
    }
    b <- coef(kfit)[c(1, 2)]   # regression coefficients
    for (j in 1:2) {   # partial-residual plots
      plot(X[, j], b[j]*X[, j] + res, xlab = c("age", "sex")[j],
           ylab = "component+residual")
      abline(lm(b[j]*X[, j] + res ~ X[, j]), lty = 2)
      lines(lowess(X[, j], b[j]*X[, j] + res, iter = 0))
    }

    # influential point
    dfbeta <- residuals(kfit, type = "dfbeta")
    par(mfrow = c(1, 3))
    for (j in 1:3) {
      plot(dfbeta[, j], ylab = names(coef(kfit))[j])
      abline(h = 0, lty = 2)
    }

    # prediction
    attach(kidney)
    # penalization (columns 4:5 of kidney are age and sex)
    set.seed(34)   # set before the cross-validation so that the folds are reproducible
    hepato.opt <- optL1(Surv(time, status), penalized = as.data.frame(kidney[, 4:5]),
                        standardize = T, fold = 10)
    hepato.prof <- profL1(Surv(time, status), penalized = kidney[, 4:5],
                          standardize = T, fold = 10)
    hepato.pen <- penalized(Surv(time, status), penalized = kidney[, 4:5],
                            standardize = T, lambda1 = hepato.opt$lambda)
    round(coef(hepato.pen, standardize = T), 3)
    hepato.predict.5 <- predict(hepato.pen, kidney[5, 4:5])
    hepato.predict.12 <- predict(hepato.pen, kidney[12, 4:5])

    par(mfrow = c(1, 1))
    plot(stepfun(hepato.predict.5@time[-1], hepato.predict.5@curves),
         do.points = F, col = "blue", lwd = 2, ylim = c(0, 1),
         xlab = "Time in months", ylab = "Predicted survival probability")
    plot(stepfun(hepato.predict.12@time[-1], hepato.predict.12@curves),
         do.points = F, add = T, col = "red")
    legend("bottomleft", legend = c("Patient 5", "Patient 12"),
           pch = 1, col = c("blue", "red"))

Bibliography

[1] McGilchrist, C. A., & Aisbett, C. W. (1991). Regression with frailty in survival analysis. Biometrics, 461-466. https://www.jstor.org/stable/2532138

[2] Balakrishnan, N., & Peng, Y. (2006). Generalized gamma frailty model. Statistics in Medicine, 25(16), 2797-2816. https://pubmed.ncbi.nlm.nih.gov/16220516/

[3] Therneau, T. M., et al. R package 'survival'. https://cran.r-project.org/web/packages/survival/survival.pdf

[4] Moore, D. F. (2016). Applied survival analysis using R. New York, NY: Springer. https://link.springer.com/book/10.1007/978-3-319-31245-3

[5] McGilchrist, C. A. (1994). Estimation in generalized mixed models. Journal of the Royal Statistical Society: Series B (Methodological), 56(1), 61-69. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.2517-6161.1994.tb01959.x

[6] Wienke, A. (2010). Frailty models in survival analysis. CRC Press. https://www.routledge.com/Frailty-Models-in-Survival-Analysis/Wienke/p/book/9781420073881
