“Assessing the Total Effect of Time-Varying Predictors in Prevention Research”

Bethany Cara Bray
Department of Human Development and Family Studies
The Methodology Center
Pennsylvania State University
University Park, PA 16803
bcbray@psu.edu
Phone: 814.865.1225

Rick S. Zimmerman
Department of Communications
University of Kentucky

Donald Lynam
Department of Psychology
University of Kentucky

Susan Murphy
Department of Statistics
The Institute for Social Research
University of Michigan

Preparation of this article and presentation was supported by Grant # P50 DA10075 from the National Institute on Drug Abuse to the Methodology Center at Pennsylvania State University and by the National Institute on Drug Abuse award #1 K02 DA15674-01.

For more information or copies of the manuscript please contact Bethany Cara Bray.

Assessing the Total Effect of Time-Varying Predictors in Prevention Research 4.7.03

Definitions

Time-varying: A variable (e.g., peer pressure resistance) whose value can change over time.

Non-time-varying: A variable (e.g., sex) whose value does not change over time.

Confounder: A variable that is correlated with both the predictor and the response. Confounders are alternate explanations of the observed relationship between the predictor and response. For instance, if the response is marijuana initiation and the predictor is conduct disorder initiation, peer pressure resistance is one confounder.

Compositional Differences: The unequal distribution of the confounder (e.g., peer pressure resistance) between the participants who initiate the predictor (e.g., conduct disorder) and those who do not. For instance, among participants who initiate conduct disorder, a higher percentage have low peer pressure resistance; among participants who do not initiate conduct disorder, a higher percentage have high peer pressure resistance.
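The definition of compositional differences can be checked directly by comparing the confounder's distribution across predictor groups. The sketch below is only an illustration: the ten participants and the helper `share_low` are hypothetical and are not data or code from the study.

```python
from collections import Counter

# Hypothetical toy data (an assumption for illustration only):
# each tuple is (initiated_conduct_disorder, peer_pressure_resistance_level).
participants = [
    (True, "low"), (True, "low"), (True, "low"), (True, "high"),
    (False, "high"), (False, "high"), (False, "high"), (False, "high"),
    (False, "low"), (False, "low"),
]

def share_low(rows, initiated):
    """Proportion with low peer pressure resistance within one predictor group."""
    levels = [level for cd, level in rows if cd == initiated]
    return Counter(levels)["low"] / len(levels)

low_among_initiators = share_low(participants, True)       # 3 of 4 initiators
low_among_noninitiators = share_low(participants, False)   # 2 of 6 non-initiators

# A gap between the two proportions is a compositional difference.
print(low_among_initiators, low_among_noninitiators)
```

In this toy data, low peer pressure resistance is more common among initiators than among non-initiators, which is the pattern the definition describes.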
Spurious Correlation: A false, accidental correlation between the predictor and response, created by including confounders in the response regression model. Spurious correlations make the relationship between the predictor and response appear different from what is actually true.

Total Effect: The entire effect of the predictor on the response through all direct and indirect influences. For example, in Figure 2 the total effect of the predictor (conduct disorder, Cd) on the response (marijuana initiation, Mj) is represented by all paths following the direction of the arrows from Cd1 and Cd2 to Mj2 and Mj3.

Response Regression Model: Regression model of the response (e.g., marijuana initiation) on the predictor (e.g., conduct disorder) and possibly other covariates (e.g., sex and race). The goal of this model is to estimate the total effect of the predictor on the response.

Themes

Problems with confounders: Because confounders are correlated with both the predictor and the response, they offer alternate explanations of the observed relationship between the predictor and response. When confounding is not controlled, the unequal distribution of levels of the confounders among levels of the predictor (called compositional differences) biases the estimated total effect of the predictor on the response. In that case, the estimated coefficient for the effect of the predictor on the response reflects the differences between the predictor groups in addition to the causal effect of the predictor on the response.
In other words, it is unclear whether the estimated effect of the predictor on the response represents the consequence that delayed predictor initiation has on the initiation of the response, merely reflects compositional differences in the confounder, or reflects a combination of the two.

Standard Model: The standard model, which includes confounders as covariates in the response regression model, attempts to do two things simultaneously. The first is to control for confounding. The second is to estimate direct effects. We should worry when one model is used to do two different things. Here we focus on the problems that arise when the standard model is used to control for confounding while estimating the total effect of a predictor on a response, and the confounders are themselves affected by the predictor, as often happens when the predictor and confounders are time-varying.

Spurious correlations: Spurious correlations arise, for example, when confounders are included as covariates in the response regression model. Doing so opens a pathway between the predictor at time 1 and the response at time 3. This pathway is a spurious correlation that makes the relationship between the predictor and response (the estimated effect) appear different from what is actually true. When spurious correlations bias the estimated effect, false conclusions may be drawn about the consequences that the timing of the predictor has on the timing of the response, leading to inaccurate treatment and intervention decisions.

Why weighting works: Weighting attempts to do what randomization does: equalize the composition of the confounders across levels of the predictor. This makes the groups of people in the different predictor levels comparable.
By removing the compositional differences between the predictor levels, weighting controls the confounders and eliminates the correlation between the confounder and the predictor. Thus, the confounder does not need to be controlled by including it as a covariate in the response regression model, which eliminates the spurious correlation. In other words, weighting closes the path of the spurious correlation because the final response regression model does not condition on the confounder, while confounding is still controlled because the composition of initiators and non-initiators of the predictor has been equalized. Hence, the estimates from the final response regression model are unbiased.

Notation

Predictor: Conduct Disorder Initiation, Cd
Response: Marijuana Initiation, Mj
Confounder: Peer Pressure Resistance, Ppress
Unmeasured Confounder: Parent-child Relationship Quality, U

Selected References

Barber, J. S., Murphy, S. A., & Verbitsky, N. (2002). Adjusting for time-varying confounding in survival analysis. Manuscript submitted for publication.
Clayton, R. R., Cattarello, A. M., & Johnstone, B. M. (1996). The effectiveness of drug abuse resistance education (project DARE): 5-year follow-up results. Preventive Medicine, 25, 307-318.
Hernán, M., Brumback, B., & Robins, J. M. (2000). Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology, 11(5), 561-570.
Pearl, J. (1998). Graphs, causality, and structural equation models. Sociological Methods and Research, 27, 226-284.
Robins, J. M. (1986). A new approach to causal inference in mortality studies with sustained exposure periods – application to control of the healthy worker survivor effect. Mathematical Modeling, 7, 1393-1512.
Robins, J. M. (1989).
The analysis of randomized and nonrandomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In L. Sechrest, H. Freeman, & A. Mulley (Eds.), Health Service Research Methodology: A Focus on AIDS (pp. 113-159). Washington, DC: NCHSR, U.S. Public Health Service.
Robins, J. M. (1998). Marginal structural models. 1997 proceedings of the American Statistical Association, section on Bayesian statistical science (pp. 1-10). Retrieved from: http://www.biostat.harvard.edu/~robins/research.html.
Robins, J. M., & Greenland, S. (1994). Adjusting for differential rates of PCP prophylaxis in high- versus low-dose AZT treatment arms in an AIDS randomized trial. Journal of the American Statistical Association, 89, 737-749.
Robins, J. M., Hernán, M., & Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology, 11(5), 550-560.

Figure 1. Illustration of a spurious correlation between predictors and response in the sprinkler example

[Path diagram not reproduced. Nodes: U1 = Raining (time 1); U2 = Raining (time 2); Conf1 = Front Yard Grass (time 1); Predict1 = Front Yard Sprinkler (time 1); Resp2 = Back Yard Grass (time 2); Conf2 = Front Yard Grass (time 2); Predict2 = Front Yard Sprinkler (time 2); Resp3 = Back Yard Grass (time 3). Paths are labeled a, b, and c.]

Legend: Predict = Predictor; Conf = Confounder; Resp = Response; U = Unmeasured Predictor

Sprinkler example follows examples often used by Pearl.

Figure 2.
Some relationships among conduct disorder, peer pressure resistance, and marijuana

[Path diagram not reproduced. Nodes: U1 = Parent-Child Relationship Quality (time 1); U2 = Parent-Child Relationship Quality (time 2); Ppress1 = Peer Pressure Resistance (time 1); Cd1 = Conduct Disorder Initiation (time 1); Mj2 = Marijuana Initiation (time 2); Ppress2 = Peer Pressure Resistance (time 2); Cd2 = Conduct Disorder Initiation (time 2); Mj3 = Marijuana Initiation (time 3). Paths are labeled a, b, and c.]

Legend: Cd = Predictor; Ppress = Confounder; Mj = Response; U = Unmeasured Predictor

Notes: The arrows represent causal paths. Many arrows that would naturally be in this figure are omitted for simplicity. Time progresses from left to right. This is not an SEM diagram.
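The reasoning under "Why weighting works" can be illustrated with a small numerical sketch. It is a deliberate simplification and not the authors' analysis: it uses a single time point, a hypothetical table of counts, and weights equal to the inverse of the probability of each participant's observed predictor level given the confounder. In the time-varying setting described in this handout, the weights would be built from such terms at each time point. All names and numbers below are assumptions made for illustration.

```python
# Hypothetical cell counts (assumed for illustration): (ppress, cd, mj, count).
# Within each Ppress level, initiating Cd raises the Mj rate by 0.10.
cells = [
    ("low", 1, 1, 40), ("low", 1, 0, 40),
    ("low", 0, 1, 8), ("low", 0, 0, 12),
    ("high", 1, 1, 4), ("high", 1, 0, 16),
    ("high", 0, 1, 8), ("high", 0, 0, 72),
]

def prob_cd_given_ppress(ppress, cd):
    """Observed proportion with predictor level `cd` inside one confounder level."""
    stratum = sum(n for p, c, m, n in cells if p == ppress)
    match = sum(n for p, c, m, n in cells if p == ppress and c == cd)
    return match / stratum

def weight(ppress, cd, weighted):
    """Inverse-probability weight, or 1.0 for the unweighted comparison."""
    return 1.0 / prob_cd_given_ppress(ppress, cd) if weighted else 1.0

def mj_rate(cd, weighted):
    """Marijuana initiation rate in one predictor group, optionally weighted."""
    num = sum(weight(p, c, weighted) * n * m for p, c, m, n in cells if c == cd)
    den = sum(weight(p, c, weighted) * n for p, c, m, n in cells if c == cd)
    return num / den

def share_low_ppress(cd, weighted):
    """Share of low peer pressure resistance in one predictor group."""
    low = sum(weight(p, c, weighted) * n for p, c, m, n in cells
              if c == cd and p == "low")
    total = sum(weight(p, c, weighted) * n for p, c, m, n in cells if c == cd)
    return low / total

naive_diff = mj_rate(1, False) - mj_rate(0, False)
weighted_diff = mj_rate(1, True) - mj_rate(0, True)

print("share low Ppress, unweighted:",
      share_low_ppress(1, False), share_low_ppress(0, False))
print("share low Ppress, weighted:  ",
      share_low_ppress(1, True), share_low_ppress(0, True))
print("Mj rate difference, unweighted:", naive_diff)
print("Mj rate difference, weighted:  ", weighted_diff)
```

In this toy table the unweighted comparison mixes the effect of the predictor with the compositional difference in peer pressure resistance, while the weighted comparison first equalizes that composition across predictor groups and does not include the confounder as a covariate.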