PHSSR IG CyberSeminar
Introductory Remarks

Bryan Dowd
Division of Health Policy and Management
School of Public Health
University of Minnesota

Causation versus Association: Who Cares?
• The purpose of public health systems and services research is to examine the impact of the organization, financing, and delivery of public health services at the local, state, and national levels on population health.
• By “impact,” I assume we mean the causal effect on population health of changing one of those factors.

Causation versus Association
• Linking “impact” and “causal effect” to change highlights a common distinction between causation and association.
• “Association” (the weaker term) often refers to relationships among variables whose observed values come from a single observation of each subject (cross-sectional data).
• “Causation” often refers to relationships among variables whose observed values come from multiple observations of each subject (time-series data).

Causation versus Association
But many analytic techniques are designed to draw causal inferences from cross-sectional data.
And the fact that two variables change their values over time is no guarantee that the change in one variable caused the values of the other variable to change.
Much of our empirical research attempts to distinguish causal relationships from spurious relationships.

But from a practical perspective …
In public health, there often comes a time when we must act: choosing a particular course of action or the status quo.
Examples:
1. Should we impose a quarantine?
2. Should we inspect restaurants once a month or once a decade?
3. Should we inoculate the population against a particular disease?

Association = status quo?
The practical and empirical question is, “Do the data support taking one specific course of action versus an alternative?”
In that context, saying the policy variable (that we control) and the outcome variable (that we are trying to influence) are merely associated, but not causally related, is equivalent to answering, “No. We should not take a particular course of action.”
So “association” often is synonymous with “stay the course” or “maintain the status quo.”

But that’s illogical!
“Staying the course” when the data do not support causal links between the policy variable and the outcome is illogical.
If we can’t establish a causal relationship between the policy variable and the outcome of interest, then we have no way of knowing whether “staying the course” will continue to be “associated with” the same value of the outcome variable that it is now.

The bottom line …
We may speak of “association” but we always act as though we have drawn valid causal inferences, even when we choose not to change anything.
So the most important question about causal inferences is not how to pretend we don’t draw them, but how to make them as reliable as possible so that we make good decisions.

The Basic Research Question
X → Y
What would happen to Y if we were to change X by one unit? (Sometimes we add, “… holding the effect of other variables constant.”)
X could be continuous (e.g., income) or binary (e.g., treatment versus control group).
This is sometimes called the “marginal effect” of X on Y, which we could denote β.

Challenges: Omitted variable bias
Enrollment in a smoking cessation program → Health outcome
Family health history → Enrollment in a smoking cessation program
Family health history → Health outcome
Omitted variable, omitted confounder, spurious correlation

Challenges: Mediating Variables
Education → Health outcome
Education → Income → Health outcome
Controlling for income in a regression of health outcomes on education means that β is the partial effect of education on health outcomes, not the full effect. Which one do you want to estimate?
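
The two challenges above (an omitted confounder and a mediating variable) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, every coefficient is hypothetical, enrollment is treated as a continuous propensity for simplicity, and the helper names (slope, residuals, partial_slope) are illustrative rather than taken from any library.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(z, w):
    """Part of w left over after regressing w on z (intercept included)."""
    n = len(z)
    mz, mw = sum(z) / n, sum(w) / n
    b = slope(z, w)
    return [(wi - mw) - b * (zi - mz) for zi, wi in zip(z, w)]

def partial_slope(x, y, control):
    """Coefficient on x when y is regressed on x and one control,
    computed by partialling the control out of both x and y."""
    return slope(residuals(control, x), residuals(control, y))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Omitted variable bias (hypothetical coefficients).
# Family health history raises enrollment and lowers the health outcome.
history = [rng.gauss(0, 1) for _ in range(n)]
enroll = [0.8 * h + rng.gauss(0, 1) for h in history]
health = [1.0 * e - 1.5 * h + rng.gauss(0, 1)
          for e, h in zip(enroll, history)]  # true beta = 1.0

naive_beta = slope(enroll, health)                      # history omitted
adjusted_beta = partial_slope(enroll, health, history)  # history controlled

# Mediating variable (hypothetical coefficients).
# Education affects health directly (0.3) and through income (0.5 * 0.4).
education = [rng.gauss(0, 1) for _ in range(n)]
income = [0.5 * s + rng.gauss(0, 1) for s in education]
outcome = [0.3 * s + 0.4 * i + rng.gauss(0, 1)
           for s, i in zip(education, income)]

full_effect = slope(education, outcome)                    # income left out
partial_effect = partial_slope(education, outcome, income) # income controlled

print("omitted confounder: naive", round(naive_beta, 2),
      "adjusted", round(adjusted_beta, 2))
print("mediator: full", round(full_effect, 2),
      "partial", round(partial_effect, 2))
```

Under these assumptions the naive estimate understates the true β of 1.0 because family health history is left out, while controlling for income moves the education coefficient from the full effect (about 0.5) toward the partial effect (about 0.3). Neither education estimate is wrong; they answer different questions.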

Challenges: Reverse Causality
1. Health Insurance → Health Status
2. Health Status → Health Insurance (reverse causality)
We hypothesize that having health insurance affects health status, but the relationship we observe could be due, at least in part, to the reverse effect (medical underwriting).

What Methodologies?
What methodologies can be used to assess causal relationships in HSR and PHSSR?
What forms of manipulation of the policy variables of interest can produce reliable causal inference?
This is the area of great disagreement. When and why did that disagreement occur?

A Brief History
[Timeline diagram, read along the time axis]
Regression & correlation
Multivariate regression & partial correlation
The “big split” in 1926 - 1928
Experimental data: Galton, Pearson, Fisher
  Randomized trials; propensity scores; DAGs; natural experiments
Observational data: Wright
  Structural equation modeling; panel data (Granger causality, etc.); instrumental variables; sample selection models

A Brief History
1926 - Ronald Fisher. Randomization. We (the analyst) must manipulate the policy variables. A research design solution.
1928 - Philip Wright. Instrumental variable estimation. Other types of manipulation can produce valid causal inference. A modeling solution.
Many social scientists still are reluctant to use the word “causal” to describe their causal models.

Today
Today we have a broad menu of methods to choose from, but residual resistance to using approaches other than randomized trials.
Some estimation approaches:
1. Randomization
2. Instrumental variables
3. Natural experiments
4. Sample selection models

Example
[Diagram]
Health department characteristics or programs (“interventions”), time t-1 → Population health outcomes, time t
Past health problems (unobserved)
Community risk factors (unobserved)

One Solution: Randomization
Randomization → Intervention → Outcome, with unobserved variables v affecting the intervention and u affecting the outcome
Randomly assign health departments to interventions.
Often not practical, ethical or cost-effective.
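
Why randomization works can be sketched with simulated health departments. Everything here is an assumption for illustration: the true effect, the strength of the unobserved community risk factors, and the rule by which departments self-select into the intervention are all hypothetical.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0        # hypothetical causal effect of the intervention
rng = random.Random(1)   # fixed seed so the sketch is reproducible
n = 20000

# Unobserved community risk factors (hypothetical): they worsen outcomes.
risk = [rng.gauss(0, 1) for _ in range(n)]

def health_outcome(treated, r):
    """Outcome depends on the intervention, unobserved risk, and noise."""
    return TRUE_EFFECT * treated - 3.0 * r + rng.gauss(0, 1)

def diff_in_means(d, y):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [yi for di, yi in zip(d, y) if di == 1]
    control = [yi for di, yi in zip(d, y) if di == 0]
    return mean(treated) - mean(control)

# Observational world: high-risk communities are more likely to adopt.
self_selected = [1 if r + rng.gauss(0, 1) > 0 else 0 for r in risk]
y_observed = [health_outcome(d, r) for d, r in zip(self_selected, risk)]
observational_estimate = diff_in_means(self_selected, y_observed)

# Randomized world: adoption is assigned by coin flip, independent of risk.
randomized = [rng.randint(0, 1) for _ in range(n)]
y_randomized = [health_outcome(d, r) for d, r in zip(randomized, risk)]
rct_estimate = diff_in_means(randomized, y_randomized)

print("observational:", round(observational_estimate, 2))
print("randomized:   ", round(rct_estimate, 2))
```

In this setup the observational comparison even gets the sign wrong, because adopters start out with higher unobserved risk, while the randomized comparison recovers a value close to the assumed effect.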

Another Solution: Instrumental Variables and Natural Experiments
External event → Intervention → Outcome, with unobserved variables v affecting the intervention and u affecting the outcome
Some event external (“exogenous”) to the health department (e.g., legislation, “encouragement”) that, like randomization, results in the intervention being adopted by some departments but not others, but has no direct effect on the outcome.

Another Solution: Sample Selection Models
External event → Intervention → Outcome, with correlation ρ between the unobserved variables v and u
Incorporate the correlation (ρ) of unobserved variables (v and u) into the estimation of the causal parameter β.
Same data requirements for all methods. Estimation approaches vary for different types of dependent variables.

Two Applications
The relationship between local public health spending and measures of public health outcomes.
The policy question: What is the effect of changing the level of local public health spending on public health outcomes?
Both authors recognize that local public health departments were not randomized to different levels of spending.
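
Because spending was not randomized, the instrumental variables logic from the earlier slides is one way to think about this question. The sketch below is not either author's method. It uses simulated data with hypothetical values for β, ρ, and the strength of the external event, and it relies on the fact that with a single instrument the IV estimate is the ratio cov(event, outcome) / cov(event, spending).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

BETA = 0.5   # hypothetical causal effect of spending on the outcome
RHO = -0.6   # hypothetical correlation between unobservables v and u
rng = random.Random(2)  # fixed seed so the sketch is reproducible
n = 20000

# External event (e.g., legislation): affects spending, not the outcome directly.
event = [rng.randint(0, 1) for _ in range(n)]

# Unobserved variables: v enters the spending equation, u the outcome
# equation, and they are correlated (this is what biases OLS).
v = [rng.gauss(0, 1) for _ in range(n)]
u = [RHO * vi + (1 - RHO ** 2) ** 0.5 * rng.gauss(0, 1) for vi in v]

spending = [1.0 * z + vi for z, vi in zip(event, v)]
outcome = [BETA * x + ui for x, ui in zip(spending, u)]

# Ordinary least squares slope: biased because spending is correlated with u.
ols_beta = cov(spending, outcome) / cov(spending, spending)

# Instrumental variables estimate with a single instrument.
iv_beta = cov(event, outcome) / cov(event, spending)

print("OLS:", round(ols_beta, 2))
print("IV: ", round(iv_beta, 2))
```

With these assumed values, OLS is pulled well below β by the negative correlation between v and u, while the IV estimate stays close to β. The result depends entirely on the external event having no direct effect on the outcome, which cannot be checked from the data alone.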