This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.

Copyright 2008, The Johns Hopkins University and Francesca Dominici. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided “AS IS”; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.

How Risky is Breathing? Statistical Methods in Air Pollution Risk Estimation
Francesca Dominici
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University

From crisis to questions
• We began with crisis, the London fog in 1952, and have moved to questions:
  – Are there adverse effects of today’s air pollution?
  – How large are these risks?

Particulate levels – 3,000 µg/m³
This image has been deleted because JHSPH OpenCourseWare was not able to secure permission for its use.

December 5 1952: London's Piccadilly Circus at midday
[Image deleted]

Maureen Scholes, a nurse at the Royal London Hospital in 1952, says the smog penetrated through clothes, blackening undergarments
4,000 deaths the first week, 8,000 over the next 2 months
Source: Royal London Hospital Archives and Museum

Designer Smog Masks - London, 1950s
[Image deleted]
Davis, When Smoke Ran Like Water (2002)

Air pollution and mortality: Then and now
London, December, 1952
Mortality and PM10 in Chicago, 2000
[Image deleted]
"London "Killer" Smog of 1952" from Environmental Health. Available at: http://ocw.jhsph.edu. Copyright © Johns Hopkins Bloomberg School of Public Health. Creative Commons BY-NC-SA. Adapted from Turco, R. P.

Air pollution and health: Fundamental questions
• Is there a risk at current levels? Yes
• How can we estimate it? By integrating national data sets and developing methods to analyze them
• How big is the risk? The risk is very small, but everyone is exposed!
• What causes it? ???

Bad air day? Chicago PM2.5 = 10 µg/m³
[Image deleted]

Bad air day? Chicago PM2.5 = 20 µg/m³
[Image deleted]

Bad air day? Chicago PM2.5 = 30 µg/m³
[Image deleted]

Standard setting process in the US is evidence-based

National Data Sets

National Morbidity, Mortality, and Air Pollution Study (NMMAPS)
• Collected data for the 100 largest cities in the United States
  – Daily mortality
  – Daily temperature
  – Daily level of PM10
• Long time series
  – 1987 to 2000

The National Medicare Cohort Study, 1999-2005 (MCAPS)
• Medicare data include:
  – Billing claims for everyone over 65 enrolled in Medicare (~48 million people):
    • date of service
    • treatment, disease (ICD-9)
    • age, gender, and race
    • place of residence (zip code)
• Approximately 204 counties linked to the PM2.5 monitoring network

MCAPS study population: 204 counties with populations larger than 200,000 (11.5 million people)
[Image deleted]
Please visit www.biostat.jhsph.edu/MCAPS for maps and other MCAPS information

Daily time series of hospitalization rates and PM2.5 levels in Los Angeles county (1999-2005)
[Image deleted]
Please visit www.biostat.jhsph.edu/MCAPS for maps and other MCAPS information

Statistical Ideas

3 Statistical Ideas for Analysis of Observational Studies
1. Adjusting for confounding
   – Semi-parametric regression
2. Combining health risk estimates across counties
   – Bayesian hierarchical models
3. Accounting for the uncertainty in the selection of the statistical model
   – Model averaging for confounding adjustment

Statistical Methods for multi-site time series studies
• Compare day-to-day variations in hospital admission rates with day-to-day variations in pollution levels within the same community
• Avoid the problem of unmeasured differences among populations
• Key confounders: seasonal effects of infectious diseases and weather

Statistical Methods
Within city: Semi-parametric regressions for estimating associations between day-to-day variations in air pollution and mortality, controlling for confounding factors
Across cities: Hierarchical models for:
  – estimating the national-average relative rate
  – exploring heterogeneity of air pollution effects across the country
Dominici Samet Zeger JRSSA 2000

Confounding
• The association between air pollution and mortality is potentially confounded by:
  – Weather
  – Other pollutants
  – Seasonality
  – Long-term trend

1) Semi-parametric regression model for estimating health risk within a county

\[ \log E[Y_t^c] = \log N_t^c + \beta^c x_t^c + s(\mathrm{temp}) + s(\mathrm{time}) \]

where, for county $c$:
  – $Y_t^c$: # of adverse events on day $t$
  – $N_t^c$: # of people at risk on day $t$
  – $x_t^c$: air pollution series
  – $\beta^c$: health risk
  – $s(\mathrm{temp})$, $s(\mathrm{time})$: smooth functions for time-varying confounders
    • Weather variables
    • Seasonality
Kelsall Samet Zeger Xu AJE 1997

2) Bayesian hierarchical models for pooling risks across cities

\[ \hat{\beta}^c = \beta^c + \varepsilon^c, \qquad \varepsilon^c \sim N(0, v^c) \]
\[ \beta^c = \mu + \delta^c, \qquad \delta^c \sim N(0, \tau^2) \]

where:
  – $\hat{\beta}^c$: county-specific risk estimate
  – $\beta^c$: county-specific true risk
  – $\varepsilon^c$: within-county statistical error
  – $\mu$: pooled risk
  – $\tau^2$: across-county variance of the true risks

3) Do I have the “right” statistical model?
Explore the sensitivity of the risk estimates to the statistical model

Sensitivity of the national average lag effect of PM10 on mortality to different statistical models to adjust for confounding (NMMAPS 1987-2000)
[Figure: estimates plotted against different statistical models to adjust for confounding, grouped as weak, moderate, and strong; the reported estimate is marked.]
Peng Dominici Louis JRSSC 2006

3) Do I have the “right” statistical model?
[Diagram with nodes X, Z1, Z2, Y]
Z1 is a predictor of Y
Z2 is a confounder

Estimating risks by averaging across statistical models

Regression model                                    Weight based on prediction (BIC)    Weight based on ability to adjust for confounding
$y = \beta x + \gamma_1 z_1$                        0.9                                 0.0
$y = \beta x + \gamma_2 z_2$                        0.0                                 0.9
$y = \beta x + \gamma_1 z_1 + \gamma_2 z_2$         0.1                                 0.1

3) Model averaging for confounding adjustment in observational studies
• We assign zero weight to models that have optimal prediction properties but do not include all the potential confounders
• We identify all the potential confounders by searching for good predictors of the exposure X
• Theoretical results and simulation studies have shown that this approach outperforms existing methods to account for model uncertainty
Crainiceanu Dominici Parmigiani Biometrika 2007
Wang Crainiceanu Parmigiani Dominici technical report 2007

Biostatistics in Action: The weight of the evidence
[Image deleted]
Full-text available at http://content.nejm.org/cgi/content/abstract/343/24/1742

O3, November 17 2004, Mortality
[Image deleted]
Full-text available at http://jama.ama-assn.org/cgi/content/full/292/19/2372

PM2.5, March 8 2006, Hospital Admissions
[Image deleted]
Full-text available at http://jama.ama-assn.org/cgi/content/full/295/10/1127

The new challenge: Estimating the toxicity of the PM complex mixture

New Scientific Questions and Statistical Challenges
What are the mechanisms of PM toxicity?
• Size?
• Chemical components?
• Sources?

New Methods for estimating health effects of complex mixtures
[Diagram relating four groups of labels]
• Emission sources: biomass burning, vehicles, crustal
• Chemical constituents: K, Cl, EC, OC, SO4, NO3, Si, Ca, Al, Fe
• Size: PM2.5, PM10-2.5
• Total mass: PM10
Bell Dominici Ebisu Zeger Samet EHP 2007

% change in CVD hospitalization rate associated with a 10 µg/m³ increase in PM10-2.5, on average across 108 US counties (1999-2005)
[Figure: four panels (PM10-2.5 alone; PM2.5 alone; PM10-2.5 adjusted by PM2.5; PM2.5 adjusted by PM10-2.5). x-axis: lag 0, 1, 2. y-axis: % increase in admissions with a 10 µg/m³ increase in PM, from -1.5 to 2.5.]
Peng Bell Chang McDermott Zeger Samet Dominici tech report 2007

The policy impact

NAAQS: Science has had an Impact
• From US EPA NAAQS Criteria Document 1996: “Many of the time-series epidemiology studies looking for associations between O3 exposure and daily human mortality have been difficult to interpret because of methodological or statistical weaknesses, including the failure to account for other pollutants and environmental effects.”
• From US EPA Criteria Document 2006: “While uncertainties remain in some areas, it can be concluded that robust associations have been identified between various measures of daily O3 concentrations and increased risk of mortality.”

Assessing the Public Health Impact of the Air Quality Regulations

Reproducible research
• We want to reproduce previous findings
  – “Did you do what you said you did?”
• Test assumptions, robustness of findings; check methodology
  – “Is what you did any good?”
• Implement and test new methodology
  – “I can do it better!”
Peng Dominici Zeger AJE 2006

NMMAPSdata package for R
• R is a free software environment for statistical analysis and graphics
• The NMMAPSdata package contains the entire updated (1987-2000) NMMAPS database as an add-on module for R
• Supplemental code is available online for reproducing the canonical NMMAPS analysis and other analyses
• iHAPSS: Internet-based Health and Air Pollution Surveillance System
  – http://www.ihapss.jhsph.edu/
Peng Welty R News 2004
Zeger Peng McDermott Dominici Samet HEI 2006

A new book to appear this summer…
Roger Peng & Francesca Dominici
Environmental Epidemiology with R: A Case Study in Air Pollution and Health

Concluding Thoughts
[Diagram: Questions, Policy, Data; center label “Biostatistics in Action!”]
Methods can be used to address other questions beyond air pollution: analyses of observational studies
Evidence
• The weight of the evidence:
  – Has an explicit role in the Clean Air Act
• New NAAQS process
• New research underway, especially on PM components and sources: the cycle begins anew

Collaborators in the BSPH
• Michelle Bell
• Patrick Breysse
• Ciprian Crainiceanu
• Mary Fox
• Alyson Geyh
• Aidan McDermott
• Tom Louis
• Giovanni Parmigiani
• Roger Peng
• Jonathan Samet
• Ron White
• Scott Zeger

Medicare data users and collaborators in the BSPH and SOM
• Gerald Anderson
• Emily Smith
• Ben Brooke
• Lia Clattenburg
• Robert Herbert
• Peter Pronovost

PhD Students
• Howard Chang
• Sandy Eckel
• Sorina Eftim
• Jennifer Feder
• Haley Hedlin
• Yun Lu
• Chi Wang
• Yijie Zhou

Funding sources
• EPA: PM Research Center (Samet)
• NIEHS: Training grant in Environmental Biostatistics (Louis and Dominici)
• NIEHS R01: Statistical methods in Environmental Epidemiology (Dominici)
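
Supplementary sketch: pooling county-specific risk estimates

The second statistical idea in these slides combines county-specific risk estimates, each with its own within-county statistical error, into a pooled risk while allowing the true risks to vary across counties. The Python sketch below illustrates that pooling step only. It is not the Bayesian hierarchical model used in the cited papers: it swaps in the simpler DerSimonian-Laird method-of-moments random-effects estimator, which rests on the same two-level structure. The function name `pool_random_effects` and the numbers in the usage example are hypothetical and chosen for illustration; they are not NMMAPS or MCAPS results.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool county-specific risk estimates under a random-effects model.

    Uses the DerSimonian-Laird method-of-moments estimator (an assumption
    of this sketch, not the Bayesian fit described in the slides).

    estimates: county-specific risk estimates
    variances: within-county variances of those estimates
    Returns (pooled risk, standard error of pooled risk, across-county variance).
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching variances")
    k = len(estimates)

    # Inverse-variance (fixed-effect) weights and mean, used to measure heterogeneity.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * b for wi, b in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (b - fixed_mean) ** 2 for wi, b in zip(w, estimates))

    # Method-of-moments estimate of the across-county variance, truncated at zero.
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # Random-effects weights: total variance = within-county + across-county.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical county estimates (% increase in admissions per 10 µg/m³) and variances.
county_estimates = [0.8, 1.2, 0.3, 1.6, 0.9]
county_variances = [0.09, 0.16, 0.25, 0.36, 0.04]
pooled, se, tau2 = pool_random_effects(county_estimates, county_variances)
print(f"pooled risk = {pooled:.3f}, standard error = {se:.3f}, across-county variance = {tau2:.3f}")
```

When the estimated across-county variance is zero, the result reduces to the ordinary inverse-variance weighted average; a larger across-county variance pulls the weights toward equality, so imprecise counties count for relatively more than they would under a fixed-effect average.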