Longitudinal studies: Cornerstone for causal modeling of dynamic relationships

Illustrative examples from the Cebu Longitudinal Health and Nutrition Survey
• Prospective, community-based sample of the 1983-84 birth cohort; follows mothers and index infants from urban and rural areas of Metro Cebu, the Philippines
• Bi-monthly surveys from birth to 2 years; follow-up surveys in 1991, 1994, 1998, 2002, 2005
• Extensive individual, household and community data

Types of longitudinal studies
• Same individuals over time
  • Common age at enrolment (e.g. birth cohort)
    • Life course studies, individual trajectories
    • Challenging to separate age vs. time effects
    • E.g. diet changes over time because kids get older, or because there is a secular trend in dietary behaviors
  • Different ages at enrolment
• Panels/cross-sectional time series: different individuals over time, in common units (e.g. community, school, household)
  • Allow study of trends over time, but not individual trajectories
• Mixed: repeatedly study individuals, but with replacement
Each poses different challenges for data collection and analysis

Focus on cohort studies
…repeated measures of the same individuals over time allow for:
• Identification of sequence of events, providing basis for causal inference
• Comparison of inter- vs. intra-individual variation in susceptibility, behavior, health
  • Response to shock or intervention differs between individuals
  • Individual growth rates vary with age

Longitudinal Study Challenges
• Cost (time, $)
• Attrition
• Bias associated with repeated contacts with individuals
  • observer effects
  • sampling bias amplified by repetition of surveys
  • panel conditioning: changes in response to participation

Challenges of collecting longitudinal data
Research priorities and funding opportunities change over time: funding infrequently covers more than 5 years at a time.
Example: Cebu Longitudinal Health and Nutrition Survey

Survey year   Focus                                             Funder
1983-86       Infant feeding, growth, morbidity, mortality      NICHD, Ford Foundation
1991          Growth, school enrollment, IQ                     World Bank, Nestle Foundation
1994          Family planning and women’s lives                 USAID: Women’s Studies Project
1998          Adolescent health
2002          Effects of health on young adult human capital    NIH-Fogarty ISHED
2005          Add biomarkers of CVD risk factors                Mellon Foundation, NIH-Fogarty ISHED, Obesity roadmap funds

Methodological challenges of collecting longitudinal data
• Technology for data collection and storage changes over time
  • Face to face vs. “ACASI”

Measurement Issues
• Change in personnel collecting data
  • Interobserver reliability is harder to maintain and measure over time
• Change in how questions are asked
  • E.g. analysis reveals flawed question on round 1: do we change the question on round 2?
• Change in how questions are answered
  • Different social climate or respondent knowledge gained over time (perhaps by study participation) may affect veracity
  • Who responds? Child vs. mother? At what age does a child become the respondent?
• Change in meaning of indicators over time
  • E.g. wealth: TV vs. computer vs. car over time

Dilemmas and choices…
• Expanding the survey may increase respondent burden and compromise participation rates
  • But… failure to expand the survey represents missed opportunities
• Follow-up of all migrants is desirable
  • But… follow-up is costly and not always feasible
• Changing how a question is asked eliminates comparability over time
  • But… keeping a flawed question is bad science

Data collection challenges
• How often should participants be surveyed?
  • Frequent measurement allows sequence of events to be identified
    • Pregnancy>>>quit school>>>marriage
    • Quit school>>>marry>>>pregnancy
  • Respondent burden, “contamination” of sample

Analysis challenges
• Specialized techniques are needed to accommodate the strengths and weaknesses of longitudinal data
• Accounting for complexity
• Accounting for changing inputs across the lifecycle

Analysis challenges
• Accounting for differences in susceptibility
  • Example: parental investment may change based on acquired characteristics of the child
  • Example: developmental origins of adult disease: key premise is that prenatal factors alter response to subsequent exposures
• Intergenerational studies

Challenges: Selection bias related to attrition
• Loss to follow-up: death, migration, refusal
• May result in a sample which is markedly different from the baseline sample in measured and unmeasured attributes
• Biased estimates may be obtained if the relationships of interest are fundamentally different in those remaining vs.
lost, particularly when differences relate to unmeasured characteristics

Tools for handling selection bias
• Heckman-type models estimate likelihood of being in the sample simultaneously with outcome of interest
• Difficult to account for multiple reasons for attrition (with different potential for bias, e.g. death vs. migration)

Challenges: growth trajectories and functional forms
• Ideally…we would like models to accommodate
  • Non-linear “growth trajectories”
  • Differences in shape of trajectories at different ages, and in the relationship of exposures to outcomes at different ages

[Figure: panels plotting weight and diawk against agey (x-axis 0 to 1)]

Latent growth curves: A category of Structural Equation Models
• Random intercepts and random slopes allow each case to have a different trajectory over time
• Random coefficients incorporated into SEMs by considering them as latent variables
• Capitalize on SEM strengths, including:
  • ML methods for missing data
  • Estimation of different non-linear forms of trajectories, including piecewise to identify different curve segments
  • Measures of model fit and
  • Inclusion of latent covariates and repeated covariates
  • Latent variables derived from multiple measured variables
  • Account for bi-directional relationships

Data demands for econometric models
• Detailed, time-varying, high-quality exogenous variables
• Often this means community-level variables, so data collection cannot be limited to individual- or household-level information

What’s on the frontier for new longitudinal methods?
• “…new data, methodologies, and tools from both inside and outside the social sciences are demonstrating real promise in advancing these sciences from descriptive to predictive ones”*
• “Longitudinal surveys” is one of 6 listed frontiers
• Improved statistical methods is another (but this section is about using the internet to conduct surveys!!)
*Butz WP, Torrey BB. Some Frontiers in Social Science. Science, June 2006.

What is on the frontier??
• Addition of biomarkers
  • Overcoming squeamishness of social scientists
  • Lack of laboratory facilities
• What methodological improvements are needed?
• Innovative data collection and tracking
  • Use of GPS and PDAs
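
Illustrative sketch: individual trajectories
The latent growth curve slide says random intercepts and random slopes let each case follow its own trajectory over time. A rough way to see that idea is a two-stage approximation: fit a separate straight line to each child's repeated measures, then summarize how the intercepts and slopes vary across children. This is not an SEM-based latent growth curve and not the analysis used for the Cebu data. The function names, the record layout, and the weights below are all hypothetical and made up for illustration.

```python
from statistics import mean, pvariance


def fit_line(ages, weights):
    """Ordinary least squares fit of weight = intercept + slope * age for one child."""
    a_bar, w_bar = mean(ages), mean(weights)
    sxx = sum((a - a_bar) ** 2 for a in ages)
    sxy = sum((a - a_bar) * (w - w_bar) for a, w in zip(ages, weights))
    slope = sxy / sxx
    return w_bar - slope * a_bar, slope


def individual_trajectories(records):
    """Map child id -> (intercept, slope) from long-format (id, age, weight) rows."""
    by_child = {}
    for child_id, age, weight in records:
        ages, weights = by_child.setdefault(child_id, ([], []))
        ages.append(age)
        weights.append(weight)
    return {cid: fit_line(a, w) for cid, (a, w) in by_child.items()}


# Hypothetical data: (child id, age in years, weight in kg). Not survey values.
records = [
    ("A", 0.0, 3.0), ("A", 0.5, 6.0), ("A", 1.0, 9.0),
    ("B", 0.0, 3.5), ("B", 0.5, 5.5), ("B", 1.0, 7.5),
    ("C", 0.0, 2.5), ("C", 0.5, 6.5), ("C", 1.0, 10.5),
]

trajectories = individual_trajectories(records)
intercepts = [i for i, _ in trajectories.values()]
slopes = [s for _, s in trajectories.values()]

# Between-child spread in slopes is the quantity a random-slope model
# would estimate jointly rather than in two stages.
summary = {
    "mean_intercept": mean(intercepts),
    "mean_slope": mean(slopes),
    "slope_variance": pvariance(slopes),
}
```

A straight line per child is only a starting point: the slides ask for non-linear and piecewise trajectories, and a joint model would also handle children with missing rounds, which this per-child fit does not.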