Data Tools for MCH Professionals: Introduction to Local Data Sources and Analytic Considerations

Michael D. Kogan, PhD
Director, Office of Data and Program Development
US Dept of Health and Human Services
Health Resources and Services Administration
Maternal and Child Health Bureau

Laurin Kasehagen Robinson, PhD
Senior MCH Epidemiologist
CDC Assignee to CityMatCH
Adjunct Assistant Professor in Pediatrics
University of Nebraska Medical Center

Workshop Overview
• Role of local health departments
• Importance of local data
• Evidence-based public health
• Introduction to basic epidemiologic concepts
• Introduction to local data sources and overview of the reference guides
  – What’s available
  – How to use it
  – Advantages and limitations of these data sources
• Hands-on case studies I and presentations
• Break
• Hands-on case studies II and presentations
• Discussion
  – What was most useful?
  – What was missing?

Role of Local Health Departments
• Local health departments
  – play a key role in the provision of public health services to both rural and urban communities
  – are the closest source for information on and assistance with public health issues and concerns in a community
• Serve 3 core functions

Core Function #1
• Assess community problems, needs, and resources, through
  – Health needs assessments
  – Data
  – Surveillance

Core Function #2
• Provide leadership in organizing strategies to address health problems, through
  – Programs designed to meet community needs

Core Function #3
• Assure that direct services necessary for meeting local public health goals are available to all community residents, through
  – Community health services, including
    • Screenings
    • Education
    • Prevention
    • Outreach

Why is local data important?
• The importance of local-level data is summarized by Shah, Whitman & Silva in “Variations in the Health Conditions of 6 Chicago Community Areas: A Case for Local-Level Data”
• “Variations in health measures identified at the local level shed light on the limitations of the existing city data often used in establishing public health policies and monitoring population health. . . . [Such] data are essential in identifying communities most at risk of poor health outcomes, exploring the determinants of such variations in health, and ultimately guiding community health programs and policies.”

Potential Limitations of / for Local Data
• Often limited to jurisdictions with populations of at least 100,000
  – Why? Issues of small numbers, accuracy and confidentiality
  – Sometimes limited because of relatively rare events
    • E.g., maternal mortality, autism, teen pregnancies, unintentional injuries
• The data may not be current
  – Denominators may be based on the 2000 Census
  – City / County / MSA population may be based on 2000 Census
• Data may not be collected at the household or city or county level

Evidence-Based Public Health: Gathering and Using the Best Evidence for Local Data

Evidence-Based Medicine
• Health care practices based on review of current best evidence on the effectiveness of a test, drug, surgery or other medical practice
• Collect and analyze all of the research studies conducted on a particular intervention
• Evidence is then graded
• Best evidence based on findings from clinical trials and meta-analysis
• Weakest evidence based on case reports

Definition of Evidence-Based Public Health
• “EBPH is the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of communities and populations in the domain of health protection, disease prevention, health maintenance and improvement.” Jenicek (1997)

Differences between Public Health and Medicine
              | Public Health | Medicine
Primary Focus | Populations | Individuals
Emphasis      | Prevention; Health Promotion; Whole Community | Diagnosis; Treatment; Whole Patient
Paradigm      | Interventions aimed at Environment, Human Behavior and Lifestyle, and Medical Care | Medical Care

So what is “best evidence”?

Best Evidence
• Makes sense (it’s relevant)
• Unbiased
• Available
• Statistically significant
• Significant to public health
• Leads to correct decisions

Evidence
[Figure: evidence graded by statistical significance and meaningfulness to public health; labels: Statistical significance, Meaningful to Public Health, GOOD, FAIR, BOTH = BEST]
We have been taught to accept statistical significance. With large samples (as in many cases), we are bound to find statistical significance, even if it is not meaningful.

Steps of Evidence-Based Public Health
• Develop an initial statement of the issue
• Search the scientific literature and organize information
• Quantify the issue using sources of existing data
• Develop and prioritize program options; implement interventions
• Evaluate the program or policy

Different Sources of Evidence in Public Health: The Information Continuum
(ordered from VERY STRONG to VERY WEAK)
• Randomized controlled trials
• Active surveillance, some clinical studies
• Routinely collected data, case review programs
• Review processes, personal anecdotes, gut feelings

So why isn’t evidence-based decision-making used more often?

How are Decisions Often Made?
• Decisions on policies and programs are often made based on:
  – Personal experience
  – What we learned in formal training
  – What we heard at a conference
  – What a funding agency required / suggested
  – What others are doing

Evidence and Public Health Decision Making
• Good news
  – Strong evidence on the effect of many policies / programs aimed to improve public health, like immunizations or smoking cessation
  – Major efforts underway to assess the body of evidence for a wide range of public health interventions, like the Cochrane Collaboration or the AMCHP Best Practices program

What Works to Improve the Public’s Health?
• Bad news
  – Many public health professionals are unaware of this evidence
  – Some who are aware don’t use it
  – Many existing disease control programs have interventions with insufficient evidence, while others use interventions with strong evidence of effectiveness
  – Lack of use of effective interventions can adversely affect fulfilling mission and getting public support

Evidence-Based Maternal and Child Health
• True or false?
• For women who are experiencing problems with their pregnancy, bed rest is effective in preventing preterm labor.

Evidence-Based Maternal and Child Health
• FALSE!
• Obstetric practices for which there is little evidence of effectiveness in preventing or treating preterm labor include bed rest. (Goldenberg, Obstetrics and Gynecology, 2002)

The True Story of the 3 Local MCH Departments and Governor Wolf’s Office

Once…
• …the office of Governor Wolf called up the first local MCH department and wanted to know the preterm birth rate for 2006 and 2007.
• The local data staff ran to the computer and quickly calculated the number of preterm births divided by the number of normal gestational age births.
• And proudly showed it to the Governor.
• “That’s not a rate, that’s a ratio!!!” thundered Governor Wolf (who had a doctorate in epidemiology).
• And he huffed and he puffed and he blew away 25% of their funding.

• So, the office of Governor Wolf called up the second local MCH department and wanted to know the preterm birth rate for 2006 and 2007.
• The local data staff ran to the computer and quickly calculated the number of preterm births divided by the total number of births.
• And proudly showed it to the Governor.
• “Great,” said the Gov, “is it the same in 2006 and 2007?”
• “Oh, we’re not sure of the year,” said the second local MCH staff.
• “Then it’s not a rate, it’s a proportion!!!” thundered Governor Wolf.
• And he huffed and he puffed and he blew away 35.8% of their funding.

• And then, Governor Wolf called up the third local MCH department and wanted to know the preterm birth rate for 2006 and 2007.
• The local data staff ran to the computer and quickly calculated the number of preterm births divided by the total number of births for each year.
• And proudly showed them to the Governor.
• “Great,” said the Gov, “is it the same in 2006 and 2007?”
• “No, it was 12.8 per 100 live births in 2006, and 10.2 per 100 live births in 2007; a significant decline,” said the third local MCH department staff.
• “Excellent!!!” cried Governor Wolf.
• And he wiped out their funding altogether because of an immediate state budget crisis.

Was Governor Wolf correct? Or, would any of the local health department responses suffice? (Or, was the Governor just throwing around his epidemiologic weight?)
• Why is this a ratio?
• Why is this a proportion?
• Why is this a rate?
• Why does it matter?
• What are the implications if the wrong measure is used?

Measures of Disease Frequency
[Figure: examples of the four measures. COUNTS: 6,694; RATIOS: 1.049:1; PROPORTIONS: 3,763,758 / 4,090,007 = 92.0%; RATES: 161.8 per 100,000]

Counts
• Simplest, most frequently performed quantitative measure in epidemiology
• Refer to the number of cases of disease, injury, events, or other health phenomenon being studied
• Examples
  – No. of pregnant women who were screened for Hepatitis B during a prenatal care visit
  – No. of women who initiated breastfeeding in the U.S. in 2007
  – No. of newborns screened for genetic, metabolic, hormonal and/or functional conditions within 24-48 hours of birth

Why isn’t enumeration sufficient?
• Can’t / Don’t always detect ALL events
  – Census
  – Sample
• How would you know whether the counts
  – Represent events that are big, small, a problem, important?
  – Represent phenomena common or unique to a population?
  – Change over time?
  – Are similar or different between 2 different populations?

Frequency Measures – Ratio, Proportion, Rate
• Characterize part of a distribution
• Can be used to compare one part of a distribution to another part of a distribution
• Contrast to measures of central tendency that provide single values that summarize entire distributions of data (e.g., mean, median, mode)

All 3 frequency measures have the same form:
(numerator / denominator) × 10^n

From “Births: Final Data for 2004” in the National Vital Statistics Reports, vol. 55(1):21, Sept 29, 2006.

What is a ratio?
• A fraction in which the numerator is NOT part of the denominator
• Numerator and denominator need not be related
• Limits – −∞ to ∞
• Result is often expressed as “x”:1
• E.g.,
  – male-to-female ratio
  – no. of controls to no. of cases
  – no. of LBW births to no. of violent crimes in a neighborhood

How to Calculate a Ratio
Ratio = (Number or rate of events, items, persons, etc. in one group) / (Number or rate of events, items, persons, etc. in another group)
Example: Sex ratio – male live births to female live births = 2,118,982 / 2,019,367 = 1.049:1 (or 1,049 male live births per 1,000 female live births)

What is a proportion?
• Compares a part to the whole
• The numerator is ALWAYS part of the denominator
• Type of ratio, “x/y”
• May be expressed as a decimal, a fraction, or a percentage
• Limits – 0 to 1
• In epidemiology, tells us the fraction of the population that’s affected
• E.g.,
  – proportion of children in a school vaccinated against measles
  – proportion of women in PRAMS who initiated breastfeeding
  – % of women who initiated PNC in the 1st trimester

How to Calculate a Proportion
Proportion = (Number of persons or events with a particular characteristic) / (Total number of persons or events of which the numerator is a subset)
Example: Proportion (%) of 2003 live births with birthweights of 2500 grams or greater = 3,763,758 / 4,090,007 = 92.0%
From “Infant Mortality Statistics from the 2003 Period Linked Birth/Infant Death Data Set”, NVSR 54(16):1, May 3, 2006.
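
As a minimal sketch of the two calculations above, the following Python snippet reproduces the sex ratio and the proportion of live births weighing 2500 grams or more from the counts quoted on the slides. The helper names `ratio` and `proportion` are hypothetical, introduced here only for illustration; they are not part of the workshop materials.

```python
def ratio(group_a, group_b):
    """Ratio of one group to another; the numerator is NOT part of the denominator."""
    if group_b == 0:
        raise ValueError("denominator must be non-zero")
    return group_a / group_b


def proportion(part, whole):
    """Proportion: the numerator is ALWAYS a subset of the denominator."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole, and whole must be positive")
    return part / whole


# Counts taken from the examples above.
sex_ratio = ratio(2_118_982, 2_019_367)        # male live births : female live births
normal_bw = proportion(3_763_758, 4_090_007)   # births >= 2500 g / all live births

print(f"Sex ratio: {sex_ratio:.3f}:1")
print(f"Male births per 1,000 female births: {sex_ratio * 1000:,.0f}")
print(f"Proportion >= 2500 g: {normal_bw * 100:.1f}%")
```

The range check in `proportion` is one way to encode the part-to-whole rule: a numerator larger than its denominator signals that the quantity being computed is a ratio, not a proportion.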

What is a rate?
• A ratio that consists of a numerator and a denominator in which TIME forms a part of the denominator
• Measures the frequency with which an event occurs in a defined population over a specified period of time

From “Births: Final Data for 2005” in the National Vital Statistics Reports, vol. 56(6):1, December 5, 2007.

Properties and Uses of Rates
• Useful for putting disease frequency in the perspective of the size of the population
• Can be used to compare among different groups of persons with potentially different sized populations (i.e., rate is a measure of risk)
• Limits – 0 to ∞
• Can be expressed in any form that is convenient (e.g., per 1000, per 100,000, etc.)

How to Calculate a Rate
Rate = (No. of persons or events in a given time period) / (No. of persons or events in a reference population (at mid-point of year or time period))
Example: 2005 Triplet or higher order multiples birth rate in the United States = 6,694 / 4,138,349 = 161.8 per 100,000 births

Are percentages ratios? Proportions? And/or Rates?
• Yes, Ratio
  – e.g., number of mothers in one group (e.g., 1st trimester) over the number of mothers in another group (e.g., all who had late or no PNC)
• Yes, Proportion
  – e.g., the ratio of mothers in one group who are a subset of the other group
• Perhaps, Rate
  – when percentages are a ratio that consists of a numerator and a denominator in which TIME forms a part of the denominator

Incidence
• Refers to the occurrence of new cases of disease, injury, attribute or events in a population over a specified period of time
• Is a proportion or a rate
• Fundamental tool for exploring the etiology and causality of disease because new events provide estimates of risk of developing disease
• Several types of incidence measures
  – Incidence proportion
  – Attack rates
  – Incidence rate

How to Calculate Incidence Proportion (Risk)
Incidence Proportion = (Number of NEW cases of disease, injury, events, or deaths during a specified period of time) / (Population at start of the specified period of time)
Example: 2007 incidence of chickenpox in the United States
519 incident cases of chickenpox in the United States
= 519 / 301,139,950 = 1.72 per 1,000,000 population
From “Table II. Provisional cases of selected notifiable diseases, United States” in the MMWR, vol. 57(1):26, January 11, 2008.
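
The incidence proportion example above, and the triplet birth rate example on the earlier rate slide, both follow the same "events over population, times a multiplier" pattern. The Python sketch below works through both using the counts quoted on the slides. The helper name `per_population` is an assumption made for illustration and does not come from the workshop materials.

```python
def per_population(events, population, multiplier):
    """Scale events / population by a convenient multiplier (e.g., 100,000)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * multiplier


# 2007 chickenpox: new cases over the population at the start of the period.
chickenpox = per_population(519, 301_139_950, 1_000_000)

# 2005 triplet or higher order births over all births in the same year.
triplet_rate = per_population(6_694, 4_138_349, 100_000)

print(f"Chickenpox incidence: {chickenpox:.2f} per 1,000,000 population")
print(f"Triplet+ birth rate: {triplet_rate:.1f} per 100,000 births")
```

The multiplier is chosen only for readability. When two measures are compared, both should use the same multiplier.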

Uses of Incidence Data
• Determining the extent of a disease or health problem in a community
• Helping to determine etiology of disease because an estimate of risk of developing disease can be calculated
• Identifying changes in disease over time
• Comparing incidence rates in populations that differ in exposure
  – permits estimation of effects of exposure to a hypothesized factor of interest

Prevalence
• Refers to the number of persons in a population with a specified disease, injury or attribute or event at a specified point in time or over a specified period of time
• Is a proportion or a rate
• Point prevalence
  – Measured at a particular point in time
• Period prevalence
  – Measured over an interval of time

How to Calculate Prevalence
Prevalence of Disease / an Attribute = (Total number of persons with [NEW + PREEXISTING cases of disease] OR [attribute of interest] during a specified period of time) / (Population during the same specified period of time)
Example: 2006 Prevalence of folic acid consumption among non-pregnant women aged 18-44 years in Puerto Rico = 995 / 410,210 = 24.8 per 10,000 live births
From “Prevalence of Neural Tube Defects and Folic Acid Knowledge and Consumption – Puerto Rico, 1996-2006” in the MMWR, vol. 57(1):10-13, January 11, 2008.

Properties of Prevalence Data
• Prevalence and incidence are frequently confused . . .
  – Prevalence refers to the proportion of persons who have a condition at or during a specific period of time
  – Incidence refers to the proportion or rate of persons who develop a condition during a particular period of time

Uses of Prevalence Data
• Provides an indication of the extent of a health problem and may have implications for the scope of health services needed
• Useful for
  – Describing the health burden of a population
  – Estimating frequency of an exposure
  – Allocating health resources
  – BUT, NOT for determining etiology

Measures of Association
• Quantify the relationship between exposure and disease among two groups of people within the same population or two different populations of people
  – Exposure is used loosely to mean inherent characteristics, biologic characteristics, acquired characteristics, activities, social or environmental conditions, etc.
• Includes
  – Relative risk (risk ratio)
  – Rate ratio
  – Odds ratio
  – Proportionate mortality ratio

Relative Risk / Risk Ratio (RR)
• Compares the risk of a health event among one group with the risk among another group
• The two groups are typically differentiated by demographic features or exposure to a suspected risk factor
• Measure of association for cohort studies
• When
  – RR = 1, same risk among the two groups
  – RR > 1, increased risk for the group in the numerator (usually the exposed group)
  – RR < 1, decreased risk for the group in the numerator (in some instances the exposure might be a protective factor)

Relative Risk = (Risk of disease (incidence proportion, attack rate) in the group of primary interest (exposed)) / (Risk of disease (incidence proportion, attack rate) in the comparison group (unexposed))

Relative Risk of Hashimoto’s Thyroiditis

Rate Ratio
• Compares the incidence rates, person-time rates, or mortality rates of two groups
• The two groups are typically differentiated by demographic features or exposure to a suspected risk factor
• When
  – Rate ratio = 1, equal rates in the two groups
  – Rate ratio > 1, increased risk for the group in the numerator (usually the exposed group)
  – Rate ratio < 1, decreased risk for the group in the numerator (could indicate that the exposure is a protective factor)

Rate Ratio = (Rate for group of primary interest (exposed)) / (Rate for the comparison group (unexposed))

Male:Female Rate Ratio of Syphilis
Rate Ratio of non-Hispanic Black Males to non-Hispanic Black Females
Rate Ratio = 11.9 / 1.8 = 6.6
From “Primary and Secondary Syphilis – US, 2003-2004” in the MMWR, vol. 55(10):269-73, March 17, 2006.

Odds Ratio (OR)
• Quantifies the relationship between an exposure with two categories and health outcome
• Sometimes called the cross-product ratio
• Measure of choice in case-control studies
  – Often, the size of the population from which the cases are identified is not known; thus, risks, rates, risk ratios, and rate ratios cannot be calculated
  – Odds ratios approximate risk ratios (relative risks), particularly when the disease or outcome is rare
• When
  – Odds ratio = 1, equal odds in the two groups
  – Odds ratio > 1, increased risk for the exposed group
  – Odds ratio < 1, decreased risk for the exposed group

            | Disease | No Disease | Total
Exposed     | a       | b          | a+b
Not Exposed | c       | d          | c+d
Total       | a+c     | b+d        |

Odds Ratio = (a/c) / (b/d) = ad / bc

Odds Ratios of Self-Reported Severity of Asthma Symptoms
From “Self-Reported Increase in Asthma Severity … – Manhattan, NY, 2001” in the MMWR, vol. 51(35):781-84, September 6, 2002.

Measures of Natality
Measure | Numerator | Denominator
Crude birth rate | No. of live births during a given period of time | Mid-interval population
Crude fertility rate | No. of live births during a given period of time | No. of women ages 15-44 years at mid-interval
Crude rate of natural increase | No. of live births MINUS no. of deaths during a given period of time | Mid-interval population
Low birth weight rate | No. of live births <2500 grams during a given period of time | No. of live births during the given period of time

Measures of Morbidity
Measure | Numerator | Denominator
Incidence proportion (attack rate or risk) | No. of NEW cases of disease, injury, or events during a specified time interval | Population at start of time interval
Secondary attack rate | No. of NEW cases among contacts | Total number of contacts
Incidence rate (person-time rate) | No. of NEW cases of disease, injury, or events during a specified time interval | Summed person-years of observation or average population during time interval
Point prevalence | No. of current cases or events (new + preexisting) at a specified point in time | Population at the same specified point in time
Period prevalence | No. of current cases or events (new + preexisting) over a specified period of time | Average or mid-interval population

Measures of Mortality
Measure | Numerator | Denominator
Crude death rate | Total no. of deaths during a given period of time | Mid-interval population
Cause-specific death rate | No. of deaths assigned to a specific cause during a given period of time | Mid-interval population
Proportionate mortality | No. of deaths assigned to a specific cause during a given period of time | Total no. of deaths from all causes during the same period of time
Death-to-case ratio | No. of deaths assigned to a specific cause during a given period of time | No. of new cases of same disease reported during the same period of time
Neonatal mortality rate | No. of deaths among children <28 days of age during a given period of time | No. of live births during the same period of time
Postneonatal mortality rate | No. of deaths among children 28-364 days of age during a given period of time | No. of live births during the same period of time
Infant mortality rate | No. of deaths among children <1 year of age during a given period of time | No. of live births during the same period of time
Maternal mortality rate | No. of deaths assigned to pregnancy-related causes* during a given period of time | No. of live births during the same period of time
*A pregnancy-related death is defined as a death that occurred during pregnancy or within 1 year after the end of pregnancy and resulted from 1) complications of pregnancy itself, 2) a chain of events initiated by pregnancy, or 3) aggravation of an unrelated condition by the physiologic effects of pregnancy

Measures of Public Health Impact
• Used to place the association between an exposure and an outcome into a meaningful public health context
• Reflect the burden that an exposure contributes to the frequency of disease in a population
  – Contrasts with measures of association, which quantify the relationships between exposures and diseases and provide insight into causal relationships
• Includes
  – Attributable proportion
  – Efficacy
  – Effectiveness

Measures of Spread
• Standard deviation
  – Conveys how widely or tightly the observations are distributed from the center point or values
  – Measure of spread used most commonly with the mean
  – Usually calculated only when the data are more or less normally distributed

Standard Error of the Mean
• Refers to the variability that could be expected in the means of repeated samples taken from the same population
• Assumes sample comes from a large population
• Sample of interest is just one of an infinite number of possible samples
• The mean is just one of an infinite number of sample means
• Standard error quantifies the variation observed in the sample means
• Primary use of standard error is in calculating confidence intervals around the mean
  – SE = standard deviation / √n

Confidence Intervals
• Common method for indicating a measurement’s precision
  – Narrow interval = high precision
  – Wide interval = low precision
• Represents the range of values consistent with the data from a study . . . Simply a guide to the variability in a study
• Confidence intervals can be calculated for some, but not all, epidemiologic measures . . . Regardless of measure, the interpretation is the same
  – Means
  – Geometric means
  – Proportions
  – Risk ratios
  – Odds ratios

Some Methods to Compare Differences between Groups
• Rate ratios
  – Used to compare rates for 2 populations
  – Simply the ratio of 2 rates
  – Note: the multiplier must be the same for both rates
• Relative percent difference (RPD)
  – Another method for comparing differences between 2 groups using prevalence

RPD = ((P1 – P2) / P2) × 100
where P1 is the prevalence of the event in the first population and P2 is the prevalence of the event in the second population

Prevalence of Diabetes and Relative Percent Difference
• RPD between the rate of diabetes in Hispanic and non-Hispanic white women
RPD = ((9.9 – 4.5) / 4.5) × 100 = (5.4 / 4.5) × 100 = 120%
From “Prevalence of Diabetes Among Hispanics – Selected Areas, 1998-2002” in the MMWR, vol. 53(40):941-44, October 14, 2004.

Assessing Trends
• Trend = long-term movement in an ordered series
  – Can be used to assess the overall pattern of change of an indicator, geographic areas, time periods, populations
  – Can be influenced by small numbers, changes in how data are collected / defined
  – Can minimize effect by “smoothing” data via 3-year moving averages or data transformation (natural log scale)
• Also can be used loosely to refer to an association which is consistent between 2 sets of data or strata, but not necessarily statistically significant

How to Judge / Evaluate Data Sources
• Timeliness
• Geographic specificity
• Specificity of demographic data
• Data consistency and standardization
• Availability over time
• Ability to identify individuals / events
• Adequate sample size
• Sample validity
• Primary data collection potential

Caveats
• Caveat . . . unique data sources
  – Not necessarily an abundance for local data, but may be packaged or presented in different ways
  – Some states try to ensure that data are available at county level
  – A number of websites catalog or compile links to data sources, e.g.,
    • California – UCSF Family Health Outcomes Project: http://familymedicine.medschool.ucsf.edu/fhop/htm/ca_mcah/index.htm
    • Texas – UT School of Public Health: http://www.sph.uth.tmc.edu/charting/

Next Steps in this Workshop
• What you have in hard copy and on disk
  – Source descriptions
  – Source quick reference guide
  – Case studies
  – Case studies “cheat sheet”
  – Copy of this presentation
• Let’s take a look and GET STARTED!

Acknowledgments
• Belovich-Faust and Ligi. Role of the Local Health Dept., Bethlehem, PA Health Dept.
• Shah, Whitman & Silva. Variations in the Health Conditions of 6 Chicago Community Areas: A Case for Local-Level Data. Am J Public Health 96(8): 1485-91 (2006).
• Jenicek. Epidemiology, Evidence-Based Medicine, and Evidence-Based Public Health. J Epidemiol 7:187-97 (1997).
• Brownson, et al. Evidence-Based Decision-Making in Public Health. J Public Health Manag Prac 5:86-87 (1999).
• Goldenberg. The Management of Preterm Labor. Obstetrics and Gynecology 100(5 Pt 1):1020-37 (2002).
• Lewis. Moneyball, 2003.

Contact Information & Copies of Workshop Training Materials

Michael D. Kogan, PhD
HRSA/MCHB
Director, Office of Data and Program Development
5600 Fishers Lane, Room 18-41
Rockville, MD 20857
301-443-3145
mkogan@hrsa.gov

Laurin Kasehagen Robinson, PhD, MA
CityMatCH
Senior MCH Epidemiologist / CDC Assignee to CityMatCH
Adjunct Asst Professor in Pediatrics
University of Nebraska Medical Center, Department of Pediatrics
982170 Nebraska Medical Center
Omaha, NE 68198-2170
402-561-7523
lkasehagen@unmc.edu