GIS and Health Geography
What is epidemiology?

TOC
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence vs prevalence
Identifying the population
Working with small area data

GIS and health geography
A GIS can be a useful tool for health researchers and planners because, as expressed by Scholten and Lepper (1991):
◦ Health and ill-health are affected by a variety of lifestyle and environmental factors, including where people live.
Characteristics of these locations (including socio-demographic and environmental exposure) offer a valuable source for epidemiological research studies on health and the environment.
Health and ill-health therefore always have a spatial dimension.
More than a century ago, epidemiologists and other medical scientists began to explore the potential of maps for understanding the spatial dynamics of disease.

Major applications for GIS
1. Spatial epidemiology
2. Environmental hazards
3. Modeling Health Services
4. Identifying health inequalities

Spatial epidemiology
Spatial epidemiology is concerned with describing and understanding spatial variation in disease risk.
◦ Individual level data
◦ Counts for small areas
Recent developments owe much to:
◦ Geo-referenced health and population data
◦ Computing advances
◦ Development of GIS
◦ Statistical methodology

Framework for analysis
◦ Population is unevenly distributed geographically.
◦ People move around (day-to-day movements; longer term movements including migration).
◦ People possess relevant individual characteristics (age, sex, genetic makeup, lifestyle, etc.).
◦ People live in communities (small areas).

Why small area analyses?
◦ Provides a qualitative answer about the existence of an association (e.g. between environmental variable and health outcome).
◦ May provide evidence that can be followed up in other ways.
Geographical correlation studies
These studies typically involve examining geographical variations in exposure to environmental variables (air, water, soil, etc.) and their association with health outcomes while controlling for other relevant factors using regression.

Issues: Spatial misalignment
Issues: Uncertainty
◦ Frequency and quality of population data (e.g. Census every 10 years).
◦ Spatial compatibility of different data sets.
◦ Availability of data on population movements.
◦ Measuring population exposure to the environmental variable.
◦ Environmental impacts are often likely to be quite small (relative to, for example, lifestyle effects) and there may be serious confounding effects.
◦ Cannot estimate strength of an association; ecological (or aggregation) bias.

Issues: Best practices (Richardson 1992)
◦ Allow for heterogeneity of exposure.
◦ Use well defined population groups.
◦ Use survey data to help obtain good exposure data.
◦ Allow for latency times.
◦ Allow for population movement effects.

Spatial epidemiology
[Figure: Dr. John Snow's Map of Cholera Deaths in the Soho District of London, 1854]

Major applications for GIS
1. Spatial epidemiology
2. Environmental hazards
3. Modeling Health Services
4. Identifying health inequalities

Environmental hazards
Hazard Surveillance
• Hazardous agent present in the environment
• Route of exposure exists
Exposure Surveillance
• Host exposed to agent
• Agent reaches target tissue
• Agent produces adverse effect
Outcome Surveillance
• Effect clinically apparent

Environmental hazards
GIS: Identify causal and mitigating factors

Major applications for GIS
1. Spatial epidemiology
2. Environmental hazards
3. Modeling Health Services
4. Identifying health inequalities

ARIA (Accessibility/Remoteness Index of Australia)
◦ A generic index of accessibility/remoteness for all populated places in non-metropolitan Australia
◦ A model which allows accessibility to any type of service to be calculated from all populated places in Australia
[Figure: ARIA]

Mortality rate of infants (1980-2001)
[Figure: Mortality Rate / 1000 live births by geographical location. Labels: Remote, Rural, Metro.; non-Aboriginal, Aboriginal. Source: "Where do infants and children die in WA? 1980-2002", Jane Freemantle, PhD, November 2004.]

SES and Heart disease
Identifying health inequalities: Well-known relationship
◦ 25% – 50% of observed gradient due to risk factors like smoking, hypertension and diabetes in lower socio-economic groups (Marmot et al., 1997)
◦ Access to healthcare (Bosma et al., 2005)
◦ Imbalance between workplace demands and economic reward (Lynch et al., 1997)
◦ Poor education, lower levels of health literacy, low birth weight (Marmot, 2000)
Relationship may vary with gender, with the association thought to be stronger in males (Thurston, 2005)

The Data
Number of daily hospital discharges (Y) with Ischemic Heart Disease (IHD) where admission had been via emergency room for
◦ 591 postcodes in NSW
◦ Every day from July 1, 1996 to June 30, 2001
◦ Males and females
◦ 5-year age increments
Denominator (N) obtained from census
Social disadvantage measured at postal area level using the census-derived SEIFA (Socio-Economic Indexes for Areas) index

SEIFA distribution in NSW
[Figure: SEIFA distribution in NSW. High values indicate social advantage.]

[Figure: NSW IHD rates]

TOC
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence vs prevalence
Identifying the population
Working with small area data

What is epidemiology?
The study of the distribution and determinants of health and disease-related states in populations, and the application of this study to control health problems.
◦ 'the product of [epidemiology] is research and information and not public health action and implementation' (Atwood et al. 1997)
◦ 'epidemiology's full value is achieved only when its contributions are placed in the context of public health action, resulting in a healthier populace.' (Koplan et al. 1999)

Epidemiologists . . .
… are like bookies of disease, stalking the globe to determine point-spreads on which groups of people are most likely to get which diseases. Part detective and part statistician, part anthropologist and part physician, epidemiologists hope to track down the agents of illness by deducing which of the differences between peoples lie at the root of their distinctive disease patterns.
(H. Shodell, Science '82, September, p. 50)

Epidemiologic approaches
DESCRIPTIVE: Health and disease in the community
◦ What? What are the health problems of the community? What are the attributes of these illnesses?
◦ Who? How many people are affected? What are the attributes of affected persons?
◦ When? Over what period of time?
◦ Where? Where do the affected people live, work or spend leisure time?
ANALYTIC: Etiology, prognosis and program evaluation
◦ Why? What are the causal agents? What factors affect outcome?
◦ How? By what mechanism do they operate?

What are "disease" and "health"?
Dorland's Illustrated Medical Dictionary (28th ed.):
◦ Health – "a state of optimal physical, mental, and social well-being, and not merely the absence of disease and infirmity."
◦ Disease – "any deviation from or interruption of the normal structure or function of any part, organ, or system (or combination thereof) of the body that is manifested by a characteristic set of symptoms and signs . . .".

Health, as defined in the World Health Organization's Constitution, is "a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity."
What is 'health'
Health is seen as more than just the absence of disease, and depends upon a complex suite of factors, with location taking the lead.
A location is more than just a position within a spatial frame (e.g., on the surface of the Earth or within the human body). Different locations on Earth are usually associated with different profiles (physical, biological, environmental, economic, social, cultural and possibly even spiritual) that affect and are affected by health, disease and healthcare.

Location and health
An example of how location matters and brings other factors into play:
The body weight of infants at birth is one readily available piece of data, and the relationship between low birth-weight and maternal and child health is a continuing line of research. In New York City, Sara McLafferty and Barbara Tempalski have studied the spatial distribution of low birth-weight infants and identified areas in which the number of low birth-weight infants increased sharply during the 1980s. Their results indicated that the rise in low birth-weight was closely linked to women's declining economic status, inadequate insurance coverage and prenatal care, as well as the spread of crack/cocaine.

TOC
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence vs prevalence
Identifying the population
Working with small area data

What is 'disease'
Manifestational criteria:
◦ Manifestational criteria refer to symptoms, signs, and other manifestations of the condition.
◦ Defining a disease in terms of manifestational criteria relies on the proposition that diseases have a characteristic set of manifestations.
◦ This defines disease in terms of labeling symptoms.
Causal criteria:
◦ Causal criteria refer to the etiology of the condition, which must have been identified before the criteria can be employed.
◦ This defines disease in terms of underlying pathological etiology.

Manifestational Criteria
How do you identify a disease?
The Acquired Immunodeficiency Syndrome (AIDS) was initially defined by the CDC in terms of manifestational criteria as a basis for instituting surveillance. The operational definition grouped diverse manifestations: Kaposi's sarcoma outside its usual subpopulation, PCP and other opportunistic infections in people with no known basis for immunodeficiency. This grouping was based on similar epidemiologic observations (similar population affected, similar geographical distribution) and a shared type of immune deficit (elevated ratio of T-suppressor to T-helper lymphocytes).

Causal Criteria
Human immunodeficiency virus (HIV, previously called human T-lymphotropic virus type III) was discovered and demonstrated to be the causal agent for AIDS. AIDS could then be defined by causal criteria.

Challenges with Disease Classifications
◦ A single causal agent may have multiple clinical effects.
◦ Multiple etiologic pathways may lead to apparently identical manifestations, so that a manifestationally-defined disease entity may include subgroups with differing etiologies.
◦ Multi-causation necessitates a degree of arbitrariness in assigning a causative versus a contributing factor to a disease.
◦ Not all persons with the causal agent develop the disease.

The natural history of disease X
[Figure: The natural history of disease X. Labels: Underlying Genetic Susceptibility; Environmental & Behavioral Factors (Spatial dependence); Physiologic Abnormalities; Onset of disease; Sub-clinical disease; Diagnosis of disease; Clinical disease; Cause-specific mortality.]

TOC
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence versus prevalence
Identifying the population
Working with small area data

To study disease, we need measures of its occurrence.
Measures of disease occurrence
Some measures of disease occurrence
◦ Counts
◦ Prevalence
◦ Incidence
◦ Mortality

Epidemiologic approaches
DESCRIPTIVE: Health and disease in the community
◦ What? What are the health problems of the community? What are the attributes of these illnesses?
◦ Who? How many people are affected? What are the attributes of affected persons?
◦ When? Over what period of time?
◦ Where? Where do the affected people live, work or spend leisure time?
Each of the measures can be calculated for different combinations of What? Who? When? and Where?
Each of the W's needs to be defined carefully to get comparable measures across a province or state, a nation, or the world.

Prevalence
The prevalence of a disease is the proportion of individuals in a population with the disease (cases) at a specific point in time:
\[
\text{Prevalence} = \frac{\text{Number of cases in population at specified time}}{\text{Number of persons in population at that specified time}}
\]
◦ Prevalence is a proportion, with a range of 0 to 1.
◦ It removes the effect of total population size, which makes estimates from different populations or over time more comparable.

Prevalence
◦ Often expressed as a percent (%): Prevalence × 100
◦ Also often expressed as the prevalence per 1,000 or 10,000 or 100,000: Prevalence × 1,000 = prevalence per 1,000.

Obesity Trends Among U.S. Adults
[Figure: maps for 1991, 1995, 2002 and 2006. Legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%. (*BMI ≥30, or ~30 lbs overweight for 5' 4" woman)]

Salmonella cases: Infected
[Figure: Cases infected with the outbreak strain of Salmonella Saintpaul, as of July 15, 2008 9 pm EDT.]
We would need to know the population in each state in order to determine the prevalence.

The incidence of a disease is the rate at which new cases occur in a population during a specified period:
\[
I = \frac{\text{Number of NEW cases in population DURING specified time}}{\text{Number of persons AT RISK of disease in population during that specified time}}
\]
If population size is 3.81 million, then
\[
I = \frac{652}{3{,}810{,}000} \times 100{,}000 = 0.000171 \times 100{,}000 = 17.1 \text{ per } 100{,}000
\]
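The two definitions above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names `prevalence` and `incidence_rate` are hypothetical, the 120-in-8,000 prevalence figures are made up for illustration, and the incidence call reuses the 652 cases and 3.81 million population from the worked example.

```python
def prevalence(cases, population):
    """Proportion of the population that has the disease at one point in time (0 to 1)."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must lie between 0 and the population size")
    return cases / population


def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases during the period, per `per` persons at risk during that period."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return new_cases / population_at_risk * per


# Hypothetical prevalence example: 120 existing cases among 8,000 residents.
p = prevalence(120, 8_000)
print(f"prevalence = {p:.3f} ({p * 1_000:.0f} per 1,000)")

# Worked example from the slide: 652 new cases, population of 3.81 million.
rate = incidence_rate(652, 3_810_000)
print(f"incidence = {rate:.1f} per 100,000")  # prints: incidence = 17.1 per 100,000
```

The only structural difference between the two functions is the denominator: prevalence divides by everyone in the population, while incidence divides by those still at risk of becoming a new case.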
Incidence

Salmonella cases: Incidence
[Figure: Incidence of cases of infection with the outbreak strain as of July 15, 2008 9pm EDT]

Cases and Incidence – Salmonella
[Figure: Cases infected with the outbreak strain of Salmonella Saintpaul, as of July 15, 2008 9pm EDT]

Incidence and Prevalence
Incidence and prevalence measure different aspects of disease occurrence.

               Prevalence                                Incidence
Numerator:     All cases, no matter how long diseased    Only NEW cases
Denominator:   All persons in pop                        Only persons at risk of disease
Measures:      Presence of disease                       Risk of disease
Most useful:   Resource allocation                       Risk, etiology

Etiology: the study of a disease's causes.

Mortality rates
Mortality Rate – Incidence of death
Numerator
◦ Number of deaths
Denominator
◦ Number of individuals in population (how defined?)
Time interval
◦ 1-year: Annual Mortality Rate
◦ (typical to use an annual rate)
Specifier
◦ age, sex, race, etc.

Importance of defining terms
◦ For any measure, carefully defining both the numerator and denominator is crucial for interpretation.
◦ For measures to be comparable across studies, we need consistent definitions and reporting strategies for the numerator.
◦ We also need consistent approaches for counting (or estimating) the persons or person-time for the denominator.

Prevalence numerator – case definition
[Figure: AIDS cases, United States 1984-2000. Annotation: Result of new definition, 1st Quarter of 1993: Expansion of surveillance case definition.]

Understanding population dynamics is crucial to epidemiology.
The "demi" in Epidemiology
Demography = the study of population dynamics including fertility, mortality and migration

Greek    English
epi      among
demos    people
logy     study

TOC
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence vs prevalence
Identifying the population
Working with small area data

Data considerations
◦ Developing multi-level models for spatially-correlated data requires confidence in the dependent data.
◦ Data for disease mapping often consist of disease counts and exposure levels in small adjacent geographical areas.
◦ The analysis of disease rates or counts for small areas often involves a trade-off between statistical stability of the estimates and geographic precision.

An example: Pellagra in the US
◦ Disease caused by a deficient diet or failure of the body to absorb B complex vitamins or an amino acid.
◦ Common in certain parts of the world (in people consuming large quantities of corn), the disease is characterized by scaly skin sores, diarrhea, mucosal changes, and mental symptoms (especially a schizophrenia-like dementia).
◦ It may develop after gastrointestinal diseases or alcoholism.

Multi-level data in spatial epidemiology
A case study:
◦ They considered approximately 800 counties clustered within 9 states in the southern US
◦ For each county, data consisted of observed and expected number of pellagra deaths
◦ For each county, they also had several county-specific socio-economic characteristics and dietary factors
   ◦ % acres in cotton
   ◦ % farms under 20 acres
   ◦ Dairy cows per capita
   ◦ Access to mental hospital
   ◦ % Afro-American
   ◦ % single women

Scientific Questions
◦ Which social, economic, behavioral, or dietary factors best explain the spatial distribution of pellagra in the southern US?
◦ Which of the above factors is more important for explaining the history of pellagra incidence in the US?
◦ To what extent have state laws affected the incidence of pellagra?

Definition of Standardized Mortality Ratio

Definition of the expected number of deaths

[Figure: Crude Standardized Mortality Ratio (Observed/Expected) of Pellagra Deaths in Southern USA in 1930 (Courtesy of Dr Harry Marks)]

Statistical Challenges
For small areas, the Standardized Mortality Ratio (SMR) can be very unstable and maps of SMR can be misleading
◦ Spatial smoothing can improve stability
SMRs are spatially correlated
◦ Spatially correlated random effects
Covariates are available at different levels of spatial aggregation (county, state)
◦ Multi-level regression structure

Spatial Smoothing
◦ Spatial smoothing can reduce the random noise in maps of observable data (or disease rates)
◦ Trade-off between geographic resolution and the variability of the mapped estimates
◦ Spatial smoothing is a method for reducing random noise and highlighting meaningful geographic patterns in the underlying risk

Shrinkage Estimation
Shrinkage methods can be used to take into account unstable SMRs for the small areas
The idea is that:
◦ smoothed estimates for each area "borrow strength" (precision) from data in other areas, by an amount dependent on the precision of the raw estimate of each area

Shrinkage Estimation
When population in area A is large
◦ Statistical error associated with observed rate is small
◦ High credibility (weight) is given to observed estimate
◦ Smoothed rate is close to observed rate
When population in area A is small
◦ Statistical error associated with observed rate is large
◦ Little credibility (low weight) is given to observed estimate
◦ Smoothed rate is "shrunk" towards mean rate of surrounding areas

[Figure: Raw and smoothed SMR]

SMR of pellagra deaths for 800 southern US counties in 1930
[Figure: two maps, Crude SMR and Smoothed SMR]

Ensuring comparability
◦ In epidemiology and demography, most rates, such as incidence, prevalence and mortality, are strongly age-dependent, with risks rising (e.g. chronic diseases) or declining (e.g. measles) with age.
◦ In part this is biological (e.g. immunity acquisition), and in part it reflects the hazards of cumulative exposure, as is the case for many forms of cancer.
◦ For many purposes, age-specific comparisons may be the most useful.

Ensuring comparability
◦ However, comparisons of crude age-specific rates over time and between populations may be very misleading if the underlying age composition differs in the populations being compared.
◦ Hence, for a variety of purposes, a single age-independent index, representing a set of age-specific rates, may be more appropriate.
◦ This is achieved by a process of age standardization or age adjustment.

Standardizing
The age-standardized mortality rate is a weighted average of the age-specific mortality rates per 100 000 persons, where the weights are the proportions of persons in the corresponding age groups of the standard population.

Methodological toolboxes
Spatial Analytic Techniques for Medical Geographers (Albert et al., 2000)

Summary
GIS and health geography
◦ Major applications for GIS
Epidemiology
◦ What is health (and how location matters)
◦ What is a disease (and how to identify one)
◦ Quantifying disease occurrence
Incidence versus prevalence
Identifying the population
Working with small area data
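As a recap of the small-area measures covered in the last section, the sketch below computes a crude SMR (observed/expected), a shrunken SMR, and a directly age-standardized rate. It is illustrative only and makes several assumptions that are not in the slides: the function names are hypothetical, all input numbers are invented, and the shrinkage weight `expected / (expected + prior_strength)` is one simple Poisson-gamma style choice with a hypothetical tuning constant `prior_strength`. It is not necessarily the smoothing method used for the pellagra maps.

```python
def smr(observed, expected):
    """Crude standardized mortality ratio: observed deaths / expected deaths."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


def shrunk_smr(observed, expected, reference_smr, prior_strength):
    """Pull a crude SMR toward a reference value (e.g. the mean of surrounding areas).

    The weight on the crude SMR grows with the expected count, so large areas
    keep nearly their observed ratio and small areas are shrunk more strongly.
    `prior_strength` is an assumed tuning constant, not a value from the slides.
    """
    weight = expected / (expected + prior_strength)
    return weight * smr(observed, expected) + (1 - weight) * reference_smr


def age_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates; weights are the standard
    population's proportions in the corresponding age groups."""
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("need one standard-population count per age group")
    total = sum(standard_population)
    return sum(rate * count / total
               for rate, count in zip(age_specific_rates, standard_population))


# Two invented counties with the same crude SMR of 2.0 but very different sizes.
small = shrunk_smr(observed=4, expected=2, reference_smr=1.2, prior_strength=10)
large = shrunk_smr(observed=400, expected=200, reference_smr=1.2, prior_strength=10)
print(f"small county: crude 2.00 -> smoothed {small:.2f}")
print(f"large county: crude 2.00 -> smoothed {large:.2f}")

# Invented age-specific rates per 100,000 and standard-population counts.
rate = age_standardized_rate([50.0, 200.0, 1200.0], [5_000, 3_000, 2_000])
print(f"age-standardized rate: {rate:.1f} per 100,000")
```

With these inputs the small county's ratio moves most of the way toward the reference value while the large county's barely changes, which is the "borrowing strength" behaviour described under Shrinkage Estimation.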