R15Description2 - Portland State University

Principal Investigator/ Program Director: Emch, Michael, Edward
Introduction
This project will model spatio-temporal fluctuations of cholera in Bangladesh and Vietnam by
integrating spatial data sets including satellite imagery, climatic variables, and socio-demographic data.
Lobitz et al. (2000) suggested that cholera is influenced by climatic changes, which can be indirectly
measured using satellite imagery. They illustrated that sea surface temperature (SST) and sea surface
height (SSH) in the Bay of Bengal were associated with temporal fluctuations of cholera in Dhaka,
Bangladesh from 1992 to 1995. In this proposed study we will expand their model by: (1) including
several more satellite-derived biophysical variables in three additional study areas; (2) investigating how
temporal associations with satellite-derived biophysical variables vary in space (i.e., between and within
study areas); and (3) using satellite imagery to model changes in estuaries, the postulated environmental
reservoir for cholera. The specific variables that we will incorporate into the model of the spatio-temporal
distributions of cholera include SSH derived from the TOPEX/Poseidon satellite; SST derived from
Advanced Very High Resolution Radiometer (AVHRR), ADEOS Ocean Color and Temperature Scanner
(OCTS), Terra Moderate-resolution Imaging Spectroradiometer (MODIS), and Aqua MODIS;
chlorophyll concentration derived from the Coastal Zone Color Scanner (CZCS), OCTS, SeaWiFS, Terra
MODIS, and Aqua MODIS; flooding derived from Radarsat and the European Remote Sensing (ERS)
satellite; land use/ land cover (LULC) derived from the Landsat Multispectral Scanner (MSS), Thematic
Mapper (TM), Enhanced Thematic Mapper (ETM+), and ASTER sensors; monthly temperature and
rainfall from weather stations; and population distribution and socio-economic status from spatially
referenced demographic databases. Associations between these variables and cholera incidence in
Bangladesh and Vietnam can be used to predict future epidemics in other parts of the world. This project
is a unique interdisciplinary opportunity to merge geographic, epidemiological, and ecological theories
and methods for infectious disease research.
While the emergence and fluctuation of cholera are not well understood, it is clear that where,
when, and how many people contract cholera is related to distributions of environmental and climatic
variables. People contract cholera when they ingest an infective dose of Vibrio cholerae bacteria, which
have been shown to be present in estuaries, ponds, lakes, and rivers. Fluctuations in bacteria populations
depend on environmental conditions such as water temperature and plankton concentrations. Identifying
the niches of cholera requires defining the areas of increased risk of contracting the disease. This study
will use geographic information technologies including satellite remote sensing to model environmental,
climatic, and socio-demographic dynamics, and will compare these distributions with fluctuations in the
spatio-temporal distributions of cholera in Bangladesh and Vietnam. In Vietnam, cholera case data will
be compiled from hospital records from 1980 to 2003. In Bangladesh, cholera case data will be compiled
from treatment facility records from 1983 to 2003. The locations of all cholera cases will be mapped and
integrated with the environmental, climatic, and socio-demographic variables that will be derived from
the satellite sensors, local population census databases, and other secondary data sources. The main
research question for this project is: What are the spatio-temporal associations between cholera incidence
and satellite-derived environmental variables (i.e., chlorophyll concentration, SST, SSH, rainfall, LULC,
flooding), climatic variables (i.e., in situ rainfall, temperature), and socio-demographic variables (i.e.,
population density, socio-economics)? The study areas include Hue and Nha Trang, Vietnam and Matlab,
Bangladesh.
The broader impact of this study is both theoretical and methodological. It will describe how
satellite imagery and other spatial information can be used to predict cholera outbreaks. Thus, the project
will build upon recent literature describing the use of remote sensing in health research (Epstein, 1998;
Beck et al., 2000). We will report how much of the spatio-temporal variation of cholera can be explained
using the aforementioned predictor variables in Bangladesh and Vietnam. We will investigate whether
there are patterns to relationships in time and space. In other words, we will determine whether the
relationships in a particular time and place (e.g., 1992-95 in Dhaka) are also significant and informative in
other times and places. The results of the proposed study will also serve as methodological case studies
that describe how specific satellite image products can be used to predict cholera in areas that have had
many severe cholera epidemics during their respective study periods. Satellite imagery will also be used
to model how local and regional ecosystems are changing and how changes in the different ecosystems
affect humans (i.e., cholera distributions). The findings can then be used to extrapolate to other areas of
the world where cholera epidemics might occur in similar ecosystems. During the present cholera
pandemic, cholera has spread to similar ecosystems in Asia, Africa, and South America. Ali et al. (2002a)
predicted that the next cholera pandemic, caused by a new genetic variant of cholera (i.e., Vibrio cholerae
O139), might also spread to similar ecosystems around the world. This study is a collaborative effort
between multidisciplinary investigators at the International Centre for Diarrhoeal Disease Research,
Bangladesh (ICDDR,B), the Vietnam National Institute of Hygiene and Epidemiology (NIHE), the
International Vaccine Institute (IVI) and Portland State University (PSU).
Theoretical Context
In order to advance the philosophical and theoretical implications of this study it is necessary to
situate it within a theoretical context. In a recent speech to the Royal Swedish Academy of Science,
Professor Rita Colwell described her vision of how science will become more interdisciplinary and
suggested that there is a need for new frameworks of inquiry (Colwell, 2002a). Her concept of
biocomplexity is pertinent to this study as it “denotes the study of complex interactions in biological
systems, including humans, and their physical environments.” She believes that old science relied too
much on a reductionist approach and that new frameworks should be integrative and use modern tools
such as remote sensing and information technologies. Professor Colwell developed many of these
concepts using cholera as an example (Colwell, 2002b). She stated that “the cholera story, still to be fully
unraveled, embraces environmental factors from the cellular level to the scale of global climate.” She
defined biocomplexity as “the dynamic web of interrelationships that arise when living entities at all
levels, from genes to human beings to ecosystems, interact with their environment.” Her case example of
cholera calls for a dynamic view of the disease looking at the complex and dynamic interactions between
environment, host (humans), and disease agents (Figure 1). She argues that real world phenomena such
as cholera distributions have interactions across many scales, and that many variables concerning the
interactions of humans with their environment must be considered to understand this complex disease.
[Diagram linking Environment, Agent, and Host; labels: longevity & infectivity; distribution & transport; altered selective pressures; pathogenicity; immune response; host specificity; nutrition; hygiene; treatment; housing. After Colwell, 2002a.]
Figure 1: Colwell’s Biocomplexity Theory Applied to Cholera
This project can be situated within the human-environment tradition of geography. Theoretical
concepts of human-environment connections have evolved within the field of geography for more than a
century. Recent theoretical progress within the human-environment tradition of geography parallels other
ecological fields. New ecological theories assume that there are non-equilibrium conditions in
environment and change processes and that there are significant human impacts on natural environments
in areas that were once thought to be natural (Zimmerer, 1994; Zimmerer and Young, 1998). There is a
growing understanding that environmental change and its causes vary across settings; thus, research
involving human-environment interaction must be multifaceted, multiscalar, and multitemporal. The
proposed study uses a holistic approach to investigate the distribution of cholera in space and time.
Turner (2002) recently suggested that the field of geography should become a “human-environment
science” and that inquiry of “coupled natural-human systems” is a logical division of the systematic
sciences. Medical geographers have been developing geographic theory that is aligned with both Turner’s
definition of what geography should be in the future and Colwell’s philosophy of a future science.
The medical geographic theoretical approach of disease ecology maintains that disease results
from a dynamic complex of variables that coincide in time and space (May, 1958, 1977; Mayer, 1982,
1984, 2000; Mayer and Meade, 1994; Meade, 1977; Meade et al., 1988; Meade and Earickson, 2000;
Learmonth, 1988; Paul, 1985; Pyle, 1977, 1979). Hunter (1974) argues that we must not have a
pathogen-centric view of disease, i.e., one that focuses only on the disease agent. He suggests that our
studies of disease "must co-jointly involve pathogen, host, and environment" (Hunter, 1974). He views
environment broadly as consisting of "diverse physical, biological, social, cultural, and economic
components" (Hunter, 1974). Hunter defines geography as a discipline that bridges the social and
environmental sciences and writes that "its integration and coherence derive from systems-related analysis
of man-environmental interactions through time and over space" (Hunter, 1974). This medical
geographic approach is holistic, recognizing that one must investigate the integration of many different
types of variables responsible for disease. While types of variables to be investigated have been classified
in many different ways, Mayer's (1986) classification system is most useful. Mayer differentiated
between biological, socioeconomic, behavioral, and environmental variables. Biological variables are
those that describe biological characteristics of the host (e.g., blood type). Behavioral variables are those
that describe individual or group behaviors and may be related to culture or individual decision making
(e.g., what types of food people eat). Environmental variables are those of the biophysical environment
(e.g., climatic variables). Socioeconomic variables are variables that affect the coincidence of agent and
host (e.g., wealth or class). Different patterns of socioeconomic, biological, and environmental variables
result in different spatial and temporal patterns of disease. Virtually every disease exhibits spatial and
temporal variation and medical geographers attempt to explain this variation. This study goes beyond our
previous medical geography/ spatial epidemiology work on cholera (Emch, 1999, 2000; Emch and Ali,
2001, 2003; Ali et al., 2002a, 2002b, 2002c, 2002d) since it describes the use of satellite imagery for
measuring environmental variables and how they can be used to predict disease. It also investigates how
the spatio-temporal distributions of cholera vary within and between ecosystems, and how these
relationships have changed over a 20-year period, during which there have been significant changes in the
environment, the human dimension, and the genetic strain of the disease agent.
Literature Review
Cholera is an acute infection caused by the colonization and multiplication of Vibrio cholerae O1
or O139 within the human small intestine. The incubation period ranges from one to five days and the
disease is characterized by watery diarrhea, muscle cramps, vomiting, and dehydration. Vibrios are
water-borne organisms that are natural inhabitants of seas, estuaries, brackish waters, rivers, and ponds of
coastal areas of the tropical world. They flourish in the dense organic matter, algae, and zooplankton of
the Ganges delta and similar ecosystems. Lipp et al. (2002) offered a hierarchical model for
environmental cholera transmission that includes abiotic factors, phytoplankton, and zooplankton leading
to human ingestion of an infective dose of V. cholerae. Abiotic conditions including temperature, pH,
Fe3+, salinity, and sunlight influence vibrio growth and expression of virulence genes such as those that
regulate cholera toxin (responsible for watery diarrhea) (Lipp et al., 2002; Faruque et al., 1998). These
abiotic factors influence phytoplankton and aquatic plants, which promote survival of V. cholerae and
provide food for zooplankton. V. cholerae proliferate in an environment that includes commensal
copepods and crustaceans because they provide attachment sites for vibrios to multiply and serve as a
vector to transmit an infective dose to humans (Huq and Colwell, 1995). In their model, Lipp et al.
(2002) also note that there are other influences on cholera transmission including climate variability (i.e.,
climate change, El Niño-Southern Oscillation [ENSO], North Atlantic Oscillation), seasonal effects (i.e.,
sunlight, temperature, precipitation, monsoons), and human dimensions (i.e., socioeconomics,
demographics, and sanitation). Pascual et al. (2000) found that the temporal variability of cholera is
associated with three inter-related climate variables including upper troposphere humidity, cloud cover
and top-of-atmosphere absorbed solar radiation. Rodo et al. (2002) suggest that because of ENSO-related
climate variability patterns, there is a 4-year periodicity to the temporal cycle of cholera. The theoretical
basis for the Lobitz et al. (2000) study and for the cholera prediction model that we will build is as
follows. Increased SST facilitates phytoplankton growth and therefore commensal copepods that eat the
phytoplankton flourish (Kiørboe, 1994; Huq and Colwell, 1996). Lobitz et al. (2000) suggest that there is
a relationship between SST and phytoplankton concentrations and that this is the reason for the
relationship with cholera. After an initial lag period, V. cholerae proliferate and are subsequently
transmitted to humans. Lobitz et al. (2000) argue that SSH is related to human-Vibrio contact because it
causes tidal intrusion of plankton, which transports the bacteria into inland waters. They found a direct
temporal relationship between SSH and cholera outbreaks. In our proposed study we will test whether
SSH is related to cholera in our three study areas and use satellite-derived chlorophyll levels in water as a
surrogate for phytoplankton as well as SST. We will investigate these phenomena at a much more
detailed spatial scale than did Lobitz et al. (2000), in several different environments, over a much longer
time period, and using several more variables including those that incorporate humans in the cholera
ecosystem such as socioeconomics and demographics. The human dimension is essential, enabling us to
describe the human circumstances in which prediction of cholera based on biophysical variables will
work.
In order to contract cholera, a person must ingest an infective dose of V. cholerae, which is about 10^6
bacteria. Cholera is transmitted either through food or through fecally contaminated water (Hughes et al.,
1982; Khan et al., 1981; Spira et al., 1980). Ingestion can occur if a flood overruns the sewage system or
non-septic latrines, thus contaminating water supplies. It can also occur when people swallow contaminated
water while bathing or washing, drink it, or cook with it untreated (Figure 2).
Figure 2: Major Transmission Routes of Cholera (modified from Mintz et al., 1994)
The large number of bacteria excreted in the feces of infected people can cause massive environmental
pollution. Poor sanitation facilitates the fecal-oral transmission process. Aquatic reservoirs also facilitate
cholera transmission by providing long-term natural habitats for the pathogen; the consumption of water
or food from such reservoirs puts humans at risk of infection (Figure 2). V. cholerae can survive and
grow in food when conditions are suitable, including low temperatures, high moisture levels, high
organic content, and near-neutral pH (DePaola, 1981; PAHO, 1991). V. cholerae can survive on most
foods for 2 to 14 days (PAHO, 1991).
Cholera transmission can be divided into primary and secondary types (Colwell and Spira, 1992).
Primary cases are the result of infection by surface water sources; for example, a person is directly
infected with the bacteria by drinking untreated pond water or eating undercooked shellfish. Secondary
cases are people infected by fecal-oral transmission from other people; for example, a healthy family
member is infected by a sick family member who puts his/her hands in the family's drinking water pot.
Another example of secondary transmission is when a mother is infected by the feces of her baby.
Primary transmission is controlled by factors such as temperature, salinity, nutrient concentrations, the
number of available attachment sites (plankton), shellfish consumption, and contact with water (Colwell
and Spira, 1992). Emch (1999) found that sanitation and water availability and use are extremely
important in the effort to reduce secondary cholera transmission. Sommer and Woodward (1972) found
an inverse relationship between diarrhea and access to tube well water. Khan (1981) found that people
were more likely to contract cholera if they had greater access to canal water compared with river or pond
water. Glass et al. (1982) reported higher cholera incidence rates in villages that are not adjacent to
rivers. Hughes et al. (1982) found that people who used contaminated surface water for cooking and
bathing were more likely to contract cholera than those who did not. Several studies have found that risk
of diarrheal diseases is associated with environmental variables. Emch (1999, 2000) found that cholera is
associated with flood control projects. A recent study found that the influx of fresh water from rainfall
events upstream from an estuary led to increases in vibrio populations (Colwell, 2002a, cited study by
Valerie Louis).
Before 1963, classical Vibrio cholerae O1 was the dominant strain of cholera in Bangladesh. A
new strain, Vibrio cholerae O1 El Tor, was initially recognized as a mild cholera-like disease in an
Indonesian village in 1937 and was confined there for approximately two decades (Burua, 1992). In 1959,
however, it was detected in Thailand, and by 1963 it had spread to India and Bangladesh in pandemic
form. It then spread throughout the world, attacked more than half a million people, and claimed 5000
lives within eighteen months (Epstein, 1997). The recent history of the disease suggests that classical
cholera was entirely replaced by El Tor. The replacement occurred because the environmental niches
where the vibrios live are the same for the two strains (Ali et al., 2002a). They share similar ecological
environments and patterns of transmission (Shears, 1994). In 1992, a new strain, Vibrio cholerae O139,
emerged in India and subsequently began to spread to Bangladesh and neighboring countries in 1993
(Shimada et al., 1993; Cheasty et al., 1993; Attapattu, 1994; Tay et al., 1994; Sachdeva et al., 1995;
Siddique et al., 1996; Dalsgaard et al., 1996, 1998). Vibrio cholerae O139 has spread even more rapidly
than did O1 El Tor. This raises the question of whether El Tor will be replaced by O139 in the future.
While the symptoms of classical, El Tor, and O139 cholera are similar in Bangladesh, there are some
differences in the seasonal cycles of the different strains; for example, O139 cases appear later in the
post-monsoon period (personal communication, M. Yunus, 2003).
In Bangladesh, cholera transmission is seasonal with a peak after the monsoon, extending from
September to December (Emch, 1999). Baqui et al. (1992) identified two cholera peaks, one sometime
between September and December, and the other between March and June. Colwell and Spira (1992)
suggested that the post-monsoon epidemic is associated with a heavy bloom of zooplankton; they
postulated that there is a permanent environmental reservoir for Vibrio cholerae in the brackish ponds and
canals of rural Bangladesh. Oppenheimer et al. (1978) reported that zooplankton populations decrease
during the monsoon season and then increase after the monsoon because of phytoplankton bloom. Emch
and Ali (2001) described the temporal cycle of cholera epidemics in Matlab, Bangladesh (Figure 3). They
found that during a three-year study period (1992 through 1994) there were three main cholera peaks in
September, October, or November. The 1992 epidemic was far less severe than the 1993 and 1994
epidemics. Secondary epidemics occurred in March and April in all three years; however, the 1992
epidemic was more severe than the other two years. Cholera cases were completely absent near the
beginning of each year.
[Time-series plot of cholera (vertical axis, 0 to 90) with date ticks every two months from 1/1/92 to 11/1/94; peaks are labeled "Pre" and "Post".]
Figure 3: Temporal Distribution of Cholera in Matlab, Bangladesh
Specific Research Objective, Questions, Hypotheses
The objective of this study is to investigate the spatio-temporal dynamics of cholera in
Bangladesh and Vietnam and to develop a cholera prediction model. We will answer the following
research questions:
1. What are the spatio-temporal associations between cholera incidence and satellite-derived
environmental variables (i.e., chlorophyll concentration, SST, SSH, rainfall, LULC, flooding),
climatic variables (i.e., in situ rainfall, temperature), and socio-demographic variables (i.e.,
population density, socio-economic status)?
2. How do associations between cholera and satellite-derived biophysical, climatic, and socio-demographic
variables vary in space (i.e., between and within study areas), time (i.e., during
the 20-year longitudinal study period), and by cholera strain (i.e., classical, El Tor, O139)?
3. Are changes in estuaries (i.e., turbidity, chlorophyll concentration) and areas around estuaries
(i.e., LULC) related to cholera incidence?
We will test the following hypotheses in each of the study areas:
1. There is a relationship between satellite-derived SSH and cholera incidence.
2. There is a relationship between satellite-derived SST (time lagged) and cholera incidence.
3. There is a relationship between satellite-derived chlorophyll concentration (time lagged) and
cholera incidence.
4. There is a relationship between satellite-derived flooding and cholera incidence.
5. There is a relationship between in situ and satellite-derived rainfall and cholera incidence.
6. The aforementioned relationships between cholera and environmental and climatic variables
(hypotheses 1-5) are influenced by demographic and socioeconomic distributions (i.e., the
relationships in the physical world vary across socio-demographic situations).
7. The model of spatio-temporal fluctuations of cholera is neither constant nor linear in time
and/or in space.
8. The model of spatio-temporal fluctuations of cholera is not constant by dominant cholera
strain (classical, El Tor, O139).
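The time-lagged hypotheses (2 and 3) can be illustrated with a minimal sketch. It assumes monthly series and a simple Pearson correlation evaluated at several candidate lags; the proposal does not specify the statistic, so this is one possible way to screen for a lag, not the study's method. All values below are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def lagged_correlation(predictor, cases, lag):
    """Correlate the predictor at month t with cases at month t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(predictor, cases)
    return pearson(predictor[:-lag], cases[lag:])

def best_lag(predictor, cases, max_lag):
    """Return (lag, r) with the largest absolute correlation for lags 0..max_lag."""
    results = [(lag, lagged_correlation(predictor, cases, lag))
               for lag in range(max_lag + 1)]
    return max(results, key=lambda item: abs(item[1]))

# Hypothetical monthly SST values (degrees C), invented for illustration.
sst = [26.0, 27.5, 29.0, 28.0, 26.5, 27.0, 29.5, 28.5, 26.0, 27.5, 29.0, 27.0]
# Synthetic case counts constructed to follow SST exactly two months later.
cases = [10, 12] + [int(4 * (t - 25)) for t in sst[:-2]]

lag, r = best_lag(sst, cases, max_lag=4)
print(f"strongest association at lag {lag} months (r = {r:.3f})")
```

Because the synthetic case series was built from SST shifted by two months, the scan recovers a two-month lag. Real series would also need attention to seasonality and autocorrelation before any lag is interpreted.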
Study Data
Table 1 summarizes the source and availability of each environmental, climatic, and socio-demographic
variable. The cholera incidence data are described below in each study area description.
Different data are available for different time periods. For instance, at the beginning of the study period,
in 1983, the only available satellite data are intermittent CZCS images for chlorophyll concentration,
AVHRR for SST, and Landsat MSS for LULC around estuaries. At the end of the study period in 2003,
almost all of the satellite data sources are available and the sensors generally have much better spatial and
spectral resolutions and therefore provide more specific information to input into the models. Also, in
Bangladesh, cholera incidence can be mapped at the extended household unit level throughout the study
period because both the cholera numerator and population denominator are available. However, in both
Vietnamese study areas only recent cases can be mapped at the household level; cases before 1990 must
be mapped by local-level administrative unit (hamlet).
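As a rough illustration of how a case numerator and a population denominator combine into incidence for a mapping unit (an extended household or a hamlet), the sketch below computes cases per 1,000 population. The unit identifiers, populations, and the function name are hypothetical and are not taken from the study databases.

```python
from collections import Counter

def incidence_per_1000(case_unit_ids, population_by_unit):
    """Cases per 1,000 population for each spatial unit.

    case_unit_ids: one unit identifier per recorded case (the numerator).
    population_by_unit: mapping of unit identifier to population (the denominator).
    """
    counts = Counter(case_unit_ids)
    unknown = set(counts) - set(population_by_unit)
    if unknown:
        raise KeyError(f"cases reported for units without population data: {sorted(unknown)}")
    rates = {}
    for unit, population in population_by_unit.items():
        if population <= 0:
            raise ValueError(f"unit {unit!r} has non-positive population")
        # Units with no recorded cases get a rate of 0.0 rather than being dropped.
        rates[unit] = 1000 * counts.get(unit, 0) / population
    return rates

# Hypothetical units and populations, for illustration only.
population = {"bari_001": 40, "bari_002": 25, "hamlet_A": 500}
case_records = ["bari_001", "bari_001", "hamlet_A"]
rates = incidence_per_1000(case_records, population)
print(rates)
```

Units without cases are kept with a rate of zero, so that areas free of cholera remain visible when the rates are mapped.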
The Bangladesh Study Area: Matlab
The research site for the ICDDR,B and for this project is called Matlab because the Centre's
hospital is located in Matlab Town. Matlab is in south-central Bangladesh, approximately 50 kilometers
south-east of Dhaka, adjacent to where the Ganges River meets the Meghna River forming the Lower
Meghna River. Figure 4 shows the Matlab study area relative to the Meghna River. The river flowing
adjacent to Matlab Town is the Dhonagoda River.
[Map labels: Meghna River, Matlab, Study Area]
Figure 4: Study area superimposed on Landsat TM satellite image.
Variable | Data Source and Availability

Environmental Independent Variables
Chlorophyll Concentration in Water | CZCS (1978-86 intermittent), ADEOS OCTS (1996-97), SeaWiFS (1997-), Terra MODIS (1999-), Aqua MODIS (2002-)
Sea Surface Temperature | AVHRR (1979-), Terra MODIS (1999-), Aqua MODIS (2002-)
Sea Surface Height | TOPEX/Poseidon (1992-)
LULC Change In and Around Estuaries | Landsat MSS (1972-), TM (1986-), ETM+ (1999-March 2003), ASTER (1999-), Hyperion (2000-)
Water Turbidity | Landsat MSS (1972-), Landsat TM (1986-), ETM+ (1999-March 2003), AVHRR (1979-), ASTER (1999-)
Flooding | ERS (1992-), Radarsat (1996-)
Distance from Water Bodies/ Flooding | Landsat MSS (1972-), Landsat TM (1986-), ETM+ (1999-March 2003), AVHRR (1979-), ASTER (1999-), ERS (1992-), Radarsat (1996-), GIS Analysis

Climatic Independent Variables
Monthly Rainfall | Weather stations (Bangladesh- Chandpur; Vietnam, Hue & Nha Trang; Mozambique- Beira and Quelimane)
Monthly Temperature | Weather stations (Bangladesh- Chandpur; Vietnam, Hue & Nha Trang; Mozambique- Beira and Quelimane)

Socio-demographic Independent Variables
Population Distribution | Demographic surveillance systems, vaccine trial databases, and/or census combined with GIS (availability varies by time- see study area descriptions)
Socioeconomic Distribution | Demographic surveillance systems, vaccine trial databases, and/or census combined with GIS (availability varies by time- see study area descriptions)

Table 1: Source and Availability of Independent Variables
(Note: Appendices 1 & 2 list satellite data product attributes, sources, and costs in greater detail)
A demographic surveillance system (DSS) has recorded all vital events of the study area population since
1963; the study area population has been approximately 200,000 since that time. The database is the most
comprehensive longitudinal demographic database of a large population in the developing world. The
people of the study area live in clusters of patrilineally-related groups of households called baris. The P.I.
created a vector GIS database of the Matlab field research area (Emch, 1995; Emch, 1998; Emch, 1999;
Ali et al., 2001a). Features in digital format include baris, rivers, health facilities, and a flood-control
embankment. Figure 5 shows three features in the GIS database including the flood-control embankment,
the Dhonagoda River, and baris.
Figure 5 Study area GIS database.
The three map views in Figure 5 are displayed at different scales. The map view on the far right has the
individual bari identification numbers visible. The baris are all identified by an ICDDR,B DSS census
number within the structure of the GIS database. This allows us to link attribute data to the spatial
database. In turn, demographic, disease, and other data can be linked to specific bari locations. The
Matlab field research center has in- and out-patient services, a medical laboratory, and research facilities.
One hundred twenty community health workers (CHWs) visit each household area every two weeks to
collect demographic, morbidity, and other data. The DSS conducts periodic censuses and uses CHWs to
update demographic data (e.g., births, deaths, and migrations).
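The linkage between attribute data and bari locations described above amounts to a join on the census number. The following is a minimal sketch of that idea using plain dictionaries; the identifiers, coordinates, field names, and function name are all hypothetical and do not reflect the actual structure of the Matlab GIS or DSS databases.

```python
def link_cases_to_locations(cases, bari_locations):
    """Attach coordinates to each case record via its bari census number.

    Returns (linked, unmatched): linked records gain "x" and "y" keys;
    unmatched records are returned separately so they can be reviewed.
    """
    linked, unmatched = [], []
    for case in cases:
        location = bari_locations.get(case["bari_id"])
        if location is None:
            unmatched.append(case)
        else:
            x, y = location
            linked.append({**case, "x": x, "y": y})
    return linked, unmatched

# Hypothetical census numbers and projected coordinates (meters).
bari_locations = {
    "B0001": (512300.0, 2584100.0),
    "B0002": (512950.0, 2583400.0),
}
# Hypothetical case records; the last one has no matching bari.
cases = [
    {"case_id": 1, "bari_id": "B0001"},
    {"case_id": 2, "bari_id": "B0002"},
    {"case_id": 3, "bari_id": "B9999"},
]
linked, unmatched = link_cases_to_locations(cases, bari_locations)
print(len(linked), "linked;", len(unmatched), "unmatched")
```

Returning unmatched records separately, rather than discarding them, makes it possible to check identifier errors before incidence is mapped.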
The Vietnam Study Areas: Nha Trang and Hue
The Vietnam case study areas are Nha Trang and Hue, both of which are in coastal regions (Figure
6). Figure 7 is a map of several features of Nha Trang including administrative units (communes), rivers,
lakes, roads, railroads, and the locations of the two estuaries in the study area. Cholera caused by Vibrio
cholerae O1 El Tor first appeared in Vietnam in 1964. Cholera case data will be compiled from hospital
records from 1980 to 2003 and from a household-level vaccine trial database from 1995 to 2003. While
there are presently many cases of cholera in Hue (approximately 200 laboratory confirmed cases in one
part of Hue in 2003), there have been no cases in Nha Trang since 2000; the reason for the disappearance
of cholera is unknown and is puzzling in light of the current socio-environmental conditions. To
prepare for this proposed study, in September 2003 the P.I. took a boat into the northern estuary of Nha
Trang (Figure 8). At the red dot there was a dense slum with approximately 15 latrines hanging over the
water. The black dots are the individual households, mapped using global positioning system (GPS)
receivers as part of a census that was done for the vaccine trial. There were small children swimming in
the water near the hanging latrines; thus, the socio-economic and sanitation environment was perfect for
cholera, yet there has not been a case for several years. Cholera transmission will not occur unless the
environmental situation is right for the disease. Since Hue and Nha Trang possess similar socio-demographics,
some as yet undetected difference must exist that is responsible for the different cholera
patterns. One obvious difference between the two study areas is that Hue is farther up the estuary and the
feeder river is much larger (Figure 9). However, there may be other differences that can be determined
using satellite imagery.
The locations of cholera cases will be mapped at different levels. This study will derive cholera
incidence from commune health center records and the population database that is being created jointly by
the NIHE and IVI for vaccine trials and disease burden studies in Hue and Nha Trang. A detailed
population census has been conducted for the 300,000 persons living in Nha Trang, and the locations of
each of the 40,000 households have been mapped using GPS receivers. A similar spatial and population
database has been collected in Hue of 285,000 persons living in 56,000 households. Cholera cases will be
derived from hospital records.
Figure 6: Study Area Locations
Figure 7: Nha Trang GIS Database
Independent Variables
The project will begin with a comprehensive search (via various web
and ftp sites) for all of the available cloud-free satellite images for the three study areas during the entire
study period. First, we will collect and/or model the data for the independent variables that are
hypothesized to be related to cholera distributions; next we will divide the data into spatial units for
statistical analysis. The proposed study investigates local statistical relationships (Fotheringham et al.,
2002) and thus we will divide variables into spatial subsets. Lobitz et al. (2000) measured the temporal
relationship between cholera and SST and SSH at one point in the Bay of Bengal. We will base our
analyses on many points and/or average values within or adjacent to cholera data collection units. Some
variables are continuous (e.g., SST, SSH, chlorophyll); however, other data sets, including those collected
from weather stations, are not available as spatially-continuous distributions. Therefore, these variables
will be incorporated into the models only as temporally varying distributions. Some of the independent
variables listed in Table 1 do not require any preprocessing because they do not vary in space (i.e., the
monthly temperature and rainfall from weather stations). Others require minimal data processing by the
investigators because they are collected as derived variables. The methods used to compute the derived
variables are too complex to describe here; they are
provided in detail in the following sources (Strong et al., 1984; McClain et al., 1985; Esaias et al., 1998;
Mitchum, 1998; O'Reilly et al., 1998; Walton et al., 1998; Brown and Minnett, 1999; Chambers et al.,
2003). The variables that will be derived from secondary sources include AVHRR- and MODIS-derived
SST (Figure 10) and chlorophyll concentration derived from CZCS, OCTS, SeaWiFS, Terra MODIS, and
Aqua MODIS (Figure 11). These datasets will be georeferenced, a lengthy process because of the
large number of satellite images.
Figure 8: Study Area Locations
Figure 9: Nha Trang GIS Database
Some independent variables will require a significant amount of preprocessing work to derive the
predictor variables from the satellite imagery or primary data sources. These variables include water
turbidity and LULC (derived from Landsat MSS, TM, ETM+, AVHRR, and ASTER) as well as flooding
(derived from Radarsat and ERS) (Figure 12). The satellite-derived data have various spatial resolutions
including ≤30 meters (e.g., multispectral bands of Landsat TM and ETM+ and radar imagery), 79 meters
(e.g., Landsat MSS), 250 meters (e.g., MODIS-derived chlorophyll and SST), 1.1 kilometers (e.g.,
AVHRR-derived SST), and 4 kilometers (i.e., SeaWiFS) (Figure 11). Measuring relationships between
cholera and independent variables collected at various spatial resolutions requires that the images be
resampled to a common resolution. The dependent variable (i.e., cholera incidence) will also need to be
calculated at the same resolution. The models may be built at different resolutions for different times
because of the availability of different satellite data. For example, for times with only SeaWiFS data to
describe chlorophyll distributions, it is not possible to build a model that predicts cholera distributions
at resolutions finer than 4 kilometers. Since 1999, when Terra MODIS was launched, chlorophyll concentration and
SST have been available at a spatial resolution of 250 meters¹ (Figure 10). Once all of the images are
georeferenced, they will be integrated and resampled to a common spatial resolution within geographic
information system (GIS) software.
¹ In 2002, Aqua MODIS was launched; thus, two SST and chlorophyll scenes of the same
area are collected each day, Terra MODIS in the morning and Aqua MODIS in the afternoon.
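The resampling step described above can be illustrated with a minimal sketch. It assumes the simplest case, in which a fine grid is aggregated to a coarser one by averaging whole blocks of pixels (e.g., a factor of 16 would take 250-meter pixels to 4 kilometers). The function name `block_mean`, the use of `None` for missing pixels, and the grid values are hypothetical; the proposal itself specifies only that resampling will be done within GIS software.

```python
def block_mean(grid, factor):
    """Aggregate a 2-D grid to a coarser resolution by averaging blocks.

    `grid` is a list of equal-length rows; None marks a missing pixel
    (e.g., cloud cover). Each output cell is the mean of the valid pixels
    in one factor x factor block. Trailing rows or columns that do not
    fill a whole block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            valid = [v for v in block if v is not None]
            row.append(sum(valid) / len(valid) if valid else None)
        coarse.append(row)
    return coarse


# Hypothetical 4 x 4 fine-resolution grid (values are made up).
fine = [
    [1.0, 3.0, 2.0, 2.0],
    [5.0, 7.0, None, 2.0],
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 4.0, None, None],
]
coarse = block_mean(fine, 2)
```

Averaging is only one possible resampling rule; categorical layers such as LULC would more plausibly use a majority rule instead.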
Figure 10: (A) Terra MODIS Derived Chlorophyll (B) Terra MODIS Derived SST
Figure 11: Chlorophyll (A) 2002 SeaWiFS of Vietnam (B) 1980 CZCS Asia and Africa (C) 1997 OCTS Bangladesh
Landsat and other satellite imagery will be used to measure ecological change in and around the
estuaries for as many dates as possible during the study period. Figure 13 is a 1997 Landsat TM image of
the southern estuary in the Nha Trang, Vietnam study area (also see Figure 7 for study area map).
Figure 12: ERS Image of the Matlab Study Area During the 1998 Monsoon
Figure 13: Landsat TM Image of Nha Trang
The left map view shows the region around Nha Trang, and the right map is a magnified view of the
estuary. Using image-processing software, the satellite data will be used to measure environmental
changes in natural vegetation, anthropogenic features, and water quality (i.e., turbidity and floating
vegetation). Satellite imagery from the aforementioned sensors will be used to classify different LULC
classes that may be related to cholera incidence. The images will initially be radiometrically corrected so
that all scenes are comparable. We will use the dark-object subtraction algorithm as described in Song et
al. (2001). The imagery will then be classified into LULC classes. First, we will develop a classification
scheme that will include such coarse classes as forest, water, soil, impervious surface, agricultural land,
aquaculture areas, and grassland. Then, we will attempt to model more detailed classes such as specific
crops. We will use a variety of image processing methods including traditional hard classifiers (e.g.,
maximum likelihood) as well as knowledge-based schemes (i.e., schemes that rely on GIS data). While we do not
have ancillary data that will allow us to use accurate training data in the classification of the retrospective
satellite imagery, we will collect ground-truth data through field visits. These field visits will involve
both collecting training data for modern features (that may or may not have changed during the study
period) and asking questions of land holders to determine what the LULC was in the past. We will show
the land holders the satellite images and ask them if they remember what the LULC class was when the
imagery was acquired.
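The radiometric correction step mentioned above can be illustrated with a minimal sketch of the simplest form of dark-object subtraction, in which the darkest value in each band is assumed to represent atmospheric haze and is subtracted from every pixel. Song et al. (2001) discuss several variants, and this sketch does not claim to reproduce the one the study will use. The function name and the digital numbers are hypothetical.

```python
def dark_object_subtract(band):
    """Subtract the darkest pixel value in `band` from every pixel.

    `band` is a list of rows of digital numbers for one spectral band.
    The minimum value is treated as additive atmospheric path radiance,
    so after subtraction the darkest object has a value of zero.
    """
    dark = min(min(row) for row in band)
    return [[dn - dark for dn in row] for row in band]


# Two hypothetical scenes of the same surface; scene_b has 8 more units
# of additive haze than scene_a (values are made up).
scene_a = [[12, 40], [55, 90]]
scene_b = [[20, 48], [63, 98]]
corrected_a = dark_object_subtract(scene_a)
corrected_b = dark_object_subtract(scene_b)
```

After correction the two scenes are identical, which is the sense in which the scenes become comparable across dates.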
Several environmental variables will be modeled using multitemporal satellite imagery followed
by GIS analysis. There are numerous canals and rivers in the study areas, which
are often connected to open latrines. We assume that the use of these water bodies for bathing, washing,
and cooking is greater for people who live closer to them. While approximately 95 percent of the people
living in Matlab drink tube well water (Emch, 1999), people can contract cholera by swallowing water
while they are bathing. This is supported by a recent study (Sack et al., 2003) that found that people in
Matlab who bathed exclusively with tube well water were 0.4 times as likely to contract cholera as people
who used pond and/or river and/or canal water. Also, rural Bangladeshis use pond or river water for
cooking; if they do not boil the water, they risk contracting the disease. Thus, while people are not
usually drinking water directly from the cholera reservoir, access to these water sources has been shown
to be a risk factor for the disease. This mode of transmission is also likely to occur in Vietnam and
Mozambique. We will compute distance from water bodies using Landsat and ASTER satellite imagery (i.e.,
MSS, TM, ETM+, or ASTER, depending on availability of cloud-free dates) for the dry season and radar
satellite data (i.e., Radarsat and/or ERS) for the rainy/monsoon season (unless cloud-free Landsat imagery
is available, though this is unlikely). Satellite data will be acquired for several dates during the 20-year
study period based on availability. After rectification, the imagery will be classified into water and
non-water classes, and then a distance surface will be created from the water pixels. The distance surface
created using the dry season images will describe proximity to the permanent surface water features
including rivers, canals, ponds, and swamps. The distance surface that will be created using the wet
season images will describe proximity to flood-inundated areas. We will also use GIS analytical tools to
differentiate between the different water types (i.e., rivers, canals, ponds, and swamps) so that we can
create separate distance surfaces for the different water types. Lastly, we will measure turbidity in water
using the imagery.
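The distance surface described above can be illustrated with a minimal sketch, assuming the classified image is a small binary grid in which 1 marks water. The brute-force search is chosen for clarity only; GIS software would use a much faster distance transform. The function name, the 30-meter cell size, and the mask are hypothetical.

```python
import math


def distance_surface(water_mask, cell_size=1.0):
    """Return the distance from each cell to the nearest water cell.

    `water_mask` is a list of rows in which a truthy value marks water.
    Distances are Euclidean, measured between cell centers, and scaled
    by `cell_size` (ground units per pixel).
    """
    water = [
        (r, c)
        for r, row in enumerate(water_mask)
        for c, is_water in enumerate(row)
        if is_water
    ]
    if not water:
        raise ValueError("water_mask contains no water cells")
    return [
        [
            cell_size * min(math.hypot(r - wr, c - wc) for wr, wc in water)
            for c in range(len(row))
        ]
        for r, row in enumerate(water_mask)
    ]


# Hypothetical 3 x 3 classification with one water pixel in the corner.
mask = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
]
dist = distance_surface(mask, cell_size=30.0)
```

Running the same function on a dry-season mask and a wet-season mask would give the two surfaces the text distinguishes: proximity to permanent water and proximity to flood-inundated areas.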
We will calculate ecological (i.e., neighborhood-level) variables using the household, bari, and
hamlet-level GIS databases. Spatial filtering techniques will be used to create neighborhood-level
variables because they are more appropriate than individual or household-level variables in some cases.
For instance, if a person living in a socio-economically deprived area (i.e., where there is poor sanitation)
has only 10 neighbors as opposed to 500 living in close proximity, the local environment will not be as
polluted because there are fewer people introducing fecal material into the water. Various spatial
filtering methods have been used so that data collected from a field survey can be scaled to remove noise
or create neighborhood-level variables (Meijerink et al., 1994; Watkins et al., 1993; Ali et al., 2002d).
We will use methods proposed by Ali et al. (2002d), which apply a low-pass filter to develop
neighborhood-level population density and socio-economic variables.
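The idea of a low-pass spatial filter can be illustrated with a minimal sketch, assuming the filter is a simple moving average: each household's value is replaced by the mean of all households within a fixed radius. This is an illustration of the general technique, not a reproduction of the specific method of Ali et al. (2002d). The function name, coordinates, radius, and values are hypothetical.

```python
import math


def low_pass_filter(points, values, radius):
    """Smooth point data by averaging over a circular neighborhood.

    For each point, return the mean of the values of all points lying
    within `radius` of it (the point itself included).
    """
    smoothed = []
    for x0, y0 in points:
        neighbors = [
            v
            for (x, y), v in zip(points, values)
            if math.hypot(x - x0, y - y0) <= radius
        ]
        smoothed.append(sum(neighbors) / len(neighbors))
    return smoothed


# Hypothetical household locations (meters) and a socio-economic score.
households = [(0, 0), (10, 0), (20, 0), (500, 500)]
score = [1.0, 3.0, 5.0, 10.0]
smooth = low_pass_filter(households, score, radius=15)
```

Counting the neighbors instead of averaging their values would yield a neighborhood-level population density in the same way.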
Statistical Analysis
One goal of this study is to measure the relationships between the independent variables and
cholera incidence. Ordinary least squares (OLS) models will be built to measure global relationships
between the independent variables and incidence. One assumption of OLS regression is that observations
are independent of one another; with geographic data, however, this is not likely to be the case. Statistical
models need to control for both spatial and temporal autocorrelation. Stern and Cressie (2000) analyzed
disease risk by aggregating spatial components into statistical models to improve validity and increase
predictive performance. Since cholera is a highly infectious disease, spatial components are important in
explaining occurrence rates and specifying the causes and propagation of the disease. Infectious diseases
usually have spatially correlated occurrence rates; therefore, interpretation of results of statistical models
should consider the correlation structure of the disease distribution. For these reasons, the use of
conventional statistical models may be misleading. For spatially correlated events, several spatial
statistical methods have been proposed including the Markov Random Field (MRF) auto-Poisson model
(Besag, 1974) and the hierarchical model (Clayton and Kaldor, 1987; Aitchison and Ho, 1989).
There are several diagnostic measures to test for violations of the assumptions
of OLS regression, including non-normal errors, heteroskedastic (non-constant variance) errors,
multicollinear predictors, and spatially autocorrelated errors. The Kiefer-Salmon test shows whether the
residuals from the OLS model are significantly different from a normal distribution. The Breusch-Pagan
test shows whether there is significant heteroskedasticity of errors. Moran's I and Lagrange Multiplier
tests can be used to measure whether there is spatial autocorrelation in the model residuals. Spatial
structure of data sets is ignored in conventional regression; thus, conventional regression is often an
inadequate tool for comparing spatial distributions because of spatial dependence (i.e., what happens in
one place depends on what happens in other places) and spatial heterogeneity (i.e., relationships vary
across space). Statistical inference is problematic when dependence and heterogeneity are ignored and
regression assumptions are violated. Spatial heterogeneity refers not only to nonconstant relationships
between variables in space but also heteroskedasticity, or nonconstant variance, which can result from
omitted variables. One solution to dependency is to use specialized “spatial regression” methods that
incorporate spatial effects (Anselin, 1988, 1998; Anselin and Bao, 1997). Spatial autocorrelation should
be accounted for so that significance tests are not suspect. If variables are spatially dependent, explanation
is not complete without some characterization of spatial interaction. Inclusion of a spatial lag variable, a
variable representing the neighborhood effect of cholera incidence, may explain some of the residual
variation. By accounting for the spatial effects in the model, we can interpret the significance of the
other, non-spatial variables. Models that include a spatial lag variable are called mixed regressive,
spatial autoregressive models.
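Two of the quantities discussed above, the spatial lag variable and Moran's I, can be illustrated with a minimal sketch. It assumes a binary contiguity weights matrix (1 where two units are neighbors, 0 otherwise) and computes Moran's I as (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z is the deviation from the mean and S0 is the sum of all weights. Significance testing is omitted. The function names, weights, and residual values are hypothetical.

```python
def spatial_lag(values, weights):
    """Return the row-standardized spatial lag of `values`.

    Each unit's lag is the weighted mean of its neighbors' values;
    units with no neighbors get a lag of 0.0.
    """
    lag = []
    for row in weights:
        total = sum(row)
        weighted = sum(w * v for w, v in zip(row, values))
        lag.append(weighted / total if total else 0.0)
    return lag


def morans_i(values, weights):
    """Return Moran's I for `values` under the spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Four hypothetical units along a line; each touches its adjacent units.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, -1.0, -1.0]    # similar residuals sit together
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbors always differ

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
lag_clustered = spatial_lag(clustered, W)
```

Positive values indicate that similar residuals cluster in space and negative values indicate that neighbors tend to differ; the lag vector is what would enter a regression as the additional spatial predictor.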
Spatial regression analysis allows us to relate a spatial distribution to various
other factors that are geographic in nature (Kulldorff, 1998). This research project will use spatial
regression analysis to investigate relationships between socio-environmental conditions and
cholera incidence. We will build statistical models to identify the strengths of relationships between
cholera incidence and the variables listed in Table 1. In addition, we will use geographically weighted
regression (GWR) to explore the relationships between cholera and the socio-environmental risk factors
(Brunsdon et al., 1999; Fotheringham et al., 2002). GWR results in locally specific parameter estimates
so the spatial variation of relationships can be mapped, enabling us to explore how relationships vary
within and between the different ecosystems in the three study areas.
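The way GWR produces locally specific parameter estimates can be illustrated with a minimal sketch, assuming a single predictor and a Gaussian distance-decay kernel with a fixed bandwidth. A separate weighted least squares fit is made at each location, with nearby observations weighted more heavily. The function name, bandwidth, and data are hypothetical, and real GWR software also handles multiple predictors and bandwidth selection.

```python
import math


def gwr_fit_at(target, coords, x, y, bandwidth):
    """Fit a locally weighted simple regression of y on x at `target`.

    Observation i receives weight exp(-0.5 * (d_i / bandwidth) ** 2),
    where d_i is its distance from `target`. Returns (intercept, slope).
    """
    w = [
        math.exp(
            -0.5 * (math.hypot(target[0] - cx, target[1] - cy) / bandwidth) ** 2
        )
        for cx, cy in coords
    ]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Ten hypothetical units along a west-east transect (values are made up).
# The response is 1 x predictor in the west and 3 x predictor in the east.
coords = [(float(i), 0.0) for i in range(10)]
predictor = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0, 2.0, 7.0, 1.0, 5.0]
response = [p * (1.0 if i < 5 else 3.0) for i, p in enumerate(predictor)]

local = [gwr_fit_at(c, coords, predictor, response, bandwidth=1.0) for c in coords]
west_slope = local[0][1]
east_slope = local[-1][1]
```

A global OLS fit would report a single slope for these data, whereas the local fits recover a slope near 1 in the west and near 3 in the east, which is the kind of spatial variation that can then be mapped.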
Expected Outcomes
We will disseminate our findings at international conferences and in several papers
submitted to peer-reviewed journals on the following topics.
1. Papers describing spatio-temporal associations between cholera incidence and satellite-derived
environmental variables (i.e., chlorophyll concentration, SST, SSH, rainfall, LULC, flooding),
climatic variables (i.e., in situ rainfall and temperature), and socio-demographic variables (i.e.,
population density and socio-economics).
2. A paper describing how associations between cholera and satellite-derived biophysical, climatic,
and socio-demographic variables vary in space (i.e., between and within study areas) and time
(i.e., during the 20-year longitudinal study period).
3. A paper describing how changes in estuaries (i.e., turbidity and chlorophyll concentration) and
areas around estuaries (i.e., LULC) are related to cholera incidence.
4. A paper presenting the cholera prediction model including descriptions of the data and methods
that should be used for prediction. This paper will address such theoretical issues as nonlinearity of
relationships in space and time and whether the model differs by dominant cholera strain
(classical, El Tor, O139).