Digital Governance and Hotspot GeoInformatics of Biodiversity

Center for Statistical Ecology
and Environmental Statistics
Digital Governance and Hotspot Geoinformatics of Biodiversity Measurement,
Comparison and Management in the Age of Indicators and Information Technology
By Ganapati P. Patil¹ and Wayne L. Myers²,
¹Center for Statistical Ecology and Environmental Statistics, Department of
Statistics, Penn State University, University Park, PA, 16802, USA
² School of Forest Resources and Office for Remote Sensing for Spatial Information
Resources, and Penn State Institutes of Environment, Penn State University,
University Park, PA 16802, USA
This material is based upon work supported by the United States National Science Foundation under Grant
No. 0307010. Any opinions, findings, and conclusions or recommendations expressed in this material are
those of the author(s) and do not necessarily reflect the views of the agencies.
[Invited Paper for the ISI Platinum Jubilee Volume for International Biodiversity Conference Symposium]
Technical Report Number 2008-1121
TECHNICAL REPORTS AND REPRINTS SERIES
November 2008
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
G. P. Patil
Distinguished Professor and Director
Tel: (814)865-9442 Fax: (814)865-1278
Email: gpp@stat.psu.edu
http://www.stat.psu.edu/~gpp
http://www.stat.psu.edu/hotspots
Digital Governance and Hotspot GeoInformatics of Biodiversity Measurement,
Comparison and Management in the Age of Indicators and Information Technology
Ganapati P. Patil¹ and Wayne L. Myers²
(1) Center for Statistical Ecology and Environmental Statistics, Department of Statistics,
The Pennsylvania State University, University Park, PA, US
(2) School of Forest Resources and Office for Remote Sensing and Spatial Information
Resources, The Pennsylvania State University, University Park, PA, 16802
Abstract:
Biodiversity measurement, comparison and related hotspot geoinformatics are challenging
issues and opportunities in the twenty-first century of statistical ecology, environmental
statistics, risk analysis, knowledge discovery, decision making, and decision support in the age
of ecological, environmental, and socio-economic indicators.
This paper covers diversity measurement and comparison, diversity profiles, the selection of
biodiversity indicators for monitoring, etiology, early warning, and management, and substantive
and geographic considerations. The paper also attempts to demonstrate that societal biodiversity
issues and concerns for knowledge discovery and decision making have led to interesting and
innovative mathematical, statistical, computational, visualization, and software development
issues and approaches for decision support in multi-scale advanced raster map analysis in this
age of indicators and information technology of digital governance.
A novel prototype district-level initiative in the triadic spirit of digital governance and hotspot
geoinformatics for natural resources monitoring, etiology, early warning, and sustainable
management is briefly introduced. Its context is the district-level linking of small rivers and
streams as a vehicle for monsoon rainwater harvesting and management in the face of water
scarcity, to help restore and enhance agriculture, biodiversity, nature conservation, drinking
water, and eco-cultural community life.
1. Introduction, Background, and Motivation
It is a great delight for me (GP) to be invited to speak to the Platinum Jubilee of the Indian
Statistical Institute in its important biodiversity component. It is with a sense of gratitude and
affection that I am here celebrating the golden jubilee of my own time at the Indian Statistical
Institute in great company with C. R. Rao and J. Roy and subsequently with Professor
Mahalanobis and Ranidi. It has been an honor to have received the only D.Sc. of Theoretical
and Applied Statistics of the Institute so far. My wife and I have fond memories of our stay with
Professor and Ranidi at Amrapalli in the Poet’s Room! It was a treat to be at the Silver Jubilee
and at the Golden Jubilee of the Institute together with great stalwarts of the whole variety of
disciplines and fields in which the Institute has been known to be involved worldwide.
Diversity measurement and comparison has been an important issue for a long time. These days
biodiversity measurement, comparison, and related hotspot geoinformatics are challenging
issues and opportunities in the twenty-first century of statistical ecology, environmental
statistics, risk analysis, knowledge discovery, decision making, and decision support in the age
of ecological, environmental, and socio-economic indicators. The challenge and opportunities
multiply all the more in the present day setting of information technology and digital
governance.
The whole subject is exciting, and so are its individual parts, which we touch on as we move
along. We cover diversity measurement and comparison, diversity profiles, the selection of
biodiversity indicators for monitoring, etiology, early warning, and management, and
substantive and geospatial considerations. We demonstrate that societal biodiversity issues and
concerns for knowledge discovery and decision making have led to interesting and innovative
mathematical, statistical, computational, visualization, and software development issues and
approaches for decision support in multi-scale advanced raster map analysis in this age of
indicators and information technology of digital governance. Toward the end, we share a novel
prototype district-level initiative of linking small rivers and streams as a vehicle for monsoon
rainwater harvesting and management in the face of water scarcity, to help restore and enhance
agriculture, biodiversity, nature conservation, drinking water, and diverse eco-cultural
community development.
2. Ecological Diversity as a Motivating Example
2.1 Quantification for Ecological Diversity
Conservation biology, landscape ecology, and ecosystem-oriented natural resources
management lend considerable urgency to issues and approaches concerning biodiversity
assessment. Most of the traditional approaches and statistical tools are plot-based with a goal of
definitive characterization. Diversity, however, is relative to a spatial scale, temporal scale, and
taxocene spectrum. Patterns may be more informative than absolutes in this regard.
The issues are fundamental in that explaining the effects of environment on the distribution and
abundance of species is the essence of much ecological work. The controversies arise from the
intrinsic scientific importance of diversity theories, as well as from the broad economic and
social ramifications of considering biodiversity in land use decisions. At the heart of the
scientific and social controversies regarding diversity are problems of quantification,
interpretation, and analysis.
The classical view of diversity remains important for intensive studies of particular ecological
communities and forest stands (Gove et al., 1994). However, the emerging sciences of
landscape ecology and conservation biology have made evident the logistical and economical
impracticality of such intensive observational coverage for regions on the order of square
kilometers and larger (Scott et al., 1989). These spatial scales are necessarily encompassed by
contemporary ecosystem-oriented resource management and design of regional/national
networks of biodiversity reserves. Furthermore, species/area and minimum viable population
issues become fundamental in these matters.
The multidimensional character of diversity can be revealed by establishing an intrinsic, and
index-free, diversity ordering. In effect, diversity may appear to have decreased when viewed
from one vantage point (i.e., index), and increased when viewed from a different perspective.
In view of the inadequacy of a single index, Patil and Taillie (1979, 1982) quantify diversity by
means of diversity profiles. A diversity profile is a curve depicting the simultaneous values of a
large collection of diversity indices. Thus, the profile portrays the views of diversity from many
different vantage points simultaneously and in a single picture.
Differences in community diversity are studied by comparing profiles. If the two communities
are intrinsically comparable, then one profile will lie uniformly above the other. Conversely,
when the communities are not intrinsically comparable, their profiles may intersect. But even
here, the profiles can reveal which portions of the community have undergone opposing
diversity changes.
2.2 Indicators for Ecological Diversity
We proceed to consider ways of coping with complexity and confounding that embrace multiple
indicators rather than agonizing over choices and conflicts of diversity measures. We
contemplate enlarging the orders of indicators to encompass some interactions in a formal
manner that accommodates both parameterization and visualization. We conclude by noting the
convergence of biodiversity and ecological community concepts at meter scales, but not for
broader landscape, regional, and global scales of ecological organization.
The indefiniteness regarding biodiversity that can give rise to frustration is well expressed by L.
R. Taylor (1978) in the following quote:
Diversity so pervades every aspect of biology that each author may safely interpret the
word as he wishes and there is consequently no central theme to the subject. We cannot
be sure if this flexibility is healthy or due to lack of discipline, but it can be traced back
to the beginnings of interest in biological diversity …
The recent programs of the U.S. National Science Foundation probing biocomplexity in many
contexts serve to provide evidence that the flexibility addressed by Taylor is both healthy and
indicative of need for strengthening discipline with regard to scientific constructs and means by
which they are made operational.
Indicators/expressions of this nature are appropriate, and therefore of value, if they convey the
desired information within the budget and delivery delays that are acceptable. One method is
more efficient than another if it conveys the required information either more rapidly or at less
cost. Conveying more information at the same cost and timing is not necessarily desirable if
unwanted information has to be processed or filtered.
Increasingly sophisticated management, intervention, remediation, and regulation require a
continuing flow of multiple indicators for various aspects of ecosystems.
What ecosystem managers and regulators seek is a complementary set of indicators that
captures aspects of interest. We have thus entered the exciting age of indicators for ecological
diversity and biodiversity.
3. Biodiversity Measurement and Comparison
3.1 Biodiversity with Presence/Absence Data
Biodiversity is perhaps best revealed by a species list. Biodiversity may evade specific definition,
but there is very strong consensus that the current loss of species, along with the subsequent loss of
genetic diversity, is unacceptable if we are to maintain a healthy ecosystem. Such a concern pertains
to ecosystems at many spatial scales, whether a state park of 10 km², a whole state, a nation or the
entire globe. Indeed, environmental concerns have traditionally been more localized; however,
contemporary issues like global warming, ozone depletion and biodiversity loss are very large scale
concerns.
Large scale monitoring for biodiversity assessment typically allows for only a species list to be
acquired in an area of concern. There is simply too much ground to cover for estimating relative
abundances as well. If the species list is acquired from a sampled sub-area, then how do we estimate
the total number of species, known as species richness, for the larger area of concern? We cannot
simply estimate the average number of species per unit area and multiply by the whole area. If one
sample unit has 3 species and another has 9, the average is (3+9)/2 = 6 species per unit, but the two
units together do not necessarily hold 3 + 9 = 12 species: some species may be present in both
units, so the combined list may be shorter than 12. Biodiversity as species richness is determined
by what becomes of
s(2) = 1 + 1, s(3) = 1 + 1 + 1, ..., s(n) = 1 + 1 + ... + 1
with n summands for n investigators or n individuals.
An approach to this problem of estimating the total of a non-additive variable is to apply the
concept of a species area curve. The number of species increases with increasing area sampled in a
non-linear manner, rising rapidly at first, then reaching a point of diminishing returns. The
challenge is then to maximally accelerate the empirical species-area curve so that the point of
diminishing returns is achieved in as small an area as possible. Knowledge of habitat may help to
achieve this sampling objective by providing covariate information that helps us to direct which
sample units to measure.
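The non-additivity of species richness can be illustrated with a small sketch. The species lists below are hypothetical and chosen only so that one unit holds 3 species and the other 9, as in the example above; the function names are likewise illustrative. Pooled richness is the size of the union of the lists, and the cumulative count over successive units gives an empirical species-area (accumulation) curve.

```python
# Hypothetical species lists for two sample units (illustrative names only).
unit_a = {"sp01", "sp02", "sp03"}                      # 3 species
unit_b = {"sp02", "sp03", "sp04", "sp05", "sp06",
          "sp07", "sp08", "sp09", "sp10"}              # 9 species, 2 shared with unit_a


def pooled_richness(*units):
    """Number of distinct species across all sample units (size of the set union)."""
    return len(set().union(*units))


def accumulation_curve(units):
    """Cumulative species richness as sample units are added in the given order."""
    seen, curve = set(), []
    for unit in units:
        seen |= unit
        curve.append(len(seen))
    return curve


print(pooled_richness(unit_a, unit_b))       # fewer than 3 + 9 because of shared species
print(accumulation_curve([unit_a, unit_b]))
```

The order in which units enter the accumulation curve matters for how quickly it rises, which is where covariate information on habitat could guide the choice of sample units.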
3.2 Biodiversity with Relative Abundance Data
3.2.1 Am I a Specialist or a Generalist?
The degree of specialization/diversification has to be relative to the categories identified.
3.2.2 Resource Apportionment
Resource may take the form of time, energy, biomass, abundance, etc. Degree of
specialization/diversification does not depend on the identity of the categories. It is
permutation-invariant.
3.2.3 Diversity as Average Species Rarity
Let
\[
C = (s, \pi) = (\pi) = (\pi_1, \pi_2, \ldots, \pi_s)
\]
be an ecological community of $s$ species with relative abundance vector $\pi$. Let $R(i; \pi)$
be the rarity measure of the $i$th species in the community with relative abundance vector $\pi$.
Diversity of the community $\pi$ is then measured by its average species rarity given by
\[
\Delta(\pi) = \sum_{i=1}^{s} \pi_i \, R(i; \pi).
\]
Several of the most frequently used diversity indices may be conveniently expressed under the
umbrella of average species rarity through judicious choice of rarity functions. Species richness,
species count, Shannon's, and Simpson's indices all may be derived from this theory as follows
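\begin{align}
\Delta_{SR} &= \sum_{i=1}^{s} \left(\frac{1}{\pi_i}\right) \pi_i = s && \text{species richness,} \tag{1} \\
\Delta_{SC} &= \sum_{i=1}^{s} \left(\frac{1}{\pi_i} - 1\right) \pi_i = s - 1 && \text{species count,} \tag{2} \\
\Delta_{Sh} &= \sum_{i=1}^{s} \left(-\log \pi_i\right) \pi_i = -\sum_{i=1}^{s} \pi_i \log \pi_i && \text{Shannon,} \tag{3} \\
\Delta_{Si} &= \sum_{i=1}^{s} \left(1 - \pi_i\right) \pi_i = 1 - \sum_{i=1}^{s} \pi_i^{2} && \text{Simpson,} \tag{4}
\end{align}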
where the term in parentheses denotes the species rarity function used in each case. Table 1 presents
a hypothetical example of three forest stands composed of just five or fewer species of trees. The
relative abundances of these tree species based on some quantitative measure of abundance are
given, and the diversity indices (1) through (4) calculated from these relative abundances also are
shown for each community. The example clearly shows the inconsistency of the different indices in
their ranking of these three communities. For example, Δ SC (Stand 1) > Δ SC (Stand 2), but
Δ Sh (Stand 1) < Δ Sh (Stand 2) and Δ Si (Stand 1) < Δ Si (Stand 2). This is an interesting comparison
because it illustrates how one may be led to the conclusion that a community with fewer species
(Stand 2) can be more diverse than one with more species (Stand 1) using either Shannon's or
Simpson's index. Similar inconsistencies among the indices may be found by comparing Stands 1
and 3. The only comparison that is consistently ordered with all indices is Δ (Stand 2) > Δ (Stand 3).
This inconsistency of different diversity indices evidently is quite common when making
comparisons between communities and arises from a lack of intrinsic diversity ordering between the
communities being compared (see the following section).
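As a check on the average-rarity formulation, the following minimal sketch recomputes indices (1) through (4) from the Table 1 relative abundances. It assumes natural logarithms for the Shannon index and skips species with zero abundance; the names `STANDS`, `RARITY`, and `diversity_indices` are illustrative and not part of the original work.

```python
import math

# Relative abundances from Table 1 (a zero means the species is absent from the stand).
STANDS = {
    "Stand 1": [0.50, 0.30, 0.10, 0.05, 0.05],
    "Stand 2": [0.25, 0.25, 0.25, 0.25, 0.00],
    "Stand 3": [0.35, 0.35, 0.30, 0.00, 0.00],
}

# Rarity functions R(pi_i), i.e. the terms in parentheses in equations (1)-(4).
RARITY = {
    "SR": lambda p: 1.0 / p,        # species richness
    "SC": lambda p: 1.0 / p - 1.0,  # species count
    "Sh": lambda p: -math.log(p),   # Shannon (natural log assumed)
    "Si": lambda p: 1.0 - p,        # Simpson
}


def average_rarity(pi, rarity):
    """Average species rarity: sum of pi_i * R(pi_i) over the species present."""
    return sum(p * rarity(p) for p in pi if p > 0)


def diversity_indices(pi):
    """All four indices for one community, keyed by index name."""
    return {name: average_rarity(pi, r) for name, r in RARITY.items()}


for stand, pi in STANDS.items():
    print(stand, {k: round(v, 2) for k, v in diversity_indices(pi).items()})
```

Swapping in a different rarity function is the only change needed to obtain another index of the same family, which is the practical appeal of the average-rarity view.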
Table 1: Three hypothetical forest stand communities composed of five or fewer species of trees.

                                Stand
Species                 1         2         3
                        Relative abundance
Pinus strobus           0.50      0.25      0.35
Quercus rubra           0.30      0.25      0.35
Tsuga canadensis        0.10      0.25      0.30
Acer rubrum             0.05      0.25      0.00
Betula papyrifera       0.05      0.00      0.00
Total:                  1.00      1.00      1.00
                        Diversity index
∆SR                     5         4         3
∆SC                     4         3         2
∆Sh                     1.24      1.39      1.10
∆Si                     0.65      0.75      0.67
3.3 Diversity Profiles
Diversity profiles allow the graphical comparison of diversity between communities. One set of
profiles that incorporates indices (2) through (4) as point estimates along the curve is the so-called
$\Delta_\beta$ profiles of Patil and Taillie. Since the $\Delta_\beta$ profile incorporates indices developed from
dichotomous-type rarity measures, it too may be developed in the same manner:
\[
\Delta_\beta = \sum_{i=1}^{s} \frac{\left(1 - \pi_i^{\beta}\right)}{\beta} \, \pi_i
             = \frac{1 - \sum_{i=1}^{s} \pi_i^{\beta+1}}{\beta}, \qquad \beta \ge -1.
\]
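The restriction that $\beta \ge -1$ assures that $\Delta_\beta$ has certain desirable properties. The species count,
Shannon, and Simpson indices are related to $\Delta_\beta$ by $\Delta_{SC} = \Delta_{-1}$,
$\Delta_{Sh} = \Delta_{0}$ (taken as the limit $\beta \to 0$), and $\Delta_{Si} = \Delta_{1}$.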
The $\Delta_\beta$ diversity profiles for the three stands in Table 1 are presented in Figure 1. Note that the
profile for Stand 1 crosses both profiles for Stands 2 and 3. The profile for Stand 1 crosses that of
Stand 2 at β = −0.45, which explains why both $\Delta_{Sh}$ and $\Delta_{Si}$ rank the diversity of these two communities
differently from $\Delta_{SC}$. On the other hand, the profiles for Stands 1 and 3 cross at β = 0.62, showing
how $\Delta_{SC}$ and $\Delta_{Sh}$ rank these two communities differently from $\Delta_{Si}$. In general, it is also possible for
two $\Delta_\beta$ profiles to cross at β > 1 or to cross more than once; in either case,
calculating the three indices ($\Delta_{SC}$, $\Delta_{Sh}$, and $\Delta_{Si}$) alone may not be enough to reveal the inconsistent
ranking of communities at larger β. Plotting $\Delta_\beta$ profiles for β > 1 may not help either, because
the profiles tend to converge quickly beyond this point and intersections cannot be resolved
visually; an algorithm for numerically finding the intersections of any two $\Delta_\beta$ profiles is
required in this case.
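One way such a numerical search might look is sketched below, using plain bisection on the difference of two $\Delta_\beta$ profiles. This is only an illustration under stated assumptions, not the authors' algorithm: the bracketing intervals are supplied by hand, the Shannon limit is used at β = 0, and the function names are hypothetical.

```python
import math


def delta_beta(pi, beta):
    """Delta_beta = (1 - sum_i pi_i^(beta+1)) / beta, with the Shannon limit at beta = 0."""
    present = [p for p in pi if p > 0]  # absent species contribute nothing
    if abs(beta) < 1e-12:
        return -sum(p * math.log(p) for p in present)
    return (1.0 - sum(p ** (beta + 1.0) for p in present)) / beta


def profile_crossing(pi_a, pi_b, lo, hi, tol=1e-10):
    """Bisection for one beta in [lo, hi] where the two profiles are equal.

    Assumes the profile difference has opposite signs at lo and hi.
    """
    def diff(b):
        return delta_beta(pi_a, b) - delta_beta(pi_b, b)

    f_lo, f_hi = diff(lo), diff(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = diff(mid)
        if f_lo * f_mid <= 0:
            hi = mid               # crossing lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # crossing lies in the upper half
    return 0.5 * (lo + hi)


stand1 = [0.50, 0.30, 0.10, 0.05, 0.05]
stand2 = [0.25, 0.25, 0.25, 0.25, 0.00]
stand3 = [0.35, 0.35, 0.30, 0.00, 0.00]

print(profile_crossing(stand1, stand2, -0.9, -0.1))  # crossing of Stands 1 and 2
print(profile_crossing(stand1, stand3, 0.1, 0.99))   # crossing of Stands 1 and 3
```

Bisection returns a single crossing per bracket, so profiles that cross more than once would need the β axis scanned in several sub-intervals.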
Figure 1: Δ β profiles for the three hypothetical forest stand communities in
Table 1.
Perhaps the most useful way to compare diversity between communities is by the concept of
intrinsic diversity ordering. This concept may be defined as follows:
Community $C'$ is intrinsically more diverse than community $C$ (written $C' \overset{imd}{>} C$)
provided $C$ leads to $C'$ by a finite sequence of 1. introducing a species, 2. transferring
abundance from more to less abundant species without reversing the rank-order of the
species, and 3. relabeling species (i.e., permuting the components of the abundance
vector).
Note that this ordering is only partial and two given communities need not be intrinsically
comparable.
A diversity profile approach has been developed by Patil and Taillie using a rank-type rarity
measure on $\pi^{\#}$ that incorporates the concepts of intrinsic diversity ordering defined above. Let
\[
R(i) =
\begin{cases}
1 & \text{if } i > j; \\
0 & \text{if } i \le j,
\end{cases}
\]
for $1 \le j \le s$. Then average species rarity is given as
\[
T_j = \sum_{i=j+1}^{s} \pi_i^{\#}, \qquad j = 1, \ldots, s - 1, \tag{7}
\]
where $T_s = 0$ and $T_0 = 1$. The quantity in (7) is termed the right tail-sum of the ranked relative
abundance vector $\pi^{\#}$, and when a plot of the $(j, T_j)$ pairs is constructed for each community, the
resulting profiles are termed intrinsic diversity profiles. Any intrinsic orderings of the communities,
if they exist, can be determined with the intrinsic diversity ($T_j$) profiles.
The right tail-sum profiles for the three stands in Table 1 are plotted in Figure 2. Notice that the
profile for Stand 1 crosses both those for Stands 2 and 3, but that the profile for Stand 2 is
everywhere above that for Stand 3. It follows that the only intrinsic diversity ordering for these
stands is $C(\text{Stand 2}) \overset{imd}{>} C(\text{Stand 3})$. This is consistent with the findings of the indices in the section
on Average Species Rarity and the $\Delta_\beta$ profiles. The $\Delta_\beta$ profiles are isotonic to intrinsic diversity
ordering in that, if an intrinsic diversity ordering exists, they will preserve it. However, the $\Delta_\beta$
profiles may not cross even if the $T_j$ profiles do; therefore, the $\Delta_\beta$ profiles do not necessarily reflect
intrinsic diversity ordering. Since the diversity indices discussed have the same properties as
the $\Delta_\beta$ profiles, it should be emphasized that, of the methods presented thus far, the $T_j$ profiles are
the most reliable measure of intrinsic diversity ordering between communities.
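The tail-sum comparison for the Table 1 stands can be sketched as follows. The sketch assumes that shorter abundance vectors are padded with zeros so that profiles are compared at the same ranks, and it treats one profile lying nowhere below another (and strictly above somewhere) as the test for intrinsic ordering; the function names and the small numerical tolerance are illustrative choices.

```python
def tail_sums(pi, s_max):
    """Right tail-sums T_0..T_s_max of the ranked (descending) abundance vector."""
    ranked = sorted(pi, reverse=True) + [0.0] * (s_max - len(pi))
    return [sum(ranked[j:]) for j in range(s_max + 1)]


def intrinsically_more_diverse(pi_a, pi_b, eps=1e-12):
    """True if A's tail-sum profile is nowhere below B's and strictly above somewhere."""
    s_max = max(len(pi_a), len(pi_b))
    pairs = list(zip(tail_sums(pi_a, s_max), tail_sums(pi_b, s_max)))
    nowhere_below = all(a >= b - eps for a, b in pairs)
    somewhere_above = any(a > b + eps for a, b in pairs)
    return nowhere_below and somewhere_above


stand1 = [0.50, 0.30, 0.10, 0.05, 0.05]
stand2 = [0.25, 0.25, 0.25, 0.25, 0.00]
stand3 = [0.35, 0.35, 0.30, 0.00, 0.00]

print(intrinsically_more_diverse(stand2, stand3))  # Stand 2 versus Stand 3
print(intrinsically_more_diverse(stand1, stand2))  # profiles cross, so not comparable
print(intrinsically_more_diverse(stand2, stand1))
```

When both directions return False for a pair of distinct profiles, the profiles cross and the communities are not intrinsically comparable.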
Figure 2: Right tail-sum ( T j ) profiles for the three hypothetical forest stand communities.
4. Exploring Patterns of Habitat Diversity Across Landscapes Using Partial Ordering
Different aspects of diversity for a particular complement of biotic elements in a locale present considerable mathematical
and even conceptual challenges, as considered to this point. However, the problematic nature of capturing diversity in
nature compounds rapidly when one extends the consideration to real-world contexts of natural complexity and
human influence at landscape and regional levels of geographical scope. Organisms do not naturally occur in cages,
aquariums, and other such laboratory vessels. They occupy environmental contexts that we usually call habitats.
Habitat is itself indefinite in the abstract, implying that a particular organism is capable of sustained occupancy in that
context. If the organism in question is not actually known to be present, then habitat is a hypothesis. If the organism
is present and sustaining occupancy, then its environmental context is habitat. Extended study of the circumstances
in which an organism sustains occupancy leads to a habitat model. If a locational context matches a habitat model
but lacks occupancy, it does not follow that the habitat model is erroneous since individuals of the organism may not
have found their way into (colonized) the locale in its current condition within current lifespan for individuals.
We do know from collective scientific experience that there is some specificity among some kinds of organisms in
their habitat requirements. Therefore, it is not expected that certain kinds of diversity will co-exist in a region unless
there are adequate expanses of the respective types of habitat in some sort of spatial mosaic. Thus, habitat diversity is
requisite for having certain types of biodiversity. What constitutes an adequate expanse is another issue, both in
aggregate and as spatial instances. Partitioning a region into smaller and smaller instances of different types of
habitats will increase the spatial diversity of habitat, but excessively small instances create fragmentation which is
detrimental to sustaining biodiversity. Also, the greater the distance between spatial instances of habitat the more
perilous becomes the dispersal between instances to replenish occupancy under incidents of attrition. Thus, habitat
diversity in both different kinds of habitat and complexity of spatial arrangement can be seen as increasing
biodiversity; but only up to some point of diminishing returns. Thus, indicators of a particular aspect of biodiversity
are not necessarily monotone increasing at a landscape/regional spatial scope.
Some different kinds of organisms have varying degrees of similarity in habitat requirements. When there is
sufficient similarity that suites of different organisms are typically found jointly occupying an environmental context,
then such a suite can be considered a community of organisms. Thus, habitat diversity is generally consonant with
community diversity. Joint occupancy may or may not be individually and/or collectively beneficial for a particular
kind of organism, and these relationships may or may not be reciprocal. Thus, one kind of organism may prey upon
another, but in so doing may prevent the prey from over-exploiting resources that it requires to the collective
detriment. If all spatial instances of habitat are accessible to all individuals over a chain of generations, then adaptive
diversification is largely retarded by extensive genetic intermixing so that speciation to produce substantively
different (new) kinds of organisms does not proceed rapidly. If spatial instances of habitat are both small and
inaccessible, on the other hand, then the (meta)population in that locale becomes more subject to catastrophic
extermination without replenishment.
The consequence of all of these biological, spatial and temporal interactions compounded by human influence is that
biodiversity extended to ecological diversity is anything but simple. It must therefore be approached from a
multidimensional perspective of pattern and complementary indicators, and any consideration will always be partial
in some sense. Consideration of biological/ecological diversity having meaningful implications will always entail
perspectives and priorities. It is also quite possible (probable) for different perspectives to be conflicting in various
respects.
4.1 Partial Prioritization with Multiple Indicators
Beyond the academic, we must speak of particular places and consider specific sorts of diversity under definitive
priorities for protection, remediation or other relevant regards. In such situations, we will often not be in a position to
expend equal effort on everything. It will thus be in order to identify instances that are particularly poor or
problematic and instances that are particularly positive or prime. The intermediary instances will be of less
immediate interest since they will not support special attention for either remediation or retention. The intermediary
instances are the great middle ground where multiple use interests of humanity are served under a complex of
considerations and land tenure. We must unambiguously delineate the spatial instances at some selected scale and
characterize them in terms of a suite of indicators that cover the concerns on at least an ordinal basis. We can then
proceed to prioritize at both the poor and prime poles. Spatial proximity or adjacency among the instances of interest
may or may not be a collateral concern. The special (spatial) instances at either end of the evaluations will be
considered here as salient sets in the sense that they stand out in a salient manner relative to the intermediaries.
To explicate the partial prioritization process and protocols we use data from a biodiversity assessment that was
conducted in the state of Pennsylvania in the northeastern USA (Myers et al., 2000), whereby the state was divided at
a first level into 635 km² hexagonal cells. Pennsylvania has New York State on its northern border, Lake Erie at its
northwest corner, and Delaware/Chesapeake Bay of the Atlantic Ocean at its southeast corner. Figure A depicts the
physiographic character of Pennsylvania by hill-shading and Figure B shows hexagons (with their identification
numbers) covering what is called the Ridge and Valley physiographic region of Pennsylvania containing the
Appalachian Mountains as remnant flanks of major geological folds after eons of erosion have excavated the
fractured tops of the folds. Pennsylvania has a moist temperate climatic regime with the natural vegetation cover
being predominately forested.
Figure A. Hill-shading depiction of the physiographic characteristics of Pennsylvania.
Figure B. Hexagonal zones in Ridge and Valley Region of Pennsylvania.
Several indicators of biodiversity were evaluated for each hexagon during the GAP Analysis biodiversity assessment
program. We select five of these indicators for present purposes. Four of these are considered to be positive
indicators: 1) number of bird species expected to have viable breeding populations; 2) number of mammal species
expected to have viable breeding populations; 3) variability of elevation; and 4) percent of forest cover. A fifth
indicator has the negative sense of extensive disturbance, being the percent of the hexagon in one non-forest extent
when forest patches covering less than one square kilometer are ignored.
The four positive indicators are listed by hexagon in Table A. The purpose here is to find an area consisting of
several adjacent hexagons that is prime relative to the suite of indicators without collapsing indicator dimensions in
some composite manner. We thus wish to honor all of the dimensions of indications simultaneously. We use the
negative indicator of openness as a coupling criterion here. Among the available adjacencies, we intend to give
preference where the candidate pairs do not exhibit high degree of openness. The prime area is to be assembled
progressively by first finding salience relative to the four positive indicators, and then examining neighbors according
to both positive indicators and avoidance of openness. The necessary computational capabilities have been
configured as modules in the nonproprietary R statistical software system (Venables & Smith, 2004).
The logic of the prioritization process rests on dual ordination of subsets segregated based on both domination and
subordination in terms of the mathematics of partial ordering (Patil & Taillie, 2004). One unit or case of interest
(hexagon) dominates another if it is at least as good on all indicators and better on at least one indicator (Myers et al.,
2006). Conversely, one unit (case) is subordinate to another if it is no better on any indicator and is worse on at least
one indicator. While the domination and subordination views are related, they are not equivalent inasmuch as they do
not necessarily extract identical subsets.
Subsets of domination (or lack thereof) are segregated recursively in terms of numbered levels. For level number 1
the subset consists of all units of interest (cases) that are not dominated by any other unit. Domination level 2 is
obtained by removing units in the level 1 subset from consideration, and then segregating all units among the
remainder that are not dominated by any other units among the remainder. Subsets for subsequent levels are
extracted likewise until there are no dominations among the remainder.
Since remaining units for successive levels are increasingly dominated by members of prior subsets, increasing level
numbers for domination reflect increasing inferiority in the sense of greater consensus on inferiority across indicators.
Thus, a level 1 unit might be undominated by virtue of having a maximal value on one indicator, but still have a
relatively poor value on one or more other indicators. However, higher level numbers are increasingly characterized
by lack of good values on any indicators. It can thereby be seen that the process gives equal voice to all indicators
without formulating a composite metric in any manner.
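The recursive segregation of domination levels can be sketched as a simple peeling loop. The example below is written in Python rather than the R modules mentioned above, uses five hexagons taken from Table A with all four indicators treated as "higher is better", and introduces the hypothetical names `dominates` and `domination_levels`; it is an illustration of the stated logic, not the authors' implementation.

```python
# Five hexagons from Table A: (BirdSp, MamlSp, TopoVarI, PctForst), higher is better.
HEXAGONS = {
    2409: (130, 45, 89, 80.8),
    2530: (123, 45, 83, 82.5),
    2650: (120, 46, 56, 69.8),
    2895: (135, 47, 130, 84.7),
    3019: (133, 53, 102, 59.3),
}


def dominates(a, b):
    """a dominates b: at least as good on every indicator, better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def domination_levels(units):
    """Peel off, level by level, the units not dominated by any remaining unit."""
    remaining, levels, level = dict(units), {}, 1
    while remaining:
        front = [k for k, v in remaining.items()
                 if not any(dominates(w, v) for j, w in remaining.items() if j != k)]
        for k in front:
            levels[k] = level
            del remaining[k]
        level += 1
    return levels


print(domination_levels(HEXAGONS))
```

In this subset, hexagon 3019 sits in level 1 despite a comparatively low forest percentage, because no other hexagon matches its mammal count; this mirrors the remark that a level 1 unit may still be poor on some indicator.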
Table A. Conservation characteristics for hexagonal zones in the Ridge and Valley Region for Pennsylvania
prototype (from RVhexs.txt file).

ZoneNum   BirdSp   MamlSp   TopoVarI   PctForst
2409      130      45       89         80.8
2410      128      43       105        85.4
2529      133      45       103        74.3
2530      123      45       83         82.5
2649      127      47       65         66.0
2650      120      46       56         69.8
2651      121      46       62         62.5
2652      129      45       70         80.1
2771      135      47       81         48.9
2772      130      46       54         47.5
2773      122      46       101        76.6
2774      126      46       80         77.1
2894      126      47       114        89.4
2895      135      47       130        84.7
2896      123      49       114        75.4
2897      129      48       114        83.2
3019      133      53       102        59.3
3020      136      51       117        68.7
3021      122      46       110        89.8
3022      128      47       123        87.0
3023      118      46       110        74.4
3145      131      50       94         67.4
3146      128      48       103        68.8
3147      129      48       94         69.7
3148      120      47       92         78.1
Subsets for levels of subordination unfold in a parallel manner, but with an opposite sense of quality. The level one
subset consists of units (cases) that do not have any other units that are clearly inferior in the sense of being no better
on any indicator and being worse on at least one indicator. Level two applies the same criterion to the remainder
after removing level 1 from consideration. An increasing level of subordination therefore reflects more and more units in
lower levels that are clearly inferior according to the subordination view, so an increasing level number for
subordination can be interpreted as increasing consensus on superiority.
A joint view breaks the subsets from separate views down into subsets of units (cases) having a particular level of
domination coupled with a particular level of subordination. We refer to these jointly classified sets as being salience
sets (Myers & Patil, 2008). A plot is then generated of instances of salience sets with domination level on the
abscissa (X- or horizontal axis) and subordination level on the ordinate (Y- or vertical axis). Instances plotting in the
upper-left are preferred by virtue of high superiority and low inferiority, whereas instances in the lower-right exhibit
low superiority and high inferiority. Instances plotting on an upper-left to lower-right diagonal are consistent with
regard to the two views. Greater departure from this diagonal implies that the messages conveyed by the indicators
are more mixed. Where primary interest lies in the best of the best or the worst of the worst, such a graphic is highly
appropriate for making selections.
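A joint classification into salience sets can be sketched as follows. The sketch assumes that larger indicator values are better and uses a single peeling routine for both views; the names `peel` and `salience_sets` and the toy units are hypothetical and are not drawn from the prototype software.

```python
from collections import defaultdict


def dominates(a, b):
    """True if a is no worse than b on every indicator and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def peel(units, blocked):
    """Assign levels by repeatedly removing units that no remaining unit blocks."""
    remaining = dict(units)
    levels = {}
    level = 0
    while remaining:
        level += 1
        current = [
            u for u in remaining
            if not any(blocked(remaining[u], remaining[o])
                       for o in remaining if o != u)
        ]
        for u in current:
            levels[u] = level
            del remaining[u]
    return levels


def salience_sets(units):
    """Group units by (domination level, subordination level)."""
    # Domination view: a unit waits while some other unit dominates it.
    dom = peel(units, lambda mine, other: dominates(other, mine))
    # Subordination view: a unit waits while it still dominates some other unit.
    sub = peel(units, lambda mine, other: dominates(mine, other))
    sets = defaultdict(list)
    for u in units:
        sets[(dom[u], sub[u])].append(u)
    return dict(sets)


# Hypothetical toy data with two indicators per unit.
toy_units = {"A": (3, 1), "B": (1, 3), "C": (2, 1), "D": (1, 1)}
toy_sets = salience_sets(toy_units)
for (dom_level, sub_level), members in sorted(toy_sets.items()):
    print(dom_level, sub_level, members)
```

In the toy data, unit A has domination level 1 and the highest subordination level, so it would plot in the upper-left of the salience graph, while D has the highest domination level and subordination level 1 and would plot in the lower-right.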
Since the logic of prioritization according to salience rests only on the orderings or rankings of the cases on the
respective indicators, this is a non-parametric approach involving only sorting and ordering. Accordingly, it is
expedient as well as lending clarity to the subsequent aspects of the process if the data on indicators are converted to
ranks and comparative computations then done on the ranks.
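The conversion to ranks can be illustrated briefly. The sketch below uses dense ranks (tied values share a rank, and larger values receive larger ranks), which is one of several order-preserving choices and is an assumption here; the values are the BirdSp and MamlSp entries of the first four zones in Table A.

```python
def dense_ranks(values):
    """Rank values from 1 upward; ties share a rank and larger values rank higher."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]


# BirdSp and MamlSp for zones 2409, 2410, 2529 and 2530 (Table A).
bird_sp = [130, 128, 133, 123]
maml_sp = [45, 43, 45, 45]

bird_ranks = dense_ranks(bird_sp)
maml_ranks = dense_ranks(maml_sp)
print(bird_ranks, maml_ranks)
```

Because the ranks preserve every "greater than", "less than" and "equal to" relation within an indicator, domination and subordination comparisons made on the ranks agree with comparisons made on the raw values.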
The first stage of the process is to determine preferred patches to anchor the building of the network consisting of
patches and patch pairings. This initial determination is based entirely on the data in Table A without regard to the
data on pairings. The resulting salience plot is presented in Figure C, and the salience data ratings for making the plot
are listed in Table B. From the plot of salience in Figure C it can be seen that preference should be to use patches
(hexagons) having subordinance level of 4 and dominance level of 1 as initial anchors for the prospective network.
Accordingly, it can be seen in Table B that the hexagons having these ratings are 2897 and 3020, which also happen
to be neighbors. Thus the initial core for the prospective network is as shown in Figure D.
Figure C. Graph of salience sets for prioritization of initial hexagons.
Table B. Membership of hexagons in salience sets.
Hexagon  Dominance  Subordinance
2409     2          2
2410     2          1
2529     2          1
2530     2          1
2649     3          2
2650     3          1
2651     4          1
2652     3          1
2771     2          2
2772     3          1
2773     2          2
2774     2          2
2894     1          3
2895     1          3
2896     1          2
2897     1          4
3019     1          2
3020     1          4
3021     1          3
3022     1          3
3023     2          1
3145     2          3
3146     2          3
3147     2          3
Figure D. Initial units according to prioritization by salience.
Having obtained the initial units by the prioritization protocol, we can proceed to a first-stage expansion through
suitable modification for selection of pairings. The major modification consists of allowing only those hexagons
bordering 2897 or 3020 to become candidates. With considerably fewer candidate linkages than hexagons, there are
only two levels on the salience axes as given in Table C. However, there is still a definite segregation of preference
for linking numbers 2896 and 3021 into the network, with both of these linkages connecting to 2897. Thus the first-stage
expansion map is as depicted in Figure E. Subsequent expansion of the network is a matter of repeating this
scenario until the hexagons that would be linked into the network fall into the less desirable positions of Figure C. Of
the two elements just added, both are in the most favorable column with regard to dominance. Unit 3021 is in the
second best position with respect to subordination, and 2896 is one step below that.
Table C. Membership of first-stage linkages in salience sets.
Hexagon  Dominance  Subordinance
2774     1          1
3019     1          1
2896     2          1
2896     1          2
3021     1          2
3145     1          1
3021     2          1
Figure E. Map of developing network after first-stage expansion.
4.2 Pattern Extraction
Pattern extraction from multivariate environmental information is notably important in our biodiversity work from
two standpoints. One of these pertains to the pursuit of prioritization with multiple indicators as above. The
computations involved in partial ordering are highly iterative and recursive, which can lead to combinatorial
constraints on practicality as the number of cases (instances) increases from a few hundred to thousands. One way of
coping with these computational challenges is to use multivariate pattern extraction techniques to obtain collectives
of cases that have similar patterns of indicator values. The collectives can then be treated comparatively for salience
in terms of central values of indicators for the cases comprising a collective. Cases comprising the salient collectives
can then be further prioritized among individuals in a multi-stage modality. Pattern extraction, in the form of strategically chosen
clustering techniques, is thus used as a kind of data compression. In so doing it is essential that emphasis be placed on
obtaining a high degree of homogeneity within clusters as opposed to obtaining fewer clusters that have large inter-cluster differences.
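To make the compression idea concrete, the following sketch groups cases into collectives and summarizes each collective by its central values. The text does not name a specific clustering technique, so a simple leader-style rule is used here purely as a stand-in: a case joins the first collective whose leader lies within a chosen radius, which keeps collectives homogeneous at the cost of producing more of them. The function names, the radius and the toy cases are all hypothetical, and the indicators are assumed to be on comparable scales (for example, ranks).

```python
import math


def leader_clusters(cases, radius):
    """Assign each case to the first leader within `radius`, else start a new collective."""
    leaders = []  # list of (leader_values, member_names)
    for name, values in cases.items():
        for leader_values, members in leaders:
            if math.dist(leader_values, values) <= radius:
                members.append(name)
                break
        else:
            leaders.append((tuple(values), [name]))
    return [members for _, members in leaders]


def centroid(cases, members):
    """Central (mean) indicator values for the cases in one collective."""
    columns = zip(*(cases[m] for m in members))
    return tuple(sum(col) / len(members) for col in columns)


# Hypothetical toy cases with two indicators each.
toy_cases = {"a": (0, 0), "b": (1, 0), "c": (10, 10), "d": (10, 11)}
collectives = leader_clusters(toy_cases, radius=2)
centers = [centroid(toy_cases, members) for members in collectives]
print(collectives, centers)
```

The central values can then be compared for salience in place of the individual cases, and the cases inside the salient collectives prioritized individually in a second stage.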
The other major role for pattern extraction is in its signal processing sense for provisional partitioning of landscapes
into mosaics from remotely sensed multispectral data. Placement of spatial instances of the spectral patterns on the
landscape provides a point of departure for comparative landscape ecological investigation of habitat diversity. The
image data are thus progenitors for pattern maps of the landscape, with patterns providing a further framework
for analysis of attributes at multiple scales. Without such a quasi-intelligent spatial organization of the landscape, it
becomes quite difficult to conduct synoptic investigations and to detect secondary patterns of change in the landscape
over time. Landscape change dynamics determine sustainability of ecological diversity. Our poly-pattern approach
to landscapes through remote sensing (Figure F) provides an adaptable image model that serves purposes well beyond
the initial images.
Figure F. Pattern-based view of a landscape in Jalgaon, India from remote sensing.
The pattern picture of the landscape in Figure F would not be possible to render in this manner by standard remote
sensing techniques such as color infrared (CIR) composites. It draws information from all bands, not just three
bands, giving contrasts and coloration not otherwise possible. Coloring healthy vegetation green is much more easily
understandable by the public than a reddish rendering in CIR. In pattern mode, each pattern can be rendered
selectively without affecting the rendering of other patterns. This is more informative and gives greater distinctions
among landscape elements than classical remote sensing methods of enhancement. Furthermore, the areas occupied
by each individual color casting can be easily extracted since this is a map of image information rather than a 3-band
composite. Pattern-Based Compression of Multi-Band Image Data for Landscape Analysis (Myers & Patil, 2006 –
Springer) offers a guidebook for landscape pattern perspectives from remote sensing.
5. Digital Governance and Hotspot Geoinformatics with Forest Cover Data
5.1 Introduction
The impact of modern communication and information technologies on society cannot be overstated. Its
recognition is reflected in the emergence of digital governance worldwide. The purpose of digital governance, stated variously, is to
empower the public with information access and analysis to enable transparency, accuracy, and efficiency for societal good at large.
In this context, development and applications of methodologies for geoinformatic hotspot analysis of spatial and temporal data
are of utmost importance.
Patil and Taillie (2004) proposed the upper level set (ULS) scan statistic. Patil et al. (2008a) report software implementation of
the ULS scan statistic. The ULS scan statistic and its software implementation differ from the widely used SaTScan system in
three main respects:
• The ULS scan statistic uses an irregularly shaped scanning window unlike the circularly shaped window used by SaTScan.
• The ULS scan statistic can be used to detect hotspots in any structure with the network topology whereas SaTScan is applicable to geospatial regions only.
• The software provides an option of the use of the gamma distribution to model response data that are of continuous nature in addition to the binomial and Poisson models.
The second item in the above list seems to be quite significant in view of wide interest in hotspots in a network setting such as
sensor networks [Patil et al. (2008b)]. In addition to the responses that can be modeled using binomial, Poisson, and gamma
distributions, there is a need for a model that can handle continuous fraction responses. In this section, we present a model that
can be used to analyze this kind of data for hotspot detection. The beta distribution is a natural choice for modeling continuous
fractional data. But because of its lack of additive property, it is not suitable for generating simulated replications of data which
are essential for computing p-values. We propose a suitable transformation of the data so that the gamma distribution serves as a
reasonably good approximate model. The software reported in Patil et al. (2008a) now has the capability to process continuous
fractional data. We illustrate use of the software and viability of the proposed model for detecting hotspots with forest cover data.
Forest cover may at times be seen as a biodiversity indicator.
5.2 Scan statistics for geospatial hotspot detection.
We have the following:
R: A geographical connected region,
T: A set of ‘cells’ forming a partition or tessellation of R,
N: cardinality of T,
n1, n2, …, nN: ‘sizes’ of the N cells, and
y1, y2, …, yN: responses of interest for the N cells.
Here y1, y2, …, yN are assumed to be a particular realization of independently distributed random variables Y1, Y2, …, YN that
have distributions with a common form but with different parameter values that account for cell-to-cell response variation.
Interpretation of size of a cell depends on the context. For example, if Y1, Y2, …, YN are binomial random variables, then n1, n2,
…nN are respective numbers of trials. If Y1, Y2, …, YN have Poisson distribution, where, for a = 1, 2, …, N, Ya represents the
number of events of a given type that occur at random in cell a with intensity λa, then n1, n2, …nN are areas of the N cells and
E[Ya] = naλa. In general, for a = 1, 2, …, N, ya/na is the response rate or the intensity of the response for cell a. The spatial scan
statistic seeks to identify “hotspots” or clusters of cells that have elevated response or, more precisely, elevated response rates
compared with the rest of the region. Clearly, we are interested in the responses adjusted for cell sizes rather than in raw
responses. It is possible that adjustment for some other characteristic such as gender or age is meaningful in some studies. Given
a cluster of cells, C, the response rate for the cluster is the ratio:
∑Cya / ∑Cna
where ∑C indicates summation over cells a belonging to the cluster C. This suggests that we assume that the parametrized
family of distributions of Y1, Y2, …, YN is additive. In addition, a cluster of cells to be considered as a potential hotspot or a
candidate hotspot needs to satisfy two geometrical properties:
(1) Cells within the cluster should be connected, that is, any two cells a1, a2 in the cluster should be adjacent to each other or
there should be a sequence of cells b1, b2, … bk, k ≥ 1, all inside the cluster, such that a1 is adjacent to b1, a2 is adjacent to
bk, and any two successive cells in the sequence are adjacent to each other. Such a cluster of connected cells will be called a
zone. The set of all zones in R will be denoted by Ω. This requirement merely says precisely that two disjoint clusters with
significantly elevated responses would constitute two distinct hotspots.
(2) The zone should not be excessively large so that the complement of the zone rather than the zone itself would constitute the
background. This is achieved by limiting the search for hotspots to zones that do not comprise more than a certain threshold
percentage, say, fifty percent of the entire region in size.
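The two geometrical requirements can be checked directly from the adjacency structure and the cell sizes. The sketch below is illustrative only: the function names are hypothetical, adjacency is assumed to be given as a dictionary from each cell to its neighbouring cells, and the toy region is a chain of four unit-size cells.

```python
from collections import deque


def is_connected(cells, adjacency):
    """True if every cell in `cells` can be reached from any other without leaving `cells`."""
    cells = set(cells)
    if not cells:
        return False
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour in cells and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == cells


def is_eligible_zone(cells, adjacency, sizes, max_fraction=0.5):
    """Requirement (1): connected. Requirement (2): size at most `max_fraction` of the region."""
    total_size = sum(sizes.values())
    zone_size = sum(sizes[c] for c in cells)
    return is_connected(cells, adjacency) and zone_size <= max_fraction * total_size


# Hypothetical toy region: four cells in a chain 0 - 1 - 2 - 3, each of size 1.
toy_adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
toy_sizes = {0: 1, 1: 1, 2: 1, 3: 1}
print(is_eligible_zone({0, 1}, toy_adjacency, toy_sizes))
```

With the fifty percent threshold, the cluster {0, 1} qualifies as a zone, {0, 2} fails because its cells are not adjacent, and {0, 1, 2} fails because it covers three quarters of the region.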
The process of hotspot detection then involves testing for each eligible zone in Ω the null hypothesis that its response rate is the
same as that of the rest of the region, that is, the zone is not a hotspot, against the alternative hypothesis that its response rate is
higher in comparison with that of the rest of the region. We conclude that there is no hotspot if the null hypothesis is not rejected
for each eligible zone. This hypothesis testing model is formulated precisely as described below with the binomial response
model used for illustration. Under the binomial response model, each Ya is Binomial (na, pa), 1 ≤ a ≤ N and Y1, Y2, …, YN are
independently distributed. Then the null hypothesis that there is no hotspot, that is, response rates for all cells are equal is stated
as:
H0: p1 = p2 = … = pN = p0, say,
against the alternative hypothesis that there exists a non-empty zone Z in Ω for which the response rate is higher than that for the
rest of the region. Formally, the alternative hypothesis is:
H1: There is a non-empty zone Z ∈ Ω and values 0 ≤ pnz, pz ≤ 1 such that
$$p_a = \begin{cases} p_z & \text{for all cells } a \text{ in } Z, \\ p_{nz} & \text{for all cells } a \text{ in } R - Z, \end{cases}$$
and pz > pnz.
The zone Z specified in the alternative hypothesis is an unknown parameter along with pz and pnz. Thus the full model involves
three parameters: Z, pz, and pnz, with Z ∈ Ω and H0 implying Z = ∅. For testing the null hypothesis using the likelihood ratio
under the binomial response model, the maximum likelihood estimates (MLEs) p̂0, p̂z, and p̂nz of p0 under
H0 and of pz and pnz for a given zone Z under H1 are readily obtained as the respective response ratios, so that the likelihood functions
L0(p̂0) and L1(Z, p̂z, p̂nz) are available. Our objective is to maximize L1(Z, p̂z, p̂nz) as Z varies over Ω, that is, to compute the
MLE of Z. If the ratio of the maximized L1(Z, p̂z, p̂nz) to L0(p̂0) is significantly high, then the MLE of Z is declared a hotspot.
However, Ω is generally so large that it is impractical to maximize L1(Z, p̂z, p̂nz) as Z varies over Ω by exhaustive search. One approach to obtaining an
approximate solution to the maximization problem is to replace the original parameter space Ω by a smaller, more tractable
subset Ω0 of Ω, and maximize L1(Z, p̂z, p̂nz) as Z varies over Ω0 by exhaustive search. Success of this reduction of
the parameter space depends on how well the reduced parameter space Ω0 brackets the MLE over full Ω.
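For a single candidate zone, the likelihood ratio under the binomial model reduces to a short computation on the response ratios. The following is a sketch under stated assumptions, not the code of the ULS software: the function names are hypothetical, the binomial coefficients are omitted because they cancel in the ratio, the ratio is reported on the log scale, and it is set to zero when the zone rate does not exceed the rate outside the zone, in keeping with the one-sided alternative.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_loglik(y, n, p):
    """Binomial log-likelihood at p, up to the constant binomial coefficient."""
    return _xlogy(y, p) + _xlogy(n - y, 1.0 - p)


def log_likelihood_ratio(zone, y, n):
    """log[L1(Z, pz_hat, pnz_hat) / L0(p0_hat)] for a proper, non-empty zone Z."""
    y_z = sum(y[a] for a in zone)
    n_z = sum(n[a] for a in zone)
    y_all, n_all = sum(y.values()), sum(n.values())
    y_nz, n_nz = y_all - y_z, n_all - n_z
    p0_hat, pz_hat, pnz_hat = y_all / n_all, y_z / n_z, y_nz / n_nz
    if pz_hat <= pnz_hat:
        return 0.0  # the zone is not elevated relative to the rest of the region
    return (binomial_loglik(y_z, n_z, pz_hat)
            + binomial_loglik(y_nz, n_nz, pnz_hat)
            - binomial_loglik(y_all, n_all, p0_hat))


# Hypothetical toy data: three cells with 10 trials each.
toy_y = {0: 8, 1: 2, 2: 2}
toy_n = {0: 10, 1: 10, 2: 10}
llr_zone0 = log_likelihood_ratio({0}, toy_y, toy_n)
llr_zone1 = log_likelihood_ratio({1}, toy_y, toy_n)
print(llr_zone0, llr_zone1)
```

Maximizing this quantity over a reduced collection Ω0 of zones, rather than over all of Ω, is what keeps the search tractable.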
The widely used SaTScan software [Kulldorff (2006)] uses Ω0 = ΩSaTScan, obtained as the set of zones covered by a collection of
series of expanding concentric circles with centers at the centroid of each cell.
It may do a poor job of detecting actual hotspots that are not quite compact. Below we review the ULS scan statistic, an
alternative to the circular scan statistic, developed by Patil and Taillie (2004), that depends on the data and takes care of
connectedness of clusters using adjacency. It is based on the concept of the upper level set (ULS) tree. For a comparison
of the circular scan statistic and the ULS scan statistic, the reader is referred to Patil and Taillie (2004).
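One way to picture a data-dependent reduced parameter space is to take, for each observed response rate g, the cells whose rate is at least g and split them into connected components. The sketch below assumes that the candidate zones of the ULS approach are exactly these connected components of upper level sets; this reading, the function name and the toy data are assumptions made here for illustration, and Patil and Taillie (2004) remain the authoritative description. The size threshold on zones is not applied in this sketch.

```python
def upper_level_set_zones(rates, adjacency):
    """Connected components of {a : rate[a] >= g} for every observed rate g."""
    zones = set()
    for level in sorted(set(rates.values()), reverse=True):
        upper = {a for a, r in rates.items() if r >= level}
        unvisited = set(upper)
        while unvisited:
            start = unvisited.pop()
            component = {start}
            stack = [start]
            while stack:
                current = stack.pop()
                for neighbour in adjacency[current]:
                    if neighbour in upper and neighbour not in component:
                        component.add(neighbour)
                        stack.append(neighbour)
            unvisited -= component
            zones.add(frozenset(component))
    return zones


# Hypothetical toy region: four cells in a chain 0 - 1 - 2 - 3.
toy_adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
toy_rates = {0: 0.9, 1: 0.2, 2: 0.8, 3: 0.5}
toy_zones = upper_level_set_zones(toy_rates, toy_adjacency)
print(sorted(sorted(z) for z in toy_zones))
```

In the toy chain only four candidate zones arise, which illustrates how small such a collection can be compared with the set of all connected subsets, and the zones need not be compact in shape.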
Because of their wide applicability in epidemiology, the binomial and Poisson models are implemented in SaTScan, which has a large
following. Continuous response models have received much less attention. Patil and Taillie (2004) discuss an
approach to modeling continuous response distributions, with gamma and lognormal as illustrations.
5.3 Continuous Fractional Response Model and Forest Cover Data Analysis
The gamma model discussed in the previous section is applicable in hotspotting when continuous responses are positive valued
and additive in nature, a situation that occurs quite frequently. Another situation with continuous responses that occurs
frequently in practice is when they are between 0 and 1 when it seems plausible to postulate Y ~ beta(α, β) with the pdf:
$$f_Y(y; \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, y^{\alpha-1} (1-y)^{\beta-1}, \qquad 0 < y < 1, \quad \alpha > 0, \ \beta > 0,$$
where Y represents a typical cell response. However, the beta family does not possess the additive property. Hence, to begin
with, we propose the transformation:
X = Y/(1-Y).
X has the beta distribution of the second kind (Patil et al 1984) with the pdf
$$f_X(x; \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \frac{x^{\alpha-1}}{(1+x)^{\alpha+\beta}}, \qquad x > 0, \quad \alpha > 0, \ \beta > 0.$$
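For completeness, the density of X follows from that of Y by a change of variables. The short derivation below is a worked check supplied here rather than part of the original development; it uses only the beta density and the transformation X = Y/(1-Y) given above.

```latex
\begin{align*}
y &= \frac{x}{1+x}, \qquad 1-y = \frac{1}{1+x}, \qquad \frac{dy}{dx} = \frac{1}{(1+x)^{2}}, \\
f_X(x;\alpha,\beta)
  &= f_Y\!\left(\frac{x}{1+x};\alpha,\beta\right)\frac{dy}{dx}
   = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}
     \left(\frac{x}{1+x}\right)^{\alpha-1}
     \left(\frac{1}{1+x}\right)^{\beta-1}
     \frac{1}{(1+x)^{2}} \\
  &= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,
     \frac{x^{\alpha-1}}{(1+x)^{\alpha+\beta}}, \qquad x > 0 .
\end{align*}
```

The exponents of (1+x) combine as (α-1) + (β-1) + 2 = α+β.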
We note that this distribution also arises as a mixture of the gamma distribution gamma(k, α) on parameter k where 1/k ~
gamma(1, β) [Patil et al. (1984)]. In the absence of an exact model with properties conforming to our
guiding principles, it appears reasonable to approximate the exact model, namely the mixture of gamma distributions, with a
straight gamma distribution that satisfies our criteria. Thus we propose to treat Y/(1-Y) as gamma(k, β).
In many situations with continuous fractional response the beta distribution of the first kind (Patil et al 1984) rather than the
standard beta may be more applicable. The beta distribution of the first kind with parameters r, s, α, and β has the pdf
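$$f_Y(y; r, s, \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \frac{(y-r)^{\alpha-1} (s-y)^{\beta-1}}{(s-r)^{\alpha+\beta-1}}, \qquad r < y < s,$$
where 0 ≤ r < s ≤ 1, α > 0, β > 0.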
The simple transformation, Y’ = (Y – r)/(s – r), takes us to the beta scenario. However, r and s are typically unknown. Hence, to
be able to deal with the beta distribution of the first kind using the technique developed for the standard beta distribution, one
may use the transformation
$$Y' = \frac{Y - \hat{r}}{\hat{s} - \hat{r}}, \qquad (1)$$
where r̂ and ŝ are reasonable estimates of r and s, respectively. For our purpose, we will use ymin and ymax for r̂ and ŝ,
respectively, where
ymin = min { ya | a ∈ T }, and
ymax = max { ya | a ∈ T },
mostly because of computational ease.
In Section 5.4, we will describe an application of the continuous fraction response model to Jalgaon district forest cover data
using software implementation of the ULS scan statistic described in detail in Patil et al. (2008a). Its current version is able to
handle the continuous fraction response model in addition to binomial, Poisson, and gamma models. Results reported in Section
5.4 indicate that application of the gamma model to approximate the beta model of the second kind is a viable technique to do
hotspot detection with data, where the beta model is appropriate.
5.4. Jalgaon District Forest Cover.
Table 2 shows Jalgaon district (Maharashtra) forest cover 2001-02 data by tehsil 1 .
Table 2. Jalgaon District Forest Cover
Serial   Tehsil Name   Geographical      Forest Area   Forest Cover
Number                 Area (Hectares)   (Hectares)
0        Amalner         844.15            21.90       0.02594
1        Bhadgaon        484.53            78.49       0.16199
2        Bhusawal        413.38            29.60       0.07160
3        Bodvad          398.77            56.37       0.14136
4        Chalisgaon     1217.63           121.11       0.09946
5        Chopda          954.36           162.13       0.16988
6        Dharangaon      463.53            19.47       0.04200
7        Edlabad         646.11           132.57       0.20518
8        Erandol         511.03            24.26       0.04747
9        Jalgaon         825.07           142.68       0.17293
10       Jamner         1360.72           155.72       0.11444
11       Pachora         820.41            72.46       0.08832
12       Parola          791.21            98.59       0.12461
13       Raver           935.70           264.05       0.28220
14       Yawal           954.38           308.18       0.32291
Total                  11620.98          1687.58       0.14522
We intend to determine if some cluster of tehsils can be considered as a hotspot based on the data in the last column of the table,
using the ULS software (referred to as the ‘program’ hereafter) mentioned above. Figure 3 shows the contents of the input file to
the program, which are described below :
0   1  0.025943257  5 6 12
1   1  0.161992034  4 8 11 12
2   1  0.071604819  3 7 9 10 13 14
3   1  0.141359681  2 7 10
4   1  0.099463712  1 11 12
5   1  0.169883482  0 6 9 14
6   1  0.042003754  0 5 8 9 12
7   1  0.205181780  2 3 13
8   1  0.047472751  1 6 9 11 12
9   1  0.172930782  2 5 6 8 10 11 14
10  1  0.114439414  2 3 9 11
11  1  0.088321693  1 4 8 9 10
12  1  0.124606615  0 1 4 6 8
13  1  0.282195148  2 7 14
14  1  0.322911209  2 5 9 13
Figure 3. Input File for Jalgaon Forest Cover Data
The program requires that each cell in the region be identified serially as 0, 1, 2, … with one line of input for each cell in the
data file in that order. The first entry in a given line is the cell identifier, the second entry is the ‘size’ of the cell, the third entry
is the response. Remaining entries in the line are identifiers of cells that are adjacent to the cell mentioned at the beginning of
the line. For the fractional continuous data as in the current case, the model to be used is specified by the user as ‘beta’. For the
beta model, each cell size needs to be input as 1 and the response is assumed to be between 0 and 1. The program automatically
unitizes data using the transformation (1) above. However, it is necessary to adjust the data values 0 and 1 so that computed
probability densities are not zero. The program replaces the zero data value by u/2 and the unit data value by (1 + v )/2, where u
is the smallest non-zero data value and v is the largest non-unit data value after unitization. Unitized data values are further
subjected to the transformation y/(1-y) before application of the gamma model. The program also allows the user to specify the
threshold percentage to limit the size of a hotspot relative to the total size of the entire region. We ran the program five times
with five threshold values of 10%, 20%, 30%, 40%, and 50%. Results of the five runs are summarized in Table 3. A map of
Jalgaon district identifying each tehsil and the hotspot consisting of three tehsils as detected by the program are shown in Figure
4.
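The input layout and the preprocessing steps described above can be sketched as follows. This is an illustration of the description in the text and not the code of the program itself; the function names and the toy values are hypothetical, and the example line is invented for the purpose of showing the field order.

```python
def parse_line(line):
    """Split one input line into (cell id, size, response, adjacent cell ids)."""
    fields = line.split()
    return int(fields[0]), float(fields[1]), float(fields[2]), [int(f) for f in fields[3:]]


def unitize(values):
    """Transformation (1) with the sample minimum and maximum as estimates of r and s."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def adjust_endpoints(unit_values):
    """Replace 0 by u/2 and 1 by (1 + v)/2, with u and v as defined in the text."""
    u = min(x for x in unit_values if x > 0)   # smallest non-zero value
    v = max(x for x in unit_values if x < 1)   # largest non-unit value
    return [u / 2 if x == 0 else (1 + v) / 2 if x == 1 else x for x in unit_values]


def to_odds(values):
    """Apply y / (1 - y) so that the gamma model can be used as an approximation."""
    return [y / (1 - y) for y in values]


# Hypothetical line in the layout described above: id, size, response, neighbours.
parsed = parse_line("0 1 0.125 1 2")

# Hypothetical fractional responses chosen so the arithmetic is easy to follow.
toy_responses = [0.125, 0.25, 0.5, 0.625]
unit = unitize(toy_responses)
adjusted = adjust_endpoints(unit)
odds = to_odds(adjusted)
print(parsed, unit, adjusted, odds)
```

Because the sample minimum and maximum are used in transformation (1), the unitized data always contain at least one 0 and one 1, which is why the endpoint adjustment is needed before the densities are evaluated.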
Table 3. Jalgaon District Forest Cover Hotspots
Threshold %   Member Count   Member Tehsils   p-value
10            1              14               0.180
20            3              7,13,14          0.040
30            3              7,13,14          0.049
40            3              7,13,14          0.083
50            3              7,13,14          0.106
Incidentally, it may be worth noting that the zone consisting of tehsils 7, 13, and 14 has the maximum likelihood
value for threshold values of 20%, 30%, 40%, and 50%, although with different p-values. This is explained by the fact
that, when we increase the threshold, we enlarge the set of competing candidate zones, so the maximum likelihood values
occurring in the simulated samples exceed the likelihood of the top candidate zone in the actual data set more often. On the
other hand, with the threshold of 10%, the p-value of the zone {14} is greater than that of the zone {7, 13, 14} when the
threshold is 20%. This is due to a greater probability of a high response over a small area purely by chance. We conclude that
choice of the threshold is an important consideration in hotspot analysis from the point of view of the manager responsible for
making practical decisions.
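The dependence of the p-value on the simulated maxima can be made explicit with a small sketch. The text does not state the exact formula used by the program, so the common Monte Carlo convention (r + 1)/(R + 1) is assumed here, where R is the number of simulated data sets and r is the number whose maximum is at least the observed value; the function name and the numbers are hypothetical.

```python
def monte_carlo_p_value(observed_max, simulated_maxima):
    """Share of data sets (simulated plus observed) with a maximum at least as large."""
    exceed = sum(1 for s in simulated_maxima if s >= observed_max)
    return (exceed + 1) / (len(simulated_maxima) + 1)


# Hypothetical values: one observed maximum and four simulated maxima.
p_narrow = monte_carlo_p_value(5.0, [1.0, 2.0, 6.0, 3.0])

# A larger threshold admits more candidate zones, so each simulated maximum can
# only stay the same or grow; the observed maximum is unchanged when the same
# zone remains the top candidate.
p_wide = monte_carlo_p_value(5.0, [1.0, 5.5, 6.0, 3.0])
print(p_narrow, p_wide)
```

With the same observed value, the second set of simulated maxima exceeds it more often and the p-value rises, which mirrors the pattern across the thresholds of 20% to 50% in Table 3.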
Figure 4. The shaded area is a hotspot at 5% level for thresholds of 20% and 30%
The three tehsils making up the hotspot in Figure 4 are Yawal, Raver, and Edlabad (now known as Muktainagar). All the three
tehsils are located in the Satpuda mountain region and are known for their forest. Appropriately they have been identified as a
hotspot. More importantly, the model presented in the paper asserts through the p-value the degree to which the tehsils stand out
as forest covered areas within the district.
6. A Unique Prototype Novel and Innovative District Level Initiative to Help Restore and Enhance
Agriculture, Biodiversity, Nature Conservation, and Diverse Eco-Cultural Community Development
A district level watershed surveillance and research institute (JalaSRI) : This is now functional in the spirit of
triadic digital governance and hotspot geoinformatics for natural resource monitoring, etiology, early warning, and
sustainable management, with emphasis on model watershed, rural entrepreneurial youth brigades, appropriate
smart sensor networks, etc., to help with improved restoration, enhancement, and impact assessment in response to
district level linking of small rivers and streams as a vehicle for monsoon rainwater harvesting and management in
the face of water scarcity. This is to provide a shot in the arm to restore and enhance agriculture, biodiversity, nature
conservation, drinking water, and community life. The Jalgaon district model of digital governance in this context is
in progress to be a prototype model, bringing together academia, agencies, and communities at the district level to
innovatively help improve the synergistics of present day science and technology and local wisdom for watershed
assessment, development, and sustainable livelihood. JalaSRI is a leading partner together with the district
Collectorate, watershed communities, and others, such as the International Crops Research Institute for the Semi-Arid Tropics
(ICRISAT).
Model Watershed Development : Most rainfed areas in the tropical developing world face water scarcity even
during the crop growing monsoon season. Through community watershed management, scarce water resources can be
conserved through rainwater harvesting and used efficiently for enhancing agricultural productivity, improving livelihoods and
minimizing land degradation. Through a project sponsored by the Ministry of Agriculture, Government of India, a model
watershed of 1000 ha is being established in Jalgaon district by adopting a collective action, convergence, capacity
building and consortium approach for harvesting rainwater, using it efficiently to increase agricultural productivity, improving
livelihoods through income generating activities and building sustainability through capacity building of all the stakeholders.
Dr. SP Wani, Principal Scientist (Watersheds) and Regional Theme Coordinator, GT-Agroecosystems, Asia from International
Crops Research Institute for the Semi-Arid Tropics (ICRISAT), is leading this project. With ICRISAT and JalaSRI as close
collaborators, Dr. SP Wani is building partnership between JalaSRI and ICRISAT with an intention to share the experiences
of ICRISAT as well as of JalaSRI not only in the district but with other organizations working in the area of water
management through developing tropical conditions.
Digital Governance and Hotspot Geoinformatics : The NSF DGP project, with Dr. G.P. Patil as the “Principal
Investigator”, has been instrumental to conceptualize surveillance geoinformatics partnership among several
interested cross-disciplinary scientists in academia, agencies, and private sector across the nations.
Under his able leadership, JalaSRI efforts are driven by a wide variety of case studies of interest to agencies,
academia, and private sector involving critical societal issues, such as public health, ecosystem health, ecohealth,
financial health, biodiversity and threats to biodiversity, emerging infectious diseases, water management and
conservation, persistent poverty, environmental justice, social networks, sensor networks, energy conservation, early
warning, and disaster management. It involves research of space-time diseases, poverty, pollution, object
identification and tracking, early detection, early warning, hotspot trajectories and trends.
River Connectivity Project in Jalgaon District : Monsoon water need not flow away wasted; through
river connectivity it can be used for drinking as well as irrigation and other purposes. The district administration in
Jalgaon, under the versatile collaborative leadership of district executive engineer VD Patil and district collector
Vijay Singhal, IAS, M.Tech. IIT, Delhi, has implemented connectivity of rivers and streams. Four existing canals
were used and repaired and their capacities were also enhanced.
Existing natural nalas and river beds were used to a great extent and some additional canals and channels were also dug.
For this work, 2 crores of rupees from the scarcity fund were obtained from the Government. These 2 crores of rupees
saved at least nine crores of rupees of drinking water alone which would have otherwise been spent on tankers in
supplying drinking water. Thousands of hectares of area have come under irrigation plus there is reduction in water
losses because of repairs of canals.
Total benefits received by agriculturalists have ranged between 45-50 crores. This has helped improve the economic
condition of farmers’ community, prohibiting migration of local and poor people to other areas in search of jobs.
People in the areas are now well versed with the modalities of the project and have much benefited by the results.
Such is the success story in place for connectivity of rivers in the Jalgaon district. It has had its impact on the
potential for restoration and enhancement of biodiversity, in conjunction with the potential for reductions in threats to
biodiversity.
Biodiversity and Habitat Conservation Working Group : Under the leadership of Dr. Gauri Rane,
JalaSRI has since its inception an active biodiversity and habitat conservation group. Dr Rane has
received her training in Pune, Penn State, and Dehradun. The activities of the Group are directed to
study the topography and the biodiversity of the Jalgaon district forest, to identify biodiversity rich areas
and their status in the study area, to list out endemic, endangered, threatened species and medicinal
plants of the forest, to observe geographical distribution of plants and animals and distributional pattern
of the species, to suggest potential site for corridor building, for the comeback of the tiger and to engage
in agroforestry for the district irrigation schemes.
JalaSRI has some fifteen Working Groups across the spectrum relevant at district level. Interestingly,
JalaSRI has its anthem, a powerful anthem. One part speaks of :
To conserve bio-diversity
Is a necessity
Nature has a whole lot of purpose
In its variety
Promote the culture of embracing nature
Take to the task of shaping our future
Geo-Informatics
Environmental statistics
Public health officials
And social scientists
Working together all in accordance
To realise the dream of digital governance
For more equally exciting information inclusive of JalaSRI on Stage Dance Drama, see Patil et al. (2008).
May this Jalgaon district JalaSRI prototype example be instructive and inspirational to districts of similar
makeup in Maharashtra, in India, and in the world.
The following diagrams may be suggestive :
[Figure: An Innovative and Unique Prototype District Level River Linking Initiative.]
[Figure: Project presentation by Vijay Singhal (IAS), Collector Jalgaon, to Hon'ble H.E. Smt. Pratibhatai Patil, President of India, and Hon'ble Sh. Sharad Pawar, Union Minister for Agriculture.]
[Figure: District Level River Linking field trip under the leadership of V.D. Patil, District Executive Engineer.]
[Figure: Model Watershed, Ideal Watershed, Jalgaon, MS, India. In the River Linking Project area; one thousand hectares (approx. 2500 acres); JalaSRI, District Collectorate, and ICRISAT (CGIAR/World Bank); permanent field work station for sustainable livelihood, youth investment, sensor network, digital governance and hotspot geoinformatics, and degree programs in geoinformatics.]
8. References
Grassle, F., G.P. Patil, W. Smith., and C. Taillie. (1979). Ecological Diversity in Theory and Practice. International
Co-operative Publishing House, Fairland, MD.
Gove, J., Patil, G.P., & Taillie, C. (1994). A mathematical programming model for maintaining structural diversity in
uneven-aged forest stands with implications to other formulations. Ecological Modelling, 79, 11-19.
Johnson, G.D. & Patil, G.P. (2006). Environmental and Ecological Statistics Series: Volume 1: Landscape Pattern
Analysis for Assessing Ecosystem Condition. New York, NY: Springer.
Kulldorff, M. (1997). A Spatial Scan Statistic, Communications in Statistics: Theory and Methods, 26(6), 1481-1496.
Kulldorff, M. (2006). SaTScan™ v 7.0: Software for the spatial and space-time scan statistics, Information Management
Services Inc., Silver Spring, MD.
Kulldorff, M., Nagarwalla, N. (1995). Spatial Disease Clusters: Detection and Inference, Statistics in Medicine, 14, 799-810.
Kulldorff, M., Rand, K., Gherman, G., Williams, G., and DeFrancesco, D. (1998). SaTScan v 2.1: Software for the spatial and
space-time scan statistics, National Cancer Institute, Bethesda, MD.
Myers, W.L. and G.P. Patil (2006). Biodiversity in the Age of Ecological Indicators, Acta Biotheoretica, 54, pp.
119-123.
Myers, W., J. Bishop, R. Brooks and G.P. Patil (2001). Composite spatial indexing of regional habitat importance.
Community Ecology 2(2): 213–220.
Myers, W. & Patil, G.P. (2006). Environmental and Ecological Statistics Series: Volume 2: Pattern-based
Compression of Multi-band Image Data for Landscape Analysis. New York, NY: Springer.
Myers, W. and G. P. Patil. 2008. Semi-subordination sequences in multi-measure prioritization problems. Chapter 7
in: R. Todeschini and M. Pavan, Eds. Ranking Methods: Theory and Applications, Volume 27 of Data Handling in
Science and Technology. Amsterdam: Elsevier.
Myers, W., G. P. Patil and Y. Cai. 2006. Exploring patterns of habitat diversity across landscapes using partial
ordering. In: R. Bruggemann and L. Carlsen, Eds. Partial Order in Environmental Sciences and Chemistry. Berlin:
Springer. Pp. 309-325.
Myers, W., J. Bishop, R. Brooks, T. O’Connell, D. Argent, G. Storm and J. Stauffer, Jr. 2000. The Pennsylvania
GAP Analysis final report. The Pennsylvania State University, Univ. Park, PA 16802.
Patil, G.P. & Taillie, C. (1982). Diversity as a concept and its measurement. Journal of the American Statistical
Association, 77, 548-567. (Invited discussion paper).
Patil, G.P. & Taillie, C. (2004a). Upper level set scan statistic for detecting arbitrarily shaped hotspots.
Environmental and Ecological Statistics, 11, 183-197.
Patil, G.P. & Taillie, C. (2004b). Multiple indicators, partially ordered sets, and linear extensions: Multi-criterion
ranking and prioritization. Environmental and Ecological Statistics, 11, 199-228.
Patil, G.P. (2002). Diversity profiles. Technical Report 2001-0206. Also in: A. El-Shaarawi and W.W. Piegorsch
(Eds). Encyclopedia of Environmetrics. Wiley, pp. 555-61.
Patil, G.P. (2001). Statistical ecology and environmental statistics. Technical Report 2001-0401. Also in: Jeff Wood
(Ed). Encyclopedia of Life Support Systems. EOLSS Publisher. United Nations Project.
Patil, G. P. and Taillie, C. (1979a). An overview of diversity. In Ecological Diversity in Theory and Practice 5, (eds. J.
F. Grassle, G. P. Patil, W. K. Smith and C. Taillie), 3-27. Fairland, Maryland, USA: International Co-operative
Publishing House.
Patil, G. P., and Taillie, C. (1979b). A study of diversity profiles and orderings for a bird community in the
vicinity of Colstrip, Montana. In Contemporary Quantitative Ecology and Related Ecometrics, (eds. Patil, G. P.
and Rosenzweig, M.), 23-48. Burtonsville, MD, USA: International Co-operative Publishing House.
Patil, G.P. (2007). Statistical geoinformatics of geographic hotspot detection and multicriteria prioritization for monitoring,
etiology, early warning and sustainable management for digital governance in agriculture, environment, and ecohealth, Journal
of Indian Society of Agricultural Statistics, 61, 132-146.
Patil, G.P., Boswell, M.T., and Ratnaparkhi, M.V. (1984). Dictionary and Classified Bibliography of Statistical Distributions in
Scientific Work. Vol. 2: Univariate Continuous Models, International Co-operative Publishing House, Burtonsville, MD.
Patil, G.P., Taillie, C. (2003). Geographic and network surveillance via scan statistics for critical area detection, Statistical
Science, 18(4), 457-465.
Patil, G.P., Acharya, R., Glasmeier, A., Myers, W., Phoha, S., and Rathbun, S. (2006a). Hotspot detection and prioritization
geoinformatics for digital governance, In Digital Government: Advanced Research and Case Studies, (Eds., H. Chen, L. Brandt,
V. Gregg, R. Traunmuller, S. Dawes, E. Hovy, A. Macintosh, C. Larson), Springer Publishers, US.
Patil, G.P., Modarres, R., Myers, W.L., and Patankar, P. (2006b). Spatially Constrained Clustering and Upper Level Set Scan
Hotspot Detection in Surveillance GeoInformatics, Environmental and Ecological Statistics, 13, 365-377.
Patil, G.P., Acharya, R., Myers, W., Phoha, S., and Zambre R. (2007). Hotspot Geoinformatics for detection, prioritization, and
security, In Encyclopedia of Geographical Information Science, (Eds., S. Shekhar and H. Xiong), Springer Publishers.
Patil, G.P., Acharya, R., and Phoha, S. (2007). Digital governance, hotspot detection, and homeland security, In Encyclopedia
of Quantitative Risk Analysis, Wiley, New York.
Patil, G.P., Acharya, R., Modarres, R., Myers, W.L., and Rathbun, S.L. (2007). Hotspot geoinformatics for digital government.
In Encyclopedia of Digital Government, Volume II, (Eds. Ari-Veikko Anttiroiko and Matti Malkia), 919-927, Idea Group
Reference, Hershey, PA.
Patil, G.P., Joshi, S.W., and Rathbun, S.L. (2007). Hotspot geoinformatics, environmental risk, and digital governance, In
Encyclopedia of Quantitative Risk Analysis, Wiley, New York.
Patil, G.P., Joshi, S.W., Myers, W.L., and Koli, R.E. (2008a). ULS Scan Statistic for Hotspot Detection with Continuous
Gamma Response, In Joe Naus Volume, (Eds. Glaz, Joseph et al.), Birkhauser, Boston, MA (in press).
Patil, G.P., Patil, V.D., Pawde, S.P., Phoha, S., Singhal, V., and Zambre, R. (2008b). Digital governance, hotspot
geoinformatics, and sensor networks for monitoring, etiology, early warning, and sustainable management, In Geoinformatics
for Natural Resource Management, (Ed. P.K. Joshi), Nova Science Publishers, New York (in press).
Patil, G.P., Pawde, S.P., Rane, G.M., Zambre, R.A., Wani, S.P., and Paranjape, J. (2008). A
Picturesque Informative Pamphlet: District Level Watershed Surveillance and Research Institute JalaSRI. Penn State
CSEES TR 2008-1204.
Rane, G.M., Pandey, R.K., Bhardwaj, J., Murthy, R., Myers, W., and Patil, G. (2008).
Biographical and Bio-cultural Context for Collaborative Conservation and Resource Sustainability in Jalgaon District
of Maharashtra, India. JalaSRI TR 2008-1015.
Scott, J. M., F. Davis, B. Csuti, R. Noss, B. Butterfield, C. Groves, H. Anderson, S. Caicco, F. D’Erchia, T. C.
Edwards, Jr., J. Ulliman and R. G. Wright. 1993. GAP Analysis: a geographic approach to protection of biological
diversity. Wildlife Monographs No. 123.
Scott, J., Csuti, B., Estes, J., & Anderson, H. (1989). Status assessment of biodiversity protection. Conservation
Biology, 3, 85-87.
Taylor, L. R. (1978). Bates, Williams, Hutchinson – a variety of diversities. In Diversity of Insect Faunas. L. A.
Mound and N. Waloff (eds), pp. 1-18. Oxford: Blackwell Scientific Publications.
Venables, W. and D. Smith. 2004. An introduction to R, revised and updated. Bristol, UK: Network Theory
Limited. 146 p.