This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
Covariate-Directed Sampling for
Assessing Species Richness
G.P. Patil, Glen Johnson
and Matteo Grigoletto
Abstract.-Since species richness is a spatially non-additive variable, it cannot be estimated with conventional moment estimators.
We may, however, exploit the species-area relationship, which implies that the number of species increases according to a law of diminishing returns as the sampled area increases. An efficient sampling method would then maximize the number of species encountered within a fixed amount of area that can be affordably sampled. We suggest that when spatial covariate information is available, it should be exploited for directing the location of sample
units in a manner that increases the habitat diversity observed
within a minimum number of sample units. This approach was evaluated through a retrospective assessment of breeding bird species
richness in Pennsylvania using cumulative tree richness as the covariate. Working with EMAP hexagons (635 square kilometers) as
the primary sampling unit, we found that although tree richness
was a fairly weak covariate, it still outperformed random sampling.
INTRODUCTION
With such a strong and legitimate concern for the alarming loss of biodiversity on our planet (Wilson, 1988; Stevens, 1995), monitoring methods are
essential for quantifying the biodiversity of large geographic regions. With all
the shortcomings of "indices", actual species richness (the number of different
species) appears to be the least controversial measure of biodiversity. Meanwhile, biodiversity researchers recognize that reliable methods of estimation
still require development (Yoon, 1995).
Since species richness is a spatially non-additive variable, we cannot estimate its total from conventional moment estimators, such as by multiplying a
sample mean by the number of population units in the sampling frame. For
this reason, we turn to species-area curves, which are used by ecologists for
several reasons (Kilburn, 1966), including the prediction of species richness in
larger areas than those sampled (Evans, Clark and Brand, 1955).
Center for Statistical Ecology and Environmental Statistics, Department of Statistics,
Penn State University, University Park, PA 16801
The species-area relationship basically states that as the area within a homogeneous habitat increases, the number of different species encountered will
also increase until "a point of no return", after which increasing the area does
not further increase the number of different species encountered. The common model for this process, as originally proposed by Arrhenius (1921), is a
power function, presented as S = kA^z, where S is the species richness and A
is the area, while z and k are population-specific parameters. For applications
to wildlife conservation issues, see Usher (1985). Although the power function has traditionally been used to model the species-area relationship within
homogeneous habitat, Johnson and Patil (1995) observed a classical power
function response for breeding bird richness across the whole state of Pennsylvania which encompasses very heterogeneous habitat. If the power function is
fit from sample data, its ability to extrapolate to a larger area is limited by
upward bias since this model is unbounded (Williams, 1995).
Our objective is to develop a sampling plan that maximizes the acceleration
of a species-area curve towards its plateau in order to encounter the most
species within a sampled area. This is especially critical when sampling from
a very large geographic area like the state of Pennsylvania since this can rapidly
become a very expensive exercise.
SAMPLING STRATEGIES
When constructing a species-area curve from successive aggregation of discrete sample units, the usual approaches are to either combine sample units
in a continuous fashion or to combine units that are obtained at random from
throughout the region of interest.
When habitat is diverse across the region, random aggregation may result
in a steeper curve than is obtained from continuous aggregation because spatially discontinuous sample units may encounter more diverse habitats, therefore increasing the chance of encountering different species. The expected
number of species encountered in n sample units of equal size obtained at
random, E[S_n], can be readily computed (Kobayashi, 1979), providing a
benchmark against which to compare other sampling protocols.
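As a minimal sketch of how such a benchmark can be computed, assume presence/absence data and n units drawn at random without replacement from N equal-sized units. The expected richness is then the sum, over species, of the probability that at least one of the n units contains the species. This hypergeometric form is a standard result and is assumed here to match the computation cited above; the function name and the toy counts are illustrative only.

```python
from math import comb


def expected_richness(incidence_counts, total_units, n):
    """Expected number of species seen in n units drawn at random
    without replacement from total_units equal-sized units.

    incidence_counts: for each species, the number of units in which
    that species occurs (presence/absence data).
    """
    if not 0 <= n <= total_units:
        raise ValueError("n must lie between 0 and total_units")
    all_draws = comb(total_units, n)
    expected = 0.0
    for m in incidence_counts:
        # Probability that none of the n sampled units contains the species.
        p_missed = comb(total_units - m, n) / all_draws
        expected += 1.0 - p_missed
    return expected


# Toy example (hypothetical data): 4 units, 3 species occurring in 4, 1 and 2 units.
toy_counts = [4, 1, 2]
curve = [expected_richness(toy_counts, 4, n) for n in range(5)]
```

Evaluating the function for n = 1, ..., N traces the expected species-area curve under completely random sampling, against which a directed ordering can be compared.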
An alternative approach to continuous aggregation or random sampling is
to perform directed sampling based on values of some covariate that are readily
available for the sample units. With the advent of geographic information systems, we feel that such covariate information is becoming more readily available
for geographic areas at the landscape scale and above. The desired property
of a covariate would be to direct sampling in a manner that accelerates the
species-area curve faster than would be observed with spatially continuous or
random sampling.
BREEDING BIRDS IN PENNSYLVANIA
Figure 1: Bird richness in the hexagons. (Map legend: number of species, in classes 55 to 64, 64 to 73, 73 to 82, 82 to 91, 91 to 100 and 100 to 109.)
We evaluated the proposed approach of covariate-directed sampling through
a retrospective study of a known community of breeding birds in Pennsylvania. Our database is described in Johnson and Patil (1995). The sampling
frame consists of a tessellation of Pennsylvania by hexagons, each 635 km^2,
of the Environmental Protection Agency's Environmental Monitoring and Assessment Program (EMAP). Associated with each hexagon are species lists
for breeding birds, other vertebrate groups and trees. While other groups are
based on records of occurrence, the breeding birds are based on the much more
thorough Pennsylvania Breeding Bird Survey (Brauning, 1992). The distribution of species richness for breeding birds with respect to EMAP hexagons is
displayed in Figure 1 in the form of a greyscale thematic map.
Of the information available in our database, tree species presented the
most promising covariate for choosing an optimal hexagon ordering for ultimately measuring bird species richness. We hypothesized that differences in bird species are likely to be associated with differences in tree
species; therefore, if hexagons are chosen in an order that corresponds to maximum acceleration of the tree species richness curve, using this same ordering
will accelerate the bird richness curve. The optimal tree species
richness curve was constructed by choosing the first hexagon as the one containing the highest tree species richness. After noting which species were in
the first hexagon, these species were deleted from the species lists of the remaining hexagons. The second hexagon was then chosen as the one containing
the highest remaining tree richness. These selection and deletion steps were then repeated until all the tree
species had been accounted for.
Figure 2: Species-area curve from tree-directed sampling for one cycle,
followed by the expected value from subsequent random sampling. The
expected curve from completely random sampling is also shown.
(Horizontal axis: hexagons observed. Curves: first directed cycle; mean for random sampling after first cycle; mean for random sampling.)
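The selection rule described above can be sketched as a greedy ordering. This is a minimal illustration under stated assumptions, not the code used in the study: the hexagon identifiers, the toy species lists, the function names, and the tie-breaking rule (smallest identifier first) are all hypothetical.

```python
def greedy_order(tree_lists):
    """Order units for one cycle so that each pick adds the most
    not-yet-seen covariate (tree) species.

    tree_lists: dict mapping a unit id to the set of tree species there.
    Stops as soon as every tree species has been accounted for.
    """
    remaining = {unit: set(species) for unit, species in tree_lists.items()}
    order = []
    while any(remaining.values()):
        # Unit with the most tree species not yet encountered; iterating
        # over sorted ids breaks ties by smallest id (an assumption).
        best = max(sorted(remaining), key=lambda u: len(remaining[u]))
        seen_now = remaining.pop(best)
        order.append(best)
        # Delete the newly seen species from all remaining units.
        for species in remaining.values():
            species -= seen_now
    return order


def accumulate(order, bird_lists):
    """Cumulative number of distinct bird species along an ordering."""
    seen, curve = set(), []
    for unit in order:
        seen |= bird_lists[unit]
        curve.append(len(seen))
    return curve


# Toy data (hypothetical): four hexagons with tree and bird species lists.
trees = {"h1": {"oak", "maple", "pine"}, "h2": {"oak", "maple"},
         "h3": {"birch", "pine"}, "h4": {"birch"}}
birds = {"h1": {"robin", "wren"}, "h2": {"robin"},
         "h3": {"wren", "owl"}, "h4": {"owl"}}

order = greedy_order(trees)            # ["h1", "h3"]
bird_curve = accumulate(order, birds)  # [2, 3]
```

The bird species lists play no part in choosing the order; they are only read off along the ordering that the tree covariate produces, which is what makes the sampling covariate-directed.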
We discovered that all tree species were accounted for within the first nine
hexagons sampled. At this point we experimented with two techniques. The
first one is to randomly sample part of the remaining hexagons. The second
technique is to reintroduce all of the tree species that are within the unsampled
hexagons, repeating the procedure used for the first cycle, and so on for a
certain number of cycles (this is called a completely directed procedure). At
each step the new bird species found in the hexagon were recorded. The
completely directed procedure proved to perform somewhat better, as seen in
Figures 2 and 3 which provide results over the whole state.
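The completely directed procedure can be sketched in the same spirit. After each cycle, the full tree species lists of the unsampled hexagons are reintroduced and the greedy selection starts again. The function name, the toy data and the tie-breaking rule are assumptions made for illustration only.

```python
def directed_cycles(tree_lists, n_cycles):
    """Repeat the greedy tree-directed cycle on the units not yet sampled.

    Returns a list of cycles, each a list of unit ids in sampling order.
    """
    unsampled = {u: set(s) for u, s in tree_lists.items()}
    cycles = []
    for _ in range(n_cycles):
        # Reintroduce the full tree lists of every unsampled unit.
        remaining = {u: set(s) for u, s in unsampled.items()}
        cycle = []
        while any(remaining.values()):
            # Most not-yet-seen tree species; ties go to the smallest id
            # (an assumption made for reproducibility).
            best = max(sorted(remaining), key=lambda u: len(remaining[u]))
            seen_now = remaining.pop(best)
            cycle.append(best)
            del unsampled[best]
            for species in remaining.values():
                species -= seen_now
        if not cycle:
            break  # no unsampled unit holds any tree species
        cycles.append(cycle)
    return cycles


# Toy data (hypothetical): four hexagons with their tree species lists.
toy_trees = {"h1": {"oak", "maple", "pine"}, "h2": {"oak", "maple"},
             "h3": {"birch", "pine"}, "h4": {"birch"}}

cycles = directed_cycles(toy_trees, 3)  # [["h1", "h3"], ["h2", "h4"]]
```

Concatenating the cycles gives a single sampling order along which newly encountered bird species can be recorded at each step.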
For the protocol which sampled the first nine hexagons by tree-directed
sampling, we fit the power function model, via linear regression, for additional
random samples of 20, 30 and 50. The results are presented in Table 1, where
extrapolation to the total area (statewide) appears to yield unacceptably high
bias. Because we use a covariate (number of tree species) for the first cycle
followed by random sampling, the cumulative number of tree species considered
increases only over the first nine hexagons and remains constant thereafter
(only the area increases). To take this into account, we might use a model of
the form S = kA^z T^p, where T is the number of tree species and p is a new
parameter. Such a model has been used by Rafe et al. (1985); what they call
habitat heterogeneity in our case is represented by number of tree species. The
extrapolated curves for both models are compared in Figure 4 for the sampling
protocol of one tree-directed cycle, followed by 20 randomly chosen hexagons.
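Fitting the power function by linear regression amounts to regressing log S on log A, since S = kA^z implies log S = log k + z log A. The sketch below illustrates this for the model without the covariate, using synthetic data generated from a known power law. The function names, the synthetic values, and the choice of square kilometers as the unit of area are assumptions, not details taken from the study.

```python
import math


def fit_power_law(areas, richness):
    """Ordinary least squares fit of log S = log k + z log A.

    Returns the estimates (k, z) of the power function S = k * A**z.
    """
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    z = sxy / sxx                       # slope on the log-log scale
    k = math.exp(y_bar - z * x_bar)     # back-transformed intercept
    return k, z


def extrapolate(k, z, area):
    """Predicted richness at a (larger) area under the fitted power law."""
    return k * area ** z


# Synthetic cumulative areas: 1 to 29 hexagons of 635 km^2 each (assumed units),
# with richness generated exactly from S = 40 * A**0.15 (hypothetical values).
areas = [635.0 * i for i in range(1, 30)]
richness = [40.0 * a ** 0.15 for a in areas]
k_hat, z_hat = fit_power_law(areas, richness)
```

The model with the covariate could be fitted in the same way by adding log T as a second regressor, since it is also linear on the log scale.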
Figure 3: Species-area curve from completely tree-directed sampling, with
demarcation of each sampling cycle. The expected curve from completely
random sampling is also shown for comparison. (Horizontal axis: hexagons observed.)
Table 1: Extrapolated species richness (EX) for one tree-directed cycle
(9 hexagons) plus subsequent random samples of size 20, 30 and 50. Bias
equals the extrapolated minus the total known species.
Sample size    k        z        EX     bias
29             36.87    0.157    230    +52
Here we see a substantial reduction in bias from incorporating the covariate.
The main drawback of a model which incorporates the covariate is that
there is no trivial way to use it when covariate-directed selection of all sample
units comes into play (completely directed procedure). There are no clear
values of the variable T to be used in the second and the following cycles.
This problem is important since we obtain the steepest species-area curve
with the completely directed procedure.
Besides tree species, we also had available information on (i) fish, (ii) mammals, (iii) reptiles and amphibians and (iv) butterflies and skippers. Using
these covariates we reapplied the same analysis as with trees, but did not
discover a stronger covariate than tree richness. Results from studying these
other covariates can be found in Grigoletto, et al. (1995).
DISCUSSION
Figure 4: Extrapolating curves for Model 1 (S = kA^z) and
Model 2 (S = kA^z T^p). (Horizontal axis: hexagons observed.)
Figure 5: Observed hexagons and unencountered species after 4 cycles.
(Map symbols distinguish cycles 1 to 4; labeled species: Tundra Swan, Short-eared Owl, Sedge Wren, Swainson's Thrush, Summer Tanager, Pine Siskin.)
Most of the cycle lengths in Figure 5 vary little around 10 hexagons. Why
does that happen? Looking at Figure 5, it is clear that the observed sites tend to
cluster. Since we are using tree-directed sampling, this happens because if
there are certain areas consisting of multiple hexagons with a high number of
tree species, then in different cycles we tend to re-observe these areas. In fact,
the larger clusters do contain hexagons from each of the four cycles.
After four cycles, only six bird species remain unencountered. In Figure 5,
we see that most of the unencountered species are rare and appear just in
one, two or three hexagons. The Pine Siskin is an exception, since this bird is
fairly widespread across north-central Pennsylvania. Since this region is a mostly
forested area, this seems to suggest that if the "area covered by forest" could
be used jointly with tree richness, we would have a stronger covariate.
The primary purpose of this paper is to suggest the idea of covariate-directed sampling for achieving observational economy when sampling to estimate species richness, a spatially non-additive variable. We illustrated the
approach with a retrospective study of breeding birds in Pennsylvania, where
the database at our disposal provided species lists for several other communities. Of these, tree richness appeared to be the best available covariate.
Although tree richness was not strongly correlated with bird species richness,
tree-directed sampling did accelerate the bird species-area curve faster than
was expected with completely random sampling. Therefore, if other covariates
can be obtained that are more strongly correlated with bird richness, then such
covariates are expected to further improve the sampling efficiency.
An outstanding question is how to estimate species richness statistically, in a manner that allows construction of confidence bounds.
Progress has been made (Bunge and Fitzpatrick, 1993) for the case when population densities are also measured within each sampled species; however, we
were faced with presence/absence responses for each species, where the data-analytic approach appears to be the only alternative.
Acknowledgments
Prepared with partial support from the United States Environmental Protection Agency, Environmental Monitoring and Assessment Program, EMAP
Design and Statistics Group under a Cooperative Agreement Number CR821783. The contents have not been subjected to Agency review and therefore
do not necessarily reflect the views of the Agency and no official endorsement
should be inferred.
REFERENCES
Arrhenius, O. 1921. Species and area. Journal of Ecology, 9:95-99.
Brauning, D.W. 1992. Atlas of Breeding Birds in Pennsylvania. University
of Pittsburgh Press, Pittsburgh. 484 pp.
Bunge, J. and Fitzpatrick, M. 1993. Estimating the number of species: a
review. Journal of the American Statistical Assoc., 88:364-373.
Evans, F.C., Clark, P.J. and Brand, R.H. 1955. Estimation of the number of
species present on a given area. Ecology, 36:342-343.
Grigoletto, M., Johnson, G., Patil, G.P. and Taillie, C. 1995. Using Covariate-Directed Sampling of EMAP Hexagons to Assess the Statewide Species
Richness of Breeding Birds in Pennsylvania. Technical Report no. 951102. Center for Statistical Ecology and Environmental Statistics, Department of Statistics, Penn State University, University Park, PA.
Johnson, G.D. and Patil, G.P. 1995. Estimating statewide species richness
of breeding birds in Pennsylvania. Coenosis, 10(2-3):81-87.
Kilburn, P.D. 1966. Analysis of the species-area relation. Ecology, 47:831-843.
Kobayashi, S. 1979. Species-area curves. In Ord, J.K., Patil, G.P. and Taillie, C. (eds.), Statistical Distributions in Ecological Work, pp. 349-368.
International Co-operative Publishing House, Fairland, Maryland.
Rafe, R.W., Usher, M.B. and Jefferson, R.G. 1985. Birds on reserves: the
influence of area and habitat on species richness. Journal of Applied
Ecology, 22:327-335.
Stevens, W.K. 1995. How many species are being lost? Scientists try new
yardstick. The New York Times, p. C4, July 25.
Usher, M.B. 1985. Implications for the species-area relationship for wildlife
conservation. Journal of Environmental Mngt., 21:181-191.
White, D., Kimerling, A.J. and Overton, W.S. 1992. Cartographic and geometric components of a global sampling design for environmental monitoring. Cartographic and Geographic Information Systems, 19(1):5-22.
Williams, M.R. 1995. An extreme-value function model of the species incidence and species-area relations. Ecology, 76(8):2607-2616.
Wilson, E.O. 1988. Biodiversity. National Academy Press, Washington,
D.C., 521 pp.
Yoon, C.K. 1995. Monumental inventory of Costa Rican forest's insects under
way. The New York Times, p. C4, July 11.