Models for Site Occupancy and Detection - Sortie-ND

Likelihood Methods in Ecology
April, 2012
C. D. Canham
Likelihood Models for Site Occupancy when Detection Probabilities are < 1
From:
MacKenzie et al. 2002. Estimating site occupancy rates when detection probabilities are less
than one. Ecology 83(8):2248-2255.
Data Requirements:
1. sample a set of sites, with each site having a sequence of t = 1..m observation periods
within the overall study period (this “repeated measures” feature is essential to the
estimation of detectability)
Assumptions:
1. sites are “closed” (i.e. no changes in occupancy of individual sites during the study
period)
2. no false positives
3. probability of detecting a species at one site is independent of probability of
detection at other sites
Basic Model:
$\psi_i$ = probability that the species is present at site i (throughout the study period)
$p_{it}$ = probability that the species is detected at site i during observation period t (note
that because it is assumed that there are no false positives, this is effectively a conditional
probability, i.e. given that the species is present at site i).
So, in general, the expected probability of “observing” a species at a site at any given time
is:
$\psi_i p_{it}$
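As a quick numerical illustration of why the repeated visits matter, the Python sketch below evaluates this product for made-up values. The function names and numbers are assumptions for illustration only; they do not come from MacKenzie et al. or from the course R code. The second function gives the probability of at least one detection in m visits, which follows from the definitions above if detection probability is taken as constant and independent across periods.

```python
def prob_observed(psi, p):
    """Probability of observing the species at a site in one period."""
    return psi * p


def prob_detected_at_least_once(psi, p, m):
    """Probability of one or more detections in m periods (constant p)."""
    # The site must be occupied, and not every one of the m visits may miss.
    return psi * (1.0 - (1.0 - p) ** m)


psi, p = 0.6, 0.5  # hypothetical occupancy and detection probabilities
single_visit = prob_observed(psi, p)
three_visits = prob_detected_at_least_once(psi, p, 3)
print(single_visit, three_visits)
```

With these values a single visit records the species at fewer sites than are actually occupied, so the raw proportion of sites with a detection underestimates occupancy unless detectability is estimated as well.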
The likelihood function effectively splits the dataset up into two subsets:
1. for sites (i) where the species was observed at least once, the likelihood of observing
the particular sequence of detections and nondetections ($p_i$) is
$$\text{Likelihood} = \psi_i \prod_{t=1}^{m} \theta_{it}, \quad \text{where } \theta_{it} = \begin{cases} p_{it} & \text{if the species was detected} \\ 1 - p_{it} & \text{if the species was not detected} \end{cases}$$
2. for sites (i) where the species was not observed at all, either the species was present
(with probability $\psi_i$) but not detected at any of the m observation times, or the species
was not present, with probability $1 - \psi_i$. So the likelihood for one of these observations is
$$\text{Likelihood} = \psi_i \prod_{t=1}^{m} (1 - p_{it}) + (1 - \psi_i)$$
The total log likelihood for the dataset is then simply the sum of the logs of these
likelihoods.
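The two cases and the summed log likelihood can be sketched in a few lines of Python. This is a minimal illustration, not the course R code: it assumes a single occupancy probability and a single detection probability shared by all sites and periods, uses `None` for a period that was not surveyed, and simply skips such periods (an assumption about how missing values are handled). All names are hypothetical.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history.

    history: list of 1 (detected), 0 (not detected) or None (not surveyed)
    psi:     occupancy probability (assumed equal for all sites here)
    p:       detection probability (assumed constant over periods here)
    """
    surveyed = [y for y in history if y is not None]
    # Probability of the observed sequence, given that the site is occupied.
    seq = 1.0
    for y in surveyed:
        seq *= p if y == 1 else (1.0 - p)
    if any(y == 1 for y in surveyed):
        # Case 1: detected at least once, so the site must be occupied.
        return psi * seq
    # Case 2: never detected. Either occupied but missed on every visit,
    # or not occupied at all.
    return psi * seq + (1.0 - psi)


def log_likelihood(histories, psi, p):
    """Total log likelihood: the sum of the logs of the site likelihoods."""
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)


# Three hypothetical sites, up to three observation periods each.
histories = [[1, 0, 1], [0, 0, 0], [0, 1, None]]
print(log_likelihood(histories, psi=0.6, p=0.5))
```

Maximizing this function over psi and p (for example with a general-purpose optimizer) would give the estimates for the basic model.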
Extensions:
The basic model can be fitted to estimate the average occupancy ($\psi$) and detection
probability (p), given the observations, over the entire sample.
But both occupancy and detection probability can be modeled as a function of covariates.
For occupancy (), the covariates would represent aspects that varied among sites (but not
over time, since occupancy is assumed constant within a site). These could either be
continuous or categorical covariates (i.e. fragment size as a continuous variable, or forest
type as a categorical variable)
For detection probability (p), the covariates are assumed to be site-specific and may vary
over time (but they must be measurable regardless of whether the species is detected).
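One way to bring covariates into the likelihood is sketched below in Python. It assumes a logistic (logit) link for both occupancy and detection, which is a common choice but is not stated in these notes; the course R code may use a different functional form. The function names, the single site covariate, the single per-period covariate, and the coefficient pairs `beta` and `alpha` are all hypothetical.

```python
import math


def inv_logit(x):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def site_log_likelihood(history, site_cov, visit_covs, beta, alpha):
    """Log likelihood of one site when psi and p depend on covariates.

    history:     1 / 0 / None (not surveyed) for each observation period
    site_cov:    one site-level covariate (e.g. fragment size)
    visit_covs:  one covariate value per observation period
    beta, alpha: (intercept, slope) pairs for occupancy and detection
    """
    psi = inv_logit(beta[0] + beta[1] * site_cov)
    seq = 1.0
    detected = False
    for y, x in zip(history, visit_covs):
        if y is None:  # period not surveyed: contributes nothing (assumption)
            continue
        p_it = inv_logit(alpha[0] + alpha[1] * x)
        seq *= p_it if y == 1 else (1.0 - p_it)
        detected = detected or y == 1
    # Same two cases as in the basic model.
    lik = psi * seq if detected else psi * seq + (1.0 - psi)
    return math.log(lik)


# With all coefficients at zero, psi = 0.5 and every p_it = 0.5.
ll = site_log_likelihood([0, 1, None], site_cov=2.0,
                         visit_covs=[0.0, 0.0, 0.0],
                         beta=(0.0, 0.0), alpha=(0.0, 0.0))
print(ll)
```

The parameters to estimate are then the coefficients rather than occupancy and detection probabilities themselves.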
Sample R Code: (download R code from the course website)
Data Format:
A special format will be required, given the repeated measures design of the data.
Each sample “site” constitutes a statistical observation. The repeated attempts to detect
presence at a site over time are attributes of the single observation.
Different species will have to be analyzed separately, either with separate data files for
each species, or with one large file that has sets of columns for each species.
The R code assumes that the first m columns of the data frame consist of observations of
presence for a given species (0, 1, or NA), where m is the largest number of remeasurement
periods for any site (observation) in the dataset, 0 = not observed, 1 = observed, and NA for
a missing value. The code doesn’t actually require that the columns be in an order
representing date of observation, and missing values can be in any column if a site was not
sampled on that date. Any names can be used for those columns.
The remaining columns of the dataset contain any covariates that might be needed in the
scientific models…
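To make the layout concrete, the Python sketch below reads a small made-up table in this format and separates the first m detection columns from the covariate columns. The column names (`visit1`, `fragment_size`, and so on) and the values are invented for illustration, and the course R code will read its data differently.

```python
import csv
import io

# Hypothetical data: m = 3 detection columns, then one covariate column.
raw = """visit1,visit2,visit3,fragment_size
1,0,NA,12.5
0,0,0,3.0
NA,1,1,40.2
"""

m = 3  # largest number of observation periods for any site


def parse_detection(cell):
    """Convert '0' / '1' / 'NA' to 0 / 1 / None."""
    return None if cell == "NA" else int(cell)


histories = []   # one detection history per site (row)
covariates = []  # one dict of covariate values per site (row)

reader = csv.reader(io.StringIO(raw))
header = next(reader)
for row in reader:
    histories.append([parse_detection(c) for c in row[:m]])
    covariates.append({name: float(v) for name, v in zip(header[m:], row[m:])})

print(histories)
print(covariates)
```

Each row is one site, so the repeated detection attempts stay together as attributes of a single observation.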