Likelihood Methods in Ecology
April, 2012
C. D. Canham

Likelihood Models for Site Occupancy when Detection Probabilities are < 1

From: MacKenzie et al. 2002. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83(8):2248-2255.

Data Requirements:

1. Sample a set of sites, with each site having a sequence of t = 1..m observation periods within the overall study period (this “repeated measures” feature is essential to the estimation of detectability).

Assumptions:

1. Sites are “closed” (i.e. no changes in occupancy of individual sites during the study period).
2. No false positives.
3. The probability of detecting a species at one site is independent of the probability of detection at other sites.

Basic Model:

$\psi_i$ = probability that the species is present at site i (throughout the study period)

$p_{it}$ = probability that the species is detected at site i during observation period t (note that because it is assumed that there are no false positives, this is effectively a conditional probability, i.e. given that the species is present at site i)

So, in general, the expected probability of “observing” a species at a site at any given time is:

$$\psi_i \, p_{it}$$

The likelihood function effectively splits the dataset into two subsets:

1. For sites (i) where the species was observed at least once, the likelihood of observing the particular sequence of detections and nondetections (given the detection probabilities $p_{it}$) is

$$\text{Likelihood} = \psi_i \prod_{t=1}^{m} \theta_{it}, \qquad \text{where } \theta_{it} = \begin{cases} p_{it} & \text{if the species was detected} \\ 1 - p_{it} & \text{if the species was not detected} \end{cases}$$

2. For sites (i) where the species was not observed at all, either the species was present (with probability $\psi_i$) but not detected at any of the m observation times, or the species was not present, with probability $1 - \psi_i$. So the likelihood for one of these sites is

$$\text{Likelihood} = \psi_i \prod_{t=1}^{m} (1 - p_{it}) + (1 - \psi_i)$$

The total log likelihood for the dataset is then simply the sum of the logs of these site-level likelihoods.
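The two cases above can be sketched in R as a single negative log-likelihood function. This is only a minimal illustration, not the course code: it assumes the simplest version of the model, with one occupancy probability and one detection probability shared by all sites and visits. The function name `occ_negloglik`, the logit parameterization, and the small data matrix are all hypothetical choices made for the example.

```r
# Negative log-likelihood for a constant-psi, constant-p occupancy model.
# y:   matrix of detection histories (rows = sites, columns = visits),
#      coded 0 = not observed, 1 = observed, NA = site not sampled
# par: c(logit(psi), logit(p)); the logit scale is an assumption made here
#      so that the optimizer can search without bounds
occ_negloglik <- function(par, y) {
  psi <- plogis(par[1])
  p   <- plogis(par[2])
  n_det    <- rowSums(y == 1, na.rm = TRUE)  # detections per site
  n_nondet <- rowSums(y == 0, na.rm = TRUE)  # nondetections per site
  # psi times the product over sampled visits of p or (1 - p);
  # visits coded NA contribute nothing to the product
  lik_present <- psi * p^n_det * (1 - p)^n_nondet
  # sites with no detections might also be unoccupied: add (1 - psi)
  lik <- ifelse(n_det > 0, lik_present, lik_present + (1 - psi))
  -sum(log(lik))
}

# Hypothetical data: 4 sites, up to 3 visits each
y <- matrix(c(1, 0, 1,
              0, 0, 0,
              0, 1, NA,
              0, 0, 0), nrow = 4, byrow = TRUE)

# Minimize the negative log-likelihood, then back-transform the estimates
fit     <- optim(c(0, 0), occ_negloglik, y = y)
psi_hat <- plogis(fit$par[1])
p_hat   <- plogis(fit$par[2])
```

A site that was never sampled (all NA) gets a likelihood of 1 under this sketch, so it drops out of the sum of logs.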
Extensions:

The basic model can be fitted to estimate the average occupancy ($\psi$) and detection probability (p), given the observations, over the entire sample. But both occupancy and detection probability can also be modeled as functions of covariates.

For occupancy ($\psi$), the covariates would represent attributes that vary among sites (but not over time, since occupancy is assumed constant within a site). These could be either continuous or categorical covariates (e.g. fragment size as a continuous variable, or forest type as a categorical variable).

For detection probability (p), the covariates are assumed to be site-specific but time varying (and must be measurable regardless of detection).

Sample R Code: (download R code from the course website)

Data Format:

A special format will be required, given the repeated measures design of the data. Each sample “site” constitutes a statistical observation. The repeated attempts to detect presence at a site over time are attributes of that single observation. Different species will have to be analyzed separately, either with separate data files for each species, or with one large file that has sets of columns for each species.

The R code assumes that the first m columns of the data frame consist of observations of presence for a given species (0, 1, or NA), where m is the largest number of remeasurement periods for any site (observation) in the dataset, 0 = not observed, 1 = observed, and NA = missing value. The code doesn’t actually require that the columns be in an order representing date of observation, and missing values can be in any column if a site was not sampled on that date. Any names can be used for those columns. The remaining columns of the dataset contain any covariates that might be needed in the scientific models…
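As a rough illustration of this layout and of a covariate model for occupancy, the sketch below builds a small data frame with the m detection columns first and the covariates after them, then evaluates a likelihood in which occupancy depends on fragment size. It is not the course code. The column names, the data values, the function name `occ_cov_negloglik`, and the use of a logistic (logit-linear) form for occupancy are all assumptions made for the example, and detection probability is held constant for simplicity.

```r
# Hypothetical data frame: the first m = 3 columns hold the detection
# histories; the remaining columns hold site-level covariates
dat <- data.frame(
  visit1      = c(1, 0, 0, 0, 1),
  visit2      = c(0, 0, 1, 0, NA),
  visit3      = c(1, 0, NA, 0, 1),
  frag_size   = c(12.5, 3.1, 8.0, 1.4, 20.2),
  forest_type = factor(c("oak", "pine", "oak", "pine", "oak"))
)
m    <- 3
y    <- as.matrix(dat[, 1:m])           # detection histories (0, 1, NA)
covs <- dat[, -(1:m), drop = FALSE]     # everything after the first m columns

# Occupancy as a logistic function of one site covariate x (assumed form);
# par = c(intercept, slope, logit(p))
occ_cov_negloglik <- function(par, y, x) {
  psi <- plogis(par[1] + par[2] * x)    # one occupancy probability per site
  p   <- plogis(par[3])                 # constant detection probability
  n_det    <- rowSums(y == 1, na.rm = TRUE)
  n_nondet <- rowSums(y == 0, na.rm = TRUE)
  lik_present <- psi * p^n_det * (1 - p)^n_nondet
  lik <- ifelse(n_det > 0, lik_present, lik_present + (1 - psi))
  -sum(log(lik))
}

# Center and scale the covariate, then evaluate at starting values
x    <- as.numeric(scale(covs$frag_size))
nll0 <- occ_cov_negloglik(c(0, 0, 0), y, x)
```

The function can be passed to an optimizer such as `optim` in the usual way. With only five hypothetical sites the slope is not estimable in any meaningful sense, so the example stops at evaluating the likelihood.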