Occupancy Modeling using the UNMARKED Package, written by Ian Fiske
Presented by Luke Powell (LPowel9@lsu.edu)
Nov 6, 2009

The R code in this document walks through an example of using repeat visits to a site to model occupancy of unmarked animals while incorporating detectability, as described in the book "Occupancy Estimation and Modeling" by Darryl MacKenzie, James D. Nichols, J. Andrew Royle, Kenneth Pollock, Larissa L. Bailey, and James E. Hines. MacKenzie and others have published several papers on occupancy modeling as well.

This code is an alternative to the "canned" program PRESENCE, which is available free online. The website for PRESENCE is http://www.mbr-pwrc.usgs.gov/software/presence.html. It also includes links to some MacKenzie papers and some help files for PRESENCE. This page, http://www.uvm.edu/envnr/vtcfwru/spreadsheets/occupancy/occupancy.htm, includes exercises from the MacKenzie book that you can work through using PRESENCE or UNMARKED.

The code below requires the beta version of the package UNMARKED, written by Ian Fiske of NC State. Website for UNMARKED: http://r-forge.r-project.org/projects/unmarked/

Please note that I have altered the dataset (because I am preparing a manuscript with it), so the results you get in your analyses will not be the same as the sample results I have included in this document.

My comments on the code are marked with a #. They are either on the same line as the code or on the line immediately before the code starts.

Code = Arial
Notes on code = Arial, starting with #
Output = Courier New

THE CODE:

install.packages("unmarked", repos="http://R-Forge.R-project.org") #installs the package "unmarked" from the R-Forge website
install.packages("reshape") #installs the package "reshape", which is needed to run "unmarked"
library(unmarked) #Opens the package "unmarked"

#set the working directory as the path to where your data file is
setwd("C:/Documents and Settings/lpowel9/My Documents/LSU/Courses/Fall09/R4Ecologists/OccPresentation")
rubl <- read.csv("fakeRUBLdata.csv", na.strings = "-") #reads in the dataset, telling R that all the "-" cells contain no data

#Reads in your presence/absence data
y <- apply(rubl[,2:10], 2, function(x) as.numeric(as.character(x)))
M <- nrow(y) #number of sites

## Prepares the site covariates (these remain the same with every visit)
puddles <- as.vector(rubl[,"SmpoolsBin"]) #as.vector() leaves the values as stored (0/1 columns stay numeric)
softwood <- as.vector(rubl[,"Softwood"])
mud <- as.vector(rubl[,"MudBin"])
beaver <- as.vector(rubl[,"CurrentBeaver"])
youngspfir <- as.vector(rubl[,"YngSpFirBinary"])
harvest <- as.vector(rubl[,"Harvest5to15"])
cogr <- as.vector(rubl[,"COGREver"])
rwbl <- as.vector(rubl[,"RWBLEver"])
year <- as.factor(rubl$Year)
wetarea <- as.factor(rubl[,"WetArea"]) #as.factor() treats the values as categories, even when they are coded as numbers
road <- as.factor(rubl[,"Road"])
shrub <- rubl[,grep("Shrub",names(rubl))]/100
tree <- rubl[,grep("Tree",names(rubl))]/100

## Prepares the detection covariates (these vary with each visit)
samplingMethod <- as.factor(rep(c("a","b","c"), 3*M))
precip <- as.vector(t(rubl[,grep("Precip",names(rubl))]))
mincat <- as.factor(t(rubl[,grep("MinCat",names(rubl))]))
wind <- as.vector(t(rubl[,grep("wind",names(rubl))])) # non-categories?
minute <- as.vector(t(rubl[,grep("MinBy",names(rubl))]))
jday <- as.vector(t(rubl[,grep("DayBy",names(rubl))]))
sky <- as.factor(t(rubl[,grep("sky",names(rubl))]))

## remove the first observation for the "driveby" sites
#This is an extra step to account for my funky sampling scheme . . . not necessary with other datasets
for(i in 1:3) {
  is.na(y[,i]) <- rubl$WetChoice == "driveby"
}

# reduce number of reasons for choosing a site to simplify model
#This is also an extra step to account for my funky sampling scheme . . . not necessary with other datasets
choice <- rubl$WetChoice
choice[choice == "LooksGoodRndmReplacement"] <- "LooksGood"
choice[choice == "NearPositive"] <- "OldPositive"
choice[choice == "driveby"] <- "LooksGood"
levels(choice)[c(1,3,4)] <- NA

## create an unmarkedFrame for model fitting
obsCovs <- data.frame(samplingMethod, precip = precip, wind = wind,
                      mincat = mincat, minute = minute, jday = jday, sky = sky)
siteCovs <- data.frame(year = year, choice = choice, wetarea = wetarea,
                       softwood = softwood, mud = mud, beaver = beaver,
                       youngspfir = youngspfir, harvest = harvest, cogr = cogr,
                       rwbl = rwbl, road = road, puddles = puddles,
                       shrub = shrub, tree = tree)

## impute missing site covariates (continuous & binary) with mean
#This step replaces missing values with the mean for that category *if necessary
siteClasses <- sapply(siteCovs, class)
numSiteCovs <- which(siteClasses %in% c("integer","numeric"))
for(i in numSiteCovs) {
  siteCovs[is.na(siteCovs[,i]),i] <- mean(siteCovs[,i], na.rm = TRUE)
}

## impute missing site covariate (categorical) with most frequent category
#if necessary
factorSiteCovs <- which(siteClasses %in% "factor")
for(i in factorSiteCovs) {
  siteCovs[is.na(siteCovs[,i]),i] <- names(which.max(table(siteCovs[,i])))
}

## impute missing obs covariates (continuous & binary) with mean
#if necessary
obsClasses <- sapply(obsCovs, class)
numObsCovs <- which(obsClasses %in% c("integer","numeric"))
for(i in numObsCovs) {
  obsCovs[is.na(obsCovs[,i]),i] <- mean(obsCovs[,i], na.rm = TRUE)
}

## impute missing obs covariate (categorical) with most frequent category
#if necessary
factorObsCovs <- which(obsClasses %in% "factor")
for(i in factorObsCovs) {
  obsCovs[is.na(obsCovs[,i]),i] <- names(which.max(table(obsCovs[,i])))
}

## add back in unimputed mincat for comparing times with each other
#Another additional step for my funky dataset
obsCovs$mincat.orig <- mincat

rublUMF <- unmarkedFrameOccu(y, siteCovs = siteCovs, obsCovs = obsCovs) #creates and names the unmarkedFrame

###########################################################
## THE ANALYSIS for detectability questions.
###########################################################

(fm.Null <- occu(~ 1 ~ 1, rublUMF)) #The null model (no variables)
#don't forget the extra () around the whole line or you will have to use summary() to get the output
#Notice that detection variables go after the first "~" and occupancy variables go after the second "~"
#The last thing inside the parentheses is the unmarked "frame" containing your data

Call:
occu(formula = ~1 ~ 1, data = rublUMF)

Occupancy:
 Estimate   SE     z  P(>|z|)
    -1.97 0.19 -10.3 4.23e-25

Detection:
 Estimate    SE     z  P(>|z|)
   -0.947 0.185 -5.11 3.26e-07

AIC: 557.0372
Warning message:
In handleNA2(umf, X, V) : 1 sites have been discarded because of missing data.

#Now you can paste the AIC value into the AIC table in Excel

#Modeling detectability now:
(fm.wind <- occu(~ wind ~ 1, rublUMF))
#a model with only WIND as a detection covariate; "1" after the second "~" means that there are no occupancy variables included.

Call:
occu(formula = ~wind ~ 1, data = rublUMF)

Occupancy:
 Estimate    SE     z  P(>|z|)
    -1.97 0.189 -10.4 2.04e-25

Detection:
            Estimate    SE     z P(>|z|)
(Intercept)   -0.398 0.353 -1.13  0.2596
wind          -0.334 0.186 -1.80  0.0725

AIC: 555.7108
Warning message:
In handleNA2(umf, X, V) : 1 sites have been discarded because of missing data.

#Interpretation: WIND is marginally significant (p = 0.07)
#The AIC value is also somewhat lower than that of the null model, suggesting that the model has more support than the null model
#The negative parameter estimate indicates that wind has a negative influence on detection (WIND is a detection covariate here, not an occupancy covariate)

#Runs the model with WIND and SKY as detection covariates
#next model: sky is categorical, wind is numeric
(fm.windsky <- occu(~ wind + sky ~ 1, rublUMF))

Call:
occu(formula = ~wind + sky ~ 1, data = rublUMF)

Occupancy:
 Estimate    SE   z P(>|z|)
    -2.04 0.186 -11   4e-28

Detection:
            Estimate    SE      z P(>|z|)
(Intercept)   0.0627 0.437  0.143  0.8861
wind         -0.2664 0.201 -1.322  0.1860
sky2         -0.6920 0.420 -1.646  0.0997
sky3         -0.7597 0.587 -1.295  0.1952

AIC: 556.6656
Warning message:
In handleNA2(umf, X, V) : 1 sites have been discarded because of missing data.

#Adding SKY does not help the model. The AIC value increases, indicating that the model is not as "good" as the last one that just included WIND

(fm.windMud <- occu(~ wind ~ mud, rublUMF))
#This is a model with WIND as a detection covariate and MUD as an occupancy covariate

Call:
occu(formula = ~wind ~ mud, data = rublUMF)

Occupancy:
            Estimate    SE      z  P(>|z|)
(Intercept)    -2.28 0.226 -10.10 5.53e-24
mud             1.16 0.371   3.14 1.70e-03

Detection:
            Estimate    SE     z P(>|z|)
(Intercept)   -0.414 0.354 -1.17  0.2420
wind          -0.322 0.185 -1.74  0.0825

AIC: 548.4278
Warning message:
In handleNA2(umf, X, V) : 1 sites have been discarded because of missing data.

#This model provides strong evidence that MUD affects occupancy, as indicated by the relatively low AIC value and the confidence interval (Estimate +/- 1.96*SE) that does not overlap zero
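
The interpretation comments in this document rest on two small calculations: a confidence interval of the form Estimate +/- 1.96*SE, and a comparison of AIC values across models. The following is a minimal base-R sketch of both, using the sample estimates and AIC values printed in this document (your numbers will differ because the dataset was altered). The helper name wald_ci and the layout of aic_table are my own choices for illustration and are not part of unmarked. The Akaike weights column is a common addition to an AIC table, included here as an assumption about what the Excel table might hold.

```r
# Hypothetical helper (not part of unmarked): Wald confidence interval
# for a coefficient, on the same (logit) scale as the printed estimate.
wald_ci <- function(estimate, se, level = 0.95) {
  zcrit <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  c(lower = estimate - zcrit * se, upper = estimate + zcrit * se)
}

# MUD effect on occupancy, taken from the occu(~ wind ~ mud) sample output
mud_ci <- wald_ci(estimate = 1.16, se = 0.371)
mud_ci           # interval on the logit scale
all(mud_ci > 0)  # TRUE when the whole interval lies above zero

# Estimates are on the logit scale; plogis() turns an intercept-only
# estimate into a probability (values from the null model output)
plogis(-1.97)    # occupancy probability
plogis(-0.947)   # detection probability

# AIC comparison for the four models fitted in this document
aic <- c(null = 557.0372, wind = 555.7108,
         windsky = 556.6656, windMud = 548.4278)
delta  <- aic - min(aic)                           # difference from the best model
weight <- exp(-delta / 2) / sum(exp(-delta / 2))   # Akaike weights
aic_table <- data.frame(AIC = aic,
                        deltaAIC = round(delta, 2),
                        weight = round(weight, 3))
aic_table[order(aic_table$AIC), ]                  # lowest AIC first
```

Comparing AIC values this way only makes sense when every model was fitted to the same data; here each model reports the same single discarded site, so the comparison should hold for these four.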