Project update: 10/15/12

A comparison of the accuracy of analysis methods for mark-recapture data
Dan Bachen

Goal: To determine the most effective method of analyzing mark-recapture data, one that increases power while reducing the number of samples needed.

Methods:
1. Set realistic parameters and generate/assign covariate values to sites
2. Perform a power analysis on real data using selected methods of analysis
3. Calculate the number of samples needed to resolve a treatment effect between -80% and -90%
4. Calculate the bias introduced to estimates of known parameters
5. Compare results to select the best method for use in stepwise analysis

Progress 10/15/12
Updated the capture history generator so that probability of capture differs by capture history rather than by night.

Power Analysis
I performed a power analysis of data I collected in the field to explore the minimum number of plots needed to accurately estimate a realistic treatment effect. The original data covered 24 sites trapped between mid-May and mid-June 2012. Each site was trapped for 5 nights; captured animals were uniquely marked and released, and subsequent recaptures were recorded. Six of the plots had sparse captures (≤2 individuals), so these were excluded from analysis.

To analyze the data I first estimated abundance for each plot from its capture histories using the Rcapture package in R (see Figure 1 for results). Some of the sites differed slightly in vegetation and elevation, so I used an ANOVA to see whether a post-hoc stratification design should be used to more accurately estimate the variance within plots. There was little evidence that plots in different strata differed in abundance of deer mice (p = 0.25, Figure 2). Using the estimates of abundance by plot derived from the Rcapture analysis, I examined the effect of sample size on the ability to discriminate a gradient of treatment effects.
I was specifically interested in being able to detect a treatment effect of an 80-90% reduction in abundance (as seen in Ostoja and Schupp 2009) with 20 or 40 total plots, >90% of the time (alpha level .1). The abundance estimates were generated with Rcapture, so any bias introduced by that method is reflected in them. I calculated the mean and standard deviation of the estimated abundances and then, assuming the means would be normally distributed, used a normal random number generator (the rnorm function in R's stats package) to simulate plot abundances. In a parametric bootstrap framework I simulated sampling between 2 and 100 sites each for treatment and control, with treatment effects ranging from a 10% to a 100% reduction. I then compared treatment and control with a t-test, scoring whether a difference was detected at a p-value of less than .05 (for results see Figure 3).

I found that 40 plots were adequate to detect a treatment effect of -80% 99.5% of the time. With 20 plots, the treatment effect would have to be a reduction of at least 90% to be detected more than 90% of the time. Based on these results, 40 sites provide enough power to detect a realistic treatment effect.

Bias
The analysis using Rcapture estimated the mean abundance of my sites to be 9.6423 and the SD to be 5.317. These statistics reflect the actual mean and variance within the population, but using Rcapture for the analysis introduces bias. Fortunately, this bias can be estimated through simulation and corrected for. Using known parameters (fixed probabilities of capture of .2 for first capture and .9 for subsequent captures; true abundances between 10 and 25), I generated simulated capture histories and analyzed them with the Rcapture package. In a bootstrap framework I iterated this process 100,000 times and took the mean and variance of these repetitions. I then calculated the bias of these estimates using the formula:

Bias = parameter - estimate of parameter
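The parametric bootstrap behind the power analysis in this section can be sketched in a few lines. This is an illustrative Python sketch, not the R code used for the project. It assumes the treatment effect scales the mean abundance while the SD stays fixed, and it substitutes a normal approximation for the t-test's reference distribution. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

MEAN_ABUNDANCE = 9.6423  # mean plot abundance reported in the text
SD_ABUNDANCE = 5.317     # SD of plot abundance reported in the text


def approx_p_value(x, y):
    """Two-sided p-value for a difference in means between two samples.

    Assumption: Welch's statistic is compared to a standard normal rather
    than a t distribution, which is slightly optimistic for small samples.
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0:
        return 1.0
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def estimate_power(n_per_group, treatment_effect, reps=2000, alpha=0.05, seed=1):
    """Proportion of simulated experiments in which a difference is detected.

    treatment_effect is a proportional change in mean abundance,
    e.g. -0.8 for an 80% reduction. n_per_group must be at least 2.
    """
    rng = random.Random(seed)
    # Assumption: the effect shifts the mean only; the SD is unchanged.
    treated_mean = MEAN_ABUNDANCE * (1 + treatment_effect)
    detections = 0
    for _ in range(reps):
        control = [rng.gauss(MEAN_ABUNDANCE, SD_ABUNDANCE) for _ in range(n_per_group)]
        treated = [rng.gauss(treated_mean, SD_ABUNDANCE) for _ in range(n_per_group)]
        if approx_p_value(control, treated) < alpha:
            detections += 1
    return detections / reps
```

Calling `estimate_power(20, -0.8)` simulates 20 plots per group (40 total) with an 80% reduction. Looping over group sizes from 2 to 100 and over treatment effects from -0.1 to -1.0 gives the kind of grid summarized in Figure 3.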
Based on this analysis, both the estimate of the mean and the estimate of the variance produced by Rcapture are biased (mean p-value <.0001, SD p-value <.001). Bias also increases with abundance: for every additional animal added to the population, bias of the mean increases by .23 and bias of the SD increases by .05 (see Figure 4 for results). Using this information to correct for bias, the true mean of my study plots is 13.96 and the corrected SD is 3.597.

Future progress
The ability to minimize sample size, and therefore cost and effort, is a real need in ecology, as having enough resources to gather an adequate sample is often challenging. Using the process outlined above I will be able to investigate whether different techniques give me more power to discriminate smaller treatment effects for a given sample size or, conversely, allow me to discriminate a -90% treatment effect with a smaller sample size. To complete this analysis I will perform power analyses of my data with 3 additional techniques that estimate abundance from capture histories:
- Minimum known alive
  o The number of unique individuals is used
- Lincoln-Petersen estimator in a Bayesian framework
- Closed-capture likelihood estimator in a Bayesian framework
I will compare the reduction in sample size needed to resolve the designated treatment effect to select the most effective method of analysis.
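The first of these techniques, minimum known alive (MNA), is simple enough to sketch together with a capture-history generator and the bias formula from the Bias section. This is an illustrative Python sketch, not the R code written for this project. It takes the capture probabilities (.2 for first capture, .9 afterwards) and the 5 trapping nights from the text, and the function names are hypothetical.

```python
import random

P_FIRST = 0.2      # probability of first capture (from the text)
P_RECAPTURE = 0.9  # probability of capture once already marked (from the text)
NIGHTS = 5         # trapping nights per plot (from the text)


def simulate_capture_histories(true_abundance, rng):
    """Return one 0/1 capture history per animal on a plot.

    Capture probability depends on the animal's own history (marked or
    not), not on the night.
    """
    histories = []
    for _ in range(true_abundance):
        history = []
        marked = False
        for _ in range(NIGHTS):
            p = P_RECAPTURE if marked else P_FIRST
            caught = rng.random() < p
            history.append(1 if caught else 0)
            marked = marked or caught
        histories.append(history)
    return histories


def minimum_known_alive(histories):
    """Number of unique individuals captured at least once."""
    return sum(1 for h in histories if any(h))


def mna_bias(true_abundance, reps=5000, seed=1):
    """Mean bias of MNA, using Bias = parameter - estimate."""
    rng = random.Random(seed)
    estimates = [
        minimum_known_alive(simulate_capture_histories(true_abundance, rng))
        for _ in range(reps)
    ]
    return true_abundance - sum(estimates) / reps
```

Under these probabilities an animal goes uncaught on all five nights with probability 0.8^5 ≈ 0.33, so MNA is expected to miss roughly a third of the animals present. `mna_bias(n)` estimates that shortfall by simulation for a plot of true abundance `n`.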
Progress 10/4
Currently I have successfully written code to:
- Generate abundances based on defined covariate values
- Apply a treatment effect to a subset of these abundances
- Generate capture histories based on fixed probabilities
- Perform a "black box" analysis to estimate abundance from capture histories using the Rcapture package with model s(.)p(t), which should correctly account for my probabilities of capture
- Model the effect of treatment using the covariate values generated for each site and the abundances estimated from capture histories, and generate an estimate of the treatment effect
- Score whether the true value falls within the CI of the estimate

What I need to do:
- Write code for a hierarchical analysis linking the abundance model, Abundance ~ TE+S+C+P+(S*C)+E, with the likelihood model used in the Rcapture analysis
- Iterate both analysis types to get an accurate estimate of the proportion of times each is correct

Literature cited
Ostoja, S. M., and E. W. Schupp. 2009. Conversion of sagebrush shrublands to exotic annual grasslands negatively impacts small mammal communities. Diversity and Distributions 15:863-870.

Appendix

Figure 1. Estimates of abundance by plot, estimated with the Rcapture package in R

Plot   Abundance
LC1    9.389161092
LC2    3.041862251
LC4    12
LC6    6.000000002
LT1    5.000000003
LT2    2.000000002
LT3    7.000000003
LT4    8.583886749
LT5    10.28205028
LT6    1
SC1    10.4244289
SC2    2
SC3    6.000000001
SC4    1
SC5    3
SC6    10
ST1    12.6243839
ST2    25.81725641
ST3    13.2219365
ST4    5.158047662
ST5    6.541381265
ST6    13.8778794

Excluded: Y Y Y Y

Figure 2. ANOVA analysis of difference within stratum

Call:
lm(formula = stor.vec.t ~ stratum)

Residuals:
    Min       1Q   Median       3Q      Max
-8.3332  -3.4445  -0.2108   2.7063  16.4840

Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)     6.482       1.701    3.809    0.0011 **
stratumU        2.852       2.406    1.185    0.2499

Figure 3. Proportion of reps successfully identifying a treatment effect by sample size

[Ten-panel plot, one panel per treatment effect: "Proportion successful at -10% TE" through "Proportion successful at -100% TE" in steps of 10%. x-axis: Index (0 to 100); y-axis: success.sample.size.]

Summary of proportion detecting a difference at 10 and 20 plots per treatment or control

TE      10 plots   20 plots
-0.6    0.63       0.92
-0.7    0.77       0.98
-0.8    0.86       0.995
-0.9    0.92       0.99
-1      0.97       0.99

Figure 4. Bias for Rcapture calculations

[Plot: "Bias by true abundance". x-axis: Abundance; y-axis: Bias.]

Output from analysis of bias in abundance with linear model

             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  -0.017171    0.035728   -0.481     0.638
n             0.233416    0.001974  118.229    <2e-16 ***

Output from analysis of bias in SD of abundance with linear model

             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)   0.827552    0.029430    28.12  1.02e-13 ***
n             0.054132    0.001626    33.29  9.92e-15 ***