Richard B. Harris
Wildlife Biology Program
University of Montana
Missoula, Montana 59812
1-406-542-6399
rharris@montana.com

On estimating wildlife densities from line transect data

RICHARD B. HARRIS, University of Montana, Missoula, MT USA 59812
KENNETH P. BURNHAM, Colorado State University, Fort Collins, CO USA 80523

[English version. Published in Acta Zoologica Sinica (动物学报) 48: 812-818 (2002)]

Abstract: Line transects are one of the best ways to estimate the density of wildlife populations over large areas. However, density estimates will be unreliable if they rest on mathematical procedures that, although simple and easy to use, do not correspond with reality. We argue here that a naïve estimator, in which the mean of observed perpendicular distances is equated with effective strip width, is unlikely to yield reliable results. Even when field work is conducted correctly, density estimates from this equation will most often be too high. Instead, we urge investigators to use program DISTANCE, and to familiarize themselves with the underlying theory by reading Buckland et al. (1993).

Key words: density estimation, detection function, Fourier series, line transect, negative exponential distribution, program DISTANCE

It is now well known that estimating the abundance of wildlife populations is fraught with difficulties. As pointed out by Sheng and Xu (1992), line transect methods are among the best for medium and large-sized animals when estimates are required over large areas. Thus, the line transect method has been increasingly used by Chinese wildlife scientists (e.g., Liu and Yi 1993, State Forestry Administration 1995, Gao and Yao 1997).
However, even this method can produce unreliable results if critical assumptions are violated in the field, or if inappropriate mathematical analyses are applied afterwards. It is worthwhile reviewing the underlying assumptions of line-transect estimation here (Anderson et al. 1979, Burnham et al. 1980, Buckland et al. 1993):

1. Objects on the center line are observed with probability = 1.0 (i.e., every object on the line is detected);
2. Transect lines are placed randomly, or at least objectively, with respect to the population being studied;
3. Objects (i.e., animals or animal groups) do not move toward or away from the transect line in response to the observer before distances are measured;
4. Distances from the transect line to each object are measured accurately;
5. Transect line segments are straight;
6. The size of the object (or, if objects occur in groups, the size of the group) does not affect the probability of observation (if it does, analyses that account for size-bias must be used); and
7. Objects encountered are independent (i.e., observing an object does not affect the probability of observing any other object).

Additionally, sample sizes (number of objects observed) must be sufficient to provide robust estimates of the detection function and its variance (Burnham et al. 1980 proposed a minimum of 40 for any single estimate). If sample sizes are too small, results may be valid in theory but unreliable in practice. Under field conditions, it is difficult to comply with all these assumptions and still obtain reasonably large sample sizes (Southwell 1994, Harris 1996).

After data have been appropriately collected, equations or numerical methods are used to model the detection function, from which density is estimated. A number of competing models of how detection decreases with distance have been proposed.
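As a rough illustration of how a fitted detection function leads to a density estimate, the sketch below integrates a half-normal detection function to obtain the effective strip half-width, and then divides the number of animals seen by the area effectively searched. The function names and all survey values (sigma, truncation distance, line length, counts) are hypothetical and chosen only for illustration. This is not the procedure implemented in program DISTANCE, which also fits the function to the data and estimates variance.

```python
import math

def half_normal(x, sigma):
    """Half-normal detection function, scaled so that g(0) = 1."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def effective_half_width(g, w, steps=10000):
    """Trapezoid-rule integral of g(x) from 0 to the truncation distance w."""
    h = w / steps
    total = 0.5 * (g(0.0) + g(w))
    for i in range(1, steps):
        total += g(i * h)
    return total * h

def density(n, mean_group_size, line_length, mu):
    """Animals per unit area: D = n * s / (2 * L * mu)."""
    return n * mean_group_size / (2.0 * line_length * mu)

# Hypothetical survey values (assumptions, not data from any cited study).
sigma = 200.0          # half-normal scale, in metres
w = 2000.0             # truncation distance, in metres
mu = effective_half_width(lambda x: half_normal(x, sigma), w)
d_hat = density(n=64, mean_group_size=3.0, line_length=50000.0, mu=mu)

print(f"effective half-width: {mu:.1f} m")
print(f"density: {d_hat * 1e6:.2f} animals per km^2")
```

The only difference between this calculation and a strip transect is that the strip half-width is replaced by the integral of the detection function, so the choice of detection function directly controls the density estimate.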
Common sense, empirical data, and simulation modeling have supported the use of detection functions with a “shoulder” near the center line, such as the Fourier series (Burnham et al. 1980) and the half-normal (Buckland et al. 1993). Detection functions with a shoulder are likely to reflect reality better than other shapes, because objects near the center line are often only slightly less detectable than those on it, whereas detectability often drops off at some distance from the center line. Equally important, modern theory has stressed that detection functions may differ among taxa, habitats, sighting conditions, and other factors. Thus, computer programs such as DISTANCE (Thomas et al. 1998) provide alternative detection functions as well as metrics comparing the fit of each, allowing the user to choose the most appropriate one based on a priori or empirical information (Burnham and Anderson 1998).

A simple model of declining detection with distance is the negative exponential (Eberhardt 1968, Gates et al. 1968; Eq. 1):

$$ g(x) = e^{-ax} \qquad \text{(Eq. 1)} $$

where
g(x) = probability of detecting the animal at perpendicular distance x, assuming that all animals on the transect line are seen, i.e., g(0) = 1
a = parameter fitted to the data
x = perpendicular distance

However, the negative exponential function lacks a shoulder; in fact, the steepest decline in detection is closest to the center line.

An even simpler approach to treating distance data is to equate the mean of recorded perpendicular distances with the effective width of a sampled strip, and then to proceed with calculations (Eq. 2) as though a strip transect had been conducted (Sheng and Xu 1992, State Forestry Administration 1995):

$$ D = \frac{ns}{2LW} \qquad \text{(Eq. 2)} $$

where
D = estimated density of animals (or animal groups)
n = number of animals (or animal groups) seen
s = mean group size
L = length of transect line(s)
W = mean perpendicular distance of animals (or groups) seen

However, the point estimate of density obtained using Eq. 2 will only be accurate if the underlying (i.e., true) detection function is negative exponential. In addition, Eq. 2 lacks a theoretical basis and a method to estimate its variance.

Our objective here is to examine the use of Eq. 2 (and the conceptually similar Eq. 1) in estimating density from distance data, and to encourage Chinese scientists to use alternative methods that have been found superior.

PROBLEMS IN USING EQUATION 2

Equation 2 is inflexible and will usually show a positive bias

If the true decline in detectability with distance follows a negative exponential distribution, point estimates produced by either Eq. 1 or Eq. 2 will be approximately correct. However, they will be unreliable if other detection functions characterize the data. Buckland et al. (1993) recommend using a modified half-normal parametric detection function. If the half-normal detection function is true and the “mean distances” approach is used, a positive bias of 57% in the resulting density can be expected.

Burnham et al. (1980) conducted simulations to assess the performance of alternative detection functions when the true, underlying detection function was known. Table 1 reprints a portion of their results, comparing the negative exponential distribution (Eq. 1) with the much more flexible Fourier series. As can easily be seen, the negative exponential model performed well when the detection probability did, in fact, decline exponentially. Under these conditions, the Fourier series produced a negative bias of about 12-16%.
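The behaviour of Eq. 2 under these two detection functions can be checked numerically. The sketch below assumes untruncated distances and g(0) = 1, and compares the expected mean perpendicular distance (the W of Eq. 2) with the integral of the detection function (the effective half-width). Because density is inversely proportional to whichever width is used, the ratio of the two gives the expected relative bias of Eq. 2. The function names and the parameter values a, sigma, and upper are hypothetical.

```python
import math

def integrate(f, upper, steps=20000):
    """Trapezoid-rule integral of f from 0 to upper."""
    h = upper / steps
    total = 0.5 * (f(0.0) + f(upper))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

def eq2_relative_bias(g, upper):
    """Expected relative bias of Eq. 2 when g is the true detection function.

    Eq. 2 divides by the mean observed distance, whereas the line-transect
    estimator divides by the effective half-width mu, so the ratio of the
    two density estimates is mu / (mean distance).
    """
    mu = integrate(g, upper)
    mean_distance = integrate(lambda x: x * g(x), upper) / mu
    return mu / mean_distance - 1.0

# Hypothetical parameter values, for illustration only.
a = 0.05        # negative exponential rate, per metre
sigma = 20.0    # half-normal scale, in metres
upper = 400.0   # far enough out that both functions are effectively zero

bias_negexp = eq2_relative_bias(lambda x: math.exp(-a * x), upper)
bias_halfnormal = eq2_relative_bias(
    lambda x: math.exp(-x * x / (2.0 * sigma ** 2)), upper
)

print(f"negative exponential: {bias_negexp:+.1%}")
print(f"half-normal:          {bias_halfnormal:+.1%}")
```

Under the negative exponential the mean distance and the effective half-width both equal 1/a, so the bias is essentially zero. Under the half-normal the ratio is π/2, which corresponds to the positive bias of about 57% quoted in the text.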
However, when any other detection function was simulated, the negative exponential function produced highly biased results, from 10 to almost 66% too high, while the Fourier series returned relatively unbiased results. Thus, Burnham et al. (1980) recommended using the Fourier series because it was more robust to varied underlying detection functions.

One obvious way to compare the appropriateness of alternative detection functions is to apply all of them in a situation in which density is already known. In a test of various sighting methods performed prior to the development of rigorous line-transect theory, Robinette et al. (1974) demonstrated that Eq. 2 produced positive proportional biases of 19% to 89%, with a mean proportional bias of 48%. Similarly, Parmenter et al. (1989) showed that modeling detectability using the negative exponential model always resulted in an upward bias.

Laake (1978) conducted experiments in which observers documented perpendicular distances to wooden stakes placed in the ground at a known density (37.5/ha). Even in this well-controlled experiment, observers often failed to record all objects directly on the line, violating a critical assumption. An example of the detection function for one of the experiments is shown in Figure 1a, where we have corrected for the fact that, in this case, g(0) = 0.82 (rather than g(0) = 1.0). In this example, the Fourier series estimator using program DISTANCE estimated the density as 42.5/ha, about 13% above the true value. Had Eq. 2 been used with these data (as illustrated in Figure 1b), the estimated density would have been 67.9/ha (biased positively by 81%), and there would have been no way to assess the amount of uncertainty in this estimate.

This example is not an isolated case. In a recent experimental survey of the desert tortoise in the southwestern United States, Anderson et al.
(2001) had 12 teams estimate the abundance of artificial tortoises where the true number was known. The 12 estimates varied from a negative bias of 7% to a positive bias of 13%, with a mean bias of -4% (Table 2). Had Eq. 2 been used instead, biases would have varied from 62% to 93%, with a mean of 70% (Table 2).

Equation 2 allows calculation without inspecting the data

Most published applications of line transects in the Chinese literature lack raw data with which to compare competing mathematical approaches. However, Gao and Yao (1997) displayed their raw data on line-transect surveys of argali (Ovis ammon) in Xinjiang. They calculated densities from line transects with sample sizes of 4, 1, 1, 3, 3, 2, 2, 2, and 3 argali groups/transect. Even if transects from each study area had been combined (the more appropriate procedure), total sample sizes for the 2 study areas would have been 9 and 14, both far smaller than the recommended minimum of 40 (Burnham et al. 1980).

A cursory examination of histograms for the 2 study sites suggests that detection did not decline with distance (Fig. 2). Thus, the fundamental assumptions for fitting any of the possible detection functions were evidently not met. The only interpretation that is truly consistent with these (admittedly few) data is that detection probability was approximately invariant at least as far as the furthest group of argali seen. Thus, for the Hami study area (Fig. 2a), a more appropriate estimate of the width of strip “effectively” sampled would not have been the mean perpendicular distance of 327 m, but instead the largest perpendicular distance of 380 m. Doing so would have reduced their estimated density of 0.53 argali/km² to 0.41 argali/km². Similarly, for the Mulei study area (Fig. 2b), the estimated density of 0.82 argali/km² would have been more appropriately estimated as 0.54 argali/km². By using Eq.
2, Gao and Yao (1997) had no need to examine histograms of their data, or to consider the implications of assuming the negative exponential.

AN EXAMPLE FROM FIELD WORK IN CHINA

Field work in China is particularly difficult, and many of the means for conducting line transects used in the West (e.g., aircraft) are not available. However, sufficient observations can sometimes be obtained even in the difficult conditions in which Chinese scientists usually find themselves. From these, detection functions can be estimated using program DISTANCE, rather than using Eq. 2. For example, Harris (1996; see also Harris and Miller 1995, Harris et al. 1996) walked randomly placed line transects in Qinghai to estimate densities of Tibetan gazelle (Procapra picticaudata). The sample size obtained (N = 64 groups) allowed the estimation of the detection function using the Fourier series (Fig. 3). It appears that even here, some “heaping” occurred in those distance categories closest to the center line, which should be avoided if possible.

RECOMMENDATIONS

It may be tempting to apply Eq. 2 because it is so easy to calculate. However, the accumulated experience in western countries, illustrated briefly by the examples provided here, is that it forms an unreliable basis for estimating density from distance data. Considerable effort has gone into providing user-friendly computer software (Thomas et al. 1998) and explanatory text material (Burnham et al. 1980, Buckland et al. 1993) for methods that are known to be more reliable. Both the software (program DISTANCE) and the accompanying text book (Buckland et al. 1993) are available at no cost over the internet from http://ruwpa.st-and.ac.uk/distance. Now that computers and internet access are becoming more common in China, the methods provided by program DISTANCE should be used whenever possible.
It is true that program DISTANCE is not a panacea; users can (and will) choose differing ways of treating data, resulting in slightly different density estimates. In addition, the most robust and precise estimators often underestimate true density slightly because real data rarely match the ideal perfectly. However, recent work has shown that, given an introduction to the important concepts, conscientious investigators using program DISTANCE will produce results varying by less than 10% from one another (Anderson and Southwell 1995). Thus, even if a small negative bias cannot be avoided, results will be fairly consistent from survey to survey.

However, no detection function will perform well when sample sizes are very small. For example, Gao and Yao (1997) reported density estimates from lines on which only a single group of animals (i.e., n = 1) was observed. Such an estimate is, of course, theoretically possible using Eq. 2. However, investigators should not be fooled into thinking that they really know much about the density of animals in an area when they have only a single group with which to model detection. Unless there are persuasive reasons to avoid doing so, it is appropriate to combine data from replicate surveys, or portions of a study area, to achieve reasonable sample sizes in estimating a detection function.

However, if, after having combined similar lines, sample sizes are still quite small (e.g., 10-20), it is then best to avoid estimating densities from distances. Instead, it is more prudent to simply report the number and type of animals observed (as well as thoroughly documenting the methods used), and to treat the results as an index to abundance. This index cannot be used to determine absolute density or abundance, but might still be useful if repeated periodically to obtain a rough idea of population trends.
Similarly, estimates using program DISTANCE are invalidated to the degree that field procedures violate the fundamental assumptions of line-transect sampling. If sampling cannot be conducted in a way that minimizes assumption violations, it is again advisable to simply report the raw data and methods used, rather than attempt to derive a density when no model exists to do so.

Finally, the importance of a rigorous, objective sampling regime cannot be stressed enough. Extrapolations of density are only valid if the transects sampled truly represent an unbiased selection of all possible transects within the area of interest.

ACKNOWLEDGEMENTS

Work in China was funded by the Robert M. Lee Foundation and the Liu Guo Lit Charitable Trust. S. T. Buckland and J. L. Laake provided suggestions to improve the manuscript.

LITERATURE CITED

Anderson, D. R., J. L. Laake, B. R. Crain, and K. P. Burnham. 1979. Guidelines for line transect sampling of biological populations. Journal of Wildlife Management 43:70-78.

Anderson, D. R., and C. Southwell. 1995. Estimates of macropod density from line transect surveys relative to analyst expertise. Journal of Wildlife Management 59:852-857.

Anderson, D. R., K. P. Burnham, B. C. Lubow, L. Thomas, P. S. Corn, P. A. Medica, and R. W. Marlow. 2001. Field trials of line transect methods applied to estimation of desert tortoise abundance. Journal of Wildlife Management 65:583-597.

Buckland, S. T., D. R. Anderson, K. P. Burnham, and J. L. Laake. 1993. Distance sampling: Estimating abundance of biological populations. Chapman and Hall, London. 446 pages. Available from http://ruwpa.st-and.ac.uk/distance.

Burnham, K. P., and D. R. Anderson. 1998. Model selection and inference: A practical information-theoretic approach. Springer-Verlag, New York, NY.

Burnham, K. P., D. R. Anderson, and J. L. Laake. 1980. Estimation of density from line transect sampling of biological populations. Wildlife Monographs 72:1-202.

Eberhardt, L. L.
1968. A preliminary appraisal of line transects. Journal of Wildlife Management 32:82-88.

Gao, X. Y., and J. Yao. 1997. Argali of the eastern Tianshan, Xinjiang. Chinese Wildlife (Yesheng Dongwu) 18(4):38-40. (in Chinese).

Gates, C. E., W. H. Marshall, and D. P. Olson. 1968. Line transect method of estimating grouse population densities. Biometrics 24(1):135-145.

Harris, R. B., D. J. Miller, Cai G. Q., and D. H. Pletscher. 1996. Wildlife status and conservation in Yeniugou, Qinghai. Acta Theriologica Sinica 16:113-118. (in Chinese, English version available).

Harris, R. B. 1996. Wild ungulate surveys in grassland habitats: Satisfying methodological assumptions. Chinese Journal of Zoology 31(2):16-21. (in Chinese, English version available).

Harris, R. B., and D. J. Miller. 1995. Overlap in summer habitats and diets of Tibetan plateau ungulates. Mammalia 59:197-212.

Laake, J. L. 1978. Line transect estimators robust to animal movement. M.S. Thesis, Utah State Univ., Logan. 55 pp.

Liu, W. L., and B. G. Yi. 1993. Wildlife protection in Tibet. China Forestry Press, Beijing. (in Chinese).

Parmenter, R. R., J. A. MacMahon, and D. R. Anderson. 1989. Animal density estimation using a trapping web design: field validation experiments. Ecology 70:169-179.

Robinette, W. L., C. M. Loveless, and D. A. Jones. 1974. Field tests of strip census methods. Journal of Wildlife Management 38:81-96.

Sheng, H. L., and H. F. Xu. 1992. Field Research Methods for Mammals. China Forestry Press, Beijing. (in Chinese).

Southwell, C. 1994. Evaluation of walked line transect counts for estimating macropod density. Journal of Wildlife Management 58:348-356.

State Wildlife Protection Office. 1995. National terrestrial wildlife resource survey and monitoring methods. Ministry of Forestry, Beijing. December, 1995. (in Chinese).

Thomas, L., J. L. Laake, J. F. Derry, S. T. Buckland, D. L. Borchers, D. R. Anderson, K. P. Burnham, S. Strindberg, S. L. Hedley, M. L. Burt, F. Marques, J. H.
Pollard, and R. M. Fewster. 1998. Program DISTANCE 3.5. Research Unit for Wildlife Population Assessment, University of St. Andrews, U.K. Available from http://ruwpa.st-and.ac.uk/distance.

Table 1. Mean percent relative bias of the negative exponential function (Eq. 1) and the more flexible Fourier series, when applied to simulated data with underlying negative exponential, half-normal, and modified beta detection functions. Each value is the mean of 25 simulations; the sample size of distances in each simulation was 100. Data are taken from Burnham et al. (1980:158).
________________________________________________________________
                                            Method used
                                 --------------------------------------
Underlying detection function    Negative exponential    Fourier series
----------------------------------------------------------------
Negative exponential
  - Severely truncated                  -0.3                 -13.7
  - Moderately truncated                -1.8                 -12.5
  - Untruncated                         +1.9                 -16.5
Half-normal
  - Severely truncated                 +10.4                  +0.1
  - Moderately truncated               +26.5                  +1.1
  - Untruncated                        +59.7                  +2.5
Modified beta
  - Shoulder                           +65.9                  -5.3
  - Linear                             +47.7                  -7.1
  - Spiked                             +32.8                  -5.3
________________________________________________________________

Table 2. Abundance estimates of artificial tortoises using Eq. 2 and the Fourier series method (Burnham et al. 1980) from 12 different survey teams. Data are taken from Anderson et al. (2001). The true density of tortoises in all 12 cases was 76.
________________________________________________________________
                   Eq. 2                      Fourier
Team     N     Estimate   Percent bias    Estimate   Percent bias
----------------------------------------------------------------
  1     47       124         +63%            74          -3%
  2     49       128         +68%            78          +3%
  3     52       126         +66%            75          -1%
  4     55       129         +70%            79          +4%
  5     57       123         +62%            73          -4%
  6     57       129         +70%            71          -7%
  7     64       147         +93%            86         +13%
  8     52       132         +74%            79          +4%
  9     52       127         +67%            75          -1%
 10     60       130         +71%            79          +4%
 11     59       144         +89%            85         +12%
 12     55       125         +64%            72          -5%
----------------------------------------------------------------
Mean             129         +70%            73          -4%
________________________________________________________________

Figure Captions.

Figure 1. Histograms and fitted detection functions for an example of distance data. Data are from wooden stakes placed in the ground at a known density (Burnham et al. 1980:62). A. The Fourier series detection function, which yielded an estimate of 42.5 stakes/ha. B. The negative exponential detection function, which is assumed when using Eq. 2, and which yielded an estimate of 67.9 stakes/ha. The true density of stakes was 37.5/ha.

Figure 2. Histograms of perpendicular distances of argali (Ovis ammon) observed during line transect surveys conducted by Gao and Yao (1997). A. Surveys in Hami (Table 1 from Gao and Yao 1997), fitted with the negative exponential function. B. Same as A, except fitted using a half-normal function. C. Surveys in Mulei (Table 2 from Gao and Yao 1997), fitted with the negative exponential function. D. Same as C, except fitted using a half-normal function. Compare the shape of these histograms with those in Fig. 1.

Figure 3. Fourier series detection function superimposed on a histogram of perpendicular distances from a survey of Tibetan gazelles in Qinghai province, using program DISTANCE, N = 64.
[Figure 1. Panels A and B. Axes: detection probability versus perpendicular distance (m).]

[Figure 2. Panels: A. Hami data: Negative exponential function. B. Hami data: Half-normal function. C. Mulei data: Negative exponential function. D. Mulei data: Half-normal function. Horizontal axis: perpendicular distance in meters.]

[Figure 3. Axes: detection probability versus perpendicular distance (m).]