RAD 1 RAD – a FORTRAN program for fitting species—rank order distributions Version 2.1 Werner Ulrich Nicolaus Copernicus University in Toruń Chair of Ecology and Biogeography Lwowska 1, 87-100 Toruń; Poland e-mail: ulrichw @ umk.pl Latest update: 08.06.2015 1. Introduction Many SAD models have been developed to Since its introduction to ecology by Raunkiær mimic resource division and stochastic processes (1909), species - abundance distributions (SADs) have (Tokeshi 1999, McGill et al. 2007), although nearly all been extensively studied (reviewed in McGill et al. of them are now out of date. Interestingly, two of the 2007, Matthew and Whittaker 2014, 2015). They pro- oldest models, the log-series (Fisher et al. 1943) and vide an exhaustive description of the distribution of the lognormal (Preston 1948) are still discussed and abundances within an ecological community the log-series received even a renaissance as two (Magurran 2004, McGill et al. 2007, Morlon et al. neutral models, the zero sum multinomial (Hubbell 2009, Matthews and Whittaker 2015) and have been 2001) and the maximum entropy approach (Harte linked to differential resource use and competitive 2011) predict log-series. According to the multinomial strength (Sugihara 1980, Tokeshi 1998, Pueyo, 2006), neutral model the degree of dispersal should cause a disturbance regimes (Gray and Mirza 1979), stochas- gradient from lognormal to log-series type relative tic processes (May 1975, Šizling et al. 2009), or spe- abundance distributions (Hubbell 2001). The maxi- cies-specific dispersal rates (Hubbell 2001, Zillio and mum entropy based neutral model of Harte (2011) Condit, 2007). predicts the prevalence of log-series due to constraint random assembly of local communities. Particularly Relative abundance 1 the zero sum multinomial neutral model might serve 0.1 as an ecological standard (Rosindell et al. 2012). 0.01 0.001 0.0001 0 10 20 30 Species rank order 40 50 Figure 1. Species rank order—log abundance (Whittaker) plots generated by RAD of the power fraction model (yellow), the lognormal model (light blue), the geometric series (dark blue), the log-series (brown), and the power function (fractal) model (grey). 2 RAD The present RAD 2.1 software is the follow-up in a straight line when using a double log SAD plot. of the older RAD software (Ulrich 2001) that fitted a They contain all models that are based on allometric large number of relative abundance distributions. (power) functions (Frontier 1985, Moulliot et al. 2000, RAD is focused on five models, the lognormal (Preston Pueyo 2006). 1948), the log-series (Fisher et al. 1943), the geomet- While it is often possible to classify observed ric (Motomura 1932), the power function (Frontier SADs into one of these three classes, it is frequently 1985, Mouillot et al. 2000), and the power fraction challenging or even impossible to discriminate within (Tokeshi 1996). The latter represents a general niche a given class. Further, at low species richness it bedivision model. comes increasingly difficult to discriminate between RAD fits species rank order—log abundance SAD classes. Ulrich et al. (2010) advised not to fit asplots (often called Whittaker plots) only. It does not semblages of less than ten species. deal with species—log abundance (Preston) distributions. Therefore RAD fits avoid possible biases intro- Lognormal model RAD generates lognormally distributed rank- duces by grouping of species into abundance classes abundance distributions with S species from a stoand by the associated reduction of data. chastic process with variance (shape parameter) z: Reliable model identification needs fully censused assemblages. As most models differ by their predictions of the relative abundances of the rarest species, samples are often fitted equally by different models. At low sample size model discrimination becomes increasingly challenging. Samples are most often best fitted by log-series (Ulrich et al. 2010). pi ae z norm where pi denotes the relative abundance of species i and norm is a normally distributed random variate. a is a normalizing constant that ensures that the sum of all relative abundances adds to one. Power fraction model The relative abundance pi of species i is calcu- 2. Models In general, we can subsume nearly all models described fo far (McGill et al. 2007) into three classes. One class of models results in SAD shapes similar to simple exponential functions. The geometric series (Motomura 1932), random assortment (Tokeshi 1990) and dominance pre-emption (Tokeshi 1990) models, and, as a descriptor of samples, the log-series (Fisher et al. 1943) belong to this class (Fig.1). The second class contains all models that generate S-shaped SAD plots. To this class belong the lognormal (Preston 1948) and broken stick (McArthur 1957) models, and the random (Tokeshi 1990), power (Tokeshi 1996), MacArthur (Tokeshi 1990) and Sugihara (Sugihara 1980) fractions. The third class contains models result lated iteratively by a two step process as developed in Tokeshi (1996). First a species is chosen from an exponential probability distribution with parameter z (-∞ < z < ∞) applied to the total abundance space. Next the part of relative abundance occupied by this species is dived into two parts using an equal random distribution. The power fraction is a very flexible model able to mimic the whole range of S-shaped SAD distributions including the lognormal, the Sugihara fraction, and the broken stick. Geometric model The geometric series is computed as pi az (1 z ) i where pi denotes the relative abundance of species i. The value of the parameter z may range between 0 RAD 3 Relative abundance 1 Si z Xi i with 0 < X < 1 and 0 < z < S. Power function (fractal) model 0.1 In a stochastic version of the power function the relative abundance pi of species i is given by 0.01 pi a ( lin S 1) z with lin being a linear random number and with the shape parameter z > 0. a is again a normalizing con- 0.001 0 10 20 30 Species rank order Figure 2. A typical log-series fit (black dots) of a hypothetical relative abundance distribution (blue dots). stant that ensures that the sum of all relative abundances adds to one. 3. Fitting algorithms The fitting algorithm of RAD 2.1 is essentially similar to the older RAD versions. Single parameter and 1. a is a normalizing constant that ensures that the sum of all relative abundances adds to one. models (lognormal, power fraction, power function, geometric) are fitted by an iterative encapsulation Log-series model process. Initially of each model five SADs are calculatIn the two parametric log-series the expected ed with parameters x equally spanning the whole number of species Si with i individuals is given by range of possible values (the maximum and minimum possible values xmin and xmax, x1 = 0.25(xmax - xmin) Figure 3. The SAD screen 4 RAD Convergence is very fast for all models and generally needs less than 20 iterations. Fitting very large assemblages (> 500 speFigure 5. A batch file for multiple analysis. cies) may take considerable CPU time. 4. Program run RAD first askes about the input file (Fig. 3). Figure 4. A space delimited data file containing six This should be a space delimited text file as shown in communities to be analysed. Zero counts are eliminated prior to fitting. Fig. 4. The first column has to contain species names, the first line plot names. These names must not con+xmin, x2 = 0.5(xmax - xmin) +xmin, and x3 = 0.75(xmax xmin) +xmin. Of the three best fitting (see below) parameters two are retained as being the new xmin and xmax This process is repeated until |xmax - xmin| < 0.001, a convergence value that proved to be fully sufficient tain spaces. Zero counts are eliminated prior to fitting. Fitting is done with observed species only. After a blank input (carriage return) the software expects the name of a file containing multiple input files as shown in Fig. 5. for the present models. The log-series is a two parameter model (z and X). The parameter z is first iteratively estimated as described above from Next the software asks about the maximum allowed abundance range (range = maximum/ minimum) for fitting the log-series. As RAD will create a respective vector of size range, the value is limited N by the available working memory. The default value S z ln1 z is 50,000. If the observed community has a much largwith S being the total number of species and N the total abundance. Next X is estimated accordingly. This procedure proved to give robust results (Fig. 2), while a simultaneous parameter encapsulation sometimes Figure 6. Three methods for fitting (red: observed, black: predicted), equivalent to the OLS (a), RMA (b), and MA (c) approach. 1 be a count of individuals. In practise most abundance data are given as density = counts / number of sample units. RAD treats such data differentially. If all species abundances of the input data are above or equal unity the total abundance is estimated as the sum of these values. If the least abundance species has an abundance below unity all values are first divided by this value making the least abundant species to have an abundance of one. Relative abundance failed. In the log-series the total abundance N should (a) (b) 0.1 (c) 0.01 0 1 2 3 4 5 6 Species rank order 7 8 RAD 5 Figure 7. The major output file of SAD. er abundance range than predefined fits might be- where pi,obs, pi,exp denote the observed and fitted relacome increasingly worse as numbers of species with tive abundances of species i. Minimum values min very low abundances will be underrepresented. obtained with species j refer to the respective mini- RAD proposes three target functions (Fig. 6) to mum when calculated over all species S. assess goodness of fit. It uses similar approaches as in linear regression. Ordinary least squares fits (OLS) use d 2 (ln p i , obs ln pi , exp ) 2 RAD produces two output files. The first S (default = Output.txt) contains the summary statistics S reduced major axis (RMA) fits sum of all species abundances (counts or densities). If S these are given as relative abundances RAD first di- d 2 S and major axis (MA) fits 2 (Fig. 7). The total abundance (TotalA) refers to the ( min[(ln pi ,obs ln pi ,exp ) ( j i ) ] 2 d2 5. Output ( min(ln pi ,obs ln p j ,exp ) 2 vides all values by the value of the least abundant species. Abundance difference (DiffA) provides the S S Figure 8. The distribution file of SAD. 6 range of abundances (DiffA = most / least abundant species). Curvature refers to the relative skewness of the empirical distribution with respect to the skew- RAD 7. System requirements RAD is written in FORTRAN 95, has been ness of the lognormal. In theory the lognormal has a compiled under a 64 bit architecture, and runs under skewness of zero, but as the present lognormal is cal- Windows 8, 7, XP, and Vista. The present version is culated from a stochastic process small differences only limited by the computer’s memory. occur. Therefore (log pobs log pobs )3 (log plnorm log p ln lnorm )3 S S Curvature S 3 3 ( S 1)( S 2) obs lnorm In most cases curvature will be similar to the skewness of the observed SAD. Next RAD prints the shape parameters z (and 8. Acknowledgements Development of this program was support- X in the case of the log-series) and the associated fits ed by a grant from the Polish Science Centre (grant d2. The log-series parameter X is always below one 2014/13/B/NZ8/04681 ). and nearly always above 0.9. In many cases it is very close to one. In this case RAD prints a value of 9. References 1.0000000. Finally, RAD prints the Shannon diversity Connolly, S.R. and Dornelas, M. 2011. Fitting and emH and the respective evenness E = H/ln(S). pirical evaluation of models for species abunThe second output file (Fig. 8) contains the dance distributions. In: Biological Diversity: empirical and the fitted relative abundance distribu- Frontiers in Measurement and Assessment (eds tions of the five models. The latter relative abundanc- A.E. Magurran and B.J. McGill), pp. 123-140. es are averages from 20 replicates using the best fit Oxford University Press, Oxford. parameter values. This number has proven to be suffi- Fisher, A.G., Corbet S.A. and Williams, S.A. 1943. The cient for a smooth plot. In the case of the stochastic relation between the number of species and the models RAD prints for each species the standard devi- number of individuals in a random sample of an ation of the average abundance derived from 20 repli- animal population. J. Anim. Ecol. 12: 42-58. cates. Only the deterministic geometric model does Frontier, S. 1985. Diversity and structure in aquatic not have a variance. In the case of serval plots ecosystems. In: Oceanography and Marine Bio(columns) in the input file, RAD fits automatically the logy—An Annual Review ( ed. M. Barnes). Aber- marginal total distribution of all of these plots. deen, pp. 253-312. Gray, J.S. and Mirza, F.B. 1979. A possible method for 6. Citing RAD RAD is freeware but nevertheless if you use RAD in scientific work you should cite SAS as follows: the detection of pollution-induced disturbance on marine benthic communities. Marine Poll. Bull. 10: 142-146. Ulrich W. 2015. RAD – a FORTRAN program for fitting Harte, J. 2011. Maximum entropy and ecology. Oxford, Univ. Press. species—rank order distributions. Version 2.1. www.keib.umk.pl. Hubbell, S.P. 2001. The unified theory of biogeogra- RAD 7 phy and biodiversity. Princeton (Univ. Press). Magurran, A.E. 2004. Measuring biological diversity. Blackwell, Oxford. Pueyo, S. 2006. Self-similarity in species–area relationship and in species–abundance distribution. Oikos 112: 156–162. MacArthur, R.H. 1957. On the relative abundance of Raunkiær, C. 1909. Livsformen hos Planter paa ny bird species. Proc. Natl. Acad. Science USA 43: Jord. Kongelige Danske Videnskabernes Sels- 293-294. kabs Skrifter. Naturvidenskabelig og Mathema- Matthews, T.J. and Whittaker, R.J. 2014. Fitting and comparing competing models of the species tisk Afdeling 7: 1-70. Rosindell, J., Hubbell, S.P., He, F., Harmon, L.J. and abundance distribution: assessment and pro- Etienne, R.S. 2012. The case for ecological neu- spect. Frontiers Biogeogr. 6: 67–82. tral theory. Trends Ecol. Evol. 27: 203–208. Matthews, T.J. and Whittaker, R.J. 2015. On the spe- Šizling, A.L., Storch, D., Šizlingova, E., Reif, R. and Ga- cies abundance distribution in applied ecology ston, K. 2009. Species abundance distribution and biodiversity management. J. Appl. Ecol. 52: results from a spatial analogy of central limit 443-454. theorem. Proc. Natl. Acad. Sci. USA 106: 6691– May, R.M. 1975. Patterns of species abundance and 6695. diversity. In: Ecology and evolution of communi- Sugihara, G. 1980. Minimal community structure: an ties (eds. Cody, M. L. and Diamond, J.M.) Cam- explanation of species–abundance patterns. Am. bridge University Press, pp. 81–120. Nat. 116: 770–787. McGill, B.J. et al. 2007. Species abundance distribu- Tokeshi, M. 1990. Niche apportionment or random tions: moving beyond single prediction theories assortment: species abundance patterns revisi- to integration within an eco-logical framework. ted. J. Animal Ecol. 59: 1129-1146. Ecol. Lett. 10: 995–1015. Morlon, H. et al. 2009. Taking species abundance distributions beyond individuals. Ecol. Lett. 12: 488-501. Motomura, I. 1932. On the statistical treatment of communities. Zool. Mag. Tokyo 44: 379-383 (in Japanese). Moulliot, D., Lepretre, A., Andrei-Ruiz, M.-C. and Tokeshi, M. 1996. Power fraction: a new explanation of relative abundance patterns in species-rich assemblages. Oikos 75: 543-550. Tokeshi, M. 1998. Species coexistence. Blackwell, Oxord. Ulrich, W. 2001. RAD—a Fortran program for fitting relative abundance distributions. www.keib.umk.pl. Viale, D. 2000. The fractal model: an new model Ulrich, W., Ollik, M. and Ugland, K.I. 2010. A metato describe the species accumulation process analysis of species – abundance distributions. and relative abundance distribution (RAD). Oikos Oikos 119: 1149-1155. 90: 333-342. Preston, F.W. 1948. The commonness and rarity of species. Ecology 29: 254–283. Zillio, T. and Condit, R. 2007. The impact of neutrality, niche differentiation and species input on diversity and abundance distributions. Oikos 116: 931 -940.