Mori et al. Appendix Appendix 1. Evaluating the individual randomization model using artificially-generated datasets. To further explore how the model based on individual randomization (the IR algorithm) behaves in response to different meta-community structures, we artificially generated different datasets. We first specified the total abundance Ni of species i by sampling from a log-normal distribution of abundances with N(0, 1) (Kraft et al. 2011; Ulrich et al. 2010). To mimic different carrying capacities per plot, the relative abundances Aj for each site j were also drawn from a random uniform distribution (0 < Aj ≤ 1.0) (Ulrich and Gotelli 2010). The number of plots and size of species pool (γ-diversity) were set to 30 plots and 50 species for all calculations, respectively. From here, we generated two different meta-communities. The first has a pattern of intraspecific aggregation, while the second has no aggregation. To mimic a possible intraspecific aggregation of individuals (each species has the possibility of failing to appear in some localities due to possible dispersal limitation, biotic filtering, abiotic filtering and other stochastic factors), we specified the probability for species i in failing to immigrate into some local sites (plots) using a random uniform distribution (0 ≤ Bi ≤ 1.0) with the maximum probability of immigration failure set to half of all locations (i.e., 15 plots). The first dataset therefore shows an inconsistency in species abundance ranks between local communities, which is well observed in natural communities. The second dataset was not adjusted to have such aggregated distribution of individuals within species, so that all plots have (almost) the same pattern of species rank-abundance (specified above). We found a significant positive correlation of ß-deviation, the magnitude of deviation from the expected ß-diversity, with abundance in each plot, regardless of ß-diversity indices (Fig. S1a). This result is consistent with that presented in the main manuscript (Fig. 2), again suggesting an abundance-driven sampling effect. The values of ß-deviation were positive (Fig. S2a), indicating strong intraspecific aggregation. In contrast, there was no correlation between ß-deviation and abundance in each plot when analyzing the data without intraspecific aggregation (Fig. S1b). Local communities in this second dataset consistently maintain the same pattern of species rank-abundance distribution across local communities within the meta-community; that is, there was no shift in species dominance between locations. Due to the lack of aggregation within species, the values for ß-deviation were distributed around zero (Fig. S2b). Note that it is possible to generate different artificial datasets with different specification of aggregated patterns. However, because of difficulties with making individuals in each species aggregate in a realistic manner, we did not conduct further benchmark tests using simulated datasets. Our preliminary results are provided primarily to show the potential Mori et al. Appendix of the IR model, which may be useful for quantifying the magnitude of intraspecific aggregation (Myers et al. 2013), when analyzing the appropriate data. Fig. S1. Relationships between individual abundance in each plot and plot-mean ß-deviation, the magnitude of deviation from the expected ß-diversity calculated from the individual-based randomization. a) Data with intraspecific aggregation. b) Data without intraspecific aggregation. Data were artificially generated (see the text). Species abundance distributions follow a log-normal distribution. The number of plots and size of species pool (γ-diversity) were set to be 30 and 50 for all calculations, respectively. Four different measures of ß-diversity were used (shown above each box). The correlation coefficient (r-value) is shown when significant; ** and * indicate P < 0.01 and P < 0.05, respectively. Mori et al. Appendix Fig. S2. Boxplots showing the results for ß-deviation, the magnitude of deviation from the expected ß-diversity calculated from the individual-based randomization. a) Data with intraspecific aggregation. b) Data without intraspecific aggregation. Data were artificially generated (see the text). Species abundance distributions follow a log-normal distribution. The number of plots was 30 for all calculations, and the size of the species pool (γ-diversity) was set to 50. The results shown are based on Chao’s ß-diversity index. References Kraft NJB, Comita LS, Chase JM, Sanders NJ, Swenson NG, Crist TO, Stegen JC, Vellend M, Boyle B, Anderson MJ, Cornell HV, Davies KF, Freestone AL, Inouye BD, Harrison SP, Myers JA (2011) Disentangling the drivers of beta diversity along latitudinal and elevational gradients. Science 333:1755-1758 Myers JA, Chase JM, Jimenez I, Jorgensen PM, Araujo-Murakami A, Paniagua-Zambrana N, Seidel R (2013) Beta-diversity in temperate and tropical forests reflects dissimilar mechanisms of community assembly. Ecol Lett 16:151-157 Ulrich W, Gotelli NJ (2010) Null model analysis of species associations using abundance data. Ecology 91:3384-3397 Ulrich W, Ollik M, Ugland KI (2010) A meta-analysis of species-abundance distributions. Oikos 119:1149-1155 Mori et al. Appendix Appendix 2. Notes on null models used in this study. Table S1. Notes on null models used in this study. Basic properties of each model are described in Table 1. Model Summary (possible pros and cons) Related references Individual-based randomization This model is effective for quantifying the magnitude of intraspecific aggregation of individuals by excluding the effects of 1,2 (IR) species pool size. It is normal to see aggregation, possibly resulting from biotic filtering, abiotic filtering, dispersal limitation, immigration history and/or ecological drift. The model shuffles species at the individual level while preserving: 1) the species-abundance pattern of a meta-community, and 2) the carrying capacity (the number of individuals that can be hospitable) at each location. Due to these constraints, the model may be useful for disentangling the spatial and/or environmental processes governing how species fail to migrate into some locations within a meta-community. The null hypothesis could be related to the neutral hypothesis as this null model makes species disperse into any sites within a null meta-community at the individual level. However, when comparing results from different regions, careful consideration is necessary because the model can be significantly affected by the abundance-driven sampling effect. Species-based randomization This model is suitable for disentangling the factors shaping community processes among species with similar occurrence 3,4,5 (SR) frequencies. However, if species with similar abundances have different niches or dispersal abilities, they may not directly compete for resources, may have different habitat requirements, or may experience different levels of dispersal limitation. In such cases, the method has limited application for niche- or dispersal-based processes influencing the observed patterns of ß-diversity. Importantly, the model restricts species to emerge in a specific location; that is, species do not have equal probability of migrating into all locations within a null meta-community. Note that this model is often used for functional and phylogenetic community analyses, because overall species’ frequencies are often distributed non-randomly across the phylogeny or the functional trait space. Probability-based randomization This model may be suitable for avoiding incorrect interpretations from the abundance-driven sampling effect. Given the 5,6 (PR) generality of species abundance distribution, the model is in general robust. However, it may have limited statistical power because of its relatively strong restrictions. More specifically, if one fails to assume there are processes underlying the observed shape of species abundance distribution, this method might undervalue important local processes. If a focal hypothesis is directly related to intraspecific aggregation of individuals, the model may not be the best candidate (although the model is Mori et al. Appendix capable of detecting aggregation). Due to the efficiency in removing both sampling effects (caused by differences in species pool size and abundance), the model may be useful for seeing a spatial gradient in ß-diversity as γ-diversity and abundance are often correlated at a broader geographical scale. In such cases, the null hypothesis may be that, after excluding the effects of differences in the size of the species pool and the individual abundance among sites, there are no local processes (associated with environmental, spatial and historical factors) causing the ß-diversity to deviate from a random expectation. Richness-based randomization This model is only applicable when the focal null hypothesis is independent of species-abundance patterns. The model does not 7,8 (RR) provide any inference about intraspecific aggregation. The model may be valid when one assumes all species have equal chance of migration into a given site; however, this type of randomization is not valid for abundance-based data. This model can make rare and abundant species unrealistically-abundant and rare, respectively. Although uncertainty still exists about the mechanical explanations that generate species-abundance patterns, a null model that ignores species identity in a species rank-abundance (rare or dominant), as does this model, may misestimate such empirical patterns of spatial organization when abundance data is available. Related references 1. Kraft NJB, Comita LS, Chase JM, Sanders NJ, Swenson NG, Crist TO, Stegen JC, Vellend M, Boyle B, Anderson MJ, Cornell HV, Davies KF, Freestone AL, Inouye BD, Harrison SP, Myers JA (2011) Disentangling the drivers of β diversity along latitudinal and elevational gradients. Science 333:1755-1758 2. Myers JA, Chase JM, Jimenez I, Jorgensen PM, Araujo-Murakami A, Paniagua-Zambrana N, Seidel R (2013) Beta-diversity in temperate and tropical forests reflects dissimilar mechanisms of community assembly. Ecol Lett 16:151-157 3. Hardy OJ (2008) Testing the spatial phylogenetic structure of local communities: statistical performances of different null models and test statistics on a locally neutral community. J Ecol 96:914-926 4. Purschke O, Schmid BC, Sykes MT, Poschlod P, Michalski SG, Durka W, Kuhn I, Winter M, Prentice HC (2013) Contrasting changes in taxonomic, phylogenetic and functional diversity during a long-term succession: insights into assembly processes. J Ecol 101:857-866 5. Mori AS, Ota AT, Fujii S, Seino T, Kabeya D, Okamotto T, Ito MT, Kaneko N, Hasegawa M (2014) Biotic homogenization and differentiation of soil faunal communities in the production forest landscape: taxonomic and functional perspectives. Oecologia:in press 6. Karp DS, Rominger AJ, Zook J, Ranganathan J, Ehrlich PR, Daily GC (2012) Intensive agriculture erodes beta-diversity at large scales. Ecol Lett 15:963-970 Mori et al. Appendix 7. Gotelli NJ (2000) Null model analysis of species co-occurrence patterns. Ecology 81:2606-2621 8. McGill BJ, Etienne RS, Gray JS, Alonso D, Anderson MJ, Benecha HK, Dornelas M, Enquist BJ, Green JL, He F, Hurlbert AH, Magurran AE, Marquet PA, Maurer BA, Ostling A, Soykan CU, Ugland KI, White EP (2007) Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework. Ecol Lett 10:995-1015