Supplementary Information, Appendix 3: Sample size analysis and permutations.

Results of a bootstrap analysis of the effect of sample size on the estimate of nucleotide diversity (π). For three populations (Mionectes oleagineus – Bocas del Toro, Panama; Myrmeciza exsul – Bocas del Toro, Panama; Henicorhina leucosticta – Darien, Panama) we created 8 bootstrap datasets, each consisting of 1000 subsamples of the entire population dataset. Each subsampled dataset had between 2 and 10 individuals, which were sampled with replacement from the population. For each of these 8000 subsampled datasets we calculated π. At each sample size value (e.g. 2 to 8) we calculated the mean (Nuc[leotide] Div[ersity] mean), standard deviation (Nuc Div St Dev), median (Nuc Div Median), and median minus mean (Median – Mean) of π. We present these data in three figures below.

In all three cases, we noted a minor downward bias in mean and median estimates of π at sample sizes between 2 and 4. However, above n = 4 we detected no bias in π associated with sample size. Likewise, the standard deviation of π was relatively large for samples of 2 or 3, but above n = 4 little decrease in standard deviation was observed with increases in sample size. Finally, median values of π were extremely stable above n = 4. We infer from these data that the minor bias in the smallest samples is due to the failure to sample relatively uncommon, genetically dissimilar alleles.

The conclusions in this manuscript are based on estimates of nucleotide diversity in 49 populations for which the true parameter value remains unknown. Our bootstrap simulations demonstrate that sample-size related artifacts are unlikely to have played an appreciable role in our observations of patterns in the geographical distribution of nucleotide diversity among these populations.
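The subsampling procedure described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes π is computed as the mean proportion of differing sites over all pairs of aligned sequences, and the toy sequences, function names, and random seed are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, median, stdev


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Assumed definition of pi for this sketch; requires at least 2 sequences.
    """
    length = len(seqs[0])
    pair_diffs = [
        sum(a != b for a, b in zip(s1, s2)) / length
        for s1, s2 in combinations(seqs, 2)
    ]
    return mean(pair_diffs)


def bootstrap_pi(population, sample_sizes, n_reps=1000, seed=0):
    """For each sample size, draw n_reps subsamples with replacement
    and summarise the resulting distribution of pi."""
    rng = random.Random(seed)
    summary = {}
    for n in sample_sizes:
        values = [
            nucleotide_diversity(rng.choices(population, k=n))  # with replacement
            for _ in range(n_reps)
        ]
        summary[n] = {
            "mean": mean(values),
            "st_dev": stdev(values),
            "median": median(values),
            "median_minus_mean": median(values) - mean(values),
        }
    return summary


# Hypothetical aligned sequences standing in for one population's dataset.
population = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
    "TCGTACGTAC",
    "ACGTACCTAC",
    "ACGTACGTAC",
]

# Sample sizes of 2 to 10 individuals, 1000 subsamples each.
summary = bootstrap_pi(population, range(2, 11))
for n, row in summary.items():
    print(n, {name: round(value, 4) for name, value in row.items()})
```

Plotting the four summary statistics against sample size would give figures of the kind referred to above; the numbers produced here depend entirely on the toy data.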
However, we include some permutations of our null-hypothesis testing to explicitly address any issues related to the smallest sample sizes included in our overall dataset, as follows.

We sequentially removed populations with small sample sizes from our analysis to create six subset data matrices (minimum n = 3 to minimum n = 8), for which we evaluated whether π_i > π_{i+1} occurred more often than expected by chance. As was the case for the full dataset, in the reduced subsets the proportion of comparisons with π_i > π_{i+1} was not significantly different from 0.50. Thus, we conclude that small samples were not responsible for the patterns observed in our study.

Supplementary Table 2. Effect of removing small sample-size populations on the test of an inverse relationship between latitude and nucleotide diversity.

minimum n                 2**    3      4      5      6      7      8
count of π_i > π_{i+1}    21     21     21     21     19     18     16
number of comparisons     38     37     36     34     30     29     27
ratio                     0.55   0.57   0.58   0.62   0.63   0.62   0.59
p-value*                  0.31   0.26   0.20   0.11   0.10   0.13   0.22

* p-value from a binomial exact test of whether the ratio differs significantly from 0.50.
** i.e., the full dataset as evaluated in the main Results section.

We sequentially removed populations with small sample sizes from our analysis, creating six subset data matrices (minimum n = 3 to minimum n = 8), for which we calculated the probability of observing that few, or fewer, species with a maximum π value in an edge population. The probability of observing no species with max π in an edge population is the easiest to calculate, as this is the product, over species, of the probability of max π occurring in a non-edge population. The probability for any given species is simply the number of non-edge populations divided by the total number of populations.
For example, in the full dataset, the probability of 0 max-π in edge populations is:

(p Phaethornis longirostris [4/6]) × (p Phaethornis striigularis [4/6]) × (p Amazilia tzacatl [4/6]) × (p Glyphorynchus spirurus [3/5]) × (p Myrmeciza exsul [2/4]) × (p Pipra mentalis [3/5]) × (p Mionectes oleagineus [4/6]) × (p Henicorhina leucosticta [4/6]) × (p Euphonia gouldi [3/5])
= (0.67 × 0.67 × 0.67 × 0.60 × 0.50 × 0.60 × 0.67 × 0.67 × 0.50)
= 0.012

In the case where the max π occurs in the edge in at least one species, it is necessary to calculate all the possibilities of equally extreme, or more extreme, outcomes. Therefore, the reported probability is the sum over all such scenarios. In the case of 1 edge, this sums the previous product for n = 0 together with the 9 different ways (i.e. one for each species) in which one species has max π in an edge population. For each of the nine hypothetical cases of n = 1, the probability is calculated as above, except that for the given species, the probability of max π occurring in a non-edge population (e.g. 4/6) is replaced by the probability of max π occurring in an edge population (e.g. 2/6). This is done for each of the nine species, and then these nine probabilities are summed along with the calculated probability of 0 occurrences of max π in an edge population.

In the full dataset, as well as in 5 of 6 permutations of our dataset, the number of species with the maximum observed π value occurring in an edge population was significantly smaller than expected by chance (i.e. if populations with the maximum π value are distributed randomly throughout a species' range), while in the remaining case (minimum n = 3) the p-value was marginally above 0.05 (0.065). We take this as evidence that sample size issues have no effect on our observation that the populations with maximum π values occur in range-center populations more often than could be expected by chance.

Supplementary Table 3.
Effect of removing small sample-size populations on the test of whether the maximum π value occurs in edge populations less frequently than expected by chance.

minimum n                                                2*      3       4       5       6       7       8
number of species                                        9       9       9       9       8       8       8
number of species with max π in edge population          0       1       1       1       0       0       0
joint probability of a result equally or more extreme    0.012   0.065   0.043   0.039   0.012   0.010   0.006

p-values in bold represent significant values where α = 0.05.
* i.e. the full dataset as evaluated in the main Results section.
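The two tail probabilities behind Supplementary Tables 2 and 3 can be sketched as follows; this is an illustration under stated assumptions, not the authors' code. The function names are hypothetical. The binomial test is treated as one-sided (upper tail), which is an inference from the tabulated p-values rather than something stated in the text. The per-species probabilities follow the decimal product in the worked example, so the last species is entered as 0.50 even though its bracketed fraction reads 3/5.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Exact probability of observing at least `successes` in `trials`
    Bernoulli trials with success probability p (one-sided, assumed)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def prob_at_most_k_edges(non_edge_probs, k):
    """Probability that max pi falls in an edge population in k or fewer species.

    non_edge_probs holds, per species, the number of non-edge populations
    divided by the total number of populations. Species are treated as
    independent, so the count of edge species is built up one species at a time.
    """
    dist = [1.0]  # dist[j] = probability that exactly j species so far are "edge"
    for p_non_edge in non_edge_probs:
        p_edge = 1 - p_non_edge
        new_dist = [0.0] * (len(dist) + 1)
        for j, prob in enumerate(dist):
            new_dist[j] += prob * p_non_edge   # this species: non-edge
            new_dist[j + 1] += prob * p_edge   # this species: edge
        dist = new_dist
    return sum(dist[: k + 1])


# Supplementary Table 2, full dataset: 21 of 38 comparisons.
p_latitude = binomial_upper_tail(21, 38)
print(round(p_latitude, 2))  # about 0.31

# Worked example, full dataset: values taken from the decimal product
# (0.67, 0.67, 0.67, 0.60, 0.50, 0.60, 0.67, 0.67, 0.50); assumption noted above.
full_dataset_non_edge = [4/6, 4/6, 4/6, 3/5, 2/4, 3/5, 4/6, 4/6, 2/4]
p_zero_edges = prob_at_most_k_edges(full_dataset_non_edge, 0)
print(round(p_zero_edges, 3))  # about 0.012
```

With k = 0 the second function reduces to the simple product in the worked example; with k = 1 it adds the nine single-edge scenarios described in the text, and larger k extends the same sum.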