Electronic supplementary materials
Material and methods
We genetically identified 255 unique individuals in the 2010 genetic census in Virunga Massif
(see [1],[2] for details), and using identical methods found 195 unhabituated gorillas in the 2011
Bwindi genetic census. For each of these two datasets we used genetic assessment of sex and
field assessment of age class based on dung size to identify mature individuals (i.e. adult
females, silverback males) of which we randomly selected 50 in order to reduce the potential for
biases that might arise by including parent-offspring pairs. The same genotyping methodologies
as for the mountain gorillas were applied to 64 eastern lowland gorilla samples opportunistically
collected over several years from the following localities: Mount Tshiaberimu (n = 9), Walikale
(n = 12), highland sector of Kahuzi-Biega (n = 29), Itombwe Massif (n = 6), and 8 individuals of
unknown origin. Here we also drew a random sample of 50 individuals for subsequent analyses.
We used 50 genotypes from each population to make the analyses computationally
tractable; even with this reduced dataset and multiple parallel runs on computer
clusters, the analyses described took more than six months to complete. Hence, we could not
repeat the analyses with new subsamples of the data. We note that the 50 genotypes used from
the Virungas contained 82% of the variation of the entire sample of 255 gorillas, as estimated
from a comparison of the average number of alleles, and we expect a similar proportion of
variation to be retained in the Bwindi subsample and a higher proportion in the eastern lowland subsample. In addition,
as there is no present day migration between populations we assume that each individual from
each of these populations will represent that population’s history equally well and so do not
expect the use of subsamples of individuals to bias the results. We used 8 loci (D1s550,
D2s1326, D5s1470, D6s1056, D7s817, D8s1106, D16s2624, vWf) to compare mountain and
eastern lowland gorillas and an additional ninth locus in analyses comparing only the mountain
gorilla populations (D4s1627). Genotypes used are presented in table S1.
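The retained-variation comparison described above can be illustrated with a minimal sketch. It assumes that "average number of alleles" means the mean number of distinct alleles per locus, and that genotypes are stored as one pair of alleles per locus for each individual. The data layout and function names are hypothetical and are not the scripts used in this study.

```python
import random


def mean_alleles_per_locus(genotypes):
    """Mean number of distinct alleles per locus.

    `genotypes` is a list of individuals; each individual is a list of
    (allele_a, allele_b) tuples, one tuple per locus. None marks missing data.
    """
    n_loci = len(genotypes[0])
    counts = []
    for locus in range(n_loci):
        alleles = set()
        for individual in genotypes:
            for allele in individual[locus]:
                if allele is not None:
                    alleles.add(allele)
        counts.append(len(alleles))
    return sum(counts) / n_loci


def retained_variation(genotypes, k, seed=0):
    """Fraction of the full sample's mean allele count kept by k random individuals."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    subsample = rng.sample(genotypes, k)
    return mean_alleles_per_locus(subsample) / mean_alleles_per_locus(genotypes)


# Toy data: four individuals typed at two loci (allele sizes are invented).
toy = [
    [(1, 2), (10, 10)],
    [(1, 3), (10, 12)],
    [(2, 3), (12, 12)],
    [(4, 4), (10, 14)],
]
print(mean_alleles_per_locus(toy))
print(retained_variation(toy, k=2))
```

With real data, `k` would be 50 and `genotypes` would hold all individuals identified in a census.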
To conduct the analyses in IMa2, several preliminary analyses were required in order to
identify which heating terms of the chains produced adequate mixing among chains for our
datasets, as assessed by high ESS (Effective Sample Size) values of the parameter t (time since
divergence) and high swapping rates (> 0.80) between successive chains. The heating terms for
the geometric model were set to -ha 0.99 and -hb 0.80, along with 200 independent heated
chains. IMa2 assumes a stepwise mutation model for simulating genealogies for comparison with
the data. We accounted for uncertainty around the mean mutation rate value typical for
microsatellite markers (5.0 × 10⁻⁴ mutation events per generation [5]) by
specifying a range of mutation rates (5.0 × 10⁻⁵ to 5.0 × 10⁻³). We used a generation time of 20
years [3].
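As an illustration of the stepwise mutation model mentioned above, the sketch below follows a single microsatellite lineage forward in time, with each mutation event adding or removing exactly one repeat unit. It is a toy forward simulation and not the genealogy sampler implemented in IMa2. The function name, the starting repeat count and the symmetric step probabilities are assumptions.

```python
import random


def simulate_smm(start_repeats, generations, mu, seed=None):
    """Follow one lineage under a symmetric stepwise mutation model.

    Each generation a mutation occurs with probability `mu`; a mutation
    changes the repeat count by +1 or -1 with equal probability.
    Returns the final repeat count and the number of mutation events.
    """
    rng = random.Random(seed)
    repeats = start_repeats
    n_mutations = 0
    for _ in range(generations):
        if rng.random() < mu:
            repeats += rng.choice((-1, 1))
            n_mutations += 1
    return repeats, n_mutations


# 6,000 generations of 20 years span 120,000 years; mu is the typical
# microsatellite rate quoted in the text.
final_repeats, events = simulate_smm(start_repeats=20, generations=6000, mu=5.0e-4, seed=1)
print(final_repeats, events)
```

Because steps can cancel each other, two lineages with the same repeat count may have different mutational histories, which is the homoplasy issue discussed in [5].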
We examined initial plots of approximate posterior distributions in order to define the upper
limits for the uniform priors for all population pairs as Θ = 35, t = 3 and m = 0.3. Using this
value of t allows testing of a population split time up to 120,000 years ago, which we consider an
upper limit given that gene flow between western and eastern gorilla populations might have
persisted until 80,000 years ago [4]. Higher prior values for t resulted mainly in poor estimation
of this and other parameters (e.g. ancestral population size). Based on the short geographical
distances between eastern gorilla populations, we used a moderate value of 0.3 as the upper limit
for the migration parameter, m. We conducted six independent runs with different random seeds
for each pairwise analysis, using a burn-in of 4 million steps followed by 20,000 saved
genealogies (degree of thinning = 100). For each run and for each estimated parameter, we
extracted the parameter value (expressed in demographic units) with the highest posterior
probability from the histogram table provided in the software output, as well as the 95% highest
posterior density interval (95% HPD interval). This interval represents the shortest range of
values for a given parameter that contains 95% of the posterior probability distribution. Finally,
we calculated the mean and the standard deviation for each parameter across all six independent
runs to visually assess parameter convergence and we then interpreted these values.
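The quantities in this paragraph can be sketched in a few lines. The conversion assumes that t is scaled by the mutation rate per year (the per-generation rate divided by the generation time), which reproduces the 120,000-year limit quoted above. The HPD function works on raw posterior samples, whereas we read the intervals from the IMa2 histogram output. The use of the sample standard deviation across runs is also an assumption, and all function names and toy values are hypothetical.

```python
import math
from statistics import mean, stdev


def split_time_years(t_scaled, mu_per_generation, generation_years):
    """Convert a mutation-scaled divergence time into years."""
    mu_per_year = mu_per_generation / generation_years
    return t_scaled / mu_per_year


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    best_low, best_high = ordered[0], ordered[window - 1]
    for start in range(1, n - window + 1):
        low, high = ordered[start], ordered[start + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


def summarize_runs(estimates):
    """Mean and sample standard deviation of one parameter across runs."""
    return mean(estimates), stdev(estimates)


# Upper prior limit t = 3 with 5.0e-4 mutations per generation and 20-year generations.
print(round(split_time_years(3, 5.0e-4, 20)))

# Invented peak estimates from six runs, only to show the call.
print(summarize_runs([1.0, 1.2, 0.9, 1.1, 1.0, 0.8]))
```

A small standard deviation across the six runs relative to the mean indicates that the independent runs converged on similar estimates.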
When using MSVAR1.3, we defined the means and the standard deviations of all hyperprior
distributions so that a wide range of biologically realistic values could be tested as priors for all
parameters, except for the mutation rate for which more information was a priori available (table
S2). In all simulations, hyperprior distributions for parameters N0 and N1 were similar to each
other, so that neither the decline nor the expansion scenario was a priori favored. The means of
the log-normal distributions varied between 2 and 4 for parameters N0 and N1, and between 3 and
4 for parameter xa (table S3). The upper value for means of parameters N0 and N1 was chosen to
be much higher than the current census population size of either mountain gorilla population.
Using values higher than those indicated for parameters N0 and N1 caused the MCMC simulations
to become stuck in a local minimum, which prevented proper analysis of the output. For the
parameter μ, we used a fixed mean value of -3.3 with a standard deviation of 0.5, so that mutation
rate values around 5.0 × 10⁻⁴ [5] could be tested. A Markov chain Monte Carlo (MCMC) scheme
was used to sample from the posterior distribution of the parameters. Each simulation was run
for 8 × 10⁹ iterations with a degree of thinning of 200,000 iterations, resulting in 40,000 saved
iterations. We then removed the first 10% of iterations (burn-in) to calculate the Gelman-Rubin
statistic in order to assess the degree of convergence of the model parameters across simulations.
We finally pooled the last 10,000 iterations of each simulation, resulting in a posterior
distribution for each parameter based on 70,000 iterations. The seven independent simulations of
each type of model (linear or exponential), applied to each mountain gorilla population separately,
reached high convergence across chains, as suggested by a Gelman-Rubin statistic less than or
equal to 1.07 for each parameter (details not shown).
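The bookkeeping and the convergence check described above can be sketched as follows. The Gelman-Rubin function implements the basic potential scale reduction factor for chains of equal length; the exact variant computed for this study is not stated, so treat this form as an assumption. The toy chains and all names are hypothetical.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(chain) for chain in chains])        # W
    between = n * variance([mean(chain) for chain in chains])   # B
    pooled = (n - 1) / n * within + between / n                 # pooled variance estimate
    return (pooled / within) ** 0.5


# Sample counts implied by the run settings in the text.
TOTAL_ITERATIONS = 8 * 10**9
THINNING = 200_000
saved_per_run = TOTAL_ITERATIONS // THINNING  # 40,000 saved iterations per simulation
pooled_draws = 7 * 10_000                     # 70,000 pooled iterations per parameter

# The hyperprior mean of -3.3 on the log10 scale is close to 5.0e-4.
prior_mean_mu = 10 ** -3.3

# Two well-mixed toy chains and two chains stuck in different regions.
mixed = [[1, 2, 3, 4], [1, 2, 3, 4]]
stuck = [[1, 2, 3, 4], [11, 12, 13, 14]]
print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the between-chain variance is small relative to the within-chain variance, which is why a statistic of 1.07 or less was taken as evidence of convergence.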
Fig. S1. Posterior distributions obtained with the exponential model in MSVAR 1.3 of the
following parameters: a) ancestral (N1) and current (N0) effective sizes for the Virunga mountain
gorilla population, b) ancestral (N1) and current (N0) effective sizes for the Bwindi mountain
gorilla population, c) time (xa) since the size of the Virunga population started to change, and d)
time (xa) since the size of the Bwindi population started to change. The values on the x-axis are
expressed on a logarithmic scale. The prior distributions are shown as dashed lines for
comparison.
Table S2. Prior starting values of the four parameters (N0, N1, μ, xa) modeled in MSVAR 1.3 and
used for each simulation. For each parameter, the two columns indicate the mean and the
variance of the normal distribution from which the parameter value is drawn for the first
iteration. These values are updated during the MCMC sampling using hyperpriors as defined in
Table S3.
Simulation   log10(N0)        log10(N1)        log10(μ)         log10(xa)
             Mean  Variance   Mean  Variance   Mean  Variance   Mean  Variance
1            4     1          4     1          -3.5  1          5     1
2            4     1          4     2          -3.5  1          5     1
3            4     2          4     1          -3.5  1          5     1
4            4     2          4     2          -3.5  1          5     1
5            4     1          4     1          -3.5  1          5     3
6            4     2          4     2          -3.5  1          5     3
7            4     1          4     1          -3.5  1          5     1

Table S3. Hyperprior values of the four parameters (N0, N1, μ, xa) modeled in MSVAR 1.3 used
for each simulation. For each parameter, the first two values indicate the mean and standard
deviation of the normal distribution from which the mean is drawn at each iteration, while the
last two values indicate the mean and standard deviation of the normal distribution (truncated at
zero) from which the standard deviation is drawn at each iteration.
Simulation   log10(N0)         log10(N1)         log10(μ)             log10(xa)
1            3  1.5  0  0.5    3  2    0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
2            3  1.5  0  0.5    3  2    0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
3            3  1.5  0  0.5    3  2    0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
4            3  1.5  0  0.5    3  2    0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
5            2  1    0  0.5    2  1    0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
6            4  1.5  0  0.5    4  1.5  0  0.5    -3.3  0.5  0  0.5    4  2    0  0.5
7            3  1.5  0  0.5    3  2    0  0.5    -3.3  0.5  0  0.5    3  2.5  0  0.5

References
1. Gray M., Roy J., Vigilant L., Fawcett K., Basabose A., Cranfield M., Uwingeli P.,
Mburanumwe I., Kagoda E., Robbins M.M. 2013 Genetic census reveals increased but uneven
growth of a critically endangered mountain gorilla population. Biol Conserv 158, 230-238.
(doi:10.1016/j.biocon.2012.09.018).
2. Roy J., Gray M., Stoinski T., Robbins M.M., Vigilant L. 2014 Fine-scale genetic structure
analyses suggest further male than female dispersal in mountain gorillas. BMC Ecol 14, 21.
(doi:10.1186/1472-6785-14-21).
3. Langergraber K.E., Prüfer K., Rowney C., Boesch C., Crockford C., Fawcett K., Inoue E.,
Inoue-Murayama M., Mitani J.C., Muller M.N., et al. 2012 Generation times in wild
chimpanzees and gorillas suggest earlier divergence times in great ape and human evolution.
Proceedings of the National Academy of Sciences of the United States of America 109, 15716-15721. (doi:10.1073/pnas.1211740109).
4. Thalmann O., Fischer A., Lankester F., Pääbo S., Vigilant L. 2007 The complex
evolutionary history of gorillas: Insights from genomic data. Molecular Biology and Evolution
24, 146-158. (doi:10.1093/molbev/msl160).
5. Estoup A., Jarne P., Cornuet J.M. 2002 Homoplasy and mutation model at microsatellite loci
and their consequences for population genetics analysis. Molecular Ecology 11, 1591-1604.
(doi:10.1046/j.1365-294X.2002.01576.x).