Garrick RC, Benavides E, Russello MA, Hyseni C, Edwards DL, Gibbs JP, Tapia W, Ciofi C,
Caccone A (2014) Lineage fusion in Galápagos giant tortoises. Molecular Ecology (doi: xxxxxx)
Supplementary material

Methods

Previous analysis of the number of Chelonoidis becki genotypic clusters

We recently re-examined the genetic composition of C. becki tortoises (Garrick et al.
2012; Edwards et al. 2013). The number of natural genetic clusters (K) within C. becki was
assessed by analyzing microsatellite data from ~1700 Wolf Volcano tortoises in STRUCTURE
v2.3.3 (Pritchard et al. 2000). Following Evanno et al. (2005), we found strong support for K = 2
groups [Piedras Blancas (PBL) and Puerto Bravo (PBR) herein]. In this initial analysis, estimated
membership coefficients (Q-values) of Q ≥ 0.95 were considered indicative of purebred
individuals (Vähä & Primmer 2006). Minimally related representatives of PBR and PBL
purebreds (23 individuals per group), as determined by analyses using KINGROUP v2.0
(Konovalov et al. 2004; full-sib reconstruction, Descending Ratio search algorithm), replaced
previously selected representatives of C. becki in our archipelago-wide reference dataset
(Russello et al. 2007; Poulakakis et al. 2008). The reference dataset now comprises 354 individuals, representing
all extant and most extinct taxa of Galápagos giant tortoises. STRUCTURE analysis of the
reference dataset recovers K = 12 natural groups that are sufficiently differentiated for use in
genetic assignment tests (Garrick et al. 2012; Edwards et al. 2013).
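
The purebred criterion described above (Q ≥ 0.95) can be illustrated with a minimal Python sketch. The function name and the example Q-values are hypothetical and are not output from the STRUCTURE runs; only the 0.95 threshold comes from the text.

```python
PUREBRED_THRESHOLD = 0.95  # Q >= 0.95, the criterion stated in the text


def purebred_cluster(q_values, threshold=PUREBRED_THRESHOLD):
    """Return the cluster whose Q-value meets the threshold, else None."""
    best = max(q_values, key=q_values.get)
    return best if q_values[best] >= threshold else None


# Hypothetical membership coefficients for two individuals (K = 2 clusters).
print(purebred_cluster({"PBL": 0.97, "PBR": 0.03}))  # PBL
print(purebred_cluster({"PBL": 0.60, "PBR": 0.40}))  # None
```

Individuals for which the function returns None would be candidates for the hybrid and backcross classes, which require additional criteria.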

Classification of Chelonoidis becki individuals

STRUCTURE was used to estimate Q-values for 841 tortoises sampled from Wolf
Volcano via comparison to the reference dataset, with K = 12 set as a fixed parameter.
Previously, in order to classify individuals on the basis of their Q-values, we simulated numerous
admixture scenarios for Wolf Volcano tortoises, including heterospecific crosses (i.e., between
C. becki and tortoise species endemic to other islands; Russello et al. 2007; Garrick et al. 2012;
Edwards et al. 2013). Here we used the same approach but focused on conspecific crosses (i.e.,
hybridization and backcrossing between the two C. becki lineages, PBR and PBL).

We used HYBRIDLAB v1.0 (Nielsen et al. 2006) to simulate 12-locus microsatellite
datasets that could be used for characterizing the distribution of Q-values associated with five
different classes of C. becki tortoise: (1) PBR purebred; (2) PBL purebred; (3) F1 hybrid; (4)
PBR × F1 backcross; or (5) PBL × F1 backcross. Simulations were conditioned on the empirical
microsatellite data (i.e., same number of alleles, and extent of differentiation between parental
gene pools). These simulated datasets, each comprising 100 individuals, were analyzed
together with the 354-individual reference dataset in STRUCTURE (run settings above, with K =
12). Following Garrick et al. (2012), we were able to distinguish between the five different
classes of C. becki tortoise by jointly considering two descriptors: QR, the Q-value range within
each parental cluster (i.e., purebred PBL and PBR), and QD, the Q-value difference between
parental clusters (table S1). Where necessary, ambiguous assignments of Wolf Volcano tortoises
were resolved following Garrick et al. (2012) by comparing empirical Q-values to the
distribution of simulated QR and QD, and then selecting the best-fit hybrid class.

Genetic diversity and differentiation

Pairwise genetic distance metrics. We framed our assessment of levels of genetic
differentiation between purebred PBR and PBL lineages in the context of differences observed
between pairs of the following 13 Galápagos giant tortoise taxa: C. hoodensis, C. chathamensis,
C. abingdoni, C. porteri, C. ephippium, C. darwini, C. vandenburghi, C. microphyes, C.
guntheri, C. vicina, C. elephantopus, C. sp. nov. (Russello et al. 2005), and C. becki (represented
by the PBL lineage). For nuclear microsatellites, we compared a metric based on allele
frequencies alone (FST) to a related metric that also incorporates information on mutational
changes in alleles (RST). This approach can help determine whether divergences occurred over
relatively short timescales where genetic drift dominates, or over the longer timescales on which
new alleles evolve (Pons & Petit 1996; Hardy et al. 2003). For the 13 taxa (79 interspecific
pairwise comparisons), FST and RST values were calculated in GENEPOP v4.0 (Rousset 2008)
and RSTCALC v2.2 (Goodman 1997), respectively. Additionally, levels of divergence between
the lineages were examined on the basis of mtDNA sequences, for which we calculated
uncorrected p-distances, and maximum likelihood-corrected distances (using the best-fit model
identified via MODELTEST v3.0; Posada & Crandall 1998) in PAUP* v4.0b10 (Swofford
2002). Here, non-redundant haplotypes were the units of analysis; after excluding mtDNA
haplotypes that generate pairwise distances of zero, the 13 taxa were represented by 78
haplotypes (2619 interspecific pairwise comparisons).
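
For readers unfamiliar with the uncorrected p-distance mentioned above, it is the proportion of aligned sites at which two sequences differ. The Python sketch below illustrates the calculation on made-up sequences. Skipping sites with a gap or ambiguous base in either sequence is an assumption made here for illustration and may not match the settings used in PAUP*.

```python
def p_distance(seq_a, seq_b):
    """Proportion of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: compare only sites with an unambiguous base in both sequences.
    valid = set("ACGT")
    compared = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
                if a in valid and b in valid]
    if not compared:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in compared if a != b)
    return differences / len(compared)


# Hypothetical 10-bp alignment with two mismatches.
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```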

Hybridization dynamics and forward-in-time projections

To examine the potential consequences of continued hybridization among C. becki
tortoises on Wolf Volcano, we compared characteristics of the present generation (G0) of C.
becki tortoises with those after one generation of random mating (G1). These characteristics
included changes in the frequency of PBL vs. PBR mtDNA haplogroups and purebred vs.
admixed tortoises, the degree of genetic substructure (measured via linkage disequilibrium
[LD]), and levels of genetic diversity (measured via allelic richness [AR] and observed
heterozygosity [HO]).

For the following analyses, we used only adult C. becki tortoises with complete genetic
data (i.e., individuals for which sex had been determined, and mtDNA sequences plus nuclear
microsatellite genotypes were available; N = 502 assigned individuals). First, the observed
frequencies of males and females for the following eight types of tortoises were used to calculate
the probability of all possible mate pairings (subscript “mt” is the mtDNA haplogroup): (1)
purebred PBL + PBLmt, (2) purebred PBR + PBRmt, (3) F1 hybrid + PBLmt, (4) F1 hybrid +
PBRmt, (5) PBL × F1 backcross + PBLmt, (6) PBL × F1 backcross + PBRmt, (7) PBR × F1
backcross + PBLmt, and (8) PBR × F1 backcross + PBRmt (table S4). From this pairwise matrix,
we projected the next-generation frequencies of purebred tortoises that had matching mtDNA
and nuclear genetic backgrounds (i.e., types 1 and 2, above; figure S4). Based on nuclear genetic
data alone, we also projected next-generation frequencies of purebred PBL, purebred PBR, and
hybrid (i.e., F1 plus backcross) tortoises (figure S5).
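
The pairing matrix described above can be sketched in Python as the product of male and female type frequencies. This is a minimal illustration only: the three types and their frequencies are hypothetical (the empirical values for all eight types are in table S4), and the sketch assumes that every pairing contributes offspring equally and that mtDNA is inherited strictly from the mother.

```python
from itertools import product

# Hypothetical frequencies of three tortoise types among adult males and
# females (each sums to 1). Keys are (nuclear class, mtDNA haplogroup).
male_freq = {("PBL", "PBLmt"): 0.5, ("PBR", "PBRmt"): 0.3, ("F1", "PBLmt"): 0.2}
female_freq = {("PBL", "PBLmt"): 0.4, ("PBR", "PBRmt"): 0.4, ("F1", "PBLmt"): 0.2}


def pairing_probabilities(males, females):
    """Probability of each male x female pairing under random mating."""
    return {(m, f): males[m] * females[f] for m, f in product(males, females)}


def offspring_mtdna_frequency(pairings, haplogroup):
    """Expected offspring frequency of a haplogroup (maternal inheritance)."""
    return sum(p for (_, mother), p in pairings.items() if mother[1] == haplogroup)


pairs = pairing_probabilities(male_freq, female_freq)
print(round(pairs[(("PBL", "PBLmt"), ("PBR", "PBRmt"))], 6))  # 0.2
print(round(offspring_mtdna_frequency(pairs, "PBRmt"), 6))    # 0.4
```

Under these assumptions the projected haplogroup frequency depends only on the frequencies among females, whereas the frequency of purebred offspring depends on both sexes.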

Stochasticity is associated with the particular parental pairs that may form in a single
reproductive cycle that generates the next generation of offspring, as well as with the random
segregation of alleles that occurs within those individuals during gamete formation. To model this, we
simulated crosses between the multilocus microsatellite genotypes of randomly selected
male-female pairs. We also explored the possible impacts of assumptions about Ne by considering two
different values that, assuming a current census size (Nc) of ~8,000 tortoises on Wolf Volcano
(Garrick et al. 2012), correspond with Ne : Nc ratios of 0.05 and 0.10 (i.e., 200 and
400 adults, respectively). These ratios lie within the range commonly seen in numerous wild
species (Frankham 1995). To perform this modeling, subsets of adult C. becki tortoises were
randomly selected to represent the gene pool of the present (G0) generation’s breeders (N = 200
and N = 400 individuals, 3 replicates each).
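
The two sources of stochasticity described above (which pairs form, and which allele each parent transmits) can be illustrated with a short Python sketch. It is not the HYBRIDLAB algorithm. The function names and genotypes are hypothetical, and the sketch assumes unlinked loci and sampling of parents with replacement.

```python
import random


def simulate_offspring(mother, father, rng):
    """Draw one allele per locus from each parent (Mendelian segregation)."""
    return [(rng.choice(m_locus), rng.choice(f_locus))
            for m_locus, f_locus in zip(mother, father)]


def simulate_generation(females, males, n_offspring, seed=0):
    """Pair random females and males (with replacement) and produce offspring."""
    rng = random.Random(seed)
    return [simulate_offspring(rng.choice(females), rng.choice(males), rng)
            for _ in range(n_offspring)]


# Hypothetical two-locus genotypes; alleles are microsatellite fragment sizes.
females = [[(150, 154), (200, 200)], [(152, 154), (198, 202)]]
males = [[(150, 150), (200, 204)], [(156, 158), (202, 202)]]

offspring = simulate_generation(females, males, n_offspring=5)
print(len(offspring), len(offspring[0]))  # 5 2
```

Each offspring genotype holds one maternal and one paternal allele per locus, so repeated runs with different seeds give a sense of the variation expected from a single reproductive cycle.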

To quantify the level of genetic substructure that exists in the present generation,
GENEPOP was used to perform tests of LD for all possible pairs of microsatellite loci using the
log likelihood ratio statistic (12 loci, 66 comparisons). To measure genetic diversity in the
present generation, HO was also calculated in GENEPOP. For the same data, another genetic
diversity metric, AR, was calculated using rarefaction correction, with a standardized sample size
of 184 diploid individuals, implemented in HP-RARE. To represent the gene pool of the next
generation (G1), random mating within the subsets of breeders was simulated between random
male-female pairs using HYBRIDLAB, to generate a total of 200 offspring per set (N = 200 or
400 breeders × 3 replicates each = 6 simulated datasets, each comprising 200 offspring). Using these
multilocus genotypes, we tested for LD and calculated HO and AR.
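
The summary statistics named above can be illustrated with a minimal Python sketch: the number of locus pairs tested for LD, observed heterozygosity at one locus, and allelic richness under the standard rarefaction formula (the expected number of distinct alleles in a random subsample of gene copies). The function names and genotypes are hypothetical. The sketch assumes the subsample size is given in gene copies, two per diploid individual, and it is not a reimplementation of GENEPOP or HP-RARE.

```python
from collections import Counter
from itertools import combinations
from math import comb


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def rarefied_allelic_richness(genotypes, n_genes):
    """Expected number of distinct alleles in a subsample of n_genes gene copies."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    if n_genes > total:
        raise ValueError("subsample is larger than the number of gene copies")
    # For each allele: 1 minus the probability that the subsample misses it.
    return sum(1 - comb(total - c, n_genes) / comb(total, n_genes)
               for c in counts.values())


# All unordered pairs of 12 loci, as used for the LD tests.
print(len(list(combinations(range(12), 2))))  # 66

# Hypothetical genotypes of four individuals at one locus.
locus = [(150, 154), (150, 150), (152, 154), (154, 154)]
print(observed_heterozygosity(locus))               # 0.5
print(rarefied_allelic_richness(locus, n_genes=8))  # 3.0
```

Rarefying to a common subsample size makes allelic richness comparable between datasets of different sizes, such as the N = 200 and N = 400 breeder subsets.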

To compare G0 vs. G1, frequency distributions of P-values for tests of LD were plotted to
illustrate the change in the number of locus pairs showing significant LD (figure S6). Here, locus
pairs that fall within the histogram category of P < 0.05 are of greatest interest, since these
represent cases of significant deviation from random association of alleles across loci (i.e.,
linkage equilibrium). We also included a category of P < 0.00001, which approximates the
Bonferroni-corrected critical value, owing to multiple tests being performed on the same data
(figure S6). For AR and HO, we calculated the difference in these summary statistic values
between G0 and G1 for each microsatellite locus, and represented the distribution of these values
as box-and-whisker plots (figures S7 and S8).
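
The per-locus comparison described above amounts to subtracting the G0 value from the G1 value at each locus and then summarizing the differences. The Python sketch below uses hypothetical heterozygosity values for four loci; the function name is also an illustration only.

```python
from statistics import median


def per_locus_change(g0_values, g1_values):
    """Return G1 minus G0 for each locus (negative values indicate a decline)."""
    if len(g0_values) != len(g1_values):
        raise ValueError("G0 and G1 must cover the same loci")
    return [g1 - g0 for g0, g1 in zip(g0_values, g1_values)]


# Hypothetical observed heterozygosity at four loci.
g0_ho = [0.70, 0.65, 0.80, 0.55]
g1_ho = [0.72, 0.60, 0.80, 0.58]

change = per_locus_change(g0_ho, g1_ho)
print([round(c, 2) for c in change])  # [0.02, -0.05, 0.0, 0.03]
print(round(median(change), 3))       # 0.01
```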

Supplementary Tables
Table S1. Criteria used to classify 841 Wolf Volcano tortoises as purebreds, F1 hybrids or
backcrosses, based on STRUCTURE Q-values derived from crosses simulated in HYBRIDLAB.
Following Garrick et al. (2012), a combination of two criteria was used: QR and QD.
Table S2. Assessment of historical divergence between the Puerto Bravo (PBR) and Piedras
Blancas (PBL) lineages of C. becki, estimated using approximate Bayesian computation. Three
data subsets (1-3) were run, each without or with bottleneck events (-NB, no bottleneck; -B, with
bottleneck) included. Three scenarios were compared via posterior probabilities, which identified
scenario 2a as the best-fit (error associated with scenario choice relates to this model). Estimates
of parameters included in the best-fit scenario are reported as medians, together with 5% and
95% quantiles (Q5 and Q95, respectively). Parameters were effective population sizes (Ne;
subscripts indicate population, where “AGO” = Santiago Island and “ancestor” is no longer
extant), and splitting times, in units of generations (t1, younger event; t2, older event).
Bottlenecks, when included, had two parameters: duration, in units of generations (dur), and
severity of the size reduction, in units of Ne (sev1, younger event; sev2, older event).
Table S3. Proportions of offspring expected to have a PBR mtDNA haplogroup sequence if
mating between the two lineages of C. becki on Wolf Volcano is random. Values were calculated
based on operational sex-ratios from empirical data. Below, the two parental gene pools in each
cross are represented by females (♀) and males (♂), where red indicates individuals carrying a
PBR mtDNA haplogroup sequence (blue is PBL mtDNA haplogroup). Proportions of each sex
are given below. Grid cells represent random pairings (initial expected proportions of offspring
from each type of cross are given in grey text), of which only male-female parental pairs produce
offspring (adjusted proportions are given in black text). Panel A: Purebred PBL × purebred PBL.
Panel B: Purebred PBL × F1 hybrid. Panel C: Purebred PBR × F1 hybrid. The delimitation of
mtDNA haplogroups is shown in figure S1.
Table S4. Probability of random male (♂) × female (♀) pairings, calculated for each of eight types of C. becki tortoises. The eight
different types take into account microsatellite (msat)-based assignment [i.e., PBL purebred, PBR purebred, F1 hybrid, and PBL
backcross (F1 × PBL) or PBR backcross (F1 × PBR)], and mitochondrial DNA (mtDNA) haplogroup (PBLmt or PBRmt). The column
and row labeled ‘Frequency’ is based on empirical data from N = 502 adult individuals from Wolf Volcano in the present (G0)
generation. The interior cells of the matrix represent the projected next-generation (G1) frequencies of offspring from each of the
potential crosses, assuming random mating. Comparisons between G0 vs. G1 were used to examine the trajectory of future changes in
frequencies of mtDNA haplogroup sequences, as well as purebred vs. hybrid tortoises on Wolf Volcano.

Supplementary Figures
Figure S1. Statistical parsimony network showing evolutionary relationships among mitochondrial DNA (mtDNA) sequences carried
by Wolf Volcano tortoises. Each mtDNA haplotype is represented by a pie chart (labeled R-1 to R-5 and L-1 to L-5). Pie slices
indicate the frequency of a given haplotype for individuals classified as purebreds, F1 hybrids, and backcrosses (five classes) on the
basis of their microsatellite genotypes. Pie chart sizes reflect overall abundance of each mtDNA haplotype. Black diamonds are
hypothetical (unsampled or extinct) haplotypes, and black lines between haplotypes represent a single mutational step. The two
disconnected networks are separated by a large number of mutational steps. “Native” haplotypes are found almost exclusively in C.
becki tortoises from Wolf Volcano, Isabela Island, whereas “non-native” haplotypes appear to be derived from C. vandenburghi from
Alcedo Volcano, Isabela Island. This analysis was performed with sequences from 800 classified individuals. Fourteen additional
individuals were omitted because they carried haplotypes characteristic of species endemic to other islands (i.e., C. hoodensis from
Española [N = 10], C. chathamensis [N = 2] from San Cristóbal, or C. elephantopus [N = 2] from Floreana; main text, figure 1).
Figure S2. Best-fit model of historical divergence between the Puerto Bravo (PBR) and Piedras
Blancas (PBL) lineages of C. becki, estimated using approximate Bayesian computation. Point
estimates are median values, averaged over three data subsets, and 90% confidence intervals are
given in parentheses. Model parameters are as follows: Ne = effective population size of
contemporary and ancestral lineages, and t = splitting time in units of thousands of years ago.
Figure S3. Best-fit model of historical divergence between Puerto Bravo (PBR) and Piedras
Blancas (PBL) lineages of C. becki, including hypothetical bottleneck events, estimated using
approximate Bayesian computation. Point estimates are median values, averaged over three data
subsets, and 90% confidence intervals are given in parentheses. Model parameters are as follows:
Ne = effective population size of contemporary and ancestral lineages, and t = splitting time in
units of thousands of years ago. Bottleneck events are associated with long-distance over-water
colonization of Wolf Volcano from a Santiago Island ancestor, characterized by duration
(median generations = 3; 90% CI: 1–5 for both) and severity (median Ne = 8 or 11; 90% CI: 2–18
or 3–19).
Figure S4. Histograms comparing the frequency of two C. becki mtDNA haplogroups in the
present generation (G0) vs. projected frequencies after one generation of random mating (G1).
For each timescale, frequencies of PBL and PBR haplogroups sum to one. Within columns,
polka-dots indicate the proportion of a given haplogroup that occurs in purebreds with a
corresponding nuclear genetic background (i.e., PBR mtDNA in a purebred PBR individual).
Figure S5. Histograms comparing the frequency of three classes of C. becki tortoises, as
determined using nuclear microsatellite data (i.e., purebred PBR, purebred PBL, and hybrids), in
the present generation (G0) vs. projected frequencies after one generation of random mating (G1).
For each timescale, frequencies of the three classes sum to one. Within the columns representing
hybrids, the proportions of F1 hybrids, PBL × F1 backcrosses, and PBR × F1 backcrosses are
indicated by dark, intermediate, and light grey shading, respectively (diagonal stripes are other
kinds of admixture resulting from F2 and third-generation double backcrosses).
Figure S6. Frequency distributions comparing the current level of linkage disequilibrium (LD)
among microsatellite alleles of C. becki tortoises (G0; solid lines, filled circles) vs. projected LD
after one generation of random mating (G1; dashed lines, open circles). Distributions represent
P-values of LD tests for each pair of microsatellite loci, and x-axis labels indicate upper bounds of
P-value categories. For the present (G0) generation, these were calculated using empirical
genotypic data from randomly selected subsets of adult C. becki tortoises (Panel A: N = 200
individuals; Panel B: N = 400). Forward-in-time (G1) projections are based on simulations of a
single episode of male-female crosses (N = 200 offspring in both cases) using the same subsets
of Ne = 200 and Ne = 400 adult tortoises chosen as breeders from the present generation.
Figure S7. Box-and-whisker plots showing projected change in allelic richness (AR) at
microsatellite loci of C. becki tortoises, after one generation of random mating. For the present
(G0) generation, AR was calculated using empirical genotypic data from randomly selected
subsets of adult C. becki tortoises (N = 200 and N = 400 individuals). Forward-in-time (G1)
projections are based on simulations of a single episode of male-female crosses (N = 200
offspring in both cases) conducted using the same subsets of Ne = 200 and Ne = 400 adult
tortoises chosen as breeders from the present generation. The change in AR was calculated for
each of 12 microsatellite loci as G1 minus G0 (the red line on zero marks no change in AR). On
each plot, the lower and upper boundaries of the box represent 25th and 75th percentiles,
respectively, and the line within the box marks the median. Upper and lower whiskers indicate
the 90th and 10th percentiles; outlying data points are shown as open circles.
Figure S8. Box-and-whisker plots showing projected change in observed heterozygosity (HO) at
microsatellite loci of C. becki tortoises, after one generation of random mating. For the present
(G0) generation, HO was calculated using empirical genotypic data from randomly selected
subsets of adult C. becki tortoises (N = 200 and N = 400 individuals). Forward-in-time (G1)
projections are based on simulations of a single episode of male-female crosses (N = 200
offspring in both cases) conducted using the same subsets of Ne = 200 and Ne = 400 adult
tortoises chosen as breeders from the present generation. The change in HO was calculated for
each of 12 microsatellite loci as G1 minus G0 (the red line on zero marks no change in HO). On
each plot, the lower and upper boundaries of the box represent 25th and 75th percentiles,
respectively, and the line within the box marks the median. Upper and lower whiskers indicate
the 90th and 10th percentiles; outlying data points are shown as open circles.

Supplementary References
Edwards DL, Benavides E, Garrick RC et al. (2013) The genetic legacy of Lonesome George
survives: Giant tortoises with Pinta Island ancestry identified in Galápagos. Biological
Conservation, 157, 225–228.
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the
software STRUCTURE: A simulation study. Molecular Ecology, 14, 2611–2620.
Frankham R (1995) Effective population-size: adult-population size ratios in wildlife: A review.
Genetical Research, 66, 95–107.
Garrick RC, Benavides E, Russello MA et al. (2012) Genetic rediscovery of an ‘extinct’
Galápagos giant tortoise species. Current Biology, 22, R10–R11.
Goodman SJ (1997) RSTCalc: A collection of computer programs for calculating estimates of
genetic differentiation from microsatellite data and determining their significance. Molecular
Ecology, 6, 881–885.
Hardy OJ, Charbonnel N, Fréville H, Heuertz M (2003) Microsatellite allele sizes: a simple test
to assess their significance on genetic differentiation. Genetics, 163, 1467–1482.
Konovalov DA, Manning C, Henshaw MT (2004) KINGROUP: A program for pedigree
relationship reconstruction and kin group assignments using genetic markers. Molecular Ecology
Notes, 4, 779–782.
Nielsen EE, Bach LA, Kotlicki P (2006) HYBRIDLAB (Version 1.0): A program for generating
simulated hybrids from population samples. Molecular Ecology Notes, 6, 971–973.
Pons O, Petit RJ (1996) Measuring and testing genetic differentiation with ordered versus
unordered alleles. Genetics, 144, 1237–1245.
Posada D, Crandall KA (1998) MODELTEST: Testing the model of DNA substitution.
Bioinformatics, 14, 817–818.
Poulakakis N, Glaberman S, Russello M et al. (2008) Historical DNA analysis reveals living
descendants of an extinct species of Galápagos tortoise. Proceedings of the National Academy of
Sciences, USA, 105, 15464–15469.
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus
genotype data. Genetics, 155, 945–959.
Rousset F (2008) GENEPOP’007: A complete re-implementation of the GENEPOP software for
Windows and Linux. Molecular Ecology Resources, 8, 103–106.
Russello MA, Beheregaray LB, Gibbs JP et al. (2007) Lonesome George is not alone among
Galápagos tortoises. Current Biology, 17, R317–R318.
Russello MA, Glaberman S, Gibbs JP et al. (2005) A cryptic taxon of Galápagos tortoise in
conservation peril. Biology Letters, 1, 287–290.
Swofford DL (2002) PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods).
Sinauer, Sunderland, Massachusetts, USA.
Vähä JP, Primmer CR (2006) Efficiency of model-based Bayesian methods for detecting hybrid
individuals under different hybridization scenarios and with different numbers of loci. Molecular
Ecology, 15, 63–72.