Supplementary Material
Materials and methods
Descriptive statistics
The ‘Check raw data’ option in GenAlEx 6.5 (Peakall and Smouse, 2006; Peakall and Smouse, 2012) was used to calculate the PCR amplification success (in percentage) for each locus and sampling location (Table S1). The mean number of alleles (Na), observed heterozygosity (Ho), and unbiased expected heterozygosity (uHe) were calculated with GenAlEx 6.5 (Table S2). FSTAT 2.9.3.2 (Goudet, 2002) was used to calculate allelic richness (Ar) using the rarefaction method (El Mousadik and Petit, 1996) (Table S2). Deviations from Hardy–Weinberg equilibrium (HWE) were tested against the alternative hypothesis of heterozygote deficiency (H1; Table S2), as implemented in GENEPOP 4.2 (Rousset, 2008). The inbreeding coefficient FIS (Weir and Cockerham, 1984) (Table S2) was computed with GENODIVE v2.0b23 (Meirmans and Van Tienderen, 2004).
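As a rough illustration of the quantities named above, the following Python sketch computes observed heterozygosity, unbiased expected heterozygosity, and rarefied allelic richness for a single locus. It is not the GenAlEx or FSTAT code: the function names and the toy genotypes are hypothetical, and the sketch assumes diploid genotypes with missing data already removed.

```python
from collections import Counter
from math import comb


def allele_counts(genotypes):
    """Count allele copies across diploid genotypes given as (a, b) tuples."""
    return Counter(allele for genotype in genotypes for allele in genotype)


def observed_heterozygosity(genotypes):
    """Ho: proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def unbiased_expected_heterozygosity(genotypes):
    """uHe = (2N / (2N - 1)) * (1 - sum of squared allele frequencies)."""
    counts = allele_counts(genotypes)
    total = 2 * len(genotypes)  # number of gene copies, 2N
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return he * total / (total - 1)


def rarefied_allelic_richness(genotypes, n_genes):
    """Expected number of alleles in a random subsample of n_genes gene copies."""
    counts = allele_counts(genotypes)
    total = sum(counts.values())
    if n_genes > total:
        raise ValueError("n_genes cannot exceed the number of sampled gene copies")
    # For each allele: 1 - P(allele is absent from the subsample).
    return sum(
        1.0 - comb(total - c, n_genes) / comb(total, n_genes)
        for c in counts.values()
    )


# Hypothetical genotypes at one microsatellite locus (allele sizes in bp).
toy = [(100, 102), (100, 100), (102, 104), (100, 104)]
ho = observed_heterozygosity(toy)
uhe = unbiased_expected_heterozygosity(toy)
ar = rarefied_allelic_richness(toy, n_genes=4)
```

Rarefying to the smallest sample size across populations is what makes allelic richness comparable between samples of unequal size.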
To correct for multiple comparisons, a false discovery rate (FDR) correction (Benjamini and Hochberg, 1995) was performed with SGoF+ (Carvajal-Rodriguez and de Uña-Alvarez, 2011) whenever necessary.
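For readers unfamiliar with the FDR correction, a minimal Python sketch of the Benjamini and Hochberg (1995) step-up procedure is given here. It does not reproduce SGoF+; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True where the null is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P-values from eight tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
decisions = benjamini_hochberg(example)
```

Because the procedure is step-up, a P-value above its own threshold is still rejected when a larger P-value further down the ranking meets its threshold.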
Clustering analyses
Three Bayesian Markov chain Monte Carlo programs with different algorithms and assumptions were used to infer the population structure of the studied populations: STRUCTURE v2.3.4 (Falush et al., 2003; Falush et al., 2007; Hubisz et al., 2009; Pritchard et al., 2000), InStruct (Gao et al., 2007), and BAPS v6 (Corander and Marttinen, 2006; Corander et al., 2006; Corander et al., 2008). In all approaches, individuals are assigned probabilistically to one subpopulation, or jointly to two or more subpopulations if their genotypes indicate that they are admixed.
STRUCTURE v2.3.4 was used because it can describe clines and clusters simultaneously from multilocus genotypes (François and Durand, 2010). Underlying population structure was investigated directly with the admixture model, with correlated allele frequencies between clusters and the LOCPRIOR option (Hubisz et al., 2009). Ten independent runs were computed for each value of K from K = 1 to K = 7, each with a burn-in of 100,000 iterations followed by 500,000 iterations. To determine the most appropriate value of K, two statistics were considered: the maximum value of Ln Pr(X|K) and the ΔK statistic developed by Evanno et al. (2005) (Figure S1). The R package CorrSieve 1.6-8 (Campana et al., 2011) was used to calculate the ΔK statistic.
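The ΔK statistic can be sketched in a few lines of Python. This is not the CorrSieve code: the function name and the Ln Pr(X|K) values are hypothetical, and the sketch assumes the commonly used form in which the absolute second-order difference of the run means is divided by the standard deviation across runs at that K. Other implementations may differ in detail.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute ΔK from a dict mapping K to a list of Ln Pr(X|K) values, one per run.

    ΔK is only defined for values of K that have both a K-1 and a K+1 neighbour.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    sds = {k: stdev(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            # |L(K+1) - 2 L(K) + L(K-1)| computed on the run means.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Hypothetical log-likelihoods for three runs at each K.
runs = {
    1: [-3000.0, -3002.0, -2998.0],
    2: [-2800.0, -2801.0, -2799.0],
    3: [-2790.0, -2795.0, -2785.0],
    4: [-2785.0, -2795.0, -2775.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Note that ΔK cannot be evaluated for the smallest or largest K tested, which is one reason Ln Pr(X|K) is inspected alongside it.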
Bottleneck, population growth, and recent migration rates
BOTTLENECK 1.2.02 (Piry et al., 1999) was used to test for recent demographic changes using the two-phase model (TPM) with 70% of the stepwise mutation model (SMM) (Piry et al., 1999) and Wilcoxon’s signed-rank test (Table S3). To detect signatures of population growth, the intralocus variance k test (Reich and Goldstein, 1998) and the interlocus g test (Reich et al., 1999) were performed using the Kgtests Excel macro (Bilgin, 2007). The macro computes the significance of the k statistic using the one-tailed binomial distribution. Significance levels for the interlocus g test are available in Table 1 of Reich et al. (1999). The g statistic is interpreted as an indication of expansion when it has an unusually low value; for seven loci (as in our study), g values lower than 0.14 (20 < n < 40) or 0.015 (n < 20) indicate a population expansion (Table S3).
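The one-tailed binomial calculation behind the k test can be sketched as follows. This is not the Kgtests macro: the function name is hypothetical, and the per-locus probability of a positive k under constant population size (0.515) is an assumption taken from the description of the test in Reich et al. (1999), not a value stated in this supplement. The locus counts in the example are illustrative only.

```python
from math import comb


def k_test_p_value(n_positive, n_loci, p_positive=0.515):
    """One-tailed binomial probability of seeing n_positive or fewer loci with k > 0.

    p_positive is the assumed chance of a positive k at one locus under
    constant population size (an assumption, see the text above).
    """
    return sum(
        comb(n_loci, i) * p_positive ** i * (1 - p_positive) ** (n_loci - i)
        for i in range(n_positive + 1)
    )


# Illustrative counts for a seven-locus data set.
p_three_positive = k_test_p_value(3, 7)
p_two_positive = k_test_p_value(2, 7)
```

Under these assumptions, the function returns approximately 0.467 for three positive-k loci out of seven and 0.203 for two out of seven. A small probability would indicate fewer positive-k loci than expected without expansion.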
BayesAss v3 (BA3) was used to estimate recent migration rates (m) between populations (or groups of populations) using a non-equilibrium Bayesian method based on Markov chain Monte Carlo techniques (Wilson and Rannala, 2003) (Table S4).
To obtain reliable results in BA3, the strategy proposed by Rannala (2011) was followed. The mixing parameters were adjusted (dM = 0.5; dA = 0.60; dF = 0.6) to obtain acceptance rates between 20% and 40%, with 1 × 10^6 iterations, 1 × 10^5 iterations as burn-in, and 10 independent runs (each started with a different random number seed). To assess significance, migration rates were averaged over the 10 independent runs and compared with the average migration rates of 10 randomly permuted data sets (generated in GENODIVE v2.0b23). Estimated migration rates were considered significant when their 95% confidence interval (CI) did not overlap the 95% CI of the randomly permuted data (Andras et al., 2013). The BayesAss analysis was performed with CF3 and CF4 pooled, as they were considered indistinct (see Results: Bayesian clustering, and FST and DEST estimates of differentiation).
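The significance criterion described above (non-overlapping 95% CIs) amounts to a simple interval comparison, sketched here in Python. The function names and the numerical intervals are hypothetical; the sketch takes the intervals as given and does not attempt to reproduce how BA3 or GENODIVE derive them.

```python
from statistics import mean


def average_rate(rates_per_run):
    """Average a migration-rate estimate over independent runs."""
    return mean(rates_per_run)


def intervals_overlap(ci_a, ci_b):
    """True when two closed intervals (low, high) share at least one point."""
    low_a, high_a = ci_a
    low_b, high_b = ci_b
    return low_a <= high_b and low_b <= high_a


def is_significant(observed_ci, permuted_ci):
    """A rate counts as significant when its CI does not overlap the permuted CI."""
    return not intervals_overlap(observed_ci, permuted_ci)


# Hypothetical 95% CIs: observed estimate versus randomly permuted data.
clear_signal = is_significant((0.10, 0.20), (0.00, 0.05))
no_signal = is_significant((0.01, 0.08), (0.00, 0.05))
avg = average_rate([0.02, 0.03, 0.025, 0.035])
```

Comparing against permuted data sets, rather than against zero alone, guards against rates that look non-zero only because of how the sampler behaves on uninformative data.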
Supplementary figures
Figure S1 CorrSieve output of the STRUCTURE results for the Corallium rubrum data set. The graph shows the mean estimated Ln probabilities of the data (black) and ΔK values (orange) for each K.
Supplementary tables
Table S1 PCR amplification success (in percentage) for each locus and sampling location, calculated with the ‘Check raw data’ option in GenAlEx 6.5. Values <75% (shown in bold in the original table) are marked with an asterisk (*).

(A) PCR amplification success (%) on the whole dataset: 12 loci and 122 colonies
Sample/locus  COR9     COR46    COR48    COR58    Mic13    Mic20    Mic22    Mic23    Mic24    Mic25    Mic26    Mic27
AL1           93.75    87.50    100.00   87.50    100.00   93.75    87.50    81.25    100.00   100.00   87.50    100.00
CF2           65.00*   95.00    90.00    60.00*   100.00   100.00   65.00*   85.00    100.00   95.00    100.00   100.00
CF3           62.96*   96.30    100.00   96.30    100.00   100.00   22.22*   92.59    100.00   96.30    96.30    96.30
CF4           53.57*   96.43    96.43    100.00   100.00   100.00   46.43*   100.00   89.29    92.86    78.57    96.43
PCo5          87.10    87.10    77.42    90.32    100.00   100.00   0.00*    70.97*   93.55    64.52*   87.10    100.00
Overall       71.31*   92.62    91.80    88.52    100.00   99.18    37.70*   86.07    95.90    87.80    89.34    98.36

(B) PCR amplification success (%) after the quality check: 7 loci and 115 colonies
Sample/locus  COR46    COR48    Mic13    Mic20    Mic24    Mic26    Mic27
AL1           93.33    100.00   100.00   100.00   100.00   86.67    100.00
CF2           95.00    90.00    100.00   100.00   100.00   100.00   100.00
CF3           96.30    100.00   100.00   100.00   100.00   96.30    96.30
CF4           100.00   100.00   100.00   100.00   92.00    84.00    100.00
PCo5          92.86    82.14    100.00   100.00   96.43    85.71    100.00
Overall       95.65    93.91    100.00   100.00   97.39    90.43    99.13

Table S2 Descriptive statistics for the samples (total number of genotypes in parentheses) and the loci analyzed (total number of alleles in parentheses). Na = mean number of alleles; Ar = allelic richness calculated with the rarefaction method; Ho = observed heterozygosity; uHe = unbiased expected heterozygosity; FIS = inbreeding fixation index; Nu = frequency of null alleles; PHWE = probability for the global Hardy–Weinberg test when H1 = heterozygote deficit. Bold underlined values in the original table indicate significant values after FDR correction.

Locus        Sample          Na      Ar      Ho      uHe     FIS     Nu      PHWE
COR46 (16)   AL1 (15)        8.00    7.92    0.43    0.87    0.52    0.22    0.00
COR46 (16)   CF2 (20)        7.00    6.54    0.21    0.77    0.72    0.31    0.00
COR46 (16)   CF3 (27)        5.00    4.97    0.35    0.78    0.56    0.25    0.00
COR46 (16)   CF4 (25)        8.00    6.59    0.40    0.80    0.50    0.22    0.00
COR46 (16)   PCo5 (28)       10.00   8.26    0.65    0.83    0.20    0.08    0.04
COR46 (16)   Overall (115)   7.60    9.40    0.41    0.81    0.45    0.22    0.00
COR48 (20)   AL1 (15)        11.00   10.82   0.27    0.93    0.72    0.33    0.00
COR48 (20)   CF2 (20)        7.00    6.30    0.33    0.64    0.49    0.19    0.00
COR48 (20)   CF3 (27)        7.00    5.05    0.11    0.64    0.83    0.32    0.00
COR48 (20)   CF4 (25)        9.00    7.50    0.44    0.77    0.44    0.19    0.00
COR48 (20)   PCo5 (28)       12.00   9.87    0.22    0.86    0.75    0.34    0.00
COR48 (20)   Overall (115)   9.20    10.27   0.27    0.77    0.63    0.27    0.00
Mic13 (6)    AL1 (15)        4.00    3.87    0.07    0.61    0.89    0.33    0.00
Mic13 (6)    CF2 (20)        3.00    2.99    0.35    0.44    0.22    0.08    0.13
Mic13 (6)    CF3 (27)        3.00    2.74    0.22    0.50    0.56    0.17    0.02
Mic13 (6)    CF4 (25)        3.00    2.52    0.12    0.36    0.67    0.19    0.00
Mic13 (6)    PCo5 (28)       2.00    1.46    0.04    0.04    0.00    0.00    1.00
Mic13 (6)    Overall (115)   3.00    3.16    0.16    0.39    0.57    0.16    0.00
Mic20 (8)    AL1 (15)        3.00    2.99    0.60    0.50    -0.22   0.00    0.90
Mic20 (8)    CF2 (20)        4.00    3.88    0.80    0.66    -0.22   0.00    0.94
Mic20 (8)    CF3 (27)        3.00    2.48    0.41    0.48    0.15    0.04    0.30
Mic20 (8)    CF4 (25)        2.00    2.00    0.48    0.44    -0.08   0.00    0.81
Mic20 (8)    PCo5 (28)       6.00    4.39    0.50    0.67    0.26    0.09    0.02
Mic20 (8)    Overall (115)   3.60    4.06    0.56    0.55    0.02    0.03    0.30
Mic24 (13)   AL1 (15)        6.00    5.45    0.27    0.36    0.27    0.10    0.03
Mic24 (13)   CF2 (20)        4.00    3.93    0.55    0.56    0.02    0.07    0.02
Mic24 (13)   CF3 (27)        5.00    3.90    0.41    0.38    -0.06   0.00    0.80
Mic24 (13)   CF4 (25)        5.00    3.87    0.30    0.28    -0.09   0.00    1.00
Mic24 (13)   PCo5 (28)       3.00    2.60    0.15    0.18    0.16    0.00    0.18
Mic24 (13)   Overall (115)   4.60    5.62    0.34    0.35    0.01    0.03    0.04
Mic26 (29)   AL1 (15)        16.00   16.00   0.62    0.96    0.37    0.16    0.00
Mic26 (29)   CF2 (20)        7.00    6.65    0.45    0.81    0.45    0.22    0.00
Mic26 (29)   CF3 (27)        14.00   9.83    0.46    0.78    0.42    0.17    0.00
Mic26 (29)   CF4 (25)        13.00   10.26   0.57    0.85    0.34    0.12    0.01
Mic26 (29)   PCo5 (28)       16.00   12.12   0.67    0.91    0.27    0.11    0.00
Mic26 (29)   Overall (115)   13.20   13.59   0.55    0.86    0.29    0.16    0.00
Mic27 (22)   AL1 (15)        17.00   15.97   0.93    0.96    0.03    0.00    0.44
Mic27 (22)   CF2 (20)        12.00   10.12   0.40    0.86    0.54    0.23    0.00
Mic27 (22)   CF3 (27)        13.00   11.20   0.62    0.92    0.33    0.16    0.00
Mic27 (22)   CF4 (25)        15.00   12.27   0.76    0.92    0.17    0.08    0.01
Mic27 (22)   PCo5 (28)       16.00   12.62   0.89    0.93    0.04    0.03    0.02
Mic27 (22)   Overall (115)   14.60   12.85   0.72    0.92    0.21    0.10    0.00

Table S3 Results of the BOTTLENECK software and the k and g tests. TPM excess = probability from Wilcoxon’s signed-rank test for heterozygosity excess under the two-phase model (TPM); g = value of the interlocus g test; k = probability value associated with the intralocus k test. Significance levels for the interlocus g test are available in Table 1 of Reich et al. (1999).

Samples                AL1     CF2     CF3     CF4     PCo5
TPM excess (P-value)   0.148   0.531   0.406   0.766   0.973
g                      0.503   1.528   0.617   1.084   1.037
k (P-value)            0.467   0.203   0.467   0.467   0.467

Table S4 Estimates of recent immigration derived from BayesAss v3. Means ± standard deviations (with associated 95% confidence intervals) of the posterior distributions of the migration rate (m) over the past one to three generations into each population. Immigrant source populations are given in rows (from) and receiving populations in columns (into). Numbers refer to migration rates (off-diagonal) and self-recruitment (non-migration) rates (diagonal). Pairwise estimates significantly different from zero are indicated in bold in the original table.

from/into   AL1                          CF2                          CF3+CF4                      PCo5
AL1         0.931±0.031 (0.870;0.993)    0.013±0.013 (-0.012;0.038)   0.006±0.009 (-0.006;0.018)   0.010±0.016 (-0.009;0.029)
CF2         0.026±0.022 (-0.017;0.069)   0.929±0.038 (0.855;1.003)    0.010±0.009 (-0.008;0.028)   0.018±0.016 (-0.014;0.049)
CF3+CF4     0.019±0.019 (-0.017;0.056)   0.044±0.035 (-0.025;0.113)   0.975±0.013 (0.949;1.001)    0.029±0.022 (-0.014;0.072)
PCo5        0.023±0.020 (-0.017;0.063)   0.014±0.013 (-0.012;0.040)   0.009±0.008 (-0.007;0.024)   0.943±0.026 (0.892;0.994)
References (supplementary material)
Andras, J. P., K. L. Rypien & C. D. Harvell, 2013. Range-wide population genetic structure of the Caribbean sea fan coral, Gorgonia ventalina. Molecular Ecology 22(1):56-73 doi:10.1111/mec.12104.
Benjamini, Y. & Y. Hochberg, 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B (Methodological) 57(1):289-300.
Bilgin, R., 2007. Kgtests: a simple Excel Macro program to detect signatures of population expansion using microsatellites. Molecular Ecology Notes 7(3):416-417 doi:10.1111/j.1471-8286.2006.01671.x.
Campana, M. G., H. V. Hunt, H. Jones & J. White, 2011. CorrSieve: software for summarizing and evaluating Structure output. Molecular Ecology Resources 11(2):349-352.
Carvajal-Rodriguez, A. & J. de Uña-Alvarez, 2011. Assessing Significance in High-Throughput Experiments by Sequential Goodness of Fit and q-Value Estimation. PLoS ONE 6(9):e24700.
Corander, J. & P. Marttinen, 2006. Bayesian identification of admixture events using multilocus molecular markers. Molecular Ecology 15(10):2833-2843.
Corander, J., P. Marttinen & S. Mantyniemi, 2006. A Bayesian method for identification of stock mixtures from molecular marker data. Fishery Bulletin 104(4):550-558.
Corander, J., J. Sirén & E. Arjas, 2008. Bayesian spatial modeling of genetic population structure. Computational Statistics 23(1):111-129.
El Mousadik, A. & R. Petit, 1996. High level of genetic differentiation for allelic richness among populations of the argan tree [Argania spinosa (L.) Skeels] endemic to Morocco. Theoretical and Applied Genetics 92:832-839.
Evanno, G., S. Regnaut & J. Goudet, 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14(8):2611-2620.
Falush, D., M. Stephens & J. K. Pritchard, 2003. Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics 164(4):1567-1587.
Falush, D., M. Stephens & J. K. Pritchard, 2007. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes 7(4):574-578.
François, O. & E. Durand, 2010. Spatially explicit Bayesian clustering models in population genetics. Molecular Ecology Resources 10(5):773-784.
Gao, H., S. Williamson & C. D. Bustamante, 2007. A Markov chain Monte Carlo approach for joint inference of population structure and inbreeding rates from multilocus genotype data. Genetics 176(3):1635-1651.
Goudet, J., 2002. FSTAT, a program to estimate and test gene diversities and fixation indices (version 2.9.3.2).
Hubisz, M. J., D. Falush, M. Stephens & J. K. Pritchard, 2009. Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources 9(5):1322-1332.
Meirmans, P. G. & P. H. Van Tienderen, 2004. GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms. Molecular Ecology Notes 4(4):792-794.
Peakall, R. & P. E. Smouse, 2006. GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6(1):288-295.
Peakall, R. & P. E. Smouse, 2012. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research - an update. Bioinformatics 28(19):2537-2539.
Piry, S., G. Luikart & J.-M. Cornuet, 1999. Computer note. BOTTLENECK: a computer program for detecting recent reductions in the effective size using allele frequency data. The Journal of Heredity 90(4):502-503.
Pritchard, J. K., M. Stephens & P. Donnelly, 2000. Inference of population structure using multilocus genotype data. Genetics 155(2):945-959.
Rannala, B., 2011. BayesAss Edition 3.0 User’s Manual available at ftp://ftp.ie.freshrpms.net/pub/sourceforge/b/project/ba/bayesass/BA3/3.0.0/docs/BA3Manual.pdf
Reich, D. E. & D. B. Goldstein, 1998. Genetic evidence for a Paleolithic human population expansion in Africa. Proceedings of the National Academy of Sciences of the United States of America 95(14):8119-8123.
Reich, D. E., M. W. Feldman & D. B. Goldstein, 1999. Statistical properties of two tests that use multilocus data sets to detect population expansions. Molecular Biology and Evolution 16(4):453-466.
Rousset, F., 2008. genepop'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources 8(1):103-106.
Weir, B. S. & C. C. Cockerham, 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
Wilson, G. A. & B. Rannala, 2003. Bayesian inference of recent migration rates using multilocus genotypes. Genetics 163(3):1177-1191.