ddi12387-sup-0001-AppendixS1

Appendix S1 Example workflow for assessing population genetic health
Choice of molecular marker type
The increasing availability of genetic tools and associated expertise has made it practical and affordable in many cases to include genetic information in surveys of the population status of threatened plant species. Several types of molecular markers are available for conservation genetic research to generate the parameters of interest that we propose, including allozymes, amplified fragment length polymorphisms (AFLPs), microsatellites and single nucleotide polymorphisms (SNPs) (Table S1). The choice of marker is typically constrained by the size of the study, the available budget, and the availability of pre-existing genetic resources and expertise. Microsatellite markers are frequently used in conservation genetics studies because they provide high information content and analysis methods are readily available. However, microsatellite markers are species-specific in most cases and therefore require development, although next-generation sequencing technology has dramatically reduced the cost and time required to develop microsatellites [1] and SNPs [2]. When microsatellite or other markers are not available for the species of interest, AFLPs represent a good trade-off between genetic informativeness and the number of species that can be studied within a given time-frame. Because the AFLP method is not species-specific, it can be quickly deployed to many species of wide taxonomic breadth with very little development time required for each, although several technical limitations of the method also need to be considered [3,4]. Estimating the inbreeding coefficient (FIS) from dominant marker data has proved controversial, though a recently proposed Bayesian approach appears to provide robust estimates of FIS inferred from simulated and empirical data [14]. We use AFLPs in the case studies presented in this paper, but any of the molecular marker methods can be used to generate the required data on population genetic differentiation, genetic diversity and inbreeding for the species of interest.
Sampling considerations
Populations may be defined on ecological or genetic criteria and may or may not be identified before sample collection begins. For example, for species with restricted or disjunct distributions, populations may be circumscribed based on geographic location or habitat type. Alternatively, if no a priori information is available, a widespread sampling regime could be implemented and genetic populations then defined a posteriori using analytical methods that aim to resolve groups of individuals connected by gene flow (see example workflow in Fig S1). Once the identity of populations has been resolved, a minimum sample size of 20-30 individuals per population is generally recommended for robust genetic analyses [5, 6], though larger sample sizes are preferred for AFLP studies to accurately sample rare alleles. When collecting samples within a population, it is preferable to sample from widely spaced plants to reduce the probability of sampling related individuals, since many species exhibit low levels of pollen and/or seed dispersal, which leads to spatial clumping of related individuals. When working with rare and threatened species it may sometimes be difficult to achieve even sample sizes, as population sizes can vary widely. Under such circumstances it may be appropriate to use rarefaction to scale population genetic parameters to the population with the smallest sample size, though it is essential to be mindful that sampling error increases with decreasing sample size and conclusions should be tempered accordingly.

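As an illustration of the rarefaction idea described above, the following minimal sketch computes rarefied allelic richness for one locus: the expected number of distinct alleles in a random subsample of g gene copies, using the standard hypergeometric rarefaction formula. It assumes codominant data in diploids (so g is twice the smallest number of individuals sampled); the function name and the example allele counts are hypothetical, and the sketch is not intended to reproduce the method of any particular software package.

```python
from math import comb

def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts: observed count of each allele at one locus in one population.
    g: subsample size in gene copies (typically the smallest sample in the study).
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("subsample size g exceeds the number of gene copies sampled")
    total = comb(n, g)
    # For each allele: 1 - P(allele is absent from the subsample)
    return sum(1 - comb(n - c, g) / total for c in allele_counts)

# Hypothetical allele counts at one locus in two unevenly sampled populations
pop_large = [30, 20, 8, 2]   # 60 gene copies (30 diploid individuals)
pop_small = [12, 8]          # 20 gene copies (10 diploid individuals)

g = min(sum(pop_large), sum(pop_small))
print(rarefied_allelic_richness(pop_large, g))
print(rarefied_allelic_richness(pop_small, g))
```

Rarefying to the smallest sample makes the two richness values comparable, but the estimate for the larger population is still based on the precision that only 20 gene copies can offer.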
The case studies we present consist of relatively even sampling of small numbers of populations, and we use an informal statistical approach (non-overlap of standard errors, Fig S1) to assess differences in population genetic statistics between target and reference populations. This approach allows ease of assessment for non-specialist users of our decision-making framework; however, if more complex sampling designs are employed, then appropriate formal statistical analyses should be conducted.
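The informal comparison can be sketched as follows. This is a minimal example that assumes the statistic of interest (for example HE) is available as one value per locus, so that the standard error is taken across loci; the function names and the per-locus values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_se(values):
    """Mean and standard error of per-locus values (SE = SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))

def se_intervals_overlap(a, b):
    """True if the intervals mean +/- 1 SE of the two samples overlap."""
    (mean_a, se_a), (mean_b, se_b) = mean_and_se(a), mean_and_se(b)
    return (mean_a - se_a) <= (mean_b + se_b) and (mean_b - se_b) <= (mean_a + se_a)

# Hypothetical per-locus HE values for a target and a reference population
target = [0.10, 0.12, 0.11, 0.13]
reference = [0.30, 0.28, 0.31, 0.29]

if se_intervals_overlap(target, reference):
    print("standard errors overlap: no clear difference")
else:
    print("standard errors do not overlap: populations differ informally")
```

Non-overlap of standard errors is a screening heuristic rather than a formal test, which is why more complex designs call for proper statistical analysis.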
Table S1: Comparison of different classes of markers commonly used in conservation genetic research, based on whether marker development is generic (transferable to other species) or species-specific, the time taken for marker development, cost of development, ongoing running costs, reproducibility, information content and main limitation. ↓ = Low, ↑ = High, ↔ = Moderate

Marker type | Technology | Development time | Development cost | Running cost | Reproducibility | Information content | Main limitation
Allozymes | Generic | ↓ | ↓ | ↓ | ↑ | ↓ ↔ | Low polymorphism
Amplified Fragment Length Polymorphisms (AFLPs) | Generic | ↓ | ↓ | ↔ | ↑ | ↑ (c) | Dominant marker, estimating FIS not straightforward
Inter-Simple Sequence Repeats (ISSRs) | Generic | ↓ | ↓ | ↔ | ↔ ↑ | ↔ ↑ | As above
Microsatellites | Species-specific (a) | ↔ ↑ | ↔ (b) | ↔ (b) | ↑ | ↑ | Null alleles, ascertainment bias
Single Nucleotide Polymorphisms (SNPs) | Species-specific | ↔ ↑ | ↑ (b) | ↔ ↑ (b) | ↑ | ↔ ↑ (c) | Advanced bioinformatics required to analyse large datasets

(a) Some microsatellites are transferable to other closely-related species within genera, but risk of null alleles increases
(b) Next-generation sequencing technologies are advancing rapidly and per-sample costs are falling with time
(c) Individual loci are not highly informative but the power of the marker-type is in the large number of loci sampled and their genome-wide distribution

Design and collect experimental data
- select appropriate marker system (Table S1)
- select populations to sample and numbers of individuals per population (typically n=20-30). Sample evenly across populations. If sample sizes are uneven, use rarefaction methods (e.g. MolKin [7])
- incorporate comparative element, i.e. compare small populations to large healthy ones or similar
size populations of closely related common species (see main text)
Visualise genetic structure amongst populations
- Principal Coordinates Analysis (PCoA) of inter-individual genetic distance calculated from marker data or Bayesian analysis of population structure
- populations defined a priori or use PCoA to visualise natural groupings
- confirm identification of populations for subsequent analysis; test for Wahlund effects
- example software: GenAlEx [8] (all data types), Structure [9] (all data types)
Quantify genetic differentiation amongst populations
- quantify using Wright's FST [10] or analogues
- FST measures difference in allele frequencies between populations
- ranges from 0 to 1. Test that observed FST > 0 via permutation testing. FST values >0.1-0.2
considered high.
- example software: AFLP-SURV [11] (AFLPs), GenAlEx [8] or Genepop [12] (Microsatellites)
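To make the FST step concrete, the sketch below computes a simple FST analogue for a single locus as (HT - HS) / HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled (mean) allele frequencies. It assumes known allele frequencies and equal weighting of populations, and it applies no sample-size correction or permutation test, both of which dedicated software would normally provide. The function names and frequencies are hypothetical.

```python
def expected_het(freqs):
    """Expected heterozygosity from allele frequencies: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def fst(pop_freqs):
    """FST analogue for one locus, (HT - HS) / HT, with populations weighted equally.

    pop_freqs: one list of allele frequencies per population, alleles in the same order.
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    hs = sum(expected_het(f) for f in pop_freqs) / n_pops
    mean_freqs = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    ht = expected_het(mean_freqs)
    return 0.0 if ht == 0 else (ht - hs) / ht

# Hypothetical biallelic locus in two populations
print(fst([[0.5, 0.5], [0.5, 0.5]]))   # identical allele frequencies
print(fst([[0.8, 0.2], [0.2, 0.8]]))   # divergent allele frequencies
print(fst([[1.0, 0.0], [0.0, 1.0]]))   # fixed for alternative alleles
```

The three calls span the 0 to 1 range described above: identical frequencies give 0 and populations fixed for alternative alleles give 1.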
Quantify genetic diversity within populations
- quantify as Expected Heterozygosity (HE) or use other diversity measures (allelic richness,
Shannon's diversity index, proportion of polymorphic loci)
- HE measures probability that an individual is heterozygous at a given locus. Ranges from 0 to 1
- Assess whether HE is high or low in relation to reference population. Use non-overlap of standard
errors to assess statistical difference unless more sophisticated analysis required
- example software: AFLP-SURV [11], GenAlEx [8] or Genepop [12]
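For codominant markers, HE at a single locus can be sketched as below. The example assumes diploid genotypes recorded as allele pairs and optionally applies the usual small-sample correction, multiplying by 2n / (2n - 1) for n individuals. Dominant markers such as AFLPs require allele frequencies to be estimated first, which this sketch does not attempt. The function name and genotypes are hypothetical.

```python
from collections import Counter

def expected_heterozygosity(genotypes, unbiased=True):
    """Expected heterozygosity at one locus from diploid genotypes.

    genotypes: list of (allele, allele) tuples, one per individual.
    unbiased: if True, apply the small-sample correction 2n / (2n - 1).
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    n_copies = len(alleles)              # 2n gene copies for n diploid individuals
    counts = Counter(alleles)
    he = 1.0 - sum((c / n_copies) ** 2 for c in counts.values())
    if unbiased and n_copies > 1:
        he *= n_copies / (n_copies - 1)
    return he

# Hypothetical genotypes for four individuals at one locus
sample = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
print(expected_heterozygosity(sample, unbiased=False))
print(expected_heterozygosity(sample))
```

Averaging the per-locus values across loci, with the standard error taken over loci, gives the population-level figure that is compared against the reference population.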
Quantify inbreeding within populations
- quantify using Wright's Inbreeding coefficient (FIS) [13]. Measures the decrease in heterozygosity of
individuals relative to that of the sub-population as a consequence of non-random mating
- ranges from 0 to 1. Assess whether FIS is high or low in relation to reference population using
standard errors unless more sophisticated analysis required
- example software: I4A [14] (AFLPs), GenAlEx [8] or Genepop [12]
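For codominant data, the relationship between FIS and heterozygosity can be sketched with the simple moment estimator FIS = 1 - HO / HE, where HO is the observed proportion of heterozygous individuals and HE is the expected heterozygosity at the same locus. This is an illustration only: it assumes diploid genotypes at a single locus, applies no sample-size correction, and is not the Bayesian approach used for dominant markers. Unlike estimators constrained to the 0 to 1 range, this moment estimator returns negative values when heterozygotes are in excess. The function name and genotypes are hypothetical.

```python
from collections import Counter
from math import isnan, nan

def fis(genotypes):
    """Moment estimate of FIS at one locus: 1 - HO / HE.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    Returns nan for a monomorphic locus, where HE is zero.
    """
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n      # observed heterozygosity
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    he = 1.0 - sum((c / len(alleles)) ** 2 for c in counts.values())
    return nan if he == 0 else 1.0 - ho / he

# Hypothetical samples with equal allele frequencies but different mating patterns
random_mating = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")]
fully_inbred = [("A", "A"), ("A", "A"), ("B", "B"), ("B", "B")]

print(fis(random_mating))
print(fis(fully_inbred))
```

Both samples have the same allele frequencies, so the difference in FIS reflects only how the alleles are distributed among individuals.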
Figure S1: Example of genetic analysis workflow to gain genetic parameters required for decision-making matrix.
References
[1] Gardner MG, Fitch AJ, Bertozzi T, Lowe AJ (2011) Rise of the machines – recommendations for ecologists when using next generation sequencing for microsatellite development. Molecular Ecology Resources, 11, 1093–101.
[2] Nielsen R, Paul JS, Albrechtsen A, Song YS (2011) Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics, 12, 443–51.
[3] Bonin A, Bellemain E, Bronken Eidesen P et al. (2004) How to track and assess genotyping errors in population genetics studies. Molecular Ecology, 13, 3261–73.
[4] Foll M, Beaumont MA, Gaggiotti O (2008) An approximate Bayesian computation approach to overcome biases that arise when using amplified fragment length polymorphism markers to study population structure. Genetics, 179, 927–39.
[5] Hale ML, Burg TM, Steeves TE (2012) Sampling for microsatellite-based population genetic studies: 25 to 30 individuals per population is enough to accurately estimate allele frequencies. PloS One, 7, e45170.
[6] Sinclair EA, Hobbs RJ (2009) Sample size effects on estimates of population genetic structure: Implications for ecological restoration. Restoration Ecology, 17, 837–844.
[7] Gutiérrez JP, Royo LJ, Alvarez I, Goyache F (2005) MolKin v2.0: a computer program for genetic analysis of populations using molecular coancestry information. The Journal of Heredity, 96, 718–21.
[8] Peakall R, Smouse PE (2006) GenAlEx 6: Genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288–295.
[9] Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945–959.
[10] Wright SE (1951) The genetical structure of populations. Annals of Eugenics, 15, 323–354.
[11] Vekemans X (2002) AFLP-SURV version 1.0. Distributed by the author, Laboratoire de Génétique et Ecologie Végétale, Université Libre de Bruxelles, Belgium.
[12] Raymond M, Rousset F (1995) GENEPOP (Version 1.2): Population genetics software for exact tests and ecumenicism. Journal of Heredity, 86, 248–249.
[13] Wright SE (1969) Evolution and the Genetics of Populations. Vol 2. The Theory of Gene Frequencies. University of Chicago Press, Chicago.
[14] Chybicki I, Oleska A, Burczyk J (2011) Increased inbreeding and strong kinship structure in Taxus baccata estimated from both AFLP and SSR data. Heredity, 107, 589–600.