Supplementary Methods:
Assumptions of Nb and Ne estimators
Every method of estimating Ne rests on assumptions, and deviations from these assumptions may introduce severe bias in Ne estimates if left unaddressed. The first common assumption is that the genetic markers are not subject to mutation or selection. Bias introduced by mutation is unlikely to be substantial given the short time interval of our study, and the influence of selection appears to be minimal given the general adherence of the microsatellite loci to Hardy-Weinberg equilibrium (HWE). Another crucial assumption of most Ne estimators is that the population is closed, with no further subdivision or structure. We checked for further population structure through a number of STRUCTURE analyses and found none, although the results did show some gene flow between the two populations. The LDNe method of estimating Nb is expected to be robust to migration rates lower than 10% (Waples and England 2011), but we took additional steps to reduce bias from this violation. Putative immigrants were removed from each population for all methods except the maximum likelihood method (MLNe), in which immigration is explicitly considered in the estimation of Ne.
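To illustrate the kind of comparison that underlies a statement about adherence to HWE, the following is a minimal sketch that contrasts observed heterozygosity with the heterozygosity expected under HWE at a single multi-allelic locus. It is not a formal HWE test and is not necessarily the procedure used in this study; the function name `heterozygosity` and the genotype data are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    # Expected heterozygosity under HWE: 1 minus the sum of squared allele frequencies.
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return observed, expected


# Hypothetical genotypes (microsatellite allele sizes in base pairs), for illustration only.
sample = [(150, 154), (150, 150), (154, 158), (150, 158), (154, 154), (150, 154)]
h_obs, h_exp = heterozygosity(sample)
print(f"Ho = {h_obs:.3f}, He = {h_exp:.3f}")
```

A large and consistent gap between the two values across loci would be one reason to question the assumption; a formal assessment would use an exact or chi-square test with correction for multiple comparisons.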
Most methods of estimating Ne also assume a stable population size, an assumption that may be violated in our study species. Splittail abundance is driven by the availability of floodplain habitat critical for spawning, which in turn depends on the amount of precipitation (Moyle et al. 2004). This reliance on precipitation may produce natural variability in the population size of the species. However, this presumed life history is based on information from the Central Valley splittail population, which has greater access to floodplain spawning habitat. Given the limited floodplain availability and comparatively low annual freshwater outflow within the Petaluma and Napa Rivers (Feyrer et al. 2005; Feyrer et al. 2007), it is unlikely that the life history of the San Pablo Bay splittail population is identical to that of the Central Valley population (Moyle et al. 2004).
The last major assumption of many Ne estimators is that generations are discrete. Treating overlapping generations as discrete can introduce significant bias (Waples and Yokota 2007; Luikart et al. 2010), and we attempted to address this bias in two ways. First, single-sample Ne estimators (e.g., LDNe) can be applied to individuals randomly sampled from a single cohort in a population with overlapping generations, providing an estimate of the effective number of breeders (Nb) that produced the cohort (Waples 2005). Second, the temporal method can be used with ample time allowed to elapse between the two sampling periods. Once several generations have passed between the two sampling points, the signal from genetic drift should become stronger relative to the bias from overlapping generations (Palstra and Ruzzante 2008). Although a method that incorporates age-structure effects for species with overlapping generations exists, it requires detailed demographic information that is lacking for less well-studied species such as splittail (Jorde and Ryman 1995).
When using temporal estimators, Waples and Yokota (2007) recommended using samples spaced at least ~3-5 generations apart to reduce overlapping-generation bias. Unfortunately, otolith and scale analyses of adult splittail collected in San Pablo Bay (Hobbs, unpublished data) suggested a moderately long generation interval for the species (~4.4 years), indicating that only ~2 generations have passed between our two sampling periods. Waples and Yokota (2007) observed that for species with Type 2 or Type 3 survivorship (higher mortality among juveniles, increased survivorship in older individuals), sampling newborns without a sufficient time gap results in a downwardly biased estimate. Based on this information and our understanding of the splittail's life history (Moyle et al. 2004), a slight downward bias may be expected in our temporal Ne estimates.
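To make the role of the number of elapsed generations concrete, the following is a minimal sketch of a moment-based temporal estimator. It is offered as an illustration only and is not necessarily the estimator used in this study: it assumes the Nei and Tajima form of the standardized allele-frequency variance (Fc) and a sampling correction that depends only on the two sample sizes. The function names and all input values are hypothetical.

```python
def nei_tajima_fc(freqs_t0, freqs_t1):
    """Standardized allele-frequency variance (Fc) at one locus, averaged over alleles."""
    terms = []
    for x, y in zip(freqs_t0, freqs_t1):
        denom = (x + y) / 2.0 - x * y
        if denom > 0:  # skip alleles that are absent from, or fixed in, both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles at this locus")
    return sum(terms) / len(terms)


def temporal_ne(fc, generations, n0, n1):
    """Moment-based temporal Ne from Fc, elapsed generations, and the two sample sizes.

    The terms 1/(2*n0) and 1/(2*n1) remove the variance expected from sampling alone.
    """
    drift = fc - 1.0 / (2 * n0) - 1.0 / (2 * n1)
    if drift <= 0:
        # Sampling noise accounts for all observed change: Ne is not distinguishable from infinity.
        return float("inf")
    return generations / (2.0 * drift)


# Hypothetical allele frequencies at one locus in two temporal samples.
fc = nei_tajima_fc([0.5, 0.3, 0.2], [0.4, 0.35, 0.25])
# Hypothetical sample sizes of 50 individuals each, with ~2 generations between samples.
print(temporal_ne(fc, generations=2, n0=50, n1=50))
```

Because the drift signal in Fc accumulates with the number of generations while the sampling correction does not, a short interval leaves the estimate sensitive to noise and to overlapping-generation effects, which is the motivation for the recommended spacing between samples.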
References:
Feyrer F, Sommer TR, Baxter RD (2005) Spatial-temporal distribution and habitat associations of age-0 splittail in the lower San Francisco Estuary watershed. Copeia 1:159–168.
Feyrer F, Sommer T, Hobbs J (2007) Living in a dynamic environment: Variability in life history traits of age-0 splittail in tributaries of San Francisco Bay. Trans Am Fish Soc 136:1393–1405. doi: 10.1577/T06-253.1
Jorde PE, Ryman N (1995) Temporal allele frequency change and estimation of effective size in populations with overlapping generations. Genetics 139:1077–1090.
Luikart G, Ryman N, Tallmon DA, et al. (2010) Estimation of census and effective population sizes: the increasing usefulness of DNA-based approaches. Conserv Genet 11:355–373. doi: 10.1007/s10592-010-0050-7
Moyle PB, Baxter RD, Sommer T, et al. (2004) Biology and population dynamics of Sacramento splittail (Pogonichthys macrolepidotus) in the San Francisco Estuary: A review. San Francisco Estuary & Watershed Science 2:Article 3.
Palstra FP, Ruzzante DE (2008) Genetic estimates of contemporary effective population size: what can they tell us about the importance of genetic stochasticity for wild population persistence? Mol Ecol 17:3428–3447. doi: 10.1111/j.1365-294X.2008.03842.x
Waples RS (2005) Genetic estimates of contemporary effective population size: to what time periods do the estimates apply? Mol Ecol 14:3335–3352. doi: 10.1111/j.1365-294X.2005.02673.x
Waples RS, England PR (2011) Estimating contemporary effective population size on the basis of linkage disequilibrium in the face of migration. Genetics 189:633–644. doi: 10.1534/genetics.111.132233
Waples RS, Yokota M (2007) Temporal estimates of effective population size in species with overlapping generations. Genetics 175:219–233. doi: 10.1534/genetics.106.065300
Fig. S1 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of the full data set.
Vertical lines denote standard deviation. Out of K= 1-8, K= 2 has the highest mean value.
[Fig. S1 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-91000 to -84000).]
Fig. S2 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of 2011 San Pablo
Bay (Petaluma and Napa Rivers) collection. Vertical lines denote standard deviation. Out of K= 1-8, K= 2
has the highest mean value.
[Fig. S2 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-25000 to -21000).]
Fig. S3 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of 2012 San Pablo
Bay (Petaluma and Napa Rivers) collection. Vertical lines denote standard deviation. Out of K= 1-8, K= 1
has the highest mean value.
[Fig. S3 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-25000 to -21000).]
Fig. S4 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of 2011 Central
Valley collection. Vertical lines denote standard deviation. Out of K= 1-8, K= 1 has the highest mean
value.
[Fig. S4 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-19500 to -17000).]
Fig. S5 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of all individuals
assigned to the San Pablo Bay population. Vertical lines denote standard deviation. Out of K= 1-8, K= 1
has the highest mean value.
[Fig. S5 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-43000 to -40000).]
Fig. S6 Mean estimated Ln Pr(X|K) values from STRUCTURE K inference analysis of all individuals
assigned to the Central Valley population. Vertical lines denote standard deviation. Out of K= 1-8, K= 1
has the highest mean value.
[Fig. S6 plot not reproduced. x-axis: K (1 to 8); y-axis: mean of estimated Ln probability of data (-52000 to -43000).]
Fig. S7 ΔK values, calculated as ΔK = m|L′′(K)|/s[L(K)], from K inference analysis of the 2011 San Pablo
Bay (Petaluma and Napa Rivers) collection.
[Fig. S7 plot not reproduced. x-axis: K (2 to 7); y-axis: ΔK (0 to 450).]
Fig. S8 ΔK values, calculated as ΔK = m|L′′(K)|/s[L(K)], from K inference analysis of the full dataset.
[Fig. S8 plot not reproduced. x-axis: K (2 to 7); y-axis: ΔK (0 to 3000).]
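The ΔK statistic given in the captions of Figs. S7 and S8, ΔK = m|L′′(K)|/s[L(K)], can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact procedure used to produce the figures: it evaluates |L′′(K)| as the absolute second difference of the mean Ln Pr(X|K) values across replicate runs and divides by the standard deviation of the runs at K; averaging |L′′(K)| over paired replicate runs is an alternative reading of the formula. The function name `delta_k` and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of Ln Pr(X|K) values, one per replicate run.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            # |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)|, evaluated on the means across runs.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Hypothetical replicate Ln Pr(X|K) values for K = 1..4, for illustration only.
runs = {
    1: [-100.0, -102.0, -98.0],
    2: [-80.0, -81.0, -79.0],
    3: [-85.0, -90.0, -80.0],
    4: [-88.0, -95.0, -84.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the second difference needs both neighbours, ΔK is undefined at the smallest and largest K tested, which is consistent with the ΔK plots covering K = 2 to 7 when K = 1 to 8 was evaluated. For the same reason ΔK cannot support K = 1, so the mean Ln Pr(X|K) values in Figs. S1 to S6 remain necessary for that case.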