Validation: - Proceedings of the Royal Society B

Supporting Information
[Figure S5. Axes: gene diversity (He, 0.0 to 0.5) against generations (t, 0 to 100); series for Ne = 2, 5, 20, 50 and 100.]
Fig. S5. Loss in simulated and theoretically predicted mean (±SEM) gene diversity
(expected heterozygosity, He) by random genetic drift. The mean (±SEM) simulated
He was calculated over 100 simulations in populations with different effective sizes
over 100 generations. The theoretical heterozygosity at generation t was calculated as
$H_{e,t} = H_0 \prod \{1 - [1/(2N_{e,t} + 1)]\}$, with the product taken over the
generations up to t, where $H_{e,t}$ is the expected heterozygosity at
generation t, $H_0$ is the expected heterozygosity at t=0, and $N_{e,t}$ is the effective
population size at generation t. The initial genetic variation in the population at t=0
was $H_0 = 0.5$ and there were two alleles. There was no mutation or selection. The
simulated loss in He was in good agreement with the theoretically expected values for
different values of Ne.
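The theoretical curve in Fig. S5 can be reproduced with a few lines of code. The following is a minimal sketch, not the simulation program used for the paper; the function name and the choice to pass one effective size per generation are illustrative assumptions.

```python
def expected_heterozygosity(h0, ne_per_generation):
    """Return [H_0, H_1, ..., H_t] under drift alone (no mutation, no selection).

    ne_per_generation holds one effective size per generation (hypothetical
    interface); each generation multiplies He by 1 - 1/(2*Ne + 1).
    """
    trajectory = [h0]
    for ne in ne_per_generation:
        trajectory.append(trajectory[-1] * (1.0 - 1.0 / (2.0 * ne + 1.0)))
    return trajectory


# Constant effective sizes as in Fig. S5, starting from H0 = 0.5.
for ne in (2, 5, 20, 50, 100):
    het = expected_heterozygosity(0.5, [ne] * 100)
    print(f"Ne={ne:>3}: He after 100 generations = {het[-1]:.4f}")
```

Passing a list of sizes rather than a single value keeps the sketch usable for populations whose effective size changes between generations.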
[Figure S6. Axes: linkage disequilibrium (LD, 0.0 to 1.0) against generations (t, 0 to 500); series for c = 0.001, 0.01 and 0.1.]
Fig. S6. Decay in theoretical and simulated mean (±SEM) linkage disequilibrium (D)
between neutral alleles at two partially linked loci. The mean (±SEM) decay in
linkage disequilibrium was calculated over 100 simulations for 500 generations and is
in good agreement with the theoretically expected values for different values of c. The
theoretical linkage disequilibrium was calculated as $LD = (1 - c)^t$, where c is the
recombination rate and t the generation number.
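As a quick illustration of the theoretical decay in Fig. S6, the sketch below evaluates the geometric decay of linkage disequilibrium for the three recombination rates shown. The function name and the `d0` argument (initial disequilibrium, assumed to be 1.0 as on the figure axis) are assumptions made for this example.

```python
def expected_ld(c, t, d0=1.0):
    """Expected LD after t generations with recombination rate c.

    d0 is the starting disequilibrium (assumed 1.0 here); LD is reduced
    by a factor (1 - c) every generation.
    """
    return d0 * (1.0 - c) ** t


for c in (0.001, 0.01, 0.1):
    values = [expected_ld(c, t) for t in (0, 100, 500)]
    print(f"c={c}: LD at t=0, 100, 500 ->", [round(v, 4) for v in values])
```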
[Figure S7. Axes: fitness (w, 0.990 to 1.000) against cumulative mutation rate (U, $10^{-5}$ to $5 \times 10^{-3}$); series: recessive mutations (h = 0), theoretical and simulated; co-dominant mutations (h = 0.5), theoretical and simulated.]
Fig. S7. Theoretical and simulated mean (±SEM) fitness in populations with recessive
(h=0) and co-dominant (h=0.5) mutations with selection coefficient s=0.1 and total
mutation rate U= 510-3 to 110-5 and effective population size Ne=100. The
equilibrium fitness was reached after 20,000 generations, and the values of 50
simulations were used to calculate the mean (±SEM) fitness. The theoretical predicted
fitness values were calculated as w=e-U, for completely recessive mutations (h=0),
and w=e-2U for co-dominant mutations (h=0.5), where U is the total mutation rate
across the linked region (i.e. haplotype block). Assuming a single-base mutation rate
of μ=110-8 and a linked region of 105 bp (Stenzel et al. 2004), the total mutation rate
would equate to U=110-3. The simulated values approach the theoretically expected
values reasonably well although the fitness values in simulations are marginally
higher than the theoretically expected value. The explanation for this small bias is that
the allele frequency of deleterious mutations in a selection-mutation balance is based
on infinitely large populations. However, due to a low level of inbreeding in finite
populations, the frequency of these mutations can be appreciably less than the
theoretically expected frequency in infinite populations (Crow & Kimura 1970).
Consequently, the equilibrium fitness in the simulated populations is marginally
higher than the theoretically expected fitness value.
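The theoretical fitness values of Fig. S7 follow directly from the total mutation rate. The sketch below is illustrative only: the function names are hypothetical, and only the two dominance values used in the caption (h=0 and h=0.5) are handled.

```python
import math


def total_mutation_rate(per_base_rate, region_length_bp):
    """Total mutation rate U across a linked region (per-base rate x length)."""
    return per_base_rate * region_length_bp


def equilibrium_fitness(u, h):
    """Theoretical equilibrium fitness for the two cases in the caption.

    h = 0   (completely recessive): w = exp(-U)
    h = 0.5 (co-dominant):          w = exp(-2U)
    Other values of h are not covered by the caption and raise an error.
    """
    if h == 0:
        return math.exp(-u)
    if h == 0.5:
        return math.exp(-2.0 * u)
    raise ValueError("only h = 0 and h = 0.5 are covered in this sketch")


u = total_mutation_rate(1e-8, 1e5)  # values assumed in the caption
print("U =", u)
print("w (h=0)   =", equilibrium_fitness(u, 0))
print("w (h=0.5) =", equilibrium_fitness(u, 0.5))
```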
[Figure S8. Axes: effective number of alleles (ne, 2 to 20) against overdominance selection (S, 0.0 to 0.5); series for μ = $10^{-5}$, $10^{-4}$ and $10^{-3}$.]
Fig. S8. Theoretically expected and mean (±SEM) simulated effective number of
alleles (ne) maintained in a population with size Ne=1000 across a range of
overdominant selection coefficients (S) and mutation rates (μ). Populations were
simulated with effective size Ne=1000, overdominant selection coefficients S=0.01,
0.05, 0.1, 0.2 and 0.5, and mutation rates $\mu = 10^{-5}$, $10^{-4}$ and $10^{-3}$. The figure shows that
the simulated values for ne are in good agreement with the theoretically expected
values over the entire range of S and for different values of μ.
The theoretically expected values were calculated using:
$n_e \approx 2(N_e S)^{1/2} / (4.6 \log_{10}\{0.4 / [2N_e\mu / (N_e S)^{1/2}]\})^{1/2}$, when $2N_e\mu / (N_e S)^{1/2} < 0.1$.
When $2N_e\mu / (N_e S)^{1/2} \geq 0.1$, the following approximation was used:
$n_e \approx 3.7 N_e\mu + (N_e S)^{1/2}$.
These equations are derived from equations 9.7.19, 9.7.29 and 9.7.30 in Crow &
Kimura (1970).
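The two approximations for the effective number of alleles can be combined into one function. The sketch below is a hedged illustration: the function name is hypothetical, and it assumes the second approximation applies whenever the ratio 2Neμ / (NeS)^½ is not below 0.1.

```python
import math


def effective_number_of_alleles(ne_pop, s, mu):
    """Approximate effective number of alleles under overdominance.

    ne_pop: effective population size, s: overdominant selection
    coefficient, mu: mutation rate. The switch at a ratio of 0.1 follows
    the caption; using the second formula for all ratios >= 0.1 is an
    assumption of this sketch.
    """
    root = math.sqrt(ne_pop * s)
    ratio = 2.0 * ne_pop * mu / root
    if ratio < 0.1:
        return 2.0 * root / math.sqrt(4.6 * math.log10(0.4 / ratio))
    return 3.7 * ne_pop * mu + root


# Parameter grid of Fig. S8 (Ne = 1000).
for mu in (1e-5, 1e-4, 1e-3):
    row = [effective_number_of_alleles(1000, s, mu) for s in (0.01, 0.05, 0.1, 0.2, 0.5)]
    print(f"mu={mu}:", [round(v, 2) for v in row])
```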
[Figure S9. Axes: linkage disequilibrium (LD, 0.0 to 1.0) against generations (t, 0 to 500); series for E = 0, 0.01, 0.08 and 0.16.]
Fig. S9. Decay in simulated mean (±SEM) linkage disequilibrium (LD) between
haplotype blocks in populations with size Ne=1000. Both haplotypes consist of two
haplotype blocks separated by a recombination hotspot with recombination rate
c=0.01. Each block carried a single (unique) recessive deleterious mutation with
selection coefficients s=0, 0.1, 0.2 and 0.4 and dominance coefficient h=0 (see Figure
2). The solid line represents the theoretical values of LD for two haplotype blocks
without epistatic selection (i.e. haplotype blocks fixed for mutations with s=0).
Epistatic selection of E=0.16 maintains a high level of linkage disequilibrium between
haplotype blocks and can extinguish the recombination hotspot. Note that the
simulated recombination rate (c=0.01) represents an extremely “hot” recombination
spot. The median map distance induced by a hotspot is 0.043 cM (or one crossover
per 2,300 meioses) and the hottest identified in the human genome is 1.2 cM (one
crossover per 80 meioses, i.e. c=0.012) (The International HapMap Consortium
2007).
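The conversion between map distance and crossover frequency quoted above can be checked with a short calculation. This sketch assumes that, for short distances, 1 cM corresponds to a recombination fraction of 0.01; the function names are illustrative.

```python
def recombination_fraction(map_distance_cm):
    """Recombination fraction for a short map distance (assumes 1 cM ~ 0.01)."""
    return map_distance_cm / 100.0


def meioses_per_crossover(map_distance_cm):
    """Average number of meioses per crossover for a given map distance."""
    return 1.0 / recombination_fraction(map_distance_cm)


for label, cm in (("median hotspot", 0.043), ("hottest hotspot", 1.2)):
    print(f"{label}: c = {recombination_fraction(cm):.5f}, "
          f"about one crossover per {meioses_per_crossover(cm):.0f} meioses")
```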
[Figure S10. Axes: population differentiation (GST, 0.00 to 0.16) against generations (t, 0 to 2000); series: ABC, s = 0.5, no bottleneck; overdominance, S = 0.5, no bottleneck.]
Fig. S10. Population differentiation (GST) with ABC evolution and overdominant
selection in simulated source-sink metapopulations. Selection coefficients are s=0.5
(ABC evolution, solid symbols) and S=0.5 (overdominance, open symbols). The
source population has an infinitely large population size (N=∞), and the sink
population has a constant size N=5000 (circles). The migration is unidirectional with
rate 2Nm=1. Overdominant selection resulted in a rapid homogenization of the
gene pools (open circles), whereas populations remained genetically differentiated
with ABC evolution (solid circles). ABC evolution thus appears to be more consistent
with the high level of MHC differentiation commonly observed in vertebrate
populations (see e.g. Muirhead 2001; Richman et al. 2003).
Table S1. Haplotype genealogy of a simulated population subject to ABC evolution
over >3105 generations (data of Fig. 3a). The parameters used were: overdominant
mutation rate μ=10-5, overdominance selection S=0.05, total mutation rate of
completely recessive (h=0) deleterious (s=0.01) mutations U=10-3, and size Ne=1000.
Simulations with incomplete linkage (c=0.001) were run as well and gave
qualitatively similar results (data not shown). The first column shows the
overdominant allele (labelled with the generation number it arose), the second column
the generation it was last observed (rounded to the nearest 100), and the third column
its derived mutant allele by which it was replaced (the derived mutant allele is also
labelled by its generation number). The fourth column shows the number of
generations the parental allele coexisted with its derived mutant, and the fifth column
the total number of generations it existed. The sixth column shows the total number of
mutations the haplotype received at its overdominant gene. The last column shows the
number of deleterious mutations that were fixed in the haplotype when it went extinct.
        Time in generations                                   Mutations
Allele  Extinct by  Replaced by  Time co-existed  Total time  Total no. of overdominant  Total no. of bad
                                 with mutant      existed     mutations received         mutations received
0       12800       11035        1765             12800       103                        1
670     14800       12190        2610             14130       70                         4
1566    11500       9604         1896             9934        54                         1
7922    22500       21770        730              14578       61                         4
9604    44500       41843        2657             34896       162                        8
11035   299000      298318       682              287965      923                        136
12190   19600       17509        2091             7410        24                         4
17509   22800       21461        1339             5291        26                         8
21461   56900       55676        1224             35439       113                        18
21770   57900       57155        745              36130       157                        17
31288   56000       55933        67               24712       83                         18
41843   48500       47307        1193             6657        20                         9
47307   56700       49826        6874             9393        39                         9
49826   109300      107715       1585             59474       263                        22
55676   111600      111199       401              55924       229                        35
55933   75600       75064        536              19667       44                         27
57155   >300400     Extant H2 *  -                243245      968                        131
75064   78400       77667        733              3336        3                          27
77667   86600       86306        294              8933        25                         30
86306   192400      191859       541              106094      329                        83
107715  117200      114704       2496             9485        38                         24
111199  122300      121525       775              11101       54                         38
114704  122600      119364       3236             7896        27                         24
119364  239600      238473       1127             120236      596                        66
121525  130200      129714       486              8675        37                         40
129714  139200      138560       640              9486        46                         47
138560  >300400     Extant H5 *  -                161840      626                        124
191859  >300400     Extant H4 *  -                108541      353                        140
238473  >300400     Extant H3 *  -                61927       265                        97
298318  >300400     Extant H1 *  -                2082        4                          138
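The time columns of Table S1 appear to be linked by simple arithmetic: the co-existence time equals the extinction generation minus the generation in which the replacing mutant arose, and the total time equals the extinction generation minus the generation in which the allele itself arose. This relationship is inferred from the tabulated values rather than stated in the caption. The sketch below checks it for three rows copied from the table; the function name is illustrative.

```python
def row_is_consistent(origin, extinct, replaced_by, coexisted, total):
    """Check the inferred arithmetic between the time columns of one row."""
    return (extinct - replaced_by == coexisted) and (extinct - origin == total)


# (allele, extinct by, replaced by, time co-existed, total time existed)
rows = [
    (0, 12800, 11035, 1765, 12800),
    (670, 14800, 12190, 2610, 14130),
    (11035, 299000, 298318, 682, 287965),
]

for row in rows:
    print(row[0], "consistent" if row_is_consistent(*row) else "inconsistent")
```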
Text S1. Genealogies during ABC evolution.
The genealogies presented in Figure 3 are representative examples taken from many
simulation runs. ABC evolution resulted in genealogies and a pattern of genetic
differentiation that are characteristic of the MHC in two important respects: (1) little
divergence from the ancestral allele, and (2) large genetic differentiation of alleles in
the extant population. Firstly, some extant alleles have diverged only slightly from the
ancestral type. For example, haplotype H1 in Figure 3a has diverged from the
ancestral type by only two mutations (i.e. steps in the genealogy) and H3 has diverged
by three mutations. This compares to 17 mutations for the least diverged allele in the
overdominant genealogy (Fig. 3b).
Secondly, despite this high level of genetic conservation, ABC evolution
resulted in considerable genetic differentiation in the extant population. The
haplotypes in Figure 3a have diverged from each other by a combined total of 2 + 3 +
10 + 8 + 9 - 2 = 30 mutations by generation 300,000. Note that H3 and H4 share
coancestry and have two mutations in common (the overdominant mutations in
generations 1566 and 9604); hence, two mutations were deducted. The genetic
variation in the population with a gene under overdominant selection is considerably
lower, and alleles in the genealogy of Figure 3b have diverged by a combined total of
six mutations after 300,000 generations.
Some alleles are reminiscent of trans-species polymorphism. For example,
overdominant allele 11035 (in haplotype H1, Fig. 3a) persisted for 287965
generations. The long persistence time is particularly remarkable given the relatively
small population size (Ne=1000) and high overdominant mutation rate ($\mu = 10^{-5}$). With
high mutation rate and small Ne, the rate of allelic turnover increases. Nevertheless,
allele 11035 survived 923 overdominant mutations before it was replaced by the invading
mutant allele 298318 at generation 299000 (Table S1).
Text S2. Demographic scenarios simulated.
Aguilar et al. (2004) found that “a severe bottleneck (to an effective size of 10
individuals or fewer for one or two generations, followed by ≈12 generations of
population growth) was necessary to explain near monomorphism at the 18
[microsatellite] loci”. I simulated various bottleneck scenarios, and found that a
two-generation single-pair bottleneck with subsequent population growth (r=0.28;
Aguilar et al. 2004) to a final size of $N = 10^4$ was a realistic bottleneck scenario. During
this scenario, a neutral microsatellite locus with initial heterozygosity He=0.36 and
stepwise mutation rate $\mu = 10^{-4}$ becomes monomorphic in 85% of simulations. The
initial heterozygosity was based on the observed mean heterozygosity at 18
microsatellite loci in Santa Catalina, the most polymorphic fox population analysed
by Aguilar et al. (2004). The mutation rate and model were also taken from Aguilar et
al. (2004). With this demographic scenario, the probability that 18 (unlinked)
microsatellite loci become monomorphic equals $p = 0.85^{18} = 0.053$.
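The joint probability for unlinked loci is the per-locus probability raised to the number of loci. The following sketch reproduces the calculation; the function name is illustrative and the per-locus value of 0.85 is the rounded simulation result quoted above, so the output is close to, rather than identical with, the reported figure.

```python
def prob_all_monomorphic(p_single_locus, n_loci):
    """Probability that all n unlinked loci become monomorphic.

    Assumes the loci are independent, so the per-locus probabilities multiply.
    """
    return p_single_locus ** n_loci


p = prob_all_monomorphic(0.85, 18)
print(f"p = {p:.4f}")
```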
References for Supporting Information
Aguilar A, Roemer G, Debenham S, Binns M, Garcelon D, et al. (2004) High MHC
diversity maintained by balancing selection in an otherwise genetically
monomorphic mammal. Proc Natl Acad Sci USA 101: 3490–3494.
doi:10.1073/pnas.0306582101
Crow JF, Kimura M (1970) An introduction to population genetics theory. Harper &
Row Publishers, New York.
Muirhead CA (2001) Consequences of population structure on genes under balancing
selection. Evolution 55: 1532–1541.
Richman AD, Herrera LG, Nash D, Schierup MH (2003) Relative roles of mutation
and recombination in generating allelic polymorphism at an MHC class II locus
in Peromyscus maniculatus. Genet Res Camb 82: 89–99.
Stenzel A, Lu T, Koch WA, Hampe J, Guenther SM, et al. (2004) Patterns of linkage
disequilibrium in the MHC region on human chromosome 6p. Hum Genet 114:
377–385. doi:10.1007/s00439-003-1075-5
The International HapMap Consortium (2007) A second generation human haplotype
map of over 3.1 million SNPs. Nature 449: 851–862. doi:10.1038/nature06258
van Oosterhout C, Joyce DA, Cummings SM, Blais J, Barson NJ, et al. (2006)
Balancing selection, random genetic drift and genetic variation at the Major
Histocompatibility Complex (MHC) in two wild populations of guppies
(Poecilia reticulata). Evolution 60: 2562–2574.