Chapter 6. Multi-locus coevolution, epistasis, and linkage disequilibrium
Biological Motivation
Obviously, most coevolutionary interactions involve more than a single locus. Here we develop a basic framework for studying
two-locus systems, introducing the concepts of epistasis, recombination, and linkage disequilibrium. After
studying how coevolution proceeds in a simple two-locus system (motivated by???) we move on to
explore ??? INTRODUCE EPISTASIS AND LINKAGE DISEQUILIBRIUM
Key Questions:
• What patterns of epistasis are likely to be generated by species interactions?
• How do these patterns of epistasis influence the dynamics and outcome of coevolution?
• What patterns of linkage disequilibrium do we expect to emerge in coevolving systems?
Building a 2-locus model of coevolution
Our goal is to develop the simplest possible model that captures the potentially important
consequences of the multi-locus gene-for-gene interactions for coevolution between X and X. Clearly,
the simplest starting point is to focus on only a single pair of loci and haploid sexual species. Within
haploid sexuals, recombination occurs in a transient diploid phase but selection occurs in the haploid
phase. Thus, we avoid the complexities of diploidy that we struggled with in the previous chapter. Of
course, ignoring diploidy also comes at the cost of reduced realism since both XX and XX are, indeed,
diploid species.
We imagine that flax and rust individuals encounter one another at random, and that when a Flax individual
with genotype i encounters a rust individual with genotype j, an infection results with probability $\alpha_{i,j}$. If
we assume that infection has negative fitness consequences for the flax and positive fitness
consequences for the rust, the fitness of the four possible Flax genotypes is given by:
$$W_{X,AB} = 1 - s_X \left( Y_{AB}\,\alpha_{AB,AB} + Y_{Ab}\,\alpha_{AB,Ab} + Y_{aB}\,\alpha_{AB,aB} + Y_{ab}\,\alpha_{AB,ab} \right) \tag{1a}$$
$$W_{X,Ab} = 1 - s_X \left( Y_{AB}\,\alpha_{Ab,AB} + Y_{Ab}\,\alpha_{Ab,Ab} + Y_{aB}\,\alpha_{Ab,aB} + Y_{ab}\,\alpha_{Ab,ab} \right) \tag{1b}$$
$$W_{X,aB} = 1 - s_X \left( Y_{AB}\,\alpha_{aB,AB} + Y_{Ab}\,\alpha_{aB,Ab} + Y_{aB}\,\alpha_{aB,aB} + Y_{ab}\,\alpha_{aB,ab} \right) \tag{1c}$$
$$W_{X,ab} = 1 - s_X \left( Y_{AB}\,\alpha_{ab,AB} + Y_{Ab}\,\alpha_{ab,Ab} + Y_{aB}\,\alpha_{ab,aB} + Y_{ab}\,\alpha_{ab,ab} \right) \tag{1d}$$
where $Y_j$ is the frequency of rust genotype j and $s_X$ scales the fitness cost of infection to the flax.
Similarly, the fitness of the four possible Rust genotypes is given by:
$$W_{Y,AB} = 1 - s_Y \left( 1 - X_{AB}\,\alpha_{AB,AB} - X_{Ab}\,\alpha_{Ab,AB} - X_{aB}\,\alpha_{aB,AB} - X_{ab}\,\alpha_{ab,AB} \right) \tag{2a}$$
$$W_{Y,Ab} = 1 - s_Y \left( 1 - X_{AB}\,\alpha_{AB,Ab} - X_{Ab}\,\alpha_{Ab,Ab} - X_{aB}\,\alpha_{aB,Ab} - X_{ab}\,\alpha_{ab,Ab} \right) \tag{2b}$$
$$W_{Y,aB} = 1 - s_Y \left( 1 - X_{AB}\,\alpha_{AB,aB} - X_{Ab}\,\alpha_{Ab,aB} - X_{aB}\,\alpha_{aB,aB} - X_{ab}\,\alpha_{ab,aB} \right) \tag{2c}$$
$$W_{Y,ab} = 1 - s_Y \left( 1 - X_{AB}\,\alpha_{AB,ab} - X_{Ab}\,\alpha_{Ab,ab} - X_{aB}\,\alpha_{aB,ab} - X_{ab}\,\alpha_{ab,ab} \right) \tag{2d}$$
Mathematica Resources: http://www.webpages.uidaho.edu/~snuismer/Nuismer_Lab/the_theory_of_coevolution.htm
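The fitness expressions (1) and (2) can be sketched numerically as follows. This is only an illustrative sketch, not the chapter's Mathematica code: the function names, the genotype ordering, and the convention that `alpha[i][j]` takes the flax genotype as its first index and the rust genotype as its second (mirroring the subscripts of α) are assumptions made for the example.

```python
GENOTYPES = ("AB", "Ab", "aB", "ab")  # assumed ordering of the four haploid genotypes

def flax_fitness(Y, alpha, s_X):
    """Equations (1a-d): W_X,i = 1 - s_X * sum_j Y_j * alpha[i][j]."""
    return [1.0 - s_X * sum(Y[j] * alpha[i][j] for j in range(4)) for i in range(4)]

def rust_fitness(X, alpha, s_Y):
    """Equations (2a-d): W_Y,j = 1 - s_Y * (1 - sum_i X_i * alpha[i][j])."""
    return [1.0 - s_Y * (1.0 - sum(X[i] * alpha[i][j] for i in range(4))) for j in range(4)]

# Hypothetical example: every encounter leads to infection (alpha = 1 everywhere).
alpha_all = [[1.0] * 4 for _ in range(4)]
X = [0.25, 0.25, 0.25, 0.25]
Y = [0.25, 0.25, 0.25, 0.25]
W_X = flax_fitness(Y, alpha_all, s_X=0.1)  # every flax genotype pays the full cost s_X
W_Y = rust_fitness(X, alpha_all, s_Y=0.1)  # every rust genotype always infects
```

In this deliberately uninformative case all flax genotypes share the same fitness, and so do all rust genotypes, so selection cannot change genotype frequencies; differences among genotypes arise only once α varies across genotype pairs.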
Now, if we assume that the probability of survival to mating for the various Flax and Rust genotypes
depends on these fitnesses, we can calculate the frequency of each genotype after selection but prior to
random mating. As before, we can calculate these frequencies by multiplying the current frequency by
its relative fitness. For the Flax, this yields the following expressions:
$$X'_{AB} = \frac{X_{AB}\,W_{X,AB}}{\bar{W}_X} \tag{3a}$$
$$X'_{Ab} = \frac{X_{Ab}\,W_{X,Ab}}{\bar{W}_X} \tag{3b}$$
$$X'_{aB} = \frac{X_{aB}\,W_{X,aB}}{\bar{W}_X} \tag{3c}$$
$$X'_{ab} = \frac{X_{ab}\,W_{X,ab}}{\bar{W}_X} \tag{3d}$$
where, as usual, the symbol $\bar{W}_X$ is the population mean fitness of species X and is given by:
$$\bar{W}_X = X_{AB} W_{X,AB} + X_{Ab} W_{X,Ab} + X_{aB} W_{X,aB} + X_{ab} W_{X,ab} \tag{3e}$$
The same procedure can now be applied to the rust population to calculate the frequency of two-locus
genotypes there after selection but prior to mating:
$$Y'_{AB} = \frac{Y_{AB}\,W_{Y,AB}}{\bar{W}_Y} \tag{4a}$$
$$Y'_{Ab} = \frac{Y_{Ab}\,W_{Y,Ab}}{\bar{W}_Y} \tag{4b}$$
$$Y'_{aB} = \frac{Y_{aB}\,W_{Y,aB}}{\bar{W}_Y} \tag{4c}$$
$$Y'_{ab} = \frac{Y_{ab}\,W_{Y,ab}}{\bar{W}_Y} \tag{4d}$$
where, as usual, the symbol $\bar{W}_Y$ is the population mean fitness of species Y and is given by:
$$\bar{W}_Y = Y_{AB} W_{Y,AB} + Y_{Ab} W_{Y,Ab} + Y_{aB} W_{Y,aB} + Y_{ab} W_{Y,ab} \tag{4e}$$
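The selection step in equations (3) and (4) is the same operation in both species, so it can be sketched as a single helper. The function name `select` and the numerical values below are hypothetical and chosen only for illustration.

```python
def select(freqs, fitness):
    """Equations (3)-(4): frequency after selection = frequency * fitness / mean fitness."""
    w_bar = sum(f * w for f, w in zip(freqs, fitness))  # mean fitness, equations (3e)/(4e)
    return [f * w / w_bar for f, w in zip(freqs, fitness)], w_bar

# Hypothetical numbers, genotype order (AB, Ab, aB, ab).
X = [0.4, 0.3, 0.2, 0.1]
W_X = [1.0, 0.95, 0.95, 0.9]
X_prime, w_bar = select(X, W_X)
```

Dividing by the mean fitness guarantees that the post-selection frequencies still sum to one; genotypes with above-average fitness increase and those with below-average fitness decrease.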
OK, so now we know what the frequencies of the various genotypes are just before mating ensues. How
can we now move forward to incorporate changes to genotype frequencies that accrue during the
process of mating?
If we are willing to assume that both Flax and Rust mate at random and have quite large
population sizes, we can derive basic expressions for changes in genotype frequencies. The long and
tedious way to go about this is to first tabulate the frequency of offspring with various genotypes that
are produced by all possible combinations of parents (Table 1). Because the A and B loci can recombine
during the transient diploid phase, these offspring frequencies depend on the recombination rate, r, which is
the probability that an offspring inherits its alleles at the A and B loci from different parents.
Table 1. Genotype frequencies produced by random matings

Maternal|Paternal   Frequency      Offspring genotype
genotypes           of mating      AB           Ab           aB           ab
AB|AB               X_AB X_AB      1            0            0            0
AB|Ab               X_AB X_Ab      1/2          1/2          0            0
AB|aB               X_AB X_aB      1/2          0            1/2          0
AB|ab               X_AB X_ab      (1 - r)/2    r/2          r/2          (1 - r)/2
Ab|AB               X_Ab X_AB      1/2          1/2          0            0
Ab|Ab               X_Ab X_Ab      0            1            0            0
Ab|aB               X_Ab X_aB      r/2          (1 - r)/2    (1 - r)/2    r/2
Ab|ab               X_Ab X_ab      0            1/2          0            1/2
aB|AB               X_aB X_AB      1/2          0            1/2          0
aB|Ab               X_aB X_Ab      r/2          (1 - r)/2    (1 - r)/2    r/2
aB|aB               X_aB X_aB      0            0            1            0
aB|ab               X_aB X_ab      0            0            1/2          1/2
ab|AB               X_ab X_AB      (1 - r)/2    r/2          r/2          (1 - r)/2
ab|Ab               X_ab X_Ab      0            1/2          0            1/2
ab|aB               X_ab X_aB      0            0            1/2          1/2
ab|ab               X_ab X_ab      0            0            0            1
What Table 1 provides us with is the raw material for calculating the frequency of the various genotypes
in the offspring generation. All we need to do now is sum up the entries in each column, weighting each
entry by the frequency with which the two relevant parental genotypes encounter one another at
random and mate. Mathematically, this amounts to evaluating the following expression for each of the
four possible offspring genotypes, i:
$$X''_i = \sum_{j=1}^{4} \sum_{k=1}^{4} X'_j X'_k \, \Pi_{X,\,j+k \to i} \tag{5a}$$
and the following expression for the four possible offspring genotypes in Rust:
$$Y''_i = \sum_{j=1}^{4} \sum_{k=1}^{4} Y'_j Y'_k \, \Pi_{Y,\,j+k \to i} \tag{5b}$$
where $\Pi_{X,\,j+k \to i}$ and $\Pi_{Y,\,j+k \to i}$ are the probabilities that two parents with genotypes j and k produce an
offspring of genotype i within the Flax and Rust populations, respectively, and are given in the offspring
genotype columns of Table 1.
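The double sum in equations (5) can be sketched directly by generating the rows of Table 1 from the recombination rate rather than typing them in. This is an illustrative sketch only; the function names and the example frequencies are hypothetical, and genotypes are represented as two-character strings so that the first character is the allele at the A locus and the second the allele at the B locus.

```python
GENOTYPES = ("AB", "Ab", "aB", "ab")

def offspring_probs(mother, father, r):
    """One row of Table 1: probability of each offspring genotype from one mating."""
    probs = dict.fromkeys(GENOTYPES, 0.0)
    probs[mother] += (1 - r) / 2           # non-recombinant, maternal genotype
    probs[father] += (1 - r) / 2           # non-recombinant, paternal genotype
    probs[mother[0] + father[1]] += r / 2  # recombinant: maternal A locus, paternal B locus
    probs[father[0] + mother[1]] += r / 2  # recombinant: paternal A locus, maternal B locus
    return probs

def mate(freqs, r):
    """Equation (5): sum Table 1 entries weighted by the random-mating frequencies."""
    new = dict.fromkeys(GENOTYPES, 0.0)
    for j, mother in enumerate(GENOTYPES):
        for k, father in enumerate(GENOTYPES):
            for child, p in offspring_probs(mother, father, r).items():
                new[child] += freqs[j] * freqs[k] * p
    return [new[g] for g in GENOTYPES]

X_prime = [0.4, 0.1, 0.1, 0.4]  # hypothetical post-selection frequencies
X_next = mate(X_prime, r=0.1)
```

Each row returned by `offspring_probs` sums to one, as every row of Table 1 must, and the offspring frequencies returned by `mate` therefore also sum to one.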
Although equations (5) help to see, mechanistically speaking, how the genotype frequencies
within one generation are translated into those of the next through the process of segregation and
recombination, they are quite clunky and not terribly insightful. Fortunately, these equations can be
greatly simplified and re-expressed in a way that is much easier to implement from a practical
standpoint. Specifically, plugging away at equations (5) algebraically for a while (or more realistically, a
very long while) allows them to be re-written as:
$$X''_{AB} = X'_{AB} - r_X D'_X \tag{6a}$$
$$X''_{Ab} = X'_{Ab} + r_X D'_X \tag{6b}$$
$$X''_{aB} = X'_{aB} + r_X D'_X \tag{6c}$$
$$X''_{ab} = X'_{ab} - r_X D'_X \tag{6d}$$
in the Flax and as:
$$Y''_{AB} = Y'_{AB} - r_Y D'_Y \tag{7a}$$
$$Y''_{Ab} = Y'_{Ab} + r_Y D'_Y \tag{7b}$$
$$Y''_{aB} = Y'_{aB} + r_Y D'_Y \tag{7c}$$
$$Y''_{ab} = Y'_{ab} - r_Y D'_Y \tag{7d}$$
in the rust. In these equations, $D_X$ and $D_Y$ quantify linkage disequilibrium, a measure of the statistical
association (i.e., the covariance) between alleles at the A and B loci. Specifically, $D'_X = X'_{AB} X'_{ab} - X'_{Ab} X'_{aB}$
and $D'_Y = Y'_{AB} Y'_{ab} - Y'_{Ab} Y'_{aB}$ such that linkage disequilibrium is positive if there is an excess of AB
and ab genotypes within a population and negative if it is, instead, the Ab and aB genotypes that are in
excess. A key insight provided by equations (6-7) is that the change in genotype frequencies that occurs
in response to random mating depends entirely on the rate of recombination. If no recombination
occurs, genotype frequencies within the offspring population remain identical to those within the
parental population. If, instead, recombination occurs, genotype frequencies in the offspring generation
differ from those in the parental generation by an amount proportional to linkage disequilibrium.
Clearly, then, recombination can influence coevolution only in cases where coevolutionary selection, or
some other evolutionary force, acts to create linkage disequilibrium within populations of interacting
species.
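The way recombination erodes linkage disequilibrium under random mating can be checked numerically. The sketch below is illustrative only: the function names and starting frequencies are hypothetical, and it applies the result of summing the Table 1 entries, namely that the AB and ab genotypes each lose an amount r·D while the Ab and aB genotypes each gain it.

```python
def linkage_disequilibrium(f):
    """D = f_AB * f_ab - f_Ab * f_aB for frequencies ordered (AB, Ab, aB, ab)."""
    return f[0] * f[3] - f[1] * f[2]

def recombine(f, r):
    """Random mating with recombination rate r: AB and ab lose r*D, Ab and aB gain r*D
    (what summing the Table 1 entries over all matings gives)."""
    d = linkage_disequilibrium(f)
    return [f[0] - r * d, f[1] + r * d, f[2] + r * d, f[3] - r * d]

f = [0.4, 0.1, 0.1, 0.4]  # hypothetical frequencies with an excess of AB and ab
history = []
for _ in range(5):
    history.append(linkage_disequilibrium(f))
    f = recombine(f, r=0.1)
```

In the absence of selection, each round of random mating multiplies D by (1 - r) while leaving allele frequencies untouched, which is why recombination matters only when some other force keeps regenerating disequilibrium.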
We are now at a point where we have successfully described how genotype frequencies change
over the course of a single generation. To maintain some generality, let’s wait to substitute in the
specific values for fitness corresponding to our GFG model, and simply express how genotype
frequencies change in terms of arbitrary fitness values, W. Specifically, substituting (3) into (6) and
changing from recursion equations to difference equations, yields the following expressions for the
change in host genotype frequencies that occurs over the course of a single generation:
$$\Delta X_{AB} = \frac{r_X \left( W_{X,aB} W_{X,Ab} X_{aB} X_{Ab} - W_{X,ab} W_{X,AB} X_{ab} X_{AB} \right) + X_{AB} \left( W_{X,AB} - \bar{W}_X \right) \bar{W}_X}{\bar{W}_X^{2}} \tag{8a}$$
$$\Delta X_{Ab} = \frac{r_X \left( W_{X,ab} W_{X,AB} X_{ab} X_{AB} - W_{X,aB} W_{X,Ab} X_{aB} X_{Ab} \right) + X_{Ab} \left( W_{X,Ab} - \bar{W}_X \right) \bar{W}_X}{\bar{W}_X^{2}} \tag{8b}$$
$$\Delta X_{aB} = \frac{r_X \left( W_{X,ab} W_{X,AB} X_{ab} X_{AB} - W_{X,aB} W_{X,Ab} X_{aB} X_{Ab} \right) + X_{aB} \left( W_{X,aB} - \bar{W}_X \right) \bar{W}_X}{\bar{W}_X^{2}} \tag{8c}$$
$$\Delta X_{ab} = \frac{r_X \left( W_{X,aB} W_{X,Ab} X_{aB} X_{Ab} - W_{X,ab} W_{X,AB} X_{ab} X_{AB} \right) + X_{ab} \left( W_{X,ab} - \bar{W}_X \right) \bar{W}_X}{\bar{W}_X^{2}} \tag{8d}$$
Equations for the pathogen species, Y, are essentially identical and so are not shown. We are now at a
point where we could, if we wished, simply simulate the process of coevolution by plugging in the values
for fitness we derived previously for the GFG system (equations X) and iterating equations (X).
Although this approach would surely provide us with some insights into the process of coevolution, a
much more insightful and elegant approach is to first make a change of variables (Appendix 3) that
allows us to focus on allele frequencies and linkage disequilibrium rather than genotype frequencies. In
addition to facilitating biological interpretation and intuition, this change of variables simplifies our
model by reducing the number of variables we follow in each species from four in equations (X) to three,
which is the actual number of degrees of freedom in the system.
In order to make the change of variables from genotype frequencies to allele frequencies and
linkage disequilibrium, we first need to clearly define the new variables. Specifically, we define allele
frequencies:
$$p_{X,A} = X_{AB} + X_{Ab} \tag{9a}$$
$$p_{X,B} = X_{AB} + X_{aB} \tag{9b}$$
$$p_{Y,A} = Y_{AB} + Y_{Ab} \tag{9c}$$
$$p_{Y,B} = Y_{AB} + Y_{aB} \tag{9d}$$
and linkage disequilibrium:
$$D_X = X_{AB} X_{ab} - X_{Ab} X_{aB} \tag{10a}$$
$$D_Y = Y_{AB} Y_{ab} - Y_{Ab} Y_{aB} \tag{10b}$$
for both of the interacting species. The next step in our change of variables is to write down new
recursions that capture the way in which our new variables change over the course of a single
generation. The easiest way to do this is to just substitute the predicted values for the genotype
frequencies in the next generation (e.g., $X''_{AB}$, $X''_{Ab}$, etc.) into expressions (9-10), yielding:
$$p''_{X,A} = \frac{W_{X,Ab} X_{Ab} + W_{X,AB} X_{AB}}{\bar{W}_X} \tag{11a}$$
$$p''_{X,B} = \frac{W_{X,aB} X_{aB} + W_{X,AB} X_{AB}}{\bar{W}_X} \tag{11b}$$
$$D''_X = \frac{\left( W_{X,aB} W_{X,Ab} X_{aB} X_{Ab} - W_{X,ab} W_{X,AB} X_{ab} X_{AB} \right) (r_X - 1)}{\bar{W}_X^{2}} \tag{11c}$$
$$p''_{Y,A} = \frac{W_{Y,Ab} Y_{Ab} + W_{Y,AB} Y_{AB}}{\bar{W}_Y} \tag{12a}$$
$$p''_{Y,B} = \frac{W_{Y,aB} Y_{aB} + W_{Y,AB} Y_{AB}}{\bar{W}_Y} \tag{12b}$$
$$D''_Y = \frac{\left( W_{Y,aB} W_{Y,Ab} Y_{aB} Y_{Ab} - W_{Y,ab} W_{Y,AB} Y_{ab} Y_{AB} \right) (r_Y - 1)}{\bar{W}_Y^{2}} \tag{12c}$$
Obviously, we still have a bit of a problem! Our equations now contain a mix of old and new variables,
which can never be a good thing. The way to move forward is to recognize that the genotype
frequencies appearing in the right hand sides of the equations can be re-written using definitions (9-10)
in the following way:
$$X_{AB} = p_{X,A}\, p_{X,B} + D_X \tag{13a}$$
$$X_{Ab} = p_{X,A}\, q_{X,B} - D_X \tag{13b}$$
$$X_{aB} = q_{X,A}\, p_{X,B} - D_X \tag{13c}$$
$$X_{ab} = q_{X,A}\, q_{X,B} + D_X \tag{13d}$$
$$Y_{AB} = p_{Y,A}\, p_{Y,B} + D_Y \tag{14a}$$
$$Y_{Ab} = p_{Y,A}\, q_{Y,B} - D_Y \tag{14b}$$
$$Y_{aB} = q_{Y,A}\, p_{Y,B} - D_Y \tag{14c}$$
$$Y_{ab} = q_{Y,A}\, q_{Y,B} + D_Y \tag{14d}$$
where each $q$ is the frequency of the alternative allele at the corresponding locus (e.g., $q_{X,A} = 1 - p_{X,A}$).
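The change of variables in definitions (9-10) and its inverse in equations (13-14) can be sketched as a pair of functions and checked with a round trip. The function names and the example frequencies are hypothetical; the sketch assumes q = 1 - p at each locus.

```python
def to_alleles(f):
    """Definitions (9)-(10): genotype frequencies (AB, Ab, aB, ab) -> (p_A, p_B, D)."""
    p_A = f[0] + f[1]
    p_B = f[0] + f[2]
    D = f[0] * f[3] - f[1] * f[2]
    return p_A, p_B, D

def to_genotypes(p_A, p_B, D):
    """Equations (13)-(14): (p_A, p_B, D) -> genotype frequencies, with q = 1 - p."""
    q_A, q_B = 1 - p_A, 1 - p_B
    return [p_A * p_B + D, p_A * q_B - D, q_A * p_B - D, q_A * q_B + D]

f = [0.4, 0.3, 0.2, 0.1]  # hypothetical genotype frequencies
p_A, p_B, D = to_alleles(f)
f_back = to_genotypes(p_A, p_B, D)
```

The round trip recovers the original four frequencies from only three numbers, which reflects the fact that the genotype frequencies must sum to one and so carry just three degrees of freedom.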
Substituting (13 and 14) into (11 and 12) and doing a bit of algebra allows us to finally complete our
change of variables and arrive at a set of equations expressed entirely in terms of the new variables.
$$p''_{X,A} = \frac{p_{X,A} \left( q_{X,B} W_{X,Ab} + p_{X,B} W_{X,AB} \right) + \left( W_{X,AB} - W_{X,Ab} \right) D_X}{\bar{W}_X} \tag{15a}$$
$$p''_{X,B} = \frac{p_{X,B} \left( q_{X,A} W_{X,aB} + p_{X,A} W_{X,AB} \right) + \left( W_{X,AB} - W_{X,aB} \right) D_X}{\bar{W}_X} \tag{15b}$$
$$D''_X = \frac{\left[ W_{X,aB} W_{X,Ab} (q_{X,A} p_{X,B} - D_X)(p_{X,A} q_{X,B} - D_X) - W_{X,ab} W_{X,AB} (q_{X,A} q_{X,B} + D_X)(p_{X,A} p_{X,B} + D_X) \right] (r_X - 1)}{\bar{W}_X^{2}} \tag{15c}$$
and,
$$p''_{Y,A} = \frac{p_{Y,A} \left( q_{Y,B} W_{Y,Ab} + p_{Y,B} W_{Y,AB} \right) + \left( W_{Y,AB} - W_{Y,Ab} \right) D_Y}{\bar{W}_Y} \tag{16a}$$
$$p''_{Y,B} = \frac{p_{Y,B} \left( q_{Y,A} W_{Y,aB} + p_{Y,A} W_{Y,AB} \right) + \left( W_{Y,AB} - W_{Y,aB} \right) D_Y}{\bar{W}_Y} \tag{16b}$$
$$D''_Y = \frac{\left[ W_{Y,aB} W_{Y,Ab} (q_{Y,A} p_{Y,B} - D_Y)(p_{Y,A} q_{Y,B} - D_Y) - W_{Y,ab} W_{Y,AB} (q_{Y,A} q_{Y,B} + D_Y)(p_{Y,A} p_{Y,B} + D_Y) \right] (r_Y - 1)}{\bar{W}_Y^{2}} \tag{16c}$$
We now have a set of equations describing how allele frequencies and linkage disequilibrium evolve in
response to natural selection and random mating over the course of a single generation. Our last move
is to re-write these recursions as difference equations by subtracting their values at the start of the
generation from (15-16):
$$\Delta p_{X,A} = \frac{p_{X,A} \left( p_{X,B} W_{X,AB} + q_{X,B} W_{X,Ab} - \bar{W}_X \right) + \left( W_{X,AB} - W_{X,Ab} \right) D_X}{\bar{W}_X} \tag{17a}$$
$$\Delta p_{X,B} = \frac{p_{X,B} \left( q_{X,A} W_{X,aB} + p_{X,A} W_{X,AB} - \bar{W}_X \right) + \left( W_{X,AB} - W_{X,aB} \right) D_X}{\bar{W}_X} \tag{17b}$$
$$\Delta D_X = \frac{\left[ W_{X,aB} W_{X,Ab} (q_{X,A} p_{X,B} - D_X)(p_{X,A} q_{X,B} - D_X) - W_{X,ab} W_{X,AB} (q_{X,A} q_{X,B} + D_X)(p_{X,A} p_{X,B} + D_X) \right] (r_X - 1) - D_X \bar{W}_X^{2}}{\bar{W}_X^{2}} \tag{17c}$$
and,
$$\Delta p_{Y,A} = \frac{p_{Y,A} \left( q_{Y,B} W_{Y,Ab} + p_{Y,B} W_{Y,AB} - \bar{W}_Y \right) + \left( W_{Y,AB} - W_{Y,Ab} \right) D_Y}{\bar{W}_Y} \tag{18a}$$
$$\Delta p_{Y,B} = \frac{p_{Y,B} \left( q_{Y,A} W_{Y,aB} + p_{Y,A} W_{Y,AB} - \bar{W}_Y \right) + \left( W_{Y,AB} - W_{Y,aB} \right) D_Y}{\bar{W}_Y} \tag{18b}$$
$$\Delta D_Y = \frac{\left[ W_{Y,aB} W_{Y,Ab} (q_{Y,A} p_{Y,B} - D_Y)(p_{Y,A} q_{Y,B} - D_Y) - W_{Y,ab} W_{Y,AB} (q_{Y,A} q_{Y,B} + D_Y)(p_{Y,A} p_{Y,B} + D_Y) \right] (r_Y - 1) - D_Y \bar{W}_Y^{2}}{\bar{W}_Y^{2}} \tag{18c}$$
PHHHEEEEWWWWYYYY! We have done it. With the bulk of the tedious algebraic book-keeping behind
us, we can finally move on.
Analyzing the model
We can now transform the general two-locus model described by difference equations (17-18)
into a specific model of coevolution between Flax and Flax-Rust by replacing the general values of
fitness W with their specific forms given by equations (X) and the values of the interaction matrix appropriate for
our gene-for-gene model:
$$\alpha = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{19}$$
Here, flax genotypes are in columns {AB, Ab, aB, ab} and rust genotypes are in rows {AB, Ab, aB, ab}, and
the interaction matrix depicts the outcome of the classical gene-for-gene model where host
resistance genes (A and B) are able to recognize parasite avirulence genes (a and b), but not parasite
virulence genes (A and B).
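The gene-for-gene rule just described can be turned into an interaction matrix programmatically, which is a convenient guard against transcription errors. This is a sketch under stated assumptions: the function name is hypothetical, upper-case letters stand for host resistance alleles and parasite virulence alleles as in the text, and the nested list is indexed with the flax genotype first and the rust genotype second (following the subscripts of α), which may be the transpose of a printed layout that puts rust genotypes in rows.

```python
GENOTYPES = ("AB", "Ab", "aB", "ab")

def gfg_infects(host, rust):
    """Gene-for-gene rule: infection fails if, at any locus, a host resistance allele
    (upper case) meets a rust avirulence allele (lower case)."""
    for h, p in zip(host, rust):
        if h.isupper() and p.islower():
            return 0
    return 1

# alpha[i][j]: flax genotype i (first index) encounters rust genotype j (second index).
alpha = [[gfg_infects(h, p) for p in GENOTYPES] for h in GENOTYPES]
```

Read this way, a flax individual carrying both resistance genes is infected only by a rust carrying both virulence genes, whereas a flax individual with no resistance genes is infected by every rust genotype.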
Even after working long and hard to simplify the resulting equations, however, I couldn’t get them to fit
on a single line of this page. As a general rule of thumb, if your equation doesn’t fit on a single line, you
aren’t going to learn much from it. So, what can we do? One option is to charge straight ahead and
simply rely on our computer to simulate coevolution by iterating our recursion equations for a large
number of parameter combinations. Although there is nothing wrong with this approach, we can
actually gain quite a bit of biological insight by using a bit more mathematical finesse, and developing
approximations that assume selection is not too strong and that recombination occurs with some
reasonable frequency. To be a bit more specific, one way to proceed is to pursue a Quasi-Linkage
Equilibrium (QLE) approximation (REFS).
Although a great deal of difficult math has gone into rigorous investigation of when
and where the QLE approximation can be applied (REFS), our approach here will be more informal and, I
hope, more practical for those who simply want to learn something about biology rather than
mathematics. As a general rule of thumb, anytime selection is not too strong (less than a 1% difference
in fitness among genotypes) and recombination is of a larger magnitude than selection (if the fitness
difference among genotypes is 1%, recombination should be at least 0.1), linkage
disequilibrium will change much more rapidly than allele frequencies and will, in fact, approach a
quasi-equilibrium state where its value is small, and a function of the current allele frequencies within the
population. What this means to us is that if selection is weak (< 1%) and recombination is frequent
(> 10%), linkage disequilibrium will be as small as selection (< 0.01). As a result, as long as we are willing
to tolerate some small amount of inaccuracy in our prediction, we can ignore all terms in our difference
equations that include things like $s^2$, $D^2$, and $sD$ because these terms will all be very small and quite
negligible. To be a bit more formal, if we are willing to assume recombination is frequent, and that
selection is weak and of some small order $\varepsilon$, linkage disequilibrium will also be weak and of order $\varepsilon$,
allowing us to ignore all terms of order $\varepsilon^2$ and higher. Clearly, what this means is that our QLE
approximation will be more and more accurate as the difference in fitness among genotypes decreases,
because the terms we ignore become ever smaller in relation to the terms we keep.
Returning to the specific case of coevolution between Flax and Flax Rust, what we are going to
assume is that the fitness consequences of the interaction are relatively weak such that $s_X$ and $s_Y$ are
both of small order $\varepsilon$, and that recombination within both species is relatively frequent (i.e., > $\varepsilon$). Our
next step in implementing our QLE approximation is to replace each of our difference equations with its
first order Taylor Series Expansion in $\varepsilon$. Using Mathematica, this is an incredibly trivial thing to do, and
yields the following approximate expressions for evolutionary change in the Flax:
$$\Delta p_{X,A} \approx s_X\, p_{X,A}\, q_{X,A}\, q_{Y,A} \left( 1 - p_{X,B}\, q_{Y,B} \right) \tag{20a}$$
$$\Delta p_{X,B} \approx s_X\, p_{X,B}\, q_{X,B}\, q_{Y,B} \left( 1 - p_{X,A}\, q_{Y,A} \right) \tag{20b}$$
$$\Delta D_X \approx -s_X (1 - r_X)\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, q_{Y,A}\, q_{Y,B} - r_X D_X \tag{20c}$$
and Flax Rust:
$$\Delta p_{Y,A} \approx s_Y\, p_{Y,A}\, q_{Y,A}\, p_{X,A} \left( 1 - p_{X,B}\, q_{Y,B} \right) \tag{21a}$$
$$\Delta p_{Y,B} \approx s_Y\, p_{Y,B}\, q_{Y,B}\, p_{X,B} \left( 1 - p_{X,A}\, q_{Y,A} \right) \tag{21b}$$
$$\Delta D_Y \approx s_Y (1 - r_Y)\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, p_{X,A}\, p_{X,B} - r_Y D_Y \tag{21c}$$
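The approximate expressions (20) and (21) are simple enough to evaluate directly. The following sketch is illustrative only: the function name, the dictionary keys, and the starting values (rare resistance and virulence alleles, no initial disequilibrium) are hypothetical choices rather than anything specified in the chapter.

```python
def qle_changes(pXA, pXB, DX, pYA, pYB, DY, s_X, s_Y, r_X, r_Y):
    """Per-generation changes under the QLE approximation, equations (20)-(21)."""
    qXA, qXB, qYA, qYB = 1 - pXA, 1 - pXB, 1 - pYA, 1 - pYB
    return {
        "pXA": s_X * pXA * qXA * qYA * (1 - pXB * qYB),                          # (20a)
        "pXB": s_X * pXB * qXB * qYB * (1 - pXA * qYA),                          # (20b)
        "DX": -s_X * (1 - r_X) * pXA * qXA * pXB * qXB * qYA * qYB - r_X * DX,   # (20c)
        "pYA": s_Y * pYA * qYA * pXA * (1 - pXB * qYB),                          # (21a)
        "pYB": s_Y * pYB * qYB * pXB * (1 - pXA * qYA),                          # (21b)
        "DY": s_Y * (1 - r_Y) * pYA * qYA * pYB * qYB * pXA * pXB - r_Y * DY,    # (21c)
    }

# Hypothetical starting point: all focal alleles at frequency 0.1, D = 0 in both species.
changes = qle_changes(0.1, 0.1, 0.0, 0.1, 0.1, 0.0,
                      s_X=0.01, s_Y=0.01, r_X=0.1, r_Y=0.1)
```

At this starting point the allele-frequency changes are positive in both species, but the change in the flax is several times larger than in the rust because (21a) carries the extra factor $p_{X,A}$, which is small when resistance is rare.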
The beauty of the QLE approximation, and the primary reason for using it (other than the fact that it
makes truly lovely equations), is that it allows us to “see” things about the biology of a system that we
might otherwise spend hours upon hours simulating and still never pick up on. For instance, even a
passing inspection of our approximation reveals that coevolutionary change in allele frequencies is
independent of linkage disequilibrium. As a result, we can solve for the quasi-equilibrium values of
linkage disequilibrium by simply setting (20c and 21c) equal to zero and solving for $D_X$ and $D_Y$:
$$\tilde{D}_X \approx -\frac{s_X (1 - r_X)\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, q_{Y,A}\, q_{Y,B}}{r_X} \tag{22a}$$
$$\tilde{D}_Y \approx \frac{s_Y (1 - r_Y)\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, p_{X,A}\, p_{X,B}}{r_Y} \tag{22b}$$
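The quasi-equilibrium values in (22) can be evaluated for any set of allele frequencies, and the result can be checked by plugging it back into the expression for the change in disequilibrium. The sketch below is illustrative: the function name and the parameter values (all allele frequencies at 0.5, s = 0.01, r = 0.1) are hypothetical.

```python
def quasi_equilibrium_D(pXA, pXB, pYA, pYB, s_X, s_Y, r_X, r_Y):
    """Equations (22a-b): values of D at which the QLE change in D is zero."""
    qXA, qXB, qYA, qYB = 1 - pXA, 1 - pXB, 1 - pYA, 1 - pYB
    D_X = -s_X * (1 - r_X) * pXA * qXA * pXB * qXB * qYA * qYB / r_X
    D_Y = s_Y * (1 - r_Y) * pYA * qYA * pYB * qYB * pXA * pXB / r_Y
    return D_X, D_Y

D_X, D_Y = quasi_equilibrium_D(0.5, 0.5, 0.5, 0.5,
                               s_X=0.01, s_Y=0.01, r_X=0.1, r_Y=0.1)

# Plugging D_X back into equation (20c) should leave (numerically) zero change.
residual = -0.01 * (1 - 0.1) * 0.5 ** 6 - 0.1 * D_X
```

With selection an order of magnitude weaker than recombination, the resulting disequilibria are on the order of 0.001 in absolute value, consistent with the claim that D stays small under the QLE assumptions.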
Remarkably, this shows that the sign of linkage disequilibrium should be different in Flax and Flax rust.
Specifically, linkage disequilibrium between resistance genes within the flax should always be negative
whereas linkage disequilibrium between virulence genes in the rust should always be positive. The
biological reason for this intriguing pattern is that Flax individuals receive a fitness benefit by carrying a
single resistance gene at either locus (A or B) whereas rust individuals must carry virulence alleles at
both loci (A and B) in order to evade detection and elimination by the host. Consequently, within the
Flax population, the quantity $W_{X,AB} W_{X,ab} - W_{X,Ab} W_{X,aB}$ is negative, indicating it experiences negative
epistasis. In contrast, within the Rust population the quantity $W_{Y,AB} W_{Y,ab} - W_{Y,Ab} W_{Y,aB}$ is positive,
indicating the rust population experiences positive epistasis.
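The sign of epistasis in each species can be checked numerically from the fitness definitions and the gene-for-gene rule. This is a sketch under stated assumptions: the genotype frequencies (all equal) and selection coefficient are hypothetical, and the matrix is indexed with the flax genotype first and the rust genotype second, following the subscripts of α.

```python
# Gene-for-gene infection matrix, ALPHA[i][j]: flax genotype i meets rust genotype j,
# with genotypes ordered (AB, Ab, aB, ab). Indexing convention is an assumption.
ALPHA = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [1, 0, 1, 0],
         [1, 1, 1, 1]]

def epistasis(W):
    """W_AB * W_ab - W_Ab * W_aB for fitnesses ordered (AB, Ab, aB, ab)."""
    return W[0] * W[3] - W[1] * W[2]

X = [0.25] * 4  # hypothetical flax genotype frequencies
Y = [0.25] * 4  # hypothetical rust genotype frequencies
s = 0.1         # hypothetical selection coefficient in both species

W_X = [1 - s * sum(Y[j] * ALPHA[i][j] for j in range(4)) for i in range(4)]
W_Y = [1 - s * (1 - sum(X[i] * ALPHA[i][j] for i in range(4))) for j in range(4)]

E_X = epistasis(W_X)  # flax
E_Y = epistasis(W_Y)  # rust
```

For these values the flax quantity is negative and the rust quantity is positive, in line with the verbal argument that one resistance gene suffices for the host while the rust needs virulence at both loci.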
Our QLE approximation has already unearthed a valuable insight about our expectations for the
form of epistasis and sign of linkage disequilibrium that we expect to emerge from GFG coevolution. Can
we push our QLE approximation further to learn about the dynamics and outcomes of coevolution? The
place to start is with an analysis of allele frequency change. Inspecting equations (20a-b, 21a-b) reveals
that as long as genetic variation exists at all loci, host resistance genes and parasite virulence genes will
increase in frequency (Figure 1). Only when the parasite fixes both virulence alleles, or the host has no
resistance alleles at either locus, does coevolution cease. This picture of coevolutionary dynamics is
remarkably similar to what we saw when we studied gene-for-gene coevolution in a haploid, single locus
model. Only when we study the relative rates of coevolution in the two species do we see the novel
twist that multiple loci and epistasis bring to the table. Specifically, if both Flax and Rust initially have
very low frequencies of resistance and virulence alleles, the Flax population will evolve resistance much
more rapidly than the rust can evolve to overcome it (Figure 1). The reason for this striking difference in
coevolutionary rates is, again, epistasis. Because the host realizes fitness benefits by having a resistance
allele at a single locus (because that is sufficient to recognize and clear the rust), selection is quite
effective at increasing the frequency of even a very rare resistance allele. In contrast, the parasite must
carry virulence alleles at both loci in order to avoid recognition by hosts with even only a single
resistance allele. Thus, if virulence alleles are initially very rare, rust individuals carrying virulence alleles
at both loci are incredibly rare, and selection has a very difficult time increasing the frequency of the
virulence alleles (Figure 1). Together with our earlier results about the sign of linkage disequilibrium, this
shows that it is epistasis that causes our two locus model to differ in interesting ways from the single
locus model we studied earlier in Chapter 2.
At this point, you should be wondering just how much we should trust our QLE approximation.
As with any approximation, whether it be the QLE, quantitative genetics, or adaptive dynamics, it is
important to evaluate robustness using simulations. In addition to allowing the generality of our
conclusions to be explored, performing simulations also helps serve as a safeguard against
straightforward algebraic errors. Although there are many ways to perform simulations, each of which
makes its own set of assumptions, for our purposes it is sufficient to simply iterate the general recursion
equations (X). Our goal is to push the limits of our QLE approximation to see when it “breaks”. As we
already saw in Figure 1, when $s_X = s_Y = 0.01$ and $r_X = r_Y = 0.1$ the predictions of our QLE
approximation and exact simulations are more or less indistinguishable, even over 5,000 generations of
coevolution. But what about scenarios where the interaction between Flax and Rust has large fitness
consequences for the interacting species, perhaps something more along the lines of $s_X = s_Y = 0.1$? In
such cases, Figure 2 shows that significant errors begin to creep into our QLE approximation, which over
sufficient amounts of time, lead to significant quantitative errors in our predictions for allele frequencies
and linkage disequilibria. That said, the amount of error accruing over any single generation remains
vanishingly small, and qualitative predictions such as the sign of linkage disequilibrium, remain entirely
robust. What can we conclude from this exercise about the robustness of the QLE approximation? The
answer is that it largely depends on the question you wish to address. If the goal is qualitative
prediction, the QLE approximation often works remarkably well even when its basic assumptions are
grossly violated. If, on the other hand, the goal is quantitative prediction, selection really does need to
be quite weak and recombination quite frequent for accuracy to be maintained over thousands of
generations.
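A comparison of this kind can be sketched in a few lines by iterating the exact genotype recursions alongside the QLE expressions (20-21). This is a rough illustration, not a reproduction of Figures 1 or 2: the function names, starting frequencies, and number of generations are hypothetical, and the infection matrix is indexed with the flax genotype first.

```python
ALPHA = [[1, 0, 0, 0],  # flax AB is infected only by rust AB
         [1, 1, 0, 0],  # flax Ab
         [1, 0, 1, 0],  # flax aB
         [1, 1, 1, 1]]  # flax ab is infected by every rust genotype

def exact_step(X, Y, s_X, s_Y, r_X, r_Y):
    """Exact genotype recursion: selection, then random mating with recombination."""
    W_X = [1 - s_X * sum(Y[j] * ALPHA[i][j] for j in range(4)) for i in range(4)]
    W_Y = [1 - s_Y * (1 - sum(X[i] * ALPHA[i][j] for i in range(4))) for j in range(4)]

    def next_gen(f, W, r):
        w_bar = sum(fi * wi for fi, wi in zip(f, W))
        g = [fi * wi / w_bar for fi, wi in zip(f, W)]  # after selection
        d = g[0] * g[3] - g[1] * g[2]                  # disequilibrium after selection
        return [g[0] - r * d, g[1] + r * d, g[2] + r * d, g[3] - r * d]

    return next_gen(X, W_X, r_X), next_gen(Y, W_Y, r_Y)

def qle_step(v, s_X, s_Y, r_X, r_Y):
    """QLE approximation, equations (20)-(21); v = (pXA, pXB, DX, pYA, pYB, DY)."""
    pXA, pXB, DX, pYA, pYB, DY = v
    qXA, qXB, qYA, qYB = 1 - pXA, 1 - pXB, 1 - pYA, 1 - pYB
    return (
        pXA + s_X * pXA * qXA * qYA * (1 - pXB * qYB),
        pXB + s_X * pXB * qXB * qYB * (1 - pXA * qYA),
        DX - s_X * (1 - r_X) * pXA * qXA * pXB * qXB * qYA * qYB - r_X * DX,
        pYA + s_Y * pYA * qYA * pXA * (1 - pXB * qYB),
        pYB + s_Y * pYB * qYB * pXB * (1 - pXA * qYA),
        DY + s_Y * (1 - r_Y) * pYA * qYA * pYB * qYB * pXA * pXB - r_Y * DY,
    )

def compare(s, r, generations, p0=0.1):
    """Largest gap between exact and QLE allele frequencies after some generations."""
    q0 = 1 - p0
    X = [p0 * p0, p0 * q0, q0 * p0, q0 * q0]  # start at linkage equilibrium
    Y = list(X)
    v = (p0, p0, 0.0, p0, p0, 0.0)
    for _ in range(generations):
        X, Y = exact_step(X, Y, s, s, r, r)
        v = qle_step(v, s, s, r, r)
    exact = (X[0] + X[1], X[0] + X[2], Y[0] + Y[1], Y[0] + Y[2])
    approx = (v[0], v[1], v[3], v[4])
    return max(abs(a - b) for a, b in zip(exact, approx))

gap_weak = compare(s=0.01, r=0.1, generations=100)
gap_strong = compare(s=0.1, r=0.1, generations=100)
```

Following the argument in the text, one would expect `gap_weak` to be the smaller of the two, although the actual numbers depend on the starting frequencies and the number of generations chosen here.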
Answers to key questions
What patterns of epistasis are likely to be generated by species interactions?
Our exploration of coevolution between Flax and its pathogen M. lini revealed interesting
patterns of epistasis. Specifically, our QLE approximation revealed that the host plant, L. marginale,
experiences negative epistasis because under the assumptions of our gene-for-gene model, a single
resistance gene can be sufficient to recognize and clear the pathogen. In contrast, we found that
epistasis within the rust, M. lini, was positive, a pattern that emerges because host recognition can be
thwarted only by evading recognition at both loci.
How do these patterns of epistasis influence the dynamics and outcome of coevolution?
By studying coevolution using our QLE approximation and numerical simulation, we found that
epistasis plays an important role in the dynamics of coevolution while having no real impact on its
ultimate outcome. Specifically, because epistasis in the Flax is negative, significant fitness gains accrue
to individuals carrying a resistance allele at only a single locus. Thus, selection is quite effective at
increasing the frequency of resistance genes in the host. In contrast, because epistasis within the rust is
positive, significant fitness gains accrue only to individuals carrying virulence alleles at both loci.
Consequently, selection has a very hard time gaining traction on rare virulence alleles, thus slowing the
rate at which they increase in frequency within the rust population.
What patterns of linkage disequilibrium do we expect to emerge in coevolving systems?
In light of our observation that epistasis is negative within the Flax population but positive
within the Rust population, it is perhaps not surprising that we also observe differences in the sign of
linkage disequilibrium in the two species. Specifically, linkage disequilibrium within the Flax population
tends to be negative whereas within the rust population it tends to be positive. What this means is that
if we sample a Flax individual and find that it is carrying a resistance allele at one locus, it is less likely
than expected based on allele frequencies that it is also carrying a resistance allele at another locus. In
contrast, if we sample a rust individual and find that it is carrying a virulence allele at one locus, it is
more likely than we would expect based on allele frequencies that it is also carrying a virulence allele at
another locus.
New Questions Arising:
Our simple model of gene-for-gene coevolution between Flax and Flax-Rust suggests that …. Although
thought provoking, the tenuous connection between this prediction and available empirical data
immediately raises several important questions:
• Are the patterns of epistasis and linkage disequilibrium we uncovered for our simple gene-for-gene
model also likely to occur in other species interactions?
• If species interactions are mediated by quantitative traits, rather than molecular recognition,
should we still expect epistasis and linkage disequilibrium to matter?
In the next two sections, we will develop extensions of our two-locus model that will help us to answer
these questions and gain further insight into the process of coevolution in multi-locus systems.
Extensions
Extension 1: Evaluating the consistency of epistasis and disequilibrium in coevolving interactions
Just how general are the patterns of epistasis and linkage disequilibrium we observed in our
investigation of gene-for-gene coevolution between Wild flax and flax rust? Perhaps the easiest way to
answer this question is to jump right into developing and analyzing a model of coevolution for a very
different interaction. Because it is still fresh in our memory from the previous chapter, let’s use the
interaction between the snail XX and its schistosome parasite, XX, as our test case. As a quick refresher,
recent empirical studies have identified two molecules that have been hypothesized to interact and play
an important role in the outcome of an encounter between an individual snail and an individual
schistosome. Specifically, snails deploy FREP molecules that bind to specific mucin molecules produced
by the schistosome; when the FREP “matches” the mucin, the infection is cleared. In the previous
chapter, we developed interaction matrices describing this molecular interaction under various
assumptions, all of which involved only a single diploid locus. As many of you probably guessed, the
assumption that the structure of these molecules depends on only a single genetic locus is most likely
false (REFS). Given this information, let’s revisit this interaction and explore how it is likely to coevolve
given a fresh set of genetic assumptions. To keep things simple, we are going to forget all about diploidy
and simply focus in on a scenario where FREP and mucin are produced by only a pair of haploid, diallelic
loci in each species.
The particular assumption we are going to make to adapt this interaction to the two-locus
modeling framework we developed earlier in this chapter is that each genotype makes a FREP or mucin
molecule with a unique conformation. JUSTIFY USING MOLECULAR SPECIFICS… I THINK THIS MEANS
YOU MUST HAVE AN ACTIVE SITE THAT IS ENCODED BY TWO SEPARATE LOCI? If the genotypes of snail
and shistosome match, the host FREP molecule binds to the parasite mucin molecule and the infection is
cleared. If, in contrast, the genotypes of the two species do not match, the snail FREP fails to bind to the
schistosome mucin and the infection succeeds. Together, these assumptions lead to the following
interaction matrix describing the probability that a schistosome genotype evades recognition and
successfully infects a host genotype:
\[ \alpha = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \tag{23} \]
where snail genotypes are in columns {AB, Ab, aB, ab} and schistosome genotypes are in rows {AB, Ab,
aB, ab}. All we need to do now is plug the appropriate values of α from (23) into the general expressions
for coevolutionary change in two locus systems we developed previously (17-18). It probably comes as
little surprise, however, that even after a lengthy session of algebra, there is really no way to write these
exact recursion equations down in a way that really helps us to understand what the hell is going on. As
before, this level of complexity suggests that it is time to deploy an approximation; in this case, the best
possible approximation for the job is Quasi-Linkage Equilibrium.
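As a concrete illustration of how matrix (23) feeds into fitness, here is a minimal Python sketch. It assumes the snail fitness form used for the host earlier in the chapter (one minus s_X times the expected infection probability) and, as an assumption not confirmed in this section, a parasite fitness of one plus s_Y times the expected infection probability. The function names are hypothetical.

```python
# Genotype order used throughout, as in matrix (23): AB, Ab, aB, ab.
GENOTYPES = ["AB", "Ab", "aB", "ab"]


def matching_alleles_matrix():
    """alpha[i][j]: probability that schistosome genotype i infects snail genotype j.

    Infection fails (0) only when the two genotypes match at both loci.
    """
    return [[0.0 if gi == gj else 1.0 for gj in GENOTYPES] for gi in GENOTYPES]


def snail_fitness(alpha, y_freqs, s_x):
    """Fitness of each snail genotype j: 1 - s_X * sum_i Y_i * alpha[i][j]."""
    n = len(GENOTYPES)
    return [1.0 - s_x * sum(y_freqs[i] * alpha[i][j] for i in range(n))
            for j in range(n)]


def schistosome_fitness(alpha, x_freqs, s_y):
    """Assumed parasite fitness: 1 + s_Y * sum_j X_j * alpha[i][j]."""
    n = len(GENOTYPES)
    return [1.0 + s_y * sum(x_freqs[j] * alpha[i][j] for j in range(n))
            for i in range(n)]


if __name__ == "__main__":
    alpha = matching_alleles_matrix()
    y = [0.7, 0.1, 0.1, 0.1]  # schistosome genotype frequencies (illustrative)
    x = [0.25, 0.25, 0.25, 0.25]  # snail genotype frequencies (illustrative)
    print("snail fitness:", snail_fitness(alpha, y, s_x=0.1))
    print("schistosome fitness:", schistosome_fitness(alpha, x, s_y=0.1))
```

In this example the snail genotype that matches the most common schistosome genotype has the highest fitness, which is the source of the frequency dependence explored below.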
Developing a QLE approximation for coevolution between the snail B. glabrata and the
schistosome parasite S. mansoni makes the exact same assumptions and follows the exact same steps as
when we used it to study coevolution between wild flax and flax rust. In short, we assume selection is
weak (order ε) and recombination sufficiently common for linkage disequilibrium to also be small (order
ε). We then describe evolutionary change in allele frequencies and linkage disequilibrium using their first
order Taylor Series expansions in ε (see accompanying Mathematica notebook). The result is the
following system of recursion equations describing evolutionary change in the snail:
\[ \Delta p_{X,A} \approx -s_X\, p_{X,A}\, q_{X,A}\, (1 - 2p_{Y,A}) \left( q_{Y,B} - p_{X,B}(1 - 2p_{Y,B}) \right) \tag{24a} \]
\[ \Delta p_{X,B} \approx -s_X\, p_{X,B}\, q_{X,B}\, (1 - 2p_{Y,B}) \left( q_{Y,A} - p_{X,A}(1 - 2p_{Y,A}) \right) \tag{24b} \]
\[ \Delta D_X \approx s_X\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, (1 - 2p_{Y,A})(1 - 2p_{Y,B})(1 - r_X) - r_X D_X \tag{24c} \]
and evolutionary change in the schistosome:
\[ \Delta p_{Y,A} \approx s_Y\, p_{Y,A}\, q_{Y,A}\, (1 - 2p_{X,A}) \left( q_{Y,B} - p_{X,B}(1 - 2p_{Y,B}) \right) \tag{25a} \]
\[ \Delta p_{Y,B} \approx s_Y\, p_{Y,B}\, q_{Y,B}\, (1 - 2p_{X,B}) \left( q_{Y,A} - p_{X,A}(1 - 2p_{Y,A}) \right) \tag{25b} \]
\[ \Delta D_Y \approx -s_Y\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, (1 - 2p_{X,A})(1 - 2p_{X,B})(1 - r_Y) - r_Y D_Y \tag{25c} \]
where all terms of order ε² and greater have been ignored. As before, our QLE approximation decouples
coevolutionary changes in allele frequencies from changes in linkage disequilibrium, allowing us to solve
for the quasi-equilibrium values of linkage disequilibrium:
\[ \tilde{D}_X \approx \frac{s_X\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, (1 - 2p_{Y,A})(1 - 2p_{Y,B})(1 - r_X)}{r_X} \tag{26a} \]
\[ \tilde{D}_Y \approx -\frac{s_Y\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, (1 - 2p_{X,A})(1 - 2p_{X,B})(1 - r_Y)}{r_Y} \tag{26b} \]
Just a quick look at these expressions reveals that they are quite different from those we found for flax
and flax rust. Whereas the sign of linkage disequilibrium was constant and predictable for flax and flax
rust, these expressions contain terms that allow the sign of linkage disequilibrium to fluctuate as allele
frequencies change over time within the snail and schistosome populations.
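A short Python sketch can make this concrete by iterating the QLE allele-frequency recursions (24a,b) and (25a,b) and evaluating the quasi-equilibrium disequilibria (26a,b) along the way. The function names, starting frequencies, and parameter values are illustrative assumptions, not values taken from the accompanying notebook.

```python
def qle_step(p, s_x, s_y):
    """One generation of the QLE recursions (24a,b) and (25a,b).

    p = (p_XA, p_XB, p_YA, p_YB): frequencies of alleles A and B in the
    snail (X) and the schistosome (Y).
    """
    xa, xb, ya, yb = p
    # q_{Y,B} - p_{X,B}(1 - 2 p_{Y,B}) and its A-locus counterpart
    term_b = (1 - yb) - xb * (1 - 2 * yb)
    term_a = (1 - ya) - xa * (1 - 2 * ya)
    return (
        xa - s_x * xa * (1 - xa) * (1 - 2 * ya) * term_b,
        xb - s_x * xb * (1 - xb) * (1 - 2 * yb) * term_a,
        ya + s_y * ya * (1 - ya) * (1 - 2 * xa) * term_b,
        yb + s_y * yb * (1 - yb) * (1 - 2 * xb) * term_a,
    )


def quasi_ld(p, s_x, s_y, r_x, r_y):
    """Quasi-equilibrium linkage disequilibrium from (26a,b)."""
    xa, xb, ya, yb = p
    d_x = (s_x * xa * (1 - xa) * xb * (1 - xb)
           * (1 - 2 * ya) * (1 - 2 * yb) * (1 - r_x) / r_x)
    d_y = (-s_y * ya * (1 - ya) * yb * (1 - yb)
           * (1 - 2 * xa) * (1 - 2 * xb) * (1 - r_y) / r_y)
    return d_x, d_y


if __name__ == "__main__":
    s_x = s_y = 0.05   # weak selection (illustrative)
    r_x = r_y = 0.5    # free recombination (illustrative)
    p = (0.6, 0.55, 0.45, 0.6)
    for gen in range(2001):
        if gen % 250 == 0:
            d_x, d_y = quasi_ld(p, s_x, s_y, r_x, r_y)
            freqs = ", ".join(f"{v:.3f}" for v in p)
            print(f"gen {gen:4d}  p = ({freqs})  D_X = {d_x:+.2e}  D_Y = {d_y:+.2e}")
        p = qle_step(p, s_x, s_y)
```

Printing the sign of D_X and D_Y every few hundred generations is one simple way to watch for sign changes as the allele frequencies move.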
The underlying cause of these changes in the sign and magnitude of linkage disequilibrium is
what is often referred to as fluctuating epistasis (REFS). What this means is that the sign and magnitude
of epistasis change over time within snail and schistosome populations as differing combinations of
alleles become more infective or more resistant than predicted by the individual alleles themselves. To
be more specific, the terms $(1 - 2p_{Y,A})(1 - 2p_{Y,B})$ and $(1 - 2p_{X,A})(1 - 2p_{X,B})$ appearing in equations
(26) measure which genotypes (and thus molecular structures of mucin and FREP molecules) are most
common within the schistosome and snail populations. If the most common alleles within the
schistosome population are A and B (or a and b), then the first of these two terms is positive, indicating
epistatic selection in favor of AB (or ab) genotypes within the snail population, and a corresponding
excess of these genotypes within the snail population. Similarly, if the second of these terms is positive,
because the A and B alleles (or a and b) are the most frequent within the snail population, epistatic
selection favors AB (or ab) genotypes within the schistosome population and a corresponding excess of
these genotypes. Although these considerations clearly show that the potential exists for the sign of
epistatic selection and linkage disequilibrium to fluctuate within this system, equations (26) make it
equally clear that this will only occur if allele frequencies themselves fluctuate over time.
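To see where these sign-determining terms come from, it may help to write out epistasis in snail fitness directly. The sketch below assumes the host fitness form used earlier in the chapter, the matching matrix (23), and linkage equilibrium within the schistosome population; the symbol $E_X$ for epistasis is introduced here only for illustration.

```latex
\begin{align*}
% With matrix (23), a snail of genotype i clears infection only by schistosome genotype i
W_{X,i} &= 1 - s_X \,(1 - Y_i) \\
% Epistasis measured on the additive scale
E_X &= W_{X,AB} - W_{X,Ab} - W_{X,aB} + W_{X,ab} \\
    &= s_X \,(Y_{AB} - Y_{Ab} - Y_{aB} + Y_{ab}) \\
% At linkage equilibrium, Y_{AB} = p_{Y,A}\,p_{Y,B}, and so on
    &= s_X \,(p_{Y,A} - q_{Y,A})(p_{Y,B} - q_{Y,B}) \\
    &= s_X \,(1 - 2p_{Y,A})(1 - 2p_{Y,B})
\end{align*}
```

Under these assumptions, epistasis in the snail is positive whenever the common schistosome alleles are A and B (or a and b) and negative otherwise, which matches the sign of the corresponding factor in (26a).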
We now know that drawing conclusions about the sign of linkage disequilibrium requires that
we understand what is going on with allele frequencies within both snail and schistosome populations.
Unfortunately, drawing definitive conclusions about the coevolution of allele frequencies within this
system is much more challenging than it was for the gene-for-gene model we studied earlier in this
chapter. As a consequence, simple inspection of the QLE approximation no longer suffices, and we must
take the more formal mathematical approach of identifying equilibria and evaluating their local stability.
If we follow the standard protocol and first identify equilibria by setting equations (24a,b and 25a,b)
equal to zero and solving for the allele frequencies that satisfy the equality, we find that there are thirty
possible equilibria! Clearly, it isn’t going to be possible to neatly summarize all thirty of these equilibria
in a tidy table. Instead, let’s focus on just the five equilibria that exist, are not purely unstable, and help
us to learn something important about coevolution between the snail B. glabrata and its schistosome
parasite S. mansoni (Table 2).
Table 2. A subset of equilibria and their local stability

$p_{X,A}$   $p_{X,B}$   $p_{Y,A}$   $p_{Y,B}$   Eigenvalues                                                                                             Stability
0           0           1           1           1, 1, 1, 1                                                                                              Neutrally stable
0           1           1           0           1, 1, 1, 1                                                                                              Neutrally stable
1           0           0           1           1, 1, 1, 1                                                                                              Neutrally stable
1           1           0           0           1, 1, 1, 1                                                                                              Neutrally stable
1/2         1/2         1/2         1/2         $1 - \tfrac{i}{4}\sqrt{s_X}\sqrt{s_Y}$ (twice), $1 + \tfrac{i}{4}\sqrt{s_X}\sqrt{s_Y}$ (twice)          Cycles with increasing amplitude
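The eigenvalues reported in Table 2 can be checked numerically. The Python sketch below is an illustration rather than the procedure used in the accompanying notebook: it builds a finite-difference Jacobian of the QLE allele-frequency map defined by (24a,b) and (25a,b), then extracts eigenvalues from the 2x2 blocks that couple each snail locus to the corresponding schistosome locus. All function names and parameter values are assumptions made for this example.

```python
import cmath


def qle_map(p, s_x, s_y):
    """Next-generation allele frequencies under (24a,b) and (25a,b).

    p = [p_XA, p_XB, p_YA, p_YB].
    """
    xa, xb, ya, yb = p
    term_a = (1 - ya) - xa * (1 - 2 * ya)
    term_b = (1 - yb) - xb * (1 - 2 * yb)
    return [
        xa - s_x * xa * (1 - xa) * (1 - 2 * ya) * term_b,
        xb - s_x * xb * (1 - xb) * (1 - 2 * yb) * term_a,
        ya + s_y * ya * (1 - ya) * (1 - 2 * xa) * term_b,
        yb + s_y * yb * (1 - yb) * (1 - 2 * xb) * term_a,
    ]


def numerical_jacobian(f, p, h=1e-6):
    """Central-difference Jacobian of the map f evaluated at the point p."""
    n = len(p)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = list(p), list(p)
        up[j] += h
        down[j] -= h
        f_up, f_down = f(up), f(down)
        for i in range(n):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return jac


def block_eigenvalues(jac, i, j):
    """Eigenvalues of the 2x2 block of jac spanned by variables i and j.

    This equals a pair of eigenvalues of the full matrix only when the block
    decouples from the remaining variables, as it does at the fully
    polymorphic equilibrium.
    """
    a, b, c, d = jac[i][i], jac[i][j], jac[j][i], jac[j][j]
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


if __name__ == "__main__":
    s_x, s_y = 0.1, 0.2  # illustrative selection coefficients
    jac = numerical_jacobian(lambda p: qle_map(p, s_x, s_y), [0.5] * 4)
    print("A-locus pair:", block_eigenvalues(jac, 0, 2))
    print("B-locus pair:", block_eigenvalues(jac, 1, 3))
    print("predicted imaginary part:", (s_x * s_y) ** 0.5 / 4)
```

At the fixed-mismatch equilibria the same Jacobian reduces to the identity matrix, which is the numerical counterpart of the four eigenvalues equal to one in the table.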
Looking through this table of equilibria we see that many of the usual suspects are present,
including equilibria corresponding to fixed mismatching of the two species as well as the fully
polymorphic equilibrium where allele frequencies equal one half. A quick inspection of the local stability
conditions reveals, however, that multi-locus genetics adds an interesting twist. Specifically, the
equilibria corresponding to fixed mismatching, where the snail is completely unable to recognize and
defend against the schistosome, are neutrally stable. What this means is that even if new mutations arise
within a snail population that allow infections by the schistosome population to be recognized and
cleared, they will not aggressively spread through the population. Instead, we expect any novel and
advantageous mutants to remain hovering near a frequency of zero. How can this be the case? Why do
these novel mutations not spread rapidly through the snail population? The answer to these questions
can be found in the pattern of epistasis produced by the infection matrix (23). The key feature of this
infection matrix that leads to these somewhat counterintuitive results is the requirement that an
individual snail must have alleles that match those of the schistosome at both loci for the infection to be
recognized and cleared. Because any novel mutations in the snail population that match the
corresponding resident allele in the schistosome population will inevitably be very, very rare, the chance
of any individual snail carrying the “right” allele at both loci is vanishingly small. As a consequence, even
though selection strongly favors increases in the frequency of alleles within the snail population that
match those in the schistosome population, epistasis prevents them from spreading efficiently, greatly
reducing the rate of adaptation within the snail.
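The argument above can be made explicit by evaluating (24a) near one of the fixed-mismatch equilibria. The worked step below takes the schistosome to be fixed for A and B and the matching snail alleles to be rare; the small quantity $\delta$ is a symbol introduced here only for illustration.

```latex
\begin{align*}
% Schistosome fixed for A and B: p_{Y,A} = p_{Y,B} = 1, so q_{Y,B} = 0
\Delta p_{X,A} &\approx -s_X \, p_{X,A}\, q_{X,A}\, (1 - 2)\,\bigl(0 - p_{X,B}(1 - 2)\bigr) \\
               &= s_X \, p_{X,A}\, q_{X,A}\, p_{X,B} \\
% If both matching snail alleles are rare, p_{X,A} \sim p_{X,B} \sim \delta \ll 1
\Delta p_{X,A} &\approx s_X\, \delta^2
\end{align*}
```

The change is positive, so selection does favor the matching allele, but it is proportional to the product of two small frequencies. To first order in $\delta$ the matching alleles do not change in frequency, which is what the eigenvalues of one in Table 2 indicate.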
Turning our attention to the fully polymorphic equilibrium reveals that, near this equilibrium,
allele frequencies behave in much the same way as we might expect for a single locus model.
Specifically, our local stability analysis reveals eigenvalues consisting of a real component equal to one
and an imaginary component. As we know well by now, in a discrete time system, eigenvalues like these
suggest oscillations of increasing amplitude. Although we have observed similar cyclical dynamics in
previous models, what makes cycles super cool in this case is their consequences for patterns of linkage
disequilibrium. Earlier in this section we showed that the sign and magnitude of linkage disequilibrium in
both snail and schistosome could, conceivably, fluctuate over time if allele frequencies also fluctuate.
We can now put this all together and predict what we should observe in the snail-schistosome system. If
all alleles are relatively common, we expect to see oscillations in allele frequencies and oscillations in
the sign and magnitude of linkage disequilibrium in snail and schistosome. The reason for this is largely
that the particular combinations of mucin alleles that are best able to evade recognition by the most
common snail FREP molecules and those combinations of snail FREP molecules best able to recognize
the most common schistosome mucin molecules are favored by (epistatic) natural selection. In other
words, we have demonstrated the potential for fluctuating epistasis in this interaction, one of the most
perennially popular explanations for why coevolution might favor the evolution of sexual recombination
(REFS).
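For readers who want the step behind "oscillations of increasing amplitude", the modulus and angle of the complex eigenvalues in Table 2 can be written out as follows. The symbols $\lambda$ and $\theta$, and the numerical example, are introduced here for illustration only.

```latex
\begin{align*}
\lambda_{\pm} &= 1 \pm \frac{i}{4}\sqrt{s_X s_Y} \\
% Modulus exceeds one for any s_X, s_Y > 0, so perturbations grow
|\lambda_{\pm}| &= \sqrt{1 + \frac{s_X s_Y}{16}} > 1 \\
% Angle of rotation per generation, and the implied cycle period under weak selection
\theta &= \arctan\!\left(\frac{\sqrt{s_X s_Y}}{4}\right), \qquad
\text{period} = \frac{2\pi}{\theta} \approx \frac{8\pi}{\sqrt{s_X s_Y}} \\
% Example: s_X = s_Y = 0.1
|\lambda_{\pm}| &\approx 1.0003, \qquad \text{period} \approx 251 \text{ generations}
\end{align*}
```

Because the modulus exceeds one only by a term of order $s_X s_Y$, the growth in amplitude near the polymorphic equilibrium is slow when selection is weak.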
Unlike gene-for-gene coevolution between Flax and Flax-Rust, it appears that within the snail-schistosome system the presence of two epistatically interacting loci may qualitatively change the
dynamics of the system. Specifically, although we still expect allele frequencies to oscillate if alternative
forms of the allele are common, if both loci approach monomorphism our local stability analyses seem
to suggest the system might get stuck in a state where the snail is unable to recognize and defend
against the schistosome. This intriguing conclusion, along with our prediction of fluctuating
disequilibrium, however, relies on very simple analyses of our QLE approximation. How robust are the
predictions of our QLE approximation to cases where selection is strong or recombination infrequent?
The easiest way to answer this important question is by simulating the exact recursion equations (X) and
comparing the results to the predictions of our QLE approximation for specific combinations of
parameters. Not surprisingly, these simulations reveal our QLE approximation is quite accurate as long
as selection is weak and recombination frequent (Figures X and X). As selection becomes stronger
relative to recombination, the quantitative accuracy of our QLE approximation begins to break down,
although broad qualitative predictions (e.g., fluctuating LD) remain robust. The only place we really wind
up in trouble is when recombination becomes very weak; in such cases we can really blow it by using the
QLE! For instance, Figure 6 shows just how bad our QLE approximation can become when recombination
is quite weak. That the QLE breaks down in this case should not come as any real surprise, however,
since we knew going in that we were egregiously violating a critical assumption. This just shows, once
again, the importance of evaluating the robustness of any approximation we use to gain analytical
insight, whether it be the QLE, classical quantitative genetics, or adaptive dynamics.
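A comparison of this kind can be sketched in a few lines of Python. Because the exact recursions (17-18) are not reproduced in this section, the sketch below rests on stated assumptions: haploid selection followed by recombination in each generation, host fitness equal to one minus s_X times the expected infection probability, and parasite fitness equal to one plus s_Y times the expected infection probability. All names and parameter values are hypothetical.

```python
# Genotype order: AB, Ab, aB, ab. alpha[i][j]: parasite i infects host j.
MATCHING_ALLELES = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]


def product_freqs(p_a, p_b):
    """Genotype frequencies at linkage equilibrium for allele frequencies p_a, p_b."""
    return [p_a * p_b, p_a * (1 - p_b), (1 - p_a) * p_b, (1 - p_a) * (1 - p_b)]


def linkage_disequilibrium(freqs):
    """D = X_AB * X_ab - X_Ab * X_aB."""
    return freqs[0] * freqs[3] - freqs[1] * freqs[2]


def select_and_recombine(freqs, fitness, r):
    """Haploid selection followed by recombination at rate r (assumed life cycle)."""
    mean_w = sum(f * w for f, w in zip(freqs, fitness))
    sel = [f * w / mean_w for f, w in zip(freqs, fitness)]
    d = linkage_disequilibrium(sel)
    return [sel[0] - r * d, sel[1] + r * d, sel[2] + r * d, sel[3] - r * d]


def exact_generation(x, y, alpha, s_x, s_y, r_x, r_y):
    """One generation of the assumed exact two-locus model for host x and parasite y."""
    w_x = [1 - s_x * sum(y[i] * alpha[i][j] for i in range(4)) for j in range(4)]
    w_y = [1 + s_y * sum(x[j] * alpha[i][j] for j in range(4)) for i in range(4)]
    return select_and_recombine(x, w_x, r_x), select_and_recombine(y, w_y, r_y)


def qle_ld_host(x, y, s_x, r_x):
    """QLE prediction (26a) evaluated at the current allele frequencies."""
    xa, xb = x[0] + x[1], x[0] + x[2]
    ya, yb = y[0] + y[1], y[0] + y[2]
    return (s_x * xa * (1 - xa) * xb * (1 - xb)
            * (1 - 2 * ya) * (1 - 2 * yb) * (1 - r_x) / r_x)


if __name__ == "__main__":
    s_x = s_y = 0.02   # weak selection; try larger values to see the QLE degrade
    r_x = r_y = 0.5    # free recombination; try small values such as 0.01
    x, y = product_freqs(0.6, 0.55), product_freqs(0.45, 0.6)
    for gen in range(1, 2001):
        x, y = exact_generation(x, y, MATCHING_ALLELES, s_x, s_y, r_x, r_y)
        if gen % 250 == 0:
            print(f"gen {gen:4d}  exact D_X = {linkage_disequilibrium(x):+.3e}"
                  f"  QLE D_X = {qle_ld_host(x, y, s_x, r_x):+.3e}")
```

Rerunning the loop with stronger selection or a much smaller recombination rate is one way to see where the two columns begin to diverge.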
Extension 2: Evaluating the importance of epistasis and disequilibrium for quantitative traits
So far in this chapter we have focused on systems where we have some insight, albeit
incomplete, into how host and parasite genotypes might interact mechanistically to produce either
infection or resistance. In many other cases, however, we may know only how individual phenotypes
interact to produce various outcomes of an interaction. For instance, in Chapter 3 we studied
coevolution between the cuckoo, XXX, and its warbler host, XXX, assuming interactions depended on
egg coloration in the two species. We further assumed that egg coloration in both species was the
product of a very large number of genes, each with only a very small impact on phenotype. How would
our predictions about the dynamics and outcome of coevolution in this interaction change if egg
coloration were instead controlled only by a pair of loci in each species? Fortunately, it is possible to
answer this question using the mathematical machinery we have already developed in this chapter.
Our starting point for integrating the interaction between cuckoo and warbler into the two-locus population genetic framework we have developed in this chapter is to specify how individual
phenotypes impact the outcome of encounters. To keep things consistent with our previous studies of
coevolution between cuckoo and warbler in Chapter 3, we will assume the probability the host bird fails
to recognize a counterfeit egg within its nest and rears it as its own depends on the degree to which
cuckoo and warbler egg coloration matches. This verbal description of egg recognition can be captured
by a phenotype matching function of the form:
\[ P(x, y) = \exp\left[ -\omega (x - y)^2 \right] \tag{27a} \]
which we saw in Chapter 3 can be approximated by its first order Taylor Series expansion:
\[ P(x, y) \approx 1 - \omega (x - y)^2 \tag{27b} \]
where the parameter ω quantifies the ability of warblers to discriminate among eggs based on
differences in coloration. For the approximation (27b) to be accurate, the product ω(x − y)² must be
small; that is, ω must be small relative to the inverse of the squared difference in egg coloration
between interacting individuals.
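The size of the error introduced by (27b) can be bounded with a standard property of the exponential function. The sketch below writes $d = x - y$ and assumes the exponent in (27a) carries the factor $\omega$, as the approximation (27b) implies; the shorthand $u$ is introduced here only for illustration.

```latex
\begin{align*}
% Series expansion of the matching function in u = \omega d^2 \ge 0
e^{-u} &= 1 - u + \frac{u^2}{2} - \frac{u^3}{6} + \cdots \\
% For u \ge 0 the first-order approximation errs by at most the next term
0 \;\le\; e^{-u} - (1 - u) &\le \frac{u^2}{2} = \frac{\omega^2 d^4}{2} \\
% With the additive two-locus phenotypes of Table 3 the largest difference is d = 2
\text{largest error} &\le \frac{(4\omega)^2}{2} = 8\,\omega^2
\end{align*}
```

For example, with $\omega = 0.05$ the approximation is off by at most 0.02 for the most dissimilar pair of genotypes, whereas with $\omega = 0.25$ the first-order expression already predicts a matching probability of zero for that pair.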
The next step we must take is to specify how genotypes are translated into phenotypes. The
simplest way this can be done is to assume phenotypes are determined additively, and that the
phenotypic effect of each locus is equal, as shown in Table 3. With this translation between genotype
and phenotype in hand, we can generate an interaction matrix by simply plugging the phenotypic values
for each genotype from Table 3 into the interaction function (27b).
Table 3. Genotypes and their associated phenotypes

Genotype    Egg coloration, z
ab          0
aB          1
Ab          1
AB          2
The result of this substitution is the following interaction matrix describing the probability that a cuckoo
with a particular genotype produces eggs that successfully masquerade as the eggs of a particular
warbler genotype and thus avoid ejection:
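\[ \alpha = \begin{bmatrix} 1 & 1 - \omega & 1 - \omega & 1 - 4\omega \\ 1 - \omega & 1 & 1 & 1 - \omega \\ 1 - \omega & 1 & 1 & 1 - \omega \\ 1 - 4\omega & 1 - \omega & 1 - \omega & 1 \end{bmatrix} \tag{28} \]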
where warbler genotypes are in columns {AB, Ab, aB, ab} and cuckoo genotypes are in rows {AB, Ab, aB,
ab}.
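The substitution that produces matrix (28) is mechanical enough to sketch in Python. The snippet below uses the phenotype values from Table 3 and the approximation (27b); the function and variable names are hypothetical.

```python
# Genotype order used for rows (cuckoo) and columns (warbler) of matrix (28).
GENOTYPES = ["AB", "Ab", "aB", "ab"]

# Additive genotype-to-phenotype map from Table 3.
PHENOTYPE = {"ab": 0, "aB": 1, "Ab": 1, "AB": 2}


def interaction_matrix(omega):
    """alpha[i][j] = 1 - omega * (z_i - z_j)^2, following (27b)."""
    return [[1.0 - omega * (PHENOTYPE[c] - PHENOTYPE[w]) ** 2 for w in GENOTYPES]
            for c in GENOTYPES]


if __name__ == "__main__":
    for name, row in zip(GENOTYPES, interaction_matrix(omega=0.05)):
        print(name, ["%.2f" % v for v in row])
```

Note that the entries remain valid probabilities only while 4ω does not exceed one, which is another way of seeing why ω must be small in this model.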
To move forward with our analysis, we just need to substitute the values of the interaction
matrix (28) into the general expressions for coevolutionary change (17-18) and analyze the resulting
model. Not surprisingly, however, there is a catch. Specifically, making this substitution leads to an
algebraic mess. Although it might, in principle, be possible to sort this mess out, a simpler and more
efficient approach is to again apply a Quasi-Linkage Equilibrium approximation. As before, this requires
that we assume selection is weak (order ε) and that recombination occurs with a sufficient frequency for
linkage disequilibrium to be kept small (order ε). We then use first order Taylor Series expansions in ε
(see accompanying Mathematica notebook) to derive approximate expressions for coevolutionary
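change in warbler:
\[ \Delta p_{X,A} \approx \frac{\omega s_X}{1 - s_X}\, p_{X,A}\, q_{X,A} \left( 1 + 2(p_{X,B} - p_{Y,A} - p_{Y,B}) \right) \tag{29a} \]
\[ \Delta p_{X,B} \approx \frac{\omega s_X}{1 - s_X}\, p_{X,B}\, q_{X,B} \left( 1 + 2(p_{X,A} - p_{Y,A} - p_{Y,B}) \right) \tag{29b} \]
\[ \Delta D_X \approx \frac{2 \omega s_X}{1 - s_X}\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, (1 - r_X) - r_X D_X \tag{29c} \]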
and cuckoo:
\[ \Delta p_{Y,A} \approx -\omega s_Y\, p_{Y,A}\, q_{Y,A} \left( 1 + 2(p_{Y,B} - p_{X,A} - p_{X,B}) \right) \tag{30a} \]
\[ \Delta p_{Y,B} \approx -\omega s_Y\, p_{Y,B}\, q_{Y,B} \left( 1 + 2(p_{Y,A} - p_{X,A} - p_{X,B}) \right) \tag{30b} \]
\[ \Delta D_Y \approx -2 \omega s_Y\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, (1 - r_Y) - r_Y D_Y \tag{30c} \]
where all terms of order ε² and greater have been ignored.
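The QLE recursions for warbler and cuckoo are simple enough to iterate directly. The Python sketch below implements the allele-frequency recursions (29a,b) and (30a,b), along with the stationary value of linkage disequilibrium obtained by setting the disequilibrium recursions (29c) and (30c) to zero. Function names, starting frequencies, and parameter values are illustrative assumptions.

```python
def qle_step(p, omega, s_x, s_y):
    """One generation of (29a,b) for the warbler (X) and (30a,b) for the cuckoo (Y).

    p = (p_XA, p_XB, p_YA, p_YB).
    """
    xa, xb, ya, yb = p
    host = omega * s_x / (1 - s_x)   # coefficient in (29a,b)
    para = -omega * s_y              # coefficient in (30a,b)
    return (
        xa + host * xa * (1 - xa) * (1 + 2 * (xb - ya - yb)),
        xb + host * xb * (1 - xb) * (1 + 2 * (xa - ya - yb)),
        ya + para * ya * (1 - ya) * (1 + 2 * (yb - xa - xb)),
        yb + para * yb * (1 - yb) * (1 + 2 * (ya - xa - xb)),
    )


def quasi_ld(p, omega, s_x, s_y, r_x, r_y):
    """Values of D_X and D_Y at which (29c) and (30c) predict no further change."""
    xa, xb, ya, yb = p
    d_x = (2 * omega * s_x * xa * (1 - xa) * xb * (1 - xb) * (1 - r_x)
           / ((1 - s_x) * r_x))
    d_y = -2 * omega * s_y * ya * (1 - ya) * yb * (1 - yb) * (1 - r_y) / r_y
    return d_x, d_y


if __name__ == "__main__":
    omega, s_x, s_y = 0.05, 0.2, 0.2   # illustrative parameter values
    r_x = r_y = 0.5
    p = (0.5, 0.5, 0.25, 0.25)
    for gen in range(1001):
        if gen % 100 == 0:
            d_x, d_y = quasi_ld(p, omega, s_x, s_y, r_x, r_y)
            freqs = ", ".join(f"{v:.3f}" for v in p)
            print(f"gen {gen:4d}  p = ({freqs})  D_X = {d_x:+.2e}  D_Y = {d_y:+.2e}")
        p = qle_step(p, omega, s_x, s_y)
```

Whatever the allele frequencies do, the two disequilibria computed this way keep fixed signs, because every factor other than the leading sign is non-negative.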
As we have seen previously in this chapter, because the QLE approximation decouples
coevolutionary changes in allele frequencies from changes in linkage disequilibrium, we can solve for the
quasi-equilibrium values of linkage disequilibrium without thinking about evolution of allele frequencies
themselves. Pursuing this strategy by setting $\Delta D_X$ and $\Delta D_Y$ equal to zero and solving for $D_X$ and $D_Y$
yields the following QLE solutions for linkage disequilibrium in warbler and cuckoo:
\[ \tilde{D}_X \approx \frac{2 \omega s_X\, p_{X,A}\, q_{X,A}\, p_{X,B}\, q_{X,B}\, (1 - r_X)}{(1 - s_X)\, r_X} \tag{31a} \]
\[ \tilde{D}_Y \approx -\frac{2 \omega s_Y\, p_{Y,A}\, q_{Y,A}\, p_{Y,B}\, q_{Y,B}\, (1 - r_Y)}{r_Y} \tag{31b} \]
If we inspect these expressions for even a minute, it becomes clear that the sign of linkage
disequilibrium within warbler and cuckoo populations will be quite predictable. Specifically, equations
(31) demonstrate that LD will always be positive within the warbler population but negative within the
cuckoo population. Why should this be the case? The easiest way to understand why LD should differ in
sign between the two species is to return to thinking about phenotypes and their interaction with
fitness. Before we tie these things together, however, it is important to realize that the QLE
approximation guarantees that phenotypes follow a unimodal, and unskewed, frequency distribution.
The simple reason for this is that when a phenotype is determined by multiple loci, bimodality (or multimodality for that matter) and skew can only arise through the build-up of linkage disequilibrium. Of
course, since our QLE approximation applies only under conditions where LD remains small, we know
that it also applies only to cases where phenotype distributions remain unimodal and unskewed. Now, if
egg color distributions in both warbler and cuckoo are unimodal and unskewed, the best possible egg
coloration for the cuckoo is that which matches the average egg coloration of the warbler population;
any cuckoo whose egg coloration score is either smaller than, or larger than, the optimal coloration
defined by the warbler will have reduced fitness. As a consequence, selection acting on the cuckoo
population is of a stabilizing form, resulting in an excess of intermediate phenotypes corresponding to
negative linkage disequilibrium (Figure Xa). For the warbler, however, the best possible egg coloration is
that which is the furthest away from the average egg coloration of the cuckoo population. Thus,
selection favors warblers with egg coloration scores that are either larger than the cuckoo average OR
smaller than the cuckoo average. As a result, selection acting on the warbler is of a disruptive form, favoring
the buildup of positive linkage disequilibrium (Figure Xb).
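The constant signs in (31) can also be traced to the sign of epistasis, which the following worked sketch computes from matrix (28). It assumes the warbler fitness takes the host form used earlier in the chapter and, as an assumption not confirmed in this section, that cuckoo fitness is one plus $s_Y$ times the expected probability of avoiding ejection. The symbols $E_X$, $E_Y$, $\bar{z}_X$, $\bar{z}_Y$, $V_X$, and $V_Y$ (epistasis, mean, and variance of egg coloration) are introduced here for illustration.

```latex
\begin{align*}
% Warbler fitness as a function of its own phenotype z_i (0, 1, 1, 2 from Table 3)
W_{X,i} &= 1 - s_X \sum_j Y_j \bigl[ 1 - \omega (z_i - z_j)^2 \bigr]
         = 1 - s_X + s_X \omega \bigl[ (z_i - \bar{z}_Y)^2 + V_Y \bigr] \\
% Epistasis: the mean and variance terms cancel, leaving only curvature
E_X &= W_{X,AB} - W_{X,Ab} - W_{X,aB} + W_{X,ab} \\
    &= s_X \omega \bigl[ (2 - \bar{z}_Y)^2 - 2(1 - \bar{z}_Y)^2 + \bar{z}_Y^{\,2} \bigr]
     = 2\, s_X \omega > 0 \\
% The same calculation for the cuckoo, whose fitness increases with matching
W_{Y,j} &= 1 + s_Y - s_Y \omega \bigl[ (z_j - \bar{z}_X)^2 + V_X \bigr] \\
E_Y &= -2\, s_Y \omega < 0
\end{align*}
```

Epistasis is therefore constant in sign and independent of allele frequencies in both species, positive (disruptive) in the warbler and negative (stabilizing) in the cuckoo, in agreement with the leading factors of (29c) and (30c).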
We now know that linkage disequilibrium takes a particularly simple and intuitive form in the
interaction between warbler and cuckoo. In order to build on this understanding, we now turn to an
analysis of allele frequency dynamics. Unfortunately, our expressions for allele frequency change cannot
be studied through inspection alone and we must turn to a formal analysis of equilibria and local
stability. Following the same old standard procedure reveals that there are 49 possible equilibria. Of
course, many of these equilibria do not even exist (see Mathematica notebook), and of those that do,
none are ever locally stable. To be more specific, if we walk down this lengthy list of equilibria and stop
to inspect the four eigenvalues associated with each, it becomes apparent that in each and every case,
at least one of the eigenvalues has a real component greater than one. As we have seen previously for
discrete time systems, having even one eigenvalue with a real component greater than unity is sufficient
to guarantee instability. What this observation suggests, particularly when combined with the
observation that many of the equilibria are associated with complex eigenvalues, is that the only
possible outcome of coevolution between warbler and cuckoo is perpetual allele frequency cycles.
Hopefully, at this point at least some of you are wondering how this can possibly be correct when back
in Chapter 3 we predicted that this same interaction between warblers and cuckoos would lead to either
a stable equilibrium where egg coloration in the two species match or an endless coevolutionary chase
where egg coloration became ever more extreme in both species. What changed? Although there are
many ways to answer this question, the simplest is to point out that the genetic architecture of traits
involved in coevolutionary interactions matters, and it matters a lot. Whereas in Chapter 3 we assumed
egg coloration depended on a very large number of genes, each with a very small phenotypic effect, we
are now assuming egg coloration is controlled by only two loci, each with a substantive impact on
phenotype. An important consequence of this difference is that in Chapter 3 the genetic variance
associated with egg coloration could not evolve (by assumption), whereas the genetic variance can
evolve quite rapidly in the population genetic framework we study here. It is well known that this
difference can lead to significantly different outcomes to the coevolutionary process (REFS).
At this point, we have gleaned all the really useful information we can from our QLE
approximation. The easiest way to push things a little bit further is to use numerical iteration of the
more exact recursion equations defined by (17-18) and (28) to explore the robustness of our QLE
predictions. Not too surprisingly, these numerical analyses show that our QLE approximation works very
well when the parameter ω, and thus the strength of selection, is small and the rate of recombination is
relatively large by comparison (Figure 7). In such cases, the only possible outcome is fluctuations in the
allele frequencies of warbler and cuckoo accompanied by positive linkage disequilibrium in the warbler
and negative linkage disequilibrium in the cuckoo. As the parameter ω becomes larger or the rate of
recombination smaller, the quantitative accuracy of our QLE approximation deteriorates as we should
expect (Figure 8). Interestingly, however, even under such conditions, it appears that the qualitative
predictions of our QLE approximation remain entirely correct: allele frequencies cycle and LD is positive
within the warbler population but negative within the cuckoo population. I use the caveat
“appears” because numerical investigations can rarely explore the entire parameter
space, leaving open the possibility that parameter combinations we failed to explore
yield alternative outcomes.
Conclusions and Synthesis
Once coevolutionary interactions depend upon multiple loci, epistatic selection and linkage
disequilibrium have the potential to become powerful forces in the coevolutionary process. For
instance, our exploration of coevolution between rust and flax revealed that although including two loci
does not impact the outcome of coevolution, it does allow epistasis to impact the rate. We saw
something quite different when we explored coevolution between schistosome and snail. Here
epistatic selection changes everything and allows for patterns of fluctuating disequilibria… Finally, our
consideration of coevolution between the warbler, x, and the parasitic cuckoo, x, has revealed that the
assumptions we make about the genetic architecture underlying the traits of coevolving species can
have significant impacts on our expectations. Specifically, in This result demonstrates that if we are to
accurately predict the outcome and dynamics of coevolution we need to know something about the way
in which
Much as we saw with dominance in the previous chapter, epistasis plays an important role in the dynamics and
outcome of coevolution. In addition to building up linkage disequilibrium between loci, epistatic
selection causes changes in the frequency of alleles at one locus to depend on the frequency of alleles at
another. Ultimately, these interact
We know so very little about the actual genetic architecture and patterns of epistasis within real
systems it is almost shocking; certainly humbling. This, along with dominance, is the frontier of
coevolutionary genetics.
In summary,
References
Figure Legends
Dybdahl, M. F., C. E. Jenkins, and S. L. Nuismer. 2014. Identifying the Molecular Basis of Host-Parasite
Coevolution: Merging Models and Mechanisms. American Naturalist 184:1-13.
Mitta, G., C. M. Adema, B. Gourbal, E. S. Loker, and A. Theron. 2012. Compatibility polymorphism in
snail/schistosome interactions: From field to theory to molecular mechanisms. Developmental
and Comparative Immunology 37:1-8.