O R I G I NA L A RT I C L E doi:10.1111/j.1558-5646.2009.00861.x GENETIC ARCHITECTURE AND POSTZYGOTIC REPRODUCTIVE ISOLATION: EVOLUTION OF BATESON–DOBZHANSKY–MULLER INCOMPATIBILITIES IN A POLYGENIC MODEL Janna L. Fierst1,2,3 and Thomas F. Hansen1,3 1 Center for Ecological and Evolutionary Synthesis, Department of Biology, University of Oslo, 0316 Oslo, Norway 2 3 E-mail: jfierst@bio.fsu.edu Department of Biological Sciences, Florida State University, Tallahassee, Florida 32306 Received April 29, 2009 Accepted September 28, 2009 The Bateson–Dobzhansky–Muller model predicts that postzygotic isolation evolves due to the accumulation of incompatible epistatic interactions, but few studies have quantified the relationship between genetic architecture and patterns of reproductive divergence. We examined how the direction and magnitude of epistatic interactions in a polygenic trait under stabilizing selection influenced the evolution of hybrid incompatibilities. We found that populations evolving independently under stabilizing selection experienced suites of compensatory allelic changes that resulted in genetic divergence between populations despite the maintenance of a stable, high-fitness phenotype. A small number of loci were then incompatible with multiple alleles in the genetic background of the hybrid and the identity of these incompatibility loci changed over the evolution of the populations. For F 1 hybrids, reduced fitness evolved in a window of intermediate strengths of epistatic interactions, but F 2 and backcross hybrids evolved reduced fitness across weak and moderate strengths of epistasis due to segregation variance. Strong epistatic interactions constrained the allelic divergence of parental populations and prevented the development of reproductive isolation. Because many traits with varying genetic architectures must be under stabilizing selection, our results indicate that polygenetic drift is a plausible hypothesis for the evolution of postzygotic reproductive isolation. KEY WORDS: Compensatory mutations, epistasis, hybrid load, speciation. Bateson (1909), Dobzhansky (1937), and Muller (1940, 1942) separately outlined a scenario of allopatric speciation in which a single, homogeneous population is split into two isolated populations, which then experience substitutions of alleles at different loci. After sufficient isolation, hybrid offspring from these two populations suffer reduced viability due to epistatic interactions between untested combinations of alleles. The Bateson– Dobzhansky–Muller (BDM) model was significant because it proposed a specific genetic mechanism for the evolution of reproductive isolation, and the general scenario of speciation occurring through the accumulation of incompatible epistatic interactions is now widely accepted. 1 C 2009 The Author(s). Evolution The BDM model has been studied analytically (Wagner et al. 1994; Orr 1995; Orr and Turelli 2001; Gavrilets 2003, 2004; Shpak 2005) and computationally (Johnson and Porter 2000; Porter and Johnson 2002; Johnson and Porter 2007; Palmer and Feldman 2009), and recent empirical studies have identified and analyzed genes and chromosomal regions that act in BDM incompatibilities (Fishman and Willis 2001; Kondrashov et al. 2002; Presgraves et al. 2003; Moyle and Graham 2005; Brideau et al. 2006; Burton et al. 2006; Sweigart et al. 2006; Willis 2006; but see Welch 2004). These studies have examined the genetic basis of reproductive isolation to characterize the genetic changes that accompany species divergence, and the results highlight questions J. L . F I E R S T A N D T. F. H A N S E N about the generality of genetic patterns in speciation (reviewed in Coyne and Orr 2004; Johnson 2006). First, are there aspects of the genetic system that promote or prevent reproductive isolation? Although empirical studies have typically focused on sexrelated differences such as Haldane’s rule or the large-X effect, quantitative patterns of genetic architecture may also affect the development of reproductive isolation. Second, is reproductive isolation typically polygenic and complex, or simple, involving few loci? Finally, if we identify hybrid-incompatibility loci today, what role did they play in the initial formation of reproductive isolation? In this article, we study the BDM scenario by modeling a polygenic phenotypic trait with multilinear epistasis. Previous work with theoretical genotype–phenotype maps has suggested that underlying epistatic architectures may lead to speciation through curved genotype–phenotype maps that result in small genetic changes pushing different populations to isolated fitness maxima. Models of the BDM process have typically focused on epistatically interacting loci that directly decrease fitness and have not addressed the role of genotype–phenotype maps in reproductive isolation (Orr 1995; Gavrilets and Gravner 1997; Orr and Turelli 2001; Gavrilets 2003, 2004; Shpak 2005; but see Wagner et al. 1994; Johnson and Porter 2000; Porter and Johnson 2002; Johnson and Porter 2007), but epistatic interactions at the level of trait and the level of fitness may have different consequences for species divergence. Epistasis can either canalize the phenotype against genetic perturbations or magnify their effects (Hansen and Wagner 2001a), but studying these phenomena requires an explicit genotype–phenotype map that is missing in genotype-fitness representations. Features of genetic architecture such as the rate and effect of mutations, distribution of alleles, and patterns of gene interaction are known to affect the evolution of a population (Bürger 2000), but few studies have examined the effects of genetic architecture on patterns of population divergence. Part of this is due to the complexity of epistatic interactions, whereby alleles at any number of loci may interact and produce large or small effects on the phenotype. Traditionally, quantitative genetics has focused on models based on additive effects but there is substantial empirical evidence for epistasis and a need to understand the role epistasis plays in evolution (Whitlock et al. 1995; Fenster et al. 1997; Wolf et al. 2000; Flatt 2005; Hansen 2006; Le Rouzic and Carlborg 2007). Theoretical treatments of epistasis can be divided into those that address statistical epistasis and those that address functional, or physiological epistasis (Cheverud and Routman 1995; Phillips 1998; Hansen and Wagner 2001a; Alvarez-Castro and Carlborg 2007). Statistical epistasis depends on segregating allele frequencies and thus cannot address long-term population dynamics or predict the effects of epistatic interactions on evo- 2 EVOLUTION 2009 lutionary responses. As our goal is to understand how the evolution of reproductive isolation is influenced by the structure of the genotype–phenotype map, we instead use a functional description of epistasis that models the phenotypic differences that result from epistatic interactions between alleles at different loci. Models of functional epistasis have shown that the direction and magnitude of epistatic interactions can influence the evolution of a population (Kondrashov 1988; Charlesworth 1990; Zhivotovsky and Gavrilets 1992; Gavrilets and de Jong 1993; Kondrashov 1993; Rice 1998; Hansen and Wagner 2001a,b; Rice 2002; Hermisson et al. 2003; Rice 2004; Carter et al. 2005; Weinreich et al. 2005; Hansen et al. 2006; Desai et al. 2007; Yukilevich et al. 2008; reviewed in Hansen 2006). If patterns of epistasis affect a population’s response to selection, they may also affect population divergence and reproductive isolation. In this study, we analyze the effects of directionality and magnitude of epistatic interactions on reproductive isolation. We focus primarily on stabilizing selection because several simulation studies have shown that BDM-type reproductive isolation can readily evolve under directional selection (Johnson and Porter 2000; Porter and Johnson 2002; Johnson and Porter 2007), and there is substantial empirical evidence for the evolution of reproductive isolation as a byproduct of directional selection (i.e., adaptive radiation, Schluter 2000; genetic conflict, Presgraves et al. 2003; Presgraves 2007; Presgraves and Stephan 2007; reviewed in Coyne and Orr 2004). In contrast to this, there are few investigations and little direct evidence for the evolution of reproductive isolation under stabilizing selection. Theoretical studies suggest, however, that reproductive isolation under stabilizing selection should be a plausible scenario (Gavrilets and Gravner 1997; Gavrilets 1999, 2003, 2004; Palmer and Feldman 2009). Many phenotypic and physiological traits are conserved over long time scales and likely subject to strong stabilizing selection, and evolutionary studies of selective forces point to stabilizing selection as common (Endler 1986; Travis 1989; Arnold et al. 2001; Blows and Brooks 2003; Estes and Arnold 2007; but see Kingsolver et al. 2001). For allopatric populations in particular, we expect similar patterns of stabilizing selection on quantitative traits, and it is important to understand whether postzygotic isolation can readily evolve in this situation. Theoretical studies of the evolution of reproductive isolation under stabilizing selection may help guide future empirical studies. We address two primary questions. First, what is the effect of genetic architecture on the evolution of reproductive isolation? That is, are there certain features of genetic architecture that promote or prevent the evolution of reproductive isolation? Second, we ask what the genetic architecture looks like at the end of the process—what is the genetic basis of reproductive incompatibility in these populations, and what insights can we gain about the process through the genetic architecture we see at the end? H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E Theory The BDM process of speciation posits that a series of genetic changes happen in isolated populations, but changes in different populations can be epistatically incompatible and lead to a loss of hybrid fitness. Our model examines a character with a continuous, polygenic genetic architecture with multilinear epistasis that is subject to stabilizing selection. We assume that two populations are isolated and evolve independently under the same optimum and strength of stabilizing selection until subsequently brought into contact. Because multiple loci contribute to the trait and it is under stabilizing selection, allelic values can diverge along phenotypically equivalent ridges of high fitness (illustrated in Fig. 1 for the two-locus case). Although we do not directly model multiple characters, real populations likely have many polygenic characters under stabilizing selection, and similar processes occurring across multiple characters would result in severe reduction of hybrid fitness. A AB 2 AC y B -1/ C Hybrid offspring deviate in fitness from the parent populations due to (1) an epistatic shift in the mean trait values of the hybrids away from the ecological optimum and (2) an increased variance in their trait values due to segregation variance caused by divergent allelic values of the parental populations. Under stabilizing selection, the latter effect can cause hybrid load even in the absence of epistasis in the trait (Barton 1989; Slatkin and Lande 1994). THE MULTILINEAR GENOTYPE–PHENOTYPE MAP The multilinear epistatic model (Hansen and Wagner 2001a) describes the phenotype as a linear function of a set of “reference effects” at each locus. The reference effect, i y, of a genotype at locus i, is defined as the phenotypic effect of substituting this one-locus genotype into a designated reference genotype. In this article, we use the average genotype of the ancestral population before the species became isolated as our reference genotype. The reference effects (i y’s) refer to the phenotypic effects of wholelocus genotypes, and do not specify the interaction between alleles at the same locus. We define the reference effect of a single allele (i a) as the average phenotypic effect of substituting the allele into a specified reference genotype (with equal probability of replacing each of the two alleles at a locus). Without dominance, the locus reference effect is then the sum of the effects of the two alleles at the locus. A genotype, g, is characterized as the set of locus reference effects, g = {1 y, . . . , n y}, over all n loci. With pairwise epistasis, the linearity assumption results in a genotype–phenotype map of the following form: z = zr + i -1/ 1 y Isoclines of maximal fitness for a two-locus, continuumof-alleles model of a trait under quadratic stabilizing selection. Figure 1. Isolated parental populations (here “A”, “B,” and “C”) can diverge stochastically along the isoclines. Due to the curvature of the isoclines, hybrids between these populations (here “AB” and “AC”) will fall in the “holes” outside the maximum fitness isoclines, and have low fitness. It is also conceivable that the genotype may sometimes jump from one branch of the isocline to another, and cause more dramatic incompatibilities (as in “AC”). The dashed lines indicate asymptotes for the isoclines. As epistasis gets weaker (ε −> 0) the asymptotes and the lower branch moves to negative infinity, and the upper branch straightens to a straight line. With an additive architecture the hybrids will therefore on average fall directly on the maximal fitness isocline but segregation variance may still cause individual hybrids to deviate from the average (maximal fitness isoclines occur where 0 = 1 y + 2 y + ε1 y 2 y, and there are two parallel ridges of high fitness under stabilizing selection). i y+ 1 ij i j ε y y, 2 i j=i (1) where z r is the value of the reference genotype, and the sums are over the reference effects in g. The epistatic coefficient, ij ε, specifies the impact that the interaction between two loci has on the phenotype. A positive ij ε will increase the effects of joint substitutions with positive effects on the two loci, but decrease the effects of joint substitutions with a negative effect. A negative ij ε will do the opposite. Thus, positive epistasis refers to reinforcement of positive substitutions, and negative epistasis refers to reinforcement of negative substitutions. This framework can include any order of interaction and also dominance, but in this article, we consider only pairwise epistasis without dominance. Specifying a reference genotype sets a scale and origin against which we can measure the phenotypic effect of individual gene substitutions, and the meaning of the parameters in the multilinear model will depend on the choice of reference. This is an unavoidable consequence of epistasis, and a reference genotype will always need to be specified. The multilinear form is, however, invariant with respect to the choice of reference, and EVOLUTION 2009 3 J. L . F I E R S T A N D T. F. H A N S E N simple formulae exist for translating allelic and epistatic values from one reference to another (Hansen and Wagner 2001a; Barton and Turelli 2004; Alvarez-Castro and Carlborg 2007). We emphasize that the genetic architecture measured with reference to the ancestral genotype may be very different from the genetic architecture as it would be measured in the descendant populations, and that this distinction must be kept in mind when interpreting our results. HYBRID LOAD UNDER STABILIZING SELECTION In this model, postzygotic reproductive isolation results from loss of hybrid fitness (Fig. 1). We consider two isolated populations with identical mean phenotypes, z̄ 1 and z̄ 2 , equal to a phenotype, θ, with optimal fitness. The fitness contribution of this trait is determined by a quadratic fitness function W(z) = 1 − s(z − θ)2 , where s is the strength of selection. If the two populations have recently diverged from a common ancestor and we measure their genetic parameters with reference to this common ancestor, then we can assume that they have only diverged in terms of their reference effects, and have the same epistasis coefficients (this entails the assumption that there are no effects of higherorder epistasis). We assume, without loss of generality, that θ = 0. Using the constraint that z̄ 1 = z̄ 2 = θ = 0, the mean phenotype of the F 1 hybrid will be z̄ F1 = − 1 ij i j ε a a, 2 i j=i (3) EVOLUTION 2009 (4) In the F 1 , the variance term is similar to the variance term of the parental populations, and we assume it does not contribute to the hybrid load s L F1 = s(z̄ F1 )2 = (aT Ea)2 . (5) 4 In the F 2 and backcrosses, the variance term is larger than in the parental populations due to segregation of divergent parental alleles. From the hybrid means and variances derived in Appendix A, the F 2 and backcross loads become ⎞2 ⎛ s ⎝ ij i j L F2 = ε a1 − ia2 a1 − ja2 ⎠ 4 i j 2 s i + a1 − ia2 2 i 2 j 2 s ij 2 i + ε a1 − ia2 a1 − ja2 8 i j ⎛ ⎞2 j s i 2 ij + a1 − ia2 ⎝ ε a1 + ja2 ⎠ 2 i j 2 j s ij 2 i i (6) + ε a1 − a2 a1 − ja2 2 i j L B1 where a = {1 a, . . . , n a}T is the column vector of allelic differences at all loci, T denotes transpose, E is the matrix of epistasis coefficients with its ijth entry equal to ij ε, and we have adopted the convention that ii ε = 0. These equations tell us that allelic divergence at epistatically interacting loci will shift the mean F 1 hybrid phenotype away from the optimum. The total effect will depend on the degree of allelic divergence, the strength of epistasis, and on correlations between different epistasis terms across different loci. BDM incompatibilities in this framework are always conditional on the entire genetic background of the hybrid. This contrasts with other BDM models (Orr 1995; Orr and Turelli 2001; Gavrilets 2003, 2004; Shpak 2005) where discrete incompatibilities are unconditional and instead occur at a given rate or once allelic divergence has crossed a set threshold. In general, the load generated by quadratic selection is proportional to the average squared distance of individuals from the 4 L = s(z̄)2 + svar[z]. (2) where i a = i a 1 − i a 2 is the difference in allelic reference values of the two populations at locus i. The derivation is given in Appendix A. This can be written more compactly in matrix notation as 1 z̄ F1 = − aT Ea, 2 optimum, and equal to the sum of the squared mean deviance and the individual variance (e.g., Hansen et al. 2006b). Specifically, ⎞2 ⎛ j 9s ⎝ ij i i j = ε a1 − a2 a1 − a2 ⎠ 64 i j 2 s i + a1 − ia2 4 i 2 j 2 s ij 2 i + ε a1 − ia2 a1 − ja2 32 i j ⎛ ⎞ 2 3 s i 1 2 j ij + a1 − ia2 ⎝ ε a1 + ja2 ⎠ 4 i 2 2 j 2 3 j s ij 2 i 1j i + ε a1 − a2 a1 + a2 (7) 4 i j 2 2 or, in matrix notation s s L F2 = (aT Ea)2 + aT a 4 2 s + (a • a)T (E • E) (a • a) 8 s + (a • a)T (EyF2 • EyF2 ) 2 s + yTF2 E (a • a) 2 9s s (aT Ea)2 + aT a L B1 = 64 4 s + (a • a)T (E • E)(a • a) 32 s s + (a • a)T (EyB1 • EyB1 ) + yTB1 E(a • a) 4 4 (8) (9) H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E where yF2 = {(1 a1 + 1 a2 ), . . . , (n a1 + n a2 )}T , and yB = 1 1 n n { (3 a12+ a2 ) , . . . , (3 a12+ a2 ) }T are column vectors of mean locus effects at all loci, and • denotes element-wise multiplication (Hadamard product). The L B2 is symmetric to L B1 . In these equations, the first term is due to the deviance of the mean phenotype from the parental values, and the rest is due to the segregation variance. Because one term in the segregation variance does not depend on trait epistasis, hybrid incompatibility in the F 2 and backcrosses can evolve even in a purely additive genetic architecture, as previously observed by Barton (1989) and Slatkin and Lande (1994). portant component of F 2 and backcross load under quasi-neutral divergence. If the allelic reference effects are evolving in a quasi-neutral manner, their divergence may be approximated as a Brownian motion (Lynch and Hill 1986; Lynch 1990). If so, the variance parameter k is expected to increase linearly with time, t, as k = 2σ2 t, where σ2 equals the quasi-neutral mutational variance per allele per generation. Using this, we get L F1 = 2sσ4 t 2 i L F2 = snσ2 t + EVOLUTION OF THE HYBRID LOAD UNDER L B1 = QUASI-NEUTRAL DIVERGENCE In this section, we propose and analyze a model of quasi-neutral divergence, where we assume that all loci diverge at the same stochastic rate. When reproductive isolation occurs under uniform selection, genetic divergence is due to either stochasticity stemming from genetic drift or the random appearance of new mutations (Mani and Clarke 1990). As selection will keep the phenotype near the optimum, divergence will be constrained to sets of high-fitness genotypes in a complex interaction between selection and stochasticity. For this model, we assume the difference in allelic reference effects, i a, is independent and identically normally distributed with mean zero and variance parameter k. This implies that the rate of divergence is independent of the strength and pattern of epistasis experienced by each specific locus (see below for conditions). Then, as shown in Appendix A, the expected hybrid loads contributed by one trait are L F1 = sk2 ij 2 ε , 2 i j L F2 = skn 13sk2 ij 2 + ε , 2 8 i j L B1 = skn 5sk2 ij 2 ε , + 4 8 i j (10) where n is number of loci, and denotes expectation with respect to random allelic divergence. This predicts that F 1 load is proportional to the sum of the squared epistasis coefficients whereas the F 2 and B 1 loads simplify to two terms, one due to epistasis and one term proportional to the summed variances in allelic divergence, independent of epistasis. As the effects of epistasis reduce to a simple sum of the squared epistatic coefficients, the directionality of epistasis does not affect the load under quasi-neutral divergence. As shown in Appendix A, the deviance of the hybrid mean from the optimum contributes a factor 1/2 (out of the 13/8) to the epistatic term in the F 2 load, and a factor 1/4 (out of 5/8) to the backcross load. Thus, segregation variance is the most im- ε , ij 2 j 13sσ4 t 2 ij 2 ε , 2 i j (11) 5sσ4 t 2 ij 2 snσ2 t + ε , 2 2 i j The mutational variance parameter, σ2 , only includes mutations with sufficiently small effects to behave as if they are nearly neutral. As most mutations may not fall exactly on the neutral subspace, this parameter will be affected by the strength of selection, the effective population size, the level of genetic variation, the number of loci, and the strength of epistasis. Under stabilizing selection, the strength of selection on an individual allele is proportional to s times its quadratic effect, and only mutations with a squared effect less than (4N e s)−1 will be effectively neutral (Lynch 1984). Thus, σ2 may, to a first approximation, be proportional with (N e s)−1 , and this leads us to predict that the terms with epistasis in (11) may scale as ∼N e −2 s−1 , whereas the first (additive) term in the F 2 and backcrosses may be roughly independent of s and scale with N e −1 . However, this relationship is defined when the average affect of an allelic substitution deviates from the optimum and the population mean is at the optimum. As phenotypic variance increases in the population and the population deviates from the optimum, the average effect of a substitution becomes complex and the selective effect on individual substitutions less well defined. This relationship may be true early on, but break down as genetic variation accumulates. The effect of epistasis on the rate of divergence is difficult to predict, as epistasis both alters the effect size of new mutations, and induces canalizing selection pressures that operate even in infinite populations (Hermisson et al. 2003; Alvarez-Castro et al. 2009). As we will see in the simulations later, very strong epistasis blocks allelic divergence, and prevents the development of hybrid load altogether. Quasi-neutral divergence can only occur in traits with zero to moderate epistasis. Ignoring these complications, the equations in (11) predict that the hybrid load in the initial phases of reproductive isolation will develop linearly with time in the F 2 and backcrosses. With moderate epistasis, an F 1 load will also develop and increase quadratically. Eventually, the accumulation of hybrid load will EVOLUTION 2009 5 J. L . F I E R S T A N D T. F. H A N S E N decelerate as the total load approaches its theoretical maximum of one. If multiple traits are subject to the same processes and contribute to reproductive isolation, the total load will increase more quickly, but still as a combination of linear and quadratic terms with time in the initial phases of reproductive isolation. The quasi-neutral hypothesis assumes that loci diverge independently, and the predictions do not fit many of our simulation results. This suggests correlated divergence in the reference effects of epistatically interacting loci may be important. Because the two populations experience the same conditions there is no favored sign of i a, but an accelerated evolution of hybrid load may result from correlations between ij ε, i a, and j a of one or more interacting loci, j. For these correlations to have an effect on hybrid load, they must lead to systematic divergence of one or more of the terms in equations (5)–(9). Although the quasi-neutral model assumes and predicts that hybrid load is not affected by the pattern of epistasis, directionality or other systematic patterns of epistasis may result in the correlated divergence of interacting loci. To evaluate these scenarios, we now turn to simulations. Methods of Simulation We used individual-based simulations to study the divergence of identical populations subject to mutation, drift, and either stabilizing or directional selection acting on a single phenotypic trait with a multilinear genotype–phenotype map. The fitness of each individual (w) was determined by the strength of selection (s) and the phenotype (z), where θ is the optimum phenotype. For stabilizing selection, the fitness function was w = 1 − s(z − θ)2 and for directional selection the fitness function was w = 1 + s(z − z̄). The methods of simulation follow Carter et al. (2005). Each simulation began with the creation of an epistasis matrix E. The off-diagonal coefficients of the matrix were drawn randomly from a normal distribution with a specified mean and standard deviation, and all individuals in all populations in a simulation run shared the same epistasis matrix. Directional architectures had epistatic coefficients with a negative or positive mean, and a standard deviation of some specified value. For the results we present here, the standard deviation for directional architectures was on the order of the mean—for instance, a negative architecture with mean of −1 had a standard deviation of 1. This means that although the majority of epistatic coefficients in positive and negative architectures were of similar sign and magnitude, a few coefficients were of small order and opposite sign. We also performed simulation runs with fixed epistatic coefficients to explore the effects that the variance in epistatic interactions had on the evolution of the populations. We examined nondirectional epistasis through epistatic coefficients with a mean of zero and a standard deviation of some specified value. Because the mean centers at the origin, using the 6 EVOLUTION 2009 same standard deviation for nondirectional and directional architectures would result in stronger average epistasis in directional runs. To produce comparable strengths of epistasis, we scaled the standard deviation used in nondirectional runs with the following formula: Let x be a normally distributed variable with mean m and standard deviation σ. The expectation for the absolute value of x is then E[|x|] = σ m 2 |m| 2/ Exp − /σ m + /σ Erf , √ π 2 σ 2 x where Erf [x] = (2/√π) 0 Exp[−t 2 ] dt is the error function. In our simulations m = σ in the above equation, and therefore, σ ε(Nondirectional) = 1.1666 × σ ε(Directional) . The population had N = 1000 individuals with equal numbers of males and females, and population size was constant over generations. In the simulations all ancestral allelic values were set to zero, thus the reference effect of a single allele is also its numerical allelic value in this study. As all individuals began with a specified number of diploid loci with allelic reference values of zero and a reference genotype of zero, all phenotypic changes were due to the different mutations experienced in each population. The reference effects of new mutations were created by randomly drawing from a normal distribution with a mean of zero and a specified standard deviation σ m (the mutational effect size) and adding these to the existing allelic value. The simulation selected individuals for mating based on their relative fitness, and new individuals were created with free recombination between all parental loci. A small amount of environmental variance was added to each phenotype by adding a randomly drawn value from a normal distribution with a specified standard deviation (for most √ simulation runs, this standard deviation was 0.05). Parameters that varied between simulations were: number of loci, number of generations, population size (N), mutation rate (μ), mutational effects (σ m ), type and strength of selection (s), and the mean (|ε|) and standard deviation (σ ε ) of epistatic coefficients. The parameter space for these simulations was very large and for the simulations presented here we fixed some of the parameters to focus on the importance of the epistatic architectures. Small population sizes and large mutational effects caused the populations to behave erratically whereas large populations, low mutation rates, and small mutational effects greatly increased our computational time. Reproductive isolation developed slowly and increased consistently with the number of loci and number of generations, and to accelerate the simulations we fixed those parameters at number of loci = 20 and number of generations = 106 , and increased the mutation rate to μ = 0.01 mutations/locus/generation. For the simulations presented here, σ m = 0.01. Because our main goal was to evaluate the effects of genetic architecture and the epistatic effects (as measured in the evolving populations) themselves change over H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E time, we started all simulations from zero reference effects and did not allow a burn in. The run time for these simulations was very long and to adequately sample the parameter space and replicate our results we used extremely high per locus mutation rates. Because of this, we cannot make quantitative predictions about the absolute time to speciation. The main goal of our study was to understand how the evolution of reproductive isolation would be affected by genetic architecture, and high mutation rates would affect the results if they increased the likelihood of cosegregation of alleles at different loci and resulted in linkage disequilibrium that altered the allelic dynamics. This is unlikely for two reasons: First, the increased number of alleles per locus is offset by the low number of loci, and the total mutational variance is not unrealistically high (see Discussion). Second, previous studies with the multilinear model have considered the effects of linkage disequilibrium with numerous cosegregating alleles and moderate linkage, and found that the effects were minimal (Hermisson et al. 2003; Carter et al. 2005). We simulated populations under a wide range of mean epistasis strengths (|ε|) and found that strong epistasis produced very little hybrid load. Because of this, we concentrated on thoroughly exploring the parameter space from ε̄ = −2 to ε̄ = 2. In terms of individual gene interactions, this is rather weak epistasis. The presence of an allele with reference effect i a will modify the effect of a new mutation at interacting locus j with the factor f = 1 + ij i ε a (Hansen and Wagner 2001a). Thus, if ij ε = 1, then fixing a typical mutation with homozygous effect 2i σ m = 0.02 will alter the phenotypic effect of a subsequent mutation at locus j with 2%. Each run created 10 populations subject to the same parameters and epistasis matrix. The run time for these simulations was very long and we created multiple populations with the same epistasis matrix to maximize the number of population crosses under a given run time. As there were 10 parental populations in each simulation run, there were 45 F 1 , F 2 , and backcross populations. Using each population in multiple crosses increased the chances of obtaining results that were disproportionately influenced by one or a few odd populations, and we ran 20–30 simulations for each set of parameters to increase our certainty. Using each population in multiple crosses greatly increased our sample size, and we were able to examine 900–1350 F 1 populations for each set of parameters instead of 100–150 F 1 populations if we had used each population in a single cross. At the end of the experiment, 50 individuals from each population were randomly selected and mated to individuals from another population to create 1000 F 1 hybrids. The 1000 F 1 hybrids were randomly mated to create an F 2 population of 1000 individuals. We created backcross populations of 2000 individuals by randomly selecting F 1 individuals and mating them to one randomly selected individual from each parental population. All hybrids were created with free recombination between genotypes, and we calculated the phenotype, fitness, and load for each hybrid. For both directional and stabilizing selection the load was calculated in each parental population as (w p − w h )/w p , where w p was the average fitness in the parental population and w h was the average fitness of the hybrid. To assess the effects of individual alleles or single-locus genotypes on hybrid fitness, we introgressed individual alleles from one parental population into a second parental population and calculated the fitness of the second parental population after each introgression. To do this, we randomly chose one individual from one population from each simulation run as a reference and introgressed each allele from a randomly selected individual from every other parental population in the simulation. Results EVOLUTION OF THE POPULATIONS UNDER STABILIZING SELECTION The populations began with all allelic reference values set to zero, but at the end of the simulation runs the populations had fixed alleles with reference effects that diverged significantly from the ancestral population (Fig. 2). Despite the large reference effects of these allelic values, the parental phenotypes remained close to the ancestral state and mean fitness was never more than 1–2% below the maximum. The genotypic values at individual loci changed dramatically over the course of the simulations and although some loci did not diverge much from the ancestral values and could accept ancestral alleles, introgressions at other positions caused severe loss of fitness. The Brownian-motion model assumes that the betweenpopulation variance in reference effects should be linear with time and have a slope of one. The slope was roughly one for most parameter combinations (Fig. 2) but less than one for populations evolving with stronger epistatic interactions (i.e., |ε|= 1.117 in these simulations). Increasing the strength of selection also slowed the accumulation of variance (Fig. 2). Directional architectures showed systematic shifts away from the ancestral state in mean epistatic terms (i.e., ij ε iy jy) and mean locus reference effects ( ȳ, Fig. 3). The distribution of reference effects for populations with positive architectures became increasingly negative over the evolution of the populations and produced a distribution of ij εi yj y with a positive mean, whereas populations with negative architectures fixed positive alleles and had ij i j ε y y with negative means. Populations with nondirectional architectures diverged from the ancestral population, but did not consistently shift in either ȳ or ij ε iy jy. Although the genetic architectures shifted systematically, this did not result in correlated epistatic terms (Fig. 4). The mean locus divergence between populations (y) was not systematic (Fig. 4A) and although the mean EVOLUTION 2009 7 J. L . F I E R S T A N D T. F. H A N S E N Directional epistasis Nondirectional epistasis Additive Architecture ( = 0) Additive Architecture, s=0 | | = 0.0117 =0 Average Locus Variance (among populations) Average Locus Variance (among populations) | | = 0.117 Average Locus Variance (among populations) | | = 1.17 Average Locus Variance (among populations) s = 1.0 10 s = 0.1 s = 0.01 y = 0.544x - 1.36 y = 0.500x - 1.56 y = 0.493x - 2.28 y = 0.505x - 2.52 y = 0.652x - 0.77 y = 0.689x - 0.81 2 1 0.2 0.1 1 10 2 3 4 10 1 2 3 4 10 1 2 3 4 10 y = 1.034x - 1.23 y = 0.956x - 1.35 2 1 0.1 y = 0.973x - 0.44 y = 0.979x - 0.46 y = 0.998x - 0.69 y = 0.924x - 0.63 0.2 1 10 2 3 4 10 1 2 3 4 10 1 2 3 4 10 y = 0.985x - 1.44 y = 1.005x - 1.45 2 1 0.1 y = 1.032x - 0.44 y = 1.013x - 0.46 y = 1.00x - 0.68 y = 1.00x - 0.69 0.2 1 10 2 3 4 10 1 2 3 4 10 1 2 3 4 10 y = 1.006x - 0.71 y = 1.029x - 1.47 2 1 y = 1.083x - 0.52 y = 1.043x - 0.39 0.2 0.1 1 2 3 4 10 1 2 3 4 10 1 2 3 4 10 Generations x 105 Figure 2. Log–log plots of variance in allelic reference effects across independent populations against time in generations. The accumu- lation of variance was linear on the log scale for most parameter combinations, but the slope was <1 for all parameter combinations, and 1 for populations where |ε| = 1.117. The average variance per locus was calculated among different populations within one simulation run and then averaged across loci. This was then averaged across 20–30 simulation runs with the same parameters. The parameters for the simulations were: N = 1000, number of loci = 20, μ = 0.01 mutations/locus/generation, σ m = 0.01. The values for s are shown across the top of the figure and the values for ε̄ are shown along the side with σε = ε̄. epistatic divergence (ij ε iy jy) shifted according to architecture (Fig. 4B), the magnitude of this shift was small. Hermisson et al. (2003) found there is a phase of adaptation of the genetic architecture under stabilizing selection driven by canalizing selection. This reduces and equalizes directional 8 EVOLUTION 2009 epistasis across loci, as measured with respect to the evolving population, and may explain why we see a systematic change of reference effects in directional architectures. Populations with directional epistasis diverged more quickly from the ancestral state, but because the diverging populations shifted in the same H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E s=1.0, m=0.01, 1 1 Positive epistasis Nondirectional epistasis Negative epistasis 0.5 ij i j yy 0.5 0 y for 500,000 generations and then evolve separately for 1,000,000 generations the development of reproductive isolation occurred more slowly than in the simulation results we report here, but the results were qualitatively similar. The results we report are for a population size of 1000 individuals, and we also did runs with 200 individuals. As expected, these populations diverged and developed reproductive isolation more quickly, but also much more stochastically. =0.1 B A -0.5 -1 | |=0.1, 0 -0.5 2 4 6 8 -1 10 Generations x 105 2 4 6 8 10 EVOLUTION OF REPRODUCTIVE INCOMPATIBILITIES Generations x 105 The difference in the accumulation of variance/load between directional and nondirectional architectures was due to a systematic shift of the epistatic terms (ij εi y j y) and the locus valFigure 3. ues (y). The average epistasis term across populations sharing the same parameters (A) shifted such that positive architectures had distributions of epistasis terms with positive means, negative architectures had distributions with negative means, and nondirectional architectures had means that remained close to the origin. The average locus value across these populations (B) shifted in the opposite direction from the average epistasis term, and positive architectures had allelic distributions with negative means, negative architectures had allelic distributions with positive means, and nondirectional architectures had means that remained close to the origin. These values were calculated over 20–40 simulation runs (200–400 populations) that shared the same parameters (for nondirectional epistasis, |ε| = 0). direction, they diverged only marginally faster from each other than nondirectional architectures (Fig. 2). Still, some proportion of reproductive isolation may have been due to initial “adaptation” of the architecture, and influenced by compensatory evolution. Indeed, when we allowed the populations to adapt to the background A B 0.1 0.1 Positive epistasis Nondirectional epistasis Negative epistasis yy 0.05 ij i j y 0.05 0 -0.05 -0.1 0 -0.05 2 4 6 8 Generations x 10 5 10 -0.1 2 4 6 8 Generations x 10 10 5 Figure 4. The difference between epistatic terms (ij εi y j y) and allelic values (y) did not show evidence of systematically contributing to the hybrid load. In (B) the epistatic terms are slightly shifted away from the origin, but these values are very small compared to the shifts in average locus and epistatic effect reported in Figure 3. These values were calculated over 20–40 simulation runs (200–400 populations) that shared the same parameter values (shown in figure). The evolution of hybrid load was highly stochastic, but many architectures reached a plateau with loads less than 100% fitness loss. Such plateaus are probably due to canalizing selection limiting the divergence of the architectures. In Figure 5, we present the average F 1 load from simulation runs with different parameter combinations. The Brownian-motion expectations predicted that the F 1 hybrid load would accumulate quadratically, and we tested this by log transforming the data and fitting a linear regression. If the accumulation was quadratic, the slope of this regression should be two. We found that the slopes varied between 0.225 and 3.272 depending on the values of |ε| and s, and were not consistently two. As expected, additive architectures did not develop F 1 hybrid load. The F 1 hybrid load was highest for intermediate values of ε (i.e., |ε| = 0.117) and increased with increasing values of s (Fig. 5), but F 2 and backcross hybrids developed load under many combinations of |ε| and s (Fig. 6). With the parameters chosen, |ε| = 0.117 produces on average a 0.2% change in the effect of a subsequent mutation and this may be the level at which selection, constraints, and mutational effects balance to accelerate the accumulation of genetic changes. Directional and nondirectional architectures produced similar levels of F 2 and backcross hybrid load (Fig. 7A,C) and we present only the directional cases in Figure 6. Severe hybrid load developed due to segregation variance alone, and both F 2 and backcross hybrids showed high levels of load across small values of |ε|, including in purely additive architectures. Increasing s increased F 2 and backcross hybrid load (Fig. 6). In Figure 7, we plot the load across strengths of |ε| for F 1 , F 2 , and backcross hybrids. For F 1 hybrids, there was a window of intermediate values of |ε| that produced high loads (Fig. 7A). Here, there was a marked difference between directional architectures that reached loads of about 80% and nondirectional architectures that reached loads of about 40%. The development of F 1 load was extremely stochastic with some population pairs evolving zero fitness within 200,000 generations and other pairs remaining compatible over the full 1,000,000 generations. We ran some of our simulations over 2,000,000 generations to see if all population pairs would eventually evolve reproductive isolation over a longer time scale, but some pairs retained high hybrid fitness even over this time scale. EVOLUTION 2009 9 J. L . F I E R S T A N D T. F. H A N S E N Directional epistasis Nondirectional epistasis Additive Architecture ( = 0) s = 1.0 | | = 1.17 F1 Hybrid Load 1 y = 0.02*x0.225 y = 0.02*x1.061 0.8 0.2 2 4 1 F1 Hybrid Load y = 0.01*x1.037 y = 0.01*x1.029 y = 0.02*x0.627 y = 0.02*x0.819 0.4 | | = 0.117 6 8 10 2 y = 0.01*x3.272 y = 0.01*x2.067 0.8 4 6 8 10 2 y = 0.01*x2.094 y = 0.01*x1.694 4 6 8 10 y = 0.001*x 1.995 y = 0.001*x 1.966 0.6 0.4 0.2 0 | | = 0.0117 2 1 4 6 8 10 2 4 6 8 10 2 y = 0.0002*x 2.401 y = 0.0008*x 2.195 y = 0.001*x 2.049 y = 0.002*x 2.295 0.8 4 6 8 10 y = 0.0002*x 0.891 y = 0.0001*x 0.181 0.6 0.4 0.2 0 =0 2 4 6 8 10 2 1 4 6 8 10 2 y = 0.0003*x 0.26 y = 0.0002*x 1.02 F1 Hybrid Load s = 0.01 0.6 0 F1 Hybrid Load s = 0.1 4 6 8 10 y = 0.00002*x 0.94 0.8 0.6 0.4 0.2 0 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 Generations x 105 Figure 5. The accumulation of F 1 hybrid load depended on the parameters and significant load evolved with |ε| = 0.117, s = 1.0 and s = 0.1. We analyzed the pattern of accumulation of hybrid load by log transforming the data and fitting a linear regression to the initial phase of divergence (the first 400,000 generations). If the rate of accumulation of hybrid load was quadratic, the slope of this log–log regression would be two. The hybrid load was averaged across 10–30 simulation runs with the same parameters (45 F 1 populations per simulation run, 450–1350 F 1 populations total). The parameters for the simulations were: N = 1000, number of loci = 20, μ = 0.01 mutations/locus/generation, σ m = 0.01. The values for s are shown across the top of the figure and the values for |ε| are shown along the side with σε = |ε|. The F 2 and backcross hybrids reached similar loads for both directional and nondirectional architectures (Fig. 7B,C). As |ε| increased, the average F 2 and backcross loads decreased although there was a small peak around the same point where F 1 load peaked. In contrast to these results, F 1 and F 2 hybrids from populations with a fixed ε evolved comparatively less load than populations with variable epistatic interactions (Fig. 7D). Due to the high fitness of the parental populations, none of the crosses resulted in heterosis. 10 EVOLUTION 2009 We analyzed the architecture of the diverged populations through single-allele introgressions and found that although the populations diverged from the ancestral population and each other at most loci, reproductive incompatibility resulted from a small number of loci with negative interactions throughout the background. When single alleles were introgressed from one parental population into another, the majority of loci appeared to have negligible effects on fitness (Fig. 8) although many more interactions likely contributed to hybrid load. When the introgression results H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E F2 hybrids Backcross hybrids s = 1.0 | | = 1.17 s = 0.1 s = 0.01 1 Hybrid Load 0.8 0.6 0.4 0.2 | | = 0.117 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 1 Hybrid Load 0.8 0.6 0.4 0.2 | | = 0.0117 1 Hybrid Load 0.8 0.6 0.4 0.2 =0 1 Hybrid Load 0.8 0.6 0.4 0.2 Generations x 105 Figure 6. The accumulation of F 2 and backcross hybrid load occurred across a broad area of parameter space and increased with increasing s and decreasing |ε|. The hybrid load was averaged across 10–30 simulation runs with the same parameters (45 F 1 populations per simulation run, 450–1350 F 1 populations total). The parameters for the simulations were: N = 1000, number of loci = 20, μ = 0.01 mutations/locus/generation, σ m = 0.01. The values for s are shown across the top of the figure and the values for |ε| are shown along the side with σε = |ε|. were compared with the phenotype and fitness of the F 1 hybrid, the majority of the reproductive incompatibility at each time step mapped to only one or two loci (an example is shown in Fig. 9). EVOLUTION OF REPRODUCTIVE ISOLATION UNDER DIRECTIONAL SELECTION The behavior of the multilinear model under directional selection has been reported in previous papers (Carter et al. 2005; Hansen et al. 2006) and our simulation runs under directional selection produced very similar results to other studies (Johnson and Porter 2000, 2007; Porter and Johnson 2002), with reproductive isolation developing quickly under a large range of parameter values. Consequently, our discussion of directional selection will be brief. The direction of epistatic interactions affected the phenotypic response to directional selection (for details, see Carter et al. 2005 and Hansen et al. 2006a). The F 1 hybrids from populations with positive epistatic architectures were always intermediate to the parental populations whereas the F 2 hybrids could show loss of fitness, heterosis, or intermediate phenotypes. This is because under positive epistasis the population phenotypes diverged very EVOLUTION 2009 11 J. L . F I E R S T A N D T. F. H A N S E N s=1.0, s=1.0, m=0.01 m=0.01, | |=0.117, =0.1 1 1 Average Hybrid Load Average Hybrid Load 1 0.8 0.6 0.4 0.2 10-4 F2 Hybrids B 0.001 0.01 0.1 0.8 0.6 0.4 0.2 10-4 1 | | 0.1 1 Average Hybrid Load 0.8 0.6 0.4 0.2 0.2 0 1 2 3 4 0.1 1 F1 hybrids F2 hybrids 0.2 10-4 7 8 9 10 11 12 13 14 15 16 17 18 19 20 The allelic introgressions showed that the majority of Discussion 0.001 0.01 0.1 1 | | Directional epistasis Nondirectional epistasis Figure 7. Hybrid load as a function of the strength of epistasis. The average (A) F 1 , (B) F 2 , (C) F 1 backcross, and (D) fixed ε hy- brid loads are plotted against the mean strength of epistasis in the population, |ε|. When the populations evolved with no variance in the strength of epistatic interactions, very little F 1 hybrid load was produced. Load was maximized at intermediate levels of epistatic interactions, but the height of this load was <20% although the F 1 hybrids from populations with variable epistatic interactions showed average loads of >80% for the same parameters and strengths of epistasis. At each point the average of 900 population crosses is plotted. The parameters for the parental populations were: N = 1000, number of loci = 20, s = 0.1, σε = |ε| (except for runs with fixed ε, where σ ε = 0), μ = 0.01 mutations/locus/generation, σ m = 0.01. quickly and there was a very large phenotypic space the hybrids could fall into. In the F 2 ’s, diploid combinations of parental loci were segregating and thus had significant effects on the phenotype. Under nondirectional epistasis, parental populations showed intermediate divergence and F 1 hybrids were either intermediate to the P 1 and P 2 populations or showed loss of fitness in both populations and F 2 hybrids also showed intermediate or low fitness. Under negative epistasis, both F 1 and F 2 hybrids typically showed loss of fitness relative to both parental populations although some population crosses did produce hybrids with intermediate phenotypes. EVOLUTION 2009 6 1000, number of loci = 20, s = 0.1, |ε|= 0.117, σ ε = 0.1, μ = 0.01 mutations/locus/generation, σ m = 0.01. 0.6 0.4 5 alleles had very little effect on fitness. The plot shows the fitness effects of individual loci introgressed from P 2 into P 1 , ordered by effect. The parameters for the simulation graphed here were: N = 0.8 | | 12 0.4 Figure 8. 1 0.001 0.01 0.6 Loci Ranked by Fitness Effect Fixed D 1 Average Hybrid Load 0.001 0.01 0.8 | | Backcross C 10-4 P1 Fitness After Introgression F1 Hybrids A Stabilizing selection acting on a phenotypic trait with a polygenic architecture will generate a network of high-fitness genotypes on which quasi-neutral drift may take place. Separated populations may experience identical selection regimes and still drift to different positions in the network, and this may produce low-fitness hybrids (Gavrilets 2004; illustrated in Fig. 1). We have studied the role of genetic architecture in this process. Our simulations showed that genetic divergence occurred when trait epistasis was weak to moderate, but strong epistasis prevented divergence. The F 1 hybrid load developed under intermediate strengths of epistasis that were strong enough to shift the hybrid phenotype away from the parental mean, but not strong enough to prevent allelic divergence. When epistasis was weak or absent, F 2 and backcross load developed due to segregation variance. Our analytical model of quasi-neutral divergence predicted that hybrid load would scale with the sum of the squared epistasis coefficients, and therefore that directionality of and variance in epistasis would have minor effects. It further predicted that load would accumulate quadratically for F 1 hybrids and with a combination of linear and quadratic factors for F 2 and backcross hybrids, and that load would scale negatively with the strength of selection. These predictions were not well supported by our simulations, and it is clear that the dynamics of allelic divergence in the simulations were more complex than assumed in our simple quasi-neutral model. Directional epistasis elevated the F 1 load but had small effects on F 2 and backcross loads, which indicates that it primarily affected the shift of the mean phenotype, and not the segregation variance. Directional epistasis also accelerated the divergence of the populations from their ancestral state, and may H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E s=1.0, Fitness After Introgression 1 =0.1 B A 0.6 0.4 0.2 1 Fitness After Introgression | |=0.117, 0.8 0 3 5 12 13 Generation 2 x 105 15 F1 3 5 12 13 Generation 3 x 105 15 F1 15 F1 15 F1 D C 0.8 0.6 0.4 0.2 0 1 Fitness After Introgression m=0.01, 3 5 12 13 Generation 4 x 105 15 F1 3 5 12 13 Generation 6 x 105 F E 0.8 0.6 0.4 0.2 0 3 5 12 13 Locus Generation 8 x 105 15 F1 3 5 12 13 Locus Generation 10 x 105 Figure 9. This graph shows the fitness after introgression of single alleles over time and plots only the loci with severe fitness ef- fects (fitness <0.05). The fitness effects of individual loci changed stochastically, and the locus that initially caused reproductive incompatibility (locus three in B) changed to have very mild effects on fitness later in the simulations (E and F). After a number of generations that locus evolved to be more compatible between the parental populations, and reproductive incompatibility in later generations was caused by other loci. At the end of the simulations, the loci responsible for the initial hybrid dysfunction had fixed compatible alleles and the early genetic history of the loci was not reflected in the later time steps. Although we present the results from one cross in Figure 7 for simplicity, we have performed allelic introgressions for all simulations with results presented in this article and we see the same results across simulation runs for which hybrid load evolves. The parameters for the simulation graphed here were: N = 1000, number of loci = 20, s = 0.1, |ε|= 0.117, σ ε = 0.1, μ = 0.01 mutations/locus/generation, σ m = 0.01. thus be more important when a small population is isolated from a large, slowly evolving population as in peripatric speciation. Although the BDM model is well established, ours is the first analytical study that examines how epistasis within a population relates to epistatic effects between separated populations. Many theoretical studies have explored evolutionary change through neutral fitness networks (e.g., Schuster et al. 1994; Fontana and Schuster 1998; Gavrilets 1999; van Nimwegen et al. 1999; Reidys and Stadler 2001, 2002; Hermisson et al. 2003; Gavrilets 2004; Ciliberti et al. 2007; Alvarez-Castro et al. 2009) and empirical studies have found evidence that developmental systems may diverge through compensatory mutations (reviewed in True and Haag 2001; Haag 2007). Gavrilets (1999, 2003, 2004; Gavrilets and Gravner 1997) expanded on the BDM model through two types of mathematical models. First, he explored networks of high-fitness genotypes connected by small allelic changes with reproductive isolation accumulating as a byproduct of mutation and genetic drift moving populations through genotypic space. His second approach (Gavrilets 2003, 2004) was to expand on a probability-based model developed by Orr (1995) and Orr and Turelli (2001) in which incompatibilities are not modeled explicitly, but assumed to accumulate over time. This approach predicts that reproductive incompatibilities will accumulate with the square of time, a pattern we also see for pairwise polygenic trait epistasis. Johnson and Porter (2000, 2007; Porter and Johnson 2002) studied the BDM scenario through a computational model of binding strength in a developmental pathway. When the parental pathways were under parallel directional selection, Johnson and Porter (2000) found that both F 1 and F 2 hybrids often had extremely low fitness after 1000 generations, but under stabilizing selection hybrids showed no fitness loss. Their directional selection results differ from ours in that they do not report heterosis, but this is likely due to the high fitness of their parental populations and the Gaussian fitness function used in their model. Palmer and Feldman (2009) studied the BDM process through a computational model of gene networks and found that reproductive incompatibility evolved under stabilizing selection if the gene interactions were parameterized to act in a threshold manner. They attributed the development of reproductive incompatibility to complex patterns of epistasis and pleiotropy, but because of the complexity of the model they were not able to analyze the architecture. Modeling the BDM process with a genotype–phenotype mapping allowed us to analyze reproductive isolation at different levels of hybridization. Our findings point to segregation variance as the dominant component of hybrid load for many parameter values, and we found that significant backcross and F 2 load evolved in a purely additive trait architecture. Segregation variance does not affect the population mean, but causes the average hybrid individual to deviate from the optimum. Only with certain strengths of epistasis will systematic shifts in the trait mean contribute significantly to hybrid load, but any value of epistasis will increase the effect of segregation variance. The evolution of reproductive EVOLUTION 2009 13 J. L . F I E R S T A N D T. F. H A N S E N isolation under stabilizing selection through segregation variance in F 2 hybrids was first suggested by Barton (1989). Barton’s model suggested that the fitness cost would be small (<10%), although he proposed that similar processes occurring for multiple traits could result in large fitness differences. A similar model was studied in detail by Slatkin and Lande (1994), who found that significant F 2 segregation variance evolved under stabilizing selection depending on the distribution of allelic effects. Our results expand on these models by demonstrating that segregation variance results in significant levels of hybrid load across a broad range of genetic architectures, and by quantitatively comparing the contribution of F 1 (epistatic) load with hybrid load due to segregation variance. In empirical populations, F 2 hybrids often have high phenotypic variance and low fitness compared to F 1 or parental populations (Lynch and Walsh 1998; Edmands 1999; Demuth and Wade 2007), and a similar pattern has been documented in backcross hybrids (Hurt and Hedrick 2003). A closer look at patterns of trait variation in hybrids may be the best way to test the segregation-variance hypothesis. Reproductive isolation evolving through traits with an additive genetic basis falls under the BDM scenario because any nonlinear fitness function will produce epistasis for fitness (Wagner et al. 1994). Trait epistasis is significant because it is necessary for the development of F 1 load, which cannot be generated by an additive trait when the parental populations have similar phenotypes, and because it elevates the F 2 and backcross loads. Strong trait epistasis, however, prevents genetic divergence and thus limits the evolution of reproductive isolation. This is because evolution along fitness isoclines, as in Figure 1, is not purely neutral. Epistatic interactions produce distinct equilibrium points that are generated by canalizing selection (Hermisson et al. 2003; Alvarez-Castro et al. 2009) and when epistasis is strong, canalizing selection overcomes drift. Independent populations then evolve toward the same points in the landscape. On the other hand, strong epistasis may also bring distinct branches of the neutral space closer to each other. This allows discrete jumps (or peak shifts) from one branch to another, and may produce discrete incompatibilities (Fig. 1). Although the evolution of incompatibilities was slow and stochastic in our simulations, we may expect more regular evolution in real populations where multiple traits are under stabilizing selection. In their Speciation book, Coyne and Orr (2004) reported estimated diversification intervals (the time it takes for a lineage to completely speciate) between 0.5 and 20 million years with an average of 6.5 million years. Bolnick and Near (2005) used hybrid hatching success as a measure of hybrid incompatibility and found that Centrarchid fish (with a generation time of two to four years) lost hybrid fitness at a rate of 3.13% per million years, had a mean diversification rate of 11.15 million years and a minimum period of 24.83 million years to complete hybrid inviability. 14 EVOLUTION 2009 Other published estimates of minimum time to hybrid inviability range from 1.5 million years in Anurans (Sasa et al. 1998) to 5.5 million years in birds (Price and Bouvier 2002; Lijtmaer et al. 2003). In our simulations with 1000 individuals, mutation rates of 10−2 per allele per generation, and 20 loci contributing to the trait, loads on the order of 10% or more typically appeared after 105 generations. Real per locus mutation rates may be three orders of magnitude less than our assumption (Drake et al. 1998), and this would greatly increase our predicted divergence times, but as explained in the methods this is unlikely to alter our qualitative results on the effects of genetic architecture. In particular, although our per locus mutation rates were extremely high, it is the total mutation rate that would determine the progress of reproductive isolation. As the number of loci in our simulations is very low when compared to polygenic traits in real populations, the total mutation rate may be comparable if a lower per locus mutation rate was balanced by a larger number of loci (Nei et al. 1983). In fact, the level of segregating variation in our simulations was realistic. Houle (1998) assessed trait variation in life history and phenotypic characters through the ratio V A /V M and reported that V A /V M in published studies ranged from 33.72 to 646.01. The parameters we used in these simulations resulted in comparable V A /V M ratios and ranged from 25 to 500 in different populations. Even so, we add the caveat that we have not systematically explored the effects of varying per locus mutation rates, numbers of loci, or total mutation rate, and it is possible that some results may depend on this. Even with these caveats, our results indicate it is unlikely that reproductive isolation can evolve rapidly due to a single trait under stabilizing selection. Thus, BDM incompatibilities under stabilizing selection likely result from the cumulative load produced by a large number of traits. A load of a few percent per trait can evolve quickly in our simulations, and if this happens in hundreds of traits, a polygenic BDM becomes a reasonable hypothesis for the evolution of postzygotic isolation. Our simulation results indicate that studying reproductive isolation at the end of the process may provide few insights into the origins of reproductive barriers. As noted by Coyne and Orr (2004), reproductive isolation will inevitably be followed by additional genetic divergence, and this may potentially obscure which genetic interactions caused the initial speciation event. Canalizing forces will move populations toward robust areas of neutral space and reduce large between-population deviances, and thus incompatibilities may come and go over time. In this study, the differences between early and late genetic incompatibilities were significant. We could not have identified the original incompatibility loci if we had only examined the endpoint of the simulations, and this problem could only be magnified in empirical studies. Although the process of reproductive isolation was polygenic and complex, hybrid load mapped to introgression on H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E relatively few loci. The introgressed alleles likely caused their effects through interactions with many loci, but the pattern of hybrid load mapping to few alleles was consistent and independent of time. Although previous theoretical studies of the BDM model have indicated that the number of incompatibilities should accelerate with time (Orr 1995; Orr and Turelli 2001), empirical data point to two different scenarios. The bulk of Drosophila data has indicated that speciation is highly complex and polygenic (Palopoli and Wu 1994; Presgraves et al. 2003; Tao et al. 2003) and many studies of speciation genetics support a scenario of multiple small incompatibilities accumulating over time (Coyne and Orr 1989, 1997; Sasa et al. 1998; Presgraves 2002). In contrast, recent studies have also found BDM-type incompatibilities that map to few loci (Brideau et al. 2006) and appear to occur independent of time (Kondrashov et al. 2002; Kulathinal et al. 2004). Introgression studies in multiple plant species have indicated that the genetic basis of reproductive isolation may involve relatively few loci (Moyle and Graham 2005; Fishman and Willis 2006; Sweigart et al. 2006) although the resolution with which these are mapped is very low. The implicated regions are large enough that they may contain multiple interacting genes, but the evidence for speciation involving multiple genes or few remains ambiguous. However, hybrid load caused by contributions from many traits would also produce a pattern in which total load is mapped to many loci with small effects. Our results contribute to an accumulating body of literature that points to different ways in which genetic and phenotypic differences can be decoupled and evolution can occur through connected networks of functional equivalence, and suggest that reproductive isolation may evolve under stabilizing selection for a range of genetic architectures if multiple loci and traits are subject to similar processes. ACKNOWLEDGMENTS The authors thank A. J. R. Carter for helping write the code, and J. Hermisson, D. Houle, A. P. S. Le Rouzic, A. H. Porter and two anonymous reviewers for helpful comments on the manuscript. JLF thanks the Center for Ecological and Evolutionary Synthesis at the University of Oslo for hosting my research visit. This work was supported by a Norwegian Research Council, Leiv Eiriksson Mobility Grant to JLF, and National Science Foundation grants #0444157 and #0344417, and Norwegian Research Council grant #177857 to TFH. LITERATURE CITED Alvarez-Castro, J. M., and O. Carlborg. 2007. A unified model for functional and statistical epistasis and its application quantitative trait loci analysis. Genetics 176:1151–1167. Alvarez-Castro, J. M., M. Kopp, and J. Hermission. 2009. Effects of epistasis and the evolution of genetic architecture: exact results for a 2-locus model. Theor. Popul. Biol. 75:109–122. Arnold, S. J., M. E. Pfrender, and A. G. Jones. 2001. The adaptive landscape as a conceptual bridge between micro- and macroevolution. Genetica 112/113:9–32. Barton, N. 1989. The divergence of a polygenic system subject to stabilizing selection, mutation and drift. Genet. Res. 54:59–77. Barton, N. H., and M. Turelli. 2004. Effects of genetic drift on variance components under a general model of epistasis. Evolution 58:2111– 2132. Bateson, W. 1909. Heredity and variation in modern lights. Pp. 85–101 in A. C. Seward, ed. Darwin and modern science, Cambridge Univ. Press, Cambridge. Blows, M. W., and R. Brooks. 2003. Measuring nonlinear selection. Am. Nat. 162:815–820. Bolnick, D. I., and T. J. Near. 2005. Tempo of hybrid inviability in Centrarchid fishes (Teleostei: Centrarchidae). Evolution 59:1754–1767. Brideau, N. J., H. A. Flores, J. Wang, S. Maheshwari, X. Wang, and D. A. Barbash. 2006. Two Dobzhansky-Muller genes interact to cause hybrid lethality in Drosophila. Science 314:1292–1295. Bürger, R. 2000. The mathematical theory of selection, recombination, and mutation. Wiley, Chichester. Burton, R. S., C. K. Ellison, and J. S. Harrison. 2006. The sorry state of F-2 hybrids: consequences of rapid mitochondrial DNA evolution in allopatric populations. Am. Nat. 168:S14–S24. Carter, A. J. R., J. Hermisson, and T. F. Hansen. 2005. The role of epistatic gene interactions in the response to selection and the evolution of evolvability. Theor. Popul. Biol. 68:179–196. Charlesworth, B. 1990. Mutation-selection balance and the evolutionary advantage of sex and recombination. Genet. Res. 55:199–221. Cheverud, J. M., and E. J. Routman. 1995. Epistasis and its contribution to genetic variance components. Genetics 139:1455–1461. Ciliberti, S., O. C. Martin, and A. Wagner. 2007. Innovation and robustness in complex regulatory gene networks. Proc. Nat. Acad. Sci. USA 104:13591–13596. Coyne, J. A., and H. A. Orr. 1989. Patterns of speciation in Drosophila. Evolution 43:362–381. ———. 1997. “Patterns of speciation in Drosophila” revisited. Evolution 51:295–303. ———. 2004. Speciation. Sinauer Associates, Sunderland, MA. Demuth, J. P., and M. J. Wade. 2007. Population differentiation in the beetle Tribolium castaneum. I. Genetic architecture. Evolution 61:494–509. Desai, M. M., D. Weissman, and M. W. Feldman. 2007. Evolution can favor antagonistic epistasis. Genetics 177:1001–1010. Dobzhansky, T. G. 1937. Genetics and the origin of species. Columbia Univ. Press, New York. Drake, J., B. Charlesworth, D. Charlesworth, and J. F. Crow. 1998. Rates of spontaneous mutation. Genetics 148:1667–1686. Edmands, S. 1999. Heterosis and outbreeding depression in interpopulation crosses spanning a wide range of divergence. Evolution 53:1757–1768. Endler, J. A. 1986. Natural selection in the wild. Princeton Univ. Press, NJ. Estes, S., and S. J. Arnold. 2007. Resolving the paradox of stasis: models with stabilizing selection explain evolutionary divergence on all timescales. Am. Nat. 169:227–244. Fenster, C. B., L. F. Galloway, and L. Chao. 1997. Epistasis and its consequences for the evolution of natural populations. Trends Ecol. Evol. 12:282–286. Fishman, L., and J. H. Willis. 2001. Evidence for Dobzhansky-Muller incompatibilities contributing to the sterility of hybrids between Mimulus guttatus and M. nasutus. Evolution 55:1932–1942. ———. 2006. A cytonuclear incompatibility causes anther sterility in Mimulus hybrids. Evolution 60:1372–1381. Flatt, T. 2005. The evolutionary genetics of canalization. Q. Rev. Biol. 80:287– 316. Fontana, W., and P. Schuster. 1998. Shaping space: the possible and the attainable in RNA genotype-phenotype mapping. J. Theor. Biol. 194:491–515. EVOLUTION 2009 15 J. L . F I E R S T A N D T. F. H A N S E N Gavrilets, S. 1999. A dynamical theory of speciation on holey adaptive landscapes. Am. Nat. 154:1–22. ———. 2003. Models of speciation: what have we learned in 40 years? Evolution 57:2197–2215. ———. 2004. Fitness landscapes and the Origin of Species. Princeton Univ. Press, Princeton, NJ. Gavrilets, S., and G. de Jong. 1993. Pleiotropic models of polygenic variation, stabilizing selection, and epistasis. Genetics 134:609–625. Gavrilets, S., and J. Gravner. 1997. Percolation on the fitness hypercube and the evolution of reproductive isolation. J. Theor. Biol. 184:51–64. Haag, E. S. 2007. Compensatory vs. pseudocompensatory evolution in molecular and developmental interactions. Genetica 129:45–55. Hansen, T. F. 2006. The evolution of genetic architecture. Annu. Rev. Ecol. Evol. Syst. 37:123–157. Hansen, T. F., and G. P. Wagner. 2001a. Modeling genetic architecture: a multilinear theory of gene interaction. Theor. Popul. Biol. 59:61–86. ———. 2001b. Epistasis and the mutation load: a measurement-theoretical approach. Genetics 158:477–485. Hansen, T. F., J. M. Alvarez-Castro, A. J. R. Carter, J. Hermisson, and G. P. Wagner. 2006a. Evolution of genetic architecture under directional selection. Evolution 60:1523–1536. Hansen, T. F., A. J. R Carter, and C. Pélabon. 2006b. On adaptive accuracy and precision in natural populations. Am. Nat. 168:168–181. Hermisson, J., T. F. Hansen, and G. P. Wagner. 2003. Epistasis in polygenic traits and the evolution of genetic architecture under stabilizing selection. Am. Nat. 161:708–734. Houle, D. 1998. How should we explain variation in the genetic variance of traits? Genetica 102–3:241–253. Hurt, C. R., and P. W. Hedrick. 2003. Initial stages of reproduction in two species of the endangered Sonoran topminnow. Evolution 57:2835– 2841. Johnson, N. A. 2006. Patterns and processes of speciation: the evolution of reproductive isolating barriers. Pp. 374–386 in C. Fox and J. B. Wolf. eds. Evolutionary genetics. Oxford Univ. Press, Oxford. Johnson, N. A., and A. H. Porter. 2000. Rapid speciation via parallel, directional selection on regulatory genetic pathways. J. Theor. Biol. 205:527– 542. ———. 2007. Evolution of branched regulatory genetic pathways: directional selection on pleiotropic loci accelerates developmental system drift. Genetica 129:57–70. Kingsolver, J. G., H. E. Hoekstra, J. M. Hoekstra, D. Berrigan, S. N. Vignnieri, C. E. Hill, A. Hoang, P. Gibert, and P. Beerli. 2001. The strength of phenotypic selection in natural populations. Am. Nat. 157:245–261. Kondrashov, A. S. 1988. Deleterious mutations and the evolution of sexual reproduction. Nature 336:435–440. ———. 1993. Classification of hypotheses on the advantage of amphimixis. J. Hered. 84:372–387. Kondrashov, A. S., S. Sunyaev, and F. A. Kondrashov. 2002. DobzhanskyMuller incompatibilities in protein evolution. Proc. Nat. Acad. Sci. USA 99:14878–14883. Kulathinal, R. J., B. R. Bettancourt, and D. L. Hartl. 2004. Compensated deleterious mutations in insect genomes. Science 306:1553–1554. Le Rouzic, A. P. S., and O. Carlborg. 2007. Evolutionary potential of hidden genetic Variation. Trends Ecol. Evol. 23:33–37. Lijtmaer, D. A., B. Mahler, and P. L. Tubaro. 2003. Hybridization and postzygotic isolation patterns in pigeons and doves. Evolution 57:1411– 1418. Lynch, M. 1984. The selective value of alleles underlying polygenic traits. Genetics 108:1021–1033. ———. 1990. The rate of morphological evolution in mammals from the standpoint of the neutral expectation. Am. Nat. 136:727–741. 16 EVOLUTION 2009 Lynch, M., and W. G. Hill. 1986. Phenotypic evolution by neutral mutation. Evolution 40:915–935. Lynch, M., and B. Walsh. 1998. Genetics and analysis of quantitative traits. Sinauer Associates, Sunderland, MA. Mani, G. S., and B. C. Clark. 1990. Mutational order: a major stochastic process in evolution. Proc. R. Soc. Lond. B 240:29–37. Moyle, L. C., and E. B. Graham. 2005. Genetics of hybrid incompatibility between Lycopersicon esculentum and L. hirsutum. Genetics 169:355– 373. Muller, H. J. 1940. Bearing of the Drosophila work on systematics. Pp. 185– 268 in J. Huxley, ed. The new systematics. Oxford Univ. Press, Oxford. ———. 1942. Isolating mechanisms, evolution and temperature. Biol. Symp. 6:71–125. Nei, M., T. Maruyama, and C. I. Wu. 1983. Models of evolution of reproductive isolation. Genetics 103:557–579. Orr, H. A. 1995. The population genetics of speciation: the evolution of hybrid incompatibilities. Genetics 139:1805–1813. Orr, H. A., and M. Turelli. 2001. The evolution of postzygotic isolation: accumulating Dobzhansky-Muller incompatibilities. Evolution 55:1085– 1094. Palopoli, M. F., and C. I. Wu. 1994. Genetics of hybrid male-sterility between Drosophila sibling species- a complex web of epistasis is revealed in interspecific studies. Genetics 138:329–341. Palmer, M. E., and M. W. Feldman. 2009. Dynamics of hybrid incompatibility in gene networks in a constant environment. Evolution 63:418– 431. Phillips, P. D. 1998. The language of gene interaction. Genetics 149:1167– 1171. Porter, A. H., and N. A. Johnson. 2002. Speciation despite gene flow when developmental pathways evolve. Evolution 56:2103– 2111. Presgraves, D. C. 2002. Patterns of postzygotic isolation in Lepidoptera. Evolution 56:1168–1183. ———. 2007. Does genetic conflict drive molecular evolution of nuclear transport genes in Drosophila? BioEssays 29:386–391. Presgraves, D. C., and W. Stephan. 2007. Pervasive adaptive evolution among interactors of the Drosophila hybrid inviability gene, Nup96. Mol. Biol. Evol. 24:306–314. Presgraves, D. C., L. Balagopalan, S. M. Abmayr, and H. A. Orr. 2003. Adaptive evolution drives divergence of a hybrid inviability gene between two species of Drosophila. Nature 423:715–719. Price, T. D., and M. M. Bouvier. 2002. The evolution of F 1 post-zygotic incompatibilities in birds. Evolution 56:2083–2089. Reidys, C. M., and P. F. Stadler. 2001. Neutrality in fitness landscapes. Appl. Math. Comput. 117:321–350. ———. 2002. Combinatorial landscapes. SIAM Rev. 44:3–54. Rice, S. H. 1998. The evolution of canalization and the breaking of Von Baer’s laws: modeling the evolution of development with epistasis. Evolution 52:647–656. ———. 2002. A general population genetic theory for the evolution of developmental interactions. Proc. Natl. Acad. Sci. USA 99:15518– 15523. ———. 2004. Developmental associations between traits: covariance and beyond. Genetics 166:513–526. Sasa, M. M., P. T. Chippindale, and N. A. Johnson. 1998. Patterns of postzygotic isolation in frogs. Evolution 52:1811–1820. Schluter, D. 2000. The ecology of adaptive radiation. Oxford Univ. Press, Oxford. Schuster, P., W. Fontana, P. Stadler, and I. Hofacker. 1994. From sequences to shapes and back—A case-study in RNA secondary structures. Proc. R. Soc. Lond. B Biol. Sci. 255:279–284. H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E Shpak, M. 2005. The role of deleterious mutations in allopatric speciation. Evolution 59:1389–1399. Slatkin, M., and R. Lande. 1994. Segregation variance after hybridization of isolated populations. Genet. Res. 64:51–56. Sweigart, A. L., L. Fishman, and J. H. Willis. 2006. A simple genetic incompatibility causes hybrid male sterility in Mimulus. Genetics 172:2465– 2479. Tao, Y., Z. B. Zeng, J. Li, D. L. Hartl, and C. C. Laurie. 2003. Genetic dissection of hybrid incompatibilities between Drosophila simulans and D. mauritiana. II. Mapping hybrid male sterility loci on the third chromosome. Genetics 164:1399–1418. Travis, J. 1989. The role of optimizing selection in natural populations. Annu. Rev. Ecol. Syst. 20:279–296. True, J. R., and E. S. Haag. 2001. Developmental system drift and flexibility in evolutionary trajectories. Evol. Dev. 3:109–119. van Nimwegen, E., W. Y. Crutchfield, and M. Huynen. 1999. Neutral evolution of mutational robustness. Proc. Natl. Acad. Sci. USA 96:9716– 9720. Wagner, A., G. P. Wagner, and P. Similion. 1994. Epistasis can faciliate the evolution of reproductive isolation by peak shifts: a two-locus two-allele model. Genetics 138:533–545. Weinreich, D. M., R. A. Watson, and L. Chao. 2005. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59:1165– 1174. Welch, J. J. 2004. Accumulating Dobzhansky-Muller incompatibilities: reconciling theory and data. Evolution 58:1145–1156. Whitlock, M. C., P. C. Phillips, F. B.-G. Moore, and S. J. Tonsor. 1995. Multiple fitness peaks and epistasis. Ann. Rev. Ecol. Syst. 26:601– 629. Wolf, J. B., E. D. Brodie III, and M. J. Wade. 2000. Epistasis and the evolutionary process. Oxford Univ. Press, Oxford. Yukilevich, R., J. Lachance, F. Aoki, and J. R. True. 2008. Long-term adaptation of epistatic genetic networks. Evolution 62:2215–2235. Zhivotovsky, L. A., and S. Gavrilets. 1992. Quantitative variability and multilocus polymorphism under epistatic section. Theor. Popul. Biol. 42:254– 283. Associate Editor: D. Presgraves Appendix A ANALYTICAL PREDICTIONS OF THE HYBRID LOAD As explained in the main text, the hybrid load is given by the formula L = sE[z x ]2 + s var[z x ], where z x is the phenotype of an individual of type x, where x could be the F 1 , F 2 , backcross, or any other type of hybrid. In this Appendix, we compute these means and variances for F 1 , F 2 , and B 1 and derive predictions for the load under assumptions of quasi-neutral divergence of the parental populations. We do this by assuming that the parental populations are at the optimum, θ = 0, and that they harbor no genetic variation, i.e., we ignore within-population variation, and only include segregation variation. Genetic variation within the parental populations should in any case have similar effects in hybrids, and also be negligible in relation to segregation variance. Expectations for the phenotypic mean With only pairwise epistasis, the phenotypic mean is given as ⎡ E[z x ] = E ⎣zr + i i ⎤ 1 ij i j ⎦ y+ εy y , 2 i j=i (A1) where i y is the reference effect of locus i and z r is the phenotype of the reference genotype, which is taken to be the (mean) genotype of the ancestral population. We assume z r = 0 without loss of generality. For the F 1 , all individuals are heterozygotes, and the reference effects are i y F1 = i a 1 + i a 2 , where i a 1 and i a 2 are allelic reference effects of the parental populations 1 and 2, respectively. Fitting this into A1 we get E[z F1 ] = 1 E[z 1 ] + E[z 2 ] 2 ⎡ ⎤ 1 ij i ε a1 − ia2 ja1 − ja2 ⎦ . − E⎣ 2 i j=i (A2) Using the assumption that E[z 1 ] = E[z 2 ] = 0, and treating i a 1 and i a 2 as constants, we get E[z F1 ] = − 1 ij i ε a1 − ia2 ja1 − ja2 . 2 i j=i (A3) In the F 2 , we have three possible genotypes, such that i y = {2 a 1 with probability 1/4, i a 1 + i a 2 with probability 1/2, and 2i a 2 with probability 1/4}. Fitting this into A1, and assuming no linkage, we get i E[z F2 ] = − 1 ij i ε a1 − ia2 ja1 − ja2 , 2 i j=i (A4) which is the same as for the F 1 . For the B 1 resulting from a cross between F 1 and parental population 1, there are two possible genotypes, such that i y = {2i a 1 with probability 1/2, and i a 1 + i a 2 with probability 1/2}. Fitting this into equation (A1) gives E[z B1 ] = − 3 ij i ε a1 − ia2 ja1 − ja2 8 i j=i (A5) and identically for the B 2 . Expectations for the segregation variance To compute the segregation variance, we start by using the assumption of reference effects of different loci to be independent to decompose the variance into contributions from individual loci EVOLUTION 2009 17 J. L . F I E R S T A N D T. F. H A N S E N (n a 1 + n a 2 )}]T is the column vector of mean locus effects at all loci, • denotes element-wise multiplication (Hadamard product), and E is the matrix of epistasis coefficients with its ijth entry equal to ij ε, and we have adopted the convention that ii ε = 0. For the backcross, var[i y B1 ] = (i a 1 – i a 2 )2 /4, and E[i y B1 ] = (3i a 1 + i a 2 )/2. Using this in (A7) gives and pairs of loci. ⎡ var[z x ] = var ⎣zr + = i i ⎤ 1 ij i j ⎦ y+ εy y 2 i j=i 1 ij 2 ε var[ iy jy] 2 i i j=i ij ir + ε εE[ jy]E[ry]var[ iy] var[ iy] + j=i r =i, j i + var[z B1 ] = ij εE[ jy]var[ iy]. (A6) j=i i + We can simplify this further by using the relation to get 1 ij 2 var[ y] + ε var[ iy]var[ jy] var[z x ] = 2 i i j=i ⎛ ⎞2 ij + var[ iy] ⎝ εE[ jy]⎠ i = j=i i + + i ij εE[ y]var[ iy]. j (A7) j=i We can now use this equation to compute the segregation variance for each of the hybrid types by fitting in the appropriate equations for the mean and variance of the reference effects. First, we note that for the F 1 there is no segregation variance, var[i y] = 0, and var[Z F1 ] = 0. (A8) For the F 2 the per locus segregation variance is var[i y F 2 ] = (i a 1 – i a 2 )2 /2, and E[i y F 2 ] = i a 1 + i a 2 . Using this in (A7) gives var[z F2 ] = 2 1 i a1 − ia2 2 i 2 j 2 1 ij 2 i + ε a1 − ia2 a1 − ja2 8 i j=i ⎞2 ⎛ 1 i 2 ij + a1 − ia2 ⎝ ε ja1 + ja2 ⎠ 2 i j=i + = 2 j 2 1 ij i ε a1 − ia2 a1 + ja2 2 i j=i 1 T 1 a a + (a • a)T (E • E)(a • a) 2 8 1 1 + (a • a)T (EyF2 • EyF2 ) + yTF2 E(a • a), 2 2 (A9) where a = {1 a, . . . , n a}T is the column vector of allelic differences, i a = i a 1 − i a 2 at all loci, y F2 = {(1 a 1 + 1 a 2 ), . . . , 18 EVOLUTION 2009 2 j 2 1 ij 2 i ε a1 − ia2 a1 − ja2 32 i j=i ⎞2 ⎛ 2 ij j 1 i i j + a1 − a2 ⎝ ε 3 a1 − a2 ⎠ 4 i j=i var[ iy jy] = var[ iy]var[ jy] + E[ jy]2 var[ iy] + E[ iy]2 var[ jy], 2 1 i a1 − ia2 4 i (A10) 2 1 ij i ε a1 − ia2 3 ja1 − ja2 4 i j=i 1 T 1 a a + (a • a)T (E • E)(a • a) 4 32 + 1 1 (a • a)T (EyB1 • EyB1 ) + yTB1 E(a • a), 4 4 where y B1 = {(31 a 1 + 1 a 2 )/2, . . . , (3m a 1 + m a 2 )/2}T is the column vector of mean locus effects at all loci. The hybrid load under quasi-neutral divergence More detailed predictions about the hybrid load can be derived under the assumption of quasi-neutral divergence of loci. Specifically, we assume that the difference in allelic reference effects, i a, of all loci are independent and normally distributed with mean zero, and an identical variance parameter k, which is independent of the epistasis. Here, we need some results on the expectation of quadratic forms (Lynch and Walsh 1998). Specifically, let xT Mx be a quadratic form, with a positive definite matrix, M, and a random vector, x, with mean vector μ and variance matrix V. Then the expectation of xT Mx is Tr[MV] + μMμ, where Tr is the trace function. If x ∼ N(μ,V), then the variance of the quadratic form is 2Tr[(MV)2 ] + 4μMVMμ. For the F 1 hybrids, this gives s L F1 = s E[z F1 ]2 = (aT Ea)2 4 = sk2 sk2 ij 2 ε , Tr[E2 ] = 2 2 i j (A11) where we have used to denote expectation with respect to the assumed distribution of the i a under quasi-neutral divergence, and to emphasize the difference from the within-hybrid H Y B R I D I Z AT I O N A N D G E N E T I C A R C H I T E C T U R E expectations. The load of the F 2 is L F2 = s E[z F2 ]2 + svar[z F2 ] s = (aT Ea)2 4 s s + aT a + (a • a )T (E • E)(a • a ) 2 8 s s + (a • a )T (EyF2 • EyF2 ) + yTF2 E(a • a) . 2 2 (A12) because i a2 is independent of j y F 2 if i = j. The final term is computed as follows: yTF2 E(a i = (A13) where n is number of loci, and I is the identity matrix. The third term is also the expectation of a quadratic form. We note that a • a is a vector of independent χ2 (1)-distributed variables, each multiplied by k. The mean and variance of these are k, and 2k2 , respectively. Thus (a • a)T (E • E)(a • a) = Tr[2k 2 (E • E)] ε , ij 2 i j (A14) where J = {1, . . . , 1}T , and we have used Tr[E • E] = 0 . To compute the last two terms, we also have to make assumptions about the distribution of y F2 , which in accordance with quasi neutrality we take to be a vector of independent normally distributed variables with mean zero and variance, k. 2 ij j ia 2 ε y F2 (a • a)T (EyF2 • EyF2 ) = = a i 2 = i = k2 2 ij ε y F2 j ia 2 i ij ε iy F2 ja 2 = 0, (A16) j because i y F 2 = 0. Using this in (A11) gives L F2 = skn 13sk2 ij 2 ε . + 2 8 i j (A17) For the backcross, we do the same computations with y B1 in place of y F2 . The elements of y B1 , (3i a 1 + i a 2 )/2, are independent and normally distributed with mean zero and variance 5k/4. Thus, the computations in (A13) and (A14) will be the same for the backcross as for the F 2 ; the other terms are (a • a) (EyB1 • EyB1 ) = T a i 2 = 2 ε y B1 ij j j a i 2 i = j ia 2 2 ε y B1 ij j i j i j ε ε jy B1 ry B1 ij ir r 2 ij ia 2 ε jy B2 1 = 5k 2 ij 2 ε , = 4 i j (A18) and yT B1 E(a • a) = 0 with same reasoning as for the F 2 . Combining these results gives j i = ε y F2 a j 2 i + k 2 J(E • E)J = k 2 i ij i j i We compute this term by term. The expectation in the first term was computed in (A11) as (aT Ea)2 = 2k 2 i j ij ε2 . For the second term aT a = Tr[kI] = kn, • a) = r j a i 2 i ij 2 ε j ε ε jy F2 ry F2 ij ir j y F2 2 j ε , ij 2 (A15) L B1 = s E[z B1 ]2 + svar[z B1 ] s 9s (aT Ea)2 + aT a = 64 4 s + (a • a )T (E • E)(a • a ) 32 s s + (a • a )T (EyB1 • EyB1 ) + yTB1 E(a • a ) 4 4 skn 5sk2 ij 2 = + ε . (A19) 4 8 i j j EVOLUTION 2009 19