Chapter 12 (Dekkers)
Incorporating Molecular Genetic Information in Genetic Improvement Programs for Livestock

Based in part on: Dekkers et al. (2001), Dekkers and Hospital (2002), and Dekkers and Settar (2003)

Substantial advances have been made in the genetic improvement of agriculturally important animal and plant populations through artificial selection on quantitative traits. Most of this selection has been on observable phenotype, without knowledge of the genetic architecture of the selected characteristics. That architecture is treated as a black box, with no knowledge of the number of genes that affect the trait, let alone of the effects of each gene or their locations in the genome. Despite the obvious flaws of this model, the tremendous rates of genetic improvement that have been achieved attest to the utility of the quantitative genetic approach.

Nevertheless, quantitative genetic selection has several limitations: phenotype is an imperfect predictor of an individual's breeding value; phenotype may not be observed on both genders or prior to the time when selection decisions must be made; and phenotype is not very effective in resolving negative associations between genes, e.g. those caused by linkage or epistasis. The ideal situation for quantitative genetic selection is that the trait has high heritability and that the phenotype can be observed on all individuals prior to reproductive age. This ideal is hardly ever achieved, which limits the effectiveness of quantitative genetic selection.

Andersson (2001) and Mauricio (2001) reviewed how molecular genetics can be used to discern the genetic nature of quantitative traits in animals and plants, respectively, by identifying genes or chromosomal regions that affect the trait, so-called quantitative trait loci or QTL. This has enabled identification and characterization of at least some of the genes that contribute to genetic variation in quantitative traits.
Because DNA can be obtained at any age and on both genders, molecular genetics can alleviate some of the limitations of quantitative genetic selection, as will be discussed below. Thus, the genes and genetic markers that are being discovered can be used to enhance genetic improvement of breeding stock through marker-assisted selection. The purpose of this Chapter is to show how this information can be used to enhance genetic improvement. Emphasis will be on utilization of natural variation within a species, rather than on the introduction of new genetic variation through genetic modification, although some of the programs reviewed, such as introgression, also play an important role in the introduction of transgenes into breeding populations (see e.g. Gama et al. 1992).

[Slide: Applications of Molecular Data. Parental identification / verification; traceability; evaluation of genetic diversity; introgression of desirable genes (Marker-Assisted Introgression, MAI); enhanced selection within outbred populations (Marker-Assisted Selection, MAS).]

[Slide: Use of Molecular Data in Selection. Diagram labels: molecular genetics, genotypic data, identified or marked QTL, unknown genes, phenotypic data, EBV, selection strategy.]

[Slide: Factors Affecting Extra Response from MAS. Effects of identified QTL (% of genetic variance); recombination rates between markers and QTL; effectiveness of phenotypic selection, which depends on heritability and on restrictions on phenotyping (measurement): in one sex only (sex-limited traits; EBV based on relatives for one sex), late in life (after selection; EBV based on relatives), not on live animal (meat quality traits; EBV from relatives, reduced intensity), difficult to measure (disease traits).]

[Slide: Benefit from use of molecular data. Higher h2 than phenotypic data; expressed in both sexes; expressed at early age (embryo stage); explains within-family variation, i.e. traces Mendelian sampling terms in $g_X = \tfrac{1}{2} g_{sire} + \tfrac{1}{2} g_{dam} + RA_{sire} + RA_{dam}$. Information-providing records: ancestral records, half-sib records, full-sib records, own phenotype, progeny phenotype, marker/genotypic data on X.]

[Slide: Potential gain from MAS/GAS versus ease of QTL detection (genome scans, candidate genes). Trait categories listed: single gene traits (genetic defects/disorders, appearance); quantitative traits that are routinely recorded (both sexes, sex-limited, late in life); quantitative traits that are difficult to record (feed intake, product quality); quantitative traits that are unrecorded or have low h2 (disease resistance).]

Molecular genetic analyses of quantitative traits lead to the identification of two broadly different types of genetic loci that can be used to enhance genetic improvement programs: causal mutations and presumed non-functional genetic markers that are linked to QTL (indirect markers). Causal mutations for quantitative traits are hard to find and difficult to prove, and few examples are available (Andersson 2001). Non-functional or anonymous polymorphisms are abundant across the genome and their linkage with QTL can be established by evidence of empirical associations of marker genotypes with trait phenotype.

Two approaches are used to identify indirect markers (Andersson 2001): directed searches using candidate gene approaches in unstructured populations (Rothschild and Soller 1997); and undirected genome-wide searches in specialized populations, such as F2 crosses or half-sib family populations. Because candidate gene markers focus on polymorphisms within a gene that is postulated to affect the trait, they are often tightly linked to the QTL. A candidate gene marker can represent the functional polymorphism, although this is difficult to prove (Andersson 2001). Genome scans, on the other hand, only identify regions of chromosomes that affect the trait. The length of these regions is typically 10 to 20 cM, but the exact position and number of QTL within the region is unknown.
Whereas causative polymorphisms give direct information about genotype for the QTL, use of indirect markers for QTL mapping and for selection is based on the existence of linkage or gametic phase disequilibrium (LD) between the marker and the QTL. Marker-QTL LD can exist at the population level but always exists within families, even between loosely linked loci. Although two loci are expected to be in population-wide equilibrium in large random-mating populations, partial population-wide LD can exist by chance between tightly linked loci in breeding populations that are under selection. Population-wide LD can also be created by crossing lines or breeds. Although LD will then exist even between loosely linked loci, this LD will erode rapidly over generations. Indirect markers that are identified using the candidate gene marker approach are expected to be in substantial LD with the QTL in which they reside. Unless the functional polymorphism has been identified, however, linkage phase of a candidate gene marker with the functional variant can differ from one population to the next and must, therefore, be assessed in the population in which it will be used. Although more abundant and extensive, within-family LD is more difficult to use because linkage phases between the markers and QTL will not be the same in all families and must, therefore, be assessed on a within-family basis.
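The rate at which population-wide LD erodes, and the amount of information a linked marker carries within a family, can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: a large random-mating population, in which disequilibrium declines as D(t) = (1 - r)^t D(0), and a single sire carrying haplotypes M-B and m-b mated to random dams, for which the expected contrast between progeny receiving M versus m is (1 - 2r) times the allele substitution effect. The function names are illustrative only.

```python
def ld_after_generations(d0: float, r: float, t: int) -> float:
    """Disequilibrium after t generations of random mating: D(t) = (1 - r)**t * D(0)."""
    return d0 * (1.0 - r) ** t


def generations_to_fraction(r: float, fraction: float) -> int:
    """Smallest t with (1 - r)**t <= fraction (illustrative helper, assumes 0 < r <= 0.5)."""
    t = 0
    remaining = 1.0
    while remaining > fraction:
        remaining *= 1.0 - r
        t += 1
    return t


def within_family_contrast(alpha: float, r: float) -> float:
    """Expected difference between progeny that inherit marker allele M versus m
    from a sire with haplotypes M-B and m-b: (1 - 2r) * alpha."""
    return (1.0 - 2.0 * r) * alpha


if __name__ == "__main__":
    for r in (0.001, 0.01, 0.05, 0.1, 0.2, 0.5):
        print(
            f"r={r:<5} D(10)/D(0)={ld_after_generations(1.0, r, 10):.3f} "
            f"halved after {generations_to_fraction(r, 0.5)} generations, "
            f"within-family contrast/alpha={within_family_contrast(1.0, r):.2f}"
        )
```

The loop shows the two sides of the argument: loosely linked markers lose population-wide LD within a few generations, yet they still retain a non-zero within-family contrast unless r = 0.5.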
[Slide: Utilization of markers that are in population-wide disequilibrium with a QTL (Q/q). Marker and QTL alleles are or tend to be in consistent linkage phase (haplotypes M-Q and m-q), so selection can be on marker genotype across the population. Population-wide linkage disequilibrium can be created by crossing (ideally inbred) lines or breeds and will then exist between loosely linked markers for several generations. Direct markers are at the QTL and are, therefore, expected to be in complete population-wide linkage disequilibrium with the QTL.]

[Slide: Utilization of indirect markers that are in population-wide equilibrium with a QTL (Q/q). Marker and QTL alleles appear in alternate linkage phases (haplotypes M-Q, M-q, m-Q and m-q all occur), so marker genotype gives no information about QTL genotype. This will be the case for most indirect markers in an outbreeding population. Although all linked markers are expected to be in population-wide linkage equilibrium with QTL, tightly linked markers have a substantial probability of being in partial population-wide LD because of the effects of drift, selection, mutation, and population admixture (Sved 1971, Goddard 1991, Hastbacka et al. 1992). This probability is higher in selected populations of small effective size, which is the case for agricultural species, as demonstrated by Farnir et al. (2000) for dairy cattle.]

[Slide: Within-family disequilibrium under population-wide linkage equilibrium. A parent with haplotypes M-Q and m-q and recombination rate r between marker and QTL produces gametes MQ and mq, each with frequency $\tfrac{1}{2}(1-r)$, and gametes Mq and mQ, each with frequency $\tfrac{1}{2}r$. Despite population-wide equilibrium, the marker and QTL will be in partial disequilibrium within a family. The extent of disequilibrium depends on the recombination rate (r), but will occur even with loose linkage (r = 0.2). This disequilibrium can be used to detect QTL and for selection.]

[Slide: Use of within-family linkage disequilibrium for QTL mapping and MAS. A sire with marker-QTL haplotypes M-B and m-b (recombination rate r) is mated to random dams. Progeny that receive M carry the non-recombinant haplotype MB (frequency $\tfrac{1}{2}(1-r)$, mean $\mu + \tfrac{1}{2}\alpha$) or the recombinant Mb (frequency $\tfrac{1}{2}r$, mean $\mu - \tfrac{1}{2}\alpha$), for an average of $\mu + \tfrac{1}{2}(1-2r)\alpha$. Progeny that receive m carry mb (non-recombinant, mean $\mu - \tfrac{1}{2}\alpha$) or mB (recombinant, mean $\mu + \tfrac{1}{2}\alpha$), for an average of $\mu - \tfrac{1}{2}(1-2r)\alpha$. The contrast is $\mu_{M} - \mu_{m} = (1-2r)\alpha$.]

[Slide: Three Types of Molecular Information. 1) Genotype for functional gene (BB, Bb, bb): polymorphism = causative mutation. 2) Genotype for a direct marker: polymorphism is in population-wide linkage disequilibrium with the causative mutation. 3) Genotype for linked genetic markers: polymorphism is in linkage equilibrium across the population, $E(\mu_{MM}) = E(\mu_{Mm}) = E(\mu_{mm})$; use within-family disequilibrium.]

[Slide: Linkage disequilibrium can persist many generations for tightly linked loci. Measure of disequilibrium: $D_{M,B} = \mathrm{freq}(MB) - \mathrm{freq}(M)\,\mathrm{freq}(B)$. With random mating in a large population, $D_{M,B}(t+1) = (1-r)\,D_{M,B}(t)$, so that $D_{M,B}(t) = (1-r)^{t} D_{M,B}(0)$. Plot: remaining disequilibrium (0 to 1) against generation (0 to 100) for r = 0.001, 0.01, 0.05, 0.1, 0.2, and 0.5.]

The use of molecular genetics in selection programs rests on the ability to determine the genotype of individuals for causal mutations or indirect markers using DNA analysis. This information is then used to assess the genetic value of the individual, which can be captured in a molecular score that can be used for selection. This removes some of the limitations of quantitative genetic selection discussed above.

It is clear that the use of molecular data for genetic improvement would be most effective if the genetic architecture of a quantitative trait was completely transparent such that we knew the number, positions, and effects of all genes involved.
In that case, the process of selection would be reduced to a simple 'building block' problem (genotype building) of selection and mating to create individuals with the right combination of alleles at each QTL. However, this situation is far from reality and may never be achieved; although advances in molecular genetics have been able to partially dissect the black box of quantitative traits, the information provided by molecular data is far from complete, for three main reasons. First, in most cases only a limited number of genes that affect the trait have been identified, albeit the ones with the largest effects. Nevertheless, a substantial part of the black box remains obscure and selection exclusively on genotype for identified QTL would not result in maximum response to selection. Instead, selection on molecular score must be combined with selection on phenotype, which reflects the collective action of all genes, including those that have not been identified. Second, with indirect markers, selection is not directly on the QTL, but on the marker, via LD. As LD erodes in the course of the selection program due to recombination, efficiency of selection is reduced. Third, for both causal and indirect markers, the effects of the QTL must be estimated empirically on the basis of statistical associations between markers and phenotype. Estimation requirements are particularly high for markers that are not in population-wide LD and for which within-family LD must be used. In that case, marker-QTL linkage phase and effects must be estimated on a within-family basis. Thus, the use of molecular information does not remove the need for phenotypic information and, therefore, suffers to some degree from the same limits as quantitative genetic selection.

Despite the limitations outlined above, molecular genetic information can be used to enhance several breeding strategies through what is broadly referred to as Marker-Assisted Selection (MAS).
All strategies for MAS are based on the use of a molecular score, although the composition of this score differs from application to application. In addition to those described below, the application of molecular data in genetic programs includes their use for parentage verification or identification (for example, when mixed semen is used in artificial insemination) and in genetic conservation programs to identify unique genetic resources and quantify genetic diversity.

The type of genetic information that is available, and its association with the functional mutation (population-wide LD or within-family LD), has important consequences for the use of molecular information in selection programs. On this basis, the following three types of selection programs using molecular information can be distinguished:
• Gene-assisted selection (GAS): selection based on the functional mutation for the QTL.
• Marker-assisted selection based on population-wide LD (LD-MAS): selection based on markers or marker haplotypes that are in population-wide disequilibrium with the QTL.
• Marker-assisted selection based on within-family LD (LE-MAS): selection based on markers or marker haplotypes that are in population-wide equilibrium with the QTL but in LD with the QTL on a within-family basis.
[Slide: Three types of observable molecular genetic loci and possible selection strategies. Functional mutations (Q/q; known genes): GAS. Markers in population-wide LD with the functional mutation (haplotypes M-Q and m-q): LD-MAS. Markers in population-wide LE with the functional mutation: LE-MAS. The three types are also ranked on the slide by ease of detection and by use. Selection strategies: two-stage selection (1. select on genotype, 2. select on EBV); index selection (I = b1 genotype + b2 EBV); pre-selection (1. GAS (index) at young age, 2. select on EBV at later age).]

For each of the three types of selection (GAS, LD-MAS, and LE-MAS), there are two basic strategies for combining the molecular information with phenotypic information in a selection strategy:
1) Two-stage selection, in which selection is on the molecular score in the first stage and on phenotype or a (polygenic) EBV in the second stage.
2) Index selection, in which selection is on an index of molecular score and phenotypic information.
In addition, molecular information could be used primarily for pre-selection of young animals for further testing.

Methods to derive indexes that combine molecular and phenotypic information will be presented in the next section, followed by methods to predict responses to selection in the presence of QTL of large effect. We will then compare alternative selection strategies for utilization of QTL information between and within breeds, and finish with an economic analysis of MAS and opportunities for the redesign of breeding programs to more fully capture the benefits of MAS.
12.1 Including QTL Information in Estimated Breeding Values

When distinguishing QTL that have been mapped from other background genes that affect the trait, which will be referred to as polygenes, the genetic value $g_i$ of an individual $i$ can be partitioned into the sum of genetic values at the QTL, $g_{Qi}$, and the sum of genetic values at polygenes, $g_{pi}$:

\[ g_i = g_{Qi} + g_{pi} \]

Molecular genetic information provides information that can be used to estimate $g_{Qi}$, whereas an individual's phenotype provides information on the collective effect of all genes. Unless all QTL that affect the trait have been identified, selection on QTL must be combined with selection on phenotypic information, to ensure simultaneous improvement of both $g_{Qi}$ and $g_{pi}$. Lande and Thompson (1990) suggested that QTL and phenotypic information should be combined in an index of the following form:

\[ I_i = b_Q \hat{g}_{Qi} + b_P P_i \]

where $\hat{g}_{Qi}$ is the molecular score for individual $i$, i.e. the individual's estimated breeding value for the QTL, $P_i$ is the individual's phenotype, and $b_Q$ and $b_P$ are index weights. The molecular score, $\hat{g}_{Qi}$, can be computed as the sum over QTL or markers of estimates of effects on phenotype based on the individual's QTL or marker genotypes. An example is in Table 12.1.

Lande and Thompson (1990) showed that index weights could be derived by standard selection index theory, given the proportion of genetic variance explained by the QTL or markers ($q = \sigma_Q^2 / (h^2 \sigma_P^2)$) and the (total) heritability of the trait ($h^2 = \sigma_g^2 / \sigma_P^2$):

\[ \begin{bmatrix} b_Q \\ b_P \end{bmatrix} = \mathbf{P}^{-1}\mathbf{G} = \begin{bmatrix} \sigma_Q^2 & \sigma_Q^2 \\ \sigma_Q^2 & \sigma_P^2 \end{bmatrix}^{-1} \begin{bmatrix} \sigma_Q^2 \\ \sigma_g^2 \end{bmatrix} = \begin{bmatrix} \dfrac{1-h^2}{1-qh^2} \\[2ex] \dfrac{h^2(1-q)}{1-qh^2} \end{bmatrix} \]

Thus, the weight on the molecular score relative to phenotype is:

\[ \frac{b_Q}{b_P} = \frac{1/h^2 - 1}{1 - q} \]

Table 12.1. Example of the calculation of molecular score and index of phenotype and molecular score with 3 additive QTL with allele substitution effects (allele A vs. B) of +10, +5, and -10 for QTL 1, 2, and 3, respectively. The QTL jointly explain 50% of the genetic variance for a trait with heritability 0.5. Resulting index weights on molecular score and phenotype are 2/3 and 1/3, respectively (after J. Holland, 1998).

          QTL 1            QTL 2            QTL 3          Molecular              Index
Animal  Genotype Value   Genotype Value   Genotype Value     score    Phenotype   value
  1        AA     10        AA      5        AA    -10          5        35        15.0
  2        AA     10        AA      5        BB     10         25       -10        13.3
  3        AB      0        BB     -5        AB      0         -5       -15        -8.3
  4        AB      0        BB     -5        AA    -10        -15        15        -5.0
  5        BB    -10        AA      5        AB      0         -5        25         5.0

[Slide: Index of QTL and own phenotype (Lande & Thompson, 1990, Genetics 124:743). $g = g_Q + g_{pol}$, where $g_Q$ is the QTL/marker BV and $g_{pol}$ the polygenic BV; $\sigma_g^2 = \sigma_Q^2 + \sigma_{pol}^2$ is the total genetic variance; $q = \sigma_Q^2/\sigma_g^2$ is the fraction of genetic variance due to the QTL/marker; $h^2 = \sigma_g^2/\sigma_P^2$ is the total heritability. Index: $I = b_Q g_Q + b_P P$, with $b_Q = (1-h^2)/(1-qh^2)$, $b_P = h^2(1-q)/(1-qh^2)$, and $b_Q/b_P = (1/h^2 - 1)/(1-q)$. Accuracy: $r_{g,I} = (b'G)^{1/2}/\sigma_g$. Efficiency: $r_{g,I}/r_{g,P} = [\,q/h^2 + (1-q)^2/(1-h^2 q)\,]^{1/2}$. Can be expanded to multiple QTL and multiple phenotypic records using standard selection index theory.]

Example relative weights are in Table 12.2, which shows that the index gives more weight to the molecular score as heritability decreases and as the proportion of variance explained by the QTL increases. Table 12.1 also gives index values for the example animals. This illustrates that different selection decisions would be made based on molecular score alone, based on phenotype alone, and based on the index. The Lande and Thompson (1990) formulation of the index is easily extended to situations where indexes of phenotypes of relatives are used. Indexes can also be extended to multiple-trait situations.

Table 12.2. Index weight on molecular score relative to phenotype ($b_Q/b_P$) for different heritabilities and proportions of genetic variance explained by the QTL (after J. Holland, 1998).

                     Proportion of genetic variance explained by QTL (q)
Heritability (h2)    0.10     0.25     0.50     0.75     1.00
      0.10          10       12       18       36        Total weight
      0.25           3.33     4        6       12        Total weight
      0.50           1.11     1.33     2        4        Total weight
      0.75           0.37     0.44     0.67     1.33     Total weight
      1.00           0        0        0        0        Either

It is useful to note that the above index can be reparameterized into an equivalent index of molecular score and phenotype adjusted for the molecular score, as follows:

\[ I_i' = b_Q' \hat{g}_{Qi} + b_P' P_i' \]

where $P_i' = P_i - \hat{g}_{Qi}$. Using selection index theory and defining polygenic heritability as the heritability of phenotype adjusted for molecular score,

\[ h^2_{pol} = \frac{\sigma_g^2 - \sigma_Q^2}{\sigma_P^2 - \sigma_Q^2} = \frac{h^2(1-q)}{1-qh^2} \]

weights for this index can then be derived to be independent of r and equal to $b_Q' = 1$ and $b_P' = h^2_{pol}$:

\[ \begin{bmatrix} b_Q' \\ b_P' \end{bmatrix} = \mathbf{P}^{-1}\mathbf{G} = \begin{bmatrix} \sigma_Q^2 & 0 \\ 0 & \sigma_P^2 - \sigma_Q^2 \end{bmatrix}^{-1} \begin{bmatrix} \sigma_Q^2 \\ \sigma_g^2 - \sigma_Q^2 \end{bmatrix} = \begin{bmatrix} 1 \\ h^2_{pol} \end{bmatrix} \]

Thus, the resulting index is:

\[ I_i' = \hat{g}_{Qi} + h^2_{pol} P_i' \]

One important advantage of index $I'$ over index $I$ is that its index weights remain constant over generations, whereas weights for index $I$ must be updated each generation as QTL frequencies, and therefore the proportion of genetic variance explained by the QTL, change. This index also allows easy extension to indexes based on BLUP EBV. To see this, note that the second term in this index, $h^2_{pol} P_i'$, represents the individual's estimated breeding value for polygenes, $\hat{g}_{pi}$, based on own phenotype adjusted for the QTL. This index can be expanded to BLUP EBV from a model that includes QTL or markers as a fixed or random effect (see Fernando and Grossman, 1989, for methodology to include marked QTL as random effects in a BLUP animal model). Such models result in estimates of molecular scores, $\hat{g}_{Qi}$, and EBV for polygenic effects, $\hat{g}_{pol,i}$, with accuracy $r_{pol}$.
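The index weights and the example animals of Table 12.1 can be checked numerically. The sketch below, in Python, computes b_Q and b_P from h2 and q, rebuilds the molecular scores from the genotypes, and evaluates both the original index I and the reparameterized index I' for each animal. The function and variable names are illustrative choices, and the genotype coding (AA = +1, AB = 0, BB = -1 times the stated effect) is an assumption read off the values shown in Table 12.1.

```python
def index_weights(h2: float, q: float) -> tuple[float, float]:
    """Weights (b_Q, b_P) for I = b_Q * molecular score + b_P * phenotype."""
    denom = 1.0 - q * h2
    return (1.0 - h2) / denom, h2 * (1.0 - q) / denom


def polygenic_heritability(h2: float, q: float) -> float:
    """Heritability of phenotype adjusted for the molecular score."""
    return h2 * (1.0 - q) / (1.0 - q * h2)


# Genotype coding assumed from Table 12.1: value = code * effect of the QTL.
GENOTYPE_CODE = {"AA": 1, "AB": 0, "BB": -1}
QTL_EFFECTS = (10.0, 5.0, -10.0)

# animal -> (genotypes at QTL 1..3, phenotype), copied from Table 12.1
ANIMALS = {
    1: (("AA", "AA", "AA"), 35.0),
    2: (("AA", "AA", "BB"), -10.0),
    3: (("AB", "BB", "AB"), -15.0),
    4: (("AB", "BB", "AA"), 15.0),
    5: (("BB", "AA", "AB"), 25.0),
}


def molecular_score(genotypes: tuple[str, ...]) -> float:
    """Sum of genotype values over the QTL."""
    return sum(GENOTYPE_CODE[g] * e for g, e in zip(genotypes, QTL_EFFECTS))


def index_values(h2: float, q: float) -> dict[int, tuple[float, float, float]]:
    """Return animal -> (molecular score, index I, reparameterized index I')."""
    b_q, b_p = index_weights(h2, q)
    h2_pol = polygenic_heritability(h2, q)
    results = {}
    for animal, (genotypes, phenotype) in ANIMALS.items():
        score = molecular_score(genotypes)
        index = b_q * score + b_p * phenotype
        index_alt = score + h2_pol * (phenotype - score)
        results[animal] = (score, index, index_alt)
    return results


if __name__ == "__main__":
    for animal, (score, index, index_alt) in index_values(0.5, 0.5).items():
        print(animal, score, round(index, 1), round(index_alt, 1))
```

Because 1 - h2_pol equals b_Q and h2_pol equals b_P, the two parameterizations return the same value for every animal, which is the equivalence stated in the text.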
Index weights for combining the molecular score, $\hat{g}_{Qi}$, and the polygenic EBV, $\hat{g}_{pol,i}$, realizing that the variance of polygenic EBV is equal to $r_{pol}^2 \sigma_{pol}^2$, where $\sigma_{pol}^2 = h^2_{pol}(\sigma_P^2 - \sigma_Q^2)$ is the polygenic variance, can be derived as:

\[ \begin{bmatrix} b_Q' \\ b_P' \end{bmatrix} = \mathbf{P}^{-1}\mathbf{G} = \begin{bmatrix} \sigma_Q^2 & 0 \\ 0 & r_{pol}^2 \sigma_{pol}^2 \end{bmatrix}^{-1} \begin{bmatrix} \sigma_Q^2 \\ r_{pol}^2 \sigma_{pol}^2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Thus the index is:

\[ I_i' = \hat{g}_{Qi} + \hat{g}_{pol,i} \]

[Slide: Index of QTL and own phenotype, alternative (but equivalent) formulation. $P^* = P - g_Q$ is phenotype adjusted for QTL/marker; $\sigma_{P^*}^2 = \sigma_P^2 - \sigma_Q^2$; $h^2_{pol} = \sigma_{pol}^2/\sigma_{P^*}^2$ is the polygenic heritability. Index: $I = b_Q g_Q + b_{P^*} P^*$, with $b_Q = 1$ and $b_{P^*} = h^2_{pol}$, so that $I = g_Q + h^2_{pol} P^*$.]

[Slide: Index of QTL and phenotypic information, generalization to BLUP EBV. $\hat{g}_Q$ = EBV based on (multiple) markers/QTL, summed over markers/QTL; $\hat{g}_{pol}$ = BLUP for polygenic BV. Estimates can be obtained from BLUP-QTL animal models (Fernando & Grossman, 1989, Genet. Sel. Evol. 21:467). Overall EBV = QTL EBV + polygenic EBV: $I = \hat{g}_Q + \hat{g}_{pol}$.]

[Slide: Use of within-family LD, marker-assisted BLUP (Fernando and Grossman, 1989). Model: $y_i = \mu + v_i^p + v_i^m + u_i + e_i$, with paternal and maternal QTL allele effects $v_i^p$ and $v_i^m$ and polygenic effect $u_i$; $\mathrm{Var}(u) = A\sigma_u^2$ and $\mathrm{Var}(v) = G\sigma_v^2$, where G is the gametic relationship matrix for QTL effects, computed from marker genotypes and the marker-QTL recombination rate. EBV for QTL alleles: $\hat{v}_i^p$, $\hat{v}_i^m$; EBV for polygenic effects: $\hat{u}_i$; total EBV = $\hat{v}_i^p + \hat{v}_i^m + \hat{u}_i$.]

If the phenotypic EBV is from a regular animal model and not from a model that includes the marker or QTL as separate effects, derivation of the index can only be approximated by correcting the EBV for effects of the QTL. This could be done by regressing the regular EBV, $\hat{g}_i$, on the molecular score (or QTL genotype(s)) using:

\[ \hat{g}_i = \beta \hat{g}_{Qi} + e_i \]

Residuals from this model then provide approximate estimates of polygenic EBV, i.e. $\hat{g}_{pol,i} \approx \hat{e}_i$, which can be used in the index described above.
Note that, although $\hat{g}_{Qi}$ may represent an unbiased estimate of the QTL effects, the estimate of the regression coefficient $\beta$ in the regression of regular EBV on molecular score will be less than 1. The reason is that, when estimating EBV $\hat{g}_i$, all effects, including the QTL effects, are regressed back toward zero. In theory, the extent of regression can be approximated by the square of the accuracy of the EBV, i.e. $r^2$. This can be most readily seen for EBV based on own phenotype alone, in which case:

\[ \hat{g}_i = h^2 P_i = h^2 g_{Qi} + h^2 (P_i - g_{Qi}) \]

Thus, in this case the regression factor is $\beta = h^2 = r^2$, since $r = h$ for selection on own phenotype. This relationship, $\beta = r^2$, is, however, only an approximation when phenotypic information from relatives contributes to the EBV, because the extent of regression of phenotypic information from relatives is not equal to $r^2$. This is most easily seen from Table 4.1 in Chapter 4, when comparing index coefficients $b$ to the square of the accuracy $r_{HI}$.

In addition, if animals with EBV with different accuracy are included in the analysis, a single regression coefficient will not suffice. This could be accommodated by a weighted least squares analysis or by first de-regressing EBV. These problems are, however, all circumvented when the marker information is included directly in the genetic evaluation model, which is the preferred method.

It is useful to note that selection based on own phenotype (without molecular information) can also be written as selection on an index of breeding values for the QTL and polygenes, by noting that selection on $P_i$ is equivalent to selection on $h^2_{pol} P_i$, which can be written as

\[ h^2_{pol} P_i = h^2_{pol} g_{Qi} + h^2_{pol} P_i' = h^2_{pol} g_{Qi} + \hat{g}_{pol,i} \]

Thus, with phenotypic selection, the emphasis on the molecular score relative to the EBV for polygenes is equal to the polygenic heritability, $h^2_{pol}$, instead of 1 as in MAS.
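The shrinkage of the regression coefficient described above can be illustrated with a small simulation. The sketch below is not part of the original derivation: it assumes normally distributed QTL, polygenic and environmental effects, a molecular score known without error, EBV based on own phenotype only (EBV = h2 * P), and a phenotypic variance of 1. All names and parameter values are illustrative.

```python
import random


def simulate_ebv_regression(h2=0.25, q=0.5, n=50_000, seed=1):
    """Regress own-phenotype EBV on the molecular score.

    Returns the estimated regression coefficient and the residuals, which
    serve as approximate polygenic EBV. Phenotypic variance is set to 1.
    """
    rng = random.Random(seed)
    sd_q = (q * h2) ** 0.5            # SD of QTL breeding values
    sd_pol = ((1.0 - q) * h2) ** 0.5  # SD of polygenic breeding values
    sd_e = (1.0 - h2) ** 0.5          # SD of environmental deviations

    g_q = [rng.gauss(0.0, sd_q) for _ in range(n)]
    # EBV from own phenotype alone: h2 * (g_Q + g_pol + e)
    ebv = [h2 * (x + rng.gauss(0.0, sd_pol) + rng.gauss(0.0, sd_e)) for x in g_q]

    mean_q = sum(g_q) / n
    mean_ebv = sum(ebv) / n
    s_xy = sum((x - mean_q) * (y - mean_ebv) for x, y in zip(g_q, ebv))
    s_xx = sum((x - mean_q) ** 2 for x in g_q)
    beta = s_xy / s_xx

    residuals = [(y - mean_ebv) - beta * (x - mean_q) for x, y in zip(g_q, ebv)]
    return beta, residuals


if __name__ == "__main__":
    beta, residuals = simulate_ebv_regression()
    print(f"estimated beta = {beta:.3f} (h2 = 0.25)")
```

Under these assumptions the estimated coefficient falls close to h2 rather than 1, and the residuals correspond to h2 times the phenotype adjusted for the QTL.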
Similarly, for more complex EBV based on phenotypic records, as shown earlier, the EBV can be approximated by:

\[ \hat{g}_i \approx r^2 g_{Qi} + \hat{g}_{pol,i} \]

and the implicit weight on the molecular score is approximately equal to the square of accuracy, $r^2$.

12.2 Predicting Response to Selection with QTL Information

Apart from stochastic simulation (see Chapter 2), two deterministic methods have been used to predict response to selection on EBV that include information on an identified or marked QTL:
1) using selection index theory
2) using mixture distributions
The first approach follows standard selection theory, in which the QTL information is considered as another source of normally distributed information in the index. The second approach more precisely models selection on a QTL. Both approaches will be described in detail below.

12.2.1 Selection Index approach to predicting response to marker-assisted selection

Consider the previously derived selection index of molecular score and own phenotype, when the molecular score explains a fraction q of the additive genetic variance:

\[ I_i = b_Q \hat{g}_{Qi} + b_P P_i \]

The accuracy of this index and response to selection can be derived by standard selection index theory (Chapter 4) as:

\[ r_{g,I} = \sqrt{\frac{\mathbf{b}'\mathbf{G}}{\sigma_g^2}} = \sqrt{ \begin{bmatrix} \dfrac{1-h^2}{1-qh^2} & \dfrac{h^2(1-q)}{1-qh^2} \end{bmatrix} \begin{bmatrix} q \\ 1 \end{bmatrix} } = \sqrt{\frac{q - 2qh^2 + h^2}{1-qh^2}} = \sqrt{q + \frac{h^2(1-q)^2}{1-qh^2}} \]

Similarly, for the alternate index parameterization, $I_i' = b_Q' \hat{g}_{Qi} + b_P' P_i'$:

\[ r_{g,I'} = \sqrt{\frac{\mathbf{b}'\mathbf{G}}{\sigma_g^2}} = \sqrt{ \begin{bmatrix} 1 & h^2_{pol} \end{bmatrix} \begin{bmatrix} q \\ 1-q \end{bmatrix} } = \sqrt{q + h^2_{pol}(1-q)} \]

Using $h^2_{pol} = \dfrac{h^2(1-q)}{1-qh^2}$, it can easily be shown that $r_{g,I'} = r_{g,I}$, i.e. the two indexes are equivalent.

Assuming equal selection in males and females, with selection intensity $i$, response to selection can be predicted as:

\[ R_{MAS} = i \, r_{g,I} \, \sigma_g \]

Response to phenotypic selection without QTL information is:

\[ R_P = i \, r_{g,P} \, \sigma_g \]

With $r_{g,P} = h$, the efficiency of selection using marker information, defined as response to MAS relative to response without marker information, is given by:

\[ E = \frac{R_{MAS}}{R_P} = \frac{r_{g,I}}{r_{g,P}} = \sqrt{\frac{q}{h^2} + \frac{(1-q)^2}{1-qh^2}} \]

An equivalent equation can be derived using the alternate index $I'$:

\[ E = \frac{R_{MAS}}{R_P} = \frac{r_{g,I'}}{r_{g,P}} = \frac{1}{h}\sqrt{q + h^2_{pol}(1-q)} \]

Figure 12.1 shows the impact of heritability and proportion of variance explained by the molecular score on efficiency of MAS. This Figure shows that MAS will be most beneficial for traits with low heritability and when the molecular score explains a large proportion of the genetic variance.

Figure 12.1. Efficiency of MAS relative to phenotypic selection. [Plot: efficiency (1 to 5) against the fraction of variance associated with the molecular score, q (0 to 1), with one curve for each of h2 = 0.05, 0.10, 0.25, 0.50, 0.75, and 1.00.]

Similar procedures, using selection index theory, can be used to derive accuracy and efficiency of MAS for more complex EBV that use information from relatives and/or multiple traits (see Lande and Thompson, 1990). Efficiency of such indexes is approximately equal to those illustrated in Figure 12.1, but with $h^2$ replaced by accuracy squared, $r^2$. This shows that, in general, for a given proportion of variance explained by QTL, MAS will be most efficient for cases in which regular selection is relatively ineffective. This includes traits with low heritability, sex-limited traits, traits that are observed late in life (after selection), and traits that require sacrificing the animal to observe phenotype (e.g. carcass quality traits).

There are several important limitations to the selection index derivations and results presented in this section.
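Before turning to those limitations, the accuracy and efficiency expressions of this section can be evaluated numerically to regenerate the kind of values plotted in Figure 12.1. The Python sketch below implements the two accuracy formulas (for index I and for the alternate index I') and the efficiency relative to phenotypic selection. Function names are illustrative, and the formulas require q * h2 < 1, so the combination q = h2 = 1 is excluded.

```python
import math


def polygenic_heritability(h2: float, q: float) -> float:
    """h2_pol = h2 (1 - q) / (1 - q h2); requires q * h2 < 1."""
    return h2 * (1.0 - q) / (1.0 - q * h2)


def accuracy_index(h2: float, q: float) -> float:
    """Accuracy of the index of molecular score and own phenotype (index I)."""
    return math.sqrt(q + h2 * (1.0 - q) ** 2 / (1.0 - q * h2))


def accuracy_alt_index(h2: float, q: float) -> float:
    """Accuracy of the reparameterized index I' = score + h2_pol * adjusted phenotype."""
    return math.sqrt(q + polygenic_heritability(h2, q) * (1.0 - q))


def efficiency(h2: float, q: float) -> float:
    """Response to MAS relative to phenotypic selection, for which accuracy is h."""
    return accuracy_index(h2, q) / math.sqrt(h2)


if __name__ == "__main__":
    q_values = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
    print("h2    " + "  ".join(f"q={q:.1f}" for q in q_values))
    for h2 in (0.05, 0.10, 0.25, 0.50, 0.75):
        row = "  ".join(f"{efficiency(h2, q):5.2f}" for q in q_values)
        print(f"{h2:.2f}  {row}")
```

The printed grid starts at an efficiency of 1 when q = 0 and rises fastest for the lowest heritabilities, which is the pattern described for Figure 12.1.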
First, it is important to note that the derived accuracy and selection response and efficiency assume normality of both phenotypes and molecular scores. Molecular scores will clearly not be normally distributed if only a few QTL are included. But even in that case, derived accuracies and efficiencies of MAS will be reasonable approximations if q is not too large and most emphasis is on phenotype, such that the index is still approximately normal. Note, however, that the index itself does not require normality and will be optimal (i.e. result in maximal accuracy and response in additive genetic values from the current to the next generation), even if molecular scores are not normally distributed, as long as the QTL are additive (see Dekkers 1999 for optimal QTL breeding values with dominance).

In addition, results apply only to selection over a single generation. Response over multiple generations must accommodate changing variances. Changes in variance due to the Bulmer effect can be accommodated in selection index derivations and response calculations using the procedures developed in Chapter 5. However, another important factor to consider here, especially if selection is on a limited number of QTL of sizeable effect, is the change in variance associated with molecular score as a result of changes in gene frequencies. Accommodating changes in gene frequencies requires additional theory, which will be presented in the next section.

12.2.2 Mixture distribution approach to predicting response to marker-assisted selection

Consider a population of infinite size with discrete generations, selection of fractions Qs and Qd of males and females, and random mating of selected parents. Selection is for a quantitative trait affected by an identified QTL (i.e., not marked) and additive polygenic effects. The QTL has two alleles (B and b). Genotypes BB, Bb, bB and bb, where the first letter indicates the allele received from the sire, are denoted by m = 1, 2, 3, and 4.
To simplify optimization procedures, it is assumed that genotypes Bb and bB can be distinguished, although this may not always be possible in practice. The genotypic value of QTL genotype m is denoted by $q_m$, with $q_1 = a$, $q_2 = q_3 = d$, and $q_4 = -a$, following Falconer and Mackay (1996). Polygenic effects are assumed to follow the infinitesimal genetic model (Falconer and Mackay, 1996). Let $\sigma_{p'}$ and $h^2_{pol}$ denote the phenotypic SD and heritability of the trait within QTL genotype. Both parameters are assumed constant over generations; the effect of selection on polygenic variance (Bulmer, 1980) is ignored. Alternatively, it can be assumed that the population has been under selection for several generations and that the polygenic variance is the stabilized variance with gametic phase disequilibrium.

Let $p_{st}$ and $p_{dt}$ be the frequencies of allele B among paternal and maternal gametes that produce generation t or, equivalently, the frequencies of B among sires and dams that are selected for breeding in generation t-1. The frequency of B in generation t is then equal to $(p_{st} + p_{dt})/2$. Table 12.3 shows the resulting QTL genotype frequencies under random mating of selected parents and summarizes the notation used.

Let $A_{sBt}$ and $A_{sbt}$ be the mean polygenic breeding values of paternal gametes that form generation t and that carry allele B and b, respectively. This formulation allows for gametic phase disequilibrium between the QTL and polygenes (Dekkers and van Arendonk, 1998). Mean polygenic breeding values of maternal gametes are similarly denoted by $A_{dBt}$ and $A_{dbt}$. The mean polygenic breeding value by genotype class is denoted by $\bar{u}_{mt}$ and is the sum of mean polygenic breeding values of the paternal and maternal gametes (Table 12.3). The resulting mean total genotypic value of genotype class m in generation t is then equal to $q_m + \bar{u}_{mt}$ (Table 12.3).
Note that, although genotype classes Bb and bB have the same QTL value ($d$), they can differ in mean polygenic breeding value due to gametic phase disequilibrium. Weighting the genotypic mean of each genotype class by its frequency, the mean total genotypic value of the population in generation $t$, $\bar{g}_t$, is given by:

$$\bar{g}_t = (p_{st}+p_{dt}-1)a + (p_{st}+p_{dt}-2p_{st}p_{dt})d + p_{st}A_{sBt} + (1-p_{st})A_{sbt} + p_{dt}A_{dBt} + (1-p_{dt})A_{dbt}$$

Table 12.3. Summary of notation used for selection on a QTL with two alleles (B and b) in generation $t$

| Genotype | No. ($m$) | Genotype frequency (1) | Mean polygenic breeding value (2) | Mean genetic value | Mean BV, deviated from genotype Bb (3) | Prop. selected in sex $j$ | Index weight | B gamete production, fraction | Selection differential (4) |
|---|---|---|---|---|---|---|---|---|---|
| BB | 1 | $p_{st}p_{dt}$ | $\bar{u}_{1t} = A_{sBt}+A_{dBt}$ | $a+\bar{u}_{1t}$ | $\alpha_t+\bar{u}_{1t}-\bar{u}_{2t}$ | $f_{j1t}$ | $b_{j1t}$ | 1 | $i_{j1t}\sigma_j$ |
| Bb | 2 | $p_{st}(1-p_{dt})$ | $\bar{u}_{2t} = A_{sBt}+A_{dbt}$ | $d+\bar{u}_{2t}$ | 0 | $f_{j2t}$ | 0 | ½ | $i_{j2t}\sigma_j$ |
| bB | 3 | $(1-p_{st})p_{dt}$ | $\bar{u}_{3t} = A_{sbt}+A_{dBt}$ | $d+\bar{u}_{3t}$ | $\bar{u}_{3t}-\bar{u}_{2t}$ | $f_{j3t}$ | $b_{j3t}$ | ½ | $i_{j3t}\sigma_j$ |
| bb | 4 | $(1-p_{st})(1-p_{dt})$ | $\bar{u}_{4t} = A_{sbt}+A_{dbt}$ | $-a+\bar{u}_{4t}$ | $-\alpha_t+\bar{u}_{4t}-\bar{u}_{2t}$ | $f_{j4t}$ | $b_{j4t}$ | 0 | $i_{j4t}\sigma_j$ |

(1) $p_{st}$ and $p_{dt}$ are frequencies of allele B among selected sires and dams that are used to produce generation $t$.
(2) $\bar{u}_{mt}$ is the mean polygenic breeding value of individuals of genotype $m$ in generation $t$; $A_{jBt}$ and $A_{jbt}$ are the mean polygenic values of gametes from sex $j$ that carry allele B or b and are used to produce generation $t$.
(3) $\alpha_t = a+(1-p_{st}-p_{dt})d$ is the standard QTL allele substitution effect in generation $t$ (Falconer and Mackay, 1996).
(4) $\sigma_j$ is the SD of estimates of polygenic breeding values for sex $j$; $i$ denotes selection intensity.

12.2.2.1 Selection Model with QTL Information

With QTL genotype assumed known before selection, information available for selection can be obtained from genetic evaluation with a BLUP animal model with QTL genotype included as a fixed effect (Kennedy et al., 1992; Israel and Weller, 1998).
Such a model results in estimates of the QTL effects ($\hat{q}_m$) and in estimates of individual polygenic breeding values, $\hat{u}_{imt}$ for individual $i$ of genotype class $m$ in generation $t$. Let $r_s$ and $r_d$ denote the accuracy of resulting polygenic EBV for males and females. In a large population, estimates of QTL effects will be known without error, which is what will be used here: $\hat{q}_m = q_m$.

Following Falconer and Mackay (1996), breeding values at the QTL in generation $t$, when deviated from the breeding value of the heterozygote, are equal to $-\alpha_t$, 0 and $+\alpha_t$ for genotypes bb, Bb (=bB) and BB (Dekkers, 1999), where $\alpha_t$ is the standard QTL substitution effect and equal to $\alpha_t = a+(1-p_{st}-p_{dt})d$ (Falconer and Mackay, 1996). Note that the standard QTL substitution effect for generation $t$ is derived using the allele frequency in generation $t$ ($= \frac{1}{2}(p_{st}+p_{dt})$). Adding average polygenic breeding values, the mean total breeding value of individuals of genotype $m$ in generation $t$, deviated from the mean breeding value of Bb individuals ($m$ = 2), is equal to:

$$\bar{g}_{mt} = n_m[a+(1-p_{st}-p_{dt})d] + (\bar{u}_{mt} - \bar{u}_{2t})$$

where indicator variable $n_m$ is equal to +1, 0, 0, and −1 for $m$ equal to BB, Bb, bB, and bb, respectively (see Table 12.3).

In practice, mean polygenic breeding values by genotype class, $\bar{u}_{mt}$, can be estimated as the average estimated polygenic breeding value by genotype class. For a large population, these estimates can be assumed known without error, which is what will be used here: $\bar{u}_{mt} = \hat{\bar{u}}_{mt}$.

Resulting values can be used to compute the following selection criterion that combines the mean breeding value of the QTL genotype, $\bar{g}_{mt}$, with the individual's polygenic breeding value estimate ($\hat{u}_{ijmt}$), which is deviated from the mean polygenic breeding value of genotype class $m$ ($\hat{\bar{u}}_{mt}$):

$$I_{ijmt} = b_{jmt}\,\bar{g}_{mt} + (\hat{u}_{ijmt} - \hat{\bar{u}}_{mt})$$

where $b_{jmt}$ is the weight given to the QTL breeding value for individuals of sex $j$ of genotype $m$ in generation $t$.
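The quantities that enter the selection criterion can be illustrated with a short Python sketch. It is only a minimal illustration: the function names and the numerical inputs are hypothetical, and the indicator is taken as +1 for BB and −1 for bb so that the deviated QTL breeding values are +α, 0, and −α for BB, Bb, and bb, in agreement with Table 12.3.

```python
def substitution_effect(a, d, ps, pd):
    """Allele substitution effect alpha_t = a + (1 - ps - pd) * d."""
    return a + (1.0 - ps - pd) * d


def class_breeding_values(a, d, ps, pd, u):
    """Mean total breeding value per genotype class, deviated from Bb.

    u holds the mean polygenic breeding values of classes BB, Bb, bB, bb.
    """
    alpha = substitution_effect(a, d, ps, pd)
    n = (1, 0, 0, -1)  # indicator for BB, Bb, bB, bb (assumed sign convention)
    return [nm * alpha + (um - u[1]) for nm, um in zip(n, u)]


def index_value(b, g_class, ebv, u_class):
    """Index I = b * g_class + (individual polygenic EBV - class mean EBV)."""
    return b * g_class + (ebv - u_class)


# Hypothetical example values (not taken from the chapter)
u = [0.10, 0.00, -0.05, -0.20]
g = class_breeding_values(a=0.5, d=0.1, ps=0.1, pd=0.2, u=u)
# Index of a bb individual with polygenic EBV 0.30 under standard weights (b = 1)
I_bb = index_value(b=1.0, g_class=g[3], ebv=0.30, u_class=u[3])
```

With b = 1 the index is simply the sum of the class breeding value and the individual's deviation from its class mean, so a bb individual can still outrank a BB individual if its polygenic EBV is high enough.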
Selection on this index involves truncation selection across the four genotype classes, as illustrated in Figure 12.2. The index value for each genotype class $m$ is assumed to follow a normal distribution with mean $b_{jmt}\bar{g}_{mt}$ and SD equal to the SD of polygenic EBV within genotype class, which is equal to $\sigma_j = r_j\sigma_{pol}$, where $\sigma_{pol}$ is the polygenic SD, and $r_j$ is the accuracy of polygenic EBV for sex $j$. The polygenic standard deviation, $\sigma_{pol}$, is assumed to be constant over generations and equal to $h_{pol}\sigma_{p'}$, where $\sigma_{p'}$ is the phenotypic standard deviation adjusted for the QTL effect. For known parameters of the four distributions, the unique truncation point that results in the correct proportion selected ($Q_s$ for males and $Q_d$ for females) can be determined numerically. The bisection method described in Chapter 3 can be used for this purpose.

Figure 12.2. Truncation selection across distributions of index values for QTL genotypes.
[Figure: four normal distributions of index values, one per QTL genotype (bb, bB, Bb, BB), each labeled with its mean $b_{jmt}\bar{g}_{mt}$, SD $\sigma_j$, standardized truncation point $x_{jmt}$, and selected fraction $f_{jmt}$.]

Let $x_{jmt}$, $f_{jmt}$, and $i_{jmt}$ be the standardized truncation point, proportion selected, and selection intensity (Falconer and Mackay, 1996) for genotype class $m$ of sex $j$ at generation $t$ when truncating across the four distributions (Table 12.3). With $x_{jmt}$ obtained from the unique point of truncation across the four distributions (Figure 12.2), $f_{jmt}$ and $i_{jmt}$ can be approximated under the assumption of normality of polygenic EBV. The expected frequency of B among paternal ($j=s$) and maternal ($j=d$) gametes that form the next generation can then be derived based on the proportion of B gametes produced by each genotype (Table 12.3) as:

$$p_{j,t+1} = \left[p_{st}p_{dt}f_{j1t} + \tfrac{1}{2}p_{st}(1-p_{dt})f_{j2t} + \tfrac{1}{2}(1-p_{st})p_{dt}f_{j3t}\right]/Q_j$$

where $Q_j$ is the total proportion selected for sex $j$.
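The numerical step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for the chapter's results: the function names, the bisection bounds (10 SD beyond the extreme class means), the tolerance, and the example inputs are all hypothetical. It finds the common truncation point across four normal distributions with equal SD and then applies the gene frequency equation.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncate_across_classes(freqs, means, sd, Q, tol=1e-10):
    """Bisection for the index threshold c that selects a total fraction Q.

    freqs, means: frequency and index mean of classes BB, Bb, bB, bb.
    Returns c, standardized truncation points x_m, and fractions selected f_m.
    """
    def selected(c):
        return sum(fr * (1.0 - norm_cdf((c - mu) / sd))
                   for fr, mu in zip(freqs, means))

    lo, hi = min(means) - 10.0 * sd, max(means) + 10.0 * sd
    while hi - lo > tol:
        c = 0.5 * (lo + hi)
        if selected(c) > Q:   # too many selected: raise the threshold
            lo = c
        else:                 # too few selected: lower the threshold
            hi = c
    c = 0.5 * (lo + hi)
    x = [(c - mu) / sd for mu in means]
    f = [1.0 - norm_cdf(xm) for xm in x]
    return c, x, f


def next_b_frequency(freqs, f, Q):
    """Frequency of B among gametes of selected parents (BB: 1, Bb and bB: 1/2)."""
    return (freqs[0] * f[0] + 0.5 * freqs[1] * f[1] + 0.5 * freqs[2] * f[2]) / Q


# Hypothetical example: ps = pd = 0.1, class index means in arbitrary units
ps = pd = 0.1
freqs = [ps * pd, ps * (1 - pd), (1 - ps) * pd, (1 - ps) * (1 - pd)]
c, x, f = truncate_across_classes(freqs, [0.5, 0.0, 0.0, -0.5], sd=0.4, Q=0.1)
p_next = next_b_frequency(freqs, f, Q=0.1)
```

Because the total fraction selected decreases monotonically as the threshold rises, bisection converges to the unique truncation point; a root finder from a numerical library could be substituted.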
Following Falconer and Mackay (1996), $i_{jmt}\sigma_j$ is the selection differential and genetic superiority for polygenic breeding values of selected parents of sex $j$ of genotype $m$, which is deviated from the mean polygenic breeding value of all selection candidates of genotype $m$ (Table 12.3). Expected mean polygenic breeding values of B and b gametes that form the next generation ($t+1$) can then be computed separately for paternal and maternal gametes as:

$$A_{jB,t+1} = \tfrac{1}{2}\left[f_{j1t}p_{st}p_{dt}(\bar{u}_{1t}+i_{j1t}\sigma_j) + \tfrac{1}{2}f_{j2t}p_{st}(1-p_{dt})(\bar{u}_{2t}+i_{j2t}\sigma_j) + \tfrac{1}{2}f_{j3t}(1-p_{st})p_{dt}(\bar{u}_{3t}+i_{j3t}\sigma_j)\right] / \left(Q_j\,p_{j,t+1}\right)$$

$$A_{jb,t+1} = \tfrac{1}{2}\left[\tfrac{1}{2}f_{j2t}p_{st}(1-p_{dt})(\bar{u}_{2t}+i_{j2t}\sigma_j) + \tfrac{1}{2}f_{j3t}(1-p_{st})p_{dt}(\bar{u}_{3t}+i_{j3t}\sigma_j) + f_{j4t}(1-p_{st})(1-p_{dt})(\bar{u}_{4t}+i_{j4t}\sigma_j)\right] / \left(Q_j(1-p_{j,t+1})\right)$$

Note that these equations are based on standard polygenic selection theory (Falconer and Mackay, 1996): they sum the polygenic means of B or b gametes that are produced by parents selected from each genotype, weighted by their relative frequencies. The normalizing constant $Q_j p_{j,t+1}$ derives from the fact that the sum of weights in the equation for B is equal to $Q_j p_{j,t+1}$. Similarly, the sum of weights in the equation for b is equal to $Q_j(1-p_{j,t+1})$.

This is a general procedure for modeling selection on a quantitative trait that is affected by a QTL and can be easily extended to selection on multiple QTL by increasing the number of genotype classes (see Chakraborty et al. 2002). This procedure can be used to model what will be referred to as standard index GAS by setting all weights $b_{jmt}$ equal to one, which results in an index that is equivalent to the index that was derived using selection index theory. This index maximizes response from the current to the next generation for additive QTL, as shown by Dekkers and van Arendonk (1998). This formulation also allows index weights to be derived for what will be referred to as optimal index GAS (see later), which aims to maximize response to selection over multiple generations.
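The gamete-mean recursions above can be sketched in Python for one sex and one generation. This is an illustrative sketch only: the function names and inputs are hypothetical, and the selection intensity for each class is taken as the normal ordinate at the truncation point divided by the fraction selected, which is the usual result for truncation selection under normality. The sketch assumes every class has a non-zero fraction selected.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gamete_update(ps, pd, x, u, sigma):
    """One-generation update for one sex.

    x: standardized truncation points for classes BB, Bb, bB, bb.
    u: mean polygenic breeding values of those classes.
    sigma: SD of polygenic EBV for this sex.
    Returns (p_next, A_B_next, A_b_next).
    """
    freqs = [ps * pd, ps * (1 - pd), (1 - ps) * pd, (1 - ps) * (1 - pd)]
    f = [1.0 - norm_cdf(xm) for xm in x]                 # fractions selected
    i = [norm_pdf(xm) / fm for xm, fm in zip(x, f)]      # selection intensities
    sel_mean = [um + im * sigma for um, im in zip(u, i)]  # selected-parent means
    Q = sum(fr * fm for fr, fm in zip(freqs, f))
    share_B = [1.0, 0.5, 0.5, 0.0]  # fraction of B gametes per genotype
    share_b = [0.0, 0.5, 0.5, 1.0]  # fraction of b gametes per genotype
    p_next = sum(s * fr * fm for s, fr, fm in zip(share_B, freqs, f)) / Q
    A_B = 0.5 * sum(s * fr * fm * sm for s, fr, fm, sm
                    in zip(share_B, freqs, f, sel_mean)) / (Q * p_next)
    A_b = 0.5 * sum(s * fr * fm * sm for s, fr, fm, sm
                    in zip(share_b, freqs, f, sel_mean)) / (Q * (1.0 - p_next))
    return p_next, A_B, A_b


# Hypothetical check: identical truncation in every class and zero class means
x_common = 0.8416  # roughly the top 20% of a standard normal
p_next, A_B, A_b = gamete_update(0.1, 0.3, [x_common] * 4, [0.0] * 4, sigma=0.4)
```

When all classes are truncated at the same standardized point and class means are zero, selection does not discriminate between QTL genotypes, so the frequency of B among selected gametes equals the frequency in the candidates and both gamete means equal half the common selection differential.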
Finally, this procedure also allows approximation of selection without QTL information by setting weights $b_{jmt} = r_j^2$. If the QTL is non-additive, the first term of the index, $b_{jmt}\bar{g}_{mt}$, must then be replaced by $r_j^2(q_m + \bar{u}_{mt})$ because with GAS, QTL effects are based on allele substitution effects, whereas the genotypic effect is reflected in the phenotype.

The model does not specifically allow consideration of marked QTL. However, assuming recombination rates are small, it does provide a good approximation to LD-MAS. In that case, QTL alleles are replaced by marker haplotypes.

12.3 Two-stage vs. index selection on QTL

The following figures compare responses from two-stage GAS, in which selection is on QTL genotype in stage 1, followed by selection on phenotype, to responses from standard index GAS. Phenotypic selection, without use of QTL information, and optimal index GAS (see later) are also included for comparison. The example is for selection on a biallelic additive QTL with effect $a = 0.5\sigma_p$ and starting frequency 0.1 for a trait with polygenic heritability 0.25 and selection of 10% of males and 25% of females. Results are from the deterministic mixture distribution model. Two-stage GAS was implemented by first selecting BB individuals, followed by individuals with the Bb or bB genotype (equal proportions) if there were not enough BB animals, and by bb individuals if still additional animals were required. For the last selected genotype, individuals with the highest phenotypic value were selected until the required overall proportion selected was obtained.

Results (see figures below) show that two-stage GAS resulted in more rapid fixation of the QTL than standard index GAS, but at a cost to polygenic response. Cumulative response and cumulative discounted response ($\mathrm{CDR} = \sum_{t=1}^{10} \frac{1}{(1+\rho)^t}\,\bar{g}_t$, with $\rho$ = 10% interest) from two-stage GAS were, however, lower than response from standard index GAS.
The reason for this is that two-stage selection removes some individuals that have high polygenic EBV in the first stage, which are selected with index GAS because their high polygenic effects more than offset the fact that they have an unfavorable QTL genotype. This is akin to multiple-trait selection using independent culling levels versus index selection.

[Figure: four panels plotted against generation (0 to 10) for optimal GAS, standard GAS, phenotypic selection, and two-stage GAS: QTL frequency; genetic gain; cumulative response deviated from phenotypic selection; CDR deviated from phenotypic selection.]

For comparison, results from phenotypic selection and optimal index GAS are also shown. Both two-stage and standard GAS ultimately resulted in lower response and CDR than phenotypic selection; the reason for this will be explained later. Optimal GAS had greater CDR than phenotypic selection across the planning horizon.

The next figure shows lost cumulative response from two-stage compared to standard GAS after 1, 2, and 3 generations, and lost CDR over 10 generations, for QTL of different effects. Results show that lost response from two-stage GAS was greatest for small QTL and reduced to zero for large QTL. In fact, two-stage GAS can be modeled using the mixture distribution by giving a very large effect to the QTL, which ensures that all individuals with the favorable QTL genotype are selected.

[Figure: two-stage vs. index GAS (p0 = 0.1). Selected proportions: sires = 0.1, dams = 0.25. Accuracy: rs = 0.8, rd = 0.5. Response lost (%) is plotted against QTL effect (a in σg) for trait response after generations 1, 2, and 3 and for CDR over 10 generations.]

Whereas the previous results were obtained for a single trait, they also apply to selection for an aggregate multiple-trait breeding goal. In that case, the effect of the QTL is expressed relative to the genetic standard deviation for the breeding goal and selection accuracies are those for the multiple-trait index as a predictor of the aggregate genotype ($r_{HI}$).

Note that the results presented above apply not only to selection on QTL for quantitative traits but also to genes associated with single-gene traits, such as genes associated with genetic defects, appearance (horns, color), and diseases. Although it is often difficult to assign an economic value to such genes, the above results show that simply culling all carriers of a genetic defect results in lost response for other traits, because selection emphasis is diverted. This occurs even if the gene has no direct negative effect (pleiotropy) on the other traits. Thus, it is important to attempt to assess an economic value for such single-gene traits. This economic value could account for the lost marketability of breeding stock that are carriers of the genetic defect. With regard to the use of carriers, one point to note is that occurrence of homozygous recessive progeny can be avoided by careful mating. Thus, there is often no valid reason to absolutely avoid use of carrier animals in breeding.

In the two-stage selection procedure used above, selection was first on molecular data and then on phenotype. There may be benefit to turning this around, so that selection on the index that includes marker information follows a first stage of selection using phenotype-based EBV; only individuals that are selected in the first stage would then need to be genotyped, which would save costs.
12.4 Long-term Response to Selection with QTL Information

To examine longer-term responses to selection, Fig. 12.3 illustrates responses to selection on phenotype and to standard index GAS based on the mixture distribution model for an example situation. For illustrative purposes, the example reflects a QTL of very large effect (the difference between homozygotes is $2a = 1.5\sigma_{p'}$). Similar trends are observed for QTL of smaller effect, although the differences between phenotypic selection and GAS are smaller.

Fig. 12.3. Responses to standard MAS and phenotypic selection based on the deterministic model in a population of infinite size. Selection is of the top 20% of males and females for a trait controlled by a biallelic additive QTL and polygenes. The QTL has effect a = 1 phenotypic standard deviation and frequency 0.1. Polygenic heritability is 0.25. The main graph shows a) cumulative total response to selection, expressed in polygenic standard deviations (σpol); b) frequency of the favorable QTL allele; and c) polygenic response per generation. (From Dekkers & Settar, 2003)
[Figure: main graph of genetic value (σpol) against generation (0 to 30) for phenotypic selection and standard GAS; inset b) QTL frequency against generation; inset c) polygenic response (σpol) against generation.]

Figure 12.3 clearly shows the extra response from GAS during early generations. By generation 5, however, cumulative response from phenotypic selection exceeds that from GAS. As expected, GAS fixes the QTL at a faster rate than phenotypic selection (Fig. 12.3b). The increased selection emphasis on the QTL, however, results in lower response in polygenes (Fig. 12.3c). Although polygenic response per generation returns to maximum as soon as the QTL is fixed, i.e.
sooner for GAS than for phenotypic selection, the extra polygenic response that is lost in early generations with GAS is never regained in later generations, which is the reason for the lower cumulative response for GAS in the longer term.

Results illustrated in Fig. 12.3 are based on several simplifying assumptions for the polygenic component of the genetic model: (a) the infinitesimal model for polygenes, i.e. an infinite number of polygenes of small effect; (b) large population size, i.e. no inbreeding or drift; and (c) genetic variance contributed by polygenes remains constant over generations, i.e. no gametic phase disequilibrium among polygenes (Bulmer 1980). The deterministic model does account for the gametic phase disequilibrium between the QTL and polygenes that is induced by simultaneous selection on the QTL and polygenes (Dekkers and van Arendonk 1998). This is reflected in a negative association between the QTL and polygenes, such that individuals with a (un)favorable QTL genotype tend to have poorer (better) polygenic breeding values. The creation of this negative association by selection is illustrated in Fig. 12.2 by noting that individuals with a BB genotype are less intensely selected for polygenes than individuals with a bb genotype for the QTL. A negative association is created by both phenotypic selection and GAS but is larger for GAS because of the greater emphasis on the QTL (Dekkers and van Arendonk 1998).

Despite the simplifying assumptions of the deterministic model, the results illustrated in Fig. 12.3 have been reproduced by stochastic simulation in several studies (e.g. Larzul et al. 1997; Pong-Wong and Woolliams 1998). A stochastic model simulates individuals in the population under selection, rather than population distributions, and does not require many of the assumptions that are inherent to the deterministic model depicted in Fig. 12.2. Typical results from such stochastic simulations are demonstrated in Fig.
12.4, which represents the results of simulating selection in a population of 250 males and 250 females, with 20% selected for each sex. Three different genetic models were used for polygenes: the infinitesimal genetic model and models in which the polygenic component is simulated by 50 or 10 individual loci. Results for the stochastic model were averaged over 500 replicate simulations. Fig. 12.4 focuses on the difference in cumulative responses between GAS and phenotypic selection over generations, rather than the absolute responses illustrated in Fig. 12.3. For a given method of selection (GAS or phenotypic selection), absolute cumulative responses to selection (not shown) differed between genetic models; responses were greatest for the deterministic model, followed by the infinitesimal model, and the finite locus models with 50 and 10 polygenes. Average rates of change in frequency of the QTL were very similar between genetic models (results not shown). For the infinitesimal model, differences in response between GAS and phenotypic selection were very similar for the stochastic model and the deterministic model (Fig. 12.4). In contrast to the deterministic model, the stochastic model accommodates reductions in polygenic variance as a result of the Bulmer effect and inbreeding (Fig. 12.5). Under the stochastic model, however, changes in polygenic variance were similar for GAS and phenotypic selection and would, therefore, have limited impact on their contrast. The finite locus model with 50 polygenes exhibited similar differences in cumulative response between GAS and phenotypic selection as the infinitesimal model for the first 10 generations (Fig. 12.4). In subsequent generations, GAS regained some of the response it had lost under the finite locus model and differences with phenotypic selection decreased slightly. 
The recovery of lost response under GAS was greater for the model with 10 polygenes; the difference with phenotypic selection was reduced to 0.13 polygenic standard deviations by generation 15. This behavior of the finite locus model is explained by the change in frequencies of polygenes. The average frequency of polygenes is initially lower for GAS than for phenotypic selection. As frequencies move closer to 1, however, polygenic variance is depleted, and more rapidly so for phenotypic selection than for GAS (Fig. 12.5). As a result, polygenic response with GAS is able to catch up with polygenic response for phenotypic selection.

Figure 12.4. Cumulative responses to standard GAS as a deviation from cumulative response for phenotypic selection (GAS response − phenotypic response, expressed in polygenic standard deviations, σpol). Selection is of the top 20% of males and females from 250 individuals per sex for a trait controlled by a biallelic additive QTL and polygenes. The QTL has effect a = 1 phenotypic standard deviation and frequency 0.1. Polygenic heritability is 0.25. In addition to a deterministic model, results are presented for three stochastic models with different models for the polygenic component: the infinitesimal model, and finite locus models with 50 or 10 unlinked loci of equal effect but frequencies drawn from a uniform [0,1] distribution. Stochastic simulation results are the average of 500 replicate simulations. (From Dekkers and Settar, 2003)
[Figure: GAS − phenotypic response (σpol) against generation (0 to 15) for the deterministic, infinitesimal, 50-polygene, and 10-polygene models.]

In a large population with negligible inbreeding, all genes that affect the trait will ultimately be fixed for their favorable allele under both GAS and phenotypic selection. Thus, ultimate response will be the same for both strategies.
In populations of limited size, however, ultimate response will differ between strategies because of their impact on rates of fixation and loss of polygenes. Thus, ultimate differences in response between GAS and phenotypic selection will depend on the proportion of polygenes for which the favorable allele is lost. These differences will, however, be small (Dekkers and Settar, 2003).

Figure 12.5. Polygenic variance (relative to polygenic variance in generation 0) under standard MAS and phenotypic selection (Phen). Selection is of the top 20% of males and females from 250 individuals per sex for a trait controlled by a biallelic additive QTL and polygenes. The QTL has effect a = 1 phenotypic standard deviation and frequency 0.1. Polygenic heritability is 0.25. Results are presented for three models for the polygenic component: the infinitesimal model, and finite locus models with 50 or 10 unlinked loci with equal effect but frequencies drawn from a uniform [0,1] distribution. Results are the average of 500 replicate stochastic simulations. (From Dekkers and Settar, 2003)
[Figure: relative polygenic variance against generation (0 to 15) for GAS and Phen under the infinitesimal, 50-polygene, and 10-polygene models.]

12.5 Optimizing QTL selection

The previous results demonstrate that GAS strategies that maximize response over a single generation will not maximize response over more than one generation. The underlying reason is that selection not only changes the population mean but also population parameters such as gene frequencies and genetic variances. These changes in population parameters affect the amount of response that can be made in subsequent generations. Thus, strategies that maximize response over multiple generations must account for changes in population parameters that affect subsequent responses to selection.
In the case of GAS and under the infinitesimal genetic model with constant polygenic variance, the main population parameter that affects response to selection in subsequent generations is the frequency of the QTL. Therefore, to develop strategies that maximize longer-term response to selection, Dekkers and van Arendonk (1998) used the mixture distribution model described previously to optimize weights in the previously described index of the genetic value for a single known QTL and an EBV for polygenes:

$$I_{ijmt} = b_{jmt}\,\bar{g}_{mt} + (\hat{u}_{ijmt} - \hat{\bar{u}}_{mt})$$

Index weights $b_{jmt}$ were allowed to differ by generation, sex, and QTL genotype. In reference to Figure 12.2, changing weights on the QTL changes the means of the distributions for genotypes BB, bB, and bb relative to that for Bb and, thereby, the proportions selected from each genotype.

[Diagram: MAS strategies that maximize response over one generation may not maximize response over multiple generations. Selection in the current generation affects both the progeny mean and the genetic parameters (QTL frequency and variance); the latter affect response in subsequent generations.]

Dekkers and van Arendonk (1998) used optimal control theory to derive the index weights that maximized cumulative response after $T$ generations. Optimal control theory utilizes the unique structure of response to selection over generations, in that the optimal selection strategy for generation $t$ depends only on population parameters in generation $t$, i.e. polygenic means and QTL frequency, and not on the path that led to these parameters (Dekkers and van Arendonk 1998). Manfredi et al. (1998) solved a similar problem using a more general optimization method. Their method does not utilize the unique structure of selection over multiple generations and requires more computing time. The approach of Dekkers and van Arendonk (1998) was subsequently extended to multiple QTL by Chakraborty et al. (2002).
The general selection objective considered is to maximize a weighted sum of mean total genotypic values by generation over a planning horizon of $T$ generations:

$$R = \sum_{t=1}^{T} w_t\,\bar{g}_t$$

where $w_t$ is the relative emphasis on generation $t$ in the overall objective. For an economic objective function such as cumulative discounted response (CDR), weights $w_t$ reflect discount factors and are equal to $w_t = 1/(1+\rho)^t$, where $\rho$ is the rate of interest per generation. If the objective is to maximize cumulative response after $T$ generations, set $w_T = 1$ and all other $w_t = 0$.

The problem of maximizing this objective function can be stated as the following constrained multi-stage non-linear optimization or optimal control problem (Lewis, 1986), using the notation developed previously:

Given the mean polygenic values and gene frequencies in generation 0, i.e. $A_{sB0}$, $A_{dB0}$, $A_{sb0}$, $A_{db0}$, $p_{s0}$, and $p_{d0}$,

$$\max_{f_{jmt}} \; R = \sum_{t=1}^{T} w_t\,\bar{g}_t$$

subject to, for $j = s, d$ and every $t = 0$ to $T-1$:

$$Q_j = p_{st}p_{dt}f_{j1t} + p_{st}(1-p_{dt})f_{j2t} + (1-p_{st})p_{dt}f_{j3t} + (1-p_{st})(1-p_{dt})f_{j4t}$$

$$p_{j,t+1} = \left[p_{st}p_{dt}f_{j1t} + \tfrac{1}{2}p_{st}(1-p_{dt})f_{j2t} + \tfrac{1}{2}(1-p_{st})p_{dt}f_{j3t}\right]/Q_j$$

$$A_{jB,t+1} = \tfrac{1}{2}\left[f_{j1t}p_{st}p_{dt}(\bar{u}_{1t}+i_{j1t}\sigma_j) + \tfrac{1}{2}f_{j2t}p_{st}(1-p_{dt})(\bar{u}_{2t}+i_{j2t}\sigma_j) + \tfrac{1}{2}f_{j3t}(1-p_{st})p_{dt}(\bar{u}_{3t}+i_{j3t}\sigma_j)\right] / \left(Q_j\,p_{j,t+1}\right)$$

$$A_{jb,t+1} = \tfrac{1}{2}\left[\tfrac{1}{2}f_{j2t}p_{st}(1-p_{dt})(\bar{u}_{2t}+i_{j2t}\sigma_j) + \tfrac{1}{2}f_{j3t}(1-p_{st})p_{dt}(\bar{u}_{3t}+i_{j3t}\sigma_j) + f_{j4t}(1-p_{st})(1-p_{dt})(\bar{u}_{4t}+i_{j4t}\sigma_j)\right] / \left(Q_j(1-p_{j,t+1})\right)$$

This represents an optimization problem with $8T$ decision variables, $f_{jmt}$, that must be optimized subject to $8T$ equality constraints. The first set of constraints corresponds to the overall fraction selected, the second set to changes in gene frequency, and the last two to changes in mean polygenic values. The only variable related to polygenic effects in the equations is the SD of polygenic EBV. Thus, the problem formulation and, therefore, its solutions, do not depend explicitly on heritability or polygenic variance.
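The two choices of weights described for the objective function can be made concrete with a small Python sketch. The function names and the example generation means are hypothetical and serve only to show how the weighted sum is formed.

```python
def discount_weights(T, rho):
    """Weights w_t = 1 / (1 + rho)**t for t = 1..T (cumulative discounted response)."""
    return [1.0 / (1.0 + rho) ** t for t in range(1, T + 1)]


def terminal_weights(T):
    """Weights for cumulative response after T generations: w_T = 1, all others 0."""
    return [0.0] * (T - 1) + [1.0]


def objective(g_means, weights):
    """Weighted sum R of generation means g_1..g_T."""
    if len(g_means) != len(weights):
        raise ValueError("g_means and weights must have the same length")
    return sum(w * g for w, g in zip(weights, g_means))


# Hypothetical generation means g_1..g_3 (not taken from the chapter)
g_means = [0.4, 0.9, 1.3]
cdr = objective(g_means, discount_weights(3, rho=0.10))
terminal = objective(g_means, terminal_weights(3))
```

With terminal weights only the mean in the final generation counts, whereas discounting places more emphasis on early generations, which is why the two objectives generally lead to different optimal selection strategies.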
Finding optimal weights

[Diagram: optimizing selection on QTL with optimal control theory (Dekkers and van Arendonk, 1998, Genetical Research). The selection criterion is I = b × QTL (E)BV + polygenic EBV. Control variables are the weights b_jt for t = 0, ..., T−1; state variables are the polygenic means A_t and QTL frequencies p_t; system outputs are the generation means g_t. The weights are optimized to maximize Σ_{t=1}^{T} g_t/(1+ρ)^t, subject to p_{t+1} = p_t + Δp_t and A_{t+1} = A_t + ΔA_t for t = 0, ..., T−1.]

The above formulation uses fractions selected from each distribution in each generation, $f_{jmt}$, as decision variables rather than index weights, $b_{jmt}$. Dekkers and Van Arendonk (1998) demonstrated how the truncation points, $x_{jmt}$, that correspond to the fractions selected $f_{jmt}$, can be transformed to index weights on $\bar{g}_{mt}$:

$$b_{jmt} = \sigma_j\,(x_{jmt} - x_{j,Bb,t}) / \bar{g}_{mt}$$

Formulating the problem in terms of fractions selected is thus equivalent to formulating the problem in terms of index weights. This can also be illustrated through Figure 12.2; changing the index weights $b_{jmt}$ shifts the four distributions relative to each other. When truncating on the index, changes in index weights therefore result in corresponding changes in truncation points and in fractions selected from each distribution. Index weights for genotype Bb are equal to zero because means are deviated from genotype Bb. Thus, the number of index weights in this formulation is $6T$ (3 per sex per generation) compared to $8T$ variables $f_{jmt}$. The difference in number of variables results from the implementation of the first set of constraint equations, which constrain the total fraction selected per sex.

Optimal fractions selected $f_{jmt}$ and, thereby, index weights that maximize the constrained objective function can then be derived iteratively, using optimal control procedures, following Dekkers and Van Arendonk (1998). Details are in Chakraborty et al. (2002).
Figure 12.6 illustrates the optimal weights assigned to the QTL in each generation for the example situation when the objective was to maximize cumulative response over 30 generations. Index weights were the same for males and females because selection intensities were the same for both. Weights differed by generation and QTL genotype. Except for the final generations, weights on the QTL were substantially lower than those used for standard GAS ($b_Q = 1$ for all generations) and phenotypic selection ($b_Q = h^2_{pol}$ for all generations). Optimal weights were equal to 1 for the final generation because at that point the aim is to maximize response in the next generation, equivalent to standard GAS. In generation 29, the optimal weight on the unfavorable QTL genotype (bb) was extremely large.

Figure 12.6. Weights on the QTL with standard GAS, phenotypic selection, and optimal GAS. 20% selection of males and females; a = 1σp; h² = 0.25.
[Figure: index weight against generation (0 to 30) for standard GAS, phenotypic selection, and optimal GAS (weights b_bb and b_BB).]

Figure 12.7. a) Cumulative total (closed symbols) and polygenic (open symbols) responses to standard and optimal GAS, deviated from phenotypic selection; b) frequency of the QTL for optimal GAS.
[Figure: MAS − phenotypic response (σpol) against generation (0 to 30) for optimal GAS and standard GAS; inset b) QTL frequency against generation.]

Figure 12.7 depicts the resulting changes in cumulative total and polygenic responses for optimal GAS as a deviation from responses to phenotypic selection. Frequencies of the QTL in each generation are illustrated in Fig. 12.7b. Optimal GAS led to a much more gradual and almost linear increase in frequency toward fixation at the end of the planning horizon. This is in contrast to standard GAS and phenotypic selection (Figure 12.3b). As a result, cumulative response was lower for optimal GAS than for phenotypic selection (Fig. 12.7) for the first 23 generations.
However, polygenic response was greater, which led to a 0.42 polygenic standard deviation greater cumulative response by the end of the planning horizon, at which time the QTL was also fixed under optimal GAS.

Realized selection intensities that were placed on the polygenes and the QTL in a generation, which were computed as response generated in that generation divided by the standard deviation of the selected component (= polygenic standard deviation for polygenic response and $\sqrt{2p(1-p)}$ for the QTL, where $p$ is the gene frequency), are given in Figure 12.8. For optimal GAS, selection intensity placed on the polygenes was remarkably constant over generations, apart from the last generation. In contrast, selection pressure placed on polygenes was lower for both standard GAS and phenotypic selection prior to fixation of the QTL. Patterns for selection intensities placed on the QTL nearly mirrored those of intensity on polygenes for standard GAS and phenotypic selection but were again nearly constant for optimal GAS, apart from the first and last generation. The latter likely relates to the build-up of gametic phase disequilibrium between the QTL and polygenes.

Figure 12.8. Standardized selection response (intensity) per generation for the QTL (closed symbols) and polygenes (open symbols).
[Figure: intensity against generation (0 to 30) for phenotypic selection, standard GAS, and optimal GAS, shown separately for polygenes and the QTL.]

Figure 12.9. Frequencies for 3 QTL under standard and optimal GAS. The QTL have effects a (in phenotypic SD) and initial frequencies p of a = 1/2, p = 0.1; a = 1/4, p = 0.1; and a = 1/4, p = 0.3.
[Figure: QTL frequency against generation (0 to 20) for standard GAS and optimal GAS.]

Results depicted in Figure 12.8 demonstrate that, to maximize cumulative response over a planning horizon of $T$ generations, selection emphasis on the QTL should be controlled in such a manner that the QTL is close to fixation in generation $T$, while equal selection emphasis is placed on polygenes across generations. Note that this optimal solution is by nature similar to that obtained by Finney (1958) for selection across multiple stages. He also found that to maximize cumulative response in a quantitative trait over multiple stages of selection, selection efforts should be divided equally across stages. Unequal selection results in lower cumulative response because of the non-linear relationship between proportion selected and selection intensity (Falconer and Mackay 1996). The additional complication in the present context is that the total selection emphasis that is applied to polygenes across all generations is not predetermined but must be balanced against placing sufficient emphasis on the QTL, such that the QTL frequency is moved to fixation in generation $T$. Results depicted in Fig. 12.8 show that this is achieved by maintaining a nearly constant standardized selection emphasis on the QTL over generations, apart from the first and last generations.

Trends in QTL frequencies under selection on 3 QTL with standard GAS and with optimal GAS (to maximize cumulative response over 20 generations) are illustrated in Figure 12.9. With standard GAS, rates of fixation depended on both effect and starting frequency of the QTL and the QTL with the larger effect was moved to fixation more quickly, as expected.
The same was observed for phenotypic selection (not shown), but rates of fixation were lower for all QTL. In contrast, with optimal GAS, frequencies increased nearly linearly to reach near fixation at the end of the planning horizon for all three QTL, regardless of the effect and the initial frequency of the QTL. The rate of increase in frequency was determined by the initial frequency of the QTL, not by its effect; magnitude of the QTL had no impact on the selection emphasis that was placed on the QTL in the optimal strategy.

Figures 12.10 and 12.11 show results from optimal GAS on a single QTL when the objective was to maximize cumulative discounted response (CDR) over 10 generations with a 10% interest rate. Parameters included 10 and 25% selection among males and females, heritability 0.10, and QTL parameters a = 0.2 σp (phenotypic standard deviations), d = ½a, and starting frequency 0.10. Optimal GAS resulted in 3.5% greater CDR than standard GAS and in 8.6% greater CDR than phenotypic selection. Additional results are in Chakraborty and Dekkers (2000).

Figure 12.10. Cumulative total and polygenic responses. [Plot: genetic level against generation, showing total response and polygenic response for phenotypic selection, standard GAS and optimal GAS. Legend, CDR over phenotypic strategy: Standard +3.5%, Optimal +8.6%.]

Figure 12.11. QTL frequencies. [Plot: QTL frequency against generation for phenotypic selection, standard GAS and optimal GAS.]

12.5 Implementation of MAS for Genetic Improvement

Genetic improvement can be accomplished by within-breed selection and/or by capitalizing on between-breed differences. Both can be achieved to some degree through conventional selection using phenotype but can be enhanced through the use of molecular information. In what follows, we will first discuss opportunities for utilizing between-breed differences and then evaluate strategies for within-breed selection.
12.5.1 Between-breed improvement

Molecular markers can be used to assist the integration of favorable genes (alleles) from multiple breeds through marker-assisted introgression or MAS within crosses or synthetic lines.

Use of QTL information based on between-breed LD:
• Between-breed selection
• Marker-assisted QTL introgression
• Within-breed selection
• MAS in synthetic lines / recent crosses

12.5.1.1 Marker-assisted introgression

The aim of an introgression program is to introduce one or more genes (target genes) from a breed that is superior for some genes or QTL but inferior for general performance (the donor breed) into a high performance line that lacks the target genes (the recipient breed). This is done through an initial F1 cross followed by multiple backcrosses to the recipient breed and one or more generations of intercrossing. The aim of the backcross generations is to maintain the target gene(s) while recovering the background genome of the recipient breed. The purpose of the intercrosses is to fix the line for the target gene(s).

Effectiveness of introgression schemes is limited by the ability to identify backcross or intercross individuals that carry the target gene(s) and by the ability to identify backcross individuals that have a high proportion of the recipient genome, in particular in the region(s) around the target gene(s). The latter affects the number of backcross generations required to recover the recipient genome.

Molecular genetics can enhance the effectiveness of both phases of an introgression program. Effectiveness of the backcrossing phase can be increased in two ways: i) by identifying carriers of the target gene(s) (foreground selection), and ii) by enhancing recovery of the recipient genetic background (background selection). Effectiveness of the intercrossing phase can also be enhanced through foreground selection on the target gene(s).
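The recovery of the recipient genome during the backcrossing phase described above can be illustrated with a short calculation. The sketch below assumes no selection on the background genome and ignores linkage drag around the target gene, so that the expected donor contribution is simply halved with every backcross to the recipient breed. The function names are hypothetical and chosen only for this illustration.

```python
def expected_recipient_fraction(n_backcrosses: int) -> float:
    """Expected genome-wide fraction of recipient (R) genome after an F1
    cross followed by `n_backcrosses` backcrosses to R.

    Assumption: no background selection and no linkage drag, so the donor
    fraction is halved in every generation (F1 = 1/2, BC1 = 1/4, ...).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction: float) -> int:
    """Smallest number of backcrosses whose expected R fraction reaches
    `target_fraction` (which must be below 1, since the expectation only
    approaches 1)."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    n = 0
    while expected_recipient_fraction(n) < target_fraction:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(5):
        label = "F1" if n == 0 else f"BC{n}"
        pct = 100 * expected_recipient_fraction(n)
        print(f"{label}: {pct:.2f}% recipient genome expected")
    print("Backcrosses for 99% expected recovery:", backcrosses_needed(0.99))
```

These are expectations only; individual backcross animals vary around them, and it is this variation that background selection on markers exploits.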
QTL Introgression Program: introgression of Q from donor to recipient line

Donor line (QQ) x Recipient line (qq) → F1 (Qq). Selected Qq individuals are backcrossed to the recipient line (R, qq) for generations BC1 to BCn (backcrossing: recover the R genome while maintaining the QTL). BCn x BCn and subsequent intercrosses (IC1, IC2, ...) fix the QTL, giving the improved line (QQ).

Cross        Progeny   Progeny genotypes   Freq. of Q   Genotype selected   % of R genome (average)   95% range
D x R        F1        Qq                  0.50         Qq                  50                        50 - 50
F1 x R       BC1       Qq/qq               0.50         Qq                  75                        66.6 - 83.4
BC1 x R      BC2       Qq/qq               0.50         Qq                  87.5                      80.7 - 94.3
BC2 x R      BC3       Qq/qq               0.50         Qq                  93.75                     88.8 - 98.8
BC3 x R      BC4       Qq/qq               0.50         Qq                  96.88                     93.5 - 100
. .
BCn x BCn    IC1       QQ/Qq/qq            0.50         QQ (+Qq?)           99.?
IC1 x IC1    IC2       QQ/Qq/qq            >0.50        QQ (+Qq?)           99.?
. .
ICk x ICk    ICk+1     QQ/Qq/qq            >0.50        QQ                  99.?

Use of Markers in Introgression
1) Identify Qq individuals in BC
   • single marker
   • flanking markers
2) Speed recovery of background genome during BC (and IC)
   ➣ select on phenotype (among Qq BC animals) = traditional way
   ➣ select on markers spread over genome (among Qq BC's)
3) Selection among juveniles (short generation interval)

Introgression of Multiple QTL
• Large #'s needed to find BC's that are heterozygous for ALL QTL
• Gene pyramiding: Q1 x Q2 and Q3 x Q4 → Q1+Q2 x Q3+Q4 → Q1+Q2+Q3+Q4

Effectiveness of foreground selection depends on the number of target genes and on the confidence interval for the position of those genes. The latter determines the size of the genomic region that must be introgressed.
Both factors have a large impact on the number of individuals that is required to find individuals that are carriers for all target genes during the backcrossing phase and homozygous during the intercrossing phase. For the introgression of multiple target genes, gene pyramiding strategies can be used during the backcrossing phase to reduce the number of individuals required (Hospital and Charcosset 1997, Koudandé et al. 2000).

The use of molecular markers in background selection involves estimating the proportion of the recipient genome on the basis of markers across the genome and selecting individuals with the highest proportion. To reduce linkage drag, greater emphasis can be given to markers around the target gene(s).

Hanset et al. (1995) reported on the successful introgression of the halothane normal allele into a Piétrain line that had a high frequency of the halothane positive allele. They used foreground selection on a marker that is closely linked to RYR. Yancovic et al. (1995) reported on marker-assisted introgression of the naked neck gene in chickens from a local breed into an improved broiler line. The naked neck gene is of benefit in warm climates because it reduces feather cover. Yancovic et al. (1995) used markers to speed up the recovery of the background genome of the improved broiler.

In general, however, the application of introgression programs to livestock appears limited for several reasons:
i) Apart from some major genes, QTL studies show that most economic traits are affected by a substantial number of genes with moderate effects. This makes the number of QTL to introgress more than can feasibly be handled within an introgression program.
ii) Most QTL may already be segregating within the recipient breed, such that within-breed selection may be more effective than introgression.
iii) Most QTL are not very precisely mapped, which increases the size of the genome region(s) that must be introgressed and the population size required.
iv) The economic benefit of the target gene may not be large enough to compensate for the extra costs and reduced genetic gain in other traits that are associated with an introgression program.
v) The introgressed gene(s) may have a different effect in the new genetic background, as has been observed in several plant introgression programs (Dekkers and Hospital 2002).

12.5.1.2 Marker-assisted synthetic line development

Lande and Thompson (1990) proposed a strategy for marker-assisted selection within a hybrid population created by crossing two inbred lines. The strategy capitalizes on population-wide linkage disequilibrium that initially exists in crosses between lines or breeds. Thus, marker-QTL associations identified in the F2 generation can be selected on for several generations, until the QTL are fixed or the disequilibrium disappears. Zhang and Smith (1992) evaluated the use of markers in such a situation with selection on BLUP EBV. They compared the following three selection strategies:
MAS: selection on an EBV derived from marker effects
BLUP: selection on BLUP EBV derived from phenotype
COMB: combined selection on an index of the EBV based on markers and phenotype.

Data for a cross between inbred lines were simulated on the basis of 100 QTL and 100 markers in a genome of 2000 cM. Marker effects were estimated in the F2 generation using a two-step procedure. In the first step, a separate F2 population from the same cross was used to identify markers with the largest effects. Then, to obtain unbiased estimates, the effects of those markers were re-estimated in the F2 population under selection. The latter estimates were used to obtain marker-based EBV throughout the selection process.

Selection in Inbred Line Cross, Zhang and Smith (1992)
Line 1 x Line 2 → F1 x F1 → F2 x F2 → F3 x F3
• 100 QTL: biallelic, a ~ Normal
• 100 markers: biallelic, ave. 20 cM interval
• QTL detection: 1000 F2; largest 20 QTL selected (67% var), re-estimated in other 1000 F2
• Selection: marker score alone; BLUP EBV; marker score + BLUP

Figure. Genetic progress based on selection on markers alone (MAS), phenotypic data alone (BLUP), or their combination (COMB) in a cross between inbred lines. Based on Zhang and Smith (1992). [Plot: genetic mean against generation (1 to 10) for MAS, BLUP and COMB at h2 = 0.25 and h2 = 0.50.]

Results illustrated in the figure above show that index selection (COMB) resulted in greatest response, followed by selection on BLUP EBV and selection on markers alone. Rates of response declined over generations for all strategies because data were simulated using a finite number of loci, which were moved to fixation by selection. Rates of response declined faster for MAS because recombination eroded the disequilibrium between the markers and QTL. Nevertheless, substantial rates of response were obtained using selection on markers alone.

The MAS strategy of Zhang and Smith (1992) has potential for selection for traits that are difficult or expensive to measure in livestock, such as meat quality traits, because it does not require continuous phenotypic evaluation, in contrast to the BLUP and COMB strategies. Although Gimelfarb and Lande (1994) showed that greater response could be obtained by re-estimating marker effects in subsequent generations, this would require the continuous recording of phenotypic data, the cost of which may not outweigh the benefits.

Zhang and Smith (1992) considered the ideal situation of a cross with inbred lines. Although the lines were not divergent for the trait of interest, they were homozygous at alternate alleles for all loci. Breeds used in a cross to enhance meat quality will typically have different means, which will increase the extent of linkage disequilibrium in the cross.
However, both breeds will likely segregate for most QTL, which will reduce the disequilibrium. Nevertheless, even in crosses between commercial breeds of swine, substantial numbers of QTL have been found for which the breeds have sufficient differences in frequency to allow their detection (Malek et al. 2001a,b, Grindflek et al. 2001). In addition, favorable effects have been found to originate from the breed with the lower mean for a number of QTL (Malek et al. 2001b). A greater problem with the use of crosses between outbred instead of inbred lines is the limited ability to follow QTL past the F2 generation. In contrast to inbred lines, markers are not fully informative in crosses between outbred lines. Therefore, the ability to track breed origin of markers or marker haplotypes will decrease over generations, unless a substantial number of markers are genotyped within the QTL regions. An important advantage of selection in a breed cross population is that it can capitalize on QTL identified in breed-cross studies. This could remove the first step in the estimation process used by Zhang and Smith (1992), i.e. that of identification of markers with large effects. Although this does entail the risk that different QTL may segregate in the population under selection, in particular if QTL studies were based on different breeds, there would be a substantial cost saving. It is crucial, however, that the second step of the estimation process be conducted in the population under selection, in order to obtain unbiased estimates of QTL effects that are relevant to the population under selection. For meat quality traits, this requires slaughter of a substantial number of F2 individuals to obtain phenotypic data. Thus, the size of the F2 population must be sufficient to support both marker effect estimation and selection. An alternative approach to QTL detection and estimation was suggested and evaluated by Whittaker et al. (1997). 
They used a cross-validation approach that allowed the same F2 population to be used for both selection of markers and estimation of marker effects, while maximizing power. This would remove the need for prior QTL information, although such information could still be useful for reducing the genotyping load by focusing only on the most promising genomic regions.

Genetic improvement within a synthetic line should focus on all traits of economic importance. Thus, selection would be on an index of a marker-based EBV for difficult-to-measure traits and a BLUP EBV for performance traits. If available, marker-based EBV could also be included for performance traits. Instead of deriving the emphasis that is placed on difficult-to-measure traits versus performance traits on the basis of economic values, additional emphasis should be given to the former traits in the initial generations, before the disequilibrium between markers and QTL erodes.

Instead of an F2 population, a backcross population could be used as the starting point for MAS in a synthetic line. This could be beneficial if the breed difference for performance is large and favorable effects for QTL originate from both breeds at alternate loci. Then, a backcross to the high performance breed would reduce the genetic lag for performance traits. The frequency of favorable QTL alleles from the other breed would, however, only be ¼. Thus, considerable emphasis would need to be placed on these QTL during the initial generations of selection. Use of a backcross for selection does not negate the use of an F2 cross or prior data on such a cross for marker selection or QTL identification.

12.5.2 Within-breed selection

Most selection programs focus on genetic improvement within a breed or line and, in many cases, the subsequent use of that line within a crossbreeding strategy.
Within-breed selection requires information that captures differences between individuals within a breed, rather than the between-breed differences that were discussed in the previous section. The purpose of this section is to describe opportunities for using molecular data in genetic improvement in within-breed selection programs.

The benefits of MAS for within-breed improvement have been evaluated in several computer simulation studies. The majority of these have used stochastic simulation models. The extra responses from MAS that have been observed in those studies depend highly on the specifics simulated (Spelman 1998 WCGALP), including
1. breeding program design
2. trait population parameters (heritability, etc.)
3. genetic model assumed for marked QTL and polygenes
4. marker specifics (GAS vs. LD-MAS vs. LE-MAS, number of markers, recombination rates with QTL, informativeness)
5. method for genetic evaluation
6. amount of phenotypic and genotypic data available
7. others
Thus, care must be taken when evaluating results from these studies.

When considering within-breed improvement using molecular data, it is important to distinguish between markers that are in population-wide linkage disequilibrium (LD) with a QTL (LD-MAS) and markers that are in population-wide equilibrium. The latter require the use of the LD within families (LE-MAS). The use of population-wide versus within-family LD has important consequences for the use of markers in selection and for the phenotypic data that are required to support their use. Smith and Smith (1993) advocated the use of markers that are in population-wide disequilibrium with QTL because marker effects are easier to estimate and require smaller amounts of phenotypic data. This is important in particular for meat quality traits.
Marker requirements are, however, greater for LD-MAS because the markers must be tightly linked to the QTL, whereas sufficient within-family LD will exist even for markers that are more distant from the QTL (within 10 cM). The use of LD-MAS vs. LE-MAS will be discussed further in what follows.

Three types of observable molecular genetic loci (compared for ease of detection and ease of use)
• Functional mutations (Q, q), i.e. known genes with a functional mutation: GAS
• Markers in pop.-wide LD with the functional mutation (MQ, mq): LD-MAS
• Markers in pop.-wide LE with the functional mutation (M Q, m q): LE-MAS

Pathways by which MAS can Increase Response to Selection
➣ Increase accuracy of selection
➣ Decrease generation interval
   ➣ Marker information is available at an early age
➣ Increase selection intensity
   ➣ Selection at an early age among more candidates

12.5.2.1 Selection on markers that are in population-wide LD – LD-MAS

Markers that are in population-wide LD with a QTL include markers identified using candidate gene and related approaches. The ideal case is a marker that is known to represent the functional polymorphism (GAS), e.g. the RYR and RN genes, but this is not required for the effective use of population-wide LD. Although markers that are not within the functional gene are not expected to be in extensive LD with a QTL within a closed population, markers that are tightly linked to a QTL have a substantial probability of being in partial population-wide LD with that QTL because of the effects of drift, selection, mutation, and population admixture (Sved 1971, Goddard 1991, Meuwissen and Goddard 2000).
This probability is higher in selected populations of small effective size, which is the case for livestock, as demonstrated by Farnir et al. (2000) for dairy cattle. Markers that are tightly linked to QTL can be found through fine mapping or candidate gene approaches. The extent of LD can often be enhanced through the use of haplotypes of tightly linked markers.

High-density marker maps with, e.g., a marker every 1 or 2 cM, will also include markers that are in tight linkage with the QTL and that have the potential to be in substantial population-wide LD, as was recently demonstrated by Meuwissen et al. (2001) through simulation. They showed that for populations with an effective population size of 100 and a 1 or 2 cM spacing between markers across the genome, sufficient disequilibrium was present that genetic values could be predicted with substantial accuracy for several generations on the basis of associations of marker haplotypes with phenotype on as few as 500 individuals. Although genotyping costs would be too high when applied to the entire genome, opportunities might exist to utilize this approach on a limited scale by saturating previously identified QTL regions with markers.

For markers that are in population-wide LD with the QTL, selection can be directly on marker genotype or on marker haplotype if multiple linked markers are used to track the QTL. It is, however, essential to estimate the effects of the markers within the population under selection to capture the degree of LD and linkage phases that are present in the population and to guard against potential interactions of the QTL with the background genome. For the same reason, it will also be prudent to re-estimate the effects on a regular basis. Estimation requires marker genotypes and meat quality phenotypes on a random sample of individuals in the population and should be based on an animal model with marker genotypes or haplotypes included as fixed effects (e.g. Short et al. 1997, Israel and Weller 1998).

Use of population-wide LD with a high-density marker map, Meuwissen et al. (2001)
Mixed model for prediction of marker effects: y = Σ marker haplotype + residual, with marker haplotype effects random (Bayesian model); Ne = 100.

Estimates from 2200 individuals:
Marker distance   EBV accuracy
1 cM              0.85
2 cM              0.81
4 cM              0.75

[Plots: accuracy against number of individuals (500, 1000, 2200) and accuracy against generation (1 to 8).]

12.5.2.2 Selection using within-family LD

Use of within-family LD between a QTL and a linked marker based on LE-MAS requires marker effects or, at a minimum, marker-QTL linkage phases to be determined separately for each family, which requires marker genotypes and phenotypes on family members. If linkage between the marker and QTL is loose, phenotypic records must be from close relatives of the selection candidate because associations will erode through recombination. With progeny data, marker-QTL effects or linkage phases can be determined based on simple statistical tests that contrast the mean phenotype of progeny that inherited alternate marker alleles from the common parent. Alternatively, marker-assisted animal models have been developed to incorporate marker data in genetic evaluation for complex pedigrees (Fernando and Grossman 1989, Goddard 1992). These models result in BLUP EBV of QTL effects along with polygenic EBV.
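The simple within-family test mentioned above, contrasting the mean phenotype of progeny that inherited alternate marker alleles from a common parent, can be sketched as follows. This is only an illustration under stated assumptions: the sire is heterozygous (Mm), the allele each progeny inherited from the sire is known without error, and a Welch-type t statistic is used as the test (the source does not specify which test). The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def marker_contrast(progeny):
    """Within-sire marker contrast.

    `progeny` is a list of (paternal_allele, phenotype) pairs for the
    offspring of ONE heterozygous sire, with paternal_allele "M" or "m".
    Returns (difference in mean phenotype M - m, Welch t statistic).
    """
    y_M = [y for allele, y in progeny if allele == "M"]
    y_m = [y for allele, y in progeny if allele == "m"]
    if len(y_M) < 2 or len(y_m) < 2:
        raise ValueError("need at least two progeny per marker allele")
    diff = mean(y_M) - mean(y_m)
    # Standard error of the difference, allowing unequal variances
    se = sqrt(variance(y_M) / len(y_M) + variance(y_m) / len(y_m))
    return diff, diff / se


if __name__ == "__main__":
    # Hypothetical records: linkage phase differs between the two sires,
    # so the sign of the contrast differs as well.
    sire_1 = [("M", 10.0), ("M", 12.0), ("m", 7.0), ("m", 9.0)]
    sire_2 = [("M", 6.0), ("M", 8.0), ("m", 10.0), ("m", 11.0)]
    for name, records in (("sire 1", sire_1), ("sire 2", sire_2)):
        diff, t = marker_contrast(records)
        print(f"{name}: contrast = {diff:+.2f}, t = {t:+.2f}")
```

Because the contrast is computed separately for each sire, a marker allele that is favorable in one family can be unfavorable in another, which is why LE-MAS needs phenotypes and genotypes within every family.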
Use of within-family LD: linkage phase not consistent between sires
Sire 1: M Q / m q    Sire 2: M q / m Q    Sire 3: M Q / m Q    Sire 4: M q / m q
QTL effect must be estimated for each individual/family
➣ Based on family information
   ➣ marker genotypes
   ➣ phenotypes

Use of within-family LD: Marker-assisted BLUP (Fernando and Grossman, 1989)
$$y_i = \mu + v_i^p + v_i^m + u_i + e_i$$
where $v_i^p$ and $v_i^m$ are the effects of the paternal and maternal QTL alleles of individual $i$ and $u_i$ is the polygenic effect, with
$$\mathrm{Var}(u) = A\sigma_u^2, \qquad \mathrm{Var}(v) = G\sigma_v^2,$$
where $G$ is the gametic relationship matrix for QTL effects, computed from marker genotypes and the marker-QTL recombination rate.
➣ EBV for QTL alleles: $\hat{v}_i^p$, $\hat{v}_i^m$
➣ EBV for polygenic effects: $\hat{u}_i$
➣ Total EBV $= \hat{v}_i^p + \hat{v}_i^m + \hat{u}_i$

Meuwissen and Goddard (1996) evaluated the benefit of LE-MAS for different types of traits. Marker-assisted EBV were estimated using the marker-assisted genetic evaluation model of Goddard (1992) by including the QTL as a random effect. Selection was on the sum of EBV for the QTL and polygenes, similar to the COMB strategy of Zhang and Smith (1992). Comparisons were to genetic gain from conventional selection on BLUP EBV without availability of genetic markers. Results (see figures) showed that the benefit of MAS is greatest for traits for which phenotypic (BLUP) selection is not effective. This includes traits for which phenotypes cannot be observed prior to selection, traits that can only be observed on one sex, and traits that require sacrificing the animal to obtain phenotypic data. A prime example of the latter is meat quality traits. Benefits of MAS were greater for lower heritability traits and increased with the proportion of genetic variance explained by the QTL (molecular score). Benefits decreased over generations, as QTL alleles were fixed and polygenic response was lost.

[Figures: Possible gains using MAS (Meuwissen & Goddard '96). Extra response from MAS (%) against generation (1 to 5). Panels: type of trait (meat quality trait, sex-limited trait, phenotyping after selection, phenotyping before selection; QTL with 1/3 of genetic variance marked by haplotype, h2 = .27); effect of heritability (h2 = .11 and h2 = .27, phenotyping before or after selection; QTL with 1/3 of genetic variance); effect of size of QTL effect (QTL variance 46.7, 26.7, 13.3 and 6.7%; phenotyping after selection).]

Figure 7. Potential extra gains from MAS for meat quality traits based on within-family linkage disequilibrium, for Strategy 1 and Strategy 2. Based on Meuwissen and Goddard (1996). A QTL with multiple alleles explains 1/3 of the genetic variance for a trait with 0.27 heritability. Marker haplotypes are informative such that transmission of QTL alleles can be followed from parent to offspring for 90% of offspring (1). Marker and phenotypic data were available for five generations prior to the initiation of MAS.
1) This will require a set of highly polymorphic markers around the QTL.

For meat quality traits, Meuwissen and Goddard (1996) considered two strategies:
I) Two of the four members of each full-sib family, chosen at random, are slaughtered to record meat quality data. The remaining individuals are selected on the basis of a marker-assisted EBV for meat quality, once data on their sibs are recorded.
II) Animals are selected on the basis of a marker-assisted EBV and non-selected animals are slaughtered to provide data for the next generation of selection.
For both, all individuals were genotyped for markers around a previously identified QTL.
Results illustrated in Figure 7 show that strategy I) gave 24% greater response than conventional selection. The benefit of strategy II) was substantially greater but declined over generations as favorable alleles at the QTL were fixed. The greater response from strategy II) compared to I) was in large part the result of the greater selection intensity that was achieved with strategy II) because half of the selection candidates were not slaughtered prior to selection. However, it is questionable whether this increased selection intensity can be realized in practice due to inbreeding considerations. Thus, the extra response of 24% appears more realistic.

Implementation of selection on within-family LD requires extensive phenotyping and genotyping. In addition, data should be available for several generations prior to initiating MAS to accurately estimate QTL effects. For example, Meuwissen and Goddard (1996) assumed phenotypic and genotypic data for five generations prior to initiation of MAS, and responses dropped substantially without the buildup of such data. Although the same genotypic data can also be applied to performance traits, the benefit of MAS for these traits will be less than for meat quality traits (Meuwissen and Goddard 1996), in particular if markers are in QTL regions for meat quality rather than for performance traits. Nevertheless, correlated effects on other traits should be carefully considered and monitored when applying MAS.

Another obstacle for the use of within-family LD is that it requires knowledge of QTL regions that segregate within the population. Since most QTL mapping studies in pigs are based on the breed cross model, information about within-breed segregation of QTL is limited. Thus, within-breed QTL mapping studies must be conducted prior to implementation of MAS.
Although such studies could concentrate on QTL regions previously identified in breed cross studies, substantial population sizes will be required to detect or confirm their segregation within a breed. Related issues were discussed by Spelman and Bovenhuis (1998) in the context of implementing QTL knowledge in dairy cattle breeding programs.

Outbred Populations: Designs for Genome Scan QTL mapping
➣ Daughter/HS design: sire (Mm) x dams (??); the marker allele inherited by each daughter (M? or m?) is associated with her phenotype.
➣ Granddaughter design: grandsire (Mm) x dams (??); sons are genotyped (M? or m?) and the marker allele is associated with the EBV of sons, which is based on phenotypes of their grand progeny.

Two approaches for utilizing marker information for preselection, including determining heterozygosity of the sire for a previously identified QTL (Kashi et al. 1990: Anim. Prod. 124:743)
➣ Bottom-up approach, based on the daughter design (MacKinnon and Georges, 1998: Livest. Prod. Sci. 54:229): estimate the marker contrast of a sire (Mm) based on daughter phenotypes; where the contrast indicates that, e.g., M is favorable, select young bulls (M?) for progeny testing.
➣ Top-down approach, based on the granddaughter design: estimate the marker contrast of a grandsire (Mm) based on EBV of sons, which are based on granddaughter phenotypes; where the contrast indicates that, e.g., M is favorable, select young bulls (M?) for progeny testing.

Key components for implementation of MAS: business objectives; R&D; farms; DNA collection; phenotyping; pedigree; genotyping; genotypic database; phenotypic database; quality control; decision support; analytical tools; MAS.

12.5.2.3 GAS vs. LD-MAS vs. LE-MAS

An important question for the implementation of MAS for within-breed improvement is whether to use LD-MAS or LE-MAS or GAS. The next figures give a contrast between these strategies in terms of marker, phenotyping, and implementation requirements.
LE-MAS vs LD-MAS vs GAS: implementation requirements
• QTL detection: LE < LD < GAS
• Within-line confirmation: LE >> LD > GAS
• Genetic evaluation: LE >> LD > GAS
• Selection implementation: LE >> LD > GAS
   - Phenotyping: relatives (LE) vs. sample (LD/GAS)
   - Genotyping: candidate + relatives (LE) vs. candidate
   - Analysis: MA-BLUP (LE) vs. fixed effect (+ prob)

LE-MAS vs LD-MAS vs GAS: other considerations
• Genetic gain: LE < LD < GAS
   - LE: lower accuracy, risk → within-family selection → selection room req's
   - LD/GAS ↔ effects at field level
• Pick up new mutations: LE > LD > GAS
• Marketability: LE << LD < GAS
• Protection (patents)
• Product differentiation

Pong-Wong et al. (2002) compared GAS to LE-MAS on a QTL bracketed by two markers at different distances. Selection was on the standard index of BLUP EBV for the QTL and polygenes, but with optimized contributions to restrict the rate of inbreeding to 5% per generation.

[Figures: LE- vs LD-MAS, Pong-Wong et al. '02. QTL in a marker bracket of 2d cM; a second panel shows a QTL in a marker bracket of 20 cM with prior estimates of QTL effects (accuracy r). ∆F = 5%, p0 = .15, 60 males, 60 females.]

GAS resulted in substantial extra gains over BLUP selection without marker information in early generations (see figures), but eventually gave lower cumulative response, similar to what was demonstrated earlier. Extra responses to LE-MAS were delayed and smaller, relative to GAS. Response to LE-MAS increased when marker-QTL distance became smaller.

To investigate whether reduced accuracy of estimated QTL effects with LE-MAS was the main reason for the lower response to LE-MAS vs. GAS, Pong-Wong et al. (2002) also evaluated the effect of including prior information on individual QTL effect estimates with LE-MAS. This could represent information from a prior QTL scan using the same sires. They showed that response to LE-MAS was nearly equivalent to that of GAS when highly accurate prior information was included.
This indicates that the main limitation of LE-MAS relative to LD-MAS and GAS is the limited amount of information that is available to estimate QTL effects, which for LE-MAS is restricted to information from relatives. It should be noted, however, that even distant relatives can contribute substantial information for estimating QTL effects with LE-MAS if markers are tightly linked to the QTL.

With complete LD and large amounts of data, LD-MAS will be equivalent to GAS. In practice, however, LD-MAS will be limited by the extent of LD and by the accuracy of estimates of marker haplotype effects: whereas a QTL may typically have only 2 alleles, a haplotype of n SNP markers has up to $2^n$ possible haplotypes whose effects must be estimated. Therefore, less data will be available from a given sample to estimate individual haplotype effects for LD-MAS than to estimate allele effects for GAS. Thus, including more markers in the haplotype for LD-MAS will reduce accuracy. On the other hand, LD of markers with the QTL is expected to increase when more markers are included in the haplotypes. Goddard and Hayes (2002) investigated the impact of these two factors (see figure below). Results showed that a haplotype of 11 markers explained over 98% of the variance at a bracketed QTL (results for infinite population size), compared to 80% for 4 markers. Accuracies were, however, similar for 4 and 11 markers (~45%) when only 100 individuals were evaluated, reflecting the greater impact of the number of individuals evaluated when a larger number of haplotype effects must be estimated.

[Figure: GAS- vs LD-MAS (Goddard & Hayes '02). QTL in a 10 cM interval with 4 or 11 markers; population-wide LD based on Ne = 100. Y-axis: accuracy (0.0 to 1.0); x-axis: number of animals evaluated (100, 1000, 2000, infinite); series: 4 markers, 11 markers.]

12.6 Using Molecular Information at Unused or Under-used Selection Stages

The previous section considered selection stages where both phenotypic and molecular information was available.
In those cases, any selection pressure that is applied to molecular information is taken away from phenotypic information. There are, however, many cases where not all available selection space is utilized in conventional selection, which provides selection room for MAS (Soller and Medjugorac 1999). These include cases where selection decisions must be made at stages of the animal's life cycle when little or no phenotypic information is available to guide those decisions. A prime example is pre-selection on the basis of markers among members of a full-sib family for further testing, prior to availability of individual or progeny records. There is often a need to limit the number of full sibs tested, in order to limit inbreeding and increase the availability of diverse blood lines. In such situations conventional selection has no basis for selection because EBV are derived from pedigree information, which is the same for all members of a full-sib family. Family members can, however, differ in the markers they inherited, which provides a basis for selection, instead of having to make a random choice. Such strategies were evaluated by Kashi et al. (1990) for dairy cattle.

[Slide: Incorporating MAS in the existing program, within existing selection steps
➣ Increases response through accuracy of selection
➣ Requires balance between QTL and polygenic selection
➣ Long versus short-term response to selection?]

[Slide: Incorporating MAS in currently under-utilized selection stages
➣ Capitalizing on excess reproductive capacity (excess meioses) = "selection space" for MAS (Soller & Medjugorac, 1999)
➣ Pre-selection among full-sibs prior to performance testing with limited test capacity
➣ E.g. pre-selection of young dairy bulls for progeny testing: currently 1 (e.g.) full-brother is chosen (at random) per MOET flush to restrict inbreeding
➣ MAS implementation: select the full-sib with the highest marker score
➣ Impact on response through increased intensity]

[Figure: MAS in pre-selection of young bulls for progeny testing. A heterozygous (Aa) parent is mated and embryo transfer (ET) produces full-sibs that inherited A or a; a full-sib that inherited A enters the progeny test.]

12.6.1 Models for evaluation of pre-selection

The effect of pre-selection on genetic gain can be evaluated using selection index approaches or by using a deterministic model based on mixture distributions. In the following, a model based on mixture distributions that accounts for the reduction in variances due to selection in the first stage is described.

Stage 1: select across two normal distributions of total EBV (QTL or marker + polygenic EBV) of young bulls with means $\pm\tfrac{1}{2}\alpha$, standard deviations $r_1\sigma_g$, where $r_1$ is the accuracy of the parental average polygenic EBV, and frequencies ½ and ½ for sons that received allele B or allele b from their heterozygous sire (see figure). Here, B and b can represent either alleles of a QTL, in which case $\alpha$ is the QTL allele substitution effect, or alleles at a linked marker, in which case $\alpha$ is the marker-associated QTL effect, equal to $(1-2r)$ times the QTL substitution effect, where $r$ is the recombination rate between marker and QTL.

[Figure: a heterozygous sire (Bb) produces b progeny and B progeny, whose total EBV distributions have means $-\tfrac{1}{2}\alpha$ and $+\tfrac{1}{2}\alpha$.]

The unique truncation point for selection of a proportion $Q_1$ in stage 1 is determined by the bisection method of Chapter 3, resulting in proportions selected of $f_B$ and $f_b$ from sons that received allele B and b, respectively.

Input parameters for stage 2: Frequencies of progeny-tested bulls that carry allele B vs. b are:

$w_B = f_B / Q_1$ and $w_b = f_b / Q_1$

Polygenic means of these two groups are:

$u_B = i_B r_1 \sigma_g$ and $u_b = i_b r_1 \sigma_g$

where $i_B$ and $i_b$ are the selection intensities associated with $f_B$ and $f_b$. Since EBV are unbiased, these are also the mean polygenic EBV of bulls following their progeny test.
Thus the means of the standard QTL index following the progeny test are:

$g_B = \tfrac{1}{2}\alpha + i_B r_1 \sigma_g$ and $g_b = -\tfrac{1}{2}\alpha + i_b r_1 \sigma_g$

The standard deviations of the parental average EBV of the bulls selected in stage 1 are:

$\sigma'_{1,B} = \sqrt{1 - k_B}\; r_1 \sigma_g$ and $\sigma'_{1,b} = \sqrt{1 - k_b}\; r_1 \sigma_g$

where $k_B = i_B (i_B - x_B)$ and $k_b = i_b (i_b - x_b)$, with $x_B$ and $x_b$ the standardized truncation points for the two groups.

Then, recognizing that a progeny-test polygenic EBV can be written as the sum of the parental average EBV and an independent normally distributed variable with mean zero and standard deviation $\sqrt{r_2^2 - r_1^2}\,\sigma_g$, where $r_2$ is the accuracy of the progeny-test EBV, the standard deviations of polygenic EBV following the progeny test are:

$\sigma'_{2,B} = \sqrt{(1 - k_B)\, r_1^2 + (r_2^2 - r_1^2)}\;\sigma_g$ and $\sigma'_{2,b} = \sqrt{(1 - k_b)\, r_1^2 + (r_2^2 - r_1^2)}\;\sigma_g$

Stage 2 selection: If in stage 2 the top 10% of bulls are selected, the genetic mean of selected bulls can be obtained by finding the unique truncation point across the two distributions with means and standard deviations as determined above.

Selection without QTL information: Selection without use of QTL information is also modeled as selection across two distributions, but now with total EBV means equal to those for standard index QTL selection multiplied by the square of the accuracy for the respective stage ($r_1^2$ and $r_2^2$).
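The two-stage model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in these notes: the function names (`truncation_point`, `group_stats`, `two_stage`) are hypothetical; $f_B$ is taken as ½ times the fraction selected within the B group, so that $w_B + w_b = 1$; selection intensities are computed from the within-group selected fractions; the distributions after stage 1 are approximated as normal with reduced variance; and the independent progeny-test component is given variance $(r_2^2 - r_1^2)\sigma_g^2$. The sketch requires $r_1 > 0$.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_sf(z):
    """Standard normal upper-tail probability P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def truncation_point(means, sds, weights, q, lo=-50.0, hi=50.0, iters=200):
    """Bisection for the common truncation point t such that the weighted
    fraction above t, summed over the normal components, equals q."""
    def selected(t):
        return sum(w * norm_sf((t - m) / s)
                   for m, s, w in zip(means, sds, weights))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if selected(mid) > q:   # too many selected: raise the threshold
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def group_stats(t, mean, sd):
    """Within-group selected fraction p, intensity i and variance
    reduction factor k = i(i - x) for truncation at t."""
    x = (t - mean) / sd
    p = norm_sf(x)
    i = norm_pdf(x) / p
    return p, i, i * (i - x)

def two_stage(alpha, r1, r2, q1, q2, sigma_g=1.0):
    """Pre-selection (stage 1) on QTL + parental-average EBV, followed by
    selection after the progeny test (stage 2). Assumes r1 > 0."""
    sd1 = r1 * sigma_g
    means1 = (0.5 * alpha, -0.5 * alpha)          # B group, b group
    t1 = truncation_point(means1, (sd1, sd1), (0.5, 0.5), q1)
    w, g, sd2 = [], [], []
    for m in means1:
        p, i, k = group_stats(t1, m, sd1)
        w.append(0.5 * p / q1)                    # w_B, w_b
        g.append(m + i * sd1)                     # g_B, g_b
        sd2.append(math.sqrt((1.0 - k) * sd1 ** 2
                             + (r2 ** 2 - r1 ** 2) * sigma_g ** 2))
    t2 = truncation_point(g, sd2, w, q2)
    mean_sel = 0.0
    for wj, gj, sj in zip(w, g, sd2):
        p, i, _ = group_stats(t2, gj, sj)
        mean_sel += wj * p * (gj + i * sj)        # contribution of each group
    return {"w_B": w[0], "w_b": w[1], "g_B": g[0], "g_b": g[1],
            "mean_selected": mean_sel / q2}

if __name__ == "__main__":
    # Hypothetical inputs: alpha = 0.3 sigma_g, r1 = 0.4, r2 = 0.85,
    # 50% pre-selection, top 10% selected after the progeny test.
    print(two_stage(alpha=0.3, r1=0.4, r2=0.85, q1=0.5, q2=0.10))
```

Calling `two_stage` with `alpha=0.0` gives a baseline in which the two groups are identical, which is a convenient check that any extra gain comes from the QTL term.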
12.6.2 Effect of pre-selection

[Figures: MAS (QTL) pre-selection with QTL variance = 20%. Without excess reproductive capacity: more parents are selected to allow QTL selection among progeny; response (0.0 to 1.4) from parental selection, from QTL selection, and in total (with net response and costs in one panel) is plotted against % dam selection (25, 50, 75, 100) and % pre-selection (100, 50, 33, 25). With excess reproductive capacity: limited or no impact on polygenic response; response from parental selection, from QTL selection, and in total is plotted against % pre-selection (100, 50, 33, 25).]

[Slide: Effect of top-down or bottom-up approach on average genetic value of progeny-tested young bulls. Spelman and Garrick, 1998: Genetic and Economic Responses for Within-Family Marker-Assisted Selection in Dairy Cattle Breeding Schemes. J. Dairy Sci. 81: 2942-2950.]

Given the uncertainties about the sustainability of marker effects, it appears prudent to use molecular genetic information in a manner that does not prevent progress toward the overall breeding goal that can be achieved through conventional selection.

12.6.3 Integrating molecular and reproductive technologies

Selection space for MAS can be increased with technologies that enhance the reproductive rate of, in particular, the female. In addition to increasing selection space within a generation by increasing full-sib family size, space for MAS can also be created across generations by introducing several rapid generations of selection based on markers alone. Such programs were proposed by Georges and Massey (1991) for dairy cattle and subsequently by Visscher et al. (2000) for pigs.
In such programs of 'velogenetics', the short generations for marker-assisted selection are facilitated by the use of reproductive technologies such as the recovery of oocytes from the unborn foetus, in-vitro maturation of oocytes, and in-vitro fertilization. These technologies are then combined with the selection of embryos for implantation based exclusively on the inheritance of markers that were previously estimated to have favorable effects. Enhancements to further reduce the generation interval in these programs were suggested by Haley and Visscher (1998) and Visscher et al. (2000). Although further advances in reproductive technologies are required for velogenetic programs to become feasible, they offer potential to improve meat quality through marker-assisted introgression, synthetic line development, and within-breed selection based on population-wide LD.

[Slide: Use of MAS for Within-Breed Genetic Improvement
➣ Incorporating MAS in the existing selection program
  - within existing selection steps
  - in currently under-utilized selection stages: capitalizing on 'excess' reproductive capacity
➣ Creating 'selection space' for MAS (Soller & Medjugorac '99)
  - Increase family size at pre-selection stages through reproductive technology
  - Substitute MAS at early age with selection at later age based on phenotype or progeny test
  - Ideally at stages where polygenic selection is limited, to minimize impact of MAS on polygenic response
➣ Redesign of the selection program to more effectively capitalize on MAS
  - Move toward juvenile programs
  - Require designs that maximize amount of information for marker-QTL estimation
➣ Integrating molecular and reproductive technology]

[Figure: Integrating reproductive and molecular technology. Cycle: harvest oocytes (in utero), mature oocytes, fertilize, MAS, implant in recipient.]

[Figure: MAS + reproductive technology: velogenetics (Georges and Massey '91). Over generations 1 to 8, several successive generations of genotyping and MAS alternate with a generation of phenotyping and phenotypic selection.]

12.7 Economics of MAS

12.7.1 Economic value at production level

Hayes and Goddard (2003) evaluated the economics of LE-MAS in an integrated swine operation with a 100-sow nucleus, a 1000-sow multiplier, and a 10,000-sow commercial tier. Implementation of LE-MAS included a genome scan using a half-sib design within a commercial line, followed by inclusion of data from markers that bracket significant QTL in genetic evaluation using marker-assisted BLUP (Fernando and Grossman, 1989). Selection was for a multi-trait breeding goal that included four independent trait categories:

• Growth index (GI)
• Meat quality index (MQI)
• Pigs born alive (PBA)
• Net feed intake (NFI)

Genetic architecture of the traits was simulated by a finite locus model with 102 QTL with effects drawn from a multivariate gamma distribution following a mutation model. Genetic and economic parameters were as follows:

Trait   Genetic variance   Heritability   Economic value ($/genetic s.d.)
GI      0.7                0.32            2.1
MQI     1.5                0.29            1
PBA     1.2                0.11            2.8
NFI     1.4                0.16           -4.2

An important decision for the application of MAS is which QTL or markers should be used in selection. QTL mapping studies typically apply very stringent thresholds based on genome-wide testing to reduce the rate of false positives, as suggested by Lander and Kruglyak (1995). This, however, increases the rate of false negatives and removes opportunities to select on those QTL.
In this study, three different significance thresholds were used in the genome scan to determine which QTL to use for LE-MAS:

• 5% comparison-wise
• 5% chromosome-wise
• 5% genome-wise

Two sample sizes for the genome scan were evaluated:

• small: 5 sires with 50 progeny each
• large: 5 sires with 200 progeny each

The number of QTL detected and the % of genetic variance explained by the detected QTL are given in the next figure:

[Figure: Profitability of LE-MAS in pigs (Hayes and Goddard, 2003). Left panel: number of QTL detected and QTL variance (%) for genome-wise, chromosome-wise, and point-wise thresholds in the small scan (5 sires x 50) and the large scan (5 sires x 200). Right panel: costs, returns, and profit ($ x 10^6) for the same thresholds and scans.]

The number of QTL detected and the variance explained by the detected QTL increased with decreasing stringency of the threshold used and were greater for the large scan than for the small scan, both because of greater power. However, for the small scan, the number of QTL increased proportionally faster with decreasing stringency of the threshold than the variance explained by the QTL. This is because a non-stringent threshold results in many false positives, in particular for a scan with lower power. The number of QTL detected was greatest for GI and smallest for PBA (results not shown), consistent with the lower heritability of PBA. Relative increases in response from implementing LE-MAS were greatest for MQI, followed by PBA, NFI, and GI (results not shown), consistent with the effectiveness of regular selection for these trait categories.

Extra returns from the increased genetic merit in the nucleus from LE-MAS were computed using discounted gene flow by following the expression of a single round of genetic improvement through the multiplier and commercial levels over 100 6-month time periods. A discount rate of 5% per 6-month period was used.
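The discounting step of such a gene-flow calculation can be illustrated with a short sketch. The function name `discounted_return` and the flat expression profile are hypothetical and serve only to show the arithmetic; the actual expression of improvement per period in the study depends on the gene-flow structure, which is not reproduced here. The sketch also assumes that the first period is already discounted once.

```python
def discounted_return(expression, rate=0.05):
    """Sum of per-period values discounted at `rate` per period.

    expression[t-1] is the (hypothetical) value of improvement expressed
    in period t; period t is discounted by 1 / (1 + rate) ** t.
    """
    return sum(v / (1.0 + rate) ** t
               for t, v in enumerate(expression, start=1))

# Hypothetical profile: one unit of value expressed in each of the
# 100 six-month periods, discounted at 5% per period.
flat_profile = [1.0] * 100
total = discounted_return(flat_profile)
print(round(total, 3))
```

With a 5% rate per 6-month period, expressions far in the future contribute very little, so the horizon of 100 periods captures nearly all of the discounted value of a flat profile.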
Extra costs were assumed to consist only of genotyping costs.

Results (see above figure) showed that extra costs were proportional to the number of QTL detected. Extra returns depended primarily on the % of variance explained by the detected QTL and the number of false positives. The latter represent wasted selection space. Extra returns were similar for all strategies, except for the small scan with the comparison-wise threshold. This case had a large number of false positives, which reduced the effectiveness of selection. Several other studies have, however, shown that greater genetic gains from MAS can be obtained by allowing a higher rate of false positives, in order to reduce the number of false negatives (Moreau et al. 1998, Spelman and Garrick 1998).

Subtracting costs from returns, profit was positive for all strategies except when comparison-wise thresholds were used, for which costs were high because of the large number of QTL detected. Highest profit was obtained for the scans with the most stringent thresholds because of the limited genotyping costs.

Using a similar procedure, Hayes and Goddard (2003, unpublished) evaluated the break-even genotyping cost for GAS, to determine the resources that could be allocated to finding causative mutations, starting from a genome scan. To evaluate this, they followed up on the QTL with the largest effect for each of the four traits from the large genome scan. It was then assumed that the causative mutation for these QTL was known, and the extra genetic and economic response from standard index GAS was evaluated. Results (see table below) show that, although extra relative responses would be lowest for GI because of the already effective selection for that trait, extra returns were similar for all four traits, because of the high economic value of GI.
Break-even costs of genotyping depend not only on extra returns but also on the number of generations until fixation of the gene, after which no genotyping is required anymore. As a result, break-even genotyping costs were lowest for PBA, because its gene would be fixed more rapidly than the other genes as a result of the limited accuracy of polygenic EBV for this trait.

Trait   % genetic variance by largest QTL   Extra returns (million $)   Break-even genotyping costs ($/pig)
GI      26                                  0.838                       104.36
MQI     25                                  0.779                        97.09
PBA     15                                  0.736                        78.13
NFI     24                                  0.753                        80.04

12.7.2 Other economic objectives

Hayes and Goddard (2003) evaluated economic benefits from MAS on the basis of extra returns at the production level. Often, however, the main driving force behind selection decisions in commercial breeding companies is increasing market share and sale of breeding stock, as discussed in Chapter 8. Here, we will evaluate the effect of MAS on such objectives.

The following figures show the impact of 50% pre-selection of young dairy bulls for progeny testing, on an index of QTL and parental EBV information, on the genetic merit of bulls following progeny testing, as well as on the number of bulls that have a progeny test in the top 10 or 1% within a competitive market situation. QTL substitution effects ranged from 0.1 to 0.5 polygenic standard deviations, the accuracy of the parental average EBV was either 0, 0.2, or 0.4, and the accuracy of the progeny test was 0.85. A parental average EBV accuracy of 0 reflects a situation where young bulls considered for progeny testing have been heavily selected, such that the remaining variation among parental EBVs is negligible. Selection was among sons of sires that were heterozygous for the QTL. The deterministic mixture distribution model described in Section 12.6.1 was used.
The impact of MAS on market share was evaluated by finding the unique truncation point for 10% selected in stage 2 for non-MAS selection, assuming this was the industry-wide truncation point, and applying it to the two distributions of total EBV under MAS selection. The same was done to determine the impact on the number of bulls in the top 1%.

Results (see figures) show that extra genetic gain from MAS increases with the QTL substitution effect and decreases with the accuracy of the parental average EBV. The impact on market share is, however, greater than the impact on genetic gain, and the impact on the number of bulls in the top 1% is greater than the impact on the number of bulls in the top 10%. Thus, MAS is expected to have a relatively larger impact on market share than on genetic gain, and an even greater effect on the number of top bulls.

[Figures: Genetic gain / market share from pre-selection of young bulls. Two panels: 50% pre-selected at rEBV = 0 and 50% pre-selected at rEBV = 0.2. Y-axis: % gain (0 to 60); x-axis: QTL substitution effect (σg), 0.1 to 0.5; series: number in top 1%, number in top 10%, mean EBV.]

Brascamp et al. (1990) also found MAS to have a greater relative effect on market share than on genetic gain. However, they found that the economic returns from the increase in market share were smaller than the economic returns from increased returns at the production level. The latter were evaluated using discounted gene flow.

[Slide: Opportunities for MAS / GAS: integration in breeding and business goals. Elements shown: monogenic and polygenic traits; phenotype and BLUP EBV; LE markers, LD markers, and genes; genotype (prob); costs and risks; breeding and business goal; selection strategy; complete evaluation of marker/gene effects; multiple-stage selection; program redesign; short- vs. long-term response.]

12.8 Selection for Crossbred Performance (from Dekkers and Chakraborty, 2002)

In most livestock, crossbreds are used for commercial production to capitalize on heterosis and complementarity, and the aim of selection within pure lines is to maximize crossbred performance. Selection is, however, within pure lines and primarily based on purebred data. Several theoretical studies have shown that combined crossbred and purebred selection (CCSP) can result in greater responses in crossbred performance, in particular if genes with complete or over-dominance affect the trait (Wei and van der Steen, 1992; Uimari and Gibson 1998). Use of crossbred data, however, requires separate testing and recording strategies. The strategic use of non-additive QTL in pure-line selection, by contrast, allows selection for crossbred performance without crossbred data.

[Figure: Breeding pyramid. Selection takes place in the sire line nucleus and the dam line nucleus; each nucleus supplies a multiplier tier (random mating, no selection), and the multipliers supply the commercial tier.]

A deterministic model was developed for selection in sire and dam nucleus lines that provide parents for the multiplier phase of a two-breed terminal cross. Selection was for a trait controlled by a known QTL and additive infinitesimal polygenes (heritability = 0.3). The QTL had alleles Q and q, with genotypic values a, d, and −a for genotypes QQ, Qq, and qq. Frequencies of Q in generation 0 were 0.3 and 0.2 in the sire and dam line. In both lines, selected fractions were 0.1 and 0.25 for sires and dams. Unselected nucleus and multiplier animals were used to produce multiplier and commercial animals by random mating. The objective was to maximize cumulative discounted response (CDR) of crossbred performance over ten generations: $\mathrm{CDR} = \sum_t \delta_t G_t$, where $G_t$ is the mean crossbred performance of progeny from generation t and $\delta_t$ is the discount factor for generation t, based on 10% interest.
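The objective function and the QTL part of crossbred performance can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, the discount factor is assumed to be $\delta_t = 1/(1.10)^t$ with generations numbered 1 to 10, and only the QTL contribution to the crossbred mean is computed (polygenic means would be added to obtain $G_t$). The crossbred genotype frequencies follow from random union of gametes from the two lines.

```python
def crossbred_qtl_mean(p_sire, p_dam, a, d):
    """Mean QTL genotypic value of crossbred progeny.

    p_sire, p_dam: frequency of allele Q in the sire and dam line.
    Genotypic values are a (QQ), d (Qq) and -a (qq).
    """
    q_sire, q_dam = 1.0 - p_sire, 1.0 - p_dam
    freq_QQ = p_sire * p_dam
    freq_Qq = p_sire * q_dam + q_sire * p_dam
    freq_qq = q_sire * q_dam
    return a * freq_QQ + d * freq_Qq - a * freq_qq

def cumulative_discounted_response(means, interest=0.10):
    """CDR = sum over generations t = 1..T of G_t / (1 + interest) ** t.

    The timing convention (first generation discounted once) is an
    assumption of this sketch.
    """
    return sum(g / (1.0 + interest) ** t
               for t, g in enumerate(means, start=1))

# Starting frequencies from the text (0.3 in sires, 0.2 in dams) for an
# over-dominant QTL with a = 1 and d = 1.5 polygenic s.d.
start = crossbred_qtl_mean(0.3, 0.2, a=1.0, d=1.5)
# Q fixed in one line and absent in the other: all crossbreds are Qq.
all_het = crossbred_qtl_mean(1.0, 0.0, a=1.0, d=1.5)
print(start, all_het)
```

For an over-dominant QTL, the crossbred mean is highest when the lines are fixed for opposite alleles, which is why a strategy that drives frequencies apart in the two lines can outperform one that raises the frequency of Q in both.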
The index for selection of animals of genotype i and sex j in line k in generation t was: $I_{ijkt} = b_{ijkt}\, g_{ijkt} + \hat{u}_{ijkt}$, where $g_{ijkt}$ is the known purebred breeding value for the QTL (Dekkers and Chakraborty, 2001) and $\hat{u}_{ijkt}$ is a polygenic estimated breeding value from own phenotype.

Four selection strategies were compared, all based on purebred information:

1. Phenotypic selection: selection on purebred phenotypic information.
2. Standard QTL selection: selection on an index with weights $b_{ijkt}$ equal to one.
3. Optimal QTL selection: selection on an index with weights $b_{ijkt}$ optimized to maximize CDR. Weights were derived by an extension to two lines of methods by Chakraborty et al. (2002).
4. Stepwise optimal QTL selection: selection on an index with $b_{ijkt}$ optimized each generation to maximize performance of crossbred progeny, following Dekkers and Chakraborty (2001).

Table 1 shows CDR for alternative QTL selection strategies relative to phenotypic selection. Responses in polygenic and QTL values are in Table 2 for optimal QTL and phenotypic selection. Figure 1 shows trends in frequencies for QTL with complete and over-dominance.

Table 1. Extra (%) CDR of QTL selection strategies over phenotypic selection for different degrees of dominance (d) and QTL effects (a in polygenic s.d.).

                          Degree of dominance
Selection         d = 0    d = ½a    d = a    d = 1½a
a = 0.5
  Standard QTL     -0.4     -1.7     -2.3      -2.4
  Optimal QTL       1.1      0.5      1.1       3.7
  Stepwise QTL     -0.4     -0.7     -0.1       2.3
a = 1
  Standard QTL     -1.6     -3.6     -3.9      -2.6
  Optimal QTL       1.0      0.3      2.2       9.8
  Stepwise QTL     -1.6     -1.8      0.3       8.0
a = 2
  Standard QTL     -1.3     -3.2     -3.1       1.5
  Optimal QTL       0.8      0.1      3.3      21.0
  Stepwise QTL     -1.3     -2.3      1.1      20.9

For additive QTL (d=0), optimal QTL selection gave only about 1% greater CDR than phenotypic selection (Table 1), similar to single-line selection (Dekkers and Chakraborty, 2001). Similar results were obtained for a QTL with partial dominance (d=½a).
For a QTL with complete dominance (d=a), optimal QTL selection gave up to 3.3% greater CDR than phenotypic selection (Table 1). Extra response increased with the size of the QTL effect. Optimal QTL selection fixed the favorable allele in the sire line and maintained an intermediate frequency in the dam line (Figure 1a). Phenotypic selection increased the frequency in both lines. Extra responses from optimal selection resulted not only from a greater frequency of heterozygotes among crossbred progeny but also from greater polygenic response in the dam line (Table 2).

Table 2. Cumulative discounted responses (in polygenic s.d., σpol) in crossbreds from phenotypic (Phen.) and optimal QTL selection (Optim) in sire and dam lines for a QTL with different degrees of dominance (d) and additive effects (a).

Crossbred             d = 0          d = ½a          d = a          d = 1½a
response (A)       Phen.  Optim   Phen.  Optim   Phen.  Optim   Phen.  Optim
a = 0.5
  Polygenic sire   11.7   11.5    11.8   11.5    11.7   11.4    11.6   11.1
  Polygenic dam    12.9   12.5    12.8   12.7    12.8   13.1    12.6   13.1
  QTL value         1.1    2.0     1.2    1.7     1.3    1.6     1.4    2.3
  Total            25.7   26.0    25.8   25.9    25.7   26.0    25.6   26.6
a = 1
  Polygenic sire   11.3   11.1    11.4   11.2    11.4   10.9    11.2   10.7
  Polygenic dam    12.3   11.9    12.4   12.2    12.3   13.1    12.0   13.1
  QTL value         5.6    6.6     5.1    5.5     4.6    5.0     4.5    6.6
  Total            29.2   29.5    28.9   29.0    28.4   29.0    27.7   30.4
a = 2
  Polygenic sire   10.7   10.7    11.0   10.8    11.1   10.6    10.5   10.3
  Polygenic dam    11.4   11.2    11.7   11.7    11.8   13.0    11.1   12.9
  QTL value        15.1   15.6    13.2   13.5    11.2   11.7     9.8   14.8
  Total            37.2   37.5    35.9   35.9    34.2   35.3    31.5   38.1

(A) Polygenic sire = polygenic contribution from sire line; Polygenic dam = polygenic contribution from dam line; QTL value = QTL contribution; Total = total genetic response.

Standard QTL selection gave lower CDR than phenotypic selection for most cases (Table 1). Standard QTL selection increased the frequency of Q in both lines, and did so faster than phenotypic selection. This resulted in greater gain than phenotypic selection in early generations but in lower response in later generations.
For QTL with complete or over-dominance, frequencies reached an asymptote less than one, where the substitution effect is zero (Figure 1). For non-additive QTL, stepwise QTL selection gave greater CDR than standard QTL selection, in particular with over-dominance (Table 1). For over-dominant QTL, stepwise QTL selection gave similar CDR and trends in frequencies as optimal QTL selection (Figure 1).

This study demonstrates that strategic use of non-additive QTL enables selection for crossbred performance based on purebred data. Limited benefits were obtained for QTL with partial dominance, but these results apply to a trait of moderate heritability that is observed in both sexes prior to selection. Greater crossbred performance from optimal QTL selection resulted not only from greater QTL response, but also from greater polygenic response (Table 2). For over-dominant QTL, optimal QTL selection resulted in 4, 10, and 21% greater CDR than phenotypic selection for QTL with additive effects of ½, 1, and 2 polygenic standard deviations. At a frequency of 0.25, this represents QTL that explain 27, 59, and 85% of the purebred genetic variance. Although these QTL effects may be unrealistic, similar results may be obtained from selection on multiple QTL that jointly explain such a proportion of variance.

Figure 1. QTL allele frequency in sire and dam lines under alternate selection strategies for a QTL with complete or over-dominance and an additive effect of one polygenic s.d. (a) Complete dominance (d = a). (b) Over-dominance (d = 1½a).

[Plots: frequency (0.0 to 1.0) by generation (0 to 10) for optimal, standard, and phenotypic selection in sires and in dams.]

The majority of the extra crossbred response could be achieved by optimizing one generation at a time (Table 1).
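The percentages of purebred genetic variance quoted above for the over-dominant QTL can be reproduced with the standard single-locus variance formulas. The sketch below is an illustration under assumptions that are not spelled out in the text: the QTL variance is taken as the sum of its additive and dominance variance at allele frequency 0.25, the polygenic variance is 1 (effects are expressed in polygenic s.d.), and "genetic variance" is the sum of the two. The function names are hypothetical.

```python
def qtl_variance(p, a, d):
    """Additive plus dominance variance of a biallelic QTL in a random
    mating population with frequency p of allele Q."""
    q = 1.0 - p
    alpha = a + d * (q - p)               # allele substitution effect
    additive = 2.0 * p * q * alpha ** 2
    dominance = (2.0 * p * q * d) ** 2
    return additive + dominance

def qtl_share(p, a, d, polygenic_variance=1.0):
    """Fraction of total genetic variance contributed by the QTL."""
    v_qtl = qtl_variance(p, a, d)
    return v_qtl / (v_qtl + polygenic_variance)

# Over-dominant QTL (d = 1.5 a) with a = 0.5, 1 and 2 polygenic s.d.
for a in (0.5, 1.0, 2.0):
    print(a, round(100 * qtl_share(0.25, a, 1.5 * a)))
```

Under these assumptions the three shares round to 27, 59, and 85%, in agreement with the values given in the text.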
These results can be extended to selection on markers linked to QTL, provided non-additive effects are estimated. Recent QTL studies, in particular in breed crosses, have found several over-dominant QTL (e.g. De Koning et al., 2000; Malek et al., 2001). Although these may represent linked QTL in repulsion phase, their apparent over-dominant effects can be used to select for crossbred performance prior to the break-up of such linkage.

The use of QTL removes the requirement of crossbred testing in CCSP, thereby saving important test resources and enabling the short generation intervals of purebred selection. Although a two-breed terminal cross was modeled here, results in principle apply to any crossbreeding system. The choice between purebred QTL selection and CCSP will depend on the proportion of non-additive genetic variation contributed by the QTL, the impact of other factors that differentiate purebred from commercial performance (GxE), the increase in generation intervals with CCSP, and the cost of crossbred testing versus the cost of implementing marker-assisted selection. Benefits of using QTL could be further enhanced by mating based on QTL genotype at the commercial level. This, however, requires extensive genotyping at the multiplier level.

Recent gene and QTL mapping studies have also revealed that QTL may not be expressed in a Mendelian fashion. In particular, several studies have detected genes and QTL in pigs that are subject to gametic imprinting (Jeon et al. 1999, De Koning et al. 2000). Future studies will undoubtedly identify other epigenetic phenomena that affect the inheritance and expression of QTL. These effects will need to be taken into account when designing selection programs. Although they may complicate selection programs, they may also provide opportunities.
For example, De Koning (2001) suggested that utilization of a combination of imprinted and sex-linked QTL would allow a diverse set of markets to be targeted through strategic crosses between a single set of breeds.