Incorporating Molecular Genetic Information in Genetic Improvement Programs for Livestock Chapter

Chapter 12
(Dekkers)
Incorporating Molecular Genetic
Information in Genetic Improvement
Programs for Livestock
Based in part on: Dekkers et al. (2001), Dekkers and Hospital (2002), and Dekkers and Settar
(2003)
Substantial advances have been made in the genetic improvement of agriculturally important
animal and plant populations through artificial selection on quantitative traits. Most of this
selection has been on observable phenotype, without knowledge of the genetic architecture of the
selected characteristics, which is treated as a black box, with no knowledge of the number of
genes that affect the trait, let alone of the effects of each gene or their locations in the genome.
Despite the obvious flaws of this model, the tremendous rates of genetic improvement that have
been achieved attest to the utility of the quantitative genetic approach. Nevertheless, quantitative
genetic selection has several limitations: phenotype is an imperfect predictor of an individual’s
breeding value; phenotype may not be observed on both sexes or prior to the time when
selection decisions must be made; and phenotype is not very effective in resolving negative
associations between genes, e.g. those caused by linkage or epistasis. The ideal situation for
quantitative genetic selection is that the trait has high heritability and that the phenotype can be
observed on all individuals prior to reproductive age. This ideal is hardly ever achieved, which
limits the effectiveness of quantitative genetic selection.
Andersson (2001) and Mauricio (2001) reviewed how molecular genetics can be used to discern
the genetic nature of quantitative traits in animals and plants, respectively, by identifying genes
or chromosomal regions that affect the trait — so-called quantitative trait loci or QTL. This has
enabled identification and characterization of at least some of the genes that contribute to genetic
variation in quantitative traits. Because DNA can be obtained at any age and on both sexes,
molecular genetics can alleviate some of the limitations of quantitative genetic selection, as will
be discussed below. Thus, the genes and genetic markers that are being discovered can be used to
enhance genetic improvement of breeding stock through marker-assisted selection. The purpose
of this Chapter is to show how this information can be used to enhance genetic improvement.
Emphasis will be on utilization of natural variation within a species, rather than on the
introduction of new genetic variation through genetic modification, although some of the
programs reviewed, such as introgression, also play an important role in the introduction of
transgenes into breeding populations (see e.g. Gama et al. 1992).
[Slide: Applications of Molecular Data]
➣ Parental identification / verification
➣ Traceability
➣ Evaluation of genetic diversity
➣ Introgression of desirable genes: Marker-Assisted Introgression (MAI)
➣ Enhance selection within outbred populations: Marker-Assisted Selection (MAS)

[Slide: Use of Molecular Data in Selection]
The selection strategy combines phenotypic data (reflecting unknown genes) with genotypic data from molecular genetics (identified or marked QTL).
Information-providing records: own phenotype, progeny phenotype, ancestral records, half-sib records, and full-sib records (combined in EBV), and marker/genotypic data on X.

[Slide: Benefit from use of molecular data]
➣ Higher h2 than phenotypic data
➣ Expressed in both sexes
➣ Expressed at early age (embryo stage)
➣ Explains within-family variation
➣ Traces Mendelian sampling terms: gX = 1/2 gsire + 1/2 gdam + RAsire + RAdam

[Slide: Factors Affecting Extra Response from MAS]
➣ Effects of identified QTL (% of genetic variance)
➣ Recombination rates between markers and QTL
➣ Effectiveness of phenotypic selection
   • Heritability
   • Restrictions on phenotyping (measurement):
      in one sex only (sex-limited traits): EBV based on relatives for one sex
      late in life (after selection): EBV based on relatives
      not on live animal (meat quality traits): EBV from relatives, reduced intensity
      difficult to measure (disease traits)

[Slide: Potential gain from MAS/GAS and ease of QTL detection, by type of trait]
• Single gene traits: genetic defects/disorders, appearance
• Quantitative traits, routinely recorded: both sexes, sex-limited, late in life
• Quantitative traits, difficult to record: feed intake, product quality
• Quantitative traits, unrecorded / low h2: disease resistance
QTL detection approaches listed: genome scans, candidate genes.
Molecular genetic analyses of quantitative traits lead to the identification of two broadly different
types of genetic loci that can be used to enhance genetic improvement programs: causal
mutations and presumed non-functional genetic markers that are linked to QTL (indirect
markers). Causal mutations for quantitative traits are hard to find, difficult to prove, and few
examples are available (Andersson 2001). Non-functional or anonymous polymorphisms are
abundant across the genome and their linkage with QTL can be established by evidence of
empirical associations of marker genotypes with trait phenotype. Two approaches are used to
identify indirect markers (Andersson 2001): directed searches using candidate gene approaches
in unstructured populations (Rothschild and Soller 1997); and undirected genome-wide searches
in specialized populations, such as F2 crosses or half-sib family populations. Because candidate
gene markers focus on polymorphisms within a gene that is postulated to affect the trait, they are
often tightly linked to the QTL. A candidate gene marker can represent the functional
polymorphism, although this is difficult to prove (Andersson 2001). Genome scans, on the other
hand, only identify regions of chromosomes that affect the trait. The length of these regions is
typically 10 to 20 cM, but the exact position and number of QTL within the region is unknown.
Whereas causative polymorphisms give direct information about genotype for the QTL, use of
indirect markers for QTL mapping and for selection is based on existence of linkage or gametic
phase disequilibrium (LD) between the marker and the QTL. Marker-QTL LD can exist at the
population level but always exists within families, even between loosely linked loci. Although
two loci are expected to be in population-wide equilibrium in large random-mating populations,
partial population-wide LD can exist by chance between tightly linked loci in breeding
populations that are under selection. Population-wide LD can also be created by crossing lines or
breeds. Although LD will then exist even between loosely linked loci, this LD will erode rapidly
over generations. Indirect markers that are identified using the candidate gene marker approach
are expected to be in substantial LD with the QTL in which they reside. Unless the functional
polymorphism has been identified, however, linkage phase of a candidate gene marker with the
functional variant can differ from one population to the next and must, therefore, be assessed in
the population in which it will be used. Although more abundant and extensive, within-family
LD is more difficult to use because linkage phases between the markers and QTL will not be the
same in all families and must, therefore, be assessed on a within-family basis.
[Slide: Utilization of markers that are in population-wide disequilibrium with a QTL (Q/q)]
Marker and QTL alleles are, or tend to be, in consistent linkage phase (M with Q, m with q), so selection can be on marker genotype across the population.
Direct markers are at the QTL and are, therefore, expected to be in complete population-wide linkage disequilibrium with the QTL.
Population-wide linkage disequilibrium can be created by crossing (ideally inbred) lines or breeds and will then exist between loosely linked markers for several generations.
Although all linked markers are expected to be in population-wide linkage equilibrium with QTL, tightly linked markers have a substantial probability of being in partial population-wide LD because of the effects of drift, selection, mutation, and population admixture (Sved 1971, Goddard 1991, Hastbacka et al. 1992). This probability is higher in selected populations of small effective size, which is the case for agricultural species, as demonstrated by Farnir et al. (2000) for dairy cattle.

[Slide: Utilization of indirect markers that are in population-wide equilibrium with a QTL (Q/q)]
Marker and QTL alleles appear in alternate linkage phases (haplotypes M Q, M q, m Q, and m q all occur). Marker genotype gives no information about QTL genotype. This will be the case for most indirect markers in an outbreeding population.

[Slide: Use of within-family linkage disequilibrium for QTL mapping and MAS]
A parent with haplotypes M Q and m q and recombination rate r between marker and QTL produces the following gametes:
   Gamete   Frequency
   M Q      1/2 (1-r)
   m q      1/2 (1-r)
   M q      1/2 r
   m Q      1/2 r
Despite population-wide equilibrium, the marker and QTL will be in partial disequilibrium within a family. The extent of disequilibrium depends on the recombination rate (r), but will occur even with loose linkage (r=0.2). This disequilibrium can be used to detect QTL and for selection.
[Slide: Three Types of Molecular Information]
1) Genotype for functional gene (BB, Bb, bb)
   ➣ polymorphism = causative mutation
2) Genotype for a direct marker
   ➣ polymorphism is in population-wide linkage disequilibrium with causative mutation
3) Genotype for linked genetic markers
   ➣ polymorphism in linkage equilibrium across population: E(µMM) = E(µMm) = E(µmm)
   ➣ use within-family disequilibrium
Marker-QTL haplotypes present (shown for types 2 and 3): MB, mb, Mb, mB

[Slide: Use of within-family disequilibrium]
A sire with haplotypes M B and m b (recombination rate r) is mated to random dams, whose marker-QTL haplotypes are unknown (?).
   Progeny     Sire haplotype inherited    Frequency     Mean
   M progeny   M B (non-recombinant)       1/2 (1-r)     µ + 1/2 α
   M progeny   M b (recombinant)           1/2 r         µ - 1/2 α
   m progeny   m b (non-recombinant)       1/2 (1-r)     µ - 1/2 α
   m progeny   m B (recombinant)           1/2 r         µ + 1/2 α
Average of M progeny: µ + 1/2 (1-2r)α. Average of m progeny: µ - 1/2 (1-2r)α.
Contrast: µM? - µm? = (1-2r)α
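The within-family marker contrast (1-2r)α can be checked numerically. The following is a minimal sketch, assuming a single sire with haplotypes M-B and m-b, additive QTL action, and dams that contribute nothing to the contrast on average; the function name and example values are illustrative only.

```python
def marker_contrast(alpha, r):
    """Expected mean difference between progeny inheriting marker allele M
    and progeny inheriting m from a sire with haplotypes M-B and m-b.

    alpha: allele substitution effect at the QTL (B versus b)
    r:     recombination rate between marker and QTL (0 <= r <= 0.5)
    """
    # Progeny means are written as deviations from the overall mean mu.
    # M progeny carry B with probability (1 - r) and b with probability r.
    mean_upper_m = (1 - r) * (0.5 * alpha) + r * (-0.5 * alpha)
    # m progeny carry b with probability (1 - r) and B with probability r.
    mean_lower_m = (1 - r) * (-0.5 * alpha) + r * (0.5 * alpha)
    return mean_upper_m - mean_lower_m


# Hypothetical QTL effect of 10 units at several recombination rates.
for r in (0.0, 0.1, 0.2, 0.5):
    print(f"r = {r:.1f}: contrast = {marker_contrast(10.0, r):.2f}")
```

The contrast shrinks linearly with r and vanishes at r = 0.5, which is why loosely linked markers still carry some within-family information while unlinked markers carry none.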
[Slide: Linkage disequilibrium can persist many generations for tightly linked loci]
Measure of disequilibrium between marker allele M and QTL allele B:
\[ D_{M,B} = \mathrm{freq}(MB) - \mathrm{freq}(M)\,\mathrm{freq}(B) \]
Random mating in a large population, with recombination rate r between the loci:
\[ D_{M,B}(t+1) = (1-r)\,D_{M,B}(t), \qquad D_{M,B}(t) = (1-r)^t\,D_{M,B}(0) \]
[Figure: D_{M,B} plotted against generation (0 to 100), on a scale from 0 to 1, for r = 0.001, 0.01, 0.05, 0.1, 0.2, and 0.5.]
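The decay of population-wide LD under random mating, D(t) = (1-r)^t D(0), can be tabulated with a few lines of code. This is a minimal sketch for a large random-mating population without selection or drift; the function names are illustrative, not from the source.

```python
import math


def ld_after(d0, r, t):
    """Disequilibrium after t generations of random mating: D(t) = (1-r)^t * D(0)."""
    return d0 * (1.0 - r) ** t


def generations_to_fraction(r, fraction):
    """Smallest whole number of generations after which D(t)/D(0) <= fraction.

    Requires 0 < r < 1 and 0 < fraction < 1.
    """
    return math.ceil(math.log(fraction) / math.log(1.0 - r))


# Remaining fraction of the initial disequilibrium, for the recombination
# rates shown in the figure.
generations = (0, 20, 40, 60, 80, 100)
print("r      " + "  ".join(f"t={t:<4d}" for t in generations))
for r in (0.001, 0.01, 0.05, 0.1, 0.2, 0.5):
    row = "  ".join(f"{ld_after(1.0, r, t):6.3f}" for t in generations)
    print(f"{r:<6} {row}")
```

For tightly linked loci (r = 0.01) it takes 69 generations to halve the disequilibrium, whereas for unlinked loci (r = 0.5) it halves every generation.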
The use of molecular genetics in selection programs rests on the ability to determine the
genotype of individuals for causal mutations or indirect markers using DNA analysis. This
information is then used to assess the genetic value of the individual, which can be captured in a
molecular score that can be used for selection. This removes some of the limitations of
quantitative genetic selection discussed above.
It is clear that the use of molecular data for genetic improvement would be most effective if the
genetic architecture of a quantitative trait was completely transparent such that we knew the
number, positions, and effects of all genes involved. In that case, the process of selection would
be reduced to a simple ‘building block’ problem (genotype building) of selection and mating to
create individuals with the right combination of alleles at each QTL. However, this situation is
far from reality and may never be achieved; although advances in molecular genetics have been
able to partially dissect the black box of quantitative traits, the information provided by
molecular data is far from complete, for three main reasons. First, in most cases only a limited
number of genes that affect the trait has been identified, albeit the ones with the largest effects.
Nevertheless, a substantial part of the black box remains obscure and selection exclusively on
genotype for identified QTL would not result in maximum response to selection. Instead,
selection on molecular score must be combined with selection on phenotype, which reflects the
collective action of all genes, including those that have not been identified. Second, with indirect
markers, selection is not directly on the QTL, but on the marker, via LD. As LD erodes in the
course of the selection program due to recombination, efficiency of selection is reduced. Third,
for both causal and indirect markers, the effects of the QTL must be estimated empirically on the
basis of statistical associations between markers and phenotype. Estimation requirements are
particularly high for markers that are not in population-wide LD and for which within-family LD
must be used. In that case, marker-QTL linkage phase and effects must be estimated on a within-family basis. Thus, the use of molecular information does not remove the need for phenotypic
information and, therefore, suffers to some degree from the same limits as quantitative genetic
selection.
Despite the limitations outlined above, molecular genetic information can be used to enhance
several breeding strategies through what is broadly referred to as Marker-Assisted Selection
(MAS). All strategies for MAS are based on the use of a molecular score, although the composition
of this score differs from application to application. In addition to those described below, the
application of molecular data in genetic programs includes their use for parentage verification or
identification (for example, when mixed semen is used in artificial insemination) and in genetic
conservation programs to identify unique genetic resources and quantify genetic diversity.
The type of genetic information that is available, and its association with the functional mutation
(population-wide LD or within-family LD), has important consequences for the use of molecular
information in selection programs. On this basis, the following three types of selection programs
using molecular information can be distinguished:
• Gene-assisted selection (GAS): selection based on the functional mutation for the QTL
• Marker-assisted selection based on population-wide LD (LD-MAS): selection based on
  markers or marker haplotypes that are in population-wide disequilibrium with the QTL
• Marker-assisted selection based on within-family LD (LE-MAS): selection based on
  markers or marker haplotypes that are in population-wide equilibrium with the QTL but
  in LD with the QTL on a within-family basis.
[Slide: Three types of observable molecular genetic loci and possible selection strategies]
Three types of observable molecular genetic loci (compared on the slide for ease of detection and ease of use):
   Q / q        Functional mutations (known genes): GAS
   MQ / mq      Markers in pop.-wide LD with functional mutation: LD-MAS
   M Q / m q    Markers in pop.-wide LE with functional mutation: LE-MAS
Possible selection strategies:
• Two-stage selection
   1) Select on genotype
   2) Select on EBV
• Index selection
   I = b1 genotype + b2 EBV
• Pre-selection
   1) GAS (index) at young age
   2) Select on EBV at later age
For each of the three types of selection (GAS, LD-MAS, and LE-MAS), there are two basic
strategies for combining the molecular information with phenotypic information in a selection
strategy:
1) Two-stage selection, in which selection is on the molecular score in the first stage and on
phenotype or a (polygenic) EBV in the second stage
2) Index selection, in which selection is on an index of molecular score and phenotypic
information.
In addition, molecular information could be used primarily for pre-selection of young animals for
further testing.
Methods to derive indexes that combine molecular and phenotypic information will be presented
in the next section, followed by methods to predict responses to selection with presence of QTL
of large effects. We will then compare alternative selection strategies for utilization of QTL
information between and within breeds, and finish with an economic analysis of MAS and
opportunities for the redesign of breeding programs to more fully capture the benefits of MAS.
12.1 Including QTL Information in Estimated Breeding Values
When distinguishing QTL that have been mapped from other background genes that affect the
trait, which will be referred to as polygenes, the genetic value $g_i$ of an individual i can be
partitioned into the sum of genetic values at the QTL, $g_{Qi}$, and the sum of genetic values at
polygenes, $g_{pi}$:
\[ g_i = g_{Qi} + g_{pi} \]
Molecular genetic information provides information that can be used to estimate $g_{Qi}$, whereas
an individual’s phenotype provides information on the collective effect of all genes. Unless all
QTL that affect the trait have been identified, selection on QTL must be combined with selection
on phenotypic information, to ensure simultaneous improvement of both $g_{Qi}$ and $g_{pi}$. Lande and
Thompson (1990) suggested that QTL and phenotypic information should be combined in an
index of the following form:
\[ I_i = b_Q \hat{g}_{Qi} + b_P P_i \]
where $\hat{g}_{Qi}$ is the molecular score for individual i, i.e. the individual’s estimated breeding value
for the QTL, $P_i$ is the individual’s phenotype, and $b_Q$ and $b_P$ are index weights. The molecular
score, $\hat{g}_{Qi}$, can be computed as the sum over QTL or markers of estimates of effects on
phenotype based on the individual’s QTL or marker genotypes. An example is in Table 12.1.
Lande and Thompson (1990) showed that index weights could be derived by standard selection
index theory, given the proportion of genetic variance explained by the QTL or markers
($q = \sigma_Q^2 / (h^2 \sigma_P^2)$), and the (total) heritability of the trait ($h^2 = \sigma_g^2 / \sigma_P^2$):
\[
\begin{bmatrix} b_Q \\ b_P \end{bmatrix}
= P^{-1} G
= \begin{bmatrix} \sigma_Q^2 & \sigma_Q^2 \\ \sigma_Q^2 & \sigma_P^2 \end{bmatrix}^{-1}
  \begin{bmatrix} \sigma_Q^2 \\ \sigma_g^2 \end{bmatrix}
= \begin{bmatrix} \dfrac{1-h^2}{1-qh^2} \\[2ex] \dfrac{h^2(1-q)}{1-qh^2} \end{bmatrix}
\]
Thus, the weight on the molecular score relative to phenotype is:
\[ \frac{b_Q}{b_P} = \frac{\dfrac{1}{h^2} - 1}{1-q} \]
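The closed-form index weights can be cross-checked against a direct solution of the selection index equations. The sketch below assumes the phenotypic variance is scaled to 1 and that the molecular score equals the true QTL breeding value; function names are illustrative.

```python
def index_weights(h2, q):
    """Closed-form weights (b_Q, b_P) on molecular score and own phenotype."""
    denom = 1.0 - q * h2
    return (1.0 - h2) / denom, h2 * (1.0 - q) / denom


def index_weights_from_variances(h2, q, var_p=1.0):
    """Solve P b = G directly for the 2 x 2 case using Cramer's rule."""
    var_g = h2 * var_p          # total genetic variance
    var_q = q * var_g           # variance explained by the QTL
    # P = [[var_q, var_q], [var_q, var_p]],  G = [var_q, var_g]
    det = var_q * var_p - var_q * var_q
    b_q = (var_q * var_p - var_q * var_g) / det
    b_p = (var_q * var_g - var_q * var_q) / det
    return b_q, b_p


def relative_weight(h2, q):
    """b_Q / b_P = (1/h2 - 1) / (1 - q)."""
    return (1.0 / h2 - 1.0) / (1.0 - q)


b_q, b_p = index_weights(0.5, 0.5)
print(f"h2 = 0.5, q = 0.5: b_Q = {b_q:.3f}, b_P = {b_p:.3f}, ratio = {b_q / b_p:.2f}")
```

Both routes give weights of 2/3 and 1/3 for h2 = 0.5 and q = 0.5. The direct solution requires 0 < q < 1, because the matrix P is singular when the QTL explain none or all of the phenotypic variance.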
Table 12.1. Example of the calculation of molecular score and index of phenotype and molecular
score with 3 additive QTL with allele substitution effects (allele A vs. B) of +10, +5, and –10
for QTL 1, 2, and 3, respectively. The QTL jointly explain 50% of the genetic variance for a trait
with heritability 0.5. Resulting index weights on molecular score and phenotype are 2/3 and 1/3,
respectively (after J. Holland, 1998).

            QTL 1             QTL 2             QTL 3         Molecular               Index
Animal   Genotype  Value   Genotype  Value   Genotype  Value    score     Phenotype   value
1        AA          10    AA           5    AA         -10        5         35        15.0
2        AA          10    AA           5    BB          10       25        -10        13.3
3        AB           0    BB          -5    AB           0       -5        -15        -8.3
4        AB           0    BB          -5    AA         -10      -15         15        -5.0
5        BB         -10    AA           5    AB           0       -5         25         5.0
[Slide: Index of QTL and own phenotype (Lande & Thompson, 1990 Genetics 124:743)]
g = gQ + gpol, where gQ = QTL/marker BV and gpol = polygenic BV
σg2 = total genetic variance = σQ2 + σpol2
q = fraction of genetic variance due to QTL/marker = σQ2/σg2
h2 = total heritability = σg2/σP2
Index: I = bQ gQ + bP P
Selection index theory: bQ = (1-h2)/(1-qh2), bP = h2(1-q)/(1-qh2), and bQ/bP = (1/h2 - 1)/(1-q)
Accuracy = rg,I = (b’G)^0.5/σg
Efficiency = rg,I / rg,P = [(q/h2) + (1-q)^2/(1-h2 q)]^(1/2)
Can be expanded to multiple QTL and multiple phenotypic records using standard selection index theory.
Example relative weights are in Table 12.2, which shows that the index gives more weight to the
molecular score as heritability decreases and as the proportion of variance explained by the QTL
increases.
Table 12.1 also gives index values for the example animals. This illustrates that different
selection decisions would be made based on molecular score alone, based on phenotype alone,
and based on the index.
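The molecular scores and index values in Table 12.1 can be reproduced as follows. This is a sketch using the genotypes, phenotypes, and index weights (2/3 and 1/3) given in the table; the data structures and function names are illustrative.

```python
# Value of genotype AA at each QTL; AB has value 0 and BB the negative of AA.
AA_VALUE = (10.0, 5.0, -10.0)
GENOTYPE_SIGN = {"AA": 1, "AB": 0, "BB": -1}

# animal: (genotypes at QTL 1-3, phenotype), as in Table 12.1
ANIMALS = {
    1: (("AA", "AA", "AA"), 35.0),
    2: (("AA", "AA", "BB"), -10.0),
    3: (("AB", "BB", "AB"), -15.0),
    4: (("AB", "BB", "AA"), 15.0),
    5: (("BB", "AA", "AB"), 25.0),
}


def molecular_score(genotypes):
    """Sum of genotype values over the three QTL."""
    return sum(GENOTYPE_SIGN[g] * v for g, v in zip(genotypes, AA_VALUE))


def index_value(score, phenotype, b_q=2.0 / 3.0, b_p=1.0 / 3.0):
    """I = b_Q * molecular score + b_P * phenotype."""
    return b_q * score + b_p * phenotype


scores = {a: molecular_score(g) for a, (g, _) in ANIMALS.items()}
indexes = {a: index_value(scores[a], p) for a, (_, p) in ANIMALS.items()}

for a in ANIMALS:
    print(f"animal {a}: score = {scores[a]:6.1f}, "
          f"phenotype = {ANIMALS[a][1]:6.1f}, index = {indexes[a]:6.1f}")

# Rankings (best first) under two of the selection criteria
rank_by_phenotype = sorted(ANIMALS, key=lambda a: ANIMALS[a][1], reverse=True)
rank_by_index = sorted(ANIMALS, key=lambda a: indexes[a], reverse=True)
print("ranking on phenotype:", rank_by_phenotype)
print("ranking on index:    ", rank_by_index)
```

Animal 2 has the highest molecular score but a poor phenotype, so it ranks first on molecular score, fourth on phenotype, and second on the index.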
The Lande and Thompson (1990) formulation of the index is easily extended to situations where
indexes of phenotypes of relatives are used. Indexes can also be extended to multiple-trait
situations.
Table 12.2. Index weight on molecular score relative to phenotype (bQ/bP) for different
heritabilities and proportions of genetic variance explained by the QTL (after J. Holland, 1998).

Heritability     Proportion of genetic variance explained by QTL (q)
(h2)             0.10      0.25      0.50      0.75      1.00
0.10             10        12        18        36        Total weight
0.25             3.33      4         6         12        Total weight
0.50             1.11      1.33      2         4         Total weight
0.75             0.37      0.44      0.67      1.33      Total weight
1.00             0         0         0         0         Either
It is useful to note that the above index can be reparameterized into an equivalent index of
molecular score and phenotype adjusted for the molecular score as follows:
\[ I_i' = b_Q' \hat{g}_{Qi} + b_P' P_i' \]
where $P_i' = P_i - \hat{g}_{Qi}$. Using selection index theory and defining polygenic heritability as the
heritability of phenotype adjusted for molecular score:
\[ h_{pol}^2 = \frac{\sigma_g^2 - \sigma_Q^2}{\sigma_P^2 - \sigma_Q^2} = \frac{h^2(1-q)}{1-qh^2} \]
weights for this index can then be derived to be independent of r and equal to $b_Q' = 1$ and
$b_P' = h_{pol}^2$:
\[
\begin{bmatrix} b_Q' \\ b_P' \end{bmatrix}
= P^{-1} G
= \begin{bmatrix} \sigma_Q^2 & 0 \\ 0 & \sigma_P^2 - \sigma_Q^2 \end{bmatrix}^{-1}
  \begin{bmatrix} \sigma_Q^2 \\ \sigma_g^2 - \sigma_Q^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ h_{pol}^2 \end{bmatrix}
\]
Thus, the resulting index is:
\[ I_i' = \hat{g}_{Qi} + h_{pol}^2 P_i' \]
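That the reparameterized index I' gives the same values as the original index I can be verified numerically. The sketch below assumes the same h2 and q for both indexes; the example score and phenotype values are arbitrary and the function names are illustrative.

```python
def polygenic_heritability(h2, q):
    """h2_pol = h2 (1 - q) / (1 - q h2): heritability of phenotype adjusted for the QTL."""
    return h2 * (1.0 - q) / (1.0 - q * h2)


def index_original(score, phenotype, h2, q):
    """I = b_Q * score + b_P * phenotype, with the Lande and Thompson weights."""
    denom = 1.0 - q * h2
    b_q = (1.0 - h2) / denom
    b_p = h2 * (1.0 - q) / denom
    return b_q * score + b_p * phenotype


def index_adjusted(score, phenotype, h2, q):
    """I' = score + h2_pol * (phenotype - score)."""
    return score + polygenic_heritability(h2, q) * (phenotype - score)


h2, q = 0.5, 0.5
print(f"h2_pol = {polygenic_heritability(h2, q):.4f}")
for score, phenotype in ((5.0, 35.0), (25.0, -10.0), (-15.0, 15.0)):
    i_orig = index_original(score, phenotype, h2, q)
    i_adj = index_adjusted(score, phenotype, h2, q)
    print(f"score = {score:6.1f}, phenotype = {phenotype:6.1f}: "
          f"I = {i_orig:7.3f}, I' = {i_adj:7.3f}")
```

The two agree because 1 - h2_pol = (1 - h2)/(1 - q h2) = b_Q and h2_pol = b_P, so I' is the same linear function of molecular score and phenotype as I.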
One important advantage of index I ' over index I is that its index weights remain constant over
generations, whereas weights for index I must be updated each generation as QTL frequencies,
and therefore the proportion of genetic variance explained by the QTL, change. This index also
allows easy extension to indexes based on BLUP EBV. To see this, note that the second term in
this index, $h_{pol}^2 P_i'$, represents the individual’s estimated breeding value for polygenes, $\hat{g}_{pi}$, based
on own phenotype adjusted for the QTL. This index can be expanded to BLUP EBV from a
model that includes QTL or markers as a fixed or random effect (see Fernando and Grossman,
1989, for methodology to include marked QTL as random effects in a BLUP animal model).
Such models result in estimates of molecular scores, $\hat{g}_{Qi}$, and EBV for polygenic effects, $\hat{g}_{pol,i}$,
with accuracy $r_{pol}$. Index weights for combining these two estimates, realizing that the variance
of polygenic EBV is equal to $r_{pol}^2 \sigma_{pol}^2$, where $\sigma_{pol}^2 = h_{pol}^2 (\sigma_P^2 - \sigma_Q^2)$ is the polygenic variance, can
be derived as:
\[
\begin{bmatrix} b_Q' \\ b_P' \end{bmatrix}
= P^{-1} G
= \begin{bmatrix} \sigma_Q^2 & 0 \\ 0 & r_{pol}^2 \sigma_{pol}^2 \end{bmatrix}^{-1}
  \begin{bmatrix} \sigma_Q^2 \\ r_{pol}^2 \sigma_{pol}^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
Thus the index is:
\[ I_i' = \hat{g}_{Qi} + \hat{g}_{pol,i} \]
[Slide: Index of QTL and own phenotype: alternative (but equivalent) formulation]
P* = phenotype adjusted for QTL/marker = P - gQ
σp*2 = σP2 - σQ2
hpol2 = polygenic heritability = σpol2/σp*2
Index: I = bQ gQ + bP* P*
Selection index theory: bQ = 1 and bP* = hpol2, so that I = gQ + hpol2 P*

[Slide: Index of QTL and phenotypic information: generalization to BLUP EBV]
ĝQ = EBV based on (multiple) markers/QTL (= Σ ĝQi for multiple markers/QTL)
ĝpol = BLUP for polygenic BV
Estimates can be obtained from BLUP-QTL animal models (Fernando & Grossman, 1989 Genet. Sel. Evol. 21:467)
Index: I = ĝQ + ĝpol (overall EBV = QTL EBV + polygenic EBV)

[Slide: Use of within-family LD: marker-assisted BLUP (Fernando and Grossman, 1989)]
The sire carries marker alleles Ms with QTL alleles Qsp and Qsm; the dam carries marker alleles Md with QTL alleles Qdp and Qdm; the progeny carries Ms with Qip and Md with Qim.
Model: yi = µ + vip + vim + u + e, where vip and vim are the paternal and maternal QTL allele effects and u is the polygenic effect
Var(u) = Aσu2
Var(v) = Gσv2, where G = gametic relationship matrix for QTL effects, computed from marker genotypes and the marker-QTL recombination rate
➣ EBV for QTL alleles: v̂ip, v̂im
➣ EBV for polygenic effects: û
Total EBV = v̂ip + v̂im + û
If the phenotypic EBV is from a regular animal model and not from a model that includes the
marker or QTL as separate effects, derivation of the index can only be approximated by
correcting the EBV for effects of the QTL. This could be done by regressing the regular EBV,
$\hat{g}_i$, on the molecular score (or QTL genotype(s)) using:
\[ \hat{g}_i = \beta \hat{g}_{Qi} + e_i \]
Residuals from this model then provide approximate estimates of polygenic EBV, i.e. $\hat{g}_{pol,i} \approx \hat{e}_i$,
which can be used in the index described above. Note that, although $\hat{g}_{Qi}$ may represent an
unbiased estimate of the QTL effects, the estimate of the regression coefficient β will be less
than 1. The reason is that when estimating EBV $\hat{g}_i$, all effects, including the QTL effects, are
regressed back toward zero. In theory, the extent of regression can be approximated by the
square of the accuracy of the EBV, i.e. $r^2$. This can be most readily seen for EBV based on own
phenotype alone, in which case:
\[ \hat{g}_i = h^2 P_i = h^2 g_{Qi} + h^2 (P_i - g_{Qi}) \]
Thus in this case the regression factor is $\beta = h^2 = r^2$, since r = h for selection on own phenotype.
This relationship $\beta = r^2$ is, however, only an approximation when phenotypic information from
relatives contributes to the EBV, because the extent of regression of phenotypic information
from relatives is not equal to $r^2$. This is most easily seen from Table 4.1 in Chapter 4, when
comparing index coefficients b to the square of the accuracy $r_{HI}$.
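The result that β = h2 for EBV based on own phenotype can be illustrated with a small simulation. This is only a sketch under strong assumptions that are not in the source: the molecular score is known without error, QTL values are drawn from a normal distribution, and phenotypic variance is scaled to 1. All names and parameter values are hypothetical.

```python
import random


def simulate_beta(h2=0.5, q=0.5, n=20000, seed=1):
    """Regress EBV = h2 * P on the QTL value and return the estimated slope."""
    rng = random.Random(seed)
    sd_qtl = (q * h2) ** 0.5            # SD of QTL values (var_P = 1)
    sd_rest = (1.0 - q * h2) ** 0.5     # SD of polygenic plus environmental part
    xs, ys = [], []
    for _ in range(n):
        g_qtl = rng.gauss(0.0, sd_qtl)
        phenotype = g_qtl + rng.gauss(0.0, sd_rest)
        xs.append(g_qtl)                # molecular score, assumed error-free
        ys.append(h2 * phenotype)       # EBV from own phenotype only
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


beta = simulate_beta()
print(f"estimated regression of EBV on molecular score: {beta:.3f} (h2 = 0.5)")
```

The estimated slope falls close to h2 rather than 1, which reflects the regression of all effects, including the QTL effects, toward zero in the EBV.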
In addition, if animals with EBV with different accuracy are included in the analysis, a single
regression coefficient will not suffice. This could be accommodated by a weighted least squares
analysis or by first de-regressing EBV. These problems are, however, all circumvented when the
marker information is included directly in the genetic evaluation model, which is the preferred
method.
It is useful to note that selection based on own phenotype (without molecular information) can
also be written as selection on an index of breeding values for the QTL and polygenes by noting
that selection on $P_i$ is equivalent to selection on $h_{pol}^2 P_i$, which can be written as
\[ \hat{g}_i = h_{pol}^2 P_i = h_{pol}^2 g_{Qi} + h_{pol}^2 P_i' = h_{pol}^2 g_{Qi} + \hat{g}_{pol,i} \]
Thus, with phenotypic selection, the emphasis on the molecular score relative to the EBV for
polygenes is equal to the polygenic heritability, $h_{pol}^2$, instead of 1 as in MAS. Similarly, for more
complex EBV based on phenotypic records, as shown earlier, the EBV can be approximated by:
\[ \hat{g}_i \approx r^2 g_{Qi} + \hat{g}_{pol,i} \]
and the implicit weight on the molecular score is approximately equal to the square of accuracy,
$r^2$.
12.2 Predicting Response to Selection with QTL Information
Apart from stochastic simulation (see Chapter 2), two deterministic methods have been used to
predict response to selection on EBV that include information on an identified or marked QTL:
1) using selection index theory
2) using mixture distributions
The first approach follows standard selection theory, in which the QTL information is considered
as another source of normally distributed information in the index. The second approach more
precisely models selection on a QTL. Both approaches will be described in detail below.
12.2.1 Selection Index approach to predicting response to marker-assisted selection
Consider the previously derived selection index of molecular score and own phenotype, when the
molecular score explains a fraction q of the additive genetic variance:
\[ I_i = b_Q \hat{g}_{Qi} + b_P P_i \]
The accuracy of this index and response to selection can be derived by standard selection index
theory (Chapter 4) as:
\[
r_{g,I} = \sqrt{\frac{b'G}{\sigma_g^2}}
= \sqrt{\begin{bmatrix} \dfrac{1-h^2}{1-qh^2} & \dfrac{h^2(1-q)}{1-qh^2} \end{bmatrix}
        \begin{bmatrix} q \\ 1 \end{bmatrix}}
= \sqrt{\frac{q - 2qh^2 + h^2}{1-qh^2}}
= \sqrt{q + \frac{h^2(1-q)^2}{1-qh^2}}
\]
Similarly for the alternate index parameterization:
\[ I_i' = b_Q' \hat{g}_{Qi} + b_P' P_i' \]
and
\[
r_{g,I'} = \sqrt{\frac{b'G}{\sigma_g^2}}
= \sqrt{\begin{bmatrix} 1 & h_{pol}^2 \end{bmatrix} \begin{bmatrix} q \\ 1-q \end{bmatrix}}
= \sqrt{q + h_{pol}^2 (1-q)}
\]
Using $h_{pol}^2 = \dfrac{h^2(1-q)}{1-qh^2}$ it can easily be shown that $r_{g,I'} = r_{g,I}$, i.e. the two indexes are equivalent.
Assuming equal selection in males and females, with selection intensity i, response to selection
can be predicted as:
RMAS = i rg,I σg
Response to phenotypic selection without QTL information is:
RP = i rg,P σg
With rg,P = h, the efficiency of selection using marker information, defined as response to MAS
relative to response without marker information, is given by:
\[ E = \frac{R_{MAS}}{R_P} = \frac{r_{g,I}}{r_{g,P}} = \sqrt{\frac{q}{h^2} + \frac{(1-q)^2}{1-qh^2}} \]
An equivalent equation can be derived using the alternate index I’:
\[ E = \frac{R_{MAS}}{R_P} = \frac{r_{g,I'}}{r_{g,P}} = \frac{1}{h}\sqrt{q + h_{pol}^2 (1-q)} \]
Figure 12.1 shows the impact of heritability and proportion of variance explained by the
molecular score on efficiency of MAS. This Figure shows that MAS will be most beneficial for
traits with low heritability and when the molecular score explains a large proportion of the
genetic variance.
Figure 12.1. Efficiency of MAS relative to phenotypic selection (scale 1 to 5), plotted against the
fraction of variance associated with molecular score (q, from 0 to 1), for h2 = 0.05, 0.10, 0.25,
0.50, 0.75, and 1.00. [Figure not reproduced.]
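The curves in Figure 12.1 follow directly from the accuracy and efficiency formulas for the index of molecular score and own phenotype. The sketch below tabulates them for single-generation selection, assuming normality as in the derivation; function names are illustrative, and the combination h2 = 1 with q = 1 is excluded because the formula divides by 1 - q h2.

```python
import math


def accuracy_mas(h2, q):
    """r_gI = sqrt(q + h2 (1 - q)^2 / (1 - q h2)); requires q * h2 < 1."""
    return math.sqrt(q + h2 * (1.0 - q) ** 2 / (1.0 - q * h2))


def accuracy_mas_alt(h2, q):
    """Same accuracy from the alternate index: sqrt(q + h2_pol (1 - q))."""
    h2_pol = h2 * (1.0 - q) / (1.0 - q * h2)
    return math.sqrt(q + h2_pol * (1.0 - q))


def efficiency(h2, q):
    """E = r_gI / r_gP, with r_gP = h for selection on own phenotype."""
    return accuracy_mas(h2, q) / math.sqrt(h2)


q_values = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
print("h2     " + "  ".join(f"q={q:.1f}" for q in q_values))
for h2 in (0.05, 0.10, 0.25, 0.50, 0.75):
    row = "  ".join(f"{efficiency(h2, q):5.2f}" for q in q_values)
    print(f"{h2:<6.2f} {row}")
```

At q = 0 the efficiency is 1 for every heritability, and at q = 1 it equals 1/h, which gives about 4.5 for h2 = 0.05, in line with the upper end of the efficiency scale in the figure.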
Similar procedures, using selection index theory, can be used to derive accuracy and efficiency
of MAS for more complex EBV that use information from relatives and/or multiple traits (see
Lande and Thompson, 1990). Efficiency of such indexes is approximately equal to those
illustrated in Figure 12.1, but with h2 replaced by accuracy squared, r2. This shows that, in
general, for a given proportion of variance explained by QTL, MAS will be most efficient for
cases in which regular selection is relatively ineffective. This includes traits with low heritability,
sex-limited traits, traits that are observed late in life (after selection), and traits that require
sacrificing the animal to observe phenotype (e.g. carcass quality traits).
There are several important limitations to the selection index derivations and results presented in
this section. First, it is important to note that the derived accuracy and selection response and
efficiency assume normality of both phenotypes and molecular scores. Molecular scores will
clearly not be normally distributed if only a few QTL are included. But even in that case, derived
accuracies and efficiencies of MAS will be reasonable approximations if q is not too large and
most emphasis is on phenotype, such that the index is still approximately normal. Note, however,
that the index itself does not require normality and will be optimal (i.e. result in maximal
accuracy and response in additive genetic values from the current to the next generation), even if
molecular scores are not normally distributed, as long as the QTL are additive (see Dekkers 1999
for optimal QTL breeding values with dominance).
In addition, results apply only to selection over a single generation. Response over multiple
generations must accommodate changing variances. Changes in variance due to the Bulmer
effect can be accommodated in selection index derivations and response calculations using the
procedures developed in Chapter 5. However, another important factor to consider here,
especially if selection is on a limited number of QTL of sizeable effect, is the change in variance
associated with molecular score as a result of changes in gene frequencies. Accommodating
changes in gene frequencies requires additional theory, which will be presented in the next
section.
12.2.2 Mixture distribution approach to predicting response to marker-assisted selection
Consider a population of infinite size with discrete generations, selection of fractions Qs and Qd of
males and females, and random mating of selected parents. Selection is for a quantitative trait
affected by an identified QTL (i.e., not marked) and additive polygenic effects. The QTL has two
alleles (B and b). Genotypes BB, Bb, bB and bb, where the first letter indicates the allele received
from the sire, are denoted by m = 1, 2, 3, and 4. To simplify optimization procedures, it is assumed
that genotypes Bb and bB can be distinguished, although this may not always be possible in
practice. The genotypic value of QTL genotype m is denoted by qm, with q1 = a, q2 = q3 = d, and q4 = −a, following Falconer and Mackay (1996).
Polygenic effects are assumed to follow the infinitesimal genetic model (Falconer and Mackay,
1996). Let σp’ and $h_{pol}^2$ denote the phenotypic SD and heritability of the trait within QTL genotype.
Both parameters are assumed constant over generations; the effect of selection on polygenic
variance (Bulmer, 1980) is ignored. Alternatively, it can be assumed that the population has been
under selection for several generations and that the polygenic variance is the stabilized variance
with gametic phase disequilibrium.
Let pst and pdt be the frequencies of allele B among paternal and maternal gametes that produce
generation t or, equivalently, the frequencies of B among sires and dams that are selected for
breeding in generation t-1. The frequency of B in generation t then is equal to (pst+pdt)/2. Table 12.3
shows the resulting QTL genotype frequencies under random mating of selected parents and
summarizes the notation used.
Let AsBt and Asbt be the mean polygenic breeding values of paternal gametes that form generation t
and that carry allele B and b, respectively. This formulation allows for gametic phase disequilibrium
between the QTL and polygenes (Dekkers and van Arendonk, 1998). Mean polygenic breeding
values of maternal gametes are similarly denoted by AdBt and Adbt. The mean polygenic breeding
value by genotype class is denoted by u mt and is the sum of mean polygenic breeding values of the
paternal and maternal gametes (Table 12.3). The resulting mean total genotypic value of genotype
class m in generation t is then equal to qm+ u mt (Table 12.3). Note that, although genotype classes
Bb and bB have the same QTL value (d), they can differ in mean polygenic breeding value due to
gametic phase disequilibrium.
Weighting the genotypic mean of each genotype class by its frequency, the mean total genotypic
value of the population in generation t, g t , is given by:
g t = (pst+pdt −1)a + (pst+pdt −2pstpdt)d + pstAsBt + (1−pst)Asbt + pdtAdBt + (1−pdt)Adbt
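As a quick numerical check of this expression, the sketch below computes the four genotype frequencies under random mating of selected parents and confirms that the frequency-weighted class means reproduce the closed form. The function names and the input values are illustrative assumptions, not part of the chapter.

```python
def genotype_frequencies(p_s, p_d):
    """Frequencies of BB, Bb, bB, bb (paternal allele written first)."""
    return [p_s * p_d, p_s * (1 - p_d), (1 - p_s) * p_d, (1 - p_s) * (1 - p_d)]


def mean_genotypic_value(p_s, p_d, a, d, AsB, Asb, AdB, Adb):
    """Population mean as the frequency-weighted sum of class means q_m + u_m."""
    freqs = genotype_frequencies(p_s, p_d)
    q = [a, d, d, -a]                                  # QTL genotypic values
    u = [AsB + AdB, AsB + Adb, Asb + AdB, Asb + Adb]   # class polygenic means
    return sum(f * (q_m + u_m) for f, q_m, u_m in zip(freqs, q, u))


def mean_genotypic_value_closed_form(p_s, p_d, a, d, AsB, Asb, AdB, Adb):
    """The same mean, written as the closed-form expression in the text."""
    return ((p_s + p_d - 1) * a + (p_s + p_d - 2 * p_s * p_d) * d
            + p_s * AsB + (1 - p_s) * Asb + p_d * AdB + (1 - p_d) * Adb)


# Hypothetical inputs, chosen only for illustration.
example = dict(p_s=0.3, p_d=0.2, a=0.5, d=0.1,
               AsB=-0.05, Asb=0.02, AdB=-0.03, Adb=0.01)
print(mean_genotypic_value(**example), mean_genotypic_value_closed_form(**example))
```

The two printed values should agree up to floating-point rounding.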
Table 12.3. Summary of notation used for selection on a QTL with two alleles (B and b) in generation t

Genotype  No  Genotype         Mean polygenic      Mean genetic  Mean BV, deviated     Prop. selected  Index  B gamete production,  Selection
              frequency (1)    breeding value (2)  value         from genotype Bb (3)  in sex j        wts    fraction              differential (4)
BB        1   pstpdt           u 1t=AsBt+AdBt      a+ u 1t       αt+ u 1t− u 2t        fj1t            bj1t   1                     ij1t σj
Bb        2   pst(1-pdt)       u 2t=AsBt+Adbt      d+ u 2t       0                     fj2t            0      ½                     ij2t σj
bB        3   (1-pst)pdt       u 3t=Asbt+AdBt      d+ u 3t       u 3t− u 2t            fj3t            bj3t   ½                     ij3t σj
bb        4   (1-pst)(1-pdt)   u 4t=Asbt+Adbt      -a+ u 4t      −αt+ u 4t− u 2t       fj4t            bj4t   0                     ij4t σj

(1) pst and pdt are frequencies of allele B among selected sires and dams that are used to produce
generation t.
(2) u mt is the mean polygenic breeding value of individuals of genotype m in generation t; AjBt and
Ajbt are the mean polygenic values of gametes from sex j that carry allele B or b and are used to
produce generation t.
(3) αt=a+(1−pst−pdt)d is the standard QTL allele substitution effect in generation t (Falconer and
Mackay, 1996).
(4) σj is the SD of estimates of polygenic breeding values for sex j; i denotes selection intensity.
12.2.2.1 Selection Model with QTL Information
With QTL genotype assumed known before selection, information available for selection can be
obtained from genetic evaluation with a BLUP animal model with QTL genotype included as a
fixed effect (Kennedy et al., 1992; Israel and Weller, 1998). Such a model results in estimates of the
QTL effects ( q̂ m) and in estimates of individual polygenic breeding values, ûimt for individual i of
genotype class m in generation t. Let rs and rd denote the accuracy of resulting polygenic EBV for
males and females. In a large population, estimates of QTL effects will be known without error,
which is what will be used here: q̂ m=qm.
Following Falconer and Mackay (1996), breeding values at the QTL in generation t, when deviated
from the breeding value of the heterozygote, are equal to –αt, 0 and +αt for genotypes bb, Bb (=bB)
and BB (Dekkers, 1999), where αt is the standard QTL substitution effect and equal to
αt=a+(1−pst−pdt)d (Falconer and Mackay, 1996). Note that the standard QTL substitution effect for
generation t is derived using the allele frequency in generation t (=½(pst+pdt)). Adding average
polygenic breeding values, the mean total breeding value of individuals of genotype m in generation
t, deviated from the mean breeding value of Bb individuals (m=2), is equal to:
g mt = nm[a+(1−pst−pdt)d] + ( u mt – u 2t)
where indicator variable nm is equal to +1, 0, 0, and –1 for m equal to BB, Bb, bB, and bb,
respectively (see Table 12.3). In practice, mean polygenic breeding values by genotype class, u mt,
can be estimated as the average estimated polygenic breeding value by genotype class. For a large
population, these estimates can be assumed known without error, which is what will be used here:
u mt= û mt
Resulting values can be used to compute the following selection criterion that combines the mean
breeding value of the QTL genotype, g mt , with the individual’s polygenic breeding value estimate
( ûijmt), which is deviated from the mean polygenic breeding value of genotype class m ( u mt):
Iijmt = bjmt g mt + ( ûijmt- û mt)
where bjmt is the weight given to the QTL breeding value for individuals of sex j of genotype m
in generation t.
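The two quantities that enter the index can be sketched as follows. The sketch takes the indicator as +1 for BB, 0 for both heterozygotes, and −1 for bb, which matches the deviations listed in Table 12.3. Function names and numerical inputs are hypothetical.

```python
def qtl_breeding_values(a, d, p_s, p_d, u):
    """Mean total breeding value of each genotype class, deviated from Bb.

    u holds the mean polygenic breeding values of BB, Bb, bB, bb.
    """
    alpha = a + (1 - p_s - p_d) * d   # allele substitution effect in generation t
    n = [1, 0, 0, -1]                 # +1 for BB, 0 for Bb and bB, -1 for bb
    return [n_m * alpha + (u_m - u[1]) for n_m, u_m in zip(n, u)]


def index_value(b_m, g_m, ebv, class_mean_ebv):
    """I = b * g_m + (individual polygenic EBV - class mean polygenic EBV)."""
    return b_m * g_m + (ebv - class_mean_ebv)


# Hypothetical example: a = 0.5, d = 0.2, B frequency 0.1 in both sexes,
# no polygenic differences between classes.
g = qtl_breeding_values(a=0.5, d=0.2, p_s=0.1, p_d=0.1, u=[0.0, 0.0, 0.0, 0.0])
print(g)
print(index_value(b_m=1.0, g_m=g[0], ebv=0.3, class_mean_ebv=0.1))
```

With a weight of one the index is simply the estimated total breeding value deviated from the Bb class mean; other weights rescale only the QTL term.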
Selection on this index involves truncation selection across the four genotype classes, as
illustrated in Figure 12.2.
The index value for each genotype class m is assumed to follow a normal distribution with mean
bjmt g mt and SD equal to the SD of polygenic EBV within genotype class, which is equal to
σj=rjσpol, where σpol is the polygenic SD, and rj is the accuracy of polygenic EBV for sex j. The
polygenic standard deviation, σpol, is assumed to be constant over generations and equal to hpolσp’,
where σp’ is the phenotypic standard deviation adjusted for the QTL effect. For known parameters
of the four distributions, the unique truncation point that results in the correct proportion selected
(Qs for males and Qd for females) can be determined numerically. The bisection method described
in Chapter 3 can be used for this purpose.
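A minimal sketch of such a numerical search is given below. It is a generic bisection over a mixture of normal distributions with a common SD and is not necessarily identical to the Chapter 3 routine. The function names, tolerance, and example values are assumptions for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def proportion_selected(trunc, means, sd, freqs):
    """Overall fraction above trunc in a mixture of normals with common SD."""
    return sum(f * (1 - STD_NORMAL.cdf((trunc - m) / sd))
               for f, m in zip(freqs, means))


def truncation_point(means, sd, freqs, q, tol=1e-10):
    """Bisection search for the index value that leaves a fraction q selected."""
    lo = min(means) - 10 * sd   # essentially everything is selected here
    hi = max(means) + 10 * sd   # essentially nothing is selected here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The selected fraction decreases as the truncation point rises.
        if proportion_selected(mid, means, sd, freqs) > q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: class index means for BB, Bb, bB, bb, B frequency 0.1
# in both sexes, SD of polygenic EBV 0.4, and 20% selected.
means = [0.5, 0.0, 0.0, -0.5]
freqs = [0.01, 0.09, 0.09, 0.81]
t = truncation_point(means, sd=0.4, freqs=freqs, q=0.2)
print(t, proportion_selected(t, means, 0.4, freqs))
```

Because the selected fraction is strictly decreasing in the truncation point, the solution is unique and bisection converges to it.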
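Figure 12.2. Truncation selection across distributions of index values for QTL genotypes
[Figure: one normal distribution of index values for each genotype class (bb, bB, Bb, BB), each with its class mean bjmt g mt, standardized truncation point xjmt, and fraction selected fjmt above the common truncation point.]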
Let xjmt, fjmt, and ijmt be the standardized truncation point, proportion selected, and selection intensity
(Falconer and Mackay, 1996) for genotype class m of sex j at generation t when truncating across
the four distributions (Table 12.3). With xjmt obtained from the unique point of truncation across the
four distributions (Figure 12.2), fjmt, and ijmt can be approximated under the assumption of normality
of polygenic EBV. The expected frequency of B among paternal (j=s) and maternal (j=d) gametes
that form the next generation can then be derived based on the proportion of B gametes produced by
each genotype (Table 12.3) as:
pj,t+1 = [pstpdtfj1t + ½pst(1-pdt)fj2t + ½(1-pst)pdtfj3t]/Qj
where Qj is the total proportion selected for sex j.
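The step from a common truncation point to fractions selected, selection intensities, and the new allele frequency can be sketched as follows. Normality within genotype class is assumed, as in the text; the function names and inputs are hypothetical, and the total proportion selected is recomputed from the class fractions rather than passed in.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def class_fraction_and_intensity(trunc, mean, sd):
    """Fraction selected f and selection intensity i for one genotype class."""
    x = (trunc - mean) / sd            # standardized truncation point
    f = 1 - STD_NORMAL.cdf(x)
    i = STD_NORMAL.pdf(x) / f if f > 0 else 0.0
    return f, i


def next_b_frequency(p_s, p_d, f):
    """Frequency of B among gametes from the selected parents of one sex.

    f holds the fractions selected from BB, Bb, bB, bb for that sex.
    """
    freqs = [p_s * p_d, p_s * (1 - p_d), (1 - p_s) * p_d, (1 - p_s) * (1 - p_d)]
    q = sum(fr * f_m for fr, f_m in zip(freqs, f))   # total fraction selected
    b_share = [1.0, 0.5, 0.5, 0.0]                   # fraction of B gametes per class
    return sum(s * fr * f_m for s, fr, f_m in zip(b_share, freqs, f)) / q


# Hypothetical example: equal selection from all classes leaves the
# frequency at the average of the two parental gamete frequencies.
print(next_b_frequency(0.3, 0.1, [0.2, 0.2, 0.2, 0.2]))
print(class_fraction_and_intensity(trunc=0.0, mean=0.0, sd=1.0))
```

When a larger fraction is selected from BB than from bb, the numerator gains weight relative to Qj and the frequency of B rises.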
Following Falconer and Mackay (1996), ijmtσj is the selection differential and genetic superiority for
polygenic breeding values of selected parents of sex j of genotype m, which is deviated from the
mean polygenic breeding value of all selection candidates of genotype m (Table 12.3). Expected
mean polygenic breeding values of B and b gametes that form the next generation (t+1) can then be
computed separately for paternal and maternal gametes as:
AjB,t+1 = ½[fj1tpstpdt( u 1t+ij1tσj) + ½fj2tpst(1-pdt)( u 2t+ij2tσj) + ½fj3t(1-pst)pdt( u 3t+ij3tσj)]/(Qj pj,t+1)
Ajb,t+1 = ½[½fj2tpst(1-pdt)( u 2t+ij2tσj) + ½fj3t(1-pst)pdt( u 3t+ij3tσj) + fj4t(1-pst)(1-pdt)( u 4t+ij4tσj)]/[Qj(1-pj,t+1)]
Note that these equations are based on standard polygenic selection theory (Falconer and Mackay,
1996), by summing polygenic means of B or b gametes that are produced by parents that are
selected from each genotype, weighted by their relative frequencies. The normalizing constant
Qjpj,t+1 derives from the fact that the sum of weights in the equation for B is equal to Qjpj,t+1.
Similarly, the sum of weights in the equation for b is equal to Qj(1-pj,t+1).
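The two gamete-mean equations can be sketched in a few lines. The normalizing constants are computed directly as the sums of the weights, which equal Qjpj,t+1 and Qj(1-pj,t+1). Function and argument names are hypothetical, and the sketch assumes that both alleles are still present among the selected parents.

```python
def next_gamete_means(p_s, p_d, f, i, u, sigma_j):
    """Mean polygenic values of B and b gametes from selected parents of one sex.

    f, i, u hold the fraction selected, selection intensity, and mean polygenic
    breeding value for genotypes BB, Bb, bB, bb; sigma_j is the SD of polygenic
    EBV for this sex.
    """
    freqs = [p_s * p_d, p_s * (1 - p_d), (1 - p_s) * p_d, (1 - p_s) * (1 - p_d)]
    b_share = [1.0, 0.5, 0.5, 0.0]   # fraction of gametes that carry B
    # Weight of each class among B and b gametes from selected parents.
    w_B = [s * fr * f_m for s, fr, f_m in zip(b_share, freqs, f)]
    w_b = [(1 - s) * fr * f_m for s, fr, f_m in zip(b_share, freqs, f)]
    # Mean polygenic breeding value of the selected parents in each class.
    selected_mean = [u_m + i_m * sigma_j for u_m, i_m in zip(u, i)]
    # A gamete carries half of the parent's polygenic breeding value.
    A_B = 0.5 * sum(w * s for w, s in zip(w_B, selected_mean)) / sum(w_B)
    A_b = 0.5 * sum(w * s for w, s in zip(w_b, selected_mean)) / sum(w_b)
    return A_B, A_b


# Hypothetical example: no selection within class (i = 0) and class means
# that differ, which carries over into a difference between B and b gametes.
print(next_gamete_means(0.5, 0.5, [1, 1, 1, 1], [0, 0, 0, 0], [2.0, 0.0, 0.0, -2.0], 0.5))
```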
This is a general procedure for modeling selection on a quantitative trait that is affected by a
QTL and can be easily extended to selection on multiple QTL by increasing the number of
genotype classes (see Chakraborty et al. 2002). This procedure can be used to model what will be
referred to as standard index GAS by setting all weights bjmt equal to one, which results in an
index that is equivalent to the index that was derived using selection index theory. This index
maximizes response from the current to the next generation for additive QTL, as shown by
Dekkers and van Arendonk (1998). This formulation also allows index weights to be derived for
what will be referred to as optimal index GAS (see later), which aims to maximize response to
selection over multiple generations. Finally, this procedure also allows approximation of
selection without QTL information by setting weights bjmt = rj². If the QTL is non-additive, the
first term of the index, bjmt g mt, must then be replaced by rj²(qm + u mt), because with GAS, QTL effects
are based on allele substitution effects, whereas the genotypic effect is reflected in the
phenotype.
The model does not specifically allow consideration of marked QTL. However, assuming
recombination rates are small, it does provide a good approximation to LD-MAS. In that case, QTL
alleles are replaced by marker haplotypes.
12.3 Two-stage vs. index selection on QTL
The following figures compare responses from two-stage GAS, in which selection is on QTL
genotype in stage 1, followed by selection on phenotype, to responses from standard index GAS.
Phenotypic selection, without use of QTL information, and optimal index GAS (see later), are
included for comparison also. The example is for selection on a biallelic additive QTL with
effect a=0.5σp and starting frequency 0.1 for a trait with polygenic heritability 0.25 and selection
of 10% of males and 25% of females. Results are from the deterministic mixture distribution
model. For two-stage GAS, this was implemented by first selecting BB individuals, followed by
individuals with the Bb or bB genotype (equal proportions) if there were not enough BB animals,
and by bb individuals if still additional animals were required. For the last selected genotype,
individuals with the highest phenotypic value were selected until the required overall proportion
selected was obtained.
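The first stage of this two-stage scheme can be sketched as a simple fill-in-order rule. The sketch returns only the fraction taken from each genotype class; it reads "equal proportions" as the same fraction from Bb and bB, which is an assumption, and it leaves the phenotypic ranking within the last class aside. Names and inputs are hypothetical.

```python
def two_stage_fractions(p_s, p_d, q):
    """Fractions selected from BB, Bb, bB, bb when genotype classes are filled
    in order of merit until the overall selected proportion q is reached."""
    freqs = [p_s * p_d, p_s * (1 - p_d), (1 - p_s) * p_d, (1 - p_s) * (1 - p_d)]
    # Ranking of classes: BB first, then both heterozygotes together, then bb.
    groups = [[0], [1, 2], [3]]
    fractions = [0.0] * 4
    remaining = q
    for group in groups:
        if remaining <= 1e-12:          # required proportion already reached
            break
        group_freq = sum(freqs[m] for m in group)
        if group_freq <= 0:
            continue
        take = min(1.0, remaining / group_freq)   # same fraction from each class in the group
        for m in group:
            fractions[m] = take
        remaining -= take * group_freq
    return fractions


# Example settings from the text for males: starting frequency 0.1, 10% selected.
print(two_stage_fractions(0.1, 0.1, 0.10))
```

With these settings all BB animals and about half of the heterozygotes are needed, so no bb animals are selected in the first generation.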
Results (see figures below) show that two-stage GAS resulted in more rapid fixation of the QTL
than standard index GAS, but at a cost to polygenic response. Cumulative responses and
cumulative discounted response ($\mathrm{CDR} = \sum_{t=1}^{10} \bar{g}_t/(1+\rho)^t$, with ρ = 10% interest) from two-stage GAS
were, however, lower than responses from standard index GAS. The reason for this is that two-stage
selection removes some individuals that have high polygenic EBV in the first stage, which
are selected with index GAS because their high polygenic effects more than offset the fact that
they have an unfavorable QTL genotype. This is akin to multiple-trait selection using
independent culling levels versus index selection.
[Figure: four panels plotted against generation for optimal GAS, standard GAS, phenotypic selection, and 2-stage GAS: frequency of the QTL; genetic gain; cumulative response deviated from phenotypic selection; and CDR deviated from phenotypic selection.]
For comparison, results from phenotypic and optimal index GAS are shown also. Both two-stage
and standard GAS ultimately resulted in lower response and CDR than phenotypic selection –
the reason for this will be explained later. Optimal GAS had greater CDR than phenotypic
selection across the planning horizon.
The next figure shows lost cumulative response from two-stage compared to standard GAS after
1, 2, 3 generations, and lost CDR over 10 generations, for QTL of different effects. Results show
that lost response from two-stage GAS was greatest for small QTL and reduced to zero for large
QTL. In fact, two-stage GAS can be modeled using the mixture distribution by giving a very
large effect to the QTL, which ensures that all individuals with the favorable QTL genotype are
selected.
[Figure: 2-stage vs. index GAS (p0 = 0.1). Selected proportions: sires = 0.1, dams = 0.25. Accuracy: rs = 0.8, rd = 0.5. Response lost (%) plotted against QTL effect (a in σg) for trait response after generations 1, 2, and 3 and for CDR over 10 generations.]
Whereas the previous results were simulated for a single trait, they also apply to selection for an
aggregate multiple-trait breeding goal. In that case, the effect of the QTL is expressed relative to
the genetic standard deviation for the breeding goal and selection accuracies are those for the
multiple-trait index as a predictor of the aggregate genotype (rHI).
Note that the results presented above not only apply to selection on QTL for quantitative traits
but also for genes associated with single gene traits, such as genes associated with genetic
defects, appearance (horns, color), and diseases. Although it is often difficult to assign an
economic value to such genes, the above results show that simply culling all carriers of a genetic
defect results in lost response for other traits, because selection emphasis is diverted. This occurs
even if the gene has no direct negative effect (pleiotropy) on the other traits. Thus, it is important
to attempt to assess an economic value for such single-gene traits. This economic value could
account for the lost marketability of breeding stock that are carriers of the genetic defect. With
regard to the use of carriers, one point to note is that occurrence of homozygous recessive
progeny can be avoided by careful mating. Thus, there is often no valid reason to absolutely
avoid use of carrier animals in breeding.
In the two-stage selection procedure used above, selection was first on molecular data and then
on phenotype. There may be benefit in reversing this order, so that selection on the index that
includes marker information follows a first stage of selection on phenotype-based EBV; only
individuals that are selected in the first stage would then need to be genotyped, which would save
costs.
12.4 Long-term Response to Selection with QTL Information
To examine longer-term responses to selection, Fig. 12.3 illustrates responses to selection on
phenotype and to standard index GAS based on the mixture distribution model depicted for an
example situation. For illustrative purposes, the example reflects a QTL of very large effect (the
difference between homozygotes is 2a = 1.5σp’). Similar trends are observed for QTL of smaller
effect, although the differences between phenotypic selection and GAS are smaller.
Fig. 12.3. Responses to standard GAS and phenotypic selection based on the deterministic model in a
population of infinite size. Selection is of the top 20% of males and females for a trait controlled
by a biallelic additive QTL and polygenes. The QTL has effect a = 1 phenotypic standard
deviation and frequency 0.1. Polygenic heritability is 0.25. The main graph shows cumulative
total response to selection, expressed in polygenic standard deviations (σpol); b) frequency of the
favorable QTL allele; and c) polygenic response per generation. (From Dekkers & Settar, 2003)
[Figure: main graph of genetic value (σpol) against generation (0 to 30) for phenotypic selection and standard GAS, with panels b) QTL frequency and c) polygenic response (σpol), both against generation.]
Figure 12.3 clearly shows the extra response from GAS during early generations. By generation
5, however, cumulative response from phenotypic selection exceeds that from GAS. As
expected, GAS fixes the QTL at a faster rate than phenotypic selection (Fig. 12.3b). The
increased selection emphasis on the QTL, however, results in lower response in polygenes (Fig.
12.3c). Although polygenic response per generation returns to maximum as soon as the QTL is
fixed, i.e. sooner for GAS than for phenotypic selection, the extra polygenic response that is lost
in early generations with GAS is never regained in later generations, which is the reason for the
lower cumulative response for GAS in the longer term.
Results illustrated in Fig. 12.3 are based on several simplifying assumptions for the polygenic
component of the genetic model; (a) the infinitesimal model for polygenes, i.e. an infinite
number of polygenes of small effect; (b) large population size, i.e. no inbreeding or drift; and (c)
genetic variance contributed by polygenes remains constant over generations, i.e. no gametic
phase disequilibrium among polygenes (Bulmer 1980). The deterministic model does account for
the gametic phase disequilibrium between the QTL and polygenes that is induced by
simultaneous selection on the QTL and polygenes (Dekkers and van Arendonk 1998). This is
reflected in a negative association between the QTL and polygenes, such that individuals with a
(un)favorable QTL genotype tend to have poorer (better) polygenic breeding values. The
creation of this negative association by selection is illustrated in Fig. 12.2 by noting that
individuals with a BB genotype are less intensely selected for polygenes than individuals with a
bb genotype for the QTL. A negative association is created by both phenotypic selection and
GAS but is larger for GAS because of the greater emphasis on the QTL (Dekkers and van
Arendonk 1998).
Despite the simplifying assumptions of the deterministic model, the results illustrated in Fig.
12.3 have been reproduced in several studies by stochastic simulation (e.g. Larzul et al. 1997; Pong-Wong
and Woolliams 1998). A stochastic model simulates individuals in the population under
selection, rather than population distributions, and does not require many of the assumptions that
are inherent to the deterministic model depicted in Fig. 12.2. Typical results from such stochastic
simulations are demonstrated in Fig. 12.4, which represents the results of simulating selection in
a population of 250 males and 250 females, with 20% selected for each sex. Three different
genetic models were used for polygenes: the infinitesimal genetic model and models in which the
polygenic component is simulated by 50 or 10 individual loci. Results for the stochastic model
were averaged over 500 replicate simulations.
Fig. 12.4 focuses on the difference in cumulative responses between GAS and phenotypic
selection over generations, rather than the absolute responses illustrated in Fig. 12.3. For a given
method of selection (GAS or phenotypic selection), absolute cumulative responses to selection
(not shown) differed between genetic models; responses were greatest for the deterministic
model, followed by the infinitesimal model, and the finite locus models with 50 and 10
polygenes. Average rates of change in frequency of the QTL were very similar between genetic
models (results not shown).
For the infinitesimal model, differences in response between GAS and phenotypic selection were
very similar for the stochastic model and the deterministic model (Fig. 12.4). In contrast to the
deterministic model, the stochastic model accommodates reductions in polygenic variance as a
result of the Bulmer effect and inbreeding (Fig. 12.5). Under the stochastic model, however,
changes in polygenic variance were similar for GAS and phenotypic selection and would,
therefore, have limited impact on their contrast.
The finite locus model with 50 polygenes exhibited similar differences in cumulative response
between GAS and phenotypic selection as the infinitesimal model for the first 10 generations
(Fig. 12.4). In subsequent generations, GAS regained some of the response it had lost under the
finite locus model and differences with phenotypic selection decreased slightly. The recovery of
lost response under GAS was greater for the model with 10 polygenes; the difference with
phenotypic selection was reduced to 0.13 polygenic standard deviations by generation 15. This
behavior of the finite locus model is explained by the change in frequencies of polygenes. The
average frequency of polygenes is initially lower for GAS than phenotypic selection. As
frequencies move closer to 1, however, polygenic variance is depleted, and more rapidly so for
phenotypic selection than for GAS (Fig. 12.5). As a result, polygenic response with GAS is able
to catch up with polygenic response for phenotypic selection.
Figure 12.4. Cumulative responses to standard GAS as a deviation from cumulative response for
phenotypic selection (GAS response – phenotypic response, expressed in polygenic standard
deviations, σpol). Selection is of the top 20% males and females from 250 individuals per sex for
a trait controlled by a biallelic additive QTL and polygenes. The QTL has effect a = 1
phenotypic standard deviation and frequency 0.1. Polygenic heritability is 0.25. In addition to a
deterministic model, results are presented for three stochastic models with different models for
the polygenic component: the infinitesimal model, and finite locus models with 50 or 10
unlinked loci of equal effect but frequencies drawn from a uniform [0,1] distribution. Stochastic
simulation results are the average of 500 replicate simulations. (From Dekkers and Settar, 2003)
[Figure: GAS − phenotypic response (σpol) against generation (0 to 15) for the deterministic model, the infinitesimal model, and the finite locus models with 50 and 10 polygenes.]
In a large population with negligible inbreeding, all genes that affect the trait will ultimately be
fixed for their favorable allele under both GAS and phenotypic selection. Thus, ultimate
response will be the same for both strategies. In populations of limited size, however, ultimate
response will differ between strategies because of their impact on rates of fixation and loss of
polygenes. Thus, ultimate differences in response between GAS and phenotypic selection will
depend on the proportion of polygenes for which the favorable allele is lost. These differences
will, however, be small (Dekkers and Settar, 2003).
Figure 12.5. Polygenic variance (relative to polygenic variance in generation 0) under standard
GAS and phenotypic selection (Phen). Selection is of the top 20% males and females from 250
individuals per sex for a trait controlled by a biallelic additive QTL and polygenes. The QTL
has effect a = 1 phenotypic standard deviation and frequency 0.1. Polygenic heritability is
0.25. Results are presented for three models for the polygenic component: the infinitesimal
model, and finite locus models with 50 or 10 unlinked loci with equal effect but frequencies
drawn from a uniform [0,1] distribution. Results are the average of 500 replicate stochastic
simulations. (From Dekkers and Settar, 2003)
[Figure: polygenic variance relative to generation 0 against generation (0 to 15) under GAS and phenotypic selection (Phen), for the infinitesimal model and for the finite locus models with 50 and 10 polygenes.]
12.5 Optimizing QTL selection
The previous results demonstrate that GAS strategies that maximize response over a single
generation will not maximize response over more than one generation. The underlying reason is
that selection not only changes the population mean but also population parameters such as gene
frequencies and genetic variances. These changes in population parameters affect the amount of
response that can be made in subsequent generations. Thus, strategies that maximize response
over multiple generations must account for changes in population parameters that affect
subsequent responses to selection.
In the case of GAS and under the infinitesimal genetic model with constant polygenic variance,
the main population parameter that affects response to selection in subsequent generations is the
frequency of the QTL. Therefore, to develop strategies that maximize longer-term response to
selection, Dekkers and van Arendonk (1998) used the mixture distribution model described
previously to optimize weights in the previously described index of the genetic value for a single
known QTL ( g Qi ) and an EBV for polygenes:
Iijmt = bjmt g mt + ( ûijmt- û mt)
Index weights bjmt were allowed to differ by generation, sex, and QTL genotype. In reference to
Figure 12.2, changing weights on the QTL changes the means of the three distributions and,
thereby, the proportions selected from each genotype.
[Diagram: MAS strategies that maximize response over one generation may not maximize response over multiple generations. Selection in the current generation affects both the progeny mean and the genetic parameters (QTL frequency and variance); the genetic parameters in turn affect response in subsequent generations.]
Dekkers and van Arendonk (1998) used optimal control theory to derive the index weights that
maximized cumulative response after T generations. Optimal control theory utilizes the unique
structure of response to selection over generations, in that the optimal selection strategy for
generation t depends only on population parameters in generation t, i.e. polygenic means and
QTL frequency, and not on the path that led to these parameters (Dekkers and van Arendonk
1998). Manfredi et al. (1998) solved a similar problem using a more general optimization
method. Their method does not utilize the unique structure of selection over multiple generations
and requires more computing time. The approach of Dekkers and van Arendonk (1998) was
subsequently extended to multiple QTL by Chakraborty et al. (2002).
The general selection objective considered is to maximize a weighted sum of mean total genotypic
values by generation over a planning horizon of T generations:
$R = \sum_{t=1}^{T} w_t \bar{g}_t$
where wt is the relative emphasis on generation t in the overall objective. For an economic objective
function such as cumulative discounted responses (CDR), weights wt reflect discount factors and are
equal to wt = 1/(1+ρ)^t, where ρ is the rate of interest per generation. If the objective is to maximize
cumulative response after T generations, set wT = 1 and all other wt = 0.
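The two choices of weights can be sketched as follows. The function names and the sequence of generation means are hypothetical and serve only to show how the objective is evaluated.

```python
def discount_weights(rho, horizon):
    """Weights w_t = 1 / (1 + rho)**t for t = 1..horizon (CDR objective)."""
    return [1.0 / (1.0 + rho) ** t for t in range(1, horizon + 1)]


def terminal_weights(horizon):
    """Weights that place all emphasis on generation T (cumulative response)."""
    return [0.0] * (horizon - 1) + [1.0]


def objective(weights, g_bar):
    """R = sum over t of w_t * g_bar_t, with g_bar listed for t = 1..T."""
    return sum(w * g for w, g in zip(weights, g_bar))


# Hypothetical mean genotypic values for generations 1 to 3.
g_bar = [0.5, 1.0, 1.5]
print(objective(discount_weights(0.10, 3), g_bar))   # cumulative discounted response
print(objective(terminal_weights(3), g_bar))         # response in generation T only
```

Discounting shifts emphasis toward early generations, so a strategy that gains early and loses later can rank higher under CDR than under the terminal objective.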
The problem of maximizing this objective function can be stated as the following constrained
multi-stage non-linear optimization or optimal control problem (Lewis, 1986), using the notation
developed previously:
Given the polygenic means and gene frequencies in generation 0, i.e. AsB0, AdB0, Asb0, Adb0, ps0, and pd0,
$\max_{f_{jmt}} R = \sum_{t=1}^{T} w_t \bar{g}_t$
Subject to, for j=s,d and every t=0 to T-1:
Qj = pstpdtfj1t + pst(1-pdt)fj2t + (1-pst)pdtfj3t + (1-pst)(1-pdt)fj4t
pj,t+1 = [pstpdtfj1t + ½pst(1-pdt)fj2t + ½(1-pst)pdtfj3t]/Qj
AjB,t+1 = ½[fj1tpstpdt( u 1t+ij1tσj) + ½fj2tpst(1-pdt)( u 2t+ij2tσj) + ½fj3t(1-pst)pdt( u 3t+ij3tσj)]/(Qj pj,t+1)
Ajb,t+1 = ½[½fj2tpst(1-pdt)( u 2t+ij2tσj) + ½fj3t(1-pst)pdt( u 3t+ij3tσj) + fj4t(1-pst)(1-pdt)( u 4t+ij4tσj)]/[Qj(1-pj,t+1)]
This represents an optimization problem with 8T decision variables, fjmt, that must be optimized
subject to 8T equality constraints. The first set of constraints corresponds to the overall fraction
selected, the second set to changes in gene frequency, and the last two to changes in mean polygenic
values. The only variable related to polygenic effects in the equations is the SD of polygenic
EBV. Thus, the problem formulation and, therefore, its solutions, do not depend explicitly on
heritability or polygenic variance.
[Diagram: Finding optimal weights. Optimizing selection on QTL with optimal control theory (Dekkers and van Arendonk, 1998, Genetical Research). Selection criterion: I = b × QTL (E)BV + polygenic EBV. Control variables: index weights bj0, bj1, bj2, ..., bj,T-1. State variables: polygenic means At and QTL frequencies pt, which change by ∆A and ∆p in each generation. System output: g1, g2, ..., gT. Objective: optimize bjt to maximize response, Σ gt/(1+ρ)^t summed over t = 1 to T, subject to pt+1 = pt + ∆pt and At+1 = At + ∆At for t = 0, ..., T-1.]
The above formulation uses fractions selected from each distribution in each generation, fjmt, as
decision variables rather than index weights, bjmt. Dekkers and Van Arendonk (1998)
demonstrated how the truncation points, xjmt, that correspond to the fractions selected fjmt, can be
transformed to index weights on g mt:
bjmt = σj( xjmt − xj,Bb,t)/ g mt
Formulating the problem in terms of fractions selected is thus equivalent to formulating the
problem in terms of index weights. This can also be illustrated through Figure 12.2; changing the
index weights bjmt shifts the four distributions relative to each other. When truncating on the
index, changes in index weights therefore result in corresponding changes in truncation points
and in fractions selected from each distribution. Index weights for genotype Bb are equal to zero
because means are deviated from genotype Bb. Thus, the number of index weights in this
formulation is 6T (3 per sex per generation) compared to 8T variables fjmt. The difference in
number of variables results from the implementation of the first set of constraint equations,
which constrain the total fraction selected per sex.
Optimal fractions selected fjmt and, thereby, index weights that maximize the constrained objective
function can then be derived iteratively, using optimal control procedures, following Dekkers and
Van Arendonk (1998). Details are in Chakraborty et al. (2002).
Figure 12.6 illustrates the optimal weights assigned to the QTL in each generation for the
example situation when the objective was to maximize cumulative response over 30 generations.
Index weights were the same for males and females because selection intensities were the same
for both. Weights differed by generation and QTL genotype. Except for the final generations,
weights on the QTL were substantially lower than those used for standard GAS (bQ = 1 for all
generations) and phenotypic selection (bQ = h²pol for all generations). Optimal weights were
equal to 1 for the final generation because at that point the aim is to maximize response in the
next generation, equivalent to standard GAS. In generation 29, the optimal weight on the
unfavorable QTL genotype (bb) was extremely large.
Figure 12.6. Weights on the QTL with standard GAS, phenotypic selection, and optimal GAS. 20%
selection of males and females. a = 1σp, h² = 0.25.
[Figure: index weight against generation (0 to 30) for standard GAS, phenotypic selection, and optimal GAS (weights bBB and bbb).]
Figure 12.7. Cumulative total (closed symbols) and polygenic (open symbols) responses to standard and
optimal GAS, deviated from phenotypic selection, and b) frequency of the QTL for optimal GAS.
[Figure: MAS − phenotypic response (σpol) against generation (0 to 30) for optimal GAS and standard GAS, with panel b) QTL frequency against generation.]
Figure 12.7 depicts the resulting changes in cumulative total and polygenic responses for optimal
GAS as a deviation from responses to phenotypic selection. Frequencies of the QTL in each
generation are illustrated in Fig. 12.7b. Optimal GAS led to a much more gradual and almost
linear increase in frequency toward fixation at the end of the planning horizon. This is in contrast
to standard GAS and phenotypic selection (Figure 12.3b). As a result, cumulative response was
lower for optimal GAS than for phenotypic selection (Fig. 12.7) for the first 23 generations.
However, polygenic response was greater, which led to a 0.42 polygenic standard deviation
greater cumulative response by the end of the planning horizon, at which time the QTL was also
fixed under optimal GAS.
Realized selection intensities that were placed on the polygenes and the QTL in a generation,
which were computed as response generated in that generation divided by the standard deviation
of the selected component (= polygenic standard deviation for polygenic response and
= $\sqrt{2p(1-p)}$ for the QTL, where p is the gene frequency), are given in Figure 12.8. For optimal
GAS, selection intensity placed on the polygenes was remarkably constant over generations,
apart from the last generation. In contrast, selection pressure placed on polygenes was lower for
both standard GAS and phenotypic selection prior to fixation of the QTL. Patterns for selection
intensities placed on the QTL nearly mirrored those of intensity on polygenes for standard GAS
and phenotypic selection but were again nearly constant for optimal GAS, apart from the first and
last generation. The latter likely relates to the build-up of gametic phase disequilibrium between
the QTL and polygenes.
Figure 12.8. Standardized selection response (intensity) per generation for the QTL (closed
symbols) and polygenes (open symbols), for phenotypic selection, standard GAS, and optimal GAS.
Figure 12.9. Frequencies for 3 QTL under standard and optimal GAS, by generation. The QTL have
effects a (in phenotypic SD) and initial frequencies p: (a = 1/2, p = 0.1), (a = 1/4, p = 0.1),
and (a = 1/4, p = 0.3).
Results depicted in Figure 12.8 demonstrate that, to maximize cumulative response over a
planning horizon of T generations, selection emphasis on the QTL should be controlled in such a
manner that the QTL is close to fixation in generation T, while equal selection emphasis is
placed on polygenes across generations. Note that this optimal solution is by nature similar to
that obtained by Finney (1958) for selection across multiple stages. He also found that to
maximize cumulative response in a quantitative trait over multiple stages of selection, selection
efforts should be divided equally across stages. Unequal selection results in lower cumulative
response because of the non-linear relationship between proportion selected and selection
intensity (Falconer and MacKay 1996). The additional complication in the present context is that
the total selection emphasis that is applied to polygenes across all generations is not
predetermined but must be balanced against placing sufficient emphasis on the QTL, such that
the QTL frequency is moved to fixation in generation T. Results depicted in Fig. 12.8 show that
this is achieved by maintaining a nearly constant standardized selection emphasis on the QTL
over generations, apart from the first and last generations.
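The non-linear relationship between proportion selected and selection intensity can be illustrated with a minimal sketch. It assumes truncation selection on a normally distributed criterion and, as a deliberate simplification, treats two stages as independent so that their intensities simply add (the reduction in variance after the first stage is ignored). The function name and the example proportions are hypothetical.

```python
from statistics import NormalDist

def selection_intensity(p):
    """Standardized selection differential i = z / p for truncation
    selection of the top fraction p of a standard normal distribution."""
    nd = NormalDist()
    x = nd.inv_cdf(1.0 - p)   # truncation point
    return nd.pdf(x) / p      # height of the density at x, divided by p

# Two ways to reach the same overall fraction selected (1%) in two stages.
# Summing intensities is a simplification that ignores the loss of
# variance after the first stage.
equal = selection_intensity(0.10) + selection_intensity(0.10)
unequal = selection_intensity(0.05) + selection_intensity(0.20)
print(f"equal split: {equal:.3f}, unequal split: {unequal:.3f}")
```

Under these assumptions the equal split gives the larger summed intensity, which is the pattern behind dividing selection effort evenly across stages or generations.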
Trends in QTL frequencies with selection on 3 QTL under standard and optimal GAS to maximize
cumulative response over 20 generations are illustrated in Figure 12.9. With standard GAS, rates
of fixation depended on both effect and starting frequency of the QTL, and the QTL with the
larger effect was moved to fixation more quickly, as expected. The same was observed for
phenotypic selection (not shown), but rates of fixation were lower for all QTL. In contrast, with
optimal GAS, frequencies increased nearly linearly to reach near fixation at the end of the
planning horizon for all three QTL, regardless of the effect and the initial frequency of the QTL.
The rate of increase in frequency was determined by the initial frequency of the QTL, not by its
effect; the magnitude of the QTL effect had no impact on the selection emphasis that was placed
on the QTL in the optimal strategy.
Figures 12.10 and 12.11 show results from optimal GAS on a single QTL when the objective
was to maximize cumulative discounted response over 10 generations with a 10% interest rate.
Parameters included 10 and 25% selection among males and females, heritability 0.10, and QTL
parameters a = 0.2 p, d= ½a, and starting frequency 0.10. Optimal GAS resulted in 3.5% greater
CDR than standard GAS and in 8.6% greater CDR than phenotypic selection. Additional results
are in Chakraborty and Dekkers (2000).
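As a rough illustration of the objective, cumulative discounted response (CDR) can be sketched as the sum of the genetic levels in each generation, each discounted by the interest rate. This form is an assumption made here for illustration; the exact definition used by Chakraborty and Dekkers may differ. The two trajectories below are hypothetical and reach the same level in generation 10.

```python
def cumulative_discounted_response(genetic_levels, interest_rate):
    """Sum of genetic levels G_t for t = 1..T, each discounted by (1 + r)^-t.
    This discounting form is assumed for illustration."""
    return sum(g / (1.0 + interest_rate) ** t
               for t, g in enumerate(genetic_levels, start=1))

# Hypothetical genetic levels over 10 generations, same end point.
early = [min(1.0, 0.2 * t) for t in range(1, 11)]  # gains made early
late = [0.1 * t for t in range(1, 11)]             # linear gains

cdr_early = cumulative_discounted_response(early, 0.10)
cdr_late = cumulative_discounted_response(late, 0.10)
print(f"early: {cdr_early:.3f}, late: {cdr_late:.3f}")
```

Because earlier generations receive more weight, a discounted objective rewards strategies that realize gains sooner, which is why the optimal emphasis on the QTL differs from that for undiscounted cumulative response.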
Figure 12.10. Cumulative total and polygenic responses (genetic level by generation) for optimal
GAS, standard GAS, and phenotypic selection.
CDR over phenotypic selection, by QTL strategy: Standard +3.5%; Optimal +8.6%.
Figure 12.11. QTL frequencies by generation for optimal GAS, standard GAS, and phenotypic
selection.
12.5 Implementation of MAS for Genetic Improvement
Genetic improvement can be accomplished by within-breed selection and/or by capitalizing on
between breed differences. Both can be achieved to some degree through conventional selection
using phenotype but can be enhanced through the use of molecular information. In what follows,
we will first discuss opportunities for utilizing between-breed differences and then evaluate
strategies for within-breed selection.
12.5.1 Between-breed improvement
Molecular markers can be used to assist the integration of favorable genes (alleles) from multiple
breeds through marker-assisted introgression or MAS within crosses or synthetic lines.
Use of QTL information:
• Between-breed selection
• Within-breed selection
Use of between-breed QTL information (based on between-breed LD):
• Marker-assisted QTL introgression
• MAS in synthetic lines / recent crosses
12.5.1.1 Marker-assisted introgression
The aim of an introgression program is to introduce one or more genes (target genes) from a
breed that is superior for some genes or QTL but inferior for general performance (the donor
breed) into a high performance line that lacks the target genes (the recipient breed). This is done
through an initial F1 cross followed by multiple backcrosses to the recipient breed and one or
more generations of intercrossing. The aim of the backcross generations is to maintain the target
gene(s) while recovering the background genome of the recipient breed. The purpose of the
intercrosses is to fix the line for the target gene(s).
Effectiveness of introgression schemes is limited by the ability to identify backcross or intercross
individuals that carry the target gene(s) and by the ability to identify backcross individuals that
have a high proportion of the recipient genome, in particular in the region(s) around the target
gene(s). The latter affects the number of backcross generations required to recover the recipient
genome. Molecular genetics can enhance the effectiveness of both phases of an introgression
program. Effectiveness of the backcrossing phase can be increased in two ways: i) by identifying
carriers of the target gene(s) (foreground selection), and ii) by enhancing recovery of the recipient
genetic background (background selection). Effectiveness of the intercrossing phase can also be
enhanced through foreground selection on the target gene(s).
QTL Introgression Program: introgression of Q from donor (D, QQ) to recipient (R, qq) line
1) Cross: D (QQ) x R (qq) gives F1 (Qq)
2) Backcrossing to R, selecting Qq in each generation
   • recover R genome
   • maintaining QTL
3) Intercrossing to fix QTL, giving the improved line (QQ)

Cross        Progeny   Progeny     Freq.    Genotype     % of R genome
                       genotypes   of Q     selected     Mean     95% Range
D x R        F1        Qq          0.50     Qq           50       50 - 50
F1 x R       BC1       Qq/qq       0.50     Qq           75       66.6 - 83.4
BC1 x R      BC2       Qq/qq       0.50     Qq           87.5     80.7 - 94.3
BC2 x R      BC3       Qq/qq       0.50     Qq           93.75    88.8 - 98.8
BC3 x R      BC4       Qq/qq       0.50     Qq           96.88    93.5 - 100
.
.
BCn x BCn    IC1       QQ/Qq/qq    0.50     QQ (+Qq?)    99.?
IC1 x IC1    IC2       QQ/Qq/qq    >0.50    QQ (+Qq?)    99.?
.
.
ICk x ICk    ICk+1     QQ/Qq/qq    >0.50    QQ           99.?

Use of Markers in Introgression
1) Identify Qq individuals in BC
   • single marker
   • flanking markers
2) Speed recovery of background genome during BC (and IC)
   ➣ select on phenotype (among Qq BC animals) = traditional way
   ➣ select on markers spread over genome (among Qq BC's)
3) Selection among juveniles (short generation interval)

Introgression of Multiple QTL
• Large #'s needed to find BC's that are heterozygous for ALL QTL
• Gene pyramiding: Q1 x Q2 and Q3 x Q4, then Q1+Q2 x Q3+Q4, giving Q1+Q2+Q3+Q4
Effectiveness of foreground selection depends on the number of target genes and on the
confidence interval for the position of those genes. The latter determines the size of the genomic
region that must be introgressed. Both factors have a large impact on the number of individuals
that is required to find individuals that are carriers for all target genes during the backcrossing
phase and homozygous during the intercrossing phase. For the introgression of multiple target
genes, gene pyramiding strategies can be used during the backcrossing phase to reduce the
number of individuals required (Hospital and Charcosset 1997, Koudandé et al. 2000).
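The effect of the number of target genes on the population size required can be sketched under simplifying assumptions: target loci are unlinked, genotypes at the target genes are identified without error, and uncertainty about gene position is ignored. Under these assumptions a backcross offspring (Qq x qq) is a carrier at a given locus with probability 1/2, and an intercross offspring (Qq x Qq) is homozygous QQ with probability 1/4. The function names and the 95% threshold are hypothetical choices for illustration.

```python
import math

def prob_desired_genotype(n_genes, phase):
    """Probability that one offspring has the desired genotype at all
    n_genes target loci, assuming unlinked loci and error-free genotypes."""
    per_locus = {"backcross": 0.5, "intercross": 0.25}[phase]
    return per_locus ** n_genes

def offspring_needed(n_genes, phase, confidence=0.95):
    """Smallest number of offspring for which the probability of obtaining
    at least one individual with the desired genotype is >= confidence."""
    p = prob_desired_genotype(n_genes, phase)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

for k in (1, 2, 3, 4):
    print(k, offspring_needed(k, "backcross"), offspring_needed(k, "intercross"))
```

Requirements grow geometrically with the number of target genes, and more steeply in the intercross phase, which is the motivation for gene pyramiding strategies.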
The use of molecular markers in background selection involves estimating the proportion of the
recipient genome on the basis of markers across the genome and selecting individuals with the
highest proportion. To reduce linkage drag, greater emphasis can be given to markers around the
target gene(s).
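A minimal sketch of background selection follows. Without selection, the expected recipient-genome fraction after an F1 and n backcrosses is 1 − (1/2)^(n+1), which reproduces the means of 50, 75, 87.5, 93.75, and 96.88% listed for the F1 through BC4. The marker-based estimate assumes that breed origin of each marker allele is known and weights all markers equally; extra weight on markers near the target gene is not implemented here. All names and data are hypothetical.

```python
def expected_recipient_fraction(n_backcrosses):
    """Expected recipient-genome fraction after an F1 (n = 0) followed by
    n backcrosses to the recipient, without selection."""
    return 1.0 - 0.5 ** (n_backcrosses + 1)

def marker_recipient_fraction(recipient_allele_counts):
    """Marker-based estimate for one individual. Each entry is the number
    of recipient-origin alleles (0, 1, or 2) at one marker."""
    return sum(recipient_allele_counts) / (2.0 * len(recipient_allele_counts))

def rank_candidates(candidates):
    """Candidate ids ordered from highest to lowest estimated fraction."""
    return sorted(candidates,
                  key=lambda k: marker_recipient_fraction(candidates[k]),
                  reverse=True)

# Hypothetical carriers of the target gene, genotyped at four markers.
carriers = {"a": [2, 2, 1, 2], "b": [1, 1, 2, 1], "c": [2, 2, 2, 2]}
print(rank_candidates(carriers))
```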
Hanset et al. (1995) reported on the successful introgression of the halothane normal allele into a
Piétrain line that had a high frequency of the halothane positive allele. They used foreground
selection on a marker that is closely linked to RYR. Yancovic et al. (1995) reported on marker-assisted
introgression of the naked neck gene in chickens from a local breed into an improved
broiler line. The naked neck gene is of benefit in warm climates because it reduces feather cover.
Yancovic et al. (1995) used markers to speed up the recovery of the background genome of the
improved broiler.
In general, however, the application of introgression programs to livestock appears limited for
several reasons:
i) Apart from some major genes, QTL studies show that most economic traits are affected by a
substantial number of genes with moderate effects. This makes the number of QTL to
introgress more than can feasibly be handled within an introgression program.
ii) Most QTL may already be segregating within the recipient breed, such that within-breed
selection may be more effective than introgression.
iii) Most QTL are not very precisely mapped, which increases the size of the genome region(s)
that must be introgressed and the population size required.
iv) The economic benefit of the target gene may not be large enough to compensate for the extra
costs and reduced genetic gain in other traits that is associated with an introgression program.
v) The introgressed gene(s) may have a different effect in the new genetic background, as has
been observed in several plant introgression programs (Dekkers and Hospital 2002).
12.5.1.2 Marker-assisted synthetic line development
Lande and Thompson (1990) proposed a strategy for marker-assisted selection within a hybrid
population created by crossing two inbred lines. The strategy capitalizes on population-wide
linkage disequilibrium that initially exists in crosses between lines or breeds. Thus, marker-QTL
associations identified in the F2 generation can be selected on for several generations, until the
QTL are fixed or the disequilibrium disappears. Zhang and Smith (1992) evaluated the use of
markers in such a situation with selection on BLUP EBV. They compared the following three
selection strategies:
MAS: selection on an EBV derived from marker effects
BLUP: selection on BLUP EBV derived from phenotype
COMB: combined selection on an index of the EBV based on markers and phenotype.
Data for a cross between inbred lines were simulated on the basis of 100 QTL and 100 markers
in a genome of 2000 cM. Marker effects were estimated in the F2 generation using a two step
procedure. In the first step, a separate F2 population from the same cross was used to identify
markers with the largest effects. Then, to obtain unbiased estimates, the effects of those markers
were re-estimated in the F2 population under selection. The latter estimates were used to obtain
marker-based EBV throughout the selection process.
Selection in Inbred Line Cross, Zhang and Smith (1992)
Line 1 x Line 2, followed by F1 x F1, F2 x F2, and F3 x F3
• 100 QTL: biallelic, effects a ~ Normal
• 100 markers: biallelic, ave. 20 cM interval
• QTL detection: 1000 F2; largest 20 QTL selected (67% var), re-estimated in other 1000 F2
• Selection: marker score alone, BLUP EBV, or marker score + BLUP
Figure. Genetic progress (genetic mean by generation, 1 to 10) based on selection on markers
alone (MAS), phenotypic data alone (BLUP), or their combination (COMB) in a cross between inbred
lines, for h2 = 0.25 and h2 = 0.50. Based on Zhang and Smith (1992).
Results illustrated in the figure above show that index selection (COMB) resulted in greatest
response, followed by selection on BLUP EBV and selection on markers alone. Rates of
response declined over generations for all strategies because data were simulated using a finite
number of loci, which were moved to fixation by selection. Rates of response declined faster for
MAS because recombination eroded the disequilibrium between the markers and QTL.
Nevertheless, substantial rates of response were obtained using selection on markers alone.
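The erosion of marker-QTL disequilibrium by recombination can be sketched with the standard result that, under random mating and without selection, disequilibrium between two loci declines by a factor (1 − c) per generation, where c is the recombination rate. Converting map distance to c with the Haldane mapping function is an assumption made here; it is not stated as the method of Zhang and Smith (1992). The 10 cM example corresponds to a QTL midway in a 20 cM marker interval.

```python
import math

def haldane_recombination(distance_cM):
    """Recombination rate for a map distance in cM under the Haldane
    mapping function (assumed here for illustration)."""
    d = distance_cM / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def ld_remaining(recomb_rate, generations):
    """Fraction of the initial disequilibrium D_0 that remains after the
    given number of generations of random mating: (1 - c)^t."""
    return (1.0 - recomb_rate) ** generations

c = haldane_recombination(10.0)
for t in (1, 5, 10):
    print(t, round(ld_remaining(c, t), 3))
```

Selection also changes disequilibrium, so this sketch only describes the recombination component of the decline.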
The MAS strategy of Zhang and Smith (1992) has potential for selection for traits that are
difficult or expensive to measure in livestock, such as meat quality traits, because it does not
require continuous phenotypic evaluation, in contrast to the BLUP and COMB strategies.
Although Gimelfarb and Lande (1994) showed that greater response could be obtained by re-estimating marker effects in subsequent generations, this would require the continuous recording
of phenotypic data, the cost of which may outweigh the benefits.
Zhang and Smith (1992) considered the ideal situation of a cross with inbred lines. Although the
lines were not divergent for the trait of interest, they were homozygous at alternate alleles for all
loci. Breeds used in a cross to enhance meat quality will typically have different means, which
will increase the extent of linkage disequilibrium in the cross. However, both breeds will likely
segregate for most QTL, which will reduce the disequilibrium. Nevertheless, even in crosses
between commercial breeds of swine, substantial numbers of QTL have been found for which
the breeds have sufficient differences in frequency to allow their detection (Malek et al. 2001a,b,
Grindflek et al. 2001). In addition, favorable effects have been found to originate from the breed
with the lower mean for a number of QTL (Malek et al. 2001b).
A greater problem with the use of crosses between outbred instead of inbred lines is the limited
ability to follow QTL past the F2 generation. In contrast to inbred lines, markers are not fully
informative in crosses between outbred lines. Therefore, the ability to track breed origin of
markers or marker haplotypes will decrease over generations, unless a substantial number of
markers are genotyped within the QTL regions.
An important advantage of selection in a breed cross population is that it can capitalize on QTL
identified in breed-cross studies. This could remove the first step in the estimation process used
by Zhang and Smith (1992), i.e. that of identification of markers with large effects. Although this
does entail the risk that different QTL may segregate in the population under selection, in
particular if QTL studies were based on different breeds, there would be a substantial cost
saving. It is crucial, however, that the second step of the estimation process be conducted in the
population under selection, in order to obtain unbiased estimates of QTL effects that are relevant
to the population under selection. For meat quality traits, this requires slaughter of a substantial
number of F2 individuals to obtain phenotypic data. Thus, the size of the F2 population must be
sufficient to support both marker effect estimation and selection.
An alternative approach to QTL detection and estimation was suggested and evaluated by
Whittaker et al. (1997). They used a cross-validation approach that allowed the same F2
population to be used for both selection of markers and estimation of marker effects, while
maximizing power. This would remove the need for prior QTL information, although such
information could still be useful for reducing the genotyping load by focusing only on the most
promising genomic regions.
Genetic improvement within a synthetic line should focus on all traits of economic importance. Thus,
selection would be on an index of a marker-based EBV for difficult-to-measure traits and a
BLUP EBV for performance traits. If available, marker-based EBV could also be included for
performance traits. Instead of deriving the emphasis that is placed on difficult-to-measure traits
versus performance traits on the basis of economic values, additional emphasis should be given
to the former traits in the initial generations, before the disequilibrium between markers and QTL
erodes.
Instead of an F2 population, a backcross population could be used as the starting point for MAS
in a synthetic line. This could be beneficial if the breed difference for performance is
large and favorable effects for QTL originate from both breeds at alternate loci. Then, a
backcross to the high performance breed would reduce the genetic lag for performance traits.
The frequency of favorable QTL alleles from the other breed would, however, only be ¼. Thus,
considerable emphasis would need to be placed on these QTL during the initial generations of
selection. Use of a backcross for selection does not negate the use of an F2 cross or prior data on
such a cross for marker selection or QTL identification.
12.5.2 Within-breed selection
Most selection programs focus on genetic improvement within a breed or line and, in many
cases, the subsequent use of that line within a crossbreeding strategy. Within-breed selection
requires information that captures differences between individuals within a breed, rather than the
between-breed differences that were discussed in the previous section. The purpose of this
section is to describe opportunities for using molecular data in genetic improvement in within-breed selection programs.
The benefits of MAS for within-breed improvement have been evaluated in several computer
simulation studies. The majority of these have used stochastic simulation models. The extra
responses from MAS that have been observed in those studies depend highly on the specifics
simulated (Spelman 1998 WCGALP), including
1. breeding program design
2. trait population parameters (heritability, etc.)
3. genetic model assumed for marked QTL and polygenes
4. marker specifics (GAS vs. LD-MAS vs. LE-MAS, number of markers, recombination
rates with QTL, informativeness)
5. method for genetic evaluation
6. amount of phenotypic and genotypic data available
7. others
Thus, care must be taken when evaluating results from these studies.
When considering within-breed improvement using molecular data, it is important to distinguish
between markers that are in population-wide linkage disequilibrium (LD) with a QTL (LD-MAS) and markers that are in population-wide equilibrium. The latter require the use of the LD
within families (LE-MAS). The use of population-wide versus within-family LD has important
consequences for the use of markers in selection and for the phenotypic data that are required to
support their use. Smith and Smith (1993) advocated the use of markers that are in population-wide disequilibrium with QTL because marker effects are easier to estimate and require smaller
amounts of phenotypic data. This is important in particular for meat quality traits. Marker
requirements are, however, greater for LD-MAS because the markers must be tightly linked to the QTL,
whereas sufficient within-family LD will exist even for markers that are more distant from the
QTL (within 10 cM). The use of LD-MAS vs. LE-MAS will be discussed further in what
follows.
Three types of observable molecular genetic loci (compared for ease of detection and ease of use)
• Functional mutations (known genes): GAS
• Markers in pop.-wide LD with functional mutation: LD-MAS
• Markers in pop.-wide LE with functional mutation: LE-MAS

Pathways by which MAS can Increase Response to Selection
➣ Increase accuracy of selection
➣ Decrease generation interval
   ➣ Marker information is available at an early age
➣ Increase selection intensity
   ➣ Selection at an early age among more candidates
12.5.2.1 Selection on markers that are in population-wide LD – LD-MAS
Markers that are in population-wide LD with a QTL include markers identified using candidate
gene and related approaches. The ideal case is a marker that is known to represent the functional
polymorphisms (GAS), e.g. the RYR and RN genes, but this is not required for the effective use
of population-wide LD.
Although markers that are not within the functional gene are not expected to be in extensive LD
with a QTL within a closed population, markers that are tightly linked to a QTL have a
substantial probability to be in partial population-wide LD with that QTL because of the effects
of drift, selection, mutation, and population admixture (Sved 1971, Goddard 1991, Meuwissen
and Goddard 2000). This probability is higher in selected populations of small effective size,
which is the case for livestock, as demonstrated by Farnir et al. (2000) for dairy cattle.
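The dependence of population-wide LD on effective size and linkage can be sketched with the approximation commonly attributed to Sved (1971), E(r²) ≈ 1 / (1 + 4Ne·c), for a population at drift-recombination equilibrium. It is used here as an illustrative assumption; it does not account for the selection, mutation, or admixture mentioned above. The function name and example values are hypothetical.

```python
def expected_r2(effective_size, recomb_rate):
    """Approximate expected LD (r^2) between two loci at
    drift-recombination equilibrium: 1 / (1 + 4 * Ne * c)."""
    return 1.0 / (1.0 + 4.0 * effective_size * recomb_rate)

# Effective size 100, loci 1%, 2%, and 10% recombination apart.
for c in (0.01, 0.02, 0.10):
    print(c, round(expected_r2(100, c), 3))
```

Under this approximation, expected LD is appreciable only for tightly linked loci, and it is larger in populations of smaller effective size.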
Markers that are tightly linked to QTL can be found through fine mapping or candidate gene
approaches. The extent of LD can often be enhanced through the use of haplotypes of tightly
linked markers. High-density marker maps with, e.g., a marker every 1 or 2 cM, will also include
markers that are in tight linkage with the QTL and that have the potential to be in substantial
population-wide LD, as was recently demonstrated by Meuwissen et al. (2001) through
simulation. They showed that for populations with an effective population size of 100 and a 1 or
2 cM spacing between markers across the genome, sufficient disequilibrium was present that
genetic values could be predicted with substantial accuracy for several generations on the basis
of associations of marker haplotypes with phenotype on as few as 500 individuals. Although
genotyping costs would be too high when applied to the entire genome, opportunities might exist
to utilize this approach on a limited scale by saturating previously identified QTL regions with
markers.
For markers that are in population-wide LD with the QTL, selection can be directly on marker
genotype or on marker haplotype if multiple linked markers are used to track the QTL. It is,
however, essential to estimate the effects of the markers within the population under selection to
capture the degree of LD and linkage phases that are present in the population and to guard
against potential interactions of the QTL with the background genome. For the same reason, it
will also be prudent to re-estimate the effects on a regular basis. Estimation requires marker
genotypes and meat quality phenotypes on a random sample of individuals in the population and
should be based on an animal model with marker genotypes or haplotypes included as fixed
effects (e.g. Short et al. 1997, Israel and Weller 1998).
Use of population-wide LD with a high-density marker map, Meuwissen et al. (2001)
Mixed model for prediction of marker effects:
y = Σ marker haplotype effects + residual, with haplotype effects random (Bayesian model)
Ne = 100
EBV accuracy by marker distance (estimates from 2200 individuals):
Marker distance    EBV accuracy
1 cM               0.85
2 cM               0.81
4 cM               0.75
[Plots: accuracy by number of individuals (500, 1000, 2200) and accuracy by generation (1 to 8).]
12.5.2.2 Selection using within-family LD
Use of within-family LD between a QTL and a linked marker based on LE-MAS requires marker
effects or, at a minimum, marker-QTL linkage phases to be determined separately for each
family, which requires marker genotypes and phenotypes on family members. If linkage between
the marker and QTL is loose, phenotypic records must be from close relatives of the selection
candidate because associations will erode through recombination. With progeny data, marker-QTL effects or linkage phases can be determined based on simple statistical tests that contrast
the mean phenotype of progeny that inherited alternate marker alleles from the common parent.
Alternatively, marker-assisted animal models have been developed to incorporate marker data in
genetic evaluation for complex pedigrees (Fernando and Grossman 1989, Goddard 1992). These
models result in BLUP EBV of QTL effects along with polygenic EBV.
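The simple within-family test described above can be sketched as a contrast between progeny groups of one heterozygous (Mm) parent. The sketch assumes that the marker allele each progeny inherited from the common parent is known, and it uses an unequal-variance t statistic as one possible simple test; the data and names are hypothetical. It is not the marker-assisted animal model of Fernando and Grossman (1989).

```python
import math
from statistics import mean, variance

def marker_contrast(progeny):
    """progeny: list of (allele, phenotype) pairs, where allele is 'M' or
    'm', the marker allele inherited from the heterozygous common parent.
    Returns the contrast (mean M - mean m) and a t statistic."""
    group_M = [y for allele, y in progeny if allele == "M"]
    group_m = [y for allele, y in progeny if allele == "m"]
    contrast = mean(group_M) - mean(group_m)
    se = math.sqrt(variance(group_M) / len(group_M)
                   + variance(group_m) / len(group_m))
    return contrast, contrast / se

# Hypothetical progeny records for one sire.
records = [("M", 10), ("M", 12), ("M", 11), ("M", 13),
           ("m", 9), ("m", 8), ("m", 10), ("m", 9)]
contrast, t_stat = marker_contrast(records)
print(f"contrast = {contrast:.2f}, t = {t_stat:.2f}")
```

The sign of the contrast indicates the linkage phase in that family, so it must be estimated separately for each heterozygous parent.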
Use of within-family LD: linkage phase not consistent between sires
Sire 1: M-Q / m-q
Sire 2: M-q / m-Q
Sire 3: M-Q / m-Q
Sire 4: M-q / m-q
QTL effect must be estimated for each individual/family
➣ Based on family information
   ➣ marker genotypes
   ➣ phenotypes

Use of within-family LD: Marker-assisted BLUP (Fernando and Grossman, 1989)
Sire (marker-QTL haplotypes Ms-Qs^p and Ms-Qs^m) x Dam (Md-Qd^p and Md-Qd^m) gives
progeny i (Ms-Qi^p and Md-Qi^m)
$y_i = \mu + v_i^p + v_i^m + u_i + e_i$
where $v_i^p$ and $v_i^m$ are the paternal and maternal QTL allele effects and $u_i$ is the polygenic effect.
$\mathrm{Var}(u) = A\sigma_u^2$
$\mathrm{Var}(v) = G\sigma_v^2$, where G = gametic relationship matrix for QTL effects, computed from
- marker genotypes
- m-QTL rec. rate
➣ EBV for QTL alleles: $\hat{v}_i^p$, $\hat{v}_i^m$
➣ EBV for polygenic effects: $\hat{u}_i$
Total EBV = $\hat{v}_i^p + \hat{v}_i^m + \hat{u}_i$
Meuwissen and Goddard (1996) evaluated the benefit of LE-MAS for different types of traits.
Marker-assisted EBV were evaluated using the marker-assisted genetic evaluation model of
Goddard (1992) by including the QTL as a random effect. Selection was on the sum of EBV for
the QTL and polygenes, similar to the COMB strategy of Zhang and Smith (1992). Comparisons
were to genetic gain from conventional selection on BLUP EBV without availability of
genetic markers. Results (see figures) showed that the benefit of MAS is greatest for traits for
which phenotypic (BLUP) selection is not effective. This includes traits for which phenotypes
cannot be observed prior to selection, traits that can only be observed on one sex, and traits that
require sacrificing the animal to obtain phenotypic data. A prime example of the latter is meat
quality traits.
Benefits of MAS were greater for lower heritability traits and increased with the proportion of
genetic variance explained by the QTL (molecular score). Benefits decreased over generations,
as QTL alleles were fixed and polygenic response was lost.
[Figures: Possible gains using MAS (Meuwissen & Goddard '96). Extra response from MAS (%) by
generation (1 to 5). Panels: (1) QTL with 1/3 of genetic variance marked by haplotype, h2 = .27,
for a meat quality trait, a sex-limited trait, phenotyping after selection, and phenotyping before
selection; (2) effect of heritability, QTL with 1/3 of genetic variance, h2 = .11 vs. .27, with
phenotyping before or after selection; (3) effect of size of QTL effect, phenotyping after
selection, h2 = 32%, QTL variance 46.7, 26.7, 13.3, and 6.7%.]
Figure 7. Potential extra gains from MAS for meat quality traits based on within-family linkage
disequilibrium, for Strategy 1 and Strategy 2. Based on Meuwissen and Goddard (1996). QTL with
multiple alleles explains 1/3 of the genetic variance for a trait with 0.27 heritability. Marker
haplotypes are informative such that transmission of QTL alleles can be followed from parent to
offspring for 90% of offspring (this will require a set of highly polymorphic markers around the
QTL). Marker and phenotypic data were available for five generations prior to the initiation of MAS.
For meat quality traits, Meuwissen and Goddard (1996) considered two strategies:
I) Two randomly chosen members of each full-sib family of four are slaughtered to record
meat quality data. The remaining individuals are selected on the basis of a marker-assisted EBV for meat quality, once data on their sibs are recorded.
II) Animals are selected on the basis of a marker-assisted EBV and non-selected
animals are slaughtered to provide data for the next generation of selection.
For both, all individuals were genotyped for markers around a previously identified QTL.
Results illustrated in Figure 7 show that strategy I) gave 24% greater response than conventional
selection. The benefit of strategy II) was substantially greater but declined over generations as
favorable alleles at the QTL were fixed. The greater response from strategy II) compared to I)
was in large part the result of the greater selection intensity that was achieved with strategy II)
because half of the selection candidates were not slaughtered prior to selection. However, it is
questionable whether this increased selection intensity can be realized in practice due to
inbreeding considerations. Thus, the extra response of 24% appears more realistic.
Implementation of selection on within-family LD requires extensive phenotyping and
genotyping. In addition, data should be available for several generations prior to initiating MAS
to accurately estimate QTL effects. For example, Meuwissen and Goddard (1996) assumed
phenotypic and genotypic data for five generations prior to initiation of MAS and responses
dropped substantially without the buildup of such data. Although the same genotypic data can
also be applied to performance traits, the benefit of MAS for these traits will be less than for
meat quality traits (Meuwissen and Goddard 1996), in particular if markers are in QTL regions
for meat quality rather than for performance traits. Nevertheless, correlated effects on other
traits should be carefully considered and monitored when applying MAS.
Another obstacle for the use of within-family LD is that it requires knowledge of QTL regions
that segregate within the population. Since most QTL mapping studies in pigs are based on the
breed cross model, information about within-breed segregation of QTL is limited. Thus, within-breed QTL mapping studies must be conducted prior to implementation of MAS. Although such
studies could concentrate on QTL regions previously identified in breed cross studies, substantial
population sizes will be required to detect or confirm their segregation within a breed. Related
issues were discussed by Spelman and Bovenhuis (1998) in the context of implementing QTL
knowledge in dairy cattle breeding programs.
Outbred Population Designs for Genome Scan QTL mapping
• Daughter / half-sib design: sire (Mm) x dams (??); associate marker genotype of daughters
  (M? or m?) with phenotype of daughters.
• Granddaughter design: grandsire (Mm) x dams (??); sons (M? or m?) x dams (??); associate marker
  genotype of sons with EBV of sons, which is based on phenotypes of grand progeny.

Two approaches for utilizing marker information for preselection, including determining
heterozygosity of sire for previously identified QTL
➣ Top-down approach, based on the granddaughter design (Kashi et al. 1990: Anim. Prod. 124:743):
  estimate the marker contrast of the grandsire (Mm) based on EBV of sons, which are based on
  granddaughter phenotypes. If the contrast ≠ 0 and, e.g., M is favorable, select young bulls
  that inherited M for progeny testing.
➣ Bottom-up approach, based on the daughter design (MacKinnon and Georges, 1998: Livest. Prod.
  Sci. 54:229): estimate the marker contrast of the sire (Mm) based on daughter phenotypes. If the
  contrast ≠ 0 and, e.g., M is favorable, select young bulls that inherited M for progeny testing.

Key components for implementation of MAS
Business objectives; R&D; farms; DNA collection; phenotyping; pedigree; genotyping; genotypic
database; phenotypic database; quality control; decision support; analytical tools; MAS.
12.5.2.3 GAS vs. LD-MAS vs. LE-MAS
An important question for the implementation of MAS for within-breed improvement is whether
to use LD-MAS or LE-MAS or GAS. The next figures give a contrast between these strategies
in terms of marker, phenotyping, and implementation requirements.
[Figure: LE-MAS vs. LD-MAS vs. GAS, implementation requirements]
• QTL detection: LE < LD < GAS
• Within-line confirmation: LE >> LD > GAS
• Genetic evaluation: LE >> LD > GAS
  - Phenotyping: relatives (LE) vs. sample (LD/GAS); for LD/GAS, effects at field level
  - Genotyping: candidate + relatives (LE) vs. candidate
  - Analysis: MA-BLUP (LE) vs. fixed effect (+ prob)
• Selection implementation: LE >> LD > GAS
  - LE: lower accuracy, risk; within-family selection; selection room requirements
[Figure: LE-MAS vs. LD-MAS vs. GAS, other considerations]
• Genetic gain
• Pick up new mutations: LE > LD > GAS
• Marketability: LE << LD < GAS
• Protection (patents)
• Product differentiation
Pong-Wong et al. (2002) compared GAS to LE-MAS for a QTL bracketed by two markers at
different distances. Selection was on the standard index of BLUP EBV for the QTL and polygenes,
but with optimized contributions to restrict the rate of inbreeding to 5% per generation.
[Figure: LE-MAS vs. GAS (Pong-Wong et al. 2002). Panel 1: QTL in a marker bracket of 2d cM; ∆F = 5%; p0 = 0.15; 60 males, 60 females. Panel 2: QTL in a marker bracket of 20 cM, with prior estimates of QTL effects (accuracy r).]
GAS resulted in substantial extra gains over BLUP selection without marker information in early
generations (see figures below), but eventually gave lower cumulative response, similar to what
was demonstrated earlier. Extra responses to LE-MAS were delayed and smaller, relative to
GAS. Response to LE-MAS increased when marker-QTL distance became smaller. To
investigate whether reduced accuracy of estimated QTL effects with LE-MAS was the main
reason for the lower response to LE-MAS vs. GAS, Pong-Wong et al. (2002) also evaluated the
effect of including prior information on individual QTL effect estimates with LE-MAS. This
could represent information from a prior QTL scan using the same sires. They showed that
response to LE-MAS was nearly equivalent to that of GAS when highly accurate prior
information was included. This indicates that the main limitation to LE-MAS relative to LD-MAS and GAS is the limited amount of information that is available to estimate QTL effects,
which for LE-MAS is limited to information from relatives. It should be noted, however, that
even distant relatives can contribute substantial information for estimating QTL effects with LE-MAS if markers are tightly linked to the QTL.
With complete LD and large amounts of data, LD-MAS will be equivalent to GAS. In practice,
however, LD-MAS will be limited by the extent of LD and the accuracy of estimates of
marker haplotype effects; while QTL may typically have only 2 alleles, the potential number
of haplotypes for which effects must be estimated with a haplotype of n SNP markers is 2^n, and, therefore, less
data will be available from a given sample to estimate individual haplotype effects for LD-MAS
compared to GAS. Thus, including more markers in the haplotype for LD-MAS will reduce
accuracy. On the other hand, LD of markers with the QTL is expected to increase when
including more markers in the haplotypes. Goddard and Hayes (2002) investigated the impact of
these two factors (see figure below). Results showed that a haplotype of 11 markers explained
over 98% of variance at a bracketed QTL (results for infinite population size), compared to 80%
for 4 markers. Accuracies were, however, similar between 4 and 11 markers (~45%) when only
100 individuals were evaluated, reflecting the greater impact of number of individuals evaluated
when a larger number of haplotypes is to be estimated.
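The trade-off can be made concrete by counting how thinly a fixed sample is spread over the possible haplotypes. The following is a minimal sketch only: the function name is hypothetical, and the equal-frequency assumption is purely illustrative (it is not taken from Goddard and Hayes 2002, and haplotype frequencies under LD are far from uniform).

```python
def haplotype_budget(n_markers: int, n_animals: int) -> tuple[int, float]:
    """Return (number of possible haplotypes, average copies observed per haplotype).

    Assumes biallelic SNP markers, diploid animals (two haplotypes each) and,
    purely for illustration, equal haplotype frequencies.
    """
    n_haplotypes = 2 ** n_markers
    copies_per_haplotype = 2 * n_animals / n_haplotypes
    return n_haplotypes, copies_per_haplotype


for n in (4, 11):
    n_hap, copies = haplotype_budget(n, n_animals=100)
    print(f"{n} markers: {n_hap} possible haplotypes, {copies:.2f} copies each")
```

With 100 animals, 4 markers give 16 possible haplotypes at about 12.5 copies each, whereas 11 markers give 2048 possible haplotypes, far more than the 200 haplotypes actually sampled.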
[Figure: GAS vs. LD-MAS (Goddard and Hayes 2002). Accuracy (0 to 1.0) against number of animals evaluated (100, 1000, 2000, infinite) for haplotypes of 4 or 11 markers; QTL in a 10 cM interval; population-wide LD based on Ne = 100.]
12.6 Using Molecular Information at Unused or Under-used Selection Stages
The previous sections considered selection stages where both phenotypic and molecular information were
available. In those cases, any selection pressure that is applied to molecular information is taken
away from phenotypic information. There are, however, many cases where not all available
selection space is utilized in conventional selection, which provides selection room for MAS
(Soller and Medjugorac 1999). These include cases where selection decisions must be made at
stages of the animal’s life cycle when limited to no phenotypic information is available to guide
those decisions. A prime example is pre-selection on the basis of markers among members of a
full-sib family for further testing, prior to availability of individual or progeny records. There is
often a need to limit the number of full sibs tested to limit inbreeding and increase the
availability of diverse blood lines. In such situations conventional selection has no basis for
selection because EBV are derived from pedigree information, which is the same for all
members of a full-sib family. Family members can, however, differ for the markers they
inherited, which then provides a basis for selection, instead of having to make a random choice.
Such strategies were evaluated by Kashi et al. (1990) for dairy cattle.
[Figure: Incorporating MAS in the existing program. Within existing selection steps: increases response through accuracy of selection; requires balance between QTL and polygenic selection; long- versus short-term response to selection? In currently under-utilized selection stages: capitalizing on excess reproductive capacity (excess meioses) = "selection space" for MAS (Soller and Medjugorac, 1999).]
[Figure: Pre-selection among full sibs prior to performance testing with limited test capacity, e.g. pre-selection of young dairy bulls for progeny testing. Currently one (e.g.) full brother is chosen at random per MOET flush to restrict inbreeding. MAS implementation: select the full sib with the highest marker score; impact on response is through increased intensity.]
[Figure: MAS in pre-selection of young bulls for progeny testing: mating of parents carrying alleles A and a, embryo transfer (ET), and selection of sons carrying A for the progeny test.]
12.6.1 Models for evaluation of pre-selection
The effect of pre-selection on genetic gain can be evaluated using selection index approaches or
by using a deterministic model based on mixture distributions. In the following, a model based
on mixture distributions that accounts for the reduction in variances due to selection in the first
stage is described.
Stage 1: select across two normal distributions of total EBV (QTL or marker + polygenic EBV)
of young bulls with means +/- ½α, standard deviations r1σg, where r1 is the accuracy of parental
average polygenic EBV, and frequencies ½ and ½ for sons that received allele B or allele b from
their heterozygous sire (see figure). Here, B and b can represent either alleles of a QTL, in
which case α is the QTL allele substitution effect, or alleles at a linked marker, in which case α
is the marker-associated QTL effect (= (1-2r) × QTL substitution effect, where r is the recombination rate between marker and QTL).
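The factor (1-2r) follows from the two ways in which a son can inherit the sire's marker allele. As a sketch, assume the sire carries haplotypes M-Q and m-q (labels chosen here for illustration), with r the recombination rate between marker and QTL and paternal QTL alleles Q and q contributing +½α_QTL and -½α_QTL. A son that inherits M carries Q with probability 1-r and q with probability r, so that

\[
E(\text{paternal QTL effect} \mid M) = (1-r)\tfrac{1}{2}\alpha_{QTL} - r\,\tfrac{1}{2}\alpha_{QTL} = \tfrac{1}{2}(1-2r)\,\alpha_{QTL},
\qquad
E(\text{paternal QTL effect} \mid m) = -\tfrac{1}{2}(1-2r)\,\alpha_{QTL}
\]

The contrast between the two groups of sons is therefore (1-2r)α_QTL. For example, a marker at r = 0.1 from the QTL captures 80% of the QTL substitution effect, and an unlinked marker (r = 0.5) captures none of it.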
[Figure: Distributions of total EBV of the B progeny and b progeny of a heterozygous (Bb) sire, with means +½α and -½α.]
The unique truncation point for selection of a proportion Q1 in stage 1 is determined by the
bisection method of Chapter 3, resulting in proportions selected of fB and fb from sons that
received allele B and b, respectively.
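Input parameters for stage 2: Frequencies of progeny-tested bulls that carry allele B vs. b are:
\[ w_B = f_B / Q_1 \quad \text{and} \quad w_b = f_b / Q_1 \]
Polygenic means of these two groups are:
\[ u_B = i_B r_1 \sigma_g \quad \text{and} \quad u_b = i_b r_1 \sigma_g \]
where $i_B$ and $i_b$ are selection intensities associated with $f_B$ and $f_b$. Since EBV are unbiased, these
are also the mean polygenic EBV of bulls following their progeny test. Thus the means of the
standard QTL index following the progeny test are:
\[ g_B = \tfrac{1}{2}\alpha + i_B r_1 \sigma_g \quad \text{and} \quad g_b = -\tfrac{1}{2}\alpha + i_b r_1 \sigma_g \]
The standard deviation of the parental average EBV of the bulls selected following stage 1 is
equal to:
\[ \sigma'_{1,B} = \sqrt{1 - k_B}\; r_1 \sigma_g \quad \text{and} \quad \sigma'_{1,b} = \sqrt{1 - k_b}\; r_1 \sigma_g \]
where $k_B = i_B(i_B - x_B)$ and $k_b = i_b(i_b - x_b)$, with $x_B$ and $x_b$ the standardized truncation points for the two groups.
Then, recognizing that a progeny test polygenic EBV can be written as the sum of the parental
average EBV and an independent normally distributed variable with mean zero and standard
deviation
\[ \sqrt{r_2^2 - r_1^2}\; \sigma_g \]
where $r_2$ is the accuracy of the progeny-test EBV, the standard deviations of polygenic EBV following the progeny test are:
\[ \sigma'_{2,B} = \sqrt{(1 - k_B)\, r_1^2 \sigma_g^2 + (r_2^2 - r_1^2)\, \sigma_g^2} \quad \text{and} \quad \sigma'_{2,b} = \sqrt{(1 - k_b)\, r_1^2 \sigma_g^2 + (r_2^2 - r_1^2)\, \sigma_g^2} \]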
Stage 2 selection: If in stage 2 the top 10% of bulls are selected, the genetic mean of selected
bulls can be determined by determining the unique truncation point across the two distributions
with means and standard deviations as determined above.
Selection without QTL information: Selection without use of QTL information is also modeled
as selection across two distributions, but now with total EBV means equal to those for standard
index QTL selection multiplied by the square of accuracy for the respective stage ($r_1^2$ and $r_2^2$).
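The two-stage model with QTL information can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not the authors' program: all function names are hypothetical, the bisection routine is a generic stand-in for the method of Chapter 3, the selected groups are treated as normal after stage 1, the weights are computed from the within-group selected proportions so that they sum to one, and 0 < r1 < r2 is required.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def upper_tail(mean, sd, t):
    """Proportion of a normal(mean, sd) group that lies above t."""
    return 1.0 - STD.cdf((t - mean) / sd)


def intensity(mean, sd, t):
    """Selection intensity i and standardized truncation point x for one group."""
    x = (t - mean) / sd
    return STD.pdf(x) / (1.0 - STD.cdf(x)), x


def truncation_point(means, sds, weights, q, tol=1e-10):
    """Bisection for the common truncation point at which the weighted
    proportion selected across all groups equals q."""
    lo = min(means) - 10.0 * max(sds)
    hi = max(means) + 10.0 * max(sds)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        selected = sum(w * upper_tail(m, s, mid) for m, s, w in zip(means, sds, weights))
        if selected > q:
            lo = mid  # too many selected: raise the truncation point
        else:
            hi = mid
    return 0.5 * (lo + hi)


def two_stage(alpha, r1, r2, q1, q2, sigma_g=1.0):
    """Mean index value of bulls selected in stage 2, plus the stage-2 inputs
    (weights, means, standard deviations) for groups B and b."""
    means1 = [0.5 * alpha, -0.5 * alpha]  # sons that received B or b
    sd1 = r1 * sigma_g
    t1 = truncation_point(means1, [sd1, sd1], [0.5, 0.5], q1)
    w, g, sd2 = [], [], []
    for m in means1:
        p = upper_tail(m, sd1, t1)       # within-group proportion selected
        i, x = intensity(m, sd1, t1)
        k = i * (i - x)                  # variance reduction factor
        w.append(0.5 * p / q1)           # share of progeny-tested bulls in this group
        g.append(m + i * sd1)            # mean index of the group after stage 1
        sd2.append(math.sqrt((1.0 - k) * sd1 ** 2 + (r2 ** 2 - r1 ** 2) * sigma_g ** 2))
    t2 = truncation_point(g, sd2, w, q2)
    mean_selected = 0.0
    for wj, gj, sj in zip(w, g, sd2):
        p = upper_tail(gj, sj, t2)
        i, _ = intensity(gj, sj, t2)
        mean_selected += wj * p * (gj + i * sj) / q2
    return mean_selected, w, g, sd2


# Example call: 50% pre-selection, then the top 10% after the progeny test.
mean_mas, weights, group_means, group_sds = two_stage(0.5, 0.4, 0.85, 0.5, 0.1)
```

Setting alpha to zero gives two identical groups and serves as a simple check of the routine, since the result must then equal the usual single-distribution two-stage response.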
12.6.2 Effect of pre-selection
[Figure: MAS pre-selection with NO excess reproductive capacity (select more parents to allow QTL selection among progeny) and WITH excess reproductive capacity (limited/no impact on polygenic response); QTL variance = 20%. Response (0.0 to 1.4) from parental selection, from QTL selection, and in total, against % pre-selection (100, 50, 33, 25) and % dam selection (25, 50, 75, 100); a further panel shows total response, costs, and net response.]
[Figure: Effect of top-down or bottom-up approach on average genetic value of progeny-tested young bulls. Source: Spelman and Garrick, Genetic and Economic Responses for Within-Family Marker-Assisted Selection in Dairy Cattle Breeding Schemes, 1998 J. Dairy Sci. 81: 2942-2950.]
Given the uncertainties about the sustainability of marker effects, it appears prudent to use
molecular genetic information in a manner that does not prevent progress toward the overall
breeding goal that can be achieved through conventional selection.
12.6.3 Integrating molecular and reproductive technologies
Selection space for MAS can be increased with technologies that enhance the reproductive rate
of, in particular, the female.
In addition to increasing selection space within a generation by increasing full-sib family size,
space for MAS can also be created across generations by introducing several rapid generations of
selection based on markers alone. Such programs were proposed by Georges and Massey (1991)
for dairy cattle and subsequently by Visscher et al. (2000) for pigs. In such programs of
‘velogenetics’, the short generations for marker-assisted selection are facilitated by the use of
reproductive technologies such as the recovery of oocytes from the unborn foetus, in-vitro
maturation of oocytes, and in-vitro fertilization. These technologies are then combined with the
selection of embryos for implantation based exclusively on the inheritance of markers that were
previously estimated to have favorable effects. Enhancements to further reduce the generation
interval in these programs were suggested by Haley and Visscher (1998) and Visscher et al.
(2000). Although further advances in reproductive technologies are required for velogenetic
programs to become feasible, they offer potential to improve meat quality through marker-assisted introgression, synthetic line development, and within-breed selection based on
population-wide LD.
[Figure: Use of MAS for within-breed genetic improvement. Incorporating MAS in the existing selection program: within existing selection steps; in currently under-utilized selection stages, capitalizing on 'excess' reproductive capacity. Redesign of the selection program to more effectively capitalize on MAS: creating 'selection space' for MAS (Soller and Medjugorac 1999), ideally at stages where polygenic selection is limited, to minimize impact of MAS on polygenic response; increase family size at pre-selection stages through reproductive technology; substitute MAS at early age with selection at later age based on phenotype or progeny test; move toward juvenile programs; require designs that maximize amount of information for marker-QTL estimation; integrating molecular and reproductive technology.]
[Figure: Integrating reproductive and molecular technology: harvest oocytes in utero, mature oocytes, fertilize, MAS, implant in recipient.]
[Figure: MAS + reproductive technology: velogenetics (Georges and Massey 1991). Timeline over generations 1 to 8 in which several generations of genotyping and MAS are placed between generations with phenotyping and phenotypic selection.]
12.7 Economics of MAS
12.7.1 Economic value at production level
Hayes and Goddard (2003) evaluated the economics of LE-MAS in an integrated swine
operation with a 100 sow nucleus, a 1000 sow multiplier, and a 10,000 sow commercial tier.
Implementation of LE-MAS included a genome-scan using a half-sib design within a
commercial line, followed by inclusion of data from markers that bracket significant QTL in
genetic evaluation using marker-assisted BLUP (Fernando and Grossman, 1989). Selection was
for a multi-trait breeding goal that included four independent trait categories:
Growth index (GI)
Meat quality index (MQI)
Pigs born alive (PBA)
Net feed intake (NFI)
Genetic architecture of the traits was simulated by a finite locus model with 102 QTL with
effects drawn from a multivariate gamma distribution following a mutation model.
Genetic and economic parameters were as follows:
Trait   Genetic variance   Heritability   Economic value ($/genetic s.d.)
GI      0.7                0.32            2.1
MQI     1.5                0.29            1
PBA     1.2                0.11            2.8
NFI     1.4                0.16           -4.2
An important decision for the application of MAS is which QTL or markers should be used in
selection. QTL mapping studies typically apply very stringent thresholds based on genome-wide
testing to reduce the rate of false positives, as suggested by Lander and Kruglyak (1995). This,
however, increases the rate of false negatives and removes opportunities to select on those QTL.
In this study, three different thresholds to determine significance were used in the genome scan
to determine which QTL to use for LE-MAS:
• 5% comparison-wise
• 5% chromosome-wise
• 5% genome-wise
Two sample sizes for the genome-scan were evaluated:
• small: 5 sires with 50 progeny each
• large: 5 sires with 200 progeny each
The number of QTL detected and the % of genetic variance explained by the detected QTL are given in
the next figure:
[Figure: Profitability of LE-MAS in pigs (Hayes and Goddard, 2003). Left panel: number of QTL detected and QTL variance (%) for genome-wise, chromosome-wise, and point-wise thresholds, for the small scan (5 sires x 50) and the large scan (5 sires x 200). Right panel: costs, returns, and profit ($ x 10^6) for the same thresholds and scans.]
The number of QTL detected and variance explained by the detected QTL increased with
decreasing stringency of the threshold used and was greater for the large scan than the small
scan, both because of greater power. However, for the small scan, the number of QTL increased
proportionally faster with decreasing stringency of the threshold than variance explained by the
QTL. This is because a non-stringent threshold results in many false positives, in particular for a
scan with lower power. The number of QTL detected was greatest for GI and smallest for PBA
(results not shown), consistent with the lower heritability of PBA. Relative increases in response
from implementing LE-MAS were greatest for MQI, followed by PBA, NFI, and GI (results not
shown), consistent with differences in the effectiveness of regular selection for these trait categories.
Extra returns from the increased genetic merit in the nucleus from LE-MAS were computed
using discounted gene flow by following the expression of a single round of genetic
improvement through the multiplier and commercial levels over 100 6-month time periods. A
discount rate of 5% per 6-month period was used. Extra costs were assumed to consist only of
genotyping costs.
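The discounting step can be sketched as follows. This is a minimal illustration, not the gene flow model of the study: the function name is hypothetical, the stream of expressions is made up, and it is assumed that period t is discounted by (1 + rate)^t with t starting at 1.

```python
def discounted_returns(expressions, value_per_expression=1.0, rate=0.05):
    """Present value of a stream of expressions of one round of genetic improvement.

    expressions[t-1] is the (hypothetical) amount of the improvement expressed
    in 6-month period t, for t = 1, 2, ...; rate is the discount rate per period.
    """
    return sum(
        value_per_expression * e / (1.0 + rate) ** t
        for t, e in enumerate(expressions, start=1)
    )


# Illustrative stream only: one unit expressed in each of 100 periods.
flat_stream = [1.0] * 100
present_value = discounted_returns(flat_stream)
print(f"Present value of the flat stream: {present_value:.2f}")
```

At 5% per 6-month period, a unit expressed in period 100 is worth less than 1% of a unit expressed today, so expressions in the first few years after selection dominate the discounted returns.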
Results (see above figure) showed that extra costs were proportional to the number of QTL
detected. Extra returns depended primarily on the % of variance explained by the detected QTL
and the number of false positives. The latter represent wasted selection space. Extra returns were
similar for all strategies, except for the small scan with the comparison-wise threshold. This case
had a large number of false positives, which reduced the effectiveness of selection.
Several other studies have, however, shown that greater genetic gains from MAS can be obtained
by allowing a higher rate of false positives, in order to reduce the number of false negatives
(Moreau et al. 1998, Spelman and Garrick 1998).
Subtracting costs from returns, profit was positive for all strategies except when comparison-wise thresholds were used, for which costs were high because of the large number of QTL
detected. Highest profit was obtained for the scans with the most stringent thresholds because of
the limited genotyping costs.
Using a similar procedure, Hayes and Goddard (2003, unpublished) evaluated the break-even
genotyping cost for GAS, to determine the resources that could be allocated to finding causative
mutations, starting from a genome scan. To evaluate this, they followed up on the QTL with the
largest effect for each of the four traits from the large genome scan. It was then assumed that the
causative mutation for these QTL was known and extra genetic and economic response from
standard index GAS was evaluated.
Results (see table below) show that, although extra relative responses would be lowest for GI
because of the already effective selection for that trait, extra returns were similar for all four
traits, because of the high economic value of GI. Break-even costs of genotyping depend not only
on extra returns but also on the number of generations until fixation of the gene, after
which genotyping is no longer required. As a result, break-even genotyping costs were lowest
for PBA: given the limited accuracy of polygenic EBV for this trait, its QTL would be fixed
more rapidly than those for the other traits.
Strategy   % genetic variance by largest QTL   Extra returns (million $)   Break-even genotyping costs ($/pig)
GI         26                                  0.838                       104.36
MQI        25                                  0.779                        97.09
PBA        15                                  0.736                        78.13
NFI        24                                  0.753                        80.04
12.7.2 Other economic objectives
Hayes and Goddard (2003) evaluated economic benefits from MAS on the basis of extra returns
at the production level. Often, however, the main driving force behind selection decisions in
commercial breeding companies is increasing market share and sale of breeding stock, as
discussed in Chapter 8. Here, we will evaluate the effect of MAS on such objectives.
The following figures show the impact of 50% pre-selection of young dairy bulls for progeny
testing on an index of QTL and parental EBV information on the genetic merit of bulls following
progeny testing, as well as on the number of bulls that have a progeny-test result in the top 10% or 1%
within a competitive market situation. QTL substitution effects ranged from 0.1 to 0.5 polygenic
standard deviations, the accuracy of the parental average EBV was either 0, 0.2, or 0.4, and the
accuracy of the progeny-test was 0.85. A parental average EBV accuracy of 0 reflects a situation
where young bulls considered for progeny testing have been heavily selected, such that the
remaining variation among parental EBV’s is negligible. Selection was among sons of sires that
were heterozygous for the QTL. The deterministic mixture distribution described in 12.6.1 was
used.
The impact of MAS on market share was determined by finding the unique truncation point
for 10% selected in stage 2 for non-MAS selection, assuming this was the industry-wide
truncation point, and applying this to the two distributions of total EBV under MAS selection.
The same was done to determine the impact on the number of bulls in the top 1%.
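The market-share calculation can be sketched as follows. The function names and the (weight, mean, standard deviation) values for the stage-2 distributions are hypothetical placeholders chosen for illustration; in the analysis they would come from the mixture model of 12.6.1.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def fraction_above(t, groups):
    """Weighted fraction of bulls above t; groups = [(weight, mean, sd), ...]."""
    return sum(w * (1.0 - STD.cdf((t - m) / s)) for w, m, s in groups)


def truncation_point(groups, q, tol=1e-10):
    """Bisection for the point t at which fraction_above(t, groups) equals q."""
    lo = min(m - 10.0 * s for _, m, s in groups)
    hi = max(m + 10.0 * s for _, m, s in groups)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fraction_above(mid, groups) > q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def relative_share(groups_mas, groups_conventional, top):
    """Share of MAS bulls above the industry-wide truncation point for the
    top fraction `top`, relative to the share without MAS (which is `top`)."""
    t_industry = truncation_point(groups_conventional, top)
    return fraction_above(t_industry, groups_mas) / top


# Hypothetical stage-2 distributions (weight, mean, sd), in units of sigma_g.
conventional = [(0.5, 0.32, 0.79), (0.5, 0.32, 0.79)]
with_mas = [(0.73, 0.43, 0.80), (0.27, 0.24, 0.78)]

share_top10 = relative_share(with_mas, conventional, 0.10)
share_top1 = relative_share(with_mas, conventional, 0.01)
print(f"Relative number of bulls in top 10%: {share_top10:.2f}, in top 1%: {share_top1:.2f}")
```

Because the industry-wide truncation point lies far in the upper tail, a modest shift in the mean of the MAS distributions changes the tail area proportionally more than it changes the mean itself, and more so for the top 1% than for the top 10%.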
Results (see figures) show that extra genetic gain from MAS increases with QTL substitution
effect and decreases with the accuracy of parental average EBV. The impact on market share is,
however, greater than the impact on genetic gain and the impact on the number of bulls in the top
1% is greater than the impact on the number of bulls in the top 10%. Thus, MAS is expected to
have a relatively larger impact on market share than genetic gain, and an even greater effect on
the number of top bulls.
[Figure: Genetic gain / market share from pre-selection of young bulls. % gain (0 to 60) in mean EBV, number of bulls in the top 10%, and number of bulls in the top 1%, against QTL substitution effect (0 to 0.5 σg); panels for 50% pre-selected at rEBV = 0 and at rEBV = 0.2.]
Brascamp et al. (1990) also found MAS to have a greater relative effect on market share than on
genetic gain. However, they found that the economic returns from the increase in market share
were smaller than those from increased returns at the production level. The latter
were evaluated using discounted gene flow.
[Figure: Opportunities for MAS / GAS: integration in breeding and business goals. Information sources for monogenic and polygenic traits (phenotype, LE markers, LD markers, genes; BLUP EBV; genotype (prob)) feed into the selection strategy, which is shaped by costs, risks, and the breeding and business goal. Issues listed: complete evaluation of marker/gene effects; multiple stage selection; program redesign; short- vs. long-term response.]
12.8 Selection for Crossbred Performance (from Dekkers and Chakraborty, 2002)
In most livestock, crossbreds are used for commercial production to capitalize on heterosis and
complementarity and the aim of selection within pure-lines is to maximize crossbred
performance. Selection is, however, within pure-lines and primarily based on purebred data.
Several theoretical studies have shown that combined crossbred and purebred selection (CCSP)
can result in greater responses in crossbred performance, in particular if genes with complete or
over-dominance affect the trait (Wei and van der Steen, 1992; Uimari and Gibson 1998). Use of
crossbred data, however, requires separate testing and recording strategies. The strategic use of
non-additive QTL in pure-line selection, by contrast, allows selection for crossbred performance
without crossbred data.
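[Figure: Structure of the two-breed terminal cross. Selection takes place in the sire line nucleus and the dam line nucleus; multiplier and commercial animals are produced by random mating, with no selection.]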
A deterministic model was developed for selection in sire and dam nucleus lines that provide
parents for the multiplier phase of a two-breed terminal cross. Selection was for a trait
controlled by a known QTL and additive infinitesimal polygenes (heritability = 0.3). The QTL
had alleles Q and q and genotypic values a, d, and -a for genotypes QQ, Qq, and qq. Frequencies of Q in generation 0 were 0.3
and 0.2 in the sire and dam line. In both lines, selected fractions were 0.1 and 0.25 for sires and
dams. Unselected nucleus and multiplier animals were used to produce multiplier and
commercial animals by random mating.
The objective was to maximize cumulative discounted response (CDR) of crossbred performance
over ten generations: $\mathrm{CDR} = \sum_t \delta_t G_t$, where $G_t$ is the mean crossbred performance of progeny from
generation t and $\delta_t$ the discount factor based on 10% interest. The index for selection of animals of
genotype i and sex j in line k in generation t was: $I_{ijkt} = b_{ijkt}\, g_{ijkt} + \hat{u}_{ijkt}$, where $g_{ijkt}$ is the known
purebred breeding value for the QTL (Dekkers and Chakraborty, 2001) and $\hat{u}_{ijkt}$ a polygenic
estimated breeding value from own phenotype. Four selection strategies were compared, all
based on purebred information:
1. Phenotypic selection: selection on purebred phenotypic information.
2. Standard QTL selection: selection on an index with weights bijkt equal to one.
3. Optimal QTL selection: selection on an index with weights bijkt optimized to maximize CDR.
Weights were derived by an extension to two lines of methods by Chakraborty et al. (2002).
4. Stepwise optimal QTL selection: selection on an index with bijkt optimized each generation to
maximize performance of crossbred progeny, following Dekkers and Chakraborty (2001).
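The objective that these strategies are compared on can be sketched in code. The following is an illustration under stated assumptions, not the deterministic model of the study: function names are hypothetical, only the QTL part of the crossbred mean is shown, crossbreds are assumed to receive one allele from each line at random, and generation t is assumed to be discounted by 1/(1 + interest)^t with t starting at 1.

```python
def crossbred_qtl_mean(p_sire, p_dam, a, d):
    """Mean QTL genotypic value of crossbred progeny, given the frequency of Q
    in the sire line (p_sire) and dam line (p_dam), with genotypic values
    a, d, and -a for QQ, Qq, and qq."""
    f_QQ = p_sire * p_dam
    f_qq = (1.0 - p_sire) * (1.0 - p_dam)
    f_Qq = 1.0 - f_QQ - f_qq
    return a * f_QQ + d * f_Qq - a * f_qq


def cumulative_discounted_response(crossbred_means, interest=0.10):
    """CDR = sum over t of delta_t * G_t, with delta_t = 1 / (1 + interest)**t."""
    return sum(g / (1.0 + interest) ** t for t, g in enumerate(crossbred_means, start=1))


# QTL with over-dominance (a = 1, d = 1.5) at the starting frequencies 0.3 and 0.2,
# and with Q fixed in the sire line and absent from the dam line.
start = crossbred_qtl_mean(0.3, 0.2, a=1.0, d=1.5)
all_heterozygous = crossbred_qtl_mean(1.0, 0.0, a=1.0, d=1.5)
print(start, all_heterozygous)
```

For an over-dominant QTL the crossbred QTL mean is highest when all crossbreds are heterozygous, which requires the two lines to move toward opposite alleles; a weight of one on the purebred QTL breeding value in both lines does not lead there.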
Table 1 shows CDR for alternative QTL selection strategies relative to phenotypic selection.
Responses in polygenic and QTL values are in Table 2 for optimal QTL and phenotypic
selection. Figure 1 shows trends in frequencies for QTL with complete and over-dominance.
Table 1. Extra (%) CDR of QTL selection strategies over phenotypic selection for
different degrees of dominance (d) and QTL effects (a in polygenic s.d.).
                              Degree of dominance
QTL effect   Selection        d = 0    d = ½a   d = a    d = 1½a
a = 0.5      Standard QTL     -0.4     -1.7     -2.3     -2.4
             Optimal QTL       1.1      0.5      1.1      3.7
             Stepwise QTL     -0.4     -0.7     -0.1      2.3
a = 1        Standard QTL     -1.6     -3.6     -3.9     -2.6
             Optimal QTL       1.0      0.3      2.2      9.8
             Stepwise QTL     -1.6     -1.8      0.3      8.0
a = 2        Standard QTL     -1.3     -3.2     -3.1      1.5
             Optimal QTL       0.8      0.1      3.3     21.0
             Stepwise QTL     -1.3     -2.3      1.1     20.9
For additive QTL (d=0), optimal QTL selection gave about 1% greater CDR than phenotypic
selection (Table 1), similar to single line selection (Dekkers and Chakraborty, 2001). Similar
results were obtained for a QTL with partial dominance (d=½a). For a QTL with complete
dominance (d=a), optimal QTL selection gave up to 3.3% greater CDR than phenotypic selection
(Table 1). Extra response increased with size of the QTL effect. Optimal QTL selection fixed the
favorable allele in the sire line and maintained an intermediate frequency in the dam line (Figure
1a). Phenotypic selection increased the frequency in both lines. Extra responses from optimal
selection resulted not only from a greater frequency of heterozygotes among crossbred progeny
but also from greater polygenic response in the dam line (Table 2).
Table 2. Cumulative discounted responses (in polygenic s.d., σpol) in crossbreds from phenotypic
(Phen.) and optimal QTL selection (Optim.) in sire and dam lines for a QTL with different
degrees of dominance (d) and additive effects (a).
                        d = 0           d = ½a          d = a           d = 1½a
Crossbred response(A)   Phen.  Optim.   Phen.  Optim.   Phen.  Optim.   Phen.  Optim.
a = 0.5
  Polygenic sire        11.7   11.5     11.8   11.5     11.7   11.4     11.6   11.1
  Polygenic dam         12.9   12.5     12.8   12.7     12.8   13.1     12.6   13.1
  QTL value              1.1    2.0      1.2    1.7      1.3    1.6      1.4    2.3
  Total                 25.7   26.0     25.8   25.9     25.7   26.0     25.6   26.6
a = 1
  Polygenic sire        11.3   11.1     11.4   11.2     11.4   10.9     11.2   10.7
  Polygenic dam         12.3   11.9     12.4   12.2     12.3   13.1     12.0   13.1
  QTL value              5.6    6.6      5.1    5.5      4.6    5.0      4.5    6.6
  Total                 29.2   29.5     28.9   29.0     28.4   29.0     27.7   30.4
a = 2
  Polygenic sire        10.7   10.7     11.0   10.8     11.1   10.6     10.5   10.3
  Polygenic dam         11.4   11.2     11.7   11.7     11.8   13.0     11.1   12.9
  QTL value             15.1   15.6     13.2   13.5     11.2   11.7      9.8   14.8
  Total                 37.2   37.5     35.9   35.9     34.2   35.3     31.5   38.1
(A) Polygenic sire = polygenic contribution from sire line; Polygenic dam = polygenic contribution from dam line; QTL value = QTL contribution; Total = total genetic response.
Standard QTL selection gave lower CDR than phenotypic selection for most cases (Table 1).
Standard QTL selection increased the frequency of Q in both lines but faster than phenotypic
selection. This resulted in greater gain than phenotypic selection in early generations but in
lower response in later generations. For QTL with complete or over-dominance, frequencies
reached an asymptote less than one, where the substitution effect is zero (Figure 1).
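The frequency at which the substitution effect vanishes can be written out explicitly. As a sketch, assume random mating within a line with frequency p of allele Q and genotypic values a, d, and -a; the standard expression for the purebred allele substitution effect is then

\[
\alpha = a + d\,(1 - 2p), \qquad \alpha = 0 \iff p^{*} = \frac{a + d}{2d} \quad (d > a)
\]

For over-dominance with d = 1½a this gives p* = 5/6 ≈ 0.83, independent of the size of a. For complete dominance (d = a), α = 2a(1 - p), which shrinks toward zero as p approaches one, so that selection pressure on the QTL fades before fixation is reached.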
For non-additive QTL, stepwise QTL selection gave greater CDR than standard QTL selection,
in particular with over-dominance (Table 1). For over-dominant QTL, stepwise QTL selection
gave similar CDR and trends in frequencies as optimal QTL selection (Figure 1).
This study demonstrates that strategic use of non-additive QTL enables selection for crossbred
performance based on purebred data. Limited benefits were obtained for QTL with partial
dominance but these results apply to a trait of moderate heritability and observed in both sexes
prior to selection. Greater crossbred performance from optimal QTL selection resulted not only
from greater QTL response, but also from greater polygenic response (Table 2).
For over-dominant QTL, optimal QTL selection resulted in 4, 10, and 21% greater CDR than
phenotypic selection for QTL with additive effects of ½, 1, and 2 polygenic standard deviations.
At a frequency of 0.25, this represents QTL that explain 27, 59, and 85% of the purebred genetic
variance. Although these QTL effects may be unrealistic, similar results may be obtained from
selection on multiple QTL that jointly explain such proportion of variance.
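These proportions can be checked with standard single-locus variance formulas. The sketch below rests on assumptions that are not spelled out in the text: random mating within the line, QTL variance taken as additive plus dominance variance, over-dominance with d = 1½a, and a polygenic variance of one (because QTL effects are expressed in polygenic s.d.). The function name is hypothetical.

```python
def qtl_variance_share(a, d, p, polygenic_variance=1.0):
    """Proportion of purebred genetic variance explained by a biallelic QTL
    with genotypic values a, d, -a and frequency p of the favorable allele."""
    q = 1.0 - p
    alpha = a + d * (q - p)                  # allele substitution effect
    additive = 2.0 * p * q * alpha ** 2      # additive QTL variance
    dominance = (2.0 * p * q * d) ** 2       # dominance QTL variance
    qtl_variance = additive + dominance
    return qtl_variance / (qtl_variance + polygenic_variance)


shares = [round(100 * qtl_variance_share(a, 1.5 * a, 0.25)) for a in (0.5, 1.0, 2.0)]
print(shares)
```

Under these assumptions the sketch reproduces the 27, 59, and 85% quoted above.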
Figure 1. QTL allele frequency in sire and dam lines under alternate selection strategies for a
QTL with complete or over-dominance and an additive effect of one polygenic s.d.
[Figure 1 panels: a, complete dominance (d = a); b, over-dominance (d = 1½a). Vertical axis: QTL allele frequency (0.0 to 1.0); horizontal axis: generation (0 to 10). Lines: optimal, standard, and phenotypic selection, each for sires and dams.]
The majority of the extra crossbred response could be achieved by optimizing one generation at a
time (Table 1). These results can be extended to selection on markers linked to QTL, provided
non-additive effects are estimated. Recent QTL studies, in particular in breed crosses, have found
several over-dominant QTL (e.g. De Koning et al., 2000; Malek et al., 2001). Although these
may represent linked QTL in repulsion phase, their apparent over-dominant effects can be
used to select for crossbred performance prior to the break-up of such linkage.
The use of QTL removes the requirement of crossbred testing in CCSP, thereby saving important
test resources and enabling the short generation intervals of purebred selection. Although a two-breed terminal cross was modeled here, results in principle apply to any crossbreeding system.
The choice between purebred QTL selection and CCSP will depend on the proportion of non-additive genetic variation contributed by the QTL, the impact of other factors that differentiate
purebred from commercial performance (GxE), the increase in generation intervals with CCSP,
and on the cost of crossbred testing versus the cost of implementing marker-assisted selection.
Benefits of using QTL could be further enhanced by mating based on QTL genotype at the
commercial level. This, however, requires extensive genotyping at the multiplier level.
Recent gene and QTL mapping studies have also revealed that QTL may not be expressed in a
Mendelian fashion. In particular, several studies have detected genes and QTL in pigs that are
subject to gametic imprinting (Jeon et al. 1999, De Koning et al. 2000). Future studies will
undoubtedly identify other epigenetic phenomena that affect the inheritance and expression of
QTL. These effects will need to be taken into account when designing selection programs.
Although they may on the one hand complicate selection programs, they may also provide
opportunities. For example, De Koning (2001) suggested that utilization of a combination of
imprinted and sex-linked QTL would allow a diverse set of markets to be targeted through
strategic crosses between a single set of breeds.