Single Factor Mating Designs

PBG 650 Advanced Plant Breeding
Module 7: Estimating Genetic Variances
– Why estimate genetic variances?
– Single factor mating designs
Why estimate genetic variances?
• New crop species
  – ensure adequate genetic variance for selection
  – determine appropriate type of cultivar
    • pure lines, hybrids, open-pollinated varieties
• Predict response to short and long-term selection
• Use in selection indices
• Determine optimum number and location of testing environments
• Predict single-cross performance
Do Breeders Need to Estimate Genetic Variances?
• For breeders working with elite germplasm, it is often more useful to develop breeding populations for the purposes of selection than to estimate genetic variances
  – use parents with high means
  – make crosses between unrelated individuals to maintain high genetic variation (or assess diversity at the molecular level)
  – single-cross performance can be predicted from data routinely generated in breeding programs
  – recurrent selection is not widely used in breeding programs for major crop species
Bernardo, 2010, Chapter 7
Do Breeders Need to Estimate Genetic Variances?
• Options in mating designs for self-pollinated crops are limited
• Potential of pure lines or open-pollinated varieties vs. hybrids can be assessed by comparing means of these types of cultivars and by considering costs of hybrid seed production
• Precision of genetic variance estimates is often low
• Selection indices can be constructed that do not require input of genetic variances
What about newer crops, less developed germplasm?
Genetic variances?
• Provides valuable baseline information for breeding initiatives for minor crops, new traits
• For many crops and situations, recurrent selection is more efficient than pedigree selection
• Need to distinguish between genetic and environmental correlations among traits
• Better understanding of environmental influences and GXE is essential for effective, well-targeted breeding efforts
→ Obtain estimates of genetic variances as an integral part of the breeding program
  – progeny trials, mapping populations
  – realized selection response, correlated selection response
  – monitor changes in genetic variances over time
  – accumulate information about inheritance of important traits
Classic approach for estimating genetic variances
• Develop one or more types of progeny
  – half-sibs, full-sibs, testcrosses, recombinant inbreds
• Evaluate progeny in a set of environments
  – representative of potential environments in target region
• Estimate variance components from mean squares in ANOVA (or directly using mixed models)
• Equate variance components with expectations based on covariances among relatives
Number of variance components that can be estimated = number of covariances among relatives in the design
Assumptions
• Relatives are noninbred and belong to a particular random-mating reference population
  – estimates apply to that population alone
  – relatives must represent a random sample from the population
    • parents cannot be selected from the population, or chosen from different populations
    • parents can be inbred, as long as their progeny (relatives) are not inbred (use of inbred parents can increase precision)
• The usual assumptions for equilibrium also apply
  – diploid inheritance
  – no linkage or linkage disequilibrium
    • using fully inbred parents may reduce effects of linkage
Fixed vs Random effects
• Fixed effects
  • interested in the effects of the treatments per se
  • $\sum \tau_i = 0$
• Random effects
  • treatments are a random sample from a larger reference population that has a mean of 0 and variance $\sigma_t^2$
  • objectives are to extend conclusions to all members of the population
  • interested in estimating magnitude of variance among and within groups
  • $\sum t_i \neq 0$ for any given experiment
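The contrast between the two sum constraints can be illustrated with a small simulation. This is only a sketch: the treatment means, the population variance of 4.0, the sample size of 10, and the seed are all made up for illustration and do not come from the course material.

```python
import random

random.seed(650)  # arbitrary seed so the sketch is repeatable

# Fixed effects: the treatments themselves are of interest, and their
# effects are defined as deviations from their own mean, so they sum to 0.
treatment_means = [52.0, 47.5, 50.5]  # hypothetical treatment means
grand_mean = sum(treatment_means) / len(treatment_means)
fixed_effects = [m - grand_mean for m in treatment_means]
fixed_sum = sum(fixed_effects)

# Random effects: each effect is a draw from a population with mean 0 and
# variance sigma_t^2, so the effects sampled in one experiment need not sum to 0.
sigma_t2 = 4.0  # hypothetical population variance
random_effects = [random.gauss(0.0, sigma_t2 ** 0.5) for _ in range(10)]
random_sum = sum(random_effects)

print(f"sum of fixed effects:  {fixed_sum:.6f}")
print(f"sum of random effects: {random_sum:.6f}")
```

Rerunning with a different seed changes the random-effect sum but never the fixed-effect sum, which is the practical meaning of the two constraints.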
Single-factor analysis, one location
• Families and blocks are considered to be random effects

Source      df            MS     Expected Mean Square
Blocks      r-1           MSR    $\sigma_e^2 + f\sigma_R^2$
Families    f-1           MSF    $\sigma_e^2 + r\sigma_F^2$
Error       (r-1)(f-1)    MSE    $\sigma_e^2$

$\sigma_F^2 = (MS_F - MS_E)/r = Cov_{Family}$

However, the estimate of additive genetic variance will be biased upward if there is GXE or epistasis.
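As a rough illustration of how the mean squares and the family variance component come out of plot data at one location, the sketch below runs a randomized complete block ANOVA by hand. The function name `rcbd_variance_components` and the six plot values are hypothetical; a real analysis would normally use a statistics package or a mixed model.

```python
def rcbd_variance_components(y):
    """y[i][j] is the plot value for family i in block j (one location)."""
    f = len(y)        # number of families
    r = len(y[0])     # number of blocks (replications)
    grand = sum(sum(row) for row in y) / (f * r)
    family_means = [sum(row) / r for row in y]
    block_means = [sum(y[i][j] for i in range(f)) / f for j in range(r)]

    # Sums of squares for the two-way layout without replication within cells
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_families = r * sum((m - grand) ** 2 for m in family_means)
    ss_blocks = f * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_families - ss_blocks

    msr = ss_blocks / (r - 1)
    msf = ss_families / (f - 1)
    mse = ss_error / ((r - 1) * (f - 1))
    # Equate mean squares to their expectations: E(MSF) - E(MSE) = r * sigma2_F
    return {"MSR": msr, "MSF": msf, "MSE": mse,
            "sigma2_F": (msf - mse) / r, "sigma2_e": mse}


# Hypothetical data: 3 families (rows) x 2 blocks (columns)
plots = [[10.0, 12.0], [14.0, 15.0], [19.0, 20.0]]
result = rcbd_variance_components(plots)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```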
Single-factor analysis, multiple environments
• An environment could be a location or a different year or
season at the same location
• Environments are generally considered to be random,
because we want to make inferences about the
performance that could be expected at other potential
sites in the target production environment
• Specific environments, such as irrigation, fertilizer levels,
temperature or daylength regimes, would be fixed effects
• Note that aspects of the experimental design (blocks,
locations) are often treated as fixed effects in molecular
studies where the objective is to make associations
between markers and phenotypes.
Single-factor analysis, multiple environments
Source              df             MS      Expected Mean Square
Years               y-1
Blocks/Years        y(r-1)
Families            f-1            MSF     $\sigma_e^2 + r\sigma_{FY}^2 + ry\sigma_F^2$
Families x Years    (f-1)(y-1)     MSFY    $\sigma_e^2 + r\sigma_{FY}^2$
Error               y(r-1)(f-1)    MSE     $\sigma_e^2$

$\sigma_F^2 = (MS_F - MS_{FY})/(ry) = Cov_{Family}$

Not biased by GXE
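Why the multi-environment estimate escapes the GXE bias can be read from the expected mean squares. The derivation below is a sketch in the table's symbols. The single-environment line assumes the usual result that, with only one year sampled, the family-by-year interaction cannot be separated from the family effect.

```latex
\begin{align*}
E(MS_F) - E(MS_{FY})
  &= (\sigma_e^2 + r\sigma_{FY}^2 + ry\,\sigma_F^2) - (\sigma_e^2 + r\sigma_{FY}^2) \\
  &= ry\,\sigma_F^2
  \quad\Rightarrow\quad
  \hat{\sigma}_F^2 = \frac{MS_F - MS_{FY}}{ry} \\[4pt]
\text{one environment } (y = 1):\quad
E(MS_F) - E(MS_E)
  &= r\,(\sigma_F^2 + \sigma_{FY}^2)
  \quad\Rightarrow\quad
  \frac{MS_F - MS_E}{r} \text{ estimates } \sigma_F^2 + \sigma_{FY}^2
\end{align*}
```

The interaction term cancels in the first difference but not in the second, so only the single-environment estimate is inflated by $\sigma_{FY}^2$.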
Additive genetic variance from single-factor design
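Relatives                      $\sigma_F^2 = Cov_{Family}$                                      $\sigma_A^2$
Half-sibs
  Common parent not inbred     $\frac{1}{4}\sigma_A^2$                                          $4\sigma_F^2$
  Common parent inbred         $\frac{1+F}{4}\sigma_A^2$                                        $\frac{4}{1+F}\sigma_F^2$
Full-sibs
  Parents not inbred           $\frac{1}{2}\sigma_A^2 + \frac{1}{4}\sigma_D^2$                  NA
  Parents inbred               $\frac{1+F}{2}\sigma_A^2 + \frac{(1+F)^2}{4}\sigma_D^2$          NA
Recombinant inbreds            $2\sigma_A^2$                                                    $\frac{1}{2}\sigma_F^2$
Clones                         $\sigma_G^2$                                                     NA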
Genotypes divided into sets
• Large numbers of families can be divided into sets, and variances can be pooled across sets.

Source                   df               MS      Expected Mean Square
Years                    y-1
Sets                     s-1
Years x Sets             (y-1)(s-1)
Blocks/(Years x Sets)    (r-1)ys
Families/Sets            (f-1)s           MSF     $\sigma_e^2 + r\sigma_{FY}^2 + ry\sigma_F^2$
Years x Families/Sets    (y-1)(f-1)s      MSFY    $\sigma_e^2 + r\sigma_{FY}^2$
Error                    (r-1)(f-1)ys     MSE     $\sigma_e^2$

Calculation of $\sigma_A^2$ is the same as before
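Pooling across sets amounts to adding the within-set sums of squares and dividing by the pooled degrees of freedom, which is where the (f-1)s in the table comes from. The sketch below shows that weighting. The helper name `pooled_mean_square` and the per-set mean squares are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def pooled_mean_square(mean_squares, dfs):
    """Pool per-set mean squares: total sum of squares over total df."""
    total_ss = sum(ms * df for ms, df in zip(mean_squares, dfs))
    return total_ss / sum(dfs)


# Hypothetical Families mean squares from s = 3 sets of f = 20 families each
set_msf = [12.0, 15.0, 9.0]
set_df = [19, 19, 19]          # f - 1 within each set

pooled_msf = pooled_mean_square(set_msf, set_df)
pooled_df = sum(set_df)        # equals (f - 1) * s

print(f"pooled MSF = {pooled_msf:.2f} on {pooled_df} df")
```

With equal set sizes the pooled value is the simple average of the per-set mean squares. With unequal sizes, larger sets carry more weight.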
Example – single-factor analysis
• 60 maize S2 lines are allowed to open pollinate; bulked to form half-sib families
• 2 randomized complete blocks, 3 locations

Source                  df     MS      Mean Square    Expected Mean Square
Location                2
Blocks/Locations        3
Families                59     MSF     14.36          $\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2$
Families x Locations    118    MSFL    6.18           $\sigma_e^2 + r\sigma_{FL}^2$
Error                   177    MSE     4.00           $\sigma_e^2$

Are there significant differences among families?
F test: MSF/MSFL = 14.36/6.18 = 2.32
Pr > F is < 0.0001
Compare to F critical with 59, 118 df
Bernardo, p. 155
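The degrees of freedom and the F ratio in this example can be checked directly from the layout. The sketch below assumes the standard df formulas for a randomized complete block design repeated across locations. It does not compute the p-value, which would need an F-distribution routine from a statistics library.

```python
# Layout of the example: 60 half-sib families, 2 blocks, 3 locations
f, r, l = 60, 2, 3

df = {
    "Location": l - 1,
    "Blocks/Locations": l * (r - 1),
    "Families": f - 1,
    "Families x Locations": (f - 1) * (l - 1),
    "Error": l * (r - 1) * (f - 1),
}

# Mean squares reported in the example
ms_f, ms_fl, ms_e = 14.36, 6.18, 4.00

# With random locations, E(MSF) and E(MSFL) differ only by r*l*sigma2_F,
# so MSFL (not MSE) is the denominator for testing families.
f_ratio = ms_f / ms_fl

for source, value in df.items():
    print(f"{source:22s} {value:4d}")
print(f"F = {f_ratio:.2f} on {df['Families']}, {df['Families x Locations']} df")
```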
What is the level of inbreeding in the S2 parents?
• A family represents the alleles of its parents
  – Collectively, an S1 family has the same distribution of alleles as the S0 plant from which it was derived

Expected frequency of heterozygotes: $P_{12} = 2pq(1-F)$

Plants                   Families                     $P_{12}$                F
F2 or S0                 F3 or S1                     $2pq$                   0
F3 or S1                 F4 or S2                     $(0.5)\,2pq$            0.5
F4 or S2                 F5 or S3                     $(0.25)\,2pq$           0.75
F5 or S3                 F6 or S4                     $(0.125)\,2pq$          0.875
$F_n$ or $S_{n-2}$       $F_{n+1}$ or $S_{n-1}$       $(1/2)^{n-2}\,2pq$      $1-(1/2)^{n-2}$

• The distinction between plants and families decreases as F approaches 1
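The table can be reproduced with two short functions. This is a minimal sketch that assumes repeated selfing from a noninbred S0 (F2) plant; the function names are hypothetical. Following the table, a family is assigned the inbreeding coefficient of the plant generation it was derived from.

```python
def inbreeding_of_plants(k):
    """F for S_k plants, i.e. F_(k+2) plants: F = 1 - (1/2)**k."""
    return 1.0 - 0.5 ** k


def inbreeding_of_family(k):
    """F for an S_k family, taken from its S_(k-1) parent plant."""
    return inbreeding_of_plants(k - 1)


def heterozygote_frequency(p, k):
    """Expected P12 = 2pq(1 - F) among S_k plants for allele frequency p."""
    q = 1.0 - p
    return 2.0 * p * q * (1.0 - inbreeding_of_plants(k))


for k in range(4):
    print(f"S{k} plants: F = {inbreeding_of_plants(k):.3f}")
print(f"S2 families: F = {inbreeding_of_family(2):.3f}")
```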
Example – single-factor analysis
Estimate additive genetic variance

$\sigma_F^2 = (MS_F - MS_{FL})/(rl) = (14.36 - 6.18)/(2 \times 3) = 1.36$

$\sigma_A^2 = \frac{4}{1+F}\,\sigma_F^2 = \frac{4}{1+\frac{1}{2}}\,(1.36) = 3.63$
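The same arithmetic can be scripted so that all three variance components come out of the mean squares at once. This is a sketch that uses the example's values with F = 0.5. The family-by-location component is not reported in the example and is derived here from the expected mean squares. Carrying full precision gives an additive variance of about 3.64, whereas rounding the family variance to 1.36 first gives the 3.63 shown above.

```python
r, l = 2, 3                              # blocks per location, locations
ms_f, ms_fl, ms_e = 14.36, 6.18, 4.00    # mean squares from the example
F = 0.5                                  # inbreeding coefficient of the S2 parents

# Equate each mean square to its expectation and solve from the bottom up
sigma2_e = ms_e
sigma2_fl = (ms_fl - ms_e) / r
sigma2_f = (ms_f - ms_fl) / (r * l)

# Half-sib families from parents with inbreeding F: sigma2_F = (1 + F)/4 * sigma2_A
sigma2_a = 4.0 / (1.0 + F) * sigma2_f
sigma2_a_rounded_input = 4.0 / (1.0 + F) * round(sigma2_f, 2)

print(f"sigma2_e  = {sigma2_e:.2f}")
print(f"sigma2_FL = {sigma2_fl:.2f}")
print(f"sigma2_F  = {sigma2_f:.2f}")
print(f"sigma2_A  = {sigma2_a:.2f} (full precision)")
print(f"sigma2_A  = {sigma2_a_rounded_input:.2f} (from rounded sigma2_F)")
```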
Heritability based on family means
• For animals, a family consists of multiple progeny from an individual
  – each of the progeny is a replicate
  – usually measure variance among progeny within each family
• For plants, we usually take collective measurements of multiple plants in a plot, and replicate the plots across reps and environments
• Heritabilities in plants are usually expressed on the basis of family means. Meaning will vary depending on the size of the plots, number of replications and number of environments

$h^2 = \frac{Cov(G,P)}{\sigma_P^2} = \frac{\sigma_G^2}{\sigma_G^2 + \frac{\sigma_{GL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\sigma_G^2}{\sigma_G^2 + \sigma_{\bar{X}}^2}$
Variance of family means
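Source                  df     MS      Mean Square    Expected Mean Square
Families                59     MSF     14.36          $\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2$
Families x Locations    118    MSFL    6.18           $\sigma_e^2 + r\sigma_{FL}^2$
Error                   177    MSE     4.00           $\sigma_e^2$

$\sigma_{\bar{X}}^2 = \frac{MS_{error}}{rl} = \frac{MS_{FL}}{rl} = \frac{6.18}{2 \times 3} = 1.03$
  – numerator: appropriate error term for families
  – denominator: number of observations on each family
  – think of this as the square of the standard error of a family mean

$\sigma_P^2 = \sigma_F^2 + \sigma_{\bar{X}}^2 = 1.36 + 1.03 = 2.39$

$\sigma_P^2 = \frac{MS_F}{rl} = \frac{\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2}{rl} = \frac{\sigma_e^2}{rl} + \frac{\sigma_{FL}^2}{l} + \sigma_F^2 = \frac{14.36}{2 \times 3} = 2.39$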
Heritability on a family mean basis
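$h^2 = \frac{Cov(G,P)}{\sigma_P^2} = \frac{\sigma_G^2}{\sigma_P^2}$, where $\sigma_G^2 = \sigma_F^2$ and $\sigma_{GL}^2 = \sigma_{FL}^2$

$h^2 = \frac{\sigma_F^2}{\sigma_F^2 + \frac{\sigma_{FL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\frac{1+F}{4}\sigma_A^2}{\frac{1+F}{4}\sigma_A^2 + \frac{\sigma_{GL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\sigma_G^2}{\sigma_G^2 + \sigma_{\bar{X}}^2} = \frac{1.36}{1.36 + 1.03} = 0.57$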