PBG 650 Advanced Plant Breeding
Module 7: Estimating Genetic Variances
– Why estimate genetic variances?
– Single factor mating designs

Why estimate genetic variances?
• New crop species
  – ensure adequate genetic variance for selection
  – determine appropriate type of cultivar
    • pure lines, hybrids, open-pollinated varieties
• Predict response to short and long-term selection
• Use in selection indices
• Determine optimum number and location of testing environments
• Predict single-cross performance

Do Breeders Need to Estimate Genetic Variances?
• For breeders working with elite germplasm, it is often more useful to develop breeding populations for the purposes of selection than to estimate genetic variances
  – use parents with high means
  – make crosses between unrelated individuals to maintain high genetic variation (or assess diversity at molecular level)
  – single-cross performance can be predicted from data routinely generated in breeding programs
  – recurrent selection is not widely used in breeding programs for major crop species
Bernardo, 2010, Chapt. 7

Do Breeders Need to Estimate Genetic Variances? (continued)
• Options in mating designs for self-pollinated crops are limited
• Potential of pure lines or open-pollinated varieties vs hybrids can be assessed by comparing means of these types of cultivars and by considering costs of hybrid seed production
• Precision of genetic variance estimates is often low
• Selection indices can be constructed that do not require input of genetic variances
What about newer crops, less developed germplasm?

Genetic variances?
• Provides valuable baseline information for breeding initiatives for minor crops, new traits
• For many crops and situations, recurrent selection is more efficient than pedigree selection
• Need to distinguish between genetic and environmental correlations among traits
• Better understanding of environmental influences and GXE is essential for effective, well-targeted breeding efforts

Obtain estimates of genetic variances as an integral part of breeding program
– progeny trials, mapping populations
– realized selection response, correlated selection response
– monitor changes in genetic variances over time
– accumulate information about inheritance of important traits

Classic approach for estimating genetic variances
• Develop one or more types of progeny
  – half-sibs, full-sibs, testcrosses, recombinant inbreds
• Evaluate progeny in a set of environments
  – representative of potential environments in target region
• Estimate variance components from mean squares in ANOVA (or directly using mixed models)
• Equate variance components with expectation based on covariances among relatives

# of variance components that can be estimated = # of covariances among relatives in the design

Assumptions
• Relatives are noninbred and belong to a particular random-mating reference population
  – estimates apply to that population alone
  – relatives must represent a random sample from the population
    • parents cannot be selected from the population, or chosen from different populations
    • parents can be inbred, as long as their progeny (relatives) are not inbred (use of inbred parents can increase precision)
• The usual assumptions for equilibrium also apply
  – diploid inheritance
  – no linkage or linkage disequilibrium
    • using fully inbred parents may reduce effects of linkage

Fixed vs Random effects
• Fixed effects
  – interested in the effects of the treatments per se
  – $\sum_i \tau_i = 0$
• Random effects
  – treatments are a random sample from a larger reference population that has a mean of 0 and
    variance $\sigma_t^2$
  – objectives are to extend conclusions to all members of the population
  – interested in estimating magnitude of variance among and within groups
  – $\sum_i t_i \neq 0$ for any given experiment

Single-factor analysis, one location
• Families and blocks are considered to be random effects

Source      df            MS     Expected Mean Square
Blocks      r-1           MSR    $\sigma_e^2 + f\sigma_R^2$
Families    f-1           MSF    $\sigma_e^2 + r\sigma_F^2$
Error       (r-1)(f-1)    MSE    $\sigma_e^2$

$\sigma_F^2 = (MS_F - MS_E)/r$
$\sigma_F^2 = Cov_{Family}$
However, estimate of additive genetic variance will be biased upward if there is GXE or epistasis

Single-factor analysis, multiple environments
• An environment could be a location or a different year or season at the same location
• Environments are generally considered to be random, because we want to make inferences about the performance that could be expected at other potential sites in the target production environment
• Specific environments, such as irrigation, fertilizer levels, temperature or daylength regimes, would be fixed effects
• Note that aspects of the experimental design (blocks, locations) are often treated as fixed effects in molecular studies where the objective is to make associations between markers and phenotypes.

Single-factor analysis, multiple environments

Source              df              MS      Expected Mean Square
Years               y-1
Blocks/Years        y(r-1)
Families            f-1             MSF     $\sigma_e^2 + r\sigma_{FY}^2 + ry\sigma_F^2$
Families x Years    (f-1)(y-1)      MSFY    $\sigma_e^2 + r\sigma_{FY}^2$
Error               y(r-1)(f-1)     MSE     $\sigma_e^2$

$\sigma_F^2 = (MS_F - MS_{FY})/(ry)$   (not biased by GXE)
$\sigma_F^2 = Cov_{Family}$

Additive genetic variance from single-factor design

Relatives                       $Cov_{Family} = \sigma_F^2$    $\sigma_A^2$
Half-sibs
  Common parent not inbred      $\frac{1}{4}\sigma_A^2$        $4\sigma_F^2$
  Common parent inbred          $\frac{1}{2}\sigma_A^2$        $2\sigma_F^2$
Full-sibs
  Parents not inbred            …                              NA
  Parents inbred                …                              NA
Recombinant inbreds             …                              …
Clones                          …                              NA

Genotypes divided into sets
• Large numbers of families can be divided into sets, and variances can be pooled across sets.

Source                     df              MS      Expected Mean Square
Years                      y-1
Sets                       s-1
Years x Sets               (y-1)(s-1)
Blocks/(Years x Sets)      (r-1)ys
Families/Sets              (f-1)s          MSF     $\sigma_e^2 + r\sigma_{FY}^2 + ry\sigma_F^2$
Years x Families/Sets      (y-1)(f-1)s     MSFY    $\sigma_e^2 + r\sigma_{FY}^2$
Error                      (r-1)(f-1)ys    MSE     $\sigma_e^2$

Calculation of $\sigma_A^2$ is the same as before

Example – single-factor analysis
• 60 maize S2 lines are allowed to open pollinate; bulked to form half-sib families
• 2 randomized complete blocks, 3 locations

Source                  df     MS      Mean Square    Expected Mean Square
Location                2
Blocks/Locations        3
Families                59     MSF     14.36          $\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2$
Families x Locations    118    MSFL    6.18           $\sigma_e^2 + r\sigma_{FL}^2$
Error                   177    MSE     4.00           $\sigma_e^2$

Are there significant differences among families?
F test: $MS_F / MS_{FL}$ = 14.36/6.18 = 2.32
Compare to F critical with 59, 118 df; Pr>F is <0.0001
Bernardo, pg 155

What is the level of inbreeding in the S2 parents?
• A family represents the alleles of its parents
  – Collectively, an S1 family has the same distribution of alleles as the S0 plant from which it was derived
Expected frequency of heterozygotes: $P_{12} = 2pq(1-F)$

Plants          Families            $P_{12}$               F
F2 or S0        F3 or S1            $2pq$                  0
F3 or S1        F4 or S2            $(0.5)\,2pq$           0.5
F4 or S2        F5 or S3            $(0.25)\,2pq$          0.75
F5 or S3        F6 or S4            $(0.125)\,2pq$         0.875
Fn or Sn-2      Fn+1 or Sn-1        $(1/2)^{n-2}\,2pq$     $1-(1/2)^{n-2}$

• The distinction between plants and families decreases as F approaches 1

Example – single-factor analysis (continued)
Using the mean squares from the example ANOVA ($MS_F$ = 14.36, $MS_{FL}$ = 6.18, $MS_E$ = 4.00; r = 2, l = 3):

Estimate additive genetic variance
$\sigma_F^2 = (MS_F - MS_{FL})/(rl)$ = (14.36-6.18)/(2*3) = 1.36
$\sigma_A^2 = \frac{4}{1+F}\,\sigma_F^2 = \frac{4}{1+\frac{1}{2}}(1.36)$ = 3.63

Heritability based on family means
• For animals, a family consists of multiple progeny from an individual
  – each of the progeny is a replicate
  – usually measure variance among progeny within each family
• For plants, we usually take collective measurements of multiple plants in a plot, and replicate the plots across reps and environments
• Heritabilities in plants are usually expressed on the basis of family means.
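  Meaning will vary depending on the size of the plots, number of replications and number of environments

$h^2 = \frac{Cov(G,P)}{\sigma_P^2} = \frac{\sigma_G^2}{\sigma_G^2 + \frac{\sigma_{GL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\sigma_G^2}{\sigma_G^2 + \sigma_{\bar{X}}^2}$

Variance of family means

Source                  df     MS      Mean Square    Expected Mean Square
Families                59     MSF     14.36          $\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2$
Families x Locations    118    MSFL    6.18           $\sigma_e^2 + r\sigma_{FL}^2$
Error                   177    MSE     4.00           $\sigma_e^2$

$\sigma_{\bar{X}}^2 = \frac{MS_{error}}{rl} = \frac{MS_{FL}}{rl} = \frac{6.18}{2 \cdot 3}$ = 1.03
  – $MS_{FL}$ is the appropriate error term for families
  – $rl$ is the number of observations on each family
  – think of this as the square of the standard error of a family mean

$\sigma_P^2 = \sigma_F^2 + \sigma_{\bar{X}}^2$ = 1.36 + 1.03 = 2.39
$\sigma_P^2 = \frac{MS_F}{rl} = \frac{\sigma_e^2 + r\sigma_{FL}^2 + rl\sigma_F^2}{rl} = \frac{\sigma_e^2}{rl} + \frac{\sigma_{FL}^2}{l} + \sigma_F^2 = \frac{14.36}{2 \cdot 3}$ = 2.39

Heritability on a family mean basis

$h^2 = \frac{Cov(G,P)}{\sigma_P^2} = \frac{\sigma_G^2}{\sigma_G^2 + \frac{\sigma_{GL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\sigma_G^2}{\sigma_G^2 + \sigma_{\bar{X}}^2}$

$h_F^2 = \frac{\sigma_F^2}{\sigma_F^2 + \frac{\sigma_{FL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{\frac{1+F}{4}\sigma_A^2}{\sigma_F^2 + \frac{\sigma_{FL}^2}{l} + \frac{\sigma_e^2}{rl}} = \frac{1.36}{1.36 + 1.03}$ = 0.57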
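
The arithmetic of the maize half-sib example can be retraced in a few lines of Python. This is a minimal sketch, not part of the original module: the function and variable names are hypothetical, the inputs are the mean squares quoted in the example (MSF = 14.36, MSFL = 6.18, r = 2, l = 3), and it assumes, following the inbreeding table, that an S2 family carries the inbreeding level of its S1 parent plant (F = 0.5).

```python
# Illustrative sketch only: all names are hypothetical; the numbers are the
# mean squares from the maize half-sib example (2 blocks, 3 locations).

def inbreeding_after_selfing(generations):
    """F of a plant after k generations of selfing a non-inbred S0: 1 - (1/2)^k."""
    return 1.0 - 0.5 ** generations

def family_variance(ms_f, ms_fl, n_reps, n_locs):
    """sigma^2_F = (MS_F - MS_FL) / (r * l)."""
    return (ms_f - ms_fl) / (n_reps * n_locs)

def additive_variance_half_sibs(var_f, f_parent):
    """sigma^2_A = 4 / (1 + F) * sigma^2_F for half-sib families."""
    return 4.0 / (1.0 + f_parent) * var_f

def family_mean_error_variance(ms_fl, n_reps, n_locs):
    """sigma^2_Xbar = MS_FL / (r * l), the squared standard error of a family mean."""
    return ms_fl / (n_reps * n_locs)

MS_F, MS_FL = 14.36, 6.18
N_REPS, N_LOCS = 2, 3

f_ratio = MS_F / MS_FL                      # F test for families, 59 and 118 df
f_parent = inbreeding_after_selfing(1)      # assumption: S2 family ~ S1 parent plant
var_f = family_variance(MS_F, MS_FL, N_REPS, N_LOCS)
var_a = additive_variance_half_sibs(var_f, f_parent)
var_xbar = family_mean_error_variance(MS_FL, N_REPS, N_LOCS)
var_p = var_f + var_xbar                    # phenotypic variance of family means
h2 = var_f / var_p                          # heritability on a family-mean basis

print(f"F ratio        = {f_ratio:.2f}")
print(f"sigma^2_F      = {var_f:.2f}")
print(f"sigma^2_A      = {var_a:.2f}")
print(f"sigma^2_Xbar   = {var_xbar:.2f}")
print(f"sigma^2_P      = {var_p:.2f}")
print(f"h^2 (fam mean) = {h2:.2f}")
```

One detail to watch when comparing with the slides: the sketch carries the unrounded family variance (about 1.3633) forward, so the additive variance comes out near 3.64, whereas the slides round the family variance to 1.36 first and report 3.63. The other quantities agree with the slide values to two decimals.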