PBG 650 Advanced Plant Breeding
Module 7: Estimating Genetic Variances
– Why estimate genetic variances?
– Single factor mating designs

Why estimate genetic variances?
• New crop species
  – ensure adequate genetic variance for selection
  – determine appropriate type of cultivar
    • pure lines, hybrids, open-pollinated varieties
• Predict response to short- and long-term selection
• Use in selection indices
• Determine optimum number and location of testing environments
• Predict single-cross performance

Do Breeders Need to Estimate Genetic Variances?
• For breeders working with elite germplasm, it is often more useful to develop breeding populations for the purposes of selection than to estimate genetic variances
  – use parents with high means
  – make crosses between unrelated individuals to maintain high genetic variation (or assess diversity at the molecular level)
  – single-cross performance can be predicted from data routinely generated in breeding programs
  – recurrent selection is not widely used in breeding programs for major crop species
Bernardo, 2010, Chapt. 7

Do Breeders Need to Estimate Genetic Variances?
• Options in mating designs for self-pollinated crops are limited
• Potential of pure lines or open-pollinated varieties vs hybrids can be assessed by comparing means of these types of cultivars and by considering costs of hybrid seed production
• Precision of genetic variance estimates is often low
• Selection indices can be constructed that do not require input of genetic variances

What about newer crops, less developed germplasm? Genetic variances?
• Provides valuable baseline information for breeding initiatives for minor crops, new traits
• For many crops and situations, recurrent selection is more efficient than pedigree selection
• Need to distinguish between genetic and environmental correlations among traits
• Better understanding of environmental influences and G×E is essential for effective, well-targeted breeding efforts
Obtain estimates of genetic variances as an integral part of the breeding program
  – progeny trials, mapping populations
  – realized selection response, correlated selection response
  – monitor changes in genetic variances over time
  – accumulate information about inheritance of important traits

Classic approach for estimating genetic variances
• Develop one or more types of progeny
  – half-sibs, full-sibs, testcrosses, recombinant inbreds
• Evaluate progeny in a set of environments
  – representative of potential environments in target region
• Estimate variance components from mean squares in ANOVA (or directly using mixed models)
• Equate variance components with expectation based on covariances among relatives
# of variance components that can be estimated = # of covariances among relatives in the design

Assumptions
• Relatives are noninbred and belong to a particular random-mating reference population
  – estimates apply to that population alone
  – relatives must represent a random sample from the population
    • parents cannot be selected from the population, or chosen from different populations
    • parents can be inbred, as long as their progeny (relatives) are not inbred (use of inbred parents can increase precision)
• The usual assumptions for equilibrium also apply
  – diploid inheritance
  – no linkage or linkage disequilibrium
    • using fully inbred parents may reduce effects of linkage

Fixed vs Random effects
• Fixed effects
  • interested in the effects of the treatments per se
  • $\sum t_i = 0$
• Random effects
  • treatments are a random sample from a larger reference population that has a mean of 0 and variance $\sigma^2_t$
  • objectives are to extend conclusions to all members of the population
  • interested in estimating magnitude of variance among and within groups
  • $\sum t_i \neq 0$ for any given experiment

Single-factor analysis, one location
• Families and blocks are considered to be random effects

Source      df            MS     Expected Mean Square
Blocks      r-1           MSR    $\sigma^2_e + f\sigma^2_R$
Families    f-1           MSF    $\sigma^2_e + r\sigma^2_F$
Error       (r-1)(f-1)    MSE    $\sigma^2_e$

$\sigma^2_F = (MSF - MSE)/r$
$\sigma^2_F = Cov_{Family}$
However, the estimate of additive genetic variance will be biased upward if there is G×E or epistasis

Single-factor analysis, multiple environments
• An environment could be a location or a different year or season at the same location
• Environments are generally considered to be random, because we want to make inferences about the performance that could be expected at other potential sites in the target production environment
• Specific environments, such as irrigation, fertilizer levels, temperature or daylength regimes, would be fixed effects
• Note that aspects of the experimental design (blocks, locations) are often treated as fixed effects in molecular studies where the objective is to make associations between markers and phenotypes.

Single-factor analysis, multiple environments

Source              df              MS      Expected Mean Square
Years               y-1
Blocks/Years        y(r-1)
Families            f-1             MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Families x Years    (f-1)(y-1)      MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error               y(r-1)(f-1)     MSE     $\sigma^2_e$

$\sigma^2_F = (MSF - MSFY)/ry$   (not biased by G×E)
$\sigma^2_F = Cov_{Family}$

Additive genetic variance from single-factor design
(expectations shown without epistasis)

Relatives                     $\sigma^2_F = Cov_{Family}$                           $\sigma^2_A$
Half-sibs
  Common parent not inbred    $\frac{1}{4}\sigma^2_A$                               $4\sigma^2_F$
  Common parent inbred        $\frac{1}{2}\sigma^2_A$                               $2\sigma^2_F$
Full-sibs
  Parents not inbred          $\frac{1}{2}\sigma^2_A + \frac{1}{4}\sigma^2_D$       NA
  Parents inbred              $\sigma^2_A + \sigma^2_D$                             NA
Recombinant inbreds           $2\sigma^2_A$                                         $\frac{1}{2}\sigma^2_F$
Clones                        $\sigma^2_A + \sigma^2_D$                             NA

Genotypes divided into sets
• Large numbers of families can be divided into sets, and variances can be pooled across sets.

Source                   df               MS      Expected Mean Square
Years                    y-1
Sets                     s-1
Years x Sets             (y-1)(s-1)
Blocks/(Years x Sets)    (r-1)ys
Families/Sets            (f-1)s           MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Years x Families/Sets    (y-1)(f-1)s      MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error                    (r-1)(f-1)ys     MSE     $\sigma^2_e$

Calculation of $\sigma^2_A$ is the same as before

Example – single-factor analysis
• 60 maize S2 lines are allowed to open pollinate; bulked to form half-sib families
• 2 randomized complete blocks, 3 locations

Source                  df     MS      Mean Square    Expected Mean Square
Location                2
Blocks/Locations        3
Families                59     MSF     14.36          $\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F$
Families x Locations    118    MSFL    6.18           $\sigma^2_e + r\sigma^2_{FL}$
Error                   177    MSE     4.00           $\sigma^2_e$

Are there significant differences among families?
F test: MSF/MSFL = 14.36/6.18 = 2.32
Compare to F critical with 59, 118 df
Pr>F is <0.0001
Bernardo, pg 155

What is the level of inbreeding in the S2 parents?
• A family represents the alleles of its parents
  – Collectively, an S1 family has the same distribution of alleles as the S0 plant from which it was derived
Expected frequency of heterozygotes: $P_{12} = 2pq(1-F)$

Plants                      Families                      $P_{12}$                    F
F2 or S0                    F3 or S1                      $P_{12} = 2pq$              0
F3 or S1                    F4 or S2                      $(0.5)P_{12}$               0.5
F4 or S2                    F5 or S3                      $(0.25)P_{12}$              0.75
F5 or S3                    F6 or S4                      $(0.125)P_{12}$             0.875
$F_n$ or $S_{n-2}$          $F_{n+1}$ or $S_{n-1}$        $(1/2)^{n-2}P_{12}$         $1-(1/2)^{n-2}$

• The distinction between plants and families decreases as F approaches 1

Example – single-factor analysis
Estimate additive genetic variance
$\sigma^2_F = (MSF - MSFL)/rl = (14.36 - 6.18)/(2 \times 3) = 1.36$
$\sigma^2_A = \frac{4}{1+F}\,\sigma^2_F = \frac{4}{1+\frac{1}{2}}(1.36) = 3.63$

Heritability based on family means
• For animals, a family consists of multiple progeny from an individual
  – each of the progeny is a replicate
  – usually measure variance among progeny within each family
• For plants, we usually take collective measurements of multiple plants in a plot, and replicate the plots across reps and environments
• Heritabilities in plants are usually expressed on the basis of family means.
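  Their meaning will vary depending on the size of the plots, the number of replications, and the number of environments.

$h^2 = \frac{Cov(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar{X}}}$

Variance of family means

Source                  df     MS      Mean Square    Expected Mean Square
Families                59     MSF     14.36          $\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F$
Families x Locations    118    MSFL    6.18           $\sigma^2_e + r\sigma^2_{FL}$
Error                   177    MSE     4.00           $\sigma^2_e$

$\sigma^2_{\bar{X}} = \frac{MS_{error}}{rl} = \frac{MSFL}{rl} = \frac{6.18}{2 \times 3} = 1.03$
  – MSFL is the appropriate error term for families
  – rl is the number of observations on each family
  – think of $\sigma^2_{\bar{X}}$ as the square of the standard error of a family mean

$\sigma^2_P = \sigma^2_F + \sigma^2_{\bar{X}} = 1.36 + 1.03 = 2.39$
$\sigma^2_P = \frac{MSF}{rl} = \frac{\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F}{rl} = \frac{\sigma^2_e}{rl} + \frac{\sigma^2_{FL}}{l} + \sigma^2_F = \frac{14.36}{2 \times 3} = 2.39$

Heritability on a family mean basis

$h^2 = \frac{Cov(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar{X}}}$

For half-sib families, $\sigma^2_G = \sigma^2_F = \frac{1+F}{4}\sigma^2_A$:

$h^2 = \frac{\sigma^2_F}{\sigma^2_F + \frac{\sigma^2_{FL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{1.36}{1.36 + 1.03} = 0.57$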
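
The arithmetic in the half-sib maize example can be checked with a few lines of Python. This is only a minimal sketch, assuming the mean squares from the example (MSF = 14.36, MSFL = 6.18), r = 2 blocks, l = 3 locations, and F = 0.5 for the S1 plants that gave rise to the S2 lines. The function and variable names are illustrative and do not come from the slides.

```python
# Minimal sketch: variance components and family-mean heritability from the
# ANOVA mean squares of the half-sib example. All function and variable names
# are illustrative assumptions, not taken from the slides.


def inbreeding_of_selfed_plant(k):
    """Inbreeding coefficient of an S_k plant: F = 1 - (1/2)**k."""
    return 1.0 - 0.5 ** k


def family_variance(msf, msfl, r, l):
    """sigma^2_F = (MSF - MSFL) / (r * l)."""
    return (msf - msfl) / (r * l)


def additive_variance_half_sib(var_f, f_parent):
    """sigma^2_A = 4 / (1 + F) * sigma^2_F for half-sib families."""
    return 4.0 / (1.0 + f_parent) * var_f


def variance_of_family_mean(msfl, r, l):
    """sigma^2_Xbar = MSFL / (r * l), the squared SE of a family mean."""
    return msfl / (r * l)


def family_mean_heritability(msf, msfl, r, l):
    """h^2 = sigma^2_F / (sigma^2_F + sigma^2_Xbar)."""
    var_f = family_variance(msf, msfl, r, l)
    return var_f / (var_f + variance_of_family_mean(msfl, r, l))


# Values from the example table (r blocks per location, l locations).
MSF, MSFL = 14.36, 6.18
R, L = 2, 3
F_PARENT = inbreeding_of_selfed_plant(1)  # S2 families trace back to S1 plants

f_ratio = MSF / MSFL                       # F test for families
var_f = family_variance(MSF, MSFL, R, L)
var_a = additive_variance_half_sib(var_f, F_PARENT)
var_xbar = variance_of_family_mean(MSFL, R, L)
var_p = var_f + var_xbar                   # same quantity as MSF / (R * L)
h2 = family_mean_heritability(MSF, MSFL, R, L)

print(f"F ratio        = {f_ratio:.2f}")
print(f"sigma^2_F      = {var_f:.2f}")
print(f"sigma^2_A      = {var_a:.2f}")
print(f"sigma^2_Xbar   = {var_xbar:.2f}")
print(f"sigma^2_P      = {var_p:.2f}")
print(f"h^2 (fam mean) = {h2:.2f}")
```

Because the sketch carries the unrounded value of $\sigma^2_F$ (about 1.3633) into the next step, it reports $\sigma^2_A$ as roughly 3.64, whereas the slide multiplies the rounded 1.36 and shows 3.63. The other quantities agree with the slide values to two decimals.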