Advanced Plant Breeding CSS 650
Take-Home Final Exam, Fall 2009
Due Wednesday, Dec. 9, 2009

Name

Show your work! This is essentially a final homework assignment. You can refer to your notes, journal articles, and textbooks. Do as much as you can on your own, but you may compare answers with your classmates if you wish.

1) (12 points) The eggplant breeder at a private seed company has just retired and you are hired to take over the program. You are given a set of 20 new dihaploid lines and wish to know their potential for use as parents in hybrids. As we learned from the poster session in class, hybrid seed production is currently done by hand pollination in eggplant. What mating designs could be used to estimate combining ability (hint: we discussed two in class and another is commonly used in hybrid corn breeding)? List any questions you would ask the retiree that would help you to choose the best design. For the purposes of this exam, assume specific responses to those questions and explain your consequent choice of a mating design. Draw a diagram showing how you would make the crosses.

2) A breeder wants to improve pearl millet as a grain and forage crop in North Dakota. In 2006 he evaluated 300 half-sib families from a breeding population in a yield trial at a single location with 3 replications (blocks). To assist in developing a selection index, he decided to estimate heritabilities and the genetic correlation (rA) between grain yield and plant height. He used the following SAS code to generate univariate analyses for both traits as well as an analysis of covariance for the two traits.

   proc glm;
      class Block Family;
      model Yield Height = Block Family;
      manova h=Family / printh printe;
      random Block Family / test;
   run;

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for Family (output from printh)

             Yield      Height
   Yield     161.46     1255.8
   Height    1255.8     83241.6

E = Error SSCP Matrix (output from printe)

             Yield      Height
   Yield     179.4      1315.6
   Height    1315.6     65780.0

He summarized the ANOVA for yield as follows:

   Source    df     SS        MS      F      Prob>F
   Block       2
   Family    299    161.46    0.54    1.8    0.0000
   Error     598    179.40    0.30

From these results he calculated the phenotypic variance among half-sib families to be 0.18 and obtained an estimate of heritability for yield on a family mean basis = 0.44.

Use the SAS output above to answer the following questions (show your work).

a) (8 points) Calculate phenotypic variance among half-sib families for plant height and estimate heritability on a family mean basis for this trait.

b) (8 points) What is the additive genetic correlation (rA) for yield and plant height? (You will need to calculate the genetic covariance for these traits in the same manner that you calculated the genetic variance among half-sib families for individual traits.)

c) (8 points) If he selected the best 15 families for grain yield, what would be the expected response to selection?

He decided to validate these estimates by making separate selections for yield and plant height. He selected the best 15 families for each trait:

                                   Grain yield (t/ha)   Plant height (cm)
   Mean of all families            3.5                  170
   Mean of 15 selected families    4.3                  190

In 2007 he intermated remnant seed of selected families to form the C1 cycles of selection. In 2008 he evaluated the selection response:

                                   Cycle 0    Cycle 1
   Selection for yield
      Grain yield (t/ha)           3.8        4.1
      Plant height (cm)            176        180
   Selection for plant height
      Plant height (cm)            176        188
      Grain yield (t/ha)           3.8        3.9

d) (8 points) Calculate the realized heritability for yield.

e) (8 points) Use the relationship below to obtain an estimate of rA from the selection experiment.

$$r_A^2 = \frac{CR_X \cdot CR_Y}{R_X \cdot R_Y}$$

f) (5 points) Do estimates of h2 for yield and rA obtained from the half-sib trial in 2006 agree fairly well with results from the selection experiment?

3) In class, we discussed several approaches for using molecular markers to improve crops:

   Marker-based selection (MBS)
   Marker-assisted selection (MAS)
      o F2 enrichment
      o Marker-assisted backcrossing (MABC)
      o Marker-assisted recurrent selection (MARS)
   Genomic selection (GS)

For each of the scenarios below, indicate which approach(es) would be most appropriate, and explain your choice.

a) (5 points) A diploid, self-pollinating crop, with available marker density at ~5 cM throughout the genome. Three QTL have been identified that explain a large proportion of the variation for resistance to an important disease. Existing cultivars possess anywhere from 0-3 of the desired alleles at these loci. New lines must have higher yield than existing cultivars and acceptable quality to justify release.

b) (5 points) A commercial breeding company for a major, high value crop. Facilities and resources are available for high density, high throughput genotyping. Important traits such as yield are controlled by many QTL with small effects.

c) (5 points) A minor, relatively new crop that is cross-pollinated by insects. Hand pollinations are time-consuming and no male sterility system has been developed. Commercial varieties may be open-pollinated populations or synthetics. Several molecular markers have been developed using a candidate gene approach that impart desirable quality characteristics.

4) (8 pts) A number of papers have been published in recent years comparing the use of AMMI and GGE as techniques for analyzing genotype by environment interactions (GEI) in multilocational trials. Briefly explain the features of these linear-bilinear models and the difference between them.

5) Recall the BLUP example from Bernardo's text that we discussed in class and in lab.
The drawback of the IML program that we used was that one had to manually iterate the program many times to obtain the correct estimates of the breeding values and genetic variance. I have modified the PROC MIXED program I gave to you in lab so that it gives correct estimates in a single run. The main change is to include a type=lin(1) option in the Random statement.

   /* Performance data: set, number of locations, variety name, genotype code, trait mean y */
   data purelines;
      input set n_loc variety$ genotype y;
      datalines;
   1 18 Morex   1 4.45
   1 18 Robust  2 4.61
   1 18 Stander 4 5.27
   2  9 Robust  2 5.00
   2  9 Excel   3 5.82
   2  9 Stander 4 5.79
   ;

   /* Relationship matrix among the four genotypes, in the layout read by ldata= (parm, row, col1-col4) */
   data GR;
      input parm row col1-col4;
      datalines;
   1 1 2      1       0.875    0.6875
   1 2 1      2       1.6875   1.34375
   1 3 0.875  1.6875  2        1.421875
   1 4 0.6875 1.34375 1.421875 2
   ;
   run;

   options nodate nocenter;

   Proc Mixed data=purelines noclprint covtest;
      class genotype set;
      weight n_loc;   /* weight each mean by its number of locations */
      Model y=set/outpredm=LGGR outpred=PredGGR;
      Random genotype/ldata=GR type=lin(1) solution;
      lsmeans set;
      ods listing exclude solutionR;
      ods output solutionR=BLUPGGR;
      ods output covparms=VGGR;
   Proc print data=PredGGR;
   Run;

   Data BLUPs;
      Set BLUPGGR;
      Keep genotype BLUP Pred_Error P_value;
      BLUP=Estimate;
      Pred_Error=StdErrPred;
      P_Value=Probt;
   Proc print;
      Title1 'Genotype effect BLUPs, Prediction Error and P-Value for H0:BLUP=0';
   Run;
   Quit;

Output from the BLUP analysis

The Mixed Procedure
Covariance Parameter Estimates

   Cov Parm    Estimate    Standard Error    Z Value    Pr Z
   LIN(1)      0.3499      0.2968            1.18       0.2384
   Residual    0.05281     0.07784           0.68       0.2488

The linear term in the model estimates additive genetic variance. The residual is VR.
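
As a quick sanity check on how to read the covariance parameter output above, the Z values can be reproduced as estimate divided by standard error. The short Python sketch below is only an illustration, not part of the SAS program; the dictionary layout and the helper name wald_z are assumptions made for this example, and the numbers are copied from the output.

```python
# Estimates and standard errors copied from the Covariance Parameter Estimates output.
# The dictionary layout and the helper name wald_z are illustrative assumptions.
cov_parms = {
    "LIN(1)": {"estimate": 0.3499, "std_err": 0.2968},
    "Residual": {"estimate": 0.05281, "std_err": 0.07784},
}


def wald_z(estimate, std_err):
    """Return the Wald Z statistic: estimate divided by its standard error."""
    return estimate / std_err


# Round to two decimals, the precision printed in the Z Value column.
z_values = {name: round(wald_z(**vals), 2) for name, vals in cov_parms.items()}

for name, z in z_values.items():
    print(f"{name}: Z = {z}")
```

Both rounded ratios agree with the Z Value column, which suggests the estimates and standard errors are being read from the correct columns.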

Least Squares Means

   Effect    set    Estimate    Standard Error    DF    t Value    Pr > |t|
   set       1      4.8874      0.6754            1     7.24       0.0874
   set       2      5.3525      0.6767            1     7.91       0.0801

These are the BLUEs for the fixed effects in the model.

   Obs  set  n_loc  variety  genotype  y     Pred     StdErrPred  DF  Alpha  Lower    Upper    Resid
   1    1    18     Morex    1         4.45  4.45182  0.054045    1   0.05   3.76512  5.13853  -0.001821
   2    1    18     Robust   2         4.61  4.59133  0.049300    1   0.05   3.96492  5.21775   0.018666
   3    1    18     Stander  4         5.27  5.28684  0.049335    1   0.05   4.65998  5.91371  -0.016845
   4    2     9     Robust   2         5.00  5.05642  0.062002    1   0.05   4.26862  5.84422  -0.056420
   5    2     9     Excel    3         5.82  5.80165  0.075379    1   0.05   4.84387  6.75943   0.018351
   6    2     9     Stander  4         5.79  5.75193  0.062191    1   0.05   4.96172  6.54214   0.038069

Genotype effect BLUPs, Prediction Error and P-Value for H0:BLUP=0

   Obs    genotype    BLUP        Pred_Error    P_Value
   1      1           -0.43562    0.67572       0.63546
   2      2           -0.29611    0.67634       0.73729
   3      3            0.44912    0.67947       0.62817
   4      4            0.39941    0.67558       0.66009

These are the BLUPs and the standard errors that you would obtain if you iterated the matrix calculations many times.

a) Run the same analysis using a subset of the data from the Minnesota barley breeding program that we used for the TASSEL demonstration (MN06ex2.xls). Fifty breeding lines were evaluated for yield at two locations (set=1). In addition, the first 25 lines were evaluated at a third location (set=2). The kinship matrix (also in MN06ex2.xls) was obtained from TASSEL using SNP data. A single column of 1's was added in the first column to obtain the correct format for the LIN(1) covariance matrix option in SAS. You do not need to submit your program or output with this exam, provided that you can answer questions b and c.

b) (5 pts) Give the estimate you obtained for additive genetic variance and its standard error.

c) (10 pts) Identify the two lines with the highest breeding values. If you crossed these lines, what is the expected breeding value of an inbred line (RIL) derived from the cross?

d) (5 pts) If you were the barley breeder in Minnesota, would you use BLUPs to choose parents for making crosses or would you use the mean yield across sites? Explain your answer.
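
Supplementary note on question 2: the worked yield example given there (phenotypic variance among half-sib families = 0.18, family-mean heritability = 0.44) can be reproduced from the ANOVA summary. The Python sketch below assumes the usual expected mean squares for a randomized complete block design at a single location, so that the variance of family means is MS(Family)/r and the family variance component is (MS(Family) - MS(Error))/r. That assumption is consistent with the numbers stated in the exam but is not spelled out there, and the variable names are illustrative.

```python
# Degrees of freedom and sums of squares copied from the yield ANOVA in question 2.
REPS = 3  # replications (blocks) at the single location

family = {"df": 299, "ss": 161.46}
error = {"df": 598, "ss": 179.40}

# Mean squares = SS / df
ms_family = family["ss"] / family["df"]
ms_error = error["ss"] / error["df"]

# Assumed expected mean squares: E[MS_family] = var_e + r * var_family, E[MS_error] = var_e
var_phenotypic = ms_family / REPS           # phenotypic variance of family means
var_family = (ms_family - ms_error) / REPS  # variance component among families
h2_family_mean = var_family / var_phenotypic

print(f"MS family = {ms_family:.2f}, MS error = {ms_error:.2f}")
print(f"Phenotypic variance among family means = {var_phenotypic:.2f}")
print(f"Heritability on a family mean basis = {h2_family_mean:.2f}")
```

Under these assumptions the sketch returns the same 0.18 and 0.44 reported for yield in question 2.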