Advanced Plant Breeding PBG 650
Take-Home Final Exam, Fall 2015
Due 9:30 am on Friday, December 11, 2015

Name: KEY

Part 2 – Recurrent Selection, Inbred/Hybrid Selection, Correlated Traits

1) You intend to carry out recurrent selection in a maize population in order to develop improved open-pollinated varieties for the Republic of Benin, where there is presently no private seed company marketing hybrid seed. There is a demand for varieties that have good resistance to ear rot and good husk cover (the husk is tight and extends well beyond the ear tip), to reduce losses due to grain weevils in storage. Farmers in the area grow one crop of maize during the rainy season. It is possible for you to complete two generations in a year, in the main season under rainfed conditions and in a dry season nursery under irrigation at your research station. S1 family selection and full-sib family selection are both easy to employ in maize, but you would like to choose the method that is most efficient.

(12 points) a) Briefly describe the steps involved in these two selection schemes and discuss the factors that would influence your choice of methods in this situation. What decision will you make regarding the relative efficiency of these two methods? Indicate any assumptions that you are making in your discussion. (Hint: either answer is acceptable if you can justify your choice.)

    Method       Expected gain                                Generations per cycle
    Full-sib     $i\,(1/2)\,\sigma^2_A / \sigma_{P_{FS}}$      2
    S1 family    $i\,\sigma^2_A / \sigma_{P_{S1}}$             3

The full-sib scheme consists of two generations: 1) formation of families (plant-to-plant crosses), and 2) evaluation of progeny (full-sib families). Recombination is achieved when plant-to-plant crosses are made to form the next generation of full-sib families. Progeny should be evaluated in the main cropping season, but new families can be made in the off-season, so one cycle of selection can be completed each year.
The S1 scheme consists of three generations: 1) formation of families (selfing), 2) evaluation of progeny (S1 families), and 3) recombination of selected families. Because evaluation must be carried out in the main season, it will take two years to complete a cycle.

The S1 scheme takes advantage of all of the additive genetic variance in the population, whereas full-sib selection only benefits from ½ of the additive genetic variance. However, the phenotypic variance for S1 selection will be correspondingly larger, which reduces the expected gain from selection. With full-sib family selection, dominance variance contributes $\tfrac{1}{4}\sigma^2_D$ to the genetic and phenotypic variance among families, but is not amenable to selection (it is not part of the numerator of the expected gain formula). If there is a lot of dominance for either trait of interest, that would tend to reduce the effectiveness of full-sib selection in comparison to S1 family selection. Both types of selection offer opportunities for additional selection within families during the recombination phase, if selected families are planted in rows.

Overall, the schemes appear to be comparable in terms of progress expected per year. Other factors may determine the final decision. For example, reciprocal crosses between two noninbred plants provide more seed for testing than a single selfed ear. In addition, the effective population size for the full-sib scheme is twice as large as for S1 family selection, for an equal number of families evaluated and percentage of families selected. An advantage of the S1 scheme is that you would get a season off every two years, which would permit you to stagger your breeding trials and potentially to include another population in your program.

b) The weather during the first year of trials is hot and dry, and it is difficult to distinguish levels of ear rot resistance. You decide to select primarily for husk cover (length of husk extension beyond the tip of the ear) that season.
The mean of 250 families is 2.5 cm. You select the best families, which have a mean of 3.1 cm. After recombining the selected families, you conduct a trial and include both the original cycle of selection (C0) and your improved cycle of selection (C1). You note that the average for the improved population (C1) is now 2.8 cm, and that the C0 this year has a mean of 2.4 cm. What is the realized heritability for this trait?

(5 points)

\[ h^2 = \frac{R}{S} = \frac{\bar{X}_{C1} - \bar{X}_{C0}}{\bar{X}_{S} - \bar{X}_{0}} = \frac{2.8 - 2.4}{3.1 - 2.5} = \frac{0.4}{0.6} = 0.667 \]

2) You are working with a new crop that is naturally outcrossing, but can be readily self-pollinated. You are trying to decide whether to breed synthetic varieties, or if it would be worth the additional cost to produce hybrid seed. All possible single crosses among 6 inbred parents were evaluated for yield, and the averages for each cross are shown below. The yields of the inbred parents are shown on the diagonal.

          A     B     C     D     E     F
    A    23
    B    40    25
    C    46    42    26
    D    37    41    40    21
    E    39    36    37    38    18
    F    38    36    45    35    40    24

(5 points) a) Use Wright's formula to predict the yield of a synthetic variety developed by random mating all of the single crosses.

\[ \hat{\mu}_{synthetic} = \bar{Y}_{ii'} - \frac{\bar{Y}_{ii'} - \bar{Y}_{i}}{n} = 39.333 - \frac{39.333 - 22.8333}{6} = 39.333 - 2.75 = 36.583 \]

(6 points) b) Consider the four parents that have the highest yield per se. Estimate the yield that could be obtained from all possible double crosses involving these four parents (there are 3 possible combinations). Which of these double crosses would be expected to give you the highest yield? How does that compare to the predicted yield of the synthetic and to the best possible single cross?

A, B, C, and F have the highest yields per se. Each double cross is predicted from the mean of its four non-parental single crosses.

    Single-cross parents     Non-parental single crosses       Predicted yield
    AB x CF                  AC 46, AF 38, BC 42, BF 36        40.5
    AC x BF                  AB 40, AF 38, BC 42, CF 45        41.25
    AF x BC                  AB 40, AC 46, BF 36, CF 45        41.75

Double cross (AxF)x(BxC) has the highest predicted yield, 41.75. This is less than the best single cross, AxC, which has a yield of 46.
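The predictions in parts a and b of question 2 can be reproduced with a short R sketch. It assumes the diallel table is entered as a symmetric matrix named `y` (a name chosen here for illustration), with inbred yields on the diagonal. `predict_double` is a hypothetical helper, not part of the exam materials; it averages the four non-parental single crosses, which is the calculation used for the double-cross values in the key.

```r
# Diallel yields: inbreds on the diagonal, single crosses off the diagonal
parents <- c("A", "B", "C", "D", "E", "F")
y <- matrix(NA_real_, 6, 6, dimnames = list(parents, parents))

# Fill the lower triangle column by column (A, AB, AC, ..., then B, BC, ...)
y[lower.tri(y, diag = TRUE)] <- c(
  23, 40, 46, 37, 39, 38,   # column A
  25, 42, 41, 36, 36,       # column B
  26, 40, 37, 45,           # column C
  21, 38, 35,               # column D
  18, 40,                   # column E
  24)                       # column F

# Mirror the lower triangle so that y["A", "C"] and y["C", "A"] agree
y[upper.tri(y)] <- t(y)[upper.tri(y)]

# Wright's formula for the synthetic
n <- length(parents)
inbred_mean <- mean(diag(y))
cross_mean <- mean(y[lower.tri(y)])
synthetic <- cross_mean - (cross_mean - inbred_mean) / n

# Hypothetical helper: predict (p1 x p2) x (p3 x p4) from the
# four single crosses that are NOT parents of the double cross
predict_double <- function(p1, p2, p3, p4) {
  mean(c(y[p1, p3], y[p1, p4], y[p2, p3], y[p2, p4]))
}

double_cross <- c(
  "(AxB)x(CxF)" = predict_double("A", "B", "C", "F"),
  "(AxC)x(BxF)" = predict_double("A", "C", "B", "F"),
  "(AxF)x(BxC)" = predict_double("A", "F", "B", "C"))

round(synthetic, 3)
double_cross
```

Printing `double_cross` alongside `max(y[lower.tri(y)])` and `synthetic` gives the three quantities compared in the answer to part b.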
The best double cross is better than the synthetic, which has an estimated yield of 36.58.

3) You know from a previous study that the additive genetic variance in the F2 generation of a particular biparental cross is 25 units, and the dominance variance is 6 units.

a) Calculate the expected additive and dominance variance among and within families in the F4 generation (assume that each F4 family was derived by selfing an individual F3 plant).

(4 points)

                    Among families                   Within families
    $\sigma^2_A$    (3/2) x 25 = 37.5                (1/4) x 25 = 6.25
    $\sigma^2_D$    (3/16) x 6 = 1.125               (1/4) x 6 = 1.5

b) If selfing continues by single-seed descent, what will be the approximate additive and dominance variance among and within families in the F10 generation (i.e., assume the inbreeding coefficient F≈1)?

(4 points)

                    Among families                   Within families
    $\sigma^2_A$    2 x 25 = 50                      0
    $\sigma^2_D$    0                                0

4) A breeder wants to improve meadowfoam as an oilseed crop in Oregon. In 2012 she evaluated half-sib families from a breeding population in a yield trial at a single location using a lattice design with two complete replications (blocks). For the purposes of this exercise, data for yield (in lbs/acre), thousand seed weight (TSW in g), and oil content (percent by weight at ~10% moisture) will be analyzed for a subset of 87 families (called entries), ignoring the incomplete blocking structure in the experiment.

This problem follows a similar format to a question on the 2011 final, but uses a different data set. In this case you have been provided with most of the necessary computer output and you are asked to perform a sample of the calculations and fill in the blanks rather than performing all of the computations yourself. The goal is to give you some familiarity with the analysis of correlated traits and provide a roadmap for future use, if needed. The data sets and programs are provided for reference, and you will not need them to complete the exam.
(You have the option to run the R program at the end for extra credit.)

To assist in developing a selection index, we will first calculate the genetic variance and covariance matrix for the three traits (on a family mean basis). The following SAS code was used to generate univariate analyses for all traits as well as an analysis of covariance among traits. The yield values were first divided by 100 to avoid problems of scale (yield would otherwise be on a very different scale from the other two traits).

    proc glm data=mf;
      class rep entry;
      model oil TSW yield = rep entry;
      manova h=entry / printh printe;
      random rep entry / test;
    run;

The GLM Procedure
Multivariate Analysis of Variance
H = Type III SSCP Matrix for Entry

              Oil            TSW        Yield
    Oil       128.032087     24.3648    98.3671
    TSW       24.364798      49.7222    39.9476
    Yield     98.3670954     39.9476    694.69

The GLM Procedure
Multivariate Analysis of Variance
E = Error SSCP Matrix

              Oil            TSW        Yield
    Oil       34.4834115     -0.42441   -10.462
    TSW       -0.42440833    16.1758    2.24658
    Yield     -10.4615115    2.24658    215.582

a) We will use the traits TSW and Oil as an example to demonstrate how a genetic covariance can be calculated from an Analysis of Covariance. You will first need to calculate Mean Squares (and Mean Cross Products) by dividing both the Entry SSCP and the Error SSCP by their degrees of freedom (both have 86 df). Refer to your lecture notes to determine how to estimate the covariance for half-sib families for this combination of traits (TSW and Oil). We used a similar approach to estimate the Genetic Variance among half-sib families from an ANOVA at a single location.

(5 points)

\[ MCP_{HS} = \frac{24.365}{86} = 0.2833 \qquad MCP_{error} = \frac{-0.42441}{86} = -0.00493 \]

\[ Cov_{HS(XY)} = \frac{MCP_{HS} - MCP_{error}}{r} = \frac{0.2833 - (-0.00493)}{2} = 0.144123 \]

Genetic Covariance for Families

              Oil            TSW           Yield
    Oil       0.54388765     0.14412329    0.63272446
    TSW       0.144123293    0.19503695    0.21919216
    Yield     0.632724459    0.21919216    2.78551071

(5 points) b) Use your estimate of the genetic covariance from the previous question to calculate the genetic correlation between TSW and oil.
You will need to use the additive genetic variance for each trait from the table above (on the diagonal).

\[ r_A = \frac{\sigma_{A(XY)}}{\sqrt{\sigma^2_{A(X)}\,\sigma^2_{A(Y)}}} \qquad r_{A(TSW,oil)} = \frac{0.144123}{\sqrt{0.54388 \times 0.19503}} = 0.4425 \]

c) Calculation of genetic correlations using MANOVA will give the same results as a mixed model analysis when the data are balanced. For the MANOVA, each trait is considered to be a different variable and each appears in a different column. For the mixed model analysis, there is a single variable called 'trait', and the variable names (yield, TSW, and oil) represent different levels of that variable. The traits are handled as repeated measures on each plot. The following program was run in SAS (adapted from the article by Piepho and Möhring, 2011, Crop Sci. 51:1-6):

    proc mixed data=correl;
      class rep entry plot trait;
      model Y = trait trait*rep;
      random trait / subject=entry type=unr;
      repeated trait / subject=plot type=unr;
    run;

The output is explained below. Note that SAS automatically sorts the traits in alphabetical order, regardless of how they are sorted in the data set. In the SSCP matrix from the MANOVA, the variables are listed in the order that they appear in the Model statement. You can use this output to check your calculations for question 'b'.
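As a cross-check on parts a and b, the family covariance matrix and the genetic correlations can be recomputed directly from the two SSCP matrices printed by PROC GLM. This is a minimal sketch, assuming 86 df for both entry and error and r = 2 replications as stated in part a; the object names (`H`, `E`, `G`, `rG`) are chosen here for illustration.

```r
traits <- c("Oil", "TSW", "Yield")

# Entry (H) and error (E) SSCP matrices from the MANOVA output
H <- matrix(c(128.032087,  24.364798,  98.3670954,
               24.364798,  49.7222,    39.9476,
               98.3670954, 39.9476,   694.69),
            nrow = 3, byrow = TRUE, dimnames = list(traits, traits))
E <- matrix(c( 34.4834115, -0.42440833, -10.4615115,
               -0.42440833, 16.1758,      2.24658,
              -10.4615115,   2.24658,   215.582),
            nrow = 3, byrow = TRUE, dimnames = list(traits, traits))

df <- 86   # degrees of freedom for both entry and error
r  <- 2    # number of replications

# Mean squares and mean cross products
MCP_entry <- H / df
MCP_error <- E / df

# Variance-covariance matrix among half-sib family means
G <- (MCP_entry - MCP_error) / r

# Genetic correlations: scale the covariance matrix to a correlation matrix
rG <- cov2cor(G)

round(G, 4)
round(rG, 4)
```

The off-diagonal element `G["TSW", "Oil"]` corresponds to the hand calculation in part a, and `rG["TSW", "Oil"]` to the correlation in part b.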
Covariance Parameter Estimates

(5 points)

    Cov Parm     Subject    Estimate    Interpretation
    Var(1)       Entry      0.5439      Additive genetic variance for Oil
    Var(2)       Entry      0.195       Additive genetic variance for TSW
    Var(3)       Entry      2.7855      Additive genetic variance for Yield
    Corr(2,1)    Entry      0.4425      Additive genetic correlation between TSW and Oil
    Corr(3,1)    Entry      0.5141      Additive genetic correlation between Yield and Oil
    Corr(3,2)    Entry      0.2974      Additive genetic correlation between TSW and Yield
    Var(1)       PLOT       0.401       Error variance for Oil
    Var(2)       PLOT       0.1881      Error variance for TSW
    Var(3)       PLOT       2.5068      Error variance for Yield
    Corr(2,1)    PLOT       -0.018      Error correlation between TSW and Oil
    Corr(3,1)    PLOT       -0.1213     Error correlation between Yield and Oil
    Corr(3,2)    PLOT       0.03804     Error correlation between TSW and Yield

d) The heritability for TSW is 0.675. Use the information in the table to calculate heritability for oil content.

For TSW (not required, but included for reference):

\[ h^2 = \frac{\sigma^2_F}{\sigma^2_P} = \frac{0.195}{0.28905} = 0.675 \]

For oil content (required):

\[ \sigma^2_P = \sigma^2_F + \frac{\sigma^2_E}{r} = 0.5439 + \frac{0.401}{2} = 0.7444 \]

\[ h^2 = \frac{\sigma^2_F}{\sigma^2_P} = \frac{0.5439}{0.7444} = 0.731 \]

(4 points) e) If the breeder selects the best 20% of the families for oil content, what would be the expected response to selection after those families are intermated?

\[ R_X = i\,h_X\,\sigma_{A(X)} = i\,h^2_X\,\sigma_{P(X)} = 1.40 \times 0.731 \times \sqrt{0.7444} = 0.883 \]

Oil percentage in the population should increase by 0.883 units.

(5 points) f) If the breeder selects the highest 20% of the families for TSW, what change in oil content would be expected in the next generation?

\[ CR_Y = i\,h_X\,r_A\,\sigma_{A(Y)} = 1.40 \times \sqrt{0.675} \times 0.4425 \times \sqrt{0.5439} = 0.375 \]

Oil percentage in the population should increase by about 0.375 units.

--------------------------------------------------------------------------------
BEYOND THIS POINT IS EXTRA CREDIT

g) Meadowfoam growers are paid for seed produced by the pound (yield), but processors value meadowfoam with high oil content. The breeder decides to give equal weight to these two traits in a selection index.
She includes TSW in the index because she knows it is correlated with seed yield and oil content. Fill in the missing values in the selection index, using economic weights of +1 for yield and oil content, and 0 for TSW. (Traits are numbered in alphabetical order: 1=oil, 2=TSW, 3=yield)

(+ 2 points)

\[
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix} 0.744 & 0.142 & 0.572 \\ 0.142 & 0.289 & 0.232 \\ 0.572 & 0.232 & 4.039 \end{bmatrix}^{-1}
\begin{bmatrix} 0.544 & 0.144 & 0.633 \\ 0.144 & 0.195 & 0.219 \\ 0.633 & 0.219 & 2.785 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]

Use the R program at the end of this exam to solve for the values of the coefficients. Paste your result here.

(+ 4 points)

\[
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix} 1.0102 \\ 0.2063 \\ 0.6914 \end{bmatrix}
\]

h) For entry 1, average oil = 29.04%, TSW = 10.315 g, and yield/100 = 17.685. What would the index value be for this entry? How would you use this information in your selection?

(+ 4 points)

\[ I = 1.0102 \times 29.04 + 0.2063 \times 10.315 + 0.6914 \times 17.685 = 43.69 \]

Select ~17 families (20%) with the highest index scores.

    # R program to estimate coefficients for index selection
    # (blanks filled in with the values from part g)

    # create the phenotypic covariance matrix
    P1 <- c(0.7444, 0.141655802, 0.571901717)
    P2 <- c(0.141655802, 0.289082394, 0.232253696)
    P3 <- c(0.571901717, 0.232253696, 4.038892749)
    PV <- cbind(P1, P2, P3)
    PV

    # create the additive genetic covariance matrix
    G1 <- c(0.54388765, 0.144123293, 0.632724459)
    G2 <- c(0.144123293, 0.195036949, 0.219192165)
    G3 <- c(0.632724459, 0.219192165, 2.785510706)
    GV <- cbind(G1, G2, G3)
    GV

    # economic weights (1 = oil, 2 = TSW, 3 = yield)
    A <- matrix(c(1, 0, 1), nrow = 3)
    A

    # compute the inverse of PV, as required by the index equation in part g
    PVinv <- solve(PV)
    PVinv

    # multiply PVinv x GV x A to estimate b values
    b <- PVinv %*% GV %*% A
    b
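The family-mean calculations in parts d to f of question 4 can also be checked with a few lines of R. This is a minimal sketch that takes the variance components and the genetic correlation from the PROC MIXED output, r = 2 replications, and the selection intensity of 1.40 used in the key for the top 20%; the object names are chosen here for illustration.

```r
i  <- 1.40     # selection intensity used in the key for selecting the best 20%
r  <- 2        # replications
rA <- 0.4425   # genetic correlation between TSW and oil

# Variance components from the PROC MIXED output
var_F <- c(Oil = 0.5439, TSW = 0.195)    # among families (Entry)
var_E <- c(Oil = 0.401,  TSW = 0.1881)   # plot error (PLOT)

# Phenotypic variance of family means and heritability on a family-mean basis
var_P <- var_F + var_E / r
h2 <- var_F / var_P

# Part e: direct response in oil content
R_oil <- i * h2[["Oil"]] * sqrt(var_P[["Oil"]])

# Part f: correlated response in oil content when selecting on TSW
CR_oil <- i * sqrt(h2[["TSW"]]) * rA * sqrt(var_F[["Oil"]])

round(h2, 3)
round(c(R_oil = R_oil, CR_oil = CR_oil), 3)
```

Because unrounded heritabilities are carried through, the results may differ from the hand calculations in the last decimal place.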