Take-Home_Final_15_part2_key

Advanced Plant Breeding PBG 650
Take-Home Final Exam, Fall 2015
Due 9:30 am on Friday, December 11, 2015
Name
KEY
Part 2 – Recurrent Selection, Inbred/Hybrid Selection, Correlated Traits
1) You intend to carry out recurrent selection in a maize population in order to develop
improved open-pollinated varieties for the Republic of Benin, where there is presently no
private seed company marketing hybrid seed. There is a demand for varieties that have
good resistance to ear rot and good husk cover (the husk is tight and extends well beyond
the ear tip), to reduce losses due to grain weevils in storage. Farmers in the area grow one
crop of maize during the rainy season. It is possible for you to complete two generations in
a year, in the main season under rainfed conditions and in a dry season nursery under
irrigation at your research station. S1 family selection and full-sib family selection are both
easy to employ in maize, but you would like to choose the method that is most efficient.
12 points
a) Briefly describe the steps involved in these two selection schemes and discuss the
factors that would influence your choice of methods in this situation. What decision will
you make regarding the relative efficiency of these two methods? Indicate any
assumptions that you are making in your discussion. (Hint – either answer is acceptable
if you can justify your choice).
Method        Expected gain per cycle                              Generations per cycle
Full-sib      $i\,(1/2)\,\sigma_A^2 / \sigma_{P_{FS}}$             2
S1 family     $i\,\sigma_A^2 / \sigma_{P_{S1}}$                    3
The full-sib scheme consists of two generations – 1) formation of families (plant-to-plant
crosses), and 2) evaluation of progeny (full-sib families). Recombination is achieved when
plant-to-plant crosses are made to form the next generation of full-sib families. Progeny
should be evaluated in the main cropping season, but new families can be made in the off-season, so one cycle of selection can be completed each year.
The S1 scheme consists of three generations – 1) formation of families (selfing), 2)
evaluation of progeny (S1 families), and 3) recombination of selected families. Because
evaluation must be carried out in the main season, it will take two years to complete a
cycle.
The S1 scheme takes advantage of all of the additive genetic variance in the population,
whereas full-sib selection only benefits from ½ of the additive genetic variance. However,
the phenotypic variance for S1 selection will be correspondingly larger, which reduces the
expected gain from selection. With full-sib family selection, dominance variance contributes
to the genetic and phenotypic variance, but is not amenable to selection (it is not part of
the numerator of the expected gain formula). If there is a lot of dominance for either trait of
interest, that would tend to reduce the effectiveness of full-sib selection in comparison to S1
family selection. Both types of selection have opportunities for additional selection within
families during the recombination phase, if selected families are planted in rows. Overall,
the schemes appear to be comparable, in terms of progress expected per year. Other
factors may determine the final decision. For example, reciprocal crosses between two
noninbred plants provide more seed for testing than a single selfed ear. In addition, for the
full-sib scheme, the effective population size is twice as large as for S1 family selection, for
an equal number of families evaluated and percentage of families selected. An advantage of
the S1 scheme is that you would get a season off every two years, which would permit you
to stagger your breeding trials and potentially to include another population in your
program.
b) The weather during the first year of trials is hot and dry, and it is difficult to distinguish
levels of ear rot resistance. You decide to select primarily for husk cover (length of husk
extension beyond the tip of the ear) that season. The mean of 250 families is 2.5 cm.
You select the best families, which have a mean of 3.1 cm. After recombining the
selected families, you conduct a trial and include both the original cycle of selection (C0)
and your improved cycle of selection (C1). You note that the average for the improved
population (C1) is now 2.8 cm, and that the C0 this year has a mean of 2.4 cm. What is
the realized heritability for this trait?
5 points
$$h^2 = \frac{R}{S} = \frac{\bar{X}_{C1} - \bar{X}_{C0}}{\bar{X}_{S} - \bar{X}_{0}} = \frac{2.8 - 2.4}{3.1 - 2.5} = \frac{0.4}{0.6} = 0.667$$
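The same arithmetic can be sketched in a few lines of Python. The function and argument names are illustrative and not part of the exam. Response is taken as C1 minus C0 measured in the same trial, and the selection differential as the mean of the selected families minus the mean of all 250 families.

```python
def realized_heritability(mean_c1, mean_c0, mean_selected, mean_base):
    """Realized heritability = response to selection / selection differential."""
    response = mean_c1 - mean_c0                      # R, from the C1 vs. C0 trial
    selection_differential = mean_selected - mean_base  # S, from the selection year
    return response / selection_differential


h2_realized = realized_heritability(mean_c1=2.8, mean_c0=2.4,
                                    mean_selected=3.1, mean_base=2.5)
print(f"Realized heritability: {h2_realized:.3f}")
```

Using the C0 mean from the comparison trial (2.4 cm) rather than the original 2.5 cm removes the year effect from the estimate of response.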
2) You are working with a new crop that is naturally outcrossing, but can be readily self-pollinated. You are trying to decide whether to breed synthetic varieties, or if it would be
worth the additional cost to produce hybrid seed. All possible single crosses among 6 inbred
parents were evaluated for yield, and the averages for each cross are shown below. The
yields of the inbred parents are shown on the diagonals.
        A     B     C     D     E     F
A      23
B      40    25
C      46    42    26
D      37    41    40    21
E      39    36    37    38    18
F      38    36    45    35    40    24
5 points
a) Use Wright’s formula to predict the yield of a synthetic variety developed by random
mating all of the single-crosses.
$$\hat{\mu}_{synthetic} = \bar{Y}_{ii'} - \frac{\bar{Y}_{ii'} - \bar{Y}_{i}}{n} = 39.333 - \frac{39.333 - 22.8333}{6} = 39.333 - 2.75 = 36.583$$
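As a check on the two means used in Wright's formula, the calculation can be sketched in Python. The yields are read from the diallel table in question 2 (inbreds on the diagonal, single crosses below it); the dictionary layout and function name are assumptions made for illustration.

```python
from itertools import combinations

# Inbred (per se) yields from the diagonal of the diallel table
inbred_yield = {"A": 23, "B": 25, "C": 26, "D": 21, "E": 18, "F": 24}

# Single-cross yields from the off-diagonal cells
cross_yield = {
    ("A", "B"): 40, ("A", "C"): 46, ("B", "C"): 42,
    ("A", "D"): 37, ("B", "D"): 41, ("C", "D"): 40,
    ("A", "E"): 39, ("B", "E"): 36, ("C", "E"): 37, ("D", "E"): 38,
    ("A", "F"): 38, ("B", "F"): 36, ("C", "F"): 45, ("D", "F"): 35,
    ("E", "F"): 40,
}


def wright_synthetic(parents):
    """Predicted synthetic mean: crosses mean - (crosses mean - inbred mean) / n."""
    pairs = list(combinations(sorted(parents), 2))
    mean_cross = sum(cross_yield[p] for p in pairs) / len(pairs)
    mean_inbred = sum(inbred_yield[p] for p in parents) / len(parents)
    n = len(parents)
    return mean_cross - (mean_cross - mean_inbred) / n


synthetic = wright_synthetic("ABCDEF")
print(f"Predicted synthetic yield: {synthetic:.3f}")
```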
6 points
b) Consider the four parents that have the highest yield per se. Estimate the yield that
could be obtained from all possible double crosses involving these four parents (there
are 3 possible combinations). Which of these double crosses would be expected to give
you the highest yield? How does that compare to the predicted yield of the synthetic
and to the best possible single-cross?
A, B, C, and F have the highest yields per se.
Single-cross parents        AB, CF        AC, BF        AF, BC
Nonparental single crosses  AC  46        AB  40        AB  40
                            AF  38        AF  38        AC  46
                            BC  42        BC  42        BF  36
                            BF  36        CF  45        CF  45
Predicted double-cross mean 40.5          41.25         41.75
Double cross (AxF)x(BxC) has the highest predicted yield of 41.75. This is less than the best
single cross AxC which has a yield of 46. The best double cross is better than the synthetic,
which has an estimated yield of 36.58.
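The three predictions can be reproduced with a short Python sketch. It assumes, as in the table above, that a double cross (P1xP2)x(P3xP4) is predicted by the mean of the four nonparental single crosses; the names used are illustrative only.

```python
# Single-cross yields among the four parents with the highest yield per se
single_cross = {
    frozenset("AB"): 40, frozenset("AC"): 46, frozenset("AF"): 38,
    frozenset("BC"): 42, frozenset("BF"): 36, frozenset("CF"): 45,
}


def predict_double_cross(pair1, pair2):
    """Mean of the four single crosses between (not within) the two parent pairs."""
    values = [single_cross[frozenset((p, q))] for p in pair1 for q in pair2]
    return sum(values) / len(values)


parents = "ABCF"
first = parents[0]
predictions = {}
# Pairing the first parent with each of the others gives the 3 distinct double crosses
for partner in parents[1:]:
    pair1 = (first, partner)
    pair2 = tuple(p for p in parents if p not in pair1)
    predictions[(pair1, pair2)] = predict_double_cross(pair1, pair2)

best = max(predictions, key=predictions.get)
for (p1, p2), value in predictions.items():
    print(f"({p1[0]}x{p1[1]})x({p2[0]}x{p2[1]}): {value}")
```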
3) You know from a previous study that the additive genetic variance in the F2 generation of a
particular biparental cross is 25 units, and the dominance variance is 6 units.
a) Calculate the expected additive and dominance variance among and within families in
the F4 generation (assume that each F4 family was derived by selfing an individual F3
plant).
4 points
Among families:   $\sigma_A^2 = (3/2)(25) = 37.5$     $\sigma_D^2 = (3/16)(6) = 1.125$
Within families:  $\sigma_A^2 = (1/4)(25) = 6.25$     $\sigma_D^2 = (1/4)(6) = 1.5$
b) If selfing continues by single-seed descent, what will be the approximate additive and
dominance variance among and within families in the F10 generation (i.e., assume the
inbreeding coefficient F≈1)?
4 points
Among families:   $\sigma_A^2 = 2(25) = 50$     $\sigma_D^2 = 0$
Within families:  $\sigma_A^2 = 0$              $\sigma_D^2 = 0$
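The answers to parts a) and b) are the F2 variances multiplied by generation-specific coefficients. A minimal Python sketch of that bookkeeping follows; the coefficients are taken directly from the answers above rather than derived, and the dictionary layout is only illustrative.

```python
from fractions import Fraction

VAR_A_F2 = 25  # additive variance in the F2
VAR_D_F2 = 6   # dominance variance in the F2

# (additive coefficient, dominance coefficient) as used in the answers above
coefficients = {
    "F4 among families": (Fraction(3, 2), Fraction(3, 16)),
    "F4 within families": (Fraction(1, 4), Fraction(1, 4)),
    "F10 among families": (Fraction(2), Fraction(0)),
    "F10 within families": (Fraction(0), Fraction(0)),
}

components = {
    label: (float(coef_a * VAR_A_F2), float(coef_d * VAR_D_F2))
    for label, (coef_a, coef_d) in coefficients.items()
}

for label, (var_a, var_d) in components.items():
    print(f"{label}: additive = {var_a}, dominance = {var_d}")
```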
4) A breeder wants to improve meadowfoam as an oilseed crop in Oregon. In 2012 she
evaluated half-sib families from a breeding population in a yield trial at a single location
using a lattice design with two complete replications (blocks). For the purposes of this
exercise, data for yield (in lbs/acre), thousand seed weight (TSW in g), and oil content
(percent by weight at ~10% moisture) will be analyzed for a subset of 87 families (called
entries), ignoring the incomplete blocking structure in the experiment.
This problem follows a similar format to a question on the 2011 final, but uses a different
data set. In this case you have been provided with most of the necessary computer output
and you are asked to perform a sample of the calculations and fill in the blanks rather than
performing all of the computations yourself. The goal is to give you some familiarity with
the analysis of correlated traits and provide a roadmap for future use, if needed. The data
sets and programs are provided for reference, and you will not need them to complete the
exam. (You have the option to run the R program at the end for extra credit.)
To assist in developing a selection index, we will first calculate the genetic variance and
covariance matrix for the three traits (on a family mean basis). The following SAS code was
used to generate univariate analyses for all traits as well as an analysis of covariance among
traits. The yield values were first divided by 100 to avoid problems of scale (being very
different than the other two traits).
proc glm data=mf;
class rep entry;
model oil TSW yield=rep entry;
manova h=entry/printh printe;
random rep entry/test;
run;
The GLM Procedure
Multivariate Analysis of Variance
H = Type III SSCP Matrix for Entry
              Oil          TSW        Yield
Oil     128.032087      24.3648      98.3671
TSW      24.364798      49.7222      39.9476
Yield    98.3670954     39.9476     694.69
The GLM Procedure
Multivariate Analysis of Variance
E = Error SSCP Matrix
              Oil          TSW        Yield
Oil      34.4834115     -0.42441    -10.462
TSW      -0.42440833    16.1758       2.24658
Yield   -10.4615115      2.24658    215.582
a) We will use the traits TSW and Oil as an example to demonstrate how a genetic
covariance can be calculated from an Analysis of Covariance. You will first need to
calculate Mean Squares by dividing both the Entry SSCP and the Error SSCP by their
degrees of freedom (both have 86 df). Refer to your lecture notes to determine how to
estimate the covariance for half-sib families for this combination of traits (TSW and Oil).
We used a similar approach to estimate the Genetic Variance among half-sib families
from an ANOVA at a single location.
5 points
MCPHS = 24.365/86 = 0.2833
MCPerror = -0.4244/86 = -0.00493
CovHSXY = (MCPHS - MCPerror)/r = (0.2833 - (-0.00493))/2 = 0.144123
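The same steps apply to every cell of the SSCP matrices, which gives the full family variance-covariance matrix at once. The Python sketch below uses the H and E matrices from the SAS output, 86 df for both, and r = 2 replications; the function and variable names are illustrative.

```python
TRAITS = ["Oil", "TSW", "Yield"]

# Type III SSCP matrix for Entry (H) and Error SSCP matrix (E) from PROC GLM
H = [[128.032087, 24.364798, 98.3670954],
     [24.364798, 49.7222, 39.9476],
     [98.3670954, 39.9476, 694.69]]
E = [[34.4834115, -0.42440833, -10.4615115],
     [-0.42440833, 16.1758, 2.24658],
     [-10.4615115, 2.24658, 215.582]]

DF_ENTRY = 86
DF_ERROR = 86
REPS = 2


def family_covariances(h, e, df_entry, df_error, reps):
    """(mean cross-product for entries - mean cross-product for error) / r, per cell."""
    n = len(h)
    return [[(h[i][j] / df_entry - e[i][j] / df_error) / reps for j in range(n)]
            for i in range(n)]


G = family_covariances(H, E, DF_ENTRY, DF_ERROR, REPS)
for name, row in zip(TRAITS, G):
    print(name, [round(value, 6) for value in row])
```

The diagonal cells are the family variance components for each trait, and the off-diagonal cells are the covariances between traits.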
Genetic Covariance for Families
              Oil            TSW           Yield
Oil       0.54388765     0.14412329     0.63272446
TSW       0.144123293    0.19503695     0.21919216
Yield     0.632724459    0.21919216     2.78551071
5 points
b) Use your estimate of the genetic covariance from the previous question to calculate the
genetic correlation between TSW and oil. You will need to use the additive genetic
variance for each trait from the table above (on the diagonals).
$$r_A = \frac{\sigma_{A(XY)}}{\sqrt{\sigma_{A(X)}^2\,\sigma_{A(Y)}^2}}$$
$$r_{A(TSW,oil)} = \frac{0.144123}{\sqrt{0.54388 \times 0.19503}} = 0.4425$$
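For reference, the correlation for each pair of traits can be computed the same way. This Python sketch uses the values from the genetic covariance table; the function name and dictionary layout are assumptions for illustration.

```python
import math

# Family variances (diagonal) and covariances (off-diagonal) from the table
variance = {"Oil": 0.54388765, "TSW": 0.19503695, "Yield": 2.78551071}
covariance = {
    ("TSW", "Oil"): 0.144123293,
    ("Yield", "Oil"): 0.632724459,
    ("Yield", "TSW"): 0.21919216,
}


def genetic_correlation(cov_xy, var_x, var_y):
    """r_A = cov(X, Y) / sqrt(var(X) * var(Y))."""
    return cov_xy / math.sqrt(var_x * var_y)


correlations = {
    pair: genetic_correlation(cov, variance[pair[0]], variance[pair[1]])
    for pair, cov in covariance.items()
}
for (x, y), r in correlations.items():
    print(f"r_A({x}, {y}) = {r:.4f}")
```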
c) Calculation of genetic correlations using MANOVA will give the same results as a mixed
model analysis when the data are balanced. For the MANOVA, each trait is considered
to be a different variable and each appears in a different column. For the mixed model
analysis, there is a single variable called ‘trait’, and the variable names (yield, TSW, and
oil) represent different levels of that variable. The traits are handled as repeated
measures on each plot.
The following program was run in SAS (adapted from the article by Piepho and Möhring,
2011, Crop Sci. 51:1-6):
proc mixed data=correl;
class rep entry plot trait;
model Y=trait trait*rep;
random trait /subject=entry type=unr;
repeated trait / subject=plot type=unr;
run;
The output is explained below. Note that SAS automatically sorts the traits in alphabetical
order, regardless of how they are sorted in the data set. In the SSCP matrix from the
MANOVA, the variables are listed in the order that they appear in the Model statement.
You can use this output to check your calculations for question ‘b’.
Covariance Parameter Estimates
Cov Parm    Subject   Estimate   Interpretation
Var(1)      Entry     0.5439     Additive genetic variance for Oil
Var(2)      Entry     0.195      Additive genetic variance for TSW
Var(3)      Entry     2.7855     Additive genetic variance for Yield
Corr(2,1)   Entry     0.4425     Additive genetic correlation between TSW and Oil
Corr(3,1)   Entry     0.5141     Additive genetic correlation between Yield and Oil
Corr(3,2)   Entry     0.2974     Additive genetic correlation between TSW and Yield
Var(1)      PLOT      0.401      Error variance for Oil
Var(2)      PLOT      0.1881     Error variance for TSW
Var(3)      PLOT      2.5068     Error variance for Yield
Corr(2,1)   PLOT      -0.018     Error correlation between TSW and Oil
Corr(3,1)   PLOT      -0.1213    Error correlation between Yield and Oil
Corr(3,2)   PLOT      0.03804    Error correlation between TSW and Yield
5 points
d) The heritability for TSW is 0.675.
Use the information in the table to calculate heritability for oil content.
For TSW (not required, but included for reference)
$$h^2 = \frac{\sigma_F^2}{\sigma_P^2} = \frac{0.195}{0.28905} = 0.675$$
For Oil content (required)
$$\sigma_P^2 = \sigma_F^2 + \frac{\sigma_E^2}{r} = 0.5439 + \frac{0.401}{2} = 0.7444$$
$$h^2 = \frac{\sigma_F^2}{\sigma_P^2} = \frac{0.5439}{0.7444} = 0.731$$
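A small Python sketch of the family-mean heritability calculation, using the Entry and PLOT variance estimates from the PROC MIXED output and r = 2 replications. The function name is illustrative; yield is included only for completeness and was not asked for.

```python
REPS = 2

# (family variance, error variance) from the PROC MIXED covariance parameter table
estimates = {
    "Oil": (0.5439, 0.401),
    "TSW": (0.195, 0.1881),
    "Yield": (2.7855, 2.5068),
}


def family_mean_heritability(var_family, var_error, reps):
    """h2 = family variance / phenotypic variance of family means."""
    var_phenotypic = var_family + var_error / reps
    return var_family / var_phenotypic


heritability = {
    trait: family_mean_heritability(var_f, var_e, REPS)
    for trait, (var_f, var_e) in estimates.items()
}
for trait, h2 in heritability.items():
    print(f"h2 for {trait}: {h2:.3f}")
```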
e) If the breeder selects the best 20% of the families for oil content, what would be the
expected response to selection after those families are intermated?
4 points
$$R_X = i\,h_X\,\sigma_{A(X)} = i\,h_X^2\,\sigma_{P(X)} = 1.40 \times 0.731 \times \sqrt{0.7444} = 0.883$$
Oil percentage in the population should increase by 0.883 units.
5 points
f) If the breeder selects the highest 20% of the families for TSW, what change in oil
content would be expected in the next generation?
$$CR_Y = i\,h_X\,r_A\,\sigma_{A(Y)} = 1.40 \times \sqrt{0.675} \times 0.4425 \times \sqrt{0.5439} = 0.375$$
Oil percentage in the population should increase by about 0.375 units.
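The direct response in part e) and the correlated response in part f) can be compared side by side with a short Python sketch. It uses the selection intensity of 1.40 for 20% selected and the estimates given above; the function names are illustrative.

```python
import math


def direct_response(intensity, h2, var_phenotypic):
    """R = i * h^2 * sigma_P for the trait under selection."""
    return intensity * h2 * math.sqrt(var_phenotypic)


def correlated_response(intensity, h2_selected, r_a, var_a_target):
    """CR_Y = i * h_X * r_A * sigma_A(Y), selecting on X and measuring Y."""
    return intensity * math.sqrt(h2_selected) * r_a * math.sqrt(var_a_target)


INTENSITY_20_PERCENT = 1.40  # standardized selection differential used in the key

# Part e): select on oil, response in oil
response_oil = direct_response(INTENSITY_20_PERCENT, h2=0.731,
                               var_phenotypic=0.7444)

# Part f): select on TSW (h2 = 0.675), correlated response in oil
correlated_oil = correlated_response(INTENSITY_20_PERCENT, h2_selected=0.675,
                                     r_a=0.4425, var_a_target=0.5439)

print(f"Direct response in oil: {response_oil:.3f}")
print(f"Correlated response in oil from selecting on TSW: {correlated_oil:.3f}")
```

With these estimates, indirect selection through TSW gives less than half of the gain in oil content expected from selecting on oil directly.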
--------------------------------------------------------------------------------
BEYOND THIS POINT IS EXTRA CREDIT
g) Meadowfoam growers are paid for seed produced by the pound (yield), but processors
value meadowfoam with high oil content. The breeder decides to give equal weight to
these two traits in a selection index. She includes TSW in the index because she knows it
is correlated with seed yield and oil content.
Fill in the missing values in the selection index, using economic weights of +1 for yield
and oil content, and 0 for TSW.
(Traits are numbered in alphabetical order: 1=oil, 2=TSW, 3=yield)
+ 2 points
$$
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} =
\begin{bmatrix} 0.744 & 0.142 & 0.572 \\ 0.142 & 0.289 & 0.232 \\ 0.572 & 0.232 & 4.039 \end{bmatrix}^{-1}
\begin{bmatrix} 0.544 & 0.144 & 0.633 \\ 0.144 & 0.195 & 0.219 \\ 0.633 & 0.219 & 2.785 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
$$
Use the R program at the end of this exam to solve for the values of the coefficients.
Paste your result here.
+ 4 points
$$\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 1.0102 \\ 0.2063 \\ 0.6914 \end{bmatrix}$$
h) For entry 1, average oil = 29.04%, TSW = 10.315 g, and yield/100 = 17.685. What would
the index value be for this entry? How would you use this information in your selection?
+ 4 points
I = 1.0102*29.04 + 0.2063*10.315 + 0.6914*17.685 = 43.69
Select ~17 families (20%) with the highest index scores.
#R program to estimate coefficients for index selection
#the blanks from the exam version are filled in here
#(P11 from part d, G12 from part a, economic weights from part g)
#create the phenotypic covariance matrix
P1 <- c(0.7444, 0.141655802, 0.571901717)
P2 <- c(0.141655802, 0.289082394, 0.232253696)
P3 <- c(0.571901717, 0.232253696, 4.038892749)
PV <- cbind(P1, P2, P3)
#create the additive genetic covariance matrix
G1 <- c(0.54388765, 0.144123293, 0.632724459)
G2 <- c(0.144123293, 0.195036949, 0.219192165)
G3 <- c(0.632724459, 0.219192165, 2.785510706)
GV <- cbind(G1, G2, G3)
GV
#create the vector of economic weights (1=oil, 2=TSW, 3=yield)
A <- matrix(c(1, 0, 1), nrow=3)
A
#compute the inverse of PV (the index equation uses the inverse, not the transpose)
PVi <- solve(PV)
PVi
#multiply PVi x GV x A to estimate b values
b <- PVi %*% GV %*% A
b