Advanced Plant Breeding CSS 650

Take-Home Final Exam, Fall 2009
Due Wednesday, Dec. 9, 2009
Name
Show your work! This is essentially a final homework assignment. You can refer to your notes,
journal articles, and text books. Do as much as you can on your own, but you may compare
answers with your classmates if you wish.
1) (12 points) The eggplant breeder at a private seed company has just retired and you are hired to
take over the program. You are given a set of 20 new dihaploid lines and wish to know their
potential for use as parents in hybrids. As we learned from the poster session in class, hybrid
seed production is currently done by hand pollination in eggplant. What mating designs could be
used to estimate combining ability (hint: we discussed two in class and another is commonly
used in hybrid corn breeding)? List any questions you would ask the retiree that would help
you to choose the best design. For the purposes of this exam, assume specific responses to
those questions and explain your consequent choice of a mating design. Draw a diagram
showing how you would make the crosses.
2) A breeder wants to improve pearl millet as a grain and forage crop in North Dakota. In 2006
he evaluated 300 half-sib families from a breeding population in a yield trial at a single
location with 3 replications (blocks). To assist in developing a selection index, he decided to
estimate heritabilities and the genetic correlation (rA) between grain yield and plant height.
He used the following SAS code to generate univariate analyses for both traits as well as an
analysis of covariance for the two traits.
proc glm;
class Block Family;
model Yield Height =Block Family;
manova h=Family/printh printe;
random Block Family/test;
Run;
The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for Family (output from printh)
            Yield       Height
Yield       161.46      1255.8
Height      1255.8      83241.6

E = Error SSCP Matrix (output from printe)
            Yield       Height
Yield       179.4       1315.6
Height      1315.6      65780.0
He summarized the ANOVA for yield as follows:
Source     df     SS        MS      F      Prob>F
Block      2
Family     299    161.46    0.54    1.8    0.0000
Error      598    179.40    0.30
From these results he calculated the phenotypic variance among half-sib families to be 0.18 and
obtained an estimate of heritability for yield on a family mean basis = 0.44.
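The breeder's yield figures can be retraced from the ANOVA summary. The short Python sketch below assumes the usual expected mean squares for a single-location trial in randomized complete blocks (family mean square = error variance + r x family variance, with r = 3 blocks). The variable names are illustrative and are not part of the exam.

```python
# Retrace the breeder's yield calculations from the ANOVA summary.
# Assumption: E[MS_family] = var_error + r * var_family, with r = 3 blocks.
r = 3
ss_family, df_family = 161.46, 299
ss_error, df_error = 179.40, 598

ms_family = ss_family / df_family          # mean square for families
ms_error = ss_error / df_error             # error mean square
var_family = (ms_family - ms_error) / r    # genetic variance among half-sib families
var_pheno_means = ms_family / r            # phenotypic variance among family means
h2_family_mean = var_family / var_pheno_means

print(f"MS family = {ms_family:.2f}, MS error = {ms_error:.2f}")
print(f"Phenotypic variance among family means = {var_pheno_means:.2f}")
print(f"Heritability on a family mean basis = {h2_family_mean:.2f}")
```

Under these assumptions the sketch gives the 0.18 and 0.44 quoted above.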
2) cont’d. Use the SAS output above to answer the following questions (show your work).
a) (8 points) Calculate phenotypic variance among half-sib families for plant height and estimate
heritability on a family mean basis for this trait.
b) (8 points) What is the additive genetic correlation (rA) for yield and plant height? (You will
need to calculate the genetic covariance for these traits in the same manner that you calculated
the genetic variance among half-sib families for individual traits.)
c) (8 points) If he selected the best 15 families for grain yield, what would be the expected
response to selection?
He decided to validate these estimates by making separate selections for yield and plant
height. He selected the best 15 families for each trait:
                     Mean of all families    Mean of 15 selected families
Grain Yield t/ha     3.5                     4.3
Plant Height cm      170                     190
In 2007 he intermated remnant seed of selected families to form the C1 cycles of selection.
In 2008 he evaluated the selection response:
                                Cycle 0    Cycle 1
Selection for yield
    Grain yield (t/ha)          3.8        4.1
    Plant height (cm)           176        180
Selection for plant height
    Plant height (cm)           176        188
    Grain yield (t/ha)          3.8        3.9
d) (8 points) Calculate the realized heritability for yield.
e) (8 points) Use the relationship below to obtain an estimate of rA from the selection experiment.
$$r_A^2 = \left(\frac{CR_X}{R_X}\right)\left(\frac{CR_Y}{R_Y}\right)$$
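As a hedged illustration of how this relationship is applied, the Python sketch below takes the direct responses (R) and correlated responses (CR) for two traits and returns rA. The function name and the numbers in the usage lines are hypothetical placeholders, not data from this exam. CR_X is taken to be the change in trait X when selection is practiced on trait Y, and likewise for CR_Y.

```python
import math

def genetic_correlation_from_responses(r_x, cr_x, r_y, cr_y):
    """Estimate rA from rA^2 = (CR_X / R_X) * (CR_Y / R_Y).

    Hypothetical helper for illustration only.
    """
    ratio_x = cr_x / r_x
    ratio_y = cr_y / r_y
    r_a_squared = ratio_x * ratio_y
    if r_a_squared < 0:
        raise ValueError("CR/R ratios have opposite signs; rA^2 would be negative")
    # rA takes the sign shared by both CR/R ratios.
    return math.copysign(math.sqrt(r_a_squared), ratio_x)

# Placeholder values, chosen only to show the call.
example = genetic_correlation_from_responses(r_x=2.0, cr_x=1.0, r_y=4.0, cr_y=2.0)
print(example)
```

With these placeholder inputs each ratio is 0.5, so rA^2 = 0.25. The function raises an error when the two ratios disagree in sign, because the formula then has no real solution.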
f) (5 points) Do estimates of h2 for yield and rA obtained from the half-sib trial in 2006 agree
fairly well with results from the selection experiment?
3) In class, we discussed several approaches for using molecular markers to improve crops:
• Marker-based selection (MBS)
• Marker-assisted selection (MAS)
    o F2 enrichment
    o Marker-assisted backcrossing (MABC)
    o Marker-assisted recurrent selection (MARS)
• Genomic Selection (GS)
For each of the scenarios below, indicate which approach(es) would be most appropriate, and
explain your choice.
a) (5 points) A diploid, self-pollinating crop, with available marker density at ~5 cM throughout
the genome. Three QTL have been identified that explain a large proportion of the variation
for resistance to an important disease. Existing cultivars possess anywhere from 0-3 of
the desired alleles at these loci. New lines must have higher yield than existing cultivars
and acceptable quality to justify release.
b) (5 points) A commercial breeding company for a major, high value crop. Facilities and resources
are available for high density, high throughput genotyping. Important traits such as yield
are controlled by many QTL with small effects.
c) (5 points) A minor, relatively new crop that is cross-pollinated by insects. Hand pollinations are
time-consuming and no male sterility system has been developed. Commercial varieties
may be open-pollinated populations or synthetics. Several molecular markers have been
developed using a candidate gene approach that impart desirable quality characteristics.
4) (8 pts) A number of papers have been published in recent years comparing the use of AMMI and
GGE as techniques for analyzing genotype by environment interactions (GEI) in
multilocational trials. Briefly explain the features of these linear-bilinear models and the
difference between them.
5) Recall the BLUP example from Bernardo’s text that we discussed in class and in lab. The
drawback of the IML program that we used was that one had to manually iterate the program
many times to obtain the correct estimates of the breeding values and genetic variance. I have
modified the PROC MIXED program I gave to you in lab so that it gives correct estimates in
a single run. The main change is to include a type=lin(1) option in the Random statement.
/* Yield data: set, number of locations, variety name, genotype code, mean yield */
data purelines;
  input set n_loc variety$ genotype y;
  datalines;
1 18 Morex   1 4.45
1 18 Robust  2 4.61
1 18 Stander 4 5.27
2  9 Robust  2 5.00
2  9 Excel   3 5.82
2  9 Stander 4 5.79
;

/* Coefficient matrix for the genotype effect, read by ldata= with type=lin(1).
   parm = 1 in every row identifies the single LIN(1) parameter. */
data GR;
  input parm row col1-col4;
  datalines;
1 1 2      1       0.875    0.6875
1 2 1      2       1.6875   1.34375
1 3 0.875  1.6875  2        1.421875
1 4 0.6875 1.34375 1.421875 2
;
run;

options nodate nocenter;

Proc Mixed data=purelines noclprint covtest;
  class genotype set;
  weight n_loc;
  Model y=set/outpredm=LGGR outpred=PredGGR;
  Random genotype/ldata=GR type=lin(1) solution;
  lsmeans set;
  ods listing exclude solutionR; ods output solutionR=BLUPGGR;
  ods output covparms=VGGR;
Proc print data=PredGGR;
Run;

/* Keep the genotype BLUPs with their prediction errors and p-values */
Data BLUPs;
  Set BLUPGGR;
  Keep genotype BLUP Pred_Error P_value;
  BLUP=Estimate;
  Pred_Error=StdErrPred;
  P_Value=Probt;
Proc print;
  Title1'Genotype effect BLUPs, Prediction Error and P-Value for H0:BLUP=0';
Run;
Quit;
5) cont’d. Output from the BLUP analysis
The Mixed Procedure

Covariance Parameter Estimates
                         Standard    Z
Cov Parm     Estimate    Error       Value    Pr Z
LIN(1)       0.3499      0.2968      1.18     0.2384
Residual     0.05281     0.07784     0.68     0.2488

Note: The linear term in the model estimates additive genetic variance. The residual is VR.

Least Squares Means
                             Standard
Effect    set    Estimate    Error     DF    t Value    Pr > |t|
set       1      4.8874      0.6754    1     7.24       0.0874
set       2      5.3525      0.6767    1     7.91       0.0801

Note: These are the BLUEs for the fixed effects in the model.

                                                     StdErr
Obs  set  n_loc  variety  genotype  y     Pred     Pred      DF  Alpha  Lower    Upper    Resid
1    1    18     Morex    1         4.45  4.45182  0.054045  1   0.05   3.76512  5.13853  -0.001821
2    1    18     Robust   2         4.61  4.59133  0.049300  1   0.05   3.96492  5.21775   0.018666
3    1    18     Stander  4         5.27  5.28684  0.049335  1   0.05   4.65998  5.91371  -0.016845
4    2    9      Robust   2         5.00  5.05642  0.062002  1   0.05   4.26862  5.84422  -0.056420
5    2    9      Excel    3         5.82  5.80165  0.075379  1   0.05   4.84387  6.75943   0.018351
6    2    9      Stander  4         5.79  5.75193  0.062191  1   0.05   4.96172  6.54214   0.038069
Genotype effect BLUPs, Prediction Error and P-Value for H0:BLUP=0
                              Pred_
Obs    genotype    BLUP       Error      P_Value
1      1           -0.43562   0.67572    0.63546
2      2           -0.29611   0.67634    0.73729
3      3            0.44912   0.67947    0.62817
4      4            0.39941   0.67558    0.66009

Note: These are the BLUPs and the standard errors that you would obtain if you iterated
the matrix calculations many times.
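One way to read the output tables together: each value in the Pred column is the BLUE for the set (the least squares mean) plus the BLUP of the genotype effect. The Python sketch below checks this with the numbers printed in the output above. The dictionary and variable names are illustrative only.

```python
# Values copied from the PROC MIXED output.
set_blue = {1: 4.8874, 2: 5.3525}   # least squares means for set
genotype_blup = {1: -0.43562, 2: -0.29611, 3: 0.44912, 4: 0.39941}

# (set, genotype, Pred) for the six observations
observations = [
    (1, 1, 4.45182), (1, 2, 4.59133), (1, 4, 5.28684),
    (2, 2, 5.05642), (2, 3, 5.80165), (2, 4, 5.75193),
]

max_abs_diff = 0.0
for set_id, geno, pred in observations:
    rebuilt = set_blue[set_id] + genotype_blup[geno]   # BLUE + BLUP
    max_abs_diff = max(max_abs_diff, abs(rebuilt - pred))
    print(set_id, geno, round(rebuilt, 4), pred)
```

The rebuilt values differ from the Pred column only in the fifth decimal place, which is consistent with the least squares means being printed to four decimals.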
5) cont’d.
a) Run the same analysis using a subset of the data from the Minnesota barley breeding
program that we used for the TASSEL demonstration (MN06ex2.xls). Fifty breeding
lines were evaluated for yield at two locations (set=1). In addition, the first 25 lines were
evaluated at a third location (set=2). The kinship matrix (also in MN06ex2.xls) was
obtained from TASSEL using SNP data. A single column of 1’s was added in the first
column to obtain the correct format for the LIN(1) covariance matrix option in SAS. You
do not need to submit your program or output with this exam, provided that you can
answer questions b and c.
b) (5 pts) Give the estimate you obtained for additive genetic variance and its standard error.
c) (10 pts) Identify the two lines with the highest breeding values. If you crossed these lines,
what is the expected breeding value of an inbred line (RIL) derived from the cross?
d) (5 pts) If you were the barley breeder in Minnesota would you use BLUPs to choose parents for
making crosses or would you use the mean yield across sites? Explain your answer.