PBG 650 Advanced Plant Breeding
Module 9: Best Linear Unbiased Prediction
– Purelines
– Single-crosses

Best Linear Unbiased Prediction (BLUP)
• Allows comparison of material from different populations evaluated in different environments
• Makes use of all performance data available for each genotype, and accounts for the fact that some genotypes have been more extensively tested than others
• Makes use of information about relatives in pedigree breeding systems
• Provides estimates of genetic variances from existing data in a breeding program without the use of mating designs
Bernardo, Chapt. 11

BLUP History
• Initially developed by C.R. Henderson in the 1940's
• Most extensively used in animal breeding
• Used in crop improvement since the 1990's, particularly in forestry
• BLUP is a general term that refers to two procedures
  – true BLUP – the 'P' refers to prediction in random effects models (where there is a covariance structure)
  – BLUE – the 'E' refers to estimation in fixed effect models (no covariance structure)

B-L-U
• "Best" means having minimum variance
• "Linear" means that the predictions or estimates are linear functions of the observations
• "Unbiased"
  – expected value of estimates = their true value
  – predictions have an expected value of zero (because genetic effects have a mean of zero)

Regression in matrix notation
Linear model:
$$Y = X\beta + \varepsilon$$
Parameter estimates:
$$b = (X'X)^{-1}X'Y$$

Source        df     SS              MS
Regression    p      b'X'Y           MSR
Residual      n-p    Y'Y - b'X'Y     MSE
Total         n      Y'Y

BLUP Mixed Model in Matrix Notation
$$Y = X\beta + Zu + e$$
X and Z are design matrices, $\beta$ holds the fixed effects, and $u$ holds the random effects.
Classification for the purposes of BLUP:
• Fixed effects are constants
  – overall mean
  – environmental effects (mean across trials)
• Random effects have a covariance structure
  – breeding values
  – dominance deviations
  – testcross effects
  – general and specific combining ability effects

BLUP for purelines – barley example

Set of environments   No. of environments   Cultivar      Grain yield (t/ha)
Set 1                 18                    Morex (1)     4.45
Set 1                 18                    Robust (2)    4.61
Set 1                 18                    Stander (4)   5.27
Set 2                 9                     Robust (2)    5.00
Set 2                 9                     Excel (3)     5.82
Set 2                 9                     Stander (4)   5.79

Parameters to be estimated
• means for two sets of environments – fixed effects
  – we are interested in knowing effects of these particular sets of environments
• breeding values of four cultivars – random effects
  – from the same breeding population
  – there is a covariance structure (cultivars are related)
Bernardo, pg 269

Linear model for barley example
$$Y_{ij} = \mu + t_i + u_j + e_{ij}$$
$t_i$ = effect of the ith set of environments
$u_j$ = effect of the jth cultivar
In matrix notation, $Y = X\beta + Zu + e$:
$$
\begin{bmatrix} 4.45 \\ 4.61 \\ 5.27 \\ 5.00 \\ 5.82 \\ 5.79 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
+
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix}
+
\begin{bmatrix} e_{11} \\ e_{12} \\ e_{14} \\ e_{22} \\ e_{23} \\ e_{24} \end{bmatrix}
$$

Weighted regression
$$Y = X\beta + \varepsilon$$
Where $\varepsilon_{ij} \sim N(0, \sigma^2)$:
$$b = (X'X)^{-1}X'Y$$
When $\varepsilon_{ij} \sim N(0, R\sigma^2)$, then:
$$b = (X'R^{-1}X)^{-1}X'R^{-1}Y$$
For the barley example, $R^{-1}$ is diagonal, with the number of environments behind each mean on the diagonal:
$$
R^{-1} =
\begin{bmatrix}
18 & 0 & 0 & 0 & 0 & 0 \\
0 & 18 & 0 & 0 & 0 & 0 \\
0 & 0 & 18 & 0 & 0 & 0 \\
0 & 0 & 0 & 9 & 0 & 0 \\
0 & 0 & 0 & 0 & 9 & 0 \\
0 & 0 & 0 & 0 & 0 & 9
\end{bmatrix}
$$

Covariance structure of random effects
Coefficients of coancestry, $\theta_{XY}$:

           Morex    Robust   Excel     Stander
Morex      1
Robust     1/2      1
Excel      7/16     27/32    1
Stander    11/32    43/64    91/128    1

Remember:
$$\mathrm{Cov}_{XY} = r\sigma^2_A + u\sigma^2_D, \qquad r = 2\theta_{XY}$$
The covariance matrix of the breeding values is therefore $A\sigma^2_A$, with
$$
A =
\begin{bmatrix}
2 & 1 & 7/8 & 11/16 \\
1 & 2 & 27/16 & 43/32 \\
7/8 & 27/16 & 2 & 91/64 \\
11/16 & 43/32 & 91/64 & 2
\end{bmatrix}
$$

Mixed Model Equations
$$
\begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1}(\sigma^2_\varepsilon/\sigma^2_A)
\end{bmatrix}^{-1}
\begin{bmatrix} X'R^{-1}Y \\ Z'R^{-1}Y \end{bmatrix}
$$
where R is defined by $\varepsilon \sim N(0, R\sigma^2_\varepsilon)$.
• each matrix is composed of submatrices
• the algebra is the same
• Calculations in Excel

Results from BLUP
Original data: the barley table above.
For fixed effects, $b_1 = \mu + t_1$ and $b_2 = \mu + t_2$.
BLUP estimates:

Effect             Level      Estimate
$\hat{\beta}_1$    Set 1      4.82
$\hat{\beta}_2$    Set 2      5.41
$\hat{u}_1$        Morex      -0.33
$\hat{u}_2$        Robust     -0.17
$\hat{u}_3$        Excel      0.18
$\hat{u}_4$        Stander    0.36

Interpretation from BLUP
For a set of recombinant inbred lines from an F2 cross of Excel x Stander:
Predicted mean breeding value = ½(0.18 + 0.36) = 0.27

Shrinkage estimators
• In the simplest case (all data balanced, the only fixed effect is the overall mean, inbreds unrelated):
$$\mathrm{BLUP}(u_i) = h^2(\bar{Y}_{i.} - \bar{Y}_{..})$$
• If $h^2$ is high, BLUP values are close to the phenotypic values
• If $h^2$ is low, BLUP values shrink towards the overall mean
• For unrelated inbreds or families, ranking of genotypes is the same whether one uses BLUP or phenotypic values

Sampling error of BLUP
Invert the coefficient matrix of the mixed model equations:
$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1}(\sigma^2_\varepsilon/\sigma^2_A)
\end{bmatrix}^{-1}
=
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
$$
• each element of the matrix is a matrix
• Diagonal elements of the inverse of the coefficient matrix can be used to estimate sampling error of fixed and random effects
$$
\begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
\begin{bmatrix} X'R^{-1}Y \\ Z'R^{-1}Y \end{bmatrix}
$$
Sampling variances of the fixed effects: diagonal elements of $C_{11}\sigma^2_\varepsilon$
Sampling variances of the random effects (variances of $\hat{u} - u$): diagonal elements of $C_{22}\sigma^2_\varepsilon$

Estimation of Variance Components (would really need a larger data set)
1. Use your best guess for an initial value of $\sigma^2_\varepsilon/\sigma^2_A$
2. Solve for $\hat{\beta}$ and $\hat{u}$
3. Use current solutions to solve for $\sigma^2_\varepsilon$ and then for $\sigma^2_A$
4. Calculate a new $\sigma^2_\varepsilon/\sigma^2_A$
5. Repeat the process until estimates converge

BLUP for single-crosses
Performance of a single cross:
$$G_{B73,Mo17} = GCA_{B73} + GCA_{Mo17} + SCA_{B73,Mo17}$$
BLUP Model:
$$Y = X\beta + Ug_1 + Wg_2 + Ss + e$$
• Sets of environments are fixed effects
• GCA and SCA are considered to be random effects
Example in Bernardo, pg 277, from Hallauer et al., 1996

Performance of maize single crosses
(Iowa Stiff Stalk x Lancaster Sure Crop)

Set   Entry   Pedigree       Grain yield (t ha-1)
1     SC-1    B73 x Mo17     7.85
1     SC-2    H123 x Mo17    7.36
1     SC-3    B84 x N197     5.61
2     SC-2    H123 x Mo17    7.47
2     SC-3    B84 x N197     5.96

$$
\begin{bmatrix} 7.85 \\ 7.36 \\ 5.61 \\ 7.47 \\ 5.96 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
+
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} g_{B73} \\ g_{B84} \\ g_{H123} \end{bmatrix}
+
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} g_{Mo17} \\ g_{N197} \end{bmatrix}
+
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} s_1 \\ s_2 \\ s_3 \end{bmatrix}
+
\begin{bmatrix} e_{11} \\ e_{12} \\ e_{13} \\ e_{22} \\ e_{23} \end{bmatrix}
$$

Covariance of single crosses
SC-X is j x k, SC-Y is j' x k'. Assuming no epistasis:
$$\mathrm{Cov}_{SC} = \theta_{jj'}\sigma^2_{GCA(1)} + \theta_{kk'}\sigma^2_{GCA(2)} + \theta_{jj'}\theta_{kk'}\sigma^2_{SCA}$$
For B73, B84, H123:
$$
G_1 =
\begin{bmatrix}
1 & \theta_{B73,B84} & \theta_{B73,H123} \\
\theta_{B73,B84} & 1 & \theta_{B84,H123} \\
\theta_{B73,H123} & \theta_{B84,H123} & 1
\end{bmatrix},
\qquad \sigma^2_{g_1} = G_1\sigma^2_{GCA(1)}
$$
For Mo17, N197:
$$
G_2 =
\begin{bmatrix}
1 & \theta_{Mo17,N197} \\
\theta_{Mo17,N197} & 1
\end{bmatrix},
\qquad \sigma^2_{g_2} = G_2\sigma^2_{GCA(2)}
$$
For SC-1 = B73 x Mo17, SC-2 = H123 x Mo17, SC-3 = B84 x N197:
$$
S =
\begin{bmatrix}
1 & \theta_{B73,H123}\theta_{Mo17,Mo17} & \theta_{B73,B84}\theta_{Mo17,N197} \\
\theta_{B73,H123}\theta_{Mo17,Mo17} & 1 & \theta_{B84,H123}\theta_{Mo17,N197} \\
\theta_{B73,B84}\theta_{Mo17,N197} & \theta_{B84,H123}\theta_{Mo17,N197} & 1
\end{bmatrix},
\qquad \sigma^2_{s} = S\sigma^2_{SCA}
$$

Solutions
In the equations below, Z denotes the design matrix for the SCA effects (written S in the BLUP model above), and S is the SCA covariance matrix just defined.
$$
\begin{bmatrix} b_1 \\ b_2 \\ g_{B73} \\ g_{B84} \\ g_{H123} \\ g_{Mo17} \\ g_{N197} \\ s_1 \\ s_2 \\ s_3 \end{bmatrix}
=
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}U & X'R^{-1}W & X'R^{-1}Z \\
U'R^{-1}X & U'R^{-1}U + Q_1 & U'R^{-1}W & U'R^{-1}Z \\
W'R^{-1}X & W'R^{-1}U & W'R^{-1}W + Q_2 & W'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}U & Z'R^{-1}W & Z'R^{-1}Z + Q_S
\end{bmatrix}^{-1}
\begin{bmatrix} X'R^{-1}Y \\ U'R^{-1}Y \\ W'R^{-1}Y \\ Z'R^{-1}Y \end{bmatrix}
$$
$$
Q_1 = G_1^{-1}(\sigma^2_\varepsilon/\sigma^2_{GCA(1)}), \qquad
Q_2 = G_2^{-1}(\sigma^2_\varepsilon/\sigma^2_{GCA(2)}), \qquad
Q_S = S^{-1}(\sigma^2_\varepsilon/\sigma^2_{SCA})
$$
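
Worked sketch: barley mixed model equations
The pureline (barley) mixed model equations earlier in this module can be set up and solved numerically. The following is a minimal sketch using NumPy, not the Excel calculation referred to in the slides. The data, X, Z, R-1 and A are taken from the barley example; the function name `solve_mme` and the constant `LAMBDA` are hypothetical names introduced here. The slides do not state the variance ratio σε2/σA2, so the value of `LAMBDA` below is an arbitrary placeholder, and the printed solutions should not be expected to reproduce the BLUP estimates reported in the results table.

```python
import numpy as np

# Barley yields (t/ha): Morex, Robust, Stander in Set 1; Robust, Excel, Stander in Set 2.
y = np.array([4.45, 4.61, 5.27, 5.00, 5.82, 5.79])

# X: incidence of the two sets of environments (fixed effects b1, b2).
X = np.array([[1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 1]], dtype=float)

# Z: incidence of cultivars, columns ordered Morex, Robust, Excel, Stander.
Z = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

# R^-1: number of environments behind each mean (18 for Set 1, 9 for Set 2).
R_inv = np.diag([18.0, 18.0, 18.0, 9.0, 9.0, 9.0])

# A: twice the coefficients of coancestry among the four cultivars.
A = np.array([[2.0,     1.0,     7 / 8,   11 / 16],
              [1.0,     2.0,     27 / 16, 43 / 32],
              [7 / 8,   27 / 16, 2.0,     91 / 64],
              [11 / 16, 43 / 32, 91 / 64, 2.0]])


def solve_mme(y, X, Z, R_inv, A, lam):
    """Build and solve the mixed model equations.

    lam is the variance ratio sigma2_e / sigma2_A.
    Returns fixed-effect estimates, random-effect predictions, the
    coefficient matrix, its inverse, and the right-hand side.
    """
    XtRi = X.T @ R_inv
    ZtRi = Z.T @ R_inv
    C = np.block([[XtRi @ X, XtRi @ Z],
                  [ZtRi @ X, ZtRi @ Z + lam * np.linalg.inv(A)]])
    rhs = np.concatenate([XtRi @ y, ZtRi @ y])
    C_inv = np.linalg.inv(C)
    sol = C_inv @ rhs
    p = X.shape[1]
    return sol[:p], sol[p:], C, C_inv, rhs


# ASSUMPTION: placeholder variance ratio, not a value given in the slides.
LAMBDA = 5.0

b_hat, u_hat, C, C_inv, rhs = solve_mme(y, X, Z, R_inv, A, LAMBDA)
print("fixed effects (Set 1, Set 2):", np.round(b_hat, 3))
print("breeding values (Morex, Robust, Excel, Stander):", np.round(u_hat, 3))

# Predicted mean breeding value of inbreds from Excel x Stander:
# the average of the two parental predictions.
print("Excel x Stander progeny mean:", round(0.5 * (u_hat[2] + u_hat[3]), 3))

# Diagonal of the inverse coefficient matrix; multiplying by an estimate of
# sigma2_e would give the sampling variances described in the slides.
print("diag(C^-1):", np.round(np.diag(C_inv), 4))
```

Making the variance ratio an explicit argument lets the same function be called repeatedly inside the iterative variance component procedure: each pass would update `lam` from the current solutions and solve again. A very large `lam` shrinks all predicted breeding values towards zero, which mirrors the shrinkage behaviour described for low heritability.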