Combined Analysis of Experiments

Basic Research
– Researcher makes hypothesis and conducts a single experiment to test it
– The hypothesis is modified and another experiment is conducted
– Combined analysis of experiments is seldom required
– Experiments may be repeated to
  • Provide greater precision (increased replication)
  • Validate results from initial experiment

Applied Research
– Recommendations to producers must be based on multiple locations and seasons that represent target environments (soil types, weather patterns)

Multilocational trials
Often called MET = multi-environment trials
How do treatment effects change in response to differences in soil and weather throughout a region?
– What is the range of responses that can be expected?
Detect and quantify interactions of treatments and locations and interactions of treatments and seasons in the recommendation domain
Combined estimates are valid only if locations are randomly chosen within target area
– Experiments often carried out on experiment stations
– Generally use sites that are most accessible or convenient
– Can still analyze the data, but consider possible bias due to restricted site selection when making interpretations

Preliminary Analysis
Complete ANOVA for each experiment
– Do we have good data from each site?
– Examine residual plots for validity of ANOVA assumptions, outliers
Examine experimental errors from different locations for heterogeneity
– Perform F Max test or Levene’s test for homogeneity of variance
– If homogeneous, perform a combined analysis across sites
– If heterogeneous, may need to use a transformation or break sites into homogeneous groups and analyze separately
– Differences in means across sites are often greater than treatment effects
– Does not prevent a combined analysis, but may contribute to error heterogeneity if there are associations between means and variances

MET Linear Model (for an RBD)
$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \tau_k + (\alpha\tau)_{ik} + \varepsilon_{ijk}$
– $\mu$ = mean effect
– $\alpha_i$ = ith location effect
– $\beta_{j(i)}$ = jth block effect within the ith location
– $\tau_k$ = kth treatment effect
– $(\alpha\tau)_{ik}$ = interaction of the kth treatment in the ith location
– $\varepsilon_{ijk}$ = pooled error
Environments = Locations = Sites
Blocks are nested in locations
– SS for blocks is pooled across locations

Treatment x Environment Interaction
Obtain a preliminary estimate of interaction of treatment with environment or season
Will we be able to make general recommendations about the treatments or should they be specific for each site?
– Error degrees of freedom are pooled across sites, so it is relatively easy to detect interactions
– Consider the relative magnitude of variation due to the treatments compared to the interaction MS
– Are there rank changes in treatments across environments (crossover interactions)?

Treatments and locations are random
Source             df            SS      MS   Expected MS
Location           l-1           SSL     M1   $\sigma^2_e + r\sigma^2_{TL} + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)        SSB(L)  M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1           SST     M3   $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. x Treatment   (l-1)(t-1)    SSLT    M4   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error       l(r-1)(t-1)   SSE     M5   $\sigma^2_e$
F for Locations = (M1+M5)/(M2+M4)
Satterthwaite’s approximate df (each mean square is squared and divided by its own df)
– $N_1' = (M_1 + M_5)^2 / [M_1^2/(l-1) + M_5^2/(l(r-1)(t-1))]$
– $N_2' = (M_2 + M_4)^2 / [M_2^2/(l(r-1)) + M_4^2/((l-1)(t-1))]$
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5

Treatments and locations are fixed
Fixed Locations
• constitute the entire population of environments OR
• represent specific environmental conditions (rainfall, elevation, etc.)
Source             df            SS      MS   Expected MS
Location           l-1           SSL     M1   $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)        SSB(L)  M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1           SST     M3   $\sigma^2_e + rl\sigma^2_T$
Loc. x Treatment   (l-1)(t-1)    SSLT    M4   $\sigma^2_e + r\sigma^2_{LT}$
Pooled Error       l(r-1)(t-1)   SSE     M5   $\sigma^2_e$
F for Locations = M1/M2
F for Treatments = M3/M5
F for Loc. x Treatments = M4/M5

Treatments are fixed, Locations are random
Source             df            SS      MS   Expected MS
Location           l-1           SSL     M1   $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)        SSB(L)  M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1           SST     M3   $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. x Treatment   (l-1)(t-1)    SSLT    M4   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error       l(r-1)(t-1)   SSE     M5   $\sigma^2_e$
F for Locations = M1/M2
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
SAS uses slightly different rules for determining Expected MS
– No direct test for Locations for this model

SAS Expected Mean Squares
Varieties fixed, Locations random
    PROC GLM;
      Class Location Rep Variety;
      Model Yield = Location Rep(Location) Variety Location*Variety;
      Random Location Rep(Location) Location*Variety / Test;
Source     Type III Expected Mean Square
Location   Var(Error) + 3 Var(Location*Variety) + 7 Var(Rep(Location)) + 21 Var(Location)
Dependent Variable: Yield
Source     DF       Type III SS   Mean Square   F Value   Pr > F
Location   1        0.505125      0.505125      0.20      0.6745
Error      5.8098   15.027788     2.586644
Error: MS(Rep(Location)) + MS(Location*Variety) - MS(Error)

Treatments are fixed, Years are random
Source              df            SS      MS   Expected MS
Years               y-1           SSY     M1   $\sigma^2_e + t\sigma^2_{R(Y)} + rt\sigma^2_Y$
Blocks in Years     y(r-1)        SSB(Y)  M2   $\sigma^2_e + t\sigma^2_{R(Y)}$
Treatment           t-1           SST     M3   $\sigma^2_e + r\sigma^2_{TY} + ry\sigma^2_T$
Years x Treatment   (y-1)(t-1)    SSYT    M4   $\sigma^2_e + r\sigma^2_{TY}$
Pooled Error        y(r-1)(t-1)   SSE     M5   $\sigma^2_e$
F for Years = M1/M2
F for Treatments = M3/M4
F for Years x Treatments = M4/M5

Locations and Years in the same trial
Can analyze as a factorial
Source                      df
Years                       y-1
Locations                   l-1
Years x Locations           (y-1)(l-1)
Block(Years x Locations)    yl(r-1)
Can determine the magnitude of the interactions between treatments and environments
– T x Y, T x L, T x Y x L
For a simpler interpretation, consider all year and location combinations as “sites” and use one of the models presented for multilocational trials

Combined Lab or Greenhouse Study (CRD)
Assume Treatments are fixed, Trials are random
A “trial” is a repetition of a replicated experiment
Source              df           SS     MS   Expected MS
Trial               l-1          SSL    M1   $\sigma^2_e + rt\sigma^2_L$
Treatment           t-1          SST    M2   $\sigma^2_e + r\sigma^2_{LT} + rl\sigma^2_T$
Trial x Treatment   (l-1)(t-1)   SSLT   M3   $\sigma^2_e + r\sigma^2_{LT}$
Pooled Error        lt(r-1)      SSE    M4   $\sigma^2_e$
F for Trials = M1/M4 (SAS would say M1/M3)
F for Treatments = M2/M3
F for Trials x Treatments = M3/M4
If there are no interactions, consider pooling SSLT and SSE
– Use a conservative P value to pool (e.g. >0.25 or >0.5)

Preliminary ANOVA
Assumptions for this example:
– Locations and blocks are random
– Treatments are fixed
Source             df            SS      MS   F
Total              lrt-1         SSTot
Location           l-1           SSL     M1   M1/M2
Blocks in Loc.     l(r-1)        SSB(L)  M2
Treatment          t-1           SST     M3   M3/M4
Loc. x Treatment   (l-1)(t-1)    SSLT    M4   M4/M5
Pooled Error       l(r-1)(t-1)   SSE     M5
If Loc. x Treatment interactions are significant, must be cautious in interpreting main effects combined across all locations

Genotype by Environment Interactions (GEI)
When the relative performance of varieties differs from one location or year to another…
– how do you make selections?
– how do you make recommendations to farmers?
[Figure: Response of genotypes A and B plotted against Environments, three panels: no interaction; no rank changes, but interaction; rank changes and interaction]

Genotype x Environment Interactions (GEI)
How much does GEI contribute to variation among varieties or breeding lines?
P = G + E + GE
– P is phenotype of an individual
– G is genotype
– E is environment
– GE is the interaction
DeLacey et al., 1990 – summary of results from many crops and locations
– 70-20-10 rule for E : GE : G
– 20% of the observed variation among genotypes is due to interaction of genotype and environment

Stability
Many approaches for examining GEI have been suggested since the 1960s
Characterization of GEI is closely related to the concept of stability. “Stability” has been interpreted in different ways.
– Static – performance of a genotype does not change under different environmental conditions (relevant for disease resistance, quality factors)
– Dynamic – genotype performance is affected by the environment, but its relative performance is consistent across environments. It responds to environmental factors in a predictable way.

Measures of stability
– CV of individual genotypes across locations
– Regression of genotypes on an environmental index – Eberhart and Russell, 1966
– Ecovalence – Wricke, 1962
– Superiority measure of cultivars – Lin and Binns, 1988
– Many others…

Analysis of GEI – other approaches
– Rank sum index (nonparametric approach)
– Cluster analysis
– Factor analysis
– Principal component analysis
– AMMI
– Pattern analysis
– Analysis of crossovers
– Partial Least Squares Regression
– Factorial Regression
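
Worked example: stability measures
Three of the stability measures listed earlier can be computed directly from a table of genotype means by environment. The sketch below is one possible illustration, not the method of any particular package. It computes the CV of each genotype across environments, the regression slope of each genotype on an environmental index (only the slope from the Eberhart and Russell approach; deviations from regression are not computed), and Wricke's ecovalence. The yield values and the function name stability_measures are invented for this example, and the environmental index is assumed to be the environment mean minus the grand mean.

```python
from statistics import mean, stdev

# Hypothetical genotype-by-environment means (one list per genotype,
# one value per environment). These numbers are invented for illustration.
yields = {
    "A": [4.0, 5.0, 6.0],
    "B": [3.0, 5.0, 7.0],
    "C": [5.0, 5.0, 5.0],
}

def stability_measures(table):
    """Return mean, CV (%), regression slope and ecovalence per genotype."""
    genotypes = list(table)
    n_env = len(table[genotypes[0]])
    grand = mean(v for row in table.values() for v in row)
    env_means = [mean(table[g][j] for g in genotypes) for j in range(n_env)]
    # Environmental index: environment mean minus grand mean (sums to zero).
    index = [m - grand for m in env_means]
    ss_index = sum(i * i for i in index)
    if ss_index == 0:
        raise ValueError("environment means are identical; slope is undefined")
    results = {}
    for g in genotypes:
        row = table[g]
        g_mean = mean(row)
        # CV of the genotype across environments, in percent.
        cv = 100 * stdev(row) / g_mean
        # Slope of genotype response regressed on the environmental index.
        slope = sum(y * i for y, i in zip(row, index)) / ss_index
        # Ecovalence: the genotype's share of the G x E interaction SS,
        # sum over environments of (Y_ij - genotype mean - env index)^2.
        ecovalence = sum((y - g_mean - i) ** 2 for y, i in zip(row, index))
        results[g] = {"mean": g_mean, "cv": cv,
                      "slope": slope, "ecovalence": ecovalence}
    return results

if __name__ == "__main__":
    for name, stats in stability_measures(yields).items():
        print(name, stats)
```

In this made-up data set all three genotypes have the same mean. Genotype A tracks the environmental index exactly (slope 1, ecovalence 0), which corresponds to dynamic stability. Genotype C does not respond to the environment at all (slope 0, CV 0), which corresponds to static stability, yet it still contributes to the interaction sum of squares.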