Basic Research
– Researcher makes hypothesis and conducts a single experiment to test it
– The hypothesis is modified and another experiment is conducted
– Combined analysis of experiments is seldom required
– Experiments may be repeated to
• Provide greater precision (increased replication)
• Validate results from initial experiment
Applied Research
– Recommendations to producers must be based on multiple locations and seasons that represent target environments (soil types, weather patterns)
Often called MET (multi-environment trials)
How do treatment effects change in response to differences in soil and weather throughout a region?
– What is the range of responses that can be expected?
Detect and quantify interactions of treatments and locations and interactions of treatments and seasons in the recommendation domain
Combined estimates are valid only if locations are randomly chosen within target area
– Experiments often carried out on experiment stations
– Generally use sites that are most accessible or convenient
– Can still analyze the data, but consider possible bias due to restricted site selection when making interpretations
Complete ANOVA for each experiment
– Do we have good data from each site?
– Examine residual plots for validity of ANOVA assumptions, outliers
Examine experimental errors from different locations for heterogeneity
– Perform F Max test or Levene’s test for homogeneity of variance
– If homogeneous, perform a combined analysis across sites
Differences in means across sites are often greater than treatment effects
Does not prevent a combined analysis, but may contribute to error heterogeneity if there are associations between means and variances
If heterogeneous:
– Break sites into homogeneous groups and analyze separately
– Use a transformation
– Use a generalized linear mixed model that accounts for error structure
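The homogeneity check can be sketched in a few lines. This is only an illustration of Hartley's F max statistic (largest error mean square divided by the smallest); the site names and error mean squares are made-up values, and the result still has to be compared with a tabulated F max critical value for the number of sites and the error df.

```python
def f_max(error_ms):
    """Hartley's F max: largest error mean square divided by the smallest."""
    values = list(error_ms.values())
    if min(values) <= 0:
        raise ValueError("error mean squares must be positive")
    return max(values) / min(values)


# Hypothetical error mean squares taken from the individual-site ANOVAs
site_error_ms = {"Site A": 0.42, "Site B": 0.55, "Site C": 1.26}
ratio = f_max(site_error_ms)
print(f"F max = {ratio:.2f}")
```

A large ratio suggests heterogeneous errors, in which case one of the options listed above (grouping sites, a transformation, or a mixed model) would be considered.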
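$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \tau_k + (\alpha\tau)_{ik} + \varepsilon_{ijk}$
$\mu$ = mean effect
$\alpha_i$ = i th location effect
$\beta_{j(i)}$ = j th block effect within the i th location
$\tau_k$ = k th treatment effect
$(\alpha\tau)_{ik}$ = interaction of the k th treatment in the i th location
$\varepsilon_{ijk}$ = pooled error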
Environments = Locations = Sites
Blocks are nested in locations
– SS for blocks is pooled across locations
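To make the partition concrete, here is a minimal sketch of the sums of squares for a randomized complete block design repeated at several locations, with the block SS computed within each location and then pooled. The function name, the nested-list layout `y[location][block][treatment]`, and the data values are assumptions made for illustration only.

```python
def combined_anova_ss(y):
    """Sums of squares and df for an RCBD repeated at several locations.

    y[i][j][k] is the observation for location i, block j, treatment k.
    """
    l, r, t = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for loc in y for blk in loc for v in blk)
    cf = grand ** 2 / (l * r * t)  # correction factor

    loc_tot = [sum(v for blk in loc for v in blk) for loc in y]
    blk_tot = [[sum(blk) for blk in loc] for loc in y]
    trt_tot = [sum(y[i][j][k] for i in range(l) for j in range(r))
               for k in range(t)]
    lt_tot = [[sum(y[i][j][k] for j in range(r)) for k in range(t)]
              for i in range(l)]

    loc_term = sum(x ** 2 for x in loc_tot) / (r * t)
    ss = {}
    ss["Total"] = sum(v ** 2 for loc in y for blk in loc for v in blk) - cf
    ss["Location"] = loc_term - cf
    # Block SS is computed within each location, then pooled across locations
    ss["Blocks in Loc."] = (
        sum(b ** 2 for row in blk_tot for b in row) / t - loc_term
    )
    ss["Treatment"] = sum(x ** 2 for x in trt_tot) / (l * r) - cf
    ss["Loc. x Treatment"] = (
        sum(x ** 2 for row in lt_tot for x in row) / r
        - cf - ss["Location"] - ss["Treatment"]
    )
    parts = ("Location", "Blocks in Loc.", "Treatment", "Loc. x Treatment")
    ss["Pooled Error"] = ss["Total"] - sum(ss[name] for name in parts)

    df = {
        "Total": l * r * t - 1,
        "Location": l - 1,
        "Blocks in Loc.": l * (r - 1),
        "Treatment": t - 1,
        "Loc. x Treatment": (l - 1) * (t - 1),
        "Pooled Error": l * (r - 1) * (t - 1),
    }
    return ss, df


# Hypothetical data: 2 locations x 2 blocks x 3 treatments
plots = [
    [[10, 12, 14], [11, 13, 16]],
    [[20, 19, 24], [22, 21, 25]],
]
ss, df = combined_anova_ss(plots)
for source in df:
    print(f"{source:18s} df={df[source]:2d}  SS={ss[source]:.4f}")
```

Mean squares follow by dividing each SS by its df. In this made-up data set the location SS is far larger than the treatment SS, the situation the slides describe as common.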
Obtain a preliminary estimate of interaction of treatment with environment or season
Will we be able to make general recommendations about the treatments or should they be specific for each region or site?
– Error degrees of freedom are pooled across sites, so it is relatively easy to detect interactions
– Consider the relative magnitude of variation due to the treatments compared to the interaction MS
– Are there rank changes in treatments across environments (crossover interactions)?
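A quick screen for rank changes can be sketched as below. The function name and the table of site means are hypothetical, and the check only looks at observed means: a reversal found this way is a candidate crossover, not evidence that the interaction is significant.

```python
from itertools import combinations


def crossover_pairs(means):
    """Return treatment pairs whose order of means reverses between environments.

    means: {environment: {treatment: mean}}
    """
    envs = list(means)
    treatments = list(means[envs[0]])
    pairs = []
    for a, b in combinations(treatments, 2):
        signs = set()
        for env in envs:
            diff = means[env][a] - means[env][b]
            if diff != 0:
                signs.add(diff > 0)
        # Both signs present means a beats b somewhere and b beats a elsewhere
        if len(signs) == 2:
            pairs.append((a, b))
    return pairs


# Hypothetical treatment means at two sites
site_means = {
    "Site 1": {"A": 5.2, "B": 4.8, "C": 4.1},
    "Site 2": {"A": 4.6, "B": 5.0, "C": 3.9},
}
print(crossover_pairs(site_means))
```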
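Treatments and Locations are random
Source              df             SS       MS    Expected MS
Location            l-1            SSL      M1    $\sigma^2_e + r\sigma^2_{TL} + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.      l(r-1)         SSB(L)   M2    $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment           t-1            SST      M3    $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. X Treatment    (l-1)(t-1)     SSLT     M4    $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error        l(r-1)(t-1)    SSE      M5    $\sigma^2_e$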
F for Locations = (M1+M5)/(M2+M4)
Satterthwaite’s approximate df
$N_1' = \dfrac{(M_1+M_5)^2}{M_1^2/(l-1) + M_5^2/[l(r-1)(t-1)]}$
$N_2' = \dfrac{(M_2+M_4)^2}{M_2^2/[l(r-1)] + M_4^2/[(l-1)(t-1)]}$
(each mean square is divided by its own df)
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
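The approximate test for locations can be sketched as follows. The helper names and the mean squares in the example are assumptions for illustration; the one rule relied on is Satterthwaite's, in which each mean square in a sum is paired with its own degrees of freedom.

```python
def satterthwaite_df(terms):
    """Approximate df for a sum of mean squares.

    terms: list of (mean_square, df) pairs, each MS paired with its own df.
    """
    numerator = sum(ms for ms, _ in terms) ** 2
    denominator = sum(ms ** 2 / df for ms, df in terms)
    return numerator / denominator


def location_f_test(m1, m2, m4, m5, l, r, t):
    """Synthetic F for locations when treatments and locations are random."""
    f = (m1 + m5) / (m2 + m4)
    n1 = satterthwaite_df([(m1, l - 1), (m5, l * (r - 1) * (t - 1))])
    n2 = satterthwaite_df([(m2, l * (r - 1)), (m4, (l - 1) * (t - 1))])
    return f, n1, n2


# Hypothetical mean squares for l=3 locations, r=4 blocks, t=5 treatments
f, n1, n2 = location_f_test(m1=40.0, m2=4.0, m4=6.0, m5=2.0, l=3, r=4, t=5)
print(f"F = {f:.2f} with approximately {n1:.1f} and {n2:.1f} df")
```

The resulting df are usually not integers; they are used to look up the approximate F probability.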
Fixed Locations
• constitute the entire population of environments
OR
• represent specific environmental conditions (rainfall, elevation, etc.)
Source              df             SS       MS    Expected MS
Location            l-1            SSL      M1    $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.      l(r-1)         SSB(L)   M2    $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment           t-1            SST      M3    $\sigma^2_e + rl\sigma^2_T$
Loc. X Treatment    (l-1)(t-1)     SSLT     M4    $\sigma^2_e + r\sigma^2_{LT}$
Pooled Error        l(r-1)(t-1)    SSE      M5    $\sigma^2_e$
F for Locations = M1/M2
F for Treatments = M3/M5
F for Loc. x Treatments = M4/M5
Treatments are fixed, Locations are random
Source              df             SS       MS    Expected MS
Location            l-1            SSL      M1    $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.      l(r-1)         SSB(L)   M2    $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment           t-1            SST      M3    $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. X Treatment    (l-1)(t-1)     SSLT     M4    $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error        l(r-1)(t-1)    SSE      M5    $\sigma^2_e$
F for Locations = M1/M2
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
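The F ratios listed for the three sets of assumptions differ only in their denominators, which a small helper can make explicit. The function name, the model labels ("random", "fixed", "mixed"), and the mean squares are assumptions for this sketch; the ratios themselves are the ones given on the slides.

```python
def f_ratios(m1, m2, m3, m4, m5, model):
    """F ratios for Location, Treatment, and Loc. x Treatment.

    model: "random" (treatments and locations random),
           "fixed"  (locations fixed, treatments fixed),
           "mixed"  (treatments fixed, locations random).
    """
    if model == "random":
        f_loc = (m1 + m5) / (m2 + m4)  # approximate test, Satterthwaite df
        f_trt = m3 / m4
    elif model == "fixed":
        f_loc = m1 / m2
        f_trt = m3 / m5
    elif model == "mixed":
        f_loc = m1 / m2
        f_trt = m3 / m4
    else:
        raise ValueError(f"unknown model: {model!r}")
    # The interaction is tested against pooled error in all three cases
    return {"Location": f_loc, "Treatment": f_trt, "Loc. x Treatment": m4 / m5}


# Hypothetical mean squares M1..M5
for model in ("random", "fixed", "mixed"):
    print(model, f_ratios(40.0, 4.0, 30.0, 6.0, 2.0, model))
```

With these made-up values the treatment F drops from 15 to 5 when locations are treated as random, because the interaction mean square replaces pooled error as the denominator.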
SAS uses slightly different rules for determining Expected MS
No direct test for Locations for this model
Varieties fixed, Locations random
PROC GLM;
  Class Location Rep Variety;
  Model Yield = Location Rep(Location) Variety Location*Variety;
  /* Test requests F tests based on the expected mean squares */
  Random Location Rep(Location) Location*Variety / Test;
Run;
Source      Type III Expected Mean Square
Location    Var(Error) + 3 Var(Location*Variety) + 7 Var(Rep(Location)) + 21 Var(Location)

Dependent Variable: Yield
Source      DF        Type III SS    Mean Square    F Value    Pr > F
Location    1         0.505125       0.505125       0.20       0.6745
Error       5.8098    15.027788      2.586644
Error: MS(Rep(Location)) + MS(Location*Variety) - MS(Error)
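The numbers in this output can be checked by hand. The short sketch below only re-derives the F value and the error SS from the figures printed above; the variable names are made up, and the individual mean squares that form the synthetic error term are not shown in the output, so they are not reconstructed here.

```python
# Values copied from the SAS output above
ms_location = 0.505125
ms_synthetic_error = 2.586644   # MS(Rep(Location)) + MS(Location*Variety) - MS(Error)
df_synthetic_error = 5.8098     # Satterthwaite approximation, hence not an integer

f_value = ms_location / ms_synthetic_error
ss_error = df_synthetic_error * ms_synthetic_error

print(f"F = {f_value:.2f}")
print(f"Error SS = {ss_error:.4f}")
```

The F value rounds to the 0.20 reported by SAS, and the product of the error df and mean square agrees with the printed SS up to rounding of the df.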
Source               df             SS       MS    Expected MS
Years                y-1            SSY      M1    $\sigma^2_e + t\sigma^2_{R(Y)} + rt\sigma^2_Y$
Blocks in Years      y(r-1)         SSB(Y)   M2    $\sigma^2_e + t\sigma^2_{R(Y)}$
Treatment            t-1            SST      M3    $\sigma^2_e + r\sigma^2_{TY} + ry\sigma^2_T$
Years X Treatment    (y-1)(t-1)     SSYT     M4    $\sigma^2_e + r\sigma^2_{TY}$
Pooled Error         y(r-1)(t-1)    SSE      M5    $\sigma^2_e$
(y = number of years)
F for Years = M1/M2
F for Treatments = M3/M4
F for Years x Treatments = M4/M5
Can analyze as a factorial
Source                      df
Years                       y-1
Locations                   l-1
Years x Locations           (y-1)(l-1)
Block(Years x Locations)    yl(r-1)
Can determine the magnitude of the interactions between treatments and environments
– TxY, TxL, TxYxL
For a simpler interpretation, consider all year and location combinations as “sites” and use one of the models presented for multilocational trials
Combined Lab or Greenhouse Study (CRD)
Assume Treatments are fixed, Trials are random
A “trial” is a repetition of a replicated experiment
Source               df            SS      MS    Expected MS
Trial                l-1           SSL     M1    $\sigma^2_e + rt\sigma^2_L$
Treatment            t-1           SST     M2    $\sigma^2_e + r\sigma^2_{LT} + rl\sigma^2_T$
Trial x Treatment    (l-1)(t-1)    SSLT    M3    $\sigma^2_e + r\sigma^2_{LT}$
Pooled Error         lt(r-1)       SSE     M4    $\sigma^2_e$
F for Trials = M1/M4 (SAS would say M1/M3)
F for Treatments = M2/M3
F for Trials x Treatments = M3/M4
If there are no interactions, consider pooling SSLT and SSE
– Use a conservative P value to pool (e.g. >0.25 or >0.5)
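Pooling amounts to adding the two sums of squares and their df. A minimal sketch, assuming the P value for the Trial x Treatment test has already been obtained elsewhere; the function name, the 0.25 default threshold, and the example values are illustrative only.

```python
def pool_interaction_with_error(ss_lt, df_lt, ss_e, df_e, p_interaction,
                                threshold=0.25):
    """Pool SSLT with SSE when the interaction P value exceeds the threshold.

    Returns (error mean square, error df) to use for testing treatments.
    """
    if p_interaction > threshold:
        pooled_df = df_lt + df_e
        return (ss_lt + ss_e) / pooled_df, pooled_df
    # Interaction retained: keep the original pooled error term
    return ss_e / df_e, df_e


# Hypothetical CRD study: l=3 trials, t=4 treatments, r=5 replicates
df_lt = (3 - 1) * (4 - 1)   # Trial x Treatment df
df_e = 3 * 4 * (5 - 1)      # pooled error df
ms, df = pool_interaction_with_error(9.0, df_lt, 96.0, df_e, p_interaction=0.62)
print(f"error MS = {ms:.3f} on {df} df")
```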
Assumptions for this example:
– locations and blocks are random
– Treatments are fixed
Source              df             SS       MS    F
Total               lrt-1          SSTot
Location            l-1            SSL      M1    M1/M2
Blocks in Loc.      l(r-1)         SSB(L)   M2
Treatment           t-1            SST      M3    M3/M4
Loc. X Treatment    (l-1)(t-1)     SSLT     M4    M4/M5
Pooled Error        l(r-1)(t-1)    SSE      M5
If Loc. x Treatment interactions are significant, must be cautious in interpreting main effects combined across all locations
PROC MIXED or PROC GLIMMIX
Genotype by Environment Interactions (GEI)
When the relative performance of varieties differs from one location or year to another…
– how do you make selections?
– how do you make recommendations to farmers?
Genotype x Environment Interactions (GEI)
How much does GEI contribute to variation among varieties or breeding lines?
P = G + E + GE
P is phenotype of an individual
G is genotype
E is environment
GE is the interaction
DeLacey et al., 1990 – summary of results from many crops and locations
70-20-10 rule for E : GE : G (about 70% of the variation from environment, 20% from GE interaction, 10% from genotype)
20% of the observed variation among genotypes is due to interaction of genotype and environment
Many approaches for examining GEI have been suggested since the 1960s
Characterization of GEI is closely related to the concept of stability. “Stability” has been interpreted in different ways.
– Static – performance of a genotype does not change under different environmental conditions
(relevant for disease resistance, quality factors)
– Dynamic – genotype performance is affected by the environment, but its relative performance is consistent across environments. It responds to environmental factors in a predictable way.
CV of individual genotypes across locations
Regression of genotypes on an environmental index
– Eberhart and Russell, 1966
Ecovalence
– Wricke, 1962
Superiority measure of cultivars
– Lin and Binns, 1988
Many others…
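Two of the measures listed above can be sketched from a table of genotype means by environment. The function name and the data are hypothetical. The slope is the regression coefficient of each genotype on the environmental index (environment mean minus grand mean) used by Eberhart and Russell; their deviations from regression are not computed here. Ecovalence is Wricke's sum of squared interaction effects for each genotype.

```python
from statistics import mean


def stability_statistics(table):
    """Regression slope and ecovalence for each genotype.

    table: {genotype: [mean in environment 1, environment 2, ...]}
    """
    genotypes = list(table)
    n_env = len(table[genotypes[0]])
    grand = mean(v for g in genotypes for v in table[g])
    env_means = [mean(table[g][j] for g in genotypes) for j in range(n_env)]
    index = [m - grand for m in env_means]  # environmental index
    sum_index_sq = sum(i * i for i in index)

    results = {}
    for g in genotypes:
        y = table[g]
        g_mean = mean(y)
        # Slope of the genotype's means regressed on the environmental index
        slope = sum(yj * ij for yj, ij in zip(y, index)) / sum_index_sq
        # Ecovalence: sum of (Y_ij - genotype mean - env mean + grand mean)^2
        ecovalence = sum((yj - g_mean - ij) ** 2 for yj, ij in zip(y, index))
        results[g] = {"mean": g_mean, "slope": slope, "ecovalence": ecovalence}
    return results


# Hypothetical genotype means in three environments
yields = {
    "G1": [2.0, 4.0, 6.0],
    "G2": [3.0, 4.0, 5.0],
    "G3": [1.0, 4.0, 7.0],
}
stats = stability_statistics(yields)
for genotype, values in stats.items():
    print(genotype, values)
```

A slope near 1 with a small ecovalence indicates a genotype that tracks the average response of the trial, which corresponds to the dynamic concept of stability.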
Rank sum index (nonparametric approach)
Cluster analysis
Factor analysis
Principal component analysis
AMMI
Pattern analysis
Analysis of crossovers
Partial Least Squares Regression
Factorial Regression