Across Location Analyses

Combined Analysis of Experiments
▪ Basic Research
  – Researcher makes hypothesis and conducts a single experiment to test it
  – The hypothesis is modified and another experiment is conducted
  – Combined analysis of experiments is seldom required
  – Experiments may be repeated to
    • Provide greater precision (increased replication)
    • Validate results from initial experiment
▪ Applied Research
  – Recommendations to producers must be based on multiple locations and seasons that represent target environments (soil types, weather patterns)
Multilocational trials
▪ Often called MET = multi-environment trials
▪ How do treatment effects change in response to differences in soil and weather throughout a region?
  – What is the range of responses that can be expected?
▪ Detect and quantify interactions of treatments and locations and interactions of treatments and seasons in the recommendation domain
▪ Combined estimates are valid only if locations are randomly chosen within target area
  – Experiments often carried out on experiment stations
  – Generally use sites that are most accessible or convenient
  – Can still analyze the data, but consider possible bias due to restricted site selection when making interpretations
Preliminary Analysis
▪ Complete ANOVA for each experiment
  – Do we have good data from each site?
  – Examine residual plots for validity of ANOVA assumptions, outliers
▪ Examine experimental errors from different locations for heterogeneity
  – Perform F Max test or Levene’s test for homogeneity of variance
  – If homogeneous, perform a combined analysis across sites
  – If heterogeneous, may need to use a transformation or break sites into homogeneous groups and analyze separately
  – Differences in means across sites are often greater than treatment effects
  – This does not prevent a combined analysis, but may contribute to error heterogeneity if there are associations between means and variances
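The F Max check above can be sketched in a few lines. This is a minimal illustration, assuming each site has already been analyzed separately and that all sites have the same error degrees of freedom; the site names and error mean squares are hypothetical, and `hartley_f_max` is a helper name introduced here, not part of any library.

```python
def hartley_f_max(error_mean_squares):
    """Return the F Max ratio: largest error MS divided by smallest error MS."""
    if len(error_mean_squares) < 2:
        raise ValueError("need error mean squares from at least two sites")
    if min(error_mean_squares) <= 0:
        raise ValueError("error mean squares must be positive")
    return max(error_mean_squares) / min(error_mean_squares)


# Hypothetical error mean squares from three single-site ANOVAs
site_error_ms = {"Site A": 2.0, "Site B": 3.0, "Site C": 5.0}
f_max = hartley_f_max(list(site_error_ms.values()))
print(f"F Max = {f_max:.2f}")
```

The ratio is then compared with a tabulated F Max critical value for the number of sites and the error df per site; a ratio above the critical value suggests heterogeneous error variances.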
MET Linear Model (for an RBD)
$$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \tau_k + (\alpha\tau)_{ik} + \varepsilon_{ijk}$$
$\mu$ = mean effect
$\alpha_i$ = ith location effect
$\beta_{j(i)}$ = jth block effect within the ith location
$\tau_k$ = kth treatment effect
$(\alpha\tau)_{ik}$ = interaction of the kth treatment in the ith location
$\varepsilon_{ijk}$ = pooled error
▪ Environments = Locations = Sites
▪ Blocks are nested in locations
  – SS for blocks is pooled across locations
Treatment x Environment Interaction
▪ Obtain a preliminary estimate of interaction of treatment with environment or season
▪ Will we be able to make general recommendations about the treatments or should they be specific for each site?
  – Error degrees of freedom are pooled across sites, so it is relatively easy to detect interactions
  – Consider the relative magnitude of variation due to the treatments compared to the interaction MS
  – Are there rank changes in treatments across environments (crossover interactions)?
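The rank-change question can be checked descriptively from a table of treatment means. The sketch below is an assumption-laden illustration: the means are hypothetical, `crossover_pairs` is a helper name introduced here, and the check only looks at the sign of mean differences, so it is not a significance test for crossover interaction.

```python
from itertools import combinations


def crossover_pairs(means_by_env):
    """Return treatment pairs whose order reverses between environments.

    means_by_env maps environment -> {treatment: mean response}.
    """
    envs = list(means_by_env)
    treatments = sorted(means_by_env[envs[0]])
    pairs = []
    for a, b in combinations(treatments, 2):
        diffs = [means_by_env[env][a] - means_by_env[env][b] for env in envs]
        # A crossover needs a ahead of b somewhere and b ahead of a elsewhere
        if any(d > 0 for d in diffs) and any(d < 0 for d in diffs):
            pairs.append((a, b))
    return pairs


# Hypothetical treatment means (illustrative numbers only)
no_rank_change = {"Env1": {"A": 5.0, "B": 4.0}, "Env2": {"A": 8.0, "B": 5.0}}
rank_change = {"Env1": {"A": 5.0, "B": 4.0}, "Env2": {"A": 4.5, "B": 6.0}}
print(crossover_pairs(no_rank_change))
print(crossover_pairs(rank_change))
```

In the first table the A minus B difference changes size but not sign, so there is interaction without a rank change; in the second the sign flips, which is the crossover case.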
Treatments and locations are random
Source             df              SS       MS   Expected MS
Location           l-1             SSL      M1   $\sigma^2_e + r\sigma^2_{TL} + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)          SSB(L)   M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1             SST      M3   $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. X Treatment   (l-1)(t-1)      SSLT     M4   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error       l(r-1)(t-1)     SSE      M5   $\sigma^2_e$
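F for Locations = (M1+M5)/(M2+M4)
Satterthwaite’s approximate df (each mean square is squared and divided by its own df)
$N_1' = \dfrac{(M_1+M_5)^2}{\dfrac{M_1^2}{l-1} + \dfrac{M_5^2}{l(r-1)(t-1)}}$
$N_2' = \dfrac{(M_2+M_4)^2}{\dfrac{M_2^2}{l(r-1)} + \dfrac{M_4^2}{(l-1)(t-1)}}$
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5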
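The approximate F test for Locations can be sketched as follows. This is a minimal illustration, assuming the all-random model above with l locations, r blocks per location and t treatments; the function names and the mean squares in the example are hypothetical. Each mean square is paired with its own degrees of freedom from the ANOVA table.

```python
def satterthwaite_df(mean_squares, dfs):
    """Approximate df for a sum of mean squares: (sum MS)^2 / sum(MS^2 / df)."""
    numerator = sum(mean_squares) ** 2
    denominator = sum(ms ** 2 / df for ms, df in zip(mean_squares, dfs))
    return numerator / denominator


def location_f_random_model(m1, m2, m4, m5, l, r, t):
    """Approximate F for Locations, (M1+M5)/(M2+M4), with its approximate df."""
    df1 = l - 1                    # Location
    df2 = l * (r - 1)              # Blocks in Loc.
    df4 = (l - 1) * (t - 1)        # Loc. X Treatment
    df5 = l * (r - 1) * (t - 1)    # Pooled Error
    f_value = (m1 + m5) / (m2 + m4)
    n1 = satterthwaite_df([m1, m5], [df1, df5])
    n2 = satterthwaite_df([m2, m4], [df2, df4])
    return f_value, n1, n2


# Hypothetical mean squares for l = 3 locations, r = 4 blocks, t = 5 treatments
f_value, n1, n2 = location_f_random_model(40.0, 6.0, 4.0, 2.0, l=3, r=4, t=5)
print(f"F = {f_value:.2f} with approx. df ({n1:.2f}, {n2:.2f})")
```

The approximate df are generally not integers; the F value is referred to an F distribution with (N1’, N2’) df, typically by interpolation or software.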
Treatments and locations are fixed
Fixed Locations
• constitute the entire population of environments
OR
• represent specific environmental conditions (rainfall, elevation, etc.)

Source             df              SS       MS   Expected MS
Location           l-1             SSL      M1   $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)          SSB(L)   M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1             SST      M3   $\sigma^2_e + rl\sigma^2_T$
Loc. X Treatment   (l-1)(t-1)      SSLT     M4   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error       l(r-1)(t-1)     SSE      M5   $\sigma^2_e$

F for Locations = M1/M2
F for Treatments = M3/M5
F for Loc. x Treatments = M4/M5
Treatments are fixed, Locations are random
Source             df              SS       MS   Expected MS
Location           l-1             SSL      M1   $\sigma^2_e + t\sigma^2_{R(L)} + rt\sigma^2_L$
Blocks in Loc.     l(r-1)          SSB(L)   M2   $\sigma^2_e + t\sigma^2_{R(L)}$
Treatment          t-1             SST      M3   $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Loc. X Treatment   (l-1)(t-1)      SSLT     M4   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error       l(r-1)(t-1)     SSE      M5   $\sigma^2_e$

F for Locations = M1/M2
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
▪ SAS uses slightly different rules for determining Expected MS
▪ Under the SAS rules there is no direct test for Locations for this model
SAS Expected Mean Squares
Varieties fixed, Locations random
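PROC GLM;  /* no DATA= option: SAS uses the most recently created data set */
  CLASS Location Rep Variety;
  MODEL Yield = Location Rep(Location) Variety Location*Variety;
  /* declare the random effects; TEST requests F tests with appropriate error terms */
  RANDOM Location Rep(Location) Location*Variety / TEST;
RUN;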
Source      Type III Expected Mean Square
Location    Var(Error) + 3 Var(Location*Variety) + 7 Var(Rep(Location)) + 21 Var(Location)

Dependent Variable: Yield
Source      DF        Type III SS   Mean Square   F Value   Pr > F
Location    1         0.505125      0.505125      0.20      0.6745
Error       5.8098    15.027788     2.586644
Error: MS(Rep(Location)) + MS(Location*Variety) - MS(Error)
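The reason SAS builds this composite error term can be seen from the expected mean squares. The derivation below is a sketch, assuming balanced data with r = 3 replications and t = 7 varieties (read from the coefficients in the output). The expectations for Rep(Location), Location*Variety and Error are not shown in the output and are filled in here from the usual balanced-data pattern, so treat them as assumptions.

```latex
\begin{align*}
E[\mathrm{MS}(\text{Rep(Location)})]   &= \sigma^2_e + 7\,\sigma^2_{R(L)} \\
E[\mathrm{MS}(\text{Location*Variety})] &= \sigma^2_e + 3\,\sigma^2_{LV} \\
E[\mathrm{MS}(\text{Error})]           &= \sigma^2_e \\[4pt]
E[\mathrm{MS}(\text{Rep(Location)}) + \mathrm{MS}(\text{Location*Variety}) - \mathrm{MS}(\text{Error})]
                                       &= \sigma^2_e + 3\,\sigma^2_{LV} + 7\,\sigma^2_{R(L)} \\
E[\mathrm{MS}(\text{Location})]        &= \sigma^2_e + 3\,\sigma^2_{LV} + 7\,\sigma^2_{R(L)} + 21\,\sigma^2_{L} \\[4pt]
F_{\text{Location}} &= \frac{0.505125}{2.586644} \approx 0.20
\end{align*}
```

The composite term matches the expectation of MS(Location) except for the Location variance, which is what a valid denominator requires. Because it is a linear combination of mean squares, its df (5.8098) is approximate and not an integer.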
Treatments are fixed, Years are random
Source              df              SS       MS   Expected MS
Years               y-1             SSY      M1   $\sigma^2_e + t\sigma^2_{R(Y)} + rt\sigma^2_Y$
Blocks in Years     y(r-1)          SSB(Y)   M2   $\sigma^2_e + t\sigma^2_{R(Y)}$
Treatment           t-1             SST      M3   $\sigma^2_e + r\sigma^2_{TY} + ry\sigma^2_T$
Years X Treatment   (y-1)(t-1)      SSYT     M4   $\sigma^2_e + r\sigma^2_{TY}$
Pooled Error        y(r-1)(t-1)     SSE      M5   $\sigma^2_e$

(y = number of years)

F for Years = M1/M2
F for Treatments = M3/M4
F for Years x Treatments = M4/M5
Locations and Years in the same trial
▪ Can analyze as a factorial

Source                      df
Years                       y-1
Locations                   l-1
Years x Locations           (y-1)(l-1)
Block(Years x Locations)    yl(r-1)

▪ Can determine the magnitude of the interactions between treatments and environments
  – TxY, TxL, TxYxL
▪ For a simpler interpretation, consider all year and location combinations as “sites” and use one of the models presented for multilocational trials
Combined Lab or Greenhouse Study (CRD)
▪ Assume Treatments are fixed, Trials are random
▪ A “trial” is a repetition of a replicated experiment

Source              df            SS     MS   Expected MS
Trial               l-1           SSL    M1   $\sigma^2_e + rt\sigma^2_L$
Treatment           t-1           SST    M2   $\sigma^2_e + r\sigma^2_{TL} + rl\sigma^2_T$
Trial x Treatment   (l-1)(t-1)    SSLT   M3   $\sigma^2_e + r\sigma^2_{TL}$
Pooled Error        lt(r-1)       SSE    M4   $\sigma^2_e$

F for Trials = M1/M4 (SAS would say M1/M3)
F for Treatments = M2/M3
F for Trials x Treatments = M3/M4
▪ If there are no interactions, consider pooling SSLT and SSE
  – Use a conservative P value to pool (e.g. >0.25 or >0.5)
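The pooling rule above can be sketched as a small helper. This is an illustration only: the function names, the default threshold of 0.25, and the sums of squares in the example are assumptions chosen to mirror the bullet, not values from a real trial.

```python
def should_pool(p_interaction, threshold=0.25):
    """Pool only when the interaction P value exceeds a conservative threshold."""
    return p_interaction > threshold


def pooled_error(ss_lt, df_lt, ss_e, df_e):
    """Combine the Trial x Treatment SS with the error SS.

    Returns (pooled mean square, pooled df).
    """
    pooled_df = df_lt + df_e
    return (ss_lt + ss_e) / pooled_df, pooled_df


# Hypothetical CRD repeated in l = 3 trials with t = 4 treatments and r = 5 reps
l, t, r = 3, 4, 5
df_lt = (l - 1) * (t - 1)   # 6
df_e = l * t * (r - 1)      # 48
if should_pool(p_interaction=0.60):
    pooled_ms, pooled_df = pooled_error(12.0, df_lt, 96.0, df_e)
    print(f"pooled error MS = {pooled_ms:.2f} on {pooled_df} df")
```

With the interaction pooled into the error, the pooled mean square would serve as the denominator for the remaining F tests, with the larger pooled df.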
Preliminary ANOVA
▪ Assumptions for this example:
  – Locations and blocks are random
  – Treatments are fixed

Source             df              SS       MS   F
Total              lrt-1           SSTot
Location           l-1             SSL      M1   M1/M2
Blocks in Loc.     l(r-1)          SSB(L)   M2
Treatment          t-1             SST      M3   M3/M4
Loc. X Treatment   (l-1)(t-1)      SSLT     M4   M4/M5
Pooled Error       l(r-1)(t-1)     SSE      M5

▪ If Loc. x Treatment interactions are significant, must be cautious in interpreting main effects combined across all locations
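The preliminary ANOVA can be sketched directly from the table. This is a minimal illustration, assuming balanced data from the same RBD at every location, with treatments fixed and locations random; the function names and the yield values are hypothetical.

```python
def combined_rbd_anova(data):
    """Combined ANOVA for the same RBD repeated at several locations.

    data[i][j][k] is the response at location i, block j, treatment k
    (balanced: r blocks per location, t treatments per block).
    Returns {source: (df, SS, MS)}.
    """
    l, r, t = len(data), len(data[0]), len(data[0][0])
    values = [y for loc in data for blk in loc for y in blk]
    cf = sum(values) ** 2 / (l * r * t)  # correction factor

    loc_totals = [sum(sum(blk) for blk in loc) for loc in data]
    blk_totals = [sum(blk) for loc in data for blk in loc]
    trt_totals = [sum(loc[j][k] for loc in data for j in range(r)) for k in range(t)]
    cell_totals = [sum(loc[j][k] for j in range(r)) for loc in data for k in range(t)]

    ss_tot = sum(y * y for y in values) - cf
    ss_l = sum(x * x for x in loc_totals) / (r * t) - cf
    ss_bl = sum(x * x for x in blk_totals) / t - cf - ss_l  # blocks within locations
    ss_t = sum(x * x for x in trt_totals) / (l * r) - cf
    ss_lt = sum(x * x for x in cell_totals) / r - cf - ss_l - ss_t
    ss_e = ss_tot - ss_l - ss_bl - ss_t - ss_lt             # pooled error

    rows = {
        "Location": (l - 1, ss_l),
        "Blocks in Loc.": (l * (r - 1), ss_bl),
        "Treatment": (t - 1, ss_t),
        "Loc. X Treatment": ((l - 1) * (t - 1), ss_lt),
        "Pooled Error": (l * (r - 1) * (t - 1), ss_e),
    }
    return {src: (df, ss, ss / df) for src, (df, ss) in rows.items()}


def mixed_model_f(table):
    """F ratios for treatments fixed, locations random: M1/M2, M3/M4, M4/M5."""
    ms = {src: row[2] for src, row in table.items()}
    return {
        "Location": ms["Location"] / ms["Blocks in Loc."],
        "Treatment": ms["Treatment"] / ms["Loc. X Treatment"],
        "Loc. X Treatment": ms["Loc. X Treatment"] / ms["Pooled Error"],
    }


# Hypothetical yields: 2 locations x 2 blocks x 3 treatments
yields = [
    [[4.0, 5.0, 7.0], [5.0, 6.0, 9.0]],
    [[6.0, 6.0, 7.0], [7.0, 8.0, 7.0]],
]
table = combined_rbd_anova(yields)
f_ratios = mixed_model_f(table)
for source, (df, ss, ms) in table.items():
    print(f"{source:18s} df={df:2d}  SS={ss:8.4f}  MS={ms:8.4f}")
print(f_ratios)
```

Testing Treatment against the Loc. X Treatment mean square, not the pooled error, is what makes the treatment comparison apply across the population of locations.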
Genotype by Environment Interactions (GEI)
▪ When the relative performance of varieties differs from one location or year to another…
  – how do you make selections?
  – how do you make recommendations to farmers?
[Figure: three panels plotting Response against Environments for genotypes A and B. Panel titles: “No interaction”; “No rank changes, but interaction”; “Rank changes and interaction”.]
Genotype x Environment Interactions (GEI)
▪ How much does GEI contribute to variation among varieties or breeding lines?
P = G + E + GE
  P is phenotype of an individual
  G is genotype
  E is environment
  GE is the interaction
DeLacey et al., 1990: summary of results from many crops and locations
70-20-10 rule for E : GE : G
20% of the observed variation among genotypes is due to interaction of genotype and environment
Stability
▪ Many approaches for examining GEI have been suggested since the 1960s
▪ Characterization of GEI is closely related to the concept of stability. “Stability” has been interpreted in different ways.
  – Static – performance of a genotype does not change under different environmental conditions (relevant for disease resistance, quality factors)
  – Dynamic – genotype performance is affected by the environment, but its relative performance is consistent across environments. It responds to environmental factors in a predictable way.
Measures of stability
▪ CV of individual genotypes across locations
▪ Regression of genotypes on an environmental index
  – Eberhart and Russell, 1966
▪ Ecovalence
  – Wricke, 1962
▪ Superiority measure of cultivars
  – Lin and Binns, 1988
▪ Many others…
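Three of the measures listed above can be sketched from a genotype by environment table of means. This is a simplified illustration on hypothetical data: the regression part computes only the slope on the environmental index (the mean of each environment minus the grand mean) and omits the deviations from regression that the Eberhart and Russell approach also considers. `stability_measures` is a helper name introduced here.

```python
from statistics import mean, stdev


def stability_measures(means):
    """means[g][e] = mean of genotype g in environment e (complete table).

    Returns one dict per genotype with its mean, CV (%), regression slope
    on the environmental index, and Wricke's ecovalence.
    """
    n_geno, n_env = len(means), len(means[0])
    grand = mean(y for row in means for y in row)
    env_means = [mean(means[g][e] for g in range(n_geno)) for e in range(n_env)]
    env_index = [m - grand for m in env_means]
    sum_index_sq = sum(x * x for x in env_index)

    results = []
    for row in means:
        g_mean = mean(row)
        # Ecovalence: this genotype's contribution to the GE sum of squares
        ecovalence = sum(
            (row[e] - g_mean - env_means[e] + grand) ** 2 for e in range(n_env)
        )
        slope = sum(row[e] * env_index[e] for e in range(n_env)) / sum_index_sq
        results.append({
            "mean": g_mean,
            "cv_percent": 100.0 * stdev(row) / g_mean,
            "slope": slope,
            "ecovalence": ecovalence,
        })
    return results


# Hypothetical means for 3 genotypes in 4 environments
geno_means = [
    [4.0, 5.0, 6.0, 7.0],
    [3.0, 5.0, 7.0, 9.0],
    [5.0, 5.0, 5.5, 6.0],
]
measures = stability_measures(geno_means)
for name, m in zip(["G1", "G2", "G3"], measures):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

A low CV or a slope near zero points to static stability, while a slope near 1 with small ecovalence indicates a genotype that tracks the average response of the environments (dynamic stability).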
Analysis of GEI – other approaches
▪ Rank sum index (nonparametric approach)
▪ Cluster analysis
▪ Factor analysis
▪ Principal component analysis
▪ AMMI
▪ Pattern analysis
▪ Analysis of crossovers
▪ Partial Least Squares Regression
▪ Factorial Regression