RGxE: An R Program for Genotype x Environment Interaction Analysis

advertisement
RGxE: An R Program for Genotype x Environment Interaction Analysis
by Mahendra Dia, Todd C. Wehner, and Consuelo Arellano
Abstract Genotype x environmental interaction (GxE) can lead to differences in performance of
genotypes over environments. GxE analysis can be used to analyze the stability of genotypes and the
value of test locations. We developed an R language program (RGxE) that computes univariate stability
statistics, descriptive statistics, pooled ANOVA, genotype F ratio across location and environment,
cluster analysis for location, and location correlation with average location performance. Univariate
stability statistics calculated are regression slope (bi), deviation from regression (S2d), Shukla’s variance
(σi2), S square Wricke’s ecovalence (Wi), and Kang’s yield stability (YSi). RGxE is free and intended for
use by scientists studying performance of polygenic or quantitative traits over multiple environments. In
the present paper we provide the RGxE program and its components along with an example input data
and outputs. Additionally, the RGxE program is also available at
http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html.
1
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
List of abbreviations
AMMI = Additive main effects and multiplicative interaction model
ANOVA = Analysis of variance
BLUP = Best linear unbiased prediction
CLT = Cultigen or genotype
CRAN = Comprehensive R Archive Network
CSV = Comma Separated Value
CV = Coefficient of variation
DF = Degrees of freedom
ENV = Environment (location - year combination)
EU = Experimental unit
GGE = Genotype main effects plus Genotype x environmental interaction effect model
GxE = Genotype x environmental interaction
H0 = Null hypothesis
LC = Location
Max = Maximum
MCMC = Markov Chain Monte Carlo
Min = Minimum
MS = Mean square
RGxE = R program for the analysis of genotype stability and location value
RP = Replication
RPid = Replication id, which is an experimental unit
Sd = Standard deviation
SS = Sum of square
Var = Variance
YR = Year
bi = Regression slope
S2d = Deviation from regression
σi2 = Shukla’s variance
Wi = Wricke’s ecovalence
YSi = Kang’s yield stability
2
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Introduction
Genotype x environmental interaction (GxE) refers to the modification of genetic factors by
environmental factors, and to the role of genetic factors in determining the performance of genotypes in
different environments. GxE can occur for quantitative traits of economic importance and is often studied
in plant and animal breeding, genetic epidemiology, pharmacogenomics and conservational biology
research. The traits include reproductive fitness, longevity, height, weight, yield, and disease resistance.
Selection of superior genotypes in target environments is an important objective of plant breeding
programs. A target environment is a production environment used by growers. In order to identify
superior genotypes across multiple environments, plant breeders conduct trials across locations and years,
especially during the final stages of cultivar development. GxE is said to exist when genotype
performance differs over environments. Performance of genotype can vary greatly across environment
because of the effect of environment on trait expression. Cultivars with high and stable performance are
difficult to identify, but are of great value.
Since it is impossible to test genotypes in all target environments, plant breeders do indirect selection
using their own multiple-environment trials, or test environments. GxE reduces the predictability of the
performance of genotypes in target environments based on genotype performance in test environments.
An important factor in plant breeding is the selection of suitable test locations, since it accounts for GxE
and maximizes gain from selection (Yan et al. 2011). An efficient test location is discriminating, and is
representative of the target environments for the cultivars to be released. Discriminating locations can
detect differences among genotypes with few replications. Representative locations make it likely that
genotypes selected will perform well in target environments (Yan et al. 2011).
The analysis of variance (ANOVA) is useful in determining the existence, size and significance of GxE.
In order to determine GxE for a group of elite cultivars, genotypes are often considered to be fixed effects
and environments random. However, for the purpose of estimating breeding values using best linear
unbiased prediction (BLUP), genotypes are considered to be random and environments fixed. Some
statisticians consider genotypes random effect, provided that the objective is to select the best ones (Smith
et al. 2005). If GxE is significant, additional stability statistics can be calculated.
Several statistical methods have been proposed for stability analysis. These methods are based on
univariate and multivariate models. The present paper focuses on univariate models for the analysis of
stability measured using R programming, so a brief description of each stability measure is provided
below.
The most widely used methods are univariate stability models based on regression and variance estimates.
According to the regression model, stability is expressed in terms of the trait mean (M), the slope of
regression line (bi) and the sum of squares for deviation from regression (S2d). High mean of a genotype
performance is a precondition of stability. The slope (bi) of regression indicates the response of genotype
to the environmental index, which is derived from the average performance of all genotypes in each
environment. If bi is not significantly different from unity, the genotype is adapted in all environments. A
bi greater than unity describes genotypes with higher sensitivity to environmental change (below average
stability), and greater specificity of adaptability to high yielding environments. A bi less than unity
provides a measure of greater resistance to environmental change (above average stability), and therefore
increasing specificity of adaptability to low yielding environments.
The variance parameters that measure stability statistics include stability ecovalence (Wi2) proposed by
Wricke (1962), stability variance (σi2) proposed by Shukla (1972), and yield stability (YSi) proposed by
Kang (1993).
3
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Ecovalence stability index (Wi2) of a genotype is its contribution to the GxE squared and summed across
all environments. Since the value of Wi2 is expressed as a sum of squares, a test of significance for Wi2 is
not available. Shukla (1972) proposed an unbiased estimate (σi2) of the variance of GxE plus an error term
associated with genotype. Shukla's stability variance (σi2) is a linear combination of Wricke's ecovalence
(Wi2). Shukla's stability statistic measures the contribution of a genotype to the GxE and error term,
therefore a genotype with low σi2 is regarded as stable. According to Kang et al. (1987), Wi2 and σi2 are
equivalent in ranking genotypes for stability.
The Kang (1993) stability statistic (YSi) is a nonparametric stability procedure in which both the mean
(M) and Shukla (1972) stability variance (σi2) for a trait are used as selection criteria. This method gives
equal weight to M and σi2. According to this method, genotypes with YSi greater than the mean YSi are
considered stable (Kang 1993; Mekbib 2003, Fan et al. 2007).
Genotype F ratio for each test location and correlation of test location with average location are important
measures of location value. When the mean of all genotypes are equal, then the F ratio will be close to 1.
If analysis of variance is run by location, then high genotype F ratio indicates high discriminating ability
for that location. High and significant value of Pearson correlation of each location with the mean of all
locations indicates strong representation of mean location performance.
Our objective was to develop an R language program (RGxE) that gives an output for genotype stability
and location value using univariate models, descriptive statistics, genotype F ratio across location and
environment, cluster analysis for location, and location correlation with average location performance. In
addition to the RGxE program, Dia et al. (2016a, in press) provided a SAS program (SASGxE) that
computes multivariate stability statistics using R program along with univariate stability statistics and
location value using SAS programming. These multivariate stability statistics include the additive main
effects and multiplicative interaction (AMMI) model, and genotype main effects plus GxE (GGE) model.
RGxE uses R software (version 3.1.3 and higher). RGxE is freely available, annotated, and intended for
scientists studying performance of polygenic or quantitative traits under different environmental
conditions. In the present paper we provide the RGxE program and it’s components along with outputs. A
supplemental file is provided with instructions for the user-enetered fields, interpretation of univariate
stability statistics, sample input data, and output from sample input data. The RGxE program along with
associated files is also available at http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html.
General features of the RGxE Program
1. Overview of the RGxE program
RGxE is a user friendly and annotated R program that will allow user to analyze genotype stability and
evaluate test location value of balanced mult-location replicated trial data. This program generates output
(.csv or .txt) into the same folder from where it reads input dataset and Console window of helper
application ‘R studio’ (RStudio, 2015) of R statistical software (R Core Team, 2015). A schematic
representation of RGxE is presented in Figure 1. Below are the key components of RGxE program which
user can independently run.
2. Installing and loading packages
RGxE uses dplyr (Wickham et al., 2015), tidyr (Wickham, 2016), broom (Robinson, 2015), agricolae
(Mendiburu, 2015), lme4 (Bates et al., 2015), afex (Singmann et al., 2015), cluster (Maechler et al.,
2015), and grDevices (R Core Team, 2015) packages. The dplyr, tidyr, broom, agricolae, lme4, afex,
cluster, and grDevices packages are available from the Comprehensive R Archive Network (CRAN),
therefore they can be installed as any other packages, by simply typing:
4
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
install.packages("dplyr")
install.packages("tidyr")
install.packages("broom")
install.packages("agricolae")
install.packages("lme4")
install.packages("afex")
install.packages("cluster")
install.packages("grDevices")
Once installed, the packages have to be loaded before they can be used. This can be done through the
library() or require()command, as shown below.
library (tidyr)
library (dplyr)
library (sqldf)
library(lme4)
library(afex)
library(broom)
library(agricolae)
library(cluster)
library(grDevices)
3. Input data and validation
RGxE starts with user-entered field to read input data. Instructions on user enetered fields are presented in
supplemental material. The user is required to set current working directory using setwd(), which is
input data file location, and pass input data file name. RGxE requires an input data file in .csv (comma
separated value) format. Highlighted fields are user entered in the code shown below for Windows and
iOS (Mac) operating system, respectively.
setwd("E:/PhD Research Work/PhD Articles")
#### For Windows user ####
tempa <- read.csv("RGxEInputData2_2016_02_15.csv", header = TRUE)
#### For iOS or Mac user ####
file.name <- "E:/PhD Research Work/PhD Articles/RGxEInputData2_2016_02_15.csv"
out.name <- "E:/PhD Research Work/PhD Articles/GxEROutput.csv"
tempa <- read.csv(file.name)
The input data file is comprised of column names including YR (year), LC (location), RP (replication),
CLT (cultigen or genotype), and dependent variable (Trait). Sample input data is presented in
supplemental material. User is required not to change the column names as program takes same variable
name for the analysis. Dependent variable in example input data is yield (Mg ha-1) of watermelon.
Hereafter, a word ‘genotype’ is used to indicate cultigen, cultivar, variety or genotype. RGxE validates
the structure of input data, with below arguments, so that correct column types (numeric, logical, factor,
or character) are used for statistical analysis.
tempa$YR <- as.factor(tempa$YR)
tempa$RP <- as.factor(tempa$RP)
tempa$LC <- as.factor(tempa$LC)
tempa$CLT <- as.factor(tempa$CLT)
tempa$Trait <- as.numeric(tempa$Trait)
To access the structure of data, the str() command can be used.
str(tempa)
5
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
'data.frame':
400 obs. of 5 variables:
$ YR
: Factor w/ 2 levels "2009","2010": 1 1
$ LC
: Factor w/ 5 levels "CI","FL","KN",..:
$ RP
: Factor w/ 4 levels "1","2","3","4": 1
$ CLT : Factor w/ 10 levels "CalhounGray",..:
$ Trait: num 56.2 74.2 32.6 74.2 64.8 ...
1
3
1
3
1
3
1
1
1
3
1
9
1
3
1
2
1
3
1
5
1
3
1
4
1
3
1
7
1 ...
3 3 3 ...
1 1 ...
10 6 8 ...
Top 6 rows of example input data can be viewed using head() command.
head(tempa)
1
2
3
4
5
6
YR
2009
2009
2009
2009
2009
2009
LC RP
CLT Trait
KN 1
EarlyCanada 56.236
KN 1
CalhounGray 74.167
KN 1
StarbriteF1 32.601
KN 1
CrimsonSweet 74.167
KN 1 GeorgiaRattlesnake 64.794
KN 1
FiestaF1 70.907
4. Genotype stability statistics
4.1 Analysis of variance (ANOVA)
In multi-location replicated trial data, combined ANOVA is performed with the objectives to identify the
significance of different effects; estimate and compare mean for levels of fixed factors; and estimate the
size of genotype and GxE variance components. The ANOVA model comprises four factors: genotype
(CLT), location (LC), year (YR), and replication or block (RP) nested within locations and year. The
response of the genotype i in the location j, year k and replication r is presented as:
Response = m + CLTi + LCj + YCk + RPr(LCj*YRk) + CLTi*LCj + CLTi*YRk + LCj*YRk + CLTi*LCj*YRk + Errorijkr
where m = grand mean. Depending on the objectives of the analysis, the genotype, location and year are
defined as random or fixed effect, which gives five different ANOVA models (Table 1). The genotype is
random when the aim is to estimate variance components, genetic parameters, genetic gains expected
from selection or different breeding strategies etc. Conversely, genotype is fixed factor when aim is to
make comparison of test material for selection or recommendation. Similarly, location is considered as
random when the main interest is to estimate variance components for sites that are representative of the
relevant population within target region. Location is fixed when interest is to make explicit comparison of
one level another and each location represents a well-defined area with relative to crop management. The
year and replication are usually treated as random factor.
Different combinations of random and fixed effects in ANOVA model have implications for the
expectations of mean square (MS) values with the possible modification of the error term to be adopted in
the F test. Therefore, sometimes the F test is not as straightforward as the ratio between two mean
squares.
RGxE computes five different cases of ANOVA:
 case 1: CLT, YR, LC and RP – all random
 case 2: CLT, YR and LC – fixed; RP – random
 case 3: CLT – fixed; LC, YR and RP – random
 case 4: LC – fixed; CLT, YR and RP – random
 case 5: CLT and LC – fixed; YR and RP – random
6
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
For random effect RGxE computes estimates of variance components using lmer() function of lme4
package. The significance of random effects is computed using likelihood ratio test to attain p-values.
Likelihood is the probability of the data given a model. The logic of the likelihood ratio test is to compare
the likelihood of two models with each other. The model without the factor that you are interested in (null
model) is compared with model with the factor that you are interested in (full model) using anova()
function. It gives a Chi-Square value, the associated degrees of freedom and p-value. According to Wilk’s
theorem, the negative two times the log likelihood ratio of two models approaches a Chi-Square
distribution with k degrees of freedom, where k is number of random effects tested. RGxE create user
defined anova_lrt() function to compute likelihood ratio test and it is stored in ANOVA model Case I
code.
The type III sum of squares (SS), MS, F value of fixed effects are computed by fitting model in anova()
function of lme4 package. The significance (p-value) of fixed effects is computed using mixed()
function of afex package. The mixed() function computes type III like p-values using default method
via Kenward-Roger approximation for degrees of freedom.
To identify each experimental unit (EU) uniquely a distinct value must be assigned to EU. RGxE assign a
distinct value to each combination of replication (RP) nested within location (LC) x year (YR) and use
this new term (RPid) in model. After installing and calling packages, user can independently compute
five different ANOVA models while feeding input data (tempa) in below code. User friendly output is
generated in “data.frame” class using dplyr and tidyr packages.
########################################################################
##
ANOVA: Compute analysis of variance
##
########################################################################
#Generate unique id for replication for anova
tempa$RPid<-as.factor(paste(tempa$YR, tempa$LC, tempa$RP, sep="."))
########################################################################
###
ANOVA Case 1: CLT, YR, LC and RP - All Random
###
########################################################################
#full model
fit.f1<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
#model summary
summary1 <- summary(fit.f1)
#variance of random factors
variance <- as.data.frame(summary1$varcor)
#drop rownames
rownames(variance) <- NULL
variance1 <- variance %>% select (-var1, -var2) %>%
rename(sov=grp, Variance=vcov, stddev=sdcor)
#Type 3 test of hypothesis
#Type III Wald chisquare tests
anova(fit.f1, type="III")
7
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Type 1
test of hypothesis
anova(fit.f1, type="marginal", test="F")
#model fitness
anovacase1 <- plot(fit.f1,
main="Model fitness Case 1: CLT, YR, LC and RP - All Random",
xlab="Predicated Value", ylab="Residual")
#LRT - likelihood ratio test for computing significance of random effect
#create function (anova_lrt) for Likelihood ratio test, where parameters
#a=outputdatasetname; example-anova1r
#b=full model name; example-fit.f1
#c=reduced model name; example-fit.f1r
#d=effect name; example- "RPid", NOTE: call it in quotation
anova_lrt <- function (a,b,c,d){
#level of significance
a <-anova(b,c)
#convert anova into data frame
a <- data.frame(a)
#convert rownames into column
a$name <- rownames(a)
# drop rownames
rownames(a) <- NULL
a <- a %>% filter(name=="b") %>%
mutate(sov=d) %>% select(sov, Pr_Chisq = starts_with("Pr..Chisq."))
# return the result
return(a)
}
#null model for YR
fit.f1y<-lmer(Trait~ 1 + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1y <- anova_lrt(anova1y,fit.f1,fit.f1y,"YR")
#null model for LC
fit.f1l<-lmer(Trait~ 1 + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1l <- anova_lrt(anova1l,fit.f1,fit.f1l,"LC")
#null model for CLT
fit.f1c<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1c <- anova_lrt(anova1c,fit.f1,fit.f1c,"CLT")
8
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#null model for YR:LC
fit.f1yl<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1yl <- anova_lrt(anova1yl,fit.f1,fit.f1yl,"YR:LC")
#null model for YR:CLT
fit.f1yc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1yc <- anova_lrt(anova1yc,fit.f1,fit.f1yc,"YR:CLT")
#null model for LC:CLT
fit.f1lc<-lmer(Trait~ 1 + (1|YR)
(1|YR:CLT) +
(1|YR:LC:CLT) +
+ (1|LC) + (1|CLT) + (1|YR:LC) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1lc <- anova_lrt(anova1lc,fit.f1,fit.f1lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f1ylc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1ylc <- anova_lrt(anova1ylc,fit.f1,fit.f1ylc,"YR:LC:CLT")
#null model for RP
fit.f1r<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) , data=tempa)
#level of significance
#call function anova_lrt
anova1r <- anova_lrt(anova1ylr,fit.f1,fit.f1r,"RPid")
#Merge anova and level of significance
anova1 <- bind_rows(anova1y, anova1l)%>% bind_rows(anova1c)%>%
bind_rows(anova1yl)%>% bind_rows(anova1yc)%>%
bind_rows(anova1r)%>% bind_rows(anova1lc)%>%bind_rows(anova1ylc)
anova1 <- as.data.frame(anova1)
#Merge final output
anova_randall <- variance1%>% left_join(anova1 , by ="sov")
anova_randall$Pr_Chisq[anova_randall$stddev == 0] <- NA
9
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Print final output
print(anova_randall)
sov Variance
stddev
YR:LC:CLT 49.72994 7.051946
LC:CLT
0.00000 0.000000
RPid 73.91368 8.597306
YR:CLT
0.00000 0.000000
YR:LC 57.81311 7.603494
CLT 111.69687 10.568674
LC 699.56950 26.449376
YR
0.00000 0.000000
Residual 327.52638 18.097690
Pr_Chisq
8.894184e-03
NA
4.145811e-07
NA
7.872463e-02
1.386709e-03
9.083568e-03
NA
NA
Where sov = source of variance, stddev = standard deviation, Pr_Chisq = Chi-Square probability
In this example, the estimate of variance of random effects location x genotype (LC:CLT), year x
genotype (YR:LC) and year (YR) is zero. It represent overfitted model, meaning model is more complex
than the data can support. Random effect variance estimated as zero is common with those random effects
that have too few or small number of levels. The alternate option is to use Markov Chain Monte Carlo
(MCMC) simulation using MCMCglmm package to get probability of random effects.
Fitness of ANOVA model for case 1 can be plotted using command print(anovacase1), where x-axis
is model predicted value and y-axis is residual value (Figure 2). The uniform distribution of fitted
residuals on both side of the reference line (value = 0) confirms the goodness of fit.
The best linear unbiased predictor (BLUP) of random effects can be extracted using ranef() function of
lme4 package. BLUPs are estimates of random effects. They allow us to account environmental factors in
our model and missing data; and can be used for making selection. BLUP tend to ‘shrunk’ towards the
population mean relative to their fixed effects estimates. RGxE computes BLUP of individual genotypes
and generate user friendly output in “data.frame” class using dplyr and tidyr packages (see below
code).
#Compute BLUP for CLT
#BLUP - Best linear unbiased predictor
randeffect1 <- ranef(fit.f1)
#BLUP for clt
BLUP_CLT <- as.data.frame(randeffect1$CLT)
#convert rownames into column
BLUP_CLT$genotype <- rownames(BLUP_CLT)
#drop rownames
rownames(BLUP_CLT) <- NULL
#rename variable name
BLUP_CLT <- BLUP_CLT %>% select(genotype,Blup = starts_with("(Intercept)"))
#return estimate of fixed effect from full model summary to compute BLUP
fixestimate1 <- as.data.frame(summary1$coefficients)
10
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#compute BLUP value
BLUP_CLT1 <- BLUP_CLT %>%
mutate(Blup = Blup + fixestimate1$Estimate)
#final output for BLUP for cultivars
BLUP_CLT1 <- as.data.frame(BLUP_CLT1)
#Print final output
print(BLUP_CLT1)
genotype
CalhounGray
CrimsonSweet
EarlyCanada
FiestaF1
GeorgiaRattlesnake
Legacy
Mickylee
Quetzali
StarbriteF1
SugarBaby
Blup
76.255
62.535
59.840
77.283
70.710
68.417
61.300
58.744
83.422
51.169
Where BLUP = best linear unbiased predictor
Similarly, remaining four cases of ANOVA model can be computed using below code. The user defined
function anova_lrt(), which stored in ANOVA model Case I code, is used for likelihood ratio test for
computing significance of random effects.
########################################################################
##
ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random
##
########################################################################
#full model
fit.f2<-lmer(Trait~ YR*LC*CLT + (1|RPid), data=tempa)
#model summary
summary2 <- summary(fit.f2)
##variance of random factors
variance2 <- as.data.frame(summary2$varcor)
#drop rownames
rownames(variance2) <- NULL
variance2 <- variance2 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3
test of hypothesis
anova.f2t3 <- as.data.frame(anova(fit.f2, type="III"))
#convert rownames into column
anova.f2t3$name <- rownames(anova.f2t3)
#drop rownames
11
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
rownames(anova.f2t3) <- NULL
#Type 2
test of hypothesis
anova.f2t2 <- anova(fit.f2, type="marginal", test="F")
#model fitness
anovacase2 <- plot(fit.f2,
main="Model fitness Case 2: CLT, YR and LC - Fixed; RP - Random",
xlab="Predicated Value", ylab="Residual")
#level of significance: "KR" is implemented corresponding to
#the Kenward-Rogers approximation for degrees of freedom
anova_f2t3 <- mixed(Trait~ YR*LC*CLT + (1|RPid), data=tempa)
anova_f2t3 <- as.data.frame(anova_f2t3$anova_table)
#convert rownames into column
anova_f2t3$name <- rownames(anova_f2t3)
#drop rownames
rownames(anova_f2t3) <- NULL
anova_f2t3 <- anova_f2t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output for case 2: CLT, YR and LC - Fixed
anova_f2 <- anova.f2t3 %>% left_join(anova_f2t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
anova_f2 <- anova_f2 %>% bind_rows(variance2)
anova_f2a <- as.data.frame(anova_f2) # print for output
The output of ANOVA model Case 2 can be viewed using print(anova_f2a) command and is
presented below.
"#Note: P values are computed for F ratio using KR method "
"#Kenward-Rogers approximation for degrees of freedom"
sov
YR
LC
CLT
YR:LC
YR:CLT
LC:CLT
YR:LC:CLT
RPid
Residual
Df
1
4
9
4
9
36
36
NA
NA
Sum_Sq
Mean_Sq
F_value
Prob_F
stddev
120.4704
120.47042 0.367819 5.487578e-01
NA
73133.5245 18283.38113 55.822625 1.801102e-13
NA
44948.8523 4994.31692 15.248595 4.888724e-20
NA
5372.9402 1343.23505 4.101151 9.076711e-03
NA
4564.9326
507.21473 1.548623 1.309478e-01
NA
15018.7208
417.18669 1.273750 1.455183e-01
NA
23058.4845
640.51346 1.955609 1.516985e-03
NA
NA
73.91369
NA
NA 8.597307
NA
327.52636
NA
NA 18.097689
Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio,
Prob_F = probability for F test, stddev = standard deviation,
12
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
########################################################################
##
ANOVA CASE 3: CLT - Fixed; YR, LC and RP - Random
##
########################################################################
fit.f3<-lmer(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#model summary
summary3 <- summary(fit.f3)
##variance of random factors
variance3 <- as.data.frame(summary3$varcor)
#drop rownames
rownames(variance3) <- NULL
variance3 <- variance3 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f3t3 <- as.data.frame(anova(fit.f3, type="III"))
#convert rownames into column
anova.f3t3$name <- rownames(anova.f3t3)
#drop rownames
rownames(anova.f3t3) <- NULL
#Type 2
test of hypothesis
anova.f3t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase3 <- plot(fit.f3,
main="Model fitness Case 3: CLT - Fixed; YR, LC and RP - Random",
xlab="Predicated Value", ylab="Residual")
#p value for fixed effects
anova_f3t3 <- mixed(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
anova_f3t3 <- as.data.frame(anova_f3t3$anova_table)
#convert rownames into column
anova_f3t3$name <- rownames(anova_f3t3)
#drop rownames
rownames(anova_f3t3) <- NULL
anova_f3t3 <- anova_f3t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 3 for fixed effect - CLT
13
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
anova_f3 <- anova.f3t3 %>% left_join(anova_f3t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance for random effects
#null model for CLT
fit.f3c <- lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3c <- anova_lrt(anova3c,fit.f3,fit.f3c,"CLT")
#null model for LC
fit.f3l <- lmer(Trait~ 1 + CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3l <- anova_lrt(anova3l,fit.f3,fit.f3l,"LC")
#null model for YR
fit.f3y <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3y <- anova_lrt(anova3y,fit.f3,fit.f3y,"YR")
#null model for YR:LC
fit.f3yl <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3yl <- anova_lrt(anova3yl,fit.f3,fit.f3yl,"YR:LC")
#null model for YR:CLT
fit.f3yc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3yc <- anova_lrt(anova3yc,fit.f3,fit.f3yc,"YR:CLT")
#null model for LC:CLT
fit.f3lc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
14
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#call function anova_lrt
anova3lc <- anova_lrt(anova3lc,fit.f3,fit.f3lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f3ylc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3ylc <- anova_lrt(anova3ylc,fit.f3,fit.f3ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f3r <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT), data=tempa)
#level of significance
#call function anova_lrt
anova3r <- anova_lrt(anova3r,fit.f3,fit.f3r,"RPid")
#Merge anova and level of significance
anova3 <- bind_rows(anova3y, anova3l)%>% bind_rows(anova3yl)%>%
bind_rows(anova3yc)%>% bind_rows(anova3r)%>%
bind_rows(anova3lc)%>%bind_rows(anova3ylc)
anova3 <- as.data.frame(anova3)
#Merge final output
anova_cfix <- variance3%>% left_join(anova3 , by ="sov")
anova_cfix$Pr_Chisq[anova_cfix$stddev == 0] <- NA
The output of ANOVA model Case 3 can be viewed using print(anova_f3) and
print(anova_cfix)commands; and is presented below.
"#Note: P values are computed for F ratio using KR method "
"#Kenward-Rogers approximation for degrees of freedom"
sov Df
Sum_Sq Mean_Sq F_value
Prob_F
CLT 9 27964.75 3107.195 9.486854 0.006364982
Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio,
Prob_F = probability for F test
"#For random effect P values are generated using LRT-Likelihood Ratio Test"
"#via model comparison and anova"
sov
Mean_Sq
stddev
YR:LC:CLT 49.72992 7.051945
LC:CLT
0.00000 0.000000
RPid 73.91370 8.597308
YR:CLT
0.00000 0.000000
YR:LC 57.81306 7.603490
LC 699.57346 26.449451
YR
0.00000 0.000000
Residual 327.52637 18.097690
Pr_Chisq
2.724821e-02
NA
4.145811e-07
NA
6.776546e-02
9.446027e-03
NA
NA
15
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq =
Chi-Square probability
########################################################################
##
ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random
##
########################################################################
fit.f4 <- lmer(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
#model summary
summary4 <- summary(fit.f4)
##variance of random factors
variance4 <- as.data.frame(summary4$varcor)
# drop rownames
rownames(variance4) <- NULL
variance4 <- variance4 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f4t3 <- as.data.frame(anova(fit.f4, type="III"))
#convert rownames into column
anova.f4t3$name <- rownames(anova.f4t3)
#drop rownames
rownames(anova.f4t3) <- NULL
#Type 2 test of hypothesis
anova.f4t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase4 <- plot(fit.f4,
main="Model fitness Case 4: LC - Fixed; YR, CLT and RP - Random",
xlab="Predicated Value", ylab="Residual")
#level of significance for fixed effects
anova_f4t3 <- mixed(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
anova_f4t3 <- as.data.frame(anova_f4t3$anova_table)
#convert rownames into column
anova_f4t3$name <- rownames(anova_f4t3)
#drop rownames
rownames(anova_f4t3) <- NULL
16
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
anova_f4t3 <- anova_f4t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 4 for fixed effect - LC
anova_f4 <- anova.f4t3 %>% left_join(anova_f4t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance of random effect
#null model for CLT
fit.f4c <- lmer(Trait~ 1 + LC + (1|YR) + (1|YR:LC) + (1|YR:LC) +
(1|YR:CLT)+
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4c <- anova_lrt(anova4c,fit.f4,fit.f4c,"CLT")
#null model for LC
fit.f4l <- lmer(Trait~ 1 + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4l <- anova_lrt(anova4l,fit.f4,fit.f4l,"LC")
#null model for YR
fit.f4y <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4y <- anova_lrt(anova4y,fit.f4,fit.f4y,"YR")
#null model for YR:LC
fit.f4yl <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4yl <- anova_lrt(anova4yl,fit.f4,fit.f4yl,"YR:LC")
#null model for YR:CLT
fit.f4yc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4yc <- anova_lrt(anova4yc,fit.f4,fit.f4yc,"YR:CLT")
#null model for LC:CLT
17
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
fit.f4lc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4lc <- anova_lrt(anova4lc,fit.f4,fit.f4lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f4ylc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4ylc <- anova_lrt(anova4ylc,fit.f4,fit.f4ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f4r <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT), data=tempa)
#level of significance
#call function anova_lrt
anova4r <- anova_lrt(anova4r,fit.f4,fit.f4r,"RPid")
#Merge anova and level of significance
anova4 <- bind_rows(anova4c, anova4y)%>% bind_rows(anova4yl)%>%
bind_rows(anova4yc)%>% bind_rows(anova4r)%>%
bind_rows(anova4lc)%>%bind_rows(anova4ylc)
anova4 <- as.data.frame(anova4)
#Merge final output
anova_lfix <- variance4%>% left_join(anova4 , by ="sov")
anova_lfix$Pr_Chisq[anova_lfix$stddev == 0] <- NA
The output of ANOVA model Case 4 can be viewed using print(anova_f4) and
print(anova_lfix)commands; and is presented below.
"#Note: P values are computed for F ratio using KR method "
"#Kenward-Rogers approximation for degrees of freedom"
sov Df
Sum_Sq Mean_Sq F_value
Prob_F
LC 4 21801.73 5450.432 16.6412 0.009420459
Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio,
Prob_F = probability for F test
"#For random effect P values are generated using LRT-Likelihood Ratio Test"
"#via model comparison and anova"
sov
YR:LC:CLT
LC:CLT
RPid
YR:CLT
Mean_Sq
49.72997
0.00000
73.91371
0.00000
stddev
Pr_Chisq
7.051948 8.212358e-03
0.000000
NA
8.597308 4.145811e-07
0.000000
NA
18
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
YR:LC 57.81300 7.603486 3.980432e-01
CLT 111.69670 10.568666
NA
YR
0.00000 0.000000
NA
Residual 327.52635 18.097689
NA
Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq =
Chi-Square probability
########################################################################
##
ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random
##
########################################################################
fit.f5 <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#model summary
summary5 <- summary(fit.f5)
##variance of random factors
variance5 <- as.data.frame(summary5$varcor)
#drop rownames
rownames(variance5) <- NULL
variance5 <- variance5 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f5t3 <- as.data.frame(anova(fit.f5, type="III"))
#convert rownames into column
anova.f5t3$name <- rownames(anova.f5t3)
#drop rownames
rownames(anova.f5t3) <- NULL
#Type 2 test of hypothesis
anova.f5t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase5 <- plot(fit.f5,
main="Model fitness Case 5: CLT and LC - Fixed; YR and RP
- Random",
xlab="Predicated Value", ylab="Residual")
#level of significance for fixed effects
anova_f5t3 <- mixed(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) +
(1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
anova_f5t3 <- as.data.frame(anova_f5t3$anova_table)
#convert rownames into column
anova_f5t3$name <- rownames(anova_f5t3)
19
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#drop rownames
rownames(anova_f5t3) <- NULL
anova_f5t3 <- anova_f5t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 5 for fixed effect - CLT and LC
anova_f5 <- anova.f5t3 %>% left_join(anova_f5t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance for random effects
#null model for CLT
fit.f5c <- lmer(Trait~ 1 + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5c <- anova_lrt(anova5c,fit.f5,fit.f5c,"CLT")
#null model for LC
fit.f5l <- lmer(Trait~ 1 + CLT + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5l <- anova_lrt(anova5l,fit.f5,fit.f5l,"LC")
#null model for YR
fit.f5y <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5y <- anova_lrt(anova5y,fit.f5,fit.f5y,"YR")
#null model for YR:LC
fit.f5yl <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5yl <- anova_lrt(anova5yl,fit.f5,fit.f5yl,"YR:LC")
#null model for YR:CLT
fit.f5yc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5yc <- anova_lrt(anova5yc,fit.f5,fit.f5yc,"YR:CLT")
20
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#null model for LC:CLT
fit.f5lc <- lmer(Trait~ 1 + CLT + LC + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5lc <- anova_lrt(anova5lc,fit.f5,fit.f5lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f5ylc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5ylc <- anova_lrt(anova5ylc,fit.f5,fit.f5ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f5r <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) , data=tempa)
#level of significance
#call function anova_lrt
anova5r <- anova_lrt(anova5r,fit.f5,fit.f5r,"RPid")
#Merge anova and level of significance
anova5 <- bind_rows(anova5y, anova5yl)%>%
bind_rows(anova5yc)%>% bind_rows(anova5r)%>%
bind_rows(anova5ylc)
anova5 <- as.data.frame(anova5)
#Merge final output
anova_clfix <- variance5%>% left_join(anova5 , by ="sov")
anova_clfix$Pr_Chisq[anova_clfix$stddev == 0] <- NA
The output of ANOVA model Case 5 can be viewed using print(anova_f5) and
print(anova_clfix)commands; and is presented below.
"#Note: P values are computed for F ratio using KR method "
"#Kenward-Rogers approximation for degrees of freedom"
sov Df
Sum_Sq
Mean_Sq
F_value
Prob_F
CLT 9 23982.801 2664.7557 8.136004 0.002267970
LC 4 21801.728 5450.4320 16.641201 0.009275436
CLT:LC 36 8013.352 222.5931 0.679619 0.874336050
Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio,
Prob_F = probability for F test
"#For random effect P values are generated using LRT-Likelihood Ratio Test"
"#via model comparison and anova"
sov
YR:LC:CLT
RPid
Mean_Sq
71.58186
73.91368
stddev
Pr_Chisq
8.460607 1.000000e+00
8.597307 1.305383e-07
21
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
YR:CLT
0.00000 0.000000
NA
YR:LC 55.62783 7.458407 2.975676e-01
YR
0.00000 0.000000
NA
Residual 327.52636 18.097689
NA
Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq =
Chi-Square probability
Fitness of ANOVA model case 2, case 3, case 4 and case 5 can be plotted using command
print(anovacase2), print(anovacase3), print(anovacase4) and print(anovacase1),
respectively.
4.2. Descriptive statistics
A new additional variable environment (ENV) is created and quality check of missing value is performed
in dataset ‘tempa2’using dplyr package. Environment is location-year combination, which is highlighted
is below code. RGxE validates structure of dataset ‘tempa2’ as it serves input data for descriptive and
other statistics (Figure 1).
#Compute environment - Location by year combination
tempa2 <- tempa %>%
mutate (ENV = paste(LC,YR, sep='-')) %>%
#remove missing records
na.omit()
#validate data
tempa2$YR <- as.factor(tempa2$YR)
tempa2$RP <- as.factor(tempa2$RP)
tempa2$LC <- as.factor(tempa2$LC)
tempa2$CLT <- as.factor(tempa2$CLT)
tempa2$ENV <- as.factor(tempa2$ENV)
tempa2$Trait <- as.numeric(tempa2$Trait)
Top 6 rows of input data ‘tempa2’ can be viewed using head() command.
head(tempa2)
1
2
3
4
5
6
YR
2009
2009
2009
2009
2009
2009
LC RP
CLT Trait
KN 1
EarlyCanada 56.236
KN 1
CalhounGray 74.167
KN 1
StarbriteF1 32.601
KN 1
CrimsonSweet 74.167
KN 1 GeorgiaRattlesnake 64.794
KN 1
FiestaF1 70.907
RPid
2009.KN.1
2009.KN.1
2009.KN.1
2009.KN.1
2009.KN.1
2009.KN.1
ENV
KN-2009
KN-2009
KN-2009
KN-2009
KN-2009
KN-2009
Descriptive statistics including count, minimum (min), maximum (max), mean, sum, median, variance
(var), standard deviation (sd), and coefficient of variation (cv) are computed using dplyr package. Using
tidyr package results of descriptive statistics are transposed in user friendly layout so that researchers can
interpret them easily (see supplemental material for descriptive statistics outputs). RGxE generates
following descriptive statistics.
 Trait mean over genotype and environment
 Trait mean and sd over genotype and year
 Trait mean, cv, sd and sum over genotype and location
 Trait mean, sd, and sum over genotype, location and year
 Trait mean over genotype, location and replication
22
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.






Trait mean over location and year
Trait mean over location and replication
Trait count, min, max, mean, sum, median, var, and sd over location
Trait count, min, max, mean, sum, median, var, and sd over year
Trait count, min, max, mean, sum, median, var, and sd over genotype
Trait count, min, max, mean, sum, median, var, and sd over environment
4.3 Univariate stability statistics
Among univariate stability statistics, RGxE generates output of regression slope (bi), deviation from
regression (S2d), Shukla’s sigma (σi2), ssquares, Wricke’s ecovalence (Wi) and Kang’s statistics (YSi).
RGxE regresses the response of genotype against the environmental index to compute regression slope
(bi), deviation from regression (S2d), T-test on regression slope (H0: bi = 1) and F-test on deviation from
regression (H0: S2d = 0). The level of significance of T-test and F-test is computed using lm() function of
R (R Core Team, 2015), and dplyr and tidyr packages. The level of significance at 0.05, 0.01 and 0.001
is represented by ‘*’, ‘**’, ‘***’; respectively. Environmental index is average performance of all
genotypes in each environment. Stability.par() function of agricolae package is used to compute
Shukla’s sigma (σi2), ssquares, Wricke’s ecovalence (Wi) and Kang’s statistics (YSi). For selection of
stable genotype, user can independently compute univariate stability statistics while feeding required
input data (tempa2) in below code. Top 6 rows of input data ‘tempa2’ is presented in section 4.2
descriptive statistics.
########################################################################
##
Compute univariate stability statistics - regression analysis
##
########################################################################
#Compute regression (slope) and deviation from regression
#compute environmental index
dsterm <- tempa2 %>%
group_by (ENV, RP, YR, LC) %>%
summarize (ENVTrait = mean(Trait,na.rm=FALSE))
dst02 <- tempa2 %>%
left_join(dsterm, by=c("ENV", "RP")) %>% #Left join on multiple columns
arrange (CLT) %>%
rename (YR= YR.x, LC = LC.x )
#fit model
fit_model <- dst02 %>%
group_by(CLT) %>% #group regression analysis by cultivar
do (model=lm(Trait~ENVTrait + ENV + RP, data=.))
#parameter estimates
paramlm <- as.data.frame(fit_model %>% tidy(model))
glancelm <- as.data.frame(fit_model %>% glance(model))
augmentlm <- as.data.frame(fit_model %>% augment(model))
outmsed <- lapply(fit_model$model, anova) #anova output
outmsed2 <- as.data.frame(do.call(rbind, outmsed)) #convert list into
data.frame
#convert rownames into column
outmsed2$SOV <- rownames(outmsed2)
# drop rownames
23
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
rownames(outmsed2) <- NULL
#remove numeric values from string of rownames using function gsub
outmsed2 <- outmsed2 %>% mutate(SOV = gsub("\\d+","",SOV))
#extract unique cultivar name and merge to outmsed2 dataset
genotypes <- dst02 %>% select(CLT) %>% distinct (CLT) %>% arrange(CLT)
#Stack 4 times to match number of rows with outmsed2 dataset
genotypes1 <- genotypes %>% bind_rows(genotypes) %>%
bind_rows(genotypes) %>% bind_rows(genotypes) %>% arrange(CLT)
#attach list of cultivars to outmsed2
outmsed3 <- as.data.frame(outmsed2 %>% bind_cols(genotypes1))
#transpose outmsed3
outmsed4 <- outmsed3 %>%
select (CLT, SOV, MS = starts_with("Mean")) %>%
filter (SOV != "RP")
#rename variables
#transpose MS values
MSDS <- outmsed4 %>%
spread (SOV, MS) %>% #transpose using library tidyr
arrange (CLT)
#Transpose degress of freedoms for F-test
FDS3 <- outmsed3 %>%
filter (SOV != "RP")%>%
select (CLT, SOV, Df) %>%
spread (SOV, Df) %>% #transpose using library tidyr
rename (DF_ENVTrait = ENVTrait, DF_Residuals = Residuals,
DF_ENV = ENV)
#Subset parameters - paramlm dataset
REGCOEFGS <- paramlm %>%
filter (term == "ENVTrait") %>%
select (-statistic, -p.value)
#Test and level of significance of regression and deviation from regression
#Merge MSDS, FDS3, REGCOEFGS
slope <- MSDS %>% inner_join (REGCOEFGS, by = "CLT") %>%
inner_join (FDS3, by = "CLT") %>%
rename (MSE = Residuals, LREGMS=ENVTrait, DEVLMS = ENV,
BI= estimate, STDERR = std.error)
#test significance levels
slope1 <- slope %>%
mutate (T_H01 = (BI-1)/STDERR , #Null Hypothesis for slope = 1
PT_H01 = 2*pt(-abs(T_H01), DF_Residuals),
F_DEVREG=DEVLMS/MSE, #NULL HYPOTHESIS: PREDICTED-ACTUAL = 0
PF_HO0= 1-pf(F_DEVREG, DF_ENV, DF_Residuals))
#add legend for level of significance
24
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
slope2 <- slope1 %>%
mutate (SIG_SLOPE = ifelse(PT_H01 <= 0.001, "***",
ifelse(PT_H01 <= 0.01, "**",
ifelse(PT_H01 <= 0.05, "*","")))) %>%
mutate (SIG_DEVREG = ifelse(PF_HO0 <= 0.001, "***",
ifelse(PF_HO0 <= 0.01, "**",
ifelse(PF_HO0 <= 0.05, "*",""))))
#final regression output
options(digits=5)
univariate2 <- slope2 %>%
mutate (SLOPE = paste(BI,SIG_SLOPE, sep=""),
DEVREG = paste(DEVLMS,SIG_DEVREG, sep="") ) %>%
select (CLT,SLOPE, DEVREG )
########################################################################
#
Compute univariate stability statistics - shukla, ecovalence, YS
#
########################################################################
#Compute Shukla, Wricke Ecovalense, Kangs YS
repno <- tempa2 %>%
summarise (total_rep = n_distinct(RP)) #count total number of rep
dstgl <- tempa2 %>%
group_by (CLT, LC) %>%
#Summarize genotype performance across locations
summarize (Trait = mean(Trait,na.rm=FALSE))
dstgl1 <- dstgl %>%
spread (LC, Trait) #transpose values
#convert into data frame so that row containing structure information is
deleted
dstgl2 <- as.data.frame(dstgl1)
#create rownames
rownames(dstgl2) <- dstgl2[ ,1]
shukla <- dstgl2[,-1]
#compute MS error term
tempa3 <- glm(Trait ~ LC + YR + LC:YR + RP %in% (LC:YR) + CLT + CLT:LC +
CLT:YR + CLT:LC:YR, family = gaussian , data= tempa2 )
#model summary
summary1 <- summary.glm(tempa3)
#Error SS
error_ss <- as.data.frame(summary1$deviance)
error_ss1 <- error_ss %>%
#rename variable
select (Deviance = starts_with("summary"), everything())
#Error DF
error_df <- as.data.frame(summary1$df.residual)
error_df1 <- error_df %>%
25
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#rename variable
select (Df = starts_with("summary"), everything())
#MS of error
mse1 <- as.data.frame(error_ss1/error_df1)
mse <- mse1 %>%
#rename variable
rename (MS = Deviance)
# MSError is used populated from ANOVA
univariate1a <- stability.par(shukla, rep= repno$total_rep , MSerror=mse$MS,
alpha=0.1, main="Genotype")
#pool results into individual columns
univariate1b <- univariate1a$statistics
#create column genotype from rownames
univariate1b$genotype <- rownames(univariate1b)
rownames(univariate1b) = NULL #remove rownames
names(univariate1b)[3] <- "significane_sigma"
# rename duplicate name dot
names(univariate1b)[5] <- "significane_s2"
# rename duplicate name dot
names(univariate1b)[2] <- "sigma"
# rename
names(univariate1b)[4] <- "ssquare"
# rename
univariate1c <- univariate1a$stability
#create column genotype from rownames
univariate1c$genotype <- rownames(univariate1c)
rownames(univariate1c) = NULL #remove rownames
names(univariate1c)[8] <- "legend"
# rename variable ... to legend
#Merge
univariate1d <- univariate1b %>%
inner_join (univariate1c , by = "genotype") %>%
# deselect all columns between Yield and Stab.rating
select (-Yield: -Stab.rating) %>%
# arrange the column order for final output
select (CLT=genotype, Mean, sigma,
significane_sigma, ssquare,
significane_s2, Ecovalence,YSi, legend)
#Final stability statistics
#Merge Univariate2 and Univariate1d
univariate <- univariate2 %>%
inner_join(univariate1d, by = "CLT") %>%
mutate (SIGMA=paste(sigma,significane_sigma, ""),
SIGMA_SQUARE=paste(ssquare,significane_s2, ""),
YS_Kang =paste(YSi,legend, "")) %>%
select (Genotype = CLT, Mean, SLOPE, DEVREG, SIGMA,
SIGMA_SQUARE,Ecovalence, YS_Kang)
print(univariate)
Genotype Mean SLOPE
CalhounGray 77.349 1.32
CrimsonSweet 62.013 1.36
DEVREG
SIGMA
SIGMA_SQUARE Ecovalence YS_Kang
124.67
61.35 ns
15.76 ns
279.75
10 +
1450.04*** 439.12 ns 567.99 ns
1488.64
4
26
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
EarlyCanada
FiestaF1
GeorgiaRattlesnake
Legacy
Mickylee
Quetzali
StarbriteF1
SugarBaby
59.000
78.498
71.151
68.588
60.632
57.775
85.360
49.307
0.32*
1.58
0.92
1.13
0.59
0.97
1.29
0.50*
686.25*
657.87
220.06
428.07
705.48*
96.53
221.14
332.18*
253.23
300.29
52.21
287.39
188.11
82.81
157.24
264.19
ns
ns
ns
ns
ns
ns
ns
ns
285.37
385.99
44.86
262.49
195.13
86.10
78.37
308.43
ns
ns
ns
ns
ns
ns
ns
ns
893.77
1044.35
250.52
1003.10
685.37
348.42
586.61
928.84
2
11
8
7
3
1
12
-1
+
+
+
+
Where SLOPE = regression slope, DEVREG = deviation from regression, SIGMA = Shukla’s sigma,
SIGMA_SQUARE = ssquares, Ecovalence = Wricke’s ecovalence, YS_Kang = Kang’s statistics, ns =
non-significant, + = indicate stable genotype according to Kang’s stability statistics
5. Location value statistics
Input data ‘tempa2’ is used to calculate genotype F ratio across location and environment; correlation of
location with average location performance; and location cluster analysis.
5.1 Genotype F ratio across location and environment; and correlation among location and
average location
RGxE computes analysis of variance by location using lm() function to get the genotype F values across
location. When the mean of all genotypes are equal then the F ratio will be close to 1. The high genotype
F value indicates high discriminating ability for that location. Similarly, Pearson’s test of correlation of
locations with average location is computed using cor.test() function of R built in stats package (R
core team, 2015). Function cor.test() provide level of significance of correlation and the level of
significance at 0.05, 0.01 and 0.001 is represented by ‘*’, ‘**’, ‘***’; respectively. RGxE generates user
friendly output for genotype F ratio across location and environment; and correlation of location with
average location performance using dplyr and tidyr packages as shown in below code.
######################################################################
##
Compute location statistics - genotype F ratio across
##
##
location and environment; location correlation
##
######################################################################
#Location values
#F-value of genotype across location
#fit model
fit_modellc <- tempa2 %>%
group_by(LC) %>% #group regression analysis by location
do (model1=lm(Trait~CLT + YR + CLT:YR + RP%in%YR , data=.))
#parameter estimates
paramlmlc <- as.data.frame(fit_modellc %>% tidy(model1))
glancelmlc <- as.data.frame(fit_modellc %>% glance(model1))
augmentlmlc <- as.data.frame(fit_modellc %>% augment(model1))
outmsedlc <- lapply(fit_modellc$model1, anova) #anova output
#convert list into data.frame
outmsedlc2 <- as.data.frame(do.call(rbind, outmsedlc))
#convert rownames into column
outmsedlc2$SOV <- rownames(outmsedlc2)
#drop rownames
rownames(outmsedlc2) <- NULL
27
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#remove numeric values from string of rownames using function gsub
outmsedlc2 <- outmsedlc2 %>% mutate(SOV = gsub("\\d+","",SOV))
#extract unique location name and merge to outmsedlc2 dataset
location <- dst02 %>% select(LC) %>% distinct (LC) %>% arrange(LC)
#Extract gentype F value for each location
locvalue <- outmsedlc2 %>%
filter(SOV == "CLT") %>%
select (FRatioGenotype = starts_with("F value")) %>%
bind_cols (location) %>% select (LC, FRatioGenotype)
locvalue <- as.data.frame (locvalue)
#Correlation between location and average location for each genotype
#compute genotype mean at each location
glcmean1 <- tempa2 %>%
group_by (CLT, LC) %>%
summarize (glcmean = mean(Trait,na.rm=FALSE)) %>%
as.data.frame(select (CLT, LC, glcmean))
#compute genotype mean across all location -average location
gmean1 <- tempa2 %>%
group_by (CLT ) %>%
summarize (gmean = mean(Trait,na.rm=FALSE)) %>%
as.data.frame(select (CLT, gmean))
#merge location mean with average location for each genotype
lgmean <- glcmean1 %>%
left_join(gmean1, by="CLT") %>%
arrange(LC) %>% select (-CLT)
#compute correlation with level of significance
lcgcorr <- lgmean %>%
group_by(LC) %>%
do(tidy(cor.test(.$glcmean, .$gmean, method = c("pearson"))))
lcgcorr1 <- lcgcorr %>%
select (LC, Corr_Value = starts_with ("estimate"),
Pvalue = starts_with("p.value"))
#post process correlation value
lcgcorr2 <- lcgcorr1 %>%
mutate (SIG_CORR = ifelse(Pvalue <= 0.001, "***",
ifelse(Pvalue <= 0.01, "**",
ifelse(Pvalue <= 0.05, "*",""))))
#concatenate p value symbol with correlation value
lcgcorr3 <- lcgcorr2 %>%
mutate (LocCorrelation=paste(Corr_Value,SIG_CORR, sep="")) %>%
select (LC, LocCorrelation)
lcgcorr3 <- as.data.frame(lcgcorr3)
28
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Final location value table for output
#compute location mean
Locmean <- tempa2 %>%
group_by (LC ) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))%>%
select (LC, Mean = starts_with("Trait"))
Locmean <- as.data.frame(Locmean)
#merge all location value outputs for print
LocationValue <- Locmean %>%
inner_join (locvalue, by = "LC") %>%
inner_join(lcgcorr3, by = "LC") %>%
rename (Location = LC)
print(LocationValue)
Location
CI
FL
KN
SC
TX
Mean
61.040
100.153
63.786
82.685
27.173
FRatioGenotype
4.1804
2.2579
4.6964
6.8813
2.9966
LocCorrelation
0.95***
0.86**
0.90***
0.88***
0.89***
Where FRatioGenotype = genotype F ratio, LocCorrelation = correlation of location with average
location
########################################################################
###
F-value of genotype across environmen
##
########################################################################
#fit model
fit_modelen <- tempa2 %>%
group_by(ENV) %>% #group regression analysis by location
do (model2=lm(Trait~CLT + RP , data=.))
#parameter estimates
paramlmen <- as.data.frame(fit_modelen %>% tidy(model2))
glancelmen <- as.data.frame(fit_modelen %>% glance(model2))
augmentlmen <- as.data.frame(fit_modelen %>% augment(model2))
outmseden <- lapply(fit_modelen$model2, anova) #anova output
#convert list into data.frame
outmseden2 <- as.data.frame(do.call(rbind, outmseden))
#convert rownames into column
outmseden2$SOV <- rownames(outmseden2)
# drop rownames
rownames(outmseden2) <- NULL
#remove numeric values from string of rownames using function gsub
outmseden2 <- outmseden2 %>% mutate(SOV = gsub("\\d+","",SOV))
29
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#extract unique environment name and merge to outmseden2 dataset
environment <- dst02 %>% select(ENV) %>% distinct (ENV) %>% arrange(ENV)
#Extract gentype F value for each location
locvalue2 <- outmseden2 %>%
filter(SOV == "CLT") %>%
select (FRatioGenotype = starts_with("F value")) %>%
bind_cols (environment) %>% select (ENV, FRatioGenotype)
locvalue2 <- as.data.frame (locvalue2)
print(locvalue2)
ENV
CI-2009
CI-2010
FL-2009
FL-2010
KN-2009
KN-2010
SC-2009
SC-2010
TX-2009
TX-2010
FRatioGenotype
2.4015
3.3665
2.6914
1.7231
1.9999
6.3971
2.8729
8.0454
2.5003
1.4619
Where ENV = environment, FRatioGenotype = genotype F ratio
5.2 Location cluster analysis
Hierarchical cluster analysis for location relatedness is computed using hclust() function of R built in
stats package (R core team, 2015). The arguments passed to hclust() function include Euclidean
distance computed from dist() function and Ward’s method. Function dist() of R built in stats
package (R core team, 2015) computes and return the distance matrix between rows of a data matrix. Tree
or dendogram of cluster analysis is generated using plot() function, of R built in graphics package (R
core team, 2015), as shown in below code.
########################################################################
###
Compute cluster analysis of location
###
########################################################################
#location cluster analysis
#Euclidean distance
#Ward Hierarchical Clustering
#trait mean over location
mean_l <- tempa2 %>%
group_by (LC ) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_l1 <- as.data.frame(mean_l)
clusterdata <- mean_l1 %>% select (Trait)
clusterdata <- na.omit(clusterdata)
distance <- dist(clusterdata, method = "euclidean") # distance matrix
hcluster <- hclust(d=distance, method="ward.D")
locationcluster <- plot(hcluster, labels=mean_l1$LC) # display dendogram
6. Final output
30
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
After all computation is over, RGxE clears the Console Window of R studio then saves the output using
sink()function along with the system date and time using Sys.time()function of R built in base
package (R core team, 2015). RGxE auto saves the output (output name = ‘RGxEOutput’) in folder
which is defined in setwd() command in the beginning of the program. Program gives user option to
save results in .csv (RGxEOutput.csv) or .txt (RGxEOutput.txt) format. Output, in .txt format,
from sample input data generated by RGxE is presented in supplemental material. Additionally, RGxE
prints the output in Console Window of R studio.
7. Result interpretation
Interpretation of univariate stability statistics is presented in supplemental material. Additionally, studies
published on genotype stability (Dia et al., 2016b) and location value (Dia et al., 2016c, in press) used
SASGxE program (Dia et al., 2016a, in press), which is equivalent to RGxE program. Thus, these studies
can serve as source of RGxE output interpretation. Also, interpretation of RGxE and SASGxE program is
available at available at http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html .
RGxE program
#############################################################################
#Program Name: RGxE
#Version: 1.1
#Purpose: It computes various stability analysis statistics for Genotype x
#
Environment Interaction data
#
Overall ANOVA with different combination of random and fixed
effects;
#
Regression coefficients - slope and Deviation from regression;
#
Shukla Sigma; Shukla SSquare; Wricke's Ecovalence; Kang's YS;
#
Location statistics: Genotype F ratio across location and
environment;
#
correlation of individual location with average location;
#
Descriptive statistics on genotype, location, environment, year and
#
replication;
#
Cluster analysis of location
#libraries and packages used: dplyr, tidyr, sqldf, broom, agricolae
#
(stability.par), cluster, lme4, afex
#Developers: Mahendra Dia,Todd C. Wehner and Consuelo Arellano
#
NC State University, NC, USA
#Last Modified: March 11, 2016
#############################################################################
#Install packages
install.packages("tidyr")
install.packages("dplyr")
install.packages ("sqldf")
install.packages("lme4")
install.packages("afex")
install.packages("broom")
install.packages("agricolae")
#call libraries
library (tidyr)
library (dplyr)
library (sqldf)
library(lme4)
library(afex)
library(broom)
library(agricolae)
31
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
library(cluster)
library(grDevices)
#list current objects
ls()
#to remove list of all objects from workspace
rm(list=ls())
#to identify current directory
getwd()
#set current directory
###user NEED to replace input data file path###
setwd("E:/PhD Research Work/PhD Articles
/BLUE_BLUP_Prediction_Interval_SAS_R")
#check current directory
getwd()
########################################################################
####################
For Windows user
###################
########################################################################
######################################
########## Option 1 ################
######################################
#import input data
###user NEED to replace input data file name###
tempa <-read.csv("RGxEInputData2_2016_02_15.csv", header = TRUE)
######################################
########## Option 2 ################
######################################
#opens pop up window to select file
#file.name <- file.choose()
#read csv file
#tempa <-read.csv(file.name)
########################################################################
####################
For Mac user
###################
########################################################################
#file.name <- file.choose()
#file.name
#copy path from console and paste here
#file.name <- "/Users/toddwehner/Desktop/RCodeCurrent/GxeR14Data4.csv"
#create object path for where the data will go at the end
# I append the file name with "_out" so I know its data coming out of R
#out.file <- "/Users/toddwehner/Desktop/RCodeCurrent/GxeR14Out4.csv"
#tempa <-read.csv(file.name)
########################################################################
32
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
######
Define trait that need to be analyzed
#######
########################################################################
#user need to define trait name for analysis
#example: MKMGHA = Marketable mega-gram per hectare
tempa <- tempa %>% rename(Trait = Trait) %>%
select(YR, LC, RP, CLT, Trait)
#view top 6 rows of input data
head(tempa)
#view bottom 6 rows of input data
tail(tempa)
#Get dimension of data
dim(tempa)
#Get data type
class(tempa)
#Get structure of data (character vs. numeric vs. matrix vs. vector vs.
factor)
#make sure: numeric value should be numeric
str(tempa)
#validate the variable types
tempa$YR <- as.factor(tempa$YR)
tempa$RP <- as.factor(tempa$RP)
tempa$LC <- as.factor(tempa$LC)
tempa$CLT <- as.factor(tempa$CLT)
tempa$Trait <- as.numeric(tempa$Trait)
########################################################################
##
ANOVA: Compute analysis of variance
##
########################################################################
#Generate unique id for replication for anova
tempa$RPid<-as.factor(paste(tempa$YR, tempa$LC, tempa$RP, sep="."))
########################################################################
###
ANOVA Case 1: CLT, YR, LC and RP - All Random
###
########################################################################
#full model
fit.f1<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#model summary
summary1 <- summary(fit.f1)
#variance of random factors
33
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
variance <- as.data.frame(summary1$varcor)
#drop rownames
rownames(variance) <- NULL
variance1 <- variance %>% select (-var1, -var2) %>%
rename(sov=grp, Variance=vcov, stddev=sdcor)
#Type 3 test of hypothesis
#Type III Wald chisquare tests
anova(fit.f1, type="III")
#Type 2
test of hypothesis
anova(fit.f1, type="marginal", test="F")
#model fitness
anovacase1 <- plot(fit.f1,
main="Model fitness Case 1: CLT, YR, LC and RP - All
Random",
xlab="Predicated Value", ylab="Residual")
#LRT - likelihood ratio test for computing significance of random effect
#create function for Likelihood ratio test
#a=outputdatasetname; example-anova1r
#b=full model name; example-fit.f1
#c=reduced model name; example-fit.f1r
#d=effect name; example- "RPid", NOTE: call it in quotation
anova_lrt <- function (a,b,c,d){
#level of significance
a <-anova(b,c)
#convert anova into data frame
a <- data.frame(a)
#convert rownames into column
a$name <- rownames(a)
# drop rownames
rownames(a) <- NULL
a <- a %>% filter(name=="b") %>%
mutate(sov=d) %>% select(sov, Pr_Chisq = starts_with("Pr..Chisq."))
# return the result
return(a)
}
#null model for YR
fit.f1y<-lmer(Trait~ 1 + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1y <- anova_lrt(anova1y,fit.f1,fit.f1y,"YR")
#null model for LC
fit.f1l<-lmer(Trait~ 1 + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
34
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
anova1l <- anova_lrt(anova1l,fit.f1,fit.f1l,"LC")
#null model for CLT
fit.f1c<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1c <- anova_lrt(anova1c,fit.f1,fit.f1c,"CLT")
#null model for YR:LC
fit.f1yl<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1yl <- anova_lrt(anova1yl,fit.f1,fit.f1yl,"YR:LC")
#null model for YR:CLT
fit.f1yc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1yc <- anova_lrt(anova1yc,fit.f1,fit.f1yc,"YR:CLT")
#null model for LC:CLT
fit.f1lc<-lmer(Trait~ 1 + (1|YR)
(1|YR:CLT) +
(1|YR:LC:CLT) +
+ (1|LC) + (1|CLT) + (1|YR:LC) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1lc <- anova_lrt(anova1lc,fit.f1,fit.f1lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f1ylc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova1ylc <- anova_lrt(anova1ylc,fit.f1,fit.f1ylc,"YR:LC:CLT")
#null model for RP
fit.f1r<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) , data=tempa)
#level of significance
#call function anova_lrt
anova1r <- anova_lrt(anova1ylr,fit.f1,fit.f1r,"RPid")
35
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Merge anova and level of significance
anova1 <- bind_rows(anova1y, anova1l)%>% bind_rows(anova1c)%>%
bind_rows(anova1yl)%>% bind_rows(anova1yc)%>%
bind_rows(anova1r)%>% bind_rows(anova1lc)%>%bind_rows(anova1ylc)
anova1 <- as.data.frame(anova1)
#Merge final output
anova_randall <- variance1%>% left_join(anova1 , by ="sov")
anova_randall$Pr_Chisq[anova_randall$stddev == 0] <- NA
#Compute BLUP for CLT
#BLUP - Best linear unbiased predictor
randeffect1 <- ranef(fit.f1)
#BLUP for clt
BLUP_CLT <- as.data.frame(randeffect1$CLT)
#convert rownames into column
BLUP_CLT$genotype <- rownames(BLUP_CLT)
#drop rownames
rownames(BLUP_CLT) <- NULL
#rename variable name
BLUP_CLT <- BLUP_CLT %>% select(genotype,Blup = starts_with("(Intercept)"))
#return estimate of fixed effect from full model summary to compute BLUP
fixestimate1 <- as.data.frame(summary1$coefficients)
#compute BLUP value
BLUP_CLT1 <- BLUP_CLT %>%
mutate(Blup = Blup + fixestimate1$Estimate)
#final output for BLUP for cultivars
BLUP_CLT1 <- as.data.frame(BLUP_CLT1)
########################################################################
##
ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random
##
########################################################################
#full model
fit.f2<-lmer(Trait~ YR*LC*CLT + (1|RPid), data=tempa)
#model summary
summary2 <- summary(fit.f2)
##variance of random factors
variance2 <- as.data.frame(summary2$varcor)
#drop rownames
36
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
rownames(variance2) <- NULL
variance2 <- variance2 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3
test of hypothesis
anova.f2t3 <- as.data.frame(anova(fit.f2, type="III"))
#convert rownames into column
anova.f2t3$name <- rownames(anova.f2t3)
#drop rownames
rownames(anova.f2t3) <- NULL
#Type 2
test of hypothesis
anova.f2t2 <- anova(fit.f2, type="marginal", test="F")
#model fitness
anovacase2 <- plot(fit.f2,
main="Model fitness Case 2: CLT, YR and LC - Fixed; RP Random",
xlab="Predicated Value", ylab="Residual")
#level of significance: "KR" is implemented corresponding to
#the Kenward-Rogers approximation for degrees of freedom
anova_f2t3 <- mixed(Trait~ YR*LC*CLT + (1|RPid), data=tempa)
anova_f2t3 <- as.data.frame(anova_f2t3$anova_table)
#convert rownames into column
anova_f2t3$name <- rownames(anova_f2t3)
#drop rownames
rownames(anova_f2t3) <- NULL
anova_f2t3 <- anova_f2t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output for case 2: CLT, YR and LC - Fixed
anova_f2 <- anova.f2t3 %>% left_join(anova_f2t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
anova_f2 <- anova_f2 %>% bind_rows(variance2)
anova_f2a <- as.data.frame(anova_f2) # print for output
########################################################################
##
ANOVA CASE 3: CLT - Fixed; YR, LC and RP - Random
##
########################################################################
fit.f3<-lmer(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
37
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#model summary
summary3 <- summary(fit.f3)
##variance of random factors
variance3 <- as.data.frame(summary3$varcor)
#drop rownames
rownames(variance3) <- NULL
variance3 <- variance3 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f3t3 <- as.data.frame(anova(fit.f3, type="III"))
#convert rownames into column
anova.f3t3$name <- rownames(anova.f3t3)
#drop rownames
rownames(anova.f3t3) <- NULL
#Type 2
test of hypothesis
anova.f3t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase3 <- plot(fit.f3,
main="Model fitness Case 3: CLT - Fixed; YR, LC and RP Random",
xlab="Predicated Value", ylab="Residual")
#p value for fixed effects
anova_f3t3 <- mixed(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
anova_f3t3 <- as.data.frame(anova_f3t3$anova_table)
#convert rownames into column
anova_f3t3$name <- rownames(anova_f3t3)
#drop rownames
rownames(anova_f3t3) <- NULL
anova_f3t3 <- anova_f3t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 3 for fixed effect - CLT
anova_f3 <- anova.f3t3 %>% left_join(anova_f3t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance for random effects
38
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#null model for CLT
fit.f3c <- lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:LC) +
(1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3c <- anova_lrt(anova3c,fit.f3,fit.f3c,"CLT")
#null model for LC
fit.f3l <- lmer(Trait~ 1 + CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3l <- anova_lrt(anova3l,fit.f3,fit.f3l,"LC")
#null model for YR
fit.f3y <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3y <- anova_lrt(anova3y,fit.f3,fit.f3y,"YR")
#null model for YR:LC
fit.f3yl <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3yl <- anova_lrt(anova3yl,fit.f3,fit.f3yl,"YR:LC")
#null model for YR:CLT
fit.f3yc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3yc <- anova_lrt(anova3yc,fit.f3,fit.f3yc,"YR:CLT")
#null model for LC:CLT
fit.f3lc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3lc <- anova_lrt(anova3lc,fit.f3,fit.f3lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f3ylc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
39
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
(1|YR:CLT) +
(1|LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova3ylc <- anova_lrt(anova3ylc,fit.f3,fit.f3ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f3r <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT), data=tempa)
#level of significance
#call function anova_lrt
anova3r <- anova_lrt(anova3r,fit.f3,fit.f3r,"RPid")
#Merge anova and level of significance
anova3 <- bind_rows(anova3y, anova3l)%>% bind_rows(anova3yl)%>%
bind_rows(anova3yc)%>% bind_rows(anova3r)%>%
bind_rows(anova3lc)%>%bind_rows(anova3ylc)
anova3 <- as.data.frame(anova3)
#Merge final output
anova_cfix <- variance3%>% left_join(anova3 , by ="sov")
anova_cfix$Pr_Chisq[anova_cfix$stddev == 0] <- NA
########################################################################
##
ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random
##
########################################################################
fit.f4 <- lmer(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#model summary
summary4 <- summary(fit.f4)
##variance of random factors
variance4 <- as.data.frame(summary4$varcor)
# drop rownames
rownames(variance4) <- NULL
variance4 <- variance4 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f4t3 <- as.data.frame(anova(fit.f4, type="III"))
#convert rownames into column
anova.f4t3$name <- rownames(anova.f4t3)
#drop rownames
rownames(anova.f4t3) <- NULL
40
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Type 2 test of hypothesis
anova.f4t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase4 <- plot(fit.f4,
main="Model fitness Case 4: LC - Fixed; YR, CLT and RP Random",
xlab="Predicated Value", ylab="Residual")
#level of significance for fixed effects
anova_f4t3 <- mixed(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
anova_f4t3 <- as.data.frame(anova_f4t3$anova_table)
#convert rownames into column
anova_f4t3$name <- rownames(anova_f4t3)
#drop rownames
rownames(anova_f4t3) <- NULL
anova_f4t3 <- anova_f4t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 4 for fixed effect - LC
anova_f4 <- anova.f4t3 %>% left_join(anova_f4t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance of random effect
#null model for CLT
fit.f4c <- lmer(Trait~ 1 + LC + (1|YR) + (1|YR:LC) + (1|YR:LC) +
(1|YR:CLT)+
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4c <- anova_lrt(anova4c,fit.f4,fit.f4c,"CLT")
#null model for LC
fit.f4l <- lmer(Trait~ 1 + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4l <- anova_lrt(anova4l,fit.f4,fit.f4l,"LC")
#null model for YR
fit.f4y <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR:LC) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
41
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#level of significance
#call function anova_lrt
anova4y <- anova_lrt(anova4y,fit.f4,fit.f4y,"YR")
#null model for YR:LC
fit.f4yl <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4yl <- anova_lrt(anova4yl,fit.f4,fit.f4yl,"YR:LC")
#null model for YR:CLT
fit.f4yc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4yc <- anova_lrt(anova4yc,fit.f4,fit.f4yc,"YR:CLT")
#null model for LC:CLT
fit.f4lc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4lc <- anova_lrt(anova4lc,fit.f4,fit.f4lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f4ylc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova4ylc <- anova_lrt(anova4ylc,fit.f4,fit.f4ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f4r <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) +
(1|YR:CLT) +
(1|LC:CLT) + (1|YR:LC:CLT), data=tempa)
#level of significance
#call function anova_lrt
anova4r <- anova_lrt(anova4r,fit.f4,fit.f4r,"RPid")
#Merge anova and level of significance
anova4 <- bind_rows(anova4c, anova4y)%>% bind_rows(anova4yl)%>%
bind_rows(anova4yc)%>% bind_rows(anova4r)%>%
bind_rows(anova4lc)%>%bind_rows(anova4ylc)
42
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
anova4 <- as.data.frame(anova4)
#Merge final output
anova_lfix <- variance4%>% left_join(anova4 , by ="sov")
anova_lfix$Pr_Chisq[anova_lfix$stddev == 0] <- NA
########################################################################
##
ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random
##
########################################################################
fit.f5 <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#model summary
summary5 <- summary(fit.f5)
##variance of random factors
variance5 <- as.data.frame(summary5$varcor)
#drop rownames
rownames(variance5) <- NULL
variance5 <- variance5 %>% select (-var1, -var2) %>%
rename(sov=grp, Mean_Sq=vcov, stddev=sdcor)
#Type 3 test of hypothesis
anova.f5t3 <- as.data.frame(anova(fit.f5, type="III"))
#convert rownames into column
anova.f5t3$name <- rownames(anova.f5t3)
#drop rownames
rownames(anova.f5t3) <- NULL
#Type 2 test of hypothesis
anova.f5t2 <- anova(fit.f3, type="marginal", test="F")
#model fitness
anovacase5 <- plot(fit.f5,
main="Model fitness Case 5: CLT and LC - Fixed; YR and RP
- Random",
xlab="Predicated Value", ylab="Residual")
#level of significance for fixed effects
anova_f5t3 <- mixed(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) +
(1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) +
(1|RPid), data=tempa)
anova_f5t3 <- as.data.frame(anova_f5t3$anova_table)
#convert rownames into column
anova_f5t3$name <- rownames(anova_f5t3)
#drop rownames
43
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
rownames(anova_f5t3) <- NULL
anova_f5t3 <- anova_f5t3 %>% select(name, Prob_F = starts_with("Pr(>F)"))
#final output of case 5 for fixed effect - CLT and LC
anova_f5 <- anova.f5t3 %>% left_join(anova_f5t3, by = "name")%>%
rename(sov = name)%>%
select(sov, Df,Sum_Sq=starts_with("Sum Sq"),
Mean_Sq = starts_with("Mean Sq"),
F_value=starts_with("F value"), Prob_F)
#level of significance for random effects
#null model for CLT
fit.f5c <- lmer(Trait~ 1 + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5c <- anova_lrt(anova5c,fit.f5,fit.f5c,"CLT")
#null model for LC
fit.f5l <- lmer(Trait~ 1 + CLT + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5l <- anova_lrt(anova5l,fit.f5,fit.f5l,"LC")
#null model for YR
fit.f5y <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5y <- anova_lrt(anova5y,fit.f5,fit.f5y,"YR")
#null model for YR:LC
fit.f5yl <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5yl <- anova_lrt(anova5yl,fit.f5,fit.f5yl,"YR:LC")
#null model for YR:CLT
fit.f5yc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5yc <- anova_lrt(anova5yc,fit.f5,fit.f5yc,"YR:CLT")
44
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#null model for LC:CLT
fit.f5lc <- lmer(Trait~ 1 + CLT + LC + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5lc <- anova_lrt(anova5lc,fit.f5,fit.f5lc,"LC:CLT")
#null model for YR:LC:CLT
fit.f5ylc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|RPid), data=tempa)
#level of significance
#call function anova_lrt
anova5ylc <- anova_lrt(anova5ylc,fit.f5,fit.f5ylc,"YR:LC:CLT")
#null model for RPid or replication
fit.f5r <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) +
(1|YR:CLT) + (1|YR:LC:CLT) , data=tempa)
#level of significance
#call function anova_lrt
anova5r <- anova_lrt(anova5r,fit.f5,fit.f5r,"RPid")
#Merge anova and level of significance
anova5 <- bind_rows(anova5y, anova5yl)%>%
bind_rows(anova5yc)%>% bind_rows(anova5r)%>%
bind_rows(anova5ylc)
anova5 <- as.data.frame(anova5)
#Merge final output
anova_clfix <- variance5%>% left_join(anova5 , by ="sov")
anova_clfix$Pr_Chisq[anova_clfix$stddev == 0] <- NA
########################################################################
#
Compute Mean and CV of genotype, location, year, rep, environment #
########################################################################
#Compute environment -
Location by year combination
tempa2 <- tempa %>%
mutate (ENV = paste(LC,YR, sep='-')) %>%
#remove missing records
na.omit()
head(tempa2)
#validate data
tempa2$YR <- as.factor(tempa2$YR)
tempa2$RP <- as.factor(tempa2$RP)
tempa2$LC <- as.factor(tempa2$LC)
tempa2$CLT <- as.factor(tempa2$CLT)
tempa2$ENV <- as.factor(tempa2$ENV)
tempa2$Trait <- as.numeric(tempa2$Trait)
45
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#compute descritptive statistics
#trait mean over genotype and environment
mean_ge <- tempa2 %>%
group_by (CLT, ENV) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_ge1 <- mean_ge %>%
spread (ENV, Trait) #transpose using library tidyr
mean_ge2 <- as.data.frame(mean_ge1)
#trait mean over genotype and year
mean_gy <- tempa2 %>%
group_by (CLT, YR) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_gy1 <- mean_gy %>%
spread (YR, Trait) #transpose using library tidyr
mean_gy2 <- as.data.frame(mean_gy1)
#trait sum over genotype and year
sum_gy <- tempa2 %>%
group_by (CLT, YR) %>%
summarize (Trait = sum(Trait,na.rm=FALSE))
sum_gy1 <- sum_gy %>%
spread (YR, Trait) #transpose using library tidyr
sum_gy2 <- as.data.frame(sum_gy1)
#trait standard deviation (sd) over genotype and year
sd_gy <- tempa2 %>%
group_by (CLT, YR) %>%
summarize (Trait = sd(Trait,na.rm=FALSE))
sd_gy1 <- sd_gy %>%
spread (YR, Trait) #transpose using library tidyr
sd_gy2 <- as.data.frame(sd_gy1)
#trait mean over genotype and location
mean_gl <- tempa2 %>%
group_by (CLT, LC) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_gl1 <- mean_gl %>%
spread (LC, Trait) #transpose using library tidyr
mean_gl2 <- as.data.frame(mean_gl1)
#trait sum over genotype and location
sum_gl <- tempa2 %>%
group_by (CLT, LC) %>%
summarize (Trait = sum(Trait,na.rm=FALSE))
sum_gl1 <- sum_gl %>%
spread (LC, Trait) #transpose using library tidyr
46
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
sum_gl2 <- as.data.frame(sum_gl1)
#trait standard deviation (sd) over genotype and location
sd_gl <- tempa2 %>%
group_by (CLT, LC) %>%
summarize (Trait = sd(Trait,na.rm=FALSE))
sd_gl1 <- sd_gl %>%
spread (LC, Trait) #transpose using library tidyr
sd_gl2 <- as.data.frame(sd_gl1)
#trait mean over genotype, location and year
mean_gly <- tempa2 %>%
group_by (CLT, LC, YR) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_gly1 <- mean_gly %>%
spread (LC, Trait) #transpose using library tidyr
mean_gly2 <- as.data.frame(mean_gly1)
#trait sum over genotype, location and year
sum_gly <- tempa2 %>%
group_by (CLT, LC, YR) %>%
summarize (Trait = sum(Trait,na.rm=FALSE))
sum_gly1 <- sum_gly %>%
spread (LC, Trait) #transpose using library tidyr
sum_gly2 <- as.data.frame(sum_gly1)
#trait standard deviation (sd) over genotype, location and year
sd_gly <- tempa2 %>%
group_by (CLT, LC, YR) %>%
summarize (Trait = sd(Trait,na.rm=FALSE))
sd_gly1 <- sd_gly %>%
spread (LC, Trait) #transpose using library tidyr
sd_gly2 <- as.data.frame(sd_gly1)
#trait mean over genotype, location and rep
mean_glr <- tempa2 %>%
group_by (CLT, LC, RP) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_glr1 <- mean_glr %>%
spread (LC, Trait) #transpose using library tidyr
mean_glr2 <- as.data.frame(mean_glr1)
#trait mean over location and year
mean_ly <- tempa2 %>%
group_by (LC, YR) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
47
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
mean_ly1 <- mean_ly %>%
spread (YR, Trait) #transpose using library tidyr
mean_ly2 <- as.data.frame(mean_ly1)
#trait mean over location and replication
mean_lr <- tempa2 %>%
group_by (LC, RP) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_lr1 <- mean_lr %>%
spread (RP, Trait) #transpose using library tidyr
mean_lr2 <- as.data.frame(mean_lr1)
#trait count, min, max, mean, sum, median, var, sd over location
mean_l <- tempa2 %>%
group_by (LC ) %>%
summarize (count = n(),
min = min(Trait,na.rm=FALSE),
max = max(Trait,na.rm=FALSE),
mean = mean(Trait,na.rm=FALSE),
sum = sum(Trait,na.rm=FALSE),
median = median(Trait,na.rm=FALSE),
var = var(Trait,na.rm=FALSE),
sd = sd(Trait,na.rm=FALSE))
stat_l1 <- as.data.frame(mean_l)
#trait count, min, max, mean, sum, median, var, sd over year
mean_y <- tempa2 %>%
group_by (YR ) %>%
summarize (count = n(),
min = min(Trait,na.rm=FALSE),
max = max(Trait,na.rm=FALSE),
mean = mean(Trait,na.rm=FALSE),
sum = sum(Trait,na.rm=FALSE),
median = median(Trait,na.rm=FALSE),
var = var(Trait,na.rm=FALSE),
sd = sd(Trait,na.rm=FALSE))
stat_y1 <- as.data.frame(mean_y)
#trait count, min, max, mean, sum, median, var, sd over genotype
mean_g <- tempa2 %>%
group_by (CLT ) %>%
summarize (count = n(),
min = min(Trait,na.rm=FALSE),
max = max(Trait,na.rm=FALSE),
mean = mean(Trait,na.rm=FALSE),
sum = sum(Trait,na.rm=FALSE),
median = median(Trait,na.rm=FALSE),
var = var(Trait,na.rm=FALSE),
sd = sd(Trait,na.rm=FALSE))
stat_g1 <- as.data.frame(mean_g)
#trait count, min, max, mean, sum, median, var, sd over environment
mean_e <- tempa2 %>%
48
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
group_by (ENV) %>%
summarize (count = n(),
min = min(Trait,na.rm=FALSE),
max = max(Trait,na.rm=FALSE),
mean = mean(Trait,na.rm=FALSE),
sum = sum(Trait,na.rm=FALSE),
median = median(Trait,na.rm=FALSE),
var = var(Trait,na.rm=FALSE),
sd = sd(Trait,na.rm=FALSE))
stat_e1 <- as.data.frame(mean_e)
#trait CV over genotype and location
#CV = (standard deviation /mean) *100
cv_gl <- tempa2 %>%
group_by (CLT, LC) %>%
summarise (Trait_m = mean(Trait,na.rm=FALSE) ,
Trait_s = sd(Trait,na.rm=FALSE)) %>%
mutate (Trait = (Trait_s/Trait_m)*100) %>% # CV
select (-Trait_m, -Trait_s)
cv_gl1 <- cv_gl %>%
spread (LC, Trait) #transpose using library tidyr
cv_gl2 <- as.data.frame(cv_gl1)
########################################################################
##
Compute univariate stability statistics - regression analysis
##
########################################################################
#Compute regression (slope) and deviation from regression
#compute environmental index
dsterm <- tempa2 %>%
group_by (ENV, RP, YR, LC) %>%
summarize (ENVTrait = mean(Trait,na.rm=FALSE))
dst02 <- tempa2 %>%
left_join(dsterm, by=c("ENV", "RP")) %>% #Left join on multiple columns
arrange (CLT) %>%
rename (YR= YR.x, LC = LC.x )
#fit model
fit_model <- dst02 %>%
group_by(CLT) %>% #group regression analysis by cultivar
do (model=lm(Trait~ENVTrait + ENV + RP, data=.))
#parameter estimates
paramlm <- as.data.frame(fit_model %>% tidy(model))
glancelm <- as.data.frame(fit_model %>% glance(model))
augmentlm <- as.data.frame(fit_model %>% augment(model))
outmsed <- lapply(fit_model$model, anova) #anova output
outmsed2 <- as.data.frame(do.call(rbind, outmsed)) #convert list into
data.frame
#convert rownames into column
outmsed2$SOV <- rownames(outmsed2)
# drop rownames
49
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
rownames(outmsed2) <- NULL
#remove numeric values from string of rownames using function gsub
outmsed2 <- outmsed2 %>% mutate(SOV = gsub("\\d+","",SOV))
#extract unique cultivar name and merge to outmsed2 dataset
genotypes <- dst02 %>% select(CLT) %>% distinct (CLT) %>% arrange(CLT)
#Stack 4 times to match number of rows with outmsed2 dataset
genotypes1 <- genotypes %>% bind_rows(genotypes) %>%
bind_rows(genotypes) %>% bind_rows(genotypes) %>% arrange(CLT)
#attach list of cultivars to outmsed2
outmsed3 <- as.data.frame(outmsed2 %>% bind_cols(genotypes1))
#transpose outmsed3
outmsed4 <- outmsed3 %>%
select (CLT, SOV, MS = starts_with("Mean")) %>%
filter (SOV != "RP")
#rename variables
#transpose MS values
MSDS <- outmsed4 %>%
spread (SOV, MS) %>% #transpose using library tidyr
arrange (CLT)
#Transpose degress of freedoms for F-test
FDS3 <- outmsed3 %>%
filter (SOV != "RP")%>%
select (CLT, SOV, Df) %>%
spread (SOV, Df) %>% #transpose using library tidyr
rename (DF_ENVTrait = ENVTrait, DF_Residuals = Residuals,
DF_ENV = ENV)
#Subset parameters - paramlm dataset
REGCOEFGS <- paramlm %>%
filter (term == "ENVTrait") %>%
select (-statistic, -p.value)
#Test and level of significance of regression and deviation from regression
#Merge MSDS, FDS3, REGCOEFGS
slope <- MSDS %>% inner_join (REGCOEFGS, by = "CLT") %>%
inner_join (FDS3, by = "CLT") %>%
rename (MSE = Residuals, LREGMS=ENVTrait, DEVLMS = ENV,
BI= estimate, STDERR = std.error)
#test significance levels
slope1 <- slope %>%
mutate (T_H01 = (BI-1)/STDERR , #Null Hypothesis for slope = 1
PT_H01 = 2*pt(-abs(T_H01), DF_Residuals),
F_DEVREG=DEVLMS/MSE, #NULL HYPOTHESIS: PREDICTED-ACTUAL = 0
PF_HO0= 1-pf(F_DEVREG, DF_ENV, DF_Residuals))
#add legend for level of significance
50
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
slope2 <- slope1 %>%
mutate (SIG_SLOPE = ifelse(PT_H01 <= 0.001, "***",
ifelse(PT_H01 <= 0.01, "**",
ifelse(PT_H01 <= 0.05, "*","")))) %>%
mutate (SIG_DEVREG = ifelse(PF_HO0 <= 0.001, "***",
ifelse(PF_HO0 <= 0.01, "**",
ifelse(PF_HO0 <= 0.05, "*",""))))
#final regression output
options(digits=5)
univariate2 <- slope2 %>%
mutate (SLOPE = paste(BI,SIG_SLOPE, sep=""),
DEVREG = paste(DEVLMS,SIG_DEVREG, sep="") ) %>%
select (CLT,SLOPE, DEVREG )
########################################################################
#
Compute univariate stability statistics - shukla, ecovalence, YS
#
########################################################################
#Compute Shukla, Wricke Ecovalense, Kangs YS
repno <- tempa2 %>%
summarise (total_rep = n_distinct(RP)) #count total number of rep
dstgl <- tempa2 %>%
group_by (CLT, LC) %>%
#Summarize genotype performance across locations
summarize (Trait = mean(Trait,na.rm=FALSE))
dstgl1 <- dstgl %>%
spread (LC, Trait) #transpose values
#convert into data frame so that row containing structure information is
deleted
dstgl2 <- as.data.frame(dstgl1)
#create rownames
rownames(dstgl2) <- dstgl2[ ,1]
shukla <- dstgl2[,-1]
#compute MS error term
tempa3 <- glm(Trait ~ LC + YR + LC:YR + RP %in% (LC:YR) + CLT + CLT:LC +
CLT:YR + CLT:LC:YR, family = gaussian , data= tempa2 )
#model summary
summary1 <- summary.glm(tempa3)
#Error SS
error_ss <- as.data.frame(summary1$deviance)
error_ss1 <- error_ss %>%
#rename variable
select (Deviance = starts_with("summary"), everything())
#Error DF
error_df <- as.data.frame(summary1$df.residual)
error_df1 <- error_df %>%
51
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#rename variable
select (Df = starts_with("summary"), everything())
#MS of error
mse1 <- as.data.frame(error_ss1/error_df1)
mse <- mse1 %>%
#rename variable
rename (MS = Deviance)
# MSError is used populated from ANOVA
univariate1a <- stability.par(shukla, rep= repno$total_rep , MSerror=mse$MS,
alpha=0.1, main="Genotype")
#pool results into individual columns
univariate1b <- univariate1a$statistics
#create column genotype from rownames
univariate1b$genotype <- rownames(univariate1b)
rownames(univariate1b) = NULL #remove rownames
names(univariate1b)[3] <- "significane_sigma"
# rename duplicate name dot
names(univariate1b)[5] <- "significane_s2"
# rename duplicate name dot
names(univariate1b)[2] <- "sigma"
# rename
names(univariate1b)[4] <- "ssquare"
# rename
univariate1c <- univariate1a$stability
#create column genotype from rownames
univariate1c$genotype <- rownames(univariate1c)
rownames(univariate1c) = NULL #remove rownames
names(univariate1c)[8] <- "legend"
# rename variable ... to legend
#Merge
univariate1d <- univariate1b %>%
inner_join (univariate1c , by = "genotype") %>%
# deselect all columns between Yield and Stab.rating
select (-Yield: -Stab.rating) %>%
# arrange the column order for final output
select (CLT=genotype, Mean, sigma,
significane_sigma, ssquare,
significane_s2, Ecovalence,YSi, legend)
#Final stability statistics
#Merge Univariate2 and Univariate1d
univariate <- univariate2 %>%
inner_join(univariate1d, by = "CLT") %>%
mutate (SIGMA=paste(sigma,significane_sigma, ""),
SIGMA_SQUARE=paste(ssquare,significane_s2, ""),
YS_Kang =paste(YSi,legend, "")) %>%
select (Genotype = CLT, Mean, SLOPE, DEVREG, SIGMA,
SIGMA_SQUARE,Ecovalence, YS_Kang)
########################################################################
##
Compute location statistics - genotype F ratio across
##
##
location and environment; location correlation
##
########################################################################
52
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#Location values
#F-value of genotype across location
#fit model
fit_modellc <- tempa2 %>%
group_by(LC) %>% #group regression analysis by location
do (model1=lm(Trait~CLT + YR + CLT:YR + RP%in%YR , data=.))
#parameter estimates
paramlmlc <- as.data.frame(fit_modellc %>% tidy(model1))
glancelmlc <- as.data.frame(fit_modellc %>% glance(model1))
augmentlmlc <- as.data.frame(fit_modellc %>% augment(model1))
outmsedlc <- lapply(fit_modellc$model1, anova) #anova output
#convert list into data.frame
outmsedlc2 <- as.data.frame(do.call(rbind, outmsedlc))
#convert rownames into column
outmsedlc2$SOV <- rownames(outmsedlc2)
#drop rownames
rownames(outmsedlc2) <- NULL
#remove numeric values from string of rownames using function gsub
outmsedlc2 <- outmsedlc2 %>% mutate(SOV = gsub("\\d+","",SOV))
#extract unique location name and merge to outmsedlc2 dataset
location <- dst02 %>% select(LC) %>% distinct (LC) %>% arrange(LC)
#Extract gentype F value for each location
locvalue <- outmsedlc2 %>%
filter(SOV == "CLT") %>%
select (FRatioGenotype = starts_with("F value")) %>%
bind_cols (location) %>% select (LC, FRatioGenotype)
locvalue <- as.data.frame (locvalue)
#Correlation between location and average location for each genotype
#compute genotype mean at each location
glcmean1 <- tempa2 %>%
group_by (CLT, LC) %>%
summarize (glcmean = mean(Trait,na.rm=FALSE)) %>%
as.data.frame(select (CLT, LC, glcmean))
#compute genotype mean across all location -average location
gmean1 <- tempa2 %>%
group_by (CLT ) %>%
summarize (gmean = mean(Trait,na.rm=FALSE)) %>%
as.data.frame(select (CLT, gmean))
#merge location mean with average location for each genotype
lgmean <- glcmean1 %>%
53
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
left_join(gmean1, by="CLT") %>%
arrange(LC) %>% select (-CLT)
#compute correlation with level of significance
lcgcorr <- lgmean %>%
group_by(LC) %>%
do(tidy(cor.test(.$glcmean, .$gmean, method = c("pearson"))))
lcgcorr1 <- lcgcorr %>%
select (LC, Corr_Value = starts_with ("estimate"),
Pvalue = starts_with("p.value"))
#post process correlation value
lcgcorr2 <- lcgcorr1 %>%
mutate (SIG_CORR = ifelse(Pvalue <= 0.001, "***",
ifelse(Pvalue <= 0.01, "**",
ifelse(Pvalue <= 0.05, "*",""))))
#concatenate p value symbol with correlation value
lcgcorr3 <- lcgcorr2 %>%
mutate (LocCorrelation=paste(Corr_Value,SIG_CORR, sep="")) %>%
select (LC, LocCorrelation)
lcgcorr3 <- as.data.frame(lcgcorr3)
#Final location value table for output
#compute location mean
Locmean <- tempa2 %>%
group_by (LC ) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))%>%
select (LC, Mean = starts_with("Trait"))
Locmean <- as.data.frame(Locmean)
#merge all location value outputs for print
LocationValue <- Locmean %>%
inner_join (locvalue, by = "LC") %>%
inner_join(lcgcorr3, by = "LC") %>%
rename (Location = LC)
########################################################################
###
F-value of genotype across environmen
##
########################################################################
#fit model
fit_modelen <- tempa2 %>%
group_by(ENV) %>% #group regression analysis by location
do (model2=lm(Trait~CLT + RP , data=.))
#parameter estimates
paramlmen <- as.data.frame(fit_modelen %>% tidy(model2))
glancelmen <- as.data.frame(fit_modelen %>% glance(model2))
augmentlmen <- as.data.frame(fit_modelen %>% augment(model2))
outmseden <- lapply(fit_modelen$model2, anova) #anova output
#convert list into data.frame
54
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
outmseden2 <- as.data.frame(do.call(rbind, outmseden))
#convert rownames into column
outmseden2$SOV <- rownames(outmseden2)
# drop rownames
rownames(outmseden2) <- NULL
#remove numeric values from string of rownames using function gsub
outmseden2 <- outmseden2 %>% mutate(SOV = gsub("\\d+","",SOV))
#extract unique environment name and merge to outmseden2 dataset
environment <- dst02 %>% select(ENV) %>% distinct (ENV) %>% arrange(ENV)
#Extract gentype F value for each location
locvalue2 <- outmseden2 %>%
filter(SOV == "CLT") %>%
select (FRatioGenotype = starts_with("F value")) %>%
bind_cols (environment) %>% select (ENV, FRatioGenotype)
locvalue2 <- as.data.frame (locvalue2)
########################################################################
###
Compute cluster analysis of location
###
########################################################################
#location cluster analysis
#Euclidean distance
#Ward Hierarchical Clustering
#trait mean over location
mean_l <- tempa2 %>%
group_by (LC ) %>%
summarize (Trait = mean(Trait,na.rm=FALSE))
mean_l1 <- as.data.frame(mean_l)
clusterdata <- mean_l1 %>% select (Trait)
clusterdata <- na.omit(clusterdata)
distance <- dist(clusterdata, method = "euclidean") # distance matrix
hcluster <- hclust(d=distance, method="ward.D")
locationcluster <- plot(hcluster, labels=mean_l1$LC) # display dendogram
########################################################################
####
Clear console
####
########################################################################
#clear console
cat("\014")
########################################################################
###
Export output in .CSV or .TXT
###
###
Print final output in console
###
########################################################################
#user can turn on and off .csv or .txt file by commenting/uncommenting
55
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
#codes in below 2 lines
#sink(file="RGxEOutput.csv", append=FALSE, split=TRUE)
sink(file="RGxEOutput.txt", append=FALSE, split=TRUE)
""
""
"#############
Print date and time
#################"
""
print(Sys.time())
""
"########################################################################"
"###
Section 1: Identify level of significance of
###"
"###
different effects, variances and BLUP
###"
"########################################################################"
""
""
"########################################################################"
"###
ANOVA - ANALYSIS OF VARIANCE
####"
"########################################################################"
""
"########################################################################"
"##
ANOVA Case 1: CLT, YR, LC and RP - All Random
##"
"########################################################################"
""
"##Print ANOVA"
"#Note: P values are generated using LRT-Likelihood Ratio Test via "
"#model comparison and anova"
""
print(anova_randall, row.names = FALSE)
""
"##Print BLUP-Best linear unbiased predictor value for random genotypes"
""
print(BLUP_CLT1, row.names = FALSE)
""
"########################################################################"
"##
ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random
##"
"########################################################################"
""
"##Print ANOVA"
"#Note: P values are computed for F ratio using KR method "
"#Kenward-Rogers approximation for degrees of freedom"
""
print(anova_f2a, row.names = FALSE)
""
"########################################################################"
"##
ANOVA Case 3: CLT - Fixed; YR, LC and RP - Random
##"
"########################################################################"
""
"##Print ANOVA"
"#Note: For fixed effects P values are computed for F ratio using "
"#KR method Kenward-Rogers approximation for degrees of freedom"
""
print(anova_f3, row.names = FALSE)
""
"#For random effect P values are generated using LRT-Likelihood Ratio Test "
"#via model comparison and anova"
""
print(anova_cfix, row.names = FALSE)
""
"########################################################################"
"##
ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random
##"
"########################################################################"
""
"##Print ANOVA"
56
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
"#Note: For fixed effects P values are computed for F ratio using"
"#KR method Kenward-Rogers approximation for degrees of freedom"
""
print(anova_f4, row.names = FALSE)
""
"#For random effect P values are generated using LRT-Likelihood Ratio Test"
"#via model comparison and anova"
""
print(anova_lfix, row.names = FALSE)
""
"########################################################################"
"##
ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random
##"
"########################################################################"
""
"##Print ANOVA"
"#Note: For fixed effects P values are computed for F ratio using "
"#KR method Kenward-Rogers approximation for degrees of freedom"
""
print(anova_f5, row.names = FALSE)
""
"#For random effect P values are generated using LRT-Likelihood Ratio Test"
"#via model comparison and anova"
""
print(anova_clfix, row.names = FALSE)
""
""
"########################################################################"
"###
Section 2: Identify general statistcs (mean and CV) of
###"
"###
genotype, location, year, replication and environment
###"
"########################################################################"
""
""
"########################################################################"
"###
Descriptive Statistics
###"
"########################################################################"
""
"#Descriptive Statistics - Means and CV"
""
"#trait mean across genotype and environment(location x year combination)"
""
print(mean_ge2, row.names = FALSE)
""
"#trait mean across genotype and years"
""
print(mean_gy2, row.names = FALSE)
""
"#trait mean across genotype and location"
""
print(mean_gl2, row.names = FALSE)
""
"#trait mean across genotype, location and year"
""
print(mean_gly2, row.names = FALSE)
""
"#trait mean across genotype, location and replication"
""
print(mean_glr2, row.names = FALSE)
""
"#trait mean across location and year"
""
print(mean_ly2, row.names = FALSE)
""
"#trait mean across location and replication"
""
57
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
print(mean_lr2, row.names = FALSE)
""
"#trait count, min, max, mean, sum, median, var, sd over location"
""
print(stat_l1, row.names = FALSE)
""
"#trait count, min, max, mean, sum, median, var, sd over year"
""
print(stat_y1, row.names = FALSE)
""
"#trait count, min, max, mean, sum, median, var, sd over genotype"
""
print(stat_g1, row.names = FALSE)
""
"#trait count, min, max, mean, sum, median, var, sd over environment"
""
print(stat_e1, row.names = FALSE)
""
"#trait coefficient of variation (cv) across gentoype and location"
""
print(cv_gl2, row.names = FALSE)
""
"trait sum across genotype and location"
""
print(sum_gl2, row.names = FALSE)
""
"trait sum across genotype, location and year"
""
print(sum_gly2, row.names = FALSE)
""
"trait sum across genotype and year"
""
print(sum_gy2, row.names = FALSE)
""
"trait standard deviation (sd) across genotype and location"
""
print(sd_gl2, row.names = FALSE)
""
"trait standard deviation (sd) across genotype, location and year"
""
print(sd_gly2, row.names = FALSE)
""
"trait standard deviation (sd) across genotype and year"
""
print(sd_gy2, row.names = FALSE)
""
""
""
"########################################################################"
"###
Section 3: Choose stable and best genotype
###"
"########################################################################"
""
""
"########################################################################"
"###
Stability Statistics
###"
"########################################################################"
""
"#Univariate statistics - mean, slope, deviation from regression,"
"#Shukla,Ecovalence, Kang"
""
print(univariate, row.names = FALSE)
""
""
"########################################################################"
58
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
"##
Section 4: Choose discriminative and representative location
##"
"########################################################################"
""
""
"########################################################################"
"###
location statistics - genotype F ratio across location
###"
"###
and environment; location correlation
###"
"########################################################################"
""
"#location value"
"# location mean, genotype F ratio across location,"
"#correlation of location with average location performace"
""
print(LocationValue, row.names = FALSE)
""
"#location value-genotype F ratio across location,"
""
print(locvalue2, row.names = FALSE)
sink()
########################################################################
###
ANOVA model fitness plots : Case 1 to 5
###
########################################################################
#Print model fitness of Case I to Case V
#pdf file is sent to same folder where input file is located
#Model
#Model
#Model
#Model
#Model
fitness
fitness
fitness
fitness
fitness
Case
Case
Case
Case
Case
1:
2:
3:
4:
5:
CLT, YR, LC and RP - All Random
CLT, YR and LC - Fixed; RP - Random
CLT - Fixed; YR, LC and RP - Random
LC - Fixed; YR, CLT and RP - Random
CLT and LC - Fixed; YR and RP - Random
pdf("modelfitness.pdf")
anovacase1
anovacase2
anovacase3
anovacase4
anovacase5
dev.off()
########################################################################
###
Cluster analysis of location performance
###
########################################################################
#print dendogram for location
#pdf file is sent to same folder where input file is located
dev.print(pdf, 'locationcluster.pdf')
dev.off()
#End
References
Bates D, Maechler M, Bolker B, Walker S. 2015. Fitting Linear Mixed-Effects Models Using lme4.
Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
59
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Dia M, Wehner TC, Arellano C. 2016a. Analysis of genotype x environment interaction (GxE) using SAS
programming. Agron. J. [AJ-2016-02-0088-A , in press]
Dia M, Wehner TC, Hassell R, Price DS, Boyhan GE, Olson S, King S, Davis AR, Tolla GE. 2016b.
Genotype x environment interaction and stability analysis for watermelon fruit yield in the U.S..
Crop Sci. doi: 10.2135/cropsci2015.10.0625.
Dia M, Wehner TC, Hassell R, Price DS, Boyhan GE, Olson S, King S, Davis AR, Tolla GE. 2016c.
Values of locations for representing mega-environments and for discriminating yield of
watermelon in the U.S. Crop Sci. [CROP-2015-11-0698-ORA, in press]
Fan XM, Kang M, Chen H, Zhang Y, Tan J, Xu C. 2007. Yield stability of maize hybrids evaluated in
multi-environment trials in Yunnan, China. Agron. J. 99: 220-228.
Kang MS. 1993. Simultaneous selection for yield and stability in crop performance trials: Consequences
for growers. Agron. J. 85: 754-757.
Kang MS, Miller JD, Darrah LL. 1987. A note on relationship between stability variance and ecovalence.
J. Hered. 78: 107.
Maechler M, Rousseeuw P, Struyt A, Hubert M. 2015. cluster: Cluster Analysis Basics and Extensions. R
package version 2.0.1.
Mekbib F. 2003. Yield stability in common beans (Phaseolus vulgaris L.) genotypes. Euphytica 130:
147-153.
Mendiburu FD. 2015. agricolae: Statistical Procedures for Agricultural Research. R package version 1.23. http://CRAN.R-project.org/package=agricolae
R Core Team. 2015. R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
60
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
RStudio Team (2015). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA (Computer
software v0.98.1074) [cited 2015 March 04]. Available at http://www.rstudio.com/.
Robinson D. 2015. broom: Convert Statistical Analysis Objects into Tidy Data Frames. R package
version 0.4.0. http://CRAN.R-project.org/package=broom
Shukla GK. 1972. Genotype stability analysis and its application to potato regional trails. Crop Sci. 11:
184-190.
Singmann H, Bolker B, Westfall J. 2015. afex: Analysis of Factorial Experiments. R package version
0.15-2. http://CRAN.R-project.org/package=afex
Smith AB, Cullis BR, Thompson R. 2005. The analysis of crop cultivar
breeding and evaluation trials: an overview of current mixed model approaches. J. Agric. Sci.143:
449-462.
Wricke G. 1962. Uberiene method zur erfassung der ökologischen streubreite in eldversuchen. Z.
Pflanzenzücht. 47: 92-96.
Wickham H. 2016. tidyr: Easily Tidy Data with `spread()` and `gather()` Functions. R package version
0.4.1. http://CRAN.R-project.org/package=tidyr
Wickham H, Francois R. 2015. dplyr: A Grammar of Data Manipulation. R package version 0.4.3.
http://CRAN.R-project.org/package=dplyr
Yan W, Pageau D, Frégeau-Reid J, Durand J. 2011. Assessing the representativeness and repeatability of
test locations for genotype evaluation. Crop Sci. 51: 1603-1610.
61
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Authors
Mahendra Dia
North Carolina State University,
Raleigh, NC, USA
diamahendra@gmail.com
Todd C. Wehner
North Carolina State University,
Raleigh, NC, USA
tcwehner@gmail.com
Consuelo Arellano
North Carolina State University,
Raleigh, NC, USA
carella@ncsu.edu
62
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Table captions.
Table 1. ANOVA models including the factors genotype (CLT), location (LC), year (YR), and replication
(RP) for multi-location replicated trials across years in a randomized complete block design.
Figure captions.
Figure 1. Schematic representation of overall process of RGxE program for genotype stability and
location value.
Figure 2. Residual plot for Case 1 of ANOVA model where genotype, location, year and replication are
treated as random effect.
63
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Table 1. ANOVA models including the factors genotype (CLT), location (LC), year (YR), and replication
(RP) for multi-location replicated trials across years in a randomized complete block design.
Source of variation
DF
CLT
LC
YR
RP(LC*YR)
CLT*LC
CLT*YR
LC*YR
CLT*LC*YR
Pooled error
g-1
l-1
y-1
(r-1)ly
(g-1)(l-1)
(g-1)(y-1)
(l-1)(y-1)
(g-1)(l-1) (y-1)
(r-1)(g-1)ly
Fixed vs. random effects
Case 1
Case 2
Case 3
Random
Fixed
Fixed
Random
Fixed
Random
Random
Fixed
Random
Random
Random Random
Random
Fixed
Random
Random
Fixed
Random
Random
Fixed
Random
Random
Fixed
Random
Case 4
Random
Fixed
Random
Random
Random
Random
Random
Random
64
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Case 5
Fixed
Fixed
Random
Random
Fixed
Random
Random
Random
Figure 1. Overview of overall process of RGxE program for genotype stability and location value.
65
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Figure 2. Residual plot for Case 1 of ANOVA model where genotype, location, year and replication are
treated as random effect.
66
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
SUPPLEMENT
(For Publication)
RGxE: An R Program for Genotype x Environment Interaction Analysis
by Mahendra Dia, Todd C. Wehner, and Consuelo Arellano
Description
The supplemental material provides instructions for user enetered field, interpretation of univariate
stability statistics, sample data and output from sample input data generated from RGxE program
1. Instructions for user entered field
A. Define input file information.
a. For Windows user
I.
Option 1
67
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
II.
Option 2
b. For Ois or Mac user
B. Define trait name that needs to be analyzed.
68
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
C. Define output file information.
2. Interpretation of univariate stability statistics
2.1 Linear regression coefficient (bi)
Regression coefficient (bi) of genotypes approximating unity (P<0.01) along with high trait mean is
considered to stable across wide range of environment. When this is associated with low trait mean
performance, genotypes are poorly adapted to all environments. A bi greater than unity describes
genotypes with higher sensitivity to environmental change (below average stability), and greater
specificity of adaptability to high yielding environments.
2.2 Deviation from regression (S2d)
Genotype is considered to be stable when deviation from regression (S2d) is not significantly different
from zero.
2.3 Shukla stability variance (σi2), ssquare, and Wricke's ecovalence (Wi2)
Genotype with low σi2, ssquare and Wi2 is regarded as stable.
2.4 Kang stability statistic (YSi)
According to this method, genotypes with YSi greater than the mean YSi are considered stable. Stable
genotypes are marked with symbol ‘+’.
3. Output (.txt) from sample data
[1] "#############
Print date and time
#################"
[1] ""
[1] "2016-03-12 15:13:22 EST"
[1] ""
"########################################################################"
[1] "###
Section 1: Identify level of significance of
###"
[1] "###
different effects, variances and BLUP
###"
"########################################################################"
[1] ""
[1] ""
[1]
"########################################################################"
69
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
[1] "###
ANOVA - ANALYSIS OF VARIANCE
####"
[1]
"########################################################################"
[1] ""
[1]
"########################################################################"
[1] "##
ANOVA Case 1: CLT, YR, LC and RP - All Random
##"
[1]
"########################################################################"
[1] ""
[1] "##Print ANOVA"
[1] "#Note: P values are generated using LRT-Likelihood Ratio Test via "
[1] "#model comparison and anova"
[1] ""
sov Variance stddev
Pr_Chisq
YR:LC:CLT
49.730 7.0519 8.8942e-03
LC:CLT
0.000 0.0000
NA
RPid
73.914 8.5973 4.1458e-07
YR:CLT
0.000 0.0000
NA
YR:LC
57.813 7.6035 7.8725e-02
CLT 111.697 10.5687 1.3867e-03
LC 699.569 26.4494 9.0836e-03
YR
0.000 0.0000
NA
Residual 327.526 18.0977
NA
[1] ""
[1] "##Print BLUP-Best linear unbiased predictor value for random
genotypes"
[1] ""
genotype
Blup
CalhounGray 76.255
CrimsonSweet 62.535
EarlyCanada 59.840
FiestaF1 77.283
GeorgiaRattlesnake 70.710
Legacy 68.417
Mickylee 61.300
Quetzali 58.744
StarbriteF1 83.422
SugarBaby 51.169
[1] ""
"########################################################################"
[1] "##
ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random
##"
"########################################################################"
[1] ""
[1] "##Print ANOVA"
[1] "#Note: P values are computed for F ratio using KR method "
[1] "#Kenward-Rogers approximation for degrees of freedom"
[1] ""
sov Df
Sum_Sq
Mean_Sq F_value
Prob_F stddev
YR 1
120.47
120.470 0.36782 5.4876e-01
NA
LC 4 73133.52 18283.381 55.82263 1.8011e-13
NA
CLT 9 44948.85 4994.317 15.24860 4.8887e-20
NA
YR:LC 4 5372.94 1343.235 4.10115 9.0767e-03
NA
YR:CLT 9 4564.93
507.215 1.54862 1.3095e-01
NA
70
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
LC:CLT 36 15018.72
417.187 1.27375 1.4552e-01
NA
YR:LC:CLT 36 23058.48
640.513 1.95561 1.5170e-03
NA
RPid NA
NA
73.914
NA
NA 8.5973
Residual NA
NA
327.526
NA
NA 18.0977
[1] ""
[1]
"########################################################################"
[1] "##
ANOVA Case 3: CLT - Fixed; YR, LC and RP - Random
##"
"########################################################################"
[1] ""
[1] "##Print ANOVA"
[1] "#Note: For fixed effects P values are computed for F ratio using "
[1] "#KR method Kenward-Rogers approximation for degrees of freedom"
[1] ""
sov Df Sum_Sq Mean_Sq F_value
Prob_F
CLT 9 27965 3107.2 9.4869 0.006365
[1] ""
[1] "#For random effect P values are generated using LRT-Likelihood Ratio
Test "
[1] "#via model comparison and anova"
[1] ""
sov Mean_Sq stddev
Pr_Chisq
YR:LC:CLT 49.730 7.0519 2.7248e-02
LC:CLT
0.000 0.0000
NA
RPid 73.914 8.5973 4.1458e-07
YR:CLT
0.000 0.0000
NA
YR:LC 57.813 7.6035 6.7765e-02
LC 699.573 26.4495 9.4460e-03
YR
0.000 0.0000
NA
Residual 327.526 18.0977
NA
[1] ""
[1]
"########################################################################"
[1] "##
ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random
##"
"########################################################################"
[1] ""
[1] "##Print ANOVA"
[1] "#Note: For fixed effects P values are computed for F ratio using"
[1] "#KR method Kenward-Rogers approximation for degrees of freedom"
[1] ""
sov Df Sum_Sq Mean_Sq F_value
Prob_F
LC 4 21802 5450.4 16.641 0.0094205
[1] ""
[1] "#For random effect P values are generated using LRT-Likelihood Ratio
Test"
[1] "#via model comparison and anova"
[1] ""
sov Mean_Sq stddev
Pr_Chisq
YR:LC:CLT 49.730 7.0519 8.2124e-03
LC:CLT
0.000 0.0000
NA
RPid 73.914 8.5973 4.1458e-07
YR:CLT
0.000 0.0000
NA
YR:LC 57.813 7.6035 3.9804e-01
CLT 111.697 10.5687
NA
71
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
YR
0.000 0.0000
NA
Residual 327.526 18.0977
NA
[1] ""
[1]
"########################################################################"
[1] "##
ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random
##"
"########################################################################"
[1] "##Print ANOVA"
[1] "#Note: For fixed effects P values are computed for F ratio using "
[1] "#KR method Kenward-Rogers approximation for degrees of freedom"
[1] ""
sov Df Sum_Sq Mean_Sq F_value
Prob_F
CLT 9 23982.8 2664.76 8.13600 0.0022680
LC 4 21801.7 5450.43 16.64120 0.0092754
CLT:LC 36 8013.4 222.59 0.67962 0.8743361
[1] ""
[1] "#For random effect P values are generated using LRT-Likelihood Ratio
Test"
[1] "#via model comparison and anova"
[1] ""
sov Mean_Sq stddev
Pr_Chisq
YR:LC:CLT 71.582 8.4606 1.0000e+00
RPid 73.914 8.5973 1.3054e-07
YR:CLT
0.000 0.0000
NA
YR:LC 55.628 7.4584 2.9757e-01
YR
0.000 0.0000
NA
Residual 327.526 18.0977
NA
[1] ""
[1] ""
"########################################################################"
[1] "### Section 2: Identify general statistcs (mean and CV) of
###"
[1] "### genotype, location, year, replication and environment
###"
"########################################################################"
[1] ""
"########################################################################"
[1] "###
Descriptive Statistics
###"
"########################################################################"
[1] "#Descriptive Statistics - Means and CV"
[1] ""
[1] "#trait mean across genotype and environment(location x year
combination)"
[1] ""
CLT CI-2009 CI-2010 FL-2009 FL-2010 KN-2009 KN-2010 SC-2009 SC-2010 TX-2009 TX-2010
CalhounGray 64.590 81.502 106.663 121.151 68.767 70.805 104.374 89.694 48.932 17.0135
CrimsonSweet 25.775 73.046 110.511 65.791 62.960 57.561 74.870 112.900 21.176 15.5362
EarlyCanada 50.939 55.320 90.613 116.482 57.051 55.421 55.767 74.885 21.744 11.7770
FiestaF1 69.379 81.298 109.561 99.026 80.178 67.647 89.577 123.786 42.324 22.2093
GeorgiaRattlesnake 52.671 72.843 102.128 109.443 69.990 62.960 87.217 79.128 62.326 12.8060
Legacy 52.467 66.323 105.471 89.036 81.298 72.435 88.606 63.556 50.802 15.8825
Mickylee 49.410 59.089 78.977 113.275 70.194 51.448 82.658 48.035 40.334 12.8977
Quetzali 40.751 62.757 79.424 87.583 61.432 48.697 79.719 77.865 29.735 9.7903
StarbriteF1 77.936 84.457 125.971 120.292 70.601 79.974 112.504 102.433 57.339 22.0972
SugarBaby 42.381 57.866 80.657 91.010 37.491 48.799 53.531 52.595 18.116 10.6258
[1] ""
[1] "#trait mean across genotype and years"
[1] ""
CLT
2009
2010
72
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
CalhounGray 78.665 76.033
CrimsonSweet 59.059 64.967
EarlyCanada 55.223 62.777
FiestaF1 78.204 78.793
GeorgiaRattlesnake 74.866 67.436
Legacy 75.729 61.447
Mickylee 64.314 56.949
Quetzali 58.212 57.338
StarbriteF1 88.870 81.850
SugarBaby 46.435 52.179
[1] ""
[1] "#trait mean across genotype and location"
[1] ""
CLT
CI
FL
KN
SC
TX
CalhounGray 73.046 113.907 69.786 97.034 32.973
CrimsonSweet 49.411 88.151 60.261 93.885 18.356
EarlyCanada 53.129 103.548 56.236 65.326 16.761
FiestaF1 75.338 104.294 73.912 106.682 32.266
GeorgiaRattlesnake 62.757 105.786 66.475 83.173 37.566
Legacy 59.395 97.253 76.867 76.081 33.342
Mickylee 54.250 96.126 60.821 65.347 26.616
Quetzali 51.754 83.503 55.065 78.792 19.763
StarbriteF1 81.196 123.132 75.288 107.468 39.718
SugarBaby 50.124 85.834 43.145 53.063 14.371
[1] ""
[1] "#trait mean across genotype, location and year"
[1] ""
CLT
YR
CI
FL
KN
SC
TX
CalhounGray 2009 64.590 106.663 68.767 104.374 48.9320
CalhounGray 2010 81.502 121.151 70.805 89.694 17.0135
CrimsonSweet 2009 25.775 110.511 62.960 74.870 21.1758
CrimsonSweet 2010 73.046 65.791 57.561 112.900 15.5362
EarlyCanada 2009 50.939 90.613 57.051 55.767 21.7445
EarlyCanada 2010 55.320 116.482 55.421 74.885 11.7770
FiestaF1 2009 69.379 109.561 80.178 89.577 42.3235
FiestaF1 2010 81.298 99.026 67.647 123.786 22.2093
GeorgiaRattlesnake 2009 52.671 102.128 69.990 87.217 62.3257
GeorgiaRattlesnake 2010 72.843 109.443 62.960 79.128 12.8060
Legacy 2009 52.467 105.471 81.298 88.606 50.8020
Legacy 2010 66.323 89.036 72.435 63.556 15.8825
Mickylee 2009 49.410 78.977 70.194 82.658 40.3335
Mickylee 2010 59.089 113.275 51.448 48.035 12.8977
Quetzali 2009 40.751 79.424 61.432 79.719 29.7353
Quetzali 2010 62.757 87.583 48.697 77.865 9.7903
StarbriteF1 2009 77.936 125.971 70.601 112.504 57.3388
StarbriteF1 2010 84.457 120.292 79.974 102.433 22.0972
SugarBaby 2009 42.381 80.657 37.491 53.531 18.1155
SugarBaby 2010 57.866 91.010 48.799 52.595 10.6258
[1] ""
[1] "#trait mean across genotype, location and replication"
[1] ""
CLT RP
CI
FL
KN
SC
TX
CalhounGray 1 98.413 95.809 62.349 89.448 54.4325
CalhounGray 2 69.684 136.038 55.625 103.293 25.3965
73
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
CalhounGray 3 59.292 118.781 70.296 87.128
CalhounGray 4 64.794 105.000 90.875 108.266
CrimsonSweet 1 37.491 39.581 62.349 79.465
CrimsonSweet 2 59.293 130.710 58.885 94.685
CrimsonSweet 3 63.979 93.970 55.218 109.528
CrimsonSweet 4 36.880 88.343 64.591 91.863
EarlyCanada 1 64.386 104.058 51.346 72.465
EarlyCanada 2 45.233 107.547 52.772 59.435
EarlyCanada 3 61.331 104.386 48.901 59.649
EarlyCanada 4 41.566 98.200 71.925 69.755
FiestaF1 1 78.650 67.163 70.703 111.038
FiestaF1 2 70.499 145.651 72.944 117.007
FiestaF1 3 63.164 97.226 82.114 98.525
FiestaF1 4 89.041 107.133 69.888 100.156
GeorgiaRattlesnake 1 70.499 95.776 61.941 84.538
GeorgiaRattlesnake 2 60.923 106.091 61.738 78.721
GeorgiaRattlesnake 3 58.274 105.629 74.167 69.766
GeorgiaRattlesnake 4 61.331 115.646 68.054 99.667
Legacy 1 64.794 80.311 66.424 84.233
Legacy 2 41.159 104.211 71.315 82.887
Legacy 3 55.829 97.647 87.614 68.635
Legacy 4 75.797 106.844 82.113 68.570
Mickylee 1 51.550 95.789 54.810 72.913
Mickylee 2 60.923 107.262 72.333 59.385
Mickylee 3 36.880 95.143 59.700 63.388
Mickylee 4 67.647 86.310 56.440 65.701
Quetzali 1 64.590 73.935 53.587 94.370
Quetzali 2 53.791 101.604 50.328 75.674
Quetzali 3 50.939 79.901 47.883 62.777
Quetzali 4 37.694 78.573 68.462 82.347
StarbriteF1 1 71.722 76.359 54.199 80.066
StarbriteF1 2 86.189 124.965 70.296 110.171
StarbriteF1 3 90.875 126.852 70.296 103.640
StarbriteF1 4 76.001 164.350 106.361 135.996
SugarBaby 1 60.923 71.728 35.861 59.690
SugarBaby 2 58.070 94.041 30.563 52.498
SugarBaby 3 36.472 85.855 50.735 53.883
SugarBaby 4 45.029 91.711 55.421 46.181
[1] ""
[1] "#trait mean across location and year"
[1] ""
LC
2009
2010
CI 52.630 69.450
FL 98.998 101.309
KN 65.996 61.575
SC 82.882 82.488
TX 39.283 15.064
[1] ""
[1] "#trait mean across location and replication"
[1] ""
LC
1
2
3
4
CI 66.302 60.576 57.703 59.578
FL 80.051 115.812 100.539 104.211
KN 57.357 59.680 64.692 73.413
15.2330
36.8290
31.2560
13.5540
19.0985
9.5155
10.1675
15.2815
27.9405
13.6535
51.4160
31.7220
18.9710
26.9565
49.4325
58.1390
13.2595
29.4325
48.1675
37.6610
29.0530
18.4875
53.3065
21.7000
19.0495
12.4065
27.9930
22.7275
18.2570
10.0735
40.4225
61.3850
31.6680
25.3965
12.6045
15.8565
13.8400
15.1815
74
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
SC 82.823 83.376 77.692 86.850
TX 37.920 30.342 20.637 19.793
[1] ""
[1] "#trait count, min, max, mean, sum, median, var, sd over location"
[1] ""
LC count
min
max
mean
sum median
var
sd
CI
80 0.000 108.81 61.040 4883.2 59.904 422.13 20.546
FL
80 0.000 180.91 100.153 8012.3 99.492 886.48 29.774
KN
80 19.153 109.21 63.786 5102.8 63.776 305.04 17.465
SC
80 30.522 144.67 82.685 6614.8 81.462 740.18 27.206
TX
80 0.000 102.63 27.173 2173.8 20.388 505.83 22.491
[1] ""
[1] "#trait count, min, max, mean, sum, median, var, sd over year"
[1] ""
YR count min
max
mean
sum median
var
sd
2009
200
0 172.06 67.958 13592 68.735 1005.8 31.714
2010
200
0 180.91 65.977 13195 65.405 1324.4 36.392
[1] ""
[1] "#trait count, min, max, mean, sum,
[1] ""
CLT count
min
max
CalhounGray
40 9.454 152.04
CrimsonSweet
40 0.000 147.98
EarlyCanada
40 7.213 119.75
FiestaF1
40 4.034 180.91
GeorgiaRattlesnake
40 4.157 144.54
Legacy
40 5.990 120.69
Mickylee
40 10.229 128.39
Quetzali
40 5.909 110.52
StarbriteF1
40 7.417 172.06
SugarBaby
40 8.884 106.33
median, var, sd over genotype"
mean
77.349
62.013
59.000
78.498
71.151
68.588
60.632
57.775
85.360
49.307
sum
3094.0
2480.5
2360.0
3139.9
2846.0
2743.5
2425.3
2311.0
3414.4
1972.3
median
81.777
61.738
56.644
79.792
69.277
68.808
59.395
60.433
83.540
51.245
var
1275.65
1468.58
1009.08
1334.98
1055.36
835.44
892.20
759.80
1462.50
653.62
sd
35.716
38.322
31.766
36.537
32.486
28.904
29.870
27.564
38.243
25.566
[1] ""
[1] "#trait count, min, max, mean, sum, median, var, sd over environment"
[1] ""
ENV count
min
max
mean
sum median
var
sd
CI-2009
40 0.000 108.805 52.630 2105.19 54.199 489.72 22.1297
CI-2010
40 42.789 97.395 69.450 2778.00 67.851 220.28 14.8417
FL-2009
40 56.717 172.065 98.998 3959.91 95.171 553.49 23.5265
FL-2010
40 0.000 180.906 101.309 4052.36 107.077 1239.46 35.2060
KN-2009
40 19.153 109.213 65.996 2639.85 66.424 419.76 20.4881
KN-2010
40 35.861 105.138 61.575 2462.99 60.922 188.12 13.7157
SC-2009
40 30.522 144.666 82.882 3315.30 82.864 705.88 26.5683
SC-2010
40 40.853 141.956 82.488 3299.51 76.512 793.38 28.1671
TX-2009
40 0.000 102.626 39.283 1571.31 32.904 658.22 25.6559
TX-2010
40 4.034 34.231 15.064 602.54 13.143
65.60 8.0994
[1] ""
[1] "#trait coefficient of variation (cv) across gentoype and location"
[1] ""
CLT
CI
FL
KN
SC
TX
CalhounGray 31.949 23.999 32.158 16.135 83.562
CrimsonSweet 62.295 49.664 13.229 30.140 66.446
EarlyCanada 21.952 15.884 29.933 30.311 70.379
FiestaF1 21.779 43.048 13.199 22.666 56.006
GeorgiaRattlesnake 19.610 20.215 10.081 38.647 94.577
Legacy 29.606 27.040 15.967 24.143 71.471
75
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Mickylee
Quetzali
StarbriteF1
SugarBaby
29.351
31.077
21.454
28.754
21.866
21.406
30.106
13.669
25.915
21.994
31.218
30.860
33.958
22.743
21.508
20.260
99.438
75.846
65.571
33.650
[1] ""
[1] "trait sum across genotype and location"
[1] ""
CLT
CI
FL
KN
SC
TX
CalhounGray 584.37 911.26 558.29 776.27 263.78
CrimsonSweet 395.29 705.21 482.08 751.08 146.85
EarlyCanada 425.03 828.38 449.89 522.61 134.09
FiestaF1 602.71 834.35 591.30 853.45 258.13
GeorgiaRattlesnake 502.05 846.28 531.80 665.38 300.53
Legacy 475.16 778.03 614.93 608.65 266.74
Mickylee 434.00 769.01 486.57 522.77 212.93
Quetzali 414.03 668.03 440.52 630.34 158.10
StarbriteF1 649.57 985.05 602.30 859.74 317.74
SugarBaby 400.99 686.67 345.16 424.50 114.97
[1] ""
[1] "trait sum across genotype, location and year"
[1] ""
CLT
YR
CI
FL
KN
SC
TX
CalhounGray 2009 258.36 426.65 275.07 417.50 195.728
CalhounGray 2010 326.01 484.60 283.22 358.78 68.054
CrimsonSweet 2009 103.10 442.05 251.84 299.48 84.703
CrimsonSweet 2010 292.19 263.16 230.24 451.60 62.145
EarlyCanada 2009 203.75 362.45 228.20 223.07 86.978
EarlyCanada 2010 221.28 465.93 221.69 299.54 47.108
FiestaF1 2009 277.51 438.25 320.71 358.31 169.294
FiestaF1 2010 325.19 396.10 270.59 495.14 88.837
GeorgiaRattlesnake 2009 210.68 408.51 279.96 348.87 249.303
GeorgiaRattlesnake 2010 291.37 437.77 251.84 316.51 51.224
Legacy 2009 209.87 421.88 325.19 354.42 203.208
Legacy 2010 265.29 356.14 289.74 254.23 63.530
Mickylee 2009 197.64 315.91 280.77 330.63 161.334
Mickylee 2010 236.36 453.10 205.79 192.14 51.591
Quetzali 2009 163.00 317.69 245.73 318.88 118.941
Quetzali 2010 251.03 350.33 194.79 311.46 39.161
StarbriteF1 2009 311.75 503.89 282.40 450.01 229.355
StarbriteF1 2010 337.83 481.17 319.90 409.73 88.389
SugarBaby 2009 169.52 322.63 149.96 214.12 72.462
SugarBaby 2010 231.46 364.04 195.20 210.38 42.503
[1] ""
[1] "trait sum across genotype and year"
[1] ""
CLT
2009
2010
CalhounGray 1573.3 1520.7
CrimsonSweet 1181.2 1299.3
EarlyCanada 1104.5 1255.5
FiestaF1 1564.1 1575.9
GeorgiaRattlesnake 1497.3 1348.7
Legacy 1514.6 1228.9
Mickylee 1286.3 1139.0
Quetzali 1164.2 1146.8
76
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
StarbriteF1 1777.4 1637.0
SugarBaby 928.7 1043.6
[1] ""
[1] "trait standard
[1] ""
CLT
CalhounGray
CrimsonSweet
EarlyCanada
FiestaF1
GeorgiaRattlesnake
Legacy
Mickylee
Quetzali
StarbriteF1
SugarBaby
[1] ""
[1] "trait standard
[1] ""
CLT
CalhounGray
CalhounGray
CrimsonSweet
CrimsonSweet
EarlyCanada
EarlyCanada
FiestaF1
FiestaF1
GeorgiaRattlesnake
GeorgiaRattlesnake
Legacy
Legacy
Mickylee
Mickylee
Quetzali
Quetzali
StarbriteF1
StarbriteF1
SugarBaby
SugarBaby
[1] ""
[1] "trait standard
[1] ""
CLT
CalhounGray
CrimsonSweet
EarlyCanada
FiestaF1
GeorgiaRattlesnake
Legacy
Mickylee
Quetzali
StarbriteF1
SugarBaby
deviation (sd) across genotype and location"
CI
23.337
30.780
11.663
16.408
12.306
17.584
15.923
16.084
17.420
14.413
FL
27.336
43.780
16.447
44.896
21.384
26.297
21.019
17.875
37.069
11.732
KN
22.4419
7.9720
16.8332
9.7560
6.7012
12.2736
15.7614
12.1110
23.5036
13.3147
SC
15.657
28.297
19.801
24.180
32.143
18.368
22.191
17.919
23.114
10.750
TX
27.5527
12.1968
11.7961
18.0711
35.5286
23.8302
26.4661
14.9893
26.0434
4.8357
deviation (sd) across genotype, location and year"
YR
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
2009
2010
CI
32.0192
7.4101
19.1151
18.8557
13.8353
10.6391
19.4961
12.3851
5.7280
7.0215
12.1090
21.1391
21.8298
7.2519
11.7591
11.9366
24.6272
8.5569
16.6566
6.8838
FL
32.1452
23.8823
28.5590
48.2005
12.3116
5.7846
6.6594
67.7116
17.4036
26.9899
19.9329
32.1907
5.4043
14.7424
15.8267
21.2281
32.8501
45.8879
7.8131
13.7355
KN
30.7113
15.1397
7.8283
8.2203
23.6429
10.0205
7.4503
7.8670
6.9186
4.8959
15.5727
7.5245
18.3739
2.7913
10.3753
11.2465
30.6391
17.0780
16.4254
7.6551
SC
11.6864
17.0801
23.4217
18.8560
23.2914
11.3460
10.2947
21.8657
41.5767
25.2695
16.8878
9.1512
16.7378
8.3460
20.2266
18.3803
27.8233
20.1214
14.9464
6.7591
TX
32.5363
5.7921
15.1403
9.8325
15.9525
1.9904
17.5309
13.5988
35.6340
6.3833
19.9047
10.7674
33.4959
3.2569
15.1437
5.4514
25.5900
9.9889
3.3705
2.4096
deviation (sd) across genotype and year"
2009
34.861
38.308
27.794
25.759
29.401
26.536
25.412
24.665
36.771
24.071
2010
37.410
39.097
35.619
45.569
35.676
30.055
34.015
30.835
40.298
27.292
77
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
[1] ""
[1] ""
"########################################################################"
[1] "###
Section 3: Choose stable and best genotype
###"
"########################################################################"
[1] ""
"########################################################################"
[1] "###
Stability Statistics
###"
"########################################################################"
[1] ""
[1] "#Univariate statistics - mean, slope, deviation from regression,"
[1] "#Shukla,Ecovalence, Kang"
[1] ""
Genotype
CalhounGray
CrimsonSweet
EarlyCanada
FiestaF1
GeorgiaRattlesnake
Legacy
Mickylee
Quetzali
StarbriteF1
SugarBaby
Mean
SLOPE
DEVREG
SIGMA
77.349
1.31693830164876
124.670028621208 61.346841 ns
62.013
1.36454332357129 1450.03510073339*** 439.124582 ns
59.000 0.320939654616631*
686.251275445017* 253.229481 ns
78.498
1.58347941482123
657.873266033708 300.285026 ns
71.151 0.917198392144549
220.063304045718 52.212609 ns
68.588
1.13392930476125
428.070344903385 287.393522 ns
60.632 0.593584749858975
705.480966435745* 188.105058 ns
57.775 0.971528006198702
96.5321084597175 82.808549 ns
85.360
1.29338936247089
221.136796113672 157.240582 ns
49.307 0.504469489907715*
332.181769968082* 264.187195 ns
SIGMA_SQUARE Ecovalence YS_Kang
15.761172 ns
279.75
10 +
567.987945 ns
1488.64
4
285.369649 ns
893.77
2
385.996807 ns
1044.35
11 +
44.863403 ns
250.52
8 +
262.48765 ns
1003.10
7 +
195.125992 ns
685.37
3
86.103189 ns
348.42
1
78.371337 ns
586.61
12 +
308.429643 ns
928.84
-1
[1] ""
[1] ""
"########################################################################"
[1] "## Section 4: Choose discriminative and representative location ##"
"########################################################################"
[1] ""
"########################################################################"
[1] "### location statistics - genotype F ratio across location
###"
[1] "###
and environment; location correlation
###"
"########################################################################"
[1] ""
[1] "#location value"
[1] "# location mean, genotype F ratio across location,"
[1] "#correlation of location with average location performace"
[1] ""
Location
Mean FRatioGenotype
LocCorrelation
CI 61.040
4.1804 0.949049766419943***
FL 100.153
2.2579 0.864975641187942**
KN 63.786
4.6964 0.896401033871258***
SC 82.685
6.8813 0.87603982110229***
TX 27.173
2.9966 0.89062492518194***
[1] ""
[1] "#location value-genotype F ratio across location,"
[1] ""
ENV FRatioGenotype
CI-2009
2.4015
CI-2010
3.3665
FL-2009
2.6914
FL-2010
1.7231
KN-2009
1.9999
KN-2010
6.3971
SC-2009
2.8729
SC-2010
8.0454
TX-2009
2.5003
78
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
TX-2010
1.4619
4. Sample data
YR
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
LC
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
RP
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
CLT
Early Canada
Calhoun Gray
Starbrite F1
Crimson Sweet
Georgia Rattlesnake
Fiesta F1
Mickylee
Sugar Baby
Legacy
Quetzali
Early Canada
Georgia Rattlesnake
Quetzali
Mickylee
Calhoun Gray
Fiesta F1
Starbrite F1
Sugar Baby
Crimson Sweet
Legacy
Fiesta F1
Quetzali
Crimson Sweet
Sugar Baby
Early Canada
Legacy
Starbrite F1
Georgia Rattlesnake
Mickylee
Calhoun Gray
Crimson Sweet
Early Canada
Starbrite F1
Mickylee
Quetzali
Sugar Baby
Fiesta F1
Calhoun Gray
Trait
56.236
74.167
32.601
74.167
64.794
70.907
57.051
28.118
68.054
57.866
56.236
64.387
64.794
96.580
38.306
85.577
69.684
19.153
56.236
72.537
86.800
49.309
59.496
50.531
28.933
103.100
72.537
79.057
68.869
53.384
61.942
86.800
107.583
58.274
73.759
52.161
77.427
109.213
79
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
KN
KN
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
SC
4
4
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
Georgia Rattlesnake
Legacy
Mickylee
Crimson Sweet
Fiesta F1
Calhoun Gray
Georgia Rattlesnake
Legacy
Starbrite F1
Quetzali
Early Canada
Sugar Baby
Calhoun Gray
Georgia Rattlesnake
Legacy
Early Canada
Crimson Sweet
Quetzali
Sugar Baby
Mickylee
Fiesta F1
Starbrite F1
Starbrite F1
Quetzali
Crimson Sweet
Calhoun Gray
Mickylee
Early Canada
Sugar Baby
Fiesta F1
Legacy
Georgia Rattlesnake
Sugar Baby
Calhoun Gray
Georgia Rattlesnake
Crimson Sweet
Fiesta F1
Legacy
Quetzali
Early Canada
Starbrite F1
Mickylee
Georgia Rattlesnake
71.722
81.502
37.898
0.000
59.904
108.805
59.904
66.832
47.679
56.236
68.054
55.014
59.496
46.049
39.528
44.826
45.234
43.196
56.644
62.757
59.904
97.395
98.617
33.823
33.008
32.193
24.858
55.014
21.598
59.089
46.049
51.346
36.268
57.866
53.384
24.858
98.617
57.459
29.748
35.861
68.054
72.129
86.331
80
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
FL
FL
FL
FL
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
Legacy
Calhoun Gray
Mickylee
Starbrite F1
Quetzali
Fiesta F1
Sugar Baby
Early Canada
Crimson Sweet
Georgia Rattlesnake
Starbrite F1
Calhoun Gray
Fiesta F1
Sugar Baby
Mickylee
Early Canada
Crimson Sweet
Legacy
Quetzali
Calhoun Gray
Early Canada
Starbrite F1
Mickylee
Quetzali
Crimson Sweet
Georgia Rattlesnake
Legacy
Sugar Baby
Fiesta F1
Calhoun Gray
Legacy
Sugar Baby
Crimson Sweet
Starbrite F1
Early Canada
Quetzali
Georgia Rattlesnake
Fiesta F1
Mickylee
Calhoun Gray
Starbrite F1
Crimson Sweet
Legacy
109.967
95.215
102.407
80.830
84.355
80.121
74.228
72.374
42.198
44.357
124.189
121.397
94.237
53.852
70.601
30.522
97.578
91.567
88.348
98.842
41.688
100.329
67.076
50.246
82.399
74.513
69.562
46.252
101.857
102.041
83.328
39.793
77.305
144.666
78.486
95.928
143.668
82.093
90.549
103.316
98.132
79.163
115.950
81
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
TX
TX
TX
TX
TX
TX
TX
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
Sugar Baby
Early Canada
Georgia Rattlesnake
Fiesta F1
Quetzali
Mickylee
Calhoun Gray
Legacy
Crimson Sweet
Sugar Baby
Quetzali
Starbrite F1
Georgia Rattlesnake
Mickylee
Early Canada
Fiesta F1
Calhoun Gray
Quetzali
Fiesta F1
Crimson Sweet
Starbrite F1
Mickylee
Legacy
Sugar Baby
Early Canada
Georgia Rattlesnake
Starbrite F1
Crimson Sweet
Mickylee
Legacy
Georgia Rattlesnake
Fiesta F1
Sugar Baby
Early Canada
Calhoun Gray
Quetzali
Quetzali
Starbrite F1
Early Canada
Sugar Baby
Fiesta F1
Georgia Rattlesnake
Crimson Sweet
70.393
88.616
98.554
118.457
81.374
76.863
152.036
120.695
147.984
81.751
92.686
126.095
96.119
86.136
107.270
110.397
94.223
86.918
102.984
102.912
107.594
73.358
76.494
81.086
89.020
127.092
172.065
111.987
79.549
108.744
86.747
106.407
89.398
77.546
77.078
56.717
50.077
56.476
7.213
15.266
68.601
79.305
34.842
82
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2009
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
1
1
Legacy
Mickylee
Calhoun Gray
Sugar Baby
Legacy
Crimson Sweet
Starbrite F1
Fiesta F1
Early Canada
Quetzali
Mickylee
Georgia Rattlesnake
Calhoun Gray
Mickylee
Crimson Sweet
Sugar Baby
Early Canada
Legacy
Quetzali
Starbrite F1
Georgia Rattlesnake
Fiesta F1
Calhoun Gray
Sugar Baby
Calhoun Gray
Quetzali
Early Canada
Fiesta F1
Starbrite F1
Crimson Sweet
Georgia Rattlesnake
Legacy
Mickylee
Calhoun Gray
Fiesta F1
Georgia Rattlesnake
Legacy
Crimson Sweet
Starbrite F1
Quetzali
Mickylee
Early Canada
Sugar Baby
78.364
88.968
92.646
22.829
45.166
21.444
93.552
32.636
21.639
27.688
33.171
102.626
27.606
26.281
28.417
18.185
43.982
48.693
27.753
35.951
22.362
33.908
21.012
16.182
54.464
13.423
14.144
34.149
43.376
0.000
45.010
30.985
12.914
50.531
70.499
59.089
64.794
50.531
75.797
49.309
52.569
46.456
43.604
83
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
KN
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
1
1
2
2
2
Sugar Baby
Legacy
Calhoun Gray
Starbrite F1
Early Canada
Quetzali
Mickylee
Georgia Rattlesnake
Crimson Sweet
Fiesta F1
Quetzali
Sugar Baby
Mickylee
Fiesta F1
Georgia Rattlesnake
Crimson Sweet
Early Canada
Starbrite F1
Calhoun Gray
Legacy
Quetzali
Starbrite F1
Sugar Baby
Legacy
Georgia Rattlesnake
Fiesta F1
Calhoun Gray
Mickylee
Early Canada
Crimson Sweet
Quetzali
Crimson Sweet
Fiesta F1
Early Canada
Mickylee
Georgia Rattlesnake
Sugar Baby
Calhoun Gray
Starbrite F1
Legacy
Mickylee
Legacy
Georgia Rattlesnake
41.974
70.092
72.944
70.907
49.309
35.861
48.086
59.089
61.534
60.311
46.456
50.939
50.531
77.427
69.277
50.939
68.869
68.054
87.207
72.129
63.164
105.138
58.681
82.725
64.387
62.349
72.537
54.606
57.051
67.239
72.944
74.982
97.395
60.719
65.202
81.094
66.832
88.022
95.765
62.757
59.089
42.789
75.797
84
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
CI
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
Crimson Sweet
Calhoun Gray
Sugar Baby
Starbrite F1
Fiesta F1
Early Canada
Quetzali
Fiesta F1
Sugar Baby
Early Canada
Quetzali
Legacy
Crimson Sweet
Calhoun Gray
Georgia Rattlesnake
Mickylee
Starbrite F1
Calhoun Gray
Sugar Baby
Crimson Sweet
Early Canada
Mickylee
Legacy
Georgia Rattlesnake
Fiesta F1
Quetzali
Starbrite F1
Quetzali
Starbrite F1
Georgia Rattlesnake
Crimson Sweet
Legacy
Mickylee
Fiesta F1
Calhoun Gray
Sugar Baby
Early Canada
Sugar Baby
Calhoun Gray
Legacy
Quetzali
Early Canada
Starbrite F1
73.352
79.872
59.496
74.982
81.094
45.641
64.387
67.239
51.346
67.647
68.054
65.609
94.950
86.392
65.202
48.901
83.132
71.722
53.791
48.901
47.271
63.164
94.135
69.277
79.464
45.641
83.947
104.384
79.301
82.745
116.731
58.498
43.420
141.956
83.682
45.152
72.557
51.143
85.190
74.208
63.001
88.348
96.152
85
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
SC
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
Mickylee
Fiesta F1
Crimson Sweet
Georgia Rattlesnake
Quetzali
Crimson Sweet
Mickylee
Legacy
Fiesta F1
Early Canada
Starbrite F1
Georgia Rattlesnake
Sugar Baby
Calhoun Gray
Early Canada
Georgia Rattlesnake
Calhoun Gray
Legacy
Mickylee
Quetzali
Fiesta F1
Crimson Sweet
Starbrite F1
Sugar Baby
Crimson Sweet
Legacy
Quetzali
Fiesta F1
Sugar Baby
Starbrite F1
Calhoun Gray
Early Canada
Mickylee
Georgia Rattlesnake
Crimson Sweet
Fiesta F1
Starbrite F1
Sugar Baby
Legacy
Early Canada
Quetzali
Georgia Rattlesnake
Calhoun Gray
48.168
139.776
91.792
113.084
75.308
136.658
59.700
67.708
95.194
77.610
106.951
65.018
61.514
75.414
61.025
55.666
114.490
53.812
40.853
68.767
118.219
106.421
127.326
52.569
0.000
44.673
66.497
15.869
73.064
54.587
88.303
119.500
114.715
92.999
113.437
180.906
123.836
106.330
87.727
107.824
110.522
116.064
120.040
86
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
FL
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
Mickylee
Mickylee
Sugar Baby
Fiesta F1
Starbrite F1
Calhoun Gray
Crimson Sweet
Legacy
Quetzali
Georgia Rattlesnake
Early Canada
Legacy
Fiesta F1
Mickylee
Georgia Rattlesnake
Crimson Sweet
Sugar Baby
Quetzali
Calhoun Gray
Early Canada
Starbrite F1
Sugar Baby
Legacy
Starbrite F1
Crimson Sweet
Fiesta F1
Mickylee
Georgia Rattlesnake
Early Canada
Calhoun Gray
Quetzali
Quetzali
Legacy
Sugar Baby
Mickylee
Crimson Sweet
Georgia Rattlesnake
Early Canada
Starbrite F1
Fiesta F1
Calhoun Gray
Quetzali
Legacy
128.388
116.927
90.624
91.469
146.110
143.339
85.028
118.799
72.884
84.165
119.752
104.945
107.860
93.071
144.545
64.698
94.024
100.429
132.922
118.853
156.635
9.943
17.971
24.369
27.670
34.231
17.645
19.560
13.122
16.219
5.909
17.767
30.156
8.884
10.229
5.664
13.652
8.924
29.218
30.808
23.187
8.761
9.413
87
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
2010
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
TX
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
Fiesta F1
Georgia Rattlesnake
Crimson Sweet
Starbrite F1
Mickylee
Early Canada
Sugar Baby
Calhoun Gray
Fiesta F1
Calhoun Gray
Sugar Baby
Legacy
Crimson Sweet
Mickylee
Georgia Rattlesnake
Early Canada
Quetzali
Starbrite F1
4.034
4.157
9.780
27.385
11.818
11.899
9.495
9.454
19.764
19.194
14.181
5.990
19.031
11.899
13.855
13.163
6.724
7.417
88
© 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.
Download