RGxE: An R Program for Genotype x Environment Interaction Analysis by Mahendra Dia, Todd C. Wehner, and Consuelo Arellano Abstract Genotype x environmental interaction (GxE) can lead to differences in performance of genotypes over environments. GxE analysis can be used to analyze the stability of genotypes and the value of test locations. We developed an R language program (RGxE) that computes univariate stability statistics, descriptive statistics, pooled ANOVA, genotype F ratio across location and environment, cluster analysis for location, and location correlation with average location performance. Univariate stability statistics calculated are regression slope (bi), deviation from regression (S2d), Shukla’s variance (σi2), S square Wricke’s ecovalence (Wi), and Kang’s yield stability (YSi). RGxE is free and intended for use by scientists studying performance of polygenic or quantitative traits over multiple environments. In the present paper we provide the RGxE program and its components along with an example input data and outputs. Additionally, the RGxE program is also available at http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html. 1 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. List of abbreviations AMMI = Additive main effects and multiplicative interaction model ANOVA = Analysis of variance BLUP = Best linear unbiased prediction CLT = Cultigen or genotype CRAN = Comprehensive R Archive Network CSV = Comma Separated Value CV = Coefficient of variation DF = Degrees of freedom ENV = Environment (location - year combination) EU = Experimental unit GGE = Genotype main effects plus Genotype x environmental interaction effect model GxE = Genotype x environmental interaction H0 = Null hypothesis LC = Location Max = Maximum MCMC = Markov Chain Monte Carlo Min = Minimum MS = Mean square RGxE = R program for the analysis of genotype stability and location value RP = Replication RPid = Replication id, which is an experimental unit Sd = Standard deviation SS = Sum of square Var = Variance YR = Year bi = Regression slope S2d = Deviation from regression σi2 = Shukla’s variance Wi = Wricke’s ecovalence YSi = Kang’s yield stability 2 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Introduction Genotype x environmental interaction (GxE) refers to the modification of genetic factors by environmental factors, and to the role of genetic factors in determining the performance of genotypes in different environments. GxE can occur for quantitative traits of economic importance and is often studied in plant and animal breeding, genetic epidemiology, pharmacogenomics and conservational biology research. The traits include reproductive fitness, longevity, height, weight, yield, and disease resistance. Selection of superior genotypes in target environments is an important objective of plant breeding programs. A target environment is a production environment used by growers. In order to identify superior genotypes across multiple environments, plant breeders conduct trials across locations and years, especially during the final stages of cultivar development. GxE is said to exist when genotype performance differs over environments. Performance of genotype can vary greatly across environment because of the effect of environment on trait expression. Cultivars with high and stable performance are difficult to identify, but are of great value. Since it is impossible to test genotypes in all target environments, plant breeders do indirect selection using their own multiple-environment trials, or test environments. GxE reduces the predictability of the performance of genotypes in target environments based on genotype performance in test environments. An important factor in plant breeding is the selection of suitable test locations, since it accounts for GxE and maximizes gain from selection (Yan et al. 2011). An efficient test location is discriminating, and is representative of the target environments for the cultivars to be released. Discriminating locations can detect differences among genotypes with few replications. Representative locations make it likely that genotypes selected will perform well in target environments (Yan et al. 2011). The analysis of variance (ANOVA) is useful in determining the existence, size and significance of GxE. In order to determine GxE for a group of elite cultivars, genotypes are often considered to be fixed effects and environments random. However, for the purpose of estimating breeding values using best linear unbiased prediction (BLUP), genotypes are considered to be random and environments fixed. Some statisticians consider genotypes random effect, provided that the objective is to select the best ones (Smith et al. 2005). If GxE is significant, additional stability statistics can be calculated. Several statistical methods have been proposed for stability analysis. These methods are based on univariate and multivariate models. The present paper focuses on univariate models for the analysis of stability measured using R programming, so a brief description of each stability measure is provided below. The most widely used methods are univariate stability models based on regression and variance estimates. According to the regression model, stability is expressed in terms of the trait mean (M), the slope of regression line (bi) and the sum of squares for deviation from regression (S2d). High mean of a genotype performance is a precondition of stability. The slope (bi) of regression indicates the response of genotype to the environmental index, which is derived from the average performance of all genotypes in each environment. If bi is not significantly different from unity, the genotype is adapted in all environments. A bi greater than unity describes genotypes with higher sensitivity to environmental change (below average stability), and greater specificity of adaptability to high yielding environments. A bi less than unity provides a measure of greater resistance to environmental change (above average stability), and therefore increasing specificity of adaptability to low yielding environments. The variance parameters that measure stability statistics include stability ecovalence (Wi2) proposed by Wricke (1962), stability variance (σi2) proposed by Shukla (1972), and yield stability (YSi) proposed by Kang (1993). 3 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Ecovalence stability index (Wi2) of a genotype is its contribution to the GxE squared and summed across all environments. Since the value of Wi2 is expressed as a sum of squares, a test of significance for Wi2 is not available. Shukla (1972) proposed an unbiased estimate (σi2) of the variance of GxE plus an error term associated with genotype. Shukla's stability variance (σi2) is a linear combination of Wricke's ecovalence (Wi2). Shukla's stability statistic measures the contribution of a genotype to the GxE and error term, therefore a genotype with low σi2 is regarded as stable. According to Kang et al. (1987), Wi2 and σi2 are equivalent in ranking genotypes for stability. The Kang (1993) stability statistic (YSi) is a nonparametric stability procedure in which both the mean (M) and Shukla (1972) stability variance (σi2) for a trait are used as selection criteria. This method gives equal weight to M and σi2. According to this method, genotypes with YSi greater than the mean YSi are considered stable (Kang 1993; Mekbib 2003, Fan et al. 2007). Genotype F ratio for each test location and correlation of test location with average location are important measures of location value. When the mean of all genotypes are equal, then the F ratio will be close to 1. If analysis of variance is run by location, then high genotype F ratio indicates high discriminating ability for that location. High and significant value of Pearson correlation of each location with the mean of all locations indicates strong representation of mean location performance. Our objective was to develop an R language program (RGxE) that gives an output for genotype stability and location value using univariate models, descriptive statistics, genotype F ratio across location and environment, cluster analysis for location, and location correlation with average location performance. In addition to the RGxE program, Dia et al. (2016a, in press) provided a SAS program (SASGxE) that computes multivariate stability statistics using R program along with univariate stability statistics and location value using SAS programming. These multivariate stability statistics include the additive main effects and multiplicative interaction (AMMI) model, and genotype main effects plus GxE (GGE) model. RGxE uses R software (version 3.1.3 and higher). RGxE is freely available, annotated, and intended for scientists studying performance of polygenic or quantitative traits under different environmental conditions. In the present paper we provide the RGxE program and it’s components along with outputs. A supplemental file is provided with instructions for the user-enetered fields, interpretation of univariate stability statistics, sample input data, and output from sample input data. The RGxE program along with associated files is also available at http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html. General features of the RGxE Program 1. Overview of the RGxE program RGxE is a user friendly and annotated R program that will allow user to analyze genotype stability and evaluate test location value of balanced mult-location replicated trial data. This program generates output (.csv or .txt) into the same folder from where it reads input dataset and Console window of helper application ‘R studio’ (RStudio, 2015) of R statistical software (R Core Team, 2015). A schematic representation of RGxE is presented in Figure 1. Below are the key components of RGxE program which user can independently run. 2. Installing and loading packages RGxE uses dplyr (Wickham et al., 2015), tidyr (Wickham, 2016), broom (Robinson, 2015), agricolae (Mendiburu, 2015), lme4 (Bates et al., 2015), afex (Singmann et al., 2015), cluster (Maechler et al., 2015), and grDevices (R Core Team, 2015) packages. The dplyr, tidyr, broom, agricolae, lme4, afex, cluster, and grDevices packages are available from the Comprehensive R Archive Network (CRAN), therefore they can be installed as any other packages, by simply typing: 4 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. install.packages("dplyr") install.packages("tidyr") install.packages("broom") install.packages("agricolae") install.packages("lme4") install.packages("afex") install.packages("cluster") install.packages("grDevices") Once installed, the packages have to be loaded before they can be used. This can be done through the library() or require()command, as shown below. library (tidyr) library (dplyr) library (sqldf) library(lme4) library(afex) library(broom) library(agricolae) library(cluster) library(grDevices) 3. Input data and validation RGxE starts with user-entered field to read input data. Instructions on user enetered fields are presented in supplemental material. The user is required to set current working directory using setwd(), which is input data file location, and pass input data file name. RGxE requires an input data file in .csv (comma separated value) format. Highlighted fields are user entered in the code shown below for Windows and iOS (Mac) operating system, respectively. setwd("E:/PhD Research Work/PhD Articles") #### For Windows user #### tempa <- read.csv("RGxEInputData2_2016_02_15.csv", header = TRUE) #### For iOS or Mac user #### file.name <- "E:/PhD Research Work/PhD Articles/RGxEInputData2_2016_02_15.csv" out.name <- "E:/PhD Research Work/PhD Articles/GxEROutput.csv" tempa <- read.csv(file.name) The input data file is comprised of column names including YR (year), LC (location), RP (replication), CLT (cultigen or genotype), and dependent variable (Trait). Sample input data is presented in supplemental material. User is required not to change the column names as program takes same variable name for the analysis. Dependent variable in example input data is yield (Mg ha-1) of watermelon. Hereafter, a word ‘genotype’ is used to indicate cultigen, cultivar, variety or genotype. RGxE validates the structure of input data, with below arguments, so that correct column types (numeric, logical, factor, or character) are used for statistical analysis. tempa$YR <- as.factor(tempa$YR) tempa$RP <- as.factor(tempa$RP) tempa$LC <- as.factor(tempa$LC) tempa$CLT <- as.factor(tempa$CLT) tempa$Trait <- as.numeric(tempa$Trait) To access the structure of data, the str() command can be used. str(tempa) 5 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 'data.frame': 400 obs. of 5 variables: $ YR : Factor w/ 2 levels "2009","2010": 1 1 $ LC : Factor w/ 5 levels "CI","FL","KN",..: $ RP : Factor w/ 4 levels "1","2","3","4": 1 $ CLT : Factor w/ 10 levels "CalhounGray",..: $ Trait: num 56.2 74.2 32.6 74.2 64.8 ... 1 3 1 3 1 3 1 1 1 3 1 9 1 3 1 2 1 3 1 5 1 3 1 4 1 3 1 7 1 ... 3 3 3 ... 1 1 ... 10 6 8 ... Top 6 rows of example input data can be viewed using head() command. head(tempa) 1 2 3 4 5 6 YR 2009 2009 2009 2009 2009 2009 LC RP CLT Trait KN 1 EarlyCanada 56.236 KN 1 CalhounGray 74.167 KN 1 StarbriteF1 32.601 KN 1 CrimsonSweet 74.167 KN 1 GeorgiaRattlesnake 64.794 KN 1 FiestaF1 70.907 4. Genotype stability statistics 4.1 Analysis of variance (ANOVA) In multi-location replicated trial data, combined ANOVA is performed with the objectives to identify the significance of different effects; estimate and compare mean for levels of fixed factors; and estimate the size of genotype and GxE variance components. The ANOVA model comprises four factors: genotype (CLT), location (LC), year (YR), and replication or block (RP) nested within locations and year. The response of the genotype i in the location j, year k and replication r is presented as: Response = m + CLTi + LCj + YCk + RPr(LCj*YRk) + CLTi*LCj + CLTi*YRk + LCj*YRk + CLTi*LCj*YRk + Errorijkr where m = grand mean. Depending on the objectives of the analysis, the genotype, location and year are defined as random or fixed effect, which gives five different ANOVA models (Table 1). The genotype is random when the aim is to estimate variance components, genetic parameters, genetic gains expected from selection or different breeding strategies etc. Conversely, genotype is fixed factor when aim is to make comparison of test material for selection or recommendation. Similarly, location is considered as random when the main interest is to estimate variance components for sites that are representative of the relevant population within target region. Location is fixed when interest is to make explicit comparison of one level another and each location represents a well-defined area with relative to crop management. The year and replication are usually treated as random factor. Different combinations of random and fixed effects in ANOVA model have implications for the expectations of mean square (MS) values with the possible modification of the error term to be adopted in the F test. Therefore, sometimes the F test is not as straightforward as the ratio between two mean squares. RGxE computes five different cases of ANOVA: case 1: CLT, YR, LC and RP – all random case 2: CLT, YR and LC – fixed; RP – random case 3: CLT – fixed; LC, YR and RP – random case 4: LC – fixed; CLT, YR and RP – random case 5: CLT and LC – fixed; YR and RP – random 6 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. For random effect RGxE computes estimates of variance components using lmer() function of lme4 package. The significance of random effects is computed using likelihood ratio test to attain p-values. Likelihood is the probability of the data given a model. The logic of the likelihood ratio test is to compare the likelihood of two models with each other. The model without the factor that you are interested in (null model) is compared with model with the factor that you are interested in (full model) using anova() function. It gives a Chi-Square value, the associated degrees of freedom and p-value. According to Wilk’s theorem, the negative two times the log likelihood ratio of two models approaches a Chi-Square distribution with k degrees of freedom, where k is number of random effects tested. RGxE create user defined anova_lrt() function to compute likelihood ratio test and it is stored in ANOVA model Case I code. The type III sum of squares (SS), MS, F value of fixed effects are computed by fitting model in anova() function of lme4 package. The significance (p-value) of fixed effects is computed using mixed() function of afex package. The mixed() function computes type III like p-values using default method via Kenward-Roger approximation for degrees of freedom. To identify each experimental unit (EU) uniquely a distinct value must be assigned to EU. RGxE assign a distinct value to each combination of replication (RP) nested within location (LC) x year (YR) and use this new term (RPid) in model. After installing and calling packages, user can independently compute five different ANOVA models while feeding input data (tempa) in below code. User friendly output is generated in “data.frame” class using dplyr and tidyr packages. ######################################################################## ## ANOVA: Compute analysis of variance ## ######################################################################## #Generate unique id for replication for anova tempa$RPid<-as.factor(paste(tempa$YR, tempa$LC, tempa$RP, sep=".")) ######################################################################## ### ANOVA Case 1: CLT, YR, LC and RP - All Random ### ######################################################################## #full model fit.f1<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary1 <- summary(fit.f1) #variance of random factors variance <- as.data.frame(summary1$varcor) #drop rownames rownames(variance) <- NULL variance1 <- variance %>% select (-var1, -var2) %>% rename(sov=grp, Variance=vcov, stddev=sdcor) #Type 3 test of hypothesis #Type III Wald chisquare tests anova(fit.f1, type="III") 7 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Type 1 test of hypothesis anova(fit.f1, type="marginal", test="F") #model fitness anovacase1 <- plot(fit.f1, main="Model fitness Case 1: CLT, YR, LC and RP - All Random", xlab="Predicated Value", ylab="Residual") #LRT - likelihood ratio test for computing significance of random effect #create function (anova_lrt) for Likelihood ratio test, where parameters #a=outputdatasetname; example-anova1r #b=full model name; example-fit.f1 #c=reduced model name; example-fit.f1r #d=effect name; example- "RPid", NOTE: call it in quotation anova_lrt <- function (a,b,c,d){ #level of significance a <-anova(b,c) #convert anova into data frame a <- data.frame(a) #convert rownames into column a$name <- rownames(a) # drop rownames rownames(a) <- NULL a <- a %>% filter(name=="b") %>% mutate(sov=d) %>% select(sov, Pr_Chisq = starts_with("Pr..Chisq.")) # return the result return(a) } #null model for YR fit.f1y<-lmer(Trait~ 1 + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1y <- anova_lrt(anova1y,fit.f1,fit.f1y,"YR") #null model for LC fit.f1l<-lmer(Trait~ 1 + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1l <- anova_lrt(anova1l,fit.f1,fit.f1l,"LC") #null model for CLT fit.f1c<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1c <- anova_lrt(anova1c,fit.f1,fit.f1c,"CLT") 8 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #null model for YR:LC fit.f1yl<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1yl <- anova_lrt(anova1yl,fit.f1,fit.f1yl,"YR:LC") #null model for YR:CLT fit.f1yc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1yc <- anova_lrt(anova1yc,fit.f1,fit.f1yc,"YR:CLT") #null model for LC:CLT fit.f1lc<-lmer(Trait~ 1 + (1|YR) (1|YR:CLT) + (1|YR:LC:CLT) + + (1|LC) + (1|CLT) + (1|YR:LC) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1lc <- anova_lrt(anova1lc,fit.f1,fit.f1lc,"LC:CLT") #null model for YR:LC:CLT fit.f1ylc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1ylc <- anova_lrt(anova1ylc,fit.f1,fit.f1ylc,"YR:LC:CLT") #null model for RP fit.f1r<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) , data=tempa) #level of significance #call function anova_lrt anova1r <- anova_lrt(anova1ylr,fit.f1,fit.f1r,"RPid") #Merge anova and level of significance anova1 <- bind_rows(anova1y, anova1l)%>% bind_rows(anova1c)%>% bind_rows(anova1yl)%>% bind_rows(anova1yc)%>% bind_rows(anova1r)%>% bind_rows(anova1lc)%>%bind_rows(anova1ylc) anova1 <- as.data.frame(anova1) #Merge final output anova_randall <- variance1%>% left_join(anova1 , by ="sov") anova_randall$Pr_Chisq[anova_randall$stddev == 0] <- NA 9 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Print final output print(anova_randall) sov Variance stddev YR:LC:CLT 49.72994 7.051946 LC:CLT 0.00000 0.000000 RPid 73.91368 8.597306 YR:CLT 0.00000 0.000000 YR:LC 57.81311 7.603494 CLT 111.69687 10.568674 LC 699.56950 26.449376 YR 0.00000 0.000000 Residual 327.52638 18.097690 Pr_Chisq 8.894184e-03 NA 4.145811e-07 NA 7.872463e-02 1.386709e-03 9.083568e-03 NA NA Where sov = source of variance, stddev = standard deviation, Pr_Chisq = Chi-Square probability In this example, the estimate of variance of random effects location x genotype (LC:CLT), year x genotype (YR:LC) and year (YR) is zero. It represent overfitted model, meaning model is more complex than the data can support. Random effect variance estimated as zero is common with those random effects that have too few or small number of levels. The alternate option is to use Markov Chain Monte Carlo (MCMC) simulation using MCMCglmm package to get probability of random effects. Fitness of ANOVA model for case 1 can be plotted using command print(anovacase1), where x-axis is model predicted value and y-axis is residual value (Figure 2). The uniform distribution of fitted residuals on both side of the reference line (value = 0) confirms the goodness of fit. The best linear unbiased predictor (BLUP) of random effects can be extracted using ranef() function of lme4 package. BLUPs are estimates of random effects. They allow us to account environmental factors in our model and missing data; and can be used for making selection. BLUP tend to ‘shrunk’ towards the population mean relative to their fixed effects estimates. RGxE computes BLUP of individual genotypes and generate user friendly output in “data.frame” class using dplyr and tidyr packages (see below code). #Compute BLUP for CLT #BLUP - Best linear unbiased predictor randeffect1 <- ranef(fit.f1) #BLUP for clt BLUP_CLT <- as.data.frame(randeffect1$CLT) #convert rownames into column BLUP_CLT$genotype <- rownames(BLUP_CLT) #drop rownames rownames(BLUP_CLT) <- NULL #rename variable name BLUP_CLT <- BLUP_CLT %>% select(genotype,Blup = starts_with("(Intercept)")) #return estimate of fixed effect from full model summary to compute BLUP fixestimate1 <- as.data.frame(summary1$coefficients) 10 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #compute BLUP value BLUP_CLT1 <- BLUP_CLT %>% mutate(Blup = Blup + fixestimate1$Estimate) #final output for BLUP for cultivars BLUP_CLT1 <- as.data.frame(BLUP_CLT1) #Print final output print(BLUP_CLT1) genotype CalhounGray CrimsonSweet EarlyCanada FiestaF1 GeorgiaRattlesnake Legacy Mickylee Quetzali StarbriteF1 SugarBaby Blup 76.255 62.535 59.840 77.283 70.710 68.417 61.300 58.744 83.422 51.169 Where BLUP = best linear unbiased predictor Similarly, remaining four cases of ANOVA model can be computed using below code. The user defined function anova_lrt(), which stored in ANOVA model Case I code, is used for likelihood ratio test for computing significance of random effects. ######################################################################## ## ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random ## ######################################################################## #full model fit.f2<-lmer(Trait~ YR*LC*CLT + (1|RPid), data=tempa) #model summary summary2 <- summary(fit.f2) ##variance of random factors variance2 <- as.data.frame(summary2$varcor) #drop rownames rownames(variance2) <- NULL variance2 <- variance2 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f2t3 <- as.data.frame(anova(fit.f2, type="III")) #convert rownames into column anova.f2t3$name <- rownames(anova.f2t3) #drop rownames 11 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. rownames(anova.f2t3) <- NULL #Type 2 test of hypothesis anova.f2t2 <- anova(fit.f2, type="marginal", test="F") #model fitness anovacase2 <- plot(fit.f2, main="Model fitness Case 2: CLT, YR and LC - Fixed; RP - Random", xlab="Predicated Value", ylab="Residual") #level of significance: "KR" is implemented corresponding to #the Kenward-Rogers approximation for degrees of freedom anova_f2t3 <- mixed(Trait~ YR*LC*CLT + (1|RPid), data=tempa) anova_f2t3 <- as.data.frame(anova_f2t3$anova_table) #convert rownames into column anova_f2t3$name <- rownames(anova_f2t3) #drop rownames rownames(anova_f2t3) <- NULL anova_f2t3 <- anova_f2t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output for case 2: CLT, YR and LC - Fixed anova_f2 <- anova.f2t3 %>% left_join(anova_f2t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) anova_f2 <- anova_f2 %>% bind_rows(variance2) anova_f2a <- as.data.frame(anova_f2) # print for output The output of ANOVA model Case 2 can be viewed using print(anova_f2a) command and is presented below. "#Note: P values are computed for F ratio using KR method " "#Kenward-Rogers approximation for degrees of freedom" sov YR LC CLT YR:LC YR:CLT LC:CLT YR:LC:CLT RPid Residual Df 1 4 9 4 9 36 36 NA NA Sum_Sq Mean_Sq F_value Prob_F stddev 120.4704 120.47042 0.367819 5.487578e-01 NA 73133.5245 18283.38113 55.822625 1.801102e-13 NA 44948.8523 4994.31692 15.248595 4.888724e-20 NA 5372.9402 1343.23505 4.101151 9.076711e-03 NA 4564.9326 507.21473 1.548623 1.309478e-01 NA 15018.7208 417.18669 1.273750 1.455183e-01 NA 23058.4845 640.51346 1.955609 1.516985e-03 NA NA 73.91369 NA NA 8.597307 NA 327.52636 NA NA 18.097689 Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio, Prob_F = probability for F test, stddev = standard deviation, 12 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. ######################################################################## ## ANOVA CASE 3: CLT - Fixed; YR, LC and RP - Random ## ######################################################################## fit.f3<-lmer(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary3 <- summary(fit.f3) ##variance of random factors variance3 <- as.data.frame(summary3$varcor) #drop rownames rownames(variance3) <- NULL variance3 <- variance3 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f3t3 <- as.data.frame(anova(fit.f3, type="III")) #convert rownames into column anova.f3t3$name <- rownames(anova.f3t3) #drop rownames rownames(anova.f3t3) <- NULL #Type 2 test of hypothesis anova.f3t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase3 <- plot(fit.f3, main="Model fitness Case 3: CLT - Fixed; YR, LC and RP - Random", xlab="Predicated Value", ylab="Residual") #p value for fixed effects anova_f3t3 <- mixed(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f3t3 <- as.data.frame(anova_f3t3$anova_table) #convert rownames into column anova_f3t3$name <- rownames(anova_f3t3) #drop rownames rownames(anova_f3t3) <- NULL anova_f3t3 <- anova_f3t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 3 for fixed effect - CLT 13 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. anova_f3 <- anova.f3t3 %>% left_join(anova_f3t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance for random effects #null model for CLT fit.f3c <- lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3c <- anova_lrt(anova3c,fit.f3,fit.f3c,"CLT") #null model for LC fit.f3l <- lmer(Trait~ 1 + CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3l <- anova_lrt(anova3l,fit.f3,fit.f3l,"LC") #null model for YR fit.f3y <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3y <- anova_lrt(anova3y,fit.f3,fit.f3y,"YR") #null model for YR:LC fit.f3yl <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3yl <- anova_lrt(anova3yl,fit.f3,fit.f3yl,"YR:LC") #null model for YR:CLT fit.f3yc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3yc <- anova_lrt(anova3yc,fit.f3,fit.f3yc,"YR:CLT") #null model for LC:CLT fit.f3lc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance 14 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #call function anova_lrt anova3lc <- anova_lrt(anova3lc,fit.f3,fit.f3lc,"LC:CLT") #null model for YR:LC:CLT fit.f3ylc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3ylc <- anova_lrt(anova3ylc,fit.f3,fit.f3ylc,"YR:LC:CLT") #null model for RPid or replication fit.f3r <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT), data=tempa) #level of significance #call function anova_lrt anova3r <- anova_lrt(anova3r,fit.f3,fit.f3r,"RPid") #Merge anova and level of significance anova3 <- bind_rows(anova3y, anova3l)%>% bind_rows(anova3yl)%>% bind_rows(anova3yc)%>% bind_rows(anova3r)%>% bind_rows(anova3lc)%>%bind_rows(anova3ylc) anova3 <- as.data.frame(anova3) #Merge final output anova_cfix <- variance3%>% left_join(anova3 , by ="sov") anova_cfix$Pr_Chisq[anova_cfix$stddev == 0] <- NA The output of ANOVA model Case 3 can be viewed using print(anova_f3) and print(anova_cfix)commands; and is presented below. "#Note: P values are computed for F ratio using KR method " "#Kenward-Rogers approximation for degrees of freedom" sov Df Sum_Sq Mean_Sq F_value Prob_F CLT 9 27964.75 3107.195 9.486854 0.006364982 Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio, Prob_F = probability for F test "#For random effect P values are generated using LRT-Likelihood Ratio Test" "#via model comparison and anova" sov Mean_Sq stddev YR:LC:CLT 49.72992 7.051945 LC:CLT 0.00000 0.000000 RPid 73.91370 8.597308 YR:CLT 0.00000 0.000000 YR:LC 57.81306 7.603490 LC 699.57346 26.449451 YR 0.00000 0.000000 Residual 327.52637 18.097690 Pr_Chisq 2.724821e-02 NA 4.145811e-07 NA 6.776546e-02 9.446027e-03 NA NA 15 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq = Chi-Square probability ######################################################################## ## ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random ## ######################################################################## fit.f4 <- lmer(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary4 <- summary(fit.f4) ##variance of random factors variance4 <- as.data.frame(summary4$varcor) # drop rownames rownames(variance4) <- NULL variance4 <- variance4 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f4t3 <- as.data.frame(anova(fit.f4, type="III")) #convert rownames into column anova.f4t3$name <- rownames(anova.f4t3) #drop rownames rownames(anova.f4t3) <- NULL #Type 2 test of hypothesis anova.f4t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase4 <- plot(fit.f4, main="Model fitness Case 4: LC - Fixed; YR, CLT and RP - Random", xlab="Predicated Value", ylab="Residual") #level of significance for fixed effects anova_f4t3 <- mixed(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f4t3 <- as.data.frame(anova_f4t3$anova_table) #convert rownames into column anova_f4t3$name <- rownames(anova_f4t3) #drop rownames rownames(anova_f4t3) <- NULL 16 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. anova_f4t3 <- anova_f4t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 4 for fixed effect - LC anova_f4 <- anova.f4t3 %>% left_join(anova_f4t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance of random effect #null model for CLT fit.f4c <- lmer(Trait~ 1 + LC + (1|YR) + (1|YR:LC) + (1|YR:LC) + (1|YR:CLT)+ (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4c <- anova_lrt(anova4c,fit.f4,fit.f4c,"CLT") #null model for LC fit.f4l <- lmer(Trait~ 1 + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4l <- anova_lrt(anova4l,fit.f4,fit.f4l,"LC") #null model for YR fit.f4y <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4y <- anova_lrt(anova4y,fit.f4,fit.f4y,"YR") #null model for YR:LC fit.f4yl <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4yl <- anova_lrt(anova4yl,fit.f4,fit.f4yl,"YR:LC") #null model for YR:CLT fit.f4yc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4yc <- anova_lrt(anova4yc,fit.f4,fit.f4yc,"YR:CLT") #null model for LC:CLT 17 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. fit.f4lc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4lc <- anova_lrt(anova4lc,fit.f4,fit.f4lc,"LC:CLT") #null model for YR:LC:CLT fit.f4ylc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4ylc <- anova_lrt(anova4ylc,fit.f4,fit.f4ylc,"YR:LC:CLT") #null model for RPid or replication fit.f4r <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT), data=tempa) #level of significance #call function anova_lrt anova4r <- anova_lrt(anova4r,fit.f4,fit.f4r,"RPid") #Merge anova and level of significance anova4 <- bind_rows(anova4c, anova4y)%>% bind_rows(anova4yl)%>% bind_rows(anova4yc)%>% bind_rows(anova4r)%>% bind_rows(anova4lc)%>%bind_rows(anova4ylc) anova4 <- as.data.frame(anova4) #Merge final output anova_lfix <- variance4%>% left_join(anova4 , by ="sov") anova_lfix$Pr_Chisq[anova_lfix$stddev == 0] <- NA The output of ANOVA model Case 4 can be viewed using print(anova_f4) and print(anova_lfix)commands; and is presented below. "#Note: P values are computed for F ratio using KR method " "#Kenward-Rogers approximation for degrees of freedom" sov Df Sum_Sq Mean_Sq F_value Prob_F LC 4 21801.73 5450.432 16.6412 0.009420459 Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio, Prob_F = probability for F test "#For random effect P values are generated using LRT-Likelihood Ratio Test" "#via model comparison and anova" sov YR:LC:CLT LC:CLT RPid YR:CLT Mean_Sq 49.72997 0.00000 73.91371 0.00000 stddev Pr_Chisq 7.051948 8.212358e-03 0.000000 NA 8.597308 4.145811e-07 0.000000 NA 18 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. YR:LC 57.81300 7.603486 3.980432e-01 CLT 111.69670 10.568666 NA YR 0.00000 0.000000 NA Residual 327.52635 18.097689 NA Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq = Chi-Square probability ######################################################################## ## ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random ## ######################################################################## fit.f5 <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary5 <- summary(fit.f5) ##variance of random factors variance5 <- as.data.frame(summary5$varcor) #drop rownames rownames(variance5) <- NULL variance5 <- variance5 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f5t3 <- as.data.frame(anova(fit.f5, type="III")) #convert rownames into column anova.f5t3$name <- rownames(anova.f5t3) #drop rownames rownames(anova.f5t3) <- NULL #Type 2 test of hypothesis anova.f5t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase5 <- plot(fit.f5, main="Model fitness Case 5: CLT and LC - Fixed; YR and RP - Random", xlab="Predicated Value", ylab="Residual") #level of significance for fixed effects anova_f5t3 <- mixed(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f5t3 <- as.data.frame(anova_f5t3$anova_table) #convert rownames into column anova_f5t3$name <- rownames(anova_f5t3) 19 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #drop rownames rownames(anova_f5t3) <- NULL anova_f5t3 <- anova_f5t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 5 for fixed effect - CLT and LC anova_f5 <- anova.f5t3 %>% left_join(anova_f5t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance for random effects #null model for CLT fit.f5c <- lmer(Trait~ 1 + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5c <- anova_lrt(anova5c,fit.f5,fit.f5c,"CLT") #null model for LC fit.f5l <- lmer(Trait~ 1 + CLT + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5l <- anova_lrt(anova5l,fit.f5,fit.f5l,"LC") #null model for YR fit.f5y <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5y <- anova_lrt(anova5y,fit.f5,fit.f5y,"YR") #null model for YR:LC fit.f5yl <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5yl <- anova_lrt(anova5yl,fit.f5,fit.f5yl,"YR:LC") #null model for YR:CLT fit.f5yc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5yc <- anova_lrt(anova5yc,fit.f5,fit.f5yc,"YR:CLT") 20 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #null model for LC:CLT fit.f5lc <- lmer(Trait~ 1 + CLT + LC + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5lc <- anova_lrt(anova5lc,fit.f5,fit.f5lc,"LC:CLT") #null model for YR:LC:CLT fit.f5ylc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5ylc <- anova_lrt(anova5ylc,fit.f5,fit.f5ylc,"YR:LC:CLT") #null model for RPid or replication fit.f5r <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) , data=tempa) #level of significance #call function anova_lrt anova5r <- anova_lrt(anova5r,fit.f5,fit.f5r,"RPid") #Merge anova and level of significance anova5 <- bind_rows(anova5y, anova5yl)%>% bind_rows(anova5yc)%>% bind_rows(anova5r)%>% bind_rows(anova5ylc) anova5 <- as.data.frame(anova5) #Merge final output anova_clfix <- variance5%>% left_join(anova5 , by ="sov") anova_clfix$Pr_Chisq[anova_clfix$stddev == 0] <- NA The output of ANOVA model Case 5 can be viewed using print(anova_f5) and print(anova_clfix)commands; and is presented below. "#Note: P values are computed for F ratio using KR method " "#Kenward-Rogers approximation for degrees of freedom" sov Df Sum_Sq Mean_Sq F_value Prob_F CLT 9 23982.801 2664.7557 8.136004 0.002267970 LC 4 21801.728 5450.4320 16.641201 0.009275436 CLT:LC 36 8013.352 222.5931 0.679619 0.874336050 Where sov = source of variance, Sum_Sq = sum of square, Mean_Sq = Mean square, F_value = F ratio, Prob_F = probability for F test "#For random effect P values are generated using LRT-Likelihood Ratio Test" "#via model comparison and anova" sov YR:LC:CLT RPid Mean_Sq 71.58186 73.91368 stddev Pr_Chisq 8.460607 1.000000e+00 8.597307 1.305383e-07 21 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. YR:CLT 0.00000 0.000000 NA YR:LC 55.62783 7.458407 2.975676e-01 YR 0.00000 0.000000 NA Residual 327.52636 18.097689 NA Where sov = source of variance, Mean_Sq = Mean square, stddev = standard deviation, Pr_Chisq = Chi-Square probability Fitness of ANOVA model case 2, case 3, case 4 and case 5 can be plotted using command print(anovacase2), print(anovacase3), print(anovacase4) and print(anovacase1), respectively. 4.2. Descriptive statistics A new additional variable environment (ENV) is created and quality check of missing value is performed in dataset ‘tempa2’using dplyr package. Environment is location-year combination, which is highlighted is below code. RGxE validates structure of dataset ‘tempa2’ as it serves input data for descriptive and other statistics (Figure 1). #Compute environment - Location by year combination tempa2 <- tempa %>% mutate (ENV = paste(LC,YR, sep='-')) %>% #remove missing records na.omit() #validate data tempa2$YR <- as.factor(tempa2$YR) tempa2$RP <- as.factor(tempa2$RP) tempa2$LC <- as.factor(tempa2$LC) tempa2$CLT <- as.factor(tempa2$CLT) tempa2$ENV <- as.factor(tempa2$ENV) tempa2$Trait <- as.numeric(tempa2$Trait) Top 6 rows of input data ‘tempa2’ can be viewed using head() command. head(tempa2) 1 2 3 4 5 6 YR 2009 2009 2009 2009 2009 2009 LC RP CLT Trait KN 1 EarlyCanada 56.236 KN 1 CalhounGray 74.167 KN 1 StarbriteF1 32.601 KN 1 CrimsonSweet 74.167 KN 1 GeorgiaRattlesnake 64.794 KN 1 FiestaF1 70.907 RPid 2009.KN.1 2009.KN.1 2009.KN.1 2009.KN.1 2009.KN.1 2009.KN.1 ENV KN-2009 KN-2009 KN-2009 KN-2009 KN-2009 KN-2009 Descriptive statistics including count, minimum (min), maximum (max), mean, sum, median, variance (var), standard deviation (sd), and coefficient of variation (cv) are computed using dplyr package. Using tidyr package results of descriptive statistics are transposed in user friendly layout so that researchers can interpret them easily (see supplemental material for descriptive statistics outputs). RGxE generates following descriptive statistics. Trait mean over genotype and environment Trait mean and sd over genotype and year Trait mean, cv, sd and sum over genotype and location Trait mean, sd, and sum over genotype, location and year Trait mean over genotype, location and replication 22 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Trait mean over location and year Trait mean over location and replication Trait count, min, max, mean, sum, median, var, and sd over location Trait count, min, max, mean, sum, median, var, and sd over year Trait count, min, max, mean, sum, median, var, and sd over genotype Trait count, min, max, mean, sum, median, var, and sd over environment 4.3 Univariate stability statistics Among univariate stability statistics, RGxE generates output of regression slope (bi), deviation from regression (S2d), Shukla’s sigma (σi2), ssquares, Wricke’s ecovalence (Wi) and Kang’s statistics (YSi). RGxE regresses the response of genotype against the environmental index to compute regression slope (bi), deviation from regression (S2d), T-test on regression slope (H0: bi = 1) and F-test on deviation from regression (H0: S2d = 0). The level of significance of T-test and F-test is computed using lm() function of R (R Core Team, 2015), and dplyr and tidyr packages. The level of significance at 0.05, 0.01 and 0.001 is represented by ‘*’, ‘**’, ‘***’; respectively. Environmental index is average performance of all genotypes in each environment. Stability.par() function of agricolae package is used to compute Shukla’s sigma (σi2), ssquares, Wricke’s ecovalence (Wi) and Kang’s statistics (YSi). For selection of stable genotype, user can independently compute univariate stability statistics while feeding required input data (tempa2) in below code. Top 6 rows of input data ‘tempa2’ is presented in section 4.2 descriptive statistics. ######################################################################## ## Compute univariate stability statistics - regression analysis ## ######################################################################## #Compute regression (slope) and deviation from regression #compute environmental index dsterm <- tempa2 %>% group_by (ENV, RP, YR, LC) %>% summarize (ENVTrait = mean(Trait,na.rm=FALSE)) dst02 <- tempa2 %>% left_join(dsterm, by=c("ENV", "RP")) %>% #Left join on multiple columns arrange (CLT) %>% rename (YR= YR.x, LC = LC.x ) #fit model fit_model <- dst02 %>% group_by(CLT) %>% #group regression analysis by cultivar do (model=lm(Trait~ENVTrait + ENV + RP, data=.)) #parameter estimates paramlm <- as.data.frame(fit_model %>% tidy(model)) glancelm <- as.data.frame(fit_model %>% glance(model)) augmentlm <- as.data.frame(fit_model %>% augment(model)) outmsed <- lapply(fit_model$model, anova) #anova output outmsed2 <- as.data.frame(do.call(rbind, outmsed)) #convert list into data.frame #convert rownames into column outmsed2$SOV <- rownames(outmsed2) # drop rownames 23 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. rownames(outmsed2) <- NULL #remove numeric values from string of rownames using function gsub outmsed2 <- outmsed2 %>% mutate(SOV = gsub("\\d+","",SOV)) #extract unique cultivar name and merge to outmsed2 dataset genotypes <- dst02 %>% select(CLT) %>% distinct (CLT) %>% arrange(CLT) #Stack 4 times to match number of rows with outmsed2 dataset genotypes1 <- genotypes %>% bind_rows(genotypes) %>% bind_rows(genotypes) %>% bind_rows(genotypes) %>% arrange(CLT) #attach list of cultivars to outmsed2 outmsed3 <- as.data.frame(outmsed2 %>% bind_cols(genotypes1)) #transpose outmsed3 outmsed4 <- outmsed3 %>% select (CLT, SOV, MS = starts_with("Mean")) %>% filter (SOV != "RP") #rename variables #transpose MS values MSDS <- outmsed4 %>% spread (SOV, MS) %>% #transpose using library tidyr arrange (CLT) #Transpose degress of freedoms for F-test FDS3 <- outmsed3 %>% filter (SOV != "RP")%>% select (CLT, SOV, Df) %>% spread (SOV, Df) %>% #transpose using library tidyr rename (DF_ENVTrait = ENVTrait, DF_Residuals = Residuals, DF_ENV = ENV) #Subset parameters - paramlm dataset REGCOEFGS <- paramlm %>% filter (term == "ENVTrait") %>% select (-statistic, -p.value) #Test and level of significance of regression and deviation from regression #Merge MSDS, FDS3, REGCOEFGS slope <- MSDS %>% inner_join (REGCOEFGS, by = "CLT") %>% inner_join (FDS3, by = "CLT") %>% rename (MSE = Residuals, LREGMS=ENVTrait, DEVLMS = ENV, BI= estimate, STDERR = std.error) #test significance levels slope1 <- slope %>% mutate (T_H01 = (BI-1)/STDERR , #Null Hypothesis for slope = 1 PT_H01 = 2*pt(-abs(T_H01), DF_Residuals), F_DEVREG=DEVLMS/MSE, #NULL HYPOTHESIS: PREDICTED-ACTUAL = 0 PF_HO0= 1-pf(F_DEVREG, DF_ENV, DF_Residuals)) #add legend for level of significance 24 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. slope2 <- slope1 %>% mutate (SIG_SLOPE = ifelse(PT_H01 <= 0.001, "***", ifelse(PT_H01 <= 0.01, "**", ifelse(PT_H01 <= 0.05, "*","")))) %>% mutate (SIG_DEVREG = ifelse(PF_HO0 <= 0.001, "***", ifelse(PF_HO0 <= 0.01, "**", ifelse(PF_HO0 <= 0.05, "*","")))) #final regression output options(digits=5) univariate2 <- slope2 %>% mutate (SLOPE = paste(BI,SIG_SLOPE, sep=""), DEVREG = paste(DEVLMS,SIG_DEVREG, sep="") ) %>% select (CLT,SLOPE, DEVREG ) ######################################################################## # Compute univariate stability statistics - shukla, ecovalence, YS # ######################################################################## #Compute Shukla, Wricke Ecovalense, Kangs YS repno <- tempa2 %>% summarise (total_rep = n_distinct(RP)) #count total number of rep dstgl <- tempa2 %>% group_by (CLT, LC) %>% #Summarize genotype performance across locations summarize (Trait = mean(Trait,na.rm=FALSE)) dstgl1 <- dstgl %>% spread (LC, Trait) #transpose values #convert into data frame so that row containing structure information is deleted dstgl2 <- as.data.frame(dstgl1) #create rownames rownames(dstgl2) <- dstgl2[ ,1] shukla <- dstgl2[,-1] #compute MS error term tempa3 <- glm(Trait ~ LC + YR + LC:YR + RP %in% (LC:YR) + CLT + CLT:LC + CLT:YR + CLT:LC:YR, family = gaussian , data= tempa2 ) #model summary summary1 <- summary.glm(tempa3) #Error SS error_ss <- as.data.frame(summary1$deviance) error_ss1 <- error_ss %>% #rename variable select (Deviance = starts_with("summary"), everything()) #Error DF error_df <- as.data.frame(summary1$df.residual) error_df1 <- error_df %>% 25 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #rename variable select (Df = starts_with("summary"), everything()) #MS of error mse1 <- as.data.frame(error_ss1/error_df1) mse <- mse1 %>% #rename variable rename (MS = Deviance) # MSError is used populated from ANOVA univariate1a <- stability.par(shukla, rep= repno$total_rep , MSerror=mse$MS, alpha=0.1, main="Genotype") #pool results into individual columns univariate1b <- univariate1a$statistics #create column genotype from rownames univariate1b$genotype <- rownames(univariate1b) rownames(univariate1b) = NULL #remove rownames names(univariate1b)[3] <- "significane_sigma" # rename duplicate name dot names(univariate1b)[5] <- "significane_s2" # rename duplicate name dot names(univariate1b)[2] <- "sigma" # rename names(univariate1b)[4] <- "ssquare" # rename univariate1c <- univariate1a$stability #create column genotype from rownames univariate1c$genotype <- rownames(univariate1c) rownames(univariate1c) = NULL #remove rownames names(univariate1c)[8] <- "legend" # rename variable ... to legend #Merge univariate1d <- univariate1b %>% inner_join (univariate1c , by = "genotype") %>% # deselect all columns between Yield and Stab.rating select (-Yield: -Stab.rating) %>% # arrange the column order for final output select (CLT=genotype, Mean, sigma, significane_sigma, ssquare, significane_s2, Ecovalence,YSi, legend) #Final stability statistics #Merge Univariate2 and Univariate1d univariate <- univariate2 %>% inner_join(univariate1d, by = "CLT") %>% mutate (SIGMA=paste(sigma,significane_sigma, ""), SIGMA_SQUARE=paste(ssquare,significane_s2, ""), YS_Kang =paste(YSi,legend, "")) %>% select (Genotype = CLT, Mean, SLOPE, DEVREG, SIGMA, SIGMA_SQUARE,Ecovalence, YS_Kang) print(univariate) Genotype Mean SLOPE CalhounGray 77.349 1.32 CrimsonSweet 62.013 1.36 DEVREG SIGMA SIGMA_SQUARE Ecovalence YS_Kang 124.67 61.35 ns 15.76 ns 279.75 10 + 1450.04*** 439.12 ns 567.99 ns 1488.64 4 26 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. EarlyCanada FiestaF1 GeorgiaRattlesnake Legacy Mickylee Quetzali StarbriteF1 SugarBaby 59.000 78.498 71.151 68.588 60.632 57.775 85.360 49.307 0.32* 1.58 0.92 1.13 0.59 0.97 1.29 0.50* 686.25* 657.87 220.06 428.07 705.48* 96.53 221.14 332.18* 253.23 300.29 52.21 287.39 188.11 82.81 157.24 264.19 ns ns ns ns ns ns ns ns 285.37 385.99 44.86 262.49 195.13 86.10 78.37 308.43 ns ns ns ns ns ns ns ns 893.77 1044.35 250.52 1003.10 685.37 348.42 586.61 928.84 2 11 8 7 3 1 12 -1 + + + + Where SLOPE = regression slope, DEVREG = deviation from regression, SIGMA = Shukla’s sigma, SIGMA_SQUARE = ssquares, Ecovalence = Wricke’s ecovalence, YS_Kang = Kang’s statistics, ns = non-significant, + = indicate stable genotype according to Kang’s stability statistics 5. Location value statistics Input data ‘tempa2’ is used to calculate genotype F ratio across location and environment; correlation of location with average location performance; and location cluster analysis. 5.1 Genotype F ratio across location and environment; and correlation among location and average location RGxE computes analysis of variance by location using lm() function to get the genotype F values across location. When the mean of all genotypes are equal then the F ratio will be close to 1. The high genotype F value indicates high discriminating ability for that location. Similarly, Pearson’s test of correlation of locations with average location is computed using cor.test() function of R built in stats package (R core team, 2015). Function cor.test() provide level of significance of correlation and the level of significance at 0.05, 0.01 and 0.001 is represented by ‘*’, ‘**’, ‘***’; respectively. RGxE generates user friendly output for genotype F ratio across location and environment; and correlation of location with average location performance using dplyr and tidyr packages as shown in below code. ###################################################################### ## Compute location statistics - genotype F ratio across ## ## location and environment; location correlation ## ###################################################################### #Location values #F-value of genotype across location #fit model fit_modellc <- tempa2 %>% group_by(LC) %>% #group regression analysis by location do (model1=lm(Trait~CLT + YR + CLT:YR + RP%in%YR , data=.)) #parameter estimates paramlmlc <- as.data.frame(fit_modellc %>% tidy(model1)) glancelmlc <- as.data.frame(fit_modellc %>% glance(model1)) augmentlmlc <- as.data.frame(fit_modellc %>% augment(model1)) outmsedlc <- lapply(fit_modellc$model1, anova) #anova output #convert list into data.frame outmsedlc2 <- as.data.frame(do.call(rbind, outmsedlc)) #convert rownames into column outmsedlc2$SOV <- rownames(outmsedlc2) #drop rownames rownames(outmsedlc2) <- NULL 27 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #remove numeric values from string of rownames using function gsub outmsedlc2 <- outmsedlc2 %>% mutate(SOV = gsub("\\d+","",SOV)) #extract unique location name and merge to outmsedlc2 dataset location <- dst02 %>% select(LC) %>% distinct (LC) %>% arrange(LC) #Extract gentype F value for each location locvalue <- outmsedlc2 %>% filter(SOV == "CLT") %>% select (FRatioGenotype = starts_with("F value")) %>% bind_cols (location) %>% select (LC, FRatioGenotype) locvalue <- as.data.frame (locvalue) #Correlation between location and average location for each genotype #compute genotype mean at each location glcmean1 <- tempa2 %>% group_by (CLT, LC) %>% summarize (glcmean = mean(Trait,na.rm=FALSE)) %>% as.data.frame(select (CLT, LC, glcmean)) #compute genotype mean across all location -average location gmean1 <- tempa2 %>% group_by (CLT ) %>% summarize (gmean = mean(Trait,na.rm=FALSE)) %>% as.data.frame(select (CLT, gmean)) #merge location mean with average location for each genotype lgmean <- glcmean1 %>% left_join(gmean1, by="CLT") %>% arrange(LC) %>% select (-CLT) #compute correlation with level of significance lcgcorr <- lgmean %>% group_by(LC) %>% do(tidy(cor.test(.$glcmean, .$gmean, method = c("pearson")))) lcgcorr1 <- lcgcorr %>% select (LC, Corr_Value = starts_with ("estimate"), Pvalue = starts_with("p.value")) #post process correlation value lcgcorr2 <- lcgcorr1 %>% mutate (SIG_CORR = ifelse(Pvalue <= 0.001, "***", ifelse(Pvalue <= 0.01, "**", ifelse(Pvalue <= 0.05, "*","")))) #concatenate p value symbol with correlation value lcgcorr3 <- lcgcorr2 %>% mutate (LocCorrelation=paste(Corr_Value,SIG_CORR, sep="")) %>% select (LC, LocCorrelation) lcgcorr3 <- as.data.frame(lcgcorr3) 28 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Final location value table for output #compute location mean Locmean <- tempa2 %>% group_by (LC ) %>% summarize (Trait = mean(Trait,na.rm=FALSE))%>% select (LC, Mean = starts_with("Trait")) Locmean <- as.data.frame(Locmean) #merge all location value outputs for print LocationValue <- Locmean %>% inner_join (locvalue, by = "LC") %>% inner_join(lcgcorr3, by = "LC") %>% rename (Location = LC) print(LocationValue) Location CI FL KN SC TX Mean 61.040 100.153 63.786 82.685 27.173 FRatioGenotype 4.1804 2.2579 4.6964 6.8813 2.9966 LocCorrelation 0.95*** 0.86** 0.90*** 0.88*** 0.89*** Where FRatioGenotype = genotype F ratio, LocCorrelation = correlation of location with average location ######################################################################## ### F-value of genotype across environmen ## ######################################################################## #fit model fit_modelen <- tempa2 %>% group_by(ENV) %>% #group regression analysis by location do (model2=lm(Trait~CLT + RP , data=.)) #parameter estimates paramlmen <- as.data.frame(fit_modelen %>% tidy(model2)) glancelmen <- as.data.frame(fit_modelen %>% glance(model2)) augmentlmen <- as.data.frame(fit_modelen %>% augment(model2)) outmseden <- lapply(fit_modelen$model2, anova) #anova output #convert list into data.frame outmseden2 <- as.data.frame(do.call(rbind, outmseden)) #convert rownames into column outmseden2$SOV <- rownames(outmseden2) # drop rownames rownames(outmseden2) <- NULL #remove numeric values from string of rownames using function gsub outmseden2 <- outmseden2 %>% mutate(SOV = gsub("\\d+","",SOV)) 29 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #extract unique environment name and merge to outmseden2 dataset environment <- dst02 %>% select(ENV) %>% distinct (ENV) %>% arrange(ENV) #Extract gentype F value for each location locvalue2 <- outmseden2 %>% filter(SOV == "CLT") %>% select (FRatioGenotype = starts_with("F value")) %>% bind_cols (environment) %>% select (ENV, FRatioGenotype) locvalue2 <- as.data.frame (locvalue2) print(locvalue2) ENV CI-2009 CI-2010 FL-2009 FL-2010 KN-2009 KN-2010 SC-2009 SC-2010 TX-2009 TX-2010 FRatioGenotype 2.4015 3.3665 2.6914 1.7231 1.9999 6.3971 2.8729 8.0454 2.5003 1.4619 Where ENV = environment, FRatioGenotype = genotype F ratio 5.2 Location cluster analysis Hierarchical cluster analysis for location relatedness is computed using hclust() function of R built in stats package (R core team, 2015). The arguments passed to hclust() function include Euclidean distance computed from dist() function and Ward’s method. Function dist() of R built in stats package (R core team, 2015) computes and return the distance matrix between rows of a data matrix. Tree or dendogram of cluster analysis is generated using plot() function, of R built in graphics package (R core team, 2015), as shown in below code. ######################################################################## ### Compute cluster analysis of location ### ######################################################################## #location cluster analysis #Euclidean distance #Ward Hierarchical Clustering #trait mean over location mean_l <- tempa2 %>% group_by (LC ) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_l1 <- as.data.frame(mean_l) clusterdata <- mean_l1 %>% select (Trait) clusterdata <- na.omit(clusterdata) distance <- dist(clusterdata, method = "euclidean") # distance matrix hcluster <- hclust(d=distance, method="ward.D") locationcluster <- plot(hcluster, labels=mean_l1$LC) # display dendogram 6. Final output 30 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. After all computation is over, RGxE clears the Console Window of R studio then saves the output using sink()function along with the system date and time using Sys.time()function of R built in base package (R core team, 2015). RGxE auto saves the output (output name = ‘RGxEOutput’) in folder which is defined in setwd() command in the beginning of the program. Program gives user option to save results in .csv (RGxEOutput.csv) or .txt (RGxEOutput.txt) format. Output, in .txt format, from sample input data generated by RGxE is presented in supplemental material. Additionally, RGxE prints the output in Console Window of R studio. 7. Result interpretation Interpretation of univariate stability statistics is presented in supplemental material. Additionally, studies published on genotype stability (Dia et al., 2016b) and location value (Dia et al., 2016c, in press) used SASGxE program (Dia et al., 2016a, in press), which is equivalent to RGxE program. Thus, these studies can serve as source of RGxE output interpretation. Also, interpretation of RGxE and SASGxE program is available at available at http://cuke.hort.ncsu.edu/cucurbit/wehner/software.html . RGxE program ############################################################################# #Program Name: RGxE #Version: 1.1 #Purpose: It computes various stability analysis statistics for Genotype x # Environment Interaction data # Overall ANOVA with different combination of random and fixed effects; # Regression coefficients - slope and Deviation from regression; # Shukla Sigma; Shukla SSquare; Wricke's Ecovalence; Kang's YS; # Location statistics: Genotype F ratio across location and environment; # correlation of individual location with average location; # Descriptive statistics on genotype, location, environment, year and # replication; # Cluster analysis of location #libraries and packages used: dplyr, tidyr, sqldf, broom, agricolae # (stability.par), cluster, lme4, afex #Developers: Mahendra Dia,Todd C. Wehner and Consuelo Arellano # NC State University, NC, USA #Last Modified: March 11, 2016 ############################################################################# #Install packages install.packages("tidyr") install.packages("dplyr") install.packages ("sqldf") install.packages("lme4") install.packages("afex") install.packages("broom") install.packages("agricolae") #call libraries library (tidyr) library (dplyr) library (sqldf) library(lme4) library(afex) library(broom) library(agricolae) 31 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. library(cluster) library(grDevices) #list current objects ls() #to remove list of all objects from workspace rm(list=ls()) #to identify current directory getwd() #set current directory ###user NEED to replace input data file path### setwd("E:/PhD Research Work/PhD Articles /BLUE_BLUP_Prediction_Interval_SAS_R") #check current directory getwd() ######################################################################## #################### For Windows user ################### ######################################################################## ###################################### ########## Option 1 ################ ###################################### #import input data ###user NEED to replace input data file name### tempa <-read.csv("RGxEInputData2_2016_02_15.csv", header = TRUE) ###################################### ########## Option 2 ################ ###################################### #opens pop up window to select file #file.name <- file.choose() #read csv file #tempa <-read.csv(file.name) ######################################################################## #################### For Mac user ################### ######################################################################## #file.name <- file.choose() #file.name #copy path from console and paste here #file.name <- "/Users/toddwehner/Desktop/RCodeCurrent/GxeR14Data4.csv" #create object path for where the data will go at the end # I append the file name with "_out" so I know its data coming out of R #out.file <- "/Users/toddwehner/Desktop/RCodeCurrent/GxeR14Out4.csv" #tempa <-read.csv(file.name) ######################################################################## 32 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. ###### Define trait that need to be analyzed ####### ######################################################################## #user need to define trait name for analysis #example: MKMGHA = Marketable mega-gram per hectare tempa <- tempa %>% rename(Trait = Trait) %>% select(YR, LC, RP, CLT, Trait) #view top 6 rows of input data head(tempa) #view bottom 6 rows of input data tail(tempa) #Get dimension of data dim(tempa) #Get data type class(tempa) #Get structure of data (character vs. numeric vs. matrix vs. vector vs. factor) #make sure: numeric value should be numeric str(tempa) #validate the variable types tempa$YR <- as.factor(tempa$YR) tempa$RP <- as.factor(tempa$RP) tempa$LC <- as.factor(tempa$LC) tempa$CLT <- as.factor(tempa$CLT) tempa$Trait <- as.numeric(tempa$Trait) ######################################################################## ## ANOVA: Compute analysis of variance ## ######################################################################## #Generate unique id for replication for anova tempa$RPid<-as.factor(paste(tempa$YR, tempa$LC, tempa$RP, sep=".")) ######################################################################## ### ANOVA Case 1: CLT, YR, LC and RP - All Random ### ######################################################################## #full model fit.f1<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary1 <- summary(fit.f1) #variance of random factors 33 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. variance <- as.data.frame(summary1$varcor) #drop rownames rownames(variance) <- NULL variance1 <- variance %>% select (-var1, -var2) %>% rename(sov=grp, Variance=vcov, stddev=sdcor) #Type 3 test of hypothesis #Type III Wald chisquare tests anova(fit.f1, type="III") #Type 2 test of hypothesis anova(fit.f1, type="marginal", test="F") #model fitness anovacase1 <- plot(fit.f1, main="Model fitness Case 1: CLT, YR, LC and RP - All Random", xlab="Predicated Value", ylab="Residual") #LRT - likelihood ratio test for computing significance of random effect #create function for Likelihood ratio test #a=outputdatasetname; example-anova1r #b=full model name; example-fit.f1 #c=reduced model name; example-fit.f1r #d=effect name; example- "RPid", NOTE: call it in quotation anova_lrt <- function (a,b,c,d){ #level of significance a <-anova(b,c) #convert anova into data frame a <- data.frame(a) #convert rownames into column a$name <- rownames(a) # drop rownames rownames(a) <- NULL a <- a %>% filter(name=="b") %>% mutate(sov=d) %>% select(sov, Pr_Chisq = starts_with("Pr..Chisq.")) # return the result return(a) } #null model for YR fit.f1y<-lmer(Trait~ 1 + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1y <- anova_lrt(anova1y,fit.f1,fit.f1y,"YR") #null model for LC fit.f1l<-lmer(Trait~ 1 + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt 34 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. anova1l <- anova_lrt(anova1l,fit.f1,fit.f1l,"LC") #null model for CLT fit.f1c<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1c <- anova_lrt(anova1c,fit.f1,fit.f1c,"CLT") #null model for YR:LC fit.f1yl<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1yl <- anova_lrt(anova1yl,fit.f1,fit.f1yl,"YR:LC") #null model for YR:CLT fit.f1yc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1yc <- anova_lrt(anova1yc,fit.f1,fit.f1yc,"YR:CLT") #null model for LC:CLT fit.f1lc<-lmer(Trait~ 1 + (1|YR) (1|YR:CLT) + (1|YR:LC:CLT) + + (1|LC) + (1|CLT) + (1|YR:LC) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1lc <- anova_lrt(anova1lc,fit.f1,fit.f1lc,"LC:CLT") #null model for YR:LC:CLT fit.f1ylc<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova1ylc <- anova_lrt(anova1ylc,fit.f1,fit.f1ylc,"YR:LC:CLT") #null model for RP fit.f1r<-lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) , data=tempa) #level of significance #call function anova_lrt anova1r <- anova_lrt(anova1ylr,fit.f1,fit.f1r,"RPid") 35 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Merge anova and level of significance anova1 <- bind_rows(anova1y, anova1l)%>% bind_rows(anova1c)%>% bind_rows(anova1yl)%>% bind_rows(anova1yc)%>% bind_rows(anova1r)%>% bind_rows(anova1lc)%>%bind_rows(anova1ylc) anova1 <- as.data.frame(anova1) #Merge final output anova_randall <- variance1%>% left_join(anova1 , by ="sov") anova_randall$Pr_Chisq[anova_randall$stddev == 0] <- NA #Compute BLUP for CLT #BLUP - Best linear unbiased predictor randeffect1 <- ranef(fit.f1) #BLUP for clt BLUP_CLT <- as.data.frame(randeffect1$CLT) #convert rownames into column BLUP_CLT$genotype <- rownames(BLUP_CLT) #drop rownames rownames(BLUP_CLT) <- NULL #rename variable name BLUP_CLT <- BLUP_CLT %>% select(genotype,Blup = starts_with("(Intercept)")) #return estimate of fixed effect from full model summary to compute BLUP fixestimate1 <- as.data.frame(summary1$coefficients) #compute BLUP value BLUP_CLT1 <- BLUP_CLT %>% mutate(Blup = Blup + fixestimate1$Estimate) #final output for BLUP for cultivars BLUP_CLT1 <- as.data.frame(BLUP_CLT1) ######################################################################## ## ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random ## ######################################################################## #full model fit.f2<-lmer(Trait~ YR*LC*CLT + (1|RPid), data=tempa) #model summary summary2 <- summary(fit.f2) ##variance of random factors variance2 <- as.data.frame(summary2$varcor) #drop rownames 36 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. rownames(variance2) <- NULL variance2 <- variance2 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f2t3 <- as.data.frame(anova(fit.f2, type="III")) #convert rownames into column anova.f2t3$name <- rownames(anova.f2t3) #drop rownames rownames(anova.f2t3) <- NULL #Type 2 test of hypothesis anova.f2t2 <- anova(fit.f2, type="marginal", test="F") #model fitness anovacase2 <- plot(fit.f2, main="Model fitness Case 2: CLT, YR and LC - Fixed; RP Random", xlab="Predicated Value", ylab="Residual") #level of significance: "KR" is implemented corresponding to #the Kenward-Rogers approximation for degrees of freedom anova_f2t3 <- mixed(Trait~ YR*LC*CLT + (1|RPid), data=tempa) anova_f2t3 <- as.data.frame(anova_f2t3$anova_table) #convert rownames into column anova_f2t3$name <- rownames(anova_f2t3) #drop rownames rownames(anova_f2t3) <- NULL anova_f2t3 <- anova_f2t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output for case 2: CLT, YR and LC - Fixed anova_f2 <- anova.f2t3 %>% left_join(anova_f2t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) anova_f2 <- anova_f2 %>% bind_rows(variance2) anova_f2a <- as.data.frame(anova_f2) # print for output ######################################################################## ## ANOVA CASE 3: CLT - Fixed; YR, LC and RP - Random ## ######################################################################## fit.f3<-lmer(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) 37 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #model summary summary3 <- summary(fit.f3) ##variance of random factors variance3 <- as.data.frame(summary3$varcor) #drop rownames rownames(variance3) <- NULL variance3 <- variance3 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f3t3 <- as.data.frame(anova(fit.f3, type="III")) #convert rownames into column anova.f3t3$name <- rownames(anova.f3t3) #drop rownames rownames(anova.f3t3) <- NULL #Type 2 test of hypothesis anova.f3t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase3 <- plot(fit.f3, main="Model fitness Case 3: CLT - Fixed; YR, LC and RP Random", xlab="Predicated Value", ylab="Residual") #p value for fixed effects anova_f3t3 <- mixed(Trait~ 1 + (1|YR) + (1|LC) + CLT + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f3t3 <- as.data.frame(anova_f3t3$anova_table) #convert rownames into column anova_f3t3$name <- rownames(anova_f3t3) #drop rownames rownames(anova_f3t3) <- NULL anova_f3t3 <- anova_f3t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 3 for fixed effect - CLT anova_f3 <- anova.f3t3 %>% left_join(anova_f3t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance for random effects 38 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #null model for CLT fit.f3c <- lmer(Trait~ 1 + (1|YR) + (1|LC) + (1|YR:LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3c <- anova_lrt(anova3c,fit.f3,fit.f3c,"CLT") #null model for LC fit.f3l <- lmer(Trait~ 1 + CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3l <- anova_lrt(anova3l,fit.f3,fit.f3l,"LC") #null model for YR fit.f3y <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3y <- anova_lrt(anova3y,fit.f3,fit.f3y,"YR") #null model for YR:LC fit.f3yl <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3yl <- anova_lrt(anova3yl,fit.f3,fit.f3yl,"YR:LC") #null model for YR:CLT fit.f3yc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3yc <- anova_lrt(anova3yc,fit.f3,fit.f3yc,"YR:CLT") #null model for LC:CLT fit.f3lc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3lc <- anova_lrt(anova3lc,fit.f3,fit.f3lc,"LC:CLT") #null model for YR:LC:CLT fit.f3ylc <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + 39 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova3ylc <- anova_lrt(anova3ylc,fit.f3,fit.f3ylc,"YR:LC:CLT") #null model for RPid or replication fit.f3r <- lmer(Trait~ 1 + CLT + (1|LC) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT), data=tempa) #level of significance #call function anova_lrt anova3r <- anova_lrt(anova3r,fit.f3,fit.f3r,"RPid") #Merge anova and level of significance anova3 <- bind_rows(anova3y, anova3l)%>% bind_rows(anova3yl)%>% bind_rows(anova3yc)%>% bind_rows(anova3r)%>% bind_rows(anova3lc)%>%bind_rows(anova3ylc) anova3 <- as.data.frame(anova3) #Merge final output anova_cfix <- variance3%>% left_join(anova3 , by ="sov") anova_cfix$Pr_Chisq[anova_cfix$stddev == 0] <- NA ######################################################################## ## ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random ## ######################################################################## fit.f4 <- lmer(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary4 <- summary(fit.f4) ##variance of random factors variance4 <- as.data.frame(summary4$varcor) # drop rownames rownames(variance4) <- NULL variance4 <- variance4 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f4t3 <- as.data.frame(anova(fit.f4, type="III")) #convert rownames into column anova.f4t3$name <- rownames(anova.f4t3) #drop rownames rownames(anova.f4t3) <- NULL 40 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Type 2 test of hypothesis anova.f4t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase4 <- plot(fit.f4, main="Model fitness Case 4: LC - Fixed; YR, CLT and RP Random", xlab="Predicated Value", ylab="Residual") #level of significance for fixed effects anova_f4t3 <- mixed(Trait~ 1 + LC + (1|YR) + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f4t3 <- as.data.frame(anova_f4t3$anova_table) #convert rownames into column anova_f4t3$name <- rownames(anova_f4t3) #drop rownames rownames(anova_f4t3) <- NULL anova_f4t3 <- anova_f4t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 4 for fixed effect - LC anova_f4 <- anova.f4t3 %>% left_join(anova_f4t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance of random effect #null model for CLT fit.f4c <- lmer(Trait~ 1 + LC + (1|YR) + (1|YR:LC) + (1|YR:LC) + (1|YR:CLT)+ (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4c <- anova_lrt(anova4c,fit.f4,fit.f4c,"CLT") #null model for LC fit.f4l <- lmer(Trait~ 1 + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4l <- anova_lrt(anova4l,fit.f4,fit.f4l,"LC") #null model for YR fit.f4y <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) 41 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #level of significance #call function anova_lrt anova4y <- anova_lrt(anova4y,fit.f4,fit.f4y,"YR") #null model for YR:LC fit.f4yl <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4yl <- anova_lrt(anova4yl,fit.f4,fit.f4yl,"YR:LC") #null model for YR:CLT fit.f4yc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|LC:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4yc <- anova_lrt(anova4yc,fit.f4,fit.f4yc,"YR:CLT") #null model for LC:CLT fit.f4lc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4lc <- anova_lrt(anova4lc,fit.f4,fit.f4lc,"LC:CLT") #null model for YR:LC:CLT fit.f4ylc <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova4ylc <- anova_lrt(anova4ylc,fit.f4,fit.f4ylc,"YR:LC:CLT") #null model for RPid or replication fit.f4r <- lmer(Trait~ 1 + LC + (1|CLT) + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|LC:CLT) + (1|YR:LC:CLT), data=tempa) #level of significance #call function anova_lrt anova4r <- anova_lrt(anova4r,fit.f4,fit.f4r,"RPid") #Merge anova and level of significance anova4 <- bind_rows(anova4c, anova4y)%>% bind_rows(anova4yl)%>% bind_rows(anova4yc)%>% bind_rows(anova4r)%>% bind_rows(anova4lc)%>%bind_rows(anova4ylc) 42 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. anova4 <- as.data.frame(anova4) #Merge final output anova_lfix <- variance4%>% left_join(anova4 , by ="sov") anova_lfix$Pr_Chisq[anova_lfix$stddev == 0] <- NA ######################################################################## ## ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random ## ######################################################################## fit.f5 <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #model summary summary5 <- summary(fit.f5) ##variance of random factors variance5 <- as.data.frame(summary5$varcor) #drop rownames rownames(variance5) <- NULL variance5 <- variance5 %>% select (-var1, -var2) %>% rename(sov=grp, Mean_Sq=vcov, stddev=sdcor) #Type 3 test of hypothesis anova.f5t3 <- as.data.frame(anova(fit.f5, type="III")) #convert rownames into column anova.f5t3$name <- rownames(anova.f5t3) #drop rownames rownames(anova.f5t3) <- NULL #Type 2 test of hypothesis anova.f5t2 <- anova(fit.f3, type="marginal", test="F") #model fitness anovacase5 <- plot(fit.f5, main="Model fitness Case 5: CLT and LC - Fixed; YR and RP - Random", xlab="Predicated Value", ylab="Residual") #level of significance for fixed effects anova_f5t3 <- mixed(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) anova_f5t3 <- as.data.frame(anova_f5t3$anova_table) #convert rownames into column anova_f5t3$name <- rownames(anova_f5t3) #drop rownames 43 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. rownames(anova_f5t3) <- NULL anova_f5t3 <- anova_f5t3 %>% select(name, Prob_F = starts_with("Pr(>F)")) #final output of case 5 for fixed effect - CLT and LC anova_f5 <- anova.f5t3 %>% left_join(anova_f5t3, by = "name")%>% rename(sov = name)%>% select(sov, Df,Sum_Sq=starts_with("Sum Sq"), Mean_Sq = starts_with("Mean Sq"), F_value=starts_with("F value"), Prob_F) #level of significance for random effects #null model for CLT fit.f5c <- lmer(Trait~ 1 + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5c <- anova_lrt(anova5c,fit.f5,fit.f5c,"CLT") #null model for LC fit.f5l <- lmer(Trait~ 1 + CLT + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5l <- anova_lrt(anova5l,fit.f5,fit.f5l,"LC") #null model for YR fit.f5y <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5y <- anova_lrt(anova5y,fit.f5,fit.f5y,"YR") #null model for YR:LC fit.f5yl <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5yl <- anova_lrt(anova5yl,fit.f5,fit.f5yl,"YR:LC") #null model for YR:CLT fit.f5yc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5yc <- anova_lrt(anova5yc,fit.f5,fit.f5yc,"YR:CLT") 44 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #null model for LC:CLT fit.f5lc <- lmer(Trait~ 1 + CLT + LC + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5lc <- anova_lrt(anova5lc,fit.f5,fit.f5lc,"LC:CLT") #null model for YR:LC:CLT fit.f5ylc <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|RPid), data=tempa) #level of significance #call function anova_lrt anova5ylc <- anova_lrt(anova5ylc,fit.f5,fit.f5ylc,"YR:LC:CLT") #null model for RPid or replication fit.f5r <- lmer(Trait~ 1 + CLT + LC + LC:CLT + (1|YR) + (1|YR:LC) + (1|YR:CLT) + (1|YR:LC:CLT) , data=tempa) #level of significance #call function anova_lrt anova5r <- anova_lrt(anova5r,fit.f5,fit.f5r,"RPid") #Merge anova and level of significance anova5 <- bind_rows(anova5y, anova5yl)%>% bind_rows(anova5yc)%>% bind_rows(anova5r)%>% bind_rows(anova5ylc) anova5 <- as.data.frame(anova5) #Merge final output anova_clfix <- variance5%>% left_join(anova5 , by ="sov") anova_clfix$Pr_Chisq[anova_clfix$stddev == 0] <- NA ######################################################################## # Compute Mean and CV of genotype, location, year, rep, environment # ######################################################################## #Compute environment - Location by year combination tempa2 <- tempa %>% mutate (ENV = paste(LC,YR, sep='-')) %>% #remove missing records na.omit() head(tempa2) #validate data tempa2$YR <- as.factor(tempa2$YR) tempa2$RP <- as.factor(tempa2$RP) tempa2$LC <- as.factor(tempa2$LC) tempa2$CLT <- as.factor(tempa2$CLT) tempa2$ENV <- as.factor(tempa2$ENV) tempa2$Trait <- as.numeric(tempa2$Trait) 45 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #compute descritptive statistics #trait mean over genotype and environment mean_ge <- tempa2 %>% group_by (CLT, ENV) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_ge1 <- mean_ge %>% spread (ENV, Trait) #transpose using library tidyr mean_ge2 <- as.data.frame(mean_ge1) #trait mean over genotype and year mean_gy <- tempa2 %>% group_by (CLT, YR) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_gy1 <- mean_gy %>% spread (YR, Trait) #transpose using library tidyr mean_gy2 <- as.data.frame(mean_gy1) #trait sum over genotype and year sum_gy <- tempa2 %>% group_by (CLT, YR) %>% summarize (Trait = sum(Trait,na.rm=FALSE)) sum_gy1 <- sum_gy %>% spread (YR, Trait) #transpose using library tidyr sum_gy2 <- as.data.frame(sum_gy1) #trait standard deviation (sd) over genotype and year sd_gy <- tempa2 %>% group_by (CLT, YR) %>% summarize (Trait = sd(Trait,na.rm=FALSE)) sd_gy1 <- sd_gy %>% spread (YR, Trait) #transpose using library tidyr sd_gy2 <- as.data.frame(sd_gy1) #trait mean over genotype and location mean_gl <- tempa2 %>% group_by (CLT, LC) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_gl1 <- mean_gl %>% spread (LC, Trait) #transpose using library tidyr mean_gl2 <- as.data.frame(mean_gl1) #trait sum over genotype and location sum_gl <- tempa2 %>% group_by (CLT, LC) %>% summarize (Trait = sum(Trait,na.rm=FALSE)) sum_gl1 <- sum_gl %>% spread (LC, Trait) #transpose using library tidyr 46 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. sum_gl2 <- as.data.frame(sum_gl1) #trait standard deviation (sd) over genotype and location sd_gl <- tempa2 %>% group_by (CLT, LC) %>% summarize (Trait = sd(Trait,na.rm=FALSE)) sd_gl1 <- sd_gl %>% spread (LC, Trait) #transpose using library tidyr sd_gl2 <- as.data.frame(sd_gl1) #trait mean over genotype, location and year mean_gly <- tempa2 %>% group_by (CLT, LC, YR) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_gly1 <- mean_gly %>% spread (LC, Trait) #transpose using library tidyr mean_gly2 <- as.data.frame(mean_gly1) #trait sum over genotype, location and year sum_gly <- tempa2 %>% group_by (CLT, LC, YR) %>% summarize (Trait = sum(Trait,na.rm=FALSE)) sum_gly1 <- sum_gly %>% spread (LC, Trait) #transpose using library tidyr sum_gly2 <- as.data.frame(sum_gly1) #trait standard deviation (sd) over genotype, location and year sd_gly <- tempa2 %>% group_by (CLT, LC, YR) %>% summarize (Trait = sd(Trait,na.rm=FALSE)) sd_gly1 <- sd_gly %>% spread (LC, Trait) #transpose using library tidyr sd_gly2 <- as.data.frame(sd_gly1) #trait mean over genotype, location and rep mean_glr <- tempa2 %>% group_by (CLT, LC, RP) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_glr1 <- mean_glr %>% spread (LC, Trait) #transpose using library tidyr mean_glr2 <- as.data.frame(mean_glr1) #trait mean over location and year mean_ly <- tempa2 %>% group_by (LC, YR) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) 47 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. mean_ly1 <- mean_ly %>% spread (YR, Trait) #transpose using library tidyr mean_ly2 <- as.data.frame(mean_ly1) #trait mean over location and replication mean_lr <- tempa2 %>% group_by (LC, RP) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_lr1 <- mean_lr %>% spread (RP, Trait) #transpose using library tidyr mean_lr2 <- as.data.frame(mean_lr1) #trait count, min, max, mean, sum, median, var, sd over location mean_l <- tempa2 %>% group_by (LC ) %>% summarize (count = n(), min = min(Trait,na.rm=FALSE), max = max(Trait,na.rm=FALSE), mean = mean(Trait,na.rm=FALSE), sum = sum(Trait,na.rm=FALSE), median = median(Trait,na.rm=FALSE), var = var(Trait,na.rm=FALSE), sd = sd(Trait,na.rm=FALSE)) stat_l1 <- as.data.frame(mean_l) #trait count, min, max, mean, sum, median, var, sd over year mean_y <- tempa2 %>% group_by (YR ) %>% summarize (count = n(), min = min(Trait,na.rm=FALSE), max = max(Trait,na.rm=FALSE), mean = mean(Trait,na.rm=FALSE), sum = sum(Trait,na.rm=FALSE), median = median(Trait,na.rm=FALSE), var = var(Trait,na.rm=FALSE), sd = sd(Trait,na.rm=FALSE)) stat_y1 <- as.data.frame(mean_y) #trait count, min, max, mean, sum, median, var, sd over genotype mean_g <- tempa2 %>% group_by (CLT ) %>% summarize (count = n(), min = min(Trait,na.rm=FALSE), max = max(Trait,na.rm=FALSE), mean = mean(Trait,na.rm=FALSE), sum = sum(Trait,na.rm=FALSE), median = median(Trait,na.rm=FALSE), var = var(Trait,na.rm=FALSE), sd = sd(Trait,na.rm=FALSE)) stat_g1 <- as.data.frame(mean_g) #trait count, min, max, mean, sum, median, var, sd over environment mean_e <- tempa2 %>% 48 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. group_by (ENV) %>% summarize (count = n(), min = min(Trait,na.rm=FALSE), max = max(Trait,na.rm=FALSE), mean = mean(Trait,na.rm=FALSE), sum = sum(Trait,na.rm=FALSE), median = median(Trait,na.rm=FALSE), var = var(Trait,na.rm=FALSE), sd = sd(Trait,na.rm=FALSE)) stat_e1 <- as.data.frame(mean_e) #trait CV over genotype and location #CV = (standard deviation /mean) *100 cv_gl <- tempa2 %>% group_by (CLT, LC) %>% summarise (Trait_m = mean(Trait,na.rm=FALSE) , Trait_s = sd(Trait,na.rm=FALSE)) %>% mutate (Trait = (Trait_s/Trait_m)*100) %>% # CV select (-Trait_m, -Trait_s) cv_gl1 <- cv_gl %>% spread (LC, Trait) #transpose using library tidyr cv_gl2 <- as.data.frame(cv_gl1) ######################################################################## ## Compute univariate stability statistics - regression analysis ## ######################################################################## #Compute regression (slope) and deviation from regression #compute environmental index dsterm <- tempa2 %>% group_by (ENV, RP, YR, LC) %>% summarize (ENVTrait = mean(Trait,na.rm=FALSE)) dst02 <- tempa2 %>% left_join(dsterm, by=c("ENV", "RP")) %>% #Left join on multiple columns arrange (CLT) %>% rename (YR= YR.x, LC = LC.x ) #fit model fit_model <- dst02 %>% group_by(CLT) %>% #group regression analysis by cultivar do (model=lm(Trait~ENVTrait + ENV + RP, data=.)) #parameter estimates paramlm <- as.data.frame(fit_model %>% tidy(model)) glancelm <- as.data.frame(fit_model %>% glance(model)) augmentlm <- as.data.frame(fit_model %>% augment(model)) outmsed <- lapply(fit_model$model, anova) #anova output outmsed2 <- as.data.frame(do.call(rbind, outmsed)) #convert list into data.frame #convert rownames into column outmsed2$SOV <- rownames(outmsed2) # drop rownames 49 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. rownames(outmsed2) <- NULL #remove numeric values from string of rownames using function gsub outmsed2 <- outmsed2 %>% mutate(SOV = gsub("\\d+","",SOV)) #extract unique cultivar name and merge to outmsed2 dataset genotypes <- dst02 %>% select(CLT) %>% distinct (CLT) %>% arrange(CLT) #Stack 4 times to match number of rows with outmsed2 dataset genotypes1 <- genotypes %>% bind_rows(genotypes) %>% bind_rows(genotypes) %>% bind_rows(genotypes) %>% arrange(CLT) #attach list of cultivars to outmsed2 outmsed3 <- as.data.frame(outmsed2 %>% bind_cols(genotypes1)) #transpose outmsed3 outmsed4 <- outmsed3 %>% select (CLT, SOV, MS = starts_with("Mean")) %>% filter (SOV != "RP") #rename variables #transpose MS values MSDS <- outmsed4 %>% spread (SOV, MS) %>% #transpose using library tidyr arrange (CLT) #Transpose degress of freedoms for F-test FDS3 <- outmsed3 %>% filter (SOV != "RP")%>% select (CLT, SOV, Df) %>% spread (SOV, Df) %>% #transpose using library tidyr rename (DF_ENVTrait = ENVTrait, DF_Residuals = Residuals, DF_ENV = ENV) #Subset parameters - paramlm dataset REGCOEFGS <- paramlm %>% filter (term == "ENVTrait") %>% select (-statistic, -p.value) #Test and level of significance of regression and deviation from regression #Merge MSDS, FDS3, REGCOEFGS slope <- MSDS %>% inner_join (REGCOEFGS, by = "CLT") %>% inner_join (FDS3, by = "CLT") %>% rename (MSE = Residuals, LREGMS=ENVTrait, DEVLMS = ENV, BI= estimate, STDERR = std.error) #test significance levels slope1 <- slope %>% mutate (T_H01 = (BI-1)/STDERR , #Null Hypothesis for slope = 1 PT_H01 = 2*pt(-abs(T_H01), DF_Residuals), F_DEVREG=DEVLMS/MSE, #NULL HYPOTHESIS: PREDICTED-ACTUAL = 0 PF_HO0= 1-pf(F_DEVREG, DF_ENV, DF_Residuals)) #add legend for level of significance 50 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. slope2 <- slope1 %>% mutate (SIG_SLOPE = ifelse(PT_H01 <= 0.001, "***", ifelse(PT_H01 <= 0.01, "**", ifelse(PT_H01 <= 0.05, "*","")))) %>% mutate (SIG_DEVREG = ifelse(PF_HO0 <= 0.001, "***", ifelse(PF_HO0 <= 0.01, "**", ifelse(PF_HO0 <= 0.05, "*","")))) #final regression output options(digits=5) univariate2 <- slope2 %>% mutate (SLOPE = paste(BI,SIG_SLOPE, sep=""), DEVREG = paste(DEVLMS,SIG_DEVREG, sep="") ) %>% select (CLT,SLOPE, DEVREG ) ######################################################################## # Compute univariate stability statistics - shukla, ecovalence, YS # ######################################################################## #Compute Shukla, Wricke Ecovalense, Kangs YS repno <- tempa2 %>% summarise (total_rep = n_distinct(RP)) #count total number of rep dstgl <- tempa2 %>% group_by (CLT, LC) %>% #Summarize genotype performance across locations summarize (Trait = mean(Trait,na.rm=FALSE)) dstgl1 <- dstgl %>% spread (LC, Trait) #transpose values #convert into data frame so that row containing structure information is deleted dstgl2 <- as.data.frame(dstgl1) #create rownames rownames(dstgl2) <- dstgl2[ ,1] shukla <- dstgl2[,-1] #compute MS error term tempa3 <- glm(Trait ~ LC + YR + LC:YR + RP %in% (LC:YR) + CLT + CLT:LC + CLT:YR + CLT:LC:YR, family = gaussian , data= tempa2 ) #model summary summary1 <- summary.glm(tempa3) #Error SS error_ss <- as.data.frame(summary1$deviance) error_ss1 <- error_ss %>% #rename variable select (Deviance = starts_with("summary"), everything()) #Error DF error_df <- as.data.frame(summary1$df.residual) error_df1 <- error_df %>% 51 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #rename variable select (Df = starts_with("summary"), everything()) #MS of error mse1 <- as.data.frame(error_ss1/error_df1) mse <- mse1 %>% #rename variable rename (MS = Deviance) # MSError is used populated from ANOVA univariate1a <- stability.par(shukla, rep= repno$total_rep , MSerror=mse$MS, alpha=0.1, main="Genotype") #pool results into individual columns univariate1b <- univariate1a$statistics #create column genotype from rownames univariate1b$genotype <- rownames(univariate1b) rownames(univariate1b) = NULL #remove rownames names(univariate1b)[3] <- "significane_sigma" # rename duplicate name dot names(univariate1b)[5] <- "significane_s2" # rename duplicate name dot names(univariate1b)[2] <- "sigma" # rename names(univariate1b)[4] <- "ssquare" # rename univariate1c <- univariate1a$stability #create column genotype from rownames univariate1c$genotype <- rownames(univariate1c) rownames(univariate1c) = NULL #remove rownames names(univariate1c)[8] <- "legend" # rename variable ... to legend #Merge univariate1d <- univariate1b %>% inner_join (univariate1c , by = "genotype") %>% # deselect all columns between Yield and Stab.rating select (-Yield: -Stab.rating) %>% # arrange the column order for final output select (CLT=genotype, Mean, sigma, significane_sigma, ssquare, significane_s2, Ecovalence,YSi, legend) #Final stability statistics #Merge Univariate2 and Univariate1d univariate <- univariate2 %>% inner_join(univariate1d, by = "CLT") %>% mutate (SIGMA=paste(sigma,significane_sigma, ""), SIGMA_SQUARE=paste(ssquare,significane_s2, ""), YS_Kang =paste(YSi,legend, "")) %>% select (Genotype = CLT, Mean, SLOPE, DEVREG, SIGMA, SIGMA_SQUARE,Ecovalence, YS_Kang) ######################################################################## ## Compute location statistics - genotype F ratio across ## ## location and environment; location correlation ## ######################################################################## 52 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #Location values #F-value of genotype across location #fit model fit_modellc <- tempa2 %>% group_by(LC) %>% #group regression analysis by location do (model1=lm(Trait~CLT + YR + CLT:YR + RP%in%YR , data=.)) #parameter estimates paramlmlc <- as.data.frame(fit_modellc %>% tidy(model1)) glancelmlc <- as.data.frame(fit_modellc %>% glance(model1)) augmentlmlc <- as.data.frame(fit_modellc %>% augment(model1)) outmsedlc <- lapply(fit_modellc$model1, anova) #anova output #convert list into data.frame outmsedlc2 <- as.data.frame(do.call(rbind, outmsedlc)) #convert rownames into column outmsedlc2$SOV <- rownames(outmsedlc2) #drop rownames rownames(outmsedlc2) <- NULL #remove numeric values from string of rownames using function gsub outmsedlc2 <- outmsedlc2 %>% mutate(SOV = gsub("\\d+","",SOV)) #extract unique location name and merge to outmsedlc2 dataset location <- dst02 %>% select(LC) %>% distinct (LC) %>% arrange(LC) #Extract gentype F value for each location locvalue <- outmsedlc2 %>% filter(SOV == "CLT") %>% select (FRatioGenotype = starts_with("F value")) %>% bind_cols (location) %>% select (LC, FRatioGenotype) locvalue <- as.data.frame (locvalue) #Correlation between location and average location for each genotype #compute genotype mean at each location glcmean1 <- tempa2 %>% group_by (CLT, LC) %>% summarize (glcmean = mean(Trait,na.rm=FALSE)) %>% as.data.frame(select (CLT, LC, glcmean)) #compute genotype mean across all location -average location gmean1 <- tempa2 %>% group_by (CLT ) %>% summarize (gmean = mean(Trait,na.rm=FALSE)) %>% as.data.frame(select (CLT, gmean)) #merge location mean with average location for each genotype lgmean <- glcmean1 %>% 53 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. left_join(gmean1, by="CLT") %>% arrange(LC) %>% select (-CLT) #compute correlation with level of significance lcgcorr <- lgmean %>% group_by(LC) %>% do(tidy(cor.test(.$glcmean, .$gmean, method = c("pearson")))) lcgcorr1 <- lcgcorr %>% select (LC, Corr_Value = starts_with ("estimate"), Pvalue = starts_with("p.value")) #post process correlation value lcgcorr2 <- lcgcorr1 %>% mutate (SIG_CORR = ifelse(Pvalue <= 0.001, "***", ifelse(Pvalue <= 0.01, "**", ifelse(Pvalue <= 0.05, "*","")))) #concatenate p value symbol with correlation value lcgcorr3 <- lcgcorr2 %>% mutate (LocCorrelation=paste(Corr_Value,SIG_CORR, sep="")) %>% select (LC, LocCorrelation) lcgcorr3 <- as.data.frame(lcgcorr3) #Final location value table for output #compute location mean Locmean <- tempa2 %>% group_by (LC ) %>% summarize (Trait = mean(Trait,na.rm=FALSE))%>% select (LC, Mean = starts_with("Trait")) Locmean <- as.data.frame(Locmean) #merge all location value outputs for print LocationValue <- Locmean %>% inner_join (locvalue, by = "LC") %>% inner_join(lcgcorr3, by = "LC") %>% rename (Location = LC) ######################################################################## ### F-value of genotype across environmen ## ######################################################################## #fit model fit_modelen <- tempa2 %>% group_by(ENV) %>% #group regression analysis by location do (model2=lm(Trait~CLT + RP , data=.)) #parameter estimates paramlmen <- as.data.frame(fit_modelen %>% tidy(model2)) glancelmen <- as.data.frame(fit_modelen %>% glance(model2)) augmentlmen <- as.data.frame(fit_modelen %>% augment(model2)) outmseden <- lapply(fit_modelen$model2, anova) #anova output #convert list into data.frame 54 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. outmseden2 <- as.data.frame(do.call(rbind, outmseden)) #convert rownames into column outmseden2$SOV <- rownames(outmseden2) # drop rownames rownames(outmseden2) <- NULL #remove numeric values from string of rownames using function gsub outmseden2 <- outmseden2 %>% mutate(SOV = gsub("\\d+","",SOV)) #extract unique environment name and merge to outmseden2 dataset environment <- dst02 %>% select(ENV) %>% distinct (ENV) %>% arrange(ENV) #Extract gentype F value for each location locvalue2 <- outmseden2 %>% filter(SOV == "CLT") %>% select (FRatioGenotype = starts_with("F value")) %>% bind_cols (environment) %>% select (ENV, FRatioGenotype) locvalue2 <- as.data.frame (locvalue2) ######################################################################## ### Compute cluster analysis of location ### ######################################################################## #location cluster analysis #Euclidean distance #Ward Hierarchical Clustering #trait mean over location mean_l <- tempa2 %>% group_by (LC ) %>% summarize (Trait = mean(Trait,na.rm=FALSE)) mean_l1 <- as.data.frame(mean_l) clusterdata <- mean_l1 %>% select (Trait) clusterdata <- na.omit(clusterdata) distance <- dist(clusterdata, method = "euclidean") # distance matrix hcluster <- hclust(d=distance, method="ward.D") locationcluster <- plot(hcluster, labels=mean_l1$LC) # display dendogram ######################################################################## #### Clear console #### ######################################################################## #clear console cat("\014") ######################################################################## ### Export output in .CSV or .TXT ### ### Print final output in console ### ######################################################################## #user can turn on and off .csv or .txt file by commenting/uncommenting 55 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. #codes in below 2 lines #sink(file="RGxEOutput.csv", append=FALSE, split=TRUE) sink(file="RGxEOutput.txt", append=FALSE, split=TRUE) "" "" "############# Print date and time #################" "" print(Sys.time()) "" "########################################################################" "### Section 1: Identify level of significance of ###" "### different effects, variances and BLUP ###" "########################################################################" "" "" "########################################################################" "### ANOVA - ANALYSIS OF VARIANCE ####" "########################################################################" "" "########################################################################" "## ANOVA Case 1: CLT, YR, LC and RP - All Random ##" "########################################################################" "" "##Print ANOVA" "#Note: P values are generated using LRT-Likelihood Ratio Test via " "#model comparison and anova" "" print(anova_randall, row.names = FALSE) "" "##Print BLUP-Best linear unbiased predictor value for random genotypes" "" print(BLUP_CLT1, row.names = FALSE) "" "########################################################################" "## ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random ##" "########################################################################" "" "##Print ANOVA" "#Note: P values are computed for F ratio using KR method " "#Kenward-Rogers approximation for degrees of freedom" "" print(anova_f2a, row.names = FALSE) "" "########################################################################" "## ANOVA Case 3: CLT - Fixed; YR, LC and RP - Random ##" "########################################################################" "" "##Print ANOVA" "#Note: For fixed effects P values are computed for F ratio using " "#KR method Kenward-Rogers approximation for degrees of freedom" "" print(anova_f3, row.names = FALSE) "" "#For random effect P values are generated using LRT-Likelihood Ratio Test " "#via model comparison and anova" "" print(anova_cfix, row.names = FALSE) "" "########################################################################" "## ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random ##" "########################################################################" "" "##Print ANOVA" 56 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. "#Note: For fixed effects P values are computed for F ratio using" "#KR method Kenward-Rogers approximation for degrees of freedom" "" print(anova_f4, row.names = FALSE) "" "#For random effect P values are generated using LRT-Likelihood Ratio Test" "#via model comparison and anova" "" print(anova_lfix, row.names = FALSE) "" "########################################################################" "## ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random ##" "########################################################################" "" "##Print ANOVA" "#Note: For fixed effects P values are computed for F ratio using " "#KR method Kenward-Rogers approximation for degrees of freedom" "" print(anova_f5, row.names = FALSE) "" "#For random effect P values are generated using LRT-Likelihood Ratio Test" "#via model comparison and anova" "" print(anova_clfix, row.names = FALSE) "" "" "########################################################################" "### Section 2: Identify general statistcs (mean and CV) of ###" "### genotype, location, year, replication and environment ###" "########################################################################" "" "" "########################################################################" "### Descriptive Statistics ###" "########################################################################" "" "#Descriptive Statistics - Means and CV" "" "#trait mean across genotype and environment(location x year combination)" "" print(mean_ge2, row.names = FALSE) "" "#trait mean across genotype and years" "" print(mean_gy2, row.names = FALSE) "" "#trait mean across genotype and location" "" print(mean_gl2, row.names = FALSE) "" "#trait mean across genotype, location and year" "" print(mean_gly2, row.names = FALSE) "" "#trait mean across genotype, location and replication" "" print(mean_glr2, row.names = FALSE) "" "#trait mean across location and year" "" print(mean_ly2, row.names = FALSE) "" "#trait mean across location and replication" "" 57 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. print(mean_lr2, row.names = FALSE) "" "#trait count, min, max, mean, sum, median, var, sd over location" "" print(stat_l1, row.names = FALSE) "" "#trait count, min, max, mean, sum, median, var, sd over year" "" print(stat_y1, row.names = FALSE) "" "#trait count, min, max, mean, sum, median, var, sd over genotype" "" print(stat_g1, row.names = FALSE) "" "#trait count, min, max, mean, sum, median, var, sd over environment" "" print(stat_e1, row.names = FALSE) "" "#trait coefficient of variation (cv) across gentoype and location" "" print(cv_gl2, row.names = FALSE) "" "trait sum across genotype and location" "" print(sum_gl2, row.names = FALSE) "" "trait sum across genotype, location and year" "" print(sum_gly2, row.names = FALSE) "" "trait sum across genotype and year" "" print(sum_gy2, row.names = FALSE) "" "trait standard deviation (sd) across genotype and location" "" print(sd_gl2, row.names = FALSE) "" "trait standard deviation (sd) across genotype, location and year" "" print(sd_gly2, row.names = FALSE) "" "trait standard deviation (sd) across genotype and year" "" print(sd_gy2, row.names = FALSE) "" "" "" "########################################################################" "### Section 3: Choose stable and best genotype ###" "########################################################################" "" "" "########################################################################" "### Stability Statistics ###" "########################################################################" "" "#Univariate statistics - mean, slope, deviation from regression," "#Shukla,Ecovalence, Kang" "" print(univariate, row.names = FALSE) "" "" "########################################################################" 58 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. "## Section 4: Choose discriminative and representative location ##" "########################################################################" "" "" "########################################################################" "### location statistics - genotype F ratio across location ###" "### and environment; location correlation ###" "########################################################################" "" "#location value" "# location mean, genotype F ratio across location," "#correlation of location with average location performace" "" print(LocationValue, row.names = FALSE) "" "#location value-genotype F ratio across location," "" print(locvalue2, row.names = FALSE) sink() ######################################################################## ### ANOVA model fitness plots : Case 1 to 5 ### ######################################################################## #Print model fitness of Case I to Case V #pdf file is sent to same folder where input file is located #Model #Model #Model #Model #Model fitness fitness fitness fitness fitness Case Case Case Case Case 1: 2: 3: 4: 5: CLT, YR, LC and RP - All Random CLT, YR and LC - Fixed; RP - Random CLT - Fixed; YR, LC and RP - Random LC - Fixed; YR, CLT and RP - Random CLT and LC - Fixed; YR and RP - Random pdf("modelfitness.pdf") anovacase1 anovacase2 anovacase3 anovacase4 anovacase5 dev.off() ######################################################################## ### Cluster analysis of location performance ### ######################################################################## #print dendogram for location #pdf file is sent to same folder where input file is located dev.print(pdf, 'locationcluster.pdf') dev.off() #End References Bates D, Maechler M, Bolker B, Walker S. 2015. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01. 59 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Dia M, Wehner TC, Arellano C. 2016a. Analysis of genotype x environment interaction (GxE) using SAS programming. Agron. J. [AJ-2016-02-0088-A , in press] Dia M, Wehner TC, Hassell R, Price DS, Boyhan GE, Olson S, King S, Davis AR, Tolla GE. 2016b. Genotype x environment interaction and stability analysis for watermelon fruit yield in the U.S.. Crop Sci. doi: 10.2135/cropsci2015.10.0625. Dia M, Wehner TC, Hassell R, Price DS, Boyhan GE, Olson S, King S, Davis AR, Tolla GE. 2016c. Values of locations for representing mega-environments and for discriminating yield of watermelon in the U.S. Crop Sci. [CROP-2015-11-0698-ORA, in press] Fan XM, Kang M, Chen H, Zhang Y, Tan J, Xu C. 2007. Yield stability of maize hybrids evaluated in multi-environment trials in Yunnan, China. Agron. J. 99: 220-228. Kang MS. 1993. Simultaneous selection for yield and stability in crop performance trials: Consequences for growers. Agron. J. 85: 754-757. Kang MS, Miller JD, Darrah LL. 1987. A note on relationship between stability variance and ecovalence. J. Hered. 78: 107. Maechler M, Rousseeuw P, Struyt A, Hubert M. 2015. cluster: Cluster Analysis Basics and Extensions. R package version 2.0.1. Mekbib F. 2003. Yield stability in common beans (Phaseolus vulgaris L.) genotypes. Euphytica 130: 147-153. Mendiburu FD. 2015. agricolae: Statistical Procedures for Agricultural Research. R package version 1.23. http://CRAN.R-project.org/package=agricolae R Core Team. 2015. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. 60 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. RStudio Team (2015). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA (Computer software v0.98.1074) [cited 2015 March 04]. Available at http://www.rstudio.com/. Robinson D. 2015. broom: Convert Statistical Analysis Objects into Tidy Data Frames. R package version 0.4.0. http://CRAN.R-project.org/package=broom Shukla GK. 1972. Genotype stability analysis and its application to potato regional trails. Crop Sci. 11: 184-190. Singmann H, Bolker B, Westfall J. 2015. afex: Analysis of Factorial Experiments. R package version 0.15-2. http://CRAN.R-project.org/package=afex Smith AB, Cullis BR, Thompson R. 2005. The analysis of crop cultivar breeding and evaluation trials: an overview of current mixed model approaches. J. Agric. Sci.143: 449-462. Wricke G. 1962. Uberiene method zur erfassung der ökologischen streubreite in eldversuchen. Z. Pflanzenzücht. 47: 92-96. Wickham H. 2016. tidyr: Easily Tidy Data with `spread()` and `gather()` Functions. R package version 0.4.1. http://CRAN.R-project.org/package=tidyr Wickham H, Francois R. 2015. dplyr: A Grammar of Data Manipulation. R package version 0.4.3. http://CRAN.R-project.org/package=dplyr Yan W, Pageau D, Frégeau-Reid J, Durand J. 2011. Assessing the representativeness and repeatability of test locations for genotype evaluation. Crop Sci. 51: 1603-1610. 61 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Authors Mahendra Dia North Carolina State University, Raleigh, NC, USA diamahendra@gmail.com Todd C. Wehner North Carolina State University, Raleigh, NC, USA tcwehner@gmail.com Consuelo Arellano North Carolina State University, Raleigh, NC, USA carella@ncsu.edu 62 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Table captions. Table 1. ANOVA models including the factors genotype (CLT), location (LC), year (YR), and replication (RP) for multi-location replicated trials across years in a randomized complete block design. Figure captions. Figure 1. Schematic representation of overall process of RGxE program for genotype stability and location value. Figure 2. Residual plot for Case 1 of ANOVA model where genotype, location, year and replication are treated as random effect. 63 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Table 1. ANOVA models including the factors genotype (CLT), location (LC), year (YR), and replication (RP) for multi-location replicated trials across years in a randomized complete block design. Source of variation DF CLT LC YR RP(LC*YR) CLT*LC CLT*YR LC*YR CLT*LC*YR Pooled error g-1 l-1 y-1 (r-1)ly (g-1)(l-1) (g-1)(y-1) (l-1)(y-1) (g-1)(l-1) (y-1) (r-1)(g-1)ly Fixed vs. random effects Case 1 Case 2 Case 3 Random Fixed Fixed Random Fixed Random Random Fixed Random Random Random Random Random Fixed Random Random Fixed Random Random Fixed Random Random Fixed Random Case 4 Random Fixed Random Random Random Random Random Random 64 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Case 5 Fixed Fixed Random Random Fixed Random Random Random Figure 1. Overview of overall process of RGxE program for genotype stability and location value. 65 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Figure 2. Residual plot for Case 1 of ANOVA model where genotype, location, year and replication are treated as random effect. 66 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. SUPPLEMENT (For Publication) RGxE: An R Program for Genotype x Environment Interaction Analysis by Mahendra Dia, Todd C. Wehner, and Consuelo Arellano Description The supplemental material provides instructions for user enetered field, interpretation of univariate stability statistics, sample data and output from sample input data generated from RGxE program 1. Instructions for user entered field A. Define input file information. a. For Windows user I. Option 1 67 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. II. Option 2 b. For Ois or Mac user B. Define trait name that needs to be analyzed. 68 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. C. Define output file information. 2. Interpretation of univariate stability statistics 2.1 Linear regression coefficient (bi) Regression coefficient (bi) of genotypes approximating unity (P<0.01) along with high trait mean is considered to stable across wide range of environment. When this is associated with low trait mean performance, genotypes are poorly adapted to all environments. A bi greater than unity describes genotypes with higher sensitivity to environmental change (below average stability), and greater specificity of adaptability to high yielding environments. 2.2 Deviation from regression (S2d) Genotype is considered to be stable when deviation from regression (S2d) is not significantly different from zero. 2.3 Shukla stability variance (σi2), ssquare, and Wricke's ecovalence (Wi2) Genotype with low σi2, ssquare and Wi2 is regarded as stable. 2.4 Kang stability statistic (YSi) According to this method, genotypes with YSi greater than the mean YSi are considered stable. Stable genotypes are marked with symbol ‘+’. 3. Output (.txt) from sample data [1] "############# Print date and time #################" [1] "" [1] "2016-03-12 15:13:22 EST" [1] "" "########################################################################" [1] "### Section 1: Identify level of significance of ###" [1] "### different effects, variances and BLUP ###" "########################################################################" [1] "" [1] "" [1] "########################################################################" 69 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. [1] "### ANOVA - ANALYSIS OF VARIANCE ####" [1] "########################################################################" [1] "" [1] "########################################################################" [1] "## ANOVA Case 1: CLT, YR, LC and RP - All Random ##" [1] "########################################################################" [1] "" [1] "##Print ANOVA" [1] "#Note: P values are generated using LRT-Likelihood Ratio Test via " [1] "#model comparison and anova" [1] "" sov Variance stddev Pr_Chisq YR:LC:CLT 49.730 7.0519 8.8942e-03 LC:CLT 0.000 0.0000 NA RPid 73.914 8.5973 4.1458e-07 YR:CLT 0.000 0.0000 NA YR:LC 57.813 7.6035 7.8725e-02 CLT 111.697 10.5687 1.3867e-03 LC 699.569 26.4494 9.0836e-03 YR 0.000 0.0000 NA Residual 327.526 18.0977 NA [1] "" [1] "##Print BLUP-Best linear unbiased predictor value for random genotypes" [1] "" genotype Blup CalhounGray 76.255 CrimsonSweet 62.535 EarlyCanada 59.840 FiestaF1 77.283 GeorgiaRattlesnake 70.710 Legacy 68.417 Mickylee 61.300 Quetzali 58.744 StarbriteF1 83.422 SugarBaby 51.169 [1] "" "########################################################################" [1] "## ANOVA Case 2: CLT, YR and LC - Fixed; RP - Random ##" "########################################################################" [1] "" [1] "##Print ANOVA" [1] "#Note: P values are computed for F ratio using KR method " [1] "#Kenward-Rogers approximation for degrees of freedom" [1] "" sov Df Sum_Sq Mean_Sq F_value Prob_F stddev YR 1 120.47 120.470 0.36782 5.4876e-01 NA LC 4 73133.52 18283.381 55.82263 1.8011e-13 NA CLT 9 44948.85 4994.317 15.24860 4.8887e-20 NA YR:LC 4 5372.94 1343.235 4.10115 9.0767e-03 NA YR:CLT 9 4564.93 507.215 1.54862 1.3095e-01 NA 70 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. LC:CLT 36 15018.72 417.187 1.27375 1.4552e-01 NA YR:LC:CLT 36 23058.48 640.513 1.95561 1.5170e-03 NA RPid NA NA 73.914 NA NA 8.5973 Residual NA NA 327.526 NA NA 18.0977 [1] "" [1] "########################################################################" [1] "## ANOVA Case 3: CLT - Fixed; YR, LC and RP - Random ##" "########################################################################" [1] "" [1] "##Print ANOVA" [1] "#Note: For fixed effects P values are computed for F ratio using " [1] "#KR method Kenward-Rogers approximation for degrees of freedom" [1] "" sov Df Sum_Sq Mean_Sq F_value Prob_F CLT 9 27965 3107.2 9.4869 0.006365 [1] "" [1] "#For random effect P values are generated using LRT-Likelihood Ratio Test " [1] "#via model comparison and anova" [1] "" sov Mean_Sq stddev Pr_Chisq YR:LC:CLT 49.730 7.0519 2.7248e-02 LC:CLT 0.000 0.0000 NA RPid 73.914 8.5973 4.1458e-07 YR:CLT 0.000 0.0000 NA YR:LC 57.813 7.6035 6.7765e-02 LC 699.573 26.4495 9.4460e-03 YR 0.000 0.0000 NA Residual 327.526 18.0977 NA [1] "" [1] "########################################################################" [1] "## ANOVA Case 4: LC - Fixed; YR, CLT and RP - Random ##" "########################################################################" [1] "" [1] "##Print ANOVA" [1] "#Note: For fixed effects P values are computed for F ratio using" [1] "#KR method Kenward-Rogers approximation for degrees of freedom" [1] "" sov Df Sum_Sq Mean_Sq F_value Prob_F LC 4 21802 5450.4 16.641 0.0094205 [1] "" [1] "#For random effect P values are generated using LRT-Likelihood Ratio Test" [1] "#via model comparison and anova" [1] "" sov Mean_Sq stddev Pr_Chisq YR:LC:CLT 49.730 7.0519 8.2124e-03 LC:CLT 0.000 0.0000 NA RPid 73.914 8.5973 4.1458e-07 YR:CLT 0.000 0.0000 NA YR:LC 57.813 7.6035 3.9804e-01 CLT 111.697 10.5687 NA 71 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. YR 0.000 0.0000 NA Residual 327.526 18.0977 NA [1] "" [1] "########################################################################" [1] "## ANOVA Case 5: CLT and LC - Fixed; YR, and RP - Random ##" "########################################################################" [1] "##Print ANOVA" [1] "#Note: For fixed effects P values are computed for F ratio using " [1] "#KR method Kenward-Rogers approximation for degrees of freedom" [1] "" sov Df Sum_Sq Mean_Sq F_value Prob_F CLT 9 23982.8 2664.76 8.13600 0.0022680 LC 4 21801.7 5450.43 16.64120 0.0092754 CLT:LC 36 8013.4 222.59 0.67962 0.8743361 [1] "" [1] "#For random effect P values are generated using LRT-Likelihood Ratio Test" [1] "#via model comparison and anova" [1] "" sov Mean_Sq stddev Pr_Chisq YR:LC:CLT 71.582 8.4606 1.0000e+00 RPid 73.914 8.5973 1.3054e-07 YR:CLT 0.000 0.0000 NA YR:LC 55.628 7.4584 2.9757e-01 YR 0.000 0.0000 NA Residual 327.526 18.0977 NA [1] "" [1] "" "########################################################################" [1] "### Section 2: Identify general statistcs (mean and CV) of ###" [1] "### genotype, location, year, replication and environment ###" "########################################################################" [1] "" "########################################################################" [1] "### Descriptive Statistics ###" "########################################################################" [1] "#Descriptive Statistics - Means and CV" [1] "" [1] "#trait mean across genotype and environment(location x year combination)" [1] "" CLT CI-2009 CI-2010 FL-2009 FL-2010 KN-2009 KN-2010 SC-2009 SC-2010 TX-2009 TX-2010 CalhounGray 64.590 81.502 106.663 121.151 68.767 70.805 104.374 89.694 48.932 17.0135 CrimsonSweet 25.775 73.046 110.511 65.791 62.960 57.561 74.870 112.900 21.176 15.5362 EarlyCanada 50.939 55.320 90.613 116.482 57.051 55.421 55.767 74.885 21.744 11.7770 FiestaF1 69.379 81.298 109.561 99.026 80.178 67.647 89.577 123.786 42.324 22.2093 GeorgiaRattlesnake 52.671 72.843 102.128 109.443 69.990 62.960 87.217 79.128 62.326 12.8060 Legacy 52.467 66.323 105.471 89.036 81.298 72.435 88.606 63.556 50.802 15.8825 Mickylee 49.410 59.089 78.977 113.275 70.194 51.448 82.658 48.035 40.334 12.8977 Quetzali 40.751 62.757 79.424 87.583 61.432 48.697 79.719 77.865 29.735 9.7903 StarbriteF1 77.936 84.457 125.971 120.292 70.601 79.974 112.504 102.433 57.339 22.0972 SugarBaby 42.381 57.866 80.657 91.010 37.491 48.799 53.531 52.595 18.116 10.6258 [1] "" [1] "#trait mean across genotype and years" [1] "" CLT 2009 2010 72 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. CalhounGray 78.665 76.033 CrimsonSweet 59.059 64.967 EarlyCanada 55.223 62.777 FiestaF1 78.204 78.793 GeorgiaRattlesnake 74.866 67.436 Legacy 75.729 61.447 Mickylee 64.314 56.949 Quetzali 58.212 57.338 StarbriteF1 88.870 81.850 SugarBaby 46.435 52.179 [1] "" [1] "#trait mean across genotype and location" [1] "" CLT CI FL KN SC TX CalhounGray 73.046 113.907 69.786 97.034 32.973 CrimsonSweet 49.411 88.151 60.261 93.885 18.356 EarlyCanada 53.129 103.548 56.236 65.326 16.761 FiestaF1 75.338 104.294 73.912 106.682 32.266 GeorgiaRattlesnake 62.757 105.786 66.475 83.173 37.566 Legacy 59.395 97.253 76.867 76.081 33.342 Mickylee 54.250 96.126 60.821 65.347 26.616 Quetzali 51.754 83.503 55.065 78.792 19.763 StarbriteF1 81.196 123.132 75.288 107.468 39.718 SugarBaby 50.124 85.834 43.145 53.063 14.371 [1] "" [1] "#trait mean across genotype, location and year" [1] "" CLT YR CI FL KN SC TX CalhounGray 2009 64.590 106.663 68.767 104.374 48.9320 CalhounGray 2010 81.502 121.151 70.805 89.694 17.0135 CrimsonSweet 2009 25.775 110.511 62.960 74.870 21.1758 CrimsonSweet 2010 73.046 65.791 57.561 112.900 15.5362 EarlyCanada 2009 50.939 90.613 57.051 55.767 21.7445 EarlyCanada 2010 55.320 116.482 55.421 74.885 11.7770 FiestaF1 2009 69.379 109.561 80.178 89.577 42.3235 FiestaF1 2010 81.298 99.026 67.647 123.786 22.2093 GeorgiaRattlesnake 2009 52.671 102.128 69.990 87.217 62.3257 GeorgiaRattlesnake 2010 72.843 109.443 62.960 79.128 12.8060 Legacy 2009 52.467 105.471 81.298 88.606 50.8020 Legacy 2010 66.323 89.036 72.435 63.556 15.8825 Mickylee 2009 49.410 78.977 70.194 82.658 40.3335 Mickylee 2010 59.089 113.275 51.448 48.035 12.8977 Quetzali 2009 40.751 79.424 61.432 79.719 29.7353 Quetzali 2010 62.757 87.583 48.697 77.865 9.7903 StarbriteF1 2009 77.936 125.971 70.601 112.504 57.3388 StarbriteF1 2010 84.457 120.292 79.974 102.433 22.0972 SugarBaby 2009 42.381 80.657 37.491 53.531 18.1155 SugarBaby 2010 57.866 91.010 48.799 52.595 10.6258 [1] "" [1] "#trait mean across genotype, location and replication" [1] "" CLT RP CI FL KN SC TX CalhounGray 1 98.413 95.809 62.349 89.448 54.4325 CalhounGray 2 69.684 136.038 55.625 103.293 25.3965 73 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. CalhounGray 3 59.292 118.781 70.296 87.128 CalhounGray 4 64.794 105.000 90.875 108.266 CrimsonSweet 1 37.491 39.581 62.349 79.465 CrimsonSweet 2 59.293 130.710 58.885 94.685 CrimsonSweet 3 63.979 93.970 55.218 109.528 CrimsonSweet 4 36.880 88.343 64.591 91.863 EarlyCanada 1 64.386 104.058 51.346 72.465 EarlyCanada 2 45.233 107.547 52.772 59.435 EarlyCanada 3 61.331 104.386 48.901 59.649 EarlyCanada 4 41.566 98.200 71.925 69.755 FiestaF1 1 78.650 67.163 70.703 111.038 FiestaF1 2 70.499 145.651 72.944 117.007 FiestaF1 3 63.164 97.226 82.114 98.525 FiestaF1 4 89.041 107.133 69.888 100.156 GeorgiaRattlesnake 1 70.499 95.776 61.941 84.538 GeorgiaRattlesnake 2 60.923 106.091 61.738 78.721 GeorgiaRattlesnake 3 58.274 105.629 74.167 69.766 GeorgiaRattlesnake 4 61.331 115.646 68.054 99.667 Legacy 1 64.794 80.311 66.424 84.233 Legacy 2 41.159 104.211 71.315 82.887 Legacy 3 55.829 97.647 87.614 68.635 Legacy 4 75.797 106.844 82.113 68.570 Mickylee 1 51.550 95.789 54.810 72.913 Mickylee 2 60.923 107.262 72.333 59.385 Mickylee 3 36.880 95.143 59.700 63.388 Mickylee 4 67.647 86.310 56.440 65.701 Quetzali 1 64.590 73.935 53.587 94.370 Quetzali 2 53.791 101.604 50.328 75.674 Quetzali 3 50.939 79.901 47.883 62.777 Quetzali 4 37.694 78.573 68.462 82.347 StarbriteF1 1 71.722 76.359 54.199 80.066 StarbriteF1 2 86.189 124.965 70.296 110.171 StarbriteF1 3 90.875 126.852 70.296 103.640 StarbriteF1 4 76.001 164.350 106.361 135.996 SugarBaby 1 60.923 71.728 35.861 59.690 SugarBaby 2 58.070 94.041 30.563 52.498 SugarBaby 3 36.472 85.855 50.735 53.883 SugarBaby 4 45.029 91.711 55.421 46.181 [1] "" [1] "#trait mean across location and year" [1] "" LC 2009 2010 CI 52.630 69.450 FL 98.998 101.309 KN 65.996 61.575 SC 82.882 82.488 TX 39.283 15.064 [1] "" [1] "#trait mean across location and replication" [1] "" LC 1 2 3 4 CI 66.302 60.576 57.703 59.578 FL 80.051 115.812 100.539 104.211 KN 57.357 59.680 64.692 73.413 15.2330 36.8290 31.2560 13.5540 19.0985 9.5155 10.1675 15.2815 27.9405 13.6535 51.4160 31.7220 18.9710 26.9565 49.4325 58.1390 13.2595 29.4325 48.1675 37.6610 29.0530 18.4875 53.3065 21.7000 19.0495 12.4065 27.9930 22.7275 18.2570 10.0735 40.4225 61.3850 31.6680 25.3965 12.6045 15.8565 13.8400 15.1815 74 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. SC 82.823 83.376 77.692 86.850 TX 37.920 30.342 20.637 19.793 [1] "" [1] "#trait count, min, max, mean, sum, median, var, sd over location" [1] "" LC count min max mean sum median var sd CI 80 0.000 108.81 61.040 4883.2 59.904 422.13 20.546 FL 80 0.000 180.91 100.153 8012.3 99.492 886.48 29.774 KN 80 19.153 109.21 63.786 5102.8 63.776 305.04 17.465 SC 80 30.522 144.67 82.685 6614.8 81.462 740.18 27.206 TX 80 0.000 102.63 27.173 2173.8 20.388 505.83 22.491 [1] "" [1] "#trait count, min, max, mean, sum, median, var, sd over year" [1] "" YR count min max mean sum median var sd 2009 200 0 172.06 67.958 13592 68.735 1005.8 31.714 2010 200 0 180.91 65.977 13195 65.405 1324.4 36.392 [1] "" [1] "#trait count, min, max, mean, sum, [1] "" CLT count min max CalhounGray 40 9.454 152.04 CrimsonSweet 40 0.000 147.98 EarlyCanada 40 7.213 119.75 FiestaF1 40 4.034 180.91 GeorgiaRattlesnake 40 4.157 144.54 Legacy 40 5.990 120.69 Mickylee 40 10.229 128.39 Quetzali 40 5.909 110.52 StarbriteF1 40 7.417 172.06 SugarBaby 40 8.884 106.33 median, var, sd over genotype" mean 77.349 62.013 59.000 78.498 71.151 68.588 60.632 57.775 85.360 49.307 sum 3094.0 2480.5 2360.0 3139.9 2846.0 2743.5 2425.3 2311.0 3414.4 1972.3 median 81.777 61.738 56.644 79.792 69.277 68.808 59.395 60.433 83.540 51.245 var 1275.65 1468.58 1009.08 1334.98 1055.36 835.44 892.20 759.80 1462.50 653.62 sd 35.716 38.322 31.766 36.537 32.486 28.904 29.870 27.564 38.243 25.566 [1] "" [1] "#trait count, min, max, mean, sum, median, var, sd over environment" [1] "" ENV count min max mean sum median var sd CI-2009 40 0.000 108.805 52.630 2105.19 54.199 489.72 22.1297 CI-2010 40 42.789 97.395 69.450 2778.00 67.851 220.28 14.8417 FL-2009 40 56.717 172.065 98.998 3959.91 95.171 553.49 23.5265 FL-2010 40 0.000 180.906 101.309 4052.36 107.077 1239.46 35.2060 KN-2009 40 19.153 109.213 65.996 2639.85 66.424 419.76 20.4881 KN-2010 40 35.861 105.138 61.575 2462.99 60.922 188.12 13.7157 SC-2009 40 30.522 144.666 82.882 3315.30 82.864 705.88 26.5683 SC-2010 40 40.853 141.956 82.488 3299.51 76.512 793.38 28.1671 TX-2009 40 0.000 102.626 39.283 1571.31 32.904 658.22 25.6559 TX-2010 40 4.034 34.231 15.064 602.54 13.143 65.60 8.0994 [1] "" [1] "#trait coefficient of variation (cv) across gentoype and location" [1] "" CLT CI FL KN SC TX CalhounGray 31.949 23.999 32.158 16.135 83.562 CrimsonSweet 62.295 49.664 13.229 30.140 66.446 EarlyCanada 21.952 15.884 29.933 30.311 70.379 FiestaF1 21.779 43.048 13.199 22.666 56.006 GeorgiaRattlesnake 19.610 20.215 10.081 38.647 94.577 Legacy 29.606 27.040 15.967 24.143 71.471 75 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. Mickylee Quetzali StarbriteF1 SugarBaby 29.351 31.077 21.454 28.754 21.866 21.406 30.106 13.669 25.915 21.994 31.218 30.860 33.958 22.743 21.508 20.260 99.438 75.846 65.571 33.650 [1] "" [1] "trait sum across genotype and location" [1] "" CLT CI FL KN SC TX CalhounGray 584.37 911.26 558.29 776.27 263.78 CrimsonSweet 395.29 705.21 482.08 751.08 146.85 EarlyCanada 425.03 828.38 449.89 522.61 134.09 FiestaF1 602.71 834.35 591.30 853.45 258.13 GeorgiaRattlesnake 502.05 846.28 531.80 665.38 300.53 Legacy 475.16 778.03 614.93 608.65 266.74 Mickylee 434.00 769.01 486.57 522.77 212.93 Quetzali 414.03 668.03 440.52 630.34 158.10 StarbriteF1 649.57 985.05 602.30 859.74 317.74 SugarBaby 400.99 686.67 345.16 424.50 114.97 [1] "" [1] "trait sum across genotype, location and year" [1] "" CLT YR CI FL KN SC TX CalhounGray 2009 258.36 426.65 275.07 417.50 195.728 CalhounGray 2010 326.01 484.60 283.22 358.78 68.054 CrimsonSweet 2009 103.10 442.05 251.84 299.48 84.703 CrimsonSweet 2010 292.19 263.16 230.24 451.60 62.145 EarlyCanada 2009 203.75 362.45 228.20 223.07 86.978 EarlyCanada 2010 221.28 465.93 221.69 299.54 47.108 FiestaF1 2009 277.51 438.25 320.71 358.31 169.294 FiestaF1 2010 325.19 396.10 270.59 495.14 88.837 GeorgiaRattlesnake 2009 210.68 408.51 279.96 348.87 249.303 GeorgiaRattlesnake 2010 291.37 437.77 251.84 316.51 51.224 Legacy 2009 209.87 421.88 325.19 354.42 203.208 Legacy 2010 265.29 356.14 289.74 254.23 63.530 Mickylee 2009 197.64 315.91 280.77 330.63 161.334 Mickylee 2010 236.36 453.10 205.79 192.14 51.591 Quetzali 2009 163.00 317.69 245.73 318.88 118.941 Quetzali 2010 251.03 350.33 194.79 311.46 39.161 StarbriteF1 2009 311.75 503.89 282.40 450.01 229.355 StarbriteF1 2010 337.83 481.17 319.90 409.73 88.389 SugarBaby 2009 169.52 322.63 149.96 214.12 72.462 SugarBaby 2010 231.46 364.04 195.20 210.38 42.503 [1] "" [1] "trait sum across genotype and year" [1] "" CLT 2009 2010 CalhounGray 1573.3 1520.7 CrimsonSweet 1181.2 1299.3 EarlyCanada 1104.5 1255.5 FiestaF1 1564.1 1575.9 GeorgiaRattlesnake 1497.3 1348.7 Legacy 1514.6 1228.9 Mickylee 1286.3 1139.0 Quetzali 1164.2 1146.8 76 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. StarbriteF1 1777.4 1637.0 SugarBaby 928.7 1043.6 [1] "" [1] "trait standard [1] "" CLT CalhounGray CrimsonSweet EarlyCanada FiestaF1 GeorgiaRattlesnake Legacy Mickylee Quetzali StarbriteF1 SugarBaby [1] "" [1] "trait standard [1] "" CLT CalhounGray CalhounGray CrimsonSweet CrimsonSweet EarlyCanada EarlyCanada FiestaF1 FiestaF1 GeorgiaRattlesnake GeorgiaRattlesnake Legacy Legacy Mickylee Mickylee Quetzali Quetzali StarbriteF1 StarbriteF1 SugarBaby SugarBaby [1] "" [1] "trait standard [1] "" CLT CalhounGray CrimsonSweet EarlyCanada FiestaF1 GeorgiaRattlesnake Legacy Mickylee Quetzali StarbriteF1 SugarBaby deviation (sd) across genotype and location" CI 23.337 30.780 11.663 16.408 12.306 17.584 15.923 16.084 17.420 14.413 FL 27.336 43.780 16.447 44.896 21.384 26.297 21.019 17.875 37.069 11.732 KN 22.4419 7.9720 16.8332 9.7560 6.7012 12.2736 15.7614 12.1110 23.5036 13.3147 SC 15.657 28.297 19.801 24.180 32.143 18.368 22.191 17.919 23.114 10.750 TX 27.5527 12.1968 11.7961 18.0711 35.5286 23.8302 26.4661 14.9893 26.0434 4.8357 deviation (sd) across genotype, location and year" YR 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 2009 2010 CI 32.0192 7.4101 19.1151 18.8557 13.8353 10.6391 19.4961 12.3851 5.7280 7.0215 12.1090 21.1391 21.8298 7.2519 11.7591 11.9366 24.6272 8.5569 16.6566 6.8838 FL 32.1452 23.8823 28.5590 48.2005 12.3116 5.7846 6.6594 67.7116 17.4036 26.9899 19.9329 32.1907 5.4043 14.7424 15.8267 21.2281 32.8501 45.8879 7.8131 13.7355 KN 30.7113 15.1397 7.8283 8.2203 23.6429 10.0205 7.4503 7.8670 6.9186 4.8959 15.5727 7.5245 18.3739 2.7913 10.3753 11.2465 30.6391 17.0780 16.4254 7.6551 SC 11.6864 17.0801 23.4217 18.8560 23.2914 11.3460 10.2947 21.8657 41.5767 25.2695 16.8878 9.1512 16.7378 8.3460 20.2266 18.3803 27.8233 20.1214 14.9464 6.7591 TX 32.5363 5.7921 15.1403 9.8325 15.9525 1.9904 17.5309 13.5988 35.6340 6.3833 19.9047 10.7674 33.4959 3.2569 15.1437 5.4514 25.5900 9.9889 3.3705 2.4096 deviation (sd) across genotype and year" 2009 34.861 38.308 27.794 25.759 29.401 26.536 25.412 24.665 36.771 24.071 2010 37.410 39.097 35.619 45.569 35.676 30.055 34.015 30.835 40.298 27.292 77 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. [1] "" [1] "" "########################################################################" [1] "### Section 3: Choose stable and best genotype ###" "########################################################################" [1] "" "########################################################################" [1] "### Stability Statistics ###" "########################################################################" [1] "" [1] "#Univariate statistics - mean, slope, deviation from regression," [1] "#Shukla,Ecovalence, Kang" [1] "" Genotype CalhounGray CrimsonSweet EarlyCanada FiestaF1 GeorgiaRattlesnake Legacy Mickylee Quetzali StarbriteF1 SugarBaby Mean SLOPE DEVREG SIGMA 77.349 1.31693830164876 124.670028621208 61.346841 ns 62.013 1.36454332357129 1450.03510073339*** 439.124582 ns 59.000 0.320939654616631* 686.251275445017* 253.229481 ns 78.498 1.58347941482123 657.873266033708 300.285026 ns 71.151 0.917198392144549 220.063304045718 52.212609 ns 68.588 1.13392930476125 428.070344903385 287.393522 ns 60.632 0.593584749858975 705.480966435745* 188.105058 ns 57.775 0.971528006198702 96.5321084597175 82.808549 ns 85.360 1.29338936247089 221.136796113672 157.240582 ns 49.307 0.504469489907715* 332.181769968082* 264.187195 ns SIGMA_SQUARE Ecovalence YS_Kang 15.761172 ns 279.75 10 + 567.987945 ns 1488.64 4 285.369649 ns 893.77 2 385.996807 ns 1044.35 11 + 44.863403 ns 250.52 8 + 262.48765 ns 1003.10 7 + 195.125992 ns 685.37 3 86.103189 ns 348.42 1 78.371337 ns 586.61 12 + 308.429643 ns 928.84 -1 [1] "" [1] "" "########################################################################" [1] "## Section 4: Choose discriminative and representative location ##" "########################################################################" [1] "" "########################################################################" [1] "### location statistics - genotype F ratio across location ###" [1] "### and environment; location correlation ###" "########################################################################" [1] "" [1] "#location value" [1] "# location mean, genotype F ratio across location," [1] "#correlation of location with average location performace" [1] "" Location Mean FRatioGenotype LocCorrelation CI 61.040 4.1804 0.949049766419943*** FL 100.153 2.2579 0.864975641187942** KN 63.786 4.6964 0.896401033871258*** SC 82.685 6.8813 0.87603982110229*** TX 27.173 2.9966 0.89062492518194*** [1] "" [1] "#location value-genotype F ratio across location," [1] "" ENV FRatioGenotype CI-2009 2.4015 CI-2010 3.3665 FL-2009 2.6914 FL-2010 1.7231 KN-2009 1.9999 KN-2010 6.3971 SC-2009 2.8729 SC-2010 8.0454 TX-2009 2.5003 78 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. TX-2010 1.4619 4. Sample data YR 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 LC KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN RP 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 CLT Early Canada Calhoun Gray Starbrite F1 Crimson Sweet Georgia Rattlesnake Fiesta F1 Mickylee Sugar Baby Legacy Quetzali Early Canada Georgia Rattlesnake Quetzali Mickylee Calhoun Gray Fiesta F1 Starbrite F1 Sugar Baby Crimson Sweet Legacy Fiesta F1 Quetzali Crimson Sweet Sugar Baby Early Canada Legacy Starbrite F1 Georgia Rattlesnake Mickylee Calhoun Gray Crimson Sweet Early Canada Starbrite F1 Mickylee Quetzali Sugar Baby Fiesta F1 Calhoun Gray Trait 56.236 74.167 32.601 74.167 64.794 70.907 57.051 28.118 68.054 57.866 56.236 64.387 64.794 96.580 38.306 85.577 69.684 19.153 56.236 72.537 86.800 49.309 59.496 50.531 28.933 103.100 72.537 79.057 68.869 53.384 61.942 86.800 107.583 58.274 73.759 52.161 77.427 109.213 79 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 KN KN CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI SC 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 Georgia Rattlesnake Legacy Mickylee Crimson Sweet Fiesta F1 Calhoun Gray Georgia Rattlesnake Legacy Starbrite F1 Quetzali Early Canada Sugar Baby Calhoun Gray Georgia Rattlesnake Legacy Early Canada Crimson Sweet Quetzali Sugar Baby Mickylee Fiesta F1 Starbrite F1 Starbrite F1 Quetzali Crimson Sweet Calhoun Gray Mickylee Early Canada Sugar Baby Fiesta F1 Legacy Georgia Rattlesnake Sugar Baby Calhoun Gray Georgia Rattlesnake Crimson Sweet Fiesta F1 Legacy Quetzali Early Canada Starbrite F1 Mickylee Georgia Rattlesnake 71.722 81.502 37.898 0.000 59.904 108.805 59.904 66.832 47.679 56.236 68.054 55.014 59.496 46.049 39.528 44.826 45.234 43.196 56.644 62.757 59.904 97.395 98.617 33.823 33.008 32.193 24.858 55.014 21.598 59.089 46.049 51.346 36.268 57.866 53.384 24.858 98.617 57.459 29.748 35.861 68.054 72.129 86.331 80 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC FL FL FL FL 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 Legacy Calhoun Gray Mickylee Starbrite F1 Quetzali Fiesta F1 Sugar Baby Early Canada Crimson Sweet Georgia Rattlesnake Starbrite F1 Calhoun Gray Fiesta F1 Sugar Baby Mickylee Early Canada Crimson Sweet Legacy Quetzali Calhoun Gray Early Canada Starbrite F1 Mickylee Quetzali Crimson Sweet Georgia Rattlesnake Legacy Sugar Baby Fiesta F1 Calhoun Gray Legacy Sugar Baby Crimson Sweet Starbrite F1 Early Canada Quetzali Georgia Rattlesnake Fiesta F1 Mickylee Calhoun Gray Starbrite F1 Crimson Sweet Legacy 109.967 95.215 102.407 80.830 84.355 80.121 74.228 72.374 42.198 44.357 124.189 121.397 94.237 53.852 70.601 30.522 97.578 91.567 88.348 98.842 41.688 100.329 67.076 50.246 82.399 74.513 69.562 46.252 101.857 102.041 83.328 39.793 77.305 144.666 78.486 95.928 143.668 82.093 90.549 103.316 98.132 79.163 115.950 81 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL TX TX TX TX TX TX TX 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 Sugar Baby Early Canada Georgia Rattlesnake Fiesta F1 Quetzali Mickylee Calhoun Gray Legacy Crimson Sweet Sugar Baby Quetzali Starbrite F1 Georgia Rattlesnake Mickylee Early Canada Fiesta F1 Calhoun Gray Quetzali Fiesta F1 Crimson Sweet Starbrite F1 Mickylee Legacy Sugar Baby Early Canada Georgia Rattlesnake Starbrite F1 Crimson Sweet Mickylee Legacy Georgia Rattlesnake Fiesta F1 Sugar Baby Early Canada Calhoun Gray Quetzali Quetzali Starbrite F1 Early Canada Sugar Baby Fiesta F1 Georgia Rattlesnake Crimson Sweet 70.393 88.616 98.554 118.457 81.374 76.863 152.036 120.695 147.984 81.751 92.686 126.095 96.119 86.136 107.270 110.397 94.223 86.918 102.984 102.912 107.594 73.358 76.494 81.086 89.020 127.092 172.065 111.987 79.549 108.744 86.747 106.407 89.398 77.546 77.078 56.717 50.077 56.476 7.213 15.266 68.601 79.305 34.842 82 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX KN KN KN KN KN KN KN KN KN KN 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 Legacy Mickylee Calhoun Gray Sugar Baby Legacy Crimson Sweet Starbrite F1 Fiesta F1 Early Canada Quetzali Mickylee Georgia Rattlesnake Calhoun Gray Mickylee Crimson Sweet Sugar Baby Early Canada Legacy Quetzali Starbrite F1 Georgia Rattlesnake Fiesta F1 Calhoun Gray Sugar Baby Calhoun Gray Quetzali Early Canada Fiesta F1 Starbrite F1 Crimson Sweet Georgia Rattlesnake Legacy Mickylee Calhoun Gray Fiesta F1 Georgia Rattlesnake Legacy Crimson Sweet Starbrite F1 Quetzali Mickylee Early Canada Sugar Baby 78.364 88.968 92.646 22.829 45.166 21.444 93.552 32.636 21.639 27.688 33.171 102.626 27.606 26.281 28.417 18.185 43.982 48.693 27.753 35.951 22.362 33.908 21.012 16.182 54.464 13.423 14.144 34.149 43.376 0.000 45.010 30.985 12.914 50.531 70.499 59.089 64.794 50.531 75.797 49.309 52.569 46.456 43.604 83 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN KN CI CI CI CI CI CI CI CI CI CI CI CI CI 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 Sugar Baby Legacy Calhoun Gray Starbrite F1 Early Canada Quetzali Mickylee Georgia Rattlesnake Crimson Sweet Fiesta F1 Quetzali Sugar Baby Mickylee Fiesta F1 Georgia Rattlesnake Crimson Sweet Early Canada Starbrite F1 Calhoun Gray Legacy Quetzali Starbrite F1 Sugar Baby Legacy Georgia Rattlesnake Fiesta F1 Calhoun Gray Mickylee Early Canada Crimson Sweet Quetzali Crimson Sweet Fiesta F1 Early Canada Mickylee Georgia Rattlesnake Sugar Baby Calhoun Gray Starbrite F1 Legacy Mickylee Legacy Georgia Rattlesnake 41.974 70.092 72.944 70.907 49.309 35.861 48.086 59.089 61.534 60.311 46.456 50.939 50.531 77.427 69.277 50.939 68.869 68.054 87.207 72.129 63.164 105.138 58.681 82.725 64.387 62.349 72.537 54.606 57.051 67.239 72.944 74.982 97.395 60.719 65.202 81.094 66.832 88.022 95.765 62.757 59.089 42.789 75.797 84 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI CI SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 Crimson Sweet Calhoun Gray Sugar Baby Starbrite F1 Fiesta F1 Early Canada Quetzali Fiesta F1 Sugar Baby Early Canada Quetzali Legacy Crimson Sweet Calhoun Gray Georgia Rattlesnake Mickylee Starbrite F1 Calhoun Gray Sugar Baby Crimson Sweet Early Canada Mickylee Legacy Georgia Rattlesnake Fiesta F1 Quetzali Starbrite F1 Quetzali Starbrite F1 Georgia Rattlesnake Crimson Sweet Legacy Mickylee Fiesta F1 Calhoun Gray Sugar Baby Early Canada Sugar Baby Calhoun Gray Legacy Quetzali Early Canada Starbrite F1 73.352 79.872 59.496 74.982 81.094 45.641 64.387 67.239 51.346 67.647 68.054 65.609 94.950 86.392 65.202 48.901 83.132 71.722 53.791 48.901 47.271 63.164 94.135 69.277 79.464 45.641 83.947 104.384 79.301 82.745 116.731 58.498 43.420 141.956 83.682 45.152 72.557 51.143 85.190 74.208 63.001 88.348 96.152 85 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC SC FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 Mickylee Fiesta F1 Crimson Sweet Georgia Rattlesnake Quetzali Crimson Sweet Mickylee Legacy Fiesta F1 Early Canada Starbrite F1 Georgia Rattlesnake Sugar Baby Calhoun Gray Early Canada Georgia Rattlesnake Calhoun Gray Legacy Mickylee Quetzali Fiesta F1 Crimson Sweet Starbrite F1 Sugar Baby Crimson Sweet Legacy Quetzali Fiesta F1 Sugar Baby Starbrite F1 Calhoun Gray Early Canada Mickylee Georgia Rattlesnake Crimson Sweet Fiesta F1 Starbrite F1 Sugar Baby Legacy Early Canada Quetzali Georgia Rattlesnake Calhoun Gray 48.168 139.776 91.792 113.084 75.308 136.658 59.700 67.708 95.194 77.610 106.951 65.018 61.514 75.414 61.025 55.666 114.490 53.812 40.853 68.767 118.219 106.421 127.326 52.569 0.000 44.673 66.497 15.869 73.064 54.587 88.303 119.500 114.715 92.999 113.437 180.906 123.836 106.330 87.727 107.824 110.522 116.064 120.040 86 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL FL TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 Mickylee Mickylee Sugar Baby Fiesta F1 Starbrite F1 Calhoun Gray Crimson Sweet Legacy Quetzali Georgia Rattlesnake Early Canada Legacy Fiesta F1 Mickylee Georgia Rattlesnake Crimson Sweet Sugar Baby Quetzali Calhoun Gray Early Canada Starbrite F1 Sugar Baby Legacy Starbrite F1 Crimson Sweet Fiesta F1 Mickylee Georgia Rattlesnake Early Canada Calhoun Gray Quetzali Quetzali Legacy Sugar Baby Mickylee Crimson Sweet Georgia Rattlesnake Early Canada Starbrite F1 Fiesta F1 Calhoun Gray Quetzali Legacy 128.388 116.927 90.624 91.469 146.110 143.339 85.028 118.799 72.884 84.165 119.752 104.945 107.860 93.071 144.545 64.698 94.024 100.429 132.922 118.853 156.635 9.943 17.971 24.369 27.670 34.231 17.645 19.560 13.122 16.219 5.909 17.767 30.156 8.884 10.229 5.664 13.652 8.924 29.218 30.808 23.187 8.761 9.413 87 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved. 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX TX 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 Fiesta F1 Georgia Rattlesnake Crimson Sweet Starbrite F1 Mickylee Early Canada Sugar Baby Calhoun Gray Fiesta F1 Calhoun Gray Sugar Baby Legacy Crimson Sweet Mickylee Georgia Rattlesnake Early Canada Quetzali Starbrite F1 4.034 4.157 9.780 27.385 11.818 11.899 9.495 9.454 19.764 19.194 14.181 5.990 19.031 11.899 13.855 13.163 6.724 7.417 88 © 2016 Mahendra Dia, Todd C. Wehner and Consuelo Arellano. All rights reserved.