BIOM 602

SPLIT PLOT DESIGN AND ANALYSIS

In some experiments, we randomize levels of two or more factors on multiple "levels" of experimental units. For two factors, the experimental units are commonly referred to as the whole units and sub units (in field trials, whole plots and sub plots). The sub units lie within the whole units and are therefore smaller than the whole units. Each level of experimental unit is expected to have a different variance, with larger units (whole units) generally having greater variance than smaller units (sub units). Different sizes of units require that different experimental errors be estimated.

The treatment structure for a split plot is a factorial because at least one factor is assigned to the whole units (whole plots) and one (or more) factors are assigned to the sub units (sub-plots). The sub unit design is always like a block design, with each whole unit acting as a block of sub units, while the whole unit design may be any of the designs that we have studied (e.g., CR, RCB, LS, etc.).

The complete design name for a split plot includes the identification of the whole plot design. For example, if the whole plot treatment factor is assigned as an RCB, then the design would be called a Randomized Complete Block Split-Plot Design.

MIXED syntax for Split Plot Analysis:

    PROC MIXED ratio covtest;
    CLASS <classification variables>;
    MODEL <dep var> = <fixed sources>;
        Specifies the fixed sources of variation. In most cases this will be
        the sources of variation representing the factorial treatment structure.
    RANDOM <random sources>;
        Specifies the random sources of variation. In most cases this will be
        any blocking factors plus the whole unit error term; the sub unit error
        term is the default residual. It is the correct identification of the
        whole and sub unit error terms that makes the analysis a split plot.
    LSMEANS <fixed sources>/pdiff;
    ESTIMATE . . .
    CONTRAST . . .
The MIXED procedure uses the estimated variance components for the whole and sub unit errors to compute the correct standard error of the difference for any of the mean comparison procedures and for the ESTIMATE and CONTRAST statements.

    ODS . . .

3/26/02  1  Split-plot

The SAS program and Output for a CRD Split-plot Example:

    OPTIONS LS=75 PS=500 NOCENTER NODATE PAGENO=1;
    TITLE1 ANALYZING CRD SPLIT-PLOT DESIGNS;
    TITLE2 Whole Plot Treatments Are 6 Combinations of 3 Ozone & 2 RainpH Levels Assigned as A CRD;
    TITLE3 The sub-plot Treatments are 3 Genotypes;
    DATA A;
    INPUT O3$ RAINPH$ REP GENOTYPE LEAFAGE AN;
    DATALINES;
    Data lines entered here
    RUN;

The data are from research on the effects of simulated ozone and acid rain on net photosynthesis (An) of one-year-old needles in seedlings of 3 ponderosa pine genotypes. Air pollution treatments consisted of a factorial combination of 3 ozone levels (filtered, ambient, and twice ambient) and 2 rain pH levels (3.0 and 5.1). This produced 6 treatment combinations, each of which was replicated four times. Air pollution treatments were applied using exposure chambers that served as whole plots, within each of which the three genotypes were assigned as sub-plots.

    PROC PRINT;
    RUN;

ANALYZING CRD SPLIT-PLOT
The Whole Plot Treatments Are 6 Combinations of 3 Ozone & 2 Rain pH Levels
The Sub-plot Treatments are 3 Genotypes

    Obs   O3         Rain pH   rep   Genotype   Leaf Age   An
      1   Ambient    3.0       1     1          1          7.6149
      2   Ambient    3.0       1     2          1          6.6716
      3   Ambient    3.0       1     3          1          5.2160
      4   Ambient    3.0       2     1          1          5.6866
      5   Ambient    3.0       2     2          1          6.7961
      6   Ambient    3.0       2     3          1          4.0350
      .
     44   Filtered   5.1       3     2          1          9.5690
     45   Filtered   5.1       3     3          1          7.5071
     46   Filtered   5.1       4     1          1          5.5003
     47   Filtered   5.1       4     2          1          7.3806
     48   Filtered   5.1       4     3          1          9.1482
      .
     67   TwiceAmb   5.1       3     1          1          4.5229
     68   TwiceAmb   5.1       3     2          1          6.8610
     69   TwiceAmb   5.1       3     3          1          6.5856
     70   TwiceAmb   5.1       4     1          1          5.7229
     71   TwiceAmb   5.1       4     2          1          5.0970
     72   TwiceAmb   5.1       4     3          1          6.6198

3/9/16  2  Split Plot

    PROC MIXED DATA=A;
    CLASS O3 RAINPH REP GENOTYPE;
    MODEL AN=O3|RAINPH|GENOTYPE/DDFM=KR;
    RANDOM REP*O3*RAINPH;
    RUN;

The MODEL statement identifies the fixed sources of variation; in this case, ozone, rain pH, genotype, and their interactions. The RANDOM statement identifies the random sources of variation, which here is the rep*O3*RainpH interaction. This is the whole plot error term (MSe). The default residual becomes the sub plot error term (MSe).

The Mixed Procedure

Model Information

    Data Set                     WORK.A
    Dependent Variable           An
    Covariance Structure         Variance Components
    Estimation Method            REML
    Residual Variance Method     Profile
    Fixed Effects SE Method      Model-Based
    Degrees of Freedom Method    Containment

Class Level Information

    Class       Levels   Values
    O3          3        Ambient Filtered TwiceAmb
    RainpH      2        3.0 5.1
    rep         4        1 2 3 4
    Genotype    3        1 2 3

Dimensions

    Covariance Parameters     2
    Columns in X             48
    Columns in Z             24
    Subjects                  1
    Max Obs Per Subject      72
    Observations Used        72
    Observations Not Used     0
    Total Observations       72

Iteration History

    Iteration   Evaluations   -2 Res Log Like   Criterion
    0           1             222.57320081
    1           1             221.51041731      0.00000000

    Convergence criteria met.

Covariance Parameter Estimates

    Cov Parm         Estimate
    O3*RainpH*rep    0.3309
    Residual         1.9436

These are the variance component (VC) estimates for the random effects. In this case the residual VC is the largest.
The VC for the residual is the sub-plot MSe. However, the VC for O3*RainpH*rep is not the MSe for the whole plots. The whole plot MSe is the sub plot VC + 3 * whole plot VC [1.9436 + 3(0.3309) = 2.9363], where 3 is the number of sub plots per whole plot.

Fit Statistics

    -2 Res Log Likelihood        221.5
    AIC (smaller is better)      225.5
    AICC (smaller is better)     225.7
    BIC (smaller is better)      227.9

Type 3 Tests of Fixed Effects

    Effect                Num DF   Den DF   F Value   Pr > F
    O3                    2        18       29.75     <.0001
    RainpH                1        18        0.56     0.4644
    O3*RainpH             2        18        0.65     0.5337
    Genotype              2        36        0.03     0.9695
    O3*Genotype           4        36        0.97     0.4349
    RainpH*Genotype       2        36        0.32     0.7271
    O3*RainpH*Genotype    4        36        0.66     0.6257

These are the tests of hypotheses for the main and interactive effects of the fixed factors. The only significant effect is the main effect of ozone. Note that the denominator degrees of freedom (error df) for ozone, rain pH, and their interaction differ from those for genotype and the interactions involving genotype. This is because there are two different error terms, one for the whole plots and one for the sub-plots.

Suppose that in the previous study, the researcher was also interested in examining the An response of two needle age classes (conifers usually maintain several needle age classes). The fixed factors of interest were then the factorial combinations of ozone and rain pH assigned to main plots (whole plot factors), genotype within each ozone and rain pH combination (sub-plot factor), and needle age class within genotype (sub-sub-plot factor). This setting produces a design named a Split-split-plot design. There are three different kinds of experimental units, and thus three different random error terms should be identified and used.
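The error degrees of freedom that go with each kind of experimental unit can be counted directly from the layout. The sketch below is only a hand check, written in Python as a calculator rather than as part of the SAS analysis. It assumes balanced data and a completely randomized whole-plot design, and the function name `split_plot_error_df` and its argument names are hypothetical.

```python
def split_plot_error_df(n_whole_trt, reps, n_sub, n_subsub=1):
    """Error df for each stratum of a balanced CRD split-(split-)plot.

    n_whole_trt : number of whole-plot treatment combinations
    reps        : whole units per whole-plot treatment combination
    n_sub       : sub units per whole unit (levels of the sub-plot factor)
    n_subsub    : sub-sub units per sub unit (1 means no third stratum)
    """
    whole_units = n_whole_trt * reps
    sub_units = whole_units * n_sub
    subsub_units = sub_units * n_subsub

    # Whole-plot stratum: df among whole units, less the whole-plot treatment df.
    whole_err = (whole_units - 1) - (n_whole_trt - 1)

    # Sub-plot stratum: df among sub units within whole units, less the
    # sub-plot factor and its interactions with the whole-plot treatments.
    sub_err = (sub_units - whole_units) - n_whole_trt * (n_sub - 1)

    # Sub-sub-plot stratum: df among sub-sub units within sub units, less the
    # sub-sub-plot factor and all of its interactions.
    subsub_err = (subsub_units - sub_units) - n_whole_trt * n_sub * (n_subsub - 1)

    return whole_err, sub_err, subsub_err


# Split plot: 6 ozone x rain pH combinations, 4 reps, 3 genotypes.
print(split_plot_error_df(6, 4, 3))      # (18, 36, 0)

# Split-split plot: the same layout with 2 needle age classes per genotype.
print(split_plot_error_df(6, 4, 3, 2))   # (18, 36, 54)
```

The values 18 and 36 agree with the denominator df in the Type 3 tests for the split plot. Adding the needle age classes introduces a third error term with 54 df.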
The SAS program and Output for a CRD Split-split-plot Example:

    OPTIONS LS=75 PS=500 NOCENTER NODATE PAGENO=1;
    TITLE1 ANALYZING CRD SPLIT-SPLIT-PLOT DESIGNS;
    DATA B;
    INPUT O3$ RAINPH$ REP GENOTYPE LEAFAGE AN;
    DATALINES;
    Data lines entered here
    RUN;

    PROC PRINT DATA=B;
    RUN;

ANALYZING CRD SPLIT-PLOT
The Whole Plot Treatments Are 6 Combinations of 3 Ozone & 2 Rain pH Levels
The Sub-plot Treatments are 3 Genotypes

    Obs   O3         Rain pH   rep   Genotype   Leaf Age   An
      1   Ambient    3.0       1     1          1          7.6149
      2   Ambient    3.0       1     1          2          5.6383
      3   Ambient    3.0       1     2          1          6.6716
      4   Ambient    3.0       1     2          2          3.3239
      5   Ambient    3.0       1     3          1          5.2160
      6   Ambient    3.0       1     3          2          4.1720
      7   Ambient    3.0       2     1          1          5.6866
      8   Ambient    3.0       2     1          2          3.6622
      9   Ambient    3.0       2     2          1          6.7961
     10   Ambient    3.0       2     2          2          4.3654
      .
    133   TwiceAmb   5.1       3     1          1          4.5229
    134   TwiceAmb   5.1       3     1          2          2.0557
    135   TwiceAmb   5.1       3     2          1          6.8610
    136   TwiceAmb   5.1       3     2          2          3.8902
    137   TwiceAmb   5.1       3     3          1          6.5856
    138   TwiceAmb   5.1       3     3          2          4.7786
    139   TwiceAmb   5.1       4     1          1          5.7229
    140   TwiceAmb   5.1       4     1          2          4.1057
    141   TwiceAmb   5.1       4     2          1          5.0970
    142   TwiceAmb   5.1       4     2          2          3.1993
    143   TwiceAmb   5.1       4     3          1          6.6198
    144   TwiceAmb   5.1       4     3          2          3.5799

    PROC MIXED DATA=B;
    CLASS O3 RAINPH REP GENOTYPE LEAFAGE;
    MODEL AN=O3|RAINPH|GENOTYPE|LEAFAGE/DDFM=KR;
    RANDOM REP*O3*RAINPH REP*GENOTYPE*O3*RAINPH;
    RUN;

The MODEL statement identifies the fixed sources of variation; in this case, ozone, rain pH, genotype, leaf age, and their interactions. The RANDOM statement identifies the random sources of variation, consisting of 1) the rep*O3*RainpH interaction, which is the whole plot error term (MSe), 2) the rep*genotype*O3*RainpH interaction, which is the sub-plot (genotype) error term (MSe), and 3) the default residual, which becomes the sub-sub-plot (leaf age) error term (MSe).
ANALYZING CRD SPLIT-SPLIT-PLOT DESIGNS

The Mixed Procedure

Model Information

    Data Set                     WORK.B
    Dependent Variable           An
    Covariance Structure         Variance Components
    Estimation Method            REML
    Residual Variance Method     Profile
    Fixed Effects SE Method      Prasad-Rao-Jeske-Kackar-Harville
    Degrees of Freedom Method    Kenward-Roger

Class Level Information

    Class       Levels   Values
    O3          3        Ambient Filtered TwiceAmb
    RainpH      2        3.0 5.1
    rep         4        1 2 3 4
    Genotype    3        1 2 3
    LeafAge     2        1 2

Dimensions

    Covariance Parameters      3
    Columns in X             144
    Columns in Z              96
    Subjects                   1
    Max Obs Per Subject      144
    Observations Used        144
    Observations Not Used      0
    Total Observations       144

Iteration History

    Iteration   Evaluations   -2 Res Log Like   Criterion
    0           1             413.45258995
    1           1             389.88307775      0.00000000

    Convergence criteria met.

Covariance Parameter Estimates

    Cov Parm                Estimate
    O3*RainpH*rep           0.1702
    O3*RainpH*rep*Genoty    0.8244
    Residual                0.7014

These are the variance component (VC) estimates for the random effects. Remember that, except for the residual VC, these are not MSes.

Fit Statistics

    -2 Res Log Likelihood        389.9
    AIC (smaller is better)      395.9
    AICC (smaller is better)     396.1
    BIC (smaller is better)      399.4

Type 3 Tests of Fixed Effects

    Effect                  Num DF   Den DF   F Value   Pr > F
    O3                      2        18        56.71    <.0001
    RainpH                  1        18         2.18    0.1575
    O3*RainpH               2        18         0.56    0.5822
    Genotype                2        36         0.35    0.7036
    O3*Genotype             4        36         0.88    0.4882
    RainpH*Genotype         2        36         0.44    0.6447
    O3*RainpH*Genotype      4        36         0.37    0.8268
    LeafAge                 1        54       101.79    <.0001
    O3*LeafAge              2        54         0.54    0.5843
    RainpH*LeafAge          1        54         1.15    0.2891
    O3*RainpH*LeafAge       2        54         0.53    0.5942
    Genotype*LeafAge        2        54         0.84    0.4377
    O3*Genotype*LeafAge     4        54         0.58    0.6758
    RainpH*Genoty*LeafAg    2        54         0.01    0.9860
    O3*Rainp*Genot*LeafA    4        54         1.32    0.2752

These are the tests of hypotheses for the main and interactive effects of the fixed factors. The only significant effects are the main effects of ozone and needle age.
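The relation between the variance components and the error mean squares can be sketched numerically. The Python below is only a calculator, not part of the SAS analysis, and the function name `stratum_mean_squares` is hypothetical. It assumes balanced data, so that each error mean square is the residual VC plus the VCs of the larger units, each weighted by the number of observations per unit. The inputs are the rounded estimates from the listings, so the results are approximate.

```python
def stratum_mean_squares(vc_whole, vc_sub, vc_resid, n_sub, n_subsub):
    """Error mean squares implied by variance components (balanced data).

    vc_whole : VC for whole units (e.g. O3*RainpH*rep)
    vc_sub   : VC for sub units (e.g. O3*RainpH*rep*Genotype)
    vc_resid : residual VC (smallest experimental unit)
    n_sub    : sub units per whole unit
    n_subsub : sub-sub units per sub unit
    """
    ms_subsub = vc_resid
    ms_sub = vc_resid + n_subsub * vc_sub
    ms_whole = vc_resid + n_subsub * vc_sub + n_sub * n_subsub * vc_whole
    return ms_whole, ms_sub, ms_subsub


# Split-split plot: 3 genotypes per chamber, 2 needle ages per genotype.
ms_whole, ms_sub, ms_subsub = stratum_mean_squares(
    0.1702, 0.8244, 0.7014, n_sub=3, n_subsub=2
)
print(round(ms_whole, 4), round(ms_sub, 4), round(ms_subsub, 4))
# 3.3714 2.3502 0.7014

# The earlier split plot is the special case with no sub-sub units:
# set the sub unit VC to 0 and n_subsub to 1.
ms_whole_sp, _, ms_sub_sp = stratum_mean_squares(
    0.3309, 0.0, 1.9436, n_sub=3, n_subsub=1
)
print(round(ms_whole_sp, 4), round(ms_sub_sp, 4))
# 2.9363 1.9436
```

The second call reproduces the whole plot MSe of 2.9363 worked out by hand for the split-plot example.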
Note the differences in the denominator degrees of freedom (error df), indicating different error terms for the whole-, sub-, and sub-sub-plots.

SAS program and listing RCBD split plot example

The SAS program and Output for the RCBD Split-plot Example:

    TITLE1 'LAB#9: ANALYZING RCBD SPLIT PLOT DESIGNS';
    OPTIONS LS=66 PS=54 PAGENO=1;
    TITLE2 'The Whole Plot Design is a Randomized Complete Block';
    TITLE3 'The Whole Plot Treatment is Irrigated vs Non-irrigated';
    TITLE4 'The Sub Plot Trt is Level of Nitrogen (0, 40, 80, 160)';
    TITLE5 'The data are expressed in yield per hectare';
    DATA sp;
    INPUT blk ig$ nit yield;
    LINES;
    Data lines entered here
    RUN;

This experiment is a randomized complete block split plot with four blocks. The whole plot treatment is irrigated or non-irrigated and the sub plot treatment is level of nitrogen (0, 40, 80 or 160), making the treatment structure a 2x4 factorial. The dependent variable is crop yield per hectare.

NOTE: The data set WORK.SP has 32 observations and 4 variables.

    TITLE5 'Print of data file';
    PROC PRINT DATA=sp;
    QUIT;

LAB#9: ANALYZING RCBD SPLIT PLOT DESIGNS
The Whole Plot Design is a Randomized Complete Block
The Whole Plot Treatment is Irrigated vs Non-irrigated
The Sub Plot Treatment is Level of Nitrogen (0, 40, 80, 160)
Print of data file

    OBS   BLK   IG   NIT   YIELD
      1   1     N      0   26
      2   1     N     40   31
      3   1     N     80   33
      4   1     N    160   25
      5   1     Y      0   32
      6   1     Y     40   41
      7   1     Y     80   49
      8   1     Y    160   46
      9   2     N      0   22
     10   2     N     40   29
     11   2     N     80   35
     12   2     N    160   24
     13   2     Y      0   31
     14   2     Y     40   29
     15   2     Y     80   38
     16   2     Y    160   44
      . . .
     25   4     N      0   13
     26   4     N     40   24
     27   4     N     80   20
     28   4     N    160   15
     29   4     Y      0   20
     30   4     Y     40   25
     31   4     Y     80   33
     32   4     Y    160   32

    TITLE5 'Mixed model analysis of variance';
    TITLE6 'Orthogonal polynomial contrasts, Nit and IG*Nit interaction';
    PROC MIXED DATA=sp RATIO COVTEST;
    CLASS blk ig nit;
    MODEL yield = ig nit ig*nit;
    RANDOM blk blk*ig;
    CONTRAST 'Nitrogen linear'    nit -7 -3  1  9;
    CONTRAST 'Nitrogen quadratic' nit  7 -4 -8  5;
    CONTRAST 'Nitrogen cubic'     nit -3  8 -6  1;
    CONTRAST 'Ig*Nit linear'      ig*nit -7 -3  1  9  7  3 -1 -9;
    CONTRAST 'Ig*Nit quadratic'   ig*nit  7 -4 -8  5 -7  4  8 -5;
    CONTRAST 'Ig*Nit cubic'       ig*nit -3  8 -6  1  3 -8  6 -1;
    LSMEANS ig nit ig*nit / PDIFF;
    ODS exclude listing lsmeans;
    ODS output LSMEANS=lsm;
    ODS exclude listing diffs;
    ODS output DIFFS=diffs;
    QUIT;

The MODEL statement identifies the fixed sources of variation, in this case the 2x4 factorial structure. The RANDOM statement identifies the random sources of variation, which are block and the block*irrigation interaction. The block*irrigation interaction is the whole plot error term (MSe), while the residual is the sub plot error term (MSe).

Since the nitrogen treatment is quantitative, I have included a number of orthogonal polynomial contrasts. These contrasts should identify the form of the regression equation that would best describe these data. Because of the multiple random variances, it is difficult to correctly analyze data from split plot designs using simple regression techniques.

NOTE: The data set WORK.LSM has 14 observations and 8 variables.
NOTE: The data set WORK.DIFFS has 35 observations and 10 variables.
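The coefficients in the CONTRAST statements are orthogonal polynomial coefficients for unequally spaced nitrogen levels (0, 40, 80, 160), so the textbook tables for equally spaced levels do not apply. The handout does not say how its coefficients were obtained. One common route, sketched here in Python purely as a check and assuming equal replication at every level, is Gram-Schmidt orthogonalization of the powers of the level values, followed by rescaling to the smallest integers. The function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def to_smallest_integers(v):
    """Rescale a vector of Fractions to the smallest proportional integers."""
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (f.denominator for f in v), 1)
    ints = [int(f * common_den) for f in v]
    g = reduce(gcd, (abs(i) for i in ints))
    return [i // g for i in ints]


def orthogonal_polynomial_contrasts(levels):
    """Linear, quadratic, ... contrasts for (possibly unequally spaced) levels."""
    xs = [Fraction(x) for x in levels]
    basis = [[Fraction(1)] * len(xs)]        # constant term
    contrasts = []
    for degree in range(1, len(xs)):
        v = [x ** degree for x in xs]
        # Remove the components along every lower-order polynomial.
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
        contrasts.append(to_smallest_integers(v))
    return contrasts


for name, c in zip(("linear", "quadratic", "cubic"),
                   orthogonal_polynomial_contrasts([0, 40, 80, 160])):
    print(name, c)
# linear [-7, -3, 1, 9]
# quadratic [7, -4, -8, 5]
# cubic [-3, 8, -6, 1]
```

The ig*nit contrasts in the program use the same coefficients for the non-irrigated means followed by their negatives for the irrigated means, so each one compares a polynomial component between the two irrigation levels.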
Mixed model analysis of variance
Orthogonal polynomial contrasts, Nit and IG*Nit interaction

The Mixed Procedure

Model Information

    Data Set              WORK.SP
    Dependent Variable    yield

Class Level Information

    Class   Levels   Values
    blk     4        1 2 3 4
    ig      2        N Y
    nit     4        0 40 80 160

Dimensions

    Observations Used    32

Iteration History

    Iteration   Evaluations   -2 Res Log Like   Criterion
    0           1             175.73630999
    1           1             151.89110756      0.00000000

    Convergence criteria met.

Covariance Parameter Estimates

    Cov Parm    Ratio     Estimate   Standard Error   Z Value   Pr Z
    blk         3.1785    36.7292    34.5144          1.06      0.1436
    blk*ig      0.6532     7.5486     8.5764          0.88      0.1894
    Residual    1.0000    11.5556     3.8519          3.00      0.0013

These are the variance component (VC) estimates for the random effects. In this case the block VC is the largest, followed by the sub plot VC and then the whole plot VC. Remember these are variance components and not the MS for the whole unit or block. In fact the whole unit variance (MS) would be the sub unit VC + 4 * whole unit VC [MSw = 11.56 + 4(7.55) = 41.76], where 4 is the number of sub units per whole unit.

Type 3 Tests of Fixed Effects

    Effect    Num DF   Den DF   F Value   Pr > F
    ig        1         3       13.04     0.0365
    nit       3        18       17.86     <.0001
    ig*nit    3        18        4.98     0.0109

These are the tests of hypotheses for the fixed effects. Note that although the F ratio is quite large for irrigation, it is barely significant due to the small number of denominator degrees of freedom.
This is the price the researcher frequently pays for using a split plot design: a large reduction in the sensitivity of tests concerning the whole plot treatment factor. On the other hand, tests at the sub plot level are sometimes more sensitive.

Contrasts

    Label                 Num DF   Den DF   F Value   Pr > F
    Nitrogen linear       1        18       24.93     <.0001
    Nitrogen quadratic    1        18       28.59     <.0001
    Nitrogen cubic        1        18        0.05     0.8298
    Ig*Nit linear         1        18       13.12     0.0020
    Ig*Nit quadratic      1        18        0.79     0.3860
    Ig*Nit cubic          1        18        1.02     0.3252

The polynomial contrasts indicate that the nitrogen response contains both a linear and a quadratic component and that the two response lines differ in the linear component (slope).

    TITLE5 'Print of main effect and treatment means';
    PROC PRINT DATA=lsm;
    FORMAT Estimate StdErr 6.1;
    VAR EFFECT IG NIT Estimate StdErr;
    QUIT;

Print of main effects and treatment means

    Obs   Effect   ig   nit   Estimate   StdErr
      1   ig       N    _     23.9       3.43
      2   ig       Y    _     32.1       3.43
      3   nit             0   20.7       3.40
      4   nit            40   28.7       3.40
      5   nit            80   32.5       3.40
      6   nit           160   30.0       3.40
      7   ig*nit   N      0   18.2       3.74
      8   ig*nit   N     40   27.0       3.74
      9   ig*nit   N     80   28.0       3.74
     10   ig*nit   N    160   22.2       3.74
     11   ig*nit   Y      0   23.2       3.74
     12   ig*nit   Y     40   30.5       3.74
     13   ig*nit   Y     80   37.0       3.74
     14   ig*nit   Y    160   37.7       3.74

These are the broad sense estimates of the standard errors of the treatment means. That is, they are estimates of the variation that would be expected in repetitions of the experiment based on random samples of other blocks from the same population of blocks as was sampled for the current experiment.

    TITLE5 'Print of differences and tests of differences';
    PROC PRINT DATA=diffs;
    FORMAT DIFF StdErr 6.1 Probt 6.3;
    WHERE ig=_IG or nit=_NIT;
    VAR Effect ig nit _IG _NIT DIFF StdErr DF Probt;
    QUIT;

Print of differences and tests of differences

    Obs   Effect   ig   nit   _ig   _nit   Estimate   StdErr   DF   Probt
      1   ig       N    _     Y     _       -8.2      2.3       3   0.036
      2   nit             0          40     -8.0      1.7      18   0.000
      3   nit             0          80    -11.8      1.7      18   0.000
      4   nit             0         160     -9.3      1.7      18   0.000
      5   nit            40          80     -3.8      1.7      18   0.041
      6   nit            40         160     -1.3      1.7      18   0.472
      7   nit            80         160      2.5      1.7      18   0.159
      8   ig*nit   N      0   N      40     -8.8      2.4      18   0.002
      9   ig*nit   N      0   N      80     -9.8      2.4      18   0.001
     10   ig*nit   N      0   N     160     -4.0      2.4      18   0.113
     11   ig*nit   N      0   Y       0     -5.0      3.1      18   0.123
     15   ig*nit   N     40   N      80     -1.0      2.4      18   0.682
     16   ig*nit   N     40   N     160      4.7      2.4      18   0.064
     18   ig*nit   N     40   Y      40     -3.5      3.1      18   0.272
     21   ig*nit   N     80   N     160      5.8      2.4      18   0.028
     24   ig*nit   N     80   Y      80     -9.0      3.1      18   0.009
     29   ig*nit   N    160   Y     160    -15.5      3.1      18   0.000
     30   ig*nit   Y      0   Y      40     -7.3      2.4      18   0.007
     31   ig*nit   Y      0   Y      80    -13.8      2.4      18   0.000
     32   ig*nit   Y      0   Y     160    -14.5      2.4      18   0.000
     33   ig*nit   Y     40   Y      80     -6.5      2.4      18   0.015
     34   ig*nit   Y     40   Y     160     -7.3      2.4      18   0.007
     35   ig*nit   Y     80   Y     160     -0.7      2.4      18   0.759

Note that there are four different standard errors of the differences. These reflect comparisons of means that are based on whole plot variance, sub plot variance, or a combination of the two, as well as differences in replication between main effect means and two-way means. If one uses the polynomial contrasts to interpret the nitrogen effect, then these mean comparisons should only be used to examine the differences between irrigated and non-irrigated at a given nitrogen level.

    DATA lsm; SET lsm;
    IF Effect='ig*nit';
    nitrogen = nit*1;
    RUN;

The ODS statement in the MIXED procedure generated an output data set containing the treatment means. Only the interaction means (IG*NIT) are retained, to plot the data. The variable nitrogen is generated as a copy of nit so that the full variable name appears in the plot.

NOTE: There were 14 observations read from the dataset WORK.LSM.
NOTE: The data set WORK.LSM has 8 observations and 9 variables.

    TITLE5 'Plot of trt means for examination of interaction effect';
    PROC PLOT DATA=lsm VPERCENT=70;
    FORMAT Estimate 5.0;
    PLOT Estimate*nitrogen=ig / VAXIS=0 TO 40 BY 10;
    QUIT;

The PLOT procedure is used to construct a crude plot of the treatment means to examine the irrigation by nitrogen interaction. The VAXIS option determines the placement of tick marks on the vertical axis.

Plot of trt means for examination of interaction effect

[Plot of Estimate*nitrogen. Symbol is value of ig. Vertical axis: Estimate, 0 to 40 by 10. Horizontal axis: nitrogen, 0 to 160 with tick marks every 40.]

The interaction can be seen as an increasing difference between irrigated and non-irrigated as the nitrogen level increases. In addition, very little increase in yield is seen as nitrogen is increased from 80 to 160 units under irrigation, while yields of non-irrigated plots decreased at the highest level of nitrogen.
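Returning to the four distinct standard errors in the table of differences: they can be reproduced, to rounding, from the variance component estimates for blk*ig (7.5486) and Residual (11.5556). The sketch below uses the usual split-plot formulas for a balanced design in which every treatment combination appears once in every block, so the block VC cancels from every difference. It is written in Python only as a calculator, and the function name `split_plot_seds` and the dictionary labels are hypothetical.

```python
from math import sqrt


def split_plot_seds(vc_whole, vc_resid, blocks, n_whole_trt, n_sub_trt):
    """Standard errors of differences for a balanced RCB split plot.

    vc_whole    : whole plot VC (blk*ig)
    vc_resid    : residual (sub plot) VC
    blocks      : number of blocks
    n_whole_trt : levels of the whole plot factor (ig)
    n_sub_trt   : levels of the sub plot factor (nit)
    """
    r, a, b = blocks, n_whole_trt, n_sub_trt
    return {
        # Two whole plot main effect means: whole plot MS over r*b observations.
        "ig main effect": sqrt(2 * (vc_resid + b * vc_whole) / (r * b)),
        # Two sub plot main effect means: residual only, r*a observations.
        "nit main effect": sqrt(2 * vc_resid / (r * a)),
        # Two nit levels within the same ig level: residual only.
        "nit within same ig": sqrt(2 * vc_resid / r),
        # Two ig levels at the same nit level: both VCs contribute.
        "ig at same nit": sqrt(2 * (vc_resid + vc_whole) / r),
    }


seds = split_plot_seds(7.5486, 11.5556, blocks=4, n_whole_trt=2, n_sub_trt=4)
for label, value in seds.items():
    print(f"{label}: {value:.1f}")
# ig main effect: 2.3
# nit main effect: 1.7
# nit within same ig: 2.4
# ig at same nit: 3.1
```

The largest value, 3.1, belongs to the comparisons of irrigated versus non-irrigated at a given nitrogen level, which are the comparisons recommended above when the nitrogen effect is interpreted through the polynomial contrasts.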