ST 524 Plot Size and Shape
NCSU - Fall 2008
Tuesday November 25, 2008

Question 1. Data set “uniftrialdata.xls” presents yields of a uniformity trial on winter wheat (simulated data). Unit-size plots (1.5 m wide × 4.5 m long) are arranged in 6 columns × 48 rows, for a total of 288 plots of size X = 1. Interest is in exploring the relationship between plot size and the variance among plots (on a unit basis). Four variables identify plots according to their size: plot2, plot4, plot8 and plot16, for plot sizes of 2, 4, 8 and 16 units.

1. Following the approach presented in Swallow and H, a nested analysis of variance on yield is presented that will be used in the estimation of Vx, the variance among plots of size X, expressed on a unit basis.

    proc glm data=b3(where=(plot1<49 and col<7));
      class plot1 plot2 plot4 plot8 plot16;
      /* each plot size is nested within the next larger size */
      model newyield = plot16 plot8(plot16) plot4(plot8*plot16)
                       plot2(plot4*plot8*plot16);
      random plot16 plot8(plot16) plot4(plot8*plot16)
             plot2(plot4*plot8*plot16) / test;
      output out=outglm r=resid student=sres p=pred;
    run;

The GLM Procedure
Dependent Variable: newyield

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model             143      209919.5978     1467.9692      2.40   <.0001
    Error             144       87907.8423      610.4711
    Corrected Total   287      297827.4402

    R-Square   Coeff Var   Root MSE   newyield Mean
    0.704836    6.033877   24.70771        409.4832

    Source                       DF     Type I SS   Mean Square   F Value   Pr > F
    plot16                       17   50233.17637    2954.89273      4.84   <.0001
    plot8(plot16)                18   34266.59997    1903.70000      3.12   <.0001
    plot4(plot8*plot16)          36   62756.55429    1743.23762      2.86   <.0001
    plot2(plot4*plot8*plot16)    72   62663.26720     870.32316      1.43   0.0370

    Source                       DF   Type III SS   Mean Square   F Value   Pr > F
    plot16                       17   50233.17637    2954.89273      4.84   <.0001
    plot8(plot16)                18   34266.59997    1903.70000      3.12   <.0001
    plot4(plot8*plot16)          36   62756.55429    1743.23762      2.86   <.0001
    plot2(plot4*plot8*plot16)    72   62663.26720     870.32316      1.43   0.0370

Expected Mean Squares

    Source                       Type III Expected Mean Square
    plot16                       Var(Error) + 2 Var(plot2(plot4*plot8*plot16)) + 4 Var(plot4(plot8*plot16)) + 8 Var(plot8(plot16)) + 16 Var(plot16)
    plot8(plot16)                Var(Error) + 2 Var(plot2(plot4*plot8*plot16)) + 4 Var(plot4(plot8*plot16)) + 8 Var(plot8(plot16))
    plot4(plot8*plot16)          Var(Error) + 2 Var(plot2(plot4*plot8*plot16)) + 4 Var(plot4(plot8*plot16))
    plot2(plot4*plot8*plot16)    Var(Error) + 2 Var(plot2(plot4*plot8*plot16))

Equating each mean square to its expectation and solving from the smallest plot size upward gives the variance estimates:

    Plot size X   MS           Vx
     1             610.4711    V1  = 610.4711
     2             870.32316   V2  = (870.32316 - 610.4711) / 2    = 129.9261
     4            1743.23762   V4  = (1743.23762 - 870.32316) / 4  = 218.2286
     8            1903.70000   V8  = (1903.70000 - 1743.23762) / 8 = 20.0578
    16            2954.89273   V16 = (2954.89273 - 1903.70000) / 16 = 65.69954

Variance components may be obtained directly with PROC MIXED:

    proc mixed data=b3(where=(plot1<49 and col<7));
      class plot1 plot2 plot4 plot8 plot16;
      model newyield = / outp=predds;   /* intercept only; all plot effects are random */
      random plot16 plot8(plot16) plot4(plot8*plot16)
             plot2(plot4*plot8*plot16);
    run;

Variance components estimates from PROC MIXED

    Plot Size   Estimate
     1          610.47
     2          129.93
     4          218.23
     8           20.0578
    16           65.6995

The Mixed Procedure
Covariance Parameter Estimates

    Cov Parm                     Estimate   Size
    plot16                        65.6995     16
    plot8(plot16)                 20.0578      8
    plot4(plot8*plot16)          218.23        4
    plot2(plot4*plot8*plot16)    129.93        2
    Residual                     610.47        1

2. Next, a regression of Vx on X, on the log scale, is used to get a raw estimate of the coefficient of soil heterogeneity, Smith’s b.
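The handout does not show the SAS step behind this regression. A minimal sketch, assuming the five PROC MIXED estimates are simply typed in by hand, might look like the following. The data set name vcomp is hypothetical; log_x and log_vx follow the variable names that appear in the PROC REG output, and LOG in SAS is the natural logarithm.

```sas
/* Hypothetical data set: plot size X and its variance estimate Vx */
data vcomp;
  input x vx;
  log_x  = log(x);    /* natural log of plot size */
  log_vx = log(vx);   /* natural log of the variance estimate */
  datalines;
1  610.47
2  129.93
4  218.23
8  20.0578
16 65.6995
;
run;

/* Simple linear regression of log(Vx) on log(X) */
proc reg data=vcomp;
  model log_vx = log_x;
run;
quit;
```

Under this setup, Smith's b is the negative of the fitted slope on log_x, and the estimates should be close to those in the listing.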
The REG Procedure
Model: MODEL1
Dependent Variable: log_vx

Analysis of Variance

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model               1          4.00265       4.00265      4.67   0.1194
    Error               3          2.56906       0.85635
    Corrected Total     4          6.57171

    Root MSE           0.92539    R-Square   0.6091
    Dependent Mean     4.77010    Adj R-Sq   0.4788
    Coeff Var         19.39988

Parameter Estimates

    Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept    1              6.03543          0.71681      8.42     0.0035
    log_x        1             -0.91274          0.42218     -2.16     0.1194

Regression equation: $\log V_x = 6.0354 - 0.9127 \log X$

Smith’s b = 0.9127. Values of b close to 1 indicate that adjacent plots are nearly uncorrelated (heterogeneous soil), whereas values close to 0 indicate highly correlated adjacent plots (homogeneous soil). A plot size between 2 and 8 units seems adequate, since the estimated variance for plots of size X = 16 (65.70) is greater than that for X = 8 (20.06).

3. Additionally, we can analyze the residuals for plot sizes X = 1 and X = 8 and see whether the use of a larger plot reduces the residual variation.

Check residual distribution on the field

Fit just an intercept in the model

$y_{ij} = \mu + \varepsilon_{ij}$,

where $y_{ij}$ is the yield in plot (i, j), i and j are the row and column coordinates of each plot, i = 1, 2, ..., 48, j = 1, 2, ..., 6, $\mu$ is the overall mean, and $\varepsilon_{ij}$ is the residual value in plot (i, j).
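As a worked note on what the r= and student= keywords of the OUTPUT statement save for this intercept-only model, the quantities can be sketched as follows. The symbols $e_{ij}$, $s$, $r_{ij}$ and $n$ are introduced here for illustration and are not in the original; the sketch relies on every observation having leverage $1/n$ when only an intercept is fitted.

```latex
\begin{align*}
\hat{\mu} &= \bar{y} = \frac{1}{n}\sum_{i=1}^{48}\sum_{j=1}^{6} y_{ij},
  \qquad n = 288, \\
e_{ij} &= y_{ij} - \bar{y}
  && \text{(raw residual, saved by \texttt{r=})}, \\
s^{2} &= \frac{1}{n-1}\sum_{i=1}^{48}\sum_{j=1}^{6} e_{ij}^{2}
  && \text{(error mean square, 287 df)}, \\
r_{ij} &= \frac{e_{ij}}{s\sqrt{1 - 1/n}}
  && \text{(studentized residual, saved by \texttt{student=})}.
\end{align*}
```

With n = 288 the factor $\sqrt{1 - 1/n}$ is about 0.998, so the studentized residuals are essentially the raw residuals divided by the root mean square error.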
    proc glm data=newtrial;
      model newyield = ;
      output out=outglm r=resid student=sres p=pred;
    run;

The GLM Procedure
Dependent Variable: newyield

    Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                 1      48290839.62   48290839.62   46535.2   <.0001
    Error               287        297827.44       1037.73
    Uncorrected Total   288      48588667.06

    R-Square   Coeff Var   Root MSE   newyield Mean
    0.000000    7.866930   32.21376        409.4832

    Source      DF     Type I SS   Mean Square   F Value   Pr > F
    Intercept    1   48290839.62   48290839.62   46535.2   <.0001

    Parameter      Estimate   Standard Error   t Value   Pr > |t|
    Intercept   409.4832432       1.89821396    215.72     <.0001

*** residual plot on the field ***;

[Figure: Residual plot]
[Figure: Standardized Residual plot]

    *** graph a contour plot for residuals on the field ***;
    proc g3grid data=outglm out=out2;
      grid row*col = sres;
    run;
    proc gcontour data=out2;
      plot row*col = sres / levels = -4 -3 -2 -1 0 1 2 3 4; * pattern join;
    run;