Stat 512 Spring 2011

Randomized Complete Block Designs (Ch 15 O&L)

Definition - revisited

The design structure of an experimental design consists of the structure of the grouping of experimental units into homogeneous units. This grouping of experimental units, in conjunction with the form of the randomization of treatments to experimental units, defines the design.

Review of Design Types (L 4)

1) Completely Randomized Design (CRD).
2) Randomized Complete Block Design (RCBD).
3) Latin Square Design.
4) Randomized Incomplete Block Designs.

Design Advantages

Completely Randomized Design (CRD)
a) Relatively easy to construct.
b) Easy to analyze, even for different sample sizes.
c) Any number of treatments.

Randomized Complete Block Design (RCBD)
a) Easily constructed for comparing t treatment means in the presence of a single source of extraneous variation (blocks).
b) Easy to analyze, but equal sample sizes required.
c) Used for any number of treatments or blocks.

Latin Square Design
a) Relatively easy to construct for comparing t treatment means in the presence of two extraneous sources of variation (two blocking factors).
b) Relatively simple analysis.

Design Disadvantages

Completely Randomized Design (CRD)
a) Experimental units should be homogeneous. Any degree of variation in the experimental units will lower the statistical power.

Randomized Complete Block Design (RCBD)
a) Requires that within-block sub-units be homogeneous; thus it is best for comparing only a few treatment means.
b) The effect of each treatment on the response must be approximately the same from block to block.
c) Will have lower statistical power if blocks are homogeneous.
d) The design is less efficient than others in the presence of more than one source of variation.
e) The efficiency of the design decreases as the number of treatments and, hence, block size increases.
Latin Square Design
a) Requires that within-block sub-units be homogeneous; thus it is best for comparing only a few treatment means.
b) The effect of each treatment on the response must be approximately the same from row to row and from column to column.
c) Although a Latin square can be constructed for any value of t, it requires $t^2$ units to study t treatments. It is best suited for comparing t treatments when 5 ≤ t ≤ 10.
d) As t increases, the experimental error per unit is likely to increase.
e) The analysis becomes very complicated if there are missing data or if treatments are misassigned.

Randomized Complete Block Design

                        Blocks
Treatments    1        2        …    r        Means
1             Y_11     Y_12     …    Y_1r     Ȳ_1.
2             Y_21     Y_22     …    Y_2r     Ȳ_2.
.             .        .        …    .        .
t             Y_t1     Y_t2     …    Y_tr     Ȳ_t.
Means         Ȳ_.1     Ȳ_.2     …    Ȳ_.r     Ȳ_..

The statistical linear model for a randomized complete block design with a one-way (fixed) treatment structure is a TWO-WAY MIXED ANOVA with one fixed and one random factor and no interactions, i.e.:

$$Y_{ij} = \mu + \tau_i + b_j + e_{ij} \quad \text{(effects model)}$$
$$Y_{ij} = \mu_i + b_j + e_{ij} \quad \text{(means model)}$$

i = 1, 2, …, t (treatments)
j = 1, 2, …, r (blocks)

Definitions:
$\mu$: grand mean.
$b_j$: effect of the jth block (random factor); the average deviation of the units in block j from the grand mean ($b_j = \mu_{\cdot j} - \mu$, where $\mu_{\cdot j}$ is the mean of block j).
$\tau_i$: effect of the ith treatment (fixed factor) ($\tau_i = \mu_i - \mu$).
$e_{ij}$: random error term for the ith treatment and jth block.

Assumptions:
The $b_j$ are independent for all j and follow $N(0, \sigma_r^2)$.
The $e_{ij}$ are independent for all i and j and follow $N(0, \sigma_e^2)$.

Sufficient conditions for estimation: $\sum_i \tau_i = 0$, $\sum_j b_j = 0$.

Decomposition of the total sum of squares:

$$\sum_{i=1}^{t}\sum_{j=1}^{r}\left(Y_{ij}-\bar{Y}_{\cdot\cdot}\right)^2 = t\sum_{j=1}^{r}\left(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot}\right)^2 + r\sum_{i=1}^{t}\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2 + \sum_{i=1}^{t}\sum_{j=1}^{r}\left(Y_{ij}-\bar{Y}_{i\cdot}-\bar{Y}_{\cdot j}+\bar{Y}_{\cdot\cdot}\right)^2$$

or

$$SS_{Total} = SS_{Blocks} + SS_{Treatments} + SS_{Error}$$

where

Sample mean for the jth block: $\bar{Y}_{\cdot j} = \frac{1}{t}\sum_{i=1}^{t} Y_{ij}$
Sample mean for the ith treatment: $\bar{Y}_{i\cdot} = \frac{1}{r}\sum_{j=1}^{r} Y_{ij}$
Sample grand mean: $\bar{Y}_{\cdot\cdot} = \frac{1}{rt}\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}$

Mean squares and their expectations:

$$MS_{Blocks} = \frac{SS_{Blocks}}{r-1}, \qquad E(MS_{Blocks}) = \sigma_e^2 + t\,\sigma_r^2$$
$$MS_{Treatments} = \frac{SS_{Treatments}}{t-1}, \qquad E(MS_{Treatments}) = \sigma_e^2 + \frac{r\sum_{i=1}^{t}\tau_i^2}{t-1}$$
$$MSE = \frac{SS_{Error}}{(t-1)(r-1)}, \qquad E(MSE) = \sigma_e^2$$

ANOVA table for the RCBD

Source        df            SS              MS                   F
Blocks        r-1           SSB             SSB/(r-1)
Treatments    t-1           SST             SST/(t-1)            MST/MSE
Error         (r-1)(t-1)    SS-SST-SSB      SSE/((r-1)(t-1))
Total         rt-1          SS

ANOVA table for the CRD

Source        df            SS              MS                   F
Treatments    t-1           SST             SST/(t-1)            MST/MSE
Error         t(r-1)        SS-SST          SSE/(t(r-1))
Total         rt-1          SS

In both tables SS denotes the total sum of squares and SST the treatment sum of squares.

Notice that there are fewer degrees of freedom for error in the RCBD than in the CRD: (r-1)(t-1) vs. t(r-1), or (r-1) fewer degrees of freedom. In the RCBD, these r-1 degrees of freedom have been partitioned from the error and assigned to the blocks. However, the SSE in the RCBD is generally smaller than that of the CRD (whose SSE contains the block effects). One should therefore use the RCBD only when the variation explained by the blocks more than offsets the degrees of freedom they consume. So how can one determine when an RCBD is appropriate? Through the concept of efficiency.

Variability in the completely randomized design (CRD)

In the CRD it is assumed that the experimental units are uniform. This is not always true in practice, and it is necessary to develop methods to deal with variability. If, in comparing two methods of fertilization, one region of the field has much greater fertility than the others, then a treatment effect might be incorrectly ascribed to the treatment applied to this part of the field, making a Type I error. For this reason, in a CRD it is always advocated to include as much of the native variability of the experiment as possible within each plot, making each plot as representative of the whole experiment, and the whole experiment as uniform, as possible. In actual field studies plots are designed long and narrow to achieve this effect.
However, if the plots are more variable, the experimental error (MSE) is larger, F (MST/MSE) is smaller, and the experiment is less sensitive. Finally, if the experiment is replicated in a variety of situations to increase the scope of the experiment, this additional variability needs to be removed from the analysis to focus on the treatment effect. This is the purpose of blocking.

No difference among blocks: if the RCBD were applied to an experiment in which the blocks were really no different (i.e. no significant block effect), so that the error sum of squares is essentially the same under both analyses, the MSE for the CRD would be smaller than the MSE for the RCBD simply because of the degrees of freedom. For example, if t = 3 and r = 4, $MSE_{CRD}$ = SSE/9 and $MSE_{RCBD}$ = SSE/6. Therefore, the F statistic for the CRD would be larger.

Consider a confidence interval for the difference between two means:

$$\left(\bar{Y}_A - \bar{Y}_B\right) \pm \sqrt{F_{\alpha,\,(1,\;df_{MSE})}\cdot\frac{2\,MSE}{r}}$$

Under $H_0$: $\mu_A - \mu_B = 0$.

The CRD has a smaller critical F value than the RCBD because of its larger error df. In addition, if there are no differences among blocks, then $MSE_{CRD}$ and $MSE_{RCBD}$ estimate the same error variance and are approximately equal. Therefore, the larger critical F value in the RCBD moves the rejection threshold further from the mean (0) than in the CRD. This change in the position of the rejection threshold affects the Type II error ($\beta$) and the power of the test ($1-\beta$). Under this scenario, the probability of failing to reject a false null hypothesis ($\beta$) will be smaller in the CRD than in the RCBD. In other words, the CRD would in this situation be more powerful (larger $1-\beta$).

Significant difference among blocks: on the other hand, suppose that there really were a substantial difference among blocks as well as among the treatments ($H_0$ false). If the CRD were used, this difference among blocks would be allocated to the error, so the F statistic for the CRD would be smaller than the F statistic of the RCBD.
Under this scenario, the RCBD would still have a larger critical F value because of the lost degrees of freedom, but this may be more than compensated for by the smaller MSE. If the effect of the reduced MSE (threshold closer to 0) is larger than the effect of the larger critical value (threshold further from 0), the net result will be a smaller $\beta$ and a larger power ($1-\beta$) in the RCBD relative to the CRD.

Summary: The MSError is the estimator of the variance used for assessing hypotheses, whether for a CRD or an RCBD. If experimental units are relatively homogeneous, then the CRD is preferred. This is because the relatively larger dfError reduces the MSError. If the experimental units are heterogeneous, then the RCBD is preferred. This is because the SSBlock is large and consequently the SSError is small, and the relative decrease in SSError is larger than the relative decrease in dfError.

Example 1:

A greenhouse consisting of six benches was to be used for an experiment assessing growth among four varieties of house plants. Because light intensity, humidity and temperature varied throughout the greenhouse, it was decided that each bench should contain a complete replication of the experiment. Thus, each bench received each variety of potted plant. The change in plant height (cm) after 2 weeks was recorded:

                      Bench
Variety    1       2       3       4       5       6
1          19.8    16.7    17.7    18.2    20.3    15.5
2          21.9    19.8    21.0    21.4    22.1    20.8
3          16.4    15.4    14.8    15.6    16.4    14.6
4          14.7    13.5    12.8    13.7    14.6    12.9

Variety Means: 18.033, 21.167, 15.533, 13.700 (varieties 1 to 4)
Block Means: 18.200, 16.350, 16.575, 17.225, 18.350, 15.950 (benches 1 to 6)
Grand Mean: 17.108

ANOVA Table

Source       Sum of Squares    Degrees of Freedom    Mean Squares    F0
Bench        19.793            5                     3.959
Varieties    188.538           3                     62.846          144.44
Error        6.527             15                    0.435
Total        214.858           23

$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: not all $\mu_i$ are equal.

F = 144.44

Reject $H_0$ if F > F(0.05, 3, 15) = 3.287.

Conclusion: Reject $H_0$ and conclude that the varieties do not have the same mean growth. P-value = $2.74 \times 10^{-11}$.

Standard Error for the Treatment Mean:

$$SE\left(\bar{Y}_{i\cdot}\right) = \sqrt{\frac{\sigma_e^2}{r}}$$

which is estimated by:

$$\widehat{SE}\left(\bar{Y}_{i\cdot}\right) = \sqrt{\frac{MSE}{r}}$$

Estimated Standard Error for the Difference in Two Treatment Means:

$$\widehat{SE}\left(\bar{Y}_{i\cdot}-\bar{Y}_{i'\cdot}\right) = \sqrt{MSE\left(\frac{1}{r}+\frac{1}{r}\right)} = \sqrt{\frac{2\,MSE}{r}}$$

Multiple Comparisons - Fisher's LSD:

$$LSD = t_{\alpha/2,\,(t-1)(r-1)}\sqrt{MSE\left(\frac{1}{r}+\frac{1}{r}\right)} = 2.131\sqrt{0.435\left(\frac{1}{6}+\frac{1}{6}\right)} = 0.8115$$

|Ȳ_1. - Ȳ_2.| = 3.134*     |Ȳ_1. - Ȳ_3.| = 2.500*     |Ȳ_1. - Ȳ_4.| = 4.333*
|Ȳ_2. - Ȳ_3.| = 5.634*     |Ȳ_2. - Ȳ_4.| = 7.467*     |Ȳ_3. - Ȳ_4.| = 1.833*

* significant at the 0.05 level.

Equivalently you could perform t-tests:

$H_0$: $\mu_i = \mu_{i'}$ vs. $H_a$: $\mu_i \neq \mu_{i'}$

$$t = \frac{\bar{Y}_{i\cdot}-\bar{Y}_{i'\cdot}}{\sqrt{MSE\left(\frac{1}{r}+\frac{1}{r}\right)}}$$

Reject if |t| > t(α/2, (t-1)(r-1)), or P-value ≤ α.

Conclusion:

Example (Greenhouse example cont.)

MSE = 0.435, df = 15 and t(0.025, 15) = 2.131

$H_0$: $\mu_1 = \mu_2$: t = -8.230, P-value < 0.0001*
$H_0$: $\mu_1 = \mu_3$: t = 6.565, P-value < 0.0001*
$H_0$: $\mu_1 = \mu_4$: t = 11.379, P-value < 0.0001*
$H_0$: $\mu_2 = \mu_3$: t = 14.796, P-value < 0.0001*
$H_0$: $\mu_2 = \mu_4$: t = 19.609, P-value < 0.0001*
$H_0$: $\mu_3 = \mu_4$: t = 4.814, P-value = 0.0002*

* significant at the 0.05 level.

Group    Mean      Treatment
A        21.167    2
B        18.033    1
C        15.533    3
D        13.700    4

Note: In addition to the LSD procedure (or t-tests), Scheffe's procedure, Bonferroni's procedure and any of the other multiple comparison procedures or contrasts discussed in class can be used.

Design Efficiency

One cannot say that a block design is more efficient than a completely randomized design, except when viewed in the context of the variability of the response among experimental units. The reverse assertion also cannot be made. Suppose it were possible to analyze the same data set under a randomized complete block design and a completely randomized design.

Example 2:

This example involves the response of sheep to estrogen. The sheep are blocked by ranch, with four treatments per block. The treatments are combinations of sex of the sheep (M or F) and level of estrogen treatment (S0 or S3). Although these data could be analyzed as a factorial experiment, in this example they are treated as four separate treatments.

RCBD. Effect of estrogen on weight gains. Blocks are 4 different ranches.
                        Treatment
Block              F-S0    M-S0    F-S3    M-S3    Block Total    Block Mean
I                  47      50      57      54      208            52
II                 52      54      53      65      224            56
III                62      67      69      74      272            68
IV                 51      57      57      59      224            56
Treatment Total    212     228     236     252     928
Treatment Mean     53      57      59      63                     58

Table 8.2 RCBD ANOVA

Source of Variation    df    SS     MS        F
Total                  15    854
Blocks                 3     576    192.00    24.69**
Treatments             3     208    69.33     8.91**
Error                  9     70     7.78

Table 8.3 CRD ANOVA

Source of Variation    df    SS     MS       F
Total                  15    854
Treatments             3     208    69.33    1.29 NS
Error                  12    646    53.83

Since each treatment occurs the same number of times in each block, differences among blocks do not result from treatments but from other differences associated with the blocks. This component of the total sum of squares can be removed and the experimental error reduced accordingly. Compare the SSerror in Tables 8.2 and 8.3: removing the block sum of squares (576) reduces the error sum of squares from 646 to 70.

Checking Model Assumptions

Residuals:

$$\hat{e}_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \bar{Y}_{\cdot j} - \bar{Y}_{i\cdot} + \bar{Y}_{\cdot\cdot}$$

Normality:
Shapiro-Wilk test
Normal Probability Plot

Equality of Variances:
Levene's Test
Likelihood Ratio Test

Plots:
Plot Residuals vs. Factor Levels
Plot Residuals vs. Block Levels
Plot Residuals vs. Predicted Values

Other Problems:
A block by treatment interaction can result from a poorly controlled experiment. If treatments are not manipulated in a consistent manner, an interaction may result.

Relative Efficiency of Blocking

We saw earlier that if the variation among blocks is large, then we can expect the RCBD to work better than the CRD, while if this variation is small it may not. There is no F-test for assessing whether blocks are significant and therefore effectively reduce the mean square error. The concept of relative efficiency formalizes the comparison between two experimental methods. Recall that the F statistic is defined by the formula F = MST/MSE. The experimental design affects primarily the MSE, since the degrees of freedom for treatments is always t - 1.
The information in the design is 1/MSE, so the relative efficiency of design 1 to design 2 is (1/MSE1)/(1/MSE2) = MSE2/MSE1. A relative efficiency measure for the RCBD relative to the CRD is:

$$RE = \frac{(r-1)\,MS_{Blocks} + r(t-1)\,MSE}{(rt-1)\,MSE}, \qquad H = \frac{MS_{Blocks}}{MSE}$$

RE < 1 ⇔ H < 1 => Ineffective Blocking
RE = 1 ⇔ H = 1 => Blocking neither gains nor loses efficiency
RE > 1 ⇔ H > 1 => Effective Blocking

For the sheep data of Example 2 (r = 4, t = 4, MSBlocks = 192.00, MSE = 7.78), H = 192.00/7.78 ≈ 24.7 and RE = (3(192.00) + 12(7.78))/(15(7.78)) ≈ 5.74, so blocking by ranch was highly effective.

The Latin Square Design

The randomized complete block design reduces the variation associated with each treatment mean by controlling for variation due to known nuisance variables (blocks). The concept can be extended to two levels of control. That is, two separate blocking factors (rows and columns) can be used to control for some forms of variation. In agricultural yield experiments one might find a moisture gradient running East - West and a fertility gradient running North - South. Neither of these components is of interest to the researcher in itself, but it would behoove the researcher to control for these sources of variation when he or she desires to compare the treatment means of interest.

Linear Statistical Model for the Latin Square
(3-way ANOVA with 2 random effects and no interactions)

$$Y_{ij} = \mu + r_i + c_j + \tau_k + e_{ij} \quad \text{(effects model)}$$
$$Y_{ij} = R_i + C_j + \mu_k + e_{ij} \quad \text{(cell means model)}$$

i = 1, 2, …, t
j = 1, 2, …, t
k = 1, 2, …, t

$Y_{ij}$: response for the ith row, jth column.
$\mu$: grand mean.
$r_i$: effect of the ith row (random) ($r_i = R_i - \mu$, where $R_i$ is the ith row mean).
$c_j$: effect of the jth column (random) ($c_j = C_j - \mu$, where $C_j$ is the jth column mean).
$\tau_k$: effect of the kth treatment (fixed) ($\tau_k = \mu_k - \mu$, where $\mu_k$ is the kth treatment mean).
$e_{ij}$: random error component for the ith row, jth column.

Assumptions: The random effects $r_i$, $c_j$ and $e_{ij}$ are normally distributed and mutually independent.

Define:

Observed row mean: $\bar{Y}_{i\cdot} = \frac{1}{t}\sum_{j=1}^{t} Y_{ij}$
Observed column mean: $\bar{Y}_{\cdot j} = \frac{1}{t}\sum_{i=1}^{t} Y_{ij}$
Observed treatment mean: $\bar{Y}_{k} = \frac{1}{t}\sum Y_{ij}$

This last sum is taken over the t responses that received the kth treatment.
Decomposition of the Sum of Squares

$$\sum_{i=1}^{t}\sum_{j=1}^{t}\left(Y_{ij}-\bar{Y}_{\cdot\cdot}\right)^2 = t\sum_{i=1}^{t}\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2 + t\sum_{j=1}^{t}\left(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot}\right)^2 + t\sum_{k=1}^{t}\left(\bar{Y}_{k}-\bar{Y}_{\cdot\cdot}\right)^2 + \sum_{i=1}^{t}\sum_{j=1}^{t}\left(Y_{ij}-\bar{Y}_{i\cdot}-\bar{Y}_{\cdot j}-\bar{Y}_{k}+2\bar{Y}_{\cdot\cdot}\right)^2$$

or

$$SS_T = SS_{Rows} + SS_{Columns} + SS_{Treatments} + SS_E$$

ANOVA Table

Source        Sum of Squares    Degrees of Freedom    Mean Squares     F0
Rows          SSRows            t-1                   MSRows
Columns       SSColumns         t-1                   MSColumns
Treatments    SSTreatments      t-1                   MSTreatments     MSTreatments/MSE
Error         SSE               (t-1)(t-2)            MSE
Total         SST               t²-1

Hypothesis Testing

$H_0$: $\mu_1 = \mu_2 = \dots = \mu_t$ vs. $H_a$: not all $\mu_k$ are equal

$$F_0 = \frac{MS_{Treatments}}{MSE}$$

Reject $H_0$ if $F_0$ > F(α, t-1, (t-1)(t-2)).

Conclusion:

A Latin Square Example

A Latin square design was used to investigate the effect of shelf space on food sales. The experiment was carried out over a six-week period using six different stores. The resulting sales of coffee creamer are presented in the following table (with shelf space index in parentheses).

                              Store
Week    1         2         3         4         5         6
1       27 (5)    34 (6)    39 (2)    40 (3)    15 (4)    16 (1)
2       14 (4)    31 (5)    67 (6)    57 (1)    15 (3)    15 (2)
3       18 (3)    34 (4)    31 (5)    39 (2)    11 (1)    14 (6)
4       35 (1)    46 (3)    49 (4)    70 (6)    9 (2)     12 (5)
5       26 (6)    37 (2)    38 (1)    37 (4)    18 (5)    19 (3)
6       22 (2)    23 (1)    48 (3)    50 (5)    17 (6)    22 (4)

Analysis of Variance Table

Source           SS         df    MS         F0
Store (Rows)     6502.25    5     1300.45
Week (Column)    533.92     5     106.78
Shelf            477.58     5     95.52      1.48
Error            1291.00    20    64.55
Total            8804.75    35

Estimated Standard Error for $\bar{Y}_k$ and $\bar{Y}_k - \bar{Y}_{k'}$

$$\widehat{SE}\left(\bar{Y}_k\right) = \sqrt{\frac{MSE}{t}}, \qquad \widehat{SE}\left(\bar{Y}_k - \bar{Y}_{k'}\right) = \sqrt{\frac{2\,MSE}{t}}$$

Multiple Comparisons - Fisher's LSD:

$$LSD = t_{\alpha/2,\,(t-1)(t-2)}\sqrt{MSE\left(\frac{1}{t}+\frac{1}{t}\right)}$$

Reject equality of $\mu_k$ and $\mu_{k'}$ if $|\bar{Y}_k - \bar{Y}_{k'}| > LSD$.

Equivalently you could perform t-tests:

$H_0$: $\mu_k - \mu_{k'} = 0$ vs. $H_a$: $\mu_k - \mu_{k'} \neq 0$

$$t_0 = \frac{\bar{Y}_k - \bar{Y}_{k'}}{\sqrt{MSE\left(\frac{1}{t}+\frac{1}{t}\right)}}$$

Reject $H_0$ if $|t_0| > t_{\alpha/2,\,(t-1)(t-2)}$, the critical value on the error degrees of freedom, or if P-value < 0.05.

Conclusion:

Note: In addition to the LSD procedure (or t-tests), Scheffe's procedure, Bonferroni's procedure and any of the other multiple comparison procedures or contrasts discussed in class can be used.
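The sums of squares in the shelf-space analysis of variance table can be checked directly from the data. The following is a minimal sketch in plain Python, not the software used for the course; the names `cells` and `latin_square_anova` are hypothetical, and the layout (weeks as rows of the list, stores as positions within each row) is an assumption read off the data table above.

```python
# Shelf-space Latin square. Each inner list is one week (1-6); positions
# within it are stores 1-6. Each cell is (sales, shelf-space index).
# The layout is an assumption taken from the data table in the notes.
cells = [
    [(27, 5), (34, 6), (39, 2), (40, 3), (15, 4), (16, 1)],
    [(14, 4), (31, 5), (67, 6), (57, 1), (15, 3), (15, 2)],
    [(18, 3), (34, 4), (31, 5), (39, 2), (11, 1), (14, 6)],
    [(35, 1), (46, 3), (49, 4), (70, 6), (9, 2), (12, 5)],
    [(26, 6), (37, 2), (38, 1), (37, 4), (18, 5), (19, 3)],
    [(22, 2), (23, 1), (48, 3), (50, 5), (17, 6), (22, 4)],
]


def latin_square_anova(cells):
    """Return (sums of squares, MSE, F0) for a t x t Latin square."""
    t = len(cells)
    values = [y for row in cells for y, _ in row]
    correction = sum(values) ** 2 / (t * t)  # (grand total)^2 / t^2

    def ss_from_totals(totals):
        # Every total is a sum of t observations.
        return sum(x * x for x in totals) / t - correction

    week_totals = [sum(y for y, _ in row) for row in cells]
    store_totals = [sum(row[j][0] for row in cells) for j in range(t)]
    shelf_totals = [0.0] * t
    for row in cells:
        for y, k in row:
            shelf_totals[k - 1] += y  # treatment labels run from 1 to t

    ss = {
        "store": ss_from_totals(store_totals),
        "week": ss_from_totals(week_totals),
        "shelf": ss_from_totals(shelf_totals),
        "total": sum(y * y for y in values) - correction,
    }
    ss["error"] = ss["total"] - ss["store"] - ss["week"] - ss["shelf"]
    mse = ss["error"] / ((t - 1) * (t - 2))
    f0 = (ss["shelf"] / (t - 1)) / mse
    return ss, mse, f0


ss, mse, f0 = latin_square_anova(cells)
for name in ("store", "week", "shelf", "error", "total"):
    print(f"{name:6s} SS = {ss[name]:.2f}")
print(f"MSE = {mse:.2f}, F0 = {f0:.2f}")
```

Under these assumptions the computed values agree with the table (for example, an error sum of squares of 1291.00 on 20 degrees of freedom and F0 of about 1.48). Since a standard F table gives F(0.05, 5, 20) of roughly 2.71, this F0 would not lead to rejecting equality of the shelf-space means at the 0.05 level.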
Replicated Latin Squares

Latin squares can be replicated to increase the precision of an experiment. Replication can be approached in two manners: multiple independent tables, or multiple tables with a common set of rows or columns. For a four-treatment (t = 4) Latin square design with two replicates we have:

Independent Latin Squares

                     Column
Row    1    2    3    4    5    6    7    8
1      A    B    C    D
2      B    C    D    A
3      C    D    A    B
4      D    A    B    C
5                          A    B    C    D
6                          B    C    D    A
7                          C    D    A    B
8                          D    A    B    C

Replicated Latin Squares with Common Rows

                     Column
Row    1    2    3    4    5    6    7    8
1      A    B    C    D    A    B    C    D
2      B    C    D    A    B    C    D    A
3      C    D    A    B    C    D    A    B
4      D    A    B    C    D    A    B    C

The analysis of a Latin square design with s independently replicated tables is similar to the analysis for a single Latin square design, except that the model must account for variation among replicates. Thus, the model is of the form:

$$Y_{ijkl} = \mu + \tau_k + s_l + r_{i(l)} + c_{j(l)} + e_{ijkl}$$

where $s_l$ is the effect of the lth table and $r_{i(l)}$ and $c_{j(l)}$ are the row and column effects nested within table l.

Source       df
Table        s-1
Row          s(t-1)
Column       s(t-1)
Treatment    t-1
Error        (st-s-1)(t-1)

The analysis of a Latin square design with s replicated tables having common rows is adjusted for independent columns, but uses the same rows in each table. Thus, the model is of the form:

$$Y_{ijkl} = \mu + \tau_k + s_l + r_i + c_{j(l)} + e_{ijkl}$$

Source       df
Table        s-1
Row          t-1
Column       s(t-1)
Treatment    t-1
Error        (st-2)(t-1)

An experiment with six yearling dairy heifers was conducted as two Latin squares. Treatments were three rations selected on the basis of diverse quality and physical characteristics and fed ad libitum. Each animal ate the three rations sequentially, one week on each. The response, Y, is pounds of dry matter consumed per 100 lb of body weight. The three treatments were (1) alfalfa hay, (2) corn silage, (3) blue-grass straw pellets.
               Square 1                          Square 2
Week    Heifer 1   Heifer 2   Heifer 3    Heifer 4   Heifer 5   Heifer 6
1       2.7 (1)    2.6 (2)    1.9 (3)     3.3 (1)    2.3 (2)    0.1 (3)
2       2.2 (2)    0.2 (3)    2.3 (1)     1.7 (3)    2.8 (1)    1.8 (2)
3       1.9 (3)    2.1 (1)    2.4 (2)     2.1 (2)    1.7 (3)    2.7 (1)

Source                       Sum of Squares    Degrees of Freedom    Mean Squares    F0
Square                       0.0050            1                     0.0050
Weeks (Rows)                 0.4444            2                     0.2222
Heifers(Square) (Columns)    1.9244            4                     0.4811
Feed (Treatments)            6.1644            2                     3.0822          9.62
Error                        2.5644            8                     0.3206
Total                        11.2028           17

F(0.05, 2, 8) = 4.4590

For the above example, F0 = 9.62 exceeds 4.4590, so we would reject the null hypothesis of equal means for the three feed treatments.

Randomized Complete Block Design with a Two-way Treatment Structure

Example

A computer company, to test the efficiency of its new programmable calculator, selected six engineers who were proficient in the use of both this calculator and an earlier model and asked them to work out two problems on both calculators. One of the problems was statistical in nature; the other was an engineering problem. The order of the four calculations was randomized independently for each engineer. The length of time (in minutes) required to solve each problem was observed and is presented in the following table:

Data for the RCBD with Two-way Treatment Structure

                         Problem
             Statistical            Engineering
             Model                  Model
Engineer     New      Earlier       New      Earlier
1            3.1      7.5           2.5      5.1
2            3.8      8.1           2.8      5.3
3            3.0      7.6           2.0      4.9
4            3.4      7.8           2.7      5.5
5            3.3      6.9           2.5      5.4
6            3.6      7.8           2.4      4.8

Cell Means:
Ȳ(Statistical, New) = 3.367      Ȳ(Statistical, Earlier) = 7.617
Ȳ(Engineering, New) = 2.483      Ȳ(Engineering, Earlier) = 5.167

Marginal Means:
Ȳ(New) = 2.925      Ȳ(Earlier) = 6.392
Ȳ(Statistical) = 5.492      Ȳ(Engineering) = 3.825

Model:

$$Y_{ijk} = \mu + r_k + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk} \quad \text{(effects model)}$$

i = 1, 2, …, a
j = 1, 2, …, b
k = 1, 2, …, r

$\mu$ - grand mean
$\alpha_i$ - treatment effect for the ith level of factor A ($\alpha_i = \mu_{i\cdot\cdot} - \mu_{\cdot\cdot\cdot}$)
$\beta_j$ - treatment effect for the jth level of factor B ($\beta_j = \mu_{\cdot j\cdot} - \mu_{\cdot\cdot\cdot}$)
$(\alpha\beta)_{ij}$ - interaction effect between the ith level of factor A and the jth level of factor B ($(\alpha\beta)_{ij} = \mu_{ij\cdot} - \mu_{i\cdot\cdot} - \mu_{\cdot j\cdot} + \mu_{\cdot\cdot\cdot}$)
$r_k$ - block effect (random factor) ($r_k = \mu_{\cdot\cdot k} - \mu_{\cdot\cdot\cdot}$)
$e_{ijk}$ - error for the kth block within the ijth treatment combination

Decomposition of the total sum of squares:

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}\left(Y_{ijk}-\bar{Y}_{\cdot\cdot\cdot}\right)^2 = ab\sum_{k=1}^{r}\left(\bar{Y}_{\cdot\cdot k}-\bar{Y}_{\cdot\cdot\cdot}\right)^2 + br\sum_{i=1}^{a}\left(\bar{Y}_{i\cdot\cdot}-\bar{Y}_{\cdot\cdot\cdot}\right)^2 + ar\sum_{j=1}^{b}\left(\bar{Y}_{\cdot j\cdot}-\bar{Y}_{\cdot\cdot\cdot}\right)^2$$
$$\qquad + \; r\sum_{i=1}^{a}\sum_{j=1}^{b}\left(\bar{Y}_{ij\cdot}-\bar{Y}_{i\cdot\cdot}-\bar{Y}_{\cdot j\cdot}+\bar{Y}_{\cdot\cdot\cdot}\right)^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}\left(Y_{ijk}-\bar{Y}_{ij\cdot}-\bar{Y}_{\cdot\cdot k}+\bar{Y}_{\cdot\cdot\cdot}\right)^2$$

or

$$SS_{Total} = SS_{Blocks} + SS_A + SS_B + SS_{AB} + SS_{Error}$$

Mean squares and their expectations:

$$MS_{Blocks} = \frac{SS_{Blocks}}{r-1}, \qquad E(MS_{Blocks}) = \sigma_e^2 + ab\,\sigma_r^2$$
$$MS_A = \frac{SS_A}{a-1}, \qquad E(MS_A) = \sigma_e^2 + \frac{rb\sum_{i=1}^{a}\alpha_i^2}{a-1}$$
$$MS_B = \frac{SS_B}{b-1}, \qquad E(MS_B) = \sigma_e^2 + \frac{ra\sum_{j=1}^{b}\beta_j^2}{b-1}$$
$$MS_{AB} = \frac{SS_{AB}}{(a-1)(b-1)}, \qquad E(MS_{AB}) = \sigma_e^2 + \frac{r\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2}{(a-1)(b-1)}$$
$$MSE = \frac{SS_{Error}}{(ab-1)(r-1)}, \qquad E(MSE) = \sigma_e^2$$

ANOVA Table

Source         Sum of Squares    Degrees of Freedom    Mean Squares    F0
Blocks         SSBlocks          r-1                   MSBlocks
Factor A       SSA               a-1                   MSA             MSA/MSE
Factor B       SSB               b-1                   MSB             MSB/MSE
Interaction    SSAB              (a-1)(b-1)            MSAB            MSAB/MSE
Error          SSE               (ab-1)(r-1)           MSE
Total          SST               abr-1

Data for the RCBD with Two-way Treatment Structure

                         Problem
             Statistical            Engineering
             Model                  Model
Engineer     New      Earlier       New      Earlier
1            3.1      7.5           2.5      5.1
2            3.8      8.1           2.8      5.3
3            3.0      7.6           2.0      4.9
4            3.4      7.8           2.7      5.5
5            3.3      6.9           2.5      5.4
6            3.6      7.8           2.4      4.8

ANOVA Table

Source                Sum of Squares    Degrees of Freedom    Mean Squares    F0
Blocks (Engineers)    1.0533            5                     0.2107
Problem               16.6667           1                     16.6667         247.52
Model                 72.1067           1                     72.1067         1070.89
Problem*Model         3.6817            1                     3.6817          54.68
Error                 1.0100            15                    0.0673
Total                 94.5183           23

F(0.05, 1, 15) = 4.5431

For the above example, the interaction F0 = 54.68 exceeds 4.5431, so we would reject the null hypothesis of no interaction and therefore have to compare cell means, not marginal means.
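The decomposition above can be applied to the calculator data to check the entries of the final ANOVA table. The following is a minimal sketch in plain Python, not the course software; the names `times` and `rcbd_factorial_anova` are hypothetical, and the column order (Statistical/New, Statistical/Earlier, Engineering/New, Engineering/Earlier) is an assumption taken from the data table, with factor A as problem type and factor B as calculator model.

```python
# Solution times (minutes). One row per engineer (block); columns are assumed
# to be Statistical/New, Statistical/Earlier, Engineering/New,
# Engineering/Earlier, i.e. factor A = problem, factor B = model.
times = [
    [3.1, 7.5, 2.5, 5.1],
    [3.8, 8.1, 2.8, 5.3],
    [3.0, 7.6, 2.0, 4.9],
    [3.4, 7.8, 2.7, 5.5],
    [3.3, 6.9, 2.5, 5.4],
    [3.6, 7.8, 2.4, 4.8],
]


def rcbd_factorial_anova(y, a, b):
    """Sums of squares and F ratios for an a x b factorial in r blocks.

    y[k][i * b + j] is the response in block k at level i of A and level j of B.
    """
    r = len(y)
    n = a * b * r
    correction = sum(sum(row) for row in y) ** 2 / n

    block_tot = [sum(row) for row in y]
    cell_tot = [sum(row[c] for row in y) for c in range(a * b)]
    a_tot = [sum(cell_tot[i * b + j] for j in range(b)) for i in range(a)]
    b_tot = [sum(cell_tot[i * b + j] for i in range(a)) for j in range(b)]

    ss_total = sum(v * v for row in y for v in row) - correction
    ss_blocks = sum(x * x for x in block_tot) / (a * b) - correction
    ss_a = sum(x * x for x in a_tot) / (b * r) - correction
    ss_b = sum(x * x for x in b_tot) / (a * r) - correction
    ss_cells = sum(x * x for x in cell_tot) / r - correction
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = ss_total - ss_blocks - ss_cells

    mse = ss_error / ((a * b - 1) * (r - 1))
    return {
        "ss_blocks": ss_blocks, "ss_a": ss_a, "ss_b": ss_b,
        "ss_ab": ss_ab, "ss_error": ss_error, "ss_total": ss_total,
        "mse": mse,
        "f_a": (ss_a / (a - 1)) / mse,
        "f_b": (ss_b / (b - 1)) / mse,
        "f_ab": (ss_ab / ((a - 1) * (b - 1))) / mse,
    }


table = rcbd_factorial_anova(times, a=2, b=2)
for key, value in table.items():
    print(f"{key:9s} = {value:.4f}")
```

Under these assumptions the larger main-effect sum of squares (about 72.1067, F0 about 1070.89) belongs to the calculator model (New vs. Earlier), the smaller one (about 16.6667, F0 about 247.52) to the problem type, and the interaction F0 comes out near 54.68.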