Completely Randomized Design (CRD) (Ch. 14, O&L)

A CRD is used with homogeneous units: essentially all the units we have are "alike". We randomly assign treatments to the units. In terms of design theory, CRD uses the tenets of randomization and replication. It does not need local control because the units are homogeneous. The analysis of a CRD with a single FACTOR is the ONE-WAY ANOVA that we discussed earlier. Most of this chapter is review; it combines Lecture 2 (one-way ANOVA) and Lecture 5 (factorial structures).

Example: We are interested in comparing three brands of cake mix for how much the cake rises. The brands are Betty Crocker, Duncan Hines, and the store brand (say IGA). We can afford to use 4 boxes of cake mix for each brand. How would we design this study? Here we have a CRD with a single factor, and we can use one-way ANOVA to analyze the data. What would the ANOVA table look like in this scenario?

Source   Degrees of Freedom   Sum of Squares   Mean Square      F test statistic   p-value
Brand    3 - 1 = 2            SSF              MSF = SSF/dff    F = MSF/MSE        *
Error    12 - 3 = 9           SSE              MSE = SSE/dfe
Total    12 - 1 = 11          TSS

Let us review the analysis of one-way and factorial designs in a completely randomized experiment.

Examples of Completely Randomized Designs:

Example 1: An experiment studied the shelf life of meats stored under four (4) different conditions: commercial plastic wrap, vacuum packaged, mixed gas atmosphere, and CO2. Twelve steaks were randomly selected, with three steaks randomly assigned to each of the four treatments.
Treatment structure is one-way. ANALYSIS is ONE-WAY ANOVA.

Example 2: The variation in tensile strength of asphaltic concrete is thought to be associated with the compaction method and the aggregate type. Four compaction methods (static, regular, low and very low) were considered along with two aggregate types (basaltic and siliceous). Three replicates were constructed in random order for each of the eight (8) treatments.
Treatment structure is two-way (compaction method and aggregate type). ANALYSIS is TWO-WAY ANOVA WITH INTERACTION.

How to Randomize?

1. Number the experimental units from 1 to N, where N = rt for t treatments with r replicates each.
2. Generate a set of N random numbers (for example, from a random number table).
3. Rank the numbers from smallest to largest. These ranks correspond to experimental unit numbers.
4. Assign the first group of r units in the sequence to treatment A, the second group of r units to treatment B, and so on.

Illustration: Consider the steak storage example, in which there are four treatments (commercial plastic wrap, vacuum packaged, mixed gas atmosphere, and CO2), each with three replicates.

Sequence   Random number   Rank (unit number)   Treatment
1          448             4                    A
2          699             7                    A
3          340             3                    A
4          733             9                    B
5          580             6                    B
6          852             12                   B
7          514             5                    C
8          723             8                    C
9          152             2                    C
10         744             10                   D
11         828             11                   D
12         041             1                    D

So treatment A goes to steaks 4, 7 and 3; treatment B to steaks 9, 6 and 12; treatment C to steaks 5, 8 and 2; and treatment D to steaks 10, 11 and 1.

The model and assumptions are the same as we discussed in earlier lectures (cell means or treatment effects).

[Figure: graphical representation of the cell means model]
[Figure: graphical representation of the effects model]

Least Squares Estimators

The errors can be written as:

$$e_{ij} = Y_{ij} - \mu_i$$

Since $\mu_i$ is unknown, an estimator of the error is of the form:

$$\hat{e}_{ij} = Y_{ij} - \hat{\mu}_i$$

where $\hat{\mu}_i$ represents an as yet to be determined estimator of $\mu_i$. It is desired that the estimates of $\mu_i$ have some optimal properties (e.g., unbiased, minimum variance). The method of least squares produces such estimates of $\mu_i$ ($i = 1, 2, \ldots, t$) and is based on minimizing the sum of the squared errors:

$$SSE = \sum_{i=1}^{t}\sum_{j=1}^{r} \hat{e}_{ij}^{\,2} = \sum_{i=1}^{t}\sum_{j=1}^{r} \left(Y_{ij} - \hat{\mu}_i\right)^2$$

The minimizing value is the treatment sample mean, $\hat{\mu}_i = \bar{Y}_{i\cdot}$.

ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square                           F
Treatments            t - 1                SSTreatments     MSTreatments = SSTreatments/(t - 1)   F = MSTreatments/MSError
Error                 N - t                SSError          MSError = SSError/(N - t)
Total                 N - 1                SSTotal

Here $N = tr$ if the sample sizes for each treatment are the same, and $N = \sum_{i=1}^{t} r_i$ if the sample sizes differ for the treatments.
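The four randomization steps listed under "How to Randomize?" can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the course software: the function names (`rank_positions`, `assign_treatments`, `randomize_crd`) are made up for this example, and a pseudo-random generator stands in for the random number table. The last lines feed in the random numbers from the steak illustration so the ranks and treatment groups can be compared with the table.

```python
import random


def rank_positions(draws):
    """Rank of each draw (1 = smallest). The rank is the experimental unit number."""
    order = sorted(range(len(draws)), key=lambda i: draws[i])
    ranks = [0] * len(draws)
    for rank, position in enumerate(order, start=1):
        ranks[position] = rank
    return ranks


def assign_treatments(draws, treatments, r):
    """Step 4: the first r ranked units get treatments[0], the next r get treatments[1], ..."""
    if len(draws) != r * len(treatments):
        raise ValueError("need exactly one random number per experimental unit")
    ranks = rank_positions(draws)
    plan = {trt: [] for trt in treatments}
    for position, unit in enumerate(ranks):
        plan[treatments[position // r]].append(unit)
    return plan


def randomize_crd(treatments, r, seed=None):
    """Steps 1-4 with pseudo-random draws instead of a random number table (assumption)."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(r * len(treatments))]
    return assign_treatments(draws, treatments, r)


# Random numbers copied from the steak storage illustration (041 entered as 41).
table_draws = [448, 699, 340, 733, 580, 852, 514, 723, 152, 744, 828, 41]
ranks = rank_positions(table_draws)
plan = assign_treatments(table_draws, ["A", "B", "C", "D"], r=3)
print(ranks)  # [4, 7, 3, 9, 6, 12, 5, 8, 2, 10, 11, 1]
print(plan)   # {'A': [4, 7, 3], 'B': [9, 6, 12], 'C': [5, 8, 2], 'D': [10, 11, 1]}
```

Calling `randomize_crd(["A", "B", "C", "D"], r=3)` produces a fresh random layout for the same design; passing a `seed` makes the layout reproducible.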
Tests of Hypotheses

Recall that if SSTreatments is small, this would indicate that there is no "important" difference between the various treatment means. However, "small" must be interpreted in a statistical or probabilistic sense. In the ANOVA table we alluded to the F statistic:

$$F = \frac{SS_{Treatments}/(t-1)}{SS_{Error}/(N-t)} = \frac{MS_{Treatments}}{MS_{Error}}$$

Expected mean squares:

$$E(MS_{Error}) = \sigma^2, \qquad E(MS_{Treatments}) = \sigma^2 + \frac{r}{t-1}\sum_{i=1}^{t}\tau_i^2$$

where $\tau_i = \mu_i - \bar{\mu}_\cdot$ is the effect of the ith treatment. Thus the F statistic, which is the ratio of the observed MSTreatments to MSError, is approximately 1.0 if there is no treatment effect and tends to be greater than 1.0 if there is a treatment effect. This statistic represents a measure of the deviation from the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_t$.

The probability distribution of the F statistic is well known, with most elementary statistics texts having the probabilities tabulated. The distribution of the F statistic depends on three parameters: $\nu_1$, $\nu_2$ and $\lambda$. The parameter $\nu_1$ represents the numerator degrees of freedom (t - 1), while $\nu_2$ represents the denominator degrees of freedom (N - t). The parameter $\lambda$ is known as the noncentrality parameter and is zero when the null hypothesis is true.

[Figure: F distribution with $\nu_1$ and $\nu_2$ degrees of freedom]

Example 3: The Environmental Protection Agency (EPA) utilizes the services of a number of analytic laboratories. It is of interest to the EPA that these laboratories produce equivalent results when asked to analyze water samples for possible contamination. To assess the quality of the analyses from these laboratories, the EPA decided to send each of 3 laboratories a set of 6 samples that were known to have a DDT contamination level of 1000 ppm. Each water sample was constructed independently of the others and randomly assigned to a laboratory. The resulting data are given in the following table:

Laboratory   1      2      3      4      5      6      Mean
1            1005   1015   1033   1028   1023   1043   $\bar{Y}_{1\cdot}$ = 1024.50
2            995    1008   976    1014   1011   982    $\bar{Y}_{2\cdot}$ = 997.67
3            950    975    988    1015   1008   994    $\bar{Y}_{3\cdot}$ = 988.33
                                                       $\bar{Y}_{\cdot\cdot}$ = 1003.5

SSTotal = 9181.0; SSTreatments = 4230.0; SSError = 4950.0 (each rounded to a whole number, so the components need not add exactly to the total).

ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F0
Treatments            3 - 1 = 2            4230.0           2115.0        6.4091
Error                 18 - 3 = 15          4950.0           330.0
Total                 18 - 1 = 17          9181.0

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a: \mu_i \neq \mu_j$ for at least one $i \neq j$
$\alpha = 0.05$
$F_0 = 6.4091$
Reject $H_0$ if $F_0 > F_{0.05,\,2,\,15} = 3.68$.

Conclusion: Reject $H_0$; there is sufficient evidence to conclude that the mean estimated concentration of DDT differs among the three laboratories.

$$P\text{-value} = P(F_{\nu_1,\nu_2} > F_0) = P(F_{2,15} > 6.4091) = 0.0097$$

Standard Errors and Confidence Intervals

Under the assumptions of the ANOVA model, the variance for each treatment is assumed to be the same. This common, or pooled, variance is estimated by MSError:

$$MS_{Error} = S^2$$

As stated earlier, the standard error of a sample mean is the standard deviation of the sample divided by the square root of the sample size. For the ANOVA model, we use the square root of MSError as our estimated standard deviation:

$$SE(\bar{Y}_{i\cdot}) = \frac{S}{\sqrt{r}}$$

Confidence intervals for the treatment means are:

$$\bar{Y}_{i\cdot} \pm t_{\alpha/2,\,N-t}\,\frac{S}{\sqrt{r}}$$

Example 3 (continued): For the EPA laboratory data the treatment means, treatment sample sizes and MSError are:

MSError = 330.0, S = 18.17, N = 18

Laboratory   Mean      Replicates   Standard error
1            1024.50   $r_1 = 6$    $18.17/\sqrt{6} = 7.42$
2            997.67    $r_2 = 6$    $18.17/\sqrt{6} = 7.42$
3            988.33    $r_3 = 6$    $18.17/\sqrt{6} = 7.42$

95% Confidence Intervals for Treatment Means: df = N - t = 18 - 3 = 15, $t_{0.025,\,15} = 2.131$

$\bar{Y}_{1\cdot}$: 1024.50 ± 2.131(7.42), or (1008.7 to 1040.3)
$\bar{Y}_{2\cdot}$: 997.67 ± 2.131(7.42), or (981.9 to 1013.5)
$\bar{Y}_{3\cdot}$: 988.33 ± 2.131(7.42), or (972.5 to 1004.1)

Unequal Number of Replicates

Suppose that the number of replicates per treatment differs. That is, suppose that the number of observations for the ith treatment is $r_i$, $i = 1, 2, \ldots, t$. The following changes result from this:

$$Y_{ij} = \mu_i + e_{ij}, \qquad i = 1, 2, \ldots, t; \quad j = 1, 2, \ldots, r_i$$

where $r_i$ is the number of replications for the ith treatment group.

Decomposition of the Sum of Squares:

$$\sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(Y_{ij}-\bar{Y}_{\cdot\cdot}\right)^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2 + \sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(Y_{ij}-\bar{Y}_{i\cdot}\right)^2$$

ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square    Expected Mean Square      F0
Treatments            t - 1                SSTreatments     MSTreatments   $\sigma^2 + \theta_t^2$   MSTreatments/MSError
Error                 N - t                SSError          MSError        $\sigma^2$
Total                 N - 1                SSTotal

where

$$SS_{Treatments} = \sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2 = \sum_{i=1}^{t} r_i\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2$$

$$SS_{Error} = \sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(Y_{ij}-\bar{Y}_{i\cdot}\right)^2, \qquad SS_{Total} = \sum_{i=1}^{t}\sum_{j=1}^{r_i}\left(Y_{ij}-\bar{Y}_{\cdot\cdot}\right)^2$$

$$\theta_t^2 = \frac{1}{t-1}\sum_{i=1}^{t} r_i\left(\mu_i-\bar{\mu}_\cdot\right)^2, \qquad \bar{\mu}_\cdot = \frac{\sum_{i=1}^{t} r_i\mu_i}{N}, \qquad N = \sum_{i=1}^{t} r_i$$

Standard Error of the Mean

$$SE(\bar{Y}_{i\cdot}) = \frac{S}{\sqrt{r_i}}$$

Confidence Intervals

$$\bar{Y}_{i\cdot} \pm t_{\alpha/2,\,N-t}\,\frac{S}{\sqrt{r_i}}, \qquad \text{where } N = \sum_{i=1}^{t} r_i$$

Example: Unequal Replication

An experiment was conducted to compare the yields of five lentil varieties under rainfall conditions in northern Syria. The experiment was planned as a completely randomized design with each variety replicated four times. During the growing season, however, sheep broke through the fence and heavily grazed four plots along the edge of the experiment before they were detected. The yields, in kilograms per hectare, from the remaining plots are given in the table below, together with the analysis of variance for the data.
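The unequal-replication formulas for the sums of squares and standard errors can be checked numerically. The following is a minimal sketch in plain Python, not the SAS or Minitab routine used in this course; the function name `one_way_anova` and the dictionary layout are assumptions made for this illustration. The yields are the lentil data from this example, grouped by variety.

```python
from math import sqrt


def one_way_anova(groups):
    """One-way ANOVA for a CRD. `groups` maps a treatment label to its list of responses."""
    t = len(groups)
    n_total = sum(len(y) for y in groups.values())           # N = sum of r_i
    grand_mean = sum(sum(y) for y in groups.values()) / n_total
    means = {k: sum(y) / len(y) for k, y in groups.items()}

    # SSTreatments = sum_i r_i (Ybar_i. - Ybar..)^2
    ss_trt = sum(len(y) * (means[k] - grand_mean) ** 2 for k, y in groups.items())
    # SSError = sum_i sum_j (Y_ij - Ybar_i.)^2
    ss_err = sum((v - means[k]) ** 2 for k, y in groups.items() for v in y)
    # SSTotal = sum_i sum_j (Y_ij - Ybar..)^2
    ss_tot = sum((v - grand_mean) ** 2 for y in groups.values() for v in y)

    ms_trt = ss_trt / (t - 1)
    ms_err = ss_err / (n_total - t)
    s = sqrt(ms_err)                                          # pooled standard deviation
    return {
        "df": (t - 1, n_total - t, n_total - 1),
        "ss_trt": ss_trt, "ss_err": ss_err, "ss_tot": ss_tot,
        "ms_trt": ms_trt, "ms_err": ms_err,
        "f0": ms_trt / ms_err,
        "means": means,
        "se": {k: s / sqrt(len(y)) for k, y in groups.items()},  # S / sqrt(r_i)
    }


# Lentil yields (kg/ha) by variety, after the four grazed plots were dropped.
lentil = {
    1: [740, 430, 760, 640],
    2: [545, 440, 390],
    3: [325, 290],
    4: [740, 630, 870],
    5: [605, 505, 430, 540],
}
result = one_way_anova(lentil)
print(result["df"])
print(round(result["ss_trt"]), round(result["ss_err"]), round(result["ss_tot"]))
print(round(result["f0"], 2))
```

With these data the sketch gives SSTreatments of about 296,279, SSError of about 126,421 and F0 of about 6.44 on 4 and 11 degrees of freedom. Note that the standard errors differ by variety because each uses its own $r_i$ with the same pooled S.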
The sums of squares in the ANOVA table are computed from the following data:

Variety   1       2       3       4       5       Total
          740     545     325     740     605
          430     440     290     630     505
          760     390             870     430
          640                             540
Sum       2570    1375    615     2240    2080    8880
r_i       4       3       2       3       4       16
Mean      642.5   458.3   307.5   746.7   520.0   555.0

ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F0
Treatments            4                    296,279          74,070        6.44
Error                 11                   126,421          11,493
Total                 15                   422,700

MINITAB OUTPUT

One-way ANOVA: Yields versus Variiety

Source     DF      SS      MS     F      P
Variiety    4  296279   74070  6.44  0.006
Error      11  126421   11493
Total      15  422700

S = 107.2   R-Sq = 70.09%   R-Sq(adj) = 59.22%

Level   N    Mean   StDev
1       4   642.5   151.1
2       3   458.3    79.1
3       2   307.5    24.7
4       3   746.7   120.1
5       4   520.0    72.9

Pooled StDev = 107.2

[Character plot: individual 95% CIs for each level mean, based on pooled StDev; axis ticks at 200, 400, 600, 800]

In CRD there are two types of models: fixed effects models and random effects models (discussed earlier).

Examples:
Fixed: A scientist develops three new fungicides. His interest is in these fungicides only.
Random: A scientist is interested in the way a fungicide works. He selects, at random, three fungicides from a group of similar fungicides to study the action.

One-way ANOVA Table for Fixed/Random Effects (CRD), with N = r*t:

Source of Variation   Degrees of Freedom   Mean Square    Expected Mean Square (Fixed)                        Expected Mean Square (Random)
Treatments            t - 1                MSTreatments   $\sigma^2 + \frac{r}{t-1}\sum_{i=1}^{t}\tau_i^2$    $\sigma^2 + r\sigma_\tau^2$
Error                 N - t                MSError        $\sigma^2$                                          $\sigma^2$

Here $\tau_i = \mu_i - \bar{\mu}_\cdot$ is the ith treatment effect in the fixed effects model, and $\sigma_\tau^2$ is the variance of the treatment effects in the random effects model.

How Many Replications?

The required number of replications depends on the variance, the significance level, the power of the test, and the size of the difference to be detected. The power approach uses trial and error to solve an equation relating a noncentrality parameter to the number of replications, the size of the difference to be detected, the number of treatments, and the variance. The power resulting from using trial values for the number of replicates can be determined from power charts.
The number of replicates can be changed until an acceptable power is found. Your book has a section on this (Section 14.6), but SAS now makes the calculation easier, and we will use the SAS approach in lab.
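The trial-and-error idea (try a number of replicates, find the power, increase the replicates until the power is acceptable) can be illustrated with a small simulation. Please note the assumptions: this sketch swaps in Monte Carlo simulation for the power charts and for the noncentral F calculation, it is not the SAS procedure used in lab, and the treatment means (1000, 1000, 1020) and standard deviation (18) are hypothetical planning values chosen only for illustration. The function names are also made up for this example.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of responses)."""
    t = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_trt = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_err = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_trt / (t - 1)) / (ss_err / (n_total - t))


def simulate_f(means, sigma, r, n_sim, rng):
    """Simulate n_sim F statistics for a CRD with r normal replicates per treatment."""
    return [
        f_statistic([[rng.gauss(mu, sigma) for _ in range(r)] for mu in means])
        for _ in range(n_sim)
    ]


def estimated_power(means, sigma, r, alpha=0.05, n_sim=2000, seed=1):
    """Monte Carlo estimate of the power of the F test with r replicates per treatment."""
    rng = random.Random(seed)
    # Critical value: empirical (1 - alpha) quantile of F when all means are equal.
    null_f = sorted(simulate_f([0.0] * len(means), sigma, r, n_sim, rng))
    critical = null_f[int((1 - alpha) * n_sim)]
    # Power: fraction of simulated experiments under the alternative that reject H0.
    alt_f = simulate_f(means, sigma, r, n_sim, rng)
    return sum(f > critical for f in alt_f) / n_sim


def replicates_needed(means, sigma, target=0.80, r_max=50, **kwargs):
    """Increase r from 2 until the estimated power reaches the target."""
    for r in range(2, r_max + 1):
        power = estimated_power(means, sigma, r, **kwargs)
        if power >= target:
            return r, power
    raise ValueError("target power not reached within r_max replicates")


# Hypothetical planning values (assumptions, not from the lecture data).
planning_means = [1000.0, 1000.0, 1020.0]
planning_sigma = 18.0
r_needed, power_at_r = replicates_needed(planning_means, planning_sigma)
print(f"about {r_needed} replicates per treatment, estimated power {power_at_r:.2f}")
```

Because the answer comes from simulation, it varies slightly with `seed` and `n_sim`; increasing `n_sim` steadies the estimate at the cost of run time. An exact calculation would instead use the noncentral F distribution, which is what power charts and software tabulate.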