Name

STATISTICS 402B    Spring 2016
Midterm Exam II (100 points)

1. Four different washing solutions (labeled, say, 1, 2, 3, and 4) are being compared to study their effectiveness in retarding bacteria growth in 5-gallon milk containers. The bacteria count remaining in the containers after 3 hours is the response variable.

(a) (6) Give at least one advantage and one disadvantage of conducting this experiment using a completely randomized design. (Hint: think of sample sizes, randomization, and the estimate of experimental error.)

(b) (6) The experiment is run in a laboratory, and only four trials can be run in a day. The experimenter decides to use a randomized complete block design with "Day" as a blocking factor. If measurements are taken over five days, give an experimental plan for the experiment (i.e., the order in which the experimental runs are to be made each day).

(c) (8) Suppose the model is $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$; $i = 1, \dots, 4$ (Solutions), $j = 1, \dots, 5$ (Days). Fill out a partial ANOVA table (SV and d.f. only) for the experiment described in part (b).

SV          d.f.
Solution
Day
Error
Total

(d) (6) Calculate a 95% CI for $\tau_1 - \tau_2$ given that $\bar{y}_{1.} = 23.1$, $\bar{y}_{2.} = 25.2$, $SSE = 23.52$, and $t_{.025,12} = 2.179$. (Hint: first use the table above and the given $SSE$ to calculate $MSE$.)

(e) (8) Suppose that 4 chemical analysts are available. The experimenter decides to use a Latin square design in order to control the variation among the different days as well as among the different analysts. Thus Day and Analyst are blocking factors. Show a possible experimental plan the experimenter could have used (after randomization). Start with a basic plan.

2. The following table gives the percent shrinkage during dyeing of 4 types of fabric at 4 different dye temperatures. The effects of both the types of fabric and the dye temperatures, as well as any combined effects, were of interest.
The fabric-temperature combinations were allocated completely at random to 32 experimental runs so that each combination was replicated twice, and the runs were made in random order. The data are shown below:

                       Temperature
Fabric   210°F      215°F      220°F      225°F
I        0.8, 1.6   1.8, 2.4   3.2, 4.0   7.5, 8.2
II       2.2, 1.8   3.4, 2.8   5.2, 4.6   9.8, 9.2
III      2.8, 3.6   4.2, 4.8   6.2, 7.4   11.8, 12.6
IV       2.2, 2.8   3.3, 3.9   5.2, 5.8   10.4, 11.0

(a) (6) Describe the type of experiment used here by naming both the treatment arrangement and the experimental design used.

(b) (10) Complete the ANOVA table for this experiment. You may use the JMP output, noting that some numbers are missing from it.

Source of Variation    d.f.   SS   MS   F   p-value
Fabric
Temperature
Fabric*Temperature
Error                  16
Total                  31

(c) (6) Obtain numbers from the JMP output and fill out the following table of means:

                       Temperature
Fabric                 210°F   215°F   220°F   225°F   $\bar{y}_{i..}$
I
II
III
IV
$\bar{y}_{.j.}$

(d) (6) How would you determine whether there is significant interaction between the fabric type and temperature effects? (If you perform a test of hypothesis, you must give the test statistic, the p-value, and your decision.) Does the plot in the JMP output support your conclusion? Explain how or why.

(e) (6) Calculate the 95% CI for the difference in the effects of Fabric I and Fabric II. ($t_{.025,16} = 2.12$)

(f) (6) Calculate the 95% CI for the difference in the effects of Temperature 220°F and Temperature 225°F. ($t_{.025,16} = 2.12$)

3. (a) (6) Give two advantages of using a factorial experiment instead of single-factor experiments to study the effects of several factors.

(b) (4) Explain how to define the main effect A of a $2^2$ factorial with factors A and B. Recall that the treatment means are identified using the notation (1), a, b, ab.

(c) (4) Explain how to define the interaction effect AB of a $2^2$ factorial with factors A and B. Recall that the treatment means are identified using the notation (1), a, b, ab.
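
For readers reviewing the (1), a, b, ab notation recalled in parts (b) and (c), the sketch below shows one common textbook convention, in which each effect is a contrast of the four treatment means divided by 2. The function name `effects_2x2` and the four means passed to it are hypothetical illustrations, not exam data, and the exam may expect a differently worded definition.

```python
def effects_2x2(m1, a, b, ab):
    """Estimate A, B, and AB effects from the four treatment MEANS of a 2^2 factorial.

    m1 stands for the mean written (1) in the exam's notation: both factors low.
    If treatment totals over n replicates are supplied instead of means,
    divide each result by n as well.
    """
    effect_a = ((a + ab) - (m1 + b)) / 2    # average response at high A minus at low A
    effect_b = ((b + ab) - (m1 + a)) / 2    # average response at high B minus at low B
    effect_ab = ((ab + m1) - (a + b)) / 2   # half the change in the A effect across B
    return effect_a, effect_b, effect_ab


# Hypothetical treatment means, chosen only to illustrate the arithmetic.
A, B, AB = effects_2x2(m1=20.0, a=30.0, b=25.0, ab=45.0)
print(A, B, AB)
```

A nonzero AB value means the effect of A differs between the low and high levels of B, which is what non-parallel lines in an interaction plot show.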
Spring 2016    STATISTICS 402B    Examination 2

FORMULA SHEET

Single Factor in a CRD

Levels of the factor are the treatments; there are $a$ treatments.

Model: $y_{ij} = \mu_i + e_{ij}$, where $e_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, or equivalently, the effects model $y_{ij} = \mu + \tau_i + e_{ij}$, where $i = 1, \dots, a$; $j = 1, \dots, n_i$.

ANOVA table to test $H_0: \mu_1 = \dots = \mu_a$ vs $H_a$: at least one inequality, or $H_0: \tau_1 = \dots = \tau_a$ vs $H_a$: at least one inequality:

Source of Variation   d.f.   SS      MS      F
Treatment             a-1    SSTrt   MSTrt   MSTrt/MSE
Error                 N-a    SSE     MSE
Total                 N-1    SST

In the above table, $N = \sum_{i=1}^{a} n_i$ if the sample sizes are $n_1, n_2, \dots, n_a$. If the sample sizes are equal, i.e., $n_1 = n_2 = \dots = n_a = n$, we have $N = an$.

Reject $H_0$ at the $\alpha$ level of significance if $F > F_{\alpha, a-1, N-a}$, or use the p-value from JMP.

LSD Procedure (can be used this way only when all sample sizes are equal to $n$):
$LSD_\alpha = t_{\alpha/2, N-a} \cdot s_E \sqrt{2/n}$, where $s_E = \sqrt{MSE}$.

Tukey's Method (can be used this way only when all sample sizes are equal to $n$):
$Tukey_\alpha = HSD_\alpha = Q \cdot s_E \sqrt{2/n}$, where $Q$ is taken from the JMP output.

Single Factor in a RCBD

There are $a$ treatments and $b$ blocks.

Model: $y_{ij} = \underbrace{\mu + \tau_i}_{\mu_i} + \beta_j + \epsilon_{ij}$, $i = 1, \dots, a$; $j = 1, \dots, b$, where $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$.

ANOVA table to test $H_0: \mu_1 = \dots = \mu_a$ vs $H_a$: at least one inequality, or $H_0: \tau_1 = \dots = \tau_a$ vs $H_a$: at least one inequality:

SV           d.f.            SS      MS              F
Treatments   a-1             SSTrt   SSTrt/(a-1)     MSTrt/MSE
Blocks       b-1             SSBlk   SSBlk/(b-1)
Error        (a-1)(b-1)      SSE     MSE (= s_E^2)
Total        N-1 (= ab-1)    SST

Confidence Intervals for Pairwise Comparisons: A $100(1-\alpha)\%$ C.I. for $\tau_p - \tau_q$ (or $\mu_p - \mu_q$) is
$(\bar{y}_{p.} - \bar{y}_{q.}) \mp t_{\alpha/2, \nu} \cdot s_E \sqrt{2/b}$, where $\nu = (a-1)(b-1)$.

Single Factor in a Latin Square Design

Model: $y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \epsilon_{ijk}$, where $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$ and $y_{ijk}$ is an observation in the $i$th row and $k$th column for the $j$th treatment; $i, j, k = 1, \dots, p$. (Note: $\tau_j$ is the $j$th treatment effect.)

ANOVA table to test $H_0: \tau_1 = \dots = \tau_p$ vs $H_a$: at least one inequality:

SV           d.f.          SS       MS            F
Treatments   p-1           SSTrt    MSTrt         MSTrt/MSE
Rows         p-1           SSRows   MSRows
Cols         p-1           SSCols   MSCols
Error        (p-1)(p-2)    SSE      MSE = s_E^2
Total        p^2-1         SST

Confidence Interval for Pairwise Comparisons: A $100(1-\alpha)\%$ CI for $\tau_p - \tau_q$ is
$(\bar{y}_{.p.} - \bar{y}_{.q.}) \mp t_{\alpha/2, \nu} \cdot s_E \sqrt{2/p}$, where $s_E^2 = MSE$ and $\nu = (p-1)(p-2)$.

Basic Latin Squares: [squares not reproduced here]

Two-way Factorial in a CRD

Two factors, A (at $a$ levels) and B (at $b$ levels), are crossed, giving an A × B factorial with each treatment combination replicated $n$ times.

Model: $y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$, $i = 1, \dots, a$; $j = 1, \dots, b$; $k = 1, \dots, n$, where $\tau_i$ = effect of the $i$-th level of A, $\beta_j$ = effect of the $j$-th level of B, $(\tau\beta)_{ij}$ = interaction effect, and $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$.

ANOVA Table:

SV          d.f.          SS      MS              F
Treatment   ab-1          SSTrt   MSTrt           MSTrt/MSE
A           a-1           SSA     MSA             MSA/MSE
B           b-1           SSB     MSB             MSB/MSE
AB          (a-1)(b-1)    SSAB    MSAB            MSAB/MSE
Error       ab(n-1)       SSE     MSE (= s_E^2)
Total       abn-1         SST

Confidence Intervals for Main Effects:

Factor A: A $100(1-\alpha)\%$ C.I. for $\tau_1 - \tau_2$ (or $\bar{\mu}_{1.} - \bar{\mu}_{2.}$) is
$(\bar{y}_{1..} - \bar{y}_{2..}) \mp t_{\alpha/2, \nu} \cdot s_E \sqrt{2/(bn)}$, where $\nu = ab(n-1)$ and $s_E^2 = MSE$.

Factor B: A $100(1-\alpha)\%$ C.I. for $\beta_1 - \beta_2$ (or $\bar{\mu}_{.1} - \bar{\mu}_{.2}$) is
$(\bar{y}_{.1.} - \bar{y}_{.2.}) \mp t_{\alpha/2, \nu} \cdot s_E \sqrt{2/(an)}$, where $\nu = ab(n-1)$ and $s_E^2 = MSE$.

3. (d) (12) The graph below shows the yield totals of a $2^2$ factorial with 3 replications for each treatment combination, where A (Concentration) and B (Catalyst) are the two factors under study:

i. Calculate the Factor A effect.
ii. Calculate the Factor B effect.
iii. Calculate the Interaction AB effect.
iv. Calculate the degrees of freedom for Total SS in this experiment.
v. Calculate the degrees of freedom for Error SS in this experiment.
vi. Calculate the degrees of freedom for Treatment SS in this experiment.

Edited JMP Output for Fabric Dye Data (Problem #2)

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      15   329.32469        21.9550       89.0443   <.0001*
Error      16   3.94500          0.2466
C. Total   31   333.26969

Effect Tests

Source               Nparm   DF   Sum of Squares   F Ratio   Prob > F
Fabric                            37.67594                   <.0001*
Temperature                       288.08094                  <.0001*
Fabric*Temperature                3.56781                    0.1951

Effect Details

Fabric
Least Squares Means Table

Level   Least Sq Mean   Std Error   Mean
I       3.6875000                   3.68750
II      4.8750000                   4.87500
III     6.6750000                   6.67500
IV      5.5750000                   5.57500

Temperature
Least Squares Means Table

Level   Least Sq Mean   Std Error   Mean
210     2.225000                    2.2250
215     3.325000                    3.3250
220     5.200000                    5.2000
225     10.062500                   10.0625

Fabric*Temperature
Least Squares Means Table

Level     Least Sq Mean   Std Error
I,210     1.200000        0.35111430
I,215     2.100000        0.35111430
I,220     3.600000        0.35111430
I,225     7.850000        0.35111430
II,210    2.000000        0.35111430
II,215    3.100000        0.35111430
II,220    4.900000        0.35111430
II,225    9.500000        0.35111430
III,210   3.200000        0.35111430
III,215   4.500000        0.35111430
III,220   6.800000        0.35111430
III,225   12.200000       0.35111430
IV,210    2.500000        0.35111430
IV,215    3.600000        0.35111430
IV,220    5.500000        0.35111430
IV,225    10.700000       0.35111430
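
The sums of squares reported by JMP can be cross-checked against the raw shrinkage data of Problem 2. The sketch below assumes the balanced two-way factorial layout of the formula sheet (4 fabrics, 4 temperatures, 2 replicates per cell). The names `shrinkage`, `mean`, and `two_way_anova` are illustrative choices and not part of the exam or of JMP.

```python
# Percent shrinkage from Problem 2: shrinkage[fabric][temperature] holds the two replicates.
shrinkage = {
    "I":   {210: [0.8, 1.6], 215: [1.8, 2.4], 220: [3.2, 4.0], 225: [7.5, 8.2]},
    "II":  {210: [2.2, 1.8], 215: [3.4, 2.8], 220: [5.2, 4.6], 225: [9.8, 9.2]},
    "III": {210: [2.8, 3.6], 215: [4.2, 4.8], 220: [6.2, 7.4], 225: [11.8, 12.6]},
    "IV":  {210: [2.2, 2.8], 215: [3.3, 3.9], 220: [5.2, 5.8], 225: [10.4, 11.0]},
}


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def two_way_anova(data):
    """Sums of squares and d.f. for a balanced two-way factorial in a CRD."""
    rows = list(data)                    # levels of factor A (fabric)
    cols = list(data[rows[0]])           # levels of factor B (temperature)
    a, b = len(rows), len(cols)
    n = len(data[rows[0]][cols[0]])      # replicates per cell (assumed equal in every cell)

    cell = {(r, c): mean(data[r][c]) for r in rows for c in cols}
    row_mean = {r: mean(cell[r, c] for c in cols) for r in rows}
    col_mean = {c: mean(cell[r, c] for r in rows) for c in cols}
    grand = mean(cell.values())          # equals the overall mean because the design is balanced

    ss = {
        "Fabric": b * n * sum((row_mean[r] - grand) ** 2 for r in rows),
        "Temperature": a * n * sum((col_mean[c] - grand) ** 2 for c in cols),
        "Fabric*Temperature": n * sum(
            (cell[r, c] - row_mean[r] - col_mean[c] + grand) ** 2
            for r in rows for c in cols
        ),
        "Error": sum(
            (y - cell[r, c]) ** 2 for r in rows for c in cols for y in data[r][c]
        ),
    }
    df = {
        "Fabric": a - 1,
        "Temperature": b - 1,
        "Fabric*Temperature": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
    }
    return ss, df


ss, df = two_way_anova(shrinkage)
for source in ss:
    ms = ss[source] / df[source]
    print(f"{source:20s} df={df[source]:2d}  SS={ss[source]:10.5f}  MS={ms:9.4f}")
print(f"{'Total':20s} df={sum(df.values()):2d}  SS={sum(ss.values()):10.5f}")
```

The computed sums of squares should agree with the JMP values 37.67594, 288.08094, 3.56781, and 3.94500 up to rounding, with 16 error and 31 total degrees of freedom.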