Spring 2016 STATISTICS 402B Sample Exam II Questions

1. The effect of five different catalysts (A, B, C, D, E) on the reaction time of a chemical process is being studied. A catalyst is added to a mixture of chemicals, the process is initiated, and it is timed until the reaction is complete.

(a) What is the response?
Reaction time.

(b) What are the treatments?
The 5 catalysts A, B, C, D, E.

(c) What are the experimental units/runs?
Each run of the chemical process.

(d) If the experiment is to be conducted using a completely randomized design, how many runs would you need to detect a difference between any pair of catalyst means of 2 standard deviations with α = 0.05 and β = 0.1?
With a = 5, α = 0.05, β = 0.1, and Δ = 2, the table gives n = 9. Thus you will need 9 × 5 = 45 runs.

(e) Suppose that each prepared batch of the chemical mixture is large enough to do five runs of the process and that six batches were prepared. Describe how you would perform the experiment using a randomized complete block design (RCBD), stating your blocking factor clearly.
Each prepared batch of chemicals will form a block; thus we will have 6 blocks. Each block will consist of 5 runs of the process, each run with one of the catalysts, such that all 5 catalysts are used within a block.

(f) Describe in detail how you would do the randomization in the RCBD experiment you described above.
The catalysts are assigned to the 5 runs within a block randomly, with a separate randomization for each block.

Block
  1     B  C  E  A  D
  2     C  D  A  E  B
  3     C  E  A  B  D
  4     D  A  B  C  E
  5     A  B  E  C  D
  6     E  D  A  B  C

(g) Complete the following skeleton ANOVA table for the RCBD you described.

Source of Variation   d.f.
Catalyst                4
Batch                   5
Error                  20
Total                  29

(h) Give a reason why conducting the experiment as a block design with batches as a blocking variable is better than a completely randomized design using mixtures from different batches completely randomly for performing the runs.
Blocking is used to control a nuisance variable.
If a CRD is used, the variation among different batches of chemical mixtures will be part of the estimate of the experimental error variance. Using batches as a blocking variable, we separate this variation from the error variance.

(i) Suppose each run of the process requires approximately 1.5 hours, so only five runs can be completed in one day. The experimenter decides to run the experiment using a Latin square design. Describe how you would perform the experiment using a Latin square design (LSD) with batches and days as blocking factors. Show the experimental plan.
Use batches (5 different batches) as the row blocking variable and days (5 different days) as the column blocking variable. Choose a basic 5 × 5 Latin square from the text. Randomize the rows first and then the columns, using permutations of the numbers 1 through 5.

                Days
Batches    1  2  3  4  5
   1       A  B  C  D  E
   2       B  A  D  E  C
   3       C  E  A  B  D
   4       D  C  E  A  B
   5       E  D  B  C  A

(j) Given below is the ANOVA table constructed from the results of the Latin square experiment:

Source of Variation   d.f.   SS       MS       F        p-value
Catalyst               4     141.44   35.36    11.309   .0005
Batch                  4      15.44    3.86
Day                    4      12.24    3.06
Error                 12      37.52    3.1267
Total                 24     206.64

Test the hypothesis that the catalyst means are equal against the alternative that some means are different. Give the F-statistic, its degrees of freedom, the p-value, and your decision using α = .05.
F = 11.309 with d.f. = (4, 12), p-value = .0005. Since p < .05, reject H0: catalyst means are equal.

(k) Compute the LSD for comparing the catalyst means.
$\mathrm{LSD}_{.05} = t_{.025,12} \cdot \sqrt{3.1267 \cdot 2/5} = 2.179 \cdot 1.1183 = 2.44$

(l) The following are the catalyst means:

Catalyst   A     B     C     D     E
Means      8.4   5.6   8.8   3.4   3.2

Use the LSD to find pairs of means that are different.
Pairs of means (A, B), (A, D), (A, E), (B, C), (C, D), and (C, E) are different, as the absolute differences of these pairs of means exceed 2.44.

2.
An experiment is conducted to explore the relationship between height of step (5.75 in or 11.5 in) and rate of stepping (14 steps/min, 21 steps/min or 28 steps/min) on the change of heart rate of college students. Six college students are randomly allocated to each step height and stepping rate combination. There are 6 combinations of step height and stepping rate. Thus there were 36 students participating in the experiment. Each student experiences each combination. The order is randomized for each student and enough time separates the trials so that students' heart rates return to a resting rate. The resting heart rate for each student is taken before each trial and the heart rate at the end of 3 minutes of the stepping combination is also measured. The change in heart rate is calculated by subtracting the resting heart rate from the heart rate after stepping. Refer to the JMP output for the Stepping Experiment. Note that this output has been edited so that there are several blank spots.

(a) What are the response, conditions of interest and experimental material?
Response: change in heart rate. Conditions: the 6 factorial treatment combinations of step height (5.75 in or 11.5 in) and rate of stepping (14 steps/min, 21 steps/min or 28 steps/min). Experimental units: the 36 college student participants.

(b) What design was used to collect the data? Explain how you determined what design was used.
The design is a 2-way factorial in a completely randomized design, because 6 students were randomly allocated to each treatment combination.

(c) Write down the complete ANOVA table for this experiment.

Source of Variation   d.f.   SS          MS          F         p-value
Frequency              2     3578.1667   1789.0833   14.7817   <.0001
Height                 1     2466.7778   2466.7778   20.3810   <.0001
Frequency*Height       2       81.0556     40.528     0.3348   0.7181
Error                 30     3631.0000    121.03
Total                 35     9757.0

(d) Comment on the interaction plot.
Describe what you see in the plot and what it indicates about the possible interaction between step height and stepping frequency.
The line segments in the plot are roughly parallel, indicating that the effects of step height do not depend on the stepping frequency.

(e) Do the results from the ANOVA analysis confirm your above conclusion about interaction? Explain why.
The F-statistic for interaction is not significant at the .05 level, as the p-value (0.7181) is very large; this is consistent with there being no interaction.

(f) Are there statistically significant differences among the sample means for the step heights? Report the appropriate F-statistic, P-value, decision, reason for the decision and conclusion.
F = 20.3810, p-value < .0001. Reject H0 of no differences among the means for the step heights, as the p-value is small.

(g) Are there statistically significant differences among the sample means for the stepping frequencies? Report the appropriate F-statistic, P-value, decision, reason for the decision and conclusion.
F = 14.7817, p-value < .0001. Reject H0 of no differences among the means for the stepping frequencies, as the p-value is small.

(h) For comparing treatment (combination of height and frequency) means the value of q is 3.08179. Compute the value of HSD.
$\mathrm{HSD}_{.05}\ (= \text{Tukey}) = Q \cdot s_E \sqrt{2/n} = Q \sqrt{2 s_E^2 / n} = 3.08179 \sqrt{2(121.03)/6} = 19.57$

(i) What would your recommendation be if you wanted the largest average increase in heart rate? Support your answer statistically.

3. (a) Give two advantages of using a factorial experiment instead of single factor experiments to study effects of several factors.
Efficiency: several factors are investigated using the same experimental material rather than conducting separate experiments for each of the factors. Interactions: a factorial experiment allows the combined effects of several factors to be studied and interpreted.

(b) Explain how to define the main effect A of a 2² factorial with factors A and B. Recall that the treatment means are identified using the notation (1), a, b, ab.
It is the average of the two simple effects a − (1) and ab − b, giving
$A = \frac{1}{2}[ab - b + a - (1)]$

(c) Explain how to define the interaction effect AB of a 2² factorial with factors A and B. Recall that the treatment means are identified using the notation (1), a, b, ab.
It is the average difference of the two simple effects a − (1) and ab − b, giving
$AB = \frac{1}{2}[ab - b - a + (1)]$

(d) What is a 2³ factorial experiment? How many runs does a 2³ factorial with each treatment combination replicated twice have?
All combinations of 3 factors, each at 2 levels. 8 × 2 = 16 runs.

(e) What are the numbers of degrees of freedom for Treatment and Error in the 2³ factorial experiment in part (d)?
The 8 treatment combinations give 7 d.f. for Treatment, and each combination replicated twice gives 1 × 8 = 8 d.f. for Error (equivalently, Total d.f. minus Treatment d.f.: 15 − 7 = 8).

(f) Give the defining contrast for any two-factor interaction in a 2³ factorial experiment.
AB ≡ (a − 1)(b − 1)(c + 1) = abc + ab − ac − bc − a − b + c + (1)

(g) Suppose the three-factor interaction (3-fi) contrast and the treatment totals for the experiment in part (d) are:

Catalyst   (1)   a     b     ab    c     ac    bc    abc
Contrast   −     +     +     −     +     −     −     +
Means      6.4   5.6   6.8   3.4   3.3   7.2   4.6   5.0

Calculate the 3-fi effect.
$\widehat{ABC} = (-6.4 + 5.6 + 6.8 - 3.4 + 3.3 - 7.2 - 4.5 + 5.0)/16 = -0.8/16 = -0.05$
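
The arithmetic in several of the worked answers can be checked with a short script. The following is a minimal Python sketch, not part of the original exam. The function names (`lsd`, `hsd`, `differing_pairs`, `contrast_signs`) are hypothetical helpers introduced here for illustration. The critical values 2.179 and 3.08179 are copied from the answers rather than computed, and the sign rule assumes the usual convention that a factor contributes +1 when its letter appears in the treatment label and −1 otherwise.

```python
import math
from itertools import combinations


def lsd(t_crit, mse, n):
    """Least significant difference: t * sqrt(MSE * 2 / n)."""
    return t_crit * math.sqrt(mse * 2 / n)


def hsd(q, mse, n):
    """HSD in the form used in 2(h): Q * sqrt(2 * MSE / n)."""
    return q * math.sqrt(2 * mse / n)


def differing_pairs(means, threshold):
    """Pairs of labels whose absolute mean difference exceeds threshold."""
    return [(a, b) for a, b in combinations(sorted(means), 2)
            if abs(means[a] - means[b]) > threshold]


def contrast_signs(effect, treatments):
    """Contrast sign of each treatment combination for a factorial effect.

    Assumed convention: each factor in the effect contributes +1 if its
    lowercase letter appears in the treatment label, otherwise -1.
    """
    signs = {}
    for trt in treatments:
        sign = 1
        for factor in effect.lower():
            sign *= 1 if factor in trt else -1
        signs[trt] = sign
    return signs


# Question 1(k)-(l): values taken from the Latin square ANOVA table.
catalyst_means = {"A": 8.4, "B": 5.6, "C": 8.8, "D": 3.4, "E": 3.2}
lsd_value = lsd(t_crit=2.179, mse=3.1267, n=5)
pairs = differing_pairs(catalyst_means, lsd_value)

# Question 2(h): Q and MS(Error) taken from the stepping experiment.
hsd_value = hsd(q=3.08179, mse=121.03, n=6)

# Question 3(f)-(g): signs for AB and ABC in a 2^3 factorial.
treatments = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
ab_signs = contrast_signs("AB", treatments)
abc_signs = contrast_signs("ABC", treatments)

print(round(lsd_value, 2))   # 2.44
print(pairs)
print(round(hsd_value, 2))   # 19.57
print(ab_signs)
print(abc_signs)
```

The list `pairs` should reproduce the six pairs given in 1(l), and `ab_signs` should agree with the expansion of (a − 1)(b − 1)(c + 1) in 3(f). Hard-coding the critical values keeps the sketch free of third-party dependencies; a statistics library could supply them instead.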