STATISTICS 402 - Assignment 8 Solution

1. In order to increase the yield strength of steel, steel bars are heated to a critical temperature in an oven for a specified amount of time, quenched in water and then cooled in the air. The strength of the bar is then measured by subjecting it to testing that destroys the bar. An experiment is performed to determine the effect of different factors on the strength of steel. One factor is the temperature of the oven, which has two levels: 1500 and 1600 degrees C. The second is the time the steel bar is heated at that temperature, which has three levels: 10 minutes, 20 minutes and 30 minutes.

a) What are the response, conditions and experimental material?
The response is the yield strength. The conditions are 2 levels of oven temperature and 3 levels of time. The experimental material consists of steel bars.

b) How many treatment combinations are there?
There are 6 treatment combinations.

c) If we have 4 replications of each treatment combination in a completely randomized design, what is the size of the difference in temperature level means that can be detected with Alpha = 0.05 and Beta = 0.05?
There will be 12 observations for each of the 2 temperatures. With Alpha = 0.05 and Beta = 0.05 the difference in temperature level means that can be detected is 1.6 standard deviations.

d) If we have 4 replications of each treatment combination in a completely randomized design, what is the size of the difference in heating time means that can be detected with Alpha = 0.05 and Beta = 0.05?
There will be 8 observations for each of the 3 times. With Alpha = 0.05 and Beta = 0.05 the difference in time level means that can be detected is between 2.0 and 2.5 standard deviations.

e) For a completely randomized design the treatments will be assigned at random to the bars. How long will it take to run the experiment as a completely randomized design?
There are 8 steel bars heated for each of the times (10 minutes, 20 minutes and 30 minutes), so the total is 80 minutes plus 160 minutes plus 240 minutes, which is 480 minutes or 8 hours. This is the minimum amount of time, because you also have to allow time for the oven to heat up and cool down when the temperature is changed.

f) If you are going to do a randomized complete block design, why can’t you form blocks by reusing bars?
Once a bar is heated the strength is measured, and measuring the strength destroys the bar, so it cannot be used again.

g) You can use a randomized complete block design by sorting the steel bars according to their initial strength (measured non-destructively). Give a partial analysis of variance table for a randomized complete block design using four blocks.

Source              df
Temperature          1
Time                 2
Temperature*Time     2
Block                3
Error               15
C. Total            23

h) Using a randomized complete block design will not reduce the time it takes to run the experiment because one steel bar at a time is heated at the assigned temperature for the assigned time. To reduce the time it takes to run the experiment, three steel bars are heated in the oven at a randomly assigned temperature. One bar, chosen at random, is removed after 10 minutes; a second bar, chosen at random, is removed after 20 minutes; and the third bar is removed after 30 minutes. Explain why this is a split plot (repeated measures) design. In your explanation, you must answer the following questions. Assume you have 24 steel bars to use in the experiment.
This is a split plot/repeated measures design because there are two designs in one experiment. There is a completely randomized design that is used to assign the temperature level to groups of 3 steel bars (sorted on initial strength to be more uniform). The times in the oven are randomly assigned to steel bars within each group of three (the blocks). This is a randomized complete block design.

i. What are the “whole plots”?
A whole plot is a group of 3 steel bars (sorted on initial strength to be more uniform).

ii. What is the “whole plot” factor?
The whole plot factor is temperature.

iii. How will random assignment be used for the whole plot factor? Be specific.
First sort the steel bars into groups of three based on initial strength (measured non-destructively) so within each group the bars are more uniform. There will be 8 groups. Assign 1500 degrees C at random to 4 of the groups and 1600 degrees C to the other 4 groups. This can be done by having 8 cards, 4 with 1500 and 4 with 1600 on them. Shuffle the cards and draw cards without replacement. The temperature on the card will be assigned to the group of steel bars.

iv. What are the “sub plots”?
Subplots are individual bars within a group of three sorted to be more uniform.

v. What is the “sub plot” factor?
The subplot factor is time in the oven.

vi. How will random assignment be used for the sub plot factor? Be specific.
For each group of three steel bars, number them with a unique integer 1, 2, or 3. Have three cards numbered 1, 2, and 3. Shuffle the cards. The first card drawn (without replacement) is the number of the bar taken out after 10 minutes. The second card drawn (without replacement) is the number of the bar taken out after 20 minutes. The remaining bar is taken out after 30 minutes. This procedure is repeated for each set of three steel bars.

i) Construct a partial ANOVA table indicating sources of variation and degrees of freedom. Also indicate how to construct the appropriate F tests for determining the statistical significance of the model effects.

Source                  df   F ratio
Temperature              1   MSTemperature/MSGroups[Temperature]
Groups[Temperature]      6
Time                     2   MSTime/MSError
Temperature*Time         2   MSTemperature*Time/MSError
Error                   12
C. Total                23

2. A psychology experiment is conducted on the effects of anxiety and muscular tension on four different types of memory.
There are 4 treatments (A, B, C, and D) with different degrees of anxiety and muscular tension. Twelve students are chosen at random from all students at a large university who have given their consent to participate in psychology experiments. Treatments are assigned at random to the students. Three students are assigned A, three students are assigned B, three students are assigned C and three students are assigned D. Each student performs four types of memory trials in random order. The random order is different for each student. For each memory trial the number of memory errors is recorded for each student.

a) What is the response?
The response is the number of memory errors.

b) What are the experimental units?
The experimental units are students at a large university who have given their consent to participate in psychology experiments.

c) What is the whole plot?
The whole plot is the student.

d) What is the whole plot, between subjects, factor?
The whole plot factor is the treatment (different degrees of anxiety and muscular tension).

e) What is the sub plot?
The sub plot is also the student (reusing the student for each of the memory trials).

f) What is the sub plot, within subjects, factor?
The sub plot factor is the different memory trials.

g) A JMP data table is posted on the course web site. Use JMP to analyze these data keeping in mind that this is a split plot (repeated measures) design. Be sure to include plots of main effects and an interaction plot. Turn in the computer output with your assignment.

Response: Errors

Summary of Fit
RSquare                        0.95981
RSquare Adj                    0.921294
Root Mean Square Error         1.452966
Mean of Response               10.33333
Observations (or Sum Wgts)     48

Analysis of Variance
Source      DF   Sum of Squares   Mean Square   F Ratio
Model       23   1210.0000        52.6087       24.9199
Error       24   50.6667          2.1111        Prob > F
C. Total    47   1260.6667                      <.0001*

Test Denominator Synthesis
Source                       MS Den    DF Den   Denom MS Synthesis
Treatment                    7.83333    8       Student[Treatment]&Random
Memory Trial                 2.11111   24       Residual
Treatment*Memory Trial       2.11111   24       Residual
Student[Treatment]&Random    2.11111   24       Residual

Tests wrt Random Effects
Source                       SS        MS Num    DF Num   F Ratio    Prob > F
Treatment                    164       54.6667   3        6.9787     0.0127*
Memory Trial                 950.167   316.722   3        150.0263   <.0001*
Treatment*Memory Trial       33.1667   3.68519   9        1.7456     0.1328
Student[Treatment]&Random    62.6667   7.83333   8        3.7105     0.0059*

Effect Details

Treatment
Effect Test
Sum of Squares   F Ratio   DF   Prob > F
164.00000        6.9787    3    0.0127*
Denominator MS Synthesis: Student[Treatment]&Random

Least Squares Means Table
Level   Least Sq Mean   Std Error    Mean
A       11.833333       0.80794664   11.8333
B       8.500000        0.80794664   8.5000
C       8.500000        0.80794664   8.5000
D       12.500000       0.80794664   12.5000

LSMeans Differences Tukey HSD
α=0.050 Q=3.20234
HSD = 3.20234(1.14261) = 3.659

LSMean[i] By LSMean[j] (rows are i, columns are j)
                         A          B          C          D
A   Mean[i]-Mean[j]      0          3.33333    3.33333    -0.6667
    Std Err Dif          0          1.14261    1.14261    1.14261
    Lower CL Dif         0          -0.3257    -0.3257    -4.3257
    Upper CL Dif         0          6.99236    6.99236    2.99236
B   Mean[i]-Mean[j]      -3.3333    0          1.8e-15    -4
    Std Err Dif          1.14261    0          1.14261    1.14261
    Lower CL Dif         -6.9924    0          -3.659     -7.659
    Upper CL Dif         0.32569    0          3.65903    -0.341
C   Mean[i]-Mean[j]      -3.3333    -2e-15     0          -4
    Std Err Dif          1.14261    1.14261    0          1.14261
    Lower CL Dif         -6.9924    -3.659     0          -7.659
    Upper CL Dif         0.32569    3.65903    0          -0.341
D   Mean[i]-Mean[j]      0.66667    4          4          0
    Std Err Dif          1.14261    1.14261    1.14261    0
    Lower CL Dif         -2.9924    0.34097    0.34097    0
    Upper CL Dif         4.32569    7.65903    7.65903    0

Level   Letters   Least Sq Mean
D       A         12.500000
A       A B       11.833333
B         B       8.500000
C         B       8.500000
Levels not connected by same letter are significantly different.
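
As a rough check on the Tests wrt Random Effects table, the F ratios can be recomputed from the reported sums of squares and degrees of freedom. The sketch below is not part of the JMP output; the variable names are illustrative assumptions, and the choice of denominators follows the Test Denominator Synthesis table (Treatment is tested against Student[Treatment], everything else against the residual).

```python
# Sums of squares and degrees of freedom copied from the JMP output above.
# The residual line is the "Error" row of the Analysis of Variance table.
effects = {
    "Treatment": (164.0, 3),
    "Memory Trial": (950.167, 3),
    "Treatment*Memory Trial": (33.1667, 9),
    "Student[Treatment]": (62.6667, 8),
}
residual_ss, residual_df = 50.6667, 24

# Mean square = sum of squares / degrees of freedom.
ms = {name: ss / df for name, (ss, df) in effects.items()}
ms_residual = residual_ss / residual_df

# Denominators as listed in the Test Denominator Synthesis table:
# the whole plot factor uses the whole plot error, the rest use the residual.
denominators = {
    "Treatment": ms["Student[Treatment]"],
    "Memory Trial": ms_residual,
    "Treatment*Memory Trial": ms_residual,
    "Student[Treatment]": ms_residual,
}

f_ratios = {name: ms[name] / denominators[name] for name in effects}

# For comparison only: the ratio obtained if Treatment were (incorrectly)
# tested against the residual, ignoring the split plot structure.
naive_treatment_f = ms["Treatment"] / ms_residual

for name, f in f_ratios.items():
    print(f"{name:24s} F = {f:.4f}")
print(f"Treatment vs residual (not appropriate here): F = {naive_treatment_f:.4f}")
```

The recomputed ratios should agree with the JMP column to within rounding of the printed sums of squares. The last line shows how much larger the Treatment F ratio would look if the residual were used as its denominator.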
Memory Trial
Effect Test
Sum of Squares   F Ratio    DF   Prob > F
950.16667        150.0263   3    <.0001*
Denominator MS Synthesis: Residual

Least Squares Means Table
Level    Least Sq Mean   Std Error    Mean
Type 1   16.666667       0.41943525   16.6667
Type 2   11.750000       0.41943525   11.7500
Type 3   8.333333        0.41943525   8.3333
Type 4   4.583333        0.41943525   4.5833

LSMeans Differences Tukey HSD
α=0.050 Q=2.75861
HSD = 2.75861(0.59317) = 1.636

LSMean[i] By LSMean[j] (rows are i, columns are j)
                              Type 1     Type 2     Type 3     Type 4
Type 1   Mean[i]-Mean[j]      0          4.91667    8.33333    12.0833
         Std Err Dif          0          0.59317    0.59317    0.59317
         Lower CL Dif         0          3.28034    6.69701    10.447
         Upper CL Dif         0          6.55299    9.96966    13.7197
Type 2   Mean[i]-Mean[j]      -4.9167    0          3.41667    7.16667
         Std Err Dif          0.59317    0          0.59317    0.59317
         Lower CL Dif         -6.553     0          1.78034    5.53034
         Upper CL Dif         -3.2803    0          5.05299    8.80299
Type 3   Mean[i]-Mean[j]      -8.3333    -3.4167    0          3.75
         Std Err Dif          0.59317    0.59317    0          0.59317
         Lower CL Dif         -9.9697    -5.053     0          2.11367
         Upper CL Dif         -6.697     -1.7803    0          5.38633
Type 4   Mean[i]-Mean[j]      -12.083    -7.1667    -3.75      0
         Std Err Dif          0.59317    0.59317    0.59317    0
         Lower CL Dif         -13.72     -8.803     -5.3863    0
         Upper CL Dif         -10.447    -5.5303    -2.1137    0

Level    Letters   Least Sq Mean
Type 1   A         16.666667
Type 2   B         11.750000
Type 3   C         8.333333
Type 4   D         4.583333
Levels not connected by same letter are significantly different.

Treatment*Memory Trial
Effect Test
Sum of Squares   F Ratio   DF   Prob > F
33.166667        1.7456    9    0.1328
Denominator MS Synthesis: Residual

h) Are there some treatment effects that are different from zero? Report the appropriate F- and P-values to support your answer.
Yes. F = 6.9787, P-value = 0.0127. The P-value is less than 0.05, so there are some treatment effects different from zero.

i) Compute the HSD for comparing treatment means. Use the HSD to see which treatment means are statistically different from other treatment means.
HSD = 3.20234(1.14261) = 3.659

Level   Letters   Least Sq Mean
D       A         12.500000
A       A B       11.833333
B         B       8.500000
C         B       8.500000
Levels not connected by same letter are significantly different.

Therefore, treatment D is statistically different from treatments B and C. There are no statistically significant differences between treatments A and D, or among treatments A, B and C.
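
The HSD comparison for treatments can be sketched in a few lines. This is only an illustration under stated assumptions: the value of Q is taken from the JMP output rather than computed, each treatment mean is assumed to average 12 observations (3 students by 4 memory trials), and the standard error of a difference is taken as sqrt(2 * MS / n) with the Student[Treatment] mean square as the error term. The variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Values copied from the JMP output for the Treatment effect.
ms_whole_plot = 7.83333   # mean square for Student[Treatment]
n_per_mean = 12           # assumed: 3 students x 4 memory trials per treatment
q = 3.20234               # Q reported by JMP for Tukey HSD at alpha = 0.05

lsmeans = {"A": 11.833333, "B": 8.5, "C": 8.5, "D": 12.5}

# Standard error of a difference between two treatment means, then the HSD.
se_diff = sqrt(2 * ms_whole_plot / n_per_mean)
hsd = q * se_diff
print(f"Std Err Dif = {se_diff:.5f}, HSD = {hsd:.3f}")

# A pair of means is declared different when its absolute difference exceeds the HSD.
significant = {}
for a, b in combinations(lsmeans, 2):
    diff = abs(lsmeans[a] - lsmeans[b])
    significant[(a, b)] = diff > hsd
    verdict = "different" if significant[(a, b)] else "not different"
    print(f"{a} vs {b}: |difference| = {diff:.3f} -> {verdict}")
```

Only the pairs involving D with B and D with C exceed the HSD, which matches the connecting letters reported above. The same steps apply to the memory trial means with Q = 2.75861 and the residual mean square in place of the whole plot error.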
j) Are there some memory trial effects that are different from zero? Report the appropriate F- and P-values to support your answer.
Yes. F = 150.0263, P-value < 0.0001. The P-value is less than 0.05, so there are some memory trial effects different from zero.

k) Compute the HSD for comparing memory trial means. Use the HSD to see which memory trial means are statistically different from other memory trial means.
HSD = 2.75861(0.59317) = 1.636

Level    Letters   Least Sq Mean
Type 1   A         16.666667
Type 2   B         11.750000
Type 3   C         8.333333
Type 4   D         4.583333
Levels not connected by same letter are significantly different.

Therefore all types of memory trials have a mean number of memory errors that is statistically different from those of all other types.

l) Is there a statistically significant interaction between treatment and memory trial? Report the appropriate F- and P-values to support your answer.
No. F = 1.7456, P-value = 0.1328. The P-value is greater than 0.05; therefore there is not a statistically significant interaction between the treatments and the types of memory trials.

m) Comment on the residuals given on the next page. Remember that there are two sets of residuals; one set for the whole plot and one set for the subplot. Tell me what you see in the various plots (residuals vs. factor levels, Normal quantile plot, box plot and histogram) and indicate what this tells you about the Fisher conditions of equal standard deviations and normally distributed errors.

Residuals for evaluating treatments: The residuals for evaluating the treatments have variation that is quite similar for each of the treatments. The ratio of the largest standard deviation to the smallest is less than 2. The equal standard deviation condition is most likely met for evaluating differences among treatments. The distribution of residuals is slightly skewed right. There are not many residuals (only 8) so it is hard to tell, but the condition that errors be normally distributed may be in doubt.

Residuals for evaluating memory trials: The standard deviation for Type 1 is much larger than that for Type 2 (2.55 times as big). The equal standard deviation condition may not be met. The distribution of residuals is skewed to the right, indicating that the condition that errors be normally distributed may not be met.

Memory Errors for Different Treatments and Memory Trials

Whole Plot Residuals
Level   Number   Mean   Std Dev
A       3        0      1.44338
B       3        0      1.50000
C       3        0      0.90139
D       3        0      1.63936

Subplot Residuals
Level    Number   Mean   Std Dev
Type 1   12       0      1.54233
Type 2   12       0      0.60302
Type 3   12       0      0.99240
Type 4   12       0      0.93744
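
The equal standard deviation check used in part m) compares the largest and smallest standard deviations of the residuals within each set. The sketch below applies that comparison to both tables; the function name and the threshold constant are illustrative assumptions, with the factor of 2 taken from the comment on the whole plot residuals.

```python
# Standard deviations of residuals copied from the two tables above.
whole_plot_sd = {"A": 1.44338, "B": 1.50000, "C": 0.90139, "D": 1.63936}
subplot_sd = {"Type 1": 1.54233, "Type 2": 0.60302,
              "Type 3": 0.99240, "Type 4": 0.93744}

RATIO_THRESHOLD = 2.0  # rule of thumb referred to in the comments on the residuals


def sd_ratio(sds):
    """Return the ratio of the largest to the smallest standard deviation."""
    return max(sds.values()) / min(sds.values())


whole_plot_ratio = sd_ratio(whole_plot_sd)
subplot_ratio = sd_ratio(subplot_sd)

for label, ratio in [("Whole plot", whole_plot_ratio), ("Subplot", subplot_ratio)]:
    status = "within" if ratio < RATIO_THRESHOLD else "exceeds"
    print(f"{label} residuals: largest/smallest SD = {ratio:.2f} ({status} the threshold)")
```

The whole plot ratio stays below 2 while the subplot ratio (Type 1 against Type 2) does not, which is the basis for the differing conclusions about the equal standard deviation condition in the two sets of residuals.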