Answer key to homework due 3/24/05
Chap 12: 4, 12, 13
Chap 11: 34, 69
Points: 34 (20 pts); 69 (30 pts); 12.4 (15 pts); 12 (15 pts); 13 (20 pts)

11.34

Two-way ANOVA: Density versus Strength, Time

Source       DF       SS       MS      F      P
Strength      1  52.5625  52.5625  23.58  0.000
Time          1   1.5625   1.5625   0.70  0.419
Interaction   1   3.0625   3.0625   1.37  0.264
Error        12  26.7500   2.2292
Total        15  83.9375

S = 1.493   R-Sq = 68.13%   R-Sq(adj) = 60.16%

Individual 95% CIs for mean, based on pooled StDev (interval plots not reproduced):

Strength   Mean
1         2.625
2         6.250

Time   Mean
10    4.125
14    4.750

1. Test interaction: Ho: There is no interaction effect; H1: Time and strength interact. Since F = 1.37 and p = .264, we do NOT reject Ho and conclude that there is NO significant interaction.

2. Effect of developer strength: Ho: Means for different developer strengths are equal, u1 = u2; H1: means are not equal, there is a relationship. The F-value for this test is 23.58, with p = .000; therefore, we REJECT Ho. The means are NOT equal; there is a relationship.

3. Effect of development time: Ho: development times have the same means, u1 = u2; H1: the means are different and there is a relationship. Since F = 0.70 and p-value = .419, we do NOT reject Ho. There is NO significant relationship; we cannot say the means differ.

4. Draw a graph: They might have a main effects graph for either or both variables, or they may have an interaction graph. Any of these receives credit.

[Figures: Interaction Plot (data means) for Density, Strength (1, 2) by Time (10, 14); Main Effects Plot (data means) for Density, by Strength (1, 2) and Time (10, 14)]

e) We conclude that only developer strength matters.
Development time and the interaction do not significantly affect density. The model explains 68% (R-Sq) of the variation. It appears that the mean density for strength #2 is significantly higher than for strength #1.

11.69: Using file Access.mtw

a) Test for equal variances using Levene’s test:

Test for Equal Variances: ReadTime versus FileSize

95% Bonferroni confidence intervals for standard deviations

FileSize  N     Lower     StDev     Upper
1         8  0.100801  0.165869  0.406234
2         8  0.130136  0.214139  0.524452
3         8  0.178452  0.293644  0.719171

Bartlett's Test (normal distribution)
Test statistic = 2.14, p-value = 0.343

Levene's Test (any continuous distribution)
Test statistic = 4.58, p-value = 0.022

Levene’s test has Ho: groups have equal variances; H1: groups have different variances. Since p-value = .022, we REJECT Ho; the groups have different variances. This means that TECHNICALLY we should NOT use ANOVA; however, we will proceed nevertheless.

b) Do a one-way ANOVA for read times with file size as the factor.

One-way ANOVA: ReadTime versus FileSize

Source    DF      SS      MS     F      P
FileSize   2  0.2763  0.1381  2.60  0.098
Error     21  1.1172  0.0532
Total     23  1.3935

S = 0.2306   R-Sq = 19.83%   R-Sq(adj) = 12.19%

Level  N    Mean   StDev
1      8  2.2438  0.1659
2      8  2.3863  0.2141
3      8  2.5063  0.2936

Pooled StDev = 0.2306

(Individual 95% CIs for mean, based on pooled StDev: interval plot not reproduced.)

We test Ho: means are NOT different; H1: means are different. Since p-value = .098, we do not reject Ho and conclude there is no significant difference.

c) Do a Tukey’s test. NOTE: Make this 2 points extra credit, since some may read the problem as only do this if we rejected in part a.
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of FileSize

Individual confidence level = 98.00%

FileSize = 1 subtracted from:

FileSize    Lower  Center   Upper
2         -0.1478  0.1425  0.4328
3         -0.0278  0.2625  0.5528

FileSize = 2 subtracted from:

FileSize    Lower  Center   Upper
3         -0.1703  0.1200  0.4103

(Interval plots not reproduced.)

For all comparisons, zero is in the interval, indicating that there are no significant differences in means.

d) We conclude that read time is not affected by file size.

Part II: doing a two-way ANOVA using file size and buffer size as factors and read time as the response variable.

Two-way ANOVA: ReadTime versus FileSize, Buffer

Source       DF       SS        MS       F      P
FileSize      2  0.27630  0.138150   19.55  0.000
Buffer        1  0.93220  0.932204  131.89  0.000
Interaction   2  0.05773  0.028867    4.08  0.034
Error        18  0.12723  0.007068
Total        23  1.39346

S = 0.08407   R-Sq = 90.87%   R-Sq(adj) = 88.33%

5. Test for interaction: Ho: there is NO interaction; H1: there is an interaction. For the interaction, F = 4.08, p-value = .034; we REJECT Ho and conclude that there is an interaction.

6. Test for buffer size main effect. Ho: the means for all buffer sizes are equal; H1: means are not all equal. Since F = 131.89, p-value = .000, we REJECT Ho and conclude that the means are not all equal; buffer size DOES affect read time.

7. Test for file size main effect. Ho: means for all file sizes are equal; H1: means are not all equal. Since F = 19.55, p-value = .000, we REJECT Ho and conclude that the means are not all equal; file size DOES affect read time.

8.
They should graph the main effects AND the interaction plots:

[Figures: Main Effects Plot (data means) for ReadTime, by FileSize (1, 2, 3) and Buffer (1, 2); Interaction Plot (data means) for ReadTime, FileSize (1, 2, 3) by Buffer (1, 2)]

From these graphs it appears that buffer size 2 has a higher mean read time, file size 1 has a much lower mean read time, and file size 3 has a much higher one. The interaction plot seems to show that there is relatively little difference between the means for different file sizes with buffer #1, but the differences between the means are much larger for buffer #2. (Basically, they should say something about the shape, but the actual wording can be fairly loose.)

9. To have the fastest read time (i.e., lowest mean) they should use buffer size 1 and file size 1. However, the most important choice is buffer size 1.

10. In parts a-d, with the one-way ANOVA using file size, we found NO effect. In parts e-i, with a two-way ANOVA, we found significant effects for all factors, including file size. It appears that buffer size is the most important factor. Once we remove the variation caused by buffer size from the unexplained error term, the amount of variation explained by file size becomes significant, where before the unexplained variation was so large that file size was insignificant. We also find that there is an interaction effect, which causes larger variations due to file size when buffer #2 is used.

Chapter 12: 4, 12, 13 (note: on #13 they don’t have to do part d; give 4 extra points if they do it).

4. Test of two proportions using Z. They may use Minitab, by hand, or Excel.

Test and CI for Two Proportions

Sample     X     N  Sample p
1        283  1090  0.259633
2       1053  2065  0.509927

Difference = p (1) - p (2)
Estimate for difference: -0.250294
95% CI for difference: (-0.284093, -0.216496)
Test for difference = 0 (vs not = 0): Z = -13.53   P-Value = 0.000

a) Test if difference: Ho: p1 - p2 = 0; H1: p1 - p2 ≠ 0. Since p-value = .000, we reject Ho and conclude that the proportions are different. Sample 2 (ages 8-18) appears to use it more often.

b) P-value = .0000. This means that there is virtually zero risk of a type one error by rejecting the null hypothesis.

c) 95% confidence interval for difference: (-.28 to -.22)

12. Do the same as #4 but using a chi-square test. They may do it by hand, by Excel, or by Minitab. It should have a contingency table as shown:

        yes    no
2-7     283   807
8-18   1053  1012

Using Minitab, Stat/Tables/Chi-Square Test (table in worksheet):

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

            yes        no    Total
1           283       807     1090
         461.57    628.43
         69.082    50.738

2          1053      1012     2065
         874.43   1190.57
         36.464    26.782

Total      1336      1819     3155

Chi-Sq = 183.066, DF = 1, P-Value = 0.000

a) Ho: proportions are equal; H1: proportions are not equal. Since p-value = .000, we reject Ho and conclude that the proportions are not equal.

b) P-value = .000

c) We get similar answers on both #4 and #12 although we used different methods. (Note: in the chi-square table, we can see that group 1 (2-7 year olds) was more likely to say no (observed > expected), while the older group 2 was more likely to say yes.)

13. Test of equal proportions using chi-square.

Chi-Square Test: no particles, particles

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

       no particles   particles   Total
1               320          14     334
             296.89       37.11
              1.799      14.393

2                80          36     116
             103.11       12.89
              5.180      41.441

Total           400          50     450

Chi-Sq = 62.812, DF = 1, P-Value = 0.000

a) Ho: no relationship, proportions are the same; H1: there is a relationship, proportions are different. Since chi-square = 62.812 and p-value = .000, we reject Ho and conclude that there is a relationship.

b) P-value = .000

c) It appears that chips with no particles are more likely to be in group 1 (good), while chips with particles are more likely to be in group 2 (bad).
We see this by comparing observed and expected counts. For particles, we see that fewer than expected (14 vs. 37.11) are in good and more than expected (36 vs. 12.89) are in bad.

d) EXTRA CREDIT: The z-test used in #5 and the chi-square results here give the same result: there is a difference. To get extra credit, they must show the results of a z-test as follows:

Test and CI for Two Proportions

Sample    X    N  Sample p
1       320  400  0.800000
2        14   50  0.280000

Difference = p (1) - p (2)
Estimate for difference: 0.52
95% CI for difference: (0.389519, 0.650481)
Test for difference = 0 (vs not = 0): Z = 7.93   P-Value = 0.000
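
For graders who want to check a hand or Excel answer against the Minitab output, the sketch below recomputes the Z and chi-square statistics for #4, #12 and #13 from the counts in the tables above. It is only a sketch, not the Minitab procedure itself: it assumes the Z statistic uses the pooled estimate of p (which appears consistent with the Z values shown) and that the chi-square has no continuity correction. The function names are made up for illustration. For a 2x2 table the square of the pooled Z equals the chi-square statistic, which is why the two methods agree.

```python
import math


def pooled_two_proportion_z(x1, n1, x2, n2):
    """Z statistic for Ho: p1 = p2, using the pooled estimate of p (assumed)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_stat(table):
    """Pearson chi-square statistic for a table of counts, no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Problems 4 and 12: rows are age groups (2-7, 8-18), columns are yes/no.
z4 = pooled_two_proportion_z(283, 1090, 1053, 2065)
chi12 = chi_square_stat([[283, 807], [1053, 1012]])

# Problem 13: rows are groups 1 (good) and 2 (bad), columns are no particles/particles.
z13 = pooled_two_proportion_z(320, 400, 14, 50)
chi13 = chi_square_stat([[320, 14], [80, 36]])

print(f"#4  Z = {z4:.2f}, Z^2 = {z4 ** 2:.3f}; #12 Chi-Sq = {chi12:.3f}")
print(f"#13 Z = {z13:.2f}, Z^2 = {z13 ** 2:.3f}; Chi-Sq = {chi13:.3f}")
```

The printed Z and Chi-Sq values can be compared directly with the Minitab output above; small differences in the last digit are rounding.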