252solngr4-081 4/13/08 (Open this document in 'Page Layout' view!)

Name:
Class days and time:
Please include this on what you hand in!

Graded Assignment 4

The data set GOLFBALL is in problem 11.14 of the text (9th or 10th edition) or on the CD. You must answer at least the following questions for versions A, B and C of the problem. Only neat and legible papers with written answers in complete sentences will be read!

a) At a 5% level, is there evidence of a difference in the average distance traveled by the golf balls with different designs? Why?
c) What assumptions are necessary in (a)?
e) What golf ball design should be chosen?

Do this problem in Excel as follows. Use columns A, B, C, D, E and F on the Excel spreadsheet for data. In the first row of columns B, C, D and F put in Des 1, Des 2, Des 3 and Des 4. Label column E with Des 4a and column A with 'golfer.' Starting in cell A2, put in the letters A through J to identify the golfers (unless, of course, you want to suggest some names). Now put in the data in columns B, C, D and F, skipping column E.

Version A
To fill column E, in cell E2 write =F2. After you hit 'enter' this cell should read '213.9'. Use the fill handle on cell E2 to make column E identical to column F except for the heading. Do not go on unless this is true. Save your data as gdataA.xls. Use the 'tools' pull-down menu and pick 'data analysis'. (If you cannot find this, use Tools and Add-Ins to put in the analysis packs.) Pick 'ANOVA: Single Factor'. Set input range to $B$1:$E$11. Select 'New worksheet ply' and 'columns', check 'labels in first row', hit 'OK' and save your results as gresltA.xls.

Version B
In order to check for the effect of the fact that the data is blocked by golfers, repeat the analysis using 'ANOVA: Two-Factor without replication'. Set input range to $A$1:$E$11, check 'labels,' and save your results as gresltB.xls.

Version C
Take the last digit of your student number (if it's zero, use 10) and add 5 to it.
Call this x and make sure that I know its value. Go back to your original data or use the 'file' pull-down menu to open gresltA.xls. To fill column E this time, in cell E2 write =F2+x. Now highlight cell E2 and use the fill handle to make column E equal to column F plus x. Do not go on unless this is true. Save your data as gdataC.xls. Run the one-way ANOVA again and save your results as gresltC.xls.

Submit the data and results with your student number. The most effective way to do this is to paste the results into a Word document and then add neat hand or typed notes. Indicate what hypotheses were tested, what the p-value was and whether, using the p-value, you would reject the null if (i) the significance level was 5% and (ii) the significance level was 10%, explaining why. You will have two answers for each of your two problems. For your version C ANOVA do a Scheffé confidence interval and a Tukey-Kramer interval or procedure for each of the $C^4_2 = 6$ possible differences between means and report which are different at the 5% level according to each of the 2 methods. Answer as much as you can of the questions in Problem 11.14. The extra credit below may be needed for a really complete answer.

Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third example in 252mvarex as a pattern for your calculations. Make sure that you explain what is being tested and what you conclude. Hand in separately; this will be treated as extra credit on your next take-home exam. See below for all of this.

Extra Extra Credit: Do Bartlett and Levene tests using the example in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test. See below for all of this.

252grass4-051 4/13/05 (Open this document in 'Page Layout' view!)
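The one-way ANOVA that versions A and C ask Excel to run can also be checked by hand. The following is a minimal Python sketch, not part of the original assignment: the function name `one_way_anova` and the helper `shifted` are illustrative, the data are typed in from the gdataA table of this solution, and x = 10 is the shift this solution uses for version C. It only computes the F statistic; the p-value would still need an F table or a statistics library.

```python
# One-way ANOVA F statistic from scratch (standard library only).
# Data: distances by design, taken from the gdataA table in this solution.
designs = {
    "Des 1": [206.32, 223.85, 207.94, 224.79, 206.19, 229.75, 204.45, 228.51, 209.65, 221.44],
    "Des 2": [203.81, 223.85, 206.75, 223.97, 205.68, 234.30, 204.49, 219.50, 210.86, 233.00],
    "Des 3": [217.08, 230.55, 221.43, 227.95, 218.04, 213.84, 224.13, 224.87, 211.82, 229.49],
    "Des 4": [213.90, 231.10, 221.28, 221.53, 229.43, 235.45, 213.54, 228.35, 214.51, 225.09],
}


def one_way_anova(groups):
    """Return (SSB, SSW, df_between, df_within, F) for a list of samples."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared distance from the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared distance of each value from its own group mean.
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, df_between, df_within, f_stat


def shifted(values, x):
    """Version C: add the constant x to every value of a column."""
    return [v + x for v in values]


version_a = list(designs.values())                            # Des 4a is a copy of Des 4
version_c = version_a[:3] + [shifted(designs["Des 4"], 10)]   # x = 10 in this solution

ssb_a, ssw_a, dfb, dfw, f_a = one_way_anova(version_a)
ssb_c, ssw_c, _, _, f_c = one_way_anova(version_c)
print(f"Version A: SSB={ssb_a:.4f} SSW={ssw_a:.2f} F={f_a:.4f} (df {dfb}, {dfw})")
print(f"Version C: SSB={ssb_c:.4f} SSW={ssw_c:.2f} F={f_c:.4f} (df {dfb}, {dfw})")
```

Adding a constant to one column moves its mean but not its variance, so the within-groups sum of squares is the same in both versions and only the between-groups line changes.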
Data for 1st and 2nd ANOVA

gdataA
Golfer   Des 1    Des 2    Des 3    Des 4a   Des 4
1        206.32   203.81   217.08   213.9    213.9
2        223.85   223.85   230.55   231.1    231.1
3        207.94   206.75   221.43   221.28   221.28
4        224.79   223.97   227.95   221.53   221.53
5        206.19   205.68   218.04   229.43   229.43
6        229.75   234.3    213.84   235.45   235.45
7        204.45   204.49   224.13   213.54   213.54
8        228.51   219.5    224.87   228.35   228.35
9        209.65   210.86   211.82   214.51   214.51
10       221.44   233      229.49   225.09   225.09

Results for 1st ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

gresltA
Anova: Single Factor

SUMMARY
Groups   Count   Sum       Average   Variance
Des 1    10      2162.89   216.289   104.6483
Des 2    10      2166.21   216.621   139.6653
Des 3    10      2219.2    221.92    43.08287
Des 4a   10      2234.18   223.418   60.3002

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        397.9091   3    132.6364   1.525886   0.224397   2.866266
Within Groups         3129.27    36   86.92417
Total                 3527.179   39

Results for 2nd ANOVA
$H_{01}$: Row (golfer) means equal
$H_{02}: \mu_1 = \mu_2 = \mu_3 = \mu_4$

gresltB
Anova: Two-Factor Without Replication

SUMMARY   Count   Sum       Average    Variance
1         4       841.11    210.2775   38.96229
2         4       909.35    227.3375   16.26729
3         4       857.4     214.35     65.66647
4         4       898.24    224.56     7.024667
5         4       859.34    214.835    127.2787
6         4       913.34    228.335    99.43723
7         4       846.61    211.6525   87.47603
8         4       901.23    225.3075   17.81042
9         4       846.84    211.71     4.272733
10        4       909.02    227.255    25.50057
Des 1     10      2162.89   216.289    104.6483
Des 2     10      2166.21   216.621    139.6653
Des 3     10      2219.2    221.92     43.08287
Des 4a    10      2234.18   223.418    60.3002

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  2058.09    9    228.6766   5.763988   0.00018    2.250131
Columns               397.9091   3    132.6364   3.343212   0.033859   2.960351
Error                 1071.18    27   39.67334
Total                 3527.179   39

Answer: In the first ANOVA we get a p-value of .224397. Since this is above any significance level we are likely to use, we do not reject the null hypothesis that the mean distance that the golf balls go is the same for all four designs.
In the second ANOVA, the p-value for columns (.033859) is much lower, so we can reject the original null hypothesis at the 5% significance level. Note that there is a very significant difference between golfers. Too bad that this version completely differs from what it said in the problem.

Results for 3rd ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

gdataC
Golfer   Des 1    Des 2    Des 3    Des 4c   Des 4
1        206.32   203.81   217.08   223.9    213.9
2        223.85   223.85   230.55   241.1    231.1
3        207.94   206.75   221.43   231.28   221.28
4        224.79   223.97   227.95   231.53   221.53
5        206.19   205.68   218.04   239.43   229.43
6        229.75   234.3    213.84   245.45   235.45
7        204.45   204.49   224.13   223.54   213.54
8        228.51   219.5    224.87   238.35   228.35
9        209.65   210.86   211.82   224.51   214.51
10       221.44   233      229.49   235.09   225.09

gresltC
Anova: Single Factor

SUMMARY
Groups   Count   Sum       Average   Variance
Des 1    10      2162.89   216.289   104.6483
Des 2    10      2166.21   216.621   139.6653
Des 3    10      2219.2    221.92    43.08287
Des 4c   10      2334.18   233.418   60.3002

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        1919.109   3    639.703    7.359323   0.000573   2.866266
Within Groups         3129.27    36   86.92417
Total                 5048.379   39

The modified problem is giving us some very real differences in the average distance the various golf ball designs go. The p-value is low enough to cause a rejection of the null hypothesis at any usual significance level.

Types of contrast between means.

Individual Confidence Interval
If we desire a single interval, we use the formula for the difference between two means when the variance is known. For example, if we want the difference between means of column 1 and column 2:
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t^{(n-m)}_{\alpha/2}\, s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where $s = \sqrt{MSW}$.

Scheffé Confidence Interval
If we desire intervals that will simultaneously be valid for a given confidence level for all possible intervals between column means, use
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm \sqrt{(m-1)F^{(m-1,\,n-m)}_{\alpha}}\; s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

Tukey Confidence Interval
This also applies to all possible differences.
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q^{(m,\,n-m)}_{\alpha} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.
This gives rise to Tukey's HSD (Honestly Significant Difference) procedure. Two sample means $\bar{x}_{.1}$ and $\bar{x}_{.2}$ are significantly different if $|\bar{x}_{.1} - \bar{x}_{.2}|$ is greater than $q^{(m,\,n-m)}_{\alpha} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

Contrasts
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$, $m = 4$ and $n - m = 36$. Since $n_1 = n_2 = n_3 = n_4 = 10$ and $s^2 = MSW = 86.9242$, $s\sqrt{\frac{1}{10} + \frac{1}{10}} = \sqrt{86.9242(0.2)} = \sqrt{17.3848} = 4.1695$. Assume $\alpha = 0.05$. We will need $q^{(4,36)}_{.05}$. The table says $q^{(4,30)}_{.05} = 3.85$ and $q^{(4,40)}_{.05} = 3.79$. Since 36 is about halfway between 30 and 40, take a halfway point between the two table values and say $q^{(4,36)}_{.05} = 3.82$. Also $t^{36}_{.025} = 2.028$ and $F^{(3,36)}_{.05} = 2.87$. The contrasts follow.

$\mu_1 - \mu_2$
Individual: $\mu_1 - \mu_2 = (216.289 - 216.621) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -0.332 \pm 2.028\sqrt{17.3848} = -0.332 \pm 8.46$   ns
Scheffé: $\mu_1 - \mu_2 = (216.289 - 216.621) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -0.332 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -0.332 \pm 2.934\sqrt{17.3848} = -0.332 \pm 12.233$   ns
Tukey: $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (216.289 - 216.621) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -0.332 \pm 2.701\sqrt{17.3848} = -0.332 \pm 11.262$   ns

$\mu_1 - \mu_3$
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$.
Individual: $\mu_1 - \mu_3 = (216.289 - 221.920) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -5.631 \pm 2.028\sqrt{17.3848} = -5.631 \pm 8.46$   ns
Scheffé: $\mu_1 - \mu_3 = (216.289 - 221.920) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -5.631 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -5.631 \pm 2.934\sqrt{17.3848} = -5.631 \pm 12.233$   ns
Tukey: $\mu_1 - \mu_3 = (\bar{x}_1 - \bar{x}_3) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (216.289 - 221.920) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -5.631 \pm 2.701\sqrt{17.3848} = -5.631 \pm 11.262$   ns

$\mu_1 - \mu_4$
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$.
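Individual: $\mu_1 - \mu_4 = (216.289 - 233.418) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -17.129 \pm 2.028\sqrt{17.3848} = -17.129 \pm 8.46$   s
Scheffé: $\mu_1 - \mu_4 = (216.289 - 233.418) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -17.129 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -17.129 \pm 2.934\sqrt{17.3848} = -17.129 \pm 12.233$   s
Tukey: $\mu_1 - \mu_4 = (\bar{x}_1 - \bar{x}_4) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (216.289 - 233.418) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -17.129 \pm 2.701\sqrt{17.3848} = -17.129 \pm 11.262$   s

$\mu_2 - \mu_3$
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$.
Individual: $\mu_2 - \mu_3 = (216.621 - 221.920) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -5.299 \pm 2.028\sqrt{17.3848} = -5.299 \pm 8.46$   ns
Scheffé: $\mu_2 - \mu_3 = (216.621 - 221.920) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -5.299 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -5.299 \pm 2.934\sqrt{17.3848} = -5.299 \pm 12.233$   ns
Tukey: $\mu_2 - \mu_3 = (\bar{x}_2 - \bar{x}_3) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (216.621 - 221.920) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -5.299 \pm 2.701\sqrt{17.3848} = -5.299 \pm 11.262$   ns

$\mu_2 - \mu_4$
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$.
Individual: $\mu_2 - \mu_4 = (216.621 - 233.418) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -16.797 \pm 2.028\sqrt{17.3848} = -16.797 \pm 8.46$   s
Scheffé: $\mu_2 - \mu_4 = (216.621 - 233.418) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -16.797 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -16.797 \pm 2.934\sqrt{17.3848} = -16.797 \pm 12.233$   s
Tukey: $\mu_2 - \mu_4 = (\bar{x}_2 - \bar{x}_4) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (216.621 - 233.418) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -16.797 \pm 2.701\sqrt{17.3848} = -16.797 \pm 11.262$   s

$\mu_3 - \mu_4$
From the Excel output, $\bar{x}_1 = 216.289$, $\bar{x}_2 = 216.621$, $\bar{x}_3 = 221.920$, $\bar{x}_4 = 233.418$.
Individual: $\mu_3 - \mu_4 = (221.920 - 233.418) \pm t^{36}_{.025}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -11.498 \pm 2.028\sqrt{17.3848} = -11.498 \pm 8.46$   s
Scheffé: $\mu_3 - \mu_4 = (221.920 - 233.418) \pm \sqrt{3F^{(3,36)}_{.05}}\sqrt{86.9242\left(\frac{1}{10} + \frac{1}{10}\right)} = -11.498 \pm \sqrt{3(2.87)}\sqrt{17.3848} = -11.498 \pm 2.934\sqrt{17.3848} = -11.498 \pm 12.233$   ns
Tukey: $\mu_3 - \mu_4 = (\bar{x}_3 - \bar{x}_4) \pm q^{(4,36)}_{.05}\sqrt{\frac{86.9242}{2}}\sqrt{\frac{1}{10} + \frac{1}{10}} = (221.920 - 233.418) \pm \frac{3.82}{\sqrt{2}}\sqrt{17.3848} = -11.498 \pm 2.701\sqrt{17.3848} = -11.498 \pm 11.262$   s

Conclusion: I have included individual confidence intervals here for completeness.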
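The eighteen intervals above all reuse three error margins, so they can be cross-checked with a short script. This is a minimal Python sketch under stated assumptions: the variable names are illustrative, and the critical values (2.028, 2.87 and the interpolated 3.82) are simply the table values quoted in the contrasts section, typed in rather than computed.

```python
import math
from itertools import combinations

# Values quoted in the contrasts section (version C, Des 4 shifted by x = 10).
means = {1: 216.289, 2: 216.621, 3: 221.920, 4: 233.418}
msw = 86.9242          # within-groups mean square, s^2
n_per_group = 10       # every design has 10 observations
m = 4                  # number of designs

# Table values as quoted in the text; they are inputs here, not computed.
t_crit = 2.028         # t(.025) with 36 df
f_crit = 2.87          # F(.05) with (3, 36) df
q_crit = 3.82          # studentized range q(.05) for (4, 36), interpolated

# Common factor s * sqrt(1/n1 + 1/n2) = sqrt(17.3848).
se = math.sqrt(msw * (1 / n_per_group + 1 / n_per_group))

margins = {
    "Individual": t_crit * se,
    "Scheffe": math.sqrt((m - 1) * f_crit) * se,
    "Tukey": q_crit / math.sqrt(2) * se,
}

# A difference is significant ('s') when it exceeds the error margin.
results = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    results[(i, j)] = {
        name: "s" if abs(diff) > margin else "ns" for name, margin in margins.items()
    }
    labels = ", ".join(f"{name} {flag}" for name, flag in results[(i, j)].items())
    print(f"mu{i} - mu{j}: {diff:8.3f}  ->  {labels}")
```

Because all four samples have the same size, each method has a single margin, and the only pair on which the methods disagree is design 3 against design 4.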
The analysis of variance definitely tells us that the means are not the same, regardless of the significance level we might want to use, because the p-value is small. If we compare the differences in sample means we find that there is no difference between the means for the first 3 designs, but that most of the intervals show design 4 to be superior. The intervals are labeled 'ns' for not significant and 's' for significant depending on whether the error part of the interval is larger or smaller than the difference between sample means.

Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third example in 252mvarex as a pattern for your calculations using Minitab. Make sure that you explain what is being tested and what you conclude.

To do this, copy your data into rows 1-10 of columns 1-5. Remember that your column labels should be written in above the columns. Just to make sure that you are in the right place, print out your data and run a one-way ANOVA using:

print c1-c5
AOVO c2-c5

The test is simply

vartest c2-c5;
unstacked.

Don't give me results without explaining them.

----- 4/5/2005 11:23:01 PM -----
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr4051C.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My Documents\Minitab\2gr4-051C.MTW'
Worksheet was saved on Tue Apr 05 2005

Results for: 2gr4-051C.MTW

MTB > describe c2-c5

Descriptive Statistics: Des 1, Des 2, Des 3, Des 4
Variable   N    N*   Mean     SE Mean   StDev   Minimum   Q1       Median   Q3       Maximum
Des 1      10   0    216.29   3.23      10.23   204.45    206.29   215.55   225.72   229.75
Des 2      10   0    216.62   3.74      11.82   203.81    205.38   215.18   226.23   234.30
Des 3      10   0    221.92   2.08      6.56    211.82    216.27   222.78   228.34   230.55
Des 4      10   0    228.42   2.46      7.77    218.54    219.36   228.31   234.85   240.45

MTB > AOVOneway c2-c5.
One-way ANOVA: Des 1, Des 2, Des 3, Des 4
Source   DF   SS       MS      F      P
Factor   3    971.0    323.7   3.72   0.020
Error    36   3129.3   86.9
Total    39   4100.3
S = 9.323   R-Sq = 23.68%   R-Sq(adj) = 17.32%

The Error line is identical to our previous ANOVA. The Factor line differs because Des 4 in this worksheet is the original Des 4 plus 5 rather than plus 10.

MTB > vartest c2-c5;
SUBC> unstacked.

Test for Equal Variances: Des 1, Des 2, Des 3, Des 4
95% Bonferroni confidence intervals for standard deviations
         N    Lower     StDev     Upper
Des 1    10   6.40248   10.2298   22.6233
Des 2    10   7.39650   11.8180   26.1357
Des 3    10   4.10804   6.5638    14.5159
Des 4    10   4.86006   7.7653    17.1731

Bartlett's Test (normal distribution)
Test statistic = 3.51, p-value = 0.320

Levene's Test (any continuous distribution)
Test statistic = 3.78, p-value = 0.019

[Figure: Test for Equal Variances for Des 1, Des 2, Des 3, Des 4. 95% Bonferroni confidence intervals for StDevs, annotated with Bartlett's test (statistic 3.51, p-value 0.320) and Levene's test (statistic 3.78, p-value 0.019).]

We have very interesting results. If we are justified using ANOVA, then the Bartlett test should be the correct one to use and should be more powerful than the Levene test. This is not what seems to be happening. The Levene test, with a p-value of 1.9%, which is below 5%, rejects the null hypothesis of equal variances. The Bartlett test, with a much higher p-value, does not.

------------------------------------------------------------------------------
Extra Extra Credit: Do Bartlett and Levene tests by hand using the examples in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test.

This is an awful lot of work unless you cheat and use the computer. If you cover your tracks, I'll never know. To do the Bartlett test you need logarithms of variances.
Label columns 10-12 'stdev,' 'var' and 'log.' Use the data that you already have in Minitab and get the variances as follows:

stdev c2 k2
stdev c3 k3
stdev c4 k4
stdev c5 k5
print k2-k5           #These are the standard deviations of the columns.
stack k2-k5 c10
let c11 = c10 * c10   #These are the variances of the columns.
let c12 = logten(c11)
let k11 = mean(c11)   #This is the pooled variance when you have equal sized samples.
let k12 = logten(k11)
print k11-k12
print c10-c12

Now you are on your own. The rest of this should be pretty easy because all your $n_j$'s are equal.

The Levene test is longer, but should be much more familiar and perhaps easier to fake. Copy columns 1 through 5 to c21-c25. Then find their medians and subtract them from the columns and convert the columns to absolute values.

let k22 = median(c22)
let k23 = median(c23)
let k24 = median(c24)
let k25 = median(c25)
let c22 = c22 - k22
let c23 = c23 - k23
let c24 = c24 - k24
let c25 = c25 - k25
describe c22-c25      #All the columns should have zero medians now.
print c21-c25
let c22 = absolute(c22)
let c23 = absolute(c23)
let c24 = absolute(c24)
let c25 = absolute(c25)
print c21-c25

You are now ready for an ANOVA using:

AOVO c22-c25          #You should get the same p-value as the Levene test in the vartest output.

The Bartlett Test
This test seems to require that the underlying distribution be Normal. It should not be used to compare two columns, since a simple F test described earlier is more appropriate. Recall that in the test for comparing 2 means with equal variances we used a pooled variance
$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.
Assume that we have $c$ columns representing $c$ independent samples. Then the pooled variance would be
$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + \cdots + (n_c - 1)s_c^2}{n_1 + n_2 + n_3 + \cdots + n_c - c}$.
The test statistic used when there are 6 or more rows is
$\chi^2_{(c-1)} = \frac{2.30259}{d}\left[\sum (n_j - 1)\log \hat{s}_p^2 - \sum (n_j - 1)\log s_j^2\right]$, where $d = 1 + \frac{1}{3(c-1)}\left[\sum \frac{1}{n_j - 1} - \frac{1}{\sum n_j - c}\right]$.
For smaller examples (less than 6 rows) a special table is required and the instructions that I have found are very confusing. Use the computer.

Bartlett Test Setup

MTB > stdev c2 k2
Standard Deviation of Des 1
Standard deviation of Des 1 = 10.2298

MTB > stdev c3 k3
Standard Deviation of Des 2
Standard deviation of Des 2 = 11.8180

MTB > stdev c4 k4
Standard Deviation of Des 3
Standard deviation of Des 3 = 6.56375

MTB > stdev c5 k5
Standard Deviation of Des 4
Standard deviation of Des 4 = 7.76532

MTB > print k2-k5
Data Display
K2   10.2298
K3   11.8180
K4   6.56375
K5   7.76532

MTB > #These are the standard deviations of the columns
MTB > stack k2-k5 c10
MTB > let c11 = c10*c10 #These are the variances of the columns.
MTB > let c12 = logten(c11)
MTB > let k11 = mean(c11) #This is the pooled variance when you have equal sized samples.
MTB > let k12 = logten(k11)
MTB > print k11 k12
Data Display
K11   86.9242
K12   1.93914

MTB > print c10-c12
Data Display
Row   stdev     var       log
1     10.2298   104.648   2.01973
2     11.8180   139.665   2.14509
3     6.5638    43.083    1.63430
4     7.7653    60.300    1.78032

----------------------------------------------------------------------------------------------------------------
$s_1^2 = 104.648$, $s_2^2 = 139.665$, $s_3^2 = 43.083$, $s_4^2 = 60.300$ and $n_1 = n_2 = n_3 = n_4 = 10$.

$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + \cdots + (n_c - 1)s_c^2}{n_1 + n_2 + n_3 + \cdots + n_c - c} = \frac{9(104.648) + 9(139.665) + 9(43.083) + 9(60.300)}{10 + 10 + 10 + 10 - 4} = 86.9242$

$\chi^2_{(c-1)} = \frac{2.30259}{d}\left[\sum (n_j - 1)\log \hat{s}_p^2 - \sum (n_j - 1)\log s_j^2\right]$, where
$d = 1 + \frac{1}{3(c-1)}\left[\sum \frac{1}{n_j - 1} - \frac{1}{\sum n_j - c}\right] = 1 + \frac{1}{3(4-1)}\left[\frac{1}{9} + \frac{1}{9} + \frac{1}{9} + \frac{1}{9} - \frac{1}{36}\right] = 1 + \frac{1}{9}\left(\frac{15}{36}\right) = 1.0463$.

$\chi^2 = \frac{2.30259}{1.0463}\left[36\log(86.9242) - 9\log(104.648) - 9\log(139.665) - 9\log(43.083) - 9\log(60.300)\right]$
$= \frac{2.30259}{1.0463}\left[36(1.93914) - 9(2.01973) - 9(2.14509) - 9(1.63430) - 9(1.78032)\right]$
$= \frac{2.30259}{1.0463}\left[69.80904 - 9(7.57944)\right] = \frac{2.30259(1.59408)}{1.0463} = 3.508$.

This agrees with the 3.51 that the computer got. It has $c - 1 = 4 - 1 = 3$ degrees of freedom and the chi-squared table says that $\chi^{2(3)}_{.05} = 7.8147$. Since our computed chi-squared is less than the table chi-squared, do not reject the null hypothesis.

The Levene Test
This test is quite simple. It can be used for non-Normal data and can be used to compare two columns as well as more than two columns.
(i) Find the median of each column. (This is the middle number or the average of the two middle numbers.)
(ii) Subtract the median of each column from the column from which it comes and take the absolute value of the result.
(iii) Do a 1-way ANOVA on the result. If the results would lead you to reject the null hypothesis (because the computed F is above the table F or the p-value is below your significance level), reject the null hypothesis of equal variances.
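As a cross-check on both equal-variance tests described above, here is a minimal Python sketch using only the standard library. It is an illustration rather than the Minitab procedure: the function names `bartlett_chi2` and `levene_f` are made up for this example, the data are the four design columns of the Minitab worksheet (Des 4 being the original Des 4 plus 5), and the Bartlett correction uses $3(c-1)$ in the denominator of $d$. Neither function returns a p-value; compare the results with the chi-squared and F tables.

```python
import math
import statistics

# Worksheet columns c2-c5 (Des 4 here is the original Des 4 plus 5).
samples = [
    [206.32, 223.85, 207.94, 224.79, 206.19, 229.75, 204.45, 228.51, 209.65, 221.44],
    [203.81, 223.85, 206.75, 223.97, 205.68, 234.30, 204.49, 219.50, 210.86, 233.00],
    [217.08, 230.55, 221.43, 227.95, 218.04, 213.84, 224.13, 224.87, 211.82, 229.49],
    [218.90, 236.10, 226.28, 226.53, 234.43, 240.45, 218.54, 233.35, 219.51, 230.09],
]


def bartlett_chi2(groups):
    """Bartlett statistic with c - 1 degrees of freedom, using base-10 logs."""
    c = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [statistics.variance(g) for g in groups]   # sample variances
    pooled = sum(df * v for df, v in zip(dfs, variances)) / sum(dfs)
    d = 1 + (sum(1 / df for df in dfs) - 1 / sum(dfs)) / (3 * (c - 1))
    spread = sum(dfs) * math.log10(pooled) - sum(
        df * math.log10(v) for df, v in zip(dfs, variances)
    )
    return 2.30259 * spread / d


def levene_f(groups):
    """Steps (i)-(iii): absolute deviations from column medians, then one-way ANOVA F."""
    deviations = []
    for g in groups:
        med = statistics.median(g)                      # step (i)
        deviations.append([abs(x - med) for x in g])    # step (ii)
    # Step (iii): one-way ANOVA on the absolute deviations.
    n = sum(len(g) for g in deviations)
    grand = sum(sum(g) for g in deviations) / n
    means = [statistics.mean(g) for g in deviations]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(deviations, means))
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(deviations, means))
    return (ssb / (len(deviations) - 1)) / (ssw / (n - len(deviations)))


print(f"Bartlett chi-squared: {bartlett_chi2(samples):.2f}  (table value 7.8147, 3 df)")
print(f"Levene F:             {levene_f(samples):.2f}  (table value 2.87 for 3 and 36 df)")
```

The two statistics can then be compared with the Minitab `vartest` output, which reports 3.51 for Bartlett and 3.78 for Levene.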
Levene Test

MTB > let c22 = c2
MTB > let c23 = c3
MTB > let c24 = c4
MTB > let c25 = c5
MTB > let k22 = median(c22)
MTB > let k23 = median(c23)
MTB > let k24 = median (c24)
MTB > let k25 = median(c25)
MTB > let c22 = c22 - k22
MTB > let c23 = c23 - k23
MTB > let c24 = c24 - k24
MTB > let c25 = c25 - k25
MTB > describe c22-c25

Descriptive Statistics: C22, C23, C24, C25
Variable   N    N*   Mean     SE Mean   StDev   Minimum   Q1      Median         Q3      Maximum
C22        10   0    0.744    3.23      10.23   -11.10    -9.26   -1.42109E-14   10.17   14.20
C23        10   0    1.44     3.74      11.82   -11.37    -9.80   0.000000000    11.05   19.12
C24        10   0    -0.860   2.08      6.56    -10.96    -6.51   0.000000000    5.55    7.77
C25        10   0    0.108    2.46      7.77    -9.77     -8.95   0.000000000    6.54    12.14

MTB > print c21-c25
Data Display
Row   C21   C22       C23      C24      C25
1           -9.225    -11.37   -5.70    -9.41
2           8.305     8.67     7.77     7.79
3           -7.605    -8.43    -1.35    -2.03
4           9.245     8.79     5.17     -1.78
5           -9.355    -9.50    -4.74    6.12
6           14.205    19.12    -8.94    12.14
7           -11.095   -10.69   1.35     -9.77
8           12.965    4.32     2.09     5.04
9           -5.895    -4.32    -10.96   -8.80
10          5.895     17.82    6.71     1.78

MTB > let c22 = absolute(c22)
MTB > let c23 = absolute(c23)
MTB > let c24 = absolute(c24)
MTB > let c25 = absolute(c25)
MTB > print c22 - c25
Data Display
Row   C22      C23     C24     C25
1     9.225    11.37   5.70    9.41
2     8.305    8.67    7.77    7.79
3     7.605    8.43    1.35    2.03
4     9.245    8.79    5.17    1.78
5     9.355    9.50    4.74    6.12
6     14.205   19.12   8.94    12.14
7     11.095   10.69   1.35    9.77
8     12.965   4.32    2.09    5.04
9     5.895    4.32    10.96   8.80
10    5.895    17.82   6.71    1.78

MTB > AOVO c22-c25

One-way ANOVA: C22, C23, C24, C25
Source   DF   SS      MS     F      P
Factor   3    158.8   52.9   3.78   0.019
Error    36   503.7   14.0
Total    39   662.6
S = 3.741   R-Sq = 23.97%   R-Sq(adj) = 17.64%

Level   N    Mean     StDev
C22     10   9.379    2.743
C23     10   10.303   4.902
C24     10   5.478    3.250
C25     10   6.466    3.723

[Figure: Individual 95% CIs for mean based on pooled StDev, one interval per level, on an axis running from 5.0 to 12.5.]

Pooled StDev = 3.741

So anyway, here is our original data.

Row   Des 1 ($x_1$)   Des 2 ($x_2$)   Des 3 ($x_3$)   Des 4 ($x_4$)
1     206.32          203.81          217.08          218.90
2     223.85          223.85          230.55          236.10
3     207.94          206.75          221.43          226.28
4     224.79          223.97          227.95          226.53
5     206.19          205.68          218.04          234.43
6     229.75          234.30          213.84          240.45
7     204.45          204.49          224.13          218.54
8     228.51          219.50          224.87          233.35
9     209.65          210.86          211.82          219.51
10    221.44          233.00          229.49          230.09

(i) Find the median of each column. (This is the middle number or the average of the two middle numbers.)
Let's put the numbers in order so we can see the two middle numbers.

Row   $x_1$    $x_2$    $x_3$    $x_4$
1     204.45   203.81   211.82   218.54
2     206.19   204.49   213.84   218.90
3     206.32   205.68   217.08   219.51
4     207.94   206.75   218.04   226.28
5     209.65   210.86   221.43   226.53
6     221.44   219.50   224.13   230.09
7     223.85   223.85   224.87   233.35
8     224.79   223.97   227.95   234.43
9     228.51   233.00   229.49   236.10
10    229.75   234.30   230.55   240.45

The medians for the four columns are 215.545, 215.180, 222.780 and 228.310.

(ii) Subtract the median of each column from the column from which it comes and take the absolute value of the result.
If we take the original numbers and subtract the medians, we get the following.

Row   $x_1$     $x_2$    $x_3$    $x_4$
1     -9.225    -11.37   -5.70    -9.41
2     8.305     8.67     7.77     7.79
3     -7.605    -8.43    -1.35    -2.03
4     9.245     8.79     5.17     -1.78
5     -9.355    -9.50    -4.74    6.12
6     14.205    19.12    -8.94    12.14
7     -11.095   -10.69   1.35     -9.77
8     12.965    4.32     2.09     5.04
9     -5.895    -4.32    -10.96   -8.80
10    5.895     17.82    6.71     1.78

If we remove the signs, our result is below.

Row   $x_1$    $x_2$   $x_3$   $x_4$
1     9.225    11.37   5.70    9.41
2     8.305    8.67    7.77    7.79
3     7.605    8.43    1.35    2.03
4     9.245    8.79    5.17    1.78
5     9.355    9.50    4.74    6.12
6     14.205   19.12   8.94    12.14
7     11.095   10.69   1.35    9.77
8     12.965   4.32    2.09    5.04
9     5.895    4.32    10.96   8.80
10    5.895    17.82   6.71    1.78

(iii) Do a 1-way ANOVA on the result. If the results would lead you to reject the null hypothesis (because the computed F is above the table F or the p-value is below your significance level), reject the null hypothesis of equal variances.

Row               $x_1$     $x_2$      $x_3$     $x_4$     Totals
1                 9.225     11.37      5.70      9.41
2                 8.305     8.67       7.77      7.79
3                 7.605     8.43       1.35      2.03
4                 9.245     8.79       5.17      1.78
5                 9.355     9.50       4.74      6.12
6                 14.205    19.12      8.94      12.14
7                 11.095    10.69      1.35      9.77
8                 12.965    4.32       2.09      5.04
9                 5.895     4.32       10.96     8.80
10                5.895     17.82      6.71      1.78
Sum               93.790    103.03     54.78     64.66     $\sum x_{ij} = 316.26$
$n_j$             10        10         10        10        $n = 40$
$\bar{x}_{.j}$    9.379     10.303     5.478     6.466
SS                947.370   1277.75    395.142   542.818   $\sum x_{ij}^2 = 3163.08$
$\bar{x}_{.j}^2$  87.9656   106.1518   30.0085   41.8092   $\sum \bar{x}_{.j}^2 = 265.9351$

$\bar{x} = \frac{\sum x_{ij}}{n} = \frac{316.26}{40} = 7.9065$

$SST = \sum x_{ij}^2 - n\bar{x}^2 = 3163.08 - 40(7.9065)^2 = 662.57$

$SSB = \sum n_j \bar{x}_{.j}^2 - n\bar{x}^2 = 10(9.379)^2 + 10(10.303)^2 + 10(5.478)^2 + 10(6.466)^2 - 40(7.9065)^2 = 10(265.9351) - 40(7.9065)^2 = 158.84$

Source    SS       DF   MS        F      $F_{.05}$                  $H_0$
Between   158.84   3    52.95     3.78   $F^{(3,36)}_{.05} = 2.87$  s   Column means equal
Within    503.73   36   13.9925
Total     662.57   39

Our rejection of the null hypothesis at the 5% level means that we reject the hypothesis of equal variances.

As far as the official answer to problem 11.14 is concerned, here is the solution in the Instructor's Solution Manual.
It doesn't resemble my results at all, so I can't wait to see what you got.

11.14
(a) To test at the 0.05 level of significance whether there is any evidence of a difference in the average distance traveled by the golf balls differing in design, we conduct an F test:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: At least one mean is different.
Decision rule: df: 3, 36. If F > 2.866, reject $H_0$.

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        2990.99    3    996.9966   53.02982   2.73E-13   2.866265
Within Groups         676.8244   36   18.80068
Total                 3667.814   39

Since $F_{calc} = 53.03$ is above the critical bound of F = 2.866, reject $H_0$. There is enough evidence to conclude that there is a significant difference in the average distance traveled by the golf balls differing in design.

(b) To determine which of the means are significantly different from one another, we use the Tukey-Kramer procedure to establish the critical range: $Q_{U(c,\,n-c)} = Q_{U(4,\,36)}$. We use $Q_{U(4,\,40)} = 3.79$.
critical range $= Q_{U(c,\,n-c)}\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.79\sqrt{\frac{18.8007}{2}\left(\frac{1}{10} + \frac{1}{10}\right)} = 5.1967$

Tukey Kramer Multiple Comparisons
Group   Sample Mean   Sample Size
1       206.614       10
2       218.516       10
3       226.588       10
4       228.622       10
MSW = 18.800677

Comparison           Absolute Difference   Results
Group 1 to Group 2   11.902                Means are different
Group 1 to Group 3   19.974                Means are different
Group 1 to Group 4   22.008                Means are different
Group 2 to Group 3   8.072                 Means are different
Group 2 to Group 4   10.106                Means are different
Group 3 to Group 4   2.034                 Means are not different

At the 5% level of significance, there is enough evidence to conclude that average traveling distances between all pairs of designs are different, with the only exception of the pair between design 3 and design 4.

(c) The assumptions needed in (a) are (i) samples are randomly and independently drawn, (ii) populations are normally distributed, and (iii) populations have equal variances.

(d)
To test at the 0.05 level of significance whether the variation within the groups is similar for all groups, we conduct a Levene's test for homogeneity of variance:
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$
$H_1$: At least one variance is different.

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        40.63675   3    13.54558   2.093228   0.118276   2.866265
Within Groups         232.9613   36   6.471147
Total                 273.598    39

Since p-value = 0.1182 > 0.05, do not reject the null hypothesis. There is not enough evidence to conclude that there is any difference in the variation of the distance traveled by the golf balls differing in design.

(e) In order to produce golf balls with the furthest traveling distance, either design 3 or 4 can be used.
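The Tukey-Kramer critical range and comparison table in part (b) of the manual's solution can be reproduced in a few lines. This is a minimal Python sketch under stated assumptions: the variable names are illustrative, and the studentized range value 3.79, MSW and the group means are taken as given from the solution rather than computed from raw data, which the manual's solution does not show.

```python
import math
from itertools import combinations

# Quantities quoted in the Instructor's Solution Manual answer to 11.14 (b).
q_u = 3.79                 # Q_U(4, 40), used in place of Q_U(4, 36)
msw = 18.8007              # within-groups mean square
n_per_group = 10           # sample size of every group
group_means = {1: 206.614, 2: 218.516, 3: 226.588, 4: 228.622}

# Critical range: Q_U * sqrt((MSW / 2) * (1/n_j + 1/n_j')).
critical_range = q_u * math.sqrt(msw / 2 * (1 / n_per_group + 1 / n_per_group))
print(f"critical range = {critical_range:.4f}")

# Two means differ when their absolute difference exceeds the critical range.
different = {}
for i, j in combinations(sorted(group_means), 2):
    abs_diff = abs(group_means[i] - group_means[j])
    different[(i, j)] = abs_diff > critical_range
    verdict = "different" if different[(i, j)] else "not different"
    print(f"Group {i} to Group {j}: {abs_diff:7.3f}  Means are {verdict}")
```

With equal sample sizes a single critical range serves all six comparisons; with unequal sizes the range would have to be recomputed for each pair.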