Solution to Graded Assignment 4

252solngr4-071s 4/3/07
252solngr4-041 4/05/07
(Open this document in 'Page Layout' view!)

Name, Student Number:
Class days and time:
Please include this on what you hand in!

The data set is part of a problem due to Pelosi and Sandifer. Twenty employees (A-T) are timed in a computer entry task initially (0hr), after 2 hours of work (2hr), after 4 hours (4hr) and after 6 hours (6hr). The times, in seconds, are reported below.
a) At a 5% significance level do the four mean times differ?
b) Determine which of the times actually differ.
c) On the basis of these data, how would you react to a proposal that employees only be allowed to work for four hours a day at this task?
Only neat and legible papers with written answers in complete sentences will be read!

0 Hours  67 64 69 88 72 80 85 116 77 78 68 51 54 75 71 64 86 98 103 91
2 Hours  84 78 74 91 70 73 86 71 76 76 61 62 94 63 70 63 66 71 53 81
4 Hours  52 53 56 66 59 77 64 62 54 65 71 92 71 50 71 58 77 53 81 70
6 Hours  57 53 71 61 73 50 53 80 63 41 63 41 53 63 61 46 68 64 49 70

Do this problem in Excel as follows.
Use columns A, B, C, D, E and F on the Excel spreadsheet for data.
In the first row of columns B, C, D and E put in 0hr, 2hr, 4hr and 6hr.
Starting in cell A2, put in the letters A through T to identify the employees (unless, of course, you want to suggest some names).
Now put the data in columns B, C, D and E, skipping column A. If you bring this document into Word, the data can be moved into the Excel worksheet by highlighting the cells you want and copying and pasting.
To fill column F, write =B2 in cell F2. After you press 'enter' this cell should read '67'.
Use the 'edit' pull-down menu and 'copy' cell F2.
Use the 'edit' pull-down menu and 'paste' in cells F3 through F21. Now column F will be identical to B except for the heading. This can also be done as a simple copy and paste.
Save your data as time1.xls.
Use the 'tools' pull-down menu and pick 'data analysis'. (If you cannot find this, use Tools and Add-Ins to put in the analysis packs.)
Pick 'ANOVA: Single Factor'. Set the input range to $B$1:$E$21. Select 'New worksheet ply' and 'columns', check 'labels in first row', hit 'OK' and save your results as treslt1.xls.
In order to check for the effect of the fact that the data is blocked by employees, repeat the analysis using 'ANOVA: Two-Factor without replication'. Set the input range to $A$1:$E$21, and save your results as treslt2.xls.

Answer the following: Is there a significant difference between the task completion times according to the number of hours worked? How is this conclusion affected by blocking by employees?

Take the last digit of your student number (if it's zero, use 10) and call it x.
Go back to your original data or use the 'file' pull-down menu to open time1.xls.
To fill column B this time, write =F2+x in cell B2, replacing x with the last digit of your student number.
Use the 'edit' pull-down menu and 'copy' cell B2.
Use the 'edit' pull-down menu and 'paste' in cells B3 through B21. Now column B will be more than the original B by the amount of your value of x.
Save your data as time3.xls. Run the one-way ANOVA again and save your results as treslt3.xls.

Submit the data and results with your student number. Indicate what hypotheses were tested, what the p-value was and whether, using the p-value, you would reject the null if (i) the significance level was 5% and (ii) the significance level was 10%, explaining why. You will have two answers for each of your two problems.

For your second ANOVA do a Scheffé confidence interval and a Tukey-Kramer interval or procedure for each of the $C^4_2 = 6$ possible differences between means and report which are different at the 5% level according to each of the 2 methods.
Now on the basis of these data, how would you react to a proposal that employees only be allowed to work for four hours a day at this task? Why?

Data for 1st and 2nd ANOVA

       0hr   2hr   4hr   6hr   (column F)
A       67    84    52    57    67
B       64    78    53    53    64
C       69    74    56    71    69
D       88    91    66    61    88
E       72    70    59    73    72
F       80    73    77    50    80
G       85    86    64    53    85
H      116    71    62    80   116
I       77    76    54    63    77
J       78    76    65    41    78
K       68    61    71    63    68
L       51    62    92    41    51
M       54    94    71    53    54
N       75    63    50    63    75
O       71    70    71    61    71
P       64    63    58    46    64
Q       86    66    77    68    86
R       98    71    53    64    98
S      103    53    81    49   103
T       91    81    70    70    91

Results for 1st ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

Anova: Single Factor

SUMMARY
Groups   Count   Sum    Average   Variance
0hr      20      1557   77.85     260.45
2hr      20      1463   73.15     110.6605
4hr      20      1302   65.1      125.5684
6hr      20      1180   59        114.4211

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        4211.05    3    1403.683   9.187913   2.93E-05   2.724946
Within Groups         11610.9    76   152.775
Total                 15821.95   79

Results for 2nd ANOVA

$H_{01}$: Row (employee) means equal
$H_{02}: \mu_1 = \mu_2 = \mu_3 = \mu_4$

Anova: Two-Factor Without Replication

SUMMARY   Count   Sum    Average   Variance
A         4       260    65        199.3333
B         4       248    62        140.6667
C         4       270    67.5      63
D         4       306    76.5      231
E         4       274    68.5      41.66667
F         4       280    70        186
G         4       288    72        263.3333
H         4       329    82.25     560.25
I         4       270    67.5      121.6667
J         4       260    65        288.6667
K         4       263    65.75     20.91667
L         4       246    61.5      487
M         4       272    68        368.6667
N         4       251    62.75     104.25
O         4       273    68.25     23.58333
P         4       231    57.75     68.25
Q         4       297    74.25     84.25
R         4       286    71.5      367
S         4       286    71.5      643.6667
T         4       312    78        102

0hr       20      1557   77.85     260.45
2hr       20      1463   73.15     110.6605
4hr       20      1302   65.1      125.5684
6hr       20      1180   59        114.4211

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  2726.45    19   143.4974   0.920637   0.56152    1.771973
Columns               4211.05    3    1403.683   9.005617   5.65E-05   2.766441
Error                 8884.45    57   155.8675
Total                 15821.95   79

Answer: In the first ANOVA we get a p-value of .0000293. Since this is below any significance level we are likely to use, we reject the null hypothesis that the mean execution time is the same for all numbers of hours worked. In the second ANOVA, the p-value for columns (.0000565) is almost as low, so we again reject the original null hypothesis. Note that there is no significant difference between individuals, since the p-value for rows is .56.

Data for 3rd ANOVA

I added 3 to the first column.

       0hr   2hr   4hr   6hr   (column F)
A       70    84    52    57    67
B       67    78    53    53    64
C       72    74    56    71    69
D       91    91    66    61    88
E       75    70    59    73    72
F       83    73    77    50    80
G       88    86    64    53    85
H      119    71    62    80   116
I       80    76    54    63    77
J       81    76    65    41    78
K       71    61    71    63    68
L       54    62    92    41    51
M       57    94    71    53    54
N       78    63    50    63    75
O       74    70    71    61    71
P       67    63    58    46    64
Q       89    66    77    68    86
R      101    71    53    64    98
S      106    53    81    49   103
T       94    81    70    70    91

Results for 3rd ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

Anova: Single Factor

SUMMARY
Groups   Count   Sum    Average   Variance
0hr      20      1617   80.85     260.45
2hr      20      1463   73.15     110.6605
4hr      20      1302   65.1      125.5684
6hr      20      1180   59        114.4211

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        5435.05    3    1811.683   11.85851   1.88E-06   2.724946
Within Groups         11610.9    76   152.775
Total                 17045.95   79

Individual Confidence Interval
If we desire a single interval, we use the formula for the difference between two means when the variance is known. For example, if we want the difference between means of column 1 and column 2:
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t^{(n-m)}_{\alpha/2}\, s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where $s = \sqrt{MSW}$.

Scheffé Confidence Interval
If we desire intervals that will simultaneously be valid for a given confidence level for all possible intervals between column means, use
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm \sqrt{(m-1)\, F^{(m-1,\, n-m)}_{\alpha}}\; s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

Tukey Confidence Interval
This also applies to all possible differences.
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q^{(m,\, n-m)}_{\alpha} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.
This gives rise to Tukey's HSD (Honestly Significant Difference) procedure.
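The Scheffé and Tukey error terms can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function and variable names are made up for illustration, the means and MSW are the ones in the Excel output for the 3rd ANOVA, and the critical values 2.73 and 3.73 are the table values this solution uses for $F^{(3,76)}_{.05}$ and $q^{(4,76)}_{.05}$ rather than values computed by the code.

```python
import math
from itertools import combinations

# Column means and within-groups mean square from the 3rd ANOVA output.
means = {1: 80.85, 2: 73.15, 3: 65.10, 4: 59.00}
msw = 152.775          # s = sqrt(MSW)
m = 4                  # number of columns
n_per_column = 20

# Table values taken as given (assumptions of this sketch, alpha = .05).
F_CRIT = 2.73          # F(.05; 3, 76)
Q_CRIT = 3.73          # studentized range q(.05; 4, 76)


def scheffe_margin(msw, m, f_crit, n1, n2):
    """Error term sqrt((m-1)F) * s * sqrt(1/n1 + 1/n2)."""
    return math.sqrt((m - 1) * f_crit) * math.sqrt(msw * (1 / n1 + 1 / n2))


def tukey_margin(msw, q_crit, n1, n2):
    """Error term q * (s / sqrt(2)) * sqrt(1/n1 + 1/n2)."""
    return q_crit * math.sqrt(msw / 2 * (1 / n1 + 1 / n2))


scheffe = scheffe_margin(msw, m, F_CRIT, n_per_column, n_per_column)
tukey = tukey_margin(msw, Q_CRIT, n_per_column, n_per_column)

# For each of the 6 pairs, record (significant by Scheffe, significant by Tukey).
significant = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    significant[(i, j)] = (abs(diff) > scheffe, abs(diff) > tukey)
    print(f"mu{i} - mu{j}: {diff:6.2f}  "
          f"Scheffe +/- {scheffe:.2f}  Tukey +/- {tukey:.2f}  {significant[(i, j)]}")
```

Because all four columns have 20 observations, each method has a single error term that serves for all six differences; with unequal column sizes the two functions would have to be called once per pair.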
Two sample means $\bar{x}_{.1}$ and $\bar{x}_{.2}$ are significantly different if $|\bar{x}_{.1} - \bar{x}_{.2}|$ is greater than $q^{(m,\, n-m)}_{\alpha} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

From the Excel output, $\bar{x}_1 = 80.85$, $\bar{x}_2 = 73.15$, $\bar{x}_3 = 65.10$, $\bar{x}_4 = 59.00$, $m = 4$, $n - m = 76$, $n_1 = n_2 = n_3 = n_4 = 20$ and $MSW = 152.775$. Assume $\alpha = 0.05$. The contrasts follow.

$\mu_1 - \mu_2$
Individual: $\mu_1 - \mu_2 = (80.85 - 73.15) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 7.70 \pm 1.992\sqrt{15.2775} = 7.70 \pm 7.79$ ns
Scheffé: $\mu_1 - \mu_2 = (80.85 - 73.15) \pm \sqrt{3F^{(3,76)}_{.05}} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 7.70 \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 7.70 \pm \sqrt{125.123} = 7.70 \pm 11.19$ ns
Tukey: $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q^{(4,76)}_{.05} \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = (80.85 - 73.15) \pm 3.73\sqrt{7.6387} = 7.70 \pm 10.31$ ns

$\mu_1 - \mu_3$
Individual: $\mu_1 - \mu_3 = (80.85 - 65.10) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 15.75 \pm 1.992\sqrt{15.2775} = 15.75 \pm 7.79$ s
Scheffé: $\mu_1 - \mu_3 = (80.85 - 65.10) \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 15.75 \pm \sqrt{125.123} = 15.75 \pm 11.19$ s
Tukey: $\mu_1 - \mu_3 = (80.85 - 65.10) \pm 3.73 \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = 15.75 \pm 3.73\sqrt{7.6387} = 15.75 \pm 10.31$ s

$\mu_1 - \mu_4$
Individual: $\mu_1 - \mu_4 = (80.85 - 59.00) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 21.85 \pm 1.992\sqrt{15.2775} = 21.85 \pm 7.79$ s
Scheffé: $\mu_1 - \mu_4 = (80.85 - 59.00) \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 21.85 \pm \sqrt{125.123} = 21.85 \pm 11.19$ s
Tukey: $\mu_1 - \mu_4 = (80.85 - 59.00) \pm 3.73 \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = 21.85 \pm 3.73\sqrt{7.6387} = 21.85 \pm 10.31$ s

$\mu_2 - \mu_3$
Individual: $\mu_2 - \mu_3 = (73.15 - 65.10) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 8.05 \pm 1.992\sqrt{15.2775} = 8.05 \pm 7.79$ s
Scheffé: $\mu_2 - \mu_3 = (73.15 - 65.10) \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 8.05 \pm \sqrt{125.123} = 8.05 \pm 11.19$ ns
Tukey: $\mu_2 - \mu_3 = (73.15 - 65.10) \pm 3.73 \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = 8.05 \pm 3.73\sqrt{7.6387} = 8.05 \pm 10.31$ ns

$\mu_2 - \mu_4$
Individual: $\mu_2 - \mu_4 = (73.15 - 59.00) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 14.15 \pm 1.992\sqrt{15.2775} = 14.15 \pm 7.79$ s
Scheffé: $\mu_2 - \mu_4 = (73.15 - 59.00) \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 14.15 \pm \sqrt{125.123} = 14.15 \pm 11.19$ s
Tukey: $\mu_2 - \mu_4 = (73.15 - 59.00) \pm 3.73 \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = 14.15 \pm 3.73\sqrt{7.6387} = 14.15 \pm 10.31$ s

$\mu_3 - \mu_4$
Individual: $\mu_3 - \mu_4 = (65.10 - 59.00) \pm t^{(76)}_{.025} \sqrt{152.775} \sqrt{\frac{1}{20} + \frac{1}{20}} = 6.10 \pm 1.992\sqrt{15.2775} = 6.10 \pm 7.79$ ns
Scheffé: $\mu_3 - \mu_4 = (65.10 - 59.00) \pm \sqrt{3(2.73)} \sqrt{152.775\left(\frac{1}{20} + \frac{1}{20}\right)} = 6.10 \pm \sqrt{125.123} = 6.10 \pm 11.19$ ns
Tukey: $\mu_3 - \mu_4 = (65.10 - 59.00) \pm 3.73 \sqrt{\frac{152.775}{2}} \sqrt{\frac{1}{20} + \frac{1}{20}} = 6.10 \pm 3.73\sqrt{7.6387} = 6.10 \pm 10.31$ ns

Conclusion: I have included individual confidence intervals here for completeness. The analysis of variance definitely tells us that the means are not the same, regardless of the significance level we might want to use, because the p-value is microscopic. If we compare the differences in sample means using the Scheffé and Tukey intervals, we find that there is no significant difference between the means for adjacent periods. The intervals are labeled ns for not significant and s for significant depending on whether the error part of the interval is larger or smaller than the difference between sample means. These conclusions are at the 95% confidence level, but the more conservative Scheffé procedure used $\sqrt{F^{(3,76)}_{.05}} = \sqrt{2.73} = 1.65$ (2.73 came from the computer printout reference value; using the table we might have come up with something like $F^{(3,60)}_{.05}$, which is slightly larger) as part of the error term.
If we were to repeat our tests at the 1% level, we could use something like $\sqrt{F^{(3,60)}_{.01}} = \sqrt{4.13} = 2.03$, which would make our error terms 23% larger. If we were to do that, the differences between nonadjacent periods would still remain significant. The strong gains over longer periods might make it unwise to limit daily hours of employees.

Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third example in 252mvarex as a pattern for your calculations. Make sure that you explain what is being tested and what you conclude. Hand in separately; this will be treated as extra credit on your next take-home exam.

Extra Extra Credit: Do a Bartlett test using the example in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test. See exam for extra credit solution.

© 2004 Roger Even Bove

Extra Credit: 1) Show that you learned something from computer problem 2 by doing part B on Minitab. There should be very little difference in your result. Comments are in red in the original and appear here in [square brackets].

————— 4/3/2007 5:28:57 PM ————————————————————
Welcome to Minitab, press F1 for help.

Results for: 2gr4-071ANOVA.MTW

MTB > print c1 - c5

Data Display

Row  Employee  0hr  2hr  4hr  6hr
  1  A          67   84   52   57
  2  B          64   78   53   53
  3  C          69   74   56   71
  4  D          88   91   66   61
  5  E          72   70   59   73
  6  F          80   73   77   50
  7  G          85   86   64   53
  8  H         116   71   62   80
  9  I          77   76   54   63
 10  J          78   76   65   41
 11  K          68   61   71   63
 12  L          51   62   92   41
 13  M          54   94   71   53
 14  N          75   63   50   63
 15  O          71   70   71   61
 16  P          64   63   58   46
 17  Q          86   66   77   68
 18  R          98   71   53   64
 19  S         103   53   81   49
 20  T          91   81   70   70

MTB > AOVO c2-c5

One-way ANOVA: 0hr, 2hr, 4hr, 6hr

Source  DF     SS    MS     F      P
Factor   3   4211  1404  9.19  0.000
Error   76  11611   153
Total   79  15822

S = 12.36   R-Sq = 26.62%   R-Sq(adj) = 23.72%

[The low p-value means that the null hypothesis of equal column means is rejected.]

Level   N   Mean  StDev
0hr    20  77.85  16.14
2hr    20  73.15  10.52
4hr    20  65.10  11.21
6hr    20  59.00  10.70

[Character plot: individual 95% CIs for each mean, based on pooled StDev, on an axis running from 56.0 to 80.0.]

Pooled StDev = 12.36

MTB > stack c2 c3 c4 c5 c6;
SUBC> subscripts c7;
SUBC> UseNames.
MTB > Print c6 c7 c8

Data Display

[This is just to show you what the stacked data looks like.]

Row  Time  Hour  Person
  1    67  0hr   A
  2    64  0hr   B
  3    69  0hr   C
  4    88  0hr   D
  5    72  0hr   E
  6    80  0hr   F
  7    85  0hr   G
  8   116  0hr   H
  9    77  0hr   I
 10    78  0hr   J
 11    68  0hr   K
 12    51  0hr   L
 13    54  0hr   M
 14    75  0hr   N
 15    71  0hr   O
 16    64  0hr   P
 17    86  0hr   Q
 18    98  0hr   R
 19   103  0hr   S
 20    91  0hr   T
 21    84  2hr   A
 22    78  2hr   B
 23    74  2hr   C
 24    91  2hr   D
 25    70  2hr   E
 26    73  2hr   F
 27    86  2hr   G
 28    71  2hr   H
 29    76  2hr   I
 30    76  2hr   J
 31    61  2hr   K
 32    62  2hr   L
 33    94  2hr   M
 34    63  2hr   N
 35    70  2hr   O
 36    63  2hr   P
 37    66  2hr   Q
 38    71  2hr   R
 39    53  2hr   S
 40    81  2hr   T
 41    52  4hr   A
 42    53  4hr   B
 43    56  4hr   C
 44    66  4hr   D
 45    59  4hr   E
 46    77  4hr   F
 47    64  4hr   G
 48    62  4hr   H
 49    54  4hr   I
 50    65  4hr   J
 51    71  4hr   K
 52    92  4hr   L
 53    71  4hr   M
 54    50  4hr   N
 55    71  4hr   O
 56    58  4hr   P
 57    77  4hr   Q
 58    53  4hr   R
 59    81  4hr   S
 60    70  4hr   T
 61    57  6hr   A
 62    53  6hr   B
 63    71  6hr   C
 64    61  6hr   D
 65    73  6hr   E
 66    50  6hr   F
 67    53  6hr   G
 68    80  6hr   H
 69    63  6hr   I
 70    41  6hr   J
 71    63  6hr   K
 72    41  6hr   L
 73    53  6hr   M
 74    63  6hr   N
 75    61  6hr   O
 76    46  6hr   P
 77    68  6hr   Q
 78    64  6hr   R
 79    49  6hr   S
 80    70  6hr   T

MTB > table c8 c7.
Tabulated statistics: Person, Hour

[This is an instruction from your 2-way ANOVA. It tells you how much data is in each cell.]

Rows: Person   Columns: Hour

      0hr  2hr  4hr  6hr  All
A       1    1    1    1    4
B       1    1    1    1    4
C       1    1    1    1    4
D       1    1    1    1    4
E       1    1    1    1    4
F       1    1    1    1    4
G       1    1    1    1    4
H       1    1    1    1    4
I       1    1    1    1    4
J       1    1    1    1    4
K       1    1    1    1    4
L       1    1    1    1    4
M       1    1    1    1    4
N       1    1    1    1    4
O       1    1    1    1    4
P       1    1    1    1    4
Q       1    1    1    1    4
R       1    1    1    1    4
S       1    1    1    1    4
T       1    1    1    1    4
All    20   20   20   20   80

Cell Contents: Count

MTB > table c8 c7;
SUBC> data c6.

Tabulated statistics: Person, Hour

[This is just a printout of data by cell. Because it was done by cell there were big blanks between each line. I edited them out.]

Rows: Person   Columns: Hour

    0hr  2hr  4hr  6hr
A    67   84   52   57
B    64   78   53   53
C    69   74   56   71
D    88   91   66   61
E    72   70   59   73
F    80   73   77   50
G    85   86   64   53
H   116   71   62   80
I    77   76   54   63
J    78   76   65   41
K    68   61   71   63
L    51   62   92   41
M    54   94   71   53
N    75   63   50   63
O    71   70   71   61
P    64   63   58   46
Q    86   66   77   68
R    98   71   53   64
S   103   53   81   49
T    91   81   70   70

Cell Contents: Time : DATA

MTB > twoway c6 c7 c8;
SUBC> means c8 c7.

Two-way ANOVA: Time versus Hour, Person

Source  DF       SS       MS     F      P
Hour     3   4211.1  1403.68  9.01  0.000
Person  19   2726.5   143.50  0.92  0.562
Error   57   8884.5   155.87
Total   79  15822.0

S = 12.48   R-Sq = 43.85%   R-Sq(adj) = 22.17%

[So here is our 2-way ANOVA. The first hypothesis test says that the hypothesis that hour means are equal is rejected. The high p-value for the second test, which is above any significance level we might use, tells us that there is no significant difference between employee means.]

Hour   Mean
0hr   77.85
2hr   73.15
4hr   65.10
6hr   59.00

[Character plot: individual 95% CIs for each hour mean, based on pooled StDev, on an axis running from 56.0 to 80.0.]

Person   Mean
A       65.00
B       62.00
C       67.50
D       76.50
E       68.50
F       70.00
G       72.00
H       82.25
I       67.50
J       65.00
K       65.75
L       61.50
M       68.00
N       62.75
O       68.25
P       57.75
Q       74.25
R       71.50
S       71.50
T       78.00

[Character plot: individual 95% CIs for each person mean, based on pooled StDev, on an axis running from 45 to 90.]

Extra Credit: 2) Take the data from your last ANOVA. Use the instructions in 1) above to copy it into the Minitab spreadsheet and perform Levene and Bartlett tests on it, using the third example in 252mvarex as a pattern for your calculations using Minitab. Make sure that you explain what is being tested and what you conclude.

MTB > print c1-c5

Data Display

[This is just to remind you of the data.]

Row  Employee  0hr  2hr  4hr  6hr
  1  A          67   84   52   57
  2  B          64   78   53   53
  3  C          69   74   56   71
  4  D          88   91   66   61
  5  E          72   70   59   73
  6  F          80   73   77   50
  7  G          85   86   64   53
  8  H         116   71   62   80
  9  I          77   76   54   63
 10  J          78   76   65   41
 11  K          68   61   71   63
 12  L          51   62   92   41
 13  M          54   94   71   53
 14  N          75   63   50   63
 15  O          71   70   71   61
 16  P          64   63   58   46
 17  Q          86   66   77   68
 18  R          98   71   53   64
 19  S         103   53   81   49
 20  T          91   81   70   70

MTB > vartest c2-c5;
SUBC> unstacked.

[This test was needlessly done twice. This is the unstacked version.]
Test for Equal Variances: 0hr, 2hr, 4hr, 6hr

95% Bonferroni confidence intervals for standard deviations

      N    Lower    StDev    Upper
0hr  20  11.4383  16.1385  26.4296
2hr  20   7.4558  10.5195  17.2276
4hr  20   7.9422  11.2057  18.3514
6hr  20   7.5814  10.6968  17.5179

Bartlett's Test (normal distribution)
Test statistic = 5.10, p-value = 0.165

Levene's Test (any continuous distribution)
Test statistic = 1.29, p-value = 0.283

[Both p-values are above any significance level that we might use. This means that we cannot reject the null hypothesis of equal variances.]

Test for Equal Variances: 0hr, 2hr, 4hr, 6hr

[Graph omitted. Just a graphic of the info above.]

MTB > vartest c6 c7

Test for Equal Variances: Time versus Hour

[Look at the stacked data several pages back. This is exactly the same as the last test, but done on stacked data.]

95% Bonferroni confidence intervals for standard deviations

Hour   N    Lower    StDev    Upper
0hr   20  11.4383  16.1385  26.4296
2hr   20   7.4558  10.5195  17.2276
4hr   20   7.9422  11.2057  18.3514
6hr   20   7.5814  10.6968  17.5179

Bartlett's Test (normal distribution)
Test statistic = 5.10, p-value = 0.165

Levene's Test (any continuous distribution)
Test statistic = 1.29, p-value = 0.283

Test for Equal Variances: Time versus Hour

Extra Extra Credit: Do Bartlett and Levene tests using the examples in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test. This is an awful lot of work unless you cheat and use the computer. If you cover your tracks, I'll never know.

To do the Bartlett test you need logarithms of variances. Label columns 10-12 'stdev,' 'var' and 'log.' Use the data that you already have in four columns in Minitab, c2-c5 (labels in c1), and get the variances as follows:

MTB > name k2 'stdev1'
MTB > name k3 'stdev2'
MTB > name k4 'stdev3'
MTB > name k5 'stdev4'
MTB > stdev c2 k2

Standard Deviation of 0hr
Standard deviation of 0hr = 16.1385

[We are computing standard deviations of the columns and storing them as the Minitab constants k2, k3, k4 and k5. We actually want variances.]

MTB > stdev c3 k3

Standard Deviation of 2hr
Standard deviation of 2hr = 10.5195

MTB > stdev c4 k4

Standard Deviation of 4hr
Standard deviation of 4hr = 11.2057

MTB > stdev c5 k5

Standard Deviation of 6hr
Standard deviation of 6hr = 10.6968

MTB > print k2-k5

Data Display

stdev1  16.1385
stdev2  10.5195
stdev3  11.2057
stdev4  10.6968

MTB > stack k2-k5 c10
MTB > let c11 = c10*c10
MTB > let c12 = logten(c11)
MTB > let k11 = mean(c11)
MTB > let k12 = logten(k11)
MTB > name k11 'meansdsq'
MTB > name k12 'logmean'
MTB > print k11 - k12

[We put the standard deviations in c10 and squared them to get variances. The mean of the variances, k11, is the pooled variance when you have equal sized samples.]

Data Display

meansdsq  152.775
logmean   2.18405

MTB > print c10 - c12

[Note that I named my columns.]

Data Display

Row    stdev     sdsq  logsdsq
  1  16.1385  260.450  2.41572
  2  10.5195  110.661  2.04399
  3  11.2057  125.568  2.09888
  4  10.6968  114.421  2.05851

Now you are on your own. I'll finish this if anyone actually does the Bartlett test.

Extra Extra Credit: Do Bartlett and Levene tests using the examples in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test.

The Levene test is longer, but should be much more familiar and perhaps easier to fake. Copy columns 1 through 5 to c21-c25. Then find their medians, subtract them from the columns and convert the columns to absolute values.
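For readers who want to see where the Bartlett setup above leads, here is a minimal Python sketch that combines the column variances and their logarithms into a test statistic. It assumes the standard textbook form of Bartlett's statistic with natural logarithms, which may be arranged differently from the 252mvar example; the names `bartlett_statistic`, `columns` and `CHI2_CRIT` are hypothetical, and the chi-squared critical value 7.815 (3 degrees of freedom, 5% level) is a table value, not something computed here.

```python
import math
from statistics import variance

# The four columns of completion times (original data, employees A-T).
columns = {
    "0hr": [67, 64, 69, 88, 72, 80, 85, 116, 77, 78,
            68, 51, 54, 75, 71, 64, 86, 98, 103, 91],
    "2hr": [84, 78, 74, 91, 70, 73, 86, 71, 76, 76,
            61, 62, 94, 63, 70, 63, 66, 71, 53, 81],
    "4hr": [52, 53, 56, 66, 59, 77, 64, 62, 54, 65,
            71, 92, 71, 50, 71, 58, 77, 53, 81, 70],
    "6hr": [57, 53, 71, 61, 73, 50, 53, 80, 63, 41,
            63, 41, 53, 63, 61, 46, 68, 64, 49, 70],
}

CHI2_CRIT = 7.815  # chi-squared(.05; 3 df), taken from a table (assumption)


def bartlett_statistic(samples):
    """Return (Bartlett statistic, pooled variance) for a list of samples."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    variances = [variance(s) for s in samples]   # sample variances (n - 1)
    df_within = sum(sizes) - k
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / df_within
    numerator = df_within * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances))
    correction = 1 + (sum(1 / (n - 1) for n in sizes)
                      - 1 / df_within) / (3 * (k - 1))
    return numerator / correction, pooled


statistic, pooled_var = bartlett_statistic(list(columns.values()))
print(f"pooled variance = {pooled_var:.3f}")
print(f"Bartlett statistic = {statistic:.2f}, reject at 5%: {statistic > CHI2_CRIT}")
```

The pooled variance should agree with 'meansdsq' above, and the statistic can be compared with the Bartlett value that Minitab's vartest command reports.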
MTB > name k22 'med1'
MTB > name k23 'med2'
MTB > name k24 'med3'
MTB > name k25 'med4'
MTB > let c21 = c1
MTB > let c22 = c2
MTB > let c23 = c3
MTB > let c24 = c4
MTB > let c25 = c5

[I copied my original data to c21-c25.]

MTB > let k22 = median(c22)
MTB > let k23 = median(c23)
MTB > let k24 = median(c24)
MTB > let k25 = median(c25)
MTB > let c22 = c22 - k22
MTB > let c23 = c23 - k23
MTB > let c24 = c24 - k24
MTB > let c25 = c25 - k25

[I subtracted the median for each column.]

MTB > describe c22 - c25

[I checked to see if the medians were zero.]

Descriptive Statistics: 1-med, 2-med, 3-med, 4-med

Variable   N  N*   Mean  SE Mean  StDev  Minimum      Q1  Median     Q3  Maximum
1-med     20   0   1.85     3.61  16.14   -25.00   -8.75    0.00  11.50    40.00
2-med     20   0   1.15     2.35  10.52   -19.00   -8.25    0.00   8.25    22.00
3-med     20   0   0.60     2.51  11.21   -14.50  -10.00    0.00   6.50    27.50
4-med     20   0  -2.00     2.39  10.70   -20.00  -10.25    0.00   6.00    19.00

MTB > print c22 - c25

Data Display

[These are the original data with column medians subtracted.]

Row  1-med  2-med  3-med  4-med
  1     -9     12  -12.5     -4
  2    -12      6  -11.5     -8
  3     -7      2   -8.5     10
  4     12     19    1.5      0
  5     -4     -2   -5.5     12
  6      4      1   12.5    -11
  7      9     14   -0.5     -8
  8     40     -1   -2.5     19
  9      1      4  -10.5      2
 10      2      4    0.5    -20
 11     -8    -11    6.5      2
 12    -25    -10   27.5    -20
 13    -22     22    6.5     -8
 14     -1     -9  -14.5      2
 15     -5     -2    6.5      0
 16    -12     -9   -6.5    -15
 17     10     -6   12.5      7
 18     22     -1  -11.5      3
 19     27    -19   16.5    -12
 20     15      9    5.5      9

MTB > let c22 = abs(c22)
MTB > let c23 = abs(c23)
MTB > let c24 = abs(c24)
MTB > let c25 = abs(c25)
MTB > print c22 - c25

Data Display

[This is the absolute value of the columns we just printed.]

Row  1-med  2-med  3-med  4-med
  1      9     12   12.5      4
  2     12      6   11.5      8
  3      7      2    8.5     10
  4     12     19    1.5      0
  5      4      2    5.5     12
  6      4      1   12.5     11
  7      9     14    0.5      8
  8     40      1    2.5     19
  9      1      4   10.5      2
 10      2      4    0.5     20
 11      8     11    6.5      2
 12     25     10   27.5     20
 13     22     22    6.5      8
 14      1      9   14.5      2
 15      5      2    6.5      0
 16     12      9    6.5     15
 17     10      6   12.5      7
 18     22      1   11.5      3
 19     27     19   16.5     12
 20     15      9    5.5      9

MTB > AOVO c22 - c25

[We now do an ordinary 1-way ANOVA.]

One-way ANOVA: 1-med, 2-med, 3-med, 4-med

Source  DF      SS    MS     F      P
Factor   3   220.1  73.4  1.29  0.283
Error   76  4314.9  56.8
Total   79  4535.0

S = 7.535   R-Sq = 4.85%   R-Sq(adj) = 1.10%

[Since the p-value is above any significance level that we might use, we cannot reject the null hypothesis of equal variances.]

Level   N    Mean   StDev
1-med  20  12.350  10.174
2-med  20   8.150   6.491
3-med  20   9.000   6.378
4-med  20   8.600   6.386

[Character plot: individual 95% CIs for each mean, based on pooled StDev, on an axis running from 6.0 to 15.0.]

Pooled StDev = 7.535

Game over.
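As a cross-check on the Levene calculation, the same steps (subtract each column's median, take absolute values, run a one-way ANOVA) can be sketched in Python with only the standard library. The function names `one_way_f` and `levene_f` are hypothetical, and the sketch returns only the F ratio; the p-value would still have to come from an F table or from software.

```python
from statistics import mean, median

# Original completion times by column (employees A-T).
data = [
    [67, 64, 69, 88, 72, 80, 85, 116, 77, 78,
     68, 51, 54, 75, 71, 64, 86, 98, 103, 91],   # 0hr
    [84, 78, 74, 91, 70, 73, 86, 71, 76, 76,
     61, 62, 94, 63, 70, 63, 66, 71, 53, 81],    # 2hr
    [52, 53, 56, 66, 59, 77, 64, 62, 54, 65,
     71, 92, 71, 50, 71, 58, 77, 53, 81, 70],    # 4hr
    [57, 53, 71, 61, 73, 50, 53, 80, 63, 41,
     63, 41, 53, 63, 61, 46, 68, 64, 49, 70],    # 6hr
]


def one_way_f(groups):
    """F ratio of a one-way ANOVA: (SSB / (m - 1)) / (SSW / (n - m))."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    ss_within = sum((x - gm) ** 2
                    for g, gm in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def levene_f(groups):
    """Levene statistic: one-way ANOVA on absolute deviations from medians."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_f(deviations)


print(f"one-way ANOVA F on the raw times: {one_way_f(data):.2f}")
print(f"Levene F on absolute deviations:  {levene_f(data):.2f}")
```

The first number can be compared with the F in the one-way ANOVA of the original data, and the second with the F in the ANOVA of the absolute deviations.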
