Solution to Graded Assignment 4

252solngr4-071s 4/3/07
252solngr4-041 4/05/07
(Open this document in 'Page Layout' view!)
Name, Student Number:
Class days and time:
Please include this on what you hand in!
Solution to Graded Assignment 4
The data set is part of a problem due to Pelosi and Sandifer.
Twenty employees (A-T) are timed in a computer entry task initially (0hr), after 2 hours of work (2hr), after 4
hours (4hr) and after 6 hours (6hr). The times, in seconds, are reported below. a) At a 5% significance level,
do the four mean times differ? b) Determine which of the times actually differ. c) On the basis of these data,
how would you react to a proposal that employees only be allowed to work for four hours a day at this task?
Only neat and legible papers with written answers in complete sentences will be read!
Employee  0 Hours  2 Hours  4 Hours  6 Hours
A            67       84       52       57
B            64       78       53       53
C            69       74       56       71
D            88       91       66       61
E            72       70       59       73
F            80       73       77       50
G            85       86       64       53
H           116       71       62       80
I            77       76       54       63
J            78       76       65       41
K            68       61       71       63
L            51       62       92       41
M            54       94       71       53
N            75       63       50       63
O            71       70       71       61
P            64       63       58       46
Q            86       66       77       68
R            98       71       53       64
S           103       53       81       49
T            91       81       70       70
Do this problem in Excel as follows.
Use columns A, B, C, D, E and F on the Excel spreadsheet for data.
In the first row of columns B, C, D and E put in 0hr, 2hr, 4hr and 6hr. Starting in cell A2, put in the letters
A through T to identify the employees (unless, of course, you want to suggest some names).
Now put in the data in columns B, C, D and E, skipping column A.
If you bring this document into Word, the data can be moved into the Excel worksheet by highlighting the
cells you want and copying and pasting.
To fill column F, in cell F2 write =B2. After you press 'enter', this cell should read '67'.
Use the 'edit' pull-down menu and 'copy' cell F2.
Use the 'edit' pull-down menu and 'paste' in cells F3 through F21. Now column F will be identical to B
except for the heading. This can also be done as a simple copy and paste. Save your data as time1.xls.
Use the 'tools' pull-down menu and pick 'data analysis'. (If you cannot find this, use Tools and Add-Ins to
put in the analysis packs.)
Pick 'ANOVA: Single Factor'. Set input range to $B$1:$E$21. Select 'New worksheet ply' and 'columns',
check 'labels in first row', hit 'OK' and save your results as treslt1.xls. In order to check for the effect of the
fact that the data is blocked by employees, repeat the analysis using 'ANOVA: Two-Factor without
replication'. Set input range to $A$1:$E$21, and save your results as treslt2.xls.
Answer the following: Is there a significant difference between the task completion times according to the
number of hours worked? How is this conclusion affected by blocking by employees?
Take the last digit of your student number (if it's zero, use 10). Go back to your original data or use the 'file'
pull-down menu to open time1.xls.
To fill column B this time, in cell B2 write =F2+x, replacing x with the last digit of your student
number.
Use the 'edit' pull down menu and 'copy' cell B2
Use the 'edit' pull down menu and ‘paste’ in cells B3 through B21. Now column B will be more than the
original B by the amount of your value of x. Save your data as time3.xls.
Run the one-way ANOVA again and save your results as treslt3.xls
Submit the data and results with your student number. Indicate what hypotheses were tested, what the
p-value was and whether, using the p-value, you would reject the null if (i) the significance level was 5% and
(ii) the significance level was 10%, explaining why. You will have two answers for each of your two
problems.
For your second ANOVA do a Scheffé confidence interval and a Tukey-Kramer interval or procedure for
each of the $C_2^4 = 6$ possible differences between means and report which are different at the 5% level
according to each of the 2 methods. Now on the basis of these data, how would you react to a proposal that
employees only be allowed to work for four hours a day at this task? Why?
Data for 1st and 2nd ANOVA
Employee   0hr   2hr   3hr   4hr   F (copy of 0hr)
A           67    84    52    57    67
B           64    78    53    53    64
C           69    74    56    71    69
D           88    91    66    61    88
E           72    70    59    73    72
F           80    73    77    50    80
G           85    86    64    53    85
H          116    71    62    80   116
I           77    76    54    63    77
J           78    76    65    41    78
K           68    61    71    63    68
L           51    62    92    41    51
M           54    94    71    53    54
N           75    63    50    63    75
O           71    70    71    61    71
P           64    63    58    46    64
Q           86    66    77    68    86
R           98    71    53    64    98
S          103    53    81    49   103
T           91    81    70    70    91
Results for 1st ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Anova: Single Factor
SUMMARY
Groups   Count   Sum    Average   Variance
0hr      20      1557   77.85     260.45
2hr      20      1463   73.15     110.6605
3hr      20      1302   65.1      125.5684
4hr      20      1180   59        114.4211
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        4211.05    3    1403.683   9.187913   2.93E-05   2.724946
Within Groups         11610.9    76   152.775
Total                 15821.95   79
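The single-factor table can be cross-checked outside Excel. The following is a minimal sketch in plain Python, not part of the original assignment; the dictionary name `times` and the helper `one_way_anova` are illustrative choices, and the column labels follow the assignment (0hr, 2hr, 4hr, 6hr).

```python
# Completion times in seconds for employees A-T, copied from the data table.
times = {
    "0hr": [67, 64, 69, 88, 72, 80, 85, 116, 77, 78, 68, 51, 54, 75, 71, 64, 86, 98, 103, 91],
    "2hr": [84, 78, 74, 91, 70, 73, 86, 71, 76, 76, 61, 62, 94, 63, 70, 63, 66, 71, 53, 81],
    "4hr": [52, 53, 56, 66, 59, 77, 64, 62, 54, 65, 71, 92, 71, 50, 71, 58, 77, 53, 81, 70],
    "6hr": [57, 53, 71, 61, 73, 50, 53, 80, 63, 41, 63, 41, 53, 63, 61, 46, 68, 64, 49, 70],
}


def one_way_anova(groups):
    """Return (SS between, SS within, F) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


ssb, ssw, f_stat = one_way_anova(list(times.values()))
print(f"SS between = {ssb:.2f}, SS within = {ssw:.2f}, F = {f_stat:.4f}")
```

The three printed figures should agree with the Between Groups SS, Within Groups SS and F entries in the Excel table.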
Results for 2nd ANOVA
$H_{01}$: Row (employee) means equal
$H_{02}: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Anova: Two-Factor Without Replication
SUMMARY   Count   Sum    Average   Variance
A         4       260    65        199.3333
B         4       248    62        140.6667
C         4       270    67.5      63
D         4       306    76.5      231
E         4       274    68.5      41.66667
F         4       280    70        186
G         4       288    72        263.3333
H         4       329    82.25     560.25
I         4       270    67.5      121.6667
J         4       260    65        288.6667
K         4       263    65.75     20.91667
L         4       246    61.5      487
M         4       272    68        368.6667
N         4       251    62.75     104.25
O         4       273    68.25     23.58333
P         4       231    57.75     68.25
Q         4       297    74.25     84.25
R         4       286    71.5      367
S         4       286    71.5      643.6667
T         4       312    78        102
0hr       20      1557   77.85     260.45
2hr       20      1463   73.15     110.6605
3hr       20      1302   65.1      125.5684
4hr       20      1180   59        114.4211
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  2726.45    19   143.4974   0.920637   0.56152    1.771973
Columns               4211.05    3    1403.683   9.005617   5.65E-05   2.766441
Error                 8884.45    57   155.8675
Total                 15821.95   79
Answer: In the first ANOVA we get a p-value of .0000293. Since this is below any significance level we
are likely to use, we reject the null hypothesis that the mean execution time is the same for all numbers of
hours worked. In the second ANOVA, the p-value for columns (.0000565) is almost
as low, so we again reject the original null hypothesis. Note that there is no significant difference between
individuals (the p-value for rows is .56152).
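The effect of blocking by employee can be reproduced by hand. The sketch below is an illustration in plain Python rather than the Excel procedure itself: it splits the total sum of squares into rows (employees), columns (hours) and error, which is the decomposition a two-factor ANOVA without replication uses. The variable names are hypothetical.

```python
# One row per employee A-T; the columns are the 0hr, 2hr, 4hr and 6hr times.
rows = [
    (67, 84, 52, 57), (64, 78, 53, 53), (69, 74, 56, 71), (88, 91, 66, 61),
    (72, 70, 59, 73), (80, 73, 77, 50), (85, 86, 64, 53), (116, 71, 62, 80),
    (77, 76, 54, 63), (78, 76, 65, 41), (68, 61, 71, 63), (51, 62, 92, 41),
    (54, 94, 71, 53), (75, 63, 50, 63), (71, 70, 71, 61), (64, 63, 58, 46),
    (86, 66, 77, 68), (98, 71, 53, 64), (103, 53, 81, 49), (91, 81, 70, 70),
]
r, c = len(rows), len(rows[0])
grand_mean = sum(sum(row) for row in rows) / (r * c)

row_means = [sum(row) / c for row in rows]
col_means = [sum(row[j] for row in rows) / r for j in range(c)]

ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)
ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
ss_error = ss_total - ss_rows - ss_cols

ms_error = ss_error / ((r - 1) * (c - 1))
f_rows = (ss_rows / (r - 1)) / ms_error
f_cols = (ss_cols / (c - 1)) / ms_error
print(f"F rows = {f_rows:.4f}, F columns = {f_cols:.4f}")
```

Because the row sum of squares is small relative to its 19 degrees of freedom, removing it from the error term barely changes the F ratio for columns, which matches the conclusion in the answer.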
Data for 3rd ANOVA
I added 3 to first column.
Employee   0hr   2hr   3hr   4hr   F (original 0hr)
A           70    84    52    57    67
B           67    78    53    53    64
C           72    74    56    71    69
D           91    91    66    61    88
E           75    70    59    73    72
F           83    73    77    50    80
G           88    86    64    53    85
H          119    71    62    80   116
I           80    76    54    63    77
J           81    76    65    41    78
K           71    61    71    63    68
L           54    62    92    41    51
M           57    94    71    53    54
N           78    63    50    63    75
O           74    70    71    61    71
P           67    63    58    46    64
Q           89    66    77    68    86
R          101    71    53    64    98
S          106    53    81    49   103
T           94    81    70    70    91
Results for 3rd ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Anova: Single Factor
SUMMARY
Groups   Count   Sum    Average   Variance
0hr      20      1617   80.85     260.45
2hr      20      1463   73.15     110.6605
3hr      20      1302   65.1      125.5684
4hr      20      1180   59        114.4211
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        5435.05    3    1811.683   11.85851   1.88E-06   2.724946
Within Groups         11610.9    76   152.775
Total                 17045.95   79
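Adding a constant to one column moves that column's mean but leaves every column variance alone, so only the between-groups line of the table changes. The following sketch, which is an illustration and not part of the assignment, rebuilds the F ratio from the summary figures alone; `anova_from_summary` and the shift `x = 3` are names chosen here for the example.

```python
def anova_from_summary(sums, variances, n):
    """One-way ANOVA from column sums and variances, all columns of size n."""
    k = len(sums)
    means = [s / n for s in sums]
    grand_mean = sum(sums) / (k * n)
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n - 1) * sum(variances)
    f_stat = (ss_between / (k - 1)) / (ss_within / (k * (n - 1)))
    return ss_between, ss_within, f_stat


n = 20
sums = [1557, 1463, 1302, 1180]                      # column sums from the 1st ANOVA
variances = [260.45, 110.6605, 125.5684, 114.4211]   # column variances (unchanged by the shift)

x = 3  # the constant added to the first column in this solution
shifted_sums = [sums[0] + n * x] + sums[1:]

original = anova_from_summary(sums, variances, n)
shifted = anova_from_summary(shifted_sums, variances, n)
print(f"F before shift = {original[2]:.4f}, F after shift = {shifted[2]:.4f}")
```

The within-groups sum of squares is the same in both calls, while the between-groups sum of squares and F rise because the first mean moves further from the others.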
Individual Confidence Interval
If we desire a single interval, we use the formula for the difference between two means when the variance is
known. For example, if we want the difference between means of column 1 and column 2,
\[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}^{(n-m)}\, s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \quad \text{where } s = \sqrt{MSW}. \]
Scheffé Confidence Interval
If we desire intervals that will simultaneously be valid for a given confidence level for all possible intervals
between column means, use
\[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm \sqrt{(m-1) F_{\alpha}^{(m-1,\,n-m)}}\; s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}. \]
Tukey Confidence Interval
This also applies to all possible differences.
\[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q_{\alpha}^{(m,\,n-m)} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}. \]
This gives rise to Tukey's HSD (Honestly Significant Difference) procedure. Two sample means $\bar{x}_{.1}$ and $\bar{x}_{.2}$ are significantly different if $|\bar{x}_{.1} - \bar{x}_{.2}|$ is greater
than
\[ q_{\alpha}^{(m,\,n-m)} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}. \]
From the Excel output, $\bar{x}_1 = 80.85$, $\bar{x}_2 = 73.15$, $\bar{x}_3 = 65.10$, $\bar{x}_4 = 59.00$, $m = 4$, $n - m = 76$,
$n_1 = n_2 = n_3 = n_4 = 20$ and $MSW = 152.775$. Assume $\alpha = 0.05$. The contrasts follow.
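$\mu_1 - \mu_2$
Individual: \[ \mu_1 - \mu_2 = (80.85 - 73.15) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 7.70 \pm 1.665 \sqrt{15.2775} = 7.70 \pm 6.51 \quad \text{s} \]
Scheffé: \[ \mu_1 - \mu_2 = (80.85 - 73.15) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 7.70 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 7.70 \pm \sqrt{125.123} = 7.70 \pm 11.18 \quad \text{ns} \]
Tukey: \[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (80.85 - 73.15) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 7.70 \pm 3.73 \sqrt{7.6387} = 7.70 \pm 10.31 \quad \text{ns} \]
$\mu_1 - \mu_3$
Individual: \[ \mu_1 - \mu_3 = (80.85 - 65.10) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 15.75 \pm 1.665 \sqrt{15.2775} = 15.75 \pm 6.51 \quad \text{s} \]
Scheffé: \[ \mu_1 - \mu_3 = (80.85 - 65.10) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 15.75 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 15.75 \pm \sqrt{125.123} = 15.75 \pm 11.18 \quad \text{s} \]
Tukey: \[ \mu_1 - \mu_3 = (\bar{x}_1 - \bar{x}_3) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (80.85 - 65.10) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 15.75 \pm 3.73 \sqrt{7.6387} = 15.75 \pm 10.31 \quad \text{s} \]
$\mu_1 - \mu_4$
Individual: \[ \mu_1 - \mu_4 = (80.85 - 59.00) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 21.85 \pm 1.665 \sqrt{15.2775} = 21.85 \pm 6.51 \quad \text{s} \]
Scheffé: \[ \mu_1 - \mu_4 = (80.85 - 59.00) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 21.85 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 21.85 \pm \sqrt{125.123} = 21.85 \pm 11.18 \quad \text{s} \]
Tukey: \[ \mu_1 - \mu_4 = (\bar{x}_1 - \bar{x}_4) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (80.85 - 59.00) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 21.85 \pm 3.73 \sqrt{7.6387} = 21.85 \pm 10.31 \quad \text{s} \]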
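$\mu_2 - \mu_3$
Individual: \[ \mu_2 - \mu_3 = (73.15 - 65.10) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 8.05 \pm 1.665 \sqrt{15.2775} = 8.05 \pm 6.51 \quad \text{s} \]
Scheffé: \[ \mu_2 - \mu_3 = (73.15 - 65.10) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 8.05 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 8.05 \pm \sqrt{125.123} = 8.05 \pm 11.18 \quad \text{ns} \]
Tukey: \[ \mu_2 - \mu_3 = (\bar{x}_2 - \bar{x}_3) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (73.15 - 65.10) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 8.05 \pm 3.73 \sqrt{7.6387} = 8.05 \pm 10.31 \quad \text{ns} \]
$\mu_2 - \mu_4$
Individual: \[ \mu_2 - \mu_4 = (73.15 - 59.00) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 14.15 \pm 1.665 \sqrt{15.2775} = 14.15 \pm 6.51 \quad \text{s} \]
Scheffé: \[ \mu_2 - \mu_4 = (73.15 - 59.00) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 14.15 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 14.15 \pm \sqrt{125.123} = 14.15 \pm 11.18 \quad \text{s} \]
Tukey: \[ \mu_2 - \mu_4 = (\bar{x}_2 - \bar{x}_4) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (73.15 - 59.00) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 14.15 \pm 3.73 \sqrt{7.6387} = 14.15 \pm 10.31 \quad \text{s} \]
$\mu_3 - \mu_4$
Individual: \[ \mu_3 - \mu_4 = (65.10 - 59.00) \pm t_{\alpha/2}^{(76)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 6.10 \pm 1.665 \sqrt{15.2775} = 6.10 \pm 6.51 \quad \text{ns} \]
Scheffé: \[ \mu_3 - \mu_4 = (65.10 - 59.00) \pm \sqrt{3 F_{.05}^{(3,76)}} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 6.10 \pm \sqrt{3(2.73)} \sqrt{152.775} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 6.10 \pm \sqrt{125.123} = 6.10 \pm 11.18 \quad \text{ns} \]
Tukey: \[ \mu_3 - \mu_4 = (\bar{x}_3 - \bar{x}_4) \pm q_{.05}^{(4,76)} \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = (65.10 - 59.00) \pm 3.73 \sqrt{\tfrac{152.775}{2}} \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 6.10 \pm 3.73 \sqrt{7.6387} = 6.10 \pm 10.31 \quad \text{ns} \]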
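The six Scheffé and Tukey comparisons all share the same error term, so they can be generated in a loop. This is a minimal sketch under stated assumptions: the critical values `f_crit = 2.73` and `q_crit = 3.73` are simply the figures quoted in this solution (they are not computed here), and the variable names are illustrative.

```python
from itertools import combinations
import math

means = {1: 80.85, 2: 73.15, 3: 65.10, 4: 59.00}  # column means, 3rd ANOVA
msw, n, m = 152.775, 20, 4                        # MSW, column size, number of columns
f_crit = 2.73  # F(.05; 3, 76) as quoted in the text
q_crit = 3.73  # q(.05; 4, 76) as quoted in the text

std_error = math.sqrt(msw * (1 / n + 1 / n))
scheffe_half = math.sqrt((m - 1) * f_crit) * std_error
tukey_half = q_crit * std_error / math.sqrt(2)

results = {}
for i, j in combinations(means, 2):
    diff = means[i] - means[j]
    results[(i, j)] = (round(diff, 2), abs(diff) > scheffe_half, abs(diff) > tukey_half)

print(f"Scheffe half-width = {scheffe_half:.2f}, Tukey half-width = {tukey_half:.2f}")
for pair, (diff, scheffe_sig, tukey_sig) in results.items():
    print(pair, diff, "s" if scheffe_sig else "ns", "s" if tukey_sig else "ns")
```

A pair is flagged as significant when the absolute difference between sample means exceeds the half-width, which is the same rule used to label the intervals s or ns.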
Conclusion: I have included individual confidence intervals here for completeness. The analysis of variance
definitely tells us that the means are not the same, regardless of the significance level we might want to use,
because the p-value is microscopic. If we compare the differences in sample means using the Scheffé and
Tukey intervals, we find that there is no significant difference between the means for adjacent periods. The
intervals are labeled ns for not significant and s for significant, depending on whether the error part of the
interval is larger or smaller than the difference between sample means.
These conclusions are at the 95% confidence level, but the more conservative Scheffé procedure used
$\sqrt{F_{.05}^{(3,76)}} = \sqrt{2.73} = 1.65$ as part of the error term (2.73 came from the computer printout
reference value; using the table we might have come up with something like $F_{.05}^{(3,60)}$, which is slightly
larger). If we were to repeat our tests at the 1% level, we could use something like
$\sqrt{F_{.01}^{(3,60)}} = \sqrt{4.13} = 2.03$, which would make our error terms 23% larger. If we were to do
that, the differences between nonadjacent periods would still remain significant. The strong gains over
longer periods might make it unwise to limit daily hours of employees.
Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third
example in 252mvarex as a pattern for your calculations. Make sure that you explain what is being tested
and what you conclude. Hand in separately – this will be treated as extra credit on your next take-home
exam.
Extra Extra Credit: Do a Bartlett test using the example in 252mvar as your pattern. It turns out that your
ANOVA has just enough columns to do this test.
See exam for extra credit solution.
© 2004 Roger Even Bove
Extra Credit: 1) Show that you learned something from computer problem 2 by doing part B on Minitab.
There should be very little difference in your result. Comments are in red.
————— 4/3/2007 5:28:57 PM ————————————————————
Welcome to Minitab, press F1 for help.
Results for: 2gr4-071ANOVA.MTW
MTB > print c1 - c5
Data Display
Row  Employee  0hr  2hr  4hr  6hr
  1  A          67   84   52   57
  2  B          64   78   53   53
  3  C          69   74   56   71
  4  D          88   91   66   61
  5  E          72   70   59   73
  6  F          80   73   77   50
  7  G          85   86   64   53
  8  H         116   71   62   80
  9  I          77   76   54   63
 10  J          78   76   65   41
 11  K          68   61   71   63
 12  L          51   62   92   41
 13  M          54   94   71   53
 14  N          75   63   50   63
 15  O          71   70   71   61
 16  P          64   63   58   46
 17  Q          86   66   77   68
 18  R          98   71   53   64
 19  S         103   53   81   49
 20  T          91   81   70   70
MTB > AOVO c2-c5
One-way ANOVA: 0hr, 2hr, 4hr, 6hr
Source  DF     SS    MS     F      P
Factor   3   4211  1404  9.19  0.000
Error   76  11611   153
Total   79  15822
S = 12.36   R-Sq = 26.62%   R-Sq(adj) = 23.72%
The low p-value means that the null hypothesis of equal column means is rejected.
                          Individual 95% CIs For Mean Based on
                          Pooled StDev
Level   N   Mean  StDev   ---+---------+---------+---------+-----
0hr    20  77.85  16.14                          (------*------)
2hr    20  73.15  10.52                     (-----*------)
4hr    20  65.10  11.21          (------*------)
6hr    20  59.00  10.70   (------*------)
                          ---+---------+---------+---------+-----
                            56.0      64.0      72.0      80.0
Pooled StDev = 12.36
MTB > stack c2 c3 c4 c5 c6;
SUBC> subscripts c7;
SUBC> UseNames.
MTB > Print c6 c7 c8
Data Display
This is just to show you what the stacked data looks like.
Row  Time  Hour  Person
  1    67  0hr   A
  2    64  0hr   B
  3    69  0hr   C
  4    88  0hr   D
  5    72  0hr   E
  6    80  0hr   F
  7    85  0hr   G
  8   116  0hr   H
  9    77  0hr   I
 10    78  0hr   J
 11    68  0hr   K
 12    51  0hr   L
 13    54  0hr   M
 14    75  0hr   N
 15    71  0hr   O
 16    64  0hr   P
 17    86  0hr   Q
 18    98  0hr   R
 19   103  0hr   S
 20    91  0hr   T
 21    84  2hr   A
 22    78  2hr   B
 23    74  2hr   C
 24    91  2hr   D
 25    70  2hr   E
 26    73  2hr   F
 27    86  2hr   G
 28    71  2hr   H
 29    76  2hr   I
 30    76  2hr   J
 31    61  2hr   K
 32    62  2hr   L
 33    94  2hr   M
 34    63  2hr   N
 35    70  2hr   O
 36    63  2hr   P
 37    66  2hr   Q
 38    71  2hr   R
 39    53  2hr   S
 40    81  2hr   T
 41    52  4hr   A
 42    53  4hr   B
 43    56  4hr   C
 44    66  4hr   D
 45    59  4hr   E
 46    77  4hr   F
 47    64  4hr   G
 48    62  4hr   H
 49    54  4hr   I
 50    65  4hr   J
 51    71  4hr   K
 52    92  4hr   L
 53    71  4hr   M
 54    50  4hr   N
 55    71  4hr   O
 56    58  4hr   P
 57    77  4hr   Q
 58    53  4hr   R
 59    81  4hr   S
 60    70  4hr   T
 61    57  6hr   A
 62    53  6hr   B
 63    71  6hr   C
 64    61  6hr   D
 65    73  6hr   E
 66    50  6hr   F
 67    53  6hr   G
 68    80  6hr   H
 69    63  6hr   I
 70    41  6hr   J
 71    63  6hr   K
 72    41  6hr   L
 73    53  6hr   M
 74    63  6hr   N
 75    61  6hr   O
 76    46  6hr   P
 77    68  6hr   Q
 78    64  6hr   R
 79    49  6hr   S
 80    70  6hr   T
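For readers working outside Minitab, the same wide-to-long reshaping can be sketched in a few lines of Python. This is only an illustration of what the stack command with subscripts produces; the names `times`, `employees` and `stacked` are hypothetical.

```python
# Wide layout: one list per hour column, in employee order A-T.
times = {
    "0hr": [67, 64, 69, 88, 72, 80, 85, 116, 77, 78, 68, 51, 54, 75, 71, 64, 86, 98, 103, 91],
    "2hr": [84, 78, 74, 91, 70, 73, 86, 71, 76, 76, 61, 62, 94, 63, 70, 63, 66, 71, 53, 81],
    "4hr": [52, 53, 56, 66, 59, 77, 64, 62, 54, 65, 71, 92, 71, 50, 71, 58, 77, 53, 81, 70],
    "6hr": [57, 53, 71, 61, 73, 50, 53, 80, 63, 41, 63, 41, 53, 63, 61, 46, 68, 64, 49, 70],
}
employees = "ABCDEFGHIJKLMNOPQRST"

# Long (stacked) layout: one (time, hour, person) record per observation,
# columns stacked one after another just as in the Minitab display.
stacked = [
    (time, hour, person)
    for hour, column in times.items()
    for person, time in zip(employees, column)
]

for row_number, record in enumerate(stacked[:3], start=1):
    print(row_number, *record)
```

Each record carries its own hour and person label, which is what lets a two-way procedure identify both factors from a single column of times.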
MTB > table c8 c7.
Tabulated statistics: Person, Hour
This is an instruction from your 2-way ANOVA. It tells you how much data is in each cell.
Rows: Person   Columns: Hour
       0hr  2hr  4hr  6hr  All
A        1    1    1    1    4
B        1    1    1    1    4
C        1    1    1    1    4
D        1    1    1    1    4
E        1    1    1    1    4
F        1    1    1    1    4
G        1    1    1    1    4
H        1    1    1    1    4
I        1    1    1    1    4
J        1    1    1    1    4
K        1    1    1    1    4
L        1    1    1    1    4
M        1    1    1    1    4
N        1    1    1    1    4
O        1    1    1    1    4
P        1    1    1    1    4
Q        1    1    1    1    4
R        1    1    1    1    4
S        1    1    1    1    4
T        1    1    1    1    4
All     20   20   20   20   80
Cell Contents: Count
MTB > table c8 c7;
SUBC> data c6.
Tabulated statistics: Person, Hour
Rows: Person   Columns: Hour
     0hr  2hr  4hr  6hr
A     67   84   52   57
B     64   78   53   53
C     69   74   56   71
D     88   91   66   61
E     72   70   59   73
F     80   73   77   50
G     85   86   64   53
H    116   71   62   80
I     77   76   54   63
J     78   76   65   41
K     68   61   71   63
L     51   62   92   41
M     54   94   71   53
N     75   63   50   63
O     71   70   71   61
P     64   63   58   46
Q     86   66   77   68
R     98   71   53   64
S    103   53   81   49
T     91   81   70   70
Cell Contents: Time : DATA
This is just a printout of data by cell. Because it was done by
cell there were big blanks between each line. I edited them out.
MTB > twoway c6 c7 c8;
SUBC> means c8 c7.
Two-way ANOVA: Time versus Hour, Person
Source  DF       SS       MS     F      P
Hour     3   4211.1  1403.68  9.01  0.000
Person  19   2726.5   143.50  0.92  0.562
Error   57   8884.5   155.87
Total   79  15822.0
S = 12.48   R-Sq = 43.85%   R-Sq(adj) = 22.17%
So here is our 2-way ANOVA. The first hypothesis test says that the hypothesis
that hour means are equal is rejected. The high p-value for the second test,
which is above any significance level we might use, tells us that there is no difference
between employee means.
              Individual 95% CIs For Mean Based on
              Pooled StDev
Hour   Mean   ---+---------+---------+---------+-----
0hr   77.85                          (------*------)
2hr   73.15                    (------*------)
4hr   65.10          (------*------)
6hr   59.00   (------*------)
              ---+---------+---------+---------+-----
                56.0      64.0      72.0      80.0
                Individual 95% CIs For Mean Based on
                Pooled StDev
Person   Mean   +---------+---------+---------+--------
A       65.00        (-------*--------)
B       62.00      (-------*--------)
C       67.50          (-------*-------)
D       76.50                (-------*-------)
E       68.50          (--------*-------)
F       70.00           (--------*-------)
G       72.00             (-------*-------)
H       82.25                   (--------*-------)
I       67.50          (-------*-------)
J       65.00        (-------*--------)
K       65.75        (--------*-------)
L       61.50      (-------*-------)
M       68.00          (-------*--------)
N       62.75      (--------*-------)
O       68.25          (--------*-------)
P       57.75   (--------*-------)
Q       74.25              (--------*-------)
R       71.50            (--------*-------)
S       71.50            (--------*-------)
T       78.00                 (-------*-------)
                +---------+---------+---------+--------
                45        60        75        90
Extra Credit: 2) Take the data from your last ANOVA. Use the instructions in 1) above to copy it into the
Minitab spreadsheet and perform Levene and Bartlett tests on it with Minitab, using the third example in
252mvarex as a pattern for your calculations. Make sure that you explain what is being tested and what you
conclude.
MTB > print c1-c5
Data Display
This is just to remind you of the data.
Row  Employee  0hr  2hr  4hr  6hr
  1  A          67   84   52   57
  2  B          64   78   53   53
  3  C          69   74   56   71
  4  D          88   91   66   61
  5  E          72   70   59   73
  6  F          80   73   77   50
  7  G          85   86   64   53
  8  H         116   71   62   80
  9  I          77   76   54   63
 10  J          78   76   65   41
 11  K          68   61   71   63
 12  L          51   62   92   41
 13  M          54   94   71   53
 14  N          75   63   50   63
 15  O          71   70   71   61
 16  P          64   63   58   46
 17  Q          86   66   77   68
 18  R          98   71   53   64
 19  S         103   53   81   49
 20  T          91   81   70   70
MTB > vartest c2-c5;
SUBC> unstacked.
This test was needlessly done twice. This is the unstacked version.
Test for Equal Variances: 0hr, 2hr, 4hr, 6hr
95% Bonferroni confidence intervals for standard deviations
      N    Lower    StDev    Upper
0hr  20  11.4383  16.1385  26.4296
2hr  20   7.4558  10.5195  17.2276
4hr  20   7.9422  11.2057  18.3514
6hr  20   7.5814  10.6968  17.5179
Bartlett's Test (normal distribution)
Test statistic = 5.10, p-value = 0.165
Levene's Test (any continuous distribution)
Test statistic = 1.29, p-value = 0.283
Both p-values are above any significance level that we might use. This means that we
cannot reject the null hypothesis of equal variances.
[Graph: Test for Equal Variances: 0hr, 2hr, 4hr, 6hr] Just a graphic of the info above.
MTB > vartest c6 c7
Look at the stacked data several pages back. This is exactly the same as the last test,
but done on stacked data.
Test for Equal Variances: Time versus Hour
95% Bonferroni confidence intervals for standard deviations
Hour   N    Lower    StDev    Upper
0hr   20  11.4383  16.1385  26.4296
2hr   20   7.4558  10.5195  17.2276
4hr   20   7.9422  11.2057  18.3514
6hr   20   7.5814  10.6968  17.5179
Bartlett's Test (normal distribution)
Test statistic = 5.10, p-value = 0.165
Levene's Test (any continuous distribution)
Test statistic = 1.29, p-value = 0.283
[Graph: Test for Equal Variances: Time versus Hour]
Extra Extra Credit: Do Bartlett and Levene tests using the examples in 252mvar as your pattern. It turns
out that your ANOVA has just enough columns to do this test.
This is an awful lot of work unless you cheat and use the computer. If you cover your tracks, I’ll never
know. To do the Bartlett test you need logarithms of variances. Label Columns 10-12 ‘stdev,’ ‘var’ and
‘log.’ Use the data that you already have in four columns in Minitab c2-c5 (labels in c1) and get the
variances as follows:
MTB > name k2 'stdev1'
MTB > name k3 'stdev2'
MTB > name k4 'stdev3'
MTB > name k5 'stdev4'
MTB > stdev c2 k2
Standard Deviation of 0hr
Standard deviation of 0hr = 16.1385
We are computing standard deviations of the
columns and storing them as the Minitab constants
k2, k3, k4 and k5. We actually want variances.
MTB > stdev c3 k3
Standard Deviation of 2hr
Standard deviation of 2hr = 10.5195
MTB > stdev c4 k4
Standard Deviation of 4hr
Standard deviation of 4hr = 11.2057
MTB > stdev c5 k5
Standard Deviation of 6hr
Standard deviation of 6hr = 10.6968
MTB > print k2-k5
Data Display
stdev1    16.1385
stdev2    10.5195
stdev3    11.2057
stdev4    10.6968
MTB > stack k2-k5 c10
MTB > let c11 = c10*c10
MTB > let c12 = logten(c11)
MTB > let k11 = mean(c11)
MTB > let k12 = logten(k11)
MTB > name k11 'meansdsq'
MTB > name k12 'logmean'
MTB > print k11 - k12
We put the standard deviations in C10 and squared them to get variances.
The mean of the variances (k11) is the pooled variance when you have equal sized samples.
Data Display
meansdsq    152.775
logmean     2.18405
MTB > print c10 - c12
Note that I named my columns.
Data Display
Row    stdev     sdsq  logsdsq
  1  16.1385  260.450  2.41572
  2  10.5195  110.661  2.04399
  3  11.2057  125.568  2.09888
  4  10.6968  114.421  2.05851
Now you are on your own. I’ll finish this if anyone
actually does the Bartlett test.
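For anyone who wants to see where the figures lead, here is a hedged sketch of the remaining arithmetic in plain Python. It uses the standard form of Bartlett's statistic with natural logarithms, which may be arranged differently from the pattern in 252mvar (a base-10 version multiplies the log terms by about 2.3026). The p-value line relies on a closed form for the chi-square tail that holds only for 3 degrees of freedom, which is the case with four columns.

```python
import math

variances = [260.45, 110.6605, 125.5684, 114.4211]  # column variances (sdsq above)
n = 20                                              # observations per column
k = len(variances)
total_df = k * (n - 1)                              # 76

# With equal sample sizes the pooled variance is the plain mean of the variances.
pooled = sum(variances) / k

numerator = total_df * math.log(pooled) - (n - 1) * sum(math.log(v) for v in variances)
correction = 1 + (k / (n - 1) - 1 / total_df) / (3 * (k - 1))
statistic = numerator / correction

# Upper-tail chi-square probability; this closed form is valid for 3 df only.
assert k - 1 == 3
z = statistic / 2
p_value = math.erfc(math.sqrt(z)) + 2 * math.sqrt(z / math.pi) * math.exp(-z)

print(f"Bartlett statistic = {statistic:.2f}, p-value = {p_value:.3f}")
```

The result can be compared with the Bartlett line of the Minitab vartest output; since the p-value is above any significance level we might use, the null hypothesis of equal variances is not rejected.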
Extra Extra Credit: Do Bartlett and Levene tests using the examples in 252mvar as your pattern. It turns
out that your ANOVA has just enough columns to do this test.
The Levene test is longer, but should be much more familiar and perhaps easier to fake.
Copy columns 1 through 5 to c21-c25. Then find their medians and subtract them from the columns and
convert the columns to absolute values.
MTB > name k22 'med1'
MTB > name k23 'med2'
MTB > name k24 'med3'
MTB > name k25 'med4'
MTB > let c21 = c1
MTB > let c22 = c2
MTB > let c23 = c3
MTB > let c24 = c4
MTB > let c25 = c5
I copied my original data to c21-c25.
MTB > let k22 = median(c22)
MTB > let k23 = median(c23)
MTB > let k24 = median(c24)
MTB > let k25 = median(c25)
MTB > let c22 = c22 - k22
MTB > let c23 = c23 - k23
MTB > let c24 = c24 - k24
MTB > let c25 = c25 - k25
I subtracted the median for each column.
MTB > describe c22 - c25
I checked to see if the medians were zero.
Descriptive Statistics: 1-med, 2-med, 3-med, 4-med
Variable   N  N*   Mean  SE Mean  StDev  Minimum      Q1  Median     Q3
1-med     20   0   1.85     3.61  16.14   -25.00   -8.75    0.00  11.50
2-med     20   0   1.15     2.35  10.52   -19.00   -8.25    0.00   8.25
3-med     20   0   0.60     2.51  11.21   -14.50  -10.00    0.00   6.50
4-med     20   0  -2.00     2.39  10.70   -20.00  -10.25    0.00   6.00
Variable  Maximum
1-med       40.00
2-med       22.00
3-med       27.50
4-med       19.00
MTB > print c22 - c25
Data Display
These are the original data with column medians subtracted.
Row  1-med  2-med  3-med  4-med
  1     -9     12  -12.5     -4
  2    -12      6  -11.5     -8
  3     -7      2   -8.5     10
  4     12     19    1.5      0
  5     -4     -2   -5.5     12
  6      4      1   12.5    -11
  7      9     14   -0.5     -8
  8     40     -1   -2.5     19
  9      1      4  -10.5      2
 10      2      4    0.5    -20
 11     -8    -11    6.5      2
 12    -25    -10   27.5    -20
 13    -22     22    6.5     -8
 14     -1     -9  -14.5      2
 15     -5     -2    6.5      0
 16    -12     -9   -6.5    -15
 17     10     -6   12.5      7
 18     22     -1  -11.5      3
 19     27    -19   16.5    -12
 20     15      9    5.5      9
MTB > let c22 = abs(c22)
MTB > let c23 = abs(c23)
MTB > let c24 = abs(c24)
MTB > let c25 = abs(c25)
MTB > print c22 - c25
Data Display
This is the absolute value of the columns we just printed.
Row  1-med  2-med  3-med  4-med
  1      9     12   12.5      4
  2     12      6   11.5      8
  3      7      2    8.5     10
  4     12     19    1.5      0
  5      4      2    5.5     12
  6      4      1   12.5     11
  7      9     14    0.5      8
  8     40      1    2.5     19
  9      1      4   10.5      2
 10      2      4    0.5     20
 11      8     11    6.5      2
 12     25     10   27.5     20
 13     22     22    6.5      8
 14      1      9   14.5      2
 15      5      2    6.5      0
 16     12      9    6.5     15
 17     10      6   12.5      7
 18     22      1   11.5      3
 19     27     19   16.5     12
 20     15      9    5.5      9
MTB > AOVO c22 - c25
We now do an ordinary 1-way ANOVA.
One-way ANOVA: 1-med, 2-med, 3-med, 4-med
Source  DF      SS    MS     F      P
Factor   3   220.1  73.4  1.29  0.283
Error   76  4314.9  56.8
Total   79  4535.0
S = 7.535   R-Sq = 4.85%   R-Sq(adj) = 1.10%
Since the p-value is above any significance level that we might use, we cannot reject the null
hypothesis of equal variances.
                            Individual 95% CIs For Mean Based on
                            Pooled StDev
Level   N    Mean   StDev   ----+---------+---------+---------+----
1-med  20  12.350  10.174                 (----------*----------)
2-med  20   8.150   6.491   (----------*----------)
3-med  20   9.000   6.378      (----------*----------)
4-med  20   8.600   6.386    (-----------*----------)
                            ----+---------+---------+---------+----
                               6.0       9.0      12.0      15.0
Pooled StDev = 7.535
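The Minitab steps above (subtract each column's median, take absolute values, run a one-way ANOVA) can be condensed into a short Python sketch. It is offered as an illustration of this median-based form of the Levene test, not as the course's required procedure; the helper name `levene_f` is hypothetical.

```python
from statistics import median

times = {
    "0hr": [67, 64, 69, 88, 72, 80, 85, 116, 77, 78, 68, 51, 54, 75, 71, 64, 86, 98, 103, 91],
    "2hr": [84, 78, 74, 91, 70, 73, 86, 71, 76, 76, 61, 62, 94, 63, 70, 63, 66, 71, 53, 81],
    "4hr": [52, 53, 56, 66, 59, 77, 64, 62, 54, 65, 71, 92, 71, 50, 71, 58, 77, 53, 81, 70],
    "6hr": [57, 53, 71, 61, 73, 50, 53, 80, 63, 41, 63, 41, 53, 63, 61, 46, 68, 64, 49, 70],
}


def levene_f(groups):
    """F ratio of a one-way ANOVA on absolute deviations from each group's median."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    all_values = [d for g in deviations for d in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in deviations]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(deviations, means))
    ss_within = sum((d - m) ** 2 for g, m in zip(deviations, means) for d in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), means


f_stat, deviation_means = levene_f(list(times.values()))
print(f"Levene F = {f_stat:.2f}, mean absolute deviations = {deviation_means}")
```

Adding a constant to a column shifts its median by the same amount, so the absolute deviations and the F ratio are the same whether the original or the shifted first column is used.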
Game over.