1.
a. The error sum of squares is the sum of the squared deviations of each observation from its group average.
b. The within-group sum of squares is the sum of squares within each group.
c. The between-group sum of squares is the sum of the squared deviations of each group average from the overall mean.
d. The mean square error is the error sum of squares divided by its degrees of freedom.
2.
The mean square error
3.
F = 3.5, p-value = 0.0352
4.
The Bonferroni correction is a method for adjusting probabilities when making multiple comparisons. You divide the significance level required for statistical significance by the number of planned comparisons you wish to make between the various factor levels, and test each comparison at that reduced level.
5.
a. 0.0389
b. 0.0175
c. 0.0082
d. 0.0040
e. 0.0021
6. a.
The completed table is:
Term          SS        df    MS           F
Region        9,305     3     3,101.67     3.025
Method        12,204    1     12,204.00    11.903
Interaction   6,023     3     2,007.67     1.958
Error         32,809    32    1,025.28
Total         60,341    39
b.
R² = 0.4563
c.
The p-value for the Region term is 0.0437.
The p-value for the Method term is 0.0016.
The p-value for the Interaction term is 0.1401.
d.
Both the Region and the Method effects are statistically significant. There is no statistical evidence of an interaction between the two factors.
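The arithmetic behind the completed table can be sketched in a few lines of Python. This is only an illustration using the sums of squares and degrees of freedom reported above; the variable names are hypothetical and the p-values are not computed, since that needs an F distribution function outside the standard library.

```python
# Complete a two-way ANOVA table from SS and df (values from the table above).
terms = {
    "Region": (9305.0, 3),
    "Method": (12204.0, 1),
    "Interaction": (6023.0, 3),
}
ss_error, df_error = 32809.0, 32
ss_total = 60341.0

mse = ss_error / df_error                 # mean square error
rows = {}
for name, (ss, df) in terms.items():
    ms = ss / df                          # mean square for the term
    rows[name] = (ms, ms / mse)           # (MS, F-ratio)

# R^2 is the share of the total sum of squares not left in the error term.
r_squared = 1.0 - ss_error / ss_total

for name, (ms, f) in rows.items():
    print(f"{name:12s} MS = {ms:10.2f}  F = {f:6.3f}")
print(f"MSE = {mse:.2f}  R^2 = {r_squared:.4f}")
```

Each F-ratio is the term's mean square divided by the same error mean square, which is why a single MSE value appears in every row of the loop.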
Chapter 10: Analysis of Variance
7. a.
Use the delete command to remove the data. b.
The one-way ANOVA is:
SUMMARY
Groups   Count   Sum    Average   Variance
LA       8       1083   135.38    980.84
SF       7       946    135.14    1524.81
DC       8       1083   135.38    1771.13
NY       8       1585   198.13    1649.55
ANOVA
Source of Variation   SS         df   MS         F       P-value   F crit
Between Groups        23424.26   3    7808.087   5.276   0.005     2.960
Within Groups         39959.48   27   1479.981
Total                 63383.74   30
c.
The means matrix is:
                     City = LA   City = SF   City = DC   City = NY
Count                8.000       7.000       8.000       8.000
Average              135.375     135.143     135.375     198.125
Standard Deviation   31.318      39.049      42.085      40.615
Minimum              79.000      99.000      64.000      135.000
Maximum              175.000     185.000     189.000     250.000
Pairwise Mean Difference (row - column)
            City = LA   City = SF   City = DC   City = NY
City = LA   0.000       0.232       0.000       -62.750
City = SF               0.000       -0.232      -62.982
City = DC                           0.000       -62.750
City = NY                                       0.000
MSE = 1479.98082010582
Pairwise Probabilities (Bonferroni Correction)
            City = LA   City = SF   City = DC   City = NY
City = LA   -           1.000       1.000       0.018
City = SF               -           1.000       0.023
City = DC                           -           0.018
City = NY                                       -
d.
There is a significant difference in hotel prices between those in New York and those in Los Angeles, San Francisco, and Washington D.C. By removing the outlier from the San Francisco data set, we've reduced the MSE of the analysis and thus the pairwise differences are statistically significant where they weren't before.
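To see why a smaller MSE makes the pairwise differences easier to detect, the sketch below computes a t statistic for each pair of cities from the group sizes, means, and MSE reported in the means matrix. It assumes the pairwise tests use the pooled MSE from the ANOVA as the variance estimate, which is the usual approach but is not confirmed by the output itself; the names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Group sizes and means from the descriptive statistics; MSE from the ANOVA.
counts = {"LA": 8, "SF": 7, "DC": 8, "NY": 8}
means = {"LA": 135.375, "SF": 135.143, "DC": 135.375, "NY": 198.125}
mse = 1479.98082010582

t_stats = {}
for a, b in combinations(counts, 2):
    # Standard error of the difference between two group means (assumed pooled MSE).
    se = sqrt(mse * (1 / counts[a] + 1 / counts[b]))
    t_stats[(a, b)] = (means[a] - means[b]) / se

for pair, t in t_stats.items():
    print(pair, round(t, 3))
```

Because the MSE sits under the square root in the denominator, any reduction in MSE inflates every t statistic, and only the three comparisons involving New York are large in absolute value.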
8. a.
The interaction plot appears as:
[Interaction plot: Average of Price versus Stars (2, 3, 4), with one line for each City (DC, LA, NY, SF)]
The lines are not exactly parallel, so there may be an interaction between the number of stars and the city with respect to the hotel price, but if it exists, it will be a small one. b.
The two-way ANOVA is:
Source of Variation   SS          df   MS           F       P-value   F crit
Sample                23,794.08   2    11,897.042   9.214   0.004     3.885
Columns               20,462.79   3    6,820.931    5.283   0.015     3.490
Interaction           5,295.58    6    882.597      0.684   0.667     2.996
Within                15,493.50   12   1,291.125
Total                 65,045.96   23
c.
The City effect is significant with a p-value of 0.015. The Stars effect is significant with a p-value of 0.004. There is no significant interaction between the City and Stars factors.
Average 2-star price: $131.50
Average 3-star price: $170.50
Average 4-star price: $208.63
d.
In both cases the City factor is significant. By performing the two-way analysis we can discount hotel quality as the reason for the difference, since the price still differs between cities after adjusting for the effect of hotel quality.
e.
The graph is:
[Scatter plot: Price vs. Stars, with a fitted line for each city (DC, LA, NY, SF)]
SF: y = 56.5x - 15.833
LA: y = 19x + 75.167
DC: y = 42.5x + 60
NY: y = 36.25x + 98.75
The slopes appear similar for San Francisco, Washington, and New York, with slopes ranging from 36.25 to 56.5 dollars per star. However, the slope appears much lower for Los Angeles, with a value of $19 per star.
9. a.
The boxplot appears as:
[Boxplot by Cola: coke, pepsi, shasta, generic]
The multiple histograms appear as:
[Histograms by Cola: coke, pepsi, shasta, generic]
b.
The one-way ANOVA is:
Source of Variation   SS           df   MS          F       P-value    F crit
Between Groups        183,750.50   3    61,250.17   33.54   1.94E-11   2.816
Within Groups         80,355.96    44   1,826.27
Total                 264,106.46   47
c.
The means matrix is:
                     Cola = coke   Cola = pepsi   Cola = shasta   Cola = generic
Count                12.000        12.000         12.000          12.000
Average              307.275       142.442        275.725         239.958
Standard Deviation   34.614        29.554         42.670          58.419
Minimum              245.800       89.700         214.600         156.100
Maximum              362.900       210.700        362.900         327.800
Pairwise Mean Difference (row - column)
                 Cola = coke   Cola = pepsi   Cola = shasta   Cola = generic
Cola = coke      0.000         164.833        31.550          67.317
Cola = pepsi                   0.000          -133.283        -97.517
Cola = shasta                                 0.000           35.767
Cola = generic                                                0.000
MSE = 1826.27189393939
Pairwise Probabilities (Bonferroni Correction)
                 Cola = coke   Cola = pepsi   Cola = shasta   Cola = generic
Cola = coke      -             0.000          0.464           0.002
Cola = pepsi                   -              0.000           0.000
Cola = shasta                                 -               0.278
Cola = generic                                                -
The pairs with significant differences in foam volume are: (Coke, Pepsi), (Coke, Generic), (Pepsi, Shasta), and (Pepsi, Generic).
10. a.
The multiple histograms appear as:
The histograms show that tuition drops as you go from the most prestigious schools to the least prestigious. Notice that the spread is much narrower in the first group, which suggests a possible problem with unequal variance. However, the analysis of variance is fairly robust with respect to this assumption.
b.
The one-way ANOVA is:
SUMMARY
Groups   Count   Sum     Average     Variance
1        6       93754   15,625.67   288,624.67
2        6       83038   13,839.67   5,679,268.27
3        6       65550   10,925.00   9,738,750.00
4        6       60948   10,158.00   2,854,328.00
ANOVA
Source of Variation   SS         df   MS              F      P-value   F crit
Between Groups        1.17E+08   3    38,909,841.06   8.39   0.00083   3.098
Within Groups         92804855   20   4,640,242.73
Total                 2.1E+08    23
The p -value of 0.00083 allows us to reject the hypothesis of equal means at the 0.1% level. Group does have a significant effect. Note that the means of the four groups decrease with the group numbers, meaning that there is a possible trend towards lower tuition for colleges lower in the rating system. c.
The means matrix is:
Descriptive Statistics
                     Group = 1   Group = 2   Group = 3   Group = 4
Count                6.000       6.000       6.000       6.000
Average              15,626      13,840      10,925      10,158
Standard Deviation   537.238     2383.122    3120.697    1689.476
Minimum              14,710      10,945      6,450       8,150
Maximum              16,250      17,200      15,000      12,400
Pairwise Mean Difference (row - column)
            Group = 1   Group = 2   Group = 3   Group = 4
Group = 1   0.000       1786.000    4700.667    5467.667
Group = 2               0.000       2914.667    3681.667
Group = 3                           0.000       767.000
Group = 4                                       0.000
MSE = 4640242.73333335
Pairwise Probabilities (Bonferroni Correction)
            Group = 1   Group = 2   Group = 3   Group = 4
Group = 1   -           0.999       0.007       0.002
Group = 2               -           0.177       0.046
Group = 3                           -           1.000
Group = 4                                       -
There are significant differences between groups that are not adjacent, but adjacent groups do not differ significantly. The first quartile is significantly more expensive than groups 3 and 4, but not group 2. The bottom quartile is significantly less expensive than the first two quartiles, but not the third quartile.
11.
d.
The results of the regression command are:
Regression Statistics
Multiple R          0.731
R Square            0.534
Adjusted R Square   0.513
Standard Error      2106.081
Observations        24
ANOVA
             df   SS               MS         F        Significance F
Regression   1    111,951,673.63   1.12E+08   25.239   4.97237E-05
Residual     22   97,582,704.20    4435577
Total        23   209,534,377.83
            Coefficients   Standard Error   t Stat   P-value    Lower 95%    Upper 95%
Intercept   17,466.50      1053.041         16.587   6.39E-14   15,282.625   19,650.375
Group       -1,931.77      384.516          -5.024   4.97E-05   -2,729.205   -1,134.328
The coefficient for Group is –1,932, which means that tuition drops by about $1,932 when the Group is increased by 1 quartile.
e.
F-ratio = [(97,582,704 – 92,804,855)/2]/4,640,243 = 0.515
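The F-ratio above compares the straight-line regression on Group with the one-way ANOVA: the numerator is the extra residual sum of squares left by the regression, divided by the difference in residual degrees of freedom, and the denominator is the ANOVA mean square error. A minimal sketch of that arithmetic, using the values from the two outputs (variable names are illustrative):

```python
# Residual sum of squares and df from the linear regression on Group.
sse_regression, df_regression = 97_582_704.20, 22
# Within-groups sum of squares, df, and MSE from the one-way ANOVA.
sse_anova, df_anova = 92_804_855.0, 20
mse_anova = 4_640_242.73

# Extra variation the regression leaves unexplained, per extra degree of freedom.
extra_ms = (sse_regression - sse_anova) / (df_regression - df_anova)
f_ratio = extra_ms / mse_anova
print(round(f_ratio, 3))
```

A ratio well below 1, as here, suggests that the four separate group means fit little better than a straight line in Group.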
a.
Use the delete command to delete the appropriate rows from the worksheet. b.
The boxplot appears as:
[Boxplot of students per computer for Group = 1 through Group = 4]
The outlier is Hampshire College at 55 students per computer.
c.
The one-way ANOVA is:
SUMMARY
Groups   Count   Sum      Average   Variance
1        3       20.790   6.930     22.270
2        5       77.040   15.408    497.568
3        3       32.610   10.870    20.379
4        3       23.310   7.770     6.305
ANOVA
Source of Variation   SS         df   MS        F       P-value   F crit
Between Groups        178.192    3    59.397    0.284   0.836     3.708
Within Groups         2088.179   10   208.818
Total                 2266.371   13
The high p-value of 0.836 does not permit rejection of the hypothesis of equal means among the four groups. Note the very high variance in the second group due to the outlier from Hampshire College.
d.
Without the outlier the one-way ANOVA is:
SUMMARY
Groups   Count   Sum     Average   Variance
1        3       20.79   6.93      22.270
2        4       22.04   5.51      10.288
3        3       32.61   10.87     20.379
4        3       23.31   7.77      6.305
ANOVA
Source of Variation   SS        df   MS       F       P-value   F crit
Between Groups        50.984    3    16.995   1.188   0.368     3.863
Within Groups         128.771   9    14.308
Total                 179.756   12
With a p-value of 0.368 there is still not any indication of a significant difference between groups.
Does this mean that the four quartiles are equal in terms of access to computers for their students?
Because of missing data, the analysis here is limited to just a few colleges and you should be reluctant to make any assertions about equal access based on such small samples. It is conceivable that a larger sample would find a significant difference between quartiles.
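The ANOVA without the outlier can be rebuilt from the SUMMARY statistics alone, which makes it easy to see how much of the original within-groups variation came from one observation. The sketch below is an illustration that uses only the counts, averages, and variances reported for the reduced data set; it does not compute the p-value.

```python
# (count, mean, variance) for Groups 1-4 after removing the outlier.
groups = [
    (3, 6.93, 22.270),
    (4, 5.51, 10.288),
    (3, 10.87, 20.379),
    (3, 7.77, 6.305),
]

n_total = sum(n for n, _, _ in groups)
grand_mean = sum(n * m for n, m, _ in groups) / n_total

# Between-groups SS: squared deviations of group means, weighted by group size.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
# Within-groups SS: each sample variance times its degrees of freedom.
ss_within = sum((n - 1) * v for n, _, v in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(round(ss_between, 3), round(ss_within, 3), round(f_stat, 3))
```

The same function of counts, means, and variances applies to the data set with the outlier; there the second group's variance of 497.568 dominates the within-groups sum of squares.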
12. a.
The boxplot and multiple histograms appear as:
[Boxplot and histograms of Salary by Position: 3B, SS, 2B, 1B]
b.
The plots for LN Salary are:
[Boxplot and histograms of LN Salary by Position: 3B, SS, 2B, 1B]
The distribution of Salary is highly skewed, whereas the distribution of LN Salary is more symmetric.
c.
The results of the one-way ANOVA are:
SUMMARY
Groups   Count   Sum       Average   Variance
3B       30      181.243   6.041     0.897
SS       26      147.114   5.658     0.820
2B       26      158.635   6.101     0.542
1B       24      148.968   6.207     1.171
ANOVA
Source of Variation   SS       df    MS      F       P-value   F crit
Between Groups        4.384    3     1.461   1.713   0.169     2.694
Within Groups         87.012   102   0.853
Total                 91.395   105
The p-value is 0.169, so we cannot claim that there is a significant difference between positions.
Although the middle infielders (second base and shortstop) are not necessarily as productive as hitters, they compensate with their fielding and therefore earn salaries comparable to the others. For example, Ozzie Smith, a top shortstop, was not a great hitter but was paid well for his fielding. Furthermore, with LN Salary, there would need to be great disparities in salary to see significant differences using ANOVA, because the logarithmic transformation reduces the differences between numbers so dramatically.
13. a.
The boxplot and histograms appear as:
[Boxplot and histograms of RBI average by Position: 3B, SS, 2B, 1B]
The plots do not give us any strong reason to disbelieve a hypothesis of equal variance between the groups.
b.
The one-way ANOVA is:
SUMMARY
Groups   Count   Sum       Average   Variance
3B       30      3.77774   0.12592   0.00074
SS       26      2.34500   0.09019   0.00046
2B       26      2.34864   0.09033   0.00042
1B       24      3.52020   0.14668   0.00055
ANOVA
Source of Variation   SS       df    MS      F        P-value   F crit
Between Groups        0.0591   3     0.020   35.789   0.000     2.694
Within Groups         0.0562   102   0.001
Total                 0.1153   105
The p -value is < 0.001, indicating that we can reject the null hypothesis that there is no difference between the groups and accept the alternative hypothesis that there exists a difference in RBI average for players from different positions. c.
The means matrix is:
Descriptive Statistics
                     Position = "3B"   Position = "SS"   Position = "2B"   Position = "1B"
                     RBI Aver          RBI Aver          RBI Aver          RBI Aver
Count                30                26                26                24
Average              0.12592           0.09019           0.09033           0.14668
Standard Deviation   0.027206          0.021363          0.020535          0.023534
Sum of Squares       0.497175          0.222910          0.222700          0.529065
Pairwise Mean Difference (row - column)
       "3B"    "SS"    "2B"    "1B"
"3B"   0.000   0.036   0.036   -0.021
"SS"           0.000   0.000   -0.056
"2B"                   0.000   -0.056
"1B"                           0.000
MSE = 5.50540487668649E-04
Pairwise Probabilities (Bonferroni Correction)
       "3B"    "SS"    "2B"    "1B"
"3B"   -       0.000   0.000   0.010
"SS"           -       1.000   0.000
"2B"                   -       0.000
"1B"                           -
First basemen have significantly higher RBI averages than all other position players. Shortstops and second basemen are significantly lower in their RBI averages than third basemen. Usually the second basemen and the shortstop are small and less powerful because they must be agile and quick for defense. First base is the least demanding defensive position in terms of agility, so the first baseman is often the biggest and most powerful hitter among the infield players.
14. a.
The results of the two-sample t-test are:
Descriptive Statistics
      N    Mean       Std. Dev.   Std. Err.
5sp   10   4,754.00   2,862.252   905.124
at    15   8,860.13   3,102.878   801.160
t-Test Analysis
Mean Diff.   Std. Err.   t        df      p-value   lower 95%   upper 95%
-4,106.13    1,229.240   -3.340   23.00   0.003     -6,649.01   -1,563.26
Equality of Variance Tests
F-Test     0.829
Bartlett   0.795
Levene     0.951
b.
The results of the one-way ANOVA are:
SUMMARY
Groups   Count   Sum          Average    Variance
5sp      10      47,540.00    4,754.00   8,192,487.78
at       15      132,902.00   8,860.13   9,627,853.12
ANOVA
Source of Variation   SS         df   MS               F       P-value   F crit
Between Groups        1.01E+08   1    101,161,985.71   11.16   0.0028    4.2793
Within Groups         2.09E+08   23   9,066,188.42
Total                 3.1E+08    24
c.
(t-ratio)² = (–3.340)² = 11.160 = F-ratio
d.
The results of the ANOVA are:
SUMMARY
Groups   Count   Sum   Average   Variance
5sp      10      64    6.400     12.489
at       15      52    3.467     7.838
ANOVA
Source of Variation   SS        df   MS        F        P-value   F crit
Between Groups        51.627    1    51.6267   5.3455   0.0301    4.2793
Within Groups         222.133   23   9.6580
Total                 273.76    24
Automatic transmissions are significantly more expensive, but they're also (in this data set) significantly younger. Therefore it's hard to determine whether the difference in price is a result of the transmission or the age of the model.
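The identity in part c, that the squared t-ratio equals the F-ratio when there are only two groups, can be checked directly from the price summary statistics. The following sketch assumes the pooled-variance (equal variance) form of the two-sample t-test, which matches the 23 degrees of freedom reported in part a; variable names are illustrative.

```python
from math import sqrt

# Price summary statistics from part b: count, sum, variance.
n1, sum1, var1 = 10, 47_540.00, 8_192_487.78     # 5sp
n2, sum2, var2 = 15, 132_902.00, 9_627_853.12    # at
mean1, mean2 = sum1 / n1, sum2 / n2

# Pooled variance; this is also the within-groups mean square of the ANOVA.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA with two groups: between-groups df is 1, so MS equals SS.
grand_mean = (sum1 + sum2) / (n1 + n2)
ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
f = ss_between / pooled_var

print(round(t, 3), round(t ** 2, 2), round(f, 2))
```

The agreement is exact rather than approximate: with two groups the between-groups sum of squares reduces algebraically to the squared mean difference divided by (1/n1 + 1/n2).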
e.
The slope and confidence interval for the 5-speed transmission is –720.16 (–1,022.32, –418.00).
For the automatic transmission it is –900.36 (–1,287.60, –513.12).
The slopes are not exactly the same. From the intercepts, it would appear that new automatics are more expensive, but from the slopes they seem to depreciate faster as well (another explanation is that most of them are not as old as the 5-speeds and are in the stage of rapid depreciation). This problem attempts to incorporate the fact that the 5-speed transmission cars are older by extrapolating the effects of age using linear regression.
It is difficult to draw conclusions about the value of used cars from the two regression lines because the two confidence intervals overlap. Also, there is not much data in this data set, nor are the data series necessarily linear. The chief problem with this data set is that the type of transmission is not independent of age, which has such a dramatic effect on price. Because of these issues, it is not easy to draw any firm conclusions about the relationship between price and transmission type.
15. a.
The boxplots and histograms are:
[Boxplots and histograms for each Trans and Age combination: Age = 1, 4, and 6, each for 5sp and at]
There is not enough data to determine whether the variance differs between the groups. There do not appear to be any large violations of the constant variance assumption.
b.
The interaction plot appears as:
[Interaction Plot: average price versus Age (1 to 3, 4 to 5, 6 or more), with one line each for 5sp and at]
There is no evidence of a strong interaction effect.
c.
The results of the two-way ANOVA are:
SUMMARY    1 to 3       4 to 5      6 or more   Total
5sp
Count      2            2           2           6
Sum        13,250       16,990      5,400       35,640
Average    6,625        8,495       2,700       5,940
Variance   781,250      500,000     1,280,000   7,510,190
at
Count      2            2           2           6
Sum        19,895       16,687      7,250       43,832
Average    9,948        8,344       3,625       7,305
Variance   18,574,513   3,615,361   6,661,250   14,411,700
Total
Count      4            4           4
Sum        33,145       33,677      12,650
Average    8,286        8,419       3,163
Variance   10,131,590   1,379,438   2,932,292
ANOVA
Source of Variation   SS               df   MS              F      P-value   F crit
Sample                5,592,405.33     1    5,592,405.33    1.07   0.34      5.99
Columns               71,871,898.17    2    35,935,949.08   6.86   0.03      5.14
Interaction           6,325,178.17     2    3,162,589.08    0.60   0.58      5.14
Within                31,412,373.00    6    5,235,395.50
Total                 115,201,854.67   11
The Columns sum of squares is large relative to the others and also shows significance with a p-value of 0.028. This means that the age effect is responsible for much of the variance in price.
The ANOVA accounts for 73% of the variation in price.
16. a.
There is an outlier in Heat 7 in which a runner finished in 22.69 seconds.
[Boxplot of race times by Heat, for Heats 1 through 12]
Aside from the outlier, there is no strong visual evidence that the race times for one heat are substantially different from race times from other heats. b.
The results of the one-way ANOVA are:
SUMMARY
Groups   Count   Sum      Average   Variance
1        9       96.09    10.68     0.27
2        8       85.07    10.63     0.49
3        9       94.00    10.44     0.09
4        9       93.84    10.43     0.05
5        9       96.15    10.68     0.12
6        9       94.43    10.49     0.03
7        9       108.48   12.05     16.09
8        9       95.07    10.56     0.08
9        9       94.50    10.50     0.04
10       9       95.38    10.60     0.08
11       8       83.82    10.48     0.06
12       9       96.51    10.72     0.15
ANOVA
Source of Variation   SS       df    MS     F      P-value   F crit
Between Groups        19.19    11    1.74   1.17   0.32      1.89
Within Groups         139.95   94    1.49
Total                 159.15   105
The p -value for the ANOVA is 0.317, so we fail to reject the null hypothesis that the mean race times of the 12 heats are equal.
c.
The pairwise mean differences are:
Pairwise Mean Difference (row - column)
     1       2       3       4       5       6       7       8       9       10      11      12
1    0.000   0.043   0.232   0.250   -0.007  0.184   -1.377  0.113   0.177   0.079   0.199   -0.047
2            0.000   0.189   0.207   -0.050  0.142   -1.420  0.070   0.134   0.036   0.156   -0.090
3                    0.000   0.018   -0.239  -0.048  -1.609  -0.119  -0.056  -0.153  -0.033  -0.279
4                            0.000   -0.257  -0.066  -1.627  -0.137  -0.073  -0.171  -0.051  -0.297
5                                    0.000   0.191   -1.370  0.120   0.183   0.086   0.206   -0.040
6                                            0.000   -1.561  -0.071  -0.008  -0.106  0.015   -0.231
7                                                    0.000   1.490   1.553   1.456   1.576   1.330
8                                                            0.000   0.063   -0.034  0.086   -0.160
9                                                                    0.000   -0.098  0.023   -0.223
10                                                                           0.000   0.120   -0.126
11                                                                                   0.000   -0.246
12                                                                                           0.000
MSE = 1.4888752216312
Pairwise Probabilities (Bonferroni Correction)
     1       2       3       4       5       6       7       8       9       10      11      12
1    -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
2            -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
3                    -       1.000   1.000   1.000   0.413   1.000   1.000   1.000   1.000   1.000
4                            -       1.000   1.000   0.378   1.000   1.000   1.000   1.000   1.000
5                                    -       1.000   1.000   1.000   1.000   1.000   1.000   1.000
6                                            -       0.522   1.000   1.000   1.000   1.000   1.000
7                                                    -       0.733   0.542   0.861   0.610   1.000
8                                                            -       1.000   1.000   1.000   1.000
9                                                                    -       1.000   1.000   1.000
10                                                                           -       1.000   1.000
11                                                                                   -       1.000
12                                                                                           -
There are no significant pairwise differences.
d.
The significance level for this test is 5%, and since the recorded p-value of 0.317 is greater than 0.05, we do not reject the null hypothesis. The data are consistent with equal average race times for the 12 heats, which suggests that the best runners in the world were evenly divided among the 12 heats.
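With 12 heats there are many pairwise comparisons, which is why nearly every Bonferroni probability in part c is 1.000. The sketch below shows the adjustment under the common convention of multiplying each unadjusted p-value by the number of comparisons and capping the result at 1; whether the software that produced the table uses exactly this rule is an assumption, and the function name is hypothetical.

```python
from math import comb

def bonferroni_adjust(p_value, n_comparisons):
    """Return the Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)

n_heats = 12
n_comparisons = comb(n_heats, 2)          # number of distinct pairs of heats
alpha_per_comparison = 0.05 / n_comparisons

print(n_comparisons, round(alpha_per_comparison, 5))
# An unadjusted p-value of 0.02 would look significant for a single test,
# but is no longer significant after the adjustment.
print(bonferroni_adjust(0.02, n_comparisons))
```

Equivalently, a pair of heats would need an unadjusted p-value below 0.05 divided by the number of pairs to be declared different at the 5% level.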
17.
The boxplots of the reaction times are:
[Boxplots of reaction time by Heat, for Heats 1 through 12]
The ANOVA is:
ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Between Groups        0.005739   11    0.000522   1.269628   0.254133   1.891991
Within Groups         0.038629   94    0.000411
Total                 0.044368   105
It is difficult from the boxplots to make any conclusions regarding the reaction times in the different heats. It is possible that Heat 10 has higher reaction times than other heats and Heat 11 has lower, but nothing substantial appears in the plot. The p -value for the ANOVA is 0.254, causing us to fail to reject the null hypothesis that the mean reaction times of the 12 heats are equal.
The means matrix shown below also does not indicate any pairwise differences.
Pairwise Mean Difference (row - column)
     1       2       3       4       5       6       7       8       9       10      11      12
1    0.000   0.006   -0.004  -0.005  0.003   0.000   -0.010  0.003   0.010   -0.007  0.018   0.002
2            0.000   -0.011  -0.011  -0.003  -0.006  -0.017  -0.004  0.004   -0.013  0.012   -0.004
3                    0.000   -0.001  0.007   0.004   -0.006  0.007   0.014   -0.002  0.023   0.007
4                            0.000   0.008   0.005   -0.005  0.008   0.015   -0.002  0.023   0.007
5                                    0.000   -0.003  -0.013  0.000   0.007   -0.010  0.015   -0.001
6                                            0.000   -0.010  0.003   0.010   -0.007  0.018   0.002
7                                                    0.000   0.013   0.020   0.004   0.029   0.013
8                                                            0.000   0.007   -0.009  0.016   0.000
9                                                                    0.000   -0.017  0.008   -0.008
10                                                                           0.000   0.025   0.009
11                                                                                   0.000   -0.016
12                                                                                           0.000
MSE = 4.10947695035463E-04
Pairwise Probabilities (Bonferroni Correction)
     1       2       3       4       5       6       7       8       9       10      11      12
1    -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
2            -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
3                    -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
4                            -       1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
5                                    -       1.000   1.000   1.000   1.000   1.000   1.000   1.000
6                                            -       1.000   1.000   1.000   1.000   1.000   1.000
7                                                    -       1.000   1.000   1.000   0.299   1.000
8                                                            -       1.000   1.000   1.000   1.000
9                                                                    -       1.000   1.000   1.000
10                                                                           -       0.848   1.000
11                                                                                   -       1.000
12                                                                                           -
18. a.
The two factors are the runner and the round. The ANOVA table appears as follows (after formatting):
Source of Variation   SS        df   MS        F         P-value   F crit
Rows                  0.01225   13   0.00094   5.23409   0.00017   2.11917
Columns               0.00155   2    0.00078   4.31024   0.02417   3.36901
Error                 0.00468   26   0.00018
Total                 0.01849   41
b.
Based on the ANOVA, we reject the null hypothesis that the runners have equal reaction times (p-value = 0.00017), and we reject the null hypothesis that the reaction times are the same in every round, concluding that they differ from one round to another (p-value = 0.02417).
The R² value is equal to the sum of squares for the rows and columns divided by the total sum of squares. In this case that is (0.01225 + 0.00155)/(0.01849) = 0.746746. So about 75% of the variation in reaction times can be explained by the runner and round factors.
c.
Round 1: Average Reaction = 0.1675 (std. dev. = 0.0211)
Round 2: Average Reaction = 0.1640 (std. dev. = 0.0299)
Round 3: Average Reaction = 0.1532 (std. dev. = 0.0182)
It would appear that the reaction times decrease as the competition proceeds through each round in the meet. d.
Round 1 vs. Round 2 t-test: Paired Difference = 0.004, p-value = 0.541
Round 2 vs. Round 3 t-test: Paired Difference = 0.011, p-value = 0.060
Round 1 vs. Round 3 t-test: Paired Difference = 0.014, p-value = 0.006
The difference between Round 1 and Round 3 reaction times is significant, and the difference between Round 2 and Round 3 is marginal (p-value = 0.060, just above the 5% level). In both cases the earlier round had the slower reaction time. This lends credence to the theory that reaction times decrease as the runner proceeds through the three rounds.
e.
The interaction plot is:
[Interaction plot: Reaction Times for each runner, by Round (Round 1, Round 2, Round 3)]
Since the line for each runner appears different (some reaction times stay level through the three rounds, others go up in Round 2, and so forth), it seems likely that there is an interaction between runner and round. This means that the conclusion that reaction times decrease with each round may not hold for each runner in the competition.
19. a.
The two-way table appears as:
ActiveHR Height
Frequency 0 1
0 75 108
99 120
93 93
87 99
84 99
1 84 141
93 99
96 111
90 129
108 90
2 93 135
129 153
99 129
123 120
96 147
b.
The two-way ANOVA is:
ANOVA
Source of Variation   SS        df   MS        F        P-value   F crit
Sample                3727.80   2    1863.90   9.551    0.001     3.403
Columns               3499.20   1    3499.20   17.931   0.000     4.260
Interaction           210.60    2    105.30    0.540    0.590     3.403
Within                4683.60   24   195.15
Total                 12121.2   29
The Height and Frequency factors are both significant, but the interaction between the two factors is not.
c.
The interaction plot is:
[Interaction plot: Average of ActiveHR versus Frequency (0, 1, 2), with one line for each Height (0, 1)]
The lines are roughly parallel indicating no interaction between the factors. d.
The two-way table for the change in HR is:
DiffHR Height
Frequency 0 1
0 15 39
9 33
6 15
9 15
0 15
1 21 45
24 27
15 24
15 39
18 6
2 24 66
42 60
27 51
48 39
18 57
The two-way ANOVA is:
ANOVA
Source of Variation   SS        df   MS        F        P-value   F crit
Sample                4048.80   2    2024.40   17.995   0.000     3.403
Columns               1920.00   1    1920.00   17.067   0.000     4.260
Interaction           218.40    2    109.20    0.971    0.393     3.403
Within                2700.00   24   112.50
Total                 8887.2    29
As before, the Height and Frequency factors are significant, but the interaction term is not.
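The sums of squares in the DiffHR table can be reproduced from the raw two-way table in part d. The sketch below is an illustration for a balanced design (the same number of observations in every cell, as here); it computes the sums of squares and F-ratios but not the p-values, and the helper names are hypothetical.

```python
# DiffHR observations keyed by (Frequency, Height), five per cell,
# copied from the two-way table in part d.
data = {
    (0, 0): [15, 9, 6, 9, 0],
    (0, 1): [39, 33, 15, 15, 15],
    (1, 0): [21, 24, 15, 15, 18],
    (1, 1): [45, 27, 24, 39, 6],
    (2, 0): [24, 42, 27, 48, 18],
    (2, 1): [66, 60, 51, 39, 57],
}

def mean(values):
    return sum(values) / len(values)

def level_obs(index, level):
    """All observations whose key has `level` at position `index`."""
    return [x for key, cell in data.items() if key[index] == level for x in cell]

freqs = sorted({f for f, _ in data})
heights = sorted({h for _, h in data})
all_obs = [x for cell in data.values() for x in cell]
grand = mean(all_obs)

# Main-effect SS: squared deviations of level means, weighted by level size.
ss_freq = sum(len(level_obs(0, f)) * (mean(level_obs(0, f)) - grand) ** 2 for f in freqs)
ss_height = sum(len(level_obs(1, h)) * (mean(level_obs(1, h)) - grand) ** 2 for h in heights)
# Interaction SS is what the cell means explain beyond the two main effects.
ss_cells = sum(len(c) * (mean(c) - grand) ** 2 for c in data.values())
ss_inter = ss_cells - ss_freq - ss_height
ss_within = sum((x - mean(c)) ** 2 for c in data.values() for x in c)

df_freq, df_height = len(freqs) - 1, len(heights) - 1
df_inter = df_freq * df_height
df_within = len(all_obs) - len(data)
mse = ss_within / df_within

f_freq = (ss_freq / df_freq) / mse
f_height = (ss_height / df_height) / mse
f_inter = (ss_inter / df_inter) / mse
print(round(f_freq, 3), round(f_height, 3), round(f_inter, 3))
```

In the spreadsheet output, Sample corresponds to Frequency (the row blocks) and Columns corresponds to Height.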
20. a.
The boxplot and histograms appear as:
[Boxplot and histograms for each Size and Type combination: small, medium, and large, each for Standard and Octel]
b.
The interaction plot between the Size and Type factors is:
[Interaction plot: Size (large, medium, small) on the horizontal axis, with one line each for Octel and Standard]
c.
The two-way ANOVA is:
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                26051.39   2    13025.69   199.1189   4.8E-18    3.315833
Columns               1056.25    1    1056.25    16.1465    0.000363   4.170886
Interaction           804.1667   2    402.0833   6.146497   0.005792   3.315833
Within                1962.5     30   65.41667
Total                 29874.31   35
There is a significant difference between the two types of filters. However, the degree of difference depends on the filter size. The greatest difference occurs for the medium size filters.
21. a.
The boxplots appear as:
[Boxplots for Plant1, Plant2, Plant3, Plant4, and Plant5]
There are significant outliers in the 1st and 2nd plants.
b.
The one-way ANOVA is:
Groups   Count   Sum     Average   Variance
Plant1   22      99.5    4.523     100.642
Plant2   22      194.3   8.832     235.729
Plant3   19      91.8    4.832     19.388
Plant4   19      142.3   7.489     13.374
Plant5   13      134.9   10.377    91.299
ANOVA
Source of Variation   SS         df   MS        F       P-value   F crit
Between Groups        450.9207   4    112.730   1.160   0.334     2.473
Within Groups         8749.088   90   97.212
Total                 9200.009   94
There is no evidence of a statistically significant difference between the plants.
c.
The matrix of paired differences is:
Pairwise Mean Difference (row - column)
         Plant1   Plant2   Plant3   Plant4   Plant5
Plant1   0.000    -4.309   -0.309   -2.967   -5.854
Plant2            0.000    4.000    1.342    -1.545
Plant3                     0.000    -2.658   -5.545
Plant4                              0.000    -2.887
Plant5                                       0.000
MSE = 97.2120931991985
Pairwise Probabilities (Bonferroni Correction)
         Plant1   Plant2   Plant3   Plant4   Plant5
Plant1   -        1.000    1.000    1.000    0.931
Plant2            -        1.000    1.000    1.000
Plant3                     -        1.000    1.000
Plant4                              -        1.000
Plant5                                       -
There is no evidence of differences between any pair of plants.
d.
The boxplot for the reduced data set is:
[Boxplot: Revised Data, by plant]
The revised one-way ANOVA is:
Groups   Count   Sum     Average   Variance
Plant1   20      37.4    1.870     15.469
Plant2   20      135.7   6.785     35.948
Plant3   19      91.8    4.832     19.388
Plant4   19      142.3   7.489     13.374
Plant5   13      134.9   10.377    91.299
ANOVA
Source of Variation   SS         df   MS        F       P-value   F crit
Between Groups        670.433    4    167.608   5.414   0.001     2.478
Within Groups         2662.210   86   30.956
Total                 3332.643   90
After removing the outliers the p-value is significant, indicating that there is a difference between the plants. The paired differences are:
Pairwise Mean Difference (row - column)
         Plant1   Plant2   Plant3   Plant4   Plant5
Plant1   0.000    -4.915   -2.962   -5.619   -8.507
Plant2            0.000    1.953    -0.704   -3.592
Plant3                     0.000    -2.658   -5.545
Plant4                              0.000    -2.887
Plant5                                       0.000
MSE = 30.9559247010639
Pairwise Probabilities (Bonferroni Correction)
         Plant1   Plant2   Plant3   Plant4   Plant5
Plant1   -        0.064    1.000    0.022    0.000
Plant2            -        1.000    1.000    0.735
Plant3                     -        1.000    0.069
Plant4                              -        1.000
Plant5                                       -
There is a significant difference between Plant 1 and Plants 4 and 5.
22. a.
b.
The one-way ANOVA is:
SUMMARY
Groups          Count   Sum        Average   Variance
Meadow Pipit    45      1003.450   22.299    0.848
Tree Pipit      15      346.350    23.090    0.813
Hedge Sparrow   14      323.700    23.121    1.142
Robin           16      361.200    22.575    0.469
Pied Wagtail    15      343.550    22.903    1.140
Wren            15      316.950    21.130    0.553
ANOVA
Source of Variation   SS        df    MS      F        P-value   F crit
Between Groups        42.940    5     8.588   10.388   0.000     2.294
Within Groups         94.248    114   0.827
Total                 137.188   119
There is a significant difference in the egg sizes between the host birds (p-value < 0.001).
c.
The boxplots are:
[Boxplots of egg size by host bird: Hedge Sparrow, Meadow Pipit, Pied Wagtail, Robin, Tree Pipit, Wren]
d.
The means matrix is:
Pairwise Mean Difference (row - column)
                Hedge Sparrow   Meadow Pipit   Pied Wagtail   Robin    Tree Pipit   Wren
Hedge Sparrow   0.000           0.823          0.218          0.546    0.031        1.991
Meadow Pipit                    0.000          -0.604         -0.276   -0.791       1.169
Pied Wagtail                                   0.000          0.328    -0.187       1.773
Robin                                                         0.000    -0.515       1.445
Tree Pipit                                                             0.000        1.960
Wren                                                                                0.000
MSE = 0.826739905318742
Pairwise Probabilities (Bonferroni Correction)
                Hedge Sparrow   Meadow Pipit   Pied Wagtail   Robin    Tree Pipit   Wren
Hedge Sparrow   -               0.057          1.000          1.000    1.000        0.000
Meadow Pipit                    -              0.416          1.000    0.064        0.001
Pied Wagtail                                   -              1.000    1.000        0.000
Robin                                                         -        1.000        0.000
Tree Pipit                                                             -            0.000
Wren                                                                                -
There is a significant difference in the egg size between the wren and all other host birds. There are no other significant paired differences.
e.
The eggs laid in the nests of wrens are significantly smaller than the eggs laid in the nests of other host birds. This lends credence to the theory that cuckoos lay their eggs in the nests of a particular host species.