Solutions to Exercises for Chapter 16
16.1. The question posed here asks whether the three groups could be considered samples from
populations performing at the same level. Before analyzing the data, we need to decide whether an
analysis of variance, the Kruskal–Wallis test, or a randomization test might be most appropriate. We use
R to analyze the data as in Script 16.8.
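Script 16.8 contains the full setup; the sketch below shows how the data frame might be constructed. The variable names (dv, group, f.group, data) match the output that follows, but the scores are placeholders, not the exercise data:

dv <- rep(NA_real_, 45)                # the 45 scores, entered with c()
group <- rep(1:3, each = 15)           # 15 children per treatment
f.group <- factor(group, levels = 1:3, labels = c("it", "pt", "cai"))
data <- data.frame(dv, group, f.group)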
To begin, given the way the study was designed, we can be relatively comfortable that the
assumption of independence is met. The children were sampled at random and the intervention was
administered in an individual setting. Thus, we turn our attention to the assumptions of normality and
homogeneity of variance. Given that there are 15 cases in each of the groups, we can be somewhat
assured that ANOVA would be robust with regard to a violation of either normality or homogeneity of
variance. However, we will look at both to model careful practice.
First, we generate the descriptive statistics for the three groups (i.e., means and standard
deviations). We then examine the distributions for the three cells with the table( ) and boxplot( )
command. Finally, we generate indices of skewness and kurtosis and their respective standard errors. The
results are as follows:
> table(group)
group
 1  2  3
15 15 15
> round(tapply(dv,f.group,mean),3)
    it     pt    cai
12.133 11.533 11.733
> round(mean(dv),3)
[1] 11.8
> round(tapply(dv,f.group,sd),3)
   it    pt   cai
2.669 3.182 3.283
> tapply(dv,f.group,table)
$it
 6  9 10 11 12 13 14 15 16
 1  1  1  3  2  3  1  1  2
$pt
 6  7  9 10 12 13 14 15 16
 1  2  1  2  1  4  1  2  1
$cai
 5  7 10 11 12 14 15 16 17
 1  1  3  3  2  1  2  1  1
FIGURE 16.6 Boxplots for Exercise 16.1.
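The boxplots in Figure 16.6 (and the corresponding figures for the later exercises) can be produced with the base boxplot( ) function. Assuming the variables sketched above, a minimal call is:

boxplot(dv ~ f.group)   # side-by-side boxplots, one per group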
In examining the output, we are looking for reasonable means and standard deviations, as well as
possible outliers. We will assume that the means and standard deviations are reasonable; in your own
work, you will be able to make that determination. The results of the table( ) and the boxplot( )
commands indicate that there might be an outlier at the low end of the first group. At this point, we will
consider the issue of normality in a more formal fashion. We will look at skewness, kurtosis, and apply
the shapiro.test( ) function:
> round(tapply(dv,f.group,skewness),3)
    it     pt    cai
-0.555 -0.458 -0.286
> round(tapply(dv,f.group,SEsk),3)
  it   pt  cai
0.58 0.58 0.58
> round(tapply(dv,f.group,kurtosis),3)
    it     pt    cai
 0.704 -1.006 -0.044
> round(tapply(dv,f.group,SEku),3)
   it    pt   cai
1.121 1.121 1.121
> tapply(dv,f.group,shapiro.test)
$it
Shapiro-Wilk normality test
W = 0.9562, p-value = 0.626
$pt
Shapiro-Wilk normality test
W = 0.9238, p-value = 0.2199
$cai
Shapiro-Wilk normality test
W = 0.9566, p-value = 0.6331
All of the measures of skewness and kurtosis are well within ±2 standard errors of zero,
suggesting that all three samples are consistent with what you would expect if you were drawing random
samples out of normal populations. Furthermore, the results from the shapiro.test( ) function for each
group show no evidence of departure from normality. Thus, we would conclude that the data appear to
meet the assumption of normality.
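The skewness( ), kurtosis( ), SEsk( ), and SEku( ) functions come from the book's scripts. As a sketch, the standard errors reported above are consistent with the usual large-sample formulas, although the book's own definitions may be coded differently:

SEsk <- function(x) {
  n <- length(x)
  sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
}
SEku <- function(x) {
  n <- length(x)
  2 * SEsk(x) * sqrt((n^2 - 1) / ((n - 3) * (n + 5)))
}

For n = 15, these return 0.58 and 1.121, matching the output above.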
Finally, we consider the issue of homogeneity of variance using the leveneTest( ) function from
the car package. From the descriptive statistics above, you can see that the standard deviations are all
about the same:
> leveneTest(dv ~ f.group,data=data)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  0.2868 0.7521
      42
The p-value from the Levene test indicates that the variability appears to be equivalent across the groups.
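If you are following along, note that leveneTest( ) is not part of base R; the car package must be loaded before the call above will run:

library(car)   # provides leveneTest()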
In summary, we do not have any evidence that the assumptions have been violated, and if they
have, the violations are not large enough to be problematic. Indeed, when sample sizes exceed 10 and all
groups are of the same size, many statisticians do not even bother testing assumptions, as ANOVA is said
to be robust. Thus, we conclude that a conventional ANOVA would be the appropriate statistical
technique to apply to these data.
> oneway.test(dv ~ f.group,var.equal=TRUE)
One-way analysis of means
data:  dv and f.group
F = 0.1498, num df = 2, denom df = 42, p-value = 0.8613
The analysis results in an observed F-statistic of .1498 with an associated p-level of .86. As the p-level of
the observed test statistic is larger than our level of significance (α = .05), we do not reject the null
hypothesis. In conclusion, these data are consistent with the null hypothesis that the three samples come
from populations with the same mean. Thus, these three treatments do not seem to have had any
differential effects.
16.2. Exercise 16.2 is another three-group study calling for a comparison to determine whether the three groups
are equivalent with regard to level of “readiness for school.” Keep in mind that the data are fictitious.
Given the nature of the study, the children are assumed to come from many different centers. In addition, the sample was drawn from all children entering kindergarten, rather than by sampling centers and then sampling children within them, so we probably do not need to worry about the assumption of independence. Again, we enter the data into R and construct a data frame; a minimal sketch of the setup follows.
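Because the group sizes differ here, the grouping variable is built with the times argument rather than each; as before, the scores are placeholders and the names mirror the output:

group <- rep(1:3, times = c(15, 11, 13))   # unequal group sizes
f.group <- factor(group, levels = 1:3, labels = c("fd", "hd", "nd"))
dv <- rep(NA_real_, 39)                    # the 39 scores, entered with c()
data <- data.frame(dv, group, f.group)

We then generate descriptive statistics: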
> table(group)
group
 1  2  3
15 11 13
> round(tapply(dv,f.group,mean),3)
    fd     hd     nd
19.467 24.818 19.692
> round(mean(dv),3)
[1] 21.051
> round(tapply(dv,f.group,sd),3)
   fd    hd    nd
5.083 3.894 5.468
Note that the numbers of cases in each of the groups are different, suggesting that we need to look more
carefully at the assumptions. Again, we will assume that the means and standard deviations look
reasonable. Now, we’ll look at the within-group distributions:
> #Examining cell distributions
> tapply(dv,f.group,table)
$fd
 9 13 15 16 17 20 21 22 23 25 27
 1  1  1  1  2  3  1  1  1  1  2
$hd
19 22 23 24 27 29 33
 1  2  2  2  2  1  1
$nd
10 13 15 16 18 19 21 23 26 30
 1  1  1  1  1  2  1  3  1  1
FIGURE 16.7 Boxplots for Exercise 16.2.
Looking at the frequency distributions, we’ll assume that the minimum and maximum values are
within range, and there do not appear to be any extreme outliers, which is also supported by the boxplots
in Figure 16.7. Looking at the issue of normality,
> round(tapply(dv,f.group,skewness),3)
    fd     hd     nd
-0.307  0.801  0.053
> round(tapply(dv,f.group,SEsk),3)
   fd    hd    nd
0.580 0.661 0.616
> round(tapply(dv,f.group,kurtosis),3)
    fd     hd     nd
-0.156  0.719 -0.110
> round(tapply(dv,f.group,SEku),3)
   fd    hd    nd
1.121 1.279 1.191
> tapply(dv,f.group,shapiro.test)
$fd
Shapiro-Wilk normality test
W = 0.9706, p-value = 0.8662
$hd
Shapiro-Wilk normality test
W = 0.9392, p-value = 0.5111
$nd
Shapiro-Wilk normality test
W = 0.9875, p-value = 0.9985
As in Exercise 16.1, all of the measures of skewness and kurtosis are within ±2 standard errors of zero, and the results from the shapiro.test( ) applied to each group show large p-values for all three groups. Thus, we would conclude that the three groups might have come from populations that are
normally distributed. Regarding homogeneity of variance,
> leveneTest(dv ~ f.group,data=data)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  0.6577 0.5242
      36
The p-value from Levene’s test gives us reason to believe that the data are consistent with the
homogeneity assumption. In conclusion, overall the data appear to be in line with the assumptions for
ANOVA; if an assumption has been violated, the degree of violation probably is not sufficient to cause
problems.
Again, in light of the apparent correspondence of the data with the assumptions, we would
recommend using analysis of variance to examine the equality of the three population means:
> oneway.test(dv ~ f.group,var.equal=TRUE)
One-way analysis of means
data:  dv and f.group
F = 4.4943, num df = 2, denom df = 36, p-value = 0.01810
Thus, we reject the null hypothesis; these results suggest that the three groups are not samples from three
populations with the same mean. At this point, we know only that at least one of the three group means is
different from one or both of the others; we do not know which group means are different from which
other group means. Follow-up procedures that will allow us to make more specific statements will be
covered in Chapter 17.
16.3. This exercise is a four-group study calling for a comparison of the groups with regard to level. As in the previous two exercises, we read the data into R with the c( ) function and create a data frame; a sketch of the setup, paralleling Exercise 16.1, appears below.
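This time there are four groups of 10 cases each; the scores are again placeholders:

group <- rep(1:4, each = 10)
f.group <- factor(group, levels = 1:4, labels = c("ma", "sc", "la", "ss"))
dv <- rep(NA_real_, 40)   # the 40 scores, entered with c()
data <- data.frame(dv, group, f.group)

We then generate descriptive statistics, including sample sizes, means, and standard deviations: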
> #Obtaining descriptive statistics
> table(group)
group
 1  2  3  4
10 10 10 10
> round(tapply(dv,f.group,mean),3)
  ma   sc   la   ss
59.7 54.6 70.1 72.0
> round(mean(dv),3)
[1] 64.1
> round(tapply(dv,f.group,sd),3)
    ma     sc     la     ss
11.823 18.307 17.387  7.717
Although the four sample sizes are equal, at 10 cases each the samples are not very large, so we may be more
concerned about the degree to which the data conform to the assumptions for analysis of variance,
particularly normality. In addition, the smallest standard deviation (7.72) is less than one-half of the
largest standard deviation (18.31), so we may also want to be concerned with the assumption of
homogeneity of variance. But before looking at assumptions, we should look at the within-cell
distributions:
> tapply(dv,f.group,table)
$ma
45 46 50 53 58 62 74 75 76
 1  1  1  1  2  1  1  1  1
$sc
39 42 43 45 49 55 63 68 99
 1  1  2  1  1  1  1  1  1
$la
51 53 56 63 86 90 91 92
 1  1  2  2  1  1  1  1
$ss
65 66 67 70 71 74 76 91
 1  2  1  1  1  2  1  1
FIGURE 16.8 Boxplots for Exercise 16.3.
We’ll assume that the minimum and maximum values are within range, but there appear to be
some large outliers in the science and social studies groups. This is confirmed by looking at the boxplots
in Figure 16.8. The boxplots also suggest the possibility of nonnormal distributions and nonequivalent
variances. Let’s consider the assumptions of normality more carefully:
> round(tapply(dv,f.group,skewness),3)
   ma    sc    la    ss
0.326 1.823 0.350 1.822
> round(tapply(dv,f.group,SEsk),3)
   ma    sc    la    ss
0.687 0.687 0.687 0.687
> round(tapply(dv,f.group,kurtosis),3)
    ma     sc     la     ss
-1.435  3.537 -2.084  4.044
> round(tapply(dv,f.group,SEku),3)
   ma    sc    la    ss
1.334 1.334 1.334 1.334
> tapply(dv,f.group,shapiro.test)
$ma
Shapiro-Wilk normality test
W = 0.896, p-value = 0.1979
$sc
Shapiro-Wilk normality test
W = 0.7926, p-value = 0.01180
$la
Shapiro-Wilk normality test
W = 0.8117, p-value = 0.02010
$ss
Shapiro-Wilk normality test
W = 0.8088, p-value = 0.01853
The skewness and kurtosis values suggest problems with the assumption of normality, particularly with
the science and social studies groups. The results from the shapiro.test( ) suggest that only the math group might reasonably be considered a sample from a normal population. In looking at the assumption of
homogeneity of variance,
> leveneTest(dv ~ f.group,data=data)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  1.3997 0.2587
      36
The assumption of equivalent variances might be tenable, suggesting that the nonnormality may be the
result of the outliers. Thus, we may consider using the Kruskal–Wallis test as an alternative to the
conventional ANOVA:
> kruskal.test(dv ~ f.group)
Kruskal-Wallis rank sum test
Kruskal-Wallis chi-squared = 10.3842, df = 3, p-value = 0.01557
Historically, this would have been the only reasonable choice. We obtained an H-statistic of 10.38, with
an associated p-level of .0156. Thus, it appears that the observed differences among the groups are unlikely to have occurred by chance. Follow-up techniques to determine which groups differ from which other groups
will be presented in Chapter 17.
With the computing power available today, we might want to consider one of the
resampling/randomization approaches. We look at both bootstrapping and permutations. Using a
bootstrap approach, we construct 99,999 bootstrap samples and count the number of samples that equal or
exceed the actual SSag:
> pvalue
[1] 0.0332
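The counting itself is done in the book's script; as a minimal sketch of the idea, assuming a helper SSag( ) for the among-groups sum of squares (the name follows the text's notation; the book's code may be organized differently):

SSag <- function(y, g) {
  grand <- mean(y)
  sum(tapply(y, g, function(x) length(x) * (mean(x) - grand)^2))
}
set.seed(1)                     # any seed; for reproducibility
reps <- 99999
obs <- SSag(dv, f.group)
boot <- replicate(reps, SSag(sample(dv, replace = TRUE), f.group))
pvalue <- (sum(boot >= obs) + 1) / (reps + 1)   # some scripts use sum(...)/reps instead
pvalue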
With regard to the permutation approach and given 4 groups of 10 cases each, there are over
196,056,000,000,000,000,000 possible arrangements, suggesting that we would be wise to consider the
Monte Carlo p-value approach! Thus, we look at a sample of 99,999 permutations of the 40 cases,
counting the number of times we equal or exceed the actual SSag:
> pvalue
[1] 1e-05
Translated, the p-value is .00001. Thus, it would appear that the different teachers of the different
academic subjects (math, science, language arts, and social studies) are evaluated differently.
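For reference, the arrangement count quoted above corresponds to dividing the 40 cases into four unlabeled groups of 10, and the Monte Carlo permutation test differs from the bootstrap sketch above only in resampling without replacement. Reusing SSag( ), reps, and obs from that sketch:

choose(40, 10) * choose(30, 10) * choose(20, 10) / factorial(4)   # about 1.96e+20
perm <- replicate(reps, SSag(sample(dv), f.group))   # sample() without replace permutes dv
pvalue <- (sum(perm >= obs) + 1) / (reps + 1)        # 1e-05 when no permuted SSag reaches obs
pvalue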