STAT 210

COMPARING TWO OR MORE POPULATION MEANS: INDEPENDENT SAMPLES
The two-sample t-test procedures that we discussed in the previous examples
can be used to compare the means of two populations. However, these methods do not work
when a study is intended to compare more than two groups. Next, we will discuss a method that
allows us to compare two OR MORE groups using independent samples.
Recall the data used in Example 6.3. We used a pooled t-test to determine whether the average
birth weight was lower when mothers were classified as smokers.
Questions:
1. What is the p-value for the two-tailed pooled t-test?
2. What is the p-value given in the analysis of variance section?
For two samples, the two-tailed pooled t-test and the analysis of variance will always give
identical p-values, because the F-statistic equals the square of the pooled t-statistic. The
difference is that the F-test used in the analysis of variance (ANOVA) can also be used to
compare more than two populations.
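The agreement between the two tests can be checked numerically. The sketch below uses only the Python standard library and made-up birth weights (they are NOT the Example 6.3 data); the function names pooled_t and anova_f are illustrative choices, not part of JMP or of the course materials.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Pooled two-sample t statistic (assumes equal population variances)."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def anova_f(*groups):
    """One-way ANOVA F statistic for any number of independent samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical birth weights in pounds, for illustration only
smokers = [6.1, 6.8, 5.9, 7.0, 6.4]
nonsmokers = [7.2, 7.9, 6.9, 7.5, 8.1, 7.4]

t = pooled_t(smokers, nonsmokers)
f = anova_f(smokers, nonsmokers)
print(f"t = {t:.4f}, t squared = {t ** 2:.4f}, F = {f:.4f}")
```

With two groups, t squared and F agree up to rounding error, which is why the two p-values match. Unlike pooled_t, anova_f accepts three or more samples.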
The Main Idea Behind an Analysis of Variance
When we look for differences among a set of sample means, the distance between those
sample means is not enough by itself. What matters is how far apart they are relative to the
variability of the observations. Consider the following scenarios.
Figure A
Figure B
Questions:
1. What can you say about the centers of the boxplots for each group in Figure A as
compared to Figure B?
2. What can you say about the variation within each group in Figure A?
3. What can you say about the variation within each group in Figure B?
4. Which scenario illustrates an example where variation between groups is LARGE
compared to variation within groups?
Comments:
1. In Figure A, there is a lot of variation within each group. Therefore, the observed
differences among the sample means could easily happen by random chance alone.
2. In Figure B, the centers of the boxplots are the same as in Figure A, but there is much
less variation within each group. Therefore, it is not likely that any sample from Group
C would have a mean as small as the mean from Group A, for example.
3. If the variation between groups is large relative to the variation within each group, then
we have evidence that something other than random chance is at work. Therefore, our
test statistic is as follows:
F = (Variation Between Groups) / (Variation Within Groups)
Note that this F-statistic takes on values between zero and positive infinity.
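To make the ratio concrete, here is a minimal sketch that computes it by hand. It assumes the usual one-way ANOVA definitions, where each "variation" is a mean square (a sum of squares divided by its degrees of freedom). The two data sets are hypothetical and only mimic the spirit of Figure A and Figure B: the group means are identical (10, 12, 14), but the spread within each group differs.

```python
from statistics import mean

def one_way_f(groups):
    """Return (MS_between, MS_within, F) for a list of independent samples."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]

    # Variation between groups: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation within groups: how far each observation is from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical data with the same group means but different spread
wide = [[4, 10, 16], [6, 12, 18], [8, 14, 20]]     # large within-group variation
tight = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]  # small within-group variation

for label, data in (("wide", wide), ("tight", tight)):
    ms_b, ms_w, f = one_way_f(data)
    print(f"{label}: MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F = {f:.2f}")
```

The numerator is the same in both data sets because the group means are the same. Only the denominator changes, so the tighter groups produce the larger F-statistic.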
Questions:
1. What does it mean if the F-statistic is zero?
2. What does it mean if the F-statistic is very large?
We will use JMP to carry out the ANOVA. Therefore, JMP will calculate the F-statistic and its
associated p-value. The ANOVA procedure for comparing two or more means (assuming
independent samples) is outlined below.
Example 6.5: Jaffe, Parker and Wilson have investigated the concentration of several
hydrophobic organic substances (such as hexachlorobenzene, chlordane, heptachlor, aldrin,
dieldrin, endrin) in the Wolf River in Tennessee. Measurements were taken downstream of an
abandoned dump site that had previously been used by the pesticide industry to dispose of its
waste products. It was expected that these hydrophobic substances might have a
nonhomogeneous vertical distribution in the river because of differences in density between
these compounds and water and because of the adsorption of these compounds on sediments,
which could lead to higher concentrations on the bottom. It is important to check this
hypothesis because the standard procedure of sampling at six-tenths of the depth could miss
the bulk of these pollutants if the distribution were not uniform. Grab samples were taken with
a La Motte-Vandorn water sampler of 1 liter capacity at various depths of the river. This
sampler consists of a horizontal plexiglas tube of 7 centimeters diameter and a plunger on each
side which shuts the sampler when the sampler is at the desired depth. Ten surface, 10 mid-depth,
and 10 bottom samples were collected, all within a relatively short period. The data can
be found in the file WolfRiver.JMP (a small portion of the data set is shown below).
Source: Jaffe, P. R., Parker, F. L., and Wilson, D. J. (1982). Distribution of toxic substances in rivers. Journal
of the Environmental Engineering Division, 108, 639-649.
In JMP, select Analyze > Fit Y by X. Move Aldrin to the Y, Response box and Depth to the X,
Factor box.
JMP returns side-by-side dotplots, and we can select Quantiles from the red drop-down arrow
next to the variable name to construct side-by-side boxplots. The green lines represent the
means for each group.
To determine whether there is a statistically significant difference in the Aldrin concentration
across depth, we will use a one-way analysis of variance (ANOVA).
Question of Interest: Is there a difference in the Aldrin concentration across depth (Bottom,
Mid-depth, or Surface)?
Step 0: Check the assumptions to be sure that the ANOVA is valid.

Make sure that the groups are independent.

Check the normality assumption.

Check whether we have any evidence that the group variances differ (the ANOVA
assumes equal variances).
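One informal way to look at the equal-variance assumption is to compare the sample standard deviations of the groups. The sketch below is only an illustration: the concentrations are made up (they are NOT the WolfRiver.JMP values), and the cutoff of 2 for the ratio of the largest to the smallest standard deviation is a commonly quoted rule of thumb assumed here, not a rule stated in these notes.

```python
from statistics import stdev

def sd_ratio(samples):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in samples.values()]
    return max(sds) / min(sds)

# Hypothetical Aldrin concentrations by depth, for illustration only
samples = {
    "Bottom":    [5.2, 6.1, 4.8, 5.9, 6.4],
    "Mid-depth": [4.9, 5.3, 4.4, 5.6, 5.0],
    "Surface":   [3.9, 4.6, 4.1, 4.8, 4.2],
}

for depth, values in samples.items():
    print(f"{depth}: n = {len(values)}, SD = {stdev(values):.3f}")

ratio = sd_ratio(samples)
# Assumed rule of thumb: a ratio below 2 gives little reason to doubt equal variances
print(f"Largest SD / smallest SD = {ratio:.2f}")
```

A ratio well above 2 would be a signal to look more closely at the side-by-side boxplots before trusting the ANOVA results.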
Step 1: Convert the research question into H0 and Ha.
H0:
Ha:
Step 2: Find the test statistic and p-value from your data.
In JMP, select Means/Anova from the red drop-down arrow.
JMP returns the following:
Step 3: Write a conclusion in the context of the problem.