MAS1401 Handout 6

Hypothesis Tests for the Population Mean
Similar arguments to those used to develop the idea of a confidence interval allow us
to test a hypothesis that the population mean is equal to a particular value.
For example suppose we were told that the average IQ of UK students was 115, and
we were interested in whether or not the mean IQ of the MAS1401 students was
different from this.
Then the hypothesis that µ (the population mean IQ for MAS1401) is equal to 115 is
known as the null hypothesis and denoted by H0.
Here the null hypothesis is:
H0: µ = 115.
The other possibility, that the mean IQ of the MAS1401 students is not 115, is known
as the alternative hypothesis, or the experimental hypothesis.
Here the alternative hypothesis is:
HA: µ ≠ 115.
In order to investigate which hypothesis is true we can carry out a hypothesis test.
In the example we are considering, the test we use is called a 1-sample t-test (because
there is one sample of data, and we use the t-distribution!).
MINITAB will carry this test out for us, using the procedure:
Stat>Basic Statistics>1-sample t...
Here, the outcome is p = 0.015. What does this mean?
Well, the outcome of a hypothesis test is always a “p-value”. The p-value is a
probability. It tells us how likely we would be to see a sample at least as extreme as
the one we have actually observed, if, in fact, the null hypothesis were true.
Small p-values tell us that the sample we have observed would be unlikely to have
occurred if the null hypothesis really is true. Therefore a small p-value constitutes
evidence against the null hypothesis.
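For reference, the calculation behind the 1-sample t-test can be written as follows. This is the standard textbook form rather than anything specific to Minitab. The symbols x̄ (sample mean), s (sample standard deviation) and n (sample size) are introduced here for illustration and are not used elsewhere in this handout.

```latex
t = \frac{\bar{x} - 115}{s / \sqrt{n}}, \qquad
p = P\left( |T_{n-1}| \ge |t| \right)
```

Here T_{n-1} stands for a quantity that follows the t-distribution with n − 1 degrees of freedom, so p is the probability, when H0 is true, of a t-value at least as far from zero as the one we observed.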
For the IQs of the MAS1401 students, we had H0: µ = 115, and p = 0.015.
This value of p is a small probability. It tells us that if the mean IQ really is 115 then
our sample is very unusual.
A much more plausible explanation is that H0 is in fact false, and the population mean
is not 115, but higher than that.
We use guidelines to interpret the result of a hypothesis test…
Guidelines for interpreting p-values:
If p > 0.05, we do not reject the null hypothesis at the 5% level. There is no
evidence against the null hypothesis.
If p < 0.05, we reject the null hypothesis at the 5% level. There is moderate
evidence against the null hypothesis.
If p < 0.01, we reject the null hypothesis at the 1% level. There is strong evidence
against the null hypothesis.
If p < 0.001, we reject the null hypothesis at the 0.1% level. There is very strong
evidence against the null hypothesis.
Note that these are guidelines only, and should not be interpreted as hard and fast
rules when making decisions!
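The guidelines above can be sketched as a small Python function. The function name and the returned wording are illustrative choices, and the handout does not say what to do when p is exactly 0.05, so treating that case as "do not reject" is an assumption.

```python
def interpret_p_value(p):
    """Return the handout's guideline wording for a p-value."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    # Check the smallest threshold first, so that the strongest
    # statement that applies is the one returned.
    if p < 0.001:
        return "reject H0 at the 0.1% level: very strong evidence against H0"
    if p < 0.01:
        return "reject H0 at the 1% level: strong evidence against H0"
    if p < 0.05:
        return "reject H0 at the 5% level: moderate evidence against H0"
    # Assumption: p exactly equal to 0.05 is treated as "do not reject".
    return "do not reject H0 at the 5% level: no evidence against H0"


# The IQ example in this handout had p = 0.015.
print(interpret_p_value(0.015))
```

As the note above says, these are guidelines only, so the output is a starting point for interpretation rather than a decision rule.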
Two independent samples
Let’s return to the haematocrit data we looked at in Practical 2. There were
measurements on 126 women and 61 men.
We would like to use these data to make comparisons between the population
haematocrit distributions for females and males.
The method is to construct confidence intervals for the difference between the mean
haematocrit for women and men.
It is also possible to carry out a hypothesis test for whether the difference between the
means takes a particular value.
The most commonly tested value is zero, since this amounts to a test of whether or not
there is any difference between the population means.
We won’t worry about the details, but we will use Minitab to carry out the work for us,
using the 2-sample t-test option:
Stat>Basic Statistics>2-sample t…
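For readers who are curious about what sits behind this menu option, here is a minimal Python sketch of a 2-sample t statistic. It assumes the unpooled (Welch) form, in which the two population variances are not assumed equal; whether Minitab pools the variances depends on the "Assume equal variances" option. The function name and the numbers in the usage example are made up for illustration and are not the haematocrit data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two
    independent samples, without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample mean.
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    # Test statistic for H0: difference between population means = 0.
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Made-up toy numbers, used only to show how the function is called.
toy_a = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, deg_free = welch_t(toy_a, toy_b)
print(t_stat, deg_free)
```

Turning the t statistic into a p-value or a confidence interval also needs the t-distribution, which is the part Minitab handles for us.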
Additional Notes:
Two dependent samples
In the haematocrit example, we called the samples independent because there was
nothing to relate any particular female measurement to any particular male
measurement.
Sometimes we have a situation where every measurement in one group is related to a
particular measurement in the other group.
Animal                1    2    3    4    5    6    7    8
Cholesterol Before:   210  217  208  215  202  209  207  210
Cholesterol After:    212  210  210  213  200  208  203  199
Note that each measurement of ‘cholesterol before’ is directly related to the
‘cholesterol after’ measurement immediately below it.
We need to take this into account, but the two-sample t-test we just carried out doesn’t
do this.
Instead we must carry out a paired t-test (because the data are in pairs!) using the
Minitab procedure:
Stat>Basic Statistics>Paired t…
Paired T-Test and CI: cholesterol before, cholesterol after

Paired T for cholesterol before - cholesterol after

                   N     Mean    StDev  SE Mean
cholesterol befo   8  209.750    4.652    1.645
cholesterol afte   8  206.875    5.463    1.931
Difference         8  2.87500  4.42194  1.56339

95% CI for mean difference: (-0.82184, 6.57184)
T-Test of mean difference = 0 (vs not = 0): T-Value = 1.84  P-Value = 0.109
We see that the 95% confidence interval for mean before – mean after is:
(-0.82, 6.57).
Note that this range of numbers includes the value zero, corresponding to the situation
where the population mean cholesterol level is the same before and after treatment.
Note that the null hypothesis is:
H0: mean before – mean after = 0,
which we could equivalently express as:
H0: the drug is not effective.
The result of the paired t-test is p = 0.109. This is not small, so there is no evidence
against the null hypothesis, i.e. there is no evidence that this drug is effective.
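The paired t-test is a 1-sample t-test applied to the eight before-minus-after differences. As a rough check on the Minitab output, the sketch below redoes the calculation in Python using only the standard library. The function names are illustrative, and the t-distribution probabilities are obtained by simple numerical integration, which is an assumption of this sketch and not necessarily how Minitab computes them.

```python
import math
import statistics

# Cholesterol data from the table in this handout (8 animals).
before = [210, 217, 208, 215, 202, 209, 207, 210]
after = [212, 210, 210, 213, 200, 208, 203, 199]


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def central_area(t, df, steps=1000):
    """P(-|t| < T < |t|), using Simpson's rule on [0, |t|] (steps is even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # doubled because the density is symmetric


def paired_t_test(x, y, level=0.95):
    """Return (t value, two-sided p-value, confidence interval) for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    t_value = mean_diff / se
    p_value = 1 - central_area(t_value, n - 1)
    # Find the critical value c with P(-c < T < c) = level by bisection.
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if central_area(mid, n - 1) < level:
            lo = mid
        else:
            hi = mid
    half_width = hi * se
    return t_value, p_value, (mean_diff - half_width, mean_diff + half_width)


t_value, p_value, ci = paired_t_test(before, after)
# These should agree with the Minitab output above, up to rounding.
print(round(t_value, 2), round(p_value, 3), ci)
```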
We must be careful how we interpret a non-significant result from a hypothesis test:
• It is correct to say: “there is no evidence from this study that the drug is
  effective.”
• It is not correct to say: “there is evidence from this study that the drug is not
  effective.”
• In general, absence of evidence does not automatically imply evidence of
  absence, and we should think carefully about how good our study is before we
  say it does!
• In this example, collecting data from just 8 animals has not given our drug
  much chance to prove its worth. This study was not very powerful. Bigger
  samples mean more power.
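One way to see why bigger samples help is to look at the standard error of the mean difference, which is the standard deviation divided by the square root of the sample size. The short Python sketch below uses the standard deviation of the differences from the Minitab output in this handout. It assumes, purely for illustration, that the standard deviation would stay the same if more animals were studied; the sample sizes 32 and 128 are hypothetical.

```python
import math

# StDev of the differences, taken from the Minitab output in this handout.
sd_of_differences = 4.42194


def standard_error(sd, n):
    """Standard error of a sample mean based on n observations."""
    return sd / math.sqrt(n)


# n = 8 is the actual study; 32 and 128 are hypothetical larger studies.
for n in (8, 32, 128):
    print(n, round(standard_error(sd_of_differences, n), 3))
```

Each fourfold increase in sample size halves the standard error, so the same mean difference would give a t-value twice as large and hence stronger evidence against H0.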