252doctor 9/13/07
The Doctor’s Problem
In Lucy Horwitz and Lou Ferleger’s book
Statistics for Social Change (South End Press,
1980) the authors talk about a doctor who
believes that a group of workers are suffering
from respiratory illness. She knows that the
number of sick days taken will be increased by a
respiratory illness and that similar workers
nationally take an average of 1.5 sick days a
month and that, nationally, the standard
deviation is 0.8.
She takes a sample of 100 workers and finds a
sample mean of 2.00. This is thus a rare example
of a test of a population mean when the standard
deviation is known.
There are three ways we can test this mean of 1.5.
We can simply test whether the mean is 1.5. This would give us the following hypotheses:
$H_0: \mu = 1.5$
$H_1: \mu \ne 1.5$
We can take seriously her belief that the mean number of sick days is above the norm (1.5). This statement, because it does not contain an equality, becomes the alternative hypothesis and would give us the following hypotheses:
$H_0: \mu \le 1.5$
$H_1: \mu > 1.5$
We may hear management claim that their workers are healthier than the norm, which would imply that they take fewer sick days than the norm, that is, that the mean number of sick days is below 1.5. This would give us the following hypotheses:
$H_0: \mu \ge 1.5$
$H_1: \mu < 1.5$
The doctor wrongly uses the first of these tests, which is bad as a precedent but excellent for teaching purposes. We thus have $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{100}} = 0.08$. We will assume that the confidence level is 95%, so that the significance level is $\alpha = .05$. We are testing the two-sided hypotheses
$H_0: \mu = 1.5$
$H_1: \mu \ne 1.5$
Test Ratio Method:
$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$, $\mu_0 = 1.5$. Make a diagram showing a ‘Normal’ curve with a mean at zero and two values of z cutting off 2.5% tails on both sides of zero. According to the t table $z_{\alpha/2} = z_{.025} = 1.960$, so the two ‘reject’ regions are the area below -1.960 and the area above +1.960. $z = \frac{2.0 - 1.5}{0.08} = 6.25$ is in the ‘reject’ region, so reject the null hypothesis.
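The test ratio calculation can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the critical value comes from the standard library's `statistics.NormalDist` rather than from the printed table.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the text; the names are illustrative.
mu0 = 1.5      # mean under the null hypothesis
sigma = 0.8    # known population standard deviation
n = 100        # sample size
x_bar = 2.00   # observed sample mean
alpha = 0.05   # significance level

std_error = sigma / sqrt(n)                    # standard error of the mean
z = (x_bar - mu0) / std_error                  # test ratio
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # value cutting off the upper 2.5% tail

# Two-sided test: reject if z lands in either tail.
reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The comparison uses the absolute value of z because the two ‘reject’ regions are symmetric about zero.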
Critical Value Method for $\bar{x}$:
The formula table says $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, and this is a formula for two critical values, which is what we need if the null hypothesis simply says that the mean is 1.5. We want two critical values, one above and one below 1.5, since it should be obvious that if our sample mean is too far above or below 1.5, we would reject the null hypothesis. We use $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 1.5 \pm 1.960(0.08) = 1.5 \pm 0.157$. Make a diagram with 1.5 in the middle showing a 95% ‘accept’ region between 1.343 and 1.657 and two 2.5% ‘reject’ regions, one below 1.343 and one above 1.657. Since $\bar{x} = 2.00$ falls in the upper ‘reject’ region, reject the null hypothesis.
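As a rough check on the critical values, the sketch below treats the sample mean under the null hypothesis as Normal with mean 1.5 and standard error 0.08, and reads the critical values off as the 2.5% and 97.5% quantiles of that distribution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the text; the names are illustrative.
mu0, sigma, n, alpha = 1.5, 0.8, 100, 0.05
x_bar = 2.00

# Sampling distribution of the sample mean if the null hypothesis is true.
null_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))

lower_cv = null_dist.inv_cdf(alpha / 2)        # lower critical value
upper_cv = null_dist.inv_cdf(1 - alpha / 2)    # upper critical value

# Reject if the sample mean falls outside the 'accept' region.
reject = x_bar < lower_cv or x_bar > upper_cv
print(f"accept region: {lower_cv:.3f} to {upper_cv:.3f}, reject H0: {reject}")
```

Taking quantiles of the null distribution is equivalent to computing 1.5 plus or minus 1.960 times 0.08 by hand.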
Confidence Interval Method:
The formula for a confidence interval for the mean is $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, and a two-sided hypothesis requires a two-sided confidence interval. The interval becomes $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 2.00 \pm 1.960(0.08) = 2.00 \pm 0.157$, or we can write $P(1.843 \le \mu \le 2.157) = .95$. Make a diagram; you should use 2.00 as the middle. To represent the confidence interval, shade the area between 1.843 and 2.157. Since the null hypothesis mean of 1.5 does not fall in the confidence interval, the confidence interval and the null hypothesis contradict one another, so reject the null hypothesis.
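The confidence interval method can be sketched the same way. Here the interval is centered on the sample mean rather than on the hypothesized mean, and the decision depends on whether 1.5 lies inside it. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the text; the names are illustrative.
mu0, sigma, n, alpha = 1.5, 0.8, 100, 0.05
x_bar = 2.00

std_error = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
margin = z_crit * std_error                    # half-width of the interval

ci_low = x_bar - margin
ci_high = x_bar + margin

# Reject the null hypothesis if its mean is not inside the interval.
reject = not (ci_low <= mu0 <= ci_high)
print(f"95% interval: {ci_low:.3f} to {ci_high:.3f}, reject H0: {reject}")
```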
Test ratio method using p-values.
A p-value is a measure of the credibility of the null hypothesis and is defined as the probability that a test statistic or ratio as extreme as or more extreme than the observed statistic or ratio (that is, as low as or lower than it, or as high as or higher than it) could occur, assuming that the null hypothesis is true. In this case, values are extreme, relative to the null hypothesis $H_0: \mu = 1.5$, if the sample mean is way above or way below 1.5. The easiest way to measure the probability is with the test ratio $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$. We find the value of z, call it $z_1$. If we are doing a 2-sided test and $z_1$ is a positive number, we must find both $P(z \ge z_1)$ and $P(z \le -z_1)$. These two probabilities are identical, since the Normal distribution is symmetric about zero. So if $z_1$ is a negative number, we must find $2P(z \le z_1)$, and if $z_1$ is a positive number, we must find $2P(z \ge z_1)$. We already know that $z = \frac{2.0 - 1.5}{0.08} = 6.25$, so $p\text{-value} = 2P(z \ge z_1) = 2P(z \ge 6.25)$. To find $P(z \ge 6.25)$ we use the Normal table. To do this, make a diagram of the standardized Normal distribution and shade the area above 6.25. The Normal table says that if $z_0$ is 3.90 and up, $P(0 \le z \le z_0)$ is .5000.
So $P(z \ge 6.25) = P(z \ge 0) - P(0 \le z \le 6.25) = .5 - .5000 = 0$ and $p\text{-value} = 2P(z \ge z_1) = 2P(z \ge 6.25) = 2(0) = 0$, to the four decimal places of the table. We could amplify this by saying $p\text{-value} = 2P(\bar{x} \ge 2.00) = 2P(z \ge 6.25)$. If we are using a 5% significance level, we can say that the p-value is below $\alpha = .05$ and reject the null hypothesis.
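The table value of zero is a rounding effect. As a sketch, the upper tail can be computed directly with the complementary error function `math.erfc`, which stays accurate far out in the tail. The identity used is P(z ≥ z1) = 0.5 · erfc(z1 / √2), and the variable names are illustrative.

```python
from math import erfc, sqrt

z1 = 6.25   # observed test ratio from the text

# Upper-tail probability of the standard Normal distribution.
upper_tail = 0.5 * erfc(z1 / sqrt(2))

# Two-sided p-value: double the tail beyond the observed ratio.
p_value = 2 * upper_tail

print(f"p-value = {p_value:.3e}")
print(f"rounded to four decimal places: {p_value:.4f}")
```

The unrounded p-value is a very small positive number, so it is below any conventional significance level, which agrees with the table-based conclusion.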
If we need a second example of this, assume that we are testing the same hypotheses, but find that $\bar{x} = 1.30$. Since this value of the sample mean is below $\mu_0 = 1.5$, we will look at the probability that the sample mean is below or lower than 1.30. $p\text{-value} = 2P(\bar{x} \le 1.30) = 2P\left(z \le \frac{1.30 - 1.5}{0.08}\right) = 2P(z \le -2.50)$.
Again make a diagram of the standardized Normal distribution and shade the area below -2.50. The Normal table says that $P(0 \le z \le 2.50) = .4938$, so we have $P(z \le -2.50) = P(z \le 0) - P(-2.50 \le z \le 0) = .5 - .4938 = .0062$ and $p\text{-value} = 2P(\bar{x} \le 1.30) = 2P(z \le -2.50) = 2(.0062) = .0124$. If we are using a 5% significance level, we can say that the p-value is below $\alpha = .05$ and reject the null hypothesis, but, if we are using a 1% significance level, we cannot reject the null hypothesis.
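The two-sided p-value rule can be wrapped in a small helper and applied to the second example. This is a minimal sketch: the function name `two_sided_p_value` and its parameters are made up for illustration, and it assumes a known population standard deviation, as in the text.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(x_bar, mu0, sigma, n):
    """Two-sided p-value for a test of a mean with known sigma (hypothetical helper)."""
    z1 = (x_bar - mu0) / (sigma / sqrt(n))
    # Double the tail beyond |z1|; by symmetry this covers both signs of z1.
    return 2 * NormalDist().cdf(-abs(z1))

p_value = two_sided_p_value(x_bar=1.30, mu0=1.5, sigma=0.8, n=100)

for alpha in (0.05, 0.01):
    decision = "reject" if p_value < alpha else "do not reject"
    print(f"alpha = {alpha}: p-value = {p_value:.4f}, {decision} H0")
```

Using `-abs(z1)` means the same function handles a sample mean above or below 1.5 without a separate branch.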