B. HYPOTHESIS TESTS FOR ONE SAMPLE

252ones 9/20/07 (Open this document in 'Outline' view!)
1. The Meaning of Hypothesis Testing
A hypothesis is a statement about the characteristics of a population. To be of any use to us it must be
quantifiable and testable. The hypothesis to be tested is usually called the Null Hypothesis, $H_0$. A rival
hypothesis is called the Alternative Hypothesis, $H_1$ or $H_A$. The null hypothesis is usually a
hypothesis of "no difference," and the alternative hypothesis covers all other possibilities.
To start out, ask "What do I want to know about a population or populations?" Can I state this in terms
of population parameters? Can I state this in terms of a testable hypothesis (a null hypothesis)? Does
my null hypothesis say that these parameters or differences between these parameters are insignificant
(i.e. not distinct from zero)?
Next ask "What am I assuming about the population or populations?" Are the parameters that I am
testing appropriate to the type of population that I am assuming? What can I use to test my hypothesis?
Can I find a sample statistic or statistics to do the job? How many samples do I need? Can I calculate
the sample statistic or statistics? What distribution does the test statistic have? Is this in accord with the
null hypothesis? What errors am I likely to make?
Usually there are three approaches to hypothesis testing involving a statement about a parameter of a
population: (i) the test ratio method, in which a ratio involving an estimate of a parameter is tested against a
well-known distribution like t; (ii) the critical value method, in which values of estimates of a parameter are
found which could lead to rejection of $H_0$; and (iii) the confidence interval method, in which a confidence
interval is constructed for the parameter and compared to the null hypothesis.
If I use a test ratio, what is the probability of getting values as extreme or more extreme than I
actually got? (This is the p-value. The lower it is, the harder the observed result is to reconcile with the
null hypothesis. If the p-value falls below the significance level, I can say that I reject the null hypothesis.)
If I use a test statistic and my significance level is 5% or 1%, did the value fall among the most
likely 95% or 99% of values? Or was it a very unlikely value?
If I use a confidence interval, did the parameter value in my null hypothesis fall in the confidence
interval?
Remember:
a. A null hypothesis is usually a statement about a parameter of a population. It is never a
statement about a sample statistic. A sample statistic is used to test the hypothesis.
b. A null hypothesis usually contains an equality; an alternative hypothesis does not contain an
equality.
c. A null hypothesis often says that a parameter or a difference between parameters is
insignificant. If a result is significant we reject the null hypothesis.
2. Steps for Testing a Hypothesis Applied to testing for a Population Mean
a. Outline
i. State the problem as two hypotheses
ii. Quantify the hypotheses
iii. Identify the statistic, ratio or interval to be used
iv. Determine the sampling distribution of the statistic to be used.
v. Select a level of significance
vi. Find a value or values of the test ratio or statistic that would lead to rejection of the null hypothesis
vii. Compute a value of the statistic or ratio from a random sample
viii. By comparing the results of (vii) with the values found in (vi) accept or reject the null hypothesis
b. Application to a Population Mean
To test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$. Assume that we have computed $s$ from our sample, and that we
do not know $\sigma$.
i. Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$
ii. Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}}$
iii. Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$
Note: If $\sigma$, the population standard deviation, is known, replace $t$ and $s_{\bar{x}}$ with $z$ and $\sigma_{\bar{x}}$.
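The three methods above can be sketched in a few lines of Python. This is only an illustration, not part of the original outline: the function name `mean_test_two_sided` is made up here, the critical value of t is assumed to be read from a t table rather than computed, and the numbers are the ones used in the Appendix to these notes ($n = 7$, $\bar{x} = 12.00$, $s^2 = .6082333$, $\mu_0 = 12.2$, $t^{(6)}_{.025} = 2.447$).

```python
import math

def mean_test_two_sided(xbar, s2, n, mu0, t_crit):
    """Return the test ratio, the critical values for x-bar, and the confidence interval.

    t_crit is the tabled value of t for alpha/2 and n - 1 degrees of freedom
    (assumed to be looked up by the caller).
    """
    s_xbar = math.sqrt(s2 / n)            # standard error of the sample mean
    t_ratio = (xbar - mu0) / s_xbar       # (i) test ratio
    half_width = t_crit * s_xbar
    cv = (mu0 - half_width, mu0 + half_width)     # (ii) critical values, centered on mu0
    ci = (xbar - half_width, xbar + half_width)   # (iii) confidence interval, centered on x-bar
    return t_ratio, cv, ci

t_crit = 2.447
t_ratio, cv, ci = mean_test_two_sided(12.00, 0.6082333, 7, 12.2, t_crit)

# The three methods give the same decision.
reject_by_ratio = abs(t_ratio) > t_crit
reject_by_cv = not (cv[0] <= 12.00 <= cv[1])
reject_by_ci = not (ci[0] <= 12.2 <= ci[1])
print(round(t_ratio, 3), [round(v, 2) for v in cv], [round(v, 2) for v in ci])
print(reject_by_ratio, reject_by_cv, reject_by_ci)
```

Note that the critical values are built around $\mu_0$ and compared with $\bar{x}$, while the confidence interval is built around $\bar{x}$ and compared with $\mu_0$.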
c. One-sided tests.
To test $H_0: \mu \ge \mu_0$ against $H_1: \mu < \mu_0$ or $H_0: \mu \le \mu_0$ against $H_1: \mu > \mu_0$, if you use a critical value
or a confidence interval, you must use a one-sided one. Replace $t_{\alpha/2}$ with $t_{\alpha}$. One-sided tests take more
thinking than two-sided tests, and the most common error is in stating the null hypothesis. In a problem
statement, the question asked is often the alternative hypothesis, not the null. Always ask yourself if the
statement contains a strict inequality. If it does, it cannot be a null hypothesis.
Examples:
Question: Is the mean income less than 20000? $H_0: \mu \ge 20000$
Question: Is the mean income at least 20000? $H_0: \mu \ge 20000$
Question: Is the mean income more than 20000? $H_0: \mu \le 20000$
Question: Is the mean income at most 20000? $H_0: \mu \le 20000$
3. The Use of p-value instead of Significance Levels.
A p-value is a measure of the credibility of the null hypothesis and is defined as the probability that a test
statistic or ratio as extreme as or more extreme than the observed statistic or ratio could occur,
assuming that the null hypothesis is true. For a left-sided test, read "as extreme as or more extreme than" as
"as low as or lower than"; for a right-sided test, read it as "as high as or higher than."
Note: If we have a p-value and want to do a conventional hypothesis test, we can reject the null hypothesis
if the p-value is below the significance level. The p-value can thus be said to represent the smallest level
of significance at which the null hypothesis can be rejected. Other interpretations are:
a) (i) If the p-value is below .01, we strongly doubt $H_0$; (ii) if the p-value is between .01 and .05, we somewhat doubt $H_0$; and (iii) if the p-value is above .05, we cannot doubt $H_0$; or
b) (i) If the p-value is below .01, results are very significant; (ii) if the p-value is between .01 and .05, results are significant; (iii) if the p-value is between .05 and .10, results are marginally significant; and (iv) if the p-value is above .10, results are not significant.
This means that if we have calculated a t-ratio with a value of $t_{calc}$, and we have a left-sided test,
p-value $= P(t \le t_{calc})$. If we have a right-sided test, p-value $= P(t \ge t_{calc})$. If we have a 2-sided test,
p-value $= 2P(t \le t_{calc})$ or p-value $= 2P(t \ge t_{calc})$, whichever is smaller. So, for a one-sided test make a
diagram of the t distribution with a mean of zero, find the value of $t_{calc}$ and shade the appropriate side of
$t_{calc}$. For a 2-sided test, find both $t_{calc}$ and $-t_{calc}$ and shade the tail above whichever is positive and
below whichever is negative. For an example using t see 252onesx0. For an example using z replace t with z
in this paragraph and see 252doctor.
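When no t table is at hand, the tail areas described above can be approximated numerically. The sketch below is an assumption-laden illustration, not the method of these notes: it integrates the Student t density with Simpson's rule using only the Python standard library, and the helper names `t_pdf`, `t_cdf` and `p_value` are invented for this example. It is checked against the ratio $t_{calc} = -0.678$ with 6 degrees of freedom used in the Appendix.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), from Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 + area if t > 0 else 0.5 - area

def p_value(t_calc, df, tail):
    """tail is 'left', 'right' or 'two' (two-sided: double the smaller tail)."""
    left = t_cdf(t_calc, df)
    right = 1 - left
    if tail == "left":
        return left
    if tail == "right":
        return right
    return 2 * min(left, right)

for tail in ("left", "right", "two"):
    print(tail, round(p_value(-0.678, 6, tail), 4))
```

The printed values should fall in the ranges that the t table gives in the Appendix: between .25 and .30 for the left-sided test, between .70 and .75 for the right-sided test, and between .50 and .60 for the 2-sided test.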
4. Type One and Type Two Errors
a. Definitions
A Type one error is rejecting $H_0$ when $H_0$ is true.
A Type two error is not rejecting $H_0$ when $H_0$ is false.
b. Probabilities
                       $H_0$ True                           $H_0$ False
Do not reject $H_0$    $1 - \alpha$ (Confidence Level)      $\beta$ (Operating Characteristic)
Reject $H_0$           $\alpha$ (Significance Level)        $1 - \beta$ (Power)
5. Hypotheses about a Proportion
a. Tests: (For an example see 252onesx1)
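To test $H_0: p = p_0$ against $H_1: p \ne p_0$
i. Test Ratio: $z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$, $\sigma_{\bar{p}} = \sqrt{\frac{p_0 q_0}{n}}$
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_{\bar{p}}$
iii. Confidence Interval: $p = \bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$, $s_{\bar{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}}$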
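A short Python sketch of these three proportion methods follows. The sample figures (60 successes in 100 trials, tested against $p_0 = .5$ at $\alpha = .05$) are hypothetical and chosen only for illustration; the function name `proportion_test` is likewise invented. The z value comes from `statistics.NormalDist` in the standard library.

```python
from math import sqrt
from statistics import NormalDist

def proportion_test(x, n, p0, alpha=0.05):
    """Two-sided test of H0: p = p0 using the normal approximation."""
    p_bar = x / n
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z for alpha/2
    sigma = sqrt(p0 * (1 - p0) / n)                # uses p0, for the ratio and critical value
    s = sqrt(p_bar * (1 - p_bar) / n)              # uses the sample proportion, for the interval
    z = (p_bar - p0) / sigma
    cv = (p0 - z_crit * sigma, p0 + z_crit * sigma)
    ci = (p_bar - z_crit * s, p_bar + z_crit * s)
    return z, z_crit, cv, ci

# Hypothetical data: 60 successes in 100 trials.
z, z_crit, cv, ci = proportion_test(60, 100, 0.5)
print(round(z, 3), [round(v, 3) for v in cv], [round(v, 3) for v in ci])
print("reject H0:", abs(z) > z_crit)
```

The test ratio and the critical value use $\sigma_{\bar{p}}$, which is computed from $p_0$, while the confidence interval uses $s_{\bar{p}}$, which is computed from the sample proportion.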
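(b. Continuity Correction.
The continuity correction acts to expand the 'accept' interval by $\Delta x = \frac{1}{2}$ in each direction. It should be used
if $npq < 9$.
i. Test Ratio: $z = \frac{\bar{p} \pm \frac{.5}{n} - p_0}{\sigma_{\bar{p}}}$, $\sigma_{\bar{p}} = \sqrt{\frac{p_0 q_0}{n}}$. Use $+$ if $\bar{p} < p_0$ and $-$ if $\bar{p} > p_0$.
This is the same as testing the uncorrected $z$ against $\pm\left(z_{\alpha/2} + \frac{.5}{n\,\sigma_{\bar{p}}}\right)$.
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm \left(\frac{1}{2n} + z_{\alpha/2}\, \sigma_{\bar{p}}\right)$
iii. Confidence Interval: $p = \bar{p} \pm \left(\frac{1}{2n} + z_{\alpha/2}\, s_{\bar{p}}\right)$
)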
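The equivalence claimed in (i) can be checked numerically. The sketch below uses hypothetical data (10 successes in 30 trials, $p_0 = .5$, so that $npq = 7.5$) and an invented function name, `corrected_z`. It compares the corrected ratio with $z_{\alpha/2}$ and the uncorrected ratio with the widened limit, and the two comparisons should always agree.

```python
from math import sqrt
from statistics import NormalDist

def corrected_z(x, n, p0):
    """Continuity-corrected test ratio for a proportion."""
    p_bar = x / n
    sigma = sqrt(p0 * (1 - p0) / n)
    if p_bar < p0:
        shift = 0.5 / n        # move the sample proportion up toward p0
    elif p_bar > p0:
        shift = -0.5 / n       # move the sample proportion down toward p0
    else:
        shift = 0.0
    return (p_bar + shift - p0) / sigma

# Hypothetical data: 10 successes in 30 trials, H0: p = .5, alpha = .05.
x, n, p0 = 10, 30, 0.5
sigma = sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(0.975)

z_cc = corrected_z(x, n, p0)              # corrected ratio
z_plain = (x / n - p0) / sigma            # uncorrected ratio
wide_limit = z_crit + 0.5 / (n * sigma)   # widened limit for the uncorrected ratio

print(round(z_cc, 4), round(z_plain, 4), round(wide_limit, 4))
print(abs(z_cc) > z_crit, abs(z_plain) > wide_limit)   # same decision either way
```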
6. The Sign Test
a. The Sign Test for a Median.
To test $H_0: \eta = \eta_0$ against $H_1: \eta \ne \eta_0$
For any distribution other than the normal distribution, it is usually easier to use the p-value approach. For
example, let us assume that we are testing the hypotheses $H_0: \eta = 25$ and $H_1: \eta \ne 25$, where $\eta$ is, as before,
the median. The most important fact to know about testing for a median is that numbers above and below a
median are equally likely to occur in a random sample. A test of the median is a test of proportions.
So let us use $p$ as the proportion of the observations in our population that are above 25. ($p$
could just as easily be the proportion below 25.) If this is true, and we are working with a continuous
distribution, our hypotheses become $H_0: p = .5$ and $H_1: p \ne .5$. Now let us assume that we take a sample of
$n = 20$ and that we find that $x$, the number of points above 25, is 5. We expect that half of our points, or
10, will be above 25, so 5 seems low. We thus use a binomial table to find $P(x \le 5)$ for
$n = 20$ and $p = .5$. The table tells us that $P(x \le 5) = .0207$. Since this is a two-sided test, we double this
probability to .0414 and use this as our p-value. If our confidence level is 95%, our significance level must
be 5%, and, since the p-value is below 5%, we reject the null hypothesis.
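The binomial table lookup in this example can be reproduced directly. The following is a minimal sketch using `math.comb` from the Python standard library; the helper name `binom_cdf` is made up for this illustration.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for a binomial(n, p) count; 0 when k is negative."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, x = 20, 5                             # 5 of the 20 sample points are above 25
lower_tail = binom_cdf(x, n)             # P(x <= 5)
upper_tail = 1 - binom_cdf(x - 1, n)     # P(x >= 5)
p_value = min(1.0, 2 * min(lower_tail, upper_tail))   # two-sided: double the smaller tail

print(round(lower_tail, 4), round(p_value, 4))
print("reject H0 at 5%:", p_value < 0.05)
```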
(But if we are to repeat tests, it may be wise to define acceptance and rejection regions by defining
two critical values, and saying that if $x \le CV_L$ or $x \ge CV_U$, we will reject the null hypothesis. Again
assume that the significance level is $\alpha = .05$. We can use the p-value approach by saying that, if we would
reject the null hypothesis using the p-value approach for some value of $x$, that value is in our rejection
region. Starting from the bottom, try $x = 0$. From the table for $n = 20$ and $p = .5$, $P(x \le 0) = .0000$. Since
this probability is below $\alpha/2$, we would reject $H_0$ if $x$ were 0. We come to a similar conclusion if $x$ takes
values of 1, 2, 3 or 4. If $x = 5$, we have already seen that $P(x \le 5) = .0207$, and that we would still reject the
null hypothesis. But if we try $x = 6$, we find $P(x \le 6) = .0577$, which is above $\alpha/2 = .025$, so we accept
$H_0$ if $x$ is 6 or larger. But, since this is a two-sided test, it is also possible that $x$ is too large. For
example, if $x$ is 16, $P(x \ge 16) = 1 - P(x \le 15) = 1 - .9941 = .0059$. Since this is below $\alpha/2$, we would reject
the null hypothesis if $x$ were 16 or larger. So try $x = 15$: $P(x \ge 15) = 1 - P(x \le 14) = 1 - .9793 = .0207$.
This is still too low for acceptance, so try 14: $P(x \ge 14) = 1 - P(x \le 13) = 1 - .9423 = .0577$. Since this is
above $\alpha/2$, we would accept the null hypothesis if $x$ were 14. We can thus say that we accept the null
hypothesis if $x$ is between 6 and 14, or that our critical values for $x$ are 5 and 15. If we now look back at the
cumulative binomial table, we see that we rejected the null hypothesis for probabilities below .025 (or $\alpha/2$)
and above .975 (or $1 - \alpha/2$).)
Let's try a one-sided problem. Suppose that our null hypothesis is that median income in a region
is at least \$20000 and that we take a sample with the results shown below. Let $\alpha = .05$.
Observation No.    Income
 1                 10132
 2                 11252
 3                 13475
 4                 14260
 5                 16871
 6                 19357
 7                 19438
 8                 23010
 9                 30278
10                 35932
Our hypotheses are $H_0: \eta \ge 20000$, $H_1: \eta < 20000$. Let $p$ be the proportion of numbers in the population below
20000. If the median is exactly 20000, $p$ will be exactly 0.5. But if the median is above 20000, $p$ will be below .5.
We can replace our original hypotheses with $H_0: p \le .5$ and $H_1: p > .5$. We see that $x$, the quantity of numbers in the
sample below 20000, is 7. Our expected number of items below 20000 is $0.5n = .5(10) = 5$, so 7 is high and our
p-value will be $P(x \ge 7) = 1 - P(x \le 6) = 1 - .8281 = .1719$. Since the p-value is above the significance level,
we accept the null hypothesis.
(If we wish to set up accept and reject zones for this one-sided test, we need to try higher values of
$x$. A value of $x = 8$ is still too small; it gives a probability of .0547, which is above $\alpha$, so try $x = 9$.
According to the binomial table for $n = 10$ and $p = .5$, $P(x \ge 9) = 1 - P(x \le 8) = 1 - .9893 = .0107$, which is
below $\alpha$. So 9 is our critical value, and we will reject the hypothesis if $x \ge 9$.)
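The one-sided income example can be worked through from the raw data. This sketch uses the ten incomes listed in the sample above; the helper name `upper_tail` is invented for the illustration, and $p = .5$ is assumed throughout, as in the text.

```python
from math import comb

incomes = [10132, 11252, 13475, 14260, 16871, 19357, 19438, 23010, 30278, 35932]
n = len(incomes)
x = sum(1 for income in incomes if income < 20000)   # number of incomes below 20000

def upper_tail(k, n):
    """P(X >= k) for a binomial(n, .5) count."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

alpha = 0.05
p_value = upper_tail(x, n)                            # P(x >= 7)
critical = min(k for k in range(n + 1) if upper_tail(k, n) <= alpha)

print(x, round(p_value, 4), critical)
print("reject H0:", p_value < alpha)
```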
To clarify the correspondence between hypotheses about a median and hypotheses about a proportion, let us
assume that $p$ is the proportion of the data above 20000. If 20000 is the median, then by definition of the
median $p$ is one half. But let us assume that the median is above 20000, say 21000. Then one half of the data
must be above 21000, so that more than one half of the data must be above 20000, which means less than one
half of the data is below 20000. Since a hypothesis about a median is a hypothesis about a proportion,
$H_0: \eta \ge \eta_0$, $H_1: \eta < \eta_0$ corresponds to $H_0: p \ge .5$, $H_1: p < .5$. The table below shows these
correspondences depending on the definition of $p$.
Hypotheses about a median      Hypotheses about a proportion
                               If $p$ is the proportion     If $p$ is the proportion
                               above $\eta_0$               below $\eta_0$
$H_0: \eta = \eta_0$           $H_0: p = .5$                $H_0: p = .5$
$H_1: \eta \ne \eta_0$         $H_1: p \ne .5$              $H_1: p \ne .5$

$H_0: \eta \ge \eta_0$         $H_0: p \ge .5$              $H_0: p \le .5$
$H_1: \eta < \eta_0$           $H_1: p < .5$                $H_1: p > .5$

$H_0: \eta \le \eta_0$         $H_0: p \le .5$              $H_0: p \ge .5$
$H_1: \eta > \eta_0$           $H_1: p > .5$                $H_1: p < .5$
b. The Sign Test more Generally.
This technique can be used in other ways. For instance, let us say that we wish to check the
effectiveness of a product brochure. A sample of 17 clients is asked about their impression of a product.
Then they read the brochure and once again are asked their impression. We write a + if their impression
has improved and a - if it is worse. A zero indicates no change. Our results are as follows:
Client   1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17
Sign     +  +  +  +  0  -  +  0  -  +   +   +   -   +   0   +   +
Since we are hoping for a positive effect, count the zeros as minuses. We will use the brochure if we believe
that the majority of the population will respond favorably. Let $p$ be the proportion of plusses in the
population. Our hypotheses are $H_0: p \le .5$, $H_1: p > .5$. There are 11 plusses, so we must find $P(x \ge 11)$
when $p = .5$ and $n = 17$. A large binomial table says that this value is .166, so we must accept the null
hypothesis and not use the brochure.
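For readers without a large binomial table, the brochure example can be checked with exact binomial arithmetic. The sketch below encodes the 17 client responses as a string in client order; that encoding and the variable names are choices made for this illustration only.

```python
from math import comb

signs = "++++0-+0-+++-+0++"      # clients 1 through 17, in order
n = len(signs)
x = signs.count("+")             # zeros are counted with the minuses

# P(X >= x) for a binomial(n, .5) count, the p-value for H1: p > .5
p_value = sum(comb(n, i) for i in range(x, n + 1)) / 2**n

print(n, x, round(p_value, 3))
print("use the brochure:", p_value < 0.05)
```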
In the absence of a binomial table we must use the normal approximation to the binomial
distribution. If $\bar{p} = \frac{x}{n}$ is our observed proportion, we use $z = \frac{\bar{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$. But for the sign test,
$p_0 = .5$ and $q_0 = 1 - .5 = .5$. So
$z = \frac{\frac{x}{n} - .5}{\sqrt{\frac{.25}{n}}} = \frac{\frac{x}{n} - .5}{\frac{.5}{\sqrt{n}}} = \frac{x - .5n}{.5\sqrt{n}} = \frac{2x - n}{\sqrt{n}}$.
(For relatively small values of $n$, a continuity correction is advisable, so try $z = \frac{2x \pm 1 - n}{\sqrt{n}}$, where the $+$
applies if $x < \frac{n}{2}$, and the $-$ applies if $x > \frac{n}{2}$. In the problem above, where
$n = 17$, use $P(x \ge 11) \approx P\left(z \ge \frac{2(11) - 1 - 17}{\sqrt{17}}\right) = P(z \ge .970) = .5 - .3340 = .1660$. Since this is a p-value, if
$\alpha$ takes a typical value like .05 or .10, we can say p-value $> \alpha$ and accept the null hypothesis.)
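The continuity-corrected approximation can be compared with the exact binomial tail. In this sketch the function name `sign_test_z` is invented, and `statistics.NormalDist` stands in for the Normal table.

```python
from math import comb, sqrt
from statistics import NormalDist

def sign_test_z(x, n):
    """Continuity-corrected z for the sign test: (2x +/- 1 - n) / sqrt(n)."""
    if x > n / 2:
        numerator = 2 * x - 1 - n    # the minus applies when x is above n/2
    elif x < n / 2:
        numerator = 2 * x + 1 - n    # the plus applies when x is below n/2
    else:
        numerator = 0
    return numerator / sqrt(n)

n, x = 17, 11
z = sign_test_z(x, n)
p_approx = 1 - NormalDist().cdf(z)                        # P(z >= .970)
p_exact = sum(comb(n, i) for i in range(x, n + 1)) / 2**n  # exact P(x >= 11)

print(round(z, 3), round(p_approx, 4), round(p_exact, 4))
```

With the correction, the approximate and exact p-values agree to about three decimal places even at $n = 17$.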
7. Hypothesis Test for Means - Rare Events
In statistics, ‘rare events’ is a code word for the Poisson distribution. The easiest way to approach Poisson
results is to use a p-value. For example, if you wish to test $H_0: \mu = 5$ against $H_1: \mu \ne 5$ and you have a
result that says $x = 7$, the p-value is $2P(x \ge 7)$. (For an example, see 252sx2)
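The Poisson p-value in this example can be computed without a table. This is a small sketch with an invented helper, `poisson_cdf`; the numerical result is not stated in the notes themselves.

```python
from math import exp, factorial

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson count with the given mean."""
    return sum(exp(-mean) * mean**i / factorial(i) for i in range(k + 1))

mean0, x = 5, 7
upper_tail = 1 - poisson_cdf(x - 1, mean0)   # P(x >= 7) = 1 - P(x <= 6)
p_value = min(1.0, 2 * upper_tail)           # two-sided: double the tail, capped at 1

print(round(upper_tail, 4), round(p_value, 4))
```

Since the p-value comes out well above any customary significance level, a count of 7 would not lead to rejection of a mean of 5.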
8. Hypothesis Tests for a Variance.
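To test $H_0: \sigma = \sigma_0$ against $H_1: \sigma \ne \sigma_0$ (For an example, see 252sx2)
i. Test Ratio: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ or, for large samples, $z = \sqrt{2\chi^2} - \sqrt{2(DF) - 1}$
ii. Critical Value: $s^2_{cv} = \frac{\chi^2_{\alpha/2}\, \sigma_0^2}{n-1}$ or $\frac{\chi^2_{1-\alpha/2}\, \sigma_0^2}{n-1}$ (Don't try this for large samples.)
iii. Confidence Interval: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ or, for large samples,
$\frac{s\sqrt{2(DF)}}{z_{\alpha/2} + \sqrt{2(DF)}} \le \sigma \le \frac{s\sqrt{2(DF)}}{-z_{\alpha/2} + \sqrt{2(DF)}}$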
Appendix: One-sided and Two Sided Tests.
Assume the following: $n = 7$, $\mu_0 = 12.2$, $DF = n - 1 = 6$, $\alpha = .05$, $\bar{x} = 12.00$, $s^2 = .6082333$, so that
$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{0.6082333}{7}} = 0.29477$.
A 2-sided Test: $H_0: \mu = 12.2$, $H_1: \mu \ne 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$. We test this against two values of t, $t^{(n-1)}_{\alpha/2} = t^{(6)}_{.025} = 2.447$
and $-t^{(n-1)}_{\alpha/2} = -t^{(6)}_{.025} = -2.447$. We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha/2}$ or below $-t^{(n-1)}_{\alpha/2}$. In this case we do
not reject $H_0$.
If we use p-value: pval $= 2P(t \le -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$.
So $P(t \le -0.678)$ is between .25 and .30 and the p-value is between .50 and .60. Since the p-value is above
$\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}} = 12.2 \pm 2.447(0.29477) = 12.2 \pm 0.72$. We reject $H_0$ if $\bar{x}$ is
above the upper $\bar{x}_{cv} = 12.2 + 0.72 = 12.92$ or below the lower $\bar{x}_{cv} = 12.2 - 0.72 = 11.48$. In this case
$\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}} = 12.00 \pm 2.447(0.29477) = 12.00 \pm 0.72$. This interval is 11.28 to
12.72. Since $\mu_0 = 12.2$ is between these two limits, we do not reject $H_0$.
A Left-sided Test: $H_0: \mu \ge 12.2$, $H_1: \mu < 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$. We test this against one value of t, $-t^{(n-1)}_{\alpha} = -t^{(6)}_{.05} = -1.943$.
We reject $H_0$ if $t$ is below $-t^{(n-1)}_{\alpha}$. In this case we do not reject $H_0$.
If we use p-value: pval $= P(t \le -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So
$P(t \le -0.678)$ is between .25 and .30 and the p-value is between .25 and .30. Since the p-value is above
$\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 - t_{\alpha}\, s_{\bar{x}} = 12.2 - 1.943(0.29477) = 12.2 - 0.57 = 11.63$. We reject $H_0$ if
$\bar{x}$ is below $\bar{x}_{cv} = 11.63$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternate hypothesis is $H_1: \mu < 12.2$, use $\mu \le \bar{x} + t_{\alpha}\, s_{\bar{x}}$
$= 12.00 + 1.943(0.29477) = 12.00 + 0.57 = 12.57$. Since $H_0: \mu \ge 12.2$ does not contradict $\mu \le 12.57$, we do
not reject $H_0$.
A Right-sided Test: $H_0: \mu \le 12.2$, $H_1: \mu > 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$. We test this against one value of t, $t^{(n-1)}_{\alpha} = t^{(6)}_{.05} = 1.943$.
We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha}$. In this case we do not reject $H_0$.
If we use p-value: pval $= P(t \ge -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So
$P(t \le -0.678)$ is between .25 and .30, $P(t \ge -0.678)$ is between .70 and .75 and the p-value is between .70
and .75. Since the p-value is above $\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 + t_{\alpha}\, s_{\bar{x}} = 12.2 + 1.943(0.29477) = 12.2 + 0.57 = 12.77$. We reject $H_0$ if
$\bar{x}$ is above $\bar{x}_{cv} = 12.77$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternate hypothesis is $H_1: \mu > 12.2$, use $\mu \ge \bar{x} - t_{\alpha}\, s_{\bar{x}}$
$= 12.00 - 1.943(0.29477) = 12.00 - 0.57 = 11.43$. Since $H_0: \mu \le 12.2$ does not contradict $\mu \ge 11.43$, we
do not reject $H_0$.
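The one-sided arithmetic in this Appendix can be verified in a few lines. The sketch below uses the Appendix figures, with $t^{(6)}_{.05} = 1.943$ taken from the t table as given; the variable names are chosen for this illustration only.

```python
import math

n, mu0, xbar, s2 = 7, 12.2, 12.00, 0.6082333
t_alpha = 1.943                      # one-sided table value of t for alpha = .05, DF = 6

s_xbar = math.sqrt(s2 / n)           # standard error of the mean
t_ratio = (xbar - mu0) / s_xbar
margin = t_alpha * s_xbar

# Left-sided test, H1: mu < 12.2
left_cv = mu0 - margin               # reject if x-bar is below this
left_bound = xbar + margin           # one-sided interval: mu <= this
reject_left = t_ratio < -t_alpha

# Right-sided test, H1: mu > 12.2
right_cv = mu0 + margin              # reject if x-bar is above this
right_bound = xbar - margin          # one-sided interval: mu >= this
reject_right = t_ratio > t_alpha

print(round(left_cv, 2), round(left_bound, 2), reject_left)
print(round(right_cv, 2), round(right_bound, 2), reject_right)
```

In each case the one-sided interval points in the same direction as the alternative hypothesis, and $H_0$ is rejected only if the interval excludes every value that $H_0$ allows.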
More on p-value
Let’s say that you have gotten one of the following results for a test of a mean with $n = 25$:
a) $t = 1.000$
b) $t = -1.000$
c) $z = 1.000$ The values of z could also come from tests of proportions or variances.
d) $z = -1.000$
A 2-sided Test
A p-value is defined as the probability that a test statistic or ratio as extreme as or more extreme than the
observed statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \ge 1.000)$.
To find $P(t \ge 1.000)$, look at the t table. Since $n = 25$, $df = n - 1 = 24$.
Look at the $df = 24$ line. You will find that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means
that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these values we can say
$.15 < P(t \ge 1.000) < .20$. So pval $= 2P(t \ge 1.000)$, which means $.30 <$ pval $< .40$.
b) $t = -1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \le -1.000)$.
You found in a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$
and $P(t \ge 1.059) = .15$, but, since the t distribution is symmetrical, we can also say $P(t \le -0.857) = .20$ and
$P(t \le -1.059) = .15$. Since $-1.000$ is between these values we can say $.15 < P(t \le -1.000) < .20$. So
pval $= 2P(t \le -1.000)$, which means $.30 <$ pval $< .40$.
c) $z = 1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \ge 1.000)$. Make a diagram for z with a center
at zero and shade the area above 1.000. Use the Normal
table. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$, so pval $= 2P(z \ge 1.000) = 2(.1587) = .3174$.
d) $z = -1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \le -1.000)$. Make a diagram for z with a
center at zero and shade the area below -1.000.
$P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$, so pval $= 2P(z \le -1.000) = 2(.1587) = .3174$.
A Left-sided Test
A p-value is defined as the probability that a test statistic or ratio as low as or lower than the observed
statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le 1.000)$. You found in 2-sided Test a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and
$t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these
values we can say $.15 < P(t \ge 1.000) < .20$. But you want $P(t \le 1.000)$, so subtract these probabilities from
1. So $.80 <$ pval $< .85$.
b) $t = -1.000$. You want $P(t \le -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t
distribution is symmetrical, we can also say $.15 < P(t \le -1.000) < .20$ or $.15 <$ pval $< .20$.
c) $z = 1.000$. You want $P(z \le 1.000)$. Make a diagram for z with a center at zero and shade the area below
1.000. $P(z \le 1.000) = P(z \le 0) + P(0 \le z \le 1) = .5 + .3413 = .8413$, so pval $= .8413$.
d) $z = -1.000$. You want $P(z \le -1.000)$. Make a diagram for z with a center at zero and shade the area
below -1.000. $P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$, so pval $= .1587$.
A Right-sided Test
A p-value is defined as the probability that a test statistic or ratio as high as or higher than the observed
statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \ge 1.000)$. You found in 2-sided Test a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and
$t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these
values we can say $.15 < P(t \ge 1.000) < .20$. So $.15 <$ pval $< .20$.
b) $t = -1.000$. You want $P(t \ge -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t
distribution is symmetrical, we can also say $.15 < P(t \le -1.000) < .20$. But you want $P(t \ge -1.000)$, so
subtract these probabilities from 1. So $.80 <$ pval $< .85$.
c) $z = 1.000$. You want $P(z \ge 1.000)$. Make a diagram for z with a center at zero and shade the area above
1.000. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$, so pval $= .1587$.
d) $z = -1.000$. You want $P(z \ge -1.000)$. Make a diagram for z with a center at zero and shade the area
above -1.000. $P(z \ge -1.000) = P(z \ge 0) + P(-1 \le z \le 0) = .5 + .3413 = .8413$, so pval $= .8413$.
Note that, since every one of these p-values is above 1%, 5% and 10%, you would not reject the null
hypothesis if you used any of these significance levels.
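The z results in cases c) and d) can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch, with the function name `z_p_value` invented here. Because it does not round to four-place table values, the 2-sided result prints as .3173 rather than the .3174 obtained from the table.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value for a z ratio; tail is 'left', 'right' or 'two'."""
    left = NormalDist().cdf(z)     # P(Z <= z)
    right = 1 - left               # P(Z >= z)
    if tail == "left":
        return left
    if tail == "right":
        return right
    return 2 * min(left, right)    # 2-sided: double the smaller tail

for z in (1.0, -1.0):
    results = {tail: round(z_p_value(z, tail), 4) for tail in ("two", "left", "right")}
    print(z, results)

# None of these p-values falls below the usual significance levels.
smallest = min(z_p_value(z, tail) for z in (1.0, -1.0) for tail in ("two", "left", "right"))
print(smallest > 0.10)
```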
© 2005 R. E. Bove