B. HYPOTHESIS TESTS FOR ONE SAMPLE

252onesl 9/20/07 (Open this document in 'Outline' view!)
1. The Meaning of Hypothesis Testing
A hypothesis is a statement about the characteristics of a population.
To be of any use to us it must be quantifiable and testable.
The hypothesis to be tested is usually called the Null Hypothesis, $H_0$.
A rival hypothesis is called the Alternative or Research Hypothesis, $H_1$
or $H_A$. The null hypothesis is usually a hypothesis of "no difference,"
and the alternative hypothesis covers all other possibilities.
To start out ask "What do I want to know about a population or
populations?" Can I state this in terms of population parameters?
Can I state this in terms of a testable hypothesis (null hypothesis)?
Does my null hypothesis say that these parameters or differences
between these parameters are insignificant (i.e. not distinct from
zero)?
Next ask "What am I assuming about the population or populations?"
Are the parameters that I am testing appropriate to the type of
population that I am assuming? What can I use to test my
hypothesis? Can I find a sample statistic or statistics to do the job?
How many samples do I need? Can I calculate the sample statistic
or statistics? What distribution does the test statistic have? Is this
in accord with the null hypothesis? What errors am I likely to make?
There are usually three approaches to testing a hypothesis
about a parameter of a population:
(i) The Test Ratio Method, in which a ratio involving an
estimate of a parameter is tested against a well-known
distribution like t;
(ii) The Critical Value Method, in which values of
estimates of a parameter are found which would lead to
rejection of $H_0$; and
(iii) The Confidence Interval Method, in which a
confidence interval is constructed for the parameter and
compared to the value in the null hypothesis.
If I use a test ratio, what is the probability of getting values as
extreme or more extreme than I actually got? (This probability is
the p-value: the lower it is, the less consistent the data are with
the null hypothesis. If the p-value falls below the significance level,
I can say that I reject the null hypothesis.)
If I use a critical value for a test statistic and my significance level
is 5% or 1%, did the value fall among the most likely 95% or 99% of
values? Or was it a very unlikely value?
If I use a confidence interval, did the parameter value in my
null hypothesis fall in the confidence interval?
Remember:
a. A null hypothesis is usually a statement
about a parameter of a population. It is
never a statement about a sample statistic.
A sample statistic is used to test the
hypothesis.
b. A null hypothesis usually contains an
equality; an alternative hypothesis does not
contain an equality.
c. A null hypothesis often says that a
parameter or a difference between parameters
is insignificant. If a result is significant we
reject the null hypothesis.
2. Steps for Testing a Hypothesis Applied to testing
for a Population Mean
a. Outline
i. State the problem as two hypotheses.
ii. Quantify the hypotheses.
iii. Identify the statistic, ratio or interval to be used.
iv. Determine the sampling distribution of the statistic to
be used.
v. Select a level of significance.
vi. Find a value or values of the test ratio or statistic that
would lead to rejection of the null hypothesis.
vii. Compute a value of the statistic or ratio from a
random sample.
viii. By comparing the results of (vii) with the values found
in (vi), reject or do not reject (‘accept’) the null hypothesis.
b. Application to a Population Mean
To test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$, assume that we
have computed $s$ from our sample, and that we do not know
$\sigma$.
i. Test Ratio: $t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}}$
ii. Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}}$
iii. Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$
Note: If $\sigma$, the population standard deviation, is
known, replace $t$ and $s_{\bar{x}}$ with $z$ and $\sigma_{\bar{x}}$.
c. One-sided Tests.
To test $H_0: \mu \ge \mu_0$ against $H_1: \mu < \mu_0$ or $H_0: \mu \le \mu_0$ against
$H_1: \mu > \mu_0$, if you use a critical value or a confidence
interval you must use a one-sided one. Replace $t_{\alpha/2}$ with $t_{\alpha}$.
One-sided tests take more thinking than two-sided tests, and the
most common error is in stating the null hypothesis. In a problem
statement, the question asked is often the alternative hypothesis,
not the null. Always ask yourself if the statement contains a strict
inequality. If it does, it cannot be a null hypothesis.
Examples:
Question: Is the mean income less than 20000? $H_0: \mu \ge 20000$
Question: Is the mean income at least 20000? $H_0: \mu \ge 20000$
Question: Is the mean income more than 20000? $H_0: \mu \le 20000$
Question: Is the mean income at most 20000? $H_0: \mu \le 20000$
3. The Use of p-value Instead of Significance Levels.
A p-value is a measure of the credibility of the null hypothesis and is
defined as the probability that a test statistic or ratio as extreme as
or more extreme than the observed statistic or ratio could occur,
assuming that the null hypothesis is true. For a one-sided test, read
"as extreme as or more extreme than" as "as low as or lower than"
(left-sided) or "as high as or higher than" (right-sided).
Note: If we have a p-value and want to do a conventional hypothesis
test, we can reject the null hypothesis if the p-value is below the
significance level. The p-value can thus be said to represent the
smallest level of significance at which the null hypothesis can be
rejected.
Other interpretations are:
a) (i) If p-value $< .01$ we strongly doubt $H_0$,
(ii) If $.01 <$ p-value $< .05$ we somewhat doubt $H_0$, and
(iii) If p-value $> .05$ we cannot doubt $H_0$; or
b) (i) If p-value $< .01$ results are very significant,
(ii) If $.01 <$ p-value $< .05$ results are significant,
(iii) If $.05 <$ p-value $< .10$ results are marginally significant,
and
(iv) If p-value $> .10$ results are not significant.
For an example using t see 252onesx0. For an example using z, replace t with z in
this paragraph and see 252doctor.
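The decision rule and interpretation scale (b) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the treatment of a p-value that falls exactly on a boundary such as .01 is an assumption, since the notes do not say which side it belongs to.

```python
def describe_p_value(p_value: float) -> str:
    """Label a p-value using interpretation scale (b) from the notes."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "very significant"
    if p_value < 0.05:
        return "significant"
    if p_value < 0.10:
        return "marginally significant"
    return "not significant"  # boundary values are assigned upward (assumption)


def reject_null(p_value: float, alpha: float) -> bool:
    """Conventional test: reject H0 when the p-value is below alpha."""
    return p_value < alpha
```

For instance, `describe_p_value(0.03)` gives "significant", and `reject_null(0.03, 0.01)` is false because .03 is not below a 1% significance level.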
4. Type One and Type Two Errors
a. Definitions
A Type one error is rejecting H 0 when H 0 is true.
A Type two error is not rejecting H 0 when H 0 is false.
                      $H_0$ True       $H_0$ False
Do not reject $H_0$   Not an error     Type II error
Reject $H_0$          Type I error     Not an error
b. Probabilities
                      $H_0$ True                        $H_0$ False
Do not reject $H_0$   $1 - \alpha$ (Confidence Level)   $\beta$ (Operating Characteristic)
Reject $H_0$          $\alpha$ (Significance Level)     $1 - \beta$ (Power)
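To make the four probabilities concrete, here is a minimal Python sketch for a two-sided z test of a mean with known $\sigma$. The function name and all numbers in the example call are hypothetical and are not taken from these notes; the point is only that $\beta$ and the power depend on which alternative value of the mean is actually true.

```python
from math import sqrt
from statistics import NormalDist


def z_test_error_probabilities(mu0, mu1, sigma, n, alpha):
    """Return (alpha, beta, power) for a two-sided z test of H0: mu = mu0
    when the true mean is mu1 and sigma is known."""
    sigma_xbar = sigma / sqrt(n)                      # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # z for alpha/2 in each tail
    lower = mu0 - z_crit * sigma_xbar                 # critical values for x-bar
    upper = mu0 + z_crit * sigma_xbar
    sampling = NormalDist(mu1, sigma_xbar)            # x-bar distribution if mu = mu1
    beta = sampling.cdf(upper) - sampling.cdf(lower)  # P(do not reject | mu = mu1)
    return alpha, beta, 1 - beta


# Hypothetical numbers, chosen only for illustration.
alpha, beta, power = z_test_error_probabilities(
    mu0=100, mu1=103, sigma=10, n=25, alpha=0.05)
```

If `mu1` is set equal to `mu0`, the "do not reject" probability returned in the second position is $1 - \alpha$, the confidence level, which matches the first column of the table.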
5. Hypotheses about a Proportion
a. Tests: (For an example see 252onesx1)
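To test $H_0: p = p_0$ against $H_1: p \ne p_0$, where $\bar{p} = x/n$ is the observed proportion and $q = 1 - p$:
i. Test Ratio: $z = \dfrac{\bar{p} - p_0}{\sigma_{\bar{p}}}$, $\sigma_{\bar{p}} = \sqrt{\dfrac{p_0 q_0}{n}}$
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_{\bar{p}}$
iii. Confidence Interval: $p = \bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$, $s_{\bar{p}} = \sqrt{\dfrac{\bar{p}\,\bar{q}}{n}}$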
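The three methods for a proportion can be sketched together in Python. The function name and the sample in the example call (60 successes in 100 trials) are hypothetical; the z critical value is computed with the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05):
    """Two-sided test of H0: p = p0 by test ratio, critical value and interval."""
    p_bar = x / n
    sigma_p = sqrt(p0 * (1 - p0) / n)        # standard error under H0
    s_p = sqrt(p_bar * (1 - p_bar) / n)      # standard error estimated from sample
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = (p_bar - p0) / sigma_p
    return {
        "z": z,
        "p_value": 2 * (1 - NormalDist().cdf(abs(z))),
        "critical_values": (p0 - z_crit * sigma_p, p0 + z_crit * sigma_p),
        "confidence_interval": (p_bar - z_crit * s_p, p_bar + z_crit * s_p),
    }


# Hypothetical sample: 60 successes in 100 trials, H0: p = .5.
result = proportion_z_test(x=60, n=100, p0=0.5)
```

In this example the test ratio is about 2.0, the observed proportion .6 lies just above the upper critical value, and the confidence interval excludes .5, so all three methods reject the null hypothesis at the 5% level. Because the critical value uses $\sigma_{\bar{p}}$ and the interval uses $s_{\bar{p}}$, the methods need not always agree in borderline cases.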
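(b. Continuity Correction.
The continuity correction acts to expand the 'accept' interval
by $\frac{1}{2}$ in each direction in terms of $x$, that is, by $\frac{1}{2n}$ in
terms of $\bar{p}$. It should be used if $npq < 9$.
i. Test Ratio: $z = \dfrac{\bar{p} \pm \frac{.5}{n} - p_0}{\sigma_{\bar{p}}}$.
Use $+$ if $\bar{p} < p_0$ and $-$ if $\bar{p} > p_0$.
This is the same as testing the uncorrected $z$
against $\pm\left(z_{\alpha/2} + \dfrac{.5}{n\,\sigma_{\bar{p}}}\right)$.
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm \left(z_{\alpha/2}\, \sigma_{\bar{p}} + \dfrac{1}{2n}\right)$, $\sigma_{\bar{p}} = \sqrt{\dfrac{p_0 q_0}{n}}$
iii. Confidence Interval: $p = \bar{p} \pm \left(\dfrac{1}{2n} + z_{\alpha/2}\, s_{\bar{p}}\right)$
)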
6. The Sign Test
a. The Sign Test for a Median.
To test $H_0: \eta = \eta_0$ against $H_1: \eta \ne \eta_0$ ($\eta$ denotes the median):
In any distribution outside of the normal distribution, it is
usually easier to use the p-value approach. For example, let
us assume that we are testing the hypotheses $H_0: \eta = 25$
and $H_1: \eta \ne 25$, where $\eta$ is, as before, the median. The
most important fact to know about testing for a median is that
numbers above and below a median are equally likely to
occur in a random sample.
A test of the median is a test of the proportion of points above
or below the alleged median.
So let us use $p$ as the proportion of the observations
in our population that are above 25. ($p$ could just as easily be
the proportion below 25.) If we do this, and we are working
with a continuous distribution, our hypotheses become
$H_0: p = .5$ and $H_1: p \ne .5$. Now let us assume that we
take a sample of $n = 20$ and that we find that $x$, the number
of points above 25, is 5. We expect that half of our points,
or 10, will be above 25, so 5 seems low. We thus use a
binomial table to find $P(x \le 5)$ for $n = 20$ and $p = .5$.
The table tells us that $P(x \le 5) = .0207$. Since this is a
two-sided test, we double this probability to .0414 and use
this as our p-value. If our confidence level is 95%, our
significance level must be 5%, and, since the p-value is below
5%, we reject the null hypothesis.
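The table lookup can be checked directly. The sketch below computes the cumulative binomial probability from its definition using only the Python standard library; the helper name `binomial_cdf` is made up for this illustration.

```python
from math import comb


def binomial_cdf(k, n, p=0.5):
    """P(x <= k) for a binomial distribution with n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


n, x = 20, 5                  # sample size and number of points above 25
tail = binomial_cdf(x, n)     # P(x <= 5), about .0207
p_value = 2 * tail            # two-sided test: double the tail, about .0414
reject = p_value < 0.05       # 5% significance level
```

Doubling the smaller tail is the convention used in these notes for a two-sided test with a symmetric null distribution.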
(But if we are to repeat tests, it may be wise to
define acceptance and rejection regions by defining two
critical values, and saying that if $x \le CV_L$ or $x \ge CV_U$, we
will reject the null hypothesis. Again assume that the
significance level is $\alpha = .05$.
We can use the p-value approach by saying that, if we
would reject the null hypothesis using the p-value approach
for some value of $x$, that value is in our rejection region.
Starting from the bottom, try $x = 0$. From the table for
$n = 20$ and $p = .5$, $P(x \le 0) = .0000$. Since this tail probability
is below $\alpha/2$, we would reject $H_0$ if $x$ were 0. We come to a
similar conclusion if $x$ takes values of 1, 2, 3 or 4. If $x = 5$,
we have already seen that $P(x \le 5) = .0207$, and that we
would still reject the null hypothesis. But if we try $x = 6$,
we find $P(x \le 6) = .0577$, which is above $\alpha/2 = .025$, so we
‘accept’ $H_0$ if $x$ is 6 or larger.
But, since this is a two-sided test, it is also possible that $x$ is
too large. For example, if $x$ is 16,
$P(x \ge 16) = 1 - P(x \le 15) = 1 - .9941 = .0059$.
Since this is below $\alpha/2$, we would reject the null hypothesis if
$x$ were 16 or larger. So try $x = 15$. $P(x \ge 15) = 1 - P(x \le 14) = 1 - .9793 = .0207$.
This is still too low for acceptance, so try 14.
$P(x \ge 14) = 1 - P(x \le 13) = 1 - .9423 = .0577$. Since this is above $\alpha/2$, we would
not reject the null hypothesis if $x$ were 14. We can thus say
that we do not reject the null hypothesis if $x$ is between 6 and
14, or that our critical values for $x$ are 5 and 15. If we now
look back at the cumulative binomial table, we see that we
rejected the null hypothesis for probabilities below .025 or $\alpha/2$
and above .975 or $1 - \alpha/2$.) {bin}
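The search for critical values described in this parenthesis can be automated. The following is a sketch under the same assumptions as the text (a two-sided sign test with $\alpha/2$ in each tail); the function names are hypothetical.

```python
from math import comb


def binomial_cdf(k, n, p=0.5):
    """P(x <= k) for a binomial distribution with n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def two_sided_critical_values(n, alpha=0.05, p=0.5):
    """Largest lower and smallest upper x whose tail probability is below alpha/2.

    Returns None for a side on which no value of x can lead to rejection.
    """
    lower = max((x for x in range(n + 1)
                 if binomial_cdf(x, n, p) < alpha / 2), default=None)
    upper = min((x for x in range(n + 1)
                 if 1 - binomial_cdf(x - 1, n, p) < alpha / 2), default=None)
    return lower, upper


cv_lower, cv_upper = two_sided_critical_values(20)   # reject if x <= 5 or x >= 15
```

For very small samples no value of $x$ may be extreme enough to reject, which is why the sketch returns `None` rather than raising an error in that case.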
Let's try a one-sided problem. Suppose that our null
hypothesis is that median income in a region is at least \$20000
and that we take a sample with the results shown below.
Let $\alpha = .05$.
Our hypotheses are
$H_0: \eta \ge 20000$, $H_1: \eta < 20000$. Let
$p$ be the proportion of numbers in the population below
20000. If the median is exactly 20000, $p$ will be exactly
0.5. But if the median is above 20000, $p$ will be below .5.
We can replace our original hypotheses with $H_0: p \le .5$ and
$H_1: p > .5$. We see that $x$, the quantity of numbers in the
sample below 20000, is 7. Our expected number of items
below 20000 is $.5n = .5(10) = 5$, so 7 is high and our p-value
will be $P(x \ge 7) = 1 - P(x \le 6) = 1 - .8281 = .1719$.
Since the p-value is above the significance level,
we do not reject (‘accept’) the null hypothesis.
Index   Income
  1     10132
  2     11252
  3     13475
  4     14260
  5     16871
  6     19357
  7     19438
  8     23010
  9     30278
 10     35932
(If we wish to set up accept and reject zones for this
one-sided test, we need to try higher values of $x$. A value of
$x = 8$ is still too small; it gives a probability of $P(x \ge 8) = .0547$, which
is above $\alpha$, so try $x = 9$. According to the binomial table for
$n = 10$ and $p = .5$, $P(x \ge 9) = 1 - P(x \le 8) = 1 - .9893 = .0107$,
which is below $\alpha$. So 9 is our critical value, and we will
reject the null hypothesis if $x \ge 9$.)
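The one-sided income example can be reproduced from the ten sample values. This is a minimal sketch using only the standard library; the helper name `upper_tail` is an assumption of the sketch.

```python
from math import comb

# The ten sample incomes listed in the table above.
incomes = [10132, 11252, 13475, 14260, 16871,
           19357, 19438, 23010, 30278, 35932]


def upper_tail(k, n, p=0.5):
    """P(x >= k) for a binomial distribution with n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


n = len(incomes)
x = sum(income < 20000 for income in incomes)   # number of values below 20000
p_value = upper_tail(x, n)                      # P(x >= 7), about .1719
alpha = 0.05
# Smallest count whose upper-tail probability falls below alpha.
critical_value = min(k for k in range(n + 1) if upper_tail(k, n) < alpha)
```

The count comes out as 7 and the critical value as 9, in line with the reasoning in the text.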
To clarify the correspondence between hypotheses about a
median and hypotheses about a proportion, let us assume that
$p$ is the proportion of the data above 20000. If 20000 is the
median, then, by definition of the median, $p$ is one half. But
let us assume that the median is above 20000, say 21000. Then
one half of the data must be above 21000, so that more than one
half of the data must be above 20000, which means less than one
half of the data is below 20000. Since a hypothesis about a
median is a hypothesis about a proportion,
$H_0: \eta \ge \eta_0$, $H_1: \eta < \eta_0$
corresponds to
$H_0: p \ge .5$, $H_1: p < .5$
when $p$ is the proportion above $\eta_0$. The table below shows these
correspondences depending on the definition of $p$.
Hypotheses about           Hypotheses about a proportion
a median                   If p is the proportion    If p is the proportion
                           above $\eta_0$            below $\eta_0$

$H_0: \eta = \eta_0$       $H_0: p = .5$             $H_0: p = .5$
$H_1: \eta \ne \eta_0$     $H_1: p \ne .5$           $H_1: p \ne .5$

$H_0: \eta \ge \eta_0$     $H_0: p \ge .5$           $H_0: p \le .5$
$H_1: \eta < \eta_0$       $H_1: p < .5$             $H_1: p > .5$

$H_0: \eta \le \eta_0$     $H_0: p \le .5$           $H_0: p \ge .5$
$H_1: \eta > \eta_0$       $H_1: p > .5$             $H_1: p < .5$
b. The Sign Test more Generally.
This technique can be used in other ways. For
instance, let us say that we wish to check the effectiveness of a
product brochure. A sample of 17 clients is asked about their
impression of a product. Then they read the brochure and once
again are asked their impression. We write a (+) if their
impression has improved and a (-) if it is worse. A zero
indicates no change. Our results are as follows:
Client  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Sign    +  +  +  +  0  -  +  0  -  +  +  +  -  +  0  +  +
Since we are hoping for a positive effect, count the zeros as
minuses. We will use the brochure if we believe that the
majority of the population will respond favorably. Let $p$ be the
proportion of plusses in the population. Our hypotheses are
$H_0: p \le .5$, $H_1: p > .5$. There are 11 plusses, so we must
find $P(x \ge 11)$ when $p = .5$ and $n = 17$. A large binomial table
says that this value is .166, so we must ‘accept’ the null
hypothesis and not use the brochure.
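The brochure example can be checked with a short Python sketch. The list of signs is transcribed from the client table, with clients 6, 9 and 13 read as minuses, which agrees with the stated count of 11 plusses. The 5% significance level in the last line is an assumption; the notes do not state one for this example.

```python
from math import comb

# Signs for the 17 clients, in client order.
signs = ["+", "+", "+", "+", "0", "-", "+", "0", "-", "+",
         "+", "+", "-", "+", "0", "+", "+"]

n = len(signs)
x = signs.count("+")   # zeros are counted with the minuses
# P(x >= 11) for n = 17 and p = .5; every outcome has probability 1 / 2**n.
p_value = sum(comb(n, k) for k in range(x, n + 1)) / 2**n
reject = p_value < 0.05   # assumed significance level
```

The computed probability is about .166, matching the value quoted from the binomial table.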
In the absence of a binomial table we must use the
Normal approximation to the binomial distribution. If $\bar{p} = \dfrac{x}{n}$
is our observed proportion, we use $z = \dfrac{\bar{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$. But for the
sign test, $p_0 = .5$ and $q_0 = 1 - .5 = .5$. So
$z = \dfrac{\frac{x}{n} - .5}{\sqrt{\frac{.25}{n}}} = \dfrac{\frac{x}{n} - .5}{\frac{.5}{\sqrt{n}}} = \dfrac{x - .5n}{.5\sqrt{n}} = \dfrac{2x - n}{\sqrt{n}}$.
(For relatively small values of $n$, a continuity correction is
advisable, so try $z = \dfrac{2x \pm 1 - n}{\sqrt{n}}$, where the $+$ applies if
$x < \dfrac{n}{2}$, and the $-$ applies if $x > \dfrac{n}{2}$. In the problem above,
where $n = 17$, use $P(x \ge 11) \approx P\left(z \ge \dfrac{2(11) - 1 - 17}{\sqrt{17}}\right) = P(z \ge .970)$
$= .5 - .3340 = .1660$. Since this is a p-value, if $\alpha$ takes a
typical value like .05 or .10, we can say p-value $> \alpha$ and not
reject the null hypothesis.)
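The Normal approximation with the continuity correction can be sketched as follows. The function name `sign_test_z` is hypothetical, and the Normal probability is computed with the standard library rather than read from a table, so the last digit may differ slightly from the table-based figure.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(x, n, continuity=True):
    """z ratio for the sign test (p0 = .5), optionally continuity-corrected."""
    correction = 0
    if continuity:
        # +1 if x is below n/2 and -1 if above: both move x toward n/2.
        correction = 1 if x < n / 2 else -1 if x > n / 2 else 0
    return (2 * x + correction - n) / sqrt(n)


z = sign_test_z(11, 17)              # (22 - 1 - 17) / sqrt(17), about .970
p_value = 1 - NormalDist().cdf(z)    # right-sided p-value, about .166
```

The approximate p-value is very close to the exact binomial value of .166 quoted earlier, which is the purpose of the correction for a sample this small.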
7. Hypothesis Test for Means - Rare Events
In statistics, ‘rare events’ is a code word for the Poisson distribution. The
easiest way to approach Poisson results is to use a p-value. For example,
if you wish to test $H_0: \mu = 5$ against $H_1: \mu \ne 5$ and you have a result
that says $x = 7$, the p-value is $2P(x \ge 7)$.
(For an example, see 252oneslx2) {poiss}
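As a rough check of the Poisson example, the tail probability can be computed from the Poisson formula directly. The helper name and the cap of the doubled probability at 1 are assumptions of this sketch, not part of the notes.

```python
from math import exp, factorial


def poisson_cdf(k, mean):
    """P(x <= k) for a Poisson distribution with the given mean."""
    return sum(exp(-mean) * mean**i / factorial(i) for i in range(k + 1))


mean0, x = 5, 7                               # mean under H0 and observed count
upper_tail = 1 - poisson_cdf(x - 1, mean0)    # P(x >= 7)
p_value = min(1.0, 2 * upper_tail)            # two-sided: double the tail
```

The upper tail is roughly .24, so the two-sided p-value is roughly .48 and the null hypothesis would not be rejected at any conventional significance level.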
8. Hypothesis Tests for a Variance.
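To test $H_0: \sigma = \sigma_0$ against $H_1: \sigma \ne \sigma_0$ (For an example, see
252oneslx2) {chiSq}
i. Test Ratio: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$ or,
for large samples, $z = \sqrt{2\chi^2} - \sqrt{2(DF) - 1}$
ii. Critical Value: $s^2_{cv} = \dfrac{\chi^2_{\alpha/2}\, \sigma_0^2}{n-1}$ or $\dfrac{\chi^2_{1-\alpha/2}\, \sigma_0^2}{n-1}$
(Don't try this for large samples.)
iii. Confidence Interval: $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ or,
for large samples, $\dfrac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \dfrac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$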
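The variance formulas can be sketched in Python with the chi-square table values passed in as arguments, since the standard library has no chi-square distribution. In the example call, the sample variance and sample size are borrowed from the appendix sample in these notes, the hypothesized variance of 0.5 is made up for illustration, and 14.4494 and 1.2373 are the usual table values for 6 degrees of freedom with .025 in each tail.

```python
from math import sqrt


def variance_test(s2, n, sigma0_sq, chi2_upper, chi2_lower):
    """Two-sided test of H0: sigma^2 = sigma0_sq.

    chi2_upper and chi2_lower are tabled chi-square values with alpha/2 and
    1 - alpha/2 in the upper tail, for n - 1 degrees of freedom.
    """
    df = n - 1
    chi2 = df * s2 / sigma0_sq                         # test ratio
    z_large = sqrt(2 * chi2) - sqrt(2 * df - 1)        # large-sample version
    ci = (df * s2 / chi2_upper, df * s2 / chi2_lower)  # interval for sigma^2
    reject = chi2 > chi2_upper or chi2 < chi2_lower
    return chi2, z_large, ci, reject


# sigma0_sq = 0.5 is a hypothetical value chosen for this sketch.
chi2, z_large, ci, reject = variance_test(0.6082333, 7, 0.5, 14.4494, 1.2373)
```

With only 6 degrees of freedom the large-sample z is computed here just to show the formula; it should not be relied on for a sample this small.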
Appendix: One-sided and Two Sided Tests.
Assume the following: $n = 7$, $\mu_0 = 12.2$, $DF = n - 1 = 6$, $\alpha = .05$, $\bar{x} = 12.00$,
$s^2 = .6082333$, so that $s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \sqrt{\dfrac{0.6082333}{7}} = 0.29477$.
A 2-sided Test: $H_0: \mu = 12.2$, $H_1: \mu \ne 12.2$
(i) Test ratio: $t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}} = \dfrac{12.00 - 12.2}{0.29477} = -0.678$. We test this against
two values of t, $t^{(n-1)}_{\alpha/2} = t^{(6)}_{.025} = 2.447$ and $-t^{(n-1)}_{\alpha/2} = -t^{(6)}_{.025} = -2.447$.
We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha/2}$ or below $-t^{(n-1)}_{\alpha/2}$. In this case we
do not reject $H_0$. {ttable}
If we use p-value: $pval = 2P(t \le -0.678)$. On the t-table 0.678 is
between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \le -0.678)$ is between
.25 and .30 and the p-value is between .50 and .60. Since the p-value is
above $\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$:
$\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}} = 12.2 \pm 2.447(0.29477) = 12.2 \pm 0.72$.
We reject $H_0$ if $\bar{x}$ is above the upper critical value
$\bar{x}_{cv} = 12.2 + 0.72 = 12.92$ or below the lower critical value
$\bar{x}_{cv} = 12.2 - 0.72 = 11.48$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval:
$\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}} = 12.00 \pm 2.447(0.29477) = 12.00 \pm 0.72$. This interval
is 11.28 to 12.72. Since $\mu_0 = 12.2$
is between these two limits, we do not reject $H_0$.
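The three versions of the two-sided test can be reproduced with a few lines of Python. The sample figures and the tabled value 2.447 come from the appendix; the variable names are just choices made for this sketch.

```python
from math import sqrt

n, mu0, x_bar, s2 = 7, 12.2, 12.00, 0.6082333   # values from the appendix
t_crit = 2.447                                  # tabled t, 6 DF, .025 per tail

s_xbar = sqrt(s2 / n)                           # standard error, about 0.29477
t = (x_bar - mu0) / s_xbar                      # test ratio, about -0.678
reject_by_ratio = abs(t) > t_crit

# Critical values for x-bar, centered on the hypothesized mean.
cv_low, cv_high = mu0 - t_crit * s_xbar, mu0 + t_crit * s_xbar
reject_by_cv = x_bar < cv_low or x_bar > cv_high

# Confidence interval for mu, centered on the sample mean.
ci_low, ci_high = x_bar - t_crit * s_xbar, x_bar + t_crit * s_xbar
reject_by_ci = not (ci_low <= mu0 <= ci_high)
```

All three flags come out false, so the three methods agree that the null hypothesis is not rejected.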
A Left-Sided Test: $H_0: \mu \ge 12.2$, $H_1: \mu < 12.2$
(i) Test ratio: $t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}} = \dfrac{12.00 - 12.2}{0.29477} = -0.678$. (We use the same data
as the two-sided problem.) We test this against one value of t,
$-t^{(n-1)}_{\alpha} = -t^{(6)}_{.05} = -1.943$. We reject $H_0$ if $t$ is below $-t^{(n-1)}_{\alpha}$. In this
case we do not reject $H_0$.
If we use p-value: $pval = P(t \le -0.678)$. On the t-table 0.678 is between
$t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \le -0.678)$ is between .25 and .30
and the p-value is between .25 and .30. Since the p-value is above
$\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$:
$\bar{x}_{cv} = \mu_0 - t_{\alpha}\, s_{\bar{x}} = 12.2 - 1.943(0.29477) = 12.2 - 0.57 = 11.63$. We
reject $H_0$ if $\bar{x}$ is below $\bar{x}_{cv} = 11.63$. In
this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternate hypothesis is $H_1: \mu < 12.2$,
use $\mu \le \bar{x} + t_{\alpha}\, s_{\bar{x}} = 12.00 + 1.943(0.29477) = 12.00 + 0.57 = 12.57$. Since
$H_0: \mu \ge 12.2$ does not contradict $\mu \le 12.57$, we do not reject $H_0$.
A Right-Sided Test: $H_0: \mu \le 12.2$, $H_1: \mu > 12.2$
(i) Test ratio: $t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}} = \dfrac{12.00 - 12.2}{0.29477} = -0.678$. We test this against
one value of t, $t^{(n-1)}_{\alpha} = t^{(6)}_{.05} = 1.943$. We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha}$.
In this case we do not reject $H_0$.
If we use p-value: $pval = P(t \ge -0.678)$. On the t-table 0.678 is between
$t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \ge 0.678)$ is between .25 and .30,
$P(t \ge -0.678)$ is between .70 and .75 and the p-value is between .70 and
.75. Since the p-value is above $\alpha = .05$, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$:
$\bar{x}_{cv} = \mu_0 + t_{\alpha}\, s_{\bar{x}} = 12.2 + 1.943(0.29477) = 12.2 + 0.57 = 12.77$. We
reject $H_0$ if $\bar{x}$ is above $\bar{x}_{cv} = 12.77$.
In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternate hypothesis is
$H_1: \mu > 12.2$, use $\mu \ge \bar{x} - t_{\alpha}\, s_{\bar{x}} = 12.00 - 1.943(0.29477) = 12.00 - 0.57 = 11.43$. Since
$H_0: \mu \le 12.2$ does not contradict
$\mu \ge 11.43$, we do not reject $H_0$.
More on p-value
Let’s say that you have gotten one of the following results for a test of a
mean with $n = 25$:
a) $t = 1.000$
b) $t = -1.000$
c) $z = 1.000$ The values of z could also come from tests of proportions or
variances.
d) $z = -1.000$
A 2-sided Test
A p-value is defined as the probability that a test statistic or ratio
as extreme as or more extreme than the observed statistic or ratio
could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \ge 1.000)$.
To find $P(t \ge 1.000)$, look at the t table. Since $n = 25$,
$df = n - 1 = 24$.
Look at the $df = 24$ line. You will find that 1.000 is between
$t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that
$P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these values we can say
$.15 < P(t \ge 1.000) < .20$. So $pval = 2P(t \ge 1.000)$,
which means $.30 < pval < .40$.
b) $t = -1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \le -1.000)$.
You found in a) that 1.000 is between
$t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that
$P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$, but, since the t
distribution is symmetrical, we can also say $P(t \le -0.857) = .20$
and $P(t \le -1.059) = .15$. Since $-1.000$ is between these values we
can say $.15 < P(t \le -1.000) < .20$. So $pval = 2P(t \le -1.000)$,
which means $.30 < pval < .40$.
c) $z = 1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \ge 1.000)$. Make a diagram for z with a center
at zero and shade the area above
1.000. Use the Normal table. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$, so
$pval = 2P(z \ge 1.000) = 2(.1587) = .3174$.
d) $z = -1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \le -1.000)$.
Make a diagram for z with a center at zero
and shade the area below $-1.000$. $P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$,
so $pval = 2P(z \le -1.000) = 2(.1587) = .3174$.
A Left-sided Test
A p-value is defined as the probability that a test statistic or ratio
as low as or lower than the observed statistic or ratio could occur,
assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le 1.000)$. You found in 2-sided Test a)
that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means
that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is
between these values we can say $.15 < P(t \ge 1.000) < .20$. But you
want $P(t \le 1.000)$, so subtract these probabilities from 1.
So $.80 < pval < .85$.
b) $t = -1.000$. You want $P(t \le -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t
distribution is symmetrical,
we can also say $.15 < P(t \le -1.000) < .20$ or $.15 < pval < .20$.
c) $z = 1.000$. You want $P(z \le 1.000)$. Make a diagram for z with
a center at zero and shade the area below 1.000. $P(z \le 1.000) = P(z \le 0) + P(0 \le z \le 1) = .5 + .3413 = .8413$,
so $pval = .8413$.
d) $z = -1.000$. You want $P(z \le -1.000)$. Make a diagram for z
with a center at zero and shade the area below $-1.000$.
$P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$,
so $pval = .1587$.
A Right-sided Test
A p-value is defined as the probability that a test statistic or ratio
as high as or higher than the observed statistic or ratio could occur,
assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \ge 1.000)$. You found in 2-sided Test a)
that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means
that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is
between these values we can say $.15 < P(t \ge 1.000) < .20$.
So $.15 < pval < .20$.
b) $t = -1.000$. You want $P(t \ge -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t
distribution is symmetrical,
we can also say $.15 < P(t \le -1.000) < .20$. But you want
$P(t \ge -1.000)$, so subtract these probabilities from 1.
So $.80 < pval < .85$.
c) $z = 1.000$. You want $P(z \ge 1.000)$. Make a diagram for z with
a center at zero and shade the area above 1.000. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$,
so $pval = .1587$.
d) $z = -1.000$. You want $P(z \ge -1.000)$. Make a diagram for z
with a center at zero and shade the area above $-1.000$. $P(z \ge -1.000) = P(z \ge 0) + P(-1 \le z \le 0)$
$= .5 + .3413 = .8413$,
so $pval = .8413$.
Note that, since every one of these p-values is above 1%, 5% and
10%, you would not reject the null hypothesis if you used any of
these significance levels.
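The z-based p-values in cases c) and d) can be checked with the Normal distribution in the Python standard library. The function name and the `tail` argument are inventions of this sketch. The standard library has no t distribution, so cases a) and b) still need a table or a statistics package.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """p-value for a z ratio; tail is 'two', 'left' or 'right'."""
    cdf = NormalDist().cdf(z)
    if tail == "left":
        return cdf                      # P(Z <= z)
    if tail == "right":
        return 1 - cdf                  # P(Z >= z)
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)    # double the smaller tail
    raise ValueError("tail must be 'two', 'left' or 'right'")


for z in (1.0, -1.0):
    print(z, [round(z_p_value(z, tail), 4) for tail in ("two", "left", "right")])
```

The two-sided value differs from the table-based .3174 only in the fourth decimal place, because the table figure doubles an already rounded .1587.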
© 2005 R. E. Bove