Chapter 10
Hypothesis Testing Using a Single Sample
10.1 Hypotheses and Test Procedures
In the previous chapter, we introduced some methods for estimating the unknown value of
a population characteristic from sample data. Sample data may also be used to decide
whether some claim or hypothesis about a population characteristic is plausible.
A hypothesis is a claim either about the value of a single population characteristic or
about the values of several population characteristics.
For example, the following are all hypotheses.
μ = 500, where μ is the population mean;
π < .5, where π is the population proportion.
Question: Are the statements x̄ > 500 and p = .5 hypotheses?
A test of hypothesis (or test procedure) is a method for using sample data to decide
between two competing hypotheses about a population characteristic. One hypothesis
might be μ = 100 and the other μ ≠ 100.
Definition 10.1 The null hypothesis, denoted by H0, is a claim about a population
characteristic that is initially assumed to be true.
The alternative hypothesis, denoted by Ha, is the competing claim.
In carrying out a test of H0 versus Ha, the hypothesis H0 will be rejected in favor of Ha
only if sample evidence strongly suggests that H0 is false. If the sample does not contain
such evidence, H0 will not be rejected. The two possible conclusions are then (1) reject H0
or (2) fail to reject H0.
The form of a null hypothesis:
H0: Population characteristic = hypothesized value,
where the hypothesized value is a specific number determined by the problem context.
The possible forms of the alternative hypothesis:
Ha: Population characteristic > hypothesized value
Ha: Population characteristic < hypothesized value
Ha: Population characteristic ≠ hypothesized value.
Notes:
1. The null hypothesis is stated as a claim of equality mainly for simplicity. The
development of a decision rule is easiest if there is only a single value of a
population characteristic involved in H0.
2. We won't test H0: μ = 50 versus Ha: μ > 100. The number appearing in the alternative
hypothesis must be identical to the hypothesized value in H0.
3. Rejection of H0 indicates strong evidence that Ha is true. However, non-rejection of
H0 does not mean strong support for H0, only a lack of strong evidence against it.
4. Keep the research objectives in mind when selecting the hypotheses to be tested.
10.2 Errors in Hypothesis Testing
Once hypotheses have been formulated, we need a method for using sample data to
determine whether H0 should be rejected. The method that we use for this purpose is
called a test procedure. Just as a jury may reach the wrong verdict in a trial, there is some
chance that the use of a test procedure with sample data may lead us to the wrong
conclusion about a population characteristic. There are two different types of errors that
might be made when making a decision in a hypothesis-testing problem.
Definition 10.2: Type I error – the error of rejecting H0 when H0 is true.
Type II error – the error of failing to reject H0 when H0 is false.
The only way to guarantee that neither type of error will occur is to make such decisions
on the basis of a census of the entire population.
Definition 10.3 The probability of a type I error is denoted by α and is called the level of
significance of the test. Thus, a test with α = .01 is said to have a level of significance
of .01 or to be a level .01 test.
The probability of a type II error is denoted by β.
The relationship between α and β: for a fixed sample size, as α decreases, β will
increase, and vice versa.
The ideal test procedure has both α = 0 and β = 0. However, if our decision must be
based on incomplete information (a sample rather than a census), it is impossible to
achieve this ideal. The standard test procedures do allow the user to control α, but they
provide no direct control over β. Achieving a smaller probability of a type I error
increases the risk of a type II error. In general, there is a compromise between small α
and small β, leading to the following widely accepted principle for specifying a test
procedure.

The principle for specifying a test procedure
After assessing the consequences of type I and type II errors, identify the largest α that is
tolerable for the problem. Then employ a test procedure that uses this maximum
acceptable value (rather than any smaller value) as the level of significance (because
using a smaller α increases β). In other words, don't make α smaller than it needs to be.
Thus, if you decide that α = .05 is tolerable, you should not use a test with α = .01,
because the smaller α inevitably results in a larger β. The choice of α in any given
problem depends on the seriousness of a type I error relative to a type II error. If a type II
error has serious consequences, it may be a good idea to select a somewhat larger value
for α.
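The trade-off between α and β can be made concrete with a small calculation. The sketch below is only an illustration under assumed numbers that do not come from the text: an upper-tailed z test of H0: μ = 100 with known σ = 15 and n = 36, evaluated when the true mean is 105. The function name type_ii_error_prob is hypothetical, and the z test for a mean that it relies on is the one described in Section 10.4.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)

def type_ii_error_prob(alpha, mu0, mu_true, sigma, n):
    """Beta for an upper-tailed z test of H0: mu = mu0 when the true mean is mu_true.

    All inputs are illustrative assumptions, not values from the text.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)       # H0 is rejected when z >= z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of z when mu = mu_true
    # Under the true mean, z is Normal(shift, 1), so beta = P(z < z_crit)
    return STD_NORMAL.cdf(z_crit - shift)

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error_prob(alpha, mu0=100, mu_true=105, sigma=15, n=36)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

In this assumed setting each reduction in α produces a larger β, which is the behavior the principle above is designed to guard against.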
10.3 Large-sample hypothesis tests for a population proportion
Now we are ready to develop procedures for using sample information to decide between
the null and alternative hypotheses. The fundamental idea behind hypothesis-testing
procedures is: We reject the null hypothesis if the observed sample is very unlikely to
have occurred when H0 is true.
Let
π = the proportion of individuals in a population that possess a certain property
p = (number of individuals in the sample that possess the property) / n
By the results in Chapter 8, the sampling distribution of p has the following properties.
1. μ_p = π
2. σ_p = √(π(1 - π)/n)
3. When n is large (nπ ≥ 10 and n(1 - π) ≥ 10), the sampling distribution of p is
approximately normal.
Thus,
z = (p - π) / √(π(1 - π)/n)
has approximately a standard normal distribution when n is large.
Example 10.1 Population = {all blood recipients}. π = the proportion of all blood
recipients stricken with viral hepatitis = .07. A new treatment is given to n = 200 blood
recipients. Only 6 of the 200 patients contract hepatitis. The question of interest to
medical researchers is: Is the new treatment effective? (Does the new treatment reduce
the incidence rate of viral hepatitis?)
This question can be answered by testing the following hypotheses:
H0: π = 0.07 (the new treatment is ineffective) versus
Ha: π < 0.07 (the new treatment is effective).
Here p = 6 / 200 = 0.03. When H0 is true,
μ_p = π = 0.07 and σ_p = √(π(1 - π)/n) = √((0.07)(1 - 0.07)/200) = 0.018.
Since nπ = 200(0.07) = 14 ≥ 10 and n(1 - π) = 200(1 - 0.07) = 186 ≥ 10, the sampling
distribution of p is approximately normal. Then
P(p ≤ .03 when H0 is true) ≈ P(z ≤ (0.03 - 0.07) / √(0.07(1 - 0.07)/200))
= P(z ≤ -2.22) = 0.0132.
Since this probability is very small (meaning that a sample proportion of .03 or smaller
would be unlikely to be observed if H0 were true), we reject H0 in favor of Ha.
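The arithmetic in Example 10.1 can be reproduced in a few lines. This is a minimal sketch using Python's standard-library NormalDist class; the variable names are illustrative choices, not notation from the text.

```python
from math import sqrt
from statistics import NormalDist

pi0 = 0.07   # hypothesized proportion under H0
n = 200      # number of blood recipients given the new treatment
x = 6        # treated patients who contracted hepatitis

p_hat = x / n                          # sample proportion p
sigma_p = sqrt(pi0 * (1 - pi0) / n)    # standard deviation of p when H0 is true
z = (p_hat - pi0) / sigma_p            # standardized distance of p from pi0
p_value = NormalDist().cdf(z)          # lower-tail area P(Z <= z)

print(f"p = {p_hat:.2f}, sigma_p = {sigma_p:.4f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

Because the code keeps z unrounded, the tail area comes out at about 0.0133, marginally different from the table-based 0.0132 obtained with z rounded to -2.22.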
Note: In the above test, two factors are critical.
(i) The standardized quantity
(p - hypothesized value) / √((hypothesized value)(1 - hypothesized value)/n)
= (0.03 - 0.07) / √(0.07(1 - 0.07)/200).
(ii) Criteria to judge whether P(p ≤ .03 when H0 is true) is small.
Definition 10.4: A test statistic is the function of sample data on which a conclusion to
reject or fail to reject H0 is based. The P-value (sometimes called the observed
significance level) is a measure of inconsistency between the hypothesized value for a
population characteristic and the observed sample. It is the probability, assuming that
H0 is true, of obtaining a test statistic value at least as inconsistent with H0 as the
one actually observed.

A decision as to whether H0 should be rejected is based on the P-value and the chosen α:
H0 should be rejected if P-value ≤ α.
H0 should not be rejected if P-value > α.
In Example 10.1, if α = 0.05, we should reject H0. However, if α = 0.01, we should not
reject H0.
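The decision rule is simple enough to state as a one-line function. The sketch below applies it to the P-value of .0132 from Example 10.1; the function name decide is a hypothetical helper introduced only for illustration.

```python
def decide(p_value, alpha):
    """Return the test conclusion for a given P-value and significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

p_value = 0.0132  # P-value from Example 10.1
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

The same data lead to different conclusions at the two levels, which is why α should be chosen before the data are examined.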

Summary of large-sample z test for π
Null hypothesis: H0: π = hypothesized value
Test statistic:
z = (p - hypothesized value) / √((hypothesized value)(1 - hypothesized value)/n)

Alternative hypothesis                            P-value
Ha: π > hypothesized value (upper-tailed test)    Area under z curve to right of calculated z
Ha: π < hypothesized value (lower-tailed test)    Area under z curve to left of calculated z
Ha: π ≠ hypothesized value (two-tailed test)      (i) 2(area to right of calculated z) if z is positive
                                                  (ii) 2(area to left of calculated z) if z is negative

Assumptions:
(1) p is the sample proportion from a random sample.
(2) The sample size is large (both n(hypothesized value) ≥ 10 and
n(1 - hypothesized value) ≥ 10).
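The summary above translates fairly directly into code. The following is a sketch rather than a reference implementation: the function name proportion_z_test, the labels "greater", "less", and "two-sided", and the data in the usage line are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, pi0, alternative):
    """Large-sample z test for a population proportion.

    alternative: "greater" (upper-tailed), "less" (lower-tailed),
    or "two-sided" (two-tailed). Returns (z, p_value).
    """
    # Large-sample assumption: n*pi0 >= 10 and n*(1 - pi0) >= 10
    if n * pi0 < 10 or n * (1 - pi0) < 10:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)             # area to the right of z
    elif alternative == "less":
        p_value = cdf(z)                 # area to the left of z
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))  # twice the tail area beyond |z|
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H0: pi = 0.5
z, p_value = proportion_z_test(60, 100, 0.5, "two-sided")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Using the absolute value of z in the two-tailed branch covers both cases (i) and (ii) of the table with a single expression.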
When carrying out a hypothesis test, the following steps are often used.

Steps in a hypothesis-testing analysis
1. Describe the population characteristic about which hypotheses are to be tested.
2. State the null hypothesis, H0.
3. State the alternative hypothesis, Ha.
4. Select the significance level α for the test.
5. Display the test statistic to be used.
6. Check to make sure that any assumptions required for the test are satisfied.
7. Compute all quantities appearing in the test statistic and then the value of the test
statistic itself.
8. Determine the P-value associated with the observed value of the test statistic.
9. State the conclusion, which will be to reject H0 if P-value ≤ α and not to reject H0
otherwise. The conclusion should then be stated in the context of the problem, and
the level of significance should be included.
Steps 1-4 constitute a statement of the problem, steps 5-8 give the analysis that will lead
to a decision, and step 9 provides the conclusion.
Example 10.2: In 1970 only 20% of all applicants to graduate programs in
mathematics were female. The percentage of female applicants to graduate programs
in mathematics has been increasing recently. A random sample of 200 applicants to
graduate programs in mathematics had 52 female applicants. Does the sample data
suggest that there has been a significant increase in the proportion of females
applying to graduate programs in mathematics since 1970 at a significance level
of .05? Explain.
1. π = proportion of female applicants to graduate programs in mathematics.
2. H0: π = ?
3. Ha: π > ?
4. Significance level: α = ?
5. Test statistic:
z = (p - hypothesized value) / √((hypothesized value)(1 - hypothesized value)/n)
= (p - .20) / √((.20)(1 - .20)/200)
6. Assumptions: This test requires a random sample and a large sample size. The
given sample was a random sample with a sample size of n = 200. Since 200(.20)
= 40 ≥ 10 and 200(1 - .20) = 160 ≥ 10, the large-sample z test for π is appropriate.
7. Computations: p = 52 / 200 = 0.26
z = (.26 - .20) / √((.20)(1 - .20)/200) = 0.06 / .0283 = 2.12
8. P-value: P-value = the area under the z curve to the right of 2.12
= P(z > 2.12) = 1 - .9830 = .0170
9. Since P-value = .0170 < .05 = α, H0 is rejected at the 0.05 level of significance.
The data indicate that there has been a significant increase in the proportion
of female applicants.
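The nine steps can be followed almost line by line in code. The sketch below walks through Example 10.2 with the values stated in the problem (20% in 1970, 52 of 200 in the sample, significance level .05); the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-4: statement of the problem
pi0 = 0.20            # proportion of female applicants in 1970
alpha = 0.05          # significance level given in the example
n, females = 200, 52  # sample size and number of female applicants

# Step 6: check the large-sample assumption
assert n * pi0 >= 10 and n * (1 - pi0) >= 10

# Steps 5 and 7: compute the test statistic
p_hat = females / n
z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

# Step 8: upper-tailed P-value (area to the right of z)
p_value = 1 - NormalDist().cdf(z)

# Step 9: conclusion
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

Keeping the statement of the problem, the analysis, and the conclusion in separate blocks mirrors the grouping of steps 1-4, 5-8, and 9.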
10.4 Hypothesis tests for a population mean
We will now turn our attention to developing a method for testing hypotheses about a
population mean.
The null hypothesis is
H0: μ = hypothesized value.
The alternative hypothesis will be one of the following three forms:
Ha: μ > hypothesized value
Ha: μ < hypothesized value
Ha: μ ≠ hypothesized value
The test procedures are based on some results about the sampling distribution of x̄.
(I) When the population standard deviation σ is known
When n is large or the population distribution is approximately normal,
z = (x̄ - μ) / (σ/√n)
has approximately a standard normal distribution. Thus we can use the test statistic
z = (x̄ - hypothesized value) / (σ/√n).
Summary of the one-sample z test for a population mean
Null hypothesis: H0: μ = hypothesized value
Test statistic: z = (x̄ - hypothesized value) / (σ/√n)

Alternative hypothesis                            P-value
Ha: μ > hypothesized value (upper-tailed test)    Area under z curve to right of calculated z
Ha: μ < hypothesized value (lower-tailed test)    Area under z curve to left of calculated z
Ha: μ ≠ hypothesized value (two-tailed test)      (i) 2(area to right of calculated z) if z is positive
                                                  (ii) 2(area to left of calculated z) if z is negative

Assumptions:
1. x̄ is the sample mean of a random sample.
2. σ is known.
3. The sample size is large (generally n ≥ 30) or the population distribution is at least
approximately normal.
Example 10.3 A soda manufacturer is interested in determining whether its bottling
machine tends to overfill. Each bottle is supposed to contain 12 oz of fluid. A random
sample of size 36 is taken from bottles coming off the production line, and the contents of
each bottle are carefully measured. It is found that the mean amount of soda for the
sample of bottles is 12.1 oz. Suppose it is known that σ = .4 oz. Is the machine
overfilling? Test the relevant hypotheses at a significance level of 0.05.
1. Population characteristic of interest:
μ = the mean amount of soda in the bottles filled by the machine
2. Null hypothesis: H0: μ = ?
3. Alternative hypothesis: Ha: μ > ?
4. Significance level: α = ?
5. Test statistic: z = (x̄ - hypothesized value) / (σ/√n) = (x̄ - 12) / (σ/√n)
6. Assumptions: This test requires a random sample, known σ, and either a large sample
size or an approximately normal population. Since the given sample was a random
sample, σ is known, and the sample size was n = 36, the z test is appropriate.
7. Computations: n = 36, x̄ = 12.1, and σ = .4, so
z = (12.1 - 12) / (.4/√36) = .1 / (.4/6) = 1.50
8. P-value: this is an upper-tailed test, so the P-value is
P-value = P(z > 1.50) = 1 - .9332 = .0668
9. Since P-value = .0668 > .05 = α, H0 is not rejected at the .05 level of significance.
There is no convincing evidence that the machine is overfilling.
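The one-sample z test for a mean can be sketched as a small function and checked against Example 10.3. The function name one_sample_z_test and the labels for the alternative hypothesis are assumptions made for this illustration, not part of the text.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alternative):
    """z test for a population mean with known sigma. Returns (z, p_value)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "greater":         # upper-tailed test
        p_value = 1 - cdf(z)
    elif alternative == "less":          # lower-tailed test
        p_value = cdf(z)
    elif alternative == "two-sided":     # two-tailed test
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Example 10.3: x-bar = 12.1 oz, hypothesized mean 12 oz, sigma = .4 oz, n = 36
z, p_value = one_sample_z_test(12.1, 12, 0.4, 36, "greater")
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```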
(II) When the population standard deviation σ is unknown
When n is large or the population distribution is approximately normal,
t = (x̄ - μ) / (s/√n)
has approximately a t distribution with df = n - 1. Thus we can use the test statistic
t = (x̄ - hypothesized value) / (s/√n).
Summary of the one-sample t test for a population mean
Null hypothesis: H0: μ = hypothesized value
Test statistic: t = (x̄ - hypothesized value) / (s/√n)

Alternative hypothesis                            P-value
Ha: μ > hypothesized value (upper-tailed test)    Area under t curve with df = n - 1 to right of calculated t
Ha: μ < hypothesized value (lower-tailed test)    Area under t curve with df = n - 1 to left of calculated t
Ha: μ ≠ hypothesized value (two-tailed test)      (i) 2(area to right of calculated t) if t is positive
                                                  (ii) 2(area to left of calculated t) if t is negative

Assumptions:
1. x̄ and s are the sample mean and sample standard deviation from a random sample.
2. The sample size is large (generally n ≥ 30) or the population distribution is at least
approximately normal.
Example 10.4 Are young women delaying marriage and marrying at a later age? This
question was addressed in a report issued by the Census Bureau. The report stated that in
1970 (based on census results) the mean age of brides marrying for the first time was
20.8 years. In 1990 (based on a random sample, since census results were not yet
available), the mean was 23.9. Suppose that the 1990 sample mean had been based on a
random sample of size 100 and that the sample standard deviation was 6.4. Is there
sufficient evidence to support the claim that women are now marrying later in life? Test
the relevant hypothesis using α = .01.
1. Population characteristic of interest:
μ = average age of brides marrying for the first time in 1990
2. Null hypothesis: H0: μ = ?
3. Alternative hypothesis: Ha: μ > ?
4. Significance level: α = ?
5. Test statistic: t = (x̄ - hypothesized value) / (s/√n) = (x̄ - 20.8) / (s/√n)
6. Assumptions: This test requires a random sample and either a large sample size or
an approximately normal population. Since the given sample was a random sample and
the sample size was n = 100, the t test is appropriate.
7. Computations: n = 100, x̄ = 23.9, and s = 6.4, so
t = (23.9 - 20.8) / (6.4/√100) = 3.1 / .64 = 4.8438
8. P-value: this is an upper-tailed test, so the P-value is
P-value = P(t > 4.8438) ≈ 0, where t has a t distribution with df = 100 - 1 = 99
9. Since P-value ≈ 0 ≤ .01 = α, we reject H0 at the .01 level of significance. The data
support the claim that women were marrying later in life in 1990.
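The t statistic and its upper-tail P-value for Example 10.4 can be checked numerically. Python's standard library has no t distribution, so the sketch below approximates the tail area by integrating the t density with Simpson's rule; the helper names t_pdf and t_upper_tail are hypothetical, and in practice a statistics library would normally supply this area.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + t * t / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3           # P(T > |t|), using symmetry about 0
    return tail if t >= 0 else 1 - tail

# Example 10.4: x-bar = 23.9, hypothesized mean 20.8, s = 6.4, n = 100
x_bar, mu0, s, n = 23.9, 20.8, 6.4, 100
t_stat = (x_bar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
alpha = 0.01
print(f"t = {t_stat:.4f}, df = {n - 1}, P-value = {p_value:.2e}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The computed tail area is tiny but not exactly zero, which is what the "≈ 0" in a table-based solution stands for.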