Hypothesis Tests for a Population Proportion p

ST 305
Chapters 20, 21
Reiland
Testing Hypotheses about Proportions
“If the People fail to satisfy their burden of proof,
you must find the defendant not guilty.”
-NY state jury instructions
“Extraordinary claims require extraordinary
proof”
-Carl Sagan
“The truth is seldom pure and never simple.”
-Oscar Wilde
“They make things admirably plain, but one hard
question will remain: If one hypothesis you lose,
Another in its place you choose …”
-James Russell Lowell, Credidimus Jovem Regnare
Unit Objectives
At the conclusion of these chapters you will be able to:
1. Perform hypothesis tests for population proportions based on the information contained in a
single sample.
Reading Assignment
Chapters 20 and 21.
Highlights from the Readings
OVERVIEW
TO THIS POINT:
In the preceding chapters we learned that populations are characterized by numerical descriptive
measures such as the (population) mean $\mu$, the (population) standard deviation $\sigma$, or a (population) proportion
$p$. The value of a particular population descriptive measure is typically unknown; a reliable estimate of one of
these unknown values is frequently needed. NASA may be very interested in an estimate of the reliability of a
crucial component in the space shuttle; a candidate for an elective office may need an estimate of the proportion
of voters that will vote for him/her to plan campaign strategy.
Estimates of these unknown values are based on sample statistics computed from sample data. Since
sample statistics vary in a random manner from sample to sample, estimates based on them will be subject to
uncertainty. This uncertainty is reflected in the sampling distribution of a statistic.
In the preceding chapter we used a sampling distribution model for $\hat{p}$ to estimate the unknown value of
a population proportion $p$ with confidence intervals.
WHAT'S NEXT:
There are times when we want to make a decision rather than make an estimate. This requires us to
propose a model for the situation at hand and perform a test of hypothesis about that model.
In chapters 20 and 21 we use the sampling distribution model for $\hat{p}$ to conduct hypothesis tests for
population proportions $p$. We will learn how to apply the information in a single sample to the formal
structure of hypothesis testing to make a decision about the unknown value of a population proportion $p$.
COMPONENTS OF A HYPOTHESIS TEST
• Do people 18 to 24 really prefer Pepsi to Coke?
• Does a new allergy medication really reduce symptoms more than a placebo?
• Does a new marketing approach to sell a product work better than the traditional marketing approach?
In many situations we want to make a decision. To make decisions, we'll propose a model for the situation
at hand and test a hypothesis about the model. The result will assist us in answering the real-world
question.
Example Dow Jones Industrial Average (DJIA)
The Dow Jones Industrial Average closing prices for the bull market 1982-1986:
Question:
Is the Dow just as likely to move higher as it is to move lower on any given day?
The data:
Out of the 1112 trading days in that period, the average increased on 573 days (sample proportion = 0.5153
or 51.53%).
That is more “up” days than “down” days.
But is it far enough from 50% to cast doubt on the assumption of equally likely up or down movement?
To answer this question we use a formal approach known as hypothesis testing.
1) HYPOTHESES
Null Hypothesis $H_0$
• The null hypothesis, $H_0$, specifies a population model parameter and proposes a value for that
parameter.
• We usually write a null hypothesis about a proportion in the form
$$H_0: p = p_0$$
where $p_0$ is a specific numerical value for the population proportion $p$.
Alternative Hypothesis $H_A$
• The alternative hypothesis, $H_A$, contains the values of the parameter that we consider plausible if we
reject the null hypothesis $H_0$. The alternative hypothesis can be one of the following 3 possibilities:
$H_A: p \neq p_0$ (2-sided or 2-tailed test)
$H_A: p > p_0$ (1-sided or 1-tailed test)
$H_A: p < p_0$ (1-sided or 1-tailed test)
Example Dow Jones Industrial Average (DJIA) (continued)
The particular value of $p_0$ we are interested in is 0.5. The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.
2) TEST STATISTIC (using the data)
• The test statistic is a number calculated from the data.
• For a hypothesis test for a proportion $p$ with null hypothesis $H_0: p = p_0$, the test statistic is
$$z = \frac{\hat{p} - p_0}{SD(\hat{p})}, \quad \text{where } SD(\hat{p}) = \sqrt{\frac{p_0(1 - p_0)}{n}}$$
• IMPORTANT!! The value of the test statistic is ALWAYS calculated assuming that the null
hypothesis $H_0$ is true!
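The test statistic formula can be sketched in a few lines of Python. This is only an illustration, not part of the course software; the function name `z_statistic` and its argument names are assumptions made for this sketch. The data are the DJIA counts given earlier (573 up days out of 1112).

```python
import math

def z_statistic(successes, n, p0):
    """Return (p_hat, sd, z) for a test of H0: p = p0.

    Hypothetical helper for illustration only.
    """
    p_hat = successes / n                 # sample proportion
    sd = math.sqrt(p0 * (1 - p0) / n)     # SD(p_hat), computed from p0 because H0 is assumed true
    z = (p_hat - p0) / sd                 # distance of p_hat from p0 in SD units
    return p_hat, sd, z

p_hat, sd, z = z_statistic(573, 1112, 0.5)
print(f"p_hat = {p_hat:.4f}, SD = {sd:.4f}, z = {z:.2f}")
# prints: p_hat = 0.5153, SD = 0.0150, z = 1.02
```

Note that the standard deviation uses the hypothesized value p0 rather than the sample proportion, which is how the calculation "assumes the null hypothesis is true".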
Example Dow Jones Industrial Average (DJIA) (continued)
The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.
The data: out of 1,112 trading days the DJIA went up 573 days, so
$$\hat{p} = \frac{573}{1112} = 0.5153$$
Calculating the test statistic $z$:
$$SD(\hat{p}) = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.5(0.5)}{1112}} = \sqrt{\frac{0.25}{1112}} = 0.0150$$
$$z = \frac{\hat{p} - p_0}{SD(\hat{p})} = \frac{0.5153 - 0.5}{0.0150} = 1.02$$
3) P-VALUE (weighing the evidence in the data)
The P-value is the probability, calculated assuming the null hypothesis $H_0$
is true, of observing a value of the test statistic more extreme than the value
we actually observed.
The calculation of the P-value depends on the form of the alternative
hypothesis $H_A$ (see box below).
• A small P-value says that the data we observed would be very unlikely if our null hypothesis $H_0$ is
true.
• A small P-value is evidence against the null hypothesis $H_0$.
• How small does the P-value have to be to reject the null hypothesis $H_0$? The traditional cutoff
value is 0.05.
• When the P-value is less than 0.05, we reject the null hypothesis $H_0$.
Calculating P-values
Assume the value of the test statistic is $z = z_0$.
If $H_A: p > p_0$, then P-value = $P(z > z_0)$.
If $H_A: p < p_0$, then P-value = $P(z < z_0)$.
If $H_A: p \neq p_0$, then P-value = $2P(z > |z_0|)$.
Graphically: [Figures not reproduced: standard normal curves with the P-value area shaded, one for each of the three forms of the alternative hypothesis.]
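The three P-value rules can be sketched in Python using only the standard library. The function names `normal_upper_tail` and `p_value` and the labels "greater", "less", and "two-sided" are assumptions chosen for this illustration; the normal tail area is obtained from `math.erfc` rather than from a printed table.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value(z0, alternative):
    """P-value for an observed test statistic z0.

    alternative is one of "greater" (p > p0), "less" (p < p0),
    or "two-sided" (p != p0); these labels are hypothetical.
    """
    if alternative == "greater":
        return normal_upper_tail(z0)            # P(z > z0)
    if alternative == "less":
        return normal_upper_tail(-z0)           # P(z < z0) = P(z > -z0) by symmetry
    if alternative == "two-sided":
        return 2 * normal_upper_tail(abs(z0))   # 2 P(z > |z0|)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(round(p_value(1.02, "two-sided"), 4))
print(round(p_value(2.75, "greater"), 4))
```

Values computed this way may differ in the last digit from table-based answers, since a normal table rounds tail areas to four decimal places.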
Example Dow Jones Industrial Average (DJIA) (continued)
The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.
The data: out of 1,112 trading days the DJIA went up 573 days, so
$$\hat{p} = \frac{573}{1112} = 0.5153$$
Calculating the test statistic $z$:
$$SD(\hat{p}) = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.5(0.5)}{1112}} = \sqrt{\frac{0.25}{1112}} = 0.0150$$
$$z = \frac{\hat{p} - p_0}{SD(\hat{p})} = \frac{0.5153 - 0.5}{0.0150} = 1.02$$
P-value: Since this is a 2-tailed test,
$$\text{P-value} = 2P(z > 1.02) = 2(0.1539) = 0.3078$$
Conclusion: since the P-value is greater than 0.05, our conclusion is “do not reject the null hypothesis”;
there is not sufficient evidence to reject the null hypothesis that the percentage of days on which the
DJIA goes up is 50%.
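One way to build intuition for what this P-value means is to simulate the null hypothesis directly. The sketch below is a Monte Carlo check, not the normal-model calculation used in this handout: it generates many 1112-day periods in which each day is "up" with probability 0.5, and records how often the number of up days lands at least as far from 556 as the observed 573. The function name, the number of simulations, and the seed are all assumptions made for illustration.

```python
import random

def simulated_two_sided_p_value(successes, n, p0, n_sims=2000, seed=305):
    """Estimate a two-sided P-value by simulating samples under H0: p = p0.

    Hypothetical helper; n_sims and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    expected = n * p0                          # expected count of successes under H0
    observed_gap = abs(successes - expected)   # how far the real data are from H0
    extreme = 0
    for _ in range(n_sims):
        # One simulated sample: n independent trials, each a success with probability p0
        sim_successes = sum(rng.random() < p0 for _ in range(n))
        if abs(sim_successes - expected) >= observed_gap:
            extreme += 1
    return extreme / n_sims

estimate = simulated_two_sided_p_value(573, 1112, 0.5)
print(f"simulated P-value is about {estimate:.3f}")
```

The simulated proportion should land in the neighborhood of the 0.3078 computed from the normal model, though not exactly on it, because the simulation is random and counts whole days. Either way, a result this far from 50% is quite common when up and down days are equally likely.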
EXAMPLE (Medication side effects) (continued)
Arthritis is a painful, chronic inflammation of the joints. An experiment on the side effects of pain relievers
examined arthritis patients to find the proportion of patients who suffer side effects when using ibuprofen to
relieve the pain.
If more than 3% of users suffer side effects, the Food and Drug Administration will put a stronger warning
label on packages of ibuprofen.
Hypotheses:
$H_0: p = 0.03$
$H_A: p > 0.03$
where $p$ is the proportion of users of ibuprofen who suffer side effects
DATA:
440 subjects with chronic arthritis were given ibuprofen for pain relief;
23 subjects suffered from adverse side effects.
Test statistic
$$\hat{p} = \frac{23}{440} = 0.0523; \quad SD(\hat{p}) = \sqrt{\frac{0.03(0.97)}{440}} = 0.0081$$
$$z = \frac{\hat{p} - p_0}{SD(\hat{p})} = \frac{0.0523 - 0.03}{0.0081} = 2.75$$
P-value: Since this is a 1-tail test,
$$\text{P-value} = P(z > 2.75) = 0.0030$$
Conclusion: since the P-value is less than .05, our conclusion is to “reject the null hypothesis”; there is
sufficient evidence to conclude that the proportion of ibuprofen users who suffer adverse side effects is
greater than .03.
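The whole procedure (hypotheses, test statistic, P-value, decision) can be collected into one small Python routine. This is a minimal sketch under stated assumptions: the names `ZTestResult` and `one_proportion_z_test`, the "greater"/"less"/"two-sided" labels, and the default cutoff `alpha=0.05` are choices made for this illustration, and the normal probabilities come from `statistics.NormalDist` in the Python standard library (Python 3.8 or later).

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist

@dataclass
class ZTestResult:
    p_hat: float        # sample proportion
    z: float            # test statistic
    p_value: float      # P-value for the chosen alternative
    reject_null: bool   # True when p_value < alpha

def one_proportion_z_test(successes, n, p0, alternative="two-sided", alpha=0.05):
    """One-sample z test for a proportion (hypothetical helper for illustration)."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)          # SD(p_hat) assuming H0: p = p0 is true
    z = (p_hat - p0) / sd
    cdf = NormalDist().cdf                # standard normal cumulative probability
    if alternative == "greater":          # HA: p > p0
        p_value = 1 - cdf(z)
    elif alternative == "less":           # HA: p < p0
        p_value = cdf(z)
    elif alternative == "two-sided":      # HA: p != p0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return ZTestResult(p_hat, z, p_value, p_value < alpha)

# Ibuprofen example: 23 of 440 subjects had side effects, H0: p = 0.03, HA: p > 0.03
result = one_proportion_z_test(23, 440, 0.03, alternative="greater")
print(f"z = {result.z:.2f}, P-value = {result.p_value:.4f}, reject H0: {result.reject_null}")
```

Because the code carries full precision instead of rounding $\hat{p}$ and $SD(\hat{p})$ first, it gives z of about 2.74 rather than the 2.75 obtained by hand above; the P-value is still about 0.003 and the conclusion is unchanged.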
To perform hypothesis tests for $p$ using:
• Excel: see our class web page http://www.stat.ncsu.edu/people/reiland/courses/st305/, click on Lecture Handouts in the left column, then click on the file “Excel Spreadsheet for Hypothesis Tests for p”.
• StatCrunch: go to http://statcrunch.stat.ncsu.edu, then click on Stat > Proportions > One sample.
• TI-83/84: Stat > Tests > 5:1-PropZTest.