ST 305 Chapters 20, 21 Reiland
Testing Hypotheses about Proportions

“If the People fail to satisfy their burden of proof, you must find the defendant not guilty.” -NY state jury instructions
“Extraordinary claims require extraordinary proof” -Carl Sagan
“The truth is seldom pure and never simple.” -Oscar Wilde
“They make things admirably plain, but one hard question will remain: If one hypothesis you lose, Another in its place you choose …” -James Russell Lowell, Credidimus Jovem Regnare

Unit Objectives
At the conclusion of these chapters you will be able to:
1. Perform hypothesis tests for population proportions based on the information contained in a single sample.

Reading Assignment
Chapters 20 and 21.

Highlights from the Readings

OVERVIEW
TO THIS POINT: In the preceding chapters we learned that populations are characterized by numerical descriptive measures such as the (population) mean $\mu$, the (population) standard deviation $\sigma$, or a (population) proportion $p$. The value of a particular population descriptive measure is typically unknown; a reliable estimate of one of these unknown values is frequently needed. NASA may be very interested in an estimate of the reliability of a crucial component in the space shuttle; a candidate for an elective office may need an estimate of the proportion of voters that will vote for him/her to plan campaign strategy. Estimates of these unknown values are based on sample statistics computed from sample data. Since sample statistics vary in a random manner from sample to sample, estimates based on them will be subject to uncertainty. This uncertainty is reflected in the sampling distribution of a statistic. In the preceding chapter we used a sampling distribution model for $\hat{p}$ to estimate the unknown value of a population proportion $p$ with confidence intervals.

WHAT'S NEXT: There are times when we want to make a decision rather than make an estimate.
This requires us to propose a model for the situation at hand and perform a test of hypothesis about that model. In Chapters 20 and 21 we use the sampling distribution model for $\hat{p}$ to conduct hypothesis tests for population proportions $p$. We will learn how to apply the information in a single sample to the formal structure of hypothesis testing to make a decision about the unknown value of a population proportion $p$.

COMPONENTS OF A HYPOTHESIS TEST
• Do people 18 to 24 really prefer Pepsi to Coke?
• Does a new allergy medication really reduce symptoms more than a placebo?
• Does a new marketing approach to sell a product work better than the traditional marketing approach?

In many situations we want to make a decision. To make decisions, we'll propose a model for the situation at hand and test a hypothesis about the model. The result will assist us in answering the real-world question.

Example Dow Jones Industrial Average (DJIA)
The Dow Jones Industrial Average closing prices for the bull market 1982-1986:
Question: Is the Dow just as likely to move higher as it is to move lower on any given day?
The data: Out of the 1112 trading days in that period, the average increased on 573 days (sample proportion = 0.5153 or 51.53%). That is more “up” days than “down” days. But is it far enough from 50% to cast doubt on the assumption of equally likely up or down movement?
To answer this question we use a formal approach known as hypothesis testing.

1) HYPOTHESES

Null Hypothesis $H_0$
• The null hypothesis, $H_0$, specifies a population model parameter and proposes a value for that parameter.
• We usually write a null hypothesis about a proportion in the form $H_0: p = p_0$, where $p_0$
is a specific numerical value for the population proportion $p$.

Alternative Hypothesis $H_A$
• The alternative hypothesis, $H_A$, contains the values of the parameter that we consider plausible if we reject the null hypothesis $H_0$. The alternative hypothesis can be one of the following 3 possibilities:
$H_A: p \neq p_0$ (2-sided or 2-tailed test)
$H_A: p > p_0$ (1-sided or 1-tailed test)
$H_A: p < p_0$ (1-sided or 1-tailed test)

Example Dow Jones Industrial Average (DJIA) (continued)
The particular value of $p_0$ we are interested in is $0.5$. The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.

2) TEST STATISTIC (using the data)
• The test statistic is a number calculated from the data.
• For a hypothesis test for a proportion $p$ with null hypothesis $H_0: p = p_0$, the test statistic is
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})}$, where $SD(\hat{p}) = \sqrt{\dfrac{p_0(1 - p_0)}{n}}$
• IMPORTANT!! The value of the test statistic is ALWAYS calculated assuming that the null hypothesis $H_0$ is true! This is why $SD(\hat{p})$ uses $p_0$ rather than $\hat{p}$.

Example Dow Jones Industrial Average (DJIA) (continued)
The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.
The data: out of 1,112 trading days the DJIA went up 573 days, so $\hat{p} = \dfrac{573}{1112} = 0.5153$.
Calculating the test statistic $z$:
$SD(\hat{p}) = \sqrt{\dfrac{p_0(1 - p_0)}{n}} = \sqrt{\dfrac{0.5(0.5)}{1112}} = \sqrt{\dfrac{0.25}{1112}} = .0150$
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})} = \dfrac{0.5153 - 0.5}{.0150} = 1.02$

3) P-VALUE (weighing the evidence in the data)
The P-value is the probability, calculated assuming the null hypothesis $H_0$ is true, of observing a value of the test statistic as extreme as or more extreme than the value we actually observed.
The calculation of the P-value depends on the form of the alternative hypothesis $H_A$ (see box below).
• A small P-value says that the data we observed would be very unlikely if our null hypothesis $H_0$ is true.
• A small P-value is evidence against the null hypothesis $H_0$.
• How small does the P-value have to be to reject the null hypothesis $H_0$? The traditional cutoff value is .05.
• When the P-value is less than .05, we reject the null hypothesis $H_0$.

Calculating P-values
Assume the value of the test statistic is $z = z_0$.
If $H_A: p > p_0$, then P-value = $P(z > z_0)$.
If $H_A: p < p_0$, then P-value = $P(z < z_0)$.
If $H_A: p \neq p_0$, then P-value = $2P(z > |z_0|)$.
Graphically: the P-value is the area under the standard normal curve in the upper tail beyond $z_0$ (for $H_A: p > p_0$), in the lower tail beyond $z_0$ (for $H_A: p < p_0$), or in both tails beyond $\pm|z_0|$ (for $H_A: p \neq p_0$).

Example Dow Jones Industrial Average (DJIA) (continued)
The hypotheses we are interested in testing are
$H_0: p = 0.5$
$H_A: p \neq 0.5$
where $p$ is the proportion of days on which the DJIA increases.
The data: out of 1,112 trading days the DJIA went up 573 days, so $\hat{p} = \dfrac{573}{1112} = 0.5153$.
Calculating the test statistic $z$:
$SD(\hat{p}) = \sqrt{\dfrac{p_0(1 - p_0)}{n}} = \sqrt{\dfrac{0.5(0.5)}{1112}} = \sqrt{\dfrac{0.25}{1112}} = .0150$
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})} = \dfrac{0.5153 - 0.5}{.0150} = 1.02$
P-value: Since this is a 2-tailed test, P-value = $2P(z > 1.02) = 2(.1539) = .3078$.
Conclusion: since the P-value is greater than .05, our conclusion is “do not reject the null hypothesis”; there is not sufficient evidence to reject the null hypothesis that the percentage of days on which the DJIA goes up is 50%.

EXAMPLE (Medication side effects) (continued)
Arthritis is a painful, chronic inflammation of the joints. An experiment on the side effects of pain relievers examined arthritis patients to find the proportion of patients who suffer side effects when using ibuprofen to relieve the pain. If more than 3% of users suffer side effects, the Food and Drug Administration will put a stronger warning label on packages of ibuprofen.
Hypotheses:
$H_0: p = .03$
$H_A: p > .03$
where $p$ is the proportion of users of ibuprofen who suffer side effects.
DATA: 440 subjects with chronic arthritis were given ibuprofen for pain relief; 23 subjects suffered from adverse side effects.
Test statistic:
$\hat{p} = \dfrac{23}{440} = .0523$; $SD(\hat{p}) = \sqrt{\dfrac{.03(.97)}{440}} = .0081$
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})} = \dfrac{.0523 - .03}{.0081} = 2.75$
P-value: Since this is a 1-tail test, P-value = $P(z > 2.75) = .0030$.
Conclusion: since the P-value is less than .05, our conclusion is to “reject the null hypothesis”; there is sufficient evidence to conclude that the proportion of ibuprofen users who suffer adverse side effects is greater than .03.

To perform hypothesis tests for $p$ using:
• EXCEL: see our class web page http://www.stat.ncsu.edu/people/reiland/courses/st305/, click on Lecture Handouts in the left column, then click on the file “Excel Spreadsheet for Hypothesis Tests for p”.
• Statcrunch: go to http://statcrunch.stat.ncsu.edu, click on Stat > Proportions > One sample.
• TI 83/84: Stat > Tests > 5:1-PropZTest.
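For readers who prefer to check the arithmetic in code, the three steps above (hypotheses, test statistic, P-value) can be sketched in Python. This is only a minimal illustration, not one of the tools used in the course: the function names `normal_upper_tail` and `one_prop_ztest` and the `alternative` labels are made up for this sketch, and the normal tail area is computed from the standard library's `math.erfc` rather than read from a table.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, using the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def one_prop_ztest(successes, n, p0, alternative="two-sided"):
    """One-sample z-test for a proportion (hypothetical helper for this handout).

    alternative: "two-sided" (H_A: p != p0), "greater" (H_A: p > p0),
                 or "less" (H_A: p < p0).
    Returns (p_hat, z, p_value).
    """
    p_hat = successes / n
    # SD(p_hat) is computed from p0 because the test assumes H_0 is true.
    sd = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    if alternative == "greater":
        p_value = normal_upper_tail(z)
    elif alternative == "less":
        p_value = normal_upper_tail(-z)      # P(Z < z) = P(Z > -z)
    elif alternative == "two-sided":
        p_value = 2 * normal_upper_tail(abs(z))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return p_hat, z, p_value

# DJIA example: 573 "up" days out of 1112, H_0: p = 0.5, H_A: p != 0.5
djia = one_prop_ztest(573, 1112, 0.5, "two-sided")
# Ibuprofen example: 23 of 440 with side effects, H_0: p = .03, H_A: p > .03
ibuprofen = one_prop_ztest(23, 440, 0.03, "greater")

for name, (p_hat, z, p_value) in [("DJIA", djia), ("ibuprofen", ibuprofen)]:
    decision = "reject H_0" if p_value < 0.05 else "do not reject H_0"
    print(f"{name}: p_hat={p_hat:.4f}, z={z:.2f}, P-value={p_value:.4f} -> {decision}")
```

Because the code does not round intermediate values, its results can differ slightly from the hand calculations: for the ibuprofen data the unrounded $z$ is about 2.74 rather than 2.75, while the conclusions are unchanged in both examples.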