Dr. Neal, WKU
MATH 183 Population Proportion

A special case of a population mean is the proportion $p$ of those having a certain designation. For example, we may ask what proportion of the population approves of the President’s performance. Because $p$ is a proportion, it is always the case that $0 \le p \le 1$; however, $p$ is often stated as a percentage. If $p = 0.53$, then we usually say that $p$ is 53%. However, we should always work with $p$ in decimal form.

When determining a proportion, questions may be asked in “Yes/No” form so that the responses are not numerical but categorical. Responses may also be of the form

1. Strongly Approve
2. Approve
3. Indifferent
4. Disapprove
5. Strongly Disapprove

But in order to analyze the approval rate mathematically, we must assign a numerical value to the categories that we are trying to measure. In this case the responses “Strongly Approve” and “Approve” could be counted as one category “Yes” and the other responses as “Not Yes.” We then assign the values 1 for “Yes” and 0 for “Not Yes” so that the responses actually become numerical.

The True Mean and True Standard Deviation

There are only two possible measurements:
\[
x_i = \begin{cases} 1 & \text{if Yes} \\ 0 & \text{if not.} \end{cases}
\]
If we average all possible measurements over the population of size $N$, then we obtain
\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{\#\,\text{Yes}}{N} = p,
\]
which is the true proportion that we are trying to measure. Thus, $\mu = p$.

The variance is the average of the squares minus the square of the average. But because $x_i^2 = x_i$ (still 1 or 0), we have
\[
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2 = \frac{\#\,\text{Yes}}{N} - \mu^2 = p - p^2 = p(1-p).
\]
By taking the square root, we obtain the true standard deviation. Thus, $\sigma = \sqrt{p(1-p)}$.

A random measurement $X$ from the population is then either 1 or 0. Such a measurement used to describe a population proportion is called a Bernoulli trial, denoted by $X \sim b(1, p)$.

[Figure: graph of $\sigma = \sqrt{p - p^2}$ for $0 \le p \le 1$, with $p = 0$, $p = 0.5$, and $p = 1$ marked on the horizontal axis and $\sigma = 0.5$ marked at the peak. Caption: When $p = 0.5$, then max $\sigma = 0.5$.]

We note that the graph of the function $\sigma = \sqrt{p - p^2}$ is a semicircle (center $p = 1/2$, radius $1/2$) for $0 \le p \le 1$.
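
The formulas $\mu = p$ and $\sigma = \sqrt{p(1-p)}$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `bernoulli_sd` and the ten-person population are hypothetical and chosen only for illustration.

```python
import math

def bernoulli_sd(p: float) -> float:
    """Standard deviation sqrt(p(1 - p)) of a 0/1 measurement (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p * (1.0 - p))

# Hypothetical population of N = 10 responses: 3 "Yes" (1) and 7 "Not Yes" (0).
population = [1] * 3 + [0] * 7
N = len(population)

# Mean of the 0/1 values, which should equal the proportion of "Yes".
mu = sum(population) / N

# Average of the squares minus the square of the average, then the square root.
sigma = math.sqrt(sum(x * x for x in population) / N - mu ** 2)

print(f"mu = {mu}, sigma = {sigma:.4f}, formula = {bernoulli_sd(mu):.4f}")

# sigma is 0 at p = 0 and p = 1 and is largest at p = 0.5.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:.2f}  sigma = {bernoulli_sd(p):.4f}")
```

The brute-force value computed from the list agrees with the closed-form expression, and the loop shows the standard deviation rising to 0.5 at $p = 0.5$ and falling back to 0 at the endpoints.
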
When $p = 1$ (where 100% of the population is “Yes”), then every measurement is 1 and therefore $\sigma = 0$ because there is no deviation. When $p = 0$ (where none of the population is “Yes”), then every measurement is 0 and again there is no deviation; so $\sigma = 0$. The maximum standard deviation occurs when $p = 1/2$, which gives a value of $\sigma = 0.5$. So for proportions, it is always the case that $\sigma \le U = 0.5$.

Confidence Interval for Proportions

We estimate the value of $p$ by the sample proportion
\[
\hat{p} = \frac{\#\,\text{“Yes”}}{\#\,\text{Responses}} = \frac{m}{n},
\]
where $n$ is the sample size and $m$ is the number of favorable responses. Because $p$ is unknown, $\sigma = \sqrt{p(1-p)}$ is also unknown. But we can estimate $\sigma$ by $S = \sqrt{\hat{p}(1-\hat{p})}$ or by its upper bound of $U = 0.5$. With confidence intervals for the mean $\mu$, we have $\mu \approx \bar{x} \pm \dfrac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$; but now we use $p \approx \hat{p} \pm \dfrac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$. Replacing $\sigma$ by $S = \sqrt{\hat{p}(1-\hat{p})}$ or by $U = 0.5$, we have
\[
p \approx
\begin{cases}
\hat{p} \pm \dfrac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
\quad\text{or}\quad
\hat{p} \pm \dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}}
& \text{for large populations} \\[3ex]
\hat{p} \pm \dfrac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}
\quad\text{or}\quad
\hat{p} \pm \dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}
& \text{for a population of size } N
\end{cases}
\]

The confidence interval for a large population proportion can be found with the built-in 1-PropZInt feature (item A) from the STAT TESTS menu. It always uses $\sigma \approx \sqrt{\hat{p}(1-\hat{p})}$.

Example 1. Investigators asked 250 undergraduate students at a large university about prayer and found that 195 prayed at least a few times a year. Give a 95% confidence interval for the proportion $p$ of all undergraduates who pray at this school. Can we be reasonably sure that the proportion is at least two-thirds?

Solution. First, $\hat{p} = \dfrac{195}{250} = 0.78$. Assuming a “large” population of students, we have
\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
= 0.78 \pm \frac{1.96\sqrt{0.78 \times 0.22}}{\sqrt{250}}
\approx 0.78 \pm 0.05135
\]
or
\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\times 0.5}{\sqrt{n}}
= 0.78 \pm \frac{1.96 \times 0.5}{\sqrt{250}}
\approx 0.78 \pm 0.062.
\]
Thus, $0.72865 \le p \le 0.83135$ using $S$, and $0.718 \le p \le 0.842$ using $U = 0.5$. Even with the largest margin of error, $p$ is at least 0.718; thus, we can be reasonably sure that at this school more than 2/3 of the students pray.
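
The arithmetic in Example 1 can be reproduced with a short Python sketch. This is an illustration rather than part of the original notes: the function name `proportion_ci` and its `worst_case` flag are hypothetical, and the z-score 1.96 is passed in by hand exactly as in the worked solution.

```python
import math

def proportion_ci(m: int, n: int, z: float, worst_case: bool = False):
    """Large-population interval p_hat +/- z * s / sqrt(n).

    s is sqrt(p_hat * (1 - p_hat)) by default, or the upper bound 0.5
    when worst_case is True. (Hypothetical helper for illustration.)
    """
    p_hat = m / n
    s = 0.5 if worst_case else math.sqrt(p_hat * (1.0 - p_hat))
    margin = z * s / math.sqrt(n)
    return p_hat - margin, p_hat + margin

# Example 1: 195 of 250 students, 95% confidence (z = 1.96).
lo_s, hi_s = proportion_ci(195, 250, 1.96)
lo_u, hi_u = proportion_ci(195, 250, 1.96, worst_case=True)

print(f"using S:       {lo_s:.5f} <= p <= {hi_s:.5f}")
print(f"using U = 0.5: {lo_u:.3f} <= p <= {hi_u:.3f}")
print("lower bound exceeds 2/3:", lo_u > 2 / 3)
```

Both intervals lie entirely above 2/3, which matches the conclusion of the example.
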
For Example 1, the interval based on $S = \sqrt{\hat{p}(1-\hat{p})}$ can also be found using the built-in 1-PropZInt feature from the STAT TESTS menu.

Note: In the second solution above, we used the “worst case scenario” of $\sigma = 0.5$ to obtain $0.78 \pm 0.062$. Reporters often give this information as “78% of students pray.”*

* Based on a survey of 250 students that has a maximum margin of error of 6.2 percentage points.

Example 2. A random survey of a school’s 3600 Greek students found that 205 out of 500 had gone to worship (church, temple, etc.) in the last month. Find a 98% confidence interval for the true proportion of the school’s Greeks who had been to worship in the last month. Is there conclusive evidence that the proportion is less than 45%?

Solution. First, $\hat{p} = \dfrac{205}{500} = 0.41$. Then using the finite population size of $N = 3600$, we have
\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}
= 0.41 \pm \frac{2.326\sqrt{0.41 \times 0.59}}{\sqrt{500}}\sqrt{\frac{3100}{3599}}
= 0.41 \pm 0.04748.
\]
That is, $0.36252 \le p \le 0.45748$, which means that $p$ still might exceed 0.45; so the evidence is not conclusive.

Note: In this case, the maximum margin of error is $\dfrac{2.326 \times 0.5}{\sqrt{500}}\sqrt{\dfrac{3100}{3599}} \approx 0.04827$.

Choosing the Sample Size

As with confidence intervals for the mean, we may like to know in advance what sample size would provide a certain maximum margin of error $e$ with a certain level of confidence $r$. By using $U = 0.5$ as a bound for $\sigma$, the required sample size $n$ satisfies
\[
n \ge
\begin{cases}
\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}
& \text{for large populations} \\[3ex]
\dfrac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1) + \left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
& \text{for a population of size } N
\end{cases}
\]
where $z_{\alpha/2}$ is the appropriate $z$-score depending on the level of confidence.

Note: We always round up to the next integer.

Example 3. What sample size will guarantee a maximum margin of error of 0.035 for any 99% confidence interval of a proportion? What sample size would guarantee the result from a population of size 1200?

Solution. For a large population, the required sample size must satisfy
\[
n \ge \left(\frac{z_{\alpha/2}\times 0.5}{e}\right)^{2} = \left(\frac{2.576 \times 0.5}{0.035}\right)^{2} = 1354.24;
\]
thus, $n$ must be at least 1355.
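
The two sample-size bounds from this section can be sketched in Python as follows. This is an illustrative sketch, not part of the original notes: the function names are hypothetical, the z-score is supplied by hand, and `math.ceil` carries out the "round up" rule. The finite-population function is the one that applies to the second question of Example 3 (with `N = 1200`).

```python
import math

def sample_size_large(z: float, e: float) -> int:
    """Smallest n with n >= (z * 0.5 / e)^2 (hypothetical helper)."""
    return math.ceil((z * 0.5 / e) ** 2)

def sample_size_finite(z: float, e: float, N: int) -> int:
    """Smallest n with n >= N*k / ((N - 1) + k), where k = (z * 0.5 / e)^2."""
    k = (z * 0.5 / e) ** 2
    return math.ceil(N * k / ((N - 1) + k))

# Example 3: margin of error 0.035 at 99% confidence (z = 2.576).
n_large = sample_size_large(2.576, 0.035)
n_finite = sample_size_finite(2.576, 0.035, 1200)

print("large population:", n_large)
print("population of 1200:", n_finite)
```

Because the finite-population bound divides by $(N-1) + k$, it is never larger than the large-population bound, so a known population size can only reduce the required sample.
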
From a population of size 1200, the sample size must satisfy
\[
n \ge \frac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1) + \left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
= \frac{1200 \times 1354.24}{1199 + 1354.24} = 636.48;
\]
thus, $n$ must be at least 637.

Practice Exercises

1. In a nationwide survey, only 378 of 900 adults surveyed approved of the President’s performance.
(a) Find a 95% confidence interval for the true proportion of adults who approve of the President’s performance.
(b) What is the maximum margin of error for a 95% confidence interval based on a survey of 900 people?
(c) Based on the results in Part (b), can you be relatively sure that less than a majority approve?
(d) If you wanted to estimate the true proportion within 0.025 with 95% confidence, then how many people must be surveyed?

2. A pollster wants to survey the 535 members of Congress on a completely nonpartisan issue. The question will be “Did you watch the Super Bowl last February?” The true proportion $p$ will be estimated from a random sample of 60 members of Congress.
(a) Suppose 42 out of 60 said that they did watch the Super Bowl. Find a 98% confidence interval (with the largest margin of error) for the true proportion of members of Congress who watched the Super Bowl. Can you say with certainty whether or not at least two-thirds of Congress watched the Super Bowl?
(b) If you wanted to estimate the true proportion $p$ within 0.03 with 98% confidence, then how many members of Congress would you need to survey?

Answers

1. (a) Note that $\hat{p} = \dfrac{378}{900} = 0.42$. Then
\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
= 0.42 \pm \frac{1.96\sqrt{0.42 \times 0.58}}{\sqrt{900}}
= 0.42 \pm 0.032246.
\]
That is, $0.387754 \le p \le 0.452246$, which also can be found easily with the 1-PropZInt screen.

(b) $\dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}} = \dfrac{1.96 \times 0.5}{\sqrt{900}} \approx 0.0327$

(c) Using the maximum possible margin of error, we have $p \le 0.42 + 0.0327 < 0.50$; so, at the 95% confidence level, we can be reasonably sure that less than a majority approve of the President’s performance.

(d) $n \ge \left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2} = \left(\dfrac{1.96 \times 0.5}{0.025}\right)^{2} = 1536.64$; so sample 1537.

2. (a) $\hat{p} = \dfrac{42}{60} = 0.7$, and then using $N = 535$ and $n = 60$, we have
\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\times 0.5}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}
= 0.7 \pm \frac{2.326 \times 0.5}{\sqrt{60}}\sqrt{\frac{475}{534}}
= 0.7 \pm 0.1416.
\]
That is, $0.5584 \le p \le 0.8416$, which means that $p$ might be below 2/3.

(b)
\[
n \ge \frac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1) + \left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
= \frac{535\left(\dfrac{2.326 \times 0.5}{0.03}\right)^{2}}{534 + \left(\dfrac{2.326 \times 0.5}{0.03}\right)^{2}}
= \frac{535 \times 1502.8544}{534 + 1502.8544} = 394.74;
\]
so sample 395 members of Congress.
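
The finite-population interval used in Example 2 and in Answer 2(a) can be checked with a short Python sketch. This is an illustration only, not part of the original notes: the function name `finite_population_ci` and its `worst_case` flag are hypothetical, and the z-score 2.326 is entered by hand as in the worked solutions.

```python
import math

def finite_population_ci(m: int, n: int, N: int, z: float,
                         worst_case: bool = False):
    """Interval p_hat +/- z * s / sqrt(n) * sqrt((N - n) / (N - 1)).

    s is sqrt(p_hat * (1 - p_hat)) by default, or the upper bound 0.5
    when worst_case is True. (Hypothetical helper for illustration.)
    """
    p_hat = m / n
    s = 0.5 if worst_case else math.sqrt(p_hat * (1.0 - p_hat))
    correction = math.sqrt((N - n) / (N - 1))  # finite population factor
    margin = z * s / math.sqrt(n) * correction
    return p_hat - margin, p_hat + margin

# Example 2: 205 of 500 sampled from N = 3600, 98% confidence.
ex2 = finite_population_ci(205, 500, 3600, 2.326)

# Answer 2(a): 42 of 60 sampled from N = 535, largest margin of error.
ans2a = finite_population_ci(42, 60, 535, 2.326, worst_case=True)

print(f"Example 2:   {ex2[0]:.5f} <= p <= {ex2[1]:.5f}")
print(f"Answer 2(a): {ans2a[0]:.4f} <= p <= {ans2a[1]:.4f}")
```

In both cases the interval straddles the value in question (0.45 and 2/3, respectively), which is why neither solution reaches a definite conclusion.
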