AP Statistics    Name _______________________
Sampling Distributions, CLT, and Confidence Intervals

Homework Assignments for Chapter 8:
a. 2, 3, 7, 9
b. 10, 14, 17, 21
c. 23, 25, 29, 31

Homework Assignments for Chapter 9:
a. 1, 5, 7, 9
b. 11, 14, 16, 18
c. 25, 27, 29
d. 30, 33, 35, 39, 41
e. 47, 49, 50, 53, 55, 73

What is a sampling distribution? Why are they used? What is sampling variability? What is the difference between a characteristic (parameter) and a statistic?

Do problems 2, 3, 7, and 9 from chapter 8.

NOTES: Statistics v. Characteristics, Sampling Distributions

Sampling Distribution of Sample Means

A statistic can be a random variable, such as the sample mean ($\bar{x}$) and the sample proportion ($\hat{p}$). Suppose MHS has the following population of seven senior football players, with their weights in pounds listed: Aaron (220), Brad (200), Chris (170), Doug (180), Eric (190), Frank (210), George (160).

$\mu = \dfrac{220 + 200 + 170 + 180 + 190 + 210 + 160}{7} = 190$ lbs, $\sigma = 20$ lbs

To create a sampling distribution of sample means, we will find every simple random sample of a specified sample size, calculate each sample mean, and plot it on a dotplot. Since there are seven players, we will look at all possible combinations of two players, resulting in 21 pairs.

Sample        AB     AC     AD     AE     AF     AG     BC
Sample mean   210    195    200    205    215    190    185

Sample        BD     BE     BF     BG     CD     CE     CF
Sample mean   190    195    205    180    175    180    190

Sample        CG     DE     DF     DG     EF     EG     FG
Sample mean   165    185    195    170    200    175    185

Now what if we looked at every possible sample of size three?

Sample      ABC     ABD     ABE     ABF     ABG     ACD     ACE
$\bar{x}$   196.7   200     203.3   210     193.3   190     193.3

Sample      ACF     ACG     ADE     ADF     ADG     AEF     AEG
$\bar{x}$   200     183.3   196.7   203.3   186.7   206.7   190

Sample      AFG     BCD     BCE     BCF     BCG     BDE     BDF
$\bar{x}$   196.7   183.3   186.7   193.3   176.7   190     196.7

Sample      BDG     BEF     BEG     BFG     CDE     CDF     CDG
$\bar{x}$   180     200     183.3   190     180     186.7   170

Sample      CEF     CEG     CFG     DEF     DEG     DFG     EFG
$\bar{x}$   190     173.3   180     193.3   176.7   183.3   186.7

What do you notice about the sampling distributions of sample means as the sample size increases from the parent population?
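As a quick check on the two tables above, here is a minimal Python sketch that enumerates every sample of size 2 and size 3 from the seven listed weights. The single-letter keys (A through G), the helper name `sample_means`, and the choice to print both the divide-by-N and the divide-by-(N-1) standard deviation are illustrative assumptions, not part of the handout.

```python
from itertools import combinations
from statistics import mean, pstdev, stdev

# Weights (lbs) of the seven players from the roster, keyed by first initial.
weights = {"A": 220, "B": 200, "C": 170, "D": 180, "E": 190, "F": 210, "G": 160}


def sample_means(n):
    """Return {sample label: sample mean} for every possible sample of size n."""
    return {
        "".join(names): mean([weights[p] for p in names])
        for names in combinations(sorted(weights), n)
    }


values = list(weights.values())
print("parent population: mean =", mean(values), " SD =", pstdev(values))

for n in (2, 3):
    means = list(sample_means(n).values())
    print(
        f"n = {n}: {len(means)} samples,",
        f"mean of sample means = {mean(means):.4f},",
        f"SD (divide by N) = {pstdev(means):.4f},",
        f"SD (divide by N-1) = {stdev(means):.4f}",
    )
```

The divide-by-(N-1) figure for the 21 pairs is the one that should land near the 13.2288 quoted in these notes. The size-3 result may differ in the last digits from a value computed with the rounded means in the table.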
1) What type of distribution is the parent population?
2) What is the mean (center) of the parent population?
3) What is the standard deviation of the parent population?
4) What shape (type of distribution) is the sampling distribution of sample means of sample size 2?
5) What shape (type of distribution) is the sampling distribution of sample means of sample size 3?
6) What appears to be the center (mean) value of the sampling distribution of sample means of sample size 2?
7) What appears to be the center (mean) value of the sampling distribution of sample means of sample size 3?
8) The standard deviation of the sampling distribution of sample means of sample size 2 is 13.2288 and for sample size 3 is 9.5665. How does this compare to the parent population standard deviation of 20?

Four Basic Rules of Sampling Distributions of Sample Means with Central Limit Theorem

Let $\bar{x}$ represent the sample mean of a simple random sample of size n from a parent population having population mean $\mu$ and population standard deviation $\sigma$. Then the following will hold true:

1. $\mu_{\bar{x}}$ (sampling distribution) $= \mu$ (parent population)
2. $\sigma_{\bar{x}}$ (sampling distribution) $= \dfrac{\sigma}{\sqrt{n}}$, where $\sigma$ is the parent population standard deviation (as long as $n \le 0.05N$)
3. When the parent population is normal, the sampling distribution is normal for any sample size, n.
4. By the Central Limit Theorem (CLT): when the sample size, n, is sufficiently large, the sampling distribution of sample means, $\bar{x}$, is approximated by a normal (bell-shaped) curve, even if the parent population distribution is not itself normally distributed.

Do problems 10, 14, 17, and 21 from chapter 8 in your textbook.

Sampling Distribution of Sample Proportions

Suppose the MHS senior girls were asked if they would ask a high school boy out on a date (with the girl picking up the tab). Twenty-four senior girls were randomly selected.
Girl   Andrea    Becky     Cathy     Delia     Emily     Frankie
Vote   Yes (1)   Yes (1)   No (0)    Yes (1)   No (0)    No (0)

Girl   Grace     Hillary   Ima       Jenny     Kayla     Laura
Vote   No (0)    Yes (1)   Yes (1)   Yes (1)   No (0)    Yes (1)

Girl   Melissa   Nancy     Opa       Patty     Risa      Stephanie
Vote   No (0)    No (0)    No (0)    Yes (1)   No (0)    Yes (1)

Girl   Tina      Urma      Velma     Wendy     Yami      Zelda
Vote   Yes (1)   No (0)    No (0)    Yes (1)   Yes (1)   No (0)

If we created a sampling distribution of sample proportions, $\hat{p}$, of sample size 2, there would be

$\binom{24}{2} = \dfrac{24!}{2!\,(24-2)!} = \dfrac{24 \cdot 23 \cdot 22!}{2 \cdot 1 \cdot 22!} = \dfrac{24 \cdot 23}{2} = 276$

possible combinations. For example (counting the Yes votes in each pair): AB = 2, AC = 1, AD = 2, AE = 1… VW = 1, VY = 1, VZ = 0. We construct the frequency distribution and relative frequency distribution of sample proportions.

Sample proportion, $\hat{p}$          Frequency   Relative Frequency
$\hat{p}$ = 0 where (0+0)/2           66          66/276 = .2391
$\hat{p}$ = .5 where (0+1)/2          144         144/276 = .5217
$\hat{p}$ = 1 where (1+1)/2           66          66/276 = .2391
Total                                 276         1.00

[Bar chart: frequency (0 to 200) versus sample proportion (0, 0.5, 1)]

The center (mean) of the sampling distribution of sample proportions of sample size 2 is calculated by

$\mu_{\hat{p}} = \sum (\text{value of } \hat{p}) \cdot (\text{probability of } \hat{p}) = 0 \cdot \dfrac{66}{276} + .5 \cdot \dfrac{144}{276} + 1 \cdot \dfrac{66}{276} = .5$

If we created a sampling distribution of sample proportions, $\hat{p}$, of sample size 4, there would be 10626 possible combinations. For example: ABCD = 3, ABCE = 2, ABCF = 2,…,VWYZ = 2. We construct the frequency distribution and relative frequency distribution of sample proportions.
Sample proportion, $\hat{p}$              Frequency   Relative Frequency
$\hat{p}$ = 0 where (0+0+0+0)/4           495         495/10626 = .0466
$\hat{p}$ = .25 where (0+0+0+1)/4         2640        2640/10626 = .2484
$\hat{p}$ = .5 where (0+0+1+1)/4          4356        4356/10626 = .4099
$\hat{p}$ = .75 where (0+1+1+1)/4         2640        2640/10626 = .2484
$\hat{p}$ = 1 where (1+1+1+1)/4           495         495/10626 = .0466
Total                                     10626       1.00

[Bar chart: frequency (0 to 5000) versus sample proportion (0, 0.25, 0.5, 0.75, 1)]

The center (mean) of the sampling distribution of sample proportions of sample size 4 is calculated by

$\mu_{\hat{p}} = \sum (\text{value of } \hat{p}) \cdot (\text{probability of } \hat{p}) = 0 \cdot \dfrac{495}{10626} + .25 \cdot \dfrac{2640}{10626} + .5 \cdot \dfrac{4356}{10626} + .75 \cdot \dfrac{2640}{10626} + 1 \cdot \dfrac{495}{10626} = .5$

What do you notice about the sampling distributions of sample proportions as the sample size increases from the parent population?

1) What shape (type of distribution) is the sampling distribution of sample proportions of sample size 2?
2) What shape (type of distribution) is the sampling distribution of sample proportions of sample size 4?
3) What appears to be the center (mean) value of the sampling distribution of sample proportions of sample size 2?
4) What appears to be the center (mean) value of the sampling distribution of sample proportions of sample size 4?

Three Basic Rules of Sampling Distributions of Sample Proportions

Let $\hat{p}$ represent the sample proportion of a simple random sample of size n from a parent population having population proportion $\pi$. Then the following will hold true:

1. $\mu_{\hat{p}}$ (sampling distribution) $= \pi$ (parent population)
2. $\sigma_{\hat{p}}$ (sampling distribution) $= \sqrt{\dfrac{\pi(1-\pi)}{n}}$ (as long as $n \le 0.05N$)
3. When the sample size, n, is sufficiently large AND $\pi$ is not too close to 0 or 1, the sampling distribution of sample proportions, $\hat{p}$, is approximated by a normal (bell-shaped) curve. As a rule of thumb for checking whether n is sufficiently large, verify both $n\pi \ge 10$ and $n(1-\pi) \ge 10$.

Now do problems 23, 25, 29, and 31 from chapter 8 in your textbook.

Practice Problems on Sampling Distributions

DIRECTIONS:
Free Response questions need to have proper statistical notation and explanations containing complete sentences. Multiple Choice questions have only one correct answer that needs to be circled.

Multiple Choice

For questions 1-2: A phone-in poll conducted by a newspaper reported that 73% of those who called in liked business tycoon Donald Trump.

1. The number 73% is a
A. statistic   B. sample   C. parameter   D. population   E. size

2. The unknown true percentage of American citizens that like Donald Trump is a
A. statistic   B. sample   C. parameter   D. population   E. size

3. Which of the following are true?
A. The mean of a population depends on the particular sample chosen.
B. The standard deviations of two different samples from the same population will be the same.
C. Statistical inferences can be used to draw conclusions about samples based on population data.
D. Statistical inferences can be used to draw conclusions about populations based on sample data.
E. None of the above statements are true.

4. Which best describes a sampling distribution of a statistic?
A. It is the probability that the sample statistic equals the parameter of interest.
B. It is the probability distribution of all the values that are contained in all possible samples of the same size.
C. It is the distribution of all of the statistics calculated from all possible samples of the same size from the same population.
D. It is the histogram of sample statistics from all possible samples of the same size.
E. It is none of these.

5. A random sample of 50 U.S. working adults was asked to reveal their (gross) annual incomes. The variance of this sample:
A. is always smaller than the variance of the population.
B. cannot be computed since the population size is not given.
C. equals the variance of the population.
D. is an estimate of the variance in the sampling distribution of the sample means of the gross annual incomes of all possible samples of any sample size.
E.
is an estimate of the variance of the population but may differ from the variance of the population.

For questions 6-8: A survey will ask a random sample of 1500 adults in the OKC area if they support an increase in the sales tax from 8.25% to 9%, with the additional revenue going to education. Suppose we know that $\pi$, the proportion of all OKC adults who support the increase, is .30.

6. The mean, $\mu_{\hat{p}}$, of all possible values of the sample proportion in support of the increase will be
A. 8.25%   B. 30% 8.25%   C. 0.30   D. 1500   E. 450

7. The standard deviation of $\hat{p}$ (known as $\sigma_{\hat{p}}$) is
A. 0.4582   B. 0.2100   C. 0.0118   D. 0.0141   E. 0.0000

8. The probability that $\hat{p}$ is more than 0.40 is
A. less than 0.0001   B. about 0.100   C. 0.4549   D. 0.50   E. 0.8918

9. A sample of size 49 is drawn from a normal population with a mean of 63 and a standard deviation of 14. What are the mean and standard deviation of the sampling distribution of sample means?
A. µ = 9, σ = 2
B. µ = 63, σ = .286
C. µ = 63, σ = 2
D. µ = 1.286, σ = 3.5
E. µ = 9, σ = 14

10. The distribution of SAT Math scores of students taking Calculus I at a large university is skewed left with a mean of 625 and a standard deviation of 44.5. If random samples of 100 students are repeatedly taken, which statement best describes the sampling distribution of sample means?
A. Normal with a mean of 625 and standard deviation of 44.5.
B. Normal with a mean of 625 and standard deviation of 4.45.
C. Shape unknown with a mean of 625 and standard deviation of 44.5.
D. Shape unknown with a mean of 625 and standard deviation of 4.45.
E. No conclusion can be drawn since the population is not normally distributed.

11. Which of the following statements regarding the sampling distribution of sample means is incorrect?
A. The sampling distribution is approximately normal when the population is normal or the sample size is sufficiently large.
B. The mean of the sampling distribution is the mean of the population.
C.
The standard deviation of the sampling distribution is the standard deviation of the population.
D. The sampling distribution is found by taking repeated samples of the same size from the population of interest and computing the mean of each sample.
E. All of these are correct.

12. After repeated observations, it has been determined that the waiting time at the drive-through window at a local bank on Friday afternoons between 12:00 noon and 6:00 pm is skewed left with a mean of 3.5 minutes and standard deviation of 1.9 minutes. A sample of 100 customers is to be taken next Friday. What is the probability that the mean of the sample will exceed 4 minutes?
A. .0042   B. .0396   C. .0420   D. .3960
E. The probability cannot be determined using a normal curve approximation.

For questions 13-17: The distribution of actual weights of chocolate bars produced by a certain machine is normal with mean 8.1 ounces and standard deviation of 0.1 ounces.

13. If a sample of five of these chocolate bars is selected, the probability that their average weight is less than 8 ounces is
A. 0.0125   B. 0.1853   C. 0.4871   D. 0.9873
E. Not enough information provided to answer the question.

14. If a sample of five of these chocolate bars is selected, there is only a 5% chance that the average weight of the sample of five chocolate bars will be below
A. 7.94 ounces   B. 8.03 ounces   C. 8.08 ounces   D. 8.20 ounces   E. 8.29 ounces

15. The company sells eight individually wrapped (and independently selected) chocolate bars as a special packaged product. What is the expected weight of the special packaged product (chocolate only)?
A. 0.8 ounces   B. 8.1 ounces   C. 40.5 ounces   D. 64.8 ounces   E. cannot be determined.

16. What is the standard deviation of the sampling distribution of the weight of the special packaged product (chocolate only)?
A. 0.035 ounces   B. 0.08 ounces   C. 0.2828 ounces   D. 0.8944 ounces   E. cannot be determined.

17.
What is the probability that a special packaged product contains more than 66 ounces?
A. 0.0000   B. 0.0010   C. 0.1100   D. 1.0000   E. cannot be determined.

Free-Response

18. For the following situation, label the x-axis of the population distribution and of the sampling distribution of sample means. Show how the CLT supports your work in determining the values of the sampling distribution.

A fisherman takes tourists out fishing and has noticed that, over the long run, the weights of a certain type of fish typically caught are approximately normal with a mean of 12 pounds and a standard deviation of 4 pounds. During a good day, the boatload of 16 people catches the limit of 4 fish each, or a total of 64 fish. The statistic of interest is the mean weight, $\bar{x}$, of the 64 fish caught on a good day.

Next: Chapter 9

Things to know:
Point Estimate: Mean, Median, Midrange, Mode, Trimmed Mean, Standard Deviation, Variance, Proportion
Statistic vs. Parameter (or Characteristic)
What constitutes a good point estimate?
Create and interpret a large sample confidence interval for the population proportion.
Determine the sample size needed to create a confidence interval with a given error and level of confidence.
Create and interpret a confidence interval for the population mean using z or t as appropriate.

NOTES:

Chapter nine begins with a discussion of point estimates. A point estimate is a statistic determined from a sample that is representative of the population. That statistic is a point estimate of its associated population parameter. For example, a value of x-bar from a good random sample is a point estimate of the population value of mu, and a value of p from a random sample is a point estimate of the population value of pi. Any statistic taken from a good random sample is a point estimate of its corresponding population parameter.

What makes a good point estimate? Good point estimates are a combination of two things: they are unbiased and they have a small standard deviation.
However, sometimes a biased point estimate might be better than an unbiased one, if its standard deviation is much smaller. Your textbook has some good pictures of the balance between bias and standard deviation.

So what is bias? Bias when we talk about point estimates is different than bias when we talk about sampling. Sampling bias has to do with mistakes made when taking a random sample. There may be selection, measurement, or nonresponse bias. But bias in the context of point estimates has to do with the balance of overestimates and underestimates by the statistic. Roughly speaking, an unbiased statistic is one that, when it is wrong, overestimates ½ the time and underestimates ½ the time. More precisely, a statistic is unbiased when the mean of its sampling distribution equals the parameter it estimates. Most of the common statistics are unbiased estimators of their associated parameters (x-bar and p are unbiased). However, s, the sample standard deviation, is a biased estimator of sigma, the population standard deviation. We still use s because it is the best option we have for estimating the population standard deviation.

Textbook Homework: 1, 5, 7, 9

Confidence interval

Confidence Interval for the population proportion

$\hat{p} \pm E$   -or-   $\hat{p} - E < \pi < \hat{p} + E$

$\hat{p}$ = Sample Proportion
$E = z_{critical}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
$1 - \alpha$ = Confidence Level ($\alpha$ is the percentage in the tails)

Requirements:
Sample is random
$n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$
n is less than 5% of the population

If all of these are true, then the sampling distribution of $\hat{p}$ is approximately normal and you can create a Confidence Interval.

Steps to follow…
1. Identify the parameter you are estimating.
2. List the important statistics and their values, including level of confidence.
3. State and verify the requirements.
4. State your intentions.
5. Show the appropriate formula and substitute into it. Find the confidence interval on the calculator and record the answer.
6. State the confidence interval in a sentence.
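The formula and requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_proportion_z_interval`, its parameters, and the data (136 successes out of 400) are hypothetical and do not come from the handout, and the randomness of the sample still has to be argued from how the data were collected.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence=0.95, population_size=None):
    """Large-sample z interval for a population proportion.

    Returns (lower, upper, margin_of_error). Raises ValueError when a
    requirement that can be checked numerically is not met.
    """
    # Requirement: n*p_hat >= 10 and n*(1 - p_hat) >= 10, i.e. at least
    # 10 successes and at least 10 failures in the sample.
    if successes < 10 or n - successes < 10:
        raise ValueError("need n*p_hat >= 10 and n*(1 - p_hat) >= 10")
    # Requirement: n is less than 5% of the population (only if N is supplied).
    if population_size is not None and n >= 0.05 * population_size:
        raise ValueError("sample should be less than 5% of the population")

    p_hat = successes / n
    alpha = 1 - confidence                              # total area in the tails
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for 95%
    margin = z_critical * sqrt(p_hat * (1 - p_hat) / n) # E in the formula
    return p_hat - margin, p_hat + margin, margin


# Hypothetical data (not from the handout): 136 "yes" answers out of 400.
low, high, E = one_proportion_z_interval(136, 400, confidence=0.95)
print(f"p-hat = {136 / 400:.3f}, E = {E:.4f}, interval = ({low:.4f}, {high:.4f})")
```

Raising the confidence level makes the critical value larger, so the same data give a wider interval; a larger n shrinks the margin of error.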
Confidence interval

Confidence interval for the population mean

$\bar{x} \pm E$   -or-   $\bar{x} - E < \mu < \bar{x} + E$

$E = z_{critical}\dfrac{\sigma}{\sqrt{n}}$   -or-   $E = t_{critical}\dfrac{s}{\sqrt{n}}$

If you know the standard deviation of the population, $\sigma$, then create a z-interval. If you don't know the standard deviation of the population, but you do know the standard deviation of your sample, s, then create a t-interval.

Level of confidence: $1 - \alpha$

Requirements:
Sample is random
$n \ge 30$, so the sample size is large enough for $\bar{x}$ to be approximately normally distributed
-or-
The sample can be demonstrated to be consistent with a sample from a population that is approximately normal. Thus $\bar{x}$ is approximately normally distributed. (Demonstrate this using graphical analysis of the sample data or analysis of the sample statistics.)

If both of these conditions are satisfied you can create a confidence interval. The steps are the same as before.

One Proportion Z Confidence Interval

Finding sample size needed:

1. Joey Boatright and Chris Mersinger are running for class president at MHS. You conduct a sample survey of the senior class on who plans on voting for Joey instead of Chris (note: there are no other candidates or write-in options). You will tolerate a 4% margin of error with a 95% confidence level. Assume each candidate is equally likely to be favored in the senior class (hint: use 0.50 for the sample proportion). How large a sample is needed?

2. An MHS teacher intends to verify reliable information that the illiteracy rate at MHS is about 2%. How many randomly selected subjects should be tested if we want 96% confidence that the sample is in error by not more than one percentage point?

Constructing Confidence Interval for a Population Proportion:

Example: An Associated Press article on potential violent behavior reported the results of a survey of 750 workers who were employed full time (San Luis Obispo Tribune, Sept. 7 1999).
Of those surveyed, 125 indicated that they were so angered by a coworker during the past year that they felt like hitting the person (but didn't). Construct a 95% confidence interval based on the information presented.

3. Mrs. Ima Mean considers a multiple choice test to be easy if at least 85% of the responses are correct. A sample of 175 student responses to one question indicates that 146 of those student responses were correct. Construct the 98% confidence interval for the true proportion of correct responses. Is it likely that this particular test question is really easy?

4. A chemistry teacher at MHS has created a wonder spray that he claims eliminates sophomore bad behavior when 10th graders come into contact with the potion. If 300 10th graders were sprayed, and 47% of them exhibit no bad sophomore behavior afterwards, would you conclude that his mixture was satisfactory (assuming satisfactory is at least 50%)? Construct AND interpret the 99% confidence interval.

Verify assumptions:

5. Jaci makes 996 free throws out of 1000 attempts. Verify whether a valid confidence interval could be constructed.

Verify assumptions are met AND construct CI:

6. Kim randomly selected a card from a well shuffled regular 52 card deck. Out of the 50 trials, 17 red cards were drawn. Construct the 95% confidence interval and use it to decide whether Kim was playing with a full deck.

Textbook Homework: 11, 14, 16, 18

Around the World in 45 Minutes
Sample Size and Confidence Intervals

We will "randomly" select a location on the earth by tossing and catching the globe. Record whether the mark on your finger lands on land, L, or water, W.

1. How many times do we need to toss and catch the earth to create a "good" confidence interval? Well…
2. What level of confidence do we want?
3. How close do we want to be to the true proportion?
4. Do we know what the true proportion is or is supposed to be? If not, then estimate it to be .5. This gives us the largest sample size we could possibly need.
Now use the last part of the C.I. formula.

Bound on error (E) = Critical value * standard error
-or-
$E = 1.96\sqrt{\dfrac{.5(1-.5)}{n}}$ when the desired confidence level is 95% and $\pi$ is unknown.

Substitute how close you want to estimate within for E, and solve for n.

5. Now toss the world and create a 95% confidence interval for the proportion of the earth covered with land. Interpret the interval in the context of the problem.
6. What three factors involved in the process of creating a confidence interval determine its width?
7. Can any of these factors be controlled?

Textbook Homework: 25, 27, 29

Z and t Confidence Interval for Mu

$\bar{x} \pm z^{*}\dfrac{\sigma}{\sqrt{n}}$   or   $\bar{x} \pm t^{*}\dfrac{s}{\sqrt{n}}$

Normal or t-distribution:

1. Mr. Moore wants to estimate the mean number of Gummi Bears a student can consume in one 55 minute class period. He randomly selects 30 students using the student rolls and a random number generator, buys multiple bags of Gummies and starts the timer. He gets an average of 85 GB with a standard deviation of 9.3 GB.

2. Suppose we know from the M&M/Mars website that the standard deviation of the diameter of an m&m is .235 mm. Estimate the average diameter of an m&m. Your sample of 53 m&ms produces an average diameter of 9.8 mm.

Finding sample size needed:

3. Joey Boatright and Chris Mersinger are running for class president at MHS. You want to know how smart the average person voting for Joey is. So, you conduct a sample survey of students in the senior class who plan on voting for Joey and measure their IQ. You will tolerate a 4 point margin of error with a 95% confidence level. Assume that the smartest person planning to vote for Joey has an IQ of 122 and the "least smart" person has an IQ of 76. How large a sample is needed?

4. An MHS teacher intends to verify reliable information that the average "big toe" length at MHS is about 2 inches. Information from previous years has consistently shown that "big toe" lengths for young adults have a standard deviation of .34 in.
How many randomly selected subjects should be tested if we want 96% confidence that the sample is in error by not more than .1 inch?

Constructing Confidence Interval For Estimating the Population Mean:

5. Five randomly selected students win a free visit to the dentist during National Dental Health week! They were asked how many months it had been since their last visit. They responded: 6, 17, 11, 22, and 29. Construct a 95% confidence interval for the mean number of months elapsed since the last visit to a dentist for the population of students that were eligible for the free visit.

6. A chemistry teacher at MHS has created a wonder spray that he claims eliminates sophomore bad behavior when 10th graders come into contact with the potion, but he wants to know how long to expect the potion to work. Thirty randomly selected naughty 10th graders were sprayed, and they exhibited no bad sophomore behavior afterwards for an average of 4 hours with a standard deviation of .5 h. Construct AND interpret the 99% confidence interval for the teacher.

Verify assumptions:

The following data are the calories per half-cup serving for 16 popular chocolate ice cream brands: 270, 150, 170, 140, 160, 160, 160, 290, 190, 190, 160, 170, 150, 110, 180, 170. Is it reasonable to use a t-confidence interval to compute a confidence interval for $\mu$, the true mean calories per half-cup serving of chocolate ice cream?

Textbook Homework: 30, 33, 35, 39, 41

Student t Distribution

Small sample inference for population mean: n < 30

If we know that the original distribution is normal and we know the standard deviation of the population, then we can construct a confidence interval based on the Standard Normal Distribution. If we don't know $\sigma$, we can sometimes use a Student t distribution to determine the critical values.

Degrees of freedom - the number of scores that can vary after certain restrictions have been imposed on all scores.
Example: If 10 scores have a mean of 80, we can freely assign random values of x to the first 9 scores. But the tenth will have to be a specific value to result in the mean of 80.

Degrees of freedom = n - 1

Student t requirements

Chapter 9: More practice problems

So, why do people really do statistics? Well, people really do statistics to answer really important questions about trends and patterns within populations. Now, some of that can be accomplished by looking at graphs and summary statistics of samples, but not very accurately because of sampling variability (gasp). You know. What if you get one of those really weird samples that can just happen by chance and predict that the average height for 18-year-old females is 6 ft? Well, if you're working for Aeropostale, you may soon be out of a job. One alternative might be to study the whole population with which we are concerned. But as a colleague of mine once said, "We really don't care that much…plus we've got a life that has a time limit." So, we need to use a sample, but in some way that allows us to account for some of the dreaded sampling variability (gasp). Enter confidence intervals.

In class we discussed using a point estimate from a sample, creating a margin of error using a certain level of confidence, and combining the two to create a confidence interval. We also discussed the conditions necessary for creating confidence intervals for estimating a proportion and for estimating a mean. Below are some sample problems to help you with the process of writing out the solution to a confidence interval question.

Jeans: The Aeropostale Corporation is determining what styles of jeans will be popular next school year. Since skinny jeans have surged in popularity, Aeropostale is considering carrying more "skinny" styles than in previous years. A randomly selected sample of 2500 women between the ages of 16 and 21 were asked what style of jeans they planned to buy for back-to-school next year.
52% responded they would buy skinny jeans. Since Aeropostale will only increase production if it is clear that a majority of women want the skinny jean look, should they increase production? Use a confidence interval to support your decision.

Categorical or numerical data?

Requirements: (why?)

List the statistics you know or will need:

Construct the interval: (write the formula)

Write the answer in context: We are _____% confident that the true (proportion/mean) of ______________________________________ is between _________ and _________.

What does the confidence level mean? If this procedure were repeated many times with the same sample size, we would expect about ________% of the resulting intervals to include the true population (proportion/mean) of _____________________________.

What will you tell the Aeropostale Corporation?

How would raising the confidence level to 98% change the interval?

Grades: This year's students seem to be doing better than last year's students. A random sample of semester grades from this year's students is listed below. Construct a 95% confidence interval for the mean semester grade of current students.

Current scores: 82, 75, 68, 87, 95, 86, 95, 82, 74, 88, 92, 90, 89, 91

Categorical or numerical data?

Requirements: (why?)

List the statistics you know or will need:

Construct the interval: (write the formula)

What does the interval mean? (Write it in context.) We are _____% confident that the true (proportion/mean) of ______________________________________ is between _________ and _________.

What does the confidence level mean? If this procedure were repeated many times with the same sample size, we would expect about ________% of the resulting intervals to include the true population (proportion/mean) of _____________________________.

How would raising the confidence level to 98% change the interval?

Textbook Homework: 47, 49, 50, 53, 55, 73

He's angry, but is he right?
The following excerpt is taken from a letter written by a corporation president and sent to the Associated Press.

When you or anyone else attempts to tell me and my associates that 1223 persons account for our opinions and tastes here in America, I get mad! How dare you! When you or anyone else tells me that 1223 people represent America, it is astounding and unfair and should be outlawed.

The writer then goes on to claim that because the sample size of 1223 people represents 120 million people, his letter represents 98,000 people (120 million divided by 1223) who share the same views.

a. Given that the sample size is 1223 and the degree of confidence is 95%, find the margin of error for the proportion. Assume that there is no prior knowledge about the value of that proportion.
b. The writer of the letter is taking the position that a sample size of 1223 taken from a population of 120 million people is too small to be meaningful. Do you agree or disagree? Base your answer on your findings from part a.

It's Frappy Time!

2002 AP FR Form B Q#4

Each person in a random sample of 1,026 adults in the United States was asked the following question.

"Based on what you know about the Social Security system today, what would you like Congress and the President to do during this next year?"

The response choices and the percentages selecting them are shown below.

Completely overhaul the system        19%
Make some major changes               39%
Make some minor changes               30%
Leave the system the way it is now    11%
No opinion                             1%

a. Find a 95% confidence interval for the proportion of all United States adults who would respond "Make some major changes" to the question. Give an interpretation of the confidence interval and give an interpretation of the confidence level.
b.
An advocate for leaving the system as it is now commented, "Based on this poll, only 39% of adults in the sample responded that they want some major changes made to the system, while 41% responded that they want only minor changes or no changes at all. Therefore, we should not change the system." Explain why this statement, while technically correct, is misleading.

2003 Form B Q6

Researchers at a large health maintenance organization (HMO) are planning a study of a certain mild illness. They will select a random sample of patients who are ages 35 to 54 and see if they contract the illness in the next year. The researchers are interested in estimating the proportions of men and of women who are likely to develop the illness in each of 4 age-groups: 35-39, 40-44, 45-49, and 50-54. The researchers plan to include 2,000 patients in the study. Suppose the researchers draw a random sample from all of the patients at this HMO who are ages 35 to 54 and find the following numbers within each gender and age group.

                  Age-Group
          35-39   40-44   45-49   50-54
Male       350     230     150      60
Female     445     370     245     150

a) Suppose that at the end of the study, 10 percent of the females in the 40-44 age group contracted the illness. Calculate a 95 percent confidence interval to estimate the population proportion of females in this age-group that contracted the illness. Interpret this confidence interval in the context of this situation. Interpret the confidence level of 95 percent.

b) Suppose that at the end of the study, 10 percent of the males in the 40-44 age group contracted the illness. The corresponding 95 percent confidence interval to estimate the population proportion of males in this age-group that contracted the illness is (0.061, 0.139). Note that this interval and the interval in part a) are of different lengths even though the two sample proportions were identical.
What would be an alternative way to allocate a sample of 2,000 subjects so that the 95 percent confidence interval widths for all male age-groups and for all female age-groups (i.e., for all 8 groups) would be the same when the sample proportions are the same? Justify your answer.

c) Based on previous studies, researchers believe that the percentage of those who contract the illness will be similar for males and females, and therefore plan to ignore gender when selecting a sample for this study. Previous studies also indicate that the percentages of adults who will contract this illness in the 35-39, 40-44, 45-49, and 50-54 age-groups are anticipated to be 5%, 8%, 20%, and 35%, respectively. How should the sample of 2,000 subjects be allocated with respect to age-groups so that the widths of the 95 percent confidence intervals for the four groups will be approximately the same? Justify your answer.