Texas A&M University
Department of Statistics
STAT 211: Principles of Statistics I
Practice Problems for Exam II, Fall 2023
Dr. Alexander Roitershtein

[Use the following information to answer the next 2 questions.]
Most graduate schools of business require applicants for admission to take the Graduate Management Admission Council’s GMAT examination. Scores on the GMAT are roughly normally distributed with a mean of 527 and a standard deviation of 112.

1. What is the probability of a randomly selected candidate scoring above 500 on the GMAT? Answer up to four decimal places.
a) 0.3682
b) 0.4047
c) 0.4836
d) 0.5948
e) None of the other options.

2. How high must an applicant score on the GMAT in order to score in the highest 5%? Answer up to two decimal places.
a) 705.22
b) 707.22
c) 709.22
d) 711.22
e) 713.22

3. Suppose Alice just found out that she scored 75 on her first Statistics exam, with a z-score of −0.25. Her friend, Bob, who is in a different section with the same Professor, also scored 75 but had a z-score of 0.35. What can you conclude about the mean exam scores in the classes?
a) The mean exam scores of both classes must both be equal to 75.
b) Alice’s class had a lower mean score than Bob’s class.
c) Alice’s class had a higher mean score than Bob’s class.
d) It is impossible to say without seeing all of the individual test scores.
e) It is impossible to say since we don’t know the shape of either distribution.

[Use the following information to answer the next 3 questions.]
A researcher is interested in the amount of water Texans drink per day on average. The true amount of water that Texans drink on any given day is normally distributed with a mean of 3,400 ml and a standard deviation of 400 ml. The researcher takes a random sample of 16 Texans and records how much water they drank in the previous 24 hours.

4. What is the distribution of the average amount of water drunk by the participants in a random sample of 16 Texans?
a) X̄ ∼ N(µX̄ = 3400, σX̄ = 400)
b) p̂ ∼ N(µp̂ = 400/3400, σp̂ = √((400/3400)(3000/3400)/16))
c) X̄ ∼ N(µX̄ = 3400, σX̄ = 400/√16)
d) p̂ ∼ N(µp̂ = 3000/3400, σp̂ = √((3000/3400)(400/3400)/16))
e) None of the other options

5. Using the Empirical Rule, calculate the probability that the sample mean is greater than 3700 ml.
a) 0.9970
b) 0.0015
c) 0.7743
d) 0.2257
e) 0.0030

6. The researcher constructs a 95% confidence interval for the parameter of interest. If we were to quadruple the sample size to 64 Texans, how would the width of the confidence interval change?
a) The new confidence interval will be two times wider than the previous confidence interval.
b) The new confidence interval will be four times wider than the previous confidence interval.
c) The previous confidence interval will be two times wider than the new confidence interval.
d) The previous confidence interval will be four times wider than the new confidence interval.
e) These two confidence intervals will be equal in size since the error bound remains unaltered.

7. The number of calories consumed per day is a question of interest for nutritionists. It is known from past studies that the standard deviation of calories per day is 250. Nutritionists want to construct a 95% confidence interval for the amount of calories consumed per day, but sampling is expensive and they want to save money. What is the minimum sample size needed to create a 95% confidence interval such that the width is 100?
a) 384
b) 385
c) 96
d) 97
e) None of the other options

8. A political candidate, Alice, is running for office. She wants to know what her approval rating is and has hired you to help. You decide to take a sample of 100 people in the town and find that 64 of the people you sampled approve of Alice. Which of the intervals is the correct 99% confidence interval for Alice’s approval rating? Answer up to four decimal figures.
a) (0.5167, 0.7634)
b) (0.2367, 0.4834)
c) (0.5459, 0.7341)
d) (0.2659, 0.4541)
e) None of the other options

9. A utility company wants to know the average electricity used per residential customer per day. A random sample of 41 residential customers is taken, with a sample mean of 30 kWh and a sample standard deviation of 4.3 kWh. Assume the population is approximately normally distributed. Which expression is the correct margin of error for a 90% confidence interval of the parameter of interest? Choose the nearest option.
a) 1.96 ∗ 4.3/√41
b) 1.64 ∗ 4.3/√41
c) 2.02 ∗ 4.3/√41
d) 1.68 ∗ 4.3/√41
e) None of the other options

10. An inspector inspects a shipment of medications to determine the efficacy in terms of the proportion p in the shipment that failed to retain full potency after 60 days of production. Unless there is clear evidence that this proportion is significantly less than 0.05, she will reject the shipment. To reach a decision she selects a simple random sample of 200 pills. Suppose that 8 of the pills have failed to retain their full potency. What is the standard error of the sample mean? Answer up to three decimal places.
a) 0.001
b) 0.014
c) 0.024
d) 0.069
e) None of the above

[Use the following information to answer the next 2 questions.]
Medical researchers now believe there may be a link between baldness and heart attacks in men. The hypothesis testing problem that we’re interested in is
H0 : There is no link between baldness and heart attacks in men
vs
Ha : There is a link between baldness and heart attacks in men
at level of significance α = 0.05.

11. What would constitute a Type II error in this study?
a) The study finds no link between baldness and heart attacks in men when in reality there is no link between baldness and heart attacks in men.
b) The study finds a link between baldness and heart attacks in men when in reality there is no link between baldness and heart attacks in men.
c) The study finds no link between baldness and heart attacks in men, when in reality there is a link between baldness and heart attacks in men.
d) The study finds a link between baldness and heart attacks in men, when in reality there is a link between baldness and heart attacks in men.

12. What is the probability the test will result in a Type I error?
a) 0.05
b) 0.95
c) 0.01
d) 0.99
e) 0.025

[Use the following information to answer the next 2 questions.]
Suppose you roll a fair die 64 times where each roll is independent. A success in each trial is defined as getting a 4 or more. You are interested in the number of successes you get in 64 trials.

13. What is the appropriate distribution?
a) Normal(µ = 64/3, σ = 2/3)
b) Normal(µ = 32, σ = 8)
c) Normal(µ = 16, σ = 4)
d) Binomial(n = 64, p = 1/4)
e) Binomial(n = 64, p = 1/2)

14. Find the mean and standard deviation of the above distribution.
a) mean = 32, standard deviation = 4
b) mean = 64, standard deviation = 4
c) mean = 32, standard deviation = 8
d) mean = 64, standard deviation = 8
e) mean = 16, standard deviation = 8

15. The STAT201 Exam 2 scores are normally distributed and their middle 68% scores are between 60 and 86. Calculate the approximate mean and standard deviation of the distribution. Hint: Use the Empirical Rule.
a) µ = 70 and σ = 6.5
b) µ = 73 and σ = 6.5
c) µ = 70 and σ = 13
d) µ = 73 and σ = 13
e) µ = 0 and σ = 1

16. The SAT scores are normally distributed with mean 1,500 and standard deviation 300, and the ACT scores are also normally distributed with mean 21 and standard deviation 5. Jeff earned 1,800 on his SAT and Jane earned 24 on her ACT. A college admissions officer wants to determine which of the two applicants scored better on their standardized test with respect to the other test takers. Choose the correct explanation.
a) Jeff scored higher on his standardized test with respect to the other test takers than Jane did.
b) Jane scored higher on her standardized test with respect to the other test takers than Jeff did.
c) Jeff and Jane scored the same on their standardized tests with respect to the other test takers.
d) The admission officer is unable to compare their scores because the SAT and ACT have different distributions.
e) None of the above

[Use the following information to answer the next 2 questions.]
According to the US Census Bureau’s American Community Survey, 87% of Americans over the age of 25 have earned a high school diploma. Suppose a simple random sample of size n = 100 from this population has been drawn.

17. Compute the mean and the standard deviation of the proportion of Americans in the sample who have a high school diploma.
a) µp̂ = 0.87, σp̂ = 0.0113
b) µp̂ = 0.13, σp̂ = 0.0113
c) µp̂ = 0.87, σp̂ = 0.0336
d) µp̂ = 0.13, σp̂ = 0.0336
e) µp̂ = 0.87, σp̂ = 0.00336

18. What is the probability that the percentage of Americans in the sample with a high school diploma is less than 85%? Answer up to four decimal places.
a) 0.0385
b) 0.1862
c) 0.2760
d) 0.3746
e) 0.7240

19. The American Community Survey (ACS), part of the United States Census Bureau, conducts a yearly census similar to the one taken every ten years, but with a smaller percentage of participants. The most recent survey estimates with 90% confidence that the mean household income in the U.S. falls between $69,720 and $69,922. Find the point estimate for mean U.S. household income and the error bound or the margin of error (MoE) for mean U.S. household income.
a) Point estimate = 69000; MoE = 100
b) Point estimate = 69800; MoE = 101
c) Point estimate = 69821; MoE = 101
d) Point estimate = 69821; MoE = 202
e) Point estimate = 69842; MoE = 101

[Use the following information to answer the next 4 questions.]
Assume that the population distribution of bag weights is normal with an unknown population mean and a known standard deviation of 0.1 ounces.
A random sample of 16 small bags of the same brand of candies was selected. The weight of each bag was then recorded. The mean weight of the bags in the sample was 2.5 ounces. Suppose we wish to construct a 95% confidence interval for the mean weight of bags of that specific brand of candies.

20. What formula would you use to construct a 95% confidence interval for the mean weight of bags? The symbols bear their usual meanings.
a) X̄ ± z* × S/√n, where z* = 1.96
b) X̄ ± z* × σ/√n, where z* = 1.96
c) X̄ ± t*₁₅ × σ/√n, where t*₁₅ = 2.131
d) X̄ ± t*₁₅ × S/√n, where t*₁₅ = 2.131
e) None of the above. The conditions for the confidence interval are not satisfied.

21. Consider the following interpretations of a 95% confidence interval for the mean weight of bags in this context.
(I) If we take 10,000 repeated samples under identical conditions, then in approximately 9,500 cases, the population mean weight of bags would be equal to 2.5 ounces.
(II) If we take 10,000 repeated samples under identical conditions, then approximately 9,500 of the estimated confidence intervals calculated from those samples will contain the population mean weight of bags.
(III) If we take 10,000 repeated samples under identical conditions, then in approximately 9,500 cases, the estimated confidence intervals calculated from those samples would contain the sample mean weight of 2.5 ounces.
(IV) If we take 10,000 repeated samples under identical conditions, then approximately 500 of the estimated confidence intervals calculated from those samples will not contain the population mean weight of bags.
(V) There is a 95% probability that the sample mean weight of bags will lie within the confidence interval.
Select the correct answer from the following options:
a) (II) is true.
b) (I) and (II) are true.
c) (I) and (III) are true.
d) (II) and (IV) are true.
e) (I), (II), and (V) are true.

22. What change, if any, would you observe in the 95% confidence interval if we use another random sample of size n = 64 instead of n = 16?
a) The new confidence interval will be two times wider than the previous confidence interval.
b) The new confidence interval will be four times wider than the previous confidence interval.
c) The new confidence interval will be two times narrower than the previous confidence interval.
d) The new confidence interval will be four times narrower than the previous confidence interval.
e) These two confidence intervals will have an equal width since the error bound remains unaltered.

23. Suppose, instead of a 95% interval, we wish to construct a 90% confidence interval for the mean weight of bags. In that case, find the minimum sample size in order to ensure that the width of the 90% confidence interval is no larger than 0.1.
a) 10
b) 11
c) 15
d) 16
e) 27

24. The owner of a travel agency would like to determine whether or not the mean age of the agency’s customers is over 24. If so, he plans to alter the destination of their special cruises and tours. If he concludes the mean age is over 24 when it is actually not, he makes a ( ) error. If he concludes the mean age is not over 24 when it actually is, he makes a ( ) error.
a) Type II, Type I
b) Type I, Type II
c) Type I, Type I
d) Type II, Type I
e) α, Power

25. The Nielsen Company reported that nationally 30% of Millennials order groceries online. Suppose that a U.S. grocery company wishes to test whether this figure is different in their local market. The test will be conducted at the 1% significance level. What is the probability that the grocery company will commit a Type I error?
a) 0.01 (α = Pr(Type I error))
b) 0.02
c) 0.05
d) 0.10
e) Not enough information

26. You are planning to use a sample proportion p̂ to estimate a population proportion p. Suppose a sample size of n = 100 and a confidence level of 95% yielded a margin of error of 0.025.
Which of the following will result in a larger margin of error?
(I) Increasing the sample size while keeping the same confidence level
(II) Decreasing the sample size while keeping the same confidence level
(III) Increasing the confidence level while keeping the same sample size
(IV) Decreasing the confidence level while keeping the same sample size
a) I and III
b) I and IV
c) II and III
d) II and IV
e) None of the above

27. Suppose we are testing the hypotheses H0 : µ = 70 vs Ha : µ > 70. Of the following sample means, which one will have the largest P-value? (Hint: draw a sampling distribution of X̄.)
a) x̄ = 72
b) x̄ = 70
c) x̄ = 68
d) x̄ = 66
e) x̄ = 60

28. Out of 2000 students in the school, 1400 passed an exam. What is approximately the standard error of p̂?
a) 0.0001
b) 0.0094
c) 0.0078
d) 0.9999

[Use the following information to answer the next 2 questions.]
The proportion of a population with a characteristic of interest is p = 0.63.

29. Find the mean and standard deviation of the sample proportion p̂ obtained from random samples of size 3,600.
a) µp̂ = 0.63, σp̂ = 0.008.
b) µp̂ = 0.37, σp̂ = 0.08.
c) µp̂ = 0.37, σp̂ = 0.008.
d) µp̂ = 0.63, σp̂ = 0.08.

30. How will the standard deviation of p̂ change if you decrease the size of the sample?
a) decreases
b) remains the same
c) increases
d) depends on the distribution
e) None of the other options

31. It is known that the 5-year stomach cancer survival rate is 63%. 100 patients with stomach cancer were randomly selected and it is found that the survival rate is 57% after 5 years. Assume that many random samples of size 100 patients are drawn and the sampling distribution of the sample proportion is obtained. Which of the following is TRUE?
a) It is exactly normal with mean 0.57 and standard deviation 0.0495.
b) It is a binomial distribution with n = 100 and p = 0.63.
c) It is approximately normal with mean 0.63 and standard deviation 0.0483.
d) It is exactly normal with mean 0.63 and standard deviation 0.0483.
e) It is approximately normal with mean 0.57 and standard deviation 0.0495.

[Use the following information to answer the next 5 questions.]
Peggy is interested in the mean height of young women aged 18 to 24 in the US. Assume the population standard deviation is known to be 2.5 inches. She takes a sample of 72 young women aged 18 to 24 in the US and calculates a sample mean of 65 inches.

32. What is the parameter of interest?
a) The mean height of the 72 young women aged 18 to 24 in the US
b) The mean height of young women aged 18 to 24 in the US
c) The mean height of young women aged 18 to 24 in the world
d) The mean height of the 72 young women aged 18 to 24 in the world

33. Assume Peggy wants to create a 95% confidence interval about the true parameter. Which interpretation of a 95% confidence interval is correct?
a) If we took repeated samples, the sample mean would equal the population mean in approximately 95% of the samples.
b) If we took repeated samples, approximately 95% of the confidence intervals calculated from those samples would contain the sample mean of 65.
c) If we took repeated samples, approximately 95% of the confidence intervals calculated from those samples would contain the true parameter.
d) There is a 95% probability that the true parameter is included within the interval.
e) There is a 95% probability that the sample mean of 65 is included within the interval.

34. Assume Peggy wants to create a 95% confidence interval about the true parameter. What is the margin of error or error bound?
a) 0.29
b) 0.34
c) 0.58
d) 0.68
e) Not enough information provided to answer this question

35. What will happen to the margin of error if the confidence level is now 90% instead of 95%?
a) The margin of error will decrease
b) The margin of error will increase
c) The margin of error will be the same
d) Not enough information provided to answer this question

36. What will happen to the margin of error if the sample size is now 36 instead of 72?
a) The margin of error will decrease
b) The margin of error will increase
c) The margin of error will be the same
d) Not enough information provided to answer this question

[Use the following information to answer the next 4 questions.]
Suppose a random sample of size n = 80 is drawn from a normal population with an unknown mean µ and a known standard deviation 2. Let (2.19, 3.67) be a 95% confidence interval for µ based on the observed sample, and suppose the sample standard deviation is 2.67.

37. Find the sample mean.
a) 3.67
b) 2.19
c) 2.67
d) 2.93
e) 1.48

38. Find the corresponding margin of error.
a) 0.05
b) 0.72
c) 0.74
d) 0.76
e) 1.48

39. Now suppose we wish to construct a 98% confidence interval based on a new sample while keeping the margin of error unaltered. Which of the following statements do you think would be correct?
a) We need to decrease the sample size n.
b) We need to increase the sample size n.
c) Sample size n does not have any effect on the margin of error.
d) Sample size n should remain the same to obtain the same margin of error.

40. Further, suppose we wish to construct a new confidence interval based on a new sample of size n = 30 while keeping the margin of error unaltered. Which of the following statements do you think would be correct?
a) We need to decrease the level of confidence.
b) We need to increase the level of confidence.
c) Level of confidence should remain the same to obtain the same margin of error.
d) Level of confidence does not have any effect on the margin of error.

41. Peggy is interested in the mean height of young women aged 18 to 24 in the US. Assume the population standard deviation is known to be 2.5 inches.
What would be the minimum sample size required so that the width of a 95% confidence interval does not exceed 1 inch?
a) 85
b) 89
c) 93
d) 97
e) 100

42. In 2013, Gallup took a nationally representative sample of 2027 adults and asked them about their soda consumption. The survey shows that 24% mostly drink diet soda. Based on the results, a 95% confidence interval for the proportion of American adults who mostly drink diet soda is:
a) 0.24 ± 0.018
b) 0.24 ± 0.009
c) 0.24 ± 1.96
d) 0.24 ± 0.152
e) 0.24 ± 0.95

43. Earlier this month, 50% of Californians voted ‘yes’ on Measure H: a sales tax measure to fund homeless services and prevention. They took a sample of 50 Californians. Find the 80th percentile for the above distribution of sample proportions.
a) 0.56
b) 0.64
c) 0.66
d) 0.50
e) 0.059

[Use the following information to answer the next 3 questions.]
A New Research Center poll included 1500 randomly selected adults who were asked whether “global warming is a problem that requires immediate government action”. Results showed that 850 of those surveyed indicated that immediate government action is required. Let p = the population proportion of adults who believe that immediate government action is required.

44. A 95% confidence interval for p is given by
a) 0.57 ± 0.025
b) 0.57 ± 0.033
c) 0.57 ± 0.021
d) 0.57 ± 0.020
e) 0.57 ± 0.035

45. Another researcher, Samantha, requests to see an 85% confidence interval based on the same data. Pick the correct option.
a) The 95% interval will be wider than the 85% interval
b) The 95% interval will be approximately the same as the 85% interval
c) The 95% interval will be narrower than the 85% interval
d) Cannot compare the 95% and 85% intervals without looking at the data
e) More information is required

46. Peter decides to estimate the above parameter by making a 90% confidence interval. What could he do to reduce the margin of error?
a) Increase the number of students in his sample
b) Decrease the number of students in his sample
c) Use a different sampling scheme
d) He must consider a different sample of 1500 adults
e) None of the above

[Use the following information to answer the next 3 questions.]
The university is interested in knowing whether the students support sports passes being included in their tuition fees. 250 students are sampled to estimate the proportion of students who support sports passes being included in tuition. Of them, 133 support it and 117 oppose.

47. Find a 99% confidence interval for the proportion.
a) (0.469, 0.595)
b) (0.451, 0.613)
c) (0.421, 0.643)
d) (0.350, 0.586)

48. The university president wants to know if more than half of the students support sports passes being included in the tuition fee. The sample proportion was 0.52. What would the appropriate null and alternative hypotheses be in this case?
a) H0 : p = 0.5 vs. Ha : p > 0.5
b) H0 : p = 0.5 vs. Ha : p ≠ 0.5
c) H0 : p = 0.52 vs. Ha : p > 0.52
d) H0 : p̂ = 0.52 vs. Ha : p̂ > 0.52

49. The above hypothesis test is conducted. The p-value is 0.04 and the sample proportion was 0.52. What is the correct interpretation of this p-value?
a) There is a 0.04 probability that the population proportion is 0.52.
b) The true proportion must be bigger than 0.5.
c) If hypothesis tests are conducted based on repeated samples, approximately 4% of these samples would have sample means away from the hypothesized value of 0.5.
d) There is a 0.04 probability that the null hypothesis is correct.

50. Assume that a full survey is conducted and the true population proportion is 0.53. If a sample of 200 individuals is taken, what is the expected value of the number of people who support the measure and the standard deviation of that estimate?
a) expected value: 106, standard deviation: 7.058
b) expected value: 106, standard deviation: 49.82
c) expected value: 53, standard deviation: 49.82
d) expected value: 53, standard deviation: 7.058

51. A researcher conducted an experiment on 8 randomly selected NASCAR drivers in which their reaction time was measured. The sample mean reaction time was 1.24 secs. The sample standard deviation of reaction time was 0.12 secs. Assume that reaction time follows a normal distribution. The 98% confidence interval for the population mean reaction time based on these data is given by:
a) 1.24 ± 0.083
b) 1.24 ± 0.099
c) 1.24 ± 0.118
d) 1.24 ± 0.127
e) 1.24 ± 0.136

52. Suppose the mean speed of internet in your apartment is usually 35 MBps. The internet provider, Sudden-Link, charged you more for the last month. They claimed the mean speed of your internet connection to be more than 35 MBps. You are skeptical. So you tracked your daily internet speed for the last 30 days. Your speed data yields the sample mean 36.2 and the sample standard deviation 4.32. Suppose the Sudden-Link manager asked you to provide a 95% confidence interval for the true mean speed of the Internet based on your sample. What should be your answer?
a) 36.2 ± t0.05;29 × 4.32/√30
b) 36.2 ± t0.025;29 × 4.32/√30
c) 36.2 ± z0.05 × 4.32/√30
d) 36.2 ± z0.025 × 4.32/√30
e) 35 ± t0.025;29 × 4.32/√30

53. The Dallas Cowboys may be first in the NFC East, but they are in a 6-6 position this season. Suppose a survey was conducted on the proportion of Cowboys fans who believe Jason Garrett should be fired and the 95% confidence interval is (0.73, 0.96). How would you interpret this in context?
a) We are 95% confident that the true mean number of people who believe Jason Garrett should be fired is between 0.73 and 0.96.
b) We are 90% confident that the true mean number of Cowboys fans who believe Jason Garrett should not be fired is between 0.73 and 0.96.
c) We are 95% confident that the true proportion of Cowboys fans who believe Jason Garrett should be fired is between 0.73 and 0.96.
d) We are 95% confident that the true proportion of people who believe Jason Garrett should be fired is between 0.73 and 0.96.

54. According to the American Automobile Association (AAA), erroneous driving is the cause of approximately 54% of all fatal automobile accidents in the US. Thirty randomly selected fatal accidents are examined, and it is found that 14 of them were due to driving error. Suppose the typical mean height of the population is 64 inches, and Laura suspects that the mean height might be greater than 64 inches. What should be her null and alternative hypotheses?
a) H0 : µ = 64 vs Ha : µ > 64
b) H0 : µ ≥ 64 vs Ha : µ < 64
c) H0 : X = 64 vs Ha : X > 64
d) H0 : X > 64 vs Ha : X ≤ 64

55. When a new drug is created, the pharmaceutical company must subject it to testing before receiving the necessary permission from the Food and Drug Administration (FDA) to market the drug. Suppose the null hypothesis is “the drug is unsafe.” What is the Type II Error?
a) To conclude the drug is safe when, in fact, it is unsafe.
b) To conclude the drug is unsafe when, in fact, it is safe.
c) To conclude the drug is safe when, in fact, it is safe.
d) To conclude the drug is unsafe when, in fact, it is unsafe.

56. In testing H0 : µ = 5 vs. Ha : µ > 5, a sample of size n = 50 yielded a p-value of 0.014. If α = 0.01, and the true value of the mean was actually µ = 7, then the decision based on the data:
a) was a Type I error.
b) was a Type II error.
c) was correct.
d) was powerful.
e) cannot be determined.

57. In testing H0 : p = 0.5 vs. Ha : p < 0.5, a sample of size n = 100 yielded a p-value of 0.027. If α = 0.05, and the true value of the population proportion p was actually p = 0.38, then the decision based on the data:
a) was a Type I error.
b) was a Type II error.
c) was correct.
d) was powerful.
e) cannot be determined.

58. A group of doctors is deciding whether or not to perform an operation to remove a cancerous tumor. Suppose the null hypothesis is: H0 : the surgical procedure will successfully remove the tumor. State the Type I and Type II errors in complete sentences.
a) T1: In reality, the surgery went well but the doctors want to operate again. They think they did not remove the tumor.
   T2: The doctors think they got all of the tumors when in reality they failed to get it all and you still have cancer.
b) T1: The doctors think they got all of the tumors when in reality they failed to get it all and you still have cancer.
   T2: In reality, the surgery went well but the doctors want to operate again. They think they did not remove the tumor.
c) T1: In reality, the surgery went well and the doctors think you are better.
   T2: The doctors think they got all of the tumors when in reality they failed to get it all and you still have cancer.
d) T1: In reality, the surgery went well but the doctors want to operate again. They think they did not remove the tumor.
   T2: The doctors think they failed to get all of the tumors when in reality they failed to get it all and you still have cancer.
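
Many of the computational problems in this set (for example, Problems 1, 2, 7, and 34) reduce to a few normal-distribution calculations: a tail probability, a percentile, a z-based margin of error, and a minimum sample size. The following is a minimal sketch for checking such answers numerically. It assumes Python 3.8 or later and uses only the standard-library statistics.NormalDist class. The helper names (prob_above, percentile, z_critical, margin_of_error, min_sample_size) are hypothetical and are not part of the course material. Results may differ slightly from table-based answers, because printed z-tables round z to two decimal places.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_above(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd)."""
    return 1.0 - NormalDist(mean, sd).cdf(x)


def percentile(p, mean, sd):
    """Value below which a fraction p of Normal(mean, sd) falls."""
    return NormalDist(mean, sd).inv_cdf(p)


def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    return STD_NORMAL.inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def margin_of_error(sigma, n, confidence):
    """z-based margin of error z* * sigma / sqrt(n), for known sigma."""
    return z_critical(confidence) * sigma / sqrt(n)


def min_sample_size(sigma, width, confidence):
    """Smallest n whose z-interval has total width at most `width`."""
    half_width = width / 2.0  # the margin of error is half the width
    return ceil((z_critical(confidence) * sigma / half_width) ** 2)


if __name__ == "__main__":
    # Inputs taken from Problems 1, 2, 7, and 34 of this problem set.
    print(prob_above(500, 527, 112))       # Problem 1: P(GMAT > 500)
    print(percentile(0.95, 527, 112))      # Problem 2: top-5% cutoff
    print(min_sample_size(250, 100, 0.95)) # Problem 7: width 100, sigma 250
    print(margin_of_error(2.5, 72, 0.95))  # Problem 34: sigma 2.5, n = 72
```

The sample-size helper rounds up because the width requirement is an upper bound: rounding down would give an interval slightly wider than requested.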