Mock Exam Midterm Statistics II, LBS summer term 2022
Dr Christian Reiner

1) If the significance level α is 0.02, which p-value for a test statistic will result in a test conclusion to reject H0?
A. 0.05
B. 0.01
C. 0.98
D. 0.97
E. 0.03

2) You’re conducting an experiment with 55 customers and you want to test the following hypotheses: H0: μ = 4 vs Ha: μ ≠ 4. The standard error is 0.5, and the alpha level is 0.05. The population of values is normally distributed. Suppose that your test statistic is 1.42. What is the p-value interval for this result?
A. 0.005 < p < 0.01
B. 0.5 < p < 1.0
C. 0.2 < p < 0.4
D. 0.15 < p < 0.3
E. 0.1 < p < 0.2

3) Based on the following information, what is your conclusion? H0: μ = 348 vs Ha: μ > 348, α = 0.05 and p-value = 0.07
A. You should reject α.
B. You should fail to reject H0.
C. You should fail to reject α.
D. You should reject Ha.
E. You should accept H0.

4) If the alpha level is 0.05, what is the probability of a Type II error?
A. 0.05
B. impossible to tell without further information
C. 0.99
D. 0.95
E. 0.01

5) It’s believed that the average amount of sleep a person in the United States gets per night is 6.3 hours. A mom believes that mothers get far less sleep than that. She contacts a random sample of 20 other moms on a mom social networking site and finds that they get an average of 5.2 hours of sleep per night, with a standard deviation of 1.8 hours. Using this data, what is the value of the test statistic t from a t-test for a single population mean?
A. -2.73
B. -2.66
C. 2.24
D. -12.11
E. 12.22

6) In a company, employees type an average of 20 words per minute. Typing rates are normally distributed with a standard deviation of 3. The manager of a large branch of the company believes that his employees do better than that. He randomly samples 30 employees from his branch and finds an average typing rate of 20.5 words per minute. If the manager wants a significance level of 0.05, what can he conclude?
A. Accept the null hypothesis that the average words per minute is equal to 20.
B. Reject the null hypothesis and conclude that the average words per minute is greater than 20.
C. Fail to reject the null hypothesis that the average words per minute is equal to 20.
D. Reject the null hypothesis that the average words per minute is equal to 20 and conclude that it’s not equal to 20.
E. None of the above.

7) A magazine reports that the average number of minutes that U.S. teenagers spend texting each day is 120. You believe it’s less than that. What are your null and alternative hypotheses?
A. H0: μ = 120 vs Ha: μ ≠ 120
B. H0: μ = 120 vs Ha: μ < 120
C. H0: x̄ = 120 vs Ha: x̄ < 120
D. H0: x̄ = 120 vs Ha: x̄ ≠ 120
E. none of the above

8) A precocious child wants to know which of two brands of batteries tends to last longer. She finds seven toys, each of which requires one battery. For each toy, she then randomly chooses one of the brands, puts a fresh battery of that brand in the toy, turns the toy on, and records the time before the battery dies. She repeats the experiment with a battery of the brand not used in the first trial for each toy. Her data is listed in the following table (in terms of hours of battery life before failure):
Assume that the difference scores are normally distributed. The sample standard deviation of difference scores (sd) is 0.9214. At an alpha level of 0.01, given a not equal to alternative hypothesis, what is (are) the critical value(s) of t?
A. -2.364; 2.364
B. -5.407; -5.407
C. 2.446
D. 3.499
E. -3.707; 3.707

9) A psychologist read a claim that average intelligence differs between smokers and nonsmokers and decided to investigate. She sampled 30 smokers and 30 nonsmokers and gave them an IQ test. She used an alpha level of 0.10. The mean for the smokers is 51.9, and the mean for the nonsmokers is 52.6. The population variance for each group is 5. Suppose that you were instead using an alpha level of 0.05.
What would your decision be regarding this data?
A. The null hypothesis can’t be rejected. This study doesn’t support the idea that smokers and nonsmokers differ in IQ.
B. The null hypothesis can be rejected. Nonsmokers appear to be smarter than smokers.
C. The null hypothesis can’t be rejected, but the researcher can still say with confidence that smokers are smarter than nonsmokers.
D. The null hypothesis can be rejected. Smokers do appear to be smarter than nonsmokers.
E. Accept the alternative hypothesis. Nonsmokers appear to be smarter than smokers.

10) You’re interested in the willingness of adult drivers (age 18 and over) in a metropolitan area to pay a toll to travel on less-congested roads. You draw a sample of 100 adult drivers and administer a survey on this topic to them. One of your questions asks whether the respondent voted in the last election. You find a much higher proportion of individuals claiming to have voted than is indicated by public records. What is this an example of?
A. negativity bias
B. selection bias
C. non-response bias
D. response bias
E. undercoverage bias

11) You want to survey students at a high school and calculate the mean age. Which of the following procedures will result in a simple random sample?
A. selecting one student at random, asking him or her to suggest three friends to participate and continuing in this fashion until you have your sample size
B. using an alphabetized student roster and selecting every 15th name, starting with the first one
C. numbering the students by using the school’s official roster and selecting the sample by using a random number generator
D. classifying the students as male or female and drawing a random sample from each
E. selecting three tables at random from the cafeteria during lunch hour and asking the students at those tables for their age

12) Which of the following is an example of a good census of 2,000 students in a high school?
A. calculating the mean age of all the students by using their official records
B. asking the first 25 students who arrive at school on a given day their age and calculating the mean from this information
C. sending an e-mail to all students asking them to respond with their age and calculating the mean from those who respond
D. Choices A and C
E. Choices A, B, and C

13) You read a report that 60% of high-school graduates participated in sports during their high-school years. You believe that the percentage of high-school graduates who played sports is higher than what was reported. What type of statistical technique do you use to see whether you’re right?
A. a confidence interval
B. a z-score
C. a statistic
D. a parameter
E. a hypothesis test

14) You read a report that 60% of high-school graduates participated in sports during their high-school years. You believe that the percentage of high-school graduates who played sports in high school is higher than what’s in the report. If you do a hypothesis test to challenge the report, which of these p-values would you be happiest to get?
A. p = 1
B. p = 0.05
C. p = 0.50
D. p = 0.95
E. p = 0.001

15) The Central Limit Theorem states that, if a random sample of size n is drawn from a population, then the sampling distribution of the sample mean:
A. is approximately normal if n < 30.
B. is approximately normal if the underlying population is normal.
C. is approximately normal if n ≥ 30.
D. follows a hypergeometric distribution.
E. None of these choices.

16) You’re interested in the willingness of adult drivers (age 18 and over) in a metropolitan area to pay a toll to travel on less-congested roads. You draw a sample of 100 adult drivers and administer a survey on this topic to them. Suppose that you collect your data in a way that makes it likely that the survey respondents aren’t representative of the target population. What is this called?
A. bias
B. differencing
C. random error
D. sample adjustment
E. transformation

17) You want to survey students at a high school and calculate the mean age. Which of the following procedures will result in a simple random sample?
A. selecting one student at random, asking him or her to suggest three friends to participate and continuing in this fashion until you have your sample size
B. using an alphabetized student roster and selecting every 15th name, starting with the first one
C. numbering the students by using the school’s official roster and selecting the sample by using a random number generator
D. classifying the students as male or female and drawing a random sample from each
E. selecting three tables at random from the cafeteria during lunch hour and asking the students at those tables for their age

18) The p-value in hypothesis testing represents which of the following? (Please select the best answer of those provided below.)
A. The probability of failing to reject the null hypothesis, given the observed results
B. The probability that the null hypothesis is true, given the observed results
C. The probability that the observed results are statistically significant, given that the null hypothesis is true
D. The probability of observing results as extreme or more extreme than currently observed, given that the null hypothesis is true
E. The probability of committing a type II error.

19) A sociologist focusing on popular culture and media believes that the average number of hours per week (hrs/week) spent using social media is greater for women than for men. Examining two independent simple random samples of 100 individuals each, the researcher calculates sample standard deviations of 2.3 hrs/week and 2.5 hrs/week for women and men respectively. If the average number of hrs/week spent using social media for the sample of women is 1 hour greater than that for the sample of men, what conclusion can be made from a hypothesis test where H0: μ_W − μ_M = 0 and Ha: μ_W − μ_M > 0? Use a significance level of 5%.
A. The observed difference in average number of hrs/week spent using social media is not significant
B. The observed difference in average number of hrs/week spent using social media is significant
C. A conclusion is not possible without knowing the average number of hrs/week spent using social media in each sample
D. A conclusion is not possible without knowing the population sizes
E. None of the answers is correct.

The table below shows data from a random sample to investigate whether there is a link between epilepsy and depression.

20) A researcher believes that the proportion of individuals with diagnosed epilepsy that present with a depressive disorder, pE, is higher than the proportion of individuals without diagnosed epilepsy that present with a depressive disorder, pNE. Using the data from the table above and a 0.10 significance level, which of the following is the most appropriate conclusion given the results?
A. Reject the null hypothesis; there is sufficient evidence to support the researcher’s claim.
B. Fail to reject the null hypothesis; there is sufficient evidence to support the researcher’s claim.
C. Accept the null hypothesis; there is not sufficient evidence to support the researcher’s claim.
D. Accept the null hypothesis; there is sufficient evidence to support the researcher’s claim.
E. All answers are incorrect.

21) What does it mean if a test statistic has a p-value of 0.01?
A. There is a 99% chance of getting a value at least that extreme, if the null hypothesis is false.
B. There is a 1% chance of getting a value at least that extreme, if the null hypothesis is false.
C. There is a 1% chance of getting that value, if the null hypothesis is false.
D. There is a 1% chance of getting that value, if the null hypothesis is true.
E. There is a 1% chance of getting a value at least that extreme, if the null hypothesis is true.

22) A test was done with a significance level of 0.05, and the p-value was 0.001.
Select the best description of this result.
A. not statistically significant
B. impossible to say without further information
C. highly statistically significant
D. marginally statistically significant
E. statistically significant

23) One problem with hypothesis testing is that a real effect may not be detected. This problem is most likely to occur when
A. the effect is small and the sample size is small.
B. the effect is large and the sample size is small.
C. the effect is small and the sample size is large.
D. the effect is large and the sample size is large.
E. The sample is a stratified random sample.

24) A recruiting firm reported that 78% of U.S. companies use social networks such as Facebook and LinkedIn to recruit job candidates. An economist thinks that the percentage is higher at technology companies. She samples 70 technology companies and finds that 55 of them use social networks. The p-value of the hypothesis test to test her claim at 0.05 level of significance is (round only at the last stage of your calculations) approximately
A. 0.04
B. 0.45
C. 0.78
D. 0.12
E. 0.98

25) Which type of bias occurs because we do not obtain complete information about a population?
A. Nonresponse bias
B. No bias
C. Response bias
D. Sampling bias
E. None of the above

26) What would be a type I error?
A. The researcher concludes she has sufficient evidence that her new medication helps more than insulin injection, and her medication really is better than insulin injection.
B. The researcher concludes she has sufficient evidence that her new medication helps more than insulin injection, when in reality her medication is not better than insulin injection.
C. The researcher concludes she does not have sufficient evidence that her new medication helps more than insulin injection, and her medication really is not better than insulin injection.
D. The researcher concludes she does not have sufficient evidence that her new medication helps more than insulin injection, when in reality her medication is better than insulin injection.
E. The researcher concludes she has sufficient evidence that her new medication controls blood sugar level the same as insulin injection, and in reality there is a difference.

27) What would be a type II error?
A. The researcher concludes she has sufficient evidence that her new medication helps more than insulin injection, and her medication really is better than insulin injection.
B. The researcher concludes she has sufficient evidence that her new medication helps more than insulin injection, when in reality her medication is not better than insulin injection.
C. The researcher concludes she does not have sufficient evidence that her new medication helps more than insulin injection, and her medication really is not better than insulin injection.
D. The researcher concludes she does not have sufficient evidence that her new medication helps more than insulin injection, when in reality her medication is better than insulin injection.
E. The researcher concludes she has sufficient evidence that her new medication controls blood sugar level the same as insulin injection, and in reality there is a difference.

28) A financial analyst determines the yearly research and development investments for 50 blue chip companies. She notes that the distribution is distinctly not bell-shaped. If the 50 dollar amounts are converted to z-scores, what can be said about the standard deviation of the 50 z-scores?
A. It depends on the distribution of the raw scores.
B. It is less than the standard deviation of the raw scores.
C. It is greater than the standard deviation of the raw scores.
D. It is equal to the standard deviation of the raw scores.
E. It equals 1.

29) A company has 466 loyal customers and about 1000 customers in total.
Your boss asks you to calculate a hypothesis test of whether the proportion of loyal customers is equal to 50%. You tell him that this is inappropriate. Why?
A. The sample size is below 30.
B. The population distribution is not normal.
C. The population proportion is known, so a hypothesis test is not necessary.
D. The sample is biased.
E. The sample size is larger than 10% of the population.

30) A scientist is testing whether a new fertilizer increases the height of a particular plant. As part of the experiment, 15 of the plants are randomly assigned the new fertilizer, and 15 are randomly assigned an old formula. What type of test should be used to determine whether the average height of the plants is higher using the new fertilizer?
A. A matched-pairs t-test
B. A one-sample proportion z-test
C. A two-sample proportion z-test
D. A two-sample t-test
E. A chi-squared test of association

Solutions

Question  Answer
1         B
2         E
3         B
4         B
5         A
6         C
7         B
8         E
9         A
10        D
11        C
12        A
13        E
14        E
15        C
16        A
17        C
18        D
19        B
20        A
21        E
22        C
23        A
24        B
25        D
26        B
27        D
28        E
29        C
30        D

Statistical Formulas

Testing one population mean

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad df = n - 1 \]

Testing one population proportion

\[ z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}, \qquad n p_0 \ge 5 \ \text{and} \ n (1 - p_0) \ge 5 \]

Comparing two population means (independent samples)

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad df = \min\{n_1 - 1,\ n_2 - 1\} \]

The paired t-test (dependent samples)

\[ t = \frac{\bar{d} - 0}{s_d / \sqrt{n_d}}, \qquad df = n_d - 1 \]

Comparing two population proportions (independent samples)

\[ z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p} (1 - \hat{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \]

\[ n_1 \hat{p} \ge 5, \quad n_1 (1 - \hat{p}) \ge 5, \quad n_2 \hat{p} \ge 5, \quad n_2 (1 - \hat{p}) \ge 5 \]

Confidence intervals and critical values for the z-distribution
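
For self-checking, the z critical values and some of the test statistics from the formula sheet can be computed directly. The following is a minimal sketch, not part of the exam, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`. All function names (`z_critical`, `one_mean_t`, `one_proportion_z`, `two_means_t`, `upper_tail_p`) are illustrative choices. The inputs in the usage section are taken from questions 5, 19 and 24.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STD_NORMAL = NormalDist()


def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


def upper_tail_p(z):
    """Upper-tail p-value P(Z >= z) for a 'greater than' alternative."""
    return 1.0 - STD_NORMAL.cdf(z)


def one_mean_t(xbar, mu0, s, n):
    """t statistic and df for testing one population mean."""
    t = (xbar - mu0) / (s / sqrt(n))
    return t, n - 1


def one_proportion_z(x, n, p0):
    """z statistic for testing one population proportion."""
    # Conditions from the formula sheet: n*p0 >= 5 and n*(1 - p0) >= 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation conditions are not met")
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def two_means_t(xbar1, xbar2, s1, s2, n1, n2):
    """t statistic and conservative df for two independent means."""
    t = ((xbar1 - xbar2) - 0) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return t, min(n1 - 1, n2 - 1)


if __name__ == "__main__":
    # Critical values for common confidence levels.
    for level in (0.90, 0.95, 0.99):
        print(f"z* for {level:.0%} confidence: {z_critical(level):.3f}")

    # Question 5: sample mean 5.2, hypothesized mean 6.3, s = 1.8, n = 20.
    t5, df5 = one_mean_t(5.2, 6.3, 1.8, 20)
    print(f"Q5: t = {t5:.2f}, df = {df5}")

    # Question 24: 55 of 70 companies, hypothesized proportion 0.78.
    z24 = one_proportion_z(55, 70, 0.78)
    print(f"Q24: z = {z24:.3f}, p-value = {upper_tail_p(z24):.2f}")

    # Question 19: only the difference of sample means (1 hour) is given,
    # so it is passed as xbar1 = 1.0 and xbar2 = 0.0.
    t19, df19 = two_means_t(1.0, 0.0, 2.3, 2.5, 100, 100)
    print(f"Q19: t = {t19:.2f}, df = {df19}")
```

The exam itself expects these values to be read from printed tables, so small rounding differences against a table lookup are to be expected.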