Name:___________________________ Date:____________________

Labouré College
Essentials of Statistics MAT3401
ANSWERS
Final Prep

Questions from Chapter 1

You are conducting a study of nurses working at Hospital Q. You plan to take a sample and give them a survey. Among the questions on your survey instrument are:

A. How many hours are you scheduled to work each week? Answer to the nearest hour._____________
B. How satisfied are you with the number of hours of work you have scheduled per week? Respond using the following scale: 1 = not at all, 2 = somewhat, 3 = very

1. How would you classify variable A above? Circle one answer (1 point).
a. It is a qualitative, nominal variable. (not qualitative, which is category)
b. It is a qualitative, ordinal variable. (not qualitative, which is category)
c. It is a quantitative, ratio variable.
d. It is a quantitative, nominal variable. (not nominal, which is category)

2. How would you classify variable B above? Circle one answer (1 point).
a. It is a qualitative, nominal variable. (not equal categories)
b. It is a qualitative, ordinal variable. (has an “ord”er of increasing satisfaction)
c. It is a quantitative, ratio variable. (not quantitative; categories instead)
d. It is a quantitative, nominal variable. (not quantitative; categories instead)

3. Is the proportion of responses “3 = very” to question B a statistic or a parameter? Circle one answer (1 point).
a. Statistic (from a sample; see question)
b. Parameter (from a population)

Copyright © Monika Maya Wahi

4. Suppose you take random samples (not the whole population) from the following nurse occupational classifications at Hospital Q: Nurse I, Nurse II, Nurse III, Supervising Nurse I, and Supervising Nurse II. Most nurses (80%) fall in the Nurse I and Nurse II categories, but you want to get a good representation of higher level and supervising nurses, too.
Therefore, you choose to send 20% of your surveys to nurses in each category so each one is equally represented (since there are 5 categories, the five equal 20% shares add up to 100%). What kind of sampling technique are you using? Circle one answer (1 point).
a. Simple random (not equal chance of being sampled)
b. Stratified (divided into categories or “strata” then sampled)
c. Systematic (taking every “nth” in the sample)
d. Cluster (geographic)
e. Multistage (several stages of sample; large studies)
f. Convenience (already-gathered group: class, participants at an event, etc.)

5. Of the population of 150 nurses working at Hospital Q, you only sampled 30 of them (6 from each category). From your sample of 30, you calculate a mean age of nurses of 47.8. Is 47.8 a parameter or a statistic? Circle one answer (1 point).
a. 47.8 refers to a parameter. (this was not from a population)
b. 47.8 refers to a statistic. (from a sample)

6. The Human Resources Department at Hospital Q tells you that, over 2012, among the entire population of 150 nurses, 15 (10%) were promoted to a higher job classification. Is 10% a parameter or a statistic? Circle one answer (1 point).
a. 10% refers to a parameter. (from the population; see question)
b. 10% refers to a statistic. (this was not from a sample)

7. You want to test your job satisfaction survey. Labouré College gives you permission to distribute your survey to a microbiology class that includes 25 nursing students. You attend the class once and administer the survey. What type of sampling are you using? Circle one answer (1 point).
a. Cluster (geographic)
b. Simple random sampling (randomize list)
c. Stratified (divide into “strata” first then sample)
d. Convenience (already-assembled group)

8. Imagine you e-mail your survey to your sample of nurses at their work e-mail addresses.
You send it out on December 24 (the day before Christmas) and say they must fill it out within 7 days, by December 31 (New Year’s Eve). You know from experience the organization only has a few nurses on staff during that week, as most nurses take time off. What kind of mistake might you be making? Circle one answer (1 point).
a. Undercoverage
b. Faulty recall
c. Hidden bias
d. Interviewer influence

9. An OB/GYN clinic wants to study the health of babies born to mothers who are HIV positive. The mother is enrolled in the study when it is found she is HIV positive and pregnant, and then is contacted to complete several surveys during the prenatal period, when the baby is born, and one year after the baby is born. What type of study design is this? Circle one answer (1 point).
a. Observational study (no intervention)
b. Experiment (intervention needs to be given)

10. An investigator wants to study if a new drug approved for Alzheimer’s disease can actually improve memory in young people. The investigator enrolls 50 college students, and randomizes 25 to receive placebo, and 25 to receive the Alzheimer’s drug. Then, she tests the students to see if those in the drug group are able to memorize items more quickly than those in the placebo group. What type of study design is this? Circle one answer (1 point).
a. Observational study (not the type with an intervention)
b. Experiment (intervention is drug/placebo)

Questions from Chapter 2

1. The following data represent the observed number of native plant species from random samples of study plots on different islands in the Galapagos Island chain. Reference: Science, Vol. 179, p 893-895.

23 26 33 73 21
35 30 16 3 17
9 8 9 19 65
12 11 89 81 7
23 95 4 37 28

Select the correct stem-leaf plot that describes the plant species data by circling the letter corresponding to the correct stem-leaf plot. (2 points)
a.
[Stem-leaf plot not recoverable from the source.]
b. [Stem-leaf plot not recoverable from the source.] B is the correct answer.
c. [Stem-leaf plot not recoverable from the source.]
d. [Stem-leaf plot not recoverable from the source.]

2. Identify the shape of the distribution for the plant species data (1 point).
Skewed right

3. In a survey you conducted, you asked 100 respondents to rate how much they liked spiders on a 1 to 5 scale, where 1 means “not at all” and 5 means “a lot”. You found that respondents were pretty polarized on the issue: they either really did not like spiders, or they loved them. Which histogram below represents this type of finding? Circle the letter corresponding to the correct histogram. (2 points)
[Histograms a through d, each with ratings 1 to 5 on the horizontal axis, are not recoverable from the source.]
B is the correct answer.

Questions from Chapter 3

The following is a box and whisker plot of science test scores in a class.

1. Is the minimum score less than 50? Circle the correct answer below. (1 point)
a. The minimum score is less than 50.
b. The minimum score is equal to 50.
c. The minimum score is more than 50.

2. Is the maximum score more than 90? Circle the correct answer below. (1 point)
a. The maximum score is more than 90.
b. The maximum score is equal to 90.
c. The maximum score is less than 90.

3. What is the value of Q1? Circle the correct answer below. (1 point)
a. 54
b. 73
c. 81
d. 90
e. 95

4. What is the value of Q3? Circle the correct answer below. (1 point)
a. 54
b. 73
c. 81
d. 90
e. 95

Suppose you wanted to compare 3 lathes that make motor shafts to a design specification. The shafts are supposed to be 18.85 mm when they are done.
You create a sample of motor shafts for each lathe (Lathe 1, Lathe 2, and Lathe 3), measure the shafts, and plot them below to compare. (From ASQ.org.)

1. Which lathe had the highest median shaft size? Circle the letter for the correct answer. (1 point)
a. Lathe 1
b. Lathe 2
c. Lathe 3

2. Which lathe had the largest range of shaft sizes? Circle the letter for the correct answer. (1 point)
a. Lathe 1
b. Lathe 2
c. Lathe 3

3. Which lathe had the largest interquartile range of shaft sizes? Circle the letter for the correct answer. (1 point)
a. Lathe 1
b. Lathe 2
c. Lathe 3

The Maui News gave the following costs in dollars per day for a random sample of 20 condominiums located throughout the island of Maui: For Questions 8-10, round to 1 digit after the decimal.

89 50 68 60 375
55 500 71 40 350
60 50 250 45 45
125 235 65 60 130

4. What is the mean of the above data? Circle the letter for the correct answer. (1 point)
a. $50.85
b. $97.35
c. $136.15 (2,723/20 = 136.15)
d. $182.45

5. What is the median of the above data? Circle the letter for the correct answer. (1 point)
a. $66.50 (In sorted order, the 10th and 11th values are 65 and 68. (65 + 68)/2 = 133/2 = 66.5)
b. $82.70
c. $145.55
d. $220.25

6. What is the mode of the above data? Circle the letter for the correct answer. (1 point)
a. $40
b. $45
c. $50
d. $60 (occurs 3 times)

7. Imagine you learned that the standard deviation of the above data was $133.56. You also learned that the standard deviation for a random sample of 20 costs in dollars per day of condominiums to rent in Boston was $182.45. Which city has more variation in its costs? Circle the letter for the correct answer. (1 point)
a. Maui has more variation than Boston.
b. Boston has more variation than Maui. ($182.45 is higher than $133.56)
c. Neither, they have equal variation.

Questions from Chapter 4

See the scattergram below and answer several questions.
[Scattergram: x-axis from 0 to 60, y-axis from 0 to 60. The plotted points are not recoverable from the source.]

1. What kind of association is shown between x and y in the scattergram? Circle the letter for the correct answer. (1 point)
a. Positive correlation (trend going upward)
b. Negative correlation (trend going downward)
c. No correlation (no trend)

2. How strong is the association shown between x and y in the scattergram? Circle the letter for the correct answer. (1 point)
a. Strong
b. Moderate
c. Weak
d. None

3. What is the value of the correlation coefficient between x and y? You should be able to pick out the right answer from the list from looking at the scattergram. Circle the letter for the correct answer. (1 point)
a. -0.02 (negative is correct direction, but weak correlation so not correct)
b. 0.75 (not correct; correlation not positive)
c. -0.89 (strong, negative correlation)
d. 0.84 (not correct; correlation not positive)

4. What is the equation for the least squares line? You should be able to pick out the right answer from the list from looking at the scattergram. Circle the letter for the correct answer. (1 point)
a. y-hat = -0.8x + 33.3 (correct; sign of slope (b) same as sign of correlation (r))
b. x-hat = -0.8y + 33.3 (never predicting x-hat. y-hat is dependent variable)
c. y-hat = 0.8x + 33.3 (not correct; sign of slope (b) is positive, but it must be negative if the correlation r is negative)
d. x-hat = 0.8y + 33.3 (never predicting x-hat. y-hat is dependent variable)

5. Imagine x = 10. Using the equation, predict y-hat. (1 point)
a. 23.4
b. 14.6
c. 33.3
d. 25.3 (see answer a to question 4 above: (-0.8)(10) + 33.3 = 25.3)

Questions from Chapter 7

Researchers asked Japanese people in Ohasama to measure their blood pressure every day as part of a study and report back what they measured. After the study was done, the researchers found that for the systolic blood pressure (SBP), μ = 117.3 mmHg and σ = 13.4 mmHg.

Source: Imai, Y., et al.
(1993) Characteristics of a community-based distribution of home blood pressure in Ohasama in northern Japan. J Hypertension 11(12):1441-1449.

Answer the following questions based on this population’s SBP, where μ = 117.3 mmHg and σ = 13.4 mmHg. Round SBPs to 1 digit after the decimal. Round z-scores to 2 digits after the decimal. Round probabilities/p-values to 4 digits after the decimal. Round percentages to a whole number.

1. According to the Empirical Rule, what is the SBP cutpoint for 34% of the data above the mean? (1 point)
a. 103.9
b. 117.3
c. 130.7 (34% is 1 σ above μ. 117.3 + 13.4 = 130.7.)
d. 117.0

2. According to the Empirical Rule, what is the SBP cutpoint for 34% of the data below the mean? (1 point)
a. 103.9 (34% is 1 σ below μ. 117.3 - 13.4 = 103.9.)
b. 117.3
c. 130.7
d. 117.0

3. What is the z-score for the probability that, for an Ohasama resident selected at random, x is greater than 119.2? Circle the letter for the correct answer. (1 point)
a. -1.14
b. 0.14 ((119.2 - 117.3)/13.4 = 0.14)
c. -0.5557
d. 0.4443

4. What is the probability that, for an Ohasama resident selected at random, x is greater than 119.2? Circle the letter for the correct answer. (1 point)
a. -1.14
b. 0.5557
c. 0.4443
Look up 0.14 in the z table, which gives 0.5557, which is bigger than 50%. The question asks for the likelihood of being greater than 119.2, and since μ is below that at 117.3, we are looking for the <50% piece at the top. Therefore, 1 - 0.5557 = 0.4443. Best to draw a picture.
d. 0.0145

5. What is the probability that, for an Ohasama resident selected at random, x is less than 115.4? Circle the letter for the correct answer. (1 point)
a. 0.5557
b. 0.4443
Same situation as Question 4: you need a picture. The z-score is (115.4 - 117.3)/13.4 = -0.14, and looking up -0.14 in the z table gives 0.4443. You keep it since it is <0.5000 (50%) and you are looking for the little piece.
You don’t need to do the 1 minus thing. Draw a picture!
c. 0.0145
d. -0.9927 (negative probability = nonsensical)

6. What is the correct picture corresponding to the area under the curve that should be shaded in for the previous question?
[Answer choices a through d are shaded normal-curve pictures that are not recoverable from the source.]
D is the correct answer.

7. What is the probability that, for an Ohasama resident selected at random, x is between 115.4 and 119.2? Circle the letter for the correct answer. (2 points)
a. -0.0145 (negative probability = nonsensical)
b. 0.5557
c. 0.4443
d. 0.1114
For 119.2 and lower, we use p = 0.5557; that is the big piece we didn’t want in Question 4. From that, we subtract p = 0.4443, which is the little piece we calculated in Question 5. 0.5557 - 0.4443 = 0.1114, or 11%, for the middle part.

8. What is the probability that, for a sample of 25 Ohasama residents selected at random, the sample mean is between 115.4 and 119.2? Round standard error to 1 digit after the decimal. Circle the letter for the correct answer. (2 points)
a. -0.8555 (negative = nonsensical)
b. 0.9473
c. 0.0007
d. 0.5160
Standard error is 13.4/√25 = 2.7. The z for 115.4 is (115.4 - 117.3)/2.7 = -0.70. The z for 119.2 is (119.2 - 117.3)/2.7 = 0.70. The probability for getting less than 119.2 (the big red piece including the little white piece at the bottom) is at z = 0.70, which is 0.7580. The probability for getting less than 115.4 (the little white piece at the bottom) is at z = -0.70, which is 0.2420. 0.7580 (big piece plus little piece) - 0.2420 (little piece) = 0.5160.

9. What is the probability that, for a sample of 36 Ohasama residents selected at random, the sample mean is between 115.4 and 119.2? Round standard error to 1 digit after the decimal. Circle the letter for the correct answer. (2 points)
a. -0.7111
b. 0.1111
c. 0.6102
Almost the same as the last question, but we use n = 36. Standard error is 13.4/√36 = 2.2.
The z for 115.4 is (115.4 - 117.3)/2.2 = -0.86. The z for 119.2 is (119.2 - 117.3)/2.2 = 0.86. Referring to the picture from the last question, the probability for getting less than 119.2 (the big red piece plus the little white piece at the bottom) is at z = 0.86, which is 0.8051. The probability for getting less than 115.4 (the little white piece at the bottom) is at z = -0.86, which is 0.1949. 0.8051 - 0.1949 = 0.6102.
d. 0.9580

10. Imagine the Japanese government provides a free gym membership to Ohasama residents with an SBP over 140. What percentage of the Ohasama residents will be eligible for the free membership? (1 point)
a. 5%
Calculate the z-score: (140 - 117.3)/13.4 = 1.69. Look it up in the z table and get 0.9545. Over 140 is the small piece at the top, so 1 - 0.9545 = 0.0455, or 5%.
b. 40%
c. 69%
d. 95%

11. Imagine the Japanese government provides a free gym membership to Ohasama residents in the top 9% of SBP in the nation. What is the SBP cutpoint for Ohasama residents to be eligible for the free membership? (1 point)
a. 125
b. 130
c. 135
To calculate x, first dig around in the middle of the z table to look up 9%, which would be 0.0900, or whatever is closest. Closest is 0.0901, which is at z-score -1.34. That’s for the bottom 9% because it is negative. The top 9% is at z-score 1.34. (In case you are wondering how that works, you can prove it by seeing that 1 - 0.0901 = 0.9099, and if you look up that probability, you get z-score 1.34. This is the long way of doing it.) Now that you have a z of 1.34, plug it into the equation: (1.34 * 13.4) + 117.3 = 135.3. We round it to give an easy cutpoint for the clinicians.
d. 140

Questions from Chapter 8

Martin G. Larson wrote a statistical primer for cardiovascular research which was published in Circulation (2006, vol. 114: 76-81).
In it, he reported population parameters for BMI in the city of Framingham, MA (as measured by offspring of Framingham Study participants, n=3,480, 1995-1998). Larson found that BMI was normally distributed in the population he considered. Here are the parameters Larson reported:

μ = 27.9
σ = 5.1

Here are a few scores from the z-table:

Level of Confidence (c)    Critical Value z
0.90, or 90%               1.645
0.95, or 95%               1.96
0.99, or 99%               2.58

Instructions: You learn that having a BMI of 25 and over is considered being overweight, and are concerned that the adult residents of Framingham appear to be overweight. You wonder whether the adults in your own small town are overweight, so you measure 12 adult members of your population, and get a sample mean of 26.0. You decide to make a 90% confidence interval for the μ of your small town using this sample of 12 members and the σ estimate from the Larson article. Answer the following questions with respect to this scenario.

You decide to set up your confidence interval equation. Answer the following questions about that equation. Round z to 2 digits after the decimal. Round BMI to 1 digit after the decimal.

1. What is the value of x-bar? (1 point)
a. 27.9
b. 26.0 (sample mean given in question)
c. 25.0
d. Not given in question

2. What is the value of n? (1 point)
a. 12 (sample size given in question)
b. 25
c. 26
d. 3,480

3. What is the value of σ? (1 point)
a. 27
b. 25
c. 5.1 (parameter given in question)
d. Not given in question

4. What is the value of zc? (1 point)
a. 1.645 (question specifies 90% confidence interval, from table in question)
b. 1.96
c. 2.58
d. Not given in question

5. What is the margin of error? (1 point)
a. 1.2
b. 2.4 (1.645 * (5.1/√12) = 2.4)
c. 4.6
d. 6.8

6. What is the confidence interval for the μ BMI of the residents of your small town? (2 points)
a. 24.0 to 27.0
b. 22.2 to 25.6
c. 20.4 to 29.7
d. 23.6 to 28.4
The x-bar is 26.0 and the E is 2.4.
26 - 2.4 = 23.6, and 26 + 2.4 = 28.4.

7. After making this interval, what are you 90% confident about? (1 point)
a. I am 90% confident that the true μ falls into the range of my 90% confidence interval. (remember sampling distributions)
b. I am 90% confident that BMI is normally distributed. (illogical; given in question)
c. I am 90% confident that the true σ falls into the range of my 90% confidence interval. (illogical)
d. I am 90% confident that my sample is representative of the population. (illogical)

8. According to your calculations, do you believe the adults in your small town are overweight (i.e., is the μ BMI of your population above 25.0)? (1 point)
a. No. They may be overweight, but since the cutoff for overweight is BMI = 25.0 and the bottom limit of my confidence interval was below that, it is not clear if the μ is higher than 25.0 or lower than 25.0.
b. No. The entire confidence interval (both lower and upper limits or bounds) for the μ was below 25.0. Therefore, it is very unlikely the population is overweight.
c. Yes. The entire confidence interval (both lower and upper limits or bounds) for the μ was above 25.0. Therefore, it is very unlikely the population is of normal weight.
d. There is no way to tell from this confidence interval.

You realize Larson did not report on children, but you are concerned about the children in your community. You decide to measure 12 school-aged children in your community, but cannot use Larson’s estimate of σ because it applies to adults. You decide to estimate the μ of this population with a 90% confidence interval, but you must use s to estimate σ. Your measurements yield the following statistics:

x-bar = 26.0
s = 2.0

9. What table will you use to look up the probability in this question, and why? (2 points)
a. The t-table, because the σ is unknown, and we have a small sample size (<500).
(If this is the situation, you have to use t and not z.)
b. The z-table, because the σ is unknown, and we have a small sample size (<500). (z-table is only for large sample sizes)
c. The t-table, because BMI is not normally distributed. (illogical)
d. The z-table, because t-table does not have probabilities for a 90% confidence interval. (not even true)

10. What are the degrees of freedom in this question? (1 point)
a. 10
b. 11 (n = 12, and 12 - 1 = 11.)
c. 12
d. 90

11. In your confidence interval equation, what is the value of s? (1 point)
a. 2.0 (given in question)
b. 5.1
c. 27.9
d. 25.0

12. In your confidence interval equation, what is the value of tc? (1 point)
a. 0.026
b. 0.179
c. 1.796
Look in the t-table. Since d.f. = 11, look in the 11 row. Since we are looking for a 90% confidence interval, look under the column where c = 0.900. There, t = 1.796.
d. 2.535

13. In your confidence interval equation, what is the margin of error? (1 point)
a. 0.5
b. 1.0 (1.796 * (2.0/√12) = 1.0)
c. 2.5
d. 3.0

14. State your 90% confidence interval. (2 points)
a. 25.0 to 27.0
x-bar is 26.0 and E = 1.0. Lower limit is 26.0 - 1.0 = 25.0. Upper limit is 26.0 + 1.0 = 27.0.
b. 24.0 to 28.0
c. 20.0 to 25.0
d. 25.1 to 26.1

15. According to your calculations, do you believe the children in your small town are overweight (i.e., is the μ BMI of your population above 25.0)? (1 point)
a. No. They may be overweight, but since the cutoff for overweight is BMI = 25.0 and the bottom limit of my confidence interval was below that, it is not clear if the μ is higher than 25.0 or lower than 25.0.
b. No. The entire confidence interval (both lower and upper limits or bounds) for the μ was below 25.0. Therefore, it is very unlikely the population is overweight.
c. Yes. The entire confidence interval (both lower and upper limits or bounds) for the μ was at or above 25.0. Therefore, it is very unlikely the population is of normal weight.
d.
There is no way to tell from this confidence interval.
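
The two Chapter 8 confidence intervals can be checked with a short script. This is only a sketch: the function names `margin_of_error` and `confidence_interval` are made up for illustration, the critical values 1.645 and 1.796 are typed in from the tables quoted in the questions rather than computed, and the margin of error is rounded to 1 digit before it is applied, as in the worked answers.

```python
import math


def margin_of_error(critical_value, spread, n):
    """E = critical value * (spread / sqrt(n)); spread is sigma or s."""
    return critical_value * spread / math.sqrt(n)


def confidence_interval(x_bar, critical_value, spread, n):
    """Return (lower, upper), rounding E to 1 digit as the answer key does."""
    e = round(margin_of_error(critical_value, spread, n), 1)
    return round(x_bar - e, 1), round(x_bar + e, 1)


# Adults: sigma = 5.1 is known, so use z_c = 1.645 (90% confidence), n = 12.
adult_ci = confidence_interval(26.0, 1.645, 5.1, 12)

# Children: sigma is unknown, so use s = 2.0 and t_c = 1.796 (d.f. = 11).
child_ci = confidence_interval(26.0, 1.796, 2.0, 12)

print("Adults:  ", adult_ci)
print("Children:", child_ci)
```

The adult interval reaches below the BMI cutoff of 25.0, while the children's interval starts at 25.0, which is the comparison the two "overweight" questions ask about.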