Fall ’10 – Exam 1

1. Which of the following best describes a 95th percentile for a given distribution?
A percentile is a number (z-score) that has the associated percent of the observations below it (to the left of it).
A. the value of 95% of the observations
   This would be the mode.
B. a number that is 2 standard deviations above the mean of the distribution
   IF the data is normal, then 2 sds above the mean would be approximately the 97.5th percentile.
C. a number that would be equivalent to 95 on a 100 point scale
D. a number such that 95% of the observations in the distribution have that value or less
   Yes.
E. Two of the above are true.

2. Bubba and Joe Bob decide they want to shoot Mrs. Cleary's scarecrow. They each take three shots. Bubba hits both the right and the left arm and the stomach. Joe Bob hits the left knee all three times. They were both aiming for the heart. Which of the following is true?
Since the center of Bubba's shots was close to the heart, on average he hit the target's center, so he is unbiased. He had a large spread, so he had high variability. Joe Bob had small variability, but the center of his shots was way off the target, so he is biased.
A. Bubba is unbiased but has high variability.
B. Joe Bob is unbiased but has high variability.
C. Bubba is biased but has low variability.
D. Joe Bob is biased but has high variability.
E. None of the above are correct.

3. Suppose xbar = 5 and sx = 2. What are ybar and sy if y = 3x - 4?
Plug the old mean into the equation: ybar = 3(5) - 4 = 11. Only the multiplier affects the spread, so sy = 3(2) = 6.
E. ybar = 11 and sy = 6

4. Explain the boxplot for Truck.
Since there is no line within the box indicating the value of the median, the median must be equal to either Q1 or Q3.
E. Q3 = ~x or ~x = Q1

5. For X ~ N(3, 2²), what is P(X > 0.65)? (rounded)
1 - P(Z < (0.65 - 3)/2) = 1 - P(Z < -1.18) = 1 - 0.119 = 0.881
A. 0.88

6. For a certain dataset, you are told that the standard deviation, s = 0. What else can you say about the dataset?
If the sd = 0, then ALL of the data points are the same value and ALL measures of spread will also be 0. We don't, however, know what the constant value of the distribution is.
A. the mean, xbar = 0
B. the median, ~x = 0
C. the IQR = 0
D. all of the above
E. none of the above

7. If the 5-number summary for a dataset is 22, 37, 52, 65, 79, which of the following best describes the distribution?
Checking the distances between the 5 numbers, you get 15, 15, 13 and 14. Since these are all about the same, we can say the distribution is approximately uniform.
E. uniform

8. Some companies "grade on a bell curve" to compare the performance of their managers and professional workers. This forces the use of some low performance ratings, so that not all workers are graded "above average". Until the threat of lawsuits forced a change, Ford Motor Company's "performance management process" assigned 15% A's, 15% C's, and 70% B's to the company's 18,000 managers. Assuming the distribution is approximately bell-shaped (normal), what are the approximate mean and standard deviation if C's have scores of 25 or less and A's have scores of 475 or more?
A normal distribution is symmetric, so 25 is as far from the center (the mean) as 475. The mean is (25 + 475)/2 = 250, and the distance from the center to 25 is 225. We know this is 1 sd because the 15th percentile is approximately 1 sd below the mean (by the 68-95-99.7 rule, about 16% of the data lie more than 1 sd below the mean).
C. xbar = 250 and s = 225

9. Which of the following statements is/are true?
A. The value of the median will change if you add 10 to the maximum in a dataset.
   Since you are only changing the maximum, the mean (and sd) would change, but the median would stay the same.
B. The IQR covers 50% of the range (spread) of the data.
   This is only true for the uniform distribution. The IQR is the range of the middle 50% of the data.
C. A z-score tells us how many standard deviations an observation is from its mean.
   Exactly.

10. What is z such that P(Z > z) = 0.075 and Z ~ N(0, 1²)?
1 - P(Z < z) = 0.075, so P(Z < z) = 0.925 and z = 1.44.
D. 1.44

11. If the difference in the average husbands' and wives' weights is Hbar - Wbar ~ N(25, 10²), how likely is the wife to weigh more than the husband?
If the average wife weighs more than the average husband, Hbar - Wbar would be negative, so you need to solve P(Hbar - Wbar < 0) = P(Z < (0 - 25)/10) = P(Z < -2.5) = 0.0062.
C. 0.0062

12. If we know that a distribution is skewed to the right, which of the following is true?
A. The mean is smaller than the median.
   No, the mean will be larger than (to the right of) the median.
B. Outliers would most likely be to the left.
   The tail, and therefore the outliers if they exist, is on the right.
C. The histogram would be shorter on the left.
   The bins of the histogram would be taller on the left and shorter on the right.
D. All of the above are true.
E. None of the above are true.

13. Numbering the boxplots 1 through 4 from left to right, which distribution is approximately uniform?
A uniform boxplot has the 4 quarters about the same length, i.e., the values of the 5-number summary are about equally spaced.
B. 3 only since it's the most spread out.

14. Which boxplot has an Interquartile Range, IQR, of about 20?
Looking at the left margin, you'll notice that the tick marks are 20 units apart. The box of the first boxplot is about that same distance. The 2nd and 4th boxplots have slightly smaller IQRs.
A. 1

15. Suppose the average height of the American male is 5'10" or 70 inches with a standard deviation of 6 inches. What is the distribution of a sample of 50 American males' heights?
Since the sample size n = 50 is more than 30, we can assume that the distribution of the sample mean will be approximately normal. The mean is the same as the mean of the data, 70, and the standard deviation is the standard deviation of the data divided by the square root of the sample size, 6/√50.
B. N(70, (6/√50)²)

16. IF the heights of American males are X ~ N(70, 6²), how likely are you to find an American male over 7 feet (84 inches)?
P(X > 84) = P(Z > (84 - 70)/6) = 1 - P(Z < 2.33) = 1 - 0.9901 = 0.0099
A. 0.0099

17. Suppose you know that a distribution is bell-shaped, with mean, xbar = 30 and standard deviation, sx = 4. Approximately what percent of the data will fall between 22 and 34?
z1 = (22 - 30)/4 = -2, z2 = (34 - 30)/4 = +1. The area within 1 sd of the mean is 68%. Add the area between -2 and -1, 13.5%, and you get 81.5%.
C. about 80%

18. What is z_(α/2) such that P(-z_(α/2) < Z < z_(α/2)) = 0.75 and Z ~ N(0, 1²)?
0.75 = 1 - α, so α = 0.25 and α/2 = 0.125. Then P(Z < z_(α/2)) = 0.875, which gives z = ± 1.15.
B. 1.15

19. Suppose your high school grading system used an 8 point scale with a mean of 6.5 and a standard deviation of 1.2 and your average was 7. What is your equivalent average in a 100 point system with a mean of 82 and a standard deviation of 14?
z = (7 - 6.5)/1.2 = 0.4167 (about 0.42). To convert to the new distribution, use x = µ + z*σ: x = 82 + (0.4167)*14 = 87.83. (Using the rounded z = 0.42 gives 87.88.)
B. 87.838

20. Suppose you get a z-score = 1.2 on this exam. Question #10 is really hard, so I decide to give everyone credit, i.e., I give 5 points to everyone who missed it. What SHOULD you think about this if you got the problem right originally?
Adding points to SOME scores would increase the mean, but since you got it correct, your score stayed the same. This would make your z-score = (score - increased mean)/sd lower, since you're subtracting a larger number now. We don't know whether your new z-score is positive or negative; we just know that it's lower.
B. You get gypped since your z-score would decrease.

Answer key: 1D, 2A, 3E, 4E, 5A, 6C, 7E, 8C, 9C, 10D, 11C, 12E, 13B, 14A, 15B, 16A, 17C, 18B, 19B, 20B
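
The normal-distribution answers above (questions 5, 10, 11, 16, 17, 18 and 19) can be double-checked without a z-table. The following is only a sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class; the helper names upper_tail and rescale are made up for this illustration and are not part of the exam. Exact values can differ from table lookups in the last digit because the table rounds z to two decimals, and question 17 uses the 68-95-99.7 rule rather than exact areas.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal, Z ~ N(0, 1^2)


def upper_tail(x, mu, sigma):
    """Return P(X > x) for X ~ N(mu, sigma^2)."""
    return 1 - NormalDist(mu, sigma).cdf(x)


def rescale(x, mu_old, sd_old, mu_new, sd_new):
    """Convert x to a z-score on the old scale, then map it to the new scale."""
    z = (x - mu_old) / sd_old
    return mu_new + z * sd_new


q5 = upper_tail(0.65, 3, 2)          # P(X > 0.65), X ~ N(3, 2^2)
q10 = Z.inv_cdf(1 - 0.075)           # z with P(Z > z) = 0.075
q11 = NormalDist(25, 10).cdf(0)      # P(Hbar - Wbar < 0)
q16 = upper_tail(84, 70, 6)          # P(X > 84), X ~ N(70, 6^2)
q17 = Z.cdf(1) - Z.cdf(-2)           # area between z = -2 and z = +1
q18 = Z.inv_cdf(1 - 0.25 / 2)        # z with P(-z < Z < z) = 0.75
q19 = rescale(7, 6.5, 1.2, 82, 14)   # 8-point average on the 100-point scale

for name, value in [("Q5", q5), ("Q10", q10), ("Q11", q11), ("Q16", q16),
                    ("Q17", q17), ("Q18", q18), ("Q19", q19)]:
    print(f"{name}: {value:.4f}")
```

Each printed value should land close to the keyed answer; small gaps (for example in questions 16 and 17) come from table rounding and the empirical rule, not from a different method.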