Math 1107 Practice test #1 (selected answer key)

1. a. [Bar chart: vertical axis from 0 to 70; categories Eng., Phys. Sci., Psychology, Educ., Life Sci., CS]
b. A pie chart would not be appropriate since the percentages do not add up to 100.

2.
    Placebo | Stem | Calcium
          8 |   9  |
        9 2 |  10  | 2 7 7
9 7 4 2 2 0 |  11  | 0 1 2 2
          3 |  12  | 3
          0 |  13  | 6

The distribution in the Calcium group seems to be slightly skewed to the right, whereas the distribution in the placebo group is more symmetric. However, the centers (sample means) of the two distributions are quite close in value.

3. The median is Q2 = 50.7 and the third quartile is Q3 = 58.1. Therefore, the 1964, 1972, and 1984 elections were landslides.

4. a) [Boxplots for additives A and B]
b) Additive A increased mileage in at least 75% of the cars (notice that its first quartile is 1, whereas for additive B the first quartile is -2), so it is the choice if we want to increase mileage in the highest proportion of cars.
c) Additive B definitely has a higher mean increase: its third quartile is much higher, and the values above the third quartile are much bigger than the corresponding values for additive A. Moreover, the values below the first quartile for additive B are bigger than the corresponding values for additive A.

Review exercises Part 1:

Ex. 5. a,b) Since men's heights have a greater standard deviation, they vary more than women's heights, so a man is more likely than a woman to qualify for Beanstalk membership. To check, we compute the z-scores of the heights that men and women need to qualify:
z_man = (74 - 69.1)/2.8 = 1.75, z_woman = (70 - 64)/2.5 = 2.4.
Since the z-score for the woman's height is much larger than the z-score for the man's height, women are less likely than men to qualify for membership. This is due to the shape of the normal curve: since it peaks in the middle, values closer to the center are more likely than values farther from the center.

Ex. 6.
a) The distribution of the number of loaves sold is skewed to the right, with a center around 110 and a range of 145-95=50.
b) Because of the skewness to the right, we expect the mean to be higher than the median.
c) [Boxplot of Sales; axis from 90 to 140]
Notice that the interquartile range is IQR = Q3 - Q1 = 105.5 - 97 = 8.5, so by the 1.5 IQR rule any value greater than Q3 + 1.5 IQR = 105.5 + 12.75 = 118.25 or smaller than Q1 - 1.5 IQR = 97 - 12.75 = 84.25 is an outlier. There are no values less than 95, but there are quite a few values above 118.25, and all of those are outliers. Moreover, the values that are more than 3 IQR above the third quartile, i.e., values above 105.5 + 25.5 = 131, are extreme outliers.
d) We cannot conclude that on about 68% of the days the bakery will sell between 94 and 112 loaves, since the data do not follow a normal model. There is extreme skewness, associated with the presence of many outliers.

Ex. 8.
b) To find the percent of storms that produce rain with pH over 6, we need the area to the right of 6 under the normal curve with mean 4.9 and standard deviation 0.6: normalcdf(6,E99,4.9,0.6)=0.0334
c) We need the area under the normal curve to the left of 4: normalcdf(-E99,4,4.9,0.6)=0.0668
d) This is an inverse problem: we are given the area and we need to find the value that cuts off that area. Read the problem again: the lower the pH, the more acidic the rain. Since we are looking at the most acidic 20% of the storms, we need the lowest 20% of the pH values: invNorm(0.20, 4.9, 0.6)=4.4
e) For the least acidic 5% of the storms we need the top 5% of the pH values: invNorm(0.95, 4.9, 0.6)=5.89
f) We have Q1 = invNorm(0.25, 4.9, 0.6) = 4.5 and Q3 = invNorm(0.75, 4.9, 0.6) = 5.3, so the interquartile range is IQR = 0.8.

Exercise 12. We are told that 51% of homes had a personal computer (PC) and 41.5% had access to the internet.
Common sense tells us that to have internet access one must have a personal computer, so it is reasonable to believe that some (if not all) of the homes with internet access had a PC. It therefore does not make sense to add the two percentages, and we do not agree with the conclusion in the newspaper.

Exercise 14.
a) We treat the data as a categorical variable. We don't know the distance that was traveled from home. The bar graph appears below.
[Bar chart: Chart of Miles from home; vertical axis Percent (0 to 30); categories less than 1, 1 to 5, 6 to 10, 11 to 15, 16 to 20, over 20]
b) As explained above, we don't know how far the car traveled; we only know where the accident happened. It may have happened when the car was returning home.

Exercise 21. Here is a histogram of the arm/nose ratio done in Minitab. Note: the data can be downloaded from the text website or the CD.
[Histogram of arm/nose ratio; vertical axis Frequency (0 to 6); horizontal axis arm/nose ratio (12.0 to 16.8)]
The distribution of ratios is skewed to the left, with a mean around 15 and a range of approximately 5. There may be outliers to the left of the center.
b) The summary statistics appear below:

Variable         N   N*  Mean    SE Mean  StDev  Minimum  Q1      Median  Q3      Maximum
arm/nose ratio   18  0   14.978  0.295    1.252  11.800   14.350  15.250  15.750  16.900

Despite the skewness, the mean and median are fairly close in value. The data do not have high variability, as can be seen from the fairly small standard deviation and the small IQR = 15.75 - 14.35 = 1.4. However, if we use the 1.5 IQR rule for outliers we see that:
Q1 - 1.5 IQR = 14.35 - 1.5(1.4) = 12.25
Q3 + 1.5 IQR = 15.75 + 1.5(1.4) = 17.85
The minimum ratio, 11.8, is below 12.25, so it is an outlier. There are no outliers to the right of the center since 16.9<17.85.
c) Based on the observations above, a ratio of 9.3 is indeed unusually small, since it is more than 3 IQR below the first quartile, that is, below Q1 - 3 IQR = 14.35 - 3(1.4) = 10.15.

Exercise 33.
a) There were 1101/4002 = 27.5% Republicans.
b) Since the sample was random and representative, yes, this is a reasonable estimate.
c) We have a total of 1001 people under 30 and 1004 over 65, and since these are nonoverlapping categories, the percentage of people under 30 or over 65 is (1001 + 1004)/4002 = 50.1%.
d) There were 409 Independents under 30, so the percentage is 409/4002 = 10.2%.
e) Be careful how you read this question: it is not the same as the one above. Now we look at the percentage of people under 30 out of the Independents: 409/1497 = 27.3%.
f) This percentage is now computed out of the people under 30, so we divide by the row total: 409/1001 = 40.9%.

Exercise 23 page 125:
b) Using the 68-95-99.7 rule, the central 95% should have a diameter between 10.4 - 2(4.7) = 1 in and 10.4 + 2(4.7) = 19.8 in.
c) Since less than one inch represents more than 2 standard deviations below the mean, the normal model gives a percentage of 2.5%.
d) A diameter of 5.7 is one standard deviation below the mean. We know that approx. 68% of diameters are within one standard deviation of the mean, which is exactly between 5.7 and 15.1. Then, between 5.7 and the center, 10.4, there are 34%.
e) Between 10.4 and 15.1 there are also 34% of diameters. Since 50% of diameters are over 10.4, the percentage over 15.1 is 50%-34%=16%.

Exercise 25. The distribution is approximately uniform. There is too much heaviness in the tails to allow us to use the normal model.

Exercise 28.
a), b) Since the data are so dramatically skewed to the left, the median is a better measure of center and the IQR is a better measure of spread.
c) According to the Normal model, about 68% of neighborhoods would have a percent white within one standard deviation of the mean.
d) First, if we measure one standard deviation to the right of the mean we obtain 83.59+22.26=105.85 (impossible for a percentage). Here you can see what skewness can lead to. One standard deviation to the left: 83.59-22.26=61.33.
From the histogram we need to estimate the number of neighborhoods that have a percentage white between 61.33% and 100%. Estimating the heights of the bars in the histogram, we get a total count of at least 400. Out of the total of 500 neighborhoods this represents at least 400/500=80%, which is very different from what the Normal model predicts.
e) The Normal model is not a good model for these data. The distribution we have is skewed to the left. Therefore no conclusions can be reached by using the Normal model.

Exercise 31. We use the TI83 to answer these questions. All questions are inverse problems: we are given the area under the Normal model and we need to find the z-scores that cut off those areas.
a) invNorm(0.8)=0.84
b) invNorm(0.25)=-0.67
c) invNorm(0.03)=-1.88
d) Since we want the middle 90% of the area, the left endpoint has an area of 0.05 to its left, and we have invNorm(.05)=-1.645. By symmetry, the right endpoint of the middle interval is 1.645.

Exercise 33.
a) The percent of steers that weigh over 1250 is normalcdf(1250, E99, 1152, 84)=.1217=12.17%
b) The percent of steers weighing under 1200 is normalcdf(0, 1200, 1152, 84)=71.61%
c) Between 1000 and 1100: normalcdf(1000, 1100, 1152, 84)=23.28%

Exercise 37.
a) To find the 40th percentile we compute invNorm(0.40, 1152, 84)=1130.7 lbs
b) The 99th percentile: invNorm(.99, 1152, 84)=1347.4 lbs
c) For the IQR we need the first and third quartiles: IQR = invNorm(.75, 1152, 84) - invNorm(.25, 1152, 84) = 1208.66 - 1095.34 = 113.32
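
The TI83 answers for Exercises 33 and 37 can also be cross-checked without the calculator. The sketch below is one possible way to do this in Python, assuming Python 3.8 or later and its standard-library statistics.NormalDist class. The helper names normalcdf and inv_norm and the constant E99 are hypothetical stand-ins chosen to mirror the calculator commands; they are not part of any library.

```python
from statistics import NormalDist

E99 = 1e99  # assumed stand-in for the calculator's E99 "infinity"


def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the Normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(area, mean=0.0, sd=1.0):
    """Value that has the given area to its left."""
    return NormalDist(mean, sd).inv_cdf(area)


MEAN, SD = 1152, 84  # steer weights (lbs) used in Exercises 33 and 37

over_1250 = normalcdf(1250, E99, MEAN, SD)    # Exercise 33 a)
under_1200 = normalcdf(0, 1200, MEAN, SD)     # Exercise 33 b)
between = normalcdf(1000, 1100, MEAN, SD)     # Exercise 33 c)
p40 = inv_norm(0.40, MEAN, SD)                # Exercise 37 a)
p99 = inv_norm(0.99, MEAN, SD)                # Exercise 37 b)
iqr = inv_norm(0.75, MEAN, SD) - inv_norm(0.25, MEAN, SD)  # Exercise 37 c)

print(f"33a: {over_1250:.4f}  33b: {under_1200:.4f}  33c: {between:.4f}")
print(f"37a: {p40:.1f}  37b: {p99:.1f}  37c: {iqr:.2f}")
```

The printed values should agree with the calculator answers above up to rounding. The IQR may differ in the last digit, because the answer key rounds each quartile to two decimals before subtracting.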