252y0313 9/29/03
ECO252 QBA2 FIRST HOUR EXAM October 2, 2003
Name ________KEY_______
Hour of class registered _____

Show your work! Make Diagrams!

I. (8 points) Do all the following Normal distribution problems. Before you take another exam please read ‘Things that you should never do on an exam or anywhere else’ in your syllabus supplement.

1) The owner of a fish market determined that the average weight for a catfish is 3.2 pounds with a standard deviation of 0.8 pounds. Assuming the weights of catfish are normally distributed, the probability that a randomly selected catfish will weigh more than 4.4 pounds is _______.

$x \sim N(3.2, 0.8)$. Make a diagram! Either show a Normal curve with a mean at zero and shade the area above 1.5, or show a Normal curve with a mean at 3.2 and shade the area above 4.4.

$P(x > 4.4) = P\left(z > \frac{4.4 - 3.2}{0.8}\right) = P(z > 1.5) = P(z \ge 0) - P(0 \le z \le 1.5) = .5 - .4332 = .0668$

2) The owner of a fish market determined that the average weight for a catfish is 3.2 pounds with a standard deviation of 0.8 pounds. Assuming the weights of catfish are normally distributed, the probability that a randomly selected catfish will weigh between 3 and 5 pounds is _______.

$x \sim N(3.2, 0.8)$. Make a diagram! Either show a Normal curve with a mean at zero and shade the area between -0.25 and 2.25, or show a Normal curve with a mean at 3.2 and shade the area between 3 and 5.

$P(3 \le x \le 5) = P\left(\frac{3 - 3.2}{0.8} \le z \le \frac{5 - 3.2}{0.8}\right) = P(-0.25 \le z \le 2.25) = P(-0.25 \le z \le 0) + P(0 \le z \le 2.25) = .0987 + .4878 = .5865$

3) The owner of a fish market determined that the average weight for a catfish is 3.2 pounds with a standard deviation of 0.8 pounds. A citation catfish should be one of the top 2% in weight. Assuming the weights of catfish are normally distributed, at what weight (in pounds) should the citation designation be established? (Find $x_{.02}$.)

Make a diagram! Draw a Normal curve with a mean at zero. By definition $z_{.02}$ is the point with 2% above it and 98% below it. Since 50% is below zero, it must be true that $P(0 \le z \le z_{.02}) = .4800$. Show this on your diagram.
If you look at the Normal table, the closest you can come to .4800 is $P(0 \le z \le 2.05) = .4798$, though $P(0 \le z \le 2.06) = .4803$ is almost as good. So $z_{.02}$ is 2.05 or 2.06. Use the formula $x_{.02} = \mu + z_{.02}\sigma = 3.2 + 2.05(0.8) = 3.2 + 1.64 = 4.84$ or $x_{.02} = \mu + z_{.02}\sigma = 3.2 + 2.06(0.8) = 3.2 + 1.648 = 4.848$. Both of these are fine. An exact answer should be between these two, maybe 4.843.

4) The owner of a fish market determined that the average weight for a catfish is 3.2 pounds with a standard deviation of 0.8 pounds. Assuming the weights of catfish are normally distributed, the probability that a randomly selected catfish will weigh less than 2.2 pounds is _______.

$x \sim N(3.2, 0.8)$. Make a diagram! Either show a Normal curve with a mean at zero and shade the area below -1.25, or show a Normal curve with a mean at 3.2 and shade the area below 2.2.

$P(x < 2.2) = F(2.2) = P\left(z < \frac{2.2 - 3.2}{0.8}\right) = P(z < -1.25) = P(z \le 0) - P(-1.25 \le z \le 0) = .5 - .3944 = .1056$

II. (5 points-2 point penalty for not trying part a.) A random sample is taken of the number of automobiles involved in fog-related accidents on the New Jersey Turnpike. The following data is found.

Accident       1    2    3    4    5
No. of cars   30    3    2    5    4

a. Compute the sample standard deviation, $s$, of the number of vehicles involved. Show your work! (3)
b. A turnpike spokesperson says that the mean number of automobiles involved in this type of pile-up is at most 6 ($\mu \le 6$). Use the mean and standard deviation that you computed above to evaluate this statement at the 95% confidence level. (3)
c. (Extra credit) A turnpike spokesperson says that the median number of automobiles in one of these accidents lies between 3 and 5. Treating this as a confidence interval for the median, what is the confidence level? What about these data makes me suspect that locating the median is a better idea than the hypothesis test in b)? (4)

Solution:
a)
index     $x$    $x^2$
  1       30     900
  2        3       9
  3        2       4
  4        5      25
  5        4      16
Sum       44     954

So $\sum x = 44$, $\sum x^2 = 954$ and $n = 5$.

$\bar x = \frac{\sum x}{n} = \frac{44}{5} = 8.80$

$s^2 = \frac{\sum x^2 - n\bar x^2}{n - 1} = \frac{954 - 5(8.80)^2}{4} = 141.70$

$s = \sqrt{141.70} = 11.904$

b) $H_0: \mu \le 6$, $H_1: \mu > 6$. From the formula table:

Interval for: Mean ($\sigma$ unknown)
Confidence interval: $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$, $DF = n - 1$
Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test ratio: $t = \frac{\bar x - \mu_0}{s_{\bar x}}$
Critical value: $\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x}$

Note that $s_{\bar x} = \frac{s}{\sqrt{n}} = \sqrt{\frac{141.70}{5}} = \sqrt{28.34} = 5.3235$.

You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion. You must do only one of the following.

(i) Test ratio for a test of the mean. The test ratio is $t_{calc} = \frac{\bar x - \mu_0}{s_{\bar x}} = \frac{8.80 - 6}{5.3235} = 0.5260$. Since this is a one-sided, right-tail test, pick $t^{(n-1)}_{\alpha} = t^{(4)}_{.05} = 2.132$. The ‘reject’ zone is the area above 2.132. Since 0.526 is not in the ‘reject’ zone, do not reject the null hypothesis.

(ii) A critical value for the sample mean. Because the alternate hypothesis is $\mu > 6$, we need a critical value for $\bar x$ above 6. Since this is a one-sided test, we use $t^{(n-1)}_{\alpha} = t^{(4)}_{.05} = 2.132$. The two-sided formula, $\bar x_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2}\, s_{\bar x}$, becomes $\bar x_{cv} = \mu_0 + t^{(n-1)}_{\alpha}\, s_{\bar x} = 6 + 2.132(5.3235) = 6 + 11.350 = 17.350$. Make a diagram showing an almost Normal curve with a mean at 6 and a shaded 'reject' zone above 17.350. Since $\bar x = 8.80$ is not above 17.350, we do not reject $H_0$.

(iii) A confidence interval for the population mean. Because the alternate hypothesis is $\mu > 6$, we need a '$\ge$' confidence interval. $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$ becomes $\mu \ge \bar x - t_{\alpha}\, s_{\bar x} = 8.80 - 2.132(5.3235) = 8.8 - 11.350$ or $\mu \ge -2.550$. Make a diagram showing an almost Normal curve with a mean at $\bar x = 8.80$ and the confidence interval above $-2.550$ shaded. Since $\mu_0 = 6$ is above $-2.550$ and thus in the confidence interval, we do not reject $H_0$. (The fact that this interval makes little sense is a clue to the answer to c).)

c) The ordered sample is 2, 3, 4, 5, 30. We have proposed a confidence interval of $3 \le \text{median} \le 5$. Remember that the probability of picking a number above the median is .5.
This confidence interval is wrong if (i) all five numbers in our sample are above the median, (ii) four out of the five numbers are above the median, (iii) all five numbers in our sample are below the median, or (iv) four out of the five numbers are below the median. To find the probability of the first two events, call a number above the median a success and find the probability of four or more successes in 5 tries. According to the Binomial table for $p = .5$ and $n = 5$, $P(x \ge 4) = 1 - P(x \le 3) = 1 - .81250 = .1875$. To find the probability of the other two events, find the probability of four or more failures in 5 tries, which is the probability of zero or one successes. $P(x \le 1) = .1875$, as always, equal to the first probability. So the significance level is $2P(x \ge 4) = 2(.1875) = .3750$, and the confidence level is 1 - .3750 = .6250, terribly low.

The use of the method in b) is based on the idea that either the sample mean is Normally distributed or the underlying distribution is Normal. For a sample this small, the sample mean is Normally distributed only if the underlying distribution is Normal, and the presence of one number that is considerably larger than the rest of the numbers in the sample makes it look like the underlying distribution is highly skewed. The median is resistant to the presence of outliers and thus a more appropriate statistic.

III. Do all of the following Problems (17 points) Show your work except in multiple choice questions.

1. The marketing manager for an automobile manufacturer is interested in determining the proportion of new compact car owners who would have purchased a passenger-side inflatable air bag if it had been available for an additional cost of $300. The manager believes from previous information that the proportion is 0.30. Suppose that a survey of 200 new compact car owners is selected and 79 indicate that they would have purchased the inflatable air bag.
If you were to conduct a test to determine whether there is evidence that the proportion is different from 0.30 and decided not to reject the null hypothesis, what conclusion could you draw? (2)
a) There is sufficient evidence that the proportion is 0.30.
b) There is not sufficient evidence that the proportion is 0.30.
c) There is sufficient evidence that the proportion is not 0.30.
d) *There is not sufficient evidence that the proportion is not 0.30.

2. An entrepreneur is considering the purchase of a coin-operated laundry. The present owner claims that over the past 5 years, the average daily revenue was $675 with a population standard deviation of $75. A sample of 30 days reveals a daily average revenue of $625. If you were to test the null hypothesis that the daily average revenue was $675, which test would you use? (1)
a) *$z$-test of a population mean
b) $z$-test of a population proportion
c) $t$-test of a population mean
d) $\chi^2$-test of population variance

3. How many Kleenex should the Kimberly Clark Corporation package of tissues contain? Researchers determined that 60 tissues is the average number of tissues used during a cold. Suppose a random sample of 100 Kleenex users yielded the following data on the number of tissues used during a cold: $\bar X = 52$, $s = 22$. Suppose the alternative we wanted to test was $H_1: \mu < 60$. State the correct rejection region for $\alpha = 0.05$. (2)
a) Reject $H_0$ if $t > 1.660$.
b) *Reject $H_0$ if $t < -1.660$. This is a left-tailed test, so we use $t^{(n-1)}_{\alpha} = t^{(99)}_{.05}$.
c) Reject $H_0$ if $t > 1.984$ or $t < -1.984$.
d) Reject $H_0$ if $t < -1.984$.

4. (ASW) The average monthly income of recent WCU business graduates four years ago was $3200. I believe that the recent recession has lowered the income and think I have the data to prove it. The data consists of 72 responses from a random sample of more recent graduates. Find the correct set of hypotheses below.
(2)
a) $H_0$: 3200, $H_1$: 3200
b) *$H_0: \mu \ge 3200$, $H_1: \mu < 3200$
c) $H_0$: 3200, $H_1$: 3200
d) $H_0$: 3200, $H_1$: 3200

5. (Bassett et al., p. 126) The manager of a bottling plant is anxious to reduce the variability of the weights of bottled fruit. Over the last two years the standard deviation has held at 15.2 grams and the distribution of the weights is believed to be normal. A new machine is introduced and the weights of the bottles in a randomly selected sample are 987, 966, 955, 977, 981, 967, 975, 980, 953, 972, so that $n = 10$, $\sum x = 9713$ and $\sum x^2 = 9435347$. Does the new machine have better performance?
a) State your null and alternative hypotheses. (1)
b) Do a test of your null hypothesis using a test ratio and a 5% significance level. (3)
c) Do a 95% 2-sided confidence interval for the parameter of interest. (2)

Solution:
a) We want to improve performance by reducing variability, so we will switch if $H_1: \sigma < 15.2$ is true. The opposite is $H_0: \sigma \ge 15.2$. This is a small-sample test of a standard deviation or variance.

b) According to the formula table, the test statistic is $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$. $\bar x = \frac{\sum x}{n} = \frac{9713}{10} = 971.3$ and $s^2 = \frac{\sum x^2 - n\bar x^2}{n - 1} = \frac{9435347 - 10(971.3)^2}{9} = 123.344$. So $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9(123.344)}{(15.2)^2} = 4.805$. Since this is a left-tail test and $\alpha = .05$, our rejection region is below $\chi^{2(9)}_{.95} = 3.3251$. Since our test ratio is not in the rejection region, we do not reject the null hypothesis or buy the new machine.

c) According to the outline, the interval is $\frac{(n-1)s^2}{\chi^{2(n-1)}_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^{2(n-1)}_{1-\alpha/2}}$. Since $\chi^{2(9)}_{.025} = 19.0228$, $\chi^{2(9)}_{.975} = 2.7004$ and $(n-1)s^2 = 9(123.344) = 1110$, the interval is $\frac{1110}{19.0228} \le \sigma^2 \le \frac{1110}{2.7004}$ or $58.35 \le \sigma^2 \le 411.05$ or, taking square roots, $7.639 \le \sigma \le 20.274$.

6. (Bassett et al., p. 121) A box of bolts comes from a manufacturer whose bolts supposedly have a mean length of 5 cm and a variance (not a standard deviation) of 0.05. A random sample of 10 bolts is taken with the following results: 5.68, 5.13, 5.82, 5.71, 5.36, 5.52, 5.29, 5.77, 5.45, 5.39.
From these we calculate a sample mean of 5.512. We assume that the reported variance is a valid population variance and do not calculate a sample variance, and we use a 5% significance level.
a) Calculate a confidence interval for the mean on the assumption that the underlying distribution is normal. Is the population mean significantly different from 5 cm? (2)
b) If, instead, we found $\bar x = 5.0707$, find a p-value for the test. (2)
c) If in b), we found that these 10 bolts had come from a population of 50, how would this affect the p-value? You do not need to do any new computations, but I expect a short understandable explanation. (1)
d) Caramba! We look at the distribution of the lengths and find that it is not normal. So we do a test that the median is 5. What p-value do you get and what are the implications if our significance level is .01? (2)
e) (Extra credit) Find a confidence interval for the median with the lowest confidence level that is above 95%. (2)

Solution: We are testing whether the mean is 5. We are told that $\bar x = 5.512$, $n = 10$, $\sigma^2 = 0.05$ and $\alpha = .05$. From the formula table we have:

Interval for: Mean ($\sigma$ known)
Confidence interval: $\mu = \bar x \pm z_{\alpha/2}\, \sigma_{\bar x}$
Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test ratio: $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}}$
Critical value: $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar x}$

a) Using the formula above for a confidence interval, $z_{\alpha/2} = z_{.025} = 1.960$ and $\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{\sigma^2}{n}} = \sqrt{\frac{0.05}{10}} = 0.0707$. So $\mu = 5.512 \pm 1.960(0.0707) = 5.512 \pm 0.139$ or 5.373 to 5.651. Since this interval does not include 5, we can say that the population mean is significantly different from 5.

b) We are testing $H_0: \mu = 5$ against $H_1: \mu \ne 5$. $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}} = \frac{5.0707 - 5}{0.0707} = 1$. The p-value is (from the Normal table) $2P(z \ge 1) = 2(.5 - .3413) = .3174$.

c) With a finite population, $\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$. The finite population correction makes the standard error smaller, and thus makes $z$ larger. If $z$ is larger than 1, the probability of being above that value will be smaller, so that the p-value will be smaller.
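The arithmetic in a) through c) can be checked with a short Python sketch. It assumes the exact Normal distribution from the standard library rather than the printed table, so the last digit may differ slightly from table-based answers. The function names and the optional `pop_size` argument are illustrative only and are not part of the course materials.

```python
from math import sqrt
from statistics import NormalDist


def std_error(sigma2, n, pop_size=None):
    """Standard error of the mean for a known population variance.

    If pop_size is given, the finite population correction
    sqrt((N - n) / (N - 1)) is applied.
    """
    se = sqrt(sigma2 / n)
    if pop_size is not None:
        se *= sqrt((pop_size - n) / (pop_size - 1))
    return se


def z_interval(xbar, sigma2, n, conf=0.95):
    """Two-sided z confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * std_error(sigma2, n)
    return xbar - margin, xbar + margin


def two_sided_p_value(xbar, mu0, sigma2, n, pop_size=None):
    """Two-sided p-value for H0: mu = mu0 using a z test ratio."""
    z = (xbar - mu0) / std_error(sigma2, n, pop_size)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Part a): interval around the sample mean 5.512
low, high = z_interval(5.512, 0.05, 10)

# Part b): p-value if the sample mean had been 5.0707
p_infinite = two_sided_p_value(5.0707, 5.0, 0.05, 10)

# Part c): same test with the bolts drawn from a population of 50
p_finite = two_sided_p_value(5.0707, 5.0, 0.05, 10, pop_size=50)

print(f"interval: {low:.3f} to {high:.3f}")
print(f"p-value without correction: {p_infinite:.4f}")
print(f"p-value with correction:    {p_finite:.4f}")
```

The last two printed values illustrate the point made in c): the correction shrinks the standard error, so the p-value falls.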
d) We are testing $H_0$: median $= 5$ against $H_1$: median $\ne 5$. All ten numbers are above 5. If the median is 5, the probability of being above 5 is .5, and the probability of getting 10 numbers above 5 is, according to the Binomial table, $P(x \ge 10) = 1 - P(x \le 9) = 1 - .99902 = P(x = 0) = .00098$. Since this is a 2-sided test, we say the p-value is twice this, or .00195. Since this is below .01, we reject the null hypothesis.

e) The numbers are 5.68, 5.13, 5.82, 5.71, 5.36, 5.52, 5.29, 5.77, 5.45, 5.39. If we put them in order we have

$x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}$
5.13, 5.29, 5.36, 5.39, 5.45, 5.52, 5.68, 5.71, 5.77, 5.82

Since the probability of being above the median is .5, if we use the 1st and the 10th number and call being above the median a success, the probability that we are wrong is $2P(x \ge 10) = 2[1 - P(x \le 9)] = 2(1 - .99902) = 2P(x = 0) = 2(.00098) = .00195$. This is the significance level; the confidence level is 1 - .00195 = .99805. If we use the second and 9th number the significance level is $2P(x \ge 9) = 2P(x \le 1) = 2(.01074) = .02148$ and the confidence level is .97852, and if we use the third and 8th number, the significance level is $2P(x \ge 8) = 2P(x \le 2) = 2(.05469) = .10939$, which is obviously too high. So our 97.9% confidence interval is $5.29 \le \text{median} \le 5.77$.

7. We do a test of $H_0: p = .60$ against $H_1: p \ne .60$ and find a test ratio of $z = .98$.
a) What is the p-value of our result? (2)
b) Assuming that your p-value in a) is correct, would we reject the null hypothesis if the significance level was 8%? Why? (1)

Solution:
a) Since this is a 2-sided test, the p-value is $2P(z \ge 0.98) = 2(.5 - .3365) = .3270$.
b) We reject the null hypothesis if the p-value is below the significance level. Since .3270 is not below .08, we do not reject the null hypothesis.

ECO252 QBA2 FIRST EXAM October 2 2003
TAKE HOME SECTION
Name: ______Back_______________
Social Security Number: _________________________

IV. Do the first two problems (at least 10 each) (or do sections adding to at least 20 points - Anything extra you do helps, and grades wrap around).
Show your work! State $H_0$ and $H_1$ where appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion. Use a 95% confidence level unless another level is specified.

1. An airline president is tracking late arrivals and believes that the proportion is at most $p_0$. Suppose that a sample of 200 flights is selected and 82 were late. Do the following:
a) To find $p_0$, take the third digit of your Social Security Number, divide by 100 and add it to .30. For example, my Social Security Number is 265398248 and the third digit is 5, so the value of $p_0$ that you would use is 30 + 5% or .35. (No point credit for this section.)
b) Using the value of $p_0$ that you found in a), prepare to conduct a test to determine whether there is evidence that the proportion is at most $p_0$ by stating your null and alternative hypotheses. (1)
c) Find a critical value for the sample proportion for the hypotheses in b), using a significance level of 1%. Specify what your ‘reject’ region is and use it to test the null hypothesis. (2)
d) (Extra credit) Assume that the actual population proportion is $p_0 + .03$. (Since my $p_0$ was .35, I would assume that $p_1$ was .38.) Find the power of the test in c). If I had used a lower significance level, explain whether the power would be higher, lower, or the same. (3.5)
e) Compute a test ratio for the hypotheses in c) and test the hypotheses using a significance level of 1%. (2)
f) Use your test ratio in e) to get a p-value for the hypothesis in c) and explain whether and why you would reject the null hypothesis if the significance level was 3%. (2)
g) Test the hypotheses in c) using an appropriate confidence interval and a significance level of 1%. (2)
h) If you were doing a 2-sided 99% confidence interval for the proportion of flights that were late and wanted the proportion to be known within $\pm .01$, how large a sample would you use if you expected the proportion to be about 20%?
What if you thought the proportion was about 4%? (2)
i) (Extra credit) Do a power curve for the test in c), using a few carefully chosen values of $p_1$ that are above your $p_0$. (4.5)

Solution: See 252y0313a. Take only the pages you need!

2. Robert N Carver presents us with a data set representing departure delays for flights between JFK and LAX airports. The sample of 52 flights gives us a sample mean delay of 2.21 minutes. Assume a population standard deviation with a value of 10 + the third digit of your Social Security number. (Example: My Social Security Number is 265398248 and the third digit is 5, so the population standard deviation will be $10 + 5 = 15$.) Test the assertion that the mean delay is less than 2.5 minutes. (Use a significance level of 10% in this problem.)
a) State your null and alternative hypotheses. (1)
b) Find a critical value of the sample mean and specify your ‘reject’ region. Make a diagram. (1)
c) Do you reject the null hypothesis? Use your diagram to show why. (1)
d) Create a power curve for this test. (Note that negative values of the mean are not impossible and would indicate an early departure.) (6)
e) Assume that you want a 90% 2-sided confidence interval for the mean delay, with an error of $\pm 1.0$. How large a sample would you need? (2)
f) Find a p-value for the null hypothesis. (2)

Solution: See 252y0313b. Take only the pages you need!
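The sample-size parts of the take-home problems (1h and 2e) can be explored with a minimal Python sketch. It assumes the usual large-sample formulas $n = p(1-p)\left(z_{\alpha/2}/e\right)^2$ for a proportion and $n = \left(z_{\alpha/2}\,\sigma/e\right)^2$ for a mean, rounded up. It is not the worked solution in 252y0313a or 252y0313b. The function names are illustrative, and $\sigma = 15$ is only the example value given in the problem statement; your own value depends on your Social Security Number.

```python
from math import ceil
from statistics import NormalDist


def z_two_sided(conf):
    """z value that leaves (1 - conf) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def n_for_proportion(p, error, conf):
    """Sample size so a two-sided interval for a proportion has half-width 'error'."""
    z = z_two_sided(conf)
    return ceil(p * (1 - p) * (z / error) ** 2)


def n_for_mean(sigma, error, conf):
    """Sample size so a two-sided z interval for a mean has half-width 'error'."""
    z = z_two_sided(conf)
    return ceil((z * sigma / error) ** 2)


# Problem 1h: 99% interval, error of .01, guessed proportions of 20% and 4%
n_p20 = n_for_proportion(0.20, 0.01, 0.99)
n_p04 = n_for_proportion(0.04, 0.01, 0.99)

# Problem 2e: 90% interval, error of 1.0, example sigma of 15 (assumed value)
n_mean = n_for_mean(15, 1.0, 0.90)

print(n_p20, n_p04, n_mean)
```

A guessed proportion closer to .5 makes $p(1-p)$ larger, which is why the 20% guess needs a much larger sample than the 4% guess.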