252y0211 2/27/02
ECO252 QBA2 FIRST HOUR EXAM February 19, 2002
Name _____Key______________ Hour of class registered _____ Class attended if different ____

Show your work! Make Diagrams! How many of you looked at "Things You Should Never Do" before this exam?

I. (14 points) Do all the following. $x \sim N(9, 6)$

1. $P(14 \le x \le 28.2) = P\left(\frac{14-9}{6} \le z \le \frac{28.2-9}{6}\right) = P(0.83 \le z \le 3.20) = P(0 \le z \le 3.20) - P(0 \le z \le 0.83) = .4993 - .2967 = .2026$

2. $P(-7 \le x \le 9) = P\left(\frac{-7-9}{6} \le z \le \frac{9-9}{6}\right) = P(-2.67 \le z \le 0) = .4962$

3. $P(-13 \le x \le 13) = P\left(\frac{-13-9}{6} \le z \le \frac{13-9}{6}\right) = P(-3.67 \le z \le 0.67) = P(-3.67 \le z \le 0) + P(0 \le z \le 0.67) = .4999 + .2486 = .7485$

4. $P(-13 \le x \le 0) = P\left(\frac{-13-9}{6} \le z \le \frac{0-9}{6}\right) = P(-3.67 \le z \le -1.50) = P(-3.67 \le z \le 0) - P(-1.50 \le z \le 0) = .4999 - .4332 = .0667$

5. $F(17)$ (The cumulative probability up to 17) $= P(x \le 17) = P\left(z \le \frac{17-9}{6}\right) = P(z \le 1.33) = P(z \le 0) + P(0 \le z \le 1.33) = .5 + .4082 = .9082$

6. A symmetrical interval about the mean with 58% probability. We want two points $x_{.79}$ and $x_{.21}$, so that $P(x_{.79} \le x \le x_{.21}) = .5800$. From the diagram, if we replace $x$ by $z$, $P(0 \le z \le z_{.21}) = .2900$. The closest we can come is $P(0 \le z \le 0.81) = .2910$. So $z_{.21} = 0.81$, and $x = \mu \pm z_{.21}\sigma = 9 \pm 0.81(6) = 9 \pm 4.86$, or 4.14 to 13.86. To check this note that $P(4.14 \le x \le 13.86) = P\left(\frac{4.14-9}{6} \le z \le \frac{13.86-9}{6}\right) = P(-0.81 \le z \le 0.81) = 2P(0 \le z \le 0.81) = 2(.2910) = .5820 \approx 58\%$

7. $x_{.095}$. We want a point $x_{.095}$, so that $P(x \ge x_{.095}) = .095$. (This is the 90.5th percentile.) From the diagram, if we replace $x$ by $z$, $P(0 \le z \le z_{.095}) = .4050$. The closest we can come is $P(0 \le z \le 1.31) = .4049$. So $z_{.095} = 1.31$, and $x = \mu + z_{.095}\sigma = 9 + 1.31(6) = 9 + 7.86$, or 16.86. To check this note that $P(x \ge 16.86) = P\left(z \ge \frac{16.86-9}{6}\right) = P(z \ge 1.31) = P(z \ge 0) - P(0 \le z \le 1.31) = .5 - .4049 = .0951 \approx .095$

II. (6 points, 2 point penalty for not trying part a.) A new product is tried on seven patients. Their breathing capacity after using the product is shown below. (Note: You may want to move the decimal point to the left and work in thousands.)

Patient     1     2     3     4     5     6     7
Capacity    2850  2380  2800  2860  2300  2650  2640

a. Compute the sample standard deviation, $s$, of the breathing capacity. Show your work! (3)
b.
Compute a 90% confidence interval for the mean breathing capacity, $\mu$. (3)

Solution: a)

        Original data              Data divided by 1000
Row     x        x^2               x        x^2
1       2850     8122500           2.85     8.1225
2       2380     5664400           2.38     5.6644
3       2800     7840000           2.80     7.8400
4       2860     8179600           2.86     8.1796
5       2300     5290000           2.30     5.2900
6       2650     7022500           2.65     7.0225
7       2640     6969600           2.64     6.9696
Sum     18480    49088600          18.48    49.0886

Original data: $n = 7$, $\sum x = 18480$, $\sum x^2 = 49088600$, so $\bar x = \frac{\sum x}{n} = \frac{18480}{7} = 2640$ and $s^2 = \frac{\sum x^2 - n\bar x^2}{n-1} = \frac{49088600 - 7(2640)^2}{6} = 50233.33$, or $s = 224.13$.

In thousands: $n = 7$, $\sum x = 18.48$, $\sum x^2 = 49.0886$, so $\bar x = \frac{18.48}{7} = 2.640$ and $s^2 = \frac{49.0886 - 7(2.640)^2}{6} = 0.05023333$, or $s = 0.22413$.

The standard error is $s_{\bar x} = \frac{s}{\sqrt n} = \sqrt{\frac{50233.33}{7}} = \frac{224.13}{\sqrt 7} = 84.712$, or in thousands $s_{\bar x} = \sqrt{\frac{0.05023333}{7}} = \frac{0.22413}{\sqrt 7} = 0.084712$.

b. From the problem statement $\alpha = .10$. From Table 3 of the syllabus supplement, if the population variance is unknown, $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$ and $t^{(n-1)}_{\alpha/2} = t^{(6)}_{.05} = 1.943$.
So $\mu = 2640 \pm 1.943(84.712) = 2640 \pm 164.6$, or 2475.4 to 2804.6.
In thousands, $\mu = 2.640 \pm 1.943(0.084712) = 2.640 \pm 0.1646$, or 2.4754 to 2.8046.

III. Do at least 3 of the following 4 Problems (at least 10 each) (or do sections adding to at least 30 points. Anything extra you do helps, and grades wrap around). Show your work! State $H_0$ and $H_1$ where appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion. Use a 95% confidence level unless another level is specified.

1. The population mean for similar patients to those mentioned on the previous page who had not used the new medicine was 2628. For your convenience the data are repeated below.

Patient     1     2     3     4     5     6     7
Capacity    2850  2380  2800  2860  2300  2650  2640

Test to see if the mean breathing capacity is now above 2628 using the sample mean and standard deviation you found in part II.
a. State the null and alternate hypotheses. (2)
b. Find a critical value appropriate for this problem, using a confidence level of 90%. (3)
c. Use your critical value to test the hypothesis. State clearly whether you reject the null hypothesis. (2)
d.
Repeat the test using (i) a test ratio (2) and (ii) a confidence interval (2). Each time state clearly whether you reject the null hypothesis and why.
e. Do a 90% two-sided confidence interval for the variance. (3)
f. (Extra credit) Assume that the data does not come from a normal distribution. (i) State a confidence interval for the median using the highest and lowest values and give the confidence level. (4) (ii) Do the same using the second highest and second lowest values and give the confidence level. (3)

Solution: From Table 3 of the Syllabus Supplement:

Interval for:          Mean ($\sigma$ unknown)
Confidence interval:   $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$, $DF = n - 1$
Hypotheses:            $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test ratio:            $t = \frac{\bar x - \mu_0}{s_{\bar x}}$
Critical value:        $\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x}$

a) You were asked to see if mean breathing capacity is now above 2628 (or 2.628 thousand). This gives us $\mu > 2628$. Since this does not contain an equality, it must be an alternative hypothesis, so we have $H_0: \mu \le 2628$ and $H_1: \mu > 2628$.

b) Our facts, from the previous page, are: $n = 7$, $\bar x = 2640$, $s = 224.13$, or better, $s_{\bar x} = 84.712$ (or 0.0847 thousand). In addition, $\alpha = .10$ and $\mu_0 = 2628$. Since our alternate hypothesis is $\mu > 2628$ and is one-sided, our one-sided critical value must be above 2628. Our value of $t$ is $t^{(n-1)}_{\alpha} = t^{(6)}_{.10} = 1.440$. So $\bar x_{cv} = \mu_0 + t_{\alpha}\, s_{\bar x} = 2628 + 1.440(84.712) = 2750.0$, or 2.750 thousand.

c) Make a diagram. Show an almost Normal curve with a center at 2628 and a 10% 'reject' zone above 2750.0. Since $\bar x = 2640$ is not in the 'reject' zone, do not reject $H_0$.

d) (i) The test ratio is $t = \frac{\bar x - \mu_0}{s_{\bar x}} = \frac{2640 - 2628}{84.712} = 0.1417$. Make a diagram. Show an almost Normal curve with a center at 0 and a 10% 'reject' zone above $t^{(6)}_{.10} = 1.440$. Since $t = 0.1417$ is not in the 'reject' zone, do not reject $H_0$.
(ii) The confidence interval will have the same direction as the alternate hypothesis, so it will be $\mu \ge \bar x - t_{\alpha}\, s_{\bar x} = 2640 - 1.440(84.712)$, or $\mu \ge 2518.0$. Make a diagram. Show an almost Normal curve with a center at 2640 and a shaded confidence interval above 2518.
If you now represent $H_0: \mu \le 2628$ by shading the area below 2628, you will have double-shaded the area between 2518 and 2628, so the confidence interval and the null hypothesis do not contradict one another, and we do not reject $H_0$.

e) From the outline or the formula table, the small-sample confidence interval is $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$. The table says $\chi^{2(6)}_{.05} = 12.5916$ and $\chi^{2(6)}_{.95} = 1.6354$. We know that $s^2 = 50233.33$, or 0.050233 million. So $\frac{6(50233.33)}{12.5916} \le \sigma^2 \le \frac{6(50233.33)}{1.6354}$, or $23937 \le \sigma^2 \le 184297$. In millions, these limits would be 0.0239 and 0.184.

f) (i) We can use the binomial tables to answer this. The numbers in order are 2300, 2380, 2640, 2650, 2800, 2850 and 2860. Since $n = 7$ and $p = .5$, the probability that all 7 numbers are above the median or that all 7 numbers are below the median is $2P(x \ge 7) = 2[1 - P(x \le 6)] = 2(1 - .99219) = .01562$. So $P(2300 \le \text{median} \le 2860) = 1 - .01562 = .98438$.
(ii) The probability that 6 or more numbers are above the median or 6 or more are below the median is $2P(x \ge 6) = 2[1 - P(x \le 5)] = 2(1 - .93750) = .125$. So $P(2380 \le \text{median} \le 2850) = 1 - .125 = .875$.

2. You have heard that, though the mean wealth held by US households is at least 270 (thousand dollars), the median wealth held by households is no more than 61 (thousand dollars). You take a sample of 300 households in your state and find that the mean wealth is 210 (thousand) with a sample standard deviation of 707 (thousand). Out of the 300 households, 161 have wealth above 61 (thousand dollars). Use a 95% confidence level.
a. Test the statement about median wealth using a critical value. (4)
b. Test the same statement using a test ratio and a p-value. (3)
c. Test the statement about the mean wealth using a critical value. (3)
d. Test the same statement using a test ratio and a p-value. (3)
e. Do a 95% 2-sided confidence interval for the proportion of households with wealth above 61 (thousand dollars). (3)
f.
How large a sample would I need, if I wanted the proportion in the confidence interval to be correct to $\pm .01$? (3)

Solution: a) The outline explains that hypotheses about a median are hypotheses about a proportion. The correspondence that applies here is below, where $\nu_0$ is the hypothesized median:

Hypotheses about a median:                             $H_0: \text{median} \le \nu_0$, $H_1: \text{median} > \nu_0$
If $p$ is the proportion above $\nu_0$:                $H_0: p \le .5$, $H_1: p > .5$
If $p$ is the proportion below $\nu_0$:                $H_0: p \ge .5$, $H_1: p < .5$

Since we are told that the median wealth is no more than 61 and the proportion above 61 is 161 out of 300, our hypotheses are $H_0: \text{median} \le 61$, $H_1: \text{median} > 61$, and equivalently $H_0: p \le .5$, $H_1: p > .5$. Note that $n = 300$ and $x = 161$, so that $\bar p = \frac{x}{n} = \frac{161}{300} = .53667$.

From the formula table we have:

Interval for:          Proportion
Confidence interval:   $p = \bar p \pm z_{\alpha/2}\, s_p$, where $s_p = \sqrt{\frac{\bar p \bar q}{n}}$ and $\bar q = 1 - \bar p$
Hypotheses:            $H_0: p = p_0$, $H_1: p \ne p_0$
Test ratio:            $z = \frac{\bar p - p_0}{\sigma_p}$
Critical value:        $\bar p_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_p$, where $\sigma_p = \sqrt{\frac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$

And we can also use the sign test formula, $z = \frac{2x - n}{\sqrt n}$.

$\alpha = .05$, so we use $z_\alpha = z_{.05} = 1.645$. The standard deviation of the sample proportion is $\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.5(.5)}{300}} = .02887$, so the critical value is $\bar p_{cv} = p_0 + z_\alpha \sigma_p = .5 + 1.645(.02887) = .5475$. Make a diagram of a normal curve with a mean at .5 and a reject zone above .5475. Since $\bar p = .53667$ is not in the 'reject' zone, do not reject $H_0$.

b) If we use a test ratio we get $z = \frac{\bar p - p_0}{\sigma_p} = \frac{.53667 - .5}{.02887} = 1.270$, or, from the sign test formula, $z = \frac{2x - n}{\sqrt n} = \frac{2(161) - 300}{\sqrt{300}} = 1.270$. To get a p-value use $pval = P(z \ge 1.27) = .5 - .3980 = .1020$. Since $pval > .05$, do not reject $H_0$.

c) Our hypotheses are now $H_0: \mu \ge 270$ and $H_1: \mu < 270$. The problem says $n = 300$, $\bar x = 210$, $s = 707$ and $\alpha = .05$. $DF = n - 1 = 299$, so we can use $z_\alpha = z_{.05} = 1.645$ in place of $t$. From the formula table we have:

Interval for:          Mean ($\sigma$ unknown)
Confidence interval:   $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$, $DF = n - 1$
Hypotheses:            $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test ratio:            $t = \frac{\bar x - \mu_0}{s_{\bar x}}$
Critical value:        $\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x}$

$s_{\bar x} = \frac{s}{\sqrt n} = \frac{707}{\sqrt{300}} = 40.82$. Because of the alternate hypothesis, we want a critical value below 270, so we use $\bar x_{cv} = \mu_0 - t_\alpha\, s_{\bar x} = 270 - 1.645(40.82) = 202.85$. Make a diagram of a normal curve with a mean at 270 and a reject zone below 202.85.
Since $\bar x = 210$ is not in the 'reject' zone, do not reject $H_0$.

d) If we use a test ratio we get $t = \frac{\bar x - \mu_0}{s_{\bar x}} = \frac{210 - 270}{40.82} = -1.470$. To get a p-value use $pval = P(z \le -1.47) = .5 - .4292 = .0708$. Since $pval > .05$, do not reject $H_0$. It would also be acceptable to note that 1.470 is between $t_{.10} = 1.282$ and $t_{.05} = 1.645$, so that the p-value is between .05 and .10.

e) Using the confidence interval formula from a) above with $z_{\alpha/2} = z_{.025} = 1.960$ and $s_p = \sqrt{\frac{\bar p \bar q}{n}} = \sqrt{\frac{.53667(.46333)}{300}} = .0288$, we get $p = \bar p \pm z_{\alpha/2}\, s_p = .53667 \pm 1.960(.0288) = .537 \pm .056$, or .480 to .593.

f) From the outline, $n = \frac{\bar p \bar q z^2}{e^2} = \frac{.53667(.46333)(1.960)^2}{(.01)^2} = 9552.3$. So use 9553 or more. It would also be acceptable to use $p = q = .5$ in this formula to get a slightly larger sample size.

Most of you did not state hypotheses in this problem, making grading it a nightmare!

3. In the previous problem you tested whether the mean wealth held by US households is at least 270 (thousand dollars), and, based on the fact that the sample standard deviation is 707 (thousand dollars), created a critical value for the sample mean. Since the sample was so large, you can assume that 707 is the population standard deviation. (The sample mean is still 210.)
a. Assume that the mean wealth is actually 260 (thousand dollars). Using the critical value you found on the last page, what is the power of the test? (3)
b. Find the power curve for your test, showing the necessary calculations. (6)
c. Do a 2-sided 95% confidence interval for the mean in the problem on the last page and figure out how large a sample you would need to get the error part of the confidence interval down to $\pm 10$ (thousand dollars). (4)
d. Do a 2-sided 58% confidence interval for the mean, assuming that your sample of 300 came from a population of 1000. (3)
e. (Extra credit) If you place the 300 values in the sample in order, which numbers would you use to find a 95% confidence interval for the median? (4)

Solution: a) Our hypotheses were $H_0: \mu \ge 270$ and $H_1: \mu < 270$.
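The problem said $n = 300$, $\bar x = 210$, $s = 707$ and $\alpha = .05$. $DF = n - 1 = 299$, so we can use $z_\alpha = z_{.05} = 1.645$ in place of $t$. $s_{\bar x} = \frac{s}{\sqrt n} = \frac{707}{\sqrt{300}} = 40.82$. Because of the alternate hypothesis, we wanted a critical value below 270, so we used $\bar x_{cv} = \mu_0 - t_\alpha\, s_{\bar x} = 270 - 1.645(40.82) = 202.85$. We will not reject $H_0$ if $\bar x \ge 202.85$. This problem says that $\mu_1 = 260$. From the outline, $\beta = P(\text{'Accepting' } H_0 \mid H_0 \text{ is false})$. Then
$\beta = P(\bar x \ge 202.85 \mid \mu_1 = 260) = P\left(z \ge \frac{\bar x_{cv} - \mu_1}{\sigma_{\bar x}}\right) = P\left(z \ge \frac{202.85 - 260}{40.82}\right) = P(z \ge -1.40) = .5 + .4192 = .9192$.
Power $= 1 - \beta = 1 - .9192 = .0808$. Pathetic!

b) Half of the distance between 270 and 202.85 is, roughly, 34, so I found the power at $\mu_1 = 270$, $\mu_1 = 236$, $\mu_1 = 202.85$, $\mu_1 = 169$, and $\mu_1 = 135$. Note that, in general, for this one-sided hypothesis $\beta = P\left(z \ge \frac{\bar x_{cv} - \mu_1}{\sigma_{\bar x}}\right)$.

$\mu_1 = 270$: From previous examples we know that $\beta = 1 - \alpha = .95$, so power $= 1 - \beta = .05$.
$\mu_1 = 236$: $\beta = P(\bar x \ge 202.85 \mid \mu_1 = 236) = P\left(z \ge \frac{202.85 - 236}{40.82}\right) = P(z \ge -0.81) = .5 + .2910 = .7910$, so power $= 1 - \beta = .2090$.
$\mu_1 = 202.85$: From previous examples, we know that at the critical value $\beta = .5$, so power $= 1 - \beta = .5$.
$\mu_1 = 169$: $\beta = P(\bar x \ge 202.85 \mid \mu_1 = 169) = P\left(z \ge \frac{202.85 - 169}{40.82}\right) = P(z \ge 0.83) = .5 - .2967 = .2033$, so power $= 1 - \beta = .7967$.
$\mu_1 = 135$: $\beta = P(\bar x \ge 202.85 \mid \mu_1 = 135) = P\left(z \ge \frac{202.85 - 135}{40.82}\right) = P(z \ge 1.66) = .5 - .4515 = .0485$, so power $= 1 - \beta = .9515$.

You now have the power for 5 points below or equal to 270. Make a diagram.

c) From the last question $\bar x = 210$, $s = 707$, $s_{\bar x} = 40.82$ and $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}$. $\alpha = .05$, and $z_{\alpha/2} = z_{.025} = 1.960$. Again, we are using $z$ in place of $t$ because of the large number of degrees of freedom. $\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x} = 210 \pm 1.960(40.82) = 210 \pm 80.00$, or 130.0 to 290.0. From the outline, $n = \frac{z^2 \sigma^2}{e^2} = \frac{(1.960)^2 (707)^2}{10^2} = 19202.2$. So use a sample of size 19203 or larger.

d) The confidence level is 58% and $\alpha = 1 - .58 = .42$. From the first page $z_{\alpha/2} = z_{.21} = 0.81$. From the outline, if $n > .05N$, use $\sigma_{\bar x} = \frac{\sigma}{\sqrt n}\sqrt{\frac{N - n}{N - 1}}$. Since $N = 1000$ and $n = 300$, we compute $\sigma_{\bar x} = \frac{707}{\sqrt{300}}\sqrt{\frac{1000 - 300}{1000 - 1}} = 34.168$. $\mu = \bar x \pm z_{\alpha/2}\, \sigma_{\bar x} = 210 \pm 0.81(34.168) = 210 \pm 27.7$, or 182.3 to 237.7.

e) The formula from the outline was $k = \frac{n + 1 - z_{\alpha/2}\sqrt n}{2} = \frac{300 + 1 - 1.960\sqrt{300}}{2} \approx 134$. We will put the numbers in order and use the 134th from the bottom and the 134th from the top (the 167th number).

4.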
According to Ronald Weirs, the Veterans Administration closed the cardiac units of VA hospitals that had mortality rates above 5%. In one hospital, there were 7 deaths in 102 operations (consider this a sample proportion). Using a 99% confidence level, can you say that the hospital's mortality rate (proportion that died) was significantly above 5%? (Be specific about your significance level, especially in parts e and f.)
a. State your null and alternate hypotheses. (2)
b. Do the problem using a critical value for the proportion that died. State clearly whether you reject the null hypothesis and why. Does this mean that the mortality rate was significantly above 5%? (3)
c. Do the problem using a test ratio. (2)
d. Do the problem using an appropriate confidence interval. (3)
e. According to Weirs, a high school running back can expect an average of 8 injuries per 100 games. A Poisson distribution applies. You believe that your league is particularly rough. In the last 100 games there were 15 injuries. Formulate a test of your belief and tell me if you are right. Explain why. (4)
f. If, instead, you had data for 1000 games, what would be a critical value for the number of injuries? (3)

Solution: a) In parts a) through d) we are doing tests on a proportion. The material from the formula table is quoted in question 2. The important words in the question for hypothesis formulation are "Can you say that the hospital's mortality rate (proportion that died) was significantly above 5%?" This translates as $H_1: p > .05$. It is an alternate hypothesis because it does not contain an equality. The null hypothesis is thus $H_0: p \le .05$. The problem says that $\alpha = .01$, $n = 102$ and $x = 7$, so that $\bar p = \frac{x}{n} = \frac{7}{102} = .0687$. $\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.05(.95)}{102}} = .02158$. This is a one-sided test and $z_\alpha = z_{.01} = 2.327$.

b) Since the alternative hypothesis says $p > .05$, we need a critical value that is above .05. We use $\bar p_{cv} = p_0 + z_\alpha \sigma_p = .05 + 2.327(.02158) = .1002$.
Make a diagram of a normal curve with a mean at .05 and a reject zone above .1002. Since $\bar p = .0687$ is not in the 'reject' zone, do not reject $H_0$. We cannot say that the mortality rate is significantly above 5%.

c) The test ratio is $z = \frac{\bar p - p_0}{\sigma_p} = \frac{.0687 - .05}{.02158} = 0.867$. Make a diagram of a normal curve with a mean at zero and a reject zone above $z_\alpha = z_{.01} = 2.327$. Since $z = 0.867$ is not in the 'reject' zone, do not reject $H_0$. We cannot say that the mortality rate is significantly above 5%.

d) To do a confidence interval we need $s_p = \sqrt{\frac{\bar p \bar q}{n}} = \sqrt{\frac{.0687(.9313)}{102}} = 0.025045$. To make the 2-sided confidence interval, $p = \bar p \pm z_{\alpha/2}\, s_p$, into a 1-sided interval, go in the same direction as $H_1: p > .05$. We get $p \ge \bar p - z_\alpha s_p = .0687 - 2.327(.025045) = .0687 - .0583 = .0104$. To see that $p \ge .0104$ does not contradict $H_0: p \le .05$, make a diagram. Represent the confidence interval under a normal curve with a mean at .0687 by shading the area above .0104. Represent the null hypothesis by shading the area below .05. These areas overlap, indicating that both can be true at the same time.

e) Our null hypothesis is $H_0$: Poisson(mean = 8). The alternative is most safely stated as $H_1$: not Poisson(mean = 8), though we will test for a mean greater than 8. According to the problem $x = 15$. The easiest way to do this is to assume that the null hypothesis is true and to compute a p-value, which would be defined as $P(x \ge 15)$. Using the Poisson table with a parameter of 8, we find that $pval = P(x \ge 15) = 1 - P(x \le 14) = 1 - .98274 = .01726$. If we use a significance level of $\alpha = .05$, we reject $H_0$, since the p-value is below the significance level, but if we use a significance level of $\alpha = .01$, we do not reject $H_0$, since the p-value is above the significance level. If you rejected the null hypothesis, you can say that this is a rough league.

f) In the unlikely case that you have data on 1000 games instead of 100, you would expect a Poisson distribution with a mean of 10 times 8, or 80. Our hypotheses are now $H_0$: Poisson(mean = 80).
The alternative is most safely stated as $H_1$: not Poisson(mean = 80). The question is 'how far above 80 can $x$ be before we reject the null hypothesis?' We do not have a table for the Poisson distribution with a mean of 80, but we do know that this distribution is very similar to a Normal distribution with a mean of 80 and a standard deviation of $\sqrt{80}$. We know that the probability of $x$ being below $\mu + z_\alpha \sigma = 80 + z_\alpha\sqrt{80}$ is $1 - \alpha$. These numbers imply that:
If $\alpha = .05$, $x_{cv} = 80 + 1.645\sqrt{80} = 94.71$. We reject $H_0$ if $x$ is 95 or larger.
If $\alpha = .01$, $x_{cv} = 80 + 2.327\sqrt{80} = 100.81$. We reject $H_0$ if $x$ is 101 or larger.
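
The Poisson calculations in parts e) and f) of problem 4 can be checked numerically. The following is a minimal sketch in Python using only the standard library; the helper names `poisson_cdf` and `poisson_upper_critical_value` are hypothetical and are not part of the exam or the outline. It sums the Poisson probabilities directly instead of reading a table, and it takes $z_\alpha$ from the exact Normal quantile, which for $\alpha = .01$ is slightly smaller than the tabled 2.327 used above.

```python
import math
from statistics import NormalDist


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable, summing the terms one by one."""
    term = math.exp(-mean)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i        # P(X = i) from P(X = i - 1)
        total += term
    return total


def poisson_upper_critical_value(mean, alpha):
    """Normal approximation to the upper critical value: mean + z_alpha * sqrt(mean)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-sided Normal quantile
    return mean + z_alpha * math.sqrt(mean)


# Part e): p-value for 15 or more injuries when the mean is 8.
p_value = 1 - poisson_cdf(14, 8)
print("p-value for x >= 15:", round(p_value, 5))

# Part f): critical values for 1000 games (mean 80) at both significance levels.
for alpha in (0.05, 0.01):
    x_cv = poisson_upper_critical_value(80, alpha)
    print("alpha =", alpha, "critical value =", round(x_cv, 2),
          "reject if x >=", math.ceil(x_cv))
```

Rounding each critical value up to the next whole number gives the smallest injury count that leads to rejection, which should agree with the cutoffs of 95 and 101 stated in part f).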