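252soln0 2/3/00

PROBLEM A2. If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$

SOLUTION: Use the formulas from Table 3 of the syllabus supplement or from the outline.

a. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = .7875$ and $z_{\alpha/2} = z_{.025} = 1.960$, so $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.7875) = 11.50 \pm 1.54$, or 9.96 to 13.04.

b. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.7875\sqrt{\frac{236}{299}} = 0.7875(.8884) = .6996$ and $z_{\alpha/2} = z_{.025} = 1.960$, so $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.6996) = 11.50 \pm 1.37$, or 10.13 to 12.87.

c. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = .7875$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$, so $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(.7875) = 11.50 \pm 1.57$, or 9.93 to 13.07.

d. $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = .6996$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$, so $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(.6996) = 11.50 \pm 1.40$, or 10.10 to 12.90.

PROBLEM A3. In a study of a grain market in an African country we want to figure out how large a sample we must take to find a daily average price for a grain transaction. (Assume a standard deviation of 5 cents.)
a. We want a 99% confidence interval for the mean with an error of ±1 cent.
b. What if the error is to be ±1/2 cent?

SOLUTION: We use the formula $n = \frac{z^2\sigma^2}{e^2}$, where $z = z_{\alpha/2} = z_{.005}$ since $\alpha = .01$.

a. We are told that the maximum error must be $e = 1$ (or $e = .01$) and that $\sigma = 5$ (or $.05$). From the t table, $z_{.005} = 2.576$, so that $n = \frac{z^2\sigma^2}{e^2} = \frac{2.576^2\,5^2}{1^2} = 165.89$. Since we always round this quantity up, use a sample size of at least 166. Note that if we use $n = 165$, we find that (if we assume that $\bar{x} = 90$) $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{165}} = 90 \pm 1.003$. The error term will be slightly above 1. However, if we use $n = 166$, $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{166}} = 90 \pm 1.000$.

b. This time the maximum allowable error is $e = 0.5$, so $n = \frac{z^2\sigma^2}{e^2} = \frac{2.576^2\,5^2}{0.5^2} = 663.57$ and we must use a sample size of 664. Note that this sample size is four times the size in part a.

PROBLEM A4. If $s = 15$, find a 95% confidence interval for $\sigma$ if a) $n = 26$, b) $n = 99$.

SOLUTION: Use the formulas from Table 3 of the syllabus supplement or from the outline.

a.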
This is a small sample ($n = 26$), so use $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$. Since the degrees of freedom are $n - 1 = 26 - 1 = 25$, $\chi^{2(25)}_{.025} = 40.6466$ and $\chi^{2(25)}_{.975} = 13.1197$, the interval becomes $\frac{25(15)^2}{40.6466} \le \sigma^2 \le \frac{25(15)^2}{13.1197}$ or $138.388 \le \sigma^2 \le 428.745$. Since an interval for the standard deviation was requested, take the square root of both sides: $11.76 \le \sigma \le 20.71$.

b. Since the degrees of freedom are $n - 1 = 99 - 1 = 98$ and are too large for the chi-square table, use $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$. Since $\sqrt{2DF} = \sqrt{2(98)} = \sqrt{196} = 14$ and $z_{\alpha/2} = z_{.025} = 1.960$, the formula becomes $\frac{15(14)}{1.960 + 14} \le \sigma \le \frac{15(14)}{-1.960 + 14}$ or $13.158 \le \sigma \le 17.442$. Note that due to the larger sample size, this interval is smaller than the one in a.

PROBLEM A5.
a. Find the confidence level for an interval for the median using binomial tables, if from a sample of 12 we take the third observation from both ends.
b. Do the same for the 19th observation from both ends in a sample of 50.
c. Do the same for an interval using the 10th observation from both ends in a sample of 40, using the normal approximation to the binomial distribution.
d. In part c, try to find a 95% confidence interval for the median.

SOLUTION: a) If we take the third number from both the bottom and the top of the data, we get the interval $x_3 \le \eta \le x_{10}$ from the ordered numbers $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}, x_{11}, x_{12}$. For example, if the numbers are 13.2, 17.1, 18.5, 21.3, 21.4, 22.0, 27.1, 27.7, 28.9, 29.2, 35.4, 35.9, we would say that the interval is $18.5 \le \eta \le 29.2$. To find the confidence level, first find the significance level $\alpha$, the probability that the interval is wrong. The interval will be wrong if (i) $x_3$ through $x_{10}$ are all below the median or (ii) $x_3$ through $x_{10}$ are all above the median. The probabilities of these two events are the same, so we can figure out the probability that $x_3$ through $x_{10}$ are all below the median and double it.
The probability of any given number being below (or above) the median is 0.5, and the probability that $x_3$ through $x_{10}$ are all below the median is the probability that the first 10 or more numbers are all below the median, and is the same as the probability of getting ten or more heads in twelve flips of a coin.

From the binomial table for $n = 12$ and $p = .5$, we find that $P(x \ge 10) = 1 - P(x \le 9) = 1 - .98071 = .01929$. But note that the binomial distribution with $p = .5$ is symmetrical, so that $P(x \ge 10) = P(x \le 2)$. Also remember that we stated in the previous paragraph that to get the significance level, we must double this probability, so that $\alpha = 2(.01929) = .03858$. Thus the confidence level is $1 - \alpha = 1 - 2(.01929) = .96142$. More generally, if $k$ is the index of the number at the bottom of the confidence interval (in the case we just did, $k = 3$), the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1)$.

b) If we take a sample of $n = 50$, put it in order, and then pick the 19th number ($k = 19$) from both the top and the bottom, so that the confidence interval is $x_{19} \le \eta \le x_{32}$, the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2P(x \le 19 - 1) = 1 - 2P(x \le 18) = 1 - 2(.03245) = .93510$.

This can also be done using the Normal distribution. If we ignore the continuity correction, and recall that for the binomial distribution with $p = .5$ and $q = 1 - p = .5$, $\mu = np = .5n$ and $\sigma^2 = npq = n(.5)(.5) = .25n$, then $P(x \le k - 1) \approx P\left(z \le \frac{k - 1 - .5n}{\sqrt{.25n}}\right) = P\left(z \le \frac{18 - .5(50)}{.5\sqrt{50}}\right) = P(z \le -1.98) = .5 - .4761 = .0239$ and $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0239) = .9522$. This looks way off, so try the same problem with a continuity correction: $P(x \le k - 1) = P(x \le 18) \approx P\left(z \le \frac{18 + .5 - .5(50)}{.5\sqrt{50}}\right) = P(z \le -1.84) = .5 - .4671 = .0329$ and the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0329) = .9342$.

c) If $n = 40$ and $k = 10$ we have no binomial table, so use the normal approximation to the binomial distribution with a continuity correction. $P(x \le k - 1) \approx P\left(z \le \frac{k - 1 + .5 - .5n}{\sqrt{.25n}}\right) = P\left(z \le \frac{9 + .5 - .5(40)}{.5\sqrt{40}}\right) = P(z \le -3.32) = .5 - .4995 = .0005$ and the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0005) = .9990$.

d) If we want a 95% confidence interval and $n = 40$, we require that $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.025) = .95$.
This means that $P(x \le k - 1) \approx P\left(z \le \frac{k - 1 + .5 - .5n}{\sqrt{.25n}}\right) = .025$. But since $z_{.025} = 1.960$, we know that $P(z \le -1.960) = .025$. So we can say that $\frac{k - 1 + .5 - .5n}{\sqrt{.25n}} = -1.960$. Solve this with $n = 40$, or note that $k - 1 + .5 - .5n = -1.960\sqrt{.25n}$ and, solving for $k$, we find $k = .5 + .5n - 1.960\sqrt{.25n}$. If we substitute $n = 40$, $k = .5 + .5(40) - 1.960\sqrt{.25(40)} = 20.5 - 1.960\sqrt{10} = 20.5 - 6.20 = 14.30$. We could also follow the formula in the outline that says $k = \frac{n + 1 - z_{\alpha/2}\sqrt{n}}{2} = \frac{40 + 1 - 1.960\sqrt{40}}{2} = 14.30$. Obviously $k$ must be a whole number and the more conservative choice would be to round it down, so that the interval is $x_{14} \le \eta \le x_{27}$.

PROBLEM B.1 A firm claims that its median wage is \$32000. The union claims that the median ($\eta$) is lower. A random sample of 100 employees shows that 40% are above \$32000. Set this up as two hypotheses and test with a significance level of 5%.

SOLUTION: We always replace a hypothesis about a median with a hypothesis about a proportion in the sign test. The statement implicit in the above is that the median is at least \$32000. Since we have the number over \$32000, let $p$ be the proportion over \$32000. Then $H_0: \eta \ge 32000$, $H_1: \eta < 32000$ becomes $H_0: p \ge .5$, $H_1: p < .5$. Note that $\alpha = .05$, $n = 100$, $x = 40$ and $\bar{p} = \frac{x}{n} = .40$, so that $\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.5(.5)}{100}} = .05$. Note that a continuity correction has been added to all of these solutions. It has the effect of making the "accept" region larger by 0.5 in terms of $x$ or $\frac{.5}{n}$ in terms of $\bar{p}$.

(i) Critical Value Method: $p_{cv} = p_0 - z_{\alpha}\,\sigma_p - \frac{.5}{n} = .5 - 1.645(.05) - \frac{.5}{100} = .5 - .08225 - .005 = .41275$. If $\bar{p}$ is below this critical value we reject $H_0$. Since .40 is below .41275, reject $H_0$.

(ii) Test Ratio Method: There are three possible versions. In all those below, the rule frequently used is: where $\pm$ appears, use $+$ if $x < \frac{n}{2}$ and $-$ if $x > \frac{n}{2}$.

$z = \frac{\bar{p} \pm \frac{.5}{n} - p_0}{\sigma_p}$: $P(\bar{p} \le .40) = P\left(z \le \frac{.40 + \frac{0.5}{100} - .5}{.05}\right) = P(z \le -1.90) = .5 - .4713 = .0287$

$z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$: $P(x \le 40) = P\left(z \le \frac{40 + 0.5 - 100(.5)}{\sqrt{100(.5)(.5)}}\right) = P(z \le -1.90) = .0287$

$z = \frac{2x \pm 1 - n}{\sqrt{n}}$: $P(x \le 40) = P\left(z \le \frac{2(40) + 1 - 100}{\sqrt{100}}\right) = P(z \le -1.90) = .0287$

In each case, the p-value is .0287.
Since $\alpha = .05$, p-value $< \alpha$ and we reject $H_0$.

PROBLEM B.2 We are testing that the median is 14. Let $x$ be the number of items above 14. From a sample of size $n = 30$, we find $x = 25$. Use $p$ for the proportion of the population over 14 and $\bar{p}$ for the proportion of the sample over 14. a) Test $\eta = 14$ b) Test $\eta > 14$ c) Test $\eta < 14$. Assume $\alpha = .05$.

SOLUTION: Note that $\bar{p} = \frac{25}{30} = .8333$.

a) $H_0: \eta = 14$, $H_1: \eta \ne 14$ becomes $H_0: p = .5$, $H_1: p \ne .5$. If we use the critical value method, $p_{cv} = p_0 \pm z_{\alpha/2}\sqrt{\frac{p_0 q_0}{n}} \pm \frac{0.5}{n} = .5 \pm 1.96\sqrt{\frac{.5(.5)}{30}} \pm \frac{.5}{30} = .5 \pm .179 \pm .017 = .5 \pm 0.196$, or .304 to .696. Since .8333 is not in this interval, reject $H_0$. We are probably better off using the test ratio method, with $z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$. Here $np_0 = 30(.5) = 15$ and $np_0 q_0 = 15(.5) = 7.5$. So p-value $= 2P(x \ge 25) = 2P\left(z \ge \frac{24.5 - 15}{\sqrt{7.5}}\right) = 2P(z \ge 3.47) = 2(.5 - .4997) = 2(.0003) = .0006$. Since this is below the significance level, reject $H_0$.

b) $H_0: \eta \le 14$, $H_1: \eta > 14$ becomes $H_0: p \le .5$, $H_1: p > .5$. In this case p-value $= P(x \ge 25) = P\left(z \ge \frac{24.5 - 15}{\sqrt{7.5}}\right) = .0003$. Since this is below the significance level, reject $H_0$.

c) $H_0: \eta \ge 14$, $H_1: \eta < 14$ becomes $H_0: p \ge .5$, $H_1: p < .5$. In this case, it is possible to have many items over 14 and for $H_0$ still to be true. p-value $= P(x \le 25) = P\left(z \le \frac{25.5 - 15}{\sqrt{7.5}}\right) = P(z \le 3.83) = .5 + .4999 = .9999$. Since this is above the significance level, do not reject $H_0$.

PROBLEM B.3 A bank's average default rate on loans is supposedly 6 per month. In the first month there are 12 defaults. Test the first assertion assuming a Poisson distribution. Use a two-sided test with a 5% significance level.

SOLUTION: $H_0$: Poisson(6), $H_1$: not Poisson(6). Though it is possible to put together a rejection region, the easiest way to do this is to use the Poisson(6) table and a p-value approach. If we look up the probability that $x$ is 12 or larger we find: p-value $= 2P(x \ge 12) = 2(1 - P(x \le 11)) = 2(1 - .9799) = 2(.0201) = .0402$. Since p-value $< \alpha$, reject $H_0$.

PROBLEM B.4
a. I claim that $x$ is binomially distributed with $p = .01$. Test this assertion using a 2-sided 5% test if there are 3 successes in 10 trials.
b.
Test for a binomial distribution with $p = .10$ when $n = 10$ and $x = 4$.
c. If $n = 100$ and $x = 9$, test to see if $p$ is at least 0.4.
d. Calls coming into a switchboard in an hour presumably have a Poisson distribution with a mean of 144. Test this hypothesis if, in a given hour, 200 calls come in.

SOLUTION:
a. If we assume that $x$ has the Binomial distribution, our hypotheses are $H_0$: Binomial $p = .01$, $H_1$: not Binomial $p = .01$. If we have a Binomial table for $p = .01$, note that $np = 10(.01) = .1$, so that our value of $x$ is too large. p-value $= 2P(x \ge 3) = 2(1 - P(x \le 2)) = 2(1 - .99989) = .00022$. This is below the significance level, so reject $H_0$.

b. If $p = .10$ and $n = 10$, $np = 1$, so that $x = 4$ is too large. p-value $= 2P(x \ge 4) = 2(1 - P(x \le 3)) = 2(1 - .9872) = 2(.0128) = .0256$. This is below the significance level, so reject $H_0$.

c. Our hypotheses are now $H_0$: Binomial $p \ge .4$, $H_1$: Binomial $p < .4$. Since $n = 100$ and $x = 9$, $np = 40$ and $x$ is too small. From the binomial table for $p = .4$, p-value $= P(x \le 9) = .00000$, so reject $H_0$. If a table with $n = 100$ is unavailable, use the Normal approximation: p-value $= P(x \le 9) = P(\bar{p} \le .09) \approx P\left(z \le \frac{.09 + \frac{.5}{100} - .4}{\sqrt{\frac{.4(.6)}{100}}}\right) = P\left(z \le \frac{.09 + .005 - .4}{\sqrt{.0024}}\right) = P(z \le -6.23) \approx 0$, so reject $H_0$.

d. This is a Poisson problem, but a table for Poisson(144) is not available. Fortunately, for large values of $m$, the Poisson mean, $x$ is approximately Normal with mean $m$ and standard deviation $\sqrt{m}$. Since there are no specific requirements, assume that a 2-sided 95% test is wanted. $H_0$: Poisson(144), $H_1$: not Poisson(144). Then $z = \frac{x - m}{\sqrt{m}} = \frac{200 - 144}{\sqrt{144}} = 4.67$. Since $z_{\alpha/2} = z_{.025} = 1.96$, and our test ratio is not between $\pm 1.96$, reject $H_0$.

PROBLEM B.5 If $\sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2 = 40$ and the confidence level is 95%, test if it is true that the variance is 2 when a) $n = 10$, b) $n = 20$, c) $n = 40$.

SOLUTION: We are testing $H_0: \sigma^2 = 2$, $H_1: \sigma^2 \ne 2$. From the outline, $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$. Since $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, $\chi^2 = \frac{\sum (x - \bar{x})^2}{\sigma_0^2} = \frac{40}{2} = 20$ in all cases. Since the confidence level is 95%, all we really need to do is find out whether our value of $\chi^2$ falls between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$, in this case $\chi^2_{.975}$ and $\chi^2_{.025}$.

a.
$n = 10$ implies 9 degrees of freedom. $\chi^{2(9)}_{.975} = 2.700$ and $\chi^{2(9)}_{.025} = 19.023$. Since our value of $\chi^2$ does not fall between them, reject $H_0$.

b. $n = 20$ implies 19 degrees of freedom. $\chi^{2(19)}_{.975} = 8.907$ and $\chi^{2(19)}_{.025} = 32.852$. Since our value of $\chi^2$ falls between them, do not reject $H_0$.

c. $n = 40$ implies 39 degrees of freedom. Because we are beyond the $\chi^2$ table, we must use the approximation $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$. We already know that $\chi^2 = 20$, so that $z = \sqrt{2(20)} - \sqrt{2(39) - 1} = 6.32 - 8.77 = -2.45$. For a confidence level of 95%, $z$ must be between $-1.96$ and $1.96$. Since this value of $z$ is not, reject $H_0$.

PROBLEM C.1 Assume that $\sigma = 4$ and $n = 70$. Find the critical values, power function and operating characteristic curve for: $H_0: \mu \ge 50$, $H_1: \mu < 50$. Use a significance level of 5 percent.

SOLUTION: a) First, state the problem and find a critical value or values. $H_0: \mu \ge 50$, $H_1: \mu < 50$, $\sigma = 4$, $n = 70$, $\alpha = .05$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{70}} = 0.47809$. Since this is a one-sided test, the formula for a two-sided critical value, $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, becomes $\bar{x}_{cv} = \mu_0 - z_{\alpha}\,\sigma_{\bar{x}}$, so that $\bar{x}_{cv} = 50 - 1.645(0.47809) = 49.2135$. So we will not reject $H_0$ if the sample mean $\bar{x}$ is greater than or equal to 49.2135.

b) Decide on what values of $\mu_1$ to use to compute $\beta$, the probability of a type II error. The usual set of values includes the mean from the null hypothesis, the critical value, a point about midway between these values and two points, one further out beyond the critical value by a distance equal to the distance between the null hypothesis mean and the critical value, and another halfway between this point and the critical value. We thus choose 50, 49.2135, and 49.6, which is about halfway between them. Since the difference between 50 and 49.2135 is about 0.8, the lowest value of $\mu_1$ that we use is 48.4, and a point about halfway between 48.4 and 49.2135 is 48.8.

c) Compute $\beta$ for each value of $\mu_1$. Since a type II error is wrongly 'accepting' the null hypothesis, we compute the probability that the sample mean will be above or equal to the critical value for each value of $\mu_1$.
Our computations are below. Note that, in general, for this one-sided hypothesis $\beta = P\left(z \ge \frac{\bar{x}_{cv} - \mu_1}{\sigma_{\bar{x}}}\right)$.

$\mu_1 = 50$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 50) = P\left(z \ge \frac{49.2135 - 50}{.47809}\right) = P(z \ge -1.645) = .95$, power $= 1 - \beta = .05$

$\mu_1 = 49.6$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 49.6) = P\left(z \ge \frac{49.2135 - 49.6}{.47809}\right) = P(z \ge -0.81) = .2910 + .5 = .7910$, power $= 1 - \beta = .2090$

$\mu_1 = 49.2135$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 49.2135) = P\left(z \ge \frac{49.2135 - 49.2135}{.47809}\right) = P(z \ge 0) = .5000$, power $= 1 - \beta = .5000$

$\mu_1 = 48.8$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 48.8) = P\left(z \ge \frac{49.2135 - 48.8}{.47809}\right) = P(z \ge 0.86) = .5 - .3051 = .1949$, power $= 1 - \beta = .8051$

$\mu_1 = 48.4$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 48.4) = P\left(z \ge \frac{49.2135 - 48.4}{.47809}\right) = P(z \ge 1.70) = .5 - .4554 = .0446$, power $= 1 - \beta = .9554$

PROBLEM C.2 A hardware firm charges a flat rate for mailing of small tools based on an average weight of 20 oz. with a standard deviation of 3.60 oz. A consultant challenges this assumption and a sample of 100 packages is taken. Find critical values for a significance level of 1% and compute the power function and operating characteristic curve.

SOLUTION: a) First, state the problem and find a critical value or values. $H_0: \mu = 20$, $H_1: \mu \ne 20$, $\sigma = 3.60$, $n = 100$, $\alpha = .01$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.60}{\sqrt{100}} = 0.360$. Since this is a two-sided test, the formula for a critical value is $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, so that $\bar{x}_{cv} = 20 \pm 2.576(0.360) = 20 \pm 0.927$. So we will not reject $H_0$ if the sample mean $\bar{x}$ is between 19.073 and 20.927.

b) Decide on what values of $\mu_1$ to use to compute $\beta$, the probability of a type II error. The usual set of values includes the mean from the null hypothesis, the critical values, a point about midway between these values and two points, one further out beyond the critical value by a distance equal to the distance between the null hypothesis mean and the critical value, and another halfway between this point and the critical value. We thus choose the null hypothesis mean, 20, and the two critical values, 19.073 and 20.927. 19.5 and 20.5 are about halfway between 20 and the critical values. Since the difference between 20 and the critical values is about 1.0, the lowest value of $\mu_1$ that we use is 18.0 and the highest is 22.0.
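Points about halfway between these numbers and the critical values are 18.5 and 21.5.

c) Compute $\beta$ for each value of $\mu_1$. Since a type II error is wrongly 'accepting' the null hypothesis, we compute the probability that the sample mean will be between the critical values for each value of $\mu_1$. Our computations are below. Note that, in general, for a two-sided hypothesis $\beta = P\left(\frac{\bar{x}_{cv1} - \mu_1}{\sigma_{\bar{x}}} \le z \le \frac{\bar{x}_{cv2} - \mu_1}{\sigma_{\bar{x}}}\right)$.

$\mu_1 = 20$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20) = P\left(\frac{19.073 - 20}{0.360} \le z \le \frac{20.927 - 20}{0.360}\right) = P(-2.575 \le z \le 2.575) = 2(.4950) = .99$

$\mu_1 = 20.5$ or $19.5$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20.5) = P\left(\frac{19.073 - 20.5}{0.360} \le z \le \frac{20.927 - 20.5}{0.360}\right) = P(-3.96 \le z \le 1.19) = .5 + .3830 = .8830$

$\mu_1 = 20.927$ or $19.073$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20.927) = P\left(\frac{19.073 - 20.927}{0.360} \le z \le \frac{20.927 - 20.927}{0.360}\right) = P(-5.15 \le z \le 0.00) = .5000$

$\mu_1 = 21.5$ or $18.5$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 21.5) = P\left(\frac{19.073 - 21.5}{0.360} \le z \le \frac{20.927 - 21.5}{0.360}\right) = P(-6.74 \le z \le -1.59) = .5 - .4441 = .0559$

$\mu_1 = 22.0$ or $18.0$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 22.0) = P\left(\frac{19.073 - 22.0}{0.360} \le z \le \frac{20.927 - 22.0}{0.360}\right) = P(-8.13 \le z \le -2.98) = .5 - .4986 = .0014$

If we round these results, we get the following values for the operating characteristic ($\beta$) and power:

$\mu_1$    $\beta$    power
22.0     .00     1.00
21.5     .06      .94
20.9     .50      .50
20.5     .88      .12
20.0     .99      .01
19.5     .88      .12
19.0     .50      .50
18.5     .06      .94
18.0     .00     1.00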
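The two-sided type II error calculations in Problem C.2 can be checked numerically. What follows is a minimal sketch in Python, not part of the original solution: the function and constant names are hypothetical, and it assumes the exact Normal cumulative distribution (computed from `math.erf`) in place of the printed Normal table, so its results may differ from the table-based figures in the third or fourth decimal place.

```python
import math

# Values taken from Problem C.2; the names are illustrative only.
MU0 = 20.0        # null hypothesis mean
SIGMA = 3.60      # population standard deviation
N = 100           # sample size
Z_CRIT = 2.576    # z for alpha/2 = .005, as used in the solution

STD_ERROR = SIGMA / math.sqrt(N)          # 0.360
LOWER_CV = MU0 - Z_CRIT * STD_ERROR       # about 19.073
UPPER_CV = MU0 + Z_CRIT * STD_ERROR       # about 20.927


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def type_ii_error(mu1):
    """Probability that the sample mean falls between the critical
    values when the true mean is mu1 (the operating characteristic)."""
    z_low = (LOWER_CV - mu1) / STD_ERROR
    z_high = (UPPER_CV - mu1) / STD_ERROR
    return normal_cdf(z_high) - normal_cdf(z_low)


def oc_table(means):
    """Return (mu1, beta, power) rows, with power = 1 - beta."""
    rows = []
    for mu1 in means:
        beta = type_ii_error(mu1)
        rows.append((mu1, beta, 1.0 - beta))
    return rows


for mu1, beta, power in oc_table([22.0, 21.5, 20.5, 20.0, 19.5, 18.5, 18.0]):
    print(f"mu1 = {mu1:5.1f}   beta = {beta:.4f}   power = {power:.4f}")
```

Because the rejection region is symmetric about 20, the sketch gives the same type II error for 19.5 as for 20.5, for 18.5 as for 21.5, and for 18.0 as for 22.0, which is why the solution computes only one value of each pair.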