251y0243 12/10/02
ECO251 QBA1 FINAL EXAM DECEMBER 10, 2002
Name: KEY    Class: ________________

Part I. Do all the Following (16 Points) Make Diagrams! Show your work!

$x \sim N(3, 6)$.

1. $P(x \ge 6.04) = P\left(z \ge \frac{6.04-3}{6}\right) = P(z \ge 0.51) = P(z \ge 0) - P(0 \le z \le 0.51) = .5 - .1950 = .3050$

2. $P(-3.50 \le x \le 3.50) = P\left(\frac{-3.50-3}{6} \le z \le \frac{3.50-3}{6}\right) = P(-1.08 \le z \le 0.08) = P(-1.08 \le z \le 0) + P(0 \le z \le 0.08) = .3599 + .0319 = .3918$

3. (Change 17 to 22.) $P(6 \le x \le 22) = P\left(\frac{6-3}{6} \le z \le \frac{22-3}{6}\right) = P(0.50 \le z \le 3.17) = P(0 \le z \le 3.17) - P(0 \le z \le 0.50) = .4992 - .1915 = .3077$

4. $P(2.20 \le x \le 5.20) = P\left(\frac{2.20-3}{6} \le z \le \frac{5.20-3}{6}\right) = P(-0.13 \le z \le 0.37) = P(-0.13 \le z \le 0) + P(0 \le z \le 0.37) = .0517 + .1443 = .1960$

5. $F(2.63)$ (Cumulative Probability) $= P(x \le 2.63) = P\left(z \le \frac{2.63-3}{6}\right) = P(z \le -0.06) = P(z \le 0) - P(-0.06 \le z \le 0) = .5 - .0239 = .4761$

6. $x_{.33}$ (Find $z_{.33}$ first). (3 points) We want a point $x_{.33}$, so that $P(x > x_{.33}) = .33$. Make a diagram for $z$, showing zero in the middle, 50% below zero, and the area above zero divided into 17% between zero and $z_{.33}$ and 33% above $z_{.33}$. From the diagram, $P(0 \le z \le z_{.33}) = .1700$. The closest we can come is $P(0 \le z \le 0.44) = .1700$. So $z_{.33} = 0.44$, and $x_{.33} = \mu + z_{.33}\sigma = 3 + 0.44(6) = 3 + 2.64$, or 5.64.
To check this note that $P(x \ge 5.64) = P\left(z \ge \frac{5.64-3}{6}\right) = P(z \ge 0.44) = P(z \ge 0) - P(0 \le z \le 0.44) = .5 - .1700 = .33$.

7. A symmetrical region around the mean with a probability of 33%. (3 points) We want two points $x_{.665}$ and $x_{.335}$, so that $P(x_{.665} \le x \le x_{.335}) = .3300$. Make a diagram for $z$, showing zero in the middle and an area in the middle of 33%, split in two by zero so that 16.5% is below zero and 16.5% is above zero. Since $.5 - .165 = .335$, the points we want are $-z_{.335}$ and $z_{.335}$. From the diagram, $P(0 \le z \le z_{.335}) = .1650$. The closest we can come is $P(0 \le z \le 0.43) = .1664$. So use $z_{.335} = 0.43$, and $x = \mu \pm z_{.335}\sigma = 3 \pm 0.43(6) = 3 \pm 2.58$, or 0.42 to 5.58.
To check this note that $P(0.42 \le x \le 5.58) = P\left(\frac{0.42-3}{6} \le z \le \frac{5.58-3}{6}\right) = P(-0.43 \le z \le 0.43) = 2P(0 \le z \le 0.43) = 2(.1664) = .3328 \approx .33$.

II. (10 points, 2 point penalty for not trying part a.) Show your work!
We are investigating the reliability of a machine that fills 16-ounce bottles.
A sample of six is taken with the following results (Assume that it is correct to do a confidence interval from such a small sample): Be careful! There is a lot of chance for rounding error in this problem!

15.29   15.25   15.20   15.32   16.01   16.02

a. Compute the sample standard deviation, $s$, for the number of ounces. Show your work. (4)
b. Compute a 95% Confidence interval for the mean. (6)
c. (Extra credit) Can you say that the mean that you got is significantly different from 16? You must give a reason for your answer to be read. (2)

Solution: Since the confidence level is $1 - \alpha = .95$, the significance level is $\alpha = .05$.

a.
        x        x^2
1     15.29    233.784
2     15.25    232.563
3     15.20    231.040
4     15.32    234.702
5     16.01    256.320
6     16.02    256.640
Sum   93.09   1445.049

$\sum x = 93.09$, $\sum x^2 = 1445.049$ and $n = 6$. So $\bar{x} = \frac{\sum x}{n} = \frac{93.09}{6} = 15.515$,

$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{1445.049 - 6(15.515)^2}{5} = \frac{0.7577}{5} = 0.1515$ and $s = \sqrt{0.1515} = 0.3893$.

b. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{0.1515}{6}} = 0.1589$. The degrees of freedom are $n - 1 = 6 - 1 = 5$. $\alpha = .05$, so $\frac{\alpha}{2} = .025$. From the t-table, $t^{(n-1)}_{\alpha/2} = t^{(5)}_{.025} = 2.571$. But what compelled some of you to decide that 0.3893 was $s$ in a) but $\sigma$ in b)?
Putting this all together, $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}} = 15.515 \pm 2.571(0.1589) = 15.515 \pm 0.409$, or 15.11 to 15.92. More formally, $P(15.11 \le \mu \le 15.92) = .95$.

c. Since 16 is not included in this interval, we can say that the mean is significantly different from 16.

Note that Minitab gets a similar solution.

    MTB > TInterval 95.0 c1.
    Confidence Intervals
    Variable     N      Mean    StDev   SE Mean        95.0 % C.I.
    C1           6    15.515    0.389     0.159   ( 15.106,  15.924)

III. Do at least 4 of the following 6 Problems (at least 12 each) (or do sections adding to at least 48 points. Anything extra you do helps, and grades wrap around). Show your work! Please indicate clearly what sections of the problem you are answering! If you are following a rule like $E(ax) = aE(x)$ please state it! If you are using a formula, state it! If you answer a 'yes' or 'no' question, explain why! If you are using the Poisson or Binomial table, state things like $n$, $p$ or the mean.
Avoid crossing out answers that you think are inappropriate - you might get partial credit. Choose the problems that you do carefully - most of us are unlikely to be able to do more than half of the entire possible credit in this section!

1. 36 employees are randomly drawn from a large corporation and their sick days checked. Note that the sample size is 36 throughout this problem!
a. The sample mean is 5.3. Assume that the population standard deviation is known to be 2.1 and create a 98% confidence interval for the average number of sick days. (4)
b. Assume that the facts in part a are correct but that the corporation only has 300 employees. Create a 98% confidence interval for the average number of sick days again, highlighting what has changed. (4)
c. Assume that the sample mean is 5.3 and the corporation has 300 employees, but that 2.1 is a sample standard deviation. Create a 98% confidence interval for the average number of sick days again, highlighting what has changed. (4)
d. The sample mean is 5.3. Assume that the population standard deviation is known to be 1.6 and create a 93% confidence interval for the average number of sick days. (3)
e. Explain the effect of the following on the size of a confidence interval (keep the reasons brief) (3):
(i) A smaller sample size.
(ii) A larger population size.
(iii) A higher confidence level.

Solution: Note that the sample size is 36 throughout this problem. Since the confidence level is $1 - \alpha = .98$, the significance level is $\alpha = .02$. Repeat after me! "$z$ goes with $\sigma$ (sigma - population variance); $t$ goes with $s$ (sample variance)!" Most of you seem to have been so hypnotized by the old exams that you had that you never noticed that most of this question concerned population variances.

a) $\sigma = 2.1$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.1}{\sqrt{36}} = 0.35$. From the t-table $z_{\alpha/2} = z_{.01} = 2.327$. $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 5.3 \pm 2.327(0.35) = 5.3 \pm 0.81$, or 4.49 to 6.11. More formally $P(4.49 \le \mu \le 6.11) = 98\%$.
b) The population size, $N = 300$, is less than 20 times the sample size $n = 36$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{2.1}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}} = 0.35\sqrt{0.8829} = 0.3289$. From the t-table $z_{\alpha/2} = z_{.01} = 2.327$. $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 5.3 \pm 2.327(0.3289) = 5.3 \pm 0.77$, or 4.53 to 6.07. More formally $P(4.53 \le \mu \le 6.07) = 98\%$.

c) $s = 2.1$. The degrees of freedom are $n - 1 = 36 - 1 = 35$. Since we are given $s$ rather than $\sigma$, we must use $t$. From the t-table, $t^{(n-1)}_{\alpha/2} = t^{(35)}_{.01} = 2.438$. The population size, $N = 300$, is less than 20 times the sample size $n = 36$, so $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{2.1}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}} = 0.35\sqrt{0.8829} = 0.3289$. Putting this all together, $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}} = 5.3 \pm 2.438(0.3289) = 5.3 \pm 0.80$, or 4.50 to 6.10. More formally, $P(4.50 \le \mu \le 6.10) = .98$.

d) $\sigma = 2.1$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.1}{\sqrt{36}} = 0.35$. Since the confidence level is $1 - \alpha = .93$, the significance level is $\alpha = .07$. We need $z_{\alpha/2} = z_{.035}$. Make a diagram for $z$. It should show 50% above zero, divided into 3.5% above $z_{.035}$ and 50% - 3.5% = 46.5% below $z_{.035}$. So $P(0 \le z \le z_{.035}) = .4650$. From the Normal table, the closest we can come is $P(0 \le z \le 1.81) = .4649$. $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 5.3 \pm 1.81(0.35) = 5.3 \pm 0.63$, or 4.67 to 5.93. More formally $P(4.67 \le \mu \le 5.93) = 93\%$. It's remarkable how many of you could find something like $z_{.035}$ on page one but not here.

e) The basic formula for a confidence interval is $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}}$ or $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}}$.
(i) A smaller sample size. If we use the formula $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}}$ and $s_{\bar{x}} = \frac{s}{\sqrt{n}}$, a smaller sample size will make the standard error larger because the sample size is in the denominator and will also make $t$ larger, as we can see from the t table.
(ii) A larger population size. The formula for the standard error is $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$. As the population size grows, $\sqrt{\frac{N-n}{N-1}}$ approaches one from below, making the confidence interval larger.
(iii) A higher confidence level. If we look at the t table, we see the value of $t$ gets larger as the significance level gets smaller. The significance level falls as the confidence level rises. So a higher confidence level makes the interval larger.

2.
A sample of 5 is taken from a large Normal population with a mean of 3 and a standard deviation of 13.
a. What is the probability that an individual item $x$ in the sample lies between 1 and 5? (2)
b. What is the probability that the sample mean $\bar{x}$ lies between 1 and 5? (2)
c. What is the probability that at least one of the 5 measurements lies between 1 and 5? (2)
d. What is the probability that all the 5 measurements lie between 1 and 5? (2)
e. Do b) again assuming that the sample of 5 is taken from a population of 200. (1)
f. Find the 99th percentile of the distribution of $x$. (2)
g. Find the 99th percentile of the distribution of $\bar{x}$. (1)

Solution: $x \sim N(\mu, \sigma) = N(3, 13)$. Since $n = 5$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{13}{\sqrt{5}} = 5.8138$, so $\bar{x} \sim N(\mu, \sigma_{\bar{x}}) = N(3, 5.8138)$.
a) $P(1 \le x \le 5) = P\left(\frac{1-3}{13} \le z \le \frac{5-3}{13}\right) = P(-0.15 \le z \le 0.15) = 2P(0 \le z \le 0.15) = 2(.0596) = .1192$.
b) $P(1 \le \bar{x} \le 5) = P\left(\frac{1-3}{5.8138} \le z \le \frac{5-3}{5.8138}\right) = P(-0.34 \le z \le 0.34) = 2P(0 \le z \le 0.34) = 2(.1331) = .2662$.
c) This problem and the next problem are Binomial problems with $n = 5$, $p = .1192$ and $q = 1 - .1192 = .8808$. Remember $P(x) = C^n_x p^x q^{n-x}$. $P(x \ge 1) = 1 - P(0) = 1 - C^5_0 (.1192)^0 (.8808)^5 = 1 - .5301 = .4699$.
d) $P(x = 5) = C^5_5 (.1192)^5 (.8808)^0 = .000024$.
e) Because the population size is more than 20 times the sample size, the answer is essentially the same as b).
f) Since the 99th percentile of $z$ is $z_{.01} = 2.327$, the 99th percentile of $x$ is $\mu + z_{.01}\sigma = 3 + 2.327(13) = 33.25$.
g) Since the 99th percentile of $z$ is $z_{.01} = 2.327$, the 99th percentile of $\bar{x}$ is $\mu + z_{.01}\sigma_{\bar{x}} = 3 + 2.327(5.8138) = 3 + 13.529 = 16.53$.

3. An eight-sided die is rolled five times. It has sides numbered one through 8, all of which are equally likely. A one, two, three, four or five wins $4.00, so that the probability of winning on each roll is 5/8. Let $x$ represent the number of wins and $y$ represent the amount won. Find the following:
a. The complete distribution of $x$ and $y$ (6) - note that you do not have to do this part to answer the remainder of the question.
b. The mean and variance of $x$. (2)
c. The mean and variance of $y$. (2)
d. The chance of at least one win (2 points if you did not do a) (1)
e.
Assume that the die is rolled 50 times, find the probability of 30 or more wins. (Note that your binomial tables won't help and that if you try to do this without using a table you will be here until Christmas)
(i) Can this problem be done using the Poisson distribution? - Why? (1)
(ii) Can this problem be done using the Normal distribution? - Why? (1)
(iii) Decide which of the two distributions is correct and answer the question. (2 or 2.5)

Solution: This problem is a Binomial problem with $n = 5$, $p = \frac{5}{8}$ and $q = 1 - \frac{5}{8} = \frac{3}{8}$. Remember $P(x) = C^n_x p^x q^{n-x} = C^5_x \left(\frac{5}{8}\right)^x \left(\frac{3}{8}\right)^{5-x}$.
a. The distribution is as follows:

x    y    P(x)
0    0    $C^5_0 (5/8)^0 (3/8)^5 = .00741$
1    4    $C^5_1 (5/8)^1 (3/8)^4 = 5(.012360) = .06180$
2    8    $C^5_2 (5/8)^2 (3/8)^3 = 10(.020599) = .20599$
3   12    $C^5_3 (5/8)^3 (3/8)^2 = 10(.034332) = .34332$
4   16    $C^5_4 (5/8)^4 (3/8)^1 = 5(.057220) = .28610$
5   20    $C^5_5 (5/8)^5 (3/8)^0 = .09537$

There is a minimal rounding error since these add to .99999.
b. The mean is $\mu = np = 5\left(\frac{5}{8}\right) = 3.125$ and the variance is $\sigma^2 = npq = 3.125\left(\frac{3}{8}\right) = 1.171875$.
c. The prize is $y = 4x$ so, using the formulas $E(ax) = aE(x)$ and $Var(ax) = a^2 Var(x)$, we get $E(4x) = 4E(x) = 4(3.125) = 12.5$ and $Var(4x) = 4^2 Var(x) = 16(1.171875) = 18.75$.
d. $P(x \ge 1) = 1 - P(0) = 1 - .00741 = .99259$.
e. (i) $n = 50$, $p = \frac{5}{8}$. If we test to see if the Poisson Distribution can be used, we find $\frac{n}{p} = \frac{50}{5/8} = 80.0$. Since this is not above 500, we cannot use the Poisson Distribution.
(ii) $np = 50\left(\frac{5}{8}\right) = 31.25 > 5$ and $nq = 50\left(\frac{3}{8}\right) = 50 - 31.25 = 18.75 > 5$. Since these are both above 5, we can use the Normal distribution with $\mu = np = 50\left(\frac{5}{8}\right) = 31.25$ and $\sigma^2 = npq = 31.25\left(\frac{3}{8}\right) = 11.71875$.
(iii) With the continuity correction, $P(x \ge 29.5) = P\left(z \ge \frac{29.5 - 31.25}{\sqrt{11.71875}}\right) = P(z \ge -0.51) = .5 + .1950 = .6950$. Because 29.5 lies below the mean of 31.25, the $z$ value is negative and the probability must exceed one half.

4. a. I buy two mainframes from a computer manufacturer. One computer has a 23% chance of breaking down during the first year, the second has a 25% chance of breaking down during the same period. Let $A$ represent the event that the first computer breaks down in the first year and $B$ represent the event that the second computer breaks down.
Tell how you will note the complement of these events and assume that the events are independent. Find the following, noting if each is an event like $A \cap B$, $A \cup B$, or similar events involving the complement.
(i) The probability that both computers break down in the first year. (2)
(ii) The probability that one computer breaks down in the first year. (2)
(iii) The probability that no computer breaks down in the first year. (2)
(iv) If a breakdown costs you $1000 (and no machine will break down more than once in a year), what is the mean and variance of your breakdown costs? (4)
b. I use an average of 2 boxes of paper in a day, but it takes 9 days to get a delivery. Given the average usage over 9 days, do the following:
(i) What is the probability of using at least 20 boxes in the 9 day period? (1)
(ii) What is the probability of using more than 20 boxes? (1)
(iii) If $x$ is the number of boxes used, what is $P(13 \le x \le 27)$? (2)
(iv) If I want to keep the probability of running out of paper below 1%, what is the minimum number of boxes I should have on hand when I reorder? (Why?) (2)

Solution: $\bar{A}$ is the complement of $A$.
a. (i) $P(A \cap B) = P(A)P(B) = .23(.25) = .0575$
(ii) $P(A \cap \bar{B}) + P(\bar{A} \cap B) = P(A)P(\bar{B}) + P(\bar{A})P(B) = .23(.75) + .77(.25) = .1725 + .1925 = .3650$. (This is $P[(A \cap \bar{B}) \cup (\bar{A} \cap B)]$.)
(iii) $P(\bar{A} \cap \bar{B}) = P(\bar{A})P(\bar{B}) = .77(.75) = .5775$. These probabilities add to one.
(iv) If $x$ represents the number of breakdowns, $y = 1000x$ is the cost.

x      P(x)     xP(x)   x^2P(x)
0      .5775    0       0
1      .3650    .365    .365
2      .0575    .115    .230
sum   1.0000    .480    .595

$\mu_x = E(x) = \sum xP(x) = .48$ and $\sigma_x^2 = Var(x) = E(x^2) - \mu_x^2 = .595 - (.48)^2 = .3646$. $E(ax) = aE(x)$ and $Var(ax) = a^2 Var(x)$, so $E(1000x) = 1000E(x) = 1000(.48) = 480$ and $Var(1000x) = 1000^2 Var(x) = 364600$.
b. Poisson distribution with mean $2(9) = 18$.
(i) $P(x \ge 20) = 1 - P(x \le 19) = 1 - .65092 = .34908$
(ii) $P(x > 20) = 1 - P(x \le 20) = 1 - .73072 = .26928$
(iii) $P(13 \le x \le 27) = P(x \le 27) - P(x \le 12) = .98268 - .09167 = .89101$
(iv) You must have 29 boxes. $P(x \le 29) = .99406$. This means $P(x > 29)$ is below 1%.

5. (Lee et al.)
The following joint probability table measures the satisfaction that customers have with local food stores. $x$ is the satisfaction level, with 4 representing the highest value. $y$ represents the number of years that the consumer has lived in the area, with 2 representing long term residents.

                 x
y         1     2     3     4    Total
1        .02   .14   .23   .07    .46
2        .09   .17   .23   .05    .54
Total    .11   .31   .46   .12

a. Are $x$ and $y$ independent? Why? (2)
b. How do we know that this is a valid distribution? (1)
c. Compute the mean and standard deviation of $y$. (2)
d. Compute the covariance between $x$ and $y$ and interpret it. Does this mean that long term residents are more satisfied? (3.5)
e. Compute the correlation. What does this tell us that we couldn't learn from the covariance? (2.5)
f. Remember that these are joint probabilities. Find the conditional probability that a long-term resident is very satisfied, $P(x = 4 \mid y = 2)$. (2)
g. Use the same method to find the complete conditional probability of $x$ for long term residents, show that this is a valid distribution and compute the conditional mean for long-term residents. (5.5)
h. If you are ready for some real thinking, find the conditional mean for short-term residents and use it to compare the satisfaction levels for the two kinds of residents. (To do this correctly we would need variances as well - have a nice break and we'll worry about that next semester.)

Solution:
a. $x$ and $y$ are not independent. For example $P(x = 1 \cap y = 1) = .02 \ne P(x = 1)P(y = 1) = .11(.46) = .0506$.
b. The probabilities add to one and are not negative or above 1.
c.
y      P(y)    yP(y)   y^2P(y)
1      .46     0.46    0.46
2      .54     1.08    2.16
sum   1.00     1.54    2.62

$\mu_y = E(y) = \sum yP(y) = 1.54$, $\sigma_y^2 = E(y^2) - \mu_y^2 = 2.62 - (1.54)^2 = 0.2484$ and $\sigma_y = \sqrt{.2484} = .4984$.
d.
                   x
y           1      2      3      4     P(y)   yP(y)   y^2P(y)
1          .02    .14    .23    .07    .46    0.46    0.46
2          .09    .17    .23    .05    .54    1.08    2.16
P(x)       .11    .31    .46    .12   1.00    1.54    2.62
xP(x)      .11    .62   1.38    .48   2.59
x^2P(x)    .11   1.24   4.14   1.92   7.41

To summarize, $\sum P(x) = 1$, $\mu_x = E(x) = \sum xP(x) = 2.59$, $E(x^2) = \sum x^2 P(x) = 7.41$, $\sum P(y) = 1$, $\mu_y = E(y) = \sum yP(y) = 1.54$ and $E(y^2) = \sum y^2 P(y) = 2.62$.

$E(xy) = .02(1)(1) + .14(2)(1) + .23(3)(1) + .07(4)(1) + .09(1)(2) + .17(2)(2) + .23(3)(2) + .05(4)(2)$
$= 0.02 + 0.28 + 0.69 + 0.28 + 0.18 + 0.68 + 1.38 + 0.40 = 3.91$

$\sigma_{xy} = Cov(x, y) = E(xy) - \mu_x\mu_y = 3.91 - 2.59(1.54) = -0.0786$. Negative, so $x$ and $y$ move oppositely. This means that long term residents are less satisfied.
e. $\sigma_x^2 = E(x^2) - \mu_x^2 = 7.41 - (2.59)^2 = 0.7019$ and $\sigma_y^2 = E(y^2) - \mu_y^2 = 2.62 - (1.54)^2 = 0.2484$. So that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{-0.0786}{\sqrt{.7019}\sqrt{.2484}} = \frac{-.0786}{0.8378(0.4984)} = -.18824$. Since the square of the correlation is between .03 and .04, on a zero-one scale it is quite weak.
f. By the multiplication rule $P(x = 4 \mid y = 2) = \frac{P(x = 4 \cap y = 2)}{P(y = 2)} = \frac{.05}{.54} = .0926$.
g. If we divide the whole top row by .46, and the bottom row by .54 we get

                 x
y         1      2      3      4     Total
1        .043   .304   .500   .152   0.999
2        .167   .315   .426   .093   1.001

For short-term residents .043(1) + .304(2) + .500(3) + .152(4) = .043 + .608 + 1.500 + .608 = 2.759.
For long-term residents .167(1) + .315(2) + .426(3) + .093(4) = .167 + .630 + 1.278 + .372 = 2.447.
Anyway long-term residents seem less satisfied.

6. Answers to a-c can be left in factorial form.
a. A bridge hand consists of 13 cards. A deck is a population of 52 cards, with a total of 4 kings. What is the probability of 2 kings? What is the distribution you are using? (3)
b. What are the mean and variance of the number of kings in a hand? (2)
c. Now comes the fun. Let's say we take a load of decks - you do not need to know how many cards we have, and we deal you 13 cards, what is the probability of 2 kings now? (3)
d. Are the mean and variance of the number of kings in your hand in c) the same as the values in b)? What changes and why? (1)
e.
(Keller, Warrack) The amount of gasoline sold by one of your service stations daily is uniformly distributed between a minimum of 2000 and a maximum of 5000 gallons.
(i) What is the mean and standard deviation of sales? (1.5)
(ii) What is the probability that sales on a given day will fall between 3000 and 4000 gallons? (2)
(iii) What is the probability that sales will be over 3000 gallons? (1)
(iv) What is the probability of sales between 1900 and 2900 gallons? (1)
(v) If you did not do (iii) using cumulative distributions, do it now (2)
(vi) If you own 10 identical service stations, what is the probability that at least one has sales over 3000 gallons? (3)

Solution:
a. There are 4 kings in a deck of 52. So this is a Hypergeometric distribution with $M = 4$, $N = 52$, $n = 13$ and $x = 3$. $P(x) = \frac{C^M_x C^{N-M}_{n-x}}{C^N_n}$. So
$P(3) = \frac{C^4_3 C^{48}_{10}}{C^{52}_{13}} = \frac{\frac{4!}{3!\,1!} \cdot \frac{48!}{38!\,10!}}{\frac{52!}{39!\,13!}} = \frac{4 \cdot \frac{48 \cdot 47 \cdots 39}{10!}}{\frac{52 \cdot 51 \cdots 40}{13!}} = \frac{4(39)(13)(12)(11)}{52 \cdot 51 \cdot 50 \cdot 49} = .04120$
b. $p = \frac{M}{N} = \frac{4}{52} = \frac{1}{13} = .07692$, $\mu = np = 13\left(\frac{1}{13}\right) = 1$, $\sigma^2 = npq\frac{N-n}{N-1} = 13\left(\frac{4}{52}\right)\left(\frac{48}{52}\right)\frac{52-13}{52-1} = \frac{39}{51}(13)\left(\frac{4}{52}\right)\left(\frac{48}{52}\right) = .76471(13)(.07101) = .70588$, so $\sigma = \sqrt{0.70588} = 0.84017$.
c. Binomial $P(x) = C^n_x p^x q^{n-x}$. So $P(3) = C^{13}_3 \left(\frac{1}{13}\right)^3 \left(\frac{12}{13}\right)^{10} = \frac{13!}{3!\,10!}\left(\frac{1}{13}\right)^3\left(\frac{12}{13}\right)^{10} = \frac{13 \cdot 12 \cdot 11}{3 \cdot 2 \cdot 1}\left(\frac{1}{13}\right)^3\left(\frac{12}{13}\right)^{10} = 286(.000455166)(.449137) = .05847$
d. Because the population is now infinite, we remove the .76471 finite population adjustment from the variance formula. The mean is unchanged.

e. (Keller, Warrack) The amount of gasoline sold by one of your service stations daily is uniformly distributed between a minimum of 2000 and a maximum of 5000 gallons.
(i) What is the mean and standard deviation of sales? (1.5)
(ii) What is the probability that sales on a given day will fall between 3000 and 4000 gallons? (2)
(iii) What is the probability that sales will be over 3000 gallons? (1)
(iv) What is the probability of sales between 1900 and 2900 gallons?
(1)
(v) If you did not do (iii) using cumulative distributions, do it now (2)
(vi) If you own 10 identical service stations, what is the probability that at least one has sales over 3000 gallons? (3)

This is a continuous uniform distribution with $c = 2000$ and $d = 5000$. $\frac{1}{d-c} = \frac{1}{5000-2000} = \frac{1}{3000}$.
(i) $\mu = \frac{c+d}{2} = \frac{2000+5000}{2} = 3500$, $\sigma^2 = \frac{(d-c)^2}{12} = \frac{(5000-2000)^2}{12} = 750000$, so $\sigma = \sqrt{750000} = 866.025$.
(ii) $P(3000 \le x \le 4000) = \frac{1000}{3000} = \frac{1}{3} = .3333$
(iii) $P(x \ge 3000) = \frac{5000-3000}{3000} = \frac{2}{3} = .6667$
(iv) $P(1900 \le x \le 2900) = \frac{2900-2000}{3000} = .3000$
(v) Recall that $F(x) = \frac{x-c}{d-c}$ for $c \le x \le d$, $F(x) = 0$ for $x < c$ and $F(x) = 1$ for $x > d$, and that $F(500)$ means $P(x \le 500)$. $P(x \ge 3000) = 1 - F(3000) = 1 - \frac{3000-2000}{3000} = 1 - \frac{1}{3} = \frac{2}{3} = .6667$
(vi) This is your basic Binomial problem. $P(x) = C^n_x p^x q^{n-x}$, $p = \frac{2}{3}$, $q = \frac{1}{3}$ and $n = 10$. $P(x \ge 1) = 1 - P(0) = 1 - C^{10}_0\left(\frac{2}{3}\right)^0\left(\frac{1}{3}\right)^{10} = 1 - .000016935 = .99998$

6. (ctd)
f. (Keller, Warrack) (Extra credit) The number of hours an alkaline battery lasts is exponentially distributed with a parameter $c$ of 0.04.
(i) What are the mean and standard deviation of a battery's life? (2)
(ii) What is the probability that the battery will last between 10 and 15 hours? (2)
(iii) What is the probability that it will last for more than 20 hours? (1)
(iv) Are you surprised that a jorcillator has two such batteries and that it works as long as one of the batteries works? What is the probability that the jorcillator lasts more than 20 hours? (2)
(v) (Difficult) What is the probability that the jorcillator lasts between 10 and 15 hours? (The answer to this will not be published!) (7)
g. The Muggle detector in front of my tower will identify a Muggle correctly as a Muggle 98% of the time and will identify a Wizard correctly as a Wizard 95% of the time. Assume that 4% of the population are Wizards and the rest Muggles. The Muggle detector says that you are a Wizard. What is the probability that you really are a Wizard? I only admit Wizards to my tower. (Hint: To do this you need decent notation or a good tree - Let Y (yes!)
be the event that it says you are a Wizard and N (no!) be the event that it says you are a Muggle. W is the event that you are a Wizard, $P(W) = .04$, and M is the event that you are a Muggle. (You need conditional probability to do this.) (6)

Solution:
f) You were told in advance that this section would use the exponential distribution. In the exponential distribution from the outline, $F(x) = 1 - e^{-cx}$, when the mean time to a success is $\frac{1}{c}$. Note that this is only for $x \ge 0$. There is no probability below zero.
(i) $\mu = \sigma = \frac{1}{c} = \frac{1}{.04} = 25$.
(ii) $F(10) = 1 - e^{-.04(10)} = 1 - e^{-0.4} = 1 - .67032$ and $F(15) = 1 - e^{-.04(15)} = 1 - e^{-0.6} = 1 - .54881$. So $P(10 \le x \le 15) = F(15) - F(10) = .67032 - .54881 = .12151$.
(iii) $P(x > 20) = 1 - F(20) = 1 - \left(1 - e^{-.04(20)}\right) = e^{-0.8} = .44933$
(iv) The probability that one component lasts less than 20 hours is 1 - .44933 = .55067. The probability that both fail before 20 hours is $(.55067)^2 = .30324$. The complement of this is .69676.

g. Not a hard problem. You are given $P(N \mid M) = .98$, $P(Y \mid W) = .95$, $P(M) = .96$ and $P(W) = .04$. This implies that $P(Y \mid M) = .02$ and $P(N \mid W) = .05$. You are asked for $P(W \mid Y)$. By Bayes' Rule $P(W \mid Y) = \frac{P(Y \mid W)P(W)}{P(Y)}$.
$P(Y) = P(Y \mid W)P(W) + P(Y \mid M)P(M) = .95(.04) + .02(.96) = .0380 + .0192 = .0572$
So $P(W \mid Y) = \frac{P(Y \mid W)P(W)}{P(Y)} = \frac{.0380}{.0572} = .6643$
Another way to do this is to say that of 10000 people who come to my tower, 9600 will be Muggles and 400 will be Wizards. Of the Muggles, 98% or 9408 will be identified as Muggles. The remaining 2% or 192 will be wrongly identified as Wizards. Of the 400 Wizards, 95% or 380 will be correctly identified. So 192 + 380 = 572 are identified as Wizards. Of the 572, 380 or 66.4% actually are Wizards.
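
The Bayes' Rule calculation for the Muggle detector can also be checked numerically. The following is only a minimal sketch, not part of the original key: the function name `posterior` and its parameter names are hypothetical, chosen here for illustration, and the inputs are the probabilities given in the problem.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Return P(W | Y) by Bayes' Rule.

    prior            -- P(W), the share of Wizards in the population
    hit_rate         -- P(Y | W), a Wizard is correctly flagged as a Wizard
    false_alarm_rate -- P(Y | M), a Muggle is wrongly flagged as a Wizard
    (All three names are illustrative assumptions, not from the exam.)
    """
    # Total probability of a "Wizard" reading: P(Y) = P(Y|W)P(W) + P(Y|M)P(M)
    p_yes = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / p_yes


# Values from the problem: P(W) = .04, P(Y|W) = .95, P(Y|M) = 1 - .98 = .02
p_w_given_y = posterior(prior=0.04, hit_rate=0.95, false_alarm_rate=0.02)
print(round(p_w_given_y, 4))  # 0.6643
```

The result agrees with both the formula and the head count of 380 true Wizards among the 572 people flagged.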