251x0541 4/26/05
ECO 251 QBA1 FINAL EXAM MAY 2, 2005
Name ________________ Class ________________

Part I. Do all the Following (14 Points) Make Diagrams! Show your work!
Exam is normed on 75 points. There are actually 134 possible points.
If you haven't done it lately, take a fast look at ECO 251 - Things That You Should Never Do on a Statistics Exam (or Anywhere Else).

$x \sim N(12, 4.6)$

1. $P(17.26 \le x \le 28) = P\left(\frac{17.26-12}{4.6} \le z \le \frac{28-12}{4.6}\right) = P(1.14 \le z \le 3.48) = P(0 \le z \le 3.48) - P(0 \le z \le 1.14) = .4997 - .3729 = .1268$
[Figure: Normal curve with mean 12 and standard deviation 4.6, the area between 17.26 and 28 is 0.1262; Normal curve with mean 0 and standard deviation 1, the area between 1.14348 and 3.47826 is 0.1262.]

2. $F(17.26)$ (Cumulative Probability)
$F(17.26) = P(x \le 17.26) = P\left(z \le \frac{17.26-12}{4.6}\right) = P(z \le 1.14) = .5 + P(0 \le z \le 1.14) = .5 + .3729 = .8729$
[Figure: Normal curve with mean 12 and standard deviation 4.6, the area to the left of 17.26 is 0.8736; Normal curve with mean 0 and standard deviation 1, the area to the left of 1.14348 is 0.8736.]

3. $P(x \ge 19.30) = P\left(z \ge \frac{19.30-12}{4.6}\right) = P(z \ge 1.59) = .5 - P(0 \le z \le 1.59) = .5 - .4441 = .0559$
[Figure: Normal curve with mean 12 and standard deviation 4.6, the area to the right of 19.3 is 0.0563.]

4. $P(12 \le x \le 16) = P\left(\frac{12-12}{4.6} \le z \le \frac{16-12}{4.6}\right) = P(0 \le z \le 0.87) = .3078$
[Figure: Normal curve with mean 12 and standard deviation 4.6, the area between 12 and 16 is 0.3077; Normal curve with mean 0 and standard deviation 1, the area between 0 and 0.869565 is 0.3077.]

5. $P(0 \le x \le 16) = P\left(\frac{0-12}{4.6} \le z \le \frac{16-12}{4.6}\right) = P(-2.61 \le z \le 0.87) = P(-2.61 \le z \le 0) + P(0 \le z \le 0.87) = .4955 + .3078 = .8033$
[Figure: Normal curve with mean 12 and standard deviation 4.6, the area between 0 and 16 is 0.8032; Normal curve with mean 0 and standard deviation 1, the area between -2.6087 and 0.869565 is 0.8032.]

6. $x_{.13}$ (Find $z_{.13}$ first)
Solution: Make a diagram. $z_{.13}$ is defined as a point with 13% above it and thus 100% - 13% = 87% below it, so it is the 87th percentile. The diagram for $z$ will show an area with a probability of 87% below $z_{.13}$. It is split by a vertical line at zero into two areas. The lower one has a probability of 50% and the upper one a probability of 87% - 50% = 37%. The upper tail of the distribution above $z_{.13}$ has a probability of 13%, so that the entire area above 0 adds to 50%. From the diagram, we want one point $z_{.13}$ so that $P(z \le z_{.13}) = .8700$ or $P(0 \le z \le z_{.13}) = .3700$. The closest we can come is $P(0 \le z \le 1.13) = .3708$. We can say $z_{.13} = 1.13$. From this we get $x = \mu + z\sigma = 12 + 1.13(4.6) = 17.198$.

7. A symmetrical region around the mean with a probability of 33%.
Solution: Make a diagram. The diagram for $z$ will show a central area with a probability of 33%. It is split in two by a vertical line at zero into two areas with probabilities of 16.5%. The tails of the distribution each have a probability of 50% - 16.5% = 33.5%. From the diagram, we want two points $z_{.335}$ and $z_{.665}$ so that $P(z_{.665} \le z \le z_{.335}) = .3300$. The upper point $z_{.335}$ will have $P(0 \le z \le z_{.335}) = \frac{33\%}{2} = .1650$, and by symmetry $z_{.665} = -z_{.335}$. From the interior of the Normal table the closest we can come to .1650 is $P(0 \le z \le 0.43) = .1664$ or $P(0 \le z \le 0.42) = .1628$. Though either $z_{.335} = 0.42$ or $z_{.335} = 0.43$ is an acceptable answer, in fact 0.43 is closer, so I will use $z_{.335} = 0.43$, and our interval for $z$ is -0.43 to 0.43. Since $x \sim N(12, 4.6)$, the diagram for $x$ (if we bother) will show 33% probability split in two 16.5% regions on either side of 12, with 33.5% above $x_{.335}$ and 33.5% below $x_{.665}$. The interval for $x$ can then be written $x = \mu \pm z_{.335}\sigma = 12 \pm 0.43(4.6) = 12 \pm 1.978$ or 10.022 to 13.978.
[14, 14]

II. (10 points+, 2 point penalty for not trying part a.) Show your work!

  Row      x      x^2       y        y^2        xy
   1      4.0    16.00     10.2     104.04     *****
   2      4.5    20.25     10.4     108.16     *****
   3      5.0    25.00     10.5     110.25     *****
   4      5.5    30.25     21.8     475.24     ******
   5      6.0    36.00     36.8    1354.24     ******
   6      6.5    42.25     51.6    2662.56     ******
   7      7.0    49.00     66.2    4382.44     ******
   8      7.5    56.25     68.7    4719.69     ______
   9      8.0    64.00     68.0    4624.00     ______
  10      8.5    72.25     69.4    4816.36     ______
  11      9.5    90.25     75.0    5625.00     ******
  Sum    72.0   501.50    488.6   28981.98    3641.25

The data above represent a random sample of the intensity of advertising $x$, measured in number of exposures on evenings in prime TV, and $y$, the intensity of awareness of the product according to a consumer survey taken after the advertising campaign, for 11 products. Calculate the following:
a. The sample standard deviation $s_y$ of awareness. (The standard deviation of $x$ was originally given as 1.51383; that was an error, and the correct value is 1.73860.) (2)
b. The sample covariance $s_{xy}$ between $x$ and $y$ after computing the 3 numbers not replaced by asterisks. (3)
Solution: The $xy$ column reads 40.80, 46.80, 52.50, 119.90, 220.80, 335.40, 463.40, 515.25, 544.00, 589.90, 712.50.
c. The sample correlation $r_{xy}$ between $x$ and $y$. (2)
d. Given the size and sign of the correlation, what conclusion might you draw on the relation between $x$ and $y$? (1)
e. Assume that the intensity of awareness was 15% higher ($v = 1.15y$).
Find $\bar v$, $s_v^2$, $s_{xv}$ and $r_{xv}$. Use only the values you computed in a-c and rules for functions of $x$ and $y$ to get your results. If you state the results without explaining why, or change $x_1$ and $x_2$ and recompute the results, you will receive no credit. (4)
f. Do a 90% confidence interval for the mean number of exposures. (2)
[14, 28]

Solution: These sums have been calculated for you: $\sum x = 72.0$, $\sum x^2 = 501.50$, $\sum y = 488.6$, $\sum y^2 = 28981.98$ and $\sum xy = 3641.25$.

a) $\bar y = \frac{\sum y}{n} = \frac{488.6}{11} = 44.4182$
$s_y^2 = \frac{\sum y^2 - n\bar y^2}{n-1} = \frac{28981.98 - 11(44.4182)^2}{10} = \frac{7279.24}{10} = 727.924$
$s_y = \sqrt{727.924} = 26.9801$
$\bar x = \frac{\sum x}{n} = \frac{72.0}{11} = 6.54545$
$s_x^2 = \frac{\sum x^2 - n\bar x^2}{n-1} = \frac{501.50 - 11(6.54545)^2}{10} = \frac{30.2273}{10} = 3.02273$
$s_x = \sqrt{3.02273} = 1.73860$

b) $s_{xy} = \frac{\sum xy - n\bar x\bar y}{n-1} = \frac{3641.25 - 11(6.54545)(44.4182)}{10} = \frac{443.1418}{10} = 44.3142$

c) $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{44.3142}{1.73860(26.9801)} = .9447$. This must be between -1 and 1.

d) If we square the correlation we get 0.892, which on a zero to one scale is fairly strong. We certainly could say that the number of exposures is closely related to product awareness, though we will find out that the small size of the sample could limit our certainty.

e) From the syllabus supplement article: "Let us introduce two new variables, $w$ and $v$, so that $w = ax + b$ and $v = cy + d$, where $a$, $b$, $c$ and $d$ are constants. From the earlier part of this section we know the following:
$\mu_w = E(w) = E(ax + b) = aE(x) + b$
$\sigma_w^2 = Var(w) = a^2 Var(x) = a^2\sigma_x^2$
$\mu_v = E(v) = E(cy + d) = cE(y) + d$
$\sigma_v^2 = Var(v) = c^2 Var(y) = c^2\sigma_y^2$
To this we now add a new rule: $Cov(w, v) = \sigma_{wv} = ac\,Cov(x, y) = ac\,\sigma_{xy}$.
To find the correlation between $w$ and $v$, recall that $\rho_{wv} = \frac{\sigma_{wv}}{\sigma_w\sigma_v}$. But since $\sigma_{wv} = ac\,\sigma_{xy}$, $\sigma_w^2 = a^2\sigma_x^2$ and $\sigma_v^2 = c^2\sigma_y^2$, then $\rho_{wv} = \frac{ac\,\sigma_{xy}}{\sqrt{a^2\sigma_x^2 c^2\sigma_y^2}} = \frac{ac\,\sigma_{xy}}{|ac|\,\sigma_x\sigma_y} = \frac{ac}{|ac|}\rho_{xy} = \text{sign}(ac)\,\rho_{xy}$."

Since these rules work for sample statistics too, then $w = x = 1x + 0$, $v = 1.15y = 1.15y + 0$, so $a = 1$, $b = 0$, $c = 1.15$ and $d = 0$.
$\bar w = \bar x = 6.54545$
$\bar v = 1.15\bar y + 0 = 1.15(44.4182) = 51.0809$
$s_w^2 = 1^2 s_x^2 = s_x^2$, so $s_w = \sqrt{3.02273} = 1.73860$
$s_v^2 = 1.15^2 s_y^2 = 1.15^2(727.924) = 962.6795$, so $s_v = \sqrt{962.6795} = 31.0271$
$s_{wv} = 1(1.15)s_{xy} = 1.15(44.3142) = 50.9613$
$r_{wv} = \frac{ac\,s_{xy}}{\sqrt{a^2 s_x^2 c^2 s_y^2}} = \frac{ac\,s_{xy}}{|ac|\,s_x s_y} = \frac{ac}{|ac|}\frac{s_{xy}}{s_x s_y} = \text{sign}(ac)\,r_{xy} = .9447(1) = .9447$
Note that, because the $ac$ in the numerator cancels the $|ac|$ in the denominator, the only thing that $ac$ contributes to the result is its sign. If the product of $a$ and $c$ is negative we reverse the sign of $\rho_{xy}$. $\text{Sign}(ac)$ thus takes the values 1 or -1.

f) Recall that $\alpha = .10$, $\bar x = 6.54545$, $s_x^2 = 3.02273$, $s_x = \sqrt{3.02273} = 1.73860$, and that the formula table says the following.

Interval for | Confidence Interval | Hypotheses | Test Ratio | Critical Value
Mean ($\sigma$ known) | $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x}$, $\sigma_{\bar x} = \frac{\sigma}{\sqrt n}$ | $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$ | $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}}$ | $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\sigma_{\bar x}$
Mean ($\sigma$ unknown) | $\mu = \bar x \pm t_{\alpha/2}s_{\bar x}$, $DF = n - 1$, $s_{\bar x} = \frac{s}{\sqrt n}$ | $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$ | $t = \frac{\bar x - \mu_0}{s_{\bar x}}$ | $\bar x_{cv} = \mu_0 \pm t_{\alpha/2}s_{\bar x}$

$s_{\bar x} = \frac{s_x}{\sqrt n} = \sqrt{\frac{3.02273}{11}} = \sqrt{0.274794} = 0.5242$ and $t^{(n-1)}_{\alpha/2} = t^{(10)}_{.05} = 1.812$, so
$\mu = \bar x \pm t_{\alpha/2}s_{\bar x} = 6.54545 \pm 1.812(0.5242) = 6.55 \pm 0.95$, so that we can say $P(5.60 \le \mu \le 7.50) = .90$.
Or we could make a diagram showing an almost Normal curve with 6.55 in the middle, 5% below 5.60 and 5% above 7.50.

III. Do at least 5 of the following 7 Problems (at least 12 each) (or do sections adding to at least 48 points. Anything extra you do helps, and grades wrap around). Show your work! Please indicate clearly what sections of the problem you are answering! If you are following a rule like $E(ax) = aE(x)$ please state it! If you are using a formula, state it! If you answer a 'yes' or 'no' question, explain why! If you are using the Poisson or Binomial table, state things like $n$, $p$ or the mean. Avoid crossing out answers that you think are inappropriate - you might get partial credit. Choose the problems that you do carefully; most of us are unlikely to be able to do more than half of the entire possible credit in this section! This is not an opinion questionnaire. Answers without reasons or supporting calculations or table references will not be accepted!!!! Note that some problems extend over 2 pages.

1.
Find $P(4 \le x \le 12)$ for the following distributions:

a) Normal with mean of 5.5 and standard deviation of 3 (1)
$P(4 \le x \le 12) = P\left(\frac{4-5.5}{3} \le z \le \frac{12-5.5}{3}\right) = P(-0.50 \le z \le 2.17) = P(-0.50 \le z \le 0) + P(0 \le z \le 2.17) = .1915 + .4850 = .6765$

b) Continuous Uniform with $c = 2$ and $d = 10$ (2)
Make a diagram. Show a box between 2 and 10 on the $x$ axis with a height of $\frac{1}{d-c} = \frac{1}{8}$. Shade the area between 4 and 10. (There is no area between 10 and 12.) The area is $(10 - 4)\frac{1}{8} = .75$. We could also say that $P(4 \le x \le 12) = F(12) - F(4) = 1 - \frac{4-2}{10-2} = 1 - .25 = .75$.

c) Binomial with $n = 25$ and $p = .05$ (2)
For any discrete distribution for integer values of $x$, $P(4 \le x \le 12) = F(12) - F(3) = P(x \le 12) - P(x \le 3) = 1.00000 - .96591 = .0341$

d) Binomial with $n = 25$ and $p = .55$ (2)
$P(4 \le x \le 12)$ can be expressed as the probability of between 25 - 12 = 13 failures and 25 - 4 = 21 failures, when the probability of failure is 1 - .55 = .45.
$P(13 \le x \le 21) = F(21) - F(12) = P(x \le 21) - P(x \le 12) = .99999 - .69368 = .30631$

e) Geometric with $p = .05$ (2)
Recall that $F(x) = 1 - q^x$.
$P(4 \le x \le 12) = F(12) - F(3) = (1 - q^{12}) - (1 - q^3) = q^3 - q^{12} = .95^3 - .95^{12} = .95^3(1 - .95^9) = .857375(.369751) = .3170$

f) Poisson with a parameter of 20. (2)
$P(4 \le x \le 12) = F(12) - F(3) = P(x \le 12) - P(x \le 3) = .03901 - .00000 = .03901$

g) Find $P(x \ge 2)$ for a Hypergeometric distribution with $N = 25$, $M = 6$ and $n = 4$. (3)
From Great Distributions I Have Known, $P(x) = \frac{C^{N-M}_{n-x}C^{M}_{x}}{C^{N}_{n}}$ and
$P(x \ge 2) = 1 - P(x \le 1) = 1 - P(0) - P(1) = 1 - \frac{C^{19}_{4}C^{6}_{0}}{C^{25}_{4}} - \frac{C^{19}_{3}C^{6}_{1}}{C^{25}_{4}} = 1 - \frac{\frac{19!}{15!\,4!} + 6\frac{19!}{16!\,3!}}{\frac{25!}{21!\,4!}}$
$= 1 - \frac{\frac{19 \cdot 18 \cdot 17 \cdot 16}{4 \cdot 3!} + 6\frac{19 \cdot 18 \cdot 17}{3!}}{\frac{25 \cdot 24 \cdot 23 \cdot 22}{4 \cdot 3!}} = 1 - \frac{19 \cdot 18 \cdot 17(16 + 24)}{25 \cdot 24 \cdot 23 \cdot 22} = 1 - \frac{19 \cdot 18 \cdot 17(40)}{25 \cdot 24 \cdot 23 \cdot 22} = 1 - \frac{232560}{303600} = 1 - .7660 = .2340$
[14, 42]

2. Find the following: The mean and standard deviation for

a) Continuous Uniform with $c = 2$ and $d = 10$ (1)
$\mu = \frac{c+d}{2} = \frac{2+10}{2} = 6.0$, $\sigma^2 = \frac{(d-c)^2}{12} = \frac{(10-2)^2}{12} = \frac{64}{12} = 5.3333$. So $\sigma = \sqrt{5.3333} = 2.3094$.

b) Binomial with $n = 25$ and $p = .55$ (1) (Remember that $q = 1 - p$ and that both $p$ and $q$ must be between 0 and 1.)
$q = 1 - p = 1 - .55 = .45$, $\mu = np = 25(.55) = 13.75$, $\sigma^2 = npq = 13.75(.45) = 6.1875$, so $\sigma = \sqrt{npq} = \sqrt{6.1875} = 2.4875$. The average number of successes in 25 tries, when the probability of a success in an individual try is 55%, is 13.75.
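The table lookups in Problem 1 c) through g) and the moments in Problem 2 a) and b) can be cross-checked without tables. The sketch below uses only Python's standard `math` module; the helper names (`binom_cdf`, `poisson_cdf`, `geometric_cdf`, `hypergeom_pmf`) are illustrative choices made here, not part of the exam or of any library.

```python
import math

def binom_cdf(k, n, p):
    """P(x <= k) for a Binomial(n, p) variable."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, m):
    """P(x <= k) for a Poisson variable with mean m."""
    return sum(math.exp(-m) * m**i / math.factorial(i) for i in range(k + 1))

def geometric_cdf(k, p):
    """P(first success occurs on or before try k) = 1 - q^k."""
    return 1 - (1 - p)**k

def hypergeom_pmf(x, N, M, n):
    """P(x successes) in a sample of n from N items, M of which are successes."""
    return math.comb(N - M, n - x) * math.comb(M, x) / math.comb(N, n)

# Problem 1 c) to f): for integer x, P(4 <= x <= 12) = F(12) - F(3)
p1c = binom_cdf(12, 25, 0.05) - binom_cdf(3, 25, 0.05)
p1d = binom_cdf(12, 25, 0.55) - binom_cdf(3, 25, 0.55)
p1e = geometric_cdf(12, 0.05) - geometric_cdf(3, 0.05)
p1f = poisson_cdf(12, 20) - poisson_cdf(3, 20)

# Problem 1 g): P(x >= 2) = 1 - P(0) - P(1)
p1g = 1 - hypergeom_pmf(0, 25, 6, 4) - hypergeom_pmf(1, 25, 6, 4)

# Problem 2 a): continuous uniform on [c, d]
c, d = 2, 10
uniform_mean = (c + d) / 2
uniform_sd = math.sqrt((d - c) ** 2 / 12)

# Problem 2 b): binomial mean and standard deviation
n, p = 25, 0.55
binom_mean = n * p
binom_sd = math.sqrt(n * p * (1 - p))

for name, value in [("1c", p1c), ("1d", p1d), ("1e", p1e), ("1f", p1f),
                    ("1g", p1g), ("2a mean", uniform_mean), ("2a sd", uniform_sd),
                    ("2b mean", binom_mean), ("2b sd", binom_sd)]:
    print(f"{name}: {value:.5f}")
```

Computing 1 d) directly on successes, rather than by switching to failures as the table method requires, gives the same probability, which is a useful check on the switch.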
c) Geometric with $p = .05$ (1)
$\mu = \frac{1}{p} = \frac{1}{.05} = 20.000$, $\sigma^2 = \frac{q}{p^2} = \frac{.95}{.05^2} = 380$, so $\sigma = \sqrt{380} = 19.4936$. When the probability of a success in an individual try is 5% and we play a game repeatedly, on the average our first success will occur on the 20th try.

d) Poisson with a parameter of 20. (1)
$\mu = m = 20$, $\sigma^2 = m = 20$, so $\sigma = \sqrt m = \sqrt{20} = 4.4721$.

e) Hypergeometric distribution with $N = 25$, $M = 6$ and $n = 4$. (1)
If $p = \frac{M}{N} = \frac{6}{25} = .24$ and $q = 1 - p = .76$, then $\mu = np = 4(.24) = .96$ and $\sigma^2 = npq\frac{N-n}{N-1} = 4(.24)(.76)\frac{25-4}{25-1} = 0.7296(.875) = 0.6384$, so $\sigma = \sqrt{0.6384} = 0.7990$. If we take a sample of 4 from a population of 25 that is 24% successes, the average number of successes will be 0.96.

f) You wish to find $P(4 \le x \le 12)$ for a binomial distribution with $n = 90$ and $p = .02$. You do not have the appropriate binomial table, so show that you can use an approximation and do the problem using that approximation. (4)
When you have a binomial situation with many tries and a small probability of success on any one try you can use the Poisson distribution. The criterion is $\frac{n}{p} > 500$. In this case, $\frac{n}{p} = \frac{90}{.02} = 4500$, which is well over 500. The mean is $\mu = np = 90(.02) = 1.8$. For any discrete distribution for integer values of $x$, $P(4 \le x \le 12) = F(12) - F(3) = P(x \le 12) - P(x \le 3) = 1.00000 - .89129 = .10871$

g) You are taking a sample of 90 from a population of 2000 that is 10% defective and want the probability of more than 25 defective items in the sample.
(i) Show that you can use the Binomial distribution in this result. (1)
(ii) Show that you can use the Normal or Poisson distribution to replace the Binomial distribution and do the problem. (4+)
(iii) Show the effect of using a continuity correction on this result. (2)
Solution:
(i) You can use the Binomial distribution if $N > 20n$. Since $20n = 20(90) = 1800$ is below $N = 2000$, use the Binomial distribution. (1)
(ii) At the moment, we are trying to use the Binomial distribution with $n = 90$ and $p = .10$, but we do not have a table for $n = 90$. To use the Poisson distribution we need $\frac{n}{p} > 500$. In this case, $\frac{n}{p} = \frac{90}{.10} = 900$, which is well over 500. If we want to use the Poisson distribution with a mean of $E(x) = np = .10(90) = 9$, $P(x \ge 25) = 1 - P(x \le 24) = 1 - .99999 = .00001$. If we want to use the Normal distribution, we must show that the expected number of successes and the expected number of failures are both above 5. We already know that the expected number of successes is 9. The expected number of failures is 90 - 9 = 81, so we can use the Normal distribution with $\mu = np = 90(.1) = 9$ and $\sigma = \sqrt{npq} = \sqrt{9(.9)} = \sqrt{8.1} = 2.8460$. $P(x \le 24) = P\left(z \le \frac{24-9}{2.8460}\right) = P(z \le 5.27) \approx 1$ and $P(x \ge 25) = 1 - P(x \le 24) = 1 - 1 = 0$. (4+)
(iii) There is no effect from using a continuity correction on this result. (2)
$P_B(x \ge 25) = 1 - P_B(x \le 24) \approx 1 - P_N(x \le 24.5) = 1 - P\left(z \le \frac{24.5-9}{2.8460}\right) = 1 - P(z \le 5.45) = 1 - 1 = 0$. ($B$ subscript for Binomial, $N$ for Normal.)

h) You are taking a sample of $n = 100$ from a population that is 60% in favor of your candidate. You will declare your candidate the victor if $\bar p \ge .55$. Find $P(\bar p \ge .55)$ when the true population proportion is $p = .60$. (3)
Solution: There are 2 ways to do this. If we stick to the binomial distribution, we are looking for the probability of 55 or more successes, which means 45 or fewer failures when the probability of failure is .40. From the binomial table this probability is .86891. On the other hand we might remember that this is Exercise 7.15. This Problem asks for $P(\bar p \ge .55)$ when $p$ has various values and $n = 100$. Because $\bar p \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$, $z = \frac{\bar p - p}{\sigma_{\bar p}} = \frac{\bar p - p}{\sqrt{\frac{pq}{n}}}$. And in part b) $p = .60$ and $P(\bar p \ge .55) = P\left(z \ge \frac{.55 - .60}{\sqrt{\frac{.60(.40)}{100}}}\right) = P\left(z \ge \frac{.55 - .60}{\sqrt{.00240}}\right) = P(z \ge -1.02) = .5 + .3461 = .8461$. If we are absolutely insane, we might try a continuity correction. $P(\bar p \ge .55) \approx P\left(z \ge \frac{.545 - .60}{\sqrt{.00240}}\right) = P(z \ge -1.12) = .5 + .3686 = .8686$

i) (Extra Credit) The Negative Binomial distribution gives the probability that there will be $x$ failures before success $n$. We have the following formulas for it: $P(x) = C^{n+x-1}_{n-1}p^n q^x$, $\mu = \frac{nq}{p}$ and $\sigma^2 = \frac{nq}{p^2}$. Assume that $p = .3$.
(i) What is the chance that our 5th success occurs after the 8th failure?
(2)
(ii) What is the average number of failures that will occur before the 5th success? (2)
(iii) Show that the Geometric distribution is a special case of the Negative Binomial distribution by showing under what conditions $P(x)$, $\mu$ and $\sigma^2$ are the same for both distributions. (5)
Solution:
(i) The chance that our 5th success occurs after the 8th failure requires $x = 8$ and $n = 5$. $P(x) = C^{n+x-1}_{n-1}p^n q^x$, $n + x - 1 = 5 + 8 - 1 = 12$.
$P(8) = C^{12}_{4}p^5 q^8 = \frac{12!}{8!\,4!}(.3)^5(.7)^8 = \frac{12 \cdot 11 \cdot 10 \cdot 9}{4 \cdot 3 \cdot 2 \cdot 1}(.00243)(.057648) = .06934$
(ii) The average number of failures that will occur before the 5th success is $\mu = \frac{nq}{p} = \frac{5(.7)}{.3} = 11.67$.
(iii) The Geometric distribution is a special case of the Negative Binomial distribution if we can show the following facts from Great Distributions I Have Known.

Distribution | Uses | Formula | Mean | Variance
Geometric | Gives probability that the first success occurs on try $x$. | $P(x) = q^{x-1}p$, $F(x) = 1 - q^x$ | $\frac{1}{p}$ | $\frac{q}{p^2}$

To say that the first success occurs on try $x$ is the same as saying that $x - 1$ failures occurred before the first success. If we substitute $x - 1$ for $x$ in the negative Binomial formula and set $n = 1$, $P(x-1) = C^{1+(x-1)-1}_{1-1}p^1 q^{x-1} = C^{x-1}_{0}pq^{x-1} = q^{x-1}p$. The Geometric distribution says that the average try on which the first success occurs is try $\frac{1}{p}$, but this means we must expect $\frac{1}{p} - 1 = \frac{1-p}{p} = \frac{q}{p}$ failures before the first success. But if we substitute $n = 1$ into the negative binomial formula we get $\mu = \frac{nq}{p} = \frac{1q}{p} = \frac{q}{p}$ and $\sigma^2 = \frac{nq}{p^2} = \frac{1q}{p^2} = \frac{q}{p^2}$. (5)
[19+, 61]

3. A random sample is taken of the time necessary for a firm to approve 49 life insurance policies. From the sample we get a sample mean of 43 days and a standard deviation of 24 days.
a) Find a 99% confidence interval for the mean processing time assuming that the sample standard deviation is correct and that the sample of 49 policies was taken from only 400 policies. (4)
b) On the basis of long experience, we know that the population standard deviation for the policies was 20.
Find a 99% confidence interval assuming that this population standard deviation is correct and that the sample comes from a large number of policies. (4)
c) Find a 99% confidence interval for the mean processing time assuming that the population standard deviation of 20 is correct and that the sample of 49 policies was taken from only 400 policies. (4)
d) Repeat b) with a 74% confidence level. You cannot use the t table to answer this question correctly. (3)
[12, 73]

Solution: From the introduction to problem O1. We are using the following formulas.
When $\sigma$ is known, $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x}$, where $\sigma_{\bar x} = \frac{\sigma_x}{\sqrt n}$, or $\sigma_{\bar x} = \frac{\sigma_x}{\sqrt n}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
When $\sigma$ is unknown, $\mu = \bar x \pm t^{(n-1)}_{\alpha/2}s_{\bar x}$, where $s_{\bar x} = \frac{s_x}{\sqrt n}$, or $s_{\bar x} = \frac{s_x}{\sqrt n}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
For a)-c), $\alpha = 1 - .99 = .01$.

a) We find a 99% confidence interval for the mean processing time assuming that $\bar x = 43$, $s = 24$, $n = 49$ and $N = 400$. Because we are given $s$ we should use $t$. Because of the small population use a finite population correction.
$s_{\bar x} = \frac{s_x}{\sqrt n}\sqrt{\frac{N-n}{N-1}} = \frac{24}{\sqrt{49}}\sqrt{\frac{400-49}{400-1}} = 3.42857\sqrt{\frac{351}{399}} = 3.21573$. $t^{(n-1)}_{\alpha/2} = t^{(48)}_{.005} = 2.682$.
$\mu = \bar x \pm t^{(n-1)}_{\alpha/2}s_{\bar x} = 43 \pm 2.682(3.21573) = 43 \pm 8.62$, so $P(34.38 \le \mu \le 51.62) = .99$

b) We find a 99% confidence interval for the mean processing time assuming that $\bar x = 43$, $\sigma = 20$, $n = 49$ and $N$ is large. Because we are given $\sigma$ we should use $z$. Because of the large population $\sigma_{\bar x} = \frac{\sigma_x}{\sqrt n} = \frac{20}{\sqrt{49}} = 2.85714$. $z_{\alpha/2} = z_{.005} = 2.576$.
$\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x} = 43 \pm 2.576(2.85714) = 43 \pm 7.36$, so $P(35.64 \le \mu \le 50.36) = .99$

c) We find a 99% confidence interval for the mean processing time assuming that $\bar x = 43$, $\sigma = 20$, $n = 49$ and $N = 400$. Because we are given $\sigma$ we should use $z$. Because of the small population use a finite population correction.
$\sigma_{\bar x} = \frac{\sigma_x}{\sqrt n}\sqrt{\frac{N-n}{N-1}} = \frac{20}{\sqrt{49}}\sqrt{\frac{400-49}{400-1}} = 2.85714\sqrt{\frac{351}{399}} = 2.67978$. $z_{\alpha/2} = z_{.005} = 2.576$.
$\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x} = 43 \pm 2.576(2.67978) = 43 \pm 6.90$, so $P(36.10 \le \mu \le 49.90) = .99$

d) We find a 74% confidence interval for the mean processing time assuming that $\bar x = 43$, $\sigma = 20$, $n = 49$ and $N$ is large. Because we are given $\sigma$ we should use $z$. Because of the large population $\sigma_{\bar x} = \frac{\sigma_x}{\sqrt n} = \frac{20}{\sqrt{49}} = 2.85714$. If $\alpha = 1 - .74 = .26$, $z_{\alpha/2} = z_{.13}$. From the front page, $z_{.13} = 1.13$, so $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x} = 43 \pm 1.13(2.85714) = 43 \pm 3.23$ and $P(39.77 \le \mu \le 46.23) = .74$.

4. A supermarket clerk is believed to take a mean time of 2.6 minutes with a standard deviation of 2.1 minutes to check out a customer. Assume that the underlying distribution is Normal.

a) What is the probability that the clerk will take more than 2.8 minutes to check out an individual? (2)
Solution: $x \sim N(2.6, 2.1)$
$P(x > 2.8) = P\left(z > \frac{2.8-2.6}{2.1}\right) = P(z > 0.10) = .5 - .0398 = .4602$
For $x \sim N(2.6, 2.1)$ make a Normal curve centered at 2.6 and shade the area above (right of) 2.8; for $z$ make a Normal curve centered at zero and shade the area above 0.10. Because the area is on one side of the mean, you are subtracting.

b) If 3 people come to this clerk, what is the probability that at least one of the people takes more than 2.8 minutes to check out? (2)
Solution: This is a binomial problem with $p = .4602$, $q = .5398$ and $n = 3$. $P(x) = C^{n}_{x}p^x q^{n-x}$ and $P(x \ge 1) = 1 - P(0) = 1 - C^{3}_{0}p^0 q^3 = 1 - .5398^3 = 1 - .1573 = .8427$.

c) If 3 people come to this clerk, what is the probability that their average checkout time will be more than 2.8 minutes? (3)
Solution: This is a problem involving a sample mean. If $x \sim N(2.6, 2.1)$, $\bar x \sim N\left(2.6, \frac{2.1}{\sqrt n}\right) = N\left(2.6, \frac{2.1}{\sqrt 3}\right) = N(2.6, 1.2124)$. Your diagrams should be very similar to those in a).
$P(\bar x > 2.8) = P\left(z > \frac{2.8-2.6}{1.2124}\right) = P(z > 0.16) = .5 - .0636 = .4364$

d) If 7 people are in line, what is the probability that the clerk will be able to check them all out before the clerk goes on break in 18 minutes? (3)
Solution: This is what I called a 'birthday party' problem. The seven people will be served if the sample mean of the serving time is below $\frac{18}{7} = 2.5714$ minutes.
$\bar x \sim N\left(2.6, \frac{2.1}{\sqrt n}\right) = N\left(2.6, \frac{2.1}{\sqrt 7}\right) = N(2.6, 0.7937)$
$P(\bar x < 2.5714) = P\left(z < \frac{2.5714-2.6}{0.7937}\right) = P(z < -0.04) = .5 - .0160 = .4840$
For $z$ make a Normal curve centered at zero and shade the area below -0.04. Because the area is on one side of the mean, you are subtracting.

e) Do not guess on this question. Show what formula you should use!
Let $x$ represent the supermarket's revenues and $y$ represent their costs, so that profits are a random variable $w = x - y$. Will a smaller correlation between $x$ and $y$
(i) decrease or increase the expected value of profit? (1)
(ii) decrease or increase the variability of profit? (2)
[13, 86]
Solution: The latest version of the outline that I posted for the last Take-home says "If $a$, $c$ and $d$ are constants, $Var(ax + cy) = a^2 Var(x) + c^2 Var(y) + 2ac\,Cov(x, y)$. (This implies) that $E(ax + cy + d) = aE(x) + cE(y) + d$ and $Var(ax + cy + d) = a^2 Var(x) + c^2 Var(y) + 2ac\,Cov(x, y)$." Let $a = 1$, $c = -1$ and $d = 0$.
(i) $E(x - y) = E(1x + (-1)y + 0) = 1E(x) + (-1)E(y) + 0 = E(x) - E(y)$. This is not affected by the covariance or the correlation.
(ii) $Var(x - y) = Var(1x + (-1)y + 0) = 1^2 Var(x) + (-1)^2 Var(y) + 2(1)(-1)Cov(x, y) = Var(x) + Var(y) - 2Cov(x, y)$. Because the covariance is a multiple of the correlation, a smaller (positive) correlation will make the covariance smaller and the variance larger.

5. (Example modified from Hildebrand, Ott, and Gray) A bank officer estimates the joint probabilities for percent return on two utility bonds as shown below. For your convenience, most row and column sums are given in the joint probability table, and most column sums, which you do not need but which may speed your computations, are given in the $xyP(x, y)$ table.

Original Data (rows are $y$, columns are $x$)

  y \ x    4.0    4.5    5.0    5.5    6.0    sum
   4.0     .03    .04    .03    .00    .00    .10
   4.5     .04    .06    .06    .04    .00    .20
   5.0     .02    .08    .20    .08    .02    .40
   5.5     .00    .04    .06    .06    .03    .19
   6.0     .00    .00    .03    .03    __     __
   sum     .09    .22    .38    .21    __

$xyP(x, y)$ (rows are $y$, columns are $x$)

  y \ x    4.0      4.5      5.0      5.5      6.0
   4.0    0.480    0.720    0.600    0.000    0.000
   4.5    0.720    1.215    1.350    0.990    0.000
   5.0    0.400    1.800    5.000    2.200    0.600
   5.5    0.000    0.990    1.650    1.815    0.990
   6.0    0.000    0.000    0.900    0.990    ___

Column sums are 1.600, 4.725, 9.500, 5.995 and ____
Note that $E(x) = 5.005$ and $\sigma_x^2 = 0.297475$.

a) If $x$ and $y$ were independent, what would the number be in the blank space on the original joint probability table?
(1)
If we fill in the numbers to get them to add to 1, we get the numbers below.

  y \ x    4.0    4.5    5.0    5.5    6.0    sum
   4.0     .03    .04    .03    .00    .00    .10
   4.5     .04    .06    .06    .04    .00    .20
   5.0     .02    .08    .20    .08    .02    .40
   5.5     .00    .04    .06    .06    .03    .19
   6.0     .00    .00    .03    .03    __     .11
   sum     .09    .22    .38    .21    .10   1.00

If $x$ and $y$ were independent, the number in the lower right corner would be $.10(.11) = .011$.

b) What should the number be in the blank space in the joint probability table? Fill in all the blank spots in the tables. (2)
Adding .05 in the lower right corner makes it add to 1. The missing number in the $E(xy)$ table is $xyP(x, y) = 6(6)(.05) = 1.800$.

  y \ x    4.0    4.5    5.0    5.5    6.0    sum
   4.0     .03    .04    .03    .00    .00    .10
   4.5     .04    .06    .06    .04    .00    .20
   5.0     .02    .08    .20    .08    .02    .40
   5.5     .00    .04    .06    .06    .03    .19
   6.0     .00    .00    .03    .03    .05    .11
   sum     .09    .22    .38    .21    .10   1.00

$xyP(x, y)$

  y \ x    4.0      4.5      5.0      5.5      6.0
   4.0    0.480    0.720    0.600    0.000    0.000
   4.5    0.720    1.215    1.350    0.990    0.000
   5.0    0.400    1.800    5.000    2.200    0.600
   5.5    0.000    0.990    1.650    1.815    0.990
   6.0    0.000    0.000    0.900    0.990    1.800

Column sums are 1.600, 4.725, 9.500, 5.995 and 3.390.
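The completed joint table can be checked mechanically. The sketch below is only an illustration in plain Python: the names `joint`, `px`, `py` and so on are chosen here, and the table is stored with rows for $y$ and columns for $x$, which is an assumption about the layout read off the row and column sums. It recomputes the marginal probabilities, the $xyP(x, y)$ column sums, and the moments that parts c) and d) work out by hand.

```python
import math

xs = [4.0, 4.5, 5.0, 5.5, 6.0]
ys = [4.0, 4.5, 5.0, 5.5, 6.0]

# joint[i][j] = P(x = xs[j], y = ys[i]); rows are y, columns are x
joint = [
    [.03, .04, .03, .00, .00],
    [.04, .06, .06, .04, .00],
    [.02, .08, .20, .08, .02],
    [.00, .04, .06, .06, .03],
    [.00, .00, .03, .03, .05],
]

# Marginal distributions: column sums give P(x), row sums give P(y)
px = [sum(row[j] for row in joint) for j in range(len(xs))]
py = [sum(row) for row in joint]

# Column sums of the x*y*P(x, y) table, one per value of x
xy_col_sums = [
    sum(xs[j] * ys[i] * joint[i][j] for i in range(len(ys)))
    for j in range(len(xs))
]

ex = sum(x * p for x, p in zip(xs, px))
ey = sum(y * p for y, p in zip(ys, py))
var_x = sum(x * x * p for x, p in zip(xs, px)) - ex ** 2
var_y = sum(y * y * p for y, p in zip(ys, py)) - ey ** 2
exy = sum(xy_col_sums)

cov_xy = exy - ex * ey
corr_xy = cov_xy / math.sqrt(var_x * var_y)

print([round(s, 3) for s in xy_col_sums])
print(round(exy, 4), round(cov_xy, 6), round(corr_xy, 6))
```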
Note that $1.600 + 4.725 + 9.500 + 5.995 + 3.390 = 25.2100 = E(xy)$.

c) Compute $\mu_y$ and the population standard deviation of $y$. (2)

   y \ x      4.0     4.5     5.0     5.5      6.0     P(y)     yP(y)    y^2 P(y)
    4.0       .03     .04     .03     .00      .00     .10      0.400     1.6000
    4.5       .04     .06     .06     .04      .00     .20      0.900     4.0500
    5.0       .02     .08     .20     .08      .02     .40      2.000    10.0000
    5.5       .00     .04     .06     .06      .03     .19      1.045     5.7475
    6.0       .00     .00     .03     .03      .05     .11      0.660     3.9600
   P(x)       .09     .22     .38     .21      .10    1.00      5.005    25.3575
   xP(x)     0.36    0.99    1.90    1.155    0.60    5.005
 x^2 P(x)    1.44    4.455   9.50    6.3525   3.60   25.3475

To summarize: $\sum P(x) = 1$, $E(x) = \sum xP(x) = 5.005$, $E(x^2) = \sum x^2 P(x) = 25.3475$, $\sum P(y) = 1$, $E(y) = \sum yP(y) = 5.005$ and $E(y^2) = \sum y^2 P(y) = 25.3575$.
$\sigma_x^2 = E(x^2) - \mu_x^2 = 25.3475 - 5.005^2 = 0.297475$, $\sigma_x = \sqrt{0.297475} = 0.545413$ and
$\sigma_y^2 = E(y^2) - \mu_y^2 = 25.3575 - 5.005^2 = 0.307475$, $\sigma_y = \sqrt{0.307475} = 0.554504$

d) Compute $\sigma_{xy} = Cov(x, y)$ and $\rho_{xy} = Corr(x, y)$. (4)
$E(xy) = \sum\sum xyP(x, y) = 25.2100$, the sum of all 25 entries of the $xyP(x, y)$ table.
$\sigma_{xy} = Cov(x, y) = E(xy) - \mu_x\mu_y = 25.2100 - 5.005(5.005) = 0.159975$
$\rho_{xy} = Corr(x, y) = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{0.159975}{0.545413(0.554504)} = 0.528959$

e) Compute $E(x + y)$ and $Var(x + y)$. (2)
$E(x + y) = E(x) + E(y) = \mu_x + \mu_y = 5.005 + 5.005 = 10.010$ and
$Var(x + y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = Var(x) + Var(y) + 2Cov(x, y) = 0.297475 + 0.307475 + 2(0.159975) = 0.9249$

f) If the officer buys one unit of bond $x$ and 5 units of bond $y$, so that the total return is (corrected) $R = x + 5y$, what are the mean, standard deviation and coefficient of variation of $R$? (4)
Solution: In 4e we quoted $E(ax + cy + d) = aE(x) + cE(y) + d$ and $Var(ax + cy + d) = a^2 Var(x) + c^2 Var(y) + 2ac\,Cov(x, y)$, so $E(x + 5y) = E(x) + 5E(y) = 5.005 + 5(5.005) = 30.03$ and $Var(x + 5y) = Var(x) + 25Var(y) + 10Cov(x, y) = 0.297475 + 25(0.307475) + 10(0.159975) = 0.297475 + 7.686875 + 1.59975 = 9.5841$. So the standard deviation is $\sqrt{9.5841} = 3.0958$ and the coefficient of variation is $\frac{3.0958}{30.03} = 0.1031$.
[15, 101]

6. Consider the following joint probability table.
          B1     B2     B3     sum
   A1     .10    .15    .50    .75
   A2     .00    .05    .00    .05
   A3     .18    .02    .00    .20
   sum    .28    .22    .50

a) Find $P(A_1 \cup B_2)$ (2)
b) Find $P(A_1 \mid B_2)$ (2)
c) Find $P(A_1 \cap B_2)$ (1)
Solution: The table says $P(A_1) = .75$, $P(B_2) = .22$, $P(A_1 \cap B_2) = .15$.
a) $P(A_1 \cup B_2) = P(A_1) + P(B_2) - P(A_1 \cap B_2) = .75 + .22 - .15 = .82$
b) $P(A_1 \mid B_2) = \frac{P(A_1 \cap B_2)}{P(B_2)} = \frac{.15}{.22} = .6818$
c) $P(A_1 \cap B_2) = .15$

d) Fill in the blanks if $A_1$ and $B_1$ are mutually exclusive. (2)

          B1     B2     B3     sum
   A1     __     __     __     __
   A2     __     __     .05    .25
   A3     __     .15    .10    .45
   sum    __     .45    .30

Solution: $A_1$ and $B_1$ are mutually exclusive if $P(A_1 \cap B_1) = 0$. If we fill this in, we get

          B1     B2     B3     sum
   A1     .00    __     __     __
   A2     __     __     .05    .25
   A3     __     .15    .10    .45
   sum    __     .45    .30

Then make the rest add up.

          B1     B2     B3     sum
   A1     .00    .15    .15    .30
   A2     .05    .15    .05    .25
   A3     .20    .15    .10    .45
   sum    .25    .45    .30   1.00

e) Fill in the blanks if $A_1$ and $B_1$ are independent. (2)

          B1     B2     B3     sum
   A1     __     __     __     __
   A2     __     __     .06    .20
   A3     __     .16    .12    .40
   sum    __     .40    .30

Solution: $A_1$ and $B_1$ are independent if $P(A_1 \cap B_1) = P(A_1)P(B_1)$. If we fill this in, we get

          B1     B2     B3     sum
   A1     .12    __     __     .40
   A2     __     __     .06    .20
   A3     __     .16    .12    .40
   sum    .30    .40    .30   1.00

The easiest way to do the rest is to assume independence all over and get

          B1     B2     B3     sum
   A1     .12    .16    .12    .40
   A2     .06    .08    .06    .20
   A3     .12    .16    .12    .40
   sum    .30    .40    .30   1.00

f) Fill in the blanks if $A_1$ and $B_1$ are collectively exhaustive. (2)

          B1     B2     B3     sum
   A1     __     __     __     __
   A2     __     __     .00    .05
   A3     __     .00    .00    .25
   sum    __     .15    .50

Solution: $A_1$ and $B_1$ are collectively exhaustive if $P(A_1 \cup B_1) = 1$. Fill in the marginal probabilities and get

          B1     B2     B3     sum
   A1     __     __     __     .70
   A2     __     __     .00    .05
   A3     __     .00    .00    .25
   sum    .35    .15    .50   1.00

Now if $P(A_1 \cup B_1) = P(A_1) + P(B_1) - P(A_1 \cap B_1) = .70 + .35 - P(A_1 \cap B_1) = 1.00$, $P(A_1 \cap B_1)$ must be .05. If we put .05 in and make it add up, we get

          B1     B2     B3     sum
   A1     .05    .15    .50    .70
   A2     .05    .00    .00    .05
   A3     .25    .00    .00    .25
   sum    .35    .15    .50   1.00

An easier way to do this is to realize that $P(A_2 \cap B_2) = 0$ because it is outside of $A_1 \cup B_1$.

g) (Ben Horim and Levy) The senate is about to vote on a tax bill. There is a 45% chance that it will pass.
If it passes there is a 15% chance that unemployment will increase in the next six months. If it does not pass there is a 60% chance that unemployment will increase in the next six months. Use the following notation: $TP$ is the event that the bill passes and $UI$ is the event that unemployment increases.
(i) What is the probability that unemployment increases in the next six months? (2)
(ii) You were out hunting minces (you make mince pie from them) in BanjiWanjiland for the last six months and never found out if the tax bill passed. When you get back you find out that unemployment has increased. What is the (posterior) probability that the tax bill passed? (3)
[16, 117]
Solution: This is a simple Bayes' Rule problem. The following facts are given: $P(TP) = .45$, $P(UI \mid TP) = .15$ and $P(UI \mid \overline{TP}) = .60$. We can immediately say $P(\overline{TP}) = 1 - P(TP) = 1 - .45 = .55$.
(i) $P(UI) = P(UI \cap TP) + P(UI \cap \overline{TP}) = P(UI \mid TP)P(TP) + P(UI \mid \overline{TP})P(\overline{TP}) = .15(.45) + .60(.55) = .0675 + .3300 = .3975$.
(ii) Bayes' Rule says $P(TP \mid UI) = \frac{P(UI \mid TP)P(TP)}{P(UI)} = \frac{.15(.45)}{.3975} = .1698$. But there are other ways to do this. You could set up a box based on the chances of the bill passing,

                 TP       not TP
   UI
   not UI
                 .45      .55      1.00

and then use the .15 and the .60 to get the joint probabilities,

                 TP       not TP
   UI            .0675    .3300
   not UI
                 .4500    .5500    1.00

and fill in the rest.

                 TP       not TP
   UI            .0675    .3300    .3975
   not UI        .3825    .2200    .6025
                 .4500    .5500   1.0000

Then $P(TP \mid UI) = \frac{P(UI \cap TP)}{P(UI)} = \frac{.0675}{.3975} = .1698$

7. Do the following.
a) Assume that the average number of students logging onto a system every hour is 900.
(i) What is the chance that none will log on in 30 seconds? (2)
(ii) What is the chance that none will log on in one minute? (1)
(iii) What is the chance that none will log on in ten minutes? (2)
Solution:
(i) 900 divided into 120 30-second intervals is 7.5. For Poisson(7.5), $P(0) = e^{-7.5} = .00055$.
(ii) The mean is now 15 and $P(0) = e^{-15} \approx .0000003$.
(iii) The mean is now 150 and using the Poisson formula $P(0) = e^{-150} = 7(10)^{-66} \approx 0$. Of course, if you want to work a little harder try the Normal approximation to the Poisson.
$P(x = 0) \approx P_N(-0.5 \le x \le 0.5) = P\left(\frac{-0.5-150}{\sqrt{150}} \le z \le \frac{0.5-150}{\sqrt{150}}\right) = P(-12.29 \le z \le -12.21) = .5 - .5 = 0$.

b) A shipment of 20 items is 30% defective. To decide whether to accept the shipment, you will take a sample of 4 items and reject the shipment if more than one item in the sample is defective. What is the probability that you will reject the shipment? (3)
Solution: This is Hypergeometric with $N = 20$, $M = Np = 6$ and $n = 4$. Repeating 1g, $P(x) = \frac{C^{N-M}_{n-x}C^{M}_{x}}{C^{N}_{n}}$ and
$P(x > 1) = P(x \ge 2) = 1 - P(x \le 1) = 1 - P(0) - P(1) = 1 - \frac{C^{14}_{4}C^{6}_{0}}{C^{20}_{4}} - \frac{C^{14}_{3}C^{6}_{1}}{C^{20}_{4}} = 1 - \frac{\frac{14!}{10!\,4!} + 6\frac{14!}{11!\,3!}}{\frac{20!}{16!\,4!}}$
$= 1 - \frac{\frac{14 \cdot 13 \cdot 12 \cdot 11}{4 \cdot 3!} + 6\frac{14 \cdot 13 \cdot 12}{3!}}{\frac{20 \cdot 19 \cdot 18 \cdot 17}{4 \cdot 3!}} = 1 - \frac{1001 + 2184}{4845} = 1 - .6574 = .3426$

c) Repeat b) assuming that there are many items in the shipment. (3)
Solution: Binomial with $n = 4$ and $p = .30$. $P(x > 1) = P(x \ge 2) = 1 - P(x \le 1) = 1 - .6517 = .3483$

d) A large shipment is 30% defective. To decide whether to accept the shipment, you will take a sample of 30 items and reject the shipment if more than three items in the sample are defective. What is the probability that you will reject the shipment? (3)
Solution: Binomial with $n = 30$ and $p = .30$. $\mu = np = 30(.30) = 9 > 5$ and $nq = 21 > 5$, so that we can substitute the Normal distribution. $\sigma^2 = npq = 9(.70) = 6.30$.
$P(x > 3) = P(x \ge 4) \approx P_N(x \ge 3.5) = P\left(z \ge \frac{3.5-9}{\sqrt{6.30}}\right) = P(z \ge -2.19) = .5 + .4857 = .9857$

e) A large shipment is 2% defective. To decide whether to accept the shipment, you will take a sample of 30 items and reject the shipment if more than three items in the sample are defective. What is the probability that you will reject the shipment? (You will find that because of the small probability, this problem is done very differently from d).) (3)
Solution: Binomial with $n = 30$ and $p = .02$. $\mu = np = 30(.02) = 0.6 < 5$. But $\frac{n}{p} = \frac{30}{.02} = 1500 > 500$, so we can use the Poisson distribution with a mean of 0.6.
$P(x > 3) = 1 - P(x \le 3) = 1 - .99664 = .00336$

f) (Extra Credit) If a teller at the Grover's Corner bank can serve 30 customers an hour, the average time to serve a customer is 2 minutes.
Assume that the exponential distribution applies and find the following:
(i) The probability that it takes more than 5 minutes to serve a customer. (2)
(ii) The probability that it takes no more than 2 minutes to serve a customer. (2)
(iii) The probability that it takes between 2 and 5 minutes to serve a customer. (1)
Solution:
(i) If $\frac{1}{c} = 2$, then $c = 0.5$. $F(x) = 1 - e^{-cx}$.
$P(x > 5) = 1 - F(5) = 1 - \left(1 - e^{-.5(5)}\right) = e^{-2.5} = .0821$
(ii) $P(x \le 2) = F(2) = 1 - e^{-.5(2)} = 1 - e^{-1.0} = 1 - .3679 = .6321$
(iii) $P(2 \le x \le 5) = F(5) - F(2) = \left(1 - e^{-.5(5)}\right) - \left(1 - e^{-.5(2)}\right) = e^{-1} - e^{-2.5} = .3679 - .0821 = .2858$
[17, 134]
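The three exponential probabilities in Problem 7 f) follow directly from $F(x) = 1 - e^{-cx}$, so they are easy to confirm numerically. This is a minimal sketch assuming $c = 0.5$ per minute (a mean service time of 2 minutes); the function name `expon_cdf` is an illustrative choice, not something from the course materials.

```python
import math

c = 0.5  # assumed rate per minute, so the mean service time 1/c is 2 minutes

def expon_cdf(x, rate):
    """Cumulative probability F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x)

p_more_than_5 = 1.0 - expon_cdf(5, c)               # (i)   P(x > 5)
p_at_most_2 = expon_cdf(2, c)                       # (ii)  P(x <= 2)
p_between_2_and_5 = expon_cdf(5, c) - expon_cdf(2, c)  # (iii) P(2 <= x <= 5)

print(round(p_more_than_5, 4), round(p_at_most_2, 4), round(p_between_2_and_5, 4))
```

Because the three events split the whole time axis, the three probabilities should add to 1, which is a quick sanity check on any hand computation.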