251y0233 12/02/02
ECO251 QBA1 THIRD EXAM NOVEMBER 20, 2001
Name: _____Key__________
(Open this document in 'Print Layout' view!)
Section Enrolled: (Circle) MWF 10   MWF 11   TR 11

Since there is no correct answer for the take-home exam, a possible solution appears below. Many of the errors you made are in ‘Things that you should never do on a Statistics Exam or Anyplace Else.’ Wake up and read it!

251x023a 11/16/01
ECO251 QBA1 THIRD EXAM NOVEMBER 22-23, 2002
TAKE HOME SECTION
Name: Seymour Butz
Section Enrolled: (Circle) MWF 10   MWF 11   TR 11   TR 12:30
Social Security Number: 234567891

Throughout this exam show your work! Please indicate clearly what sections of the problem you are answering! If you are following a rule like $E(ax) = aE(x)$, please state it! If you are using a formula, state it! If you answer a 'yes' or 'no' question, explain why!

Part I. Do all the Following (10 Points) Show your work!

My Social Security Number is 265398248. If I write it as below, so that, for example, $x_3 = 5$, Minitab tells me that the sample mean is 5.222 and the sample standard deviation is 2.682. (Please don’t do these again!) Write your Social Security number next to it in the same way and call it $y$.

  obs   1  2  3  4  5  6  7  8  9
  x     2  6  5  3  9  8  2  4  8

Compute the following, showing the steps clearly as if you had done it by hand. Do not tell me “That is what my calculator said,” though you are welcome to use your calculator, Excel or Minitab to check your work:
1. The sample variance of $y$. (2)
2. The sample covariance between $x$ and $y$. (2)
3. The sample correlation between $x$ and $y$. (2)
4. Interpret the correlation. (1)
Answers to questions 5) and 6) must be based on the mean, standard deviation, covariance and correlation that you found in questions 1-4. Do not recompute the answers after changing $x$ or $y$!
5. If all the numbers in $y$ rise by 20 (so that if $y$ was [2, 3, 4, 5, 6, 7, 8, 9, 1], it is now [22, 23, 24, 25, 26, 27, 28, 29, 21]), what will the new mean and standard deviation of $y$ be?
What will the correlation between $x$ and $y$ be now? (1.5)
6. If, instead, all the numbers in $y$ are multiplied by 3.5 (so that if $y$ was [2, 3, 4, 5, 6, 7, 8, 9, 1], it is now [7, 10.5, 14, 17.5, 21, 24.5, 28, 31.5, 3.5]), what will the new mean and standard deviation of $y$ be? What will the correlation between $x$ and $y$ be now? (1.5)

251y0233 11/19/02

Worksheet

  obs     x    x^2     y    y^2     xy
   1      2      4     2      4      4
   2      6     36     3      9     18
   3      5     25     4     16     20
   4      3      9     5     25     15
   5      9     81     6     36     54
   6      8     64     7     49     56
   7      2      4     8     64     16
   8      4     16     9     81     36
   9      8     64     1      1      8
  Sum    47    303    45    285    227

$\bar{x} = \frac{\sum x}{n} = \frac{47}{9} = 5.22222$

$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{303 - 9(5.22222)^2}{8} = 7.19444$, so $s_x = \sqrt{7.19444} = 2.68225$

$\bar{y} = \frac{\sum y}{n} = \frac{45}{9} = 5.00000$

$s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{285 - 9(5.00000)^2}{8} = 7.50000$, so $s_y = \sqrt{7.50000} = 2.73861$

$n = 9$, $\sum x = 47$, $\sum x^2 = 303$, $\sum y = 45$, $\sum y^2 = 285$ and $\sum xy = 227$.

Note that the $x$ and $x^2$ columns and their sums were not needed, since the mean and standard deviation of $x$ were given!

$s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{227 - 9(5.22222)(5.00000)}{8} = \frac{-8}{8} = -1.00000$

$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-1.00000}{\sqrt{7.19444}\,\sqrt{7.50000}} = -0.136135$

Solution: From above:
1. The sample variance of $y$: $s_y^2 = 7.50000$.
2. The sample covariance between $x$ and $y$: $s_{xy} = -1.00000$.
3. The sample correlation between $x$ and $y$: $r_{xy} = -0.136135$.
4. Interpretation. The negative sign of the covariance and correlation indicates that $x$ and $y$ tend to move in opposite directions. The correlation squared is about .019. On a zero to one scale, this is extremely weak.
5. If all the numbers in $y$ rise by 20, what will the new mean and standard deviation of $y$ be? What will the correlation between $x$ and $y$ be now?
$E(cy + d) = cE(y) + d$ means that $E(1y + 20) = 1E(y) + 20$, and this works for sample means too, so that the sample mean rises by 20 to 25.0000. And $\mathrm{Var}(cy + d) = c^2\mathrm{Var}(y)$, so $\mathrm{Var}(y + 20) = 1^2\,\mathrm{Var}(y) = 1^2(7.50000) = 7.50000$. The standard deviation remains 2.73861. Also, $\mathrm{Cov}(ax + b, cy + d) = ac\,\mathrm{Cov}(x, y)$, so $\mathrm{Cov}(x + 0, y + 20) = (1)(1)\mathrm{Cov}(x, y) = (1)(1)(-1.00000) = -1.00000$, and if $w = ax + b$ and $v = cy + d$, $\rho_{wv} = \mathrm{sign}(ac)\,\rho_{xy}$. In this case $w = 1x + 0$ and $v = 1y + 20$, so $\mathrm{sign}(ac) = \mathrm{sign}((1)(1)) = +$, and we started with a correlation of $r_{xy} = -0.136135$, so the correlation between the two is -0.136135. In short, adding 20 to $y$ raises the mean by 20 and has no effect on the standard deviation, the covariance or the correlation.
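The shift rules used in answer 5 can be checked numerically. The following is only a sketch for checking work, not a substitute for showing the steps: it recomputes the worksheet quantities with the same computational formulas and then adds 20 to every $y$. The function name `summary` and the dictionary keys are made up for this illustration.

```python
import math

x = [2, 6, 5, 3, 9, 8, 2, 4, 8]   # digits of 265398248
y = [2, 3, 4, 5, 6, 7, 8, 9, 1]   # digits of 234567891

def summary(x, y):
    """Sample statistics from the computational (sum of squares) formulas."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    var_x = (sum(v * v for v in x) - n * xbar ** 2) / (n - 1)
    var_y = (sum(v * v for v in y) - n * ybar ** 2) / (n - 1)
    cov = (sum(a * b for a, b in zip(x, y)) - n * xbar * ybar) / (n - 1)
    r = cov / math.sqrt(var_x * var_y)
    return {"mean_y": ybar, "sd_y": math.sqrt(var_y), "cov": cov, "r": r}

before = summary(x, y)                        # original y
after = summary(x, [v + 20 for v in y])       # every y raised by 20

for name in before:
    print(name, round(before[name], 6), round(after[name], 6))
```

Only the mean of $y$ should differ between the two columns of output; the standard deviation, covariance and correlation should agree up to rounding.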
6. If, instead, all the numbers in $y$ are multiplied by 3.5, what will the new mean and standard deviation of $y$ be? What will the correlation between $x$ and $y$ be now? (1.5)
$E(cy + d) = cE(y) + d$ means that $E(3.5y + 0) = 3.5E(y) + 0$, and this works for sample means too, so that the sample mean, which was 5.00000, is multiplied by 3.5 to become 17.5. And $\mathrm{Var}(cy + d) = c^2\mathrm{Var}(y)$, so if our original variance was $s_y^2 = 7.50000$ we find $\mathrm{Var}(3.5y + 0) = (3.5)^2\mathrm{Var}(y) = (3.5)^2(7.50000) = 91.875$. The new standard deviation is $\sqrt{91.875} = 3.5(2.73861) = 9.58514$. Also, $\mathrm{Cov}(ax + b, cy + d) = ac\,\mathrm{Cov}(x, y)$ and $s_{xy} = -1.00000$, so $\mathrm{Cov}(x + 0, 3.5y + 0) = (1)(3.5)\mathrm{Cov}(x, y) = (1)(3.5)(-1.00000) = -3.5000$, and if $w = ax + b$ and $v = cy + d$, $\rho_{wv} = \mathrm{sign}(ac)\,\rho_{xy}$. Here $\mathrm{sign}(ac) = \mathrm{sign}((1)(3.5)) = +$, so the correlation between the two is unchanged at $r_{xy} = -0.136135$.

Part II: Take your Social Security number again, let $g_1$ be the 2nd and 3rd digits of the number, $g_2$ be the 2nd, 3rd and 4th digits, $g_3 = 2g_1$ and $g_4 = 3g_1$. (For example, my Social Security number is 265398248, so $g_1 = 65$, $g_2 = 653$, $g_3 = 2(65) = 130$ and $g_4 = 3g_1 = 3(65) = 195$.) A jorcillator’s lifespan (failure time) can be represented by a continuous uniform distribution between $g_1$ and $g_2$ years. (My jorcillator has a lifespan between 65 and 653 years.)
1. What is the probability that it lasts between $g_3$ and $g_4$ years? (1)
2. What is the probability that it lasts between $g_3$ and 1000 years? (1)
3. What is the mean life of such a jorcillator? (1)
4. What is the standard deviation of the life of such a jorcillator? (1)
5. If I have five such jorcillators, what is the probability that at least one lasts between $g_3$ and 1000 years? (1)

Solution: Seymour’s Social Security number was 234567891, so that $g_1 = 34$, $g_2 = 345$, $g_3 = 2(34) = 68$ and $g_4 = 3g_1 = 3(34) = 102$. Our jorcillator has a lifespan from $c = 34$ to $d = 345$ years and we want $P(68 \le x \le 102)$.
1. So $P(68 \le x \le 102) = (102 - 68)\frac{1}{345 - 34} = \frac{34}{311} = .1093$. In the diagram below (not to scale), shade the area between 68 and 102.
[Diagram: uniform density drawn as a box from 34 to 345; horizontal axis marked at 0, 34, 68, 102 and 345.]
The height of the box is $\frac{1}{d - c} = \frac{1}{345 - 34} = \frac{1}{311} = .003215$ and the base of the shaded area is $102 - 68 = 34$.
So the area of the shaded region is $P(68 \le x \le 102) = 34\left(\frac{1}{311}\right) = .1093$.
Another way to do a problem of this type is to remember that for any continuous distribution we can use differences between cumulative distributions, $P(a \le x \le b) = F(b) - F(a)$, where the cumulative distribution is $F(x_0) = P(x \le x_0)$. For the continuous uniform distribution, $F(x) = \frac{x - c}{d - c}$ for $c \le x \le d$, $F(x) = 0$ for $x < c$ and $F(x) = 1$ for $x > d$. So $P(68 \le x \le 102) = F(102) - F(68) = \frac{102 - 34}{345 - 34} - \frac{68 - 34}{345 - 34} = \frac{68}{311} - \frac{34}{311} = \frac{34}{311} = .1093$.
2. This time we want $P(68 \le x \le 1000)$. Remember, $c = 34$ and $d = 345$. In the diagram below, shade the area between 68 and 345 (since there is no area between 345 and 1000).
[Diagram: uniform density drawn as a box from 34 to 345; horizontal axis marked at 0, 34, 68, 345 and 1000.]
The height of the box is $\frac{1}{d - c} = \frac{1}{345 - 34} = \frac{1}{311} = .003215$ and the base of the shaded area is $345 - 68 = 277$. So the shaded area is $P(68 \le x \le 1000) = (345 - 68)\frac{1}{345 - 34} = \frac{277}{311} = .8907$. If we use the cumulative distribution method, remember that $F(x) = 1$ for $x > d$. So $P(68 \le x \le 1000) = F(1000) - F(68) = 1 - \frac{68 - 34}{345 - 34} = 1 - \frac{34}{311} = 1 - .1093 = .8907$.
3. The formula for the mean is $\mu = \frac{c + d}{2}$, with $c = 34$ and $d = 345$. So $\mu = \frac{34 + 345}{2} = 189.5$.
4. The formula for the variance is $\sigma^2 = \frac{(d - c)^2}{12}$. So $\sigma^2 = \frac{(345 - 34)^2}{12} = \frac{96721}{12} = 8060.1$. So the standard deviation of the life of the jorcillator is $\sigma = \sqrt{8060.1} = 89.78$.
5. If I have five such jorcillators, what is the probability that at least one lasts between $g_3$ and 1000 years? (1)
We already found that $P(68 \le x \le 1000) = .8907$. Because the probability of lasting is a constant $p = .8907$, and we have $n = 5$ tries, we use the Binomial distribution, $P(x) = C_x^n p^x q^{n - x}$, with $q = 1 - p = .1093$. $P(x \ge 1) = 1 - P(0) = 1 - C_0^5 p^0 q^5 = 1 - (.1093)^5 = .99998$.

Check your answer! $g_3 = 2g_1$ and $g_4 = 3g_1$. $f(x) = \frac{1}{g_2 - g_1}$ is the density, not a probability.
1) $P(g_3 \le x \le g_4) = \frac{g_4 - g_3}{g_2 - g_1} = \frac{3g_1 - 2g_1}{g_2 - g_1} = \frac{g_1}{g_2 - g_1}$
2) $P(g_3 \le x \le 1000) = \frac{g_2 - g_3}{g_2 - g_1} = \frac{g_2 - 2g_1}{g_2 - g_1}$
3) $\mu = \frac{c + d}{2} = \frac{g_1 + g_2}{2}$
4) $\sigma^2 = \frac{(d - c)^2}{12}$, so $\sigma = \sqrt{\frac{(g_2 - g_1)^2}{12}} = \frac{g_2 - g_1}{\sqrt{12}}$
5) $P(x \ge 1) = 1 - P(0) = 1 - C_0^5 p^0 q^5 = 1 - \left(\frac{g_1}{g_2 - g_1}\right)^5$

Part III. Do the following problems. (Do at least 35 points.) Show your work!
Choose the problems you do carefully, since most people cannot finish this exam. You may do parts of problems!

1. (13) The following table represents the joint probability of $x$ and $y$.

                 x
            3     5     9
  y   1     0     0    ___
      5     0    .20    0
      6    .65    0     0

a. Fill in the missing number. (1)
b. Are $x$ and $y$ independent? Why? (2)
c. Compute $\sigma_{xy}$, the covariance of $x$ and $y$, and interpret it. (3)
d. Compute $\rho_{xy}$, the correlation of $x$ and $y$, and interpret it. (3)
e. Find the distribution of $x + y$. (1)
f. Using only the results of a)-d), find the mean and variance of $x + y$. (3)

Solution:
a) The missing number must be .15 because the numbers inside the table must add to 1.
b) Check for independence: First you need to find $P(x)$ and $P(y)$. Look at the upper left hand probability below. Its value is 0 and it represents $P(x = 3 \cap y = 1)$. If $x$ and $y$ were independent, we would have $P(x = 3 \cap y = 1) = P(x = 3)P(y = 1) = (.65)(.15) = .0975$. Since this is not true, we can say that $x$ and $y$ are not independent.
c) Compute $\sigma_{xy} = \mathrm{Cov}(x, y)$.

                   x
              3      5      9     P(y)   yP(y)   y^2 P(y)
  y     1     0      0     .15    .15     0.15     0.15
        5     0     .20     0     .20     1.00     5.00
        6    .65     0      0     .65     3.90    23.40
  P(x)       .65    .20    .15   1.00     5.05    28.55
  xP(x)     1.95   1.00   1.35   4.30
  x^2 P(x)  5.85   5.00  12.15  23.00

To summarize: $\sum P(x) = 1$, $\mu_x = E(x) = \sum xP(x) = 4.30$, $E(x^2) = \sum x^2 P(x) = 23.00$, $\sum P(y) = 1$, $\mu_y = E(y) = \sum yP(y) = 5.05$ and $E(y^2) = \sum y^2 P(y) = 28.55$.

$E(xy) = \sum\sum xyP(x, y)$, summing each cell's probability times its $x$ and $y$ values:

  0(3)(1)   + 0(5)(1)   + .15(9)(1)     0    + 0    + 1.35
  0(3)(5)   + .20(5)(5) + 0(9)(5)    =  0    + 5.00 + 0     = 18.05
  .65(3)(6) + 0(5)(6)   + 0(9)(6)       11.7 + 0    + 0

$\sigma_{xy} = \mathrm{Cov}(x, y) = E(xy) - \mu_x\mu_y = 18.05 - (4.30)(5.05) = -3.665$. This is negative, so $x$ and $y$ move oppositely to one another.
d) Compute $\rho_{xy} = \mathrm{Corr}(x, y)$, the correlation of $x$ and $y$.
$\sigma_x^2 = E(x^2) - \mu_x^2 = 23.0 - (4.30)^2 = 4.51$ and $\sigma_y^2 = E(y^2) - \mu_y^2 = 28.55 - (5.05)^2 = 3.0475$.
So that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{-3.665}{\sqrt{4.51}\,\sqrt{3.0475}} = \frac{-3.665}{(2.12368)(1.74571)} = -.988583$.
The sign has been interpreted above. Since the correlation squared is about .98, on a zero to one scale this is very strong.
e) This is an easy one, since there are so many zeroes.
If we run down the columns of the table and forget about the zeroes:

   x    y   x+y   P(x,y)
   3    6    9     .65
   5    5   10     .20
   9    1   10     .15

This means that $x + y$ takes only two values:

  x+y       9     10
  P(x+y)   .65   .35

f) Using only the results of a)-d), find the mean and variance of $x + y$. (3)
$E(x + y) = E(x) + E(y) = \mu_x + \mu_y = 4.30 + 5.05 = 9.35$ and $\mathrm{Var}(x + y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = \mathrm{Var}(x) + \mathrm{Var}(y) + 2\mathrm{Cov}(x, y) = 4.5100 + 3.0475 + 2(-3.665) = 0.2275$.

2. (16) Find the following probabilities. (2 each)
a. Assume that $z$ has the standard Normal distribution. Make diagrams.
(i) $P(-2.04 \le z \le -1.93) = P(-2.04 \le z \le 0) - P(-1.93 \le z \le 0) = .4793 - .4732 = .0061$.
[Diagram: standard Normal curve, axis marked at -2.04, -1.93 and 0.] Shade the area between -2.04 and -1.93.
(ii) $P(z \le 2.02) = P(z \le 0) + P(0 \le z \le 2.02) = .5 + .4783 = .9783$.
[Diagram: standard Normal curve, axis marked at 0 and 2.02.] Shade the entire area below 2.02.
(iii) $P(1.04 \le z \le 3.55) = P(0 \le z \le 3.55) - P(0 \le z \le 1.04) = .4998 - .3508 = .1490$.
[Diagram: standard Normal curve, axis marked at 0, 1.04 and 3.55.] Shade the area between 1.04 and 3.55.
b. Assume that $x$ has the binomial distribution, $n = 11$. Normal diagrams won’t help!
(i) $P(4 \le x \le 8;\ p = .35) = P(x \le 8) - P(x \le 3) = .99796 - .42555 = .57241$.
(ii) $P(4 \le x \le 8;\ p = .65)$. Since 4 successes in 11 tries correspond to 7 failures, 8 successes in 11 tries correspond to 3 failures, and the probability of failure is $1 - .65 = .35$, $P(4 \le x \le 8;\ p = .65) = P(3 \le x \le 7;\ p = .35) = P(x \le 7) - P(x \le 2) = .98776 - .20013 = .78763$.
(iii) $P(x \ge 2;\ p = .35) = 1 - P(x \le 1) = 1 - .06058 = .93942$.
(iv) Do (iii) without using a table. (3 points)
$P(x \ge 2;\ p = .35) = 1 - P(x \le 1) = 1 - P(0) - P(1)$. Note that $P(x) = C_x^n p^x q^{n - x}$, with $q = 1 - p = .65$. $P(x \ge 2) = 1 - P(0) - P(1) = 1 - C_0^{11} p^0 q^{11} - C_1^{11} p^1 q^{10} = 1 - (.65)^{11} - 11(.35)(.65)^{10} = 1 - .008751 - .051832 = .93942$.
(v) Find the mean and standard deviation for the variable in (iii).
$\mu = np = 11(.35) = 3.85$, $\sigma^2 = npq = 3.85(.65) = 2.5025$, so $\sigma = \sqrt{npq} = \sqrt{2.5025} = 1.58193$.

3. (13) A firm is hiring two executives from a group of 20 executives that is 40% female. Assume that the individuals are chosen at random (actually without regard to gender).
a. What is the probability that both of the individuals selected are women? (2)
b. What is the probability that at least one woman is selected? (2) How many of you know that ‘at least one’ does not mean ‘exactly one’?
c.
What are the mean and standard deviation of the distribution of the number of women selected? (2)
d. If the pool (population) is larger than 20, but the firm still is hiring only two people and the population is still 40% female, how large would the pool have to be before we could use the binomial distribution to solve these problems? What is the probability that at least one woman is selected using the binomial distribution? (3)
e. Assume that the pool is much, much larger than 20. How many people would the firm have to be hiring before we could use the Poisson distribution? What is the probability that at least one woman is hired now? (Answers without calculations will not be accepted for full credit.) (4)

Solution: Hypergeometric Distribution with $p = .40$, $n = 2$, $N = 20$. Remember $M = Np$.
$P(x) = \frac{C_x^M\, C_{n-x}^{N-M}}{C_n^N}$, $C_r^n = \frac{n!}{(n - r)!\,r!}$ and $M = Np = 20(.40) = 8$.
a. Exactly 2 women: $P(2) = \frac{C_2^8\, C_0^{12}}{C_2^{20}} = \frac{\frac{8!}{6!\,2!}\cdot\frac{12!}{12!\,0!}}{\frac{20!}{18!\,2!}} = \frac{\frac{8 \cdot 7}{2 \cdot 1}}{\frac{20 \cdot 19}{2 \cdot 1}} = \frac{8 \cdot 7}{20 \cdot 19} = .14737$.
b. At least one woman: $P(x \ge 1) = 1 - P(0) = 1 - \frac{C_0^8\, C_2^{12}}{C_2^{20}} = 1 - \frac{\frac{12!}{10!\,2!}}{\frac{20!}{18!\,2!}} = 1 - \frac{\frac{12 \cdot 11}{2 \cdot 1}}{\frac{20 \cdot 19}{2 \cdot 1}} = 1 - \frac{12 \cdot 11}{20 \cdot 19} = 1 - .34737 = .65263$.
c. $\mu = np = 2(.40) = 0.80$. $q = 1 - p = 1 - .40 = .60$, so $\sigma^2 = \frac{N - n}{N - 1}npq = \frac{20 - 2}{20 - 1}(0.80)(.60) = .94737(0.48) = 0.45474$, so $\sigma = \sqrt{0.45474} = 0.67434$.
d. The condition for using the binomial distribution is that the population size is above 20 times the sample size. This would mean a population above 20 times 2, or 40. It would still be true that $p = .40$ and $n = 2$, so that we could use the Binomial table, which says $P(x \le 0) = P(x = 0) = .3600$. Thus $P(x \ge 1) = 1 - P(0) = 1 - .3600 = .6400$.
e. We still have $p = .40$, and the condition to replace the Binomial distribution with the Poisson is $\frac{n}{p} > 500$. If we solve $\frac{n}{.40} > 500$ for $n$, we find $n > 200$, so the sample would have to be of size at least 201, and $m = np = .40(201) = 80.4$. The outline says $P(x) = \frac{e^{-m} m^x}{x!}$, so $P(0) = \frac{e^{-m} m^0}{0!} = e^{-m}$ and $P(x \ge 1) = 1 - P(0) = 1 - e^{-80.4} = 1 - 1.21 \times 10^{-35} = 1.00000$.

4. (22) As you all know by now, a Jorcillator has two components, a phillinx and a flubberall.
a.
Assume that, in an average week, 12 of my Jorcillators need repair. Using your table, tell me the probability that (i) at least one, (ii) more than 20 and (iii) more than 30 need repair. (3)
b. What is the standard deviation of the number that need repair? (1)
c. Assume instead that in each of the first 200 weeks the Flubberall has a constant 5% chance of failing. What is the chance that the Flubberall fails in the third week? (1.5)
d. What is the chance that it fails between the 3rd and the 10th week? (2)
e. If $x$ represents the week in which the Flubberall fails, what are the mean and standard deviation of $x$? (1.5)
f. What is the chance that the Flubberall lasts beyond the 10th week? What about the 20th? (2)
g. Assume that the Jorcillator needs both components to operate, that the Phillinx follows the same distribution as the Flubberall and that their lives are independent. What is the chance that the Jorcillator does not last beyond the 10th week? (4)
h. For the grand slam, what is the chance that the Jorcillator dies after the tenth, but not after the 20th week? (6)

Solution:
a) As I posted on the notice board last week, “If you are looking for numbers of successes when the number of tries is given and the average number of successes per unit time or space is given, you want the Poisson distribution.” In this case, the mean (parameter) is $m = 12$, so:
(i) $P(x \ge 1) = 1 - P(0) = 1 - .00001 = .99999$.
(ii) $P(x > 20) = 1 - P(x \le 20) = 1 - .98840 = .0116$.
(iii) $P(x > 30) = 1 - P(x \le 30) = 1 - 1 = 0$.
b) Since the mean and the variance are the same for this distribution, $\sigma = \sqrt{12} = 3.46410$.
c) Again, from the notice board, “If you are looking for the try on which the first success occurs out of many possible tries when the probability of success is constant, you want the Geometric distribution.” The problem says $p = .05$ and we know $P(x) = q^{x-1}p$, $F(x) = 1 - q^x$, $\mu = \frac{1}{p}$ and $\sigma^2 = \frac{q}{p^2}$. So $q = 1 - p = .95$ and $P(3) = q^2 p = (.95)^2(.05) = .045125$.
d) Use the cumulative distribution.
$P(3 \le x \le 10) = P(x \le 10) - P(x \le 2) = F(10) - F(2) = (1 - q^{10}) - (1 - q^2) = q^2 - q^{10} = (.95)^2 - (.95)^{10} = .9025 - .59874 = .30376$.
e) Using the formulas above for mean and variance, $\mu = \frac{1}{p} = \frac{1}{.05} = 20$ and $\sigma^2 = \frac{q}{p^2} = \frac{.95}{(.05)^2} = 380$, so $\sigma = \sqrt{380} = 19.49359$.
f) $P(x > 10) = 1 - P(x \le 10) = 1 - (1 - .95^{10}) = .95^{10} = .59874$ and $P(x > 20) = 1 - P(x \le 20) = .95^{20} = .35849$.
$P(x \le 10) = 1 - .95^{10} = 1 - .598737 = .40126$. This is the probability that one component fails by the end of the 10th week. Let $A$ be the event that the jorcillator fails before the end of the 10th week. Let $B$ be the event that it fails after the 10th, but before the end of the 20th week, and let $C$ be the event that it lasts beyond the 20th week.
g) The jorcillator fails before the end of the 10th week if one component fails or if both components fail, so $P(A) = 2(.40126) - (.40126)^2 = .64151$. To see this, remember the last exam. On the last exam we had:

The event $A$ has a probability of $p$; the event $B$ also has a probability of $p$. If the two events are independent:
a.* $P(A \cup B) = 2p - p^2$
b. $P(A \cup B) = 2p^2$
c. $P(A \cup B) = 2p + p^2$
d. $P(A \cup B) = 0$
e. $P(A \cup B) = p$
Solution: From the addition rule, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. $P(A) = p$, $P(B) = p$ and, since $A$ and $B$ are independent, $P(A \cap B) = P(A)P(B) = p^2$. So $P(A \cup B) = p + p - p^2 = 2p - p^2$.

Another way to do this is to note that the probability that each component will last beyond the tenth period was found to be $P(x > 10) = .59874$. The probability that both last beyond the 10th period is $(.59874)^2$. Thus the probability that at least one does not last is $1 - (.59874)^2 = .64151$.
h) This is actually quite easy. The jorcillator will survive beyond the 20th period only if both components survive, so $P(C) = (.35849)^2 = .12851$. Since the jorcillator must fail sometime, and $A$, $B$ and $C$ cover all the possibilities, $P(B) = 1 - P(A) - P(C) = 1 - .64151 - .12851 = .22998$.
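
For anyone who wants to check the Geometric distribution arithmetic in parts c) through h) of problem 4, here is a minimal sketch that evaluates each quantity directly from $p = .05$. It assumes weeks are numbered from 1 and that the two components fail independently, as the problem states; the names `pmf`, `cdf`, `p_A`, `p_B` and `p_C` are illustrative only.

```python
import math

p = 0.05            # weekly chance that one component fails
q = 1 - p           # weekly chance that it survives

def pmf(k):
    """P(x = k): the first failure comes in week k."""
    return q ** (k - 1) * p

def cdf(k):
    """F(k) = P(x <= k): failure by the end of week k."""
    return 1 - q ** k

p_week3 = pmf(3)                    # part c
p_3_to_10 = cdf(10) - cdf(2)        # part d
mean = 1 / p                        # part e
sd = math.sqrt(q) / p               # part e, sqrt(q / p^2)
surv10 = 1 - cdf(10)                # part f, equals q^10
surv20 = 1 - cdf(20)                # part f, equals q^20

# The jorcillator works only while BOTH independent components work.
p_A = 1 - surv10 ** 2               # part g: dead by the end of week 10
p_C = surv20 ** 2                   # still alive after week 20
p_B = 1 - p_A - p_C                 # part h: dies after week 10, by week 20

for name, value in [("P(3)", p_week3), ("P(3<=x<=10)", p_3_to_10),
                    ("mean", mean), ("sd", sd),
                    ("P(x>10)", surv10), ("P(x>20)", surv20),
                    ("P(A)", p_A), ("P(B)", p_B), ("P(C)", p_C)]:
    print(f"{name:12s} {value:.5f}")
```

A useful cross-check on part h) is that $P(B)$ should also equal $q^{20} - q^{40}$, the chance that both components survive week 10 minus the chance that both survive week 20.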