251y0641 5/10/06
ECO 251 QBA1 FINAL EXAM, Version 1
MAY 8, 2006
Name: KEY    Class: ________________

Part I. Do all the following (14 points). Make diagrams! Show your work! Illegible and poorly presented sections will be penalized. The exam is normed on 75 points. There are actually 123+ possible points. If you haven't done it lately, take a fast look at ECO 251 - Things That You Should Never Do on a Statistics Exam (or Anywhere Else).

$x \sim N(13, 5.6)$

1. $P(0 \le x \le 28) = P\left(\frac{0-13}{5.6} \le z \le \frac{28-13}{5.6}\right) = P(-2.32 \le z \le 2.68) = P(-2.32 \le z \le 0) + P(0 \le z \le 2.68) = .4898 + .4963 = .9861$
For $x \sim N(13, 5.6)$ make a Normal curve centered at 13 and shade the area from 0 to 28; for $z$ make a Normal curve centered at zero and shade the area from -2.32 to 2.68. Since the diagrams show an area on both sides of the mean, you add.

2. $F(12.00)$ (Cumulative Probability) $= P(x \le 12) = P\left(z \le \frac{12-13}{5.6}\right) = P(z \le -0.18) = P(z \le 0) - P(-0.18 \le z \le 0) = .5 - .0714 = .4286$
For $x \sim N(13, 5.6)$ make a Normal curve centered at 13 and shade the area below 12; for $z$ make a Normal curve centered at zero and shade the area below -0.18. Since the diagrams show an area below the mean that does not touch the mean, you subtract.

3. $P(x \ge 28) = P\left(z \ge \frac{28-13}{5.6}\right) = P(z \ge 2.68) = P(z \ge 0) - P(0 \le z \le 2.68) = .5 - .4963 = .0037$
For $x \sim N(13, 5.6)$ make a Normal curve centered at 13 and shade the area above 28; for $z$ make a Normal curve centered at zero and shade the area above 2.68. Since the diagrams show an area above the mean that does not touch the mean, you subtract.

4. $P(28 \le x \le 32) = P\left(\frac{28-13}{5.6} \le z \le \frac{32-13}{5.6}\right) = P(2.68 \le z \le 3.39) = P(0 \le z \le 3.39) - P(0 \le z \le 2.68) = .4997 - .4963 = .0034$
For $x \sim N(13, 5.6)$ make a Normal curve centered at 13 and shade the area between 28 and 32; for $z$ make a Normal curve centered at zero and shade the area between 2.68 and 3.39. Since the diagrams show an area above the mean that does not touch the mean, you subtract.

5. $P(-3 \le x \le 3) = P\left(\frac{-3-13}{5.6} \le z \le \frac{3-13}{5.6}\right) = P(-2.86 \le z \le -1.79) = P(-2.86 \le z \le 0) - P(-1.79 \le z \le 0) = .4979 - .4633 = .0346$
For $x \sim N(13, 5.6)$ make a Normal curve centered at 13 and shade the area between -3 and 3, both of which are below 13; for $z$ make a Normal curve centered at zero and shade the area between -2.86 and -1.79. Since the diagrams show an area below the mean that does not touch the mean, you subtract.

6. $x_{.23}$ (Find $z_{.23}$ first.)
Solution: Make a diagram. The diagram for $z$ will show an area with a probability of 100% - 23% = 77% below $z_{.23}$. The area below $z_{.23}$ is split by a vertical line at zero into two areas. The lower one has a probability of 50% and the upper one a probability of 77% - 50% = 27%. The upper tail of the distribution above $z_{.23}$ has a probability of 23%, so that the entire area above 0 adds to 27% + 23% = 50%. From the diagram, we want one point $z_{.23}$ so that $P(z \le z_{.23}) = .77$ or $P(0 \le z \le z_{.23}) = .2700$. If we try to find this point on the Normal table, the closest we can come is $P(0 \le z \le 0.74) = .2704$; $P(0 \le z \le 0.73) = .2673$ is not as close, but is acceptable. So $z_{.23} = 0.74$.
Since $x \sim N(13, 5.6)$, the diagram for $x$ would show 77% probability split in two regions on either side of 13, with probabilities of 50% below 13 and 27% above 13 and below $x_{.23}$, and with 23% above $x_{.23}$. $z_{.23} = 0.74$, so the value of $x$ can then be written $x_{.23} = \mu + z_{.23}\sigma = 13 + 0.74(5.6) = 13 + 4.144 = 17.144$.
To check this: $P(x \ge 17.144) = P\left(z \ge \frac{17.144-13}{5.6}\right) = P(z \ge 0.74) = P(z \ge 0) - P(0 \le z \le 0.74) = .5000 - .2704 = .2296 \approx .23$.

7. A symmetrical region around the mean with a probability of 23%. [14, 14]
Solution: Make a diagram. The diagram for $z$ will show a central area with a probability of 23%. It is split in two by a vertical line at zero into two areas with probabilities of 11.5%. The tails of the distribution each have a probability of 50% - 11.5% = 38.5%. From the diagram, we want two points $z_{.385}$ and $z_{.615}$ so that $P(z_{.615} \le z \le z_{.385}) = .2300$. The upper point, $z_{.385}$, will have $P(0 \le z \le z_{.385}) = \frac{23\%}{2} = .1150$, and by symmetry $z_{.615} = -z_{.385}$.
From the interior of the Normal table, the closest we can come to .1150 is $P(0 \le z \le 0.29) = .1141$, which is slightly too low. The next best point would be 0.30, since $P(0 \le z \le 0.30) = .1179$. We can say $z_{.385} = 0.29$, and our 23% symmetrical interval for $z$ is -0.29 to 0.29. Since $x \sim N(13, 5.6)$, the diagram for $x$ (if we bother) will show 23% probability split in two 11.5% regions on either side of 13, with 38.5% above $x_{.385}$ and 38.5% below $x_{.615}$. The interval for $x$ can then be written $x = \mu \pm z_{.385}\sigma = 13 \pm 0.29(5.6) = 13 \pm 1.624$, or 11.376 to 14.624.
To check this: $P(11.376 \le x \le 14.624) = P\left(\frac{11.376-13}{5.6} \le z \le \frac{14.624-13}{5.6}\right) = P(-0.29 \le z \le 0.29) = 2P(0 \le z \le 0.29) = 2(.1141) = .2282 \approx 23\%$

II. (10 points+, 2 point penalty for not trying part b.) Show your work! Mark individual sections clearly.
Wynn, Anthony and Avronovic give us the following data for a sample of eight professional golfers. Ea or $x$ is earnings in thousands of dollars and SA or $y$ is average score.

Row     1      2      3      4      5      6      7      8
Ea x    71.6   55.8   147.4  117.4  112.3  82.7   22.8   58.6
SA y    71.50  72.75  71.34  71.27  70.95  71.65  72.49  71.46

In order to speed things up, I have computed the sums for the first seven observations: $\sum_{i=1}^{7} x = 610.000$, $\sum_{i=1}^{7} y = 501.950$, $\sum_{i=1}^{7} x^2 = 63720.1$, $\sum_{i=1}^{7} y^2 = 35996.0$ and $\sum_{i=1}^{7} xy = 43607.4$.
Calculate the following:
a. The sums that you will need to calculate whatever parts you do. (1 point if you don't quit at b) Make sure that I can tell how you did these sums.
b. The sample standard deviation $s_y$ of average score. (2)
c. The sample covariance $s_{xy}$ between $x$ and $y$. (2)
d. The sample correlation $r_{xy}$ between $x$ and $y$. (2)
e. Given the size and sign of the correlation, what conclusion might you draw on the relation between $x$ and $y$? (1) Can you guess why the correlation isn't stronger?
f. Assume that the earnings of the golfers were 15% lower ($w = .85x$). Find $\bar{w}$ (the sample mean of earnings), $s_w^2$, $s_{wy}$ and $r_{wy}$. Use only the values you computed in a-d and rules for functions of $x$ and $y$ to get your results.
If you state the results without explaining why, or change $x_1$ and $x_2$ and recompute the results, you will receive no credit. (4)
g. Do an 80% confidence interval for the population average score of professional golfers. (2) [14, 28]

Solution: Here is the table that you should have generated.

Row     Ea x    SA y     x^2        xy         y^2
1       71.6    71.50    5126.6     5119.4     5112.25
2       55.8    72.75    3113.6     4059.5     5292.56
3       147.4   71.34    21726.8    10515.5    5089.40
4       117.4   71.27    13782.8    8367.1     5079.41
5       112.3   70.95    12611.3    7967.7     5033.90
6       82.7    71.65    6839.3     5925.5     5133.72
7       22.8    72.49    519.8      1652.8     5254.80
8       58.6    71.46    3434.0     4187.6     5106.53
Total   668.6   573.41   67154.1    47794.9    41102.57

Of course, you were only responsible for the last two lines.
$\sum x = 668.6$, $\sum y = 573.41$, $\sum x^2 = 67154.1$, $\sum y^2 = 41102.57$, $\sum xy = 47794.9$ and $n = 8$.

b. The sample standard deviation $s_y$ of average score comes from the variance.
$\bar{y} = \frac{\sum y}{n} = \frac{573.41}{8} = 71.67625$ and $s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{41102.6 - 8(71.67625)^2}{7} = \frac{2.69919}{7} = 0.3856$, so $s_y = .6210$.
$\bar{x} = \frac{\sum x}{n} = \frac{668.6}{8} = 83.575$ and $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{67154.1 - 8(83.575)^2}{7} = 1610.84$, so $s_x = 40.1352$.
Your answers will differ due to rounding error.

c. The sample covariance $s_{xy}$ between $x$ and $y$ is
$s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{47794.9 - 8(83.575)(71.67625)}{7} = -18.2584$.

d. The sample correlation $r_{xy}$ between $x$ and $y$ is
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-18.2584}{\sqrt{1610.84}\sqrt{0.3856}} = -0.7326$

e. The best way to look at the strength of a correlation is to compute $r_{xy}^2 = .536$ and to look at it on a zero to one scale. It's not particularly weak or strong. The negative sign is expected because golfers win with low scores. However, correlations track linear relationships and there is no reason to expect the relationship between scores and earnings to be linear. We would, in fact, expect much higher earnings for low scores than could be explained by a straight line, because winners get large prizes.

f. Assume that the earnings of the golfers were 15% lower ($w = .85x$). Find $\bar{w}$ (the sample mean of earnings), $s_w^2$, $s_{wy}$ and $r_{wy}$.
Use only the values you computed in a-d and rules for functions of $x$ and $y$ to get your results. If you state the results without explaining why, or change $x_1$ and $x_2$ and recompute the results, you will receive no credit. (4)
Using formulas from 251var2, when $w = ax = .85x$ and $v = cy = 1y$. Recall $\bar{x} = 83.575$, $s_x^2 = 1610.84$, $s_{xy} = -18.2584$ and $r_{xy} = -0.7326$.
i) Just as $E(ax) = aE(x)$, $\bar{w} = .85\bar{x} = .85(83.575) = 71.039$
ii) $Var(ax) = a^2 Var(x)$, so $s_w^2 = .85^2(1610.84) = 1163.83$
iii) $Cov(w, v) = ac\,Cov(x, y)$, so $s_{wy} = .85(1)(-18.2584) = -15.5196$
iv) $Corr(w, v) = Corr(ax, cy) = Sign(ac)\,Corr(x, y)$, so $r_{wy} = -.7326$

g. Do an 80% confidence interval for the population average score of professional golfers. (2)
Recall that $\bar{y} = 71.67625$ and $s_y^2 = 0.3856$, and remember that when $\sigma$ is unknown $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}$, or $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
$1 - \alpha = .80$, so $\alpha = .20$, $t_{\alpha/2}^{(n-1)} = t_{.10}^{(7)} = 1.415$ and $s_{\bar{y}} = \sqrt{\frac{s_y^2}{n}} = \sqrt{\frac{0.3856}{8}} = \sqrt{0.0482} = 0.2195$.
$\mu_y = \bar{y} \pm t_{\alpha/2}^{(n-1)} s_{\bar{y}} = 71.676 \pm 1.415(0.2195) = 71.676 \pm 0.311$, or 71.365 to 71.987.

III. Do at least 5 of the following 6 sections (at least 12 each) (or do items adding to at least 48 points. Anything extra you do helps, and grades wrap around). Show your work! Please indicate clearly what sections of the problem you are answering! If you are following a rule like $E(ax) = aE(x)$, please state it! If you are using a formula, state it! If you answer a 'yes' or 'no' question, explain why! If you are using the Poisson or Binomial table, state things like $n$, $p$ or the mean. Avoid crossing out answers that you think are inappropriate - you might get partial credit. Choose the problems that you do carefully; most of us are unlikely to be able to do more than half of the entire possible credit in this section! This is not an opinion questionnaire. Answers without reasons or supporting calculations or table references will not be accepted (except in multiple choice questions)!!!! Answers that are hard to follow will be penalized. Note that some sections extend over more than one page.

A.
Answer the following 6 multiple choice questions. (These should be 2 each, but to discourage guessing, how about 2.5 each for right answers and 0.5 penalty for wrong answers.)

1. The t distribution should be used when the parent (underlying) population
a) *Is Normal, the population standard deviation is unknown and we are testing a mean.
b) Is Normal, the population standard deviation is known and we are testing a mean.
c) Is Normal, the mean of the population is unknown and we are testing a mean.
d) Is binomial and we are testing for a proportion.
e) The t distribution should be used in all of these cases.

2. (Was 5) It is desired to estimate the average total compensation of CEOs in the Service industry. Data were randomly collected from 18 CEOs and the 97% confidence interval was calculated to be ($2,181,260, $5,836,180). Which of the following interpretations is correct?
a) 97% of the sampled total compensation values fell between $2,181,260 and $5,836,180.
b) We are 97% confident that the mean of the sampled CEOs falls in the interval $2,181,260 to $5,836,180.
c) In the population of Service industry CEOs, 97% of them will have total compensations that fall in the interval $2,181,260 to $5,836,180.
d) *We are 97% confident that the average total compensation of all CEOs in the Service industry falls in the interval $2,181,260 to $5,836,180.

3. (Was 9) In the construction of confidence intervals, if all other quantities are unchanged, an increase in the sample size will lead to a __________ interval.
a) *narrower
b) wider
c) less significant
d) biased

4. (Was 13) For air travelers, one of the biggest complaints is of the waiting time between when the airplane taxis away from the terminal until the flight takes off. This waiting time is known to have a skewed-right distribution with a mean of 10 minutes and a standard deviation of 8 minutes. Suppose 100 flights have been randomly sampled.
Describe the sampling distribution of the mean waiting time between when the airplane taxis away from the terminal until the flight takes off for these 100 flights.
a) Distribution is skewed-right with mean = 10 minutes and standard error = 0.8 minutes.
b) Distribution is skewed-right with mean = 10 minutes and standard error = 8 minutes.
c) *Distribution is approximately normal with mean = 10 minutes and standard error = 0.8 minutes.
d) Distribution is approximately normal with mean = 10 minutes and standard error = 8 minutes.

5. (Was 16) Why is the Central Limit Theorem so important to the study of sampling distributions?
a) It allows us to disregard the size of the sample selected when the population is not normal.
b) It allows us to disregard the shape of the sampling distribution when the size of the population is large.
c) It allows us to disregard the size of the population we are sampling from.
d) *It allows us to disregard the shape of the (parent) population when n is large.

6. (Was 21) What type of probability distribution will the consulting firm most likely employ to analyze the insurance claims in the following problem? An insurance company has called a consulting firm to determine if the company has an unusually high number of false insurance claims. It is known that the industry proportion for false claims is 3%. The consulting firm has decided to randomly and independently sample 100 of the company's insurance claims. They believe the number of these 100 that are false will yield the information the company desires.
a) *binomial distribution.
b) geometric distribution
c) continuous uniform distribution
d) Poisson distribution.
e) hypergeometric distribution.
f) none of the above.
[15]

B. A random sample of 36 Kleenex users taken at a college found that the average sneezer used 52.10 tissues in the course of a cold. The researcher previously believed that the average was 59.
Do a 95% confidence interval for the population mean and answer if the population mean is significantly different from 59.00 for parts a-d. Note that each number here is stated to the nearest hundredth. Please maintain at least that level of significance throughout the problem. Mark your individual questions clearly!!!
a. Assume that the population standard deviation is known to be 20.95. (4)
b. Assume, more realistically, that 20.95 was a sample standard deviation. (4)
c. Now assume that the sample of 36 was drawn from a population of 401. (4)
d. Assume that the population standard deviation is known to be 20.95 but that we want a confidence level of 96% (You cannot use the t-table for this part.). (4)
e. Assume that the researcher was right and that the average number of Kleenex used by a sneezer has a population mean of 59 and a population standard deviation of 20.95. Assume that the researcher has 2250 tissues on hand for the 36 subjects. What is the probability that the researcher runs out of tissues? (4) [35]

Solution: From the introduction to problem O1, we are using the following formulas.
When $\sigma$ is known, $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}$, or $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
When $\sigma$ is unknown, $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}$, or $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
For a)-c), $1 - \alpha = .95$ and $\alpha = .05$.

a) We find a 95% confidence interval for the mean number of tissues assuming that $\bar{x} = 52.10$, $\sigma = 20.95$, $n = 36$ and $N$ is large. Because we are given $\sigma$, we should use $z$: $z_{\alpha/2} = z_{.025} = 1.960$. Because of the large population, $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{20.95}{\sqrt{36}} = 3.4917$.
$\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 52.10 \pm 1.960(3.4917) = 52.10 \pm 6.8437$, so $P(45.26 \le \mu \le 58.94) = .95$. Note that 59.00 is not on the confidence interval, so that our sample mean is significantly different from 59.

b) We find a 95% confidence interval for the mean number of tissues assuming that $\bar{x} = 52.10$, $s = 20.95$, $n = 36$ and $N$ is large. Because we are given $s$, we should use $t$: $t_{\alpha/2}^{(n-1)} = t_{.025}^{(35)} = 2.030$. Because of the large population, $s_{\bar{x}} = \frac{s_x}{\sqrt{n}} = \frac{20.95}{\sqrt{36}} = 3.4917$.
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}} = 52.10 \pm 2.030(3.4917) = 52.10 \pm 7.09$, so $P(45.01 \le \mu \le 59.19) = .95$. Note that 59.00 is on the confidence interval, so that our sample mean is not significantly different from 59.

c) We find a 95% confidence interval for the mean number of tissues assuming that $\bar{x} = 52.10$, $s = 20.95$, $n = 36$ and $N = 401$. Because we are given $s$, we should use $t$. Because of the small population, use a finite population correction.
$s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{20.95}{\sqrt{36}}\sqrt{\frac{401-36}{401-1}} = 3.49167\sqrt{\frac{365}{400}} = 3.49167(.95525) = 3.3354$.
$t_{\alpha/2}^{(n-1)} = t_{.025}^{(35)} = 2.030$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}} = 52.10 \pm 2.030(3.3354) = 52.10 \pm 6.77$, so $P(45.33 \le \mu \le 58.87) = .95$. Note that 59.00 is not on the confidence interval, so that our sample mean is significantly different from 59.

d) We find a 96% confidence interval for the mean number of tissues assuming that $\bar{x} = 52.10$, $\sigma = 20.95$, $n = 36$ and $N$ is large. Because we are given $\sigma$, we should use $z$. We still have $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{20.95}{\sqrt{36}} = 3.4917$. To find the value of $z$ that we need, note that $1 - \alpha = .96$ and $\alpha = .04$, so that $z_{\alpha/2} = z_{.02}$. Make a diagram. Draw a vertical line at zero. There is 50% of the distribution below zero and 2% over $z_{.02}$, which implies that there is 48% between $z_{.02}$ and zero. If we use table 17, we see that $P(0 \le z \le 2.05) = .4798$ and $P(0 \le z \le 2.06) = .4803$. So both 2.05 and 2.06 are acceptable values of $z_{.02}$; 2.054 seems about right to me.
$\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 52.10 \pm 2.054(3.4917) = 52.10 \pm 7.17$, or $P(44.93 \le \mu \le 59.27) = .96$. Note that 59.00 is on the confidence interval, so that our sample mean is not significantly different from 59.

e) If we assume that the researcher was right and that the average number of Kleenex used by a sneezer has a population mean of 59 and a population standard deviation of 20.95, the sample mean for a sample of 36 has a Normal distribution with a mean of 59 and a standard error of $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{20.95}{\sqrt{36}} = 3.4917$. 2250 tissues is 62.5 tissues for each of the 36 subjects, so the researcher runs out if the sample mean exceeds 62.5.
$P(\bar{x} > 62.5) = P\left(z > \frac{62.5-59}{3.4917}\right) = P(z > 1.00) = .5 - .3413 = .1587$.

C. Assume that $P(A) = .25$ and $P(B) = .70$. (Making joint probability tables will help.
– do not round excessively, you should retain at least 3 figures to the right of the decimal point.)
1. Find (i) $P(A \cup B)$, (ii) $P(\bar{A} \cap B)$, (iii) $P(B \mid A)$ if
a. $A$ and $B$ are independent. (3)
b. $A$ and $B$ are mutually exclusive. (3)
c. $P(A \mid B) = .20$ (3)

Solution: In each case start with

        B      not B   sum
A                      .25
not A                  .75
sum     .70    .30     1.00

and use the addition rule, which states $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, and the multiplication rule, which states $P(A \cap B) = P(A \mid B)P(B) = P(B \mid A)P(A)$, so that $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ or $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$.

a) If $A$ and $B$ are independent, $P(A \cap B) = P(A)P(B) = .25(.70) = .175$. If we simply fill in the blanks so that the numbers add up, we get

        B      not B   sum
A       .175   .075    .25
not A   .525   .225    .75
sum     .700   .300    1.00

(i) $P(A \cup B) = .25 + .70 - .175 = .775$. (ii) We can read $P(\bar{A} \cap B) = .525$ from the table. (iii) $P(B \mid A) = P(B) = .70$.

b) If $A$ and $B$ are mutually exclusive, by definition $P(A \cap B) = 0$. If we simply fill in the blanks so that the numbers add up, we get

        B      not B   sum
A       0      .25     .25
not A   .70    .05     .75
sum     .70    .30     1.00

(i) $P(A \cup B) = .25 + .70 - 0 = .95$, (ii) $P(\bar{A} \cap B) = .70$, (iii) $P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0}{.25} = 0$.

c) If $P(A \mid B) = .20$, $P(A \cap B) = P(A \mid B)P(B) = .20(.70) = .14$. If we simply fill in the blanks so that the numbers add up, we get

        B      not B   sum
A       .14    .11     .25
not A   .56    .19     .75
sum     .70    .30     1.00

(i) $P(A \cup B) = .25 + .70 - .14 = .81$, (ii) $P(\bar{A} \cap B) = .56$, (iii) $P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{.14}{.25} = 0.56$.

2. If $P(A) = .40$ and $P(B) = .70$, show that $A$ and $B$ cannot be mutually exclusive. (3)
Solution: If $A$ and $B$ are mutually exclusive, by definition $P(A \cap B) = 0$, so $P(A \cup B) = .40 + .70 - 0 = 1.10$. Since a probability cannot be above 1, this is impossible.

3. At a Pennsylvania state college, 60% of freshmen come from Eastern Pennsylvania, 30% from Western Pennsylvania, and 10% from out of state. They are given a math test during freshman week. 70% of the students from Eastern Pennsylvania pass it, as do 60% from Western PA and 90% from out of state. Tara Bulsnob comes from Eastern PA and has passed the math test and will only be friends with someone else from Eastern PA who has passed the math test.
a.
If she picks someone at random, what is the chance that that person has passed the math test? (2)
b. If she picks someone who has passed the math test at random, what is the chance that Tara will be willing to be friends with that person? (3) [52]

Solution: Define the following events: $EP$, being from Eastern PA; $WP$, being from Western PA; $OS$, being from out of state; and $PM$, passing the math test. We have been given $P(EP) = .60$, $P(WP) = .30$, $P(OS) = .10$, $P(PM \mid EP) = .70$, $P(PM \mid WP) = .60$ and $P(PM \mid OS) = .90$. Assume that we have 100 students. Since we know $P(EP) = .60$, $P(WP) = .30$ and $P(OS) = .10$, we can divide the 100 students as follows.

          EP    WP    OS    total
PM
not PM
total     60    30    10    100

Since $P(PM \mid EP) = .70$, 70% of 60 is 42. Similarly $P(PM \mid WP) = .60$ and $P(PM \mid OS) = .90$ give us the numbers of people who passed in each group.

          EP    WP    OS    total
PM        42    18    9
not PM
total     60    30    10    100

If we add these and fill in the remaining numbers, we get

          EP    WP    OS    total
PM        42    18    9     69
not PM    18    12    1     31
total     60    30    10    100

So a) if she picks someone at random, the chance that that person has passed the math test is $P(PM) = \frac{69}{100} = .69$. b) If she picks someone who has passed the math test at random, the chance that Tara will be willing to be friends with that person is $P(EP \mid PM) = \frac{42}{69} = .6087$.

D. If $x$ is binomial, and $n = 11$, find solutions to 1-5 (If you substitute another distribution for the binomial, or any other distribution, justify it!):

1) $P(3 \le x \le 6)$ when $p = .35$. (2)
For any discrete distribution for integer values of $x$, $P(3 \le x \le 6) = F(6) - F(2)$. For the binomial distribution, $F(6) - F(2) = P(x \le 6) - P(x \le 2) = .94986 - .20013 = .7497$

2) $P(5 \le x \le 8)$ when $p = .65$. (2)
$P(5 \le x \le 8)$ can be expressed as the probability of between 11 - 8 = 3 failures and 11 - 5 = 6 failures, when the probability of failure is 1 - .65 = .35. $P(3 \le x \le 6) = F(6) - F(2) = P(x \le 6) - P(x \le 2) = .94986 - .20013 = .7497$

3) $P(2 \le x \le 7)$ when $p = .465$. (3 or 3.5)
If $n = 11$, the expected number of successes is $np = 11(.465) = 5.115$ and the expected number of failures is 11 - 5.115 = 5.885.
Since both of these are (barely) above 5, we can use the Normal distribution with $\mu = np = 5.115$ and $\sigma = \sqrt{npq} = \sqrt{5.115(.535)} = \sqrt{2.736525} = 1.6542$. If we use the Normal distribution with a continuity correction we have
$P(1.5 \le x \le 7.5) = P\left(\frac{1.5-5.115}{1.6542} \le z \le \frac{7.5-5.115}{1.6542}\right) = P(-2.19 \le z \le 1.44) = P(-2.19 \le z \le 0) + P(0 \le z \le 1.44) = .4857 + .4251 = .9108$
Of course, if we do not use the continuity correction, we get
$P(2 \le x \le 7) = P\left(\frac{2-5.115}{1.6542} \le z \le \frac{7-5.115}{1.6542}\right) = P(-1.88 \le z \le 1.14) = P(-1.88 \le z \le 0) + P(0 \le z \le 1.14) = .4699 + .3729 = .8428$

4) $P(1 \le x \le 3)$ when $p = \frac{1}{55}$. (2)
If $n = 11$, $\frac{n}{p} = 11(55) = 605$. So, since this is above 500, we can use the Poisson distribution with a parameter of $np = 11\left(\frac{1}{55}\right) = 0.2$. $P(1 \le x \le 3) = P(x \le 3) - P(x \le 0) = .99994 - .81873 = .1812$

5) $P(x \ge 1)$ when $p = .17$. (2)
$P(x \ge 1) = 1 - P(0) = 1 - C_0^{11} p^0 q^{11} = 1 - q^{11} = 1 - (.83)^{11} = 1 - .12878 = .8712$.

6) If $x$ follows the continuous uniform distribution with $c = 11$ and $d = 17$, find $P(3 \le x \le 14)$. (2)
Make a diagram. Represent the distribution by a rectangle with a base from 11 to 17 and a height of $\frac{1}{d-c} = \frac{1}{17-11} = \frac{1}{6}$. Shade the area in the rectangle below 14. (There is no area between 3 and 11.) This is the area with base 14 - 11 = 3. The area is $3\left(\frac{1}{6}\right) = 0.5000$.

7) If $x$ follows the Poisson distribution with a parameter of 1.1, find $P(3 \le x \le 15)$. (2)
$P(3 \le x \le 15) = P(x \le 15) - P(x \le 2) = 1 - .90042 = .0996$

8) If $x$ follows the Poisson distribution with a parameter of 37, find $P(15 \le x \le 45)$. (2 or 2.5)
The mean is 37 and $\sigma = \sqrt{37} = 6.0828$. If we use the Normal distribution with a continuity correction we have
$P(14.5 \le x \le 45.5) = P\left(\frac{14.5-37}{6.0828} \le z \le \frac{45.5-37}{6.0828}\right) = P(-3.70 \le z \le 1.40) = P(-3.70 \le z \le 0) + P(0 \le z \le 1.40) = .4999 + .4192 = .9191$
Of course, if we do not use the continuity correction, we get
$P(15 \le x \le 45) = P\left(\frac{15-37}{6.0828} \le z \le \frac{45-37}{6.0828}\right) = P(-3.62 \le z \le 1.32) = .4999 + .4066 = .9065$

9) If $x$ follows the Hypergeometric distribution with $N = 380$, $M = 133$ and $n = 11$, find $P(2 \le x \le 7)$. (2) [71]
Since the population size is more than 20 times the sample size, we can use the binomial distribution with $p = \frac{M}{N} = \frac{133}{380} = .35$.
$P(2 \le x \le 7) = F(7) - F(1) = P(x \le 7) - P(x \le 1) = .98776 - .06058 = .9272$

10) (Extra Credit) Find $P(15 \le x \le 45)$ for an exponential distribution with $c = \frac{1}{37}$, and explain the relation of this problem to 8) above. (3)
$F(x) = 1 - e^{-cx}$, when the mean time to a success is $\frac{1}{c}$. So if $\frac{1}{c} = 37$, $c = \frac{1}{37}$. The average number of occurrences in a given unit of time is 37 and would be used in a Poisson problem. If an average of 37 events occur in one minute, the average wait for the first event to occur is $\frac{1}{c}$. So
$P(x_1 \le x \le x_2) = F(x_2) - F(x_1) = \left(1 - e^{-cx_2}\right) - \left(1 - e^{-cx_1}\right) = e^{-cx_1} - e^{-cx_2}$.
Note that if $x_1 = 15$, $cx_1 = \frac{15}{37} = .4054$ and if $x_2 = 45$, $cx_2 = \frac{45}{37} = 1.2162$. So $P(15 \le x \le 45) = e^{-0.4054} - e^{-1.2162} = .66671 - .29635 = .3704$

E. Assume that we are considering buying two stocks. For the first of these two stocks $\mu_x = 1.35$ and $\sigma_x = 3.5$. For the second stock $\mu_y = 2.00$ and $\sigma_y = 8.0$. Mark individual questions and parts of questions clearly. (Note: y appears as v in some equations due to a bug in Word formatting: it prints correctly)
1. Assume that the amounts above are in dollars per year and that you buy one share of each stock. Show the mean, standard deviation and coefficient of variation of your return $x + y$ if a) $\rho_{xy} = -.8$, b) $\rho_{xy} = 0$ and c) $\rho_{xy} = +.8$. (8.5)
2. Assume that some time in the future the first stock doubles in value, so that its return is now $w = 2x$, and the second stock rises by 50%, so that its return is now $v = 1.5y$. Find $\sigma_v$, $\sigma_{wv}$, $\sigma_{w+v}$ and $\rho_{wv}$ if $\rho_{xy} = -.8$. (4) [83.5]
3. Again $\mu_x = 1.35$, $\sigma_x = 3.5$, $\mu_y = 2.00$ and $\sigma_y = 8.0$. Assume that you have one dollar to invest, so that you will buy $P_1$ shares of stock 1 and $P_2$ shares of stock 2, where $P_1 + P_2 = 1$. Then $R = P_1 x + P_2 y$ is your return. Show how $\rho_{xy}$ affects the minimum risk point by computing coefficients of variation for $R = P_1 x + P_2 y$ for $P_1$ equal to 0, .25, .50, .75 and 1 (you have already done this for zero and one) and for both $\rho_{xy} = -.8$ and $\rho_{xy} = +.8$. This will give you a table with 10 values, some of which are duplicates.
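Comment on this, using what you learned on the last exam, and try to identify 'no fly' zones, if there are any. (10+)

Solution: From the solution to grass1, we had $E(x + y) = E(x) + E(y) = \mu_x + \mu_y$ and $\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}$, that is, $Var(x + y) = Var(x) + Var(y) + 2Cov(x, y)$. If we check 251varmin, we find $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y$ and the coefficient of variation, in this case $C = \frac{\sigma_R}{E(R)}$.

1) For all sections the mean is $E(x + y) = \mu_x + \mu_y = 1.35 + 2.00 = 3.35$. Recall that $\sigma_x = 3.5$ and $\sigma_y = 8.0$, so $\sigma_{x+y}^2 = 3.5^2 + 8.0^2 + 2\sigma_{xy} = 12.25 + 64 + 2\sigma_{xy} = 76.25 + 2\sigma_{xy}$.
1a) $\rho_{xy} = -.8$: $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y = -.8(3.5)(8.0) = -22.4$, $\sigma_{x+y}^2 = 76.25 + 2(-22.4) = 31.45$, $\sigma_{x+y} = \sqrt{31.45} = 5.6080$, $C = \frac{5.6080}{3.35} = 1.674$
1b) $\rho_{xy} = 0$: $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y = 0(3.5)(8.0) = 0$, $\sigma_{x+y}^2 = 76.25$, $\sigma_{x+y} = \sqrt{76.25} = 8.7321$, $C = \frac{8.7321}{3.35} = 2.607$
1c) $\rho_{xy} = +.8$: $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y = .8(3.5)(8.0) = 22.4$, $\sigma_{x+y}^2 = 76.25 + 2(22.4) = 121.05$, $\sigma_{x+y} = \sqrt{121.05} = 11.0023$, $C = \frac{11.0023}{3.35} = 3.284$

2) Using formulas from 251var2, when $w = ax = 2x$ and $v = cy = 1.5y$. Recall $\sigma_{xy} = -22.4$ and $\rho_{xy} = -.8$.
2a) $Var(cy) = c^2 Var(y) = 1.5^2(8.0)^2$, so $\sigma_v = 1.5(8) = 12$
2b) $Cov(w, v) = ac\,Cov(x, y)$, so $\sigma_{wv} = 2(1.5)(-22.4) = -67.2$
2c) $Var(ax + cy) = a^2 Var(x) + 2ac\,Cov(x, y) + c^2 Var(y) = 2^2(3.5)^2 + 2(2)(1.5)(-22.4) + 1.5^2(8.0)^2 = 49 - 134.4 + 144 = 58.60$, so $\sigma_{w+v} = \sqrt{58.60} = 7.6551$
2d) $Corr(w, v) = Corr(ax, cy) = Sign(ac)\,Corr(x, y)$, so $\rho_{wv} = -.8$.

3) If we check 251varmin, we find the following. "If $R = P_1 R_1 + P_2 R_2$ and $P_1 + P_2 = 1$, then $E(R) = P_1 E(R_1) + P_2 E(R_2)$ and $Var(R) = P_1^2 Var(R_1) + P_2^2 Var(R_2) + 2P_1 P_2 Cov(R_1, R_2)$ is the variance of the return. Thus if $P_1$ and $P_2$ are both .50, we can say $Var(R) = .25Var(R_1) + .25Var(R_2) + .50Cov(R_1, R_2)$. …… Since variance is a measure of risk, minimizing variance minimizes risk, though actually, the best measure of risk is probably the coefficient of variation, the standard deviation divided by the mean, in this case $C = \frac{\sigma_R}{E(R)} = \frac{\sqrt{Var(R)}}{E(R)}$."
Recall that $\mu_x = 1.35$, $\mu_y = 2.00$, $\sigma_x = 3.5$ and $\sigma_y = 8.0$, so that $\sigma_x^2 = 12.25$ and $\sigma_y^2 = 64$.
3a) $\rho_{xy} = -.8$ and $\sigma_{xy} = -22.4$.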
$Var(R) = P_1^2 Var(R_1) + P_2^2 Var(R_2) + 2P_1 P_2 Cov(R_1, R_2)$

Row   P1     E(R)                            Var(R)
1     1.0    1.35                            12.25
2     0.75   .75(1.35) + .25(2) = 1.5125     .5625(12.25) + .0625(64) - .375(22.4) = 2.4906
3     0.5    .5(1.35) + .5(2) = 1.675        .25(12.25) + .25(64) - .50(22.4) = 7.8625
4     0.25   .25(1.35) + .75(2) = 1.8375     .0625(12.25) + .5625(64) - .375(22.4) = 28.3656
5     0      2.00                            64

Row   P1     E(R)     sigma_R   C_R
1     1.0    1.35     3.5       2.59
2     0.75   1.5125   1.578     1.04
3     0.5    1.675    2.804     1.67
4     0.25   1.8375   5.326     2.90
5     0      2.00     8.0       4.00

3b) $\rho_{xy} = +.8$ and $\sigma_{xy} = +22.4$.
$Var(R) = P_1^2 Var(R_1) + P_2^2 Var(R_2) + 2P_1 P_2 Cov(R_1, R_2)$

Row   P1     E(R)                            Var(R)
1     1.0    1.35                            12.25
2     0.75   .75(1.35) + .25(2) = 1.5125     .5625(12.25) + .0625(64) + .375(22.4) = 19.2906
3     0.5    .5(1.35) + .5(2) = 1.675        .25(12.25) + .25(64) + .50(22.4) = 30.2625
4     0.25   .25(1.35) + .75(2) = 1.8375     .0625(12.25) + .5625(64) + .375(22.4) = 45.1656
5     0      2.00                            64

Row   P1     E(R)     sigma_R   C_R
1     1.0    1.35     3.5       2.59
2     0.75   1.5125   4.392     2.90
3     0.5    1.675    5.501     3.28
4     0.25   1.8375   6.720     3.66
5     0      2.00     8.0       4.00

With the negative covariance, putting 75% in stock 1 lowers risk, creating a 'no fly' zone between 100% and 75%, where a lower return gives higher risk. Even the 50-50 split gives better results than all in stock 1. Beyond the 75% point we are in a 'normal' region where accepting higher risk yields higher return. With the positive covariance, there seems to be no 'no fly' zone; accepting increasing risk always yields higher return.

F. (BLK 38) Assume that the amount of gasoline purchased per car at a large service station has an approximately Normal distribution with a mean of $15 and a standard deviation of $4. Find the following. Mark individual questions clearly!
1. The probability that a given car will purchase between $14 and $16 worth of gas. (2)
2. The probability that a random sample of 16 cars will have a sample mean between $14 and $16. (3)
3. The probability that all 16 cars in the sample will purchase an amount between $14 and $16. (2)
4. The probability that at least one of the 16 cars will purchase an amount between $14 and $16. (2)
5.
The 95th percentile of the distribution of the sample mean for the 16 cars. (1.5)
6. The 10th percentile of the distribution of the sample mean for 16 cars. (1.5) [95.5]

Solution: Note that $x \sim N(15, 4)$. This means that for a sample of 16, we will have $\bar{x} \sim N\left(15, \frac{4}{\sqrt{16}}\right)$ or $\bar{x} \sim N(15, 1)$.

1. $P(14 \le x \le 16) = P\left(\frac{14-15}{4} \le z \le \frac{16-15}{4}\right) = P(-0.25 \le z \le 0.25) = P(-0.25 \le z \le 0) + P(0 \le z \le 0.25) = .0987 + .0987 = .1974$
For $x \sim N(15, 4)$ make a Normal curve centered at 15 and shade the area from 14 to 16; for $z$ make a Normal curve centered at zero and shade the area from -0.25 to 0.25. Since the diagrams show an area on both sides of the mean, you add.

2. $P(14 \le \bar{x} \le 16) = P\left(\frac{14-15}{1} \le z \le \frac{16-15}{1}\right) = P(-1.00 \le z \le 1.00) = P(-1.00 \le z \le 0) + P(0 \le z \le 1.00) = .3413 + .3413 = .6826$
For $\bar{x} \sim N(15, 1)$ make a Normal curve centered at 15 and shade the area from 14 to 16; for $z$ make a Normal curve centered at zero and shade the area from -1.00 to 1.00. Since the diagrams show an area on both sides of the mean, you add.

3. This is a binomial problem. Let $y$ be the number of cars that purchase an amount between $14 and $16. Then, assuming independence, $y$ will have a binomial distribution with $n = 16$, $p = .1974$ and $q = 1 - .1974 = .8026$. $P(x) = C_x^n p^x q^{n-x}$. So $P(16) = C_{16}^{16} p^{16} q^0 = p^{16} = (.1974)^{16} = 5.31 \times 10^{-12}$

4. This is also a binomial problem. $y$ still has a binomial distribution with $n = 16$, $p = .1974$ and $q = 1 - .1974 = .8026$. $P(x) = C_x^n p^x q^{n-x}$. So $P(y \ge 1) = 1 - P(0) = 1 - C_0^{16} p^0 q^{16} = 1 - q^{16} = 1 - (.8026)^{16} = 1 - .02965 = .9704$

5. $\bar{x} \sim N(15, 1)$. The 95th percentile is $\bar{x}_{.05}$ and has 95% of the data below it. According to the t table, $z_{.05} = 1.645$. If we employ the usual transformation $x = \mu + z\sigma$, we conclude that $\bar{x}_{.05} = 15 + 1.645(1) = 16.645$.

6. $\bar{x} \sim N(15, 1)$. The 10th percentile is $\bar{x}_{.90}$ and has 10% of the data below it. Because the mean of $\bar{x}$ is 15, $\bar{x}_{.90}$ is below 15. Because the standardized Normal distribution is symmetrical about zero, $z_{.90} = -z_{.10}$. According to the t table, $z_{.10} = 1.282$. This means that $z_{.90} = -1.282$. If we employ the usual transformation $x = \mu + z\sigma$, we conclude that $\bar{x}_{.90} = 15 - 1.282(1) = 13.718$.
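The six answers to Problem F can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8+ so that the standard library's `statistics.NormalDist` is available; the helper `interval_prob` and the variable names are illustrative and not part of the exam. Because it uses exact Normal probabilities rather than a rounded z table, the last digit can differ slightly from the table-based answers.

```python
from statistics import NormalDist

MU, SIGMA, N_CARS = 15.0, 4.0, 16  # mean, std. deviation, sample size from Problem F


def interval_prob(dist, lo, hi):
    """Return P(lo <= X <= hi) for a NormalDist object."""
    return dist.cdf(hi) - dist.cdf(lo)


one_car = NormalDist(MU, SIGMA)                      # x ~ N(15, 4)
sample_mean = NormalDist(MU, SIGMA / N_CARS ** 0.5)  # x-bar ~ N(15, 1)

p_one = interval_prob(one_car, 14, 16)       # part 1: a single car
p_mean = interval_prob(sample_mean, 14, 16)  # part 2: mean of 16 cars
p_all = p_one ** N_CARS                      # part 3: binomial P(16), assuming independence
p_any = 1 - (1 - p_one) ** N_CARS            # part 4: binomial 1 - P(0)
pct_95 = sample_mean.inv_cdf(0.95)           # part 5: 95th percentile of x-bar
pct_10 = sample_mean.inv_cdf(0.10)           # part 6: 10th percentile of x-bar

if __name__ == "__main__":
    for label, value in [("1", p_one), ("2", p_mean), ("3", p_all),
                         ("4", p_any), ("5", pct_95), ("6", pct_10)]:
        print(f"part {label}: {value:.6g}")
```

Parts 3 and 4 reuse the single-car probability as the binomial $p$, which mirrors the independence assumption made in the solution.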
ECO 251 QBA1 FINAL EXAM, Version 1
MAY 8, 2006
Name _________________ Class _________________ Student Number _____________

Supplementary Correlation Problem
This problem is an edited version of a problem due to Ben-Horim and Levy. (14 points)
A textile firm operates in two countries, the United States ($x$) and Japan ($y$). Its anticipated profits for 2007 are approximated by the following joint distribution. Please turn in any scratch paper that you use.

            x = -2   x = 2   x = 6   P(y)
y = -2      .24      .06     0       .30
y = 4       .08      .25     .11     .44
y = 12      0        .08     .18     .26
P(x)        .32      .39     .29     1.00

a. Find the expected net profit for both countries and the countries combined. (1.5)
b. Find the (population) standard deviation of profits for one or both countries. (2)
c. Find the (population) covariance and correlation between profits in both countries. (3)
d. Comment on the strength and the sign of the results and explain from general economic knowledge why you would be very surprised to get a different sign. (1.5)
e. What is the standard deviation of the firm's total profit? (2)
f. To verify that the firm's mean and standard deviation are the values that you presumably got above by using formulas that you learned in class, note that, though there are 9 joint probabilities, $x + y$ can only take 8 values. Fill in the following table and compute $E(x + y)$ and $\sigma_{x+y}$ using it. If there is a discrepancy, find your error. (4 if they agree)

x+y    P(x+y)    (x+y)P(x+y)    (x+y)^2 P(x+y)
-4
0
2
4
6
10
14
18

Solution: To compute the means and variances we do the following tableau.

            x = -2   x = 2   x = 6   P(y)    yP(y)    y^2 P(y)
y = -2      .24      .06     0       .30     -0.60    1.20
y = 4       .08      .25     .11     .44     1.76     7.04
y = 12      0        .08     .18     .26     3.12     37.44
P(x)        .32      .39     .29     1.00    4.28     45.68
xP(x)       -0.64    0.78    1.74    1.88
x^2 P(x)    1.28     1.56    10.44   13.28

This tells us $E(x) = 1.88$, $E(x^2) = 13.28$, $E(y) = 4.28$ and $E(y^2) = 45.68$. We also need to do
$E(xy) = \sum\sum xy\,P(x, y)$
$= (.24)(-2)(-2) + (.06)(2)(-2) + (0)(6)(-2)$
$+ (.08)(-2)(4) + (.25)(2)(4) + (.11)(6)(4)$
$+ (0)(-2)(12) + (.08)(2)(12) + (.18)(6)(12)$
$= (0.96 - 0.24 + 0) + (-0.64 + 2.00 + 2.64) + (0 + 1.92 + 12.96) = 19.6$.
a.
The expected net profit for both countries and the countries combined is $E(x) = 1.88$, $E(y) = 4.28$ and $E(x+y) = E(x) + E(y) = \mu_x + \mu_y = 1.88 + 4.28 = 6.16$.

b. The (population) standard deviation of profits for both countries is found by computing the square root of the variance.
$\sigma_x^2 = \mathrm{Var}(x) = E(x^2) - \mu_x^2 = \sum x^2 P(x) - \mu_x^2 = 13.28 - (1.88)^2 = 9.7456$, so $\sigma_x = \sqrt{9.7456} = 3.12179$.
$\sigma_y^2 = \mathrm{Var}(y) = E(y^2) - \mu_y^2 = \sum y^2 P(y) - \mu_y^2 = 45.68 - (4.28)^2 = 27.3616$, so $\sigma_y = \sqrt{27.3616} = 5.23083$.

c. The (population) covariance and correlation between profits in both countries are found from $E(xy)$.
$\sigma_{xy} = \mathrm{Cov}(x, y) = E(xy) - \mu_x \mu_y = 19.6 - (1.88)(4.28) = 11.5536$, so that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{11.5536}{\sqrt{9.7456}\sqrt{27.3616}} = .707526$.

d. The strength of the correlation is best measured on a zero to one scale by $\rho_{xy}^2 = .5006$. This can be termed neither weak nor strong. The sign of the results is positive, indicating that the profits in the two countries tend to move together, as we would expect in an increasingly internationalized economy where world-wide prosperity and depression are major factors in the profits of transnational firms.

e. The standard deviation of the firm's total profit is found from the variance of the total.
$\mathrm{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = \mathrm{Var}(x) + \mathrm{Var}(y) + 2\mathrm{Cov}(x, y) = 9.7456 + 27.3616 + 2(11.5536) = 60.2144$, so $\sigma_{x+y} = \sqrt{60.2144} = 7.7598$.

f. The completed table is as follows.

Row      x + y    P(x+y)    (x+y)P(x+y)    (x+y)²P(x+y)
1         -4       0.24       -0.96            3.84
2          0       0.06        0.00            0.00
3          2       0.08        0.16            0.32
4          4       0.00        0.00            0.00
5          6       0.25        1.50            9.00
6         10       0.11        1.10           11.00
7         14       0.08        1.12           15.68
8         18       0.18        3.24           58.32
Total              1.00        6.16           98.16

So $E(x+y) = 6.16$ and $\mathrm{Var}(x+y) = 98.16 - (6.16)^2 = 60.2144$, which agrees with the results in a and e.

Extra Credit Problem Version 1 (Stevens, modified)

A sample of 350 parents of children 19 to 24 months is taken. They were asked the following question: Did your child consume a hot dog today? 110 said yes. According to a USA Today article, the national percentage for hot dogs was 25%.
1) Using the information above, check to see if the percent consuming the food was significantly different from the percent given in the USA Today article by creating a 95% confidence interval for the proportion. (3.5)
2) (Extra extra credit!)
Assume the above results came from a daycare system where the population of parents of children attending is only 1400. Do part 1) again with this additional information. (3.5)

Solution: 1) Let $\bar{p} = \frac{x}{n}$ and $\bar{q} = 1 - \bar{p}$ be the sample quantities. Remember that a proportion is always between zero and one. For the interval write $p = \bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$, where $s_{\bar{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}}$. We have $n = 350$ and $x = 110$, so $\bar{p} = \frac{x}{n} = \frac{110}{350} = .3143$ and $\bar{q} = 1 - .3143 = .6857$. Our estimate of the standard error of the proportion is $s_{\bar{p}} = \sqrt{\frac{(.3143)(.6857)}{350}} = \sqrt{0.0006157} = .02481$. If $1 - \alpha = .95$, then $\alpha = .05$ and $\frac{\alpha}{2} = .025$. According to the t table, $z_{.025} = 1.960$. Thus our confidence interval is $p = .3143 \pm 1.960(.02481) = .3143 \pm .0486$. So we can say $P(.2657 \le p \le .3629) = .95$. Since .25 is not on this interval, we can say that the proportion found in this survey was significantly different from .25.

Just to give myself some credibility, I ran this on Minitab and got the following.

Version 1
MTB > POne 350 110;
SUBC> Test 0.25;
SUBC> UseZ.

Test and CI for One Proportion
Test of p = 0.25 vs p not = 0.25
Sample    X    N  Sample p                95% CI  Z-Value  P-Value
1       110  350  0.314286  (0.265651, 0.362921)     2.78    0.005

2) For the interval we still write $p = \bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$, but we have $N = 1400$, so that our sample is more than one twentieth of the population. Then we must use the finite population correction and $s_{\bar{p}} = \sqrt{\frac{N-n}{N-1}}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. We have $n = 350$ and $x = 110$, so $\bar{p} = \frac{x}{n} = \frac{110}{350} = .3143$ and $\bar{q} = 1 - .3143 = .6857$. Our estimate of the standard error of the proportion is $s_{\bar{p}} = \sqrt{\frac{1400-350}{1400-1} \cdot \frac{(.3143)(.6857)}{350}} = \sqrt{0.750536(0.0006157)} = \sqrt{0.000462} = .02150$. Thus our confidence interval is $p = .3143 \pm 1.960(.02150) = .3143 \pm .0421$. So we can say $P(.2722 \le p \le .3564) = .95$. Since .25 is not on this interval, we can still say that the proportion found in this survey was significantly different from .25.
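Both parts of the extra credit problem use the same interval formula, with the standard error optionally multiplied by the finite population correction. The following is a minimal sketch, assuming Python 3.8 or later with only the standard library; the function name `proportion_ci` and its parameters are hypothetical and are not taken from the exam or from Minitab.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, conf=0.95, pop_size=None):
    """Normal-approximation confidence interval for a proportion.

    x        : number of successes in the sample
    n        : sample size
    conf     : confidence level, e.g. 0.95
    pop_size : population size N; if given, the standard error is
               multiplied by sqrt((N - n) / (N - 1))
    """
    p_bar = x / n
    se = sqrt(p_bar * (1 - p_bar) / n)
    if pop_size is not None:
        se *= sqrt((pop_size - n) / (pop_size - 1))
    # Two-sided critical value z_{alpha/2} from the standard Normal.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return p_bar - z * se, p_bar + z * se

ci_plain = proportion_ci(110, 350)               # part 1
ci_fpc = proportion_ci(110, 350, pop_size=1400)  # part 2

print("no correction:   (%.6f, %.6f)" % ci_plain)
print("with correction: (%.6f, %.6f)" % ci_fpc)
print("0.25 inside corrected interval:", ci_fpc[0] <= 0.25 <= ci_fpc[1])
```

The uncorrected call should reproduce the Minitab interval quoted in part 1. The corrected call gives a narrower interval, since the correction factor is below one whenever the sample is larger than a single observation.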