251solngr2-081b 6/18/08 (Open this document in 'Page Layout' view!)

Graded Assignment 2
Name: Key

There will be a penalty for papers that are unstapled. Note that from now on neatness means paper neatly trimmed on the left side if it has been torn, multiple pages stapled and paper written on only one side. The stapling is for your protection. Putting your name on every page helps too; I still have some unclaimed pages from an old exam (as well as an old exam with no name on it that the perp will not admit responsibility for). Note problem 3 at the end to see how risk fits into this.

a)              x
            4      5      6
  y   1    .10    .05     0
      2    .05    .30     0
      3     0      0     .50

b)              x
            4      5      6
  y   1     0     .05    .10
      2     0     .30    .05
      3    .50     0      0

1) Use the joint probability tables above. For these joint probability tables (i) check for independence, (ii) compute $E(x)$ and $\text{Var}(x)$, (iii) compute $\text{Cov}(x,y)$ or $\sigma_{xy}$ and $\text{Corr}(x,y)$ or $\rho_{xy}$, (iv) compute $E(x+y)$ and $\text{Var}(x+y)$ from the results in (ii) and (iii), (v) compute $\text{Cov}(3x+3,\,y)$ and $\text{Corr}(3x+3,\,y)$ using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that $y = 1y + 0$.

Solution: (i) Check for independence: First you need to find $P(x)$ and $P(y)$. Look at the upper left hand probability below. Its value is a) .10 or b) 0 and it represents $P(x=4 \cap y=1)$. If $x$ and $y$ are independent, we would have $P(x=4 \cap y=1) = P(x=4)\,P(y=1)$. We need to find out what these probabilities are, so we add the rows and columns to get marginal or total probabilities.

a)              x
            4      5      6     P(y)
  y   1    .10    .05     0     .15
      2    .05    .30     0     .35
      3     0      0     .50    .50
  P(x)     .15    .35    .50   1.00

b)              x
            4      5      6     P(y)
  y   1     0     .05    .10    .15
      2     0     .30    .05    .35
      3    .50     0      0     .50
  P(x)     .50    .35    .15   1.00

Thus we have for a) $P(x=4)\,P(y=1) = (.15)(.15) = .0225$. Since this is not equal to $P(x=4 \cap y=1) = .10$ in a), $x$ and $y$ cannot be independent. Even one place where the joint probability is not the product of the marginal probabilities is enough to show that $x$ and $y$ are not independent. For b) we have $P(x=4)\,P(y=1) = (.50)(.15) = .075$.
But this is not equal to $P(x=4 \cap y=1) = 0$. If this one is not enough to convince you, how about, for both a) and b), $P(x=5 \cap y=2) = .30 \ne P(x=5)\,P(y=2) = (.35)(.35) = .1225$.

Actually the fastest way to prove nonindependence is to look for zeroes. $P(x=5 \cap y=3) = 0$ in both a) and b), so if $x$ and $y$ were independent, it would have to be true that $P(x=5) = 0$ or $P(y=3) = 0$. Neither is zero ($P(x=5) = .35$ and $P(y=3) = .50$), so $x$ and $y$ are not independent. Notice that the second row is not proportional to the first row or any other row. This is also evidence of non-independence. If variables are independent, all rows must be proportional to one another and all columns must also be proportional to one another.

A zero covariance or correlation would be the consequence of independence, but it is not true that a zero correlation or covariance would prove independence. We have already seen one example where there is a zero correlation, but no independence (Downing and Clark, pg. 219, Computational Problem 3).

Let's finish the job we did in (i) by computing $\sum P(x) = 1$ (a check for a valid distribution), $E(x) = \sum xP(x)$, $E(x^2) = \sum x^2P(x)$, $\sum P(y) = 1$, $E(y) = \sum yP(y)$ and $E(y^2) = \sum y^2P(y)$. The easiest way to do this is to multiply the items in the $P(y)$ column by the items in the $y$ column to get the $yP(y)$ column and then to multiply the items in the $yP(y)$ column by the items in the $y$ column to get the $y^2P(y)$ column. Then multiply the items in the $P(x)$ row by the items in the $x$ row to get the $xP(x)$ row and then multiply the items in the $xP(x)$ row by the items in the $x$ row to get the $x^2P(x)$ row. Then add up all the rows and columns outside the original table.
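The row-and-column bookkeeping just described can be sketched in Python for table a). This is only an illustration and not part of the original key: the names `joint`, `moments` and `independent` are hypothetical, and the table is assumed to be laid out with rows y = 1, 2, 3 and columns x = 4, 5, 6.

```python
# Joint table a): rows are y = 1, 2, 3; columns are x = 4, 5, 6.
# (Layout and names are assumptions made for this sketch.)
x_vals = [4, 5, 6]
y_vals = [1, 2, 3]
joint = [
    [0.10, 0.05, 0.00],
    [0.05, 0.30, 0.00],
    [0.00, 0.00, 0.50],
]


def moments(values, probs):
    """Return (sum of P, E(v), E(v^2)) for one marginal distribution."""
    total = sum(probs)
    mean = sum(v * p for v, p in zip(values, probs))
    mean_sq = sum(v * v * p for v, p in zip(values, probs))
    return total, mean, mean_sq


# Marginal probabilities: add across rows for P(y), down columns for P(x).
p_y = [sum(row) for row in joint]
p_x = [sum(col) for col in zip(*joint)]

total_x, e_x, e_x2 = moments(x_vals, p_x)
total_y, e_y, e_y2 = moments(y_vals, p_y)

# Independence requires every joint cell to equal the product of its marginals.
independent = all(
    abs(joint[i][j] - p_y[i] * p_x[j]) < 1e-12
    for i in range(len(y_vals))
    for j in range(len(x_vals))
)

print("sum P(x) =", round(total_x, 4), " E(x) =", round(e_x, 4),
      " E(x^2) =", round(e_x2, 4))
print("sum P(y) =", round(total_y, 4), " E(y) =", round(e_y, 4),
      " E(y^2) =", round(e_y2, 4))
print("independent?", independent)
```

Replacing `joint` with the rows of table b) gives the corresponding check for the second table.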
a)              x
            4      5      6     P(y)   yP(y)   y^2 P(y)
  y   1    .10    .05     0     .15    0.15     0.15
      2    .05    .30     0     .35    0.70     1.40
      3     0      0     .50    .50    1.50     4.50
  P(x)     .15    .35    .50   1.00    2.35     6.05
  xP(x)   0.60   1.75   3.00   5.35
  x^2 P(x) 2.40  8.75  18.00  29.15

b)              x
            4      5      6     P(y)   yP(y)   y^2 P(y)
  y   1     0     .05    .10    .15    0.15     0.15
      2     0     .30    .05    .35    0.70     1.40
      3    .50     0      0     .50    1.50     4.50
  P(x)     .50    .35    .15   1.00    2.35     6.05
  xP(x)   2.00   1.75   0.90   4.65
  x^2 P(x) 8.00  8.75   5.40  22.15

To summarize:
a) $\sum P(x) = 1$ (a check), $E(x) = \sum xP(x) = 5.35$, $E(x^2) = \sum x^2P(x) = 29.15$, $\sum P(y) = 1$, $E(y) = \sum yP(y) = 2.35$ and $E(y^2) = \sum y^2P(y) = 6.05$.
b) $\sum P(x) = 1$ (a check), $E(x) = \sum xP(x) = 4.65$, $E(x^2) = \sum x^2P(x) = 22.15$, $\sum P(y) = 1$, $E(y) = \sum yP(y) = 2.35$ and $E(y^2) = \sum y^2P(y) = 6.05$.

(ii) Compute $E(x)$ and $\text{Var}(x)$. Remember that variances and standard deviations are never negative. We actually need means and variances for both $x$ and $y$. From the above:
a) $\mu_x = E(x) = \sum xP(x) = 5.35$, $\sigma_x^2 = \text{Var}(x) = E(x^2) - \mu_x^2 = \sum x^2P(x) - \mu_x^2 = 29.15 - (5.35)^2 = 0.5275$ ($\sigma_x = \sqrt{0.5275} = 0.7263$), $\mu_y = E(y) = \sum yP(y) = 2.35$ and $\sigma_y^2 = \text{Var}(y) = E(y^2) - \mu_y^2 = \sum y^2P(y) - \mu_y^2 = 6.05 - (2.35)^2 = 0.5275$ ($\sigma_y = \sqrt{0.5275} = 0.7263$);
b) $\mu_x = E(x) = \sum xP(x) = 4.65$, $\sigma_x^2 = \text{Var}(x) = E(x^2) - \mu_x^2 = \sum x^2P(x) - \mu_x^2 = 22.15 - (4.65)^2 = 0.5275$ ($\sigma_x = \sqrt{0.5275} = 0.7263$), $\mu_y = E(y) = \sum yP(y) = 2.35$ and $\sigma_y^2 = \text{Var}(y) = E(y^2) - \mu_y^2 = 6.05 - (2.35)^2 = 0.5275$ ($\sigma_y = \sqrt{0.5275} = 0.7263$).

(iii) Compute $\text{Cov}(x,y)$ or $\sigma_{xy}$ and $\text{Corr}(x,y)$ or $\rho_{xy}$.

a) We must now compute $E(xy)$ by multiplying each pair of values of $x$ and $y$ by their joint probabilities. We had $\mu_x = 5.35$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and

                x
            4      5      6
  y   1    .10    .05     0
      2    .05    .30     0
      3     0      0     .50

$E(xy) = \sum\sum xy\,P(x,y)$, taken row by row:
  y = 1:  (.10)(4)(1) + (.05)(5)(1) + (0)(6)(1)   = 0.40 + 0.25 + 0
  y = 2:  (.05)(4)(2) + (.30)(5)(2) + (0)(6)(2)   = 0.40 + 3.00 + 0
  y = 3:  (0)(4)(3)   + (0)(5)(3)   + (.50)(6)(3) = 0    + 0    + 9.00
  Total: 13.05

To complete what we have done, write $\sigma_{xy} = \text{Cov}(x,y) = E(xy) - \mu_x\mu_y = 13.05 - (5.35)(2.35) = 0.4775$. So that $\rho_{xy} = \dfrac{\sigma_{xy}}{\sigma_x\sigma_y} = \dfrac{0.4775}{\sqrt{0.5275}\sqrt{0.5275}} = \sqrt{0.8194} = .9052$.

b) We must now compute $E(xy)$ by multiplying each pair of values of $x$ and $y$ by their joint probabilities. We had $\mu_x = 4.65$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and

                x
            4      5      6
  y   1     0     .05    .10
      2     0     .30    .05
      3    .50     0      0

$E(xy) = \sum\sum xy\,P(x,y)$, taken row by row:
  y = 1:  (0)(4)(1)   + (.05)(5)(1) + (.10)(6)(1) = 0    + 0.25 + 0.60
  y = 2:  (0)(4)(2)   + (.30)(5)(2) + (.05)(6)(2) = 0    + 3.00 + 0.60
  y = 3:  (.50)(4)(3) + (0)(5)(3)   + (0)(6)(3)   = 6.00 + 0    + 0
  Total: 10.45

To complete what we have done, write $\sigma_{xy} = \text{Cov}(x,y) = E(xy) - \mu_x\mu_y = 10.45 - (4.65)(2.35) = -0.4775$. So that $\rho_{xy} = \dfrac{\sigma_{xy}}{\sigma_x\sigma_y} = \dfrac{-0.4775}{\sqrt{0.5275}\sqrt{0.5275}} = -\sqrt{0.8194} = -.9052$.

In general, joint probability tables with only the diagonals filled produce correlations close to +1 or -1. A northwest to southeast diagonal produces a positive correlation and a southwest to northeast diagonal produces a negative correlation. The two tables here have dominant diagonals: each number in the diagonal is larger than the other numbers in its row and column, and so the correlations are similar to those of tables with only the diagonals filled. Remember that the correlation must be between -1 and +1!

Note that the strength of a correlation is found by squaring the correlation and measuring the strength on a zero to one scale. In a) we had $\rho_{xy} = .9052$, so $\rho_{xy}^2 = (.9052)^2 = .8194$ and we can say that there is a relatively strong tendency for $y$ to rise as $x$ rises. In b) we had $\rho_{xy} = -.9052$, so $\rho_{xy}^2 = (-.9052)^2 = .8194$ and we can say that there is a relatively strong tendency for $y$ to fall as $x$ rises or for $y$ to rise as $x$ falls.

(iv) Compute $E(x+y)$ and $\text{Var}(x+y)$ from the results in (ii) and (iii). How many of you ignored the instructions and wrote down each value of $x+y$ with its probability? What a great way to waste time! The formulas that you were given were $E(x+y) = E(x) + E(y) = \mu_x + \mu_y$ and $\text{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = \text{Var}(x) + \text{Var}(y) + 2\,\text{Cov}(x,y)$.

a) We had $\mu_x = 5.35$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and $\sigma_{xy} = 0.4775$. $E(x+y) = \mu_x + \mu_y = 5.35 + 2.35 = 7.70$ and $\text{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = 0.5275 + 0.5275 + 2(0.4775) = 2.0100$ ($\sigma_{x+y} = \sqrt{2.0100} = 1.4177$).

b) We had $\mu_x = 4.65$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and $\sigma_{xy} = -0.4775$.
$E(x+y) = \mu_x + \mu_y = 4.65 + 2.35 = 7.00$ and $\text{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = 0.5275 + 0.5275 + 2(-0.4775) = 0.1000$ ($\sigma_{x+y} = \sqrt{0.1000} = 0.3162$).

(v) Compute $\text{Cov}(3x+3,\,y)$ and $\text{Corr}(3x+3,\,y)$ using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that $y = 1y + 0$.

251v2out says $\text{Cov}(ax+b,\,cy+d) = ac\,\text{Cov}(x,y)$ and $\text{Corr}(ax+b,\,cy+d) = \text{sign}(ac)\,\text{Corr}(x,y)$, where $\text{sign}(ac)$ has the value $+1$ or $-1$ depending on whether the product of $a$ and $c$ is positive or negative. Here $a = 3$ and $c = 1$.

$\text{Cov}(3x+3,\,1y+0) = 3(1)\,\text{Cov}(x,y) = 3\,\text{Cov}(x,y)$
$\text{Corr}(3x+3,\,1y+0) = \text{sign}(3(1))\,\text{Corr}(x,y) = \text{sign}(3)\,\text{Corr}(x,y) = (1)\,\text{Corr}(x,y)$

a) We had $\sigma_{xy} = 0.4775$ and $\rho_{xy} = .9052$. So $\text{Cov}(3x+3,\,1y+0) = 3\,\text{Cov}(x,y) = 3(0.4775) = 1.4325$ and $\text{Corr}(3x+3,\,1y+0) = (1)\,\text{Corr}(x,y) = .9052$.

b) We had $\sigma_{xy} = -0.4775$ and $\rho_{xy} = -.9052$. So $\text{Cov}(3x+3,\,1y+0) = 3\,\text{Cov}(x,y) = 3(-0.4775) = -1.4325$ and $\text{Corr}(3x+3,\,1y+0) = (1)\,\text{Corr}(x,y) = -.9052$.

2) The following data represent the scores of a group of students on a math placement test and their grades in a math course. (i) Compute the sample mean and variance of $x$, (ii) compute $\text{Cov}(x,y)$ or $s_{xy}$ and $\text{Corr}(x,y)$ or $r_{xy}$, (iii) compute the sample mean and variance of $x+y$ from the results in (i) and (ii), (iv) compute $\text{Cov}(6x+3,\,y)$ and $\text{Corr}(6x+3,\,y)$ using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that $y = 1y + 0$.

  Test Score   Grades
      x          y
     51         80
     52         77
     59         87
     45         72
     61         80
     54         84
     56         83
     67         87
     63         92
     53         77

(i) Compute the sample mean and variance of $x$.

  Row     x     x^2      y     y^2      xy
   1     51    2601     80    6400    4080
   2     52    2704     77    5929    4004
   3     59    3481     87    7569    5133
   4     45    2025     72    5184    3240
   5     61    3721     80    6400    4880
   6     54    2916     84    7056    4536
   7     56    3136     83    6889    4648
   8     67    4489     87    7569    5829
   9     63    3969     92    8464    5796
  10     53    2809     77    5929    4081
  Sum   561   31851    819   67389   46227

To summarize the results of these computations: $n = 10$, $\sum x = 561$, $\sum y = 819$, $\sum x^2 = 31851$, $\sum y^2 = 67389$ and $\sum xy = 46227$. Thus $\bar{x} = \dfrac{\sum x}{n} = \dfrac{561}{10} = 56.1$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{819}{10} = 81.9$.

$s_x^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = \dfrac{31851 - 10(56.1)^2}{9} = \dfrac{378.9}{9} = 42.1000$. $s_x = \sqrt{42.1000} = 6.4885$.

$s_y^2 = \dfrac{\sum y^2 - n\bar{y}^2}{n-1} = \dfrac{67389 - 10(81.9)^2}{9} = \dfrac{312.9}{9} = 34.7667$.
$s_y = \sqrt{34.7667} = 5.8963$.

(ii) Compute $\text{Cov}(x,y)$ or $s_{xy}$ and $\text{Corr}(x,y)$ or $r_{xy}$. Recall $\sum xy = 46227$, $\bar{x} = 56.1$ and $\bar{y} = 81.9$.

$s_{xy} = \dfrac{\sum (x-\bar{x})(y-\bar{y})}{n-1} = \dfrac{\sum xy - n\bar{x}\bar{y}}{n-1} = \dfrac{46227 - 10(56.1)(81.9)}{9} = \dfrac{281.1}{9} = 31.2333$.

$r_{xy} = \dfrac{s_{xy}}{s_x s_y} = \dfrac{31.2333}{\sqrt{42.1}\sqrt{34.7667}} = \sqrt{\dfrac{(31.2333)^2}{(42.1)(34.7667)}} = \sqrt{0.6665} = .8164$.

(iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii). $\bar{x} + \bar{y} = 56.1 + 81.9 = 138.0$. Recall $s_x^2 = 42.1000$, $s_y^2 = 34.7667$ and $s_{xy} = 31.2333$.

$s_{x+y}^2 = s_x^2 + s_y^2 + 2s_{xy} = 42.1000 + 34.7667 + 2(31.2333) = 139.3333$.

(iv) Compute $\text{Cov}(6x+3,\,y)$ and $\text{Corr}(6x+3,\,y)$ using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that $y = 1y + 0$.

251v2out says $\text{Cov}(ax+b,\,cy+d) = ac\,\text{Cov}(x,y)$ and $\text{Corr}(ax+b,\,cy+d) = \text{sign}(ac)\,\text{Corr}(x,y)$, where $\text{sign}(ac)$ has the value $+1$ or $-1$ depending on whether the product of $a$ and $c$ is positive or negative. Here $a = 6$ and $c = 1$, $s_{xy} = \text{Cov}(x,y) = 31.2333$ and $r_{xy} = \text{Corr}(x,y) = .8164$.

$\text{Cov}(6x+3,\,1y+0) = 6(1)\,\text{Cov}(x,y) = 6(31.2333) = 187.3998$
$\text{Corr}(6x+3,\,1y+0) = \text{sign}(6(1))\,\text{Corr}(x,y) = (1)(0.8164) = .8164$.

3) The PHLX Gold/Silver Sector℠ (XAU℠) is a capitalization-weighted index composed of 16 companies involved in the gold and silver mining industry. XAU was set to an initial value of 100 in January 1979; options commenced trading on December 19, 1983. The Dow-Jones Utility average is an average based on the prices of 16 (I think) utility stocks. Both gold and utilities can attract cautious investors under certain stock market conditions, so it is interesting to look at how they move relative to one another. The values of these two indices for 20 very recent trading days are given below.
                        x          y
  Row   pick   Date   PHLXGS     DJUT
   1     *     03/05  203.32    496.60
   2     *     03/04  195.62    489.97
   3     *     03/03  202.98    482.76
   4     *     02/29  196.58    477.50
   5     *     02/28  202.84    492.40
   6     *     02/27  197.84    496.04
   7     *     02/26  193.13    504.63
   8     *     02/25  188.12    500.78
   9     *     02/22  189.94    497.54
  10     *     02/21  196.56    491.82
  11     0     02/20  189.56    499.95
  12     1     02/19  186.10    499.85
  13     2     02/15  177.32    500.41
  14     3     02/14  176.87    498.79
  15     4     02/13  179.43    504.05
  16     5     02/12  176.65    502.08
  17     6     02/11  182.06    497.90
  18     7     02/08  181.25    494.39
  19     8     02/07  174.88    496.96
  20     9     02/06  129.98    498.66

You are expected to work with 11 of the 20 observations shown above. Use the first 10 rows of data and pick one more row by finding the row marked with the second-to-last digit of your student number. (i) Compute the sample mean and standard deviation of $x$, (ii) compute $\text{Cov}(x,y)$ or $s_{xy}$ and $\text{Corr}(x,y)$ or $r_{xy}$, (iii) compute the sample mean and variance of $x+y$ from the results in (i) and (ii). (iv) The coefficient of variation is computed by dividing the standard deviation by the mean. Compute a coefficient of variation for $x$, $y$ and $x+y$ and compare the relative safety of investing in precious metal stocks, investing in utilities and doing both.

Solution: Since you were not supposed to do this problem, I am just going to present the answer with the original numbers.

(i) Compute the sample mean and standard deviation of $x$.
  Row   Date      x        y        x^2       y^2       xy
   1    03/05   203.32   496.60   41339.0   246612   100969
   2    03/04   195.62   489.97   38267.2   240071    95848
   3    03/03   202.98   482.76   41200.9   233057    97991
   4    02/29   196.58   477.50   38643.7   228006    93867
   5    02/28   202.84   492.40   41144.1   242458    99878
   6    02/27   197.84   496.04   39140.7   246056    98137
   7    02/26   193.13   504.63   37299.2   254651    97459
   8    02/25   188.12   500.78   35389.1   250781    94207
   9    02/22   189.94   497.54   36077.2   247546    94503
  10    02/21   196.56   491.82   38635.8   241887    96672
  11    02/20   189.56   499.95   35933.0   249950    94771
  12    02/19   186.10   499.85   34633.2   249850    93022
  13    02/15   177.32   500.41   31442.4   250410    88733
  14    02/14   176.87   498.79   31283.0   248791    88221
  15    02/13   179.43   504.05   32195.1   254066    90442
  16    02/12   176.65   502.08   31205.2   252084    88692
  17    02/11   182.06   497.90   33145.8   247904    90648
  18    02/08   181.25   494.39   32851.6   244421    89608
  19    02/07   174.88   496.96   30583.0   246969    86908
  20    02/06   129.98   498.66   16894.8   248662    64816
  Sum          3721.03  9923.08   697304   4924233  1845390

To summarize the results of these computations: $n = 20$, $\sum x = 3721.03$, $\sum y = 9923.08$, $\sum x^2 = 697304$, $\sum y^2 = 4924233$ and $\sum xy = 1845390$. Thus $\bar{x} = \dfrac{\sum x}{n} = \dfrac{3721.03}{20} = 186.0515$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{9923.08}{20} = 496.1540$.

$s_x^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = \dfrac{697304 - 20(186.0515)^2}{19} = \dfrac{5000.787}{19} = 263.199$. Minitab says 263.201. $s_x = \sqrt{263.199} = 16.2234$. Minitab says 16.2235.

Though the mean and variance of $y$ were not requested, we will need them. $s_y^2 = \dfrac{\sum y^2 - n\bar{y}^2}{n-1} = \dfrac{4924233 - 20(496.1540)^2}{19} = \dfrac{857.166}{19} = 45.1140$. Minitab says 45.1342. $s_y = \sqrt{45.1140} = 6.7167$. Minitab says 6.7167.

(ii) Compute $\text{Cov}(x,y)$ or $s_{xy}$ and $\text{Corr}(x,y)$ or $r_{xy}$.

$s_{xy} = \dfrac{\sum (x-\bar{x})(y-\bar{y})}{n-1} = \dfrac{\sum xy - n\bar{x}\bar{y}}{n-1} = \dfrac{1845390 - 20(186.0515)(496.1540)}{19} = \dfrac{-813.9190}{19} = -42.8378$. Minitab says -42.8130.

$r_{xy} = \dfrac{s_{xy}}{s_x s_y} = \dfrac{-42.8378}{\sqrt{263.199}\sqrt{45.1140}} = -\sqrt{\dfrac{(-42.8378)^2}{(263.199)(45.1140)}} = -\sqrt{0.1545} = -.3931$. Minitab says -.3928.

(iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii). We have $\bar{x} = 186.0515$, $\bar{y} = 496.1540$, $s_x^2 = 263.199$, $s_y^2 = 45.1140$ and $s_{xy} = -42.8378$.
So $\overline{x+y} = \bar{x} + \bar{y} = 186.0515 + 496.1540 = 682.2055$ and $s_{x+y}^2 = s_x^2 + s_y^2 + 2s_{xy} = 263.199 + 45.1140 + 2(-42.8378) = 222.6374$.

(iv) The coefficient of variation is computed by dividing the standard deviation by the mean. Compute a coefficient of variation for $x$, $y$ and $x+y$ and compare the relative safety of investing in precious metal stocks, investing in utilities and doing both.

$s_x = 16.2234$, $s_y = 6.7167$ and $s_{x+y} = \sqrt{222.6374} = 14.9210$.

$C_x = \dfrac{s_x}{\bar{x}} = \dfrac{16.2234}{186.0515} = .0872$, $C_y = \dfrac{s_y}{\bar{y}} = \dfrac{6.7167}{496.154} = .0135$ and $C_{x+y} = \dfrac{s_{x+y}}{\bar{x}+\bar{y}} = \dfrac{14.9210}{682.2055} = .0219$.

This seems to show that investing in precious metals is much riskier than either alternative: over 6 times as risky as utilities and about 4 times as risky as a 50-50 strategy of doing both. However, because of the negative covariance, the 50-50 strategy is only about 62% riskier than utilities alone.
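As a cross-check on problem 3, the computational formulas used in (i) through (iv) can be run directly from the five column sums. The following is a minimal Python sketch, not part of the original key; all variable names are illustrative assumptions, and the inputs are the rounded sums from the table, so the results may differ from the Minitab values in the last digits.

```python
import math

# Column sums from the problem 3 table (rounded as printed there).
n = 20
sum_x, sum_y = 3721.03, 9923.08
sum_x2, sum_y2, sum_xy = 697304.0, 4924233.0, 1845390.0

# Sample means.
mean_x = sum_x / n
mean_y = sum_y / n

# Computational formulas for the sample variances and covariance.
var_x = (sum_x2 - n * mean_x ** 2) / (n - 1)
var_y = (sum_y2 - n * mean_y ** 2) / (n - 1)
cov_xy = (sum_xy - n * mean_x * mean_y) / (n - 1)
corr_xy = cov_xy / math.sqrt(var_x * var_y)

# Mean and variance of the sum x + y.
mean_sum = mean_x + mean_y
var_sum = var_x + var_y + 2 * cov_xy

# Coefficient of variation = standard deviation / mean.
cv_x = math.sqrt(var_x) / mean_x
cv_y = math.sqrt(var_y) / mean_y
cv_sum = math.sqrt(var_sum) / mean_sum

print("s_x^2 =", round(var_x, 3), " s_y^2 =", round(var_y, 3))
print("s_xy =", round(cov_xy, 4), " r_xy =", round(corr_xy, 4))
print("C_x =", round(cv_x, 4), " C_y =", round(cv_y, 4),
      " C_x+y =", round(cv_sum, 4))
print("C_x / C_y =", round(cv_x / cv_y, 2),
      " C_x+y / C_y =", round(cv_sum / cv_y, 2))
```

The last line prints the two risk ratios behind the comparison above: metals relative to utilities, and the combined position relative to utilities.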