Graded Assignment 2

251solngr2-081b
6/18/08 (Open this document in 'Page Layout' view!)
Graded Assignment 2
Name: Key
There will be a penalty for papers that are unstapled. Note that from now on neatness means paper
neatly trimmed on the left side if it has been torn, multiple pages stapled, and paper written on only
one side. The stapling is for your protection; putting your name on every page helps too. I still have
some unclaimed pages from an old exam (as well as an old exam with no name on it that the perp will
not admit responsibility for). Note problem 3 at the end to see how risk fits into this.
a)               x
            4     5     6
      1    .10   .05    0
  y   2    .05   .30    0
      3     0     0    .50

and

b)               x
            4     5     6
      1     0    .05   .10
  y   2     0    .30   .05
      3    .50    0     0
1) Use the joint probability tables above.
For these joint probability tables (i) check for independence, (ii) compute $E(x)$ and $\mathrm{Var}(x)$, (iii) compute
$\mathrm{Cov}(x,y)$ or $\sigma_{xy}$ and $\mathrm{Corr}(x,y)$ or $\rho_{xy}$, (iv) compute $E(x+y)$ and $\mathrm{Var}(x+y)$ from the results in (ii)
and (iii), (v) compute $\mathrm{Cov}(3x+3, y)$ and $\mathrm{Corr}(3x+3, y)$ using the formulas in section K4 of 251v2out
or section C1 of 251var2. Note that $y = 1y + 0$.
Solution:
(i) Check for independence: First you need to find $P(x)$ and $P(y)$.
Look at the upper left hand probability below. Its value is a) .10 or b) 0 and it represents
$P(x=4 \cap y=1)$. If $x$ and $y$ are independent, we would have $P(x=4 \cap y=1) = P(x=4)P(y=1)$.
We need to find out what these probabilities are, so we add the rows and columns to get marginal
or total probabilities.
a)               x
            4     5     6    P(y)
      1    .10   .05    0    .15
  y   2    .05   .30    0    .35
      3     0     0    .50   .50
    P(x)   .15   .35   .50  1.00

b)               x
            4     5     6    P(y)
      1     0    .05   .10   .15
  y   2     0    .30   .05   .35
      3    .50    0     0    .50
    P(x)   .50   .35   .15  1.00
Thus we have for a) $P(x=4)P(y=1) = .15(.15) = .0225$. Since this is not equal to
$P(x=4 \cap y=1) = .10$ in a), $x$ and $y$ cannot be independent. Even one place where the joint
probability is not the product of the marginal probabilities is enough to show that $x$ and $y$ are not
independent. For b) we have $P(x=4)P(y=1) = .50(.15) = .0750$. But this is not equal to
$P(x=4 \cap y=1) = 0$. If this one is not enough to convince you, how about, for both a) and b),
$P(x=5 \cap y=2) = .30 \ne P(x=5)P(y=2) = .35(.35) = .1225$. Actually the fastest way to prove
non-independence is to look for zeroes. If $P(x=5 \cap y=3) = 0$ in both a) and b) and $x$ and $y$ are
independent, then it must be true that $P(x=5) = 0$ or $P(y=3) = 0$, and neither is zero here.
Notice that the second row is not proportional to the first row or any other row. This is also
evidence of non-independence. If variables are independent, all rows must be proportional to one
another and all columns must also be proportional to one another.
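The product-of-marginals test can be mechanized. What follows is a minimal Python sketch, not part of the original key; the names `TABLE_A`, `TABLE_B`, `marginals` and `is_independent` are illustrative, and the tables are assumed to be laid out with rows y = 1, 2, 3 and columns x = 4, 5, 6.

```python
# Joint probability tables from problem 1.
# Assumed layout: rows are y = 1, 2, 3; columns are x = 4, 5, 6.
TABLE_A = [[0.10, 0.05, 0.00],
           [0.05, 0.30, 0.00],
           [0.00, 0.00, 0.50]]
TABLE_B = [[0.00, 0.05, 0.10],
           [0.00, 0.30, 0.05],
           [0.50, 0.00, 0.00]]


def marginals(table):
    """Return (P(x), P(y)): column sums and row sums of the joint table."""
    p_y = [sum(row) for row in table]
    p_x = [sum(col) for col in zip(*table)]
    return p_x, p_y


def is_independent(table, tol=1e-9):
    """True only if every cell equals the product of its two marginals."""
    p_x, p_y = marginals(table)
    return all(
        abs(table[i][j] - p_y[i] * p_x[j]) <= tol
        for i in range(len(p_y))
        for j in range(len(p_x))
    )


for name, table in (("a", TABLE_A), ("b", TABLE_B)):
    p_x, p_y = marginals(table)
    print(name, [round(p, 2) for p in p_x], [round(p, 2) for p in p_y],
          "independent" if is_independent(table) else "not independent")
```

A single mismatching cell is enough to make `is_independent` return False, which mirrors the argument in the text.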
A zero covariance or correlation would be the consequence of independence, but it is not true that a zero
correlation or covariance would prove independence. We have already seen one example where there is a
zero correlation, but no independence (Downing and Clark, pg. 219, Computational Problem 3).
Let's finish the job we did in (i) by computing $\sum P(x) = 1$ (a check for a valid distribution),
$\mu_x = E(x) = \sum xP(x)$, $E(x^2) = \sum x^2 P(x)$, $\sum P(y) = 1$, $\mu_y = E(y) = \sum yP(y)$ and
$E(y^2) = \sum y^2 P(y)$.
The easiest way to do this is to multiply the items in the $P(y)$ column by the items in the $y$ column to get
the $yP(y)$ column and then to multiply the items in the $yP(y)$ column by the items in the $y$ column to get
the $y^2P(y)$ column. Then multiply the items in the $P(x)$ row by the items in the $x$ row to get the
$xP(x)$ row and then multiply the items in the $xP(x)$ row by the items in the $x$ row to get the $x^2P(x)$ row.
Then add up all the rows and columns outside the original table.
a)                 x
              4      5      6     P(y)   yP(y)  y^2 P(y)
       1     .10    .05     0     .15    0.15    0.15
  y    2     .05    .30     0     .35    0.70    1.40
       3      0      0     .50    .50    1.50    4.50
     P(x)    .15    .35    .50   1.00    2.35    6.05
    xP(x)   0.60   1.75   3.00   5.35
 x^2 P(x)   2.40   8.75  18.00  29.15

b)                 x
              4      5      6     P(y)   yP(y)  y^2 P(y)
       1      0     .05    .10    .15    0.15    0.15
  y    2      0     .30    .05    .35    0.70    1.40
       3     .50     0      0     .50    1.50    4.50
     P(x)    .50    .35    .15   1.00    2.35    6.05
    xP(x)   2.00   1.75   0.90   4.65
 x^2 P(x)   8.00   8.75   5.40  22.15

To summarize:
a) $\sum P(x) = 1$ (a check), $\mu_x = E(x) = \sum xP(x) = 5.35$, $E(x^2) = \sum x^2P(x) = 29.15$,
$\sum P(y) = 1$, $\mu_y = E(y) = \sum yP(y) = 2.35$ and $E(y^2) = \sum y^2P(y) = 6.05$.
b) $\sum P(x) = 1$ (a check), $\mu_x = E(x) = \sum xP(x) = 4.65$, $E(x^2) = \sum x^2P(x) = 22.15$,
$\sum P(y) = 1$, $\mu_y = E(y) = \sum yP(y) = 2.35$ and $E(y^2) = \sum y^2P(y) = 6.05$.
(ii) Compute $E(x)$ and $\mathrm{Var}(x)$. Remember that variances and standard deviations are never negative.
We actually need means and variances for both $x$ and $y$. From the above:
a) $\mu_x = E(x) = \sum xP(x) = 5.35$, $\sigma_x^2 = \mathrm{Var}(x) = E(x^2) - \mu_x^2 = \sum x^2P(x) - \mu_x^2 = 29.15 - 5.35^2 = 0.5275$
($\sigma_x = \sqrt{0.5275} = 0.7263$), $\mu_y = E(y) = \sum yP(y) = 2.35$ and
$\sigma_y^2 = \mathrm{Var}(y) = E(y^2) - \mu_y^2 = \sum y^2P(y) - \mu_y^2 = 6.05 - 2.35^2 = 0.5275$ ($\sigma_y = \sqrt{0.5275} = 0.7263$);
b) $\mu_x = E(x) = \sum xP(x) = 4.65$, $\sigma_x^2 = \mathrm{Var}(x) = E(x^2) - \mu_x^2 = \sum x^2P(x) - \mu_x^2 = 22.15 - 4.65^2 = 0.5275$
($\sigma_x = \sqrt{0.5275} = 0.7263$), $\mu_y = E(y) = \sum yP(y) = 2.35$ and
$\sigma_y^2 = \mathrm{Var}(y) = E(y^2) - \mu_y^2 = \sum y^2P(y) - \mu_y^2 = 6.05 - 2.35^2 = 0.5275$ ($\sigma_y = \sqrt{0.5275} = 0.7263$).
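The shortcut $\mathrm{Var}(x) = E(x^2) - \mu_x^2$ used in (ii) can be checked numerically. This is a small Python sketch under the assumption that the marginal distributions are the ones found in (i); the function name `mean_and_variance` and the constant names are illustrative, not part of the handout.

```python
from math import sqrt


def mean_and_variance(values, probs):
    """Return (E(v), Var(v)) using Var(v) = E(v^2) - E(v)^2."""
    mean = sum(v * p for v, p in zip(values, probs))
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    return mean, second_moment - mean ** 2


X_VALUES = [4, 5, 6]
Y_VALUES = [1, 2, 3]
P_X_A = [0.15, 0.35, 0.50]   # marginal P(x) for table a)
P_X_B = [0.50, 0.35, 0.15]   # marginal P(x) for table b)
P_Y = [0.15, 0.35, 0.50]     # marginal P(y), the same in both tables

mu_x_a, var_x_a = mean_and_variance(X_VALUES, P_X_A)
mu_x_b, var_x_b = mean_and_variance(X_VALUES, P_X_B)
mu_y, var_y = mean_and_variance(Y_VALUES, P_Y)

for label, mu, var in (("x, table a", mu_x_a, var_x_a),
                       ("x, table b", mu_x_b, var_x_b),
                       ("y, both tables", mu_y, var_y)):
    print(f"{label}: mean={mu:.2f} variance={var:.4f} sd={sqrt(var):.4f}")
```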
(iii) Compute $\mathrm{Cov}(x,y)$ or $\sigma_{xy}$ and $\mathrm{Corr}(x,y)$ or $\rho_{xy}$. a) We must now compute $E(xy)$ by multiplying
each pair of values of $x$ and $y$ by their joint probabilities. We had $\mu_x = 5.35$, $\sigma_x^2 = 0.5275$
($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and

                 x
            4     5     6
      1    .10   .05    0
  y   2    .05   .30    0
      3     0     0    .50

$E(xy) = \sum\sum xyP(x,y)$
  $= .10(4)(1) + .05(5)(1) + 0(6)(1)$
  $+ .05(4)(2) + .30(5)(2) + 0(6)(2)$
  $+ 0(4)(3) + 0(5)(3) + .50(6)(3)$
  $= (0.40 + 0.25 + 0) + (0.40 + 3.00 + 0) + (0 + 0 + 9.00) = 13.05$
To complete what we have done, write $\sigma_{xy} = \mathrm{Cov}(x,y) = E(xy) - \mu_x\mu_y = 13.05 - 5.35(2.35) = 0.4775$.
So that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{0.4775}{\sqrt{0.5275}\sqrt{0.5275}} = \sqrt{0.8194} = .9052$.
b) We must now compute $E(xy)$ by multiplying each pair of values of $x$ and $y$ by their joint probabilities.
We had $\mu_x = 4.65$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and

                 x
            4     5     6
      1     0    .05   .10
  y   2     0    .30   .05
      3    .50    0     0

$E(xy) = \sum\sum xyP(x,y)$
  $= 0(4)(1) + .05(5)(1) + .10(6)(1)$
  $+ 0(4)(2) + .30(5)(2) + .05(6)(2)$
  $+ .50(4)(3) + 0(5)(3) + 0(6)(3)$
  $= (0 + 0.25 + 0.60) + (0 + 3.00 + 0.60) + (6.00 + 0 + 0) = 10.45$
To complete what we have done, write $\sigma_{xy} = \mathrm{Cov}(x,y) = E(xy) - \mu_x\mu_y = 10.45 - 4.65(2.35) = -0.4775$.
So that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{-0.4775}{\sqrt{0.5275}\sqrt{0.5275}} = -\sqrt{0.8194} = -.9052$.
In general, joint probability tables with only the diagonals filled produce correlations close to +1 or -1.
A northwest to southeast diagonal produces a positive correlation and a southwest to northeast diagonal
produces a negative correlation. The two tables here have dominant diagonals: each number in the diagonal
is larger than the other numbers in its row and column, and so the correlations are similar to those of
tables with only the diagonals filled.
Remember that the correlation must be between -1 and +1!
Note that the strength of a correlation is found by squaring the correlation and measuring
the strength on a zero to one scale. In a) we had $\rho_{xy} = .9052$, so $\rho_{xy}^2 = (.9052)^2 = .8194$ and we can say
that there is a relatively strong tendency for $y$ to rise as $x$ rises. In b) we had $\rho_{xy} = -.9052$, so
$\rho_{xy}^2 = (-.9052)^2 = .8194$ and we can say that there is a relatively strong tendency for $y$ to fall as $x$ rises or
for $y$ to rise as $x$ falls.
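The covariance and correlation in (iii) can be reproduced directly from the joint tables. The sketch below is an illustration in Python rather than the method of the handout; `joint_summary` and the table constants are hypothetical names, with rows assumed to be y = 1, 2, 3 and columns x = 4, 5, 6.

```python
from math import sqrt

X_VALUES = [4, 5, 6]
Y_VALUES = [1, 2, 3]
TABLE_A = [[0.10, 0.05, 0.00],
           [0.05, 0.30, 0.00],
           [0.00, 0.00, 0.50]]
TABLE_B = [[0.00, 0.05, 0.10],
           [0.00, 0.30, 0.05],
           [0.50, 0.00, 0.00]]


def joint_summary(table):
    """Return (E(xy), Cov(x, y), Corr(x, y)) for a joint probability table."""
    p_y = [sum(row) for row in table]
    p_x = [sum(col) for col in zip(*table)]
    mu_x = sum(x * p for x, p in zip(X_VALUES, p_x))
    mu_y = sum(y * p for y, p in zip(Y_VALUES, p_y))
    var_x = sum(x * x * p for x, p in zip(X_VALUES, p_x)) - mu_x ** 2
    var_y = sum(y * y * p for y, p in zip(Y_VALUES, p_y)) - mu_y ** 2
    # E(xy): every (x, y) pair weighted by its joint probability.
    e_xy = sum(x * y * table[i][j]
               for i, y in enumerate(Y_VALUES)
               for j, x in enumerate(X_VALUES))
    cov = e_xy - mu_x * mu_y
    return e_xy, cov, cov / sqrt(var_x * var_y)


for name, table in (("a", TABLE_A), ("b", TABLE_B)):
    e_xy, cov, corr = joint_summary(table)
    print(f"{name}) E(xy)={e_xy:.2f} cov={cov:.4f} corr={corr:.4f} corr^2={corr ** 2:.4f}")
```

Squaring the correlation gives the same strength measure for both tables, since the two correlations differ only in sign.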
(iv) Compute $E(x+y)$ and $\mathrm{Var}(x+y)$ from the results in (ii) and (iii). How many of you ignored the
instructions and wrote down each value of $x+y$ with its probability? What a great way to waste time!
The formulas that you were given were $E(x+y) = E(x) + E(y) = \mu_x + \mu_y$ and
$\mathrm{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = \mathrm{Var}(x) + \mathrm{Var}(y) + 2\mathrm{Cov}(x,y)$.
a) We had $\mu_x = 5.35$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and
$\sigma_{xy} = 0.4775$.
$E(x+y) = \mu_x + \mu_y = 5.35 + 2.35 = 7.70$ and $\mathrm{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}$
$= 0.5275 + 0.5275 + 2(0.4775) = 2.0100$ ($\sigma_{x+y} = \sqrt{2.0100} = 1.4177$)
b) We had $\mu_x = 4.65$, $\sigma_x^2 = 0.5275$ ($\sigma_x = 0.7263$), $\mu_y = 2.35$, $\sigma_y^2 = 0.5275$ ($\sigma_y = 0.7263$) and
$\sigma_{xy} = -0.4775$.
$E(x+y) = \mu_x + \mu_y = 4.65 + 2.35 = 7.00$ and $\mathrm{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}$
$= 0.5275 + 0.5275 + 2(-0.4775) = 0.1000$ ($\sigma_{x+y} = \sqrt{0.1000} = 0.3162$)
(v) Compute $\mathrm{Cov}(3x+3, y)$ and $\mathrm{Corr}(3x+3, y)$ using the formulas in section K4 of 251v2out or
section C1 of 251var2. Note that $y = 1y + 0$.
251v2out says $\mathrm{Cov}(ax+b, cy+d) = ac\,\mathrm{Cov}(x,y)$ and $\mathrm{Corr}(ax+b, cy+d) = \mathrm{sign}(ac)\,\mathrm{Corr}(x,y)$, where
$\mathrm{sign}(ac)$ has the value $-1$ or $+1$ depending on whether the product of $a$ and $c$ is negative or
positive. $a = 3$ and $c = 1$. $\mathrm{Cov}(3x+3, 1y+0) = 3(1)\mathrm{Cov}(x,y) = 3\,\mathrm{Cov}(x,y)$
$\mathrm{Corr}(3x+3, 1y+0) = \mathrm{sign}(3(1))\,\mathrm{Corr}(x,y) = \mathrm{sign}(3)\,\mathrm{Corr}(x,y) = (1)\mathrm{Corr}(x,y)$
a) We had $\sigma_{xy} = 0.4775$ and $\rho_{xy} = .9052$. So $\mathrm{Cov}(3x+3, 1y+0) = 3\,\mathrm{Cov}(x,y) = 3(0.4775) = 1.4325$
and $\mathrm{Corr}(3x+3, 1y+0) = (1)\mathrm{Corr}(x,y) = .9052$.
b) We had $\sigma_{xy} = -0.4775$ and $\rho_{xy} = -.9052$. So
$\mathrm{Cov}(3x+3, 1y+0) = 3\,\mathrm{Cov}(x,y) = 3(-0.4775) = -1.4325$ and
$\mathrm{Corr}(3x+3, 1y+0) = (1)\mathrm{Corr}(x,y) = -.9052$.
2) The following data represent the scores of a group of students on a math placement test and their grades
in a math course. (i) Compute the sample mean and variance of $x$, (ii) Compute $\mathrm{Cov}(x,y)$ or $s_{xy}$ and
$\mathrm{Corr}(x,y)$ or $r_{xy}$, (iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii).
(iv) Compute $\mathrm{Cov}(6x+3, y)$ and $\mathrm{Corr}(6x+3, y)$ using the formulas in section K4 of 251v2out or
section C1 of 251var2. Note that $y = 1y + 0$.

Test Score   Grades
    x           y
   51          80
   52          77
   59          87
   45          72
   61          80
   54          84
   56          83
   67          87
   63          92
   53          77
(i) Compute the sample mean and variance of $x$.

Row     x     x^2      y     y^2      xy
  1    51    2601     80    6400    4080
  2    52    2704     77    5929    4004
  3    59    3481     87    7569    5133
  4    45    2025     72    5184    3240
  5    61    3721     80    6400    4880
  6    54    2916     84    7056    4536
  7    56    3136     83    6889    4648
  8    67    4489     87    7569    5829
  9    63    3969     92    8464    5796
 10    53    2809     77    5929    4081
Sum   561   31851    819   67389   46227

To summarize the results of these computations: $n = 10$, $\sum x = 561$, $\sum y = 819$, $\sum x^2 = 31851$,
$\sum y^2 = 67389$ and $\sum xy = 46227$. Thus $\bar{x} = \frac{\sum x}{n} = \frac{561}{10} = 56.1$ and $\bar{y} = \frac{\sum y}{n} = \frac{819}{10} = 81.9$.
$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{31851 - 10(56.1)^2}{9} = \frac{378.9}{9} = 42.1000$. $s_x = \sqrt{42.1000} = 6.4885$
$s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{67389 - 10(81.9)^2}{9} = \frac{312.9}{9} = 34.7667$. $s_y = \sqrt{34.7667} = 5.8963$
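The computational formula $s^2 = (\sum x^2 - n\bar{x}^2)/(n-1)$ used above can be compared against a library routine. The Python sketch below is illustrative; `shortcut_variance` is a hypothetical helper, and `statistics.variance` from the standard library serves as an independent check.

```python
import statistics
from math import sqrt

x = [51, 52, 59, 45, 61, 54, 56, 67, 63, 53]   # placement test scores
y = [80, 77, 87, 72, 80, 84, 83, 87, 92, 77]   # course grades


def shortcut_variance(data):
    """Sample variance via (sum of squares - n * mean^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return (sum(v * v for v in data) - n * mean ** 2) / (n - 1)


for label, data in (("x", x), ("y", y)):
    var = shortcut_variance(data)
    print(f"{label}: mean={sum(data) / len(data):.1f} "
          f"variance={var:.4f} sd={sqrt(var):.4f} "
          f"library variance={statistics.variance(data):.4f}")
```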
(ii) Compute $\mathrm{Cov}(x,y)$ or $s_{xy}$ and $\mathrm{Corr}(x,y)$ or $r_{xy}$.
Recall $\sum xy = 46227$, $\bar{x} = 56.1$ and $\bar{y} = 81.9$.
$s_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y})}{n-1} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{46227 - 10(56.1)(81.9)}{9} = \frac{281.1}{9} = 31.2333$.
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{31.2333}{\sqrt{42.1}\sqrt{34.7667}} = \sqrt{\frac{(31.2333)^2}{42.1(34.7667)}} = \sqrt{0.6665} = .8164$
(iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii).
$\overline{x+y} = \bar{x} + \bar{y} = 56.1 + 81.9 = 138.0$. Recall $s_x^2 = 42.1000$, $s_y^2 = 34.7667$ and $s_{xy} = 31.2333$.
$s_{x+y}^2 = s_x^2 + s_y^2 + 2s_{xy} = 42.1000 + 34.7667 + 2(31.2333) = 139.3333$.
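The sample covariance, correlation and variance of the sum for problem 2 can be recomputed from the raw data as a check on the hand arithmetic. In this Python sketch the helper `sample_cov` is a made-up name; it applies the same computational formula as the text, and the variance of $x+y$ is obtained both from the formula and directly from the ten totals.

```python
from math import sqrt

x = [51, 52, 59, 45, 61, 54, 56, 67, 63, 53]   # placement test scores
y = [80, 77, 87, 72, 80, 84, 83, 87, 92, 77]   # course grades


def sample_cov(u, v):
    """Sample covariance via (sum of products - n * mean_u * mean_v) / (n - 1)."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    return (sum(a * b for a, b in zip(u, v)) - n * u_bar * v_bar) / (n - 1)


s_xx = sample_cov(x, x)          # the variance is the covariance of x with itself
s_yy = sample_cov(y, y)
s_xy = sample_cov(x, y)
r_xy = s_xy / sqrt(s_xx * s_yy)

# Variance of x + y two ways: from the formula and from the totals themselves.
var_sum_formula = s_xx + s_yy + 2 * s_xy
totals = [a + b for a, b in zip(x, y)]
var_sum_direct = sample_cov(totals, totals)

print(f"s_xy={s_xy:.4f} r_xy={r_xy:.4f} r^2={r_xy ** 2:.4f}")
print(f"Var(x+y): formula={var_sum_formula:.4f} direct={var_sum_direct:.4f}")
```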
(iv) Compute $\mathrm{Cov}(6x+3, y)$ and $\mathrm{Corr}(6x+3, y)$ using the formulas in section K4 of 251v2out or
section C1 of 251var2. Note that $y = 1y + 0$.
251v2out says $\mathrm{Cov}(ax+b, cy+d) = ac\,\mathrm{Cov}(x,y)$
and $\mathrm{Corr}(ax+b, cy+d) = \mathrm{sign}(ac)\,\mathrm{Corr}(x,y)$, where $\mathrm{sign}(ac)$ has the value $-1$ or $+1$ depending
on whether the product of $a$ and $c$ is negative or positive. $a = 6$ and $c = 1$.
$s_{xy} = \mathrm{Cov}(x,y) = 31.2333$ and $r_{xy} = \mathrm{Corr}(x,y) = .8164$
$\mathrm{Cov}(6x+3, 1y+0) = 6(1)\mathrm{Cov}(x,y) = 6(31.2333) = 187.3998$
$\mathrm{Corr}(6x+3, 1y+0) = \mathrm{sign}(6(1))\,\mathrm{Corr}(x,y) = 1(0.8164) = .8164$.
3) The PHLX Gold/Silver Sector (XAU) is a capitalization-weighted index composed of 16 companies
involved in the gold and silver mining industry. XAU was set to an initial value of 100 in January 1979;
options commenced trading on December 19, 1983. The Dow-Jones Utility average is an average based on
the prices of 16 (I think) utility stocks. Both gold and utilities can attract cautious investors under certain
stock market conditions, so it is interesting to look at how they move relative to one another. The values of
these two indices for 20 very recent trading days are given below.
                      x         y
Row  pick  Date    PHLXGS     DJUT
  1   *    03/05   203.32    496.60
  2   *    03/04   195.62    489.97
  3   *    03/03   202.98    482.76
  4   *    02/29   196.58    477.50
  5   *    02/28   202.84    492.40
  6   *    02/27   197.84    496.04
  7   *    02/26   193.13    504.63
  8   *    02/25   188.12    500.78
  9   *    02/22   189.94    497.54
 10   *    02/21   196.56    491.82
 11   0    02/20   189.56    499.95
 12   1    02/19   186.10    499.85
 13   2    02/15   177.32    500.41
 14   3    02/14   176.87    498.79
 15   4    02/13   179.43    504.05
 16   5    02/12   176.65    502.08
 17   6    02/11   182.06    497.90
 18   7    02/08   181.25    494.39
 19   8    02/07   174.88    496.96
 20   9    02/06   129.98    498.66

You are expected to work with 11 of the 20 observations shown above. Use the first 10 rows
of data and pick one more row by finding the row marked with the second-to-last digit of your
student number. (i) Compute the sample mean and standard deviation of $x$, (ii) Compute
$\mathrm{Cov}(x,y)$ or $s_{xy}$ and $\mathrm{Corr}(x,y)$ or $r_{xy}$, (iii) Compute the sample mean and variance of
$x+y$ from the results in (i) and (ii). (iv) The coefficient of variation is computed by dividing
the standard deviation by the mean. Compute a coefficient of variation for $x$, $y$ and $x+y$ and
compare the relative safety of investing in precious metal stocks, investing in utilities and
doing both.
Solution: Since you were not supposed to do this problem, I am just going to present the answer with the
original numbers.
(i) Compute the sample mean and standard deviation of $x$.

Row  Date       x         y        x^2       y^2        xy
  1  03/05   203.32    496.60   41339.0    246612    100969
  2  03/04   195.62    489.97   38267.2    240071     95848
  3  03/03   202.98    482.76   41200.9    233057     97991
  4  02/29   196.58    477.50   38643.7    228006     93867
  5  02/28   202.84    492.40   41144.1    242458     99878
  6  02/27   197.84    496.04   39140.7    246056     98137
  7  02/26   193.13    504.63   37299.2    254651     97459
  8  02/25   188.12    500.78   35389.1    250781     94207
  9  02/22   189.94    497.54   36077.2    247546     94503
 10  02/21   196.56    491.82   38635.8    241887     96672
 11  02/20   189.56    499.95   35933.0    249950     94771
 12  02/19   186.10    499.85   34633.2    249850     93022
 13  02/15   177.32    500.41   31442.4    250410     88733
 14  02/14   176.87    498.79   31283.0    248791     88221
 15  02/13   179.43    504.05   32195.1    254066     90442
 16  02/12   176.65    502.08   31205.2    252084     88692
 17  02/11   182.06    497.90   33145.8    247904     90648
 18  02/08   181.25    494.39   32851.6    244421     89608
 19  02/07   174.88    496.96   30583.0    246969     86908
 20  02/06   129.98    498.66   16894.8    248662     64816
Sum         3721.03   9923.08    697304   4924233   1845390
To summarize the results of these computations: $n = 20$, $\sum x = 3721.03$, $\sum y = 9923.08$,
$\sum x^2 = 697304$, $\sum y^2 = 4924233$ and $\sum xy = 1845390$. Thus $\bar{x} = \frac{\sum x}{n} = \frac{3721.03}{20} = 186.0515$ and
$\bar{y} = \frac{\sum y}{n} = \frac{9923.08}{20} = 496.1540$.
$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{697304 - 20(186.0515)^2}{19} = \frac{5000.787}{19} = 263.199$.
Minitab says 263.201. $s_x = \sqrt{263.199} = 16.2234$ Minitab says 16.2235.
Though the mean and variance of $y$ were not requested, we will need them.
$s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{4924233 - 20(496.1540)^2}{19} = \frac{857.166}{19} = 45.1140$ Minitab says 45.1342.
$s_y = \sqrt{45.1140} = 6.7167$ Minitab says 6.7167.
(ii) Compute $\mathrm{Cov}(x,y)$ or $s_{xy}$ and $\mathrm{Corr}(x,y)$ or $r_{xy}$.
$s_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y})}{n-1} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{1845390 - 20(186.0515)(496.1540)}{19} = \frac{-813.9190}{19} = -42.8378$.
Minitab says -42.8130.
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-42.8378}{\sqrt{263.199}\sqrt{45.1140}} = -\sqrt{\frac{(42.8378)^2}{263.199(45.1140)}} = -\sqrt{0.1545} = -.3931$. Minitab says -.3928.
(iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii).
We have $\bar{x} = 186.0515$, $\bar{y} = 496.1540$, $s_x^2 = 263.199$, $s_y^2 = 45.1140$ and $s_{xy} = -42.8378$.
So $\overline{x+y} = \bar{x} + \bar{y} = 186.0515 + 496.1540 = 682.2055$ and $s_{x+y}^2 = s_x^2 + s_y^2 + 2s_{xy}$
$= 263.199 + 45.1140 + 2(-42.8378) = 222.6374$.
(iv) The coefficient of variation is computed by dividing the standard deviation by the mean. Compute a
coefficient of variation for $x$, $y$ and $x+y$ and compare the relative safety of investing in precious metal
stocks, investing in utilities and doing both.
$s_x = 16.2234$, $s_y = 6.7167$ and $s_{x+y} = \sqrt{222.6374} = 14.9210$.
$C_x = \frac{s_x}{\bar{x}} = \frac{16.2234}{186.0515} = .0872$, $C_y = \frac{s_y}{\bar{y}} = \frac{6.7167}{496.154} = .0135$ and
$C_{x+y} = \frac{s_{x+y}}{\overline{x+y}} = \frac{14.9210}{682.2055} = .0219$. This
seems to show that investing in precious metals is much riskier than either alternative: over 6 times as
risky as utilities and about 4 times as risky as a 50-50 strategy of doing both. However, because of the
negative covariance, the 50-50 strategy is only about 62% riskier than utilities alone.
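The risk comparison in (iv) is a matter of three divisions and two ratios. The short Python sketch below works from the means and standard deviations reported in the text (the hand-computed values, not the Minitab ones); the dictionary layout and names are illustrative.

```python
# (mean, standard deviation) pairs as reported in parts (i) through (iii).
stats = {
    "x (precious metals)": (186.0515, 16.2234),
    "y (utilities)": (496.1540, 6.7167),
    "x + y (both)": (682.2055, 14.9210),
}

# Coefficient of variation: standard deviation divided by the mean.
cv = {name: sd / mean for name, (mean, sd) in stats.items()}

for name, value in cv.items():
    print(f"{name}: C = {value:.4f}")

metals_vs_utilities = cv["x (precious metals)"] / cv["y (utilities)"]
metals_vs_both = cv["x (precious metals)"] / cv["x + y (both)"]
both_vs_utilities = cv["x + y (both)"] / cv["y (utilities)"]
print(f"metals vs utilities: {metals_vs_utilities:.2f} times")
print(f"metals vs both: {metals_vs_both:.2f} times")
print(f"both vs utilities: {both_vs_utilities:.2f} times")
```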