   

251v2out 4/19/2006 (Open this document in 'Outline' view!) K. Two Random Variables. 1. Regression (Summary). 2. Covariance (  xy and s xy ) a. Population Covariance The population covariance is defined, using probability, as Cov( x, y )   xy  E x   x  y   y  E xy    x  y . This can be used to describe the relationship    between x and y . If the covariance is positive we can say that x and y tend to move together, while if it is negative we can say that they tend to move in opposite directions. In order to use this formula we must realize that E xy    xyPx, y  . This means that we must add together the product of x and y , together with their joint probability, for each possible pair of values of x and y . For example, assume that x x and y are related by the following joint probability table: 400 y 600 800      400 .12 600 .15 .10 .05 .16 .07 800 . .18   .08  .09   We begin by taking the upper left hand probability, .12, which is the probability that both x and y are 400, and multiplying it by 400 twice. Then we take the next probability in the same row, .15, which is the probability that x is 600 and y is 400, and multiply it by both 600 and 400. If we continue in this way we get  .12 400 400  .15600 400  .18800 400  E xy   xyPxy    .10 400 600  .05600 600  .08800 600    .16 400 800  .07 600 800  .09 800 800    19200 36000 57600    24000 18000 38400   335600 .  51200 33600 57600  We can now use the following tableau to compute the means and variances of x and y . x 400 600 800 P y  yP y  y 2 P y   .12 400 .15 .18  .45 180 72000   y 600 .05 .08  .23 138 82800  .10  .16 800 .07 .09  .32 256 204800   Px  .38 .27 .35 1.00 574 359600 xPx  152  162  280  594 2 x Px  60800  97200  224000  382000  Px   1 (a check),   E x    xPx   594 , E x    x  P y   1 ,   E y    yP y   574 and E y    y P y   359600 2 To summarize x 2 2 Px   382000 , 2 y We will need the variances below. To complete what we have done, write  xy  Covxy  Exy   x  y  335600 594574  5356 b. The Sample Covariance The sample covariance is much easier to compute, the formula being s xy   x  x  y  y    xy  nx y . n 1 n 1 For example, assume that we have data on income ( x ) and savings ( y )(in thousands) for 5 families. x y x2 y2 xy 1 2 3 4 5 1.9 12.4 6.4 7.0 7.0 0.0 0.9 0.4 1.2 0.3 3.61 153.76 40.96 49.00 49.00 0.00 0.81 0.16 1.44 0.09 0.00 11.16 2.56 8.40 2.10 Sum 34.7 2.8 296.33 2.50 24.22 Family Then x  s x2 34 .7 2 .8  6.94 and y   0.56 . 5 5 x  2  nx 2 n 1 y   296 .33  56.94 2  13 .878 , 4 2.50  50.56 2  0.2330 and since n 1 4 24.22  56.94 0.56  xy  24 .22 , s xy   1.197 . 5 1 The positive sign of s xy , the sample covariance, indicates that x and y tend to move together. s 2y 2  ny 2   2 3. The Correlation Coefficient (  xy and rxy ) The size of a covariance is relatively meaningless; to judge the strength of the relationship between x and y we need to compute the correlation, which is found by dividing the covariance by the standard deviations of x and y. a. Population Correlation. For the population covariance, recall from above that    x2  E x 2   x2  382000  594 2  29164 and    y2  E y 2   y2  359600  5742  30124 . So that  xy   xy  x y   5356   5356  0.181 . 170 .77 173 .56  29164 30124 The correlation must always be between positive and negative 1  1.0    1.0 . A correlation close to zero is called weak. A correlation that is close to one in absolute value is called strong. (Actually statisticians prefer to look at the value of the correlation squared.) A strong positive correlation indicates that x and y have a relationship that is close to a straight line with a positive slope. A strong negative correlation means that the relationship approximates a straight line with a negative slope. Unfortunately, the correlation only indicates linear relationships; a nonlinear relationship that is obvious on a graph may give a zero correlation. b. Sample Correlation. Recall that s xy  1.197 , s x2  13 .878 , and s 2y  0.2330 . If we divide the correlation by the two standard deviations, we find that rxy  s xy sx s y  1.197  0.6657 . 13 .878 0.2330 4. Functions of Two Random Variables. Cov(ax  b, cy  d )  acCov( x, y) and if w  ax  b and v  cy  d ,  wv  signac xy or Corr (ax  b, cy  d )  (sign(ac))Corr ( x, y) , where signac has the value 1 or 1 depending on whether the product of a and c is negative or positive. 5. Sums of Random Variables. a. Ex  y   Ex  E y  and Var x  y    x2   y2  2 xy  Var x   Var y   2Covx, y  b. Independence. 3 (i) Definition. Px, y   Px P y  (ii) Consequences If x and y are independent, E xy   E x E  y  , Covx, y   0 ,  xy  0 and Varx  y   Varx   Var y  . c. If a, c and d are constants, Var(ax  cy)  a 2Var( x)  c 2Var( y)  2acCov( x, y) . This and a. imply that Eax  cy  d   aEx   cE y   d and Var(ax  cy  d )  a 2Var( x)  c 2Var( y)  2acCov( x, y) d. Application to portfolio analysis If R  P1 R1  P2 R2 and P1  P2  1 , then E R   P1 E R1   P2 E R2  and VarR  P12VarR1   P22VarR2   2P1 P2CovR1 , R2  . Variance is usually considered a measure of risk, though actually, the best measure of risk is probably the coefficient of variation, the standard deviation  divided by the mean, in this case C  R . E R  The remainder of this material can be found in the Supplement in the document 251var2. You can get a slightly expanded version of this at 251varmin . 4

   

Related documents

Products

Support

   

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib