R², CORRELATION, PREDICTION

Prediction:

The fitted regression equation is

$$\hat{Y} = b_0 + b_1 X = \bar{Y} + b_1 (X - \bar{X}).$$

Based on the fitted equation, the predicted value at a specified value $X_0$ is

$$\hat{Y}_0 = \bar{Y} + b_1 (X_0 - \bar{X}).$$

Then

$$E(\hat{Y}_0) = E\big(\bar{Y} + b_1 (X_0 - \bar{X})\big) = E(\bar{Y}) + (X_0 - \bar{X})\,E(b_1) = \beta_0 + \beta_1 \bar{X} + (X_0 - \bar{X})\,\beta_1 = \beta_0 + \beta_1 X_0 = \text{the mean of } Y \text{ at } X_0,$$

and

$$\operatorname{Var}(\hat{Y}_0) = \operatorname{Var}\big(\bar{Y} + b_1 (X_0 - \bar{X})\big) = \operatorname{Var}(\bar{Y}) + \operatorname{Var}\big(b_1 (X_0 - \bar{X})\big) \quad (\text{since } \operatorname{Cov}(b_1, \bar{Y}) = 0)$$
$$= \frac{\sigma^2}{n} + (X_0 - \bar{X})^2 \operatorname{Var}(b_1) = \frac{\sigma^2}{n} + (X_0 - \bar{X})^2 \frac{\sigma^2}{S_{XX}} = \sigma^2 \left[ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}} \right].$$

Therefore,

$$\text{s.e.}(\hat{Y}_0) = s \left[ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}} \right]^{1/2}.$$

Note: $\operatorname{Var}(\hat{Y}_0)$, and hence $\text{s.e.}(\hat{Y}_0)$, achieves its minimum at $X_0 = \bar{X}$. That is, we make our best "prediction" in the "middle" of the observed range of $X$ (at $\bar{X}$). As $X_0$ moves far away from $\bar{X}$, the prediction becomes less accurate (since the standard error grows).

Thus, a $(1-\alpha)100\%$ confidence interval for $E(\hat{Y}_0) = \beta_0 + \beta_1 X_0$ is

$$\hat{Y}_0 \pm t_{n-2,\,1-\alpha/2}\;\text{s.e.}(\hat{Y}_0), \quad \text{i.e.,} \quad \big[\hat{Y}_0 - t_{n-2,\,1-\alpha/2}\;\text{s.e.}(\hat{Y}_0),\; \hat{Y}_0 + t_{n-2,\,1-\alpha/2}\;\text{s.e.}(\hat{Y}_0)\big].$$

R², Correlation, and Regression:

(a) R²:

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$$

is the ratio of the regression sum of squares to the total sum of squares (corrected). $R^2$ is also the ratio of (the distance between model 2 and model 1) to (the distance between the data (model 0) and model 1), since the total sum of squares $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ is the sum of the regression sum of squares $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ and the residual sum of squares $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$. A large $R^2$ implies that the proportion of the total sum of squares contributed by the regression sum of squares is large. For example, if $R^2 = 0.9$, then 90% of the total sum of squares comes from the regression sum of squares. Heuristically, that indicates 90% of the variation in the $Y_i$ can be explained by the $\hat{Y}_i$. That is, model 2 can fit the data well.

In addition, a large $R^2$ also implies that the regression sum of squares is large relative to the residual sum of squares. In the above example, the regression sum of squares is 9 times larger than the residual sum of squares, since the residual sum of squares contributes only 10% of the total sum of squares (corrected). That is, the distance between model 2 and model 1 is large relative to the variation of the data. As explained in the previous section, this might imply that the slope in the regression is significant. Thus, model 2 might be sensible. $R^2$ is usually recommended as a "useful first thing to look at" in a regression printout.

(b) Correlation:

The correlation coefficient between the covariate X and the response Y is

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2\right]^{1/2}} = \frac{S_{XY}}{S_{XX}^{1/2} S_{YY}^{1/2}}, \qquad -1 \le r_{XY} \le 1.$$

If $Y_i = a X_i + b$ exactly, then $r_{XY} = 1$ (for $a > 0$) or $r_{XY} = -1$ (for $a < 0$). That is, $|r_{XY}| = 1$ implies an exact linear relationship between X and Y.

The correlation coefficient is also associated with the regression coefficient $b_1$:

$$b_1 = \frac{S_{XY}}{S_{XX}} = \frac{S_{YY}^{1/2}}{S_{XX}^{1/2}} \cdot \frac{S_{XY}}{S_{XX}^{1/2} S_{YY}^{1/2}} = \left(\frac{S_{YY}}{S_{XX}}\right)^{1/2} r_{XY}.$$

Therefore:
As $b_1 > 0 \iff r_{XY} > 0$: a positive linear relation.
As $b_1 < 0 \iff r_{XY} < 0$: a negative linear relation.
As $b_1 = 0 \iff r_{XY} = 0$: no significant linear relation between X and Y.

Note: $r_{XY}$ measures the linear association between X and Y, while $b_1$ measures the size of the change in Y due to a unit change in X. $r_{XY}$ is unit-free and scale-free; a scale change in the data will affect $b_1$ but not $r_{XY}$.

Note: the value of a correlation $r_{XY}$ shows only the extent to which X and Y are linearly associated. It does not by itself imply that any sort of causal relationship exists between X and Y. Such a false assumption has led to erroneous conclusions on many occasions.
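As a concrete check of the formulas above, here is a minimal Python sketch (the data, the point $X_0$, and variable names such as Sxx are made up for illustration; numpy and scipy are assumed to be available). It fits the least-squares line, forms $\text{s.e.}(\hat{Y}_0)$, builds the 95% confidence interval for the mean of Y at $X_0$, and numerically verifies $b_1 = (S_{YY}/S_{XX})^{1/2}\, r_{XY}$.

```python
import numpy as np
from scipy import stats

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 9.9, 12.3])

n = len(X)
Xbar, Ybar = X.mean(), Y.mean()
Sxx = np.sum((X - Xbar) ** 2)
Sxy = np.sum((X - Xbar) * (Y - Ybar))
Syy = np.sum((Y - Ybar) ** 2)

# Least-squares estimates.
b1 = Sxy / Sxx
b0 = Ybar - b1 * Xbar

# Residual standard deviation s, on n - 2 degrees of freedom.
resid = Y - (b0 + b1 * X)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))

# Predicted mean at X0, its standard error, and a 95% CI for E(Y at X0).
X0 = 3.5
Y0_hat = Ybar + b1 * (X0 - Xbar)
se_Y0 = s * np.sqrt(1.0 / n + (X0 - Xbar) ** 2 / Sxx)
t = stats.t.ppf(0.975, df=n - 2)
print(f"95% CI for E(Y0): [{Y0_hat - t * se_Y0:.3f}, {Y0_hat + t * se_Y0:.3f}]")

# b1 and r_XY share the same sign: b1 = sqrt(Syy/Sxx) * r_XY.
r_xy = Sxy / np.sqrt(Sxx * Syy)
print(np.isclose(b1, np.sqrt(Syy / Sxx) * r_xy))  # True
```

Note how the half-width of the interval grows with $(X_0 - \bar{X})^2$: moving X0 away from Xbar in the sketch widens the printed interval, matching the extrapolation warning above.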
Note that $r_{XY}$ is also associated with $R^2$, since

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = \frac{b_1^2 S_{XX}}{S_{YY}} = \left(\frac{S_{XY}}{S_{XX}}\right)^2 \frac{S_{XX}}{S_{YY}} = \frac{S_{XY}^2}{S_{XX} S_{YY}},$$

and

$$r_{XY} = \frac{S_{XY}}{S_{XX}^{1/2} S_{YY}^{1/2}} = (\text{sign of } b_1)\,(R^2)^{1/2} = (\text{sign of } b_1)\, R,$$

where $R = (R^2)^{1/2}$ and $r_{XY}$ has the same sign as $b_1$. The above equation indicates that a large $R^2$ implies a strong correlation between the response and the covariate.

Note: $r_{XY} = (\text{sign of } b_1)\, R$ only holds for the simple linear regression $Y = \beta_0 + \beta_1 X + \varepsilon$.

The correlation between the response Y and the fitted value $\hat{Y}$ is

$$r_{Y\hat{Y}} = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(\hat{Y}_i - \bar{\hat{Y}})}{\left[\sum_{i=1}^{n} (Y_i - \bar{Y})^2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{\hat{Y}})^2\right]^{1/2}} = R, \quad \text{where } \bar{\hat{Y}} = \frac{\sum_{i=1}^{n} \hat{Y}_i}{n}.$$

The derivation of $r_{Y\hat{Y}}$: Since

$$\bar{\hat{Y}} = \frac{\sum_{i=1}^{n} \hat{Y}_i}{n} = \frac{\sum_{i=1}^{n} (b_0 + b_1 X_i)}{n} = b_0 + b_1 \bar{X},$$

$$\sum_{i=1}^{n} (\hat{Y}_i - \bar{\hat{Y}})^2 = \sum_{i=1}^{n} \big[b_0 + b_1 X_i - (b_0 + b_1 \bar{X})\big]^2 = b_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2 = b_1^2 S_{XX},$$

and

$$\sum_{i=1}^{n} (Y_i - \bar{Y})(\hat{Y}_i - \bar{\hat{Y}}) = \sum_{i=1}^{n} (Y_i - \bar{Y})\big[b_0 + b_1 X_i - (b_0 + b_1 \bar{X})\big] = \sum_{i=1}^{n} (Y_i - \bar{Y})\, b_1 (X_i - \bar{X}) = b_1 S_{XY},$$

thus

$$r_{Y\hat{Y}} = \frac{b_1 S_{XY}}{S_{YY}^{1/2} (b_1^2 S_{XX})^{1/2}} = \frac{b_1}{|b_1|} \cdot \frac{S_{XY}}{S_{XX}^{1/2} S_{YY}^{1/2}} = (\text{sign of } b_1)\, r_{XY} = (\text{sign of } b_1)^2\, R = R.$$

The equation $r_{Y\hat{Y}} = R$ implies that a large value of $R^2$ also implies a significantly positive linear relation between the observations $Y_i$ and the predicted values $\hat{Y}_i$. In other words, the prediction of $Y_i$ is not unrelated to $Y_i$.

Note: $r_{Y\hat{Y}} = R$ holds not only for the simple linear regression $Y = \beta_0 + \beta_1 X + \varepsilon$, but also for multiple linear regression!
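The two identities $r_{XY} = (\text{sign of } b_1)\, R$ and $r_{Y\hat{Y}} = R$ can be checked numerically. Below is a minimal Python sketch (assuming numpy; the data are hypothetical and chosen with a negative slope, so the sign behavior is visible: $r_{XY}$ comes out negative while $r_{Y\hat{Y}}$ stays positive).

```python
import numpy as np

# Hypothetical data with a clearly negative slope.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([9.8, 8.1, 5.9, 4.2, 1.9])

Xbar, Ybar = X.mean(), Y.mean()
Sxx = np.sum((X - Xbar) ** 2)
Sxy = np.sum((X - Xbar) * (Y - Ybar))
Syy = np.sum((Y - Ybar) ** 2)

# Fitted line and fitted values.
b1 = Sxy / Sxx
b0 = Ybar - b1 * Xbar
Yhat = b0 + b1 * X

# R^2 = (regression sum of squares) / (total sum of squares), R = sqrt(R^2).
R2 = np.sum((Yhat - Ybar) ** 2) / Syy
R = np.sqrt(R2)

# r_XY = (sign of b1) * R, so it is negative here ...
r_xy = Sxy / np.sqrt(Sxx * Syy)
print(np.isclose(r_xy, np.sign(b1) * R))  # True

# ... while r_{Y,Yhat} = R regardless of the sign of b1.
r_yyhat = np.corrcoef(Y, Yhat)[0, 1]
print(np.isclose(r_yyhat, R))             # True
```

Replacing Yhat with the fitted values from a multiple regression leaves the last check unchanged, in line with the final note above; the sign identity for $r_{XY}$, by contrast, is specific to the simple linear model.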