4. Linear Regression

Let $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$. Denote

$$S(\beta) = (Y - X\beta)^t (Y - X\beta).$$

In linear algebra,

$$X\beta = \begin{bmatrix} 1 & X_{11} & \cdots & X_{1,p-1} \\ 1 & X_{21} & \cdots & X_{2,p-1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{n,p-1} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}$$

is a linear combination of the column vectors of $X$; that is, $X\beta \in R(X)$, the column space of $X$. Then $S(\beta) = \|Y - X\beta\|^2$ is the squared distance between $Y$ and $X\beta$.

Intuitively, the columns of $X$, i.e. $\mathbf{1}, X_1, X_2, \ldots, X_{p-1}$, are the information provided by the covariates to interpret the response $Y$. The least squares method finds the estimate $b$ such that the distance between $Y$ and $Xb$ is smaller than the distance between $Y$ and any other linear combination of the column vectors of $X$ (for example, $X\beta$). Thus $Xb$ is the linear combination of the column vectors of $X$ which interprets $Y$ most accurately.

Further,

$$S(\beta) = (Y - X\beta)^t (Y - X\beta) = \left[(Y - Xb) + (Xb - X\beta)\right]^t \left[(Y - Xb) + (Xb - X\beta)\right]$$
$$= (Y - Xb)^t (Y - Xb) + 2 (Xb - X\beta)^t (Y - Xb) + (Xb - X\beta)^t (Xb - X\beta)$$
$$= \|Y - Xb\|^2 + \|Xb - X\beta\|^2 + 2 (b - \beta)^t X^t (Y - Xb).$$

If we choose the estimate $b$ of $\beta$ such that $Y - Xb$ is orthogonal to every vector in $R(X)$, then $X^t (Y - Xb) = 0$. Thus,

$$S(\beta) = \|Y - Xb\|^2 + \|Xb - X\beta\|^2 \ge \|Y - Xb\|^2 = S(b).$$

That is, if we choose $b$ satisfying $X^t (Y - Xb) = 0$, then $S(b) \le S(\beta)$ for any other estimate $\beta$, so $b$ is the least squares estimate. Therefore,

$$X^t (Y - Xb) = 0 \;\Longrightarrow\; X^t Y = X^t X b \;\Longrightarrow\; b = (X^t X)^{-1} X^t Y.$$

Since $\hat{Y} = Xb = X (X^t X)^{-1} X^t Y = P Y$, the matrix $P = X (X^t X)^{-1} X^t$ is called the projection matrix or hat matrix: $P$ projects the response vector $Y$ onto the space spanned by the covariate vectors. The vector of residuals is

$$e = Y - \hat{Y} = Y - Xb = Y - PY = (I - P) Y.$$

We have the following two important theorems.

Theorem:
1. $P$ and $I - P$ are idempotent.
2. $\operatorname{rank}(I - P) = \operatorname{tr}(I - P) = n - p$.
3. $(I - P) X = 0$.
4. $E(\text{mean residual sum of squares}) = E(s^2) = E\!\left[\dfrac{(Y - \hat{Y})^t (Y - \hat{Y})}{n - p}\right] = \sigma^2$.

[proof:]

1. $P P = X (X^t X)^{-1} X^t X (X^t X)^{-1} X^t = X (X^t X)^{-1} X^t = P$, and $(I - P)(I - P) = I - P - P + P P = I - P$.

2. Since $P$ is idempotent, $\operatorname{rank}(P) = \operatorname{tr}(P)$. Thus

$$\operatorname{rank}(P) = \operatorname{tr}(P) = \operatorname{tr}\!\left(X (X^t X)^{-1} X^t\right) = \operatorname{tr}\!\left((X^t X)^{-1} X^t X\right) = \operatorname{tr}(I_p) = p.$$

Similarly, using $\operatorname{tr}(A - B) = \operatorname{tr}(A) - \operatorname{tr}(B)$,

$$\operatorname{rank}(I - P) = \operatorname{tr}(I - P) = \operatorname{tr}(I) - \operatorname{tr}(P) = n - p.$$

3. $(I - P) X = X - P X = X - X (X^t X)^{-1} X^t X = X - X = 0$.

4. Since $I - P$ is symmetric and idempotent,

$$RSS(\text{model } p) = e^t e = (Y - \hat{Y})^t (Y - \hat{Y}) = (Y - PY)^t (Y - PY) = Y^t (I - P)^t (I - P) Y = Y^t (I - P) Y.$$

Thus, using the identity $E(Z^t A Z) = \operatorname{tr}(A\, V(Z)) + E(Z)^t A\, E(Z)$,

$$E[RSS(\text{model } p)] = E[Y^t (I - P) Y] = \operatorname{tr}[(I - P) V(Y)] + (X\beta)^t (I - P)(X\beta) = \operatorname{tr}[(I - P)\, \sigma^2 I] + 0 = \sigma^2 \operatorname{tr}(I - P) = (n - p)\, \sigma^2,$$

where $(X\beta)^t (I - P)(X\beta) = 0$ because $(I - P) X = 0$. Therefore,

$$E(\text{mean residual sum of squares}) = E\!\left[\frac{RSS(\text{model } p)}{n - p}\right] = \sigma^2.$$

Theorem: If $Y \sim N(X\beta, \sigma^2 I)$, where $X$ is an $n \times p$ matrix of rank $p$, then
1. $b \sim N\!\left(\beta, \sigma^2 (X^t X)^{-1}\right)$.
2. $\dfrac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} \sim \chi^2_p$.
3. $\dfrac{RSS(\text{model } p)}{\sigma^2} = \dfrac{(n - p) s^2}{\sigma^2} \sim \chi^2_{n - p}$.
4. $\dfrac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2}$ is independent of $\dfrac{RSS(\text{model } p)}{\sigma^2} = \dfrac{(n - p) s^2}{\sigma^2}$.

[proof:]

1. Since for a normal random vector $Z \sim N(\mu, \Sigma)$ we have $C Z \sim N(C \mu, C \Sigma C^t)$, for $Y \sim N(X\beta, \sigma^2 I)$,

$$b = (X^t X)^{-1} X^t Y \sim N\!\left((X^t X)^{-1} X^t X \beta,\; (X^t X)^{-1} X^t\, \sigma^2 I\, X (X^t X)^{-1}\right) = N\!\left(\beta,\; \sigma^2 (X^t X)^{-1}\right).$$

2. By part 1, $Z = \dfrac{(X^t X)^{1/2} (b - \beta)}{\sigma} \sim N(0, I_p)$. Thus,

$$\frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = Z^t Z \sim \chi^2_p.$$

3. $(I - P)(I - P) = I - P$ and $\operatorname{rank}(I - P) = n - p$, and for an idempotent matrix $A$ ($A^2 = A$, $\operatorname{rank}(A) = r$) and $Z \sim N(\mu, \sigma^2 I)$,

$$\frac{(Z - \mu)^t A (Z - \mu)}{\sigma^2} \sim \chi^2_r.$$

Since $(I - P) X = 0$, we have $Y^t (I - P) X\beta = 0$ and $(X\beta)^t (I - P)(X\beta) = 0$, so $(Y - X\beta)^t (I - P)(Y - X\beta) = Y^t (I - P) Y$. Hence

$$\frac{RSS(\text{model } p)}{\sigma^2} = \frac{(n - p) s^2}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2} = \frac{(Y - X\beta)^t (I - P)(Y - X\beta)}{\sigma^2} \sim \chi^2_{n - p}.$$

4. Let

$$Q_1 = \frac{(Y - X\beta)^t (Y - X\beta)}{\sigma^2} = \frac{\left[(Y - Xb) + (Xb - X\beta)\right]^t \left[(Y - Xb) + (Xb - X\beta)\right]}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2} + \frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = Q_2 + (Q_1 - Q_2),$$

where the cross term vanishes because $X^t (Y - Xb) = 0$,

$$Q_2 = \frac{(Y - Xb)^t (Y - Xb)}{\sigma^2} = \frac{(Y - PY)^t (Y - PY)}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2},$$

and

$$Q_1 - Q_2 = \frac{(Xb - X\beta)^t (Xb - X\beta)}{\sigma^2} = \frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = \frac{\|Xb - X\beta\|^2}{\sigma^2} \ge 0.$$

Since $Z = (Y - X\beta)/\sigma \sim N(0, I)$ and $Z^t I Z = Z^t Z$,

$$Q_1 = \frac{(Y - X\beta)^t (Y - X\beta)}{\sigma^2} = \frac{(Y - X\beta)^t\, I\, (Y - X\beta)}{\sigma^2} \sim \chi^2_n,$$

and by the previous result,

$$Q_2 = \frac{(Y - Xb)^t (Y - Xb)}{\sigma^2} = \frac{RSS(\text{model } p)}{\sigma^2} = \frac{(Y - X\beta)^t (I - P)(Y - X\beta)}{\sigma^2} \sim \chi^2_{n - p}.$$

Therefore, $Q_2 = RSS(\text{model } p)/\sigma^2$ is independent of $Q_1 - Q_2 = (b - \beta)^t X^t X (b - \beta)/\sigma^2$, by the result:

$$Q_1 \sim \chi^2_{r_1},\; Q_2 \sim \chi^2_{r_2},\; Q_1 - Q_2 \ge 0,\; Q_1, Q_2 \text{ quadratic forms of a multivariate normal} \;\Longrightarrow\; Q_2 \text{ is independent of } Q_1 - Q_2.$$
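As a quick numerical check of the derivation above, the following is a minimal sketch, assuming NumPy is available; the simulated design matrix, coefficient vector, and noise level are illustrative assumptions, not part of the notes. It computes $b = (X^t X)^{-1} X^t Y$ and the hat matrix $P$, and verifies the idempotence of $P$, $(I - P)X = 0$, $\operatorname{tr}(I - P) = n - p$, and the orthogonality condition $X^t(Y - Xb) = 0$ that defines the least squares estimate.

```python
# Numerical sketch of the least squares derivation (illustrative data).
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3                     # n observations, p columns (intercept + 2 covariates)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])
sigma = 1.5
Y = X @ beta + rng.normal(scale=sigma, size=n)

# Least squares estimate b = (X^t X)^{-1} X^t Y via the normal equations
b = np.linalg.solve(X.T @ X, X.T @ Y)

# Hat matrix P = X (X^t X)^{-1} X^t
P = X @ np.linalg.solve(X.T @ X, X.T)
I = np.eye(n)

print(np.allclose(P @ P, P))               # P is idempotent
print(np.allclose((I - P) @ X, 0))         # (I - P) X = 0
print(np.isclose(np.trace(I - P), n - p))  # tr(I - P) = n - p
print(np.allclose(X.T @ (Y - X @ b), 0))   # residuals orthogonal to R(X)
```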
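The conclusions of the first theorem, $E(s^2) = \sigma^2$ and (anticipating part 3 of the second theorem) $RSS/\sigma^2 \sim \chi^2_{n-p}$, can also be checked by simulation. This is a Monte Carlo sketch under the same assumed setup as above; the sample sizes and parameter values are arbitrary choices for illustration.

```python
# Monte Carlo check: E(s^2) = sigma^2 and RSS/sigma^2 ~ chi^2_{n-p}.
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 30, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -1.0, 0.5])

reps = 20000
rss = np.empty(reps)
for i in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    b = np.linalg.solve(X.T @ X, X.T @ Y)
    e = Y - X @ b                 # residual vector e = (I - P) Y
    rss[i] = e @ e                # RSS = e^t e

s2 = rss / (n - p)                # mean residual sum of squares
print(s2.mean())                  # should be close to sigma^2 = 4
print((rss / sigma**2).mean())    # chi^2_{n-p} mean: n - p = 27
print((rss / sigma**2).var())     # chi^2_{n-p} variance: 2(n - p) = 54
```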
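Finally, the decomposition $Q_1 = Q_2 + (Q_1 - Q_2)$ used in part 4 of the second theorem can be simulated in the same way. This sketch (again with illustrative, assumed parameter values) computes $Q_2 = RSS/\sigma^2$ and $Q_1 - Q_2 = (b - \beta)^t X^t X (b - \beta)/\sigma^2$ over many replications; their means should be near $n - p$ and $p$, and their sample correlation near zero, consistent with independence.

```python
# Monte Carlo check of Q1 = Q2 + (Q1 - Q2) and the independence claim.
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma = 30, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([0.5, 1.0, -2.0])

reps = 20000
q2 = np.empty(reps)               # Q2 = RSS / sigma^2
q1_minus_q2 = np.empty(reps)      # (b - beta)^t X^t X (b - beta) / sigma^2
for i in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    b = np.linalg.solve(X.T @ X, X.T @ Y)
    e = Y - X @ b
    q2[i] = e @ e / sigma**2
    d = b - beta
    q1_minus_q2[i] = d @ (X.T @ X) @ d / sigma**2

print(q2.mean(), q1_minus_q2.mean())       # about n - p = 27 and p = 3
print(np.corrcoef(q2, q1_minus_q2)[0, 1])  # near 0, consistent with independence
```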