4. Linear Regression

Let $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$. Denote

$$S(\beta) = (Y - X\beta)^t (Y - X\beta).$$

In linear algebra,

$$X\beta = \begin{bmatrix} 1 & X_{11} & \cdots & X_{1,p-1} \\ 1 & X_{21} & \cdots & X_{2,p-1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{n,p-1} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}$$

is a linear combination of the column vectors of $X$; that is, $X\beta \in R(X)$, the column space of $X$. Then $S(\beta) = \|Y - X\beta\|^2$ is the squared distance between $Y$ and $X\beta$.

Intuitively, the columns of $X$, i.e. $\mathbf{1}, X_1, X_2, \ldots, X_{p-1}$, are the information provided by the covariates to interpret the response $Y$. The least squares method finds the estimate $b$ such that the distance between $Y$ and $Xb$ is smaller than the distance between $Y$ and any other linear combination of the column vectors of $X$ (for example, $X\beta$). Thus $Xb$ is the linear combination of the column vectors of $X$ which interprets $Y$ most accurately.

Further,

$$S(\beta) = (Y - X\beta)^t (Y - X\beta) = \left[(Y - Xb) + (Xb - X\beta)\right]^t \left[(Y - Xb) + (Xb - X\beta)\right]$$
$$= (Y - Xb)^t (Y - Xb) + 2 (Xb - X\beta)^t (Y - Xb) + (Xb - X\beta)^t (Xb - X\beta)$$
$$= \|Y - Xb\|^2 + \|Xb - X\beta\|^2 + 2 (b - \beta)^t X^t (Y - Xb).$$

If we choose the estimate $b$ of $\beta$ such that $Y - Xb$ is orthogonal to every vector in $R(X)$, then $X^t (Y - Xb) = 0$. Thus,

$$S(\beta) = \|Y - Xb\|^2 + \|Xb - X\beta\|^2 \ge \|Y - Xb\|^2 = S(b).$$

That is, if we choose $b$ satisfying $X^t (Y - Xb) = 0$, then $S(b) \le S(\beta)$ for any other estimate $\beta$, so $b$ is the least squares estimate. Therefore,

$$X^t (Y - Xb) = 0 \;\Longrightarrow\; X^t Y = X^t X b \;\Longrightarrow\; b = (X^t X)^{-1} X^t Y.$$

Since $\hat{Y} = Xb = X (X^t X)^{-1} X^t Y = P Y$, the matrix $P = X (X^t X)^{-1} X^t$ is called the projection matrix or hat matrix: $P$ projects the response vector $Y$ onto the space spanned by the covariate vectors. The vector of residuals is

$$e = Y - \hat{Y} = Y - Xb = Y - PY = (I - P) Y.$$

We have the following two important theorems.

Theorem:
1. $P$ and $I - P$ are idempotent.
2. $\operatorname{rank}(I - P) = \operatorname{tr}(I - P) = n - p$.
3. $(I - P) X = 0$.
4. $E(\text{mean residual sum of squares}) = E(s^2) = E\!\left[\dfrac{(Y - \hat{Y})^t (Y - \hat{Y})}{n - p}\right] = \sigma^2$.

[proof:]

1. $P P = X (X^t X)^{-1} X^t X (X^t X)^{-1} X^t = X (X^t X)^{-1} X^t = P$, and $(I - P)(I - P) = I - P - P + P P = I - P$.

2. Since $P$ is idempotent, $\operatorname{rank}(P) = \operatorname{tr}(P)$. Thus

$$\operatorname{rank}(P) = \operatorname{tr}(P) = \operatorname{tr}\!\left(X (X^t X)^{-1} X^t\right) = \operatorname{tr}\!\left((X^t X)^{-1} X^t X\right) = \operatorname{tr}(I_p) = p.$$

Similarly, using $\operatorname{tr}(A - B) = \operatorname{tr}(A) - \operatorname{tr}(B)$,

$$\operatorname{rank}(I - P) = \operatorname{tr}(I - P) = \operatorname{tr}(I) - \operatorname{tr}(P) = n - p.$$

3. $(I - P) X = X - P X = X - X (X^t X)^{-1} X^t X = X - X = 0$.

4. Since $I - P$ is symmetric and idempotent,

$$RSS(\text{model } p) = e^t e = (Y - \hat{Y})^t (Y - \hat{Y}) = (Y - PY)^t (Y - PY) = Y^t (I - P)^t (I - P) Y = Y^t (I - P) Y.$$

Thus, using the identity $E(Z^t A Z) = \operatorname{tr}(A\, V(Z)) + E(Z)^t A\, E(Z)$,

$$E[RSS(\text{model } p)] = E[Y^t (I - P) Y] = \operatorname{tr}[(I - P) V(Y)] + (X\beta)^t (I - P)(X\beta) = \operatorname{tr}[(I - P)\, \sigma^2 I] + 0 = \sigma^2 \operatorname{tr}(I - P) = (n - p)\, \sigma^2,$$

where $(X\beta)^t (I - P)(X\beta) = 0$ because $(I - P) X = 0$. Therefore,

$$E(\text{mean residual sum of squares}) = E\!\left[\frac{RSS(\text{model } p)}{n - p}\right] = \sigma^2.$$

Theorem: If $Y \sim N(X\beta, \sigma^2 I)$, where $X$ is an $n \times p$ matrix of rank $p$, then
1. $b \sim N\!\left(\beta, \sigma^2 (X^t X)^{-1}\right)$.
2. $\dfrac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} \sim \chi^2_p$.
3. $\dfrac{RSS(\text{model } p)}{\sigma^2} = \dfrac{(n - p) s^2}{\sigma^2} \sim \chi^2_{n - p}$.
4. $\dfrac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2}$ is independent of $\dfrac{RSS(\text{model } p)}{\sigma^2} = \dfrac{(n - p) s^2}{\sigma^2}$.

[proof:]

1. Since for a normal random vector $Z \sim N(\mu, \Sigma)$ we have $C Z \sim N(C \mu, C \Sigma C^t)$, for $Y \sim N(X\beta, \sigma^2 I)$,

$$b = (X^t X)^{-1} X^t Y \sim N\!\left((X^t X)^{-1} X^t X \beta,\; (X^t X)^{-1} X^t\, \sigma^2 I\, X (X^t X)^{-1}\right) = N\!\left(\beta,\; \sigma^2 (X^t X)^{-1}\right).$$

2. By part 1, $Z = \dfrac{(X^t X)^{1/2} (b - \beta)}{\sigma} \sim N(0, I_p)$. Thus,

$$\frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = Z^t Z \sim \chi^2_p.$$

3. $(I - P)(I - P) = I - P$ and $\operatorname{rank}(I - P) = n - p$, and for an idempotent matrix $A$ ($A^2 = A$, $\operatorname{rank}(A) = r$) and $Z \sim N(\mu, \sigma^2 I)$,

$$\frac{(Z - \mu)^t A (Z - \mu)}{\sigma^2} \sim \chi^2_r.$$

Since $(I - P) X = 0$, we have $Y^t (I - P) X\beta = 0$ and $(X\beta)^t (I - P)(X\beta) = 0$, so $(Y - X\beta)^t (I - P)(Y - X\beta) = Y^t (I - P) Y$. Hence

$$\frac{RSS(\text{model } p)}{\sigma^2} = \frac{(n - p) s^2}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2} = \frac{(Y - X\beta)^t (I - P)(Y - X\beta)}{\sigma^2} \sim \chi^2_{n - p}.$$

4. Let

$$Q_1 = \frac{(Y - X\beta)^t (Y - X\beta)}{\sigma^2} = \frac{\left[(Y - Xb) + (Xb - X\beta)\right]^t \left[(Y - Xb) + (Xb - X\beta)\right]}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2} + \frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = Q_2 + (Q_1 - Q_2),$$

where the cross term vanishes because $X^t (Y - Xb) = 0$,

$$Q_2 = \frac{(Y - Xb)^t (Y - Xb)}{\sigma^2} = \frac{(Y - PY)^t (Y - PY)}{\sigma^2} = \frac{Y^t (I - P) Y}{\sigma^2},$$

and

$$Q_1 - Q_2 = \frac{(Xb - X\beta)^t (Xb - X\beta)}{\sigma^2} = \frac{(b - \beta)^t X^t X (b - \beta)}{\sigma^2} = \frac{\|Xb - X\beta\|^2}{\sigma^2} \ge 0.$$

Since $Z = (Y - X\beta)/\sigma \sim N(0, I)$ and $Z^t I Z = Z^t Z$,

$$Q_1 = \frac{(Y - X\beta)^t (Y - X\beta)}{\sigma^2} = \frac{(Y - X\beta)^t\, I\, (Y - X\beta)}{\sigma^2} \sim \chi^2_n,$$

and by the previous result,

$$Q_2 = \frac{(Y - Xb)^t (Y - Xb)}{\sigma^2} = \frac{RSS(\text{model } p)}{\sigma^2} = \frac{(Y - X\beta)^t (I - P)(Y - X\beta)}{\sigma^2} \sim \chi^2_{n - p}.$$

Therefore, $Q_2 = RSS(\text{model } p)/\sigma^2$ is independent of $Q_1 - Q_2 = (b - \beta)^t X^t X (b - \beta)/\sigma^2$, by the result:

$$Q_1 \sim \chi^2_{r_1},\; Q_2 \sim \chi^2_{r_2},\; Q_1 - Q_2 \ge 0,\; Q_1, Q_2 \text{ quadratic forms of a multivariate normal} \;\Longrightarrow\; Q_2 \text{ is independent of } Q_1 - Q_2.$$
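As a quick numerical check of the derivation above, the following is a minimal sketch, assuming NumPy is available; the simulated design matrix, coefficient vector, and noise level are illustrative assumptions, not part of the notes. It computes $b = (X^t X)^{-1} X^t Y$ and the hat matrix $P$, and verifies the idempotence of $P$, $(I - P)X = 0$, $\operatorname{tr}(I - P) = n - p$, and the orthogonality condition $X^t(Y - Xb) = 0$ that defines the least squares estimate.

```python
# Numerical sketch of the least squares derivation (illustrative data).
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3                     # n observations, p columns (intercept + 2 covariates)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])
sigma = 1.5
Y = X @ beta + rng.normal(scale=sigma, size=n)

# Least squares estimate b = (X^t X)^{-1} X^t Y via the normal equations
b = np.linalg.solve(X.T @ X, X.T @ Y)

# Hat matrix P = X (X^t X)^{-1} X^t
P = X @ np.linalg.solve(X.T @ X, X.T)
I = np.eye(n)

print(np.allclose(P @ P, P))               # P is idempotent
print(np.allclose((I - P) @ X, 0))         # (I - P) X = 0
print(np.isclose(np.trace(I - P), n - p))  # tr(I - P) = n - p
print(np.allclose(X.T @ (Y - X @ b), 0))   # residuals orthogonal to R(X)
```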
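The conclusions of the first theorem, $E(s^2) = \sigma^2$ and (anticipating part 3 of the second theorem) $RSS/\sigma^2 \sim \chi^2_{n-p}$, can also be checked by simulation. This is a Monte Carlo sketch under the same assumed setup as above; the sample sizes and parameter values are arbitrary choices for illustration.

```python
# Monte Carlo check: E(s^2) = sigma^2 and RSS/sigma^2 ~ chi^2_{n-p}.
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 30, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -1.0, 0.5])

reps = 20000
rss = np.empty(reps)
for i in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    b = np.linalg.solve(X.T @ X, X.T @ Y)
    e = Y - X @ b                 # residual vector e = (I - P) Y
    rss[i] = e @ e                # RSS = e^t e

s2 = rss / (n - p)                # mean residual sum of squares
print(s2.mean())                  # should be close to sigma^2 = 4
print((rss / sigma**2).mean())    # chi^2_{n-p} mean: n - p = 27
print((rss / sigma**2).var())     # chi^2_{n-p} variance: 2(n - p) = 54
```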
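Finally, the decomposition $Q_1 = Q_2 + (Q_1 - Q_2)$ used in part 4 of the second theorem can be simulated in the same way. This sketch (again with illustrative, assumed parameter values) computes $Q_2 = RSS/\sigma^2$ and $Q_1 - Q_2 = (b - \beta)^t X^t X (b - \beta)/\sigma^2$ over many replications; their means should be near $n - p$ and $p$, and their sample correlation near zero, consistent with independence.

```python
# Monte Carlo check of Q1 = Q2 + (Q1 - Q2) and the independence claim.
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma = 30, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([0.5, 1.0, -2.0])

reps = 20000
q2 = np.empty(reps)               # Q2 = RSS / sigma^2
q1_minus_q2 = np.empty(reps)      # (b - beta)^t X^t X (b - beta) / sigma^2
for i in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    b = np.linalg.solve(X.T @ X, X.T @ Y)
    e = Y - X @ b
    q2[i] = e @ e / sigma**2
    d = b - beta
    q1_minus_q2[i] = d @ (X.T @ X) @ d / sigma**2

print(q2.mean(), q1_minus_q2.mean())       # about n - p = 27 and p = 3
print(np.corrcoef(q2, q1_minus_q2)[0, 1])  # near 0, consistent with independence
```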