Chapter 3 The Two-Variable Regression Model
A. The model
Y_i = α + βX_i + ε_i,  i = 1, 2, …, N
the population model (the "true" model)
ε: the random error term, or disturbance; an unobservable, theoretical error with zero mean that measures the gap between the actual Y and its expected value.
The problem of specification: (1) α and β are unknown population parameters and must be estimated. (2) Y = α + βX may not be the real (or exact) relationship between X and Y, or some variables may be omitted; we choose only the important variables to specify the population model. (3) Only the values of X and Y are observed; the disturbances and the population itself are unobservable. (4) A random sample is used, so there is some sampling error.
e = Y − Ŷ: the residual (estimated error term);  Ŷ_i = α̂ + β̂X_i
Regression analysis → to get the expected value of Y given X, E(Y|X);
for each X, E(Y|X) = α + βX, i.e. to estimate the model Ŷ = α̂ + β̂X (the regression line).
1. Assumptions of the standard linear regression model (SLRM), or the classical linear regression model:
(1) Y and X are linearly related: Y_i = α + βX_i + e_i is the "true" model.
(2) The X's are non-stochastic variables whose values are fixed, with Var(X) ≠ 0.
(For the multiple regression model, X is a matrix with full rank, i.e. there is no linear relationship between the independent variables.)
(3) E(e_i) = 0
Var(e_i) = E(e_i²) = σ²
Cov(e_i, e_j) = E{[e_i − E(e_i)][e_j − E(e_j)]} = E(e_i e_j) = 0, for i ≠ j
(4) e ~ N(0, σ²). Assumptions (1)–(4) are called the assumptions of the classical normal linear regression model.
Illustrate:
(1) Y is related to X.
(2) The value of X is fixed.
(3) If E(e_i) = α′ ≠ 0, then the model must be rewritten:
Y_i = α + βX_i + e_i = (α + α′) + βX_i + (e_i − α′) = α* + βX_i + e_i*
→ E(e_i*) = E(e_i − α′) = α′ − α′ = 0
Var(e_i) = E(e_i²) = σ² → homoscedasticity, i.e. the variance along the regression line is the same.
If Var(e_i) = σ_i², heteroscedasticity: the variance along the regression line differs across observations.
Cov(e_i, e_j) = 0 for i ≠ j → the error process is serially uncorrelated.
[If Cov(e_i, e_j) ≠ 0 for i ≠ j → the error process is serially correlated, or autocorrelated; the correlation may be negative or positive.]
2. E(X_i e_i) = X_i E(e_i) = 0 …… an implicit assumption (X is fixed, so it carries out of the expectation).
* The stochastic regression model has 3 unknown parameters: α, β, σ².
* E(Y_i) = E(α + βX_i + e_i) = α + βX_i  (only e_i is a random variable)
Var(Y_i) = E[Y_i − E(Y_i)]² = E[(α + βX_i + e_i) − (α + βX_i)]² = E(e_i²) = σ²
…… the regression variance, or variance about the regression line, or residual variance.
The Y_i are uncorrelated (since Cov(e_i, e_j) = 0 for i ≠ j)
→ Y_i ~ (α + βX_i, σ²);  if e ~ N(0, σ²), then Y_i ~ N(α + βX_i, σ²).
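As a quick numerical illustration of these claims, here is a minimal simulation sketch in Python (assuming numpy is available; the values α = 2, β = 0.5, σ = 1 and the fixed X grid are hypothetical, chosen only for illustration):

    # Simulate Y_i = alpha + beta*X_i + e_i with fixed X and e ~ N(0, sigma^2),
    # then check E(Y_i) = alpha + beta*X_i and Var(Y_i) = sigma^2.
    import numpy as np

    rng = np.random.default_rng(0)
    alpha, beta, sigma, N, reps = 2.0, 0.5, 1.0, 20, 100_000
    X = np.linspace(1, 10, N)                   # fixed, non-stochastic X's
    e = rng.normal(0.0, sigma, size=(reps, N))  # e ~ N(0, sigma^2)
    Y = alpha + beta * X + e                    # Y_i ~ N(alpha + beta*X_i, sigma^2)

    print(np.allclose(Y.mean(axis=0), alpha + beta * X, atol=0.02))  # E(Y_i): True
    print(np.allclose(Y.var(axis=0), sigma**2, atol=0.03))           # Var(Y_i): True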
B. Best linear unbiased estimator, BLUE
Conditions: (1) linear: β̃ is a linear combination of the sample values.
(2) unbiasedness: E(β̃) = β
(3) best → lowest variance (i.e. most efficient):
Var(β̃) ≤ Var(β*), where β* is any other linear unbiased estimator.
1. Prove that X̄ is BLUE:
X̄ = (1/N)(X₁ + X₂ + ⋯ + X_N) = (1/N)X₁ + (1/N)X₂ + ⋯ + (1/N)X_N
(1) X̄ = a₁X₁ + a₂X₂ + ⋯ + a_N X_N = Σ a_i X_i,  a_i = 1/N, i = 1, 2, …, N
(i.e. X̄ and the X_i are linearly related)
(2) E(X̄) = E(Σ a_i X_i) = Σ a_i E(X_i) = μ_X Σ a_i = μ_X,  since Σ a_i = Σ 1/N = 1
(3) Var(X̄) = Var(Σ a_i X_i) = E[Σ a_i X_i − E(Σ a_i X_i)]² = E[Σ a_i(X_i − μ_X)]²
= E[Σ a_i²(X_i − μ_X)²] + E[Σ_{i≠j} a_i a_j(X_i − μ_X)(X_j − μ_X)]
= Σ a_i² E[(X_i − μ_X)²] + Σ_{i≠j} a_i a_j E[(X_i − μ_X)(X_j − μ_X)]
= σ_X² Σ a_i² = σ_X² · N(1/N)² = σ_X²/N  (as small as possible)
* To minimize σ_X² Σ a_i² subject to Σ a_i = 1, set up
H = F(a₁, a₂, …, a_N) − λ G(a₁, a₂, …, a_N) = σ_X² Σ a_i² − λ(Σ a_i − 1)
∂H/∂a_i = 2σ_X² a_i − λ = 0 → a_i = λ/(2σ_X²), for all i …… (A1)
∂H/∂λ = −(Σ a_i − 1) = 0 → Σ a_i = Nλ/(2σ_X²) = 1 → λ = 2σ_X²/N …… (A2)
According to (A1), (A2): a_i = 1/N for all i,
which proves that X̄ is BLUE.
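A small numerical check of the minimization step (not a proof; the weights and N below are illustrative): since Var(Σ a_i X_i) = σ_X² Σ a_i² for uncorrelated X_i, it is enough to compare Σ a_i² across unbiased weightings.

    # Equal weights a_i = 1/N give the smallest sum(a_i^2) among weights summing to 1.
    import numpy as np

    N = 10
    equal = np.full(N, 1.0 / N)
    rng = np.random.default_rng(1)
    for _ in range(5):
        a = rng.random(N)
        a /= a.sum()                              # unbiasedness: weights sum to 1
        print(np.sum(a**2) >= np.sum(equal**2))   # always True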
2. Gauss–Markov theorem (LSE is BLUE):
** Given assumptions (1)(2)(3), the least squares estimators α̂ and β̂ are the best linear unbiased estimators of α and β.
LSE: β̂ = Σ(X_i − X̄)Y_i / Σ(X_i − X̄)² = Σ C_i Y_i,  where C_i = (X_i − X̄)/Σ(X_i − X̄)²
α̂ = Ȳ − β̂X̄ = ΣY_i/N − β̂ ΣX_i/N
(α̂ and β̂ are functions of random variables → α̂ and β̂ are random variables)
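These formulas can be checked directly; a minimal sketch on hypothetical data (numpy's polyfit is used only as an independent reference):

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

    C = (X - X.mean()) / np.sum((X - X.mean())**2)  # the C_i weights
    beta_hat = np.sum(C * Y)                        # beta_hat = sum C_i Y_i
    alpha_hat = Y.mean() - beta_hat * X.mean()      # alpha_hat = Ybar - beta_hat*Xbar

    b, a = np.polyfit(X, Y, 1)                      # (slope, intercept)
    print(np.isclose(beta_hat, b), np.isclose(alpha_hat, a))   # True True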
Prove: (1) According to the formula for β̂, the C_i are constants → β̂ and the Y_i are linearly related, i.e. β̂ is a linear estimator.
(PS: Σ C_i = Σ(X_i − X̄)/Σ(X_i − X̄)² = 0,  since Σ(X_i − X̄) = ΣX_i − NX̄ = 0;
Σ C_i² = Σ{[(X_i − X̄)/Σ(X_i − X̄)²]²} = Σ(X_i − X̄)²/[Σ(X_i − X̄)²]² = 1/Σ(X_i − X̄)² = 1/Σx_i²;
Σ C_i X_i = Σ C_i x_i = Σ(X_i − X̄)X_i/Σ(X_i − X̄)² = Σ(X_i − X̄)²/Σ(X_i − X̄)² = 1.)
(2) β̂ = Σ C_i Y_i = Σ C_i(α + βX_i + e_i) = α Σ C_i + β Σ C_i X_i + Σ C_i e_i = β + Σ C_i e_i
E(β̂) = E(β + Σ C_i e_i) = β + E(C₁e₁ + C₂e₂ + ⋯ + C_N e_N)
= β + C₁E(e₁) + C₂E(e₂) + ⋯ + C_N E(e_N) = β
(β̂ is an unbiased estimator of β)
(3) Var(β̂) = E[β̂ − E(β̂)]² = E(β̂ − β)² = E(Σ C_i e_i)²
= E(C₁²e₁² + C₂²e₂² + ⋯ + C_N²e_N² + 2C₁C₂e₁e₂ + 2C₁C₃e₁e₃ + ⋯ + 2C_{N−1}C_N e_{N−1}e_N)
= C₁²σ² + C₂²σ² + ⋯ + C_N²σ² = σ² Σ C_i² = σ²/Σ(X_i − X̄)²
→ Var(β̂) shrinks as N grows (with σ² constant); the smaller the variation in X, the larger Var(β̂).
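A Monte Carlo sketch of this variance formula (the parameter values are illustrative only):

    # The sampling variance of beta_hat should approach sigma^2 / sum(x_i^2).
    import numpy as np

    rng = np.random.default_rng(2)
    alpha, beta, sigma, N, reps = 1.0, 2.0, 1.5, 30, 100_000
    X = np.linspace(0, 10, N)
    x = X - X.mean()

    e = rng.normal(0, sigma, size=(reps, N))
    Y = alpha + beta * X + e
    beta_hats = (Y * x).sum(axis=1) / np.sum(x**2)  # beta_hat = sum x_i Y_i / sum x_i^2

    print(beta_hats.var())               # simulated Var(beta_hat)
    print(sigma**2 / np.sum(x**2))       # theoretical value: nearly identical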
Define any arbitrary linear estimator of β as β̃ = Σ w_i Y_i, where w_i = C_i + d_i (d_i: any arbitrary constants).
For β̃ to be an unbiased estimator of β, the d_i must fulfill certain conditions:
β̃ = Σ w_i(α + βX_i + e_i) = α Σ w_i + β Σ w_i X_i + Σ w_i e_i
→ E(β̃) = α Σ w_i + β Σ w_i X_i; for unbiasedness, Σ w_i = 0 and Σ w_i X_i = 1.
Since w_i = C_i + d_i, this requires Σ d_i = 0 and Σ d_i X_i = Σ d_i x_i = 0.
Var(β̃) = E(Σ w_i e_i)² = σ² Σ w_i² = σ² Σ C_i² + σ² Σ d_i²
(the cross term 2σ² Σ C_i d_i = 2σ² Σ d_i x_i/Σx_i² = 0)
= σ²/Σx_i² + σ² Σ d_i² = Var(β̂) + σ² Σ d_i²
i.e. Var(β̃) ≥ Var(β̂), with equality only when all d_i = 0
→ β̂ has minimum variance.
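The d_i argument can also be seen numerically; in the sketch below (hypothetical data, σ² = 1) the perturbation d is forced to satisfy Σ d_i = 0 and Σ d_i x_i = 0, and Var(β̃) exceeds Var(β̂) by exactly σ² Σ d_i²:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x = X - X.mean()
    C = x / np.sum(x**2)
    sigma2 = 1.0

    d = np.array([1.0, -2.0, 1.0, 1.0, -2.0, 1.0])
    d = d - d.mean()                     # enforce sum(d_i) = 0
    d = d - (d @ x) / (x @ x) * x        # enforce sum(d_i x_i) = 0
    w = C + d

    print(sigma2 * np.sum(w**2))                          # Var(beta_tilde)
    print(sigma2 * np.sum(C**2) + sigma2 * np.sum(d**2))  # Var(beta_hat) + sigma^2*sum(d_i^2)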
<Prove α̂>
α̂ = Ȳ − β̂X̄ = ΣY_i/N − X̄ Σ C_i Y_i = Σ[(1/N − X̄C_i)Y_i]
(1) → α̂ and the Y_i are linearly related, i.e. α̂ is a linear estimator.
(2) α̂ = Σ[(1/N − X̄C_i)Y_i] = Σ[(1/N − X̄C_i)(α + βX_i + e_i)]
= Σ(α/N − αX̄C_i + βX_i/N − βX̄C_iX_i + e_i/N − X̄C_ie_i)
= α − 0 + βX̄ − βX̄ + Σ(1/N − X̄C_i)e_i = α + Σ(1/N − X̄C_i)e_i
→ E(α̂) = α  (α̂ is an unbiased estimator of α)
(3) Var(α̂) = E[α̂ − E(α̂)]² = E[Σ(1/N − X̄C_i)e_i]² = E[(Σe_i)²/N²] + E[X̄²(Σ C_i e_i)²]
(the cross term −2(X̄/N)E[(Σe_i)(Σ C_i e_i)] = −2(X̄/N)σ² Σ C_i = 0)
= σ²/N + X̄²σ²/Σx_i² = σ²(Σx_i² + NX̄²)/(N Σx_i²) = σ² ΣX_i²/(N Σx_i²)
Define any arbitrary linear estimator of α as α̃ = Σ(1/N − X̄w_i)Y_i, where w_i = C_i + d_i (d_i: any arbitrary constants);
for unbiasedness, Σ w_i = 0 and Σ w_i X_i = 1, → Σ d_i = 0 and Σ d_i X_i = Σ d_i x_i = 0.
→ Var(α̃) = E[α̃ − E(α̃)]² = E[Σe_i/N − X̄ Σ w_i e_i]²
= E[(Σe_i)²/N²] + E[X̄²(Σ w_i e_i)²]  (the cross term vanishes because Σ w_i = 0)
= σ²/N + X̄²σ² Σ w_i² = σ²/N + X̄²σ²(Σ C_i² + Σ d_i²) = Var(α̂) + X̄²σ² Σ d_i²
i.e. Var(α̃) ≥ Var(α̂), with equality only when all d_i = 0
→ α̂ has minimum variance.
* Cov(α̂, β̂) = E{[α̂ − E(α̂)][β̂ − E(β̂)]} = E{[Σ(1/N − X̄C_i)e_i][Σ C_i e_i]}
= E[(Σe_i)(Σ C_i e_i)/N − X̄(Σ C_i e_i)²] = σ² Σ C_i/N − X̄σ² Σ C_i² = −X̄σ²/Σx_i²

Variance–covariance matrix:
Σ(α̂, β̂) = [ Var(α̂)       Cov(α̂, β̂) ] = [ σ² ΣX_i²/(N Σx_i²)   −σ²X̄/Σx_i² ]
           [ Cov(α̂, β̂)   Var(β̂)     ]   [ −σ²X̄/Σx_i²           σ²/Σx_i²   ]
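A simulation sketch of this matrix (illustrative parameter values; numpy assumed):

    # Compare the simulated covariance matrix of (alpha_hat, beta_hat) with the formula.
    import numpy as np

    rng = np.random.default_rng(3)
    alpha, beta, sigma, N, reps = 1.0, 2.0, 1.0, 25, 100_000
    X = np.linspace(1, 5, N)
    x = X - X.mean()
    Sxx = np.sum(x**2)

    e = rng.normal(0, sigma, size=(reps, N))
    Y = alpha + beta * X + e
    b = (Y * x).sum(axis=1) / Sxx
    a = Y.mean(axis=1) - b * X.mean()

    print(np.cov(a, b))                  # simulated variance-covariance matrix
    print(sigma**2 * np.array([[np.sum(X**2) / (N * Sxx), -X.mean() / Sxx],
                               [-X.mean() / Sxx,           1.0 / Sxx]]))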
C. Estimating σ²
S² = σ̂² = (1/(N−2)) Σ ê_i²  is an unbiased estimator of σ².
σ² = E(e_i²);  Σ ê_i² = ESS;  since α and β are to be estimated, d.f. = N − 2 (or T − K).
ê_i = Y_i − Ŷ_i = Y_i − α̂ − β̂X_i
<PROVE> In deviation form: ŷ_i = β̂x_i
→ ê_i = y_i − ŷ_i = y_i − β̂x_i
= βx_i + (e_i − ē) − β̂x_i
= (β − β̂)x_i + (e_i − ē)
ê_i² = [(β − β̂)x_i + (e_i − ē)]²
→ Σ ê_i² = Σ[(β − β̂)²x_i² + (e_i − ē)² + 2(β − β̂)x_i(e_i − ē)]
= (β − β̂)² Σx_i² + Σ(e_i − ē)² + 2(β − β̂) Σ x_i(e_i − ē)
        A               B                  C
For A: E[(β − β̂)² Σx_i²] = Σx_i² E(β̂ − β)² = Σx_i² · σ²/Σx_i² = σ²
For B: E[Σ(e_i − ē)²] = E[Σe_i² − (1/N)(Σe_i)²]
= E[e₁² + e₂² + ⋯ + e_N²] − (1/N)E[(e₁ + e₂ + ⋯ + e_N)(e₁ + e₂ + ⋯ + e_N)]
= Nσ² − (1/N)(Nσ²) = (N − 1)σ²
For C: β̂ − β = Σx_ie_i/Σx_i², and Σx_i(e_i − ē) = Σx_ie_i − ē Σx_i = Σx_ie_i, so
C = −2(β̂ − β)Σx_ie_i = −2(Σx_ie_i)²/Σx_i², and E(C) = −2σ²Σx_i²/Σx_i² = −2σ²
E(Σ ê_i²) = E(A) + E(B) + E(C) = σ² + (N − 1)σ² − 2σ² = (N − 2)σ²
E(S²) = (1/(N−2)) E(Σ ê_i²) = (1/(N−2))(N − 2)σ² = σ²  →  S² is unbiased.
S: the standard error of regression, SER; or standard error of the estimate, SEE; the estimated standard deviation of e.
S_α̂, S_β̂: the standard errors of the estimated coefficients.
S² = (1/(N−2)) Σ ê_i² = (1/(N−2)) Σ(y_i − β̂x_i)² = (1/(N−2))(Σy_i² + β̂²Σx_i² − 2β̂Σx_iy_i)
= (1/(N−2))(Σy_i² − β̂Σx_iy_i),  using β̂Σx_i² = Σx_iy_i.
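Both forms of S² give the same number; a minimal sketch on hypothetical data:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
    N = len(X)
    x, y = X - X.mean(), Y - Y.mean()
    beta_hat = np.sum(x * y) / np.sum(x**2)
    alpha_hat = Y.mean() - beta_hat * X.mean()

    e_hat = Y - alpha_hat - beta_hat * X
    S2_resid = np.sum(e_hat**2) / (N - 2)                           # from residuals
    S2_short = (np.sum(y**2) - beta_hat * np.sum(x * y)) / (N - 2)  # computational form
    print(np.isclose(S2_resid, S2_short))    # True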
D. If Y_i ~ N(α + βX_i, σ²), then α̂ ~ N(α, σ² ΣX_i²/(N Σx_i²)) and β̂ ~ N(β, σ²/Σx_i²).
α̂ and β̂ are linear combinations of the independent normal variables Y₁, Y₂, …, Y_N
→ α̂ and β̂ must be normally distributed.
H₀: β = β₀,  H₁: β ≠ β₀,  α = 0.05
The test statistic: Z_C = (β̂ − β₀)/σ_β̂ ~ N(0, 1), if σ_β̂ is known;
t_C = (β̂ − β₀)/S_β̂ ~ t_{N−2, α/2}, if σ_β̂ is unknown.
Interval estimates: β̂ ± t_{N−2, α/2} S_β̂  and  α̂ ± t_{N−2, α/2} S_α̂
E. Descriptive properties (some mathematical characteristics of the LSE)
Ŷ = α̂ + β̂X …… the regression line;  Y = Ŷ + ê = α̂ + β̂X + ê;  ê = Y − Ŷ …… the calculated residual
1. Σ ê_i = 0, → the mean of ê is 0.
Prove: Σ ê_i = Σ(y_i − β̂x_i) = Σ y_i − β̂ Σ x_i = 0 − 0 = 0
2. Σ ê_i X_i = 0 → Σ Xê = 0; this is a good property.
Prove: Σ ê_i X_i = Σ ê_i(X̄ + x_i) = X̄ Σ ê_i + Σ ê_i x_i = Σ ê_i x_i,
and Σ ê_i x_i = Σ(y_i − β̂x_i)x_i = Σ x_i y_i − β̂ Σ x_i² = Σ x_i y_i − (Σ x_i y_i/Σ x_i²) Σ x_i² = 0
According to 1. & 2. → X and ê are orthogonal, i.e. X′ê = 0,
i.e. X and ê are linearly uncorrelated; the estimated values of α̂ and β̂ are exactly those that make X and ê uncorrelated.
3. Ȳ = Σ Ŷ_i/N (the mean of the fitted values equals Ȳ)
PROVE: Ȳ = Σ(Ŷ_i + ê_i)/N = Σ Ŷ_i/N + Σ ê_i/N = Σ Ŷ_i/N
4. Σ Ŷ_i ê_i = 0
(If Σ Ŷ_i ê_i ≠ 0, the fitted values would still be correlated with the residuals, and the performance of the regression could be improved.)
PROVE: Σ ê_i Ŷ_i = Σ ê_i(α̂ + β̂X_i) = α̂ Σ ê_i + β̂ Σ ê_i X_i = 0 + 0 = 0
Properties 1–4 are verified numerically in the sketch below.
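A minimal sketch checking properties 1–4 (hypothetical data; numpy assumed):

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
    x, y = X - X.mean(), Y - Y.mean()
    beta_hat = np.sum(x * y) / np.sum(x**2)
    alpha_hat = Y.mean() - beta_hat * X.mean()
    Y_hat = alpha_hat + beta_hat * X
    e_hat = Y - Y_hat

    print(np.isclose(e_hat.sum(), 0))            # 1. sum of residuals is zero
    print(np.isclose((e_hat * X).sum(), 0))      # 2. residuals orthogonal to X
    print(np.isclose(Y_hat.mean(), Y.mean()))    # 3. mean of fitted values = Ybar
    print(np.isclose((e_hat * Y_hat).sum(), 0))  # 4. residuals orthogonal to the fit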
F. Goodness of Fit
1. (Y_i − Ȳ) = (Y_i − Ŷ_i) + (Ŷ_i − Ȳ)
total deviation of Y = unexplained deviation of Y + explained deviation of Y
→ Σ(Y_i − Ȳ)² = Σ(Y_i − Ŷ_i)² + Σ(Ŷ_i − Ȳ)²,  i.e. TSS = ESS + RSS,
since (Y_i − Ȳ)² = (Y_i − Ŷ_i)² + 2(Y_i − Ŷ_i)(Ŷ_i − Ȳ) + (Ŷ_i − Ȳ)²
and Σ(Y_i − Ŷ_i)(Ŷ_i − Ȳ) = Σ ê_i Ŷ_i − Ȳ Σ ê_i = 0.
2. R², the coefficient of determination, i.e. the R-squared of the regression equation.
Define: R² = RSS/TSS = 1 − ESS/TSS
→ R² = Σ(Ŷ_i − Ȳ)²/Σ(Y_i − Ȳ)² = Σŷ_i²/Σy_i² = Σ(β̂x_i)²/Σy_i² = β̂² Σx_i²/Σy_i² = β̂² Var(X)/Var(Y)
R² = 1 − ESS/TSS = 1 − Σê_i²/Σy_i²
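A sketch computing R² both ways on hypothetical data (note this chapter's convention: RSS = regression sum of squares, ESS = error sum of squares):

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
    x, y = X - X.mean(), Y - Y.mean()
    beta_hat = np.sum(x * y) / np.sum(x**2)
    Y_hat = Y.mean() + beta_hat * x              # fitted values

    TSS = np.sum((Y - Y.mean())**2)
    RSS = np.sum((Y_hat - Y.mean())**2)          # explained (regression) SS
    ESS = np.sum((Y - Y_hat)**2)                 # unexplained (error) SS
    print(RSS / TSS, 1 - ESS / TSS)              # both equal R^2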
(1) R² is a measure of the goodness of fit of the regression model, i.e. how well the regression model fits the data.
Ex.: R² = 0.959 measures the proportion (95.9%) of the variation in Y which is explained by the regression equation.
(S or σ̂ — the SEE, standard error of estimate — is another measure of fit, but it depends on the unit of measurement: if the unit is changed, we get a different estimate. R² is independent of the unit of measurement.)
** When using R², the following conditions must hold:
(a) The estimator must be an OLS estimator.
(b) The relationship being estimated must be linear.
(c) The linear relationship being estimated must include a constant, or intercept, term.
(2) 0 ≤ R² ≤ 1
R² = 0: the model cannot explain any of the variation in Y.
R² = 1: the case of a perfect fit in Y (a special case).
(3) R² tends to be higher with time-series data and lower with cross-section data.
(4) R² measures the strength of the linear relationship between X and Y; it does not by itself establish causality.
(5) In multiple regression, R² rises as independent variables are added: with an additional independent variable, TSS is fixed but RSS ↑, i.e. ESS ↓.
So we use R̄² (the adjusted coefficient of determination). It takes the d.f. into account and may stay constant as K ↑:
1 − R̄² = (ESS/(N−K)) / (TSS/(N−1))
ESS/(N−K): the residual variance; TSS/(N−1): the variance of Y;
so 1 − R̄² = S²/Var(Y) = Var(ê)/Var(Y).
R̄² = 1 − (ESS/(N−K))/(TSS/(N−1)) = 1 − (ESS/TSS)·(N−1)/(N−K) = 1 − (1 − R²)(N−1)/(N−K),  K = k + 1
As K↑ → (N−K)↓ → R̄²↓; but as K↑, with TSS fixed, RSS↑,
so R̄² does not necessarily fall as K↑; the two effects may offset.
* If K↑: S² may ↓, stay constant, or ↑, so R̄² may ↑, stay constant, or ↓.
* R̄² can be negative; the condition is 1 − (1 − R²)(N−1)/(N−K) < 0, i.e. 1 < (1 − R²)(N−1)/(N−K).
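A sketch of the adjusted R² formula (the N, K, R² values are illustrative):

    def adjusted_r2(r2: float, n: int, k: int) -> float:
        """R2_bar = 1 - (1 - R2)(N - 1)/(N - K); K = number of estimated coefficients."""
        return 1 - (1 - r2) * (n - 1) / (n - k)

    print(adjusted_r2(0.959, n=20, k=2))   # slightly below R^2
    print(adjusted_r2(0.05, n=12, k=6))    # negative, as noted above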
3. Testing the regression equation:
Critical value: F_{d.f.1, d.f.2, α}
F_C = explained variance / unexplained variance = (RSS/1)/(ESS/(N−2)),  for Y = α + βX
= (Σŷ_i²/1)/(Σê_i²/(N−2)) = β̂²Σx_i²/(Σê_i²/(N−2))
= (RSS/TSS)/(ESS/TSS) · (N−2)/1 = (R²/1)/((1 − R²)/(N−2))
F_C = 0: the independent variable has no explanatory power for the dependent variable in the regression.
As the value of F_C gets large, the relationship between X and Y becomes very close.
(1) t²_{N−2, α/2} = F_{1, N−2, α}
With the null hypothesis H₀: β = 0:
t_C = (β̂ − β)/S_β̂ = β̂/(S/√(Σx_i²))
→ t²_{N−2} = β̂²Σx_i²/(Σê_i²/(N−2)) = (RSS/1)/(ESS/(N−2)) = F_{1, N−2}
(2) The F test can be used for joint hypothesis tests (multi-variable regression equations):
H₀: all β = 0 (not including β₀);  H₁: not all β = 0.
* As β̂ → 0, α̂ = Ȳ − β̂X̄ → Ȳ (though α̂ = Ȳ does not exactly happen), and F = β̂²Σx_i²/S² → 0.
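A sketch verifying F_C and the relation t²_C = F_C on hypothetical data:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
    N = len(X)
    x, y = X - X.mean(), Y - Y.mean()
    beta_hat = np.sum(x * y) / np.sum(x**2)
    e_hat = y - beta_hat * x                     # residuals (deviation form)
    RSS = beta_hat**2 * np.sum(x**2)             # explained SS
    ESS = np.sum(e_hat**2)                       # unexplained SS

    F_C = (RSS / 1) / (ESS / (N - 2))
    R2 = RSS / (RSS + ESS)
    print(F_C, (R2 / 1) / ((1 - R2) / (N - 2)))  # same value

    S_beta = np.sqrt((ESS / (N - 2)) / np.sum(x**2))
    t_C = beta_hat / S_beta
    print(t_C**2)                                # equals F_C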
G. Maximum likelihood estimation
Y_i = α + βX_i + e_i,  e_i ~ N(0, σ²)  →  Y_i ~ N(α + βX_i, σ²)
f(Y_i) = (1/√(2πσ²)) exp[−(1/(2σ²))(Y_i − α − βX_i)²]  …… frequency function, f(Y_i; α, β, σ²)
L = f(Y₁)f(Y₂)⋯f(Y_N) …… the likelihood function
= (2πσ²)^(−N/2) exp[−(1/(2σ²)) Σ(Y_i − α − βX_i)²]
log L = −(N/2) log(2πσ²) − (1/(2σ²)) Σ(Y_i − α − βX_i)²
∂log L/∂α = (1/σ²) Σ(Y_i − α − βX_i) = 0  →  ΣY_i = Nα̃ + β̃ ΣX_i
→ α̃_MLE = ΣY_i/N − β̃ ΣX_i/N = Ȳ − β̃X̄
(i.e. if β̃ = β̂_OLSE, then α̃_MLE = Ȳ − β̃X̄ = α̂_OLSE)
∂log L/∂β = (1/σ²) Σ(Y_i − α − βX_i)X_i = 0  →  ΣY_iX_i − α̃ ΣX_i − β̃ ΣX_i² = 0
Substituting α̃ = ΣY_i/N − β̃ ΣX_i/N:
ΣY_iX_i − (ΣY_i/N − β̃ ΣX_i/N)ΣX_i − β̃ ΣX_i² = 0
→ ΣY_iX_i − ΣX_i ΣY_i/N + β̃(ΣX_i)²/N − β̃ ΣX_i² = 0
→ β̃_MLE = (N ΣX_iY_i − ΣX_i ΣY_i)/(N ΣX_i² − (ΣX_i)²) = β̂_OLSE
∂log L/∂σ² = −N/(2σ²) + (1/(2σ⁴)) Σ(Y_i − α − βX_i)² = 0
→ σ̃²_MLE = (1/N) Σ(Y_i − α̃ − β̃X_i)² = (1/N) Σ ê_i²
i.e. σ̃²_MLE is a biased estimator of σ², whereas σ̂²_OLSE = (1/(N−2)) Σ ê_i² is unbiased.