MULTIPLE REGRESSION - I

Regression analysis examines the relation between a single dependent variable Y and one or more independent variables X1, ..., Xp.

SIMPLE LINEAR REGRESSION MODELS

First-Order Model with One Predictor Variable

$Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$,  i = 1, 2, ..., n

$\beta_0$ is the intercept of the line, and $\beta_1$ is the slope of the line: a one-unit increase in X gives a $\beta_1$-unit increase in the mean of Y.

$\varepsilon_i$ is called the statistical random error for the ith observation $Y_i$. It accounts for the fact that the statistical model does not give an exact fit to the data. $\varepsilon_i$ cannot be observed. We assume:
o $E\{\varepsilon_i\} = 0$ for all i = 1, ..., n
o $Var\{\varepsilon_i\} = \sigma^2$ for all i = 1, ..., n
o $Cov(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j$

Response/Regression function and an example:

$E\{Y_i\} = \beta_0 + \beta_1 X_{i1}$,  e.g.  $E\{Y_i\} = 9.5 + 2.1 X_{i1}$

The response $Y_i$, given the level of X in the ith trial, $X_i$, comes from a probability distribution whose mean is $9.5 + 2.1 X_i$. Therefore, the response/regression function relates the means of the probability distributions of Y (for given X) to the level of X.

$\beta_0$, $\beta_1$, and $\varepsilon$ are unknown parameters and have to be estimated from data. We use $b_0$, $b_1$, and $e$ for the estimates of $\beta_0$, $\beta_1$, and $\varepsilon$.

Fitted Values and an Example:

$\hat{Y}_i = b_0 + b_1 X_{i1}$ is called the fitted value, e.g. $\hat{Y}_i = 9.6 + 1.9 X_{i1}$.

Residual and an Example:

$e_i = Y_i - \hat{Y}_i$ is called the residual, e.g. $e_i = 108 - (9.6 + 1.9 \times 45) = 12.9$, whereas the unobservable error is $\varepsilon_i = 108 - 104 = 4$ (since $E\{Y_i\} = 9.5 + 2.1 \times 45 = 104$).

o The residuals are observable and can be used to check the assumptions on the statistical random errors $\varepsilon_i$.
o Points above the line have positive residuals, and points below the line have negative residuals.
o A line that fits the data well has small residuals.
o We want the residuals to be small in magnitude, because large negative residuals are as bad as large positive residuals. Therefore, we cannot simply require $\sum e_i$ to be small.
o In fact, $\sum e_i = 0$ (derivation on board).
o Two immediate solutions: require $\sum |e_i|$ to be small, or require $\sum e_i^2$ to be small.
o We consider the second option because working with squares is mathematically easier than working with absolute values (for example, it is easier to take derivatives).

Find the pair $(b_0, b_1)$ that minimizes $\sum e_i^2$ (least squares estimates)

o We call SSE $= \sum e_i^2$ the residual sum of squares.
o We want to find the pair $(b_0, b_1)$ that minimizes SSE $= \sum e_i^2 = \sum (Y_i - b_0 - b_1 X_i)^2$.
o We set the partial derivatives of SSE with respect to $b_0$ and $b_1$ equal to zero:

$\dfrac{\partial SSE}{\partial b_0} = \sum (-1)(2)(Y_i - b_0 - b_1 X_i) = 0$   =>   Normal equation (1): $\sum (Y_i - b_0 - b_1 X_i) = 0$

$\dfrac{\partial SSE}{\partial b_1} = \sum (-X_i)(2)(Y_i - b_0 - b_1 X_i) = 0$   =>   Normal equation (2): $\sum X_i (Y_i - b_0 - b_1 X_i) = 0$

Then the solution is (derivation on board; a brief algebraic sketch is given below, after the properties of the residuals):

$b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$,   $b_0 = \bar{Y} - b_1 \bar{X}$

Properties of the residuals

o $\sum e_i = 0$, since the regression line goes through the point $(\bar{X}, \bar{Y})$.
o $\sum X_i e_i = 0$ and $\sum \hat{Y}_i e_i = 0$: the residuals are uncorrelated with the independent variable $X_i$ and with the fitted values $\hat{Y}_i$ (proof on board).

Why is $\sum e_i = 0$, $\sum X_i e_i = 0$, and $\sum \hat{Y}_i e_i = 0$? From normal equations (1) and (2) we can immediately tell that $\sum e_i = 0$ and $\sum X_i e_i = 0$. Since $\sum \hat{Y}_i e_i = \sum (b_0 + b_1 X_i) e_i = b_0 \sum e_i + b_1 \sum X_i e_i$, it follows that $\sum \hat{Y}_i e_i = 0$.

o The least squares estimates are uniquely defined as long as the values of the independent variable are not all identical; if they were all identical, the denominator $\sum (X_i - \bar{X})^2$ would equal 0 (draw figure).
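The notes defer the algebra for $b_0$ and $b_1$ to the board; the following is a minimal sketch (standard algebra, not taken from the original notes) of how the two normal equations produce the least squares estimates:

% Sketch: solving the normal equations of simple linear regression
\begin{align*}
\text{From (1):}\quad & \sum_{i=1}^{n}(Y_i - b_0 - b_1 X_i) = 0
  \;\Longrightarrow\; n\bar{Y} - n b_0 - n b_1 \bar{X} = 0
  \;\Longrightarrow\; b_0 = \bar{Y} - b_1 \bar{X}. \\
\text{Substituting into (2):}\quad & \sum_{i=1}^{n} X_i\bigl(Y_i - \bar{Y} - b_1 (X_i - \bar{X})\bigr) = 0 \\
  & \Longrightarrow\; \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = b_1 \sum_{i=1}^{n} (X_i - \bar{X})^2 \\
  & \Longrightarrow\; b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2},
\end{align*}
% where the middle step uses \sum \bar{X}(Y_i - \bar{Y}) = 0 and \sum \bar{X}(X_i - \bar{X}) = 0
% to replace X_i by (X_i - \bar{X}) inside both sums.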
Point Estimation of the Error Term Variance $\sigma^2$

o Single population: the unbiased sample variance estimator of the population variance $\sigma^2$.

  Model: $Y_i = \mu + \varepsilon_i$, i = 1, ..., n, with $E\{\varepsilon_i\} = 0$, $Var\{\varepsilon_i\} = \sigma^2$, $Cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$.

  $s^2 = \dfrac{\sum (Y_i - \bar{Y})^2}{n - 1}$,   $E\{s^2\} = \sigma^2$

o Regression model: the unbiased estimator of the population variance $\sigma^2$.

  Model: $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$, i = 1, ..., n, with $E\{\varepsilon_i\} = 0$, $Var\{\varepsilon_i\} = \sigma^2$, $Cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$.

  $MSE = \dfrac{SSE}{n - 2}$,   $E\{MSE\} = \sigma^2$

Inference in Regression Analysis:

Estimator of $\beta_1$:   $b_1 = \dfrac{SS_{XY}}{SS_{XX}}$
Mean of $b_1$:   $E\{b_1\} = \beta_1$
Variance of $b_1$:   $\sigma^2\{b_1\} = \dfrac{\sigma^2}{SS_{XX}}$
Estimated variance:   $s^2\{b_1\} = \dfrac{MSE}{SS_{XX}}$

t-score and $(1 - \alpha)$ CI for $\beta_1$:

$t^* = \dfrac{b_1 - \beta_1}{s\{b_1\}} \sim t_{n-2}$,   $(1 - \alpha)$ CI:  $b_1 \pm t(1 - \alpha/2;\, n - 2)\, s\{b_1\}$,  where $s^2\{b_1\} = MSE / SS_{XX}$.

Estimating the Mean Value at $X_h$:  $E\{Y_h\} = \beta_0 + \beta_1 X_h$

Estimator:   $\hat{Y}_h = b_0 + b_1 X_h$
Mean:   $E\{\hat{Y}_h\} = E\{Y_h\}$
Variance:   $\sigma^2\{\hat{Y}_h\} = \sigma^2 \left[ \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{SS_{XX}} \right]$
Estimated variance:   $s^2\{\hat{Y}_h\} = MSE \left[ \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{SS_{XX}} \right]$

(A SAS sketch for computing these confidence intervals appears after the SAS output below.)

Analysis of Variance Approach to Regression Analysis

SSTOT = SSR + SSE:

$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$
$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$

Basic Table

Source of Variation   SS                                     df     MS                 E{MS}
Regression            SSR = $\sum (\hat{Y}_i - \bar{Y})^2$    1      MSR = SSR/1        $\sigma^2 + \beta_1^2 \sum (X_i - \bar{X})^2$
Error                 SSE = $\sum (Y_i - \hat{Y}_i)^2$        n-2    MSE = SSE/(n-2)    $\sigma^2$
Total                 SSTOT = $\sum (Y_i - \bar{Y})^2$        n-1

o Test statistic:  $F^* = \dfrac{MSR}{MSE} \sim F(1, n-2)$ under $H_0$
o F distribution with numerator degrees of freedom $df_R = 1$ and denominator degrees of freedom $df_E = n - 2$
o Tail probability:  $P\{F(1, n-2) > F^*\}$
o Hypotheses:  $H_0: \beta_1 = 0$  versus  $H_1: \beta_1 \neq 0$
o Decision rule (for level of significance $\alpha$):  if $F^* \le F(1 - \alpha; 1, n-2)$, conclude $H_0$; if $F^* > F(1 - \alpha; 1, n-2)$, conclude $H_1$.

Fitting a regression in SAS

data Toluca;
   infile 'C:\stat231B06\ch01ta01.txt';
   input lotsize workhrs;
run;

proc reg data=Toluca;   /* least squares estimation of the regression coefficients */
   model workhrs = lotsize;
   output out=results p=yhat r=residual;   /* yhat = fitted values, residual = residual values */
run;

proc print data=results;
   var yhat residual;   /* print the fitted values and residual values */
run;

SAS Output

The REG Procedure
Model: MODEL1
Dependent Variable: workhrs

Number of Observations Read    25
Number of Observations Used    25

                      Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1            252378         252378     105.88    <.0001
Error              23             54825     2383.71562
Corrected Total    24            307203

Root MSE            48.82331    R-Square    0.8215
Dependent Mean     312.28000    Adj R-Sq    0.8138
Coeff Var           15.63447

                      Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              62.36586          26.17743       2.38      0.0259
lotsize       1               3.57020           0.34697      10.29      <.0001

Obs       yhat    residual
  1    347.982      51.018
  2    169.472     -48.472
  3    240.876     -19.876
  4    383.684      -7.684
  5    312.280      48.720
  6    276.578     -52.578
  7    490.790      55.210
  8    347.982       4.018
  9    419.386     -66.386
 10    240.876     -83.876
 11    205.174     -45.174
 12    312.280     -60.280
 13    383.684       5.316
 14    133.770     -20.770
 15    455.088     -20.088
 16    419.386       0.614
 17    169.472      42.528
 18    240.876      27.124
 19    383.684      -6.684
 20    455.088     -34.088
 21    169.472     103.528
 22    383.684      84.316
 23    205.174      38.826
 24    347.982      -5.982
 25    312.280      10.720
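To tie the inference formulas above to the SAS session, the following is a minimal sketch (assuming the same Toluca data set; CLB, CLM, ALPHA=, and the LCLM=/UCLM= output keywords are standard PROC REG syntax, but worth double-checking against your SAS release) that requests the 95% confidence interval for $\beta_1$ and for the mean response $E\{Y_h\}$ at each observed lot size:

data Toluca;
   infile 'C:\stat231B06\ch01ta01.txt';
   input lotsize workhrs;
run;

proc reg data=Toluca;
   /* CLB prints confidence limits for b0 and b1;
      CLM prints confidence limits for the mean response at each observed X */
   model workhrs = lotsize / clb clm alpha=0.05;
   /* LCLM=/UCLM= store the confidence limits for E{Yh} in the output data set */
   output out=ci_results p=yhat lclm=lower_mean uclm=upper_mean;
run;

proc print data=ci_results;
   var lotsize yhat lower_mean upper_mean;
run;

The interval printed for lotsize should agree with the formula $b_1 \pm t(1 - \alpha/2;\, n-2)\, s\{b_1\}$, using the parameter estimate and standard error shown in the output above.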
Extension to multiple regression model

Degrees of freedom: the number of independent components that are needed to calculate the respective sum of squares.

(1) SSTOT $= \sum (y_i - \bar{y})^2$ is the sum of n squared components. However, since $\sum (y_i - \bar{y}) = 0$, only n-1 components are needed for its calculation; the remaining one can always be calculated from $y_n - \bar{y} = -\sum_{i=1}^{n-1} (y_i - \bar{y})$. Hence SSTOT has n-1 degrees of freedom.

(2) SSE $= \sum e_i^2$ is the sum of n squared components. However, there are p restrictions among the residuals, coming from the p normal equations $\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y}$ (with $\mathbf{X}'\mathbf{X}$ of dimension $p \times p$ and $\mathbf{b}$, $\mathbf{X}'\mathbf{Y}$ of dimension $p \times 1$), equivalently $\mathbf{X}'(\mathbf{Y} - \mathbf{X}\mathbf{b}) = \mathbf{X}'\mathbf{e} = \mathbf{0}$. Hence SSE has n-p degrees of freedom.

(3) SSR $= \sum (\hat{y}_i - \bar{y})^2$ is also a sum of n squared components, but all fitted values $\hat{y}_i$ are calculated from the same estimated regression function, which involves only p estimated coefficients, so at most p of the components are free. One further degree of freedom is lost because of the restriction $\sum (\hat{y}_i - \bar{y}) = \sum (\hat{y}_i - y_i + y_i - \bar{y}) = \sum (\hat{y}_i - y_i) + \sum (y_i - \bar{y}) = 0$. Hence SSR has p-1 degrees of freedom.
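As a brief illustration of how the same SAS syntax extends to several predictors, here is a sketch with a hypothetical data set; the file path and the variable names y, x1, x2 are placeholders, not from the original notes:

/* Hypothetical file and variable names, for illustration only */
data mlr_example;
   infile 'C:\stat231B06\mlr_data.txt';
   input y x1 x2;
run;

/* With p - 1 = 2 predictors (p = 3 regression coefficients including the
   intercept), the ANOVA table reports df = p - 1 = 2 for the model and
   df = n - p for error, matching the degrees-of-freedom argument above */
proc reg data=mlr_example;
   model y = x1 x2;
run;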