Research Method Lecture 2 (Ch 3): Multiple linear regression

Model with k independent variables

y = β0 + β1x1 + β2x2 + … + βkxk + u

β0 is the intercept, and βj for j = 1,…,k are the slope parameters.

Mechanics of OLS

Variable labels. Suppose you have n observations. Then your data look like this:

Obs id   y     x1     x2     …    xk
1        y1    x11    x12    …    x1k
2        y2    x21    x22    …    x2k
:        :     :      :           :
n        yn    xn1    xn2    …    xnk

The OLS estimates of the parameters are chosen to minimize the sum of squared residuals. That is, you minimize Q, given below, by choosing the betas:

Q = Σi ûi² = Σi (yi - β̂0 - β̂1xi1 - β̂2xi2 - … - β̂kxik)²,   where the sums run over i = 1,…,n.

This is achieved by taking the partial derivatives of Q with respect to the betas and setting them equal to zero.

The first order conditions (FOCs)

∂Q/∂β̂0 = -2 Σi (yi - β̂0 - β̂1xi1 - … - β̂kxik) = 0
∂Q/∂β̂1 = -2 Σi xi1(yi - β̂0 - β̂1xi1 - … - β̂kxik) = 0
…
∂Q/∂β̂k = -2 Σi xik(yi - β̂0 - β̂1xi1 - … - β̂kxik) = 0

You solve these k+1 equations for the betas. The solutions are the OLS estimators of the coefficients.

The most common method for solving the FOCs is matrix notation. We will use that method later. For our purpose, a more useful representation of the estimators is given next.

The OLS estimators

The slope estimators have the following representation. The jth slope estimator (excluding the intercept) is given by

β̂j = (Σi r̂ij yi) / (Σi r̂ij²)

where r̂ij is the OLS residual from the regression in which xj is regressed on all the other explanatory variables, that is, from

xij = α̂0 + α̂1xi1 + … + α̂j-1xi,j-1 + α̂j+1xi,j+1 + … + α̂k xik + r̂ij   (all the explanatory variables except xj)

Proof: See the front board.
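This partialling-out representation can be checked numerically. Below is a minimal sketch (not part of the lecture) using simulated data and NumPy's least-squares routine; the data, seed, and variable names are illustrative assumptions. The slope on x2 from the full regression should match the ratio Σ r̂i2 yi / Σ r̂i2², where r̂i2 is the residual from regressing x2 on a constant and x1.

```python
import numpy as np

# Simulated data (illustrative only): n observations, two regressors
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)        # x2 correlated with x1
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

# Full OLS: regress y on a constant, x1, and x2
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Partialling out: regress x2 on a constant and x1, keep the residuals
X_other = np.column_stack([np.ones(n), x1])
gamma_hat, *_ = np.linalg.lstsq(X_other, x2, rcond=None)
r2 = x2 - X_other @ gamma_hat             # r̂i2 in the notation above

# Partialling-out representation of the slope on x2
beta2_partial = (r2 @ y) / (r2 @ r2)

print("slope on x2, full regression :", beta_hat[2])
print("slope on x2, partialling out :", beta2_partial)   # the two should agree
```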
Unbiasedness of OLS

Now we introduce a series of assumptions to show the unbiasedness of OLS.

Assumption MLR.1 (Linear in parameters): The population model can be written as

y = β0 + β1x1 + β2x2 + … + βkxk + u

Assumption MLR.2 (Random sampling): We have a random sample of n observations {(xi1, xi2,…, xik, yi)}, i = 1,…,n, following the population model.

MLR.2 implies the following:
MLR.2a: yi, i = 1,…,n are iid.
MLR.2b: xi1, i = 1,…,n are iid; … ; xik, i = 1,…,n are iid.
MLR.2c: Variables from different observations are independent of each other (in the data table above, independence is across rows).
MLR.2d: ui, i = 1,…,n are iid.

Assumption MLR.3 (No perfect collinearity): In the sample (and in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

Assumption MLR.4 (Zero conditional mean): E(u|x1, x2,…, xk) = 0.

Combining MLR.2 and MLR.4, we have the following:
MLR.4a: E(ui|xi1, xi2,…, xik) = 0 for i = 1,…,n.
MLR.4b: E(ui|x11, x12,…, x1k, x21, x22,…, x2k, …, xn1, xn2,…, xnk) = 0 for i = 1,…,n. We usually write this as E(ui|X) = 0.
MLR.4b means that, conditional on all the data, the expected value of ui is zero.

Unbiasedness of the OLS parameters

Theorem 3.1: Under Assumptions MLR.1 through MLR.4, E(β̂j) = βj for j = 0,1,…,k.
Proof: See the front board.

Omitted variable bias

Suppose that the following population model satisfies MLR.1 through MLR.4:

y = β0 + β1x1 + β2x2 + u   -----(1)

But further suppose that you instead estimate the following model, which omits x2, perhaps because of a simple mistake, or perhaps because x2 is not available in your data:

y = β0 + β1x1 + v   -----(2)

Then the OLS estimates of (1) and the OLS estimate of (2) have the following relationship:

β̃1 = β̂1 + β̂2δ̃1

where β̂1 and β̂2 are the OLS estimates from (1), β̃1 is the OLS estimate from (2), and δ̃1 is the OLS estimate of δ1 in the following model:

x2 = δ0 + δ1x1 + e

The proof will be given later for the general case.

So we have

E(β̃1|X) = β1 + β2δ̃1

So, unless β2 = 0 or δ̃1 = 0, the estimate from equation (2), β̃1, is biased. Notice that δ̃1 > 0 if cov(x1, x2) > 0 and vice versa, so we can predict the direction of the bias in the following way.

Summary of bias

            δ̃1 > 0 (i.e., cov(x1,x2) > 0)     δ̃1 < 0 (i.e., cov(x1,x2) < 0)
β2 > 0      Positive bias (upward bias)        Negative bias (downward bias)
β2 < 0      Negative bias (downward bias)      Positive bias (upward bias)
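The relationship β̃1 = β̂1 + β̂2δ̃1 holds as an algebraic identity in any sample, so it can be verified directly. Here is a minimal sketch with simulated data; the data-generating values, seed, and the helper function ols are illustrative assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)                    # cov(x1, x2) > 0
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)    # beta2 > 0

def ols(X, y):
    """Return OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(n)
b_long = ols(np.column_stack([ones, x1, x2]), y)   # model (1): includes x2
b_short = ols(np.column_stack([ones, x1]), y)      # model (2): omits x2
d = ols(np.column_stack([ones, x1]), x2)           # x2 = d0 + d1*x1 + e

print("beta1_tilde from short model    :", b_short[1])
print("beta1_hat + beta2_hat * delta1  :", b_long[1] + b_long[2] * d[1])
# With beta2 > 0 and delta1 > 0, beta1_tilde is biased upward relative to beta1 = 2.
```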
Question

Suppose the population model (satisfying MLR.1 through MLR.4) is given by

(Crop yield) = β0 + β1(fertilizer) + β2(land quality) + u   -----(1)

But your data do not have a land quality variable, so you estimate the following:

(Crop yield) = β0 + β1(fertilizer) + v   -----(2)

Consider the following two scenarios.

Scenario 1: On the farm where the data were collected, farmers used more fertilizer on pieces of land where land quality is better.
Scenario 2: On the farm where the data were collected, scientists randomly assigned different quantities of fertilizer to different pieces of land, irrespective of land quality.

Question 1: In which scenario do you expect to get an unbiased estimate?
Question 2: If the estimate under one of the above scenarios is biased, predict the direction of the bias.

Omitted variable bias, more general case

Suppose the population model (which satisfies MLR.1 through MLR.3) is given by

y = β0 + β1x1 + β2x2 + … + βk-1xk-1 + βkxk + u   -----(1)

But you estimate a model that omits xk:

y = β0 + β1x1 + β2x2 + … + βk-1xk-1 + v   -----(2)

Then we have the following:

β̃j = β̂j + β̂kδ̃j

where β̂j and β̂k are the OLS estimates from (1), and β̃j is the OLS estimate from (2). And δ̃j is the OLS estimate of δj in the following regression:

xk = δ0 + δ1x1 + … + δk-1xk-1 + e

In general, it is difficult to predict the direction of the bias in this general case. However, an approximation is often useful: δ̃j is likely to be positive if the correlation between xj and xk is positive. Using this, you can predict the "approximate" direction of the bias.

Endogeneity

Consider the following model:

y = β0 + β1x1 + β2x2 + … + βk-1xk-1 + βkxk + u

A variable xj is said to be endogenous if xj and u are correlated. This causes a bias in the estimate of βj and, in certain cases, in the estimates for other variables as well. One reason why endogeneity occurs is the omitted variable problem described in the previous slides.

Variance of the OLS estimators

First, we introduce one more assumption.

Assumption MLR.5 (Homoskedasticity): Var(u|x1, x2,…, xk) = σ². This means that the variance of u does not depend on the values of the independent variables.

Combining MLR.5 with MLR.2, we also have

MLR.5a: Var(ui|X) = σ² for i = 1,…,n

where X denotes all the independent variables for all the observations, that is, x11, x12,…, x1k, x21, x22,…, x2k, …, xn1, xn2,…, xnk.

Sampling variance of the OLS slope estimators

Theorem 3.2: Under Assumptions MLR.1 through MLR.5, we have

Var(β̂j|X) = σ² / [SSTj(1 - Rj²)]   for j = 1,…,k

where

SSTj = Σi (xij - x̄j)²   (the sum runs over i = 1,…,n, and x̄j is the sample mean of xj)

and Rj² is the R-squared from regressing xj on all the other independent variables, that is, the R-squared from the regression

xj = α0 + α1x1 + … + αj-1xj-1 + αj+1xj+1 + … + αk xk + e   (all the x variables except xj)

Proof: See the front board.

The standard deviation of the OLS slope estimators is the square root of the variance:

sd(β̂j) = √Var(β̂j|X) = σ / √(SSTj(1 - Rj²))   for j = 1,…,k

The estimator of σ²

In Theorem 3.2, σ² is unknown and has to be estimated. The estimator is given by

σ̂² = [1/(n - k - 1)] Σi ûi²   (the sum runs over i = 1,…,n)

n - k - 1 comes from (# of observations) - (# of parameters estimated, including the intercept). This is called the degrees of freedom.

Theorem 3.3 (Unbiased estimator of σ²): Under MLR.1 through MLR.5, E(σ̂²) = σ².
Proof: See the front board.

Estimates of the variance and the standard errors of the OLS slope parameters

We replace σ² in Theorem 3.2 with σ̂² to get the estimate of the variance of the OLS parameters:

V̂ar(β̂j|X) = σ̂² / [SSTj(1 - Rj²)]

Note the hat (^), indicating that this is an estimate. The standard error of the OLS estimate is the square root of the above; it is the estimated standard deviation of the slope parameter:

se(β̂j) = √(σ̂² / [SSTj(1 - Rj²)]) = σ̂ / √(SSTj(1 - Rj²))

Multicollinearity

Var(β̂j|X) = σ² / [SSTj(1 - Rj²)]

• If xj is highly correlated with the other independent variables, Rj² gets close to 1. This in turn means that the variance of β̂j gets large. This is the problem of multicollinearity.
• In the extreme case where xj is perfectly linearly related to the other explanatory variables, Rj² equals 1. In this case, you cannot estimate the betas at all. However, this case is ruled out by MLR.3.
• Note that multicollinearity does not violate any of the OLS assumptions (except in the perfect-collinearity case) and should not be over-emphasized. You can reduce the variance by increasing the number of observations.

Gauss-Markov theorem

Theorem 3.4: Under Assumptions MLR.1 through MLR.5, the OLS estimators of the beta parameters are the best linear unbiased estimators (BLUE). This theorem means that, among all linear unbiased estimators of the beta parameters, the OLS estimators have the smallest variance.
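To tie the variance formulas above together, here is a hedged numerical sketch that computes se(β̂1) from σ̂², SST1, and R1² as in Theorem 3.2, and checks it against the standard matrix formula σ̂²(X'X)⁻¹. The simulated data, seed, and variable names are illustrative assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1] - 1                                   # number of slope parameters

# OLS fit and sigma^2 estimate with n - k - 1 degrees of freedom
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)

# Ingredients of the formula for j = 1: SST_1 and R_1^2
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])                # regress x1 on the other regressor(s)
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid = x1 - Z @ g
r1_sq = 1.0 - (resid @ resid) / sst1

se_formula = np.sqrt(sigma2_hat / (sst1 * (1.0 - r1_sq)))

# Matrix-based standard error for comparison: sqrt of diag of sigma2_hat * (X'X)^-1
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se_matrix = np.sqrt(np.diag(cov_beta))[1]

print("se(beta1_hat), formula :", se_formula)
print("se(beta1_hat), matrix  :", se_matrix)         # the two should agree
```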