FIXED-EFFECTS REGRESSION MODEL

INTRODUCTION

The objective of most empirical studies in economics is to explain the relationship between a dependent variable, Y, and one or more explanatory variables (X1, X2, …, Xk). We want to know whether Xi has an effect on Y and, if so, the direction and size of that effect. For our answers to these questions to be valid, we must use the sample data to obtain an unbiased estimate of the effect of Xi on Y. To obtain an unbiased estimate, we must control for confounding variables, both observable and unobservable. To control for observable confounding variables, we can use a multiple classical linear regression model. To control for unobservable confounding variables that differ across units but are constant over time, we can use a fixed effects regression model. The fixed effects regression model is an extension of the multiple classical linear regression model. However, to use a fixed effects regression model, we must have panel (longitudinal) data.

Panel (Longitudinal) Data

Panel (longitudinal) data is data on two or more units for two or more time periods. The units in each time period are the same.

SPECIFICATION OF FIXED EFFECTS REGRESSION MODEL

Suppose we have an economic relationship that involves a dependent variable, Y, two observable explanatory variables, X1 and X2, and one or more unobservable confounding variables. We have panel data for Y, X1, and X2. The panel data consists of N units and T time periods, and therefore we have N times T observations. The classical linear regression model without the intercept is

Yit = β1Xit1 + β2Xit2 + μit for i = 1, 2, …, N and t = 1, 2, …, T

where Yit is the value of Y for the ith unit for the tth time period; Xit1 is the value of X1 for the ith unit for the tth time period; Xit2 is the value of X2 for the ith unit for the tth time period; and μit is the error for the ith unit for the tth time period.
The fixed effects regression model, which is an extension of the classical linear regression model, is

Yit = β1Xit1 + β2Xit2 + νi + εit

where μit = νi + εit. We have decomposed the error term for the classical linear regression model into two components. The component νi represents all unobserved factors that vary across units but are constant over time. The component εit represents all unobserved factors that vary across units and time. We assume that the net effect on Y of unobservable factors for the ith unit that are constant over time is a fixed parameter, designated αi. Therefore, we can rewrite the fixed effects model as

Yit = β1Xit1 + β2Xit2 + αi + εit for i = 1, 2, …, N

We have replaced the unobserved error component νi with a set of fixed parameters, α1, α2, …, αN, one for each of the N units in the sample. For example, α1 represents the net effect on Y of unobservable factors that are constant over time for unit one. These N fixed parameters control for the net effects of all unobservable factors that differ across units but are constant over time. Intuitively, we are using each unit as a control for itself. This works because variation in Y over time within a unit cannot be explained by factors that vary across units but don’t vary over time.

ESTIMATION

Two alternative but equivalent estimators can be used to estimate the parameters of the fixed effects regression model.

1) Least squares dummy variable estimator.
2) Fixed effects estimator.

Least Squares Dummy Variable Estimator

The least squares dummy variable estimator involves two steps. In step #1, we create a dummy variable for each of the N units in the study. These N dummy variables are defined as follows.

Dkit = 1 if k = i
Dkit = 0 if k ≠ i

In step #2, we run a regression of the dependent variable on the N dummy variables and the explanatory variables, X1 and X2, using the OLS estimator. That is, we estimate the following linear regression model using the OLS estimator.
Yit = β1Xit1 + β2Xit2 + α1D1it + α2D2it + … + αNDNit + εit

We obtain estimates of the N fixed-effects constant parameters and the two slope parameters. The estimates of the constant and slope parameters are unbiased in small samples. The estimates of the slope parameters are consistent in large samples with a fixed T as N → ∞. However, the estimates of the constant parameters are not consistent with a fixed T as N → ∞. This is because each additional unit adds a new parameter, so each constant parameter is still estimated from only T observations. In general, the larger T, the better the estimates of the constant parameters. Because of this, when T is small many researchers view the intercept parameters as controls and ignore the actual estimates.

Fixed Effects Estimator

When N is large, using the least squares dummy variable estimator is cumbersome or impossible. In this case, we use the fixed effects estimator. The fixed effects estimator involves two steps. In step #1, we transform the original data to time-demeaned data. This is called the within transformation. This transformation for each variable is given as follows,

yit = Yit - YiBar
xit = Xit - XiBar
eit = εit - εiBar

where YiBar is the average value of Y for the ith unit over the T time periods; XiBar is the average value of X for the ith unit over the T time periods; εiBar is the average value of ε for the ith unit over the T time periods; Yit, Xit, and εit are the actual values; and yit, xit, and eit are the deviations from the time means. The transformation is applied separately to each explanatory variable, X1 and X2. Because αi is constant over time, its time mean is αi itself, so the transformation removes it from the equation.

In step #2, we run a regression of yit on xit1 and xit2 using the OLS estimator. That is, we estimate the following equation using OLS,

yit = β1xit1 + β2xit2 + eit

Note that this regression does not include any constant terms. If you want estimates of the N fixed-effects parameters, you can recover them by using the following formula,

αi^ = YiBar - β1^ Xi1Bar - β2^ Xi2Bar for i = 1, 2, …, N

where β1^ and β2^ are the slope estimates from step #2, and Xi1Bar and Xi2Bar are the time means of X1 and X2 for the ith unit.

The fixed effects estimator yields exactly the same estimates as the least squares dummy variable estimator.
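The equivalence of the two estimators can be checked numerically. The following is a minimal sketch, not part of the original notes: it uses NumPy and a simulated balanced panel, and the panel size, slope values, error scale, and all variable names are assumptions chosen for illustration only. It runs the least squares dummy variable regression and the within (time-demeaned) regression on the same data, then recovers the fixed-effects parameters from the unit means.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the sketch is reproducible
N, T = 6, 5                               # hypothetical panel: 6 units, 5 periods
unit = np.repeat(np.arange(N), T)         # unit index i for each of the N*T rows

alpha = rng.normal(size=N)                # made-up unit-specific fixed effects
# Regressors are shifted by alpha_i so they are correlated with the fixed effect
X = rng.normal(size=(N * T, 2)) + alpha[unit, None]
beta = np.array([1.5, -0.7])              # made-up true slopes
Y = X @ beta + alpha[unit] + rng.normal(scale=0.5, size=N * T)

# Step 1 of LSDV: one dummy per unit, D[row, k] = 1 if the row belongs to unit k
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)

# Step 2 of LSDV: OLS of Y on X1, X2 and the N dummies (no separate intercept)
coef = np.linalg.lstsq(np.hstack([X, D]), Y, rcond=None)[0]
beta_lsdv, alpha_lsdv = coef[:2], coef[2:]

# Step 1 of the fixed effects estimator: within transformation (subtract unit means)
Y_bar = D.T @ Y / T                       # shape (N,): time mean of Y per unit
X_bar = D.T @ X / T                       # shape (N, 2): time means of X1, X2 per unit
y = Y - Y_bar[unit]
x = X - X_bar[unit]

# Step 2: OLS of demeaned y on demeaned x, with no constant term
beta_fe = np.linalg.lstsq(x, y, rcond=None)[0]

# Recover the fixed effects: alpha_i = YiBar - beta1 * Xi1Bar - beta2 * Xi2Bar
alpha_fe = Y_bar - X_bar @ beta_fe

print("slopes agree:       ", np.allclose(beta_lsdv, beta_fe))
print("fixed effects agree:", np.allclose(alpha_lsdv, alpha_fe))
```

Both comparisons should hold up to floating-point error. Note that the sketch only compares point estimates. If the second-step regression is run by hand in this way, the standard errors reported by a generic OLS routine would not account for the N estimated unit means, so they would need a degrees-of-freedom adjustment.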
The fixed effects estimator also has the same properties as the least squares dummy variable estimator.

Observed Variables That Vary Across Units and Are Constant Over Time

The fixed-effects parameters, αi, capture the net effects of all variables, both observable and unobservable, that differ across units but are constant over time. Therefore, in the fixed-effects model we can’t include any observable variable that differs across units but is constant over time. If we do, then we have perfect multicollinearity, and we can’t obtain estimates of the parameters. In the least squares dummy variable regression, such a variable is an exact linear combination of the N dummy variables. Under the within transformation, it equals its own time mean, so its time-demeaned value is zero for every observation.

HYPOTHESIS TESTING

Same as for the multiple classical linear regression model (MCLRM).

PREDICTION AND GOODNESS OF FIT

Same as for the MCLRM.
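As a numerical illustration of the earlier point about observed variables that are constant over time, the sketch below builds a small balanced panel with one time-invariant variable. It is an illustrative assumption, not part of the original notes: the variable Z, its values, and the panel size are hypothetical. It shows that Z adds no rank to the dummy-variable design matrix and that the within transformation turns it into a column of zeros.

```python
import numpy as np

N, T = 4, 3                               # hypothetical panel: 4 units, 3 periods
unit = np.repeat(np.arange(N), T)         # unit index for each of the N*T rows
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)   # unit dummies

# Hypothetical observed variable that differs across units but not over time
Z_unit = np.array([2.0, 5.0, 1.0, 7.0])   # one value per unit
Z = Z_unit[unit]                          # repeated for every period of the unit

# Z is an exact linear combination of the dummies, with weights Z_unit
print("Z equals D @ Z_unit:", np.allclose(D @ Z_unit, Z))

# Adding Z to the N dummies gives N + 1 columns but only rank N
design = np.column_stack([Z, D])
print("columns:", design.shape[1], "rank:", np.linalg.matrix_rank(design))

# Within transformation: subtracting the unit mean leaves only zeros
Z_bar = D.T @ Z / T
z = Z - Z_bar[unit]
print("demeaned Z is all zeros:", np.allclose(z, 0.0))
```

Because the design matrix is rank deficient, OLS has no unique solution for the coefficient on Z, which is the perfect multicollinearity problem described in that section.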