Panel Data Estimation

Panel data sets combine time series and cross-section data. This expands the number of observations available: with 10 years of data for 10 countries, we have 100 observations. So although there might not be enough observations to estimate the model as a time series or as a cross section alone, there would be enough to estimate it as a panel. Panels are particularly useful with data from less developed countries (LDCs), which are often available only in annual form and start in the 1980s or 1990s.

Panel data can be in either a balanced or an unbalanced format. In a balanced panel there is an observation for every unit in every time period; in an unbalanced panel some observations are missing.

A further benefit is that panel data can overcome the problem of unobserved heterogeneity in a cross-section data set. This occurs where there are unobserved variations in the characteristics of the respondents in a survey-based database.

Unobserved Effect

The standard panel data model is of the following form:

$$y_{it} = \beta_0 + \sum_{j=2}^{k} \beta_j x_{jit} + \sum_{p=1}^{s} \gamma_p z_{pi} + u_{it}$$

Where:
$y$ is the dependent variable
$x_j$ are the observed explanatory variables
$z_p$ are the unobserved explanatory variables

The subscript $i$ refers to the cross-section unit (an individual, a country, etc.), whereas the subscript $t$ refers to the time period (year, month, etc.). We could also have added a trend term to allow for a shift in the intercept over time.

The unobserved variables $z_{pi}$ do not vary over time, and they are unobserved and difficult to measure, so their combined effect can be collected into a single term and the model rewritten as:

$$\delta_i = \sum_{p=1}^{s} \gamma_p z_{pi}$$

$$y_{it} = \beta_0 + \sum_{j=2}^{k} \beta_j x_{jit} + \delta_i + u_{it}$$

Where $\delta_i$ is referred to as the individual-specific unobserved effect. A trend term would account for changes in the intercept over time at a constant rate. If we wish to allow for a variable rate of change, we could instead include a dummy variable for each time period.
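To make the rewritten model concrete, the sketch below generates a small artificial panel from a one-regressor version of it and checks whether the panel is balanced. The function names (`simulate_panel`, `is_balanced`), the parameter values and the normal distributions are all assumptions chosen for illustration; they are not part of the notes.

```python
import random


def simulate_panel(n_units, n_periods, beta0=1.0, beta2=0.5, seed=0):
    """Generate rows (i, t, y, x) from y_it = beta0 + beta2*x_it + delta_i + u_it.

    All parameter values and distributions here are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_units):
        delta_i = rng.gauss(0.0, 1.0)  # individual effect: drawn once, fixed over time
        for t in range(n_periods):
            x = rng.gauss(0.0, 1.0)    # observed explanatory variable
            u = rng.gauss(0.0, 0.1)    # disturbance term
            rows.append((i, t, beta0 + beta2 * x + delta_i + u, x))
    return rows


def is_balanced(rows):
    """True if every unit is observed in every period that appears in the data."""
    units = {i for i, _, _, _ in rows}
    periods = {t for _, t, _, _ in rows}
    observed = {(i, t) for i, t, _, _ in rows}
    return len(observed) == len(units) * len(periods)


panel = simulate_panel(n_units=10, n_periods=10)
print(len(panel), is_balanced(panel))  # 10 countries x 10 years = 100 rows, balanced
print(is_balanced(panel[1:]))          # dropping one observation unbalances the panel
```

Note that `delta_i` is drawn once per unit and reused in every period, which is exactly what distinguishes it from the disturbance `u`.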
If the unobserved effects are not included in the model, the OLS estimates will be inefficient, and they will also be biased if the unobserved effects are correlated with the explanatory variables. There are basically two ways of overcoming this potential problem: a fixed effects model and a random effects model.

Fixed Effects Models

The main way to account for these unobserved fixed effects is to introduce a dummy variable for each individual (minus 1, as usual with dummy variables). The unobserved effect is then treated as the coefficient of the individual-specific dummy variable, which gives the following model:

$$y_{it} = \sum_{j=2}^{k} \beta_j x_{jit} + \sum_{i=1}^{n} \delta_i A_i + u_{it}$$

Where $A_i$ takes the value 1 for an observation relating to individual $i$ and zero otherwise. In this specification we have omitted the intercept for the whole regression and included an intercept for each individual. This is an alternative to omitting one of the individual dummies.

There are two other ways in which fixed effects can be incorporated into the model:
- Within-groups fixed effects, where the variation of the dependent variable about its individual mean is explained in terms of the variations of the explanatory variables about their individual means. Because $\delta_i$ is constant for each individual, it drops out of the transformed model. However, this method also loses any x variables that remain constant for an individual.
- Taking first differences of the variables. Again, x variables that are constant for an individual are lost, but a potential advantage is that differencing could remove a problem of strong first-order autocorrelation in the disturbance term.

Random Effects Model

If the unobserved effects are distributed randomly, we can treat the $\delta_i$ as random variables drawn from a given distribution. This involves subsuming the unobserved effects into the disturbance term to give:

$$y_{it} = \beta_0 + \sum_{j=2}^{k} \beta_j x_{jit} + v_{it}, \qquad v_{it} = \delta_i + u_{it}$$

Apart from assuming that the random effects are drawn randomly from a distribution, we also assume that the unobserved effects are independent of the observed x variables.
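The within-groups and first-differences transformations described under fixed effects can be sketched for a single explanatory variable as follows. This is a minimal illustration, not a full estimator: the helper names, the tiny artificial data set (true slope 2, individual effects of 0, 5 and 10, no disturbance term) and the restriction to one regressor are all assumptions made for the example.

```python
from collections import defaultdict


def slope(xs, ys):
    """OLS slope for a regression through the origin: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def demean(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def group_by_unit(rows):
    """Map each unit i to its (y, x) pairs in time order."""
    groups = defaultdict(list)
    for i, t, y, x in sorted(rows):
        groups[i].append((y, x))
    return groups


def pooled_ols(rows):
    """Ignore the panel structure: one regression on all observations."""
    return slope(demean([r[3] for r in rows]), demean([r[2] for r in rows]))


def within_estimator(rows):
    """Regress deviations from individual means on each other."""
    xs, ys = [], []
    for obs in group_by_unit(rows).values():
        ys += demean([y for y, _ in obs])
        xs += demean([x for _, x in obs])
    return slope(xs, ys)


def first_difference_estimator(rows):
    """Regress period-to-period changes in y on changes in x."""
    xs, ys = [], []
    for obs in group_by_unit(rows).values():
        for (y0, x0), (y1, x1) in zip(obs, obs[1:]):
            ys.append(y1 - y0)
            xs.append(x1 - x0)
    return slope(xs, ys)


# Hypothetical data: y_it = 1 + 2*x_it + delta_i, with delta_i = 5*i correlated with x.
pattern = [0.0, 1.0, 3.0, 2.0]
rows = [(i, t, 1.0 + 2.0 * (2.0 * i + p) + 5.0 * i, 2.0 * i + p)
        for i in range(3) for t, p in enumerate(pattern)]

print(pooled_ols(rows))                  # above 2: the omitted effect biases the slope
print(within_estimator(rows))            # recovers the slope of 2 (up to rounding)
print(first_difference_estimator(rows))  # recovers the slope of 2 (up to rounding)
```

Both transformations remove `delta_i` because it is the same in every period for a given unit. The same arithmetic shows why a time-invariant x variable is lost: its deviations from the individual mean and its first differences are all zero.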
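To determine whether a fixed effects or a random effects model is more appropriate, the Hausman test is used. The null hypothesis is that the unobserved effects are uncorrelated with the explanatory variables, in which case the random effects model is appropriate. Rejecting the null hypothesis points to the fixed effects model.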
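For a single slope coefficient the Hausman statistic reduces to the squared difference between the fixed effects and random effects estimates, divided by the difference of their estimated variances, and it is compared with a chi-squared distribution with one degree of freedom. The sketch below covers only that scalar case. The function name `hausman_scalar` and the input numbers are made up for illustration; in practice the estimates and variances would come from the two fitted models.

```python
import math


def hausman_scalar(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p-value for one coefficient.

    Under the null, H = (b_fe - b_re)^2 / (var_fe - var_re) is chi-squared(1).
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # Can happen in finite samples; the scalar statistic is then undefined.
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    h = (b_fe - b_re) ** 2 / var_diff
    # For one degree of freedom, P(chi2 > h) = P(|Z| > sqrt(h)) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Illustrative (invented) estimates, not results from any real data set.
h, p = hausman_scalar(b_fe=0.50, var_fe=0.0009, b_re=0.44, var_re=0.0005)
print(round(h, 3), round(p, 4))
if p < 0.05:
    print("Reject the null: prefer fixed effects")
else:
    print("Do not reject the null: random effects is acceptable")
```

With several coefficients the same idea applies, using the vector of differences and the inverse of the difference of the two covariance matrices, with degrees of freedom equal to the number of coefficients compared.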