GENERAL LINEAR REGRESSION MODEL

INTRODUCTION

The general linear regression model is a statistical model that describes a data generation process. It is a generalization of the classical linear regression model: you obtain the general linear regression model from the classical linear regression model by changing one assumption, namely that the disturbances are nonspherical rather than spherical. Because of this, the general linear regression model can be used to describe data generation processes characterized by heteroscedasticity and autocorrelation.

SPECIFICATION

The specification of the general linear regression model is defined by the following set of assumptions.

Assumptions

1. The functional form is linear in parameters: y = Xβ + μ
2. The error term has mean zero: E(μ) = 0
3. The errors are nonspherical: Cov(μ) = E(μμᵀ) = W, where W is any nonsingular TxT variance-covariance matrix of disturbances.
4. The error term has a normal distribution: μ ~ N(0, W)
5. The error term is uncorrelated with each independent variable: Cov(μ, X) = 0

Sources of Nonspherical Errors

There are two major sources of nonspherical errors.

1. The error term does not have constant variance. This is called heteroscedasticity. In this case, the disturbances are drawn from probability distributions that have different variances. This often occurs when using cross-section data. When the error term has nonconstant variance, the variance-covariance matrix of disturbances is not given by a constant times the identity matrix (i.e., W ≠ σ²I). This is because the elements on the principal diagonal of W, which are the variances of the distributions from which the disturbances are drawn, are not a constant σ² but have different values.

2. The errors are correlated. This is called autocorrelation or serial correlation. In this case, the disturbances are correlated with one another. This often occurs when using time-series data. When the disturbances are correlated, the variance-covariance matrix of disturbances is not given by a constant times the identity matrix (i.e., W ≠ σ²I). This is because the elements off the principal diagonal of W, which are the covariances of the disturbances, are nonzero.

Classical Linear Regression Model as a Special Case of the General Linear Regression Model

If the error term has constant variance and the errors are uncorrelated, then W = σ²I and the general linear regression model reduces to the classical linear regression model.

General Linear Regression Model Concisely Stated in Matrix Format

The sample of T multivariate observations (Yt, Xt1, Xt2, …, Xtk) is generated by a process described as follows.

y = Xβ + μ,  μ ~ N(0, W)

or alternatively

y ~ N(Xβ, W)

ESTIMATION

Choosing an Estimator

To obtain estimates of the parameters of the model, you need to choose an estimator. We will consider the following three estimators.

1. Ordinary least squares (OLS) estimator
2. Generalized least squares (GLS) estimator
3. Feasible generalized least squares (FGLS) estimator

Ordinary Least Squares (OLS) Estimator

To obtain estimates of the parameters of the general linear regression model, you can apply the OLS estimator to the sample data. The OLS estimator is given by the rule:

β^ = (XᵀX)⁻¹Xᵀy

The conventional variance-covariance matrix of estimates for the OLS estimator (the formula that assumes spherical errors) is

Cov(β^) = σ²(XᵀX)⁻¹
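As a concrete illustration, here is a minimal numpy sketch of the OLS rule above. The simulated data, seed, and variable names are illustrative assumptions, not part of the model.

```python
import numpy as np

# Illustrative simulated data (assumption): T = 100 observations, three
# regressors including a constant column.
rng = np.random.default_rng(0)
T = 100
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
beta_true = np.array([1.0, 2.0, -0.5])
mu = rng.normal(size=T)                      # spherical here only for simplicity
y = X @ beta_true + mu

# OLS rule: beta_hat = (X'X)^{-1} X'y, computed with a linear solve rather
# than an explicit matrix inverse for numerical stability.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Conventional OLS covariance matrix: sigma^2 (X'X)^{-1}, using the usual
# degrees-of-freedom-corrected estimate of sigma^2.
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (T - X.shape[1])
cov_beta_hat = sigma2_hat * np.linalg.inv(X.T @ X)

print(beta_hat)
print(np.sqrt(np.diag(cov_beta_hat)))        # conventional standard errors
```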
Properties of the OLS Estimator

If the sample data are generated by the general linear regression model, then the OLS estimator has the following properties.

1. The OLS estimator is unbiased.
2. The OLS estimator is inefficient.
3. The OLS estimator is not the maximum likelihood estimator.
4. The variance-covariance matrix of estimates is incorrect, and therefore the estimates of the standard errors are biased and inconsistent.
5. Hypothesis tests are not valid.

Property 2 means that in the class of linear unbiased estimators, the OLS estimator does not have minimum variance. Thus, an alternative estimator exists that will yield more precise estimates.

Generalized Least Squares (GLS) Estimator

The GLS estimator is given by the rule:

β^GLS = (XᵀW⁻¹X)⁻¹XᵀW⁻¹y

The variance-covariance matrix of estimates for the GLS estimator is

Cov(β^GLS) = (XᵀW⁻¹X)⁻¹

Properties of the GLS Estimator

If the sample data are generated by the general linear regression model, then the GLS estimator has the following properties.

1. The GLS estimator is unbiased.
2. The GLS estimator is efficient.
3. The GLS estimator is the maximum likelihood estimator.
4. The variance-covariance matrix of estimates is correct, and therefore the estimates of the standard errors are unbiased and consistent.
5. Hypothesis tests are valid.

If the sample data are generated by the general linear regression model, then the GLS estimator is the best linear unbiased estimator (BLUE) of the population parameters. The reason the GLS estimator is more precise than the OLS estimator is that the OLS estimator wastes information. That is, the OLS estimator does not use the information contained in W about heteroscedasticity and/or autocorrelation, while the GLS estimator does.

Major Shortcoming of the GLS Estimator

To actually use the GLS estimator, you must know the elements of the variance-covariance matrix of disturbances, W. That means you must know the true values of the variances and covariances of the disturbances. However, since you never know the true elements of W, you cannot actually use the GLS estimator, and therefore the GLS estimator is not a feasible estimator.

Feasible Generalized Least Squares (FGLS) Estimator

To make the GLS estimator a feasible estimator, you can use the sample of data to obtain an estimate of W. When you replace the true W with its estimate W^, you get the FGLS estimator. The FGLS estimator is given by the rule:

β^FGLS = (XᵀW^⁻¹X)⁻¹XᵀW^⁻¹y

The variance-covariance matrix of estimates for the FGLS estimator is

Cov(β^FGLS) = (XᵀW^⁻¹X)⁻¹

FGLS Estimator as a Weighted Least Squares Estimator

The FGLS estimator is also a weighted least squares estimator. The weighted least squares estimator is derived as follows. Find a TxT transformation matrix P such that the transformed error μ* = Pμ has variance-covariance matrix Cov(μ*) = E(μ*μ*ᵀ) = σ²I; that is, P must satisfy PWPᵀ = σ²I. This transforms the original error term μ, which is nonspherical, into a new error term that is spherical. Use the matrix P to derive a transformed model:

Py = PXβ + Pμ, or y* = X*β + μ*, where y* = Py, X* = PX, μ* = Pμ.

The transformed model satisfies all of the assumptions of the classical linear regression model. The FGLS estimator is the OLS estimator applied to the transformed model, with P constructed from W^. Note that the transformed model is a computational device only. We use it to obtain efficient estimates of the parameters and standard errors of the original model of interest.
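The following numpy sketch illustrates both the GLS/FGLS formula and its weighted least squares interpretation. It assumes an estimate W^ (called W_hat in the code) is already available; obtaining W^ is taken up in the sections that follow. The transformation matrix P is taken here to be the inverse of the Cholesky factor of W_hat, one standard choice that makes the transformed errors spherical (with the σ² scale normalized to 1 and absorbed into W_hat).

```python
import numpy as np

def fgls(y, X, W_hat):
    """FGLS estimate computed two equivalent ways (a sketch; W_hat is assumed given).

    Direct formula:  beta = (X' W^{-1} X)^{-1} X' W^{-1} y
    WLS formula:     OLS applied to the transformed model Py = PX b + Pu,
                     where W_hat = L L' (Cholesky) and P = L^{-1}, so that
                     Cov(Pu) = P W_hat P' = I.
    """
    # Direct formula
    W_inv = np.linalg.inv(W_hat)
    A = X.T @ W_inv @ X
    beta_direct = np.linalg.solve(A, X.T @ W_inv @ y)

    # Equivalent transformed-model (weighted least squares) computation
    L = np.linalg.cholesky(W_hat)
    P = np.linalg.inv(L)
    y_star, X_star = P @ y, P @ X
    beta_wls = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)

    cov_beta = np.linalg.inv(A)              # (X' W^{-1} X)^{-1}
    return beta_direct, beta_wls, cov_beta

# Tiny illustrative check with a diagonal (heteroscedastic) W_hat:
rng = np.random.default_rng(1)
T = 50
X = np.column_stack([np.ones(T), rng.normal(size=T)])
sig2 = np.exp(rng.normal(size=T))            # assumed known here, for illustration
y = X @ np.array([1.0, 2.0]) + rng.normal(size=T) * np.sqrt(sig2)
b_direct, b_wls, _ = fgls(y, X, np.diag(sig2))
assert np.allclose(b_direct, b_wls)
```

The two computations agree up to rounding error, which is the sense in which the FGLS estimator is simply OLS applied to the transformed model.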
Major Problem with Using the FGLS Estimator

A major problem with using the FGLS estimator is that to estimate W you must obtain an estimate of each element in W (i.e., each variance and covariance). The matrix W is a TxT matrix and therefore contains T² elements. Because it is a symmetric matrix, ½T(T + 1) of these elements are distinct. Thus, if you have a sample size of T = 100, then you must use these 100 observations to obtain estimates of 5,050 different variances and covariances. You cannot obtain this many estimates with 100 observations because you do not have enough degrees of freedom.

Resolving the Degrees of Freedom Problem

To circumvent the degrees of freedom problem and obtain estimates of the variances and covariances in W, you must specify a model that describes what you believe is the nature of the heteroscedasticity and/or autocorrelation. You can then use the sample data to estimate the parameters of your model of heteroscedasticity and/or autocorrelation, and use these parameter estimates to obtain estimates of the variances and covariances in W. Two sketches of this two-step approach are given after the lists below.

Some often used models of heteroscedasticity are the following.

1. Assume that the error variance is a linear function of the explanatory variables.
2. Assume that the error variance is an exponential function of the explanatory variables.
3. Assume that the error variance is a polynomial function of the explanatory variables.

Some often used models of autocorrelation are the following.

1. First-order autoregressive process
2. Second-order autoregressive process
3. Higher-order autoregressive process
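First, a sketch of the two-step approach under an assumed linear model of heteroscedasticity: estimate the original equation by OLS, regress the squared residuals on the explanatory variables to estimate the α parameters, and use the fitted values as estimates of the error variances on the diagonal of W^. The clipping of the fitted variances away from zero is a practical guard added here, not part of the model.

```python
import numpy as np

def w_hat_linear_hetero(y, X):
    """Estimate a diagonal W^ under linear heteroscedasticity:
    sigma_t^2 = alpha_1 + alpha_2*Xt2 + ...  (a sketch)."""
    # Step 1: OLS on the original model to obtain residuals.
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ beta_ols

    # Step 2: auxiliary regression of squared residuals on the regressors
    # (X is assumed to already contain a constant column).
    alpha = np.linalg.solve(X.T @ X, X.T @ e**2)
    sigma2_hat = X @ alpha

    # The linear variance model can produce non-positive fitted values;
    # clip them to a small positive number so W^ stays positive definite.
    sigma2_hat = np.clip(sigma2_hat, 1e-8, None)
    return np.diag(sigma2_hat)
```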
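Second, a sketch for a first-order autoregressive error process, μt = ρμt-1 + vt. Under AR(1), the (i, j) element of W is σv²ρ^|i-j|/(1 - ρ²); here ρ is estimated by regressing the OLS residuals on their own lag. This is one simple way to build W^, not the only one.

```python
import numpy as np

def w_hat_ar1(y, X):
    """Estimate W^ assuming AR(1) disturbances: u_t = rho*u_{t-1} + v_t (a sketch)."""
    T = len(y)

    # Step 1: OLS residuals from the original model.
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ beta_ols

    # Step 2: estimate rho by regressing e_t on e_{t-1}.
    rho = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

    # Step 3: estimate the innovation variance and build W^ element by element:
    # W[i, j] = sigma_v^2 * rho**|i-j| / (1 - rho**2).
    v = e[1:] - rho * e[:-1]
    sigma_v2 = v @ v / (T - 1)
    lags = np.abs(np.arange(T)[:, None] - np.arange(T)[None, :])
    return sigma_v2 * rho**lags / (1.0 - rho**2)
```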
Properties of the FGLS Estimator

If the sample data are generated by the general linear regression model, then the FGLS estimator has the following properties. The FGLS estimator may or may not be unbiased in small samples. However, if W^ is a consistent estimator of W, then the FGLS estimator is asymptotically unbiased, efficient, and consistent. In this case, Monte Carlo studies have shown that the FGLS estimator generally yields better estimates than the OLS estimator.

Caveat

For W^ to be a consistent estimator of W, your model of heteroscedasticity or autocorrelation must be a reasonable approximation of the true, unknown heteroscedasticity or autocorrelation. If it is not, then the FGLS estimator will not have desirable small or large sample properties.

HYPOTHESIS TESTING

The following statistical tests can be used to test hypotheses in the general linear regression model.

1. t-test
2. F-test
3. Likelihood ratio test
4. Wald test
5. Lagrange multiplier test

GOODNESS-OF-FIT

It is somewhat more difficult to measure the goodness-of-fit of the model when the sample data are generated by the general linear regression model. The FGLS estimator is simply the OLS estimator applied to a transformed regression that purges the heteroscedasticity and/or autocorrelation. Many economists use as their measure of goodness-of-fit the R² statistic from the transformed regression. However, the transformed regression is simply a computational device, not the original model of interest. The fact that you have a good or bad fit for the transformed regression may be of no interest.

HETEROSCEDASTICITY AND THE GENERAL LINEAR REGRESSION MODEL

Consider the following general linear regression model with heteroscedasticity.

Yt = β1 + β2Xt2 + β3Xt3 + μt, where var(μt) = E(μt²) = σt²

The t subscript attached to σ² indicates that the error for each unit in the sample is drawn from a probability distribution with a different variance.

Models of Heteroscedasticity

It is often assumed that var(μt) is either a linear or an exponential function of the explanatory variables. These two alternative models of heteroscedasticity can be written as follows.

Linear heteroscedasticity: σt² = α1 + α2Xt2 + α3Xt3
Exponential heteroscedasticity: ln(σt²) = α1 + α2Xt2 + α3Xt3

The model of exponential heteroscedasticity is written in log-linear form.

Testing for Heteroscedasticity

There are four alternative tests for heteroscedasticity.

1. Breusch-Pagan test
2. Harvey-Godfrey test
3. White test
4. Wooldridge test

The Breusch-Pagan test assumes that if heteroscedasticity exists it is linear. The Harvey-Godfrey test assumes that if heteroscedasticity exists it is exponential. The White and Wooldridge tests assume that if heteroscedasticity exists it has an unspecified general form. (A code sketch of testing and of both remedies appears at the end of this section.)

Remedies for Heteroscedasticity

When there is evidence of heteroscedasticity, econometricians choose one of two alternatives.

1. Use the OLS estimator, and correct the estimates of the standard errors so that they are consistent.
2. Use the FGLS estimator.

White Robust Standard Errors

If you are uncertain of the true model of heteroscedasticity, then you can estimate the parameters of the model using the OLS estimator and use White's correction to obtain consistent estimates of the standard errors. These are called White robust standard errors or White-Huber robust standard errors. If you choose this alternative, you will obtain unbiased but inefficient estimates of the parameters of the model, but consistent estimates of the standard errors. Hypothesis tests will be valid, but you will lose some precision.

FGLS Estimator

If you are relatively certain about the true model of heteroscedasticity, then you can use the FGLS estimator. The FGLS estimator is a weighted least squares (WLS) estimator. To use the WLS estimator, begin by specifying a transformed model that satisfies all of the assumptions of the classical linear regression model. The transformed model, which is a computational device, is given by

wtYt = β1wt + β2(wtXt2) + β3(wtXt3) + wtμt

The transformed model is obtained by multiplying each side of the statistical equation by an appropriate weight wt. The appropriate weight is wt = 1/σt, the reciprocal of the standard deviation of the error. Note that the error variance in the transformed model is

var(wtμt) = var[(1/σt)μt] = (1/σt)²var(μt) = σt²/σt² = 1

so the transformed model has a constant error variance of 1, and therefore a homoscedastic error term. To implement the WLS estimator, you use the sample of data to estimate the weight wt = 1/σt. You then regress wtYt on wt, wtXt2, and wtXt3 using the OLS estimator.
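To close, here is a sketch of the testing step and both remedies using statsmodels, under the assumption of exponentially heteroscedastic errors; the simulated data and variable names are illustrative. The first remedy keeps the OLS estimates but reports White-Huber robust standard errors; the second estimates σt² from an auxiliary regression of ln(et²) on the regressors and then applies weighted least squares. Note that statsmodels' WLS takes weights proportional to 1/σt², which is equivalent to multiplying each observation by wt = 1/σt.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Illustrative simulated data (assumption): the error standard deviation
# grows exponentially with the second regressor.
rng = np.random.default_rng(2)
T = 200
X = sm.add_constant(rng.normal(size=(T, 2)))          # constant, Xt2, Xt3
sigma_t = np.exp(0.5 * X[:, 1])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=T) * sigma_t

# OLS fit and a Breusch-Pagan test for heteroscedasticity.
ols_res = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, _, _ = het_breuschpagan(ols_res.resid, X)
print(f"Breusch-Pagan LM = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")

# Remedy 1: keep the OLS estimates, report White-Huber robust standard errors.
robust_res = sm.OLS(y, X).fit(cov_type="HC1")
print(robust_res.bse)                                  # robust standard errors

# Remedy 2: FGLS/WLS under an assumed exponential model of heteroscedasticity.
# Auxiliary regression: ln(e_t^2) = alpha_1 + alpha_2*Xt2 + alpha_3*Xt3 + v_t.
aux_res = sm.OLS(np.log(ols_res.resid**2), X).fit()
sigma2_hat = np.exp(aux_res.fittedvalues)              # fitted sigma_t^2

# WLS with weights 1/sigma_t^2 is OLS applied to the transformed model in
# which every variable is multiplied by w_t = 1/sigma_t.
wls_res = sm.WLS(y, X, weights=1.0 / sigma2_hat).fit()
print(wls_res.params)
print(wls_res.bse)
```

When the assumed model of heteroscedasticity is approximately correct, the WLS standard errors are typically smaller than the robust OLS ones, reflecting the efficiency gain discussed above.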