Applied Econometrics
Week 1-2: Ordinary Least Squares
Michal Rubaszek
Readings
R.C. Hill, W.E. Griffiths, G.C. Lim, 2012. Principles of Econometrics, John Wiley & Sons,
Inc., Chapters 1,2,3,4,5
1 Introduction to Econometrics

1.1 Definition of econometrics
Econometrics: application of mathematical and statistical techniques to economics in the study of
problems, the analysis of data, and the development and testing of theories and models.
1.2 Econometric model
The general specification of a (single-equation) econometric model is:
yt = f(xt, α, εt) for t = 1, 2, . . . , T    (1)
where:
yt - a dependent variable;
xt - a vector of K independent variables (constant included);
α - a vector of model parameters (that are to be estimated);
εt - error term (stochastic part of the model);
t - moment of observation
For linear models the notation translates into:
yt = α1 x1t + α2 x2t + . . . + αK xKt + εt    (2)
Note. In the above notation we implicitly add “for t = 1, 2, . . . , T ”
Remark. The econometric model differs from an economic model in two dimensions. First, variables
in the former are indexed by t. Second, there is a stochastic part represented by the error term in
the econometric model.
1
Example 1. Our focus is to find the marginal propensity to consume (MPC) out of disposable
income. The economic model is:
C = α + βY,
where C and Y are the levels of consumption and disposable income, respectively, and the parameter
β describes the MPC. The corresponding econometric model is:
Ct = α + βYt + εt
where parameters α and β are to be estimated on the basis of T empirical observations for Ct and
Yt .
1.3 Types of data
To estimate the parameters of an econometric model we need empirical observations, which come in several types. The first classification is based on the source of the data:
• macroeconomic data (macroeconometrics)
• microeconomic data (microeconometrics)
• financial data (financial econometrics)
• experimental data (experimental econometrics)
The second classification is based on the type of data sample:
• time series (collected over discrete intervals of time: yt , t = 1, 2, ..., T )
• cross-section data (collected across sample units in a particular time period: yi , i = 1, 2, ..., N )
• panel or longitudinal data (observations on many individual units over time: yit , i = 1, 2, ..., N
and t = 1, 2, ..., T )
2 Stages of building an econometric model
The process of constructing an econometric model consists of six stages:
1. Setting up a research hypothesis
2. Choosing a functional form and the set of explanatory variables
3. Collecting the data
4. Estimating the model
5. Verification process
6. Application
Example 2. We analyze the relationship between unemployment rate and inflation in Poland (file
example1a.wf1).
Research hypothesis: To estimate the slope of the Phillips curve in Poland
Model specification: πt = α1 + α2 ut + εt
Data: Quarterly data for inflation and the unemployment rate from the period 1995-2011 (source: CSO)
Estimation: Ordinary Least Squares (OLS)
Verification: Determination coefficient R², test whether α2 < 0
Application: Verification of the theoretical model

3 Ordinary Least Squares estimator
Let us write down the linear econometric model given by (2) in a shorter form:
yt = α′xt + εt,    (3)
where α = [α1 α2 . . . αK]′ is the vector of parameters and xt = [x1t x2t . . . xKt]′ is the vector of explanatory variables. We can observe the values of yt and xt, but we don't know the values of α. We need to estimate the parameters. There are many methods to do so, of which the most popular is Ordinary Least Squares (OLS).
Let us denote by α̂ the estimate of vector α. It could be any vector of size K, even containing
the most unreasonable values. For such a vector we can calculate:
Fitted values: ŷt = α̂′xt, t = 1, 2, . . . , T
Residuals: et = yt − ŷt, t = 1, 2, . . . , T
Sum of squared residuals: SSE(α̂) = ∑_{t=1}^{T} e²t
Notice that the value of SSE depends on our choice of α̂. This means that we can find α̂ such that SSE(α̂) is minimal, i.e.:
∀a∈ℝᴷ SSE(α̂) ≤ SSE(a)    (4)
This value is called the OLS estimate.
To find this value we need to solve the optimization problem (i.e. find the value of α̂ for which the first derivative of SSE is null). The solution is the formula for the OLS estimator:
α̂ = (∑_{t=1}^{T} xt x′t)⁻¹ (∑_{t=1}^{T} xt yt).    (5)
Remark. The OLS estimator is a general formula and is a random variable. The properties of the
estimator depend on the structure of the model (described by assumptions). OLS estimates are
numbers that we obtain by applying the general formula to the observed data. This distinction is fundamental to understanding econometric inference.
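As a quick sketch, formula (5) can be implemented directly with matrix algebra. The simulated data, seed, and "true" parameter values below are illustrative assumptions, not taken from the course files:

```python
import numpy as np

# Sketch of the OLS estimator (5): alpha_hat = (sum_t x_t x_t')^{-1} (sum_t x_t y_t),
# i.e. (X'X)^{-1} X'y in matrix form. Data are simulated for illustration only.
rng = np.random.default_rng(1)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=T)])  # constant + one regressor
alpha_true = np.array([10.0, 5.0])                     # assumed "true" parameters
y = X @ alpha_true + rng.normal(scale=2.0, size=T)     # error term eps_t ~ N(0, 2^2)

# Solve (X'X) alpha_hat = X'y instead of inverting X'X explicitly (more stable)
alpha_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(alpha_hat)  # close to [10, 5]
```

Solving the normal equations with `np.linalg.solve` rather than forming the inverse is the standard numerically stable way to evaluate (5).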
Example 3. The application of (5) to the Phillips curve model leads to the following estimates
(example1a.wf1):
π̂t = 11.1 − 0.28ut ,
which means that, ceteris paribus, an increase in the unemployment rate by 1 percentage point leads to a decrease in the annual CPI inflation rate of 0.28 percentage points.
4 Assumptions of the linear regression model
To perform statistical inference for the OLS estimator given by (5) it is essential to make assumptions
about the underlying model. The standard set of assumptions is as follows:
A1 For each t the expected value of yt given xt is:
E(yt|xt) = E(α′xt + εt|xt) = α′xt + E(εt|xt) = α′xt
A2 For each t and xt the variance of yt is:
var(yt|xt) = σ²
A3 For each s and t, such that s ≠ t, the covariance of the error term is null:
cov(εs, εt) = 0
A4 The values of xkt for k = 1, 2, . . . , K are not random and are not linear functions of other explanatory variables
A5 (Additional assumption) The random term is normally distributed:
εt ∼ N(0, σ²)
Note. A1 indicates that the error term has a probability distribution with zero mean. A2 and A3 indicate that the error term is homoscedastic and not autocorrelated. A4 indicates that the values of the explanatory variables are known (are not stochastic) and that there is no exact collinearity.
Gauss-Markov Theorem
Under assumptions A1-A4 the OLS estimator is:
Unbiased: E(α̂) = α
Consistent: lim_{T→∞} α̂T = α
Efficient: var(α̂) is the lowest in the class of linear and unbiased estimators

5 Interval estimation
Since the OLS estimator α̂ is a random variable, it has a distribution. To illustrate this, let us consider the following example.
Example 4. Let's generate a sample of 100 observations from the true model yt = 10 + 5xt + εt, where εt ∼ N(0, 2²). The data are available in the example1b.wf file. Subsequently, let's divide the sample into five equal subsamples of 20 observations and calculate OLS estimates for each subsample. The results, which are presented below, show that the estimates vary across the subsamples and are never exactly equal to the true values of the parameters.

subsample    a1       a2
1            9.21     5.27
2            10.97    5.75
3            9.70     4.57
4            9.96     5.51
5            10.41    6.14
true value   10       5
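The subsample experiment can be replicated in a few lines. The seed and simulated series below are assumptions for illustration, so the numbers will differ from the table above, but the qualitative point is the same: the five estimates scatter around the true values without ever hitting them exactly.

```python
import numpy as np

# In the spirit of Example 4: one sample of 100 observations from
# y_t = 10 + 5 x_t + eps_t, eps_t ~ N(0, 2^2), split into five subsamples of 20.
rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 10 + 5 * x + rng.normal(scale=2.0, size=100)

estimates = []
for i in range(5):
    xs, ys = x[20 * i:20 * (i + 1)], y[20 * i:20 * (i + 1)]
    Xs = np.column_stack([np.ones(20), xs])       # constant + regressor
    a1, a2 = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
    estimates.append((a1, a2))
    print(f"subsample {i + 1}: a1 = {a1:.2f}, a2 = {a2:.2f}")
```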
The interpretation of the above results is that for each subsample the OLS estimate is a single draw
from the distribution for the OLS estimator. Under the assumptions A1-A5 this distribution is:
α̂ ∼ N(α, Σ)
where:
Σ = σ²(∑_{t=1}^{T} xt x′t)⁻¹    (6)
The diagonal elements of the covariance matrix Σ, which we denote by σ²αk, stand for the variances of the individual parameters: var(α̂k) = σ²αk. The distribution of the OLS estimator for an individual parameter is:
α̂k ∼ N(αk, σ²αk).
Now we can compute a range of values in which the true parameter αk is likely to fall (a confidence interval or interval estimate):
Prob(α̂k − 1.96σαk ≤ αk ≤ α̂k + 1.96σαk) = 0.95.
In practice we cannot use the above formula for inference because we don’t know the variance of
the error term σ 2 . We need to substitute σ 2 with its unbiased estimate:
σ̂² = ∑_{t=1}^{T} e²t / (T − K)    (7)
and the covariance matrix Σ with its estimator:
Σ̂ = σ̂²(∑_{t=1}^{T} xt x′t)⁻¹.    (8)
The diagonal elements of the matrix Σ̂, which we denote by S²k, stand for the estimators of the variances of the individual parameters: S²k = σ̂²αk. The substitution of σαk with Sk changes the distribution from normal to a t-distribution with ν = T − K degrees of freedom, so that:
(α̂k − αk)/Sk ∼ tν    (9)
and the interval estimate changes into:
Prob(α̂k − t*ν Sk ≤ αk ≤ α̂k + t*ν Sk) = 0.95,
where t∗ν is the critical value of the tν distribution for the 95% interval.
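A sketch of formulas (7)-(9) on simulated data. The true model, seed, and sample size are illustrative assumptions, and the critical value 1.98 is an approximation of t*ν for ν = 98 taken from t-tables:

```python
import numpy as np

# Interval estimation per (7)-(9): residual variance, standard errors,
# and a 95% confidence interval. Simulated data; the true model is assumed.
rng = np.random.default_rng(0)
T, K = 100, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([10.0, 5.0]) + rng.normal(scale=2.0, size=T)

alpha_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ alpha_hat                             # residuals e_t
sigma2_hat = e @ e / (T - K)                      # unbiased estimate (7)
Sigma_hat = sigma2_hat * np.linalg.inv(X.T @ X)   # covariance matrix (8)
S = np.sqrt(np.diag(Sigma_hat))                   # standard errors S_k

t_crit = 1.98  # approximate critical value of t with nu = T - K = 98 d.o.f.
for k in range(K):
    lo, hi = alpha_hat[k] - t_crit * S[k], alpha_hat[k] + t_crit * S[k]
    print(f"alpha_{k + 1}: {alpha_hat[k]:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```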
Example 5. For the Phillips curve model (example1a.wf1) the values of Sk are 3.10 and 0.22, which we write down as:
π̂t = 11.1 − 0.28 ut
     (3.10)  (0.22)
Since T = 68, then ν = 66, t*ν = 1.995 and:
Prob(4.89 ≤ α1 ≤ 17.30) = 0.95
Prob(−0.72 ≤ α2 ≤ 0.10) = 0.95
6 Hypothesis tests
Tests of hypotheses about parameter values compare a conjecture we have about the population to
the information contained in a sample of data. The test of a hypothesis consists of the following
stages:
1. Setting a null H0 and alternative H1 hypotheses
2. Computing a test statistic
3. Determining a rejection region
4. Comparing the test statistic to the rejection region
For individual parameters of the linear model the set of hypotheses is:
H0: αk = c
H1: αk ≠ c    (10)
Assuming the null is true we can substitute αk in (9) with c, hence the test statistic:
tαk = (α̂k − c)/Sk ∼ tν.    (11)
The rejection region depends on a probability γ, called the significance level of the test (usually
γ = 5%). For a given γ we need to find the critical value t∗ν,γ of the t-distribution, which determines
the rejection region. Our decision is as follows:
if |tαk | ≥ t∗ν,γ we reject the null
if |tαk | < t∗ν,γ we don’t reject the null
Remark. The significance level γ describes the probability of Type I error, i.e. rejecting the null
when it is true. In practice there is also Type II error, i.e. not rejecting the null when it is false.
We cannot control for Type II error since its probability depends on the unknown value of αk .
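The decision rule can be sketched with the rounded estimates and standard errors reported in Example 5 (11.1 with S = 3.10, −0.28 with S = 0.22); small differences from the quoted t statistics are due to rounding:

```python
# t-test of H0: alpha_k = 0 against H1: alpha_k != 0, using the rounded
# estimates and standard errors from the Phillips curve example above.
estimates = {"alpha_1": (11.1, 3.10), "alpha_2": (-0.28, 0.22)}
t_crit = 1.995  # critical value for nu = 66 at the 5% significance level

for name, (a_hat, s_k) in estimates.items():
    t_stat = (a_hat - 0.0) / s_k            # test statistic (11) with c = 0
    decision = "reject H0" if abs(t_stat) >= t_crit else "do not reject H0"
    print(f"{name}: t = {t_stat:.2f} -> {decision}")
```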
The p-value
When reporting the outcome of statistical hypothesis tests it has become standard practice to report
the p-value (probability value) of the test. We can compare this p-value to the chosen significance
value γ. Our decision is as follows:
if p ≤ γ we reject the null
if p > γ we don’t reject the null
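For a two-sided test the p-value is Prob(|tν| ≥ |tαk|), i.e. twice the upper-tail probability of the t-distribution. A minimal sketch, assuming SciPy is available for the tail probability:

```python
from scipy import stats

# Two-sided p-value for a t statistic with nu degrees of freedom:
# p = Prob(|t_nu| >= |t_stat|) = 2 * upper-tail probability.
def p_value(t_stat, nu):
    return 2 * stats.t.sf(abs(t_stat), df=nu)

# t statistics from the Phillips curve example, nu = 66
print(p_value(3.57, 66))   # well below 0.05 -> reject the null
print(p_value(-1.28, 66))  # above 0.05 -> do not reject the null
```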
Example 6. For the Phillips curve example the values of the t statistics are tα1 = 3.57 and tα2 = −1.28. Given that the critical value for the 5% significance level is t* = 1.995, we can reject the null H0: α1 = 0, but cannot reject the null H0: α2 = 0. This is confirmed by the probability values: pα1 = 0.0007 < 0.05 and pα2 = 0.2051 > 0.05.
7 Tests for joint hypotheses
A null hypothesis of multiple restrictions on the parameters, which is called a joint hypothesis, can be tested with one of three tests: the F-test, the Lagrange Multiplier (LM) test and the likelihood ratio test. Here we will discuss the first two.
The general form of M linear restrictions on the parameters of a regression can be written as:
H0: Rα = r
H1: Rα ≠ r    (12)
where R is an M × K matrix and r an M × 1 vector. If the null is true then the fit of the restricted model shouldn't be significantly worse than the fit of the unrestricted model. We can test this by comparing the sum of squared residuals SSE = ∑_{t=1}^{T} e²t of both models. If the null is true, the F-test statistic:
F = [(SSER − SSEU)/M] / [SSEU/(T − K)]    (13)
has an F distribution with v1 = M and v2 = T − K degrees of freedom. Subscripts R and U denote the restricted and unrestricted models, respectively.
Remark. Since the fit of the restricted model cannot be better than the fit of the unrestricted model, the inequality SSER − SSEU ≥ 0 always holds.
In the case of the LM test, under the null the statistic:
LM = (SSER − SSEU)/σ̂² = M × F    (14)
has a χ² distribution with v = M degrees of freedom (on the correspondence between the two tests see Appendix 6A in the "Principles of Econometrics").
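A sketch of (13)-(14) on simulated data: fit the unrestricted and restricted models, compare their SSEs, and check the LM = M × F identity. The data and the particular restrictions tested are illustrative assumptions.

```python
import numpy as np

# F-test (13) and LM-test (14) for M = 2 restrictions (alpha_2 = alpha_3 = 0)
# on simulated data; the true model here is assumed for illustration.
rng = np.random.default_rng(7)
T, K, M = 150, 3, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=T)

def sse(Xm, ym):
    """Sum of squared residuals of an OLS fit of ym on Xm."""
    b = np.linalg.solve(Xm.T @ Xm, Xm.T @ ym)
    e = ym - Xm @ b
    return e @ e

sse_u = sse(X, y)                 # unrestricted: all K regressors
sse_r = sse(X[:, :1], y)          # restricted: constant only
F = ((sse_r - sse_u) / M) / (sse_u / (T - K))   # formula (13)
sigma2_hat = sse_u / (T - K)
LM = (sse_r - sse_u) / sigma2_hat               # formula (14)
print(F, LM)                      # LM equals M * F by construction
```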
Example 7. We analyze the relationship between the interest rate (it ), year-on-year inflation (πt )
and year-on-year GDP growth rate (yt ). The quarterly data, which are taken from the MPdata.wf1
file, cover the period 1974-2011 and relate to the U.S. economy. The results of OLS estimation are:
ît = 1.41 + 0.97 πt + 0.26 yt
    (0.46)  (0.07)   (0.09)
We want to verify the null that the coefficients α2 and α3, which describe the impact of inflation and output, are 1.5 and 0.5, respectively. In terms of (12) we can write these hypotheses as:
H0: [0 1 0; 0 0 1] α = [1.5; 0.5]
H1: [0 1 0; 0 0 1] α ≠ [1.5; 0.5]
The test statistics: F (2, 149) = 135.6 (p = 0.000) and LM (2) = 271.2 (p = 0.000) indicate that we
should reject the null.
Finally, it should be noticed that joint hypothesis tests are usually applied to test for the overall significance of the regression model. In this case, for a linear model with a constant:
yt = α1 + α2 x2t + . . . + αK xKt + εt
the form of the hypotheses given by (12) is:
H0: α2 = 0 ∧ α3 = 0 ∧ . . . ∧ αK = 0    (15)
H1: α2 ≠ 0 ∨ α3 ≠ 0 ∨ . . . ∨ αK ≠ 0.
Given that under the null the model shrinks to yt = α1 + εt, we get SSER = ∑_{t=1}^{T}(yt − ȳ)². As a result, the value of the F test statistic is:
F = (R²/M) / ((1 − R²)/(T − K))    (16)
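Under stated assumptions (simulated data, overall-significance test with M = K − 1 restrictions), the SSE form (13) and the R² form (16) of the F statistic give the same number, which the following sketch verifies:

```python
import numpy as np

# Check that (16) matches (13) for the overall-significance test, where the
# restricted model is y_t = alpha_1 + eps_t. Simulated data; illustration only.
rng = np.random.default_rng(3)
T, K = 120, 3
M = K - 1                                  # restrictions: alpha_2 = ... = alpha_K = 0
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(size=T)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
sse_u = e @ e                              # SSE of the unrestricted model
sse_r = np.sum((y - y.mean()) ** 2)        # SSE of the constant-only model
R2 = 1 - sse_u / sse_r                     # determination coefficient

F_sse = ((sse_r - sse_u) / M) / (sse_u / (T - K))   # formula (13)
F_r2 = (R2 / M) / ((1 - R2) / (T - K))              # formula (16)
print(F_sse, F_r2)
```

The equivalence follows because R² = 1 − SSEU/SSER when the restricted model contains only the constant.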
8 Recommended exercises from Principles of Econometrics
Exercise 1. Solve exercises for models with one explanatory variable: 2.10 (CAPM), 2.12 (House
price), 2.15 (wage vs education)
Exercise 2. Solve exercises for hypotheses testing: 3.7 (CAPM), 3.8 (House price), 3.12 (wage vs
education)
Exercise 3. Solve exercises for models with many explanatory variables: 5.13 (House price), 5.19
(wage vs education), 5.25 (production function)
Homework 1. Read the Probability Primer, chapter 1, Principles of Econometrics