Econometrics 1
Lecture 1: Classical Linear Regression Analysis

What is Econometrics?

Application of mathematical statistics to economic data to lend empirical support:
a. Economic theory postulates a qualitative relation.
b. Mathematical economics turns economic theory into equations.
c. Economic statistics is concerned with collecting, processing and presenting economic data.
d. Econometricians produce precise numerical estimates of these relations.

Branches of Econometrics

Econometrics
   Theoretical: Classical, Bayesian
   Applied: Classical, Bayesian

Econometric Methodology

Traditional or Classical Methodology of Econometrics:
1. Statement of hypothesis
2. Specification of the mathematical model
3. Specification of the econometric model
4. Data collection
5. Estimation of parameters of the econometric model
6. Hypothesis testing
7. Forecasting or prediction
8. Using the model for control or policy analysis

Methodology of Bayesian Econometrics:
1. Bayesian prior
2. Sample information
3. Posterior information

Assume a simple linear regression model:

$$Y_i = \beta_1 + \beta_2 x_i + e_i$$

The main assumptions about the error term $e_i$ are the following:
a. The mean of $e_i$ is zero for every observation of $x_i$: $E(e_i) = 0$.
b. The variance of $e_i$ is constant: $\operatorname{var}(e_i) = \sigma^2$ for every $i$th observation.
c. $\operatorname{cov}(e_i, e_j) = 0$ for all $i \neq j$. Taken together with the constant variance, this means there is no autocorrelation or heteroscedasticity; the errors are homoscedastic and independent of each other.
d. $E(e_i x_i) = 0$: there is no correlation between $e_i$ and the explanatory variable $x_i$.
e. The explanatory variable $x_i$ is exogenous, not random.
f. The variance of the dependent variable is equal to the variance of the error term: $\operatorname{var}(y_i) = \operatorname{var}(e_i) = \sigma^2$.

Graphical Illustration of a Simple Linear Regression Model

[Figure: scatter of observations around the fitted line $\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i$; horizontal axis X, vertical axis Y.]

Each dot in the above graph represents an observation. Some observations lie above the least squares line $\hat{Y}_i$ and other observations lie below it. These errors represent all sorts of elements missing from this relationship.
Some of them might be due to missing variables, others might be due to measurement errors, and still others may come from mis-specification of the relationship. The least squares line is the line that best fits the data set. The difference between each observation and the $\hat{Y}_i$ line is represented by the error term $e_i$. As some observations are above the line and others below it, positive errors cancel out with negative errors. Note that the least squares line passes through the average values of the variables X and Y, $(\bar{X}, \bar{Y})$.

Minimisation of Error Sum Square

Errors are given by $e_i = Y_i - \beta_1 - \beta_2 x_i$. Some of them are positive and others are negative. Since the mean of these errors is zero, $E(e_i) = 0$, it is customary to take the sum of squared errors and estimate the unknown parameters $\beta_1$ and $\beta_2$ that minimise it:

$$S = \sum_i e_i^2 = \sum_i (Y_i - \beta_1 - \beta_2 x_i)^2 \qquad (1)$$

Every squared error is non-negative, $e_i^2 \geq 0$. When $e_i \sim N(0, \sigma^2)$, the scaled sum $S/\sigma^2 = \sum_i e_i^2 / \sigma^2$ follows a $\chi^2_n$ distribution, where the subscript $n$ stands for the degrees of freedom, which equals the number of terms in $S$. The normal equations of the least squares estimator are obtained by minimising the function $S$ in (1) with respect to $\beta_1$ and $\beta_2$.

Derivation of Normal Equations

$$\frac{\partial S}{\partial \beta_1} = -2 \sum_i (Y_i - \beta_1 - \beta_2 x_i) = 0 \quad \text{and} \quad \frac{\partial S}{\partial \beta_2} = -2 \sum_i (Y_i - \beta_1 - \beta_2 x_i)\, x_i = 0$$

Thus the normal equations are

$$\sum_i y_i = N \beta_1 + \beta_2 \sum_i x_i \qquad (2)$$

$$\sum_i x_i y_i = \beta_1 \sum_i x_i + \beta_2 \sum_i x_i^2 \qquad (3)$$

This is a system of two equations, (2) and (3), in two unknowns, $\beta_1$ and $\beta_2$. All other values, such as $\sum_i x_i$, $\sum_i x_i y_i$, $\sum_i x_i^2$, $\sum_i y_i$ and $N$, are known from the sample information on X and Y. To get the value of $\beta_2$, eliminate $\beta_1$ by multiplying (2) by $\sum_i x_i$ and (3) by $N$, and take the difference of the resulting two equations.

OLS Estimators

$$\sum_i x_i \sum_i y_i = N \beta_1 \sum_i x_i + \beta_2 \Big(\sum_i x_i\Big)^2 \qquad (4)$$

$$N \sum_i x_i y_i = N \beta_1 \sum_i x_i + N \beta_2 \sum_i x_i^2 \qquad (5)$$

Now subtracting (5) from (4) we get the estimator for $\beta_2$:

$$\hat{\beta}_2 = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \big(\sum_i x_i\big)^2} \qquad (6)$$

The estimator for $\beta_1$ can be found by dividing both sides
of (2) by $N$ and using the average values $\bar{x}$ and $\bar{y}$:

$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} \qquad (7)$$

An Example of OLS Estimation

Food expenditure (Y) and income (X): data and prediction

      Y     X     Xy  xsquare  ysquare      Ypred    Sqpredy      prede   sqprede
      4     5     20       25       16   2.866285    8.21559   1.133715   1.28531
      6     8     48       64       36   5.742472   32.97598   0.257528  0.066321
      7    10     70      100       49    7.65993   58.67453   -0.65993  0.435508
      8    12     96      144       64   9.577388   91.72636   -1.57739  2.488153
     11    14    154      196      121   11.49485   132.1315   -0.49485  0.244873
     15    17    255      289      225   14.37103   206.5266   0.628967  0.395599
     18    20    360      400      324   17.24722   297.4666    0.75278  0.566678
     22    25    550      625      484   22.04087   485.7997   -0.04087   0.00167
Sum  91   111   1553     1843     1319   127.4218   1313.517   -3.9E-05  5.484111

Out-of-sample prediction: X = 40 gives Ypred = 36.4218. The Ypred total of 127.4218 includes this extra prediction; the eight in-sample predictions sum to 91.

Estimates

$$\hat{\beta}_2 = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \big(\sum_i x_i\big)^2}$$

Or, using the values from the above table:

$$\hat{\beta}_2 = \frac{8(1553) - 111(91)}{8(1843) - (111)^2} = \frac{12424 - 10101}{14744 - 12321} = \frac{2323}{2423} = 0.95873 \qquad (8)$$

$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} = \frac{91}{8} - 0.95872 \cdot \frac{111}{8} = 11.375 - 0.95872(13.875) = 11.375 - 13.30224 = -1.92724 \qquad (9)$$

The fitted regression line is

$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i = -1.92724 + 0.95873\, x_i \qquad (10)$$

The intercept in (9) is computed with the slope rounded to 0.95872; with the unrounded slope 2323/2423 it is -1.92736, which is the value behind the Ypred column in the table.

Interpretation and Prediction

Both the slope and the intercept are given an economic interpretation. In this sample, expenditure on food is determined by the weekly income of an individual: people spend about 95.9% of each additional unit of weekly income on food (slope 0.95873). The intercept is negative (-1.93); the lecture reads it as an income subsidy of 1.93 pence per week received by people who do not have any income.

Mean prediction

We can use equation (10) to find the predicted values $\hat{Y}_i$ for each observation on $x_i$. These are reported as Ypred in the above table. If the weekly income is 40, predicted food expenditure will be 36.422.

Error terms are also estimated, using the fact that

$$\hat{e}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i = y_i + 1.92724 - 0.95873\, x_i$$

These predicted errors are reported as prede in the above table. Note that, as expected, some of the errors are negative and others are positive.
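
The worked example can be reproduced with a few lines of code. The following is a minimal sketch in Python (not the Shazam software used in this course) that applies the summation formulas (6) and (7) to the food expenditure data from the table. The function name `ols_simple` and the variable names are illustrative assumptions, not part of the lecture.

```python
# Food expenditure (Y) and weekly income (X), copied from the lecture table.
Y = [4, 6, 7, 8, 11, 15, 18, 22]
X = [5, 8, 10, 12, 14, 17, 20, 25]


def ols_simple(x, y):
    """Return (intercept, slope) using the summation formulas (6) and (7).

    Hypothetical helper written for illustration only.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    # Equation (6): slope estimator.
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # Equation (7): intercept from the sample means.
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


b1, b2 = ols_simple(X, Y)

# Fitted values (Ypred), residuals (prede) and their sum of squares.
fitted = [b1 + b2 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]
rss = sum(e * e for e in residuals)

# Out-of-sample prediction for a weekly income of 40.
prediction_40 = b1 + b2 * 40

print("intercept:", round(b1, 5), "slope:", round(b2, 5))
print("prediction at X = 40:", round(prediction_40, 3))
print("residual sum of squares:", round(rss, 4))
```

Because the slope is not rounded before the intercept is computed, the intercept printed here matches the Ypred column of the table rather than the rounded arithmetic in equation (9).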
Prediction of Food Expenditure

[Figure: Prediction of food expenditure. Vertical axis: predicted food expenditure; horizontal axis: income.]

Use of regression estimates to calculate the elasticities

The elasticity of food expenditure with respect to income, evaluated at the sample means, is given by

$$\eta = \frac{\partial Y / Y}{\partial X / X} = \frac{\partial Y}{\partial X} \cdot \frac{\bar{X}}{\bar{Y}} = 0.95873 \times \frac{13.875}{11.375} \approx 1.1694$$

This suggests that expenditure on food is elastic around the mean: a 1% rise in weekly income is associated with a rise of about 1.17% in food expenditure.

Hints to get into the Shazam program in the Network

0. Create a Metrics directory in the G: drive.
1. Log in to the network.
2. At Start choose Applications\Economics\Professional Shazam. Limdep is also available there if you are familiar with it.
3. You have an editor in the Shazam program to write your program. An econometric program involves the following five steps while compiling and computing the model:
   1. declaring the sample size
   2. reading the data for each variable declared
   3. calculations (checking descriptive statistics such as the mean, variance and correlation; make sure that there are no missing observations)
   4. using the standard Shazam routines for estimation, such as OLS y x; Arima x
   5. interpreting the results and checking whether they make sense according to economic theory
4. Click on Shazam. Now you should be in the Shazam program. Click on File/New; it will bring you to the Shazam editor.
6. Write a Shazam program similar to the one given in the example below. Save this Shazam file in your own directory in the G:\metrics directory.

How to Read Data in Shazam?

For small data files, cut your data and paste it into the Shazam editor. For a large data file, you can read the data directly from the file. The data file should be in text format. If your data is in Excel, save it in text format using the "save as" option, or make a number of small data files and combine them using a Shazam program. There are more examples in the File/Open/Intro option in the menu.
There is also a demo which shows various features of Shazam. It is worth trying if this is your first encounter with Shazam.

Getting Around with Shazam

7. When you have your program written, click on "Run" and then "Run Batch" to execute your program.
8. If everything is all right, Shazam will work in the background and display the output on the screen. You may save your result file in your directory if you wish by using "copy" and "paste" in the "edit" menu.
9. Get more practice on several aspects of programming with Shazam, such as reading a data file, transforming variables by taking a log, lag or square, plotting one variable against another using "plot x y /gnu line only", and saving a variable to use later on. Also write a couple of lines to read regression estimates and diagnostics.
10. Consult the Short Loan Section in the Library to borrow a hard copy of the Shazam manual. You can always use the online manual inside the Shazam program, which you can get by clicking on Help and the Visit Shazam Online option while you are in the Shazam program.
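A Simple Example of Shazam

   sample 1-10
   read y x1 x2
   3.5 15 16
   4.5 20 13
   5 30 10
   6 42 7
   7 50 7
   9 54 5
   8 65 4
   10 72 3
   12 85 3.5
   14 90 2
   ols y x1 x2 /cov=b anova
   confid x1 x2
   confid x1 x2 / tcrit=3.499
   gen1 srb =sqrt(b:1)
   print b srb
   ols y x1 x2 /predict=py
   diagnos / het
   diagnos / acf
   dim p 10 2
   gls y x1 x2 /omega=b
   print py
   stop

Simple Regression in matrix notation

Using the food expenditure data, with a column of ones for the intercept:

$$X'X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 5 & 8 & 10 & 12 & 14 & 17 & 20 & 25 \end{pmatrix} \begin{pmatrix} 1 & 5 \\ 1 & 8 \\ 1 & 10 \\ 1 & 12 \\ 1 & 14 \\ 1 & 17 \\ 1 & 20 \\ 1 & 25 \end{pmatrix} \Rightarrow X'X = \begin{pmatrix} 8 & 111 \\ 111 & 1843 \end{pmatrix}$$

$$X'Y = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 5 & 8 & 10 & 12 & 14 & 17 & 20 & 25 \end{pmatrix} \begin{pmatrix} 4 \\ 6 \\ 7 \\ 8 \\ 11 \\ 15 \\ 18 \\ 22 \end{pmatrix} \Rightarrow X'Y = \begin{pmatrix} 91 \\ 1553 \end{pmatrix}$$

The estimators in terms of matrix notation

$$\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix} = (X'X)^{-1} X'Y = \begin{pmatrix} N & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix} = \begin{pmatrix} 8 & 111 \\ 111 & 1843 \end{pmatrix}^{-1} \begin{pmatrix} 91 \\ 1553 \end{pmatrix}$$

The desired inverse matrix is

$$(X'X)^{-1} = \frac{1}{|X'X|} \operatorname{Adj}(X'X) = \frac{1}{2423} \begin{pmatrix} 1843 & -111 \\ -111 & 8 \end{pmatrix}$$

so that

$$\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix} = \frac{1}{2423} \begin{pmatrix} 1843 & -111 \\ -111 & 8 \end{pmatrix} \begin{pmatrix} 91 \\ 1553 \end{pmatrix} = \frac{1}{2423} \begin{pmatrix} 1843(91) - 111(1553) \\ -111(91) + 8(1553) \end{pmatrix}$$

$$\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix} = \frac{1}{2423} \begin{pmatrix} 167713 - 172383 \\ -10101 + 12424 \end{pmatrix} = \begin{pmatrix} -1.92736 \\ 0.95873 \end{pmatrix}$$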
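
The matrix calculation can be checked numerically. The sketch below is written in plain Python rather than Shazam and assumes a two-column design matrix (a constant and one regressor), so the inverse can be formed from the determinant and the adjugate exactly as in the derivation above. The helper names `normal_equation_matrices` and `solve_2x2` are hypothetical and chosen only for this illustration.

```python
# Food expenditure (Y) and weekly income (X) from the lecture example.
Y = [4, 6, 7, 8, 11, 15, 18, 22]
X = [5, 8, 10, 12, 14, 17, 20, 25]


def normal_equation_matrices(x, y):
    """Build X'X and X'Y for a design matrix with rows (1, x_i)."""
    design = [(1, xi) for xi in x]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(2)]
           for i in range(2)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(2)]
    return xtx, xty


def solve_2x2(a, b):
    """Return (determinant, solution) of a * beta = b via the adjugate."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    adj = [[a[1][1], -a[0][1]],
           [-a[1][0], a[0][0]]]
    beta = [(adj[i][0] * b[0] + adj[i][1] * b[1]) / det for i in range(2)]
    return det, beta


xtx, xty = normal_equation_matrices(X, Y)
det, beta = solve_2x2(xtx, xty)

print("X'X =", xtx)
print("X'Y =", xty)
print("|X'X| =", det)
print("beta_hat =", [round(v, 5) for v in beta])
```

For models with more regressors the adjugate approach becomes impractical, and a general linear solver or a least squares routine would normally be used instead.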