Hitchhiker's Guide to EViews and Econometrics

January 2000

Byung-Joo Lee
Department of Economics
University of Notre Dame
Notre Dame, IN 46556
ByungJoo.Lee.81@nd.edu
574-631-6837

EViews User's Guide

This is a short guide to EViews, the econometric software that we will use in this course. EViews is available on the campus cluster computers that run the Windows operating system. A Macintosh version is not installed on the campus computers (it can be purchased individually, directly from the publisher).

Summary of the Program

EViews is statistical software with a Windows graphical interface. It reduces many complicated statistical tasks to a few mouse clicks, yet it is powerful enough to handle many state-of-the-art econometric problems. EViews is the successor of MicroTSP from the same company. It handles time-series analysis better than cross-section analysis. Its main weakness is the lack of a more flexible maximum likelihood estimation procedure.

There are two ways to run EViews. You can perform many statistical functions using only the menu bar, or you can do the same tasks with EViews commands. We will use both.

Getting Started

When you start EViews, the EViews window appears. The first line in the window is the title bar, labeled Econometrics Views. The second line is the menu bar, beginning with File. Each entry opens its own submenu when you click on it, much like standard Windows menu bar items. Each item will be explained later when necessary. The blank line right below the menu bar is the command window. This is where you type EViews commands to perform statistical functions (as mentioned in the previous section, you can also use the menu bar for the same tasks).
The main area is the work area, which displays the results of statistical tasks. Each result has its own window, and as you perform multiple tasks you will see several cascaded windows. You can drag each window to a different position for easier viewing. The very bottom line is the status line, which is divided into three sections. The first is a message section (from EViews to the user). The second shows the default directory that EViews uses to look for data and programs; you can change it by clicking Update Default Directory in the directory setting window. The last section displays the current workfile name. When you open EViews, this section shows the workfile as untitled (because you have not named it yet).

Introductory Session

This is an introductory session of EViews. It covers only the most basic tools needed to perform a minimal regression analysis. The next section, Econometric Review, gives a brief but fairly comprehensive overview of the most commonly used econometric models.

We will practice an actual statistical analysis using a small data set. For each step, I will explain both the menu bar route and the EViews command. EViews commands will be boldfaced. Our practice will appear in italics.

1. Begin EViews session and prepare for the workfile.

You need to create a workfile for any EViews session. Click on File|New|Workfile (type CREATE in command mode). A dialog box asks for the frequency of the data: annual, semi-annual, quarterly, monthly, weekly, daily or undated. Select the appropriate frequency and enter the start and end periods of the data. For example, in the start period area, type 1960 for annual data or 1960:1 for quarterly, monthly or weekly data (you can use a period instead of a colon, e.g., 1960.1). The end period can be entered as 1969 (annual), 1969:4 (quarterly) or 1969:12 (monthly), etc. (see step 11 for the data to enter).
For cross-section data, choose undated for the data frequency and enter 1 as the start period and n (the sample size) as the end period. Then click OK. The workfile now appears (currently shown as workfile: UNTITLED), and it already contains two variables: C (constant) and resid (residuals).

2. Type in data for three series CONS, INCOME and CPI.

To enter data manually, click on Quick|Empty Group (Edit Series) (type DATA). A blank spreadsheet appears with the frequency you specified in step 1. This window is called a Group within your workfile. For each operation in your workfile, you can either name the resulting group to keep it or simply close the group window to discard it.

Click in the gray cell to enter the series (variable) name. A name can use any ASCII characters, up to 8 characters. Start typing the numeric data immediately below the series name. Type CONS (consumption), press the down arrow (DATA CONS) and type in 325, 335, ... (the data are listed at the end of this exercise). After finishing CONS, go to the next column, type INCOME in the gray area and type in 350, 364, 385, ... Repeat this for the series CPI. (You can set up all three series at once with DATA CONS INCOME CPI.)

If you want to read data from another file (ASCII file, Lotus *.WK3 or Excel *.XLS), click on Proc/Import Data in the workfile menu. Choose the proper file format and select the data file from your directory. A data dialog window asks for the order of the data (by observation or by series; most data are arranged by observation) and the series names (if the file does not contain them). If the data already have series names, simply type in how many series the data set contains. EViews will read all the necessary data into the workfile. You can export selected series to any of these data file formats by clicking on Proc/Export Data in the workfile menu.

3. Save and retrieve your current working file.
When you are done entering all your data, save your workfile so that you can use it later. To save it, click on File|SaveAs and give an appropriate file name and path (SAVE filename); the workfile automatically gets the extension .wf1. In a later session, you can retrieve the workfile with File|Open (LOAD filename).

4. Generate real variables using nominal variables and price index.

Note that CONS and INCOME are in nominal terms. To use them in real terms, we need to deflate them by the price index. The workfile menu bar has a GENR button. Click GENR (GENR) to open a window that asks for an equation and a sample period. Enter the equation that converts the nominal variable into real terms: type RCONS=CONS/CPI (RCONS is real consumption) and adjust the sample period if necessary; it is already set to the entire period. Do the same for RINCOME (real income).

5. Print CONS, INCOME, CPI, RCONS, RINCOME.

To look at your data, use the SHOW button in the workfile menu bar and type the appropriate series name(s) (SHOW CONS INCOME). You can type more than one series name in the series name window. For a single series, you can simply double click on its name in the workfile directory. For multiple series, highlight all the series names in the workfile directory and click SHOW. You can print using the PRINT button in the group menu bar (PRINT CONS INCOME).

6. Plot each series separately and simultaneously.

You can draw a graph for a single series or for multiple series. Use View|Graphics under the Group menu (this is available only under View in the Group menu, where you see the actual list of series) (PLOT series name or PLOT series1, series2). Each series is plotted in a different color. You can make multiple graphs by choosing View|Multiple Graphs in the Group menu.

7.
Draw and print the scatter diagram of RCONS and RINCOME.

To see the relationship between two variables, you can draw a scatter diagram (SCAT CONS, INCOME; the first variable is on the vertical axis and the second on the horizontal axis). You can print the scatter diagram using the PRINT button in the scatter diagram menu bar.

8. Calculate the correlation, covariance and test the Granger causality between RCONS and RINCOME.

You can obtain descriptive statistics using Quick|Series Statistics for a single series or Quick|Group Statistics for multiple series. The same statistics are available under View|Descriptive Stats in the Group menu. Eight kinds of statistics are available under the View menu: descriptive statistics, crosstab, correlations, covariances, correlogram, cross correlation, cointegration test and Granger causality.

9. Estimate the least squares regression of RCONS on RINCOME with intercept. Also estimate the same regression using nominal variables.

To regress RCONS on RINCOME with an intercept, click on Object|New Object|Equation (or Quick|Estimate Equation). Before you click the OK button, you can optionally give a name to your object. This is not necessary, but it makes it easier to refer to your estimated equation later. In the Equation Specification window, type the regression equation that you want to estimate and choose the appropriate estimation method (LS: Least Squares, TSLS: Two Stage Least Squares for simultaneous equation estimation, ARCH, LOGIT, PROBIT, etc.). For simple least squares, type RCONS C RINCOME (LS RCONS C RINCOME) to regress RCONS (dependent variable) on C (intercept) and RINCOME (explanatory variable).

10. Estimate the least squares regression of RCONS on last year's RCONS, RINCOME with intercept.
If you want to estimate the regression with a different specification, a different estimation technique or a different sample period, repeat step 9 and modify the appropriate estimation option. You can include lagged values in the regression by using RCONS(-1) for the one-period lagged value of RCONS or RINCOME(-2) for the two-period lagged value of RINCOME.

11. The following is the data set we used for this introductory session.

Year    Cons    Income    Cpi
1960    325     350       0.887
1961    335     364       0.896
1962    355     385       0.906
1963    375     405       0.917
1964    401     438       0.929
1965    433     473       0.945
1966    466     512       0.972
1967    492     547       1.000
1968    537     590       1.042
1969    576     630       1.098

Data Handling

1. Data Transformation

You can transform most data using the GENR button in the workfile menu bar. The following are the most commonly used functions and operations.

+, -, *, /        Add, subtract, multiply and divide
>, <, =, <>       Greater than, less than, equal and not equal
<=, >=            Less than or equal, greater than or equal
AND, OR           Logical operators; (X AND Y) is 1 if both are true
X2=X^2            Raise to the power
LY=LOG(X)         Natural log transformation
EX=EXP(X)         Exponential function
AX=ABS(X)         Absolute value
SQX=SQR(X)        Square root
RND, NRND         Random number generators, uniform and normal
RX=@INV(X)        Inverse or reciprocal of X
DX=D(X)           First difference of X, X(t)-X(t-1)
DnX=D(X,n)        nth order differencing, (1-L)^n X, where L is the lag operator
LX=X(-1)          One-period lagged value of X

2. Descriptive Statistics

@SUM(X)           Sum of X
@MEAN(X)          Mean of X
@VAR(X)           Variance of X
@COV(X,Y)         Covariance between X and Y
@COR(X,Y)         Correlation between X and Y
@DNORM(X)         Standard normal density function of X
@CNORM(X)         CDF of the standard normal random variable

3. Regression Statistics

If you assigned a name to the object for your regression, you can use that name to retrieve various regression statistics. For example, assume the regression is named TEST. Then TEST.@R2 is the R2 value of the TEST regression. If you did not assign a name, @R2 refers to the R2 value of the most recently estimated equation.
@R2, @RBAR        R2 and adjusted R2
@SE, @SSR         Standard error of regression, sum of squared residuals
@DW, @F, @LOGL    Durbin-Watson statistic, F-statistic, value of the log-likelihood function

4. Pooled time-series and cross section data

You need to assign names to the cross-section members. For example, assume that you are doing a cross-country study of income Y, consumption CONS, interest rate R and price level P for three countries: US, Japan and Canada. I assume that the data have already been read into the workfile under the names YUS, YJP, YCA, CONSUS, CONSJP, CONSCA, RUS, RJP, RCA, PUS, PJP and PCA. To obtain the pooled regression window, click Objects|New Object|Pool, and you will get the Pooled Estimation window. In this window you will see Cross Section Identifiers: (Enter identifiers below this line). Type US, JP and CA, one on each line. Each series name you defined before now stands for three series. For example, Y? is YUS, YJP and YCA. The same is true for the other variables CONS, R and P.

Econometric Review

This section provides a quick review of the econometric techniques commonly used in empirical economic research. It is not intended to teach econometric theory. Those who are interested in learning more econometric theory can read Gujarati (1996), Basic Econometrics, 3rd ed., or the more advanced Econometric Analysis, 4th ed., by W. Greene (2000). In this section, I assume that the data are already loaded and ready to use and that all the variables are defined. If you are not familiar with this, go back to steps 2 and 3 in the previous section. For each operation, I will give both the menu bar route and the EViews command whenever possible. Throughout this section, I will assume the following regression model and variable notation. I also assume that you know how to interpret regression output.

$$Y_t = \beta_0 + \beta_1 X_{t1} + \beta_2 X_{t2} + u_t$$

$Y_t$ (or $Y_{t1}$, $Y_{t2}$) is a dependent variable, $X_{t1}$ and $X_{t2}$ are independent variables, and the $\beta$s are parameters to estimate.

1.
Ordinary Least Squares (OLS) Estimation.

This is the most widely used estimation method; it is appropriate when the classical assumptions of regression are all met. The classical assumptions are:

1) $E(u_t \mid X_t) = 0$
2) $X_t$ and $u_t$ are uncorrelated, $E(X_t u_t) = 0$
3) Error terms are homoskedastic, $Var(u_t \mid X_t) = E(u_t^2 \mid X_t) = \sigma^2$
4) There is no autocorrelation, $Cov(u_t, u_s \mid X) = E(u_t u_s \mid X) = 0$ for $t \neq s$

$$Y_t = \beta_0 + \beta_1 X_{t1} + \beta_2 X_{t2} + u_t, \quad u_t \sim N(0, \sigma^2)$$

Click Quick|Estimate Equation (the equivalent menu path is Objects|New Object|Equation; this holds for all the regression procedures below). Give an appropriate regression name in the Name of Object window. Click OK to get the regression window and type Y C X1 X2 in the Equation Specification window. Choose Least Squares in the Estimation Settings window. Specify the appropriate sample period.

    LS Y C X1 X2

2. Generalized Least Squares (GLS) Estimation.

When the classical assumptions of regression are violated, OLS can still give consistent estimates, but the estimators are no longer efficient. To obtain efficient estimators, we need to use various modifications of OLS, collectively called GLS.

$$Y_t = \beta_0 + \beta_1 X_{t1} + \beta_2 X_{t2} + u_t, \quad u_t \sim N(0, \sigma_X^2)$$

2.1. Weighted Least Squares for Heteroskedasticity.

If you have prior knowledge of the pattern of heteroskedasticity, e.g., $\sigma_X^2 = \sigma^2 X_1^2$, then you can transform the heteroskedastic model into a homoskedastic one by dividing all variables by $X_1$. This is called weighted least squares estimation.

Click Quick|Estimate Equation and type Y C X1 X2 in the Equation Specification window. Click the Options button, select Weighted LS/TSLS and type the name of the appropriate weight series in the Weight box (for the example above, the weight is 1/X1, because each observation is divided by X1).

2.2. Heteroskedasticity Consistent Covariance Matrix.

When the structure of the heteroskedasticity is unknown, we can still estimate the covariance matrix consistently with either White's or the Newey-West heteroskedasticity consistent estimation method.
This approach is recommended when you suspect a heteroskedasticity problem but are not sure about its structure.

Click Quick|Estimate Equation and type Y C X1 X2 in the Equation Specification window. Click the Options button, select Heteroskedasticity Consistent Covariance and choose either White or Newey-West.

2.3. Autocorrelation (Serial Correlation) problem.

An autocorrelation problem arises when the error term follows an autoregressive structure such as $u_t = \rho u_{t-1} + \varepsilon_t$. If the Durbin-Watson statistic in the OLS regression output is close to 0 or 4, we may suspect first order autocorrelation. This model is estimated by the Cochrane-Orcutt iterative procedure, and we lose one observation to the lag.

Click Quick|Estimate Equation and type Y C X1 X2 AR(1) in the Equation Specification window. Choose Least Squares in the Estimation Settings window. For higher order autocorrelation (order h), simply add the terms up to AR(h); we then lose h observations to the lags.

    LS Y C X1 X2 AR(1)
    LS Y C X1 X2 AR(1) AR(2)

The regression output reports the parameter estimates of the model and the autocorrelation parameters.

2.4. ARIMA Model.

When the error term follows the general ARIMA(p,0,q) structure

$$u_t = \rho_1 u_{t-1} + \dots + \rho_p u_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q},$$

we have the ARIMA model.

Click Quick|Estimate Equation and type Y C X1 X2 AR(1) MA(1) in the Equation Specification window. Choose Least Squares in the Estimation Settings window.

    LS Y C X1 X2 AR(1) MA(1)

3. Autoregressive Conditional Heteroskedasticity ((G)ARCH).

The ARCH model is similar to the ARIMA model, but it assumes the ARIMA relationship in the second moment: the conditional variance of $u_t$ has autoregressive (ARCH) and/or moving average (GARCH) components in its heteroskedasticity structure. These models are frequently used to analyze financial data, where periods of large price volatility (measured by the variance) are followed by periods of relative tranquility.
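As a rough illustration of this volatility clustering, the following Python sketch (not EViews code) simulates an error series whose conditional variance $h_t$ depends on the previous squared error, $h_t = a_0 + a_1 u_{t-1}^2$. The function name, the parameter values a0 = 0.2 and a1 = 0.7, and the random seed are all made up for this example.

```python
import random

def simulate_arch1(T=1000, a0=0.2, a1=0.7, seed=1):
    """Simulate u_t with conditional variance h_t = a0 + a1 * u_{t-1}^2.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    u, h = [], []
    prev_u = 0.0                      # assumed starting value u_0 = 0
    for _ in range(T):
        ht = a0 + a1 * prev_u ** 2    # conditional variance given last period
        ut = (ht ** 0.5) * rng.gauss(0.0, 1.0)
        h.append(ht)
        u.append(ut)
        prev_u = ut
    return u, h

u, h = simulate_arch1()
# A large shock u_{t-1} raises h_t, so big movements tend to come in clusters.
print("smallest conditional variance:", min(h))
print("largest conditional variance: ", max(h))
```

Plotting u against time would show calm stretches interrupted by bursts of large movements, which is the pattern the ARCH family is designed to capture.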
$$Y_t = \beta_0 + \beta_1 X_{t1} + \beta_2 X_{t2} + u_t, \quad h_t = E(u_t^2 \mid \Omega_{t-1})$$

Here $h_t$ is the conditional variance of $u_t$, and $\Omega_{t-1}$ is the set of all information available up to time $t-1$.

3.1. ARCH(p)

$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_p u_{t-p}^2$$

3.2. GARCH(p,q)

$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_p u_{t-p}^2 + \beta_1 h_{t-1} + \beta_2 h_{t-2} + \dots + \beta_q h_{t-q}$$

3.3. ARCH-M model

This model allows the mean of a sequence to be a function of its conditional variance. It is a useful model for studying asset markets.

$$Y_t = \beta_0 + \beta_1 X_{t1} + \beta_2 X_{t2} + \delta h_t + u_t,$$

where $u_t$ follows the ARCH(p) model.

3.4. TARCH, EGARCH

These are variations of ARCH that allow for the asymmetric nature of price volatility.

Click Quick|Estimate Equation and choose ARCH in the Estimation Settings window. Type your regression function as before in the Mean Equation Specification window. Choose the appropriate settings for GARCH(p,q) (GARCH(1,1) is the default) and indicate whether or not you have an ARCH-M model.

    ARCH(p,q) Y C X1 X2

3.5. Test for ARCH Model

To test for the existence of ARCH, the Lagrange Multiplier (LM) test is often used. To carry out this test, choose View|Residual Test|ARCH LM Test from the main windows menu bar.

4. Panel Data Analysis (Pooled Time Series and Cross Section Regression).

4.1. Pooled Time Series and Cross Section

The pooled time series and cross section regression equation is as follows:

$$Y_{it} = X_{it}'\beta + u_{it}$$

Depending on the error structure of $u_{it}$, we can allow for cross-sectional heteroskedasticity or cross-sectional correlation. To estimate this model, you need to understand the way EViews handles the pooled data structure (see the previous section for pooled data handling).

To estimate this model, click Objects|New Object|Pool, and you will get the Pooled Estimation window. Specify the appropriate entries (dependent variable, sample period and regressors). In the Regressors window, you need to specify which variables have common coefficients and which have different coefficients (cross-section-specific coefficients).
In the model above, we assume that all variables have the same coefficients (no cross-section-specific coefficients). To allow for cross-sectional heteroskedasticity, select Cross section weights, and to allow for cross-sectional correlation, select SUR estimation. For the intercept, choose either none or common.

4.2. Panel Data Analysis

Panel data analysis is a special case of pooled time series and cross section data analysis. It is often found in longitudinal data structures, where the same individual is followed over time. Therefore, there may be an individual-specific effect (heterogeneity) that is constant over time. There are two ways to handle this problem: fixed effects or random effects.

Fixed Effect Model: $Y_{it} = \alpha_i + X_{it}'\beta + u_{it}$

Random Effect Model: $Y_{it} = \alpha + X_{it}'\beta + u_i + \varepsilon_{it}$

The fixed effect model assumes that individual heterogeneity is captured by different intercept terms, while the random effect model handles it with a random disturbance term $u_i$ that is constant through time. In this model, we can also specify that some variables have cross-section-specific coefficients.

To estimate the panel data regression, follow the instructions above to obtain the Pooled Estimation window and choose Fixed effects or Random effects in the intercept term selection.

5. Limited Dependent Variable Analysis (Logit or Probit).

Limited dependent variable analysis is appropriate when the dependent variable takes binary values ($Y_t = 1$ or $0$). The analysis comes from the following model.

$$Y_t^* = \beta_0 + \beta_1 X_t + u_t$$

The latent variable $Y_t^*$ is unobservable, but the binary variable $Y_t$ is observed: it is one if $\beta_0 + \beta_1 X_t + u_t > 0$ and zero otherwise. Depending on the assumption about the error term, we have either the Logit model ($u_t$ has a logistic distribution) or the Probit model ($u_t$ has a normal distribution). These models are estimated by maximum likelihood with numerical iteration (typically by the Newton-Raphson or BHHH method).

Click Quick|Estimate Equation and type Y C X1 X2 in the Equation Specification window.
Choose Logit or Probit in the Estimation Settings window. Specify the appropriate sample period.

    LOGIT Y C X1 X2
    PROBIT Y C X1 X2

6. Non-Linear Least Squares.

EViews automatically applies nonlinear least squares to any equation that is nonlinear in its coefficients. You can simply specify the nonlinear equation in the Equation Specification window. For example, suppose you want to estimate the CES production function with the following specification:

$$Y_t = A\left[\theta K_t^{-\rho} + (1-\theta) L_t^{-\rho}\right]^{-1/\rho},$$

where $A$, $\theta$ and $\rho$ are parameters to be estimated.

Click Quick|Estimate Equation, and in the Equation Specification window, type

    Y = C(1)*(C(2)*K^(-C(3)) + (1-C(2))*L^(-C(3)))^(-1/C(3))

where the C's are parameters to be estimated nonlinearly. Choose Least Squares in the Estimation Settings window.

7. Non-Stationary Time Series Analysis

Time series data $\{y_t\}_{t=1}^{T}$ are nonstationary if the autocorrelation coefficient ($\rho$, see section 2 above) is one. In that case the series does not revert to a fixed mean and has no finite variance as time progresses. We then say that the series has a unit root ($\rho = 1$), or in more technical notation, $y_t \sim I(1)$, which means that the series $y_t$ has to be differenced once to become stationary.

7.1. Unit Root Test

The general unit root test proceeds as follows. Consider the regression model $y_t = \rho y_{t-1} + u_t$. Testing the unit root hypothesis is equivalent to testing whether $\rho = 1$. Subtracting $y_{t-1}$ from both sides and writing $\theta = \rho - 1$, this basic equation is modified into the following three equations.

$$\Delta y_t = \theta y_{t-1} + u_t$$
$$\Delta y_t = \alpha + \theta y_{t-1} + u_t$$
$$\Delta y_t = \alpha + \theta y_{t-1} + \beta t + u_t$$

These are the basic Dickey-Fuller (DF) unit root test equations, and the testable hypothesis is $\theta = 0$ (i.e., $\rho = 1$, $y_t$ has a unit root). There are three different tables of critical values, depending on the equation you test (with or without an intercept and/or a trend variable). A more general version of the original DF test is the Augmented DF (ADF) test, as follows.
$$\Delta y_t = \theta y_{t-1} + \sum_{p=1}^{P} \omega_p \Delta y_{t-p} + u_t$$
$$\Delta y_t = \alpha + \theta y_{t-1} + \sum_{p=1}^{P} \omega_p \Delta y_{t-p} + u_t$$
$$\Delta y_t = \alpha + \theta y_{t-1} + \beta t + \sum_{p=1}^{P} \omega_p \Delta y_{t-p} + u_t$$

In the ADF test, we assume that the error terms ($u_t$) are independent and have constant variance. Also, the choice of the lag length $P$ in the regression equation is rather arbitrary. To overcome these problems, Phillips and Perron generalized the ADF test as follows:

$$y_t = \alpha^* + \theta^* y_{t-1} + u_t$$
$$y_t = \tilde{\alpha} + \tilde{\theta} y_{t-1} + \tilde{\beta}\left(t - \frac{T}{2}\right) + u_t$$

Even though these equations look simpler than those of the ADF test, this test allows for a far more general data generating process than the ADF test does. Both tests use the same critical values.

Select the series you want to test for a unit root, and double click it (or use View|Show) to get the series window. Click View|Unit Root Test and choose the appropriate options (ADF or Phillips-Perron, and the appropriate equation for the unit root test).

7.2. Vector Autoregression (VAR)

A VAR is a system of stationary time series variables. Each equation has the same right-hand side variables, consisting of exogenous variables and the lagged values of all endogenous variables in the system. This system is often used to determine the causality (Granger causality) between variables. It is also useful for investigating the effects of external shocks on the endogenous variables using the impulse response function.

$$Y_t = \alpha_{10} + \beta_{11} Y_{t-1} + \beta_{12} Z_{t-1} + \beta_{13} t + u_{1t}$$
$$Z_t = \alpha_{20} + \beta_{21} Y_{t-1} + \beta_{22} Z_{t-1} + \beta_{23} t + u_{2t}$$

Click Objects|New Object|VAR and select the appropriate entries. For the VAR specification, choose Unrestricted VAR and specify the endogenous and exogenous variables. Also specify the lag length and sample period.

7.3. Cointegration

When time series variables are non-stationary, it is interesting to see whether those non-stationary series share a common trend. If two non-stationary series $X_t \sim I(1)$ and $Y_t \sim I(1)$ have a linear relationship such that $Z_t = m + \alpha X_t + \beta Y_t$ and $Z_t \sim I(0)$ ($Z_t$ is stationary), then we say that the two series $X_t$ and $Y_t$ are cointegrated.
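To make this definition concrete, here is a minimal Python sketch (not EViews code). It simulates a random walk $X_t$ and a second series $Y_t$ that shares its trend, then compares the variability of $X_t$ with that of the linear combination $Z_t = Y_t - 2X_t$. The coefficient 2, the sample size, the seed and the function names are assumptions made only for this illustration.

```python
import random

def simulate_cointegrated(T=500, seed=0):
    """Simulate X_t ~ I(1) and Y_t = 2*X_t + noise (hypothetical values)."""
    rng = random.Random(seed)
    x, y = [], []
    level = 0.0
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)                  # random walk: X_t = X_{t-1} + e_t
        x.append(level)
        y.append(2.0 * level + rng.gauss(0.0, 1.0))   # Y_t shares the trend in X_t
    return x, y

def variance(series):
    """Sample variance (dividing by the number of observations)."""
    mean = sum(series) / len(series)
    return sum((v - mean) ** 2 for v in series) / len(series)

x, y = simulate_cointegrated()
# Z_t = m + alpha*X_t + beta*Y_t with m = 0, alpha = -2, beta = 1
z = [yi - 2.0 * xi for xi, yi in zip(x, y)]
print("var(X) =", variance(x))   # grows with the sample length for a random walk
print("var(Z) =", variance(z))   # stays near the noise variance
```

In this simulated example both $X_t$ and $Y_t$ wander, but $Z_t$ fluctuates around zero with a stable variance, which is what $Z_t \sim I(0)$ means in practice.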
The two broad approaches to testing for cointegration are Engle and Granger (1987) and Johansen (1988). Broadly speaking, a cointegration test examines whether the residuals from a regression between two non-stationary series are stationary. For the Engle-Granger test, regress $Y_t$ on $X_t$ (or vice versa), and test whether the residual is stationary (using the unit root test described above). If it is stationary, the two series $X_t$ and $Y_t$ are cointegrated.

Johansen uses a more complicated VAR structure to test for cointegration. EViews uses the Johansen test for cointegration. With multiple non-stationary time series, there may be more than one linear relationship that forms a cointegrating relation. The number of such relationships is called the cointegration rank.

For the cointegration test, select the series (group of variables) to be tested to obtain the group window. Choose View|Cointegration Test and specify the appropriate settings for testing. The settings determine whether to include an intercept and/or a linear deterministic time trend in the cointegration equation.

7.4. Error Correction Model

If two or more non-stationary time series are cointegrated, then there exists an Error Correction Model (ECM). Cointegration is a necessary condition for an ECM. Even though the individual series are non-stationary, when they are cointegrated there is a long run equilibrium relationship between them, and the ECM describes this relationship.

$$\Delta X_t = m_1 + \theta_{11} \Delta X_{t-1} + \theta_{12} \Delta Y_{t-1} + \gamma_1 Z_{t-1} + u_{1t}$$
$$\Delta Y_t = m_2 + \theta_{21} \Delta X_{t-1} + \theta_{22} \Delta Y_{t-1} + \gamma_2 Z_{t-1} + u_{2t}$$

Here $\gamma_1$ and $\gamma_2$ denote the coefficients on the lagged cointegration term $Z_{t-1}$. The ECM is similar to a VAR, but the original series are non-stationary and cointegrated.

To estimate an ECM, follow the same path as for VAR estimation. Click Objects|New Object|VAR and select the appropriate entries. For the VAR specification, choose Vector Error Correction, and specify the appropriate cointegration equation (i.e., with or without an intercept and/or deterministic time trends).

8.
System of Equations

When we have more than one equation to estimate together, we can use the additional information from the other equations to improve the efficiency of the parameter estimates.

8.1. Seemingly Unrelated Regression

$$Y1_t = \beta_{10} + \beta_{11} X1_{t1} + \beta_{12} X1_{t2} + u1_t$$
$$Y2_t = \beta_{20} + \beta_{21} X2_{t1} + \beta_{22} X2_{t2} + u2_t$$

Y1 and Y2 are dependent variables and the X1s and X2s are independent variables. The error terms u1 and u2 are contemporaneously correlated, i.e., $cov(u1, u2) \neq 0$. The OLS estimators are still consistent, but they are not efficient.

Click Objects|New Object|System; you can give the system a name in the Name for Object box for later use. Click OK to get a blank System window. Type your equations, such as Y1=C(1)+C(2)*X1+C(3)*X2 for the first equation and Y2=C(4)+C(5)*X3+C(6)*X4 for the second, etc. Click Estimate and choose Seemingly Unrelated Regression for the SUR model. Since SUR estimation involves numerical iteration, you can choose the number of iterations and the convergence criterion with the Options button.

8.2. Simultaneous Equation System

$$Y1_t = \beta_{10} + \beta_{11} X1_t + \theta_{12} Y2_t + u1_t$$
$$Y2_t = \beta_{20} + \beta_{21} X2_t + \theta_{21} Y1_t + u2_t$$

This equation system differs from the SUR model in that the dependent variables appear on the right-hand side of the equations. Because of this endogeneity problem, simple OLS on each equation yields inconsistent estimators. Before this simultaneous equation system can be estimated, each equation must first satisfy the identification conditions (the order condition and the rank condition).

8.2.1. Two Stage Least Squares (TSLS).

This is one of the most often used estimation methods for simultaneous equations. The first stage of TSLS regresses each endogenous variable on all exogenous variables in the system and any other instrumental variables. The second stage is least squares estimation of the structural equations, using the fitted values of the endogenous variables from the first stage. The structural parameters are estimated for each equation separately.

8.2.2.
Three Stage Least Squares (3SLS)

3SLS is a more efficient estimation procedure than TSLS in the sense that 3SLS estimates all the structural parameters at once. The first two stages of 3SLS are equivalent to TSLS, but 3SLS uses the TSLS estimates to estimate the covariance structure of the entire system. Using the estimated covariances of the system, the final (third) stage is GLS estimation of the entire system.

To use either of the above estimation methods, click Objects|New Object|System as for SUR estimation. In the System window, type in only the behavioral equations. Behavioral equations are the ones with structural parameters to estimate. Ignore any identities in the system. Type your equations, such as Y1=C(1)+C(2)*Y2+C(3)*X1 for the first equation and Y2=C(4)+C(5)*Y1+C(6)*X2 for the second, etc. Since this is TSLS or 3SLS estimation, you need to specify the instrumental variables for the first stage estimation. List them after the structural equations.

For TSLS and 3SLS, you need all the exogenous variables in the system as instrumental variables. Type INST X1 X2. The constant is automatically included as an instrumental variable. Click Estimate in the System menu bar. You will have a choice of estimation methods. Choose either Two Stage Least Squares or Three Stage Least Squares.

8.3. Generalized Method of Moment (GMM) Estimation

GMM is a relatively new estimation technique in econometrics, and it is intuitively appealing because it requires only weak assumptions about the estimation process. It is one example from the growing literature on semi-parametric estimation methods. GMM uses the sample analog of a population orthogonality condition to estimate the parameters. For example, if the exogenous variable ($X_t$) is independent of the random disturbance term ($u_t$), then we have the population orthogonality condition $E(X_t u_t) = 0$.
Then we also have $E(u_t \mid X_t) = 0$ and $E[u_t\, g(X_t)] = 0$ for any function $g$. From these population orthogonality conditions, we can form their sample analogs,

$$\frac{1}{T}\sum_{t=1}^{T} u_t = 0 \quad \text{and} \quad \frac{1}{T}\sum_{t=1}^{T} u_t X_t = 0,$$

where $u_t = Y_t - \beta_0 - \beta_1 X_t$. These are called the sample moments, and we estimate the population parameters by minimizing the following criterion function $S(\beta)$:

$$S(\beta) = m_T' W_T m_T, \quad \text{where} \quad m_T = \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T} u_t \\ \frac{1}{T}\sum_{t=1}^{T} u_t X_t \end{pmatrix},$$

and $W_T$ is a weighting matrix defined as $\left[m_T(\beta)\, m_T(\beta)'\right]^{-1}$.

EViews provides GMM estimation as part of system of equations estimation. Follow the same steps as above for TSLS (3SLS). Type the estimation equations and the list of instrumental variables, then choose GMM. Depending on the data structure, choose the method appropriate for either a heteroskedasticity or an autocorrelation problem.
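The two sample moments above can be sketched in a few lines of Python (this is an illustration, not EViews code). With two moments and two parameters the model is exactly identified, so the criterion function can be driven to zero, and the solution is the familiar OLS formula. The data values and function names are made up, and the weighting matrix is set to the identity matrix purely for simplicity.

```python
def sample_moments(b0, b1, x, y):
    """Return the two sample moments (1/T)*sum(u_t) and (1/T)*sum(u_t*X_t)."""
    T = len(x)
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]   # u_t = Y_t - b0 - b1*X_t
    m1 = sum(u) / T
    m2 = sum(ui * xi for ui, xi in zip(u, x)) / T
    return m1, m2

def criterion(b0, b1, x, y):
    """S(beta) = m' W m, with W assumed to be the identity matrix here."""
    m1, m2 = sample_moments(b0, b1, x, y)
    return m1 * m1 + m2 * m2

def solve_moments(x, y):
    """Solve both moment conditions exactly (two equations, two unknowns)."""
    T = len(x)
    xbar = sum(x) / T
    ybar = sum(y) / T
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = solve_moments(x, y)
print("b0 =", b0, "b1 =", b1)
print("S(beta) at the solution:", criterion(b0, b1, x, y))
print("S(beta) at a nearby point:", criterion(b0 + 0.5, b1, x, y))
```

When there are more moment conditions than parameters, the moments cannot all be set to zero at once, and the choice of the weighting matrix then affects the estimates.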