INTRODUCTION TO ECONOMETRICS

DEFINITION
Econometrics is the branch of economics that uses data to analyze economic relationships. An economic relationship is a relationship between two or more economic variables. Let X and Y be two economic variables, and let X1, X2, …, Xk, and Y be three or more economic variables. In general, econometrics uses data to analyze the relationship between X and Y, or between X1, X2, …, Xk and Y.

Why Do We Want to Use Data to Analyze Economic Relationships?
We usually have two objectives.
1) Explanation. We want to know whether X affects Y. That is, does X “produce an effect” on Y?
2) Prediction. We want to know whether X predicts Y. That is, can we use X to predict Y?

EXPLANATION
To explain the relationship between X and Y, or between X1, X2, …, Xk and Y, we need to address 5 questions.
1. Does X have an effect on Y? Or do X1, X2, …, Xk have separate and/or joint effects on Y?
2. What is the direction of the effect(s)?
3. What is the size of the effect(s)?
4. Which X has the biggest (smallest) effect?
5. What is the mechanism that produces the effect?

How Do We Define Effect?
Many possible effects can produce an economic relationship. Any of the following might explain the relationship between X and Y.
X causes Y.
Y causes X.
Z causes both X and Y.
Z causes Y and is correlated with X.
X causes Z, and Z causes Y.
X and Y have a spurious correlation that results from chance.

Independent Causal Effect
We are almost always interested in the independent causal effect of X on Y. If we change X and hold Z constant, will Y change?

How Do We Measure Effect?
There are 2 alternative measures of effect.
1) Total effect.
2) Marginal effect.

Total Effect
The total effect of X on Y is the change in Y when X changes from its current value to zero, or from zero to its current value.

Marginal Effect
The marginal effect of X on Y is the change in Y when X changes by one unit.

Elasticity
The elasticity of Y with respect to X is the percentage change in Y when X changes by one percent.
It is a unit-free measure of the marginal effect. It allows us to compare the size of the effects of two or more variables that are measured in different units.

Marginal Effect Can Be Constant or Variable
The marginal effect of X on Y can be constant or variable. This depends on whether the relationship between X and Y is linear or non-linear. It is constant when the relationship is linear, and it varies with the level of X when the relationship is non-linear.

Marginal Effect Can Be Static or Dynamic
The marginal effect of X on Y can be static or dynamic. The static independent causal effect of X on Y is the effect of a one unit change in X on Y for a single time period. The dynamic independent causal effect of X on Y is the effect of a one unit change in X in the current period on Y in the current period and one or more future periods. It is a sequence or time path of two or more causal effects of a one unit change in X on Y. The dynamic causal effect is also called the dynamic multiplier.

PREDICTION
To use X (or X1, X2, …, Xk) to predict Y, we need to address the following questions.
1. Do X (or X1, X2, …, Xk) and Y have a pattern or relationship that allows us to use X (or X1, X2, …, Xk) to predict Y?
2. Does X (or X1, X2, …, Xk) explain a relatively large amount of the variation in Y?
3. Is the pattern or relationship between X (or X1, X2, …, Xk) and Y stable?

EXPLANATION VS. PREDICTION
If our objective is to explain the relationship between X and Y, then we need to know the independent causal effect of X on Y. If our objective is to use X to predict Y, we don’t need to know the independent causal effect of X on Y. We can make a good prediction if X and Y have a stable pattern and X explains a sufficiently large amount of the variation in Y.

EMPIRICAL STUDY
To use data to analyze economic relationships, economists conduct an empirical study.

Design of an Empirical Study
The design of an empirical study refers to how the data are generated and analyzed.
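
To make the contrast between explanation and prediction concrete, the following Python sketch simulates a hypothetical data generation process in which Z causes both X and Y while X has no effect on Y. The variable names, the coefficient 2.0, the noise scales, and the sample size are assumptions chosen for illustration and do not come from these notes. Least squares is used here only as a convenient way to hold Z constant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # assumed sample size

# Hypothetical process: Z causes both X and Y; X has NO effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(scale=0.5, size=n)
y = 2.0 * z + rng.normal(scale=0.5, size=n)

# Prediction view: X and Y share a strong pattern.
corr_xy = np.corrcoef(x, y)[0, 1]
slope_alone = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # slope of Y on X alone

# Explanation view: include Z so that it is held constant.
design = np.column_stack([np.ones(n), x, z])  # columns: constant, X, Z
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

print("correlation of X and Y:       ", round(corr_xy, 3))
print("slope of Y on X alone:        ", round(slope_alone, 3))
print("slope of Y on X, Z held fixed:", round(coef[1], 3))
print("slope of Y on Z, X held fixed:", round(coef[2], 3))
```

In this simulated setting X is strongly correlated with Y and would be useful for prediction as long as the pattern stays stable, yet its coefficient is close to zero once Z is held constant, so X has no independent causal effect on Y.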
Independent Causal Effect and Study Design
To determine whether X has an independent causal effect on Y, and if so the direction and size of the effect, the study must be designed to account for 3 possibilities.
1) Confounding variables.
2) Reverse causation.
3) Chance.

Steps in an Empirical Study
An empirical study has 6 steps.
1) Objective.
2) Data.
3) Descriptive statistics.
4) Statistical model.
5) Estimation, hypothesis testing, goodness of fit.
6) Conclusions.

OBJECTIVE
The objective of the study is to use a sample of data to make conclusions about an economic relationship in a population.

DATA
A sample of data must be obtained from a population. This sample of data must be produced. The process that produces the data is called the data generation process or the experiment. There are 2 types of data generation processes.
1) Controlled experiment.
2) Uncontrolled experiment.
The data generation process can produce 4 types of data.
1) Cross-section data.
2) Time series data.
3) Pooled cross-section data.
4) Panel (longitudinal) data.

Controlled Experiment
If the data is obtained from a controlled experiment, then the researcher designs the process that generates the data on X and Y. This data is called experimental data.

Uncontrolled Experiment
If the data is obtained from an uncontrolled experiment, then the researcher doesn’t design the process that generates the data on X and Y. A natural process generates the data. The researcher passively observes this process and collects the data after it is produced. This data is called observational data.

Types of Data
Cross-section data is a set of observations on two or more units at a given point in time.
Time series data is a set of observations on a single unit for two or more time periods.
Pooled cross-section data is a set of observations on two or more units for two or more time periods. The units in each time period are different.
Panel (longitudinal) data is a set of observations on two or more units for two or more time periods. The units in each time period are the same.

DESCRIPTIVE STATISTICS
Once we have our data, we must “get to know our data.” To do so, we calculate descriptive statistics for the variables. These usually include the frequency distributions, means, standard deviations, and correlation coefficients for the variables.

STATISTICAL MODEL
A statistical model is a model that describes a data generation process. A statistical model that describes an economic data generation process is called an econometric model.

Equation
To specify the econometric model, we specify an equation that describes the economic relationship of interest. The equation has 4 major components.
1) Variables.
2) Error term.
3) Parameters.
4) Functional form.

Variables
We must choose the variables for the equation. The dependent variable is the variable to be explained. The explanatory variables are the variables that we believe have an effect on the dependent variable. The relationship between the dependent and explanatory variables can be written in general functional form as
Yt = ƒ(Xt1, Xt2, …, Xtk),
where the subscript t indicates the tth unit (e.g., individual, firm, state, nation).

Error Term
For most economic relationships, the dependent variable probably depends upon hundreds of factors. The error term represents the “net effect” of all factors other than the explanatory variables on the observed value of the dependent variable. This includes omitted variables and measurement error. The general functional form including the error term is
Yt = ƒ(Xt1, Xt2, …, Xtk; μt),
where μt is the error term for the tth unit. The variables Y, X1, X2, …, Xk are known and observable. The variable μ is unknown and unobservable. We assume μ is a random variable and describe its behavior by a probability distribution.

Parameters
A parameter is a quantifiable characteristic of a population.
It is always unknown and unobservable. The parameters of the equation are constants that link together the dependent and explanatory variables. They are usually denoted using Greek letters such as β1, β2, …, βk. The parameters represent the effects of the X’s on Y.

Functional Form
The next step is to choose a functional form for the relationship between the dependent variable, explanatory variables, parameters, and error term. The functional form describes the nature of the economic relationship. The standard econometric model can be written in general functional form as
Yt = ƒ(Xt1, Xt2, …, Xtk) + μt
or
Observed Y = Average Y + Random Amount.
This tells us that the observed value of Y for the tth unit consists of two components: a systematic component and a random component. The systematic component is given by the conditional mean function:
Average Y = ƒ(Xt1, Xt2, …, Xtk).
It is the average value of Y for any set of X values, given the β values. The systematic component is the average relationship, or pattern, between the X’s and Y. The error term is the random component. For the standard econometric model, a functional form is chosen for the conditional mean function. It is assumed that the error, which can be positive or negative, is added to the average value of Y. Therefore, the error for the tth unit is the difference between the observed value of Y for the tth unit and the average value of Y for all units that have the same set of values of the X’s as the tth unit. For example, if the functional form chosen is linear in variables and parameters, then the equation that describes the statistical relationship between Y and the X’s is
Yt = β1 + β2Xt2 + β3Xt3 + … + βkXtk + μt,
where Xt1 = 1, and therefore its coefficient, β1, is the intercept or constant in the equation.

ESTIMATION, HYPOTHESIS TESTING, GOODNESS OF FIT
The next step in the study involves estimation, hypothesis testing, and goodness of fit.
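
Before turning to estimation, the linear model above can be made concrete with a small simulation. The Python sketch below generates data from Yt = β1 + β2Xt2 + β3Xt3 + μt and separates each observed Y into its systematic and random components. The parameter values, the ranges of the X’s, the normal error distribution, and the sample size are all assumptions made for illustration. In practice the β’s and μ are unknown. The sketch also evaluates the marginal effect of X2, which is constant in a linear model, and the point elasticity β2·X2/(Average Y), a standard formula that is not stated in these notes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5  # assumed number of units

# Hypothetical parameter values: beta1 (intercept), beta2, beta3.
beta = np.array([10.0, 0.5, 2.0])

x2 = rng.uniform(1.0, 10.0, size=n)
x3 = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x2, x3])  # first column is X_t1 = 1

mu = rng.normal(loc=0.0, scale=1.0, size=n)  # random component (error term)
average_y = X @ beta                         # systematic component
observed_y = average_y + mu                  # Observed Y = Average Y + Random Amount

# Marginal effect of X2: change in Average Y when X2 rises by one unit.
X_plus = X.copy()
X_plus[:, 1] += 1.0
marginal_effect_x2 = X_plus @ beta - average_y  # equals beta2 for every unit

# Point elasticity of Average Y with respect to X2 (unit-free, varies by unit).
elasticity_x2 = beta[1] * x2 / average_y

for t in range(n):
    print(f"unit {t + 1}: observed={observed_y[t]:.2f} "
          f"average={average_y[t]:.2f} error={mu[t]:+.2f} "
          f"elasticity={elasticity_x2[t]:.3f}")
```

The error for each unit is recovered as the observed Y minus the average Y for that unit’s X values, which matches the definition of the error term given above.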
Estimation
The sample data is used to obtain estimates of the true unknown population parameters of the econometric model, β1, β2, …, βk. To obtain estimates of the parameters, we must choose an estimator. An estimator is a rule or formula that tells us how to use the sample data to obtain an estimate of a population parameter. We want to choose an accurate and reliable estimator, so that we can obtain an estimate that is close to the true value of the population parameter.

Hypothesis Testing
The sample data is used to test hypotheses about the values of the true unknown population parameters, β1, β2, …, βk.

Specification Testing
We often use the sample data to test hypotheses about the specification of the model. These include hypotheses about variable specification, functional form, and assumptions about the error term.

Goodness of Fit
Goodness of fit measures how well the econometric model fits the sample data. The better the model fits the sample data, the better the model predicts the values of Y in the sample.

CONCLUSIONS
The final step is to make conclusions about the economic relationship of interest and to assess the validity of those conclusions.

Conclusions
The conclusions involve either an explanation or a prediction. To explain the relationship between X and Y, we answer the following questions. Does X have an effect on Y? What is the direction of the effect? What is the size of the effect? What is the mechanism that produces the effect? To make a prediction, we use X to predict the value of Y.

Validity of Conclusions
To judge the validity of the conclusions, we ask the following questions.
1) How valid is the explanation?
2) How good is the prediction?
To assess the validity of an explanation, we use two criteria.
1) Internal validity.
2) External validity.
These concepts make a distinction between the population and setting studied and the population and setting to which the conclusions are generalized.
To be internally valid, the conclusions about the population studied must be unbiased, and confidence statements about those conclusions must be correct. To be externally valid, the conclusions about the population studied must generalize to the population and setting of interest.
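
The estimation, hypothesis testing, and goodness-of-fit steps described earlier can be sketched on simulated data. The notes do not name a particular estimator, so ordinary least squares is used here as one common choice; the t-statistics assume the classical setting with a constant error variance, and R-squared serves as the goodness-of-fit measure. The true parameter values, sample size, and variable names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # assumed sample size

# Hypothetical population parameters; beta3 = 0 means X3 has no effect on Y.
true_beta = np.array([1.0, 2.0, 0.0])

x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x2, x3])
y = X @ true_beta + rng.normal(size=n)

# Estimation: ordinary least squares (an assumed choice of estimator).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Hypothesis testing: t-statistic for H0: beta_j = 0, for each j.
resid = y - X @ beta_hat
k = X.shape[1]
sigma2_hat = resid @ resid / (n - k)            # estimated error variance
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)  # estimated covariance of beta_hat
std_err = np.sqrt(np.diag(cov_beta))
t_stats = beta_hat / std_err

# Goodness of fit: share of the sample variation in Y explained by the model.
r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

for name, b, se, t in zip(["beta1", "beta2", "beta3"], beta_hat, std_err, t_stats):
    print(f"{name}: estimate={b:.3f} std_err={se:.3f} t={t:.2f}")
print(f"R-squared: {r_squared:.3f}")
```

Because the data are simulated, the estimates can be compared with the true values, which is not possible with real data. Whether conclusions drawn from a real sample are internally and externally valid still depends on the study design, not on the arithmetic shown here.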