Ezequiel Uriel
University of Valencia
Version: 09-2013
1 Econometrics and economic data
1.1 What is econometrics?
1.2 Steps in developing an econometric model
1.3 Economic data
First, let us look at the origin of econometrics as a discipline. The term econometrics is believed to have been coined by Ragnar Frisch, co-winner of the first Nobel Prize in Economic Sciences in 1969, along with fellow econometrician Jan Tinbergen. Both of them were founders of the Econometric Society, which was established in 1930. Section I of the constitution of this society states:
“The Econometric Society is an international society for the advancement of economic theory in its relation to statistics and mathematics. Its main object shall be to promote studies that aim at a unification of the theoretical-quantitative and the empirical-quantitative approach to economic problems and that are penetrated by constructive and rigorous thinking similar to that which has come to dominate the natural sciences.”
In the first issue of Econometrica (1933), the Econometric Society journal, Ragnar Frisch gives us an explanation of the meaning of econometrics:
“But there are several aspects of the quantitative approach to economics, and no single one of these aspects, taken by itself, should be confounded with econometrics. Thus, econometrics is by no means the same as economic statistics. Nor is it identical with what we call general economic theory, although a considerable portion of this theory has a definitely quantitative character. Nor should econometrics be taken as synonymous with the application of mathematics to economics. Experience has shown that each of these three viewpoints, that of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient condition for a real understanding of the quantitative relations in modern economic life. It is the unification of all three that is powerful. And it is this unification that constitutes econometrics.”
Today, we would also say that econometrics is the combined study of economic models, mathematical statistics, and economic data. Within the field of econometrics, econometric theory can be distinguished from applied econometrics.
Econometric theory concerns the development of tools and methods, and the study of the properties of econometric methods. Econometric theory belongs to the field of statistics.
Applied econometrics is a term describing the development of quantitative economic models and the application of econometric methods to these models using economic data. Applied econometrics is mainly used in the field of applied economics.
What are the goals of Econometrics? We are going to examine three:
1) Knowledge of the real economy. Econometric methods allow us to estimate economic magnitudes such as the marginal propensity to consume or the elasticity of labor with respect to output. These estimates refer to a specific time and place: for example, Spain in the last quarter of the 20th century. In addition to estimation, in which numerical values are obtained, econometric methods allow us to test hypotheses; for example, in a production function, is the hypothesis of constant returns to scale admissible?
2) Economic policy simulation. Econometric methods can be used to simulate the effects of alternative policies. For example, with an appropriate econometric model we could see, in quantitative terms, how different increases in the tobacco tax affect the consumption of tobacco.
3) Prediction or forecasting. Very often econometric methods are used to predict future values of economic variables. By making predictions we try to reduce our uncertainty about the future of the economy. This is not an easy task, since in general the predictions are only satisfactory when there are no drastic changes in the economy. Although it would be useful to be able to predict these drastic changes accurately, both econometric and other alternative methods tend to be imprecise.
There are three main steps in developing an econometric model: specification, estimation and validation.
While in a first approximation these stages follow a sequential order, in econometric analysis it is generally necessary to go back more than once within this sequence. It is necessary to continuously confront the model with the data and any other information source, in order to obtain an econometric model compatible with the data.
The model can be used to analyze reality, offer better predictions or constitute a good basis for making decisions. Now we will describe the steps listed above.
(a) Specification
In this first step, the model or models used must be defined, as well as data to be used in the estimation stage.
In the specification step, we will refer to four elements: the economic model, the econometric model, the statistical assumptions of the model and the data. In this section we will refer to the first three elements; in the following section we will examine different types of data used in econometric analysis.
The first element we need is an economic model. In some cases, a formal economic model is constructed entirely using economic theory. In other cases, economic theory is used less formally in constructing an economic model.
Once we have an economic model, we must convert it into an econometric model. We illustrate this conversion with two examples.
EXAMPLE 1.1 Keynesian consumption function
Keynes formulated his well-known consumption function in three propositions:
Proposition 1: Consumption is a function of income, and both variables are measured in real terms. Measuring the variables in real terms means that, when consumers decide the proportion of income devoted to consumption, they are not affected by money illusion.
Analytically, proposition 1 can be expressed in the following way:
$$cons = f(inc) \tag{1-1}$$
Proposition 2: Consumption is an increasing function of income, but an increase in income always causes a smaller increase in consumption.
This proposition implies that the marginal propensity to consume is greater than 0 (consumption is an increasing function of income) but smaller than 1 (an increase in income always causes a smaller increase in consumption).
Analytically, proposition 2 can be expressed in the following way:
$$0 < \frac{d\,cons}{d\,inc} < 1 \tag{1-2}$$
Proposition 3: The proportion of income consumed falls as income increases. That is to say, the proportion of the last euro earned devoted to consumption is smaller than the proportion of total income earned devoted to consumption.
Analytically, proposition 3 can be expressed in the following way:
$$\frac{d\,cons}{d\,inc} < \frac{cons}{inc} \tag{1-3}$$
In other words, the marginal propensity to consume is smaller than the average propensity to consume.
These three propositions constitute an economic model: the Keynesian consumption function.
To estimate and test this model we must convert it into an econometric model. For this conversion, two requirements must be met.
According to the first requirement, it is necessary to specify the mathematical form of the function. The linear function has been used in this case because, in addition to being simple, it is compatible with the description made by Keynes.
To justify the second requirement, it must be taken into account that the model formulated in proposition 1 is deterministic. That is to say, income is the only factor in the determination of consumption. But in real life there are many factors other than income which have an influence on consumption. In an econometric model, all the factors other than the included independent variables are gathered in a variable called the random disturbance or error (u). The second requirement is the introduction of the error term in the equation.
In general, all the relevant factors must be introduced explicitly in the econometric model; all the other factors are taken into account in a single variable: the error or random disturbance. In the Keynesian consumption function the only relevant factor considered is income.
Taking into account these two requirements, the Keynesian consumption function can be expressed in the following way:
$$cons = \beta_1 + \beta_2\, inc + u \tag{1-4}$$
This is an econometric model that can be estimated if you have data on consumption and income.
Let us now turn to the other two propositions. In this linear model, the marginal propensity to consume is the following:
$$\frac{d\,cons}{d\,inc} = \beta_2 \tag{1-5}$$
Consequently, proposition 2 in this model is the following:
$$0 < \beta_2 < 1 \tag{1-6}$$
Once the model has been estimated, it is possible to test whether the estimate of $\beta_2$ is between 0 and 1.
The average propensity to consume in the linear model, considering that the error is equal to 0, is the following:
$$\frac{cons}{inc} = \frac{\beta_1 + \beta_2\, inc}{inc} = \frac{\beta_1}{inc} + \beta_2 \tag{1-7}$$
Therefore, proposition 3 implies that
$$\beta_2 < \frac{\beta_1}{inc} + \beta_2 \quad \text{or} \quad \frac{\beta_1}{inc} > 0 \tag{1-8}$$
That is to say, since income is positive,
$$\beta_1 > 0 \tag{1-9}$$
Once the model has been estimated, testing proposition 3 is equivalent to testing whether the intercept is significantly greater than 0.
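The checks described in this example can be illustrated with a minimal Python sketch. It assumes simulated data: the values 100 and 0.8 used for the intercept and the slope of (1-4), the income range and the dispersion of the disturbance are made up for illustration and are not taken from any real data set. The helper `ols_simple` is a hypothetical name that applies the standard closed-form ordinary least squares formulas for a model with an intercept and one regressor. The sketch only inspects the point estimates; a formal significance test would also require standard errors.

```python
import random

def ols_simple(x, y):
    """Return (b1, b2), the OLS intercept and slope of y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Simulated sample (assumption): cons = 100 + 0.8*inc + u
rng = random.Random(0)
inc = [rng.uniform(500, 3000) for _ in range(200)]
cons = [100 + 0.8 * x + rng.gauss(0, 40) for x in inc]

b1_hat, b2_hat = ols_simple(inc, cons)

# Proposition 2: the marginal propensity to consume lies between 0 and 1
prop2_holds = 0 < b2_hat < 1
# Proposition 3: a positive intercept makes the average propensity
# (b1/inc + b2) exceed the marginal propensity (b2) at any positive income
prop3_holds = b1_hat > 0

inc_bar = sum(inc) / len(inc)
apc_at_mean_income = b1_hat / inc_bar + b2_hat

print(f"b1_hat = {b1_hat:.2f}, b2_hat = {b2_hat:.3f}")
print(f"APC at mean income = {apc_at_mean_income:.3f}")
print(f"Proposition 2 holds: {prop2_holds}; proposition 3 holds: {prop3_holds}")
```

With real data the two propositions would be judged by hypothesis tests rather than by the sign and size of the point estimates alone.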
EXAMPLE 1.2 Wage determination
Economic model:
Formal economic theory (human capital theory) says that education (educ), experience (exper) and training (training) are factors that affect productivity and hence the wage (wage). Therefore, an economic model for wage determination could be the following:
$$wage = f(educ, exper, training) \tag{1-10}$$
Incidentally, do you think there is any variable missing in this model?
Econometric model:
The corresponding econometric model, using a linear functional form, is the following:
$$wage = \beta_1 + \beta_2\, educ + \beta_3\, exper + \beta_4\, training + u \tag{1-11}$$
To sum up, to convert an economic model into an econometric model:
a) The form of the function f(·) has been specified.
b) A disturbance variable has been included to reflect the effect of other variables affecting wage, but not appearing in the model.
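As a rough illustration of how the coefficients of (1-11) might be obtained by ordinary least squares, the following Python sketch simulates a sample and solves the normal equations with plain Gaussian elimination. Everything numerical here is hypothetical: the coefficient values (2, 0.6, 0.1, 0.3), the ranges of the regressors and the units are assumptions, and the helper names `solve_linear_system` and `ols` are introduced for this example only.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y_i for r, y_i in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Simulated sample (assumption):
# wage = 2 + 0.6*educ + 0.1*exper + 0.3*training + u
rng = random.Random(1)
rows, wage = [], []
for _ in range(500):
    educ = rng.randint(8, 20)      # years of education
    exper = rng.randint(0, 30)     # years of experience
    training = rng.randint(0, 5)   # training courses completed
    u = rng.gauss(0, 1)            # disturbance: all other factors
    rows.append([1.0, educ, exper, training])  # leading 1.0 is the intercept
    wage.append(2 + 0.6 * educ + 0.1 * exper + 0.3 * training + u)

b1, b2, b3, b4 = ols(rows, wage)
print(f"intercept={b1:.2f}, educ={b2:.3f}, exper={b3:.3f}, training={b4:.3f}")
```

The disturbance `u` plays the role described in point b): it collects whatever affects the wage but is left out of the regressors.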
An important element in the specification of the model is the formulation of a set of statistical assumptions, which are used in subsequent steps. These statistical assumptions play a key role in hypothesis testing and, in general, throughout the inference process carried out with the model.
(b) Estimation
In the estimation process we obtain numerical values of the coefficients of an econometric model. To complete this stage, data are required on all observable variables that appear in the specified econometric model. It is also necessary to select an appropriate estimation method, taking into account the implications of this choice for the statistical properties of the estimators of the coefficients. The distinction between estimator and estimate should be made clear. An estimator is the result of applying an estimation method to an econometric specification: it is a rule, or formula, expressed in terms of the sample data. An estimate, on the other hand, is the numerical value that an estimator takes for a given sample. For example, applying a very simple estimation method, called ordinary least squares, to the specification of the consumption function (1-4) provides expressions which determine the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$. Substituting the sample data into these expressions, two numbers are obtained, one for $\hat{\beta}_1$ and one for $\hat{\beta}_2$, which are the estimates of the parameters $\beta_1$ and $\beta_2$.
In general, it is possible to obtain analytical expressions of the estimators, particularly in the case of estimating linear relationships. In non-linear estimation procedures, however, it is often difficult to obtain an analytical expression.
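For the linear consumption function (1-4), the analytical expressions in question are the standard ordinary least squares formulas for a model with an intercept and one regressor. The three-observation sample used below is made up purely to show the arithmetic that turns the estimators into estimates; it is not real data.

```latex
% Estimators: formulas in terms of the sample data
\hat{\beta}_2 = \frac{\sum_{i=1}^{n}(inc_i - \overline{inc})(cons_i - \overline{cons})}
                     {\sum_{i=1}^{n}(inc_i - \overline{inc})^2},
\qquad
\hat{\beta}_1 = \overline{cons} - \hat{\beta}_2\,\overline{inc}

% Hypothetical sample: inc = (10, 20, 30), cons = (12, 18, 27)
\overline{inc} = 20, \qquad \overline{cons} = 19

% Estimates: numbers obtained for this particular sample
\hat{\beta}_2 = \frac{(-10)(-7) + (0)(-1) + (10)(8)}{(-10)^2 + 0^2 + 10^2}
              = \frac{150}{200} = 0.75

\hat{\beta}_1 = 19 - 0.75 \times 20 = 4
```

The formulas are the estimators; the numbers 4 and 0.75 are the estimates for this sample. A different sample would generally give different estimates from the same estimators.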
(c) Validation
The results are assessed in the validation stage, where we examine whether the estimates obtained in the previous stage are acceptable, both theoretically and from the statistical point of view. On the one hand, we analyze whether the estimates of the model parameters have the expected signs and magnitudes: that is to say, whether they satisfy the constraints established by economic theory.
From the statistical point of view, on the other hand, statistical tests are performed on the significance of the parameters of the model, using the statistical assumptions made in the specification step. In turn, it is important to test whether the statistical assumptions of the econometric model are fulfilled, although it should be noted that not all assumptions are testable. The violation of any of these assumptions generally calls for another estimation method, one that allows us to obtain estimators whose statistical properties are as good as possible.
One way to establish the ability of a model to make predictions is to use the model to forecast outside the sample period, and then to compare the predicted values of the endogenous variable with the values actually observed.
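The out-of-sample comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are simulated (a linear consumption function with intercept 100 and slope 0.8, and income growing over 100 periods), the first 80 periods form the estimation sample, and the forecasts for the last 20 periods use the income actually observed in those periods. The helper `ols_simple` is a hypothetical name, and the root mean squared error is one common summary of forecast accuracy among several possible ones.

```python
import math
import random

def ols_simple(x, y):
    """Return (b1, b2), the OLS intercept and slope of y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2

# Simulated time-ordered sample (assumption): income trends upwards,
# cons = 100 + 0.8*inc + u
rng = random.Random(2)
inc = [1000 + 20 * t + rng.gauss(0, 50) for t in range(100)]
cons = [100 + 0.8 * x + rng.gauss(0, 40) for x in inc]

# Estimate on the first 80 periods only
split = 80
b1_hat, b2_hat = ols_simple(inc[:split], cons[:split])

# Forecast the remaining 20 periods and compare with observed values
forecasts = [b1_hat + b2_hat * x for x in inc[split:]]
errors = [actual - pred for actual, pred in zip(cons[split:], forecasts)]

rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
mean_error = sum(errors) / len(errors)

print(f"Out-of-sample RMSE = {rmse:.2f}, mean forecast error = {mean_error:.2f}")
```

A mean forecast error far from zero would suggest systematic over- or under-prediction, while the RMSE summarizes the typical size of the errors.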
1.3 Economic data
As we have seen, an empirical analysis uses data to test a theory or to estimate a relationship. It is important to stress that in econometrics we use non-experimental data.
Non-experimental or observational data are collected by observing the real world in a passive way. In this case, data are not the outcome of controlled experiments. Experimental data, by contrast, are often collected in laboratory environments in the same way as in the natural sciences.
Now, we are going to see three types of data which can be used in the estimation of an econometric model: time series, cross sectional data, and panel data.
Time Series
In time series, data are observations on a variable over time. For example: magnitudes from national accounts such as consumption, imports, income, etc. The chronological ordering of observations provides potentially important information.
Consequently, ordering matters.
Time series data cannot be assumed to be independent across time. Most economic series are related to their recent histories. Typical examples include macroeconomic aggregates such as prices and interest rates. This type of data is characterized by serial dependence.
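Serial dependence, and the reason why ordering matters, can be illustrated with a small Python sketch. It assumes a simulated series in which each value equals 0.9 times the previous value plus a random shock; this process and its coefficient are assumptions chosen for illustration, not a model proposed in the text. The hypothetical helper `lag1_autocorrelation` measures how strongly each observation is related to the previous one, first for the series in chronological order and then for the same values randomly reordered.

```python
import random

def lag1_autocorrelation(series):
    """Sample correlation between the series and itself lagged one period."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

rng = random.Random(3)

# Simulated series (assumption): y_t = 0.9 * y_{t-1} + e_t
y = [0.0]
for _ in range(499):
    y.append(0.9 * y[-1] + rng.gauss(0, 1))

# Same values, chronological order destroyed
shuffled = y[:]
rng.shuffle(shuffled)

rho_ordered = lag1_autocorrelation(y)
rho_shuffled = lag1_autocorrelation(shuffled)

print(f"Lag-1 autocorrelation, chronological order: {rho_ordered:.2f}")
print(f"Lag-1 autocorrelation, shuffled order:      {rho_shuffled:.2f}")
```

The two series contain exactly the same numbers; only the chronological version carries the information about dependence on the recent past.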
Given that most aggregated economic data are only available at a low frequency
(annual, quarterly or perhaps monthly), the sample size can be much smaller than in typical cross sectional studies. The exception is financial data where data are available at a high frequency (weekly, daily, hourly, etc.) and so sample sizes can be quite large.
Cross Sectional Data
Cross sectional data sets have one observation per individual, and the data refer to a given point in time. In most studies, the units surveyed are individuals (for example, in the Labor Force Survey (EPA) more than 100000 individuals are interviewed every quarter), households (for example, the Household Budget Survey), firms (for example, an industrial firm survey) or other economic agents.
Surveys are a typical source for cross-sectional data. In many contemporary econometric cross sectional studies the sample size is quite large.
In cross sectional data, observations must be obtained by random sampling. Thus, cross sectional observations are mutually independent. The ordering of observations in cross sectional data does not matter for econometric analysis. If the data are not obtained with a random sample, we have a sample selection problem.
So far we have referred to micro data, but there may also be cross sectional data relating to aggregate units such as countries, regions, etc. Of course, data of this type are not obtained by random sampling.
Panel Data
Panel data (or longitudinal data) are time series for each cross sectional member in a data set. The key feature is that the same cross sectional units are followed over a given time period. Panel data combine elements of cross sectional and time series data.
These data sets consist of a set of individuals (typically people, households, or corporations) surveyed repeatedly over time. The common modeling assumption is that the individuals are mutually independent, but for a given individual, observations are dependent over time. Thus, the ordering in the cross section of a panel data set does not matter, but the ordering in the time dimension matters a great deal. If we do not take into account the time dimension in panel data, we say that we are using pooled cross sectional data.
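The structure of a panel data set can be sketched in Python as a list of records, one per unit and period. The firms, years and sales figures below are invented for illustration, and the helper names `time_series_by_unit` and `cross_section` are hypothetical. The sketch shows the two ways of slicing a panel: by unit, which gives a time series whose order matters, and by period, which gives a cross section whose order does not.

```python
from collections import defaultdict

# Hypothetical panel in long format: one record per (firm, year).
# Records are deliberately listed in no particular order.
panel = [
    {"firm": "B", "year": 2011, "sales": 82.0},
    {"firm": "A", "year": 2010, "sales": 120.0},
    {"firm": "C", "year": 2012, "sales": 47.0},
    {"firm": "A", "year": 2012, "sales": 131.0},
    {"firm": "B", "year": 2010, "sales": 80.0},
    {"firm": "C", "year": 2010, "sales": 40.0},
    {"firm": "A", "year": 2011, "sales": 125.0},
    {"firm": "C", "year": 2011, "sales": 44.0},
    {"firm": "B", "year": 2012, "sales": 85.0},
]

def time_series_by_unit(records, unit_key, time_key, value_key):
    """Group records by unit and sort each unit's observations by time."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[unit_key]].append((rec[time_key], rec[value_key]))
    return {unit: sorted(obs) for unit, obs in grouped.items()}

def cross_section(records, time_key, period):
    """Return all records observed in one period (their order is irrelevant)."""
    return [rec for rec in records if rec[time_key] == period]

series = time_series_by_unit(panel, "firm", "year", "sales")
cs_2011 = cross_section(panel, "year", 2011)

# The same units appear in every period: the defining feature of a panel
units_2011 = sorted(rec["firm"] for rec in cs_2011)

print(series["A"])
print(units_2011)
```

Ignoring the `year` field and treating all nine records as one sample corresponds to what the text calls pooled cross sectional data.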