TWO-STAGE LEAST SQUARES ESTIMATOR AND THE K

advertisement
two-stage least squares estimator and the k-class estimator
Two-stage least squares has been a widely used method of estimating the
parameters of a single structural equation in a system of linear simultaneous
equations. This article first considers the estimation of a full system of equations.
This provides a context for understanding the place of two-stage least squares in
simultaneous-equation estimation. The article concludes with some comments on
the lasting contribution of the two-stage least squares approach and more
generally the future of the identification and estimation of simultaneous-equations
models.
Two-stage least squares (2SLS) was originally proposed as a method of estimating the
parameters of a single structural equation in a system of linear simultaneous equations. It
was introduced more or less independently by Theil (1953a; 1953b; 1961), Basmann
(1957) and Sargan (1958). The early work on simultaneous equations estimation was
carried out by a group of econometricians at the Cowles Foundation. This work was
based on the method of maximum likelihood. In particular, Anderson and Rubin (1949;
1950) developed the limited information maximum likelihood (LIML) estimator for the
parameters of a single structural equation. Anderson (2005) gives the history of 2SLS a
revisionist twist by pointing out that Anderson and Rubin (1950) indirectly includes the
2SLS estimator and its asymptotic distribution. The notation of that paper is difficult and
the exposition is somewhat obscure, which may explain why few econometricians are
aware of its contents. See Farebrother (1999) for additional insights into the precursors of
2SLS.
2SLS was by far the most widely used method in the 1960s and the early 1970s.
The explanation involves both the state of statistical knowledge among applied
econometricians and the state of computer technology. The classic treatment of maximum
likelihood methods of estimation is presented in two Cowles Commission monographs:
Koopmans (1950), Statistical Inference in Dynamic Economic Models, and Hood and
Koopmans (1953), Studies in Econometric Method, which was directed at a wider
audience. Among applied econometricians, relatively few had the statistical training to
1
master the papers in these monographs, especially Koopmans (1950). By the end of the
1950s computer programs for ordinary least squares were available. These programs
were simpler to use and much less costly to run than the programs for calculating LIML
estimates. Owing to advances in computer technology, and, perhaps, also the statistical
background of applied econometricians, the popularity of 2SLS started to wane towards
the end of the 1970s. In particular, the difficulty of calculating LIML estimates was no
longer an important constraint.
This article first considers the estimation of a full system of equations and then
focuses on 2SLS. This approach provides a context for understanding the place of 2SLS
in simultaneous-equation estimation. The article is organized as follows. A twoequation structural form model with normal errors and no lagged dependent variables is
introduced in section 1. Section 2 reviews the properties of the ordinary least squares
estimator of the parameters of a structural equation. The indirect least squares estimator
is introduced in section 3. In section 4 presents the indirect feasible generalized least
squares estimator, and briefly discusses maximum likelihood methods. Section 5
develops two rationales for the 2SLS procedure, and the k-class family of estimators is
defined in section 6. Finite sample results on the comparisons of estimators are reported
in section 7, and the concluding comments are in section 8. (Our exposition of structuralform estimation draws heavily on the treatment by Goldberger, 1991. For the
presentation of GMM and more recent methods of simulation-equation estimation, see
Mittelhammer, Judge and Miller, 2000.)
1. The model
In the spirit of Goldberger (1991), we consider a two-equation demand and supply model
to fix ideas and notation. The endogenous variables are y1 (quantity) and y2 (price), the
exogenous variable is x (income), and the disturbances are u1 (demand shock) and u2
(supply shock). For convenience the intercepts are suppressed in both equations.
The structural form of the model is
Demand y1  1 y2   2 x  u1 ,
(1.1)
Supply y2  3 y1
 u2 .
(1.2)
With the terms in y1 and y2 transferred to the left-hand side, the matrix representation of
structural form is
2
 1 3 
( y1 , y2 ) 
  x( 2 , 0)  (u1 , u2 ),
1 
 1
or
y Γ  xΒ  u.
In the structural-form coefficient matrices Γ and Β , the columns refer to equations, while
the rows refer to variables.
Each endogenous variable can be solved for in terms of the exogenous variables
and structural shocks to get the reduced form of the model:
Quantity. y1  11 x  v1 ,
Price.
y2  12 x  v2 .
(1.3)
(1.4)
In matrix form,
( y1 , y2 )  x(11 , 12 )  (v1 , v2 ),
or
y  xΒΓ1  uΓ1  xΠ  v.
The reduced form is derived by post-multiplying the structural form by Γ1 , where
Π = ΒΓ1 is the reduced-form coefficient matrix and v = uΓ 1 is the reduced-form
disturbance vector.
Next we consider the statistical specification of a linear simultaneous-equation
model for the general case of a m  1 endogenous-variable vector y , the k  1
exogenous-variable vector x and the m  1structural-disturbance vector u . The
specification is the following:
yΓ = xΒ + u,
(A1)
Γ nonsingular,
(A2)
E (u | x) = 0,
(A3)
V (u | x)  Σ positive definite.
(A4)
3
Here Γ is m  m, Β is k  m, Σ is m  m. Assumption (A1) gives the system of m
structural equations in m endogenous variables. Assumption (A2) says that the system is
complete in the sense that y is uniquely determined by x and u . (A3) says that x is
exogenous in the sense that the conditional expectation of u given x is zero for all
values of x . Assumption (A4) is a homoskedasticity requirement, and positive
definiteness rules out exact linear dependency among the structural disturbances.
The implications of the specification (A1)-(A4) are the following:
y  xΠ + v, v = uΓ1 ,
E ( v | x)  0,
V ( v | x)  (Γ1 )ΣΓ1  Ω positive definite.
(B1)
(B2)
(B3)
The reduced-form disturbance vector v is mean-independent of, and homoskedastic with
respect to, the exogenous variable vector x.
Next we turn from the population to the sample. We suppose that a sample of n
observations from the multivariate distribution of x and y is obtained by stratified
sampling: n values of x are selected, forming the rows of the n  k observed matrix X
with rank (X) = k. For each observation, a random drawing is made from the relevant
conditional distribution of y given x, giving the rows of the n  m observed matrix Y,
where the successive drawings are independent. The statements about asymptotic
properties of the estimators rely on the additional assumption that the matrix XX / n has
a positive definite limit. If instead sampling is random from the joint distribution of
x and y , there is no substantial change in the results.
2. Ordinary least squares
In simultaneous equations models, the parameters of interest are the structural parameter,
the  's in the demand-supply example and the elements of Γ and Β and in general case,
rather than the reduced form parameters, the  's or Π . Ordinary least squares (OLS)
estimation of the structural parameters is not appropriate because the structural
parameters are not coefficients of the conditional expectation functions among the
4
observable variables. We now illustrate this point for the supply equation of the demand
and supply model.
The reduced-form of the demand and supply model expressed explicitly in terms
of the structural parameters:
y1  ( 2 x  1u2  u1 ) /(1  13 ),
y2  ( 23 x  3u1  u2 ) /(1  13 ).
(2.1)
(2.2)
For convenience, suppose that x, u1 and u2 are trivariate-normally distributed with zero
means, variances  x2 ,  12 ,  22 , and zero correlations. Then y2 and y1 are bivariate normal,
so the conditional expectation of y2 given y1 is
E ( y2 | y1 )   * y1
with
 *  C ( y1 , y2 ) / V ( y1 ).
If a*   3 , then the sample least squares regression of y2 on y1 will provide a unbiased
minimum variance estimator of  3 . If a*   3 , then least squares is not appropriate for
the estimation of  3 .
From equations (2.1) and (2.2) we calculate
C ( y1 , y2 )  ( 23 x2  312  1 22 ) /(1  1 3 )2 ,
V ( y1 )  ( 22 x2  12  12 22 ) /(1  13 )2 .
Let    22 x2   12 . Then
 *  3
1 22

.

(  12 22 )   12 22
Clearly the parameter of interest  3 is not the slope of the conditional expectation
function of y2 given y1 . This result is usually described by saying that OLS gives a
biased estimator of the structural parameter  3 . Another description is that OLS gives a
unbiased estimator of slope of the conditional expectation function, which happens to
differ from the slope of the structural equation. Observe that a*   3 in the special case
with 1  0 ; in this case, y1 is a function of x and u1 only so that E (u2 | y1 )  0. .
5
The problem with OLS can be illustrated without relying on normality. From
(1.2) we get
E ( y2 | y1 )  3 y1  E (u2 | y1 ) .
From eq. (2.2),
C ( y1 , u2 )  C ( 2 x  1u2  u1 , u2 ) /(1  1 3 )  1 22 /(1  1 3 ) .
Because y1 and u2 are correlated, we see that E (u2 | y1 )  E (u2 )  0.
3. Indirect least squares
The next method we consider uses OLS to estimate the reduced-form parameters, and
then converts the OLS reduced-form estimates into estimates of the structural-form
parameters. This method, called ‘indirect least squares’ (ILS), produces estimates that are
consistent, although not unbiased. Koopmans and Hood (1953) attribute ILS to M. A.
Girshick. Again see Farebrother (1999) for precursors.
The key to ILS is the relation that relates the reduced-form coefficients to the
structural-form coefficients, namely, Π = ΒΓ 1 , which can be rewritten as ΠΓ = Β.
Suppose Π is known along with the prior knowledge that certain elements of Γ and Β are
zero. The question is whether we can solve ΠΓ = Β uniquely for the remaining unknown
elements of Γ and Β . When a structural parameter is uniquely determined, we say that
the parameter is identified in terms of Π or, more simply, that is identified. This suggests
that the identified structural-form parameters may be estimated via OLS estimates of the
reduced-form coefficients.
The relation between reduced-form and structural coefficients for the demand and
supply model is the following:
 1 3 
   2
 1 1 
11 12  
0 .
There are two equations in three unknowns:
11  112   2 12  311  0.
(3.1)
6
On the right-hand-side of (3.1), solve the equation for  3  12 / 11. We conclude that the
slope coefficient of the supply equation is identified. With respect to estimation, the ILS
estimate of  3 is obtained by replacing 11 and 12 by their OLS counterparts.
The ILS estimator of  3 is consistent since the equation-by-equation OLS
estimators of 11 and 12 are consistent. Moreover, the equation-by-equation OLS
estimates are the same as the generalized least squares (GLS) estimates, that is, the OLS
and GLS estimates coincide in every sample. This is because the explanatory variables
are identical in the two reduced-form equations. A consequence is that the ILS estimator
is asymptotically efficient.
4. Indirect feasible generalized least squares
For some simultaneous-equation models, prior knowledge that certain elements of
Γ and Β are zero implies restrictions on Π . In this situation, equation-by-equation OLS
estimates of the π’s are not optimal, and ILS does not yield a unique estimate of the
structural parameters. We now illustrate the case with restrictions on Π using a
modification of the original structural model.
The modified model has three exogenous variables, x1 (income), x2 (wage rate)
and x3 (interest rate). The modification consists of allowing the three exogenous
variables to enter the supply equation:
Demand. y1  1 y2   2 x1
 u1 ,
y2  3 y1   4 x1  5 x2   6 x3  u2 .
Supply.
(4.1)
(4.2)
The reduced-form of the modified structural-form system is
Quantity. y1  11 x1   21 x2   31 x3  v1 ,
Price.
y2  12 x1   22 x2   32 x3  v2
(4.3)
(4.4)
In the ΠΓ = Β format, the relation between the reduced-form and structural
coefficients is:
 11 12 
2 4 

  1  3  

   0 5  .
  21  22   
1  

 1

 31  32 
 0 6 
7
There are now six equations in six unknowns:
 11  1 12   2  12   3 11   4
 21  1 22  0  22   3 21   5
 31  1 32  0  32   3 31   6 .
(4.5)
The system on the left of (4.5) determines the parameters of the demand equation. Solve
either of the equations that has 0 on its right-hand side for 1   31 /  32   21 /  22 , and
then get  2 from the remaining equation. Clearly, the coefficients of the demand equation
are identified in terms of Π . Furthermore, there is a restriction on the π’s, namely
 31 /  32   21 /  22 , because on the left of (4.5) there are three equations in two unknowns.
The system on the right-hand side of (4.5), which refers to the supply equation,
consists of three equations in four unknowns. Once a value is assigned to  3 , the
equations can be solved for  4 ,  5 ,  6 . A different arbitrary value for  3 generates
different values for  4 ,  5 ,  6 . The solution is not unique. Hence, the coefficients of the
supply equation are not identified in terms of Π .
With respect to estimation, ILS using the equation-by-equation OLS estimates of
Π will not give unique estimates of the structural parameters of the supply equation. The
result is two different ILS estimates of 1 . This problem can be overcome by estimating
the reduced-form subject to the restriction  31 /  32   21 /  22 . The restricted estimates of
the  's can be converted into unique estimates of the  's using the sample counterpart of
the system (4.5).
Suppose there are restrictions on Π . Then the fact that the explanatory variables
are identical in every reduced-form equation does not imply that the OLS and GLS
estimates of the π’s are the same. In other words, OLS estimation of the reduced form
will not be optimal. If the variance matrix of the reduced-form disturbance vector Ω is
known, then GLS subject to the restrictions on Π is the natural (nonlinear) estimation
procedure. The conversion of the GLS estimates of the π’s into estimates of the α’s can
be described as ‘indirect GLS’. Since the GLS estimator is consistent and asymptotically
efficient, the indirect-GLS estimators of Γ and Β are also consistent and asymptotically
efficient.
8
When Ω is unknown, as is true in practice, feasible GLS is the natural estimation
procedure for Π . Feasible GLS is similar to GLS except that an estimator Ω̂ is used in
place of Ω . The estimator Ω̂ comes from the residuals of the equation-by-equation OLS
reduced-form regressions. The resulting estimates of the α’s are referred to as
‘indirect-FGLS’ estimates because the FGLS estimates of the π’s are converted into
estimates of the α’s. Because the FGLS estimator of Π is consistent and asymptotically
efficient, the indirect-FGLS estimators of Γ and Β are also consistent and asymptotically
efficient.
Indirect GLS and indirect FGLS are referred to as ‘full-information’ methods
because they use all the restrictions on Π at once. Estimation of a single structural
equation using only the restrictions on Π for that equation alone is often called ‘limited
information’ estimation. If all the restrictions are correctly specified, then fullinformation estimators are more efficient than limited-information estimators.
In some variants of the simultaneous-equation model it is assumed that u | x is
multivariate normal. The addition of the normality assumption enables the estimation of
Π by maximum likelihood. The resulting estimator of the structural parameters is
known as ‘full-information maximum likelihood’, or FIML. If Ω is known, then FIML
coincides with indirect-GLS. If Ω is unknown, FIML differs from indirect-FGLS, but the
estimators have the same asymptotic distribution.
The difference between FIML and indirect FGLS can be clarified by briefly
turning from the population to the sample. Let V̂ = Y - XP , where P = (XX)-1 XY is the
estimator of Π obtained by equation-by-equation OLS. The estimator of Ω used in
ˆ  (1/ n)VV
ˆ 1VV ),
ˆ ˆ . The criterion minimized by FGLS is tr(Ω
FGLS is Ω
ˆ  (1/ n)VV
ˆ ˆ (as a conditional solution)
where V = Y - XΠ . FIML proceeds by inserting Ω
into the log-likelihood function to obtain the log-likelihood concentrated on | V V | . The
consequence is that the criterion minimized by FIML is | V V | . The difference in the
criteria explains the difference in the estimators.
The maximum likelihood estimation of a single structural-form equation that
uses only the restrictions on Π for that equation alone is referred to as ‘limited-
9
information maximum likelihood’, or LIML. We next consider another limitedinformation estimation method.
5. Two-stage least squares
The 2SLS estimator uses the unrestricted reduced-form estimate P, the equation-byequation OLS estimates of the π’s, which accounts for its popularity. The mechanics of
the 2SLS method can be described simply. In the first stage, the right-hand-side
endogenous variables of the structural equation are regressed on all the exogenous
variables in the reduced form, and the fitted values are obtained. In the second stage, the
right-hand-side endogenous variables are replaced by their fitted values, and the lefthand-side endogenous variable of the equation is regressed on the right-hand-side fitted
values and the exogenous variables included in the equation.
Two rationales for the 2SLS procedure are now developed using the demand
equation of the modified structural model. The starting point for the first rationale is the
expectation of the demand equation conditional on x1, x2, and x3. Taking expectations
gives
E ( y1 | x1 , x2 , x3 )  1E ( y2 | x1 , x2 , x3 )   2 x1  E (u1 | x1 , x2 , x3 ),
or
E ( y1 | x1 , x2 , x3 )  1 y2*   2 x1.
From the reduced-form eq. (4.4),
y2*  E ( y2 | x1 , x2 , x3 , )   12 x1   22 x2   32 x3 .
Because y 2* is linear function of the exogenous variables, it is exogenous. If y 2* were
observed, then y1 could be regressed on y 2* and x1 to get unbiased estimates of 1 and  2 .
But y 2* is unobservable because 12 ,  22 , and  32 are unknown. However, an unbiased and
consistent estimate ŷ2 can be obtained by replacing the unknown π’s by their OLS
estimates. Then, making the replacement of ŷ2 for y 2* produces consistent estimates of
the structural parameters.
The second rationale exploits the fact that in the population the following moment
conditions hold:
10
E (u1 )  0, C ( x1 , u1 )  C ( x2 , u1 )  C ( x3 , u1 )  0.
These imply two orthogonality conditions:
E ( y2*u1 )  0, E ( x1u1 )  0.
If we let u1  y1  (a1 y2*  a2 x1 ), then 1 and  2 are the values for a1 and a2 that
make E ( y2*u1 )  0 and E ( x1u1 )  0. 2SLS chooses the estimates that make the analogous
sample quantities zero, that is,
 yˆ
i
u  0 and  i x1i u1i  0, (i  1,..., n). This illustrates
2i 1i
that 2SLS has an instrumental-variable (IV) interpretation.
The IV interpretation can be illustrated more explicitly by writing the demand
equation in terms of the observations for a sample size of n:
y1 = y 21 + x1 2 + u1
 
x1   1   u 1
2 
 Z1α + u1 ,
 y2
where in the context of the demand equation y 1 and y 2 are the columns of Y and x1 is
the first column of X. As we have shown, regressing y 1 on Z1 will not give a consistent
estimator for α 1 . Instead replace Z1 by
ˆ  NZ  N(y , x ) =  Ny , Nx  =  yˆ , x  ,
Z
1
1
2
1
2
1
2
1
1
ˆ gives the normal
where N is the idempotent matrix X(XX )X . Regressing y on Z
1
1
equations,
ˆ Z
ˆ a* = Z
ˆ y ,
Z
1
1 1
1 1
the solution to which is the 2SLS estimator.
The 2SLS normal equations are equivalent to a set of orthogonality
ˆ u = 0, where u = y - Z a* . The equivalence follows from an algebraic
conditions: Z
1 1
1
1
1 1
fact:
ˆ Z = Z NZ = Z NNZ = Z
ˆ Z
ˆ .
Z
1
1
1
1
1
1
1
1
11
ˆ are legitimate instruments because they are, at least asymptotically,
The variables in Z
1
uncorrelated with the disturbance. The IV interpretation implies that the 2SLS estimator
is consistent. In fact, it is the optimal feasible IV estimator.
We also note that the 2SLS estimator can be interpreted as a general-method-ofmoments (GMM) estimator. In the above example, this follows from the fact that a*1
minimizes the quadratic form
(y 1 - Z1a1* )X(XX)1 X(y 1 - Z1a1* ).
It can shown that 2SLS is the optimal feasible GMM estimator. An advantage of the
GMM approach is that heteroskedasticity and autocorrelation can be accommodated by
an appropriate redefinition of the optimal weighting matrix in the definition of the GMM
estimator (see Ruud, 2000, pp. 718–21)
We conclude this section with some remarks on estimation in the simultaneousequation model.

If all the structural equations are identified, and there are no restrictions on Π ,
then ILS, indirect-FGLS, FIML, LIML and 2SLS all produce the same estimates.

If there are restrictions on Π , then LIML and 2SLS produce different estimates in
the sample, but the estimators have the same asymptotic distribution, and
similarly for indirect FGLS and FIML.

If a parameter is not identified, then there is no method to estimate it consistently.

We have confined our attention to the case in which the prior information used for
identification consists of normalizations and exclusions (zero restrictions). If other
information is available (for example, Σ is diagonal, or a coefficient in one
structural equation is equal to a coefficient in another), then some modifications
are needed in the description of the estimators and their statistical properties.
6. The k-class family
The k-class family of estimators of the coefficients of a single structural equation is
illustrated for the demand equation of the modified structural model. For this equation,
the estimator is
12

a*k  Z1 (I - kM)Z1

1
Z1 (I - kM)y 1 ,
where M = I - N. This family was introduced by Theil (1953b; 1961). It includes the
OLS estimator (k = 0) and the 2SLS estimator (k = 1).
A remarkable fact is that the k-class family includes LIML. The LIML estimator
is obtained by setting k   * , where  * the smallest root of
| W1 -  W |  0.
In the determinantal equation, W1 is the cross-product matrix of residuals from the OLS
regression of
 y1 , y 2  on x1 (the included endogenous variables on the included
exogenous variable), and W is the cross-product matrix of the residuals from the OLS
regression of  y1 , y 2  on X (the included endogenous on all the exogenous variables).
Moreover, the LIML estimator of 1 is the value of a1 that minimizes the variance ratio:
(1, a1 ) W1 (1, a1 )
,
(1, a1 ) W (1, a1 )
Accordingly, the LIML estimator is also sometimes referred to as the ‘least-variance
ratio’ estimator.
The k-class estimator is consistent if (k – 1) converges in probability to 0 and has
the same limiting distribution as 2SLS if n1/ 2 (k  1) converges in probability to 0. These
conditions are clearly not satisfied when k = 0 and hence by OLS. A proof that these
conditions are satisfied for LIML is given in Amemiya (1985, pp. 237-238). Zellner
(1998) shows that a Bayesian method-of-moments approach justifies certain other
members of the k-class (and double k-class) family of estimators.
7. Finite sample distributions
There has been debate about whether the LIML estimator is better than 2SLS. The reason
for this debate is that the finite sample distributions of the estimators differ when there
are restrictions on Π . Hence we now limit our attention to the case where restrictions are
present.
A key difference between the estimators is the existence of moments. The 2SLS
estimator has moments up to certain order. By contrast, the LIML estimator has no
13
moments. This result holds for an arbitrary number of included endogenous variables.
Mariano (2001) reviews the moment existence results.
In addition to the moment existence results, closed form expressions for the
moments and probability densities of 2SLS, LIML and k-class estimators have been
derived. These expressions are complicated; see Phillips (1983) for specific references.
The finite sample results have mostly come from the study of a model with two
included endogenous variables. Moreover, it is often assumed that the disturbances are
normal. The finite sample results are obtained using analytical and simulation methods. A
survey of the results and their practical implications is given in Mariano (2001). One of
the results is that the LIML distribution is far more symmetric than 2SLS, though more
spread out, and it approaches normality faster. Similarly, Anderson (1982) concludes that
for many cases that occur in practice the standard the normal theory is inadequate for
2SLS, but provides a fairly good approximation to the actual distribution of the LIML
estimator. The symmetry result is not surprising because the approximate distribution of
the LIML estimator (obtained from large sample asymptotic expansions) is median
unbiased. Median unbiasedness does not hold in general for 2SLS.
It is helpful to put the debate over the relative merits of LIML and 2SLS in
perspective. On the assumption that the model is correctly specified, the presumption is
that maximum likelihood will have better properties in some well-defined sense. This is
because it uses more information than GMM. See Anderson, Kunitomo and Morimume
(1986) and Takeuchi and Morimume (1985) for results on the second-order efficiency of
LIML. An advantage of GMM is that it is robust to misspecifications of the disturbance
distribution, although this advantage was not the original motivation for introducing
2SLS.
8. Epilogue
Keynesian economics initially played a key role in propelling research into the estimation
of simultaneous-equation models. Interest in the estimation of linear simultaneousequation models began to wane from about the late 1970s. Historically, this decline
paralleled the decline of the Keynesian paradigm as a result of the monetarist
counterrevolution and later rational expectations. At the same time, there was growing
14
awareness that linear simultaneous-equation models often suffered from potentially
serious misspecification. On the one hand, even taking for granted the statistical
specification of the model (A1–A4), economic theory did not provide a satisfactory basis
for deciding what endogenous and exogenous variables should be excluded from
individual structural equations. As a consequence, the identification of structural
parameters was open to question. On the other hand, the statistical specification of the
linear structural-form model was itself questionable, due to either the nature of the data or
considerations of economic theory or both. The issue of identification was highlighted by
Sims (1980) in a well-known paper aptly titled ‘Macroeconomics and Reality’.
Simultaneous-equation models have been generalized by the introduction of
nonlinear structural equations. Amemiya (1974) generalized the 2SLS method to
nonlinear models. His nonlinear 2SLS estimator is a GMM estimator. However, unlike
in the case of linear models, it cannot be thought of as being obtained in two steps where
the first consists of running a least squares regression. The GMM is a two-step estimator,
but the first step consists of choosing a weighting matrix. More generally, GMM and IV
estimators can be thought of as the descendants of the 2SLS approach. They constitute
the contemporary basis for much of the estimation of structural parameters
macroeconomics. It is in this sense that 2SLS lives on as a structural estimation method.
Does the identification and estimation of simultaneous-equation models have a
future? There are some positive signs. Although indirectly, these issues continue to play a
role in the identified vector autoregression literature, for example, Bernanke and Blinder
(1992), Gordon and Leeper (1995), Cushman and Zha (1997). More recently, there has
been a revival of interest in simultaneous-equations models formulated as nonlinear
dynamic stochastic general equilibrium models. This formulation has the advantage that
the resulting models are viewed as more firmly anchored in economic theory. Estimation
of such models presents serious challenges. Some examples where these challenges are
addressed include DeJong, Ingram and Whiteman (2000) and Fernandez-Villaverde and
Rubio-Ramirez (2005).
N. E. Savin
See also seemingly unrelated regressions; simultaneous equation models
15
Bibliography
Amemiya, T. 1974. The nonlinear two-stage least-stage estimator. Journal of
Econometrics 2, 105–10.
Amemiya, T. 1985. Advanced Econometrics. Cambridge, MA: Harvard University Press.
Anderson, T. 1982. Some recent developments on the distribution of single-equation
estimators. In Advances in Econometrics, ed. W. Hildenbrand. Cambridge:
Cambridge University Press.
Anderson, T. 2005. Origins of the limited information maximum likelihood and twostage least squares estimators. Journal of Econometrics 127, 1–16.
Anderson, T., Kunitomo, N. and Morimune, K. 1986. Comparing single equation
estimators in a simultaneous equation system. Econometric Theory 2, 1–32.
Anderson, T. and Rubin, H. 1949. Estimator of the parameters of a single equation in a
complete system of stochastic equations. Annals of Mathematical Statistics 20,
46–63.
Anderson, T. and Rubin, H. 1950. The asymptotic properties of estimates of the
parameters of a single equation in a complete system of stochastic equations.
Annals of Mathematical Statistics 21, 570–82.
Basmann, R. 1957. A generalized classical method of linear estimation of coefficients in
a structural equation. Econometrica 25, 77–83.
Bernanke, B. and Blinder, A. 1992. The federal funds rate and the channels of monetary
transmission. American Economic Review 82, 901–21.
Cushman, D. and Zha, T. 1997. Identifying monetary policy in a small open economy
under flexible exchange rates. Journal of Monetary Economics 39, 433–48.
DeJong, D., Ingram, B. and Whiteman, C. 2000. A Bayesian approach to dynamic
macroeconomics. Journal of Econometrics 98, 203–23.
Goldberger, A. 1991. A Course in Econometrics. Cambridge, MA: Harvard University
Press.
Gordon, D and Leeper, E. 1995. The dynamic impacts of monetary policy: an exercise in
tentative identification. Journal of Political Economy 102, 1228–47.
16
Farebrother, R. 1999. Fitting Linear Relationships: A History of the Calculus of
Observations 1750–1900. New York: Springer.
Fernandez-Villaverde, J. and Rubio-Ramirez, J. 2005. Estimating dynamic equilibrium
economies: linear versus nonlinear likelihood. Journal of Applied Econometrics
20, 891–910.
Hood, W. and Koopmans, T. 1953. Studies in Econometric Method. Cowles Foundation
Monograph 14. New Haven: Yale University Press.
Koopmans, T. 1950. Statistical Inference in Dynamic Economic Models. Cowles
Commission Monograph 10. New York: John Wiley and Sons.
Koopmans, T. and Hood, W. 1953. The estimation of simultaneous linear economic
relationships. In Studies in Econometric Method, ed. W. Hood and T. Koopmans.
Cowles Foundation Monograph 14. New Haven: Yale University Press.
Mariano, R. 2001. Simultaneous equation model estimators: statistical properties and
practical implications. In A Companion to Theoretical Econometrics, ed. B.
Baltagi. Oxford: Blackwell.
Mittelhammer, R., Judge, G. and Miller, D. J. 2000. Econometric Foundations.
Cambridge: Cambridge University Press.
Phillips, P. 1983. Exact small sample theory in the simultaneous equation model. In
Handbook of Econometrics, ed. Z. Griliches and M. Intriligator. Amsterdam:
North-Holland.
Ruud, P. 2000. An Introduction to Classical Econometric Theory. Oxford: Oxford
University Press.
Sargan, J. 1958. Estimation of economic relationships using instrumental variables.
Econometrica 67, 557–86.
Sims, C. 1980. Macroeconomics and reality. Econometrica 48, 1–48.
Takeuchi, K. and Morimune, K. 1985. Third-order efficiency of the extended maximum
likelihood estimator in a simultaneous equation system. Econometrica 53, 177–
200.
Theil, H. 1953a. Repeated least-squares applied to a complete equation systems. Mimeo.
The Hague: Central Planning Bureau.
17
Theil, H. 1953b. Estimation and simultaneous correlation in complete equation systems.
Mimeo. The Hague: Central Planning Bureau.
Theil, H. 1961. Economic Forecasts and Policy, 2nd edn. Amsterdam: North-Holland.
Zellner, A. 1998. The finite sample properties of simultaneous equations’ estimates and
estimators: Bayesian and non-Bayesian approaches. Journal of Econometrics 83,
185–212.
18
Download