Learning about the parameters and the
dynamics of DSGE models: identification and
estimation
Fabio Canova
IGIER, UPF and CEPR
Luca Sala
IGIER - Università Bocconi
May 3, 2005
Abstract
In this paper we study identification and estimation of structural parameters
in DSGE models. Results show that for a large class of models some (if not all)
parameters are either not identified or weakly identified.
1 Introduction
The 1990’s have seen a remarkable development in the specification of DSGE models. The literature has added considerable realism to the constructions popular in the
1980’s: in particular, a number of shocks and frictions have been introduced into primitive RBC models driven by technology disturbances. Steps forward have also been
made in comparing the models’ approximation to the data. While 10 years ago it was
standard to calibrate the parameters of a model and perform informal evaluation of
its outcomes, now maximum likelihood or Bayesian estimation of the structural parameters is quite common both in policy and academic circles (see e.g. Ireland (2003),
Smets and Wouters (2003), Canova (2004)) and new techniques have been introduced
for evaluation purposes (see Del Negro et al. (2004)).
Given the complexity of the current structures and the difficulties existing in extracting information useful to evaluate the fit of DSGE models, a strand of the literature
has considered less demanding limited information methods and focused on whether
the model matches the data along certain dimensions. Following the works of Rotemberg and Woodford (1997) and Altig et al. (2003), among others, it has now become
common to estimate the parameters of a model by matching conditional dynamics in
response to certain structural shocks (see also Canova (2002) for an alternative but still
limited information approach). Matching impulse responses nicely fits into the class of
indirect inference methods with nuisance parameters, developed by Renault and Dridi
(1998). One of the crucial conditions needed for the methodology to deliver meaningful
estimates is the one of identifiability: there must be a unique maximum in the objective function relating structural parameters to the selected conditional responses and
there should be enough curvature in the objective function to pin down the area of
the maximum with sufficient precision. Since in DSGE models conditional dynamics in
response to shocks depend in a highly non-linear way on the structural parameters, it is
far from clear how to check whether these identifiability conditions are met. Moreover,
even if local identifiability conditions are satisfied, it could be hard to estimate
the parameters, and unrestricted estimates of their standard errors may turn out to be
unreasonably large.
This paper investigates identifiability issues in DSGE models and their consequences
for parameter estimation. Since the field is vast and largely unexplored, we focus on
general issues and leave many interesting but detailed avenues of research untouched.
We start in the next section by discussing the generics of identification in DSGE models,
contrast the problems existing in linear and nonlinear models and highlight why the
theory recently developed for GMM estimates by Stock and Wright (2000) and Wright
(2003) does not immediately apply when instead of moment conditions one deals with
impulse responses.
The next two sections provide examples of simple structures generating two commonly found identification problems in DSGE models: observational equivalence and
disappearance of structural parameters from the relevant objective functions. We then
discuss situations where, although the objective function has a unique zero and its
Hessian has full rank, weak identification problems arise. We divide our analysis in two.
First, we consider the situation where an investigator knows the population DGP but
matches only the dynamics in response to one or two of the shocks driving the economy.
We show that some parameters are not recoverable from impulse responses and that the
objective function may be too flat in some dimensions to allow hill climbing routines to
work appropriately. We examine whether, from the identification point of view, it is better
to match impulse responses or the VAR coefficients of the model and to what extent
small samples worsen the identification problem. Matching VAR coefficients could be
beneficial since some non-linearities are eliminated. However, since information about
contemporaneous correlations is ignored, identification problems may be compounded.
We also examine the effects of using different weighting matrices and different objective
functions and briefly compare identification issues in classical and Bayesian frameworks.
We show that typical identification problems are magnified if the weighting matrix
heavily discounts responses at long horizons and that using the likelihood function in
place of a distance function could be beneficial in some situations but not in general.
Next, we examine what happens when an investigator has an incorrectly specified
model and uses the dynamic implications of one or two shocks to find estimates of the
parameters of her model. In particular, we are interested in examining whether spurious estimation phenomena exist, that is, a situation in which two alternative features of
the model may be indistinguishable from the point of view of impulse response dynamics
(e.g. habit in consumption plus sticky prices vs. habit in leisure and sticky wages) or a
situation in which parameters which do not exist in the original specification become estimable because conditional dynamics in response to certain shocks are observationally
equivalent to those produced by the true DGP. Our interest in this issue is motivated
by contrasting evidence found in recent papers; for example, Boivin and Giannoni (2002)
find structural breaks in monetary policy rules estimated out of responses to monetary
shocks while ML and Bayesian methods fail to find this (see Canova (2004) or Gali and
Rabanal (2004)); Meier and Muller (2004) find that a financial accelerator is important
in matching impulse responses to monetary shocks while ML and Bayesian estimation
find it insignificant (see e.g. Neri (2003)). Also in this case, we examine whether it
is possible to improve estimates by altering the objective function, adding information,
varying the distance matrix or changing the dimension of the vector of responses.
The rest of the paper is organized as follows. The next section discusses a few general
identification issues in linear and nonlinear models. Section 3 presents examples of different structural models which produce observationally equivalent impulse responses.
Section 4 presents an example where some structural parameters are unidentified from
impulse responses. Sections 5 and 6 examine weak identification issues. We show that
this problem is generic and difficult to solve, and we document the consequences it has for
estimates of the structural parameters of the model. Section 7 deals with small samples and tries to measure how identification problems and small samples interact to
bias estimates and standard errors. Section 8 presents cases where nonexistent parameters could be estimable because of observational equivalence issues. Section 9 discusses
diagnostics to detect identification problems in DSGE models, provides some suggestions
for practical empirical work and offers a few preliminary conclusions.
2 A few definitional issues
Identification has to do with the ability to draw inference about a theoretical structure
from observed samples. It goes without saying that, whatever estimation approach one
takes, a parameter must be identifiable to be estimable. While identification issues in
linear models with or without expectations have been extensively studied and literature
on this issue goes back at least to Koopmans and Reiersol (1950), Rothenberg (1971),
Pesaran (1981), the identification problem becomes much more complicated when the
model is nonlinear in the parameters, as is the case with the solution of DSGE models.
Let us start by recalling a few standard definitions. Suppose the structural model
can be characterized uniquely by a density function L(y, θ0), where θ0 is an m × 1 vector
of parameters. Identification has to do with the possibility of distinguishing between
parameter points and it is related to the existence of an inverse mapping from the
probability distribution of the random variables Y to the vector θ0 and thus to the
structural model. Two parameter vectors θ1 and θ2 are said to be observationally equivalent if
L(y, θ1) = L(y, θ2) for any y. A parameter vector θ0 is locally identifiable if there exists
an open neighborhood of θ0 containing no other θ which is observationally equivalent
to θ0; it is globally identifiable if there is no other θ in the entire parameter space
which is observationally equivalent. If a component j of θ0, call it θj0, is observationally
equivalent to all θj ∈ Θj, then θ0 is underidentified. On the other hand, a parameter
vector θ0 is partially identifiable if there exists a constant vector α such that θ1 = αθ0
is observationally equivalent to θ0.
Local identifiability is related to the rank of the information matrix $\mathcal{I}(\theta)$, whose
(i, j)-th element is
$$\mathcal{I}(\theta)_{ij} = E\left(\frac{\partial \log L(y,\theta)}{\partial \theta_i}\,\frac{\partial \log L(y,\theta)}{\partial \theta_j}\right) = -E\left(\frac{\partial^2 \log L(y,\theta)}{\partial \theta_i \,\partial \theta_j}\right).$$
One can prove that θ0 is locally identifiable if and only if the rank of $\mathcal{I}(\theta_0)$ is constant and equal
to m in a neighborhood of θ0 (Rothenberg, 1971). The intuition behind this result
is straightforward: the information matrix measures the "curvature" of the likelihood
function at each point. If in the direction of some parameters there is no curvature,
there is no information about those parameters and the corresponding rows and columns
of $\mathcal{I}(\theta)$ will be zero.
It is important to stress that while the above definitions and results are specific to
the likelihood function, similar concepts can be defined for estimation methods based
on the minimization of an objective function g(y, θ). In this case, the condition for local
identifiability of the true parameter θ0 is that g(y, θ) has a unique minimum at θ0. This
implies two things: (1) the gradient must be zero only at θ0, $\nabla g(y,\theta_0) = 0$, and (2) the
Hessian $H = \left.\frac{\partial^2 g(y,\theta)}{\partial \theta_i \,\partial \theta_j}\right|_{\theta=\theta_0}$ must be positive definite in a neighborhood of θ0 (H must
then also be non-singular). If the rank of the Hessian at θ0 is less than m, some elements
of θ0 will not be locally identifiable. A particular case is when the second derivative is
zero in the direction of some parameters. This implies the objective function is locally
linear in that direction and thus cannot have a unique local minimum.
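The rank condition on the Hessian can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the response function `responses`, in which two parameters enter only through their product, and the parameter values are hypothetical and are not taken from any model in this paper. For a quadratic distance g(θ) that is zero at θ0, the Hessian at θ0 equals 2J'J, where J is the Jacobian of the responses, so a rank below m signals that some direction in the parameter space is not locally identified.

```python
import numpy as np

def responses(theta, horizons=10):
    """Hypothetical responses: theta[0] and theta[1] enter only as a product."""
    scale, gain, persistence = theta
    return scale * gain * persistence ** np.arange(horizons)

def jacobian(f, theta, step=1e-6):
    """Central-difference Jacobian of the vector function f at theta."""
    theta = np.asarray(theta, dtype=float)
    columns = []
    for i in range(theta.size):
        bump = np.zeros_like(theta)
        bump[i] = step
        columns.append((f(theta + bump) - f(theta - bump)) / (2.0 * step))
    return np.column_stack(columns)

theta0 = np.array([0.5, 2.0, 0.9])   # assumed "true" values, illustration only
J = jacobian(responses, theta0)

# For g(theta) = sum((r(theta) - r(theta0))**2) the residuals vanish at theta0,
# so the Hessian there is exactly 2 * J'J.
H = 2.0 * J.T @ J

# An explicit tolerance keeps finite-difference noise from counting as rank.
rank = np.linalg.matrix_rank(H, tol=1e-8)
print("rank of Hessian:", rank, "number of parameters:", theta0.size)
```

In this hypothetical case the rank falls short of the number of parameters because the first two columns of J are proportional, which is the situation in which two parameters cannot be separately recovered.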
Let us now discuss possible problems that may arise in practice.
First, the population objective function may not be globally concave. Certain parameters may be indistinguishable from the point of view of a particular objective function
and this means that multiple solutions to the optimization problem may be obtained.
This clearly does not imply that the two parameterizations would be indistinguishable
under all objective functions nor that it is impossible to find certain parameters that are
uniquely and precisely pinned down: it simply suggests that some objective functions
are unsuited to distinguish certain features of the model at hand.
Second, the population objective function may be independent of certain parameters. A structural parameter may disappear from the log-linearized solution of a model
so that the objective function is zero over the whole range of the parameter.
Third, two structural parameters may enter only proportionally in the objective
function. Therefore, they cannot be separately recovered. This is a well known problem
and naturally links to the rank condition for identification used in old style simultaneous
equation systems.
Fourth, the population objective function may be globally concave and have a
unique minimum but its curvature may be "insufficient" in some or all directions,
so that plateaus or flat areas may be present. Note that this problem could be either
specific to a small neighborhood of the parameter space or concern the entire parameter
space.
One further complication emerges when the objective function is asymmetric in
the neighborhood of the zero and curvature deficient in only a part of the parameter
space. Since in this case the objective function is asymmetric, different parts of the
parameter space may carry different information about the parameters. Therefore,
identification may depend on the true parameter vector. We call this phenomenon
the asymmetric weak identification problem.
Fifth, and this applies primarily to limited information estimation methods which
consider only some aspects of the model, different variables (shocks) may carry different
information about the parameters. In other words, it is possible that a parameter is
identified from an objective function which considers all the features of the model, but
is not identifiable when only one equation or one set of responses is used. Hence, if
limited information methods are used, it may matter which equation or shock is used to
pin down the parameters, and different equations (or shocks) may produce substantially
different parameter estimates or leave parameters unidentified. We call this the limited
information identification problem.
In the next few sections we show that all these issues become relevant when we deal
with DSGE models.
3 Observational equivalence: two structural models have the same impulse responses.
The example we consider here is as simple as possible to allow us to compute an
analytical solution to the model and, at the same time, to illustrate one of the problems
often encountered with DSGE models: the observational equivalence of two structures
with respect to impulse responses. Suppose a time series xt is generated from
$$x_t = \frac{1}{\lambda_2+\lambda_1} E_t x_{t+1} + \frac{\lambda_1 \lambda_2}{\lambda_1+\lambda_2} x_{t-1} + v_t$$
where λ2 ≥ 1 ≥ λ1 ≥ 0. The unique rational expectations (stable) solution is given by
$$x_t = \lambda_1 x_{t-1} + \frac{\lambda_2+\lambda_1}{\lambda_2} v_t.$$
Therefore, given vt = 1, the responses of xt are
$\left[\frac{\lambda_2+\lambda_1}{\lambda_2},\ \lambda_1 \frac{\lambda_2+\lambda_1}{\lambda_2},\ \lambda_1^2 \frac{\lambda_2+\lambda_1}{\lambda_2},\ \ldots\right]$, so that, using at least two horizons of
xt, one can estimate the two parameters, λ1 and λ2. What we want to show is that
there is a different process, whose rational expectations (stable) solution has the same
impulse response. In this case, an objective function measuring the distance between
theoretical and empirical impulse responses will have multiple zeros.
Consider for example the process yt = λ1 yt−1 + wt where wt is an iid shock with zero
mean and variance $\sigma_w^2$. Clearly, the responses of yt to a shock in wt will be identical to
the responses of xt to a shock in vt if $\sigma_w = \frac{\lambda_2+\lambda_1}{\lambda_2}\sigma_v$. Hence, a process with backward
looking dynamics is observationally equivalent to a process with forward and backward
looking dynamics and iid fundamental error. This argument, for example, is the basis
for Beyer and Farmer's (2004) claim that the data cannot tell whether a Phillips curve
is backward looking or forward looking and it is the cornerstone of Pesaran's (1981)
critique of tests of rational vs. adaptive expectations models.
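The equivalence can be verified numerically. The Python sketch below is a minimal illustration: the values chosen for λ1, λ2 and σv are arbitrary assumptions that satisfy λ2 ≥ 1 ≥ λ1 ≥ 0, and the function names are ours. It computes the responses implied by the stable solution of the forward and backward looking process and those of the purely backward looking process, with the shock sizes linked as in the text.

```python
import numpy as np

def responses_forward_backward(lam1, lam2, shock, horizons=10):
    """Responses of x_t = lam1*x_{t-1} + ((lam2+lam1)/lam2)*v_t to v_0 = shock."""
    impact = (lam2 + lam1) / lam2 * shock
    return impact * lam1 ** np.arange(horizons)

def responses_backward(lam1, shock, horizons=10):
    """Responses of y_t = lam1*y_{t-1} + w_t to w_0 = shock."""
    return shock * lam1 ** np.arange(horizons)

lam1, lam2 = 0.7, 1.5                       # illustrative values only
sigma_v = 1.0
sigma_w = (lam2 + lam1) / lam2 * sigma_v    # scaling that equates the responses

x_irf = responses_forward_backward(lam1, lam2, sigma_v)
y_irf = responses_backward(lam1, sigma_w)
print("largest gap between the two sets of responses:", np.max(np.abs(x_irf - y_irf)))

# Once responses are normalized by the impact effect, lam2 no longer appears.
normalized = x_irf / x_irf[0]
print("normalized responses:", normalized)
```

The normalized responses depend on λ1 alone, so a distance function built on them cannot distinguish the two processes.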
What is the key ingredient that makes the two processes equivalent from the point
of view of impulse responses? Clearly, the unstable root λ2 enters the solution only
contemporaneously. Since the variance of the shocks is not estimable from normalized
impulse responses (any value simply implies a proportional increase in all the elements
of the impulse response function), we can arbitrarily select it in the second case so as to
capture the effects of the unstable root. In other words, an investigator has one degree
of freedom in the analysis and can choose it so as to make two processes share both
contemporaneous and lagged dynamics.
Several other examples of observationally equivalent structures have appeared in
the literature. For example, Ma (2002) has shown that a standard forward looking
Phillips curve is consistent with two structural models having very different firms’
pricing behaviour and Altig et al. (2004) constructed a model with firm specific capital
which produces the same inflation dynamics as a model without firm specific capital.
Note that while two observationally equivalent structures imply that the objective
function has two zeros, a continuum of observationally equivalent models indexed, for
example, by the size of two parameters, will create a canyon or a ridge in the objective
function and information external to the model needs to be brought in to disentangle
the various structural representations.
4 Some structural parameters are unidentifiable from impulse responses.
Examples of models where structural parameters do not enter impulse responses or
where two parameters enter in a proportional fashion are also numerous. It is easy
to build examples where structural parameters completely disappear from the moving
average representation of the model.
For this purpose, consider a version of the three-equation New Keynesian model:
$$y_t = a_1 E_t y_{t+1} + a_2 (i_t - E_t \pi_{t+1}) + v_{1t} \tag{1}$$
$$\pi_t = a_3 E_t \pi_{t+1} + a_4 y_t + v_{2t} \tag{2}$$
$$i_t = a_5 E_t \pi_{t+1} + v_{3t} \tag{3}$$
where the first equation is the log-linearized Euler condition, the second a forward
looking Phillips curve relationship and the third an equation characterizing monetary
policy. Using the method of undetermined coefficients, and noting that the model
features no state variables, it is easy to guess that the solution for the output gap,
inflation and the nominal interest is a linear function of the three shocks vjt . It is
straightforward to show that, in deviation from steady states, such a solution is of the
form:
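$$\begin{bmatrix} \hat{y}_t \\ \hat{\pi}_t \\ \hat{i}_t \end{bmatrix} = \begin{bmatrix} 1 & 0 & a_2 \\ a_4 & 1 & a_2 a_4 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_{1t} \\ v_{2t} \\ v_{3t} \end{bmatrix}$$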
While the model we consider is very simple and none of the three shocks induces
any propagation over time, several useful points can be made.
First, the parameters a1 , a3 , a5 disappear from the log-linearized solution - they
only enter in the steady states and in a nonlinear fashion. Interestingly, they are
those characterizing the forward looking dynamics of the model. Therefore, if one uses
the responses of the output gap, of inflation and of the nominal interest rate to any
of the shocks, she will only be able to recover a2 , a4 . Since variations in a1 , a3 , a5
have no influence on the dynamics of the system in deviation from steady states, any
objective function (and the likelihood itself) which uses the dynamics of the output
gap, inflation and the nominal rate will be independent of them (the parameters are
completely unidentified).
Second, different shocks carry different information about the parameters. For example, responses to v1t allow us to recover only a4; responses to v3t may be used to
back out both a4 and a2, while responses to v2t carry no information about the two parameters. Therefore, in this case matching responses to monetary shocks makes responses
to other shocks superfluous as far as parameter estimation is concerned. Conversely,
matching responses to v2t leaves all the parameters of the model unidentified.
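Both points can be checked with a few lines of Python. The sketch below is illustrative only: the numerical values of a1, ..., a5 are arbitrary assumptions, and the function `impact_responses` is our own construction. It writes equations (1)-(3) as G0 zt = G1 Et zt+1 + vt with zt = (yt, πt, it) and uses the fact that, with iid shocks and no state variables, expected future variables are zero in the solution considered in the text.

```python
import numpy as np

def impact_responses(a1, a2, a3, a4, a5):
    """Responses of (y, pi, i) to (v1, v2, v3) with iid shocks and no states."""
    # Contemporaneous coefficients of equations (1)-(3).
    G0 = np.array([[1.0, 0.0, -a2],
                   [-a4, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
    # Coefficients on expected future variables; a1, a3, a5 appear only here.
    G1 = np.array([[a1, -a2, 0.0],
                   [0.0, a3, 0.0],
                   [0.0, a5, 0.0]])
    # E_t z_{t+1} = 0 in this solution, so the G1 term drops out.
    expected_next = np.zeros((3, 3))
    return np.linalg.solve(G0, G1 @ expected_next + np.eye(3))

# Two parameter vectors that differ only in a1, a3, a5 (values are assumptions).
base = impact_responses(a1=0.5, a2=-0.5, a3=0.9, a4=0.2, a5=1.5)
other = impact_responses(a1=0.9, a2=-0.5, a3=0.5, a4=0.2, a5=3.0)

print("responses unchanged:", np.allclose(base, other))
print("responses to v1:", base[:, 0])   # involve a4 only
print("responses to v2:", base[:, 1])   # involve neither a2 nor a4
print("responses to v3:", base[:, 2])   # involve both a2 and a4
```

Any distance function built on these responses takes the same value at the two parameter vectors, so a1, a3 and a5 cannot be recovered from it.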
5 Weak and under identification
There is a sense in which the cases considered in the two previous examples are pathological. The objective function is ill-behaved in both cases: it has either multiple zeros
or it is equal to zero in some dimension regardless of the true parameter value. While
these cases may be seen as extreme, there are many applied situations where the population objective function has a unique zero, the Hessian is locally positive definite but
still parameters are only weakly identified: away from the true parameter vector the
objective function is close to zero and this may occur either in a neighborhood of the true
parameter vector or over a large range of values. We call the first phenomenon locally
weak identification and the second one globally weak identification. To show that both
features are relatively standard in DSGE models we use a simple RBC model driven by
technology disturbances. We choose to work with the simplest version of the model
since in this case we can compute the solution analytically. This allows us to study
whether and how VAR coefficients and impulse responses change once parameters are
altered and therefore to better highlight problems that may occur.
The representative agent maximizes a utility function, which depends on current and
future consumption, and is of the form $U_t = \sum_{t}^{\infty} \beta^t \frac{c_t^{1-\varphi}}{1-\varphi}$, and the resource constraint is
$c_t + k_{t+1} = z_t k_t^{\eta} + (1-\delta) k_t$. Here $z_t$ is a first-order autoregressive process with autoregressive
coefficient ρ, steady state value $z^{ss}$ and variance $\sigma_e^2$, and $k_t$ is the current capital stock.
The parameters of the model are $\theta = [\beta, \varphi, \delta, \eta, \rho, z^{ss}]$. Noting that the current level
of capital stock and the technological disturbance are the states of the problem, we
can use the method of undetermined coefficients to express the endogenous variables in
terms of the states and the shocks. In particular, the solution for $w_{t+1} = [k_{t+1}, c_t, y_t, z_t]$
is of the form $w_{t+1} = A w_t + B e_t$ where $B = [v_{kz}, v_{cz}, 1, 1]'$ and the 4 × 4 matrix A is
given by:
$$A = \begin{bmatrix} v_{kk} & 0 & 0 & v_{kz}\rho \\ v_{ck} & 0 & 0 & v_{cz}\rho \\ \eta & 0 & 0 & \rho \\ 0 & 0 & 0 & \rho \end{bmatrix}$$
where
$$v_{kk} = \tfrac{1}{2}\gamma - \sqrt{\left(\tfrac{1}{2}\gamma\right)^2 - \beta^{-1}}; \qquad v_{kz} = \frac{(1-\beta(1-\delta))\rho - \varphi(1-\rho)\frac{y^{ss}}{c^{ss}}}{(1-\beta(1-\delta))(1-\eta) + \varphi v_{ck} + \varphi(1-\rho)\frac{k^{ss}}{c^{ss}}};$$
$$v_{ck} = (\beta^{-1} - v_{kk})\frac{k^{ss}}{c^{ss}}; \qquad v_{cz} = \frac{y^{ss}}{c^{ss}} - \frac{k^{ss}}{c^{ss}} v_{kz}; \qquad \gamma = \frac{(1-\beta(1-\delta))(1-\eta)(1-\beta+\beta\delta(1-\eta))}{\varphi\eta\beta} + \beta^{-1} + 1$$
and the superscript ss indicates steady state values. It is immediate to compute impulse responses for the four
variables of the model to a shock in $e_t$ and to build the criterion function measuring
the distance between the theoretical IRFs and those obtained from the data. In order to
investigate the shape of the objective function we compute the theoretical impulse
responses obtained with the following parameterization: (β = 0.985, ϕ = 2.0, ρ =
0.95, η = 0.36, δ = 0.025, $z^{ss}$ = 1) and compute the distance surface when we vary
parameters in an economically reasonable neighborhood of the selected values. Figure
1 presents three three-dimensional surfaces and contour plots obtained when we vary
two parameters at a time.
[Figure 1: Distance function and contour plots: RBC model. The panels plot the distance against pairs of the parameters σ, ρ, η, δ, β and z^ss.]
Clearly, although there is a unique minimum in correspondence of the true parameters, the objective function is very flat both locally around the minimum and globally
over the entire parameter range. For example, the persistence parameter ρ is weakly
identified in the interval [0.8,0.99] and the share of capital in the production function
η is only weakly identifiable in the range [0.3,0.6]. Interestingly, the distance function
displays a canyon of approximately the same depth in the depreciation rate δ and
the discount factor β [running from (δ = 0.005, β = 0.975) up to (δ = 0.03, β = 0.99)]
and in the steady state value of the technology shock z^ss and the depreciation rate δ. Note
that in both cases the one percent contour is pretty large and essentially includes the full range of economically
interesting values for δ and β. Hence, even though it is possible to identify one of the
two parameters conditional on the other, joint estimation is unlikely to be successful.
Therefore, not only may it be impossible to recover the parameters of the DGP in
response to technology disturbances because standard routines are unlikely to be able
to fruitfully search plateaus of this sort but, as we will see later on, attempts to make
the distance function better behaved may result in poor impulse response matches.
Given our analytical solution, we can go one step further and analyze the gradient
of the entries of the matrix A with respect to the parameters and check which are
the VAR coefficients primarily responsible for this pattern of results; examining the
gradient of the entries of A suffices, since the matrix B shares the same entries as the
matrix A. It turns out that vkk and vkz are the coefficients which cause the problems.
In fact, the local derivative of vkz with respect to ρ is small and equal to 0.08 and those
of vkk and vkz with respect to η are, respectively, -0.10 and 0.09. In other words, the
objective function is flat in ρ and η because the dynamics of the capital stock are only
weakly influenced by these two parameters. Since the law of motion of the capital
stock determines the dynamics of output, consumption and investment, there is little
additional information contained in the responses of these variables. Note also that
the local derivatives of vkk and vkz with respect to β and δ are not only small but have
similar magnitude and opposite sign. Therefore, the dynamics of the capital stock are
roughly insensitive to proportional increases/decreases in these two parameters.
Table 1: RBC model, 95% estimation range

Parameter  Value  True model          β = 0.995            Weighting           VAR
β          0.985  [0.9810, 0.9853]    0.995                [0.9617, 0.9999]    [0.9771, 0.9998]
η          0.36   [0.3594, 0.3616]    [0.3565, 0.3566]     [0.2974, 0.4882]    [0.3598, 0.3633]
ϕ          2.0    [1.9961, 2.0403]    [1.9522, 1.9557]     [1.1021, 4.8821]    [1.5213, 4.9232]
δ          0.025  [0.0208, 0.0261]    [0.0352, 0.0357]     [0.0001, 0.0445]    [0.0002, 0.0460]
ρ          0.95   [0.9488, 0.9501]    [0.9503, 0.9504]     [0.9388, 0.9604]    [0.9218, 0.9544]
z^ss       1.0    [0.9981, 1.0061]    [0.9534, 0.9542]     [0.7813, 2.3481]    [0.7117, 1.8424]
Ob. fct.          [9.7e-9, 0.0001]    [0.00001, 0.00007]   [8.5e-7, 1.5e-5]    [7.5e-5, 0.0003]
Depending on the initial conditions, different values of β and δ could be selected.
The standard practice of fixing β works here in the sense that, as shown in Figure 1, for any
value of β the distance function has reasonable curvature in the δ dimension (and
vice versa). However, given the loose relation between the size of these two parameters and the
value of the distance function, fixing β arbitrarily produces inconsistent estimates of δ
and possibly of the other parameters. Since the two parameters enter in an almost additive
manner in the relevant entries of the matrix A, there is a continuum of values for β and
δ which generates objective functions that are sufficiently close to the minimum. Fixing
one parameter helps to eliminate this multiplicity, but may induce serious estimation
biases. We show these facts in detail in Table 1, where we report the true parameters,
the range obtained randomizing on the initial conditions in a neighborhood of the true
parameters and the estimates obtained arbitrarily fixing β to 0.995. Figure 2 graphs the
population responses of capital, the real rate, consumption, output and the technology
shock and those obtained with the estimated parameters in the latter case.
[Figure 2: Responses of capital, the real rate, consumption, output and the technology shock over 20 periods; the population responses are labeled "true".]
First of all, the table shows that in a neighborhood of the true parameters the
distance function is relatively well behaved: except for δ, convergence to the true
parameter vector occurs starting from reasonably close initial conditions. Second,
both the table and the figure indicate that serious biases may occur when the value of
β is misspecified. For example, the estimate of the risk aversion coefficient is typically
downward biased by 5-10% and the estimated value of the depreciation rate has an
upward bias of 50% on average. Since impulse responses are computed analytically,
sample size is not an issue here. Note that for the 100 simulations presented in Figure 2
the minimized value of the objective function is within the range of the minimized values
of the objective function in the correct case, indicating that while only one global minimum
exists, the objective function is considerably flat around it. Third, with a misspecified
β the dynamics of output, capital stock and consumption in response to technology
shocks are somewhat less persistent despite the fact that the response of zt to its own
shocks is perfectly matched.
One of the reasons why parameters may be only weakly identifiable from the
objective function is that responses at long horizons may carry little information about
the structural parameters. This is analogous to the weak instrument problem in GMM
where variables which are lagged too far in the past may be more likely to satisfy the
exogeneity assumption but may also be weakly correlated with the function of interest
(see e.g. Stock, Wright and Yogo (2002)). So far we have used 20 steps of each of
the variables to match responses and, since we work with population impulse responses,
we have used the identity as weighting matrix. To see whether identifiability of the
parameters changes and, at the same time, mimic typical situations encountered in
practice where actual responses at long horizons have large standard errors, we have
repeated the exercise using a diagonal weighting matrix with 1/h² on the diagonal,
h = 1, 2, . . . , 20. The objective function is now much worse behaved for identification
purposes: plateaus exist in all dimensions and the objective function is now completely
flat for a much larger range of values than those found in Figure 1. As a consequence,
parameter estimates obtained from various initial conditions may differ considerably
from the true ones. In Table 1 we report the range of estimates obtained using 100
random initial conditions, forcing the minimization routine to find the zero within
the range of economically reasonable parameters. Now all the ranges are large and
this occurs in conjunction with minimized values of the objective function which are
relatively low. Therefore, our simple experiment seems to suggest that the objective
function has better identification properties when many horizons are considered and
when each horizon is equally weighted.
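The effect of the weighting matrix can be illustrated with a toy calculation in Python. The sketch below does not use the RBC model: as a simplifying assumption, the responses are those of an AR(1) process, ρ^h for h = 1, ..., 20, and the grid of alternative values of ρ is arbitrary. It compares the distance gap'W gap under the identity weights and under a diagonal W with 1/h² on the diagonal.

```python
import numpy as np

def ar1_responses(rho, horizons):
    """Toy stand-in for model IRFs: rho**h for h = 1, ..., horizons."""
    return rho ** np.arange(1, horizons + 1)

def weighted_distance(rho, rho_true, weights):
    """Quadratic distance gap' W gap with W = diag(weights)."""
    gap = ar1_responses(rho, weights.size) - ar1_responses(rho_true, weights.size)
    return float(gap @ (weights * gap))

horizon_index = np.arange(1, 21)
identity_weights = np.ones(20)
discount_weights = 1.0 / horizon_index ** 2     # 1/h^2 on the diagonal

rho_true = 0.95
grid = np.array([0.80, 0.85, 0.90, 0.99])        # assumed alternatives to rho_true

equal = np.array([weighted_distance(r, rho_true, identity_weights) for r in grid])
discounted = np.array([weighted_distance(r, rho_true, discount_weights) for r in grid])

for r, d_eq, d_disc in zip(grid, equal, discounted):
    print(f"rho = {r:.2f}: identity weights {d_eq:.4f}, 1/h^2 weights {d_disc:.4f}")
```

In this toy setting the discounted distance lies below the equally weighted one at every point of the grid, so a given departure from the true ρ is penalized less, which is one way to read the flatter objective function described above.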
Matching VAR coefficients
[Figure 3: Responses of capital, the real rate, consumption, output and the technology shock over 20 periods.]
One may wonder if matching the coefficients of the VAR representation of the model,
as opposed to the impulse responses, would make any difference for identification and
estimation. Intuitively, concentrating on VAR coefficients could help because one nonlinear
transformation, from the VAR representation to the impulse responses, disappears.
On the other hand, choosing parameters to match the coefficients of the A matrix could
worsen the outcome since this method neglects the information present in the
contemporaneous B matrix. A priori it is difficult to determine which effect dominates. The
last column of Table 1, which reports the range of estimates obtained matching VAR
coefficients, indicates that the latter problem is more important and that estimation
performance deteriorates. In fact, the range of estimates obtained randomizing on initial
conditions is about as large as that obtained weighting impulse responses with h12.
Furthermore, the distribution of certain parameter estimates, in particular those of ϕ
and δ, turns out to be centered around the wrong value. Figure 3 also clearly indicates
that minimizing the distance between VAR coefficients does not necessarily minimize
the distance of impulse responses, even though the population values fall within the
envelope of responses we have produced. All in all, it appears that matching VAR
coefficients is a worse strategy to obtain estimates of the structural parameters than
matching impulse responses.
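To make the role of the B matrix concrete, the following sketch assumes a first order VAR, y_t = A y_{t-1} + B e_t, whose impulse responses at horizon h are A^h B. The matrices are hypothetical and chosen only for illustration; they are not taken from the model of the previous sections.

```python
import numpy as np

def impulse_responses(A, B, horizons):
    """Responses of y_t = A y_{t-1} + B e_t to unit shocks: A^h B, h = 0, 1, ..."""
    out = []
    power = np.eye(A.shape[0])
    for _ in range(horizons):
        out.append(power @ B)
        power = A @ power
    return np.array(out)

# Two hypothetical candidates sharing the same A but with different impact matrices.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.5]])
A_alt = A_true.copy()
B_true = np.array([[1.0, 0.0],
                   [0.5, 1.0]])
B_alt = np.array([[1.0, 0.0],
                  [0.0, 1.0]])

# Distance measured on VAR coefficients only, and on impulse responses.
dist_var = float(np.sum((A_true - A_alt) ** 2))
dist_irf = float(np.sum((impulse_responses(A_true, B_true, 20)
                         - impulse_responses(A_alt, B_alt, 20)) ** 2))

print("distance between VAR coefficients:", dist_var)
print("distance between impulse responses:", dist_irf)
```

A criterion based on A alone cannot tell the two candidates apart, while their impulse responses differ at every horizon; this is the sense in which matching A neglects the information in B.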
6
Combining observational equivalence and weak identification problems
To show the problems one faces in practice when estimating structural parameters
from impulse responses, we use a model which is considered the workhorse in the
empirical New-Keynesian literature (see Woodford (2003), Ireland (200?) or Rubio and
Rabanal (2004)). We choose such a specification because several authors, including Ma
(2002), Beyer and Farmer (2004) and Nason and Smith (2005), have shown that such a
framework may be liable to many of the problems we have discussed so far. The model
consists of the following three equations:
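\begin{align}
y_t &= \frac{h}{1+h}\, y_{t-1} + \frac{1}{1+h}\, E_t y_{t+1} + \frac{1}{\phi}\,(i_t - E_t \pi_{t+1}) + v_{1t} \tag{4} \\
\pi_t &= \frac{\omega}{1+\omega\beta}\, \pi_{t-1} + \frac{\beta}{1+\omega\beta}\, \pi_{t+1} + \frac{(\phi+\vartheta)(1-\zeta\beta)(1-\zeta)}{(1+\omega\beta)\zeta}\, y_t + v_{2t} \tag{5} \\
i_t &= \varphi_r\, i_{t-1} + (1-\varphi_r)(\varphi_\pi \pi_{t-1} + \varphi_y y_{t-1}) + v_{3t} \tag{6}
\end{align}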
where h is the degree of habit persistence, ϑ is the inverse elasticity of labor supply
(also written θn in the figures and in the discussion below), ϕ is the relative risk
aversion coefficient, β is the discount factor, ω the degree of indexation of prices, ζ the
degree of price stickiness, while φr, φπ, φy are policy parameters.
The first two shocks are assumed to follow autoregressive processes of order one with
autoregressive parameters ρ1, ρ2 while v3t is an iid shock. The variances of the three
shocks are denoted by σ_i^2, i = 1, 2, 3. This model has 14 parameters: θ1 = (σ_1^2, σ_2^2, σ_3^2)
are unidentified from any scaled impulse response, while
θ2 = (β, ϕ, ϑ, ζ, φr, φπ, φy, ρ1, ρ2, h, ω) are the structural parameters on which we
focus attention.
We first check whether the objective function displays obvious pathologies. To do
this we calculate the Hessian of the objective function in the neighborhood of the true
parameter vector, which we select to be θ2 = (β = 0.985, ϕ = 2, ϑ = 3, ζ = 0.68, φr =
0.2, φπ = 1.55, φy = 1.1, ρ1 = 0.65, ρ2 = 0.65, ω = 0.25, h = 0.85), and compute its
rank and its eigenvalues. If one of the parameters is completely unidentifiable, the
Hessian should be rank deficient. Since we have three shocks we can define several
objective functions, one for each shock and an overall one. It turns out that for the
chosen parameter configuration, the Hessian of the first three objective functions is rank
deficient (the ranks are, respectively, 10, 10 and 9), while for the last one no rank deficiencies
are found. Therefore, a researcher using responses to single shocks may not be able
to recover certain parameters (as we have shown happens in section 4), while for the
full set of responses, unidentification pathologies are absent. Nevertheless, even using
the full set of responses, weak identification problems are present. In fact, a few of the
eigenvalues of the Hessian are small (the smallest is equal to 1), implying that the
objective function is very flat in some dimensions.
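A minimal sketch of this rank and eigenvalue check is given below. It does not use the model above: the response function is a hypothetical one in which a and b enter only through their product, so that one direction is unidentified by construction. At the true parameter vector the residuals of a sum-of-squares distance are zero, so its Hessian equals 2J'J, where J is the Jacobian of the responses; the sketch exploits this fact, and the relative tolerance used to define the numerical rank is an assumption.

```python
import numpy as np

def responses(theta, horizons=20):
    """Toy impulse response a * b * rho**h (a and b enter only as a product)."""
    a, b, rho = theta
    h = np.arange(horizons)
    return a * b * rho ** h

def jacobian(f, theta, eps=1e-6):
    """Central finite-difference Jacobian of f at theta."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((f(theta + step) - f(theta - step)) / (2.0 * eps))
    return np.column_stack(cols)

theta_true = np.array([2.0, 0.5, 0.65])   # hypothetical "true" values of (a, b, rho)
J = jacobian(responses, theta_true)

# Hessian of the sum-of-squares distance evaluated at the true vector.
hessian = 2.0 * J.T @ J
eigenvalues = np.linalg.eigvalsh(hessian)                    # ascending order
rank = int(np.sum(eigenvalues > 1e-8 * eigenvalues.max()))   # numerical rank

print("eigenvalues:", eigenvalues)
print("numerical rank:", rank, "out of", theta_true.size)
```

Here the rank is one less than the number of parameters, which flags the unidentified direction; eigenvalues that are positive but small relative to the largest one would instead point to weak identification.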
To examine which dimensions of the parameter space create problems,
we plot in figure 4 the shape of the objective function in each of the elements of θ2, when
we try to minimize the distance of the responses to shock 1 (column 1), shock 2 (column
2), shock 3 (column 3) and all shocks (column 4), and we vary one parameter at a
time within a reasonable range around selected values. Figure 4 shows the curvature
of the objective function in one dimension conditional on the other n − 1 values being
fixed at their "true values".
Figure 4: Shape of the objective function, one parameter at a time (β, ϕ, θn, ζ, φr, φπ, φy, ρ1, ρ2, ω, h). Panels: IS shocks, Cost Push shocks, Monetary Policy shocks, All shocks.
Many interesting features are present in the figure. First, the objective function is very
flat in many dimensions and this is regardless of the shock we try to match. Therefore,
it is possible that parameter estimates will significantly deviate from the selected values.
Second, different shocks bring different information about the parameters. Hence,
the particular shock selected matters for the estimation performance and estimation
results may depend on the shock considered. It is interesting to notice that the distance
function of responses to monetary shocks is very flat in all the dimensions except ζ.
Therefore, responses to monetary shocks are unlikely to pin down the structural parameters
of the model. Third, the objective function is asymmetric in certain dimensions.
For example, when cost push shocks are considered, the distance function is very asymmetric
in the risk aversion parameter ϕ, the inverse elasticity of labor supply θn and the
price stickiness ζ. In this sense, initial conditions may matter: convergence to the true
parameter values may be easier from one direction than from the other. Fourth, and confirming
our previous results, there are parameters which are unidentified by certain shocks. In
fact, as intuition would suggest, the persistence of, say, the cost push shocks cannot
be identified considering the responses to other shocks. Finally, and probably more
importantly, even when the responses to all shocks are used, the objective function is
still flat in several dimensions. Therefore, even when all the available information is
used, weak identification problems may be present.
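The asymmetry of a one-dimensional profile is easy to reproduce in a toy setting. The sketch below assumes a hypothetical response rho**h with true persistence 0.65 and evaluates the distance function on a grid around that value, holding everything else fixed; it is meant only to illustrate the kind of one-parameter-at-a-time calculation described above.

```python
import numpy as np

def distance(rho, rho_true=0.65, horizons=20):
    """Sum of squared gaps between toy responses rho**h and the true ones."""
    h = np.arange(horizons)
    return float(np.sum((rho ** h - rho_true ** h) ** 2))

grid = np.linspace(0.45, 0.85, 41)             # range around the true value
profile = np.array([distance(r) for r in grid])

below = distance(0.45)   # 0.20 below the true value
above = distance(0.85)   # 0.20 above the true value
print(f"distance at 0.45: {below:.3f}, distance at 0.85: {above:.3f}")
```

In this example the function rises much faster above the true value than below it, so a routine started from below moves along a flatter slope than one started from above.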
While Figures 4 and 5 consider one dimension at a time, the objective function
displays canyons and ridges in several dimensions, regardless of the responses we consider.
There is a continuum of models indexed by the size of two parameters which
are observationally equivalent from the point of view of impulse responses. We show
three dimensional pictures and contour plots in the first panel of figure 6 (these are
analogous to those presented in Figure 1) for the case of cost push shocks. It is easy to
see that the distance function is not very large for any combination of the parameters
within the chosen range; that neither ϑ nor h will be easily identifiable from this shock;
and that there are combinations of φx, φp which are consistent with the same value of
the distance function.
One may wonder if these unpleasant results are specific to an objective function
which measures the distance of impulse responses or whether they are intrinsic to the
model, so that every estimation approach will face similar identification problems. To
address this question we repeat the calculations used to obtain the first panel of figure
6 but now using the likelihood function. To make the experiment comparable we consider
both a limited information approach (i.e. we consider one shock at a time) and a
full information approach (i.e. we consider all the shocks). Intuitively, the likelihood
function uses more information than the distance function since the covariance matrix
of the shocks is used in estimation. On the other hand, if the normality assumption on
which estimation is based is incorrect, likelihood based inference may be problematic.
In the second panel of figure 6 we show that the situation improves when the
likelihood function is used, but only slightly. The likelihood function is pretty flat in
the (ϑ, ζ) dimensions and there is a ridge on the diagonal of the box. However, it is
slightly better behaved in the other dimensions. Hence, limited information ML is not
likely to be significantly better than matching impulse responses to estimate structural
parameters.
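One mechanical reason why a ridge can arise in the (ϑ, ζ) plane is visible in equation (5): there ϑ and ζ enter only through the coefficient on y_t. The sketch below, which uses the parameter values selected above and hypothetical function names, traces pairs (ζ, ϑ) that leave this coefficient unchanged. It does not reproduce the likelihood surface of figure 6; it only illustrates how different pairs can deliver the same slope.

```python
def slope(phi, vartheta, zeta, beta=0.985, omega=0.25):
    """Coefficient on y_t in equation (5)."""
    return (phi + vartheta) * (1 - zeta * beta) * (1 - zeta) / ((1 + omega * beta) * zeta)

def vartheta_on_ridge(zeta, kappa, phi=2.0, beta=0.985, omega=0.25):
    """Value of vartheta that keeps the slope equal to kappa for a given zeta."""
    per_unit = (1 - zeta * beta) * (1 - zeta) / ((1 + omega * beta) * zeta)
    return kappa / per_unit - phi

# Slope implied by the selected values phi = 2, vartheta = 3, zeta = 0.68.
kappa_true = slope(phi=2.0, vartheta=3.0, zeta=0.68)

# Pairs (zeta, vartheta) delivering exactly the same slope.
ridge = [(z, vartheta_on_ridge(z, kappa_true)) for z in (0.60, 0.64, 0.68, 0.72)]
for z, v in ridge:
    print(f"zeta = {z:.2f}, vartheta = {v:.3f}, slope = {slope(2.0, v, z):.6f}")
```

Along such a locus higher price stickiness is offset by a higher inverse labor supply elasticity, so any criterion that depends on the two parameters mainly through this coefficient will be close to flat along it.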
A last interesting feature of this model is the lack of concavity in the direction of
some parameters.
LACK OF CONCAVITY + BAD ESTIMATION RESULTS
Figure 6: Distance function, cost push shock (first panel) and likelihood function, cost push shock (second panel). Surface and contour plots in pairs of parameters; axis labels: θn, ζ, h, ω, φx, φπ.
Since it is now common to estimate models like the one used in this example with
Bayesian methods (see e.g. Canova (2004), Rubio and Rabanal (2004)), a few words
contrasting identification problems in classical and Bayesian frameworks are in order.
Posterior distributions are proportional to the likelihood of the model times the prior.
If the prior is very loosely specified, classical and posterior analyses will lead to the same
inference. However, if the prior is tightly specified, it is easy to produce well behaved
posterior distributions even if the likelihood function has very little information about
the parameter. Hence, Bayesian methods may give the illusion of overcoming weak
identification problems, but this is just that, an illusion, since the posterior distribution
will simply reflect prior information. However, this problem is easily detectable. In fact,
letting prior information become more and more diffuse will make the posterior of
weakly identifiable parameters also more and more diffuse. Such a simple diagnostic
for weak identification does not exist in our (classical) case and this makes the analysis
more complicated. But in general, weak identification appears to be more intrinsic to
the model than to the estimation approach.
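The diagnostic can be illustrated with a deliberately simple conjugate case. The sketch assumes a scalar parameter with a normal prior and a normal likelihood, so that the posterior precision is the sum of the prior precision and the precision of the likelihood; the two precision values are hypothetical and stand for a well identified and a weakly identified parameter.

```python
import numpy as np

def posterior_sd(prior_sd, likelihood_precision):
    """Posterior std. dev. for a normal prior combined with a normal likelihood."""
    return 1.0 / np.sqrt(1.0 / prior_sd ** 2 + likelihood_precision)

prior_sds = np.array([0.1, 1.0, 10.0, 100.0])     # increasingly diffuse priors
well_identified = posterior_sd(prior_sds, likelihood_precision=100.0)
weakly_identified = posterior_sd(prior_sds, likelihood_precision=1e-4)

for p, s, w in zip(prior_sds, well_identified, weakly_identified):
    print(f"prior sd {p:6.1f} -> posterior sd: well identified {s:.3f}, "
          f"weakly identified {w:.3f}")
```

As the prior is made more diffuse, the posterior standard deviation of the well identified parameter settles at the value implied by the likelihood, while that of the weakly identified parameter keeps growing with the prior.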
7
Misspecification and observational equivalence
The previous sections were concerned with the weak or under identification of some of
the parameters of a DSGE model. Here we consider a related but distinct problem:
the possibility of obtaining estimates of parameters which do not appear in the true
DGP. We have already seen that, given a model, weak identification makes it difficult
to pin down certain structural parameters (only combinations are possibly estimable).
Here we want to show that when models are nearly observationally equivalent from the
point of view of some objective function, it is possible to obtain reasonable estimates
of structures which are not those generating the data. What we have in mind, in
particular, are models with different frictions and the possibility that impulse
responses will be unable to distinguish them as they produce similar dynamics for
the variables of interest.
To address this issue we consider a model which is much richer than those examined
so far, which includes real and nominal frictions and has been shown to fit reasonably well
both the US economy (see Christiano, et al. (2005), Dedola and Neri (2004)) and the
EU economy (see Smets and Wouters (2003)). The log linearized model consists of the
following 11 equations:
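\begin{align*}
0 &= -k_{t+1} + (1-\delta) k_t + \delta x_t \\
0 &= -u_t + \psi r_t \\
0 &= \frac{\eta\delta}{\bar r}\, x_t + \left(1 - \frac{\eta\delta}{\bar r}\right) c_t - \eta k_t - (1-\eta) N_t - \eta u_t - e_{zt} \\
0 &= -R_t + \varphi_r R_{t-1} + (1-\varphi_r)(\varphi_\pi \pi_t + \varphi_y y_t) + e_{rt} \\
0 &= -y_t + \eta k_t + (1-\eta) N_t + \eta u_t + e_{zt} \\
0 &= -N_t + k_t - w_t + (1+\psi) r_t \\
0 &= E_t\left[ \frac{1}{1+h}\, c_{t+1} - c_t + \frac{h}{1+h}\, c_{t-1} - \frac{1-h}{(1+h)\phi}\,(R_t - \pi_{t+1}) \right] \\
0 &= E_t\left[ \frac{\beta}{1+\beta}\, x_{t+1} - x_t + \frac{1}{1+\beta}\, x_{t-1} + \frac{\chi^{-1}}{1+\beta}\, q_t + \frac{\beta}{1+\beta}\, e_{xt+1} - \frac{1}{1+\beta}\, e_{xt} \right] \\
0 &= E_t\left[ \pi_{t+1} - R_t - q_t + \beta(1-\delta) q_{t+1} + \beta \bar r\, r_{t+1} \right] \\
0 &= E_t\left[ \frac{\beta}{1+\beta\gamma_p}\, \pi_{t+1} - \pi_t + \frac{\gamma_p}{1+\beta\gamma_p}\, \pi_{t-1} + T_p \left( \eta r_t + (1-\eta) w_t - e_{zt} + e_{pt} \right) \right] \\
0 &= E_t\left[ \frac{\beta}{1+\beta}\, w_{t+1} - w_t + \frac{1}{1+\beta}\, w_{t-1} + \frac{\beta}{1+\beta}\, \pi_{t+1} - \frac{1+\beta\gamma_w}{1+\beta}\, \pi_t + \frac{\gamma_w}{1+\beta}\, \pi_{t-1} \right. \\
  &\qquad \left. - T_w \left( w_t - \sigma_l N_t - \frac{\phi}{1-h}\,(c_t - h c_{t-1}) - e_{wt} \right) \right]
\end{align*}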
The first equation is the capital accumulation equation, where δ is the depreciation rate
and xt represents current investment; the second links capacity utilization ut to the
real rate rt and ψ is a parameter; the third equation is the resource constraint of the
economy linking consumption ct and investment expenditure to output produced, where
r̄ is the steady state interest rate and ezt is a technological disturbance; the fourth
equation represents the policy rule of the monetary authority and ert is a monetary policy
disturbance; the fifth equation represents the production function where η is the share
of capital in the production function; the sixth equation is a labor demand equation
where Nt is hours worked and wt the real wage rate; the seventh equation is an Euler
equation for consumption where h is a parameter capturing habit persistence, ϕ is the
risk aversion coefficient and πt the current inflation rate; the eighth equation is an Euler
equation for investment where qt is Tobin's q, β the discount factor, χ^{-1} the elasticity
of investment with respect to Tobin's q and ext an investment shock; the ninth equation
describes the dynamics of Tobin's q; the last two equations represent the price
setting and the wage setting equations, where γp (γw) represents the level of price (wage)
indexation, ζp (ζw) the price (wage) stickiness parameter, σl is the inverse elasticity
of labor supply, λw is a wage markup, ept (ewt) are shocks to the pricing relationships, and
\begin{align*}
T_p \equiv \frac{(1-\beta\zeta_p)(1-\zeta_p)}{(1+\beta\gamma_p)\zeta_p}, \qquad
T_w \equiv \frac{(1-\beta\zeta_w)(1-\zeta_w)}{(1+\beta)\left(1+(1+\lambda_w)\sigma_l \lambda_w^{-1}\right)\zeta_w}.
\end{align*}
The vector of parameters includes the structural ones,
θ1 = (β, ϕ, σl, h, δ, η, χ, ψ, γp, γw, ζp, ζw, λw, φr, φπ, φy), and
the auxiliary ones, θ2 = (ρz, ρx, σz, σr, σp, σw, σx), where the first two represent the
persistence of the technology and investment shocks and the last five the standard
deviations of the disturbances. As usual, the standard errors of the shocks are not identified
from the normalized impulse responses and some of the persistence
parameters are identified only when own shocks are considered.
To show the identification problems a researcher faces in matching the responses
of the model we construct population responses by calibrating the parameters
to the posterior mean estimates for the US economy obtained by Dedola and Neri,
that is θ1 = (β = 0.991, ϕ = 3.014, σl = 2.145, h = 0.448, δ = 0.0182, η = 0.209, χ =
6.300, ψ = 0.564, γp = 0.862, γw = 0.221, ζp = 0.887, ζw = 0.620, λw = 1.2, φr =
0.779, π̄ = 1.016, φπ = 1.454, φy = 0.234) and θ2 = (ρz = 0.997, ρx = 0.522, σz =
0.0064, σr = 0.0026, σp = 0.221, σw = 0.253, σx = 0.557), and first examine the shape of
the distance function in the neighborhood of this parameter vector, one parameter at
a time. Figures x1 and x2 show the shape of the distance function when we consider
monetary policy shocks or monetary and technology shocks jointly: both figures indicate
that the problems we have previously noted are present to a much larger degree
in this model. First, with both sets of shocks the objective function is extremely flat
with respect to many of the parameters, and this is true even for quite
large variations in the range of the parameters. Second, the objective function is highly
asymmetric in the dimensions represented by the price and wage stickiness parameters
(ζp, ζw), the wage markup λw, the policy persistence parameter φr and the risk aversion
coefficient ϕ. Therefore, it may be very important to properly specify the initial
conditions: estimation routines which start, e.g., from ζp below 0.8 will probably stop
roughly where they start.
Figure 7: Shape of the distance function, one parameter at a time (habit, chi, sigc, delta, eta, beta, zeta_w, zeta_p, gamma_w, gamma_p, sig_l, psi, lambda_w, phi_y, phi_pi, phi_r, rhoz). Monetary shocks.
These identification problems are, however, much deeper than this. In figure x3
we show the surface of the distance function when monetary shocks are considered in
the dimensions represented by price stickiness (ζp), price indexation (γp), wage
stickiness (ζw) and wage indexation (γw). It is clear that when monetary policy shocks
are considered, there are various combinations of the four parameters which produce
a minimized objective function which is very close to the true one. Furthermore, even
arbitrarily multiplying the objective function by a large number is unlikely to make
minimization routines recover the true DGP since the surface has a number of local
minima which are not necessarily visited as initial conditions are changed. It is important
to stress that, at least in these dimensions, adding technology shocks does not
make these parameters more easily identifiable.
Figure 8: Shape of the distance function, one parameter at a time (sigc, delta, eta, beta, habit, chi, zeta_w, zeta_p, gamma_w, gamma_p, sig_l, psi, lambda_w, phi_y, phi_pi, phi_r, rhoz). Both shocks.
Armed with this evidence, we have considered a few alternative models where either
stickiness or indexation in wages or prices is eliminated from the true DGP and asked
our minimization routines to find estimates when the general model considered in
this section is used. Table 3 reports our estimation results. Notice that we provide
estimates without standard errors since, as in Neely, Roy and Whiteman (2001), it is
impossible to invert the Hessian matrix: the condition number (the ratio of the largest
to the smallest singular value) of the covariance matrices associated with this table is
of the order of 10^18. In all these cases the matrix was singular to machine tolerance,
indicating not only weak but also underidentification of two or more parameters. Also,
we present the estimates which produce the smallest value of the objective function starting
from a random set of 100 initial conditions.
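The condition number used here can be computed directly from the singular values. The sketch below builds a hypothetical Jacobian whose two columns are almost collinear, forms the implied cross-product matrix, and reports the ratio of its largest to its smallest singular value; the numbers are illustrative and much less extreme than those found for the model of this section.

```python
import numpy as np

# Hypothetical Jacobian of responses with respect to two parameters whose
# effects are almost collinear (second column = first plus a tiny perturbation).
h = np.arange(1, 21, dtype=float)
col = 0.9 ** h
J = np.column_stack([col, col + 1e-6 * h])

hessian = J.T @ J
singular_values = np.linalg.svd(hessian, compute_uv=False)   # descending order
condition_number = singular_values[0] / singular_values[-1]

# Square roots of the diagonal of the inverse: the quantities that would be
# used to build standard errors (up to scale) blow up when the matrix is
# nearly singular.
inverse_diag_sqrt = np.sqrt(np.diag(np.linalg.inv(hessian)))

print(f"condition number: {condition_number:.2e}")
print("sqrt of diagonal of inverse:", inverse_diag_sqrt)
```

When the condition number approaches the reciprocal of machine precision, as in the cases reported in the text, the inverse can no longer be computed reliably at all.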
Table 3: Estimates of various models, matching monetary policy shocks

            ζp      γp      ζw      γw      Obj.Fun.
Baseline    0.887   0.862   0.620   0.221
Estimates   0.833   0.549   0.604   0.379   1.46e-06
Case 1      0.000   0.862   0.620   0.221
Estimates   0.397   0.010   0.654   0.419   4.51e-07
Case 2      0.000   0.000   0.620   0.221
Estimates   0.395   0.010   0.653   0.411   4.50e-07
Case 3      0.000   0.862   0.620   0.000
Estimates   0.442   0.001   0.673   0.407   6.54e-07
Case 4      0.887   0.000   0.000   0.221
Estimates   0.901   0.280   0.011   0.010   1.94e-07
Case 5      0.887   0.000   0.620   0.801
Estimates   0.928   0.302   0.586   0.155   3.50e-07
Case 6      0.887   0.000   0.000   0.221
Estimates   0.895   0.321   0.071   0.010   7.35e-06
The table displays several interesting features. First, in the baseline case, estimates
of price indexation are considerably lower than the true ones. Second, responses to
monetary shocks cannot distinguish models featuring no price stickiness from models
featuring no price indexation (see case 1), nor models where there is price indexation from
those where there is not (compare cases 1 and 2). Moreover, it is possible to confuse a
model with no price stickiness and no indexation with a model where these two features
exist but no price indexation is present (see case 3), and models with no price stickiness
and high wage indexation are observationally equivalent to models where both features
are present and price indexation is, roughly, twice as important as wage stickiness (see
case 5). Finally, a model where prices are sticky and wages are indexed cannot be
distinguished from a model which features price stickiness and price indexation but
no wage stickiness or wage indexation (case 4). Third, in all the cases the minimized
objective function is within the tolerance level. Therefore, flatness of the distance
surface is not necessarily a problem in these cases. This is clearly shown in figure
x4 where we report responses to monetary shocks obtained in case 4 with true and
estimated parameters: no investigator looking at this graph would doubt that she
has nailed down the correct model! One may wonder if these problems can be partially
reduced by using responses to a larger number of shocks of the model. Case 6 in table 3
considers estimating the parameters of the model using responses to both monetary and
technology shocks. Clearly, no improvement is visible. But this is expected: as figure
x2 has shown, adding technology shocks does not necessarily increase the identifiability
of these four parameters. We have also conducted exercises altering the weighting
matrix and changing the number of responses considered, but the message is unchanged.
Also, using estimated instead of population responses will make, as shown in table 2,
matters substantially worse.
Figure 9: No wage stickiness, no price indexation. Panels: Inflation, Interest rate, Real wage, Investment, Hours worked, Capacity utilisation, Consumption, output. Horizontal axis: quarters after shock.
We would like to stress that we are not the first to indicate problems with specifications
which use stickiness and indexation jointly in a model (see e.g. Ma (2002)
or Beyer and Farmer (2004)). However, we are the first to show that the problem is
much more widespread, that it involves combinations of all the nominal features of the
model and that it is possible to estimate the wrong model without being aware that a
problem exists. It is therefore important for empirical practice to have ways to check for
under and weak identification and for observational equivalence of alternative economic
structures.
8
Detecting identification problems
TO BE COMPLETED
9
Conclusions and suggestions for empirical practice
TO BE COMPLETED
References
[1] Altig, D., Christiano, L., Eichenbaum, M. and Linde, J. (2003) The role of monetary policy in the propagation of technology shocks, Northwestern University,
manuscript.
[2] Altig, D., Christiano, L., Eichenbaum, M. and Linde, J. (2004) Firm specific capital
etc. Northwestern University, manuscript.
[3] Beyer, A. and Farmer, R. (2004) On the indeterminacy of new Keynesian Economics, ECB working paper 323.
[4] Boivin, J. and Giannoni, M. (2002) Has monetary Policy Become Less Powerful?,
Columbia Business School, manuscript.
[5] Canova, F. (2004) Structural Changes in the US Economy, 1948-2002, available at
www.igier.unibocconi.it/canova
[6] Canova, F. (2005) Methods for Applied Macroeconomic Research, forthcoming,
Princeton University Press.
[7] Cogley, T., Colacito, R., Sargent, T. (2005)
[8] Del Negro, M, Schorfheide, F., Smets, F. And R. Wouters (2004) On the fit of
New Keynesian models, CEPR working paper
[9] Ellison, M. (2005) Discussion to Cogley, Colacito and Sargent
[10] Gali, J. and Rabanal, P. (2004) Technology shocks and aggregate fluctuations:
How well does the RBC model fit Postwar US data?, NBER working paper 10636
[11] Ireland, P. (2004) Technology Shocks in the New Keynesian model, Boston College,
manuscript
[12] Neri, S. (2003) Comparing adjustment cost and principle agent models: A Bayesian
approach, Bank of Italy, manuscript.
[13] Ma, A. (2002) Economic Letters.
[14] Meier, F. and Mueller, (2004), Estimating a model of the financial accelerator,
EUI working paper.
[15] Pesaran, H. (1981) Identification of Rational Expectations Models, Journal of
Econometrics, 16, 375-398.
[16] Rotemberg, J. and Woodford, M. (1997) An Optimization based Econometric
Framework for the Evaluation of Monetary Policy, NBER Macroeconomic Annual,
12, 297-346.
[17] Smith, A. (1993) Estimating Nonlinear Time series models using simulated VARs,
Journal of Applied Econometrics, 8, 63-84.
[18] Stock, J. and Wright, J. (2000) GMM with weak identification, Econometrica, 68,
1055-1096.
[19] Stock, J., Wright, J. and Yogo, M. (2002) A survey of weak instruments and
weak identification in generalized method of moments, Journal of Business and
Economic Statistics, 20, 518-529.