Learning about the parameters and the dynamics of DSGE models: identification and estimation

Fabio Canova
IGIER, UPF and CEPR

Luca Sala
IGIER - Università Bocconi

May 3, 2005

Abstract

In this paper we study identification and estimation of structural parameters in DSGE models. Results show that for a large class of models some (if not all) parameters are either not identified or weakly identified.

1 Introduction

The 1990s saw a remarkable development in the specification of DSGE models. The literature has added considerable realism to the constructions popular in the 1980s: in particular, a number of shocks and frictions have been introduced into primitive RBC models driven by technology disturbances. Steps forward have also been made in comparing the models' approximation to the data. While ten years ago it was standard to calibrate the parameters of a model and evaluate its outcomes informally, maximum likelihood or Bayesian estimation of the structural parameters is now quite common in both policy and academic circles (see e.g. Ireland (2003), Smets and Wouters (2003), Canova (2004)), and new techniques have been introduced for evaluation purposes (see Del Negro et al. (2004)).

Given the complexity of current structures and the difficulty of extracting information useful for evaluating the fit of DSGE models, a strand of the literature has considered less demanding limited information methods and focused on whether the model matches the data along certain dimensions. Following Rotemberg and Woodford (1997), Altig et al. (2003) and others, it has now become common to estimate the parameters of a model by matching conditional dynamics in response to certain structural shocks (see also Canova (2002) for an alternative but still limited information approach). Matching impulse responses fits nicely into the class of indirect inference methods with nuisance parameters developed by Renault and Dridi (1998).
One of the crucial conditions needed for the methodology to deliver meaningful estimates is identifiability: the objective function relating structural parameters to the selected conditional responses must have a unique optimum, and it must have enough curvature to pin down the location of that optimum with sufficient precision. Since in DSGE models conditional dynamics in response to shocks depend in a highly nonlinear way on the structural parameters, it is far from clear how to check whether these identifiability conditions are met. Moreover, even if local identifiability conditions are satisfied, it could be hard to estimate the parameters, and unrestricted estimates of their standard errors may turn out to be unreasonably large.

This paper investigates identifiability issues in DSGE models and their consequences for parameter estimation. Since the field is vast and largely unexplored, we focus on general issues and leave many interesting but detailed avenues of research untouched. We start in the next section by discussing the generics of identification in DSGE models, contrasting the problems existing in linear and nonlinear models, and highlighting why the theory recently developed for GMM estimates by Stock and Wright (2000) and Wright (2003) does not immediately apply when one deals with impulse responses instead of moment conditions. The next two sections provide examples of simple structures generating two identification problems commonly found in DSGE models: observational equivalence and the disappearance of structural parameters from the relevant objective functions. We then discuss situations where, although the objective function has a unique zero and its Hessian has full rank, weak identification problems arise.

We divide our analysis in two. First, we consider the situation where an investigator knows the population DGP but matches only the dynamics induced by one or two of the shocks driving the economy.
We show that some parameters are not recoverable from impulse responses and that the objective function may be too flat in some dimensions to allow hill climbing routines to work appropriately. We examine whether, from the identification point of view, it is better to match impulse responses or the VAR coefficients of the model, and to what extent small samples worsen the identification problem. Matching VAR coefficients could be beneficial since some nonlinearities are eliminated. However, since information about contemporaneous correlations is ignored, identification problems may be compounded. We also examine the effects of using different weighting matrices and different objective functions, and briefly compare identification issues in classical and Bayesian frameworks. We show that typical identification problems are magnified if the weighting matrix heavily discounts responses at long horizons, and that using the likelihood function in place of a distance function could be beneficial in some situations but not in general.

Next, we examine what happens when an investigator has an incorrectly specified model and uses the dynamic implications of one or two shocks to estimate the parameters of her model. In particular, we are interested in whether spurious estimation phenomena exist: a situation in which two alternative features of the model are indistinguishable from the point of view of impulse response dynamics (e.g. habit in consumption plus sticky prices vs. habit in leisure and sticky wages), or a situation in which parameters which do not exist in the original specification become estimable because conditional dynamics in response to certain shocks are observationally equivalent to those produced by the true DGP.
Our interest in this issue is motivated by contrasting evidence found in recent papers. For example, Boivin and Giannoni (2002) find structural breaks in monetary policy rules estimated from responses to monetary shocks, while ML and Bayesian methods fail to find them (see Canova (2004) or Gali and Rabanal (2004)); Meier and Muller (2004) find that a financial accelerator is important in matching impulse responses to monetary shocks, while ML and Bayesian estimation find it insignificant (see e.g. Neri (2003)). Also in this case, we examine whether it is possible to improve estimates by altering the objective function, adding information, varying the distance matrix, or changing the dimension of the vector of responses.

The rest of the paper is organized as follows. The next section discusses a few general identification issues in linear and nonlinear models. Section 3 presents examples of different structural models which produce observationally equivalent impulse responses. Section 4 presents an example where some structural parameters are unidentified from impulse responses. Sections 5 and 6 examine weak identification issues. We show that this problem is generic and difficult to solve, and we document its consequences for estimates of the structural parameters of the model. Section 7 deals with small samples and tries to measure how identification problems and small samples interact to bias estimates and standard errors. Section 8 presents cases where nonexistent parameters could be estimable because of observational equivalence issues. Section 9 discusses diagnostics to detect identification problems in DSGE models, provides some suggestions for practical empirical work and offers a few preliminary conclusions.

2 A few definitional issues

Identification has to do with the ability to draw inference about a theoretical structure from observed samples. It goes without saying that, whatever estimation approach one takes, a parameter must be identifiable to be estimable.
While identification issues in linear models with or without expectations have been extensively studied, and the literature on this issue goes back at least to Koopmans and Reiersol (1950), Rothenberg (1971) and Pesaran (1981), the identification problem becomes much more complicated when the model is nonlinear in the parameters, as is the case with the solution of DSGE models.

Let us start by recalling a few standard definitions. Suppose the structural model can be characterized uniquely by a density function L(y, θ0), where θ0 is an m × 1 vector of parameters. Identification has to do with the possibility of distinguishing between parameter points, and it is related to the existence of an inverse mapping from the probability distribution of the random variables Y to the vector θ0 and thus to the structural model. Two parameter vectors θ1 and θ2 are said to be observationally equivalent if L(y, θ1) = L(y, θ2) for any y. A parameter vector θ0 is locally identifiable if there exists an open neighborhood of θ0 containing no other θ which is observationally equivalent to θ0; it is globally identifiable if there is no other θ in the entire parameter space which is observationally equivalent. If a component j of θ0, call it θj0, is observationally equivalent to all θj ∈ Θj, then θ0 is underidentified. On the other hand, a parameter vector θ0 is partially identifiable if there exists a constant vector α such that θ1 = αθ0 is observationally equivalent to θ0.

Local identifiability is related to the rank of the information matrix $\mathcal{I}(\theta)$, whose (i, j)-th element is

$$\mathcal{I}(\theta)_{ij} = E\left(\frac{\partial \log L(y,\theta)}{\partial\theta_i}\,\frac{\partial \log L(y,\theta)}{\partial\theta_j}\right) = -E\left(\frac{\partial^2 \log L(y,\theta)}{\partial\theta_i\,\partial\theta_j}\right).$$

One can prove that θ0 is locally identifiable if and only if the rank of $\mathcal{I}(\theta_0)$ is constant and equal to m in a neighborhood of θ0 (Rothenberg, 1971). The intuition behind this result is straightforward: the information matrix measures the "curvature" of the likelihood function at each point.
If in the direction of some parameters there is no curvature, there is no information about those parameters and the corresponding rows and columns of $\mathcal{I}(\theta)$ will be zero.

It is important to stress that, while the above definitions and results are specific to the likelihood function, similar concepts can be defined for estimation methods based on the minimization of an objective function g(y, θ). In this case, the condition for local identifiability of the true parameter θ0 is that g(y, θ) has a unique minimum at θ0. This implies two things:

1. the gradient must be zero only at θ0: ∇g(y, θ0) = 0;
2. the Hessian $H = \left.\frac{\partial^2 g(y,\theta)}{\partial\theta_i\,\partial\theta_j}\right|_{\theta=\theta_0}$ must be positive definite in a neighborhood of θ0 (H must then also be non-singular).

If the rank of the Hessian at θ0 is less than m, some elements of θ0 will not be locally identifiable. A particular case is when the second derivative is zero in the direction of some parameters. This implies that the objective function is locally linear in that direction and thus cannot have a unique local minimum.

Let us now discuss the problems that may arise in practice. First, the population objective function may not be globally convex. Certain parameters may be indistinguishable from the point of view of a particular objective function, and this means that multiple solutions to the optimization problem may be obtained. This clearly does not imply that the two parameterizations would be indistinguishable under all objective functions, nor that it is impossible to find certain parameters that are uniquely and precisely pinned down: it simply suggests that some objective functions are unsuited to distinguish certain features of the model at hand.

Second, the population objective function may be independent of certain parameters. A structural parameter may disappear from the log-linearized solution of a model, so that the objective function does not vary over the entire range of that parameter (and stays at zero when the remaining parameters are at their true values).
Third, two structural parameters may enter only proportionally in the objective function. Therefore, they cannot be separately recovered. This is a well known problem and naturally links to the rank condition for identification used in old style simultaneous equation systems.

Fourth, the population objective function may be globally convex and have a unique minimum, but its curvature may be "insufficient" in some or all directions, so that plateaus or flat areas may be present. Note that this problem could either be specific to a small neighbourhood of the parameter space or concern the entire parameter space. One further complication emerges when the objective function is asymmetric in the neighborhood of the zero and curvature deficient in only a part of the parameter space. Since in this case the objective function is asymmetric, different parts of the parameter space may carry different information about the parameters. Therefore, identification may depend on the true parameter vector. We call this phenomenon the asymmetric weak identification problem.

Fifth, and this applies primarily to limited information estimation methods which consider only some aspects of the model, different variables (shocks) may carry different information about the parameters. In other words, it is possible that a parameter is identified from an objective function which considers all the features of the model, but is not identifiable when only one equation or one set of responses is used. Hence, if limited information methods are used, it may matter which equation or shock is used to pin down the parameters, and different equations (or shocks) may produce substantially different parameter estimates or leave parameters unidentified. We call this the limited information identification problem.

In the next few sections we show that all these issues become relevant when we deal with DSGE models.

3 Observational equivalence: two structural models have the same impulse responses.
The example we consider here is as simple as possible, to allow us to compute an analytical solution to the model and, at the same time, to illustrate one of the problems often encountered with DSGE models: the observational equivalence of two structures with respect to impulse responses. Suppose a time series x_t is generated from

$$x_t = \frac{1}{\lambda_2+\lambda_1} E_t x_{t+1} + \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2} x_{t-1} + v_t$$

where λ2 ≥ 1 ≥ λ1 ≥ 0. The unique (stable) rational expectations solution is given by

$$x_t = \lambda_1 x_{t-1} + \frac{\lambda_2+\lambda_1}{\lambda_2} v_t.$$

Therefore, given v_t = 1, the responses of x_t are

$$\left[\frac{\lambda_2+\lambda_1}{\lambda_2},\ \lambda_1\frac{\lambda_2+\lambda_1}{\lambda_2},\ \lambda_1^2\frac{\lambda_2+\lambda_1}{\lambda_2},\ \ldots\right],$$

so that, using at least two horizons of x_t, one can estimate the two parameters λ1 and λ2. What we want to show is that there is a different process whose (stable) rational expectations solution has the same impulse response. In this case, an objective function measuring the distance between theoretical and empirical impulse responses will have multiple zeros.

Consider for example the process y_t = λ1 y_{t−1} + w_t, where w_t is an iid shock with zero mean and variance $\sigma_w^2$. Clearly, the responses of y_t to a shock in w_t will be identical to the responses of x_t to a shock in v_t if $\sigma_w = \frac{\lambda_2+\lambda_1}{\lambda_2}\sigma_v$. Hence, a process with backward looking dynamics is observationally equivalent to a process with forward and backward looking dynamics and an iid fundamental error. This argument, for example, is the basis for Beyer and Farmer's (2004) claim that the data cannot tell whether a Phillips curve is backward looking or forward looking, and it is the cornerstone of Pesaran's (1981) critique of tests of rational vs. adaptive expectations models.

What is the key ingredient that makes the two processes equivalent from the point of view of impulse responses? Clearly, the unstable root λ2 enters the solution only contemporaneously.
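The equivalence in this example can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the parameter values and function names are hypothetical, and the expectation E_t x_{t+1} is replaced by λ1 x_t, which is what the stable solution implies. It computes the responses of both processes and verifies that the stable solution satisfies the forward and backward looking equation along the impulse path.

```python
import numpy as np

def irf_forward_backward(lam1, lam2, horizons, sigma_v=1.0):
    """Responses of x_t to a shock in v_t under the stable solution
    x_t = lam1 * x_{t-1} + ((lam1 + lam2) / lam2) * v_t."""
    impact = (lam1 + lam2) / lam2 * sigma_v
    return impact * lam1 ** np.arange(horizons)

def irf_backward(lam1, sigma_w, horizons):
    """Responses of y_t = lam1 * y_{t-1} + w_t to a shock of size sigma_w."""
    return sigma_w * lam1 ** np.arange(horizons)

def structural_residual(lam1, lam2, x):
    """Residual of x_t = E_t x_{t+1}/(lam1+lam2) + lam1*lam2/(lam1+lam2) x_{t-1} + v_t
    along an impulse path, with E_t x_{t+1} = lam1 * x_t and v_0 = 1."""
    x_lag = np.concatenate(([0.0], x[:-1]))
    v = np.zeros_like(x)
    v[0] = 1.0
    s = lam1 + lam2
    return x - (lam1 * x) / s - (lam1 * lam2 / s) * x_lag - v

# Hypothetical values satisfying lam2 >= 1 >= lam1 >= 0.
lam1, lam2 = 0.7, 1.5
H = 10

irf_x = irf_forward_backward(lam1, lam2, H)
# Shock size in the backward looking process chosen to absorb the unstable root.
sigma_w = (lam1 + lam2) / lam2
irf_y = irf_backward(lam1, sigma_w, H)

max_gap = np.max(np.abs(irf_x - irf_y))
max_resid = np.max(np.abs(structural_residual(lam1, lam2, irf_x)))
print("largest gap between the two sets of responses:", max_gap)
print("largest residual of the structural equation:", max_resid)
```

Under these assumptions the two sets of responses coincide at every horizon, so a distance function built on them cannot separate the two structures.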
Since the variance of the shocks is not estimable from normalized impulse responses (any value simply implies a proportional increase in all the elements of the impulse response function), we can arbitrarily select it in the second case so as to capture the effects of the unstable root. In other words, an investigator has one degree of freedom in the analysis and can choose it so as to make the two processes share both contemporaneous and lagged dynamics.

Several other examples of observationally equivalent structures have appeared in the literature. For example, Ma (2002) has shown that a standard forward looking Phillips curve is consistent with two structural models having very different firms' pricing behaviour, and Altig et al. (2004) constructed a model with firm specific capital which produces the same inflation dynamics as a model without firm specific capital. Note that while two observationally equivalent structures imply that the objective function has two zeros, a continuum of observationally equivalent models indexed, for example, by the size of two parameters will create a canyon or a ridge in the objective function, and information external to the model needs to be brought in to disentangle the various structural representations.

4 Some structural parameters are unidentifiable from impulse responses.

Examples of models where structural parameters do not enter impulse responses, or where two parameters enter in a proportional fashion, are also numerous. It is easy to build examples where structural parameters completely disappear from the moving average representation of the model. For this purpose, consider a version of the three-equation New Keynesian model:

$$y_t = a_1 E_t y_{t+1} + a_2 (i_t - E_t \pi_{t+1}) + v_{1t} \qquad (1)$$
$$\pi_t = a_3 E_t \pi_{t+1} + a_4 y_t + v_{2t} \qquad (2)$$
$$i_t = a_5 E_t \pi_{t+1} + v_{3t} \qquad (3)$$

where the first equation is the log-linearized Euler condition, the second a forward looking Phillips curve relationship and the third an equation characterizing monetary policy.
Using the method of undetermined coefficients, and noting that the model features no state variables, it is easy to guess that the solution for the output gap, inflation and the nominal interest rate is a linear function of the three shocks v_{jt}. It is straightforward to show that, in deviation from steady states, such a solution is of the form:

$$\begin{bmatrix} \hat{y}_t \\ \hat{\pi}_t \\ \hat{i}_t \end{bmatrix} = \begin{bmatrix} 1 & 0 & a_2 \\ a_4 & 1 & a_2 a_4 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_{1t} \\ v_{2t} \\ v_{3t} \end{bmatrix}$$

While the model we consider is very simple and none of the three shocks induces any propagation over time, several useful points can be made. First, the parameters a1, a3, a5 disappear from the log-linearized solution: they only enter in the steady states and in a nonlinear fashion. Interestingly, they are those characterizing the forward looking dynamics of the model. Therefore, if one uses the responses of the output gap, of inflation and of the nominal interest rate to any of the shocks, she will only be able to recover a2 and a4. Since variations in a1, a3, a5 have no influence on the dynamics of the system in deviation from steady states, any objective function (and the likelihood itself) which uses the dynamics of the output gap, inflation and the nominal rate will be independent of them (the parameters are completely unidentified).

Second, different shocks carry different information about the parameters. For example, responses to v_{1t} allow us to recover only a4; responses to v_{3t} may be used to back out both a4 and a2, while responses to v_{2t} have no information about the two parameters. Therefore, in this case matching responses to monetary shocks makes responses to other shocks superfluous as far as parameter estimation is concerned. Conversely, matching responses to v_{2t} leaves all the parameters of the model unidentified.

5 Weak and under identification

There is a sense in which the cases considered in the two previous examples are pathological.
The objective function is ill-behaved in both cases: it either has multiple zeros or is equal to zero in some dimension regardless of the true parameter value. While these cases may be seen as extreme, there are many applied situations where the population objective function has a unique zero and the Hessian is locally positive definite, but parameters are still only weakly identified: away from the true parameter vector the objective function is close to zero, and this may occur either in a neighbourhood of the true parameter vector or over a large range of values. We call the first phenomenon locally weak identification and the second one globally weak identification.

To show that both features are relatively standard in DSGE models, we use a simple RBC model driven by technology disturbances. We choose to work with the simplest version of the model since in this case we can compute the solution analytically. This allows us to study whether and how VAR coefficients and impulse responses change once parameters are altered, and therefore to better highlight problems that may occur.

The representative agent maximizes a utility function which depends on current and future consumption and is of the form

$$U_t = \sum_{t}^{\infty} \beta^t \frac{c_t^{1-\phi}}{1-\phi}$$

and the resource constraint is

$$c_t + k_{t+1} = z_t k_t^{\eta} + (1-\delta) k_t.$$

Here z_t is a first-order autoregressive process with autoregressive coefficient ρ, steady state value z^{ss} and variance $\sigma_e^2$, and k_t is the current capital stock. The parameters of the model are θ = [β, ϕ, δ, η, ρ, z^{ss}]. Noting that the current level of the capital stock and the technological disturbance are the states of the problem, we can use the method of undetermined coefficients to express the endogenous variables in terms of the states and the shocks.
In particular, the solution for w_{t+1} = [k_{t+1}, c_t, y_t, z_t] is of the form w_{t+1} = A w_t + B e_t, where B = [v_{kz}, v_{cz}, 1, 1]' and the 4 × 4 matrix A is given by:

$$A = \begin{bmatrix} v_{kk} & 0 & 0 & v_{kz}\rho \\ v_{ck} & 0 & 0 & v_{cz}\rho \\ \eta & 0 & 0 & \rho \\ 0 & 0 & 0 & \rho \end{bmatrix}$$

where

$$v_{kk} = \tfrac{1}{2}\gamma - \sqrt{\left(\tfrac{1}{2}\gamma\right)^2 - \beta^{-1}}, \qquad v_{kz} = \frac{(1-\beta(1-\delta))\rho - \phi(1-\rho)\frac{y^{ss}}{c^{ss}}}{(1-\beta(1-\delta))(1-\eta) + \phi v_{ck} + \phi(1-\rho)\frac{k^{ss}}{c^{ss}}},$$

$$v_{ck} = (\beta^{-1} - v_{kk})\frac{k^{ss}}{c^{ss}}, \qquad v_{cz} = \frac{y^{ss}}{c^{ss}} - \frac{k^{ss}}{c^{ss}} v_{kz}, \qquad \gamma = \frac{(1-\beta(1-\delta))(1-\eta)(1-\beta+\beta\delta(1-\eta))}{\phi\eta\beta} + \beta^{-1} + 1,$$

and the superscript ss indicates steady state values. It is immediate to compute impulse responses for the four variables of the model to a shock in e_t and to build the criterion function measuring the distance between the theoretical IRFs and those obtained from the data.

In order to investigate the shape of the objective function, we compute the theoretical impulse responses obtained with the following parameterization: (β = 0.985, ϕ = 2.0, ρ = 0.95, η = 0.36, δ = 0.025, z^{ss} = 1), and compute the distance surface when we vary parameters in an economically reasonable neighborhood of the selected values. Figure 1 presents three three-dimensional surfaces and contour plots obtained when we vary two parameters at a time.

[Figure 1: Distance function and contour plots, RBC model. Axes appearing in the panels: σ, ρ, η, δ, β, z^{ss}; vertical axis: distance.]

Clearly, although there is a unique minimum in correspondence of the true parameters, the objective function is very flat both locally around the minimum and globally over the entire parameter range.
For example, the persistence parameter ρ is weakly identified in the interval [0.8, 0.99], and the share of capital in the production function η is only weakly identifiable in the range [0.3, 0.6]. Interestingly, the distance function displays a canyon of approximately the same depth in the depreciation rate δ and the discount factor β [running from (δ = 0.005, β = 0.975) up to (δ = 0.03, β = 0.99)], and in the steady state value of the technology shock z^{ss} and the depreciation rate δ. Note that in both cases the one percent contour is pretty large and essentially includes the full range of economically interesting values for δ and β. Hence, even though it is possible to identify one of the two parameters conditional on the other, joint estimation is unlikely to be successful. Therefore, not only may it be impossible to recover the parameters of the DGP from responses to technology disturbances, because standard routines are unlikely to be able to search plateaus of this sort fruitfully, but, as we will see later on, attempts to make the distance function better behaved may result in poor impulse response matches.

Given our analytical solution, we can go one step further: we analyze the gradient of the entries of the matrix A with respect to the parameters and check which VAR coefficients are primarily responsible for this pattern of results (examining the gradient of the entries of A suffices, since the matrix B shares the same entries as the matrix A). It turns out that v_{kk} and v_{kz} are the coefficients which cause the problems. In fact, the local derivative of v_{kz} with respect to ρ is small, equal to 0.08, and those of v_{kk} and v_{kz} with respect to η are, respectively, -0.10 and 0.09. In other words, the objective function is flat in ρ and η because the dynamics of the capital stock are only weakly influenced by these two parameters.
Since the law of motion of the capital stock determines the dynamics of output, consumption and investment, there is little additional information contained in the responses of these variables. Note also that the local derivatives of v_{kk} and v_{kz} with respect to β and δ are not only small but have similar magnitude and opposite sign. Therefore, the dynamics of the capital stock are roughly insensitive to proportional increases/decreases in these two parameters.

Table 1: RBC model, 95% estimation range

Parameter  Value  True model         β = 0.995           Weighting          VAR
β          0.985  [0.9810, 0.9853]   0.995               [0.9617, 0.9999]   [0.9771, 0.9998]
η          0.36   [0.3594, 0.3616]   [0.3565, 0.3566]    [0.2974, 0.4882]   [0.3598, 0.3633]
ϕ          2.0    [1.9961, 2.0403]   [1.9522, 1.9557]    [1.1021, 4.8821]   [1.5213, 4.9232]
δ          0.025  [0.0208, 0.0261]   [0.0352, 0.0357]    [0.0001, 0.0445]   [0.0002, 0.0460]
ρ          0.95   [0.9488, 0.9501]   [0.9503, 0.9504]    [0.9388, 0.9604]   [0.9218, 0.9544]
z^{ss}     1.0    [0.9981, 1.0061]   [0.9534, 0.9542]    [0.7813, 2.3481]   [0.7117, 1.8424]
Ob. fct.          [9.7e-9, 0.0001]   [0.00001, 0.00007]  [8.5e-7, 1.5e-5]   [7.5e-5, 0.0003]

Depending on the initial conditions, different values of β and δ could be selected. The standard practice of fixing β works here since, as shown in Figure 1, for any value of β the distance function has reasonable curvature in the δ dimension (and vice versa). However, given the loose relation between the size of these two parameters and the value of the distance function, fixing β arbitrarily produces inconsistent estimates of δ and possibly of the other parameters. Since the two parameters enter in an almost additive manner in the relevant entries of the matrix A, there is a continuum of values for β and δ which generates objective functions that are sufficiently close to the minimum. Fixing one parameter helps eliminate this multiplicity, but may induce serious estimation biases.
We show these facts in detail in Table 1, where we report the true parameters, the range obtained randomizing on the initial conditions in a neighborhood of the true parameters, and the estimates obtained arbitrarily fixing β to 0.995. Figure 2 graphs the population responses of capital, the real rate, consumption, output and the technology shock, and those obtained with the estimated parameters in the latter case.

[Figure 2: Responses of capital, the real rate, consumption, output and the technology shock over 20 periods; the population responses are labelled "true".]

First, the table shows that in a neighborhood of the true parameters the distance function is relatively well behaved: except for δ, convergence to the true parameter vector occurs when starting from reasonably close initial conditions. Second, both the table and the figure indicate that serious biases may occur when the value of β is misspecified. For example, the estimate of the risk aversion coefficient is typically downward biased by 5-10%, and the estimated value of the depreciation rate has an upward bias of 50% on average. Since impulse responses are computed analytically, sample size is not an issue here. Note that for the 100 simulations presented in Figure 2 the minimized value of the objective function is within the range of the minimized values of the objective function in the correct case, indicating that, while only one global minimum exists, the objective function is considerably flat around it. Third, with a misspecified β the dynamics of output, the capital stock and consumption in response to technology shocks are somewhat less persistent, despite the fact that the response of z_t to its own shocks is perfectly matched.
One of the reasons why parameters may be only weakly identifiable from the objective function is that responses at long horizons may carry little information about the structural parameters. This is analogous to the weak instrument problem in GMM, where variables which are lagged too far in the past may be more likely to satisfy the exogeneity assumption but may also be weakly correlated with the function of interest (see e.g. Stock, Wright and Yogo (2002)). So far we have used 20 steps of each of the variables to match responses and, since we work with population impulse responses, we have used the identity as weighting matrix. To see whether identifiability of the parameters changes and, at the same time, to mimic typical situations encountered in practice, where actual responses at long horizons have large standard errors, we have repeated the exercise using a diagonal weighting matrix with 1/h^2 on the diagonal, h = 1, 2, ..., 20.

The objective function is now much worse behaved for identification purposes: plateaus exist in all dimensions, and the objective function is now completely flat for a much larger range of values than those found in Figure 1. As a consequence, parameter estimates obtained from various initial conditions may differ considerably from the true ones. In Table 1 we report the range of estimates obtained using 100 random initial conditions while forcing the minimization routine to find the zero within the range of economically reasonable parameters. Now all the ranges are large, and this occurs in conjunction with minimized values of the objective function which are relatively low. Therefore, our simple experiment suggests that the objective function has better identification properties when many horizons are considered and when each horizon is equally weighted.
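The effect of the weighting matrix on curvature can be illustrated with a small Python sketch. This is not the RBC model used above: as a simplifying assumption, the response at horizon h is taken to be ρ^h (an AR(1) shock), and the function names and step size are hypothetical. The sketch compares the second derivative of the distance function at the true ρ under identity weights and under 1/h^2 weights.

```python
import numpy as np

def distance(rho, rho0, weights):
    """Weighted squared distance between AR(1) responses rho**h and rho0**h,
    for h = 1, ..., len(weights)."""
    h = np.arange(1, len(weights) + 1)
    gap = rho ** h - rho0 ** h
    return float(np.sum(weights * gap ** 2))

def curvature(rho0, weights, step=1e-4):
    """Central second difference of the distance function at the true value."""
    g = lambda r: distance(r, rho0, weights)
    return (g(rho0 + step) - 2.0 * g(rho0) + g(rho0 - step)) / step ** 2

H = 20                        # number of horizons, as in the text
h = np.arange(1, H + 1)
rho0 = 0.95                   # persistence value used in the text

w_identity = np.ones(H)       # equal weight on every horizon
w_decay = 1.0 / h ** 2        # weights that discount long horizons

curv_identity = curvature(rho0, w_identity)
curv_decay = curvature(rho0, w_decay)
print("curvature with identity weights:", curv_identity)
print("curvature with 1/h^2 weights:   ", curv_decay)
```

In this toy setting the curvature is 2 Σ w_h (h ρ0^(h-1))^2, so the 1/h^2 weights remove exactly the h^2 factor that makes long horizons informative about ρ, and the distance function becomes flatter around the true value.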
[Figure 3: Matching VAR coefficients. Panels: capital, real rate, consumption, output, tech. shock; horizons up to 20.]

One may wonder whether matching the coefficients of the VAR representation of the model, as opposed to the impulse responses, would make any difference for identification and estimation. Intuitively, concentrating on VAR coefficients could help because one nonlinear transformation, from the VAR representation to the impulse responses, disappears. On the other hand, choosing parameters to match the coefficients of the A matrix could worsen the outcome, since this method neglects the information present in the contemporaneous B matrix. A priori it is difficult to determine which effect dominates. The last column of Table 1, which reports the range of estimates obtained matching VAR coefficients, indicates that the latter problem is more important and that estimation performance deteriorates. In fact, the range of estimates obtained randomizing on initial conditions is about as large as that obtained weighting impulse responses with 1/h^2. Furthermore, the distribution of certain parameter estimates, in particular those of ϕ and δ, turns out to be centered around the wrong value. Figure 3 also clearly indicates that minimizing the distance between VAR coefficients does not necessarily minimize the distance between impulse responses, even though the population values fall within the envelope of responses we have produced. All in all, it appears that matching VAR coefficients is a worse strategy for estimating the structural parameters than matching impulse responses.
6 Combining observational equivalence and weak identification problems

To show the problems one faces in practice when estimating structural parameters from impulse responses, we use a model which is considered the workhorse in the empirical New-Keynesian literature (see Woodford (2003), Ireland (200?) or Rubio and Rabanal (2004)). We choose such a specification because several authors, including Ma (2002), Beyer and Farmer (2004) and Nason and Smith (2005), have shown that such a framework may be liable to many of the problems we have discussed so far. The model consists of the following three equations:

$$y_t = \frac{h}{1+h}\, y_{t-1} + \frac{1}{1+h}\, E_t y_{t+1} + \frac{1}{\varphi}\,(i_t - E_t \pi_{t+1}) + v_{1t} \tag{4}$$

$$\pi_t = \frac{\omega}{1+\omega\beta}\, \pi_{t-1} + \frac{\beta}{1+\omega\beta}\, \pi_{t+1} + \frac{(\varphi+\vartheta)(1-\zeta\beta)(1-\zeta)}{(1+\omega\beta)\,\zeta}\, y_t + v_{2t} \tag{5}$$

$$i_t = \phi_r\, i_{t-1} + (1-\phi_r)(\phi_\pi \pi_{t-1} + \phi_y y_{t-1}) + v_{3t} \tag{6}$$

where $h$ is the degree of habit persistence, $\vartheta$ (also written $\theta_n$) is the inverse elasticity of labor supply, $\varphi$ is the relative risk aversion coefficient, $\beta$ is the discount factor, $\omega$ the degree of indexation of prices, $\zeta$ the degree of price stickiness, while $\phi_r, \phi_\pi, \phi_y$ are policy parameters. The first two shocks are assumed to follow autoregressive processes of order one with autoregressive parameters $\rho_1, \rho_2$, while $v_{3t}$ is an iid shock. The variances of the three shocks are denoted by $\sigma_i^2$. This model has 14 parameters: $\theta_1 = (\sigma_1^2, \sigma_2^2, \sigma_3^2)$ are unidentified from any scaled impulse response, while $\theta_2 = (\beta, \varphi, \vartheta, \zeta, \phi_r, \phi_\pi, \phi_y, \rho_1, \rho_2, h, \omega)$ are the structural parameters on which we focus attention.

We first check whether the objective function displays obvious pathologies. To do this we calculated the Hessian of the objective function in the neighborhood of the true parameter vector, which we select to be $\theta_2 = (\beta = 0.985, \varphi = 2, \vartheta = 3, \zeta = 0.68, \phi_r = 0.2, \phi_\pi = 1.55, \phi_y = 1.1, \rho_1 = 0.65, \rho_2 = 0.65, \omega = 0.25, h = 0.85)$, and computed its rank and its eigenvalues. If one of the parameters is completely unidentifiable, the Hessian should be rank deficient.
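A minimal sketch of this rank and eigenvalue check follows. It does not solve the three-equation model: it assumes a toy response function in which two hypothetical parameters `a` and `b` enter only through their product, so that one direction of the parameter space is unidentified by construction. The finite-difference step and the relative tolerance used to count nonzero eigenvalues are also assumptions.

```python
import numpy as np

def responses(theta, n_horizons=20):
    """Toy responses: a and b matter only through a*b, c sets the decay."""
    a, b, c = theta
    return a * b * c ** np.arange(n_horizons)

def objective(theta, theta_true):
    """Quadratic distance between candidate and 'population' responses."""
    gap = responses(theta) - responses(theta_true)
    return float(gap @ gap)

def numerical_hessian(f, x, step=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = step
            ej = np.zeros(n); ej[j] = step
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * step ** 2)
    return 0.5 * (H + H.T)              # enforce symmetry

theta_true = np.array([2.0, 0.5, 0.8])  # hypothetical true values
H = numerical_hessian(lambda th: objective(th, theta_true), theta_true)
eigenvalues = np.linalg.eigvalsh(H)     # returned in ascending order
# Count eigenvalues that are non-negligible relative to the largest one.
rank = int(np.sum(np.abs(eigenvalues) > 1e-6 * np.abs(eigenvalues).max()))
```

With three parameters the Hessian has rank two: the eigenvector associated with the near-zero eigenvalue points along the direction in which `a` and `b` can be traded off without changing the responses. Small but nonzero eigenvalues would instead signal weak identification.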
Since we have three shocks we can define several objective functions, one for each shock and an overall one. It turns out that, for the chosen parameter configuration, the Hessian of the first three objective functions is rank deficient (its rank is 10, 10 and 9, respectively), while for the last one no rank deficiencies are found. Therefore, a researcher using responses to single shocks may not be able to recover certain parameters (as we have shown in section 4), while for the full set of responses unidentification pathologies are absent. Nevertheless, even using the full set of responses, weak identification problems are present. In fact, a few of the eigenvalues of the Hessian are small (the smallest is equal to 1), implying that the objective function is very flat in some dimensions. To examine which dimensions of the parameter space create problems, we plot in figure 4 the shape of the objective function in each of the elements of $\theta_2$ when we try to minimize the distance of the responses to shock 1 (column 1), shock 2 (column 2), shock 3 (column 3) and all shocks (column 4), and we vary one parameter at a time within a reasonable range around the selected values. Figure 4 shows the curvature of the objective function in one dimension, conditional on the other $n - 1$ values being fixed at their "true values".
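The one-parameter-at-a-time exercise can be sketched in a stripped-down form. The code below does not compute impulse responses. As a stand-in it uses only the slope of equation (5), $(\varphi+\vartheta)(1-\zeta\beta)(1-\zeta)/((1+\omega\beta)\zeta)$, evaluated at the true values listed above, and measures the squared gap in that slope when a single parameter moves. The dictionary keys and the perturbation sizes are assumptions made for illustration.

```python
import numpy as np

# True values from the text; "phi" is risk aversion, "theta" the inverse
# labor supply elasticity (hypothetical key names).
TRUE = {"beta": 0.985, "phi": 2.0, "theta": 3.0, "zeta": 0.68, "omega": 0.25}

def pc_slope(p):
    """Coefficient on output in the inflation equation (5)."""
    return ((p["phi"] + p["theta"]) * (1 - p["zeta"] * p["beta"]) * (1 - p["zeta"])
            / ((1 + p["omega"] * p["beta"]) * p["zeta"]))

def slice_of_distance(name, grid):
    """Squared gap in the slope when only `name` leaves its true value."""
    out = []
    for value in grid:
        p = dict(TRUE)
        p[name] = value
        out.append((pc_slope(p) - pc_slope(TRUE)) ** 2)
    return np.array(out)

# Move price stickiness 0.2 below and 0.2 above its true value.
below, above = slice_of_distance("zeta", TRUE["zeta"] + np.array([-0.2, 0.2]))

# Move risk aversion or the labor supply parameter by the same amounts.
shifts = np.array([-0.5, 0.5])
slice_phi = slice_of_distance("phi", TRUE["phi"] + shifts)
slice_theta = slice_of_distance("theta", TRUE["theta"] + shifts)
```

Two features of this stand-in are worth noting. The slice in $\zeta$ is asymmetric, with `below` much larger than `above`, so the criterion is far flatter for high degrees of stickiness. The slices in $\varphi$ and $\vartheta$ coincide, because the two parameters enter the slope only through their sum; in the full model $\varphi$ also enters equation (4), so this last result holds for the slope alone.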
[Figure 4. Shape of the objective function in each parameter ($\beta$, $\varphi$, $\theta_n$, $\zeta$, $\phi_r$, $\phi_\pi$, $\phi_y$, $\rho_1$, $\rho_2$, $\omega$, $h$), one row per parameter; columns: IS shocks, cost push shocks, monetary policy shocks, all shocks.]

Many interesting features are present in the figure. First, the objective function is very flat in many dimensions, regardless of the shock we try to match. Therefore, it is possible that parameter estimates will deviate significantly from the selected values. Second, different shocks bring different information about the parameters. Hence, the particular shock selected matters for estimation performance, and estimation results may depend on the shock considered. It is interesting to notice that the distance function of responses to monetary shocks is very flat in all the dimensions except $\zeta$. Therefore, responses to monetary shocks are unlikely to pin down the structural parameters of the model. Third, the objective function is asymmetric in certain dimensions.
For example, when cost push shocks are considered, the distance function is very asymmetric in the risk aversion parameter $\varphi$, the inverse elasticity of labor supply $\theta_n$ and the price stickiness $\zeta$. In this sense, initial conditions may matter: convergence to the true parameter values may be easier from one direction than from the other. Fourth, and confirming our previous results, there are parameters which are unidentified by certain shocks. In fact, as intuition would suggest, the persistence of, say, the cost push shocks cannot be identified from the responses to other shocks. Finally, and probably more importantly, even when the responses to all shocks are used, the objective function is still flat in several dimensions. Therefore, even when all the available information is used, weak identification problems may be present.

While Figures 4 and 5 consider one dimension at a time, the objective function displays canyons and ridges in several dimensions, regardless of the responses we consider. There is a continuum of models, indexed by the size of two parameters, which are observationally equivalent from the point of view of impulse responses. We show three dimensional pictures and contour plots in the first panel of figure 6 (these are analogous to those presented in Figure 1) for the case of cost push shocks. It is easy to see that the distance function is not very large for any combination of the parameters within the chosen range; that neither $\vartheta$ nor $h$ will be easily identifiable from this shock; and that there are combinations of $\phi_x, \phi_p$ which are consistent with the same value of the distance function.

One may wonder if these unpleasant results are specific to an objective function which measures the distance of impulse responses, or whether they are intrinsic to the model, so that every estimation approach will face similar identification problems.
To address this question we repeat the calculations used to obtain the first panel of figure 6, but now using the likelihood function. To make the experiment comparable we consider both a limited information approach (i.e. we consider one shock at a time) and a full information approach (i.e. we consider all the shocks). Intuitively, the likelihood function uses more information than the distance function, since the covariance matrix of the shocks is used in estimation. On the other hand, if the normality assumption on which estimation is based is incorrect, likelihood based inference may be problematic. In the second panel of figure 6 we show that the situation improves when the likelihood function is used, but only slightly. The likelihood function is quite flat in the $(\vartheta, \zeta)$ dimensions and there is a ridge on the diagonal of the box. However, it is slightly better behaved in the other dimensions. Hence, limited information ML is not likely to be significantly better than matching impulse responses for estimating structural parameters. A last interesting feature of this model is the lack of concavity in the direction of some parameters.

LACK OF CONCAVITY + BAD ESTIMATION RESULTS

[Figure 6. First panel: distance function; second panel: likelihood function. Surfaces and contour plots in pairs of the parameters $\theta_n$, $\zeta$, $h$, $\omega$, $\phi_x$, $\phi_\pi$; cost push shock.]

Since it is now common to estimate models like the one used in this example with Bayesian methods (see e.g.
Canova (2004), Rubio and Rabanal (2004)), a few words contrasting identification problems in classical and Bayesian frameworks are in order. Posterior distributions are proportional to the likelihood of the model times the prior. If the prior is very loosely specified, classical and posterior analyses will lead to the same inference. However, if the prior is tightly specified, it is easy to produce well behaved posterior distributions even if the likelihood function has very little information about the parameter. Hence, Bayesian methods may give the illusion of overcoming weak identification problems, but it is just that, an illusion, since the posterior distribution will simply reflect prior information. However, this problem is easily detectable: letting the prior become more and more diffuse will make the posterior of weakly identifiable parameters more and more diffuse as well. Such a simple diagnostic for weak identification does not exist in our (classical) case, and this makes the analysis more complicated. But in general, weak identification appears to be more intrinsic to the model than to the estimation approach.

7 Misspecification and observational equivalence

The previous sections were concerned with the weak or under identification of some of the parameters of a DSGE model. Here we consider a related but distinct problem: the possibility of obtaining estimates of parameters which do not appear in the true DGP. We have already seen that, given a model, weak identification makes it difficult to pin down certain structural parameters (possibly only combinations of them are estimable). Here we want to show that, when models are nearly observationally equivalent from the point of view of some objective function, it is possible to obtain reasonable estimates of structures which are not those generating the data.
What we have in mind, in particular, are models with different frictions and the possibility that impulse responses are unable to distinguish among them because they produce similar dynamics for the variables of interest. To address this issue we consider a model which is much richer than those examined so far, includes real and nominal frictions, and has been shown to fit reasonably well both the US economy (see Christiano, et al. (2005), Dedola and Neri (2004)) and the EU economy (see Smets and Wouters (2003)). The log linearized model consists of the following 11 equations:

$$
\begin{aligned}
0 &= -k_{t+1} + (1-\delta)\,k_t + \delta x_t \\
0 &= -u_t + \psi r_t \\
0 &= \frac{\eta\delta}{\bar r}\, x_t + \Big(1 - \frac{\eta\delta}{\bar r}\Big) c_t - \eta k_t - (1-\eta) N_t - \eta u_t - e_{zt} \\
0 &= -R_t + \phi_r R_{t-1} + (1-\phi_r)(\phi_\pi \pi_t + \phi_y y_t) + e_{rt} \\
0 &= -y_t + \eta k_t + (1-\eta) N_t + \eta u_t + e_{zt} \\
0 &= -N_t + k_t - w_t + (1+\psi)\, r_t \\
0 &= E_t\Big[\frac{h}{1+h}\, c_{t+1} - c_t + \frac{h}{1+h}\, c_{t-1} - \frac{1-h}{(1+h)\varphi}\,(R_t - \pi_{t+1})\Big] \\
0 &= E_t\Big[\frac{\beta}{1+\beta}\, x_{t+1} - x_t + \frac{1}{1+\beta}\, x_{t-1} + \frac{\chi^{-1}}{1+\beta}\, q_t + \frac{\beta}{1+\beta}\, e_{x,t+1} - \frac{1}{1+\beta}\, e_{xt}\Big] \\
0 &= E_t\big[\pi_{t+1} - R_t - q_t + \beta(1-\delta)\, q_{t+1} + \beta \bar r\, r_{t+1}\big] \\
0 &= E_t\Big[\frac{\beta}{1+\beta\gamma_p}\, \pi_{t+1} - \pi_t + \frac{\gamma_p}{1+\beta\gamma_p}\, \pi_{t-1} + T_p\,(\eta r_t + (1-\eta) w_t - e_{zt} + e_{pt})\Big] \\
0 &= E_t\Big[\frac{\beta}{1+\beta}\, w_{t+1} - w_t + \frac{1}{1+\beta}\, w_{t-1} + \frac{\beta}{1+\beta}\, \pi_{t+1} - \frac{1+\beta\gamma_w}{1+\beta}\, \pi_t + \frac{\gamma_w}{1+\beta}\, \pi_{t-1} - T_w\Big(w_t - \sigma_l N_t - \frac{\varphi}{1-h}\,(c_t - h c_{t-1}) - e_{wt}\Big)\Big]
\end{aligned}
$$

The first equation is the capital accumulation equation, where $\delta$ is the depreciation rate and $x_t$ represents current investment; the second links capacity utilization $u_t$ to the real rate $r_t$, and $\psi$ is a parameter; the third equation is the resource constraint of the economy, linking consumption $c_t$ and investment expenditure to output produced, where $\bar r$ is the steady state interest rate and $e_{zt}$ is a technological disturbance; the fourth equation represents the policy rule of the monetary authority, and $e_{rt}$ is a monetary policy disturbance; the fifth equation represents the production function, where $\eta$ is the share of capital; the sixth equation is a labor demand equation, where $N_t$ is hours worked and $w_t$ the real wage rate; the seventh equation
is an Euler equation for consumption, where $h$ is a parameter capturing habit persistence, $\varphi$ is the risk aversion coefficient and $\pi_t$ the current inflation rate; the eighth equation is an Euler equation for investment, where $q_t$ is Tobin's q, $\beta$ the discount factor, $\chi^{-1}$ the elasticity of investment with respect to Tobin's q and $e_{xt}$ an investment shock; the ninth equation describes the dynamics of Tobin's q; the last two equations represent the price setting and the wage setting equations, where $\gamma_p$ ($\gamma_w$) represents the level of price (wage) indexation, $\zeta_p$ ($\zeta_w$) the price (wage) stickiness parameter, $\sigma_l$ is the inverse elasticity of labor supply, $\lambda_w$ is a wage markup, $e_{pt}$ ($e_{wt}$) are shocks to the pricing relationships, and

$$T_p \equiv \frac{(1-\beta\zeta_p)(1-\zeta_p)}{(1+\beta\gamma_p)\,\zeta_p}, \qquad T_w \equiv \frac{(1-\beta\zeta_w)(1-\zeta_w)}{(1+\beta)\big(1+(1+\lambda_w)\,\sigma_l\,\lambda_w^{-1}\big)\,\zeta_w}.$$

The vector of parameters includes the structural ones, $\theta_1 = (\beta, \varphi, \sigma_l, h, \delta, \eta, \chi, \psi, \gamma_p, \gamma_w, \zeta_p, \zeta_w, \lambda_w, \phi_r, \phi_\pi, \phi_y)$, and the auxiliary ones, $\theta_2 = (\rho_z, \rho_x, \sigma_z, \sigma_r, \sigma_p, \sigma_w, \sigma_x)$, where the first two represent the persistence of the technology and investment shocks and the last five the standard deviations of the disturbances. As usual, the standard errors of the shocks are not identified from the normalized impulse responses, and some of the persistence parameters are identified only when own shocks are considered.
To show the identification problems a researcher faces in matching the responses of the model, we construct population responses by calibrating the parameters to the posterior mean estimates for the US economy obtained by Dedola and Neri, that is, $\theta_1 = (\beta = 0.991, \varphi = 3.014, \sigma_l = 2.145, h = 0.448, \delta = 0.0182, \eta = 0.209, \chi = 6.300, \psi = 0.564, \gamma_p = 0.862, \gamma_w = 0.221, \zeta_p = 0.887, \zeta_w = 0.620, \lambda_w = 1.2, \phi_r = 0.779, \bar\pi = 1.016, \phi_\pi = 1.454, \phi_y = 0.234)$ and $\theta_2 = (\rho_z = 0.997, \rho_x = 0.522, \sigma_z = 0.0064, \sigma_r = 0.0026, \sigma_p = 0.221, \sigma_w = 0.253, \sigma_x = 0.557)$, and first examine the shape of the distance function in the neighborhood of this parameter vector, one parameter at a time. Figures x1 and x2 show the shape of the distance function when we consider monetary policy shocks, or monetary and technology shocks jointly: both figures indicate that the problems we have previously noted are present to a much larger degree in this model. First, with both shocks the objective function is extremely flat in the direction of many of the parameters, and this is true even for quite large variations in the range of the parameters. Second, the objective function is highly asymmetric in the dimensions represented by the price and wage stickiness parameters ($\zeta_p$, $\zeta_w$), the wage markup $\lambda_w$, the policy persistence parameter $\phi_r$ and the risk aversion coefficient $\varphi$. Therefore, it may be very important to properly specify the initial conditions: estimation routines which start, e.g., from $\zeta_p$ below 0.8 will probably stop roughly where they start.
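To give a sense of magnitudes, the composite coefficients $T_p$ and $T_w$ of the price and wage equations can be evaluated at this calibration. The sketch below also asks a hypothetical question that is not an exercise performed in the text: if price indexation were switched off ($\gamma_p = 0$), which value of $\zeta_p$ would reproduce the same $T_p$? The function names and the bisection routine are illustrative assumptions.

```python
def price_slope(beta, zeta_p, gamma_p):
    """T_p = (1 - beta*zeta_p)(1 - zeta_p) / ((1 + beta*gamma_p) zeta_p)."""
    return (1 - beta * zeta_p) * (1 - zeta_p) / ((1 + beta * gamma_p) * zeta_p)

def wage_slope(beta, zeta_w, sigma_l, lambda_w):
    """T_w = (1 - beta*zeta_w)(1 - zeta_w) / ((1+beta)(1 + (1+lambda_w) sigma_l / lambda_w) zeta_w)."""
    scale = (1 + beta) * (1 + (1 + lambda_w) * sigma_l / lambda_w)
    return (1 - beta * zeta_w) * (1 - zeta_w) / (scale * zeta_w)

def matching_stickiness(target, beta, gamma_p, lo=0.01, hi=0.999, tol=1e-12):
    """Bisection for the zeta_p giving price_slope == target.

    Relies on the slope being decreasing in zeta_p on (0, 1).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price_slope(beta, mid, gamma_p) > target:
            lo = mid        # slope too large: need more stickiness
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Calibrated values quoted in the text.
beta, zeta_p, gamma_p = 0.991, 0.887, 0.862
zeta_w, sigma_l, lambda_w = 0.620, 2.145, 1.2

T_p = price_slope(beta, zeta_p, gamma_p)
T_w = wage_slope(beta, zeta_w, sigma_l, lambda_w)

# Hypothetical alternative: no indexation, stickiness chosen to match T_p.
zeta_p_alt = matching_stickiness(T_p, beta, gamma_p=0.0)
```

Both coefficients are small at the calibrated values, and a model without price indexation reproduces the same $T_p$ with only a slightly higher $\zeta_p$. Indexation also changes the coefficients on lagged and expected inflation, so this is not a full observational equivalence result; it only shows one channel through which stickiness and indexation can offset each other.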
[Figure 7. Shape of the distance function, one parameter at a time (habit, chi, sigc, delta, eta, beta, zeta_w, zeta_p, gamma_w, gamma_p, sig_l, psi, lambda_w, phi_y, phi_pi, phi_r, rho_z); monetary shocks.]

The identification problems are, however, much deeper than this. In figure x3 we show the surface of the distance function, when monetary shocks are considered, in the dimensions represented by price stickiness ($\zeta_p$), price indexation ($\gamma_p$), wage stickiness ($\zeta_w$) and wage indexation ($\gamma_w$). It is clear that when monetary policy shocks are considered, there are various combinations of the four parameters which produce a minimized objective function very close to the true one. Furthermore, even arbitrarily multiplying the objective function by a large number is unlikely to make minimization routines recover the true DGP, since the surface has a number of local minima which are not necessarily visited as initial conditions are changed. It is important to stress that, at least in these dimensions, adding technology shocks does not make these parameters more easily identifiable.
[Figure 8. Shape of the distance function, one parameter at a time (same parameters as in Figure 7); both shocks.]

Armed with this evidence, we have considered a few alternative models, where either stickiness or indexation in wages or prices is eliminated from the true DGP, and asked our minimization routines to find estimates when the general model considered in this section is used. Table 3 reports our estimation results. Notice that we provide estimates without standard errors since, as in Neely, Roy and Whiteman (2001), it is impossible to invert the Hessian matrix: the condition number (the ratio of the largest to the smallest singular value) of the covariance matrices associated with this table is of the order of $10^{18}$. In all these cases the matrix was singular to machine tolerance, indicating not only weak identification but also underidentification of two or more parameters. Also, we present the estimates which produce the smallest value of the objective function starting from a random set of 100 initial conditions.

Table 3: Estimates of various models, matching monetary policy shocks

             ζp      γp      ζw      γw      Obj.Fun.
Baseline     0.887   0.862   0.620   0.221
Estimates    0.833   0.549   0.604   0.379   1.46e-06
Case 1       0.000   0.862   0.620   0.221
Estimates    0.397   0.010   0.654   0.419   4.51e-07
Case 2       0.000   0.000   0.620   0.221
Estimates    0.395   0.010   0.653   0.411   4.50e-07
Case 3       0.000   0.862   0.620   0.000
Estimates    0.442   0.001   0.673   0.407   6.54e-07
Case 4       0.887   0.000   0.000   0.221
Estimates    0.901   0.280   0.011   0.010   1.94e-07
Case 5       0.887   0.000   0.620   0.801
Estimates    0.928   0.302   0.586   0.155   3.50e-07
Case 6       0.887   0.000   0.000   0.221
Estimates    0.895   0.321   0.071   0.010   7.35e-06

The table displays several interesting features. First, in the baseline case, estimates of price indexation are considerably lower than the true ones. Second, responses to monetary shocks cannot distinguish models featuring no price stickiness from models featuring no price indexation (see case 1), nor models where there is price indexation from those where there is not (compare cases 1 and 2). Moreover, it is possible to confuse a model with no price stickiness and no indexation with a model where these two features exist but no price indexation is present (see case 3), and models with no price stickiness and high wage indexation are observationally equivalent to models where both features are present and price indexation is, roughly, twice as important as wage stickiness (see case 5). In addition, a model where prices are sticky and wages are indexed cannot be distinguished from a model which features price stickiness and price indexation but no wage stickiness or wage indexation (case 4). Third, in all the cases the minimized objective function is within the tolerance level. Therefore, flatness of the distance surface is not necessarily a problem in these cases. This is clearly shown in figure x4, where we report responses to monetary shocks obtained in case 4 with true and estimated parameters: no investigator looking at this graph would doubt that she has nailed down the correct model!
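The condition-number argument used above to justify the absence of standard errors can be sketched as follows. The matrices are made-up diagonal examples, not the Hessians behind Table 3, and the threshold of one over machine epsilon is a common rule of thumb rather than the criterion used in the text.

```python
import numpy as np

def conditioning(matrix):
    """Return the 2-norm condition number and a numerical-singularity flag.

    The condition number is the ratio of the largest to the smallest
    singular value; above 1/eps the inverse is meaningless in double precision.
    """
    cond = np.linalg.cond(matrix)
    return cond, bool(cond > 1.0 / np.finfo(float).eps)

# Hypothetical Hessians: one well behaved, one flat in one direction.
well_behaved = np.diag([1.0, 0.5, 0.1])
nearly_flat = np.diag([1.0, 1e-3, 1e-18])

cond_well, singular_well = conditioning(well_behaved)
cond_flat, singular_flat = conditioning(nearly_flat)
```

A condition number of order $10^{18}$, as in `nearly_flat`, exceeds the double precision threshold of roughly $4.5 \times 10^{15}$, so any standard errors computed from the inverse would be numerical noise.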
One may wonder if these problems can be partially reduced by using responses to a larger number of shocks of the model. Case 6 in table 3 considers estimating the parameters of the model using responses to both monetary and technology shocks. Clearly, no improvement is visible. But this is expected: as figure x2 has shown, adding technology shocks does not necessarily increase the identifiability of these four parameters. We have also conducted exercises altering the weighting matrix and changing the number of responses considered, but the message is unchanged. Also, using estimated instead of population responses will make matters substantially worse, as shown in table 2.

[Figure 9. No wage stickiness, no price indexation. Panels: inflation, interest rate, real wage, investment, hours worked, capacity utilisation, consumption, output; horizontal axis: quarters after shock.]

We would like to stress that we are not the first to indicate problems with specifications which use stickiness and indexation jointly in a model (see e.g. Ma (2002) or Beyer and Farmer (2004)). However, we are the first to show that the problem is much more widespread, that it involves combinations of all the nominal features of the model, and that it is possible to estimate the wrong model without being aware that a problem exists. It is therefore important for empirical practice to have ways to check for under and weak identification and for observational equivalence of alternative economic structures.

8 Detecting identification problems

TO BE COMPLETED

9 Conclusions and suggestions for empirical practice

TO BE COMPLETED

References

[1] Altig, D., Christiano, L., Eichenbaum, M. and Linde, J. (2003) The role of monetary policy in the propagation of technology shocks, Northwestern University, manuscript.
[2] Altig, D., Christiano, L., Eichenbaum, M. and Linde, J. (2004) Firm specific capital etc., Northwestern University, manuscript.
[3] Beyer, A. and Farmer, R. (2004) On the indeterminacy of New Keynesian economics, ECB working paper 323.
[4] Boivin, J. and Giannoni, M. (2002) Has monetary policy become less powerful?, Columbia Business School, manuscript.
[5] Canova, F. (2004) Structural changes in the US economy, 1948-2002, available at www.igier.unibocconi.it/canova
[6] Canova, F. (2005) Methods for Applied Macroeconomic Research, forthcoming, Princeton University Press.
[7] Cogley, T., Colacito, R., Sargent, T. (2005)
[8] Del Negro, M., Schorfheide, F., Smets, F. and Wouters, R. (2004) On the fit of New Keynesian models, CEPR working paper.
[9] Ellison, M. (2005) Discussion to Cogley, Colacito and Sargent.
[10] Gali, J. and Rabanal, P. (2004) Technology shocks and aggregate fluctuations: How well does the RBC model fit postwar US data?, NBER working paper 10636.
[11] Ireland, P. (2004) Technology shocks in the New Keynesian model, Boston College, manuscript.
[12] Neri, S. (2003) Comparing adjustment cost and principal agent models: A Bayesian approach, Bank of Italy, manuscript.
[13] Ma, A. (2002) Economics Letters.
[14] Meier, F. and Mueller, (2004) Estimating a model of the financial accelerator, EUI working paper.
[15] Pesaran, H. (1981) Identification of rational expectations models, Journal of Econometrics, 16, 375-398.
[16] Rotemberg, J. and Woodford, M. (1997) An optimization based econometric framework for the evaluation of monetary policy, NBER Macroeconomics Annual, 12, 297-346.
[17] Smith, A. (1993) Estimating nonlinear time series models using simulated VARs, Journal of Applied Econometrics, 8, 63-84.
[18] Stock, J. and Wright, J. (2000) GMM with weak identification, Econometrica, 68, 1055-1096.
[19] Stock, J., Wright, J. and Yogo, M. (2002) A survey of weak instruments and weak identification in generalized method of moments, Journal of Business and Economic Statistics, 20, 518-529.