Measuring Disagreement in Qualitative Survey Data∗
Frieder Mokinski a, Xuguang Sheng b† and Jingyun Yang c
a Centre for European Economic Research (ZEW), Germany
b Department of Economics, American University, USA
c The Methodology Center, Pennsylvania State University, USA
∗ This paper was presented at the 19th Federal Forecasters Conference and Society of Government Economists Annual Conference. We thank the participants in the conferences for helpful comments and suggestions. Dr. Yang's research was supported by Award Number P50DA010075-16 from the National Institute on Drug Abuse and NIH/NCI R01 CA168676. The usual disclaimer applies.
† Corresponding author. Mailing address: 4400 Massachusetts Avenue N.W., Washington, DC, 20016, USA. Email: sheng@american.edu. Tel: +1 202 885 3782. Fax: +1 202 885 3790.
Abstract
To measure disagreement among respondents in qualitative survey data, we propose new methods applicable to both univariate and multivariate comparisons. Building on prior work, our first measure quantifies the level of disagreement in predictions of a single variable. Our second method constructs an index of overall disagreement across several variables from a dynamic factor model. Using directional forecasts from the Centre for European Economic Research Financial Market Survey, we find that our measures yield levels of disagreement consistent with point forecasts from the European Central Bank's Survey of Professional Forecasters. To illustrate their usefulness, we explore the source and predictive power of forecast disagreement.
Keywords: Disagreement, Dynamic factor model, Qualitative data, Survey forecast.
1 Introduction
Disagreement among forecasters plays an increasingly important role in economic modeling, forecasting and policy. After having documented substantial disagreement in inflation forecasts, Mankiw, Reis, and Wolfers (2004) suggest that disagreement may be a key to macroeconomic dynamics.
In a similar vein, Driver, Trapani, and Urga (2012) show that disagreement, as a proxy for private information, may increase forecast accuracy for typical macroeconomic variables. Despite the
growing body of literature on disagreement, the overwhelming majority of studies focus on point
forecasts. However, most business and consumer surveys only provide qualitative indications of
the current and expected future economic conditions.1 The expectations of consumers and firms
are central to most economic analyses, as illustrated by the New Keynesian model. The joint
importance of these economic agents’ expectations and disagreement motivates our development
of econometric methods to analyze the level of disagreement in qualitative survey data.
In this paper, we propose two measures of disagreement among the qualitative expectations of
survey respondents. Building on Carlson and Parkin’s (1975) well-known probability approach, we
design our first measure to extract the level of disagreement for predictions of a single variable.
More specifically, we relax the restrictive assumptions of the Carlson-Parkin method by using
a flexible distribution and allowing for time-varying parameters. Our second measure employs a
dynamic factor model to quantify the level of overall disagreement in forecasts for several variables.
To illustrate, we estimate our two measures of disagreement on directional forecasts from the Centre
for European Economic Research (ZEW) Financial Market Survey, and compare the estimates to
conventional disagreement measures obtained from point forecasts. We find that the two novel
measures are highly correlated with the conventional measures of disagreement, and the second
measure tracks the benchmark even more closely.
We apply our measures to explore the source and predictive power of disagreement in qualitative survey data. In the first application, we employ a special survey conducted by the ZEW
and study the role of heterogeneous forecasting methods in generating forecast disagreement. Among six methods - econometric modeling, fundamental analysis, technical analysis, judgment,
in-house research and consensus forecasts - most respondents report that fundamental analysis
and judgment are especially important forecasting techniques, while technical analysis is relatively less important.

1 There exist a number of qualitative business and consumer surveys across many countries, such as the European Commission Business and Consumer Surveys, the IFO Business Survey, the Confederation of British Industry Business Survey, and the University of Michigan Survey of Consumers.

Our analysis indicates that relying heavily upon econometric modeling increases
disagreement, whereas paying close attention to the consensus forecasts reduces disagreement. In
another application, we explore the economic significance of disagreement by studying whether disagreement has any predictive power for economic activity. To this end, we utilize the well-known
business survey conducted by the Institute for Supply Management (ISM), available since January
1948. We find that disagreement in the ISM survey can indeed improve our forecasts of industrial production. For instance, including the estimated price and employment disagreement in the
model significantly decreases the mean squared forecast error for almost all forecast horizons.
Our paper makes three contributions to the literature on macroeconomic forecasting. First, we
propose new econometric methods for measuring disagreement in qualitative survey data. These
methods may effectively be applied in both univariate and multivariate comparisons. We establish
the validity of our disagreement measures by comparing results from qualitative and quantitative
data sets. Second, we provide direct evidence that forecasters use differing methods when revising
their predictions, which confirms the implications of the theoretical models in Lahiri and Sheng
(2008) and Patton and Timmermann (2010). Third, we find that disagreement has economically
meaningful predictive value. With quantitative data, Legerstee and Franses (2010) document the
predictive power of disagreement measures. Our confirmation of this relationship in qualitative
data firmly establishes that the degree of disagreement signals upcoming structural and temporal
changes in an economic process. These contributions are especially relevant because we focus on the underemphasized, yet important, field of qualitative economic expectations.
The paper proceeds as follows. Section 2 describes the methods for measuring disagreement
in qualitative survey data. In section 3, we compare the results of our measures of disagreement for directional forecasts to those of conventional measures of disagreement for point forecasts. In
section 4, we link forecast disagreement to forecasting technology. Section 5 explores the predictive
power of disagreement, and section 6 concludes.
2 Measuring Disagreement in Directional Forecasts
For most qualitative surveys, the percentages of respondents who expect a variable to increase, stay
the same or decrease are the only aggregate statistics that are available.2 To best address these real-world data limitations, our methods of measuring disagreement require only aggregated data.

2 When the individual responses are available, one can use the kappa statistic to measure (dis)agreement in qualitative survey data. See Song, Boulier, and Stekler (2009) for a recent application.
Throughout this section, we develop our measures of disagreement in qualitative survey data.
2.1 Disagreement in Predictions of a Single Variable
Carlson and Parkin (1975) present a method for obtaining point forecasts from qualitative survey
data. The Carlson-Parkin quantification method has been widely used in the literature. See
Nardo (2003) and Pesaran and Weale (2006) for recent reviews. Their quantification method rests
on two key assumptions. First, the method assumes that survey respondents convert latent point
forecasts to directional forecasts. If the point forecast fit of respondent i at time t is larger than the
threshold τup,t , the respondent will report “go up”; if fit is between τdown,t and τup,t , the respondent
will report “stay same”; if fit is below τdown,t , the respondent will report “go down.” The second
assumption is that the point forecasts {fit : i = 1, . . . , Nt} in period t are independently and identically distributed normal with mean µt and variance σt².
While many studies have focused on the cross-sectional mean µt of the latent distribution of point forecasts, very few have analyzed the cross-sectional variance σt².3 We recognize the importance of the cross-sectional variance (or standard deviation) as a measure of disagreement among survey respondents.
Below, we explain how to obtain estimates of µt and σt². Let Ut and Dt be the percentages of "go up" and "go down" responses in period t. If the number of responses in t is sufficiently large, then Ut ≈ Prob(fit ≥ τup,t) = 1 − Φ((τup,t − µt)/σt) and Dt ≈ Prob(fit ≤ τdown,t) = Φ((τdown,t − µt)/σt), where Φ is the cumulative distribution function of the standard normal distribution. By assuming symmetric thresholds that do not vary over time, that is, τup,t = −τdown,t = τ, we obtain

µt = τ [Φ⁻¹(Dt) + Φ⁻¹(1 − Ut)] / [Φ⁻¹(Dt) − Φ⁻¹(1 − Ut)],   (1)

σt = 2τ / [Φ⁻¹(1 − Ut) − Φ⁻¹(Dt)].   (2)

If we know the share of responses in each direction, the only unknown in equation (2) is τ. However, since τ is only a scaling constant, we can set it to an arbitrary value without losing information on the dynamics of this basic measure of disagreement.
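The quantification in equations (1) and (2) can be sketched in a few lines of Python. The response shares below are illustrative values rather than actual survey data, and the function name is our own.

```python
# Sketch of the basic Carlson-Parkin quantification, equations (1)-(2).
# U_t and D_t are the shares of "go up" and "go down" responses; tau is the
# symmetric, time-constant threshold, which only scales both estimates.
from statistics import NormalDist

def carlson_parkin(U_t, D_t, tau=1.0):
    """Return (mu_t, sigma_t) implied by the response shares."""
    ppf = NormalDist().inv_cdf                  # Phi^{-1}
    a, b = ppf(D_t), ppf(1.0 - U_t)
    mu_t = tau * (a + b) / (a - b)              # equation (1)
    sigma_t = 2.0 * tau / (b - a)               # equation (2)
    return mu_t, sigma_t

# Illustrative shares: 45% "go up", 20% "go down"
mu, sigma = carlson_parkin(U_t=0.45, D_t=0.20)
```

Since τ is only a scaling constant, leaving tau at its default preserves the dynamics of the disagreement series, as noted in the text.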
3 Notable exceptions include Dasgupta and Lahiri (1992) and Mankiw, Reis, and Wolfers (2004), who use σt² as a measure of dispersion in forecasts.

To further develop our model, we address two criticisms of the Carlson-Parkin method. Carlson (1975) finds that the cross-sectional distribution of point inflation forecasts is non-Gaussian, and alternative distributions have been used: Dasgupta and Lahiri (1992) use the scaled t-distribution,
and Batchelor (1981) experiments with skewed distributions. To alleviate this concern, we replace
the normal distribution with a scaled t-distribution as an alternative to our standard measure.
Similarly, others argue that the assumption of symmetric, time-constant thresholds may be
violated in practice. Smith and McAleer (1995) estimate thresholds in a time-varying parameter
framework, while Pesaran (1984) relates thresholds to observed variables. To address this issue,
we estimate a simple time-varying parameter model in the spirit of Cooley and Prescott (1976).
By allowing the thresholds to be asymmetric and time-varying, equations (1) and (2) become

µt = [τup,t Φ⁻¹(Dt) − τdown,t Φ⁻¹(1 − Ut)] / [Φ⁻¹(Dt) − Φ⁻¹(1 − Ut)],   (3)

σt = −(τup,t − τdown,t) / [Φ⁻¹(Dt) − Φ⁻¹(1 − Ut)].   (4)

We can rewrite equation (3) as

µt = ( Φ⁻¹(Dt)/[Φ⁻¹(Dt) − Φ⁻¹(1 − Ut)],  −Φ⁻¹(1 − Ut)/[Φ⁻¹(Dt) − Φ⁻¹(1 − Ut)] ) (τup,t, τdown,t)′ = xt′βt.   (5)
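As a small sketch of equations (3)-(5): with asymmetric thresholds, µt is linear in the threshold vector βt = (τup,t, τdown,t)′. The shares and threshold values below are illustrative, not estimates from the paper.

```python
# Sketch of equations (3)-(5): asymmetric thresholds tau_up and tau_down.
# mu_t = x_t' beta_t, where x_t is built from the response shares alone.
from statistics import NormalDist

def mu_sigma_asymmetric(U_t, D_t, tau_up, tau_down):
    ppf = NormalDist().inv_cdf
    a, b = ppf(D_t), ppf(1.0 - U_t)             # Phi^{-1}(D_t), Phi^{-1}(1-U_t)
    x = (a / (a - b), -b / (a - b))             # regressor vector x_t, eq. (5)
    mu_t = x[0] * tau_up + x[1] * tau_down      # x_t' beta_t = equation (3)
    sigma_t = -(tau_up - tau_down) / (a - b)    # equation (4)
    return mu_t, sigma_t

mu, sigma = mu_sigma_asymmetric(U_t=0.45, D_t=0.20, tau_up=1.2, tau_down=-0.8)
```

In the time-varying model that follows, the same vector x_t serves as the regressor of the measurement equation, with βt as the unobserved state.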
Moreover, we assume that the threshold vector βt evolves according to a multivariate random walk

βt = βt−1 + νt,   (6)

where Var[νt] = diag(σν²). Equations (5) and (6) specify a state-space model, in which the average
point forecast µt is the measurement. Since µt is unobserved, the model cannot be estimated directly. Different approaches can be used to proxy µt. First, µt can be replaced by the realization of the target variable. This method implicitly assumes that there is no systematic discrepancy between the average point forecast and the actual value, implying that forecasts are unbiased on average.
Alternatively, besides directional forecasts, questionnaires sometimes ask for directional assessment
of the past. Thresholds are then estimated by replacing xt with the corresponding assessment and
µt with the realization. Since the assessment of the past is not available in most surveys, we follow the first approach and replace µt with the realization. We introduce a measurement error, ut, to allow for the possibility that the realization of the target variable, yt, may not exactly equal the average point forecast. The measurement equation thus becomes

yt = µt + ut = xt′βt + ut,   (7)

where the second equality follows from substituting equation (5) for µt. We first estimate the model in equations (6) and (7) using the Kalman filter and then obtain filtered estimates of the time-varying thresholds. Substituting the estimated thresholds into equation (4), we derive the disagreement measure.4 For more details on parameter estimation, see Section 2.3.
2.2 Disagreement in Predictions of Many Variables
We propose an econometric approach to properly analyzing disagreement among the growing number of qualitative forecasts for disaggregate variables. Suppose, for instance, that we are interested
in measuring overall disagreement about the state of an economy from forecasts for its sectors, or
overall disagreement for the Euro area from disagreeing forecasts for its member countries. We
estimate such an index of overall disagreement from a dynamic factor model. Our disagreement
index is related to Banternghansa and McCracken (2009) and Sinclair and Stekler (2012). Using the Mahalanobis distance, the former measures overall disagreement in point forecasts across several variables, while the latter tests for differences between vectors that contain different vintages of GDP estimates.
We develop the model under the assumption that there is a single unobserved factor driving the
disaggregate disagreement measures (Stock and Watson, 1991). This assumption is justified by the
observation that the pattern of estimated disagreement in empirical analyses is often quite similar
across variables. This commonality is in turn reflected in the fact that most of the total variation
in forecast disagreement of macro variables is well summarized by a single common factor.5 Let
σit denote a disaggregate disagreement measure computed via equation (2) or (4) for variable i,
i = 1, . . . , n at period t, t = 1, . . . , T . Let Yt denote an n × 1 vector of σit that are assumed to move
contemporaneously with overall disagreement. We decompose Yt into two stochastic components:
the common unobserved variable, which we call "overall disagreement," Ft, and an n-dimensional component, ut, that represents idiosyncratic movements in the variable-specific disagreement series. Both Ft and ut are modeled to have stochastic structures.

4 In another experiment, we also estimate the Carlson-Parkin model with time-constant but asymmetric thresholds.

5 While it would be possible to model disagreement using more than one factor, using a single factor brings large computational benefits and offers a simple interpretation. In a recent paper, Carriero, Clark, and Marcellino (2012) show that the cost paid in using only one factor for volatilities is more than offset by the possibility of using a larger information set in Bayesian VARs.

This setup suggests the following model specification:
Yt = c0 + γFt + ut,   (8)

φ(L)Ft = ξt,   (9)

D(L)ut = νt,   (10)
where L denotes the lag operator, φ(L) and D(L) are lag polynomials of orders p and q, respectively.
In equation (8), Ft enters each of the variables contemporaneously with variable-specific weights.
For the purpose of parameter identification, we normalize σξ² to 0.01 and further assume that Ft and ut are mutually uncorrelated at all leads and lags. This assumption requires that D(L) is diagonal and the n + 1 disturbances are mutually uncorrelated:

D(L) = diag(d1(L), . . . , dn(L)),   Var[(ξt, νt′)′] = diag(σξ², σν1², . . . , σνn²).
Equations (8)-(10) form a state-space model, in which equation (8) is the measurement equation
and equations (9)-(10) are the state equations. We perform maximum likelihood estimation and
obtain filtered overall disagreement using the Kalman filter. The following section provides more
details on the estimation process.
2.3 Estimation of State-Space Models
Both the time-varying thresholds model in Section 2.1 and the dynamic factor model in Section
2.2 are linear Gaussian state-space models. In this section we briefly discuss the Kalman filter
algorithm used to estimate both models. The presentation is based on Durbin and Koopman
(2012), an excellent reference for state-space modeling.
We begin by describing a general linear Gaussian state-space model that nests our two models:
yt = c + Zt αt + εt,   (11)

αt+1 = Tt αt + ηt,   (12)
where equation (11) is the measurement equation, equation (12) is the state equation, yt is a k × 1 vector of observed variables, and αt is an m × 1 vector of state variables. The model
assumes that the disturbances εt and ηt are both serially independent and independent of each other at all leads and lags, with εt ∼ N(0, Ht) and ηt ∼ N(0, Qt). Given Yt = (y1, . . . , yt), define at|t−1 = E[αt|Yt−1], at|t = E[αt|Yt], Pt|t−1 = Var[αt|Yt−1], and Pt|t = Var[αt|Yt]. Moreover, let vt = yt − E[yt|Yt−1] = yt − c − Zt at|t−1 be the one-step-ahead forecast error of yt given Yt−1, and Ft = Var[vt|Yt−1] = Zt Pt|t−1 Zt′ + Ht be its variance.
We assume that the initial state vector has a Gaussian distribution, α1 ∼ N (a1 , P1 ), where
a1 and P1 are known. By the law of iterated expectations, the log-likelihood is log L(YT) = Σ_{t=1}^{T} log p(yt|Yt−1), with p(y1|Y0) = p(y1). Since yt|Yt−1 ∼ N(c + Zt at|t−1, Ft), the log-likelihood then becomes

log L(YT) = −(Tk/2) log 2π − (1/2) Σ_{t=1}^{T} [ log|Ft| + vt′Ft⁻¹vt ].   (13)
We calculate vt and Ft in equation (13) with the Kalman filter, which works as follows. Starting
with N(at|t−1, Pt|t−1), the distribution of αt given Yt−1, we obtain vt = yt − c − Zt at|t−1 and Ft = Zt Pt|t−1 Zt′ + Ht. After observing yt, we update our inference about the state vector αt by
the equations
at|t = at|t−1 + Pt|t−1 Zt′ Ft⁻¹ vt,   (14)

Pt|t = Pt|t−1 − Pt|t−1 Zt′ Ft⁻¹ Zt Pt|t−1.   (15)

Based on the updated inference, we forecast the distribution of the state vector in period t + 1 as

at+1|t = Tt at|t,   (16)

Pt+1|t = Tt Pt|t Tt′ + Qt.   (17)
The recursions (14)-(17) constitute the Kalman filter for model (11)-(12) and enable us to update
our knowledge of the system for each new observation. We use the exact initial Kalman filter
of Koopman (1997). Loosely speaking, this algorithm deals with the problem of initialization by
taking the limit of the log-likelihood function for P1 approaching infinity.
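The recursions (14)-(17) and the log-likelihood (13) can be sketched in plain NumPy. The local-level example at the bottom is illustrative and not part of the paper, and a production implementation would also need the exact initialization of Koopman (1997), which is omitted here in favor of a simple diffuse-like prior.

```python
# Sketch of the Kalman filter for the general model (11)-(12).
# Matrix names follow the paper (c, Z, T, H, Q); the matrices are time-constant
# here for simplicity, as they are in both of the paper's special cases.
import numpy as np

def kalman_filter(y, c, Z, T, H, Q, a1, P1):
    """Return filtered states a_{t|t} and the log-likelihood (13)."""
    n_obs, k = y.shape
    a, P = a1.copy(), P1.copy()                  # a_{t|t-1}, P_{t|t-1}
    loglik = -0.5 * n_obs * k * np.log(2 * np.pi)
    filtered = []
    for t in range(n_obs):
        v = y[t] - c - Z @ a                     # one-step-ahead forecast error
        F = Z @ P @ Z.T + H                      # its variance
        Finv = np.linalg.inv(F)
        loglik -= 0.5 * (np.log(np.linalg.det(F)) + v @ Finv @ v)  # eq. (13)
        a_tt = a + P @ Z.T @ Finv @ v            # updating step, eq. (14)
        P_tt = P - P @ Z.T @ Finv @ Z @ P        # eq. (15)
        filtered.append(a_tt)
        a = T @ a_tt                             # prediction step, eq. (16)
        P = T @ P_tt @ T.T + Q                   # eq. (17)
    return np.array(filtered), loglik

# Illustrative local-level model: k = m = 1, Z = T = 1.
rng = np.random.default_rng(0)
level = np.cumsum(rng.normal(0, 0.1, 50))
y = (level + rng.normal(0, 0.5, 50)).reshape(-1, 1)
states, ll = kalman_filter(
    y, c=np.zeros(1), Z=np.eye(1), T=np.eye(1),
    H=np.array([[0.25]]), Q=np.array([[0.01]]),
    a1=np.zeros(1), P1=np.eye(1) * 10.0)
```

Maximum likelihood estimation then amounts to maximizing the returned log-likelihood over the unknown variance parameters with a numerical optimizer.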
Next, we show how the models in Sections 2.1 and 2.2 are special cases of the general model presented above. In terms of the notation of the general model, the time-varying threshold Carlson-Parkin model in (6)-(7) can be reformulated as

yt = yt,   c = 0,   Zt = xt′,   εt = ut,   Ht = σu²,

αt+1 = βt+1,   Tt = I₂,   ηt = νt,   Qt = σν² I₂.
Similarly, the dynamic factor model in (8)-(10) arises as a special case of the general model. The state vector stacks the current and lagged factor values together with the idiosyncratic components,

αt+1 = (Ft+1, Ft, . . . , Ft−p+1, u1,t+1, . . . , uk,t+1)′,   ηt = (ξt, 0, . . . , 0, νt′)′,

and the remaining vectors and matrices of the measurement and state equations are

yt = Yt,   c = c0,   Zt = [γ, 0k×(p−1), Ik],   εt = 0,   Ht = 0,

Qt = diag(σξ², 0, . . . , 0, σν1², . . . , σνk²).

The transition matrix Tt is block-diagonal: its upper-left p × p block is the companion matrix of φ(L), with first row (φ(1), . . . , φ(p)) and ones on the first subdiagonal, and its lower-right k × k block is diag(d1(1), . . . , dk(1)). Here we assume that q = 1; that is, the idiosyncratic errors of the individual disagreement measures have AR(1) dynamics. Because the idiosyncratic components are included in the state vector, the measurement equation holds without error.
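The state-space matrices just described can be assembled mechanically. The function name and the coefficient values below are our own illustrative choices, not estimates from the paper.

```python
# Sketch of the state-space matrices for the one-factor model with an AR(p)
# factor and AR(1) idiosyncratic errors, as mapped into the general form above.
import numpy as np

def dfm_system(phi, d, gamma, sigma_xi2, sigma_nu2):
    """Build Z, T, Q for k series, AR(p) factor coeffs phi, AR(1) coeffs d."""
    p, k = len(phi), len(d)
    m = p + k                                    # state dimension
    # Measurement: Z alpha_t = gamma * F_t + u_t
    Z = np.zeros((k, m))
    Z[:, 0] = gamma
    Z[:, p:] = np.eye(k)
    # Transition: companion block for the factor, diagonal block for errors
    T = np.zeros((m, m))
    T[0, :p] = phi                               # first row: phi(1), ..., phi(p)
    T[1:p, :p - 1] = np.eye(p - 1)               # ones on the first subdiagonal
    T[p:, p:] = np.diag(d)                       # AR(1) coefficients d_i(1)
    # State disturbance variance
    Q = np.zeros((m, m))
    Q[0, 0] = sigma_xi2
    Q[p:, p:] = np.diag(sigma_nu2)
    return Z, T, Q

# Illustrative system: k = 3 disagreement series, AR(2) factor
Z, T, Q = dfm_system(phi=[0.6, 0.2], d=[0.5, 0.3, 0.4],
                     gamma=[1.0, 0.8, 1.2], sigma_xi2=0.01,
                     sigma_nu2=[0.1, 0.2, 0.15])
```

These matrices can be passed directly to a Kalman filter routine such as the one sketched in the previous subsection, with H = 0 and c = c0.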
3 Estimation of Disagreement from Qualitative Survey Data
In this section, we compare measures of disagreement from directional forecasts to those from point forecasts of Euro area real GDP growth and inflation.
3.1 Data
We use two data sets: the ZEW Financial Market Survey and the European Central Bank's (ECB) Survey of Professional Forecasters. From the ZEW survey, we obtain directional forecasts. Since December
1991, this survey has collected the monthly responses of roughly 300 professionals in the German
financial sector. The survey asks respondents for six-month directional forecasts of economic
activity, inflation, different interest rates, stock market indexes and exchange rates for Germany,
Italy, France, Great Britain, the Euro area, the United States, etc. Nolte and Pohlmeier (2007)
explore the predictive ability of quantified forecasts based on the ZEW data. The ZEW indicator of economic sentiment is formed from this survey and receives wide media coverage. According to Entorf, Gross, and Steiner (2012), the ZEW indicator has a significant high-frequency impact on both the returns and the volatility of the German stock market index DAX.
To complement the qualitative data, we obtain point forecasts from the ECB's Survey of Professional Forecasters (SPF). Since the first quarter of 1999, the ECB SPF has documented the
quarterly responses of about 59 respondents on average. Respondents are professional forecasters
with economic expertise and about 75 percent of them are located in the European Union. The
survey asks participants only for forecasts of three variables: Euro area inflation (Harmonized
Index of Consumer Prices; HICP), real GDP growth and unemployment rate. However, for each
variable the respondents provide point as well as density forecasts at several fixed horizons.
Both surveys feature forecasts for Euro area inflation and real GDP growth. There are, however,
differences in the wording of the questions. While the ECB SPF point forecasts explicitly refer
to Euro area HICP inflation and real GDP growth, the ZEW directional forecasts refer to the
“annual inflation rate” and the “overall macroeconomic situation” in the Euro area.6 With these
differences in mind, we interpret the ZEW directional forecasts as if they were referring to the same
variables as in the ECB SPF. In addition, the two surveys are conducted at different times. While
the monthly ZEW survey is typically conducted in the first half of a month, the quarterly ECB
survey - at least since 2002Q2 - is conducted in the third week of the first month of a quarter.7 To
synchronize the timing of the two surveys, we match the quarterly ECB SPF forecasts with the ZEW forecasts conducted in the first month of each quarter.
Below, we assess the consistency of our new measures of qualitative expectations with the more commonly studied quantitative measures. We apply our measures of disagreement to the ZEW directional forecasts of Euro area inflation and real GDP growth, and compare the results with benchmark disagreement measures obtained from ECB SPF point forecasts.
3.2 Disagreement about Euro Area GDP Growth
We compare the results of our measurement approach to a benchmark measure computed as the cross-sectional standard deviation of the ECB SPF point forecasts for Euro area real GDP growth in the next twelve months.8
The top panel of Figure 1 depicts the Carlson-Parkin measure of disagreement (equation 2)
against the benchmark measure. Because the disagreement measure for directional forecasts is
only determined up to a constant of scale, we standardized both series.9 A correlation of .60 shows
that the two measures are closely related.
The bottom panel of Figure 1 shows the overall disagreement index, extracted from individual
disagreement measures for German, French and Italian economic activity. Although very similar
to the measure based on directional forecast for the Euro area, our measure of overall disagreement
tracks the benchmark measure even more closely. The correlation between the overall disagreement and the benchmark is .70, which is .10 higher than the correlation between the disagreement constructed from the forecasts for the Euro area and the benchmark. This result implies that there are gains from estimating disagreement at the country level and then pooling the country-level disagreement measures, relative to estimating disagreement at the aggregate level. The gains arise from the similarity of disagreement across the three country-level variables, which is reflected in the high correlations (ranging from 0.76 to 0.99) of each disagreement estimate with the overall disagreement. Our finding is consistent with the result in Marcellino, Stock, and Watson (2003) that there are typically gains from forecasting economic series at the country level and then pooling the forecasts, relative to forecasting at the aggregate level.

6 The corresponding questions read, "In the medium-term (6 months), the macroeconomic situation will improve, not change or worsen," and "In the medium-term (6 months), the annual inflation rate will rise, stay the same or fall."

7 Before 2002Q2, the ECB SPF had typically been conducted one or two weeks later.

8 As an alternative measure of disagreement from point forecasts, we have used the inter-quartile range. None of our results change qualitatively. For brevity, we do not report them here.

9 The remaining figures show standardized disagreement measures as well.
We experimented with several modifications of the basic Carlson-Parkin measure. We replaced
the latent normal distribution with a scaled t-distribution with ten degrees of freedom. The
top panel of Figure 2 depicts this disagreement measure against the ECB SPF benchmark. The
correlation between the disagreement measure using the scaled t-distribution and the benchmark is
moderately higher than that between the basic Carlson-Parkin measure and the benchmark (.65 vs.
.60). Moreover, we relaxed the assumption of symmetrical thresholds. Appendix A.1 describes how
we constructed the actuals to estimate the asymmetric thresholds. The middle panel of Figure
2 depicts the disagreement measure that assumes asymmetric but time-constant thresholds. It
turns out that this alteration does not increase the correlation with the benchmark over the basic
Carlson-Parkin measure. Finally, we estimated disagreement allowing for asymmetric and time-varying thresholds. As shown in the bottom panel of Figure 2, this measure shows no improvement
over the basic Carlson-Parkin measure in terms of the correlation with the benchmark disagreement
measure.
3.3 Disagreement about Euro Area Inflation
Similar to our analysis of Euro area GDP growth, we apply our new measures of disagreement to
forecasts of Euro area inflation. We compare disagreement measures obtained from ZEW directional forecasts with a benchmark measure computed as the cross-sectional standard deviation of
the 12-month HICP inflation forecasts from the ECB SPF.
The upper panel of Figure 3 compares the basic Carlson-Parkin disagreement measure to the
benchmark. The two series display a correlation of .40. Surprisingly, the disagreement measure
from directional forecasts seems to lead the benchmark disagreement from point forecasts. Indeed,
the correlation between the two rises to .54 when the Carlson-Parkin measure is lagged by one
quarter.
The relationship is even more pronounced when we use the approach in Section 2.2. Results
from this approach are depicted in the lower panel of Figure 3. We first calculate Carlson-Parkin
disagreement measures for the member countries Germany, Italy and France. Then, using the dynamic factor model, we extract an overall disagreement index from the three country-level inflation disagreement series. By taking account of the similarity of disagreement across countries, this bottom-up approach increases the correlation between the disagreement measure and the benchmark from .40 to .49. Again, if we lag the overall disagreement by one period, the correlation rises markedly, to .64.
We also explored three modifications of the basic Carlson-Parkin approach. The three modifications are depicted in Figure 4. The first modification relies on a scaled t-distribution with
ten degrees of freedom. Unfortunately, its correlation with the benchmark disagreement is slightly
lower than that of the basic Carlson-Parkin measure (.38 vs. .40). The second modification uses
asymmetric, but time-constant thresholds. This modification achieves a minor improvement over
the basic Carlson-Parkin measure (.43 vs. .40). The third alternative, a model with time-varying
and asymmetric thresholds, does not show a clear improvement over the simple Carlson-Parkin
approach.
To summarize, the basic Carlson-Parkin approach accurately measures disagreement in predictions of a single variable. The various modifications, including using a scaled t-distribution and allowing asymmetric and/or time-varying thresholds, do not produce significant improvements over the basic approach. The overall disagreement index constructed from the individual disagreement series performs very well and closely tracks the benchmark measures of disagreement. However, these results must be interpreted with caution, since they are based on only thirteen years of quarterly data. As the time period covered by the data is extended, more robust analyses will become possible.
4 Forecasting Technology and Forecast Disagreement
We demonstrate the usefulness of our disagreement measures in two applications: the source of
disagreement and the predictive power of disagreement. We address the first issue in this section
and the second one in the next section.
The literature offers several prominent theories about why forecasters disagree. In the sticky
information model of Mankiw and Reis (2002), forecasters only occasionally pay attention to news,
and this inattention endogenously generates disagreement in aggregate expectations. In contrast,
Sims (2003) and Woodford (2003) develop the noisy information model in which forecasters are
continuously updating their information, but observe noisy signals about the true state. Andrade
and Le Bihan (2010) and Coibion and Gorodnichenko (2012) consider both sticky information and
noisy information models for professional forecasters. While the latter finds the basic noisy information model to be the best characterization of the expectations formation process of professional
forecasters, the former concludes that both models cannot quantitatively replicate the forecast
error and disagreement observed in the SPF data.
A third explanation for the existence of disagreement focuses on the strategic behavior of forecasters. Ehrbeck and Waldmann (1996), Laster, Bennett, and Geoum (1999) and Ottaviani and
Sørensen (2006) have explored this possibility. These models typically assume that after observing
the same public information, forecasters have incentives to report distorted predictions out of reputational concerns. However, the strategic forecasting models seem to be less relevant in our case
because of the anonymity of respondents in the ZEW survey.
With a focus on expectation formation, Kandel and Pearson (1995), Lahiri and Sheng (2008),
Patton and Timmermann (2010) and Manzan (2011) show that forecasters may disagree due to
their different prior beliefs or their differential interpretation of public information. These papers
provide indirect evidence that forecasters use different models and judgment in revising their
predictions.
In this section, we utilize a special questionnaire, which was attached to the ZEW financial
market survey in March 2011, to explore the role of heterogeneous models in directly generating
forecast disagreement. Among other things, the questionnaire asked respondents about the importance of several methods in their directional forecasts for economic activity. Respondents used the
categories “small,”“medium” or “high” to assess the importance of each of the following methods:
econometric modeling, fundamental analysis, technical analysis, judgment, in-house research and
consensus forecasts. Table 1 summarizes the results. The importance of the methods varies widely
across the panel. For instance, some respondents pay little attention to econometric modeling,
in-house research and consensus forecasts, but others assign high importance to these methods.
The respondents disagree less on technical analysis, which most of them do not use. On the other
hand, most of the respondents rely heavily on fundamental analysis and judgment. This finding
coincides with the finding in Batchelor and Dua (1990) that the single most important forecasting
technique used by the Blue Chip panel was judgment.
We expect that the forecasting technology of a respondent is a mixture of the methods mentioned
above. In order to identify groups that use similar mixtures of methods, we apply cluster analysis.
More specifically, we use the K-means clustering algorithm of Hartigan and Wong (1979) on coded
data, where we denote the response “small” by 1, “medium” by 2 and “high” by 3. We choose the
number of clusters by inspecting the incremental reduction of the within-cluster sum of squares
achieved by an additional cluster. Since the curve flattens out when more than four clusters are
included, we assign the individual respondents to four clusters. Table 2 gives the average (coded)
value of the variables and the number of respondents for each of the four clusters. The clusters
differ most substantially with respect to the importance of econometric modeling, in-house research
and consensus forecasts. Respondents in clusters one and four put little weight on econometric
modeling, whereas clusters two and three include respondents who rely heavily on econometrics. In
clusters one and three, forecasters attach a low weight to in-house research and consensus forecasts,
whereas clusters two and four are much more in favor of accounting for the forecasts of others.
To summarize, the four clusters mainly differ in two dimensions: the use of econometric modeling
(clusters 1 and 4 vs. 2 and 3), and the weight on the forecasts of others (clusters 1 and 3 vs. 2
and 4).
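The coding and clustering steps can be sketched in a few lines. The plain Lloyd's-algorithm routine below is an illustrative stand-in for the Hartigan and Wong (1979) implementation, and the coded panel is hypothetical, not the actual survey responses:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm on coded survey responses."""
    rng = random.Random(seed)
    centers = [tuple(map(float, p)) for p in rng.sample(points, k)]
    for _ in range(iters):
        # assign each respondent to the nearest cluster center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # recompute centers as coordinate-wise means (keep old center if empty)
        new_centers = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[j]
                       for j, cl in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def wss(centers, clusters):
    """Within-cluster sum of squares, inspected to choose the number of clusters."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, c))
               for c, cl in zip(centers, clusters) for p in cl)

# Hypothetical panel: six methods, coded "small"=1, "medium"=2, "high"=3
panel = [(1, 3, 1, 2, 1, 2), (3, 3, 2, 2, 3, 2),
         (3, 3, 1, 2, 1, 1), (1, 2, 1, 3, 3, 3)] * 10
for k in range(1, 5):
    print(k, round(wss(*kmeans(panel, k)), 2))  # look for the "elbow"
```

Picking k where the printed within-cluster sum of squares stops falling sharply mirrors the elbow inspection described above; on the actual data this occurs at four clusters.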
We use the cluster assignments to examine the relationship between forecasting technologies and
the level of disagreement. The four panels of Figure 5 depict cluster-level disagreement against
full-sample disagreement. In Table 2 we also include average levels of disagreement for each cluster
and the result of testing whether average cluster-level disagreement differs from average full-sample
disagreement. We find that clusters one and four have below-average disagreement. Cluster two does not display a significantly different level of disagreement from the full sample. Cluster three has above-average disagreement. Linking disagreement levels to the forecasting technologies of the respondents, we find that the use of econometric modeling boosts disagreement, whereas paying close attention to the forecasts of others reduces disagreement. Therefore, different forecasting
technologies induce different levels of disagreement, implying that research about disagreement
should pay close attention to the forecasting technology of the group of forecasters. Furthermore,
conflicting findings across studies that use similar methodologies may be explained by differences in the forecasting technologies of their samples.
5 Does Disagreement Have Predictive Value?
As a second application of our disagreement measures, in this section we examine the potential
predictive power of disagreement in qualitative survey data for economic activity. Due to the
relatively short time span of the ZEW survey, we use another well-known business survey conducted
by the Institute for Supply Management (ISM). Each month the ISM sends supply chain managers
and business executives a questionnaire that asks about their firm’s production, employment and
other information for the preceding month. Respondents report whether a variable has gone “up,”
“down” or “stayed the same” over the previous month. The ISM data have three main advantages
for forecasting purposes. First, as the ISM survey has been conducted since January 1948, it has
a long history. Second, it is timely, as new ISM data comes out on the first business day of every
month. Third, the ISM data is subject to minimal revisions.
We focus on data on new orders, production, employment and prices, because they are available
from January 1948. Other series were introduced at various dates between June 1976 and January 1997. We
construct the standard Carlson-Parkin disagreement measure for each of the four variables and
also compute an overall disagreement index using the dynamic factor model. We then explore the
potential of these disagreement measures for forecasting U.S. industrial production.
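To fix ideas, the standard Carlson-Parkin inversion behind the first measure can be sketched as follows; the symmetric threshold c and the response shares are hypothetical, and the paper's estimated-threshold variants would replace the fixed c:

```python
from statistics import NormalDist

def carlson_parkin_disagreement(share_down, share_up, c=0.1):
    """Invert the shares of 'down' and 'up' answers into an implied
    cross-sectional mean and standard deviation, assuming expectations
    are normal and responses are censored at symmetric thresholds +/- c:
        P(down) = Phi((-c - mu) / sigma),  P(up) = 1 - Phi((c - mu) / sigma).
    The implied sigma is the disagreement measure. Shares must lie
    strictly between 0 and 1."""
    inv = NormalDist().inv_cdf
    z1 = inv(share_down)        # equals (-c - mu) / sigma
    z2 = inv(1.0 - share_up)    # equals ( c - mu) / sigma
    sigma = 2.0 * c / (z2 - z1)
    mu = -c * (z2 + z1) / (z2 - z1)
    return mu, sigma

# Hypothetical month: 20% expect a fall, 30% a rise, the rest no change
mu, sigma = carlson_parkin_disagreement(0.20, 0.30)
```

Note that c only scales the implied standard deviation, so the time profile of this disagreement measure does not depend on the chosen threshold.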
We consider the following four models:
$y_{t+h}^h = a\, y_t^h + \varepsilon_t$, (18)

$y_{t+h}^h = a\, y_t^h + b_1 D_t + \varepsilon_t$, (19)

$y_{t+h}^h = a\, y_t^h + b_1 PMI_t + \varepsilon_t$, (20)

$y_{t+h}^h = a\, y_t^h + b_1 PMI_t + b_2 D_t + \varepsilon_t$, (21)

where $y_t^h = \ln(IP_t) - \ln(IP_{t-h})$ is the log growth rate in industrial production from $t-h$ to $t$ (so that $y_{t+h}^h$ is the growth from $t$ to $t+h$), $D_t$ is the disagreement in month $t$, and $PMI_t$ is the ISM Purchasing Managers' Index (PMI) for that month.
We include the PMI in the regression because several studies indicate that this index has forecasting
power for GDP and the business cycle (see, inter alia, Dasgupta and Lahiri (1993) and Banerjee
and Marcellino (2006)). By comparing model (19) to (18), and model (21) to (20), we measure the
extent to which a disagreement measure adds value to forecasting industrial production. Each of
the four models is estimated with h = 1, 3, 6, 9, 12 months using real-time data for the industrial production index from the Archival Federal Reserve Economic Data (ALFRED) database. Additionally, models (19) and (21) are estimated with each of the five disagreement measures separately.
The forecasts are produced using an expanding estimation window methodology. Starting in
January 1970, we estimate each model using data between January 1948 when the ISM survey
was first conducted and December 1969, the latest period for which the target variable had been
publicly observed in January 1970. Based on the estimated parameters, we make a one-step-ahead forecast with each model. If the forecast horizon is h = 1, the forecast refers to the log
growth in industrial production from December 1969 to January 1970. Similarly, forecasts with
horizon h = 12 refer to the log growth in industrial production from December 1969 to December
1970. Next, we proceed to February 1970 by extending the estimation window from January 1948
to January 1970 and making one-step-ahead forecasts again. We continue analogously until we
reach January 2010, when the estimation window includes the observations from January 1948 to
December 2009, and a final set of one-step-ahead forecasts are made. This systematic procedure
produces a set of forecasts for industrial production.10
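A minimal sketch of this expanding-window loop, using model (18) alone on a synthetic series (the paper uses real-time ALFRED vintages, and models (19)-(21) would add $D_t$ and $PMI_t$ as extra regressors):

```python
def expanding_window_forecasts(y, h, first_origin):
    """Pseudo out-of-sample forecasts from model (18), y_{t+h} = a*y_t + e.
    At each origin t0, the coefficient is re-estimated on all pairs
    (y_t, y_{t+h}) whose target y_{t+h} is already observed at t0."""
    forecasts = {}
    for t0 in range(first_origin, len(y) - h):
        pairs = [(y[t], y[t + h]) for t in range(t0 - h + 1)]
        a = (sum(x * v for x, v in pairs) /
             sum(x * x for x, _ in pairs))   # no-intercept OLS slope
        forecasts[t0 + h] = a * y[t0]        # forecast of y at t0 + h
    return forecasts

def msfe(forecasts, y):
    """Mean squared forecast error against the realized series."""
    errors = [(y[t] - f) ** 2 for t, f in forecasts.items()]
    return sum(errors) / len(errors)

# Synthetic geometric series: the link y_{t+h} = 0.9**h * y_t holds exactly,
# so the expanding-window forecasts should be (numerically) perfect.
y = [0.9 ** t for t in range(60)]
fc = expanding_window_forecasts(y, h=3, first_origin=20)
print(msfe(fc, y))
```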
We evaluate the forecasts using first-release data, which become available one month after the reference month of the industrial production index. We compute the mean squared forecast error (MSFE) for each model, and compare the MSFEs of the models that include disagreement to those of the
models without disagreement. We test for equal predictive ability for these pairs of models (with
vs. without the disagreement index) by the Clark and West (2007) test statistic.11
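The Clark and West (2007) adjustment subtracts the noise that the larger model introduces by estimating coefficients that are zero under the null. A minimal sketch, assuming an i.i.d. loss differential (so without the HAC correction one would want for overlapping multi-step forecasts):

```python
from math import sqrt

def clark_west(e_small, e_big, f_small, f_big):
    """Clark-West (2007) adjusted comparison of nested forecasting models.
    e_*: forecast errors, f_*: forecasts, for the nested (small) and
    nesting (big) model. Returns a t-statistic on the adjusted loss
    differential; positive, significant values favor the larger model."""
    adj = [es ** 2 - (eb ** 2 - (fs - fb) ** 2)
           for es, eb, fs, fb in zip(e_small, e_big, f_small, f_big)]
    n = len(adj)
    mean = sum(adj) / n
    var = sum((a - mean) ** 2 for a in adj) / (n - 1)
    return mean / sqrt(var / n)  # compare with one-sided normal critical values
```

Under the null of equal population-level predictive ability, the statistic is compared with one-sided standard normal critical values (e.g. 1.645 at the 5 percent level).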
Table 3 shows the results of our forecast evaluation. Three points are worth noting. First, when
comparing a simple AR(1) model (column 3) to the AR(1) model with the PMI as an additional
regressor (column 5), we see that the PMI adds power to predicting industrial production at all
horizons. Second, the disagreement extracted from ISM survey data can be used to forecast industrial production with varying degrees of success (columns 7 and 8). More specifically, there is
not much improvement when disagreement in new orders or production is added to the regression.
However, including estimated disagreement in employment or prices as additional regressors significantly decreases the MSFE in both the simple AR(1) model and the AR(1) model with the PMI.
This substantial decrease implies that the disagreement in these two variables is useful for forecasting industrial production over and above the PMI. Third, including the overall disagreement
10 We also repeat the experiment with rolling estimation windows of 10 and 20 years of data and find similar results. To save space, the results are not reported here.
11 Note that when comparing nested models, the alternative hypothesis is that the larger model has higher predictive ability than the nested model, because the test refers to predictive ability at the population level. Thus, if the additional variables have predictive power, their coefficients should be non-zero, indicating an improvement in predictive ability over the nested model.
constructed from all four variables in the models does not significantly reduce the MSFE. The estimation results of the dynamic factor model suggest why. Whereas the
disagreement in new orders and production affects the common factor heavily (factor loadings of
1.197 and 1.015, respectively), the disagreement in employment has only a moderate effect (factor
loading of 0.387), and the disagreement in prices does not significantly load on the factor (factor
loading of 0.048). Therefore, the disagreement in new orders and production dominates the overall
disagreement, yielding an insignificant result.
We assess the robustness of the predictive power of disagreement by incorporating the mean
forecast in equations (18)-(21), as many studies confirm the predictive value of the mean (see,
e.g., Elliott and Timmermann (2005) and Legerstee and Franses (2010)). Our analysis specific to
industrial production finds that including the mean slightly reduces the MSFE and disagreement
still significantly adds value in producing forecasts.12 In thinking about why disagreement has
predictive power in forecasting industrial production and possibly other macroeconomic variables,
note that disagreement might pick up some omitted variables (Driver, Trapani, and Urga, 2012)
or proxy for forecast uncertainty (Lahiri and Sheng, 2010).
6 Conclusion
We present two methods for measuring disagreement in qualitative data on economic expectations.
The first method quantifies the level of disagreement in predictions of a single variable. The
second constructs an index of overall disagreement across several target variables. Our empirical
results show that our disagreement measures estimated from directional forecasts closely track the
conventional disagreement measures obtained from point forecasts.
We apply our disagreement measures to analyze the sources and economic significance of disagreement. We find that forecasters use a wide range of techniques in interpreting and weighting public information, resulting in substantial heterogeneity in their forecasts. Furthermore, our
analysis shows that the disagreement estimated from qualitative survey data contains economically
meaningful information for forecasting purposes.
We hope to spur further research into disagreement among forecasters by providing appropriate
tools for the analysis of qualitative survey data. Such tools are especially important because many
surveys, in particular those with large panels of non-professional forecasters, collect directional
12 To save space, these results are not reported here. They are available upon request.
forecasts only. Due to the large cross-sectional dimension of many of these data sets, qualitative
data may provide invaluable insights into the determinants and uses of disagreement in economic
modeling, forecasting and policy.
A Appendix

A.1 Construction of Realizations
Realizations are required for the estimation of the time-varying, asymmetric thresholds model
and the time-constant, asymmetric thresholds model of Section 2.1. These have been constructed
through the following procedure.
In the case of inflation, respondents provide six months ahead forecasts of the annual inflation
rate of the Euro area (“annual inflation will rise/stay same/fall”). As the observed counterpart,
we utilize the six months ahead change in the annual HICP inflation rate. We account for the
publication lag by lagging the actual series by one month. The idea is to start from the latest
figure available at the time when the directional forecasts are made. The one month lag is sufficient
because we use flash estimates that are available from October 2001. For example, the six months
ahead change in the annual inflation rate in January 2012 is computed as the annual inflation rate
in June 2012 minus the annual inflation rate in December 2011. Because flash estimates are not available before October 2001, we substitute final revised inflation data for those earlier months.
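The timing convention, a one-month publication lag followed by a six-months-ahead change, can be made concrete with a small helper; the dates mirror the January 2012 example above, while the inflation figures themselves are hypothetical:

```python
from datetime import date

def add_months(d, m):
    """Shift a month-dated observation by m months (day fixed to 1)."""
    total = d.year * 12 + (d.month - 1) + m
    return date(total // 12, total % 12 + 1, 1)

def realization(series, forecast_month):
    """Observed counterpart of a six-months-ahead directional inflation
    forecast made in forecast_month: the change in annual inflation from
    the latest published month (one-month lag) to six months later."""
    start = add_months(forecast_month, -1)  # latest figure available
    end = add_months(start, 6)              # six months ahead of it
    return series[end] - series[start]

# Hypothetical annual HICP inflation rates (percent)
series = {date(2011, 12, 1): 2.7, date(2012, 6, 1): 2.4}
delta = realization(series, date(2012, 1, 1))  # June 2012 minus December 2011
```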
In the case of GDP, the survey questionnaire asks respondents to forecast whether the overall
macroeconomic situation will improve, stay the same or worsen in the next six months. As the
observed counterpart, we employ the six months ahead growth of real GDP relative to the latest
available figure at the time of the forecast. In the third month of each quarter, we use the GDP
growth of the current and next quarter as the realization. Due to the quarterly frequency of GDP,
we only use directional forecasts of the last month of each quarter for estimation. Therefore,
the estimates of the time-varying thresholds refer to the third month in a quarter. We adopt a
pragmatic approach and use the same estimated thresholds for the first month in the following
quarter.
References
Andrade, P., and H. Le Bihan (2010): “Inattentive Professional Forecasters,” Banque de France
Working Paper no. 307.
Banerjee, A., and M. Marcellino (2006): “Are There Any Reliable Leading Indicators for US
Inflation and GDP Growth?,” International Journal of Forecasting, 22(1), 137–151.
Banternghansa, C., and M. McCracken (2009): “Forecast Disagreement Among FOMC
Members,” Federal Reserve Bank of St. Louis. Working Paper 2009-059A.
Batchelor, R. (1981): “Aggregate Expectations Under the Stable Laws,” Journal of Econometrics, 16(2), 199–210.
Batchelor, R., and P. Dua (1990): “Forecaster Ideology, Forecasting Technique, and the Accuracy of Economic Forecasts,” International Journal of Forecasting, 6(1), 3–10.
Carlson, J. A. (1975): “Are Price Expectations Normally Distributed?,” Journal of the American
Statistical Association, 70(352), 749–754.
Carlson, J. A., and M. J. Parkin (1975): “Inflation Expectations,” Economica, 42, 123–138.
Carriero, A., T. E. Clark, and M. Marcellino (2012): “Common Drifting Volatility in
Large Bayesian VARs,” CEPR Working Paper no. DP8894.
Clark, T. E., and K. D. West (2007): “Approximately Normal Tests for Equal Predictive
Accuracy in Nested Models,” Journal of Econometrics, 138, 291–311.
Coibion, O., and Y. Gorodnichenko (2012): “What Can Survey Forecasts Tell Us about
Information Rigidities?,” Journal of Political Economy, 120, 116–159.
Cooley, T. F., and E. C. Prescott (1976): “Estimation in the Presence of Stochastic Parameter Variation,” Econometrica, 44(1), 167–84.
Dasgupta, S., and K. Lahiri (1992): “A Comparative Study of Alternative Methods of Quantifying Qualitative Survey Responses Using NAPM Data,” Journal of Business and Economic
Statistics, 10(4), 391–400.
Dasgupta, S., and K. Lahiri (1993): “On the Use of Dispersion Measures from NAPM Surveys in Business Cycle Forecasting,” Journal of Forecasting, 12(3-4), 239–253.
Driver, C., L. Trapani, and G. Urga (2012): “On the Use of Cross-Sectional Measures of
Uncertainty,” Working paper, Cass Business School, City University London.
Durbin, J., and S. Koopman (2012): Time Series Analysis by State Space Methods, 2nd edn. Oxford University Press, Oxford.
Ehrbeck, T., and R. Waldmann (1996): “Why Are Professional Forecasters Biased? Agency
versus Behavioral Explanations,” The Quarterly Journal of Economics, 111(1), 21–40.
Elliott, G., and A. Timmermann (2005): “Optimal Forecast Combination under Regime
Switching,” International Economic Review, 46(4), 1081–1102.
Entorf, H., A. Gross, and C. Steiner (2012): “Business Cycle Forecasts and their Implications
for High Frequency Stock Market Returns,” Journal of Forecasting, 31(1), 1–14.
Hartigan, J. A., and M. A. Wong (1979): “A K-Means Clustering Algorithm,” Applied Statistics, 28, 100–108.
Kandel, E., and N. D. Pearson (1995): “Differential Interpretation of Public Signals and Trade
in Speculative Markets,” Journal of Political Economy, 103(4), 831–872.
Koopman, S. J. (1997): “Exact Initial Kalman Filtering and Smoothing for Nonstationary Time
Series Models,” Journal of the American Statistical Association, 92(440), 1630–1638.
Lahiri, K., and X. Sheng (2008): “Evolution of Forecast Disagreement in a Bayesian Learning
Model,” Journal of Econometrics, 144(2), 325–340.
Lahiri, K., and X. Sheng (2010): “Measuring Forecast Uncertainty by Disagreement: The
Missing Link,” Journal of Applied Econometrics, 25(2), 514–538.
Laster, D., P. Bennett, and I. S. Geoum (1999): “Rational Bias in Macroeconomic Forecasts,”
The Quarterly Journal of Economics, 114(1), 293–318.
Legerstee, R., and P. H. Franses (2010): “Does Disagreement amongst Forecasters Have
Predictive Value?,” Tinbergen Institute Discussion Paper 088/4.
Mankiw, N. G., and R. Reis (2002): “Sticky Information versus Sticky Prices: A Proposal to
Replace the New Keynesian Phillips Curve,” The Quarterly Journal of Economics, 117, 1295–
1328.
Mankiw, N. G., R. Reis, and J. Wolfers (2004): “Disagreement about Inflation Expectations,”
in NBER Macroeconomics Annual 2003, ed. by M. Gertler, and K. Rogoff, pp. 209–248. MIT
Press, MA.
Manzan, S. (2011): “Differential Interpretation in the Survey of Professional Forecasters,” Journal
of Money, Credit and Banking, 43(5), 993–1017.
Marcellino, M., J. H. Stock, and M. W. Watson (2003): “Macroeconomic Forecasting in
the Euro area: Country Specific versus Area-Wide Information,” European Economic Review,
47(1), 1–18.
Nardo, M. (2003): “The Quantification of Qualitative Survey Data: A Critical Assessment,”
Journal of Economic Surveys, 17(5), 645–668.
Nolte, I., and W. Pohlmeier (2007): “Using Forecasts of Forecasters to Forecast,” International
Journal of Forecasting, 23, 15–28.
Ottaviani, M., and P. N. Sørensen (2006): “The Strategy of Professional Forecasting,” Journal
of Financial Economics, 81(2), 441–466.
Patton, A. J., and A. Timmermann (2010): “Why Do Forecasters Disagree? Lessons from the
Term Structure of Cross-sectional Dispersion,” Journal of Monetary Economics, 57(7), 803–820.
Pesaran, M. H. (1984): “Expectations Formation and Macroeconometric Modelling,” in Contemporary Macroeconomic Modelling, ed. by P. Malgrange, and P.-A. Muet, pp. 27–55. Basil
Blackwell, Oxford.
Pesaran, M. H., and M. Weale (2006): “Survey Expectations,” in Handbook of Economic
Forecasting, ed. by G. Elliott, C. W. Granger, and A. Timmermann, vol. 1, chap. 14, pp. 715–
776. North-Holland.
Sims, C. (2003): “Implications of Rational Inattention,” Journal of Monetary Economics, 50,
665–690.
Sinclair, T. M., and H. O. Stekler (2012): “Examining the Quality of Early GDP Component
Estimates,” International Journal of Forecasting, forthcoming.
Smith, J., and M. McAleer (1995): “Alternative Procedures for Converting Qualitative Response Data to Quantitative Expectations: an Application to Australian Manufacturing,” Journal of Applied Econometrics, 10, 165–185.
Song, C., B. L. Boulier, and H. O. Stekler (2009): “Measuring Consensus in Binary Forecasts: NFL Game Predictions,” International Journal of Forecasting, 25(1), 182–191.
Stock, J., and M. Watson (1991): “A Probability Model of the Coincident Economic Indicators,”
in Leading Economic Indicators: New Approaches and Forecasting Records, ed. by K. Lahiri, and
G. Moore, pp. 63–90. Cambridge University Press.
Welch, B. L. (1947): “The Generalization of “Student’s” Problem when Several Different Population Variances are Involved,” Biometrika, 34(1-2), 28–35.
Woodford, M. (2003): “Imperfect Common Knowledge and the Effects of Monetary Policy,” in
Knowledge, Information, and Expectations in Modern Macroeconomics: In Honor of Edmund
Phelps, ed. by P. Aghion, R. Frydman, J. Stiglitz, and M. Woodford, chap. 1, pp. 25–58. Princeton
University Press, New Jersey.
usage of     econometric  fundamental  technical               in-house  consensus
methods      modeling     analysis     analysis    judgement   research  forecasts
small        0.39         0.04         0.67        0.09        0.32      0.28
medium       0.33         0.20         0.27        0.34        0.38      0.51
high         0.28         0.76         0.07        0.57        0.30      0.21

Table 1: Distribution of methods employed by respondents to the ZEW Financial Market Survey. Each column gives the shares of respondents assigning small, medium or high importance to the method.
cluster      members  econometric  fundamental  technical               in-house  consensus  avg.
                      modeling     analysis     analysis    judgement   research  forecasts  disagreement
1            39       1.00         2.79         1.26        2.49        1.44      1.69       1.24
2            63       2.49         2.77         1.60        2.44        2.43      1.90       1.33
3            41       2.51         2.90         1.22        2.37        1.15      1.46       1.74*
4            41       1.22         2.37         1.39        2.61        2.63      2.63       1.05*
full panel   184      1.90         2.72         1.40        2.47        1.98      1.92       1.32

Table 2: Descriptive statistics for the four clusters identified by the K-means clustering algorithm. Column “members” shows the number of respondents assigned to each cluster; the columns to the right of “members” show the average importance that the members of each cluster assign to each method, obtained by coding the response “small” as 1, “medium” as 2 and “high” as 3. The last column shows the average disagreement in the directional forecast for Euro area economic activity for each cluster. One asterisk denotes significance of the Welch (1947) t-test for equal sample means at the one percent significance level; the null hypothesis is that the cluster and the full panel have equal means, against the two-sided alternative that the two means differ.
Horizon  Disagr.     MSFE                                       relative MSFE
         Measure     AR      AR+D    AR+PMI  AR+PMI+D    (AR+D)/AR   (AR+PMI+D)/(AR+PMI)
1        New Ord.    0.42    0.42    0.37    0.37        1.01        1.00**
3        New Ord.    2.96    3.02    2.73    2.74        1.02        1.00*
6        New Ord.    10.68   10.78   9.89    9.75        1.01        0.99*
9        New Ord.    20.91   21.02   20.46   20.16       1.00        0.99*
12       New Ord.    31.73   32.26   30.89   31.30       1.02        1.01
1        Product.    0.42    0.42    0.37    0.36        1.00        0.98**
3        Product.    2.96    3.04    2.73    2.75        1.03        1.00
6        Product.    10.68   10.92   9.89    9.86        1.02        1.00*
9        Product.    20.91   21.06   20.46   20.29       1.01        0.99*
12       Product.    31.73   31.89   30.89   31.38       1.00        1.02
1        Employm.    0.42    0.42    0.37    0.36        1.00        0.97***
3        Employm.    2.96    2.95    2.73    2.62        0.99*       0.96***
6        Employm.    10.68   10.29   9.89    9.05        0.96***     0.91***
9        Employm.    20.91   19.89   20.46   18.62       0.95***     0.91***
12       Employm.    31.73   30.12   30.89   28.46       0.95***     0.92***
1        Prices      0.42    0.41    0.37    0.36        0.97***     0.98**
3        Prices      2.96    2.74    2.73    2.52        0.92***     0.92**
6        Prices      10.68   9.48    9.89    8.79        0.89***     0.89**
9        Prices      20.91   17.20   20.46   17.59       0.82***     0.86**
12       Prices      31.73   25.53   30.89   26.15       0.80***     0.85**
1        DI          0.42    0.42    0.37    0.36        1.01        0.99**
3        DI          2.96    3.07    2.73    2.74        1.03        1.00**
6        DI          10.68   10.90   9.89    9.68        1.02        0.98**
9        DI          20.91   20.96   20.46   19.94       1.00        0.97*
12       DI          31.73   31.85   30.89   31.19       1.00        1.01
Table 3: Results of the pseudo out-of-sample forecasting experiment. Column “Horizon” indicates the forecast horizon in months; column “Disagr. Measure” indicates the disagreement measure; the columns under the header “MSFE” report the mean squared forecast errors of the four models presented in equations (18)-(21); the last two columns under the header “relative MSFE” report (1) the MSFE of the AR(1) model with disagreement as an explanatory variable relative to the pure AR(1) model, and (2) the MSFE of the AR(1) model that includes the PMI and disagreement as explanatory variables relative to the AR(1) model that only includes the PMI as an additional regressor. */**/*** indicates significance of the Clark and West (2007) test for equal predictive ability at the 1/5/10 percent significance level, respectively (one-sided alternative hypothesis: the larger model is better).
(a) Carlson-Parkin (Correlation .60)
(b) Overall Disagreement (Correlation .70)
Figure 1: Disagreement measures for Euro area GDP. Solid lines: measures based on directional
forecasts from the ZEW financial market survey. Dotted lines: standard deviation of twelve-month
point forecasts for Euro area real GDP growth from ECB SPF. Top panel: standard Carlson-Parkin
measure based on six-month directional forecasts for Euro area economic activity. Bottom panel:
overall disagreement based on six-month directional forecasts for German, Italian and French
economic activity.
(a) Scaled t10 Carlson-Parkin (Correlation .65)
(b) Asymmetric but Time-Constant Thresholds Carlson-Parkin
(Correlation .56)
(c) Asymmetric and Time-Varying Thresholds Carlson-Parkin
(Correlation .50)
Figure 2: Disagreement measures for Euro area GDP. Solid lines: measures based on directional
forecasts from the ZEW financial market survey. Dotted lines: standard deviation of twelve-month
point forecasts for Euro area real GDP growth from ECB SPF.
(a) Carlson-Parkin (Correlation .40)
(b) Overall Disagreement (Correlation .49)
Figure 3: Disagreement measures for Euro area inflation. Solid lines: measures based on directional forecasts from the ZEW financial market survey. Dotted lines: standard deviation of
twelve-month point forecasts for Euro area HICP inflation from ECB SPF. Top panel: standard
Carlson-Parkin measure based on six-month directional forecasts for the Euro area inflation rate.
Bottom panel: overall disagreement based on six-month directional forecasts for the German,
Italian and French inflation rate.
(a) Scaled t10 Carlson-Parkin (Correlation .38)
(b) Asymmetric but Time-Constant Thresholds Carlson-Parkin
(Correlation .39)
(c) Asymmetric and Time-Varying Thresholds Carlson-Parkin
(Correlation .29)
Figure 4: Disagreement measures for Euro area inflation. Solid lines: measures based on directional forecasts from the ZEW financial market survey. Dotted lines: standard deviation of
twelve-month point forecasts for Euro area HICP inflation from ECB SPF.
(a) Cluster 1 (Correlation .58)
(b) Cluster 2 (Correlation .85)
(c) Cluster 3 (Correlation .78)
(d) Cluster 4 (Correlation .82)
Figure 5: Cluster-level disagreement. Solid lines: Carlson-Parkin Disagreement for Euro area economic activity for each cluster. Dotted line: Carlson-Parkin Disagreement for Euro area economic
activity for the full panel.