The Trilemma Between Accuracy, Timeliness and Smoothness in Real-Time Signal Extraction

Marc Wildi and Tucker McElroy

June 26, 2013

Abstract

The evaluation of economic data and monitoring of the economy are often concerned with an assessment of the mid- and long-term dynamics of time series (trend and/or cycle). Frequently, one is interested in the most recent estimate of a target signal, a so-called real-time estimate. Unfortunately, real-time signal extraction is a difficult prospective estimation problem which involves linear combinations of one- and possibly infinitely many multi-step ahead forecasts of a series. We here address performances of real-time designs by proposing a generic Direct Filter Approach (DFA). We decompose the ordinary MSE into Accuracy, Timeliness and Smoothness error components, and we propose a new two-dimensional tradeoff between these conflicting terms, the so-called ATS-trilemma. With this formalism, we are able to derive a general class of optimization criteria that allow the user to address specific research priorities, in terms of the Accuracy, Timeliness and Smoothness properties of the resulting real-time filter.

Contents

1 Introduction
2 ATS-Trilemma and Customization
  2.1 Target
  2.2 Mean-Square Paradigm
  2.3 Accuracy, Timeliness and Smoothness Components
  2.4 Signal Extraction ATS-Trilemma and Customization
  2.5 Forecast AT-Dilemma
  2.6 Spectrum, Signal and Customization Interfaces
3 Replicating and Customizing a Generic Model-Based Approach
  3.1 Empirical Design
  3.2 Performance Measures
  3.3 A Generic Model-Based Approach
  3.4 Customizing the Model-Based Approach: Known DGP
  3.5 Customizing the Model-Based Approach: Unknown DGP
    3.5.1 Empirical AR(1) Spectrum
    3.5.2 Non-parametric Spectrum
4 Summary and Conclusion
5 Appendix
  5.1 Empirical Distributions of Performances when the DGP is Known: the Cases a1 = 0 and a1 = 0.9
  5.2 Empirical Distributions of Performances based on the Empirical AR(1) Spectrum: the Cases a1 = 0 and a1 = 0.9
  5.3 Empirical Distributions of Performances based on the Periodogram: the Cases a1 = 0 and a1 = 0.9
  5.4 Time-Shift in Frequency Zero

1 Introduction

The evaluation of economic data and monitoring of the economy are often concerned with an assessment of the mid- and long-term dynamics of time series (trend and/or cycle). For this purpose a broad range of time-proven techniques is available to the analyst: ARIMA-based approaches (e.g., TRAMO-SEATS; Maravall and Caporello, 2004), state-space methods (e.g., STAMP; Koopman, Harvey, Doornik and Shephard, 2000), and classic filters (e.g., Hodrick-Prescott (HP) or Christiano-Fitzgerald (CF); see Hodrick and Prescott (1997) and Christiano and Fitzgerald (2003)) are widely-used benchmarks. Frequently, one is interested in the most recent estimate of a target signal; the most recent estimate depends on only present and past data, and is referred to as a real-time estimate.
Unfortunately, performances of real-time (or concurrent) filters generally differ from performances of historical estimates (these historical estimators are sometimes called "smoothers") because, trivially, the former cannot rely on future data; i.e., real-time filters are asymmetric. We here address performances of real-time designs (although the method can easily be extended to smoothing) by proposing a generic Direct Filter Approach (DFA) that was first introduced in Wildi (2005). The key idea of the DFA is that real-time estimates, which are computed by the application of a linear filter to the time series, can be forced to have desirable properties – such as smoothness and timeliness – by direct design of the real-time filter. In order to impose these desirable properties on a filter, we proceed by decomposing the mean square error (MSE) of filtering into components that correspond to the key properties of a filter. It is well-known (Priestley, 1981) that any linear filter is characterized by its frequency response function (frf), i.e., the Discrete Fourier Transform (DFT) of its filter coefficients, and the frf can be decomposed in terms of a gain function and a phase function. The former governs smoothness of filter estimates, describing the so-called pass-band and stop-band, whereas the latter governs timeliness of the filter, controlling the advance or retardation in time of the underlying harmonics. Here, we make a novel connection between these concepts and the signal extraction MSE. We decompose the ordinary MSE into Accuracy, Timeliness and Smoothness error components, and we propose a new two-dimensional tradeoff between these conflicting terms, the so-called ATS-trilemma. This is analogous to, but not mathematically equivalent to, the bias-variance tradeoff that arises from the classical decomposition of the MSE of a parameter estimator.
With this formalism, we are able to derive a general class of optimization criteria that allow the user to address specific research priorities, in terms of the Accuracy, Timeliness and Smoothness properties of the resulting real-time filter. We call such a criterion a "customization." Although customization for real-time signal extraction can be achieved via other methodologies, we argue that our particular decomposition of MSE offers the most direct and compelling connection between parameters and the corresponding characteristics of a real-time filter. For example, in a model-based framework (e.g., TRAMO-SEATS or STAMP) one could adjust model orders and/or model parameters to achieve modifications to smoothness (e.g., increase the integration order for the trend to obtain a smoother real-time trend), but the connection of such adjustments to the phase function of the concurrent filter arising from such models is much harder to discern; moreover, knowing the phase function alone does not provide information about its impact on MSE – and one is ultimately concerned about signal extraction error. If instead one advocates using an asymmetric version of a popular nonparametric filter, such as the HP, CF, or ideal low-pass filter, there are few parameters available to adjust the phase function, and the story is much the same as in the model-based framework. So while we acknowledge that other methodologies provide a connection between parameters and vague notions of smoothness and phase, our contribution here is to make these connections mathematically explicit through the ATS decomposition. The ATS decomposition can in fact be used to analyze and customize a model-based concurrent filter, or a nonparametric concurrent filter; we actually advocate working with a richer class of concurrent filters, which allow for more flexibility in terms of gain and phase functions than is typically possible with model-based or nonparametric approaches.
If one were to weight squared bias and variance by some convex combination – where equal weights correspond to the estimator’s MSE – one could emphasize either aspect, depending upon user priorities. Such a customization results in a one-dimensional space – or curve – of criteria, the classical MSE being but the central point. Similarly, real-time signal extraction customization results in a two-dimensional space – or triangle – of criteria, wherein the MSE lies in the center, and the vertices correspond to exclusive emphasis on either Accuracy, Timeliness, or Smoothness. Thus, traditional approaches (such as those discussed above) can be replicated perfectly by DFA, and can also be customized. Interestingly, the two-dimensional tradeoff collapses to a bipolar dilemma in a classic one- or multi-step ahead forecast perspective, which we demonstrate below (essentially because the notions of pass-band and stop-band are irrelevant in forecasting problems). The key facet here is that MSE can be decomposed into a summation of constituent error measures that correspond to useful quantities of interest. Just as with a classical estimator, where it is useful to examine both its squared bias and its variance – described heuristically as accuracy and precision respectively – here in the context of signal extraction we argue that it is useful to decompose MSE into three components of Accuracy, Timeliness, and Smoothness. As in nonparametric density estimation, where altering the bandwidth can decrease bias at the expense of higher variance, or vice versa, so here an improvement of one of the ATS components typically entails a deterioration in one or both of the other components. The main concepts are introduced in Section 2. We then replicate and customize a simple generic model-based filter in Section 3, to demonstrate that DFA includes the classical approaches, while being more flexible. 
We propose classic time-domain performance measures, namely peak-correlation (for Timeliness) and curvature (mean-square second-order differences, for Smoothness), and illustrate that the customized design can outperform model-based filters in both aspects simultaneously, out-of-sample. Due to space limitations we here restrict the exposition to univariate approaches, though customization and the underlying trilemma readily extend to multivariate approaches; this is currently being investigated by the authors.

2 ATS-Trilemma and Customization

Here we introduce the mathematical concepts needed; some of this material can be found in Wildi and McElroy (2013), but Section 2.3 here is novel, and exposits the main thesis of this paper.

2.1 Target

Let {x_t} be our time series data. We assume that the target (signal) {y_t} is specified by the output of a (possibly bi-infinite) filter applied to x_t:

  y_t = Σ_{k=−∞}^{∞} γ_k x_{t−k},   (1)

and we denote by Γ(ω) = Σ_{k=−∞}^{∞} γ_k exp(−ikω) the frf of the filter. This specification is completely general, such that we could account for model-based targets (based on ARIMA or state-space models), nonparametric filters (HP, CF, or Henderson (1916)), or "ideal" low-pass and band-pass filters such as

  Γ(ω; η1, η2) = 1_{[η1, η2]}(|ω|),   (2)

where 0 ≤ η1 < η2, and with the convention that Γ(ω; η1, η2) is a low-pass if η1 = 0. The ideal low-pass and band-pass filters are symmetric, with γ_k = (sin[η2 k] − sin[η1 k])/(πk) for k ≥ 1 and γ_0 = (η2 − η1)/π. Typically, signal extraction targets are symmetric (γ_k = γ_{−k}), but our definition (1) allows for a general specification. As an example, h-step ahead forecasting would be obtained by setting γ_k = 1_{{k=−h}}. The resulting transfer function Γ(ω) = exp(ihω) would be an anticipative allpass filter.

2.2 Mean-Square Paradigm

We seek a concurrent filter, called Γ̂ (because it is a concurrent approximation to Γ), of a given length L > 0.
Let Γ̂(B) = Σ_{k=0}^{L−1} b_k B^k for the backshift operator B, so that we seek filter coefficients {b_k} such that the finite-sample estimate

  ŷ_t := Σ_{k=0}^{L−1} b_k x_{t−k}   (3)

is as close as possible to y_t in mean-square, i.e., we need to solve

  arg min_{b_0, b_1, ..., b_{L−1}} E[(y_t − ŷ_t)²].   (4)

Let b = [b_0, b_1, ..., b_{L−1}]′. We could restrict to Γ̂(B) filters that arise as the concurrent filter of a particular model (see Bell and Martin (2004) for the formula of a concurrent filter as a function of signal and noise models), in which case there is a model parameter θ such that b is a function of θ. Typically, θ has dimension lower than L, which may be fairly large (e.g., L = 100). Alternatively, Γ̂(B) may be a concurrent version of a nonparametric filter, such as the Henderson trend filter. Here too b is a function of parameters θ that govern smoothing. In the Henderson case, θ is scalar and discrete, corresponding to the available (discrete) orders of Henderson filters. For the HP, θ corresponds to a scalar smoothing parameter. We also consider the case that Γ̂(B) is not a function of an underlying parameter, but involves the full class of order-L moving average filters. For simplicity of exposition we now assume that {x_t} is a weakly stationary process with a continuous spectral density h (the DFT of the autocovariance function), such that

  E[(y_t − ŷ_t)²] = (1/2π) ∫_{−π}^{π} |Γ(ω) − Γ̂(ω)|² h(ω) dω,   (5)

which follows from the fact that y_t − ŷ_t = (Γ(B) − Γ̂(B)) x_t, which has spectral density |Γ(e^{−iω}) − Γ̂(e^{−iω})|² h(ω); see Brockwell and Davis (1991). Generalizations to non-stationary integrated processes are proposed in Wildi (2005) and Wildi and McElroy (2013)¹. Replication of traditional model-based (or classical) filter designs is obtained by plugging the corresponding target signal Γ(ω) and the corresponding spectral density h (pseudo-spectral density in the case of integrated processes – see Bell and Hillmer (1984) for discussion) into (5).
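To make the discretization concrete, the criterion (5) with the ideal low-pass target (2) and an AR(1) spectral density (anticipating Section 3) can be sketched as follows. This is our own illustrative Python sketch; the grid size, function names and default arguments are assumptions, not the authors' implementation:

```python
import numpy as np

# Discrete frequency grid approximating [-pi, pi]; T = 120 as in Section 3.
T = 120
omega = 2 * np.pi * np.arange(-T // 2, T // 2 + 1) / T

def ideal_lowpass_frf(omega, cutoff):
    """Frf of the ideal low-pass target (2) with eta1 = 0, eta2 = cutoff."""
    return (np.abs(omega) <= cutoff).astype(float)

def concurrent_frf(b, omega):
    """Frf of the one-sided filter (3) with coefficients b_0, ..., b_{L-1}."""
    k = np.arange(len(b))
    return np.exp(-1j * np.outer(omega, k)) @ b

def ar1_spectrum(omega, a1, sigma2=1.0):
    """Spectral density of a stationary AR(1) process."""
    return sigma2 / (2 * np.pi * np.abs(1 - a1 * np.exp(-1j * omega)) ** 2)

def dfa_mse(b, cutoff, a1):
    """Riemann approximation of the filter MSE (5) on the discrete grid."""
    diff2 = np.abs(ideal_lowpass_frf(omega, cutoff) - concurrent_frf(b, omega)) ** 2
    return np.mean(diff2 * ar1_spectrum(omega, a1))
```

Minimizing `dfa_mse` over `b` (e.g., with a generic numerical optimizer) yields a DFA-MSE filter; replicating a different design amounts to swapping in its (pseudo-) spectral density and target frf.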
2.3 Accuracy, Timeliness and Smoothness Components

Consider the following identity:

  |Γ(ω) − Γ̂(ω)|² = A(ω)² + Â(ω)² − 2 A(ω) Â(ω) cos(Φ̂(ω) − Φ(ω))
                 = (A(ω) − Â(ω))² + 4 A(ω) Â(ω) sin²((Φ̂(ω) − Φ(ω))/2),   (6)

where A(ω) = |Γ(ω)| and Â(ω) = |Γ̂(ω)| are the amplitude functions, and Φ(ω) = Arg(Γ(ω)) and Φ̂(ω) = Arg(Γ̂(ω)) are the phase functions of the filters involved. In the case of typical signal extraction problems – where Γ is a symmetric filter – the phase of the target vanishes, Φ(ω) = 0; in the case of h-step ahead forecasting, the phase becomes Φ(ω) = hω. We now plug (6) into (5) and obtain the following decomposition of the signal extraction MSE:

  ∫_{−π}^{π} |Γ(ω) − Γ̂(ω)|² h(ω) dω
    = ∫_{−π}^{π} (A(ω) − Â(ω))² h(ω) dω
    + 4 ∫_{−π}^{π} A(ω) Â(ω) sin²((Φ̂(ω) − Φ(ω))/2) h(ω) dω.

In the case of signal extraction we can specify the pass-band and stop-band of a filter by

  pass-band = {ω | A(ω) ≥ 0.5},
  stop-band = [−π, π] \ pass-band.

The original MSE can then be decomposed additively into the following four terms:

  A(ccuracy)   := ∫_{pass-band} (A(ω) − Â(ω))² h(ω) dω   (7)
  T(imeliness) := 4 ∫_{pass-band} A(ω) Â(ω) sin²((Φ̂(ω) − Φ(ω))/2) h(ω) dω   (8)
  S(moothness) := ∫_{stop-band} (A(ω) − Â(ω))² h(ω) dω   (9)
  R(esidual)   := 4 ∫_{stop-band} A(ω) Â(ω) sin²((Φ̂(ω) − Φ(ω))/2) h(ω) dω   (10)

Accuracy measures the contribution to the MSE when the phase (time-shift) and the noise suppression in the stop-band are ignored. This would correspond to the performance of a symmetric filter (no time-shift) with perfect noise suppression, i.e., Â(·) = 0 in the stop-band, and with the same amplitude as the considered real-time filter in the pass-band. A corresponding symmetric filter could easily be constructed by inverse Fourier transformation. Smoothness measures the MSE contribution attributable to the leakage of the real-time filter in the stop-band, corresponding to undesirable high-frequency noise.

¹Under suitable filter restrictions, the filter error y_t − ŷ_t is weakly stationary even if {x_t} is integrated.
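On a discrete frequency grid, the four components (7)-(10) follow directly from identity (6). The following is a minimal sketch, assuming the target frf, real-time frf and spectral density are supplied as NumPy arrays on a common grid; the function name `ats_decomposition` and the Riemann weighting are our own choices:

```python
import numpy as np

def ats_decomposition(Gamma, Gamma_hat, h):
    """Split the (discretized) filter MSE into the Accuracy, Timeliness,
    Smoothness and Residual components (7)-(10) via identity (6)."""
    A, A_hat = np.abs(Gamma), np.abs(Gamma_hat)
    Phi, Phi_hat = np.angle(Gamma), np.angle(Gamma_hat)
    amp_term = (A - A_hat) ** 2 * h                      # amplitude-matching error
    phase_term = 4 * A * A_hat * np.sin((Phi_hat - Phi) / 2) ** 2 * h
    passband = A >= 0.5                                  # pass-band rule of the text
    w = 1.0 / len(Gamma)                                 # Riemann weight on the grid
    return {
        "Accuracy": np.sum(amp_term[passband]) * w,
        "Timeliness": np.sum(phase_term[passband]) * w,
        "Smoothness": np.sum(amp_term[~passband]) * w,
        "Residual": np.sum(phase_term[~passband]) * w,
    }
```

By construction, the four components sum exactly to the discretized MSE, mirroring the additive decomposition above.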
This quantity is linked to the well-known time-domain "curvature" measure described in Section 3.2. Timeliness measures the MSE contribution generated by the time-shift. It is linked to the time-domain "peak-correlation" concept described in Section 3.2. Finally, the Residual is that part of the MSE which is not attributable to any of the above error components. Since the product A(ω)Â(ω) is small in the stop-band (possibly vanishing, as in the case of the ideal low-pass), the Residual is generally negligible. From a slightly different perspective, user priorities are rarely concerned with the time-shift properties of components in the stop-band, which ought to be damped or eliminated anyway. For the sake of simplicity we henceforth focus on the ideal low-pass target, wherein the Residual vanishes completely.

2.4 Signal Extraction ATS-Trilemma and Customization

The MSE can easily be generalized to provide a customized measure by assigning weights to the terms of its ATS decomposition:

  M(λ1, λ2) = λ1 Timeliness + λ2 Smoothness + (1 − λ1 − λ2) Accuracy.   (11)

The parameters λ1, λ2 ∈ [0, 1] allow one to assign priorities to single or pairwise combinations of the MSE error terms: the underlying three-dimensional tradeoff is called the ATS-trilemma, and the optimization paradigm (11) is called a customized criterion. The user is free to navigate on the customization triangle according to specific research priorities: the ordinary MSE is obtained by setting λ1 = λ2 = 1/3, whereas complete emphasis on Smoothness arises from the choice λ1 = 0, λ2 = 1, etc.

2.5 Forecast AT-Dilemma

The above trilemma was obtained by assuming that the target filter Γ(·) discriminates components into pass- and stop-bands. In contrast, h-step ahead forecasting is concerned with the approximation of an anticipative allpass filter.
Since the stop-band is non-existent, the Smoothness and Residual error components vanish and the above trilemma collapses into a forecasting dilemma:

  M(λ1) = λ1 Timeliness + (1 − λ1) Accuracy.

While this paradigm is sufficient to address all-pass filtering problems, such as multi-step ahead forecasting, it cannot emphasize Smoothness, and hence is less flexible than (11).

2.6 Spectrum, Signal and Customization Interfaces

The proposed method can replicate and customize traditional (ARIMA or state-space) model-based approaches as well as classic filter designs (HP, CF, Henderson), by plugging the corresponding (pseudo) spectral densities as well as the target signals into (13). We could combine these in any order: as an example, we could target an HP filter by supplying a model-based (pseudo) spectral density. Moreover, we could use alternative spectral estimates (for example, non-parametric ones) or targets: the resulting spectrum- and signal-interfaces allow for a flexible implementation of general – hybrid – forecasting problems. Finally, the customization-interface allows one to emphasize particular filter characteristics, so as to align with research priorities. Our encompassing methodology will be illustrated on a hybrid signal extraction problem in the next section.

3 Replicating and Customizing a Generic Model-Based Approach

3.1 Empirical Design

We here propose a simple artificial framework based on an ideal low-pass target with cutoff π/12, given by Γ(ω; 0, π/12), and an AR(1) data-generating process (DGP) {x_t} satisfying

  x_t = a_1 x_{t−1} + ε_t,   (12)

with {ε_t} a white noise process. Our design may refer to real-time economic indicators: log-returns of macro-data² can be fitted more or less satisfactorily by benchmark AR(1) models, and users are typically interested in inferring business cycle or trend dynamics from such a design. Our artificial framework is deliberately over-simplified in order to highlight the salient features of the ATS-trilemma.
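The AR(1) design (12) is straightforward to simulate. A minimal sketch (the unit innovation variance and the stationary initialization are our own assumptions, made for illustration):

```python
import numpy as np

def simulate_ar1(a1, T, seed=None):
    """Simulate T observations of the AR(1) DGP (12) with |a1| < 1."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T)
    x = np.empty(T)
    x[0] = eps[0] / np.sqrt(1.0 - a1 ** 2)  # start in the stationary distribution
    for t in range(1, T):
        x[t] = a1 * x[t - 1] + eps[t]
    return x
```

The three DGPs of this section correspond to `a1 = -0.9, 0, 0.9`.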
Instead of the schematic criterion (11), we here rely on the slightly more general concept of weighting functions W1(ω), W2(ω) ≥ 0:

  ∫_{pass-band} (A(ω) − Â(ω))² h(ω) dω
  + 4 ∫_{pass-band} A(ω) Â(ω) sin²((Φ̂(ω) − Φ(ω))/2) W1(ω) h(ω) dω
  + ∫_{stop-band} (A(ω) − Â(ω))² W2(ω) h(ω) dω   (13)

(recall that the fourth term, the Residual, vanishes in our framework). We then select

  W1(ω; λ) = 1 + λ,
  W2(ω; η) = (1 + |ω| − π/12)^η,   (14)

where π/12 is the cutoff of the target signal. For λ = η = 0 the MSE criterion is obtained; λ > 0 emphasizes Timeliness, whereas η > 0 highlights Smoothness.

²Industrial production, employment or income, as proposed and used by the NBER.

3.2 Performance Measures

We rely on the following well-known time-domain measures for assessing out-of-sample performances of our real-time filter designs:

  Curvature := E[((1 − B)² ŷ_t)²] / var(ŷ_t)   (15)
  Peak-Correlation := arg max_j cor(y_t, ŷ_{t+j})   (16)
  Mean-Shift := E[φ̂(ω) | ω ∈ pass-band]   (17)
  Sample-MSE := E[(y_t − ŷ_t)²]   (18)

where φ̂(ω) = Φ̂(ω)/ω denotes the time-shift function of the real-time filter. Mean-square second-order differences emphasize the geometric curvature of a time series; smaller Curvature leads to a smoother appearance of a filtered series (e.g., having less noisy ripples). The proposed measure is normalized in order to immunize it against scaling effects. Note that we are not primarily interested in the magnitude of the correlations in (16), but in the integer j0 at which the correlation between the target y_t and the shifted (real-time) estimate ŷ_{t+j0} is maximized. The latter is called leading, coincident or lagging depending on j0 being positive, zero or negative, respectively. The Mean-Shift (17) broadly reflects the Peak-Correlation, though it is able to measure non-integer shifts. Finally, the Sample MSE is computed in order to assess the overall loss in mean-square performance entailed by customization.
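The two scale-invariant measures (15) and (16) can be sketched as follows; this is our own illustrative implementation, and the search window `max_lag` is an assumption:

```python
import numpy as np

def curvature(yhat):
    """Curvature (15): normalized mean-square second-order differences."""
    d2 = np.diff(yhat, n=2)
    return np.mean(d2 ** 2) / np.var(yhat)

def peak_correlation(y, yhat, max_lag=12):
    """Peak-Correlation (16): the integer j maximizing cor(y_t, yhat_{t+j})."""
    lags = list(range(-max_lag, max_lag + 1))
    cors = []
    for j in lags:
        if j >= 0:
            a, b = y[: len(y) - j], yhat[j:]
        else:
            a, b = y[-j:], yhat[: len(yhat) + j]
        cors.append(np.corrcoef(a, b)[0, 1])
    return lags[int(np.argmax(cors))]
```

Dividing by `np.var(yhat)` in `curvature` implements the normalization against scaling effects mentioned in the text.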
In contrast to the ATS components, all the above performance measures, except the Sample MSE, are invariant to scaling effects – multiplying a filter by a positive constant does not affect results. Since expectations are unknown, we shall report sample distributions (box-plots) of the above performance measures, distinguishing in- and out-of-sample periods. In particular, we shall benchmark four empirical filters – three customized and one MSE – against the best MSE filter (assuming knowledge of the true AR(1) DGP).

3.3 A Generic Model-Based Approach

Given a sample x_1, ..., x_T from {x_t}, the generic model-based approach (Cleveland (1972)) consists in forecasting and backcasting the missing data x_{T+k}, k > 0, and x_{1−k}, k > 0, in the target specification

  y_T = Σ_{k=−∞}^{∞} γ_k x_{T−k}.   (19)

For simplicity, we here assume that backcasts can be neglected (because the series will be sufficiently long). Assuming knowledge of the true AR(1) DGP in (12), the model-based real-time filter is then obtained as

  ŷ_T = Σ_{k=−∞}^{−1} γ_k x̂_{T−k} + Σ_{k=0}^{T−1} γ_k x_{T−k} + Σ_{k=T}^{∞} γ_k x̂_{T−k}
      ≈ Σ_{k=−∞}^{−1} γ_k x̂_{T−k} + Σ_{k=0}^{T−1} γ_k x_{T−k}
      = Σ_{k=−∞}^{−1} γ_k a_1^{|k|} x_T + Σ_{k=0}^{T−1} γ_k x_{T−k}
      = (Σ_{k=−∞}^{0} γ_k a_1^{|k|}) x_T + Σ_{k=1}^{T−1} γ_k x_{T−k},   (20)

where a_1^{|k|} x_T is the forecast of x_{T+|k|}. This procedure is generic in the sense that it extends straightforwardly to arbitrary (S)ARIMA processes. Also, it emphasizes explicitly the forecast-paradigm underlying classic model-based approaches. We now compare the above filter coefficients with those obtained by (13), assuming λ = η = 0 (continuous integrals are substituted by discrete sums). Figure 1 illustrates the outcome for three different DGPs based on a_1 = −0.9, 0, 0.9 in (12) (we set L = T = 120, so that filter and sample lengths correspond to ten years of monthly data).
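The collapsing of all forecasts onto the current observation x_T in (20) yields explicit concurrent coefficients: b_0 = Σ_{k≤0} γ_k a_1^{|k|} and b_j = γ_j for j ≥ 1. A sketch for the ideal low-pass target (the truncation point K and the function names are our own assumptions):

```python
import numpy as np

def ideal_lowpass_coeffs(K, cutoff):
    """Coefficients gamma_0 and gamma_k (k = 1..K) of the symmetric ideal
    low-pass (2) with eta1 = 0, eta2 = cutoff; gamma_{-k} = gamma_k."""
    k = np.arange(1, K + 1)
    return cutoff / np.pi, np.sin(cutoff * k) / (np.pi * k)

def mba_concurrent_coeffs(a1, L, cutoff, K=2000):
    """Real-time coefficients b implied by (20): future values x_{T+k} are
    replaced by their AR(1) forecasts a1^k x_T (backcasts neglected,
    infinite tail truncated at K terms)."""
    gamma0, gamma_pos = ideal_lowpass_coeffs(max(K, L), cutoff)
    b = np.zeros(L)
    # All forecast terms collapse onto the current value x_T:
    b[0] = gamma0 + np.sum(gamma_pos[:K] * a1 ** np.arange(1, K + 1))
    b[1:] = gamma_pos[: L - 1]   # past observations keep their target weights
    return b
```

For a1 = 0 all forecasts vanish and b_0 reduces to γ_0, which is why the effect of the DGP is concentrated at lag zero in Figure 1.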
Figure 1: Filter coefficients of real-time model-based (red, solid) and DFA-MSE (blue, dotted) filters: a1 = −0.9, 0, 0.9, L = 120

The DFA and model-based coefficients virtually overlap for all lags and all DGPs (a tiny departure can be observed at the rightmost end for a1 = 0.9), so that the backcasts in the latter can indeed be neglected. The effect of the DGP is visible at lag zero (leftmost). We now proceed to customization.

3.4 Customizing the Model-Based Approach: Known DGP

We first assume knowledge of the true DGP; this unrealistic assumption is relaxed in Section 3.5. Besides our benchmark MSE-filter (red line in the following figures), we propose to compute and analyze three customized designs: a "balanced" filter (green), which will outperform the MSE design in terms of Timeliness and Smoothness errors, as well as two unbalanced designs, emphasizing more heavily either the Timeliness (blue) or the Smoothness (cyan) error components. Amplitude and time-shift functions of all filters are plotted in Figures 2 (a1 = −0.9), 10 (a1 = 0) and 11 (a1 = 0.9).

Figure 2: MBA vs. customized filters, a1 = −0.9: amplitude and time-shifts (legend: Target; MBA-MSE; Lambda=30, eta=0.5; Lambda=30, eta=1; Lambda=500, eta=0.3)

Increasing λ tends to decrease the time-shift in the pass-band, as desired; as an example, the blue filter is a virtually zero-phase real-time filter. The empirical distributions of Curvature and Peak-Correlation in Figures 3, 12 and 13 confirm the intended effects (recall that both measures are insensitive to scaling effects): both dimensions can be addressed either individually (blue, cyan) or simultaneously (green) by suitable customization.
The frequency-domain Mean-Shift confirms the time-domain Peak-Correlations, as expected (note that the former is not a random variable since the DGP is known; all other statistics are random variables, though). Filters with larger η (green, cyan) are subject to stronger shrinkage; therefore MSE performances may benefit from scaling the outputs. Tables 1, 2 and 3 report true MSEs as well as the ATS-decompositions of the various designs for the three AR(1) processes: the Timeliness and Smoothness components are affected as desired by customization.

                      Accuracy   Timeliness  Smoothness  Residual  Total MSE
  MBA-MSE             0.004472   0.003312    0.003860    0.000000  0.011645
  Lambda=30, eta=0.5  0.019176   0.000085    0.000295    0.000000  0.019556
  Lambda=30, eta=1    0.021950   0.000150    0.000018    0.000000  0.022118
  Lambda=500, eta=0.3 0.018080   0.000001    0.000793    0.000000  0.018874

Table 1: Total MSE and ATS-error components: MSE vs. customized, a1 = −0.9

Figure 3: Empirical distributions of Curvature, Peak-Correlation, Mean-Shift and Sample MSE statistics (original and scaled): a1 = −0.9

To conclude, we plot filter outputs for a particular realization (the first of the sample) of each DGP; see Figures 4, 14 and 15³. The bottom plots compare MSE (red) and balanced (green) filters.

³A symmetric filter of length 120 is used for reference. All series are standardized for ease of visual inspection.
Figure 4: Filter outputs: final symmetric (black) vs. real-time MSE and customized designs, a1 = −0.9

Series with smaller Peak-Correlations or smaller Mean-Shifts tend to lie "to the left" of the MSE filter outputs (red), which indicates an anticipatory behavior, and series with smaller Curvature appear smoother, as desired.

3.5 Customizing the Model-Based Approach: Unknown DGP

The previous results rely on the unrealistic assumption that the true DGP is known. We now analyze empirical spectral estimates based either on estimated AR(1) models or on the non-parametric periodogram. All estimates rely on samples of length T = 120; out-of-sample performances are computed on samples of length 1000; empirical distributions of performances rely on 100 replications of this basic setting for each of the above AR(1) processes.

3.5.1 Empirical AR(1) Spectrum

In order to avoid identification issues we here assume that the model is known (i.e., a zero-mean AR(1)). For each realization an empirical AR(1) spectrum is obtained by relying on the estimated AR(1) coefficient. In- and out-of-sample empirical distributions of Curvature and Peak-Correlation are plotted in Figures 5 (a1 = −0.9), 16 (a1 = 0) and 17 (a1 = 0.9); corresponding distributions of sample MSEs are plotted in Figures 6, 18 and 19 (note that the scaling terms are truly out-of-sample too). Once again, scaling appears useful in terms of MSE performances if η is "large" (green, cyan). Out-of-sample distributions appear more tightly concentrated because the sample lengths are longer (1000 vs. 120 in-sample). The balanced design (green) clearly outperforms the MSE-filter in terms of smaller Peak-Correlation and Curvature measures, for all three processes and out-of-sample, at the cost of moderate losses in terms of MSE performance, especially after re-scaling.
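The empirical AR(1) spectrum of this section can be sketched as follows, assuming the coefficient and innovation variance are estimated by ordinary least squares (the estimator choice and function name are our own assumptions):

```python
import numpy as np

def ar1_spectrum_estimate(x, omega):
    """Empirical AR(1) spectrum (Section 3.5.1): plug OLS estimates of the
    coefficient and the innovation variance into the AR(1) density."""
    x = x - np.mean(x)
    a1_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)  # OLS slope estimate
    resid = x[1:] - a1_hat * x[:-1]
    sigma2_hat = np.mean(resid ** 2)
    return sigma2_hat / (2 * np.pi * np.abs(1 - a1_hat * np.exp(-1j * omega)) ** 2)
```

The resulting density replaces the true h(ω) in the discretized criterion; everything else in the optimization is unchanged.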
Figure 5: Empirical distributions of Curvature and Peak-Correlation based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1 = −0.9

Figure 6: Empirical distributions of Sample MSEs based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1 = −0.9

3.5.2 Non-parametric Spectrum

In contrast to the previous sections, we here ignore a priori knowledge about the true DGP and use a non-parametric spectral estimate, namely the raw periodogram. We rely on exactly the same empirical setting as in the previous section (same simulated data), but we set L = 24, and the periodogram is computed on the discrete frequency grid ω_k = k2π/T, k = −T/2, ..., T/2, where T = 120. The choice of the filter length reflects the maximal duration of components in the stop-band of the target filter (two years). We are also interested in assessing overfitting in this richly parametrized framework and its consequences on customization. In- and out-of-sample empirical distributions of Curvature and Peak-Correlation are plotted in Figures 7 (a1 = −0.9), 20 (a1 = 0) and 21 (a1 = 0.9), and corresponding MSEs of original and scaled series are plotted in Figures 8, 22 and 23.
In comparison to the previous AR(1) spectrum, out-of-sample MSE performances degrade slightly, empirical distributions are less tightly concentrated (especially for Peak-Correlation), and the customization effect appears somewhat shifted, i.e., Peak-Correlations are smaller and Curvatures are larger than those obtained in the previous section (parametric spectrum). The latter effect is due to the fact that the pass-band is short (only six discrete frequency ordinates, including frequency zero) and that the number of freely determined parameters, L = 24, is comparatively large; therefore, time-shift properties in the pass-band (Timeliness) can be emphasized more easily than amplitude characteristics in the stop-band (Smoothness). Despite overfitting, we observe that suitably customized filters (cyan) still outperform the MSE design in the targeted dimensions, out-of-sample. Once again, MSE performances benefit from scaling if the filters are shrunken (cyan). Considering the empirical framework, out-of-sample and in-sample MSE performances are remarkably consistent, and not too far distant from the model-based approach.
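The raw periodogram used above can be sketched as follows, normalized so that it estimates the spectral density h on the discrete grid; computing it via the FFT, and restricting to the non-negative frequencies (the negative half follows by symmetry), are our own implementation choices:

```python
import numpy as np

def periodogram(x):
    """Raw periodogram I(omega_k) = |sum_t x_t exp(-i omega_k t)|^2 / (2 pi T)
    on the non-negative half of the grid omega_k = 2 pi k / T, k = 0..T//2."""
    T = len(x)
    return np.abs(np.fft.rfft(x)) ** 2 / (2 * np.pi * T)
```

With this normalization the periodogram ordinates fluctuate around h(ω_k), so the estimate can be plugged into the discretized criterion exactly like a parametric density.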
Figure 7: Empirical distributions of Curvature and Peak-Correlation based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1 = −0.9

Figure 8: Empirical distributions of Sample MSEs (original and scaled series) based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1 = −0.9

In order to illustrate the above results in visual terms, we briefly compare the outputs of the best MSE-filter (assuming knowledge of the DGP), of the balanced customized model-based filter (Section 3.5.1: green filter) and of the non-parametric customized estimate (cyan filter in the previous graphs) for a particular realization (the first of the sample); see Figure 9.

Figure 9: Outputs of best MSE (red), customized model-based (green) and customized non-parametric (cyan) filters: a1 = 0

The customized filter outputs anticipate the MSE filter, and their appearance is smoother in the sense that noisy high-frequency ripples are damped more effectively.
4 Summary and Conclusion

Real-time signal extraction is a difficult prospective estimation problem which involves linear combinations of one- and multi-step ahead forecasts of a series. Tackling this problem 'directly', by the DFA, allows one to expand the (allpass-) forecast AT-dilemma into a richer ATS-trilemma. The resulting customized optimization criterion generalizes the classical MSE paradigm by allowing the user to navigate on the customization triangle according to specific research priorities. In particular, the Smoothness (curvature) and Timeliness (peak-correlation) performances of a real-time filter can be improved simultaneously, at the expense of Accuracy and total MSE. The DFA and the resulting ATS-trilemma are generic concepts in the sense that they allow for arbitrary target signals or spectral estimates. In particular, classical ARIMA-based approaches can be replicated exactly by supplying the corresponding entries. Once replicated, a particular approach can be customized to accommodate particular research priorities. Our examples confirm that the best MSE-filter, assuming knowledge of the true DGP, can be predictably outperformed in terms of curvature and peak-correlation out-of-sample by suitably customized parametric (model-based) or non-parametric (periodogram) designs.

5 Appendix

5.1 Empirical Distributions of Performances when the DGP is Known: the Cases a1 = 0 and a1 = 0.9

Figure 10: MBA vs. customized filters, a1=0: amplitude and time-shifts

Figure 11: MBA vs. customized filters, a1=0.9: amplitude and time-shifts

                      Accuracy   Timeliness  Smoothness  Residual   Total MSE
True MSE              0.001926   0.001084    0.002972    0.001243   0.007224
DFA-MSE               0.001836   0.000206    0.002355    0.001124   0.005521
Lambda=30,  eta=0.5   0.004725   0.000017    0.002880    0.000022   0.007644
Lambda=30,  eta=1     0.005414   0.000040    0.003044    0.000018   0.008515
Lambda=500, eta=0.3   0.004675   0.000001    0.003357    0.000000   0.008033

Table 2: Total MSE and ATS-error components: MSE vs. customized, a1=0

                      Accuracy   Timeliness  Smoothness  Residual   Total MSE
True MSE              0.005182   0.013786    0.048181    0.009774   0.076923
DFA-MSE               0.002714   0.020710    0.025249    0.000143   0.048817
Lambda=30,  eta=0.5   0.059663   0.000814    0.047194    0.000040   0.107712
Lambda=30,  eta=1     0.146754   0.002131    0.022384    0.000048   0.171317
Lambda=500, eta=0.3   0.067056   0.000039    0.068199    0.000007   0.135300

Table 3: Total MSE and ATS-error components: MSE vs. customized, a1=0.9

Figure 12: Empirical distributions of Curvature, Peak-Correlation, Mean-Shift and Sample MSE statistics (original and scaled): a1=0

Figure 13: Empirical distributions of Curvature, Peak-Correlation, Mean-Shift and Sample MSE statistics (original and scaled): a1=0.9
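The ATS-error components reported in Tables 2 and 3 split the total filter MSE into amplitude- and phase-error contributions over passband and stopband. The following numerical sketch is our own illustration (toy frequency grid, flat pseudo-spectrum, hypothetical filter; the split rests on the exact identity |Gamma - Gamma_hat|^2 = (A - A_hat)^2 + 4*A*A_hat*sin^2((Phi - Phi_hat)/2)):

```python
import numpy as np

def ats_decomposition(gamma, gamma_hat, h, passband):
    """Split sum_k |gamma_k - gamma_hat_k|^2 * h_k into the four ATS-error
    components, using the identity
      |G - Ghat|^2 = (A - Ahat)^2 + 4*A*Ahat*sin((Phi - Phihat)/2)**2,
    where A, Phi denote amplitude and phase.  Passband terms give Accuracy
    and Timeliness; stopband terms give Smoothness and the Residual."""
    A, Ahat = np.abs(gamma), np.abs(gamma_hat)
    dphi = np.angle(gamma) - np.angle(gamma_hat)
    amp_err = (A - Ahat) ** 2 * h                             # amplitude fit
    phase_err = 4.0 * A * Ahat * np.sin(dphi / 2.0) ** 2 * h  # phase (shift) fit
    stopband = ~passband
    return {"Accuracy": amp_err[passband].sum(),
            "Timeliness": phase_err[passband].sum(),
            "Smoothness": amp_err[stopband].sum(),
            "Residual": phase_err[stopband].sum()}

# Toy illustration: ideal lowpass target (cutoff pi/6, zero phase), a leaky
# and delayed hypothetical real-time filter, and a flat pseudo-spectrum.
w = np.linspace(0.0, np.pi, 61)
passband = w <= np.pi / 6
gamma = passband.astype(float)                           # symmetric target
gamma_hat = 0.9 * gamma * np.exp(-1j * 0.5 * w) + 0.05   # hypothetical filter
h = np.ones_like(w)
comp = ats_decomposition(gamma, gamma_hat, h, passband)
total = np.sum(np.abs(gamma - gamma_hat) ** 2 * h)
print(abs(sum(comp.values()) - total) < 1e-10)  # True: components sum to total
```

The pattern in the tables is reproduced qualitatively: suppressing the phase term in the passband (Timeliness) or the amplitude term in the stopband (Smoothness) necessarily inflates the remaining components, since the four pieces always add up to the total MSE.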
Figure 14: Filter outputs: final symmetric vs. real-time MSE and customized designs, a1=0

Figure 15: Filter outputs: final symmetric vs. real-time MSE and customized designs, a1=0.9

5.2 Empirical Distributions of Performances based on the Empirical AR(1) Spectrum: the Cases a1 = 0 and a1 = 0.9

Figure 16: Empirical distributions of Curvature and Peak-Correlation based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1=0

Figure 17: Empirical distributions of Curvature and Peak-Correlation based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1=0.9
Figure 18: Empirical distributions of Sample MSEs based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1=0

Figure 19: Empirical distributions of Sample MSEs based on the estimated AR(1) spectrum: in-sample (left plots) and out-of-sample (right plots) for a1=0.9

5.3 Empirical Distributions of Performances based on the Periodogram: the Cases a1 = 0 and a1 = 0.9

Figure 20: Empirical distributions of Curvature and Peak-Correlation based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1=0

Figure 21: Empirical distributions of Curvature and Peak-Correlation based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1=0.9
Figure 22: Empirical distributions of Sample MSEs (original and scaled series) based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1=0

Figure 23: Empirical distributions of Sample MSEs (original and scaled series) based on the periodogram: in-sample (left plots) and out-of-sample (right plots) for a1=0.9

5.4 Time-Shift in Frequency Zero

The time-shift $\hat{\Phi}(\omega)/\omega$ is subject to a singularity in frequency zero: a zero-over-zero quotient. However, one can apply first-order Taylor approximations to both terms of the quotient (l'Hôpital's rule), which leads to
\begin{equation}
\hat{\phi}(0) = \lim_{\omega \to 0} \frac{\hat{\Phi}(\omega)}{\omega}
= \left. \frac{d\hat{\Phi}(\omega)}{d\omega} \right|_{\omega=0}
= \frac{1}{-i\hat{A}(0)} \left. \frac{d\hat{\Gamma}(\omega)}{d\omega} \right|_{\omega=0}
= \frac{\sum_{j=0}^{L-1} j b_j}{\sum_{j=0}^{L-1} b_j}. \tag{21}
\end{equation}
The second and third equalities are obtained by looking at
\begin{align*}
-i \sum_{j=0}^{L-1} j b_j
&= \left. \frac{d\hat{\Gamma}(\omega)}{d\omega} \right|_{\omega=0} \\
&= \left. \frac{d\hat{A}(\omega)}{d\omega} \right|_{\omega=0} \exp(-i\hat{\Phi}(0))
 - i\hat{A}(0) \left. \frac{d\hat{\Phi}(\omega)}{d\omega} \right|_{\omega=0} \exp(-i\hat{\Phi}(0)) \\
&= -i\hat{A}(0) \left. \frac{d\hat{\Phi}(\omega)}{d\omega} \right|_{\omega=0}.
\end{align*}
The derivative of the amplitude vanishes in zero because the amplitude is a continuous even function, i.e., $\hat{A}(-\omega) = \hat{A}(\omega)$.

References

[1] Bell, W. and Hillmer, S. (1984) Issues involved with the seasonal adjustment of economic time series. Journal of Business and Economic Statistics 2, 291-320.
[2] Bell, W. and Martin, D. (2004) Computation of asymmetric signal extraction filters and mean squared error for ARIMA component models. Journal of Time Series Analysis 25, 603-625.
[3] Brockwell, P. and Davis, R. (1991) Time Series: Theory and Methods. New York: Springer.
[4] Christiano, L. and Fitzgerald, T. (2003) The band pass filter. International Economic Review 44, 435-465.
[5] Cleveland, W.P. (1972) Analysis and Forecasting of Seasonal Time Series. PhD thesis, University of Wisconsin-Madison.
[6] Koopman, S., Harvey, A., Doornik, J., and Shephard, N. (2000) Stamp 6.0: Structural Time Series Analyser, Modeller, and Predictor. London: Timberlake Consultants.
[7] Henderson, R. (1916) Note on graduation by adjusted average. Transactions of the Actuarial Society of America 17, 43-48.
[8] Hodrick, R. and Prescott, E. (1997) Postwar U.S. business cycles: an empirical investigation. Journal of Money, Credit, and Banking 29, 1-16.
[9] Maravall, A. and Caparello, G. (2004) Program TSW: Revised Reference Manual. Working paper, Research Department, Bank of Spain. http://www.bde.es.
[10] Wildi, M. and McElroy, T. (2013) Direct Filter Approach and the Trilemma between Accuracy, Timeliness and Smoothness in Real-Time Forecasting and Signal Extraction. Working paper. http://blog.zhaw.ch/idp/sefblog/uploads/DFA6.pdf
[11] Wildi, M. (2005) Signal Extraction: Efficient Estimation, Unit-Root Tests and Early Detection of Turning Points. Lecture Notes in Economics and Mathematical Systems 547. Berlin: Springer.
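As a numerical cross-check of the time-shift formula (21) in Section 5.4, the following sketch (our own illustration; the filter coefficients are chosen arbitrarily) compares the closed form sum_j j*b_j / sum_j b_j with the finite-frequency ratio Phi_hat(w)/w at a small frequency w.

```python
import numpy as np

def time_shift_at_zero(b):
    """phi_hat(0) = sum_j j*b_j / sum_j b_j, as in formula (21)."""
    j = np.arange(len(b))
    return np.dot(j, b) / np.sum(b)

# Cross-check against Phi_hat(w)/w at a small frequency, with
# Gamma_hat(w) = sum_j b_j exp(-i j w) = A_hat(w) exp(-i Phi_hat(w))
b = np.array([0.4, 0.3, 0.2, 0.1])      # illustrative coefficients only
w = 1e-6
gamma_hat = np.sum(b * np.exp(-1j * w * np.arange(len(b))))
phase_over_w = -np.angle(gamma_hat) / w  # Phi_hat(w)/w
print(round(time_shift_at_zero(b), 6), round(phase_over_w, 6))  # 1.0 1.0
```

Both quantities agree, confirming that the zero-over-zero singularity of the time-shift resolves to a simple weighted average of the filter lags.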