Spectral Analysis
Jesus Fernandez-Villaverde
University of Pennsylvania
1
Why Spectral Analysis?
We want to develop a theory to obtain the business cycle properties of the data (Burns and Mitchell, 1946).
General problem of signal extraction.
We will use some basic results in spectral (or harmonic) analysis.
Then, we will develop the idea of a filter.
2
Riesz-Fischer Theorem
First we will show that there is an intimate link between $L^2[-\pi,\pi]$ and $\ell^2(-\infty,\infty)$.
Why? Because we want to represent a time series in two different ways.
Riesz-Fischer Theorem: Let $\{c_n\}_{n=-\infty}^{\infty}\in\ell^2(-\infty,\infty)$. Then, there exists a function $f(\omega)\in L^2[-\pi,\pi]$ such that:
$$f(\omega)=\sum_{j=-\infty}^{\infty}c_je^{-i\omega j}$$
This function is called the Fourier transform of the series.
3
Properties
Its finite approximations converge to the infinite series in the mean square norm:
$$\lim_{n\to\infty}\int_{-\pi}^{\pi}\left|\sum_{j=-n}^{n}c_je^{-i\omega j}-f(\omega)\right|^2 d\omega=0$$
It satisfies Parseval's relation:
$$\int_{-\pi}^{\pi}|f(\omega)|^2\, d\omega=2\pi\sum_{n=-\infty}^{\infty}|c_n|^2$$
Inversion formula:
$$c_k=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\omega)e^{i\omega k}\, d\omega$$
4
Harmonics
The functions $\left\{e^{-i\omega j}\right\}_{j=-\infty}^{\infty}$ are called harmonics and constitute an orthonormal base in $L^2[-\pi,\pi]$.
The orthonormality follows directly from:
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-i\omega j}e^{i\omega k}\, d\omega=0\ \text{ if }j\neq k\qquad\text{and}\qquad\frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-i\omega j}e^{i\omega j}\, d\omega=1$$
The fact that they constitute a base is given by the second theorem, which goes in the opposite direction to Riesz-Fischer: given any function in $L^2[-\pi,\pi]$ we can find an associated sequence in $\ell^2(-\infty,\infty)$.
5
Converse Riesz-Fischer Theorem
Let $f(\omega)\in L^2[-\pi,\pi]$. Then there exists a sequence $\{c_n\}_{n=-\infty}^{\infty}\in\ell^2(-\infty,\infty)$ such that:
$$f(\omega)=\sum_{j=-\infty}^{\infty}c_je^{-i\omega j}$$
where:
$$c_k=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\omega)e^{i\omega k}\, d\omega$$
and finite approximations converge to the infinite series in the mean square norm:
$$\lim_{n\to\infty}\int_{-\pi}^{\pi}\left|\sum_{j=-n}^{n}c_je^{-i\omega j}-f(\omega)\right|^2 d\omega=0$$
6
Remarks I
The Riesz-Fischer theorem and its converse assure us that the Fourier transform is a bijection from $\ell^2(-\infty,\infty)$ into $L^2[-\pi,\pi]$.
The mapping is an isometric isomorphism since it preserves linearity and distance: for any two series $\{x_n\}_{n=-\infty}^{\infty},\{y_n\}_{n=-\infty}^{\infty}\in\ell^2(-\infty,\infty)$ with Fourier transforms $x(\omega)=\sum_{j=-\infty}^{\infty}x_je^{-i\omega j}$ and $y(\omega)$ we have:
$$x(\omega)+y(\omega)=\sum_{j=-\infty}^{\infty}\left(x_j+y_j\right)e^{-i\omega j}$$
$$\left(\frac{1}{2\pi}\int_{-\pi}^{\pi}|x(\omega)-y(\omega)|^2\, d\omega\right)^{1/2}=\left(\sum_{k=-\infty}^{\infty}|x_k-y_k|^2\right)^{1/2}$$
7
Remarks II
There is a limit to the amount of information in a series.
We either have concentration in the original series or concentration in the Fourier transform.
This limit is given by:
$$\left(\sum_{j=-\infty}^{\infty}j^2\left|c_j\right|^2\right)\left(\int_{-\infty}^{\infty}\omega^2|f(\omega)|^2\, d\omega\right)\geq\frac{1}{4}\left(\sum_{j=-\infty}^{\infty}\left|c_j\right|^2\right)^2$$
This inequality is called the Robertson-Schrödinger relation.
8
Stochastic Process
$\{X_t:\Omega\to\mathbb{R}^m,\ m\in\mathbb{N},\ t=1,2,\ldots\}$ is a stochastic process defined on a complete probability space $(\Omega,\Im,P)$ where:
1. $\Omega=\mathbb{R}^{m\cdot\infty}\equiv\lim_{T\to\infty}\times_{t=0}^{T}\mathbb{R}^m$
2. $\Im\equiv\lim_{T\to\infty}\Im_T\equiv\lim_{T\to\infty}\times_{t=0}^{T}\mathcal{B}\left(\mathbb{R}^m\right)=\mathcal{B}\left(\mathbb{R}^{m\cdot\infty}\right)$ is just the Borel $\sigma$-algebra generated by the measurable finite-dimensional product cylinders.
3. $P^T(B)\equiv P(B)|_{\Im_T}\equiv P\left(X^T\in B\right),\ \forall B\in\Im_T$.
Define a $T$ segment as $X^T\equiv\left(X_1',\ldots,X_T'\right)'$ with $X^0=\{\emptyset\}$ and a realization of that segment as $x^T\equiv\left(x_1',\ldots,x_T'\right)'$.
9
Moments
Moments, if they exist, can be defined in the usual way:
$$\mu_t=Ex_t=\int_{\mathbb{R}^m}x_t\, dP^T$$
$$\gamma_{tj}=E\left(x_t-\mu_t\right)\left(x_{t-j}-\mu_{t-j}\right)'=\int_{\mathbb{R}^{m(j+1)}}\left(x_t-\mu_t\right)\left(x_{t-j}-\mu_{t-j}\right)'dP^T$$
If both $\mu_t$ and $\gamma_{tj}$ are independent of $t$ for all $j$, $X$ is a covariance-stationary or weakly stationary process.
Now, let us deal with $X$: a covariance-stationary process with zero mean (the process can always be renormalized to satisfy this requirement).
10
z Transform
If $\sum_{j=-\infty}^{\infty}\left|\gamma_j\right|<\infty$, define the operator $g_x:\mathbb{C}\to\mathbb{C}$:
$$g_x(z)=\sum_{j=-\infty}^{\infty}\gamma_jz^j$$
where $\mathbb{C}$ is the set of complex numbers.
This mapping is known as the autocovariance generating function.
Dividing this function by $2\pi$ and evaluating it at $e^{-i\omega}$ (where $\omega$ is a real scalar), we get the spectrum (or power spectrum) of the process $x_t$:
$$s_x(\omega)=\frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\gamma_je^{-i\omega j}$$
11
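As a quick numerical illustration (my own sketch, not part of the original slides, with arbitrary parameter values), the Python code below truncates the sum defining $s_x(\omega)$ for an AR(1) process with autocovariances $\gamma_j=\sigma^2\rho^{|j|}/(1-\rho^2)$ and compares it with the known closed form $s_x(\omega)=\frac{\sigma^2}{2\pi}\frac{1}{1+\rho^2-2\rho\cos\omega}$.

import numpy as np

rho, sigma2 = 0.7, 1.0           # illustrative AR(1) parameters
J = 200                          # truncation point for the infinite sum
omega = np.linspace(0, np.pi, 100)

# Autocovariances gamma_j = sigma2 * rho^|j| / (1 - rho^2)
j = np.arange(-J, J + 1)
gamma = sigma2 * rho ** np.abs(j) / (1 - rho ** 2)

# Truncated spectrum s_x(w) = (1/2pi) sum_j gamma_j e^{-i w j} (real by symmetry)
s_trunc = np.real(gamma @ np.exp(-1j * np.outer(j, omega))) / (2 * np.pi)

# Closed form for the AR(1) spectrum
s_exact = sigma2 / (2 * np.pi * (1 + rho ** 2 - 2 * rho * np.cos(omega)))

print(np.max(np.abs(s_trunc - s_exact)))   # truncation error is tiny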
Spectrum
The spectrum is the Fourier transform of the covariances of the process, divided by $2\pi$ so that integrating $s_x(\cdot)$ over $[-\pi,\pi]$ recovers $\gamma_0$.
The spectrum and the autocovariances are equivalent: there is no information in one that is not present in the other.
That does not mean that having two representations of the same information is useless: some characteristics of the series, such as its serial correlation, are easier to grasp from the autocovariances, while others, such as its unobserved components (the different fluctuations that compose the series), are much easier to understand in the spectrum.
12
Example
General ARMA model: $\phi(L)x_t=\theta(L)\varepsilon_t$.
Since $x_t=\sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}$, an MA($\infty$) representation of $x_t$ in the lag operator $\psi(L)$ with innovation variance $\sigma^2$ satisfies:
$$g_x(L)=\sigma^2|\psi(L)|^2=\sigma^2\psi(L)\psi\left(L^{-1}\right)$$
Wold's theorem assures us that we can write any stationary ARMA as an MA($\infty$), and since we can always make:
$$\psi(L)=\frac{\theta(L)}{\phi(L)}$$
we will have:
$$g_x(L)=\sigma^2\frac{\theta(L)\theta\left(L^{-1}\right)}{\phi(L)\phi\left(L^{-1}\right)}$$
and then the spectrum is given by:
$$s_x(\omega)=\frac{\sigma^2}{2\pi}\frac{\theta\left(e^{-i\omega}\right)\theta\left(e^{i\omega}\right)}{\phi\left(e^{-i\omega}\right)\phi\left(e^{i\omega}\right)}$$
13
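A small sketch (my own illustration, with arbitrary ARMA(1,1) coefficients) of evaluating this formula numerically: the lag polynomials are evaluated directly at $e^{-i\omega}$.

import numpy as np

def arma_spectrum(phi, theta, sigma2, omega):
    """Spectrum of phi(L) x_t = theta(L) eps_t, with phi(L) = 1 - phi_1 L - ...
    and theta(L) = 1 + theta_1 L + ..., at the frequencies in omega."""
    z = np.exp(-1j * omega)                                    # e^{-i w}
    phi_z = 1 - sum(p * z ** (k + 1) for k, p in enumerate(phi))
    theta_z = 1 + sum(t * z ** (k + 1) for k, t in enumerate(theta))
    return sigma2 / (2 * np.pi) * np.abs(theta_z) ** 2 / np.abs(phi_z) ** 2

omega = np.linspace(0, np.pi, 200)
s = arma_spectrum(phi=[0.9], theta=[0.5], sigma2=1.0, omega=omega)
print(s[:3])   # this ARMA(1,1) has most of its power at low frequencies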
Working on the Spectrum
Since $\gamma_j=\gamma_{-j}$:
$$s_x(\omega)=\frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\gamma_je^{-i\omega j}=\frac{1}{2\pi}\left(\gamma_0+\sum_{j=1}^{\infty}\gamma_j\left[e^{-i\omega j}+e^{i\omega j}\right]\right)=\frac{1}{2\pi}\left(\gamma_0+2\sum_{j=1}^{\infty}\gamma_j\cos(\omega j)\right)$$
by Euler's formula ($\frac{e^{iz}+e^{-iz}}{2}=\cos z$).
Hence:
1. The spectrum is always real-valued.
2. It is the sum of an infinite number of cosines.
3. Since $\cos(\omega)=\cos(-\omega)=\cos(\omega+2\pi k)$, $k=0,\pm 1,\pm 2,\ldots$, the spectrum is symmetric around $0$ and all the relevant information is concentrated in the interval $[0,\pi]$.
14
Spectral Density Function
Sometimes the autocovariance generating function is replaced by the autocorrelation generating function (where every term is divided by $\gamma_0$).
Then, we get the spectral density function:
$$s_x^0(\omega)=\frac{1}{2\pi}\left(1+2\sum_{j=1}^{\infty}\rho_j\cos(\omega j)\right)$$
where $\rho_j=\gamma_j/\gamma_0$.
15
Properties of Periodic Functions
Take the modified cosine function:
$$y_j=A\cos(\omega j-\theta)$$
$\omega$ (measured in radians): angular frequency (or harmonic or simply the frequency).
$2\pi/\omega$: period or whole cycle of the function.
$A$: amplitude or range of the function.
$\theta$: phase, or how much the function is translated along the horizontal axis.
16
Di erent Periods
$\omega=0$: the period of fluctuation is infinite, i.e. the frequency associated with a trend (stochastic or deterministic).
$\omega=\pi$ (the Nyquist frequency): the period is 2 units of time, the minimal possible observation of a fluctuation.
Business cycle fluctuations: usually defined as fluctuations between 6 and 32 quarters. Hence, the frequencies of interest are those comprised between $\pi/3$ and $\pi/16$.
We can have cycles with higher frequency than $\pi$. When sampling is not continuous, these higher frequencies will be imputed to the frequencies between $0$ and $\pi$. This phenomenon is known as aliasing.
17
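A tiny numerical illustration of aliasing (my own example, not from the slides): a cosine of frequency $2\pi-0.3$ sampled at integer dates is indistinguishable from a cosine of frequency $0.3$.

import numpy as np

t = np.arange(50)                      # integer sampling dates
high = np.cos((2 * np.pi - 0.3) * t)   # frequency above the Nyquist frequency pi
low = np.cos(0.3 * t)                  # its alias inside [0, pi]
print(np.allclose(high, low))          # True: the two sampled series coincide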
Inversion
Using the inversion formula, we can find all the covariances from the spectrum:
$$\int_{-\pi}^{\pi}s_x(\omega)e^{i\omega j}\, d\omega=\int_{-\pi}^{\pi}s_x(\omega)\cos(\omega j)\, d\omega=\gamma_j$$
When $j=0$, the variance of the series is:
$$\int_{-\pi}^{\pi}s_x(\omega)\, d\omega=\gamma_0$$
Alternative interpretation of the spectrum: its integral over $[-\pi,\pi]$ is the variance of the series.
In general, for some $\omega_1$ and $\omega_2$:
$$\int_{\omega_1}^{\omega_2}s_x(\omega)\, d\omega$$
is the variance associated with frequencies in the interval $[\omega_1,\omega_2]$.
Intuitive interpretation: decomposition of the variance of the process.
18
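As a check (my own sketch, reusing the AR(1) closed-form spectrum from the earlier example), numerically integrating $s_x(\omega)$ over $[-\pi,\pi]$ should return the variance $\gamma_0=\sigma^2/(1-\rho^2)$.

import numpy as np

rho, sigma2 = 0.7, 1.0
omega = np.linspace(-np.pi, np.pi, 20001)
s_x = sigma2 / (2 * np.pi * (1 + rho ** 2 - 2 * rho * np.cos(omega)))

gamma0 = np.sum(s_x) * (omega[1] - omega[0])   # Riemann sum of the spectrum
print(gamma0, sigma2 / (1 - rho ** 2))         # both are approximately 1.9608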
Spectral Representation Theorem
Any stationary process can be written in its Cramér representation:
$$x_t=\int_0^{\pi}u(\omega)\cos(\omega t)\, d\omega+\int_0^{\pi}v(\omega)\sin(\omega t)\, d\omega$$
A more general representation is given by:
$$x_t=\int_{-\pi}^{\pi}\xi(\omega)\, d\omega$$
where $\xi(\cdot)$ is any function that satisfies an orthogonality condition:
$$\int\int\xi(\omega_1)\xi(\omega_2)'\, d\omega_1d\omega_2=0\quad\text{for }\omega_1\neq\omega_2$$
19
Relating Two Time Series
Take a zero mean, covariance stationary random process $Y$ with realization $y_t$ and project it against the process $X$, with realization $x_t$:
$$y_t=B(L)x_t+\varepsilon_t$$
where
$$B(L)=\sum_{j=-\infty}^{\infty}b_jL^j$$
and $Ex_{t-j}\varepsilon_t=0$ for all $j$.
Adapting our previous notation for covariances to distinguish between different stationary processes, we define:
$$\gamma_j^l=El_tl_{t-j}$$
20
Thus:
$$y_ty_{t-j}=\left(\sum_{s=-\infty}^{\infty}b_sx_{t-s}\right)\left(\sum_{r=-\infty}^{\infty}b_rx_{t-j-r}\right)+\left(\sum_{s=-\infty}^{\infty}b_sx_{t-s}\right)\varepsilon_{t-j}+\left(\sum_{r=-\infty}^{\infty}b_rx_{t-j-r}\right)\varepsilon_t+\varepsilon_t\varepsilon_{t-j}$$
Taking expected values on both sides, the orthogonality principle implies that:
$$\gamma_j^y=\sum_{s=-\infty}^{\infty}\sum_{r=-\infty}^{\infty}b_sb_r\gamma_{j+r-s}^x+\gamma_j^{\varepsilon}$$
With these covariances, the computation of the spectrum of $y$ is direct:
$$s_y(\omega)=\frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\gamma_j^ye^{-i\omega j}=\frac{1}{2\pi}\sum_{j,s,r=-\infty}^{\infty}b_sb_r\gamma_{j+r-s}^xe^{-i\omega j}+\frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\gamma_j^{\varepsilon}e^{-i\omega j}$$
21
If we define $h=j+r-s$:
$$e^{-i\omega j}=e^{-i\omega(h-r+s)}=e^{-i\omega h}e^{-i\omega s}e^{i\omega r}$$
we get:
$$s_y(\omega)=\frac{1}{2\pi}\sum_{r=-\infty}^{\infty}b_re^{i\omega r}\sum_{s=-\infty}^{\infty}b_se^{-i\omega s}\sum_{h=-\infty}^{\infty}\gamma_h^xe^{-i\omega h}+\frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\gamma_j^{\varepsilon}e^{-i\omega j}$$
$$=B\left(e^{i\omega}\right)B\left(e^{-i\omega}\right)s_x(\omega)+s_{\varepsilon}(\omega)$$
Using the symmetry of the spectrum:
$$s_y(\omega)=\left|B\left(e^{-i\omega}\right)\right|^2s_x(\omega)+s_{\varepsilon}(\omega)$$
22
LTI Filters
Given a process $X$, a linear time-invariant filter (LTI filter) is an operator from the space of sequences into itself that generates a new process $Y$ of the form:
$$y_t=B(L)x_t$$
Since the transformation is deterministic, $s_{\varepsilon}(\omega)=0$ and we get:
$$s_y(\omega)=\left|B\left(e^{-i\omega}\right)\right|^2s_x(\omega)$$
$B\left(e^{-i\omega}\right)$: the frequency transform (or frequency response function) is the Fourier transform of the coefficients of the lag operator.
$G(\omega)=\left|B\left(e^{-i\omega}\right)\right|$ is the gain (its modulus).
$\left|B\left(e^{-i\omega}\right)\right|^2$ is the power transfer function (since $\left|B\left(e^{-i\omega}\right)\right|^2$ is a quadratic form, it is a real function). It indicates how much the spectrum of the series is changed at each particular frequency.
23
Gain and Phase
The definition of gain implies that the filtered series has zero variance at $\omega=0$ if and only if $B\left(e^{-i0}\right)=\sum_{j=-\infty}^{\infty}b_je^{-i0\cdot j}=\sum_{j=-\infty}^{\infty}b_j=0$.
Since the frequency response is a complex mapping, we can write:
$$B\left(e^{-i\omega}\right)=B_a(\omega)+iB_b(\omega)$$
where $B_a(\omega)$ and $B_b(\omega)$ are both real functions.
Then we define the phase of the filter:
$$\theta(\omega)=\tan^{-1}\left(\frac{B_b(\omega)}{B_a(\omega)}\right)$$
The phase measures how much the series changes its position with respect to time when the filter is applied. For a given $\theta(\omega)$, the filter shifts the series by $\theta(\omega)/\omega$ time units.
24
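A short sketch (my own, for an arbitrary one-sided filter $y_t=x_t+0.5x_{t-1}$) of computing the gain and phase numerically from the frequency response $B(e^{-i\omega})$; arctan2 is used instead of the plain arctangent to keep track of the quadrant.

import numpy as np

b = {0: 1.0, 1: 0.5}                  # filter y_t = x_t + 0.5 x_{t-1}
omega = np.linspace(1e-6, np.pi, 500)

# Frequency response B(e^{-iw}) = sum_j b_j e^{-i w j}
B = sum(coef * np.exp(-1j * omega * j) for j, coef in b.items())

gain = np.abs(B)                      # G(w) = |B(e^{-iw})|
phase = np.arctan2(B.imag, B.real)    # theta(w), in radians
shift = phase / omega                 # time shift in periods
print(gain[0], shift[0])              # gain near 1.5, shift near -1/3 at low frequencies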
Symmetric Filters I
A filter is symmetric if $b_j=b_{-j}$:
$$B(L)=b_0+\sum_{j=1}^{\infty}b_j\left(L^j+L^{-j}\right)$$
Symmetric filters are important for two reasons:
1. They do not induce a phase shift, since the Fourier transform of $B(L)$ will always be a real function.
2. Corners of a $T$ segment of the series are difficult to deal with, since the lag operator can only be applied to one side.
25
Symmetric Filters II
If $\sum_{j=-\infty}^{\infty}b_j=0$, symmetric filters can remove trend frequency components, either deterministic or stochastic, up to second order (a quadratic deterministic trend or a double unit root):
$$\sum_{j=-\infty}^{\infty}b_jL^j=\sum_{j=1}^{\infty}b_j\left(L^j+L^{-j}-2\right)=-\sum_{j=1}^{\infty}b_j\left(1-L^j\right)\left(1-L^{-j}\right)$$
$$=-(1-L)\left(1-L^{-1}\right)\sum_{j=1}^{\infty}b_j\sum_{h=-j+1}^{j-1}(j-|h|)L^h=(1-L)\left(1-L^{-1}\right)B'(L)$$
where we used $1-L^j=(1-L)\left(1+L+\cdots+L^{j-1}\right)$ and
$$\left(1+L+\cdots+L^{j-1}\right)\left(1+L^{-1}+\cdots+L^{-(j-1)}\right)=\sum_{h=-j+1}^{j-1}(j-|h|)L^h$$
if the sum is well defined.
26
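A quick numerical check (my own sketch with an arbitrary three-term symmetric filter whose weights sum to zero): the implied factor $(1-L)(1-L^{-1})$ removes a linear trend exactly and reduces a quadratic trend to a constant, so no trending behavior survives.

import numpy as np

# Symmetric weights for lags (-1, 0, +1); they sum to zero, so B(L) contains
# the factor (1 - L)(1 - L^{-1})
b = np.array([0.25, -0.5, 0.25])

t = np.arange(100.0)
linear = 2.0 + 0.3 * t                       # linear deterministic trend
quadratic = 2.0 + 0.3 * t + 0.05 * t ** 2    # quadratic deterministic trend

# 'valid' convolution applies the filter away from the sample corners
print(np.allclose(np.convolve(linear, b, mode="valid"), 0.0))   # True: removed
y = np.convolve(quadratic, b, mode="valid")
print(np.allclose(y, y[0]))   # True: the quadratic trend collapses to a constant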
Ideal Filters
An ideal filter is an operator $B(L)$ such that the new process $Y$ only has positive spectrum in some specified part of the domain.
Example: for a band-pass filter on the interval $\{(a,b)\cup(-b,-a)\}\subset(-\pi,\pi)$ with $0<a\leq b\leq\pi$, we need to choose $B\left(e^{-i\omega}\right)$ such that:
$$B\left(e^{-i\omega}\right)=\begin{cases}1, & \text{for }\omega\in(a,b)\cup(-b,-a)\\ 0, & \text{otherwise}\end{cases}$$
Since $a>0$, this definition implies that $B\left(e^{-i0}\right)=0$. Thus, a band-pass filter shuts off all the frequencies outside the region $(a,b)\cup(-b,-a)$ and leaves a new process that only fluctuates in the area of interest.
27
How Do We Build an Ideal Filter?
Since $X$ is a zero mean, covariance stationary process, $\sum_{n=-\infty}^{\infty}|x_n|^2<\infty$, the Riesz-Fischer theorem holds.
If we make $f(\omega)=B\left(e^{-i\omega}\right)$, we can use the inversion formula to set the $b_j$:
$$b_j=\frac{1}{2\pi}\int_{-\pi}^{\pi}B\left(e^{-i\omega}\right)e^{i\omega j}\, d\omega$$
Substituting $B\left(e^{-i\omega}\right)$ by its value:
$$b_j=\frac{1}{2\pi}\int_{-\pi}^{\pi}B\left(e^{-i\omega}\right)e^{i\omega j}\, d\omega=\frac{1}{2\pi}\left(\int_{-b}^{-a}e^{i\omega j}\, d\omega+\int_a^be^{i\omega j}\, d\omega\right)=\frac{1}{2\pi}\int_a^b\left(e^{i\omega j}+e^{-i\omega j}\right)d\omega$$
where the second step just follows from $\int_{-b}^{-a}e^{i\omega j}\, d\omega=\int_a^be^{-i\omega j}\, d\omega$.
b
Building an Ideal Filter
Now, using again Euler's formula,
$$b_j=\frac{1}{2\pi}\int_a^b2\cos(\omega j)\, d\omega=\frac{1}{\pi j}\sin(\omega j)\Big|_a^b=\frac{\sin jb-\sin ja}{\pi j}\quad\forall j\in\mathbb{N}\setminus\{0\}$$
Since $\sin(-z)=-\sin(z)$,
$$b_{-j}=\frac{\sin(-jb)-\sin(-ja)}{-\pi j}=\frac{\sin jb-\sin ja}{\pi j}=b_j$$
Also:
$$b_0=\frac{1}{2\pi}\int_a^b2\cos(\omega\cdot 0)\, d\omega=\frac{b-a}{\pi}$$
and we have all the coefficients to write:
$$y_t=\left(\frac{b-a}{\pi}+\sum_{j=1}^{\infty}\frac{\sin jb-\sin ja}{\pi j}\left(L^j+L^{-j}\right)\right)x_t$$
The coefficients $b_j$ converge to zero at a sufficient rate to make the sum well defined in the reals.
29
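A minimal sketch (my own, using the business cycle band of 6 to 32 quarters) of computing a truncated set of these ideal band-pass coefficients and checking that their gain is close to one inside the band and close to zero outside it; the truncation point K is arbitrary.

import numpy as np

p_low, p_high = 6, 32                    # cycle lengths in quarters
a, b = 2 * np.pi / p_high, 2 * np.pi / p_low
K = 200                                  # truncation (the ideal filter needs K = infinity)

j = np.arange(1, K + 1)
bj = (np.sin(j * b) - np.sin(j * a)) / (np.pi * j)
b0 = (b - a) / np.pi
weights = np.concatenate([bj[::-1], [b0], bj])   # b_{-K}, ..., b_0, ..., b_K

def gain(omega):
    lags = np.arange(-K, K + 1)
    return np.abs(np.sum(weights * np.exp(-1j * omega * lags)))

print(gain(2 * np.pi / 12), gain(2 * np.pi / 60))   # ~1 inside the band, ~0 outside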
Building a Low-Pass Filter
Analogously, a low-pass filter allows only the low frequencies (those in $(-b,b)$), implying a choice of $B\left(e^{-i\omega}\right)$ such that:
$$B\left(e^{-i\omega}\right)=\begin{cases}1, & \text{for }\omega\in(-b,b)\\ 0, & \text{otherwise}\end{cases}$$
Just set $a=0$:
$$b_j=b_{-j}=\frac{\sin jb}{\pi j}\quad\forall j\in\mathbb{N}\setminus\{0\},\qquad b_0=\frac{b}{\pi}$$
$$y_t=\left(\frac{b}{\pi}+\sum_{j=1}^{\infty}\frac{\sin jb}{\pi j}\left(L^j+L^{-j}\right)\right)x_t$$
30
Building a High-Pass Filter
Finally, a high-pass filter allows only the high frequencies:
$$B\left(e^{-i\omega}\right)=\begin{cases}1, & \text{for }\omega\in(b,\pi)\cup(-\pi,-b)\\ 0, & \text{otherwise}\end{cases}$$
This is just the complement of a low-pass filter on $(-b,b)$:
$$y_t=\left(1-\frac{b}{\pi}-\sum_{j=1}^{\infty}\frac{\sin jb}{\pi j}\left(L^j+L^{-j}\right)\right)x_t$$
31
Finite Sample Approximations
With real, finite data, it is not possible to apply any of the previous formulae, since they require an infinite amount of observations.
Finite sample approximations are thus required.
We will study two approximations:
1. the Hodrick-Prescott filter
2. the Baxter-King filters.
We will be concerned with the minimization of several problems:
1. Leakage: the filter passes frequencies that it was designed to eliminate.
2. Compression: the gain of the filter is less than one at the desired frequencies.
3. Exacerbation: the gain of the filter is more than one at the desired frequencies.
32
HP Filter
The Hodrick-Prescott (HP) filter:
1. It is a coarse and relatively uncontroversial procedure to represent the behavior of macroeconomic variables and their comovements.
2. It provides a benchmark of regularities to evaluate the comparative performance of different models.
Suppose that we have $T$ observations of the stochastic process $X$, $\{x_t\}_{t=1}^{T}$.
The HP filter decomposes the observations into the sum of a trend component $x_t^t$ and a cyclical component $x_t^c$:
$$x_t=x_t^t+x_t^c$$
33
Minimization Problem
How? Solve:
$$\min_{\{x_t^t\}_{t=1}^{T}}\sum_{t=1}^{T}\left(x_t-x_t^t\right)^2+\lambda\sum_{t=2}^{T-1}\left[\left(x_{t+1}^t-x_t^t\right)-\left(x_t^t-x_{t-1}^t\right)\right]^2$$
Intuition.
Meaning of $\lambda$:
1. $\lambda=0\Rightarrow$ trivial solution ($x_t^t=x_t$).
2. $\lambda=\infty\Rightarrow$ linear trend.
34
Matrix Notation
To compute the HP filter, it is easier to use matrix notation and rewrite the minimization problem as:
$$\min_{x^t}\left(x-x^t\right)'\left(x-x^t\right)+\lambda\left(Ax^t\right)'\left(Ax^t\right)$$
where $x=(x_1,\ldots,x_T)'$, $x^t=\left(x_1^t,\ldots,x_T^t\right)'$ and:
$$A=\begin{pmatrix}1 & -2 & 1 & 0 & \cdots & \cdots & 0\\ 0 & 1 & -2 & 1 & \ddots & & \vdots\\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots\\ \vdots & & \ddots & 1 & -2 & 1 & 0\\ 0 & \cdots & \cdots & 0 & 1 & -2 & 1\end{pmatrix}_{(T-2)\times T}$$
35
Solution
First order condition:
$$-\left(x-x^t\right)+\lambda A'Ax^t=0$$
or
$$x^t=\left(I+\lambda A'A\right)^{-1}x$$
$I+\lambda A'A$ is a sparse matrix (with density factor $(5T-6)/T^2$).
We can exploit this property.
36
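A compact sketch (my own implementation of the formula above, using scipy's sparse routines and the conventional $\lambda=1600$ for quarterly data) that builds $A$, solves for the trend, and returns the cyclical component.

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def hp_filter(x, lam=1600.0):
    """Return (trend, cycle) from x^t = (I + lam * A'A)^{-1} x."""
    T = len(x)
    # A is the (T-2) x T second-difference matrix with rows (1, -2, 1)
    diagonals = [np.ones(T - 2), -2 * np.ones(T - 2), np.ones(T - 2)]
    A = sparse.diags(diagonals, offsets=[0, 1, 2], shape=(T - 2, T))
    M = (sparse.eye(T) + lam * (A.T @ A)).tocsc()   # sparse, pentadiagonal
    trend = spsolve(M, x)
    return trend, x - trend

# Example: a noisy trending series
rng = np.random.default_rng(0)
x = 0.01 * np.arange(200) + 0.1 * rng.standard_normal(200)
trend, cycle = hp_filter(x)
print(cycle.mean())   # cyclical component fluctuates around zero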
Properties I
To study the properties of the HP filter, it is however more convenient to stick with the original notation.
We write the first-order condition of the minimization problem as:
$$0=-2\left(x_t-x_t^t\right)+2\lambda\left[\left(x_t^t-x_{t-1}^t\right)-\left(x_{t-1}^t-x_{t-2}^t\right)\right]-4\lambda\left[\left(x_{t+1}^t-x_t^t\right)-\left(x_t^t-x_{t-1}^t\right)\right]+2\lambda\left[\left(x_{t+2}^t-x_{t+1}^t\right)-\left(x_{t+1}^t-x_t^t\right)\right]$$
Now, grouping coefficients and using the lag operator $L$:
$$x_t=\left[\lambda L^{-2}-4\lambda L^{-1}+(6\lambda+1)-4\lambda L+\lambda L^2\right]x_t^t=\left[1+\lambda(1-L)^2\left(1-L^{-1}\right)^2\right]x_t^t=F(L)x_t^t$$
37
Properties II
Define the operator $C(L)$ as:
$$C(L)=(F(L)-1)F(L)^{-1}=\frac{\lambda(1-L)^2\left(1-L^{-1}\right)^2}{1+\lambda(1-L)^2\left(1-L^{-1}\right)^2}$$
Now, if we let, for convenience, $T=\infty$, and since
$$x_t^t=B(L)x_t,\qquad x_t^c=(1-B(L))x_t$$
we can see that
$$1-B(L)=1-F(L)^{-1}=(F(L)-1)F(L)^{-1}=C(L)$$
i.e., the cyclical component $x_t^c$ is equal to:
$$x_t^c=\frac{\lambda(1-L)^2\left(1-L^{-1}\right)^2}{1+\lambda(1-L)^2\left(1-L^{-1}\right)^2}x_t$$
a stationary process if $x_t$ is integrated up to fourth order.
38
Properties III
Remember that:
$$s_{x^c}(\omega)=\left|1-B\left(e^{-i\omega}\right)\right|^2s_x(\omega)=\left|C\left(e^{-i\omega}\right)\right|^2s_x(\omega)$$
Evaluating $C(L)$ at $e^{-i\omega}$ and taking the modulus, the gain of the cyclical component is:
$$G_c(\omega)=\frac{\lambda\left(1-e^{-i\omega}\right)^2\left(1-e^{i\omega}\right)^2}{1+\lambda\left(1-e^{-i\omega}\right)^2\left(1-e^{i\omega}\right)^2}=\frac{4\lambda(1-\cos(\omega))^2}{1+4\lambda(1-\cos(\omega))^2}$$
where we use the identity $\left(1-e^{-i\omega}\right)\left(1-e^{i\omega}\right)=2(1-\cos(\omega))$.
This function gives zero weight to the zero frequency and weight close to unity to high frequencies, with increases in $\lambda$ moving the curve to the left.
Finally, note that since $C\left(e^{-i\omega}\right)$ is real, $\theta(\omega)=0$ and the series is not translated in time.
39
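A one-function sketch (mine) of this gain formula; with $\lambda=1600$ it is essentially zero at the lowest frequencies and close to one at business cycle and higher frequencies.

import numpy as np

def hp_cycle_gain(omega, lam=1600.0):
    """G_c(w) = 4*lam*(1 - cos w)^2 / (1 + 4*lam*(1 - cos w)^2)."""
    g = 4 * lam * (1 - np.cos(omega)) ** 2
    return g / (1 + g)

omega = np.array([0.0, 2 * np.pi / 32, 2 * np.pi / 6, np.pi])
print(hp_cycle_gain(omega))   # 0 at frequency zero, rising toward 1 at high frequencies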
Butterworth Filters
Filters with gain:
$$G(\omega)=\left[1+\left(\frac{\sin\frac{\omega}{2}}{\sin\frac{\omega_0}{2}}\right)^{2d}\right]^{-1}$$
that depend on two parameters: $d$, a positive integer, and $\omega_0$, which sets the frequency for which the gain is $0.5$.
Higher values of $d$ make the slope of the band more vertical, while smaller values of $\omega_0$ narrow the width of the filter band.
40
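A short check (my own sketch, with arbitrary parameter values) that this sine-version Butterworth gain equals 0.5 at $\omega_0$ regardless of $d$, and that a larger $d$ sharpens the cutoff.

import numpy as np

def butterworth_gain(omega, omega0, d):
    return 1.0 / (1.0 + (np.sin(omega / 2) / np.sin(omega0 / 2)) ** (2 * d))

omega0 = 0.5
print(butterworth_gain(omega0, omega0, d=1), butterworth_gain(omega0, omega0, d=8))  # both 0.5
print(butterworth_gain(0.8, omega0, d=1), butterworth_gain(0.8, omega0, d=8))        # sharper drop for d=8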
Butterworth Filters and the HP Filter
Using the fact that $1-\cos(\omega)=2\sin^2\frac{\omega}{2}$, the HP gain is:
$$G_c(\omega)=1-\frac{1}{1+16\lambda\sin^4\frac{\omega}{2}}$$
Since $B(L)=1-C(L)$, the gain of the trend component is just:
$$G_{x^t}(\omega)=1-G_c(\omega)=\frac{1}{1+16\lambda\sin^4\frac{\omega}{2}}$$
a particular case of a Butterworth filter of the sine version with $d=2$ and $\omega_0=2\arcsin\left(\frac{1}{2}\lambda^{-0.25}\right)$.
41
Ideal Filter Revisited
Recall that in our discussion of the ideal filter $y_t=\sum_{j=-\infty}^{\infty}b_jL^jx_t$ we derived the formulae:
$$b_j=b_{-j}=\frac{\sin jb-\sin ja}{\pi j}\quad\forall j\in\mathbb{N}\setminus\{0\},\qquad b_0=\frac{b-a}{\pi}$$
for the frequencies $a=2\pi/p_u$ and $b=2\pi/p_l$, where $2\leq p_l<p_u<\infty$ are the lengths of the fluctuations of interest.
Although these expressions cannot be evaluated for all integers $j$, they suggest the feasibility of a more direct approach to band-pass filtering using some form of these coefficients.
42
Baxter and King (1999)
Take the first $k$ coefficients of the above expression.
Then, since we want a gain that implies zero variance at $\omega=0$, we normalize each coefficient as:
$$b_j^f=b_j-\frac{b_0+2\sum_{j=1}^{k}b_j}{2k+1}$$
to get $\sum_{j=-k}^{k}b_j^f=0$, as originally proposed by Craddock (1957).
This strategy is motivated by the remarkable result that the solution to the problem of minimizing
$$Q=\frac{1}{2\pi}\int_{-\pi}^{\pi}\left|B\left(e^{-i\omega}\right)-B_k\left(e^{-i\omega}\right)\right|^2d\omega$$
where $B(\cdot)$ is the ideal band-pass filter and $B_k(\cdot)$ is its $k$-approximation, is simply to take the first $k$ coefficients $b_j$ from the inversion formula and to make all the higher order coefficients zero.
43
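A minimal sketch (mine) of the resulting approximate band-pass filter: truncate the ideal weights at $k=12$ (a common choice for quarterly data, taken as an assumption here) and subtract the normalization constant so the weights sum to zero.

import numpy as np

def bandpass_weights(p_low=6, p_high=32, k=12):
    """Truncated, normalized band-pass weights b_j^f for j = -k, ..., k."""
    a, b = 2 * np.pi / p_high, 2 * np.pi / p_low
    j = np.arange(1, k + 1)
    bj = (np.sin(j * b) - np.sin(j * a)) / (np.pi * j)
    b0 = (b - a) / np.pi
    w = np.concatenate([bj[::-1], [b0], bj])
    return w - w.sum() / (2 * k + 1)          # enforce zero gain at frequency zero

w = bandpass_weights()
print(w.sum())                                # ~0

# Applying the filter loses k observations at each end of the sample
x = np.cumsum(np.random.default_rng(1).standard_normal(200))   # a random walk
cycle = np.convolve(x, w, mode="valid")
print(len(x), len(cycle))                     # 200, 176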
To see that, write the minimization problem subject to $\sum_{j=-k}^{k}b_j^f=0$ as:
$$\mathcal{L}=\frac{1}{2\pi}\int_{-\pi}^{\pi}\delta\left(\omega,\tilde{b}\right)^2d\omega+\mu\sum_{j=-k}^{k}b_j^f$$
where
$$\delta\left(\omega,\tilde{b}\right)=B\left(e^{-i\omega}\right)-\sum_{j=-k}^{k}b_j^fe^{-i\omega j}=\sum_{j=-\infty}^{\infty}b_je^{-i\omega j}-\sum_{j=-k}^{k}b_j^fe^{-i\omega j}$$
First order conditions:
$$0=-\frac{1}{\pi}\int_{-\pi}^{\pi}\delta\left(\omega,\tilde{b}\right)e^{i\omega j}\, d\omega+\mu$$
for $j=-k,\ldots,k$.
44
Since the harmonics $e^{-i\omega j}$ are orthonormal:
$$0=-2\left(b_j-b_j^f\right)+\mu\quad\Longrightarrow\quad b_j^f=b_j-\frac{\mu}{2}=b_j-\frac{\sum_{j=-k}^{k}b_j}{2k+1}$$
for $j=-k,\ldots,k$, which provides the desired result.
Notice that, since the ideal filter is a step function, this approach suffers from the same Gibbs phenomenon that arises in Fourier analysis (Koopmans, 1974).
45