Notes 1

1: INTRODUCTION, BASICS
A time series is any sequence of data values, $\{x_t\}_{t=0}^{n-1}$, observed at particular (equally spaced, say)
values of time, $t$. Examples we will study include monthly carbon dioxide concentrations at Mauna
Loa, and monthly levels of the Dow Jones Industrial Average from January 1947 to July 1992. Examples in Bloomfield include the magnitude of a variable star at midnight on 600 successive nights,
monthly sunspot activity, and a yearly index of wheat prices in Western Europe. Plots of $x_t$ versus $t$
often show apparent repetitive or cyclical behavior in the data. For some series these apparent cycles
are simply artifacts in the observed data, and will not continue into the future. Other series have
"genuine" cyclical components. For example, a series of daily high temperatures in New York City
would have a seasonal component with a period of one year.
In this course, we will devote considerable attention to the use of Fourier analysis (often called
harmonic analysis, or frequency domain analysis) for analyzing time series in terms of their cyclical
(sinusoidal) components. As we will see, any time series can be analyzed in the frequency domain,
whether or not it contains "genuine" cyclical components. By thinking about a data set in terms of its
sinusoidal components (i.e., by doing a harmonic analysis, or thinking about the data in the frequency
domain), we are not assuming that these components are necessarily "real". Nevertheless, harmonic
analysis often reveals interesting features in the data that are much harder to see in a time domain plot
of $x_t$ versus $t$. In addition, frequency domain theory provides the best mathematical tool for treating
general stationary time series, beyond the simple ARMA models described by Box and Jenkins. For
example, the problem of forecasting a stationary time series requires frequency domain methods for its
general solution. Finally, the effects of linear operations (linear filters) on a time series are most easily
understood in the frequency domain. Thus, frequency domain analysis can provide a deeper understanding of the standard ARMA theory and methods.
The purest periodic series, and the fundamental waveform of harmonic analysis, is the cosine
wave
$$x_t = R\cos(\omega t + \phi).$$
We define
Period = # Time Units/Complete Cycle = $2\pi/\omega$,
Frequency = # Cycles/Unit Time = $\omega/2\pi$ = 1/Period,
Angular frequency = # Radians/Unit Time.
Since there are $2\pi$ radians per cycle, the angular frequency is
$$2\pi\,[\text{frequency}] = \omega.$$
$R$ is the amplitude, and represents the height of successive peaks of the waveform. $\phi$ is the phase
and determines the origin of the waveform. For better understanding of $\phi$, use the important relation
$$\cos(A + B) = \cos A\cos B - \sin A\sin B$$
to show that $\cos(\omega t - \pi/2) = \sin\omega t$. Thus, if $\phi = 0$ we get $x_t = R\cos(\omega t)$ while if $\phi = -\pi/2$, we get
$x_t = R\sin\omega t$. These two waveforms are identical except for the phase lag, or time shift.
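The general cosine wave $x_t = R\cos(\omega t + \phi)$ can also be written as a sum of sine and cosine
waves, $x_t = A\cos\omega t + B\sin\omega t$. To prove this, note that
$$R\cos(\omega t + \phi) = R[\cos\omega t\cos\phi - \sin\omega t\sin\phi] = [R\cos\phi]\cos\omega t + [-R\sin\phi]\sin\omega t = A\cos\omega t + B\sin\omega t,$$
where $A = R\cos\phi$ and $B = -R\sin\phi$. It follows that
$$R = (A^2 + B^2)^{1/2}, \qquad \phi = \tan^{-1}(-B/A),$$
where the branch of $\tan^{-1}$ is chosen so that $\cos\phi$ and $\sin\phi$ have the signs of $A$ and $-B$.
Conversely, any sum of sine and cosine waves at frequency $\omega$, $x_t = A\cos\omega t + B\sin\omega t$, can be written as
$x_t = R\cos(\omega t + \phi)$ with $R$ and $\phi$ as given above.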
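The conversion between the two forms of a sinusoid can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values are assumptions made for the example, and `math.atan2(-B, A)` is used in place of a bare arctangent of −B/A so that the phase lands in the correct quadrant when A is negative.

```python
import math

def to_amplitude_phase(a, b):
    """Convert A*cos(wt) + B*sin(wt) into (R, phi) with R*cos(wt + phi)."""
    r = math.hypot(a, b)        # R = (A^2 + B^2)^(1/2)
    phi = math.atan2(-b, a)     # quadrant-aware version of arctan(-B/A)
    return r, phi

def to_cos_sin(r, phi):
    """Convert R*cos(wt + phi) into (A, B) with A = R cos(phi), B = -R sin(phi)."""
    return r * math.cos(phi), -r * math.sin(phi)

# Illustrative values (hypothetical, not taken from the notes).
a, b, omega = -3.0, 4.0, 0.7
r, phi = to_amplitude_phase(a, b)

# Both forms should agree at every time point.
for t in range(5):
    lhs = a * math.cos(omega * t) + b * math.sin(omega * t)
    rhs = r * math.cos(omega * t + phi)
    assert math.isclose(lhs, rhs, abs_tol=1e-12)
```

With A = 0 and B = 1 the same function returns R = 1 and a phase of −π/2, which is the sine wave case discussed earlier.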
Harmonic Representation of a Time Series
Here, we will show that any data sequence $x_0, \ldots, x_{n-1}$ can be written as a sum of sinusoids
with various frequencies, amplitudes, and phases; see (1). Thus, a data set can always be thought of in terms
of its harmonic components. In general, a data set of size $n$ will have $n$ harmonic components, so no
information is gained or lost by passing to the frequency domain. Define the $j$'th Fourier frequency by
$\omega_j = 2\pi j/n$. A sinusoid at angular frequency $\omega_j$ executes $j$ complete cycles in the course of the data
(since it executes $j/n$ cycles per unit time and the time extent of the data is $n$). The frequencies
$\omega_j$ $(j = 0, \ldots, n-1)$ are a convenient choice (but not the only possible choice) for the frequencies of
the harmonic components of the data set. The representation is
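$$x_t = A_0 + \sum_{0 < j < n/2} (A_j\cos\omega_j t + B_j\sin\omega_j t) + A_{n/2}(-1)^t, \tag{1}$$
where the harmonic components are
$$A_0 = \frac{1}{n}\sum_{t=0}^{n-1} x_t = \bar{x},$$
$$A_j = \frac{2}{n}\sum_{t=0}^{n-1} x_t\cos\omega_j t, \qquad B_j = \frac{2}{n}\sum_{t=0}^{n-1} x_t\sin\omega_j t, \qquad 0 < j < n/2,$$
$$A_{n/2} = \frac{1}{n}\sum_{t=0}^{n-1} x_t(-1)^t.$$
The $A_{n/2}$ term appears only when $n$ is even.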
Note that $A_j^2 + B_j^2$ is the squared amplitude of the contribution to $x_t$ from the sinusoid at $\omega_j$.
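As a numerical check on representation (1), the following minimal sketch computes the harmonic components directly from their defining sums and rebuilds the data from them. The function names and the sample data are hypothetical, and the direct O(n^2) summation is chosen for clarity rather than speed.

```python
import math

def harmonic_coefficients(x):
    """Return (A_0, {j: (A_j, B_j)} for 0 < j < n/2, A_{n/2} or None if n is odd)."""
    n = len(x)
    a0 = sum(x) / n                         # A_0 is the sample mean
    coeffs = {}
    j = 1
    while 2 * j < n:                        # Fourier frequencies with 0 < j < n/2
        w = 2 * math.pi * j / n
        a = 2 / n * sum(x[t] * math.cos(w * t) for t in range(n))
        b = 2 / n * sum(x[t] * math.sin(w * t) for t in range(n))
        coeffs[j] = (a, b)
        j += 1
    # The alternating term exists only for even n.
    a_half = sum(x[t] * (-1) ** t for t in range(n)) / n if n % 2 == 0 else None
    return a0, coeffs, a_half

def reconstruct(a0, coeffs, a_half, n):
    """Rebuild x_0, ..., x_{n-1} from the harmonic components via (1)."""
    out = []
    for t in range(n):
        value = a0
        for j, (a, b) in coeffs.items():
            w = 2 * math.pi * j / n
            value += a * math.cos(w * t) + b * math.sin(w * t)
        if a_half is not None:
            value += a_half * (-1) ** t
        out.append(value)
    return out

# Hypothetical data, used only to exercise the functions.
x = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0]
a0, coeffs, a_half = harmonic_coefficients(x)
rebuilt = reconstruct(a0, coeffs, a_half, len(x))
assert all(math.isclose(u, v, abs_tol=1e-9) for u, v in zip(x, rebuilt))
```

For n = 6 this gives one mean term, two (A_j, B_j) pairs, and one alternating term, for six real components in total, matching the count of n harmonic components stated in the text.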
Even though in practice the data values are usually real numbers, the mathematical manipulations
of harmonic analysis are best carried out using the arithmetic of complex numbers. All complex
numbers can be written as $A + Bi$, where $A$ and $B$ are real numbers. The complex number $i = 0 + 1i$ is
the square root of $-1$: $i^2 = -1$. The complex exponential $\exp(i\omega)$ is given by Euler's formula as
$\exp(i\omega) = \cos\omega + i\sin\omega$ if $\omega$ is real. The conjugate of the complex number $Z = A + iB$ is given by
$Z^* = \bar{Z} = A - iB$. Note that the conjugate of $\exp(i\omega)$ is $\exp(-i\omega) = \cos\omega - i\sin\omega$. All the usual rules of
algebra apply to complex numbers. Thus, for example, $\exp(i\omega)\exp(i\lambda) = \exp(i(\omega + \lambda))$.
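These identities are easy to confirm numerically. The short sketch below uses Python's built-in complex type and the standard `cmath` module; the two angles are arbitrary illustrative values.

```python
import cmath
import math

omega, lam = 0.9, -2.3          # arbitrary real angles, for illustration only
z = cmath.exp(1j * omega)       # Python writes the imaginary unit i as 1j

# Euler's formula: exp(i*omega) = cos(omega) + i*sin(omega)
assert cmath.isclose(z, complex(math.cos(omega), math.sin(omega)))

# The conjugate of exp(i*omega) is exp(-i*omega)
assert cmath.isclose(z.conjugate(), cmath.exp(-1j * omega))

# Exponents add under multiplication
assert cmath.isclose(z * cmath.exp(1j * lam), cmath.exp(1j * (omega + lam)))
```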
The rest of this section is devoted to proving (1), and also to introducing some key elements of
harmonic analysis. Define the discrete Fourier transform (DFT) of the data set $\{x_t\}_{t=0}^{n-1}$ to be the
sequence of complex numbers $\{J_j\}_{j=0}^{n-1}$, where
$$J_j = \frac{1}{n}\sum_{t=0}^{n-1} x_t\exp(-i\omega_j t), \qquad j = 0, \ldots, n-1.$$
Note that, for real data, $J_{n-j} = J_j^*$. For $0 < j < n/2$ we can also write $J_j$ as
$$J_j = \frac{1}{n}\sum_{t=0}^{n-1} x_t\cos\omega_j t - i\,\frac{1}{n}\sum_{t=0}^{n-1} x_t\sin\omega_j t = \frac{1}{2}(A_j - iB_j).$$
In some contexts, we will need to express the DFT as a function of a continuous frequency variable $\omega$:
$$J(\omega) = \frac{1}{n}\sum_{t=0}^{n-1} x_t\exp(-i\omega t).$$
Note that $J_j = J(\omega_j)$.
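A direct transcription of these definitions into Python might look like the sketch below. The function names `dft` and `dft_at` and the sample data are assumptions made for illustration. Note the 1/n factor in the forward transform, which follows the convention of these notes; some FFT libraries (NumPy's `numpy.fft.fft`, for instance, by default) omit it, so results can differ by a factor of n.

```python
import cmath
import math

def dft_at(x, omega):
    """J(omega) = (1/n) * sum_t x_t * exp(-i*omega*t), for any real omega."""
    n = len(x)
    return sum(x[t] * cmath.exp(-1j * omega * t) for t in range(n)) / n

def dft(x):
    """J_k = J(omega_k) at the Fourier frequencies omega_k = 2*pi*k/n, k = 0..n-1."""
    n = len(x)
    return [dft_at(x, 2 * math.pi * k / n) for k in range(n)]

# Hypothetical real-valued data.
x = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0]
n = len(x)
J = dft(x)

# J_0 is the sample mean.
assert cmath.isclose(J[0], sum(x) / n)

# For real data, J_{n-k} is the conjugate of J_k.
for k in range(1, n):
    assert abs(J[n - k] - J[k].conjugate()) < 1e-12
```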
A key formula in harmonic analysis is given in Exercise 2.2 of Bloomfield for the DFT of a data
set consisting entirely of ones:
$$\frac{1}{n}\sum_{t=0}^{n-1}\exp(-i\omega t) = \exp\{-i(n-1)\omega/2\}\,\frac{\sin(n\omega/2)}{n\sin(\omega/2)}. \tag{2}$$
The function
$$D_n(\omega) = \frac{\sin(n\omega/2)}{n\sin(\omega/2)}$$
is called the Dirichlet kernel. By definition,
$$D_n(0) = \lim_{\omega\to 0} D_n(\omega) = 1.$$
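$D_n(\omega)$ is periodic with period $2\pi$ up to sign, since $D_n(\omega + 2\pi) = (-1)^{n-1}D_n(\omega)$, and $D_n(-\omega) = D_n(\omega)$. For $\omega \in [-\pi, \pi]$, $D_n(\omega)$ reaches its maximum
value at $\omega = 0$, and for small $\omega$, $D_n(\omega)$ behaves like the function $\sin x / x$ with $x = n\omega/2$. A key fact is that
$D_n(\omega)$ has zeros at the Fourier frequencies $\omega_1, \ldots, \omega_{n-1}$. This follows since $\sin(n\omega_j/2) = \sin(\pi j)$,
which is zero, while $\sin(\omega_j/2) \neq 0$ for $0 < j < n$. Thus, $D_n(\omega_j) = 0$ for $j \neq 0 \bmod n$. Many important facts follow from this. From (2), we
get
$$\sum_{t=0}^{n-1}\exp(-i\omega_j t) = 0, \qquad j \neq 0 \bmod n.$$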
Thus,
$$\sum_{t=0}^{n-1}\cos(\omega_j t) = \sum_{t=0}^{n-1}\sin(\omega_j t) = 0 \qquad (j \neq 0 \bmod n).$$
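Formula (2) and the zeros of the Dirichlet kernel can be checked numerically. The sketch below is illustrative only: the function names, the choice n = 8, and the test frequency 0.37 are assumptions, and the kernel function special-cases only ω = 0 (other multiples of 2π, where the denominator also vanishes, are not handled).

```python
import cmath
import math

def dirichlet(n, omega):
    """D_n(omega) = sin(n*omega/2) / (n*sin(omega/2)), with D_n(0) = 1 by definition."""
    if omega == 0:
        return 1.0
    # Not valid at other multiples of 2*pi, where sin(omega/2) is also zero.
    return math.sin(n * omega / 2) / (n * math.sin(omega / 2))

def mean_exponential(n, omega):
    """Left-hand side of (2): (1/n) * sum_t exp(-i*omega*t)."""
    return sum(cmath.exp(-1j * omega * t) for t in range(n)) / n

n = 8
omega = 0.37    # an arbitrary frequency that is not a Fourier frequency

# Formula (2): the DFT of a series of ones is a phase factor times D_n.
rhs = cmath.exp(-1j * (n - 1) * omega / 2) * dirichlet(n, omega)
assert cmath.isclose(mean_exponential(n, omega), rhs)

# D_n vanishes at the nonzero Fourier frequencies, so the complex sum,
# and hence the cosine and sine sums, are zero there.
for k in range(1, n):
    w = 2 * math.pi * k / n
    assert abs(dirichlet(n, w)) < 1e-12
    assert abs(sum(math.cos(w * t) for t in range(n))) < 1e-12
    assert abs(sum(math.sin(w * t) for t in range(n))) < 1e-12
```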
The sequence $\{\exp(i\omega_j t)\}_{t=0}^{n-1}$ can be considered as a vector in $\mathbb{C}^n$, the space of $n$-dimensional
complex-valued vectors. If $V_j$ is the column vector $V_j = \{\exp(i\omega_j t)\}_{t=0}^{n-1}$, then $V_0, \ldots, V_{n-1}$ are vectors in $\mathbb{C}^n$. The inner product $(V_j, V_k)$ is defined as $V_j^* V_k$, where $*$ denotes the conjugate transpose.
Two vectors in $\mathbb{C}^n$ are said to be orthogonal if their inner product is zero. In fact, the vectors
$V_0, \ldots, V_{n-1}$ are mutually orthogonal since if $j \neq k$,
$$(V_j, V_k) = \sum_{t=0}^{n-1}\exp(-i\omega_j t)\exp(i\omega_k t) = \sum_{t=0}^{n-1}\exp(i(\omega_k - \omega_j)t) = \sum_{t=0}^{n-1}\exp(i\omega_{k-j}t) = 0,$$
since $D_n(\omega_{k-j}) = 0$. This is extremely important since it shows that $V_0, \ldots, V_{n-1}$ form a basis for $\mathbb{C}^n$
(any $n$ mutually orthogonal nonzero vectors in an $n$-dimensional space are linearly independent),
and hence any vector in $\mathbb{C}^n$ (e.g., the data vector $X = (x_0, \ldots, x_{n-1})^T$) can be expressed as a linear
combination of the complex exponential basis vectors $V_0, \ldots, V_{n-1}$. Thus, we can write
$$X = \sum_{j=0}^{n-1}\alpha_j V_j.$$
To find the coefficient $\alpha_k$, take the inner product with $V_k$ of both sides, to obtain
$$(V_k, X) = \sum_{j=0}^{n-1}\alpha_j (V_k, V_j) = \alpha_k (V_k, V_k) = n\alpha_k.$$
Thus,
$$\alpha_k = \frac{1}{n}(V_k, X) = \frac{1}{n}\sum_{t=0}^{n-1}\exp(-i\omega_k t)\,x_t = J_k.$$
It follows that $X = \sum_{j=0}^{n-1} J_j V_j$, in other words that
$$x_t = \sum_{j=0}^{n-1} J_j\exp(i\omega_j t). \tag{3}$$
Thus, $x_t$ is a sum of complex exponentials at the Fourier frequencies. The inverse Fourier transform
of a complex sequence $\{z_j\}_{j=0}^{n-1}$ is $\sum_{j=0}^{n-1} z_j\exp(i\omega_j t)$. Thus, (3) shows that $x_t$ is the inverse Fourier
transform of the DFT, $\{J_j\}$. Formula (3) is called Fourier inversion. We can now derive Formula (1)
(the real version of (3)), as follows. From (3), we find that
$$x_t = J_0 + \sum_{0 < j < n/2}[J_j\exp(i\omega_j t) + J_{n-j}\exp(i\omega_{n-j}t)] + J_{n/2}\exp(i\omega_{n/2}t).$$
Since $J_{n-j} = \bar{J}_j$ and $\exp(i\omega_{n-j}t) = \exp(-i\omega_j t)$, we find that
$$J_j\exp(i\omega_j t) + J_{n-j}\exp(i\omega_{n-j}t) = 2\,\mathrm{Re}[J_j\exp(i\omega_j t)]$$
$$= 2\,\mathrm{Re}\left[\tfrac{1}{2}(A_j - iB_j)(\cos\omega_j t + i\sin\omega_j t)\right] = A_j\cos\omega_j t + B_j\sin\omega_j t.$$
Therefore, since $J_0 = A_0$ and $J_{n/2}\exp(i\omega_{n/2}t) = A_{n/2}(-1)^t$,
$$x_t = A_0 + \sum_{0 < j < n/2}(A_j\cos\omega_j t + B_j\sin\omega_j t) + A_{n/2}(-1)^t.$$
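The orthogonality of the complex exponential vectors and the Fourier inversion formula (3) can both be verified numerically. The following is a minimal sketch under the conventions of these notes (1/n in the forward transform, no factor in the inverse); the function names and the data are hypothetical.

```python
import cmath
import math

def basis_vector(k, n):
    """V_k = (exp(i*omega_k*t)) for t = 0..n-1, with omega_k = 2*pi*k/n."""
    omega = 2 * math.pi * k / n
    return [cmath.exp(1j * omega * t) for t in range(n)]

def inner(u, v):
    """Inner product (u, v) = u^* v: the first argument is conjugated."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def dft(x):
    """J_k = (1/n) * (V_k, X)."""
    n = len(x)
    return [inner(basis_vector(k, n), x) / n for k in range(n)]

def inverse_dft(z):
    """x_t = sum_k z_k * exp(i*omega_k*t), as in formula (3)."""
    n = len(z)
    return [sum(z[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
            for t in range(n)]

n = 6
V = [basis_vector(k, n) for k in range(n)]

# (V_a, V_b) is n when a == b and 0 otherwise.
for a in range(n):
    for b in range(n):
        expected = n if a == b else 0
        assert abs(inner(V[a], V[b]) - expected) < 1e-9

# Fourier inversion recovers the (hypothetical) data up to rounding error.
x = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0]
recovered = inverse_dft(dft(x))
assert all(abs(u - v) < 1e-9 for u, v in zip(x, recovered))
```

The recovered values are complex numbers whose imaginary parts are zero up to rounding, which reflects the pairing of the j and n − j terms used in the derivation of (1).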