1: INTRODUCTION, BASICS

A time series is any sequence of data values, $\{x_t\}_{t=0}^{n-1}$, observed at particular (equally spaced, say) values of time, $t$. Examples we will study include monthly carbon dioxide concentrations at Mauna Loa, and monthly levels of the Dow Jones Industrial Average from January 1947 to July 1992. Examples in Bloomfield include the magnitude of a variable star at midnight on 600 successive nights, monthly sunspot activity, and a yearly index of wheat prices in Western Europe.

Plots of $x_t$ versus $t$ often show apparent repetitive or cyclical behavior in the data. For some series these apparent cycles are simply artifacts in the observed data, and will not continue into the future. Other series have "genuine" cyclical components. For example, a series of daily high temperatures in New York City would have a seasonal component with a period of one year.

In this course, we will devote considerable attention to the use of Fourier analysis (often called harmonic analysis, or frequency domain analysis) for analyzing time series in terms of their cyclical (sinusoidal) components. As we will see, any time series can be analyzed in the frequency domain, whether or not it contains "genuine" cyclical components. By thinking about a data set in terms of its sinusoidal components (i.e., by doing a harmonic analysis, or thinking about the data in the frequency domain) we are not assuming that these components are necessarily "real". Nevertheless, harmonic analysis often reveals interesting features in the data that are much harder to see in a time domain plot of $x_t$ versus $t$.

In addition, frequency domain theory provides the best mathematical tool for treating general stationary time series, beyond the simple ARMA models described by Box and Jenkins. For example, the problem of forecasting a stationary time series requires frequency domain methods for its general solution.
Finally, the effects of linear operations (linear filters) on a time series are most easily understood in the frequency domain. Thus, frequency domain analysis can provide a deeper understanding of the standard ARMA theory and methods.

The purest periodic series, and the fundamental waveform of harmonic analysis, is the cosine wave

$$x_t = R\cos(\omega t + \phi) .$$

We define

Period = # Time Units/Complete Cycle = $2\pi/\omega$
Frequency = # Cycles/Unit Time = $\omega/2\pi$ = 1/Period
Angular frequency = # Radians/Unit Time.

Since there are $2\pi$ radians per cycle, the angular frequency is $2\pi\,[\text{frequency}] = \omega$. $R$ is the amplitude, and represents the height of successive peaks of the waveform. $\phi$ is the phase and determines the origin of the waveform. For better understanding of $\phi$, use the important relation

$$\cos(A + B) = \cos A\cos B - \sin A\sin B$$

to show that $\cos(\omega t - \pi/2) = \sin\omega t$. Thus, if $\phi = 0$ we get $x_t = R\cos(\omega t)$, while if $\phi = -\pi/2$, we get $x_t = R\sin\omega t$. These two waveforms are identical except for the phase lag, or time shift.

The general cosine wave $x_t = R\cos(\omega t + \phi)$ can also be written as a sum of sine and cosine waves, $x_t = A\cos\omega t + B\sin\omega t$. To prove this, note that

$$R\cos(\omega t + \phi) = R[\cos\omega t\cos\phi - \sin\omega t\sin\phi] = [R\cos\phi]\cos\omega t + [-R\sin\phi]\sin\omega t = A\cos\omega t + B\sin\omega t ,$$

where $A = R\cos\phi$ and $B = -R\sin\phi$. It follows that

$$R = (A^2 + B^2)^{1/2} , \qquad \phi = \tan^{-1}(-B/A) ,$$

where the quadrant of $\phi$ is chosen so that $\cos\phi$ and $\sin\phi$ have the signs of $A$ and $-B$. Conversely, any sum of sine and cosine waves at frequency $\omega$, $x_t = A\cos\omega t + B\sin\omega t$, can be written as $x_t = R\cos(\omega t + \phi)$ with $R$ and $\phi$ as given above.

Harmonic Representation of a Time Series

Here, we will show that any data sequence $x_0, \ldots, x_{n-1}$ can be written as a sum of sinusoids with various frequencies, amplitudes, and phases; this is formula (1). Thus, a data set can always be thought of in terms of its harmonic components. In general, a data set of size $n$ will have $n$ harmonic components, so no information is gained or lost by passing to the frequency domain. Define the $j$'th Fourier frequency by

$$\omega_j = 2\pi j/n .$$
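Before moving on, the amplitude and phase relations $A = R\cos\phi$, $B = -R\sin\phi$ derived above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the values of $A$ and $B$, and the period of 12 time units are all illustrative assumptions. It uses `math.atan2(-B, A)` in place of $\tan^{-1}(-B/A)$ so that the quadrant of $\phi$ is picked automatically.

```python
import math

def to_amplitude_phase(A, B):
    """Convert A cos(wt) + B sin(wt) into R cos(wt + phi)."""
    R = math.hypot(A, B)          # R = (A^2 + B^2)^(1/2)
    # A = R cos(phi) and B = -R sin(phi), so phi is the angle of the point (A, -B).
    phi = math.atan2(-B, A)
    return R, phi

def to_cos_sin(R, phi):
    """Convert R cos(wt + phi) back into the coefficients (A, B)."""
    return R * math.cos(phi), -R * math.sin(phi)

# Illustrative values only (assumptions, not taken from the notes).
omega = 2 * math.pi / 12          # a wave with a period of 12 time units
A, B = 3.0, -4.0

R, phi = to_amplitude_phase(A, B)
A_back, B_back = to_cos_sin(R, phi)

# Largest difference between the two forms of the wave over 24 time points.
max_err = max(
    abs((A * math.cos(omega * t) + B * math.sin(omega * t))
        - R * math.cos(omega * t + phi))
    for t in range(24)
)
print(R, phi, max_err)
```

The two forms should agree up to floating point rounding at every $t$, whatever the signs of $A$ and $B$.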
A sinusoid at angular frequency $\omega_j$ executes $j$ complete cycles in the course of the data (since it executes $j/n$ cycles per unit time and the time extent of the data is $n$). The frequencies $\omega_j$ ($j = 0, \ldots, n-1$) are a convenient choice (but not the only possible choice) for the frequencies of the harmonic components of the data set. The representation is

$$x_t = A_0 + \sum_{0 < j < n/2} (A_j\cos\omega_j t + B_j\sin\omega_j t) + A_{n/2}(-1)^t , \qquad (1)$$

where the harmonic components are

$$A_0 = \frac{1}{n}\sum_{t=0}^{n-1} x_t = \bar{x} , \qquad A_{n/2} = \frac{1}{n}\sum_{t=0}^{n-1} x_t(-1)^t ,$$

$$A_j = \frac{2}{n}\sum_{t=0}^{n-1} x_t\cos\omega_j t , \qquad B_j = \frac{2}{n}\sum_{t=0}^{n-1} x_t\sin\omega_j t , \qquad 0 < j < n/2 .$$

The term involving $A_{n/2}$ is present only when $n$ is even. Note that $A_j^2 + B_j^2$ is the squared amplitude of the contribution to $x_t$ from the sinusoid at $\omega_j$.

Even though in practice the data values are usually real numbers, the mathematical manipulations of harmonic analysis are best carried out using the arithmetic of complex numbers. All complex numbers can be written as $A + Bi$, where $A$ and $B$ are real numbers. The complex number $i = 0 + 1i$ is the square root of $-1$: $i^2 = -1$. The complex exponential $\exp(i\omega)$ is given by Euler's formula as

$$\exp(i\omega) = \cos\omega + i\sin\omega$$

if $\omega$ is real. The conjugate of the complex number $Z = A + iB$ is given by $Z^* = \bar{Z} = A - iB$. Note that the conjugate of $\exp(i\omega)$ is $\exp(-i\omega) = \cos\omega - i\sin\omega$. All the usual rules of algebra apply to complex numbers. Thus, for example, $\exp(i\omega)\exp(i\lambda) = \exp(i(\omega + \lambda))$.

The rest of this section is devoted to proving (1), and also to introducing some key elements of harmonic analysis. Define the discrete Fourier transform (DFT) of the data set $\{x_t\}_{t=0}^{n-1}$ to be the sequence of complex numbers $\{J_j\}_{j=0}^{n-1}$, where

$$J_j = \frac{1}{n}\sum_{t=0}^{n-1} x_t\exp(-i\omega_j t) , \qquad j = 0, \ldots, n-1 .$$

Note that $J_{n-j} = J_j^*$ when the data are real. For $0 < j < n/2$ we can also write $J_j$ as

$$J_j = \frac{1}{n}\sum_{t=0}^{n-1} x_t\cos\omega_j t - i\,\frac{1}{n}\sum_{t=0}^{n-1} x_t\sin\omega_j t = \frac{1}{2}(A_j - iB_j) .$$
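The definitions of $J_j$, $A_j$ and $B_j$ above can be evaluated directly for a small data set. The sketch below uses only the Python standard library; the data values and function names are made up for illustration and are not from the notes. It checks the three relations just stated: $J_0 = \bar{x}$, $J_{n-j} = J_j^*$ for real data, and $J_j = \frac{1}{2}(A_j - iB_j)$ for $0 < j < n/2$.

```python
import cmath
import math

def dft(x):
    """Return [J_0, ..., J_{n-1}] with J_j = (1/n) sum_t x_t exp(-i w_j t)."""
    n = len(x)
    J = []
    for j in range(n):
        w = 2 * math.pi * j / n                  # Fourier frequency w_j
        J.append(sum(x[t] * cmath.exp(-1j * w * t) for t in range(n)) / n)
    return J

def harmonic_components(x, j):
    """Return (A_j, B_j) as defined in the text, intended for 0 < j < n/2."""
    n = len(x)
    w = 2 * math.pi * j / n
    A = 2 / n * sum(x[t] * math.cos(w * t) for t in range(n))
    B = 2 / n * sum(x[t] * math.sin(w * t) for t in range(n))
    return A, B

# Made-up real data of length n = 8 (an assumption for illustration).
x = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0, 0.0, 4.0]
n = len(x)
J = dft(x)

mean_err = abs(J[0] - sum(x) / n)                # J_0 equals the sample mean
conj_err = max(abs(J[n - j] - J[j].conjugate()) for j in range(1, n))
half_err = 0.0
for j in range(1, (n + 1) // 2):                 # indices with 0 < j < n/2
    A, B = harmonic_components(x, j)
    half_err = max(half_err, abs(J[j] - complex(A, -B) / 2))

print(mean_err, conj_err, half_err)
```

This direct evaluation costs on the order of $n^2$ operations, which is fine for checking formulas on short series.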
In some contexts, we will need to express the DFT as a function of a continuous frequency variable $\omega$:

$$J(\omega) = \frac{1}{n}\sum_{t=0}^{n-1} x_t\exp(-i\omega t) .$$

Note that $J_j = J(\omega_j)$.

A key formula in harmonic analysis is given in Exercise 2.2 of Bloomfield for the DFT of a data set consisting entirely of ones:

$$\frac{1}{n}\sum_{t=0}^{n-1}\exp(-i\omega t) = \exp\{-i(n-1)\omega/2\}\,\frac{\sin(n\omega/2)}{n\sin(\omega/2)} . \qquad (2)$$

The function

$$D_n(\omega) = \frac{\sin(n\omega/2)}{n\sin(\omega/2)}$$

is called the Dirichlet kernel. By definition,

$$D_n(0) = \lim_{\omega\to 0} D_n(\omega) = 1 .$$

$D_n(\omega)$ satisfies $D_n(-\omega) = D_n(\omega)$. It is periodic with period $2\pi$ when $n$ is odd; when $n$ is even it changes sign under a shift of $2\pi$, so that $|D_n(\omega)|$ has period $2\pi$ in either case. For $\omega \in [-\pi, \pi]$, $D_n(\omega)$ reaches its maximum value at $\omega = 0$, and for small $\omega$, $D_n(\omega)$ behaves like the function $\frac{\sin x}{x}$ with $x = n\omega/2$.

A key fact is that $D_n(\omega)$ has zeros at the Fourier frequencies $\omega_1, \ldots, \omega_{n-1}$. This follows since $\sin(n\omega_j/2) = \sin(\pi j)$, which is zero, while the denominator $n\sin(\omega_j/2)$ is nonzero. Thus, $D_n(\omega_j) = 0$ for $j \neq 0 \bmod n$. Many important facts follow from this. From (2), we get

$$\sum_{t=0}^{n-1}\exp(-i\omega_j t) = 0 , \qquad j \neq 0 \bmod n .$$

Thus, taking real and imaginary parts,

$$\sum_{t=0}^{n-1}\cos(\omega_j t) = \sum_{t=0}^{n-1}\sin(\omega_j t) = 0 \qquad (j \neq 0 \bmod n) .$$

The sequence $\{\exp(i\omega_j t)\}_{t=0}^{n-1}$ can be considered as a vector in $C^n$, the space of $n$-dimensional complex-valued vectors. If $V_j$ is the column vector $V_j = \{\exp(i\omega_j t)\}_{t=0}^{n-1}$, then $V_0, \ldots, V_{n-1}$ are vectors in $C^n$. The inner product $(V_j, V_k)$ is defined as $V_j^* V_k$, where $*$ denotes the conjugate transpose. Two vectors in $C^n$ are said to be orthogonal if their inner product is zero. In fact, the vectors $V_0, \ldots, V_{n-1}$ are mutually orthogonal since if $j \neq k$,

$$(V_j, V_k) = \sum_{t=0}^{n-1}\exp(-i\omega_j t)\exp(i\omega_k t) = \sum_{t=0}^{n-1}\exp(i(\omega_k - \omega_j)t) = \sum_{t=0}^{n-1}\exp(i\omega_{k-j} t) = 0 ,$$

since $D_n(\omega_{k-j}) = 0$. This is extremely important since it shows that $V_0, \ldots, V_{n-1}$ form a basis for $C^n$ (they are $n$ mutually orthogonal nonzero vectors, hence linearly independent), and hence any vector in $C^n$ (e.g., the data vector $X = (x_0, \ldots, x_{n-1})^T$) can be expressed as a linear combination of the complex exponential basis vectors $V_0, \ldots, V_{n-1}$.
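Formula (2), the zeros of the Dirichlet kernel at the Fourier frequencies, and the orthogonality of $V_0, \ldots, V_{n-1}$ can all be confirmed numerically. The sketch below is an illustration under assumptions: $n = 7$ and the test frequency $\omega = 0.7$ are arbitrary choices, and the function names are not from the notes. The `dirichlet` helper is only meant for $-2\pi < \omega < 2\pi$, where the single removable singularity is at $\omega = 0$.

```python
import cmath
import math

def dirichlet(n, w):
    """D_n(w) = sin(n w / 2) / (n sin(w / 2)), for -2*pi < w < 2*pi."""
    if w == 0:
        return 1.0                               # value defined by the limit w -> 0
    return math.sin(n * w / 2) / (n * math.sin(w / 2))

def inner(j, k, n):
    """Inner product (V_j, V_k) = sum_t conj(V_j[t]) * V_k[t]."""
    wj = 2 * math.pi * j / n
    wk = 2 * math.pi * k / n
    return sum(cmath.exp(-1j * wj * t) * cmath.exp(1j * wk * t) for t in range(n))

n = 7                                            # arbitrary series length
w = 0.7                                          # arbitrary non-Fourier frequency

# Formula (2): DFT of a series of ones versus the closed form.
lhs = sum(cmath.exp(-1j * w * t) for t in range(n)) / n
rhs = cmath.exp(-1j * (n - 1) * w / 2) * dirichlet(n, w)
formula_err = abs(lhs - rhs)

# D_n vanishes at the Fourier frequencies w_1, ..., w_{n-1}.
kernel_at_fourier = [dirichlet(n, 2 * math.pi * j / n) for j in range(1, n)]

# Gram matrix of V_0, ..., V_{n-1} should be n times the identity.
gram_err = max(
    abs(inner(j, k, n) - (n if j == k else 0))
    for j in range(n) for k in range(n)
)
print(formula_err, max(abs(d) for d in kernel_at_fourier), gram_err)
```

All three printed quantities should be zero up to rounding error.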
Because $V_0, \ldots, V_{n-1}$ form a basis for $C^n$, we can write

$$X = \sum_{j=0}^{n-1}\alpha_j V_j .$$

To find the coefficient $\alpha_k$, take the inner product with $V_k$ of both sides, to obtain

$$(V_k, X) = \sum_{j=0}^{n-1}\alpha_j (V_k, V_j) = \alpha_k (V_k, V_k) = n\alpha_k .$$

Thus,

$$\alpha_k = \frac{1}{n}(V_k, X) = \frac{1}{n}\sum_{t=0}^{n-1}\exp(-i\omega_k t)\,x_t = J_k .$$

It follows that $X = \sum_{j=0}^{n-1} J_j V_j$, in other words that

$$x_t = \sum_{j=0}^{n-1} J_j\exp(i\omega_j t) . \qquad (3)$$

Thus, $x_t$ is a sum of complex exponentials at the Fourier frequencies. The inverse Fourier transform of a complex sequence $\{z_j\}_{j=0}^{n-1}$ is

$$\sum_{j=0}^{n-1} z_j\exp(i\omega_j t) .$$

Thus, (3) shows that $x_t$ is the inverse Fourier transform of the DFT, $\{J_j\}$. Formula (3) is called Fourier inversion.

We can now derive Formula (1) (the real version of (3)), as follows. From (3), we find that

$$x_t = J_0 + \sum_{0 < j < n/2}\left[J_j\exp(i\omega_j t) + J_{n-j}\exp(i\omega_{n-j} t)\right] + J_{n/2}\exp(i\omega_{n/2} t) ,$$

where the last term is present only when $n$ is even. Since $J_{n-j} = \bar{J}_j$ and $\exp(i\omega_{n-j} t) = \exp(-i\omega_j t)$, we find that

$$J_j\exp(i\omega_j t) + J_{n-j}\exp(i\omega_{n-j} t) = 2\,\mathrm{Re}\left[J_j\exp(i\omega_j t)\right] = 2\,\mathrm{Re}\left[\tfrac{1}{2}(A_j - iB_j)(\cos\omega_j t + i\sin\omega_j t)\right] = A_j\cos\omega_j t + B_j\sin\omega_j t .$$

Moreover, $J_0 = A_0$, $J_{n/2} = A_{n/2}$, and $\exp(i\omega_{n/2} t) = \exp(i\pi t) = (-1)^t$. Therefore,

$$x_t = A_0 + \sum_{0 < j < n/2} (A_j\cos\omega_j t + B_j\sin\omega_j t) + A_{n/2}(-1)^t .$$
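Both the Fourier inversion formula (3) and its real form (1) can be tested by reconstructing a data set from its harmonic components. The following is a minimal sketch using only the Python standard library; the data values and function names are illustrative assumptions, not part of the notes. It covers an even and an odd series length, since the $A_{n/2}(-1)^t$ term appears only when $n$ is even.

```python
import cmath
import math

def dft(x):
    """J_j = (1/n) sum_t x_t exp(-i w_j t), j = 0, ..., n-1."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-1j * 2 * math.pi * j * t / n) for t in range(n)) / n
        for j in range(n)
    ]

def inverse_dft(J):
    """Formula (3): x_t = sum_j J_j exp(i w_j t)."""
    n = len(J)
    return [
        sum(J[j] * cmath.exp(1j * 2 * math.pi * j * t / n) for j in range(n))
        for t in range(n)
    ]

def harmonic_sum(x):
    """Formula (1): rebuild x_t from A_0, (A_j, B_j) for 0 < j < n/2, and A_{n/2}."""
    n = len(x)
    recon = [sum(x) / n] * n                     # start from A_0 = mean of x
    for j in range(1, (n + 1) // 2):             # indices with 0 < j < n/2
        w = 2 * math.pi * j / n
        A = 2 / n * sum(x[t] * math.cos(w * t) for t in range(n))
        B = 2 / n * sum(x[t] * math.sin(w * t) for t in range(n))
        for t in range(n):
            recon[t] += A * math.cos(w * t) + B * math.sin(w * t)
    if n % 2 == 0:                               # A_{n/2} term exists only for even n
        A_half = sum(x[t] * (-1) ** t for t in range(n)) / n
        for t in range(n):
            recon[t] += A_half * (-1) ** t
    return recon

def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

# Made-up data (assumptions for illustration): n = 8 (even) and n = 7 (odd).
x_even = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0, 0.0, 4.0]
x_odd = x_even[:7]

err_inversion = max_abs_diff(inverse_dft(dft(x_even)), x_even)
err_real_even = max_abs_diff(harmonic_sum(x_even), x_even)
err_real_odd = max_abs_diff(harmonic_sum(x_odd), x_odd)
print(err_inversion, err_real_even, err_real_odd)
```

Each reconstruction should match the original data up to rounding error, which illustrates the earlier remark that no information is gained or lost by passing to the frequency domain.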