2: ELEMENTARY FREQUENCY DOMAIN FACTS AND TECHNIQUES

Least Squares Fitting of Sinusoidal Models

Suppose we suspect that our data is dominated by a periodic component at frequency ω. Assume for now that ω is known. Then it is appropriate to fit the model

    x_t = µ + A cos ωt + B sin ωt + ε_t ,    (1)

where the ε_t are random error terms, typically assumed to be uncorrelated with zero mean and equal variances. The exact least squares estimates of the parameters µ, A, B are solutions to the normal equations

    Σ (x_t − µ − A cos ωt − B sin ωt) = 0 ,
    Σ cos ωt (x_t − µ − A cos ωt − B sin ωt) = 0 ,
    Σ sin ωt (x_t − µ − A cos ωt − B sin ωt) = 0 .

The terms in these equations which do not depend on x_t may be evaluated from Exercise 2.2:

    Σ (cos ωt)² = (n/2) {1 + D_n(2ω) cos (n−1)ω} ,
    Σ (sin ωt)² = (n/2) {1 − D_n(2ω) cos (n−1)ω} ,
    Σ cos ωt sin ωt = (n/2) D_n(2ω) sin (n−1)ω ,
    Σ cos ωt = n D_n(ω) cos {(n−1)ω/2} ,
    Σ sin ωt = n D_n(ω) sin {(n−1)ω/2} .

If ω is not too close to zero and n is large, D_n(2ω) is negligible. (And if ω is a nonzero Fourier frequency, then D_n(2ω) is exactly zero.) Thus, we obtain the approximations

    Σ cos² ωt ≈ n/2 ,  Σ sin² ωt ≈ n/2 ,  Σ cos ωt sin ωt ≈ 0 ,  Σ cos ωt ≈ 0 ,  Σ sin ωt ≈ 0 .

The normal equations then yield the approximate least squares estimators

    µ̂ = (1/n) Σ x_t ,  Â = (2/n) Σ x_t cos ωt ,  B̂ = (2/n) Σ x_t sin ωt .

Note that Â and B̂ are closely related to the DFT of the data, J(ω). Specifically, Â = 2 Re[J(ω)], and B̂ = −2 Im[J(ω)]. If ω = ω_j is a Fourier frequency, then Â and B̂ are the cosine and sine sums A_j, B_j introduced in Part 1.

Decomposing the Sum of Squares

From the harmonic representation

    x_t = Σ_{j=0}^{n−1} J_j exp(i ω_j t) ,

x_t is a sum of (complex) harmonic components at frequencies ω_0, . . . , ω_{n−1}. The component at ω_j is the series (or vector) J_j exp(i ω_j t), t = 0, . . . , n−1. This vector has squared modulus (sum of squares)

    Σ_{t=0}^{n−1} |J_j exp(i ω_j t)|² = Σ_{t=0}^{n−1} |J_j|² = n |J_j|² .
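The approximate estimators and their relation to the DFT can be checked numerically. The following sketch (an illustration, not part of the text) uses Python/NumPy and assumes the normalization J(ω) = (1/n) Σ x_t exp(−iωt), consistent with Â = 2 Re[J(ω)] above; the simulated model parameters are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
w = 2 * np.pi * 20 / n            # a Fourier frequency (illustrative choice)

# Simulate model (1): x_t = mu + A cos(wt) + B sin(wt) + noise
mu, A, B = 1.0, 2.0, -0.5
x = mu + A * np.cos(w * t) + B * np.sin(w * t) + rng.normal(0.0, 0.1, n)

# Approximate least squares estimators from the text
mu_hat = x.mean()
A_hat = (2 / n) * np.sum(x * np.cos(w * t))
B_hat = (2 / n) * np.sum(x * np.sin(w * t))

# Exact least squares (normal equations solved directly), for comparison
X = np.column_stack([np.ones(n), np.cos(w * t), np.sin(w * t)])
mu_ls, A_ls, B_ls = np.linalg.lstsq(X, x, rcond=None)[0]

# DFT relation: A_hat = 2 Re J(w), B_hat = -2 Im J(w),
# assuming J(w) = (1/n) sum_t x_t exp(-i w t)
J = np.sum(x * np.exp(-1j * w * t)) / n
print(A_hat, 2 * J.real)
print(B_hat, -2 * J.imag)
```

Because ω here is a Fourier frequency, the cross sums vanish exactly, so the approximate estimators agree with the exact least squares solution rather than merely approximating it.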
The sum of squares of the data vector {x_t}_{t=0}^{n−1} has a nice representation in terms of the sums of squares of the harmonic components of {x_t}:

    Σ_{t=0}^{n−1} |x_t|² = Σ_{t=0}^{n−1} | Σ_{j=0}^{n−1} J_j exp(i ω_j t) |²
                         = Σ_{t=0}^{n−1} Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} J_j J_k* exp(i ω_{j−k} t)
                         = Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} J_j J_k* Σ_{t=0}^{n−1} exp(i ω_{j−k} t)
                         = Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} J_j J_k* n 1{j = k}
                         = n Σ_{j=0}^{n−1} |J_j|² .

Thus, the sum of squares of {x_t} is decomposed as the sum of the sums of squares n|J_j|² of the harmonic components of {x_t}.

The Periodogram

The periodogram was invented by Schuster in 1898 for detecting periodic components in time series data. It is defined by

    I_j = (n/2π) |J_j|² ,  j = 0, . . . , n−1 .

More generally, the periodogram is defined for all frequencies ω by

    I(ω) = (n/2π) |J(ω)|² .

Note that I_j = I(ω_j). Note also that the sum of squares of {x_t} is

    Σ_{t=0}^{n−1} |x_t|² = 2π Σ_{j=0}^{n−1} I_j ,

a constant multiple of the sum of the periodogram values. The periodogram ordinate I_j is the contribution to the total sum of squares of {x_t} due to the harmonic component at frequency ω_j. If x_t = A cos(ω_j t) + B sin(ω_j t), a pure sinusoid oscillating at an exact Fourier frequency, then all periodogram ordinates will be zero except for I_j (and its mirror image I_{n−j}, since a real sinusoid contributes two conjugate complex harmonic components). In general, a peak in the periodogram at a given frequency indicates a strong harmonic component in the data {x_t} at that frequency. In the model (1), if the true frequency ω is unknown, it can be estimated by maximizing I(λ) over all frequencies λ. To reduce the computational expense of the procedure, we may wish to restrict the search to the Fourier frequencies ω_j. If used wisely, the periodogram can be a very useful tool. If used naively, however, it can lead the data analyst to find periodic components which do not really exist in the underlying data-generating mechanism. For example, in model (1), even if A and B are zero, so that the data are pure noise, the periodogram will typically exhibit several strong peaks.
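The decomposition and the periodogram scaling can be verified numerically. The sketch below (an illustration, not from the text) uses NumPy's FFT, again under the assumed normalization J_j = (1/n) Σ_t x_t exp(−i ω_j t):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128
t = np.arange(n)
x = rng.normal(size=n)

# Harmonic coefficients J_j (assumed normalization: divide the FFT by n)
J = np.fft.fft(x) / n

# Periodogram ordinates I_j = (n / 2 pi) |J_j|^2
I = n * np.abs(J) ** 2 / (2 * np.pi)

# Sum-of-squares decomposition: sum_t |x_t|^2 = n sum_j |J_j|^2 = 2 pi sum_j I_j
print(np.sum(x ** 2))
print(n * np.sum(np.abs(J) ** 2))
print(2 * np.pi * np.sum(I))

# A pure sinusoid at a Fourier frequency concentrates the periodogram
# at I_j and its mirror image I_{n-j}
x2 = np.cos(2 * np.pi * 10 * t / n)
I2 = n * np.abs(np.fft.fft(x2) / n) ** 2 / (2 * np.pi)
peaks = np.flatnonzero(I2 > 1e-10)
print(peaks)                      # indices 10 and n - 10 = 118
```

The first three printed numbers coincide, which is exactly the identity Σ|x_t|² = 2π Σ I_j derived above.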
Thus, until we can derive the distribution of the periodogram under the null hypothesis that there are no underlying periodic components, we should be careful not to declare that we have found a "true" cycle. Many past attempts at detecting "business cycles" in economic data are flawed by their failure to account for random variation in the periodogram.

Smooth Functions

Qualitatively speaking, a sequence x_0, . . . , x_{n−1} is smooth if its DFT is large only at low frequencies. This is true since exp(i ω_j t) for j small is a slowly varying (hence smooth) function, and since any linear combination of smooth functions is smooth. On the other hand, if j is large, exp(i ω_j t) oscillates rapidly and hence is not smooth. No smooth sequence can contain such components with large amplitude. A quantitative measure of roughness is the (circular) roughness coefficient,

    Σ_{t=0}^{n−1} (x_t − x_{t−1})² / Σ_{t=0}^{n−1} x_t² ,

where the differences are circular, i.e., x_{−1} is taken to be x_{n−1}.

To see how the coefficient measures roughness, we use frequency domain methods. Since x_t = Σ_{j=0}^{n−1} J_j exp(i ω_j t), we have

    x_t − x_{t−1} = Σ_{j=0}^{n−1} J_j {exp(i ω_j t) − exp(i ω_j (t−1))} = Σ_{j=0}^{n−1} J_j exp(i ω_j t) {1 − exp(−i ω_j)} .

Thus, the roughness coefficient is (see "Decomposing the Sum of Squares")

    Σ_{j=0}^{n−1} |1 − exp(−i ω_j)|² |J_j|² / Σ_{j=0}^{n−1} |J_j|²  =  Σ_{j=0}^{n−1} 4 sin²(ω_j/2) |J_j|² / Σ_{j=0}^{n−1} |J_j|² .

(The proof of the last equality is left as an exercise.) Thus, for a fixed value of the denominator, the roughness coefficient is reduced by increasing the low frequency terms at the expense of the high frequency terms.

Aliasing

Suppose that our time series is an exact cosine wave, x_t = cos ωt. The frequency ω may be any real number, but the fact that we only have observations at the equally spaced time points t = 0, . . . , n−1 implies that certain frequencies will be indistinguishable from each other (aliases).
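The equality of the time domain and frequency domain forms of the roughness coefficient can be checked numerically. The following is an illustrative NumPy sketch (not from the text), using the same assumed DFT normalization J_j = (1/n) Σ_t x_t exp(−i ω_j t) as before:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=n)

# Time domain: circular roughness coefficient (x_{-1} taken to be x_{n-1})
rough_time = np.sum((x - np.roll(x, 1)) ** 2) / np.sum(x ** 2)

# Frequency domain: sum_j 4 sin^2(w_j / 2) |J_j|^2 / sum_j |J_j|^2
J = np.fft.fft(x) / n
w = 2 * np.pi * np.arange(n) / n
rough_freq = np.sum(4 * np.sin(w / 2) ** 2 * np.abs(J) ** 2) / np.sum(np.abs(J) ** 2)

print(rough_time, rough_freq)     # the two forms agree
```

The agreement holds exactly (up to rounding) because the circular difference simply multiplies each harmonic coefficient J_j by 1 − exp(−i ω_j), as in the derivation above.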
In fact, the largest frequency (i.e., the most rapid oscillation) we can observe is ω = π, which corresponds to the series x_t = cos πt = (−1)^t. For any larger value of ω, the series x_t = cos ωt is indistinguishable from the series x_t = cos ω′t (t = 0, . . . , n−1) for some ω′ such that 0 ≤ ω′ ≤ π. (An example is given below.) The frequencies ω and ω′ are said to be aliases of each other, and ω′ is the principal alias.

For example, suppose that ω is between π and 2π. Then we can define ω′ = 2π − ω. Note that ω′ is between 0 and π, and

    cos ω′t = cos(2πt − ωt) = cos(2πt) cos(−ωt) − sin(2πt) sin(−ωt) = cos ωt ,

since cos 2πt = 1 and sin 2πt = 0 for every integer t. Thus, the frequencies ω′ and ω are indistinguishable, as long as the time spacing between observations (the "sampling rate") is one time unit. Even if the series really is cos ω′t, it looks just like the series cos ωt when observed at equally spaced intervals of one time unit, t = 0, 1, . . . , n−1. To see the difference, we would have to increase the sampling rate.

As stated above, the maximum observable frequency is π radians per unit time. It is called the folding frequency (or Nyquist frequency). All higher frequencies are "folded" down into the interval [0, π]. A sinusoid at the Nyquist frequency executes one complete cycle for every two units of time (that is, one half cycle per unit time, or 2π(1/2) = π radians per unit time). Thus, for a frequency to be observable (i.e., in the range of principal aliases), at least two samples (observations) per cycle are required.
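Aliasing at unit sampling spacing is easy to demonstrate numerically. A small illustrative sketch (the frequency 1.5π is an arbitrary choice, not from the text):

```python
import numpy as np

t = np.arange(12)                 # observations at integer times only

w = 1.5 * np.pi                   # a frequency above the folding frequency pi
w_alias = 2 * np.pi - w           # its principal alias, 0.5 * pi, lies in [0, pi]

# Sampled one time unit apart, the two cosines produce identical series
x_high = np.cos(w * t)
x_low = np.cos(w_alias * t)
print(np.allclose(x_high, x_low))   # True
```

Evaluating either cosine only at integer t discards the information that would distinguish the two frequencies; sampling faster than once per time unit would reveal the difference.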