Revised Chapter 15 in Specifying and Diagnostically Testing Econometric Models (Edition 3) © by Houston H. Stokes 8 February 2015. All rights reserved. Preliminary Draft

Chapter 15

Spectral Analysis of Time Series
15.0 Introduction
15.1 A Brief Treatment of Spectral Analysis Theory
    Table 15.1 Values and Names Calculated by the B34S spectral Command
15.2 Examples
    Table 15.2 Program to Generate AR(1) models
    Figure 15.1 AR(1) Model where φ = .9
    Figure 15.2 AR(1) Model where φ = -.9
    Table 15.3 Program to Analyze Gas Furnace Data using Spectral Methods
    Figure 15.3 Spectral analysis of GASIN
    Figure 15.4 Spectral analysis of GASOUT
    Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 1
    Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 2
15.3 Matrix Command Implementation
    Table 15.4 Matrix Command Implementation of Spectral Analysis
    Table 15.5 Cross Spectral Analysis with the Matrix Command
    Table 15.6 Verification that the sin and cosine vectors are orthogonal
    Table 15.7 Inverse Spectral Examples
15.4 Wavelet Analysis
    Table 15.8 Wavelet basis functions supported
    Table 15.9 Empirically derived factors for four wavelet bases
    Table 15.10 Wavelet Filter For The Nino Series
    Figure 15.6 Raw Nino series and three wavelet smoothed series
15.5 Use of Normalized Cumulative Periodogram to test for white noise
    Table 15.12 Calculating the Normalized Cumulative Periodogram
    Table 15.13 Testing series for white noise using the Normalized Cumulative Periodogram
    Figure 15.7 Normalized Cumulative Periodogram for gasout series
    Figure 15.8 Normalized Cumulative Periodogram for white noise series
15.6 Forecasting using spectral methods
    Table 15.14 A Subroutine to Calculate Forecasts using the FFT of a series
    Table 15.15 Using the FFT to Forecast
15.7 Conclusion

Spectral Analysis of Time Series

15.0 Introduction

For single series the B34S spectral command calculates the Fourier cosine and sine coefficients, the periodogram and, if weights are supplied, the spectrum. The spectral command provides substantially more capability than the spectral option available under the B34S bjiden and bjest sentences. For multiple series the cspectral paragraph calculates the real part of the cross periodogram, the imaginary part of the cross periodogram, the cospectral density estimate, the quadrature-spectrum estimate, the amplitude, the coherency squared and the phase spectrum. The output from this command is similar to that produced with the SAS/ETS spectra command.¹ After a brief discussion of the theory, the spectrum is calculated for generated AR data with positive and negative coefficients.
While spectral analysis usually proceeds by a Fourier decomposition of the series into cosine and sine coefficients, it is possible to use OLS methods to make these calculations. The advantage of the OLS approach is that it highlights the relationship between time and frequency domain approaches. The matrix command also contains substantial spectral programming capability that includes spectral, cspectral and fft commands together with complex variable manipulation capability. Use of these features will be discussed later.

Wavelet analysis provides a means by which a series can be filtered in a manner that has certain advantages over the windowed Fourier transformation (WFT). Torrence and Compo (1998) stress that the WFT has the disadvantage of being inefficient since it "imposes a scale or 'response interval' T into the analysis," a problem not found with the wavelet transformation. They advocate using wavelets as a means by which noise can be filtered from a data series. Hastie-Tibshirani-Friedman (2001, 149) note the ability of wavelets to represent a series and provide a means by which a sparse representation can be found. "Wavelet bases are very popular in signal processing and compression, since they are able to represent both smooth and/or locally bumpy functions in an efficient way – a phenomenon dubbed time and frequency localization. In contrast, the traditional Fourier basis allows only frequency localization." In section 15.4 the Torrence-Compo wavelet implementation is discussed and examples are shown.

15.1 A Brief Treatment of Spectral Analysis Theory

The basic idea behind spectral analysis is to decompose the variance in a series by frequency. Assume the model

$$y_t = \phi y_{t-1} + u_t. \tag{15.1-1}$$

If $\phi > 0$, it will be shown that the series contains low-frequency information, while if $\phi < 0$, the series contains high-frequency information. Spectral quantities, such as the periodogram and spectrum, can be calculated using frequency-domain methods or time-domain methods such as OLS.
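The claim about (15.1-1) can be previewed numerically before the derivation. The following is a minimal sketch, not the B34S implementation: it simulates the AR(1) model for φ = .9 and φ = -.9 and reports the share of the periodogram that falls at frequencies below π/2. The function names, the seed, the sample size of 301 and the unit-variance Gaussian noise are all assumptions made only for this illustration.

```python
import cmath
import math
import random


def simulate_ar1(phi, n, seed):
    """Generate n observations of y_t = phi*y_{t-1} + u_t, starting from y_0 = 0."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y


def periodogram(y):
    """Squared modulus of the DFT (divided by T) at frequencies 2*pi*j/T, j = 1..M."""
    T = len(y)
    mean = sum(y) / T
    M = (T - 1) // 2
    values = []
    for j in range(1, M + 1):
        w = 2.0 * math.pi * j / T
        dft = sum((y[t] - mean) * cmath.exp(-1j * w * t) for t in range(T))
        values.append(abs(dft) ** 2 / T)
    return values


def low_frequency_share(y):
    """Share of total periodogram mass at the lower half of the frequencies."""
    p = periodogram(y)
    half = len(p) // 2
    return sum(p[:half]) / sum(p)


# Hypothetical settings chosen for illustration only.
share_pos = low_frequency_share(simulate_ar1(0.9, 301, seed=12345))
share_neg = low_frequency_share(simulate_ar1(-0.9, 301, seed=12345))
print("phi = .9  low-frequency share:", round(share_pos, 3))
print("phi = -.9 low-frequency share:", round(share_neg, 3))
```

With a positive coefficient most of the mass should sit in the lower half of the frequency range, and with a negative coefficient in the upper half, which is the pattern the theory below predicts.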
Assume an autocovariance-generating function (see Hamilton (1994) equation [3.6.1])

$$g_y(z) = \sum_{j=-\infty}^{\infty} \gamma_j z^j, \tag{15.1-2}$$

where z is a complex scalar and the covariances are summable. For spectral models assume

$$z = \cos(\omega) - i\sin(\omega) = e^{-i\omega}. \tag{15.1-3}$$

The population spectrum $S_y(\omega)$ is equal to the autocovariance-generating function $g_y(z)$ evaluated at $z = e^{-i\omega}$ and divided by 2π:

$$S_y(\omega) = (2\pi)^{-1} g_y(e^{-i\omega}) = (2\pi)^{-1}\sum_{j=-\infty}^{\infty} \gamma_j e^{-i\omega j}. \tag{15.1-4}$$

¹ Jenkins-Watts (1968), Anderson (1971) and Bloomfield (1976) provide good references for spectral methods. Hamilton (1994, Chap. 6) provides a very concise treatment, which is further summarized here. For a more modern reference, see Wei (2006, Chapters 11-13).

Consider an MA(1) model. Once the autocovariance-generating function is known, the spectrum follows from it. Assume a model such as

$$y_t = e_t + \theta e_{t-1}, \tag{15.1-5}$$

which was discussed in Chapter 7. Here

$$E(y_t) = E(e_t) + \theta E(e_{t-1}) = 0 \tag{15.1-6}$$

$$E(y_t^2) = E(e_t + \theta e_{t-1})^2 = E(e_t^2 + 2\theta e_t e_{t-1} + \theta^2 e_{t-1}^2) = \sigma^2 + 0 + \theta^2\sigma^2 = (1+\theta^2)\sigma^2 \tag{15.1-7}$$

$$E(y_t y_{t-1}) = E(e_t + \theta e_{t-1})(e_{t-1} + \theta e_{t-2}) = E(e_t e_{t-1} + \theta e_{t-1}^2 + \theta e_t e_{t-2} + \theta^2 e_{t-1}e_{t-2}) = 0 + \theta E(e_{t-1}^2) + 0 + 0 = \theta\sigma^2. \tag{15.1-8}$$

Using (15.1-2), (15.1-7) and (15.1-8), the autocovariance-generating function for this model becomes

$$g_y(z) = [\theta\sigma^2]z^{-1} + [(1+\theta^2)\sigma^2]z^{0} + [\theta\sigma^2]z^{1} = \sigma^2[\theta z^{-1} + 1 + \theta^2 + \theta z] = \sigma^2(1+\theta z)(1+\theta z^{-1}). \tag{15.1-9}$$

For the general MA(q) model

$$g_y(z) = \sigma^2(1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q)(1 + \theta_1 z^{-1} + \theta_2 z^{-2} + \cdots + \theta_q z^{-q}). \tag{15.1-10}$$

For the AR(1) model

$$y_t = (1 - \phi B)^{-1} e_t \tag{15.1-11}$$

$$g_y(z) = \sigma^2/[(1 - \phi z)(1 - \phi z^{-1})]. \tag{15.1-12}$$

In general, for the ARMA(p,q) model

$$g_y(z) = \frac{\sigma^2(1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q)(1 + \theta_1 z^{-1} + \theta_2 z^{-2} + \cdots + \theta_q z^{-q})}{(1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p)(1 - \phi_1 z^{-1} - \phi_2 z^{-2} - \cdots - \phi_p z^{-p})}. \tag{15.1-13}$$

We now derive the spectrum of the MA(1) model. From (15.1-3), (15.1-4) and (15.1-9)

$$S_y(\omega) = (2\pi)^{-1}\sigma^2(1 + \theta e^{-i\omega})(1 + \theta e^{i\omega}) = (2\pi)^{-1}\sigma^2(1 + \theta e^{-i\omega} + \theta e^{i\omega} + \theta^2). \tag{15.1-14}$$

Since

$$e^{-i\omega} + e^{i\omega} = \cos(\omega) - i\sin(\omega) + \cos(\omega) + i\sin(\omega) = 2\cos(\omega), \tag{15.1-15}$$

$$S_y(\omega) = (2\pi)^{-1}\sigma^2[1 + \theta^2 + 2\theta\cos(\omega)]. \tag{15.1-16}$$

Since $\cos(\omega)$ goes from 1 to -1 as $\omega$ goes from 0 to π, equation (15.1-16) indicates that if $\theta > 0$ ($\theta < 0$), the spectrum monotonically decreases (increases) as frequency increases from 0 to π. The spectrum of the AR(1) model in equation (15.1-11) can be calculated from the autocovariance-generating function (15.1-12) using (15.1-15) as

$$S_y(\omega) = (2\pi)^{-1}\sigma^2/[(1 - \phi e^{-i\omega})(1 - \phi e^{i\omega})] = (2\pi)^{-1}\sigma^2/[1 - \phi e^{-i\omega} - \phi e^{i\omega} + \phi^2] = (2\pi)^{-1}\sigma^2/[1 + \phi^2 - 2\phi\cos(\omega)]. \tag{15.1-17}$$

Equation (15.1-17) shows that if $\phi > 0$ ($\phi < 0$), the spectrum is monotonically decreasing (increasing) over the range [0, π] since the denominator increases (decreases). From (15.1-13), with $z = e^{-i\omega}$, the spectrum for the ARMA(p,q) process becomes

$$S_y(\omega) = \frac{\sigma^2(1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q)(1 + \theta_1 z^{-1} + \theta_2 z^{-2} + \cdots + \theta_q z^{-q})}{2\pi(1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p)(1 - \phi_1 z^{-1} - \phi_2 z^{-2} - \cdots - \phi_p z^{-p})}. \tag{15.1-18}$$

Following Hamilton (1994), it can be shown that if

$$(1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q) = (1 - \eta_1 z)(1 - \eta_2 z)\cdots(1 - \eta_q z) \tag{15.1-19}$$

$$(1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p) = (1 - \lambda_1 z)(1 - \lambda_2 z)\cdots(1 - \lambda_p z) \tag{15.1-20}$$

then (15.1-18) can be written as

$$S_y(\omega) = (\sigma^2/2\pi)\,\frac{\prod_{j=1}^{q}\,[1 + \eta_j^2 - 2\eta_j\cos(\omega)]}{\prod_{j=1}^{p}\,[1 + \lambda_j^2 - 2\lambda_j\cos(\omega)]}. \tag{15.1-21}$$

The spectrum contains all the information concerning the autocovariances. Assuming the sequence of autocovariances $\{\gamma_j\}_{j=-\infty}^{\infty}$ is summable (i.e., the covariances die out), Hamilton (1994, Appendix 6.A) proves that

$$\int_{-\pi}^{\pi} S_y(\omega)e^{i\omega k}\,d\omega = \gamma_k \tag{15.1-22}$$

or, alternatively,

$$\int_{-\pi}^{\pi} S_y(\omega)\cos(\omega k)\,d\omega = \gamma_k. \tag{15.1-23}$$

For k=0, equation (15.1-22) becomes

$$\int_{-\pi}^{\pi} S_y(\omega)\,d\omega = \gamma_0, \tag{15.1-24}$$

which indicates that the area under the population spectrum between -π and π is the variance. Since the spectrum is symmetric, the portion of the variance of $y_t$ that is attributed to frequencies less than $\omega_j$ is $2\int_{0}^{\omega_j} S_y(\omega)\,d\omega$.

Results for a sample of T observations follow directly from the above results for the population. From (15.1-4) we get

$$\hat P_y(\omega) = (2\pi)^{-1}\sum_{j=-T+1}^{T-1}\hat\gamma_j e^{-i\omega j}, \tag{15.1-25}$$

where $\hat P_y(\omega)$ is the sample periodogram.² Since

$$e^{-i\omega j} = \cos(\omega j) - i\sin(\omega j) \tag{15.1-26}$$

and for a covariance stationary process $\gamma_j = \gamma_{-j}$, we can write (15.1-25) as

$$\hat P_y(\omega) = (2\pi)^{-1}\sum_{j=-T+1}^{T-1}\hat\gamma_j[\cos(\omega j) - i\sin(\omega j)] = (2\pi)^{-1}\hat\gamma_0[\cos(0) - i\sin(0)] + (2\pi)^{-1}\sum_{j=1}^{T-1}\hat\gamma_j[\cos(\omega j) + \cos(-\omega j) - i\sin(\omega j) - i\sin(-\omega j)], \tag{15.1-27}$$

which, since $\cos(0) = 1$, $\sin(0) = 0$, $\sin(-\theta) = -\sin(\theta)$ and $\cos(-\theta) = \cos(\theta)$, can be written

$$\hat P_y(\omega) = (2\pi)^{-1}\Big[\hat\gamma_0 + 2\sum_{j=1}^{T-1}\hat\gamma_j\cos(\omega j)\Big]. \tag{15.1-28}$$

² $\hat P_y(\omega)$ is called the sample periodogram. The sample periodogram can be smoothed to form the sample spectrum $\hat S_y(\omega)$. Weights are selected to smooth out noise in the estimated sample periodogram. In B34S there must be an odd number of weights. All weights are normalized to sum to (1/(4*π)). WEIGHTS(1 1 1)$ implies rectangular weighting, while WEIGHTS(1 2 1)$ implies triangular weighting.

The sample periodogram can be estimated using Fourier methods or OLS. While the OLS approach is slower, it highlights what is being estimated with spectral analysis. The spectral representation theorem states that any covariance-stationary process $y_t$ can be written as

$$y_t = \mu + \int_{0}^{\pi}[\alpha(\omega)\cos(\omega t) + \delta(\omega)\sin(\omega t)]\,d\omega. \tag{15.1-29}$$

Its sample equivalent is

$$y_t = \hat\mu + \sum_{j=1}^{M}[\hat\alpha_j\cos(\omega_j(t-1)) + \hat\delta_j\sin(\omega_j(t-1))] + e_t. \tag{15.1-30}$$

If T is odd, there will be M=(T-1)/2 frequencies in equation (15.1-30), where $\omega_1 = 2\pi/T$, $\omega_2 = 4\pi/T$, and $\omega_M = 2M\pi/T = (T-1)\pi/T$. If equation (15.1-30) is estimated M times, then the M sets of coefficients $\hat\alpha_j$ and $\hat\delta_j$ can be used to recover the sample periodogram and other quantities. The reason for this is that all the right-hand-side variables are orthogonal and hence the regression can be done in sequences of models with two variables on the right plus the constant. Define $\hat P_i(\omega_j)$ as the sample periodogram of series i (here $y_t$) evaluated at frequency $\omega_j$, then

$$\hat P_i(\omega_j) = (T/8\pi)(\hat\alpha_j^2 + \hat\delta_j^2). \tag{15.1-31}$$

The sample variance³ involves summing the periodogram values and is

$$(1/T)\sum_{t=1}^{T}(y_t - \bar y)^2 = .5\Big[\sum_{j=1}^{M}(\hat\alpha_j^2 + \hat\delta_j^2)\Big]. \tag{15.1-32}$$

³ Note that the large sample variance formula is used.

Since

$$\hat\alpha_j = (2/T)\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)] \tag{15.1-33}$$

$$\hat\delta_j = (2/T)\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)], \tag{15.1-34}$$

(15.1-31) can be written as

$$\hat P_i(\omega_j) = (1/2\pi T)\Big[\Big(\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)]\Big)^2 + \Big(\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)]\Big)^2\Big]. \tag{15.1-35}$$

It can be shown that

$$2\hat P_i(\omega)/S_i(\omega) \approx \chi^2(2), \tag{15.1-36}$$

which implies that

$$E[\hat P_i(\omega)] \cong S_i(\omega). \tag{15.1-37}$$

Since $\chi^2(2)$ has a mean of 2 and a 95% confidence interval of .05 - 7.4, $\hat P_i(\omega)$ is not a good estimate of the population spectrum. Another problem is that (15.1-28) requires that as many parameters ($\hat\gamma_j$) have to be estimated as there are observations. The solution is to weight $\hat P_i(\omega)$ to form an estimate of the sample spectrum for series i,

$$\hat S_i(\omega_j) = \sum_{m=-h}^{h} w_m\,\hat P_i(\omega_{j+m}), \tag{15.1-38}$$

where h = (the number of weights minus 1) divided by 2.

In B34S the WEIGHTS sentence requires that the user set an odd number of weights to be used for smoothing the spectrum. The weights are then normalized to sum to 1/(4*π) or .079577. Up to 99 weights can be supplied. If the supplied sentence was weights(1 2 3 2 1)$ then triangular weights of .0088419, .017684, .026526, .017684 and .0088419 would be used, while if the sentence was weights(1 1 1 1 1)$ then there would be five weights of (1/20π) or .015915. Note that both sets of weights sum to (1/4π) or .079577.

B34S and SAS use a slightly different parameterization of the spectral quantities than the one given above. The values calculated by B34S (and SAS) are listed in Table 15.1. Note that $\omega_j$ in equations (15.1-33) and (15.1-34) is 2πk/T in Table 15.1 in the calculation of $\hat\alpha_j$ and $\hat\delta_j$, which are called COSj(k) and SINj(k), respectively. B34S and SAS normalize $(\hat\alpha_j^2 + \hat\delta_j^2)$ by T/2 in place of the (T/8π) listed in equation (15.1-31).

If the data contain cycles with frequencies greater than π, these will be aliased and seen as cycles with frequencies in the range 0 to π. For the lowest frequency $\omega_1 = 2\pi/T$, the corresponding period is T.
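The sample formulas above can be checked numerically. The sketch below is illustrative only and is not the B34S code: it computes the coefficients of (15.1-33) and (15.1-34), forms the periodogram of (15.1-31), verifies the variance identity (15.1-32) for an odd T, and normalizes a set of raw weights so that they sum to 1/(4π) as described for the WEIGHTS sentence. The data values and all function names are made up for the example.

```python
import math


def fourier_coefficients(y):
    """Return (alpha, delta) from (15.1-33) and (15.1-34) for j = 1..M, T odd."""
    T = len(y)
    if T % 2 == 0:
        raise ValueError("this sketch assumes an odd number of observations")
    M = (T - 1) // 2
    alpha, delta = [], []
    for j in range(1, M + 1):
        w = 2.0 * math.pi * j / T
        # range(T) plays the role of (t - 1) for t = 1..T
        alpha.append((2.0 / T) * sum(y[t] * math.cos(w * t) for t in range(T)))
        delta.append((2.0 / T) * sum(y[t] * math.sin(w * t) for t in range(T)))
    return alpha, delta


def periodogram(alpha, delta, T):
    """Periodogram with the (T / 8 pi) scaling of (15.1-31)."""
    scale = T / (8.0 * math.pi)
    return [scale * (a * a + d * d) for a, d in zip(alpha, delta)]


def normalized_weights(raw):
    """Scale raw weights so that they sum to 1 / (4 pi)."""
    if len(raw) % 2 == 0:
        raise ValueError("an odd number of weights is required")
    total = float(sum(raw))
    return [w / total / (4.0 * math.pi) for w in raw]


y = [2.0, 4.5, 3.1, 5.2, 6.0, 4.4, 3.9, 5.8, 7.1]  # made-up data, T = 9
T = len(y)
alpha, delta = fourier_coefficients(y)
p_hat = periodogram(alpha, delta, T)

ybar = sum(y) / T
variance = sum((v - ybar) ** 2 for v in y) / T            # left side of (15.1-32)
half_sum = 0.5 * sum(a * a + d * d for a, d in zip(alpha, delta))  # right side

weights = normalized_weights([1, 2, 3, 2, 1])
print(variance, half_sum)
print([round(w, 7) for w in weights])
```

The two sides of (15.1-32) should agree up to rounding error, and the normalized triangular weights should match the values .0088419, .017684 and .026526 quoted in the text.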
In words, a lowest frequency of 2π/T means that if there are T data points, it will be impossible to detect a cycle longer than T. If two series that are uncorrelated at all leads and lags are added, then the autocovariance-generating function of the sum is the sum of the autocovariance-generating functions of each series.

Up until now the analysis has been in terms of one series. We now assume two series, $x_t$ and $y_t$, and develop a spectral representation. Expanding the notation, define the population cross spectrum as

$$S_{xy}(\omega) = (1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\{\cos(\omega j) - i\sin(\omega j)\}. \tag{15.1-39}$$

Equation (15.1-39) can be broken into two parts, the population cospectrum $c_{xy}(\omega)$ defined as

$$c_{xy}(\omega) = (1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\cos(\omega j) \tag{15.1-40}$$

and the population quadrature spectrum defined as

$$q_{xy}(\omega) = -(1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\sin(\omega j), \tag{15.1-41}$$

where

$$S_{xy}(\omega) = c_{xy}(\omega) + i\,q_{xy}(\omega). \tag{15.1-42}$$

It can be shown that the covariance between x and y is

$$\int_{-\pi}^{\pi} S_{xy}(\omega)\,d\omega = E(y_t - \mu_y)(x_t - \mu_x) \tag{15.1-43}$$

or

$$\int_{-\pi}^{\pi} c_{xy}(\omega)\,d\omega = E(y_t - \mu_y)(x_t - \mu_x) \tag{15.1-44}$$

since

$$\int_{-\pi}^{\pi} q_{xy}(\omega)\,d\omega = 0. \tag{15.1-45}$$

The population cospectrum, $c_{xy}(\omega)$, measures the portion of the covariance between $x_t$ and $y_t$ that is attributable to cycles of frequency $\omega$. Looking now at the sample, define

$$\hat\alpha_{yj} = (2/T)\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)] \tag{15.1-46}$$

$$\hat\delta_{yj} = (2/T)\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)] \tag{15.1-47}$$

$$\hat\alpha_{xj} = (2/T)\sum_{t=1}^{T} x_t\cos[\omega_j(t-1)] \tag{15.1-48}$$

$$\hat\delta_{xj} = (2/T)\sum_{t=1}^{T} x_t\sin[\omega_j(t-1)]. \tag{15.1-49}$$

The sample covariance between $x_t$ and $y_t$ can be written as

$$(1/T)\sum_{t=1}^{T}(y_t - \bar y)(x_t - \bar x) = (1/2)\sum_{j=1}^{M}(\hat\alpha_{yj}\hat\alpha_{xj} + \hat\delta_{xj}\hat\delta_{yj}), \tag{15.1-49}$$

which implies that the portion of the sample covariance between $x_t$ and $y_t$ that is due to common dependence on cycles of frequency $\omega_j$ is $(1/2)(\hat\alpha_{xj}\hat\alpha_{yj} + \hat\delta_{xj}\hat\delta_{yj})$. The sample cross periodogram is the sum of a real component $\hat c_{xy}(\omega_j)$ and an imaginary component $\hat q_{xy}(\omega_j)$,

$$\hat P_{xy}(\omega_j) = \hat c_{xy}(\omega_j) + i\,\hat q_{xy}(\omega_j), \tag{15.1-50}$$

where

$$\hat c_{xy}(\omega_j) = (T/8\pi)(\hat\alpha_{xj}\hat\alpha_{yj} + \hat\delta_{xj}\hat\delta_{yj}) \tag{15.1-51}$$

$$\hat q_{xy}(\omega_j) = (T/8\pi)(\hat\alpha_{yj}\hat\delta_{xj} - \hat\delta_{yj}\hat\alpha_{xj}). \tag{15.1-52}$$

The formulas used by B34S and SAS scale by (T/2) in place of (T/8π) in equations (15.1-51) and (15.1-52), so that $RP_{xy}(k) = 4\pi\,\hat c_{xy}(\omega_j)$ and $IP_{xy}(k) = 4\pi\,\hat q_{xy}(\omega_j)$.

While the real part of the sample cross periodogram measures to what degree $x_t$ and $y_t$ have common cycles, i.e., cycles at the same frequency, the imaginary component of the sample cross periodogram corrects for whether these cycles are in phase or out of phase. If both series shared common cycles, but these cycles were out of phase, the result would be that the contemporaneous covariance would be low. In a manner similar to equation (15.1-38), the real and imaginary parts of the sample cross periodogram can be weighted to form the sample cospectral density estimate $\hat C_{xy}(\omega_j)$ and the sample quadrature spectrum $\hat Q_{xy}(\omega_j)$:

$$\hat C_{xy}(\omega_j) = \sum_{m=-h}^{h} w_m\,\hat c_{xy}(\omega_{j+m}) \tag{15.1-53}$$

$$\hat Q_{xy}(\omega_j) = \sum_{m=-h}^{h} w_m\,\hat q_{xy}(\omega_{j+m}). \tag{15.1-54}$$

The sample coherence $\hat h_{xy}(\omega_j)$, which measures the correlation between $x_t$ and $y_t$ by frequency, is

$$\hat h_{xy}(\omega_j) = [(\hat C_{xy}(\omega_j))^2 + (\hat Q_{xy}(\omega_j))^2]/[\hat S_x(\omega_j)\hat S_y(\omega_j)], \tag{15.1-55}$$

while the amplitude $\hat A_{xy}(\omega_j)$ is

$$\hat A_{xy}(\omega_j) = [(\hat C_{xy}(\omega_j))^2 + (\hat Q_{xy}(\omega_j))^2]^{.5}. \tag{15.1-56}$$

Using (15.1-56), (15.1-55) can be rewritten as

$$\hat h_{xy}(\omega_j) = [\hat A_{xy}(\omega_j)]^2/[\hat S_x(\omega_j)\hat S_y(\omega_j)]. \tag{15.1-57}$$

The phase $\hat\psi_{xy}(\omega_j)$ between $x_t$ and $y_t$ is

$$\hat\psi_{xy}(\omega_j) = \arctan[\hat Q_{xy}(\omega_j), \hat C_{xy}(\omega_j)]. \tag{15.1-58}$$

In the next section generated data and the Box-Jenkins (1976) gas furnace data are used to illustrate these spectral magnitudes. Table 15.1 provides a summary of the formulas used in these calculations and the B34S names.

Before moving to this discussion it is important to fully document the differences between SAS and B34S. The B34S spectral command will exactly replicate the SAS procedure spectra with two exceptions. The first is that, in contrast with SAS, B34S does not print the zero frequency data point (where PERIOD = ∞).
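The sample cross-spectral quantities defined in (15.1-46) through (15.1-58) can be illustrated with a small sketch. It is not the B34S or SAS implementation. The two series are made up, the function name is hypothetical, the quadrature term follows the ordering given for IPmn(k) in Table 15.1 with m = x and n = y, and the two-argument arctangent of (15.1-58) is assumed to correspond to Python's `math.atan2(Q, C)`. No smoothing weights are applied, so the cross periodogram values stand in for the smoothed estimates.

```python
import math


def coefficients(series):
    """Cosine and sine coefficients at frequencies 2*pi*j/T, j = 1..M, T odd."""
    T = len(series)
    M = (T - 1) // 2
    alpha, delta = [], []
    for j in range(1, M + 1):
        w = 2.0 * math.pi * j / T
        alpha.append((2.0 / T) * sum(series[t] * math.cos(w * t) for t in range(T)))
        delta.append((2.0 / T) * sum(series[t] * math.sin(w * t) for t in range(T)))
    return alpha, delta


# Made-up series of odd length, used only for illustration.
x = [1.2, 0.7, -0.3, -1.1, -0.4, 0.9, 1.6, 0.8, -0.6, -1.4, -0.2]
y = [0.1, 1.0, 0.9, 0.0, -0.9, -0.7, 0.5, 1.3, 1.1, -0.2, -1.2]
T = len(x)
ax, dx = coefficients(x)
ay, dy = coefficients(y)
M = len(ax)
scale = T / (8.0 * math.pi)

c_hat = [scale * (ax[j] * ay[j] + dx[j] * dy[j]) for j in range(M)]  # real part
q_hat = [scale * (ay[j] * dx[j] - dy[j] * ax[j]) for j in range(M)]  # imaginary part (assumed sign)
p_x = [scale * (ax[j] ** 2 + dx[j] ** 2) for j in range(M)]
p_y = [scale * (ay[j] ** 2 + dy[j] ** 2) for j in range(M)]

amplitude = [math.sqrt(c * c + q * q) for c, q in zip(c_hat, q_hat)]
coherency = [a * a / (px * py) for a, px, py in zip(amplitude, p_x, p_y)]
phase = [math.atan2(q, c) for c, q in zip(c_hat, q_hat)]

xbar, ybar = sum(x) / T, sum(y) / T
covariance = sum((x[t] - xbar) * (y[t] - ybar) for t in range(T)) / T
identity = 0.5 * sum(ax[j] * ay[j] + dx[j] * dy[j] for j in range(M))
print(covariance, identity)
print([round(k, 6) for k in coherency])
```

Two things are worth noting in the output. The sample covariance equals half the sum of the coefficient cross products, and without smoothing the coherency is 1 at every frequency, which is why weights are needed before coherency becomes informative.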
The second exception is a "bug" in SAS concerning how weighting is done for the end points of the quadrature spectrum. The SAS Institute has acknowledged this "bug" and provides a switch that will provide the correct results. Since the quadrature spectrum is an intermediate step toward calculating the amplitude, the coherency squared and the phase, these values differ also. To illustrate the problem, assume the Box-Jenkins (1976) gas furnace data is run with WEIGHTS(1 1 1). Here the weights are all .026525. The sum of the three weights is .079575, which is approximately (1/4π). For frequencies .003378, .006757, .01014 and .01351 the IP values are -33.07, -39.08, -35.49 and -12.04, respectively. The QS values should be -2.791, -2.855, -2.298 and -1.586. SAS produces -1.914, -2.855, -2.298 and -1.586. The number of values that differ at the beginning and at the end is the number of weights minus 2, or 1 in this case. At the end points, the correct way to calculate the QS value is to fold. This is illustrated next.

QS(1) = (.026525)*(-33.07) + (.026525)*(-33.07) + (.026525)*(-39.08)
QS(2) = (.026525)*(-33.07) + (.026525)*(-39.08) + (.026525)*(-35.49)
QS(3) = (.026525)*(-39.08) + (.026525)*(-35.49) + (.026525)*(-12.04).

SAS uses the above formulas for QS(2) and QS(3) but for QS(1) adds in the IP value of 0.0 for frequency 0.0, giving

QS(1) = (.026525)*(0.0) + (.026525)*(-33.07) + (.026525)*(-39.08).⁴

⁴ The SAS command ALTW gives the "correct" answer but was not made the default to maintain compatibility with older versions of SAS and replicate the old answers. This appears to be a self serving decision on the part of SAS.

Table 15.1 Values and Names Calculated by the B34S spectral Command
_________________________________________________________________
FREQ(k)    Frequency from 0 to π if NOBSPP is set = 0. Cycles per observation is
           NOBSPP*FREQ/2π, where NOBSPP is the number of observations per unit time.
           Using the default setting of NOBSPP = 1, FREQ is in the range 0 to .5.
           The number of frequencies calculated goes from 1 to K, where if T is
           even, K = (T/2) - 1.
PERIOD(k)  Period or wavelength (1/FREQ).
COSj(k)    Cosine transform of Xjt. Defined over range 1 to K.
           COSj(k) = Σ(i=1 to T) (2/T)*Xji*cos(k*(i-1)*2*π/T).
SINj(k)    Sine transform of Xjt. Defined over range 1 to K.
           SINj(k) = Σ(i=1 to T) (2/T)*Xji*sin(k*(i-1)*2*π/T).
Pj(k)      Periodogram of Xjt. Defined over range 1 to K.
           Pj(k) = (T/2)*((COSj(k)**2) + (SINj(k)**2)).
Sj(k)      Spectral density estimate of Xjt. Defined over range 1 to K as
           Sj(k) = Σ(i=-v to v) w(i)*Pj(k+i), where v=(p-1)/2 if the WEIGHTS
           sentence (containing p elements) is present and Sj(k) = Pj(k) if it is not.
RPmn(k)    Real part of cross periodogram of Xmt and Xnt. Defined over range 1 to K as
           RPmn(k) = (T/2)*((COSm(k)*COSn(k))+(SINm(k)*SINn(k))).
IPmn(k)    Imaginary part of cross periodogram of Xmt and Xnt. Defined over range
           1 to K as IPmn(k) = (T/2)*((COSn(k)*SINm(k))-(SINn(k)*COSm(k))).
CSmn(k)    Cospectral density estimate (real part of cross spectrum) or the weighted
           real part of the cross periodogram. Defined over range 1 to K as
           CSmn(k) = Σ(i=-v to v) w(i)*RPmn(k+i), where v=(p-1)/2 if the WEIGHTS
           sentence (containing p elements) is present and as CSmn(k) = RPmn(k) if
           it is not.
QSmn(k)    Quadrature-spectrum (imaginary part of cross spectrum) or the weighted
           imaginary part of the cross periodogram. Defined over range 1 to K as
           QSmn(k) = Σ(i=-v to v) w(i)*IPmn(k+i), where v=(p-1)/2 if the WEIGHTS
           sentence (containing p elements) is present and QSmn(k) = IPmn(k) if
           it is not.
Amn(k)     Amplitude (modulus of cross-spectrum). Defined over range 1 to K as
           Amn(k) = ((CSmn(k)**2) + (QSmn(k)**2))**.5.
Kmn(k)     Coherency squared. Defined over range 1 to K as
           Kmn(k) = (Amn(k)**2)/(Sn(k)*Sm(k)). If the WEIGHTS sentence is not
           present, Kmn(k) reduces to 1.0 for all frequencies.
PHmn(k)    Phase spectrum in radians. Defined over range 1 to K as
           PHmn(k) = arctan(QSmn(k),CSmn(k)).
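The end-point calculation discussed before Table 15.1 can be reproduced with a few lines of arithmetic. The sketch below uses only the weight and the four IP values quoted in the text. The function name and its `left_of_first` argument are hypothetical, and the sketch covers only the three-weight case described there, not the general folding rule used by either package.

```python
w = 0.026525  # each of the three normalized weights quoted in the text
ip = [-33.07, -39.08, -35.49, -12.04]  # IP values quoted for the first four frequencies


def smooth_three(values, k, left_of_first):
    """Three-term weighted sum centred on position k (0-based).

    left_of_first is the value used for the missing left neighbour when k == 0.
    """
    left = values[k - 1] if k > 0 else left_of_first
    return w * (left + values[k] + values[k + 1])


# Folding: the missing neighbour of the first point is replaced by the first point itself.
qs_folded = [smooth_three(ip, k, left_of_first=ip[0]) for k in range(3)]

# Behaviour attributed to SAS in the text: the zero-frequency IP value of 0.0 is used.
qs_zero_padded = [smooth_three(ip, k, left_of_first=0.0) for k in range(3)]

print([round(v, 3) for v in qs_folded])
print([round(v, 3) for v in qs_zero_padded])
```

Because the inputs are rounded, the results agree with the quoted QS values only to about three decimals. Only the first value differs between the two treatments, which matches the statement that the number of affected end points is the number of weights minus 2.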
15.2 Examples

Table 15.2 shows the statements to generate AR(1) models of the form of equation (15.1-1), where in Model 1, φ = .9, while in Model 2, φ = -.9. The programming setup illustrates the use of the Macro facility. If the B34SLET statement setting PLOT1 = no is changed to PLOT1 = yes, hard copy graphs will be produced.

Table 15.2 Program to Generate AR(1) models
____________________________________________________________
%b34slet plot1 = no $
b34sexec options open('_JUNK.FSV') disp=unknown unit(44)$ b34srun$
b34sexec options clean(44)$ b34srun$
b34sexec options gfactor(.8)$ b34srun$
b34sexec data noob=300 maxlag=1 heading('Sample AR(1) Data')$
build noise x1 x2$
gen noise= rn()$
gen x1 = lp(1,1,noise) ar(.9) ma(1.0) values(0.0) $
gen x2 = lp(1,1,noise) ar(-.9) ma(1.0) values(0.0) $
b34srun$
b34sexec spectral list(sin,cos,p,s) nobspp=0 scafname=spec scaunit=44 output(all)$
weights( 1 2 3 4 3 2 1)$
var x1$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine Transform Model 1")$
plot=(freq cos_1) gposition(2) title("Cos Transform Model 1")$
plot=(freq p_1 ) gposition(3) title("Periodogram Model 1")$
plot=(freq s_1 ) gposition(4) title("Spectrum Model 1")$
b34srun$
%b34sif(&plot1.eq.YES.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig1.wmf') plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine Transform Model 1")$
plot=(freq cos_1) gposition(2) title("Cos Transform Model 1")$
plot=(freq p_1 ) gposition(3) title("Periodogram Model 1")$
plot=(freq s_1 ) gposition(4) title("Spectrum Model 1")$
b34srun$
%b34sendif$
b34sexec options clean(44)$ B34SRUN$
b34sexec spectral list(sin,cos,p,s) nobspp=0 scafname=spec scaunit=44 output(all)$
weights( 1 2 3 4 3 2 1)$
var x2$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine Transform Model 2")$
plot=(freq cos_1) gposition(2) title("Cos Transform Model 2")$
plot=(freq p_1 ) gposition(3) title("Periodogram Model 2")$
plot=(freq s_1 ) gposition(4) title("Spectrum Model 2")$
b34srun$
%b34sif(&plot1.eq.YES.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig2.wmf') plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine Transform Model 2")$
plot=(freq cos_1) gposition(2) title("Cos Transform Model 2")$
plot=(freq p_1 ) gposition(3) title("Periodogram Model 2")$
plot=(freq s_1 ) gposition(4) title("Spectrum Model 2")$
b34srun$
%b34sendif$

Graphics produced by the code in Table 15.2 are shown in Figures 15.1 and 15.2. Note that, as predicted by theory, all the information in the series where φ = .9 is at low frequency, while when φ = -.9 the information is at high frequency. The rough appearance of the periodogram is smoothed by the WEIGHTS(1 2 3 4 3 2 1) sentence.

Figure 15.1 AR(1) Model where φ = .9

Figure 15.2 AR(1) Model where φ = -.9

The next example uses the Box-Jenkins (1976) gas furnace data. First the periodogram is estimated with OLS methods as suggested in equation (15.1-30). The code in Table 15.3 provides an example that uses the MACRO OLSSPEC, which is called as

%b34smcall olsspec(y=gasout noob=296 ntest=20)$

to produce estimates of the periodogram using the reg command. Next, the spectral command is used to estimate, list and plot spectral and cross-spectral values. These are discussed below. The rather long but complete command file is provided to show how all figures and tables were produced. It is to be stressed that the OLS method of estimating the periodogram is not the preferred way to proceed for production work, but it has much to recommend it as a way of understanding the periodogram as simply the explained sum of squares at a frequency.
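The interpretation of the periodogram as an explained sum of squares can be checked directly. The following sketch, which is not the OLSSPEC macro, regresses a made-up series on a constant, a sine and a cosine at one Fourier frequency and compares the model sum of squares with (T/2) times the sum of the squared coefficients, the scaling listed for Pj(k) in Table 15.1. The helper names and the data are hypothetical; the final line reuses the first pair of SIN and COS coefficients reported for GASOUT in the regression output of this section.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols_at_frequency(y, j):
    """Regress y on a constant, sin and cos at frequency 2*pi*j/T (0 < j < T/2)."""
    T = len(y)
    w = 2.0 * math.pi * j / T
    X = [[1.0, math.sin(w * t), math.cos(w * t)] for t in range(T)]
    xtx = [[sum(X[t][r] * X[t][c] for t in range(T)) for c in range(3)] for r in range(3)]
    xty = [sum(X[t][r] * y[t] for t in range(T)) for r in range(3)]
    const, b_sin, b_cos = solve(xtx, xty)
    ybar = sum(y) / T
    fitted = [const + b_sin * X[t][1] + b_cos * X[t][2] for t in range(T)]
    model_ss = sum((f - ybar) ** 2 for f in fitted)
    return b_sin, b_cos, model_ss


y = [5.1, 4.3, 6.2, 7.0, 5.5, 3.9, 4.8, 6.6, 7.3, 5.9, 4.1, 5.0]  # made-up data
T = len(y)
b_sin, b_cos, model_ss = ols_at_frequency(y, 2)
periodogram_value = (T / 2.0) * (b_sin ** 2 + b_cos ** 2)
print(model_ss, periodogram_value)

# Same arithmetic with the first GASOUT coefficients reported in the text (T = 296).
gasout_check = (296 / 2.0) * ((-1.7144070) ** 2 + (-0.22433884) ** 2)
print(round(gasout_check, 3))
```

The equality holds because the constant, sine and cosine regressors are mutually orthogonal at a Fourier frequency, so the explained sum of squares separates into the two squared coefficients times T/2.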
Table 15.3 Program to Analyze Gas Furnace Data using Spectral Methods
______________________________________________________
b34sexec options include('c:\b34slm\gas.b34')$ b34srun$
%b34smacro olsspec$
/$ Needs to be called as %b34smcall(y=yname noob=_ ntest=_ )
/$ y = yname
/$ noob = number of observations in series
/$ ntest must be set to lt noob/2
b34sexec spectral list(sin,cos,p,s) nobspp=0$
var %b34seval(&Y)$
weights(1 2 3 2 1)$
b34srun$
%b34sdo i=1,&ntest$
b34sexec data set $
%b34sif(&i.eq.1)%then$
build sin cos freq$
%b34sendif$
gen freq=(timespi(2.0)/%b34seval(&noob))*%b34seval(&I)$
gen sin=sin((kount()-1.)*freq)$
gen cos=cos((kount()-1.)*freq)$
b34srun$
b34sexec reg$
model %b34seval(&y)=sin cos$
b34srun$
%b34senddo $
%b34smend$
%b34smcall olsspec(y= gasout noob=296 ntest=20)$
b34sexec options open('_junk.fsv') disp=unknown unit(44)$ b34srun$
b34sexec options clean(44)$ b34srun$
%b34slet plot1=yes$
b34sexec spectral scafname=spec scaunit=44 output=all list=all nobspp= 1 plotby( freq ) $
weights( 1 2 3 4 3 2 1)$
var gasin gasout $
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine transform Gasin")$
plot=(freq cos_1) gposition(2) title("Cos transform Gasin")$
plot=(freq p_1 ) gposition(3) title("Periodogram Gasin ")$
plot=(freq s_1 ) gposition(4) title("Spectrum Gasin")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) plottype=xyplot$
plot=(freq sin_2) gposition(1) title("Sine Transform Gasout")$
plot=(freq cos_2) gposition(2) title("Cos Transform Gasout")$
plot=(freq p_2 ) gposition(3) title("Periodogram Gasout")$
plot=(freq s_2 ) gposition(4) title("Spectrum Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) plottype=xyplot$
plot=(freq rp_1) gposition(1) title("Real Part Cross Period Gasin-Gasout")$
plot=(freq ip_1) gposition(2) title("Imag Part Cross Period Gasin-Gasout")$
plot=(freq cs_1) gposition(3) title("Cospectral Density Gasin-Gasout")$
plot=(freq qs_1) gposition(4) title("Quadrature-Spectrum Gasin-Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) Plottype=xyplot$
Plot=(freq a_1) gposition(1) Title("Amplitude Gasin-Gasout")$
Plot=(freq k_1) gposition(2) Title("Coherency Gasin-Gasout")$
Plot=(freq ph_1) gposition(3) Title("Phase Gasin-Gasout")$
b34srun$
/$ graphs to list
b34sexec options gfactor(.8)$ b34srun$
%b34sif(&plot1.eq.yes.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig3.wmf') plottype=xyplot$
plot=(freq sin_1) gposition(1) title("Sine Transform Gasin")$
plot=(freq cos_1) gposition(2) title("Cos Transform Gasin")$
plot=(freq p_1 ) gposition(3) title("Periodogram Gasin ")$
plot=(freq s_1 ) gposition(4) title("Spectrum Gasin")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig4.wmf') plottype=xyplot$
plot=(freq sin_2) gposition(1) title("Sine Transform Gasout")$
plot=(freq cos_2) gposition(2) title("Cos Transform Gasout")$
plot=(freq p_2 ) gposition(3) title("Periodogram Gasout")$
plot=(freq s_2 ) gposition(4) title("Spectrum Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig5.wmf') plottype=xyplot$
plot=(freq rp_1) gposition(1) title("Real Part Cross Period Gasin-Gasout")$
plot=(freq ip_1) gposition(2) title("Imag Part Cross Period Gasin-Gasout")$
plot=(freq cs_1) gposition(3) title("Cospectral Density Gasin-Gasout")$
plot=(freq qs_1) gposition(4) title("Quadrature-Spectrum Gasin-Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec) print gport('fig6.wmf') plottype=xyplot$
plot=(freq a_1) gposition(1) title("Amplitude Gasin-Gasout")$
plot=(freq k_1) gposition(2) title("Coherency gasin-gasout")$
plot=(freq ph_1) gposition(3) title("Phase gasin-gasout")$
b34srun$
%b34sendif$

Edited output from running the program in Table 15.3 follows.
The periodogram and spectrum for GASOUT for the first ten frequencies are listed below, followed by the regression output that duplicates these values. Only the first four reg command outputs are shown. Note that the model sums of squares of 442.44887, 406.68593, 529.75342 and 129.39438 are the same as found with the spectral command under P_1. The SIN and COS values from the two sources are also the same. Periodogram values are generated from the SIN and COS values using (15.1-31) with scaling (T/2) in place of (T/8π), giving identical results from a time-domain perspective using OLS or from a frequency-domain perspective using Fourier analysis. As noted above, while the OLS approach may be easier to interpret in that it shows how the periodogram relates to the regression diagnostics, it is substantially slower. Note that each periodogram value can be calculated in this way. It is left as an exercise for the reader to validate that the spectral values have been calculated from the periodogram using the scaled (triangular) weights 1, 2, 3, 2, and 1 from equation (15.1-38).

For the output listed below, series 1 is GASOUT. Note that for observation 1, period = 296 and frequency is 1/period = .0033784. The listed frequency, which is in radians, can be obtained as .0033784(2π) = .021227.

Obs  PERIOD   FREQ         SIN_1        COS_1     P_1     S_1
  1  296.0    0.2123E-01   -1.714       -0.2243   442.4   35.35
  2  148.0    0.4245E-01   -1.482       -0.7423   406.7   33.04
  3  98.67    0.6368E-01    0.6334       1.783    529.8   28.14
  4  74.00    0.8491E-01    0.6519       0.6703   129.4   17.95
  5  59.20    0.1061        0.5667       0.4589   78.69   11.64
  6  49.33    0.1274       -0.1568      -0.3118   18.02   7.946
  7  42.29    0.1486       -1.063        0.7739   255.9   10.93
  8  37.00    0.1698        0.4674      -0.3030   45.92   11.22
  9  32.89    0.1910       -0.4580      -1.249    262.0   11.78
 10  29.60    0.2123        0.5578E-01   0.7187   76.91   7.876

REG Command. Version 1 February 1997

Real*8 space available   10000000
Real*8 space used             755

OLS Estimation
Dependent variable              GASOUT
Adjusted R**2                   0.1404460150470348
Standard Error of Estimate      2.968754524497961
Sum of Squared Residuals        2582.356504031045
Model Sum of Squares            442.4488675905768
Total Sum of Squares            3024.805371621621
F( 2, 293)                      25.10062379103647
F Significance                  0.9999999999132527
1/Condition of XPX              0.5555555555555556
Number of Observations          296
Durbin-Watson                   6.376831838373521E-02

Variable          Coefficient    Std. Error    t
SIN      { 0}     -1.7144070     0.24403012    -7.0253911
COS      { 0}     -0.22433884    0.24403012    -0.91930800
CONSTANT { 0}     53.509122      0.17255535    310.09830

REG Command. Version 1 February 1997

Real*8 space available   10000000
Real*8 space used             755

OLS Estimation
Dependent variable              GASOUT
Adjusted R**2                   0.1285420898400539
Standard Error of Estimate      2.989240914807500
Sum of Squared Residuals        2618.119445300439
Model Sum of Squares            406.6859263211827
Total Sum of Squares            3024.805371621621
F( 2, 293)                      22.75659665299046
F Significance                  0.9999999993494489
1/Condition of XPX              0.5555555555555556
Number of Observations          296
Durbin-Watson                   6.275877065018662E-02

Variable          Coefficient    Std. Error    t
SIN      { 0}     -1.4821769     0.24571409    -6.0321201
COS      { 0}     -0.74231363    0.24571409    -3.0210463
CONSTANT { 0}     53.509122      0.17374610    307.97308

REG Command. Version 1 February 1997

Real*8 space available   10000000
Real*8 space used             755

OLS Estimation
Dependent variable              GASOUT
Adjusted R**2                   0.1695058964630655
Standard Error of Estimate      2.918139078177818
Sum of Squared Residuals        2495.051954119426
Model Sum of Squares            529.7534175021956
Total Sum of Squares            3024.805371621621
F( 2, 293)                      31.10511407826055
F Significance                  0.9999999999994319
1/Condition of XPX              0.5555555555555556
Number of Observations          296
Durbin-Watson                   6.501243564636745E-02

Variable          Coefficient    Std. Error    t
SIN      { 0}     0.63341038     0.23986955    2.6406452
COS      { 0}     1.7827524      0.23986955    7.4321747
CONSTANT { 0}     53.509122      0.16961339    315.47699

REG Command. Version 1 February 1997

Real*8 space available   10000000
Real*8 space used             755

OLS Estimation
Dependent variable              GASOUT
Adjusted R**2                   3.624381523407638E-02
Standard Error of Estimate      3.143556705612574
Sum of Squared Residuals        2895.410987090722
Model Sum of Squares            129.3943845308991
Total Sum of Squares            3024.805371621621
F( 2, 293)                      6.547007460527660
F Significance                  0.9983466201852209
1/Condition of XPX              0.5555555555555556
Number of Observations          296
Durbin-Watson                   5.641171059948612E-02

Variable          Coefficient    Std. Error    t
SIN      { 0}     0.65193021     0.25839877    2.5229618
COS      { 0}     0.67027858     0.25839877    2.5939697
CONSTANT { 0}     53.509122      0.18271552    292.85482
_________________________________________________________

The second set of problems in the code in Table 15.3 involves estimating a cross spectral model using the Box-Jenkins (1976) gas furnace data. Here weights of [1, 2, 3, 4, 3, 2, 1] are used. Actual values are listed for the first 40 frequencies and plots of all values are given. Inspection of Figures 15.3 and 15.4, and especially the two parts of Figure 15.5, together with the listings indicates that GASIN maps to GASOUT at low frequencies. The weighting smooths the periodogram values. For the output listed below, series 1 is GASIN and series 2 is GASOUT. Note that here FREQ = 1/PERIOD.
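The two checks described above for GASOUT can be sketched numerically. The Python fragment below is only an illustrative sketch and is not part of B34S. It takes the first REG coefficients and the first ten P_1 and S_1 values from the GASOUT listing as given, and it assumes that the spectrum is the weighted moving average of the periodogram divided by 4π, which is the ratio implied by the (T/2) and (T/8π) scalings. Only interior frequencies are checked because the treatment of the end points is not stated in this section.

```python
import numpy as np

T = 296                                   # number of GASOUT observations

# OLS coefficients on the sine and cosine terms at the first Fourier
# frequency, copied from the first REG output in the text.
b_sin, b_cos = -1.7144070, -0.22433884
p_first = (T / 2.0) * (b_sin**2 + b_cos**2)   # periodogram with T/2 scaling
print(round(p_first, 4))    # compare with P_1 and the model sum of squares

# First ten periodogram (P_1) and spectrum (S_1) values as listed (4 digits).
P = np.array([442.4, 406.7, 529.8, 129.4, 78.69,
              18.02, 255.9, 45.92, 262.0, 76.91])
S_listed = np.array([35.35, 33.04, 28.14, 17.95, 11.64,
                     7.946, 10.93, 11.22, 11.78, 7.876])

# Triangular weights 1 2 3 2 1, scaled to sum to one.
w = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
w /= w.sum()

# Assumption: spectrum = smoothed periodogram / (4*pi).
# mode="valid" keeps only observations 3..8, where the full window fits.
S_interior = np.convolve(P, w, mode="valid") / (4.0 * np.pi)
print(np.round(S_interior, 3))
print(S_listed[2:8])
```

Under these assumptions the computed interior values agree with the listed S_1 values to the four digits printed, which is one way of working the exercise left to the reader.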
Obs  PERIOD  FREQ        SIN_1        SIN_2        COS_1        COS_2        P_1         P_2     S_1         S_2
  1  296.0   0.3378E-02   0.4798      -1.714        0.1931      -0.2243      39.60       442.4   2.976       33.99
  2  148.0   0.6757E-02   0.3560      -1.482        0.3565      -0.7423      37.56       406.7   2.675       30.88
  3  98.67   0.1014E-01  -0.4462E-01   0.6334      -0.5042       1.783       37.92       529.8   2.235       26.01
  4  74.00   0.1351E-01  -0.1421       0.6519      -0.2708       0.6703      13.84       129.4   1.714       19.35
  5  59.20   0.1689E-01  -0.9377E-01   0.5667      -0.2222       0.4589      8.612       78.69   1.299       13.83
  6  49.33   0.2027E-01  -0.6245E-01  -0.1568       0.2150      -0.3118      7.418       18.02   1.190       11.03
  7  42.29   0.2365E-01   0.3529      -1.063        0.1057       0.7739      20.08       255.9   1.227       10.46
  8  37.00   0.2703E-01  -0.2504       0.4674      -0.2797E-01  -0.3030      9.395       45.92   1.278       10.20
  9  32.89   0.3041E-01  -0.3567      -0.4580       0.3711      -1.249       39.21       262.0   1.308       10.13
 10  29.60   0.3378E-01   0.1291       0.5578E-01  -0.1646       0.7187      6.479       76.91   1.049       8.601
 11  26.91   0.3716E-01  -0.1486      -0.3376       0.1131      -0.4302      5.165       44.26   0.8214      7.051
 12  24.67   0.4054E-01   0.3193E-02  -0.2289E-01  -0.6891E-01   0.9868E-01  0.7043      1.519   0.6559      6.359
 13  22.77   0.4392E-01   0.1735      -0.9361       0.2199       0.3890      11.61       152.1   0.5668      6.014
 14  21.14   0.4730E-01   0.1826      -0.8758       0.2172       0.3462      11.92       131.3   0.5671      5.550
 15  19.73   0.5068E-01   0.6261E-01   0.8874E-01  -0.1209       0.1638      2.745       5.137   0.5458      4.477
 16  18.50   0.5405E-01   0.1643       0.2742      -0.1507       0.3253      7.355       26.79   0.5440      3.741
 17  17.41   0.5743E-01   0.7021E-01  -0.2818       0.9312E-01   0.2096      2.013       18.26   0.4795      2.919
 18  16.44   0.6081E-01  -0.2711       0.4794E-01  -0.1195      -0.6516      12.99       63.18   0.5033      2.845
 19  15.58   0.6419E-01   0.9543E-01  -0.2331       0.1405       0.4178      4.271       33.87   0.4622      2.880
 20  14.80   0.6757E-01   0.1108E-01  -0.4056       0.9569E-01   0.2647      1.373       34.72   0.3981      2.643
 21  14.10   0.7095E-01  -0.5102E-01   0.2422      -0.2574      -0.4199      10.19       34.78   0.4056      2.405
 22  13.45   0.7432E-01  -0.6070E-02  -0.3126       0.8152E-01   0.1288      0.9889      16.91   0.3545      2.121
 23  12.87   0.7770E-01   0.8882E-01  -0.8514E-01   0.1528E-01   0.2173      1.202       8.063   0.3834      2.207
 24  12.33   0.8108E-01  -0.6353E-01   0.1641      -0.2489      -0.5095      9.764       42.41   0.4592      2.436
 25  11.84   0.8446E-01  -0.6266E-01   0.1842E-01  -0.2016      -0.5336      6.595       42.18   0.4519      2.445
 26  11.38   0.8784E-01   0.1244       0.4029      -0.1371      -0.3746      5.072       44.78   0.4363      2.232
 27  10.96   0.9122E-01  -0.1261      -0.2429       0.1878       0.1310      7.577       11.27   0.3766      1.659
 28  10.57   0.9459E-01   0.7276E-01   0.1863      -0.6586E-01  -0.5671E-01  1.425       5.615   0.2564      1.042
 29  10.21   0.9797E-01  -0.7566E-01  -0.1554       0.5779E-01   0.1007      1.342       5.072   0.1770      0.6215
 30  9.867   0.1014       0.3848E-01   0.4538E-01  -0.2020E-01   0.6868E-02  0.2795      0.3118  0.1213      0.4066
 31  9.548   0.1047       0.1100       0.1857       0.2119E-02   0.6038E-01  1.792       5.645   0.9122E-01  0.4173
 32  9.250   0.1081      -0.8815E-01  -0.2289      -0.3095E-01  -0.5963E-01  1.292       8.278   0.8530E-01  0.4302
 33  8.970   0.1115      -0.6512E-01  -0.2054      -0.3968E-01  -0.1437      0.8606      9.300   0.8654E-01  0.4339
 34  8.706   0.1149      -0.8999E-02  -0.6225E-01   0.6456E-01   0.6460E-01  0.6288      1.191   0.8751E-01  0.3735
 35  8.457   0.1182       0.4304E-01  -0.1184E-01   0.6325E-01   0.5566E-01  0.8663      0.4793  0.8479E-01  0.3037
 36  8.222   0.1216       0.1303       0.2045      -0.2427E-01  -0.1395      2.599       9.069   0.8274E-01  0.2585
 37  8.000   0.1250      -0.1981E-02  -0.8518E-02  -0.4682E-01  -0.6563E-01  0.3251      0.6482  0.6805E-01  0.1779
 38  7.789   0.1284      -0.1171E-02  -0.1834E-01   0.3996E-01   0.3119E-01  0.2366      0.1938  0.5485E-01  0.1505
 39  7.590   0.1318       0.4124E-03   0.4063E-01   0.2228E-01  -0.2612E-02  0.7352E-01  0.2453  0.4527E-01  0.1356
 40  7.400   0.1351      -0.5830E-01   0.2978E-01   0.7638E-01   0.1332      1.367       2.758   0.4053E-01  0.1322

Obs  PERIOD  FREQ        RP_1         IP_1         CS_1         QS_1         A_1         K_1     PH_1
  1  296.0   0.3378E-02  -128.2       -33.07       -9.691       -2.641       10.04       0.9974  -2.876
  2  148.0   0.6757E-02  -117.2       -39.08       -8.722       -2.475       9.067       0.9952  -2.865
  3  98.67   0.1014E-01  -137.2       -35.49       -7.268       -2.123       7.571       0.9862  -2.857
  4  74.00   0.1351E-01  -40.57       -12.04       -5.301       -1.867       5.620       0.9525  -2.803
  5  59.20   0.1689E-01  -22.96       -12.27       -3.648       -1.722       4.034       0.9058  -2.701
  6  49.33   0.2027E-01  -8.471       -7.870       -2.626       -2.071       3.344       0.8515  -2.474
  7  42.29   0.2365E-01  -43.41       -57.05       -2.184       -2.612       3.405       0.9030  -2.267
  8  37.00   0.2703E-01  -16.07       -13.16       -1.991       -2.837       3.466       0.9216  -2.183
  9  32.89   0.3041E-01  -44.43       -91.10       -1.846       -2.990       3.514       0.9317  -2.124
 10  29.60   0.3378E-01  -16.45       -15.09       -1.429       -2.499       2.879       0.9186  -2.090
 11  26.91   0.3716E-01   0.2236      -15.12       -0.9536      -2.084       2.292       0.9066  -2.000
 12  24.67   0.4054E-01  -1.017        0.1868      -0.7065      -1.817       1.950       0.9113  -1.942
 13  22.77   0.4392E-01  -11.38       -40.45       -0.5322      -1.688       1.770       0.9188  -1.876
 14  21.14   0.4730E-01  -12.53       -37.51       -0.4658      -1.639       1.704       0.9226  -1.848
 15  19.73   0.5068E-01  -2.110       -3.106       -0.3087      -1.427       1.460       0.8722  -1.784
 16  18.50   0.5405E-01  -0.5905      -14.02       -0.1028      -1.312       1.316       0.8510  -1.649
 17  17.41   0.5743E-01  -0.3994E-01  -6.062        0.1193      -1.088       1.094       0.8554  -1.462
 18  16.44   0.6081E-01   9.601       -26.99        0.3557      -1.066       1.124       0.8817  -1.249
 19  15.58   0.6419E-01   5.397       -10.75        0.4434      -0.9803      1.076       0.8698  -1.146
 20  14.80   0.6757E-01   3.084       -6.178        0.4637      -0.8184      0.9407      0.8410  -1.055
 21  14.10   0.7095E-01   14.17       -12.40        0.5361      -0.7187      0.8967      0.8244  -0.9299
 22  13.45   0.7432E-01   1.835       -3.655        0.5457      -0.5532      0.7771      0.8031  -0.7922
 23  12.87   0.7770E-01  -0.6276      -3.049        0.6595      -0.4920      0.8228      0.8003  -0.6409
 24  12.33   0.8108E-01   17.22       -10.84        0.8470      -0.4753      0.9712      0.8432  -0.5114
 25  11.84   0.8446E-01   15.75       -5.497        0.8913      -0.3875      0.9719      0.8550  -0.4101
 26  11.38   0.8784E-01   15.02       -1.282        0.8623      -0.3078      0.9156      0.8610  -0.3428
 27  10.96   0.9122E-01   8.177       -4.308        0.6943      -0.2343      0.7327      0.8595  -0.3254
 28  10.57   0.9459E-01   2.559       -1.206        0.4569      -0.1377      0.4772      0.8525  -0.2927
 29  10.21   0.9797E-01   2.601       -0.2014       0.2960      -0.8169E-01  0.3071      0.8571  -0.2693
 30  9.867   0.1014       0.2379      -0.1748       0.2015      -0.5190E-01  0.2081      0.8779  -0.2521
 31  9.548   0.1047       3.043       -0.9250       0.1829      -0.2928E-01  0.1852      0.9012  -0.1588
 32  9.250   0.1081       3.259        0.2703       0.1769      -0.2120E-01  0.1781      0.8645  -0.1193
 33  8.970   0.1115       2.823       -0.1786       0.1732      -0.1209E-01  0.1737      0.8031  -0.6965E-01
 34  8.706   0.1149       0.7001      -0.5087       0.1567      -0.1997E-02  0.1568      0.7518  -0.1274E-01
 35  8.457   0.1182       0.4456      -0.4654       0.1354       0.1177E-01  0.1359      0.7172   0.8671E-01
 36  8.222   0.1216       4.444        1.955        0.1247       0.2623E-01  0.1275      0.7596   0.2073
 37  8.000   0.1250       0.4573       0.3978E-01   0.9227E-01   0.2999E-01  0.9702E-01  0.7774   0.3142
 38  7.789   0.1284       0.1877      -0.1030       0.7148E-01   0.3352E-01  0.7895E-01  0.7553   0.4384
 39  7.590   0.1318      -0.6136E-02   0.1341       0.5326E-01   0.4042E-01  0.6686E-01  0.7280   0.6491
 40  7.400   0.1351       1.249        1.486        0.3813E-01   0.4738E-01  0.6082E-01  0.6905   0.8931

Figure 15.3 Spectral analysis of GASIN

Figure 15.4 Spectral analysis of GASOUT

Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 1

Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 2

15.3 Matrix Command Implementation

The B34S matrix command, discussed in Chapter 16 but used in many chapters, provides a more compact and flexible way to do spectral analysis. For an example, consult the code in Table 15.4.

Table 15.4 Matrix Command Implementation of Spectral Analysis
_____________________________________________________
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
/;
/; No weighting here px=sx
/;
call spectral(gasin,sinx,cosx,px,sx,freq);
freq2=freq/(2.0*pi());
period=vfam(1.0/afam(freq2));
call tabulate(freq freq2 period sinx cosx px sx);
/;
/; With weights 1 2 3 2 1 px ne sx
/;
call spectral(gasin,sinx,cosx,px,sx,freq:1 2 3 2 1);
call tabulate(freq freq2 period sinx cosx px sx);
call graph(freq2,sx:heading 'Spectrum of Gasin'
           :plottype xyplot
           :file 'sp_gasin.wmf');
b34srun;

When run, this code produces the periodogram, the spectrum and other intermediate steps, as well as a graph that is not shown to save space. The first 20 lines of the tabulation illustrate what is being calculated. The first tabulation shows that if there is no weighting px=sx. Note that dividing FREQ by 2π gives FREQ2 = 1/PERIOD, the FREQ value reported in the B34S cross spectral listing.
=>   CALL SPECTRAL(GASIN,SINX,COSX,PX,SX,FREQ:1 2 3 2 1)$
=>   CALL TABULATE(FREQ FREQ2 PERIOD SINX COSX PX SX)$

Obs  FREQ        FREQ2       PERIOD  SINX         COSX         PX      SX
  1  0.2123E-01  0.3378E-02  296.0    0.4798       0.1931      39.60   3.100
  2  0.4245E-01  0.6757E-02  148.0    0.3560       0.3565      37.56   2.840
  3  0.6368E-01  0.1014E-01  98.67   -0.4462E-01  -0.5042      37.92   2.341
  4  0.8491E-01  0.1351E-01  74.00   -0.1421      -0.2708      13.84   1.588
  5  0.1061      0.1689E-01  59.20   -0.9377E-01  -0.2222      8.612   1.117
  6  0.1274      0.2027E-01  49.33   -0.6245E-01   0.2150      7.418   0.9096
  7  0.1486      0.2365E-01  42.29    0.3529       0.1057      20.08   1.253
  8  0.1698      0.2703E-01  37.00   -0.2504      -0.2797E-01  9.395   1.421
  9  0.1910      0.3041E-01  32.89   -0.3567       0.3711      39.21   1.544
 10  0.2123      0.3378E-01  29.60    0.1291      -0.1646      6.479   1.046
 11  0.2335      0.3716E-01  26.91   -0.1486       0.1131      5.165   0.7134
 12  0.2547      0.4054E-01  24.67    0.3193E-02  -0.6891E-01  0.7043  0.4780
 13  0.2760      0.4392E-01  22.77    0.1735       0.2199      11.61   0.6011
 14  0.2972      0.4730E-01  21.14    0.1826       0.2172      11.92   0.6412
 15  0.3184      0.5068E-01  19.73    0.6261E-01  -0.1209      2.745   0.5340
 16  0.3396      0.5405E-01  18.50    0.1643      -0.1507      7.355   0.4994
 17  0.3609      0.5743E-01  17.41    0.7021E-01   0.9312E-01  2.013   0.4752
 18  0.3821      0.6081E-01  16.44   -0.2711      -0.1195      12.99   0.5329
 19  0.4033      0.6419E-01  15.58    0.9543E-01   0.1405      4.271   0.4752
 20  0.4245      0.6757E-01  14.80    0.1108E-01   0.9569E-01  1.373   0.4157

Cross spectral analysis can easily be done with the matrix command, as is shown with the code listed in Table 15.5.

Table 15.5 Cross Spectral Analysis with the Matrix Command
__________________________________________________
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
* For sample output See Stokes (1997) page 424;
call cspectral(gasin,gasout,sinx,siny,cosx,cosy,px,py,sx,sy,
               rp,ip,cs,qs,a,k,ph,freq:1 2 3 4 3 2 1);
freq2=freq/(2.0*pi());
period=vfam(1.0/afam(freq2));
call tabulate(freq2,period,sinx,siny,cosx,cosy,px,py,sx,sy);
call tabulate(freq2,period,rp,ip,cs,qs,a,k,ph);
call graph(freq2,a :heading 'Amplitude':plottype xyplot);
call graph(freq2,k :heading 'Coherence':plottype xyplot);
call graph(freq2,ph:heading 'Phase':plottype xyplot);
b34srun;

This code runs the same problem as was illustrated earlier. Edited output showing the first 20 lines of each tabulation replicates the calculations shown in the prior section.

=>   CALL LOADDATA$
=>   * FOR SAMPLE OUTPUT SEE STOKES (1997) PAGE 424$
=>   CALL CSPECTRAL(GASIN,GASOUT,SINX,SINY,COSX,COSY,PX,PY,SX,SY,
=>                  RP,IP,CS,QS,A,K,PH,FREQ:1 2 3 4 3 2 1)$
=>   FREQ2=FREQ/(2.0*PI())$
=>   PERIOD=VFAM(1.0/AFAM(FREQ2))$
=>   CALL TABULATE(FREQ2,PERIOD,SINX,SINY,COSX,COSY,PX,PY,SX,SY)$

Obs  FREQ2       PERIOD  SINX         SINY         COSX         COSY         PX      PY     SX      SY
  1  0.3378E-02  296.0    0.4798      -1.714        0.1931      -0.2243      39.60   442.4  2.976   33.99
  2  0.6757E-02  148.0    0.3560      -1.482        0.3565      -0.7423      37.56   406.7  2.675   30.88
  3  0.1014E-01  98.67   -0.4462E-01   0.6334      -0.5042       1.783       37.92   529.8  2.235   26.01
  4  0.1351E-01  74.00   -0.1421       0.6519      -0.2708       0.6703      13.84   129.4  1.714   19.35
  5  0.1689E-01  59.20   -0.9377E-01   0.5667      -0.2222       0.4589      8.612   78.69  1.299   13.83
  6  0.2027E-01  49.33   -0.6245E-01  -0.1568       0.2150      -0.3118      7.418   18.02  1.190   11.03
  7  0.2365E-01  42.29    0.3529      -1.063        0.1057       0.7739      20.08   255.9  1.227   10.46
  8  0.2703E-01  37.00   -0.2504       0.4674      -0.2797E-01  -0.3030      9.395   45.92  1.278   10.20
  9  0.3041E-01  32.89   -0.3567      -0.4580       0.3711      -1.249       39.21   262.0  1.308   10.13
 10  0.3378E-01  29.60    0.1291       0.5578E-01  -0.1646       0.7187      6.479   76.91  1.049   8.601
 11  0.3716E-01  26.91   -0.1486      -0.3376       0.1131      -0.4302      5.165   44.26  0.8214  7.051
 12  0.4054E-01  24.67    0.3193E-02  -0.2289E-01  -0.6891E-01   0.9868E-01  0.7043  1.519  0.6559  6.359
 13  0.4392E-01  22.77    0.1735      -0.9361       0.2199       0.3890      11.61   152.1  0.5668  6.014
 14  0.4730E-01  21.14    0.1826      -0.8758       0.2172       0.3462      11.92   131.3  0.5671  5.550
 15  0.5068E-01  19.73    0.6261E-01   0.8874E-01  -0.1209       0.1638      2.745   5.137  0.5458  4.477
 16  0.5405E-01  18.50    0.1643       0.2742      -0.1507       0.3253      7.355   26.79  0.5440  3.741
 17  0.5743E-01  17.41    0.7021E-01  -0.2818       0.9312E-01   0.2096      2.013   18.26  0.4795  2.919
 18  0.6081E-01  16.44   -0.2711       0.4794E-01  -0.1195      -0.6516      12.99   63.18  0.5033  2.845
 19  0.6419E-01  15.58    0.9543E-01  -0.2331       0.1405       0.4178      4.271   33.87  0.4622  2.880
 20  0.6757E-01  14.80    0.1108E-01  -0.4056       0.9569E-01   0.2647      1.373   34.72  0.3981  2.643

=>   CALL TABULATE(FREQ2,PERIOD,RP,IP,CS,QS,A,K,PH)$

Obs  FREQ2       PERIOD  RP           IP      CS       QS       A       K       PH
  1  0.3378E-02  296.0   -128.2       -33.07  -9.691   -2.641   10.04   0.9974  -2.876
  2  0.6757E-02  148.0   -117.2       -39.08  -8.722   -2.475   9.067   0.9952  -2.865
  3  0.1014E-01  98.67   -137.2       -35.49  -7.268   -2.123   7.571   0.9862  -2.857
  4  0.1351E-01  74.00   -40.57       -12.04  -5.301   -1.867   5.620   0.9525  -2.803
  5  0.1689E-01  59.20   -22.96       -12.27  -3.648   -1.722   4.034   0.9058  -2.701
  6  0.2027E-01  49.33   -8.471       -7.870  -2.626   -2.071   3.344   0.8515  -2.474
  7  0.2365E-01  42.29   -43.41       -57.05  -2.184   -2.612   3.405   0.9030  -2.267
  8  0.2703E-01  37.00   -16.07       -13.16  -1.991   -2.837   3.466   0.9216  -2.183
  9  0.3041E-01  32.89   -44.43       -91.10  -1.846   -2.990   3.514   0.9317  -2.124
 10  0.3378E-01  29.60   -16.45       -15.09  -1.429   -2.499   2.879   0.9186  -2.090
 11  0.3716E-01  26.91    0.2236      -15.12  -0.9536  -2.084   2.292   0.9066  -2.000
 12  0.4054E-01  24.67   -1.017        0.1868 -0.7065  -1.817   1.950   0.9113  -1.942
 13  0.4392E-01  22.77   -11.38       -40.45  -0.5322  -1.688   1.770   0.9188  -1.876
 14  0.4730E-01  21.14   -12.53       -37.51  -0.4658  -1.639   1.704   0.9226  -1.848
 15  0.5068E-01  19.73   -2.110       -3.106  -0.3087  -1.427   1.460   0.8722  -1.784
 16  0.5405E-01  18.50   -0.5905      -14.02  -0.1028  -1.312   1.316   0.8510  -1.649
 17  0.5743E-01  17.41   -0.3994E-01  -6.062   0.1193  -1.088   1.094   0.8554  -1.462
 18  0.6081E-01  16.44    9.601       -26.99   0.3557  -1.066   1.124   0.8817  -1.249
 19  0.6419E-01  15.58    5.397       -10.75   0.4434  -0.9803  1.076   0.8698  -1.146
 20  0.6757E-01  14.80    3.084       -6.178   0.4637  -0.8184  0.9407  0.8410  -1.055

The graphs, which are not shown, can easily be placed in Word or other software systems. The OLS approach to obtaining the periodogram works because the sine and cosine vectors at the Fourier frequencies are orthogonal.
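The orthogonality property can also be sketched outside B34S. The Python fragment below is a minimal illustration under stated assumptions: it builds the 147 sine and cosine pairs for a series of length 296 and checks that their correlation matrix is the identity up to rounding error. It then uses a hypothetical random series (not GASOUT) to show that, because of the orthogonality, each OLS coefficient equals a simple projection scaled by 2/T.

```python
import numpy as np

T = 296                       # series length used for the gas furnace data
t = np.arange(T)              # time index 0, 1, ..., T-1
n_freq = T // 2 - 1           # 147 Fourier frequencies below the Nyquist frequency

cols = []
for k in range(1, n_freq + 1):
    omega = 2.0 * np.pi * k / T
    cols.append(np.sin(omega * t))
    cols.append(np.cos(omega * t))
X = np.column_stack(cols)     # T x 294 matrix of sine/cosine pairs

R = np.corrcoef(X, rowvar=False)          # 294 x 294 correlation matrix
off_diag = R - np.eye(R.shape[0])
print(R.shape, np.abs(off_diag).max())    # largest off-diagonal correlation

# Because the regressors are orthogonal with sum of squares T/2, the OLS
# coefficient on each one is a simple projection of the series.
rng = np.random.default_rng(0)
y = rng.standard_normal(T)                # hypothetical data, not GASOUT
Z = np.column_stack([X[:, 0], X[:, 1], np.ones(T)])
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
proj = (2.0 / T) * (X[:, :2].T @ y)
print(beta[:2], proj)
```

The projection form is what a Fourier routine computes directly, which is why the regression and the spectral calculations give the same SIN and COS values.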
The orthogonality can be demonstrated with the code in Table 15.6.

Table 15.6 Verification that the sin and cosine vectors are orthogonal
_______________________________________________________
b34sexec options ginclude('gas.b34')$ b34srun$
b34sexec matrix;
call loaddata;
call echooff;
count=dfloat(integers(norows(gasout)))-1.;
ncase=(norows(gasout)/2)-1;
per=array(ncase:);
test=array(ncase,2*ncase:);
s1 =array(norows(gasout):);
c1 =array(norows(gasout):);
base=(2.0*pi())/dfloat(norows(gasout));
do i=1,ncase;
s1=sin(count*base*dfloat(i));
c1=cos(count*base*dfloat(i));
is_1=(i-1)*2+1;
ic_1= is_1+1;
test(,is_1)=s1;
test(,ic_1)=c1;
if(i.le.20)then;
call olsq(gasout s1 c1 :print :qr );
call print(' ':);
modelss=%tss-%rss;
call print('%tss-%rss',modelss:);
endif;
if(i.gt.20)call olsq(gasout s1 c1 :qr );
per(i)=%tss-%rss;
enddo;
call print(per);
call graph(per);
call print('Illustrate Orthogonality of sine and cosine vectors':);
call print(ccf(test));
b34srun;

When run, this code produces a 294 by 294 matrix of correlations of the 147 pairs of sine and cosine vectors in which all cross correlations are close to machine zero except along the diagonal, where they are 1.0. Due to space limits this matrix, which contains 86,436 correlations, is not shown here.

It is possible to obtain the original values of the series from the sine and cosine values using (15.1-30) with a small error. Table 15.7 shows both a B34S and a SAS setup.

Table 15.7 Inverse Spectral Examples
________________________________________________
%b34slet dosas=1;
b34sexec options ginclude('gas.b34'); b34srun;
/;
/; Testing FFT
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call echooff;
subroutine spectest(x,testx,error);
call spectral(x,sinx,cosx,px,sx,freq1);
count=dfloat(integers(norows(x)))-1.;
/; Test 100% recovery of actual value
adj_freq=freq1/(2.*pi());
period=1.0/afam(adj_freq);
do i=1,norows(x);
sum1=sinx*sin(count(i)*freq1);
sum2=cosx*cos(count(i)*freq1);
test(i)=sum1+sum2;
enddo;
call print('mean(x)',mean(x):);
adj=mean(x);
testx=afam(test)+adj;
error_1=x-testx;
call tabulate(x,testx,error_1);
return;
end;
call spectest(gasin, yhat,error);
call names(all);
call olsq(yhat gasin :print);
call spectest(gasout,yhat,error);
call olsq(yhat gasout :print);
/; Now look at fft
cfft=fft(gasin);
test=fft(cfft:back)/dfloat(norows(gasin));
error=gasin-test;
call tabulate(gasin,test,error);
b34srun;
%b34sif(&dosas.ne.0)%then;
b34sexec options open('testsas.sas') unit(29) disp=unknown$ b34srun$
b34sexec options clean(29) $ b34seend$
b34sexec pgmcall idata=29 icntrl=29$
sas $
* sas commands next ;
pgmcards$
proc spectra out=specgas coef p s;
var gasin gasout;
weights 1 1 1;
run;
proc means data=specgas; run;
proc print data=specgas; run;
b34sreturn$
b34srun $
b34sexec options close(29)$ b34srun$
/$ the next card has to be modified to point to sas location
/$ be sure and wait until sas gets done before letting b34s resume
/$ ***************************************************************
b34sexec options dodos('start /w /r sas testsas' )
                 dounix('sas testsas' ) $ b34srun$
b34sexec options npageout noheader
 writeout(' ','output from sas',' ',' ')
 writelog(' ','output from sas',' ',' ')
 copyfout('testsas.lst')
 copyflog('testsas.log')
/;dodos('erase testsas.sas','erase testsas.lst','erase testsas.log')
 dounix('rm testsas.sas','rm testsas.lst','rm testsas.log') $
b34srun$
b34sexec options header$ b34srun$
%b34sendif;

When the code in Table 15.7 is run, it produces the edited output below, which shows the recovery of GASIN and GASOUT from the sine and cosine values. Note the very small error, which is validated with OLS.
mean(x)   -5.683445945945946E-02

Obs   X            TESTX        ERROR_1
  1   -0.1090      -0.1090       0.7633E-15
  2    0.000       -0.2720E-14   0.2720E-14
  3    0.1780       0.1780       0.4496E-14
  4    0.3390       0.3390       0.3497E-14
  5    0.3730       0.3730       0.1499E-14
  6    0.4410       0.4410      -0.1998E-14
  7    0.4610       0.4610      -0.4441E-15
  8    0.3480       0.3480      -0.1665E-15
  9    0.1270       0.1270      -0.1998E-14
 10   -0.1800      -0.1800      -0.4496E-14
 11   -0.5880      -0.5880      -0.6550E-14
 12   -1.055       -1.055       -0.7327E-14
 13   -1.421       -1.421       -0.4441E-14
 14   -1.520       -1.520       -0.1110E-14
 15   -1.302       -1.302       -0.2442E-14
 16   -0.8140      -0.8140      -0.1998E-14
 17   -0.4750      -0.4750       0.2776E-15
 18   -0.1930      -0.1930       0.4441E-14
 19    0.8800E-01   0.8800E-01   0.8313E-14
 20    0.4350       0.4350       0.1044E-13
 ...
280    0.2510       0.2510       0.1205E-13
281    0.2800       0.2800       0.6550E-14
282    0.000        0.6939E-16  -0.6939E-16
283   -0.4930      -0.4930      -0.1499E-14
284   -0.7590      -0.7590      -0.3442E-14
285   -0.8240      -0.8240      -0.5440E-14
286   -0.7400      -0.7400      -0.4663E-14
287   -0.5280      -0.5280       0.4885E-14
288   -0.2040      -0.2040       0.1138E-14
289    0.3400E-01   0.3400E-01  -0.3775E-14
290    0.2040       0.2040       0.1360E-14
291    0.2530       0.2530      -0.1094E-13
292    0.1950       0.1950      -0.8438E-14
293    0.1310       0.1310      -0.1887E-14
294    0.1700E-01   0.1700E-01   0.2387E-14
295   -0.1820      -0.1820       0.1016E-13
296   -0.2620      -0.2620       0.1221E-14

Ordinary Least Squares Estimation
Dependent variable                   YHAT
Centered R**2                        1.000000000000000
Adjusted R**2                        1.000000000000000
Residual Sum of Squares              5.020523454753602E-26
Residual Variance                    1.707661039031837E-28
Standard Error                       1.306775052957408E-14
Total Sum of Squares                 339.4936188885142
Log Likelihood                       9043.711769338292
Mean of the Dependent Variable       -5.683445945945921E-02
Std. Error of Dependent Variable     1.072765504078466
Sum Absolute Residuals               2.840377339427970E-12
1/Condition XPX                      0.8358143402344875
Maximum Absolute Residual            4.363176486776865E-14
Number of Observations               296

Variable   Lag   Coefficient      SE               t
GASIN        0   1.0000000        0.70922662E-15   0.14099866E+16
CONSTANT     0   0.38515884E-15   0.76061639E-15   0.50637726

mean(x)   53.50912162162162

Obs   X       TESTX   ERROR_1
  1   53.80   53.80    0.2132E-13
  2   53.60   53.60    0.7105E-14
  3   53.50   53.50    0.7105E-14
  4   53.50   53.50    0.000
  5   53.40   53.40   -0.7105E-14
  6   53.10   53.10    0.7105E-14
  7   52.70   52.70    0.000
  8   52.40   52.40   -0.7105E-14
  9   52.20   52.20    0.9237E-13
 10   52.00   52.00    0.9237E-13
 11   52.00   52.00    0.1066E-12
 12   52.40   52.40    0.8527E-13
 13   53.00   53.00    0.7816E-13
 14   54.00   54.00    0.1066E-12
 15   54.90   54.90    0.1137E-12
 16   56.00   56.00    0.1066E-12
 17   56.80   56.80    0.7105E-14
 18   56.80   56.80   -0.7105E-14
 19   56.40   56.40    0.000
 20   55.70   55.70   -0.1421E-13
 ...
280   54.40   54.40    0.1066E-12
281   53.70   53.70   -0.7105E-14
282   53.30   53.30   -0.3553E-13
283   52.80   52.80   -0.2842E-13
284   52.60   52.60   -0.3553E-13
285   52.60   52.60   -0.5684E-13
286   53.00   53.00   -0.2842E-13
287   54.30   54.30    0.7105E-14
288   56.00   56.00   -0.4263E-13
289   57.00   57.00    0.9237E-13
290   58.00   58.00    0.9948E-13
291   58.60   58.60    0.1066E-12
292   58.50   58.50    0.9237E-13
293   58.30   58.30    0.9948E-13
294   57.80   57.80    0.7816E-13
295   57.30   57.30    0.1208E-12
296   57.00   57.00    0.5684E-13

Ordinary Least Squares Estimation
Dependent variable                   YHAT
Centered R**2                        0.9999997207772101
Adjusted R**2                        0.9999997198274727
Residual Sum of Squares              8.445943587647710E-04
Residual Variance                    2.872769927771330E-06
Standard Error                       1.694924755784554E-03
Total Sum of Squares                 3024.804527027029
Log Likelihood                       1469.512199422020
Mean of the Dependent Variable       53.50912162163348
Std. Error of Dependent Variable     3.202120339382677
Sum Absolute Residuals               0.4999998603887690
F( 1, 294)                           1052922356.462301
F Significance                       1.000000000000000
1/Condition XPX                      1.214609942288382E-06
Maximum Absolute Residual            1.691397594697719E-03
Number of Observations               296

Variable   Lag   Coefficient      SE               t
GASOUT       0   0.99999972       0.30817805E-04   32448.765
CONSTANT     0   0.14940968E-04   0.16519738E-02   0.90443130E-02

The same calculation is done with the inverse FFT. Note the small errors:

Obs   GASIN        TEST         ERROR
  1   -0.1090      -0.1090       0.3747E-15
  2    0.000       -0.2497E-14   0.2497E-14
  3    0.1780       0.1780       0.4302E-14
  4    0.3390       0.3390       0.3386E-14
  5    0.3730       0.3730       0.1554E-14
  6    0.4410       0.4410      -0.1277E-14
  7    0.4610       0.4610      -0.1554E-14
  8    0.3480       0.3480      -0.9992E-15
  9    0.1270       0.1270      -0.2415E-14
 10   -0.1800      -0.1800      -0.4136E-14
 11   -0.5880      -0.5880      -0.5995E-14
 12   -1.055       -1.055       -0.6217E-14
 13   -1.421       -1.421       -0.3775E-14
 14   -1.520       -1.520       -0.2220E-15
 15   -1.302       -1.302       -0.2220E-15
 16   -0.8140      -0.8140       0.000
 17   -0.4750      -0.4750       0.1721E-14
 18   -0.1930      -0.1930       0.4191E-14
 19    0.8800E-01   0.8800E-01   0.5773E-14
 20    0.4350       0.4350       0.4829E-14
 ...
290    0.2040       0.2040      -0.2720E-14
291    0.2530       0.2530      -0.5607E-14
292    0.1950       0.1950      -0.5079E-14
293    0.1310       0.1310      -0.2914E-14
294    0.1700E-01   0.1700E-01  -0.9957E-15
295   -0.1820      -0.1820      -0.6106E-15
296   -0.2620      -0.2620      -0.6106E-15

From the sine and cosine coefficients the original series was recovered. The OLS model tested the conversion and found that the coefficient from regressing $\hat{Y}$ on $Y$ was very close to 1.0 and that, as expected, the fit was close to perfect.
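The recovery of a series from its sine and cosine coefficients, and from a forward and inverse FFT, can be sketched in a few lines. The Python fragment below is an illustrative sketch rather than a transcription of the B34S subroutine: the series is hypothetical (a simulated random walk, not the gas furnace data), and the halving of the cosine coefficient at the Nyquist frequency for an even number of observations is a choice made in this sketch, since how B34S treats that term is not stated here.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 296
x = 53.5 + 0.3 * np.cumsum(rng.standard_normal(T))   # hypothetical series
t = np.arange(T)

k = np.arange(1, T // 2 + 1)                 # Fourier frequencies 1, ..., T/2
omega = 2.0 * np.pi * k / T
S = np.sin(np.outer(t, omega))               # T x (T/2) sine vectors
C = np.cos(np.outer(t, omega))               # T x (T/2) cosine vectors

xc = x - x.mean()                            # work with the demeaned series
a = (2.0 / T) * (S.T @ xc)                   # sine coefficients
b = (2.0 / T) * (C.T @ xc)                   # cosine coefficients
if T % 2 == 0:
    # At the Nyquist frequency the cosine vector has sum of squares T, not T/2.
    b[-1] /= 2.0

x_trig = x.mean() + S @ a + C @ b            # rebuild from the coefficients
x_fft = np.fft.ifft(np.fft.fft(x)).real      # forward then inverse FFT

print(np.abs(x - x_trig).max(), np.abs(x - x_fft).max())
```

Both maximum errors are at the level of floating-point rounding, in line with the small ERROR values in the listings above.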
15.4 Wavelet Analysis

Wavelets provide a means by which a series can be filtered to remove noise.5 The first step is to estimate the discrete Fourier transform of a series $x_n$, $n = 0, \ldots, N-1$, where there are N periods. At frequency index k the transform is $\hat{x}_k$, where

$$\hat{x}_k = \frac{1}{N}\sum_{n=0}^{N-1} x_n e^{-2\pi i k n/N} \qquad (15.4-1)$$

5 This section discusses the approach to wavelet estimation suggested by Torrence and Compo (1998), who have provided the Fortran code that is used for all calculations. Consult this article for further information. Wherever possible, the notation in this section follows their article.

Define the Fourier transform of the wavelet function as $\hat{\psi}(s\omega)$. The wavelet transform is the inverse Fourier transform of the product

$$W_n(s) = \sum_{k=0}^{N-1} \hat{x}_k \hat{\psi}^*(s\omega_k) e^{i\omega_k n \delta t} \qquad (15.4-2)$$

where $\omega_k = 2\pi k/(N\delta t)$ for $k \le N/2$ and $\omega_k = -2\pi k/(N\delta t)$ otherwise. The wavelet transforms in (15.4-2) are normalized to have unit energy, $\hat{\psi}(s\omega_k) = (2\pi s/\delta t)^{.5}\hat{\psi}_0(s\omega_k)$, where the unscaled transforms $\hat{\psi}_0$ have been normalized to have unit energy, or $\int |\hat{\psi}_0(\omega')|^2 d\omega' = 1$. At each scale s,

$$\sum_{k=0}^{N-1} |\hat{\psi}(s\omega_k)|^2 = N.$$

Table 15.8 lists the wavelet basis functions and their properties and has been taken from Torrence and Compo (1998) Table 1. $H(\omega)$ is the Heaviside step function, which equals 1 if $\omega > 0$ and 0 otherwise. The DOG is the only real valued wavelet function of the three.

Table 15.8 Wavelet basis functions supported

Name: Morlet ($\omega_0$ = frequency)
   $\psi_0(\eta) = \pi^{-.25} e^{i\omega_0\eta} e^{-\eta^2/2}$
   $\hat{\psi}_0(s\omega) = \pi^{-.25} H(\omega) e^{-(s\omega-\omega_0)^2/2}$

Name: Paul ($m$ = order)
   $\psi_0(\eta) = \frac{2^m i^m m!}{\sqrt{\pi(2m)!}} (1 - i\eta)^{-(m+1)}$
   $\hat{\psi}_0(s\omega) = \frac{2^m}{\sqrt{m(2m-1)!}} H(\omega)(s\omega)^m e^{-s\omega}$

Name: DOG (Derivative of Gaussian) ($m$ = derivative)
   $\psi_0(\eta) = \frac{(-1)^{m+1}}{\sqrt{\Gamma(m+.5)}} \frac{d^m}{d\eta^m}\left(e^{-\eta^2/2}\right)$
   $\hat{\psi}_0(s\omega) = -\frac{i^m}{\sqrt{\Gamma(m+.5)}} (s\omega)^m e^{-(s\omega)^2/2}$

Source: Torrence and Compo (1998, Table 1)

Table 15.9 Empirically derived factors for four wavelet bases

Name                     $C_\delta$   $\gamma$   $\delta j_0$   $\psi_0(0)$
Morlet ($\omega_0 = 6$)  .776         2.32       .60            $\pi^{-.25}$ = .7511255
Paul ($m = 4$)           1.132        1.17       1.5            1.079
Marr (DOG $m = 2$)       3.541        1.43       1.4            .867
DOG ($m = 6$)            1.966        1.37       0.97           .884

Source: Torrence and Compo (1998, Table 2)
$C_\delta$ = reconstruction factor
$\gamma$ = decorrelation factor for time averaging
$\delta j_0$ = factor for scale averaging

After selecting the wavelet basis function, wavelet analysis proceeds by selecting the appropriate scales s to use. Define

$$s_j = s_0 2^{j\delta j}, \quad j = 0, 1, \ldots, J, \qquad J = \delta j^{-1}\log_2(N\delta t/s_0) \qquad (15.4-3)$$

where $s_0$ is the smallest scale and J determines the largest. Smaller $\delta j$ values mean finer resolution in scale. For the Morlet wavelet a $\delta j$ of about .5 is the largest value that Torrence and Compo (1998) report as giving adequate sampling in scale. By summing over all scales, the original time series can be reconstructed from the real part of the wavelet transformation, $\Re\{W_n(s_j)\}$, as

$$x_n = \frac{\delta j\,\delta t^{.5}}{C_\delta \psi_0(0)} \sum_{j=0}^{J} \frac{\Re\{W_n(s_j)\}}{s_j^{.5}} \qquad (15.4-4)$$

Torrence and Compo (1998) outline how these formulas are used to obtain $C_\delta$. For a new wavelet function, assume a series with a delta function at time period n = 0, so that $x_n = \delta_{n0}$, which implies a Fourier transform $\hat{x}_k = 1/N$ from (15.4-1) that is constant for all k. Substituting $\hat{x}_k$ into (15.4-2) gives, at n = 0,

$$W_\delta(s) = \frac{1}{N}\sum_{k=0}^{N-1} \hat{\psi}^*(s\omega_k) \qquad (15.4-5)$$

which using (15.4-4) can be solved for

$$C_\delta = \frac{\delta j\,\delta t^{.5}}{\psi_0(0)} \sum_{j=0}^{J} \frac{\Re\{W_\delta(s_j)\}}{s_j^{.5}} \qquad (15.4-6)$$

Total energy is conserved under the wavelet transform, and this should be checked to make sure appropriate $s_0$ and $\delta j$ values have been selected:

$$\sigma^2 = \frac{\delta j\,\delta t}{C_\delta N} \sum_{n=0}^{N-1}\sum_{j=0}^{J} \frac{|W_n(s_j)|^2}{s_j} \qquad (15.4-7)$$

This suggests that all scales should be inspected using (15.4-4) and (15.4-7) to see if the level and the variance can be accurately recovered. If this is not the case, $s_0$ and $\delta j$ would have to be adjusted. The time-averaged wavelet spectrum over a certain period is

$$\bar{W}_n^2(s) = \frac{1}{n_a}\sum_{n'=n_1}^{n_2} |W_{n'}(s)|^2 \qquad (15.4-8)$$

where $n_a = n_2 - n_1 + 1$.
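The scale grid in (15.4-3), the transform in (15.4-2) and the reconstruction in (15.4-4) can be sketched for the Morlet case. The Python fragment below is a minimal sketch and is not the Torrence and Compo Fortran code or the B34S wavelet routine. The function names `morlet_cwt` and `reconstruct` are hypothetical, the test series is a pure cosine chosen for illustration, and the series is not padded. The Morlet Fourier factor $4\pi/(\omega_0 + \sqrt{2+\omega_0^2})$, which converts a scale into an equivalent period, is taken from Torrence and Compo (1998) and does not appear in the formulas above. The grid uses $s_0 = .25$, $\delta j = .25$ and 44 scales, the settings of the Nino run reported later in this section.

```python
import numpy as np

def morlet_cwt(x, dt, s0, dj, n_scales, w0=6.0):
    """Morlet wavelet transform of x computed in the frequency domain."""
    n = x.size
    x_hat = np.fft.fft(x - x.mean())                 # N times x-hat of (15.4-1)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)    # negative for k > N/2
    scales = s0 * 2.0 ** (dj * np.arange(n_scales))  # s_j = s0 * 2**(j*dj)
    W = np.empty((n_scales, n), dtype=complex)
    for j, s in enumerate(scales):
        # Normalized Morlet transform; zero for omega <= 0 (Heaviside step).
        psi_hat = (np.sqrt(2.0 * np.pi * s / dt) * np.pi ** -0.25
                   * np.exp(-0.5 * (s * omega - w0) ** 2) * (omega > 0))
        W[j] = np.fft.ifft(x_hat * psi_hat)          # inverse transform of product
    return scales, W

def reconstruct(W, scales, dt, dj, c_delta=0.776, psi0=np.pi ** -0.25):
    """Sum Re(W)/sqrt(s) over all scales as in (15.4-4); the mean is not added."""
    part = (W.real / np.sqrt(scales)[:, None]).sum(axis=0)
    return dj * np.sqrt(dt) / (c_delta * psi0) * part

# Scale grid for s0 = .25, dj = .25 and 44 scales.
grid = 0.25 * 2.0 ** (0.25 * np.arange(44))
fourier_factor = 4.0 * np.pi / (6.0 + np.sqrt(2.0 + 6.0 ** 2))   # Morlet, w0 = 6
print(grid[0], grid[-1], fourier_factor * grid[0])

# Hypothetical test series: a cosine with a period of 16 observations.
n, dt, dj = 256, 1.0, 0.25
t = np.arange(n)
x = np.cos(2.0 * np.pi * t / 16.0)
s0 = 2.0 * dt
n_scales = int(np.log2(n * dt / s0) / dj) + 1
scales, W = morlet_cwt(x, dt, s0, dj, n_scales)
x_rec = reconstruct(W, scales, dt, dj)
rel_rmse = np.sqrt(np.mean((x - x_rec) ** 2) / np.mean(x ** 2))
print(rel_rmse)
```

For this grid the largest scale is about 430.54 and the smallest equivalent period about .2583. With the Morlet value $C_\delta = .776$ the reconstruction error for the test cosine should be a small fraction of the signal; a large error would suggest that $s_0$ or $\delta j$ needs to be adjusted.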
When summed over all local wavelet spectra, the global wavelet spectrum

$$\bar{W}^2(s) = \frac{1}{N}\sum_{n=0}^{N-1} |W_n(s)|^2 \qquad (15.4-9)$$

is obtained. The scale-averaged wavelet power is defined as the weighted sum of the wavelet power spectrum over scales $s_1$ to $s_2$:

$$\bar{W}_n^2 = \frac{\delta j\,\delta t}{C_\delta}\sum_{j=j_1}^{j_2} \frac{|W_n(s_j)|^2}{s_j} \qquad (15.4-10)$$

It is possible to filter a series by using (15.4-4) and summing over a range of scales:

$$x'_n = \frac{\delta j\,\delta t^{.5}}{C_\delta \psi_0(0)} \sum_{j=j_1}^{j_2} \frac{\Re\{W_n(s_j)\}}{s_j^{.5}} \qquad (15.4-11)$$

One way to think about (15.4-11) is that the larger $j_1$ is, the more of the information in the short periods (small scales) is removed. Thus noise that by assumption is of short duration can be removed from the data to capture what is hoped is the more fundamental series. The higher the sampling rate, the more of this noise reduction strategy may be required. Experimentation may be required to set the scales that are appropriate for the series at hand. Once the series has been processed, it can be further analyzed using nonlinear and linear models that hopefully will better capture the underlying structure.

Wavelet calculations are illustrated using 504 observations on the Nino sea surface temperature series suggested by Torrence and Compo (1998). Table 15.10 shows a setup to filter the Nino series.

Table 15.10 Wavelet Filter For The Nino Series

b34sexec options ginclude('wavedata.mac') member(nino3); b34srun;
/;
/; Basic Wavelet test
/;
b34sexec matrix;
call loaddata;
call wavelet(nino :type morlet :settings :s0 .25 :dt .25 :lower 2.
             :upper 7.9 :jtot 44);
call tabulate(nino,%recon_y);
call olsq(nino %recon_y :print);
call tabulate(%scale,%period,%w_power,%w_phase,
              %w_ampl,%signif,%global,%g_sig);
call print(%sa_df %sa_sig);
call print('mean original data ',mean(nino):);
call print('mean of reconstructed data ',mean(%recon_y):);
call print('Variance of original Data ',variance(nino):);
call print('Variance of reconstructed Data ',variance(%recon_y):);
b34srun;

Edited output follows:

B34S 8.11C  (D:M:Y) 5/ 8/07 (H:M:S) 17: 2:31   DATA STEP   Sea Surface Temp   PAGE 1

Variable    #  Label                           Cases  Mean           Std. Dev.  Variance  Maximum  Minimum
NINO        1  Nino2 sea surface temperature   504    -0.198413E-04  0.734328   0.539238  2.50000  -1.85000
CONSTANT    2                                  504    1.00000        0.00000    0.00000   1.00000  1.00000

Number of observations in data file    504
Current missing variable code          1.000000000000000E+31
Data begins on (D:M:Y) 1: 1:1871 ends 1:10:1996. Frequency is 4

B34S(r) Matrix Command. d/m/y 5/ 8/07. h:m:s 17: 2:31.

=>   CALL LOADDATA$
=>   CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 .25 :DT .25 :LOWER 2.
=>                :UPPER 7.9 :JTOT 44)$

Wavelet Option Settings
Input Variable                               NINO
Number of Original Observations              504
Sampling time (dt)                           0.2500000000000000
Wavelet used                                 Morlet
Wave number (param)                          6.000000000000000
Smallest scale of wavelet s0=%scale(1)       0.2500000000000000
Largest scale of wavelet (%scale(jtot))      430.5389646099018
Spacing between discrete scales (dj)         0.2500000000000000
Number of scales (jtot)                      44
Number of observation after padding          1024
Lag1 (background autocorrelation)            0.7200000000000000
Significance level                           5.000000000000004E-02
Lower Scale for filter                       2.000000000000000
Upper Scale for filter                       7.900000000000000
Work array size (nk)                         1024

=>   CALL TABULATE(NINO,%RECON_Y)$

The filter is successful at capturing the NINO series, as shown by the application of (15.4-4).

Obs   NINO         %RECON_Y
  1   -0.1500      -0.1508
  2   -0.3000      -0.3005
  3   -0.1400      -0.1408
  4   -0.4100      -0.4109
  5   -0.4600      -0.4618
  6   -0.6600      -0.6617
  7   -0.5000      -0.5019
  8   -0.8000      -0.8021
  9   -0.9500      -0.9533
 10   -0.7200      -0.7219
 11   -0.3100      -0.3113
 12   -0.7100      -0.7119
 13   -1.040       -1.044
 14   -0.7700      -0.7721
 15   -0.8600      -0.8630
 16   -0.8400      -0.8423
 17   -0.4100      -0.4116
 18   -0.4900      -0.4912
 19   -0.4800      -0.4818
 20   -0.7200      -0.7219
 21   -1.210       -1.214
 22   -0.8000      -0.8021
 23    0.1600       0.1602
 24    0.4600       0.4618
 25    0.4000       0.4009
 26    1.000        1.004
  .    .            .
494    0.3900       0.3913
495   -0.1700      -0.1706
496    1.040        1.043
497    0.7700       0.7723
498    0.1200       0.1206
499   -0.3500      -0.3513
500   -0.2200      -0.2204
501    0.8000E-01   0.7998E-01
502   -0.8000E-01  -0.7998E-01
503   -0.1800      -0.1808
504   -0.6000E-01  -0.5990E-01

An OLS regression documents the closeness of the fit.

=>   CALL OLSQ(NINO %RECON_Y :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                   NINO
Centered R**2                        0.9999998133138651
Adjusted R**2                        0.9999998129419804
Residual Sum of Squares              5.063609377294851E-05
Residual Variance                    1.008687126951166E-07
Standard Error                       3.175983512159919E-04
Total Sum of Squares                 271.2364998015873
Log Likelihood                       3345.437370731567
Mean of the Dependent Variable       -1.984126984126920E-05
Std. Error of Dependent Variable     0.7343279745169901
Sum Absolute Residuals               0.1518517527012808
F( 1, 502)                           2689004765.108992
F Significance                       1.000000000000000
1/Condition XPX                      0.6454896073187762
Maximum Absolute Residual            5.326197560795443E-04
Number of Observations               504

Variable   Lag   Coefficient       SE               t
%RECON_Y     0   0.99689178        0.19224375E-04   51855.615
CONSTANT     0   -0.19583793E-04   0.14146955E-04   -1.3843115

=>   CALL TABULATE(%SCALE,%PERIOD,%W_POWER,%W_PHASE,
=>                 %W_AMPL,%SIGNIF,%GLOBAL,%G_SIG)$

This table produces the power and phase and other values by period/scale

Obs 1 2 3 4 5 6 7 8 9 10 %SCALE 0.2500 0.2973 0.3536 0.4204 0.5000 0.5946 0.7071 0.8409 1.000 1.189 %PERIOD 0.2583 0.3071 0.3652 0.4343 0.5165 0.6143 0.7305 0.8687 1.033 1.229 %W_POWER %W_PHASE 0.6064E-06 116.4 0.1538E-04 116.6 0.3005E-03 118.3 0.3146E-02 122.8 0.1232E-01 132.4 0.2164E-01 143.6 0.2336E-01 141.6 0.5037E-01
135.1 0.2590 148.1 0.3354 173.0 %W_AMPL %SIGNIF 0.7787E-03 7.230 0.3922E-02 0.8132 0.1734E-01 0.3707 0.5609E-01 0.2774 0.1110 0.2631 0.1471 0.2855 0.1529 0.3365 0.2244 0.4181 0.5089 0.5369 0.5791 0.7035 %GLOBAL %G_SIG 0.1340E-05 2.689 0.3449E-04 0.3053 0.6882E-03 0.1406 0.7453E-02 0.1064 0.2912E-01 0.1021 0.4269E-01 0.1123 0.5829E-01 0.1342 0.9573E-01 0.1693 0.1767 0.2211 0.2689 0.2948 37 Chapter 15 38 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 => 1.414 1.682 2.000 2.378 2.828 3.364 4.000 4.757 5.657 6.727 8.000 9.514 11.31 13.45 16.00 19.03 22.63 26.91 32.00 38.05 45.25 53.82 64.00 76.11 90.51 107.6 128.0 152.2 181.0 215.3 256.0 304.4 362.0 430.5 1.461 1.737 2.066 2.457 2.922 3.475 4.132 4.914 5.844 6.949 8.264 9.828 11.69 13.90 16.53 19.66 23.38 27.80 33.06 39.31 46.75 55.60 66.11 78.62 93.50 111.2 132.2 157.2 187.0 222.4 264.5 314.5 374.0 444.8 = 6.4406460 %SA_SIG = 0.43879433 CALL PRINT('mean of reconstructed data 0.3958 0.5479 0.7847 1.276 1.954 2.569 2.270 2.109 2.357 1.703 1.368 1.318 1.605 1.823 1.311 1.424 1.154 1.295 0.6433 0.5865 1.008 0.5386 0.7283 1.363 1.169 0.6206 0.9145 0.1606 0.1404 1.092 2.003 0.2956 0.7419E-03 0.3191E-08 0.3979 0.5392 0.7290 0.9773 1.292 1.676 2.125 2.625 3.157 3.700 4.236 4.753 5.246 5.716 6.167 6.603 7.027 7.440 7.838 8.214 8.560 8.867 9.128 9.341 9.508 9.634 9.726 9.790 9.834 9.863 9.881 9.893 9.899 9.902 ',MEAN(%RECON_Y):)$ -2.582799130488190E-07 CALL PRINT('Variance of original Data Variance of original Data => 0.9314 1.236 1.635 2.140 2.758 3.481 4.285 5.130 5.968 6.750 7.442 8.025 8.496 8.865 9.146 9.355 9.509 9.621 9.702 9.760 9.802 9.831 9.852 9.867 9.878 9.885 9.891 9.894 9.897 9.899 9.900 9.901 9.902 9.902 -1.984126984126920E-05 mean of reconstructed data => 0.6751 0.9913 0.6674 0.4481 0.1477 0.1260 0.9716 1.722 0.8570 0.4540 0.5562 1.140 1.433 1.508 0.4866 0.6009 0.3603 1.189 0.9217 0.7477 1.003 0.5339 0.8368 1.215 1.116 0.8159 0.9598 0.4140 0.3777 1.045 1.415 
0.5437 0.2724E-01 0.5649E-04 CALL PRINT('mean original data ',MEAN(NINO):)$ mean original data => -125.3 -117.8 -159.3 153.5 13.59 58.03 -165.5 -146.5 -140.4 66.75 120.3 131.0 137.4 129.9 119.8 -116.2 -50.34 12.27 14.68 -34.62 -42.79 2.065 91.35 117.7 133.6 173.3 -173.0 -163.0 -114.7 -112.3 -112.3 -112.3 -112.3 -112.3 CALL PRINT(%SA_DF %SA_SIG)$ %SA_DF => 0.4558 0.9827 0.4454 0.2008 0.2182E-01 0.1589E-01 0.9440 2.964 0.7344 0.2061 0.3094 1.300 2.055 2.273 0.2368 0.3610 0.1298 1.415 0.8495 0.5591 1.005 0.2850 0.7003 1.477 1.247 0.6657 0.9212 0.1714 0.1426 1.092 2.003 0.2956 0.7419E-03 0.3191E-08 ',VARIANCE(NINO):)$ 0.5392375741582253 CALL PRINT('Variance of reconstructed Data ',VARIANCE(%RECON_Y):)$ Variance of reconstructed Data 0.5426052999725153 B34S Matrix Command Ending. Last Command reached. Space available in allocator Number variables used Number temp variables used 11856880, peak space used 54, peak number used 28, # user temp clean 61607 54 0 Table 15.11 illustrates application of equation (15.4-11) to filter the noise from the nino3 series using successively higher values of s0 . The results are shown in Figure 15.6 which is best viewed in color. The series filter3 is the most aggressive at removing noise due to s0 4.5 . Table 15.11 Smoothing Nino Series by Local Periods b34sexec options ginclude('wavedata.mac') member(nino3); b34srun; /; /; Illustrates filtering. 
/; Increasing s0 => tighter filter
/;
b34sexec matrix;
call loaddata;
/; This setting will closely filter series
call wavelet(nino :type morlet :settings :s0 2.25 :dt .25 :jtot 44);
filter1=%recon_y;
call wavelet(nino :type morlet :settings :s0 3.5 :dt .25 :jtot 44);
filter2=%recon_y;
call wavelet(nino :type morlet :settings :s0 4.5 :dt .25 :jtot 44);
filter3=%recon_y;
call tabulate(nino,filter1,filter2,filter3);
call graph(nino,filter1,filter2 filter3 :nolabel
           :heading 'Raw Nino and smoothed series'
           :file 'nino.wmf');
b34srun;

Figure 15.6 shows increased smoothing of the Nino series.

[Figure: plot titled 'Raw Nino and smoothed series'; series NINO, FILTER1, FILTER2 and FILTER3 plotted against Obs]

Figure 15.6 Raw Nino series and three wavelet smoothed series

A portion of the output that produced Figure 15.6 is shown.

Wavelet Option Settings
Input Variable                             NINO
Number of Original Observations            504
Sampling time (dt)                         0.2500000000000000
Wavelet used                               Morlet
Wave number (param)                        6.000000000000000
Smallest scale of wavelet s0=%scale(1)     2.250000000000000
Largest scale of wavelet (%scale(jtot))    3874.850681489117
Spacing between discrete scales (dj)       0.2500000000000000
Number of scales (jtot)                    44
Number of observation after padding        1024
Lag1 (background autocorrelation)          0.7200000000000000
Significance level                         5.000000000000004E-02
Lower Scale for filter                     2.000000000000000
Upper Scale for filter                     6.000000000000000
Work array size (nk)                       1024

=> FILTER1=%RECON_Y$

=> CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 3.5 :DT .25
=>              :JTOT 44)$

Wavelet Option Settings
Input Variable                             NINO
Number of Original Observations            504
Sampling time (dt)                         0.2500000000000000
Wavelet used                               Morlet
Wave number (param)                        6.000000000000000
Smallest scale of wavelet s0=%scale(1)     3.500000000000000
Largest scale of wavelet (%scale(jtot))    6027.545504538625
Spacing between discrete scales (dj)       0.2500000000000000
Number of scales (jtot)                    44
Number of observation after padding        1024
Lag1 (background autocorrelation)          0.7200000000000000
Significance level                         5.000000000000004E-02
Lower Scale for filter                     2.000000000000000
Upper Scale for filter                     6.000000000000000
Work array size (nk)                       1024

=> FILTER2=%RECON_Y$

=> CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 4.5 :DT .25
=>              :JTOT 44)$

Wavelet Option Settings
Input Variable                             NINO
Number of Original Observations            504
Sampling time (dt)                         0.2500000000000000
Wavelet used                               Morlet
Wave number (param)                        6.000000000000000
Smallest scale of wavelet s0=%scale(1)     4.500000000000000
Largest scale of wavelet (%scale(jtot))    7749.701362978233
Spacing between discrete scales (dj)       0.2500000000000000
Number of scales (jtot)                    44
Number of observation after padding        1024
Lag1 (background autocorrelation)          0.7200000000000000
Significance level                         5.000000000000004E-02
Lower Scale for filter                     2.000000000000000
Upper Scale for filter                     6.000000000000000
Work array size (nk)                       1024

=> FILTER3=%RECON_Y$

=> CALL TABULATE(NINO,FILTER1,FILTER2,FILTER3)$

Obs  NINO     FILTER1  FILTER2      FILTER3
  1  -0.1500  -0.1085  -0.1517      -0.1635
  2  -0.3000  -0.1893  -0.2318      -0.2125
  3  -0.1400  -0.2873  -0.3187      -0.2659
  4  -0.4100  -0.3926  -0.4060      -0.3239
  5  -0.4600  -0.4926  -0.4864      -0.3867
  6  -0.6600  -0.5770  -0.5541      -0.4545
  7  -0.5000  -0.6408  -0.6054      -0.5271
  8  -0.8000  -0.6861  -0.6400      -0.6038
  9  -0.9500  -0.7198  -0.6611      -0.6829
 10  -0.7200  -0.7479  -0.6751      -0.7614
 11  -0.3100  -0.7707  -0.6900      -0.8349
 12  -0.7100  -0.7817  -0.7135      -0.8978
 13  -1.040   -0.7720  -0.7500      -0.9434
 14  -0.7700  -0.7381  -0.7992      -0.9646
 15  -0.8600  -0.6890  -0.8540      -0.9546
 16  -0.8400  -0.6467  -0.9008      -0.9079
 17  -0.4100  -0.6371  -0.9211      -0.8209
 18  -0.4900  -0.6741  -0.8946      -0.6932
 19  -0.4800  -0.7441  -0.8033      -0.5274
 20  -0.7200  -0.8005  -0.6361      -0.3302
 21  -1.210   -0.7732  -0.3927      -0.1112
 22  -0.8000  -0.5943  -0.8565E-01  0.1170
 23  0.1600   -0.2287  0.2599       0.3402
 24  0.4600   0.3012   0.6087       0.5438
 25  0.4000   0.9105   0.9208       0.7143
 26  1.000    1.471    1.158        0.8407
 27  2.170    1.847    1.290        0.9155
 28  2.500    1.943    1.300        0.9355
 29  2.340    1.732    1.191        0.9021
 30  0.8000   1.265    0.9806       0.8211
  .  .        .        .            .

15.5 Use of Normalized Cumulative Periodogram to test for white noise

Box-Jenkins (1976, 294-298) and Box-Jenkins-Reinsel (2008, 347-350) suggest using the normalized cumulative periodogram $\hat{c}(\omega_j)$, defined as

$$\hat{c}(\omega_j) = \frac{\sum_{i=1}^{j} \hat{P}_x(\omega_i)}{T \hat{s}_x^2}, \qquad (15.5-1)$$

where $\hat{P}_x(\omega)$ is defined in (15.1-25) and (15.1-27) and $\hat{s}_x^2$ is the estimated variance of the series, to check for white noise. The advantage of using (15.5-1) in conjunction with inspection of the ACF is that the plot shows whether a violation of the white noise assumption is due to more low frequency than high frequency information (the plot is above the diagonal) or to more high frequency than low frequency information (the plot is below the diagonal). The coefficients for the 99%, 95%, 90% and 75% probability limits are respectively 1.63, 1.36, 1.22 and 1.02. The program cperiod listed in Table 15.12 implements this test and is exercised by the program listed in Table 15.13.

Table 15.12 Calculating the Normalized Cumulative Periodogram

subroutine cperiod(x,name,c_period,c_p_freq,idrop);
/;
/; Normalized Cumulative Periodogram
/;
/; Box-Jenkins-Reinsel (2008,347-350) suggests calculation of
/; cumulative Periodogram to detect periodic nonrandomness
/;
/; See Jenkins and Watts (1968, 235)
/;
/; For significance of .95 and .75 lambda = 1.36 and 1.02
/;                     .99 and .90 lambda = 1.63 and 1.22
/; band is +- lambda/sqrt((n/2)-1)
/;
/; Command built October 2009 by Houston H. Stokes
/;
/; x        = series to test
/; name     = name of series
/; c_period = normalized cumulative periodogram
/; c_p_freq = frequency of normalized cumulative periodogram
/; idrop    = Number of c_period values to drop
/;
/; name of file is 'n_c_period.wmf'
/;
n=dfloat(norows(x));
varx=variance(x);
if(varx.le.0.0d+00)then;
   call print('ERROR: Series has no variance':);
   go to done;
endif;
p =spectrum(x,freq2);
c_p_freq=freq2;
c_period=cusum(p)/(n*varx);
if(idrop.gt.0)then;
   c_p_freq =dropfirst(freq2, idrop);
   c_period=dropfirst(c_period,idrop);
endif;
diag=dfloat(integers(1,norows(c_period)));
diag=diag/dfloat(norows(c_period));
test=1./dsqrt(((n/2.) -1.));
upper99=diag+((1.63)*test);
lower99=diag-((1.63)*test);
upper95=diag+((1.36)*test);
lower95=diag-((1.36)*test);
/;
/; These bands can be added if desired
/;
/; upper90=diag+((1.22)*test);
/; lower90=diag-((1.22)*test);
upper75=diag+((1.02)*test);
lower75=diag-((1.02)*test);
call character(cc,'Cumulative Periodogram of ');
call character(cc2,name);
call ialen(cc2,ii);
call expand(cc,cc2,27,27+ii);
call graph(c_p_freq,c_period,diag, upper99 lower99 upper95,
           lower95 upper75 lower75
           :heading cc :pgborder :nocontact :nolabel
           :plottype xyplot :file 'n_c_period.wmf');
done continue;
return;
end;

Table 15.13 Testing series for white noise using the Normalized Cumulative Periodogram

b34sexec options ginclude('gas.b34'); b34srun;
/$
/$ Job tests c_period Command
/$
b34sexec matrix;
call loaddata;
call load(cperiod);
idrop=0;
call cperiod(gasout,'gasout',c_period,c_p_freq,idrop);
call dodos('rename n_c_period.wmf fig15.7.wmf' :);
x=rn(gasout);
call cperiod(x, 'ran()',c_period,c_p_freq, idrop);
call dodos('rename n_c_period.wmf fig15.8.wmf' :);
b34srun$

Figure 15.7 for the gasout series clearly shows a preponderance of low frequency information, making it possible to conclude at the 99% level that the series is not white noise.
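The calculation behind (15.5-1) can also be sketched outside B34S. The Python fragment below is a minimal illustration under stated assumptions and is not a translation of cperiod: the periodogram ordinates are formed by a direct sum at the frequencies 2*pi*k/n instead of a call to spectrum, the cumulative sum is divided by the total of the ordinates (standing in for the T times variance term) so that the last value is one, and the names cumulative_periodogram and largest_departure are hypothetical. The band half-width lambda/sqrt(n/2 - 1) follows the comment in Table 15.12, and the series is assumed to have more than two observations.

```python
import math


def cumulative_periodogram(x):
    """Return (frequencies, normalized cumulative periodogram) of a series."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]          # remove the mean before transforming
    freq, power = [], []
    for k in range(1, n // 2 + 1):     # Fourier frequencies up to pi
        w = 2.0 * math.pi * k / n
        a = sum(v * math.cos(w * t) for t, v in enumerate(z))
        b = sum(v * math.sin(w * t) for t, v in enumerate(z))
        freq.append(w)
        power.append(a * a + b * b)
    total = sum(power)
    if total <= 0.0:                   # avoid dividing by zero
        raise ValueError("series has no variance")
    c, running = [], 0.0
    for p in power:
        running += p
        c.append(running / total)      # assumption: scale so the last value is 1
    return freq, c


def largest_departure(x, lam=1.36):
    """Signed largest gap between the plot and the diagonal, and the band half-width."""
    freq, c = cumulative_periodogram(x)
    m = len(c)
    half_width = lam / math.sqrt(len(x) / 2.0 - 1.0)
    departures = [cj - (j + 1) / m for j, cj in enumerate(c)]
    return max(departures, key=abs), half_width


# Two artificial series: one slow cycle and one alternating series.
n = 200
slow = [math.sin(2.0 * math.pi * t / n) for t in range(n)]
fast = [(-1.0) ** t for t in range(n)]
slow_dep, band = largest_departure(slow)   # lam=1.36 gives the 95% band
fast_dep, _ = largest_departure(fast)
```

In this sketch a positive departure that exceeds the half-width corresponds to a plot above the diagonal (excess low frequency information), and a negative departure of that size corresponds to a plot below the diagonal (excess high frequency information).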
Figure 15.8 shows the plot for a white noise series.

[Figure: plot titled 'Cumulative Periodogram of gasout'; C_PERIOD, DIAG, UPPER99, LOWER99, UPPER95, LOWER95, UPPER75 and LOWER75 plotted against C_P_FREQ]

Figure 15.7 Normalized Cumulative Periodogram for gasout series

[Figure: plot titled 'Cumulative Periodogram of ran()'; C_PERIOD, DIAG, UPPER99, LOWER99, UPPER95, LOWER95, UPPER75 and LOWER75 plotted against C_P_FREQ]

Figure 15.8 Normalized Cumulative Periodogram for white noise series

15.6 Forecasting using spectral methods

Table 15.14 lists a matrix command subroutine that uses spectral methods to construct forecasts. Unlike a RATS command of the same name, no attempt is made to smooth the FFT.

Table 15.14 A Subroutine to Calculate Forecasts using the FFT of a series

subroutine specfore(data,startf,numf,detrend,forecast,obs,error,actual);
/;
/; Forecast with spectral methods.
/;
/; Based on code developed by Michael Hunstad using
/; regression methods to partially reverse-engineer the RATS
/; specfore command. An improved version of the
/; Hunstad Matlab code is in c:\b34slm\mfiles as
/; specfore.m
/;
/; This implementation by Houston H. Stokes uses
/; a FFT to save space and reduce CPU use. Added capability is
/; provided. Unlike the RATS implementation of this technique,
/; the current implementation does not smooth the FFT
/;
/; Rats smooths the data
/;
/; data     => series to forecast. # of obs = n
/; startf   => last period before start forecasting
/; numf     => number of forecasts
/; detrend  => Detrend the data if gt 0. =2 print trend OLS Model
/; forecast => Forecast
/; obs      => Observation number associated with forecast
/; error    => Defined if startf lt n
/; actual   => defined if startf lt n
/;
/; Routine developed 8 December 2009 by Houston H. Stokes
/;
nobs=norows(data);
error=missing();
actual=missing();
if(nobs .lt.startf)then;
   call epprint('ERROR: In call specfore startf not le nobs of data':);
   call epprint(' nobs of data was ',nobs:);
   call epprint(' startf was ',startf:);
   go to endit;
endif;
series=data(integers(1,startf));
seriesm=mean(series);
series=series-seriesm;
if(detrend.ne.0)then;
   trend=dfloat(integers(startf));
   if(detrend.eq.1)call olsq(series trend :qr);
   if(detrend.eq.2)call olsq(series trend :qr :print);
   if(klass(series).eq.5)series=series-afam(%yhat);
   if(klass(series).eq.1)series=series-vfam(%yhat);
   tcoef=%coef(1);
   tmean=%coef(2);
endif;
obs=dfloat(integers(startf+1,startf+numf));
call spectral(series,sinx,cosx,px,sx,freq);
beta=vfam(catrow(cosx,sinx));
/; zero out cosx for highest freq
ijunk=norows(cosx);
beta(ijunk,1)=0.0;
forecast=vector(numf:);
tt=dfloat(integers(0,numf-1))+dfloat(startf);
do i=1,numf;
   c1=cos(afam(freq)*afam(tt(i)));
   s1=sin(afam(freq)*afam(tt(i)));
   cc=vfam(catrow(c1,s1));
   if(detrend.eq.0)forecast(i)=transpose(beta)*cc + sfam(seriesm);
   if(detrend.ne.0)forecast(i)=transpose(beta)*cc + sfam(seriesm)
                              +tmean + (dfloat(startf+i)*tcoef);
enddo;
if(startf.lt.nobs)then;
   iend=dmin1(startf+numf,nobs);
   nn2 =integers(iend-startf);
   actual = data(integers(startf+1,iend));
   if(abs(klass(actual)).eq.1)error = actual - forecast(nn2);
   if(abs(klass(actual)).eq.5)error = actual - afam(forecast(nn2));
endif;
endit continue;
return;
end;

Table 15.15 shows the use of the specfore command in B34S and RATS. Note the effect of the trend correction. The MATLAB implementation uses OLS in place of the FFT and does not provide the option of trend correction. The advantage of using OLS to obtain the sine and cosine coefficients is transparency. The disadvantage is the added computing cost in both CPU time and space.
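The idea of extrapolating the fitted sine and cosine terms, with and without a trend correction, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a translation of specfore: the coefficients come from a direct sum rather than an FFT, time is indexed from zero, the coefficient at the highest frequency of an even-length sample is halved rather than set to zero, and the names fourier_coefficients and spectral_forecast and the artificial series are hypothetical.

```python
import math


def fourier_coefficients(z):
    """Cosine and sine coefficients of a mean-zero series at the Fourier frequencies."""
    n = len(z)
    coefs = []
    for k in range(1, n // 2 + 1):
        w = 2.0 * math.pi * k / n
        a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(z))
        b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(z))
        if n % 2 == 0 and k == n // 2:
            a, b = a / 2.0, 0.0        # highest frequency enters only once
        coefs.append((w, a, b))
    return coefs


def spectral_forecast(data, startf, numf, detrend=False):
    """Forecast numf periods beyond the first startf observations."""
    if startf < 2 or startf > len(data):
        raise ValueError("startf must lie between 2 and the number of observations")
    y = list(data[:startf])
    n = len(y)
    ybar = sum(y) / n
    slope, intercept = 0.0, ybar
    if detrend:                        # OLS of y on a linear time index
        tbar = (n - 1) / 2.0
        sxx = sum((t - tbar) ** 2 for t in range(n))
        sxy = sum((t - tbar) * (v - ybar) for t, v in enumerate(y))
        slope = sxy / sxx
        intercept = ybar - slope * tbar
    z = [v - intercept - slope * t for t, v in enumerate(y)]
    coefs = fourier_coefficients(z)
    forecasts = []
    for t in range(n, n + numf):       # extend the time index past the sample
        cycle = sum(a * math.cos(w * t) + b * math.sin(w * t) for w, a, b in coefs)
        forecasts.append(intercept + slope * t + cycle)
    return forecasts


# Artificial monthly series: linear trend plus a 12-period cycle.
data = [10.0 + 0.5 * t + 3.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(72)]
no_trend = spectral_forecast(data, 60, 12)
with_trend = spectral_forecast(data, 60, 12, detrend=True)
```

Because every harmonic in this sketch repeats after startf periods, the forecast made without a trend correction simply replays the start of the estimation sample. With a trending series that replay misses the level of the later observations, which is the gap the trend correction is meant to close.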
Table 15.15 Using the FFT to Forecast

%b34slet domatlab = 0;
%b34slet dorats = 0;
%b34slet dob34s1 = 1;
%b34slet file1="'_b34sdat.dat'"$
%b34slet file2="'b34sdata.m'"$
b34sexec options ginclude('b34sdata.mac') member(lydiapnm); b34srun;
/$ user places RATS commands between
/$ PGMCARDS$
/$ note: user RATS commands here
/$ B34SRETURN$
/$
%b34sif(&dob34s1.ne.0)%then;
b34sexec matrix;
call echooff;
call loaddata;
call load(specfore);
call print(' ':);
call print('Forecast of sales and Advertising':);
nfor=30;
base=60;
call specfore(sales, base,nfor,0,fsales1,obs,error1,actual1);
call specfore(sales, base,nfor,2,fsales2,obs,error2,actual2);
call specfore(advertis,base,nfor,0,fadd1,obs,error3,actual3);
call specfore(advertis,base,nfor,2,fadd2,obs,error4,actual4);
call print(' ':);
call print('With out Trend Correction':);
call tabulate(obs,actual1,fsales1,error1,actual3,fadd1,error3);
call print('With Trend Correction':);
call tabulate(obs,actual2,fsales2,error2,actual4,fadd2,error4);
nn=integers(norows(actual1));
obs = obs(nn);
fsales1=fsales1(nn);
fsales2=fsales2(nn);
nn=integers(norows(actual3));
fadd1 =fadd1(nn);
fadd2 =fadd2(nn);
call tabulate(obs actual1,fsales1 fsales2 );
call graph(obs actual1,fsales1 fsales2 :plottype xyplot
           :heading 'Sales Forecast out of sample # 2 with trend'
           :nolabel :nocontact :pgborder);
call graph(obs actual3 fadd1 fadd2 :plottype xyplot
           :heading 'Advertis Forecast out of sample notrend # 2 with trend'
           :nolabel :nocontact :pgborder);
cc1=ccf(fsales1,actual1);
cc2=ccf(fsales2,actual2);
cc3=ccf(fadd1,actual3);
cc4=ccf(fadd2,actual4);
ss1=sumsq(error1);
ss2=sumsq(error2);
ss3=sumsq(error3);
ss4=sumsq(error4);
call print(' ':);
call print('Out of sample sales no trend sumsq ',ss1:);
call print('Out of sample sales with trend sumsq ',ss2:);
call print('Out of sample advertis no trend sumsq ',ss3:);
call print('Out of sample sales with trend sumsq ',ss4:);
call print('Out of sample sales forecast no trend correlation ',cc1:);
call print('Out of sample sales forecast with trend correlation ',cc2:);
call print('Out of sample adver forecast no trend correlation ',cc3:);
call print('Out of sample adver forecast with trend correlation ',cc4:);
b34srun;
%b34sendif;

%b34sif(&dorats.ne.0)%then;
B34SEXEC OPTIONS OPEN('rats.dat') UNIT(28) DISP=UNKNOWN$ B34SRUN$
B34SEXEC OPTIONS OPEN('rats.in') UNIT(29) DISP=UNKNOWN$ B34SRUN$
B34SEXEC OPTIONS CLEAN(28)$ B34SRUN$
B34SEXEC OPTIONS CLEAN(29)$ B34SRUN$
B34SEXEC PGMCALL$
RATS PASSASTS
PCOMMENTS('* ',
          '* Data passed from B34S(r) system to RATS',
          '* ') $
PGMCARDS$
*
* see section 7.5 in RATS manual
* Source(NOECHO) d:\R\specfore.src
* SOURCE(ECHO) d:\R\specfore.src
SET SERIES = sales
COMPUTE istart = 60
COMPUTE iend = 90
*
* @SPECFORE( options ) series start end forecasts
* Computes forecasts using spectral techniques
*
* Parameters:
* series : (input) Series to be forecast
* start end : Range of entries to forecast
* forecasts : (output) Series for computed forecasts
*
@SPECFORE(DIFFS=0,SDIFFS=0,TRANS=NONE,CONSTANT) SERIES ISTART IEND FORE
SET ERROR = SERIES - FORE
PRINT istart iend SERIES FORE ERROR
B34SRETURN$
B34SRUN $
B34SEXEC OPTIONS CLOSE(28)$ B34SRUN$
B34SEXEC OPTIONS CLOSE(29)$ B34SRUN$
B34SEXEC OPTIONS
/$ dodos('start /w /r rats386 rats.in rats.out ')
   dodos('start /w /r rats32s rats.in /run')
   dounix('rats rats.in rats.out')$ B34SRUN$
B34SEXEC OPTIONS NPAGEOUT
   WRITEOUT('Output from RATS',' ',' ')
   COPYFOUT('rats.out')
   dodos('ERASE rats.in','ERASE rats.out','ERASE rats.dat')
   dounix('rm rats.in','rm rats.out','rm rats.dat')
   $ B34SRUN$
%b34sendif;

%b34sif(&domatlab.ne.0)%then;
/$
/$ Builds a MATLAB input file for MATLAB version 6.
/$ Changes made 2 February 2002
/$
/$ Since MATLAB is case sensitive, use lower case for all variable
/$ references that are from b34s.
/$ MATLAB uses upper case for a matrix
/$ variable
/$
/$ This job assumes user has already loaded data in B34S
/$ The file name for file1 is hard coded in the matlab m file (file2)
/$
/$ User changes this to default matlab file directory
/$
/$
/$ Job runs on linux matlab and windows matlab
/$
/$ When job ends, output will be seen in b34s.out file
/$
/$ User loads data here if it has not occurred already
/$
b34sexec options open(%b34seval(&file1)) unit(28) disp=unknown$ b34seend$
b34sexec options clean(28)$ b34seend$
b34sexec options open(%b34seval(&file2)) unit(29) disp=unknown$ b34seend$
b34sexec options clean(29)$ b34seend$
b34sexec pgmcall$
matlab lowercase outfile(%b34seval(&file1))$
pgmcards$
% User MATLAB commands here such as plot(varname)
% x1=test(sales,60,2,1,1);
x1=specfore(sales,60,10,1,1);
% quit is needed since have to get out of matlab automatically
% Comment to stay in matlab and see plot
b34sreturn$
b34seend$
b34sexec options close(28)$ b34srun$
b34sexec options close(29)$ b34srun$
b34sexec options
   dodos('start /w /r matlab /r b34sdata /logfile matlab.out')
   dounix('matlab < b34sdata.m > matlab.out'); b34srun;
b34sexec options writeout(' ',
                          'Output from Matlab ',
                          ' '); b34srun;
b34sexec options copyfout('matlab.out'); b34srun;
b34sexec options dodos('erase matlab.out')
                 dounix('rm matlab.out'); b34srun;
%b34sendif;

When run, this job produces:

Variable     # Cases  Mean         Std Deviation  Variance     Maximum      Minimum
SALES     1  78       1278.692308  196.6126330    38656.52747  1728.000000  772.0000000
ADVERTIS  2  78       619.3974359  433.9763696    188335.4893  1388.000000  39.00000000
CONSTANT  3  78       1.000000000  0.000000000    0.000000000  1.000000000  1.000000000

Number of observations in data file 78
Current missing variable code 1.000000000000000E+31
Data begins on (D:M:Y) 1: 1:1954 ends 1: 6:1960. Frequency is 12

B34S(r) Matrix Command. d/m/y 8/12/09. h:m:s 16:17:39.

=> CALL ECHOOFF$

Forecast of sales and Advertising

With out Trend Correction

Obs  OBS    ACTUAL1  FSALES1  ERROR1   ACTUAL3  FADD1  ERROR3
  1  61.00  1052.    1297.    -244.8   838.0    1245.  -406.6
  2  62.00  1102.    1316.    -214.2   994.0    1385.  -391.4
  3  63.00  1355.    1730.    -374.8   1020.    946.6  73.43
  4  64.00  1323.    1537.    -214.2   865.0    954.4  -89.43
  5  65.00  1296.    1326.    -29.78   819.0    51.57  767.4
  6  66.00  1127.    1262.    -135.2   83.00    74.43  8.567
  7  67.00  1170.    1171.    -0.7833  56.00    36.57  19.43
  8  68.00  1059.    1477.    -418.2   224.0    502.4  -278.4
  9  69.00  1116.    1633.    -516.8   881.0    1135.  -253.6
 10  70.00  1214.    1544.    -330.2   436.0    952.4  -516.4
 11  71.00  966.0    1461.    -494.8   160.0    665.6  -505.6
 12  72.00  1089.    1085.    3.783    68.00    163.4  -95.43
 13  73.00  814.0    1173.    -358.8   749.0    978.6  -229.6
 14  74.00  1087.    1404.    -317.2   857.0    1309.  -452.4
 15  75.00  1180.    1621.    -440.8   898.0    1353.  -454.6
 16  76.00  1167.    1506.    -339.2   705.0    1106.  -401.4
 17  77.00  1210.    1523.    -312.8   489.0    501.6  -12.57
 18  78.00  1092.    1339.    -247.2   59.00    158.4  -99.43

With Trend Correction

Obs  OBS    ACTUAL2  FSALES2  ERROR2  ACTUAL4  FADD2   ERROR4
  1  61.00  1052.    1028.    24.21   838.0    948.2   -110.2
  2  62.00  1102.    1043.    59.30   994.0    1084.   -90.04
  3  63.00  1355.    1461.    -105.8  1020.    650.2   369.8
  4  64.00  1323.    1264.    59.30   865.0    653.0   212.0
  5  65.00  1296.    1057.    239.2   819.0    -244.8  1064.
  6  66.00  1127.    988.7    138.3   83.00    -227.0  310.0
  7  67.00  1170.    901.8    268.2   56.00    -259.8  315.8
  8  68.00  1059.    1204.    -144.7  224.0    201.0   22.96
  9  69.00  1116.    1364.    -247.8  881.0    838.2   42.84
 10  70.00  1214.    1271.    -56.70  436.0    651.0   -215.0
 11  71.00  966.0    1192.    -225.8  160.0    369.2   -209.2
 12  72.00  1089.    811.7    277.3   68.00    -138.0  206.0
 13  73.00  814.0    903.8    -89.79  749.0    682.2   66.84
 14  74.00  1087.    1131.    -43.70  857.0    1008.   -151.0
 15  75.00  1180.    1352.    -171.8  898.0    1056.   -158.2
 16  76.00  1167.    1233.    -65.70  705.0    805.0   -100.0
 17  77.00  1210.    1254.    -43.79  489.0    205.2   283.8
 18  78.00  1092.    1066.    26.30   59.00    -143.0  202.0
 19  79.00  NA       979.8    NA      NA       -271.8  NA
 20  80.00  NA       986.7    NA      NA       85.04   NA
 21  81.00  NA       1152.    NA      NA       729.2   NA
 22  82.00  NA       1283.    NA      NA       525.0   NA
 23  83.00  NA       954.8    NA      NA       -193.8  NA
 24  84.00  NA       777.7    NA      NA       -189.0  NA
 25  85.00  NA       974.8    NA      NA       668.2   NA
 26  86.00  NA       1086.    NA      NA       916.0   NA
 27  87.00  NA       1393.    NA      NA       893.2   NA
 28  88.00  NA       1442.    NA      NA       670.0   NA
 29  89.00  NA       1104.    NA      NA       293.2   NA
 30  90.00  NA       1018.    NA      NA       -206.0  NA

Obs  OBS    ACTUAL1  FSALES1  FSALES2
  1  61.00  1052.    1297.    1028.
  2  62.00  1102.    1316.    1043.
  3  63.00  1355.    1730.    1461.
  4  64.00  1323.    1537.    1264.
  5  65.00  1296.    1326.    1057.
  6  66.00  1127.    1262.    988.7
  7  67.00  1170.    1171.    901.8
  8  68.00  1059.    1477.    1204.
  9  69.00  1116.    1633.    1364.
 10  70.00  1214.    1544.    1271.
 11  71.00  966.0    1461.    1192.
 12  72.00  1089.    1085.    811.7
 13  73.00  814.0    1173.    903.8
 14  74.00  1087.    1404.    1131.
 15  75.00  1180.    1621.    1352.
 16  76.00  1167.    1506.    1233.
 17  77.00  1210.    1523.    1254.
 18  78.00  1092.    1339.    1066.

Out of sample sales no trend sumsq                    1804827.578333335
Out of sample sales with trend sumsq                  426933.1095972446
Out of sample advertis no trend sumsq                 2229763.246666659
Out of sample sales with trend sumsq                  1847970.135394790
Out of sample sales forecast no trend correlation     0.5039421385465304
Out of sample sales forecast with trend correlation   0.5023270654992271
Out of sample adver forecast no trend correlation     -9.283852553158210E-02
Out of sample adver forecast with trend correlation   -9.305692521339150E-02

B34S Matrix Command Ending. Last Command reached.
Space available in allocator    8856637, peak space used      9740
Number variables used               100, peak number used      108
Number temp variables used         2228, # user temp clean       0

B34S 8.11D  (D:M:Y) 8/12/09 (H:M:S) 16:17:43  PGMCALL STEP  Lydia Pinkham Monthly Data  PAGE 2

Output from RATS

*
* Data passed from B34S(r) system to RATS
*
CALENDAR 1954 1 12
ALLOCATE 78
OPEN DATA rats.dat
DATA(FORMAT=FREE,ORG=OBS, $
     MISSING= 0.1000000000000000E+32 ) / $
     SALES $
     ADVERTIS $
     CONSTANT
SET TREND = T
TABLE

Series    Obs  Mean           Std Error     Minimum       Maximum
SALES     78   1278.69230769  196.61263304  772.00000000  1728.00000000
ADVERTIS  78   619.39743590   433.97636957  39.00000000   1388.00000000
TREND     78   39.50000000    22.66053839   1.00000000    78.00000000

*
* see section 7.5 in RATS manual
* Source(NOECHO) d:\R\specfore.src
* SOURCE(ECHO) d:\R\specfore.src
SET SERIES = sales
COMPUTE istart = 60
COMPUTE iend = 90
*
* @SPECFORE( options ) series start end forecasts
* Computes forecasts using spectral techniques
*
* Parameters:
* series : (input) Series to be forecast
* start end : Range of entries to forecast
* forecasts : (output) Series for computed forecasts
*
@SPECFORE(DIFFS=0,SDIFFS=0,TRANS=NONE,CONSTANT) SERIES ISTART IEND FORE
SET ERROR = SERIES - FORE
PRINT istart iend SERIES FORE ERROR

ENTRY    SERIES  FORE            ERROR
1958:12  1072    1240.460004817  -168.4600048169
1959:01  1052    1200.763596247  -148.7635962471
1959:02  1102    1235.408547587  -133.4085475874
1959:03  1355    1352.704220311  2.2957796894
1959:04  1323    1358.394344689  -35.3943446892
1959:05  1296    1320.524022842  -24.5240228422
1959:06  1127    1280.304763050  -153.3047630499
1959:07  1170    1287.241992611  -117.2419926114
1959:08  1059    1305.249870237  -246.2498702365
1959:09  1116    1326.211286902  -210.2112869018
1959:10  1214    1360.412853130  -146.4128531300
1959:11  966     1300.166743214  -334.1667432144
1959:12  1089    1302.188498103  -213.1884981030
1960:01  814     1323.426262875  -509.4262628748
1960:02  1087    1330.313780233  -243.3137802326
1960:03  1180    1354.704626186  -174.7046261861
1960:04  1167    1320.833749223  -153.8337492228
1960:05  1210    1303.224404989  -93.2244049893
1960:06  1092    1310.609988566  -218.6099885658
1960:07  NA      1330.479685398  NA
1960:08  NA      1337.848002486  NA
1960:09  NA      1318.335043563  NA
1960:10  NA      1317.479168750  NA
1960:11  NA      1317.880136638  NA
1960:12  NA      1316.633819117  NA
1961:01  NA      1335.905858936  NA
1961:02  NA      1328.465844195  NA
1961:03  NA      1323.632170191  NA
1961:04  NA      1333.070850317  NA
1961:05  NA      1322.175915526  NA
1961:06  NA      1344.757186664  NA

Note the out-of-sample correlations of .5039 and .5023 for the sales forecasts made without and with a trend correction. The advertis forecasts, on the other hand, were not good, presumably due to less structure in the underlying series. Experimentation with the spectral forecasting approach suggests that it is a valuable tool but cannot be used blindly. Unlike a forecast from an ARIMA model, which eventually converges to the expected value of the series, a spectral forecast does not converge.

15.7 Conclusion

After a brief survey of spectral analysis theory, the periodogram of two simple AR models was graphed. Using the gas furnace data, the OLS approach to spectral analysis was contrasted with the frequency domain approach. The matrix command implementation was shown to be an easily customizable way to use spectral tools to explore the dynamics of a series. In Stokes (200x) use of the fast Fourier transform to filter series is illustrated. Some simple examples using the FILTER subroutine are presented in Chapter 16 of this book. Wavelet analysis, which provides a way to study a series by both time and frequency, was discussed and a number of examples that analyze the Nino water temperature data were shown. At issue is the appropriate period to use for the analysis.
If noise in the data is assumed to be of short duration, wavelet analysis makes it possible to remove this short duration information so that analysis can proceed with a smoothed series that more accurately reflects the underlying process. Finally, forecasting a series using FFT methods was illustrated.