Analysis by EEOF of the monthly temperature and precipitation time series in Romania

C. Mares and Ileana Mares
National Meteorological Administration, Bucuresti-Ploiesti, 97, 013686, Bucharest, Romania
E-mail: constantin.mares@meteo.inmh.ro

Abstract

The meteorological time series analyzed in this paper include mean monthly temperature values from 29 stations, and precipitation amounts from 31 stations, distributed relatively uniformly across Romania. Both data series cover the 1950-2003 period. The first problem that appears when applying the Extended Empirical Orthogonal Functions (EEOF) decomposition to multivariate time series is the construction of the covariance matrix. With NS the number of stations and M the length of the time series, a concatenated matrix of size (m*NS, M - m + 1) is built, where m = 3 months. From the concatenated matrix we formed the symmetric covariance matrix, with different lags. To solve for the eigenvalues, the covariance matrix is diagonalized so that the first eigenvalue explains the largest covariance fraction of the events computed at moments t, t + 1, t + 2. In EEOF applications, a major problem is to determine how many components to retain. This problem is extensively discussed in the literature, but the most attractive selection criterion is Rule N. According to Rule N, only the first three components for temperature, and the first nine components for precipitation, can be considered significant, while the other components represent noise. The first three EEOF modes for temperature account for 86% of the total variance, while for precipitation the first nine modes explain 63%.

1. Introduction

Data analysis using Empirical Orthogonal Functions (EOF) has become a standard procedure in meteorological and oceanographic studies. Lorenz (1956) introduced EOF analysis to the earth sciences.
In previous studies (Mares 1988), the EOF technique with varimax rotation (Richman 1986) was used in order to find homogeneous areas of temperature and precipitation variability in Romania. While many different types of EOF techniques are available (Kim and Wu 1999), the extended EOF (EEOF) of Weare and Nasstrom (1982) was chosen for the present study, because it is best suited to forecasting. The conventional EOF identifies the main patterns of variability that are coherent in space. In EEOF, patterns coherent both in space and in time are identified (Wang et al. 1995). In such analyses, the field is studied at m successive moments, that is, a moving window of length m is introduced. In a traditional EOF analysis, m = 1. The way in which the window is selected depends on the aim of the analysis.

Vautard and Ghil (1989) discussed the power of the EEOF method for identifying physical oscillations, as well as for analyzing, filtering and forecasting time series. As shown in Vautard (1995), singular spectrum analysis (SSA) is a particular application of the classical EOF. In classical EOF analysis, the state vector contains information about points in space at the same moment in time. When space and time vary, SSA is called M-SSA.

One disadvantage of the EEOF analysis is that it is ineffective when the spatial coherence is small. Another disadvantage is its large computational memory requirement. The combined statistical technique of decomposing certain meteorological fields into EEOFs, followed by temporal extrapolation using an AR-MEM model, yielded useful results for surface temperature fields in Romania (Mares and Mares, 2003).

In the present study, Section 2 describes the EEOF analysis of temperature and precipitation in Romania, and a summary and concluding remarks follow in Section 3.

2. EEOF analysis of temperature and precipitation in Romania

The meteorological time series analyzed in this paper include mean monthly temperature values from 31 stations, and precipitation amounts from 33 stations, distributed relatively uniformly across Romania. Both data series cover the 1950-2003 period. The first problem that appears when applying the EEOF decomposition to multivariate time series is the construction of the covariance matrix. With NS the number of stations and M the length of the time series, a concatenated matrix of size (m*NS, M - m + 1) is built, where m = 3 months. For these data, the concatenated matrix, shown here in transposed form with one three-month window per row, has the following form:

\[
X^{T} =
\begin{pmatrix}
X_{1,1} & \cdots & X_{NS,1} & X_{1,2} & \cdots & X_{NS,2} & X_{1,3} & \cdots & X_{NS,3} \\
X_{1,2} & \cdots & X_{NS,2} & X_{1,3} & \cdots & X_{NS,3} & X_{1,4} & \cdots & X_{NS,4} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
X_{1,M-2} & \cdots & X_{NS,M-2} & X_{1,M-1} & \cdots & X_{NS,M-1} & X_{1,M} & \cdots & X_{NS,M}
\end{pmatrix}
\]

where NS = 31 for temperatures and NS = 33 for precipitation, while M = 646. The data are standardized with respect to the mean and variance of each calendar month. The time sequences are subdivided as follows:
a) January 1950 - October 2003;
b) February 1950 - November 2003;
c) March 1950 - December 2003.

From the concatenated matrix we formed the symmetric covariance matrix, with different lags,

\[ C_{XX} = \frac{X X^{T}}{M}, \]

whose elements are the sums, over all moments, of the products of the value of X at point i and the value at point j. Because of the standardization, the covariance matrix is identical to the correlation matrix. To solve for the eigenvalues, the covariance/correlation matrix is diagonalized so that the first eigenvalue explains the largest covariance fraction of the events computed at moments t, t + 1, t + 2. Weare and Nasstrom (1982) were the first to apply this technique and, according to their theory, the events need not be successive. For instance, other successions such as t, t + 2, t + 4 can be used.
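The construction described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function names (`standardize_by_month`, `build_extended_matrix`, `eeof`), the `step` parameter for non-successive lags, and the synthetic random input are all assumptions introduced here. The input is assumed to be an array of shape (NS, number of months) that starts in January.

```python
import numpy as np

def standardize_by_month(data):
    """Standardize each station by the mean and std of each calendar month.

    Assumes data has shape (NS, n_months) and that column 0 is a January.
    """
    out = np.empty_like(data, dtype=float)
    for month in range(12):
        cols = slice(month, None, 12)          # every 12th column from this month
        block = data[:, cols]
        out[:, cols] = (block - block.mean(axis=1, keepdims=True)) / block.std(
            axis=1, keepdims=True
        )
    return out

def build_extended_matrix(data, m=3, step=1):
    """Stack m lagged copies of data into an (m*NS, n_windows) matrix.

    step=1 gives the lags t, t+1, t+2; step=2 gives t, t+2, t+4 (hypothetical option).
    """
    n_time = data.shape[1]
    n_win = n_time - (m - 1) * step
    if n_win < 1:
        raise ValueError("time series too short for the requested window")
    blocks = [data[:, k * step : k * step + n_win] for k in range(m)]
    return np.vstack(blocks)

def eeof(data, m=3, step=1):
    """Eigen-decomposition of the lagged covariance matrix C = X X^T / n_windows."""
    x = build_extended_matrix(data, m, step)
    n_win = x.shape[1]
    cov = x @ x.T / n_win
    eigvals, eigvecs = np.linalg.eigh(cov)     # symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = eigvecs.T @ x                        # amplitude time series of each mode
    explained = eigvals / eigvals.sum()        # fraction of total variance per mode
    return eigvals, eigvecs, pcs, explained

# Synthetic stand-in for the station data (NOT the Romanian observations):
# 4 hypothetical stations, 648 monthly values (54 years).
rng = np.random.default_rng(0)
demo = rng.standard_normal((4, 648))
z = standardize_by_month(demo)
x = build_extended_matrix(z, m=3)
eigvals, eigvecs, pcs, explained = eeof(z, m=3)
print(x.shape, explained[:3])
```

With 648 monthly values and m = 3, the sketch produces 646 overlapping three-month windows, the same count as the sequences a) to c). Dividing by the number of windows is a choice made here; any constant normalization leaves the eigenvectors and the explained-variance fractions unchanged.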
In both EEOF and EOF applications, a major problem is to determine how many components to retain. This problem is extensively discussed in the literature, but the most attractive selection criterion is Rule N, found in Preisendorfer (1988). This method involves estimating a 90% confidence interval for each EEOF, with the confidence limits at the probabilities p1 = 0.95 and p2 = 0.05. As Preisendorfer (1988) shows, when the time series are correlated, Rule N should not be applied directly; instead, an effective sample size n* should be determined and used in place of n. The effective sample size is approximately given by

\[ n^{*} = n \, \frac{1 - \rho^{2}}{1 + \rho^{2}}, \]

where n = 646 and ρ is the autoregressive (autocorrelation) coefficient. A value of n* is estimated for each ρ(x) at the points x = 1, ..., p, and the smallest of these values is used instead of n.

In order to obtain the confidence interval, 100 independent random data sets were synthesized, each representing the elements of a (p x n*) matrix, where p is the number of stations and n* is the number of independent samples. In the present study, p = 31 for temperature and 33 for precipitation. First the eigenvalues are obtained. Then the 5th and 95th percentiles are determined from the 5th and the 95th realizations in the cumulative distribution of the jth random eigenvalue, arranged in decreasing order.

Fig. 1a shows the amplitude of the first 20 eigenvalues for the temperature data in comparison with the 5th and 95th percentiles. According to Preisendorfer (1988), the band between the 5th and 95th percentiles narrows approximately as $(n-1)^{-1/2}$ with increasing sample size n. Korres et al. (2000) also obtained a very narrow band between the 5th and 95th percentiles, and showed that this is due to the very small ratio n/p in their analysis. For more insight into this confidence interval, an inset is included in Fig. 1a, showing the confidence interval for the first 6 eigenvalues.
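The selection procedure described above can be illustrated with a short Monte Carlo sketch. It is a simplified reading of Rule N under stated assumptions, not the authors' implementation: the function names, the use of Gaussian white noise, the lag-one estimator of ρ, and the rule of stopping at the first eigenvalue that falls below the 95th percentile are all choices made here for illustration. The observed eigenvalues are assumed to come from a correlation matrix, so that they sum to p like the random ones.

```python
import numpy as np

def lag1_autocorrelation(series):
    """Lag-one autocorrelation of a 1-D series (one possible estimate of rho)."""
    s = series - series.mean()
    return float(np.dot(s[:-1], s[1:]) / np.dot(s, s))

def effective_sample_size(n, rho):
    """n* = n (1 - rho^2) / (1 + rho^2), evaluated elementwise for each rho."""
    rho = np.asarray(rho, dtype=float)
    return n * (1.0 - rho**2) / (1.0 + rho**2)

def rule_n_bounds(p, n_eff, n_trials=100, seed=0):
    """5th and 95th percentiles of eigenvalues of random (p x n_eff) data sets."""
    rng = np.random.default_rng(seed)
    eig = np.empty((n_trials, p))
    for k in range(n_trials):
        noise = rng.standard_normal((p, n_eff))
        # Standardize each row so that the matrix below is a correlation matrix.
        noise = (noise - noise.mean(axis=1, keepdims=True)) / noise.std(
            axis=1, keepdims=True
        )
        corr = noise @ noise.T / n_eff
        eig[k] = np.sort(np.linalg.eigvalsh(corr))[::-1]   # decreasing order
    lower = np.percentile(eig, 5, axis=0)
    upper = np.percentile(eig, 95, axis=0)
    return lower, upper

def n_significant(eigvals, upper):
    """Count leading eigenvalues that exceed the 95th-percentile noise level."""
    count = 0
    for lam, lim in zip(eigvals, upper):
        if lam <= lim:
            break
        count += 1
    return count

# Hypothetical autocorrelations at three points; the smallest n* is retained.
n_star = int(effective_sample_size(646, [0.1, 0.3, 0.2]).min())

# Toy example with p = 5 variables and made-up "observed" eigenvalues.
lower, upper = rule_n_bounds(p=5, n_eff=200)
observed = np.array([3.0, 0.9, 0.6, 0.3, 0.2])
print(n_star, n_significant(observed, upper))
```

Because each random data set is standardized before the eigenvalues are computed, the random eigenvalues of every trial sum to p, which puts them on the same scale as eigenvalues of an observed correlation matrix.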
For the precipitation field, the first 20 eigenvalues and the curves of the confidence limits are presented in Fig. 1b. The cumulative variances for both temperature and precipitation appear in Fig. 1c. According to Rule N, only the first 3 components for temperature, and the first 9 components for precipitation, can be considered significant, while the other components represent noise. The first 3 EEOF modes for temperature account for 86% of the total variance, while for precipitation the first 9 modes explain 63%. Because the first EEOF mode reproduces the large-scale features, it has been used in teleconnections with NAO and ENSO (Mares et al. 2002). This first principal component of EEOF reproduces well the anomalies of observations averaged over all stations.

Figs. 2a and 2b show the time evolution of the first PC of EEOF for temperature (a) and precipitation (b), compared against 5 years of observations (1950-1954). Although the first EEOF mode for the precipitation field reproduces only 19% of the total variance, in comparison with 37% reproduced by EEOF1 for temperature, the monthly precipitation anomalies for all of Romania are reproduced very well. The correlation coefficients between the first principal components of EEOF and the initial time series, for 646 values, are equal to 0.796 for temperature and 0.758 for precipitation, with a high significance level due to the length of the series (Brooks and Carruthers 1953).

FIG. 1. The first 20 eigenvalues for temperature (a) and for precipitation (b), along with the 95% and 5% significance levels. The cumulative variance associated with the first 20 modes for temperature (solid line) and precipitation (dotted line) is presented in (c). [Panels (a) and (b): eigenvalues against eigenvalue index; panel (c): cumulative variances against eigenvalue index.]

FIG. 2. The amplitude time series of EEOF1 and the anomalies mean (AM) of the initial time series for 1950-1954: a) temperature and b) precipitation. [Both panels: amplitude against month.]

3. Summary and concluding remarks

According to Rule N, only the first 3 components for temperature, and the first 9 components for precipitation, can be considered significant, while the other components represent noise. The first 3 EEOF modes for temperature account for 86% of the total variance, while for precipitation the first 9 modes explain 63%.

In order to attempt a physical interpretation of the first three EEOF components for the temperature field, teleconnections were examined between large-scale atmospheric circulation indices, such as the North Atlantic Oscillation (NAO) or the blocking-type atmospheric circulation indices, and each of the time series of the EEOF components. Regarding the teleconnections with NAO, the best results, at the 1% significance level, were found for the first PC of EEOF for January and February, simultaneously and with lags of one month. The second component also shows some link, but a much weaker one than the first component, while for the third component the results are very weak. The teleconnections with the circulation indices gave results that differ according to the EEOF component.
The atmospheric circulation over the Atlantic region is best reflected in the behaviour of components 1 and 2, but mostly in component 1. The atmospheric circulation over the European region has a signal in components 2 and 3, but especially in the third component for the months of February and October. It follows that the first EEOF component reflects the atmospheric processes taking place at a large scale and at a certain distance, while the higher-numbered components reflect the atmospheric circulation at a smaller scale.

Several spectral peaks are evident in the temperature field as well as in precipitation. For temperature, the most significant peak has a period of 26 months. For precipitation, there are two significant peaks: 8 months and 23 months. The 26-month and 23-month periodicities might be associated with the Quasi-Biennial Oscillation (QBO) in this part of the European continent.

Acknowledgments

Part of this study was supported by the Ministry of Education and Research in Romania, under the Climosis Project (contract No. 405/20.09.2004). The second author is grateful for the WMO support for her participation in this Symposium.

References

Brooks, C. E. P., and N. Carruthers: Handbook of Statistical Methods in Meteorology. Her Majesty's Stationery Office, 412 pp, 1953.

Kim, K. Y., and Q. Wu: A comparison study of EOF techniques: Analysis of nonstationary data with periodic statistics. J. Climate, 12, 185-199, 1999.

Korres, G., N. Pinardi, and A. Lascaratos: The ocean response to low-frequency interannual atmospheric variability in the Mediterranean Sea. Part II: Empirical Orthogonal Functions analysis. J. Climate, 13, 732-745, 2000.

Lorenz, E. N.: Empirical orthogonal functions and statistical weather prediction. Statistical Forecasting Project Scientific Report No. 1, Department of Meteorology, MIT, Cambridge, Mass., 49 pp, 1956.

Mares, Ileana: Factor analysis of monthly temperature and precipitation and determination homogeneous zones over Romania territory. Meteorology and Hydrology, 18, 23-27, 1988.

Mares, C., Ileana Mares, and M. Mihailescu: Testing of NAO and ENSO signals in the precipitation field in Europe. Climatic Change: Implications for the Hydrological Cycle and for Water Management. Advances in Global Change Research, 10, M. Beniston, Ed., Kluwer Academic Publishers, 113-121, 2002.

Mares, C., and Ileana Mares: Improvement of Long-Range Forecasting by EEOF Extrapolation using an AR-MEM Model. Weather and Forecasting, 18, 311-324, ISSN 1520-0434, 2003.

Preisendorfer, R. W.: Principal component analysis in meteorology and oceanography. Developments in Atmospheric Science, 17, C. D. Mobley, Ed., Elsevier Science Publishers B.V., Amsterdam, 426 pp, 1988.

Richman, M. B.: Rotation of principal components. Int. J. Climatol., 6, 293-335, 1986.

Vautard, R., and M. Ghil: Singular spectrum analysis in nonlinear dynamics with application to paleoclimatic time series. Physica D, 35, 395-424, 1989.

Vautard, R.: Patterns in time: SSA and MSSA. Analysis of Climate Variability: Applications on Statistical Techniques, H. Von Storch and A. Navarra, Eds., Springer-Verlag, New York, 259-280, 1995.

Wang, R., K. Fraedrich, and S. Pawson: Phase-space characteristics of the tropical stratospheric quasi-biennial oscillation. J. Atmos. Sci., 52, 4482-4500, 1995.

Weare, B. C., and J. N. Nasstrom: Examples of extended empirical orthogonal function analyses. Mon. Wea. Rev., 110, 481-485, 1982.