Analysis by EEOF of the monthly temperature and precipitation time series
in Romania
C. Mares and Ileana Mares
National Meteorological Administration
Bucuresti-Ploiesti, 97, 013686, Bucharest, Romania
E-mail: constantin.mares@meteo.inmh.ro
Abstract
The meteorological time series analyzed in this paper include mean monthly temperature values from 29
stations, and precipitation amounts from 31 stations, distributed relatively uniformly across Romania. Both data
series cover the 1950-2003 period. The first problem that arises when applying the Extended Empirical
Orthogonal Functions (EEOF) decomposition to multivariate time series is constructing the covariance
matrix. With NS the number of stations and M the length of the time series, a concatenated matrix of
dimensions (m*NS, M - m+1) is built, where m = 3 months. From the concatenated matrix we formed the
symmetric covariance matrix, with different lags. To solve for the eigenvalues, the covariance matrix is
diagonalized so that the first eigenvalue explains the largest covariance fraction of the events computed at
moments t, t+1, t+2. In EEOF applications, a major problem is to determine how many components to retain.
This problem is extensively discussed in the literature, but the most attractive selection criterion is Rule N.
According to Rule N, only the first three components for temperature, and the first nine components for
precipitation, can be considered significant, while the other components represent noise. The first three EEOF
modes for temperature account for 86% of the total variance, while for precipitation the first nine modes
explain 63%.
1. Introduction
Data analysis using Empirical Orthogonal
Functions (EOF) has become a standard procedure
in meteorological and oceanographic studies. Lorenz
(1956) introduced the EOF analysis to earth
sciences.
In previous studies (Mares 1988), the EOF technique
with varimax rotation (Richman 1986) was used
to find homogeneous areas of temperature
and precipitation variability in Romania.
While many different types of EOF techniques are
available (Kim and Wu 1999), the extended EOF
(EEOF) of Weare and Nasstrom (1982) has been
chosen for the present study, because it is best
suited to forecasting.
The conventional EOF identifies the main patterns of
variability that are coherent in space. In EEOF,
patterns coherent in both space and time
are identified (Wang et al. 1995). In such analyses,
the field is studied at m successive moments; that is,
a moving window of length m is applied.
In a traditional EOF analysis, m = 1.
The way in which the window is selected depends on
the aim of the analysis.
Vautard and Ghil (1989) discussed the power of the
EEOF method for identifying physical oscillations, as
well as for analyzing, filtering and forecasting time
series.
As shown in Vautard (1995), singular spectrum
analysis (SSA) is a particular application of the
classical EOF. In classical EOF analysis, the state
vector contains information about points in space at
the same moment in time. When both space and time
vary, SSA is called M-SSA.
One disadvantage of the EEOF analysis is that it is
ineffective when the spatial coherence is small.
Another disadvantage is a large computational
memory requirement.
The combined statistical technique of decomposing
certain meteorological fields into EEOFs, followed
by temporal extrapolation using an AR-MEM model,
yielded useful results for surface temperature fields
in Romania (Mares and Mares 2003).
In the present study, Section 2 describes the EEOF
analysis of temperature and precipitation in Romania,
and a summary and concluding remarks follow in
Section 3.
2. EEOF analysis of temperature and
precipitation in Romania
The meteorological time series analyzed in this
paper include mean monthly temperature values
from 31 stations, and precipitation amounts from 33
stations, in Romania distributed relatively uniformly
across the country. Both data series cover the
1950-2003 period. The first problem that arises
when applying the EEOF decomposition to
multivariate time series is constructing the
covariance matrix. With NS the number of
stations and M the length of the time series, a
concatenated matrix of dimensions (m*NS, M - m+1)
is built, where m = 3 months. For these data, the
concatenated matrix has the following form:

\[
X =
\begin{pmatrix}
X_{1,1}  & X_{1,2}  & \cdots & X_{1,M-2}  \\
\vdots   & \vdots   &        & \vdots     \\
X_{NS,1} & X_{NS,2} & \cdots & X_{NS,M-2} \\
X_{1,2}  & X_{1,3}  & \cdots & X_{1,M-1}  \\
\vdots   & \vdots   &        & \vdots     \\
X_{NS,2} & X_{NS,3} & \cdots & X_{NS,M-1} \\
X_{1,3}  & X_{1,4}  & \cdots & X_{1,M}    \\
\vdots   & \vdots   &        & \vdots     \\
X_{NS,3} & X_{NS,4} & \cdots & X_{NS,M}
\end{pmatrix}
\]

where NS = 31 for temperatures and NS = 33 for
precipitation, while the number of columns is
M - m + 1 = 646 (M = 648 months for 1950-2003).
The data are standardized with respect to the
average and variance of each month.
The time sequences are subdivided as follows: a)
January 1950 - October 2003; b) February 1950 -
November 2003; c) March 1950 - December 2003.
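The construction described above can be sketched in a few lines of NumPy. This is only an illustration on synthetic data, not the authors' code: the function names `standardize_by_month` and `extended_matrix` are hypothetical, the input is assumed to be an (NS, M) array starting in January, and 648 months stands for 54 years of monthly values.

```python
import numpy as np

def standardize_by_month(data):
    """Standardize each station series separately for every calendar month.

    data: array of shape (NS, M), monthly values starting in January (assumed).
    """
    out = np.empty_like(data, dtype=float)
    for month in range(12):
        cols = slice(month, None, 12)  # all Januaries, all Februaries, ...
        mean = data[:, cols].mean(axis=1, keepdims=True)
        std = data[:, cols].std(axis=1, keepdims=True)
        out[:, cols] = (data[:, cols] - mean) / std
    return out

def extended_matrix(data, m=3):
    """Stack m lagged copies of data (NS, M) into shape (m*NS, M - m + 1)."""
    n_cols = data.shape[1] - m + 1
    return np.vstack([data[:, lag:lag + n_cols] for lag in range(m)])

# Synthetic stand-in for the station data: 31 stations, 648 months.
rng = np.random.default_rng(0)
raw = rng.normal(size=(31, 648))
X = extended_matrix(standardize_by_month(raw), m=3)
print(X.shape)  # (93, 646)
```

The three row blocks of `X` correspond to the sequences a), b) and c): each block is the same field shifted by one month.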
From the concatenated matrix we formed the
symmetric covariance matrix, with different lags,
$C_{XX} = (X X^{T}) / M$, whose elements are given by
the sums over all moments of the product between the
value of X at point i and the value at point j. Because
of the standardization, the covariance matrix is
identical to the correlation matrix. To solve for the
eigenvalues, the covariance/correlation matrix is
diagonalized so that the first eigenvalue explains the
largest covariance fraction of the events computed at
moments t, t+1, t+2. Weare and Nasstrom (1982)
were the first to apply this technique and, according
to their theory, the events need not be successive.
For instance, other successions such as t, t+2, t+4
can be used.
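As a rough illustration of the diagonalization step, the sketch below forms the covariance matrix of an extended data matrix and sorts its eigenpairs by decreasing eigenvalue. The function name `eeof_decompose` is hypothetical, the input is random noise rather than the Romanian data, and the normalization by the number of columns is an assumption made for the example.

```python
import numpy as np

def eeof_decompose(X):
    """Eigen-decompose C = X X^T / n for an extended matrix X of shape (m*NS, n)."""
    n = X.shape[1]
    C = X @ X.T / n                              # symmetric covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # reorder: largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = eigvecs.T @ X                          # principal component time series
    explained = eigvals / eigvals.sum()          # fraction of total variance per mode
    return eigvals, eigvecs, pcs, explained

# Synthetic extended matrix with the dimensions used for temperature (3*31, 646).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(93, 646))
eigvals, eigvecs, pcs, explained = eeof_decompose(X_demo)
print(np.cumsum(explained)[:3])  # cumulative variance of the first three modes
```

Each eigenvector has m*NS entries, so it can be reshaped into m spatial maps of NS stations, one per lag.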
In both EEOF and EOF applications, a major
problem is to determine how many components to
retain.
This problem is extensively discussed in the
literature, but the most attractive selection criterion is
Rule N, found in Preisendorfer (1988).
This method involves estimating a 90% confidence
interval, with the confidence limits at the probabilities
p1 = 0.95 and p2 = 0.05, for each EEOF.
As Preisendorfer (1988) shows, when the time
series are correlated, Rule N should not be applied
directly; instead, an effective sample size n* should
be determined and used in place of n. The effective
sample size is approximately given by
$n^{*} = n\,(1-\rho^{2})/(1+\rho^{2})$, where n = 646 and
ρ is the autoregressive (autocorrelation) coefficient.
The n* values are estimated for each ρ(x),
x = 1,…,p points, and the smallest of these is
retained instead of n. In order to obtain the
confidence interval, 100 independent random data
sets were synthesized, each representing the
elements of a (p x n*) matrix, where p is the number
of stations and n* is the number of independent
samples. In the present study, p = 31 for temperature
or 33 for precipitation.
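The effective sample size can be illustrated as follows. This is a sketch under assumptions: ρ is taken to be the lag-1 autocorrelation of each station series, which the text does not state explicitly, and the helper names `lag1_autocorr`, `n_star_from_rho` and `effective_sample_size` are invented for the example.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series (assumed meaning of rho)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def n_star_from_rho(n, rho):
    """Effective sample size n* = n (1 - rho^2) / (1 + rho^2)."""
    return n * (1.0 - rho**2) / (1.0 + rho**2)

def effective_sample_size(series):
    """Smallest n* over all rows (stations) of a (p, n) array."""
    series = np.atleast_2d(np.asarray(series, dtype=float))
    n = series.shape[1]
    rhos = np.array([lag1_autocorr(row) for row in series])
    return int(np.floor(n_star_from_rho(n, rhos).min()))

# Synthetic AR(1) series with coefficient 0.5 at 5 hypothetical stations.
rng = np.random.default_rng(2)
noise = rng.normal(size=(5, 646))
ar = np.zeros_like(noise)
ar[:, 0] = noise[:, 0]
for t in range(1, 646):
    ar[:, t] = 0.5 * ar[:, t - 1] + noise[:, t]

n_demo = effective_sample_size(ar)
print(n_demo)  # noticeably smaller than 646 because of the serial correlation
```

Taking the minimum over stations is the conservative choice described in the text: the most strongly autocorrelated series determines n*.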
First the eigenvalues are obtained. Then the 5th
and 95th percentiles are determined using the 5th and
the 95th realizations, taken from the
cumulative distribution of the jth random eigenvalue,
arranged in decreasing order. Fig. 1a shows the
amplitude of the first 20 eigenvalues for temperature
data in comparison with the 5th and 95th percentiles.
According to Preisendorfer (1988), the band
between the 5th and 95th percentiles narrows
approximately as $(n-1)^{-1/2}$ with increasing sample
size n. Korres et al. (2000) also obtained a very
narrow band between the 5th and 95th percentiles, and
showed that this is due to the very small ratio n/p in
their analysis.
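A minimal Monte Carlo sketch of the Rule N procedure is given below. It is not the authors' implementation: the function names `rule_n_bounds` and `n_significant` are hypothetical, the random data are Gaussian, and the observed and random eigenvalues are assumed to come from correlation matrices of the same size so that they can be compared directly.

```python
import numpy as np

def rule_n_bounds(p, n_star, n_sets=100, seed=0):
    """5th and 95th percentiles of each ranked eigenvalue of random (p, n*) data."""
    rng = np.random.default_rng(seed)
    eig = np.empty((n_sets, p))
    for k in range(n_sets):
        noise = rng.normal(size=(p, n_star))
        noise -= noise.mean(axis=1, keepdims=True)
        noise /= noise.std(axis=1, keepdims=True)
        corr = noise @ noise.T / n_star
        eig[k] = np.sort(np.linalg.eigvalsh(corr))[::-1]  # decreasing order
    return np.percentile(eig, 5, axis=0), np.percentile(eig, 95, axis=0)

def n_significant(eigvals, upper):
    """Count leading eigenvalues that exceed the 95th-percentile noise level."""
    count = 0
    for lam, bound in zip(eigvals, upper):
        if lam <= bound:
            break
        count += 1
    return count

# Synthetic "observations": 31 series sharing one strong common signal.
rng = np.random.default_rng(3)
signal = rng.normal(size=300)
data = rng.normal(size=(31, 300)) + 2.0 * signal
data -= data.mean(axis=1, keepdims=True)
data /= data.std(axis=1, keepdims=True)
obs = np.sort(np.linalg.eigvalsh(data @ data.T / 300))[::-1]

lower, upper = rule_n_bounds(p=31, n_star=300)
print(n_significant(obs, upper))
```

In this toy case only the mode carrying the shared signal stands above the noise band; the remaining eigenvalues fall below the random upper limit and would be treated as noise.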
For more insight into this confidence interval, an
inset in Fig. 1a shows the
confidence interval for the first 6 eigenvalues. For
the precipitation field, the first 20 eigenvalues and
curves of the confidence limits are presented in Fig.
1b. The cumulative variances for both the temperature
and precipitation cases appear in Fig. 1c. According
to Rule N, only the first 3 components for
temperature, and the first 9 components for
precipitation, can be considered significant, while the
other components represent noise.
The first 3 EEOF modes for temperature account
for 86% of the total variance, while for
precipitation the first 9 modes explain 63%. Because
the first EEOF mode reproduces the large-scale
features, it has been used in teleconnection studies with
NAO and ENSO (Mares et al. 2002). This first
principal component of EEOF reproduces well the
anomalies of observations averaged over all
stations.
Figs. 2a and 2b show the time evolution of the first PC
of EEOF for both temperature (a) and precipitation
(b), compared against 5 years of observations
(1950-1954). Although the first EEOF mode for the
precipitation field reproduces only 19% of the total
variance, in comparison with 37% reproduced by
EEOF1 for temperature, the monthly precipitation
anomalies for all of Romania are reproduced very well.
The correlation coefficients between the first
principal components of EEOF and the initial time series
for 646 values are equal to 0.796 for temperature
and 0.758 for precipitation, with a high significance
level due to the length of the series (Brooks and
Carruthers 1953).
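To give a feeling for why correlations of this size are highly significant with 646 values, the sketch below computes the usual Student t statistic for a correlation coefficient. The choice of this particular test is an assumption (the text only cites Brooks and Carruthers 1953), and the calculation ignores serial correlation, which would reduce the effective number of degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing a zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r**2)

# Correlation values and series length quoted in the text.
n_values = 646
t_temperature = correlation_t_statistic(0.796, n_values)
t_precipitation = correlation_t_statistic(0.758, n_values)
print(round(t_temperature, 1), round(t_precipitation, 1))
```

Both statistics are far larger than the values of about 2 to 3 that mark conventional significance thresholds for several hundred degrees of freedom.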
[Figure 1: panels (a) "Eigenvalues T" and (b) "Eigenvalues PP",
each plotted against eigenvalue index with the 95% and 5%
significance levels; panel (c) "Cumulative variances" for T and PP
against eigenvalue index.]
FIG. 1. The first 20 eigenvalues for temperature
(a) and for precipitation (b), along with the 95% and
5% significance levels. The cumulative variance
associated with the first 20 modes for temperature
(solid line) and precipitation (dotted line) is
presented in (c).
[Figure 2: panels (a) "EEOF1 T" and "AM T", and (b) "EEOF1 PP" and
"AM PP"; x-axis MONTH, y-axis AMPLITUDE.]
FIG. 2. The amplitude time series of EEOF1 and the
anomaly mean (AM) of the initial time series for
1950-1954: (a) temperature and (b) precipitation.
3. Summary and concluding remarks
According to Rule N, only the first 3 components for
temperature, and the first 9 components for
precipitation, can be considered significant, while the
other components represent noise.
The first 3 EEOF modes for temperature account
for 86% of the total variance, while for
precipitation the first 9 modes explain 63%.
In order to attempt a physical interpretation of the
first three EEOF components for the temperature
field, teleconnections were examined between
large-scale atmospheric circulation indices, such as the
North Atlantic Oscillation (NAO) or the blocking-type
atmospheric circulation indices, and each of the time
series of the EEOF components.
Regarding the teleconnections with NAO, the
best results, at the 1% significance level, were found
for the first PC of EEOF for January and February,
simultaneously and with lags of one month. The
second component also shows some link, but much
weaker than the first component, while for the third
component the results are very weak.
The teleconnections with the circulation
indices gave results that differ according to
the EEOF component. The atmospheric
circulation over the Atlantic region is best reflected in the
behaviour of components 1 and 2, but mostly in
component 1. The atmospheric circulation over the
European region has a signal in components 2 and
3, but especially in the third component for the
months of February and October. It follows that the
first EEOF component reflects atmospheric
processes taking place at a large scale and at a
certain distance, while the higher-numbered
components reflect the atmospheric circulation at a
smaller scale.
Several spectral peaks are evident in the temperature
field as well as in precipitation. For temperature, the most
significant peak has a period of 26 months. For
precipitation, there are two significant peaks: 8
months and 23 months.
The 26- and 23-month periodicities might be
associated with the Quasi-Biennial Oscillation
(QBO) in this part of the European continent.
Acknowledgments
Part of this study was supported by the Ministry of Education
and Research in Romania, under the Climosis Project
(contract No.405/20.09.2004).
The second author is grateful for the WMO support for her
participation in this Symposium.
References
Brooks, C. E. P., and N. Carruthers: Handbook of
Statistical Methods in Meteorology. Her Majesty's
Stationery Office, 412 pp, 1953.
Kim, K. Y., and Q. Wu: A comparison study of EOF
techniques: Analysis of nonstationary data with periodic
statistics. J. Climate, 12, 185-199, 1999.
Korres, G., N. Pinardi, and A. Lascaratos: The ocean
response to low-frequency interannual atmospheric
variability in the Mediterranean Sea. Part II: Empirical
Orthogonal Functions analysis. J. Climate, 13, 732-745,
2000.
Lorenz, E. N.: Empirical orthogonal functions and
statistical weather prediction. Statistical Forecasting
Project Scientific Report No. 1, Department of
Meteorology, MIT, Cambridge, Mass., 49 pp, 1956.
Mares, Ileana: Factor analysis of monthly temperature and
precipitation and determination homogeneous zones over
Romania territory. Meteorology and Hydrology, 18, 23-27,
1988.
Mares, C., Ileana Mares, and M. Mihailescu: Testing of
NAO and ENSO signals in the precipitation field in
Europe. Climatic Change: Implications for the Hydrological
Cycle and for Water Management. Advances in Global
Change Research, 10, M. Beniston, Ed., Kluwer Academic
Publishers, 113-121, 2002.
Mares, C., and Ileana Mares: Improvement of Long-Range
Forecasting by EEOF Extrapolation using an AR-MEM
Model. Weather and Forecasting, 18, 311-324, ISSN
1520-0434, 2003.
Preisendorfer, R. W.: Principal component analysis in
meteorology and oceanography. Developments in
Atmospheric Science, 17, C. D. Mobley, Ed., Elsevier
Science Publishers B.V., Amsterdam, 426 pp, 1988.
Richman, M. B.: Rotation of principal components. Int. J.
Climatol., 6, 293-335, 1986.
Vautard, R., and M. Ghil: Singular spectrum analysis in
nonlinear dynamics with application to paleoclimatic time
series. Physica D, 35, 395-424, 1989.
Vautard, R.: Patterns in time: SSA and MSSA.
Analysis of Climate Variability: Applications on Statistical
Techniques, H. Von Storch and A. Navarra, Eds.,
Springer-Verlag, New York, 259-280, 1995.
Wang, R., K. Fraedrich, and S. Pawson: Phase-space
characteristics of the tropical stratospheric quasi-biennial
oscillation. J. Atmos. Sci., 52, 4482-4500, 1995.
Weare, B. C., and J. N. Nasstrom: Examples of extended
empirical orthogonal function analyses. Mon. Wea.
Rev., 110, 481-485, 1982.