COST733CAT - a database of weather and circulation type classif

advertisement
1
COST733CAT - a database of weather and circulation type classifications
2
Andreas Philipp
3
University of Augsburg, Germany
4
a.philipp@geo.uni-augsburg.de
5
6
Judit Bartholy
7
Eotvos Lorand University, Budapest, Hungary
8
bari@ludens.elte.hu
9
10
Christoph Beck
11
University of Augsburg, Germany
12
c.beck@geo.uni-augsburg.de
13
14
Michel Erpicum
15
16
Pere Esteban
17
Institut d'Estudis Andorrans (IEA), Andorra
18
pesteban.cenma@iea.ad
19
20
Xavier Fettweis
21
22
Radan Huth
23
Institute of Atmospheric Physics, Prague, Czech Republic
24
huth@ufa.cas.cz
25
26
Paul James
27
28
Sylvie Jourdain
29
30
31
Frank Kreienkamp
32
Climate and Environment Consulting Potsdam GmbH, Germany
33
frank.kreienkamp@cec-potsdam.de
34
35
Thomas Krennert
36
37
Spyros Lykoudis
38
Institute of Environmental Research and Sustainable Development, National Observatory of
39
Athens, Greece
40
slykoud@meteo.noa.gr
41
Silas Michalides
42
43
44
Krystyna Pianko
45
Piia Post
46
47
Domingo Rassilla Álvarez
48
49
Arne Spekat
50
Climate and Environment Consulting Potsdam GmbH, Germany
51
arne.spekat@cec-potsdam.de
52
53
Filippos S. Tymvios
54
55
Correspondence to: Andreas Philipp, Universitaetsstrasse 10, 86135 Augsburg, Germany
56
57
Key words: weather type classification, circulation type classification, dataset, COST 733, Europe
58
Abstract
59
A new database of classification catalogs has been compiled during the COST Action 733
60
"Harmonisation and Applications of Weather Type Classifications for European regions" in order to
61
evaluate different methods for weather and circulation type classification. This paper gives a
62
technical description of the included methods and provides a basis for further studies dealing with
63
the dataset. Even though the list of included methods is by far not complete, it demonstrates the
64
huge variety of methods and their variations. In order to allow a systematic overview an attempt for
65
a new conceptional systemization for classification methods is presented reflecting the way types
66
are defined. Methods using predefined types include manual and threshold based classifications
67
while methods producing derived types include those based on eigenvector techniques, leader
68
algorithms and optimization algorithms. Concerning the input dataset and the method configuration
69
the classification catalogs included in the dataset have been unified to allow for comparisons of the
70
pure methods. The different methods are discussed with respect to their intention and
71
implementation as well as their systematization.
72
73
74
1.) Introduction
75
Classification of weather and atmospheric circulation states into distinct types is a widely used tool
76
for describing and analyzing weather and climate conditions. The general idea is to transfer
77
multivariate information given on the metrical scale in a input dataset, e.g. a time series of daily
78
pressure fields, to a univariate time series of type membership on the nominal scale, i.e. a so called
79
classification catalog. The advantage of such a substantial information compression is the
80
straightforward use of the catalogs. On the other hand the loss of information caused by the
81
classification process makes it sometimes difficult to relate the remaining information to other
82
climate elements like temperature or precipitation, which in many cases is the main objective for
83
applications of classifications. For other applications which are just interested in a compact
84
description of atmospheric dynamics the problem is the same since as much information as possible
85
should be recorded by a single nominal variable, the classification catalog. It may be a consequence
86
of this contrariness, that the number of different classification methods is huge and still increasing,
87
in the hope to find a classification method producing a simple catalog but reflecting the most
88
relevant parts of climate variability. However, the number of classification methods and their results
89
is a drawback as it is hard to decide which one to use for certain applications. This was the reason to
90
initiate COST action 733 entitled "Harmonisation and Applications of Weather Type Classifications
91
for European regions" that intents to compare different classification methods and to evaluate
92
whether there might be one or a few universally superior methods, which can be recommended.
93
Further it should be evaluated whether it is possible to combine favorable properties of existing
94
methods in order to develop a reference classification. The basis for this work has been established
95
by producing a database of classification catalogs (called cost733cat) under unified conditions
96
concerning the input dataset and the configuration of the classification procedure in order to make
97
the resulting catalogs as comparable as possible concerning the method alone. As a consequence in
98
some aspects the included catalogs might be suboptimal for some applications, e.g. all methods
99
have been applied to mean sea level pressure while the 850 or 500 hPa geopotential height might be
100
more suitable for some applications. However, the current version of the collection, which will be
101
made available for free at the end of COST Action 733 in 2010, can be applied for more than
102
intercomparison questions and future versions might include more suitable configurations.
103
In order to describe the included methods an attempt for a new systemization has been developed
104
within the COST Action and is used here in the following. For a long while classification methods
105
have been divided into two main groups, namely manual and automated classifications (e.g. Yarnal
106
1993). The first group has been called "manual" while the second group is called "automated". An
107
alternative discrimination refers to "subjective" versus "objective" methods, which is not quite the
108
same since automated methods which are often regarded as objective always include subjective
109
decisions. A third group of hybrid methods has been established (e.g. Sheridan 2002) accounting for
110
methods that define types subjectively but assign all observation patterns automatically (see Huth et
111
al. 2008 for a more detailed discussion). Another distinction could be made between circulation
112
type classifications (CTC) utilizing solely atmospheric circulation data like air pressure, and so
113
called weather type classifications (WTC) using additional information about other weather
114
elements like temperature, precipitation, etc. However new developments of WTCs and subjective
115
classifications which always include some information about the synoptic weather situation of a
116
circulation type are rare and actually only one automated method (called WLK) using other
117
parameters than pressure fields is included in cost733cat. The reasons are probably the high efforts
118
of manual classifications as well as the higher demand for CTCs in so called circulation-to-
119
environment applications (Yarnal 1993) relating circulation to other weather elements after the
120
classification process.
121
With the growing availability of computing capacities during the last decades, the number of
122
automated CTC methods increased considerably, since it is easy now to modify existing algorithms
123
and produce new classifications. Therefore it seems necessary to find a new systematization of
124
methods especially accounting for the increased diversity of automated methods and their
125
algorithms (see Huth et al. 2008 and Jolliffe & Philipp 2009 for further developments). Therefore
126
an attempt is made in the paper to distinguish between basic classification strategies which might
127
shed some more light on the functionality and effects of the different methods.
128
This paper is organized as follows: A review of the classification methods included into the catalog
129
database is given in section 2 structured by a methodological systematization which is explained for
130
each method group respectively. Section 3 describes the dataset compilation concerning the input
131
dataset and the unified method configurations. Finally in section 4 the systematization of methods
132
is discussed and concluding remarks point out the main methodological differences and commons.
133
134
2.) Classification methods and systematization
135
In order to get an overview of commonly used methods an initial inventory of classification
136
schemes, developed by different authors has been produced by a questionnaire in the year 2006
137
(Huth et al. 2008). By removing redundant methods and adding some classical methods (e.g. Lund
138
1963) a broad spectrum of different strategies for classification became apparent. In order to
139
systematize these different approaches concerning methodological commonalities, the kind of type
140
definition can be utilized.
141
Two main strategies can be discerned concerning the relation between type definition and
142
assignment of objects to them. The first strategy is to establish a set of types in prior to the process
143
of assignment (called "predefined types" hereafter), while the second way is to arrange the entities
144
to be classified (daily patterns in this case) following a certain algorithm such that the types are,
145
together with the assignment, the result of the process (called "derived types"). The use of
146
predefined types corresponds to a deductive top down approach where it is already known how the
147
relevant types look like, while the generation of derived types corresponds to the inductive bottom
148
up approach where more or less no knowledge about the structure of the dataset or effects of certain
149
types is assumed but should be derived by data mining. This fundamental difference will be
150
specified in the following.
151
152
I. Methods using predefined types
153
Methods using predefined types include those with subjectively chosen weather situations and those
154
where the allocation of days to one type depends on thresholds or rules. They have in common a
155
presumed concept of the relation between circulation and surface climate variables like temperature
156
and precipitation even though it is rarely formulated explicitly. Especially for European surface
157
climate it is for example important whether the large scale flow is organized zonally or
158
meridionally. Therefore predefined types are preferentially defined to clearly discern between these
159
two configurations, while this is not necessarily the case for derived types. The difference between
160
subjectively defined types and their definition by thresholds is just the formulation of explicit rules
161
for the latter.
162
163
I.1. Subjective definition of types
164
Subjective classifications are based on the expert experience about the effect of certain circulation
165
patterns on various surface climate parameters, i.e. they try to discern between typical synoptic
166
situations. A main problem for this approach (as well as for the subject of weather and circulation
167
type classification as a whole) is the diffuse meaning of typical. To define typical in the meaning of
168
more often than other situations does not solve the problem, because there is no obvious way how
169
to separate other situations from it, since there are smooth gradual transitions from one situation to
170
another. However, typical situations may be further concretized by including (not always in an
171
explicit way) the effects of circulation on associated surface climate variables. Thus a typical
172
westerly pattern for central Europe might be defined for prevailing westerly winds combined with a
173
high probability for stratiform precipitation and warm temperatures in winter. It might be this
174
integration of effects increasing the spread of possibilities for different situations which results in
175
the characteristic high number of types of subjective classifications, ranging between 29 for Hess-
176
Brezowsky (1952) and 43 for the ZAMG-classification (Lauscher 1985) inlcuding originally over
177
80 classes (see below). The only exception so far is the classification by Peczely (1957) with 13
178
types. There are several drawbacks of subjective classifications. One is that they are not transferable
179
and scalable to other regions. Another one is the high likelihood for inhomogeneities caused e.g. by
180
changing classifiers which at least in case of the Hess-Brezowsky-classification has to be
181
considered (Cahynová and Huth 2009). However in order to compare non-subjective classifications
182
concerning their information content with the existing subjective expert classifications some of
183
them are included in the dataset.
184
185
HBGWL/HBGWT - Hess Brezowsky Grosswetterlagen/-typen
186
One of the most famous catalogs focusing on central Europe is the one founded by Baur (1948) and
187
revised and further developed by Hess and Brezowski (1952) and more recently by Gerstengarbe
188
and Werner (1999).The concept of type definition strongly follows the flow direction of air masses
189
onto central Europe, discerning zonal, mixed and meridional types which are further discriminated
190
into 10 "Großwettertypes" and on the last hierarchical level into 29 "Großwetterlagen" and one
191
undefined or transitional type. The subjective definition of types and assignment of daily patterns
192
includes the knowledge of the authors about importance of the specific circulation patterns for
193
temperature and precipitation conditions in central Europe.
194
195
OGWL - Objective Grosswetterlagen
196
An objectivized version of the "Hess and Brezowsky Großwetterlagen" has been produced by
197
James (2007) using only circulation composites (mean MSLP and 500 hPa GPH) of the 29 original
198
types and newly assigning the daily circulation patterns to them. This has been done for winter and
199
summer separately and by applying time filters in order to achieve a minimum persistence of 3
200
days. In the cost733cat dataset an unfiltered version based solely on MSLP is included.
201
202
PECZELY - Hungarian weather types
203
György Péczely, a Hungarian climatologist and professor of the Szeged University, Hungary (1924-
204
1984), originally published his macrocirculation classification system in 1957. The system was
205
defined on the base of the geographical location of cyclones and anticyclones over the Carpathian
206
basin, however also the the position of cold and warm fronts are considered. All together 13 types
207
were composed which are pooled into 3 superordinated groups of meridional-northern, zonal-
208
western and zonal-eastern types (Péczely, 1957, 1961, 1983). After the death of Gyorgy Peczely,
209
one of his followers, Csaba Karossy continued the coding process (Karossy, 1994, 1997).
210
211
PERRET - Alpine weather statistics
212
The Swiss weather type classification after Perret (1987) is part of the Alpine Weather Statistics a
213
comprehensive characterization of the regional synoptic situation in a circle with radius of 2°
214
latitude (ca. 222km) centered at Switzerland. Here it is used as an additional information among 33
215
parameters, including others like the threshold based Schüepp classification (below). The Perret
216
classification shows some concordance to the Hess/Brezowsky classification described above.
217
However, the main idea behind it differs in so far that not the main flow direction determines the
218
superordinated groups but the intensity and cyclonicity of the circulation within the target area.
219
Thus the main distinction is made between types dominated by upper level flow, types dominated
220
by upper level highs and those dominated by upper level lows. On a second hierarchy level 5 flow
221
directions are discriminated for the first supergroup which are further characterized by being
222
cyclonic or anticyclonic which leads to a total of 12 types. On the other hand the upper level high
223
and low groups are specified by the position of the pressure centers on a second and third detail
224
level, leading to 9 and 10 types and a total of 31 types.
225
226
ZAMG - Central Agency for Meteorology and Geodynamics Eastern Alpine weather types
227
Since the 1950s weather types have been identified on a daily basis at the Central institute of
228
Meteorology in Vienna / Austria. Weather types according to Baur (1948) and Lauscher (1985), air
229
masses and passages of surface fronts over the eastern alpine edge have been logged. Individual
230
knowledge and education of the forecasters leads to a mixture of applications of methods during the
231
following years, the classification types from Lauscher and Hess-Brezowski (1952) are mostly used
232
within the catalogue. Also 8 circulation types (cyclonic or anticyclonic), and weak gradient
233
situations result in additional 17 classes. Altogether more than 80 different classes had been found
234
on handwritten sheets during the process of digitalization and had been reduced to 43 classes.
235
However, resembling class types are still present, e.g. a low over western parts of Europe and a
236
southwesterly anticyclonic flow at the eastern Alpine edge, both types being causally related.
237
Consequently, additional work is needed for further homogenization of the catalog. Besides the
238
weather type classification the ZAMG catalog also includes the detection of 6 cold and 6 warm
239
different air masses at an upper level (< 850 hPa) and a surface level (> 850 hPa). Their possible
240
changes during the day indicates advection without fronts. Further, fronts and frontal passages over
241
the town of Vienna are logged (11 different surface and upper fronts). This manual classification is
242
still done regularly by the night shift forecaster in Vienna around 2200 UTC.
243
244
I.2. Threshold based methods
245
Compared to subjective classifications threshold based methods define their types indirectly by
246
declaration of a borderline between different types in form of thresholds. Alternatively the
247
distinction between types can be realized by predefined rules for assignment, which is essentially
248
the same. For example a distinction can be made between days with a westerly main flow direction
249
over the domain and days with northerly, easterly or southerly direction where the angle between
250
the sectors represents a threshold or borderline between the types. In contrast to the subjective
251
methods the use of thresholds or explicit rules allows for automated classification. However, the
252
term objective which is sometimes used to point out the difference to subjective classifications is
253
debatable, since the predetermination of thresholds and rules also includes subjective decisions.
254
However their advantage is the reproducibility and of course their computer based fast processing.
255
256
GWT - Grosswetter-types or Prototype classification
257
Ten main circulation types are determined according to the so-called Großwetter-types of HBGWL
258
(see above). The basic idea is to characterize them in terms of varying degrees of zonality,
259
meridionality and vorticity of the large-scale MSLP field. Coefficients of zonality (Z), meridionality
260
(M), and vorticity (V) for each case (day) are determined as spatial correlation coefficients between
261
the respective MSLP field and three prototypical SLP patterns representing idealized W-E, S-N,
262
and central low-pressure isobars over the region of interest. The ten main circulation types are then
263
defined by means of particular combinations of these three coefficients. Assignment to central high
264
and low pressure types results from a maximum V coefficient (negative and positive, respectively).
265
The eight directional types are defined in terms of the Z and M coefficients (e.g. Z = 1 and M= 0
266
for the W-E pattern, Z = 0.7 and M= 0.7 for the SW-NE pattern, and so on, and remaining cases are
267
assigned to one of these types according to the minimum Euclidean distance of their respective Z
268
and M coefficients from those of the predefined types. A subdivision of the directional circulation
269
types into cyclonic and anticyclonic subtypes according to the sign of the V coefficient leads to 18
270
types and an even finer partitioning into subtypes of negative, indifferent and positive V coefficient
271
results in 27 circulation types.
272
273
LITADVE/LITTC - Litynski advection and circulation types
274
The classification scheme is based on three indices, calculated from gridded MSLP data, for
275
estimating the advection of air masses and the cyclonicity characteristic. Meridional (Wp) and zonal
276
(Ws) indices are calculated as spatially averaged components of the geostrophical wind vector
277
while cyclonicity (Cp) is estimated as the MSLP value over the central grid point of the domain.
278
Threshold values for these three indices are defined for each day of the year utilizing the respective
279
long-term means and standard deviations resulting in three categories for each index (negative,
280
indifferent, positive). Circulation types are defined as distinct combinations of index categories for
281
Wp and Ws, leading to nine types, characterized by the direction of advection (e.g. Wp=negative
282
and Ws=positive for type NW). Including Cp results in a further subdivision of directional types
283
according to their cyclonicity characteristics in 27 circulation types (e.g. Wp=positive,
284
Ws=indifferent and Cp=negative for type SC). Each case is assigned to one circulation type
285
according to its categorized index values.
286
287
LWT2 - Lamb weather types version 2
288
LWT2 is a modified version of the objective Jenkinson and Collison (1977) system for classifying
289
daily MSLP fields according to the subjectively derived Lamb-Weather types (Lamb 1950). Daily
290
MSLP fields are classified into 26 circulation types, indicating flow direction and vorticity. Based
291
on a selection of grid-points, direction and intensity of flow and vorticity are computed for each
292
daily MSLP field. The appropriate circulation type is determined according to vorticity-flow ratio
293
thresholds resulting in 8 pure directional types (e.g. W=West), 2 pure anticyclonic / cyclonic types
294
(A,C) and 16 hybrid types (e.g. ANE=Nort -East, partly anticyclonic). In contrast to Jenkinson and
295
Collison (1977) no unclassified days are allowed in LWT2 (James 2006) and the vorticity and flow
296
strength thresholds are adjusted so that exactly 33% of days fall into each of the three vorticity
297
classes anticyclonic, indeterminate and cyclonic.
298
299
WLK – Objektive Wetterlagenklassifikation
300
This method is based on the WLK weather type classification by Dittmann et al. 1995 and Bissolli
301
and Dittmann 2003, originally including 40 different types. The alphanumeric output consists of
302
five letters: the first two letters denote the main wind sector number counting clockwise: sector
303
01=NE, 02=SE, 03=SW, 04=NW, 00=undefined (varying directions). As input parameter the true
304
wind direction (u-, v- components) at 700 hPa is used. Within the domain different weights can be
305
defined. 2/3 of all weighted wind directions are connected to a sector of 10°, further these sectors
306
are assigned to a quadrant (i.e. „SW“ = [190°, 280°]). The third and fourth letter denote
307
>A<nticyclonicity or >C<yclonicity at 925 hPa and at 500, respectively. For each grid point a
308
weighted area mean value of the quasi geostrophic Vorticity is calculated. The fifth letter denotes
309
>D<ry or >W<et conditions, according to an weighted area mean value of precipitable water (whole
310
atmospheric column) which is compared to the long-term daily mean. Further input parameters are
311
the geopotential height, the temperature and the relative humidity at 955, 850, 700, 500 and 300
312
hPa. The consecutive method WLKC733 included in cost733cat.1.2 provides a few simplifications:
313
circulation patterns are derived from a simple majority (threshold) of the weighted wind field
314
vectors at 700 hPa. Further on, up to nine circulation patterns and one undefined circulation class is
315
computed, though the number is flexible to the requirements of the application. I.e. 9 main wind
316
sectors are defined counting clockwise (WLKC09): sector 01=345-15°, 02=15-75°, 03=75-105°,
317
04=105-165°, 05=165-195°, 06=195-255°, 07=255-285°, 08=285-345°, 00=undefined (varying
318
directions); 6 main sectors are used in WLKC28: 01=330-30°, 02=30-90°, 03=90-150°, 04=150-
319
210°, 05=210-270°, 06=270-330°, 00=undefined (varying directions). The last deviation from the
320
original scheme is the replacement of the integrated precipitable water content by the DMO
321
parameter of towering water content. Therefore the number of input parameter fields is reduced.
322
323
SCHÜEPP – Alpine weather statistics
324
Although initially developed as a manual classification scheme (Schüepp 1957, 1968) - explicitly
325
focusing on the western Alpine region and Switzerland - this classification is assigned to the
326
threshold based methods, as it utilizes distinct numerically derived thresholds. The classification of
327
days into 40 classes is based on MSLP and 500 hPa data for a spatial domain of 2° radius centered
328
at 46.5°N, 9°E. Circulation types are defined by thresholds with regard to the following variables:
329
surface pressure gradient and wind direction, 500 hPa wind direction and intensity, 500 hPa GPH,
330
vertical wind shear and baroclinicity. The resulting 40 classes are grouped into 3 generic main
331
groups. Advective / convective types, characterized by pressure gradients above / below a certain
332
threshold and mixed types showing intermediate characteristics concerning horizontal and vertical
333
dynamics. The Schüepp classification is only available for the original spatial domain.
334
335
II. Methods producing derived types
336
In contrast to methods utilizing predefined types the following methods are based on the idea to find
337
types which are indicated by any structure in the dataset itself. In particular three superordinate
338
strategies may be discerned. The first group utilizes principal component analysis (PCA) to
339
determine principal components (PCs) explaining major fractions of the variance of the input data.
340
The patterns to be classified are assigned to classes according to some measure of relation to the
341
principal components. The second strategy is to find leading patterns according to their number of
342
similar patterns within a certain distance around them while the third strategy is the combinatorial
343
approach to optimize a partition according to a function, commonly the minimization of within-type
344
variability.
345
346
II.1. PCA based methods
347
The potential of principal component analysis (PCA) to be used as a classification tool was
348
suggested by Richman (1981) and more deeply discussed and elaborated by Gong and Richman
349
(1995). The basic idea of using PCA as a classification tool consists in assigning each case to a
350
principal component according to some rule. However there are different modes for PCA differing
351
fundamentally from each other. In the most often used s-mode the results are score time series
352
representing the most important types of data variability in time, while the loadings indicate where
353
and how much these time series are realized. Things are reversed in t-mode, where the scores
354
describe important patterns and the loadings reflect the amount of their time variant realization.
355
Thus the t-mode seems more appropriate, however also the s-mode might be utilized.
356
357
TPCA - principal component analysis in t-mode
358
To classify circulation patterns by PC loadings, PCA has to be used in a T-mode with oblique
359
rotation (e.g., Richman 1986, Huth 1993, and Compagnucci and Richman 2008). Here we apply
360
TPCA in a setting similar to Huth (2000): PCA is first conducted on ten subsets of data, the first
361
subset being defined by selecting the 1st, 11th, 21st, etc. day; the second subset by selecting the
362
2nd, 12th, 22nd, etc.day, and so on. The solutions are projected onto all data by solving matrix
363
equation ΦAT = FTZ (here F and Φ are matrices of PC scores and PC correlations, respectively, Z
364
is the full data matrix, and A are pseudo-loadings to be determined). Each day is classified with that
365
PC (type) for which it has the highest loading. Contingency tables are finally used to compare the
366
ten classifications and, based on the tables, the classification most consistent with the other nine
367
classifications is selected as the resultant one.
368
369
P27 - Kruzinga empirical orthogonal function types
370
The P27 classification scheme developed at the Royal Netherlands Meteorological Institute
371
(Kruizinga 1979; Buishand and Brandsma 1997) utilizes the s-mode variant of PCA. Originally
372
daily 500 hPa heights were transformed to patterns ptq of reduced seasonal variability by subtracting
373
the long-term daily average from each gridpoint.
374
The pattern of each day t is approximated as: pt ≈ s1t a 1+s2t a 2 +s3t a 3 ,
375
a 2 and a3 are the first three principal component vectors and s1t to s3t are the respective scores.
376
Instead of using the correlation or covariance matrix, the eigenvectors are calculated from the
377
unadjusted (neither with regard to mean nor variance) product matrix. The amplitudes of the scores
t=1,. . .. . ,N, where a1 ,
378
of the first three components, representing zonality, meridionality and cyclonality respectively, are
379
subdivided into n1, n2 and n3 equiprobable intervals. Finally each object (day) is assigned to one of
380
the n1 x n2 x n3 possible interval combinations. The original classification has 3 equiprobable
381
intervals for all components, resulting in 27 classes. However, varying numbers of classes can be
382
achieved by defining different numbers of amplitude intervals (e.g. 3 x 3 x 2 = 18 classes). The
383
interpretation of the principal components to be connected to the zonal and meridional components
384
of the flow and to cyclonality does not hold always depending on the location and the scale of the
385
used grid.
386
387
PCAXTR - principal component analysis extreme scores
388
For this classification approach Varimax rotated s-mode PCA, based on the correlation matrix, is
389
used to build initial centroids based on the score time-series. The centroids are calculated averaging
390
the days that fulfil the principle of the “extreme scores” (Esteban et al; 2005, 2006, 2009): patterns
391
with high score values for a certain component (normally values higher than +2 for the positive
392
phase, or lower than -2 for the negative phase), but with low score values for the remainder
393
(normally between +1 and -1) were selected. According to this, twice as many classes as retained
394
PCs result from this method. However, PCs for which none real case exhibits an "extreme score"
395
are considered as an artificial result of the PCA and the according class is eliminated. All days are
396
finally assigned to the remaining centroids according to their minimum Euclidean distance.
397
398
II.2. Leader algorithms
399
Methods based on the so called leader algorithm (Hartigan 1975) have been established when
400
computing capacities have been available but were still low. They try to find key (or leader)
401
patterns in the sample of maps, which are located in the center of high density clouds of entities
402
within the multidimensional phase space spawned by the variables, i.e. grid point values. Thus the
403
aim is close to that of non-hierarchical cluster analysis but the expensive iterations of the latter are
404
avoided.
405
406
LUND - classical leader algorithm
407
Lund (1963) used a simple linear correlation method to identify frequently appearing, well
408
separated sea-level pressure patterns over the northeastern USA. To do so Pearson correlation
409
coefficients between all days are calculated and for each day all coefficients of r>0.7 to any other
410
pattern are counted. The first key pattern (leader) is defined as the day with the largest number of
411
correlation coefficients r >0.7. This day and all days with a correlation coefficients r >0.7 to that
412
key day are then removed from the dataset. On the remainder data set the search for the second and
413
following leaders is done in the same manner, until all types (of the predefined number of types)
414
have a key day. Finally all days are just assigned to the key day according to the highest correlation
415
coefficient, regardless to which key day they had been assigned initially.
416
417
ESLP/EZ850 - Erpicum and Fettweis
418
This classification algorithm is similar to Lund, however, differing by the calculation and use of the
419
similarity measure (Fettweis et al. 2009). Another difference is the skip of the final step. Thus the
420
final classification of the days is based on their first assignment to a key day. However to avoid
421
large dissimilar class sizes a novel scheme varying the threshold is introduced as explained below.
422
To give the same weight to all the grid points of the input maps for the similarity index
423
computation, the data are first normalized over the temporal dimension in a preprocessing step.
424
Then, for each pair of days the similarity index i(day1,day2) = 1 - 1/2 * SUM |(Z(day1)-Z(day2)|, is
425
calculated, where Z(day) is the vector of pressure values for one day. This index equals to the
426
maximum of 1 if the Z surface is compared with itself and is negative if both Z surfaces are very
427
different. Departing from Lund (1963) the distance threshold for determination of key patterns ic is
428
progressively decremented, starting from one and being decremented by a factor epsilon for each
429
type, i.e. ic(k) = 1 - epsilon * k; k=1,n, where k is the number of the class and n is the maximum
430
number of types. The whole classification scheme (see above) is repeatedly applied for different
431
values of epsilon in order to minimize the skill score (percentage of explained variance) defined by
432
Buishand et al. (1997). For each run epsilon is increased by epsilon=0.05*(r-1); r=1,m where r is
433
number of the run and m is the maximum number of runs. Thus compared to the Lund (1963)
434
classification this procedure is considerably more time demanding, however includes a mechanism
435
for reducing within-type variability.
436
437
KH - Kirchhofer types
438
The method initially introduced by Kirchhofer (1974) was intended to classify 500hPa geopotential
439
height patterns over Europe, using as criterion the squared distances between normalized grid-point
440
values. Two patterns were allowed to be classified as similar only if this so called Kirchhofer score
441
was high enough, and if the Kirchhofer scores for a set of user-defined sub-sections of the two
442
patterns were also acceptable. The method implemented here is a variation of the original based on
443
Yarnal (1984) , who suggested that the columns and rows of the grid could serve as sub-domains,
444
and Blair (1998), who replaced the squared distances with, equivalent, linear correlation
445
coefficients between map patterns. Since this distance measure is considerably lower than the usual
446
pattern correlation coefficient, the threshold for finding key patterns is set to 0.4. Apart from that
447
the procedure follows exactly the Lund (1963) classification scheme (see above).
448
449
II.3. Optimization algorithms
450
Optimization methods are combinatorial approaches to arrange the whole set of objects (days)
451
within groups (or clusters) in such a way that a certain function is optimized. This function is the
452
minimization of the within-type variability measured as the overall sum of the Euclidean distances
453
between the member objects of a type and the average of the type (centroid). Most of the included
454
optimization methods are based on the k-means clustering algorithm (e.g. Hartigan 1975). In order
455
to avoid redundancy its principal is described here. K-means starts with an initial starting partition
456
of the objects (daily pressure maps) and evaluates one object after the other whether it is in the most
457
similar cluster and if not shifts it to a nearer one in terms of the Euclidean distance between the
458
object and the centroid of the cluster. By doing so, the affected centroids in turn have to be
459
recalculated which in turn is changing the situation for following checks and makes it necessary to
460
repeatedly iterate through the list of objects and check them again. At some point of this process all
461
objects are in their nearest cluster and no shift is necessary and possible anymore, i.e. convergence
462
to an optimum has been reached. Most of the following optimization methods only differ
463
concerning the starting partition or data preprocessing for k-means. Only SANDRA and NNW
464
method use alternative ways for optimization. The PETISCO classification does not really optimize
465
the whole partition in the end but tries to find iteratively optimal leading centroids in an
466
intermediate step which are finally combined. Thus this method utilizes a hybrid between the leader
467
algorithm and a kind of cluster-analysis and might be also assigned to the method group of leader
468
algorithms or be an own group by itself. However conveniently it is listed here beside the actual
469
optimization methods. Likewise NNW does not converge to an optimum since its iterations have
470
been limited to 2000 here.
471
472
CKMEANS - k-means by dissimilar seeds
473
For this classification method the k-means algorithm is initiated using a starting partition based on
474
dissimilar pressure fields of the dataset to be classified following Enke and Spekat (1997). The
475
initialization takes place by randomly selecting one object (pressure map). The seed for the second
476
cluster is then determined as the object most different to the first, while the seed for the third cluster
477
is the object with the highest sum of the distances to the first two seed-patterns, and so on, until
478
every cluster has one seed-pattern. In a stepwise procedure the starting partition, initially
479
consisting of the k seed-patterns is gradually identified. All remaining days are then assigned to
480
their most similar class. With each day entering a class, the centroid positions are re-computed. As a
481
consequence the multi dimensional distance between class centroids continually decreases while the
482
variability within the individual classes of the starting partition, on the other hand, increases. After
483
the assignment of all days has been performed the iterative k-means clustering process is launched
484
(see above). The centroids converge towards a final configuration which has no similarity with the
485
starting partition. In order to retain representativity, classes are kept in the process only if they do
486
not fall short of a certain number, e.g., 5% of all days. Otherwise the class will be removed and its
487
contents distributed among the remaining classes.
488
489
PCACA - k-means by seeds from hierarchical cluster analysis of principal components
490
This method follows some recommendations proposed by Yarnal (1993). In a preprocessing step, a
491
high-pass filter using the 13-day running mean is applied on the input data in order to remove the
492
seasonal cycle. Afterwards a s-mode PCA (Varimax rotated using the correlation matrix) is applied
493
on the filtered data to reduce the collinearity, simplifying the numerical calculations and improving
494
the performance of subsequent clustering procedure. The daily PC-score time series of the retained
495
PCs (mostly 3 according to a scree test) were the input of the clustering procedure. In order to
496
obtain the starting partition for the k-means procedure (see above) the hierarchical cluster algorithm
497
by Ward (1963) is applied.
498
499
PETISCO - leader algorithm with optimized key patterns
500
This method resembles some aspects of the leader algorithm, however includes an optimization
501
procedure for the classification seeds (Petisco 2005). Like for LUND (see above) key patterns are
502
determined among all days but here with a threshold of 0.9 for the pattern correlation r. In case of
503
using more than one level (Originally, both, the MSLP and the 500 hPa GPH field have been used)
504
r is the minimum of the correlation coefficients calculated separately for each level. In contrast to
505
the leader algorithm the key pattern is calculated as the centroid of the key day and all days within
506
r>0.9, the latter called key group in the following. Iteratively the calculation for the key group
507
centroid and the search for new members of the key group is repeated, until optimized key patterns
508
exist, i.e. they do not change anymore. From these key groups the largest are selected as final types
509
and all remaining days are assigned to them according to their minimum correlation coefficient.
510
511
PCAXTRKMS - k-means using PCA derived seeds
512
For this k-means variant the starting partition is determined by the PCAXTR method described
513
above. Consequently this is the only optimization method with some restrictions for the number of
514
types.
515
SANDRA - simulated annealing and diversified randomization clustering
516
As it is discussed in literature the k-means clustering algorithm is by design an unstable method (see
517
e.g. Michelangeli et al. 1995, Philipp et al. 2007, Fereday et al. 2008), since it has no strategy to
518
avoid local optima of the optimization function. In contrast, the heuristic simulated annealing
519
algorithm is able to approximate the global optimum. The only difference to k-means are so called
520
wrong reassignments, i.e. objects may be removed from their nearest cluster, depending on a
521
probability P which is high at the beginning but slowly decreases during the optimization process.
522
Thus, if the process was trapped in a local optimum of poor quality some objects can be shifted
523
again which might lead to an overall improvement for the following steps. In order to slowly reduce
524
P a control parameter T which is a huge number at the beginning, is stepwise reduced by a cooling
525
factor C (e.g. C=0.999) from iteration to iteration by calculating T=T*C. P is then calculated as P =
526
exp( (Dcurrent - Dnew)/T ) , where Dcurrent is the Euclidean distance of the object to its current cluster
527
and Dnew the distance to a potentially new cluster. If a random number r (0<r<1) is less than P, the
528
wrong reassignment takes effect. At the end when T is a tiny number and hopefully all poor-quality
529
local optima have been bypassed, only improving reassignments are in effect (like in k-means) and
530
convergence is reached. In order to speed up the run time SANDRA uses a relatively low cooling
531
factor (C=0.999) leading to a quick convergence but repeats the whole process 1000 times with
532
randomly diversified starting partitions and a randomized scheme for object and cluster ordering
533
leading to a diversified chronology for the tests and thus different ways to approach to the global
534
optimum. From these 1000 results the best is finally selected.
535
536
SANDRAS - classification of sequences of days with SANDRA
537
This method differs from SANDRA only by using modified input data. Instead of single daily
538
patterns, three day sequences are used, thus the history of the development of the last day in this
539
sequence is included for type definition, resulting in types of successions of maps (Philipp 2008).
540
This approach might be applied in principle to all classification methods, but has been included in
541
the dataset only for the SANDRA scheme in order to evaluate its effects in general.
542
543
NNW - neural network self organizing maps (Kohonen network)
544
The Neural Network architecture proposed for the classification of weather patterns (see e.g.
545
Michaelides et al. 2007, Tymvios et al., 2007, Tymvios et al., 2008a, Tymvios et al., 2008b) is the
546
Kohonen’s (1990) SOFM (Self-Organising Features Map). The SOFM network has the ability to
547
learn without being shown correct outputs in the sample patterns and is able to separate data into a
548
specified number of categories with only two layers: an input layer and an output layer, the latter
549
consisting of one neuron for each possible output category. The objective is to discover significant
550
features or regularities in the input data Χ(k,j); k=1, 2, … , p; j=1, 2, ... M; where k is a day and p the
551
total number of days and j is a gridpoint and M the total number of gridpoints. The input vector
552
X(p), representing a pressure pattern, is connected with each output neuron through weights w(j),
553
j=1,2,... M, which are randomly chosen at the beginning. The output neuron whose weight vector is
554
most similar to the input vector X is the so called winner neuron. The weight vector of the winner
555
neuron, as well as those of its neighborhood neurons, are updated to become more similar to the
556
input pattern following the learning rule:
557
558
where the neighborhood set N(I,R) of neuron I of radius R consists of the neurons 1,I±1,…,I±R,
559
assuming these neurons exist, with the maximum adjustment being around the winner I and
560
decreasing for more distant neurons in terms of the distance within the (hexagonal) network
561
topology. The coefficient a in the above relationship is called the learning factor and decreases to
562
zero as the learning progresses. Also the radius R of the neighborhood around the winner unit is
563
relatively large to start with, to include all neurons. As the learning process continues, the
564
neighborhood is consecutively shrunk down to the point where only the winner unit is updated
565
(Kohonen, 2001). When all vectors in the training set were presented once at the input (one epoch),
566
the whole procedure was repeated 2000 times for the cost733cat dataset version 1.2. In the end, this
567
algorithm organizes the weights of the two-dimensional map, such that topologically close nodes
568
become sensitive to input that is physically similar. These networks are extremely demanding as far
569
as computing power and memory is concerned.
570
571
3.) Method configuration and dataset compilation
572
The different methods reviewed above represent a large bandwidth of strategies to distinguish
573
synoptic situations into classes. Beside the subjective classifications which are included as
574
supplementation, all are automated and allow for recalculation which is a basic criterion for later
575
applications since they should be applicable to different regions, datasets, periods etc. However, in
576
order to allow successive comparisons studies of the pure effects caused by different classification
577
methods, it is necessary to apply the methods in a standardized way on a standardized input dataset.
578
This should exclude differences in the resulting classification catalogs caused by other conditions
579
than by the classification method alone. Therefore unified test conditions have been defined and all
580
automated methods have been recalculated accordingly. The standardization has been applied in
581
two steps, the first concerning the input data source and the temporal and spatial domain, the second
582
step additionally specifying the number of types produced and MSLP as the only variable.
583
The predefined input dataset is the ECMWF ERA40 reanalysis dataset (Uppala et al. 2005)
584
available for the period 09/1957 to 08/2002 in 1° by 1° grid resolution and in 6-hourly time steps
585
thus excluding discrepancies caused by different input data sources, assimilation or preprocessing
586
schemes. It has been decided to use only 12 o'clock UTC data and to apply all methods for the full
587
year period. Another important feature is the clipping of the ERA40 1° by 1° grid. Different spatial
588
scales and regions have been covered by a set of 12 unified domains throughout Europe presented
589
in Figure 1 and Table 1. The largest is covering whole Europe by a reduced grid resolution (domain
590
0) while the smallest is confined to the greater Alpine area comprising 12x18 grid points (domain
591
6). All methods have been applied to all 12 domains.
592
593
[Figure 1.]
594
595
[Table 1.]
596
597
Several of the above listed catalogues use other or more variables than MSLP in their original
598
variants. To exclude the use of different variables as one potential source of dissimilarity, in a
599
second step towards unification all methods (except WLK) have been recalculated using only
600
MSLP, again for all domains, and the resulting catalogs have been added to the dataset. Another
601
source of considerable dissimilarities is the number of types. While some methods allow to chose an
602
arbitrary number of types (like cluster analysis) others are limited to one or a few numbers, either
603
due to their concept (e.g. by division by wind sectors) or due to technical reasons (e.g. leading to
604
empty classes). The original numbers of types are varying between 4 and 43 types which makes it
605
rather difficult to compare the classifications. Preliminary analyses suggested that the quality of
606
classifications, e.g. in their power to discriminate surface temperature and precipitation, is strongly
607
sensitive to the number of types (Huth 2009, Beck and Philipp 2009). To eliminate this effect, it
608
has been decided to fix the numbers of types in the classifications to a few distinct numbers. To
609
span the usual range of the numbers of types in circulation classifications, and to allow methods
610
with non-flexible numbers of types (such as the Litinsky (1969) classification) to be included, it has
611
been decided to fix the numbers of types at 9, 18, and 27; with deviations by up to two from these
612
numbers being allowed. Therefore, each method has been run three times where necessary, one for
613
each of the reference number of types, again for all spatial domains.
614
All in all 73 method variants,, each of them applied to the 12 spatial domains where possible, are
615
included in cost733cat version 1.2. An overview of the variants of classifications is presented in
616
Table 2. It includes variants initially created by the authors as well as variants produced under the
617
unified classification conditions as described above. The abbreviations (column 2) reflect the
618
method or the authors name as well (in case of variants with unified conditions) the input parameter
619
(i.e. SLP) and the number of types. The number of types (column 3) may be given as a range, which
620
applies to catalogs with different numbers for the 12 spatial domains. At least one unified variant is
621
given with the number of types close to 9, 18 and 27 for all domains. A key reference for more
622
detailed descriptions of the method is added in column 4. Finally the method group is indicated in
623
the last column.
624
625
[Table 2.]
626
627
The dataset consists physically of ASCII files, following the naming convention in Table 2 column
628
2 and the domain number as documented in Table 1, holding 16436 lines, each line representing a
629
day, while the date is given by year, month and day in the first three columns of each line, followed
630
by the type number of the day. These single catalog files are organized in directories for each
631
method variant (see Table 2). Additionally catalog compilations for similar numbers of types are
632
provided in comprehensive files for easier processing of comparison analyses.
633
634
4.) Discussion and concluding remarks
635
636
A number of 788 classification weather and circulation type catalogs for European regions has
637
been compiled to a consistent dataset. While 8 spatially fixed and mostly not automated
638
classifications are included, 22 automated classification methods in 65 method variants have been
639
applied to 12 standard domains within Europe, including variants for unified classification
640
conditions, in order to provide comparable catalogs for future studies on classification methods and
641
their possible improvement. The various classification methods have been reviewed and
642
systematized according to the strategy for generation of types. In the following the main aspects of
643
these methods are summarized and discussed.
644
A main distinction is made between methods using predefined types and those producing derived
645
types. Within the former, manual or subjective methods are still in use and have their value due to
646
the integration of the experience of experts, which is often hard to formulate in precise rules for
647
automated classification. Therefore 4 subjective catalogs have been included in the dataset for
648
comparison (HBGWL/HBGWT, PECZELY, PERRET and ZAMG), even though it is unlikely that
649
new manual classifications will be established in the future. Disadvantages are the sometimes
650
imprecise definition of the types, which may lead to inhomogeneities and that they are made for
651
distinct geographical regions and are not scalable. The former problems are overcome by the
652
automatic assignment of days due to their pressure patterns as it is done by the OGWL method,
653
however the latter problems remain.
654
They are often replaced now by threshold based methods (GWT, LITADVE/LITTC, LWT2, WLK,
655
SCHÜEPP) which have the advantage of automatic processing and a lower degree of subjectivity.
656
However, subjectivity is also apparent by definition of the thresholds. Even though some thresholds
657
seem natural (like the symmetric partitioning of the wind rose) it is always an arbitrary decision to
658
use certain thresholds and not others. However they are appealing with their clear and precise
659
definition of types. However it became apparent that the definition and application of thresholds or
660
rules is very different, even though their common concept is based on airmass flow.
661
The second basic strategy for classification is to derive types from the input dataset by data
662
analysis. PCA based methods (TPCA, P27, PCAXTR) have in common the use of eigenvectors,
663
however, these are calculated and applied quite differently and it is debatable whether they can be
664
considered as a homogeneous group of methods. This is especially true for P27, where there is some
665
conceptional similarity to the GWT classification of the threshold group. Methodological more
666
homogeneous appears the group of methods based on the leader algorithm (LUND, ESLP/EZ850,
667
KH), differing only by using different distance metrics, different thresholds for finding key patterns
668
and by omitting the final redistribution step for ESLP/EZ850. May be the most coherent group is
669
the group of optimization methods (CKMEANS, PCACA, PETISCO, PCAXTRKMS, SANDRA,
670
SANDRAS, NNW), where only the PETISCO method does not try to minimize the within-type
671
differences at the end and thus might be considered as a member of the leader algorithm group as
672
well. Differences between the other methods confine to the kind of creating starting partitions, data
673
preprocessing and variations of the optimization algorithm, while their objective is all the same and
674
clearly defined.
675
[-------------------------------------------------------------->
676
Unification, drawbacks
677
[The reason for rareness of subjective WTCs is probably the demand from the majority of
678
applications to use CTCs for relating circulation to target variables (circulation-environment
679
approach after Yarnal 1993), e.g. for downscaling where only reliable circulation data are
680
available.]
681
682
[In some cases the assignment of methods to a method group is not unambiguous, e.g. P27 is
683
assigned to the PCA group but is also related to THR methods.
684
It turned out that PETSICO is now inspired by the leader algorithm (LDR group) and is somewhat
685
misplaced within the OPT group (it seems it was initially a k-means procedure.]
686
687
While most of the traditional subjective classifications are made for the full year without
688
subdivisions for different seasons, some applications use separate classifications e.g. for the four
689
meteorological seasons, which helps to stratify links between circulation and other parameters
690
which are seasonally variable. However, this approach complicates seamless studies on seasonality
691
of types. In case of some classification schemes (e.g. the so called GWT) it makes no difference
692
whether the seasons are resolved or not. In the current version of the cost733cat dataset the full year
693
classification approach has been implemented for all classifications taking into account that this
694
might lead to a drop of performance for some applications, however since this is done for all
695
methods, relative differences in performance should be conserved.
696
697
While most of the authors contributing to the catalog dataset originally used mean sea level pressure
698
(MSLP), some methods have been applied to other parameters like the geopotential height of the
699
850 or 500 hPa level or wind components, humidity and temperature (the three latter parameters
700
only for the WLK classification). ----> Unification
701
702
In version 1.2 (the current version at the time of writing) of the cost733cat dataset not all
703
configuration features have been unified so far. Thus methods use different distance measures,
704
mostly Euclidean distance and correlation coefficient, and some include a special preprocessing like
705
high-pass filtering and a preceding PCA (e.g. PCACA). It is debatable whether such differences
706
should be regarded as method-specific or should be unified as well, and indeed such considerations
707
are under discussion. Furthermore there are discussions under way whether the overall quality of
708
the catalogs should be improved, e.g. by using 850 hPa, seasonal classifications or inclusion of
709
temperature and precipitation as input variables. And last but not least there will be some more
710
catalogs in order to test further developed classification methods in the future. However, such future
711
modifications of the catalogs will be just added as additional variants in order to keep the dataset
712
backwards compatible.
713
<--------------------------------------------------------------]
714
715
The list of included methods of the cost733 catalog database is by far not complete. However, it is
716
believed that the most common basic methods are represented with some of their variations.
717
Methods not included are mostly those which are specialized for a certain target variable or don't
718
fulfill the criterion of producing distinct and unambiguous classifications of the entities. An
719
example for a method including both aspects is the fuzzy classification method developed by
720
Bardossy et al. (2002), which tries to find circulation types explaining the variations of a specified
721
target variable, e.g. station precipitation time series, while the daily maps are members of more
722
than one class. The consequence of optimization of a classification for one or a few target variables
723
is that the resulting catalog is applicable for a small target region but not for climate variations on a
724
larger, e.g. continental scale. On the other hand one of the attracting properties of classification
725
methods is the transfer from the metric scale to the nominal scale, which is violated by the principle
726
of fuzzy classification. These aspects should not debase the value of the mentioned methods since
727
they usually are superior to generalized and discrete methods, but their results are useful for specific
728
purposes only (see e.g Enke et al. 2005).
729
However, further studies of the presented classification dataset might show that the idea of
730
generalized classifications including the main links between circulation and climate variability in
731
different or large regions is unrealistic. And the attempts to find generally "better" classification
732
methods is like shooting on a moving target. If a method is optimal for circulation it may be bad for
733
explaining temperature variations, if it is optimal for temperature it may be bad for precipitation,
734
etc. Thus a universally best method might be unreachable, explaining the large number of attempts
735
to develop more and more different variations of existing methods or to develop completely new
736
ones. It is the hope of the authors that this database reflecting the large variety of classification
737
methods will help to come to a conclusion on whether it is worth the amount of effort spent in the
738
search for the ultimate best method. However, even if the results of the intercomparison studies of
739
the classifications only show that some of the methods are more or less useful in many aspects, the
740
aim of the work on this catalog collection will have been reached.
741
742
743
References
744
745
Bárdossy A. J. Stehlík, and C. Hans-Joachim, 2002: Automated objective classification of daily
746
circulation patterns for precipitation and temperature downscaling based on optimized fuzzy rules.
747
Climate Res., 23, 11–22.
748
749
Baur, F., 1948: Einführung in die Großwetterkunde (Introduction into large scale weather),
750
Dieterich Verlag, Wiesbaden.
751
752
Beck, C., 2000. Zirkulationsdynamische Variabilität im Bereich Nordatlantik-Europa seit 1780.
753
Würzburger Geographische Arbeiten 95. (German)
754
755
Beck, C., Jacobeit, J., Jones, P. D., 2007. Frequency and within-type variations of large scale
756
circulation types and their effects on low-frequency climate variability in Central Europe since
757
1780. Int. J. Climatol. 27, 473-491.
758
759
Bissolli, P., Dittmann, E., 2003: Objektive Wetterlagenklassen. In: Klimastatusbericht 2003. DWD
760
(Hrsg.). Offenbach 2004.
761
762
Blair (1998): The Kirchhofer technique of synoptic typing revisited. Int. J. Climatol. 18: 1625–1635
763
764
Buishand, T.A. and Brandsma, T., Comparison of circulation typeification schemes for predicting
765
temperature and precipitation in the Netherlands. International Journal of Climatology 17, 875-889,
766
(1997).
767
768
Cahynová M. and R. Huth, 2009. Enhanced lifetime of atmospheric circulation types over Europe:
769
fact or fiction? Tellus A, DOI: 10.1111/j.1600-0870.2009.00393.x
770
771
CHARALAMBOUS, C., CHARITOU, A., and KAOUROU, F. (2001), Comparative analysis of
772
neural network models: Application in bankruptcy prediction, Annals of Operations Res. 99, 403–
773
425
774
775
KOHONEN, T., (1990), The Self-Organizing Map. Proceedings of IEEE, 78(9), pp.
776
1464-1480
777
778
Compagnucci, R.H., Richman, M.B., 2008: Can principal component analysis provide atmospheric
779
circulation or teleconnection patterns? Int. J. Climatol., 28, 703-726.
780
781
Dittmann, E., Barth, S., Lang, J., Müller-Westermeier, G., 1995. Objektive
782
Wetterlagenklassifikation. Ber. Dt. Wetterd. 197, Offenbach a. M., Germany.
783
784
Ekstrom M, Jonsson P and Barring L. 2002. Synoptic pressure patterns associated with major wind
785
erosion events in southern Sweden. Climate Research 23: 51–66.
786
787
Enke, W., Spekat, A., 1997. Downscaling climate model outputs into local and regional weather
788
elements by classification and regression. Clim. Res. 8, 195-207.
789
790
Enke, W.; Schneider, F.; Deutschländer, Th. (2005):A novel scheme to derive optimised circulation
791
pattern classifications for downscaling and forecast purposes. Theoretical and Applied Climatology,
792
DOI: 10.1007/s00704-004-0116-x.
793
794
Erpicum, M., Mabille, G., Fettweis, X., 2008. Automatic synoptic weather circulation types
795
classification based on the 850 hPa geopotential height. Book of Abstracts COST 733 Mid-term
796
Conference, Advances in Weather and Circulation Type Classifications & Applications, 22-25
797
October 2008 Krakow, Poland, 33.
798
ESTEBAN P, JONES PD, MARTÍN-VIDE J, MASES M. (2005) Atmospheric circulation patterns
799
related to heavy snowfall days in Andorra, Pyrenees. International Journal of Climatology. 25: 319-
800
329.
801
ESTEBAN P, MARTIN-VIDE J, MASES M (2006) Daily atmospheric circulation catalogue for
802
western europe using multivariate techniques.International Journal of Climatology 26: 1501-1515
803
ESTEBAN, P.; NINYEROLA, M. and PROHOM, M. (2009): Spatial modelling of air temperature
804
and precipitation for Andorra (Pyrenees) from daily circulation patterns. Theoretical and Applied
805
Climatology. In press. Online at:
806
http://www.springerlink.com/content/r2728365g085q007/?p=0713eaeac536434ebc4b237b6677218
807
a&pi=1
808
Fereday D.R., J.R. Knight, A.A. Scaife, C.K. Folland, A. Philipp, 2008. Cluster analysis of North
809
Atlantic / European circulation types and links with tropical Pacific sea surface temperatures.
810
Journal of Climate, 21(15) 3687-3703. DOI: 10.1175/2007JCLI1875.1
811
812
Fettweis, X., Mabille, G., Erpicum, M. and Nicolay, S.: The 1958-2008 Greenland ice sheet surface
813
melt and the mid-tropospheric atmospheric circulation, Climate Dynamics, in preparation, 2009.
814
815
Gerstengarbe, F.-W., P. Werner, 1999: Katalog der Großwetterlagen Europas (1881-1998), nach
816
Paul Hess und Helmuth Brezowsky (Catalog of European weather types (1881-1998) after Paul
817
Hess and Helmuth Brezowsky, in German). Potsdam, pp 138. Online at http://www.pik-
818
potsdam.de/~uwerner/gwl/
819
820
Gong, X., Richman, M.B., 1995: On the application of cluster analysis to growing season
821
precipitation data in North America east of the Rockies. J. Climate, 8, 897-931.
822
823
Hartigan, J.A.,1975. Clustering algorithms, Wiley series in probability and mathematical statistics,
824
New York, 351 pp.
825
826
Hess, P., Brezowski, H., 1952. Katalog der Großwetterlagen Europas. Ber. Dt. Wetterd. in der US-
827
Zone 33, Bad Kissingen, Germany.
828
829
Hewitson B, Crane RG (1992) Regional climates in the GISS Global Circulation Model and
830
Synoptic-scale Circulation. J Climate 5:1002-1011.
831
832
Huth, R., 1993: An example of using obliquely rotated principal components to detect circulation
833
types over Europe. Meteorol. Z., 2, 285-293.
834
835
Huth, R. (2000): A circulation classification scheme applicable in GCM studies. Theor. Appl.
836
Climatol., 67, 1-18.
837
838
Huth R., C. Beck, A. Philipp, M. demuzere, Z. Ustrnul, M. Cahynová, J. Kyselý, O. E. Tveito
839
(2008): Classification of Atmospheric Circulation Patters - Recent Advances and Applications is
840
published in the Annals of the New York Academy of Sciences, 1146, pp105-152.
841
842
Huth R. (2009): Synoptic-climatological applicability of circulation classifications from the
843
COST733 collection: First results. Physics and Chemistry of the Earth, current issue.
844
845
James, P. M., 2006. Second Generation Lamb Weather Types – a new generic classification with
846
evenly tempered type frequencies. 6th Annual Meeting of the EMS / 6th ECAC, EMS2006A00441,
847
Ljubljana, Slovenia.
848
849
James P.M., (2007): An objective classification for Hess and Brezowsky Grosswetterlagen over
850
Europe. Theor. Appl. Climatol., 88, 17-42
851
852
Jenkinson, A.F., Collison, B. P., 1977. An initial climatology of gales over the North Sea. Synop.
853
Climatol. BranchMemo.N◦ 62, Meteorological Office, London, UK. 18 pp.
854
855
Károssy, Cs., 1994: Péczely’s classification of macrosynoptic types and the catalogue of weather
856
situations (1951-1992). In: Nowinsky, L. (ed): Light trapping of insects influenced by abiotic
857
factors. Part I. Savaria University Press, Szombathely. 117-130.
858
859
Károssy, Cs., 1997: Catalogue of Péczely’s macrosynoptic weather situations (1993-1996). In:
860
Nowinsky, L. (ed): Light trapping of insects influenced by abiotic factors. Part II. Savaria
861
University Press. Szombathely. 159-162.
862
863
Kirchhofer, W., 1974. Classification of European 500 mb patterns. Arbeitsbericht der
864
Schweizerischen Meteorologischen Zentralanstalt, 43, Zurich, Switzerland, pp. 1-16..
865
866
KOHONEN, T. (2001), Self-Organizing Maps, 3rd Ed. (Springer Series in Information Sciences)
867
868
Kruizinga, S., 1979. Objective classification of daily 500 mbar patterns. Preprints Sixth Conference
869
on Probability and Statistics in Atmospheric Sciences, 9-10 October 1979, Banff, Alberta,
870
American Meteorological Society, Boston, MA, 126-129.
871
872
Lamb, H., 1950. Types and spells of weather around the year in the British Isles: Annual trends,
873
seasonal structure of years, singularities. Quart. J. R. Met. Soc. 76, 393-438.
874
875
Lauscher, F., 1985: Klimatologische Synoptik Österreichs mittels der ostalpinen
876
Wetterlagenklassifikation, Arbeiten aus der Zentralanstalt für Meteorologie und Geodynamik,
877
Publikation Nr. 302, Heft 64
878
879
Litynski, J. (1969): A numerical classification of circulation patterns and weather types in Poland.
880
Prace Panstwowego Instytutu Hydrologiczno-Meteorologicznego 97: 3–15 (in Polish)
881
882
Lund, I. A., 1963. Map-pattern classification by statistical methods. J. Appl. Meteorol. 2, 56-65.
883
Martin, E., 2004. Validation of Alpine snow in ERA-40. ERA-40 Project Report Series 14.
884
European Centre for Medium-Range Weather Forecasts, Shinfield, Reading, UK.
885
886
MICHAELIDES, S.C., PATTICHIS, C.S., and KLEOVOULOU, G. (2001), Classification of
887
rainfall variability by using artificial neural networks, Internat. J. Climatology 21, 1401–1414
888
889
MICHAELIDES, S.C., LIASSIDOU F. and SCHIZAS C.N. (2007), Synoptic classification and
890
establishment of analogues with artificial neural networks. Pure and applied geophysics. 164 (2007)
891
1347-1364
892
893
Péczely, Gy., 1957: Grosswetterlagen in Ungarn. Kleinere Veröffentlichungen der Zentralanstalt
894
für Meteorologie, No.30. Budapest
895
896
Péczely, Gy., 1961: Characterising the meteorological macrosynoptic situations in Hungary. (in
897
Hungarian). OMI Kisebb Kiadványai No. 32. Budapest
898
899
Péczely, Gy., 1983: Catalogue of macrosynoptic situations of Hungary in years 1881-1983. (in
900
Hungarian). OMSz Kisebb Kiadványai No. 53. Budapest
901
902
Perret R, (1987): Une classification des situations météorologiques à l'usage de la prévision,
903
Schweizerischen Meteorologischen Zentralanstalt, 46, Zurich, Switzerland., 124 pp.
904
905
PETISCO, S.E., J.M. MARTÍN y D. GIL (2005): Método de estima de precipitación mediante
906
"downscaling" (A method for precipitation downscaling, in Spanish). Nota técnica n.º 11 del
907
Servicio de Variabilidad y Predicción del Clima, INM, Madrid.
908
909
Philipp, A., Della-Marta, P. M., Jacobeit, J., Fereday, D. R., Jones, P. D., Moberg, A., Wanner, H.,
910
2007. Long term variability of daily north atlantic-european pressure patterns since 1850 classified
911
by simulated annealing clustering. J. Clim. 20, 4065-4095.
912
913
Philipp, A., 2008. Comparison of principal component and cluster analysis for classifying
914
circulation pattern sequences for the European domain. Theor. Appl. Climatol. DOI
915
10.1007/s00704-008-0037-1
916
917
Pianko-Kluczynska, K. (2007): New Calendar of Atmosphere Circulation Types According to J.
918
Litynski”, Wiadomosci Meteorologii Hydrologii Gospodarki Wodnej, pp. 65-85 (In Polish).
919
920
Richman, M.B., 1981: Obliquely rotated principal components: An improved meteorological
921
maptyping technique? J. Appl. Meteorol., 20, 1145-1159.
922
923
Schüepp, M., 1968. Kalender der Wetter- und Witterungslagen von 1955 bis 1967.
924
Veröffentlichungender Schweizerischen Meteorologischen Zentralanstalt, 11, 43 pp.
925
926
Schüepp, M., 1957. Klassifikationsschema, Beispiele und Probleme der Alpenwetterstatistik. La
927
Meteorologie 4, 291-299.
928
929
Schüepp, M., 1979. Witterungsklimatologie (Klimatologie der Schweiz III). Beihefte zu den
930
Annalen der Schweizerischen Meteorologischen Anstalt 1978 , 93 pp.
931
932
TYMVIOS F.S., COSTANTINIDES P., RETALIS A., MICHAELIDES S.C., PARONIS S.,
933
EVRIPIDOU P. and KLEANTHOUS S. (2007), The AERAS project: data base implementation and
934
Neural Network classification tests. Urban Air Quality Proceedings, 2007.
935
936
TYMVIOS F.S., SAVVIDOU K., MICHAELIDES S.C., NIKOLAIDES K.A. (2008a),
937
Atmospheric circulation patterns associated with heavy precipitation over Cyprus. Geophysical
938
Research Abstracts, Vol. 10, EGU2008-A-00000, 2008
939
940
TYMVIOS F.S., SAVVIDOU K., MICHAELIDES S.C.,(2008b), Association of geopotential
941
height patterns to heavy rainfall events in the eastern Μediterranean. Advances in Geosciences
942
(submitted).
943
944
Richman MB (1986) Rotation of principal components. J Climatol 6: 293-335
945
Uppala, S.M., Kållberg, P.W., Simmons, A.J., Andrae, U., da Costa Bechtold, V., Fiorino, M.,
946
Gibson, J.K., Haseler, J., Hernandez, A., Kelly, G.A., Li, X., Onogi, K., Saarinen, S., Sokka, N.,
947
Allan, R.P., Andersson, E., Arpe, K., Balmaseda, M.A., Beljaars, A.C.M., van de Berg, L., Bidlot,
948
J., Bormann, N., Caires, S., Chevallier, F., Dethof, A., Dragosavac, M., Fisher, M., Fuentes, M.,
949
Hagemann, S., Hólm, E., Hoskins, B.J., Isaksen, L., Janssen, P.A.E.M., Jenne, R., McNally, A.P.,
950
Mahfouf, J.-F., Morcrette, J.-J., Rayner, N.A., Saunders, R.W., Simon, P., Sterl, A., Trenberth,
951
K.E., Untch, A., Vasiljevic, D., Viterbo, P., and Woollen, J. 2005: The ERA-40 re-analysis. Quart.
952
J. R. Meteorol. Soc., 131, 2961-3012.doi:10.1256/qj.04.176
953
954
955
Yarnal B. (1984): A procedure for the classification of synoptic weather maps from gridded
956
atmospheric pressure surface data, Computers & Geosciences, 10, 397-410
957
958
959
Yarnal B (1993) Synoptic climatology in environmental analysis Belhaven Press, London.
960
Figure 1. (will be finished soon and made available on the wiki)
961
Table 1.: Identification numbers, names and coordinates of spatial domains defined for
962
classification input data.
#
00
01
02
03
04
05
06
07
08
09
10
11
name
Europe
Iceland
West Scandinavia
Northeastern Europe
British Isles
Baltic Sea
Alps
Central Europe
Eastern Europe
Western Mediterranean
Central Mediterranean
Eastern Mediterranean
longitudes
37°W to 56°E by 3° (32)
34°W to 3°W by 1° (32)
06°W to 25°E by 1° (32)
24°E to 55°E by 1° (32)
18°W to 08°E by 1° (27)
08°E to 34°E by 1° (27)
03°E to 20°E by 1° (18)
03°E to 26°E by 1° (24)
22°E to 45°E by 1° (24)
17°W to 09°E by 1° (27)
07°E to 30°E by 1° (24)
20°E to 43°E by 1° (24)
latitudes
30°N to 76°N by 2° (24)
57°N to 72°N by 1° (16)
57°N to 72°N by 1° (16)
55°N to 70°N by 1° (16)
47°N to 62°N by 1° (16)
53°N to 68°N by 1° (16)
41°N to 52°N by 1° (12)
43°N to 58°N by 1° (16)
41°N to 56°N by 1° (16)
31°N to 48°N by 1° (18)
34°N to 49°N by 1° (16)
27°N to 42°N by 1° (16)
963
964
965
Table 2.: Methods and variants overview. Alphabetical list of abbreviations used to identify
966
individual variants of classification configurations. In column 2 the number of types is given,
967
column 3 holds the parameters used for classification (SLP: sea level pressure, Z geopotential
968
height, UV zonal/meridional wind component and PW precipitable water, SFC denotes surface and
969
numbers the referring pressure level) and column 4 the key reference. The method groups are given
970
in column 5 where SUB means subjective, THR threshold based, PCA based on principal
971
component analysis, LDR leader algorithm and OPT optimization algorithm.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
abbreviation
types
CKMEANSC09
CKMEANSC18
CKMEANSC27
ESLPC09
ESLPC18
ESLPC27
EZ850C10
EZ850C20
EZ850C30
GWT
GWTC10
GWTC18
GWTC26
HBGWL
9
18
27
9
18
27
10
20
30
18
10
18
26
29
parameters
SLP
SLP
SLP
SLP
SLP
SLP
Z850
Z850
Z850
SLP
SLP
SLP
SLP
various
key reference
Enke and Spekat (1997)
Enke and Spekat (1997)
Enke and Spekat (1997)
Erpicum et al. (2008)
Erpicum et al. (2008)
Erpicum et al. (2008)
Erpicum et al. (2008)
Erpicum et al. (2008)
Erpicum et al. (2008)
Beck (2000)
Beck (2000)
Beck (2000)
Beck (2000)
Hess and Brezowski (1952)
method
group
OPT
OPT
OPT
LDR
LDR
LDR
LDR
LDR
LDR
THR
THR
THR
THR
SUB
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
HBGWT
KHC09
KHC18
KHC27
LITADVE
LITTC
LITTC18
LUND
LUNDC09
LUNDC18
LUNDC27
LWT2
LWT2C10
LWT2C18
NNW
NNWC09
NNWC18
NNWC27
OGWL
OGWLSLP
P27
P27C08
P27C16
P27C27
PCACA
PCACAC09
PCACAC18
PCACAC27
PCAXTR
PCAXTRC09
PCAXTRC18
PCAXTRKM
PCAXTRKMC09
PCAXTRKMC18
PECZELY
PERRET
PETISCO
PETISCOC09
PETISCOC18
PETISCOC27
SANDRA
SANDRAC09
SANDRAC18
SANDRAC27
SANDRAS
SANDRASC09
SANDRASC18
SANDRASC27
SCHUEPP
TPCA07
TPCAC09
TPCAC18
10
various
9
SLP
18
SLP
27
SLP
9
SLP
27
SLP
18
SLP
10
SLP
9
SLP
18
SLP
27
SLP
26
SLP
10
SLP
18
SLP
9-30
Z500
9
SLP
18
SLP
27
SLP
29
SLP, Z500
29
SLP
27
Z500
8
SLP
16
SLP
27
SLP
4-5
SLP
9
SLP
18
SLP
27
SLP
11-17
SLP
9-10
SLP
15-18
SLP
11-17
SLP
9-10
SLP
15-18
SLP
13
various
40
various
25-38
SLP, Z500
9
SLP
18
SLP
27
SLP
18-23
SLP
9
SLP
18
SLP
27
SLP
30
Z925, Z500
9
SLP
18
SLP
27
SLP
40 SLP, Z500, U/V SFC/500
7
SLP
9
SLP
18
SLP
Hess and Brezowski (1952)
Blair (1998)
Blair (1998)
Blair (1998)
Litynski (1969)
Litynski (1969)
Litynski (1969)
Lund (1963)
Lund (1963)
Lund (1963)
Lund (1963)
James (2006)
James (2006)
James (2006)
Michaelides (2001)
Michaelides (2001)
Michaelides (2001)
Michaelides (2001)
James (2007)
James (2007)
Kruizinga (1979)
Kruizinga (1979)
Kruizinga (1979)
Kruizinga (1979)
Yarnal (1993)
Yarnal (1993)
Yarnal (1993)
Yarnal (1993)
Esteban et al. (2005)
Esteban et al. (2005)
Esteban et al. (2005)
Esteban et al. (2006)
Esteban et al. (2006)
Esteban et al. (2006)
Peczely (1957)
Perret (1987)
Petisco (2005)
Petisco (2005)
Petisco (2005)
Petisco (2005)
Philipp et al. (2007)
Philipp et al. (2007)
Philipp et al. (2007)
Philipp et al. (2007)
Philipp (2008)
Philipp (2008)
Philipp (2008)
Philipp (2008)
Schüepp (1979)
Huth (1993)
Huth (1993)
Huth (1993)
SUB
LDR
LDR
LDR
THR
THR
THR
LDR
LDR
LDR
LDR
THR
THR
THR
OPT
OPT
OPT
OPT
SUB
SUB
PCA
PCA
PCA
PCA
OPT
OPT
OPT
OPT
PCA
PCA
PCA
OPT
OPT
OPT
SUB
SUB
OPT/LDR
OPT/LDR
OPT/LDR
OPT/LDR
OPT
OPT
OPT
OPT
OPT
OPT
OPT
OPT
THR
PCA
PCA
PCA
67
68
69
70
71
72
73
972
TPCAC27
TPCAV
WLKC09
WLKC18
WLKC28
WLKC733
ZAMG
27
6-12
9
18
28
40
43
SLP
SLP
U700, V700
U700,V700, Z925
U700,V700, Z925/500
U/V700, Z925/500, PW
various
Huth (1993)
Huth (1993)
Dittmann et al. (1995)
Dittmann et al. (1995)
Dittmann et al. (1995)
Dittmann et al. (1995)
Lauscher (1985)
PCA
PCA
THR
THR
THR
THR
SUB
Download